Manuel Guerris*, Jordi Cuadros*, Lucinio González-Sabaté and Vanessa Serrano
IQS Univ. Ramon Llull, Via Augusta 390, 08017 Barcelona, Spain. E-mail: manuel.guerris@iqs.url.edu; jordi.cuadros@iqs.url.edu; Tel: +34 932672000
First published on 14th May 2020
The public image of chemistry is a relevant issue for chemical stakeholders. It has been studied throughout history by means of document analysis and more recently through surveys. Twitter, a worldwide online social network, is based on spontaneous opinions. We tried to identify the public perception of chemistry on Twitter, what it explains, and which sentiments are perceived. We gathered 256833 tweets between 1st January 2015 and 30th June 2015 containing the words “chemistry”, “chemical” or “chem”. We cleaned and filtered them down to 50725 tweets with textual information in English and clustered them using spherical k-means. The resulting clusters were categorised according to six topics by 18 chemistry experts. The prevailing topics were the learning environment topic, related to activities and tasks in chemistry courses, and the human activity topic, referring to facts and news about the chemical industry. The scientific knowledge topic, concerning communication of chemistry knowledge, only accounted for a small percentage of the tweets. We classified the tweets of most relevant topics based on their sentiment values and obtained more positive than negative perceptions. Nevertheless, the analysis of the unigrams and bigrams word clouds revealed a significant presence of chemophobia-related terms in the human activity topic, both in positive and negative classified tweets. It also revealed specific elements of chemistry courses negatively perceived in the learning environment topic.
These perceptions are related to how chemistry communicates itself. Science communication helps citizens acquire the knowledge about science needed to participate actively and responsibly in, with and for society (Hazelkorn, 2015), plays a significant role in awakening vocations (Stekolschik et al., 2010) and promotes scientific careers (Hayden et al., 2011). A lack of vocation can cause an insufficient supply of graduates from upper-secondary and higher education to meet increasing demand across the EU, and a shortage of STEM (science, technology, engineering and mathematics) professionals (Cedefop, 2016). Additionally, the role of chemistry is not well understood by policymakers, funders and the chemistry community itself (Palermo, 2018). Therefore, the knowledge of chemistry's public image and its understanding in terms of its contents and sentiments perceived by the public matters to all chemistry stakeholders. It is critical to understand what is communicated and how it is perceived to define better policies to reduce this shortage.
There is scientific literature that investigates this public image at academic and social levels. At the academic level, studies reveal that the view perceived by students of chemistry (Yager and Penick, 1983; Furió Más, 2006) and of science (Schibeci, 1986) is generally negative because of a lack of clarity in its communication and a distorted perception among students (Nicolas, 2006; Penagos and Lozano, 2009; Chamizo, 2011; Lacolla et al., 2013). Academic contents far from students' motivation (Piñeros and Parga, 2014), the lack of historical and social perspectives in the curricula (Jiménez and Criado García-Legaz, 2005; Nicolas, 2006; Muñoz and Nardi, 2011; Linthorst, 2012) and the absence of science, technology and society relationships in the teaching of science (Solbes and Vilches, 1992; Ribelles et al., 1995; Malaver et al., 2004; Furió Más, 2006) are aspects that contribute to a poor image of chemistry, far removed from the real world, which affects students negatively.
At the social level, the public image of chemistry seems to have inherited a negative perception due to its negative historical associations (Schummer and Spector, 2007; Schummer et al., 2007) and a lack of efficient communication on behalf of chemists (Hartings and Fahy, 2011). For instance, sensationalist propaganda in global media used to associate the chemical industry and chemistry with pollution and environmental degradation (Trozzolo, 1975; Penagos and Lozano, 2009). Despite this negative perception, efforts to improve the image of chemistry have been implemented such as “Chemical for All” (Hill and Kumar, 2013), a global strategy to convince the public that chemistry provides health, comfort, and well-being.
It seems the public image of chemistry has always been negative, although some of the recent studies reviewed suggest a positive change. In 2004, IUPAC (Mahaffy et al., 2008) found a negative public image of chemistry related to the misunderstanding of chemistry, chemists, chemicals and the chemical industry. In 2010, the European Chemical Industry Council (Hadhri, 2010) measured the public perception of the chemical industry in relation to several industries in the European Union. It suggested that chemistry had a favourable image, approximately at the same level as in 2008 and improved since the late 90s. The Royal Society of Chemistry analysed chemists’ internal perceptions and society's perception of chemists and chemical products in the UK (The Royal Society of Chemistry and TNS BMRB, 2015) and refuted the negative image. The results of the study showed a neutral or even positive image, with 51% of the respondents being neutral and 19% happy, and 59% of the respondents answering that the benefits of chemistry were higher than its harmful effects. They mostly perceived chemistry as a solution to major global challenges such as oil dependence, food shortages, pollution and access to drinking water, as well as having a positive impact on well-being. Additionally, 21% of the general public associated chemistry with school or teachers, with negative memories related to it but with mixed feelings about the chemistry that they learnt at school. 48% either agreed or were neutral that school had put them off chemistry, 45% disagreed that chemistry learnt at school had been useful in everyday life, and 52% agreed that they did not feel confident enough to talk about chemistry, with perceptions of chemistry described more negatively than those of science.
Several authors have proposed educational activities to improve this perception (Pratt and Yezierski, 2018; Ratamun and Osman, 2018; Molina and Carriazo, 2019; Tortorella et al., 2019).
All the methods used to study the public image of chemistry were based on surveys and document analysis. They are neither designed for nor able to capture spontaneous opinions. Social networks, on the other hand, collect ideas that are expressed spontaneously. Twitter, with its 204 million monthly active users in the second quarter of 2015 (Clement, 2019) and 23% of total adult internet users (Duggan, 2015), proves to be a relevant and significant online social network. Moreover, it is used by citizens to read news (Pew Research Center, 2019) and it is one of the leading social media platforms used by business-to-business (B2B) and business-to-consumer (B2C) marketers worldwide (Statista, 2019). During the time span of this research, Twitter users could communicate in real time by exchanging short messages, or tweets, of up to 140 characters, and could follow other users without any relationship between them. Public tweets can be freely gathered using the Twitter Search API and analysed to obtain their sentiments (Sailunaz and Alhajj, 2019).
Tweets overcome survey challenges (Choi and Pak, 2005; Tourangeau and Yan, 2007; Krumpal, 2013) and document analysis challenges (Casadevall and Fang, 2009; Antilla, 2010) because tweets are part of conversations between Twitter users (Boyd et al., 2010; Huang et al., 2010; Smith et al., 2014). These conversations give Twitter users the ability to express their thoughts, opinions (Kanavos et al., 2014) and emotions (Tago and Jin, 2018) which are included in human social behaviour (Aarts et al., 2012; Ye and Wu, 2013).
Therefore, Twitter seems an appropriate social network for this analysis because of its high number of users, the spontaneous contents written by its users, its use in different sectors and the ability to gather public users’ messages and to analyse their sentiments. Consequently, it will complement existing literature about the public image of chemistry.
This research addresses the following questions:
• Which topics related to chemistry can be found on Twitter?
• To what extent do Twitter messages portray positive and negative sentiments towards chemistry?
• What do users tweet about chemistry?
Just as an outline of the methods used, text mining enables deriving information extracted from written resources through computation (Gupta and Lehal, 2009) and includes techniques and processes reported in the literature (Hearst, 1999; Hotho et al., 2005; Berry, 2007; Feldman and Sanger, 2007; Delen and Crossland, 2008; Gupta and Lehal, 2009; Irfan et al., 2015) to clean and obtain relevant information and to cluster those data according to topics. Clustering consists in grouping a set of objects based on their similarity and is useful with new or unlabelled objects (Jain et al., 1999). There are different groups of clustering techniques (Fahad et al., 2014), the partitioning-based methods such as the k-means algorithm (Jain, 2010) being the most popular and the most used.
We gathered Twitter data over a period of time, cleaned and filtered them to eliminate non-relevant information, and clustered them using a partitioning-based technique suitable for documents. Chemistry experts classified the clusters obtained into topics based on their contents. We did not opt for automatic classification methods, which need text documents already classified into those topics (Hotho et al., 2005) in order to classify new documents.
Sentiment analysis techniques (Yadollahi et al., 2017) allow us to evaluate sentiments from terms, sentences, and documents. We classified the tweets related to the topics in the literature based on their sentiment value using a lexicon-based approach and analysed them by applying statistical and visual methods.
Retweets, tweets written by one user and forwarded by another, were discarded so that their large numbers would not mask, within the clusters, less frequent tweets related to other contents. Expressions that did not add any meaning, such as HTML tags, Unicode codes, Twitter account names, emails, URL addresses, non-letter symbols, and one- and two-letter words, were removed. Hashtags were kept because they could contain meaningful information.
English language tweets were selected with the textcat package in R (Hornik et al., 2013). Stop words, commonly used words that do not add meaning to a document and provided by the tm package in R (Feinerer et al., 2008), were removed. Based on previous tests, the terms “just”, “now”, “got”, “will”, “get”, “much”, “can” and “no”, which did not contribute to the clustering process, as well as empty and duplicated tweets were also eliminated.
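The cleaning steps described above can be illustrated with a minimal Python sketch. It is not the authors' R pipeline: the function name `clean_tweet`, the regular expressions and the short `STOP_WORDS` set (a small hypothetical stand-in for the full stop word list of the tm package, extended with the terms listed above) are assumptions made for illustration only. Language detection and the removal of retweets and duplicated tweets are not shown.

```python
import re

# Hypothetical, heavily shortened stop word list (assumption); the study used
# the list provided by the tm package in R plus a few extra terms.
STOP_WORDS = {"the", "and", "for", "that", "have", "this", "with",
              "just", "now", "got", "will", "get", "much", "can"}


def clean_tweet(text):
    """Return a cleaned, lower-case version of a tweet (illustrative only)."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)                 # HTML tags, <U+XXXX> codes
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)   # URL addresses
    text = re.sub(r"\S+@\S+", " ", text)                 # e-mail addresses
    text = re.sub(r"@\w+", " ", text)                    # Twitter account names
    text = re.sub(r"[^a-z\s]", " ", text)                # non-letter symbols ('#' is dropped, the hashtag word stays)
    # Drop one- and two-letter words and stop words.
    words = [w for w in text.split() if len(w) > 2 and w not in STOP_WORDS]
    return " ".join(words)


cleaned = clean_tweet("baking cookies for chem lab and accidentally used self rising flour OMG")
print(cleaned)
```

Empty strings returned by such a function would correspond to the empty tweets that were eliminated.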
A bigram is a sequence of two consecutive words in a tweet. A bigram TDM (term-document matrix) is a matrix where each column corresponds to a tweet, each row to a bigram, and each cell ij contains the number of times bigram i appears in tweet j. We built the bigram TDM from the cleaned tweets. We used bigrams instead of unigrams (single words) because there is no clear advantage in using unigrams in text categorization (Bekkerman and Allan, 2004) and bigrams are more accurate during cluster analysis. We did not use longer n-grams (sequences of n consecutive words from a given document, n being an integer over three) because their frequency in the TDM could decrease and information would be lost.
We reduced the dimensionality of bigram TDM removing those bigrams with a low frequency to keep the most relevant ones and avoid those which might add noise to the information hampering the clustering process.
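As an illustration of the bigram TDM and of the frequency filter, the following Python sketch builds a small matrix as a dictionary that maps each bigram to its per-tweet counts. The function names and the `min_freq` parameter are hypothetical, and the dictionary representation is an assumption chosen for brevity rather than the sparse matrix a real corpus would require.

```python
from collections import Counter


def bigrams(tweet):
    """Return the list of consecutive word pairs of a cleaned tweet."""
    words = tweet.split()
    return [" ".join(pair) for pair in zip(words, words[1:])]


def bigram_tdm(tweets, min_freq=1):
    """Map each kept bigram (row) to its counts in every tweet (columns)."""
    per_tweet = [Counter(bigrams(t)) for t in tweets]
    totals = Counter()
    for counts in per_tweet:
        totals.update(counts)
    # Keep only bigrams whose total frequency reaches the threshold.
    kept = sorted(b for b, freq in totals.items() if freq >= min_freq)
    return {b: [counts[b] for counts in per_tweet] for b in kept}


tweets = [
    "dreading going back college especially monday chem lab",
    "baking cookies chem lab accidentally used self rising flour omg",
    "kumbe chem journal copying time",
]
tdm = bigram_tdm(tweets, min_freq=2)
print(tdm)  # only "chem lab" occurs twice in these three tweets
```

After filtering, tweets whose column contains only zeros carry no information for clustering; in this toy example the third tweet would be dropped for that reason.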
We selected the number of clusters in skmeans based on two commonly used clustering validity indices, the elbow method (Madhulatha, 2012; Kodinariya and Makwana, 2013) and the silhouette method (Rousseeuw, 1987). Both methods are heuristic and used to determine the number of clusters visually. Then we quantitatively calculated it using the L-method algorithm (Salvador and Chan, 2004) and the curvature of a graph (Zhang et al., 2017).
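To give an idea of how a knee can be located numerically rather than visually, the Python sketch below follows the general idea of the L-method: every candidate split of the evaluation curve is fitted with two straight lines, and the split with the lowest weighted fitting error is kept. It is a simplified illustration, not the published algorithm or the authors' implementation; the weighting by the share of points on each side, the function names and the toy values are assumptions.

```python
import math


def fit_rmse(xs, ys):
    """Root mean squared error of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    squared = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(squared / n)


def l_method_knee(xs, ys):
    """Return the x value that closes the left segment of the best two-line fit."""
    n = len(xs)
    best_x, best_err = None, float("inf")
    for c in range(2, n - 1):  # both segments keep at least two points
        err = (c / n) * fit_rmse(xs[:c], ys[:c]) \
            + ((n - c) / n) * fit_rmse(xs[c:], ys[c:])
        if err < best_err:
            best_x, best_err = xs[c - 1], err
    return best_x


# Invented evaluation values: a steep segment followed by a flat one.
ks = list(range(1, 11))
criterion = [100, 80, 60, 40, 20, 16, 15, 14, 13, 12]
print(l_method_knee(ks, criterion))
```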
The different results obtained allowed us to choose a specific number of clusters, seeking a balance between the chemistry experts’ capacity to classify the clusters manually and the closeness to the best solutions. With this number of clusters, we ran the skmeans clustering technique almost 10000 times because of its stochastic behaviour and kept the best solution, that is, the one with the minimum sum of the distances of each tweet to the prototype of its assigned cluster.
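The repeated-run strategy can be sketched in Python as follows. This is a bare-bones spherical k-means written for illustration, assuming cosine dissimilarity (one minus the cosine similarity) as the distance and random tweets as initial prototypes; it does not reproduce the skmeans package in R, and all function names are hypothetical.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def spherical_kmeans(docs, k, rng, n_iter=50):
    """One stochastic run; returns (criterion, labels)."""
    docs = [normalize(d) for d in docs]
    prototypes = [list(p) for p in rng.sample(docs, k)]
    labels = None
    for _ in range(n_iter):
        # Assign each document to the prototype with the highest cosine similarity.
        new_labels = [max(range(k), key=lambda c: dot(d, prototypes[c])) for d in docs]
        if new_labels == labels:
            break
        labels = new_labels
        # Update each prototype as the normalised sum of its members.
        for c in range(k):
            members = [d for d, lab in zip(docs, labels) if lab == c]
            if members:
                prototypes[c] = normalize([sum(col) for col in zip(*members)])
    criterion = sum(1.0 - dot(d, prototypes[lab]) for d, lab in zip(docs, labels))
    return criterion, labels


def best_of_restarts(docs, k, n_runs, seed=0):
    """Repeat the stochastic run and keep the solution with the lowest criterion."""
    rng = random.Random(seed)
    return min((spherical_kmeans(docs, k, rng) for _ in range(n_runs)),
               key=lambda result: result[0])


# Toy document vectors (columns of a tiny TDM), invented for illustration.
toy_docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
crit, labels = best_of_restarts(toy_docs, k=2, n_runs=5, seed=1)
print(labels, round(crit, 4))
```

Because each run starts from a random choice of prototypes, different runs can end in different local optima, which is why the criterion value is compared across runs.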
The word clouds generated were visually interpreted by a group of chemistry experts. The topics obtained from our previous tests were the following:
• Human activity (HA): most terms are related to the presence of chemistry within the human activity such as production or the chemical industry.
• Scientific knowledge (SK): most terms are related to chemical concepts and abstract entities.
• Learning environment (LE): most terms are related to chemistry as a subject or course taught in class as well as student activities or exercises.
• Entertainment (E): most terms are related to cultural and media performances such as songs, musical groups, movies or TV series.
• Human relationships (HR): most terms are related to feelings between two or more people or emotions in general.
• Undefined (U): most terms either belong to several previous topics in which none of the terms predominate over others, or they belong to topics not defined in the list.
We defined a balanced incomplete block design (BIBD) (Fleiss, 1981) to assign clusters to be classified by chemistry experts. The experts were randomly divided into three groups where every expert was randomly assigned to one of the cluster groups defined in the BIBD. The order of the clusters analysed by every expert was randomized too. Each cluster was represented by its unigram and bigram word clouds.
We calculated the percentage of clusters and tweets assigned to each topic summing all the votes that a cluster received in each topic. The cluster was assigned to the topic with the highest number of votes. If several topics had the same number of votes, then the cluster was assigned to the U topic. Tweets belonging to the cluster inherited their cluster assignation.
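The assignment rule can be written compactly. The Python sketch below is only an illustration of the rule as described; the function name `assign_topic` and the example votes are hypothetical.

```python
from collections import Counter


def assign_topic(votes):
    """Return the most voted topic, or "U" when the top vote is tied."""
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "U"  # several topics share the highest number of votes
    return ranked[0][0]


# Invented votes of nine experts for two clusters.
print(assign_topic(["LE"] * 5 + ["HA"] * 3 + ["SK"]))
print(assign_topic(["HA"] * 4 + ["LE"] * 4 + ["E"]))
```

Every tweet in a cluster then receives the topic returned for that cluster.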
We statistically analysed the results obtained using Fleiss’ kappa (Fleiss, 1971). Fleiss’ kappa and its significance level were calculated for every topic and the whole experiment (Fleiss et al., 2003). The closer the value of kappa to one, the better the agreement. If the value was zero or below zero, the agreement was weaker than expected by chance. We used a common benchmark scale (Landis and Koch, 1977) to evaluate HA and LE Fleiss’ kappa value results in addition to their statistical representativeness.
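For readers unfamiliar with the statistic, the following Python sketch computes Fleiss’ kappa from a table of vote counts and maps a value onto the Landis and Koch (1977) benchmark labels. It covers the whole-experiment kappa only, not the per-topic values or the significance test, and the function names and the toy vote table are assumptions for illustration.

```python
def fleiss_kappa(counts):
    """counts[i][j] = number of raters who assigned subject i to category j.

    Every subject must be rated by the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])
    # Share of all assignments that fall in each category.
    p_j = [sum(row[j] for row in counts) / (n_subjects * n_raters)
           for j in range(n_categories)]
    # Observed agreement for each subject.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]
    p_bar = sum(p_i) / n_subjects
    p_e = sum(p * p for p in p_j)  # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)


def landis_koch_label(kappa):
    """Benchmark label of Landis and Koch (1977) for a kappa value."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Invented votes: three clusters, nine raters, columns HA, SK, LE, E, HR, U.
votes = [[7, 0, 1, 0, 0, 1],
         [0, 1, 8, 0, 0, 0],
         [2, 2, 2, 1, 1, 1]]
print(round(fleiss_kappa(votes), 3), landis_koch_label(fleiss_kappa(votes)))
```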
The lexicon we used was based on SentiWordNet 3.0 lexicon (Baccianella et al., 2010), which is commonly referenced in the literature (Medhat et al., 2014; Mohey and Hussein, 2018; Sun et al., 2017; Yadollahi et al., 2017; Mäntylä et al., 2018). This lexicon contains repeated words with different contexts and similar or different words which share polarity value because of similar contexts. We separated different words that shared a polarity value, assigned one polarity value per word and calculated an average polarity value for identical words. The result was a new list of single words with a single polarity value for each word.
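The construction of the single-value word list can be sketched in Python as follows. The input format (a list of terms sharing one polarity value per lexicon entry), the polarity values, and the scoring of a tweet as the sum of its word polarities are assumptions made for illustration; they are not taken from SentiWordNet or from the authors' code.

```python
from collections import defaultdict


def build_word_lexicon(entries):
    """entries: iterable of (terms, polarity); a word may appear in several entries.

    Returns one polarity value per word, averaged over all its entries.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for terms, polarity in entries:
        for word in terms:  # separate words that share a polarity value
            sums[word] += polarity
            counts[word] += 1
    return {word: sums[word] / counts[word] for word in sums}


def tweet_polarity(tweet, lexicon):
    """Sum of the polarity values of the words found in the lexicon (assumed scoring)."""
    return sum(lexicon.get(word, 0.0) for word in tweet.split())


def classify(score):
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Invented polarity values, for illustration only.
entries = [(["acid", "acidic"], -0.5), (["acid"], 0.0), (["good"], 0.75)]
lexicon = build_word_lexicon(entries)
print(lexicon["acid"], classify(tweet_polarity("chem lab acid", lexicon)))
```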
A comparison word cloud is a graphical representation of terms from different documents represented in the same word cloud and differentiated by colour. The common terms are assigned to the document where the term has its maximum deviation calculated by its frequency in that document minus the average frequency in all the documents (Fellows, 2018). We built comparison word clouds of unigrams and bigrams with positive and negative tweets for the HA and LE topics and used them to interpret their main contents visually.
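The assignment of a common term to one document can be illustrated with a short Python sketch. It assumes that the frequencies are relative frequencies within each document and uses invented values; the function name `assign_terms` is hypothetical and the drawing of the cloud itself is not covered.

```python
def assign_terms(freqs):
    """freqs: {document: {term: relative frequency}}.

    Returns {term: (document, deviation)} where the document is the one in
    which the term deviates most from its average frequency over all documents.
    """
    terms = set().union(*(doc_freqs.keys() for doc_freqs in freqs.values()))
    assignment = {}
    for term in terms:
        values = {doc: doc_freqs.get(term, 0.0) for doc, doc_freqs in freqs.items()}
        mean = sum(values.values()) / len(values)
        best = max(values, key=lambda doc: values[doc] - mean)
        assignment[term] = (best, values[best] - mean)
    return assignment


# Invented relative frequencies for positive and negative tweets.
freqs = {
    "positive": {"test": 0.5, "teacher": 0.25},
    "negative": {"test": 0.25, "exam": 0.5},
}
result = assign_terms(freqs)
print(result["test"])
```

In such a representation, a term like “test” that appears in both sets is drawn only in the colour of the set where it is relatively more frequent, which is worth keeping in mind when reading the clouds.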
We selected representative samples of the tweets corresponding to some of the most frequent positive learning environment terms to understand their content better. Sample sizes were calculated using the sample size formula for a proportion with a 0.95 confidence level, a 0.05 margin of error and a 0.5 (worst case) expected sample proportion. We visually analysed the contents of the tweet samples and classified them into ironies or non-ironies and into positive, negative and neutral sentiment.
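The sample size calculation can be sketched in Python as follows. The finite population correction is an assumption on our part: it is not stated in the text, but with it the formula yields the sample sizes of 344 and 292 reported in the results for populations of 3203 and 1208 tweets. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size(population, confidence=0.95, margin=0.05, proportion=0.5):
    """Sample size for estimating a proportion in a finite population."""
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an infinite population.
    n0 = z ** 2 * proportion * (1 - proportion) / margin ** 2
    # Finite population correction (assumed), rounded up.
    return math.ceil(n0 / (1 + (n0 - 1) / population))


print(sample_size(3203))  # 344
print(sample_size(1208))  # 292
```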
We built the two bigram TDMs with their main characteristics described in Table 1. The selection of bigrams with a frequency over 29 caused the number to be reduced from 302637 to 864 and the non-empty tweets from 76242 to 50725.
TDM | Tweets number | Bigrams number | Frequency (min) | Frequency (Q1) | Frequency (median) | Frequency (Q3) | Frequency (max) |
---|---|---|---|---|---|---|---|
Bigrams | 76242 | 302637 | 1 | 1 | 1 | 1 | 3990 |
Bigrams with frequency over 29 | 50725 | 864 | 30 | 36 | 48 | 81 | 3990 |
This reduction was due to the low frequency of most bigrams in tweets. At least 75% appeared once in all of them (Q3 = 1). We show an example of the processing of three tweets and part of the bigram TDM (Table 2).
 | Tweet 1 | Tweet 2 | Tweet 3 |
---|---|---|---|
Original tweets | “im dreading going back to college… especially bc that monday i have an 8am chem lab” | “baking cookies for chem lab and accidentally used self rising flour OMG” | “2 kumbe 1 chem journal… Copying time” |
Cleaned tweets | “dreading going back college especially monday chem lab” | “baking cookies chem lab accidentally used self rising flour omg” | “kumbe chem journal copying time” |
Bigrams obtained from cleaned Tweet 1 | “dreading going”, “going back”, “back college”, “college especially”, “especially monday”, “monday chem”, “chem lab” | | |
Bigram term document matrix | Tweet 1 | Tweet 2 | Tweet 3 |
“college especially” | 1 | 0 | 0 |
“especially monday” | 1 | 0 | 0 |
“monday chem” | 1 | 0 | 0 |
“chem lab” | 1 | 1 | 0 |
We observed that there was no clear elbow in the elbow method graph and no sharp change in the slope of the silhouette method graph, which did not let us visually determine the best number of clusters. We used the L-method and the curvature method to select the number of clusters numerically. The cluster corresponding to the minimum value in the L-method and the one corresponding to the maximum in the curvature method were considered to be the best ones. These results are included in ESI 1 (ESI†).
The elbow graph suggested using 78 and 193 clusters and the silhouette graph, 78 and 98. As there was no clear and exact solution for the number of clusters, we decided to round it off to 100 clusters. It is large enough to minimize mixed topic clusters and small enough to be classified by a chemistry expert.
We tried to run the skmeans implementation technique for 100 clusters 10000 times, but we could only get 9723 due to technical issues. We selected the best solution with the best skmeans criterion value. In that solution, the minimum tweets per cluster were 95, the maximum 3476, the median 383 and Q1 and Q3 were 251 and 647 respectively.
With a total of 18 different experts, we replicated this design in each of the three groups with six experts per group, so each cluster was categorized by nine different experts. An example of one cluster with its graphical representation is shown in Fig. 3. The representations of all the clusters are included in ESI 2 (ESI†). These representations were used to classify the clusters obtaining the results shown in Table 3. The table with the specific cluster number assigned to each expert and the detailed classification results per expert are included in ESI 3 (ESI†).
Topic | Clusters classified (%) | Tweets classified (%) |
---|---|---|
Human activity (HA) | 20 | 18 |
Scientific knowledge (SK) | 6 | 5 |
Learning environment (LE) | 45 | 39 |
Entertainment (E) | 5 | 13 |
Human relationships (HR) | 10 | 7 |
Undefined (U) | 14 | 18 |
It was possible to classify most clusters and tweets and only 14% of the clusters and 18% of the tweets were considered as undefined. LE and HA were the topics that obtained the largest numbers of clusters and tweets. LE represented 45% of the clusters and 39% of the tweets and HA 20% of the clusters and 18% of the tweets classified, whereas SK and E obtained the lowest ones. The SK topic, which concerns spreading scientific knowledge, only attained 6% of the clusters and 5% of the tweets classified.
We calculated Fleiss’ kappa for each topic and the whole experiment. These values were statistically representative because of their very low p-values (less than 1 × 10⁻⁶). We compared the Fleiss’ kappa values for HA and LE (0.388 and 0.517 respectively) with the Landis and Koch (1977) benchmark scale for kappa. The obtained values show fair and moderate inter-rater reliabilities respectively.
Fig. 4 Tweets classified by their polarity: (A) human activity (HA) tweets (total number = 9159); (B) learning environment (LE) tweets (total number = 19804).
It is worth noting how, for instance, the terms “free” and “acid” were valued as negative. Their polarity values depend on the different contexts provided by the lexicon, the polarity value of each context and the average polarity value calculated with all polarity values of a term. As an example, the different contexts of “acid” in the lexicon were “water-soluble compounds being able to damage water”, “having the characteristics of an acid”, “an acid reaction” and “being sour to the taste”. Each context had a polarity value which was used to calculate the average polarity value of “acid”.
Finally, we compared the positive and negative tweets of the HA and LE categories through comparative unigram and bigram word clouds (Fig. 5).
The visual analysis of the HA comparison word clouds suggests that tweets classified as negative and containing terms such as “attack”, “syria”, “chemical attack”, “syrian opposition” and “chemical warfare”, or “toxic”, “toxic chemical”, “chemical fire” and “chemical leak”, predominate over other negative tweet contents. These terms are capable of fuelling chemophobia attitudes. The existence of terms such as “chemical free” and “used chemical” in tweets classified as positive might reinforce this effect. The term “chemical”, seemingly understood as industry products, appears mostly in negative tweets, whereas “chemistry”, apparently understood as a physical science, appears in positive tweets.
The visual analysis of the LE comparison word clouds indicates the difficulty of chemistry as a subject. In tweets classified as negative, terms related to academic activities, such as “final”, “exam”, “chem final”, “final tomorrow”, “quiz tomorrow”, “lab”, “lecture”, “chem lab” and “chem lecture”, and terms related to feelings, such as “hate”, “hard”, “crying”, “need help”, “never understand” and “chemistry hard”, are predominant. This difficulty of learning chemistry is reinforced by the presence of terms in positive tweets such as “someone help” and “help chemistry”, which can be due to several factors described in the literature.
Despite these negative feelings about chemistry, terms such as “test”, “chem test”, “chemistry test”, “test tomorrow”, “teacher”, “chem teacher” and “chemistry teacher” also appear in tweets classified as positive. We created two statistically representative samples of the tweets corresponding to the terms “test” and “teacher” to analyse them and understand their content. These terms included the rest of the most frequent positive terms. Sample sizes were 344 and 292 tweets from 3203 and 1208 corresponding to “test” and “teacher” terms respectively. We randomly selected these tweets and analysed and classified their contents as it was described in the experimental section part. Results obtained are included in ESI 4 (ESI†).
We found that many tweets in both samples seem to transmit either a neutral or a negative sentiment. There were also many ironies among the tweets classified as negative, such as “First organic chemistry test tonight… Will someone start digging my grave now?” or “My Chem teacher looks like she could be a character on Phineas and Ferb”. Despite being classified as positive by the sentiment lexicon used, these terms seem to reinforce the negative ones and thus increase the negative sentiment.
Chemistry-related public tweets containing the words “chemistry”, “chemical” or “chem” reveal a strong presence of learning environment (LE) and human activities (HA) topics, but a limited one as far as transmitting chemistry knowledge is concerned.
The sentiment analysis results of HA tweets, with a higher percentage of positive tweets than negative ones, seem to follow the trend of the most recent studies (Hadhri, 2010; The Royal Society of Chemistry and TNS BMRB, 2015). The existence of both positive and negative tweets also seems to be in line with the contraposition between the positive and negative effects of chemistry described in The Royal Society of Chemistry and TNS BMRB (2015). There, 59% of the respondents answered that the benefits of chemistry were higher than its harmful effects and 51% perceived a neutral feeling. This contraposition seems to be highlighted by the terms “chemical”, understood as industry products, and “chemistry”, considered as a physical science, with negative and positive connotations respectively. These results are in line with The Royal Society of Chemistry and TNS BMRB (2015). Chemophobia attitudes seem to be suggested by terms that appear in both negative and positive tweets related to HA. Many of these terms seem to be related to chemical war, chemical toxicity and chemical disasters. At the same time, these terms might create or reinforce chemophobia perceptions among Twitter users.
The sentiment analysis of LE tweets also shows a higher percentage of positive tweets than negative ones. LE positive tweets, however, should be analysed in more depth to review their sentiment values because of the context of the terms used in the sentiment lexicon. The contraposition between the positive and negative effects of chemistry is also present in this topic.
In LE, the image of chemistry seems to be based on specific elements of chemistry education, such as evaluation methods and teachers, rather than on chemistry communication topics in academia and their influence (Nicolas, 2006; Penagos and Lozano, 2009; Chamizo, 2011; Lacolla et al., 2013) or on curricula contents (Jiménez and Criado García-Legaz, 2005; Nicolas, 2006; Muñoz and Nardi, 2011; Linthorst, 2012; Piñeros and Parga, 2014). This is a new contribution on this topic. Similar to our research, this image has so far always been considered negative in the view perceived by students of chemistry (Yager and Penick, 1983; Furió Más, 2006).
Additionally, LE messages seem to contain words related to classroom elements such as lectures and exams, perceived negatively as well as expressing the difficulty of learning. These negative feelings about chemistry learning might favour the association between chemistry and negative memories, in agreement with the conclusions of The Royal Society of Chemistry and TNS BMRB (2015) study.
Text cleaning and preparation techniques reduced the gathered tweets to 50725 useful tweets. These were classified into six different topics. The two most frequent topics were activities and tasks related to chemistry courses (the learning environment topic, 39%) and facts and news related to the chemical industry and industry products (the human activity topic, 18%). Only a small percentage of tweets related to the transmission and communication of chemistry knowledge (the scientific knowledge topic, 5%) was found. The remaining tweets are either unclassified or belong to categories less relevant to chemistry and chemistry education.
Sentiment analysis techniques helped us to observe many terms in the human activity topic suggesting chemophobia, whereas chemistry is perceived as difficult by Twitter users in the learning environment topic. These terms seem to relate to war, toxicity and disasters in the human activity topic. In the learning environment, most frequent terms seem to relate to classroom activities and students’ sentiments about the chemistry subject.
These two topics contained both positive and negative sentiments, in line with the latest accepted vision of the public image of chemistry, with chemophobia still present in the human activity topic. This observation and the negative feelings found in the learning environment topic suggest that there is still room for improvement in current practices in chemistry education, in both formal and informal settings. These improvements may lead to better scientific communication and knowledge, enhancing citizen participation in science.
Further studies should focus on addressing the main limitations described above as well as on monitoring the evolution of the public perception of chemistry on Twitter over a longer period. Thus, scientists and practitioners could obtain a wider view of this perception on Twitter and detect new topics and associated contents. This new knowledge will be helpful to chemistry stakeholders for improving the public image of chemistry.
Footnote
† Electronic supplementary information (ESI) available: Additional analysis results and figures. See DOI: 10.1039/c9rp00282k
This journal is © The Royal Society of Chemistry 2020