Martin Rusek* and Karel Vojíř
Department of Chemistry and Chemistry Education, Charles University, Faculty of Education, Magdalény Rettigové 4, 116 39 Praha 1, Czech Republic. E-mail: martin.rusek@pedf.cuni.cz; karel.vojir@pedf.cuni.cz
First published on 7th August 2018
This paper focuses on the procedure and results for analyzing text-difficulty in lower-secondary chemistry textbooks in the Czech Republic. The authors use established methodology for text-difficulty analysis by Nestler, adapted by Průcha and Pluskal by adding a second independent analyser to improve reliability. Some textbooks do not follow the expected trend of either text-difficulty coherence or increasing text-difficulty between books for the 8th and 9th grade. No trend in topic difficulty was found either. The results show that learning outcomes may differ significantly when different books are used, despite the fact that they are supposed to support the same curriculum. For this reason, the results serve to support not only teachers when selecting a textbook, but also researchers as a starting point for lesson observations.
The process of learning in general, and of learning science in particular, depends heavily on effective communication (Aydin et al., 2014, p. 384). As textbooks are one of the first resources where students encounter field-specific information, they play an important role in offering credible information to students when used appropriately. They are also a cultural artefact that mediates the process of internalising an individual's thinking (Z. Sikorová, 2008, p. 59). Textbooks are also valuable sources for learners who need to study alone at their own pace or to remember the topics taught in class (Aydin et al., 2014, p. 384). For these reasons, the text in textbooks is supposed to be comprehensible for students, as its role is to stimulate learning. After all, “language is a major barrier (if not the major barrier) to most students in learning science” (Wellington and Osborne, 2001, p. 2). This point stresses the interconnectedness of reading literacy and the transfer of field-specific information (Callender and McDaniel, 2009). In relation to Markow's (1988) and Nemeth's (2006) ideas, this transfer can also be seen as one of the main components of scientific literacy: mastering scientific language, i.e. using terms appropriately (see OECD, 2016, pp. 17–46). Nevertheless, although element symbols, names and formulas are field-specific representations, their meaning does not differ from that of a word, i.e. a scientific term (cf. Wellington and Osborne, 2001).
For these reasons, research on textbooks, especially their text readability, is a very important topic in teaching (Pappa and Tsaparlis, 2011). Understanding the history of science allows us to explain science education's current context, because the current context is conditioned by former initiatives (cf. Alpaslan et al., 2015, p. 208). In the same way, we need to understand current textbooks to be able to create good didactical materials.
Comprehending new information depends on both student and textual characteristics. Therefore, instructors need to assess not only their students’ comprehension ability and prior knowledge, but also the difficulty of the materials they present to their students (Pyburn and Pazicni, 2014, p. 782). Research has shown that inaccurate or incomprehensible texts can cause students’ misconceptions (Sanger and Greenbowe, 1999; Pedrosa and Dias, 2000; Bergqvist and Chang Rundgren, 2017). This can be caused either by oversimplifying the subject matter or concept, or by a text which is too difficult for students to comprehend. As textbooks are often a tool which determines educational content and concepts (Drechsler and Schmidt, 2005; Dávila and Talanquer, 2010), the problems of text analysis and the scope of its difficulty are of high importance. The scope of textbook analysis is vast. Published papers focus, for example, on:
• the quality of elaboration of particular topics in textbooks (Sanger and Greenbowe, 1999; Klapko, 2006; Knecht, 2007; Nakiboğlu and Yildirir, 2011),
• curricular reform indicators (Kahveci, 2010),
• comparing textbook emphasis (Kim and Park, 2008) or conception (Lee, 2012).
In the literature, several parameters of an efficient textbook are emphasized: lucidity, scientific accuracy, content adequacy (cf. Knecht and Weinhöfer, 2006; Mikk, 2007; M. Sikorová, 2007), topic/subject matter sequencing (Mikk, 2007), visual element clearness, and relevance (Knecht and Weinhöfer, 2006). The selected textbook parameters and the criteria to evaluate them have one common intersection: subject matter/text adequacy, i.e. textbook readability (see Mikk, 2008). In the above-mentioned research focused on science textbooks, several approaches were chosen. A number of research articles focus on the structure of terms (Fitzgerald et al., 2017), the order of the subject matter (Tsaparlis, 2014) or language explicitness within one concrete topic (Taibu et al., 2015). Other authors focus on text characteristics that reflect its difficulty and comprehensibility, i.e. the use of phrases and text organization in textbooks (Biber et al., 2004; Parodi, 2010), or the use of terms and their frequency (Hsu, 2014). There are also endeavours to analyse text difficulty with a system for computing cohesion and coherence metrics (Pyburn and Pazicni, 2014). For the overwhelming majority of languages, there are no such systems; therefore, original approaches are still being employed.
Textbook analysis has quite a tradition in Czech education research, with text difficulty with respect to semantics and syntax being one of the most frequently examined issues (Greger, 2005; Hrabí, 2007; Rusek et al., 2016). The majority of authors follow the work by K. Nestler (1974, 1982), further developed by J. Průcha (1984) and adjusted by M. Pluskal (Pluskal, 1996). The Nestler–Průcha–Pluskal method for text-difficulty analysis has been used (with just small alterations) ever since (see e.g. Hrabí, 2007; Weinhöfer, 2007; Rusek et al., 2016), as it offers a complex set of monitored criteria and also the possibility to compare textbook text-difficulty within one field or among other fields.
Some of the cited research focused explicitly on science textbooks. However, chemistry textbooks have not yet been given appropriate attention by researchers. This text follows the tradition, uses, and in some respects further develops the established methodology and focuses on the so far neglected field of lower-secondary chemistry education. The authors of this paper start from Beneš et al. (2009) and Banýr (1988) and from the paper by Rusek et al. (2016), who focused on the whole range of elementary school chemistry textbooks and developed the method for text-difficulty analysis by Pluskal (1996). As previous papers focused mostly on textbook text-difficulty in general, this research is more focused on semantic difficulty, comparing particular textbooks in one textbook series and the difficulty of selected topics within the textbooks.
For this reason, the textbooks approved by the Ministry of Education were included in the analysis. The textbooks are listed in Table 1.
Textbook title | Published [a] | Authors | Publisher |
---|---|---|---|
Základy praktické chemie 1; 2 | 1999, 2000 | Beneš, P., Pumpr, V. and Banýr, J. | Prague: Fortuna |
Základy chemie 1; 2 | 1993 | Beneš, P., Pumpr, V. and Banýr, J. | Prague: Fortuna |
Chemie 8; 9 | 2006, 2007 | Škoda, J. and Doulík, P. | Plzeň: Fraus |
CHEMIE Krok za krokem; CHEMIE Na každém kroku | 1999, 2000 | Bílek, M. and Rychtera, J. | Pardubice: Moby Dick |
Chemie 8; 9 | 2016, 2015 | Mach, J., Plucková, I. and Šibor, J. | Brno: Nová škola |
Chemie I; II | 2004, 2007 | Karger, I., Pečová, D. and Peč, P. | Olomouc: Prodos |

[a] The two records relate to the two books for the 8th and 9th grades.
The analysed topics were chosen based on several criteria:
• fundamental topics in lower-secondary chemistry,
• all topics covered in all the analysed textbooks,
• even distribution of the analysed topics in study years,
• each analysed chapter contains at least a 200-word explanatory text in all the analysed books.
Based on these criteria, the following six topics were chosen:
• Air
• Hydrogen
• Neutralization
• Alkanes
• Carboxylic acids
• Proteins
I. From each of the textbook texts, a sample of at least 200 words was chosen from the explanatory part of the chapter (legends, data in tables, experiment descriptions, etc. do not count). The sample ends at the end of the sentence in which the 200-word mark was reached. (A sample of 200 words has been shown to be sufficient for analysis (Průcha, 2002).)
II. The total number of words (N), sentences (from the capital letter to the full stop or other closing symbol) (S) and verbs in the active form (V) were counted.
III. Nouns along with substantivized verbs were identified.
IV. The nouns were sorted into categories (we used different colours): (T1) new general terms, (T2) new scientific terms, (T3) geographical terms (Earth, Moon, Sun, places, states, cities, etc. and other factual names), (T4) quantitative terms (numbers, era, percentage, mass etc.), and (T5) repeated terms (within the selected text area).
Chemistry-specific representations (symbols of elements, formulas, etc.) were also considered scientific terms (cf. Markic and Childs, 2016, p. 435). Although there are other categories of terms which contribute to text difficulty (philosophical terms, mathematical symbols, historical terms, etc.), geographical terms were kept as a separate category to reproduce the procedure known from the literature, where the special treatment of those terms is justified.
V. The procedures in steps II–IV were carried out by two independent researchers. The procedures in 2–4 were confirmatory, whereas 5 produced high variance: many nouns (terms) were either not identified by one or the other researcher or were categorized differently. Therefore, a third researcher analysed the particular text in order to decide (cf. Teo et al., 2014).
VI. The numbers were entered into tables for each parameter. Furthermore, two levels of difficulty were calculated: the syntactic difficulty (formula for N, V, and S) and the conceptual (semantic) difficulty (formula for T1–T5). The sum of the two gives the overall text difficulty.
The list of abbreviations is provided in Appendix 1.
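Steps I and II above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `select_sample` and `count_words_and_sentences` are hypothetical, sentence boundaries are detected with a naive regular expression, and the identification of verbs and terms (steps II–IV) is left out because it was done by human coders.

```python
import re

# Naive sentence boundary: '.', '!' or '?' followed by whitespace.
# ASSUMPTION: abbreviations such as "etc." would need manual checking.
_SENTENCE_END = r"(?<=[.!?])\s+"


def select_sample(text: str, min_words: int = 200) -> str:
    """Collect whole sentences from the start of `text` until at least
    `min_words` words have been gathered (step I)."""
    sentences = re.split(_SENTENCE_END, text.strip())
    chosen, n_words = [], 0
    for sentence in sentences:
        chosen.append(sentence)
        n_words += len(sentence.split())
        if n_words >= min_words:
            break  # the sentence in which the threshold is reached stays whole
    return " ".join(chosen)


def count_words_and_sentences(sample: str) -> tuple[int, int]:
    """Return (N, S) for a sample (the automatable part of step II)."""
    n = len(sample.split())
    s = len([part for part in re.split(_SENTENCE_END, sample.strip()) if part])
    return n, s
```

With a threshold of 200 words, the returned sample is usually slightly longer than 200 words, because the last sentence is never cut.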
D = Dst + Dsm | (1) |
![]() | (2) |
The semantic text difficulty is given by the following formula (3).
![]() | (3) |
Text difficulty (readability) is also influenced by the average sentence length (L) and the average length of the sentence section. Both factors are given as a fraction of the total number of words and sentences (4) or verbs in the active form (5). All the following formulas (4–13) represent the total numbers of terms in books by the publishers. They are therefore sums of the terms calculated for particular analysed topics.
$L = \dfrac{\sum N}{\sum S}$ | (4) |
$M = \dfrac{\sum N}{\sum V}$ | (5) |
![]() | (6) |
![]() | (7) |
![]() | (8) |
![]() | (9) |
![]() | (10) |
![]() | (11) |
Out of the five categories of terms, the sum of T2, T3 and T4 represents the scientific information delivered by the analysed text. Another aspect of the text's scientific value is therefore captured by the coefficients of density of scientific information (i and h), which were also calculated using formulas (12) and (13).
![]() | (12) |
![]() | (13) |
An example of the term categorization is shown in Appendix 2.
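The calculation chain from raw counts to D, Dsm, i and h can be sketched as follows. The formulas for L and M follow the description in the text. The expressions for Dst, Dsm, i and h are not given in the text here; they are written as they are commonly stated for the Nestler–Průcha–Pluskal method and should be read as an assumption, not as the authors' exact formulas. The class and function names, and the counts in the example, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TextCounts:
    """Raw counts from steps II-IV for one sample (or summed over samples)."""
    n: int   # N, total number of words
    s: int   # S, sentences
    v: int   # V, verbs in the active form
    t1: int  # T1, new general terms
    t2: int  # T2, new scientific terms
    t3: int  # T3, geographical terms
    t4: int  # T4, quantitative terms
    t5: int  # T5, repeated terms


def indicators(c: TextCounts) -> dict[str, float]:
    t = c.t1 + c.t2 + c.t3 + c.t4 + c.t5   # T, total number of terms
    scientific = c.t2 + c.t3 + c.t4        # terms carrying scientific information
    avg_sentence = c.n / c.s               # L, words per sentence
    avg_section = c.n / c.v                # M, words per active verb
    # ASSUMPTION: Dst = 0.1 * L * M, as usually quoted for this method.
    d_st = 0.1 * avg_sentence * avg_section
    # ASSUMPTION: scientific terms weighted 3, geographical and quantitative 2.
    weighted = c.t1 + 3 * c.t2 + 2 * c.t3 + 2 * c.t4 + c.t5
    d_sm = 100 * (t / c.n) * (weighted / c.n)
    return {
        "L": avg_sentence,
        "M": avg_section,
        "Dst": d_st,
        "Dsm": d_sm,
        "D": d_st + d_sm,                  # D = Dst + Dsm, formula (1)
        "i": 100 * scientific / c.n,       # ASSUMPTION: density relative to words
        "h": 100 * scientific / t,         # ASSUMPTION: density relative to terms
    }


# Made-up counts for illustration only; they are not taken from any textbook.
example = TextCounts(n=200, s=20, v=25, t1=20, t2=30, t3=5, t4=5, t5=20)
result = indicators(example)
```

Keeping the weights in one place makes it easy to swap in a different weighting if the published formulas differ from the ones assumed here.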
The second part of the analysis focused on the scientific terms used in a textbook (only Dsm, i and h are followed). In this part, the topics within each book were compared. The “scientific term load” is a reflection of the textbook authors’ pedagogical content knowledge (Shulman, 1987), i.e. the selection of terms that particular authors consider important for the students to learn in each topic. This could be the cause of the different learning opportunity that students have when using different textbooks to learn chemistry. That is why the calculated values (Dsm, i and h) were analysed within each book and between the textbooks, in order to highlight this important textbook feature.
Within the books by the Moby Dick publishing house, the most significant decline in D was found. Compared with the other textbooks for the 8th grade, this textbook is the most difficult in all three calculated factors (D, Dsm and Dst). In contrast, the textbook for the 9th grade is the easiest in Dsm and D.
The decline in D (20.6) is caused by the decline of Dst (4.3) and Dsm (16.3). The decline of Dst is caused mainly by the decline in sentence length (4.5). The decline of Dsm is caused by a decrease in the proportion of terms in the text (9.4%), especially influenced by a decrease in the proportion of scientific terms (4.6%). This suggests that the books for 9th graders do not bring as much scientific information as the books for 8th graders.
With respect to the development of students’ thinking and progress in scientific education, the opposite trend was expected. Easier study texts may result in deceleration of students’ development (Ainsworth, 2006).
An analogous trend can be seen in the books by the Fortuna publishing house (ZCH). Among all the analysed textbooks, the book for the 8th grade in this series is the second most difficult in Dst. In the book for the 9th grade, Dsm and D are the second easiest. The decline in D (10.0) is caused by the drop in Dst (2.7) and Dsm (7.3), which represents the second biggest drop in both values among all the analysed textbooks. The drop in Dst is caused by the drop in the number of sentences (2.14). The authors use more verbs in the active form in the sentences, therefore making the text more comprehensible. The drop in Dsm is caused mainly by a lower proportion of newly introduced scientific terms (by 4.5%). However, the proportion of general terms used is higher in this case (3.8%). Although the change in Dst is caused here by a different aspect than in the books by the Moby Dick publishing house, similar risks may occur here too.
In contrast to the previous textbooks, the most outstanding increase in D (16.0) was noted between the textbooks by the Nová škola publishing house. This is caused by the increase in Dst (6.8) and Dsm (9.2). The textbook for the 9th grade is the most difficult in all three parameters. The increase in Dst is caused by an increase in sentence length (5.0), the greatest increase recorded. The increase in Dsm is caused by the overall proportion of terms in the text (5.4%), where the proportions of scientific terms (3.0%) and repeated terms (10.4) grow and the proportion of general terms decreases (7.5%). This increase in text difficulty could have a positive effect on scientific literacy development, and it could also affect the importance of a topic as perceived by students. On the other hand, it could result in lower text comprehensibility for students and interfere with the basic function of a textbook: subject matter transfer and learning regulation.
An increasing trend can be observed in the books by the Fraus publishing house too. According to all three observed values, the book for the 8th grade by this publisher is the easiest. Among the books for the 9th grade, this publisher's book is the easiest in Dst. Among all the analysed textbooks, the increase in D (8.3) is the second biggest. It is driven mainly by the increase in Dsm (6.8), caused by the overall increase in the proportion of terms used (4.4%), the second biggest, and the increased proportion of repeated terms (3.8%). With textbook function in mind, this result can be considered appropriate as the text in these textbooks broadens the context and aims to repeat scientific terms.
Among all the analysed textbooks, the book series by the Prodos publishing house is the most consistent in Dsm between the books for the 8th and 9th grade (0.3), but the proportion of general terms increases (4.8), the biggest increase of all the analysed books. However, a decrease in the proportions of scientific terms (1.3%) and quantitative terms (2.5%) was recorded.
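Because D is the sum of Dst and Dsm, the between-grade changes quoted above can be checked for internal consistency. The short sketch below does this with the values given in the text; declines are entered as negative numbers, which is a convention chosen here, and the dictionary and function names are hypothetical.

```python
# Changes from the 8th- to the 9th-grade book, as quoted in the text.
# Sign convention (an assumption of this sketch): declines are negative.
reported_changes = {
    "Moby Dick": {"D": -20.6, "Dst": -4.3, "Dsm": -16.3},
    "Fortuna (ZCH)": {"D": -10.0, "Dst": -2.7, "Dsm": -7.3},
    "Nová škola": {"D": 16.0, "Dst": 6.8, "Dsm": 9.2},
}


def is_additive(change: dict[str, float], tol: float = 0.05) -> bool:
    """Check that the change in D equals the sum of the changes in Dst and Dsm."""
    return abs(change["D"] - (change["Dst"] + change["Dsm"])) <= tol


consistent = {series: is_additive(c) for series, c in reported_changes.items()}

# For Fraus only the changes in D (8.3) and Dsm (6.8) are quoted,
# so the change in Dst is implied rather than reported.
implied_fraus_dst = round(8.3 - 6.8, 1)
```

All three fully reported series satisfy the additivity check, and the implied Dst increase for the Fraus books is 1.5.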
The particular topic values of Dsm in the analysed books are shown in Fig. 2. It is obvious that the authors of the analysed textbooks approach the topics differently. The text's semantic difficulty seems not to be considered (see Fig. 3). This could be problematic for education's effectiveness as “students… show greater learning (comprehension) of science texts when they are highly cohesive, as determined by the use of repetition of nouns and phrases” (Hall et al., 2014, p. 79).
A certain trend can be observed (with the exception of the topic on proteins) in the textbooks by the Nová škola publishing house. In the topic on air, all three observed values were the lowest of all the analysed topics. The highest Dsm was found in the topics on carboxylic acids (42.7) and alkanes (41.4). The highest i and h values (19.6 and 47.1, respectively) were found in the topic on hydrogen. Except for proteins, Dsm increases notably from one topic to the next. The text carries characteristics of systematic text-difficulty development. This is important for a textbook as “learners must know how a representation encodes and presents information” (Ainsworth, 2006, p. 186).
Within the textbooks with the greatest Dsm span (Moby Dick and Fortuna (ZCH)), the greatest Dsm variance in particular topics was found. The Dsm value does not correlate with the importance of a particular topic; an example is the topic on neutralization (see below). No unifying key was found within this variation.
The biggest difference in Dsm of all the textbooks was found in the books by the Moby Dick publishing house, with distinct differences in Dsm and i among the analysed topics (Dsm 31.5 and i 15.3). The highest Dsm value (50.5) was found in the topic on neutralization; the lowest values were found in the topics on proteins (19.0) and alkanes (20.0). A similar trend was also noted for i. It was the highest (23.5) in the topic on neutralization and the lowest (8.2) in the topic on proteins. The h value also differs significantly. The topic on hydrogen (48.9) and, in contrast, the topic on proteins (22.5) show substantial differences. Regarding proteins, the authors introduce the fewest terms of all the analysed textbooks. This suggests the topics in the textbook are not elaborated evenly. As no standard is provided, the book authors set it themselves. Such a huge difference may suggest that neutralization is somehow a more difficult or more scientifically important topic, whereas proteins represent a considerably easier or less important topic. However, the range of Dsm in the case of neutralization between the books by Fraus (20) and Moby Dick (50) shows that the authors of the Fraus book managed to explain the topic using simpler language.
Distinct differences in Dsm and i were found in the books by the Fortuna publishing house (ZCH) too. The difference in the i values (15.6) is the biggest of all the analysed books. This series also differs in Dsm compared to the other textbooks. The Dsm value is again significantly higher in the topic on neutralization (49.5), an outlier. The i value (26.0) is distinctly higher in the same topic. Here the authors introduce the most new terms among all the analysed textbooks. In contrast, the i value is lower (10.4) in the topic on proteins. In this topic, the lowest h value (22.8) was found. This suggests that the authors introduce the fewest new terms in this chapter.
The difference in topics can also be observed in the books by the Prodos publishing house. A significantly higher difference in the h values (32.1) for the analysed topics was found. This difference is the highest among all the analysed textbooks. The Dsm value for the neutralization topic is distinctly higher (41.2), whilst the topic on air is the lowest (23.2). Neutralization is also characterised by a distinctly higher i value (26.2). In contrast, the i value in the topic on proteins (13.7) is distinctly lower. The highest h value was found within the topics of hydrogen (61.7), neutralization (60.9) and carboxylic acids (60.0), and the lowest within the topic on proteins (29.6). The fact that three of the six analysed topics are characterised by a distinctly higher h suggests that newly introduced terms are not repeated as much as in the other textbooks in this book series.
Compared to the other books, the Fortuna (PCH) and Fraus publishing house books contain significantly smaller differences in Dsm (12.2 and 12.1, respectively). In these books, text difficulty is spread evenly across the analysed topics.
The analysed topics in the Fortuna publishing house (PCH) books differ in i (4.9) and h (10.9), which is the smallest difference compared to the other analysed textbooks. In this textbook series, the highest Dsm (38.5) was found in the topic on alkanes, and the lowest (26.3) in the topic on hydrogen. Alkanes are also a topic with a significantly higher i value (20.5) than the rest of the analysed topics. Alkanes with 45.2 and carboxylic acids with 46.8 represent topics with the greatest h value. In contrast, the lowest h value (35.9) was found in the topic on proteins.
In the books by the Fraus publishing house, a significantly lower i value difference (5.5) was found than in the other textbooks. In the analysed topics, opposite values for the topics on neutralization and proteins were found. Whereas Dsm (31.5) for proteins is considerably higher, Dsm (19.4) in the neutralization topic is lower. This trend is the opposite of that observed in the other analysed textbooks. A significantly higher i value (17.24) was found in the topic on alkanes and a conversely lower value (11.8) in the topic on neutralization. As far as the h value is concerned, the topic on alkanes with 47.9 is distinctly higher than the others. In contrast, the topic on hydrogen has a considerably lower (29.6) h value.
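The "differences" reported for each series in this section are spans between the highest and lowest topic value. The sketch below recomputes them from the extreme values quoted in the text, on the assumption that the quoted values are indeed the maximum and minimum for each series. The variable names are hypothetical.

```python
# (series, indicator) -> (highest topic value, lowest topic value, reported span),
# all taken from the text of this section.
quoted = {
    ("Moby Dick", "Dsm"): (50.5, 19.0, 31.5),
    ("Moby Dick", "i"): (23.5, 8.2, 15.3),
    ("Fortuna (ZCH)", "i"): (26.0, 10.4, 15.6),
    ("Prodos", "h"): (61.7, 29.6, 32.1),
    ("Fortuna (PCH)", "Dsm"): (38.5, 26.3, 12.2),
    ("Fortuna (PCH)", "h"): (46.8, 35.9, 10.9),
    ("Fraus", "Dsm"): (31.5, 19.4, 12.1),
}

# Span = highest minus lowest topic value, rounded to one decimal place.
spans = {key: round(high - low, 1) for key, (high, low, _) in quoted.items()}

# Compare each recomputed span with the span reported in the text.
matches = {
    key: abs(spans[key] - reported) < 0.05
    for key, (_, _, reported) in quoted.items()
}
```

Every recomputed span agrees with the reported one, which supports reading the quoted differences as simple max-minus-min ranges over the six topics.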
The general method of textbook analysis utilized in this research proved usable for chemistry textbook analysis too, thanks to the variety of term categories included in the formula. It is also not language-specific, so it can be used for other languages. Given the restrictions that text-analysis algorithms have for less frequent languages, this method offers an interesting option.
In the analysed Czech chemistry textbooks for elementary schools (grades 8 and 9), no systematic approach towards scientific literacy and reading literacy development was found. The textbooks do not reflect the authors’ work with the theory of education in terms of introducing new terms within one topic. Nor do they reflect a deliberate strategy for working with text syntax. With Markow's (1988) original idea, further developed by Nemeth (2006), in mind, the lingual representation plays a vital role in chemistry education. Also, the fact that most students at the beginning of science study face problems understanding field-oriented texts (Callender and McDaniel, 2009) suggests how important studies such as this are. These findings lead to the first conclusion: the syntactic as well as the semantic aspects of textbooks require proper attention in future textbook writing.
The text difficulty (D) of the analysed textbooks was not as constant or progressive as expected. There were textbook series where the eighth-graders’ book was more difficult (semantically and syntactically) than the ninth-graders’ book. This also concerns the semantic text difficulty of the analysed topics, which confirms the results presented by Knecht (2007, p. 107). This result suggests that the term selection for particular topic chapters was accidental. This is most obvious within the neutralization topic. On the presumption that all the elementary school chemistry topics follow the national curriculum (expected outcomes), equality within text-difficulty indicators is expected. The effort to state the educational outcomes more precisely is visible in the latest addition to the Czech national curriculum, the Educational Standards and so-called indicator tasks (see Vojíř et al., 2017). Nevertheless, the discovered inequality suggests that students who learn with different textbooks receive information of different depth and quality.
The results of this study can help teachers choose a textbook which corresponds with their teaching style or their students’ needs. Also, the results can serve researchers. The method of text analysis can be replicated and used for different fields or in different countries to compare textbooks and the textbook quality and its effect on the students’ results in an international comparison (e.g. PISA). Knowledge about textbook difficulty can also be an interesting starting point for lesson observation. Last but not least, the results may serve textbook authors in textbook reeditions or new textbook creation.
The limitations of this study lie in the limited information gained by text-difficulty analysis alone. Naturally, there are many other aspects that an efficient textbook needs to contain. Therefore, it is not possible to assess a textbook solely on the basis of its text difficulty. However, the text is the most important component. For this reason, text difficulty was the first aspect analysed in the textbooks. Others will follow in order to give a fuller picture.
In further work, the authors of this text intend to focus on the way teachers use textbooks in education, more precisely the frequency, purpose and phases in which textbooks are being used. Special attention will be given to problem tasks in the textbooks, their quality and cognitive requirements with regard to the authors’ previous work in this field (see Vojíř et al., 2017).
Beneš P., Pumpr V. and Banýr J., (1993), Základy chemie 1 pro 8. ročník základní školy a nižší ročníky víceletých gymnázií, Praha: Fortuna.
Beneš P., Pumpr V. and Banýr J., (1993), Základy chemie 2 pro 9. ročník základní školy a nižší ročníky víceletých gymnázií, Praha: Fortuna.
Beneš P., Pumpr V. and Banýr J., (1999), Základy praktické chemie 1 pro 8. ročník základní školy, Praha: Fortuna.
Beneš P., Pumpr V. and Banýr J., (2000), Základy praktické chemie 2 pro 9. ročník základní školy, Praha: Fortuna.
Bílek M. and Rychtera J., (1999), Chemie krok za krokem, Pardubice: Moby Dick.
Bílek M. and Rychtera J., (2000), Chemie na každém kroku, Pardubice: Moby Dick.
Karger I., Pečová D. and Peč P., (2007), Chemie I pro 8. ročník základních škol a nižší ročníky víceletých gymnázií, Olomouc: Prodos.
Mach J., Plucková I. and Šibor J., (2016), Chemie pro 8. ročník Úvod do obecné a anorganické chemie (učebnice), Brno: Nová škola.
Pečová D., Karger I. and Peč P., (2004), Chemie II pro 9. ročník základní školy a nižší ročníky víceletých gymnázií, Olomouc: Prodos.
Šibor J., Plucková I. and Mach J., (2015), Chemie pro 9. ročník Úvod do obecné a anorganické chemie, biochemie a dalších chemických oborů (učebnice), Brno: Nová škola.
Škoda J. and Doulík P., (2006), Chemie 8 učebnice pro základní školy a víceletá gymnázia, Plzeň: Fraus.
Škoda J. and Doulík P., (2007), Chemie 9 učebnice pro základní školy a víceletá gymnázia, Plzeň: Fraus.
Total text difficulty – D
Syntactic text difficulty – Dst
Semantic text difficulty – Dsm
Total number of words – N
Sentences – S
Verbs in active form – V
Total number of terms – T
New general terms – T1
New scientific terms – T2
Geographical terms – T3
Quantitative terms – T4
Repeated terms – T5
Density coefficients of scientific information – i, h
Average sentence length – L
Average length of sentence section – M
Proportion of terms – P
Proportion of new general terms – P1
Proportion of new scientific terms – P2
Proportion of geographical terms – P3
Proportion of quantitative terms – P4
Proportion of repeated terms – P5
Example 1 (Škoda and Doulík, 2006, p. 34):
Hydrogen[2] is the most widespread element[2] in the Universe[3]. 90%[4] of all atoms[2] in the Universe[5] are atoms[5] of hydrogen[5]. On Earth[3], hydrogen[5] is the third most widespread element[5].
The indices in square brackets after the terms mark the category the term was placed into. For the meaning of the categories see chapter Analysed categories (IV). Categorization of the repeated terms is obvious. In this example, Universe and Earth are not considered to be scientific terms as they represent only a reference. In geography, however, their meaning would be different.
The only difficulty is in the categorization of general and scientific terms. Its occurrence is obvious in the second example.
Example 2 (Škoda and Doulík, 2006, p. 34):
Hydrogen[5] is a colourless gas[2] without taste[1] and odour[1]. It produces diatomic molecules[2] H2[2].
As seen above, chemical symbols and formulas are considered scientific terms.
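The tallying of categories and the comparison of two coders can be sketched as follows, using Example 2. The first annotation reproduces the categories shown in the example; the second coder's annotation is invented for illustration, and the function names are hypothetical. Keying the comparison by term is a simplification: a real analysis would key by position in the text, since the same word can occur more than once.

```python
from collections import Counter

# Example 2 as (term, category) pairs, categories as defined in step IV.
coder_a = [("Hydrogen", 5), ("gas", 2), ("taste", 1),
           ("odour", 1), ("molecules", 2), ("H2", 2)]

# HYPOTHETICAL second coder: rates "gas" as a general term and misses "odour".
coder_b = [("Hydrogen", 5), ("gas", 1), ("taste", 1),
           ("molecules", 2), ("H2", 2)]


def tally(annotated: list[tuple[str, int]]) -> dict[str, int]:
    """Count terms per category T1..T5."""
    counts = Counter(category for _, category in annotated)
    return {f"T{k}": counts.get(k, 0) for k in range(1, 6)}


def disagreements(a: list[tuple[str, int]], b: list[tuple[str, int]]) -> list[str]:
    """Terms categorized differently or identified by only one coder;
    these would be passed to a third researcher (step V)."""
    cat_a, cat_b = dict(a), dict(b)
    return sorted(t for t in cat_a.keys() | cat_b.keys()
                  if cat_a.get(t) != cat_b.get(t))


totals = tally(coder_a)
to_resolve = disagreements(coder_a, coder_b)
```

For Example 2 as annotated in the text, the tally gives two general terms, three scientific terms and one repeated term; with the invented second coder, "gas" and "odour" are the terms left for the third researcher.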
This journal is © The Royal Society of Chemistry 2019