Shang Dou (a),
Qing Zhou (b),
Weiping Hu* (a, c),
Xipei Guo (a) and
Yujing Guo (d)
(a) Key Laboratory of Modern Teaching Technology, Ministry of Education, Shaanxi Normal University, Xi’an, Shaanxi, China. E-mail: Weipinghu@163.com
(b) School of Chemistry and Chemical Engineering, Shaanxi Normal University, Xi’an, China
(c) Shaanxi Normal University Branch, Collaborative Innovation Center of Assessment toward Basic Education Quality at Beijing Normal University, China
(d) College of Education, Capital Normal University, Beijing, China
First published on 25th July 2025
This study used a graph theory-based social network analysis method to explore the cognitive knowledge structures of groups of high school students with different achievement levels on the topic of ethanol. Its aim was to provide practical suggestions for classroom-based teaching of different student groups. Semi-structured interviews were used to collect data on the ethanol knowledge of students at three different achievement levels from a high school in Shenzhen, China. Each of these three levels consisted of 23 students, for a total of 69 students. Interviews were conducted one week after the students had received ethanol instruction. Subsequently, the interview records for each student were transformed into an individual-student co-occurring phrases matrix. The co-occurring phrases matrices of the 23 students in each group were ultimately combined into a larger group co-occurring phrases matrix for evaluation using social network analysis. Data on cognitive knowledge structures were analyzed along three dimensions: structural features, content based on organizational features, and learning difficulties. 
The results revealed that, on the topic of ethanol, (1) the students with the highest academic achievement constructed a cognitive knowledge structure with more nodes, connections, and greater integration compared to the group with the lowest academic achievement; (2) the organization and content of the cognitive knowledge structures on ethanol differed among different student achievement levels; e.g., the node categories with the ability to control the exchange of information were more diversified in the high-achieving student group, while those of the low-achieving group were more homogeneous, and the organization of the former's cognitive knowledge structure was clearer than that of the latter; and (3) all student groups experienced learning difficulties with certain ethanol contents, including odor, symbolic representation, oxidation by strong oxidants, and structure–property linkages.
Many scholars have highlighted the value of studying cognitive knowledge structures (CKSs). For example, such research can help identify students' learning difficulties (Temel and Özcan, 2016; Hrin et al., 2018) and clarify the organizational features and specific knowledge content of CKSs (Derman et al., 2024). It can also aid teachers in organizing instructional materials (Ifenthaler et al., 2011), improving students' learning outcomes (Tsai and Huang, 2002), conducting instructional interventions, and targeting instruction. Therefore, researchers have focused on identifying students’ CKSs and researching them in detail (Anderson and Demetrius, 1993; Tsai, 2001; Nawani et al., 2016; Akatan et al., 2022; Derman et al., 2024).
Researchers have used various methods to measure and characterize the CKS of students, including the word association method (Gunstone, 1980), the graph-construction method (Shavelson, 1974), concept mapping (Novak and Cañas, 2004), the flow map method (Anderson and Demetrius, 1993), systemic synthesis questions (Hrin et al., 2018), and the repertory grid technique (RGT; Rozenszajn et al., 2021). The word association method asks learners to make associations based on words provided by the researcher and write down the associated words (Shavelson, 1974; Tsai and Huang, 2002; Guerrero et al., 2010). The graph-construction method involves providing learners with a list of words and asking them to construct a tree diagram by connecting word pairs through “similarities” between them (Shavelson, 1974). Concept mapping requires subjects to integrate their mental knowledge and draw it hierarchically or unrestrictedly (Wallace and Mintzes, 1990; Markham et al., 1994; Besterfield-Sacre et al., 2004; Novak and Cañas, 2004; Assaraf and Orion, 2005). The flow map method collects data through interviews, which are then transcribed and analyzed to draw a flow map of an individual's CKS (Anderson and Demetrius, 1993; Wu and Tsai, 2005; Zhou et al., 2015; Yang and Zhang, 2018). The systemic synthesis questions method requires participants to draw graphs based on the phrases provided in the test; the graphs should include as many connections between phrases as possible and form a closed loop (Hrin et al., 2018). The RGT involves asking participants to rate phrases provided by the researcher or generated by themselves (Rozenszajn et al., 2021).
The research methods mentioned above and the empirical studies based on them have significantly advanced the understanding of CKS. They have also generated new insights, such as the possibility of studying CKS from the perspective of complex networks. First, the approaches described previously all reflect, to some extent, the network-like character of CKS. The concept mapping and word association methods, for example, are both based on the associations between basic words and the associative features of phrases (Gunstone, 1980; Novak, 1990; Novak and Cañas, 2004); the retrospective arrows in the flow map reflect these features (Tsai and Huang, 2002); and the “node” and “arrow” in the systemic synthesis questions approach (Hrin et al., 2016a, 2016b, 2017) represent the basic units of a network (de Nooy et al., 2011). Second, many empirical studies based on the above research methods reflect the network-like nature of students' CKS (Anderson and Demetrius, 1993; Nakiboglu, 2008; Burrows and Mooring, 2015; Derman and Eilks, 2016; Baptista et al., 2019; Avcı, 2021). Finally, the research foundations upon which the above studies are based, such as hierarchical semantic network theory (Quillian, 1967) and spreading-activation theory (Collins and Loftus, 1975), also support the idea that the knowledge stored in the human brain is interconnected. Therefore, the study of CKS can be developed from the perspective of complex networks.
When researching students' CKS, researchers usually focus on two aspects: its characteristics and content. For example, the word association method usually classifies students’ associated words and counts their frequency (Özcan and Tavukçuoğlu, 2018; Alkan et al., 2021). The concept mapping method takes the number of concepts, connections between them, their hierarchy, and instances as the main dimensions of the data analysis (Wallace and Mintzes, 1990; Markham et al., 1994). The flow map method focuses on the dimensions of extent, richness, integration, misconceptions, and information retrieval rate (Anderson and Demetrius, 1993; Zhou et al., 2015; Yang and Zhang, 2018), and the systemic synthesis questions method analyzes the data along the dimensions of extent, complexity, learning difficulties, and lack of understanding (Hrin et al., 2018). The RGT involves using descriptive statistics and content analysis in terms of characteristics and categories (Tomico et al., 2009; Rozenszajn et al., 2021). Researchers have also emphasized that, in addition to the characteristics and content of CKS, its organization is a crucial feature to be studied (Fensham et al., 1985). Therefore, CKS analysis can be developed across different dimensions, including features, organization, and content.
Extensive research has revealed that organic chemistry is a challenging field for students (Johnstone, 2010; O’Dwyer and Childs, 2017), involving students at different stages, from secondary school to university (Childs and Sheehan, 2009). Studies have shown that students encounter many misconceptions, learning difficulties, and a lack of understanding regarding organic chemistry (Duffy, 2006; Sendur and Toprak, 2013; Akkuzu and Uyulgan, 2016). For example, research has found that students have misconceptions about the molecular structure and composition of acetic acid (Zhou et al., 2015) and the structure of alkene (Sendur and Toprak, 2013). Therefore, the study of CKS in organic chemistry requires attention. It is important to note that organic chemistry is a broad field (Duis, 2011) that includes many topics (e.g., hydrocarbons, alcohols, aldehydes, acids, esters, etc.), the differences between which may affect the construction of good CKSs for students. Therefore, it is important to conduct a CKS study on different organic chemistry topics (e.g., ethanol) to aid understanding of students' CKSs with respect to different topics, as well as the learning difficulties, such as misconceptions, that may be embedded within them. This will also help to scientifically promote the formation of good CKSs for students and beneficial teaching and learning guided by theories such as conceptual change (Posner et al., 1982; Sendur and Toprak, 2013).
In summary, there are three areas of research on CKS that should be focused on. Research methodology should be based on a network perspective; research content should focus on different dimensions (e.g., knowledge content and knowledge organization); and research objects should focus on different learning topics.
It is important to note that ethanol, one of the simplest alcohols, has been emphasized in the chemistry curriculum standards of several countries, such as Singapore (Ministry of Education Singapore, 2022), Germany (Ministry of Education and Continuing Education of North Rhine Westphalia, 2011), and Canada (Ministry of Education Ontario, 2008). In China, ethanol is one of the topics required by the National Chemistry Curriculum Standards, which describe it as an important organic compound for life. The Chemistry Curriculum Standards for Senior High Schools (2017 edition, revised in 2020) require students to master the structural characteristics, physical properties, chemical properties, and uses of ethanol, as well as conduct hands-on experiments on its chemical properties (Ministry of Education of the People's Republic of China, 2020). The curriculum standards require all first-year high school students to study ethanol. Students who choose to study chemistry in their second year of high school are required to learn the structure and properties of ethanol in further depth. In addition, ethanol has a wide range of applications in life, and its study is one of the ways to develop students' scientific literacy, representing a foundation for further study of organic chemistry. Furthermore, our understanding of students' specific CKSs about ethanol is lacking, and it is therefore necessary to study students’ CKSs on the topic of “ethanol” in organic chemistry.
It should also be pointed out that our current education system is still predominantly organized around class-based teaching, an organizational model that has changed little since Comenius systematically elaborated on it (Zhang et al., 2022). Teaching practice currently requires the targeting of multiple students simultaneously, so research on students' CKSs should also focus on multiple students (i.e., on groups of students) simultaneously. Studying the groups’ CKS will reveal the deep-rooted rules embedded in each group, making the study's results relevant to real-world education and teaching. Therefore, it is necessary to study the CKS of student groups rather than of individual students.
The third point is that the core purpose of CKS research is to promote the formation of good CKSs among students, especially among low- and middle-achieving students. Therefore, CKS research should focus on students of different achievement levels in the same grade. Previous studies have emphasized this idea (Baptista et al., 2019; Nakiboğlu, 2023; Derman et al., 2024). Therefore, it is necessary to investigate the CKS of students at different levels.
In summary, research on CKS should focus on the following aspects: first, the network-like nature of CKS; second, different research dimensions (e.g., knowledge content, knowledge organization) and research themes (e.g., ethanol); third, analyzing student groups as a whole; and fourth, analyzing students with different achievement levels. Given the importance of CKSs, this study intends to respond to the above issues by using the social network analysis (SNA) method from the network perspective.
SNA is derived from graph theory in mathematics (de Nooy et al., 2011) and can uncover information embedded deep within complex social networks (Scott and Carrington, 2014). It has been used in several educational research studies (Bodin, 2012; Dou and Zwolak, 2019; Williams et al., 2019; Wagner and Priemer, 2023), but has not received much attention in the field of CKS. Because SNA is consistent with the complex knowledge networks in students' minds, applying it to the study of the CKSs of student groups has advantages.
To summarize, CKS plays an important role in chemistry education, and SNA has many advantages for conducting CKS research. Research on the CKSs of student groups at different achievement levels will contribute to promoting the formation of good CKSs, and research on ethanol is lacking. Therefore, the present study aimed to use SNA to investigate the CKSs of high school student groups at different achievement levels regarding their understanding of ethanol as a component of their organic chemistry learning.
Specifically, this study aims to address the following three objectives:
(1) Understand the CKS characteristics of student groups at different achievement levels about “ethanol.”
(2) Understand the key elements and organizational characteristics of the CKSs of student groups at different achievement levels about ethanol.
(3) Understand the learning difficulties encountered by student groups at different achievement levels in learning about ethanol.
Before conducting the formal study, consent was sought and obtained from the students' schools and teachers. Before the interviews, the researcher briefed each student on the study's purpose, methodology, process, and other critical matters. The researcher informed the students that the study aimed to explore their mastery of ethanol, which would facilitate their teachers' teaching practices. The students were also informed that the study would use the interview method, that they would be asked three questions and follow-up questions when appropriate, that they could draw pictures if they had difficulty expressing themselves, and that the whole research process would be audio-recorded. The researcher also informed the students that there were no right or wrong answers, that there was no need to be nervous, that all data collected throughout the research would be kept confidential and anonymized, that the results of the study would not be reported to teachers or parents, and that they could request the results from the researcher after the study was completed.
(1) Could you tell me the main facts or concepts about ethanol?
(2) Could you tell me more about the details you mentioned?
(3) Could you tell me the relationships between the facts or concepts you have already told me?
In addition, a follow-up question was asked at the appropriate time and was limited to: is there anything else you would like to add?
In accordance with the normal schedule for the school, one week after the students had learned about the topic of “ethanol,” the first author conducted data collection using audio recording equipment based on the interview outline. We chose one week as the time point for data collection because we expected that the CKSs would be derived from students' long-term memories and mentally constructed by themselves (rather than involving a simple recall of classroom instruction). Three main pieces of evidence support conducting the interviews at one-week post-instruction. First, psychological research on memory suggests that almost all of our memories of a particular event are lost within the first 24 hours and that students will essentially forget the content of their prior experiences within a week (Ebbinghaus, 1913). Second, previous studies on students' CKSs also used one week as the time point for data collection (Zhou et al., 2015). Third, a week of study and life for high school students involves many interruptions, so the students will have almost forgotten what the teachers had taught them a week before.
For the interview process, we provided students with paper and pens before the interview began so that they could write down knowledge they found difficult to express orally. We then gave each student a thorough introduction to the interview and asked if they were ready. Once we received a definite answer from the student, we immediately proceeded to the formal interview. During the interview, we paid attention to factors such as the student's facial expression, demeanor, fluency, and whether there were any pauses. We usually waited for 3–5 seconds after the student answered a question and asked the next question if the student did not continue to answer. If there was a change in facial expression or another indication, such as a hesitation or pause, we asked the student if there was anything else they would like to add and then waited for their comments. If the student provided an additional response, we let them continue their statement; if they indicated that they had nothing to add, we asked the next question, and so on until the end of the interview. Overall, the duration of each student interview was approximately five minutes. The duration varied across the achievement-level groups; excluding the interviewer's speaking time, the interview duration averaged about 3 minutes for the HASG, 2.5 minutes for the MASG, and 1.5 minutes for the LASG.
In addition, the four test scores used to determine the students' academic achievement level were also retained. It should be noted that this study was conducted in China, and all initial information was presented in Chinese.
The SNA of the data in this study comprised four main steps as follows: transforming the audio-recorded text of the interviews of individual students into pairs of phrases (which we refer to as “keyword co-occurring phrases”); transforming the phrase pairs of each student into an individual-student phrase relationship network (which we refer to as “keyword co-occurrence matrix”, where the numbers in the matrix [e.g., 1, 2, …] indicate the presence and frequency of co-occurrence relationships, and 0 indicates the absence of co-occurrence relationships); combining the phrase relationship networks of different individuals within the same group into a group phrase relationship network; and analyzing each group's phrase relationship network. Fig. 1 presents a diagram illustrating these four steps.
In the first step, we used the method of text transcription and utterance segmentation to complete the transformation of interview recordings into keyword co-occurring phrases. This method first converts sentences from the interview discourse into text, then segments them into smaller units of phrases or words (the smallest unit that expresses a certain practical meaning, e.g., phrases or words such as volatile, physical property, etc., as analyzed and determined by the researchers), and in the process, constructs phrase pairs based on the original meaning of the utterance.
In constructing co-occurrence relations of phrase pairs, we primarily considered different situations, such as adjacency between phrases, sentence-level proximity of phrases, and thematic links at the topic level. Specifically, we constructed four basic analysis rules based on adjacency, sentence-level proximity, and thematic links, which were then used to analyze the data. The four rules were: use the short sentence as the basic data analysis unit, consider the adjacency between phrases, consider the logical relationship between phrases, and consider the hierarchical relationship between phrases. Note that in the hierarchical relationship analysis, we distinguished two categories of concepts, superordinate and subordinate, with subordinate concepts being relative to superordinate concepts. Specifically, we refer to concepts that encompass more knowledge or concepts as superordinate concepts, such as the framework concepts of “physical properties” and “chemical properties,” and reaction types, such as “oxidation reactions”; this classification is based on consensus among teachers and experts. The rules for data analysis and more detailed procedures and examples can be found in Appendix B.
The words in the phrase pairs represent the nodes, and the relationships between the phrases in each pair represent the connections, a representation that has been used frequently in previous studies (Geeslin and Shavelson, 1975; Novak and Musonda, 1991; Kinchin, 2011). To ensure the study's reliability, we adopted a back-to-back approach during the utterance analysis; that is, the same student interview was analyzed by two researchers separately, and the consistency of the results was tested. For example, if the two researchers who analyzed an interview produced 16 and 18 phrase pairs, respectively, and 15 of these pairs were identical, the reliability would be (15 + 15)/(16 + 18) = 0.88. After several stages of pre-analysis, negotiation to modify the analysis rules, and formal analysis, the agreement rate of phrase pairs between the two researchers in the present study was 0.81, which is in line with previous studies (Liu et al., 2002) and was considered sufficient. When the two researchers encountered inconsistencies, they resolved them by mutual agreement. We then analyzed the data based on the finalized phrase pairs.
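The agreement calculation in the worked example can be sketched in a few lines of Python. This is only an illustration of the arithmetic given in the text; the function names are hypothetical, and treating a phrase pair as unordered (so that "A, B" equals "B, A") is an assumption not stated in the source.

```python
def agreement_rate(n_pairs_a: int, n_pairs_b: int, n_shared: int) -> float:
    """Agreement between two coders: the shared pairs are counted once per
    coder and divided by the total number of pairs both coders produced."""
    if n_shared > min(n_pairs_a, n_pairs_b):
        raise ValueError("shared pairs cannot exceed either coder's total")
    total = n_pairs_a + n_pairs_b
    if total == 0:
        raise ValueError("at least one phrase pair is required")
    return 2 * n_shared / total


def agreement_from_pairs(pairs_a, pairs_b) -> float:
    """Same rate computed from the coders' actual pair lists.
    Assumption: pairs are unordered, so each is stored as a frozenset."""
    set_a = {frozenset(p) for p in pairs_a}
    set_b = {frozenset(p) for p in pairs_b}
    return agreement_rate(len(set_a), len(set_b), len(set_a & set_b))


# Worked example from the text: 16 and 18 pairs, 15 of them identical.
print(round(agreement_rate(16, 18, 15), 2))  # 0.88
```

Counting the shared pairs twice in the numerator keeps the rate at 1.0 when both coders produce exactly the same set of pairs.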
In the second step, we used Excel 2022 to correlate pairs of phrases from individual students, resulting in a keyword co-occurrence matrix for each student.
In the third step, using Excel 2022 again, we combined the keyword co-occurrence matrices of the 23 individual students in each achievement level group into a larger student group matrix, which was used for subsequent data analysis and graphical visualization.
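The second and third steps can be illustrated with a short Python sketch. The study itself used Excel, so this is not the authors' procedure; the function names and the two example students are hypothetical, and two assumptions are made: phrase pairs are treated as unordered, and the group matrix is obtained by summing the individual matrices cell by cell.

```python
from collections import Counter

Pair = tuple[str, str]


def student_matrix(pairs: list[Pair]) -> Counter:
    """Step 2: count how often each unordered phrase pair occurs for one student."""
    counts: Counter = Counter()
    for a, b in pairs:
        if a != b:  # ignore self-pairs
            counts[tuple(sorted((a, b)))] += 1
    return counts


def group_matrix(students: list[Counter]) -> Counter:
    """Step 3 (assumed rule): add the individual matrices cell by cell."""
    total: Counter = Counter()
    for matrix in students:
        total.update(matrix)
    return total


def to_dense(counts: Counter) -> tuple[list[str], list[list[int]]]:
    """Expand the pair counts into a symmetric matrix; 0 means no co-occurrence."""
    labels = sorted({phrase for pair in counts for phrase in pair})
    index = {phrase: i for i, phrase in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for (a, b), n in counts.items():
        matrix[index[a]][index[b]] = n
        matrix[index[b]][index[a]] = n
    return labels, matrix


# Hypothetical phrase pairs for two students (not data from the study).
s1 = student_matrix([("ethanol", "physical property"),
                     ("physical property", "volatile")])
s2 = student_matrix([("physical property", "ethanol"),
                     ("ethanol", "hydroxyl group")])
labels, dense = to_dense(group_matrix([s1, s2]))
print(labels)
for row in dense:
    print(row)
```

In the resulting group matrix, a cell value of 2 for "ethanol" and "physical property" reflects that both hypothetical students mentioned this pair, which is what would make the corresponding edge thicker in a network diagram.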
In the fourth step, we analyzed the keyword co-occurrence matrix using the SNA software Ucinet 6.0 and drew network diagrams of the CKS of both individual students and student groups.
Fig. 2–4 illustrate the final CKS network diagrams for individual students and for groups of students. In particular, the two graphs in Fig. 2 show the individual CKS network diagrams of a high-achieving student and a low-achieving student (Appendix C also provides an individual CKS network diagram of a medium-achieving student). These three students are reasonably representative of their respective groups and are above the group average on the various CKS indicators. The relevant indicator data can be seen in Tables 1 and 2 of the “Results and findings” section. Fig. 3 and 4 display the final CKS network diagrams for the high-achieving and low-achieving student groups, respectively (Appendix C also provides the CKS network diagram of the medium-achieving student group). The thickness of the “edges” in the CKS network diagrams is determined by the frequency of phrase pairs, and the size of the nodes is derived from the frequency of phrase nodes. Specifically, regarding the determination of edges, if 10 students describe the connection between node A and node B, but only one student describes the relationship between node C and node D, edge AB will be thicker than edge CD. The size of a node can be calculated based either on the node's frequency of occurrence or on its connections with the surrounding nodes; the two representations differ little, except in the visualization of the graph. In general, the thicker the “edge,” the stronger the connection between nodes, and the larger the node, the more central it is or the more often it is mentioned.
Fig. 2 Two examples of CKS network diagrams for a high-achieving and a low-achieving student (students A and C).
Variable | EXT | RICH | INTE | IRR | MISCON | ACHV |
---|---|---|---|---|---|---|
Student A | 43 | 49 | 0.54 | 0.19 | 1 | 85.75 |
Student B | 24 | 23 | 0.49 | 0.15 | 1 | 61.75 |
Student C | 16 | 15 | 0.48 | 0.20 | 2 | 44.5 |

Note: EXT: extent; RICH: richness; INTE: integration; IRR: information retrieval rate; MISCON: misconceptions; ACHV: achievement score from four regular tests; A: student with high achievement; B: student with medium achievement; C: student with low achievement.
Group | EXT | RICH | INTE | IRR | MISCON | ACHV |
---|---|---|---|---|---|---|
HASG | 25.57 | 26.26 | 0.51 | 0.15 | 0.43 | 82.51 |
MASG | 18.22 | 17.57 | 0.49 | 0.13 | 0.52 | 63.51 |
LASG | 11.74 | 10.74 | 0.48 | 0.16 | 0.61 | 42.15 |
H | 32.73 | 33.15 | 22.46 | 2.855 | 0.811 | 60.42 |
p | <0.001 | <0.001 | <0.001 | 0.240 | 0.667 | <0.001 |

Note: HASG: high achievement student group; MASG: medium achievement student group; LASG: low achievement student group; H: Kruskal–Wallis test H.
The results showed that the CKSs of individual students or groups of students can be studied in terms of the following nine variables:
1. Extent: the number of nodes within the CKS; the higher the number of nodes, the richer the students' knowledge and the better the CKS.
2. Richness: the number of connections within the CKS; the higher the number of connections, the more links students have established in their knowledge and the better their CKS.
3. Integration: the greater the degree of knowledge integration within the CKS, the better the student's knowledge integration, and the better the CKS. Integration is expressed as the number of connections/(number of nodes + number of connections).
4. Information retrieval rate (IRR): the amount of information retrieved by students per unit of time, expressed as the number of nodes/time. This reflects students' fluency in retrieving and extracting information from their minds; the larger the value, the more information students can extract per unit of time, and the better the CKS.
5. Learning difficulties: misconceptions held by students; the lower the number, the better the CKS.
6. Centrality (de Nooy et al., 2011): an indicator used to judge the node's position in the CKS; the larger the value, the more central the node.
7. Structural hole (Lin et al., 2022): another indicator used to judge the node's position in the CKS; the smaller the value, the more the node can serve as a bridge to other nodes in the CKSs of students. Fig. 5 provides a detailed explanation of the structural hole. In this figure, A and B cannot be directly connected and can only be connected through C. This lack of connection between A and B is called a gap in the structure (also known as a structural hole). The indicator reflects how strongly a node is constrained by other nodes. A node that can reach other nodes only through an intermediary (e.g., A in Fig. 5) is more constrained, whereas a node that spans a structural hole (e.g., C in Fig. 5) is less constrained. Therefore, the smaller the structural hole index for a given node, the less it is restricted by other nodes and the more pronounced its bridging role.
8. Edge: an indicator used to judge the importance of phrase pairs in the group CKS; the larger the value, the more important the corresponding phrase pairs are in the students' CKS.
9. Blockmodels (White et al., 1976): a block division of students' CKSs, which can be used to judge the similarity of CKSs among different students or groups of students.
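The first four indicators in the list above are simple ratios and counts, which the following Python sketch illustrates. The class and function names are hypothetical. The unit of time for the IRR is not stated in the text; seconds are assumed here, and the 160-second duration in the example is invented for illustration only, whereas the node and connection counts are those reported for Student B.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CksIndicators:
    extent: int         # number of nodes
    richness: int       # number of connections
    integration: float  # connections / (nodes + connections)
    irr: float          # nodes per unit of time (seconds assumed)


def cks_indicators(n_nodes: int, n_connections: int, seconds: float) -> CksIndicators:
    """Compute extent, richness, integration, and IRR as defined in the list."""
    if n_nodes <= 0 or n_connections < 0 or seconds <= 0:
        raise ValueError("need at least one node, non-negative connections, positive time")
    integration = n_connections / (n_nodes + n_connections)
    return CksIndicators(n_nodes, n_connections, integration, n_nodes / seconds)


# Student B: 24 nodes and 23 connections; the 160 s duration is hypothetical.
student_b = cks_indicators(24, 23, 160.0)
print(round(student_b.integration, 2))  # 0.49
print(round(student_b.irr, 2))          # 0.15
```

With 24 nodes and 23 connections, the integration value of 23/47 rounds to the 0.49 reported for Student B.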
The centrality, structural holes, edges, and blockmodels of individual CKSs were not studied because of the relative simplicity of a single individual's CKS.
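To make the structural hole indicator more concrete, the sketch below computes a constraint value for the three-node situation described for Fig. 5 (A and B connected only through C). The source does not state which formula was used, so this assumes the indicator is Burt's constraint summed over a node's direct neighbors, as structural hole measures are commonly reported in SNA software; the function name and the toy networks are hypothetical.

```python
Graph = dict[str, dict[str, float]]


def constraint(adj: Graph, node: str) -> float:
    """Burt-style constraint of `node` in an undirected weighted network.
    Assumption: the sum runs over the node's direct neighbors only."""

    def p(i: str, j: str) -> float:
        # Share of i's total tie strength that is invested in j.
        total = sum(adj[i].values())
        return adj[i].get(j, 0.0) / total if total else 0.0

    result = 0.0
    for j in adj[node]:
        # Indirect investment in j through every third node q.
        indirect = sum(p(node, q) * p(q, j) for q in adj if q not in (node, j))
        result += (p(node, j) + indirect) ** 2
    return result


# A and B are linked only through C, so C spans a structural hole.
path: Graph = {"A": {"C": 1.0}, "B": {"C": 1.0}, "C": {"A": 1.0, "B": 1.0}}
print(constraint(path, "A"))  # 1.0
print(constraint(path, "C"))  # 0.5

# Closing the hole with a direct A-B tie raises the constraint on C.
triangle: Graph = {"A": {"B": 1.0, "C": 1.0},
                   "B": {"A": 1.0, "C": 1.0},
                   "C": {"A": 1.0, "B": 1.0}}
print(constraint(triangle, "C"))  # 1.125
```

Under this assumed formula, C has the smallest value in the path network, which matches the reading that a smaller index indicates a more pronounced bridging role.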
Table 2 shows the Kruskal–Wallis test results and quantitative analysis regarding the group CKS performance in the HASG (Fig. 3), LASG (Fig. 4), and MASG (Appendix C), with Table 3 showing the post hoc test results. The results revealed significant differences (p < 0.05) in the extent, richness, and integration dimensions of the CKSs across the HASG, MASG, and LASG. In contrast, no significant differences were observed (p > 0.05) in the misconception and information retrieval rate dimensions. Post hoc tests revealed significant differences in the extent and richness dimensions between every pair of groups (p < 0.05). Additionally, significant differences in the integration dimension were observed between HASG and MASG, as well as between HASG and LASG (p < 0.05). However, no significant differences were observed between MASG and LASG in the integration dimension (p > 0.05).
Variable | EXT: H | EXT: Adj. p | RICH: H | RICH: Adj. p | INTE: H | INTE: Adj. p |
---|---|---|---|---|---|---|
Group H–M | 16.35 | 0.017 | 16.85 | 0.013 | 17.91 | 0.007 |
Group H–L | 33.80 | 0.000 | 34.02 | 0.000 | 27.35 | 0.000 |
Group M–L | 17.46 | 0.009 | 17.17 | 0.011 | 9.46 | 0.323 |

Variable | IRR: H | IRR: p | MISCON: H | MISCON: p | ACHV: H | ACHV: Adj. p |
---|---|---|---|---|---|---|
Group H–M | — | — | — | — | 22.96 | 0.000 |
Group H–L | — | — | — | — | 45.98 | 0.000 |
Group M–L | — | — | — | — | 23.02 | 0.000 |

Note: Adj. p: corrected p-value.
These results indicate that HASG's CKS contains the most nodes and connections, followed by MASG and then LASG. Additionally, HASG exhibits a higher degree of knowledge integration than MASG and LASG, which do not differ significantly from each other in integration. No significant differences were found among the three groups regarding information retrieval rate or misconceptions.
Results suggest that: (1) regarding the node centrality in CKSs, the HASG demonstrated the highest centrality, followed by the MASG, with the LASG exhibiting the lowest; (2) nodes of physical/chemical properties, application, functional groups, esterification reactions, sodium, and acetic acid held relative importance for HASG and MASG, whereas LASG focused on physical/chemical properties and oxidation reactions.
The above results reflect, first, that there are more nodes with strong information exchange control ability in the CKS of HASG for ethanol, and that these nodes have more control ability and diversity, followed by MASG, and then by LASG. Second, the characteristics of the control nodes are varied across groups, and the control nodes of HASG and MASG have both superordinate and subordinate concepts, while LASG only has superordinate concepts. Third, the nodes of the different groups exhibit both similarities and differences; the nodes of ethanol, physical properties, and application are significant for all three groups, but HASG focused most on “acetaldehyde,” MASG on “oxidation reactions,” and LASG on “catalytic oxidation.”
Hierarchy of node | HASG: nodes and relations | HASG: Fre | MASG: nodes and relations | MASG: Fre | LASG: nodes and relations | LASG: Fre |
---|---|---|---|---|---|---|
1 | Ethanol-physical property | 20 | Ethanol-physical property | 14 | Ethanol-chemical property | 15 |
2 | Ethanol-chemical property | 16 | Ethanol-application | 11 | Ethanol-physical property | 12 |
3 | Physical property-MSW | 13 | Ethanol-functional group | 9 | Physical property-colorless | 11 |
4 | Ethanol-condensed structural formula | 12 | Ethanol-chemical property | 9 | Physical property-DLW | 6 |
5 | Physical property-chemical property | 12 | Physical property-MSW | 9 | Functional group-hydroxyl group | 6 |
6 | Physical property-boiling point | 11 | Physical property-colorless | 9 | Ethanol-functional group | 5 |
7 | Physical property-easily volatile | 11 | Functional group-hydroxyl group | 9 | Ethanol-molecular formula | 5 |
8 | Physical property-DLW | 10 | Chemical property-esterification reaction | 8 | Ethanol-colorless | 4 |
9 | Ethanol-functional group | 9 | Esterification reaction-acetic acid | 7 | Physical property-irritating odor | 4 |
10 | Ethanol-application | 9 | Application-disinfectant | 6 | Chemical property-oxidation reaction | 4 |

Note: Fre: frequency; MSW: mutually soluble with water in any proportion; DLW: density less than water.
The above results indicate HASG paid more attention to the same phrase pairs, constructed a systematic CKS for ethanol from the basic chemistry learning framework of properties, structures, and applications, and constructed the physical properties of ethanol more systematically. MASG achieved a balance between a framework and detailed knowledge. The CKS of the LASG was more focused on knowledge details and was characterized by fragmentation.
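To illustrate how strongly the HASG edges concentrate on a few framework nodes, the following minimal sketch sums the edge frequencies attached to each node (a weighted degree). It uses only the ten HASG edges listed in the table above, not the full network, and weighted degree is an assumption chosen for illustration; it is not necessarily the centrality measure used in this study. The function name `weighted_degree` is hypothetical.

```python
from collections import defaultdict

# Ten most frequent HASG phrase pairs, copied from the table above
# (MSW = mutually soluble with water, DLW = density less than water).
hasg_edges = [
    ("ethanol", "physical property", 20),
    ("ethanol", "chemical property", 16),
    ("physical property", "MSW", 13),
    ("ethanol", "condensed structural formula", 12),
    ("physical property", "chemical property", 12),
    ("physical property", "boiling point", 11),
    ("physical property", "easily volatile", 11),
    ("physical property", "DLW", 10),
    ("ethanol", "functional group", 9),
    ("ethanol", "application", 9),
]


def weighted_degree(edges):
    """Sum the frequencies of all edges touching each node (illustrative only)."""
    strength = defaultdict(int)
    for a, b, freq in edges:
        strength[a] += freq
        strength[b] += freq
    return dict(strength)


strength = weighted_degree(hasg_edges)
# Print the three nodes with the largest summed edge frequency.
for node, value in sorted(strength.items(), key=lambda kv: -kv[1])[:3]:
    print(node, value)
```

On this ten-edge subset, "physical property" and "ethanol" carry most of the edge weight, which matches the description of a CKS organized around a properties-based framework.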
In summary, HASG and MASG share comparable CKS organizational patterns, whereas LASG exhibits distinct structural divergence.
Regarding student groups' overall performance, students' descriptions of ethanol focused on four modules: physical properties, chemical properties, structure and characterization, and application. First, HASG, MASG, and LASG showed a decreasing trend in accuracy when stating knowledge about the four modules mentioned above, with percentages of 13.45%, 10.51%, and 5.20%, respectively. However, contrary to the previous trend, the three groups did not show a similar pattern in describing misconceptions, with proportions of 2.89%, 2.46%, and 3.04%, respectively. Overall, HASG performed better than the other groups.
Regarding students' specific misconceptions, two areas produced the most errors: the odor of ethanol and the representation of its chemical formula. All three groups held both misconceptions, but the specific manifestations differed. For ethanol's odor, students variously stated that it had an irritating odor, was odorless, or had some other odor; the percentages for HASG, MASG, and LASG were 13.00%, 11.60%, and 15.90%, respectively, and the differences between the groups were relatively small. For the representation of ethanol's chemical formula, students not only confused ethanol's molecular formula, condensed structural formula, and structural formula but also wrote these formulas incorrectly, with percentages of 7.20%, 1.45%, and 4.35% for the three groups, respectively. Here the proportion was highest for HASG and lowest for MASG, with LASG in between.
Two other points should be highlighted. First, only a low percentage of students stated knowledge of ethanol's oxidation by strong oxidants (i.e., ethanol oxidized by KMnO4(H+) or K2Cr2O7(H+)). This percentage was lower than the percentages of students who stated knowledge of ethanol's combustion, substitution reaction (with sodium), catalytic oxidation, and esterification. Second, all three student groups neglected the relationship between ethanol's structure and its chemical and physical properties. Specific data on these observations can be found in Table 7.
In summary, the study found that the three groups, HASG, MASG, and LASG, exhibited a decreasing trend in the amount of correct knowledge, and there was no difference in the statement of incorrect knowledge. The three groups had relatively similar learning difficulties, all focusing relatively more on the odor of ethanol, the characterization of its chemical formula, its oxidation by strong oxidizing agents, and the link between its structure and properties, but the specific performance of the different groups on these points varied. In addition, different groups had distinct learning difficulties. For example, students in the HASG group had more learning difficulties in areas such as the products of the catalytic oxidation of ethanol.
Regarding the first research question (what are the characteristics of the CKSs of different groups of students at different achievement levels?), the CKS of HASG has more nodes, more connections, and a higher degree of knowledge integration, a finding consistent with the results of individual-based CKS research (Yang and Zhang, 2018). The findings were also aligned with the interviews themselves: better learners usually gave longer interviews, described more content, and made more connections. However, there was no significant difference in the degree of knowledge integration between MASG and LASG. The differences between groups in the number of nodes, the number of connections, and the degree of knowledge integration may be related to how students at different levels process information. Previous studies have pointed out that students with higher academic achievement usually employ better information processing strategies (Wu and Tsai, 2005), and the better students' information processing strategies are, the more likely they are to engage in knowledge integration and therefore to retain more knowledge. Second, the three student groups did not differ significantly in terms of information retrieval rate (IRR) or the number of misconceptions, consistent with previous studies (Tsai, 2001; Zhou et al., 2015). This result may be related to the use of interviews for data collection: interviews give students more freedom to retrieve only solid and familiar information from long-term memory and to provide few or no answers on unfamiliar or uncertain content, which would lead to similar IRRs and similar numbers of incorrect descriptions across the groups.
The study's findings provide two insights: first, we can focus on the connections between pieces of knowledge in our teaching practice to facilitate students' processing of information, and second, we can pay more attention to in-depth classroom question-and-answer sessions in our teaching and try to encourage different groups of students to participate as deeply as possible so as to understand what they really think.
With regard to the second research question (what are the organizational features of the CKSs of different groups of students at different achievement levels?), the centrality analysis found that HASG was more centered on the nodes of physical properties and chemical properties, followed by MASG and LASG, which reflects the fact that the CKSs of HASG organized knowledge centered primarily on these two nodes. However, HASG and MASG were more centered on the application and esterification reactions nodes, whereas LASG was centered on the node of oxidation reactions. Combining the analysis of interview data and actual teaching, we found that the focus on esterification reactions by HASG and MASG was because students bridged what they had learned about esterification reactions in the lesson on “acetic acid” with what they had learned about ethanol in the lesson on “ethanol” (in China, “ethanol” and “acetic acid” are taught in two separate lessons, and ethanol is also learned as part of the esterification reaction of acetic acid). That is, students constructed their CKS for ethanol by backward-transferring ethanol-related knowledge points they had learned in other lessons. In contrast, the oxidation reactions described by LASG were the combustion reactions that the students had learned in lessons on ethanol. Students have learned that combustion reactions are described as oxidation reactions many times before they learn about ethanol (e.g., in the chemical properties of alkanes, alkenes, and alkynes), making it more familiar and easier to retrieve and extract from long-term memory. This situation could reflect that LASG emphasized the oxidation reaction more as a construction of CKS through forward transfer. 
The above distinction reflects two issues: first, LASG focused more on relatively simple content (because the ability of organic substances to burn is usually an important chemical property of organic substances), and second, it was more difficult for LASG to transfer and integrate previously learned content with subsequent learning. This also provides two insights: first, we should focus on the nodes valued by students, such as physical properties, chemical properties, application, and esterification reactions; second, we should assist students in transferring and integrating their knowledge in practical teaching and learning.
Structural hole analysis revealed that the key nodes for ethanol identified by the student groups were largely consistent with those from the centrality analysis. This consistency may be attributed to the fact that in knowledge networks, nodes occupying central positions are more likely to act as bridges connecting to other nodes (de Nooy et al., 2011). However, the structural hole analysis yielded three meaningful findings. First, compared to MASG and LASG, HASG exhibited more nodes occupying controlling positions, and these nodes were relatively more controlling within their CKSs, reflecting more robust knowledge organization in this group, which is consistent with the research on CKS characteristics. Second, the nodes emphasized by different groups had distinct characteristics. HASG and MASG concentrated on both superordinate concepts (such as “physical property”) and subordinate concepts (such as “water”), while LASG focused solely on superordinate concepts. It is unsurprising that superordinate concepts such as “physical property” can control the exchange of information, but why can subordinate concepts do so as well? The analysis of interview data revealed that the “water” node in the CKSs of HASG and MASG serves as a conceptual bridge connecting knowledge in different knowledge modules, including ethanol's catalytic oxidation, combustion, esterification reactions, and solubility. Finally, the control nodes emphasized by the different groups show both similarities and differences: the similarities reflect the consistency of the CKSs across the three groups, whereas the differences may highlight the specificity of each group's CKS.
Take the node “water” for HASG and the node “catalytic oxidation” for LASG as an example. The interviews revealed that the node “catalytic oxidation” was only related to concepts closely associated with itself, such as catalyst, copper/silver, and oxygen, and did not link the knowledge of different knowledge modules in the way that the node “water” did. Whether or not nodes link up knowledge from different knowledge modules may reflect, to some extent, the essential differences in the CKSs of different groups of students, i.e., whether students make connections between various pieces of knowledge. The above findings have two implications: first, we should pay attention to apparent superordinate controlling nodes, such as “physical properties” and “application,” and second, we should pay attention to hidden controlling nodes (subordinate concepts such as “water”), because this kind of node can link up the knowledge of different modules, enhance the connection between pieces of knowledge, and, to some extent, enhance the theoretical interest of the course, as well as strengthen the communication of the knowledge nodes in the students’ CKSs.
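The contrast between a bridging node and a locally embedded node can be sketched numerically. The example below computes the effective size of a node's neighborhood using Borgatti's simplification for unweighted, undirected graphs (n − 2t/n, where n is the number of neighbors and t the number of ties among them). Both toy networks are hypothetical: the neighbors are taken from the interview description above, but the ties among the neighbors of “catalytic oxidation” are assumed for illustration, and the study may have used a different structural hole metric.

```python
def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from edge pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def effective_size(adjacency, ego):
    """Effective size of ego's network: n - 2t/n (unweighted, undirected)."""
    neighbours = adjacency[ego]
    n = len(neighbours)
    if n == 0:
        return 0.0
    # Count each tie among ego's neighbours exactly once (a < b).
    ties = sum(
        1 for a in neighbours for b in neighbours
        if a < b and b in adjacency[a]
    )
    return n - 2 * ties / n


# Hypothetical HASG-like fragment: "water" bridges otherwise unconnected modules.
hasg_toy = build_graph([
    ("water", "catalytic oxidation"),
    ("water", "combustion"),
    ("water", "esterification reaction"),
    ("water", "solubility"),
])

# Hypothetical LASG-like fragment: neighbours are tied to one another (assumed).
lasg_toy = build_graph([
    ("catalytic oxidation", "catalyst"),
    ("catalytic oxidation", "copper/silver"),
    ("catalytic oxidation", "oxygen"),
    ("catalyst", "copper/silver"),
    ("catalyst", "oxygen"),
    ("copper/silver", "oxygen"),
])

print(effective_size(hasg_toy, "water"))                # 4.0
print(effective_size(lasg_toy, "catalytic oxidation"))  # 1.0
```

In this sketch, “water” has four non-redundant contacts, whereas the contacts of “catalytic oxidation” are largely redundant with one another, which is one way to express the difference between linking modules and staying within one.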
Edge analyses revealed that HASG exhibited a higher frequency of shared “edges”, whereas MASG and LASG demonstrated lower frequencies of shared edges. This reflects the greater overlap of CKSs among good learners, with less overlap among intermediate learners and the lowest overlap among struggling learners. Overall, this indicates that the network graphs of good learners are more like one another, while those of struggling and intermediate learners are more distinct. This may be because HASG tends to construct systematic CKSs, whereas LASG relies more on isolated memories. Systematic CKSs rely on certain frameworks (e.g., structure, properties, and application in chemistry), and because these frameworks rarely change, the CKSs of HASG students are more likely to overlap, leading to higher shared-edge frequencies. In contrast, the fragmented knowledge of LASG leads to less overlap. This framework-based CKS of HASG is also confirmed by the analysis of the 'centrality' and 'edges' of the data in this paper, a view that has been extensively confirmed in comparative studies of experts and novices (National Research Council, 2000). The results suggest that we pay more attention to the organizational framework of knowledge in practical teaching and learning and use the knowledge framework to assist students in connecting fragmented knowledge, thus promoting the formation of good CKSs.
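The idea of shared-edge frequency can be sketched as follows: each student's phrase pairs are treated as an undirected edge set, and the group-level frequency of an edge is the number of students who mentioned it. This is a minimal illustration under the assumption that frequency is counted once per student; the toy data and the function name `edge_frequencies` are hypothetical and are not taken from the study.

```python
from collections import Counter


def edge_frequencies(student_pairs):
    """Count, for each undirected phrase pair, how many students mentioned it."""
    counts = Counter()
    for pairs in student_pairs:
        # frozenset makes ("a", "b") and ("b", "a") the same edge; the set
        # comprehension removes repeats within one student's interview.
        counts.update({frozenset(pair) for pair in pairs})
    return counts


# Hypothetical toy data for three students (NOT data from the study).
students = [
    [("ethanol", "physical property"), ("physical property", "colorless")],
    [("physical property", "ethanol"), ("ethanol", "application")],
    [("ethanol", "physical property"), ("ethanol", "physical property")],
]

freq = edge_frequencies(students)
print(freq[frozenset(("ethanol", "physical property"))])  # 3
print(freq[frozenset(("ethanol", "application"))])        # 1
```

A group in which many students share the same framework edges would show a few high counts, whereas a group with fragmented knowledge would show many edges with low counts.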
The blockmodel analysis revealed similarities in CKSs between HASG and MASG, whereas marked differences emerged when compared with LASG. This conclusion was also substantiated through centrality analysis, structural hole metrics, and edge analysis. For instance, both HASG and MASG prioritized identical centrality nodes (e.g., physical properties, chemical properties, applications, and esterification reactions) and structural hole nodes (e.g., water and applications), and both groups focused on the same framework knowledge and its connections. These patterns likely stem from the hierarchical and well-organized knowledge of HASG and MASG, contrasting with the fragmented knowledge organization observed in LASG. Specifically, HASG and MASG unified their CKSs through superordinate concepts such as physical properties, chemical properties, and applications, and then organized their knowledge through secondary concepts such as esterification and substitution reactions. In contrast, LASG would directly place keywords for different levels, such as chemical properties, oxidation reactions, and solid fuels, at the same level. This feature was also demonstrated in the interviews. For example, HASG logically and hierarchically stated that “the main knowledge of ethanol includes physical properties, chemical properties, and applications, and for physical properties, it has a lower density than water and can be soluble in water in any proportion, and for chemical properties, it can be esterified and undergo esterification reaction”, while LASG stated the following phrases: “ethanol can undergo oxidation reactions and has chemical properties”, “ethanol can undergo oxidation reactions”, and “ethanol can be used as a solid fuel”. The above findings signify that LASG needs to undergo a structural shift in its CKS to make progress. 
These findings also reveal that teachers need to pay attention to two points in their teaching practice. First, they should focus particularly on the organization of knowledge when teaching to help students form a clear knowledge structure, mainly because experts' knowledge is more logical than novices' knowledge (Snyder, 2000), and clear knowledge organization can help students reduce the burden of memorization (Sweller, 2012). Second, it is crucial for teachers to focus on key conceptual nodes for teaching and learning, to enhance the centrality of key nodes, and to strengthen the linkages between pieces of knowledge, for example, by focusing on the nodes of physical properties, chemical properties, applications, esterification reactions, and water.
Regarding the third research question (what learning difficulties do students face?), the three student groups, HASG, MASG, and LASG, exhibited a decreasing trend, in order, for correct knowledge content related to ethanol, consistent with previous research (Zhang, 2012), i.e., the better the student's academic achievement, the more knowledge content they usually hold. There was no difference in the number of misconceptions held by students among the three student groups. However, there were similarities and differences in the specific manifestations of these misconceptions. In terms of similarities, all three groups had learning difficulties related to the odor of ethanol, the characterization of its chemical formula, its oxidation reaction with strong oxidizing agents such as acidic potassium permanganate or potassium dichromate, and the relationship between its properties and its structure. However, the performance of the groups varied; HASG and MASG had learning difficulties on points such as the products of catalytic oxidation of ethanol, and MASG and LASG had learning difficulties on points such as ethanol's solubility. This may be related to students' working memory; when students learn something new, those with a strong working memory are more likely to structure and process what they learn and integrate it with their prior knowledge, which is then also stored in their long-term memory. These students are also more likely to extract knowledge from long-term memory when needed. Regarding specific learning difficulties, for the odor of ethanol, all three student groups thought that it had an “irritating odor” or was “odorless”, etc. The reason for this result may be that the students do not know the difference between the keywords “special aroma”, “irritating odor”, and “odor”, or it may be because the students have not smelled these odors during their actual learning, and lack practical experience. 
Regarding the symbolic representation of ethanol, some students described its molecular formula as CH3CH2OH and the condensed structural formula as C2H5OH. The proportion of such students was largest in HASG, much higher than in MASG and slightly higher than in LASG. This may be because students have difficulty distinguishing between the various representations and their concepts, and because HASG paid more attention to symbolic representations than MASG and LASG (the proportions for the three groups were 19.80%, 7.25%, and 11.55%). Regarding ethanol's oxidation by strong oxidants, the low proportion of students who mentioned it may be because, during this lesson, the teacher paid more attention to the combustion and catalytic oxidation reactions of ethanol and neglected oxidation by strong oxidants. Regarding the fourth point, all three groups of students ignored the link between the properties and structure of ethanol, despite this structure–property link being a key element of the teachers’ lectures as well as being explicitly emphasized in the Chinese Chemistry Curriculum Standards for Senior High Schools (2017 edition revised in 2020) (Ministry of Education of the People's Republic of China, 2020). This oversight by the students may be due to the difficulty of this knowledge point. It has been noted that the structure–property linkage in organic chemistry is difficult, with students often focusing on reactants and products and finding it challenging to link structure to properties (Ferguson, 2003), and with even postgraduate-level students having a relatively low understanding (Bhattacharyya and Bodner, 2005).
The above findings inspire us to use appropriate teaching methods to help students overcome the four learning difficulties mentioned above during future ethanol teaching, for example by assisting students in identifying odors through physical experience, facilitating the learning of chemical formula expressions through clear concept profiles (Mortimer and El-Hani, 2014; Aguiar et al., 2018), and prompting a deeper understanding of oxidation by strong oxidizing agents and strengthening the structure–property connection through explanation-based instruction (Braaten and Windschitl, 2011; National Research Council, 2012; Lu, 2015).
However, this research is only a preliminary study of student groups' CKSs from a network perspective. We consider the network of students' CKSs to be a complex system. A more in-depth study is needed to analyze students' CKS networks and to better explore the deep information contained within them. Subsequently, we will compare different methods for data collection and data analysis and launch a series of studies on students' CKSs from a network perspective, focusing on different learning content (e.g., redox reactions in inorganic chemistry, electrolytic and galvanic cells in the principles of chemical reactions, etc.). This will reveal deeper, more objective, and more comprehensive information about students' CKSs from a network perspective, which will, in turn, facilitate students' learning of chemistry and educators’ chemistry teaching and provide meaningful conclusions for chemistry education.
Source | Terminology | Specific connotation |
---|---|---|
(Nawani et al., 2016) | Cognitive knowledge structure | Interconnectedness of students’ knowledge about a topic or domain. |
(Shavelson, 1972, 1974) | Cognitive structure | Cognitive structure is a hypothetical construct referring to the organization (relationships) of concepts in long-term memory. |
(Derman et al., 2024) | Cognitive structure | How information is stored in the long-term memory and how different information and concepts are related to each other in a certain domain. |
(Ozcan and Tavukcuoglu, 2018) | Cognitive structure | Cognitive structure is a structure exhibiting the mutual associations of concepts recorded in the long-term memory. |
(Amith et al., 2017, 2020) | Knowledge structure | Mental organization of knowledge; knowledge organization reflected in…generated |
(Burrows and Mooring, 2015) | Knowledge structure | The schema in which students organize and relate various concepts to make sense of a particular topic. |
(Goldsmith et al., 1991) | Structural knowledge | Someone knows the global relations among important concepts within a domain. |
If the statement obtained from the transcription is long, splitting the sentence into shorter phrases is necessary. A long sentence is defined as one that usually contains more sentence elements, such as clauses and conjunctions. The rule of splitting is based on the connection to the topic (or subject). For example, the long sentence “the physical property of ethanol is that it is miscible with water in any proportion, and its chemical property is that it undergoes an oxidation reaction” can be split into “the physical property of ethanol is that it is miscible with water in any proportion” and “the chemical property of ethanol is that it undergoes an oxidation reaction” based on the topic “ethanol”. Similarly, the long sentence “ethanol has a low boiling point and is volatile” can be split into two shorter sentences “ethanol has a low boiling point” and “ethanol is volatile” based on the topic “ethanol”.
Rule two: Analysis based on the adjacency of phrases within a short sentence.
The analysis is carried out within the short sentence. If the phrases within the short sentence are simple and have no logical relationships, such as conjunctions, the analysis is based on the adjacency between the phrases. For example, the student says, “Ethanol is volatile.” The phrase pair “ethanol–volatile” can be constructed.
Rule three: Consider the logical relationship between phrases.
If a student uses a sentence with relational conjunctions such as “because” and “so”, these conjunctions should be considered. For example, the student says, “Ethanol is volatile because it has a low boiling point”. The conjunction “because” links “low boiling point” and “volatile,” so the sentence can be constructed as three phrase pairs: ethanol–low boiling point, ethanol–volatile, and low boiling point–volatile (based on thematic links and adjacency).
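Rules two and three can be summarized in a short sketch. The coding in this study was carried out on interview transcripts, and the function below is only a hypothetical illustration of the resulting pair structure: it assumes that the topic and the phrases have already been identified by a human coder, and the name `phrase_pairs` and its parameters are not from the original procedure.

```python
def phrase_pairs(topic, phrases, logically_linked=False):
    """Build phrase pairs for one short sentence.

    topic: the subject of the sentence (e.g., "ethanol").
    phrases: phrases stated about the topic, in spoken order.
    logically_linked: True when a conjunction such as "because" or "so"
        connects the phrases (rule three); adjacent phrases are then paired too.
    """
    # Thematic links: the topic is paired with every phrase (rule two).
    pairs = [(topic, phrase) for phrase in phrases]
    if logically_linked:
        # Logical links: each phrase is paired with the next one (rule three).
        pairs += list(zip(phrases, phrases[1:]))
    return pairs


# "Ethanol is volatile."
print(phrase_pairs("ethanol", ["volatile"]))
# [('ethanol', 'volatile')]

# "Ethanol is volatile because it has a low boiling point."
print(phrase_pairs("ethanol", ["low boiling point", "volatile"], logically_linked=True))
# [('ethanol', 'low boiling point'), ('ethanol', 'volatile'), ('low boiling point', 'volatile')]
```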
Rule four: Analyze phrases based on the hierarchical relationship between the phrases within the short sentence.
If a student presents a complex short sentence containing many phrases, we simplify the analysis of the sentence to some extent. This simplification is based on the hierarchical relationship between phrases. In the hierarchy, we distinguish between superordinate and subordinate concepts based on the consensus of teachers and curriculum experts. For example, for the short sentence “Ethanol can undergo esterification reaction with acetic acid to produce ethyl acetate and water,” we use the superordinate concept of “esterification reaction” to unify “acetic acid,” “ethyl acetate,” and “water,” without considering the relationship between the phrases “acetic acid,” “ethyl acetate,” and “water” (although there may be a relationship between these phrases), and then construct the phrase pairs ethanol–esterification reaction, esterification reaction–acetic acid, esterification reaction–ethyl acetate, and esterification reaction–water.
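Rule four, and the subsequent step of turning phrase pairs into a co-occurring phrases matrix, can be sketched as follows. This is a minimal illustration under the assumption that the matrix is symmetric and counts each pair once per mention; the function names `hierarchical_pairs` and `cooccurrence_matrix` are hypothetical, and the study's actual matrix construction may differ in detail.

```python
def hierarchical_pairs(topic, superordinate, subordinates):
    """Rule four: link the topic to the superordinate concept, and the
    superordinate concept to each subordinate phrase (subordinates are
    not linked to one another)."""
    pairs = [(topic, superordinate)]
    pairs += [(superordinate, sub) for sub in subordinates]
    return pairs


def cooccurrence_matrix(pairs):
    """Turn phrase pairs into a symmetric co-occurrence matrix."""
    labels = sorted({phrase for pair in pairs for phrase in pair})
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, b in pairs:
        matrix[index[a]][index[b]] += 1
        matrix[index[b]][index[a]] += 1
    return labels, matrix


# "Ethanol can undergo esterification reaction with acetic acid
#  to produce ethyl acetate and water."
pairs = hierarchical_pairs(
    "ethanol", "esterification reaction",
    ["acetic acid", "ethyl acetate", "water"],
)
labels, matrix = cooccurrence_matrix(pairs)
for label, row in zip(labels, matrix):
    print(f"{label:25s} {row}")
```

Summing such matrices over the students in a group, after aligning them to a common label set, would give a group-level matrix of the kind analyzed with social network analysis.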
This journal is © The Royal Society of Chemistry 2025