Ming Chi a, Changlong Zheng *a and Peng He b
aInstitute of Chemical Education, Northeast Normal University, People's Republic of China. E-mail: zhengcl@nenu.edu.cn
bCollege of Education, Washington State University, USA
First published on 29th June 2024
Chemical thinking is widely acknowledged as a core competency that students should develop in the context of school chemistry. This study aims to develop a measurement instrument to assess students’ chemical thinking. We employed the Essential Questions-Perspectives (EQ-P) framework and the Structure of Observed Learning Outcome (SOLO) classification to construct a hypothetical model of chemical thinking. This model comprises three aspects, each of which includes five cognitive levels for assessing students’ chemical thinking. Accordingly, we developed an initial instrument consisting of 27 items in multiple formats, including multiple-choice, two-tier diagnostic, and open-ended questions. We applied the partial credit Rasch model to establish the validity and reliability of measures for the final instrument. Following a process of pilot testing, revision, and field testing, we arrived at a refined 20-item instrument. Two hundred and twenty-one Chinese high school students (Grade 12) participated in the pilot and field tests. The results demonstrate that the final instrument produces reliable and valid measures of students’ chemical thinking. Furthermore, the empirical results align well with the hypothetical model, suggesting that the SOLO classification can effectively distinguish levels of proficiency in students’ chemical thinking.
Chemical thinking involves students in developing and applying knowledge and practices that serve specific purposes such as chemical analysis, transformation, and synthesis (Sevian and Talanquer, 2014). Moreover, chemical thinking equips learners with domain-specific thinking to access and organize chemical knowledge to find and address questions within the discipline (Sevian and Talanquer, 2014; Landa et al., 2020; Chi et al., 2023). Many studies have investigated students’ chemical thinking in response to specific disciplinary questions (Yan and Talanquer, 2015; Moreira et al., 2019; Macrie-Shuck and Talanquer, 2020). For instance, Ngai and Sevian (2017) applied open-ended questions to assess students’ chemical identification thinking (i.e., “What is this substance?”). Through qualitative analysis, Weinrich and Talanquer (2015) examined different thinking patterns among students while addressing questions related to chemical causality (i.e., “Why do chemical reactions happen?”), chemical mechanism (i.e., “How do these processes occur?”), and chemical control (i.e., “How can these processes be controlled?”). Cullipher and colleagues (2015) explored students’ chemical thinking at different levels as they solved tasks associated with benefits, costs, and risks (i.e., “How to evaluate the impacts of chemically transforming matter?”). Overall, researchers have investigated students’ chemical thinking from different aspects, contributing to a deeper understanding of students’ domain-specific cognitive processes in solving problems using chemical knowledge and practices.
However, previous research has revealed several limitations in assessing students’ chemical thinking. Firstly, assessing students' chemical thinking necessitates an explicit delineation of the types of chemical rationales expected to guide their problem-solving approaches (Talanquer, 2019). While previous studies have either implicitly or explicitly identified the chemical rationales demonstrated by students when tackling different disciplinary problems (Ngai et al., 2014; Weinrich and Talanquer, 2015; Stammes et al., 2022), there remains a lack of an explicit framework to systematically organize these findings and subsequently integrate them into the design of an instrument aimed at assessing students' chemical thinking. Moreover, while most studies have focused on mapping the types and levels of students’ chemical thinking, little evidence has been established for the reliability and validity of such assessments. Therefore, this paper intends to address this gap by applying the Essential Questions-Perspectives (EQ-P) framework (Chi et al., 2023, N.D. under review) to establish an alternative understanding of chemical thinking and by developing and validating an instrument to assess students’ chemical thinking.
Over the past decade, significant research efforts have investigated the characteristics of chemical thinking from these two aspects. Some studies have focused on what chemistry makes us think about, thereby identifying key disciplinary essential questions (Hoffmann, 1995; NRC, 2003; Sevian and Talanquer, 2014). Others have concentrated on examining how chemistry guides our thought processes, highlighting perspectives as thinking tools and identifying critical chemical perspectives relevant to school chemistry (Landa et al., 2020; Ottenhofm et al., 2022). Disciplinary essential questions play a pivotal role in defining chemists’ visions and delineating the overall landscape of chemistry, encapsulating the realm within which chemistry provides explanations (Crombie, 1994; Bensaude-Vincent, 2009; Talanquer, 2011), such as “Why do the properties or behaviours of substances emerge?”. Chemical perspectives embody chemists’ valuable viewpoints on specific aspects of the world and serve as indispensable domain-specific thinking tools, such as the thermodynamic perspective for explaining the extent of chemical processes (Giere, 2006; Brigandt, 2013; Landa et al., 2020).
More recently, we established a chemical thinking framework linking consensus essential questions with perspectives, providing a comprehensive approach for researchers and practitioners to understand, teach, and assess chemical thinking in school chemistry (Chi et al., 2023, N.D. under review). We refer to this framework as the essential questions-perspectives (EQ-P) framework (shown in Fig. 1), encompassing three disciplinary essential questions and 12 corresponding chemical perspectives. Table 1 presents a descriptive definition of chemical thinking in our framework.
Essential questions | Perspectives | Definition (Chi et al., 2023) |
---|---|---|
What is the substance? (description and identification [D&I]) | Physical characteristics (PC) | The fundamental assumption of chemical identification is that each material exhibits certain unique characteristics that differ from those exhibited by other substances. Substances have specific characteristics that can emerge without chemical reactions, such as colour and smell. The physical characteristics of substances are one of the important clues to describing and identifying chemical substances. |
| | Reaction characteristics (RC) | Substances have specific behavioural characteristics that must manifest through chemical reactions, such as redox, acid–base, thermal stability, etc. Chemical substances can be described and identified based on their representative chemical reactivities. |
| | Composition (C) | The material system consists of specific chemical components, such as pure substances, chemical elements, or atoms. Components of the substance system are critical information for characterizing and identifying substances. |
| | Structure (S) | There are interactions between the various components of a substance, giving each chemical substance its unique structure. The fundamental assumption underlying the determination of structure is that there is precisely one distinctive chemical structure for every chemical substance. Chemical substances can be characterized and identified based on their unique structures. |
Why do the properties or behaviours of substances emerge? (explanation and prediction [E&P]) | Particle (P) | Early chemistry relied on a reductionist view that portrayed matter as an assembly of submicroscopic particles. The properties of substances are thus associated with particles with inherent characteristics that account for the observed properties of materials in linear and additive ways. From the particle perspective, matter is a static chemical system composed of submicroscopic particles such as atoms, groups, and molecules. The types and properties of these particles or groups of such particles influence the properties of matter. |
| | Interaction (I) | The systematic view shows that the properties and behaviour of matter are supposed to emerge from the dynamic interactions among submicroscopic components (e.g., molecules, atoms, ions). These interactions manifest at the subatomic, atomic, molecular, and multimolecular scales, and the properties that appear at different scales result from the interactions among subscale particles. The properties or phenomena that emerge at different scales can be explained and predicted based on the interactions among subscale particles. |
| | Thermodynamics (T) | As a dynamic system in a state of equilibrium, chemical substances exist under certain conditions concerning temperature, pressure, and other contextual variables at the macroscopic level. The possibility of substance change emerges in interaction with the system's surroundings or other systems. The stability and reactivity of chemical substances depend on the system state function and the changes (energetic, entropic) in that function that characterize multiple interactions between the system and its surroundings. The reaction system state function and its changes can be used to explain and predict the direction and extent of the transformation of the material system. |
| | Kinetics (K) | Matter systems can undergo chemical reactions by interacting with other matter systems or the surroundings. Chemical reactions involve the issue of rate. The reaction rate depends on particles’ random collision and vibration under specific conditions, activation energy, and other factors, representing the kinetic factors of the reaction. Based on those factors, the mechanism can be revealed, and the reaction rate can be explained and predicted. |
How do we transform and make the substance? (transformation and synthesis [T&S]) | Target molecule (TM) | The synthesis needs to identify desirable products and their molecular structures. Retrosynthetic analysis theory reflects that determining the target molecule's structure and disconnecting its bond can design the synthetic route. Starting materials, reagents, and synthetic routes can be deduced step by step by disconnecting the target molecule. |
| | Starting material (SM) | The starting material perspective includes the impact of starting materials on material synthesis. The starting materials’ structure, nature, and degree of fit with the synthetic reagents influence the artificial design. From the perspective of starting materials, synthetic reagents and suitable chemical reactions can be selected, and synthetic routes can be designed. |
| | Addition-removal (AR) | Making target products often involves the addition and removal of specific chemical components. The addition-removal perspective emphasises introducing and removing certain components for a specific purpose during the synthesis process. For example, target functional groups can be protected based on the rational addition and removal of groups. |
| | Process control (PC) | Transformation and synthesis of substances in chemistry must employ chemical reactions. The synthetic route, reaction thermodynamics, and kinetic factors limit the synthesis yield. From the process control perspective, simplifying the reaction steps and controlling various thermodynamic and kinetic factors can improve the synthesis yield. |
The EQ-P framework introduces three key aspects of chemical practice from a disciplinary standpoint: description and identification (D&I), explanation and prediction (E&P), and transformation and synthesis (T&S). Additionally, our framework offers four alternative and complementary chemical perspectives for addressing each essential question. Each perspective can guide students’ thought processes in solving problems related to these questions. For example, the question “Why do energy changes relate to the occurrence of a chemical reaction?” falls under the E&P category. Students can approach this question using various chemical perspectives. One approach is the interaction perspective (see Table 1), where students recognize that bond breaking and formation in a chemical reaction result in endothermic and exothermic phenomena. Alternatively, students can address the question through a thermodynamic perspective (see Table 1), considering the difference in the total energy of the reaction system before and after the chemical reaction, which leads to endothermic and exothermic processes. Thus, the EQ-P framework emphasizes the importance of establishing multiple chemical perspectives for understanding and analysing problems. It encourages students to develop diverse perspectives and consider specific problems from various chemical perspectives, thereby characterizing the sophisticated conceptual mode we expect students to form. Addressing these questions through different perspectives reflects the varying levels of students’ chemical thinking proficiency in terms of conceptual sophistication.
The SOLO taxonomy encompasses five levels of sophistication, namely ‘prestructural’, ‘unistructural’, ‘multistructural’, ‘relational’, and ‘extended abstract’. Prestructural responses indicate a lack of understanding, whereby students cannot provide meaningful answers to questions. Unistructural responses imply that students can only apply a single aspect of information, fact, or idea to address questions. Multistructural responses demonstrate that students can utilize information, facts, and ideas from multiple aspects; however, these aspects remain disparate and are not yet integrated into a cohesive framework. These three levels manifest the progressive complexity of student thinking through quantitative changes in performance characteristics.
In contrast, ‘relational’ and ‘extended abstract’ represent the two ‘deep level’ responses, which reflect an enhancement in the quality of student thinking and a transition from concrete to abstract reasoning (Biggs and Collis, 1982; Minogue and Jones, 2009). Relational responses involve the integration of at least two distinct aspects of information, facts, or ideas. Such integration facilitates structured thinking, where these aspects collaborate to answer a given question. ‘Extended abstract’ thinking surpasses the confines of provided information and context, enabling the extraction of more general rules or frameworks with broader applicability, involving metacognition of thought processes.
We proposed a hypothetical model that combines the EQ-P framework with the SOLO classification (see Fig. 2). Fig. 2 demonstrates that the EQ-P framework offers a domain-specific foundation for chemical thinking, while the SOLO classification characterizes the varying levels of conceptual sophistication students exhibit.
Based on the SOLO classification, we defined Levels 0 to 2 from a quantitative perspective to assess whether students have developed one or more chemical perspectives to address relevant disciplinary questions (D&I, E&P, and T&S, see Table 1) and thus determine their levels of chemical thinking. Subsequently, Level 3 assesses whether students comprehend the connections and differences between two or more chemical perspectives and effectively apply the corresponding chemical perspectives to solve disciplinary problems. Furthermore, Level 4 measures students’ capability to transcend the given problem context and generalize a comprehensive framework for addressing essential questions. These descriptions of conceptual sophistication levels constitute the theoretical basis for developing an instrument to assess student chemical thinking.
RQ1: What evidence exists pertaining to the reliability and validity of the obtained data using the developed instrument for assessing students’ chemical thinking?
RQ2: What evidence exists to support the appropriateness of the hypothetical levels of chemical thinking in distinguishing students’ chemical thinking proficiencies?
Building upon the hypothetical model delineating distinct levels of chemical thinking, we incorporated a range of item formats, encompassing multiple-choice, two-tier diagnostic, and open-ended questions. Specifically, Levels 1 and 2 predominantly assess students’ mastery of chemical perspectives through multiple-choice questions, while Level 3 primarily assesses students’ structured thinking using two-tier diagnostic questions. Finally, Level 4 focuses on measuring students’ metacognitive abilities related to the problem-solving process by incorporating open-ended questions. Examples and design intentions of items at the four levels are shown in Table 2. To comprehensively assess each essential question of chemical thinking (see Table 1), we developed a total of 27 items, with nine items assigned to each aspect (D&I, E&P, and T&S). A depiction of the relationship between all the items in the initial instrument and the hypothetical model can be found in Table 3.
Level | Items | Design intention |
---|---|---|
Level 1 | Adenosine triphosphate (ATP) is a molecule that directly provides energy for cellular processes. The process of ATP hydrolysis (breakdown) into adenosine diphosphate (ADP) releases energy and is an essential part of cellular metabolism. Which of the following statements about this process is incorrect? (Correct answer: D) | The item focuses on enthalpy and entropy changes that occur during the transformation of ATP to ADP. If a student holds a thermodynamic perspective, the student can predict the directionality and spontaneity of reactions, based on heat release, entropy increase, and Gibbs free energy decrease. Additionally, the student may recognize that catalysts are independent of reaction spontaneity. Conversely, if a student does not hold a thermodynamic perspective, the student may struggle to answer the question correctly. We used binary scores (1 and 0) to grade students’ responses, with a correct answer receiving a score of 1 (D), indicating that students have established a chemical perspective. An incorrect answer (A, B, C) was assigned a score of 0, indicating that a thermodynamic perspective had not yet been established. |
| | [Figure omitted] | |
| | A. The hydrolysis of ATP to ADP releases energy in the form of heat and tends to drive the reaction towards ADP formation. | |
| | B. The process of ATP hydrolysis increases the degree of entropy (disorder) in the system, contributing to a tendency towards ADP formation. | |
| | C. The hydrolysis of ATP to ADP is an energetically favourable reaction, with a decrease in Gibbs free energy. | |
| | D. The hydrolysis of ATP to ADP typically requires an enzyme catalyst and does not occur spontaneously under physiological conditions at 37 °C and neutral pH. | |
Level 2 | Which of the following propositions about the exothermic reaction of hydrogen peroxide (H2O2) decomposition is correct? (correct answer: D) | This item assesses students’ comprehension of the decomposition reaction of hydrogen peroxide from thermodynamic and interaction perspectives. It includes four propositions, with Propositions I and II testing thermodynamic understanding. Students must utilize thermodynamic data, such as heat release and entropy increase, and apply Gibbs free energy to predict the spontaneity of the reaction at all temperatures. Proposition II, which suggests a catalyst affects reaction directionality, is incorrect. Propositions III and IV examine chemical interactions, expecting students to explain hydrogen peroxide's solubility and weak acidity through molecular interactions and bond instability. Proposition IV is incorrect as students can predict the properties of molecules by considering the strength of interactions among atoms. Scoring ranges from 0 to 2, with option D indicating mastery of both perspectives for a score of 2, option B reflecting an understanding of only one perspective for a score of 1, and any other incorrect choices resulting in a score of 0. |
| | I. At any temperature above absolute zero, the reaction can proceed spontaneously in the forward direction. | |
| | II. Manganese dioxide (MnO2) can reduce the activation energy of the reaction, leading to greater progress in the forward direction. | |
| | III. The reason why H2O2 can dissolve in H2O in any proportion may be due to the high polarity of both H2O2 and H2O molecules. | |
| | IV. Based on the molecular structure of H2O2, the H2O2 solution should not exhibit acidity. | |
| | A. I, IV B. III C. II, III D. I, III E. IV | |
Level 3 | In the figure, the catalytic reaction process between benzene and liquid bromine is depicted. | This item in a two-tier diagnostic question format assesses students’ ability to distinguish between thermodynamic and kinetic perspectives in chemical reactions. This approach not only gauges their understanding of each perspective through Propositions I–IV but also their capacity to integrate both in problem-solving. The format's strength lies in its exploration of cognitive dimensions, as it prompts students to justify their choices, thereby revealing their depth of thinking (Treagust, 1988; Peterson et al., 1989; Tsui and Treagust, 2003). The grading rubric reflects this, with scores ranging from 0–3 based on the extent of their perspective integration and the thoroughness of their reasoning. This method provides a comprehensive evaluation of students’ conceptual grasp and application skills in chemistry. |
| | [Figure omitted] | |
| | (1) Please select the correct proposition about this reaction from the options below: (correct answer: C) | The grading rubric is as follows: a score of 0 is allocated to students opting for an incorrect answer, signifying an absence of a discernible chemical perspective; a score of 1 pertains to individuals solely favouring a kinetic perspective and selecting option B; a score of 3 is assigned when students comprehensively consider thermodynamic and kinetic factors, predict the possibility and reality of the reaction, and provide detailed reasoning for their answer. If the answer rationale lacks thoroughness or comprehensiveness, 2 points are awarded (associated with option C). |
| | I. Based on the information in the figure, the catalytic reaction between benzene and Br2 is exothermic. | |
| | II. Under the influence of the catalyst, the rate of transformation from [structures omitted] | |
| | III. The rate of this reaction primarily depends on the step involving the transformation of [structures omitted] | |
| | IV. The main product obtained from the catalytic reaction between benzene and Br2 is [structure omitted] | |
| | A. I B. III C. II, III, IV D. I, II, IV E. I, III, IV | |
| | (2) Please analyse the reaction in detail and explain the reason for choosing the above option. | |
Level 4 | Solubility is an important property for chemists to consider, as different substances generally exhibit varying levels of solubility under identical conditions. For instance, phosphine displays weaker solubility in water compared to ammonia (both of which release heat upon dissolution), while calcium chloride exhibits greater solubility in water than sodium chloride (with calcium chloride causing a significant temperature increase upon dissolution, whereas sodium chloride does not). Please provide a comprehensive explanation for the reasons behind these phenomena based on different chemical perspectives. Based on the above examples, please identify and explain the causes responsible for the differences in solubility observed between different substances. | This open-ended question assesses students’ ability to reflect abstractly on their problem-solving strategy, focusing on solubility as influenced by interaction and thermodynamics. Students could consider ammonia's polarity and hydrogen bonding with water. For sodium chloride and calcium chloride, students could adopt a thermodynamic perspective, considering calcium chloride's exothermic dissolution and entropy increase, based on Gibbs free energy. Scoring reflects students’ integration of chemical concepts in explaining solubility. A score of 4 indicates effective summarization and reflection on the cognitive process for explaining the solubility of the four substances. A score of 3 shows a comprehensive explanation for each substance, while a score of 2 means students can explain solubility for some substances. A score of 1 shows the ability to explain solubility for only one substance. |
Level | Item format | D&I | E&P | T&S |
---|---|---|---|---|
Level 1 | Multiple-choice questions | Q1, Q2 | Q3, Q4 | Q5, Q6 |
Level 2 | Multiple-choice questions | Q7, Q8, Q9 | Q10, Q11, Q12 | Q13, Q14, Q15 |
Level 3 | Two-tier diagnostic questions | Q16, Q17 | Q18, Q19 | Q20, Q21 |
Level 4 | Open-ended questions | Q22, Q23 | Q24, Q25 | Q26, Q27 |
During the development of the instrument, both the test content and the response process serve as pivotal sources of validity evidence (American Educational Research Association, 2014; Lewis, 2022). The validation of the test content aims to ensure the theoretical soundness of the instrument and its acceptability and accessibility among the target population (Deng et al., 2021; Lewis, 2022). To achieve this, an expert panel comprising three chemistry education researchers and one physical chemistry professor meticulously evaluated the appropriateness and coherence of the test content designed to assess students’ chemical thinking (Abd-El-Khalick et al., 2015; He et al., 2022b; Li et al., 2024). Additionally, four high school chemistry teachers assessed the clarity and suitability of the item content for high school students.
The response process was designed to address concerns regarding the interpretability of the items for students (Deng et al., 2021; Lewis, 2022; Schreurs et al., 2024). To this end, ten Grade 12 students participated voluntarily in this process. They were administered the instrument and then participated in interviews. The following four questions were posed to the participants:
(1) What is your understanding of this question?
(2) What chemical knowledge or principles do you believe are required to answer this question?
(3) Can you suggest any improvements to the wording of any items to enhance comprehension?
(4) Why did you decide to pick that option?
Questions 1–3 were intended to elicit feedback from students regarding their comprehension of the instrument, its effectiveness in assessing students’ chemical perspectives, and any remaining challenges in item wording (Danczak et al., 2020; Deng et al., 2021; He et al., 2021). The fourth question aimed to explore whether students’ explanations of the items aligned with our expectations. Based on the insights gained from these interviews, we revised the wording of certain items in line with feedback from the expert panel, teachers, and students. For example, when faced with Proposition IV in Example 2 of Table 2, none of the interviewed students applied the interaction perspective to consider the properties of hydrogen peroxide solution. They often believed, based on their life experience, that hydrogen peroxide solutions cannot be acidic. Consequently, we revised the wording of Proposition IV to include “Please predict properties based on the molecular structure of hydrogen peroxide” to better diagnose whether students have failed to construct an interaction perspective. After this process, all items used for the pilot test were expected to encourage students to apply specific chemical perspectives.
The Rasch model offers indices that are essential in establishing the reliability and validity of the measurement. The separation index for both persons and items, which serves as an indicator of reliability, was utilized in the Rasch model. Generally, a separation index exceeding two is considered adequate (Duncan et al., 2003). Additionally, the validation of an instrument based on the Rasch model typically involves testing for fit statistics, local independence and dimensionality, and the Wright map (Liu, 2010).
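As background for the reliability values reported later, the separation index G and the corresponding reliability coefficient R are linked by a standard Rasch conversion. The relation below is not quoted from the article, but it is consistent with the reported figures (e.g., a person separation of 2.09 corresponds to a reliability of about 0.81).

```latex
% Standard relation between the separation index G and reliability R
R = \frac{G^{2}}{1 + G^{2}},
\qquad
G = \sqrt{\frac{R}{1 - R}}
% Example: G = 2.09 \;\Rightarrow\; R = 2.09^{2}/(1 + 2.09^{2}) \approx 0.81
```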
Commonly used item fit statistics include the mean square residual (MNSQ) and the standardized mean square residual (ZSTD), which evaluate the degree of deviation between observed scores and the scores expected under the Rasch model. MNSQ and ZSTD can be aggregated over all persons for each item in two ways, Infit and Outfit, resulting in four fit statistics. An acceptable fit criterion is Infit and Outfit MNSQ within the range of 0.7 to 1.3 and Infit and Outfit ZSTD within the range of −2.0 to +2.0 (Linacre, 2011). In addition, Linacre (2022) recommends that before examining fit statistics, the point-measure correlation (PTMEA) should be examined first. The PTMEA indicates how the items contribute to the measures, referring to the correlation between students’ scores on the items and their Rasch measures; this correlation should be positive, and a higher positive correlation is better (Liu, 2010). The purpose of assessing local independence and dimensionality is to determine the relevance and non-redundancy of all instrument items, as well as whether the variation among responses to an item is accounted for by a single latent construct (Andrich and Marais, 2019). The criteria to confirm local independence include a ZSTD value greater than −2.0 or a correlation coefficient of residuals lower than 0.7 (Smith, 2005; Linacre, 2011; He et al., 2016). Principal components analysis of Rasch residuals is employed to evaluate the extent to which a data set deviates from unidimensionality (Linacre, 2011). Lastly, the Wright map aligns persons and items on a shared linear scale, providing insightful information on the ordering and spacing of items and thus indicating the alignment between the actual difficulty and the predetermined difficulty of items (Liu and Boone, 2023).
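To make these definitions concrete, the sketch below computes Infit MNSQ, Outfit MNSQ, and PTMEA for dichotomous items from a scored response matrix and previously estimated Rasch measures. The toy data and variable names are hypothetical, the polytomous (partial credit) case additionally involves category thresholds, and the published analysis relied on Winsteps rather than custom code.

```python
import numpy as np

# Hypothetical scored responses (persons x items, 0/1) and Rasch estimates in logits.
X = np.array([[1, 1, 0, 0],
              [1, 0, 1, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1]], dtype=float)
theta = np.array([0.2, 0.0, 0.8, 0.5])    # person ability measures
delta = np.array([-1.0, -0.3, 0.4, 1.1])  # item difficulty measures

# Expected score and model variance under the dichotomous Rasch model.
P = 1.0 / (1.0 + np.exp(-(theta[:, None] - delta[None, :])))
W = P * (1.0 - P)
Z2 = (X - P) ** 2 / W                     # squared standardized residuals

# Outfit MNSQ: unweighted mean of squared standardized residuals per item.
outfit_mnsq = Z2.mean(axis=0)
# Infit MNSQ: information-weighted mean square per item.
infit_mnsq = ((X - P) ** 2).sum(axis=0) / W.sum(axis=0)
# PTMEA: correlation between item scores and person measures.
ptmea = np.array([np.corrcoef(X[:, i], theta)[0, 1] for i in range(X.shape[1])])

print(np.round(infit_mnsq, 2), np.round(outfit_mnsq, 2), np.round(ptmea, 2))
```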
This study utilized the partial credit Rasch model for data analysis. The partial credit Rasch model was appropriate for this study due to the instrument's inclusion of various item formats and its use of multiple rating categories (Liu, 2010). The Rasch analysis was performed using Winsteps software (Version 3.72.0).
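For reference, the partial credit model gives the probability that person n with ability θn responds in category x of item i with step difficulties δi1, …, δimi. The formulation below is the standard one and is included only as a reminder of what the model estimates, not as an equation taken from the original article.

```latex
% Partial credit Rasch model: probability of scoring category x on item i
P(X_{ni} = x) =
  \frac{\exp\!\left[\sum_{k=0}^{x}\left(\theta_{n} - \delta_{ik}\right)\right]}
       {\sum_{h=0}^{m_{i}} \exp\!\left[\sum_{k=0}^{h}\left(\theta_{n} - \delta_{ik}\right)\right]},
  \qquad x = 0, 1, \ldots, m_{i},
  \quad \text{with } \sum_{k=0}^{0}\left(\theta_{n} - \delta_{ik}\right) \equiv 0 .
```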
In the pilot study, the person separation index yielded a value of 2.09, equating to a person reliability coefficient of 0.81. Moreover, the item separation index yielded a value of 4.93, indicating an item reliability coefficient of 0.96. Based on the Rasch model criteria, four items were found to fall outside the acceptable range for fit statistics, specifically the Infit MNSQ and Infit ZSTD. These four items simultaneously involved both substance identification (D&I) and explanation of properties (E&P), two aspects of chemical thinking; we therefore removed them. Furthermore, another four items that did not meet the unidimensionality criterion were removed, because they involved everyday situations or socio-scientific issues that could potentially influence students’ cognitive judgments.
Through the analysis of the Wright map, we identified a gap in the item difficulty measures and addressed this concern by adding a multiple-choice question at Level 2. This item was designed around the synthesis of salicylic acid and involves the T&S aspect of chemical thinking. Additionally, we modified the item format of three items whose difficulty measures did not align with the difficulty expected from the pilot test. For example, Q2 (Level 1) was found to be close in difficulty to Level 3 and was therefore revised into the two-tier diagnostic question format (Level 3). Furthermore, we noticed that two Level 4 items lacked corresponding ability measures, indicating that students with higher abilities should be selected for the field test. Ultimately, the revised instrument consisted of 20 items; the new item codes and a detailed description are listed in Table 4.
Level | Items |
---|---|
Note: The annotations in parentheses are the item codes of the initial instrument in Table 3. | |
Level 1 | L1A (Q1), L1B (Q8), L1C (Q5), L1D (Q3), L1E (Q4), L1F (Q6) |
Level 2 | L2A (Q16), L2B (Q12), L2C (Q19), L2D (Q14), L2E (new addition) |
Level 3 | L3A (Q2), L3B (Q17), L3C (Q18), L3D (Q21) |
Level 4 | L4A (Q25), L4B (Q22), L4C (Q24), L4D (Q23), L4E (Q27) |
| | Measure | Error | Infit MNSQ | Infit ZSTD | Outfit MNSQ | Outfit ZSTD |
|---|---|---|---|---|---|---|
| Person | 0.07 | 0.18 | 0.94 | −0.2 | 1.05 | 0.1 |
| Item | 0.00 | 0.16 | 0.99 | −0.1 | 1.03 | 0.0 |
Item | Measure | S.E. | Infit MNSQ | Infit ZSTD | Outfit MNSQ | Outfit ZSTD | PTMEA
---|---|---|---|---|---|---|---
L1A | −1.56 | 0.23 | 1.33 | 2.5 | 1.86 | 2.7 | 0.11 |
L1D | 0.91 | 0.20 | 1.26 | 3.5 | 1.63 | 3.4 | 0.18 |
L1F | −1.04 | 0.21 | 1.11 | 1.1 | 1.20 | 1.9 | 0.32 |
L1B | −2.21 | 0.27 | 1.22 | 1.3 | 1.18 | 1.0 | 0.16 |
L2D | −0.30 | 0.13 | 1.11 | 1.1 | 1.16 | 1.9 | 0.54 |
L2B | −0.87 | 0.16 | 1.13 | 1.1 | 1.15 | 1.2 | 0.46 |
L4C | 1.67 | 0.12 | 1.07 | 0.6 | 1.08 | 0.7 | 0.63 |
L2E | −1.17 | 0.15 | 0.97 | −0.2 | 1.05 | 0.3 | 0.54 |
L3C | 0.08 | 0.11 | 1.05 | 0.4 | 1.02 | 0.2 | 0.67 |
L1C | −1.77 | 0.24 | 0.98 | −0.1 | 1.01 | 0.1 | 0.38 |
L1E | −1.18 | 0.21 | 0.99 | 0.0 | 0.99 | 0.0 | 0.42 |
L3A | 0.25 | 0.12 | 0.98 | −0.1 | 0.97 | −0.3 | 0.66 |
L4B | 1.32 | 0.11 | 0.92 | −0.7 | 0.93 | −0.5 | 0.74 |
L2A | −0.48 | 0.13 | 0.90 | −0.9 | 0.89 | −0.7 | 0.64 |
L4D | 1.15 | 0.10 | 0.88 | −1.0 | 0.89 | −0.7 | 0.74 |
L4A | 1.74 | 0.16 | 0.83 | −1.3 | 0.80 | −1.6 | 0.65 |
L4E | 2.21 | 0.13 | 0.78 | −1.2 | 0.76 | −1.3 | 0.65 |
L2C | 1.05 | 0.15 | 0.75 | −1.4 | 0.74 | −1.6 | 0.65 |
L3D | 0.49 | 0.12 | 0.74 | −1.5 | 0.72 | −1.8 | 0.77 |
L3B | −0.29 | 0.12 | 0.73 | −1.9 | 0.70 | −2.0 | 0.79 |
The loading scatterplot of the residuals’ principal components analysis is presented in Fig. 3. The horizontal axis indicates the estimated item difficulty in the Rasch model, while the left vertical axis displays the correlation coefficient between students’ item scores and an additional latent construct unrelated to chemical thinking. Each letter in the plot represents a different item, and the right vertical axis shows the frequency of items with a given correlation coefficient from the left vertical axis. When items fall within the contrast loading range of −0.4 to +0.4, they do not strongly measure another construct. Based on Fig. 3, the majority of the items meet the criterion of unidimensionality, with only three items (L1C, L1F, and L4A) falling outside the −0.4 to +0.4 range. This suggests that most of the items in the instrument effectively assess the latent construct of chemical thinking.
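A minimal sketch of this dimensionality check, under the same assumptions (and the hypothetical arrays X, theta, delta) as the earlier fit-statistics example: standardized Rasch residuals are computed, a principal components analysis is run on their correlations, and items whose loading on the first contrast falls outside ±0.4 would be flagged as possibly measuring a second construct. The published analysis was performed in Winsteps; this is only an illustration of the idea.

```python
import numpy as np

def first_contrast_loadings(X, theta, delta):
    """Item loadings on the first principal component ('first contrast') of
    standardized Rasch residuals, used to screen for multidimensionality."""
    P = 1.0 / (1.0 + np.exp(-(theta[:, None] - delta[None, :])))  # expected scores
    Z = (X - P) / np.sqrt(P * (1.0 - P))       # standardized residuals
    R = np.corrcoef(Z, rowvar=False)           # inter-item residual correlations
    eigvals, eigvecs = np.linalg.eigh(R)       # eigenvalues in ascending order
    return np.sqrt(eigvals[-1]) * eigvecs[:, -1]  # loadings on the largest component

# Items with |loading| > 0.4 would be inspected further:
# flagged = np.abs(first_contrast_loadings(X, theta, delta)) > 0.4
```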
Level | Items (measures) | Threshold |
---|---|---|
Level 1 | L1A (−1.56), L1B (−2.21), L1C (−1.77), L1E (−1.18), L1F (−1.04), L2E (−1.17) | −1.48
Level 2 | L2A (−0.48), L2B (−0.87), L2D (−0.30), L3B (−0.29) | −0.49
Level 3 | L3A (0.25), L2C (1.05), L1D (0.91), L3C (0.08), L3D (0.49) | 0.56
Level 4 | L4A (1.74), L4B (1.32), L4C (1.67), L4D (1.15), L4E (2.21) | 1.62
According to Table 7, five different levels of chemical thinking among students can be identified. A student's chemical thinking proficiency is categorized as Level 0 if their Rasch measure is less than −1.48. For students whose measure falls between −1.48 and −0.49, chemical thinking proficiency is categorized as Level 1. Similarly, a measure between −0.49 and 0.56 indicates Level 2 proficiency, while a measure between 0.56 and 1.62 indicates Level 3 proficiency. Finally, a Rasch measure above 1.62 suggests a proficiency level of 4. The range of students’ chemical thinking levels along the Rasch scale is shown in Fig. 5. This result aligns with the hypothetical model based on the SOLO classification, which supports the rationale of the hypothetical levels of chemical thinking.
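As a simple illustration of how these thresholds translate a Rasch person measure into one of the five levels, the snippet below classifies a few hypothetical measures; only the threshold values come from Table 7, everything else is illustrative.

```python
import numpy as np

# Level boundaries (logits) from Table 7, separating Levels 0-4.
THRESHOLDS = np.array([-1.48, -0.49, 0.56, 1.62])

def chemical_thinking_level(measure: float) -> int:
    """Map a Rasch person measure (in logits) to a chemical thinking level 0-4."""
    return int(np.searchsorted(THRESHOLDS, measure, side="right"))

# Hypothetical person measures and their resulting levels.
for m in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"measure {m:+.2f} -> Level {chemical_thinking_level(m)}")
# measure -2.00 -> Level 0, ..., measure +2.00 -> Level 4
```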
Our findings contribute to existing research in the following aspects. Firstly, this study presents an alternative model for assessing students' chemical thinking proficiencies in terms of conceptual sophistication. Assessing students’ chemical thinking entails identifying the specific types of thinking we expect them to exhibit (Sevian and Talanquer, 2014; Stammes et al., 2022; Talanquer, 2019). The EQ-P framework offers a novel approach to characterizing students' chemical thinking within the conceptual mode dimension. The chemical perspectives outlined by the EQ-P framework explicitly represent core chemical ideas that we aim for students to comprehend during their secondary school chemistry education (Chi et al., 2023, N.D. under review). It is also essential to recognize that various chemical perspectives can simultaneously yield valid and potentially complementary responses (Giere, 2006; Griesemer, 2011; Talanquer, 2019). Our objective is for students not only to acquire multiple chemical perspectives but also to apply and explore these perspectives in problem-solving scenarios. To this end, this study introduced the SOLO classification to characterize students' conceptual sophistication levels within the EQ-P framework. The findings provide empirical evidence that the SOLO classification can function as a theoretical framework for differentiating students’ proficiency in chemical thinking. This integration facilitates a more nuanced assessment of students’ chemical thinking proficiencies by incorporating both quantitative and qualitative aspects of their cognitive processes.
Secondly, previous studies primarily focused on characterizing students’ levels of chemical thinking performance using phenomenological methods (Yan and Talanquer, 2015; Moreira et al., 2019; Macrie-Shuck and Talanquer, 2020). However, these studies have not yet developed instruments with satisfactory reliability and validity that are suitable for large-scale measurement. To address this gap, our study hypothesized a theoretical model for assessing chemical thinking and employed the Rasch model to establish the psychometric properties of the measurement. By following a rigorous development process, we finalized a high-quality instrument capable of differentiating students’ performance in the realm of chemical thinking. The instrument is expected to be utilized in testing larger samples, thus further advancing research in chemical education.
Thirdly, the development of students’ chemical thinking depends on designing curriculum and instruction that explicitly target chemical thinking (Talanquer and Pollard, 2010), offering ample opportunities for its cultivation. Pioneering research has demonstrated the effectiveness of such curriculum implementations, revealing that, compared to traditional methods, students exhibit positive improvements in performance on the ACS conceptual exam and subsequent courses (Talanquer and Pollard, 2017). Nonetheless, evaluating the effectiveness of teaching implementations still requires a set of instruments specifically tailored to chemical thinking (Talanquer and Pollard, 2017; Talanquer, 2019), enabling targeted diagnostics of students’ development in this area. The findings of this study provide a practical and relevant instrument for assessing the efficacy of chemistry curriculum designs and instructional methods in secondary schools.
Furthermore, there is a need to continuously explore and refine the chemical thinking assessment model. This study utilized the SOLO classification to investigate students’ potential to engage in sophisticated conceptual modes during chemical thinking. Previous studies have found that more advanced knowledge may lead individuals to construct less sophisticated but more targeted and productive explanations (Weinrich and Talanquer, 2016). In other words, among the various reasoning modes, such as descriptive or multicomponent (Sevian and Talanquer, 2014), students often do not need the most complex reasoning mode when solving real problems. However, this study indicates that students exhibit different levels of conceptual sophistication when asked to provide more complex argumentation. Students who have not established any chemical perspective or who apply only partial perspectives may find it difficult to effectively solve all the problems at hand. In contrast, students with a sophisticated conceptual mode may have more options when facing problems and are more likely to find the most suitable path or reasoning mode to solve those problems. Nonetheless, the challenge remains in determining the most appropriate conceptual and reasoning mode for student problem-solving. Therefore, assessing students' chemical thinking requires not only recognizing conceptual sophistication and various reasoning modes but also determining which approach is most necessary for solving problems based on the nature of the problem or task (Weinrich and Talanquer, 2016; Talanquer, 2019). It is essential for future studies to develop an instrument based on a chemical thinking assessment framework that integrates ideas, reasoning, and practice to identify the most effective problem-solving strategies (Talanquer, 2019).
Secondly, the majority of students who participated in the instrument administration had learned chemistry through traditional teaching methods. Our instrument therefore measured the chemical thinking proficiency of students who solve chemical problems by applying content learned in traditional ways. However, a high exam performer may not necessarily exhibit higher levels of chemical thinking. It is crucial to further investigate whether our instrument can distinguish between students who excel in traditional exams and those who truly excel in chemical thinking. With the ongoing promotion of the “Disciplinary Essential Questions and Chemical Perspectives” teaching method in senior high schools in Mainland China, we plan to apply the instrument to assess students who have experienced this method in the future. We will compare their proficiency with that of students taught through traditional methods to further establish the discriminant and concurrent validity of the measures obtained with the instrument.
Thirdly, the final version of the instrument required an excessive amount of time for students to complete. This may be due to the high number of items or our requirement that students demonstrate the most sophisticated thinking levels they can. This characteristic may render our instrument suitable for summative assessment purposes but impractical for classroom administration by teachers. To mitigate this issue, categorizing the items into the different essential question categories (D&I, E&P, T&S) could yield three sub-instruments. However, further development is needed to create formative assessment instruments that are more suitable for classroom use by teachers.
Finally, the data were collected from a limited sample of students from two schools, which may not be representative of the broader population of students across different regions or educational settings. Future research should incorporate a more diverse sample from multiple schools and regions to enhance the generalizability of the results and provide a more comprehensive understanding of students’ chemical thinking proficiencies in various educational environments.
In conclusion, this study highlights the significance of assessing students’ development of chemical thinking and presents the development process of a high-quality measurement instrument. The research findings offer valuable insights for effectively assessing and teaching chemical thinking within the context of school chemistry. The limitations identified in this study will be addressed in future research.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4rp00106k