Exploring different types of assessment items to measure linguistically diverse students’ understanding of energy and matter in chemistry

Kihyun Ryoo *, Emily Toutkoushian and Kristin Bedell
School of Education, University of North Carolina, 309E Peabody Hall, Chapel Hill, 27599-3500, USA. E-mail: khryoo@email.unc.edu; Tel: +1-(919) 962-0345

Received 26th July 2017, Accepted 21st October 2017

First published on 1st November 2017

Energy and matter are fundamental, yet challenging concepts in middle school chemistry due to their abstract, unobservable nature. Although it is important for science teachers to elicit a range of students’ ideas to design and revise their instruction, capturing such varied ideas using traditional assessments consisting of multiple-choice items can be difficult. In particular, the linguistic complexity of these items may hinder English learners (ELs) who speak English as a second language from understanding and representing their ideas. This study explores how multi-modal assessments using different types of open-ended items can document ELs’ and English-dominant students’ (EDSs) understanding of energy and matter in chemistry. 38 eighth-grade, linguistically diverse students taught by one teacher at a low-income middle school completed an assessment designed to elicit their ideas about properties of matter and chemical reactions through arguing from evidence, writing explanations, and developing models of chemical phenomena. The results show that the three types of assessment items captured different correct and alternative ideas that ELs and EDSs held. In particular, modeling appears promising as a tool to assess what ELs know about properties of matter and chemical reactions in middle school chemistry, compared to other written items. The findings of this study provide insights into how different types of assessment items can be used to better understand the range of ideas held by linguistically diverse students.


Developing a coherent understanding of properties of matter and chemical reactions requires middle school students to integrate unobservable scientific concepts at the microscopic level (e.g., movements of atoms and molecules), observable phenomena at the macroscopic level (e.g., changes in states of matter), and the role of energy in chemical processes (e.g., thermal energy affecting the speed of molecules) (e.g., Driver and Millar, 1986; Liu and Lesniak, 2006; Sevian and Talanquer, 2014). This integrated understanding of chemistry is particularly emphasized by the Next Generation Science Standards ([NGSS]; NGSS Lead States, 2013), which demand that all students integrate crosscutting concepts of energy and matter into disciplinary core ideas through science practices, such as developing a model and constructing scientific explanations (Krajcik et al., 2014; Pellegrino et al., 2014). For example, students should be able to articulate in written and visual forms that particles are spaced differently in solids, liquids, and gases as thermal energy is added or removed (Sevian and Talanquer, 2014).

Despite its importance, many students struggle with developing such an integrated understanding of chemistry due to its abstract and unobservable nature (e.g., Abraham et al., 1992; Nakhleh, 1992; Hadenfeldt et al., 2016). A significant body of research has documented a wide range of students’ alternative ideas about properties of matter and chemical reactions, including confusing a physical change with a chemical change (e.g., Ahtee and Varjola, 1998), attributing macroscopic properties to microscopic concepts (e.g., Hadenfeldt et al., 2016), believing that atoms can be created or destroyed in chemical reactions (e.g., Sampson et al., 2011), and failing to distinguish energy from matter (e.g., Lee et al., 1993).

These varied ideas that students have can be productive resources to help them develop a more coherent understanding of new scientific phenomena (e.g., DiSessa, 1993; Brown et al., 2000). A number of researchers have emphasized the importance of providing students with opportunities to elicit their initial predictions, compare them to a scientific phenomenon, and evaluate which ideas are valuable (Smith III et al., 1994; Taber, 2000; Linn and Eylon, 2011; DiSessa, 2014). Given that students can develop diverse ideas based on their everyday experiences (e.g., ice cube melting), cultural background (Lee and Fradd, 1998), and interactions with formal schooling (e.g., Garnett et al., 1995; Erman, 2017), it is critical for science teachers to elicit their ideas and create learning opportunities for students to use these ideas as building blocks (Duschl and Gitomer, 1991; Vosniadou, 1994; Ruiz-Primo and Furtak, 2007; Taber and García-Franco, 2010).

However, capturing the range of students' ideas about complex chemical phenomena is challenging when using traditional standardized assessments consisting of multiple-choice items that often measure isolated facts using complex linguistic features (Liu et al., 2011; Noble et al., 2012). This can be particularly problematic for English learners (ELs) who speak a language other than English at home because many ELs in English-dominant mainstream classrooms are still developing proficiency in academic English (Hakuta et al., 2000). Given the rapidly growing population of ELs in public schools, there is a critical need to develop assessment items that can provide all students, including ELs, with multiple opportunities to express their understanding of scientific phenomena.

This study explores how using multimodal assessments consisting of three item types (modeling, claim-evidence-reasoning [CER], and explanations) can capture a range of correct and alternative ideas that ELs and English-dominant students (EDSs), who only or mainly speak English at home, may hold (e.g., Ruiz-Primo et al., 2010). While a number of studies have shown the value of different types of assessment items in general (e.g., Prain and Waldrip, 2006; Scalise and Gifford, 2006; O'Byrne, 2009), there is limited research on how different types of items can be responsive to ELs who are underserved in mainstream science classrooms. Specifically, this study investigated the following research questions:

(1) How do different item types (CER, explanation, and modeling) measure middle school ELs’ and EDSs’ understanding of energy and matter in chemistry?

(2) How do different item types elicit ELs’ and EDSs’ correct ideas and alternative ideas about energy and matter in chemistry?

Theoretical framework

Role of language in science instruction and assessment for English learners

Understanding unobservable molecular processes can be more challenging for ELs who are less familiar with the academic language of science (Scarcella, 2003; Snow, 2010; Ryoo and Bedell, 2017). Language plays a critical role in learning science because students are required to use academic language to comprehend, communicate, and represent their ideas about scientific phenomena (Lee et al., 2013). However, research has shown that the academic language of science has unique features that are not often used in social language, such as technical terms (e.g., atoms), everyday vocabulary that has different scientific meanings (e.g., matter), complex sentence structures, passive voice, frequent nominalizations, and unique discourse patterns (Lemke, 1990; Schleppegrell, 2005; Fang, 2006). These linguistic features have been shown to increase ELs’ difficulty with comprehending text and participating in classroom discussions (e.g., González-Howard and McNeill, 2016). This can be even more problematic when scientific concepts are abstract, such as chemical reactions, as ELs need to infer the unobservable processes by relying on unfamiliar linguistic resources.

Moreover, several researchers have raised a concern that the linguistic complexity of assessment items can interfere with accurately measuring ELs’ understanding of scientific phenomena (Abedi, 2002; Shaftel et al., 2006). For instance, traditional science assessments often use multiple-choice items asking students to choose one of the four given options (Delandshere and Petrosky, 1998). However, such item types represent multiple scientific concepts in complex sentences using several technical terms (Abedi, 2002; Harlow and Jones, 2004; Shaftel et al., 2006; Turkan and Liu, 2012). Furthermore, such assessments often require ELs to develop written explanations to represent their understanding of scientific systems using academic language, but generating written products to articulate complex scientific phenomena can be more challenging for ELs (Fillmore and Fillmore, 2012). Even when ELs have a concrete understanding of scientific phenomena, they may not be able to choose a correct answer or represent their understanding because they misinterpret questions or do not know the technical vocabulary (Abedi, 2002). Although the largest achievement gaps in science have been consistently found between ELs and their English-dominant peers, these findings indicate that ELs’ underperformance may be due to rigorous language demands rather than their content knowledge (Lyon et al., 2012).

In order to reduce the linguistic demand, many researchers have suggested accommodation strategies for ELs, including administering tests in students’ home languages (e.g., Anderson et al., 2000; Hofstetter, 2003), providing dictionaries or glossaries (e.g., Abedi et al., 2000), providing additional time for ELs to complete the test (e.g., Abedi and Lord, 2001), providing read-aloud accommodations (e.g., Kieffer et al., 2009), modifying the linguistic features using simplified language (e.g., Johnson and Monroe, 2004), and incorporating visual representations (e.g., Solano-Flores et al., 2014). For instance, Abedi et al. (2000) compared assessment results for students based on language status and the use of glossaries, modified language, or extra time. The authors found that the only modification to decrease the performance gap between ELs and EDSs was the simplified language. Providing visual representations has also been found to be beneficial to help ELs understand the questions (Martiniello, 2009; Solano-Flores and Wang, 2015; Kachchaf et al., 2016). Despite the positive outcomes of these studies, most of the accommodation strategies have been implemented for multiple-choice items. Although multiple-choice items are most common in standardized assessments, they are often unable to adequately assess multiple components of students’ ideas and their reasoning processes, as well as identify specific alternative ideas they might have (Pellegrino et al., 2014).

Providing ELs with multiple opportunities to express their ideas

In order to more accurately capture a wide range of ideas that ELs have regarding complex chemical phenomena, it is important to provide ELs with opportunities to represent their understanding of science in multiple ways. Multi-modal assessments using various open-ended item types such as drawing can elicit a range of correct and alternative ideas that students hold by allowing them to freely express their understanding of science in their own words or models (e.g., Linn et al., 1991). In particular, compared to traditional explanation items that typically ask ELs to explain a scientific phenomenon in writing (e.g., Sandoval and Reiser, 2004; Ruiz-Primo et al., 2010), different item types can also reduce the linguistic burden on ELs as they can provide more structured scaffolding. For instance, modeling items enable students to build a visual representation of unobservable scientific systems, such as molecular movements (Windschitl et al., 2008; Cheng and Brown, 2015). Such items have the potential to provide ELs with an alternative way to demonstrate their ideas while reducing the language demand (Ryoo and Linn, 2015). CER items can also allow ELs to use relevant evidence from visual representations of data, such as tables and graphs, to support their claim about a scientific concept (Gotwals and Songer, 2010). Providing graphs and tables in items can provide ELs with additional resources that may help generate written responses (Hakuta, 2014).

Given prior research showing that language can interfere with ELs’ understanding and representation of scientific concepts (e.g., Abedi, 2002), it is important for science teachers to accurately assess whether poor performance on an assessment item is related to a lack of content knowledge or to difficulty expressing ideas in words. Using different forms of items has the potential to offer multiple sources of evidence about the types of ideas ELs hold regarding scientific phenomena (Prain and Waldrip, 2006; O'Byrne, 2009; Pellegrino et al., 2014). However, there is little research addressing whether and how different types of open-ended items can accurately capture ELs’ and EDSs’ various ideas about complex chemistry concepts. The purpose of this study is to explore how three types of open-ended assessment items, namely, explanation, CER, and modeling, can measure 8th-grade ELs’ and EDSs’ correct and alternative ideas about energy and matter in properties of matter and chemical reactions.


Participants and study design

This study involved 38 eighth-grade students taught by one science teacher from a Title I middle school. Before the study began, approval from the Institutional Review Board (IRB) was obtained. One researcher from the research team visited each classroom to verbally explain the study in language that eighth-grade students would understand and answer any questions students had. Students also received written assent forms that provided more details about the study. Furthermore, all students’ parents or guardians received written parent consent forms with an introductory letter, and their written consent was obtained. Participation in the study was voluntary. Only the 38 students who assented and whose parents or guardians consented were included in the study. At the time of the study, the school's student population was 31% Hispanic, 56% White, and 9% African American, and 47% of students received free or reduced lunch. Among the 38 students, 15 were identified as ELs who speak a language other than English at home and speak English as their second language, and 23 were identified as EDSs who mainly or only speak English at home.

Prior to this study, all students received formal instruction on chemistry concepts, including structure of matter, properties of matter, and chemical reactions, using the official district-provided curriculum. State guidelines recommend that teachers use concrete models to help students learn chemistry concepts, including chemical changes (North Carolina Department of Public Instruction, 2012). At the end of the school year, approximately six months after the chemistry instruction, students individually completed our assessment using the Web-based Inquiry Science Environment (WISE) during one 50-minute class period.

Assessment development

In order to develop multimodal assessments that can capture different levels of all students’ complex thinking in chemistry, we used principles from the Evidence Centered Design (ECD) framework, as well as the Knowledge Integration (KI) framework. The ECD framework (see Table 1) for assessment development lays out how assessment tasks should be designed and combined through a systematic process (e.g., Mislevy and Haertel, 2006). The first step in developing the assessment for this study was domain analysis, which involved identifying key concepts students should demonstrate related to energy and matter in chemistry, particularly properties of matter and chemical reactions, by unpacking NGSS, American Association for the Advancement of Science (AAAS), and North Carolina Essential Standards for Science for 8th-grade students. In addition to reviewing standards, we also reviewed curricular resources for middle school students and collaborated with our teacher partners. After domain analysis, one of the main relationships we detailed in domain modeling was the connection between the NGSS practices and the concepts in both chemical reactions and properties of matter. Specifically, we carefully reviewed three NGSS practices that could be utilized in this assessment: engaging in argument based on evidence (CER), developing explanations of a scientific phenomenon (explanation), and developing and using a model (modeling).

Table 1 Our assessment development process in relation to ECD
| ECD level | ECD step | Description of ECD step | Our assessment process |
| --- | --- | --- | --- |
| Understanding domain | Domain analysis | Reviewing information related to domain of interest | – Reviewed multiple middle school science standards (NGSS, AAAS, NC state standards) |
| Understanding domain | Domain modeling | Finding, specifying, and organizing relationships between key areas identified in the domain analysis | – Identified key concepts related to chemical reactions and properties of matter (matter transformation, matter conservation, structure of matter, energy transformation, energy conservation); – Identified key practices from NGSS standards (explanation, argumentation, modeling) |
| Conceptual assessment framework | Student model | Articulating the constructs of interest for the assessment, as well as the desired levels of achievement | – Used KI framework: students need to make elaborated connections between key concepts to have higher levels of understanding |
| Conceptual assessment framework | Task model | Defining aspects of appropriate tasks for the assessment in order to guide task construction | – Reviewed, modified, and drafted items that fit into context of three NGSS practices (explanation, CER, modeling); – Pilot tested and revised items based on results and teacher comments |
| Conceptual assessment framework | Evidence model | Defining how responses from tasks will provide evidence about the constructs and levels in the student model | – Created KI rubrics for each item to reflect how students are making connections between normative ideas |

The next level in the ECD framework is the articulation of the different models within the Conceptual Assessment Framework (CAF) to create and choose assessment tasks. Within the CAF, we specified the student model, which highlighted the key concepts of energy and matter in chemical reactions and properties of matter that the assessment should measure. To categorize levels of student understanding, we utilized the KI framework that emphasizes the importance of eliciting students’ range of ideas and capturing how students link normative ideas to represent their integrated understanding of scientific phenomena (Linn and Eylon, 2011). Under this framework, students are rewarded for making such elaborated connections between normative ideas, rather than identifying a single factual piece of knowledge (Liu et al., 2008; Lee and Liu, 2010). Students with a higher level of understanding about the concepts of properties of matter and chemical reactions, therefore, would be able to make elaborated connections between ideas. For the task model, we focused on using the three NGSS practices to modify the items we reviewed from large-scale assessments, including the National Assessment of Educational Progress (NAEP), Trends in International Mathematics and Science Study (TIMSS), and North Carolina End-of-Grade (EOG) assessments, as well as research studies focusing on these concepts (e.g., Harris et al., 2015) and assessment items available online. In cases in which none of the assessment items we reviewed adequately covered our target concepts, we developed new items to better gather evidence of ELs’ and EDSs’ science understanding of these concepts. The assessment items were reviewed by an experienced 8th-grade science teacher to ensure the linguistic accessibility and content accuracy, and the items also underwent initial pilot-testing in his classrooms prior to this study. 
During initial pilot-testing, we examined different item types for identical concepts to understand which assessment task could be most appropriate as well as how items could be modified for accessibility. Finally, in the evidence model, we specified rubrics for the scoring of the different items using the KI framework. The rubrics were developed for each item to reflect the normative connections between the key concepts.

The final online assessment comprised nine multi-part items of three types: explanation, CER, and modeling (see Table 2 for examples).

Table 2 Examples of modeling, CER, and explanation items
Item type Example
Modeling: items asked students to build a model for a scientific phenomenon using an online drawing tool in the assessment and to provide a written description of their model. Water on the floor item image file: c7rp00141j-u1.tif
Explanation: items asked students to explain unobservable phenomena. Prompts were provided to direct student responses. Ice cube item image file: c7rp00141j-u2.tif An ice cube is placed in a heated pan on the stove. Using scientific evidence, explain what will happen to the energy and molecules of the ice cube as the ice is heated.

Make sure that your explanation includes:

– What will happen to the energy of the ice cube's molecules?

– What will happen to the molecules?

– What causes your answers to happen?

CER: items asked students to choose a claim and use evidence to support their claim. Mass item image file: c7rp00141j-u3.tif 1. Look at picture 3. Which statement best describes the mass of experiments A and B AFTER the baking soda was added to the vinegar?

A. A and B have the same amount of mass.

B. A has more mass than B.

C. A has less mass than B.

2. Look at the experiment above. What evidence from the experiment can support your answer to Q1?

3. How does your evidence (Q2) support your answer to Q1?

• Explanation items asked students to develop a coherent explanation about a chemical phenomenon by integrating concepts of energy and matter in chemistry.

• CER items provided students with a context, as well as data tables and observations, and they asked students to choose a claim, provide evidence to support their claim, and explain their reasoning regarding how their evidence supports their claim.

• Modeling items asked students to build a visual model of an unobservable scientific system. Students were given an online modeling system consisting of representations familiar to the students, as well as a directional video demonstrating how to use the modeling tool. In addition to building a model, students were asked to explain what they were showing in their visual model.

While the three item types elicited students’ understanding of the target concepts in different ways, all items asked students to integrate two or three focal ideas about energy and matter into the processes of properties of matter and chemical reactions, rather than focusing on a single concept (see Table 3). Each of the item types has potential advantages and disadvantages for linguistically diverse students. For instance, explanation items can allow students to express their understanding in their own ways, but generating written products could be more challenging for ELs who speak English as a second language (Beck et al., 2013). While CER items have built-in scaffolds asking students to use evidence from visual sources to support their claims, middle school students may struggle with how to interpret and use data as appropriate evidence (McNeill and Krajcik, 2007). Modeling items can potentially reduce the linguistic burden by allowing ELs to visualize their understanding of unobservable phenomena, but students may be less familiar with using a tool to develop visual models.

Table 3 Disciplinary core ideas and cross cutting concepts measured in each assessment item
Item type Item name Item description NGSS standard Disciplinary core ideas Energy and matter concepts
Matter Energy Energy–matter
Explanation Ice cube Students were asked to explain what happens to the molecules and energy of an ice cube when it was put on a hot stove. MS-PS1-4 – Structure and properties of matter

– Molecular movement in states of matter

× × ×
Heated gas Students were asked to identify what happens to molecules when thermal energy is added to helium gas and provide an explanation. MS-PS1-4 – Structure and properties of matter

– Molecular movement in states of matter

× × ×
Evaporation Students were asked to explain what happens to water molecules after the water evaporates. MS-PS1-4


– Structure and properties of matter

– Molecular movement in states of matter

– Chemical vs. physical changes

× ×
CER Mass Students were asked to choose a claim, select evidence from data, and provide reasoning to explain what happens to the total mass of substances during a chemical reaction. MS-PS1-2


– Matter conservation in chemical changes

– Changes in matter during chemical reactions

Balloon Students were asked to choose a claim, select evidence from data, and provide reasoning to explain whether or not matter disappears during a chemical reaction. MS-PS1-2


– Chemical vs. physical changes

– Changes in energy and matter in chemical reactions

– Matter conservation in chemical changes

× ×
Liquid Students were asked to choose a claim, select evidence from data, and provide reasoning to explain how changes in energy are related to chemical reaction. MS-PS1-2


– Characteristics of physical and chemical properties

– Release of energy and temperature change during chemical reactions

× ×
Modeling Water on the floor (WOF) Students were asked to build a model to show what happens to water molecules when water evaporates and explain their model. MS-PS1-4


– Structure and properties of matter

– Molecular movement in states of matter

× ×
Energy Students were asked to build a model to demonstrate what happens to energy during and after a chemical reaction. Students were also asked to explain their model. MS-PS1-4 – Changes in energy and matter in chemical


– Chemical vs. physical changes

Reaction Students were asked to build a model to show how a chemical reaction happens and what happens to the total number of atoms. Students were also asked to explain their model. MS-PS1-1


– Changes in energy and matter in chemical reactions

– Matter conservation in chemical changes


Data analysis

In order to understand how the different assessment items functioned for ELs and EDSs, all nine items were first coded using a revised KI scoring rubric (level 0–5), which rewards students for making connections among target focal ideas about the content covered in the item (e.g., Liu et al., 2008) (see Table 4). The modeling items were coded based on both the students’ visual models and written response together in order to capture all of the ideas expressed by the students in that item. Students’ responses were coded by two independent raters, with discrepancies resolved through conversation until agreement reached 100%. The initial overall levels of agreement between the two raters across items ranged from 70 to 80%.
Table 4 Scoring rubric
Score Level Description Student examples from the ice cube explanation item Student examples from the mass CER item Student examples from the WOF modeling item
0 Off-task No answer or “I don’t know” “I don’t know.” A and B have the same amount of mass.

“I don’t know.”

image file: c7rp00141j-u4.tif
1 No link Non-normative or irrelevant ideas “The stove will burn off all of the molecules in the ice” A has more mass than B.

“#2. cause you can see the balloons running out of mass and one isnt”

image file: c7rp00141j-u5.tif “The water molecules become evaporated but the there is still water molecules on the floor after it is evaporated It's just less than hours before”
2 Inadequate link Both non-normative and normative ideas present in the given response “As the ice cube melts the energy molecules start to rise and move more until it turns into a solid” A and B have the same amount of mass.

“The same amount of baking soda was added to both beakers, the only difference was with the balloon. On the chart, the baking soda's mass is 10 g in each beaker.”

image file: c7rp00141j-u6.tif

“Over the minutes and hour the water evaporated. The water was consumed. The water turned in gas.”

3 Partial link Normative ideas without scientifically valid connections between ideas “The ice melts when it is on the heated pan” A has less mass than B.

“In the picture it looks like A has more vinegar and B has less vinegar which can mean that it has less mass than B. For A they covered the top with a balloon and it inflated after the vinegar was added.”

image file: c7rp00141j-u7.tif “The water H2O was on the ground and the sunlight's heat evaporated most of the water but still left a few and brought the oxygen and some hydrogen into the air as a gas.”
4 Full link One scientifically valid and elaborated link between normative and relevant ideas “As the cube is heated the molecules with begin to spread out and they will move around more. The heat will cause the molecules energy to increase.” A has more mass than B.

“I think that it had more mass because it probably weighed more with all that gas inside it. The balloon in the first one was inflated and in the second one it was deflated so that in my way of thinking makes it so one of them has more mass than the other one.”

image file: c7rp00141j-u8.tif “There were less molecules and the water molecules had evaporated because of the heat.”
5 Complex link Two or more scientifically valid and elaborated links between normative and relevant ideas “When an ice cube is heated in a pan on a stove, the energy of the ice cube molecules turn from potential energy to kinetic energy. The molecules begin to move faster and to spread out because of the increase in temperature. The molecules began to have more motion which requires kinetic energy and this is all because the ice cube is beginning to melt into water.” A has more mass than B.

“Beaker A will have more mass than Beaker B because Beaker A was a closed experiment. Since Beaker B wasn't covered with the balloon, the gas that was created from the chemical reaction was released. Beaker A was covered so the balloon trapped and collected the gas from the chemical reaction. My evidence in question 2 supports my answer to question 1 because it shows that Beaker A was covered with a balloon which made it a closed experiment. This means that the mass was contained in the balloon and all of the original mass was conserved. Since Beaker 2 didn't have a balloon, the gas that was created from the chemical reaction was released and is no longer a part of the mass of the experiment.”

image file: c7rp00141j-u9.tif “After the water has evaporated, it is turned into a gas. Thus, the molecules of the water will spread out more and will move at a faster pace. There aren't less water molecules, they are just more widespread and are moving at a more rapid speed.”
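
The double-coding step described above can be illustrated with a short sketch. It assumes that rater agreement was computed as simple percent agreement on KI scores; the text does not name the statistic, so this is one plausible reading rather than the authors' procedure. The function name `percent_agreement` and the ten scores are hypothetical and are not study data.

```python
from typing import Sequence

# KI rubric levels as listed in Table 4 (score -> level name).
KI_LEVELS = {
    0: "Off-task",
    1: "No link",
    2: "Inadequate link",
    3: "Partial link",
    4: "Full link",
    5: "Complex link",
}


def percent_agreement(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Return the percentage of responses given the same KI score by both raters."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must score the same responses.")
    if not rater_a:
        raise ValueError("At least one scored response is required.")
    for score in (*rater_a, *rater_b):
        if score not in KI_LEVELS:
            raise ValueError(f"Score {score} is not a KI level (0-5).")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical scores for ten responses (illustrative only, not study data).
rater_a = [0, 1, 2, 3, 4, 5, 3, 2, 4, 1]
rater_b = [0, 1, 2, 3, 4, 4, 3, 3, 4, 1]
agreement = percent_agreement(rater_a, rater_b)  # 8 of 10 scores match
```

Percent agreement does not correct for chance; a chance-corrected statistic such as Cohen's kappa would be a reasonable alternative if the raw scores were available.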

To explore the effects of different types of assessment items on ELs’ and EDSs’ understanding of energy and matter in chemistry, we first compared the mean scores between the two language groups for the three item types (i.e., explanation, modeling, and CER) using t tests. Next, to capture the range of students’ ideas about chemistry, all nine items were further coded using rubrics developed during the evidence model phase of the ECD framework for specific links students made between correct ideas and alternative ideas students demonstrated (see Table 5). To identify students’ correct and alternative ideas, we used the state standards and NGSS, as well as previous literature on energy and matter in chemistry (e.g., Lee et al., 1993), properties of matter (e.g., Liu and Lesniak, 2006), and chemical reactions (e.g., Boo and Watson, 2001). The number of correct ideas and alternative ideas per response was compared by language group for each item type using t tests to determine if there were differences between ELs and EDSs. Additionally, in order to understand the range of ideas present in students’ responses, the percentages of students with a specific correct or alternative idea were calculated for each item type.

Table 5 Examples of correct ideas and alternative ideas codes

| Type | Code | Student examples |
|---|---|---|
| Correct ideas | Matter transformation-macro | “The baking soda and the vinegar changed into a different substance through a chemical change”; “If the ice cube heats up even more, the water could turn into gas” |
| Correct ideas | Matter transformation-micro | “During the chemical reaction, the oxygen atoms split up and joined with the hydrogen pairs, making H2O” (image file: c7rp00141j-u10.tif) |
| Correct ideas | Energy–matter | “the thermal energy will transfer to the ice cube, making it's molecules spread out, and making it become a liquid, and eventually a gas”; “Heat provides energy, and energy results in more movement for the molecules as they are given more space to move around in.” |
| Alternative ideas | Chemical vs. physical change (e.g., Abraham et al., 1992) | “the molecules would get closer together [after evaporation] have chemical reaction and change.”; “The state matter changed which has to mean that energy was released” |
| Alternative ideas | Lack of matter conservation (e.g., Liu and Lesniak, 2006; Claesgens et al., 2009) | “The molecules moved closer because of evaporation, which made them disappear” (image file: c7rp00141j-u11.tif) |
| Alternative ideas | Not differentiating energy–matter (e.g., Lee et al., 1993) | “The energy of the ice cube has been turned into a liquid and spreads rapidly.”; “I used the law of conservation of mass. Energy cannot be created or destroyed.” |
| Alternative ideas | Confusing macro and micro properties (e.g., Hadenfeldt et al., 2016) | “The molecules will go from being frozen or not moving at all to a fast constant rate of moving when it starts to get heated up more” (image file: c7rp00141j-u12.tif) |

Results and discussion

Overall results

The results show that there was a significant difference between ELs and EDSs in their overall scores, t(36) = 2.33, p < 0.05, ES = 0.74. EDSs achieved significantly higher scores (M = 3.76, SD = 0.76) than ELs (M = 3.06, SD = 1.11), indicating that EDSs demonstrated a more coherent understanding of energy and matter in chemistry than their peers who speak English as a second language.

When comparing ELs’ and EDSs’ performances on the three types of assessment items (explanation, CER, and modeling), we found that EDSs significantly outscored ELs on the explanation items, t(36) = 2.37, p < 0.05, ES = 0.77, and the CER items, t(36) = 2.80, p < 0.05, ES = 0.90 (see Table 6). However, there was no significant difference between the two language groups on the modeling items, t(36) = 1.05, p = 0.30, ES = 0.35. EDSs scored highest on the CER items, while ELs scored highest on the modeling items, suggesting that the difference in scores between EDSs and ELs on the CER items could be related to language-based aspects of the item type. Consistent with previous research (e.g., Abedi, 2002; Lyon et al., 2012), ELs appeared to struggle with articulating their understanding of the content in the language-intensive CER and explanation items. Consistent with Ryoo and Linn (2015), the modeling items allowed ELs to display their knowledge visually, and scores on these items did not differ significantly by language status.

Table 6 Mean KI scores by item type and language status

| Item type | ELs, M (SD) | EDSs, M (SD) |
|---|---|---|
| Overall | 3.06 (1.11) | 3.76 (0.76) |
| CER | 3.00 (1.19) | 3.87 (0.82) |
| Explanation | 2.78 (1.25) | 3.64 (0.98) |
| Modeling | 3.40 (1.27) | 3.78 (0.97) |
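As a rough plausibility check on the reported statistics, a standardized mean difference and the matching independent-samples t statistic can be recomputed from the overall means and standard deviations. The sketch below rests on several assumptions that this section does not confirm: that ES is Cohen's d with a pooled standard deviation, that the t tests assumed equal variances, and that the group sizes were 15 ELs and 23 EDSs (hypothetical values chosen only because they sum to 38 and give 36 degrees of freedom). Because the inputs are rounded, the output need not match the reported values exactly.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference (group 1 minus group 2), pooled SD."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


def t_from_d(d, n1, n2):
    """Equal-variance independent-samples t statistic implied by d."""
    return d * math.sqrt(n1 * n2 / (n1 + n2))


# ASSUMED group sizes (not stated in this section); they only need to sum to 38.
N_EDS, N_EL = 23, 15

# Overall means and SDs taken from the overall row of the score table.
d = cohens_d(3.76, 0.76, N_EDS, 3.06, 1.11, N_EL)
t = t_from_d(d, N_EDS, N_EL)
df = N_EDS + N_EL - 2

print(f"d = {d:.2f}, t({df}) = {t:.2f}")
```

Under these assumptions the recomputed values land close to, but not exactly on, the reported t(36) = 2.33 and ES = 0.74, which is what rounding of the summary statistics would lead one to expect.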

Number of correct ideas and alternative ideas across item types per language group

To further explore how different item types can capture ELs’ and EDSs’ complex ideas about energy and matter in chemistry, we compared the number of correct ideas and alternative ideas that ELs and EDSs expressed across the three item types. The results for the explanation items show that EDSs demonstrated significantly more correct ideas, t(36) = 2.61, p < 0.05, ES = 0.87, and significantly fewer alternative ideas, t(36) = 2.66, p < 0.05, ES = 0.84, than ELs (see Fig. 1). By contrast, the modeling items revealed no significant differences between ELs and EDSs in the number of correct ideas, t(36) = 0.92, p = 0.36, ES = 0.30, or alternative ideas, t(36) = 0.63, p = 0.53, ES = 0.21. Interestingly, the CER items showed that EDSs demonstrated significantly more correct ideas about energy and matter, t(36) = 2.81, p < 0.05, ES = 0.95, but there was no significant difference in the number of alternative ideas between ELs and EDSs, t(36) = 0.74, p = 0.47, ES = 0.25.
Fig. 1 Mean correct ideas and alternative ideas per response by item type and language.

Correct ideas

When we examined the specific correct ideas that were elicited by the three item types, some different patterns emerged regarding the differences between ELs and EDSs, as well as the types of ideas that were present in student responses (see Table 7).
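Table 7 Percentage of students with types of correct ideas by item type

| Correct idea | CER items: ELs | CER items: EDSs | Explanation items: ELs | Explanation items: EDSs | Modeling items: ELs | Modeling items: EDSs |
|---|---|---|---|---|---|---|
| Properties of matter | 0.0 | 0.0 | 35.6 | 55.1 | 36.7 | 43.5 |
| Matter transformation | 24.4 | 31.9 | 0.0 | 0.0 | 33.3 | 47.8 |
| Matter conservation | 26.7 | 49.3 | 0.0 | 4.3 | 30.0 | 39.1 |
| Energy–matter relationship | 0.0 | 26.1 | 13.3 | 31.9 | 0.0 | 0.0 |
| Energy transformation | 0.0 | 4.3 | 0.0 | 7.3 | 60.0 | 69.6 |
| Energy conservation | 20.0 | 34.8 | 0.0 | 0.0 | 66.7 | 69.6 |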

Explanation items. This item type showed the greatest difference between ELs and EDSs in the ideas that were expressed. Across all of the categories of ideas that were elicited by explanation items, a larger percentage of EDSs expressed correct ideas than ELs. For example, while over half of the EDSs expressed correct ideas about properties of matter, accurately describing molecular movement in different states or the spread of molecules, only about a third of the ELs expressed such ideas. The explanation example in Table 8 illustrates how an EDS made several connections about what happens to the spacing and speed of the molecules when thermal energy is added, as well as the accompanying energy transformation that occurs during this process. By contrast, the EL in the Table 8 example made only one correct assertion, that molecules separated during phase change, and also incorrectly stated that the number of molecules decreased. Additionally, very few ELs’ responses expressed correct ideas about energy, with none expressing correct ideas about energy transformation and only 13% correctly linking energy and matter ideas. Aligned with prior research showing that ELs may be less familiar with or have less exposure to the academic language of science (Scarcella, 2003; Hakuta et al., 2000; Snow, 2010), these findings indicate that ELs could have faced additional challenges when generating written explanations to articulate their understanding of the target concepts.
Table 8 Examples of responses by language status and item type

| | Explanation: ice cube | CER: mass | Modeling: WOF |
|---|---|---|---|
| ELs | “The energy of the water molecules will decrease as it becomes a gas (steam or vapor) and the molecules will most likely separate from each other. I say this because as you heat an ice cube it becomes a gas if molecules from the ice cube are being taken away to make the gas, it will decrease the amount of molecules.” | A has more mass than B. “In the picture it looks like A has more vinegar and B has less vinegar which can mean that it has less mass than B.” | image file: c7rp00141j-u13.tif “The molecules of water will spread out and move at a faster pace. There aren’t less water molecules” |
| EDSs | “The H2O molecules in the ice cube are, originally packed tightly together and don’t move as much. As the ice cube is being heated up, it becomes a liquid. The molecules in the liquid spread out farther and they start moving more, using more energy. If the stove is hot enough, the liquid would then evaporate causing for the molecules to spread out even FARTHER and them to move so much faster! The is what causes this to happen, it causes the change of states in this case.” | A has more mass than B. “They put a balloon on top of beaker A but not beaker B. The balloon on beaker A would have contained everything that was produced by the chemical reaction, whereas anything produced in beaker B would have disappeared into the air.” | image file: c7rp00141j-u14.tif “A liquid is spread far apart but still packed together, but a gas moves more freely and far apart.” |

CER items. Similar to the explanation items, EDSs demonstrated more correct ideas about energy and matter in chemistry across categories on CER items than ELs (see Table 7). In particular, 49.3% of EDSs, compared to 26.7% of ELs, correctly integrated their understanding of matter conservation throughout the CER items by stating that “molecules can’t disappear” and “[molecules] just change forms” during chemical reactions. For example, when students were asked to compare the mass of beakers in open and closed systems after a chemical reaction, EDSs, such as the example in Table 8, were more likely to select relevant evidence to support their claim, giving evidence of conservation by discussing the mass of the reactants and identifying the difference between the systems. By contrast, the EL in the Table 8 example chose a correct claim, but she did not support it with appropriate evidence or with any relevant correct ideas about matter conservation.

Of particular interest is that, while 26.1% of EDSs were able to explain the relationship between energy and matter during a physical change by explaining how adding or removing thermal energy affects matter, none of the ELs identified this concept in their responses. Some ELs were able to demonstrate correct ideas about energy by stating that “energy cant be created or destroyed only transformed,” but they often did not connect it to how energy affects particle motion.

Modeling items. Compared to the CER and explanation items, both ELs and EDSs expressed more correct ideas in the modeling items, and their patterns were similar. Consistent with the literature on the value of visual models for representing abstract scientific concepts (e.g., Chang et al., 2010; Ryoo and Linn, 2015), both ELs and EDSs were better able to demonstrate their understanding of unobservable concepts of energy and matter in the modeling items, such as matter transformation, energy transformation, and energy conservation. For instance, when asked to develop a visual model of what happens to water molecules when water was left on the floor in the sun (see Table 8), the first response from an EL shows that the student understood the change in spread of molecules during evaporation, as well as matter conservation evidenced by maintaining the same number of molecules in both pictures. Additionally, the written response, which references matter conservation and properties of matter, reflects the concepts shown in the EL's model. The presence of more correct ideas in the written responses for ELs suggests that the use of visual representations in assessment items could have helped them develop their written responses because they could reference the models they developed. Indeed, when compared with the EL explanation item example in Table 8, the EL was able to express more correct molecular ideas in the modeling item, such as matter conservation, that were absent or incorrect in the explanation response. This is echoed in the relative similarity of the percentage of ELs and EDSs with correct ideas for modeling items, suggesting that ELs may have more correct ideas about properties of matter and chemical reactions than they were able to express with words in the CER and explanation items. 
Visual supports have been identified as an important accommodation strategy to help ELs better understand assessment items (Solano-Flores et al., 2014; Solano-Flores and Wang, 2015), and our finding suggests that such visual aids can be used to support ELs in developing more coherent written explanations (Hakuta, 2014).

Alternative ideas

While examining the alternative ideas that were elicited by the three item types, some different patterns emerged between ELs and EDSs, including the types of ideas that were present in student responses (see Table 9).
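Table 9 Percentage of students with types of alternative ideas by item type

| Alternative idea | CER items: ELs | CER items: EDSs | Explanation items: ELs | Explanation items: EDSs | Modeling items: ELs | Modeling items: EDSs |
|---|---|---|---|---|---|---|
| Matter conservation | 16.7 | 2.2 | 33.3 | 0.0 | 63.3 | 52.2 |
| Molecules change forms | 0.0 | 0.0 | 33.3 | 26.1 | 23.3 | 30.4 |
| Confusing chemical/physical changes | 17.8 | 18.8 | 16.7 | 2.2 | 0.0 | 0.0 |
| Confusing macro/micro properties | 0.0 | 0.0 | 33.3 | 21.7 | 46.7 | 60.9 |
| Confusing energy/matter | 8.9 | 14.5 | 11.1 | 8.7 | 6.7 | 8.7 |
| Energy conservation | 20.0 | 13.0 | 26.7 | 21.7 | 33.3 | 34.8 |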

Explanation items. Explanation items not only elicited the widest range of alternative ideas among students but also revealed some large differences between ELs and EDSs. More ELs than EDSs expressed alternative ideas about chemical and physical changes (16.7% ELs, 2.2% EDSs), as well as about the conservation of matter (33.3% ELs, 0% EDSs). For instance, as shown in the example in Table 8, the EL correctly stated that the molecules spread out when ice becomes a gas. However, this student also displayed alternative ideas about matter disappearing, saying that the molecules would be “taken away” from the ice cube “to make the gas.” Given that the particulate nature of matter has been found to be a difficult concept for many students (e.g., Abraham et al., 1992; Lee and Liu, 2010), it is possible that ELs may not have had a coherent understanding of what happens to molecules in liquid and gas. However, as shown in prior research on the role of language in measuring ELs’ understanding of content knowledge (Kopriva and Sexton, 1999; Solano-Flores and Trumbull, 2003), the language choices of ELs in their written explanations also indicate that they may not have had the appropriate linguistic resources to express the abstract concepts of matter in a written form.
CER items. For CER items, ELs and EDSs had a different pattern of alternative ideas than for explanation items, with ELs having more alternative ideas than EDSs for three categories and EDSs having more than ELs for two categories. The most frequently observed alternative ideas for both ELs (17.8%) and EDSs (18.8%) involved confusing chemical and physical change, such as describing vinegar and baking soda in a chemical reaction as “evaporating into the balloon” rather than as producing a different substance. One of the major differences between the two language groups concerned ELs’ alternative ideas about matter conservation. For instance, 16.7% of ELs, compared to only 2.2% of EDSs, believed that matter can disappear or can be created during a chemical reaction. In addition, 13.3% of ELs, compared to 4.3% of EDSs, showed alternative ideas about energy conservation, such as the idea that energy “goes away” or “gets used up” (see Table 10). One unexpected finding was that more EDSs (14.5%) than ELs (8.9%) failed to distinguish energy from matter. For instance, one EDS was able to correctly state that “the baking soda and vinegar did not disappear. They were turned into bubbles,” but then incorrectly equated this change with the idea that energy is conserved through transformation by stating that “because energy cannot be created or destroyed only transformed.”
Table 10 Examples of alternative ideas from the WOF modeling item by language group

| | Visual model | Written response |
|---|---|---|
| EL | image file: c7rp00141j-u15.tif | “The molecules didn’t go away, they simply changed form, so they disconnected and were made into a gas” |
| EDS | image file: c7rp00141j-u16.tif | “The water molecules slowing spread further apart and evaporate becoming a gas” |

Modeling items. Unlike the CER and explanation items, the modeling items elicited more alternative ideas for EDSs than ELs across several categories, particularly molecules changing forms (23.3% ELs, 30.4% EDSs) and confusion between macro and micro properties (46.7% ELs, 60.9% EDSs). These findings suggest that, in contrast to the CER and explanation items, modeling items could capture different patterns of linguistically diverse students’ ideas than those captured by written items. An example of a major alternative idea that appeared in the modeling items, but not in the written items, was related to macro/micro confusion. As seen in Table 10, many ELs and EDSs showed the alternative idea that molecules break up during evaporation. The accompanying written responses explained that these diagrams were showing water molecules turning into a gas when they evaporate. This suggests that students are confusing the macro concept of liquids turning into a gas with what happens to molecules during evaporation. Another alternative idea related to macro/micro confusion can be seen in the EL student's model in Table 10, which shows molecules changing form during evaporation. The EL shows water molecules when the water is in liquid form and then replaces the hydrogen atoms with carbon atoms in the gas form. The explanation correctly states that the liquid is turning into a gas; however, the model indicates that the student meant that the gas is composed of carbon dioxide molecules instead of water molecules. Aligned with findings from a number of studies that show students’ confusions regarding the distinctions between the observable (macro) and unobservable (micro) levels of chemical phenomena (e.g., Lee et al., 1993; Nakhleh et al., 2005; Hadenfeldt et al., 2016), these examples of models and their accompanying explanations suggest that many students might not understand molecular changes during evaporation, despite being able to write an explanation about what happens at the macro level.
The CER and explanation written items may have been less equipped to measure this particular alternative idea because it is difficult to know what students mean when their response only focuses on the macro process of water turning into a gas without a picture into their thinking.

Our findings are consistent with prior research on the benefits of visual representations to capture students’ molecular understanding (e.g., Chang et al., 2010; Nyachwaya et al., 2011). Given that ELs and EDSs showed similar patterns in their alternative ideas in the modeling items, compared to written items, this study provides evidence that the modeling items could have elicited alternative ideas that students were unable to explain clearly in words, but could show using the model. Comparing the performance of ELs and EDSs on written and modeling items could reveal alternative ideas that are hidden by written items.


The purpose of this study was to explore how different assessment item types can reveal 8th-grade ELs’ and EDSs’ understanding of unobservable chemical phenomena. The results of this study show that explanations, CER, and modeling items elicited a wide range of correct and alternative ideas that ELs and EDSs had regarding properties of matter and chemical reactions. This suggests that using multi-modal assessments with different item types can provide more insights into how ELs and EDSs understand energy and matter during chemical phenomena, as the three item types captured different correct and alternative ideas from linguistically diverse students. We found that the explanation and CER items favored EDSs over ELs, but there were no significant differences between the two language groups in the modeling items. Consistent with the current literature (e.g., Lyon et al., 2012), EDSs in our study expressed significantly more correct ideas but significantly fewer alternative ideas on the explanation items than ELs, with large effect sizes (ES = 0.87 and ES = −0.84, respectively). The significant differences between ELs and EDSs may be attributed to the fact that explanation items required the most use of language with limited visual support. Although ELs in our study were able to explain their ideas in everyday, conversational English, it is possible that generating written products could have increased the linguistic demand for ELs who are still developing proficiency in English (Kopriva and Sexton, 1999; Beck et al., 2013).

By contrast, there were no significant differences in the numbers of correct and alternative ideas between the two language groups on the modeling items. For instance, the percentages of ELs and EDSs who visually modeled correct ideas about matter conservation were similar (30.0% for ELs and 39.1% for EDSs). These findings are in line with a recent study by Ryoo and Linn (2015) showing that ELs and EDSs demonstrated a similar understanding of energy flow in life science using a concept diagram. Although there is limited research on the use of modeling to measure ELs’ understanding of chemistry, the findings of this study are aligned with research on the value of visual representations as a way to better understand students’ ideas about the particulate nature of matter in general (e.g., Chang et al., 2010; Nyachwaya et al., 2011). In particular, our study indicates that modeling items may have reduced the linguistic demand on ELs by allowing them to visually represent their ideas and providing visual resources to support their writing process. Moreover, the difference in how explanation and modeling items function for ELs and EDSs suggests that language may indeed exacerbate the assessment challenges facing ELs in chemistry to such an extent that traditional assessments and assessments with only open-ended explanation items may incorrectly measure their content understanding (e.g., Noble et al., 2012; Kachchaf et al., 2016). Given the benefits of developing a visual model as an instructional approach to reduce the linguistic burden for ELs, further research should be conducted to explore how such item types can support ELs in articulating their ideas.

Additionally, the three item types captured different patterns of alternative ideas between ELs and EDSs. Such findings suggest that both middle school ELs and EDSs hold conflicting ideas about energy and matter which may go undetected by traditional assessments. In particular, both ELs and EDSs continued to hold a wide range of alternative ideas about energy and matter in the structure and properties of matter and chemical reactions even after receiving formal science instruction. For example, students who could describe molecules spreading apart during phase changes would show images of molecules breaking apart into individual atoms. This tendency to hold onto multiple ideas is consistent with the fractured frameworks described by Nakhleh et al. (2005), who found that middle school students frequently held confused macro- and micro-level understandings about the properties and structure of matter. As new ideas are introduced through instruction, students may be unable to effectively sort out ideas into coherent frameworks, and instead hold onto alternative ideas. These findings suggest the importance of using different types of assessment items as a way to provide multiple sources of evidence regarding students’ reasoning.

The findings of our study provide important implications for science teachers serving linguistically diverse students in mainstream science classrooms, particularly with regard to formative and summative assessment practices. In systems of classroom assessments, it is important that items provide formative information that can help teachers uncover and diagnose the range of correct and alternative ideas students may have, as well as an accurate summative measure of how much students have learned. The findings of this study suggest that different types of assessment items can uncover different patterns of both correct ideas and alternative ideas held by ELs and by their English-dominant peers. This in turn suggests that teachers should deliberately use different item types and consider which types of items they include in assessments. For instance, multimodal assessments consisting of a mix of items may provide teachers with more insight into the full range of ideas held by linguistically diverse students. This is especially relevant in the formative assessment of complex concepts such as macro/micro properties and matter–energy relationships in chemical reactions and properties of matter. Likewise, using multimodal summative assessments can provide teachers with a fuller understanding of individual students’ mastery of core chemistry concepts while also providing teachers with data they can use to revise their teaching in the future based on persistent alternative ideas their students hold. Furthermore, the study results highlight the importance of items, such as visual modeling items, that can reduce linguistic barriers for ELs and allow them to show what they know. Thus, by eliciting a wide range of correct ideas and alternative ideas, multimodal assessments may be able to help teachers plan and revise their instruction to better target the learning needs of ELs and EDSs in linguistically diverse classrooms.


This study provides evidence about the potential benefits of using multi-modal assessments for linguistically diverse students, but our assessments focused only on their understanding of energy and matter in properties of matter and chemical reactions. Thus, the results may not generalize to other chemistry concepts (e.g., Boyle's law), other science content areas (e.g., biology), or other grade levels (e.g., high school). Moreover, as our study was conducted in mainstream, English-dominant classrooms located within a Title I (high-poverty) middle school, results may not generalize to other types of school settings. Although several precautions were taken to protect the validity of the findings, including having multiple independent coders and cross-checking findings with those reported in the literature, more research should be conducted to confirm the advantages of different item types for linguistically diverse students.

There are also possible limitations arising from the study design that should be considered in future research. First, as this study is an initial effort for a larger project focusing on developing assessments for linguistically diverse students’ understanding of chemistry, data were collected from only 38 students after formal science instruction. It is possible that the small sample size or other factors such as students’ prior instruction and unfamiliarity with the item types may have impacted findings. Future research should be conducted with a larger number of students from diverse school settings. Second, while all the items focused on measuring students’ integrated understanding of chemistry based on scientifically valid connections among focal ideas, traits innate to the item types may have helped or hindered students. For instance, although the CER items presented more information and scaffolding in the observational data and tables, compared to other item types, interpreting such information could have been difficult for students. It is also possible that the unstructured nature of the explanation items could have helped students express a wide range of ideas, while such items could have also provided additional challenges to ELs who are still developing English proficiency. Future research should include student interviews to explore how students interpret each item type and their reasoning processes. Third, although students were able to develop models, as this is a common teaching approach for middle school science and emphasized in the NGSS and state standards, the modeling tool that students used may have been unfamiliar to some and could have led to models that inaccurately suggested alternative ideas. We provided students with video directions about how to use the tool multiple times, but using technology to construct models might have been difficult for some students and could have affected how they represent their ideas. 
Future research should ensure that students have multiple opportunities to develop models using the same technology during science instruction, aligned with assessments. Finally, the modeling items in this study were scored based on students’ visual representations and written responses together. However, this may have obscured differences between the ideas that students showed in their models and those in their written responses. Future research should consider scoring the visual models and written responses separately to more accurately capture the ideas that are elicited by the different parts of the items.

Conflicts of interest

There are no conflicts to declare.


Acknowledgements

This material is based upon work supported by the National Science Foundation under Grant No. 1552114. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.


References

  1. Abedi J., (2002), Standardized achievement tests and English language learners: psychometric issues, Educ. Assess., 8(3), 231–257.
  2. Abedi J. and Lord C., (2001), The language factor in mathematics tests, Appl. Meas. Educ., 14(3), 219–234.
  3. Abedi J., Lord C., Hofstetter C. and Baker E., (2000), Impact of accommodation strategies on English language learners' test performance, Educ. Meas.: Issues Pract., 19(3), 16–26.
  4. Abraham M. R., Grzybowski E. B., Renner J. W. and Marek E. A., (1992), Understandings and misunderstandings of eighth graders of five chemistry concepts found in textbooks, J. Res. Sci. Teach., 29(2), 105–120.
  5. Ahtee M. and Varjola I., (1998), Students’ understanding of chemical reaction, Int. J. Sci. Educ., 20(3), 305–316.
  6. Anderson M., Liu K., Swierzbin B., Thurlow M. and Bielinski J., (2000), Bilingual accommodations for limited English proficient students on statewide reading tests: Phase 2, Minnesota Report No. 31, Minneapolis, MN: National Center on Educational Outcomes, University of Minnesota.
  7. Beck S. W., Llosa L. and Fredrick T., (2013), The challenges of writing exposition: lessons from a study of ELL and non-ELL high school students, Read. Writ. Quart., 29(4), 358–380.
  8. Boo H. K. and Watson J. R., (2001), Progression in high school students’ (aged 16–18) conceptualizations about chemical reactions in solution, Sci. Educ., 85(5), 568–585.
  9. Brown A., Bransford J. and Cocking R., (2000), How people learn: brain, mind, experience and school, Washington, DC: National Academy Press.
  10. Chang H. Y., Quintana C. and Krajcik J. S., (2010), The impact of designing and evaluating molecular animations on how well middle school students understand the particulate nature of matter, Sci. Educ., 94(1), 73–94.
  11. Cheng M. and Brown D. E., (2015), The role of scientific modeling criteria in advancing students' explanatory ideas of magnetism, J. Res. Sci. Teach., 52(8), 1053–1081.
  12. Claesgens J., Scalise K., Wilson M. and Stacy A., (2009), Mapping student understanding in chemistry: the perspectives of chemists, Sci. Educ., 93(1), 56–85.
  13. Delandshere G. and Petrosky A. R., (1998), Assessment of complex performances: limitations of key measurement assumptions, Educ. Res., 27(2), 14–24.
  14. DiSessa A. A., (1993), Toward an epistemology of physics, Cogn. Instruct., 10(3), 105–225.
  15. DiSessa A. A., (2014), The construction of causal schemes: learning mechanisms at the knowledge level, Cogn. Sci., 38(5), 795–850.
  16. Driver R. and Millar R. (ed.), (1986), Energy matters: Proceedings of an invited conference: Teaching about energy within the secondary science curriculum, University of Leeds: Centre for Studies in Science and Mathematics Education.
  17. Duschl R. A. and Gitomer D. H., (1991), Epistemological perspectives on conceptual change: implications for educational practice, J. Res. Sci. Teach., 28(9), 839–858.
  18. Erman E., (2017), Factors contributing to students’ misconceptions in learning covalent bonds, J. Res. Sci. Teach., 54(4), 520–537.
  19. Fang Z., (2006), The language demands of science reading in middle school, Int. J. Sci. Educ., 28(5), 491–520.
  20. Fillmore L. W. and Fillmore C. J., (2012), What does text complexity mean for English learners and language minority students? in Understanding language: language, literacy, and learning in the content areas, Hakuta K. and Santos M. (ed.), Stanford: Stanford University, pp. 64–74.
  21. Garnett P. J., Garnett P. J. and Hackling M. W., (1995), Students' alternative conceptions in chemistry: a review of research and implications for teaching and learning, Stud. Sci. Educ., 25, 69–96.
  22. González-Howard M. and McNeill K. L., (2016), Learning in a community of practice: factors impacting English-learning students' engagement in scientific argumentation, J. Res. Sci. Teach., 53(4), 527–553.
  23. Gotwals A. W. and Songer N. B., (2010), Reasoning up and down a food chain: using an assessment framework to investigate students' middle knowledge, Sci. Educ., 94(2), 259–281.
  24. Hadenfeldt J. C., Neumann K., Bernholt S., Liu X. and Parchmann I., (2016), Students’ progression in understanding the matter concept, J. Res. Sci. Teach., 53(5), 683–708.
  25. Hakuta K., (2014), Assessment of content and language in light of the new standards: challenges and opportunities for English language learners, J. Negro Educ., 83(4), 433–441.
  26. Hakuta K., Butler G. and Witt D., (2000), How long does it take learners to attain English proficiency?, Santa Barbara, CA: University of California Linguistic Minority Research Institute.
  27. Harlow A. and Jones A., (2004), Why students answer TIMSS science test items the way they do, Res. Sci. Educ., 34(2), 221–238.
  28. Harris C. J., Penuel W. R., D'Angelo C. M., DeBarger A. H., Gallagher L. P., Kennedy C. A. and Krajcik J. S., (2015), Impact of project-based curriculum materials on student learning in science: results of a randomized controlled trial, J. Res. Sci. Teach., 52(10), 1362–1385.
  29. Hofstetter C. H., (2003), Contextual and mathematics accommodation test effects for English-language learners, Appl. Meas. Educ., 16(2), 159–188.
  30. Johnson E. and Monroe B., (2004), Simplified language as an accommodation on math tests, Assess. Eff. Interv., 29(3), 35–45.
  31. Kachchaf R., Noble T., Rosebery A., O’Connor C., Warren B. and Wang Y., (2016), A closer look at linguistic complexity: pinpointing individual linguistic features of science multiple-choice items associated with English language learner performance, Biling. Res. J., 39(2), 152–166.
  32. Kieffer M., Lesaux N., Rivera M. and Francis D., (2009), Accommodations for English language learners taking large-scale assessments: a meta-analysis on effectiveness and validity, Rev. Educ. Res., 79(3), 1168–1201.
  33. Kopriva R. and Sexton U. M., (1999), Guide to scoring LEP student responses to open-ended science items, SCASS LEP Consortium Project.
  34. Krajcik J., Codere S., Dahsah C., Bayer R. and Mun K., (2014), Planning instruction to meet the intent of the Next Generation Science Standards, J. Sci. Teach. Educ., 25(2), 157–175.
  35. Lee O., Eichinger D. C., Anderson C. W., Berkheimer G. D. and Blakeslee T. D., (1993), Changing middle school students' conceptions of matter and molecules, J. Res. Sci. Teach., 30(3), 249–270.
  36. Lee O. and Fradd S. H., (1998), Science for all, including students from non-English-language backgrounds, Educ. Res., 27(4), 12–21.
  37. Lee H. and Liu O. L., (2010), Assessing learning progression of energy concepts across middle school grades: the knowledge integration perspective, Sci. Educ., 94(4), 665–688.
  38. Lee O., Quinn H. and Valdés G., (2013), Science and language for English language learners in relation to Next Generation Science Standards and with implications for Common Core State Standards for English language arts and mathematics, Educ. Res., 42(4), 223–233.
  39. Lemke J. L., (1990), Talking science: language, learning, and values, Norwood, NJ: Ablex Publishing Corporation.
  40. Linn M. C. and Eylon B. S., (2011), Science learning and instruction: taking advantage of technology to promote knowledge integration, New York, NY: Routledge.
  41. Linn R. L., Baker E. L. and Dunbar S. B., (1991), Complex, performance-based assessment: expectations and validation criteria, Educ. Res., 20(8), 15–21.
  42. Liu O. L., Lee H.-S., Hofstetter C. and Linn M. C., (2008), Assessing knowledge integration in science: construct, measures, and evidence, Educ. Assess., 13(1), 33–55.
  43. Liu O. L., Lee H. S. and Linn M. C., (2011), An investigation of explanation multiple-choice items in science assessment, Educ. Assess., 16(3), 164–184.
  44. Liu X. and Lesniak K., (2006), Progression in children's understanding of the matter concept from elementary to high school, J. Res. Sci. Teach., 43(3), 320–347.
  45. Lyon E. G., Bunch G. C. and Shaw J. M., (2012), Navigating the language demands of an inquiry-based science performance assessment: classroom challenges and opportunities for English learners, Sci. Educ., 96(4), 631–651.
  46. Martiniello M., (2009), Linguistic complexity, schematic representations, and differential item functioning for English language learners in math tests, Educ. Assess., 14(3–4), 160–179.
  47. McNeill K. L. and Krajcik J., (2007), Middle school students’ use of appropriate and inappropriate evidence in writing scientific explanations, in Thinking with Data, Lovett M. and Shah P. (ed.), New York, NY: Taylor & Francis Group, LLC, pp. 233–265.
  48. Mislevy R. J. and Haertel G. D., (2006), Implications of evidence-centered design for educational testing, Educ. Meas.: Issues Pract., 25(4), 6–20.
  49. Nakhleh M. B., (1992), Why some students don't learn chemistry: chemical misconceptions, J. Chem. Educ., 69(3), 191.
  50. Nakhleh M. B., Samarapungavan A. and Saglam Y., (2005), Middle school students' beliefs about matter, J. Res. Sci. Teach., 42(5), 581–612.
  51. NGSS Lead States, (2013), Next Generation Science Standards: for states, by states, Washington, DC: The National Academies Press.
  52. Noble T., Suarez C., Rosebery A., O'Connor M. C., Warren B. and Hudicourt-Barnes J., (2012), “I never thought of it as freezing”: how students answer questions on large-scale science tests and what they know about science, J. Res. Sci. Teach., 49(6), 778–803.
  53. North Carolina Department of Public Instruction, (2012), North Carolina Essential Science Standards, Raleigh, NC: North Carolina State Board of Education.
  54. Nyachwaya J. M., Mohamed A. R., Roehrig G. H., Wood N. B., Kern A. L. and Schneider J. L., (2011), The development of an open-ended drawing tool: an alternative diagnostic tool for assessing students' understanding of the particulate nature of matter, Chem. Educ. Res. Pract., 12(2), 121–132.
  55. O'Byrne B., (2009), Knowing more than words can say: using multimodal assessment tools to excavate and construct knowledge about wolves, Int. J. Sci. Educ., 31(4), 523–539.
  56. Pellegrino J. W., Wilson M. R., Koenig J. A. and Beatty A. S. (ed.), (2014), Developing assessments for the Next Generation Science Standards, Washington, DC: National Academies Press.
  57. Prain V. and Waldrip B., (2006), An exploratory study of teachers' and students' use of multi-modal representations of concepts in primary science, Int. J. Sci. Educ., 28(15), 1843–1866.
  58. Ruiz-Primo M. A. and Furtak E. M., (2007), Exploring teachers' informal formative assessment practices and students' understanding in the context of scientific inquiry, J. Res. Sci. Teach., 44(1), 57–84.
  59. Ruiz-Primo M. A., Li M., Tsai S. and Schneider J., (2010), Testing one premise of scientific inquiry in science classrooms: examining students' scientific explanations and student learning, J. Res. Sci. Teach., 47(5), 583–608.
  60. Ryoo K. and Bedell K., (2017), The effects of visualizations on linguistically diverse students’ understanding of energy and matter in life science, J. Res. Sci. Teach., DOI: 10.1002/tea.21405.
  61. Ryoo K. and Linn M., (2015), Designing and validating assessments of complex thinking in science, Theory Pract., 54(3), 238–254.
  62. Sampson V., Grooms J. and Walker J. P., (2011), Argument-Driven Inquiry as a way to help students learn how to participate in scientific argumentation and craft written arguments: an exploratory study, Sci. Educ., 95(2), 217–257.
  63. Sandoval W. A. and Reiser B. J., (2004), Explanation-driven inquiry: integrating conceptual and epistemic scaffolds for scientific inquiry, Sci. Educ., 88(3), 345–372.
  64. Scalise K. and Gifford B., (2006), Computer-based assessment in e-learning: a framework for constructing “intermediate constraint” questions and tasks for technology platforms, J. Technol., Learn. Assess., 4(6), 3–44.
  65. Scarcella R., (2003), Academic English: a conceptual framework, Technical Report 2003-1, University of California Linguistic Minority Research Institute.
  66. Schleppegrell M., (2005), Technical writing in a second language: the role of grammar and metaphors, in Analysing academic writing: contextualized frameworks, Ravelli L. and Ellis R. (ed.), London: Continuum, pp. 172–189.
  67. Sevian H. and Talanquer V., (2014), Rethinking chemistry: a learning progression on chemical thinking, Chem. Educ. Res. Pract., 15(1), 10–23.
  68. Shaftel J., Belton-Kocher E., Glasnapp D. and Poggio J., (2006), The impact of language characteristics in mathematics test items on the performance of English language learners and students with disabilities, Educ. Assess., 11(2), 105–126.
  69. Smith III J. P., DiSessa A. A. and Roschelle J., (1994), Misconceptions reconceived: a constructivist analysis of knowledge in transition, J. Learn. Sci., 3(2), 115–163.
  70. Snow C. E., (2010), Academic language and the challenge of reading for learning about science, Science, 328, 450–452.
  71. Solano-Flores G. and Trumbull E., (2003), Examining language in context: the need for new research and practice paradigms in the testing of English-language learners, Educ. Res., 32(2), 3–13.
  72. Solano-Flores G. and Wang C., (2015), Complexity of illustrations in PISA 2009 science items and its relationship to the performance of students from Shanghai-China, the United States, and Mexico, Teach. Coll. Rec., 117(1), 1–18.
  73. Solano-Flores G., Wang C., Kachchaf R., Soltero-Gonzalez L. and Nguyen-Le K., (2014), Developing testing accommodations for English language learners: illustrations as visual supports for item accessibility, Educ. Assess., 19(4), 267–283.
  74. Taber K. S., (2000), Multiple frameworks? Evidence of manifold conceptions in individual cognitive structure, Int. J. Sci. Educ., 22(4), 399–417.
  75. Taber K. S. and García-Franco A., (2010), Learning processes in chemistry: drawing upon cognitive resources to learn about the particulate structure of matter, J. Learn. Sci., 19(1), 99–142.
  76. Turkan S. and Liu O. L., (2012), Differential performance by English language learners on an inquiry-based science assessment, Int. J. Sci. Educ., 34(15), 2343–2369.
  77. Vosniadou S., (1994), Capturing and modeling the process of conceptual change, Learn. Instr., 4(1), 45–69.
  78. Windschitl M., Thompson J. and Braaten M., (2008), Beyond the scientific method: model-based inquiry as a new paradigm of preference for school science investigations, Sci. Educ., 92(5), 941–967.

This journal is © The Royal Society of Chemistry 2018