Explicit versus implicit similarity – exploring relational conceptual understanding in organic chemistry

Nicole Graulich*, Sebastian Hedtrich and René Harzenetter
Justus-Liebig University Giessen, Institute of Chemistry Education, Heinrich-Buff Ring 17, 35392 Giessen, Germany. E-mail: Nicole.Graulich@didaktik.chemie.uni-giessen.de

Received 23rd February 2019, Accepted 20th August 2019

First published on 20th August 2019

Learning to interpret organic structures not as an arrangement of lines and letters but, rather, as a representation of chemical entities is a challenge in organic chemistry. To successfully deal with the variety of molecules or mechanistic representations, a learner needs to understand how a representation depicts domain-specific information. Various studies that focused on representational competence have already investigated how learners relate a representation to its corresponding concept. However, aside from a basic connectional representational understanding, the ability to infer a comparable reactivity from multiple different functional groups in large molecules is important for undergraduate students in organic chemistry. In this quantitative study, we aimed at exploring how to assess undergraduate students’ ability to distinguish between conceptually relevant similarities and distracting surface similarities among representations. The instrument consisted of multiple-choice items in four concept categories that are generally used to estimate the reactivity in substitution reactions. This exploratory study shows that the item design for assessing students’ conceptual understanding influences students’ answering patterns. Insights and pitfalls gained from this investigation and future directions for research and teaching are provided.


Establishing structure–property relationships lies at the heart of chemistry and its scientific practices (DeFever et al., 2015; Talanquer, 2018). While learning chemistry, learners are confronted with a large variety of structural representations and are challenged to “associate” chemical and physical properties with these external representations. The difficulty in determining structure–property relationships arises from the twofold nature of a chemical entity – the structural representation as the explicit part and its respective implicit, conceptual counterpart (Hoffmann and Laszlo, 1991). Without a cognitive linkage between the chemical information that a structural representation conveys and its surface structure, a learner's perception and interpretation of this information may remain superficial (Ainsworth, 2006). Studies that analyzed students’ and experts’ use of representations and their perceptions of similarity observed that students focused on single surface features or symbolic patterns rather than considering functional similarities; this finding was consistent irrespective of whether the students were looking at single molecules or complete reactions (Domin et al., 2008; Graulich and Bhattacharyya, 2017; Galloway et al., 2018). One can assume that learners tend to perceive chemical similarity as a visual construct: if structure A looks similar to structure B on the surface level, then B is assumed to react similarly to A. This may be a useful heuristic in some cases, but such shortcut strategies lead students to overlook “emergent properties”, i.e., changes in the typical properties of functional groups due to a different structural environment (Talanquer, 2008; DeFever et al., 2015).

How students establish relationships between the explicit and implicit information of entities has been the focus of a large body of research in chemistry education (cf. Graulich, 2015). Cooper and colleagues have intensively explored the difficulty students have in extracting chemical or physical properties, such as polarity or boiling points, from Lewis structures. They showed that inferring properties from the molecular shape is actually a complex task for students (Cooper et al., 2010, 2012, 2013). It requires students to make multiple inferences simultaneously, e.g., they need to be able to determine lone pairs and their electron density to deduce a molecular geometry from the electronic structure. DeFever et al. (2015) added to this research by demonstrating that senior chemistry students had more difficulty generating a Lewis structure of a molecule bearing a certain property than identifying the properties of a given molecule. The senior chemistry students in this study searched their mental library of known structures to find the molecule that fits certain constraints instead of creating or modifying a molecular structure (DeFever et al., 2015). This finding suggests that inferring properties from a representation may be unidirectional for students. They seem to link properties to visual representational features, e.g., “a negative charge indicates a nucleophile”, or certain reaction contexts, rather than conceptualizing them independently of the structural context, i.e., making the properties transferable to other representations. This may apply to organic reactions as well. Anzovino and Bretz (2015, 2016) documented that students often rely only on explicit representational features, such as charges or drawn-out lone pairs, to identify a nucleophile, even though they are able to recall the definition of a nucleophile.
Consequently, this reliance on structural features results in incorrectly estimating whether a reaction involves nucleophiles and electrophiles. Students seem to be unable to recognize how the context of a chemical reaction can change the reactivity of a substance. Estimating whether an implicit property, such as the basic or nucleophilic character of an entity, predominates in a reaction context requires balancing both electronic and steric effects (de Arellano and Towns, 2014). Given these findings, a student's ability to identify a chemical reactivity in a reaction context or to judge the relative strength of an implicit property strongly depends on identifying and using implicit properties in their reasoning (DeFever et al., 2015; Flynn and Featherstone, 2017; Weinrich and Sevian, 2017; Caspari et al., 2018b). Even in unfamiliar contexts, students were more successful when they employed implicit properties in their reasoning (Weinrich and Sevian, 2017; Caspari et al., 2018a).

As the aforementioned discussion indicates, students’ successes, when advancing in their studies, may depend on the following abilities: (1) making appropriate links between the chemical entity and the representation, e.g., deriving properties from structural representations; (2) identifying multiple implicit properties of an entity; (3) determining which implicit property might be relevant in a given problem context. In light of this and knowing that students often rely solely on visual features of a representation, an appropriate diagnosis of students’ conceptual understanding may only be possible if an item hinders the reliance on similar, but irrelevant visual structural features and requires students to focus solely on an implicit property of a chemical entity. We herein illustrate our attempt to assess students’ abilities to discriminate between explicit and implicit properties in the realm of substitution reactions.

Theoretical background

Using multiple representations is a common practice in chemistry education (Kozma et al., 2000). Supporting learners in acquiring a solid conceptual understanding requires us to understand how students perceive and deal with representations. There has been a growing interest in students’ representational abilities in science education research, particularly in regard to students managing multiple external representations (Ainsworth, 1999, 2006; Corradi et al., 2013), learning with picture-text combinations (Schnotz and Bannert, 2003), moving along the chemistry triplet and its representations (Treagust et al., 2003; Gkitzia et al., 2011; Stieff et al., 2013), or mentally rotating organic molecules (Stieff and Raje, 2010). The first competency that learners acquire when dealing with representations is visual understanding: students must develop an understanding of the format of the representation (e.g., what does a line mean in chemistry?) and of how to map this representation onto knowledge about the depicted referent or the domain-specific concept (Wu and Rau, 2018). This acquisition of visual understanding, followed by the development of conceptual understanding, is influenced by how a learner perceives the external features of representations. Research has documented that external features of organic structures may not be equally salient for students (Domin et al., 2008; Graulich and Bhattacharyya, 2017). Highly salient features for students, measured independently of a reaction context, are functional groups, distinct letters, or bonds between carbon atoms and heteroatoms (Mason et al., 2016).

In addition, learners need to work with multiple representations at a time, a situation in which “connectional understanding” plays a role (Wu and Rau, 2018). This applies when judging whether a ball-and-stick model or an electron density map of water displays the same molecular referent. In this case, the type of representation changes, but the referent stays the same. Having a connectional understanding also becomes relevant when comparing representations of different molecules, e.g., considering an alcohol (C4H9OH) or an ether (C4H10O), and comparing the properties of these molecules. Although both molecules have the same number of atoms, their properties, such as the connectivity of their atoms or their boiling points, are different. This connectional understanding requires a learner to differentiate between explicit and implicit similarities of representations. However, when solving a problem in upper-level organic chemistry classes, visual and connectional understandings do not sufficiently cover the competencies students need when dealing with large molecules and multiple functional groups.

Expanding the representational competencies

Every reasoning process with representations in organic chemistry starts with “decoding” a molecular representation and inferring the respective properties of the depicted molecule, i.e., determining what is displayed and what is known in terms of properties. Each chemical entity can thus be characterized by an explicit part and an implicit part. The explicit part of an entity is represented by its Lewis or line structure, which models the explicit properties of the entity, e.g., connectivity of the elements, charges, bonding order, or geometric information. The bulkiness or steric crowdedness of a structure can be considered an explicit property that can be determined by looking at the representation. Decoding the latter property from an entity can be supported by adding dashed-wedged bonds to indicate the three-dimensional nature or by drawing out C–H bonds.

The implicit properties of an entity, e.g., partial charges or the size of an atom, are generally not explicitly represented symbolically and need to be inferred from the representation. Some implicit properties, such as the polarity of a bond or a partial charge, can be expressed by adding their corresponding symbols to the representation. Implicit properties comprise properties that explicate the electronic structure, i.e., polarizability, polarity of a bond, or partial charges, as well as other empirical properties. These empirical properties are, for example, the pKa or the electronegativity of an atom, which are only indirectly linked to the electronic structure and are often expressed by relative numbers. Some explicit features are relevant when estimating the reactivity of structures. Recognizing the explicit properties that differ between a tertiary carbon atom and a primary one is relevant to estimating possible hyperconjugative effects (an implicit property).

Successfully describing an entity and its implicit properties depends on the following considerations: (1) which implicit property should be inferred from the representation and (2) how an implicit property changes depending on the structural context. The first case refers to the “reliability of the match”, i.e., the match between the explicit property and the implicit property. Carbon–heteroatom bonds, as an explicit property, are, in most cases, polar bonds. The second case refers to emergent properties of entities, which result from a change in the structural environment, e.g., when groups adjacent to a functional group change the relevant implicit properties. An example of this is that an α,β-unsaturated carbonyl reacts differently than a single alkene or a carbonyl group. Students tend to disregard that the implicit properties of an α,β-unsaturated carbonyl are different, i.e., that the implicit properties of a carbonyl and an alkene are not simply preserved in the α,β-unsaturated carbonyl (DeFever et al., 2015). Identifying relevant implicit properties is thus contextualized (Goldstone et al., 1997), and in the case of emergent properties, it may be misleading to infer implicit properties separately for each constituent functional group (e.g., the implicit properties of the alkene and the carbonyl in the example above).

When solving a mechanistic problem in upper-level organic chemistry classes, the number of functional groups and implicit properties increases when dealing with large molecules. This additionally requires the learner to weigh the relative strength of an implicit property to determine the most reactive functional group. Depending on the context of the problem, one implicit property may be more relevant, or it may predominate over other properties, e.g., a hydroxide may react as a base or as a nucleophile. The important connection between structure and context has also been emphasized by Anzovino and Bretz (2015). They concluded that it is “crucial that students be able to use both inherent characteristics (structure) as well as contextual clues to suggest function (What else is reacting?)” (Anzovino and Bretz, 2015, p. 809). If asked to estimate the trend for the acidity of common functional groups, a learner has to derive the implicit properties from each functional group and then compare them to determine which is the most acidic (cf. Fig. 1).

Fig. 1 Explicit and implicit properties and influences of the context (green shaded boxes show the correct solution).

When the context of the question changes, the implicit properties that govern the reactivity need to be evaluated and weighed again to solve the new problem. If an explicit feature of a representation attracts attention but is irrelevant to the problem at hand, it can be distracting for the learner, leaving less capacity to attend to other, presumably important features (Elby, 2000; Heckler, 2011). Intuitively, students attend to salient features and perceive them as likely indicators of a correct answer, because they are easily processed (McClary and Talanquer, 2011; Talanquer, 2014). Research in physics education has, however, documented that explicit distracting features of representations may negatively interfere with the relevant content knowledge (Scaife and Heckler, 2010; Heckler, 2011).

The ability “to disregard the superficial contained in pictorial and diagrammatic representations and extract information they deem relevant to the task at hand” has been described as an important aspect for expertise in organic chemistry (DeFever et al., 2015, p. 416). These considerations add a third aspect to the previously described representational competencies (Ainsworth, 2006; Rau, 2017), which we refer to as relational conceptual understanding, i.e., the ability to identify and weigh contextually relevant properties of different representations.

Learners’ relational conceptual understanding may be hidden when students are assessed on similar-looking representations. Case 1 in Fig. 2, for example, can be answered correctly when one of the following occurs: (1) one recognizes the most plausible nucleophiles in the reaction context, or (2) one simply chooses the molecules by their similar explicit properties, e.g., negatively charged oxygens. When two structures that react similarly do not share explicit properties, as in case 2 (Fig. 2), estimating a similar nucleophilicity is much more complex. To estimate the nucleophilicity of a molecule, the learner needs to see beyond the representation and recall implicit properties (cf. case 2, Fig. 2) to make a correct judgement; it then becomes apparent that a negative charge does not always imply a good nucleophile.

Fig. 2 Illustration of two items with differing explicit properties (green shaded boxes show the correct solution).

Research questions

The nature of our conceptual understanding guides whether a perceived similarity between molecules is based on explicit, and often irrelevant, features or on implicit features of representations (Goldstone and Son, 2012); thus, it influences how a learner conceptually and contextually decodes representations (i.e., inferring implicit properties and comparing the relative strength of these properties in the given context). In this quantitative study, we wanted to explore how we can assess a student's relational conceptual understanding in the context of substitution reactions with the following relevant concept categories: (1) leaving group ability, (2) hyperconjugative effects, (3) nucleophilicity, and (4) solvent effects. For this objective, we designed items that included explicitly distracting representational features and explored the answers to the following research questions:

• How do students’ answering patterns differ across the four concept categories when judging a comparable reactivity in items that hinder reliance on surface-feature similarity?

• How does a student's level of elaboration, when asked to provide a reason for their choice, relate to their performance?


Setting and participants

We conducted the study with students enrolled in a first-year organic chemistry class for chemistry majors and chemistry teaching majors at two German universities. The survey was administered to a population of N = 156 students at the end of the second semester. All students who volunteered for this study were provided with information about their rights and the handling of the data; informed consent was obtained from all participants. IRB (Institutional Review Board) approval is not required at German universities, but the recruitment process followed ethical guidelines and ensured that students could opt out at any time during data collection.

Although the participants attended two different universities, there were no significant differences in terms of age, gender, or performance between the populations (determined with an anonymous demographic survey, administered separately from the main survey) (Table 1). Both organic chemistry classes had comparable course content in terms of topics, the chronology of the topics, and the depth of the discussed content. Both lectures were videotaped and compared to ensure that the content of the courses was comparable in terms of depth of discussion. The respective content on substitution reactions (SN1 and SN2) was covered in both courses by highlighting typical influences separately, i.e., leaving group ability, solvent effects, nucleophilicity, and the influence of hyperconjugative effects. Neither course trained students to estimate multiple effects simultaneously or to estimate reaction pathways by comparing steric effects.

Table 1 Statistics for the sample population (m = male; f = female)

Population                         Gender                    Average age   Teaching majors   Chemistry majors
Justus-Liebig-University Gießen    46 m (54%), 39 f (46%)    20.51         15                70
Philipps-University Marburg        46 m (65%), 25 f (35%)    20.72         14                57
Total (N = 156)                    92 m (59%), 64 f (41%)    20.61         39                127


The construct that we wanted to explore is a student's ability to recognize a comparable reactivity despite distracting explicit features; along the same line, we were also interested in whether the quality of a student's elaboration of their choice is an indicator of their performance. Therefore, we used a two-part paper–pencil test. Part 1 consisted of multiple-choice tasks that required students to select two molecules out of three that would have a comparable reactivity in the given reaction context. Here, the students only had to mark their answer. Part 2 of the instrument was administered after part 1 and also included items of each concept category; this time, however, each item was accompanied by an open-ended elaboration prompt, such as “Explain in detail why you made this choice”. This open-ended elaboration prompt was used to qualitatively code students’ rationales afterwards. Our leading hypothesis was that if a student is able to abstract from potentially irrelevant and misleading explicit properties when evaluating the given structures in the context of the problem, then we can consider this as evidence of their relational conceptual understanding.

Content area

We focused on typical substitution reactions, which are taught in these introductory organic chemistry classes, and the four concepts that are important for deciding between SN1 and SN2 reactions: (1) leaving group ability, (2) hyperconjugative effects, (3) nucleophilicity, and (4) solvent effects. The implicit properties that underlie these concept categories are not equally difficult to recognize. Reasoning about some implicit properties might even be possible by focusing on explicit features of the representation, especially when estimating hyperconjugative effects or solvent effects. In these cases, an increase in methyl substituents at the reaction center implies a higher hyperconjugative effect as well as an increase in steric hindrance. Students may thus answer these items not by reasoning about hyperconjugative effects in the transition state, but by referring to steric hindrance in the initial reaction process. The two introductory courses emphasized hyperconjugative effects as the cause of the higher thermodynamic stability of tertiary versus primary carbocations and less often highlighted the steric hindrance during nucleophilic attack. Thus, we use the term hyperconjugative effects to cover substrate effects in an SN2 reaction. Comparing a substitution reaction at a tertiary, secondary, or primary haloalkane would give the same answer whether a student focuses on hyperconjugative effects or on steric effects. Both are reasonable approaches, although steric effects dominate in an SN2 reaction. Students may, however, not use any reasoning beyond the heuristic of counting substituents and may make a correct statement without referring to any possible rationale.

Estimating whether a solvent is protic, which is an implicit property, can be based on recognizing the OH group of a structure. It is thus easier for the students because the implicit property is closely connected to the explicit property.

The leaving group ability or the nucleophilicity of a compound depends on multiple implicit properties that are not directly connected to an explicit property of the structure, which causes difficulties in students’ reasoning (Popova and Bretz, 2018). Estimating the leaving group ability usually requires consideration of anion stability, which is assessed by considering the pKa of the anion's conjugate acid: a low pKa of the conjugate acid is associated with a better leaving group ability. Polarizability plays a role as well, as the leaving group needs to be polarizable in order to lower the energy of the transition state in an SN2 reaction, which makes fluoride a poor leaving group. The most difficult of these concepts is nucleophilicity, as this concept is characterized by multiple implicit properties, such as electronegativity, polarizability, and aspects of sterics, shape, and basicity. Recognizing a good nucleophile thus involves more than just recognizing a negative charge, i.e., an explicit property. Anzovino and Bretz (2015) showed that students generally are able to define a nucleophile, but struggle to identify nucleophiles.
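The pKa heuristic for leaving group ability described above can be illustrated in code. The following snippet is an illustrative sketch (not part of the study instrument); the pKa values are approximate textbook figures for the conjugate acids.

```python
# Approximate pKa values of the conjugate acids (textbook ballpark figures):
# a lower pKa means a weaker conjugate base, a more stable anion, and thus
# a better leaving group.
PKA_OF_CONJUGATE_ACID = {
    "I-": -10.0,   # HI
    "Br-": -9.0,   # HBr
    "Cl-": -7.0,   # HCl
    "F-": 3.2,     # HF
    "HO-": 15.7,   # H2O
}

def rank_leaving_groups(groups):
    """Rank candidate leaving groups from best to worst by the pKa
    of their conjugate acids (ascending pKa = descending ability)."""
    return sorted(groups, key=lambda g: PKA_OF_CONJUGATE_ACID[g])
```

Note that the pKa heuristic alone does not capture polarizability; as stated above, fluoride is additionally a poor leaving group in an SN2 reaction because it is barely polarizable.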

Item design

To elicit students’ relational conceptual understanding, we created different items for each of the four concept categories. The items were designed as triplet case comparisons, in which the students were asked to select two out of three displayed molecules that would react similarly in the given context of the problem. In the case of the leaving group ability, the students had to choose the two most plausible leaving groups; in items considering solvent effects, the students had to choose the two solvents that would allow the same reaction type (e.g., an SN1 reaction). When comparing multiple representations, some incidentally similar explicit properties are shared between representations, for example, two molecules sharing the same length of a side chain or carrying additional non-reacting substituents, such as methyl groups. These explicit features may be distracting for learners who rely on surface-feature similarity. Other explicit properties can be relevant, namely when the implicit property derived from them allows a sound claim about the reactivity in the context of the problem. To create items that are highly distracting with regard to their irrelevant explicit features, we imposed the following two constraints: (1) each molecule had to be chemically plausible and react comparably to one of the other molecules given in the task, and (2) molecules that react similarly in an item had to differ in at least one distracting explicit property, e.g., the length of the alkyl chain, ring structures, or heteroatoms (cf. Table 2). Just focusing on the similarity of these distracting explicit properties would lead to a wrong answer. Given these constraints, not all variations could be used while maintaining chemical feasibility and a reasonable level of difficulty. During item design, we used our research group meetings (graduate and undergraduate research assistants) to collect various explicit properties that the representations could share.
In a second round after the item design, the authors independently coded whether an explicit feature was distracting (i.e., shared by two molecules and leading to an incorrect answer). This helped us to determine the number of distracting explicit properties that an item contained and to ensure that every item had one or more of these distracting explicit properties, as shown in Table 2. The interrater reliability in terms of Cohen's κ for this round of coding was approximately 0.899 over all of the codings.
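Cohen's κ, the agreement measure used for this coding round, can be computed from two raters' code lists as sketched below. This is a minimal illustration with hypothetical labels, not the actual analysis script used in the study.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same items:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[lab] * freq_b[lab] for lab in labels) / n**2
    return (observed - expected) / (1 - expected)
```

For example, with hypothetical codes "distracting"/"not distracting" over four features where the raters disagree on one, `cohens_kappa` corrects the raw 75% agreement for chance.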
Table 2 Number of distracting explicit properties used in each concept category
Distracting explicit properties Concept categoriesa
Leaving group Hyperconjugative effects Nucleophilicity Solvent effects
a Numbers refer to the number of items that showed this distracting explicit property.
Cyclic rings 14 21 8 9
C–O bonds 11 15 3
Methyl groups 2 11 2
2° carbons 14 4
Negative charges 5 5
Aromatic rings 7 1
Charged oxygens 15
X=O bonds 5
Multiple bonds 3
1° carbons 2
Ester bonds 1

Table 3 Numbers of items in each concept category

                      Hyperconjugative effects   Leaving group   Nucleophilicity   Solvent effects   Total
Items per category    30                         13              15                10                68
Number of subscales   6                          2               3                 2                 13

Depending on the concept category, multiple distracting explicit properties could be included in an item. Table 2, for instance, shows that the explicit property “methyl group” was a distracting explicit property in two items of the leaving group scale, i.e., two out of three structures in an item shared the same number of methyl groups.

Picking these two structures because of their explicit similarity, however, would lead to a wrong answer. In the concept category nucleophilicity, a negative charge was a distracting feature in 5 items. The total count of distracting explicit properties is higher than the number of items in each concept category (cf. Table 2), as some items contained multiple distracting features. If a student focused only on distracting explicit properties while answering the items, he or she would not be successful. This does not mean that those properties are always misleading.

In the cases of hyperconjugative effects or solvent effects, the distraction is often limited to changes in the carbon backbone of the structures (e.g., two linear structures and a cyclic one, cf. Table 4, hyperconjugative effects, subscale 2). For instance, in one of the items in the category solvent effects shown in Table 4, the given solvents share the distracting explicit properties “C=O bond” and “methyl group”. The distracting explicit properties in the categories leaving group or nucleophilicity included the same heteroatoms or two negative charges. In a nucleophilicity item, for instance, two nucleophiles could share a negatively charged oxygen, whereas the third nucleophile does not (cf. Table 4, nucleophilicity, subscale 3). Thus, we created different subscales in each concept category, as shown with two exemplary items for each concept category in Table 4. We additionally discussed the set of items with the two professors of the classes to check face validity and content-related validity. They ranked our items based on what they emphasized in class, and only those items with mutual agreement were administered in the survey. The initial set of items was piloted in a student teachers’ course (N = 25), with additional interviews, to estimate the required time and check the wording of the items. We changed the wording of the prompts and deleted some of the items that were unintelligible for the students. The final set comprised 68 items in total (cf. Table 3).

Table 4 Example items for the four concept categories and subscales (green shaded boxes show the correct solution)

Data collection

The data collection was carried out at the end of the semester, one week before the final exam of the class. The surveys were administered during the same week in both cohorts, and the two parts were handed out consecutively. First, the students received the first booklet (part 1) with the multiple-choice items. After 30 min, they received the second booklet (part 2). This time, each item was followed by an elaboration prompt asking the students to explain their choice. To avoid test effects, we distributed the items in an incomplete balanced block design (cf. Frey et al., 2009) with the help of our own software program. This allowed us to distribute the items equally across the booklets without overwhelming the students with the total set of items. The software randomly selected different items from each subscale so that each concept category was represented with at least two different items. Additionally, the different items were arranged randomly. Hence, undesirable effects such as fatigue or loss of interest could be minimized. The final booklets of part 1 and part 2 contained 28 items each.
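The booklet-assembly idea can be sketched as follows. This is a simplified, hypothetical illustration: it draws at least one item per subscale, fills the booklet to a target size, and shuffles the order. The actual software additionally balanced item exposure across booklets according to the incomplete balanced block design.

```python
import random

def assemble_booklet(items_by_subscale, booklet_size, rng=None):
    """Draw one item from every subscale, fill up to booklet_size with
    further randomly chosen items, and shuffle the final order."""
    rng = rng or random.Random()
    # Guarantee that every subscale (and thus concept category) is represented.
    booklet = [rng.choice(items) for items in items_by_subscale.values()]
    # Fill the remaining slots from the pool of unused items.
    remaining = [item for items in items_by_subscale.values()
                 for item in items if item not in booklet]
    booklet += rng.sample(remaining, booklet_size - len(booklet))
    # Randomize the item order to counter position effects.
    rng.shuffle(booklet)
    return booklet
```

With, e.g., 13 subscales and a target size of 28, each booklet covers every subscale while exposing each student to only a fraction of the 68 items.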

Data analysis

Quantitative data analysis

We used a dichotomous scoring for the analysis of the items in both parts. A correct pair of molecules in an item was scored with one, and all other combinations were scored with zero. This simple scoring allows category scores to be calculated as the mean of all item scores in the respective category. These scores were used to draw comparisons between students’ performance in the concept categories in parts 1 and 2.
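The scoring scheme can be expressed in a few lines; the function names below are hypothetical, chosen only for illustration.

```python
def score_item(selected_pair, correct_pair):
    """Dichotomous scoring: 1 only if the student marked exactly the
    correct pair of molecules (order does not matter), otherwise 0."""
    return 1 if set(selected_pair) == set(correct_pair) else 0

def category_score(item_scores):
    """Category score = mean of the dichotomous item scores in that
    concept category."""
    return sum(item_scores) / len(item_scores)
```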

Qualitative data analysis

Rubric development. We developed a coding rubric for the analysis of students’ elaborations in part 2 of the instrument. The codes were refined iteratively through discussions between the authors (Saldana, 2016). After multiple rounds of discussion, we agreed on a 4-level rubric to code students’ elaborations (cf. Table 5). For the purpose of this study, we were not aiming to characterize how students reasoned; rather, we focused on content aspects, such as whether students mentioned an explicit or an implicit property or referred to the problem context. The 4-level rubric is characterized by an increasing reference to the reaction context, which varies from descriptive to functional, as well as by an increasing reference to properties, which varies from explicit to implicit. Students often express an implicit property of an entity through verbal shortcuts. These verbal shortcuts do not explicitly state the underlying implicit property; rather, they describe an activity that results from this property. When a student mentioned a “good leaving group”, we coded this as explicit-functional, because it refers to an activity that can be “observed” on the Lewis-structural level, and it refers to the reaction context. A statement such as this does not explicitly express the underlying implicit property, i.e., what characterizes a good leaving group. Although some of the students who gave this answer may have had a deeper understanding, we could only code what the students verbalized. All given responses in the elaboration were additionally coded by performance for the respective item. To determine interrater reliability for our rubric, two of our undergraduate research assistants independently coded a randomly chosen portion of the elaborations (20% of the elaborations). Interrater reliability was evaluated using Cohen's kappa. A value of κ ≈ 0.87 indicates that the raters agreed in nearly every case.
Table 5 Rubric for the qualitative coding of students’ elaborations in part 2
Level of elaboration Code description Student examples
E1 explicit-descriptive Student states explicit property of the molecules • “both molecules look similar”
• “both molecules have rings”
• “two secondary halogens”
E2 explicit-functional Student states the role of the entity in the problem context • “both are good leaving groups”
– Or states an explicit property and the problem context • “tertiary carbons undergo SN1 reactions”
E3 implicit-descriptive Student states an implicit property of the molecules • “both have a high electronegativity”
– Or adds an implicit property to an E2 elaboration • “both molecules have a high electronegativity and are good leaving groups”
E4 implicit-functional Student states an implicit property and refers to its role in the problem context • “both stabilize a resulting negative charge, when leaving the molecule, as the conjugate base is weak in both cases.”
• “both are polar protic solvents and favour SN1 by hydrating the carbocation”
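The interrater agreement reported above (κ ≈ 0.87) can be computed directly from two raters’ code assignments. The following stdlib-only sketch uses hypothetical codings; the code labels follow the E1–E4 rubric, but the data are ours, not the study’s:

```python
# Cohen's kappa for two raters coding the same set of elaborations.
# Hypothetical codings; labels follow the E1-E4 rubric of Table 5.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of identical codes.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if the raters coded independently.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (p_o - p_e) / (1 - p_e)

a = ["E1", "E2", "E1", "E3", "E4", "E1", "E2", "E3"]
b = ["E1", "E2", "E1", "E3", "E4", "E2", "E2", "E3"]
print(round(cohens_kappa(a, b), 2))  # 0.83 for this toy sample
```

Kappa corrects the raw agreement rate for the agreement expected by chance, which is why it is preferred over simple percent agreement for categorical codes.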

Results and discussion

Do students perform similarly in all concept categories?

The first objective in exploring students’ relational conceptual understanding while solving problems with distracting items was to determine whether there were differences in students’ performances across the concept categories and whether prompting students to provide a reason for their answer would correlate with their performance.

First, we calculated the mean of the item scores in each concept category and compared the performances in part 1 and 2 of the instrument to estimate the effect of prompting. Table 6 shows the descriptive statistics for each of the four concept categories of the instrument in part 1 and part 2 and includes the calculated p values and effect sizes, as measured by Cohen's d (Coolican, 2009), between the two parts of the instrument.

Table 6 Descriptive statistics for the concept categories (LG leaving group; HE hyperconjugative effects; NUC nucleophilicity; SE solvent effects; part 1 without elaboration prompt, part 2 with elaboration prompt)
Concept category Median Mean SD Skewness Kurtosis p Cohen's d
LG Part 1 0.34 0.41 0.23 0.71 0.32 0.015** 0.27
Part 2 0.50 0.48 0.32 0.19 −0.90
HE Part 1 0.50 0.50 0.24 −0.17 −0.84 0.186 0.16
Part 2 0.50 0.54 0.26 −0.24 −0.62
NUC Part 1 0.17 0.20 0.21 0.84 −0.16 0.001*** 0.39
Part 2 0.25 0.29 0.27 0.84 0.12
SE Part 1 0.75 0.62 0.32 −0.38 −0.89 0.017** 0.22
Part 2 0.75 0.69 0.32 −0.63 −0.75

The average score in each category revealed that students’ performances were unevenly spread over the concept categories. Students showed, on average, a higher performance in the concept categories solvent effects and hyperconjugative effects, whereas they usually earned lower scores in the concept category nucleophilicity. Students performed best in the concept category solvent effects, with a median of 0.75 (part 1), and worst in the category nucleophilicity, with a median of 0.17 (part 1).

We further constructed box-and-whisker plots (the ends of the whiskers represent the minimum and maximum of all responses, excluding outliers) for the four categories to illustrate the distribution of correct student responses in each of the four concept categories and in the two parts of the instrument, respectively. The distribution of the students’ answers in Fig. 3 may not be surprising, as recognizing differing degrees of substitution or bulkiness of molecules (in the category hyperconjugative effects), as well as two OH groups (in the category solvent effects), is more easily linked to explicit features of the representations. The high performance in these concept categories was also observed in part 2 of the instrument. The category nucleophilicity, however, seems to be the most difficult for students, and it showed a trend towards lower performances in part 1. Comparing the distribution of the answers in parts 1 and 2 showed that asking students to elaborate on their answer had a statistically significant effect (p < 0.05). The answering pattern changed significantly from part 1 to part 2 in the concept categories nucleophilicity (p = 0.001), leaving group (p = 0.015), and solvent effects (p = 0.017), with the largest performance increase for items in the concept category nucleophilicity. The students’ answering pattern in the category hyperconjugative effects (p = 0.186) showed no significant change between the two parts of the instrument.

image file: c9rp00054b-f3.tif
Fig. 3 Box-and-whisker plots for the distribution of mean scores in the concept categories (part 1 without elaboration prompt, part 2 with elaboration prompt).

The calculated effect sizes, as measured by Cohen's d (Coolican, 2009), showed that students’ performances increased by small effect sizes in the categories solvent effects (d = 0.22), leaving group (d = 0.27), and hyperconjugative effects (d = 0.16), and by a medium effect size, d = 0.39, in the category nucleophilicity. At this point, we can assume that asking students to elaborate on their answers changed their performances to varying degrees. Elaborating on the nucleophilicity of the displayed molecules seemed to have a larger effect on students’ performance, whereas elaborating on hyperconjugative effects did not change their performance.

However, when looking at the box-and-whisker plots, it becomes apparent that the concept category leaving group tended to be bimodal and not normally distributed in part 1, with many outliers at the top and at the bottom (Fig. 3). The other categories showed no such differing distributions. The large number of outliers can be weak evidence of a multi-dimensional scale and of different factors loading on the same scale. This indicates that this scale may have a different internal structure and that some items may be more difficult than others.

Exploratory factor analysis

Do the concept categories measure the same ability?. Given the spread in the category leaving group, we conducted an exploratory factor analysis to examine the internal structure of the instrument. We used principal axis factoring with varimax rotation.

The number of factors to extract was estimated from a scree plot. Loadings of less than 0.2 were suppressed in the resulting factor table for clarity. A subscale of items was treated as belonging to a factor if its loading was higher than the threshold of 0.3. Two different factors could be extracted when all subscales of the concept categories were considered. The factor analysis showed that all subscales of the concept categories are one-dimensional, with one exception (cf. Table 7): the concept category leaving group loads on two different factors. Subscale 1 loads on the same factor as the concept category solvent effects and all subscales of the category hyperconjugative effects, whereas subscale 2 of the leaving group loaded on a distinct factor together with all subscales of nucleophilicity. The tendency towards a different factor loading had already become apparent in the large number of outliers in the box-and-whisker plot (cf. Fig. 3) and is confirmed by these additional results. This first exploratory factor analysis thus revealed the one-dimensionality of three of the four concept categories, which loaded on two different factors, and showed that the two subscales of the leaving group must be split. Both subscales of the concept category leaving group therefore have to be considered as two distinct subscales (cf. Table 7).

Table 7 Exploratory factor analysis
Rotated factor matrixa
  Factor 1 Factor 2
a Extraction method: principal axis factoring; rotation converged in four iterations.
Leaving group Subscale 1 0.445  
Subscale 2   0.495
Solvent effects Subscale 1–2 0.600  
Nucleophilicity Subscale 1–3   0.356
Hyperconjugative effects Subscale 1–6 0.683  

We further explored the differences and the nature of the items that loaded on these two factors. In a third round, we independently coded whether a shared explicit property was a distracting explicit property (i.e., leading to an incorrect response) or what we refer to as a supporting explicit property (i.e., leading to the correct response). Table 8 shows the results of this third round of coding. This helped us determine whether some of the items were not as distracting as expected and whether shared explicit features were more salient for the students, which consequently influenced their performance.

Table 8 Numbers of supporting explicit properties in the items (LG leaving group; HE hyperconjugative effects; NUC nucleophilicity; SE solvent effects; type 1 with supporting explicit properties; type 2 without supporting explicit properties)
Supporting explicit property LG HE SE NUC
Identified item type
Type 1 Type 2 Type 1 Type 1 Type 2
Halogens 6
C–O bonds 3
1° carbons 3
2° carbons 7
3° carbons 2
Methyl groups 5
Aliphatic bonds 14
OH bonds 7
X=O bonds 1

When comparing the number of possible supporting explicit properties in items from each concept category and subscale, it became evident that items loading on factor 2, which we refer to as item type 2, do not have any supporting explicit properties and only share distracting ones, such as heteroatoms or charges (cf. Table 2). This is the case for the concept category nucleophilicity and subscale 2 of leaving group. Deciding, for example, between the three given nucleophiles in the category nucleophilicity may have been highly misleading for those students who relied on surface similarity in their answers (cf. Table 4, concept category nucleophilicity).

Items in the concept categories hyperconjugative effects or solvent effects had some sort of supporting explicit property, such as variations of the structural backbone, shared between the molecules. These items were combined in item type 1. This type of item may not fully assess students’ underlying reasoning, i.e., their relational conceptual understanding, as recognizing similarity between surface features is sufficient to provide a correct answer in the multiple-choice items.

Given the results of the factor analysis, we further considered the distribution of students’ answers across the two item types. Across item types 1 and 2 and the two parts of the instrument (parts 1 and 2), the box-and-whisker plots showed that students performed significantly better in both parts of the study when items belonged to type 1 (Fig. 4). These results provide first evidence that item design is crucial for assessing students’ relational conceptual understanding. Type 2 items proved highly difficult for the students, whereas type 1 items were answered correctly more often.

image file: c9rp00054b-f4.tif
Fig. 4 Box-and-whisker plots for the distribution of mean scores by item type and instrument (type 1 with supporting explicit properties; type 2 without supporting explicit properties; part 1 without elaboration prompt; part 2 with elaboration prompt).

The effect sizes for each item type showed a small effect size for the elaboration prompt (cf. Table 9). This suggests that, independent of the item type, asking students to elaborate on their own answer only slightly influenced their performance. One can assume that the prompts did not engage students in reflecting on their answers. Other types of indirect metacognitive prompts, used to help students overcome their intuitive heuristics, may have a stronger effect on students’ answers (Talanquer, 2017).

Table 9 Descriptive statistics for the performance of both item types (type 1 with supporting explicit properties; type 2 without supporting explicit properties; part 1 without elaboration prompt; part 2 with elaboration prompt)
Item type Median Mean SD Skewness Kurtosis p Cohen's d
Type 1 Part 1 0.64 0.60 0.23 −0.53 −0.62 0.009*** 0.26
Part 2 0.67 0.66 0.22 −0.51 −0.01
Type 2 Part 1 0.29 0.28 0.17 0.45 −0.15 0.001*** 0.31
Part 2 0.27 0.34 0.24 0.68 −0.02

Students’ level of elaboration and performance

In the second step of our analysis, we looked at the distribution of the elaboration codes students obtained in each concept category of part 2 of the instrument and their relation to students’ performance. Apart from the concept category solvent effects, the distribution of the four elaboration codes did not differ greatly between the concept categories (Fig. 5). Students predominantly explained their answers by referring to a similarity of explicit features of the molecules. About three-quarters of the students usually obtained an E1 (explicit-descriptive) or E2 (explicit-functional) elaboration. Only about one-quarter of the students obtained higher elaboration codes and mentioned some type of implicit property in their elaboration.
image file: c9rp00054b-f5.tif
Fig. 5 Coded levels of elaboration by concept category and performance (SE solvent effects; NUC nucleophilicity; HE hyperconjugative effects; LG leaving group).

The distribution of students’ elaboration codes in the concept category solvent effects differed from the other categories, as most of the students obtained an E3 (implicit-descriptive) or an E4 (implicit-functional) code, both with a high percentage of correct answers: 83% correct answers with an E3 code and 89% with an E4 code. Students in our sample could easily name implicit properties of solvents, such as protic-polar characteristics (coded on the E3 level), or could consider the most favourable solvent type for the reaction context, which was coded as E4. Looking at hyperconjugative effects, which contained only type 1 items (with supporting explicit properties), students with an E1 elaboration (focusing on explicit properties) had a 49% chance of getting the correct answer by mentioning “two secondary carbons” or “primary carbons react the same”. The items of this category allowed the successful use of similar explicit properties. Whether students were thinking about steric effects when choosing to focus on these explicit features could not be determined in detail; none of the students in our sample mentioned steric effects in their elaboration.

In the concept category nucleophilicity, which contained only type 2 items (with distracting explicit properties), only 3% of responses coded E1 were correct. The students’ responses to the elaboration prompt showed that they focused on explicit features, such as “two negative charges” or “two negatively charged oxygen”.

This indicates that identifying explicit properties (i.e., the E1 level) rather than implicit properties carries a high chance of failure when inferring the implicit property is necessary to make a sound judgement between the given structures, at least in items of type 2.
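Conditional success rates like the percentages reported here are simple cross-tabulations of elaboration code against correctness. A stdlib sketch with hypothetical coded responses (not the study data):

```python
# Percent-correct by elaboration code as a cross-tabulation.
# Hypothetical coded responses, not the study data.
from collections import defaultdict

# Each tuple: (elaboration code, answered correctly?)
coded = [("E1", False), ("E1", False), ("E1", True),
         ("E2", True), ("E3", True), ("E3", True),
         ("E4", True), ("E2", False)]

def percent_correct_by_code(coded):
    """Map each elaboration code to its percentage of correct answers."""
    tally = defaultdict(lambda: [0, 0])  # code -> [correct, total]
    for code, correct in coded:
        tally[code][0] += int(correct)
        tally[code][1] += 1
    return {code: 100 * c / n for code, (c, n) in tally.items()}

print(percent_correct_by_code(coded))
```

Applied per concept category and item type, this yields the kind of breakdown shown in Fig. 5 and Fig. 6.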

The perceived salience of explicit properties in the students’ elaborations changed depending on the item. A shared carbon–heteroatom bond, specifically a carbon–oxygen bond, was the main salient feature mentioned by the students in their E1 elaborations for nucleophilicity items. E1 elaborations for hyperconjugative effects predominantly mentioned the shared degree of branching, e.g., primary or tertiary carbons, as the salient explicit property. Given the design of the items (cf. Table 3), a shared carbon–oxygen bond acted as a distracting explicit property, which led to erroneous answers, whereas the recognition of shared tertiary carbons was a supporting explicit property, which led to a correct answer (cf. Table 9). Type 2 items seem to be useful to differentiate between students with a clear focus on explicit properties and those who focused on implicit properties.

Fig. 6 confirms this trend when performance is considered by elaboration code and by item type. Although the percentage of E3 and E4 codes in item type 1 is higher because of the high percentage of E3 codes in the category solvent effects, the trend of obtaining a correct answer when elaborating on an E3 or E4 level was apparent. The performance decreased from 23% correct answers at the E1 level in item type 1 to 5% at the E1 level in item type 2.

image file: c9rp00054b-f6.tif
Fig. 6 Coded levels of elaboration by item type and performance (type 1 with supporting explicit properties; type 2 without supporting explicit properties).

In all concept categories and item types, the likelihood of answering an item correctly increased for students who earned an E3 (implicit-descriptive) or an E4 (implicit-functional) code for their elaboration. This result clearly indicates that the code a student receives for their elaboration may be a good indicator of their underlying relational conceptual understanding. This also supports former research showing that considering implicit properties seemed to allow students to be successful in structurally distracting contexts (Weinrich and Sevian, 2017). Only the concept category leaving group contained both item types, thus allowing us to assess whether a higher elaboration code correlated with increased performance in both item types within the same concept category. The distribution of the elaboration levels in subscale 1 (item type 1) and subscale 2 (item type 2) did not differ, but the percentage of correct answers between the two item types showed that students elaborating on an E1 level had higher chances of a correct answer in item type 1, although they did not use implicit properties in their elaboration. This effect is reversed in subscale 2, whose items do not carry supporting explicit properties; here, students more often answered items incorrectly on the E1 level. In both subscales, the higher the elaboration code obtained by a student, the higher the probability of answering the items correctly. The results for the different item types in the concept category leaving group indicate that when items allow a reliance on similar surface features (subscale 1), students’ actual conceptual understanding cannot be properly assessed. Consequently, this may mask students’ difficulties.

From these findings, we can conclude that a relational conceptual understanding can only be properly assessed (1) if only items of type 2 are used in an instrument or (2) if students’ elaborations, as an indirect measure, are taken into account.


This study was meant to be exploratory and focused on the relational conceptual understanding of students in organic chemistry. The data analysis and discussion of the results revealed various difficulties and limitations in appropriately assessing students’ competencies that may guide future research in this area. Two organizational aspects limit the findings from this study. First, we could only work with a limited number of items; the chemical feasibility of the items and a reasonable number of items per booklet were limiting factors. Second, we could only reach a relatively small number of students; thus, the instrument has not been the subject of a detailed study to evaluate the reliability of each scale. Calculating Cronbach's alpha would require approximately 30 answers per item, which would have reduced the scope of this exploratory study. As we were aiming for a broad perspective on students’ relational understanding, testing four concepts with a limited number of items provided us with insights into item design, which may guide the future design of assessment instruments. Additionally, the concept categories that we tested in our instrument did not equally require reliance on implicit properties. Leaving group and nucleophilicity were the most difficult concept categories, whereas hyperconjugative effects and solvent effects could often be solved by considering shared explicit properties, e.g., determining the degree of substitution or two hydroxyl functional groups. The varied results in students’ performances showed that using different item types is especially useful for determining students’ understanding of nucleophilicity, whereas students’ understanding of hyperconjugative and solvent effects might not be properly assessable with these items.
For the concept categories hyperconjugative effects and solvent effects, which turned out to consist of type 1 items, intelligent task design is important to actually assess the relational conceptual understanding of these concepts and to ensure that students who use chemically sound reasoning, e.g., arguing with steric hindrance or relief of strain, can be identified.

Students’ difficulty in fully explaining the reasoning behind their answers created uncertainty about the depth of their conceptual understanding, because we cannot capture students’ reasoning in full detail. This applies especially if students actually held a deeper understanding of steric hindrance when they referred to the difference between tertiary and primary carbons. None of the students in our sample mentioned steric effects in their answers, which were therefore coded at the E1 or E2 level as explicit-descriptive or explicit-functional. Ongoing research on purposeful assessment items or on appropriately prompting students may guide future research to overcome this problem (Stowe and Cooper, 2017; Talanquer, 2017). The finding that students providing a higher level of elaboration were more likely to be successful in dealing with visually complex representations (Fig. 7) supports ongoing research efforts in fostering students’ scientific explanations and argumentation.

image file: c9rp00054b-f7.tif
Fig. 7 Coded levels of elaboration by each subscale in the category leaving group and by performance (type 1 with supporting explicit properties; type 2 without supporting explicit properties).

Conclusion and implication

In this study, we wanted to explore how to assess students’ relational conceptual understanding in the realm of substitution reactions. Our findings revealed various aspects that influence an appropriate assessment. We could show that items vary in their degree of shared distracting or supporting explicit properties. Items that do not include a supporting explicit property (item type 2) are more difficult for students than items in which shared explicit properties support the correct answer (item type 1). This result may not be surprising, as reasoning in organic chemistry relies on visuals, and studies in organic chemistry have reported students’ strong reliance on surface features when regarding organic structures or reactions (Domin et al., 2008; Graulich and Bhattacharyya, 2017). However, these findings call attention to the pitfalls of assessment items that easily allow a recognition of shared explicit properties and do not require an inference of implicit properties. Students’ approaches while solving those items seemed to be more closely tied to visual representational features than to implicit properties of entities.

Pattern recognition is a valid tool in organic chemistry, and we do not argue against fostering it; arguing with steric effects (which can be considered explicit properties), for example, would be chemically sound. Curriculum reforms have specifically focused on patterns of reaction mechanisms and the respective symbolism (Flynn and Ogilvie, 2015; Galloway et al., 2017). Traditional curricula, however, often pay far greater attention to explicit properties in teaching than to the respective implicit properties of molecules. Considering intrinsic properties of substances is a quite sophisticated stance in students’ thinking about structure–property relationships, one that is often not attained by students in their career (Talanquer, 2018).

One option to make use of students’ reliance on similar explicit properties is to actively create scenarios in which a surface-level focus provokes cognitive dissonance, or to create case comparisons that provide learners with sufficient opportunity to weigh properties (Graulich and Schween, 2018). Additionally, the contexts in which students learn a concept need to be as diverse as possible. Recent studies in organic chemistry education have made evident that a concept learned in one reaction context may not be effortlessly activated in other contexts (Anzovino and Bretz, 2015; Popova and Bretz, 2018). Ongoing research shows that a stronger emphasis on letting students propose mechanistic steps, rather than filling in intermediates of mechanistic steps, and on connecting structural and energetic considerations of reactions might be beneficial for students to go beyond the explicit representational level (Caspari et al., 2018b; Bodé et al., 2019; DeCocq and Bhattacharyya, 2019).

Another option to purposefully use students’ visual focus would be to include supportive visuals in teaching organic chemistry to picture the conceptual level. Visually expressing what common terms mean might foster students’ mental models, for example, reminding students what it looks like when a carbocation is stabilized by hyperconjugation by representing the electron distribution through an electron density map. This may help students to link the common representation of a tertiary carbocation to the conceptual level. This study offers insights into the area of assessing conceptual understanding and builds upon ongoing research on students’ representational competence. It was meant to be exploratory in nature and to inform future research efforts in assessment and item design in organic chemistry.

Conflicts of interest

There are no conflicts to declare.


Acknowledgements

The authors would like to thank Prof. Michael Schween (Philips University Marburg) and Prof. Richard Göttlich (Justus-Liebig University Gießen) for cooperating in this project. We thank all students who participated in the study and the current members of the Graulich research group for fruitful discussions.


  1. Ainsworth S., (1999), The functions of multiple representations, Comput. Educ., 33, 131–152.
  2. Ainsworth S., (2006), DeFT: a conceptual framework for considering learning with multiple representations, Learn. Instr., 16, 183–198.
  3. Anzovino M. E. and Bretz S. L., (2015), Organic chemistry students' ideas about nucleophiles and electrophiles: the role of charges and mechanisms, Chem. Educ. Res. Pract., 16, 797–810.
  4. Anzovino M. E. and Bretz S. L., (2016), Organic chemistry students' fragmented ideas about the structure and function of nucleophiles and electrophiles: a concept map analysis, Chem. Educ. Res. Pract., 17, 1019–1029.
  5. Bodé N. E., Deng J. M. and Flynn A. B., (2019), Getting Past the Rules and to the WHY: Causal Mechanistic Arguments When Judging the Plausibility of Organic Reaction Mechanisms, J. Chem. Educ., 96, 1068–1082.
  6. Caspari I., Kranz D. and Graulich N., (2018a), Resolving the complexity of organic chemistry students' reasoning through the lens of a mechanistic framework, Chem. Educ. Res. Pract., 19, 1117–1141.
  7. Caspari I., Weinrich M., Sevian H. and Graulich N., (2018b), This mechanistic step is “productive”: organic chemistry students' backward-oriented reasoning, Chem. Educ. Res. Pract., 19, 42–59.
  8. Coolican H., (2009), Research methods and statistics in psychology, London, UK: Hodder Education Group.
  9. Cooper M. M., Grove N., Underwood S. M. and Klymkowsky M. W., (2010), Lost in Lewis Structures: An Investigation of Student Difficulties in Developing Representational Competence, J. Chem. Educ., 87, 869–874.
  10. Cooper M. M., Underwood S. M. and Hilley C. Z., (2012), Development and validation of the implicit information from Lewis structures instrument (IILSI): do students connect structures with properties? Chem. Educ. Res. Pract., 13, 195–200.
  11. Cooper M. M., Corley L. M. and Underwood S. M., (2013), An investigation of college chemistry students' understanding of structure–property relationships, J. Res. Sci. Teach., 50, 699–721.
  12. Corradi D. M., Elen J., Schraepen B. and Clarebout G., (2013), Understanding Possibilities and Limitations of Abstract Chemical Representations for Achieving Conceptual Understanding, Int. J. Sci. Educ., 1–20.
  13. de Arellano D. C.-R. and Towns M., (2014), Students understanding of alkyl halide reactions in undergraduate organic chemistry, Chem. Educ. Res. Pract., 15, 501–515.
  14. DeCocq V. and Bhattacharyya G., (2019), TMI (Too much information)! Effects of given information on organic chemistry students’ approaches to solving mechanism tasks, Chem. Educ. Res. Pract., 20, 213–228.
  15. DeFever R. S., Bruce H. and Bhattacharyya G., (2015), Mental Rolodexing: Senior Chemistry Majors Understanding of Chemical and Physical Properties, J. Chem. Educ., 92, 415–426.
  16. Domin D. S., Al-Masum M. and Mensah J., (2008), Students' categorizations of organic compounds, Chem. Educ. Res. Pract., 9, 114–121.
  17. Elby A., (2000), What students' learning of representations tells us about constructivism, J. Math. Psychol., 19, 481–502.
  18. Flynn A. B. and Featherstone R. B., (2017), Language of mechanisms: exam analysis reveals students' strengths, strategies, and errors when using the electron-pushing formalism (curved arrows) in new reactions, Chem. Educ. Res. Pract., 18, 64–77.
  19. Flynn A. B. and Ogilvie W. W., (2015), Mechanisms before reactions: a mechanistic approach to the organic chemistry curriculum based on patterns of electron flow, J. Chem. Educ., 92, 803–810.
  20. Frey A., Hartig J. and Rupp A. A., (2009), An NCME Instructional Module on Booklet Designs in Large-Scale Assessments of Student Achievement: Theory and Practice, Educ. Meas., 28, 39–53.
  21. Galloway K. R., Stoyanovich C. and Flynn A. B., (2017), Students' interpretations of mechanistic language in organic chemistry before learning reactions, Chem. Educ. Res. Pract., 18, 353–374.
  22. Galloway K. R., Leung M. W. and Flynn A. B., (2018), A Comparison of How Undergraduates, Graduate Students, and Professors Organize Organic Chemistry Reactions, J. Chem. Educ., 95, 355–365.
  23. Gkitzia V., Salta K. and Tzougraki C., (2011), Development and application of suitable criteria for the evaluation of chemical representations in school textbooks, Chem. Educ. Res. Pract., 12, 5–14.
  24. Goldstone R. L. and Son J. Y., (2012), Similarity, ed. Holyoak K. J. and Morrison R. G., in The Oxford Handbook of Thinking and Reasoning, Oxford: Oxford University Press.
  25. Goldstone R. L., Medin D. L. and Halberstadt J., (1997), Similarity in Context, Mem. Cogn., 25.
  26. Graulich N., (2015), The tip of the iceberg in organic chemistry classes: how do students deal with the invisible? Chem. Educ. Res. Pract., 16, 9–21.
  27. Graulich N. and Bhattacharyya G., (2017), Investigating students' similarity judgments in organic chemistry, Chem. Educ. Res. Pract., 18, 774–784.
  28. Graulich N. and Schween M., (2018), Concept-Oriented Task Design: Making Purposeful Case Comparisons in Organic Chemistry, J. Chem. Educ., 95, 376–383.
  29. Heckler A. F., (2011), The Ubiquitous Patterns of Incorrect Answers to Science Questions: The Role of Automatic, Bottom-up Processes, ed. Mestre J. P. and Ross B. H., in The Psychology of Learning and Motivation: Cognition in Education, San Diego, CA: Elsevier, vol. 55, pp. 227–267.
  30. Hoffmann R. and Laszlo P., (1991), Representation in Chemistry, Angew. Chem., Int. Ed. Engl., 30, 1–16.
  31. Kozma R., Chin E., Russell J. and Marx N., (2000), The roles of representations and tools in the chemistry laboratory and their implications for chemistry learning, J. Learn. Sci., 9, 105–143.
  32. Mason B., Rau M., Jain L. and Nowak R. D., (2016), Modelling Perceptual Fluency with Visual Representations, in Proceedings of the 33rd International Conference on Machine Learning, New York.
  33. McClary L. and Talanquer V., (2011), College Chemistry Students' Mental Models of Acids and Acid Strength, J. Res. Sci. Teach., 48, 396–413.
  34. Popova M. and Bretz S. L., (2018), Organic Chemistry Students' Understandings of What Makes a Good Leaving Group, J. Chem. Educ., 95, 1094–1101.
  35. Rau M. A., (2017), Conditions for the Effectiveness of Multiple Visual Representations in Enhancing STEM Learning, Educ. Psychol. Rev., 29, 717–761.
  36. Saldana J., (2016), The Coding Manual for Qualitative Researchers, Los Angeles: Sage Publishing.
  37. Scaife T. M. and Heckler A. F., (2010), Student understanding of the direction of the magnetic force on a charged particle, Am. J. Phys., 78, 869–876.
  38. Schnotz W. and Bannert M., (2003), Construction and interference in learning from multiple representation, Learn. Instr., 13, 141–156.
  39. Stieff M. and Raje S., (2010), Expert Algorithmic and Imagistic Problem Solving Strategies in Advanced Chemistry, Spat. Cogn. Comput., 10, 53–81.
  40. Stieff M., Ryu M. and Yip J. C., (2013), Speaking across levels – generating and addressing levels confusion in discourse, Chem. Educ. Res. Pract., 14, 376–389.
  41. Stowe R. L. and Cooper M. M., (2017), Practicing What We Preach: Assessing “Critical Thinking” in Organic Chemistry, J. Chem. Educ., 94, 1852–1859.
  42. Talanquer V., (2008), Students' predictions about the sensory properties of chemical compounds: Additive versus emergent frameworks, Sci. Educ., 92, 96–114.
  43. Talanquer V., (2014), Chemistry Education: Ten Heuristics To Tame, J. Chem. Educ., 91, 1091–1097.
  44. Talanquer V., (2017), Concept Inventories: Predicting the Wrong Answer May Boost Performance, J. Chem. Educ., 94, 1805–1810.
  45. Talanquer V., (2018), Progressions in reasoning about structure–property relationships, Chem. Educ. Res. Pract., 19, 998–1009.
  46. Treagust D. F., Chittleborough G. and Mamiala T. L., (2003), The role of submicroscopic and symbolic representations in chemical explanations, Int. J. Sci. Educ., 25, 1353–1368.
  47. Weinrich M. L. and Sevian H., (2017), Capturing students' abstraction while solving organic reaction mechanism problems across a semester, Chem. Educ. Res. Pract., 18, 169–190.
  48. Wu S. P. W. and Rau M. A., (2018), Effectiveness and efficiency of adding drawing prompts to an interactive educational technology when learning with visual representations, Learn. Instr., 55, 93–104.

This journal is © The Royal Society of Chemistry 2019