Constraints on organic chemistry students’ reasoning during IR and ¹H NMR spectral interpretation

Megan C. Connor; Solaire A. Finkenstaedt-Quinn; Ginger V. Shultz

doi:10.1039/C9RP00033J

DOI: 10.1039/C9RP00033J (Paper) Chem. Educ. Res. Pract., 2019, 20, 522-541

Megan C. Connor, Solaire A. Finkenstaedt-Quinn and Ginger V. Shultz*
Department of Chemistry, University of Michigan, 930 N. University Ave, Ann Arbor, MI 48109, USA. E-mail: gshultz@umich.edu

Received 30th January 2019, Accepted 28th March 2019

First published on 5th April 2019

Abstract

Promoting students’ ability to engage in discipline-specific practices is a central goal of chemistry education. Yet if instruction is to meaningfully foster such ability, we must first understand students’ reasoning during these practices. By characterizing constraints on chemistry students’ reasoning, we can design instruction that targets this constrained reasoning and ultimately promotes more sophisticated ways of thinking. For this study, we investigated reasoning used by 18 organic chemistry students at a large university in the United States as they evaluated the success of chemical syntheses through IR and ¹H NMR spectral interpretation, a common task of practicing chemists. Students completed a series of interpretation tasks while having their eye movements tracked and then participated in semi-structured, cued retrospective think-aloud (RTA) interviews about their reasoning during spectral interpretation. RTA interviews were analyzed qualitatively to characterize invalid chemical assumptions and heuristic reasoning strategies used by participants, both of which science education literature identifies as fundamental constraints to learning. The most problematic assumptions and heuristics, i.e., those used more frequently by unsuccessful participants, were then identified through statistical analysis. Findings suggest that the most problematic constraints on students’ reasoning during spectral interpretation constitute a combination of particular invalid chemical assumptions and heuristic reasoning strategies.

Introduction

Characterization of molecular structure is a fundamental practice of chemistry that is typically accomplished through spectroscopic analysis, where infrared (IR) and nuclear magnetic resonance (NMR) spectroscopy serve as key structural determination methods. The importance of IR and NMR spectroscopy to the undergraduate chemistry curriculum is therefore not surprising, with many organic chemistry textbooks devoting at least one chapter to these characterization methods (Alexander et al., 1999). Spectral interpretation is an inherent aspect of this practice, yet despite its importance to practicing chemists and the instructional focus on spectroscopic techniques, few research studies investigate student knowledge and learning of this aspect broadly or for IR and NMR specifically. This relative lack of research is problematic given the complex nature of spectral interpretation; for students to correctly interpret spectral data, they must not only be able to recognize functional groups, determine electronegativity effects, and identify molecular symmetry but also understand how molecules interact with electromagnetic radiation. In addition, students must be able to apply their knowledge of spectroscopy and molecular properties to graphical representations as well as translate between molecular and graphical representations as they reason.

Teaching and learning IR and NMR spectroscopy

The vast majority of literature on teaching and learning IR and NMR spectroscopy focuses on scaffolding strategies and laboratory activities, with minimal investigation of learning outcomes (Parmentier et al., 1998; Debska and Guzowska-Swider, 2007; Veeraraghavan, 2008; Livengood et al., 2012; Azman and Esteb, 2016; Erhart et al., 2016; Graham et al., 2016; Vosegaard, 2018). However, a small number of studies have investigated how undergraduates, graduate students, and faculty members interpret IR and NMR spectra (Cartrette and Bodner, 2010; Cullipher and Sevian, 2015; Topczewski et al., 2017). Findings from these studies provide some insight into how individuals learn to engage in this aspect of spectroscopic analysis.

In an investigation of graduate students’ and faculty members’ spectral interpretation approaches for combined IR and ¹H NMR problems, Cartrette and Bodner (2010) found that successful participants often had more experience solving complex spectra and applied a consistent approach across problems. Additionally, successful participants were more flexible in their understanding of the “N + 1 rule” for ¹H NMR spectral interpretation and could effectively explain deviations from the rule. Cullipher and Sevian (2015) used eye tracking and think-aloud interviewing to investigate undergraduates’ and graduate students’ reasoning as they related molecular structures to corresponding IR spectra. This study found that participants in given educational levels (1) relied on different assumptions about structure–property relationships as they reasoned and (2) viewed spectral data with different gaze patterns, both of which suggest different approaches to evaluating spectra. Topczewski et al. (2017) also used eye tracking to investigate interpretation approaches used by undergraduates and graduate students, in particular the approaches used to match organic molecules to appropriate ¹H NMR spectra. This study also found differences in gaze patterns between less advanced and more advanced participants, implying different approaches to interpreting spectra between the two groups. These studies provide a useful foundation for understanding how individuals learn to interpret spectral data. In addition, the range of spectral interpretation ability demonstrated by participants in each study suggests that such ability develops along a progression; if stages of this progression can be mapped, then instruction can be designed to cultivate students’ ability to interpret spectral data.

In addition to the aforementioned studies focusing on learning IR and ¹H NMR spectroscopy, research by Connor and Shultz (2018) investigated teaching assistants’ knowledge for teaching ¹H NMR spectroscopy. They found that as the fraction of teaching experience in ¹H NMR spectroscopy increased relative to other courses, teaching assistants’ knowledge for teaching this topic also increased, irrespective of general teaching experience or organic chemistry research experience. This finding is promising given that it provides insight into how other instructors may cultivate knowledge for teaching spectroscopy and in turn improve instruction quality and student learning outcomes. However, additional research focusing on students’ reasoning during spectral interpretation is needed in order to inform instructor education and thus expedite the development of this knowledge.

Theoretical framework

Categorization of mental representations

Research in cognitive psychology suggests that as individuals interact with an entity (e.g., an object, event, state, person, idea, etc.) they construct a mental representation of this entity that corresponds to a given category; it is this categorization that facilitates the organization of experiences (Margolis, 1994; Gelman, 2009). Researchers argue that category membership is not simply determined by the surface similarities of entities, but rather by underlying knowledge structures or theories in which the representations are fixated (Murphy and Medin, 1985). Future reasoning is then guided by the assumptions one has about the properties and behaviour of entities belonging to a category (Maeyer and Talanquer, 2013). However, because categories are defined and governed by underlying knowledge structures, these assumptions may also serve to constrain reasoning (Vosniadou, 1994).

Findings from discipline-based education research (DBER) illustrate how underlying knowledge structures influence the categorization of mental representations. For example, Chi et al. (1979) found differences in the categorization of physics problems between experts and novices, as well as differences in the knowledge associated with these categories. Novices in this study tended to group problems using explicit features provided in the problem, whereas experts generated groupings based on relevant physics principles. Further, Galloway et al. (2018) found differences in the categorization of organic chemistry reactions between organic chemistry students and professors; students in this study tended to group reactions using surface features, whereas professors generated categories solely for process-oriented reasons. Stains and Talanquer (2008) also found differences in the categorization of chemical reactions at various stages of expertise, with undergraduate participants in this study tending to group reactions by surface features (e.g., “aqueous reactant”, “produce water,” etc.) and graduate students forming groups based on traditional, discipline-based reaction types (e.g., redox, acid–base, etc.). Findings from these studies suggest that in order to design chemistry instruction that promotes expert-like thinking, the underlying knowledge structures which govern categorization should be characterized.

Assumptions and heuristics as cognitive constraints

A number of studies have investigated chemistry students’ assumptions in order to characterize underlying knowledge structures in the discipline as well as gain insight into the cognitive elements that constrain reasoning (Talanquer, 2006, 2009; Cullipher and Sevian, 2015). This literature defines assumptions as “presuppositions about the properties and behavior of the entities and phenomena in the domain” (Maeyer and Talanquer, 2013); these assumptions can range from being intuitive in nature, much like phenomenological primitives (diSessa, 1993), to learned principles. In addition to those aligning with scientifically accepted views, assumptions may also be inaccurate. Maeyer and Talanquer (2013) refer to assumptions that reflect inaccurate ideas about chemical entities as spurious chemical assumptions; according to these authors, this class of assumptions often arises from misunderstanding or generalizing learned chemical principles.

Assumptions about the nature of entities often work together with heuristic reasoning strategies during judgement and decision-making, especially when relevant background knowledge is lacking (Talanquer, 2006); sets of these assumptions and heuristic reasoning strategies constitute the fundamental constraints to learning in a domain (Talanquer, 2009). Heuristic reasoning strategies, or heuristics, are simplification and effort-reduction methods used by individuals to decrease the amount of information to process during decision-making (Shah and Oppenheimer, 2008). According to the dual-process theory of cognition, reasoning is guided by two types of thinking: Type 1 and Type 2 processing (Osman and Stavy, 2006). The use of heuristics is associated with Type 1 processing, which tends to be fast, automatic, and independent of cognitive ability. This system is autonomous and does not require working memory (Evans and Stanovich, 2013). Type 2 processing, on the other hand, tends to be slow, systematic, and dependent on cognitive ability. This system requires working memory (Evans and Stanovich, 2013).

Day-to-day decisions are often facilitated by heuristic reasoning strategies associated with Type 1 processing; it is this type of thinking that allows us to complete common tasks without excessive cognitive load. Further, expertise in a field is not characterized by a lack of heuristic reasoning but rather the effective use of heuristics in appropriate contexts (Maeyer and Talanquer, 2013). However, their use may also result in cognitive bias and errors (McClary and Talanquer, 2011), as demonstrated by a number of studies that investigated the use of heuristics in chemistry (Taber, 2009; Cooper et al., 2013; Graulich, 2015). Chemistry students’ use of heuristics, in addition to their assumptions, therefore merits investigation. By characterizing these cognitive constraints, instruction can be designed to target this reasoning and in turn promote more sophisticated ways of thinking. In addition, the characterization of cognitive constraints at various stages of learning in an area can assist in the construction of a learning progression that guides instruction and supports students’ development of knowledge (Smith et al., 2006; Talanquer, 2009). This work provides insight into the heuristics and invalid assumptions which may co-occur with valid assumptions at the lower anchor of a learning progression on spectral interpretation.

Research questions

In an effort to characterize constraints on organic chemistry students’ reasoning during IR and ¹H NMR spectral interpretation, this study addressed the following research questions:

(1) What invalid chemical assumptions or heuristic reasoning strategies (if any) do undergraduate organic chemistry students use when determining the success of a synthesis using IR and ¹H NMR spectra?

(2) What invalid chemical assumptions or heuristic reasoning strategies (if any) most severely constrain organic chemistry students’ reasoning when determining the success of a synthesis using IR and ¹H NMR spectra?

For this investigation, invalid chemical assumptions included what Maeyer and Talanquer (2013) define as spurious chemical assumptions, or “invalid ideas about the properties of chemical entities or reactions, often resulting from misinterpretations and overgeneralizations of chemical principles.” In addition, invalid chemical assumptions included assumptions reflecting any intuitive knowledge that contradicts scientifically accepted principles.

It is important to note that students may also hold scientifically accurate assumptions which guide their thinking along productive avenues toward expertise. If these assumptions are characterized, instruction can build upon them to further cultivate sophisticated thinking. In addition, these assumptions may also restrict thinking given that individuals may rely on them to provide local explanatory coherence rather than achieve conceptual coherence (Talanquer, 2009). For these reasons, chemistry education studies have typically investigated students’ valid assumptions in addition to their invalid assumptions (Talanquer, 2009; Maeyer and Talanquer, 2013). This study is limited to the investigation of invalid assumptions in order to provide a detailed account of any findings while also maintaining their accessibility. Providing a richly detailed, accessible description of findings will ensure their utility and allow for assessments of transferability. Further, by investigating invalid assumptions and heuristics we can first identify any potential significant barriers to analytical thinking, as well as gain some insight into how students may use heuristics effectively. This study thus serves as a productive initial step in the process of understanding students’ reasoning surrounding this complex practice. Characterization of the range of conceptual sophistication demonstrated by organic chemistry students during spectral interpretation, which includes both scientifically accurate and inaccurate assumptions, will be the focus of future work.

Methods

Sample and setting

Eighteen undergraduates from a large, public Midwestern university participated in the study. Seventeen undergraduates were enrolled in an organic chemistry II laboratory course at the time of data collection, and one undergraduate had completed the course in a prior semester. Participants were recruited from four sections of the course. Three sections followed a traditional design and were a combination of chemistry majors and non-majors, with each section taught by a different instructor. The fourth section followed a course-based undergraduate research experience (CURE) design (Auchincloss et al., 2014) and was a combination of chemistry majors and non-majors, with this section taught by one of the three instructors mentioned above. Eleven students were recruited from the traditional sections, and six students were recruited from the CURE section. Participants were recruited via email and in-class announcements by the first author. The undergraduate who completed the course in a prior semester was recruited via snowball sampling (Cohen et al., 2011). Of the 18 undergraduates who volunteered to participate, all were interviewed; the study population was therefore a convenience sample. Responses from the undergraduate who completed the course in a prior semester did not noticeably differ from those of the larger sample population, so they were included in subsequent data analysis. Of the 18 participants, nine were male and nine were female. Participants represented a variety of ethnicities, which is a general representation of the larger student population at this institution. All individuals voluntarily consented to participate in the study and IRB approval was obtained.

IR and ¹H NMR spectroscopy were taught in detail in the organic chemistry II laboratory course at the institution in which the study took place. Instructors of each section covered content relevant to the course in a weekly one-hour laboratory lecture. IR spectroscopy was covered in this lecture during Weeks 4 and 6 of the semester, and ¹H NMR spectroscopy was covered during Weeks 8, 11, and 12. As part of instruction on IR spectroscopy, students were taught to (1) identify the main components of an IR spectrum (e.g., peak characteristics, units, and regions); (2) identify major functional groups and bonds in the functional group region; (3) interpret authentic spectra collected in lab and identify whether each corresponds to a product, starting material, or reaction solvent; (4) match a set of compounds to the appropriate IR spectra; (5) use an IR spectrum along with other characterization methods to identify unknown compounds; and (6) use an IR spectrum, with and without other information, to predict molecular structure. As part of instruction on ¹H NMR spectroscopy, students were taught to (1) interpret features of a spectrum (e.g., number of peaks, peak position, integration, first-order splitting, some second-order splitting, and splitting of OH and NH hydrogens) to determine molecular fragments or the complete structure of a compound; (2) interpret coupling constants and use them to differentiate between structural isomers; (3) match a set of compounds to appropriate NMR spectra; (4) use an NMR spectrum, with and without other information, to predict molecular structure; (5) use an authentic NMR spectrum, along with an IR spectrum and thin layer chromatography, to identify an unknown compound; and (6) compare an authentic NMR spectrum obtained in lab to spectral data from the literature. Students were also provided with a coursepack containing optional practice problems involving IR and ¹H NMR spectral interpretation.

Data collection

For this investigation, we used cued retrospective think-aloud (RTA) interviewing to collect qualitative data on students’ reasoning during spectral interpretation. Cued RTA interviewing is a qualitative technique paired with eye tracking to characterize individuals’ thinking (Hyrskykari et al., 2008), where eye tracking involves measuring individuals’ eye movements as they complete a visual-based task (Just and Carpenter, 1980). Cued RTA interview protocols involve participants watching a recording of their eye movements following the completion of a visual-based task and verbalizing in as much detail as possible what they were viewing and thinking as they completed the task. van Gog et al. (2005) demonstrated that cued RTA interviewing is an effective tool for eliciting problem-solving process information when compared to concurrent think-aloud reporting and standard retrospective reporting. In addition, cued RTA interviewing is a particularly well-suited tool for investigating individuals’ thinking as they complete a complex task such as spectral interpretation because it allows participants to work on their own and in silence, as opposed to concurrent think-aloud interviewing, which requires participants to verbalize their thoughts in-the-moment and thus increases their cognitive load.

Each participant took part in one 30–60 minute session in which they completed three spectral interpretation tasks while having their eye movements tracked. Following the completion of each interpretation task, participants completed a semi-structured, cued RTA interview in which they watched a recording of their eye movements and described in detail what they were focusing on and thinking about during each task (Guan et al., 2006). Cued RTA interviews were conducted using the Tobii Studio 3.4.8 RTA feature, which allows for simultaneous audio recording and playback of Tobii Studio eye movement recordings (Tobii Technology, 2018). Data collected in this investigation included audio-visual recordings of RTA interview responses and information relating to participants’ research experience interpreting IR and ¹H NMR spectra. All data was collected during a four-week period following instruction on ¹H NMR spectroscopy. Data collection continued during this period until 18 individuals had participated; participants expressed no new reasoning at this point, indicating data saturation was achieved (Cohen et al., 2011).

Prior to the start of each session, participants were given an overview of the task format. To ensure that the description of the task was interpreted as intended, participants were asked to describe the task and its objective in their own words. Participants were then given an explanation of the cued RTA interview protocol and informed that each interpretation task would be followed by a cued RTA interview. Participants were not time-restricted as they completed interpretation tasks, and all tasks were presented in a randomized order. Prior to the start of each RTA interview, participants were informed that they could pause the recording of eye movements at any time during the interview if they needed more time to speak. The interviewer was also able to pause the recording of eye movements in order to further probe students’ reasoning. Audio-visual recordings of RTA responses included a video screen capture of eye movement recordings viewed by participants during the RTA interview overlaid with an audio narration of their verbalized thoughts.

Description of interpretation tasks

The three interpretation tasks included in this study were of an identical format (Fig. 1). Each task included a short prompt explaining that chemists first attempted to synthesize a given compound and then analysed their final product spectroscopically to determine if the synthesis was successful. The prompt then instructed participants to determine if the synthesis of the desired product was successful using the provided spectroscopic data (IR and ¹H NMR spectra). This problem type was selected given that determining the outcome of a synthesis using spectroscopic data aligns with the common day-to-day problems of practicing organic chemists (Raker and Towns, 2012); by incorporating authenticity into the tasks, any findings may more meaningfully inform classroom instruction.


	Fig. 1 Spectral interpretation task (Synthesis 1) asking participants to determine if N-(2-hydroxyethyl)-propanamide was successfully synthesized using the provided IR spectrum and ¹H NMR spectrum.

Tasks were labelled as Synthesis 1 (Fig. 1), Synthesis 2 (Appendix, Fig. 4), and Synthesis 3 (Appendix, Fig. 5). Molecules corresponding to the desired product of each synthesis are provided in Fig. 2. All spectra were obtained from the Spectral Database for Organic Compounds (SDBSWeb, 1997) and were free of signals due to solvent or impurities; these spectra were selected in order to reduce participants’ cognitive load and allow for the completion of already complex tasks. All spectra are reproduced herein with permission from SDBSWeb. Integration values and multiplicities were included on all ¹H NMR resonances given that the authors wished to investigate students’ reasoning and not their ability to distinguish between individual peaks. The labels served to further reduce participants’ cognitive load. In addition, reference tables containing characteristic IR absorption values and ¹H NMR chemical shift values (Bruice, 2011) were included with each task given that content knowledge can act as a confounding variable during task completion (Bowen and Bodner, 1991). These tables also mirrored resources available to students when interpreting spectra in the context of the course. After completing the task, participants could respond with “yes, the product was synthesized”, “no, the product was not synthesized”, or “not enough information to tell.”


	Fig. 2 Compounds identified as desired products in each interpretation task.

As part of a larger study, a faculty member with more than ten years of teaching experience in IR and NMR spectroscopy was interviewed to provide insight into ¹H NMR spectral features that often create difficulty for undergraduates. Molecules with ¹H NMR spectra that included these potentially difficult spectral features were incorporated into tasks for this investigation in order to increase the likelihood of eliciting invalid chemical assumptions and problematic heuristic reasoning strategies among participants. For Synthesis 1, participants evaluated the synthesis of N-(2-hydroxyethyl)-propanamide (Fig. 2). The provided IR and ¹H NMR spectra corresponded to this molecule (Fig. 1) and the correct response to Synthesis 1 was “yes, the product was synthesized.” This molecule was selected because it contains an amide functional group that results in splitting patterns that deviate from the “N + 1 rule,” a feature the consulted faculty member identified as difficult for students. Participants in this study received classroom instruction on deviations from the “N + 1 rule”; however, the authors hypothesized that such features may still pose difficulty for participants. For Synthesis 2, participants evaluated the synthesis of isochroman (Fig. 2). The provided IR and ¹H NMR spectra corresponded to isochroman (Appendix, Fig. 4) and the correct response to Synthesis 2 was “yes, the product was synthesized.” Isochroman was selected because its ¹H NMR spectrum contains overlapping signals resulting from aromatic hydrogens, another feature the consulted faculty member identified as difficult for students. Participants had also received classroom instruction on this phenomenon. Lastly, students evaluated the synthesis of 3-(allyloxy)propanal for Synthesis 3 (Fig. 2). The provided IR and ¹H NMR spectra corresponded to 3-allyloxypropionic acid (Appendix, Fig. 5) and the correct response to Synthesis 3 was “no, the product was not synthesized.” This molecule was selected given that it contains a variety of functional groups, which the authors hypothesized would increase difficulty.

Prior to the study, each task was piloted with four undergraduates who had recently completed the organic chemistry II laboratory course in order to ensure that the prompt was interpreted as intended and that no task was too easy or too difficult for the study population. No task received either all correct or all incorrect responses, suggesting that the tasks were of an appropriate level of difficulty and that participants’ inability to make annotations did not inhibit their ability to interpret spectra. Four interpretation tasks were initially developed and piloted; however, pilot study members reported fatigue after the third task, so only three tasks were included in the study.

Qualitative analysis of RTA interviews

A mixed-methods approach with a conversion design (Cohen et al., 2011) was used to investigate invalid chemical assumptions and heuristic reasoning strategies that constrained undergraduates’ reasoning during spectral interpretation. RTA interviews were first analysed qualitatively to identify invalid chemical assumptions and heuristics used by participants. Frequencies of responses containing assumptions and heuristics were then analysed quantitatively to identify any assumptions or heuristics that most severely constrained participants’ reasoning.

RTA interviews were transcribed verbatim, and audio-visual recordings of the interviews were used when necessary to clarify any ambiguous references to spectral data in the transcripts. The first author inductively coded all transcripts for invalid chemical assumptions. During this process, the author generated in vivo codes and descriptive codes that corresponded to specific invalid ideas about chemical and spectral features expressed by participants (Miles and Huberman, 1994; Saldaña, 2016). Codes and definitions were then refined in order to combine closely related invalid ideas into single codes. The second author then deductively coded all transcripts using the revised codes and definitions as well as inductively coded transcripts to identify any invalid ideas not identified by the first author. To establish reliability of the coding, the first and second author then discussed and revised all codes until 100% agreement was reached. NVivo 11 software was used throughout the coding process (Saldaña, 2016). After consensus was established, the first author then identified themes among the invalid chemical assumption codes using constant comparative analysis (Creswell and Poth, 2018). Written analytic memos and regular discussions with the second author facilitated the identification of themes (Saldaña, 2016).

Following this analysis, themes and their contributing invalid chemical assumptions were shared with five external experts to establish the validity and transferability of the findings to other interpretation tasks and instructional contexts. External experts were instructors of first and second-semester organic chemistry laboratory and lecture courses from four institutions of various types (one public doctoral-granting university in the Midwest, one public doctoral-granting university in Canada, one private doctoral-granting university in the Midwest, and one private liberal arts college in the Midwest) who provided feedback on the perceived extent to which the identified themes constrain their own students’ reasoning. All external experts cover IR and ¹H NMR spectroscopy in their respective courses. Experts had teaching experience in IR and ¹H NMR spectroscopy ranging from two years to nearly 20 years.

To code for heuristic reasoning strategies, the first author developed an initial list of heuristics and corresponding definitions using existing literature on heuristic reasoning in chemistry (McClary and Talanquer, 2011; Maeyer and Talanquer, 2013; Talanquer, 2014). The first author then deductively coded all transcripts using this list. Any heuristics that did not appear in responses were then removed from the initial list, and definitions of remaining heuristics were revised in order to clearly operationalize each heuristic. The first and second author then deductively coded all transcripts using the refined list. The authors then discussed and revised all codes until 100% agreement was reached.

Quantitative analysis of assumptions and heuristics

After the first and second authors reached consensus for all codes, frequencies of responses containing given invalid chemical assumptions and heuristic reasoning strategies were tabulated for each interpretation task. For this tabulation, invalid chemical assumptions were grouped into previously identified themes. In order to investigate if the use of certain assumptions or heuristics was task-specific, a two-sided Fisher's exact test and a Pearson χ² test of independence were used to determine if certain interpretation tasks disproportionately elicited specific invalid chemical assumptions or heuristic reasoning strategies. A Pearson χ² test of independence was used for the analysis of heuristic frequency distributions between tasks, whereas a Fisher's exact test was used for the analysis of assumption frequency distributions between tasks. A Fisher's exact test was used for the latter analysis because the total number of assumptions did not meet the minimum requirements for the Pearson χ² test of independence, and a two-sided Fisher's exact test is recommended in lieu of a Pearson χ² test of independence when the total number of observations is less than 20 (Sheskin, 2011). Statistical significance was set at 0.05 for all significance testing. All statistical analyses were completed using the R Stats Package in RStudio Version 1.1.453 (R Core Team, 2018).
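The test-selection rule described above can be illustrated with a minimal Python sketch. This is not the authors' R code: the function names `choose_test` and `pearson_chi2` and all counts are hypothetical, and the sketch assumes that the only selection criterion is the total number of observations (fewer than 20 selects Fisher's exact test). It computes only the Pearson χ² statistic and degrees of freedom, not a p-value.

```python
from typing import List, Tuple

Table = List[List[int]]  # rows = coded categories, columns = interpretation tasks


def choose_test(table: Table, min_total: int = 20) -> str:
    """Apply the stated rule: Fisher's exact test when the total count is below 20."""
    total = sum(sum(row) for row in table)
    return "fisher_exact" if total < min_total else "pearson_chi2"


def pearson_chi2(table: Table) -> Tuple[float, int]:
    """Return the Pearson chi-square statistic and degrees of freedom.

    Assumes every row total and column total is non-zero; otherwise an
    expected count of zero would cause a division by zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical frequency tables (NOT data from the study).
heuristic_counts = [[10, 12, 8], [6, 5, 9]]   # total 50
assumption_counts = [[3, 2, 1], [4, 5, 3]]    # total 18

print(choose_test(heuristic_counts), pearson_chi2(heuristic_counts))
print(choose_test(assumption_counts))
```

In practice a statistics package would be used for both tests; the sketch only makes explicit how the sample-size rule routes each table to a test.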

In order to identify assumptions or heuristics that most severely constrained organic chemistry students’ reasoning, one-sided Fisher's exact tests were used to determine if certain invalid chemical assumptions or heuristic reasoning strategies appeared in incorrect responses significantly more than they appeared in correct responses. A one-sided test is used in lieu of a two-sided test when frequencies are expected to be greater for a given group (Sheskin, 2011). Incorrect responses involved incorrectly determining the success of given syntheses (e.g., selecting “no, the product was not synthesized” when the IR and ¹H NMR spectra corresponded to the target molecule), whereas correct responses involved correctly determining the success of given syntheses (e.g., selecting “yes, the product was synthesized” when the IR and ¹H NMR spectra corresponded to the target molecule). Responses in which the “not enough information to tell” option was selected were omitted from this analysis. Odds ratios were evaluated post hoc as a measure of effect size for assumptions and heuristics that appeared relatively more in incorrect responses. The Haldane–Anscombe correction was used for the determination of odds ratios due to some frequencies equalling zero (Lawson, 2004). Small, moderate, and large effects corresponded to odds ratios equalling 1.68, 3.47, and 6.71, respectively (Chen et al., 2010). While some correct responses exhibited constraints on reasoning (e.g., invalid assumptions that either did not influence participants’ decisions or resulted in participants responding correctly “for the wrong reasons”), incorrect responses represent an extreme case of constrained reasoning; identification of invalid assumptions and heuristics common to incorrect responses may thus provide insight into the cognitive elements that most severely constrain organic chemistry students’ reasoning during spectral interpretation.
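
As a rough illustration of this step, the sketch below computes a one-sided Fisher's exact p-value for a 2 × 2 table and an odds ratio with the Haldane–Anscombe correction (adding 0.5 to every cell). It is written in plain Python rather than the R used in the study, and several details are assumptions: the function names, the example counts (which are hypothetical, not study data), the choice to apply the correction to all cells of every table, and the binning of odds ratios into labels using the 1.68, 3.47, and 6.71 reference values as lower bounds.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2 x 2 table

                      incorrect   correct
        code present      a          b
        code absent       c          d

    Returns P(X >= a) under the hypergeometric null, i.e. the probability
    of seeing the code in at least `a` incorrect responses by chance.
    """
    row1 = a + b          # responses containing the code
    col1 = a + c          # incorrect responses
    total = a + b + c + d
    upper = min(row1, col1)
    numerator = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return numerator / comb(total, row1)


def haldane_anscombe_odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio after adding 0.5 to each cell (avoids division by zero)."""
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))


def effect_label(odds_ratio: float) -> str:
    """Assumed binning around the small/moderate/large reference values."""
    if odds_ratio >= 6.71:
        return "large"
    if odds_ratio >= 3.47:
        return "moderate"
    if odds_ratio >= 1.68:
        return "small"
    return "below small"


# Hypothetical counts, NOT data from this study.
a, b, c, d = 5, 1, 1, 10
p_value = fisher_exact_greater(a, b, c, d)
odds_ratio = haldane_anscombe_odds_ratio(a, b, c, d)
print(f"p = {p_value:.4f}, OR = {odds_ratio:.2f} ({effect_label(odds_ratio)})")
```

The one-sided form sums only the tail in which the code appears at least as often in incorrect responses as observed, which matches the directional expectation stated above.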

Results and discussion

Of the 18 participants, only five correctly determined the success of all three syntheses. Nearly half of participants (n = 8) correctly determined the success of two of the three syntheses, and four participants correctly determined the success of only one synthesis. Lastly, one participant did not correctly determine the success of any synthesis. This distribution further suggests that interpretation tasks were of an appropriate level of difficulty for this population and that undergraduates’ reasoning during spectral interpretation merits investigation. Syntheses 1 and 2 appeared to be of equal difficulty, with six incorrect responses to each of these tasks. Synthesis 3 appeared to be slightly less difficult, with only four incorrect responses to this task. Further, each synthesis had one response indicating “not enough information to tell.”

Qualitative findings: invalid chemical assumptions

Through the inductive coding of RTA interview responses, we identified a total of 20 unique invalid chemical assumptions. Of these assumptions, 12 related to ¹H NMR spectroscopy, 5 related to IR spectroscopy, and 3 related to molecular structure (Table 1). From these 20 coded assumptions, we identified five themes that more comprehensively explain the invalid chemical assumptions that constrained students’ reasoning during spectral interpretation (Table 1). Invalid chemical assumptions were related to specific spectral features included in this study. However, both the use of interpretation tasks that incorporated a variety of spectral features and the identification of common themes contribute to the transferability of our findings to other IR and ¹H NMR spectral interpretation tasks for this study population. Validation of these themes by external experts further contributes to the transferability of these findings to other interpretation tasks, as well as to other instructional contexts. These themes are described in detail below.

Table 1 Invalid chemical assumptions that constrained students’ reasoning and themes among these assumptions. The n-values in the third column correspond to the number of participants (n = 18) who used specific assumptions at least once. The n-values under each theme correspond to the number of participants with responses contributing to each theme

Theme	Contributing invalid chemical assumptions	n	Synthesis
Assumptions that the “N + 1 rule” should hold (n = 13)	• NH and/or OH should not appear as singlets	5	1
	• CH₂ groups between NH and OH should appear as quartets	4	1
	• The aromatic ring has too few corresponding NMR peaks	6	2
	• Double bonds should obey the “N + 1 rule”	5	3

Assumptions that spectral data should be absolute (n = 9)	• IR peaks should be prominent if the functional group is present	9	1, 2, 3
	• Chemical shift values should match the reference material	1	2
	• The number of chemically equivalent hydrogen sets should match the number of peaks	1	2
	• A messy IR spectrum suggests an unsuccessful synthesis	1	3

Visuospatial invalid assumptions (n = 8)	• Isochroman is symmetric	7	2
	• Incorrect number of hydrogen atoms attached to methylene and methine carbon atoms	1	2

Practical invalid assumptions (n = 8)	• There is an IR peak corresponding to a halogen functional group	2	1, 3
	• The NH singlet corresponds to an artefact the NMR spectrometer “picked up”	1	1
	• The IR peak near 3000 cm⁻¹ corresponds to the OH functional group	1	2
	• The broad IR peak near 3000 cm⁻¹ corresponds to the CH functional group (n = 4) or water (n = 1)	5	3

Fundamental invalid assumptions (n = 6)	• Parts of a molecule vary in concentration	1	1
	• Incorrect splitting knowledge: connected hydrogen atoms determine multiplicity	1	1, 2
	• Incorrect splitting knowledge: multiplicity determined by absolute number of adjacent hydrogen atoms (“N”) rather than “N + 1”	1	2, 3
	• De-shielding causes a shift right	1	2
	• Oxygen nuclei generate ¹H NMR signals	1	2
	• Doublets are part of doublet of doublets	3	3

Theme 1: assumptions that the “N + 1 rule” should hold. Each ¹H NMR spectrum included in this study incorporated one spectral feature for which the “N + 1 rule” fails to hold. The “N + 1 rule” is a guideline commonly included in ¹H NMR instructional materials for determining signal multiplicity (Bruice, 2011); however, a number of exceptions to this “rule” exist. A majority of interviewees (n = 13) incorrectly indicated that the failure of the “N + 1 rule” to hold was problematic and that this failure suggested that given syntheses were unsuccessful (Table 1).
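
For readers less familiar with the guideline, the naive prediction it makes can be written as a one-line calculation. The Python sketch below is illustrative only: the function name and lookup table are hypothetical, and the comparison uses the Synthesis 1 case in which some students expected the NH hydrogen, with two neighbouring hydrogens, to appear as a triplet although it appears as a singlet in the spectrum.

```python
# Names for the number of lines in a first-order multiplet.
MULTIPLICITY_NAMES = {
    1: "singlet",
    2: "doublet",
    3: "triplet",
    4: "quartet",
    5: "quintet",
    6: "sextet",
    7: "septet",
}


def predicted_multiplicity(n_neighbours: int) -> str:
    """Naive "N + 1 rule": N equivalent neighbouring hydrogens give N + 1 lines.

    This ignores every exception to the guideline, which is exactly the
    over-extension the text describes.
    """
    if n_neighbours < 0:
        raise ValueError("number of neighbouring hydrogens cannot be negative")
    return MULTIPLICITY_NAMES.get(n_neighbours + 1, "multiplet")


# NH hydrogen in Synthesis 1: two neighbouring hydrogens on the adjacent carbon.
naive_prediction = predicted_multiplicity(2)
observed = "singlet"  # as the signal appears in the spectrum
rule_holds = naive_prediction == observed
print(naive_prediction, observed, rule_holds)
```

The mismatch between the naive prediction and the observed signal is a feature of the molecule's behaviour, not evidence that the synthesis failed.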

For Synthesis 1 (Fig. 1), four students regarded the appearance of singlets corresponding to OH and NH hydrogens as problematic, stating that these hydrogens should appear as triplets given their number of nearest neighbours. One of these students, Frances, explains how this deviation from the “N + 1 rule” influenced her evaluation:

“And then, I basically, I concluded that both of the single hydrogens that were on the alcohol and on the NH, they did have neighboring hydrogens next to them and because of that, because of that, those peaks couldn't be singlets. My reasoning for the question.”

One additional student regarded the singlet corresponding to the NH hydrogen as problematic yet recognized that signals corresponding to OH hydrogens do not always undergo splitting. Further, four students correctly paired corresponding singlets to the NH and OH hydrogens; however, they stated that the two hydrogen groups between these functional groups should appear as either two triplets or two quartets (and not as one quartet and one triplet, as they appear in the spectrum). All six participants who incorrectly determined that Synthesis 1 was unsuccessful relied on one of these invalid assumptions during their reasoning.

Invalid chemical assumptions that contribute to Theme 1 also appeared in RTA responses for Synthesis 2 and Synthesis 3 (Appendix, Fig. 4 and 5). For Synthesis 2, nearly equivalent aromatic protons in isochroman give rise to one multiplet and one doublet rather than two doublets and two triplets as the “N + 1 rule” indicates. One-third of students (n = 6) incorrectly deduced that the actual splitting pattern may imply an unsuccessful synthesis. Again, Frances explains how this deviation from the “N + 1 rule” influenced her evaluation:

Frances: So, I think again I was double checking using the “N + 1 rule” just trying to figure the different environments. And then counting in my head, just trying to see the different environments.

Interviewer: Did the “N + 1 rule” seem to be checking out for you?

Frances: I believe that on the right part of the molecule it was working on it. At least on the left part of the molecule, when I was analyzing it, there didn't seem like there was… because on the spectrum it's listed in the three in one, which I didn't really… to me didn't make any sense just because it didn't seem like there was any way, at least in mind, to quantify that.

Of the six participants who incorrectly determined that Synthesis 2 was unsuccessful, five relied on this invalid assumption during their reasoning, further suggesting that students’ reasoning is constrained by the notion that the “N + 1 rule” should generally hold.

Students’ reasoning was further constrained by this notion in RTA responses to Synthesis 3. For this interpretation task, several students (n = 5) reasoned using the incorrect assumption that the “N + 1 rule” should apply to vinylic hydrogen atoms. One student, John, explains that the appearance of two doublets corresponding to terminal vinylic hydrogen atoms in 3-(allyloxy)propanal in part led him to question the success of this synthesis:

“…So I figured there are… too many integrations of one. And I figure that might not be right. Yeah there are too many ones and too many splits over here. It shouldn't be [so] many splits. I figured that is the wrong thing.”

As illustrated above, assumptions that the “N + 1 rule” should hold appeared in all three interpretation tasks, providing some indication that these assumptions constrain students’ reasoning in a number of contexts. This result aligns with the finding by Cartrette and Bodner (2010) that unsuccessful participants were less flexible in their understanding of the “N + 1 rule” when compared to successful participants, providing additional evidence of transferability to other interpretation tasks and instructional contexts.

Theme 2: assumptions that spectral data should be absolute. Students’ reasoning was further constrained by invalid assumptions that certain spectral data should be prominent or definite if corresponding molecular features are present in the synthesized product. Half of students (n = 9) incorporated such assumptions into their reasoning. These invalid chemical assumptions contribute to Theme 2 (Table 1), the notion that spectral data should be absolute if molecular features are present. The most prevalent of these assumptions was that IR peaks should be readily distinguishable if corresponding functional groups are present in the synthesized product, with half of students (n = 9) incorrectly identifying IR peaks of low intensity or overlapping IR peaks as evidence of unsuccessful syntheses. Audrey's response illustrates how this assumption influenced her reasoning for Synthesis 3:

“That stretch around [1700 cm⁻¹] is kind of the combination of… It's on the high end of the C double bond C stretch and the low end of the carbonyl stretch, I didn't like the fact that it was like one. And I was like, ‘You'd probably see something different.’ And so I think that in the end was what led me to say, ‘No.’”

A number of less prevalent assumptions also contributed to Theme 2. Similar to the previously described invalid assumption, one student, Stephen, incorrectly reasoned that a complex IR spectrum in Synthesis 3 (Appendix, Fig. 5) provided some evidence of an unsuccessful synthesis:

Stephen: Yeah, it was really the IR that in the end made me decide no. I think I was just really confused by the NMR, so I ended up, yeah.

Interviewer: So maybe another IR due to contamination maybe? Or…

Stephen: Yeah, maybe. It just seemed - yeah, kind of messy to me.

Interviewer: Okay, so more “messy” than you normally see?

Stephen: Yeah.

Another student incorrectly reasoned that Synthesis 2 was unsuccessful using the invalid assumption that ¹H NMR chemical shift values should exactly match values provided in the reference table. Lastly, one student identified a mismatch in the number of chemically equivalent hydrogen groups and ¹H NMR resonances in Synthesis 2 as evidence of an unsuccessful synthesis, further suggesting this notion constrained students’ reasoning.

Theme 3: visuospatial invalid assumptions. Students’ invalid chemical assumptions relating to their visuospatial thinking also appeared to constrain reasoning (Table 1). A surprising number of students (n = 7) reasoned using the invalid assumption that isochroman possesses molecular symmetry in their response to Synthesis 2. One student, Madelyn, explains how this assumption influenced her reasoning:

“And, then I moved straight to NMR. See what I did. Here's what I counted, right off the bat, the peaks. The phenyl I counted wrong a bunch of times because of the symmetrics. There should be two on the phenyl. Three on the other ring. Three. That lined up good with that.”

Of the six students who incorrectly determined that Synthesis 2 was unsuccessful, four relied on this invalid assumption in their reasoning. No invalid assumptions relating to symmetry appeared in responses to Syntheses 1 and 3, likely because of the distinct asymmetry of the molecules in each corresponding task. Chemistry students’ difficulty with visuospatial thinking is widely reported in chemistry education literature (Wu and Shah, 2004; Harle and Towns, 2011); however, much of this literature relating to organic chemistry focuses on students’ difficulty with forming three-dimensional mental images while visualizing two-dimensional molecular structures or performing mental rotation tasks. Students’ inability to recognize the asymmetry of isochroman (a task that does not require mental rotation) suggests that students’ difficulty with visuospatial thinking in the context of organic chemistry may extend to less complex visualization tasks. Further, this difficulty may serve to constrain students’ reasoning during spectral interpretation.

Theme 4: practical invalid assumptions. Nearly half of students (n = 8) reasoned using invalid chemical assumptions that likely arose from a lack of practical experience interpreting spectral data (Table 1). These invalid assumptions most commonly took the form of students incorrectly identifying characteristic IR peaks. For Synthesis 3, four participants incorrectly identified the broad IR peak near 3000 cm⁻¹, a peak characteristic of the OH group of a carboxylic acid, as corresponding to the CH functional group. IR peaks corresponding to the CH functional group are notably less broad and intense than those corresponding to this OH (Pavia et al., 2015). Of the four students who incorrectly stated that Synthesis 3 was successful, two reasoned using this assumption. Similar to this assumption, Nancy incorrectly reasoned that Synthesis 2 was unsuccessful after misidentifying an IR peak corresponding to the CH functional group as belonging to the OH group of a carboxylic acid. She explains how this assumption influenced her reasoning:

“I know a broader peak around 3000 [cm⁻¹] usually corresponds to an OH, and I didn't see that [in the molecule]. I know that that's not necessary, but that peak kind of looks like what I've seen before with an OH. I didn't see that…. It was really that three [corresponding to integration of the multiplet NMR peak], I think, that I was basing my decision off of. The three and then this peak here [the IR peak near 3000 cm⁻¹].”

A small number of students (n = 2) also incorrectly reasoned that Syntheses 1 and 3 were unsuccessful due to the presence of apparent IR peaks corresponding to a halogen functional group in the fingerprint region of the spectra. Further, one student rationalized the presence of the unexpected broad singlet corresponding to the NH hydrogen in Synthesis 1 as an artefact that the spectrometer detected. This student, Robert, explains his reasoning below:

“Yeah, so I went back to the NMR, because I was really stuck on the NH being a singlet. And it was concerning to me that it was only a singlet and it was so small. And it wasn't a real peak, I guess, it was more like it was just kind of a small thing that was picked up by the machine.”

Robert's notion that the spectrometer can detect phenomena other than the absorption of electromagnetic energy by hydrogen nuclei further contributes to the practical invalid assumptions that constrained students’ reasoning.

Theme 5: fundamental invalid assumptions. The last class of assumptions that constrained students’ reasoning were fundamental misunderstandings about basic NMR principles (n = 6, Table 1). The most common of these fundamental invalid assumptions (n = 3) was that the two doublets corresponding to each vinylic hydrogen atom in Synthesis 3 comprised a set of doublet of doublets. In addition, one student reasoned using the assumption that the number of hydrogen atoms on a carbon atom (rather than the adjacent carbon atoms) gives rise to a signal's splitting pattern. Other assumptions held by individual students were (1) specific parts of a molecule may vary in concentration and result in unexpected peaks, (2) de-shielding causes a shift right rather than left on the NMR spectrum, (3) oxygen nuclei give rise to ¹H NMR signals, and (4) multiplicity is determined by the absolute number of adjacent hydrogen atoms (“N”) and not the number of adjacent hydrogen atoms plus one (“N + 1”).

External expert validation. All experts stated that the themes accurately reflected problematic reasoning they have encountered among their own students. However, two of the five experts stated that the identified themes did not capture invalid chemical assumptions that are problematic during the interpretation of more authentic ¹H NMR spectra in undergraduate laboratory courses (e.g., spectra containing solvent peaks, unlabelled multiplicities, or raw integration values). These assumptions were not captured due to the format of the interpretation tasks and are discussed in the Limitations section below. Further, one expert stated that their students often invalidly assume that aldehyde hydrogens (in reference to Synthesis 3) and OH and NH hydrogens (in reference to Synthesis 1) should always appear as singlets. A number of participants in this study assumed that aldehyde hydrogens, as well as OH and NH hydrogens, appear as singlets; however, it is unclear from the interview data if participants assumed these hydrogens always appear as singlets. In addition, students received instruction explaining that aldehyde hydrogens often appear as singlets due to a combination of low instrument resolution and the small coupling between aldehyde hydrogens and hydrogens on adjacent carbons (Pavia et al., 2015). Such reasoning was therefore not coded as an invalid assumption given that it would require significant inference by the authors. Nevertheless, these codes would fall under Theme 2 (i.e., assumptions that spectral data should be absolute) and thus do not discredit this study's findings. The inability to identify any problematic reasoning that required inference is further discussed in the Limitations section below.

Lastly, two of the five experts stated that they had not observed a small number of specific invalid chemical assumptions among their own students. These assumptions included participants’ notion that parts of a molecule can vary in concentration, that IR peaks should be prominent if the functional group is present, and that halogen peaks were present in certain IR spectra. However, these experts explained that they may not have observed these assumptions among their own students given that they have not asked questions that would elicit such reasoning.

Qualitative findings: heuristic reasoning strategies

Through the deductive coding of heuristic reasoning strategies, we identified eight heuristics used in at least 20% of responses (Table 2). All participants used at least one heuristic in each response, though the way participants used them varied with individual and context. Talanquer (2014) divides common heuristics used by chemistry students for judgement and decision-making into three general groups: (1) fundamental associative processes, (2) inductive judgements, and (3) affective judgements. Rather than present a list of heuristic strategies used by our participants, we aim to demonstrate how individuals used heuristics from each of these groups as they evaluated the success of syntheses via spectral interpretation.

Table 2 Heuristics identified in at least 20% of RTA interview transcripts, corresponding definitions, the number of participants who used corresponding heuristics at least once (n = 18), and total number of responses containing each heuristic (N = 54)

	Heuristic	Description	Participants	Responses (total)
Fundamental associative processes	Processing fluency	Readily making sense of any salient molecular or spectral features	18	52
	Associative activation	Associating one observed spectral or molecular feature with a corresponding feature	17	42

Inductive judgements	Generalization	Overgeneralizing learned rules or patterns without considering all variables that may be involved	16	29
	Representativeness	Using some (but not all) spectral features to decide if entire spectra correspond to a molecule	15	18
	Reduction	Eliminating spectral features as information to process when alternative molecules share similar spectral features	14	20
	Rigidity	Using knowledge that has worked in the past and failing to consider other approaches	11	16
	One-reason decision making (ORDM)	Considering multiple spectral features while reasoning, but ultimately basing a decision on one spectral feature	10	11

Affective judgements	Affect	Experiencing positive or negative emotions evoked by spectral data	14	21

Fundamental associative processes. Participants most commonly employed processing fluency and associative activation when evaluating each synthesis, both of which fall under the category of fundamental associative processes (Table 2). All participants used processing fluency in at least two out of three responses, and 17 out of 18 participants used associative activation in at least two of their responses. The prevalence of these heuristics is not surprising given that they often work together to support other heuristic reasoning (cf., Talanquer, 2014). Processing fluency refers to the ease with which an individual processes either explicit or implicit cues. In the context of spectral interpretation, use of this heuristic took the form of participants readily making sense of salient spectral and molecular features using either existing knowledge of such features or provided reference material. As in Robert's response to Synthesis 1 below, the heuristic often appeared at the beginning of participants’ responses and focused more on explicit rather than implicit features:

“Okay, so I started by looking at the molecule and counting all of the hydrogens, and comparing it to the NMR, to look at the integration and the splitting again. I feel like that's the easiest way to start… And so I saw that there's the right amount of integration values and it looks like they all correlate to the peaks as they should.”

In this example, Robert uses existing knowledge of chemical equivalency and its effect on the appearance of ¹H NMR signals to conduct what he deems as an easy, initial evaluation of the NMR spectrum. His focus on explicit features is unsurprising given that experts rather than novices tend to process implicit features more readily (cf., Talanquer, 2014).

Where processing fluency refers to the ease with which information is processed, associative activation refers to the processing mechanism by which associations are automatically evoked through interaction with some stimulus (Morewedge and Kahneman, 2010). For this study, associative activation took the form of participants either (1) associating a spectral or molecular feature with a corresponding characteristic feature using existing knowledge of such combinations or (2) explicitly connecting spectral features to those observed previously in instructional materials, laboratory, or other contexts. Associative activation therefore extends beyond readily processing spectral features using any source of information (i.e., existing knowledge regarding basic principles or reference material) to encompass the use of activated existing knowledge of spectral and molecular features. Ralph's response to Synthesis 1 illustrates this distinction. In his response below, he first observed the broad IR peak near 3300 cm⁻¹ and correctly associated it with the OH functional group using existing knowledge of this combination. He then observed an IR peak near 1700 cm⁻¹ and correctly associated it with the carbonyl functional group, a combination he observed previously in the laboratory:

“And so I saw there was an OH peak, I remember that. That was like one of the only peaks I remember, by memory…. But yeah, so this one and then also a carbonyl peak in this area. 1700 [cm⁻¹] area. I do remember that from lab. So that was kind of like, okay, those both match.”

Associative activation typically occurs alongside processing fluency given that the mechanism typically involves processing information with ease, and it is difficult to present evidence of this heuristic in isolation from the other (cf., Talanquer, 2014). Ralph's response illustrates both associative activation and processing fluency, as do all excerpts identified as associative activation in this study; however, not all excerpts coded as processing fluency necessarily involve associative activation. Further, research on heuristic reasoning suggests that strongly activated information tends to disproportionately influence decision-making (Heckler, 2011). Associative activation can serve as an effective heuristic when used in correct contexts, as in Ralph's case above; however, it can be problematic if used inappropriately. For instance, Nancy incorrectly associated the IR peak near 3000 cm⁻¹ with the OH functional group for Synthesis 2. As noted in the description of participants’ practical invalid assumptions, this narrower peak actually corresponds to the CH functional group. Nancy then relied on this association and the target molecule's lack of an OH group to incorrectly determine that the synthesis was unsuccessful.

Inductive judgements. Five heuristics identified in participants’ responses contribute to inductive reasoning: generalization, rigidity, representativeness, reduction, and one-reason decision making (ORDM). Participants applied these heuristics less frequently than fundamental associative processes; however, their use was still prevalent. Participants most commonly used generalization, with this heuristic appearing in at least one out of three responses for 16 out of 18 participants. Generalization involves extending previously observed patterns or rules to potentially unfamiliar situations or contexts in order to make a judgement, and among novices in a field it tends to entail the over-extension of learned rules or principles (cf., Talanquer, 2014). In the context of this study, this heuristic often manifested as students’ over-extension of the “N + 1 rule” to given ¹H NMR signals without considering other variables such as amide bond coupling behaviour, the near-equivalent chemical environment of aromatic hydrogens, or the chemical inequivalence of vinylic hydrogen atoms. Other common generalizations involved claims that IR peaks should be prominent if functional groups are present, chemical shift values should match reference material exactly, and the number of chemically equivalent hydrogen atom sets should match the number of ¹H NMR signals. It should be noted that a number of invalid chemical assumptions (Table 1) resulted from generalization; however, not all invalid assumptions were a product of this heuristic.

Rigidity is related to generalization and involves relying on problem-solving approaches that have worked in the past while failing to consider other strategies in new contexts. Over half of participants (11 out of 18) used this heuristic in at least one response. When participants applied the rigidity heuristic, they most often relied on invalid chemical assumptions resulting from generalizations to make a final decision about the success of syntheses. Rigidity and generalization heuristics therefore often co-occurred, yet participants could still apply the generalization heuristic without being rigid in their generalization. Notably, participants who relied on both the generalization and rigidity heuristics tended to incorrectly determine the success of syntheses rather than simply question the success.

Participants also used the representativeness heuristic frequently, with over two-thirds of participants (15 out of 18) applying it in at least one response. This heuristic involves using easily processed information to determine whether an object belongs to a given class (Tversky and Kahneman, 1974); if the object is judged to belong, a decision is then made using properties of the class. In the context of this study, the representativeness heuristic involved participants evaluating a limited number of spectral features to determine if they corresponded to any of the many features they would expect from the target molecule. If participants found that selected features corresponded to some features expected from the target molecule, they judged their selected features to be an adequate representation of the expected spectra. They then made a decision about the synthesis using this judgement. When applying this heuristic, participants failed to evaluate one or more explicit spectral cues and ultimately stopped evaluating spectral data once they felt there was enough evidence to make a decision. Explicit spectral cues constituted prominent spectral features that provided some indication of each synthesis's success and that were evaluated by the majority of participants (e.g., the large IR peak corresponding to the OH functional group in Synthesis 3). This heuristic is useful given that it allows individuals to make a judgement when they lack necessary background information, as in Kim's correct response to Synthesis 1 below:

“And it looked good, like the numbers mostly worked out. I was going back over everything. Again, I'm not super confident on my ¹H NMR, but from what I knew it looked pretty good. So [I] decided that it had been synthesized.”

In this response, Kim judged the integration values and number of peaks in the NMR spectrum to be an adequate representation of the NMR spectrum that would be expected. She then used this judgement to correctly determine the synthesis's success. However, use of the heuristic became problematic when participants disregarded critical spectral features when determining representativeness, as in Chris's incorrect response to Synthesis 3:

“And then I just started looking again at the multiplet of integration value of one, because I didn't really know where that came from at first, and I still wasn't totally sure. I thought it could have something to do with the oxygen or proton transfer, but I wasn't really sure where it would come from, but I thought based on the other evidence I found that it was… I don't know. I could place six out of seven peaks and the IR matched up close enough that I thought it was a good representation of the molecule.”

In his response, Chris judged six out of seven ¹H NMR peaks and selected IR peaks to be an adequate representation of each expected spectrum while disregarding spectral features corresponding to a carboxylic acid functional group. He then used this judgement to incorrectly determine the success of the synthesis.

Like the representativeness heuristic, the reduction heuristic also involves reducing the amount of information to be processed. More specifically, the reduction heuristic involves eliminating cues that are shared among alternative options as information to process (McClary and Talanquer, 2011). When individuals used the reduction heuristic in this study, they explicitly chose to disregard certain spectral features they considered as either characteristic of more than one molecular feature or uncharacteristic of any particular molecular feature. For example, participants often chose to disregard absorption peaks in the fingerprint region of IR spectra as information to process given this region's complexity. Reduction of this information did not appear to inhibit participants’ reasoning. However, use of the reduction heuristic became problematic when participants failed to recognize spectral features that were characteristic of a molecular feature and then eliminated them as information to process. For example, Shelia failed to recognize the IR peak characteristic of the OH functional group in Synthesis 3 (one indication of the unsuccessful synthesis of the target molecule) and then disregarded its presence to incorrectly determine the synthesis was successful:

“I'm looking at that 3000 [cm⁻¹] peak, and I'm having a hard time piecing together what it might be. I think it might be an alkane, but it's not like a big functional group that we talked about a lot, like anything that's really special.”

Participants’ use of the reduction heuristic in both unproblematic and problematic ways aligns with the notion that although experts and novices both use heuristics, novices often lack knowledge of the appropriate contexts in which heuristics can be successfully applied (Kahneman and Klein, 2009).

One-reason decision making (ORDM) was the least frequently used heuristic in this group, with 10 participants using it approximately once each; however, its use had a noticeable influence on participants’ decision-making. This heuristic involves looking for one ‘clever’ cue and then using only this cue to make a decision; its use may entail searching for more than one cue, but the decision is ultimately made using a single feature (Gigerenzer and Gaissmaier, 2011). Participants in this study used ORDM as they assessed multiple spectral features during their evaluation but ultimately based their decision on just one feature. Use of the ORDM heuristic often involved participants relying on an invalid chemical assumption resulting from a generalization to make a decision, as Audrey's incorrect response to Synthesis 1 illustrates:

Audrey: And then I just decided they should all be quartets and if they weren't that wasn't what we had.

Interviewer: Okay. So it seems like you decided that before you looked at other pieces of information. What was your rationale for saying, “Okay, these don't match up. These should all be quartets. Let me look at the IR”?

Audrey: Yeah, so…I feel better about the IR kind of, and so if I could disprove it with the IR that would just like add to my confidence, I guess, with it.

Interviewer: Gotcha.

Audrey: But, then like going back I was like, “well they all do all have like those three [adjacent hydrogen atoms] so we should see quartets for all of them.” So I decided that was good enough.

During this evaluation, Audrey searched for additional features in the IR spectrum that could provide some indication of the synthesis's success; however, she based her decision only on an unexpected ¹H NMR splitting pattern. This example illustrates that ORDM becomes problematic when relevant background knowledge is lacking, or in this case when it co-occurs with an invalid assumption.

Affective judgements. Only one heuristic identified in over 20% of participants’ responses fell under the category of affective judgements (Table 2). This heuristic, termed affect, involves relying on one's positive or negative impressions to make decisions. This heuristic facilitates decision-making given that relying on one's impressions is often easier than systematically evaluating the weight of several cues; however, such reasoning may result in illogical judgements (cf., Talanquer, 2014). A majority of participants (14 out of 18) applied this heuristic in at least one response, and its use took the form of individuals expressing positive or negative impressions of whether spectra corresponded with target molecules. In most cases, participants expressed positive or negative impressions about the data, but the impact of such impressions on their decisions could only be inferred. For instance, in Shelia's incorrect response to Synthesis 2, she expressed confusion regarding the large ¹H NMR peak corresponding to hydrogen atoms on the benzene ring:

“…at the 7 ppm, that's kind of tripping me up. I'm looking more at the zoomed in version again. I just, that benzene ring is really tripping me up and it may be symmetrical, it may not be but still, the multiplet still doesn't make sense to me.”

Whether this negative impression contributed to her incorrect decision is uncertain. However, the authors still coded such reasoning as the affect heuristic in order to over-estimate rather than under-estimate the influence of such impressions on decision-making. Nonetheless, some responses did demonstrate the direct influence of individuals’ impressions on their decision-making. For example, Robert expressed that he felt positively about the provided spectral data for Synthesis 1 and used this feeling to inform his decision:

“I chose yes because I guess that I feel like it was there in a small amount… So I could redo the NMR with a higher concentration to see if it was what I thought it was or not. And so I just kind of had a gut feeling that it was there.”

As illustrated in the description of qualitative findings, participants used a number of invalid chemical assumptions and heuristic reasoning strategies during spectral interpretation. Invalid chemical assumptions and heuristics constrained participants’ reasoning to various degrees, with some assumptions and heuristics appearing to result in the incorrect determination of each synthesis's success and others having no obvious impact on decision-making. Further, use of some heuristics was problematic in certain contexts and productive in others, like the use of the representativeness heuristic by Kim and Chris as described above.

Quantitative findings: identification of invalid chemical assumptions and heuristics that most severely constrained reasoning

Following the qualitative analysis, frequencies of responses containing invalid chemical assumptions and heuristics were analysed quantitatively in order to (1) determine if particular interpretation tasks disproportionately elicited certain invalid chemical assumptions or heuristics and (2) identify assumptions or heuristics that most severely constrained participants’ reasoning.

The extent to which interpretation tasks disproportionately elicited certain assumptions or heuristics was investigated in order to establish that participants’ reasoning was not dictated by problem-specific features. Demonstrating that assumptions and heuristics were used with similar distributions across a variety of tasks serves as a means to establish transferability of any findings to other IR and ¹H NMR spectral interpretation tasks for this study population, in particular to tasks involving molecules with a similar variety of functional groups. Frequencies of responses containing invalid chemical assumptions and heuristics for each interpretation task are provided in the Appendix (Tables 4 and 5, respectively). A two-sided Fisher's exact test was used to determine if certain interpretation tasks disproportionately elicited certain invalid chemical assumptions. The distributions of assumption frequencies varied significantly with interpretation task (p = 0.009, two-sided Fisher's exact test); however, when visuospatial invalid assumptions were omitted from the analysis, distributions of assumption frequencies did not vary significantly (p = 0.323, two-sided Fisher's exact test). This lack of significance suggests that the tasks included in this study may only disproportionately elicit visuospatial invalid assumptions, as evinced by the exclusive appearance of these assumptions in responses to Synthesis 2 (Appendix, Table 4). Although a Fisher's exact test is used in lieu of a Pearson's χ² test of independence when sample size is sufficiently small, a Pearson's χ² test of independence and post hoc residual analysis of invalid chemical assumption frequencies also revealed that the interpretation tasks only disproportionately elicited visuospatial invalid assumptions. Further, a Pearson's χ² test of independence was used to determine if certain interpretation tasks disproportionately elicited certain heuristics. 
The distribution of heuristic frequencies did not vary significantly with interpretation task (χ² = 9.03, p = 0.83), indicating that interpretation tasks did not disproportionately elicit certain heuristics (Appendix, Table 5). The fact that interpretation tasks only disproportionately elicited visuospatial invalid assumptions and no other assumptions or heuristics provides additional evidence that findings from this study are transferable to other similar spectral interpretation tasks for this study population.
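The Pearson's χ² test of independence reported above can be re-derived from the response frequencies in the Appendix (Table 5). The following is a minimal sketch rather than the authors' analysis script: the function names are hypothetical, and the p-value uses a closed-form expression for the χ² upper tail that is valid only when the degrees of freedom are even (here (8 − 1) × (3 − 1) = 14).

```python
from math import exp, factorial

# Response frequencies per heuristic for Syntheses 1, 2, and 3 (Appendix, Table 5).
TABLE_5 = {
    "Processing fluency": (18, 17, 17),
    "Associative activation": (16, 11, 15),
    "Generalization": (11, 8, 10),
    "Representativeness": (9, 2, 7),
    "Affect": (8, 6, 7),
    "Reduction": (4, 6, 10),
    "Rigidity": (6, 6, 4),
    "One-reason decision making": (6, 3, 2),
}


def chi_square_independence(rows):
    """Return (statistic, degrees of freedom) for Pearson's chi-square test."""
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            # Expected count under independence of heuristic and task.
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof


def chi_square_sf_even_dof(x, dof):
    """Upper-tail probability of a chi-square variable; even dof only."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("closed form requires a positive, even dof")
    half = x / 2
    return exp(-half) * sum(half ** k / factorial(k) for k in range(dof // 2))


statistic, dof = chi_square_independence(list(TABLE_5.values()))
p_value = chi_square_sf_even_dof(statistic, dof)
print(f"chi2 = {statistic:.2f}, dof = {dof}, p = {p_value:.2f}")
```

Under these assumptions the sketch reproduces the reported values (χ² ≈ 9.03, p ≈ 0.83). Several expected counts in the table fall below 5, which is the usual reason for preferring an exact test when samples are small.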

One-sided Fisher's exact tests were used to identify any assumptions or heuristics that tended to appear in incorrect responses and not in correct responses for each task. By identifying such assumptions or heuristics, the most problematic constraints on organic chemistry students’ reasoning could be identified. The p-values corresponding to all one-sided Fisher's exact tests are provided in Table 3. Significant p-values correspond to given assumptions or heuristics for which (1) the proportion of incorrect responses that used the assumption or heuristic is statistically greater than the proportion of incorrect responses that did not use the assumption or heuristic and (2) the proportion of correct responses that used an assumption or heuristic is statistically less than the proportion of correct responses that did not use the assumption or heuristic (Table 3). In other words, significant p-values correspond to assumptions or heuristics for which responding incorrectly was associated with whether the assumption or heuristic was used and responding correctly was associated with whether the assumption or heuristic was not used; significant p-values thus allow for identification of the assumptions and heuristics that tended to appear in incorrect responses and not in correct responses. Odds ratios were evaluated post hoc as a measure of effect size for statistically significant Fisher's exact tests (Table 3) (Sheskin, 2011). The odds ratio corresponds to the odds of using an assumption or heuristic and responding incorrectly versus the odds of using an assumption or heuristic and responding correctly. All odds ratios associated with significant p-values far exceeded the criterion for a large effect size (odds ratio > 6.71) (Ferguson, 2009; Chen et al., 2010), indicating a large effect of using particular assumptions or heuristics on the ultimate accuracy of one's response.
These large effect sizes were expected given that (1) a large effect would be necessary to result in statistically significant p-values for this relatively small sample size and (2) some assumptions and heuristics appeared exclusively in a majority of incorrect responses and not in correct responses.
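To make the test and effect-size calculation concrete, the sketch below computes a one-sided Fisher's exact p-value and an odds ratio for a single 2 × 2 contingency table. The counts are hypothetical and are not taken from the study, whose per-task contingency tables are not reported here. The function names and the optional `correction` argument are likewise illustrative assumptions; the text does not state how cells with zero counts were handled when computing odds ratios.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    a: heuristic used, incorrect      b: heuristic used, correct
    c: heuristic not used, incorrect  d: heuristic not used, correct
    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    used = a + b
    incorrect = a + c
    total = a + b + c + d
    upper = min(used, incorrect)
    favourable = sum(
        comb(used, k) * comb(total - used, incorrect - k)
        for k in range(a, upper + 1)
    )
    return favourable / comb(total, incorrect)


def odds_ratio(a, b, c, d, correction=0.0):
    """Cross-product odds ratio; `correction` is added to every cell.

    A correction of 0.5 is one common way to avoid division by zero, but
    whether the study used it is an assumption, not something the text says.
    """
    a, b, c, d = (x + correction for x in (a, b, c, d))
    return (a * d) / (b * c)


# Hypothetical counts for 18 responses to one task (illustration only).
p_value = fisher_one_sided(5, 1, 1, 11)
effect = odds_ratio(5, 1, 1, 11)
print(f"p = {p_value:.4f}, odds ratio = {effect:.1f}")
print("large effect" if effect > 6.71 else "below large-effect criterion")
```

For these made-up counts the heuristic appears in five of six incorrect responses and in one of twelve correct ones, so the odds ratio of 55 falls well above the 6.71 criterion cited in the text.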

Table 3 The p-values and odds ratios corresponding to one-sided Fisher's exact tests for assumptions and heuristics associated with incorrect responses

Assumptions and heuristics	Fisher's exact test p-values			Odds ratios
	Synthesis 1	Synthesis 2	Synthesis 3	Synthesis 1	Synthesis 2	Synthesis 3
Assumptions
Assumptions that the “N + 1 rule” should hold	0.002**	<0.001***	0.792	49.4^†	84.3^†	0.9
Assumptions that spectral data should be absolute	n/a^a	0.001**	0.999	n/a^a	41.4^†	0.2
Practical invalid assumptions	0.999	0.353	0.099	0.5	6.3	7.0
Visuospatial invalid assumptions	n/a^a	0.145	n/a^a	n/a^a	4.4	n/a^a
Fundamental invalid assumptions	0.999	0.999	0.299	0.3	0.2	3.3

Heuristics
Processing fluency	n/a^b	n/a^b	0.765	n/a^b	n/a^b	1.1
Associative activation	0.890	0.928	0.421	0.5	0.4	3.0
Generalization	0.017*	<0.001***	0.985	21.7^†	91.0^†	0.2
Representativeness	0.999	0.999	0.015*	0.0	0.3	27.0^†
Affect	0.841	0.841	0.999	0.7	2.4	0.1
Reduction	0.999	0.925	0.080	0.1	0.5	10.4
Rigidity	<0.001***	<0.001***	0.999	299.0^†	84.3^†	0.2
One-reason decision making	<0.001***	0.029*	0.999	299.0^†	23.0^†	0.5

*Corresponds to significance at the p < 0.05 level, **corresponds to significance at the p < 0.01 level, and ***corresponds to significance at the p < 0.001 level. †Corresponds to odds ratios associated with significant p-values. All odds ratios associated with significant p-values far exceeded the criterion for a large effect size (odds ratio > 6.71). ^a n/a corresponds to contingency tables with frequencies of zero in all cells. ^b n/a corresponds to contingency tables in which all frequencies in one row or column were zero.

For Syntheses 1 and 2, incorrect responses contained assumptions that the “N + 1 rule” should hold, as well as use of the generalization, rigidity, and one-reason decision making (ORDM) heuristics at a statistically significant level (Table 3). In addition, incorrect responses to Synthesis 2 contained assumptions that spectral data should be absolute (Table 3). The presence of both assumptions and heuristics at a statistically significant level aligns with research demonstrating that problematic reasoning among students is a product of multiple factors including heuristics and not simply misconceptions (Cooper et al., 2013). This finding also aligns with research stating that sets of assumptions and heuristics constitute fundamental constraints to learning (Talanquer, 2009). Incorrect responses to Synthesis 3 only contained the representativeness heuristic at a statistically significant level, possibly because of the smaller number of incorrect responses to this task. For this task, practical invalid assumptions and the reduction heuristic had p-values close to the 0.05 criterion for statistical significance (0.099 and 0.080, respectively), but it is uncertain whether a larger number of incorrect responses would have resulted in significant p-values.

Further, assumptions and heuristics identified through statistical analysis did not appear in isolation from one another but rather co-occurred in incorrect responses, as Fig. 3 illustrates. Combinations of assumptions and heuristics in responses to Syntheses 1 and 2 took on a similar form, where participants first expressed an invalid chemical assumption resulting from a generalization, demonstrated rigidity with respect to this generalization, and then ultimately made a decision using only the spectral feature which violated their invalid assumption. A number of excerpts provided in the qualitative findings section illustrate these combinations, as does Frances’ previously described response to Synthesis 1.


	Fig. 3 Co-occurrence of assumptions and heuristics in incorrect responses. The total number of incorrect responses (N) to a given task is indicated at the bottom of each corresponding Venn diagram. The number of incorrect responses containing particular assumptions and heuristics (n) is indicated in corresponding intersections of each diagram. Incorrect responses to Synthesis 2 contained both assumptions that the “N + 1 rule” should hold and assumptions that spectral data should be absolute at a statistically significant level; of these responses, two contained only assumptions that the “N + 1 rule” should hold, one contained only assumptions that spectral data should be absolute, and three contained both types of assumptions.

As individuals make decisions, they typically identify cues and then assess their weight, or importance (Shah and Oppenheimer, 2008). Participants who held these invalid, rule-based assumptions possibly found spectral features which violated such assumptions to be accessible, highly weighted cues. The significant weight of such cues then potentially resulted in the use of ORDM and rigidity, both of which are effort-reduction heuristics, to facilitate decision-making.

Combinations of assumptions and heuristics in incorrect responses to Synthesis 3 took the form of participants making a practical invalid assumption regarding the large IR peak near 3000 cm⁻¹, subsequently reducing this IR peak as information to process, and then failing to evaluate other significant spectral data before making a decision. Madelyn's evaluation of Synthesis 3 illustrates this combination. Madelyn first correctly identified the IR peak near 3000 cm⁻¹ as corresponding to the OH functional group:

“I started looking at the broadest [IR] peak first because usually I correspond it to an OH, so right off the bat I kind of felt that it wasn't getting synthesized.”

However, she then judged the NMR spectrum to be an adequate representation of the molecule while failing to evaluate chemical shift values that provided additional evidence of an OH functional group. She ultimately ended her evaluation by incorrectly rationalizing that the IR peak near 3000 cm⁻¹ could instead correspond to the CH functional group:

“At this point I went back to the IR [spectrum] to maybe see if that peak [near 3000 cm⁻¹] could represent something else, cause usually that's an OH group. So I went back to the table and saw that also CH sometimes is a medium intensity. I was thinking maybe that would be a CH, especially if it's like an aldehyde….”

This invalid assumption facilitated Madelyn's reduction of the IR peak as information to process and, when combined with the representativeness heuristic used to evaluate the NMR spectrum, resulted in her incorrect response.

From the quantitative analysis, assumptions and heuristics that most severely constrained reasoning appear to vary somewhat with spectra and thus depend upon context. For two of the three interpretation tasks (Syntheses 1 and 2), incorrect responses contained a combination of assumptions that the “N + 1 rule” should hold and the generalization, rigidity, and ORDM heuristics at a statistically significant level. For one of these two tasks (Synthesis 2), assumptions that spectral data should be absolute also appeared in incorrect responses at a statistically significant level. For one of the three tasks (Synthesis 3), incorrect responses contained practical invalid assumptions and the reduction and representativeness heuristics, though only the representativeness heuristic appeared at a statistically significant level. As noted above, spectral interpretation tasks did not disproportionately elicit these assumptions or heuristics among the study population, suggesting that use of these particular assumptions and heuristics is a reflection of severely constrained reasoning and not reasoning used by all students. However, it is uncertain whether there are additional combinations of such constraints given that the use of these combinations appears to depend somewhat on context. Additional investigations are needed to characterize any further severely constrained reasoning.

Conclusions

This study investigated constraints on organic chemistry students’ reasoning during IR and ¹H NMR spectral interpretation, in particular the invalid chemical assumptions and heuristic reasoning strategies used by students when evaluating the success of chemical syntheses using spectral data. A mixed-methods approach with a conversion design was used to first qualitatively characterize the invalid chemical assumptions and heuristic reasoning strategies used by study participants during spectral interpretation. Themes among invalid chemical assumptions were identified in order to more comprehensively characterize participants’ reasoning. Frequencies of responses containing given assumptions and heuristics were then analysed quantitatively to identify assumptions and heuristics that most severely constrained reasoning. Findings from both analyses provide insight into reasoning that may in part represent the lower anchor of a learning progression on spectral interpretation.

Findings from the initial qualitative analysis provide insight into which assumptions and heuristics organic chemistry students use during spectral interpretation. For this analysis, 20 invalid chemical assumptions were identified in participants’ responses. Five themes emerged from these invalid chemical assumptions that more comprehensively illustrate constraints on participants’ reasoning during these tasks: (1) assumptions that the “N + 1 rule” should hold, (2) assumptions that spectral data should be absolute, (3) visuospatial invalid assumptions, (4) practical invalid assumptions, and (5) fundamental invalid assumptions. These themes were validated by external experts who provided insight into the validity and transferability of findings to other interpretation tasks and instructional contexts. Eight heuristic reasoning strategies were also identified during this initial qualitative analysis, all of which fall into one of three categories of heuristic reasoning described by Talanquer (2014): (1) fundamental associative processes, (2) inductive judgements, and (3) affective judgements. Heuristic reasoning strategies constrained participants’ reasoning to various degrees, with some heuristics appearing to result in the incorrect determination of each synthesis's success (e.g., one-reason decision making and rigidity) and others having no obvious impact on decision-making (e.g., processing fluency). Further, use of some heuristics appeared problematic in certain contexts and supportive of correct decision-making in others (e.g., representativeness and affect).

The quantitative analysis of invalid chemical assumption and heuristic frequencies provided insight into cognitive elements that most severely constrained participants’ reasoning. These assumptions and heuristics are those that tended to appear in incorrect responses and not in correct responses. While some constraints also appeared in correct responses (e.g., invalid chemical assumptions that ultimately did not influence decision-making), incorrect responses represent an extreme case of constrained reasoning. From this quantitative analysis, incorrect responses more often contained assumptions that the “N + 1 rule” should hold, assumptions that spectral data should be absolute, and the generalization, rigidity, ORDM, and representativeness heuristics when compared to correct responses. The prevalence of rule-based assumptions and effort-reduction heuristics in incorrect responses may have resulted from less engagement with optional practice problems in the provided coursepack, in particular since these problems often included unexpected spectral features; however, we do not have data to support this claim. Further, these assumptions and heuristics tended to occur in combination with one another, as Fig. 3 illustrates; in other words, the use of both assumptions and heuristics appears to result in incorrect responses rather than the use of assumptions or heuristics in isolation. This co-occurrence aligns with previous research on students’ reasoning in chemistry which states that problematic reasoning is not just a collection of misconceptions but a combination of multiple factors including heuristics (Cooper et al., 2013) and that sets of particular assumptions and heuristics constitute the fundamental constraints to learning (Talanquer, 2009).

Limitations

The design of interpretation tasks had inherent limitations. Integration values and multiplicities were provided for all ¹H NMR resonances in order to investigate students’ reasoning and not their ability to distinguish between individual peaks. In addition, all provided spectra were free of signals resulting from solvent or impurities in order to reduce participants’ cognitive load. By using these clean, labelled spectra, any potential invalid chemical assumptions relating to the interpretation of more authentic spectra (i.e., those containing unlabelled multiplicities, integration values, peaks due to solvent or impurities, etc.) were not elicited. Two of five external experts stated that such assumptions were held by their own students but were not captured by this study. Findings from this study may therefore only partially transfer to the interpretation of more authentic spectra.

Moreover, participants only completed three interpretation tasks due to fatigue reported by pilot study members. Although interpretation tasks included a variety of spectral features in order to elicit a range of reasoning from participants, it is possible that some invalid chemical assumptions or heuristic reasoning strategies were not captured given this small number of tasks. Further, Pearson's χ² test of independence and Fisher's exact test were used to establish that tasks did not disproportionately elicit certain assumptions or heuristics. However, combinations of assumptions and heuristics appearing in incorrect responses did tend to vary with context. Further investigations are therefore needed to characterize any additional combinations.

In addition to not eliciting all possible reasoning, some potentially problematic reasoning was not coded as an invalid chemical assumption or heuristic because coding it would have required significant inference by the authors. For instance, while some participants stated that OH and NH hydrogens should appear as singlets, it was unclear if these participants invalidly assumed that these hydrogens always appear as singlets regardless of solvent effects. Similarly, some participants may have used certain heuristics subconsciously (McClary and Talanquer, 2011); however, such use was not captured. One exception was made for the affect heuristic, which was coded regardless of whether participants’ emotions directly influenced their ultimate decision. The prevalence of certain assumptions or heuristics may therefore have been underestimated for this study. It is also possible that participants failed to verbalize or remember their reasoning during RTA interviews; however, this limitation was mitigated by the interviewer's use of probing questions and students’ ability to pause the eye movement recording during the interview and reflect on their thinking.

Lastly, the study population was a convenience sample and therefore may not reflect reasoning used by all organic chemistry students, in particular those in other instructional contexts. The review of invalid chemical assumption themes by external experts contributes in part to the transferability of the findings to other instructional contexts. In addition, we have included a detailed description of the instructional setting and data in the form of participant quotes to allow for judgements regarding transfer.

Implications for teaching and research

Implications for teaching

Findings from this study provide additional evidence that problematic reasoning among students is not solely a product of any misconceptions they hold, but rather a combination of their underlying assumptions and heuristics. If instruction is to foster students’ ability to interpret spectra, it must therefore explicitly address these assumptions as well as actively promote students’ shift from Type 1 to Type 2 thinking. There are a number of promising strategies for shifting decision makers from Type 1 to Type 2 thinking (Lerner and Tetlock, 1999; Milkman et al., 2009); for instance, research in cognitive psychology suggests that prompting individuals to “consider the opposite” of any decision they are about to make can promote Type 2 thinking and correct for decision biases resulting from the use of heuristics (Mussweiler et al., 2000). This research also suggests that having individuals assess the rightness of their decision promotes Type 2 thinking, where individuals with low feelings of rightness demonstrate increased rethinking times and increased probability of answer change (Thompson et al., 2011). Further, research in chemistry education demonstrates that having chemistry students predict how incorrect students may respond to given questions positively influences performance on these questions (Talanquer, 2017); this finding further suggests that even simple interventions may help students spend more time evaluating cues and reflecting on their decisions. Instructors could incorporate any of these strategies into course materials on spectral interpretation (e.g., clicker questions, practice problems, exam questions, etc.) with minimal effort.

In addition to these easily adopted strategies, research on cognitive biases in medicine offers targeted approaches for promoting Type 2 thinking (Croskerry, 2003). One of these approaches involves providing practitioners with detailed descriptions of common, problematic heuristics along with several clinical examples that illustrate how their use negatively impacts decision-making (Croskerry, 2003). Transferring this approach to instruction on spectral interpretation would involve instructors providing students with detailed descriptions of common, problematic heuristics along with example spectra that illustrate how the use of each heuristic results in erroneous decision-making. Findings from this study provide insight into the most problematic heuristics used by organic chemistry students and may thus inform such instruction.

Implications for research

To the best of the authors’ knowledge, this is the first study to use RTA interviewing in combination with eye tracking to collect qualitative data on students’ reasoning in chemistry. The abundance and complexity of assumptions and heuristics captured using this data collection method indicate that it serves as a valid and promising tool to investigate students’ reasoning for complex chemistry tasks, in particular those for which standard think-aloud techniques may overburden participants’ cognitive load.

In addition, findings from this study lay groundwork for the development of a learning progression on spectral interpretation. This study focused only on reasoning at the lower anchor, in particular the invalid assumptions and heuristics used at this level. Additional studies are therefore needed to characterize the valid assumptions that guide students’ reasoning at the lower anchor as well as how knowledge develops beyond this level. Further, findings demonstrate that problematic reasoning is not only a product of misconceptions but also of heuristic reasoning strategies. Research to characterize students’ reasoning in chemistry should therefore extend beyond generating inventories of misconceptions.

Conflicts of interest

There are no conflicts to declare.

Appendix


	Fig. 4 Spectral interpretation task (Synthesis 2) asking participants to determine if isochroman was successfully synthesized using the provided IR spectrum and ¹H NMR spectrum.


	Fig. 5 Spectral interpretation task (Synthesis 3) asking participants to determine if 3-(allyloxy)propanal was successfully synthesized using the provided IR spectrum and ¹H NMR spectrum.

Table 4 Invalid chemical assumption themes, the number of participants who used corresponding assumptions at least once (n = 18), and frequencies of responses containing corresponding assumptions. The distributions of frequencies vary significantly with interpretation task (p = 0.009, two-tailed Fisher's exact test). However, when visuospatial invalid assumptions are omitted, distributions of frequencies do not vary significantly with task (p = 0.323, two-tailed Fisher's exact test); this lack of significance suggests that tasks may only disproportionately elicit visuospatial invalid assumptions

Invalid chemical assumptions	Participants	Synthesis 1	Synthesis 2	Synthesis 3	Total responses
Assumptions that the “N + 1 rule” should hold	13	8	6	5	19
Assumptions that spectral data should be absolute	9	1	5	4	10
Practical invalid assumptions	8	2	1	6	9
Visuospatial invalid assumptions	8	0	8	0	8
Fundamental invalid assumptions	6	2	3	3	8

Table 5 Heuristics identified in at least 20% of RTA interview transcripts, the number of participants who used corresponding heuristics at least once (n = 18), and frequencies of responses containing corresponding heuristics. The distribution of heuristic frequencies did not vary significantly with interpretation task (χ² = 9.03, p = 0.83).

Heuristic	Participants	Synthesis 1	Synthesis 2	Synthesis 3	Total responses
Processing fluency	18	18	17	17	52
Associative activation	17	16	11	15	42
Generalization	16	11	8	10	29
Representativeness	15	9	2	7	18
Affect	14	8	6	7	21
Reduction	14	4	6	10	20
Rigidity	11	6	6	4	16
One-reason decision making	10	6	3	2	11
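
The test of independence reported in the Table 5 caption can be illustrated with a short recomputation from the tabulated counts. The following is a minimal sketch, not the authors' analysis script (the reference list cites R for statistical computing): it assumes a Pearson chi-square test on the 8 × 3 table of response frequencies with no continuity correction, and the names `TABLE_5_COUNTS`, `chi_square_independence` and `chi_square_sf_even_dof` are hypothetical helpers introduced here for illustration.

```python
import math

# Response counts copied from Table 5 (columns: Synthesis 1, 2, 3).
TABLE_5_COUNTS = {
    "Processing fluency": (18, 17, 17),
    "Associative activation": (16, 11, 15),
    "Generalization": (11, 8, 10),
    "Representativeness": (9, 2, 7),
    "Affect": (8, 6, 7),
    "Reduction": (4, 6, 10),
    "Rigidity": (6, 6, 4),
    "One-reason decision making": (6, 3, 2),
}


def chi_square_independence(rows):
    """Return (statistic, degrees of freedom) for Pearson's chi-square test."""
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            # Expected count under independence of heuristic and task.
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof


def chi_square_sf_even_dof(x, dof):
    """Upper-tail probability P(X > x); this closed form holds only for even dof."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("closed form requires a positive, even dof")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, dof // 2):
        term *= half / i  # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


chi2, dof = chi_square_independence(list(TABLE_5_COUNTS.values()))
p_value = chi_square_sf_even_dof(chi2, dof)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.2f}")
```

Under these assumptions the statistic and p-value round to the values given in the caption (χ² = 9.03, p = 0.83) with 14 degrees of freedom. The closed-form tail probability is used only to keep the sketch free of third-party dependencies; a statistics library would normally be preferred.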

References

Alexander C. W., Asleson G. L., Doig M. T. and Heldrich F. J., (1999), Spectroscopic instruction in introductory organic chemistry: results of a national survey, J. Chem. Educ., 76(9), 1294–1296.
Auchincloss L. C., Laursen S. L., Branchaw J. L., Eagan K., Graham M., Hanauer D. I., et al., (2014), Assessment of course-based undergraduate research experiences: a Meeting Report, CBE-Life Sci. Educ., 13, 29–40.
Azman A. M. and Esteb J. J., (2016), A coin-flipping analogy and web app for teaching spin–spin splitting in ¹H NMR spectroscopy, J. Chem. Educ., 93(8), 1478–1482.
Bowen C. W. and Bodner G. M., (1991), Problem solving processes used by students in organic synthesis, Int. J. Sci. Educ., 13, 143–158.
Bruice P. Y., (2011), Organic Chemistry, 6th edn, Upper Saddle River: Prentice Hall.
Cartrette D. P. and Bodner G. M., (2010), Non-mathematical problem solving in organic chemistry, J. Res. Sci. Teach., 47(6), 643–660.
Chen H., Cohen P. and Chen S., (2010), How big is a big odds ratio? Interpreting the magnitudes of odds ratios in epidemiological studies, Commun. Stat. Simul. Comput., 39(4), 860–864.
Chi M. T. H., Feltovich P. J. and Glaser R., (1979), Categorization and representation of physics problems by experts and novices, Cognit. Sci., 5, 121–152.
Cohen L., Manion L. and Morrison K., (2011), Research methods in education, 7th edn, London: Routledge.
Connor M. C. and Shultz G. V., (2018), Teaching assistants’ topic-specific pedagogical content knowledge in ¹H NMR spectroscopy, Chem. Educ. Res. Pract., 19(3), 653–669.
Cooper M. M., Corley L. M. and Underwood S. M., (2013), An investigation of college chemistry students’ understanding of structure–property relationships, J. Res. Sci. Teach., 50(6), 699–721.
Creswell J. W. and Poth C. N., (2018), Qualitative inquiry and research design, 4th edn, Los Angeles: Sage.
Croskerry P., (2003), The importance of cognitive errors in diagnosis and strategies to minimize them, Acad. Med., 78(8), 775–780.
Cullipher S. and Sevian H., (2015), Atoms versus Bonds: How Students Look at Spectra, J. Chem. Educ., 92(12), 1996–2005.
Debska B. and Guzowska-Swider B., (2007), Molecular structures from ¹H NMR spectra: education aided by internet programs, J. Chem. Educ., 84(3), 556–560.
diSessa A. A., (1993), Toward an epistemology of physics, Cognit. Instruct., 10(2), 105–225.
Erhart S. E., McCarrick R. M., Lorigan G. A. and Yezierski E. J., (2016), Citrus quality control: an NMR/MRI problem-based experiment, J. Chem. Educ., 93(2), 335–339.
Evans J. S. B. T. and Stanovich K. E., (2013), Dual-process theories of higher cognition: advancing the debate, Perspect. Psychol. Sci., 8(3), 223–241.
Ferguson C. J., (2009), An effect size primer: a guide for clinicians and researchers, Prof. Psychol. Res. Pract., 40(5), 532–538.
Galloway K. R., Leung M. W. and Flynn A. B., (2018), A comparison of how undergraduates, graduate students, and professors organize organic chemistry reactions, J. Chem. Educ., 95(3), 355–365.
Gelman S. A., (2009), Learning from others: children's construction of concepts, Annu. Rev. Psychol., 60, 115–140.
Gigerenzer G. and Gaissmaier W., (2011), Heuristic decision making, Annu. Rev. Psychol., 62, 451–482.
Graham K. J., McIntee E. J. and Schaller C. P., (2016), Web-based 2D NMR spectroscopy practice problems, J. Chem. Educ., 93(8), 1483–1485.
Graulich N., (2015), Intuitive judgments govern students’ answering patterns in multiple-choice exercises in organic chemistry, J. Chem. Educ., 92(8), 205–211.
Guan Z., Lee S., Cuddihy E. and Ramey J., (2006), The validity of the stimulated retrospective think-aloud method as measured by eye tracking, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’06), ACM Press, pp. 1253–1262.
Harle M. and Towns M., (2011), A Review of spatial ability literature, its connection to chemistry, and implications for instruction, J. Chem. Educ., 88(3), 351–360.
Heckler A. F., (2011), The ubiquitous patterns of incorrect answers to science questions: the role of automatic, bottom-up processes, in Mestre J. P. and Ross B. H. (ed.), Psychology of Learning and Motivation, London: Academic Press, vol. 55, pp. 227–267.
Hyrskykari A., Ovaska S., Majaranta P., Räihä K.-J. and Lehtinen M., (2008), Gaze path stimulation in retrospective think-aloud, J. Eye Mov. Res., 2(4), 1–18.
Just M. A. and Carpenter P. A., (1980), A theory of reading: from eye fixations to comprehension, Psychol. Rev., 87(4), 329–354.
Kahneman D. and Klein G., (2009), Conditions for intuitive expertise: a failure to disagree, Am. Psychol., 64(6), 515–526.
Lawson R., (2004), Small sample confidence intervals for the odds ratio, Commun. Stat. Part B: Simul. Comput., 33(4), 1095–1113.
Lerner J. and Tetlock P., (1999), Accounting for the effects of accountability, Psychol. Bull., 125, 255–275.
Livengood K., Lewallen D. W., Leatherman J. and Maxwell J. L., (2012), The use and evaluation of scaffolding, student centered-learning, behaviorism, and constructivism to teach nuclear magnetic resonance and IR spectroscopy in a two-semester organic chemistry course, J. Chem. Educ., 89(8), 1001–1006.
Maeyer J. and Talanquer V., (2013), Making predictions about chemical reactivity: assumptions and heuristics, J. Res. Sci. Teach., 50(6), 748–767.
Margolis E., (1994), A reassessment of the shift from the classical theory of concepts to prototype theory, Cognition, 51(1), 73–89.
McClary L. and Talanquer V., (2011), Heuristic reasoning in chemistry: making decisions about acid strength, Int. J. Sci. Educ., 33(10), 1433–1454.
Miles M. B. and Huberman A. M., (1994), Qualitative Data Analysis, 2nd edn, Thousand Oaks: Sage Publications.
Milkman K. L., Chugh D. and Bazerman M. H., (2009), How can decision making be improved? Perspect. Psychol. Sci., 4(4), 379–383.
Morewedge C. K. and Kahneman D., (2010), Associative processes in intuitive judgment, Trends Cognit. Sci., 14(10), 435–440.
Murphy G. L. and Medin D. L., (1985), The role of theories in conceptual change, Psychol. Rev., 92(3), 289–316.
Mussweiler T., Strack F. and Pfeiffer T., (2000), Overcoming the Inevitable Anchoring Effect: Considering the Opposite Compensates for Selective Accessibility, Personal. Soc. Psychol. Bull., 26(9), 1142–1150.
Osman M. and Stavy R., (2006), Development of intuitive rules: evaluating the application of the dual-system framework to understanding children's intuitive reasoning, Psychon. Bull. Rev., 13(6), 935–953.
Parmentier L. E., Lisensky G. C. and Spencer B., (1998), A guided inquiry approach to NMR spectroscopy, J. Chem. Educ., 75(4), 470–471.
Pavia D. L., Lampman G. M., Kriz G. S. and Vyvyan J. R., (2015), Introduction to spectroscopy, 5th edn, Stamford: Cengage Learning.
Raker J. R. and Towns M. H., (2012), Problem types in synthetic organic chemistry research: implications for the development of curricular problems for second-year level organic chemistry instruction, Chem. Educ. Res. Pract., 13(3), 179–185.
R Core Team, (2018), R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, https://www.R-project.org/.
Saldaña J., (2016), The coding manual for qualitative researchers, Los Angeles: Sage Publications.
SDBSWeb, (1997), Spectral database for organic compounds, National Institute of Advanced Industrial Science and Technology, http://sdbs.db.aist.go.jp (accessed 24 January 2018).
Shah A. K. and Oppenheimer D. M., (2008), Heuristics made easy: an effort-reduction framework, Psychol. Bull., 134(2), 207–222.
Sheskin D. J., (2011), Handbook of parametric and nonparametric statistical procedures, 5th edn, Boca Raton: CRC Press.
Smith C. L., Wiser M., Anderson C. W. and Krajcik J., (2006), Focus article: Implications of research on children's learning for standards and assessment: a proposed learning progression for matter and the atomic-molecular theory, Meas. Interdiscip. Res. Perspect., 4(1–2), 1–98.
Stains M. and Talanquer V., (2008), Classification of chemical reactions: stages of expertise, J. Res. Sci. Teach., 45(7), 771–793.
Taber K. S., (2009), College students’ conceptions of chemical stability: the widespread adoption of a heuristic rule out of context and beyond its range of application, Int. J. Sci. Educ., 31(10), 1333–1358.
Talanquer V., (2006), Commonsense chemistry: a model for understanding students’ alternative conceptions, J. Chem. Educ., 83(5), 811.
Talanquer V., (2009), On cognitive constraints and learning progressions: the case of “structure of matter”, Int. J. Sci. Educ., 31(15), 2123–2136.
Talanquer V., (2014), Chemistry education: ten heuristics to tame, J. Chem. Educ., 91(8), 1091–1097.
Talanquer V., (2017), Concept Inventories: Predicting the Wrong Answer May Boost Performance, J. Chem. Educ., 94, 1805–1810.
Thompson V. A., Turner J. A. P. and Pennycook G., (2011), Intuition, reason, and metacognition, Cognit. Psychol., 63(3), 107–140.
Tobii Technology, (2018), Tobii Pro Studio, https://www.tobiipro.com/product-listing/tobii-pro-studio/ (accessed 18 December 2018).
Topczewski J. J., Topczewski A. M., Tang H., Kendhammer L. K. and Pienta N. J., (2017), NMR spectra through the eyes of a student: eye tracking applied to NMR items, J. Chem. Educ., 94(1), 29–37.
Tversky A. and Kahneman D., (1974), Judgment under Uncertainty: Heuristics and Biases, Science, 185(4157), 1124–1131.
van Gog T., Paas F., van Merriënboer J. J. G. and Witte P., (2005), Uncovering the problem-solving process: cued retrospective reporting versus concurrent and retrospective reporting, J. Exp. Psychol. Appl., 11(4), 237–244.
Veeraraghavan S., (2008), NMR spectroscopy and its value: a primer, J. Chem. Educ., 85(4), 537–540.
Vosegaard T., (2018), ISpec: a web-based activity for spectroscopy teaching, J. Chem. Educ., 95(1), 97–103.
Vosniadou S., (1994), Capturing and modeling the process of conceptual change, Learn. Instruct., 4(1), 45–69.
Wu H. K. and Shah P., (2004), Exploring visuospatial thinking in chemistry learning, Sci. Educ., 88(3), 465–492.
