Evaluating students’ learning gains, strategies, and errors using OrgChem101's module: organic mechanisms—mastering the arrows

Myriam S. Carle, Rebecca Visser and Alison B. Flynn*
Department of Chemistry & Biomolecular Sciences, University of Ottawa, Ottawa, Ontario, Canada. E-mail: Alison.flynn@uOttawa.ca

Received 28th November 2019, Accepted 23rd January 2020

First published on 31st January 2020

We developed an online learning module called “Organic Mechanisms: Mastering the Arrows” to help students learn part of organic chemistry's language—the electron-pushing formalism. The module guides students to learn and practice the electron-pushing formalism using a combination of interactive videos, questions with instant feedback, and metacognitive skill-building opportunities. This module is part of http://OrgChem101.com, an open educational resource (OER) that houses a series of learning modules. To evaluate the mechanism module's effects on students’ learning and experiences, we offered a workshop during which undergraduate students used the module. We investigated their learning gains via a pre-test and post-test format and their experiences using a survey. Analysis of responses revealed significant learning gains between the pre- and post-test, especially with questions that asked students to draw the products of a reaction. After using the learning tool, students used more analysis strategies, such as mapping, attempted more questions, and made fewer errors. The students reported positive experiences and a belief that the module would help them in their organic chemistry courses. Previous work also identified greater metacognitive skills after using the module, related to the module's intended learning outcomes. Herein, we describe the module, evaluation study, findings, and implications for research and practice.


Translating between the molecular and representational levels: chemistry's language

A cohesive understanding in science, technology, engineering, and mathematics (STEM) relies on making sense of processes that are often invisible, operate at many scales (e.g., size), and are depicted with specialized representations (Johnstone, 1982, 2000; Kozma and Russell, 1997; Gilbert, 2008; Talanquer, 2011). Learners often do not know which aspects of these representations are significant, much less how to develop their mental models, translate between representations, and apply their knowledge to solve complex problems (Johnstone, 2000; Bhattacharyya and Bodner, 2005; Russ et al., 2008; Cheng and Gilbert, 2009; Gilbert and Treagust, 2009; Grove et al., 2012; National Research Council (NRC), 2012a, 2012b; Weinrich and Talanquer, 2015; Becker et al., 2016; Cooper et al., 2016; Caspari et al., 2017; Weinrich and Sevian, 2017; Vasilyeva et al., 2018; Bodé et al., 2019; Moreira et al., 2019). The situation is akin to working in a second language, in which many symbols, representations, and tools function as STEM's words, grammar, and syntax (Taber, 2009; Taskin and Bernholt, 2014). The electron-pushing formalism (EPF) forms part of organic chemistry's language, using curved arrows to represent electron flow in reaction mechanisms (Fig. 1). These curved arrows start at electrons (non-bonding electrons or a bond) and point toward an electron-deficient atom. Despite the formalism's use as a tool by experts (Bhattacharyya, 2013), students encounter a number of challenges using the EPF (Ferguson and Bodner, 2008; Craig et al., 2012; Grove and Lowery Bretz, 2012; Bhattacharyya, 2014; Anzovino and Lowery Bretz, 2015; Graulich, 2015).
Fig. 1 Learning outcomes and key principles associated with the electron-pushing formalism.

To support student mastery of and fluency in the EPF, we created the “Organic Mechanisms: Mastering the Arrows” module in http://OrgChem101.com (Flynn and Visser, 2018). The module addresses four learning outcomes (LOs, Fig. 1): (1) draw the electron-pushing arrows, given the starting materials and products of the elementary steps, (2) draw the products of a reaction step, given the starting materials and electron-pushing arrows, (3) draw the transition state structure for a reaction step, and (4) draw the reverse reaction mechanism, given the elementary steps in the forward direction (Flynn and Ogilvie, 2015). These learning outcomes are designed to help students gain fluency that they can leverage when they are learning more advanced concepts of reactivity.

The organic reaction mechanisms module begins with a “Get started” section that includes a self-assessment as part of metacognitive skill-building (Brown et al., 2014), a pre-test, a comparison of those two assessments with a prompt asking students to decide what to do next for their learning, and an introductory video. Each learning outcome section involves interactive videos and activities with feedback. The module finishes with a “Wrap-up” section that has a self-assessment, post-test, and summary. All sections are aligned with the module's intended learning outcomes, and the module can be used in any curriculum type, including traditional and transformed (Flynn and Ogilvie, 2015).

The module introduces strategies for analyzing questions, such as expanding or redrawing the structure, mapping the atoms and electrons involved in the reaction step, and building a model. Expanding the structure includes drawing all non-bonding electrons (lone pairs) on heteroatoms involved in the reaction steps; doing so helps avoid errors related to implicit atoms and electrons, including pentavalent carbon atoms. Mapping involves keeping track of electrons and atoms from the starting materials to the products, usually by numbering or lettering atoms and electrons, though it can take other forms, such as circling atoms, drawing geometrical shapes, or highlighting atoms or electrons. Mapping is particularly useful for determining the bonds broken and formed during a reaction step and can help students avoid mistakes such as missing or misplaced atoms. Building a model can be used to examine the molecule in various conformations, including the reactive one, and to draw products that have more complex stereochemical information. More details about the module are available in the module itself and in an earlier publication (Visser and Flynn, 2018).

The module was designed based on existing literature, including an analysis of thousands of isolated EPF exam questions, which found that students who were taught and assessed on those learning outcomes drew few reversed arrows (e.g., atom to electrons) and few pentavalent atoms, and scored higher on Draw arrows (LO1) questions than on Draw products (LO2) questions; lower scores were correlated with questions involving implicit atoms and electrons, intramolecular reaction steps, and reactants drawn in conformations differing from the reactive one (Flynn and Featherstone, 2017). A follow-up study used an interview format to investigate students’ meaning making when analyzing EPF questions, finding that all participants analyzed electron movement and leveraged their prior knowledge while approaching these questions, and that the most successful students used mapping strategies (Galloway et al., 2017). Participants relied on charge as a cue to identify areas of reactivity, and some used stepwise approaches that resulted in non-chemically feasible intermediates; the latter approach may have simply been a problem-solving strategy to reduce cognitive load or may represent how participants visualized the reactions occurring. Expanding and mapping strategies have also been correlated with successful problem solving in organic synthesis (Bodé and Flynn, 2016).

http://OrgChem101.com and the modules’ impacts

The Organic Mechanisms module (Flynn et al., 2016) forms part of a larger suite of modules hosted on http://OrgChem101.com: (1) the Nomenclature101 module, which helps students learn the International Union of Pure and Applied Chemistry (IUPAC) nomenclature and functional groups through 10-question quizzes (Flynn et al., 2014), (2) the mechanisms module, which focuses on teaching the EPF (Visser and Flynn, 2018), and (3) the acid–base module, which is aligned with acid–base learning outcomes (LOs) (Stoyanovich et al., 2015). Each module is student-controlled, interactive, available in English and French, and an open educational resource (free and open access). The latter two modules have metacognitive skill-building layers, designed to help students identify what they know and need to know, and plan their learning strategies. We previously conducted an educational evaluation to determine the Nomenclature101 module's effectiveness for learning and students’ experiences using the learning tool (Bodé et al., 2016). Using a pre-/post-test format and questionnaires, we found high learning gains in a short period of time, plus high student satisfaction. The other modules have not yet been evaluated with respect to their impacts on student learning and experiences, although students’ accuracy in estimating their ability (an aspect of metacognition) increased after using the mechanisms module (Visser and Flynn, 2018). In the present study, we examine students’ learning and experiences when using the Organic Mechanisms module (Fig. 2).
Fig. 2 Overview of the online learning module.

Theoretical framework

The module's design and evaluation were guided by information processing theory (IPT) (Ausubel et al., 1978; Sweller, 1999; Kalyuga et al., 2003; Mayer and Moreno, 2003; Schunk, 2016). IPT states that people perceive information through their senses and parse new information in working memory. Learners then integrate that information into long-term memory by connecting it with relevant prior knowledge. The learner must actively choose to integrate the information by perceiving it as meaningful and linking it to prior knowledge (Ausubel et al., 1978; Novak, 1993; Bretz, 2001; Galloway and Bretz, 2015; Schunk, 2016). For this reason, new knowledge must be presented in a way that is meaningful to the learner (Novak, 1993). Modern IPT also incorporates the learner's affect and the learning environment (Schunk, 2016).

IPT guided the design of the module; information is presented to students in a meaningful manner by linking new concepts to what learners have previously learned and to future required (tested) skills. The module is interactive and engages the learner by pausing the videos and having students build their own answers. The module also contains practice questions with feedback, allowing students to practice the skills learned and build on their knowledge. Metacognitive skill-building activities and prompts (e.g., compare your skill rating with your pre-test score: how will you manage your studying accordingly?) provide opportunities for learners to increase their skill in identifying what they know and don’t know, as well as planning their learning time accordingly (National Research Council, 2000).

Research questions

We focused on three research questions regarding undergraduate students’ learning and experiences when using the Organic Mechanisms module in http://OrgChem101.com.

RQ1: What are students’ learning gains after using the module?

RQ2: What effect does the module have on students’ strategies when solving EPF-related questions?

RQ3: What effect does the module have on students’ errors when solving EPF-related questions?


Setting and course

Participants in the study were taking Organic Chemistry I (cohort I study) and II (cohort II and cohort III studies) courses at a large, research-intensive Canadian university. Organic Chemistry I is offered in the winter semester of students’ first year of studies, and Organic Chemistry II is offered in the summer and fall semesters. Both of these courses may be taken in either English or French and consist of two weekly lectures (1.5 hours each, mandatory, lecture or flipped format), and an optional tutorial session (1.5 hours, also called a recitation or discussion group). The Organic Chemistry I course has a required, associated laboratory section (3 hours biweekly) and the Organic Chemistry II course has a laboratory course that runs concurrently and is only required for some programs (3 hours weekly). The University of Ottawa uses a principles and patterns of mechanisms curriculum; in that curriculum, the electron-pushing formalism is explicitly taught before deeper concepts of reactivity are addressed (Flynn and Ogilvie, 2015).

Participants and the study's structure

The University of Ottawa's Research Ethics Board approved all phases of this study. The Cohort I studies informed the design of the Cohort II study described herein and are reported in previously published work (Visser and Flynn, 2018) and in Appendix 1. The Cohort I studies consisted of four parts (Fig. 3): (1) a pre-test, (2) time allotted for the students to use the learning module, (3) a post-test, and (4) a survey asking for students’ opinions and feelings about the module (a more in-depth description is available in Appendix 1). The workshop focused on the first two learning outcomes (Draw the arrows and Draw the products).
Fig. 3 Overview of the studies.

For the cohort II study described herein, Organic Chemistry II students enrolled in the 2018 fall term were invited to participate in a workshop held during a regularly held tutorial session (i.e., recitation, discussion group). The researchers made an announcement during a class period and the professor teaching the course posted a recruitment text on the class’ online page. Workshop attendees provided informed consent to participate in the study and could elect to participate in the workshop without having their data used for the study; 103 of 172 attendees consented to have their data used for research purposes; 330 students were enrolled in the course in total.

The Cohort III study consisted of Organic Chemistry II students enrolled in the Fall 2019 term, during their second tutorial of the term. Two sections of the tutorial were used and were separated into two groups: (1) the intervention group, which followed the same procedure as the Cohort II study, and (2) the control group, for which the regular teaching assistant (TA) gave a lesson on acid–base chemistry instead of using the mechanisms module. The control tutorial consisted of instruction in organic acid–base chemistry and determining the relative strength of acids and bases. The TA taught the students how to differentiate between weak and strong acids and bases, the students then completed some questions in small groups, and the TA went over the questions with them. This specific tutorial and lesson were chosen because they are unrelated to the electron-pushing formalism and therefore served as a control condition. Similar to the Cohort II study, participants were asked to provide informed consent, but any student was welcome to attend without their data being used for research; 24 and 40 students provided consent in the intervention and control groups, respectively.

The pre-test and post-test were identical to ensure that the tests were of equal difficulty (Fig. 4). Questions 1–4 were aligned with LO1 (Draw the arrows); Questions 5–8 were aligned with LO2 (Draw the products). While all four learning outcomes are believed to be important, due to time constraints we focused on the learning outcomes we consider most essential to students’ later success in analyzing mechanisms. During the workshop, students used the module and could ask questions of the facilitators who were present. The intended learning outcomes that would be tested were shared with the students, who were encouraged to focus on only those learning outcomes during the session. Students worked individually, in pairs, or in small groups, according to their preference.

Fig. 4 Questions on the pre- and post-tests used in Cohorts II and III, with Draw the Arrows questions (LO1) on the left with answers in red (bonds and electrons must be expanded if they are involved in the reaction step) and Draw the Products questions (LO2) on the right with answers in green.

Data analysis

We analyzed the data according to the research questions, using RStudio for statistical analyses. For RQ1, the worksheets were coded for the correctness of the answers. For the Draw the arrows questions, a point was given for each correct arrow (drawn in red in Fig. 4). The Draw the products questions were coded per arrow: interpreting an arrow correctly could earn up to two points, one for breaking a bond and one for making a bond, where relevant. In Question 5, for example, one point was associated with interpreting the long arrow as breaking the π bond in the aromatic ring and one point with correctly forming the new C–C σ bond.

Some arrows, such as the small arrow in Question 8, were worth only one point, since the arrow represents the breaking of a single bond (e.g., C–Br) and no new bond is formed. Similarly, the arrow for the collapse of the tetrahedral intermediate in Questions 6 and 7 was worth only one point, because no bonds are broken and one C–O π bond is formed.

For RQ2, we analyzed the following strategies for all questions: mapping, expanding the structure, redrawing the structure, and drawing non-bonding electrons. This analysis was done for the Cohort II study. We coded each strategy as absent or present (even if it was not properly used). Mapping was coded whenever a participant marked (with a number, letter, shape, or highlighter) part of a molecule to help situate it in the product. Expanding the structure was coded when implicit atoms and electrons/bonds were drawn. Redrawing the structure meant that the student had redrawn either the product or the starting material. Drawing the non-bonding electrons was coded when students drew the electrons on heteroatoms.

For RQ3, the errors on the tests were coded and analyzed for the Cohort II study (the codebook is available in Appendix 2). For the Draw the arrows learning outcome, the following codes were used for each question: correct, reversed, wrong, from atom/charge, vague, missing, extra, or did not attempt (Flynn and Featherstone, 2017). A correct arrow demonstrated the correct electron flow. A reversed arrow started from the electron-deficient site and pointed toward the electron-rich site. A wrong arrow started or ended at the wrong site on the molecule. An arrow coded from atom/charge demonstrated the correct electron flow from electron rich to electron poor but did not start from electrons; rather, the arrow started from a charge or an atom. Missing arrow was coded when a required arrow was absent, while extra arrow was coded when there were too many arrows present. Finally, if a student did not draw a single arrow, the did not attempt code was used.

For Draw the products questions, the errors were coded as in previous work (Flynn and Featherstone, 2017). For example, a formal charge error represented an instance with an extra, missing, or incorrect formal charge. A placement error was coded when the product was not correctly connected (e.g., a double bond or an atom was misplaced). Transplanting electrons represented a situation in which electrons in a bond or on an atom were relocated to a new atom instead of bonding two atoms together.

Reliability of coding

Reliability was addressed via weekly debriefing sessions between the first and corresponding authors, during which the codebook was developed and revised. Once the codebook was fully developed, one researcher coded the entire data set and a second coder analyzed 15% of the data to establish inter-rater reliability. The researchers obtained 91% agreement for the test scores, with Krippendorff's α = 0.81, above the 0.80 threshold (Krippendorff, 2004). For coding the errors, 93% agreement was obtained, with Krippendorff's α = 0.91. Similarly, for the strategies used, 89% agreement and Krippendorff's α = 0.90 were obtained. We were satisfied with the reliability of the data analysis based on these results.
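The agreement statistics above can be sketched in code. Below is a minimal Python illustration of percent agreement and Krippendorff's α for two coders assigning nominal codes with no missing data (the function names and example data are ours, not the study's; the study's analyses were run in RStudio):

```python
from collections import Counter

def percent_agreement(a, b):
    """Fraction of units to which both coders assigned the same code."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def krippendorff_alpha_nominal(a, b):
    """Krippendorff's alpha for two coders, nominal codes, no missing data.

    Uses the coincidence-matrix form:
        alpha = 1 - (n - 1) * sum_{c != k} o_ck / sum_{c != k} n_c * n_k
    where each coded unit contributes both orderings of its pair of codes.
    """
    o = Counter()                      # coincidence counts o_ck
    for x, y in zip(a, b):
        o[(x, y)] += 1
        o[(y, x)] += 1
    if all(c == k for c, k in o):      # perfect agreement
        return 1.0
    n_c = Counter()                    # marginals n_c = sum_k o_ck
    for (c, _k), count in o.items():
        n_c[c] += count
    n = sum(n_c.values())              # = 2 * number of units
    disagree = sum(cnt for (c, k), cnt in o.items() if c != k)
    expected = sum(n_c[c] * n_c[k] for c in n_c for k in n_c if c != k)
    return 1 - (n - 1) * disagree / expected
```

Unlike raw percent agreement, α corrects for agreement expected by chance, which is why both figures are reported above.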

Results and discussion

Learning gains were observed across questions and learning outcomes (RQ1)

In the Cohort II study, scores were significantly higher on the post-test than on the pre-test (Fig. 5 and Table 1). Both question types showed significant improvements: scores on LO1 (Draw the arrows) and LO2 (Draw the products) questions were each significantly higher on the post-test than on the pre-test. The largest gains occurred with Draw the products questions.
Fig. 5 Post-test versus pre-test scores: overall (left, blue circles), for LO1: Draw the arrows (middle, red triangles), for LO2: Draw the products (right, yellow squares). N = 103. Cohort II.
Table 1 t-Test statistics from comparing the pre-test and post-test scores for the overall score and scores on questions related to the individual LOs

| Independent variable | Pre-test mean (%) | Pre-test SD (%) | Post-test mean (%) | Post-test SD (%) | t(102) | p | Cohen's d |
| Overall score | 59.5 | 22.7 | 72.3 | 17.6 | 8.597 | <0.001 | 0.729 |
| Scores from LO1: Draw the arrows | 58.3 | 26.1 | 66.7 | 26.9 | −3.992 | <0.001 | 0.315 |
| Scores from LO2: Draw the products | 60.5 | 31.1 | 77.0 | 19.1 | −5.806 | <0.001 | 0.863 |
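The comparisons above are paired pre/post t-tests with Cohen's d as the effect size. As an illustration, here is a minimal Python sketch (the function name and data are ours; Cohen's d is taken as the mean difference over the SD of the differences, one common convention for paired designs that the paper does not explicitly state):

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_and_d(pre, post):
    """Paired t statistic and Cohen's d for matched pre/post scores.

    Here d = mean(differences) / SD(differences), a common convention
    for paired designs (assumption; the paper's variant is not stated).
    """
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    sd = stdev(diffs)                   # sample SD of the differences
    t = mean(diffs) / (sd / sqrt(n))    # t with n - 1 degrees of freedom
    d = mean(diffs) / sd
    return t, d
```

With N = 103 participants, the resulting t would be compared against a t distribution with 102 degrees of freedom, matching the t(102) column in Table 1.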

Some students obtained a score of zero on some of the questions, and their answers were examined further for validity. We found that these scores were all valid, since they resulted from one of three scenarios: (1) some students focused on questions related to only one of the LOs and thereby obtained a score of zero on the other LO; (2) the students answered the questions incorrectly; or (3) the students wrote a partial (and incorrect) answer to the questions. Since all of these students had answered at least one of the questions on the worksheet, their scores were included in the data analysis.

Scores on Questions 4 and 8 had the largest improvements (Fig. 6). Moreover, only 63% of participants attempted Question 4 and 57% attempted Question 8 on the pre-test, while 82% and 84% attempted those questions on the post-test, respectively.

Fig. 6 Score distributions per question on the pre- and post-tests (N = 103). Cohort II.

Normalized learning gains were also calculated and analyzed to account for the high scores of certain participants and a potential ceiling effect in Questions 3 and 6. Normalized learning gains account for the variance in pre-test scores, the gain possible for each participant, and ceiling effects; they were calculated using eqn (1). The calculation reflects the fact that lower pre-test scores leave more room for improvement and higher scores leave less. A normalized learning gain of 1.0 indicates a perfect score on the post-test, 0.0 indicates no improvement from pre-test to post-test, and a negative value indicates a decrease in score.

normalized learning gain = (post-test score − pre-test score)/(100% − pre-test score) (1)
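The normalized (Hake-style) gain can be implemented directly; a minimal Python sketch (the function name is ours, and the handling of pre-test scores of 100% and of negative gains varies across the literature, so the simplest convention is assumed here):

```python
def normalized_gain(pre_pct, post_pct):
    """Normalized learning gain for scores given as percentages.

    1.0 means a perfect post-test, 0.0 means no change, and a negative
    value means the score decreased. Conventions for pre_pct == 100
    (no room to improve) and for negative gains vary; this sketch uses
    the simplest choice.
    """
    if pre_pct == 100:
        return 0.0
    return (post_pct - pre_pct) / (100 - pre_pct)
```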

The normalized learning gains revealed overall improvement from the pre-test to the post-test (Fig. 7). The largest learning gains occurred on Question 8, which had a median normalized learning gain of 0.336. Questions 3 and 6 did not have high normalized learning gains because participants already performed well on the pre-test, with means of 73% and 72%, respectively.

Fig. 7 Normalized learning gain scores, N = 103. Cohort II.

Comparison of the intervention with non-EPF work

The Cohort III study was used to compare normalized learning gains between the intervention and control groups (Table 2 and Fig. 8). We analyzed the data using a Mann–Whitney test due to the lack of normality in the control group's scores and learning gains. In the control group, there was no significant difference between the overall pre-test and post-test scores. In contrast, the intervention group scored significantly higher on the post-test than on the pre-test. The intervention group's LO2 (Draw the products) scores were significantly higher on the post-test than on the pre-test; however, their LO1 (Draw the arrows) scores were not statistically different between the two tests. These results were similar to the Cohort I study, in which there were high learning gains on LO2 and the overall score and smaller learning gains on LO1.
Table 2 Mann–Whitney test for comparison of pre-test scores and post-test scores for the intervention and the control groups

| Group | Independent variable | Pre-test mean (%) | Pre-test SD (%) | Post-test mean (%) | Post-test SD (%) | U | p | r |
| Intervention (N = 24) | Overall scores | 54.4 | 19.9 | 78.3 | 14.7 | 107.0 | <0.001 | 0.548 |
| | Scores on LO1: Draw the arrows | 63.1 | 25.8 | 70.8 | 24.4 | 242.5 | 0.345 | |
| | Scores on LO2: Draw the products | 47.2 | 33.8 | 84.5 | 11.8 | 79.5 | <0.001 | 0.136 |
| Control (N = 40) | Overall scores | 60.0 | 24.0 | 62.5 | 20.3 | 776.0 | 0.817 | |
| | Scores on LO1: Draw the arrows | 60.6 | 26.7 | 57.0 | 28.2 | 854.5 | 0.602 | |
| | Scores on LO2: Draw the products | 59.4 | 29.0 | 67.1 | 22.6 | 719.0 | 0.434 | |

Fig. 8 Normalized learning gains for the intervention (N = 24) and control (N = 40) groups. Cohort III.
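The Mann–Whitney comparisons in Tables 2 and 3 are rank-based rather than mean-based. A minimal Python sketch of the U statistic (the function name and example data are ours; ties receive midranks, the p-value from exact tables or a normal approximation is omitted, and which of the two U values is reported varies between software packages):

```python
def mann_whitney_u(x, y):
    """Mann-Whitney U statistic for two independent samples.

    Ties receive midranks. Returns the smaller of U_x and U_y; note
    that which U is reported differs between software packages.
    """
    pooled = sorted(x + y)
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2   # midrank of positions i+1..j
        i = j
    r_x = sum(ranks[v] for v in x)           # rank sum of the first sample
    u_x = r_x - len(x) * (len(x) + 1) / 2
    return min(u_x, len(x) * len(y) - u_x)
```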

We found significant differences between the control and intervention groups in the overall normalized learning gains (Table 3), the LO1 (Draw the arrows) normalized learning gains, and the LO2 (Draw the products) normalized learning gains. The large effect sizes show that the intervention had a strong effect on students’ learning gains compared to the control condition. In summary, students in two different cohorts (II and III) showed learning gains from using the mechanisms module over a short period of time.

Table 3 Mann–Whitney tests to compare the normalized learning gains from the control and the intervention groups

| Independent variable | Intervention (N = 24) mean | Intervention SD | Control (N = 40) mean | Control SD | U | p | r |
| Overall normalized LG | −0.076 | 0.391 | −0.111 | 0.569 | 167.0 | <0.001 | 0.548 |
| Normalized LG on LO1: Draw the arrows | −0.250 | 0.254 | −0.302 | 0.932 | 1102.0 | <0.001 | 0.320 |
| Normalized LG on LO2: Draw the products | 0.525 | 0.450 | −0.196 | 1.212 | 213.5 | <0.001 | 0.462 |

Participants used common problem-solving strategies (RQ2)

To investigate RQ2, we analyzed the Cohort II data for the strategies used. All of the strategies mentioned earlier (expanding the structure, redrawing the structure, and mapping) are explained and used in the module, so we anticipated an increase in their use from the pre-test to the post-test. The only significant difference observed was in the use of mapping. The other strategies (e.g., expanding structures, drawing non-bonding electrons) showed no significant change in use from the pre-test to the post-test (Fig. 9). This lack of strategy use could be related to students not believing the strategies are useful or necessary, thinking the strategies would take too much time to implement, or not having had enough time to see how they can be useful. For both question types, more students attempted the questions on the post-test than on the pre-test.
Fig. 9 Mapping instances found on the pre-test (orange, left) and post-test (blue, right) for each question. N = 103. Cohort II.

Mapping was demonstrated significantly more often on the post-test (48%) than on the pre-test (20%), χ2(1) = 23.361, p < 0.001, although still in relatively few questions overall (Fig. 9). The module dedicates substantial time to this strategy and contains many mapping questions for students to practice. Draw the products questions (Q5 to Q8) had the biggest increase in the use of mapping (example in Fig. 10). Question 6 was done very well (an average of 77% on the pre-test and 87% on the post-test), but very few students chose to map; we hypothesize that in those cases participants could visualize the answer or mapped in their heads. Question 8 had the largest increase in mapping (from 11 to 38 instances) as well as the largest learning gains (47% to 77%); perhaps in this question participants found value in mapping. A key decision point for students will be when to use such strategies, much as one must decide when heuristics are appropriate and when deeper analytical thinking is needed (Talanquer, 2014, 2017).
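The χ2(1) comparison of mapping frequency on the two tests is a 2 × 2 test of proportions. A minimal Python sketch of the Pearson chi-square statistic (no continuity correction; the function name is ours and the counts in the example below are illustrative, not the study's exact counts):

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df, no continuity correction)
    for the 2 x 2 contingency table [[a, b], [c, d]].

    Assumes all row and column totals are non-zero.
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
```

For the mapping comparison, the table would hold the counts of participants who did and did not map on each test; significance is then read from the chi-square distribution with one degree of freedom.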

Fig. 10 Example of successful mapping used on the post-test, question 8.

Students who mapped on the pre-test (N = 18) had a higher mean score (M = 70.1) than participants who did not (N = 84, M = 57.0), t(25) = 2.383, p = 0.034. However, on the post-test, students who mapped (N = 49, M = 73.3) and those who did not (N = 53, M = 71.5) obtained similar results, t(100) = 0.513, p = 0.609.
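The reported t(25) for groups of 18 and 84 is consistent with a Welch-type unequal-variance test, whose fractional degrees of freedom are usually rounded when reported (this is our reading; the paper does not name the variant). A minimal Python sketch, with illustrative data in the example:

```python
from math import sqrt
from statistics import mean, variance

def welch_t(x, y):
    """Welch's unequal-variance t statistic with Welch-Satterthwaite
    degrees of freedom (fractional; typically rounded when reported).
    """
    nx, ny = len(x), len(y)
    vx, vy = variance(x), variance(y)   # sample variances
    se2 = vx / nx + vy / ny             # squared standard error
    t = (mean(x) - mean(y)) / sqrt(se2)
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df
```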

To analyze the effect of mapping on the students’ learning gains, the participants were separated into four groups based on whether or not they mapped (Fig. 11). Group 1 comprises the 16 participants who mapped on both the pre-test and the post-test. Group 2 comprises the two participants who mapped on the pre-test but not on the post-test. Group 3 comprises the 33 participants who did not map on the pre-test but mapped on the post-test. Group 4 comprises the 51 participants who mapped on neither the pre-test nor the post-test.

Fig. 11 Test scores, grouped according to whether mapping was observed in the pre-test and post-test. Cohort II.

The participants in Group 3, who used mapping on the post-test but not on the pre-test, showed learning gains of 13%, which is not statistically different from the average of 11% for all students.

A one-way ANOVA was conducted on Groups 1, 3, and 4 to determine the effect of group on normalized learning gains; Group 2 was excluded due to its low number of participants. There was no significant effect of group [F(2,97) = 0.941, p = 0.394]: participants who changed from not mapping (pre-test) to mapping (post-test) had learning gains similar to those of the other groups. This analysis considers the overall worksheet score rather than only the specific questions on which mapping was used. Another limitation is that the worksheets were coded for whether participants mapped at all, not whether they mapped properly. The increase in the use of the mapping strategy is nevertheless promising.
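A one-way ANOVA compares between-group to within-group variance; with three groups of 16, 33, and 51 participants (N = 100), the degrees of freedom are (k − 1, N − k) = (2, 97), matching the reported F(2,97). A minimal Python sketch (the function name and example data are ours):

```python
from statistics import mean

def one_way_anova_f(*groups):
    """F statistic for a one-way ANOVA across k groups.

    F = (between-group mean square) / (within-group mean square),
    with (k - 1, N - k) degrees of freedom.
    """
    all_vals = [v for g in groups for v in g]
    grand = mean(all_vals)
    k, n = len(groups), len(all_vals)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```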

Common errors (RQ3)

To analyze the errors that occurred, we coded answers from Cohort II. For Draw the arrows questions (LO1), there was a significant increase in the number of correct arrows between the pre-test and post-test, χ2(1) = 50.10, p < 0.001, consistent with the significant learning gains (Fig. 12). The percentage and type of errors were similar on both tests, i.e., there was no statistically significant difference (Fig. 12). There were, however, significantly more attempts on questions in the post-test than in the pre-test, χ2(1) = 19.05, p < 0.001.
Fig. 12 For LO1, Draw the Arrows questions: overview of correct answers, errors in arrows, and attempts between the pre-test (left) and the post-test (right). Each data point represents a single arrow, N = 1522.

The most common type of error was drawing an arrow from an atom or charge (Fig. 13). These arrows demonstrate the correct direction of electron movement; however, because the arrow did not begin at electrons, we did not assess it as correct. The module emphasizes that bonds are created by the movement of electrons, not atoms or charges; therefore, curved (EPF) arrows should start from electrons (Fig. 14). In later years, we anticipate that students would adopt the common convention used by organic chemists of drawing EPF arrows from atoms or charges. Question 2, in which three of the six arrows start from electrons on a heteroatom (meaning students had to draw those electrons as an extra step), had the highest incidence of this type of error. In contrast, drawing arrows from atoms or charges was rarely observed in Question 1, which involved a hydride transfer.

image file: c9rp00274j-f13.tif
Fig. 13 Example of errors seen in Draw the arrows questions (LO1), specifically question 2.

image file: c9rp00274j-f14.tif
Fig. 14 Common errors found in answers for questions associated with LO2: Draw the products. N = 103. Cohort II.

Questions associated with LO2 (Draw the products) showed common errors consistent with previous work (Fig. 14) (Flynn and Featherstone, 2017). Missing or extra bonds were prevalent in Question 5 (Fig. 15). Individual questions tended to show specific error types; for example, answers to Question 7 were frequently missing a methyl group, and transplanting electrons was only observed in Question 8 (example in Appendix 2, Table 7).

image file: c9rp00274j-f15.tif
Fig. 15 Example of a missing atom error in a Draw the products question (LO2), specifically Question 5.

Across the Draw the products questions, the “long” arrow appeared to be the most challenging to interpret, while shorter arrows were interpreted correctly more often. For example, in Question 5, only 39% of students on the pre-test successfully made the bond from the long arrow, as opposed to 65% for the bottom left arrow and 62% for the bottom right arrow. Comparing the arrow interpretations to each other showed a significant difference between the long arrow and each of the shorter arrows, χ2 = 23.59, p < 0.001 and χ2 = 22.88, p < 0.001, respectively; there was no significant difference between the two shorter arrows, χ2 = 0.16, p = 0.344. Similar results were found on the post-test, as well as in Questions 7 and 8. These long arrows represent situations in which conformational changes are needed for the molecule to move from the conformation depicted in the starting material to the reactive conformation, which likely posed visualization and mental rotation challenges for students. An analogous difficulty was previously found when the same reactants were shown in different conformations (Flynn and Featherstone, 2017). These results will need to be explored in more depth.
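The pairwise comparisons of arrow interpretations can be reproduced with a 2 × 2 chi-square test of independence. The counts below are hypothetical, chosen only to mirror the reported proportions (about 39% vs. 65% correct out of roughly 103 responses), not taken from the study's data set.

```python
# Hedged sketch of a 2x2 chi-square comparison of two arrows'
# interpretation rates (correct vs. incorrect). Counts are illustrative.
from scipy.stats import chi2_contingency

# rows: long arrow, short arrow; columns: correct, incorrect
table = [[40, 63],   # long arrow: ~39% of 103 responses correct
         [67, 36]]   # short arrow: ~65% of 103 responses correct
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.4f}")
```

Note that scipy applies Yates' continuity correction for 2 × 2 tables by default, so the statistic differs slightly from an uncorrected χ2; with these placeholder counts the difference is still highly significant, mirroring the reported long-vs-short comparison.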


Conclusions

We found significant learning gains after students used the Organic Mechanisms: Mastering the Arrows module, an open educational resource on http://OrgChem101.com, for one hour. The study focused on the module's first two learning outcomes: Draw the arrows, given the starting materials and products of a reaction step (LO1), and Draw the products, given the starting materials and electron-pushing arrows for that step (LO2), all for reactions the students had not previously seen. More students attempted the questions and used a mapping strategy on the post-test than on the pre-test. Students reported a positive learning experience, a belief that the module would help them in their organic chemistry courses, and a greater ability to assess their skills related to the learning outcomes described herein (i.e., an aspect of metacognition) after using the module (Visser and Flynn, 2018). Students in the control group did not experience these learning gains. These findings are consistent with a previous educational evaluation of the Nomenclature101 module in OrgChem101 (Flynn et al., 2014; Bodé et al., 2016). The strategies and errors observed in this study are also consistent with the types and relative prevalence found in an exam analysis of electron-pushing formalism questions (Flynn and Featherstone, 2017).

Potential limitations

This study did not make comparisons with other instructional methods, so we cannot make claims about the effectiveness of this method relative to others, only that significant and large learning gains were observed in this study's context in a short time period. The study demonstrated that this online tool is an effective way for students to learn a key aspect of chemistry's language.

The learning gains of the Cohort II study could be associated with time on task; analogous class time could have similar effects, as we found when evaluating the Nomenclature101 module in the same learning tool (Bodé et al., 2016). Students’ learning was measured over a short time period, so we do not make claims about the enduring nature of students’ abilities or their ability to transfer their skills to new situations. As with any learning, we expect that practice and use in context are essential for meaningful and enduring connections to be made with other areas of chemistry. Students’ learning was only studied with respect to the module's first two learning outcomes, although we hypothesize that the learning effects will be similar for intended learning outcomes three (draw the transition state of a reaction step, given the starting materials and product of that step) and four (draw the mechanism of the reverse reaction, given the forward mechanism). Because the pre-test and post-test were identical, students may have remembered their answers from the pre-test. We think this effect is unlikely because the pre-test was collected immediately after it was completed, and the questions were sufficiently difficult and numerous to mitigate a memorization effect. Moreover, even if students remembered their answers, a repeated individual response would tend to produce the same answer (e.g., an incorrect answer would still be incorrect) if the intervention had no effect.

Another limitation is that we did not collect data on which parts of the module students finished or what they worked on during the workshop. The Classroom Observation Protocol for Undergraduate STEM courses (COPUS) was used to record what the students were generally doing during the workshop and showed that students worked either by themselves or in groups. However, we do not have data on exactly which tasks they were doing; some students may have been working off-task or may not have completed the module.

Implications for teaching

This study demonstrated that the Organic Mechanisms: Mastering the Arrows module on http://OrgChem101.com is a useful tool for students to learn and enhance their skills with the electron-pushing formalism. The mechanisms learning module could be used in a class setting, with the instructor guiding students through its use and answering questions; alternatively, the course's expected learning outcomes could be provided to students, who could then work through the module independently.

With either type of module use, summative assessments (e.g., midterm and exam questions) should include aligned questions, both to demonstrate the value of the activities and module to the students and to assess progress toward the intended learning outcomes.

The module's approach and question types address students’ skill in interpreting the electron-pushing formalism, part of organic chemistry's language, but does not directly teach or assess concepts or reasons for reactivity observed. In uOttawa's curriculum, those concepts and reasons are addressed in other areas and question types, building off students’ skills with the EPF (Flynn and Ogilvie, 2015; Raycroft and Flynn, 2017; Bodé et al., 2019). The module could also be used in a traditional curriculum to help students learn EPF skills and allow the students to practice on new questions focused solely on the EPF. The learning module also teaches strategies that are correlated with greater success for organic synthesis problems (Bodé and Flynn, 2016). The increase in the usage of mapping highlights the usefulness of the learning module in modelling these strategies for students.

Implications for research

The module is an ever-improving learning tool, so further research into its usefulness would help enhance it. In particular, research into why students use particular strategies, along with interviews about the thought processes students engage in while solving these problems, could guide the development team. The design and development of a module such as this is always iterative, and the data obtained will guide the next iteration.

The effect of this module could also be studied for students in a traditional curriculum. Future research could investigate whether gaining fluency in organic chemistry's symbolism affects students’ ability to learn new concepts that rely on that symbolism, and could examine the existence and extent of benefits for a variety of learners (e.g., gender differences, effects of technological proficiency on learning).

Conflicts of interest

There are no conflicts to declare.

Appendix 1: Cohort I study

Two pilot (Cohort I) studies were conducted to ensure that the planned study's structure would unfold as expected and to make adjustments as needed. The first is described in a previously published article (Visser and Flynn, 2018); the second is described below.

Participants and setting of the Cohort I study

Organic Chemistry I students enrolled in the 2018 winter term were invited to participate in a workshop held after their regularly scheduled classes. The researchers made an announcement during a class period, and the professor teaching the course posted a recruitment text on the course's online page. Pizza was provided as an incentive. Workshop attendees provided informed consent to participate in the study and could elect to participate in the workshop without having their data used; nine attendees consented to have their data used in the study.

The Cohort I study consisted of four parts (Fig. 3): (1) a pre-test, (2) time allotted for the students to use the learning module, (3) a post-test, and (4) a survey asking for their opinions about the module. The workshop focused on the first two learning outcomes (Draw the arrows and Draw the products).


The Cohort I study showed learning gains between the pre-test (M = 61%) and the post-test (M = 70%) that approached statistical significance, t(8) = −2.29, p = 0.051. These results were investigated in greater depth in the Cohort II study (vide supra).
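The pre/post comparison reported above is a paired t-test (nine participants, hence 8 degrees of freedom). The sketch below uses hypothetical scores, not the participants' actual data; only the procedure is shown.

```python
# Hedged sketch of a paired (repeated-measures) t-test on pre/post scores.
# The nine score pairs are illustrative placeholders.
from scipy.stats import ttest_rel

pre  = [55, 70, 48, 62, 75, 58, 66, 50, 65]   # pre-test %, hypothetical
post = [68, 72, 60, 70, 80, 64, 74, 61, 81]   # post-test %, hypothetical

t_stat, p_value = ttest_rel(pre, post)
# A negative t indicates post-test scores exceeded pre-test scores,
# matching the sign convention of the reported t(8) = -2.29.
print(f"t({len(pre) - 1}) = {t_stat:.2f}, p = {p_value:.3f}")
```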

Participants used very few strategies, such as mapping or expanding the structure, although these strategies are taught in the module and have been correlated with successful problem solving on organic synthesis questions. Every participant answered Question 1 correctly on both the pre-test and the post-test; because of this ceiling effect, we replaced that question with a Draw the arrows question in the Cohort II study.

All the participants completed a survey about their experiences and replied positively when asked about the module's effect. For example, when answering the question, “Do you think your use of the organic reaction mechanisms module will have an impact on your success in your course? If so, how?”, Participant 1 wrote that “It is very good for learning the basics”. Participants 3, 5, 6 and 7 all mentioned that the practice questions and instant feedback were useful and the main reason they liked the module. Participant 9 stated that the module was good “to teach me everything I was behind on” and Participant 2 mentioned that the module was useful to “Make sure I understand the basics”. Table 4 shows the common themes recurring in the students’ answers to the survey.

Table 4 Students’ comments regarding the features of the learning module
Best features | Worst features
Quick response to answers | Occasional errors in the answersa
Module interface was easy | Video questions sometimes hid key information
Good explanations and descriptive videos | Draw the arrows tool was hard to use
Self-paced | Mapping questions were hard to seea
a The developers have resolved these issues.

A few technical difficulties were reported, such as occasional errors in the feedback, problems logging in, and difficulty with the module's drawing tool. The developers have since addressed these issues, which were not significant barriers to the students’ experiences or learning.

Appendix 2: coding scheme

For LO1 (Draw the arrows questions), each arrow was awarded one point (Fig. 16). An arrow had to start from an electron pair or bond and point to the correct atom or bond. Accordingly, Question 1 was worth 4 points, Question 2 was worth 6 points, Question 3 was worth 2 points, and Question 4 was worth 3 points.
image file: c9rp00274j-f16.tif
Fig. 16 Coding scheme for Draw the arrow questions. Each arrow was awarded one point.

For LO2 (Draw the products questions), one point was awarded for interpreting each aspect of an electron-pushing arrow: breaking a bond and making a bond. For example, in Question 5, shown in Fig. 17, arrow A was worth two points: one for breaking the π bond between carbons 1 and 6 and one for making the bond between carbons 5 and 6. Each response was coded accordingly; all of the other Draw the products questions were coded in this way (Fig. 18–20).

image file: c9rp00274j-f17.tif
Fig. 17 Coding scheme for question 5: Draw the products.

image file: c9rp00274j-f18.tif
Fig. 18 Coding scheme for question 6: Draw the products.

image file: c9rp00274j-f19.tif
Fig. 19 Coding scheme for question 7: Draw the products.

image file: c9rp00274j-f20.tif
Fig. 20 Coding scheme for question 8: Draw the products.

Strategies used

Both LO1 and LO2 were coded with the same strategies. Table 5 shows the description associated with each strategy for both types of questions.
Table 5 Coding scheme for the strategies used
Code | LO1: Draw the arrows | LO2: Draw the products
Attempted | Must have minimum 1 arrow | Must have any structures
Drew non-bonding electrons | Must have at least 1 non-bonding electron pair | Must have at least 1 non-bonding electron pair on the product
Expanded the structure | Explicitly drew proton(s) and/or wrote out a C for carbon | Explicitly drew proton(s) and/or wrote out a C for carbon
Re-drew the structure | Re-drew all or part of the structure | Re-drew all or part of the structure
Mapping | Any attempt to identify atoms in both the SM and product | Any attempt to identify atoms in both the SM and product

For LO1, each arrow was assigned one of the codes outlined in Table 6; the codes were adapted from previous work (Flynn and Featherstone, 2017).

Table 6 Coding scheme for errors in Draw the arrows questions; examples are from Question 4
Code | Definition | Example
Correct arrow | The arrow is correct | image file: c9rp00274j-u1.tif
Missing arrow | The arrow is missing | image file: c9rp00274j-u2.tif
Extra arrow | There is an extra arrow to show the step | image file: c9rp00274j-u3.tif
Reversed arrow | The arrow starts at the electron-deficient site and points to the electron-rich site | image file: c9rp00274j-u4.tif
Arrow from the atom | The arrow starts at an atom (the arrow would be “correct” if it had started from an electron pair) | image file: c9rp00274j-u5.tif
Arrow from the charge | The arrow starts at a charge (the arrow would be “correct” if it had started from an electron pair) | image file: c9rp00274j-u6.tif
Wrong arrow | The arrow starts and/or ends at the wrong sites | image file: c9rp00274j-u7.tif
Vague arrow | The arrow is too vague to determine where it starts or points | image file: c9rp00274j-u8.tif
Did not attempt | The question was left blank |

For LO2 (Draw the products), several error types were coded (Table 7); these were again adapted from previous work (Flynn and Featherstone, 2017).

Table 7 Coding scheme for the errors of LO2 Draw the products; examples are from Question 7
Error | Definition | Example
Formal charge (FC) error | The structure has a wrong, missing, or extra FC, judged relative to the structure drawn | image file: c9rp00274j-u9.tif
Transplanting electrons | Taking electrons and moving them without forming a bond | image file: c9rp00274j-u10.tif
Missing/extra atom | There is a missing or extra atom in the molecule drawn | image file: c9rp00274j-u11.tif
Missing/extra bond | There is a missing or extra bond in the molecule drawn | image file: c9rp00274j-u12.tif
Placement error | The atoms are not correctly connected (typically a π bond is misplaced) | image file: c9rp00274j-u13.tif
Did not attempt | No answer was provided |


Acknowledgements

We thank the Teaching and Learning Support Services at the University of Ottawa for their invaluable collaboration in creating and maintaining http://OrgChem101.com. We thank the University of Ottawa and the Social Sciences and Humanities Research Council of Canada (SSHRC) for funding; funders had no involvement in the research activities or findings. MC thanks SSHRC for support in the form of a CGS-D scholarship.


References

  1. Anzovino M. E. and Lowery Bretz S., (2015), Organic chemistry students’ ideas about nucleophiles and electrophiles: the role of charges and mechanisms, Chem. Educ. Res. Pract., 16(4), 797–810.
  2. Ausubel D. P., Novak J. D. and Hanesian H., (1978), in Rinehart and Winston (ed.), Educational psychology: a cognitive view, Holt.
  3. Becker N., Noyes K. and Cooper M. M., (2016), Characterizing Students’ Mechanistic Reasoning about London Dispersion Forces, J. Chem. Educ., 93(10), 1713–1724.
  4. Bhattacharyya G., (2013), From source to sink: mechanistic reasoning using the electron-pushing formalism, J. Chem. Educ., 90(10), 1282–1289.
  5. Bhattacharyya G., (2014), Trials and tribulations: student approaches and difficulties with proposing mechanisms using the electron-pushing formalism, Chem. Educ. Res. Pract., 15(4), 594–609.
  6. Bhattacharyya G. and Bodner G. M., (2005), “It Gets Me to the Product”: How Students Propose Organic Mechanisms, J. Chem. Educ., 82(9), 1402–1407.
  7. Bodé N. E. and Flynn A. B., (2016), Strategies of Successful Synthesis Solutions: Mapping, Mechanisms, and More, J. Chem. Educ., 93(4), 593–604.
  8. Bodé N. E., Caron J. and Flynn A. B., (2016), Evaluating students’ learning gains and experiences from using http://nomenclature101.com, Chem. Educ. Res. Pract., 17(4), 1156–1173.
  9. Bodé N. E., Deng J. M. and Flynn A. B., (2019), Getting Past The Rules and to the WHY: Causal Mechanistic Arguments When Judging the Plausibility of Organic Reaction Mechanism, J. Chem. Educ., 96(6), 1068–1082.
  10. Bretz S. L., (2001), Novak's Theory of Education: Human Constructivism and Meaningful Learning, J. Chem. Educ., 78(8), 1107.
  11. Brown P. C., Roediger H. L. and McDaniel M. A., (2014), Make it stick: the science of successful learning, Cambridge, Massachusetts: The Belknap Press of Harvard University Press.
  12. Caspari I., Weinrich M., Sevian H. and Graulich N., (2017), This mechanistic step is “productive”: organic chemistry students’ backward-oriented reasoning, Chem. Educ. Res. Pract., 19(1), 42–59.
  13. Cheng M. and Gilbert J. K., (2009), Towards a Better Utilization of Diagrams in Research into the Use of Representative Levels in Chemical Education, in Multiple Representations in Chemical Education, Springer, pp. 55–73.
  14. Cooper M. M., Kouyoumdjian H. and Underwood S. M., (2016), Investigating Students’ Reasoning about Acid–Base Reactions, J. Chem. Educ., 93(10), 1703–1712.
  15. Craig A. F., Koch D. L., Buffington A. and Grove N., (2012), Narrowing the Gap? Revisiting Publication Rates in Chemistry Education, J. Chem. Educ., 89(12), 1606–1608.
  16. Ferguson R. and Bodner G. M., (2008), Making sense of the arrow-pushing formalism among chemistry majors enrolled in organic chemistry, Chem. Educ. Res. Pract., 9(2), 102–113.
  17. Flynn A. B. and Featherstone R. B., (2017), Language of mechanisms: exam analysis reveals students’ strengths, strategies, and errors when using the electron-pushing formalism (curved arrows) in new reactions, Chem. Educ. Res. Pract., 18(1), 64–77.
  18. Flynn A. B. and Ogilvie W. W., (2015), Mechanisms before reactions: a mechanistic approach to the organic chemistry curriculum based on patterns of electron flow, J. Chem. Educ., 92(5), 803–810.
  19. Flynn A. B. and Visser R., (2018), Developing Open Educational Resources in French and English for Students of Organic Chemistry at the University of Ottawa, Canada. Contact North, Contact Nord.
  20. Flynn A. B., Caron J., Laroche J., Daviau-Duguay M., Marcoux C. and Richard G., (2014), http://Nomenclature101.com: a free, student-driven organic chemistry nomenclature learning tool, J. Chem. Educ., 91(11), 1855–1859.
  21. Flynn A. B., Caron J., Laroche J., Richard G., Bélanger M. and Featherstone R., (2016), http://Orgchem101.com: an organic chemistry and metacognitive skill and concept building tool.
  22. Galloway K. R. and Bretz S. L., (2015), Measuring Meaningful Learning in the Undergraduate General Chemistry and Organic Chemistry Laboratories: A Longitudinal Study, J. Chem. Educ., 92(12), 2019–2030.
  23. Galloway K. R., Stoyanovich C. and Flynn A. B., (2017), Students’ interpretations of mechanistic language in organic chemistry before learning reactions, Chem. Educ. Res. Pract., 18(2), 353–374.
  24. Gilbert J. K., (2008), in Gilbert J. K., Reiner M. and Nakleh M. (ed.), Visualization: An Emergent Field of Practice and Enquiry in Science Education.
  25. Gilbert J. K. and Treagust D. F., (2009), in Gilbert J. K. and Treagust D. F. (ed.), Multiple Representations in Chemical Education, Springer.
  26. Graulich N., (2015), The tip of the iceberg in organic chemistry classes: how do students deal with the invisible? Chem. Educ. Res. Pract., 16(1), 9–21.
  27. Grove N. P. and Lowery Bretz S., (2012), A continuum of learning: from rote memorization to meaningful learning in organic chemistry, Chem. Educ. Res. Pract., 13(3), 201–208.
  28. Grove N. P., Cooper M. M. and Rush K. M., (2012), Decorating with arrows: toward the development of representational competence in organic chemistry, J. Chem. Educ., 89(7), 844–849.
  29. Johnstone A. H., (1982), Macro- and micro-chemistry, Sch. Sci. Rev., 64, 377–379.
  30. Johnstone A. H., (2000), Teaching of Chemistry – Logical or Psychological? Chem. Educ. Res. Pract., 1(1), 9–15.
  31. Kalyuga S., Ayres P., Chandler P. and Sweller J., (2003), The Expertise Reversal Effect, Educ. Psychol., 38(1), 23–31.
  32. Kozma R. B. and Russell J., (1997), Multimedia and understanding: expert and novice responses to different representations of chemical phenomena, J. Res. Sci. Teach., 34(9), 949–968.
  33. Krippendorff K., (2004), Reliability in Content Analysis, Hum. Commun. Res., 30(3), 411–433.
  34. Mayer R. E. and Moreno R., (2003), Nine Ways to Reduce Cognitive Load in Multimedia Learning, Educ. Psychol., 38(1), 43–52.
  35. Moreira P., Marzabal A. and Talanquer V., (2019), Using a mechanistic framework to characterise chemistry students’ reasoning in written explanations, Chem. Educ. Res. Pract., 20, 120–131.
  36. National Research Council (NRC), (2000), How People Learn: Brain, Mind, Experience, and School: Expanded Edition, The National Academies Press.
  37. National Research Council (NRC), (2012a), A Framework for K-12 Science Education, ch. 4 and 7.
  38. National Research Council (NRC), (2012b), in Singer S. R., Nielsen N. R. and Schweingruber H. A. (ed.), Discipline-Based Education Research: Understanding and Improving Learning in Undergraduate Science and Engineering, The National Academies Press.
  39. Novak J. D., (1993), Human constructivism: a unification of psychological and epistemological phenomena in meaning making, Int. J. Pers. Constr. Psychol., 6(2), 167–193.
  40. Raycroft M. and Flynn A. B., (2017), Next Steps Toward Improving the Organic Chemistry Curriculum at the University of Ottawa, in Canadian Society for Chemistry: 100th Canadian Chemistry Conference and Exhibition.
  41. Russ R. S., Scherr R. E., Hammer D. and Mikeska J., (2008), Recognizing mechanistic reasoning in student scientific inquiry: a framework for discourse analysis developed from philosophy of science, Sci. Educ., 92(3), 499–525.
  42. Schunk D., (2016), Learning Theories: An Educational Perspective, 6th edn, Pearson.
  43. Stoyanovich C., Gandhi A. and Flynn A. B., (2015), Acid–Base Learning Outcomes for Students in an Introductory Organic Chemistry Course, J. Chem. Educ., 92(2), 220–229.
  44. Sweller J., (1999), Instructional design in technical areas, ACER Press.
  45. Taber K. S., (2009), Learning at the Symbolic Level, in Multiple Representations in Chemical Education, Springer Netherlands, pp. 75–105.
  46. Talanquer V., (2011), Macro, submicro, and symbolic: the many faces of the chemistry “triplet”, Int. J. Sci. Educ., 33(2), 179–195.
  47. Talanquer V., (2014), Chemistry Education: Ten Heuristics To Tame, J. Chem. Educ., 91(8), 1091–1097.
  48. Talanquer V., (2017), Concept Inventories: Predicting the Wrong Answer May Boost Performance, J. Chem. Educ., 94(12), 1805–1810.
  49. Taskin V. and Bernholt S., (2014), Students’ Understanding of Chemical Formulae: a review of empirical research, Int. J. Sci. Educ., 36(1), 157–185.
  50. Vasilyeva N., Blanchard T. and Lombrozo T., (2018), Stable Causal Relationships Are Better Causal Relationships, Cognit. Sci., 42(4), 1265–1296.
  51. Visser R. and Flynn A. B., (2018), What are students’ learning and experiences in an online learning tool designed for cognitive and metacognitive skill development? Collect. Essays Learn. Teach., 11, 129–140.
  52. Weinrich M. L. and Sevian H., (2017), Capturing students’ abstraction while solving organic reaction mechanism problems across a semester, Chem. Educ. Res. Pract., 18(1), 169–190.
  53. Weinrich M. and Talanquer V., (2015), Mapping students’ conceptual modes when thinking about chemical reactions used to make a desired product, Chem. Educ. Res. Pract., 16, 561–577.

This journal is © The Royal Society of Chemistry 2020