Kathleen S. Lee*a, Brad Rix a and Michael Z. Spivey b
a Department of Chemistry and Biochemistry, Abilene Christian University, USA. E-mail: Kathleen.Lee@acu.edu; Brad.Rix@acu.edu
b Department of Mathematics and Computer Science, University of Puget Sound, USA. E-mail: mspivey@pugetsound.edu
First published on 4th October 2022
Organic Chemistry I presents challenges to many students pursuing diverse fields of study, oftentimes curtailing further progress in those fields. The ability to identify students at risk of unsuccessful course outcomes may lead to improved success rates by offering tailored resources to those students. Previously identified predictors include college entrance exam scores, grade point averages (GPA), General Chemistry II course grades, first exam scores, and results from a logical thinking assessment. This work explores the use of the 20-item Math-Up Skills Test (MUST) in a first-semester organic chemistry course over two years at a small private university. Analysis of scores on the MUST, which is taken during the first week of the semester, indicates a statistically significant difference between successful and unsuccessful first-time students (n = 74 and 49, respectively) with a large effect size (Cohen's d = 1.29); the MUST also shows good internal consistency (Cronbach's alpha = 0.861). Taken alone, the MUST predicts students at risk of not passing the course with 64% accuracy; addition of start-of-term science GPA data improves predictions to 82% accuracy. Predictions are further improved with incorporation of scores from the first exam of the semester. Observations to date indicate that the MUST is an easily administered assessment that can be utilized alone or as part of a trio of measures to predict success in first-semester Organic Chemistry. Implications of a mathematics assessment as a predictor for Organic Chemistry are addressed.
Efforts to compare levels of student success fall into two categories: those that examine common characteristics of successful students and those that employ assessments or measures to predict success from the outset of the course. Characteristics common amongst students who successfully complete organic chemistry include a positive perception of the field and attitude toward chemistry (Steiner and Sullivan, 1984; Turner and Lindsay, 2003), self-efficacy (Lynch and Trujillo, 2011; Villafañe et al., 2016), increasing levels of autonomous motivation (Black and Deci, 2000), more frequent studying early in the semester (Szu et al., 2011), and engaging in help-seeking behaviors (Horowitz et al., 2013). Grove and Bretz (2010) indicate that the study of organic chemistry requires learners to function at the multiplistic or relativistic levels of epistemological development. Furthermore, students classified as abstraction learners outperform exemplar learners in Organic Chemistry II (Frey et al., 2017).
While decisions relating to classroom practices and the development of resources may be informed by characteristics of successful students, these characteristics do little to identify which individuals will likely struggle through the class. Assessments and measures that can serve as an early warning system offer valuable information to instructors who can offer tailored resources to students at risk of not passing Organic Chemistry before any damage to grades occurs. Previously identified warning signals include several measures of general achievement, briefly described here in the order a student would encounter the assessment.
Standardized tests, taken prior to a student's entering college, show a moderate correlation with organic chemistry grades (Rixse and Pickering, 1985; Turner and Lindsay, 2003; Pursell, 2007; Hall et al., 2014). Rixse and Pickering found the mathematics and verbal sections of the SAT to correlate equally well with organic grades (n = 117, r2 = 0.16 and 0.14, respectively), while Turner and Lindsay found organic scores correlated slightly better with the ACT-mathematics section than they did with the English, reading, or science reasoning sections of the ACT (n = 221). Other groups have reported conflicting results concerning standardized test scores and organic grades. The Cadet Entrance Evaluation Report, a score composed of 48% SAT-mathematics, 22% SAT-verbal, and 30% high school rank, showed widely varying correlations (r2 = 0.001–0.59) with organic scores in four sections of the course (n = 18–20) at the United States Military Academy at West Point (Pursell, 2007). Steiner and Sullivan (1984) also reported better performance on the ACT-mathematics amongst the organic students earning a C or lower compared with those earning C+ or better (n = 64). Taken together, standardized test scores may not serve as the most reliable predictor of success.
Students typically enroll in organic chemistry during their second or third year of college, approximately two or three years after taking the standardized tests. Thus, students enrolling in organic courses have grade point averages (GPA) from coursework done during their first year or two of college. Students’ incoming GPA showed moderate to high correlations with organic grades (r2 = 0.18–0.67) (Pursell, 2007; Szu et al., 2011). In addition to indicating a level of general knowledge, part of this measure may include affective traits such as study skills and habits. Lopez and colleagues (2014) limited the incoming GPA to that of a prior science GPA, which included any science courses completed prior to organic, and found the refined measure accounted for 19% of the variability in students’ problem-solving performance.
Among the courses that would comprise a science GPA are those in the general chemistry sequence, which typically serves as a prerequisite for enrollment in first-semester organic. Indeed, general chemistry courses address several concepts (e.g., electronegativity, bonding, molecular geometry, hybridization, etc.) that are foundational to a solid understanding of the structure and reactivity of organic compounds, so it is unsurprising that general chemistry scores have been positively related to organic chemistry (Pursell, 2007; Austin et al., 2015). General Chemistry II courses, which tend toward quantitative analyses, show a range of correlations with Organic Chemistry I outcomes (r2 = 0.32–0.52) (Rixse and Pickering, 1985; Turner and Lindsay, 2003; Horowitz et al., 2013). A comparison of course grades revealed that more than 50% of students in organic chemistry earn within one-third of a grade step of their general chemistry grade (such as B to B–), with most changes in the downward direction (Rixse and Pickering, 1985).
Less is known of the relationship between General Chemistry I scores and Organic Chemistry I grades despite the fact that General Chemistry I houses the aforementioned relevant topics. Rixse and Pickering (1985) report sophomores’ fall organic chemistry grades more weakly correlate with fall freshman chemistry than with spring freshman chemistry (r2 = 0.30 and 0.38, respectively), but the reverse holds true for juniors (r2 = 0.50 and 0.37, respectively). Jasien (2003) compared students having either one or two semesters of general chemistry and found no statistically significant difference between their organic course outcomes, which might indicate that only General Chemistry I was important for success in organic courses. However, he also reported that the groups were statistically significantly different in age (with older students having two semesters of general chemistry) and in the amount of time since their most recent chemistry course. Both studies may indicate that organic course performance depends on the student's total amount of college experience (Robinson et al., 2007).
When multiple predictors were measured within a single study, general chemistry scores outperformed help-seeking behaviors (Horowitz et al., 2013), standardized test scores (Rixse and Pickering, 1985), and noncognitive variables (Turner and Lindsay, 2003) but fell behind incoming GPA (Pursell, 2007).
The first exam of the semester has been shown to indicate which students might not pass first semester organic at the University of North Georgia (Hollabaugh et al., 2019). Nearly two-thirds of students earned course averages within 10 points of their first exam score. As these first exam scores accounted for only 15% of the overall course grade, researchers discounted “arithmetic determinism” as the sole cause of the relationship between the two scores. Rather, they postulated that the first exam assesses students’ knowledge of review topics (e.g., Lewis structures and molecular geometries) and new concepts (e.g., structure representations, resonance, and acid-base chemistry) that are essential due to the cumulative nature of the course; inability to master these topics leads to shaky understanding of later concepts (Hollabaugh et al., 2019). Exam scores were also found to have a small but positive reciprocal effect with self-efficacy on performance (Villafañe et al., 2016).
Herron (1975), Goodstein and Howe (1978), and Bird (2010) indicate that the study of college chemistry requires students to be at the formal operational level but that many still are pre-formal thinkers. Indeed, formal thought assessment scores correlate with outcomes in general chemistry courses (Bender and Milakofsky, 1982; Bunce and Hutchinson, 1993; Lewis and Lewis, 2007; Bird, 2010). While similar comparisons between cognitive development and organic chemistry course outcomes do not seem to have been explored, Bunce and Hutchinson found a moderate correlation (r2 = 0.22) between the Group Assessment of Logical Thinking (Roadrangka et al., 1983) and outcomes in an organic and biochemistry course for nursing majors (1993). Certainly, a one-semester organic chemistry course for nonmajors would entail a different level of rigor than a two-semester sequence for science majors, though there would be noteworthy similarities in the course materials. While the developmental level required by organic chemistry students is nebulous, select factors may provide an indication of the students’ general level of reasoning ability.
Spatial ability, including mental visualization and rotation of objects, has long been recognized as having strong ties to chemistry (Harle and Towns, 2011) and has been shown to have a small correlation with organic chemistry course outcomes (Turner and Lindsay, 2003). The correlation most likely manifests in the ability's significant main effects on tasks involving: (1) conversion between names and structural formulas, (2) three-dimensional features of molecules, (3) completion of a reaction equation with either a missing reagent or product, (4) outline of a multi-step synthesis, (5) identification of a wrong or incomplete structure or formula, and (6) answering higher-order multiple choice questions (Pribyl and Bodner, 1987). Furthermore, spatial ability has been shown to have a significant relationship to mathematical ability (Guay and McDaniel, 1977; Battista et al., 1982; Jones et al., 2011; Cheng and Mix, 2014; Mix et al., 2016; Resnick et al., 2020). A meta-analysis examining the intricacies of the relationship revealed spatial ability is more strongly correlated with logical ability than with numerical or arithmetical ability (Xie et al., 2020).
Mathematical ability enjoys further ties to logical ability. The two require cognitively demanding mental operations such as the ability to process symbolic and abstract representations, to apply rules and draw conclusions, and to reason abstractly (Morsanyi and Szücs, 2014). In exploring the long-held idea that mathematics training develops general thinking ability, Inglis and Simpson (2009) determined that undergraduate students of mathematics outperform intelligence-matched non-mathematics students in conditional inference tasks. Further exploration of 16- to 18-year-olds revealed that students studying mathematics show greater improvement in their logical reasoning on the same conditional inference tasks than their non-mathematics peers (Attridge and Inglis, 2013). Use of a larger battery of rational and logical reasoning tasks provided similar results: more extensive mathematics training correlated with increased success on reasoning tasks amongst participants ranging from first-year undergraduates through research academic mathematicians (Cresswell and Speelman, 2020). The relationship between mathematical ability and logical reasoning may even be observed as early as the age of six. Nunes et al. (2007) reported that logical competence at the start of primary school is causally related to mathematics achievement over the first 16 months and that training in logical competence has a large effect size on mathematics ability even after 13 months.
Tai et al. (2006) identified multiple significant mathematics-related predictor variables in a regression model for success in first-semester college chemistry; these included enrollment in calculus in high school, SAT-math score, last grade in a high school mathematics course, and experience with stoichiometry in high school chemistry. They propose that calculus experience denotes a level of mathematics fluency essential for comprehension of chemistry lectures and texts that assume an understanding of symbols and equations. Furthermore, Kennepohl et al. (2010) found scores on mathematics and critical thinking questions more closely correlated with chemistry course performance than scores on conceptual basics, problem solving, or previous schoolwork.
Recently, researchers from six Texas universities reported the use of a 20-item Math-Up Skills Test (MUST) and a demographics survey to predict success in General Chemistry I (Williamson et al., 2020) and in General Chemistry II (Powell et al., 2020). Advantages of the MUST include the ability to assess student strength from the first day of class, a minimal time commitment for the assessment, and the availability of results for every student. With the established relationships between organic chemistry performance and logical reasoning and between mathematics ability and logical reasoning, we were curious to see if the MUST could predict course outcomes for first-semester organic chemistry, the next course in the typical college chemistry sequence.
(2) Will the predictability of the MUST improve with incorporation of other academic measures, such as grade point average and first exam scores, which have been previously shown to correlate with OChem success?
Curriculum for OChem taken by the study population aligns with the American Chemical Society Examination Institute's Organic Chemistry—First-Term exam and includes a review of relevant concepts from general chemistry, structure and stereochemistry, nomenclature, reaction mechanisms, reactivity of alkenes, and nucleophilic substitution and elimination reactions. Students earn three credit hours for successful completion of the semester-long course and must earn a C or better to enroll in the next course in the sequence, Organic Chemistry II. The study's institution requires concurrent enrollment in the associated laboratory course for an additional one credit hour.
Two different versions of the MUST were distributed to the students so that no neighboring students would have the same version of the test. The two versions present items in the same order but employ different numbers within items. For example, item 1 asks for the product of 87 × 96 on one version and the product of 78 × 96 on the second version. Both versions of the MUST may be found in the ESI† of the General Chemistry I study (Williamson et al., 2020). Instructors hand-graded the assessments, marking each response as correct or incorrect for a possible range of scores from 0 to 20. Partially correct responses and incorrectly reported answers received no credit. All of the MUST questions were free response so students could not arrive at the correct answer by guessing or by working backwards.
The MUST was administered during the first week of the semester. For every student, instructors recorded a score for each item and the total MUST score. A two-tailed t-test of participant scores on the two versions of the MUST indicates no significant difference between the versions (p = 0.084). The MUST showed good internal consistency with Cronbach's alpha = 0.861 for the 150 participating students.
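The internal consistency statistic reported above can be sketched in a few lines. This is a minimal illustration of the standard Cronbach's alpha formula, assuming item responses are stored as one row of 0/1 scores per student; the function name `cronbach_alpha` and the toy data are hypothetical and are not the study's data or the authors' analysis code.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a table with one row per student and one
    column per item (here 1 = correct, 0 = incorrect)."""
    n_items = len(item_scores[0])
    # Sample variance of each item across students.
    item_variances = [
        variance([row[i] for row in item_scores]) for i in range(n_items)
    ]
    # Sample variance of the students' total scores.
    total_variance = variance([sum(row) for row in item_scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Toy data (hypothetical): every item agrees perfectly within each student.
consistent_items = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
alpha_consistent = cronbach_alpha(consistent_items)

# Toy data (hypothetical): two items that are unrelated to each other.
unrelated_items = [[1, 0], [0, 1], [1, 1], [0, 0]]
alpha_unrelated = cronbach_alpha(unrelated_items)
```

Perfectly consistent items give an alpha of 1 and unrelated items give an alpha of 0, so the reported value of 0.861 sits toward the consistent end of that scale.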
The data from the 2019 and 2020 study participants were used to develop two regression models in which the MUST score served as the numerical predictor variable. Overall semester grades were the numeric response in the linear regression, while the logistic regression utilized OChem success as a categorical, binary response. For the purpose of this study, success is defined as earning an A, B, or C. Grades of C or better (minimum of 69.5%) were chosen as the cutoff for successful completion as that is the minimum grade required for students to enroll in second-semester organic chemistry at the study institution. Students who earned a D or F or withdrew (W) before the end of the semester have been classified as unsuccessful in OChem for data processing. Success was assigned a value of 1 in the regression model while all other outcomes (D, F, or W) were assigned a value of 0.
Subsequent logistic regressions utilized categorical predictor variables to predict success in OChem. MUST scores were divided into three categories such that the middle category ranged from one-half standard deviation below to one-half standard deviation above the mean. Additional categorical predictor variables worked into the hierarchical logistic regressions included participants’ science GPA and score on the first exam of the semester. Coefficients for the resulting models are included in Appendix Table 5.
At the completion of the semester, MUST scores were compared to overall semester grades. Table 1 shows the average (SD) (SE) for various groups, based on whether students successfully or unsuccessfully completed the course. The average MUST score for successful students was approximately 5 points higher than that of their unsuccessful counterparts and is statistically significantly different when comparing all successful and unsuccessful participants (p = 2.1 × 10−10 and Cohen's d = 1.07) as well as the successful and unsuccessful participants who were enrolled in OChem for the first time (p = 1.8 × 10−10 and Cohen's d = 1.29). The group of students retaking the course after a previous withdrawal or unsuccessful attempt is much smaller (n = 27), but we see a similar pattern with the average MUST score 2 points higher for successful participants (p = 0.250). The bulk of the analysis for the remainder of the study focuses on the first-time enrollees to eliminate any bias of prior organic chemistry knowledge in those retaking the course.
Participant group | n (%) | Average MUST (SD) (SE) |
---|---|---|
All students | n = 150 | |
Successful (A, B, or C) | 90 (60.0%) | 13.07 (4.74) (0.50)a |
Unsuccessful (D, F, or W) | 60 (40.0%) | 8.37 (3.78) (0.49) |
First-time enrollees | n = 123 | |
Successful (A, B, or C) | 74 (60.2%) | 13.93 (4.21) (0.49)a |
Unsuccessful (D, F, or W) | 49 (39.8%) | 8.67 (3.85) (0.55) |
Repeating enrollees | n = 27 | |
Successful (A, B, or C) | 16 (59.3%) | 9.06 (5.13) (1.28) |
Unsuccessful (D, F, or W) | 11 (40.7%) | 7.00 (3.23) (0.97) |
a MUST scores for successful students statistically significantly higher at p < 0.05 level.
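The effect sizes quoted in the text can be reproduced from the summary statistics in Table 1. The sketch below assumes the pooled-standard-deviation form of Cohen's d; the source does not state which variant was used, so that choice is an assumption, although it does return the reported values. The function name `cohens_d` is illustrative.

```python
import math


def cohens_d(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Cohen's d using the pooled standard deviation of two groups
    (assumed variant; not confirmed by the source)."""
    pooled_variance = (
        (n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2
    ) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / math.sqrt(pooled_variance)


# Means, SDs, and group sizes taken from Table 1.
first_time_d = cohens_d(13.93, 4.21, 74, 8.67, 3.85, 49)
all_students_d = cohens_d(13.07, 4.74, 90, 8.37, 3.78, 60)

print(round(first_time_d, 2))    # 1.29
print(round(all_students_d, 2))  # 1.07
```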
Fig. 1 Scatter plot of course grade versus MUST score for first-time enrollees in the study (n = 123). Course grades were taken at completion of semester or at time of withdrawal from the course.
To achieve a uniform measure of grades, we calculated the average of the first three exam scores and plotted them against the MUST (Appendix Fig. 9). At the study institution, the third exam is taken during the ninth week of the semester, before students typically begin to withdraw. All but four of the 123 participants took the first three exams; these four students withdrew prior to the third exam. The r2 value for the linear regression increased slightly from 0.410 for semester grades vs MUST to 0.486 for three-exam averages vs MUST (Appendix Fig. 9), indicating a slightly stronger correlation between the measures. Both plots exhibit less variation in grades for the higher MUST scores and greater ranges in grades for the lower end.
The alluvial diagram in Fig. 2 compares MUST scores to course letter grades. The three MUST ranges illustrated were based on one standard deviation around the mean to establish “below average” as 0–8, “average” as 9–14, and “above average” as 15–20. The “above average” scores almost completely translated to a successful course outcome, and 72% of students in the “below average” category finished with an unsuccessful grade or withdrew. The “average” range is distributed throughout the three successful letter grades and represents the second largest source of DFW scores.
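The three ranges described above amount to a simple lookup on the total MUST score. A minimal sketch follows, using the cut points stated in the text (0–8, 9–14, 15–20); the function name `must_range` is hypothetical.

```python
def must_range(score):
    """Map a total MUST score (0-20) to the category used in the text."""
    if not 0 <= score <= 20:
        raise ValueError("MUST scores range from 0 to 20")
    if score <= 8:
        return "below average"
    if score <= 14:
        return "average"
    return "above average"


labels = [must_range(s) for s in (8, 9, 14, 15)]
```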
Fig. 3 illustrates the results of the MUST questions by success for all study participants, first-time enrollees, and repeaters. Each group follows the same general pattern although the percent answering correctly varies between the groups. For example, over 50% of test-takers in each category answered questions 1, 5, 10, 17, 19, and 20 correctly but questions 7 and 13 were answered incorrectly by at least 50% of the participants in each group. These question success patterns for our study cohort match those reported for a large population of students (n = 1073 and n = 1599) from different universities in the General Chemistry I and II studies (Williamson et al., 2020; Powell et al., 2020, respectively). Furthermore, the percentage correct on each question for successful groups is higher than for the corresponding unsuccessful groups with the exception of question 1, which tests 2-digit multiplication. The percentage correct for students repeating the class was not divided into “successful” and “unsuccessful” due to the small number represented therein (among repeaters: n = 16 successful and 11 unsuccessful). The results for all repeaters closely resemble those of the unsuccessful groups. Any repeater would have at one time been included in an “unsuccessful first-time” group, so similar scores are unsurprising.
With the MUST scores in hand, we sought to determine the ability of the MUST to predict organic chemistry course outcomes using a logistic regression. The resulting curve, which allows for the prediction of probability of success, may be seen in Fig. 4. Based on the curve, a student taking OChem for the first time has even odds of passing with a score of 9 on the MUST. This model predicts successful course outcomes with 80.3% accuracy and unsuccessful outcomes with 63.8% accuracy.
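To make the "even odds" reading of the curve concrete, the sketch below evaluates a single-predictor logistic model and solves for the score at which the predicted probability is 0.5. The coefficients used here (`intercept = -2.7`, `slope = 0.3`) are hypothetical placeholders chosen only so that the even-odds point lands at a score of 9; they are not the fitted values, which the article reports in its appendix.

```python
import math


def success_probability(must_score, intercept, slope):
    """Predicted probability of success from a one-predictor logistic model."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * must_score)))


def even_odds_score(intercept, slope):
    """Score at which the predicted probability equals 0.5,
    i.e. where intercept + slope * score = 0."""
    return -intercept / slope


# Hypothetical coefficients for illustration only (not the fitted model).
INTERCEPT = -2.7
SLOPE = 0.3

break_even = even_odds_score(INTERCEPT, SLOPE)
p_at_nine = success_probability(9, INTERCEPT, SLOPE)
```

Because the logistic function equals 0.5 exactly when its linear argument is zero, the even-odds score depends only on the ratio of the intercept to the slope.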
Fig. 4 Comparison of actual passing rates and logistic regression curve for the probability of success based on MUST scores for first-time enrollee participants.
The receiver operating characteristic curve offers another indicator of the quality of a classifier such as a logistic regression by plotting sensitivity (or the true positive rates) versus 1 – specificity (or the false positive rates) at varying cutoffs for the classifier (Fawcett, 2006; García-Valcárcel and Tejedor, 2012; Han, 2022). The area under the curve (AUC) indicates the probability that a classifier will rank a randomly chosen positive instance higher than a randomly chosen negative instance; values range from 0 to 1, with a value of 0.5 corresponding to random guessing (Fawcett, 2006). The AUC for the logistic regression based on the MUST is 0.840. An AUC is considered acceptable between 0.7 and 0.8, excellent between 0.8 and 0.9, and outstanding between 0.9 and 1.0 (Han, 2022).
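The ranking interpretation of the AUC given above can be computed directly by comparing every successful student with every unsuccessful one. The following is a minimal sketch on toy data, assuming ties count as half a win (a common convention); the function name `auc_by_ranking` and the example scores are hypothetical, and this is not how the reported value of 0.840 was necessarily obtained.

```python
def auc_by_ranking(scores, labels):
    """Fraction of (positive, negative) pairs in which the positive case
    receives the higher score; ties count as half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical): 1 = successful, 0 = unsuccessful.
perfect = auc_by_ranking([15, 12, 9, 7], [1, 1, 0, 0])
chance = auc_by_ranking([15, 7, 12, 9], [1, 1, 0, 0])
```

A classifier that always ranks successful students above unsuccessful ones scores 1.0, while one that does so for only half of the pairs scores 0.5.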
By using both the MUST and science GPA, the intent was to provide a more robust risk assessment for a student's performance in the course. For example, if a student had a high MUST score, but a lower science GPA, they may be perceived as being successful where they may have been predicted to have a DFW result based on their science GPA alone. Likewise, students with a higher science GPA and low MUST score may be predicted to need some type of intervention at the outset of the course, based on the MUST score alone, when it may not be necessary. An initial cross tabulation examined the distribution of MUST scores within three ranges of science GPA (Fig. 5).
As expected, those participants in the lowest science GPA range had the highest percentage (>50%) of low MUST scores in the 0–8 range. Those in the uppermost GPA range had the highest percentage (>55%) of top MUST scores. In comparison, those participants who scored in the 9–14 range for the MUST were fairly evenly distributed throughout the science GPA ranges, with the highest percentage being found, interestingly, for those with the lowest GPA ranking. It was curious that for those participants in the midrange science GPA field, the MUST scores were equally distributed (Fig. 5).
Two new logistic regressions were created, using either the MUST ranges or science GPA ranges as the categorical input variable; coefficients for the predictor variables may be found in Appendix Table 5. Table 2 shows the overall success rates for the study participants, compared separately based on the range of MUST score and on the range of their science GPA, along with the probability of success predicted by the logistic regression.
Measure | Range | Observed OChem success rate (%) | Predicted OChem success rate (%) |
---|---|---|---|
Science GPA | 3.3–4.0 (n = 62) | 95.2 | 95.2 |
Science GPA | 2.7–3.29 (n = 18) | 55.6 | 55.6 |
Science GPA | <2.7 (n = 40) | 15.0 | 15.0 |
MUST score | 15–20 (n = 44) | 90.9 | 99.1 |
MUST score | 9–14 (n = 43) | 58.1 | 57.8 |
MUST score | 0–8 (n = 33) | 30.3 | 29.4 |
Both sets of data show that the highest performers in each category have a high chance of success, while those with the lowest science GPA or the lowest MUST score have a much lower likelihood of success. Participants were then cross tabulated by science GPA range and MUST range, and the course success rates were calculated for the groups. These results are shown in Fig. 6.
Fig. 6 Success rates for participants earning scores in specified MUST and science GPA ranges. All n values for each group are listed above bars in chart.
Fig. 7 Observed success rates in OChem when first exam score is (a) passing or (b) not passing. The population values for n are shown above each bar.
The study's participants in the highest science GPA range had the highest success rates irrespective of their MUST score. Even the participants in this range with the lowest MUST scores (n = 6) had a 100% success rate in this study. For the midrange science GPA, the success rates were similarly independent of the MUST scores. These participants were not as successful overall as those with a higher GPA, but they still maintained a 56% success rate. The effect of the MUST score was most evident for those participants that fell in the lowest science GPA range. These participants had a >65% chance of success if they scored high on the MUST, with this success rate dropping dramatically for lower MUST scores. Indeed, of the twenty-one participants who had a low science GPA and a low MUST score only one was able to achieve success in the course. It should be noted that the overall success rate for this group is 15%, due in part to the fact that there were very few participants (n = 3) who attained the high MUST score while having a low science GPA.
The next step was to incorporate both sets of ranges into the hierarchical logistic regression to see the impact of the MUST score and science GPA on the success rate. The predicted probabilities of passing were calculated using the regression's coefficients for all possible combinations and are shown in parentheses in Table 3 alongside the actual success rates. Any population calculated to have a passing rate of 50% or higher was predicted to be successful; they are represented by unshaded cells. Populations calculated to have success rates below 50% were predicted to be unsuccessful and are represented by the red shading. A classification table of predicted versus observed outcomes identified 8 students who were incorrectly predicted to succeed and 9 students incorrectly predicted to not succeed. As shown on the right side of Table 3, nearly half of the incorrectly predicted participants belong to the midrange science GPA, with slightly more incorrectly predicted to be successful (n = 5) than unsuccessful (n = 3). Overall, the model's prediction rate for successful students is 88.0%, while those who are unsuccessful may be predicted at an 82.2% accuracy rate.
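The two prediction rates follow from the misclassification counts by simple arithmetic. The sketch below assumes 75 successful and 45 unsuccessful participants in the model; these totals are not stated in the text and are inferred here from the group sizes and observed success rates in Table 2 (59 + 10 + 6 = 75 successes out of 120). The function name `class_accuracy` is illustrative.

```python
def class_accuracy(n_in_class, n_misclassified):
    """Percentage of an observed outcome class that the model predicted correctly."""
    return 100.0 * (n_in_class - n_misclassified) / n_in_class


# Inferred from Table 2 (assumption): 120 participants, 75 of them successful.
N_SUCCESSFUL = 75
N_UNSUCCESSFUL = 45

# 9 successful students were incorrectly predicted to not succeed.
successful_rate = class_accuracy(N_SUCCESSFUL, 9)
# 8 unsuccessful students were incorrectly predicted to succeed.
unsuccessful_rate = class_accuracy(N_UNSUCCESSFUL, 8)

print(round(successful_rate, 1))    # 88.0
print(round(unsuccessful_rate, 1))  # 82.2
```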
The first exam in OChem at the study institution is approximately 10% multiple choice and 90% free response. It covers some review material from general chemistry that is adapted to OChem (acid/base theory, formal charges, valence electrons, and lone pairs) and an introduction to the fundamental concepts of OChem including resonance in organic compounds, organic line-bond structures, orbital hybridization, alkane naming, and constitutional isomers. The exam is the same for both sections of the course and is taken by the students during the third week of the semester. Each exam during the semester accounts for 15% of the overall semester grade, with the lowest exam score discounted to 10% of the semester grade. Only seven of the 123 first-time enrollee participants earned their lowest score on this first exam, thus the exam counted for 15% for the vast majority of the first-time enrollee participants. Homework (5%), participation (5%), and the ACS first-term standardized final (20%) comprise the remainder of the grade. A linear regression of course grade on the first exam score revealed a marked correlation between the two measures (r2 = 0.854, Appendix Fig. 12a). When semester grades were adjusted to exclude the first exam score, the r2 decreased only slightly to 0.804 (Appendix Fig. 12b).
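The grading scheme described above can be written out as a weighted sum. The text does not state the number of exams; the sketch assumes five, which is the count at which the stated weights total 100% (four exams at 15%, the lowest at 10%, plus 5% homework, 5% participation, and the 20% final). The function name `semester_grade` and the example scores are hypothetical.

```python
def semester_grade(exam_scores, homework, participation, acs_final):
    """Weighted semester percentage under the scheme described in the text.

    Assumes five exams (an inference from the stated weights): the lowest
    exam counts 10%, each of the other four counts 15%.
    """
    if len(exam_scores) != 5:
        raise ValueError("this sketch assumes exactly five exams")
    ordered = sorted(exam_scores)
    lowest, others = ordered[0], ordered[1:]
    return (
        0.15 * sum(others)
        + 0.10 * lowest
        + 0.05 * homework
        + 0.05 * participation
        + 0.20 * acs_final
    )


# Hypothetical scores: uniform performance returns the same percentage.
uniform = semester_grade([80, 80, 80, 80, 80], 80, 80, 80)
# Hypothetical scores: one missed exam costs 10 points rather than 15.
one_zero = semester_grade([100, 100, 100, 100, 0], 100, 100, 100)
```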
Fig. 7 shows the actual passing rates for student groups based on the MUST, the science GPA, and the score of the first exam (grouped as D or F < 69.5% ≤ ABC). Compared to Fig. 6, there is a clearer picture of those participants who fall in the midrange science GPA as these students had a much higher success rate in the course if they attained a passing grade on the first exam and were uniformly unsuccessful if they scored a D or F on the first exam. Similarly, 50% of participants in the low science GPA range were successful in OChem if they passed the first exam, but only 6% of participants in the low GPA range passed the course with an unsuccessful first exam score. Among the participants with a low science GPA, higher MUST score ranges appeared to correspond with increased success rates regardless of the first exam outcome. In the study population, two groups contained zero participants: those with a high GPA, a low MUST score, and an unsuccessful first exam, and those with a low GPA, a high MUST score, and a D or F on the first exam. It should be noted that the inclusion of a third variable reduced the population for some of these subgroups to low n values. Despite the low n values, the general trend holds that students who are in the middle or lower GPA range and do not make a C or higher on the first exam have a greatly reduced probability of passing the course.
The first exam was added to the hierarchical logistic regression as the third and final categorical input variable. Grades of 69.5 or better were assigned a 1 while scores below 69.5 were marked 0 for the analysis. Coefficients from the resulting model were used to calculate predicted probabilities of passing for the various combinations of science GPA, MUST range, and first exam outcome. Populations with >50% chance of passing were predicted to be successful (unshaded cells in Table 4), and all others were predicted to be unsuccessful (cells shaded in red in Table 4).
Analysis of the model's classification is shown on the right side of Table 4. As in the previous logistic regression, eight participants were incorrectly predicted to be successful, for an accuracy rate of 82.2% among unsuccessful students. Where half of these incorrectly predicted outcomes fell in the midrange science GPA from the previous model, the new model only incorrectly placed one participant in this range. Instead, half of the new model's incorrectly predicted students (n = 4) fell in the low science GPA and successful first exam category. Only two students, both in the low GPA and unsuccessful first exam category, were incorrectly predicted to be unsuccessful so that the prediction rate for successful students increased to 97.3%. The AUC for this final model is an outstanding 0.950.
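The two prediction rates quoted above follow from simple counts, as the sketch below illustrates. The group sizes used here (74 successful and 45 unsuccessful students) are assumptions back-calculated from the reported percentages and are not stated in this section; the function name is hypothetical.

```python
def classification_rates(n_success, missed_success, n_unsuccess, missed_unsuccess):
    """Per-class and overall accuracy from misclassification counts.

    missed_success: successful students wrongly predicted to be unsuccessful.
    missed_unsuccess: unsuccessful students wrongly predicted to be successful.
    """
    success_rate = (n_success - missed_success) / n_success
    unsuccess_rate = (n_unsuccess - missed_unsuccess) / n_unsuccess
    total = n_success + n_unsuccess
    overall = (total - missed_success - missed_unsuccess) / total
    return success_rate, unsuccess_rate, overall


# Assumed group sizes (74 and 45) with the misclassification counts from the text.
success_rate, unsuccess_rate, overall = classification_rates(74, 2, 45, 8)
```

Under these assumed group sizes, the per-class rates reproduce the reported 97.3% and 82.2%.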
The MUST is primarily a product-oriented test with free-response numerical answers and would therefore be expected to be more applicable to general chemistry since organic chemistry has been described as being more process-oriented than product-oriented (Graulich, 2015). However, the correlation between the MUST and success in OChem may indicate that there are elements in the MUST that also incorporate process-oriented reasoning. Donovan and Wheland (2009) suggest that the mathematics-chemistry connection is based on the development of higher-order cognitive skills required by both disciplines that takes place during the mastery of mathematics. Ralph and Lewis (2018) describe the ability of students at risk of not passing (those who scored in the bottom quartile of mathematics component scores on the SAT) to perform comparably to their peers in a first-semester general chemistry course when they achieved proficiency in mole concept and stoichiometry assessments. Both concepts rely heavily on proportional reasoning skills, one of the Piagetian tasks associated with formal thought.
Alternatively, the relationship between the MUST and OChem outcomes may depend on students’ cognitive load as the MUST is a calculator-free assessment. Students more familiar with the symbols and mathematical manipulations on the MUST and better able to retrieve math facts would experience a lighter load, allowing them to perform better than students with heavier cognitive loads (Royer et al., 1999). Furthermore, an analysis of problem types in organic chemistry revealed that types associated with higher cognitive load (i.e. those involving hybridization, resonance, Lewis structures (Tiettmeyer et al., 2017), isomerism, or multi-step, curved arrow formalisms) were more highly correlated with course success than those with lighter cognitive demands (Austin et al., 2015).
One factor to consider in the administration of the MUST is the students’ motivation to perform well on it. Prior to beginning the test, students are encouraged to perform well even though they are aware that the score they receive on the MUST does not impact their grade. Highly motivated students are probably more likely to strive to do their best, even if the assignment has no grade impact, and these students could reasonably be expected to maintain this motivation throughout the semester. Students who are not self-motivated to perform well on the MUST may or may not have an internal drive to do well in the course. As there is no true incentive to attain a high score on the MUST, the correlation between MUST scores and motivation may merit further investigation.
Both mathematics and organic chemistry have long been known to induce anxiety in students due to students’ anticipated ability (or inability) to succeed in the rigorous courses (Betz, 1978; Steiner and Sullivan, 1984). This anxiety inhibits student enjoyment of the subjects and impacts subsequent performance, negatively influencing motivation to persist in the course (Black and Deci, 2000). Thus, administrators of the test should consider the effects of stereotype threat. According to Spencer, Steele, and Quinn (1999), these effects can be minimized when the instructions for an assessment indicate that no effect has been observed for the assessment in the past. Thus, students taking the MUST should be instructed in such a way as to relieve stereotype threat. For example, the instructor may inform students that they are all equally qualified to take the test because of their experience with similar math problems in their general chemistry coursework. The potential for stereotype threat may be further reduced by separation of any demographics questionnaire from delivery of the MUST. Previously, Lynch and Trujillo (2011) questioned the extent to which factors underlying self-regulated learning, i.e. motivational, cognitive, metacognitive, behavioral, emotional factors (Panadero, 2017), extend across academic domains. Because OChem grades correlate with MUST scores despite a lack of mathematics calculations in OChem, the MUST may provide evidence for a lack of domain specificity in self-regulated learning.
Indeed, a closer examination of the success rates for the various combinations of science GPA and MUST score may provide further insight regarding abilities being measured by the MUST (Fig. 6). The data suggest that students entering OChem with a top-tier science GPA will find success even if their MUST score is low. Similarly, those with a midrange science GPA were more likely than not to find success, irrespective of MUST score, though this was the smallest cohort in the dataset. Certainly, many variables factor into academic performance leading to grade point averages. Students beginning OChem with mid- or high-range GPA may have the attitudes, study habits, and autonomous motivation for success to override differences in developmental level (Steiner and Sullivan, 1984; Black and Deci, 2000; Grove and Bretz, 2010; Szu et al., 2011). For students who fall in the bottom-tier GPA category, success rates are strongly differentiated by scores on the mathematics assessment. Only 5% of students with MUST scores <9 passed the course, while 19% of students with MUST scores of 9–14 obtained successful outcomes. For these students, the MUST implies some level of cognitive ability not reflected in their prior science coursework.
By collecting science GPA data and the MUST scores within the first week of the semester, it is feasible to offer suggestions to students with low or borderline scores and to encourage them to seek out resources that improve their chances of success. Students with midrange science GPA and a midrange or low MUST score may benefit most from instructions on how to study most effectively. Students with both a low science GPA and a low MUST score may require resources and training to develop their cognitive ability. Early intervention could be critical in helping students at risk of not passing before the first exam of the semester is given or damage to grades occurs.
There are many types of interventions that can help organic chemistry students, with numerous examples in the literature that describe specific learning tools that can be employed, often in relation to a particular topic. However, in focusing on early intervention for students who are identified as less likely to pass the course at the onset of the semester, the focus should be on approaches that help them get up to speed quickly. Perhaps one of the most important tools is the encouragement of help-seeking behavior. As described in Horowitz et al. (2013), low-performing students often feel shame and are therefore reluctant to engage in help-seeking behaviors. Positive encouragement and assurance that tutoring and office hours would be beneficial may foster an environment that inspires students to take preemptive action to increase their chances for success. At the study institution, organic chemistry students have assigned online homework and access to optional practice problems within the textbook. Instructors also promote and encourage students to take advantage of additional online practice problem resources. The benefits of practice problems have also been described in Szu et al. (2011), where a student with a lower prior GPA outperformed another with higher prior GPA and reported a higher usage of practice problems in Month 1 (front-loading) of an organic chemistry course.
Promotion of peer-led team learning (PLTL) sessions is another method that could be employed to improve performance in organic chemistry. Implementation of PLTL has been shown to improve exam scores (Tien et al., 2002) and lead to higher success rates (Wamser, 2006) in organic chemistry. Online preparatory courses (Fischer et al., 2019; Pulukuri et al., 2021) delivered via a learning management system have also been found to result in higher performance and success rates in organic chemistry. There are also various examples of utilizing social-psychological interventions (SPIs) in order to increase performance and persistence in STEM courses. In particular, utility value interventions (UVIs), which connect course concepts to the daily lives and/or personal values of the students through assigned research on a topic, showed improved exam scores in general chemistry (Wang et al., 2021). A similar strategy engaged organic chemistry students by assigning them the task of presenting on arrow-pushing, which led to improved perceptions (a goal of SPIs) and, consequently, higher exam scores (Green and Rollnick, 2006). Zavala et al. (2019) describe the use of student role-playing journal exercises to connect abstract course content to students' career ambitions, which increased personal connections to organic chemistry and improved scores.
The additional data point of the first semester exam, which has been previously described as being a reliable predictive measure (Mills et al., 2009; Jensen and Barron, 2014; Hollabaugh et al., 2019), further strengthens the predictive model. Participants who scored an A, B, or C on the first exam were overwhelmingly successful unless they were in the lowest-tier science GPA and had a low MUST score. Participants who had a D or F on the first exam were only successful if they had a high science GPA and a high MUST score. Even without considering any other factors, fewer than 10% of the students in this study who scored below 69.5 on this exam were able to obtain a passing grade in the course (n = 4). Though the exam is given later in the semester (during the third week for the course in this study), it is still early enough to prompt potentially critical intervention prior to the more rigorous portion of the course. By working these scores into our model, successful course outcomes were predicted with 97% accuracy and unsuccessful outcomes were predicted with 82% accuracy.
In the analysis of MUST scores subdivided by science GPA, various groupings had small population sizes. Additional data collection may further clarify the trends that are seen in the data. There was also an insufficient number of participants who were repeating the course to make an adequate evaluation of this subgroup. The limited data suggested that their MUST scores do not significantly increase when participants retake the course (Fig. 3). However, previous exposure to the course material and additional physical science coursework may offset this and aid in success prediction.
Because the score on the MUST has no grade impact, students may not be putting in their best effort to maximize their score. This could potentially skew the results for some students, as the MUST may not be an accurate representation of their automaticity skills. Neither prior mathematics coursework nor elapsed time since such coursework was taken into consideration for this study.
The definition of science GPA is specific for the institution at which the study took place, and may not be applicable to other institutions. Nonetheless, adaptations to this definition could be tailored for other institutions or programs of study. Course exams (in using the first exam score) were developed in-house and are institution-specific, though previous work has indicated that the use of the first exam, regardless of institution, may be applicable for predicting success (Hollabaugh et al., 2019).
Consideration of the first OChem exam score further enhances the prediction when coupled with the MUST score and the science GPA. Of these three elements, the MUST is the most practical as it is simple to administer at the beginning of the semester. Incorporation of the science GPA is beneficial, though it requires additional effort by the instructor to obtain. While recommending interventional measures to identified students during the first week of the semester is preferable, instructors could better ascertain individuals’ chances of passing by evaluating the first exam scores. Though each of these methods has a varying level of predictive power, one or more was found to be a useful tool in determining which students were likely to pass or to not pass first-semester Organic Chemistry.
Fig. 8 Comparison of overall semester grades with the MUST score for students participating in the study; n = 120 for “Finishers” and n = 150 for “All Participants”.
Fig. 9 Comparison of average of first three mid-semester exams with MUST scores for first-time enrollees (n = 119).
Fig. 10a shows the relationship between attained grade in OChem and overall GPA; Fig. 10b shows attained grade versus the science GPA. Science GPA was determined to be a better predictor (r2 = 0.620) of overall performance compared to the overall GPA (r2 = 0.522) and was therefore used in the following analyses. Not surprisingly, there is also a wider distribution for the science GPA as the overall GPA is more heavily weighted toward the high end as first-year students normally enroll in many lower level core curriculum courses that tend to boost grade point averages. The data points that fall in the shaded area represent students that were not successful in OChem (score <69.5%).
For those participants (n = 120) who took the MUST and had previously completed science coursework at the study institution, this trend continues to hold. The correlation between OChem outcome and overall GPA was found to be r2 = 0.552, and that with science GPA was r2 = 0.676, as illustrated in Fig. 11. Thus, the participants of the current study were judged to be representative of the broader population of students.
Fig. 11 Scatter plot of OChem semester grade vs. science GPA for participants who completed the MUST (n = 120).
Fig. 12 Scatter plots of (a) OChem semester grade vs. first exam score and of (b) adjusted OChem semester grade vs. first exam score for first-time enrollees (n = 123).
Variable | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Model 6 |
---|---|---|---|---|---|---|
Intercept | 2.303* | 2.979* | 3.753* | 2.747* | 3.674* | 3.557* |
MUST range | ||||||
15–20 | Reference* | Reference | Reference | Reference | ||
9–14 | −1.989* | −1.219 | −0.828 | −0.272 | ||
0–8 | −3.178* | −1.652* | −1.371 | −0.299 | ||
Science GPA range | ||||||
3.3–4.0 | Reference* | Reference* | Reference* | Reference* | ||
2.7–3.29 | −2.756* | −2.532* | −1.983* | −2.026* | ||
0–2.69 | −4.714* | −4.201* | −3.191* | −3.260* | ||
Exam 1 score | ||||||
≥69.5 | Reference* | Reference* | Reference* | |||
<69.5 | −3.766* | −3.288* | −3.376* | |||
Pseudo-r2 | 0.331 | 0.641 | 0.667 | 0.669 | 0.778 | 0.777 |
χ2 | 34.320 | 76.204 | 80.677 | 83.294 | 101.462 | 101.341 |
AUC | 0.781 | 0.903 | 0.926 | 0.882 | 0.950 | 0.948 |
*p < 0.05.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d2rp00140c
This journal is © The Royal Society of Chemistry 2023