Leveraging a hybrid learning environment and small group recitations to promote higher order learning: incorporating reform-minded instruction within a traditional general chemistry curriculum

Colomba Sanchez-Marsetti, Jiho Ahn, Eric Dao, Angie Lopez and Jack F. Eichler*
Department of Chemistry, University of California-Riverside, USA. E-mail: jack.eichler@ucr.edu

Received 27th August 2025, Accepted 25th October 2025

First published on 25th October 2025


Abstract

There is an ongoing effort in the chemistry education community to promote instructional reform in undergraduate education that shifts the instructional emphasis from teaching disaggregated facts and skills to promoting more meaningful learning outcomes. In an effort to improve student performance as well as strengthen students' conceptual understanding of core chemistry concepts, this study leveraged a hybrid (flipped) learning environment to integrate instructional activities inspired by the three-dimensional learning (3DL) framework into a college-level general chemistry course sequence. In phase one of the study, a hybrid general chemistry course that was partially structured around the 3DL framework was assessed in comparison to three other teaching-as-usual courses. The Higher Dimensional Lecture (HDL) course appeared to improve some higher order learning outcomes versus courses that were structured around a traditional curriculum; however, no significant difference in 3DL assessment performance was observed relative to a traditional course that was observed to incorporate some elements of 3D learning. Phase two of the study was designed to address potential limitations in phase one by utilizing recitation sessions to increase the frequency at which students engaged with HDL practice activities. This appeared to improve higher order learning outcomes, as students who completed the HDL practice activities performed significantly better on summative 3DL assessments relative to students who completed more traditional problem-solving exercises. Students in this HDL practice activity treatment also performed better on one 3DL summative assessment versus students from a more traditional course. The results presented herein should provide a model for how instructors can use a hybrid course structure to promote reform-minded instruction within a traditional general chemistry curriculum.


Introduction

In higher education STEM courses (Science, Technology, Engineering, and Math) there is an ongoing need for reform in classroom instruction, particularly with respect to fostering student engagement, retention in STEM disciplines, and promoting meaningful learning (Laverty et al., 2016; Stains et al., 2018; Shortlidge et al., 2024). Despite a committed effort from the chemistry education community to address these broader problems, many college and university instructors in chemistry continue to resist the adoption of evidence-based instructional practices (EBIPs) and often rely on didactic instructional styles (Stains et al., 2018). Though it has generally been accepted that active learning approaches lead to improved student classroom performance (Freeman et al., 2014; Deslauriers et al., 2019; Theobald et al., 2020), it has been more recently proposed that active learning on its own does not necessarily lead to meaningful learning, and instructional reform should address both how we teach and what we teach (Cooper et al., 2024; Schwarz et al., 2024).

Hybrid learning (“flipped classroom”) structures have become quite prevalent in STEM higher education (Strelan et al., 2020; Bredow et al., 2021) and have been identified as an EBIP that generally leads to improved performance outcomes for students in post-secondary chemistry courses (Rahman and Lewis, 2020). Recent trends in optimizing the out-of-class and in-class learning environments in hybrid classrooms have been previously discussed in chemistry education literature (Eichler, 2022), and the general consensus is that hybrid classroom structures provide an opportunity for instructors to integrate more active learning during in-class instruction and that the use of out-of-class learning activities can help reduce cognitive load for students (Seery and Donnelly, 2012; Seery, 2015). However, as highlighted above, the use of hybrid instruction to integrate active learning does not guarantee meaningful learning outcomes have been achieved (Cooper et al., 2024; Schwarz et al., 2024). It has been argued by the corresponding author of this current work that hybrid learning classroom structures provide an opportunity for instructors to not only create active learning environments but also develop highly structured learning activities based on frameworks of learning that provide the foundation for conceptual understanding and higher order reasoning (Eichler, 2022).

In response to this call for instructional reform, the current study aims to build upon the previous success of implementing hybrid classroom structures that incorporate concept development (Wu et al., 2021) and higher order learning (Holloway et al., 2024). The goal was to explicitly design classroom activities around the three-dimensional learning (3DL) framework and assess how this instructional approach impacts higher order learning outcomes. Originally presented in the National Research Council's report, the 3DL Framework scaffolds K-12 curricula around three elements: Scientific and engineering practices, Crosscutting concepts, and Core ideas (National Research Council, 2012). This report has influenced science education reform and has been translated into concrete, assessable standards for K-12 by the Next Generation Science Standards (NGSS Lead States, 2013). Building on this foundation, Cooper and coworkers developed the Three-Dimensional Learning Assessment Protocol (3D-LAP) as a guide for STEM courses in postsecondary education to align curricula with the 3DL Framework. The push to integrate 3D learning into higher education stems from the observed gap between students' academic success and their ability to transfer knowledge to new contexts (Cooper et al., 2014). The three dimensions aim to create a more meaningful learning environment that promotes higher order thinking and a deeper conceptual understanding of the course content. More broadly, the need to enact curricular reform in higher education chemistry (Stains, 2018) and the goal to provide a meaningful learning experience that is relevant to students (Lindstrom and Middlecamp, 2017) have also motivated this work. It is hoped that the implementation and assessment of the 3DL-inspired classroom activities described herein will demonstrate that hybrid classroom structures can be leveraged to promote meaningful learning within a traditional general chemistry curriculum.

Theoretical frameworks

The frameworks of learning guiding the current study can be viewed from the perspective of the classroom structure and learning environment, and what type of learning outcomes are emphasized within that broader structure. Sweller's cognitive load theory posits that there is a limit to a learner's capacity to process, handle, and store information which may reach its limit during traditional problem-solving activities (Sweller, 1998). Capping out the cognitive-processing capacity negatively impacts the learner's schema acquisition and can inhibit long-term retention by overloading their working memory (Sweller, 1998). By delivering course material under a different modality and prior to in-class sessions, the students are able to digest new material at a pace that best benefits their learning preference (Mayer, 2024). This material can be delivered in the form of required readings, or pre-recorded lectures that the students can navigate on a more flexible timeline. Shifting the environment and method by which the learner processes new information can help reduce the student's cognitive load during class time by better preparing them for lectures (Seery and Donnelly, 2012). This in turn alleviates the pressure of processing new information during class time and instead allows the learner to focus on actively participating in in-class activities and engage with the content in a higher-order thinking manner. While keeping cognitive load theory in mind, a hybrid/flipped classroom can also incorporate constructivism which centers class time around the students and provides them with the opportunity to actively engage with the materials and build their knowledge as they collaborate with peers and instructors (Seery, 2015). Piaget’s Constructivist Theory states that individuals actively construct knowledge by building an understanding from real-world experiences or interactions (Edwards, 2017). 
Within this context, meaningful learning is likely to occur when students move beyond passively receiving information and instead engage with in-class instructional activities that may provide simulations, visualizations, or guided questions to explore important concepts, manipulate systems to understand patterns, and construct an overall deeper understanding.

Though classroom structures are built on frameworks that are likely to improve how introductory content is taught and are now generally considered an evidence-based instructional practice (Rahman and Lewis, 2020), structuring the learning environment in this way does not guarantee that the emphasis of instruction is focused on meaningful learning that extends beyond rote skills. This current study aims to leverage the hybrid/flipped classroom design to promote three-dimensional learning (3DL). While both the 3DL framework and Ausubel's theory of meaningful learning share an emphasis on helping students strengthen their conceptual understanding, it is proposed here that Ausubel’s theory provides the foundation for how students achieve meaningful learning (Ausubel, 1962; Ivie, 1998), whereas the three-dimensional learning framework operates more as a guide for structuring curriculum and classroom practice. The components of the 3DL framework have been categorized as core ideas, scientific and engineering practices, and crosscutting concepts (Laverty et al., 2016). Building instruction around these three dimensions of the framework fosters an environment where students not only develop an understanding of the concepts but reach a state where their knowledge can be applied beyond the classroom. Previous research shows that STEM students in higher education have been observed to emerge from foundational chemistry courses with misunderstandings of important core ideas (Cooper et al., 2014). This disparity can be viewed as a flaw in traditional curricular emphases, which fall short in prioritizing meaningful learning and instead rely on students simply engaging in rote learning and factual recall. It is argued here that the 3DL framework can be viewed through the lens of Ausubel's Theory of Meaningful Learning, which stresses the importance of meaningful learning and the role it plays in strengthening retention for the learner. 
By using the core ideas as pillars for the foundation of existing knowledge, students can anchor new information into their existing knowledge scaffold (Ausubel, 1962). Rote learning overlooks the importance of building connections between prior knowledge and new knowledge, resulting in evanescent knowledge. Ivie (1998) draws the connection between meaningful learning and higher order thinking by emphasizing how conceptual understanding encourages the utilization of complex thinking skills to relate different ideas to each other. This can be accomplished through the implementation of scientific and engineering practices, as defined by the 3DL framework, specifically encouraging students to use and apply their knowledge in scenarios where they are required to explain phenomena or engage in conversations that require them to communicate in the language of their discipline (Laverty et al., 2016).

In summary, the primary objective of this study is to demonstrate that hybrid/flipped classroom structures can be used to promote three-dimensional learning in introductory gateway STEM courses in higher education settings. However, because the two courses in this study that aimed to promote higher order learning did not incorporate a full 3DL curriculum they will be identified henceforth as Higher Dimensional Lecture (HDL) courses. Furthermore, the classroom implementation described herein might act as a model for how reform-oriented instruction can be achieved within a more traditional chemistry curriculum (i.e., in departments where whole-sale curricular reform is not supported). The following research questions were used to guide the study:

(1) What is the impact of an HDL course on student 3DL assessment performance relative to courses that emphasize more traditional skills and knowledge?

(2) What is the impact of HDL practice activities on student 3DL assessment performance relative to more traditional problem-based group learning activities?

Phase one of this study focused on addressing the first research question. This portion of the study was carried out during the winter of 2023 (W23) in an on-sequence second-quarter general chemistry (CHEM 001B) course. This was investigated using a quasi-experimental research design in which post-test performance was compared between an HDL course and two different versions of teaching-as-usual courses taught by three different instructors. The second research question was probed in phase two of the study, which was carried out in the winter of 2024 (W24) in an off-sequence first-quarter general chemistry (CHEM 001A) course. This was also investigated using a quasi-experimental research design; however, post-test performance was compared between an HDL practice exercise treatment and a traditional practice exercise treatment within the same HDL course, and these two conditions were both compared to a more traditional CHEM 001A course taught by a different instructor. In both phases of the study, the HDL course was implemented within a year-long course sequence in which all other instructors followed a traditional general chemistry curriculum and the majority of the instruction outside of the HDL course was didactic in nature (see SI Appendix A for a summary of the year-long general chemistry content).

Methods

Classroom setting and curriculum

In Phase One, four on-sequence CHEM 001B classes in Winter 2023 were evaluated: 001B Higher Dimensional Lecture (HDL), 001B Traditional Lecture 1, 001B Traditional Lecture 2, and 001B Traditional Lecture 3. Phase Two focused on CHEM 001A classes, including an on-sequence Fall 2023 course and an off-sequence Winter 2024 course, designated as the 001A Traditional Lecture and 001A Higher Dimensional Lecture (HDL), respectively. The 001B and 001A HDL Lecture courses employed a hybrid (“flipped”) classroom structure as previously described (Wu et al., 2021), whereas the Traditional Lecture Courses assigned post-lecture homework rather than pre-lecture learning activities. Summaries of the course structures are provided in Tables 1 and 2, and detailed descriptions of the classroom settings can be found in SI Appendix A.
Table 1 Classroom structure for Winter 2023 CHEM 001B study

| Class^a | Course enrollment | Class meetings (day and time) | Exam/assessment structure^b | In-class activities | Discussion group activities |
| --- | --- | --- | --- | --- | --- |
| 001B HDL Course | 252 | M/W/F 10:00–10:50 am | Traditional | HDL activities^c with collaborative group learning; frequent in-class poll questions with think-pair-share | Traditional practice problems; collaborative groups |
| 001B Traditional Lecture 1 | 253 | Tues/Thurs 11:00 am–12:20 pm | Traditional | Periodic in-class poll questions with think-pair-share; periodic unstructured problem solving | Traditional practice problems; collaborative groups |
| 001B Traditional Lecture 2 | 249 | Tues/Thurs 3:30–4:50 pm | Traditional | Periodic in-class poll questions with think-pair-share | Traditional practice problems; collaborative groups |
| 001B Traditional Lecture 3 | 193 | M/W/F 11:00–11:50 am | Traditional | Periodic in-class poll questions with think-pair-share | Traditional practice problems; collaborative groups |

^a The 001B HDL class was taught by author JFE; the 001B Traditional Lecture 1, 2, and 3 classes were taught by three different instructors. ^b Traditional exam structure here is defined as multiple choice and free response questions which were not specifically designed to incorporate 3DL principles. ^c Activities done in class can be found in SI Appendix A.


Table 2 Classroom structure for Winter 2024 CHEM 001A study; 001A traditional lecture course taught in Fall 2023 term

| Class^a | Discussion group designation | Number of students | Class meetings (day and time) | Exam/assessment structure | In-class activities | Discussion group activities |
| --- | --- | --- | --- | --- | --- | --- |
| 001A HDL Lecture | HDL Discussion Group | 31 | Wednesday 10:00–10:50 am | Traditional + 3DL^b | HDL activities; in-class poll questions | HDL practice activities |
| | | 32 | Wednesday 11:00–11:50 am | | | |
| | | 33 | Wednesday 1:00–1:50 pm | | | |
| | | 33 | Wednesday 3:00–3:50 pm | | | |
| | Traditional Discussion Group | 32 | Tuesday 2:00–2:50 pm | Traditional + 3DL^b | 3DL activities; in-class poll questions | Traditional practice problems; collaborative groups |
| | | 31 | Tuesday 3:00–3:50 pm | | | |
| | | 32 | Tuesday 5:00–5:50 pm | | | |
| | | 31 | Thursday 9:00–9:50 am | | | |
| 001A Traditional Lecture | n/a | 245 | M/W/F 2:00–2:50 pm | Traditional | In-class Poll Everywhere questions | Traditional practice problems; collaborative groups |

^a The class with the Higher Dimensional Discussion and Traditional Discussion Groups was taught by author JFE; the 001A Traditional Lecture class was taught by a different instructor (the same instructor who taught the 001B Traditional Lecture 1 course in the CHEM 001B implementation). ^b Traditional + 3DL is defined as an exam that included traditional multiple choice assessment items and some 3DL open-ended assessments; traditional is defined as multiple choice and free response questions which were not specifically designed to incorporate 3DL principles.


Research study design and 3DL assessment

Phase one. The general structure of the study for phase one is summarized in Fig. 1. The CHEM 001B HDL course taught by author JFE was inspired by the 3DL framework. This included intentionally linking all course topics to the 3DL chemistry core ideas and included classroom activities throughout the term that incorporated learning objectives related to several of the 3DL scientific practices and crosscutting concepts (see Appendix A). The instructor attempted to embed elements of 3DL as much as possible and structured many of the in-class activity questions around an evidence/claim/explanation structure, but the course might not be considered a complete curricular overhaul. The other three CHEM 001B courses included in the study were taught by instructors who were not familiar with the 3DL framework and structured their courses and classroom activities independent of the 3DL approach. The three-dimensional learning observation protocol (3D-LOP) was used to evaluate all four classes included in this phase of the study and was focused on evaluating the class periods related to the course unit covering intermolecular forces and physical properties of liquids and solids (Bain et al., 2020). Class sessions taught by author JFE and one of the other instructors were directly observed (either in person or through classroom capture videos), whereas the classes for the other two instructors were evaluated for 3DL practices by careful review of the instructional materials (classroom capture videos were not available for these two instructors). The observational method implemented in this research is adapted from the 3D-LOP protocol available in the SI of Bain et al. (2020).
Significant changes to the 3D-LOP made in the present study included identifying the specific Scientific Practices and Crosscutting Concepts observed in the class or instructional materials, and coding the classroom observations and instructional materials only for the course units related to topics included in the summative 3DL assessments (see Table 3 and SI Appendix C Fig. S1a). These changes allowed for the identification of specific Scientific Practices and Crosscutting Concepts that might be addressed more frequently by instructors, and to keep the observation and coding work to a manageable level.
Fig. 1 Overview of phase one study design and instructional format.
Table 3 Summary of 3DL scientific practices (SP) and crosscutting concepts (CC) observed in CHEM 001B courses; the full 3D-LOP coding summary can be found in SI Appendix C

| Class | % of class time student-centered | Type of student-centered work | Scientific practices & crosscutting concepts observed in class |
| --- | --- | --- | --- |
| W23 CHEM 001B HDL Course | ∼20–80 | In-class assessments, clicker questions, group activities, peer-to-peer discussions | SP: 1, 2, 3, 4, 5, 6, 7; CC: 1, 2, 3, 4, 5, 6, 7 |
| W23 CHEM 001B Traditional Lecture 1 | ∼6–25 | Clicker questions, peer-to-peer discussions | SP: 2, 4, 5, 7; CC: 1, 2, 4, 5, 7 |
| W23 CHEM 001B Traditional Lecture 2 | ∼13 | Clicker questions | SP: 2, 4, 5, 7; CC: 1, 5, 7 |
| W23 CHEM 001B Traditional Lecture 3 | ∼10–23 | Clicker questions | SP: 5, 7; CC: 1, 5, 6, 8 |


Table 3 indicates the course taught by author JFE incorporated a larger number of 3DL Scientific Practices and Crosscutting Concepts and included the highest proportion of student-centered learning (this is designated the “001B HDL” course). It is noteworthy that even though the three Traditional Lecture courses were not designed around the 3DL framework and these instructors had no knowledge of 3DL course design, they were observed to incorporate some elements of 3DL.

The coding of the classroom observations and instructional materials was carried out by three researchers (authors JA, ED, and AL). The three-dimensional learning assessment protocol (3D-LAP) was first used to define the 3DL core ideas, scientific practices, and crosscutting concepts (Laverty et al., 2016), then the three researchers used the 3D-LOP as a guide for creating an observation coding key (Bain et al., 2020). This included breaking down the classroom into five-minute segments, documenting the teaching activities (instructor-centered discussion, group discussion, in-class poll questions, student-centered collaborative discussion, etc.), and documenting the presence/occurrence of the 3DL aspects associated with the instructional activities. For the two classes in which the instructional materials were evaluated, the presentation slide number was used to document the class material through time rather than time blocks (both instructors in these two courses used PowerPoint or PDF slide presentations to present the course material). Since identifying 3DL components can be quite subjective, each researcher coded the lectures individually, then they compared their coding to ensure that all the codes were consistent, and consensus was reached on the final 3DL coding for all four courses. Student-centered versus instructor-centered activities were coded on a timescale by direct observation of the classroom activities (001B HDL and 001B Traditional Lecture 1) or by using the number of in-class poll questions listed in the class instructional materials to estimate the amount of student-centered activities (001B Traditional Lectures 2 and 3). Conversely, it was difficult to identify a consistent way to code the 3DL components on a time scale. 
This was due to the variety of in-class activities (e.g., determining the time spent on 3DL was easier for student-centered activities compared to instructor-centered activities) and the different class materials that were available to observe (e.g., the actual class observation for the 001B HDL course and 001B Traditional Lecture 1 compared to the static instructional materials for the 001B Traditional Lectures 2 and 3). Therefore, the 3DL components present in the courses were coded based on the number of times the various 3DL components were observed rather than the length of time they were “covered” in class.
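The count-based coding described above can be illustrated with a minimal sketch. It assumes each coded unit (a five-minute segment for the directly observed classes, or a slide for the materials-only classes) is stored as a record holding an activity label and the 3DL codes noted for that unit. The record layout, the example data, and the function name `summarize_coding` are hypothetical and are not taken from the study's coding key.

```python
from collections import Counter

# Hypothetical coded units: one entry per five-minute segment (or per slide).
# "activity" is either "student" (student-centered) or "instructor".
# "codes" lists the scientific practices (SP) and crosscutting concepts (CC) seen.
segments = [
    {"activity": "instructor", "codes": ["SP2", "CC1"]},
    {"activity": "student", "codes": ["SP2", "SP6", "CC5"]},
    {"activity": "student", "codes": []},
    {"activity": "instructor", "codes": ["CC1"]},
]


def summarize_coding(coded_units):
    """Tally how often each 3DL code occurs and the share of student-centered units."""
    if not coded_units:
        raise ValueError("at least one coded unit is required")
    # Count occurrences of each code rather than the time spent on it.
    code_counts = Counter(code for unit in coded_units for code in unit["codes"])
    student_units = sum(unit["activity"] == "student" for unit in coded_units)
    pct_student_centered = 100.0 * student_units / len(coded_units)
    return code_counts, pct_student_centered


code_counts, pct_student_centered = summarize_coding(segments)
```

Counting occurrences keeps the summary comparable between classes coded from video and classes coded from slides, where elapsed time is not available.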

The impact of the use of 3D learning strategies in the CHEM 001B implementation was gauged by administering previously published 3DL assessment items (Laverty et al., 2016) in the CHEM 001B laboratory (001BL) final exam. The CHEM 001BL final exam was taken by students from all four class sections during a common exam period in which the 3DL items were given as extra credit. Because the four class sections did not cover these topics in the same fashion, it was decided that awarding extra credit for these assessment items would eliminate the possibility of adversely affecting the grade of students who may not have had exposure to this type of learning (e.g., in the 001B Traditional Lectures 1, 2, and 3). Any partial or missing responses to the 3DL items were assigned a score of zero. The final 3DL assessment consisted of both a multiple-choice item and a free response item (see SI Appendix D). The free response 3DL item was a five-part question that prompted the students to draw out the structure of different compounds that depicted certain intermolecular forces. Students were then asked to predict which of the compounds would have the highest boiling point. When grading this part of the assessment there was a pattern of students only answering two questions out of the five asked on the free response portion, and many students did not choose to answer any part of the assessment item. The two questions most often answered by students asked which intermolecular forces would be present in trimethylamine (structure provided) and which of trimethylamine and propylamine would have the higher boiling point, with an explanation. The second 3DL assessment item was a three-part multiple-choice question in which questions 2 and 3 were heavily dependent on the correctness of question 1. Because this item was multiple choice, a larger number of students submitted answers for it compared to the free response item.
Finally, it is noted that the 001B Higher Dimensional Lecture course administered a midterm and final exam that followed a more traditional assessment format analogous to the 001B Traditional Lecture 1, 2, and 3 courses. This was done to not specifically advantage the 001B HDL class on the final 3DL assessments. The 3DL assessments were scored by three researchers who were each responsible for a specific set of questions to ensure consistency across the grading rubric, and a subset of the assessments from each researcher was independently scored by a different researcher to ensure scoring reliability (>80% inter-rater reliability was observed). The identity of the class section/instructor was anonymized prior to scoring the assessments to eliminate any potential outcome bias. To determine if prior knowledge of general chemistry content might be a confounding variable parallel to the potential impact of the classroom interventions, the chemistry concept inventory was administered in the first week of the quarter in all the 001B courses. Previous research indicates that the JCE concept inventory produces valid and reliable results (Barbera, 2013). All data collection and analyses were carried out under the approved UCR IRB protocol #22127.
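The text does not state which statistic was used for the inter-rater reliability check. As a minimal sketch, and assuming simple percent agreement between the original scorer and the second scorer on the double-scored subset, the calculation could look like the following. The function name `percent_agreement` and the example scores are hypothetical.

```python
def percent_agreement(scores_rater_1, scores_rater_2):
    """Percentage of double-scored responses given identical scores by both raters.

    Assumption: reliability is taken as simple percent agreement; a
    chance-corrected statistic such as Cohen's kappa would be computed differently.
    """
    if len(scores_rater_1) != len(scores_rater_2):
        raise ValueError("both raters must score the same responses")
    if not scores_rater_1:
        raise ValueError("at least one scored response is required")
    matches = sum(a == b for a, b in zip(scores_rater_1, scores_rater_2))
    return 100.0 * matches / len(scores_rater_1)


# Hypothetical rubric scores for five double-scored responses.
agreement = percent_agreement([1, 0, 2, 2, 1], [1, 0, 2, 1, 1])
```

Under this assumption, a value above 80 would correspond to the reliability level reported above.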

Phase two. The general structure of the study for phase two is summarized in Fig. 2. The CHEM 001A classes involved in phase two of this study include a 001A HDL course (taught by author JFE) and a course taught by the same instructor who taught the 001B Traditional Lecture 1 course from phase one of the study. The 001A HDL course (via classroom capture videos) and the 001A Traditional Lecture course (via evaluation of the class lecture slides; class capture videos and in-person observation were not available for this class in the W24 term) were evaluated by three researchers (authors JA, ED, and AL) using the same 3D-LOP protocol employed in phase one of the study (see 3D-LOP summary in Table 4). Since the 3DL assessments involved in this phase focus on learning objectives related to periodic trends and stoichiometry, only course materials relating to these topics were evaluated. Table 4 reports that multiple 3DL elements were embedded in the class activities taught by author JFE (labeled as “001A HDL” course; see SI Appendix C Fig. S1b). Analogous to phase one of the study, evaluation of the 001A Traditional Lecture instructor's learning materials for the Fall 2023 course using 3D-LOP found that aspects of three-dimensional learning were again present in the course (see SI Appendix C Fig. S1c for the 3D-LOP coding summary).
Fig. 2 Overview of phase two study design and instructional format.
Table 4 Summary of 3DL observation protocol for CHEM 001A courses. The full 3D-LOP coding summary can be found in Appendix C

| Class | % of class time student-centered | Type of student-centered work | Scientific practices & crosscutting concepts observed in periodic trends lessons | Scientific practices & crosscutting concepts observed in stoichiometry lessons |
| --- | --- | --- | --- | --- |
| W24 CHEM 001A HDL Course | ∼56–64 | In-class assessments, clicker questions, group activities, peer-to-peer discussions | SP: 2, 4, 6, 7; CC: 1, 2, 7 | SP: 2, 3, 4, 5, 6, 7; CC: 1, 2, 3, 4, 5, 6 |
| F23 CHEM 001A Traditional Lecture | ∼7–43 | Clicker questions | SP: 2, 4, 6, 7; CC: 1, 5 | SP: 2, 3, 4, 5, 6, 7; CC: 1, 2, 3, 4, 5 |


To measure the impact the HDL practice activities may have on student 3DL assessment performance, a 3DL free response question was embedded into the midterm exam and two previously published 3DL assessment items were incorporated into the course final exam. These 3DL assessments evaluated students' understanding of core chemistry concepts, such as periodic trends, energy, and stoichiometry. The 3DL midterm assessment reviewing periodic trends was created specifically for this research study (by author CSM) using the 3D-LAP as a guide. The assessment topics for the midterm were coordinated with the curricula of the lecture session and evaluated for content validity by the course instructor (author JFE). Both the HDL Discussion group and Traditional Discussion group were given the 3DL midterm exam question during their 50-minute discussion sections. These 3DL midterm assessments are available in SI Appendix D.

Analogous to phase one of the study, the final exam 3DL assessment items were obtained from the previously published 3D-LAP (Laverty et al., 2016) and integrated into the course final exam (SI Appendix D). The selected questions measured the efficacy of 3DL by gauging the students’ performance on a periodic trends assessment and a stoichiometry assessment. These assessment items were distributed to the class (including both the HDL Discussion sections and Traditional Discussion sections) as part of the exam during their scheduled final exam. To gain further insight into the impact that additional practice with 3DL exercises may have on a student's 3DL assessment performance, the 001A Traditional Lecture course was used as an additional active control course in which the 3DL final exam items were administered as part of the regular course final exam (prior arrangements had been made to have this course administer these two 3DL assessment items). The 001A Traditional Lecture course did not administer the midterm 3DL assessment items.

Chemistry concept inventory data collected at the start of the winter quarter was used to assess incoming students’ understanding of chemistry concepts for the HDL Discussion and Traditional Discussion groups (Barbera, 2013). The chemistry concept inventory could not be administered in the fall 2023 001A Traditional Lecture course, therefore institutional math placement data were used to assess the potential confounding variable of incoming academic knowledge. Analogous to phase one of the study, all data collection and analyses were carried out under the approved UCR IRB protocol #22127.

Statistical analyses

For phase one of the study, the Traditional Lecture 1 course had a larger total number of Scientific Practices and Crosscutting Concepts present compared to Traditional Lectures 2 and 3. Because of this observation, the Traditional Lecture 2 and 3 courses were aggregated into one study group for the subsequent statistical analyses. Tests for normality were used to determine whether the 3DL assessment scores could be evaluated using parametric statistical tests. Though the skewness and kurtosis statistics fall within generally acceptable ranges for all the 3DL assessment scores, visual inspection of the distribution histograms suggests these data might not be normally distributed. Consequently, a Shapiro–Wilk test was used to further assess the normality of the 3DL assessment scores, and it was found that the null hypothesis stating the data are normally distributed could be rejected (see SI Appendix C Fig. S2). Due to the non-normal distribution of data in both phases of the study, non-parametric Mann–Whitney U tests were used in assessing the results between the various quasi-experimental groups (001B Higher Dimensional Lecture vs. 001B Traditional Lecture 1 and 001B Higher Dimensional Lecture vs. 001B Traditional Lectures 2 and 3 in phase one; Higher Dimensional Discussion vs. Traditional Discussion and Higher Dimensional Discussion vs. 001A Traditional Lecture in phase two). The effect sizes (r) for the non-parametric tests were calculated using eqn (1) (Z = standardized test statistic from the Mann–Whitney U test; n_a and n_b = sample sizes of the two study groups).
 
$$ r = \frac{Z}{\sqrt{n_a + n_b}} \tag{1} $$
The effect size is designated as small (r = 0.1), medium (r = 0.3), or large (r = 0.5) (Fritz et al., 2012). Since multiple statistical tests were carried out to compare the assessment scores, the threshold for statistical significance was adjusted using a Bonferroni correction (Dunn, 1961; Pratt et al., 2023). The probability that a null hypothesis was rejected in error was adjusted to reflect the two statistical tests that were conducted for each assessment item (p = 0.05/2 = 0.025). All statistical analyses were carried out using the Statistics Package for the Social Sciences (SPSS) version 28.0.0.0 or version 29.0.1.1.
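The effect size and significance threshold described above can be illustrated with a short sketch. This is not the SPSS procedure used in the study; the function names are hypothetical, the Z value is an illustrative placeholder rather than a reported result, taking the absolute value of Z is an assumption made so that r is non-negative, and treating the 0.1/0.3/0.5 benchmarks as lower bounds of each band is likewise an assumption.

```python
import math


def effect_size_r(z: float, n_a: int, n_b: int) -> float:
    """Effect size r = |Z| / sqrt(n_a + n_b), following eqn (1)."""
    return abs(z) / math.sqrt(n_a + n_b)


def describe_effect(r: float) -> str:
    """Label r against the 0.1 / 0.3 / 0.5 benchmarks.

    Assumption: each benchmark is treated as the lower bound of its band.
    """
    if r >= 0.5:
        return "large"
    if r >= 0.3:
        return "medium"
    if r >= 0.1:
        return "small"
    return "below small"


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after a Bonferroni correction."""
    return alpha / n_tests


# Hypothetical Z value for two groups of 127 and 121 students.
z_example = -3.0
r_example = effect_size_r(z_example, 127, 121)
print(f"r = {r_example:.3f} ({describe_effect(r_example)})")

# Two tests per assessment item, as described in the text.
print(f"adjusted threshold = {bonferroni_alpha(0.05, 2)}")
```

Because the benchmarks are conventions rather than sharp cut-offs, values that fall between two benchmarks are often described with a range such as "small to moderate".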

Results

The 3D-LOP was used to identify whether the Traditional Lecture courses incorporated any aspects of 3D learning (see Tables 3 and 4 and SI Appendix C). In both phase one and phase two, the HDL course was observed to incorporate more 3DL Scientific Practices and Crosscutting Concepts, and a higher percentage of class time in the HDL course was dedicated to student-centered learning compared to the Traditional Lecture courses. The observational study of class materials and lectures detected that the control group courses integrated some 3DL elements in their materials even though those instructors did not have any prior knowledge of 3DL. The total number of 3DL elements was observed to be higher for the CHEM 001B HDL course relative to the CHEM 001B Traditional lectures (Table 3). However, the number of 3DL Scientific Practices was observed to be similar for the CHEM 001A HDL course and CHEM 001A Traditional Lecture, and the 001A HDL course incorporated one additional Crosscutting Concept in each topic (Table 4). In comparing the 001A Higher Dimensional Lecture to the 001A Traditional Lecture, the number of 3DL elements present was again identified based on how often they appeared and not based on the length of time in which the 3DL occurred; therefore, it is emphasized that the 3D-LOP coding does not necessarily indicate that one class was “more 3D” than another. On the other hand, the CHEM 001B and 001A HDL courses were taught by an instructor who was intentional in integrating 3DL principles into their teaching, and the HDL courses included in-class activities that were also structured to incorporate aspects of 3DL and included questions that emphasized an evidence-claim-explanation structure (see SI Appendix C for the full 3D-LOP coding summaries, and SI Appendices A and B for examples of HDL in-class activities and the coding of activity questions for the evidence-claim-explanation structure).

Descriptive statistics

For phase one of the study, the chemistry concept inventory was used to assess incoming knowledge (see Table 5) and an analysis of variance (ANOVA) indicated that there was not a statistically significant difference in incoming knowledge across the three study groups (see Appendix C Table 3A–C). In phase two, the chemistry concept inventory test assessed the students in both the Higher Dimensional Discussion and Traditional Discussion groups (see Table 6), and an independent samples t-test suggests students in these two groups have statistically equivalent incoming knowledge (see Appendix C Table S3L). The concept inventory could not be administered to the 001A Traditional Lecture course due to logistical reasons; therefore institutional math placement scores were used to compare the academic preparation of the 001A HDL course and Traditional Lecture course. Tables 5 and 6 provide a descriptive statistics summary of the performance on the chemistry concept inventory tests, and 3DL assessments evaluated in phase one and phase two, respectively.
Table 5 Descriptive statistics for the CHEM 001B phase one study (the concept inventory has a range of 0–22 points, the 3DL assessment item 1 has a range of 0–8 points, and the 3DL assessment item 2 has a range of 0–3 points)

| Class | Concept inventory pre-test (mean ± sd) | 3DL assessment item 1 (mean ± sd) | 3DL assessment item 2 (mean ± sd) |
| --- | --- | --- | --- |
| 001B HDL Course | 11.40 ± 4.24 (n = 221) | 3.22 ± 2.01 (n = 201) | 2.12 ± 1.16 (n = 204) |
| 001B Traditional Lecture 1 | 10.86 ± 4.46 (n = 233) | 3.48 ± 2.14 (n = 193) | 2.13 ± 1.15 (n = 213) |
| 001B Traditional Lectures 2 and 3 | 10.99 ± 4.24 (n = 404) | 2.74 ± 2.05 (n = 263) | 2.05 ± 1.17 (n = 391) |


Table 6 Descriptive statistics for the CHEM 001A phase two study. Due to only a subset of students from each class taking the math placement exams, the population of students for the math placement is less than the population of students who completed the assessment. (Note: the total possible points were 100 for math placement scores, 22 for concept inventory scores, 5 for the periodic trends midterm assessment, 4 for the stoichiometry final exam assessment, and 5 for the periodic trends final exam assessment)

| Class | Math placement scores (mean ± sd) | Concept inventory pre-test (mean ± sd) | Periodic trends 3DL midterm assessment (mean ± sd) | Stoichiometry 3DL final exam assessment (mean ± sd) | Periodic trends 3DL final exam assessment (mean ± sd) |
| --- | --- | --- | --- | --- | --- |
| 001A HDL Course + HDD | 59.63 ± 17.54 (n = 103) | 9.32 ± 3.68 (n = 119) | 2.42 ± 1.24 (n = 127) | 1.52 ± 1.41 (n = 127) | 3.85 ± 1.29 (n = 127) |
| 001A HDL Course + Traditional Discussion | 52.49 ± 18.45 (n = 104) | 9.00 ± 4.21 (n = 116) | 1.91 ± 1.16 (n = 121) | 1.07 ± 1.42 (n = 123) | 2.91 ± 1.23 (n = 123) |
| 001A Traditional Lecture Course | 71.27 ± 22.26 (n = 98) | N/A | N/A | 1.53 ± 1.73 (n = 236) | 3.22 ± 1.29 (n = 236) |
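As a rough illustration of how the concept inventory comparison between the two discussion groups could be approximated from the summary statistics alone, the sketch below computes a t statistic from the means, standard deviations, and sample sizes in Table 6. The text does not state which form of the independent samples t-test was used, so the Welch (unequal variance) form here is an assumption, and the function name is hypothetical. The full test results are reported in SI Appendix C.

```python
import math


def welch_t(mean_a: float, sd_a: float, n_a: int,
            mean_b: float, sd_b: float, n_b: int) -> float:
    """Welch t statistic computed from summary statistics.

    Assumption: unequal-variance form; the source does not specify
    which variant of the independent samples t-test was run.
    """
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Concept inventory pre-test values from Table 6:
# HDL Course + HDD vs. HDL Course + Traditional Discussion.
t_concept_inventory = welch_t(9.32, 3.68, 119, 9.00, 4.21, 116)
print(f"t = {t_concept_inventory:.2f}")
```

A t statistic of this size is well below the conventional two-sided critical value of roughly 1.97 for samples of this size, which is in line with the reported absence of a significant difference in incoming knowledge.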


3DL assessment

Phase 1. The final exam intermolecular forces assessments were analyzed using independent samples Mann–Whitney U tests to determine if the 3DL-inspired instructional methods improved three-dimensional learning outcomes relative to the 001B Traditional Lecture 1 and 001B Traditional Lectures 2 and 3 courses. Performance was compared between the 001B HDL course and 001B Traditional Lecture 1–3 courses for both the free response 3DL assessment item (see Table 7 and SI Appendix C Fig. S3A, B) and the multiple choice 3DL assessment item (see Table 8 and SI Appendix C Fig. S3C, D).
Table 7 Mann–Whitney U test results for the free response IMF data comparison of: (1) 001B higher dimensional lecture vs. 001B traditional lecture 1 scores; and (2) 001B higher dimensional lecture vs. 001B traditional lectures 2 and 3 scores

| Comparison | Group | Mean rank | U | p | Effect size (r) |
| --- | --- | --- | --- | --- | --- |
| Free Response IMF | 001B HDL Course (n = 201) | 190.43 | 20 818.00 | 0.198 | 0.065 |
| | 001B Traditional Lecture 1 (n = 193) | 204.87 | | | |
| Free Response IMF | 001B HDL Course (n = 201) | 252.71 | 22 370.00 | 0.004 | 0.134 |
| | 001B Traditional Lectures 2 and 3 (n = 263) | 217.06 | | | |


Table 8 Mann–Whitney U test results for the MC IMF data comparison of: (1) 001B higher dimensional lecture vs. 001B traditional lecture 1 scores; and (2) 001B higher dimensional lecture vs. 001B traditional lectures 2 and 3 scores

| Comparison | Group | Mean rank | U | p | Effect size (r) |
| --- | --- | --- | --- | --- | --- |
| MC IMF | 001B HDL Course (n = 205) | 208.75 | 21 777.50 | 0.962 | 0.0024 |
| | 001B Traditional Lecture 1 (n = 213) | 209.24 | | | |
| MC IMF | 001B HDL Course (n = 205) | 304.07 | 38 644.00 | 0.485 | 0.029 |
| | 001B Traditional Lectures 2 and 3 (n = 246) | 294.83 | | | |


The 001B HDL course was observed to have a higher frequency of higher performing students compared to the 001B Traditional Lectures 2 and 3 for the free response assessment item, and this difference was found to be statistically significant with a small effect size (Table 7; p = 0.004, r = 0.134; the threshold for significance was adjusted using a Bonferroni correction due to the use of multiple tests; p = 0.050/2 = 0.025). However, statistically significant differences in the distribution of scores were not observed between the 001B HDL course and 001B Traditional Lectures 2 and 3 for the multiple choice assessment item (Table 8; p = 0.485), and there appeared to be no significant difference in performance between the 001B HDL course and 001B Traditional Lecture 1 course on either the free response or multiple choice 3DL assessment items (Tables 7 and 8; p = 0.198 and p = 0.962, respectively).
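The decision rule applied to the phase one results can be sketched as follows. The p-values are those reported in Tables 7 and 8; the variable names and dictionary layout are hypothetical choices made for illustration only.

```python
ALPHA = 0.05
TESTS_PER_ITEM = 2  # each assessment item was compared against two control groups
threshold = ALPHA / TESTS_PER_ITEM  # Bonferroni-adjusted threshold

# Reported p-values for the 001B HDL course vs. each control group.
phase_one_p = {
    ("free response", "Traditional Lecture 1"): 0.198,
    ("free response", "Traditional Lectures 2 and 3"): 0.004,
    ("multiple choice", "Traditional Lecture 1"): 0.962,
    ("multiple choice", "Traditional Lectures 2 and 3"): 0.485,
}

# A comparison is flagged only if its p-value falls below the adjusted threshold.
significant = {key: p < threshold for key, p in phase_one_p.items()}

for (item, control), flag in significant.items():
    verdict = "significant" if flag else "not significant"
    print(f"{item} vs. {control}: p = {phase_one_p[(item, control)]} ({verdict})")
```

Under this rule only the free response comparison with Traditional Lectures 2 and 3 meets the adjusted threshold, matching the outcome described above.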

Phase 2. Student scores on the midterm and final exam 3DL assessment items were used to assess the impact of the HDL practice activities implemented in the discussion group sessions. These assessment items and their corresponding topics were selected to directly reflect the learning outcomes of the course. Tables 9–11 summarize the results of the Mann–Whitney U test performed on the following items: periodic trends 3DL midterm assessment scores, stoichiometry 3DL final exam assessment scores, and periodic trends 3DL final exam assessment scores.
Table 9 Mann–Whitney U test results for the 3DL midterm periodic trend assessment scores of the higher dimensional discussion group vs. the traditional discussion group during Winter 2024

| Comparison | Group | Mean rank | U | p | Effect size (r) |
| --- | --- | --- | --- | --- | --- |
| 3DL Midterm Assessment Scores for Periodic Trends | Higher Dimensional Discussion W24 (n = 127) | 138.45 | 5911.50 | <0.001 | 0.206 |
| | Traditional Discussion W24 (n = 121) | 109.86 | | | |


Table 9 summarizes the Mann–Whitney U-test results which compare the performance of the Higher Dimensional Discussion group (n = 127) and the Traditional Discussion group (n = 121) on the 3DL midterm assessment evaluating periodic trends (0–5 points possible). The Higher Dimensional Discussion treatment appeared to have statistically higher mean ranks on the periodic trend midterm assessment relative to the Traditional Discussion group, with a small to moderate effect size (p < 0.001, r = 0.206; see Table 9).
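For readers less familiar with how the quantities in Table 9 relate to one another, the following sketch recovers the U statistic from a reported mean rank and then approximates the effect size from U using the large-sample normal approximation. It is an illustration only: the function names are hypothetical, and the standard deviation used here omits the correction for tied scores that statistical packages typically apply, so the approximate r is expected to come out slightly below the reported value.

```python
import math


def u_from_mean_rank(mean_rank: float, n_self: int, n_other: int) -> float:
    """Recover the (smaller) Mann-Whitney U from one group's mean rank."""
    rank_sum = mean_rank * n_self
    u_self = rank_sum - n_self * (n_self + 1) / 2
    # The two U values sum to n_self * n_other; report the smaller one.
    return min(u_self, n_self * n_other - u_self)


def approx_r_from_u(u: float, n_a: int, n_b: int) -> float:
    """Approximate r = |Z| / sqrt(N) from U, WITHOUT a tie correction."""
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    return abs(z) / math.sqrt(n_a + n_b)


# Table 9: Higher Dimensional Discussion (n = 127, mean rank 138.45)
# vs. Traditional Discussion (n = 121), reported U = 5911.50.
u_table9 = u_from_mean_rank(138.45, 127, 121)
r_table9 = approx_r_from_u(5911.50, 127, 121)
print(f"U recovered from mean rank: {u_table9:.2f}")
print(f"approximate r without tie correction: {r_table9:.3f}")
```

The recovered U agrees with the tabulated value to within the rounding of the mean rank, and the uncorrected approximation gives an r close to 0.20, near the reported 0.206.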

The 3DL final exam assessments were evaluated using a Mann–Whitney U-test to compare the performance of the following three groups: Higher Dimensional Discussion (n = 127), Traditional Discussion (n = 123), and 001A Traditional Lecture (n = 236). On the stoichiometry assessment (0–4 points possible) there was not a significant difference to report when comparing the performance of the Higher Dimensional Discussion group and 001A Traditional Lecture group (p = 0.673, r = 0.022; see Table 10). However, in comparing the performance of the Higher Dimensional and Traditional Discussion groups on the same assessment, a statistically significant difference with a small effect size was observed (p = 0.008, r = 0.169; Table 10). The periodic trends 3DL final exam assessment scores (0–5 points possible) showed an overall significant improvement in the performance by the Higher Dimensional Discussion group relative to both the Traditional Discussion and 001A Traditional Lecture control groups, with small to moderate effect sizes (p < 0.001, r = 0.353 and p < 0.001, r = 0.237, respectively; see Table 11).

Table 10 Mann–Whitney U test results for the 3DL final exam stoichiometry assessments scores of the 3DL discussion group vs. the traditional discussion group and 001A traditional lecture group

| Comparison | Group | Mean rank | U | p | Effect size (r) |
| --- | --- | --- | --- | --- | --- |
| Final Exam Stoichiometry Scores | Higher Dimensional Discussion W24 (n = 127) | 136.93 | 6359.50 | 0.008 | 0.169 |
| | Traditional Discussion W24 (n = 123) | 113.70 | | | |
| Final Exam Stoichiometry Scores | Higher Dimensional Discussion W24 (n = 127) | 185.00 | 14 604.50 | 0.673 | 0.022 |
| | 001A Traditional Lecture F23 (n = 236) | 180.38 | | | |


Table 11 Mann–Whitney U test results for the 3DL final exam periodic trend assessments scores of the higher dimensional discussion group vs. the traditional discussion group and 001A traditional lecture group

| Comparison | Group | Mean rank | U | p | Effect size (r) |
| --- | --- | --- | --- | --- | --- |
| Final Exam Periodic Trend Scores | Higher Dimensional Discussion W24 (n = 127) | 149.72 | 4734.00 | <0.001 | 0.353 |
| | Traditional Discussion W24 (n = 123) | 100.49 | | | |
| Final Exam Periodic Trend Scores | Higher Dimensional Discussion W24 (n = 127) | 214.44 | 10 865.50 | <0.001 | 0.237 |
| | 001A Traditional Lecture F23 (n = 236) | 164.54 | | | |
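One way to read the mean ranks in Table 11 is through a simple identity: when N scores are ranked together, the ranks average (N + 1)/2, so the sample-size-weighted average of the two group mean ranks should equal that value. The sketch below applies this check to the Table 11 entries. The function name and data layout are hypothetical, and the check is offered only as an aid to interpreting the table, not as part of the study's analysis.

```python
def mean_rank_gap(groups: list[tuple[int, float]]) -> float:
    """Pooled mean rank minus (N + 1) / 2 for a list of (n, mean_rank) pairs.

    A gap near zero means the group sizes and mean ranks are mutually
    consistent, up to rounding of the reported mean ranks.
    """
    total_n = sum(n for n, _ in groups)
    pooled_mean_rank = sum(n * mean_rank for n, mean_rank in groups) / total_n
    return pooled_mean_rank - (total_n + 1) / 2


# (n, mean rank) pairs taken from Table 11.
table11 = {
    "HDD vs. Traditional Discussion": [(127, 149.72), (123, 100.49)],
    "HDD vs. 001A Traditional Lecture": [(127, 214.44), (236, 164.54)],
}

gaps = {label: mean_rank_gap(rows) for label, rows in table11.items()}
for label, gap in gaps.items():
    print(f"{label}: gap = {gap:+.4f}")
```

For both comparisons the gap is within rounding error of zero, so a higher mean rank for the Higher Dimensional Discussion group necessarily corresponds to a lower mean rank for the comparison group.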


Discussion

General research outcomes

Phase one of the study resulted in a partial null result. The 001B HDL treatment group scored higher on the open-ended IMF assessment compared to the 001B Traditional Lectures 2 and 3 control groups but there was no substantial difference between 001B HDL and 001B Traditional Lecture 1 (see Table 7), and no significant difference between any of the study groups was observed on the multiple-choice IMF assessment (see Table 8). We speculate this was caused by several possible factors: (a) the 001B Traditional Lectures 1, 2, and 3 incorporated some elements of 3D learning, which potentially closed the performance gap on the 3DL assessments; (b) since the 3DL assessments were embedded into the laboratory final and administered as extra credit there was likely low fidelity of student effort, which would then likely reduce the sensitivity of the 3DL assessments in detecting 3D learning gains; and (c) the 001B Higher Dimensional Lecture treatment may have had limited impact since students did not have frequent exposure to 3DL assessments during the course (DeGlopper et al., 2022).

Insights from phase one pointed to the need for adjustments; accordingly, phase two increased the frequency at which students engaged with practice activities reflecting 3DL principles. The mandatory discussion sessions for CHEM 001A were critical to phase two of the study. These co-curricular instructional recitations provided a structure for introducing routine HDL practice activities and had the advantage of offering a smaller classroom setting where collaboration with peers and more feedback from the instructor occurred. Additionally, the 3DL assessments were administered as for-credit questions on the final course exam to improve the fidelity of effort by the students.

While both the Higher Dimensional Discussion group and Traditional Discussion group were enrolled within the same flipped-hybrid course which incorporated routine 3DL elements, the 3DL midterm and final exam assessments reveal a significant difference between the Higher Dimensional Discussion group and Traditional Discussion group for the periodic trends content domain (see Tables 9 and 11). The Higher Dimensional Discussion treatment was observed to have significantly higher mean ranks on the midterm assessment with a small to moderate effect size, however the Higher Dimensional Discussion treatment appeared to have an even larger effect on the final exam periodic trends 3DL assessment (r = 0.353 on the final exam vs. r = 0.206 on the midterm; see Tables 9 and 11). This is notable because one might expect the performance gap to diminish over time due to the incorporation of 3DL components in the main lecture course. We speculate that the HDL practice activities administered in the Higher Dimensional Discussion treatment played a large part in the increased effect over time.

Because the stoichiometry unit was covered in the final week of the course, a comparison between the Higher Dimensional Discussion and Traditional Discussion groups could not be made at different time points, but the Higher Dimensional Discussion treatment still appeared to have a significant impact on the mean ranks of the stoichiometry 3DL assessment with a small to moderate effect (r = 0.169; see Table 10). The smaller effect on the stoichiometry assessment relative to the periodic trends assessment likely originates due to several confounding variables, but it is proposed here that one of those may be the nature of the 3DL assessments used on the final exam. In our judgement, the periodic trends assessment item is “more 3DL” in nature compared to the stoichiometry assessment item, in part because the periodic trends test item requires students to interpret data and make arguments from evidence. Previous researchers (Laverty et al., 2016) developed the 3DL stoichiometry assessment item, which was selected for inclusion in this study. However, the stoichiometry assessment item might be interpreted to follow a more traditional lower dimensional structure, as the problem focuses mainly on being able to determine the theoretical yield of a reaction (see SI Appendix D, Fig. S10). This may have reduced the sensitivity of this problem in detecting differences in 3D learning.

Upon comparing the Higher Dimensional Discussion treatment group to the 001A Traditional Lecture group, the Higher Dimensional Discussion group had a significantly higher mean rank on the periodic trend final exam assessment (Table 11) but there was not a significant difference found between the two groups on the stoichiometry assessment (Table 10). The Higher Dimensional Discussion treatment appeared to have a small to medium effect on the periodic trends final 3DL assessment (r = 0.237), however this is a notable outcome given the 001A Traditional Lecture course was populated with an on-sequence cohort of students who on average had better incoming academic preparation (the 001A Traditional Lecture cohort had an 11–12 point higher average score on the institutional math placement scores; see Table 6). The routine HDL practice activities administered in the Higher Dimensional Discussion treatment likely helped these students achieve higher 3DL gains, though it is acknowledged the larger and broader emphasis on higher dimensional learning in the 001A HDL course may have also contributed. As described above, there was no significant difference in the mean ranks on the stoichiometry final exam 3DL assessment between the Higher Dimensional Discussion treatment group and the 001A Traditional Lecture cohort (see Table 10). Like the outcome with the Higher Dimensional Discussion and Traditional Discussion study groups, it is proposed here the stoichiometry assessment may have been less sensitive in detecting differences in 3D learning compared to the periodic trends assessment. If this is indeed the case, one might expect the 001A Traditional Lecture instructional interventions to have similar impact on student performance for this particular stoichiometry 3DL assessment item.

Consistent with DeGlopper and coworkers' previous work (DeGlopper et al., 2022), the findings reported herein appear to support the importance of exposing students more routinely to 3DL practice exercises and/or assessments if the broader aim is to improve students’ ability to engage in higher order thinking. Even though phase one of this study involved building a course around the 3DL framework, the lack of 3DL practice exercises may have lessened the impact of this type of course on higher order learning outcomes. Conversely, when students were given frequent opportunities to complete HDL practice activities in phase two of the study, performance on summative 3DL assessments appeared to improve. These results suggest that by implementing more HDL practice activities into a course that emphasizes higher order thinking, students can improve their performance on 3DL assessments and solidify a stronger understanding of core chemistry concepts relative to students who engage in more traditional problem-solving exercises.

A central feature of the hybrid/flipped format was that it provided dedicated in-class time to incorporate HDL activities and offered students opportunities to engage in tasks that strengthened higher order thinking. In phase one, the traditional lecture sections (1, 2, and 3) included some elements of 3DL, but these were embedded within conventional lectures and did not provide structured opportunities for HDL activities. In phase two, the CHEM 001A HDL lecture intentionally integrated HDL activities into the course structure, whereas the CHEM 001A traditional lecture section remained primarily lecture focused, containing only some embedded 3DL elements without formal HDL exercises. This distinction underscores that it was not merely the presence of 3DL principles, but the explicit incorporation of HDL activities through the hybrid/flipped design that shaped the observed differences in higher order learning outcomes.

Limitations

This study was more broadly limited due to the quasi-experimental nature of the research design and the naturalistic setting in which it was employed. Though groups of students in classes that had less emphasis on 3DL instruction could be used as active controls, students were not randomly assigned to the different classes and/or the different discussion group sessions, and variance in assessment performance that might arise due to different student characteristics could not be controlled (e.g., student experiences such as course schedule, co-curricular and extra-curricular activities, out-of-class work-life experiences, etc.). The study was also limited with respect to potential instructor bias on student performance, as different instructors taught the 001B HDL course and 001B Traditional Lecture courses; additionally, two different graduate student teaching assistants taught the Higher Dimensional Discussion and Traditional Discussion groups. Finally, an important limitation to acknowledge is the limited scope of the study with respect to the course content that was included, the breadth of that content that was assessed for 3DL outcomes, the longitudinal impact that 3DL interventions may have, and other types of student affective outcomes that may be impacted by this type of instructional approach. Future work in this area should aim to expand the breadth of content that is assessed for 3DL outcomes, probe the questions of if and how this type of 3DL course structure might impact higher order learning outcomes in future courses, and begin to look more deeply into the student experience in these types of learning environments. At our institution, most students enrolled in this type of large-enrollment gateway course are biological sciences and/or pre-health majors. 
Anecdotally, we have observed that many of these students tend to prefer the more straightforward path to success and may perceive higher-order learning as an unnecessary obstacle as it can be more challenging. This perspective is not unique to our institution and has been noted in similar educational contexts (Ramella et al., 2019). Researchers should consider investigating affective outcomes related to 3DL instructional approaches and how these may moderate 3DL cognitive outcomes.

Implications for instruction

In terms of the implications for current and future chemistry educators, the results presented here provide evidence that more routine and frequent HDL practice and formative assessment appear to be critical aspects for students to make progress toward improved 3DL outcomes. Chemistry instructors should consider utilizing discussion/recitation sessions (if available) and view these supplemental sessions as a valuable opportunity to engage students with the course objectives within a 3DL context and provide more frequent opportunities for 3DL assessment. A recent report similarly shows positive course outcomes by having TAs implement active learning in smaller classrooms within a large enrollment general chemistry course (Clark et al., 2024). This Distributed Active Learning (DAL) approach might be a potential instructional model for instructors who wish to incorporate more frequent HDL practice activities as described in the current study.

Of course, creating mandatory co-curricular discussion group sections and/or implementing a DAL structure might be too cost-intensive for many institutions, therefore instructors who teach large enrollment courses will continue to face the typical logistical challenges associated with adopting instructional strategies that promote more meaningful learning. The primary challenge likely confronting classroom practitioners is teaching in spaces with fixed seating/layouts that inhibit group work and make it difficult for the instructor to provide individual attention and feedback to students (Mayer et al., 2009; Clark et al., 2024). Additionally, instructors who teach in departments that continue to use more traditional undergraduate curricula that encompass a broad set of lower-level learning objectives may struggle to integrate non-traditional learning objectives such as those associated with 3DL. It is argued here that the results of the current study provide evidence that instructors can leverage the hybrid classroom structure to devote more in-class time to facilitating dialogic discourse and collaborative activities that promote higher thinking, and/or create out-of-class learning materials that focus on 3DL outcomes.

An important consideration for instructors regarding implementing a more 3DL-inspired curriculum is how to assess this type of learning, particularly in large enrollment STEM courses. Because 3D learning is largely based on students providing reasoning and making arguments from evidence, the logistics of evaluating this type of student work on a large scale are likely a challenge for many instructors. Additionally, there has been a growing trend in higher education chemistry to adopt non-traditional assessment structures such as specifications grading (Howitz et al., 2021) and mastery outcomes assessment structures (Hartman and Eichler, 2024). Because 3DL assessments often require open-ended responses in which students must articulate their thinking, using these types of assessment structures could be daunting for instructors due to the requirement of providing more frequent assessment and feedback for students. One approach to address this challenge might be to model an assessment and feedback strategy similar to that described here, in which students receive frequent informal, ungraded practice and feedback. This might reduce the time required by the instructor and/or TAs to deliver this type of assessment. However, if more formal graded assessments are required, digital grading tools such as Gradescope and/or artificial intelligence (AI) grading bots might become more widely available and help lower the logistical barrier for instructors (Kooli and Yusuf, 2025). Future research in the general area of implementing 3DL in large enrollment gateway courses ought to emphasize how assessments that gauge higher order learning outcomes can be carried out in alternative grading structures.

Conclusions

Overall, it was found that a hybrid classroom structure afforded an opportunity to incorporate 3DL-inspired instruction within a traditional general chemistry curriculum; however, it appears that providing frequent opportunities for HDL practice activities is a significant contributor to helping students achieve 3D learning gains relative to more traditional learning environments. It is hoped the results from this study will encourage instructors who are seeking to re-structure their classes to a hybrid/flipped format to not only value the increased opportunities for active learning but also acknowledge that active learning should be built (at least in part) on the 3DL framework and promote higher order learning. Moving forward, there is a desire to develop more robust 3DL assessment structures that might fit into alternative grading strategies and obtain a more fine-grained picture about how students value this type of higher order learning. Providing students with routine feedback on 3DL thinking and ensuring they do not harbor resistance to less traditional learning outcomes will be important as the chemistry education community works to promote this type of more meaningful learning.

Author contributions

JFE conceived the project design and supervised all of the researchers in the data collection, data cleaning, and associated analyses. CSM worked with JFE in creating the CHEM 001A discussion group 3DL activities, implemented the activities, carried out all statistical analyses for the CHEM 001A and CHEM 001B 3DL assessments, and collaborated with JFE in drafting the manuscript. ED, JA, and AL did the 3D-LOP observations and coding for the CHEM 001A and 001B courses and carried out the data cleaning for the 001B 3DL assessment.

Conflicts of interest

There are no conflicts of interest to declare.

Ethical statement

The human subjects research that was carried out in this study was done under an approved protocol that was reviewed by the University of California-Riverside Institutional Review Board (IRB protocol #22127), and informed consent was administered to all participants in the study as described in the IRB protocol.

Data availability

The datasets generated during and/or analysed during the current study are not publicly available due to restrictions in the approved IRB protocol, but are available from the authors on reasonable request.

Supplementary information (SI) is available. See DOI: https://doi.org/10.1039/d5rp00328h.

Acknowledgements

We would like to acknowledge the other instructors who cooperated with the researchers in administering the classroom assessments and providing access to their course materials for the 3D-LOP coding. We would also like to gratefully acknowledge the UCR Chancellor’s Undergraduate Research Fellowship program for supporting AL’s work on this project.

References

  1. Ausubel D. P., (1962), A Subsumption Theory of Meaningful Verbal Learning and Retention, J. General Psychol., 66(2), 213–224 DOI:10.1080/00221309.1962.9711837.
  2. Bain K., Bender L., Bergeron P., Caballero M. D., Carmel J. H., Duffy E. M., Ebert-May D., Fata-Hartley C. L., Herrington D. G., Laverty J. T., Matz R. L., Nelson P. C., Posey L. A., Stoltzfus J. R., Stowe R. L., Sweeder R. D., Tessmer S. H., Underwood S. M., Urban-Lurain M. and Cooper M. M., (2020), Characterizing college science instruction: The Three-Dimensional Learning Observation Protocol, PLoS One, 15(6), e0234640 DOI:10.1371/journal.pone.0234640.
  3. Barbera J., (2013), A Psychometric Analysis of the Chemical Concepts Inventory, J. Chem. Educ., 90, 546–553.
  4. Bredow C. A., Roehling P. V., Knorp A. J. and Sweet A. M., (2021), To Flip or Not to Flip? A Meta-Analysis of the Efficacy of Flipped Learning in Higher Education, Rev. Educ. Res., 91(6), 878–918.
  5. Clark T. M., Ricciardo R. and Turner D. A., (2024), Distributed Active Learning (DAL): An Approach to Reduce Class Size in Large Enrollment Courses, J. Chem. Educ., 101, 1974–1982.
  6. Cooper M. M., Corley L. M. and Underwood S. M., (2014), An investigation of college chemistry students' understanding of structure–property relationships, J. Res. Sci. Teach., 50(6), 699–721.
  7. Cooper M. M., Caballero M. D., Carmel J. H., Duffy E. M., Ebert-May D., Fata-Hartley C. L., Herrington D. G., Laverty J. T., Nelson P. C., Posey L. A., Stoltzfus J. R., Stowe R. L., Sweeder R. D., Tessmer S. and Underwood S. M., (2024), Beyond active learning: using 3-Dimensional learning to create scientifically authentic, student-centered classrooms, PLoS ONE, 19(5), e0295887 DOI:10.1371/journal.Pone.0295887.
  8. DeGlopper K. S., Schwarz C. A., Ellias N. J. and Stowe R. L., (2022), Impact of Assessment Emphasis on Organic Chemistry Students' Explanations for an Alkene Addition Reaction, J. Chem. Educ., 99, 1368–1382.
  9. Deslauriers L., McCarty L. S., Miller K., Callaghan K. and Kestin G., (2019), Measuring actual learning versus feeling of learning in response to being actively engaged in the classroom, Proc. Natl. Acad. Sci. U. S. A., 1–7.
  10. Dunn O. J., (1961), Multiple Comparisons among Means, J. Am. Stat. Assoc., 56(293), 52–64 DOI:10.1080/01621459.1961.10482090.
  11. Edwards M. E., (2017), Creating and Using Symbolic Mental Structures Via Piaget's Constructivism and Popper's Three Worlds View with Falsifiability to Achieve Critical Thinking by Students in the Physical Sciences, Syst., Cyber. Inform., 15(6), 130–134.
  12. Eichler J. F., (2022), Future of the Flipped Classroom in Chemistry Education: Recognizing the Value of Independent Preclass Learning and Promoting Deeper Understanding of Chemical Ways of Thinking During In-Person Instruction, J. Chem. Educ., 99, 1503–1508.
  13. Freeman S., Eddy S. L., McDonough M., Smith M. K., Okoroafor N., Jordt H. and Wenderoth M. P., (2014), Active learning increases student performance in science, engineering, and mathematics, Proc. Natl. Acad. Sci. U. S. A., 111(23), 8410–8415.
  14. Fritz C. O., Morris P. E. and Richler J. J., (2012), J. Exp. Psychol. Gen., 141(1), 2.
  15. Hartman J. D. and Eichler J. F., (2024), Implementing Mastery Grading in Large Enrollment General Chemistry: Improving Outcomes and Reducing Equity Gaps, Educ. Sci., 14(11), 1224.
  16. Holloway L. R., Miller T. F., da Camara B., Bogie P. M., Hickey B. L., Lopez A. L., Ahn J., Dao E., Naibert N., Barbera J., Hooley R. J. and Eichler J. F., (2024), Using Flipped Classroom Modules to Facilitate Higher Order Learning in Undergraduate Organic Chemistry, J. Chem. Educ., 101(2), 490–500.
  17. Howitz W. J., McKnelly K. J. and Link R. D., (2021), Developing and Implementing a Specifications Grading System in an Organic Chemistry Laboratory Course, J. Chem. Educ., 98, 385–394.
  18. Ivie S. D., (1998), Ausubel's Learning Theory: An Approach to Teaching Higher Order Thinking Skills, The High School Journal, 82(1), 35–42.
  19. Kooli C. and Yusuf N., (2025), Transforming Educational Assessment: Insights Into the Use of ChatGPT and Large Language Models in Grading, Int. J. Human–Computer Interaction, 41(5), 3388–3399.
  20. Laverty J. T., Underwood S. M., Matz R. L., Posey L. A., Carmel J. H., Caballero M. D., Fata-Hartley C. L., Ebert-May D., Jardeleza S. E. and Cooper M. M., (2016), Characterizing College Science Assessments: The Three-Dimensional Learning Assessment Protocol, PLoS One, 11(9), e0162333 DOI:10.1371/journal.pone.0162333.
  21. Lindstrom T. and Middlecamp C., (2017), Campus as a Living Laboratory for Sustainability: The Chemistry Connection, J. Chem. Educ., 94, 1036–1042.
  22. Mayer R. E., (2024), The Past, Present, and Future of the Cognitive Theory of Multimedia Learning, Educ. Psychol. Rev., 36, 8.
  23. Mayer R. E., Stull A., DeLeeuw K., Almeroth K., Bimber B., Chun D., Bulger M., Campbell J., Knight A. and Zhang H., (2009), Clickers in college classrooms: fostering learning with questioning methods in large lecture classes, Contem. Educ. Psychol., 34, 51–57.
  24. National Research Council, (2012), A Framework for K-12 science education: Practices, Crosscutting Concepts, and Core ideas, The National Academies Press.
  25. NGSS Lead States, (2013), Next Generation Science Standards: For States, By States.
  26. Pratt J. M., Stewart J. L., Reisner B. A., Bentley A. K., Lin S., Smith S. R. and Raker J. R., (2023), Measuring student motivation in foundation-level inorganic chemistry courses: a multi-institution study, Chem. Educ. Res. Pract., 24, 143.
  27. Rahman M. T. and Lewis S. E., (2020), Evaluating the evidence base for evidence-based instructional practices in chemistry through meta-analysis, J. Res. Sci. Teach., 57, 765–793.
  28. Ramella D., Brock B. E., Velopolcek M. K. and Winters K. P., (2019), Using Flipped Classroom Settings to Shift the Focus of a General Chemistry Course from Topic Knowledge to Learning and Problem-Solving Skills: A Tale of Students Enjoying the Class They Were Expecting to Hate, ACS Symp. Ser., 1–20.
  29. Schwarz C. E., DeGlopper K. S., Esselman B. J. and Stowe R. L., (2024), Tweaking Instructional Practices Was Not the Answer: How Increasing the Interactivity of a Model-Centered Organic Chemistry Course Affected Student Outcomes, J. Chem. Educ., 101(6), 2215–2230.
  30. Seery M. K., (2015), Flipped learning in higher education chemistry: emerging trends and potential directions, Chem. Educ. Res. Pract., 16, 758–768.
  31. Seery M. K. and Donnelly R., (2012), The implementation of pre-lecture resources to reduce in-class cognitive load: a case study for higher education chemistry, BJET, 43(4), 667–677.
  32. Shortlidge E. E., Gray M. J., Estes S. and Goodwin E. C., (2024), The Value of Support: STEM Intervention Programs Impact Student Persistence and Belonging, CBE–Life Sci. Educ., 23(2), 1–16.
  33. Stains M., Harshman J., Barker M. K., Chasteen S. V., Cole R., DeChenne-Peters S. E., Eagan J. R. M. K., Esson J. M., Knight J. K., Laski F. A., Levis-Fitzgerald M., Lee C. J., Lo S. M., McDonnell L. M., McKay T. A., Michelotti N., Musgrove A., Palmer M. S., Plank K. M., Rodela T. M., Sanders E. R., Schimpf N. G., Schulte P. M., Smith M. K., Stetzer M., Van Valkenburgh B., Vinson E., Weir L. K., Wendel P. J., Wheeler L. B. and Young A. M., (2018), Anatomy of STEM teaching in North American universities, Science, 359, 270–272.
  34. Strelan P., Osborn A. and Palmer E., (2020), The flipped classroom: a meta-analysis of effects on student performance across disciplines and education levels, Educ. Res. Rev., 30, 100314.
  35. Sweller J., van Merrienboer J. J. G. and Paas F. G. W. C., (1998), Cognitive architecture and instructional design, Educ. Psychol. Rev., 10(3), 251–296.
  36. Theobald E. J., Hill M. J., Tran E., Agrawal S., Arroyo E. N. and Behling S., et al., (2020), Active learning narrows achievement gaps for underrepresented students in undergraduate science, technology, engineering, and math, Proc. Natl. Acad. Sci. U. S. A., 117(12), 6476–6483.
  37. Wu H.-T., Mortezaei K., Alvelais T., Henbest G., Murphy C., Yezierski E. J. and Eichler J. F., (2021), Incorporating Concept Development Activities into a Flipped Classroom Structure: Using PhET Simulations to Put a Twist on the Flip, Chem. Educ. Res. Pract., 22, 842–854.

Footnotes

The POGIL Project. From https://pogil.org/. Accessed April 10, 2025. Chemical Thinking. From https://sites.google.com/site/chemicalthinking/. Accessed April 10, 2025. CLUE: Chemistry, Life, the Universe, & Everything. From https://clue.chemistry.msu.edu/. Accessed April 10, 2025.
https://www.gradescope.com/. Accessed April 11, 2025.

This journal is © The Royal Society of Chemistry 2026