Blending muddiest point activities with the common formative assessments bolsters the performance of marginalized student populations in general chemistry

Caroline Z. Muteti a, Tracy Kerr a, Mwarumba Mwavita b and Jacinta M. Mutambuki *a
aDepartment of Chemistry, Oklahoma State University, Stillwater, OK 74078, USA. E-mail: caroline.z.muteti@okstate.edu; tracy.kerr@okstate.edu; jacinta.mutambuki@okstate.edu
bDepartment of Research, Evaluation, Measurement, and Statistics, Oklahoma State University, Stillwater, OK 74078, USA. E-mail: mwavita@okstate.edu

Received 22nd November 2021 , Accepted 7th February 2022

First published on 7th February 2022


Abstract

Formative assessments (FAs), such as muddiest point activities, can capture students’ areas of confusion or struggle on given content that may otherwise be overlooked with the common FAs, such as quizzes, homework, and clickers. The few reported studies indicate mixed results on the effects of implementing the muddiest point in undergraduate STEM or chemistry courses. Additionally, the commonly reported implementation model involves the short cycle: asking for the areas of confusion at the end of the lesson on instructor pre-chosen topics. Using a quasi-experimental study design, we investigated the effects of blending muddiest point activities at the end of each chapter (medium cycle) with the usual common FAs on performance in a General Chemistry I course. In one lecture section, students were exposed to the muddiest point activities and the common FAs, whereas in the other lecture section students were exposed to the common FAs alone. Results showed that students in the treatment group performed significantly (p < 0.05) better on all three midterm exams than their counterparts who were exposed to the usual common FAs only. The mean difference on the final exam was not statistically significant (p > 0.05), even though the treatment group showed a slightly higher mean score than the comparison group. MANCOVA results showed a statistically significant main effect of the FA type and statistically significant interactions between the FA type and demographic variables, particularly gender and first-generation status, and race/ethnicity and first-generation status, on performance, after controlling for ACT Math scores. Between-subjects tests revealed significantly higher mean scores on some midterm exams for minority, minority first-generation, and female first-generation students in the treatment group compared to their peers in the comparison group. The findings imply that muddiest point activities can promote equitable access to learning for all students and bolster performance, particularly for marginalized students.


Introduction

Obtaining and providing timely feedback to assess student progress in learning can be a daunting task in large-enrollment courses, especially when soliciting open-ended responses from students. Often, instructors assume that students will grasp the content and ask questions about the areas where they struggle or are confused. While this might occasionally be true for some students, it is not an inclusive teaching approach, as introverted and marginalized students, such as first-generation college students and minority ethnic groups, might not be comfortable asking questions in classes or learning environments (Gardner, 2005; Rivera-Goba and Nieto, 2007). As such, these students are likely to be left out and fall behind on particular concepts if the gaps in learning are not identified and addressed in real time. In turn, such a barrier can negatively impact their academic success.

Formative assessments (FAs) are pivotal for collecting real-time evidence of learning (Wiliam, 2006; Harlen, 2013). The evidence is elicited, interpreted in terms of students’ learning needs, and utilized to inform instructional adaptations to meet the identified learning needs (Wiliam, 2006). The information is also beneficial in redirecting students to the areas of their studying that need more focus. Wiliam proposed a typology of FAs based on the duration of instruction over which evidence of learning is elicited. It comprises three cycles: (1) the short cycle, ranging from 5 seconds to 1 hour of assessment and focusing on student learning within lessons; (2) the medium cycle, covering 1 day to 2 weeks of instruction and focusing on learning within and between lesson units; and (3) the long cycle, covering four weeks to a year or more and focusing on learning across instructional units (Wiliam, 2006). Both the students and the instructor benefit from FAs by obtaining immediate feedback on the learning progress and the instruction, respectively. Specifically, students can review and reflect on what they got wrong and seek help from the instructor or consult resources on their own to address the gaps in learning. Feedback is therefore critical for improving performance (Yalaki, 2010; Yalaki and Bayram, 2015).

The implementation of FAs should be intentional to assess the intended learning outcomes. In turn, this provides opportunities to identify and address the gaps in learning (Black and Wiliam, 1998, 2010) and to modify the instruction to meet the students’ needs (Wiliam, 2006; Nilson, 2016). Common FAs, such as quizzes (Costa et al., 2010; Yalaki, 2010; Yalaki and Bayram, 2015), homework (Freasier et al., 2003; Botch et al., 2007; Leinhardt et al., 2007; Richards-Babb et al., 2011; Malik et al., 2014), and clicker questions have been shown to improve performance in STEM courses (Freeman et al., 2007). However, these FAs are often instructor-directed, with pre-chosen topics and answer options in multiple-choice formats; they therefore exclude students’ voices in monitoring their own learning progress and leave students without support on concepts that are otherwise not assessed.

Classroom assessment techniques (CATs), such as the muddiest point (Angelo and Cross, 1993; Nilson, 2016), have the potential to enhance an inclusive learning environment and cultivate successful learning experiences for all students. The muddiest point technique involves soliciting from students the most confusing concepts in the covered material and addressing the identified gaps in learning in real time. The muddiest points are usually written anonymously on a notecard or a sheet of paper distributed and collected at the end of class (Mosteller, 1989) or submitted through digital platforms such as clickers (King, 2011). For instance, King's study showed that the muddiest points on concepts covered in a given lesson can be identified using clickers (King, 2011). Two to four questions were incorporated at various times throughout each 50 min lecture. Using clickers, students were asked to identify and select the most confusing concept from muddiest point topics pre-chosen by the instructor. The information obtained from the polls led to a 5 min review at the start of the subsequent lesson. Specifically, the muddiest point prompt was displayed, followed by a brief discussion of the most frequently polled options or muddiest points. Additionally, students answered a follow-up question to uncover any remaining gaps in learning on the concept that had been reviewed. The study findings showed that clickers can be useful in identifying the muddiest points in large chemistry classes.

The muddiest point technique provides a platform for guiding students to become self-directed learners (Steadman, 1998) by monitoring their learning, and for the instructors to be more effective and reflective teachers (Angelo and Cross, 1993). Collecting anonymous responses on muddy points ensures honest feedback from the students and minimizes the feeling of intimidation, especially for shy and struggling students who may not be comfortable asking questions in the presence of the instructor or their peers.

The implementation of muddiest point activities in STEM courses is a work in progress. Past studies in material science and engineering courses have reported benefits of implementing the muddiest point CAT, such as bridging communication between students and instructors (Carberry et al., 2013), increased student motivation and self-efficacy (Carberry et al., 2013; Krause et al., 2014), increased student engagement (Carberry et al., 2013), development of metacognitive skills (Carberry et al., 2013; Krause et al., 2014), improved achievement scores (Krause et al., 2014), reduced anxiety in learning (Waters et al., 2016), and the ability of students to identify gaps in learning (Waters et al., 2016).

Additionally, the few studies of muddiest point activities in postsecondary chemistry courses have reported mixed results (Snead, 2017; Alanazi, 2021). Snead compared the exam performance of General Chemistry I and II students who participated in muddiest point clicker activities with that of their counterparts in other lecture sections who did not participate (Snead, 2017). Results showed no statistically significant difference between the experimental group and the comparison sections on any exam in either course. Additionally, the findings showed that a high percentage of students who did not choose the muddiest point that was reviewed in class often answered the correlated exam question correctly, indicating little impact of the muddiest point clicker activities on most students (Snead, 2017).

Alanazi investigated, within the treatment group, whether exam performance differed between students who provided muddiest point feedback using clickers at the end of each lesson and those who did not participate. The findings indicated that students who responded to the muddiest point questions performed better on the reviewed concepts that were included in the exam questions than students who did not respond. Results further showed that, on the tested quantitative topics related to the muddiest point lists assessed in the lectures, the muddiest point clicker participants performed significantly better than students who did not respond to those same lists. However, performance on conceptual exam questions related to the muddiest point topics was comparable for participants and non-participants (Alanazi, 2021). These mixed results indicate the need for more research investigating models of muddiest point interventions beyond the short cycle and assessing their effects on learning outcomes in postsecondary chemistry education.

In this study, the researchers describe a model whereby students were provided with a list of the general topics covered in each chapter and asked to self-generate specific muddiest point concepts from those topics, rather than being given a list of muddiest points pre-chosen by the instructor at the end of each lesson (King, 2011). This medium-cycle FA allowed students not only to learn the chapter content but also to study and review the material prior to the reinforced reflections on the chapter concepts through the muddiest point activity. Moreover, we investigated the impact of the intervention on performance on the midterm exams and a cumulative final exam, as well as whether the muddiest point activities had any impact in narrowing the opportunity gap in performance between marginalized and non-marginalized groups in the general chemistry course.

The assessment of the effects of the muddiest point intervention was guided by the following questions: (1) Are there significant differences in performance between students exposed to traditional formative assessments (TFAs) coupled with the muddiest point CAT (treatment group, Trt group) and their counterparts exposed to the TFAs alone (comparison group, Ctl group) in a General Chemistry I course? (2) Controlling for ACT Math scores, are there significant interactions between the type of FA (the muddiest point versus the usual common FAs) and participants’ demographics, such as gender, race, and first-generation status, on performance as measured by the four exams, and does the intervention narrow the opportunity gap in performance between these demographic groups? The terminology “opportunity gap” was chosen over “achievement gap” to emphasize that all students can achieve if provided with the relevant resources and opportunities. Proponents of equitable educational spaces have challenged the use of the achievement gap to explain disparities in performance, as it tends to place the blame on historically marginalized students and assumes that these students do not meet academic expectations (Johnson, 2002; Ladson-Billings, 2006; Boykin and Noguera, 2011). In our operationalization of the “opportunity gap”, we acknowledge that this gap is not due to inherent differences in students’ capabilities or identities, but to unequal and inadequate educational opportunities that impede many marginalized students from succeeding. As one way of promoting equitable access to learning, the muddiest point activities were intended to provide opportunities for all students to point out the chemistry concepts that they struggled with in each chapter unit.

This study tested the null hypotheses that (1) the mean scores of the treatment and the comparison groups on the midterms and the final exam would be identical, and (2) there would be no significant interactions between the FA type and demographic variables, such as gender, race, and first-generation status, on performance in the course, after controlling for the ACT Math scores. Additionally, the researchers hypothesized that the mean scores on the midterm exams and the final exam would be equal regardless of the participants’ social identity groups in both the treatment and the comparison groups. It was expected that the muddiest point activities would provide equitable access to learning by allowing all students to communicate their gaps in learning, which, in turn, would improve their performance. The implementation of the muddiest point activities followed the formative assessment cycle suggested by Heritage (2010).

The formative assessment cycle

Heritage described eight components of the formative assessment process, namely: (1) identifying the desired learning outcomes, (2) eliciting evidence of learning by collecting data, (3) analyzing and interpreting the evidence, (4) identifying the learning gaps, (5) providing feedback or addressing the learning gaps, (6) planning learning and instructional modifications, (7) scaffolding new learning, and (8) closing the gap (Heritage, 2010). These elements were condensed into the seven steps or components shown in Fig. 1, which were adopted in implementing the muddiest point CAT in the treatment group. During the first three stages, the instructor purposefully identifies the learning goals for a lesson (Develop Learning Objectives), determines criteria for the achievement of these goals (Select the CAT and Develop Learning Activities), and then shares the goals with students as success criteria to guide learning and engages students with learning tasks (Perform Instruction).
Fig. 1 The formative assessment cycle and its adoption in the General Chemistry I course. Adapted from Heritage (2010).

In step 4, the instructor enacts suitable strategies during instruction to collect evidence of learning toward achieving the desired learning outcomes (Collect Evidence of Learning). These strategies can be planned or implemented spontaneously during the lesson. Examples include, but are not limited to, questioning techniques, monitoring of instructional tasks (e.g., explanations, problem-solving), and mid-lesson checks (e.g., low-/high-tech response systems such as thumbs up/down, ABCD/flashcards, chorus answers, clicker questions, and writing the most confusing points, i.e., the muddiest point) (Angelo and Cross, 1993; Nilson, 2016). In the current study, although questioning techniques, clicker questions, and muddiest point CAT activities were all employed to gauge the students’ learning progress, the latter was the primary focus. The Top Hat technology platform provided the means for collecting the evidence of learning. The muddiest point CAT was implemented after the completion of each chapter, that is, before introducing a new chapter (medium cycle). Enactment after each chapter was preferred over implementation at the end of each lesson (short cycle) to afford students additional moments of self-reflection on the content material and to identify and address any gaps in learning that were not addressed during the lessons. A summary of the muddiest point implementation timeline is presented in Fig. 2.


Fig. 2 The muddiest point implementation timeline in the General Chemistry course.

Via the Top Hat teaching technology, students were provided with the main concepts covered in each chapter to enhance reflection on the different concepts covered. Next, students were asked to identify specific concepts and anonymously write down the most confusing ones, those most likely to affect their performance if they were to complete an exam focused on the chapter at that moment. The polled anonymous muddiest point responses were projected on the screen during class. In addition to submitting individual responses, students were asked to “like” peers’ reported muddy points that they felt needed to be addressed. An example of the muddiest point prompt and students’ responses captured using Top Hat is available in the ESI.

In step 5, the instructor analyzes the evidence of learning against the desired outcomes to determine progress in learning (Analyze Student Responses from the CAT and Identify Learning Gaps). The muddiest point responses were analyzed and categorized using frequency counts. A summary of the reported muddy points in each chapter is presented in Table 1 in the ESI. As the muddy points were revealed in the classroom, three supplemental instruction (SI) leaders (undergraduate students who had excelled in a previous offering of the course and who led out-of-class instructional sessions) also identified the frequently reported areas of struggle and noted them down for follow-through during their SI sessions.

Frequently identified chapter muddy concepts, those constituting at least 10% of the total responses for a chapter, were considered for follow-up by the course instructor during subsequent class meetings, as illustrated in the sketch below. At this stage, the instructor can diagnose what the students understand, any existing misconceptions not yet addressed, the knowledge they do or do not have, and the skills they have or have not acquired. With this information, students also understand their progress in learning. The teacher then identifies the gaps between the displayed learning status and the intended learning goals or objectives. Through reinforced reflection or self-monitoring, students can also use the intended learning objectives to identify gaps in their learning.
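The screening rule above amounts to a frequency count with a 10% cutoff. The following Python sketch illustrates the idea with hypothetical hand-coded responses; the authors collected responses through Top Hat, not with this script:

```python
# Tally student-reported muddy concepts for one chapter and keep those
# named in at least 10% of the responses for in-class follow-up.
# The response strings below are hypothetical examples.
from collections import Counter

responses = [
    "significant figures", "dimensional analysis", "significant figures",
    "density calculations", "significant figures", "unit conversions",
    "dimensional analysis", "significant figures", "density calculations",
    "significant figures", "limiting reagent", "molarity",
]

counts = Counter(responses)
threshold = 0.10 * len(responses)  # at least 10% of the total responses

follow_up = {concept: n for concept, n in counts.items() if n >= threshold}
for concept, n in sorted(follow_up.items(), key=lambda kv: -kv[1]):
    print(f"{concept}: {n}/{len(responses)} ({100 * n / len(responses):.0f}%)")
```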

In step 6, feedback is realized when teachers provide descriptive feedback to the students about the status of their learning relative to the success criteria and give cues about what students can do to progress and close the gaps (Provide Feedback and Close the Gap). Students also get feedback about their learning by self-monitoring and following through on the self-identified learning gaps. In the current study, students received, through the learning management system, recorded video clips and PowerPoint presentations of instructor-solved problem sets detailing the necessary problem-solving steps for additional tutoring. Additional ungraded homework problems were also assigned for practice.

In step 7, the instructor reflects on, refines, and modifies subsequent instruction to meet students' learning needs (Reflect, Refine, and Adapt Instruction). The instructor also selects learning experiences that place an appropriate demand on students and lead to closing the gap between where students are and where they need to be. By self-monitoring, students also adjust their learning strategies and select appropriate strategies so that they can move forward (Muteti et al., 2021). Moreover, the popular muddy concepts were addressed before introducing a new concept or chapter and were also intentionally integrated into the subsequent chapters throughout the semester by the instructor. For instance, reflections on Chapter 1 revealed that “significant figures” was the most frequently reported muddy concept among students. Consequently, the instructor intentionally pointed out the relevant rules for writing significant figures during subsequent in-class quantitative problem-solving activities to reinforce the concepts. The learning assistants also worked closely with the instructor to develop practice problems addressing the reported muddiest points, which, in turn, were used during their SI sessions.

Context of the study

The study was conducted at a public research university with participants enrolled in a General Chemistry I lecture course. The course was accompanied by a laboratory component, and the class met three times a week. There were four lecture sections taught by different instructors during the semester of data collection; the current study involved participants from two of them. Undergraduate learning assistants, peer leaders in charge of supplemental instruction (SI), also attended the lectures and took notes to enhance their content knowledge and support meaningful discussions with students during the SI sessions. A total of 224 students from the comparison group and 117 from the treatment group consented to the study. In both lecture sections, participants were exposed to similar TFAs, such as homework, quizzes, and clicker questions interspersed throughout the lectures; however, the treatment group received the additional formative assessment focused on the muddiest point CAT. At the beginning of the semester, the two instructors teaching the lecture sections met and identified the learning objectives and topics to be covered, and they discussed the depth of coverage of each identified topic throughout the semester to ensure consistency and uniformity in content delivery.

To secure face validity (Nevo, 1985) and content validity of the test items (Yaghmaei, 2003), the two instructors of record together developed three non-cumulative midterm exams and a cumulative final exam focused on the course learning objectives. The exam questions were similar, but not identical, to standardized ACS General Chemistry exam items. Together, the exams contributed 56.3% of the final course grade. Each midterm exam focused mostly on 2–3 chapter units, whereas the cumulative final exam comprised questions from all the chapters. The final exam was weighted heavily toward the chapters not previously tested on the three midterm exams, with about 3–4 questions from each of the previously tested chapters. The two lecture sections completed the same exam questions concurrently. The midterm exams lasted 1 hour and the cumulative final exam 1 h 50 min. The two instructors also developed answer keys for the multiple-choice questions and rubrics for the short-answer questions. The multiple-choice questions were graded using a scantron machine, whereas the short-answer questions were graded by graduate teaching assistants (GTAs) familiar with the course material. To establish consistency in grading, each GTA from both lecture sections was allowed to select a question for grading. Two GTAs graded the same question from both sections, then sat together, discussed the scoring, and resolved simple discrepancies. When a student's reasoning slightly differed from the thought-process steps presented in the grading key but appeared to lead to the correct solution or answer, the written solution was discussed with the two course instructors and a consensus was reached on the scoring.

Methods

Research design

This study followed the institutional review board (IRB) guidelines and the approved protocol at the authors’ institution. Specifically, a quasi-experimental design was employed to investigate the effect of the intervention, the muddiest point activities infused with the TFAs, on performance in the General Chemistry I course compared to the TFAs alone. The lecture sections were scheduled at different times, and class meetings occurred three times per week, with each lesson lasting 50 minutes. Participants self-selected into the lecture sections; the names of the primary course instructors were not revealed until the beginning of the semester.

Participants

Participants were science majors enrolled in the General Chemistry I course. Of the 370 students enrolled in the course, 341 consented to participate in this study. Of these participants, 65% were female and 35% were male. The participants were at different levels of their study programs: 47% were first-years (Trt = 74, Ctl = 86), 40% sophomores (Trt = 33, Ctl = 103), 10.5% juniors (Trt = 8, Ctl = 28), and 2.5% seniors (Trt = 2, Ctl = 7). Racial representation included White (∼66%); minority ethnic groups, that is, Black/African American, Native American/Alaskan Native, Hispanic, Native Hawaiian/Pacific Islander, and multiracial students identifying with minority ethnic groups (∼29%); and Asian (∼0.9%). Additionally, 86% of the participants were non-first-generation college students and 14% were first-generation. Seven of the participants were international students. The average ACT Math scores of the two groups were comparable (Trt = 24.83 ± 4.13, Ctl = 24.27 ± 4.34).

Data collection and analysis

Data included scores on the three midterm exams and the final cumulative exam, incoming preparation scores (ACT Math scores), and demographics (gender, race/ethnicity, and first-generation status). The demographic data and incoming preparation scores were obtained from the University. Average exam scores were computed from the raw performance scores. Exams 1, 2, and 3 were each out of 100 points and directly expressed as percentages, whereas the final exam was out of 150 points and later converted to a percentage. Prior to the analysis, the questions on each exam were tested for internal consistency, with reliabilities for the multiple-choice and short-answer scores computed separately for each group. The Cronbach's coefficient for each exam was >0.7, indicating that the exam items reliably measured student knowledge of the concepts; an alpha of 0.7 is considered acceptable and 0.8 good (Cortina, 1993). Additionally, the corrected item-total correlations were positive for all the test items on the midterm exams, suggesting the test items were acceptable for the study (DeVon et al., 2007). Although most of the items on the final exam showed corrected item-total correlations greater than 0.2, seven of the 30 multiple-choice items fell below 0.2, indicating poor discrimination of those test items in assessing knowledge (De Vaus, 2013). However, students had encountered similar questions in the homework problems and during the in-class activities, which suggests that the concepts were not well mastered, as evidenced by the low item mean scores generated from the internal reliability test. A summary of the test item numbers and the corresponding concepts assessed in each exam is presented in Table 2 in the ESI, and summaries of the reliability test results, including the Cronbach's coefficients, item mean scores, and item-total correlations, are presented in Tables 3–6 in the ESI.
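For readers who wish to reproduce these reliability statistics, the sketch below shows one standard way to compute Cronbach's alpha and corrected item-total correlations in Python. The data frame and item names are hypothetical placeholders, not the study's data:

```python
# Cronbach's alpha and corrected item-total correlations for a
# students x items score matrix (rows = students, columns = items).
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    # alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def corrected_item_total(items: pd.DataFrame) -> pd.Series:
    # Correlate each item with the total score of the *remaining* items.
    return pd.Series(
        {col: items[col].corr(items.drop(columns=col).sum(axis=1))
         for col in items.columns}
    )

# Hypothetical 0/1 item scores for 200 students on 30 items, driven by a
# common "ability" factor so the items are positively correlated.
rng = np.random.default_rng(0)
ability = rng.normal(0, 1, 200)
noise = rng.normal(0, 1, (200, 30))
exam = pd.DataFrame((ability[:, None] + noise > 0).astype(int),
                    columns=[f"item{i}" for i in range(1, 31)])

print(f"alpha = {cronbach_alpha(exam):.2f}")
print(corrected_item_total(exam).round(2).head())  # items below ~0.2 discriminate poorly
```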

Prior to the analysis, data were checked for missingness and outliers. Specifically, Cook's distance was used to detect outliers; a Cook's distance greater than 4/n (where n is the sample size) is considered an outlier (Pituch and Stevens, 2015). No outliers were identified. Additionally, internal reliability tests on the multiple-choice and short-answer questions were performed for each midterm exam and the cumulative final exam. Both formats of the test questions produced Cronbach coefficients of ∼0.8, indicating good reliability (George and Mallery, 2003) of the test items in measuring student knowledge. Finally, the demographic data were dummy coded (1 or 0) for specific groups as follows: gender (0 = male, 1 = female), first-generation status (0 = non-first-generation, 1 = first-generation), and race/ethnicity (0 = white, 1 = minority). Independent t-tests were then performed at α = 0.05 to determine any statistically significant mean differences between the treatment and comparison groups on the midterm exams and the final cumulative exam. Effect sizes, Cohen's d adjusted for unequal group sizes (Cohen, 2013), were also calculated for exams that showed statistically significant mean differences. Descriptive statistics were also computed to evaluate the distribution of the exam mean scores and letter grades between the two groups by participants’ demographics.
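A minimal sketch of these screening and comparison steps, using simulated data rather than the study's records, is shown below; all variable names and values are hypothetical:

```python
# Cook's distance screening (4/n cutoff), an independent t-test, and a
# pooled-SD Cohen's d. Data are simulated for illustration only.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(1)
act_math = rng.normal(24.5, 4.2, size=341)
exam1 = 2.0 * act_math + rng.normal(20, 12, size=341)

# Cook's distance from a simple OLS of exam score on ACT Math.
model = sm.OLS(exam1, sm.add_constant(act_math)).fit()
cooks_d = model.get_influence().cooks_distance[0]
n_outliers = int((cooks_d > 4 / len(exam1)).sum())

# Independent t-test between hypothetical treatment/comparison groups.
trt, ctl = exam1[:117], exam1[117:]
t, p = stats.ttest_ind(trt, ctl)

# Cohen's d using the pooled standard deviation for unequal group sizes.
n1, n2 = len(trt), len(ctl)
sp = np.sqrt(((n1 - 1) * trt.var(ddof=1) + (n2 - 1) * ctl.var(ddof=1))
             / (n1 + n2 - 2))
d = (trt.mean() - ctl.mean()) / sp
print(f"t = {t:.2f}, p = {p:.3f}, d = {d:.2f}, outliers flagged: {n_outliers}")
```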

Moreover, a 2 × 2 × 2 × 2 multivariate analysis of covariance (MANCOVA) was performed on the four dependent variables (Exam 1, Exam 2, Exam 3, and Final Exam), controlling for ACT Math scores (the covariate), to answer the second research question. The independent variables were the treatment condition and the demographic variables of gender, race/ethnicity, and first-generation status, each with two levels. This analysis was employed to determine whether there were significant interactions between the type of FA and the individual demographic variables on the dependent variables, the four exam scores. Descriptive statistics generated from the MANCOVA were also scrutinized to determine the mean score distributions by demographics on each exam and the magnitude of the opportunity gap between the demographic groups.
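As an illustration only (the authors' analysis was run in a standard statistical package and not necessarily this way), a MANCOVA of this form can be expressed in Python with statsmodels by entering the covariate directly in the multivariate formula; all column names and data below are hypothetical:

```python
# Four-way MANCOVA sketch: four exam scores regressed on FA type, gender,
# race/ethnicity, and first-generation status, with ACT Math as a covariate.
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

rng = np.random.default_rng(2)
n = 273  # number of cases retained in the reported MANCOVA
df = pd.DataFrame({
    "exam1": rng.normal(72, 15, n), "exam2": rng.normal(60, 19, n),
    "exam3": rng.normal(46, 20, n), "final": rng.normal(49, 17, n),
    "act_math": rng.normal(24.5, 4.2, n),
    "fa": rng.integers(0, 2, n),        # 0 = comparison, 1 = treatment
    "gender": rng.integers(0, 2, n),    # 0 = male, 1 = female
    "race": rng.integers(0, 2, n),      # 0 = white, 1 = minority
    "firstgen": rng.integers(0, 2, n),  # 0 = non-first-gen, 1 = first-gen
})

mv = MANOVA.from_formula(
    "exam1 + exam2 + exam3 + final ~ act_math + fa * gender * race * firstgen",
    data=df)
print(mv.mv_test())  # Wilks' lambda, F, and p for each main effect and interaction
```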

Results

We present results addressing each research question separately in the subsequent sections.

RQ 1: Are there significant differences in performance in a general chemistry course between students exposed to the muddiest point activities coupled with the usual common FAs and their counterparts exposed to the common FAs only?

Finding. There were statistically significant differences in performance between the Trt group and the Ctl group on the three midterm exams, but not on the final cumulative exam. Results from independent t-tests conducted on each exam showed statistically significant differences in performance between the two groups on Exam 1 (t(389) = 4.089, p < 0.001, d = 0.4), Exam 2 (t(272) = 2.194, p < 0.05, d = 0.2), and Exam 3 (t(248) = 2.412, p < 0.05, d = 0.3), as shown in Fig. 3.


Fig. 3 Independent t-test mean scores between the treatment group (blue bars) and the comparison group (orange bars) on midterm exams and the cumulative final exam in the General Chemistry I course. The error bars denote the standard error mean values, and * indicates statistically significant mean differences in exam scores (p < 0.05) between the two groups.

However, the mean differences were associated with small effect sizes for Exams 2 and 3 and an effect size approaching moderate for Exam 1 (adjusted Cohen's d for unequal sizes: small = 0.2, moderate = 0.5, large = 0.8; Cohen, 2013; Pituch and Stevens, 2015). Overall, the mean scores for the treatment group were higher than those of the comparison group on the three midterm exams: Exam 1 (Trt: 78.4 ± 12.6, Ctl: 71.9 ± 16.3), Exam 2 (Trt: 62.8 ± 17.7, Ctl: 58.0 ± 20.9), and Exam 3 (Trt: 49.2 ± 19.3, Ctl: 43.7 ± 20.6). Additionally, the treatment group scored slightly higher than the comparison group on the final exam, but the mean difference was not statistically significant (Trt: 50.0 ± 15.2, Ctl: 48.4 ± 18.0). These results suggest that the TFAs coupled with the muddiest point CAT had positive effects on students’ performance, beyond the TFAs alone, on the tests focused on 2–3 chapters but not on the cumulative exam.

Descriptive statistics performed on the midterm exams, where statistically significant mean differences were observed, revealed 18%, 8%, and 4% more ABC grades in Exams 1, 2, and 3, respectively, for the treatment group compared to the comparison group. Additionally, there were 17.3%, 8%, and about 5% fewer DF grades in the treatment group than in the comparison group in Exams 1, 2, and 3, respectively. Overall, these results suggest that the TFAs coupled with the muddiest point activities bolstered performance on the three midterm exams compared to the TFAs alone.

RQ2: Controlling for ACT math scores, are there significant interactions between the type of FA (the muddiest point versus the usual common FAs) and participants’ demographics, such as gender, race, and first-generation status on performance as measured by the four exams, and does the intervention narrow the opportunity gap in performance between these demographic groups?

Finding 1. A significant multivariate main effect of the FA type was noted on the combined performance measures, along with significant interactions between the FA type and the demographic variables on certain midterm exams, after controlling for ACT Math scores. The N values considered in the MANCOVA test were as follows: FA type (Ctl = 173, Trt = 100), gender (male = 94, female = 179), race/ethnicity (majority = 194, minority = 79), and first-generation status (non-first-gen = 242, first-gen = 31). The generated Box's M of 89.96 indicated that homogeneity of covariance matrices across groups could be assumed (F(90, 5850.67) = 0.862, p = 0.820), and the linearity and multicollinearity assumptions were met. The 2 × 2 × 2 × 2 MANCOVA results indicated that the four-way interaction of the FA type, gender, race, and first-generation status did not have a significant multivariate effect on the linear combination of the four exams (Wilks' Λ = 0.980, F(4, 253) = 1.293, p = 0.273). Similarly, the three-way interaction of the FA type, gender, and first-generation status had no significant multivariate effect on the linear combination of the four exams (Wilks' Λ = 0.975, F(4, 253) = 1.617, p = 0.170). Additionally, race, gender, and first-generation status did not show a significant interaction (Wilks' Λ = 0.993, F(4, 253) = 0.431, p = 0.786). However, we observed a significant multivariate effect of the three-way interaction of the FA type, race/ethnicity, and first-generation status on the linear combination of the four exam scores (Wilks' Λ = 0.960, F(4, 253) = 2.653, p < 0.05, ηp2 = 0.04), although the effect size as measured by partial eta squared was rather small. A summary of the results of the multivariate effects is provided in Table 7 in the ESI.

None of the six two-way interactions had a significant multivariate effect on the linear combination of the four exams, although the interaction between the FA type and race/ethnicity approached the 0.05 significance level (Wilks' Λ = 0.967, F(4, 253) = 2.170, p = 0.07, ηp2 = 0.03). Examining the main effects of the FA type, gender, race/ethnicity, and first-generation status revealed a significant multivariate effect of the FA type (Wilks' Λ = 0.991, F(4, 253) = 2.653, p < 0.05, ηp2 = 0.04). However, there were no significant multivariate effects of gender (Wilks' Λ = 0.988, F(4, 253) = 0.759, p = 0.553), race (Wilks' Λ = 0.991, F(4, 253) = 0.572, p = 0.683), or first-generation status (Wilks' Λ = 0.991, F(4, 253) = 0.560, p = 0.692).

For all statistically significant multivariate results, we examined where those differences could be present; that is, are the statistically significant differences on Exam 1, Exam 2, Exam 3, the final exam, or a combination of the exams? The results from the between-subjects tests revealed statistically significant differences on Exam 1 based on the FA type (F(1, 256) = 6.32, p < 0.05, ηp2 = 0.02). In addition, a significant interaction effect was noted between the FA type and race/ethnicity on Exam 1 (F(1, 256) = 4.13, p < 0.05, ηp2 = 0.02). Further, the three-way interaction of the FA type, race, and first-generation status yielded statistically significant results on Exam 1 (F(1, 256) = 4.59, p < 0.05, ηp2 = 0.02). Similarly, the interaction of the FA type, gender, and first-generation status yielded a marginally significant result on Exam 3 (F(1, 256) = 3.74, p < 0.10, ηp2 = 0.01). To examine where these differences occurred and on which exams, based on the type of FA, we conducted pairwise comparisons, controlling the familywise error rate with a Bonferroni correction to eliminate spurious positives (Napierala, 2012). Results indicated that, on Exam 1, the treatment group scored significantly higher than the comparison group (Ctl mean = 69.13 ± 2.23, Trt mean = 76.54 ± 1.91). Overall, these results suggest that the muddiest point activities boosted students' performance on Exam 1.
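A minimal sketch of a Bonferroni adjustment of this kind, using statsmodels' multipletests with hypothetical placeholder p-values (not the study's values), is shown below:

```python
# Bonferroni correction for a family of four exam-wise comparisons.
from statsmodels.stats.multitest import multipletests

raw_p = [0.004, 0.030, 0.018, 0.210]  # hypothetical raw p-values, Exams 1-3 and final
reject, adj_p, _, _ = multipletests(raw_p, alpha=0.05, method="bonferroni")
for exam, p, r in zip(["Exam 1", "Exam 2", "Exam 3", "Final"], adj_p, reject):
    print(f"{exam}: adjusted p = {p:.3f}, significant = {r}")
```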

Finding 2. The MANCOVA results revealed a narrowed opportunity gap in performance for marginalized student populations in the course. Specifically, after controlling for ACT Math scores, minority students in the two groups differed significantly on Exam 1: minority students in the treatment group attained a higher mean score (M = 77.9 ± 3.57) than their counterparts in the comparison group (M = 64.6 ± 3.29). This finding suggests that the muddiest point intervention significantly impacted the performance of minority students on Exam 1 but not on the other exams. Similarly, the significant interaction of the FA type, gender, and first-generation status on Exam 3, after controlling for ACT Math scores, indicates that female students in the treatment group who identified as first-generation college students performed better (M = 52.9 ± 7.13) than their peers in the comparison group (M = 39.7 ± 4.39). Together, these results suggest that the muddiest point activities bolstered the performance of minority students and of female students who identified as first-generation in the course.

Descriptive statistics showed that the mean scores for the marginalized groups in STEM (minority, women, and first-generation students) were slightly higher in the treatment group than in the comparison group on the midterm exams, but about the same on the final exam (Fig. 4). These results indicate that the intervention had a positive impact on performance for all students on the assessments of smaller content chunks, especially for first-generation students, minority ethnic groups, and women, but no impact on the cumulative assessment of the content.


Fig. 4 Mean score distribution by demographic groups. The error bars represent the standard error of the mean. The majority group denotes students who identify as White, whereas the minority group includes racial-ethnic groups such as Black/African American, Native American/Alaskan Native, Hispanic, Native Hawaiian/Pacific Islander, and multiracial students mostly identifying with minority ethnic groups.

Furthermore, the evaluation of the magnitude of the opportunity gap, that is, the percentage difference in mean scores between demographic groups, in the treatment and comparison groups indicated that the gap between male and female students was small and about the same in both groups (Table 8 in the ESI). However, decreased opportunity gaps were noted in the treatment group compared to the comparison group based on race/ethnicity on all the exams, and based on first-generation status on the midterm exams only; the gap between first-generation and non-first-generation students on the final exam was about the same in both groups. In particular, the opportunity gaps between the majority and minority groups in the comparison group on Exam 1, Exam 2, Exam 3, and the final exam were 6.9%, 7.6%, 5.8%, and 4.1%, compared to 3.5%, 5.8%, 3.4%, and 2.4% in the treatment group, respectively. Although the majority group performed better than the minority group, the gap was narrowed in the treatment group.

The opportunity gaps between non-first-generation and first-generation students in the comparison group on Exam 1, Exam 2, Exam 3, and the final exam were 12.6%, 22.09%, 12.34%, and 8.3%, compared to 5.8%, 7.2%, 7.6%, and 8.6% in the treatment group, respectively. A similar pattern was noted in which non-first-generation students outperformed first-generation students, but the gap was reduced in the treatment group. A summary of the descriptive statistics comparing the treatment and comparison groups by demographics on the exams is available in Table 8 in the ESI. Overall, these results show that the opportunity gaps between minority and majority students, and between first-generation and non-first-generation students, were narrowed in the treatment group compared to the comparison group, whereas performance by gender was about the same in both groups.
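The gap metric used here is simply the difference in group mean scores, in percentage points, computed within each FA condition. A minimal sketch with hypothetical scores (not the study's data) illustrates the computation:

```python
# Opportunity gap: difference in mean exam score between demographic
# groups, computed separately for the treatment and comparison conditions.
import pandas as pd

scores = pd.DataFrame({
    "condition": ["trt"] * 4 + ["ctl"] * 4,
    "firstgen":  [0, 0, 1, 1, 0, 0, 1, 1],           # 0 = non-first-gen, 1 = first-gen
    "exam1":     [79, 77, 72, 74, 74, 72, 60, 62],   # hypothetical percentages
})

means = scores.groupby(["condition", "firstgen"])["exam1"].mean().unstack()
gap = means[0] - means[1]  # non-first-gen minus first-gen, percentage points
print(gap)  # one gap per condition; a smaller trt value means a narrowed gap
```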

Discussion and conclusions

The goal of this study was to investigate the effect of blending the muddiest point activities with the common FAs on performance in the General Chemistry I course. Results indicated statistically significant differences in performance scores between the treatment group and the comparison group on Exams 1, 2, and 3, but not on the final exam. The calculated effect sizes suggest that the average student in the treatment group scored 0.2 and 0.3 standard deviations above the average student in the comparison group on Exams 2 and 3, thus exceeding the scores of 58% and 62% of the participants in the comparison group, respectively. Furthermore, the average student in the treatment group scored 0.4 standard deviations above the average student in the comparison group on Exam 1, exceeding the scores of 66% of participants in the comparison group.
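These percentile statements follow directly from the effect sizes: under a normality assumption, the standing of the average treatment student within the comparison distribution is the standard normal CDF evaluated at Cohen's d, as the short check below shows:

```python
# Convert Cohen's d to the percentile of the comparison distribution
# exceeded by the average treatment student (normality assumed).
from scipy.stats import norm

for exam, d in [("Exam 1", 0.4), ("Exam 2", 0.2), ("Exam 3", 0.3)]:
    print(f"{exam}: d = {d} -> exceeds ~{norm.cdf(d) * 100:.0f}% of the comparison group")
```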

In contrast to studies that have reported no significant impact of the muddiest point on performance in general chemistry courses (Snead, 2017), the current results indicate a significant metacognitive impact of the muddiest point activities, particularly on the midterm exams. One explanation for this difference might be the adequate time provided to the participants to interact with, reflect on, and review the interleaved chapter concepts over 1 to 1.5 weeks (medium-cycle assessment) versus the short-cycle reflection immediately after a lesson. This elapsed time allowed participants to encounter real barriers in their learning and hence to accurately gauge the muddiest points and receive feedback. Learning is a process that requires time to make mistakes and learn from them, and time for the material to be retained. It is therefore imperative that students be afforded enough time to ‘digest’, process, and reflect on learned material to effectively identify the gaps in learning.

Additionally, descriptive statistics revealed that the widest gap between the two groups in the percentage of ABC grades (18% difference) and DF grades (17% difference) was associated with Exam 1. One explanation for this observation is the closing of the learning gaps related to significant figures. Approximately 29% of the quantitative problems required students to report answers to the correct number of significant digits. Significant figures were the second most frequently reported muddiest point in Chapter 1 (37.4% of responses), particularly the application of significant figure rules in multistep calculations involving multiple mathematical operations, such as addition or subtraction followed by multiplication combined with division. The reported gaps were addressed by revisiting the completed homework problems in which the instructor had observed students' struggles with significant figure calculations; the struggles were exacerbated in problem sets involving multiple mathematical operations. The instructor adapted the instruction in the subsequent lessons to intentionally point out significant digits and required students to identify and call out the significant figures during quantitative problem-solving in the subsequent chapters.

An independent t-test performed on the mean scores of the treatment and comparison groups on the significant figure questions in Exam 1 (two multiple-choice and five short-answer questions) indicated a statistically significant mean difference between the two groups (p < 0.001). In particular, the treatment group showed a higher mean score (M = 25.3, N = 116, SD = 4.3) than the comparison group (M = 22.4, N = 218, SD = 6.3) on the seven questions. Further scrutiny of the written solutions to the short-answer questions indicated that many students in both groups demonstrated correct thought processes in solving the significant figure-related questions; the differences in the scores lay mainly in the reported digits, with the treatment group more often reporting the correct number of significant figures than the comparison group. This finding suggests that addressing the muddy points on significant figures was beneficial to students.

The statistically significant (p < 0.05) main effect of the FA type on Exam 1 and the significant interactions of the FA type and race/ethnicity, and of the FA type, race/ethnicity, and first-generation status, on Exam 1, after controlling for ACT Math scores, suggest that the greatest benefits of the muddiest point activities were experienced on Exam 1 by all students; however, minority first-generation students in the treatment group, on average, benefited the most compared to other demographic groups. Generally, compared to non-first-generation and majority students, first-generation and minority students are less likely to have role models who can guide them toward effective study strategies and resources for excelling in their courses, particularly during the first semester of transitioning into college. A study involving first-year students at a large research university found that first-generation students were less likely to ask questions in class or contribute to class discussion compared to non-first-generation students (Soria and Stebleton, 2012). Similarly, Muteti et al. found that most students enrolled in the introductory general chemistry course mainly employed rote memorization, with reflective learning reported by a handful of the study participants (Muteti et al., 2021). Muddiest point activities enhance metacognitive regulation skills by encouraging students to reflect on and monitor their learning progress (Mutambuki et al., 2020; Muteti et al., 2021). Therefore, it is likely that introducing the muddiest point activities early in the semester leveled the playing field for the vulnerable students, that is, minority and minority first-generation students, to master the concepts and attain better scores. The narrowing of the opportunity gap by the comparison group on Exam 2 can be attributed partly to the learners' acclimation to the learning environment and growth in cognitive maturity: as learners adapt to the learning environment, they are likely to realize their potential to improve and make positive choices toward learning.

The significant interaction effect of the intervention (FA type), gender, and first-generation status on Exam 3, after controlling for ACT Math scores, indicates that the intervention favored female first-generation college students over other demographic groups on this exam. This finding shows the potential of the muddiest point activities to bolster the mastery of cognitively demanding chemistry concepts among this vulnerable student group. Exam 3 assessed students' mastery of concepts from three chapters that were more complex than those assessed in Exams 1 and 2: thermochemistry (enthalpies of reactions), gases (gas mixture calculations/partial pressures), and solutions and aqueous reactions (redox reactions and assigning oxidation states) (see Table 1 in the ESI). In many cases, the treatment group correctly answered more of the exam questions that correlated with the reviewed muddiest points from the three chapters than the comparison group.

Overall, female, minority, and first-generation students, as well as male students, in the treatment group answered more questions correctly than their counterparts in the comparison group. These results reveal that creating opportunities for students to identify gaps in learning on specific chapter units, and closing those gaps in real time, can improve students' mastery of specific concepts and significantly improve the performance of students from marginalized groups. Although the significant p-values observed for the interactions between the FA type and the demographic variables indicate that the muddiest point activities work, the effect sizes as measured by partial eta squared are small, suggesting a modest, though significant, impact of the intervention on the opportunity gap. Currently, there are limited studies reporting effect sizes of muddiest point interventions for comparison. Replication of the current study at other institutions and in other STEM courses, with larger populations of the demographic groups described herein, should be considered in future research to produce more conclusive evidence of the effects of muddiest point activities in narrowing the opportunity gap.

Results from the MANCOVA indicated that the intervention did not have a significant effect on performance on the final cumulative exam for any of the demographic groups. Similarly, descriptive statistics showed nearly identical performance on the final exam by demographics. The authors attribute the identical performance between the two groups on the final exam to cognitive overload, in that the participants were tested on a total of 10 chapters rather than the 2–3 chapters, or mini-lessons, covered by each midterm. Item analysis on the final exam was performed to understand the reasons for this. The results revealed interesting patterns in performance, especially on the previously untested chapters. Although students in the treatment group performed better on some questions that correlated with previously addressed muddy points, there were many instances in which the comparison group performed better on questions about previously addressed muddy concepts. For instance, a higher percentage of students in the treatment group than in the comparison group correctly answered questions from Chapters 8 and 9, particularly on paramagnetism, identifying quantum numbers and orbitals, and filling orbitals with electrons: 33%, 40%, and 83% versus 15%, 34%, and 78%, respectively. The corresponding addressed muddy concepts accounted for 11% (paramagnetism), 41% (identifying quantum numbers), and 22% (filling orbitals) of the reported responses.

Lower performance scores were noted for the treatment group compared to the comparison group on the questions related to the previously addressed muddy points from the last chapter in particular. These concepts included the polarity of molecules (20% versus 25%), bond strength (33% versus 43%), ionization energies (34% versus 47%), and drawing Lewis dot structures and formal charges (16% versus 20%). With most concepts from the last chapter taught a week before the final exam, students might not have had enough time to master the reviewed muddy concepts while also reviewing concepts from the previously tested chapters. Another explanation for the treatment group's drop in performance on the final exam is the breadth of the assessed learning objectives, which covered concepts from 10 chapters rather than the 2–3 chapters assessed on each midterm exam. A low average inter-item correlation of 0.082 on the final exam also indicated that the test items focused on broader concepts (Clark and Watson, 1995). Average inter-item correlations between 0.15 and 0.50 indicate that items are well correlated and measure the same concepts, whereas inter-item correlations above 0.50 indicate that the items are redundant in measuring a given construct (Clark and Watson, 1995).
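The average inter-item correlation quoted here is simply the mean of the off-diagonal entries of the item correlation matrix. A short sketch, reusing a hypothetical students × items matrix like the one in the earlier reliability example, is:

```python
# Average inter-item correlation: mean of the off-diagonal entries of
# the item correlation matrix. Data are hypothetical 0/1 item scores;
# independent random items should yield a value near zero.
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
exam = pd.DataFrame(rng.integers(0, 2, size=(200, 30)))

corr = exam.corr().to_numpy()
off_diag = corr[~np.eye(corr.shape[0], dtype=bool)]
print(f"average inter-item correlation = {off_diag.mean():.3f}")
```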

Overall, results from the current study indicate that the muddiest points have the potential to level the playing field in general chemistry courses by providing equitable participation and access to learning. Muddiest point activities enhance metacognitive regulation skills by encouraging students to reflect on and monitor their learning progress (Mutambuki et al., 2020; Muteti et al., 2021). In light of the current results, we contend that instructor pre-chosen assessment prompts, such as structured clicker questions, quizzes, and homework, can overlook potential learning blind spots that students may be struggling with without this being recognized in real time. Instructors must consider giving students a “voice” and autonomy in identifying and closing gaps in learning. The results revealed that the muddiest points blended with the common FAs can improve mastery of concepts and performance scores beyond what the usual common FAs alone achieve.

Study limitations

One limitation of the study is the lack of a randomized study design; thus, the results presented herein should be interpreted with caution. Second, the data were obtained from one institution with a specific student population and characteristics, which might differ across other institutions; other researchers can replicate this study at other institutions and in other introductory STEM courses. Third, the instructors differed in prior teaching experience in the General Chemistry I course: the instructor of the comparison group had more years of experience teaching the course than the instructor of the treatment group. This difference in instructors was a confounding variable that unfortunately could not be controlled in the current study.

Finally, the results reported herein are focused on the implementation of medium-cycle muddiest point activities. In addition to replicating the current study, future studies should investigate the effects of implementing different models of muddiest points on student performance in chemistry and other STEM courses. For example, comparing the performance of students exposed to muddiest point activities implemented following three models, the short cycle (at the end of each lesson), the medium cycle (at the end of each chapter), and the long cycle (after covering several chapters and before administering a high-stakes assessment) (Wiliam, 2006), could reveal the most effective model for implementing muddiest point activities in future chemistry and STEM courses. The current study and the few others described herein provide a starting point for a long journey in reforming FAs in postsecondary chemistry education.

Implications for practice

The current results provide insights into future instruction in postsecondary chemistry courses. First, there is a need to rethink how students are assessed in chemistry courses. The results indicate that chemistry instructors can solicit and address muddy points before administering high-stakes assessments. The proximity of content coverage to testing plays a role in how students process and respond to questions. Each of the midterm exams was administered after the students had completed a chunk of content, coupled with the learning strategies, and had at least 5 days to review the material, including the muddy points. This enabled the students to understand the content better and thus to perform relatively better than their counterparts in the comparison group. The students saw the connection between the content and the assessment of their understanding of it. This was different for the final exam, where students were afforded only a short time to review the content beforehand. This finding implies that General Chemistry students are likely to face tough chemistry concepts; however, chunking of content, coupled with solicitation of muddy points, can improve performance on tests.

Second, the results imply that infusing muddiest point activities with common FAs has the potential to bolster performance in general chemistry courses, particularly for first-generation and minority students. Ultimately, this has the potential to increase the progression of marginalized student populations into STEM programs and careers, thus reducing the labor opportunity gap in STEM (NSF, 2019). Finally, confronting the barriers to the opportunity to learn that many marginalized students experience is critical (Boykin and Noguera, 2011). Instructors should be intentional in designing their lessons to include opportunities for soliciting student-generated topics on areas of struggle, especially on the covered content. This will minimize the unidirectional approach to learning and provide a voice for students to communicate their areas of struggle in a timely manner. In turn, this can catalyze students' co-construction of knowledge and narrow the opportunity gap in performance. Overall, this study reveals that soliciting students' barriers to learning can reduce the opportunity gap in General Chemistry courses.

Conflicts of interest

There are no conflicts of interest to declare.

Acknowledgements

This work was supported in part by the Oklahoma State University Foundation through the Edward E. Bartlett Endowed Chair fund.

References

  1. Alanazi A. N., (2021), Quantitative Analysis of Technology Use and Muddiest Point Technique in Undergraduate Chemistry Courses, Drexel University.
  2. Angelo T. A. and Cross K. P., (1993), Minute paper, Classroom assessment techniques: A handbook for college teachers, 2nd edn, San Francisco, CA: Jossey-Bass Publishers, pp. 148–153.
  3. Black P. and Wiliam D., (1998), Assessment and classroom learning, Assess. Educ.: Princip., Pol. Pract., 5, 7–74.
  4. Black P. and Wiliam D., (2010), Inside the black box: Raising standards through classroom assessment, Phi Delta Kappan, 92(1), 81–90.
  5. Botch B., Day R., Vining W., Stewart B., Hart D., Rath K. and Peterfreund A., (2007), Effects on student achievement in general chemistry following participation in an online preparatory course. Chemprep, a voluntary, self-paced, online introduction to chemistry, J. Chem. Educ., 84, 547–553.
  6. Boykin A. W. and Noguera P., (2011), Creating the opportunity to learn: Moving from research to practice to close the achievement gap, Alexandria, VA: The Association for Supervision and Curriculum Development (ASCD).
  7. Carberry A., Krause S., Ankeny C. and Waters C., (2013), “Unmuddying” course content using muddiest point reflections.
  8. Clark L. A. and Watson D., (1995), Constructing validity: Basic issues in objective scale development, Psychol. Assess., 7, 309–319.
  9. Cohen J., (2013), Statistical power analysis for the behavioral sciences, Academic Press.
  10. Cortina J. M., (1993), What is coefficient alpha? An examination of theory and applications, J. Appl. Psychol., 78(1), 98–104.
  11. Costa D. S., Mullan B. A., Kothe E. J. and Butow P., (2010), A web-based formative assessment tool for Masters students: A pilot study, Comput. Educ., 54, 1248–1253.
  12. De Vaus D., (2013), Surveys in Social Research, 6th edn, London: Routledge.
  13. DeVon H. A., Block M. E., Moyle-Wright P., Ernst D. M., Hayden S. J., Lazzara D. J. and Kostas-Polston E., (2007), A psychometric toolbox for testing validity and reliability, J. Nurs. Scholar., 39(2), 155–164.
  14. Freasier B., Collins G. and Newitt P., (2003), A web-based interactive homework quiz and tutorial package to motivate undergraduate chemistry students and improve learning, J. Chem. Educ., 80, 1344–1347.
  15. Freeman S., O'Connor E., Parks J. W., Cunningham M., Hurley D., Haak D., Dirks C. and Wenderoth M. P., (2007), Prescribed active learning increases performance in introductory biology, CBE Life Sci. Educ., 6, 132–139.
  16. Gardner J., (2005), Barriers influencing the success of racial and ethnic minority students in nursing programs, J. Transcul. Nurs., 16, 155–162.
  17. George D. and Mallery P., (2019), IBM SPSS Statistics 26 step by step: A simple guide and reference, Routledge.
  18. Harlen W., (2013), Assessment & Inquiry-Based Science Education, Trieste, Italy: Global Network of Science Academies (IAP) Science Education Program (SEP).
  19. Heritage M., (2010), Formative assessment: Making it happen in the classroom, Corwin Press.
  20. Johnson R. S., (2002), Using data to close the achievement gap: How to measure equity in our schools, Thousand Oaks, CA: Corwin Press.
  21. King D. B., (2011), Using clickers to identify the muddiest points in large chemistry classes, J. Chem. Educ., 88, 1485–1488.
  22. Krause S. J., Baker D. R., Carberry A. R., Alford T. L., Ankeny C. J., Koretsky M., Brooks B. J., Waters C., Gibbons B. J. and Maass S., (2014), Characterizing and addressing student learning issues and misconceptions (SLIM) with muddiest point reflections and fast formative feedback.
  23. Ladson-Billings G., (2006), From the achievement gap to the education debt: Understanding achievement in U.S. schools, Educ. Res., 35(7), 3–12.
  24. Leinhardt G., Cuadros J. and Yaron D., (2007), “One firm spot”: The role of homework as lever in acquiring conceptual and performance competence in college chemistry, J. Chem. Educ., 84, 1047–1052.
  25. Malik K., Martinez N., Romero J., Schubel S. and Janowicz P. A., (2014), Mixed-methods study of online and written organic chemistry homework, J. Chem. Educ., 91, 1804–1809.
  26. Mosteller F., (1989), The ‘Muddiest Point in the Lecture’ as a feedback device, On Teaching and Learning: The Journal of the Harvard-Danforth Center, 3, 10–21.
  27. Mutambuki J. M., Mwavita M., Muteti C. Z., Jacob B. I. and Mohanty S., (2020), Metacognition and active learning combination reveals better performance on cognitively demanding general chemistry concepts than active learning alone, J. Chem. Educ., 97(7), 1832–1840.
  28. Muteti C. Z., Zarraga C., Jacob B. I., Mwarumba T. M., Nkhata D. B., Mwavita M., Mohanty S. and Mutambuki J. M., (2021), I realized what I was doing was not working: the influence of explicit teaching of metacognition on students’ study strategies in a general chemistry I course, Chem. Educ. Res. Pract., 22, 122–135.
  29. Napierala M. A., (2012), What is the Bonferroni correction? AAOS Now, 6(4), 40.
  30. National Science Foundation, National Center for Science and Engineering Statistics, (2019), Women, Minorities, and Persons with Disabilities in Science and Engineering: 2019, Alexandria, VA: NSF.
  31. Nevo B., (1985), Face validity revisited, J. Educ. Meas., 22(4), 287–293.
  32. Nilson L. B., (2016), Teaching at its best: A research-based resource for college instructors, John Wiley & Sons.
  33. Pituch K. A. and Stevens J. P., (2015), Applied multivariate statistics for the social sciences: Analyses with SAS and IBM's SPSS, Routledge.
  34. Richards-Babb M., Drelick J., Henry Z. and Robertson-Honecker J., (2011), Online homework, help or hindrance? What students think and how they perform, J. Coll. Sci. Teach., 40, 81–93.
  35. Rivera-Goba M. V. and Nieto S., (2007), Mentoring Latina nurses: A multicultural perspective, J. Latinos Educ., 6, 35–53.
  36. Snead L. P., (2017), The Effect of Using the Muddiest Point Technique in a Large General Chemistry Class, Drexel University.
  37. Soria K. M. and Stebleton M. J., (2012), First-generation students' academic engagement and retention, Teach. High. Educ., 17, 673–685.
  38. Steadman M., (1998), Using classroom assessment to change both teaching and learning, New Dir. Teach. Learn., 75, 23–35.
  39. Waters C., Krause S. J., Callahan J., Dupen B., Vollaro M. B. and Weeks P., (2016), Revealing student misconceptions and instructor blind spots with muddiest point formative feedback.
  40. Wiliam D., (2006), Formative assessment: Getting the focus right, Educ. Assess., 11, 283–289.
  41. Yaghmaei F., (2003), Content validity and its estimation, J. Med. Educ., 3(1), 25–27.
  42. Yalaki Y., (2010), Simple formative assessment, high learning gains in college general chemistry, Eurasian J. Educ. Res., 40, 223–241.
  43. Yalaki Y. and Bayram Z., (2015), Effect of formative quizzes on teacher candidates’ learning in general chemistry, IJRES, 1, 151–156.

Footnote

Electronic supplementary information (ESI) available. See DOI: 10.1039/d1rp00314c

This journal is © The Royal Society of Chemistry 2022