Student perceptions of immediate feedback testing in student centered chemistry classes

Jamie L. Schneider*ᵃ, Suzanne M. Ruderᵇ and Christopher F. Bauerᶜ
ᵃ Department of Chemistry and Biotechnology, University of Wisconsin – River Falls, River Falls, WI 54022, USA. E-mail:
ᵇ Department of Chemistry, Virginia Commonwealth University, Richmond, VA 23284, USA
ᶜ Department of Chemistry, University of New Hampshire, Durham, NH 03824, USA

Received 22nd September 2017, Accepted 8th January 2018

First published on 8th January 2018

Feedback is an important aspect of the learning process. The immediate feedback assessment technique (IF-AT®) form allows students to receive feedback on their answers during a testing event. Studies with introductory psychology students supported both perceived and real student learning gains when this form was used with testing. Knowing that negative student perceptions of innovative classroom techniques can create roadblocks, this research focused on gathering student responses to using IF-AT® forms for testing in general chemistry 1 and organic chemistry 2 classes at several institutions. Students’ perceptions on using the IF-AT® forms and how it influenced their thinking were gathered from a 16-item survey. The results of the student surveys are detailed and implementation strategies for using IF-AT® forms for chemistry testing are also outlined in this article.

Introduction and background

Testing is generally cited as a potential learning tool; however, testing effects are weakened depending on students’ access to and use of feedback from testing (Dunlosky et al., 2013). This is particularly problematic when students repeatedly focus on test scores rather than feedback once a testing event is over, potentially negating any positive testing feedback effects (Crooks, 1988). Feedback given during a testing event has the potential to provide positive effects and to bypass student avoidance of feedback after tests are returned or scores are posted. Earlier studies in introductory psychology courses found both perceived and measurable positive effects with immediate, answer-until-correct (AUC) testing feedback (DiBattista et al., 2004; Brosvic and Epstein, 2007). However, we questioned how students would respond to this type of feedback in introductory chemistry courses, often referred to as “gateway” or “roadblock” courses. Even if positive feedback effects could be measured, strong negative student perceptions of, or behaviors in response to, immediate feedback during chemistry testing could ultimately affect implementation of AUC testing methods. In fact, some chemistry colleagues of the first author expressed hesitation in adopting immediate feedback testing strategies because of concerns over negatively affecting students’ morale when incorrect answers were obtained in these high stakes environments. Additionally, some expressed concern that students would just give up and guess when their answers were incorrect. These are not new concerns about assessment feedback. Higgins reported that students complained that getting feedback was “demoralizing” and that they didn’t even know how to respond to it (Higgins, 2014). This paper will present the results of a student survey designed to capture student responses to getting immediate AUC feedback during chemistry testing.
The results described in this paper are an important step in determining the feasibility of using immediate, answer-until-correct (AUC) feedback for testing in both general chemistry and organic chemistry courses.

Multiple choice testing and immediate feedback assessment technique (IF-AT®)

The multiple choice (MC) format is a common formal assessment testing method that is especially useful for large classes (Henriques et al., 2006). The MC format saves time in grading, is reliable, and gives students practice preparing for standardized testing. Some disadvantages to using the MC format are that it is challenging to effectively test students’ critical thinking and problem solving skills, and many answers can be guessed correctly without a full understanding of the material. Although students generally receive information on which questions they got wrong, feedback is often limited to what the correct answer is and not on their understanding of the concepts. In the MC testing format, partial credit is typically not given for some understanding of the concepts. A strategy has been reported for giving partial credit on ACS exams (Grunert et al., 2013), but it is unclear how widely partial credit schemes like this have been adopted in chemistry testing. AUC (Pressey, 1926) exam response formats give students the opportunity to continue to answer an MC question until they arrive at the correct answer, thus allowing partial credit to be awarded (Slepkov et al., 2016). Computer based testing supports AUC exam response formats; however, computer testing in class is often not feasible, especially in the large class setting.

One form of AUC format that can be used with traditional paper and pencil tests is the Immediate Feedback Assessment Technique (IF-AT®) (Epstein et al., 2001; Epstein et al., 2002). This manufactured multiple-choice answer sheet contains a row of selections where each answer choice is covered with a waxy coating like that of a lottery ticket. Students scratch off the coating of their answer choice: they obtain immediate feedback because the correct answer is marked with a star while incorrect answer spaces are blank. If the correct answer is not chosen on the first attempt, students may continue to scratch until the correct answer is obtained, thus allowing partial credit to be assigned based on the number of spaces that were scratched. An example of how a student might complete an IF-AT® form is shown in Fig. 1. In this example, full credit of 4 points was awarded when obtaining the correct answer after one scratch, half credit (2 points) after two scratches, and 1 point after three scratches. Instructors may choose how many points to distribute to incorrect answers. Slepkov et al. (2016) explored the validity of offering partial credit when AUC formats were used with multiple-choice testing in general chemistry II. With a scoring scheme [1, 0.5, 0.1, 0.0] that awards the highest fractions of credit for the first or second scratches, they found test scores increased by about 6–7% compared to a dichotomously administered MC test. Additionally, they found that the partial credit was awarded in a discriminating way, such that student performance and correction of initial errors were highly correlated. They concluded that these data support the view that students using AUC formats are less likely to be just guessing on subsequent answers; however, they did not survey or interview the students to verify these practices.

Fig. 1 IF-AT® example.
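The partial-credit arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function names and the five-item scratch record are hypothetical, while the two point schemes are the ones quoted in the text (4, 2, 1, 0 for the Fig. 1 example and [1, 0.5, 0.1, 0.0] from Slepkov et al., 2016).

```python
# Credit awarded by attempt number (1st, 2nd, 3rd, 4th scratch).
# TEST_SCHEME mirrors the 4/2/1/0 example in the text; SLEPKOV_SCHEME is the
# fractional scheme quoted from Slepkov et al. (2016).
TEST_SCHEME = (4, 2, 1, 0)
SLEPKOV_SCHEME = (1.0, 0.5, 0.1, 0.0)


def item_score(scratches, scheme):
    """Credit for one item whose correct answer appeared on scratch number `scratches`."""
    if not 1 <= scratches <= len(scheme):
        raise ValueError("scratches must be between 1 and the number of answer options")
    return scheme[scratches - 1]


def form_score(scratch_counts, scheme):
    """Total credit for a form, given the number of scratches used on each item."""
    return sum(item_score(n, scheme) for n in scratch_counts)


# Hypothetical student record: five items answered on the 1st, 1st, 2nd, 3rd and 4th scratch.
counts = [1, 1, 2, 3, 4]
total = form_score(counts, TEST_SCHEME)       # 4 + 4 + 2 + 1 + 0
maximum = len(counts) * TEST_SCHEME[0]        # full credit on every item
fractional = form_score(counts, SLEPKOV_SCHEME)
```

Keeping the scheme as a parameter reflects the point that instructors may choose their own distribution of points across attempts.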

In introductory psychology courses, DiBattista found that students had a very positive reaction to using IF-AT® forms for testing (DiBattista et al., 2004). In particular, students thought the forms were easy to use and expressed a strong desire to continue using IF-AT® forms for all multiple-choice tests. DiBattista describes the Immediate Feedback Assessment Technique as a learner-centered multiple-choice response form (DiBattista, 2005). A large majority of these psychology students noted that using the IF-AT® forms made taking the test seem something like a game, while still contributing to their learning. Additionally, the level of students’ test anxiety in introductory psychology courses while using the IF-AT® forms was investigated (DiBattista and Gosse, 2006). They found that use of the IF-AT® form did not increase test anxiety in general. In fact, student test anxiety was reduced when they got an answer correct. Students who reported high levels of test anxiety continued to have similar test anxiety levels with IF-AT® testing as in regular testing situations.

Although our paper is focused on student response to IF-AT® forms for testing, other articles have raised instructor concerns of cost, test prep time, and test grading time (DiBattista et al., 2004; DiBattista, 2005; Slepkov et al., 2016). These are additional considerations when choosing this type of assessment response form over machine graded response forms for multiple-choice tests.

Formative assessment and IF-AT® forms

There are several reports on the use of IF-AT® forms with problem based learning in science courses (biochemistry and introductory biology) (White, 2005; Cotner et al., 2008; Carmichael, 2009). These articles focus on using IF-AT® forms for group quizzing during classroom activities. Their overall conclusion was that students responded positively to using IF-AT® forms for this type of formative assessment during a learning activity. In addition, Cotner et al. (2008) reported that students felt the IF-AT® forms helped them reveal misconceptions and perform better on exams. In an introductory accounting class, students worked in groups to come to consensus answers using the IF-AT® forms (Mohrweis and Shinham, 2015). The authors found that using the scratch-off forms during group work had positive effects on student learning. Written comments from students about the forms were all positive and stated that the forms helped them learn. We were interested in student perceptions in courses that utilized IF-AT® forms for higher stakes testing environments in chemistry, in addition to lower stakes formative assessments with IF-AT® forms.

Theoretical framework

Corrective feedback, timing and engagement

Although it is generally agreed that feedback is important to the learning process (Bransford et al., 1999), many instructors are not able to spend significant time providing individual feedback to their students. Feedback may include providing the answers to the questions or more extensive feedback where the answers are coupled with an explanation. However, most instructors leave it up to students to make a choice about how and when to use this delayed feedback. Corrective testing feedback (feedback that provides the correct answers) has the potential to support learning, especially for courses with a hierarchical structure (Dempster, 1989; Schneider et al., 2014) like introductory and organic chemistry, where early course topics directly build into later ones. Corrective feedback is especially important when using multiple-choice testing because reading or endorsing the incorrect distractor options can result in acquisition of incorrect knowledge if not corrected (Roediger III and Marsh, 2005; Butler et al., 2006). Minimizing effects of choosing incorrect responses is particularly important for less prepared students who experience larger negative effects (Butler and Roediger III, 2008). Immediate answer-until-correct feedback formats were originally developed in 1926 (Pressey, 1926). A theory supporting the effectiveness of immediate feedback originates from behaviorist theories of reinforcement (Skinner, 1954). Offering immediate corrective feedback reduces interference from storing incorrect answers, an effect particularly important for students with learning challenges (Epstein et al., 2003). Corrective feedback may also limit repetition of retrieval failures (Roediger III and Marsh, 2005).

Several classroom studies done using the IF-AT® forms for testing in introductory psychology courses showed improved student learning upon repeat testing (Epstein et al., 2002; Dihoff et al., 2003; Dihoff et al., 2004; Brosvic et al., 2005). In a more extensive study involving 611 introductory psychology students, students answered course test items with a bubble form or IF-AT® form (Brosvic and Epstein, 2007). The control group was given their scored answer sheets a day after the test. Performance on novel or repeated items on the final exam as well as performance at 3–12 months after the final exam was then studied. The immediate feedback using IF-AT® forms showed significant gains in repeat test items on the final exam compared to the control group. Closer examination of the response patterns showed that students were more likely to get answers correct on both exams and to change incorrect answers to correct answers upon repeat testing using the IF-AT® form. The patterns on the 3–12 month post-final exam testing revealed similar patterns albeit smaller gain due to attrition.

In addition to the view that feedback is useful for correction of errors, a more holistic viewpoint of feedback is summarized by Mulliner and Tucker (2017). They cite numerous studies that stress the importance of feedback that feeds forward, encourages further motivation for learning, and helps students identify gaps between desired and actual performance (Mulliner and Tucker, 2017). Effective feedback involves student engagement and action toward future work.

Focus of study

Herein we outline how IF-AT® forms were utilized for quizzing and testing in general and organic chemistry classrooms, and how students responded to their use. We chose to assess the use of these forms in interactive classrooms that utilized student centered teaching practices. These classrooms include frequent formative assessment techniques, so adding feedback to the testing events was a natural extension of these classroom practices. The teaching practices included lectures, interactive problem solving sessions often facilitated through classroom response systems, small group worksheets, think-pair-share sessions, and Process Oriented Guided Inquiry Learning (POGIL®) (Farrell et al., 1999) activities with more formal group learning. Pedagogy based on guided inquiry learning lends itself well to immediate feedback techniques. Guided inquiry learning is based on constructivist theory (Vygotsky, 1978), where students work on activities that guide them to construct new information in the classroom.

Results of student surveys from participants in the aforementioned courses at four institutions taught by five different faculty will be discussed. We were particularly interested in determining if students reported negative test taking behaviors like guessing or negative morale issues with obtaining immediate feedback during chemistry testing. We also share some insights gathered from the faculty as they wrote multiple-choice exams for use with IF-AT® forms for chemistry testing.



This study took place in general chemistry and organic chemistry courses at four different institutions with five different instructors, as shown in Table 1. Three of the institutions (Institutions A, B and C) were small to medium sized comprehensive and predominantly undergraduate institutions (PUI) with limited age and ethnic diversity, although there was some diversity in background and ability. All three institutions offered only bachelor's degrees in chemistry. The fourth institution (Institution D) was a large R1 school with BS and PhD programs in chemistry. The undergraduate population at Institution D was diverse in all aspects including age, race, ethnicity, background, and ability. The general and organic chemistry courses at all institutions under study were taught predominantly with interactive lecture styles as well as with POGIL® pedagogy.
Table 1 Study settings

| | Institution A | Institution B | Institution C (General Chemistry) | Institution C (Organic Chemistry) | Institution D |
|---|---|---|---|---|---|
| Institution size/type | Medium sized comprehensive | Medium sized PUI | Small sized comprehensive | Small sized comprehensive | Large sized R1 |
| Course | General Chemistry | Organic Chemistry | General Chemistry | Organic Chemistry | Organic Chemistry |
| Class size | 110, 110, 79 | 42 | 34 | 40 | 110 |
| Instructor | Green | Purple | Red | Yellow | Blue |
| Dominant pedagogy | POGIL® | POGIL® | Interactive lecture | Interactive lecture | POGIL® |
| Testing type used with IF-AT® forms | Tests, few group quizzes | Tests | Tests | Tests, few group and individual quizzes | Tests, regular group quiz |
| Participants | 96, 98, 65 | 32 | 26 | 38 | 100 |

At the beginning of the term under study, all instructors assumed that students had no prior knowledge of the IF-AT® form so class time was used to describe the format in detail. In order to help ease student anxiety over a new testing format, the IF-AT® forms were first used for a group multiple-choice quiz in a peer-supported environment. These were counted for credit, most often for a combination of participation credit and credit for correct responses. Students were told to scratch their answers after answering each question and to rethink the question if they were wrong before selecting their next answer choice. This preparation was completed prior to administration of the first test using the IF-AT® forms.

Some instructors in this study used the IF-AT® forms only for testing after that initial group quiz introduction, while others used the forms for testing, individual quizzing and group quizzing. For each unit test, multiple-choice questions (answered using IF-AT® forms) made up about 20–60% of points with the remainder of the test consisting of short answer problems. Point distributions for correct answers varied between the institutions, but all instructors gave some partial credit for not obtaining the correct answer on the first try. A common point distribution was 4, 2, 1, or 0 points awarded for correct answers given on the first, second, third, or fourth attempt, respectively. In the organic chemistry course at Institution D, the same distribution was used on tests, but no credit was awarded for the third or fourth attempts on quizzes.

The frequency of quizzing (both individual and team-based) with IF-AT® forms varied between the different courses in this study. Most of the institutions gave quizzes using the IF-AT® forms irregularly, often reporting fewer than five quizzing events. The organic chemistry course at Institution D had the most systematic approach to quizzing. Quizzes were given daily at the beginning of the class following completion of each class activity. Quizzes were either short clicker quizzes (4 points), individual paper quizzes (10 points) or group quizzes (10 points) using the IF-AT® forms. Half of the 10 point quizzes were group quizzes, given to groups or teams of students. In group quizzes, each group received one printout of the quiz and one IF-AT® answer sheet. As is typical in a POGIL® classroom, roles were assigned to all group members. During a group quiz, the student assigned the role of manager was given the quiz and was asked to read the quiz question to the other group members. The student assigned the role of presenter made sure that each group member had a say in the discussion about the quiz, and finally the student assigned the role of recorder was responsible for recording (or scratching) the answer on the IF-AT® form. Using this technique, groups received full credit for getting the correct answer on the first attempt and half credit for getting the correct answer on the second attempt. The other courses used group quizzing in the beginning of the term to introduce the IF-AT® forms and occasionally during the term. The dominant use of the IF-AT® forms in all the courses in this study was for testing (higher stakes unit exams). All students in this study used the IF-AT® forms for testing, and some used them for both testing and quizzing. Surveys were based on overall student perceptions of the forms.

Data collection

At the end of the term in which the IF-AT® forms were used, volunteer students anonymously completed a 16-item survey concerning their reaction to using these forms. At Institution D, 14 of the 16 survey questions were included as part of an online survey available to students during the last two weeks of class; items 3 and 16 from Table 2 were omitted. These questions were omitted in order to shorten the student survey, which included additional items after these survey questions. At Institutions A, B and C, the classes surveyed used a paper/pencil version of the survey with all 16 items. The survey was developed through private conversations with the developer of the IF-AT® forms (Dr Michael Epstein), modifications of published IF-AT® survey items (Dihoff et al., 2003; DiBattista et al., 2004; DiBattista and Gosse, 2006), and writing some original items. The survey questions focused on student feedback regarding some of the instructor concerns of negative student testing behaviors and student morale issues upon receiving feedback during testing. In addition, questions were included to elicit student perceptions of learning from immediate feedback.
Table 2 Survey questions used to assess student reactions to IF-AT® forms
# Question content
1 The IF-AT® form helped reinforce material that I knew.
2 The IF-AT® form helped me learn material I was unsure of.
3 With this type of exam I don’t have to study as much to do well.
4 The IF-AT® form had a positive effect on my morale because I gained confidence with each correct answer I scratched off.
5 The IF-AT® form had a positive effect on my morale because I knew I could get partial credit even if I didn’t get it right the first time.
6 Giving partial credit is unfair to students who really know the material.
7 The IF-AT® form helped me on the tests because knowing the right answer to some questions helped to guide me to the right answer in other questions.
8 The IF-AT® form had a negative effect on my morale because I got more and more anxious with each question, knowing that I’d already gotten some wrong.
9 When I got a question wrong on the first scratch, I often gave up and guessed at the correct answer.
10 The IF-AT® form had a negative effect on my morale because I lost confidence when I discovered I was wrong. I would rather not have known.
11 I would have rather used Scantron forms for the multiple choice portions of the exam.
12 I like the fact that I can score the multiple choice portion of the exam right away to see how I did.
13 I usually do better on the problem/fill in the blank portions of the exams.
14 Before I was willing to mark an answer, I found myself rechecking answers more often with IF-AT® forms compared to scantron forms.
15 Getting a question wrong challenged me to rethink about the question and try to logically choose a better answer.
16 The IF-AT® form helped me retain correct information.

All procedures in this study were reviewed and approved by Institutional Review Boards at all of the institutions. Statistical analyses were carried out using a current package of SPSS and Excel for Windows.

Results and discussion

Students’ experiences using IF-AT® forms

At the end of the semester in which the IF-AT® forms were used, students in the general and organic chemistry courses were surveyed on their experiences using IF-AT® forms for quizzes and exams. The 16-item anonymous paper/pencil survey was completed by 285 general chemistry students at Institutions A and C, and 70 organic chemistry students at Institutions B and C. A 14-item anonymous online survey (16-item survey minus items 3 and 16) was completed by 100 organic chemistry students at Institution D. All of the courses had a high response rate, with over 76% of the students completing the surveys. Table 2 lists the questions included on the survey. Student responses were collected using a five-point Likert scale, where 5 = strongly agree, 4 = agree, 3 = neutral, 2 = disagree and 1 = strongly disagree. Per the data management plan in the NSF grant (DUE 1140914), the descriptive statistics for the student responses are included in the Appendix (Tables 6–8) of this article. Fig. 2 illustrates the average responses for each survey item based on the course type (general chemistry 1, organic chemistry 2 (paper/pencil), and organic chemistry 2 (online)). In general, students agreed with items 1, 2, 4, 5, 7, 12, 14, 15, and 16. Students were neutral towards items 8 and 13, and they disagreed with items 3, 6, 9, 10, and 11.
Fig. 2 Average survey item responses by course type.
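The reported response rates can be checked against the class sizes and participant counts in Table 1. The short sketch below does so; it assumes (this is not stated in the source) that the three class sizes and three participant counts listed for Institution A refer to the same three sections in the same order, and the section labels are hypothetical.

```python
# (class size, survey participants) per course section, transcribed from Table 1.
# Assumption: the Institution A values pair up in the order listed.
sections = {
    "A General Chemistry, section 1": (110, 96),
    "A General Chemistry, section 2": (110, 98),
    "A General Chemistry, section 3": (79, 65),
    "B Organic Chemistry": (42, 32),
    "C General Chemistry": (34, 26),
    "C Organic Chemistry": (40, 38),
    "D Organic Chemistry": (110, 100),
}

# Response rate in percent for each section.
rates = {name: 100 * done / size for name, (size, done) in sections.items()}

lowest = min(rates.values())
total_participants = sum(done for _, done in sections.values())

for name, rate in rates.items():
    print(f"{name}: {rate:.1f}%")
```

Under that pairing assumption, the lowest section-level rate stays above 76%, and the participant counts add up to the 285 + 70 + 100 respondents reported in the text.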

Principal components analysis (PCA) with Varimax rotation was used to examine the structure of the 16-item survey data from the general chemistry students. Three components had eigenvalues exceeding 1, explaining 40.0%, 7.6%, and 6.9% of the variance, respectively. Upon examining a 3-component solution, the rotated matrix had the loading patterns shown in Table 3. Items 3, 6, and 13 did not load well on any of these components. Upon closer examination of the items, item 11 loaded across components and had a unique topic compared with other items. Therefore, item 11 was not included. Item 15 was included in the third component because it had a stronger loading there than in the other components.
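The link between the eigenvalue cutoff and the reported percentages can be made explicit with a small calculation. The sketch assumes the PCA was run on the item correlation matrix (a common default, but not stated in the source), in which case the eigenvalues sum to the number of items and each eigenvalue equals the proportion of variance explained times 16. The variable names are illustrative.

```python
N_ITEMS = 16                       # items in the paper/pencil survey
pct_variance = [40.0, 7.6, 6.9]    # percent of variance reported for the three components

# With a correlation-matrix PCA, total variance equals the number of items,
# so each eigenvalue is (percent explained / 100) * N_ITEMS.
eigenvalues = [N_ITEMS * pct / 100 for pct in pct_variance]

# Keep components whose eigenvalue exceeds 1, the cutoff described in the text.
retained = [ev for ev in eigenvalues if ev > 1]

cumulative_pct = sum(pct_variance)  # variance explained by the three components together

for pct, ev in zip(pct_variance, eigenvalues):
    print(f"{pct:.1f}% of variance -> eigenvalue of about {ev:.2f}")
```

Under this assumption, all three reported percentages correspond to eigenvalues above 1, consistent with a three-component solution.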

Table 3 Initial 3-factor solution loading patterns for the general chemistry 1 course data

| Factor | Items loading above 0.4 | Other items with strong loading |
|---|---|---|
| 1 Negative morale | 8, 9, 10, 11 | 15 (negative below −0.4) |
| 2 Positive consequences | 4, 5, 7, 12 | 11 (negative below −0.4) |
| 3 Thinking support | 1, 2, 15, 16 | 14 (almost 0.4) |

The final structure is shown in Table 4. Each component was named according to the semantic sense of its items. “Negative Morale” expresses a sense of negative responses to incorrect answers. “Positive Consequences” describes morale-boosting effects. The last component was called “Thinking Support” because the items suggest an effect on thinking or remembering.

Table 4 Final 3-factor solution with Cronbach alpha values for the general chemistry 1 course data

| Factor | Items loading above 0.4 | Cronbach alpha value |
|---|---|---|
| 1 Negative morale | 8, 9, 10 | 0.762 |
| 2 Positive consequences | 4, 5, 7, 12 | 0.802 |
| 3 Thinking support | 1, 2, 15, 16 | 0.829 |
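For readers unfamiliar with the statistic, Cronbach's alpha for a set of k items is k/(k − 1) × (1 − Σ item variances / variance of the summed score). The sketch below implements that standard formula with the Python standard library; the response matrix is hypothetical and is not drawn from the study data, and the study itself reports using SPSS.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for `rows`, one list of item responses per respondent.

    Every respondent must have answered the same items in the same order.
    """
    k = len(rows[0])
    item_columns = list(zip(*rows))                       # transpose to per-item columns
    sum_item_var = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in rows])      # variance of summed scores
    return k / (k - 1) * (1 - sum_item_var / total_var)


# Hypothetical Likert responses (1-5) from six students to a three-item factor.
responses = [
    [4, 4, 5],
    [2, 2, 3],
    [3, 3, 3],
    [5, 4, 5],
    [1, 2, 2],
    [3, 4, 4],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f}")
```

Values closer to 1 indicate that the items in a factor vary together across respondents, which is the sense of internal consistency used for the factors in Table 4.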

As shown in Table 4, each factor has good internal consistency, with Cronbach alpha coefficients greater than 0.76 reported for the general chemistry data. The organic chemistry survey data from Institution D showed similar Cronbach alpha values for these three components (note: thinking support factor was missing Q16). The organic chemistry paper/pencil survey from Institutions B and C gave Cronbach alpha values that were slightly lower (0.58–0.75), which could be a result of the smaller sample size. The factor structure was assumed to be the same based on reasonable Cronbach alpha values; however, exploratory factor analysis was not performed on the organic samples due to the smaller sample sizes. The component scores (averaging the responses for the items in those components) for each group of students are illustrated in Fig. 3. In general, students mildly to moderately disagreed with negative morale effects. Students agreed with positive consequences and thinking support.

Fig. 3 Average survey factor scores for each group of students.
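Component scores of the kind plotted in Fig. 3 are simple item averages. The sketch below shows one way to compute them, using the item groupings from Table 4. The handling of missing data is an assumption rather than a documented procedure: a score is computed only when the student answered every administered item in the factor, and for the online survey the thinking support score is averaged over the three items that were administered (Q16 was omitted). The function name and the example student are hypothetical.

```python
# Item groupings from the final 3-factor solution (Table 4).
FACTORS = {
    "negative_morale": [8, 9, 10],
    "positive_consequences": [4, 5, 7, 12],
    "thinking_support": [1, 2, 15, 16],
}

PAPER_ITEMS = set(range(1, 17))       # all 16 items
ONLINE_ITEMS = PAPER_ITEMS - {3, 16}  # the online survey omitted items 3 and 16


def factor_scores(responses, administered=PAPER_ITEMS):
    """Average the Likert responses (item number -> 1..5) within each factor.

    Returns None for a factor if the student skipped any administered item in it.
    """
    scores = {}
    for name, items in FACTORS.items():
        used = [i for i in items if i in administered]
        if any(i not in responses for i in used):
            scores[name] = None
        else:
            scores[name] = sum(responses[i] for i in used) / len(used)
    return scores


# Hypothetical online-survey respondent.
student = {1: 4, 2: 4, 4: 5, 5: 4, 7: 4, 8: 2, 9: 2, 10: 3, 12: 5, 15: 3}
online_scores = factor_scores(student, administered=ONLINE_ITEMS)
paper_scores = factor_scores(student)  # item 16 unanswered, so no thinking support score
```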

A one-way between-groups multivariate analysis of variance (MANOVA) was performed to investigate differences between the three student groups (Gen Chem 1, Org Chem 2 (online), and Org Chem 2) across the three survey factors. There was a main effect of student group, F(3,440) = 6.83, p < 0.001; Wilks’ Lambda = 0.91; partial eta squared = 0.44. Pairwise comparisons revealed that negative morale was significantly different among the three data sets (p < 0.05). The organic chemistry (paper/pencil PUI) group was significantly higher (p < 0.05) in positive consequences and thinking support when compared to the other two groups, which were not statistically different from each other.

We also investigated whether students who reported high negative morale effects (>3.5) would have lower positive consequences and thinking support scores than students with lower negative morale scores (<2.5). We were particularly interested in investigating whether students who agreed that the IF-AT® forms contributed to negative morale effects would disagree with positive consequences and thinking supports from using IF-AT® forms. Stated differently, we hypothesized that students who reported negative effects from getting an answer incorrect would report fewer positive effects from correct answers and fewer positive test taking strategies. This hypothesis is supported by the work of several researchers who connect student failure to feelings of hopelessness that ultimately result in loss of student effort (Crooks, 1988). For this analysis, we pooled the general and organic chemistry students together, then separated the students into high and low negative morale scores. As shown in Table 5, students with high negative morale effects had lower positive consequences and thinking support scores as compared with those who reported less negative morale effects. This was further supported by a MANOVA, in which there was a main effect on the three factor scores for high and low negative morale scores, F(3,342) = 254.36, p < 0.001; Wilks’ Lambda = 0.309; partial eta squared = 0.691. Pairwise comparisons revealed significant differences in the positive consequences and thinking support scores between the high and low negative morale groups. However, the high negative morale group expressed a neutral rather than negative average response regarding thinking support and perception of positive consequences, which contrasted with our original hypothesis.
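The effect size reported for the high versus low comparison can be related to Wilks’ Lambda with a short calculation. The sketch uses the relation commonly applied for multivariate partial eta squared, 1 − Λ^(1/s) with s = min(number of dependent variables, effect degrees of freedom); that this is the formula behind the reported value is an assumption, and the function name is illustrative.

```python
def partial_eta_squared(wilks_lambda, n_dependent_vars, df_effect):
    """Multivariate partial eta squared from Wilks' Lambda: 1 - Lambda ** (1 / s)."""
    s = min(n_dependent_vars, df_effect)
    return 1 - wilks_lambda ** (1 / s)


# High vs. low negative morale: two groups (effect df = 1) and three factor scores.
eta_sq = partial_eta_squared(0.309, n_dependent_vars=3, df_effect=1)
print(f"partial eta squared = {eta_sq:.3f}")
```

With two groups, s equals 1, so the effect size reduces to 1 − Λ, which matches the pairing of 0.309 and 0.691 in the text.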

Table 5 Average (standard deviation) of high and low negative morale student factor responses

| Negative morale group | Negative morale | Positive consequences | Thinking support |
|---|---|---|---|
| High (>3.5) | 4.14 (0.45) | 3.35 (0.83) | 3.08 (0.83) |
| Low (<2.5) | 1.88 (0.42) | 4.38 (0.46) | 4.17 (0.54) |
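The grouping behind Table 5 can be sketched as follows. The cutoffs (above 3.5 for high, below 2.5 for low) come from the text; the student records are hypothetical, and the sketch assumes that students with scores between the cutoffs were left out of the comparison, which the text implies but does not state outright.

```python
from statistics import mean


def morale_group(negative_morale_score):
    """Classify a student by negative morale factor score; None means the middle band."""
    if negative_morale_score > 3.5:
        return "high"
    if negative_morale_score < 2.5:
        return "low"
    return None


# Hypothetical factor scores: (negative morale, positive consequences, thinking support).
students = [
    (4.33, 3.25, 3.00),
    (1.67, 4.50, 4.25),
    (3.00, 4.00, 3.75),   # falls between the cutoffs, so it is excluded
    (2.00, 4.25, 4.00),
    (4.00, 3.50, 3.25),
]

groups = {"high": [], "low": []}
for record in students:
    label = morale_group(record[0])
    if label is not None:
        groups[label].append(record)

# Mean positive consequences score in each group.
pos_means = {label: mean(r[1] for r in records) for label, records in groups.items()}
```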

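Table 6 General chemistry 1 descriptive statistics for survey responses

| | Institution A: N | Institution A: Average | Institution A: Std dev. | Institution A: Median | Institution C: N | Institution C: Average | Institution C: Std dev. | Institution C: Median |
|---|---|---|---|---|---|---|---|---|
| Q1 | 259 | 3.75 | 0.899 | 4.00 | 26 | 3.65 | 1.294 | 4.00 |
| Q2 | 259 | 3.51 | 1.047 | 4.00 | 26 | 3.42 | 1.301 | 3.00 |
| Q3 | 259 | 1.82 | 0.743 | 2.00 | 26 | 2.23 | 1.243 | 2.00 |
| Q4 | 256 | 3.77 | 1.085 | 4.00 | 26 | 3.58 | 1.362 | 4.00 |
| Q5 | 259 | 4.03 | 0.908 | 4.00 | 26 | 3.96 | 1.216 | 4.00 |
| Q6 | 259 | 1.89 | 0.782 | 2.00 | 26 | 1.77 | 0.908 | 1.50 |
| Q7 | 259 | 3.80 | 0.791 | 4.00 | 25 | 3.64 | 1.221 | 4.00 |
| Q8 | 258 | 3.45 | 1.080 | 4.00 | 26 | 3.35 | 1.198 | 4.00 |
| Q9 | 259 | 2.30 | 1.008 | 2.00 | 26 | 2.73 | 1.151 | 2.00 |
| Q10 | 259 | 2.62 | 1.157 | 2.00 | 26 | 2.54 | 1.240 | 2.00 |
| Q11 | 259 | 2.25 | 1.155 | 2.00 | 26 | 2.08 | 1.294 | 1.50 |
| Q12 | 259 | 4.15 | 0.886 | 4.00 | 26 | 4.12 | 1.003 | 4.00 |
| Q13 | 259 | 3.17 | 1.059 | 3.00 | 26 | 2.54 | 0.989 | 3.00 |
| Q14 | 259 | 3.81 | 0.952 | 4.00 | 26 | 3.77 | 1.177 | 4.00 |
| Q15 | 259 | 3.93 | 0.772 | 4.00 | 26 | 3.85 | 1.008 | 4.00 |
| Q16 | 259 | 3.44 | 0.940 | 3.00 | 26 | 3.62 | 1.203 | 4.00 |
| Neg. morale factor | 258 | 2.79 | 0.886 | 2.67 | 26 | 2.87 | 1.046 | 2.67 |
| Pos. conseq. factor | 256 | 3.93 | 0.715 | 4.00 | 26 | 3.79 | 1.093 | 4.00 |
| Thinking support factor | 259 | 3.65 | 0.737 | 3.75 | 26 | 3.63 | 1.065 | 3.75 |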

Table 7 Organic chemistry 2 (online) descriptive statistics survey responses

| Institution D | N | Average | Std dev. | Median |
|---|---|---|---|---|
| Q1 | 100 | 3.47 | 1.414 | 4.00 |
| Q2 | 100 | 3.45 | 1.192 | 4.00 |
| Q4 | 100 | 3.70 | 1.176 | 4.00 |
| Q5 | 99 | 3.97 | 0.974 | 4.00 |
| Q6 | 100 | 1.90 | 1.078 | 2.00 |
| Q7 | 100 | 3.80 | 0.841 | 4.00 |
| Q8 | 98 | 3.45 | 1.245 | 4.00 |
| Q9 | 99 | 2.69 | 1.192 | 2.00 |
| Q10 | 98 | 2.90 | 1.272 | 3.00 |
| Q11 | 98 | 2.34 | 1.192 | 2.00 |
| Q12 | 99 | 3.95 | 0.941 | 4.00 |
| Q13 | 98 | 3.17 | 1.219 | 3.00 |
| Q14 | 99 | 3.86 | 0.904 | 4.00 |
| Q15 | 99 | 3.93 | 0.824 | 4.00 |
| Neg. morale factor | 97 | 3.02 | 1.076 | 3.00 |
| Pos. conseq. factor | 98 | 3.87 | 0.775 | 4.00 |
| Thinking support factor | 99 | 3.61 | 0.896 | 3.67 |

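Table 8 Organic chemistry 2 (paper/pencil) descriptive statistics survey responses

| | Institution B: N | Institution B: Average | Institution B: Std dev. | Institution B: Median | Institution C: N | Institution C: Average | Institution C: Std dev. | Institution C: Median |
|---|---|---|---|---|---|---|---|---|
| Q1 | 32 | 4.22 | 0.832 | 4.00 | 38 | 4.32 | 0.620 | 4.00 |
| Q2 | 32 | 4.19 | 0.859 | 4.00 | 38 | 3.97 | 0.636 | 4.00 |
| Q3 | 32 | 1.94 | 0.914 | 2.00 | 38 | 1.89 | 0.764 | 2.00 |
| Q4 | 32 | 4.28 | 0.772 | 4.00 | 38 | 4.42 | 0.722 | 5.00 |
| Q5 | 32 | 4.41 | 0.798 | 5.00 | 38 | 4.47 | 0.830 | 5.00 |
| Q6 | 32 | 1.50 | 0.718 | 1.00 | 38 | 1.71 | 0.694 | 2.00 |
| Q7 | 32 | 4.41 | 0.756 | 5.00 | 38 | 4.16 | 0.679 | 4.00 |
| Q8 | 32 | 2.84 | 1.298 | 3.00 | 38 | 2.61 | 1.079 | 2.50 |
| Q9 | 32 | 2.31 | 1.120 | 2.00 | 38 | 1.92 | 0.784 | 2.00 |
| Q10 | 32 | 1.88 | 0.942 | 2.00 | 38 | 1.74 | 0.950 | 1.50 |
| Q11 | 32 | 1.59 | 0.837 | 1.00 | 38 | 1.58 | 0.889 | 1.00 |
| Q12 | 32 | 4.28 | 0.792 | 5.00 | 38 | 4.39 | 0.790 | 5.00 |
| Q13 | 31 | 2.71 | 1.006 | 3.00 | 38 | 2.79 | 0.741 | 3.00 |
| Q14 | 32 | 4.00 | 1.191 | 4.00 | 38 | 3.71 | 0.984 | 4.00 |
| Q15 | 32 | 4.44 | 0.716 | 5.00 | 38 | 4.18 | 0.801 | 4.00 |
| Q16 | 32 | 3.97 | 0.933 | 4.00 | 38 | 4.11 | 0.649 | 4.00 |
| Neg. morale factor | 32 | 2.34 | 0.940 | 2.50 | 38 | 2.08 | 0.750 | 2.00 |
| Pos. conseq. factor | 32 | 4.37 | 0.462 | 4.50 | 38 | 4.36 | 0.559 | 4.50 |
| Thinking support factor | 32 | 4.20 | 0.597 | 4.25 | 38 | 4.14 | 0.417 | 4.25 |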

Examples of IF-AT questions and response patterns

For both general chemistry and organic chemistry courses, patterns in student answer selections were examined as students received the immediate feedback. The items below illustrate student responses to different types of multiple-choice questions that were used by the instructors. Prior to using IF-AT® forms, the types of problems shown in Fig. 4–6 may have been used infrequently in the multiple-choice format. More often these types of questions would have been included in open-response formats where partial credit could be assigned by the grader if a step in the problem was missed. In order to solve the problem shown in Fig. 4, students need to determine the proportional relationship between the two reactants as well as convert from quantities of one material to the other material. Out of 113 students in general chemistry, 45% answered the question correctly (choice a) on the first attempt. Another 34% answered correctly on the second attempt. The remaining 21% took three or four attempts to arrive at the correct answer. Of those students who took two attempts, 59%, 16%, and 25% chose choices b, c, and d, respectively, for their first attempt. The most common incorrect answer (choice b) did not account for the 2:1 mole ratio between HCl and Ca(OH)2. One common mistaken approach was to use a dilution equation (M1V1 = M2V2), which assumes that moles 1 equal moles 2. Another was to simply assume a 1:1 ratio without first writing a balanced equation. We can only speculate about these common mistaken approaches, based on years of teaching experience. Response process interviews would be necessary to determine the process by which students arrived at their incorrect first choices. The students who answered correctly on their second attempt clearly had some proximate knowledge on their first attempt, and they would possibly have gained partial credit for this problem on an open-response format exam.
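The difference between the balanced-equation approach and the 1:1 shortcut can be illustrated with a short calculation. The sketch below is only an illustration: the concentrations and volume are hypothetical and are not taken from the question in Fig. 4, and the function name hcl_volume_ml is our own. It assumes the neutralization 2 HCl + Ca(OH)2 → CaCl2 + 2 H2O, and it shows that applying M1V1 = M2V2 directly gives the same result as assuming a 1:1 mole ratio.

```python
def hcl_volume_ml(base_molarity, base_volume_ml, acid_molarity, acid_per_base=2):
    """Volume of HCl (mL) needed to neutralize a Ca(OH)2 sample.

    acid_per_base is the HCl : Ca(OH)2 mole ratio. The balanced equation
    2 HCl + Ca(OH)2 -> CaCl2 + 2 H2O gives a ratio of 2.
    """
    moles_base = base_molarity * base_volume_ml / 1000.0  # mol Ca(OH)2
    moles_acid = acid_per_base * moles_base               # mol HCl required
    return moles_acid / acid_molarity * 1000.0            # convert L to mL


# Hypothetical values chosen for illustration only (not from Fig. 4).
BASE_M, BASE_ML, ACID_M = 0.100, 25.0, 0.200

# Balanced-equation approach: uses the 2:1 mole ratio.
correct = hcl_volume_ml(BASE_M, BASE_ML, ACID_M)

# Mistaken approach: assuming 1:1, which is what M1V1 = M2V2 implies.
shortcut = hcl_volume_ml(BASE_M, BASE_ML, ACID_M, acid_per_base=1)
dilution = BASE_M * BASE_ML / ACID_M  # V2 = M1V1 / M2

print(f"2:1 ratio: {correct:.1f} mL, 1:1 shortcut: {shortcut:.1f} mL")
```

With these assumed numbers the shortcut underestimates the required acid volume by a factor of two, which is the kind of distractor a multiple-choice item of this type can include.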
Fig. 4 Example of a multiple step stoichiometry question.

Fig. 5 Example of a conceptual chemistry question.

Fig. 6 Example of linked questions in organic chemistry.

Instructor Green found that the IF-AT® form was helpful for addressing and confronting conceptual chemical understanding. Fig. 5 shows an example of a conceptual question used on a general chemistry exam using IF-AT® forms. Out of 225 students, 70% answered the question correctly in one attempt, 24% used two attempts and 6% used three or more attempts. Of the students who took two attempts to answer correctly, 96% first chose choice d, which could indicate a conceptual misunderstanding of cation/anion charge labeling, namely that the positive and negative assignments were reversed. What is unclear at this time is whether the conflict and resolution created during the exam were enough to produce long-term conceptual change for some students. This question will be grounds for future study.
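Response patterns of this kind can be tabulated directly from the order in which each student scratched the options. The following is a minimal sketch under stated assumptions: each record is a hypothetical string of choices in scratch order ending with the correct choice, the sample records are invented for illustration, and the function name attempt_distribution is our own rather than part of any IF-AT® software.

```python
from collections import Counter


def attempt_distribution(scratch_sequences, correct_choice):
    """Summarize answer-until-correct records for one item.

    Each sequence lists the choices a student scratched, in order, and is
    assumed to end with the correct choice. Returns the percentage of
    students by number of attempts, plus a tally of the first (incorrect)
    choice among students who needed exactly two attempts.
    """
    attempts = Counter()
    first_wrong = Counter()
    for seq in scratch_sequences:
        n = seq.index(correct_choice) + 1  # attempts used to reach the key
        attempts[n] += 1
        if n == 2:
            first_wrong[seq[0]] += 1
    total = len(scratch_sequences)
    percents = {n: 100.0 * c / total for n, c in sorted(attempts.items())}
    return percents, first_wrong


# Invented records for ten students; "a" is assumed to be the correct choice.
records = ["a", "a", "da", "a", "ba", "dca", "a", "da", "a", "a"]
percents, first_wrong = attempt_distribution(records, "a")

print(percents)
print(first_wrong.most_common())
```

A heavily skewed first_wrong tally, as with choice d in the item above, is the signal that points to a specific misconception rather than random guessing.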

Instructor Blue found the ability to link questions within the multiple-choice portion particularly useful, especially when asking a student to explain why a previous answer was chosen or for multistep synthesis questions in organic chemistry. Similar types of linked questions in multiple-choice format have been reported: linked questions referred to as integrated testlets in physics (Slepkov, 2013) and two-tiered questions where the first question refers to the content and the second question asks for a reason (Treagust, 1986; Treagust, 1988). An example pair of linked questions from organic chemistry testing is depicted in Fig. 6. Students are not penalized twice for getting the first part of the question wrong. Instead, they learn the answer to the first part of the question (whether they got full credit or not), which helps them to solve the remaining questions and therefore learn from their initial error.

These data come from a different semester than the survey data included herein. The survey did not ask about linked questions, so we provide the data only as an example of their use. Of the 240 students who answered the linked questions, 73% got the answer for the first question correct on the first try, 13% got it right on the second try and 14% on the third or fourth attempt. For the second question, 86% of the students got the answer correct on the first try, while 9% got it on the second attempt and 5% on the third or fourth attempt. This seems to suggest that students who did not identify the tertiary alkyl halide (a) as the answer in one attempt were able to use the knowledge of the correct answer to successfully complete the second question, which asks why. Of the students who took more than one attempt to correctly answer the first question, 82% improved or took fewer attempts to answer the second question than the first question. Students appear at the very least to be paying attention to the previous question when answering the following linked question and are likely learning from their mistakes in order to improve their performance on the second question. Using IF-AT® forms allows more effective use of linked questions, as students can answer the second part after discovering the answer to the first part. More research is necessary before strong conclusions can be made about student learning from IF-AT® feedback on linked questions.
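The comparison across a linked pair can be expressed as a simple calculation on per-student attempt counts. The sketch below is illustrative only: the attempt pairs are invented, the function name share_improved is our own, and "improved" is interpreted here as taking strictly fewer attempts on the second question than on the first, which is an assumption about how the reported percentage was derived.

```python
def share_improved(attempt_pairs):
    """Fraction of struggling students who did better on the linked item.

    attempt_pairs holds (attempts on question 1, attempts on question 2)
    for each student. Only students who needed more than one attempt on
    question 1 are considered. Returns None if there are no such students.
    """
    struggled = [(q1, q2) for q1, q2 in attempt_pairs if q1 > 1]
    if not struggled:
        return None
    improved = sum(1 for q1, q2 in struggled if q2 < q1)
    return improved / len(struggled)


# Invented attempt counts for seven students (not the study data).
pairs = [(1, 1), (2, 1), (3, 1), (2, 2), (1, 2), (4, 2), (2, 1)]
fraction = share_improved(pairs)

print(f"{fraction:.0%} of students who missed question 1 improved on question 2")
```

Restricting the denominator to students who missed the first question keeps first-try successes from diluting the measure, since those students have no room to improve.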

Limitations of study

This study included students from institutions with different profiles and from different course content areas. We cautiously suggest that our results may generalize to students in general and organic chemistry courses. We did not disaggregate by more specific student characteristics (sex, ethnicity, high school preparation). Since the survey was anonymous, only the general qualities of the students enrolled in the courses were described. In the future, it would be valuable to examine student perceptions as a function of test performance to see whether poorly performing students respond more negatively than highly performing students. The item response patterns were included to provide some evidence of purposeful scratching on the IF-AT® forms. A larger number of items would need to be collected to fully support the benefits proposed in that section of the paper. Student thinking behind their approach to scratching the forms could also be captured more effectively through student interviews.

Implications for teaching and future work

Results from this study indicate that organic and general chemistry students responded positively to using IF-AT® forms for multiple-choice testing. Students also perceived increased learning as a result of the immediate feedback provided by the IF-AT® form. Reported positive morale effects of using the IF-AT® form outweighed the negative morale effects. Lastly, students reported using more metacognitive skills, such as reflection and self-evaluation, with IF-AT® testing compared to Scantron testing.

From our studies, we would encourage others to consider using IF-AT® testing over traditional Scantron testing. The ability to provide immediate feedback during a testing event parallels the focus of most active learning environments. We also plan to further investigate whether student “perception” of learning is actually linked to enhancement of student performance on future exam questions.

Conflicts of interest

There are no conflicts of interest to declare.



Acknowledgements

We thank several people for insightful conversations on the survey development and analysis, including Dr Michael Epstein and Dr Abdulaziz Elfessi (University of Wisconsin-LaCrosse (UW-L) Statistical Consulting Center). Materials support and undergraduate research stipend support for this work were provided by a UW-L Faculty Research Grant and an NSF TUES grant (DUE 1140914). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. Allison Fitzwater worked as an undergraduate student at UW-L organizing some of the data presented in this paper. Support for purchasing the IF-AT® forms was provided by the individual institutions.


References

  1. Bransford J. D., Brown A. L. and Cocking R. R. (ed.), (1999), How People Learn: Brain, Mind, Experience, and School, Washington, DC: National Academy Press.
  2. Brosvic G. M. and Epstein M. L., (2007), Enhancing learning in the introductory course, Psychol. Rec., 57(3), 391–408.
  3. Brosvic G. M., Epstein M. L., Cook M. J. and Dihoff R. E., (2005), Efficacy of error for the correction of initially incorrect assumptions and of feedback for the affirmation of correct responding: learning in the classroom, Psychol. Rec., 55(3), 401–418.
  4. Butler A. C. and Roediger III H. L., (2008), Feedback enhances the positive effects and reduces the negative effects of multiple-choice testing, Mem. Cognit., 36(3), 604–616.
  5. Butler A. C., Marsh E. J., Goode M. K. and Roediger III H. L., (2006), When additional multiple-choice lures aid versus hinder later memory, Appl. Cognit. Psychol., 20, 941–956.
  6. Carmichael J., (2009), Team-based learning enhances performance in introductory biology, J. Coll. Sci. Teach., 38(4), 54–61.
  7. Cotner S., Baepler P. and Kellerman A., (2008), Scratch this! The IF-AT as a technique for stimulating group discussion and exposing misconceptions, J. Coll. Sci. Teach., 37, 48–53.
  8. Crooks T. J., (1988), The impact of classroom evaluation practices on students, Rev. Educ. Res., 58, 438–481.
  9. Dempster F. N., (1989), Spacing effects and their implications for theory and practice, Educ. Psychol. Rev., 1(4), 309–330.
  10. DiBattista D., (2005), The immediate feedback assessment technique: a learner-centered multiple-choice response form, Can. J. High. Educ., 35(4), 111–113.
  11. DiBattista D. and Gosse L., (2006), Test anxiety and the immediate feedback assessment technique, J. Exp. Educ., 74(4), 311–328.
  12. DiBattista D., Mitterer J. O. and Gosse L., (2004), Acceptance by undergraduates of the immediate feedback assessment technique for multiple-choice testing, Teach. High. Educ., 9(1), 17–28.
  13. Dihoff R., Brosvic G. and Epstein M., (2003), The role of feedback during academic testing: the delay retention effect revisited, Psychol. Rec., 53(4), 533–548.
  14. Dihoff R. E., Brosvic G. M., Epstein M. L. and Cook M. J., (2004), Provision of feedback during preparation for academic testing: learning is enhanced by immediate but not delayed feedback, Psychol. Rec., 54(2), 207–231.
  15. Dunlosky J., Rawson K. A., Marsh E. J., Nathan M. J. and Willingham D. T., (2013), Improving students’ learning with effective learning techniques, Psychol. Sci. Public Interest, 14(1), 4–58.
  16. Epstein M. L., Epstein B. B. and Brosvic G. M., (2001), Immediate feedback during academic testing, Psychol. Rep., 88(3), 889–894.
  17. Epstein M. L., Lazarus A. D., Calvano T. B., Matthews K. A., Hendel R. A., Epstein B. B. and Brosvic G. M., (2002), Immediate feedback assessment technique promotes learning and corrects inaccurate first responses, Psychol. Rec., 52(2), 187–201.
  18. Epstein M. L., Brosvic G. M., Costner K. L., Dihoff R. E. and Lazarus A. D., (2003), Effectiveness of feedback during the testing of preschool children, elementary school children, and adolescents with developmental delays, Psychol. Rec., 53(2), 177–195.
  19. Farrell J. J., Moog R. S. and Spencer J. N., (1999), A guided-inquiry general chemistry course, J. Chem. Educ., 76(4), 570–574.
  20. Grunert M., Raker J., Murphy K. and Holme T., (2013), Polytomous versus dichotomous scoring on multiple-choice examinations: development of a rubric for rating partial credit, J. Chem. Educ., 90(10), 1310–1315.
  21. Henriques L., Colburn A. and Ritz W. C., (2006), Assessment in Science Practical Experiences and Education Research, Arlington, VA: NSTA Press.
  22. Higgins S. E., (2014), Formative assessment and feedback to learners, in Proven programs in education: classroom management and assessment, Thousand Oaks, California: Corwin Press, pp. 11–15.
  23. Mohrweis L. and Shinham K., (2015), Enhancing students' learning: instant feedback cards, Am. J. Bus. Educ., 8(1), 63–69.
  24. Mulliner E. and Tucker T., (2017), Feedback on feedback practice: perceptions of students and academics, Assess. Eval. High. Educ., 42(2), 266–288.
  25. Pressey S. L., (1926), A simple device for teaching, testing, and research in learning, Sch. Soc., 23, 373–376.
  26. Roediger III H. L. and Marsh E. J., (2005), The positive and negative consequences of multiple-choice testing, J. Exp. Psychol.: Learn. Mem. Cogn., 31(5), 1155–1159.
  27. Schneider J. L., Hein S. M. and Murphy K. L., (2014), Feedback in testing, the missing link, in Kendhammer L. K. and Murphy K. L. (ed.), Innovative Uses of Assessments for Teaching and Research, Washington, DC: American Chemical Society, pp. 93–112.
  28. Skinner B. F., (1954), Science of learning and the art of teaching, Harv. Educ. Rev., 24, 86–87.
  29. Slepkov A. D., (2013), Integrated testlets and the immediate feedback assessment technique, Am. J. Phys., 81, 782–791.
  30. Slepkov A. D., Vreugdenhil A. J. and Shiell R. C., (2016), Score increase and partial-credit validity when administering multiple-choice tests using an answer-until-correct format, J. Chem. Educ., 93(11), 1839–1846.
  31. Treagust D., (1986), Evaluating students’ misconceptions by means of diagnostic multiple choice items, Res. Sci. Educ., 16(1), 199–207.
  32. Treagust D. F., (1988), Development and use of diagnostic tests to evaluate students’ misconceptions in science, Int. J. Sci. Educ., 10(2), 159–169.
  33. Vygotsky L. S., (1978), Mind in society: the development of higher psychological processes, Cambridge, MA: Harvard University Press.
  34. White H., (2005), Generating discussion during examinations, Biochem. Mol. Biol. Educ., 33(5), 361–362.

This journal is © The Royal Society of Chemistry 2018