Daniel Elford,*a Garth A. Jones,b and Simon J. Lancaster*c
aSchool of Chemistry, University of East Anglia, Norwich Research Park, NR4 7TJ, UK. E-mail: d.elford@uea.ac.uk
bSchool of Chemistry, University of East Anglia, Norwich Research Park, NR4 7TJ, UK. E-mail: garth.jones@uea.ac.uk
cSchool of Chemistry, University of East Anglia, Norwich Research Park, NR4 7TJ, UK. E-mail: s.lancaster@uea.ac.uk
First published on 19th April 2024
Peer Instruction (PI), a student-centred teaching method, engages students during class through structured, frequent questioning, facilitated by classroom response systems. The central feature of PI is the ConcepTest, a question designed to help resolve student misconceptions around the subject content. Within our coordination chemistry PI session, we provided students with two opportunities to answer each question – once after a round of individual reflection, and again after a round of augmented reality (AR)-supported peer discussion. The second round gives students the opportunity to “switch” their original response to a different answer. The percentage of right answers typically increases after peer discussion: most students who answer incorrectly in the individual round switch to the correct answer after the peer discussion. For the six questions posed, we analysed students’ discussions, in addition to their interactions with our AR tool. Furthermore, we analysed students’ self-efficacy and how this, in addition to factors such as ConcepTest difficulty, influences response switching. For this study, we found that students are more likely to switch their responses for more difficult questions, as measured using the approach of Item Response Theory. Students with high pre-session self-efficacy switched from right to wrong (p < 0.05) and from wrong to a different wrong answer less often, and from wrong to right more often, than students with low self-efficacy. Students with a low assessment of their problem solving and science communication abilities were significantly more likely to switch their responses from right to wrong than students with a high assessment of those abilities. Analysis of dialogues revealed evidence of the activation of knowledge elements and control structures.
Within a PI session, time is organised as a sequence of questioning, interactive discussion, and explanation (Schell and Mazur, 2015). The element of peer discussion is arguably the most recognisable feature of the PI model, and works to maximise both the amount of time that students think about key concepts and the time students spend monitoring their own understanding of the discipline. As students explain their understanding of a ConcepTest, an epiphany often occurs that takes them further than their individual thinking processes. The body of research on PI, primarily from physics education researchers, indicates that PI significantly improves student learning outcomes, such as conceptual understanding and problem-solving ability. As such, implementation of the process outlined in Fig. 1 has provided compelling evidence that PI is associated with substantial improvements in students’ ability to solve conceptual and quantitative problems (Mazur, 1997; Vickrey et al., 2015).
Fig. 1 PI implementation procedure, adapted from Mazur (1997).
Self-efficacy was first developed as an integral part of social cognitive theory (SCT), an agentic perspective to human development, adaptation, and change. As there are different social cognitive theoretical perspectives, the focus of this study is limited to the social cognitive theory proposed by Bandura (1986, 1997, 2001). SCT posits that learning occurs in a social context with a dynamic and reciprocal interaction of the person, environment, and behaviour (Bandura, 1986). Within this triadic reciprocality, each set of influences on human functioning affects the others, and is in turn affected by them. The pivotal feature of SCT is the importance of social influence, and its emphasis on external and internal social reinforcement.
The construct of self-efficacy within SCT refers to the level of a person's confidence in his or her ability to successfully perform an action. Thus, to support the establishment of perseverance and self-regulated learning within our PI environment, students were randomly organised into groups of 2–3 individuals. This allowed students to support one another, whilst making their thinking explicit through discussion. Social cognitive theorists emphasise that learning is most effective when peers learn from others who are both similar to themselves and display high levels of self-efficacy (Schunk, 2005). For example, students who feel competent about performing well in mathematics (high self-efficacy) are apt to engage in effective learning strategies that will benefit their learning (behavioural), as well as demonstrating greater persistence (Schunk and DiBenedetto, 2016; Schunk and Usher, 2019).
Meta-analyses have been conducted on studies with diverse experimental and analytical methodologies applied across diverse spheres of functioning (Boyer et al., 2000; Moritz et al., 2000; Stajkovic et al., 2009). The accumulated evidence confirms that efficacy beliefs contribute significantly to the quality of human functioning. Cognitively, our intention was that the 3D perspective afforded by ChemFord would help manage working memory load, in addition to providing insight into the structure.
Smith et al. (2009) report that students improve the most when asked difficult questions during PI, a trend that was also found by Porter et al. (2011). In addition, lower learning gains have also been reported for instructors implementing easier ConcepTests (Rao and DiCarlo, 2000; Knight et al., 2013). Hence, empirical evidence suggests that the benefits of PI, especially the effectiveness of student discussions, are very likely influenced by the difficulty of the question posed. In their longitudinal analysis, Crouch and Mazur (2001) found that substantial learning gains following voting in round 2 (post-discussion) occurred when the voting in round 1 was correct for 35–70% of the student base. Below 35%, the concept may still be too alien, requiring the provision of further description (Simon et al., 2010).
As such, we developed six ConcepTests to probe students’ comprehension of organometallic chemistry concepts (see ESI† for details of the ConcepTests). Throughout the development process, internal validation with experts in the field of inorganic chemistry at UEA was carried out to ensure student attention was focused towards critical concepts key to addressing specific learning goals. To satisfy these requirements, we used the following six criteria for each ConcepTest (Newbury, 2013):
i. Clarity. Students should waste no cognitive resources understanding the requirements of the question.
ii. Context. The question should be appropriate for the learning material.
iii. Learning outcome. The question should allow students to demonstrate that they grasp the concept.
iv. Distractors. Distractors should be plausible solutions to the question.
v. Difficulty. The question should not be too easy or too hard.
vi. Stimulates thoughtful discussion. The question should engage students, and incentivise thoughtful discussion.
Regarding the implementation of AR technology into PI, only a very limited number of previous works have been reported (Ravna et al., 2022; Themelis, 2022). Although VR is commonly preferred for multiuser collaboration, the role of AR in collaboration is increasing. As such, throughout ConcepTest development, we focused on how the affordances of AR could be leveraged to promote important discussion points. In Fig. 2, we present our first ConcepTest. To answer it correctly, students must fundamentally understand three conceptual points:
i. Firstly, students must recognise how the axial and equatorial aqua ligands are situated around the chromium metal atom.
ii. Secondly, students must be able to comprehend the shapes and orientations of the five d-orbitals of the chromium metal atom.
iii. Lastly, students must be able to comprehend the consequence of ligand and chromium d-orbital interactions along the three Cartesian axes (x, y, and z).
ChemFord affords users the ability to instantiate interactable three-dimensional (3D) representations of the octahedral coordination sphere of the chromium complex, in addition to the five d-orbitals of the chromium metal atom, to direct peer discussion towards these three conceptual points.
The structure of our PI session is outlined in Fig. 3. Student response (voting) data for our six ConcepTests were collected through TurningPoint (2022), an audience response system in which students submitted their responses using mobile phones. In parallel, all students’ PI discussions, alongside their interactions with ChemFord, were captured using audio- and screen-recording software installed on a suite of iPads distributed to student groups. This allowed the study of learning from two perspectives:
Fig. 3 A timeline of our PI session. Numbers preceding each action indicate the session time in minutes.
i. Probing the conceptual understanding of students through the collection of voting data.
ii. Studying the process of conceptual development during AR-supported peer discussion, through recorded conversations.
The research questions investigated were as follows:
Ethical clearance was obtained under the regulations of UEA's School of Science Research Ethics Committee, a sub-committee of the UEA Research Ethics Committee. Participants were informed that their involvement within any aspect of this research was completely voluntary. In addition, participants were made aware of their right to withdraw from the study, at any stage of the research, without declaring a reason. Throughout the research period, participants were assured of data anonymity and confidentiality. Identifying information was irrevocably stripped from data documentation, and study codes utilised in their place. All information was stored securely and was only accessible to the researcher.
Students are often unaware that they are engaging in a particular epistemic game. As such, the focus of this qualitative analysis is the interaction between students’ AR experiences and the activation of these control structures. We start by examining ConcepTests 2 and 3, as these both showed significant intragroup improvement and high PI efficiency. ConcepTest 2 relates to the identification of a linear complex's crystal field splitting diagram, whereas ConcepTest 3 concerns the geometric [Jahn–Teller] distortion of a non-linear molecular system. The dialogues for these ConcepTests are examples of productive discussion in which a change in student thinking, and in voting response, is evident.
Throughout discussions relating to ConcepTests 2 and 3, evidence of the Pictorial Analysis epistemic game was apparent (Fig. 4). In the Pictorial Analysis game, students generate an external spatial representation that specifies the relationship between influences in a problem statement (Tuminaro and Redish, 2007). The epistemic form is the representation that the student generates to guide their inquiry.
Fig. 4 Schematic diagram of some moves in the epistemic game Pictorial Analysis. Adapted from Tuminaro and Redish (2007).
The discussion presented (Table 1) was between a pair of students, of which one voted correctly on ConcepTest 2, and the other incorrectly. The second comment from group member (GM) 1 is the first activating statement in this dialogue. GM2 explains the interaction between the d-orbitals of the gold atom and the two chlorido ligands. As GM1 chooses a new representation on the virtual molecule, a change in their thinking becomes clear. This can be interpreted as an activating event, and evidence of the lowest level of resource activation (the activation of a knowledge element). As the dialogue progresses, it is clear that GM1 has understood the concept, and is now able to use their knowledge to contribute to the discussion. Combining the video recording, representing the students’ AR experience, with the audio recording of the peer discussion, gave a clear indication of the positive impact that using AR had on supporting students’ thinking and knowledge construction.
As the session progressed to ConcepTest 3, employment of the Pictorial Analysis epistemic game was, again, evident from students’ discussions. The example outlined in Table 2 is from a group of three students, in which a single member answered correctly during round 1, and the other two incorrectly. The interplay of particular interest is Section 2. GM2 is able to warrant proof of their claim through the use of ChemFord. The distortion of the represented octahedral complex is used as a means of activating the thinking of GM3. GM3 demonstrates activation of a knowledge structure, specifically one that supports knowledge building. The thinking of GM1 has not changed, so GM2, building on their previous statement, finds a new way to persuade GM1 regarding the stabilisation of the z-components. Using ChemFord, GM2 is able to introduce the metal d-orbitals to support their conceptual story. Subsequently, GM1's thinking is activated; GM1 repeats the statement that altered their perspective and reaches the correct conclusion. Both ConcepTests provide evidence of resource activation by means of successful AR-supported dialogue. All three students responded correctly on the second vote.
ConcepTest 1 is an interesting case. Although quantitative response data suggests that a majority of students answered correctly, qualitative data suggests that students may not have demonstrated a clear understanding at the start of the dialogue. As such, there are points of interest in terms of resource activation through utilisation of AR. Below, we present an example of a dialogue from a pair of students for ConcepTest 1. The AR representations employed are shown in Fig. 5. Both answered correctly before and after discussion:
[Student group 4]

GM1: “I put D; I don’t know.”

GM2: “So, if we look at the dz2 and the dx2−y2, when it splits there will be 2 orbitals at the top and 3 on the bottom.”

GM1: “Yeah, but the question is why those ones?”

GM2: “It's this [dz2] because the orbital is pointing towards the ligands. If you think of ligands as being point charges, the orbital overlaps with the ligands. That's higher energy. And this one also [dx2−y2]. The ones between the axis are in the t2g.”

GM1: “It makes sense that the top two orbitals are in line with the ligands, that these ones [dx2−y2] are pointing towards the ligands which is unfavourable, so it's going to be the highest energy. These ones [dxy] are between the axis and therefore lower in energy.”

GM2: “These aren’t pointing at the ligands so I think that these are energetically favourable.”
Fig. 5 AR representations employed during peer discussion of ConcepTest 1, with overlay of the dz2 orbital (left), and the overlay of the dxy orbital (right).
Lastly, we provide an example from ConcepTest 5. Our data shows that this ConcepTest had the lowest correct response rate, as well as the lowest theoretical (and measured) PI efficiency. Furthermore, it was the only ConcepTest where the correct response rate of students was lower after discussion. Hence, it is important to understand the interactions present throughout discussions of ConcepTest 5, and how these differ from the successful dialogues presented in ConcepTests 1–3.
ConcepTest 5 asked students to use their understanding of pi backbonding to identify which carbonyl ligands (Fig. 6) are most susceptible to electrophilic attack. In all of the transcripts, a common theme was whether or not students could recognise that the two bridging carbonyl ligands are equivalent.
[Student group 19]

GM1: “It's definitely not the top bridging CO.”

GM3: “The top one will be sterically hindered by the two other ligands.”

GM2: “Looking at the molecule you can see that both of the bridging COs are equivalent.”
Fig. 6 3D representation of cyclopentadienyliron dicarbonyl dimer with superimposed carbonyl π bonding molecular orbitals from ConcepTest 5.
For ConcepTest 5, we also provided representations of the π and π* molecular orbitals of the ligands, in addition to the iron atom d-orbitals in the hope of initiating discussion of electron backdonation. This was noted in some dialogues, in which students responded correctly during round 2:
[Student group 2]

GM1: “I’m just thinking about the antibonding orbitals on the carbonyls. The antibonding for these ones [carbonyls] would be here.”

GM2: “Okay.”

GM1: “And these ones are bound to two metals and these ones are only bound to one metal.”

[Student group 5]

GM1: “Yeah. Backbonding also provides electrons to the ligand.”

GM2: “Yeah.”

GM1: “So, those go into the pi* orbital of the CO.”
Several dialogues for ConcepTest 5 provided examples of unproductive discussion, in which little conceptual chemistry was used. A reason for this may be that students were not able to retrieve the required knowledge elements to respond correctly, or that the AR experience did not manage to support resource activation. For group dialogues where the AR virtual objects were not referenced, or used as a driver for supporting the discussion, we found a greater number of incorrect responses after round 2. Evidence of the less productive Recursive Plug-and-Chug epistemic game (Fig. 7) was also observed within ConcepTest 5 dialogues. In the Recursive Plug-and-Chug epistemic game, students plug ideas into a problem situation and churn out answers without conceptually understanding the implications of their solution. Dialogue similar to that expected of the Recursive Plug-and-Chug epistemic game was also observed in ConcepTests 4 and 6, but not in ConcepTests 1–3.
Fig. 7 Schematic diagram of some moves in the epistemic game Recursive Plug-and-Chug. Adapted from Tuminaro and Redish (2007).
ConcepTest No. | Difficulty | Discrimination
---|---|---
1 | −2.000 | 0.636
2 | −0.225 | 0.543
3 | 0.917 | 0.956
4 | 1.414 | 0.561
5 | 3.004 | 0.350
6 | 1.056 | 1.580
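The difficulty and discrimination parameters above come from a two-parameter logistic (2PL) IRT model, in which the probability of a correct response depends on student ability θ, item difficulty b, and item discrimination a. As a minimal sketch (the paper does not state which fitting software was used), the item characteristic curve can be evaluated as follows:

```python
import math

def p_correct(theta, difficulty, discrimination):
    """Two-parameter logistic (2PL) IRT model: probability that a student
    with ability theta answers an item correctly."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))

# Parameters (difficulty b, discrimination a) for ConcepTests 1 and 5,
# taken from the table above.
items = {1: (-2.000, 0.636), 5: (3.004, 0.350)}

for ct, (b, a) in items.items():
    print(f"ConcepTest {ct}: P(correct | theta = 0) = {p_correct(0.0, b, a):.3f}")
```

For a student of average ability (θ = 0), this model predicts roughly a 78% chance of answering ConcepTest 1 correctly but only about a 26% chance for ConcepTest 5, broadly in line with the observed ordering of round-1 response rates.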
We employed PI efficiency (η) calculations, defined with the help of Hake's standardised gain (Hake, 1998), to examine the effectiveness of each ConcepTest. The proportions of correct answers before and after the discussion are denoted by Nb and Na, respectively. While Hake's gain represents individual learning gain, PI efficiency is considered to reflect the ease of understanding gained through PI (Table 4). The collected response data from our ConcepTests were found to be normally distributed. Hence, we conducted paired-samples t-tests, alongside analysis of effect size, for intragroup comparisons. The theoretical value of Na is expressed as a function of Nb (Nitta, 2010), with the theoretical value of η = Nb. For this study, the average difference between the measured and theoretical values of η was 0.061, similar to the value of 0.062 recorded by Nitta et al. (2014) when measuring the effectiveness of PI using the Force Concept Inventory. The proportion of correct responses during independent voting in round 1 ranged from 0.290 to 0.897. The ideal range is reported to be 0.35 to 0.70 (Crouch and Mazur, 2001). For ConcepTests 4–6, where correct independent response rates lie at or below the lower end of this range, students were likely to have had ineffective discussions during round 2. As such, the observed values of η are low.
 | ConcepTest 1 | 2 | 3 | 4 | 5 | 6
---|---|---|---|---|---|---
No. of respondents before PI discussion | 29 | 31 | 29 | 33 | 31 | 27
No. of respondents after PI discussion | 32 | 31 | 29 | 25 | 30 | 25
Correct answers before discussion (Nb) | 0.897 | 0.581 | 0.379 | 0.333 | 0.290 | 0.296
Correct answers after discussion (Na) | 0.969 | 0.968 | 0.862 | 0.400 | 0.200 | 0.320
Paired-samples t-test (p value) | 0.161 | <0.01 | <0.01 | 0.161 | 0.375 | 1.000
Cohen's d | 0.28 | 0.77 | 0.91 | 0.29 | 0.17 | 0.00
Theoretical value of Na | 0.989 | 0.824 | 0.614 | 0.555 | 0.496 | 0.504
PI efficiency (η) | 0.699 | 0.924 | 0.778 | 0.100 | −0.127 | 0.034
Theoretical value of PI efficiency (η) | 0.897 | 0.581 | 0.379 | 0.333 | 0.290 | 0.296
Difference between theoretical and measured values | 0.198 | −0.343 | −0.399 | 0.233 | 0.417 | 0.262
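The tabulated values are consistent with the normalised-gain definition η = (Na − Nb)/(1 − Nb), together with a theoretical post-discussion rate Na = 2Nb − Nb² (Nitta, 2010), which makes the theoretical η equal to Nb. A short sketch reproducing the ConcepTest 2 row:

```python
def pi_efficiency(nb, na):
    """PI efficiency: Hake-style normalised gain applied to PI voting data,
    i.e. the fraction of the possible improvement realised after discussion."""
    return (na - nb) / (1.0 - nb)

def theoretical_na(nb):
    """Theoretical post-discussion correct rate under Nitta's (2010) model;
    substituting it into pi_efficiency gives eta = nb exactly."""
    return 2.0 * nb - nb ** 2

# ConcepTest 2 from the table: Nb = 0.581, Na = 0.968
nb, na = 0.581, 0.968
print(round(pi_efficiency(nb, na), 3))   # measured eta -> 0.924
print(round(theoretical_na(nb), 3))      # theoretical Na -> 0.824
```

Note that η can be negative, as for ConcepTest 5, whenever the correct response rate falls after discussion.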
The normalised proportion of correct responses before, and after, the discussion phase of each ConcepTest is shown in Fig. 8. We observed statistically significant improvement for correct response rates between the first and second round of voting on ConcepTests 2 and 3. For ConcepTests 1 and 4, this improvement was approaching significance, with the difference between groups greater than 0.2 standard deviations.
Fig. 8 Proportion of correct responses before, and after, discussion of each ConcepTest. The green line represents the theoretical curve for PI efficiency (Nitta, 2010). The purple line represents no change in correct pre-discussion and post-discussion response rate. Points above this line indicate improvements in accuracy, whereas points below the line represent decrements in accuracy.
ConcepTest | Students who switched (%) | W–R (%) | W–W (%) | R–W (%)
---|---|---|---|---
1 | 29.41 | 70.0 | 10.0 | 20.0
2 | 48.48 | 81.3 | 12.5 | 6.3
3 | 65.63 | 71.4 | 23.8 | 4.8
4 | 63.64 | 28.6 | 38.1 | 33.3
5 | 65.63 | 19.0 | 47.6 | 33.3
6 | 54.84 | 23.5 | 52.9 | 23.5
When switching is measured, it is important to ensure that the data is not confounded with the frequency of correct (or incorrect) responses in round 1 (Miller et al., 2015). Normalising our response data with respect to students’ answers in round 1 provides an adjusted measure of switching, independent of how many times a student was correct, or incorrect, in round 1.
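A minimal sketch of this normalisation (the exact grouping in Miller et al. (2015) may differ in detail): each switching count is divided by the number of students who could have made that transition, so wrong-to-right and wrong-to-wrong switches are normalised by the number of round-1 incorrect responses, and right-to-wrong switches by the number of round-1 correct responses.

```python
from collections import Counter

def normalised_switching(round1, round2, correct):
    """Classify each student's round-1 -> round-2 transition, then normalise
    by the number of students eligible to make each transition."""
    counts = Counter()
    for r1, r2 in zip(round1, round2):
        if r1 != correct and r2 == correct:
            counts["W-R"] += 1
        elif r1 != correct and r2 != correct and r1 != r2:
            counts["W-W"] += 1  # wrong to a *different* wrong answer
        elif r1 == correct and r2 != correct:
            counts["R-W"] += 1
    n_wrong = sum(1 for r in round1 if r != correct)
    n_right = len(round1) - n_wrong
    return {
        "W-R": counts["W-R"] / n_wrong if n_wrong else 0.0,
        "W-W": counts["W-W"] / n_wrong if n_wrong else 0.0,
        "R-W": counts["R-W"] / n_right if n_right else 0.0,
    }

# Hypothetical votes for illustration only (correct answer = "B")
r1 = ["A", "B", "C", "B", "D"]
r2 = ["B", "B", "D", "A", "B"]
print(normalised_switching(r1, r2, "B"))
```

With this adjustment, a ConcepTest where nearly everyone voted correctly in round 1 cannot artificially appear to have low right-to-wrong switching simply because few switches of any kind were possible.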
Coupling these normalised values with the output of our 2PL IRT model allows us to examine switching as a function of ConcepTest difficulty (Fig. 9). A Pearson's correlation showed a strong, positive correlation, r = 0.910, between response switching and ConcepTest difficulty, which was statistically significant, p = 0.012. With increasing ConcepTest difficulty, students are more likely to switch their answers from right-to-wrong (r = 0.754, p = 0.084), and wrong-to-different wrong (r = 0.829, p = 0.042). In addition, students are less likely to switch their answers from wrong-to-right (r = −0.771, p = 0.072). This finding is consistent with previous studies (Miller et al., 2015).
Fig. 9 Student switching (%) in any direction for each ConcepTest as a function of difficulty. Each point represents a different ConcepTest.
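The reported overall correlation can be reproduced directly from the two tables above (IRT difficulty per ConcepTest, and the percentage of students switching in any direction):

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Values for ConcepTests 1-6, as tabulated above.
difficulty = [-2.000, -0.225, 0.917, 1.414, 3.004, 1.056]
switched = [29.41, 48.48, 65.63, 63.64, 65.63, 54.84]

print(round(pearson_r(difficulty, switched), 3))  # -> 0.91
```

This recovers r ≈ 0.910, matching the value reported in the text.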
It is important for instructors to understand that they have some control over the degree of response switching that occurs during PI via the difficulty of the ConcepTests posed. Within our session, we attempted to scaffold this by posing easier ConcepTests first, subsequently building up to more difficult ConcepTests. Research has shown that prefacing more difficult problems with a sequence of related, but more basic, conceptual questions helps students answer harder problems (Ding et al., 2011). Cognitively, presenting easier questions before difficult ones that probe the same concept may help students break that concept down into smaller, more manageable chunks. As ConcepTests often require students to apply conceptual understanding in new contexts, it is possible that scaffolding difficult ConcepTests may assist with positive switching transitions. A future study of ConcepTest response patterns to a series of scaffolded discussion points would prove interesting in providing further insight into the relationship between switching and ConcepTest difficulty.
Fig. 10 Response switching patterns for students in the top and bottom 27% of reported self-efficacy measures.
Students with higher pre-session self-efficacy negatively switched less often, and positively switched more often, than students with low self-efficacy. As we did not administer a pre-test assessment, we are unable to control for covariates such as prior knowledge, but previous work has indicated that self-efficacy may be more predictive of switching than incoming knowledge (Zajacova et al., 2005).
Students’ responses to two individual items on the PISE moderately correlated with not switching responses. These statements are: “I usually don’t worry about my ability to solve chemistry problems”, (p < 0.01); and “I know how to explain my answers to organometallic chemistry questions in a way that helps others understand my answer”, (p < 0.01). In contrast, these two same items strongly negatively correlated (p < 0.01) with negatively switching responses. For item 10, students who either disagreed or strongly disagreed negatively switched significantly more than students who agreed or strongly agreed (p < 0.001). This difference was also observed for item 16 (p = 0.01). Students with a low assessment of their problem solving and science communication abilities are significantly more likely to negatively switch their responses than students with a high assessment of those abilities.
Following our PI session, median Likert scores on the PISE instrument improved on the following items: “When I come across a tough chemistry problem, I work at it until I solve it” (neutral to agree); “I like hearing about questions that other students have about chemistry” (neutral to agree); “I can communicate science effectively” (neutral to agree, p = 0.04); and “I can communicate chemistry effectively” (neutral to agree, p = 0.025).
Moreover, we examined the relationship between response switching and reported self-efficacy. Students reporting higher measures of self-efficacy displayed lower levels of switching in a negative direction. Students with a low assessment of their problem solving and science communication abilities were significantly more likely to switch their responses from right to wrong than students with a high assessment of those abilities. Through qualitative analysis, we have provided evidence of the Pictorial Analysis epistemic game within AR-supported PI discussions. Where calculated PI efficiency values for ConcepTests were lower, this was less apparent, with Recursive Plug-and-Chug being the more commonly observed control structure. It would be interesting to pose the same questions again at a later date to examine retention, or to add a third round of voting on the same question to see how many students select three different answers.
Footnote

† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3rp00093a
This journal is © The Royal Society of Chemistry 2024 |