Leonie Sabine Lieber a, Krenare Ibraj a, Ira Caspari-Gnann b and Nicole Graulich *a
a Justus-Liebig University Giessen, Institute of Chemistry Education, Heinrich-Buff-Ring 17, 35392 Giessen, Germany. E-mail: Nicole.graulich@dc.jlug.de
b Department of Chemistry, Tufts University, 62 Talbot Ave, Medford, MA 02155, USA
First published on 31st May 2022
Building reasonable scientific arguments is a fundamental skill students need to participate in scientific discussions. In organic chemistry, students’ argumentation and reasoning skills on reaction mechanisms are described as indicators of success. However, students often experience challenges with how to structure their arguments, use scientific principles appropriately and engage in multivariate, instead of one-reason decision-making. Since every student experiences their individual challenges with a multitude of expectations, we hypothesise that students would benefit from scaffolding that is adapted to their needs. In the present study, we investigated how 64 chemistry students interacted with an adaptive scaffold that offered different ways of support based on students’ strengths and limitations with structural and conceptual aspects that are needed to build a scientific argument in organic chemistry. Based on the students’ performance in a diagnostic scaffold in which they were asked to judge the plausibility of alternative organic reaction pathways by building arguments, the students were assigned to one of four support groups that received a scaffold adapted to their respective needs. Comparing students’ performance in the diagnostic and adapted scaffolds allows us to determine quantitatively (1) to what extent the adaptive scaffold closes the gap in students’ performance and (2) whether an adaptive scaffold improves the students’ performance in their respective area of support (argumentation and/or concept knowledge). The results of this study indicate that the adaptive scaffold can adaptively advance organic chemistry students’ argumentation patterns.
Ongoing research on student argumentation clearly shows that the reported difficulties are either caused by (1) missing knowledge on how to structure an argument or missing activation of that knowledge for the problem at hand or (2) missing conceptual understanding or its application required for the argument or (3) both reasons. Given that prior knowledge has a major impact on students’ performance, students need to receive adapted support to build upon their strengths and limitations with argumentation, and the level of conceptual understanding they bring into the classroom (Chen, 2014). Scaffolds tailored to students’ needs may support them adaptively to solve a task on their own and by purposefully guiding and slowing down specific aspects of the argumentation process. It may, thus, support the students to direct their focus to the expected structure of an argument or to consider conceptual understanding that they may not have activated without a scaffold (Wood et al., 1976). Especially when building arguments for multivariate mechanistic tasks, slowing down the reasoning process with a scaffold has been shown to assist learners in first collecting multiple relevant chemical concepts and weighing them afterward before making a decision (Caspari and Graulich, 2019; Flynn, 2021; Watts et al., 2021). McNeill et al. (2006) emphasised that scaffolding as a flexible process should not be rigid. Instead, scaffolding should be adjusted to students’ needs.
Previous research on argumentation in chemistry demonstrated that (1) students experience challenges with building sound arguments, (2) students experience difficulties using appropriate scientific principles, and (3) scaffolds are powerful tools to address students’ needs. However, adaptive scaffolds designed to close this performance gap are still limited (Chen, 2014).
In this study, students received scaffolded training consisting of two consecutive parts (i.e., two different data points), which we refer to as an adaptive scaffold. In the first part of the adaptive scaffold, which we refer to as the ‘diagnostic scaffold’, students received practice and support in building arguments and using concept knowledge. Students’ answers in this diagnostic scaffold served as a diagnosis for the second part of the adaptive scaffold, in which students received one of four scaffolds. These adapted scaffolds addressed the previously mentioned difficulties, resulting in four adapted scaffolds: one for argumentation patterns, one for the use of concept knowledge, one for both argumentation patterns and the use of concept knowledge, and one for students with no apparent difficulties.
Therefore, this quantitative study reports on the effectiveness of the adaptive scaffold (i.e., a combination of a diagnostic scaffold and four adapted scaffolds), designed as an online learning environment to adaptively scaffold students based on their performance of building arguments for alternative organic reaction pathways. In this manner, we investigated the extent to which the adaptive scaffold closes the gap in organic chemistry students’ performance and whether the adaptive scaffold improved students’ performance.
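The assignment of students to one of the four adapted scaffolds can be sketched as a simple threshold rule. This is a minimal illustration, not the authors' implementation: the group labels (ArgS, ConS, ArgConS, ReaS) and the argumentation threshold of 16 are taken from the results reported later in the article, whereas the concept knowledge threshold is not stated in this section, so it is a required parameter and the value 20 used in the example is purely hypothetical. Treating "below the threshold" as a strict inequality is also an assumption.

```python
from typing import NamedTuple

# Reported in the results: argumentation scores below 16 receive argumentation support.
ARG_THRESHOLD = 16


class DiagnosticScores(NamedTuple):
    argumentation: float  # diagnostic scaffold, maximum 25 points
    concept: float        # diagnostic scaffold, maximum 29 points


def assign_group(scores: DiagnosticScores, concept_threshold: float,
                 arg_threshold: float = ARG_THRESHOLD) -> str:
    """Map diagnostic scores to one of the four adapted scaffold groups.

    concept_threshold is NOT given in the text and must be supplied by the caller.
    """
    needs_argumentation = scores.argumentation < arg_threshold
    needs_concepts = scores.concept < concept_threshold
    if needs_argumentation and needs_concepts:
        return "ArgConS"
    if needs_argumentation:
        return "ArgS"
    if needs_concepts:
        return "ConS"
    return "ReaS"


# Diagnostic scores of the three example students; 20 is a hypothetical threshold.
HYPOTHETICAL_CONCEPT_THRESHOLD = 20
louis = assign_group(DiagnosticScores(15, 24), HYPOTHETICAL_CONCEPT_THRESHOLD)
jessica = assign_group(DiagnosticScores(21, 17), HYPOTHETICAL_CONCEPT_THRESHOLD)
mike = assign_group(DiagnosticScores(10, 16), HYPOTHETICAL_CONCEPT_THRESHOLD)
print(louis, jessica, mike)
```

With this hypothetical threshold, the three example students land in the groups the article reports for them.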
As scaffolding is a temporary process assisting students if they need support, it is also important to fade out the given support when it is no longer needed (Lajoie, 2005; McNeill et al., 2006). However, appropriate fading of support depends on the tasks’ complexity and students’ progress (Kang et al., 2014). Fading too early may have adverse effects as students might not have yet fully understood certain concepts or activities (Noroozi et al., 2017).
In addition to the benefits of using argument components, building strong arguments also involves the appropriate use of scientific concepts (Sandoval and Millwood, 2005; Choi et al., 2013; Lieber and Graulich, 2022). Therefore, CER scaffolding does not only provide support for the structure of arguments but can also be enhanced with the incorporation of concepts (McNeill et al., 2006; Songer and Gotwals, 2012). The interplay of concept knowledge and argumentation was demonstrated by Songer and Gotwals (2012) as students’ conceptual understanding increased by using CER scaffolding in a pre-post intervention. In addition to the main components of CER scaffolds such as argumentation and concept knowledge, the type of scaffold should also be considered.
Kang et al. (2014) suggested that four of the six types of instructional scaffolding (originally analysed for English language learners (Walqui, 2006)) fulfil different functions in the construction of evidence-based explanations in a scientific context. These different types of scaffolds can be combined to be beneficial for the students. The first type is instructional modelling, which gives students clear examples of what is expected from them. This is especially important when implementing a new principle or task. Accordingly, students need to see in advance what the finished product should look like (Walqui, 2006; Kang et al., 2014). In terms of argumentation and concept knowledge, this can be achieved by providing students with examples that illustrate the structure of an argument or presenting arguments that connect concept knowledge. The second type is bridging which functions as a link between existing and new knowledge (Walqui, 2006). This can be accomplished in scaffolds by asking students targeted questions that both activate their prior knowledge and link it directly to new content. The third type is contextualising, and this refers to using language in appropriate context as academic language is not only different from everyday language but often also intangible. For example, pictures or films can be used to support contextualisation (Walqui, 2006). When implementing this type of scaffold in argumentation, the visualisation of problem contexts or the illustration of the argument structure (in partial steps) is beneficial. The last type of scaffolding is developing metacognition, which fosters students’ ability to evaluate and reflect on their current state of knowledge (Walqui, 2006). This function of a scaffold can be realised by prompting students to assess the difficulty and confidence, but also evaluating tasks and whether the students need help in certain areas of the task.
(1) To what extent does the adaptive scaffold close the gap in students’ performance?
(a) To what extent do the group performances differ after the diagnostic scaffold based on scoring argumentation and concept knowledge?
(b) Do the group performances converge after the adapted scaffolds?
(2) Does the adaptive scaffold improve students’ performance in the respective area of support (argumentation and/or concept knowledge)?
To answer these two research questions, a quantitative analysis based on students’ answers of the diagnostic scaffold and adapted scaffolds was conducted.
As described briefly above, the four adapted scaffold groups differed with regard to the additional support provided to the respective group of students.
The task considered for the argumentation score included all of the three argumentation pattern tasks (see Appendix 2), in which sentence sequences of an argument had to be assigned to the three components, claim, evidence, and reasoning, and the building of autonomous arguments in relation to the alternative reaction products (see Appendix 4). Students could receive up to three points on each of the three argumentation pattern tasks. One point each was awarded when all sentence sequences representing either claim, evidence or reasoning were labelled correctly. In the second part, in which students had built arguments for the alternative reaction products, four points were awarded for each alternative reaction product, i.e., two points for the pieces of evidence and two points for the reasoning statements. There were no points awarded for the claim since the students had a given claim for which they only had to decide whether the alternative reaction product was plausible or implausible. For the evidence and reasoning statements, attention was paid to whether they met the following requirements. Chemical correctness was disregarded at this point. An evidence statement was considered correct when the statement relates to the claim. Thereby, the statement does not have to refer to an explicit characteristic of the reaction given but answers the question why the alternative reaction product is (im)plausible. In addition, the statement must be objective, based on data, and an explanation rather than a description. A student example which meets the criteria of a piece of evidence is “the ester has acidic alpha protons”. This built evidence supports and refers to the claim (“The reaction product is plausible”) and answers the why-question because it is an objective statement, which is based on data. In our context, data refers to structural characteristics of the molecule as well as implicit properties, e.g., acidity, as no experimental data are given. 
A student example which does not meet the criteria of a piece of evidence is “in the reaction above, what is shown is a SN2 reaction”. This self-declared piece of evidence does not support the claim by answering the why-question; rather, the statement could serve as a claim itself. However, the claims (i.e., plausible or implausible) were given, so students were not asked to build a claim but only to choose one. A reasoning statement was considered correct when a justification was provided as to why the evidence fits the claim. Reasoning must be objective, logical, and based on scientific principles. Thereby, it was not important whether the scientific principles were chemically correct but whether scientific principles were considered at all. A student example which meets the criteria of a reasoning statement is “these protons are acidic because the enolate conjugate can be resonance stabilized”. The student applied scientific principles to justify why the evidence fits the claim, and the statement is objective. A student example which does not meet the criteria of a reasoning statement is “based on the assumption of the previous argument, the deprotonation of OH would not occur”. In this statement, no scientific principles are applied. Moreover, this statement does not serve as a justification but seems more like a conclusion. All evidence and reasoning statements were evaluated according to the criteria mentioned above. Two points (and thus full points) were awarded when all statements were formed correctly. One point was awarded when at least 50% of the statements were correct and zero points when less than 50% of the statements were correct. In total, the argumentation score consisted of 25 points.
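The point rules for the argumentation score can be written down compactly. The following is a minimal sketch of the rubric as described above, not the authors' scoring tool; the function names are hypothetical, and awarding zero points when no statement was written is an assumption, since that case is not described.

```python
def labelling_task_points(claim_ok: bool, evidence_ok: bool, reasoning_ok: bool) -> int:
    """One point per component whose sentence sequences were all labelled correctly (max 3)."""
    return int(claim_ok) + int(evidence_ok) + int(reasoning_ok)


def statement_points(n_correct: int, n_total: int) -> int:
    """2 points if all statements are formed correctly, 1 if at least 50%, otherwise 0."""
    if n_total == 0:
        return 0  # assumption: no statements written earns no points
    if n_correct == n_total:
        return 2
    if 2 * n_correct >= n_total:
        return 1
    return 0


def argument_points(evidence_ok: list, reasoning_ok: list) -> int:
    """Up to 4 points per alternative reaction product: 2 for evidence, 2 for reasoning."""
    return (statement_points(sum(evidence_ok), len(evidence_ok))
            + statement_points(sum(reasoning_ok), len(reasoning_ok)))


# Example: all evidence statements formed correctly, one of two reasoning statements correct.
print(argument_points([True, True], [True, False]))  # 2 + 1 = 3
```

The stated maximum of 25 points is consistent with three labelling tasks (3 × 3 = 9) plus four alternative reaction products (4 × 4 = 16); the number of products is inferred from the total rather than stated in this section.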
The concept knowledge score consists of both the answers to the questions on chemical concepts (Step B in Fig. 1) and the application of concept knowledge in building autonomous arguments (see Appendix 4). In the first part, students were given questions on different chemical concepts (see Appendix 3). While for most questions, students could receive a maximum of two points, they could receive a maximum of three points for the question on electronic effects as this question covered more than one concept a student could consider. In the second part, the use of chemical concepts in building arguments was assessed. Two points could be obtained for each alternative reaction product. The correct structural formation of evidence and reasoning was not scored. Zero points were given if no chemical concepts were used. One point was awarded if at least 50% of the concepts were incorrect and two points were awarded if more than 50% of concepts were used correctly. In total, the concept knowledge score consisted of 29 points.
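The rule for scoring the use of chemical concepts in a built argument can be sketched in the same way. This is an illustrative reading of the rubric above with a hypothetical function name: zero points if no concepts are used, two points if more than 50% of the concepts are used correctly, and one point otherwise (i.e., when at least 50% are incorrect).

```python
def concept_use_points(n_correct: int, n_used: int) -> int:
    """Points for concept use in one argument (max 2 per alternative reaction product)."""
    if n_used == 0:
        return 0  # no chemical concepts used
    if 2 * n_correct > n_used:
        return 2  # more than 50% of the concepts used correctly
    return 1      # concepts were used, but at least 50% of them incorrectly


print(concept_use_points(2, 3))  # 2
print(concept_use_points(1, 2))  # 1
```

Note that, read literally, the rubric awards one point even when every concept used is incorrect, because only the complete absence of concepts yields zero points.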
First, a visual inspection of the normal distribution of students’ argumentation and concept knowledge scores was performed, which was supported by a Shapiro–Wilk test. This analysis revealed that the data were not normally distributed, which indicated the use of non-parametric tests. For all measurements, an α-level of 0.05 was used.
To determine to what extent the group performances differ after the diagnostic scaffold and whether the group performances converge after the adapted scaffolds, a Kruskal–Wallis test and subsequent post hoc comparisons with Wilcoxon rank-sum tests and Bonferroni-adjusted p-values were performed (Field et al., 2012). The Kruskal–Wallis test, as the non-parametric counterpart of the one-way ANOVA, was chosen to compare more than two independent samples of different sample sizes. As the Kruskal–Wallis test only indicates whether groups are significantly different, subsequent post hoc comparisons are necessary to identify these groups. In case of significant results in the post hoc comparisons, the correlation coefficient r as a measure of effect size was calculated from the conversion of the z-score (Rosenthal, 1991). The correlation coefficient r was interpreted as a small effect for 0.10 ≤ r < 0.30, a medium effect for 0.30 ≤ r < 0.50, and a large effect for r ≥ 0.50 (Cohen, 1992).
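To make the test statistic concrete, the sketch below computes the Kruskal–Wallis H (with the usual tie correction) and a Bonferroni adjustment in plain Python. It is an illustration of the standard formulas only, not the authors' analysis code; the function names and the toy data are hypothetical, and p-values would in practice come from a statistics package.

```python
from collections import Counter


def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def kruskal_wallis_h(groups):
    """Return (H, degrees of freedom) for a list of independent samples."""
    pooled = [x for group in groups for x in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n) over tied value counts t.
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction, len(groups) - 1


def bonferroni(p_value, n_comparisons):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_comparisons)


# Hypothetical toy data with three groups (the study compares four, hence H(3)).
h, df = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h, 2), df)
```

Four groups give three degrees of freedom and six pairwise post hoc comparisons, which is why the adjustment in this setting multiplies each raw p-value by six before capping it at 1.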
To determine whether the adaptive scaffold improved students’ performance in the areas of support, a Wilcoxon signed-rank test with Bonferroni-adjusted p-values was performed. The Wilcoxon signed-rank test as the non-parametric counterpart of the dependent t-test was chosen since we wanted to compare two dependent samples, i.e., pre-post comparisons of the same group. In the case of significant results, the correlation coefficient r was reported as the effect size.
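The quantities reported for the pre-post comparisons can be illustrated as follows. This is a minimal sketch under stated assumptions, not the authors' code: V is taken to be the sum of the ranks of the positive differences (post minus pre), zero differences are dropped, r is computed as |z| divided by the square root of the number of observations, and the label "negligible" for r below 0.10 is our own naming. Which count of observations enters the formula is not specified in this section, so it is left as a parameter.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def signed_rank_v(pre, post):
    """Return (V, number of non-zero differences) for paired samples."""
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    ranks = average_ranks([abs(d) for d in diffs])
    v = sum(rank for rank, d in zip(ranks, diffs) if d > 0)
    return v, len(diffs)


def effect_size_r(z, n_observations):
    """Effect size r obtained by converting a z-score: |z| / sqrt(N)."""
    return abs(z) / math.sqrt(n_observations)


def cohen_label(r):
    """Interpretation bands for r as used in the article (Cohen, 1992)."""
    if r >= 0.50:
        return "large"
    if r >= 0.30:
        return "medium"
    if r >= 0.10:
        return "small"
    return "negligible"  # assumption: the article does not name this band


# Hypothetical toy scores for four students before and after a scaffold.
v, n_nonzero = signed_rank_v([10, 12, 9, 11], [14, 15, 9, 10])
print(v, n_nonzero)
print(cohen_label(0.68), cohen_label(0.43))
```

In the toy data, the unchanged score is dropped, the remaining absolute differences 4, 3, and 1 receive ranks 3, 2, and 1, and the two positive differences contribute V = 3 + 2 = 5.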
Comparisons | Diagnostic scaffold argumentation score: M first | Diagnostic scaffold argumentation score: M second | Diagnostic scaffold argumentation score: p | Diagnostic scaffold argumentation score: r | Diagnostic scaffold concept knowledge score: M first | Diagnostic scaffold concept knowledge score: M second | Diagnostic scaffold concept knowledge score: p | Diagnostic scaffold concept knowledge score: r |
---|---|---|---|---|---|---|---|---|
ArgS vs. ConS | 10 | 13 | 0.001 | 0.58 | 21 | 16 | <0.001 | 0.79 |
ArgS vs. ArgConS | 10 | 8.5 | 0.056 | | 21 | 13.5 | <0.001 | 0.79 |
ArgS vs. ReaS | 10 | 14 | <0.001 | 0.68 | 21 | 21 | >0.999 | |
ConS vs. ArgConS | 13 | 8.5 | <0.001 | 0.78 | 16 | 13.5 | 0.003 | 0.51 |
ConS vs. ReaS | 13 | 14 | >0.999 | | 16 | 21 | <0.001 | 0.78 |
ArgConS vs. ReaS | 8.5 | 14 | <0.001 | 0.77 | 13.5 | 21 | <0.001 | 0.77 |
After the diagnostic scaffold (pre-measure), the groups differed significantly in terms of their argumentation score, H(3) = 43.97, p < 0.001. Post hoc comparisons revealed significant differences with large effects in four of six group comparisons, indicated with black lines on the left side in Fig. 4. The two groups with argumentation scores below the threshold of 16 (ArgS, yellow, and ArgConS, green), who will receive an adapted scaffold for argumentation patterns, each differed significantly with a large effect from the two groups with scores above the threshold (ConS, blue, and ReaS, purple), who will not receive additional argumentation support. In contrast, the ArgS and ArgConS groups did not differ significantly from each other in the pre-measure, and neither did the ConS and ReaS groups; Fig. 4 illustrates these non-significant comparisons with dashed black lines. These results indicate that the qualitative grouping was successful for the argumentation score.
After the diagnostic scaffold (pre-measure), the groups were also found to differ significantly in terms of their concept knowledge score, H(3) = 50.92, p < 0.001. Subsequent post hoc comparisons demonstrated that five of the six group comparisons differed significantly with large effects, which is illustrated with black lines on the right side of Fig. 4 and summarised in Table 1. The two groups who will subsequently not receive additional information on chemical concepts, ArgS and ReaS, did not differ significantly. The two groups who will receive an adapted scaffold on concept knowledge, ConS and ArgConS, differed significantly from each other with a large effect, and each also differed from the ArgS and ReaS groups. This shows that the ConS and ArgConS groups received their adapted scaffold on a legitimate basis, which means that the qualitative grouping was also successful for the concept knowledge score. It is not surprising that the ArgConS group is significantly different from all of the other groups, since these students received distinctly fewer points in the concept knowledge score. These differences in score can also be observed through qualitative observations of the students’ answers, as many questions (e.g., on nucleophilicity and electrophilicity, steric effects, or electronic effects) were either answered incorrectly or with the phrase “I don't know.”
After analysing the differences in students’ performance after the diagnostic scaffold (pre-measure), the results of the Kruskal–Wallis test and subsequent post hoc comparisons of the group performances after the adapted scaffolds (post-measure) are reported. In terms of the argumentation score, the groups did not differ significantly in the post-measure, H(3) = 4.79, p = 0.188. This is illustrated with dashed grey lines in Fig. 4. Therefore, it can be assumed that after the adapted scaffolds, the groups have converged in terms of their argumentation score, which means that no significant differences were measurable. Regarding the concept knowledge score, the groups differed significantly from each other in the Kruskal–Wallis test, H(3) = 10.46, p = 0.015, but the subsequent post hoc tests with Bonferroni-adjusted p-values showed that only two groups differ significantly from each other after the adapted scaffolds (post-measure). The ArgConS and ReaS groups still differ from each other significantly with a medium effect after the adapted scaffolds in the concept knowledge score, which is illustrated in Fig. 4 with a grey line. All other group comparisons did not show a significant difference in post hoc comparisons, as shown in Table 2. This reveals that the four groups also converged in terms of the use of concept knowledge and that the performance gap which was diagnosed beforehand is not significantly noticeable in the post-measure. This suggests overall that the adaptive scaffold supported the respective groups and closed the gap in their performance. The difference between the ArgConS and ReaS groups can be understood when considering them as two ends of a continuum. Students in the ArgConS group received the lowest scores in the concept knowledge score whereas the students of the ReaS group achieved the highest scores in this category.
This is also noticeable in the pre-measure since the ConS and ArgConS groups differed significantly although both groups will receive an adapted scaffold for concept knowledge. However, these two extremes came a little closer to each other, as demonstrated by the comparison of the median values from the diagnostic scaffold (difference of median values for the ReaS group and ArgConS group = 7.5) and the adapted scaffolds (difference of median values for the ReaS group and ArgConS group = 7).
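The two median differences quoted above can be checked directly from the concept knowledge medians listed in the comparison tables (ReaS: 21 and 25; ArgConS: 13.5 and 18). The snippet is only a worked check of that arithmetic; the dictionary layout and function name are hypothetical.

```python
# Concept knowledge medians (M) as listed in the comparison tables.
CONCEPT_MEDIANS = {
    "diagnostic": {"ReaS": 21, "ArgConS": 13.5},
    "adapted": {"ReaS": 25, "ArgConS": 18},
}


def median_gap(stage: str) -> float:
    """Difference between the ReaS and ArgConS medians at the given stage."""
    medians = CONCEPT_MEDIANS[stage]
    return medians["ReaS"] - medians["ArgConS"]


print(median_gap("diagnostic"), median_gap("adapted"))
```

The gap narrows by only half a point, which fits the finding that this pair remains the single significant post hoc difference after the adapted scaffolds.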
Comparisons | Adapted scaffolds concept knowledge score: M first | Adapted scaffolds concept knowledge score: M second | Adapted scaffolds concept knowledge score: p | Adapted scaffolds concept knowledge score: r |
---|---|---|---|---|
ArgS vs. ConS | 20 | 19 | >0.999 | |
ArgS vs. ArgConS | 20 | 18 | >0.999 | |
ArgS vs. ReaS | 20 | 25 | 0.543 | |
ConS vs. ArgConS | 19 | 18 | >0.999 | |
ConS vs. ReaS | 19 | 25 | 0.188 | |
ArgConS vs. ReaS | 18 | 25 | 0.013 | 0.43 |
Overall, it became apparent that the groups who will receive additional support in the adapted scaffolds were significantly different to those who will not receive support, with a large effect in the pre-measure. Thus, this confirms that the grouping after the diagnostic scaffold was successful and revealed that a performance gap was present. After the adapted scaffolds (post-measure), the groups did not differ significantly in terms of argumentation score and concept knowledge score. The ArgConS group constitutes an exception as this group differed significantly from all the other groups regarding the concept knowledge score in the pre-measure and the ReaS group in terms of the concept knowledge score in the post-measure. This is because the ArgConS group was conceptually weaker after both the diagnostic scaffold and adapted scaffold compared to all three other groups.
The ArgS group increased significantly in the argumentation score from pre- (M = 10) to post-measure (M = 14.5) with a large effect (V = 107.5, p = 0.007, r = 0.68). For the concept knowledge score, on the other hand, there was no significant change from pre- (M = 21) to post-measure (M = 20) (V = 47.5, p = 0.299). Therefore, the adapted scaffold on argumentation patterns supported the students significantly in building arguments but not in the use of concept knowledge. This result is encouraging as the students only received additional support on argumentation patterns. Louis, a participant in this study, serves as a student example of the ArgS group. He received 15/25 points in the argumentation score and 24/29 points in the concept knowledge score, which resulted in assigning him to the ArgS group. After receiving an adapted scaffold for building arguments, Louis received 16/16 points in the argumentation score and 27/29 in the concept knowledge score. In the diagnostic scaffold, he claimed that tetrahydrofuran (THF) is an implausible product of the reaction of 4-chlorobutanol and hydroxide. For that purpose, Louis built his argument by using the free-text boxes labeled as evidence and reasoning as follows:
Claim: The reaction product is implausible.
Evidence 1: While it is plausible that the OH− will initially deprotonate the alcohol, the molecule must contort significantly to put the alkoxide group next to the C–Cl bond.
Reasoning 1: The likelihood of getting the alkoxide group next to the C–Cl bond of the same molecule before, say, the alkoxide reacts with the C–Cl bond of a neighboring identical molecule is unlikely because the molecule would have to bend in on itself. In addition, entropics don’t favor the reaction because a ring has less entropy than a straight chain.
It is noticeable that Louis, referring to the argument structure, related the evidence to the claim and even explicitly mentioned what he claimed implausible regarding the formation of THF. However, after that, he did not clearly separate the argument components evidence and reasoning. In the first part of his reasoning statement, he described the evidence again in other words, but did not explain why “the molecule must contort significantly”. Moreover, in the second part of his reasoning statement, there is another piece of evidence and corresponding reasoning statement regarding entropy.
In the adapted scaffold, Louis then received additional argumentation support and built the argument shown below for the reaction of methyl acetate and lithium diisopropylamide (LDA) to methyl acetoacetate via a Claisen condensation.
Claim: The reaction product is plausible.
Evidence 1: The amine is more likely to act as a base than nucleophile.
Reasoning 1: The amine is sterically bulky so it can’t approach an electrophile easily but can deprotonate a molecule.
Evidence 2: The amine is basic.
Reasoning 2: Since the amine has a negative charge and doesn’t have resonance structures it will be stabilized by receiving a proton and become neutral.
Evidence 3: The ester has acidic alpha protons.
Reasoning 3: These protons are acidic because the enolate conjugate can be resonance stabilized.
Evidence 4: The enolate can attack an ester because the carbanion is nucleophilic and the carbonyl is electrophilic.
Reasoning 4: The carbanion is nucleophilic because it is negatively charged and the carbonyl is electrophilic because the oxygen pulls electron density away.
At a first glance, it becomes clear that Louis has made a separation between evidence and reasoning. Each of his built pieces of evidence refers to the claim and answers the why-question. Furthermore, his reasoning statements consist of scientific principles that justify the evidence. By comparing these two formed arguments, Louis has improved significantly in building arguments by not only making a distinction between evidence and reasoning statements but also by building more pieces of evidence and reasoning, which gives depth to his argument. However, Louis can continue to improve in building arguments in the future. For example, he can concretize evidence 1 by already mentioning the structure of the amine. Furthermore, he can split evidence 4 and consider the nucleophilicity and electrophilicity separately from each other to address the electronic properties of the molecules more specifically in his reasoning statements.
When looking at the ConS group, the data analysis revealed a comparable trend: there was no significant change in the argumentation score (pre-M = 13, post-M = 14; V = 30, p = 0.823). In comparison, the concept knowledge score increased significantly from pre (M = 16) to post (M = 19), with a large effect (V = 82, p = 0.011, r = 0.66). This means that the adapted scaffold on the use of concept knowledge supported the students significantly in using concept knowledge but not in building argumentation patterns. This result is also encouraging as students in the ConS group only received additional support in the use of concept knowledge. Jessica, a participant in this study, serves as an example for the ConS group since she received 21/25 points in the argumentation score but 17/29 points in the concept knowledge score. After she worked with an adapted scaffold for the use of concept knowledge, she obtained 13/16 points in the argumentation score and 25/29 points in the concept knowledge score.
Jessica's example argument is also regarding the formation of THF as a reaction product from 4-chlorobutanol and hydroxide.
Claim: The reaction product is plausible.
Evidence 1: It forms stable products.
Reasoning 1: The Cl− is more stable than the OH−, H2O is more stable than OH−, and a 5-membered ring with oxygen is relatively stable.
Evidence 2: There could be conditions where this is the most favorable reaction.
Reasoning 2: Though the diol attacking itself to form a ring is likely not the most kinetically favorable reaction, sometimes certain conditions promote reactions like that.
Jessica supported her claim with pieces of evidence by stating that the products are stable and that the reaction would be possible under certain conditions. Both pieces of evidence are rather vague and do not give a concrete indication for which conceptual reasons the formation of THF is plausible. While Jessica has not used any incorrect concepts per se, she did not elaborate on them. Instead, she justified stable products by comparing stability, but without elaborating on the reasons for stability. She also did not specify the reaction conditions. However, after getting additional information on chemical concepts in the adapted scaffold, such as the pKa values of the involved molecules or the electronegativity values, she built the following argument on the formation of methyl acetoacetate from methyl acetate and LDA.
Claim: The reaction product is plausible.
Evidence 1: The product is not entropically unfavorable.
Reasoning 1: 3 molecules become 3 molecules
Evidence 2: The negative charge on the product is stabilized by resonance.
Reasoning 2: It can put negative charge on two different oxygens (allylic to two CO bonds).
Evidence 3: The negative charge is stabilized by inductive effects.
Reasoning 3: It is allylic to two CO bonds, which are electron-withdrawing.
Evidence 4: The ester will be more likely to donate a proton.
Reasoning 4: It has a lower pKa than the amine.
Evidence 5: The final products are all stable.
Reasoning 5: Methanol and the amine are both stable as well as the ester and enolate compound.
Jessica, like Louis in the ArgS group, improved the quality of her argument, in her case with respect to the concept knowledge used. With the help of the additional concept information, which she still had to interpret herself, Jessica used concepts such as entropy, electronic effects, and acidity. In contrast to the diagnostic scaffold, she did not remain vague but used explicit scientific principles to support and justify her claim. Jessica tried to include a variety of scientific principles in building her arguments, since, for example, the consideration of entropy alone would not have been a sufficient justification. Furthermore, she largely avoided building reasoning statements that merely repeat what she had already stated in her pieces of evidence, although such repetition is still apparent in reasoning 5. This is an evident improvement in her performance after the adapted scaffold. Nevertheless, Jessica could continue to improve in building arguments in the future. In evidence 5 and reasoning 5, it becomes apparent that she should engage again with the concept of stability, as she is unable to provide a satisfactory justification for the stability of molecules in both the diagnostic scaffold and the adapted scaffold.
The ArgConS group, which received support in both argumentation and concept knowledge, achieved a significant increase with a large effect from pre- to post-testing in both the argumentation score (pre-M = 8.5, post-M = 13.5, V = 193.5, p < 0.001, r = 0.74) and the concept knowledge score (pre-M = 13.5, post-M = 18, V = 158.5, p = 0.011, r = 0.57). The student Mike is exemplary of the ArgConS group; he received 10/25 points in the argumentation score and 16/29 points in the concept knowledge score in the diagnostic scaffold. After working with the adaptive scaffold on building arguments and using concept knowledge, he obtained 16/16 points in the argumentation score and 23/29 points in the concept knowledge score. In Mike's example argument, he claimed that an alkoxide is a plausible product of the reaction of 4-chlorobutanol and hydroxide.
Claim: The reaction product is plausible.
Evidence 1: The oxygen is better stabilized.
Reasoning 1: On a larger molecule, the negative charge is better stabilized because it can be stabilized through resonance.
Evidence 2: The smaller molecule is more stable.
Reasoning 2: A water molecule is much more stable than a hydroxyl group.
Like most of his fellow students, Mike tried to build a reasoning statement for each piece of evidence. Evidence 1 meets all conditions of an evidence statement: it answers the why-question and it is an explanation rather than a description. Reasoning 1 also formally meets the criteria, but technical deficiencies appear; for example, the alkoxide cannot be stabilized by resonance, which is a common misconception (Carle and Flynn, 2020). In the second part of the argument, Mike used the same argument as Jessica with respect to stability, as he tried to justify ‘the stability of molecules with their stability’ instead of justifying the stability with chemical concepts (see evidence 2 and reasoning 2). In reasoning statement 2, he did not provide any further information, so the justification of the evidence is not apparent. Furthermore, he remained vague in evidence 2 because, without the reasoning statement, it is not clear which molecule he referred to as the “smaller molecule”. Mike then worked on an adapted scaffold for building arguments and using concept knowledge and built the following argument for the formation of methyl acetoacetate from methyl acetate and LDA.
Claim: The reaction product is plausible.
Evidence 1: The reaction is entropically favored.
Reasoning 1: There are more products than reactants which leads to increasing disorder.
Evidence 2: The oxygen is stable with the negative charge.
Reasoning 2: Oxygen is electronegative and can stabilize the negative charge after the molecule is rearranged via resonance.
Evidence 3: The nitrogen is not very stable with the negative charge.
Reasoning 3: The nitrogen is not very electronegative and thus cannot stabilize the charge as well.
Evidence 4: The negatively charged product is larger.
Reasoning 4: The negative charge can be better stabilized.
Evidence 5: The ester is a better leaving group.
Reasoning 5: The negatively charged oxygen is protonated, which makes its formation more favorable.
From an argument structure point of view, Mike improved considerably. All of his built pieces of evidence supported the claim and all reasoning statements justified the evidence. He also used scientific principles as justification. From a conceptual point of view, Mike has improved, but this does not mean that all statements were technically correct. For example, he talked about the reaction being entropically favoured due to a higher number of products compared to reactants, which is incorrect because there are three molecules involved in the reaction on both the reactant and product side. Furthermore, in evidence 5, he referred to an ester as a leaving group. Here it is impossible to understand which molecule Mike was referring to as the leaving group since no ester is split off during the reaction. Nevertheless, Mike's improvement is evident across all arguments, both in terms of argument structure and the use of concept knowledge. In the future, Mike could further be supported in building arguments. Thereby, he can concretize his justification, for example by providing a counterpart in comparisons (e.g., X is a better leaving group than Y or molecule X is larger than molecule Y).
The scores in the ReaS group suggest that, unlike the other groups, students in this group did not achieve a significant improvement in either the argumentation score from pre (M = 14) to post (M = 16), V = 34, p = 0.534, or the concept knowledge score from pre (M = 21) to post (M = 25), V = 51, p = 0.118. This is not surprising because the students already received high scores in the diagnostic scaffold. Thus, the fact that the scores did not differ significantly is a good result. Rachel is a representative student of the ReaS group. She received 23/25 points in the argumentation score and 26/29 points in the concept knowledge score in the diagnostic scaffold. For the formation of THF as a product of the reaction of 4-chlorobutanol and hydroxide, Rachel built the following argument.
Claim: The reaction product is plausible.
Evidence 1: Intramolecular reactions are faster than intermolecular ones.
Reasoning 1: Rate is dependent on concentration of substrate. When the reactants are connected, there is essentially limitless substrate and thus this reaction can take place quite quickly.
Evidence 2: Hydroxide will deprotonate the hydroxyl.
Reasoning 2: The pKa of hydroxyl is similar to that of hydroxide and they will thus exist in a proton transfer equilibrium. When the alkoxide ion is formed it will react with the nearby electrophile.
Evidence 3: This reaction is enthalpically favorable.
Reasoning 3: Five membered rings are stable because they have optimal bond angles for sp3 hybridization. This means that they are lower in energy/more stable and thus this reaction will be exothermic.
All arguments built by Rachel can be used as sample solutions for other students. She was able to separate her pieces of evidence from her reasoning statements as the pieces of evidence supported her claim and the reasoning statements justified the pieces of evidence. Moreover, Rachel used several chemical concepts such as enthalpy, kinetics, and basicity in her argumentation. After the adapted scaffold, Rachel obtained 16/16 points in the argumentation score and 28/29 points in the concept knowledge score. The students of the ReaS group were the only ones who were additionally prompted to build up to three reasoning statements for one piece of evidence. For the formation of an enolate as a product of the reaction of methyl acetate and LDA, Rachel built the following argument.
Claim: The reaction product is plausible.
Evidence 1: The amide anion is highly basic.
Reasoning 1.1: High electron density on nitrogen.
Reasoning 1.2: Nitrogen adjacent to electron donating groups.
Reasoning 1.3: Bulky groups mean it won’t act as a nucleophile.
Evidence 2: The alpha proton is slightly acidic.
Reasoning 2.1: The negative charge from removing a proton will be resonance stabilized.
Reasoning 2.2: The carbonyl contributes an inductive effect.
Evidence 3: The reaction is enthalpically favorable.
Reasoning 3.1: A weaker acid is formed than on the reactant side (protonated amide).
Reasoning 3.2: A weaker base is formed (enolate) than on the reactant side.
Reasoning 3.3: The enolate is resonance stabilized.
Rachel improved in both her argumentation score and her concept knowledge score, although she had already performed well in the diagnostic scaffold. It is noticeable that her arguments are at a high level. Her evidence and reasoning statements answered the why-questions, and her use of scientific principles is multivariate. By building up to three reasoning statements, Rachel became more detailed and justified her pieces of evidence from multiple perspectives. In her reasoning statements for evidence 1, for example, she justified the basicity of LDA given the electronic effects of nitrogen, and the electronic and steric effects of the adjacent groups. In Rachel's case, she could be further supported to include more scientific principles in her arguments in the future, e.g., arguing with entropy or pKa values.
In summary, the adapted scaffolds improved students’ performance in the respective areas of support. The groups that received support in building arguments (ArgS and ArgConS) improved significantly with a large effect on the argumentation score, and the groups that received additional support in using concept knowledge (ConS and ArgConS) improved significantly with a large effect on the concept knowledge score. Only the ReaS group showed no significant improvement, which is not surprising because the students of the ReaS group already had high scores in the diagnostic scaffold. Moreover, the fact that only the groups who received extra support in argumentation patterns and/or concept knowledge improved their performance in this area indicates that the improvement is not a simple training effect. Instead, the adaptive scaffold might be responsible for students’ improvement.
In this study, we investigated if adapted scaffolding that provides students with support in the area of argumentation and use of concept knowledge can make a significant difference in performance. Based on a diagnostic scaffold, which served as a pre-measure to analyse how students build arguments and how they used concept knowledge, students received an argumentation and concept knowledge score. They were then assigned to one of four adapted scaffolding groups: support for argumentation (ArgS), concept knowledge (ConS), argumentation and concept knowledge (ArgConS), and multivariate reasoning (ReaS). Consequently, each group received a different scaffold tailored to their needs in argumentation and/or use of concept knowledge. An argumentation score and concept knowledge score were given to each student based on their performance in the adapted scaffolds (post-measure).
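The assignment step described above can be read as a simple decision rule on the two diagnostic scores. The following Python sketch is illustrative only: the function name `assign_group` and the cutoff parameters are hypothetical and are not taken from the study, which based the grouping on students' performance in the diagnostic scaffold without the rule being restated here.

```python
def assign_group(arg_score, con_score, arg_cutoff, con_cutoff):
    """Map two diagnostic scores to one of the four support groups.

    arg_cutoff and con_cutoff are hypothetical thresholds (assumptions for
    illustration); a score below a cutoff is treated as a need for support.
    """
    needs_argumentation = arg_score < arg_cutoff
    needs_concepts = con_score < con_cutoff
    if needs_argumentation and needs_concepts:
        return "ArgConS"  # support for argumentation and concept knowledge
    if needs_argumentation:
        return "ArgS"     # support for argumentation only
    if needs_concepts:
        return "ConS"     # support for concept knowledge only
    return "ReaS"         # no gap in either area: multivariate reasoning


# Usage with made-up cutoffs of 12 (argumentation) and 18 (concept knowledge)
print(assign_group(10, 16, arg_cutoff=12, con_cutoff=18))
print(assign_group(23, 26, arg_cutoff=12, con_cutoff=18))
```

Keeping the two needs as separate boolean flags mirrors the separate scoring of argumentation and concept knowledge, so each area of support is decided independently of the other.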
The first research question in this study examined to what extent the support groups differed from each other in the pre- and post-measure. When evaluating students’ answers, it became apparent that the groups differed significantly from each other in the pre-measure. In particular, students grouped into the ReaS group already built well-grounded arguments after the diagnostic scaffold (pre-measure) as multiple pieces of evidence and reasoning were used as support and justification of the claim. Both evidence and reasoning statements consisted to a large extent of scientific principles and answered the why-questions. The other three groups still showed some gaps in the pre-measure in terms of argumentation and/or the use of concept knowledge. A closer look at the analysis revealed that the ArgS and the ArgConS groups each differed significantly with high effects (r = 0.58 and r = 0.78) in terms of the argumentation score from the ConS and the ReaS groups after the diagnostic scaffold. However, the ArgS and ArgConS groups, as well as the ConS and the ReaS groups, did not vary significantly from each other, which exemplifies that the argumentation score determined group performance differences in building arguments but not in terms of concept knowledge. This indicated that the grouping of the students in the ArgS and the ArgConS groups in the adapted scaffolds for argumentation patterns was successful. By comparing the groups in terms of concept knowledge, the ConS and the ArgConS groups differed significantly from the ArgS and the ReaS groups with high effects (between r = 0.77 and r = 0.79) in the pre-measure. Therefore, the grouping in the adapted scaffolds for the use of concept knowledge was also successful. An exception is evident when comparing the ConS and ArgConS groups. Both groups differed significantly from each other with a high effect (r = 0.51) after the diagnostic scaffold, although both groups received an adapted scaffold for concept knowledge.
However, this can be explained as students in the ArgConS group were conceptually weaker in comparison to all other groups. Based on this first analysis, the initial grouping was successful and the groups received the support in the respective area needed (i.e., regarding argumentation and the use of concept knowledge). The second part of the first research question investigated whether the gap that occurred between the groups at the beginning could be closed using the adaptive scaffold. No significant differences between the groups were apparent in the argumentation skill after applying the adapted scaffold. In the area of concept knowledge, a significant difference with a medium effect (r = 0.43) only occurred between the ArgConS and ReaS groups. All other group comparisons showed no significant differences. It can therefore be assumed that the adaptive scaffolds closed the gap in students’ performance. More chemical concepts were used to build arguments. The link between concept knowledge and the argument components is a key aspect in building arguments as it is considered an important part of the quality of an argument (Sandoval and Millwood, 2005; Choi et al., 2013). In this context, one might assume that argumentation and concept knowledge can be fundamentally distinguished from each other. Songer and Gotwals (2012) reported a connection between concept knowledge and argumentation, which was not found in this study. However, one should not interpret this as evidence for the independence of argumentation and concept knowledge. The scoring process, separating argumentation and concept knowledge, did not explicitly acknowledge this linkage since strict attention was paid to the fact that both topics were considered separately from each other in the scoring process.
Thus, in building arguments, technical correctness was not considered, and in the use of concept knowledge, no attention was given to whether the argument components (evidence and reasoning) were built correctly.
In the analysis of the second research question, the scoring results of the pre-post comparisons were compared to determine a possible improvement in each support group. Here, with respect to the argumentation score, the two groups that improved significantly with a high effect (r = 0.68 and r = 0.74) were those that received additional support for argumentation (ArgS and ArgConS group). Similarly, for the concept knowledge score, only the two groups that received support for the use of concept knowledge improved significantly with a high effect (r = 0.57 and r = 0.66) (ConS and ArgConS group). These results suggest that the adaptive scaffolds targeted areas where support was needed. Thus, this study demonstrates that an adaptive scaffold improved students’ performance in the respective area of support. In organic chemistry, scientific reasoning on reaction pathways and products requires considering multiple chemical concepts in the decision-making process so that alternative reaction products and by-products can also be considered (Popova and Bretz, 2018). The implementation of this adaptive scaffold is useful in supporting students in applying the content to context, such as in suggesting alternative reaction products (Chen, 2014; Graulich and Caspari, 2021).
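To make the reported pre-post statistics easier to follow, the sketch below shows one common way to obtain a signed-rank statistic V and an effect size r for paired scores. It rests on assumptions that the text does not confirm: that V is the sum of positive ranks from a Wilcoxon signed-rank test, that r is computed as |Z|/√n from the normal approximation without tie or continuity correction, and that n counts only non-zero differences. The function name and the data are hypothetical.

```python
import math


def signed_rank_effect_size(pre, post):
    """Return (V, z, r) for paired scores via the normal approximation.

    Assumption: r = |z| / sqrt(n), with n the number of non-zero differences.
    """
    # Paired differences; pairs with no change are dropped
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # V is the sum of the ranks that belong to positive differences
    v = sum(rank for rank, d in zip(ranks, diffs) if d > 0)
    mean_v = n * (n + 1) / 4
    sd_v = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)  # no tie correction
    z = (v - mean_v) / sd_v
    return v, z, abs(z) / math.sqrt(n)


# Made-up scores for five students (not data from the study)
pre_scores = [10, 8, 12, 9, 11]
post_scores = [14, 13, 15, 9, 16]
v, z, r = signed_rank_effect_size(pre_scores, post_scores)
print(f"V = {v}, z = {z:.2f}, r = {r:.2f}")
```

Statistical packages often apply tie and continuity corrections or exact p-values, so values from this sketch may differ slightly from those of a full analysis.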
Fig. 6 Exemplary arguments for the four alternative reaction pathways in the diagnostic scaffold and the four alternative reaction pathways in the adapted scaffolds.
Fig. 8 An example of an alternative reaction pathway for the reaction of 4-chlorobutanol and hydroxide.
Questions on chemical concepts | Student example |
---|---|
Decide whether steric aspects need to be considered in the reaction and explain why you think so. | “I do not think that steric aspects need to be considered in the reaction because the reaction is taking place on a primary alkyl halide. A primary alkyl halide only has one other non-hydrogen substituent so it is relatively unhindered. Therefore, the –OH can attack the carbon without experiencing significant hindrance from other substituents which would be steric considerations.” |
Approximate the pKa values of the involved molecules in this reaction, or, if you do not know, outline how you think the pKa values of the different molecules compare to each other (e.g., molecule x has the highest and molecule y the lowest pKa) | “I'd say the chlorobutanol is of a pKa near 15 because I think that's the pKa of water. The –OH itself probably has a pKa of water too because the –OH is the conjugate base of water.” |
Determine at which positions you think the involved molecules react as a nucleophile and at which positions they react as an electrophile. Explain your thinking. | “The O of the OH group on the alkane acts as a nucleophile along with the O on the hydroxide ion because they have extra electron density.” “I would expect the carbon bonded to the chlorine to be the most electrophilic site because chlorine is very electronegative and will pull electron density away from the carbon making it electrophilic. The carbon bonded to the hydroxyl group will be electrophilic for the same reason (oxygen is electronegative) but not as electronegative as the aforementioned carbon because oxygen is not as electronegative as chlorine.” |
Determine at which positions you think the involved molecules react as an acid and at which positions they react as a base. Explain your thinking. | “The 4-chloro-1-butanol reagent will act as a weak acid due to its mildly acidic hydroxyl group. This molecule will only lose its hydroxyl proton to moderately or strongly basic species that react to form a conjugate acid with a pKa higher than that of 1-chloro-4-butanol (a higher pKa corresponds to a less acidic, and thereby lower-energy, product).” “On the hydroxide, oxygen molecule is primary source of basicity due to negative charge on it.” |
Determine whether you think there are any effects that stabilise your product compared to the reactants. If so, explain how the effect/s stabilise the product. | “The product is stable because it is a five membered ring. This structure allows for optimal bond angles for sp3 hybridization. The hydroxyl becomes protonated to form water, which is more stable since there are no formal charges in the molecule as oxygen forms two bonds.” |
Determine whether you think there are any entropic effects that influence the reaction process. If so, explain why you think so. | “I do not think there are entropic effects that influence the reaction. There are two starting molecules and two products.” |
Determine whether electronic effects (e.g., inductive effects, resonance, electronegativity,…) influence the reaction process and why you think so. | “I think that the only electronic effect here is electronegativity and induction, as the Cl–C bond is polarized so that the carbon is a slightly positive center; there are no double bonds to induce resonance. The Cl− is a better leaving group than OH partially because Cl is more electronegative than O, at least it is more electronegative towards other potential electrons than the effective electronegativity of an O bonded already to one H. The stronger inductive effects and electronegativity of Cl make it a better leaving group than the OH on the alcohol/chloride alkane.” |
Decide whether the reaction is reactant- or product-favoured from an energetic perspective (enthalpy). Explain your thinking. | “Product-favored. C–O bonds are stronger than C–Cl bonds due to the shorter length of C–O, so this substitution lowers the energy of the system.” |
Groups | M pre | M post | p | r |
---|---|---|---|---|
Argumentation score | ||||
ArgS | 10 | 14.5 | 0.007 | 0.68 |
ConS | 13 | 14 | 0.823 | |
ArgConS | 8.5 | 13.5 | <0.001 | 0.74 |
ReaS | 14 | 16 | 0.534 | |
Concept knowledge score | ||||
ArgS | 21 | 20 | 0.299 | |
ConS | 16 | 19 | 0.001 | 0.66 |
ArgConS | 13.5 | 18 | 0.011 | 0.57 |
ReaS | 21 | 25 | 0.118 | |
This journal is © The Royal Society of Chemistry 2022