Andrew Kreps, Kodi Dailey and Renee Cole*
Department of Chemistry, University of Iowa, Iowa City, IA 52242, USA. E-mail: akreps@tacomacc.edu
First published on 14th July 2025
This study explores the impact of cognitive scaffolding on group learning to establish the effective use of Marzano's taxonomy as both a diagnostic and developmental tool. Using the cognitive levels of processing as defined by Marzano's taxonomy, activities in an introductory chemistry course were revised to embed support in the structure of questions. Data were collected from two cohorts: Cohort 1, which engaged with original activities, and Cohort 2, which used our cognitively scaffolded activities. The ICAP framework provided the theoretical lens for analyzing social processing, knowledge dynamics, and modes of reasoning within the groups. Findings suggest that cognitive scaffolding helped sustain higher cognitive engagement while promoting the use of reasoning during higher-order questions. Although changes in social dynamics were less pronounced, these findings demonstrate the potential of cognitive scaffolding to support student learning. The study additionally highlights the nuanced nature of collaborative learning, with factors such as group composition, course climate, and facilitation methods influencing engagement outcomes.
Studies suggest that group work is most effective for tasks that are cognitively challenging and allow for student autonomy; in contrast, retention-focused questions tend to favor individual work (Kirschner et al., 2009; Scager et al., 2016; Wilson et al., 2018). While challenging tasks can enhance learning, they may also lead to cognitive overload and uneven participation, particularly in newly formed groups (Janssen et al., 2010). As Reid et al. (2022a,b) and Nennig et al. (2023) observe, group work is often implemented without adequate structure or feedback, and students may lack awareness of effective collaborative strategies. They advocate for the inclusion of explicit prompts that scaffold students toward productive reflection. These findings underscore the critical role of activity design in shaping the quality of student engagement in active learning environments and form the foundation for the present work.
While the benefits of scaffolding, more specifically the structure of questions, are well established, there is limited research exploring how students in general chemistry respond to specific scaffolds, with even fewer studies focusing on the impact scaffolding has on group learning. Vo et al. (2025) discussed the prevalence of studies in organic chemistry courses, emphasizing the need for further investigations into scaffolding within general chemistry settings, particularly those that address the cognitive, behavioural, affective, and social dimensions (Fredricks et al., 2004; Vo et al., 2025). Although prior studies highlight the importance of balancing structure and autonomy to support reasoning, much of this work relies on lengthy, iterative revisions or interviews to refine scaffolded prompts, underscoring the time- and labour-intensive nature of developing effective materials (Noyes and Cooper, 2019; Noyes et al., 2022). While these approaches offer valuable models, they may be difficult for instructors to adopt without significant support.
In contrast, we propose the use of Marzano's taxonomy as a structured and more accessible framework for evaluating and developing scaffolded activities based on the levels of cognitive processing the taxonomy describes. Work by Toledo and Dubas (2016), along with others, demonstrated the successful application of Marzano's taxonomy to redesign instruction and assessment in a general chemistry course (Toledo and Dubas, 2016; James and LaDue, 2021). Our study builds on this by presenting a method for revising in-class group activities that embeds cognitive scaffolding.
This work addresses key gaps in the literature by first establishing the viability of using Marzano's taxonomy to design cognitively scaffolded in-class activities and then analyzing their impact on students' reasoning, knowledge dynamics, and social interactions relative to groups working with activities that were not cognitively scaffolded. We analyzed student discussions for evidence of knowledge construction, social processing, and reasoning strategies as they engaged with cognitively scaffolded in-class activities. The following sections introduce the coding scheme used in this study, provide relevant background, and explain how they relate to our central research question:
How does the cognitive scaffolding of in-class group activities developed with Marzano's taxonomy impact students’ cognitive and social engagement with chemical ideas?
Nokes-Malach et al. (2015) broadly define collaboration as “active engagement and interaction among group members to achieve a common goal,” emphasizing that collaborative learning is most effective when both cognitive and social processes are supported. Prior research has demonstrated that specific forms of discourse, such as complex reasoning, justification of ideas, and building upon peers’ knowledge, can improve learning outcomes and deepen conceptual understanding in chemistry (Towns, 1998; Michaels et al., 2008; Criswell, 2012; Moon et al., 2017b; Nennig et al., 2023). Therefore, to analyse the effects of cognitive scaffolding on group dynamics, we adopted a coding scheme that captures the social and epistemic dimensions of group interaction as well as the complexity of students’ reasoning.
Reasoning is integral to productive discourse as it promotes and supports critical thinking, deeper understanding, the construction of arguments, and the provision of justification (Rojas-Drummond and Mercer, 2003; Mercer et al., 2004; Michaels et al., 2008). Yet, chemistry students tend to rely on a relational type of reasoning, where the connection between the features of a system and its behavior is made, leading to the construction of simple causal explanations and the reduction of variables considered (Sevian and Talanquer, 2014; Moreira et al., 2019; Deng and Flynn, 2021). Sevian and Talanquer (2014) defined four modes of reasoning that describe different levels of complexity: descriptive, relational, linear causal, and multicomponent. The descriptive and relational modes of reasoning represent surface-level reasoning, focusing on explicit features of systems and relying on relationships to drive reasoning. In the linear causal and multicomponent modes, students construct a cause-and-effect argument. In the multicomponent mode, students consider and explore multiple variables, whereas in the linear causal mode several variables may be mentioned but are then reduced to a primary agent.
This work is therefore informed by the taxonomy through its application in restructuring in-class group activities to incorporate cognitive scaffolding. In cognitive literature, this is referred to as cognitive structuring, which Tharp and Gallimore (1988) define as “… the provision of a structure for thinking and acting.” Previous work regarding cognitive structuring demonstrates that carefully considering the changes in the cognitive load placed on the learner can reduce the chance of cognitive overload (Sweller, 1994; van de Pol et al., 2015; Zulu et al., 2018). This approach offers benefits similar to established methods of scaffolding, but without requiring the labour-intensive iterative development often needed to refine prompts. Moreover, it is flexible enough to be applied across an entire course, from overall design to in-class activities.
As an example of how the taxonomy was used, Fig. 1 demonstrates sample questions from one of the activities used in this study before and after cognitive restructuring. The first three questions in both versions of the activity are identical and relatively straightforward, asking students to calculate the change in entropy and Gibbs free energy. According to Marzano's taxonomy, these questions reflect Retrieval-level processing because they require students to recall information from permanent memory or apply formulas already provided. The final questions in each version (Questions 11 and 9e) represent Analysis-level processing, where students must synthesize prior knowledge to generate new insights—for instance, by evaluating the effect temperature can have on the direction of a reaction. In the original activity, students are asked to move directly from a Retrieval-level task to an Analysis-level one. This abrupt jump in cognitive processing increases task complexity and may raise the risk of cognitive overload (Sweller, 2010; Chen et al., 2023).
To address this, the revised activity introduces an intermediate question (9d), which asks whether the reaction is favorable and prompts them to explain their reasoning. This question engages Comprehension-level processing by drawing attention to key features of the Gibbs free energy equation, such as the variables it contains. This question scaffolds their reasoning and supports a smoother cognitive progression by prompting students to reflect on these critical features before making interpretive judgments or providing reasoning. Highlighting critical features before asking students to generate new insights is one example of the cognitive scaffolding made possible through restructuring guided by Marzano's taxonomy. For further discussion about the taxonomy and its application in evaluating cognitive scaffolding, see our previous work (Kreps et al., 2024).
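For readers less familiar with the chemistry behind these questions, the temperature dependence they target follows from the standard Gibbs free energy relation. The worked values below are hypothetical and are not taken from the activity; they assume that the enthalpy and entropy changes do not vary with temperature, and they only illustrate the kind of judgment the question sequence builds toward.

```latex
\begin{align*}
\Delta G &= \Delta H - T\,\Delta S, \qquad \text{favorable when } \Delta G < 0 \\
T^{*} &= \frac{\Delta H}{\Delta S} \quad \text{(temperature at which } \Delta G = 0\text{, when } \Delta H \text{ and } \Delta S \text{ share a sign)} \\[4pt]
\text{Assumed: } & \Delta H = -100\ \mathrm{kJ\,mol^{-1}}, \quad \Delta S = -0.200\ \mathrm{kJ\,mol^{-1}\,K^{-1}} \\
T^{*} &= \frac{-100}{-0.200}\ \mathrm{K} = 500\ \mathrm{K} \\
\Delta G(298\ \mathrm{K}) &= -100 - (298)(-0.200) = -40.4\ \mathrm{kJ\,mol^{-1}} \\
\Delta G(600\ \mathrm{K}) &= -100 - (600)(-0.200) = +20.0\ \mathrm{kJ\,mol^{-1}}
\end{align*}
```

With these assumed values the reaction is favorable below 500 K and unfavorable above it, which is the type of conclusion an Analysis-level question about temperature and reaction direction asks students to reach.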
Fig. 2 The layout of the room where all discussions took place, with a representative example of the placement of the video and audio recorder.
The primary goal of the discussion was to provide an opportunity for students to work together on concepts they were learning, and it was expected that they would answer as many questions as possible in the allotted time. Nevertheless, answering every question in the given amount of time was not always possible, so grading was done by the GTA based only on the completion of questions answered. Each activity contained an overview of the learning objectives, applicable skills for the activity, questions, and a reflection. The reflections usually occurred in the middle of the activity, focusing on a specific process skill characteristic, and used portions of the Enhancing Learning by Improving Process Skills in STEM (ELIPSS) rubrics to aid students' reflection (Cole et al., 2019; Reynders et al., 2020; Czajka et al., 2021). The activities also included the roles of Manager, Recorder, Spokesperson, and Reflector as described by the POGIL (Process Oriented Guided Inquiry Learning) Project (Hoffman and Richardson, 2019). Students were meant to cycle through the roles each week to promote team learning, foster collaborative skills, and ensure no one student would always have the role of recorder, who was responsible for recording the group's answers and submitting the completed activity to the course learning management system.
Data collection occurred over two iterations of the second-semester introductory chemistry course, hereafter referred to as Cohort 1 and Cohort 2. Between these cohorts, a change was made to the facilitation methods and the activities the students worked on. For Cohort 1, each discussion began with a brief iClicker quiz before groups started the activity. For Cohort 2, the iClicker quiz was removed, allowing groups to begin the activity immediately.
In the first two weeks of each semester for both cohorts, researchers obtained consent from course instructors and GTAs. In week two, video recordings of the discussion sections were collected to help identify the groups to observe for the rest of the semester. Purposeful sampling, based on the days of the week and video recordings (Patton, 2014), ensured that selected groups worked collaboratively and did not complete activities prior to the discussion section. Two groups were selected from each semester: one meeting on Monday and the other on Thursday. This selection ensured that the groups sampled reflected any content knowledge gained during the lectures they attended throughout the week. In week three, researchers collected consent from the students in these groups, and from that point forward, the data collection process remained consistent across both cohorts. From weeks three to fifteen, each group was recorded using a video camera in the room, an audio recorder at the group's table, and an iPad with the week's activity uploaded and screen recording enabled. Students used the iPads to complete and submit their activities for GTA grading. Table 1 contains the reference names of each group along with the pseudonyms assigned to the students.
| Day | Cohort 1 | Cohort 2 |
|---|---|---|
| Monday | Gold group | Blue group |
| | Hodgkin | Sheeana |
| | Daly | Alice |
| | Molina | Dorothy |
| | Olga | |
| Thursday | Red group | Purple group |
| | Ball | Miles |
| | Arrhenius | Henry |
| | Bohr | Helen |
| | Bragg | Linus |
The audio–video data from each discussion was synced and uploaded to MAXQDA coding software (VERBI Software, 2021). Once uploaded, we used the software to “timestamp” sections of the video where groups worked on each question. These timestamps marked specific units of discourse to be analyzed during the coding process, starting from when students began a question and ending when they moved on to the next. This approach was intended to capture the evolving dynamics within each group as they progressed through the activities rather than coding individual tasks in isolation. Screen recordings from the iPads supplemented the audio/video files, providing additional context if group conversations were unclear or offering further insight into the group's thinking and work while answering questions.
In the original Cohort 1 activities, students were instructed to watch a linked video related to the activity's content before attending discussion, where they would then summarize what they learned as a group. Most of these videos were replaced with “Pre-class Questions” (PcQs), which students completed individually before briefly comparing answers or definitions with group members. For Activity 12, these PcQs asked students to: list two strong and two weak acids (PcQ 1), describe what a resonance structure is (PcQ 2), and identify the features they use to determine acid strength and draw the conjugate base (PcQ 3). These PcQs established a shared foundation of knowledge, helping to minimize cognitive overload during more complex questions (van Merrienboer et al., 2003). They also supported the development of a shared language, consistent with other studies on scaffolding, and helped reduce confusion during peer or GTA interactions (Parker Siburt et al., 2011; Vo et al., 2022).
For the remainder of the activity, two of the authors revised the question order, added new prompts, or rewrote existing ones based on cognitive processing levels. In the original version, Question 4a–d asked students to write the reactions of a strong and weak acid in water, determine which proceeds to a greater extent, and justify their reasoning. Based on Marzano's taxonomy, Questions 4a and 4b are Retrieval-level, 4c is a Comprehension-level task that emphasizes key conceptual features, and 4d is an Analysis-level question requiring students to draw a connection between concepts. While this sequence shows some cognitive structuring, our goal was to reduce low-level tasks, which typically do little to promote collaborative learning (Kirschner et al., 2009; van de Pol et al., 2015). Additionally, by analyzing the cognitive levels of the questions, we recognized a disconnect between our intended focus for the activity and the structure of the questions, particularly in Question 4d, which asked students to connect reaction extent with acid strength. While this question reflects higher-order thinking, it promotes conceptual integration that students may not yet be ready for, especially as they are still developing foundational strategies for evaluating acid–base strength.
In the revised Cohort 2 activity, we redirected higher-order thinking toward understanding acid strength by adding Questions 2–5. These questions adopted a problematizing approach, prompting students to reassess their understanding and identify knowledge gaps (Phillips et al., 2017). Questions 2 and 3a are Comprehension-level tasks where students write their own method for evaluating acid strength and draw the conjugate base of given molecules. These tasks emphasize structural and electronic features in Lewis structures, such as formal charge, electronegativity, and resonance, encouraging representational reasoning grounded in visual interpretation. This prepares students for the Analysis-level Question 3b, which asks them to identify the stronger acid and justify their answer. Question 4 presents a new set of molecules for ranking in terms of acidity, while Question 5 prompts students to re-evaluate their original method for determining acid strength. These revisions ensure that higher-order tasks are now more intentionally scaffolded, supporting deeper understanding and promoting the transfer of knowledge to long-term memory.
The final questions in the revised (Cohort 2) activity shift focus to ranking base strength. While no direct cognitive scaffolding is provided for these questions, this aligns with the literature on fading support, which suggests that scaffolding should gradually decrease to foster independent problem-solving (Belland, 2014). Although the prior questions focused on identifying and reasoning about acid strength, students must now apply their learning to bases without direct guidance. Whether this is an ideal example of fading remains an open question, but it highlights the taxonomy's broader utility, especially for addressing gaps in the scaffolding literature related to investigating the impact of fading scaffolding (Vo et al., 2025). Future research could build on this by designing activities with varying types of cognitive scaffolding to systematically explore the effects of fading (see the work of Kreps et al. (2024) for further discussion around different types of cognitive structuring).
Each activity also included a reflection section, completed collaboratively before proceeding to the next set of content questions. These reflections incorporated categories from the ELIPSS rubrics (Cole et al., 2017; Czajka et al., 2021), with each reflection focused on a specific process skill. To guide discussion, the reflections included observable indicators and suggestions for improvement. In Cohort 1, reflections were presented as pauses within the activity. For Cohort 2, transitions were restructured to encourage more intentional and content-connected reflection. This change was motivated by research on constructive alignment and self-regulated learning, which emphasizes that reflection promotes active engagement when closely tied to the task (Biggs, 1996; Zimmerman, 2002). In the revised activities (Cohort 2), reflections began with a brief description of a process skill the group had just practiced, followed by a prompt to consider its importance before completing the rubric.
There are four knowledge dynamic codes: not observable, knowledge sharing, knowledge application, and knowledge construction (Reid et al., 2022b). Knowledge sharing refers to groups exchanging information without it being added to or challenged by other members. A common example of this dynamic is the discourse surrounding Pre-class questions, where a student shares a definition of entropy that they have already written down, and the group merely agrees or says, “Sounds good to me.” Knowledge application refers to the use of information to solve a problem, with a clear understanding of its relevance. For instance, in a question asking which conjugate base is more stable, students first discuss stability before applying that knowledge to the specific molecules. The key distinction between knowledge sharing and knowledge application is that students actively channel the information to address the situation and arrive at an answer. Knowledge construction involves critiquing and building upon information. While knowledge application focuses on using information, knowledge construction emphasizes evaluating, critiquing, and testing the information. Not observable is just that; when no observable knowledge dynamic was seen, it would be coded as such.
The social processing codes are non-interactive, collaborative, tutoring, leader, domination, individualistic, confusion, multicomponent, and other (Reid et al., 2022b). Non-interactive was used when no social interactions were taking place. Collaborative represents groups that work together to find a solution to a question. Tutoring was used when a group's interaction primarily consisted of a student answering questions from other group members. Leader is representative of a single student constructing the response due to a lack of engagement from the other group members. Domination was used when a single student would construct a response while ignoring or rejecting other students’ input. Individualistic is representative of a group that does not interact with one another while constructing a response. Where individualistic differs from non-interactive is that individualistic includes an interaction where students may check-in on an answer that was developed independently. Confusion was used when the group spent the majority of time too confused to construct a response. Multicomponent was used for longer timestamps when the group displayed more than one dominant social interaction, and two other social processing codes would be selected as sub-codes for the interaction. The most common example of this was when students were confused about how to answer a question and spent an extended period trying to figure it out before asking for help from the GTA, after which they worked collaboratively on finishing the question and, therefore, would have been coded as multicomponent: confusion/collaborative. Other was used when the group's social interactions did not fall into another category and will be accompanied by a description when applicable. Inter-rater reliability (IRR) was determined using the Kappa statistical analysis for knowledge dynamic and social processing. 
Twenty percent of the interactions were randomly selected, and final values of 0.80 and 0.83 were calculated for Cohorts 1 and 2, respectively, indicating strong agreement (McHugh, 2012).
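One way to picture the social processing scheme is as a small data structure in which only multicomponent interactions carry sub-codes. The sketch below is illustrative only: the class name `SocialProcessingCode`, its fields, and the validation rules are assumptions made for this example and do not describe how codes were stored in MAXQDA.

```python
from dataclasses import dataclass
from typing import Tuple

# Code names taken from the scheme described above (Reid et al., 2022b).
SOCIAL_CODES = frozenset({
    "non-interactive", "collaborative", "tutoring", "leader", "domination",
    "individualistic", "confusion", "multicomponent", "other",
})


@dataclass(frozen=True)
class SocialProcessingCode:
    """Hypothetical record for the social processing code of one timestamp."""
    code: str
    sub_codes: Tuple[str, ...] = ()   # used only for multicomponent
    description: str = ""             # free-text note, e.g. for "other"

    def __post_init__(self):
        if self.code not in SOCIAL_CODES:
            raise ValueError(f"unknown social processing code: {self.code!r}")
        if self.code == "multicomponent":
            allowed = SOCIAL_CODES - {"multicomponent"}
            distinct = set(self.sub_codes)
            # Assumption: the two sub-codes are distinct codes other than multicomponent.
            if len(self.sub_codes) != 2 or len(distinct) != 2 or not distinct <= allowed:
                raise ValueError("multicomponent needs exactly two other codes as sub-codes")
        elif self.sub_codes:
            raise ValueError("only multicomponent interactions take sub-codes")

    def label(self):
        """Render the code in the 'multicomponent: a/b' style used in the text."""
        if self.sub_codes:
            return f"{self.code}: {'/'.join(self.sub_codes)}"
        return self.code


example = SocialProcessingCode("multicomponent", ("confusion", "collaborative"))
print(example.label())  # prints "multicomponent: confusion/collaborative"
```

Treating the sub-codes as part of the record, rather than as separate codes, keeps one social processing code per timestamp while still preserving which interactions made up a multicomponent segment.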
The Modes of Reasoning codes are descriptive, relational, linear causal, and multi-component (Sevian and Talanquer, 2014). Descriptive reasoning refers to instances where a group's focus is on explicit features of a system or question without making connections to cause. For instance, a question that prompts students to rank acid strengths might elicit a response like, “This molecule has a lower pKa, so it must be stronger,” where the reasoning is based on a straightforward observation of a property. Relational was representative of when a group connected two features/properties of a system without explanation and contained an additional level: multi-relational, where multiple relations between features/properties were used (Weinrich and Talanquer, 2016). For example, students might expand on the previous reasoning with a statement like, “This molecule is more stable after losing hydrogen; therefore, it is stronger,” indicating a basic cause-and-effect relationship. Linear causal adds complexity by introducing a sequential chain of events, incorporating both cause and effect along with a justification or explanation. Here, students not only recognize relationships but also elaborate on them, explaining why increased stability would contribute to the strength of a molecule's acidity. Multi-component represents when a group identifies relevant interactions, and the effects of several features/properties are considered and weighted. Multi-component contains two subcodes, isolated where the effects of several variables are recognized and considered separately, and integrated where the effects of several variables are interconnected and systematically explained. For any case where no reasoning was present during discourse, an N/A code was applied. Two researchers independently coded the students' discourse for Cohort 1 and then discussed each code to come to a negotiated agreement. 
For Cohort 2, inter-rater reliability (IRR) was determined using the Kappa statistical analysis. Twenty percent of the interactions were randomly selected, and a final value of 0.73 was calculated, indicating substantial agreement (McHugh, 2012).
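The agreement statistic can be sketched as follows. This is a minimal illustration that assumes the Kappa analysis refers to Cohen's kappa for two coders; the function names, the fixed seed, and the toy codes are assumptions for the example and are not the study's data or software workflow.

```python
import random
from collections import Counter


def sample_indices(n_items, fraction=0.20, seed=0):
    """Randomly pick a fraction of interaction indices for double coding."""
    k = max(1, round(n_items * fraction))
    return sorted(random.Random(seed).sample(range(n_items), k))


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa for two equal-length lists of categorical codes."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: fraction of items given the same code by both coders.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the coders' marginal proportions, summed over codes.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both coders applied one identical code throughout
    return (p_o - p_e) / (1 - p_e)


# Toy codes for four interactions (invented for illustration only).
coder_a = ["knowledge sharing", "knowledge sharing",
           "knowledge application", "knowledge application"]
coder_b = ["knowledge sharing", "knowledge sharing",
           "knowledge application", "knowledge sharing"]
print(cohen_kappa(coder_a, coder_b))  # prints 0.5
```

In the toy example the coders agree on three of four interactions (observed agreement 0.75), while chance agreement from their marginal code frequencies is 0.50, so kappa is 0.5. Kappa therefore reports agreement beyond what the code frequencies alone would produce.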
As shown in Fig. 4, the main result of the revisions was the redistribution of questions across different levels of processing. Although the number of Retrieval questions increased, most of these were Pre-class questions. The overall number of Comprehension questions decreased, while the Analysis level of processing saw the most pronounced increase. This reduction in Comprehension questions is important because, although useful for identifying critical features, these questions are less conducive to group learning compared to those requiring analysis (King, 2008). In the following sections of each analysis, Pre-class questions are excluded from the counts, as they were designed to be answered before the discussion and do not reflect the collaborative group learning environment. Additionally, only questions answered by both groups in each cohort are included in the analysis.
Fig. 4 Counts for the number of questions at each cognitive level of processing across all fifteen activities.
The results from revising each activity's reflection were less impactful, as observations from Cohort 1 revealed that groups rarely completed reflections collaboratively. Similarly, the recorder for groups in Cohort 2 often completed the reflection independently, either before the rest of the group arrived or while they were engaged in off-topic discussions. Although literature suggests that scaffolding can promote collaborative reflection, a critical component is providing proper instruction and demonstration on how to engage effectively in these practices (Desautel, 2009; Harvey et al., 2016; Türkmen, 2024).
Each activity prompted students to assign the roles of Manager, Recorder, Spokesperson, and Reflector, with the Recorder responsible for documenting the group's answers and submitting the completed activity. In Cohort 1, the red and gold groups consistently rotated roles each week, whereas in Cohort 2, the blue and purple groups did not, meaning that the same students served as Recorders every week. Notably, these Recorders were often the ones who constructed responses when the other students remained silent or answered questions from the other group members (leader and tutoring interactions). This finding underscores the importance of facilitation within a group learning environment and will be discussed further in a later section.
In addition, Cohort 2 demonstrated greater engagement in knowledge dynamics broadly for the Analysis questions compared to Cohort 1, where a knowledge dynamic was not observed in 6% of interactions. This finding is notable when considering that Cohort 2 had triple the number of discursive interactions between students while working on higher-order questions (Analysis) than Cohort 1. Although the overall trends in knowledge dynamics reinforce the connection to question level and did not differ drastically with the addition of cognitive scaffolding, our use of Marzano's cognitive levels of processing in its development led to the inclusion of more higher-order questions. As a result, students in Cohort 2 spent more time at a higher level of cognitive engagement. This suggests that our approach to developing cognitive scaffolding supported sustained engagement across a broader range of questions by reducing cognitive load.
Our future work will investigate this further, but as other studies have discussed, the reasoning used by students can vary depending on a myriad of factors. One factor we believe warrants closer examination is the impact of facilitator interactions on student reasoning. For example, in upcoming analyses, we will examine how teaching assistants interacted with groups in Cohort 2—particularly during a question where both cohorts expressed confusion. While this currently falls outside the scope of the present study, such analysis could help address a gap in the literature related to improving the facilitation of scaffolding through informed pedagogical training (Vo et al., 2025).
While completing Analysis questions, a greater proportion of Cohort 1 answered without providing any evidence of reasoning (N/A = 22%) compared to Cohort 2 (N/A = 12%), suggesting that Cohort 2 engaged in reasoning more consistently during these questions. The use of more complex reasoning, such as linear causal and multicomponent, remained consistent between the two cohorts. This result aligns with the findings on the group's knowledge dynamics, with the primary difference being an increased use of descriptive reasoning in Cohort 2. This suggests that our cognitive scaffolding may have both supported the sustained use of reasoning across more questions and encouraged greater use of low-level reasoning during higher-order questions, even if it did not necessarily lead to greater use of complex reasoning. This could indicate that to promote more complex reasoning, explicit instruction or guidance may be needed to deepen students' reasoning skills and fully engage them in more complex reasoning processes.
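Percentages such as the N/A shares above are proportions of coded interactions within a question level. A minimal sketch of that tally is given below; the function name and the toy list of codes are invented for illustration and do not reproduce the study's counts.

```python
from collections import Counter


def code_proportions(codes):
    """Return each code's share of all coded interactions (shares sum to 1)."""
    if not codes:
        raise ValueError("no coded interactions supplied")
    counts = Counter(codes)
    return {code: count / len(codes) for code, count in counts.items()}


# Toy data only: invented reasoning codes for eight interactions.
toy_codes = ["N/A", "descriptive", "relational", "descriptive",
             "linear causal", "N/A", "relational", "descriptive"]
shares = code_proportions(toy_codes)
print(f"N/A share: {shares['N/A']:.0%}")  # prints "N/A share: 25%"
```

Because each cohort answered a different number of Analysis questions, comparing shares rather than raw counts keeps the cohorts on the same scale.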
Literature on reasoning and argumentation highlights that scaffolding argument structures and prompting students to provide reasoning are effective strategies for fostering reasoning. However, without clear demonstrations or guidance on how to construct arguments or offer reasoning, students may view reasoning as just the answer to a question, rather than as an integral part of the science learning process (Ford, 2008; Russ et al., 2009; Leupen et al., 2020). Considering this, it seems that our cognitive scaffolding may have effectively encouraged greater student engagement through reasoning without necessarily increasing the complexity of that reasoning. To further illustrate this point, all multicomponent reasoning in each cohort came from a single group (Cohort 1/gold group n = 1 and Cohort 2/purple group n = 3), implying that the group composition or dynamics may play a critical role in a group's use of more complex reasoning (Kumpulainen and Kaartinen, 2003; Jensen and Lawson, 2011; Nennig et al., 2023).
The primary difference between the activities is that Question 2d from Cohort 1 was removed, and Question 1c from Cohort 2 was added. Question 2d (Cohort 1) asked the students which hydrogen on acetic acid would be associated with a higher pKa value, while Question 1c (Cohort 2) asked the students to explain why electronegativity or resonance increased the molecule's stability. The reasoning behind the change was to reduce the change in concepts as defined by ACCM and more clearly promote identifying the critical features that can help determine the strength of an acid. Reducing the change in concepts helps to minimize cognitive load, helping students to integrate new information with their existing knowledge more effectively (Sweller, 2010, 2011). In this way, the change in the question represents context-specific scaffolding by adjusting the task to better align with the student's current level of understanding and focusing their attention on the most relevant concepts (McNeill et al., 2006; Ford, 2012; Leupen et al., 2020). By shifting the focus to the stability factors of the molecule, the revised question provides a more targeted opportunity for students to engage with the critical elements of acid strength, fostering deeper comprehension.
For the blue questions (2c and 1d), the students were asked to circle the most acidic proton from the molecule provided; the reason for the difference in the cognitive level of processing stems from the addition of asking students to justify their choice based on their answer to the previous question (1c). Thus, the students needed to connect the concepts of electronegativity, resonance, and stability to an acid's strength and were asked to engage in an Analysis level of processing. This shifted the knowledge dynamics for this question from knowledge sharing in Cohort 1 to knowledge application and construction in Cohort 2.
Table 3 contains the social processing codes applied to each group in Cohorts 1 and 2. While some studies indicate that scaffolding can positively influence groups' social dynamics, these studies have focused on online project-based learning and teacher-led scaffolding (Rojas-Drummond and Mercer, 2003; Kraatz et al., 2020; Cortázar et al., 2022). In contrast, our work focuses on the impact of using Marzano's taxonomy to develop activities that can be used as hard scaffolds, that is, support that is static and pre-planned. Our results indicate that cognitive scaffolding did not have a large impact on social dynamics within the groups. For Cohort 1, both groups were collaborative, which, according to the ICAP framework, is a behavior that indicates higher engagement. However, a closer examination of the groups' conversations reveals that, while they collaborated to answer questions, they may not have been as collaborative in explaining or reasoning through their answers. For example, while the gold group was working on Question 2e, Hodgkin was not sure how they could provide an explanation for the entropic argument: “Hodgkin: I have no idea how to go about explaining the other side.” Daly responded by explaining: “Daly: The other side would be entropically…because you lose that proton and have that negative ion.” While it initially appears that the two students are working collaboratively to build their understanding, confusion about which atoms on the molecule they are referring to sets in, ultimately leading them to give up on constructing the explanation: “Daly: I feel like I can kind of explain it both ways but… I don’t really know what it is [exactly], I would still [say] more energetically affected.”
For the blue group, who mainly worked collaboratively, the students explicitly mentioned early on that “[they are] trying to get this done so [they] can go watch the [eclipse].” The desire to get the activity done quickly led them to start answering questions before Alice arrived. As the group starts working on Question 1d, Alice arrives, and the rest of the group puts no effort into getting them caught up, eventually leading Alice to try and contribute without being fully aware of what the group is working on.
Alice: | [Reading off their personal iPad] As electronegativity goes up, size goes down. Is that what you were asking? |
Sheeana: | No, we were talking about stability. |
Alice: | Oh, sorry |
Sheeana: | No, you are good. |
Leupen et al. (2020) state that “…teams should always be working on the same problem; otherwise, teams will be less motivated to be interested when others report,” and the above dialogue is a good example of this, especially when we consider that Alice does not participate for the remaining questions. In addition, both the red and purple groups would begin reading the next question while the recorder was still writing out the answer to their current question. Given that these more complex social interactions are present across both cohorts, this finding suggests that cognitive scaffolding alone may not be sufficient to positively impact a group's social dynamics. The social processing codes for the purple group illustrate how social dynamics can shift quickly, as one student consistently took on the role of recorder almost every week and was the source of the domination during their discussion. The literature identifies factors such as these, along with relevant prior knowledge and experiences, holding students accountable for their answers, and, in particular, facilitation methods, as crucial for promoting high-quality social dynamics (Dallimore et al., 2006; Knight et al., 2016; Pecore et al., 2017; Leupen et al., 2020; Newton et al., 2020).
Table 4 contains the progression in each group's reasoning while working through the questions. For this section, we will focus on Questions 2d and 1c for Cohorts 1 and 2, respectively. For the gold group, there was initial confusion about what the pKa means (“Hodgkin: Well a lower pKa means more stable, right? Or is it the opposite, that it is a strong acid?”) before they were ultimately able to land on a relational type of reasoning: “Daly: Just because comparatively it is not as electronegative, therefore [carbon] can’t hold that negative charge as well as an oxygen would.” The red group relied more on their general understanding of pKa to dictate their reasoning, leading them to engage in descriptive reasoning: “Bohr: Since it is weaker, it would have a higher pKa…and if it was stronger, it would have a lower pKa.”
For Cohort 2, the question was modified to better align with the concept of scaffolding and cognitive structuring, which aims to guide students through complex topics by prompting them to consider the ‘why’ behind a concept before applying it to higher-order questions. By adjusting the question to better correspond with the concepts explored in other questions and emphasizing the reasoning behind electronegativity and stability, we expected to promote interactive or constructive engagement and more complex reasoning in their work (Chi and Wylie, 2014). The blue group begins this question with confusion before they settle on descriptive reasoning for their answer.
Dorothy: | I don’t know why, all I know is that it is that way. |
Sheeana: | So, if it is more electronegative, it holds the electrons tighter. |
Dorothy: | Okay, that makes sense. |
Olga: | It makes them less available to form bonds. |
This type of reasoning is similar to the red group from Cohort 1, which focuses on restating the salient entities without any justification or explanation (Weinrich and Talanquer, 2016). According to the ICAP framework, reasoning that merely reasserts explicit properties (descriptive reasoning) and applies shared knowledge (knowledge application) reflects active engagement, which does not align with the expected interactive or constructive engagement for this type of question. However, it is important to consider that the group had previously expressed a desire to finish the activity quickly to observe the eclipse. By communicating this intention, the group had already committed to limiting their engagement in favor of completing the task rapidly (Draskovic et al., 2004). In contrast, the purple group showed a markedly different outcome. For instance, during this question, we observe a key aspect of engagement and reasoning: critique (Michaels et al., 2008; Osborne, 2013). Below is a section of the transcribed group discussion during Question 1c.
Helen: | Why do we want it to have a higher electronegativity? What is the actual reason? Because I know it has something to do with the negative charge but don’t we… [Linus starts talking] |
Linus: | I think it has to do with size and how easy it gives up its proton. Or… |
Helen: | That's size. |
Linus: | Oh yeah, that is size and not electronegativity. |
Miles: | They are more stable when they have a higher electronegativity because you don’t want it to change, you don’t want it to just be giving them up, or accepting them. Right? That is what makes it more stable |
Helen: | Hmm… My thing is why. Because more electronegative means that they take more electrons, wouldn’t that concentrate it more? I want to know the reason |
Miles: | Oh yeah, that's a good question. |
In this discussion, both Linus and Miles proposed potential explanations for the effect of electronegativity on molecular stability, while Helen challenged their ideas by critiquing the reasoning. Although the question was coded as relational reasoning, this classification is primarily limited by the students' inability to continue the dialogue due to their confusion, which eventually led them to seek guidance from the GTA. Through the lens of the ICAP framework, this scenario exemplifies interactive or constructive engagement: the students not only articulate their reasoning but also critically engage with and evaluate each other's ideas. This collaborative process results in a co-construction of knowledge.
Determining the effectiveness of scaffolding is no easy task, as it requires careful consideration of a myriad of factors (Palincsar, 1998; van de Pol et al., 2010). To this end, we employed coding schemes that aided in our analysis of the group's social and knowledge dynamics and reasoning. While the cognitive scaffolding led to increased cognitive engagement and more frequent use of lower-level reasoning, it did not appear to support the development of more complex reasoning or consistently benefit the social dynamics within groups. Moreover, while the scaffolding encouraged basic reasoning processes, the limited presence of complex reasoning may reflect broader curricular constraints—suggesting that instructional approaches must go beyond activity design to more explicitly and consistently promote advanced forms of reasoning throughout the course. These findings indicate that cognitive scaffolding with Marzano's taxonomy alone may not be sufficient to support effective collaboration or deeper reasoning. Our future work will explore how facilitation strategies influence group interactions and whether informed pedagogical approaches can enhance the impact of scaffolding during group work.
Although the design and impact of scaffolding remain an active area of research, we believe Marzano's taxonomy provides a much-needed foundation for a shared language and approach to future studies. For instance, while our work did not address metacognitive scaffolding directly, metacognition is explicitly included in the taxonomy. This presents a clear opportunity for future research to explore how metacognitive supports might be incorporated into cognitive scaffolding without requiring an entirely separate framework. Additionally, there has been a call for further investigation into the fading of scaffolds, and our previous work provides examples of what that may look like using Marzano's taxonomy and how it can aid in further research on the topic (Kreps et al., 2024).
While the impact of our cognitive scaffolding on social interactions was less evident, these outcomes were likely shaped by other variables, such as group dynamics, instructional facilitation, and the overall classroom environment. Future research should examine how these contextual factors influence the effectiveness of scaffolding, particularly in collaborative learning settings. Doing so would not only address gaps in the literature but also contribute to the development of more robust facilitation practices and pedagogical training. By adopting a shared, cognitively grounded framework, facilitators across disciplines can more easily communicate instructional strategies and goals, ultimately making pedagogical practices more transferable and comprehensible. This benefit can also extend to students. When instructors are able to clearly communicate the purpose and benefits of cognitive scaffolding, it fosters greater transparency and can help build student trust in the learning process (Vo et al., 2022).
Although more research is needed to refine and expand this approach to cognitive scaffolding, our findings already point to several practical applications for instructors and facilitators. While we focused on revising in-class discussion activities, the taxonomy can also guide the design of higher-order thinking questions and help identify points where students may need additional support. For example, recognizing that a task requires Analysis-level processing might prompt the instructor to include an intermediate Comprehension-level prompt, such as a guided line of inquiry, to scaffold student reasoning. In this way, Marzano's taxonomy serves as both a developmental and diagnostic tool, supporting more intentional and effective instructional design.
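As a rough illustration of the diagnostic use described above, the sketch below flags questions whose cognitive level rises more than one step above the highest level the activity has already reached. This is only one possible reading of the idea, not the procedure used in this study: the function name, the one-step threshold, and the example question labels and level assignments are all hypothetical.

```python
from enum import IntEnum


class Level(IntEnum):
    """Marzano's cognitive levels of processing, in taxonomy order."""
    RETRIEVAL = 1
    COMPREHENSION = 2
    ANALYSIS = 3
    KNOWLEDGE_UTILIZATION = 4


def flag_unsupported_jumps(questions, max_step=1):
    """Return labels of questions whose level exceeds the highest level
    reached so far by more than max_step (a hypothetical threshold)."""
    flagged = []
    highest = 0  # nothing has been reached before the first question
    for label, level in questions:
        if level - highest > max_step:
            flagged.append(label)
        highest = max(highest, level)
    return flagged


# Hypothetical activity: labels and level assignments are illustrative only.
activity = [
    ("1a", Level.RETRIEVAL),
    ("1b", Level.ANALYSIS),  # moves from Retrieval straight to Analysis
    ("1c", Level.COMPREHENSION),
    ("1d", Level.ANALYSIS),
]
print(flag_unsupported_jumps(activity))  # ['1b']
```

A flagged question marks a point where an instructor might consider adding an intermediate prompt at the skipped level. Because the running maximum starts at zero, an activity that opens above Retrieval is also flagged, which may or may not be desirable in a given course.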
Level of cognitive processing | Definition |
---|---|
Level 1 – Retrieval | Recalling information from a provided source or permanent memory. |
Level 2 – Comprehension | Identification of critical features needed to transfer knowledge from working memory to permanent memory. |
Level 3 – Analysis | Creating new insights from previous knowledge or using previously obtained knowledge in novel situations (focus is on the concept/skill). |
Level 4 – Knowledge utilization | Using knowledge for novel and specific situations (focus is on the situation). |
Category | Definition |
---|---|
Knowledge sharing | The focus of the group interactions is based on sharing information to answer the question without questioning of the utterances presented |
Knowledge application | The focus of the group interactions is based on applying a formula/method/concept and relating that to a clear understanding of how it relates to the explanation or process of solving the problem |
Knowledge construction | The focus of group interactions is based on sharing information and building upon the ideas of others by questioning or critiquing the ideas presented |
Not observable | No knowledge dynamic is seen due to a lack of student interaction |
Category | Definition |
---|---|
Non-interactive | Students are not having a conversation, and there is no proof of individualistic work |
Collaborative | Students are co-constructing ideas and generating products together |
Tutoring | One or more students ask questions that another student responds to, either by guiding the students through the problem and asking for the tutees' ideas or just by explaining their reasoning without asking for input from the student who asked the question |
Leader | One student primarily constructs the response due to a lack of contribution from others |
Domination | One student constructs the response for the group while not considering, ignoring, or rejecting input given by others |
Individualistic | Students are working independently and are not having conversations about the question products |
Confusion | Students are too confused to really generate the expected product or make confident progress for a question |
Multicomponent | Students engage in more than one social processing |
Other | Describes a way that students are interacting with one another that is not described by another code |
Category | Definition |
---|---|
Descriptive | Salient entities in a system are identified or recognized. Explicit properties are described, verbalized. Functions or properties of entities are seen as sufficient explanation for their behavior. A phenomenon is seen as an instantiation of reality; it may be re-described by merely asserting that things are as they are without referring to causes. Reasoning mostly based on experiences and knowledge from daily life. Strong influence of surface similarity and recognition on judgment and decision making. |
Relational | Salient entities in a system are identified or recognized. Explicit and implicit differentiating properties are highlighted. Spatial or temporal relations between entities are noticed. Correlations between properties and behaviors are established but not explained or justified. A phenomenon is seen as an effect of a single entity, the natural outcome of a single property or the linear combination of several properties; no mechanisms are proposed. Reduction of variables and overgeneralization constrain reasoning. |
Multi-relational | Explanation based on multiple relations |
Linear causal | Salient entities in a system are identified or recognized. Explicit and implicit differentiating properties are highlighted. Spatial or temporal organization of and connections between entities are noticed. Relevant direct interactions between entities invoked. Although the influence of many factors may be recognized, phenomena tend to be seen as (reduced to) the result of the actions of a single agent on other entities; proposed mechanisms involve linear cause-effect relations and sequential chains of events. Reduction of variables and overgeneralization frequently constrain reasoning. |
Multicomponent | Salient entities in a system are identified or recognized. Explicit and implicit differentiating properties are highlighted. Spatial or temporal organization of and connections between entities are noticed. Relevant interactions between entities are invoked. Effects of several variables are considered and weighed. |
Isolated | Complex phenomena are seen as the result of the static and dynamic interplay of more than one factor and the direct interactions of several components. Effects of several variables are considered and weighed separately. |
Integrated | Complex phenomena are seen as the result of the dynamic interplay of more than one factor and the direct and indirect interactions of several components. Explanations as interconnected stories of how variables affect the entities involved. The effects of different variables are more thoroughly and systematically explained for multiple entities involved. |
NA | No mode of reasoning is present |
Fig. 8 Pie charts representing the percentage of each knowledge dynamic code for Cohort 1 (Gold and Red Group) and Cohort 2 (Blue and Purple Group). |
Fig. 9 Pie charts representing the percentage of each social processing code for Cohort 1 (Gold and Red Group) and Cohort 2 (Blue and Purple Group). |
Fig. 10 Pie charts representing the percentage of each mode of reasoning for Cohort 1 (Gold and Red Group) and Cohort 2 (Blue and Purple Group). |
Footnote |
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d5rp00225g |
This journal is © The Royal Society of Chemistry 2025 |