Preservice teachers’ enactment of formative assessment using rubrics in the inquiry-based chemistry laboratory

Yoram Zemel, Gabby Shwartz and Shirly Avargil*
Faculty of Education in Science and Technology, Technion, Israel Institute of Technology, Haifa 3200003, Israel. E-mail: savargil@technion.ac.il; Tel: +972-4-8293132

Received 1st January 2021 , Accepted 8th August 2021

First published on 9th August 2021


Abstract

In recent years, teacher education programs have encouraged preservice teachers to practice a variety of assessment methods to prepare them to be highly qualified practitioners who are capable of enhancing students’ scientific understanding. Formative assessment (FA) – also known as assessment for learning – involves the process of seeking and interpreting evidence about students’ ideas and actions to enhance and guide the learning process. An inquiry-based chemistry laboratory was chosen as the context of this research, in which 13 preservice teachers studied the practice and application of FA. The preservice teachers evaluated students’ lab reports using two components of assessment – rubric-based scoring and providing students with feedback comments. Our goal was to understand whether guidance provided through the teacher education program affected preservice teachers’ FA enactment, as reflected in their score variation and in the quality of the written feedback comments provided to students. The study findings show that the variation in the total lab report score decreased in the 2nd assessment due to the explicit guidance. That is, the guidance provided the preservice teachers with the opportunity to examine, discuss, and improve their own assessment knowledge and scoring process. However, the rubric dimensions that preservice teachers perceived as more open to discussion and interpretation – such as evidence-generating analysis and formulating conclusions – were challenging to assess, and the explicit guidance created different thinking directions that led to increased score variation. In these dimensions, the guidance exposed the preservice teachers to the complexity of rubric-based scoring in a FA manner. We recommend that the guidance preservice teachers receive regarding FA of inquiry-based lab reports should include aspects of how to notice and interpret students’ ideas and only then respond with formative feedback. The results of our study expand the theoretical knowledge regarding FA and have important implications for the preparation of future chemistry teachers and for the professional development of those already teaching chemistry in a classroom environment.


Introduction

In chemistry education, a major teaching and learning goal is the development of inquiry-based laboratory practices (Mamlok-Naaman and Barnea, 2012; van Brederode et al., 2020). The development of these practices greatly depends on teachers’ approach to assessment, which influences and supports students’ scientific conceptual understanding (Ruiz-Primo and Furtak, 2007; Talanquer et al., 2013). Moreover, students’ written laboratory reports are a prominent method for assessing inquiry-based laboratory practices (Walker and Sampson, 2013; Avargil et al., 2019). Several factors inhibit learning in the inquiry-based laboratory environment, among them the difficulty of assessing students’ planning of experiments, analysis of results, and drawing of conclusions after completing hands-on tasks (Nadji and Lach, 2003; Hofstein et al., 2004; Taylor, 2007; Carmel et al., 2019).

Thus, to promote students’ learning, teachers’ enactment of formative assessment is essential to inform students of their advancement in acquiring these skills (Tomanek et al., 2008; Abell and Siegel, 2011; Correia and Harrison, 2020).

Formative assessment (FA) – also known as assessment for learning – involves the process of seeking and interpreting evidence about students’ ideas and actions to enhance and guide the learning process (Black and Wiliam, 1998; Talanquer et al., 2015; Clinchot et al., 2017).

In the construct of teachers’ professional knowledge and practice, FA is crucial (Avargil et al., 2012; Dori et al., 2014; Herppich and Wittwer, 2018); however, preservice teachers usually hold a limited perception of what assessment means and what must be assessed (Buck et al., 2010; Abell and Siegel, 2011). Moreover, their professional identity, beliefs, and perceptions regarding assessment and what it encompasses are in their early stages of formation (Tsybulsky and Muchnik-Rozanov, 2019; Shwartz and Dori, 2020). As a result, preservice teachers’ assessment greatly emphasizes grading and attends little to assessment as a learning tool (Sabel et al., 2015; Herppich and Wittwer, 2018). Because preservice teachers are a unique group of participants (compared with in-service teachers or teaching assistants), teacher education programs are required to expose them to, and allow them to practice, various assessment methods, shifting their focus from utilizing assessment of learning to utilizing it for learning (Shepard, 2000; Sabel et al., 2015). Such programs can provide preservice teachers with opportunities to apply practices such as assessment through meaningful experiences and social interaction, enabling them to shape and nurture their perceptions of assessment and supporting meaningful implementation in the classroom (Sutherland et al., 2010).

Our study stems from the belief that, as teacher educators, we need to provide preservice chemistry teachers with the opportunity to experience FA in an authentic context of chemistry education while reflecting on, examining, and learning about their own practice (Talanquer et al., 2013; Sabel et al., 2015).

The current study describes how 13 preservice chemistry teachers in a teacher education program studied and applied FA in assessing inquiry-based laboratory reports (details of the teacher education program at our institution – the Technion – are given in Appendix A). In the study, preservice teachers evaluated high school students’ laboratory reports using two assessment components: (1) the official rubric-based scoring provided by the Ministry of Education and (2) written feedback on students’ laboratory reports. We focus on laboratories in which a chemical phenomenon is presented to students and they develop a procedure for investigating a specific aspect they are interested in (Fay et al., 2007).

During an instructional unit on assessing inquiry-based laboratory reports in the course “Teaching inquiry-based laboratories in chemistry education” (see Appendix A), we characterized preservice chemistry teachers’ FA process in terms of the score variation and the formative feedback they provided.

Thus, we asked the following research question:

What characterizes preservice chemistry teachers’ laboratory score variation and written feedback while assessing high school chemistry students’ inquiry-based laboratory reports?

We delved into the nature of preservice teachers’ feedback comments, attempting to characterize their interpretation of students’ data while navigating their score variations using the assessment rubric. To the best of our knowledge, this study is unique because it addresses preservice chemistry teachers’ FA of inquiry-based laboratory reports using a rubric and written feedback.

Overall, this study yields implications about FA for teacher education programs as well as professional development for teachers.

Theoretical framework

Formative assessment is the theory that acts as the lens through which we analyze, interpret, and explain our findings. The Formative Assessment in the Science Classroom theory of Buck and colleagues (2010) informed the context of our study, while Torrance and Pryor's (2001) Convergent and Divergent Formative Assessment theory was utilized for our data analysis. The guiding principle of Buck et al. (2010) is that students need feedback that will provide information for them to progress in their learning, while teachers need to see themselves as facilitators. Brookhart (1994) claims that students who receive feedback that can help them improve next time feel empowered, as opposed to feedback that does not entail information on how to improve and conveys a feeling of judgment. Thus, teacher educators need to prepare preservice teachers for creating an atmosphere of formative assessment in the science classroom (Brookhart, 1997).

Torrance and Pryor (2001) conceptualized the constructs of divergent assessment and convergent assessment. Divergent assessment is conceptualized as creating opportunities to explore students’ thinking and understanding, and is therefore considered to create an environment more conducive to enhancing learning. Convergent assessment is conceptualized as creating opportunities to address students’ current knowledge.

Within this theoretical framework, we conceptualize the use of rubrics for formative assessment and deepen our understanding of the perceptions, implementation, and challenges of this form of assessment among preservice teachers. Although the scholars mentioned above (Torrance and Pryor, 2001; Buck et al., 2010) grounded their theories within a primary school context, we believe this setting is transferable, enabling us to examine and implement these theories within the context of a preservice teacher education program.

Formative assessment in the context of the inquiry-based laboratory

Chemistry is an experimental science, and a fundamental goal is that students develop the ability to actually do science (Cacciatore and Sevian, 2009; Wheeler et al., 2017). The role of inquiry-based chemistry laboratories and the methods of their assessment take a central place in chemistry curricula (Pullen et al., 2018). The currently common assessment of laboratory learning inhibits the promotion and development of conceptual understanding and scientific practices, because it focuses on content knowledge rather than on understanding the purpose of laboratory investigations (Hofstein and Lunetta, 2004; National Research Council, 2012; Carmel et al., 2019). However, student assessment in an inquiry-based context requires attention to various elements, including students’ scientific ideas during inquiry as well as scientific inquiry practices, explanations, and arguments (Talanquer et al., 2013).

Shifting away from grading toward using classroom evidence to inform the learning process, FA (i.e., assessment for learning) has been recognized as a responsive approach that enables students’ development of scientific concepts and supports the development of students’ higher order thinking skills (Clinchot et al., 2017; Murray et al., 2020). As opposed to summative assessment, which evaluates learning outcomes at the end of the learning process, FA aims to improve teaching and learning as it occurs during instruction (Dolin et al., 2018). Collecting evidence through summative assessment, for example via tests, usually does not include returning to the specific unit after interpreting students’ difficulties. It reports on students’ level of learning rather than on students’ understanding and teaching effectiveness (Bennett, 2011).

Formative assessment can be formal – deliberately planned, or informal – spontaneous and not scored (Talanquer et al., 2015). Either way it is a dynamic process in which the teacher receives and analyzes data about students’ learning and actions and provides feedback (Herman et al., 2015; Furtak et al., 2016). The evidence gathered is judged in terms of what it indicates about existing ideas and competencies required to meet the lesson goals (Dolin et al., 2018).

Formal FA, which is the focus of our study, includes planned activities designed to elicit students’ ideas and thinking processes. When using formal FA, for example while examining students’ inquiry-based laboratory reports, a teacher evaluates students’ understanding at key points during the inquiry process and then gives feedback to students to enhance their learning, i.e., the development of scientific thinking, as it occurs (Tomanek et al., 2008; Buck et al., 2010; Murray et al., 2020). The effectiveness of FA depends on teachers’ ability to focus on students’ ideas, to generate an interpretation of their conceptual understanding and misconceptions, and to provide feedback (Bennett, 2011; Sevian and Dini, 2019).

One of the main principles of formative assessment is the nature and quality of the feedback that is given to the students (Coffey et al., 2011). Feedback refers to giving responses to a product, process, or event to improve students’ performance.

Feedback from teachers to students should give students information about how they can improve their work or take their learning forward. Just giving marks or grades that only indicate how good the work is, is judgmental and is not consistent with the aim of using assessment to help learning (Brookhart, 1994, 1997). Experiencing and learning in the laboratory requires continuous feedback and an open line of communication between the teacher and the student; it serves as a source of information for teachers about current student understanding so that teachers can adjust instruction to maximize student learning (Furtak et al., 2016). Studies suggest that students appreciate receiving formative feedback and recognize its benefits. However, the same studies have also identified substantial variability in how feedback is perceived by students (Zumbrunn et al., 2016; Van der Kleij et al., 2017). While teachers believed they provided students with helpful feedback, many students did not perceive this feedback as helpful and did not use it (van der Kleij, 2019). In a study conducted by Harks and colleagues (2014), students perceived process-focused feedback as more useful than grade-oriented feedback.

Enacting formative assessment using rubrics

Formative assessment can be carried out in different ways, such as: (a) observing students as they work while asking questions to probe their understanding; (b) listening to their explanations and engaging in dialogues; (c) designing tasks that require particular ideas and competencies; and (d) giving feedback on an outcome of their work, such as drawings, videos, or a laboratory report (Herman et al., 2015; Clinchot et al., 2017). In this paper, we focus on enacting formative assessment through the use of rubrics, as they constitute one of the main assessment tools in the inquiry-based laboratory.

Rubrics include a set of evaluative criteria, qualitative definitions of those criteria, and a scoring strategy, with the aim of evaluating students’ work (Menéndez-Varela and Gregori-Giralt, 2018). They serve as an appropriate assessment tool in the inquiry-based laboratory since they make it possible to identify the construct under evaluation and to define criteria that must be present in a specific task (Panadero and Jonsson, 2013). Additionally, using rubrics makes it possible to elicit information that demonstrates understanding of a particular construct, such as drawing a conclusion or critical thinking capability (Siegel et al., 2006).

The use of rubrics as a FA instructional practice, although controversial, has increased in recent years in higher education and is widespread at the school level (Panadero and Jonsson, 2013). Researchers who support the use of rubrics for FA purposes suggest two approaches to implementation: (1) a student-centered approach, in which the rubric is shared with the students to support their learning; letting the students know what is expected of them and increasing transparency help in communicating expectations and lowering students’ anxiety regarding assignments (Jonsson and Svingby, 2007); and (2) a teacher-centered approach, in which the teachers make assessment criteria explicit and provide the students with feedback comments, involving a two-way communication system that promotes a form of student–teacher dialog (Buck et al., 2010; Panadero and Jonsson, 2013). In our study, we describe a combination of both approaches, where students are given the rubric and receive formative feedback from their teacher after submission. Schamber and Mahoney (2006) reported that a majority of faculty members emphasized that rubrics helped them to clarify learning goals, give feedback, and help students build understanding. However, using a rubric can be challenging for the following reasons: (a) consistency among graders, which is essential to maintain accuracy in providing students with directions for improvement (Avargil et al., 2019), and (b) the notion of feedback comments, which are crucial to support the FA process and students’ learning (Panadero and Jonsson, 2013; Murray et al., 2020). Examining the benefits of FA from the students’ perspective, Dresel and Haugwitz (2008) found a positive effect in supporting students’ self-regulated learning skills, which led to improvement in students’ achievements. Using rubrics in a FA manner has also helped in facilitating both planning and self-assessment (Panadero, 2011). For instance, in relation to planning, the students in the study by Andrade and Du (2005) reported that using the rubric assisted them in planning their approach to the assignment.

Formative assessment in preservice teacher education – perceptions and challenges

Preservice teachers often struggle with FA because they consider students’ knowledge in terms of correct or incorrect, which has consequences for how they respond to their students’ thinking (Tomanek et al., 2008; Harshman and Yezierski, 2015; Sabel et al., 2015; Kim et al., 2020). When they evaluate students’ ideas, they tend to describe what students said in their learning process rather than identifying misconceptions and focusing on the extent to which students have made sense of the topic (Talanquer et al., 2015). Past work has shown that preservice teachers need opportunities to engage with and reflect on teaching and assessment strategies such as FA during their education program (Kohler et al., 2008; Sabel et al., 2015). They should be directed, through their education program, not to perceive assessment merely as judging students’ conceptions as correct or incorrect (Buck et al., 2010). Supporting preservice teachers in learning how to employ FA requires a focus within teacher education programs on the way and the context in which students learn science (Levin et al., 2009). Research suggests that specific types of experiences during the education program can support preservice teachers in expanding their knowledge of classroom assessment, including engaging in assessment tasks that simulate authentic science learning environments, such as the laboratory, and participating in professional dialogues with colleagues (Buck et al., 2010; Talanquer et al., 2013; Sabel et al., 2015). This engagement in FA includes opportunities to elicit and identify students’ ideas and to consider the type of feedback that supports students’ knowledge building (Clinchot et al., 2017; Murray et al., 2020).

Assessment of students’ laboratory work and skills has varied in the research literature, focusing mainly on the affective aspects of inquiry, such as students’ self-reports of their inquiry experience, and on assessing the extent to which laboratory activities provide opportunities for students to engage in scientific practices (Carmel et al., 2019). We found a shortage of literature concerned with preservice chemistry teachers assessing chemistry laboratory reports using rubrics in a FA manner. Some studies conducted at the undergraduate level, which documented teaching assistants (TAs) assessing students’ laboratory work, concluded that most TAs had difficulty providing concise and accurate written feedback (Kurdziel et al., 2003; Avargil et al., 2019). For high school chemistry students to develop understanding as well as scientific skills through inquiry-based chemistry laboratories, their teacher's laboratory-report assessment and feedback are crucial (Talanquer et al., 2015). Thus, preservice teachers’ reflection on assessing inquiry-based chemistry laboratory reports is important. It can provide them with the opportunity to: (a) better understand how and why rubrics are constructed, and (b) encourage their FA practice of the inquiry process while providing meaningful, valuable, and consistent feedback that enhances students’ learning.

Research setting

The inquiry-based chemistry laboratory in Israel

In 2010, the chemical education committee set up by the Israeli Ministry of Education recommended that the high school chemistry curriculum include an inquiry-based laboratory learning unit. This inquiry-based laboratory unit is mandatory for students who study chemistry in the 10th–12th grades and consists of 90 lessons of 45 minutes each. It therefore became an important component of the high-school chemistry curriculum and the matriculation exam in Israel (Barnea et al., 2010; Mamlok-Naaman and Barnea, 2012; Hofstein et al., 2019).

The chemistry inquiry-based laboratory unit follows the approach presented in Abd-El-Khalick et al. (2004):

“Inquiry as ends (or inquiry about science) refers to inquiry as an instructional outcome: Students learn to do inquiry in the context of science content and develop epistemological understandings about nature of science and the development of scientific knowledge, as well as relevant inquiry skills.” (p. 398)

The following are some of the unit's main objectives, as defined by the Ministry of Education:

• Recognition of chemistry principles in practice

• Demonstration and application of theoretical content knowledge

• Development of interest and curiosity while making chemistry more relevant

• Development of inquiry skills and higher order thinking skills, and independent work

• Development of critical thinking

The laboratory unit includes both guided and open-ended experiments as described in the rubric of Fay and colleagues (2007) and is classified into three levels:

• Level 1 laboratories: the problem and procedure are provided to students, and they interpret the data at the microscopic level while connecting their observations to the chemical phenomena that occurred. The skills required for this level of laboratory are mainly following instructions, using instruments, and collecting, analyzing, and interpreting data.

• Level 2 laboratories: the problem is provided to students, and they develop a procedure for investigating a specific aspect they are interested in. The skills required for this level of laboratory are posing research questions, raising scientific hypotheses, planning the work, examining students’ assumptions, collecting and analyzing data, comparing graphs, and writing conclusions.

• Level 3 laboratories: a ‘raw’ phenomenon is provided to students. The students choose the problem to explore, develop a procedure for investigating the problem, decide what data to gather, and interpret the data to propose viable solutions.

The preservice teachers participating in this research learned and experienced, both as students and as teachers, all three levels of laboratories. In this article, we describe the results of a level 2 laboratory the preservice teachers practiced.

To support the process of implementing the learning unit, an assessment tool was developed by the Ministry of Education in our country (Hofstein et al., 2006). The assessment tool is a rubric which contains different sections (see further details on the rubric in the research tools section) for assessing the different phases of an inquiry-type experiment, as well as a section for the teacher's observations of the students’ group work in the laboratory (Hofstein and Lunetta, 2004). The rubric is periodically updated based on teachers’ feedback, and the one used in this research was last updated in August 2018.

During the 11th and 12th grades, every chemistry-major student must conduct 8 different inquiry-based experiments and generate a report for each one, structured according to the rubric provided by the Chemical Division of the Ministry of Education. The report is supposed to be assessed by the teacher in a formative style, then corrected by the student and resubmitted for an additional assessment. In this way, a student–teacher dialog is created, enabling the students to know what is expected of them and how they can learn and further develop their learning. Finally, all of the student's reports are organized in a portfolio, and an oral exam is administered by an external chemistry teacher assessor at the end of the 12th grade (Mamlok-Naaman and Barnea, 2012).

Research process

The current research focused on assessing two different high-school students’ inquiry-based laboratory reports that were written about the same investigated phenomenon – the reaction between magnesium metal and aqueous hydrochloric acid (see the experiment protocol in Appendix B). That is, both groups of students conducted the same experiment, but their research questions were different. To enable the preservice teachers to give in-depth feedback that includes guidance and directions for improvement, and to provide a basis for discussion, the course lecturer chose, from her high school chemistry class, two average laboratory reports that were neither excellent nor poor.

During the course, the course lecturer was not responsible for collecting and analyzing the data, which was done by the first and third authors. The course lecturer was not aware of who had signed the consent form, and assessing high school students’ authentic laboratory reports was part of the course routine (Lawrie et al., 2021).

Fig. 1 depicts how the research process was conducted. The first phase was dedicated to understanding the structure, content, and use of the assessment rubric. During two class lectures, the course lecturer focused on providing instructions on how to use the rubric, what each criterion means, and what a proper response to a criterion is. The course lecturer covered each section and dimension and provided specific, detailed expectations for each criterion.


Fig. 1 Research process.

In the next step, the preservice teachers were provided with an authentic, anonymous 11th grade student's laboratory report that was written for an inquiry-based experiment procedure (see Appendix B). Each preservice teacher received the same student's laboratory report and had to independently assess it – score it based on the rubric and provide written feedback in the body of the report. The preservice teachers had to submit this first assignment shortly before the third class was conducted.

In the third class, the course lecturer delivered a complete analysis of the preservice teachers’ first laboratory report assessment. In her analysis, the course lecturer primarily focused on the rubric score variation and attempted to highlight the causes for the variation as they were reflected in the assessments. As part of this process, she covered the different criteria in the rubric and explained in more detail, with additional examples, how different levels of students’ answers should be scored. Finally, the course lecturer covered the written feedback; she provided examples of different feedback comments from the preservice teachers’ assessments and used them to deliver clear guidelines on what constructive, formative written feedback is.

Table 1 lists a few examples of feedback comments covered in the class discussion and the learning value they provided.

Table 1 Class discussion on written feedback held by the course lecturer

Example 1 – preservice teachers’ feedback comments given in the report: “Please provide more details. To which substance does the clear solution refer? What are its concentration and volume?”; “A description is missing for the Mg particles at the beginning of the experiment.”; “At the end of the experiment, it is recommended to state that the perimeter of the glove was 21 cm.”
Feedback characteristics raised in the class discussion: the feedback is relatively general and provides a direction to the students.
How this feedback enhances students’ learning: it provides a direction for further thinking and learning by listing a few examples.

Example 2 – preservice teachers’ feedback comments given in the report: “The titles of the experiment observation table should be: before the experiment; during the experiment; after the experiment.”; “Observations should list a description of each substance separately and should include the changes during the experiment.”
Feedback characteristics raised in the class discussion: this feedback is less general and more directive, yet it does not provide exact details of what to write in the observation section.
How this feedback enhances students’ learning: it points to what is missing in the report without providing exact details, which enables further thinking.

Example 3 – the preservice teacher corrected the student's writing in the report (Word document) and added a few comments: “You should state that this observation is related to HCl and provide more details, including concentration, volume, state of matter, and color.”; “Observations are missing for Mg.”; “Pay attention to the use of scientific language.”
Feedback characteristics raised in the class discussion: this feedback is very specific; the student only needs to ‘Accept All Changes’ in the Word document and follow the teacher's guidelines. The preservice teacher did all the corrections.
How this feedback enhances students’ learning: it does not provide a general direction but rather specific corrections made by the preservice teacher, and thus does not provide an opportunity for further student thinking and learning.


In addition, a discussion regarding the different feedback styles was held, aiming at understanding: (a) the meaning and nature of formative assessment, and (b) the potential contribution of each feedback style to students’ learning process and to the development of their higher order thinking skills. This class served as a reflection process for the preservice teachers, with the intention of highlighting the variation in scores and written feedback so that the preservice teachers could learn from it and improve; potentially, the score variation would then be lower and the feedback more consistent for the second anonymous laboratory report assessment.

In the next step, the preservice teachers were provided with a second authentic, anonymous 11th grade high-school student's laboratory report, dealing with the same experiment. They had to assess the second laboratory report and submit it to the course lecturer, along with responses to reflective questions about the entire laboratory report assessment process.

Methodology

Considering the lack of theory regarding the use of rubrics in a FA manner in the context of inquiry-based laboratory reports in chemistry, and the need to address this phenomenon from the teachers’ perspective while providing a rich descriptive picture, we employed a qualitative methodology (Erickson, 2012). An exploratory case study approach (Yin, 2009) guided this study. This approach provides the researcher with the opportunity for an in-depth exploration of a phenomenon in a situated context. In this study, the phenomenon in its situated context is the enactment of formative assessment while learning to assess inquiry-based laboratory reports in a preservice teacher education program. To answer the research question, the qualitative data were obtained from various research tools (Merriam, 1998; Flick, 2013). Using a qualitative case study approach helped us develop an understanding of what preservice teachers assess, how they do so, and why (Yin, 2017). An exploratory case study approach builds on a theory (i.e., formative assessment) from which additional avenues can be explored (Reiter, 2017).

Research participants

Thirteen preservice chemistry teachers, who were enrolled in the chemistry education track at our institution, participated in the research. The participants took the course “Teaching inquiry-based laboratories in chemistry education” and were engaged in conducting inquiry-based chemistry experiments, submitting laboratory reports based on the provided rubric, and developing higher order thinking skills related to inquiry-based chemistry laboratories. Additionally, the preservice teachers were required to assess two different anonymous 11th grade students’ laboratory reports, which were provided by the course lecturer. The assessment of the two reports comprised two components – scoring based on the rubric and providing feedback comments in the body of the report. This research was approved by the institutional ethics committee, approval #2020-107.

Research tools

We combined open-ended and closed-ended data: the analysis of the scoring involved descriptive analysis by calculating score variation, while the analysis of the written data enabled the characterization of the nature of preservice teachers’ formative feedback comments before and after the course lecturer's guidance.

The three research tools that were used in this study were:

(1) The scored rubrics provided by the preservice teachers for the two different high-school students’ laboratory reports, in the 1st and 2nd assessments. The rubric is highly comprehensive and is broken up into 3 sections. A total of 10 dimensions are listed under those sections, with between 1 and 6 different criteria under each dimension, making a total of 28 different criteria for assessing a laboratory report. A laboratory report must be assessed based on the rubric, and each criterion must be scored by assigning it a whole number between 0 and 5.

To help the preservice teachers get a sense of the quality of work that deserves a particular score, we articulated what each score meant. For example, ‘0’ means the student did not answer the question at all, ‘2’ means that the answer is incorrect, and ‘4’ means that the answer is partial or not accurate. Then, we discussed the reasons for scoring a given criterion. This served the goal of helping the preservice teachers score the rubric and the report more consistently.
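The exact point weighting of each criterion is defined by the official rubric (see Appendix C) and is not reproduced here. Purely as a rough illustration, the following minimal Python sketch shows one plausible way in which 0–5 criterion scores could be scaled by each criterion's available points and summed into a report score; the criterion identifiers and weights below are hypothetical, not the rubric's actual values.

```python
def report_score(criterion_scores, criterion_points):
    """Scale each 0-5 criterion score by the points available for that
    criterion and sum the results into a total report score.

    criterion_scores: dict mapping criterion id -> score (integer, 0-5)
    criterion_points: dict mapping criterion id -> available points
    (hypothetical weights for illustration only)
    """
    return sum((score / 5) * criterion_points[cid]
               for cid, score in criterion_scores.items())

# Hypothetical example with three criteria worth 5, 10, and 15 points
scores = {"2.1": 4, "2.2": 5, "2.4": 3}
points = {"2.1": 5, "2.2": 10, "2.4": 15}
print(report_score(scores, points))  # 23.0 out of 30 available points
```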

Exemplary dimensions and criteria for assessment are presented in Table 2 (See full assessment rubric in Appendix C).

Table 2 The rubric sections and examples of dimensions and criteria under each section

Section 1 – Getting familiar with a phenomenon. Exemplary dimension: Conducting a pre-experiment. Exemplary criteria: making detailed observations; describing observations without interpretation.
Section 2 – Experiment planning. Exemplary dimensions: Formulating a research question (criterion: formulating a research question showing the relationship between 2 variables); Formulating a hypothesis (criterion: basing the hypothesis on scientific and relevant information, including the microscopic level and chemical equations).
Section 3 – Carrying out the experiment, analyzing, and drawing conclusions. Exemplary dimensions: Displaying and analyzing results (criteria: processing and displaying results using a graph (Excel); explaining the results based on relevant and scientific information, including the microscopic level and chemical equations); Summarized discussion (criteria: critically analyzing the results, referring to results accuracy and experiment limitations; critically considering the validity of the conclusions).


High-school students who write a laboratory report are also given the rubric; they must structure the report based on the rubric and follow its guidelines.

(2) The written feedback preservice teachers gave for the two high-school students’ laboratory reports in the 1st and 2nd assessments. The laboratory report was provided as a Word document, and feedback was inserted in the document itself in the form of comments and/or corrections of the actual text. As part of the FA process, the chemistry teacher must insert comments in the body of the laboratory report to better explain what was missing or incomplete, as well as to compliment the student on good work that s/he has done. Those comments should not simply state the correct answer but rather provide constructive, formative feedback, such as asking leading questions and providing directions on how to improve and promote further learning.

(3) Reflections written by the preservice teachers. The preservice teachers had to submit responses to reflective questions about the entire laboratory report assessment process.

The reflective questions related to assessing the laboratory reports through the rubric were: (a) Which dimensions and criteria in the rubric were more challenging, and which were less challenging, for you to score during the assessment process? Please explain. (b) What were your main uncertainties or doubts while assessing the laboratory reports using the rubric? (c) Was there a difference between the assessment of the 1st laboratory report and that of the second one? Please explain what the difference was and how it manifested in your assessment.

Other reflective questions related to the written feedback the preservice teachers provided for the two different high-school students’ laboratory reports: (a) How would you characterize the written feedback given to the high-school students? (b) What were your challenges in writing your feedback comments? Please explain. (c) Did your written feedback change between the assessment of the 1st laboratory report and that of the second one? In what way? Please explain.

Data analysis

The rubric contained three different sections, multiple dimensions under each section, and multiple criteria under each dimension. Each criterion had to be scored on a scale between 0 and 5 (integers only), where 0 is the lowest and 5 is the best. The data from the scored rubrics of the 1st and 2nd assessments were arranged, tabulated, charted, and then analyzed to reveal different patterns and variation by section, dimension, and criterion, as well as across the two assessments. Descriptive statistics were used to quantify the phenomenon. Thus, we calculated the average, standard deviation, highest and lowest scores, and range. These are presented in the results section and describe patterns or irregularities in the data for descriptive purposes only.
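As a minimal sketch of this descriptive analysis (in Python, using hypothetical scores rather than the study's data), the following computes the measures listed above for one rubric criterion across 13 graders:

```python
from statistics import mean, pstdev

def describe(scores):
    """Descriptive statistics for one rubric criterion across graders.
    pstdev (population standard deviation) is used here; the paper does
    not state which standard deviation variant was applied."""
    return {
        "average": round(mean(scores), 1),
        "std_dev": round(pstdev(scores), 1),
        "highest": max(scores),
        "lowest": min(scores),
        "range": max(scores) - min(scores),
    }

# Hypothetical criterion scores (0-5) given by 13 preservice teachers
first_assessment = [4, 5, 3, 4, 4, 2, 5, 4, 4, 3, 5, 4, 4]
second_assessment = [4, 4, 4, 4, 3, 4, 4, 5, 4, 4, 4, 4, 3]

print(describe(first_assessment))
print(describe(second_assessment))
```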

This is a qualitative study with a small number of participants; we emphasize that we are not aiming for statistical generalization. However, the descriptive statistics, together with the qualitative results, are important for describing the phenomenon under investigation (Sandelowski, 2000; Baškarada, 2014). We believe that the detailed description we provided of the research context and the case we studied will enable the reader to assess the degree of similarity between the cases investigated and those to which the findings are to be applied and transferred (Bretz, 2008).

The qualitative content analysis method was used to analyze the written feedback given to students by the preservice teachers on the two laboratory reports, and to analyze the reflections written by the preservice teachers at the end of the process (Hsieh and Shannon, 2005). The coding technique is therefore qualitative in nature; we then quantified the qualitative codes to compare their frequencies (Chi, 1997).

The analysis of the preservice teachers’ feedback comments was done simultaneously by two researchers, uncovering common themes. This phase revealed common viewpoints in the preservice teachers’ feedback comments. The second phase of this analysis began by creating, and later refining, a coding scheme addressing the various levels of feedback types (Maguire and Delahunt, 2017). In this phase, we created a scale along a continuum between convergent and divergent feedback, based on Torrance and Pryor (2001), as presented in our theoretical framework section. We applied this framework to characterize the preservice teachers’ level of feedback using the range of convergent, intermediate, and divergent formative feedback, each of which has different characteristics, as described in Table 3.

Table 3 Coding scheme of formative feedback by level and domain

Convergent – feedback that demands specific corrections. Categories (with examples of feedback comments from the reports): correcting the text (“Your determination of state of matter are incorrect”); judging as correct/incorrect by giving precise correction instructions (“You were required to raise 5 questions only.”).
Intermediate – feedback that includes guidance and specific actions for improvement. Categories: providing specific direction for correction (“Reconsider the chemical language you are using to describe this process”); asking specific, close-ended, leading questions (“The conclusion should be more focused on your variables”).
Divergent – feedback that responds to students’ ideas with the aim of advancing their learning. Categories: prompting for elaboration or justification (“How can you measure the rate of a reaction?”); encouraging and giving opportunity to think (“Elaborate why your conclusions are valid for this specific experiment only”); identifying inconsistencies while reasoning.


In the next stage, the percent change in the number of written feedback comments at each level was calculated across the two assessments by taking the difference in the number of feedback comments between the 1st and the 2nd assessment and dividing it by the number of feedback comments at that level in the 1st assessment. For example, if the number of feedback comments at the divergent level was 41 in the 1st assessment and 57 in the 2nd assessment, then the % change is (57 − 41)/41 × 100% ≈ +39%.
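A minimal sketch of this calculation (assuming the feedback comments have already been coded and counted by level; the counts are taken from the example above, and the function name is ours):

```python
def percent_change(first_count, second_count):
    """Percent change in the number of feedback comments from the
    1st to the 2nd assessment, relative to the 1st assessment."""
    return (second_count - first_count) / first_count * 100

# Example from the text: 41 divergent comments in the 1st assessment,
# 57 in the 2nd assessment.
print(f"{percent_change(41, 57):+.0f}%")  # prints +39%
```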

Trustworthiness

In this study the inclusion of both qualitative content analysis and descriptive statistics of score variation enabled the triangulation of the data. Combining data from a preservice teacher's rubric scoring, feedback comments, and reflections enabled the triangulation of the findings, which, in turn, increased the validity of the findings, reduced the subjectivity of our interpretations, and enhanced the trustworthiness of the conclusions (Jonsen and Jehn, 2009).

The constant comparative method of reviewing and comparing the data addressed the concern of credibility. In addition, active corroboration on the interpretation of the data helped in controlling validity and reliability. Trustworthiness in the content analysis process was addressed by each researcher conducting this phase individually and then discussing the results together until consensus was reached through a negotiated agreement process (Watts and Finkenstaedt-Quinn, 2021).

Findings

The preservice teachers first assessed the laboratory reports based on the rubric; we then conducted an analysis that included the following four phases: in phase 1, we examined the overall scoring of the laboratory report; in phase 2, we examined the score variations by the three different sections; in phase 3, by the different dimensions; and in phase 4, by the different criteria. Fig. 2 presents the aspects teachers had to assess (the full rubric is presented in Appendix C).
Fig. 2 Overview of data analysis.

Assertion 1: inconsistent overall scoring among preservice teachers

The total score of the 1st laboratory report assessment ranged from 40 to 90 points. For the 2nd laboratory report assessment, the range was from 51 to 88 points (out of 100 available points). This large spread of the total score for the two assessed laboratory reports was also expressed by the standard deviation values – 13 for the 1st laboratory report assessment and 11 for the 2nd. The score range and standard deviation were slightly lower for the 2nd assessment, which is modestly encouraging, but remained high.

Fig. 3 presents the total score for the laboratory reports in both 1st and 2nd rubric-based assessments. In both assessments the score range was wide, although the range decreased from 50 points in the 1st assessment to 37 points in the 2nd assessment.


Fig. 3 Rubric-based assessments’ total scores in the 1st and the 2nd laboratory-reports.

In both assessments, the scores of 10 out of the 13 preservice teachers ranged between 70 and 90, while the rest scored below 70 (potential outliers). If we remove the outliers, the overall score standard deviation values for both assessments are cut by about 55%, and the standard deviation of the 2nd assessment is still slightly lower than that of the 1st one – 4.8 vs. 5.7, respectively.

Assertion 2: challenges preservice teachers experienced while assessing the laboratory report

Table 4 presents the descriptive statistics by rubric section – the average, standard deviation, highest and lowest scores, and range. We have highlighted the important range and standard deviation values of Sections 2 and 3, which describe the change in score variation between the 1st and the 2nd assessment. These values are discussed in detail below.
Table 4 Rubric-based assessment average scores by sections and descriptive statistics of the 1st and 2nd laboratory reports
Section Available points Measure 1st assessment 2nd assessment
1. Getting familiar with a phenomenon 10 Average 7.1 8.1
Std dev 1.0 1.1
Highest 8.7 10.0
Lowest 5.3 6.7
Range 3.3 3.3
2. Experiment planning 40 Average 32.8 31.7
Std dev 5.9 3.2
Highest 40.0 37.0
Lowest 15.0 24.5
Range 25.0 12.5
3. Carrying out the experiment, analyzing, and drawing conclusions 50 Average 36.1 34.3
Std dev 6.9 8.4
Highest 43.6 45.0
Lowest 20.0 20.0
Range 23.6 25.0


In Section 1, the standard deviation and range were consistent across the two assessments. Thus, we can assume that Section 1 was not very challenging for the preservice teachers to score. In Section 2, the range of the 2nd assessment was cut in half, from 25 to 12.5 points, compared with the 1st one, and the standard deviation decreased by 47%, from 5.9 to 3.2. We assume that Section 2 was challenging at first but, after the classroom discussions, became less challenging for the preservice teachers to score. In Section 3, the variation of the 2nd assessment slightly increased compared with the 1st one – the range increased from 23.6 to 25, and the standard deviation increased from 6.9 to 8.4. We therefore assume that Section 3 was the most challenging section to score.

Delving into the analysis of the preservice teachers’ reflections, we found that 11 out of the 13 teachers felt that the first and second sections were less challenging to assess than the 3rd section. One teacher wrote: “Assessing the first section was technical for me – I only had to check whether the student mentioned all substances in the observation stage and did not give them any interpretation…. this was not complicated”.

Another teacher added: “In the first section of the rubric there are no grey areas, you simply need to decide whether the students did what is expected of them based on the rubric”.

Addressing the 3rd section, one teacher reflected: “The most challenging part was the third section because students are required to understand, analyze, and explain the results in relation to the hypothesis. The students often write partial answers and I had to ask myself if the student really understood the experiment correctly or are there missing or unclear parts”.

With regards to the guidance received between the 1st and the 2nd assessments, preservice teachers reported ambiguity regarding the rubric-based assessment; specifically, they had difficulty in deciding whether to score a certain answer as ‘2’ or ‘3’, and ‘1’ or ‘2’. Nevertheless, the preservice teachers reported that talking explicitly about the meaning of each score, and reflecting on the scoring process along with providing examples for each possible answer, and its corresponding appropriate score, helped them in scoring the 2nd laboratory report.

The change in standard deviation in the 3rd section highlights the complexity the preservice teachers experienced while assessing Section 3, which deals with explaining the results and drawing conclusions and requires the application of critical thinking. These elements were the most difficult to assess according to the preservice teachers’ reflections. We assume that the explicit guidance helped the teachers in assessing students’ experiment planning (2nd section), which includes more focused, close-ended, and specific aspects of the laboratory report, such as formulating research questions and defining experiment variables; however, Section 3, which is more complex, remained challenging for them to assess.

Assertion 3: scoring open and non-structured dimensions vs. focused dimensions

In Section 1 the standard deviation and range were consistent across the two assessments. Thus, we further delved only into dimensions in Sections 2 and 3. The descriptive statistics for the dimensions under Section 2 are presented in Table 5.
Table 5 Rubric-based assessment average scores of dimensions in Section 2 (Experiment planning) and descriptive statistics of the 1st and 2nd laboratory reports
Dimension Available points Measure 1st assessment 2nd assessment
2.1 Asking research questions 5 Average 3.7 3.7
Std dev 1.4 0.9
Highest 5.0 5.0
Lowest 0.0 2.0
Range 5.0 3.0
2.2 Formulating research question 10 Average 8.4 8.2
Std dev 1.5 1.0
Highest 10.0 10.0
Lowest 5.0 7.0
Range 5.0 3.0
2.3 Formulating a hypothesis 10 Average 8.3 7.0
Std dev 1.9 2.2
Highest 10.0 10.0
Lowest 3.0 2.0
Range 7.0 8.0
2.4 Experiment planning 15 Average 12.6 12.9
Std dev 2.6 1.4
Highest 15.0 14.0
Lowest 5.0 9.5
Range 10.0 4.5


In dimensions 2.1, 2.2, and 2.4, which deal with setting up research questions and planning an experiment, we observed a reduction in both the range and the standard deviation values from the 1st assessment to the 2nd assessment. However, in dimension 2.3 – formulating a hypothesis – the range increased from 7.0 to 8.0 and the standard deviation increased from 1.9 in the 1st assessment to 2.2 in the 2nd assessment. This dimension includes providing a scientific explanation for the students’ hypothesis. In their reflections, 8 out of the 13 preservice teachers emphasized that they struggled with scoring this dimension, especially because they did not know what should be included in a scientific explanation of the hypothesis and what the depth of the explanation should be.

For example, one teacher wrote: “I did not know what is expected from a student's answer… my main dilemma was to decide the extent to which students’ answer should be elaborated and therefore I had a difficulty in deciding what is the appropriate score. I had to remind myself that it is the student's report, rather than mine, and therefore it should reflect their perspective”.

For all other dimensions under this section, the standard deviation decreased from the 1st assessment to the 2nd assessment. Specifically, in the Experiment planning dimension, the standard deviation was cut by almost 50%, from 2.6 to 1.4. This dimension included criteria such as stating the constant variables in the experiment and describing a detailed and logical outline of the experiment process steps.

As one teacher described: “When students described their planned experiment, I simply had to assess whether the stages are logical and that there is not any stage missing…this was easy for me to score”.

The descriptive statistics for the dimensions under Section 3 are presented in Table 6.

Table 6 Rubric-based scores of dimensions in Section 3 (carrying out the experiment, analyzing, and drawing conclusions) and their descriptive statistics for the 1st and 2nd laboratory reports
Dimension Available points Measure 1st assessment 2nd assessment
3.2 Displaying and analyzing results 15 Average 10.6 9.8
Std dev 2.1 2.6
Highest 14.3 12.8
Lowest 6.0 3.0
Range 8.3 9.8
3.3 Drawing conclusions 10 Average 7.1 6.1
Std dev 2.4 3.0
Highest 10.0 10.0
Lowest 3.0 0.0
Range 7.0 10.0
3.4 Summarized discussion 10 Average 5.8 5.0
Std dev 1.8 2.5
Highest 9.3 8.0
Lowest 2.0 0.7
Range 7.3 7.3
3.5 The laboratory report's language, organization, and aesthetics 10 Average 7.3 8.5
Std dev 1.8 1.3
Highest 10.0 10.0
Lowest 4.0 5.3
Range 6.0 4.7


Dimension 3.1 – experiment handling – was excluded from the analysis since it can be assessed only by observing students performing the experiment. As shown in Table 6, for dimensions 3.2, 3.3, and 3.4, which deal with results analysis, conclusions, and discussion, the standard deviation values increased from the 1st assessment to the 2nd assessment, as did the ranges for dimensions 3.2 and 3.3 (the range for dimension 3.4 remained unchanged).

It is especially notable that for dimension 3.3 – drawing conclusions – the range in the 2nd assessment spans the entire range of available points (i.e., 10 points).

As noted, Section 3 was challenging for the preservice teachers to score. In their reflections, they addressed these dimensions and emphasized: “The discussion section has a high degree of freedom and I felt that assessing this dimension depends a lot on teachers’ perspective and requirements. For example, when thinking about possible mistakes in the experiment and research limitations, I do not believe that it is enough to address only measurement errors rather it is essential to address other possible limitations also… but maybe for other teachers it is sufficient.” This might reflect that when the preservice teachers had to assess focused dimensions (such as dimension 2.4 – Experiment planning), the guidance helped the scoring process and resulted in more consistent scores, whereas when they had to assess more open and non-structured dimensions, which require students to exhibit higher order thinking skills, they experienced difficulties in scoring and the standard deviation increased across the two assessments.

Assertion 4: the role of the guidance provided to preservice teachers

Given the results of the rubric-based assessment scoring in the different sections and dimensions, we focus only on the criteria under the dimensions of Section 3, which showed high inconsistency in scoring among the preservice teachers. Table 7 presents these criteria's standard deviations for the 1st and 2nd assessments.
Table 7 Criteria's standard deviations of the 1st and 2nd rubric-based assessments scores in Section 3
Dimension Criterion Standard deviation
1st assessment 2nd assessment
3.2 Displaying and analyzing results 3.2.1 Organizing and displaying results in a table format 0.9 1.2
3.2.2 Processing and displaying results using a graph (Excel) 1.3 0.7
3.2.3 Describing the trends revealed 0.5 1.5
3.2.4 Explaining the results based on relevant and scientific information (including the microscopic chemistry level and chemical equations) 1.5 0.9
3.3 Drawing conclusions 3.3.1 Drawing conclusions that fit the experiment results 1.6 1.9
3.3.2 Explaining whether conclusions support the hypothesis 1.3 1.7
3.4 Summarized discussion 3.4.1 Critically analyzing the results (referring to results accuracy and experiment limitations) 1.5 1.5
3.4.2 Critically considering the validity of the conclusions 1.2 1.8
3.4.3 Based on the experiment results, formulating 3 new research questions 1.1 0.7
3.5 The laboratory report's language, organization, and aesthetics 3.5.1 Using concise and scientific language 1.3 0.9
3.5.2 Writing clearly in standard language 0.9 0.5
3.5.3 Submitting complete, readable, organized, and aesthetic report 0.9 1.0


Deepening the analysis for the criteria in Section 3 revealed that the standard deviation values decreased from the 1st assessment to the 2nd assessment for criteria 3.2.2 – processing and displaying results using a graph, 3.2.4 – explaining the results based on relevant and scientific information, and 3.4.3 – based on the experiment results, formulating 3 new research questions, which can be explained as an effect of the guidance. The preservice teachers’ reflections also supported and emphasized that the shared discussion in the classroom among colleagues and the course lecturer made it clearer for them how the experiment results should be displayed and what a proper research question is. The most significant reduction in standard deviation was observed for criterion 3.2.4, because in the 1st assessment the preservice teachers struggled to understand the expected extent of students’ scientific explanations.

However, despite that guidance, in the criteria dealing with writing conclusions and thinking critically about the results of the experiment (criteria 3.3.1, 3.3.2, 3.4.1, and 3.4.2), the standard deviation increased, or at best remained unchanged, from the 1st assessment to the 2nd assessment. The guidance to the preservice teachers addressed how to assess and score students’ writing related to the experiment's conclusions and critical thinking by providing a variety of students’ possible responses. However, since the standard deviation of the scores for the related criteria increased from the 1st assessment to the 2nd one, we assume that this form of guidance probably created more confusion.

The nature of feedback provided by preservice teachers

In addition to scoring based on the rubric, the 13 preservice teachers were asked to provide written feedback comments in the body of the two high-school students’ laboratory reports. These feedback comments were aimed at helping and guiding students to improve the content of the report. To characterize the feedback quality, we evaluated each feedback comment based on the criteria shown previously and assessed it on a formative feedback scale of convergent, intermediate, and divergent.

Table 8 shows that the number of feedback comments at the convergent level decreased by 16%, the number at the intermediate level decreased by 5%, and the number at the divergent level increased by 39%. That is, the type of written formative assessment feedback given by the preservice teachers changed from the 1st assessment to the 2nd assessment and became more divergent in nature. The divergent approach advances students’ thinking and creates an opportunity to discuss and elaborate. We can also notice that the main changes occurred at the convergent and divergent levels, whereas there was only a slight change at the intermediate level, meaning that the preservice teachers’ assessment still included leading closed-ended questions and specific directions for improvement.

Table 8 Feedback categories by formative feedback level, with each category's short name, full description, and the change in the number of feedback comments (%) across the two assessments

Formative feedback level | Category short name | Feedback category full description | Change in number of feedback comments (%)
Convergent | Correcting | Correcting the text | −58
Convergent | Judging | Judging as correct/incorrect by giving precise correction instructions | +26
Convergent | | Average change | −16
Intermediate | Asking | Asking specific, close-ended, leading questions | +6
Intermediate | Providing | Providing specific direction for correction | −7
Intermediate | | Average change | −5
Divergent | Encouraging | Encouraging and giving opportunity to think | −13
Divergent | Identifying | Identifying inconsistencies while reasoning | −25
Divergent | Prompting | Prompting for elaboration or justification | +150
Divergent | | Average change | +39


Each level (convergent, intermediate, and divergent) was characterized by several categories and the percent change in number of feedback comments by category across the two assessments is also shown in Table 8. We see that the number of feedback comments in the category correcting the text decreased the most across the two assessments while the number of feedback comments in the category prompting for elaboration or justification increased the most.
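For clarity, the sketch below shows one way the per-category change values in Table 8 could be obtained from raw counts of feedback comments, assuming the change is expressed relative to the 1st-assessment count; the counts used here are hypothetical, not the study's data.

    # Hypothetical numbers of feedback comments per category in the 1st and
    # 2nd assessments (the actual counts are not reported in this excerpt).
    counts = {
        "Correcting": (12, 5),
        "Prompting": (4, 10),
    }

    for category, (n_1st, n_2nd) in counts.items():
        # Percent change relative to the 1st-assessment count (assumed definition).
        change = (n_2nd - n_1st) / n_1st * 100
        print(f"{category}: {change:+.0f}%")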

In their 1st laboratory-report assessment, the preservice teachers felt that producing a well-written report that students can include in their portfolio was more important than guiding them through the learning process and creating a two-way dialog that encourages them to think and develop higher-order thinking skills. This result is reinforced in their reflections, where one teacher wrote: “In the first report I corrected the whole report and the student had to simply accept my comments and submit a prepared report without their personal contribution. In the second report, I tried to provide the student with directions for further thinking. Yet, I chose to correct only things that are incorrect or missing and not comment on things I would have done differently but they are still correct and acceptable”.

Another teacher reflected: “looking back at my initial feedback I understand that I wrote the laboratory report instead of the student, investing a lot of time and energy. Now, I realize that it is important that the report reflects the student's work and not strive to change everything. By doing so, I allow the students to maintain their uniqueness and I demonstrate respect towards their work”.

We also infer from the data that some preservice teachers still felt the need to judge students’ answers and provide them with the ‘correct’ answer, as well as to give specific directions for correction. During our analysis, we still witnessed some ‘correct/incorrect’ feedback, such as:

“You do not say gas is formed” [In their observations, students are required to describe what they see at the macroscopic chemistry level, without giving any interpretation]

“This is a conclusion and not an observation, there is no need to write it here”

“Do not connect the dots in your graphical representation”. This example of feedback does not enable or advance students’ understanding because an explanation is missing for: (a) why connecting the dots is not appropriate in this case, (b) when connecting the dots is the right thing to do and when it is not, and (c) what the significance of connecting the dots is.

“It is not correct to write ‘the quantity of the acid increased’. You should address the volume or the concentration of the acid” [When planning the experiment, students are required to specify the exact amount of acid that serves as the independent variable].

Although 12 out of 13 teachers wrote in their reflections that they perceived their 2nd assessment feedback to be more guiding and formative, some still tended to judge students’ reports in a convergent way.

Additionally, the number of written feedback comments classified as intermediate level remained almost the same across both assessments. Teachers felt that this type of feedback is necessary because (a) it provided them with further details related to the inquiry process by asking close-ended questions such as “Did you mean the concentration of HCl?” or “After how long you measured the pH of the solution?”, and (b) not all comments should be elaborated or opened for discussion, such as “Correct the chemical equation to be written in a proper chemical language”. That is, certain errors require feedback that provides a specific direction for improvement to move students in the right direction.

The main aspect that changed in the preservice teachers’ feedback was their ability to prompt students for further elaboration and justification. The following examples show that the preservice teachers encouraged the students to continue their line of investigation, to elaborate on a variety of aspects, and to further justify their results and conclusions:

“The results obtained should be further explained and strengthened by scientific evidence” [Students are required to provide a scientific explanation of their results by connecting the macroscopic chemistry phenomena to the microscopic chemistry level, along with relevant chemical equations, and to address possible errors that might have occurred during the inquiry]

“Explain what the difference between the experiment systems is, and what will be measured in each system” [When planning the experiment, students are required to change the independent variable across four systems and describe how they will measure the dependent variable in each system]

“Reasons should be given as to why the results did not confirm the hypothesis”

They also included questions that generate discussion and probe students’ thinking, such as: “If the acid is partially neutralized, why is there no change in acidity?”; “How does the pH remain acidic even in large magnesium masses?”

One teacher emphasized that: “The discussion in the course helped me to understand that I was missing certain topics in my written feedback and how I can give the students further directions to think by asking certain questions or requesting for more in-depth explanations. The goal is that the students will look for the appropriate answer and discuss how to proceed”.

Regarding the category of identifying inconsistencies while reasoning, we assume that the decrease across the two assessments is due to the nature of the report itself. That is, the first report assessed by the preservice teachers contained many inconsistencies, leading to a large number of feedback comments of this sort compared to the second report. We believe that this category is highly report dependent.

Discussion

Careful attention to the use of formative assessment is a critical component of meaningful learning in science in general and in the chemistry classroom in particular (Tomanek et al., 2008; Furtak et al., 2016; Clinchot et al., 2017). This study examined whether guidance provided through a teacher education program affected preservice teachers’ formative use of scoring rubrics in the context of an inquiry-based laboratory. The formative use of the rubric incorporated the value of feedback to students in addition to the rubric-based scoring. The study findings show that the total laboratory report score variation decreased in the 2nd assessment, probably due to the explicit guidance the preservice teachers received. The teachers valued the explicit guidance and emphasized that it provided them with the opportunity to examine, discuss, and learn about their own assessment knowledge and scoring process. The reflection enabled them to actively utilize FA while developing an in-depth understanding of its potential implications for teaching and learning (Buck et al., 2010; Sabel et al., 2015). However, despite the guidance, variation increased mainly in Section 3, which deals with the results analysis, conclusions, and discussion dimensions.

From the score ranges and standard deviations, and from the preservice teachers’ reflections, we can infer that when the preservice teachers scored dimensions that are less open to interpretation, the guidance led them to a more in-depth articulation of what is expected from the students, and their scores tended to be more unified. In contrast, in the dimensions that are more open to discussion and interpretation, which included a variety of aspects to account for – related to evidence-generating analysis, interpretation, and conclusions – the explicit guidance may have created different thinking directions that led to the increase in score variation. In these dimensions the guidance exposed the preservice teachers to the complexity of rubric-based scoring in a FA manner, making it challenging for them to score these criteria consistently.

The Results display, Analyzing, and Drawing conclusions dimensions under Section 3 require students to think critically about their inquiry-based experiment and to concisely express the chemistry understanding inferred from the experiment, while connecting the macroscopic to the microscopic chemistry level (Bevins and Price, 2016). These components of the inquiry process have been documented as complicated learning skills for teachers to assess because they require teachers’ deep conceptual understanding of the meaning of these aspects (McNeill and Krajcik, 2008; Berland and Reiser, 2009; Mardapi, 2020). That is, different assessors’ cognitive understanding of what an analysis of results should comprise, and how it should be carried out, might lead to different scoring and a large variation in scores (Bernard and Dudek-Różycki, 2009). This finding highlights that it can be challenging to achieve consistency among graders when scoring dimensions that deal with data analysis, interpretation, and discussion (Allen and Tanner, 2006; Avargil et al., 2015; Grob et al., 2017). Yet, given the central role that data analysis plays in thinking critically about the experiment, we believe that providing in-depth guidance, targeted specifically at these aspects, can create a coherent understanding of how to assess and score these complex components and, as a result, narrow the variability of scoring (Buck et al., 2010; Panadero and Jonsson, 2013). The guidance should also include a conceptualization of the different phases of the inquiry process, discussing their purpose, meaning, and representation, presenting different levels of implementation, and finally assessing a variety of students' examples (Sabel et al., 2015; Grob et al., 2017).

Regarding the characterization of preservice teachers’ formative feedback, we found that the written feedback comments changed across the two assessments to become more divergent and less convergent in nature. The feedback in the 2nd laboratory report assessment included fewer instances of correcting the text and more formative comments that encourage students to think and prompt them for elaboration and justification.

Still, the results show that in the 2nd assessment there were both divergent and convergent feedback comments, which gives some indication of the teachers’ assessment reasoning and perceptions regarding FA.

It is important to note that in the context of assessing a laboratory report, both types of feedback are vital for two reasons: (a) regarding convergent feedback, there are instances in which correcting the text and giving a precise direction for correction is needed, such as when a formula or chemical equation is written incorrectly, and (b) regarding divergent feedback, the ability to provide students with this kind of feedback depends on the preservice teachers’ capacity to clearly understand and implement divergent feedback themselves before offering it to students (Torrance and Pryor, 2001).

While encouraging preservice teachers to enact both types of feedback, we should strive to enable them, over time and through constant reflection, to minimize the evaluative orientation of their feedback and move towards an interpretive one, making sense of why students struggle with the targeted concepts and ideas of the inquiry process (Talanquer et al., 2015).

The two types of feedback reveal the importance of attending to how teachers pay attention to students’ ideas while engaged in the practice of formative assessment, and how guidance can contribute to their professional development with respect to formative assessment (Talanquer et al., 2013, 2015; Barnhart and van Es, 2015; Van der Kleij et al., 2017). In our study we show this importance in the context of formatively assessing students’ laboratory reports in chemistry.

Analysis of the feedback comments of the 2nd assessment made it evident that the preservice teachers were able to provide substantive feedback that can engage students in actively revising their own inquiry-based laboratory report. In the 2nd assessment, the preservice teachers were more successful in giving formative feedback that can elicit students’ knowledge and facilitate their conceptual understanding. From this perspective, our work supports and contributes to recent calls that encourage the use of rubrics in a FA manner and helps facilitate the scoring process in a formative feedback style (Allen and Tanner, 2006; Panadero and Jonsson, 2013; Menéndez-Varela and Gregori-Giralt, 2018).

Although the explicit guidance conveyed the importance of leading high-school chemistry students through the inquiry-based process by providing them with feedback, when the preservice teachers were confronted with students’ lack of understanding in the report, they were still unable, in some cases, to enhance or revise their feedback; instead, they judged students’ answers as correct or incorrect while providing precise instructions for correction.

Several of the categories characterizing the written feedback that we identified in this research, such as correcting the text and providing a precise direction for correction, have also been identified by researchers interested in FA (Buck et al., 2010; Harshman and Yezierski, 2015; Usher and Barak, 2018; Kim et al., 2020; Murray et al., 2020). In some of these studies, preservice teachers struggled to identify students’ misconceptions and focused on the extent to which students had made sense of the topic. Teachers’ reaction in these cases was to provide the students with help that sequentially narrowed students’ thinking. Another documented challenge was how to select what to comment on, and how to avoid giving away part of the answer in the feedback (Grob et al., 2017). This challenge was evident in our research when preservice teachers sometimes struggled to use the feedback as an avenue for communication and instead provided students with the correct answer.

In an effort to characterize preservice teachers’ feedback in the 1st assessment versus the 2nd assessment, we turned to Murray and colleagues (2020), who identified FA personalities of chemistry teachers engaged in providing feedback on students’ written work. For the 1st assessment, we can describe the preservice teachers’ feedback as evaluative and directive; that is, teachers focused on the extent to which students’ answers were correct, and their proposed actions focused on providing stepwise guidance in the correct direction. In the 2nd assessment, their interpretation was still evaluative in some cases, but their proposed actions were more dialogic, using students’ ideas to make suggestions on how to move students’ conceptual understanding forward. This characterization of teachers’ FA personality highlights the importance of paying attention to the extent to which teachers’ approach to the assessment of students’ laboratory reports is descriptive versus inferential (Barnhart and van Es, 2015; Wheeler et al., 2015). Presenting and guiding teachers through these approaches while they build recognition of their FA practice might lead them to explore and implement other types of feedback that promote students’ learning (Murray et al., 2020).

We therefore recommend that the guidance preservice teachers receive regarding formative assessment of inquiry-based laboratory reports include aspects of how to notice and interpret students’ ideas and only then respond to students with formative feedback (Talanquer et al., 2015).

Implications for preservice teacher education

Our study is rooted in the belief that if we want teachers to become better assessors of student understanding, then teacher educators should identify and understand characteristics of teachers’ assessment reasoning that can enhance students’ learning process, and lead to successful instruction in the classroom. Therefore, the results of our study have important implications for the preparation of future chemistry teachers and for the professional development of those already teaching chemistry. Fig. 4 presents the main components and a sequence of stages to enact FA in the context of assessing inquiry-based laboratory reports.
Fig. 4 A model of effective guidance – enactment of inquiry-based laboratory reports’ FA.

Focusing on the assessment of students’ laboratory reports in the authentic context of the inquiry-based laboratory enables prospective teachers to develop practical procedures and a sense of realism with regard to planning and teaching (Loucks-Horsley et al., 2009; Buck et al., 2010; Wheeler et al., 2015). By experiencing the assessment of students’ laboratory reports within their education program, preservice teachers can explore the complexity of formative assessment in a controlled and supportive environment. Prospective teachers in the early stages of their development do not yet have well-developed abilities to enact FA; therefore, we suggest that the guidance be scaffolded to include the following aspects, as shown in Fig. 4:

(1) Expanding knowledge and understanding of inquiry-based laboratory practices – teachers need to have a clear understanding of each inquiry-based laboratory practice, especially practices that require data analysis and reasoning. This aspect includes discussing what each practice means and what is important to address when implementing and assessing it, and it may lead to more unified scoring (Sabel et al., 2015; Bevins and Price, 2016; Correia and Harrison, 2020).

(2) Discussing FA in theory (2.1) and practice (2.2) – the preservice teachers emphasized that understanding the purpose of FA and realizing its potential in promoting students’ understanding and active participation in the learning process motivated them to provide students with formative feedback. Therefore, delving into the essence of FA and presenting the notion of noticing and interpreting students’ ideas will lay the groundwork for teachers to implement this approach during their instruction in general, and in assessing laboratory reports in particular (Talanquer et al., 2015; Kim et al., 2020; Murray et al., 2020).

(3) Reflecting on the process – reflection allows for purposeful, systematic investigation into one's own or others’ teaching and assessment approaches in order to develop new understandings and improve assessment practices (Abell et al., 1998; Karlström and Hamza, 2019). The reflection process during the course allowed the preservice teachers to examine their assessment knowledge and to articulate their learning and challenges regarding FA of inquiry-based laboratory reports. The combination of authentic experiences and substantive reflection, followed by a discussion with colleagues, has the potential to meaningfully impact prospective teachers’ practice and beliefs about science teaching and assessment (Buck et al., 2010).

(4) Providing a field-based teaching experience like the one presented in this paper.

Limitations, further research, and contribution

The rubric used in this research was a pre-prepared rubric designed by the Chemistry Division of the Ministry of Education, with no possibility of changing or modifying it for the course's purposes. This rubric is routinely used by teachers in the laboratory learning unit, which is mandatory in high-school chemistry studies. Becoming familiar with it is vital for preservice teachers’ future practice as high-school chemistry teachers in Israel. We could have gained additional insight into preservice teachers’ perspectives on rubric-based scoring in a FA manner by conducting interviews. Yet, the reflections of the preservice teachers provided a qualitative description of their experience, emphasizing challenges and perceptions regarding this process.

Our analysis method and discussion are based on a small group of participants who are not representative of all preservice chemistry teachers. Therefore, we cannot attribute the changes that occurred from the 1st to the 2nd assessment solely to the guidance provided to the preservice teachers, given other unknown variables and experiences that may have played a role at this time. We acknowledge that the transferability of our results is limited, but they may serve as a starting point for other researchers in this field (Bretz, 2008).

Our work extends the small but growing body of research on the use of scoring rubrics for formative assessment purposes (Panadero and Jonsson, 2013; Menéndez-Varela and Gregori-Giralt, 2018). This study advances the idea that rubrics should be considered a potentially valuable assessment tool and feedback method. It examines the implementation and use of rubrics in the field of inquiry-based laboratories in chemistry education, as few papers have examined this aspect in this discipline and in the context of high-school chemistry teachers’ assessment knowledge. Practically, we offer guidelines for teacher education and professional development programs on how to guide prospective and experienced chemistry teachers in assessing students’ inquiry-based laboratory reports for FA.

We raise a concern regarding consistency among graders when using the rubric, especially in the data analysis section, and therefore this is an important area to investigate. In addition, more research is needed to better understand how teachers’ assessment reasoning evolves from convergent to divergent levels of feedback. It would be critical to identify and characterize stepping-stones that allow teachers to transition to higher levels of formative feedback in addition to the different criteria identified in our study.

Conflicts of interest

The authors have no conflict of interest to declare.

References

  1. Abd-El-Khalick F., Boujaoude S., Duschl R., Lederman N. G., Mamlok-Naaman R., Hofstein A., et al., (2004), Inquiry in science education: International perspectives. Sci. Educ., 88(3), 397–419.
  2. Abell S. K. and Siegel M. A., (2011), Assessment literacy: What science teachers need to know and be able to do, in The Professional Knowledge Base of Science Teaching, Springer, Netherlands, pp. 205–221.
  3. Abell S. K., Bryan L. A. and Anderson M. A., (1998), Investigating preservice elementary science teacher reflective thinking using integrated media case-based instruction in elementary science teacher preparation. Sci. Educ., 82(4), 491–509.
  4. Allen D. and Tanner K., (2006), Rubrics: Tools for making learning goals and evaluation criteria explicit for both teachers and learners. CBE—Life Sci. Educ., 5(3), 197–203.
  5. Andrade H. and Du Y., (2005), Student perspectives on rubric-referenced assessment. Pract. Assessment, Res. Eval., 10, 3.
  6. Avargil S., Herscovitz O. and Dori Y. J., (2012), Teaching Thinking Skills in Context-Based Learning: Teachers’ Challenges and Assessment Knowledge. J. Sci. Educ. Technol., 21(2), 207–225.
  7. Avargil S., Bruce M. R. M., Amar F. G. and Bruce A. E., (2015), Students’ understanding of analogy after a CORE (Chemical Observations, Representations, Experimentation) learning cycle, general chemistry experiment. J. Chem. Educ., 92(10), 1626–1638.
  8. Avargil S., Bruce M. R. M., Klemmer S. A. and Bruce A. E., (2019), A professional development activity to help teaching assistants work as a team to assess lab reports in a general chemistry course. Isr. J. Chem., 59(6), 536–545.
  9. Barnea N., Dori Y. J. and Hofstein A., (2010), Development and implementation of inquiry-based and computerized-based laboratories: Reforming high school chemistry in Israel. Chem. Educ. Res. Pract., 11(3), 218–228.
  10. Barnhart T. and van Es E., (2015), Studying teacher noticing: Examining the relationship among pre-service science teachers’ ability to attend, analyze and respond to student thinking. Teach. Teach. Educ., 45, 83–93.
  11. Baškarada S., (2014), Qualitative Case Study Guidelines. Qual. Rep., 19, 1–18.
  12. Bennett R. E., (2011), Formative assessment: A critical review. Assess. Educ. Princ. Policy Pract., 18(1), 5–25.
  13. Berland L. K. and Reiser B. J., (2009), Making sense of argumentation and explanation. Sci. Educ., 93(1), 26–55.
  14. Bernard P. and Dudek-Różycki K., (2009), Integration of inquiry-based instruction with formative assessment: The case of experienced chemistry teachers. J. Balt. Sci. Educ., 18(2), 184–196.
  15. Bevins S. and Price G., (2016), Reconceptualising inquiry in science education. Int. J. Sci. Educ., 38(1), 17–29.
  16. Black P. and Wiliam D., (1998), Assessment and classroom learning. Assess. Educ. Princ. Policy Pract., 5(1), 7–74.
  17. Bretz S. L., (2008), Qualitative research designs in chemistry education research, in Bunce D. M. and Cole R. S. (ed.), Nuts and Bolts of Chemical Education Research, ACS Division of Chemical Education, Inc., pp. 79–99.
  18. Brookhart S. M., (1994), Teachers’ Grading: Practice and Theory. Appl. Meas. Educ., 7(4), 279–301.
  19. Brookhart S. M., (1997), A Theoretical Framework for the Role of Classroom Assessment in Motivating Student Effort and Achievement. Appl. Meas. Educ., 10(2), 161–180.
  20. Buck G. A., Trauth-Nare A. and Kaftan J., (2010), Making formative assessment discernable to pre-service teachers of science. J. Res. Sci. Teach., 47(4), 402–421.
  21. Cacciatore K. L. and Sevian H., (2009), Incrementally approaching an inquiry lab curriculum: Can changing a single laboratory experiment improve student performance in general chemistry? J. Chem. Educ., 86(4), 498–505.
  22. Carmel J. H., Herrington D. G., Posey L. A., Ward J. S., Pollock A. M. and Cooper M. M., (2019), Helping students to “do Science”: Characterizing scientific practices in general chemistry laboratory curricula. J. Chem. Educ., 96(3), 423–434.
  23. Chi M. T. H., (1997), Quantifying qualitative analyses of verbal data: A practical guide. J. Learn. Sci., 6, 271–315.
  24. Clinchot M., Lambertz J., Huie R., Banks G., Lewis R., Ngai C., et al., (2017), Better formative assessment. Sci. Teach., 084(03), 69.
  25. Coffey J. E., Hammer D., Levin D. M. and Grant T., (2011), The missing disciplinary substance of formative assessment. J. Res. Sci. Teach., 48(10), 1109–1136.
  26. Correia C. F. and Harrison C., (2020), Teachers’ beliefs about inquiry-based learning and its impact on formative assessment practice. Res. Sci. Technol. Educ., 38(3), 355–376.
  27. Dolin J., Black P., Harlen W. and Tiberghien A., (2018), Exploring Relations Between Formative and Summative Assessment, Springer, pp. 53–80.
  28. Dori Y. J., Dangur V., Avargil S. and Peskin U., (2014), Assessing advanced high school and undergraduate students’ thinking skills: The chemistry-from the nanoscale to microelectronics module. J. Chem. Educ., 91(9), 1306–1317.
  29. Dresel M. and Haugwitz M., (2008), A Computer-Based Approach to Fostering Motivation and Self-Regulated Learning. Artic. J. Exp. Educ., 77(1), 3–20.
  30. Erickson F., (2012), Qualitative research methods for science education, in Fraser B. J., McRobbie C. J. and Tobin K. (ed.), Second International Handbook of Science Education, Springer, pp. 1451–1469.
  31. Fay M. E., Grove N. P., Towns M. H. and Bretz S. L., (2007), A rubric to characterize inquiry in the undergraduate chemistry laboratory. Chem. Educ. Res. Pract., 8(2), 212–219.
  32. Flick U., (2013), The SAGE Handbook of Qualitative Data Analysis, Sage, London.
  33. Furtak E. M., Kiemer K., Circi R. K., Swanson R., de León V., Morrison D. and Heredia S. C., (2016), Teachers’ formative assessment abilities and their relationship to student learning: findings from a four-year intervention study. Instr. Sci., 44(3), 267–291.
  34. Grob R., Holmeier M. and Labudde P., (2017), Formative assessment to support students’ competences in inquiry-based science education. Interdiscip. J. Probl. Learn., 11(2), 11.
  35. Harks B., Rakoczy K., Hattie J., Besser M. and Klieme E., (2014), The effects of feedback on achievement, interest and self-evaluation: The role of feedback's perceived usefulness. Educ. Psychol., 34(3), 269–290.
  36. Harshman J. and Yezierski E., (2015), Guiding teaching with assessments: High school chemistry teachers’ use of data-driven inquiry, Chem. Educ. Res. Pract, 16(1), 93–103.
  37. Herman J., Osmundson E., Dai Y., Ringstaff C. and Timms M., (2015), Investigating the dynamics of formative assessment: relationships between teacher knowledge, assessment practice and learning. Assess. Educ. Princ. Policy Pract., 22(3), 344–367.
  38. Herppich S. and Wittwer J., (2018), Preservice teachers’ beliefs about students’ mathematical knowledge structure as a foundation for formative assessments. Teach. Teach. Educ., 76, 242–254.
  39. Hofstein A. and Lunetta V. N., (2004), The laboratory in science education: Foundations for the twenty-first century. Sci. Educ., 88(1), 28–54.
  40. Hofstein A., Shore R. and Kipnis M., (2004), Providing high school chemistry students with opportunities to develop learning skills in an inquiry-type laboratory: A case study. Int. J. Sci. Educ., 26(1), 47–62.
  41. Hofstein A., Mamlok R. and Rosenberg O., (2006), Varying instructional methods and assessment of students in high school, in McMahon M., Simmons P., Sommers R., DeBaets D. and Crawley F. (ed.), Assessment in Science: Practical Experiences and Education Research, NSTA, pp. 139–148.
  42. Hofstein A., Dkeidek I., Katchevitch D., Nahum T. L., Kipnis M., Navon O., et al., (2019), Research on and development of inquiry-type chemistry laboratories in Israel. Isr. J. Chem., 59(6), 514–523.
  43. Hsieh H. F. and Shannon S. E., (2005), Three approaches to qualitative content analysis. Qual. Health Res., 15(9), 1277–1288.
  44. Jonsen K. and Jehn K. A., (2009), Using triangulation to validate themes in qualitative studies. Qual. Res. Organ. Manag. An Int. J., 4(2), 123–150.
  45. Jonsson A. and Svingby G., (2007), The use of scoring rubrics: Reliability, validity and educational consequences. Educ. Res. Rev., 2(2), 130–144.
  46. Karlström M. and Hamza K., (2019), Preservice science teachers’ opportunities for learning through reflection when planning a microteaching unit. J. Sci. Teacher Educ., 30(1), 44–62.
  47. Kim Y. A., Monroe E., Nielsen H., Cox J., Southard K. M., Elfring L., et al., (2020), Exploring undergraduate students’ abilities to collect and interpret formative assessment data. J. Chem. Educ., 97(12), 4245–4254.
  48. Kohler F., Henning J. E. and Usma-Wilches J., (2008), Preparing preservice teachers to make instructional decisions: An examination of data from the teacher work sample. Teach. Teach. Educ., 24(8), 2108–2117.
  49. Kurdziel J. P., Turner J. A., Luft J. A. and Roehrig G. H., (2003), Graduate teaching assistants and inquiry-based instruction: Implications for graduate teaching assistant training. J. Chem. Educ., 80(10), 1206.
  50. Lawrie G. A., Graulich N., Kahveci A. and Lewis S. E., (2021), Ethical statements: a refresher of the minimum requirements for publication of chemistry education research and practice articles. Chem. Educ. Res. Pract., 22, 234.
  51. Levin D. M., Hammer D. and Coffey J. E., (2009), Novice teachers’ attention to student thinking. J. Teach. Educ., 60(2), 142–154.
  52. Loucks-Horsley S., Stiles K. E., Mundry S., Love N. and Hewson P. W., (2009), Designing professional development for teachers of science and mathematics, Corwin Press.
  53. Maguire M. and Delahunt B., (2017), Doing a thematic analysis: A practical, step-by-step guide for learning and teaching scholars. AISHE-J. All Irel. J. Teach. Learn. High. Educ., 9(3), 3351.
  54. Mamlok-Naaman R. and Barnea N., (2012), Laboratory activities in Israel. Eurasia J. Math. Sci. Technol. Educ., 8(1), 49–57.
  55. Mardapi D., (2020), Assessing students’ higher order thinking skills using multidimensional item response theory. Probl. Educ. 21st Century, 78(2), 196–214.
  56. McNeill K. L. and Krajcik J., (2008), Inquiry and scientific explanations: Helping students use evidence and reasoning. Sci. Inq. Second. Setting, 121–134.
  57. Menéndez-Varela J. L. and Gregori-Giralt E., (2018), The reliability and sources of error of using rubrics-based assessment for student projects. Assess. Eval. High. Educ., 43(3), 488–499.
  58. Merriam S. B., (1998), Qualitative research and case study applications in education, San Francisco: Jossey-Bass.
  59. Murray S. A., Huie R., Lewis R., Balicki S., Clinchot M., Banks G., et al., (2020), Teachers’ noticing, interpreting, and acting on students’ chemical ideas in written work. J. Chem. Educ., 97(10), 3478–3489.
  60. Nadji T. and Lach M., (2003), Assessment Strategies for Laboratory Reports. Phys. Teach., 41(1), 56–57.
  61. National Research Council, (2012), A framework for K-12 science education: Practices, crosscutting concepts, and core ideas, The National Academies Press, Washington, DC.
  62. Panadero E., (2011), Instructional help for self-assessment and self-regulation: Evaluation of the efficacy of self-assessment scripts vs. rubrics, Doctoral dissertation, Universidad Autónoma de Madrid, Madrid, Spain.
  63. Panadero E. and Jonsson A., (2013), The use of scoring rubrics for formative assessment purposes revisited: A review. Educ. Res. Rev., 9, 129–144.
  64. Pullen R., Thickett S. C. and Bissember A. C., (2018), Investigating the viability of a competency-based, qualitative laboratory assessment model in first-year undergraduate chemistry. Chem. Educ. Res. Pract., 19(2), 629–637.
  65. Reiter C., (2017), Theory and methodology of exploratory social science research.
  66. Ruiz-Primo M. A. and Furtak E. M., (2007), Exploring Teachers’ Informal Formative Assessment Practices and Students’ Understanding in the Context of Scientific Inquiry. J. Res. Sci. Teach., 44(1), 57–84.
  67. Sabel J. L., Forbes C. T. and Zangori L., (2015), Promoting prospective elementary teachers’ learning to use formative assessment for life science Instruction. J. Sci. Teacher Educ., 26(4), 419–445.
  68. Sandelowski M., (2000), Focus on research methods: Whatever happened to qualitative description? Res. Nurs. Heal., 23(4), 334–340.
  69. Schamber J. F. and Mahoney S. L., (2006), Assessing and improving the quality of group critical thinking exhibited in the final projects of collaborative learning groups. J. Gen. Educ., 55(2), 103–137.
  70. Sevian H. and Dini V., (2019), A design-based process in characterizing experienced teachers’ formative assessment enactment in science classrooms, in McLoughlin E., Finlayson O. E., Erduran S. and Childs P. E. (ed.), Bridging Research and Practice in Science Education, Springer, pp. 325–337.
  71. Shepard L. A., (2000), The role of assessment in a learning culture. Educ. Res., 29(7), 4–14.
  72. Shwartz G. and Dori Y. J., (2020), Transition into Teaching: Second Career Teachers’ Professional Identity. Eurasia J. Math. Sci. Technol. Educ., 16(11), 1–19.
  73. Siegel M. A., Hynds P., Siciliano M. and Nagle B., (2006), Using rubrics to foster meaningful learning, in McMahon M., Simmons P., Sommers R., DeBaets D. and Crawley F. (ed.), Assessment in Science: Practical Experiences and Education Research, NSTA, pp. 89–106.
  74. Sutherland L., Howard S. and Markauskaite L., (2010), Professional identity creation: Examining the development of beginning preservice teachers’ understanding of their work as teachers. Teach. Teach. Educ., 26(3), 455–465.
  75. Talanquer V., Tomanek D. and Novodvorsky I., (2013), Assessing students’ understanding of inquiry: What do prospective science teachers notice? J. Res. Sci. Teach., 50(2), 189–208.
  76. Talanquer V., Bolger M. and Tomanek D., (2015), Exploring prospective teachers’ assessment practices: Noticing and interpreting student understanding in the assessment of written work. J. Res. Sci. Teach., 52(5), 585–609.
  77. Taylor S. S., (2007), Comments on Lab Reports by Mechanical Engineering Teaching Assistants Typical Practices and Effects of Using a Grading Rubric. J. Bus. Tech. Commun., 21(4), 402–424.
  78. Tomanek D., Talanquer V. and Novodvorsky I., (2008), What do science teachers consider when selecting formative assessment tasks? J. Res. Sci. Teach., 45(10), 1113–1130.
  79. Torrance H. and Pryor J., (2001), Developing formative assessment in the classroom: using action research to explore and modify theory. Br. Educ. Res. J., 27(5), 615–631.
  80. Tsybulsky D. and Muchnik-Rozanov Y., (2019), The development of student-teachers’ professional identity while team-teaching science classes using a project-based learning approach: A multi-level analysis. Teach. Teach. Educ., 79, 48–59.
  81. Usher M. and Barak M., (2018), Peer assessment in a project-based engineering course: comparing between on-campus and online learning environments, Assess. Eval. High. Educ., 43(5), 745–759.
  82. Van Brederode M. E., Zoon S. A. and Meeter M., (2020), Examining the effect of lab instructions on students’ critical thinking during a chemical inquiry practical. Chem. Educ. Res. Pract., 21(4), 1173–1182.
  83. van der Kleij F. M., (2019), Comparison of teacher and student perceptions of formative assessment feedback practices and association with individual student characteristics. Teach. Teach. Educ., 85, 175–189.
  84. Van der Kleij F. M., Cumming J. J. and Looney A., (2017), Policy expectations and support for teacher formative assessment in Australian education reform. Assess. Educ. Princ. Policy Pract., 1–18.
  85. Walker J. P. and Sampson V., (2013), Learning to argue and arguing to learn: Argument-driven inquiry as a way to help undergraduate chemistry students learn how to construct arguments and engage in argumentation during a laboratory course. J. Res. Sci. Teach., 50(5), 561–596.
  86. Watts F. M. and Finkenstaedt-Quinn S. A., (2021), The current state of methods for establishing reliability in qualitative chemistry education research articles. Chem. Educ. Res. Pract, 22(3), 565–578.
  87. Wheeler L. B., Maeng J. L. and Whitworth B. A., (2015), Teaching assistants’ perceptions of a training to support an inquiry-based general chemistry laboratory course. Chem. Educ. Res. Pract., 16(4), 824–842.
  88. Wheeler L. B., Clark C. P. and Grisham C. M., (2017), Transforming a Traditional Laboratory to an Inquiry-Based Course: Importance of Training TAs when Redesigning a Curriculum. J. Chem. Educ., 94(8), 1019–1026.
  89. Yin R. K., (2009), Doing case study research: Design and methods, Thousand Oaks, CA: Sage.
  90. Yin R. K., (2017), Case study research and applications: Design and methods, SAGE Publications.
  91. Zumbrunn S., Marrs S. and Mewborn C., (2016), Toward a better understanding of student perceptions of writing feedback: a mixed methods study. Read. Writ., 29(2), 349–370.

Footnote

Electronic supplementary information (ESI) available. See DOI: 10.1039/d1rp00001b

This journal is © The Royal Society of Chemistry 2021