Poh Nguk
Lau
School of Applied Science, Temasek Polytechnic, Singapore. E-mail: pohnguk@tp.edu.sg
First published on 14th August 2019
The rhetorical argument that laboratory courses are crucial for training skilled STEM practitioners is ill-evidenced in teaching practice. The arduous task of implementing instructor-led skill assessment in large-cohort courses and persistent student disengagement from its educative goals are two such obstacles. This study emphasized the need to equip learners to self-assess technical skills, supported by explicit performance standards and objective evidence. It trialled two interventions, a self-assessment (SA) checklist and a learner-recorded video, to examine how their combination affected appraisal ability and attitudes towards SA. The participants were first-year students taking a chemistry course in biotechnology and chemical engineering programmes. All the participants self-assessed titration competencies against a checklist, with about half additionally assisted by a video replay. A video critique task showed a significant main effect of intervention: SA-with-video participants scored higher than both SA-only participants and the control group, although the additional video intervention did not produce significant gains above SA alone. Qualitative analysis revealed that SA-with-video participants were more targeted in their critique responses. Differences between the video groups in attitudinal responses towards SA were not prominent. Selected SA items, relating to perceptions of the value of SA for skill improvement and to goals and commitment in using SA as a future study strategy, were associated with video exposure in the biotechnology course, or with the course within the video group. Improvements for future work are discussed.
Given the decrease in government funding and the high investment costs of offering laboratory courses, there has been a constant re-evaluation of whether laboratory instruction has indeed fulfilled its intrinsic value in science and engineering curricula (Gibbins and Perkins, 2013). Past studies have repeatedly shown that students’ perceptions on the ground diverge from our visionary and lofty goals of laboratory instruction, highlighting a mismatch between faculty and student goals (Russell and Weaver, 2008; Parry et al., 2012; DeKorver and Towns, 2015; Galloway and Bretz, 2016). Students often downplay their engagement with and ownership of the learning process. For example, some possess an “escapist mindset”, hastily finishing the work as fast as possible without much thinking, while others view laboratory work as a means to obtain good grades instead of skills (Parry et al., 2012; DeKorver and Towns, 2015; Galloway and Bretz, 2016). These are pressing issues to address if the value of the laboratory curriculum is to be maximized, and they call for laboratory instructors to re-focus on the assessment of laboratory outcomes and to monitor the process of skill mastery. A widely implemented assessment practice in the chemistry laboratory is to evaluate the extent to which students “act with criteria” (Prades and Espinar, 2010, p. 453) by integrating cognitive skills (decision making, relating theory to experiment) and hands-on skills. Another common feature of laboratory assessment is that the process is not one-off, but a continuous in-class interaction between instructors and students to evaluate competency levels (Prades and Espinar, 2010).
In the institute where the author teaches, a freshman course in inorganic and physical chemistry has a total enrolment of about 500 students. Instructor-led, one-on-one skill assessment on-site is thus time-consuming and practically hard to implement and sustain for such an enrolment size. The question then is whether one could turn the situation around by equipping learners with the skills to discern their own competency levels objectively. For this learning experience to be meaningful, learners need to know where they are now in their competencies, what to aim for and how to reach the desired standards (Andrade and Heritage, 2017). As the good old adage goes, “practice makes perfect”, but we need to define what constitutes perfection in technical skills, and allow multiple opportunities for learners to improve and move towards it. Along these lines, two well-established cornerstones of teaching practice, formative assessment and self-assessment (SA), are particularly relevant for laboratory instruction. Andrade and Heritage (2017) defined formative assessment as a process where learners reflect on their competencies against performance norms, and iteratively make refinements to close their learning gaps. If provided with objective evidence, specific learning goals for the domain content, or feedback from peers, students can engage in self-examination, enabling them to judge the quality of their own work (Boud, 2003). Boud argued that SA is a valuable life-long skill to develop in graduates of higher education, fostering several soft skills such as a mindset of continuous learning, positive self-concepts and increased autonomy in the ownership of learning, to name a few. In the context of laboratory assessment, the performance criteria should thus be clearly communicated to students and instructors to calibrate standards (Prades and Espinar, 2010).
A scan of the existing literature showed that SA in a laboratory course is typically supported by an open-ended reflection guide (Parry et al., 2012) or a more structured skill inventory (Zhang et al., unpublished work, 2013). Zhang et al. examined the effects of using SA in food science and biology laboratory classes. Skill checklists were provided to students as exemplars of performance standards in a cell culture and food microbiology laboratory. The students then engaged in self-monitoring and reflection, and also provided peer feedback to each other. A post-intervention feedback survey showed that students viewed SA positively, with over 90% of the students surveyed agreeing that the checklist increased confidence in skill performance, assisted in the identification of strengths and weaknesses, and enabled them to make improvements to their skills. Parry et al. (2012) implemented the Critical Incident Report (CIR) in a biochemistry and molecular biology laboratory course. Students were asked to list what they thought were the critical tasks necessary to complete the task objectives and to reflect on how well they performed these tasks. They were also asked to describe the learning that they took away from these incidents and how it would affect their future performance. A pre- and post-intervention questionnaire was used to compare perceptions of the value of reflection. The results showed that the CIR enhanced participants’ awareness of the importance of tutor feedback and of adopting alternative study strategies to improve their knowledge of the subject matter.
The problem with checklists or reflection guides is the lack of objective evidence: post-task reflection can be hindered by an inability to recall what was actually performed during the task (Dawes, 1999). Therefore, recent work in chemical education research has begun to explore the use of video as a means of assessment to evaluate skill attainment (Towns et al., 2015; Hensiek et al., 2016; Hennah and Seery, 2017). In the “digital badge” approach, learners prepare a self-demonstration video of laboratory skills, ranging from pipetting to titration, following a set of given instructions. These videos are reviewed by peers, instructors or both. If the desired competency level is attained, the student earns a badge as visible recognition of skill mastery, much as experts are recognized in professional fields. The authors reported improved student outcomes in terms of perceived confidence in technical skills and also in written tests (Towns et al., 2015; Hensiek et al., 2016; Hennah and Seery, 2017).
Veal et al. (2009) combined a self-reflection intervention with a self-demonstration video to measure the extent of students’ awareness of practical skills in a general chemistry course. The experimental group was video-recorded by instructors in class, and instructor feedback was provided after a series of student recordings. The participants were immediately prompted to think about how they went about completing the task, and the areas in which they did and did not do well. Subsequently, the participants completed a survey to elicit their perceptions of the utility of the videotape feedback, and were interviewed. The results showed that, firstly, the experimental group fared better in the laboratory and theory examinations. Secondly, the video reflection drew participants’ attention to the quality of their laboratory skills more readily, explicitly and critically. The participants unanimously supported self-critique using a video.
The inorganic and physical chemistry course spanned a duration of 17 weeks from April to August 2017, with three weeks of mid-term recess. The laboratory classes were timed on alternate odd weeks of the semester. The emphasis of the tasks was on the use of a pipette and the titration technique. Fig. 1 summarizes the lesson plan in the whole project cycle. In total, the participants completed the SA form over three laboratory sessions, and self-recordings were collected over four sessions. Re-rating the SA checklist while reviewing the video took place in Labs 3 and 4.
The first lesson began in Week 3 (second week of May 2017). In this first session, the participants began a simple pipette task and the author demonstrated the correct techniques. The participants took turns to practice the pipetting technique, with their partner recording the process using a mobile phone. The self-assessment checklist was not distributed in this session, as the intent was to acquaint the participants with the technique. The skill checklist was customized specifically for the chemistry tasks of titration and pipetting. It was developed by the author and colleagues in the teaching team, drawing also on the author's prior experience in assessing practical titration skills in a school-based assessment setting. The author also took the opportunity to advise participants on the focal points required in the video; these focal points were the critical actions corresponding to the skill checklist. The participants uploaded their video files to a password-protected Google Drive folder, with the password provided by the author.
In Week 5, the laboratory task involved acid–base titration to determine the unknown concentration of a basic solution. After a teacher demonstration, the first self-assessment was implemented, together with the video recording. The participants completed the skill inventory (A1 of the Appendix) on-site after completing the hands-on work. Laboratory lessons resumed in Week 13 (mid-July 2017) after the term recess. Since a long break had occurred, the author conducted a review of good titration techniques with a briefing and a short quiz. The laboratory task involved titration, and before the video group began any hands-on work, they were requested to review the Week 5 video and to re-assess their skills on the SA checklist. The same lesson plan was used in the Week 15 class.
The participants continued with the titration work in the last practical class in Week 17, but the self-video and SA were stopped. A video critique task was administered: it showed a student performing a titration experiment similar to the task performed in class. The participants were instructed to provide written comments individually, describing the areas in which the protagonist did well and suggesting areas for improvement. The other instrument, the SA perception survey, was also distributed for completion in this session (A2 of the Appendix).
Two other tutorial classes taught by the author served as controls. The laboratory sessions of these classes were taught by other instructors, and the participants were not exposed to the video or SA interventions. As part of a revision class, they were also tasked to critique the same titration video during a tutorial session. Table 1 summarizes the deployment of the interventions across the classes. Missing data in the survey and video critique were taken into account in the sample sizes.
|  | Experimental |  | Control |  |
| --- | --- | --- | --- | --- |
|  | ChE | BIO | ChE 1 | ChE 2 |
| Self-video | 10 | 13 | — | — |
| SA checklist | 26 | 27 | — | — |
| Survey | 25 | 27 | — | — |
| Video critique | 22 | 23 | 23 | 25 |
The second data source was scores on a video critique task, administered in the last laboratory session of the course. The same video task was implemented in the two control ChE classes in the same week. The video clip was selected from the pool of participant recordings, as unscripted and imperfect videos taken by novices provide rich learning experiences (Blazeck and Zewe, 2013); it was thus inevitable that some critical scenes were not captured. Participants’ written responses to the video critique exercise were graded on a 9-point rubric (see A3 in the Appendix), which captured all strengths and areas for improvement directly visible in, or partially inferable from, the video clip (Fig. 2).
For actions or mistakes clearly visible in the video, one point was awarded for correct identification. Owing to the lack of visual cues, some actions in the clip could be construed as either strengths or weaknesses: for example, whether the student actor had read the liquid meniscus at eye level, clamped the burette vertically, or tapped the pipette tip when dispensing the solution. In such cases, benefit of the doubt was given as partial credit (half a point) whichever way participants classified these skills. In some instances, participants also mentioned credible skills outside the SA checklist, such as using a white surface for color contrast; these responses were also awarded partial credit. If participants did not explicitly classify the actions as “actions done well” or “room for improvement”, semantic cues such as “did not”, “should have”, “instead of” and “should not” were used to infer the intended classification. No marks were awarded for overtly wrong classifications, that is, when participants classified a positive action as a deficit or vice versa, but no deductions were made for such instances either. Other attributes not captured in the rubric were coded on an a posteriori basis during the grading process to generate coding themes. These responses were coded using NVivo version 12.
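The scoring rules above (full credit for clearly visible attributes, benefit-of-doubt half credit for ambiguous ones, half credit for credible non-checklist observations, zero with no deduction for wrong classifications) can be sketched as a short scoring function. This is an illustrative sketch only; the grading in the study was done by hand, and the attribute names below are hypothetical.

```python
# Illustrative sketch of the rubric's scoring logic; attribute names are
# hypothetical, not the study's actual rubric items.

# Attributes clearly visible in the clip: full credit (1 point) if correctly classified
CLEAR_ATTRIBUTES = {"poor_dropwise_control", "swirl_flask", "removed_funnel"}
# Attributes lacking visual cues: benefit-of-doubt half credit either way
AMBIGUOUS_ATTRIBUTES = {"eye_level_reading", "burette_vertical", "tap_pipette_tip"}
# Credible observations outside the SA checklist: half credit
EXTRA_CREDIT = {"white_surface_contrast"}

def score_response(identified):
    """Score one participant's critique response.

    `identified` maps attribute name -> True if the participant classified it
    correctly (strength as strength, weakness as weakness), False otherwise.
    Wrong classifications earn 0 points but incur no deduction.
    """
    total = 0.0
    for attribute, correct in identified.items():
        if attribute in CLEAR_ATTRIBUTES and correct:
            total += 1.0
        elif attribute in AMBIGUOUS_ATTRIBUTES:
            total += 0.5  # credited regardless of classification direction
        elif attribute in EXTRA_CREDIT and correct:
            total += 0.5
    return total

score = score_response({
    "poor_dropwise_control": True,   # correctly flagged weakness: +1
    "eye_level_reading": False,      # ambiguous: +0.5 either way
    "swirl_flask": False,            # wrong classification: 0, no deduction
})
print(score)  # 1.5
```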
The chi-square statistic was used to find out whether any survey items produced a significant association with the video intervention or the course. If the assumption on the expected cell count was violated, likelihood ratio statistics were reported (Field, 2013, p. 724). All the chi-square values and the respective p-values reported are the unadjusted values, compared against a Bonferroni-adjusted critical p-value. Decisions on how and when to apply the Bonferroni correction can be quite subjective (Cabin and Mitchell, 2000); this study used an approach similar to that of Boucek et al. (2009) to correct for multiple testing. The Bonferroni adjustment was applied after the first-level chi-square tests, set at the 0.05 significance level. For items which passed this first-cut critical level, gamma (γ) coefficients were checked for effect sizes against a Bonferroni-corrected p-value. This adjusted alpha level is 0.05/6 (=0.0083), because there are six possible pairwise comparisons between the video and course groupings relevant to the research aims: video–no video; ChE–BIO; ChE video–ChE no video; BIO video–BIO no video; ChE video–BIO video; and ChE no video–BIO no video. Gamma (γ) coefficients were used because the data are ordinal–nominal in nature. Items which cleared both criteria are presented in this work. Using an item-level correction seeks a compromise between the Type 1 error rate and the power of the statistical analysis.
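The two-step screening described above can be sketched in Python with SciPy. The counts below are hypothetical (not the study's data), and since SciPy has no built-in Goodman–Kruskal gamma, it is computed directly from concordant and discordant pairs in the contingency table.

```python
# Sketch of the two-step screening: a first-level chi-square test of
# association, then a Goodman-Kruskal gamma effect size judged against the
# Bonferroni-adjusted alpha of 0.05/6 ~= 0.0083. Counts are hypothetical.
import numpy as np
from scipy.stats import chi2_contingency

def goodman_kruskal_gamma(table):
    """Gamma from a contingency table (rows = groups, columns = ordered
    response categories), via concordant/discordant pair counts."""
    t = np.asarray(table, dtype=float)
    concordant = discordant = 0.0
    r, c = t.shape
    for i in range(r):
        for j in range(c):
            concordant += t[i, j] * t[i + 1:, j + 1:].sum()
            discordant += t[i, j] * t[i + 1:, :j].sum()
    return (concordant - discordant) / (concordant + discordant)

# Hypothetical 2 (video / no video) x 3 (SA / A / D) table of counts
table = [[6, 16, 1],
         [8, 21, 0]]
chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2({dof}) = {chi2:.2f}, unadjusted p = {p:.3f}")
gamma = goodman_kruskal_gamma(table)
print(f"gamma = {gamma:.2f}")  # its p-value would be compared against 0.0083
```

Only items passing both the 0.05 first-cut chi-square test and the 0.0083 adjusted criterion on gamma would be reported, mirroring the item-level correction described above.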
| Item | Group | % SA | % A | % D | % SD | % NA | % SA + A |
| --- | --- | --- | --- | --- | --- | --- | --- |
| (1) Performing the self-assessment exercises has increased my confidence in the subject | SA + video | 26 (6) | 70 (16) | 4 (1) | — | — | 96 |
|  | SA only | 28 (8) | 72 (21) | — | — | — | 100 |
| (2) I had no difficulty identifying action steps to improve work | SA + video | 30 (7) | 48 (11) | 22 (5) | — | — | 78 |
|  | SA only | 17 (5) | 69 (20) | 10 (3) | 3 (1) | — | 86 |
| (3) I worked on action steps in the next session to improve work^a | SA + video | 44 (10) | 56 (13) | — | — | — | 100 |
|  | SA only | 31 (9) | 69 (20) | — | — | — | 100 |
| (4) I found the SA checklist useful | SA + video | 33 (7) | 52 (11) | 5 (1) | — | 10 (2) | 85 |
|  | SA only | 32 (9) | 68 (19) | — | — | — | 100 |
| (6) Doing the SA enables me to judge performance better | SA + video | 35 (8) | 65 (15) | — | — | — | 100 |
|  | SA only | 31 (9) | 66 (19) | 3 (1) | — | — | 97 |
| (7) I am able to compare the quality of my work against standards/criteria | SA + video | 39 (9) | 48 (11) | 13 (3) | — | — | 87 |
|  | SA only | 24 (7) | 69 (20) | 7 (3) | — | — | 93 |
| (8) SA enables me to improve on my learning in areas I am not so good at | SA + video | 39 (9) | 61 (14) | — | — | — | 100 |
|  | SA only | 38 (11) | 59 (17) | 3 (1) | — | — | 97 |
| (9) I become better aware about my learning through doing the SA | SA + video | 39 (9) | 57 (13) | 4 (1) | — | — | 96 |
|  | SA only | 31 (9) | 66 (19) | 3 (1) | — | — | 97 |
| (10) The SA helps me assess my strengths and weaknesses accurately | SA + video | 35 (8) | 56 (13) | 9 (2) | — | — | 91 |
|  | SA only | 28 (8) | 65 (19) | 7 (2) | — | — | 93 |
| (11) Doing the SA is a waste of time | SA + video | 9 (2) | 14 (3) | 32 (7) | 36 (8) | 9 (2) | 23 |
|  | SA only | 4 (1) | 24 (7) | 48 (14) | 14 (4) | 10 (3) | 28 |
| (12) I do the SA with the intention of improving my work^c | SA + video | 30 (7) | 57 (13) | 13 (3) | — | — | 87 |
|  | SA only | 21 (6) | 79 (23) | — | — | — | 100 |
| (13) The school should continue implementing SA with subjects for me^b | SA + video | 35 (8) | 52 (12) | 13 (3) | — | — | 87 |
|  | SA only | 14 (4) | 76 (22) | 10 (3) | — | — | 90 |
| (14) I will continue using the SA in my learning | SA + video | 26 (6) | 65 (15) | 9 (2) | — | — | 91 |
|  | SA only | 14 (4) | 82 (23) | 4 (1) | — | — | 96 |
| (15) SA will better prepare me for the world of work | SA + video | 30 (7) | 61 (14) | 4 (1) | 4 (1) | — | 91 |
|  | SA only | 24 (7) | 66 (19) | 10 (3) | — | — | 90 |
| (16) SA has improved my lab skills^b,c | SA + video | 52 (12) | 44 (10) | 4 (1) | — | — | 96 |
|  | SA only | 35 (10) | 65 (19) | — | — | — | 100 |

Item 5 is a free-response question. ^a Significant association with the course (see Table 3). ^b Significant association with the video groups in BIO only (see Table 4). ^c Significant association with the course in the video group only (see Table 5).
Item 3 (“I worked on action steps in the next session to improve my work”) produced a strong, significant association with the course, χ2(1) = 8.76, p = 0.003 (Table 3). All the participants, regardless of video condition, agreed that they took action steps to close skill deficits in the next class. The gamma coefficient (γ) of −0.74 was significant when compared to the Bonferroni-corrected critical value of 0.0083 (unadjusted p = 0.001). The large, negative effect size implied that BIO participants were more inclined to commit to improving their skills. This is seen in the proportions of strongly agree and agree responses, which were more evenly distributed in the BIO class (56% and 44%), whereas the ChE course produced a more skewed profile (16% strongly agree, 84% agree).
| Item | Course | % SA | % A | % D | % SD | % NA | % SA + A |
| --- | --- | --- | --- | --- | --- | --- | --- |
| (3) I worked on action steps in the next session to improve work | ChE | 16 (4) | 84 (21) | — | — | — | 100 |
|  | BIO | 56 (15) | 44 (12) | — | — | — | 100 |

χ2(1) = 8.76, unadjusted p = 0.003; γ = −0.74, unadjusted p = 0.001 (<0.0083 adjusted critical level).
In the ChE class, no items had a significant association with the video intervention. Table 4 shows the response distributions for the two items that differed significantly between the video groups in the BIO class: item 13 (“the school should continue to implement SA”, likelihood ratio χ2(2) = 7.59, p = 0.02, unadjusted) and item 16 (“SA has improved my lab skills”, χ2(1) = 6.31, p = 0.01, unadjusted). For item 13, the γ coefficient was 0.86 (p = 0.002), implying that BIO video participants strongly supported the future use of SA. The profile showed that the proportion of video participants who strongly agreed to continue SA implementation was almost seven times higher (SA + video = 46%, SA-only = 7%).
| Item | Group | % SA | % A | % D | % SA + A |
| --- | --- | --- | --- | --- | --- |
| (13) The school should continue implementing SA with subjects for me | BIO, SA + video | 46 (6) | 54 (7) | — | 100 |
|  | BIO, SA only | 7 (1) | 79 (11) | 14 (2) | 86 |
| (16) SA has improved my lab skills | BIO, SA + video | 77 (10) | 23 (3) | — | 100 |
|  | BIO, SA only | 29 (4) | 71 (10) | — | 100 |

Item 13: likelihood ratio χ2(2) = 7.59, unadjusted p = 0.02; γ = 0.86, unadjusted p = 0.002 (<0.0083 adjusted critical level). Item 16: χ2(1) = 6.31, unadjusted p = 0.01; γ = 0.79, unadjusted p = 0.004 (<0.0083 adjusted critical level).
For item 16, a large and significant effect size was seen (γ = 0.79, p = 0.004). The response distribution showed that the percentage of video participants who thought that the SA had improved their hands-on skills was about three times higher (SA-and-video = 77%, SA-only = 29%). Similar to item 13, this observation is consistent with the magnitude and direction of the effect size.
Amongst the video participants, item 12 (“I do the SA with the intention of improving my work”, likelihood ratio χ2(2) = 7.81, unadjusted p = 0.02) and item 16 (likelihood ratio χ2(2) = 8.46, unadjusted p = 0.015) were significantly associated with the course (Table 5). Item 12 produced a large γ coefficient of −0.83 (p = 0.002), and item 16 a γ of −0.87 (p = 0.001). These results suggested that BIO video participants held more favourable perceptions of SA on these items than their video peers in the ChE course. For both items, none of the BIO video participants responded negatively. No items were associated with the course in the non-video group.
| Item | Group | % SA | % A | % D | % SA + A |
| --- | --- | --- | --- | --- | --- |
| (12) I do the SA with the intention of improving my work | ChE, SA + video | 10 (1) | 60 (6) | 30 (3) | 70 |
|  | BIO, SA + video | 46 (6) | 54 (7) | — | 100 |
| (16) SA has improved my lab skills | ChE, SA + video | 20 (2) | 70 (7) | 10 (1) | 90 |
|  | BIO, SA + video | 77 (10) | 23 (3) | — | 100 |

Item 12: likelihood ratio χ2(2) = 7.81, unadjusted p = 0.02; γ = −0.83, unadjusted p = 0.002 (<0.0083 adjusted critical level). Item 16: likelihood ratio χ2(2) = 8.46, unadjusted p = 0.015; γ = −0.87, unadjusted p = 0.001 (<0.0083 adjusted critical level).
|  | SA + video | SA only | Control |
| --- | --- | --- | --- |
| Md | 2.50 | 2.00 | 1.00 |
| Mean rank | 67.0 | 53.6 | 35.5 |
| M | 2.82 | 2.10 | 1.20 |
| SD | 1.43 | 1.38 | 0.82 |
| n | 19 | 26 | 48 |
A significant difference in critique marks was obtained across the three groups (χ2(2, N = 93) = 21.14, p < 0.001). The SA + video participants had the highest median score (Md = 2.50), followed by the SA-only group (Md = 2.00) and the control group (Md = 1.00). Pairwise comparisons revealed a strong, significant difference between the SA + video group and the control group (p < 0.001). The difference between the SA-only and control groups was marginal after correcting for multiple pairwise comparisons (p = 0.016). The difference between the SA-only and SA + video groups was not significant (p > 0.05).
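The analysis pattern reported here (mean ranks, a chi-square-distributed statistic across three independent groups, corrected pairwise follow-ups) is consistent with a Kruskal–Wallis test followed by pairwise comparisons. A minimal sketch with SciPy, on synthetic critique scores rather than the study's data:

```python
# Sketch of a three-group rank-based comparison with Bonferroni-corrected
# pairwise follow-ups, using synthetic scores -- not the study's data.
from itertools import combinations
from scipy.stats import kruskal, mannwhitneyu

scores = {
    "SA + video": [4.0, 3.5, 3.0, 2.5, 2.5, 2.0, 1.5],
    "SA only":    [3.0, 2.5, 2.0, 2.0, 1.5, 1.0, 1.0],
    "Control":    [2.0, 1.5, 1.0, 1.0, 0.5, 0.5, 0.0],
}

# Omnibus test across the three independent groups
h, p = kruskal(*scores.values())
print(f"H(2) = {h:.2f}, p = {p:.4f}")

# Pairwise follow-ups, Bonferroni-corrected for 3 comparisons
alpha = 0.05 / 3
for a, b in combinations(scores, 2):
    _, p_pair = mannwhitneyu(scores[a], scores[b], alternative="two-sided")
    flag = "significant" if p_pair < alpha else "n.s."
    print(f"{a} vs {b}: p = {p_pair:.4f} ({flag})")
```

The Bonferroni-adjusted alpha of 0.05/3 here mirrors why a pairwise p of 0.016 reads as only marginal in the text above.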
The profile of rubric-graded attributes is shown in Table 7, arranged in descending order of frequency. The incidence of participants who identified each positive and negative attribute was computed against the sample size per group. The results showed that the top (correctly) identified weakness was poor dropwise control, followed by swirling of the flask contents and removal of the funnel, both positive demonstrations. These three skills are core skills emphasized in the skill inventory. About 70% of the SA + video participants correctly identified poor dropwise control as a weakness. For “swirl flask contents” and “removed the funnel”, close to half of the SA + video participants made the correct judgment. About half of the SA-only participants cited poor dropwise control as a skill gap to close. In the control group, the attribute that attracted the highest incidence was “swirl flask contents”.
Attribute | Total n | % SA + video (n = 19) | % SA only (n = 26) | % Control (n = 48)
---|---|---|---|---
(−) Poor drop-by-drop control* | 33 | 68 | 46 | 17 |
(+) Swirl flask contents* | 32 | 42 | 27 | 35 |
(+) Removed the funnel* | 25 | 47 | 35 | 15 |
(+) Use a white tile | 22 | 21 | 15 | 29 |
(+) Read at eye-level* | 19 | 16 | 4 | 2 |
(+) Wash the burette tip* | 16 | 21 | 39 | 4 |
(+) Use a funnel to add titrant | 12 | 21 | 12 | 10 |
(−) Air bubbles in the burette* | 12 | 11 | 15 | 13 |
(−) Failed to transfer pipette waste to a beaker | 8 | 42 | 0 | 0 |
(+) Tap the pipette against a conical flask* | 5 | 16 | 4 | 2 |
(−) Should lift the pipette to adjust the meniscus* | 5 | 26 | 0 | 0 |
(−) Should read at eye-level* | 4 | 0 | 8 | 4 |
(−) Didn’t tap the pipette against the conical flask* | 2 | 5 | 0 | 2 |
(+) Clamp the burette vertically* | 1 | 0 | 0 | 2 |
A total of 118 non-rubric responses were coded. Table 8 presents the data for these attributes, arranged in descending order of incidence. The general and wrong classification categories were further broken down into sub-attributes to provide details of participants’ responses. The general category captured loose and ambiguous comments, such as “performed titration well” or “good pipetting skills”. The wrong classification category included responses that identified a positive (negative) skill in the video but classified it as a deficit (strength).
Attribute | Total (n = 118) | % SA + video (n = 19) | % SA only (n = 26) | % Control (n = 48)
---|---|---|---|---
Hand control | 32 | 16 | 12 | 54 |
Wrong classification | 22 | |||
• Added dropwise | 16 | 21 | 12 | 19 |
• Did not wash the burette tip | 3 | 5 | 4 | 2 |
• No air bubbles in the burette tip | 3 | 16 | 0 | 0 |
Bubbles in the pipette | 18 | 21 | 46 | 4 |
General | 17 | |||
• Followed procedures correctly | 2 | 0 | 4 | 2 |
• Gentle with rinsing | 1 | 0 | 0 | 2 |
• Good pipetting skills | 6 | 0 | 4 | 10 |
• Good titration skills | 4 | 0 | 8 | 4 |
• Incorrect pipette method | 1 | 0 | 4 | 0 |
• Precise and accurate | 2 | 0 | 0 | 4 |
• Remove the remaining solution in the pipette | 1 | 0 | 4 | 0 |
Wrong color change | 8 | 0 | 19 | 6 |
Did not wash glassware | 7 | 5 | 19 | 2 |
Speed (too fast or too slow) | 5 | 0 | 0 | 10 |
Should not add deionized water | 4 | 0 | 0 | 8 |
Wear personal protective equipment | 3 | 0 | 8 | 2 |
Should prepare crude reading first | 2 | 0 | 8 | 0 |
The results showed that a large proportion of control group participants considered hand manoeuvring a skill deficit, with comments such as “keep the left hand on the stopcock to have more control” or “should use the non-dominant hand as well”. None of the SA + video participants gave general comments such as “good titration skills” or “incorrect pipette technique”, responses which did not identify any specific deficiencies. However, about 16 to 21% of these participants incorrectly judged that the burette tip had no air bubbles and that drop-wise control was performed well; both were, in fact, skills to be addressed. A small proportion (5%) also failed to notice the positive demonstration of rinsing the burette tip with deionized water during the titration. The “bubbles in the pipette” was a non-issue, as the bubbles disappeared in the process of adjusting the meniscus, yet approximately half of the SA-only participants (46%) and 21% of the SA + video participants thought it was an issue. In contrast, the misjudgment that the burette tip had no air bubbles was not made at all by the SA-only and control groups (0% each).
Performance on the video critique task supported the hypothesis that implementing SA alongside a self-video review elevates learners’ ability to identify pitfalls and strengths in others, compared to a control group that experienced neither intervention. This finding is consistent with that of Veal et al. (2009), who found that using video feedback to facilitate self-reflection on skills improved student outcomes. The SA-only group also fared better than the control, although marginally.
The nature of the responses in the critique task showed some interesting trends. Control group participants tended to focus on surface attributes that were more readily observed from the video. These attributes were mainly the hand actions of the student actor, such as flask swirling and use of a white surface, or trivial aspects such as the speed of titration. These participants did not attend to fine-grained deficits such as bubbles in the burette and poor drop-wise control in titrant dispensing, which were the core performance standards. At the other end, SA + video participants were more skilful in judging the micro aspects of the demonstration, and did not give unqualified statements on skill quality.
The second research question was to explore whether participants who did both SA and self-video would perform differently, if not better, on the critique task. The data suggested that the self-video review did not produce any significant gains over and above what SA alone could achieve. Although the SA + video group fared the strongest, the difference from the SA-only group was no better than a chance occurrence. As long as skill acquisition is scaffolded by explicit performance standards, a video review appears to be a “good to have”. These results and the qualitative analysis of task responses lend credence to the merits of formative assessment and SA (Boud, 2003; Andrade and Heritage, 2017). On the other hand, the absence of a facilitative effect of the video should not be seen as a contradiction of the digital badge pedagogy reported in various studies (Towns et al., 2015; Hensiek et al., 2016; Hennah and Seery, 2017). In the digital badge literature, the video is the dominant learning outcome and product: the goal is to work towards a perfect video for grading purposes, after iterative corrections. The current study used video as part of the learning process.
The third aim was to investigate how differently the video and non-video groups would perceive the value of SA. 80% or more of the responses indicated favourable perceptions of SA, but none of the items differed by video intervention. These support levels were fairly consistent with those reported in previous works (Parry et al., 2012; Zhang et al., unpublished work, 2013). One possible reason why the self-video resource failed to manifest as a differentiating factor in the survey and the critique task could be that the participants perceived the SA checklist as separate from self-monitoring with the video. They might thus have failed to integrate the two resources effectively, negating any positive outcomes on attitudes and critique marks that might otherwise be seen. Another possible reason could be the modality of the critique task, which assessed participants’ ability to identify strengths and weaknesses in others (on a video), as opposed to performing the physical titration task oneself. The latter emphasized psychomotor performance, while the video critique task foregrounded cognition rather than embodied performance. In addition, regardless of whether there was a video review or not, all the participants already made conscious efforts to improve their future skills.
Another outcome that pointed to the SA–video disconnect was the low critique scores obtained by the SA + video group. On average, this group scored about 31%, indicating that core skills were still omitted. This, however, might also be due to an inability to recall the skill set during the task, or to the fact that particular skills were not readily observable in the video. Several options could be considered to improve the integration of the video and the SA checklist. For one, the video review should be undertaken immediately after practical work, instead of waiting until the next class. The participants should also be asked to explicitly write down action steps to address skill gaps and to check their progress against these plans. The mechanics of the recording should also be given due focus, giving participants more explicit initial guidance in capturing good videos that are meaningful for learning.
Two items showed a polarization between the BIO video and non-video participants in favour of the former. The first was whether SA should continue to be implemented in their course (item 13). About 46% of the video participants strongly supported this statement, while only 7% of the non-video participants did. In fact, the video group viewed future SA implementation very positively: none of them rejected this item, whereas about 14% of the non-video group did not support the continuation of SA. The second item showing significant differences between the BIO video and non-video participants was whether SA had improved laboratory skills (item 16). Regardless of whether a video was used, all the participants agreed that SA had improved their laboratory skills; however, the proportion of video participants who strongly agreed was almost three times that of the non-video group (77% versus 29%). The strong positive correlations suggested that the BIO students who used both SA and video were more supportive of self-monitoring. BIO video students were also more likely than their ChE video peers to agree that they used SA to improve their work (item 12). The BIO video participants also strongly agreed that SA had helped them improve laboratory skills (item 16), the one item that differed not only from their non-video classmates but also from the corresponding ChE video participants. On another note, item 3, which elicited participants’ commitment to work on action steps for skill improvement, was unexpectedly associated with the course, attracting more positive perceptions from the BIO participants. All in all, the results seem to point to some course-based differences in how participants evaluate the merits of SA with or without a video review. It is unclear why such differences arise; one possible reason could be student differences, such as interest and motivation levels.
While it may be sufficient to “persuade students of the interest and value of the exercise” (Boud, 2003, p. 183) for the BIO participants, a different incentive approach might be more suitable for the ChE course. This warrants deeper research.
In addition to the need for tighter SA and video integration, this study has three notable statistical limitations. Firstly, the sample size is small, so the results should be treated with caution. Secondly, at the time of the study and the reporting of results, the SA survey had not been validated, so the validity of the instrument remains unchecked. Increasing the sample size would not only improve the robustness of the results, but would also allow more in-depth quantitative techniques, such as factor analysis, to probe the underlying structure of the SA survey. It is hoped that this study will be replicated and will attract more research attention to SA and formative assessment pedagogy in STEM laboratory courses.
Thirdly, as mentioned at the outset, decisions on applying the Bonferroni correction can be tricky (Cabin and Mitchell, 2000). It is arguable whether the corrected p-value should be applied at the scale level (15 items) or, as in the current analysis, at the item level (a second-level correction) for significant items only. It is acknowledged that the results would differ depending on the criterion used. If the alpha level were set at 0.05/15 (= 0.003), the chi-square tests would show a marginal but significant relationship on one item only (again, item 3, worked on action steps to improve learning); the effect size of item 3 would still clear the 0.003 critical value. However, perception differences between video and non-video participants within the BIO class would go undetected, even though the effect sizes of some items would clear the α = 0.003 critical level. A strict Bonferroni correction thus risks a loss of information on how learner characteristics might influence SA perceptions. This dilemma is not easy to resolve, and further research replicating the current study is needed.
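The scale-level correction discussed above is simple arithmetic, and a brief sketch may make the trade-off concrete. The alpha level and item count follow the text (0.05 across 15 items), but the per-item p-values below are purely hypothetical illustrations, not the study’s actual data:

```python
# Bonferroni correction: divide the family-wise alpha by the number of tests.
# Scale-level correction for a 15-item survey: alpha = 0.05 / 15 ≈ 0.0033,
# matching the 0.003 threshold discussed above.
alpha = 0.05
n_items = 15
corrected_alpha = alpha / n_items

# Hypothetical per-item chi-square p-values (illustrative only).
p_values = {"item_3": 0.002, "item_13": 0.01, "item_16": 0.02}

# Only items below the corrected threshold remain significant.
significant = {item: p for item, p in p_values.items() if p < corrected_alpha}

print(round(corrected_alpha, 5))  # 0.00333
print(significant)                # {'item_3': 0.002}
```

Under this stricter scale-level criterion, items that clear an uncorrected α = 0.05 (here, the hypothetical items 13 and 16) are filtered out, which is exactly the loss-of-information trade-off noted in the text.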
The results suggest that critique task performance and perceptions towards SA did not differ significantly between participants who carried out SA only and those who had the additional video review. However, the SA + video participants did perform better than the control group in critiquing the features of a laboratory skill video demonstration. They were also more critical judges of performance standards, noticing finer details and providing more targeted critiques.
In conclusion, the current study provides some preliminary learning points on using a video review to scaffold SA. Together with a skill inventory, the video tool also served as an explicit input for formative learning, although both resources should be more tightly intertwined in the lesson design. This would provide a more visible scaffold to guide learners towards next-step improvement and extract the full value of the video resource in the overall learning experience.
A1. SA checklist of titration skills; each item is rated Yes/No.
Use of a pipette
1. Lift the pipette from the liquid surface to adjust the meniscus to the graduation mark
2. Gently tap or rotate the pipette tip against the base of glassware
Burette set-up
3. Clamp the burette vertically on the retort stand
4. Remove air bubbles from the tip of the burette
5. Remove the filter funnel from the burette
6. Read the meniscus at eye level
Endpoint observation
7. Continuously swirl flask contents
8. Towards the end-point, add titrant drop-wise
9. Towards the end-point, completely transfer the titrant by rinsing the walls of the conical flask and burette tip
10. Obtain correct colour change at the end-point
Data-recording
11. Record burette readings to correct precision level (2 decimal places)
A2. SA survey items (video version), rated on a 5-point scale: Strongly Agree, Agree, Disagree, Strongly Disagree, Not Applicable. Items 5, 17, 18, 19 and 21 are free-response items. The non-video version excludes items 20 and 21.
1. Performing the Self-Assessment exercises has increased my confidence in the subject.
2. I had NO difficulty in identifying the action steps to improve my work.
3. I worked on the action steps in the next session to improve my work.
4. I found the Self-Assessment checklist useful.
5. Please explain your response in Q4 in terms of how the checklist was helpful or not helpful.
6. Doing the Self-Assessment enables me to judge my performance better.
7. I am able to compare the quality of my work against the standards or assessment criteria.
8. The Self-Assessment enables me to improve on my learning in areas where I am not so good at.
9. I become better aware about my learning through doing the Self-Assessment.
10. The Self-Assessment helps me assess my strengths and weaknesses accurately.
11. Doing the Self-Assessment is a waste of time.
12. I do the Self-Assessment with the intention of improving my work.
13. The school should continue implementing Self-Assessment with subjects for me.
14. I will continue using Self-Assessment in my learning.
15. Self-Assessment will better prepare me for the world of work.
16. Self-Assessment has improved my lab skills.
17. With reference to your response to Q16, name these lab skills.
18. In your opinion, what are the steps involved in Self-Assessment of your work?
19. Share with us your overall experience in self-assessing your work for improvement.
20. The videos I took were helpful in improving my lab skills.
21. With reference to Q20, in what ways were the videos useful?
A3. Marking rubric for the video critique task. Maximum possible total = 9 marks.
Skills demonstrated (1 mark each):
• Burette clamped vertically
• Removed the funnel
• Swirled flask contents continuously
• Washed the burette tip with a wash bottle
Partial credit:
– Used the funnel for introducing titrant
– Used a white surface
– Tapped the pipette against the conical flask wall
Skills not demonstrated or to be improved (1 mark each):
• Failed to clear air bubbles from the burette tip at the start of the titration
• Poor drop-by-drop control
Partial credit:
– Should lift the pipette to adjust the liquid level
– Should read the meniscus at eye level
– Failed to transfer pipette waste into a waste beaker
This journal is © The Royal Society of Chemistry 2020