Keith S. Taber
Faculty of Education, University of Cambridge, UK. E-mail: kst24@cam.ac.uk
That of course makes an assumption about how the word ‘and’ should be read in ‘…Research and Practice’. CERP is not a journal which (a) reports research, and which also (b) reports practice. Rather, it is a research journal. This is why the criteria that reviewers are asked to consider (Taber, 2012a) relate to what makes an article worth including in a research journal. Those criteria do have to be interpreted in relation to different manuscripts, and indeed some of them are not relevant to all submissions, as the journal publishes different types of articles. However, the journal does not classify articles as ‘research’ or ‘practice’ – but rather as ‘papers’, ‘reviews’ and ‘perspectives’. These classifications reflect different kinds of research. Papers are empirical studies reporting research involving the analysis of data collected from some chemistry teaching and learning context(s). Analyses of textbooks, curricula, or assessment materials would be considered here – even though the data are the documents themselves, rather than their application in specific teaching and learning episodes and contexts.
A review, by comparison, explores a specified corpus of existing literature in, or relating to, the field of chemistry education. The ‘data’ is the existing research literature. A perspective does not usually have data to analyse in the same sense, but is still scholarly – being research that is more in the tradition of philosophy than empirical science. Substituting alternative, if now anachronistic, terms that highlight the historical link between these activities: a perspective presents research more in the tradition of metaphysics than natural philosophy.
I take colleagues’ references to ‘practice papers’ as meaning articles that report on aspects of the authors’ own practices as chemistry educators. Such articles are certainly welcome under the ‘paper’ category – as long as they meet the journal's criteria for what counts as research – so they may be ‘practice papers’, but, as with all papers in CERP, they must also fit in a research journal. I am aware that this could be misconstrued as suggesting the journal claims to be open to practice papers but really only wants to publish research papers. However, that reading relies on the research–practice distinction I rejected earlier. Rather, if there is an opposite to ‘practice’ papers in CERP, it is not ‘research’ papers but something else. In this editorial, I explore what makes a ‘practice’ paper potentially publishable, in part by discussing some hypothetical examples of practice papers that CERP would not publish, and which I hope readers of this editorial would agree do not deserve to be published in a research journal.
I welcome these kinds of developments, and they deserve to be evaluated and communicated to other practitioners who can make use of the activities. However, reports of such developments are not research unless they both contextualise the innovations in the CER literature, to show they respond to a genuine educational need or issue, and have been systematically evaluated in practice. Simply sharing teaching ideas and activities, important as it is, does not constitute research. Furthermore, the peer review system in chemistry education is based on experts critiquing authors’ scholarly accounts of their work. The kinds of practical activities submitted for publication would need to be evaluated in the laboratory by reviewers (an expensive and potentially time-consuming exercise) and subjected to health and safety review. CERP asks reviewers to critique arguments and analysis, not to go and set up laboratory activities and evaluate them.
However, to make a case for the effectiveness of any innovation, there has to be systematic evaluation. That means data collection has to be planned carefully, using appropriate (valid, reliable) instrumentation, with careful (and explicitly explained) analysis that allows robust conclusions to be drawn. This almost certainly means thinking about the design of the study before introducing the innovation (or at least before employing it with the cohort or group participating in the evaluation). Comments that faculty and students seem to like an activity, or seem to think it was a good idea, offer weak evidence in support of something intended for publication in an international research journal.
Even evaluations that use student feedback of the kind often collected for internal course evaluation may not be convincing to referees, or to readers considering whether to modify their own practice on the basis of a report of someone else's practice. We all know that simple course evaluations (Likert-type items and invitations for open-ended comments) have severe limitations. They need to be interpreted in relation to the responses from previous cohorts and an insider's knowledge of the context, and used to complement the practitioner's own direct experience of being there. With all that context, knowing that 83% rated some activity ‘good’ or ‘very good’, and that 69% agreed they would like more courses to use similar activities, can be interpreted meaningfully. However, from the outsider's perspective, lacking that context, such descriptive statistics have limited force.
Action research is indeed an alternative, but perfectly respectable, form of research; however, that framing glosses over two issues. The first is that just because action research is practitioner research, it does not follow that all practitioner research automatically counts as action research. It is a necessary condition for action research that it responds to a perceived issue or problem in the researcher's own practice (Whitehead, 1989) – but that is not a sufficient condition. As one example, action research is iterative, yet many practitioner research studies claiming to use this approach have not taken the work through the cycles of activity and evaluation that are an essential feature of this type of research.
The second issue is what action research is meant to be an alternative to: I would suggest action research is an alternative to academic research (Taber, 2013a). Academic research is primarily theory-directed, whereas action research is context-directed. To say that academic research is theory-directed is not to suggest it will not be relevant to practice. Rather, it means that such research seeks to develop generalisable knowledge that is abstracted from, and can be considered independent of, any specific context, and so may be applicable across contexts. This is a large part of what makes something publishable research: that it is of relevance beyond the specific context where it was undertaken.
Education is rather different to chemistry here. If copper carbonate thermally decomposes in your laboratory under certain conditions, then I can expect it to do the same in mine if I use the details in your research report to replicate the conditions. However, a lesson plan that is effective with your class of 14-year-olds, a laboratory course that is effective for your undergraduates, an approach to doctoral supervision that works with your research students… may not be assumed to work with all classes of 14-year-olds, all groups of undergraduates, all research students.
In part this is the problem of replication. Your 14-year-olds are taught in a class of 25, with 70 minute lessons, in a well-equipped laboratory; a teacher working with a class of 40, having 45 minute lessons in a makeshift teaching space, is not in a position to change the conditions they teach under and replicate your work. However, even if we could specify and control all such matters as might have an effect (a morning lesson in a room with large windows facing south on a sunny day following a win the previous evening for the national football team…), this still would not do. My copper carbonate (assuming we source high quality materials) is much like yours. This class of 14-year-olds is not so like that one, and this research student is quite different to yours.
However, despite this, academic research in chemistry education, as in chemistry, seeks knowledge which potentially rises above the specific and makes general claims. CER does not aspire to the knowledge that this class learnt about the nature of chemistry as a science through a particular type of enquiry activity, but rather that this type of activity can support learning (more generally) of the nature of chemistry as a science. There is clearly a major challenge here: if every class is different, if every learner is different, if classroom contexts matter, if institutional norms matter, if cultural and language contexts make a difference, then how can we possibly develop generalisable knowledge? That challenge does not, however, undermine the aspiration to produce knowledge that transcends the specific context in which it was developed.
Yet action research (when it is really undertaken as action research) is context-directed. It is about solving this problem we have here, making this situation better, improving professional practice in the here and now. It is more about finding solutions than understanding and theorising them. It is not atheoretical, but it is not about developing generalisable theory. Genuine action research is an iterative process in which the researcher does not wait to collect a full data set before acting on the information already collected – it is a moving target that is hard to document, and where documentation is secondary to the action itself (Tripp, 2005). Evaluation is critical to the process, but it always responds to the balance of evidence available now, to move things on, rather than attempting to build up a convincing case suitable for a research report. Action research provides new knowledge, but often this is largely personal, context-bound knowledge, much of which may remain tacit rather than being suitable for public representation in formal terms. Of course such personal knowledge can be highly fallible, but that does not negate the value of action research: because the process is cyclic and iterative, further action can always offer new evidence that can lead to further changes in practice. That is a great strength of action research: it supports an evidence-based, ongoing personal research programme of professional development. These characteristics, which make action research especially suitable for context-directed practitioner research, also make it less suited as a strategy for academic research.
It is certainly possible to square this circle, to undertake action research in a way that (also) produces what can be judged as more generalisable, publishable knowledge (Philip and Taber, 2015). However, the inherent limitations of action research are perhaps part of the reason for the increasing popularity of research based on lesson study (Allen et al., 2004) and design research (Ruthven et al., 2009), which adopt a similar iterative process but aim to build knowledge intended to be more generalisable. When practitioner research follows action research principles, it could be characterised as often resembling a series of small but incomplete experiments, each of which is (with good reason) abandoned part-way through in order to move on to the next.
Of course it is seldom possible to study education under laboratory conditions, both for practical reasons and because the phenomena of interest – at core, processes of classroom teaching and learning – become too decontextualised if excised from their natural setting. Such studies certainly do occur in relation to learning, but are usually seen as part of psychology rather than educational research – and represent more basic research that can inform, but does not replace, applied studies in education.
Experimental work in education usually relies on being able to recruit large samples representing the diversity of populations of interest (across a range of schools, for example) and randomly assigning groups to conditions. Even then, such work is subject to problems such as expectancy effects, novelty effects, and the way teaching effectiveness normally improves over several cycles of working with a new curriculum, teaching approach, or resource (making suspect the evaluation of novel innovations that teachers have only just encountered and been trained in). Large, diverse samples offer some protection against specific interactions between particular treatments and particular teacher skills or styles, or even between particular teachers and classes, confounding the comparison being sought.
Sometimes when an intervention is tested in this way, the outcome is that the spreads of results within the ‘experimental’ and ‘control’ conditions are far greater than any difference between the mean outcomes of the two conditions (Taber et al., 2016) – reminding us how educational outcomes depend so much on the particulars of groups of students, teachers, school context, etc., and how these interact with particular ‘treatments’ and each other.
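The force of this point can be illustrated with a small numerical sketch (a hypothetical example using invented scores, not data from the study cited above): a standardised effect size such as Cohen's d expresses the difference between condition means relative to the pooled within-condition spread, so when that spread is large, even a difference of several percentage points corresponds to a small effect.

```python
# Hypothetical illustration with invented data: the condition means differ
# by 4 percentage points while the within-condition SD is 15, so the
# population effect size is small (d = 4/15, approximately 0.27); the
# sample estimate printed below will vary around that value.
import numpy as np

rng = np.random.default_rng(seed=1)

# Invented post-test scores (percentages) for two intact classes of 30.
experimental = rng.normal(loc=62, scale=15, size=30)
control = rng.normal(loc=58, scale=15, size=30)

mean_diff = experimental.mean() - control.mean()
pooled_sd = np.sqrt((experimental.var(ddof=1) + control.var(ddof=1)) / 2)
cohens_d = mean_diff / pooled_sd

print(f"difference in means: {mean_diff:.1f} percentage points")
print(f"pooled within-group SD: {pooled_sd:.1f}")
print(f"Cohen's d: {cohens_d:.2f}")
```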
Experimental work carried out by practitioners with their own classes is therefore especially problematic, as it is usually limited to a very small number of classes – sometimes simply comparing two parallel classes in a school or university department. Usually it is not possible to assign individual students to conditions, and intact groups have to be used. The ethical considerations become more severe when breaking up and reorganising classes purely for research convenience – which in any case potentially adds a further intervention compared with the naturalistic state of settled classes.
A common way to test for the comparability of classes that are to experience different treatments is to check for differences on what are considered the most relevant pre-tests. Where tests of significance suggest that no statistically significant difference exists before the experiment, this may be taken as a sufficient test of initial equivalence. However, failing to find a difference so large that it would be expected by chance less often than once in twenty occasions (p < 0.05 being the usual threshold) – that is, finding only a difference of a size that would arise by chance more often than one time in twenty – is far from saying there are no differences that might matter for the experiment. The lack of a statistically unlikely difference is an insufficient criterion for assuming that any difference which does exist is irrelevant to the learning that will follow.
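The inferential gap here can be made concrete with a short simulation (an invented example, not an analysis of any study discussed in this editorial): if two classes of 25 genuinely differ by 0.4 standard deviations on a pre-test, a conventional two-sample t-test has low statistical power at this sample size, so most runs will report no ‘significant’ difference even though the groups are not equivalent. Demonstrating equivalence would require a dedicated procedure, such as two one-sided tests (TOST), rather than a failure to reject the null hypothesis.

```python
# Invented simulation: two classes of 25 whose population means genuinely
# differ by 0.4 SD on a pre-test. At this sample size the power of a
# two-sample t-test for d = 0.4 at alpha = 0.05 is only about 0.28, so
# roughly seven in ten runs find no 'significant' difference.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=7)
n_sims, alpha = 10_000, 0.05

non_significant = 0
for _ in range(n_sims):
    class_a = rng.normal(loc=50, scale=10, size=25)
    class_b = rng.normal(loc=54, scale=10, size=25)  # really 0.4 SD higher
    _, p_value = stats.ttest_ind(class_a, class_b)
    if p_value >= alpha:
        non_significant += 1

print(f"{non_significant / n_sims:.0%} of simulated pre-tests show no "
      f"'significant' difference despite the built-in 0.4 SD gap")
```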
Even when the comparability of the groups prior to treatment is considered sound, any difference after treatment is due to the interaction treatment × teacher × students (and possibly × schedule/accommodation… as well), and only directly tells us how well one teacher taught one class using a particular approach or resource, compared with how a different teacher taught a different class using a different approach or resource. Having the same teacher teach in both conditions may seem to remove one variable, but actually raises issues about the teacher's relative commitment to, skill in, and expectations about the effectiveness of the two different approaches. More sophisticated designs that switch treatments between groups at some intermediate measurement point, and even seek to balance ordering effects due to the sequence in which groups meet treatments, tend to be more difficult to organise in practice. Given all these complications, experimental work carried out by practitioners within a single educational context may not seem to offer outcomes of wide relevance elsewhere.
The literature suggests that pupils at secondary schools often fail to think enough about the ideas that practical work is meant to illustrate whilst they are doing the practicals (Abrahams, 2011). Yet we know that what is true in one context may not be true elsewhere. So we teachers at the Theresa May New Grammar School in Poortown have undertaken some research, and we have shown it is true here with our students doing their chemistry practical work as well. This is useful, because now we know that problem also exists here, we can look to address it.
Flipped learning, where pre-lecture activities free up time for more interactive engagement in problem-solving in class (Seery, 2015), has worked well in a number of universities, but had not previously been attempted in Chemistry 101 at West of the Rockies University. We have now implemented this approach and found that students here seem to benefit from this innovation.
If this were a sufficient basis for publishable research, then there would potentially be thousands of yet-to-be-written papers from the many schools, colleges and universities around the world telling us that something also works (or does not work) here, there and everywhere. It seems likely that most reviewers would consider that papers which simply tell us ‘it also works here’ fail on one of the generally agreed criteria for high quality research: originality. Such research is certainly valuable for informing practice in the contexts where it is undertaken (i.e. as context-directed research), but thousands of papers simply repeating the same work in different institutions are of limited value to the international community (i.e. the work is not sufficiently theory-directed). So that is another kind of practice paper: one that certainly reports research, but which is unlikely to be published.
As so often happens in educational research, there is a need to go beyond the labels given to studies to see what particular authors actually mean by them. For example, ‘mixed methods’ has become a very popular label for educational research – but one that means very different things to different people (Taber, 2013b). The problem with the term ‘case study’ is that a study that discusses a case is not necessarily (in any meaningful sense) a case study.
A case is one instance among many. The case could be one school, one education district, one lesson and so on. One example of a published case study concerns one group of students (among several groups in a class) undertaking a discussion activity (one part of a lesson sequence) taught to one class as part of an innovative teaching module on complex systems (Duit et al., 1998). Another example is an exploration of one individual learner's developing conceptualisation of models of atomic structure (Petri and Niedderer, 1998). Case studies may be especially valuable in exploring changes over time where ‘signal’ would likely get lost in ‘noise’ if looking at averages across cases: as, for example, when exploring how conceptual understanding changes (Taber, 2003). These studies were considered publishable, but clearly the discussion of one group of students, or the thinking of a single student, is of limited generalisability. Indeed, the logic of ‘idiographic’ research is that some phenomena of interest can only be understood in any depth by looking at particular cases in great detail. Yet it would not be viable for researchers to study, or journals to publish, how every group of students discussed every science topic, how every individual student understood every science concept, or indeed how every attempt to implement a pedagogic innovation in some classroom, lecture hall, laboratory, or study group worked out.
So, to be publishable, a case study has to be motivated by the current state of knowledge as reported in the literature (and not just in the sense that we do not know whether it works here in particular, that there is no published study of whether this specific class will benefit from an interactive text, or of what this particular student understands about reaction rate). Yet, as suggested above, given that every student, class, school etc. is in some sense unique, there is always the question of to what extent research reported in the literature is generalisable to other contexts. Not knowing whether it works here is certainly a strong motivation for context-directed research – practitioner research aimed at informing those of us working in this context – but, as suggested above, not all research that is valuable in the local context meets the criteria for publication.
Rather, the motivation for academic research (theory-directed research) needs to be more principled – more theoretical – than simply that the literature does not tell us about this specific instance or context. So if flipped learning has been found to work well in a range of universities in the United States, but – let us say – has not been reported in some East Asian context, that may be a gap in the literature: but only if we have good grounds for considering generalisation to that context to be problematic. This is where the ‘theoretical’ comes in: if it is known that there are differences in (say) standard teaching practices and typical student study habits between these contexts which could feasibly influence the effectiveness of flipped learning, then there is a good justification for exploring (what is nominally) the ‘same’ innovation in the distinct context.
This is not replication, as we know the conditions are not the same, but is research to explore the range of application of our educational findings – and is theoretically motivated because we see the potential relevance of differences in the context. So ‘range of application’ is not to be understood in simple numerical terms (we can tick off another student, teacher, class, school, college, or university, where this applies) but in theoretical terms where we identify different types of learner, teacher, class, institutional context etc. on the basis of differences that theory suggests may be relevant.
There is much potential for this kind of research in chemistry education, and indeed more widely in science education (Taber, 2012b), provided the studies are principled because they explore practice in relation to identifiable variations (i.e. abstracted from particular contexts, so theoretical) rather than just because no one has tried something here before. A new research site needs to represent a theoretically different teaching and learning context, not just a different particular context. That in itself justifies new (potentially publishable) studies in new contexts, but these need not be case studies. Indeed, often it would be better to sample a range of cases that collectively might better represent the theoretical type to be explored, rather than just one particular example.
Idiographic studies, case studies, come into their own when it is suspected that what is being explored is responsive to a whole range of conditions that cannot easily be separated. This is often the case in teaching and learning contexts. What happens in a particular school class reflects that particular group of students and how they interact, and their particular teacher and how she responds to them and the class to her. The physical teaching environment may also have an influence, and the ethos and norms of the school almost certainly will. The cultural resources of the students and teacher will influence teaching and learning, and the language of instruction provides different constraints and affordances than other national languages would. Let us imagine this is a chemistry class in a country where context-based teaching has not been investigated. If we simply want to know whether context-based learning will be effective in that national context, then the best approach would be a survey across a range of classes attempting context-based learning in different schools in that country.
Yet if we are interested in understanding the extent to which the new approach can be implemented, and why students respond in the ways they do, we may need to do something much more detailed, where we can explore the nested nature (the learner, working in a group, in a class, in a certain teacher's classroom, in a particular school…) of the processes we are interested in. This is where case study is used: where we are exploring something complex, where there is no sharp distinction between the case and its wider context, and where we cannot decouple them anyway (Taber, 2013b). We could take the teacher and class out of school to a university model classroom – but then we may not find out what would happen in the authentic teaching and learning context.
So a case study is not just a study that reports on one case, but a study which offers sufficient ‘thick description’ (Geertz, 1973) to allow readers to appreciate the complexity of the system the case is a part of. We might see classroom teaching and learning as an emergent phenomenon that arises from a complex system (Wilensky and Resnick, 1999). In effect, case study is motivated by seeing the case as one example of a particular class of instances (a class in a country that has not tried this teaching method; a teacher who has not been exposed to this particular form of professional development; a university with a very different resourcing and student profile to those where this innovation has been tried before), while acknowledging that this is only one part of a complex identity that makes each case within the theoretical class unique in other respects.
All empirical studies are carried out in some context or contexts. Sometimes the researchers are external to the context, which may help avoid the potential for some kinds of bias. Sometimes the researchers are practitioners in the study context, which can offer privileged insider understanding. Practitioner researchers may also sometimes find it easier to negotiate access for research – although this can also present challenges in putting ethical precautions in place. If researching one's own students, it is normally necessary to set up alternative gatekeepers (Taber, 2013b), and this may be a particular challenge if the researcher has a senior or influential role in the institution (as the gatekeeper has to be able to step in and prevent or stop the research if there are concerns, and participants have to feel confident that the gatekeeper is willing and able to do this). Undertaking in-depth case studies can be resource-heavy, as the approach tends to require extensive engagement in the research context (Tobin et al., 1990), which may require external researchers to commit considerable time to being on site. Case studies carried out by practitioners can amplify the potential tensions between ethical and methodological concerns due to the dual role of practitioner and researcher (Taber, 2002), but they allow the researcher to engage in the research context over extended periods quite naturally, and to use their extensive insight into the case to report it with thick description.
In summary, CERP welcomes papers reporting high quality chemistry education research relevant to practitioners, and this certainly includes manuscripts reporting studies carried out into the researchers’ own practice. There is a lot wrong with some of the practitioner research submissions received by the journal – just as there is with some of the papers submitted by researchers who are not based in the research contexts they are reporting. However, CERP also receives and publishes strong examples of research carried out both by external researchers and by practitioner researchers. As long as the work meets the journal's usual criteria for publication, there is, in principle, nothing wrong with practice papers.