Gwendolyn Lawrie
School of Chemistry & Molecular Biosciences, The University of Queensland, St. Lucia, Qld 4072, Australia. E-mail: g.lawrie@uq.edu.au
A question which often arises for chemistry education researchers, and is also frequently raised by reviewers of Chemistry Education Research and Practice (CERP) articles, is whether a research data sample size (N) is big enough. The answer, however, is more complicated than a simple ‘yes’ or ‘no’! In fact, there is substantial discussion of this issue within the research literature, which can make it even harder for a researcher to decide.
The typical process of research data collection involves ‘sampling’: the selection of a number of participants, or artefacts, to be measured, which can be considered to represent a larger population or collection in the context of the study. There are no universal criteria available to guide researchers on what quantity of each unit of analysis or unit of observation is ‘enough’; the ideal sample size depends on the research context and the aims of the study. In this editorial, several considerations are presented to assist readers and future authors, informed by published articles in the field of CER along with highly regarded education research perspectives drawn from beyond our field. The purpose is to encourage authors to clearly communicate the processes they have applied to sample participants (or artefacts of teaching and learning) in their studies, while also acknowledging potential limitations or biases that arise. This article is by no means a comprehensive overview of the topic, and readers are encouraged to consult the cited works to gain a deeper perspective.
Fig. 1 Overview of different sampling strategies applying probability and non-probability sampling approaches (boxes and oval frames indicate the recruitment of study participants).
Qualitative, quantitative and mixed methods research paradigms each attract separate considerations with regard to sampling: the sample size and composition become very important to the perceived quality of the analysis subsequently applied. Quantitative studies are usually assumed to require a minimum sample size to ensure that statistical analysis techniques provide valid results. This minimum can be estimated through power analysis, based on the target population size and the probability of detecting a statistically significant effect, and the estimate depends on the choice of statistical method. However, researchers aiming to complete a confirmatory factor analysis followed by structural equation modelling are advised to think carefully about sample size, missing data, biases and latent variables, since these are specific to the model under consideration (Wolf et al., 2013). An inherent challenge in recruiting participants to complete survey-based instruments is the potential for non-response bias arising from low response rates (the number of usable data sets in proportion to the number of participants approached). We recommend that researchers report response rates and comment on sample quality in their articles.
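To make the power-analysis idea concrete, the sketch below uses the standard normal-approximation formula for a two-group comparison of means; the function names, the chosen effect size and the significance/power defaults are illustrative conventions (Cohen's d = 0.5, α = 0.05, power = 0.80), not values drawn from any study cited here. The response-rate helper simply applies the definition given above, using the Farheen and Lewis (2021) figures from Table 1 as an example.

```python
from math import ceil
from statistics import NormalDist

def min_n_per_group(effect_size, alpha=0.05, power=0.80):
    """Normal-approximation minimum n per group for a two-sided,
    two-sample comparison of means (effect_size = Cohen's d)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = 0.05
    z_beta = z.inv_cdf(power)           # ~0.84 for power = 0.80
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

def response_rate(usable, approached):
    """Usable data sets as a proportion of participants approached."""
    return usable / approached

print(min_n_per_group(0.5))                 # 63: a medium effect needs ~63 per group
print(round(response_rate(1086, 1584), 2))  # 0.69: Farheen and Lewis (2021) row
```

Note how quickly the required sample grows as the expected effect shrinks, which is one reason small classroom studies often cannot support inferential comparisons.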
Qualitative research methods are highly contextualised, hence it is not feasible to ‘calculate’ an ideal minimum sample size. Rather, the theoretical basis of the research paradigm becomes important in guiding data collection. For example, Guba (1981) explains that trustworthiness is increased in rationalistic treatments of data when probability sampling approaches are adopted, whereas in naturalistic treatments purposive sampling is more appropriate. Herrington and Daubenmire (2014) have provided chemistry education researchers with examples of sampling approaches for different types of qualitative research studies in our field.
Onwuegbuzie and Collins (2007) provide a useful synthesis of the literature, recommending ranges of sample sizes that may apply in different research methodology approaches. They also remind readers that the false dichotomy of qualitative and quantitative approaches does not delineate sampling approaches (Onwuegbuzie and Leech, 2007). In mixed methods research, it is rare to find a combination of qualitative and quantitative data that both involve random sampling; it is far more common to find a combination of non-probability sampled quantitative and qualitative data. There are many examples where participants are sampled by applying one strategy for a quantitative phase of a study followed by a different strategy for a qualitative phase. There is no minimum ideal number of interviews that can be calculated as being representative of the population that completed a quantitative instrument; this sample number depends on the nature of the research question or study aim.
The discussion of a sample size is often linked to the quality of a study and the potential generalizability of findings, often referred to as external validity. There are three models of generalizability: statistical generalization, analytical generalization and case-to-case transfer (transferability) (Polit and Beck, 2010). In quantitative research, external validity is considered important where the findings of a study are regarded as statistically generalizable and the sampled data are representative of the whole population. However, this aim can be considered an ideal, and is not necessarily achievable in practice when the entire population is difficult to define or access. In qualitative research, the analytic generalization model aims to support theories or conceptualisations through rigorous inductive analysis approaches and confirmatory strategies (Polit and Beck, 2010). Replication of findings can also contribute to supporting generalizability; an in-depth discussion of replication is, however, beyond the scope of this editorial.
A variety of sampling approaches are evident in chemistry education research and evaluative studies published in our journal (Table 1 provides several recent example articles). As mentioned earlier, in our research field, the units of analysis or observation in terms of evidence of learning and teaching may involve single or multiple combinations of individual or groups of students, teachers and institutions.
| Sampling approach | Context | Unit of analysis | Target population size (sample size) | Study |
|---|---|---|---|---|
| Random | Semester 2 general chemistry courses | Tertiary students | 1584 (1086) | Farheen and Lewis (2021) |
| Stratified random | Tertiary institutions | In-service teachers | 6388 (829) | Raker et al. (2021) |
| Quota | Multiple-level tertiary chemistry courses | Tertiary students | 23 (9) | Hosbein and Barbera (2020) |
| Convenience | High school classrooms | Secondary students | 78 (78) | Kadioglu-Akbulut and Uzuntiriyaki-Kondakci (2021) |
| Purposeful and snowball | Chemistry outreach activities | Graduate students | Case 1: 5; Case 2: 4 | Santos-Díaz and Towns (2021) |
| Voluntary | Professional development | Doctoral students | Incoming enrolments (4) | Busby and Harshman (2021) |
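To illustrate the practical difference between two of the approaches in the table, the sketch below contrasts simple random sampling with stratified random sampling over a hypothetical student roster; the roster, the strata (course levels) and the 10% sampling fraction are invented for the example and do not come from any of the cited studies.

```python
import random

# Hypothetical roster of 100 students, split unevenly across course levels
roster = ([("first-year", i) for i in range(60)]
          + [("second-year", i) for i in range(30)]
          + [("third-year", i) for i in range(10)])

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Simple random sampling: every student has an equal chance of selection,
# so a small stratum (here, third-year) may be missed in any single draw.
simple = rng.sample(roster, 10)

# Stratified random sampling: draw the same fraction from each course level,
# guaranteeing that every stratum appears in the sample.
def stratified_sample(population, key, frac, rng):
    strata = {}
    for item in population:
        strata.setdefault(key(item), []).append(item)
    sample = []
    for members in strata.values():
        k = max(1, round(frac * len(members)))
        sample.extend(rng.sample(members, k))
    return sample

stratified = stratified_sample(roster, key=lambda s: s[0], frac=0.1, rng=rng)
levels = {level for level, _ in stratified}
print(sorted(levels))  # all three course levels are represented
```

The trade-off sketched here is why stratified designs, such as that of Raker et al. (2021) in the table, are attractive when sub-populations of very different sizes must all be represented.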
It is important to acknowledge that quantitative experimental research studies which aim to evaluate the effectiveness of a teaching or learning intervention in classroom settings face multiple sampling challenges. There are often insufficient participants to enable randomised sampling approaches or statistical analysis to compare treatment and control groups. Taber (2019) provides a detailed overview and advice for sampling, generalizability and replication of findings for these types of studies.
CERP readers will often seek out research articles which provide insight into sampling methods and outcomes, regardless of sample size, so that they can inform their own research methods and context. Indeed, data curated across multiple cases or contexts, sourced from separate published studies, can be compared and synthesised as part of a meta-analysis to build a consensus picture that may become generalizable through the weight of combined evidence. In summary, our advice to authors is to invest in a detailed description of their data collection processes for their own context, including the sampling procedures used and the sample composition. They should also reflect further on their sample size and study context, acknowledging the limitations or sample biases in their study, before making recommendations for future research or for implementation in teaching practice. We suggest that conservative claims, informed by the study's findings, will be more highly regarded by readers than generalizations that are not supported by the data collected.
This journal is © The Royal Society of Chemistry 2021