Regis Komperda,a Kathryn N. Hosbein,b Michael M. Phillipsc and Jack Barbera*b
aDepartment of Chemistry & Biochemistry, Center for Research in Mathematics and Science Education, San Diego State University, USA
bDepartment of Chemistry, Portland State University, USA. E-mail: jbarbera@pdx.edu
cSchool of Psychological Sciences, University of Northern Colorado, USA
First published on 8th April 2020
The Science Motivation Questionnaire II (SMQ II) was developed to measure aspects of student motivation in college-level science courses. Items on the SMQ II are structured such that the word ‘science’ can be replaced with any discipline title (e.g., chemistry) to produce a discipline-specific measure of student motivation. Since its original development as the Science Motivation Questionnaire and subsequent refinement, the SMQ II and its discipline-specific variants have been used in a number of science education studies. However, many studies have failed to produce acceptable validity evidence for their data based on the proposed internal structure of the instrument. This study investigated whether modifications could be made to the SMQ II such that it produces consistent structural evidence across its use in various forms. A modified SMQ II (mSMQ II) was tested with wording variants (‘science’ and ‘biology’ or ‘chemistry’) in general biology and in preparatory and general chemistry courses at several institutions. Exploratory and confirmatory factor analyses were used to cull problematic items and evaluate the structure of the data based on the relations posited by the SMQ II developers. While extensive revisions resulted in acceptable data-model fit for the five-factor structural models in most course and wording conditions, significant issues arose for the single-factor scales. Therefore, potential users are cautioned about the utility of the SMQ II or its variants to support the evaluation of classroom practices. A reflective review of the theoretical underpinnings of the SMQ II scales calls into question the original framing of the scales and suggests potential alternatives for consideration.
When developing instruments to measure unobservable (i.e., latent) traits such as motivation, it is necessary to align the items on the instrument with a theoretical framework for the latent variable (American Educational Research Association et al., 2014). In the case of motivation, the literature contains multiple theoretical frameworks including social-cognitive theory (Bandura, 1993), self-determination theory (Ryan and Deci, 2000), and expectancy-value theory (Wigfield and Eccles, 2000), among others. One instrument combining multiple motivation theories is the Science Motivation Questionnaire (SMQ; Glynn and Koballa, 2006; Glynn et al., 2009), which was later revised by the developers into the Science Motivation Questionnaire II (SMQ II; Glynn et al., 2011).
Within the SMQ II (Glynn et al., 2011), the only items that specifically address aspects of self-regulation are on the self-determination scale (factor 3; p. 1167), as these items focus on study preparation and effort exertion for studying science (e.g., “I put enough effort into learning science” or “I prepare well for science tests and labs”). Additionally, the scale itself might not align with a framework of self-determination, particularly if, as described by the authors (p. 1161), this definition arises from self-determination theory (SDT; Deci and Ryan, 2000), which centers on the three psychological needs of autonomy, competence, and relatedness. When these three needs are met (to varying degrees), an individual's actions are more self-determined, which can influence regulatory styles (as described in SDT) across the extrinsic-intrinsic continuum. Self-determined actions are growth-oriented and are not overly impacted by external influences; this is how self-determined actions relate to the distinction between intrinsic and extrinsic motivation. This continuum contrasts with the dichotomy implied by the separation of the intrinsic motivation scale from the extrinsic scale in the SMQ and, later, the extrinsic-focused grade and career motivation scales in the SMQ II. An additional concern regarding the theoretical framework of the SMQ II is the inclusion of self-determination as a distinct construct. Though Ryan and Deci describe their theory of motivation as self-determination theory, self-determination describes motivated action when one's psychological needs are being met (Ryan and Deci, 2000) and ranges along a continuum from extrinsic to intrinsic motivation rather than being a distinct construct.
The primary support for Glynn and colleagues’ (2011) proposed theoretical framework for the SMQ II comes from analyses of the internal structure of the instrument using both exploratory and confirmatory factor analysis. Factor analysis techniques allow researchers to determine whether instrument data align with a hypothesized internal structure, a form of model testing critical to the practice of science (Grosslight et al., 1991). In the case of the SMQ II, this takes the form of a correlated five-factor model containing five distinct yet related aspects of motivation: intrinsic motivation, grade motivation, career motivation (the latter two originally conceived as a single extrinsic factor), self-determination, and self-efficacy. Results from exploratory factor analysis in prior work (Glynn et al., 2009) showed that extrinsic motivation consisted of two separate but related components: grade and career motivation. When extrinsic motivation was split into these two components, the five-factor motivation model was shown to provide adequate fit to samples of students within major and non-major biology courses (Glynn et al., 2011). While these two types of extrinsic motivation were supported through factor analysis, these results alone do not provide strong theoretical support for the new constructs. If subsequent data are found to fit this model poorly, or if the aspects of motivation measured by the SMQ II are not found to be distinct factors within additional samples, this provides an opportunity to examine potential issues with the underlying model of motivation or the items developed to measure it and to further refine the items and/or model. This model testing should occur at each use of the instrument to support the validity of the data collected and ensure the results can be interpreted in a meaningful way (American Educational Research Association et al., 2014).
As assessment instruments are commonly used within the chemistry education community to provide insight into the impacts of classroom practice, it is imperative that the data produced by an assessment instrument show evidence of validity and reliability (Arjoon et al., 2013). Furthermore, if assessment items used to measure a relevant trait, such as self-efficacy, are not shown to align with a theory of self-efficacy, the interpretation of the results may not be reflective of a learning environment's support of, or impact on, the trait. Therefore, interpreting data from assessment instruments that do not show evidence of validity and reliability can lead to misinformed judgements about classroom practice. As the SMQ II is purported to be an assessment tool that can be used across a range of courses and disciplines (Glynn et al., 2011), data from its different wording variants and applications must be equally supported by evidence.
As the SMQ II developers intentionally designed the instrument such that the word ‘science’ could be replaced with any other specific discipline (Glynn et al., 2011), many versions of the SMQ II can be found in the literature using wording such as biology, chemistry, organic chemistry, histology, math, nanotechnology, pharmacy, physics, and technology (Tosun, 2013; Campos-Sánchez et al., 2014; Riccitelli, 2015; Salta and Koulougliotis, 2015; Srisawasdi, 2015; Hibbard et al., 2016; Kassaee, 2016; Kwon, 2016; Mahrou et al., 2016; Olimpo et al., 2016; Cleveland et al., 2017; Reece and Butler, 2017; Yamamura and Takehira, 2017; Ardura and Pérez-Bitrián, 2018; Austin et al., 2018; Cagande and Jugar, 2018; Komperda et al., 2018a; Young et al., 2018). The popularity of the SMQ II also extends to discipline-based education researchers who have translated it from English into at least seven other languages (Tosun, 2013; Campos-Sánchez et al., 2014; Salta and Koulougliotis, 2015; Srisawasdi, 2015; Schumm and Bogner, 2016; Shin et al., 2017; Yamamura and Takehira, 2017; Ardura and Pérez-Bitrián, 2018; Vasques et al., 2018).
Investigation of the internal structure of the SMQ II has utilized analysis techniques both with and without a priori models of how the items should be related. Analyses without an a priori model generally fall under the classification of exploratory factor analysis (EFA), although of the two techniques most commonly used with the SMQ II, principal components analysis and principal axis factoring, the former is frequently described as a data reduction technique rather than a true factoring approach (Henson and Roberts, 2006). As described earlier, the theoretical framework of the SMQ II describes motivation as a “multicomponent construct” composed of “types and attributes of motivation” (Glynn et al., 2011, p. 1161) including intrinsic motivation, self-determination, self-efficacy, grade motivation, and career motivation. Of the seven studies using EFA techniques, four identified five factors aligned with the proposed theoretical framework (Glynn et al., 2011; Kwon, 2016; Schmid and Bogner, 2017; Ardura and Pérez-Bitrián, 2018). After initially failing to find a five-factor solution, Austin et al. (2018) removed a majority of the intrinsic items, resulting in a combined intrinsic/career factor described as ‘relevance.’ Yamamura and Takehira (2017) also obtained a four-factor solution after removing 12 items due to low association with a factor, including all the self-efficacy items. The last study only utilized three scales from the SMQ II (self-efficacy, self-determination, and career), which resulted in a three-factor solution (Schumm and Bogner, 2016). These studies (Table S2, ESI†) provide some support that the items are aligned with their intended factors, but moving to a confirmatory framework provides the ability to test data against a previously specified model and restricts items to associating with only a single factor.
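To make the distinction between these two exploratory approaches concrete, they could be run in R roughly as follows. This is a sketch only, assuming the psych and GPArotation packages and a hypothetical data frame `smq2_data` holding the 25 item responses.

```r
library(psych)        # provides principal() and fa()
library(GPArotation)  # needed for the oblique (oblimin) rotation

# Principal components analysis: a data reduction technique that models
# total variance rather than a common-factor structure
pca_solution <- principal(smq2_data, nfactors = 5, rotate = "oblimin")

# Principal axis factoring: a common-factor extraction that models only
# the variance shared among items
paf_solution <- fa(smq2_data, nfactors = 5, fm = "pa", rotate = "oblimin")

# Suppress small loadings to ease inspection of the five-factor pattern
print(paf_solution$loadings, cutoff = 0.3)
```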
The SMQ II developers specified a correlated five-factor model with five items belonging to each factor (Glynn et al., 2011). Therefore, data collected from administration of the SMQ II can be tested against this a priori model and evaluated with typical data-model fit criteria (Hu and Bentler, 1999). These criteria take the form of examining the values of various fit indices and comparing them to suggested cutoffs, generally a CFI and/or TLI at or above 0.95, RMSEA at or below 0.06, and SRMR at or below 0.08. Direct comparison of data-model fit across studies with the SMQ II is difficult due to variations in the wording of the items as either science or discipline-specific, and editing or removal of the items themselves (see Table S3, ESI† for a list of studies, variations, and fit values). However, in general the Hu and Bentler cutoff criteria were not met by most studies (Glynn et al., 2011; Salta and Koulougliotis, 2015; Kwon, 2016; Ardura and Pérez-Bitrián, 2018; Komperda et al., 2018a; Vasques et al., 2018) unless the instrument was modified by removing items or entire scales (Tosun, 2013; Yamamura and Takehira, 2017).
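For readers wishing to run this kind of test, the correlated five-factor model can be specified in a few lines of lavaan syntax in R. The sketch below assumes a hypothetical data frame `smq2_data` whose columns follow the item labels used in Table 1 (I1–I5, SD1–SD5, SE1–SE5, G1–G5, C1–C5).

```r
library(lavaan)

# Correlated five-factor SMQ II model; cfa() allows all factors to
# covary by default, matching the developers' specification
smq2_model <- '
  intrinsic =~ I1 + I2 + I3 + I4 + I5
  self_det  =~ SD1 + SD2 + SD3 + SD4 + SD5
  self_eff  =~ SE1 + SE2 + SE3 + SE4 + SE5
  grade     =~ G1 + G2 + G3 + G4 + G5
  career    =~ C1 + C2 + C3 + C4 + C5
'

fit <- cfa(smq2_model, data = smq2_data)

# Compare these values against the Hu and Bentler (1999) cutoffs
fitMeasures(fit, c("chisq", "df", "cfi", "tli", "rmsea", "srmr"))
```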
Additional limitations for the direct comparison of CFA results across studies stem from the low frequency with which information is reported about the estimator chosen for the factor analysis (Table S3, ESI†) and justification that the properties of the data supported the use of the chosen estimator. For example, when descriptive statistics are reported for SMQ II items or scales, it is frequently found that responses to the grade motivation items are much higher (more positive) than for the other scales (Glynn et al., 2011; Salta and Koulougliotis, 2015; Hibbard et al., 2016; Ardura and Pérez-Bitrián, 2018; Austin et al., 2018; Komperda et al., 2018a). This could indicate potential issues with non-normality of the data or a collapsing of the five-point response scale such that it essentially functions as only a two- or three-point scale for some items. In either case, it would be recommended to move from the typical maximum likelihood (ML) estimator to a robust estimator (MLR) that provides a correction for non-normality, or to a mean- and variance-adjusted weighted least squares (WLSMV) estimator for categorical data (Finney and DiStefano, 2013). Studies employing the WLSMV estimator with the full 25-item SMQ II have found slightly better data-model fit than those employing the ML estimator (Komperda et al., 2018a). The inconsistency among the CFA results suggests the need to examine causes for this variation, particularly as they relate to alignment between the theoretical framework of the SMQ II, the individual items, and student responses. Providing evidence of these alignments is paramount to ensuring that the instrument has solid theoretical support and can further be used to inform instructional practices.
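In lavaan, switching among these estimators is a small change to the same call; a sketch, reusing the hypothetical `smq2_model` and `smq2_data` from above:

```r
# Robust maximum likelihood: corrects standard errors and the test
# statistic for non-normality while still treating responses as continuous
fit_mlr <- cfa(smq2_model, data = smq2_data, estimator = "MLR")

# Declaring the items as ordered makes lavaan treat the five-point
# responses as categorical and switches to the WLSMV estimator
item_names <- paste0(rep(c("I", "SD", "SE", "G", "C"), each = 5), 1:5)
fit_wlsmv <- cfa(smq2_model, data = smq2_data, ordered = item_names)
```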
1. What are potential reasons for the inconsistent validity evidence based on the internal structure of the SMQ II as proposed by the SMQ II developers?
2. If these issues are addressed, will a modified SMQ II that is aligned with the theoretical framework proposed by the SMQ II developers have acceptable internal structure across different wordings and course contexts?
These research questions were addressed in two phases with independent samples. The goal of phase one was to identify modifications that could be made to the SMQ II to improve the functioning of both the science and discipline-specific wordings when administered to undergraduate students. The result of phase one was the development of a modified Science Motivation Questionnaire II (mSMQ II). The goal of phase two was to assess the functioning of the mSMQ II in a new sample of undergraduate students enrolled in science courses in order to evaluate whether the modifications resulted in improved data-model fit relative to the SMQ II. A factor analysis framework was chosen for this research to align with previous work done by the instrument developers (Glynn et al., 2011) and to provide a point of comparison to the previously discussed SMQ II studies. The methods and results from each phase are reported sequentially. Within each phase, human subjects IRB approval was obtained from Portland State University and appropriate participant consent was gathered from the study populations. Any incentives, if provided, are noted within each population description.
During the interview, students were provided a paper copy of the SMQ II and randomly presented with either the science or chemistry wording of the items. Students read all of the items silently and circled their responses on the original frequency-based scale. Students were asked to explain their responses to a subset of items, which were identified by the research team as having potentially good or poor fits to the response scale and/or a hypothesized item category based on a prior study (Komperda et al., 2018a). Next, students were asked to go over the entire survey again and explain whether any of their item responses would change if the other wording (science or chemistry) were substituted. Specific demographics were not collected for the interview participants and each was provided a $10 gift card for their time.
Though students were not asked to fully describe their response process for all SMQ II items during the interviews, the results from the 12 items students responded to show similarities between language used by students and the preferred response scale identified by the experts. The three intrinsic labeled items explored in the interviews (I2, I3, and I5; wording given in Table 1) were coded as having frequency-based responses in less than 40% of instances (32%, 25%, and 38%, respectively), which aligned with expert preferences not to use a frequency-based response scale for these items. Similar responses were seen for the self-efficacy item SE2 in which frequency-based codes were used with 32% of responses and for the three career items explored in the interviews (C1, C2, and C3) with less than a quarter of students using frequency-based language (15%, 22%, and 15%, respectively). Only two items explored in the interviews, a self-efficacy item (SE1) and a grade item (G4) showed a majority use of frequency-based language (65% and 57% of codes, respectively), which is also consistent with the experts’ evaluation of being more aligned with a frequency-based response scale than a Likert-type scale.
Scale | Item | SMQ II wording | Itema | mSMQ II wording
---|---|---|---|---
Intrinsic | I1 | The science I learn is relevant to my life | I1* | The [] I learn is relevant to my life
 | | | I1a* | The [] I learn is relevant to the world around me
 | I2 | Learning science is interesting | I2 | Learning [] is interesting
 | I3 | Learning science makes my life more meaningful | I3a | Learning [] helps me understand the world around me
 | | | I3b | Learning [] increases my appreciation of the world around me
 | I4 | I am curious about discoveries in science | I4 | I am curious about discoveries in []
 | I5 | I enjoy learning science | I5 | I enjoy learning []
Self-determination | SD1 | I put enough effort into learning science | SD1a | I put effort into learning [] well
 | SD2 | I use strategies to learn science well | SD2* | I use strategies to learn [] well
 | SD3 | I spend a lot of time learning science | SD3 | I spend a lot of time learning []
 | SD4 | I prepare well for science tests and labs | SD4a* | I prepare well for [] tests
 | SD5 | I study hard to learn science | SD5 | I study hard to learn []
 | | | SD5a* | I use a lot of mental energy learning []
Self-efficacy | SE1 | I am confident I will do well on science tests | SE1 | I am confident I will do well on [] tests
 | SE2 | I am confident I will do well on science labs and projects | SE2a | I am confident I will do well on [] assignments
 | SE3 | I believe I can master science knowledge and skills | SE3a* | I believe I can master [] knowledge
 | SE4 | I believe I can earn a grade of “A” in science | SE4a | I believe I can earn the grade I want in []
 | SE5 | I am sure I can understand science | SE5* | I am sure I can understand []
Grade | G1 | I like to do better than the other students on science tests | G1a* | I like to do better than the other students in []
 | G2 | Getting a good science grade is important to me | G2 | Getting a good [] grade is important to me
 | G3 | It is important that I get an “A” in science | G3a | It is important that I earn the grade I want in []
 | G4 | I think about the grade I will get in science | G4 | I think about the grade I will get in []
 | | | G4a* | I worry about my [] grade
 | G5 | Scoring high on science tests and labs matters to me | G5a | Scoring high on [] tests matters to me
Career | C1 | Learning science will help me get a good job | C1 | Learning [] will help me get a good job
 | C2 | Knowing science will give me a career advantage | C2 | Knowing [] will give me a career advantage
 | C3 | Understanding science will benefit me in my career | C3 | Understanding [] will benefit me in my career
 | C4 | My career will involve science | C4 | My career will involve []
 | C5 | I will use science problem-solving skills in my career | C5* | I will use [] problem-solving skills in my career
a Items removed after exploratory factor analysis are indicated with asterisks.
When asking students to explain their responses to some of the SMQ II items, recurrent issues with students' interpretations arose. For example, when explaining their responses to the intrinsic item (I1) “The science I learn is relevant to my life”, 17% of students referenced their career as the reason that science (or chemistry) was relevant to their lives. This was unexpected since the SMQ II contains a separate set of items intended to address students' career motivation. Similar overlap with thinking about future careers was seen in responses to the grade item (G4) “I think about the grade I will get in science”, where students cited pressure from graduate or professional school as the reason they think about their grade.
When responding to the intrinsic item (I3) “Learning science makes my life more meaningful,” students were unsure of what “meaningful” meant in that context. The most frequent way students described “meaningful” was that learning science (or chemistry) helped them to better understand the world around them. Students also expressed confusion about other vague phrases, such as “relevant” (I1) and “think about” (G4), suggesting that the wording of these items could be made clearer to improve response process validity.
Another commonly observed response was for students to ignore a portion of an item when formulating their response. An example is the self-efficacy item (SE2) “I am confident I will do well on science labs and projects.” Some students explicitly mentioned focusing only on the lab portion of the item because their course did not involve projects, while other students made a comparison between labs and tests. In these instances, students may be interpreting projects to mean tests or simply ignoring the project portion of the item. In either case, this suggests problems with the wording of the item due to the presence of multiple topics within a single item (i.e., the item is double-barreled) or a topic in the item not being applicable to the typical experience of a student in a general chemistry course (e.g., having projects). Full student quotes are provided in Table S6 (ESI†).
During the student interviews, two intrinsic items (I1 and I3) were identified as potentially causing unintended student responses. In the first item, I1, students were considering their career as a reason that science or chemistry was relevant to their lives, which could cause this item to be more associated with responses to items on the career scale rather than the intrinsic motivation scale. For that reason, an additional version of the item, “The science I learn is relevant to the world around me”, was added to the survey in order to test whether more general wording could avoid having the item align with the career items. The second item, I3, was revised into two different wordings to address two different reasons science or a specific discipline may be “meaningful” to students: increasing their understanding or their appreciation of the world. The phrase “world around me” was used in these revisions to align with the revision to item I1. Additionally, students describing their response to item G4 emphasized thinking about their grade in terms of worrying about requirements for graduate or professional school, so an additional item was added to explore the “worry” aspect of thinking about grades.
Other modifications were made to remove or separate double-barreled items. As in Salta and Koulougliotis (2015), the word “labs” was removed from items SD4, SE2, and G5 rather than removing the items entirely (Austin et al., 2018), since most of the classes had separate lab and lecture components. Additionally, the word “projects” was removed from SE2 and replaced with the more general “assignments”, since not all classes have projects. Similarly, item SE3 had “skills” removed, since that word appeared more aligned with a laboratory context.
The final set of modifications was made to deal with phrases that were either overly vague or overly focused on a specific aspect of course performance. One self-determination item, SD1, “I put enough effort into learning science”, was modified into the more concrete “I put effort into learning science well”, which better aligned it with item SD2 about using “strategies to learn science well.” Similarly, a second version of item SD5, “I study hard to learn science”, was written more concretely as “I use a lot of mental energy learning science.” In grade item G1, which specifically focused on students doing better than others on tests, the test-specific language was dropped to account for other aspects of course performance. Similarly, two items, SE4 and G3, focused students on earning an “A” in science; in recognition that this is not necessarily the goal for all students, these were reworded to earning “the grade I want in science.” The complete set of mSMQ II items is provided in Table 1. Items that stayed the same from the SMQ II to the mSMQ II retain the same numbering. Items that have been revised have a letter after their designation (e.g., I1a) to indicate that they are a revision of a pre-existing SMQ II item, I1 in this case. In total, 29 items appeared on the mSMQ II.
A link to the online mSMQ II, created in Qualtrics, was provided to each course instructor. The instructor was asked to provide this link to students through their course management website and also to play a brief video in class in which a research team member described the purpose of the study and the consent process to students. No identifying student information was collected on the survey itself. Most of the course instructors offered extra credit for student participation in the survey. If extra credit was offered, students were taken to a separate survey where they entered their name and university ID for identification purposes for extra credit only. All surveys were open for a non-exam week selected by the instructor between the end of October and the end of November 2017. When taking the survey, students were randomly presented with either the science or discipline-specific wording (biology or chemistry) for all mSMQ II items. The items were presented in a randomized order followed by demographic questions about gender, race/ethnicity, and declared major.
The self-determination and self-efficacy scales of the mSMQ II consisted of only three items each after removing poorly functioning items. With only three items and no restrictions on the strength of associations between an item and a factor, known as loadings, a single-factor model has zero degrees of freedom, as the freely estimated parameters exactly exhaust the six observed variances and covariances, and data-model fit cannot be tested. Constraining the loadings on a factor to be equal (i.e., a tau-equivalent model) restores degrees of freedom to the model so that data-model fit can be tested (Komperda et al., 2018b). While tau-equivalent models are more restrictive and therefore less likely to achieve acceptable data-model fit than unconstrained (i.e., congeneric) models, it is necessary to use them when a factor has fewer than four items for the aforementioned reasons. Therefore, tau-equivalent single-factor models were tested for the self-determination and self-efficacy scales, while congeneric single-factor models were tested for the intrinsic, grade, and career scales.
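In lavaan syntax, the tau-equivalent constraint amounts to giving every loading the same label. The sketch below uses the three retained self-determination items from Table 1 and a hypothetical data frame `msmq2_data`.

```r
# Congeneric model: with three items, the free parameters exactly exhaust
# the six observed variances and covariances, so df = 0 and fit is untestable
sd_congeneric <- 'self_det =~ SD1a + SD3 + SD5'

# Tau-equivalent model: the shared label "l" forces the three loadings to
# be equal; fixing the factor variance (std.lv = TRUE) identifies the
# model and restores df = 2, so data-model fit can be tested
sd_tau <- 'self_det =~ l*SD1a + l*SD3 + l*SD5'

fit_sd <- cfa(sd_tau, data = msmq2_data, std.lv = TRUE,
              ordered = c("SD1a", "SD3", "SD5"))
```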
In recognition of the ordinal and highly skewed properties of the mSMQ II data, the robust diagonally weighted least squares (WLSMV) estimator was used (Finney and DiStefano, 2013). As the WLSMV estimator was expected to show better data-model fit than robust maximum likelihood (MLR) due to the properties of the data, fit indices from both estimators are provided for comparison purposes. Data-model fit was evaluated using a set of indices appropriate for the estimator used (Hu and Bentler, 1999; Yu, 2002; Beauducel and Herzberg, 2006; Xia and Yang, 2018). For the WLSMV estimator, values of CFI and TLI ≥ 0.95 and RMSEA ≤ 0.05 were used to indicate acceptable data-model fit. Since previous studies demonstrated that the SRMR does not function well with the WLSMV estimator when there is a small number of response categories, the SRMR was not used to make data-model fit assessments for this estimator. For the MLR estimator, values of CFI and TLI ≥ 0.95, RMSEA ≤ 0.06, and SRMR ≤ 0.08 were used to determine acceptable data-model fit. For both estimators, a model was deemed to have acceptable data-model fit only when all fit indices were acceptable. All CFA models were analyzed using the lavaan package in R (version 0.6–1; Rosseel, 2012).
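When robust estimators are used, the relevant indices are the scaled (robust) versions rather than the naive ones; one way these could be extracted from fitted lavaan objects like those sketched earlier:

```r
# Scaled indices for a WLSMV fit (SRMR intentionally omitted, since it
# does not function well with WLSMV and few response categories)
fitMeasures(fit_wlsmv, c("cfi.scaled", "tli.scaled", "rmsea.scaled"))

# Scaled indices plus SRMR for an MLR fit
fitMeasures(fit_mlr, c("cfi.scaled", "tli.scaled", "rmsea.scaled", "srmr"))
```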
Course | Wording | Responses | Female (%) | White (%) | Top major (%)
---|---|---|---|---|---
Preparatory chemistry | Science | 139 | 61 | 75 | Biology pre-health (40)
 | Chemistry | 137 | 76 | 74 | Biology pre-health (35)
General chemistry | Science | 835 | 55 | 63 | Engineering (30)
 | Chemistry | 855 | 51 | 60 | Engineering (31)
General biology | Science | 258 | 77 | 61 | Biology pre-health (29)
 | Biology | 263 | 76 | 66 | Biology pre-health (28)
Estimator | Course | Wording | χ2 | CFIa | TLIa | RMSEAa | [90% CI] | SRMRa
---|---|---|---|---|---|---|---|---
WLSMV | Preparatory chemistry | Science (n = 139) | 189 | 0.99 | 0.99 | 0.05 | [0.03, 0.07] | —
 | | Chemistry (n = 137) | 194 | 0.99 | 0.98 | 0.05 | [0.03, 0.07] | —
 | General chemistry | Science (n = 417) | 251 | 0.99 | 0.98 | 0.04 | [0.03, 0.05] | —
 | | Chemistry (n = 426) | 334 | 0.99 | 0.98 | 0.06 | [0.05, 0.06] | —
 | General biology | Science (n = 128) | 179 | 0.99 | 0.98 | 0.05 | [0.02, 0.06] | —
 | | Biology (n = 130) | 191 | 0.99 | 0.99 | 0.05 | [0.03, 0.07] | —
MLR | Preparatory chemistry | Science (n = 139) | 251 | 0.90 | 0.88 | 0.07 | [0.06, 0.09] | 0.06
 | | Chemistry (n = 137) | 235 | 0.93 | 0.92 | 0.07 | [0.05, 0.08] | 0.06
 | General chemistry | Science (n = 417) | 227 | 0.97 | 0.96 | 0.04 | [0.03, 0.05] | 0.04
 | | Chemistry (n = 426) | 284 | 0.96 | 0.95 | 0.05 | [0.04, 0.06] | 0.04
 | General biology | Science (n = 128) | 255 | 0.89 | 0.87 | 0.08 | [0.06, 0.09] | 0.07
 | | Biology (n = 130) | 210 | 0.96 | 0.95 | 0.06 | [0.04, 0.08] | 0.05
a Acceptable data-model fit values differ by estimator: for WLSMV, cut-off values are CFI and TLI ≥ 0.95 and RMSEA ≤ 0.05; for MLR, cut-off values are CFI and TLI ≥ 0.95, RMSEA ≤ 0.06, and SRMR ≤ 0.08.
The acceptable fit index values for some wording and course combinations with the MLR estimator align with those seen in other studies involving extensive modifications through item removal (Tosun, 2013; Yamamura and Takehira, 2017). Unfortunately, it is unclear whether those studies with acceptable fit indices used the same estimator, though it is a reasonable assumption since the default estimator in many CFA programs is maximum likelihood. The MLR fit indices that did not meet acceptable values are more similar to those from studies by the original developers (CFI = 0.91; RMSEA = 0.07; SRMR = 0.04; Glynn et al., 2011) or those using the 25 SMQ II items with modifications only to the language (e.g., Greek) or the target (e.g., chemistry) of the instrument (Salta and Koulougliotis, 2015; Kwon, 2016; Ardura and Pérez-Bitrián, 2018; Vasques et al., 2018).
The results of testing the five-factor model with two different estimators suggest two possibilities. The first is that the poor data-model fit seen in prior work with the unmodified SMQ II was primarily a result of using an estimator inappropriate for the characteristics of the data. However, this conclusion is suspect because previous work using the original wording and the appropriate WLSMV estimator did not show consistently acceptable data-model fit (Komperda et al., 2018a). The second is that the need for extensive revisions or removal of items in order to fit a five-factor model, as in this study and others, indicates a larger problem with the underlying theoretical framework of the instrument. This second possibility was investigated by examining how the individual scales, which represent the individual aspects of motivation, function as independent factors. If the scales show good data-model fit as single-factor models, this indicates that they are appropriate measurements of a construct but that they do not necessarily relate to each other in the ways hypothesized by the SMQ II developers. If the individual scales do not show good data-model fit as single-factor models, this indicates issues with what the scales themselves are measuring and whether it is well aligned with existing theories of motivation.
Model | Estimator | Course | Wording | χ2 | CFIa | TLIa | RMSEAa | [90% CI] | SRMRa
---|---|---|---|---|---|---|---|---|---
Intrinsic items (df = 5) | WLSMV | Preparatory chemistry | Science (n = 139) | 18 | 0.99 | 0.98 | 0.14 | [0.08, 0.21] | —
 | | | Chemistry (n = 137) | 42 | 0.98 | 0.96 | 0.23 | [0.17, 0.30] | —
 | | General chemistry | Science (n = 835) | 69 | 0.99 | 0.98 | 0.12 | [0.10, 0.15] | —
 | | | Chemistry (n = 855) | 166 | 0.98 | 0.96 | 0.19 | [0.17, 0.22] | —
 | | General biology | Science (n = 258) | 37 | 0.98 | 0.97 | 0.16 | [0.11, 0.21] | —
 | | | Biology (n = 263) | 48 | 0.99 | 0.97 | 0.18 | [0.14, 0.23] | —
 | MLR | Preparatory chemistry | Science (n = 139) | 18 | 0.93 | 0.86 | 0.14 | [0.09, 0.20] | 0.04
 | | | Chemistry (n = 137) | 34 | 0.91 | 0.81 | 0.21 | [0.15, 0.27] | 0.04
 | | General chemistry | Science (n = 835) | 31 | 0.97 | 0.95 | 0.08 | [0.06, 0.10] | 0.03
 | | | Chemistry (n = 855) | 107 | 0.94 | 0.87 | 0.15 | [0.13, 0.18] | 0.04
 | | General biology | Science (n = 258) | 44 | 0.90 | 0.81 | 0.17 | [0.14, 0.22] | 0.05
 | | | Biology (n = 263) | 50 | 0.92 | 0.83 | 0.18 | [0.15, 0.23] | 0.04
Self-determination items (equal loadings; df = 2) | WLSMV | Preparatory chemistry | Science (n = 139) | 0 | 1.00 | 1.01 | 0.00 | [0.00, 0.06] | —
 | | | Chemistry (n = 137) | 7 | 0.98 | 0.97 | 0.14 | [0.04, 0.25] | —
 | | General chemistry | Science (n = 835) | 26 | 0.99 | 0.98 | 0.12 | [0.08, 0.16] | —
 | | | Chemistry (n = 855) | 26 | 0.99 | 0.98 | 0.12 | [0.08, 0.16] | —
 | | General biology | Science (n = 258) | 2 | 1.00 | 1.00 | 0.02 | [0.00, 0.13] | —
 | | | Biology (n = 263) | 11 | 0.99 | 0.99 | 0.13 | [0.06, 0.21] | —
 | MLR | Preparatory chemistry | Science (n = 139) | 1 | 1.00 | 1.02 | 0.00 | [0.00, 0.12] | 0.09
 | | | Chemistry (n = 137) | 10 | 0.89 | 0.84 | 0.17 | [0.08, 0.28] | 0.15
 | | General chemistry | Science (n = 835) | 13 | 0.97 | 0.95 | 0.08 | [0.05, 0.12] | 0.12
 | | | Chemistry (n = 855) | 12 | 0.98 | 0.97 | 0.08 | [0.05, 0.11] | 0.09
 | | General biology | Science (n = 258) | 0 | 1.00 | 1.02 | 0.00 | [0.00, 0.04] | 0.02
 | | | Biology (n = 263) | 0 | 1.00 | 1.02 | 0.00 | [0.00, 0.04] | 0.02
Self-efficacy items (equal loadings; df = 2) | WLSMV | Preparatory chemistry | Science (n = 139) | 2 | 1.00 | 1.00 | 0.00 | [0.00, 0.16] | —
 | | | Chemistry (n = 137) | 5 | 1.00 | 1.00 | 0.10 | [0.00, 0.22] | —
 | | General chemistry | Science (n = 835) | 18 | 1.00 | 0.99 | 0.10 | [0.06, 0.14] | —
 | | | Chemistry (n = 855) | 10 | 1.00 | 1.00 | 0.07 | [0.03, 0.11] | —
 | | General biology | Science (n = 258) | 13 | 0.99 | 0.99 | 0.14 | [0.07, 0.22] | —
 | | | Biology (n = 263) | 10 | 0.99 | 0.99 | 0.13 | [0.06, 0.21] | —
 | MLR | Preparatory chemistry | Science (n = 139) | 3 | 0.99 | 0.98 | 0.06 | [0.00, 0.16] | 0.10
 | | | Chemistry (n = 137) | 6 | 0.97 | 0.96 | 0.12 | [0.00, 0.24] | 0.09
 | | General chemistry | Science (n = 835) | 24 | 0.97 | 0.95 | 0.12 | [0.08, 0.15] | 0.09
 | | | Chemistry (n = 855) | 9 | 0.99 | 0.98 | 0.06 | [0.03, 0.11] | 0.05
 | | General biology | Science (n = 258) | 10 | 0.96 | 0.94 | 0.13 | [0.07, 0.19] | 0.13
 | | | Biology (n = 263) | 7 | 0.97 | 0.96 | 0.10 | [0.03, 0.19] | 0.08
Grade items (df = 2) | WLSMV | Preparatory chemistry | Science (n = 139) | 7 | 0.99 | 0.97 | 0.14 | [0.04, 0.25] | —
 | | | Chemistry (n = 137) | 3 | 1.00 | 0.99 | 0.06 | [0.00, 0.19] | —
 | | General chemistry | Science (n = 835) | 4 | 1.00 | 1.00 | 0.04 | [0.00, 0.09] | —
 | | | Chemistry (n = 855) | 7 | 1.00 | 1.00 | 0.06 | [0.02, 0.10] | —
 | | General biology | Science (n = 258) | 2 | 1.00 | 1.00 | 0.00 | [0.00, 0.12] | —
 | | | Biology (n = 263) | 2 | 1.00 | 1.00 | 0.01 | [0.00, 0.12] | —
 | MLR | Preparatory chemistry | Science (n = 139) | 10 | 0.84 | 0.52 | 0.17 | [0.09, 0.26] | 0.04
 | | | Chemistry (n = 137) | 1 | 1.00 | 1.04 | 0.00 | [0.00, 0.09] | 0.01
 | | General chemistry | Science (n = 835) | 5 | 0.99 | 0.98 | 0.04 | [0.00, 0.07] | 0.01
 | | | Chemistry (n = 855) | 3 | 1.00 | 0.99 | 0.03 | [0.00, 0.06] | 0.01
 | | General biology | Science (n = 258) | 2 | 1.00 | 1.02 | 0.00 | [0.00, 0.07] | 0.02
 | | | Biology (n = 263) | 3 | 0.99 | 0.97 | 0.05 | [0.00, 0.10] | 0.02
Career items (df = 2) | WLSMV | Preparatory chemistry | Science (n = 139) | 0 | 1.00 | 1.00 | 0.00 | [0.00, 0.09] | —
 | | | Chemistry (n = 137) | 3 | 1.00 | 1.00 | 0.07 | [0.00, 0.20] | —
 | | General chemistry | Science (n = 835) | 22 | 1.00 | 0.99 | 0.11 | [0.07, 0.15] | —
 | | | Chemistry (n = 855) | 22 | 1.00 | 1.00 | 0.11 | [0.07, 0.15] | —
 | | General biology | Science (n = 258) | 5 | 1.00 | 0.99 | 0.08 | [0.07, 0.17] | —
 | | | Biology (n = 263) | 5 | 1.00 | 1.00 | 0.07 | [0.00, 0.16] | —
 | MLR | Preparatory chemistry | Science (n = 139) | 1 | 1.00 | 1.02 | 0.00 | [0.00, 0.07] | 0.01
 | | | Chemistry (n = 137) | 2 | 1.00 | 1.00 | 0.02 | [0.00, 0.15] | 0.01
 | | General chemistry | Science (n = 835) | 15 | 0.97 | 0.92 | 0.09 | [0.06, 0.12] | 0.02
 | | | Chemistry (n = 855) | 10 | 0.99 | 0.98 | 0.07 | [0.04, 0.11] | 0.01
 | | General biology | Science (n = 258) | 11 | 0.95 | 0.85 | 0.14 | [0.08, 0.20] | 0.03
 | | | Biology (n = 263) | 4 | 1.00 | 0.99 | 0.06 | [0.00, 0.13] | 0.01
a Acceptable data-model fit values differ by estimator: for WLSMV, cut-off values are CFI and TLI ≥ 0.95 and RMSEA ≤ 0.05; for MLR, cut-off values are CFI and TLI ≥ 0.95, RMSEA ≤ 0.06, and SRMR ≤ 0.08.
Examination of the patterns of acceptable and unacceptable data-model fit for the single-factor models provides evidence that the model of the remaining intrinsic items (I2, I3a, I3b, I4, and I5) showed consistently poor data-model fit across course and wording conditions, particularly in its high RMSEA values relative to the other scales. In contrast, the grade motivation items had a majority of course and wording combinations with acceptable data-model fit, more than any other scale. The patterns are more difficult to interpret for the self-determination and self-efficacy items given their inconsistency in fit across course and wording combinations as well as the additional constraints placed on those scales in order to obtain data-model fit information. Similarly, the results for the career items were somewhat mixed, particularly when comparing across estimators.
Overall, the single-factor model results provide some evidence that the underlying issue of inconsistent fit of the SMQ II and mSMQ II data may be due to problems with the individual aspects of motivation hypothesized to underpin the instrument. Single-factor models of the original SMQ II items showed similar patterns with the WLSMV estimator in that the intrinsic items had less support for their structure across course and wording combinations while the grade motivation items had more (Komperda et al., 2018a). Yet, grade motivation is not a known construct within the motivation literature while intrinsic motivation is. These results further support the interpretation that even when a more appropriate estimator is used there are underlying issues in the structure and framework of the SMQ II and mSMQ II responsible for the poor data-model fit.
Scale | Preparatory chemistry, Science | Preparatory chemistry, Chemistry | General chemistry, Science | General chemistry, Chemistry | General biology, Science | General biology, Biology
---|---|---|---|---|---|---
Intrinsic | — | — | — | — | — | —
Self-determination | 0.81 | — | — | — | 0.82 | 0.86
Self-efficacy | 0.89 | — | — | 0.86 | — | —
Grade | — | 0.90 | 0.90 | 0.88 | 0.89 | 0.90
Career | 0.92 | 0.92 | — | — | — | 0.95
No omega value is reported in Table 5 for scales that did not meet the previously determined data-model fit criteria for each estimator (CFI and TLI ≥ 0.95 and RMSEA ≤ 0.05 for WLSMV; CFI and TLI ≥ 0.95, RMSEA ≤ 0.06, and SRMR ≤ 0.08 for MLR). Though the omegas reported in Table 5 for the modified scales are generally higher than previously reported reliability values for the scales (Glynn et al., 2011; Salta and Koulougliotis, 2015; Schumm and Bogner, 2016; Schmid and Bogner, 2017; Ardura and Pérez-Bitrián, 2018; Komperda et al., 2018a), they should not be interpreted as providing any indication of scale quality on their own, as there is no set threshold for acceptable reliability (Arjoon et al., 2013; Taber, 2018). Rather, the reliability values are one of many pieces of evidence that should be evaluated to provide overall evidence for the quality of data obtained from an instrument.
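The manuscript does not state the software route used to compute omega; one plausible route, shown purely as an assumption, is the semTools companion package to lavaan, or a direct calculation from the standardized solution.

```r
library(semTools)

# Omega (among other coefficients) for a fitted single-factor model,
# e.g., the tau-equivalent self-determination model sketched earlier
reliability(fit_sd)

# Equivalent hand calculation from the standardized solution:
# omega = (sum of loadings)^2 / ((sum of loadings)^2 + sum of residuals)
std <- lavInspect(fit_sd, "std")
omega <- sum(std$lambda)^2 / (sum(std$lambda)^2 + sum(diag(std$theta)))
```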
Though the grade motivation scale had relatively better fit than the other scales, this is not an aspect of motivation with a strong theoretical foundation in the self-regulatory literature. The grade scale was created by the developers during revision from the original SMQ to the SMQ II based on EFA results and interviews with students (Glynn et al., 2011; Table S1, ESI†). The original grade (and career) items were initially intended to belong to a single factor representing extrinsic motivation, a more theoretically grounded aspect of motivation (Ryan and Deci, 2000), but evidence from this study does not suggest the grade and career items belong to a single factor. In addition to the theoretical concerns for the grade scale, there are practical concerns about how useful this scale is to instructors and researchers, since students were likely to always select responses on the far end of the response scale, resulting in a ceiling effect.
In contrast, the self-efficacy scale has more theoretical justification and, when tested with constraints, had acceptable data-model fit in some course and wording conditions, with more students making use of the entire response scale. The original set of items on the self-efficacy scale was also found to function well when translated into Spanish and used with the physics wording (Ardura and Pérez-Bitrián, 2018). For this reason, the self-efficacy scale may be useful to some practitioners, though it would be beneficial for future research to explore adding items to the scale.
It is likely that students value the science content they are learning for multiple reasons, and these reasons could have some overlap that needs to be addressed. This would explain the incongruence of the intrinsic and career items overlapping so strongly even though intrinsic and extrinsic motivation sit on opposite ends of a motivational continuum as described by self-determination theory (Ryan and Deci, 2000). Brophy (2008) noted that students’ motivation to learn has been examined in three categories: (1) the influence of the classroom milieu, (2) students’ expectancies or self-beliefs, and (3) their perceived value of the task. The SMQ II tends to focus on the latter two, and thus the expectancy-value model (Wigfield and Eccles, 2000) might be more harmonious with the intended purpose of the measure. For example, the items for the intrinsic, career, and grade motivation scales on the SMQ II are more aligned with different task value perspectives than with the polar ends of the extrinsic-intrinsic continuum. In the expectancy-value model, Eccles and Wigfield (2002) articulate different reasons why an individual might value a task. For the SMQ II, the two most relevant might be utility value (e.g., the usefulness of a chemistry course for reaching one's career goals) and intrinsic value (the content is inherently interesting to learn about). Viewed this way, a student might inherently find chemistry interesting to learn about and thus want to pursue a career in chemistry; in turn, the student is interested in attaining good grades in her chemistry course based on the relevance to her future career pursuits. For this student, her motivation would fall somewhere between ‘identified’ and ‘integrated’ regulation on Ryan and Deci's (2000) extrinsic-intrinsic continuum.
With specific regard to the SMQ II and its discipline-specific variants (i.e., BMQ II, CMQ II, PMQ II and others), researchers deciding to use this assessment instrument are encouraged to carefully consider the outcomes of this study, our prior work (Komperda et al., 2018a), and the number of SMQ II studies compiled within this manuscript when analyzing their data. Given the repeated lack of substantiation for the structure of the SMQ II, researchers choosing to use the instrument are urged to continue conducting single and multi-factor CFAs to evaluate the structure of their data.
While psychometric studies can provide insights regarding whether an assessment instrument, administered in a specific context, produces valid and reliable data, they are not necessarily designed to explain why validity or reliability might not be supported. Therefore, when structural validity issues arise, such as those observed with the SMQ II, qualitative studies designed specifically to explore the underlying theoretical framework might be warranted. Consequently, it is recommended that future work on the SMQ II include cognitive interviews using open-ended probes designed to elicit the nature of an underlying construct (e.g., intrinsic motivation) within a specific context (e.g., chemistry and biology courses). These studies would provide an understanding beyond the item-level cognitive interviews conducted to support response process validity and might provide insight into the salient features of a construct, the ways in which cuing affects student ideas, and, therefore, why the same structural model might not be equally supported across contexts.
Footnote
† Electronic supplementary information (ESI) available: see DOI: 10.1039/d0rp00029a