Alexandra R. Brandriet,a Rose Marie Wardb and Stacey Lowery Bretz*a
a Miami University, Department of Chemistry and Biochemistry, Oxford, OH 45056, USA. E-mail: bretzsl@miamioh.edu
b Miami University, Department of Kinesiology and Health, Oxford, OH 45056, USA
First published on 9th May 2013
Ausubel and Novak's construct of meaningful learning stipulates that substantive connections between new knowledge and what is already known require the integration of thinking, feeling, and performance (Novak J. D., (2010), Learning, creating, and using knowledge: concept maps as facilitative tools in schools and corporations, New York, NY: Routledge Taylor & Francis Group). This study explores the integration of these three domains using a structural equation modeling (SEM) framework. A tripartite model was developed to examine meaningful learning through the correlational relationships among thinking, feeling, and performance, using student responses regarding Intellectual Accessibility and Emotional Satisfaction on the Attitude toward the Subject of Chemistry Inventory version 2 (ASCI(V2)) and performance on an American Chemical Society (ACS) exam. We compared the primary model to seven alternatives in which correlations were systematically removed in order to represent a lack of interconnectedness among the three domains. The tripartite model had the strongest statistical fit, thereby providing statistical evidence for the construct of meaningful learning. Methodological issues associated with SEM techniques are considered, including problems related to non-normal multivariate distributions (an assumption of traditional SEM techniques) and causal relationships. Additional findings include evidence for weak configural invariance in the pre/post implementation of the ASCI(V2), mainly due to the poor structure of the pretest data. The results are discussed in terms of their implications for teaching and learning.
In the classroom, meaningful learning engages students in three domains of experience: thinking, feeling, and performance (Novak, 2002). Not only must all three domains be part of the learning experience, but there must also be an active integration among students' thinking, feeling, and doing:
“…successful education must focus upon more than the learner's thinking. Feelings and actions are also important. We must deal with all three forms of learning. These are acquisition of knowledge (cognitive learning), change in emotions or feelings (affective learning) and gain in physical or motor actions or performance (psychomotor learning) that enhance a person's capacity to make sense out of their experiences.” (Novak, 2010, p. 13, italics original)
A key point in Novak's theory is that the interaction of thinking and performance is necessary, but not sufficient, for meaningful learning to occur. Consider a student in the chemistry laboratory who may be able to execute a titration with extreme precision and accuracy. This does not mean that the student has meaningfully learned about the particulate interactions of the acid–base chemistry that cause the indicator to change color or even what information can be learned from a titration. If a student performs the task perfunctorily with no appreciation for when and why chemists perform titrations, then meaningful learning will not occur. Previous studies have typically investigated learning in chemistry with regard to either thinking or feeling, but have not examined the integration of thinking and feeling and performance.
For example, a large body of literature exists regarding student misconceptions that form when new information interacts with prior knowledge structures (Banerjee, 1991; Garnett and Treagust, 1992; Nakhleh, 1992; Taber, 2002; Stefani and Tsaparlis, 2009; Davidowitz et al., 2010; Bretz and Linenberger, 2012; McClary and Bretz, 2012; Naah and Sanger, 2012). In addition to knowledge acquisition, metacognition is an intellectual skill in which students are required to think about their own thinking and is an important construct for understanding how students learn chemistry (Cooper et al., 2008; Cooper and Sandi-Urena, 2009).
Students' feelings with regard to learning chemistry (e.g., students' attitudes, self-concept, and motivation) have been investigated largely through the use of self-report surveys (Pintrich and Johnson, 1990; Bennett et al., 2001; Bauer, 2005, 2008; Barbera et al., 2008). Xu and Lewis (2011) created a shortened version of the Attitude towards the Subject of Chemistry Inventory (ASCI) originally proposed by Bauer (2008). This instrument, referred to as the ASCI(V2), stems from social psychology theories in which attitudes are grounded in a tripartite model of responses: cognitive, affective, and behavioral (Rosenberg and Hovland, 1960). ASCI(V2) measures students' thinking about the Intellectual Accessibility of chemistry and their feelings about the Emotional Satisfaction of learning chemistry. Instruments of this nature can be used as diagnostic measures in the chemistry classroom and laboratory (Brandriet et al., 2011) to explore the interaction between thinking and feeling that impacts students' willingness to choose to learn the content in a meaningful manner. The National Research Council (2012) report on Discipline-Based Education Research (DBER) has recommended that more research is needed to examine the affective domain and its relationships to other outcomes such as knowledge (thinking) and skills (performance).
While performance in learning chemistry typically conjures up images of students in the laboratory (e.g., Mattox et al., 2006; Poock et al., 2007; Lawrie et al., 2009; Sandi-Urena et al., 2011), this third element of meaningful learning has also been investigated in classroom settings by examining constructs such as problem solving, chemistry expectations, and scientific reasoning (Tobin and Capie, 1981; Grove and Bretz, 2007; Holme et al., 2010). Numerous studies have explored factors that influence performance using American Chemical Society examinations that are nationally normed in the United States (Lewis and Lewis, 2005, 2007; Bunce et al., 2006; Cracolice et al., 2008).
Students' abilities to perform algorithmic problems without demonstrating a strong conceptual understanding have been investigated (Nakhleh and Mitchell, 1993; Zoller et al., 1995) and are typical of research that examines dyads, i.e., relationships between two variables such as thinking and performing. For example, Lewis et al. (2009) and Xu and Lewis (2011) used student performance on an ACS exam to characterize the thinking-performance dyad and the feeling-performance dyad. Very recently, a study by Xu et al. (2013) used structural equation modeling (SEM) to identify pre-instruction predictors (prior knowledge, math ability, and attitude as measured by the ASCI(V2)) of student performance on an ACS exam. However, that study did not simultaneously examine thinking, feeling, and performance in chemistry as the essential elements of Novak's construct of meaningful learning.
Fig. 1 Independent-dependent relationships in regression analysis. Independent variable (IV) and dependent variable (DV).
Extensive literature has identified the relationships between student attitude and science achievement (Steinkamp and Maehr, 1983; Turner and Lindsay, 2003; Kan and Akbas, 2006; Nieswandt, 2007). However, discrepancies exist regarding the exact nature of the attitude–achievement relationship. Researchers often use statistical techniques that assume directional relationships between variables (e.g., attitude is an independent variable that influences the dependent variable of chemistry achievement). Whereas this can be a legitimate assumption when temporal relationships exist between variables (e.g., attitude is established first and then subsequently influences achievement), this argument is not always viable for cross-sectional studies (Bullock et al., 1994). Hence, the question of whether achievement influences attitude must also be considered. For example, students who do well may improve in attitude, while the attitudes of students who underperform may decline.
In a study recently published by Xu et al. (2013), students' pre-instruction “thinking” and “feeling” were used as predictors of student performance. While the Xu et al. (2013) contribution to the body of literature is substantial, our work adds to it by being grounded not only in the previous literature but also in an explicit theoretical lens. While situating a research study within the context of previous literature is important, the confirmatory approach of SEM requires a priori stipulation of a theoretical model (Kline, 2011). In the absence of theory, SEM may identify multiple patterns in the data that can be confirmed by a variety of models that fit the data well. Using a theoretical lens to formulate testable hypotheses reduces the occurrence of Type I and Type II errors (Kline, 2011). Novak himself cautions against method-driven, rather than theory-driven, research (Novak, 2010, p. 20). The research described below emerged from Novak's construct of meaningful learning, tested a simple model that was thoughtfully built to reflect this theoretical framework, and contributes to the body of previously published research by simultaneously testing all three dyads within Novak's domains of meaningful learning. Furthermore, the design of this study contrasted the theorized model with multiple other plausible models in order to provide evidence that any good statistical fit stemmed from confirmation of the theory, rather than from patterns in the data.
(1) What evidence exists for a mutually dependent relationship across thinking, feeling, and performance for meaningful learning?
(2) Is the ASCI(V2) structurally invariant across pretest and posttest implementations?
The response options for the ASCI(V2) are on a semantic differential scale of polar adjectives with the stem “Chemistry is…”. The items are measured on a 7-point scale where a positive response (e.g., chemistry is easy) is scored as a 7 while the polar adjective (e.g., chemistry is hard) is scored as 1. Four items on the ASCI(V2) require reverse coding. ASCI(V2) has strong internal consistency and greater construct validity than the original ASCI (Brandriet et al., 2011; Xu and Lewis, 2011).
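The scoring scheme just described can be sketched in a few lines. The following is a hypothetical illustration of 7-point semantic differential scoring with reverse coding; the function name and structure are our own, not part of the published instrument, and the reverse-coded item numbers follow the items labeled ‘r’ in the descriptive statistics table.

```python
# Hypothetical scoring sketch for ASCI(V2)-style items; names are ours.
# On the 7-point semantic differential scale, reverse-coded items are
# flipped (8 - raw) so that 7 always reflects the positive pole.

REVERSE_CODED = {1, 4, 5, 7}  # items labeled 'r' in the descriptive table

def score_item(item_number: int, raw_response: int) -> int:
    """Score one item; raw responses run from 1 to 7."""
    if not 1 <= raw_response <= 7:
        raise ValueError("responses must fall on the 1-7 scale")
    if item_number in REVERSE_CODED:
        return 8 - raw_response
    return raw_response
```

For example, a raw response of 2 on reverse-coded item 1 is scored as 6, so higher scores consistently reflect more positive responses.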
Fig. 2 Directional models tested through structural regression modeling. Standardized parameter estimates for the theorized and comparison models (N = 89): meaningful learning (a), attitude influences success (b), and success influences attitude (c). Theta (θε) represents error associated with an observed variable and zeta (ζ) represents error associated with a latent variable. All paths leaving error are set to 1. IA, ES, and ACS are Intellectual Accessibility, Emotional Satisfaction, and ACS exam score, respectively. *p ≤ 0.05, **p ≤ 0.01, ***p ≤ 0.001.
To provide evidence that the elements of thinking, feeling, and performance must not only exist in the model but also be interdependent, the model of meaningful learning (Fig. 2, model a) was compared to seven alternative models in which individual bidirectional paths were successively removed from between the thinking, feeling, and performance variables (Fig. 3, models 1–7). Furthermore, configural invariance of the ASCI(V2) was assessed to test for equivalence of its two-factor structure across both pretest and posttest implementations of the instrument. This was done to identify whether the ASCI(V2) factor structure is equivalent across the pretest and posttest implementations for the at-risk students in this study, and thus, provide some insight into the construct validity of the data produced by the instrument within this study. If the factor structure is not similar across the pretest and posttest implementation of the instrument, caution is advised when comparing the results of the two implementations.
Fig. 3 Theorized and alternative models of meaningful learning. Standardized parameter estimates for the theorized (a) and alternate models (1–7) (N = 89). Theta (θ) represents error associated with an observed variable and zeta (ζ) represents error associated with a latent variable. All paths leaving errors are set to 1. *p ≤ 0.05, **p ≤ 0.01, ***p ≤ 0.001.
Based on recommendations of Hu and Bentler (1999), χ2 and the associated p-value, the Comparative Fit Index (CFI), and the Standardized Root Mean Square Residual (SRMR) were used to evaluate model fit. The χ2 goodness-of-fit test compares the proposed covariance structure of the theorized model with that of the sample data. In order to retain the null hypothesis (i.e., that the theorized and sample covariance matrices are equal), the p-value must be greater than 0.05. Another indication that the proposed model fits the data well is when the χ2 per degrees of freedom (χ2/df) is lower than 2.00; however, this value should be interpreted with caution since it is considered a “rough rule of thumb” (Tabachnick and Fidell, 2007). Due to small sample considerations, the model fit was based on analysis of the CFI (>0.95) and SRMR (<0.06) statistics as proposed by Hu and Bentler (1999). The CFI is an incremental fit index that measures the relative improvement of the fit of the model in comparison to a model that assumes that none of the variables covary; this is known as the independence or null model. Large CFI values indicate a substantial improvement of the proposed model relative to the independence model. The SRMR is a standardized index based on the difference in covariance of the residuals between the predicted and the observed models. Small values indicate that the residuals of the theorized model reflect those in the sample data (Kline, 2011). The Root Mean Square Error of Approximation (RMSEA) and the Tucker–Lewis Fit Index (TLI) are also provided as output by Mplus, but these indices tend to over-reject true population models when sample sizes are small (Hu and Bentler, 1999). Therefore, the RMSEA and TLI indices were not used in this analysis as indicators of model fit.
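As a concrete sketch of one of these indices, the CFI can be computed from the chi-square values and degrees of freedom of the fitted and independence (null) models. This minimal Python function uses the standard noncentrality-based definition; the numbers in the usage comment are hypothetical, not values from this study.

```python
def cfi(chi2_model: float, df_model: int, chi2_null: float, df_null: int) -> float:
    """Comparative Fit Index.

    Measures the relative improvement of the fitted model over the
    independence (null) model using noncentrality estimates
    (chi-square minus degrees of freedom, floored at zero).
    """
    d_model = max(chi2_model - df_model, 0.0)
    d_null = max(chi2_null - df_null, d_model)
    if d_null == 0.0:
        return 1.0  # model fits at least as well as a perfect baseline
    return 1.0 - d_model / d_null

# Hypothetical example: chi2 = 30 on 25 df against a null chi2 = 500 on
# 36 df yields a CFI close to 1, i.e., a large improvement over the null.
```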
Item | Item #a | Pre mean (SD) (N = 123) | Post mean (SD) (N = 89)
---|---|---|---
Intellectual accessibility (IA) | — | — | —
Easy–Hard | 1r | 2.92 (1.28) | 2.78 (1.43)
Complicated–Simple | 2 | 2.64 (1.29) | 3.06 (1.61)
Confusing–Clear | 3 | 3.36 (1.37) | 3.53 (1.55)
Challenging–Unchallenging | 6 | 2.20 (1.06) | 2.43 (1.36)
Emotional satisfaction (ES) | — | — | —
Comfortable–Uncomfortable | 4r | 3.61 (1.37) | 3.74 (1.56)
Satisfying–Frustrating | 5r | 3.72 (1.56) | 3.37 (1.77)
Pleasant–Unpleasant | 7r | 3.90 (1.24) | 3.64 (1.37)
Chaotic–Organized | 8 | 4.40 (1.40) | 4.34 (1.54)

a Items that required reverse coding are labeled ‘r’.
Multivariate normality was assessed using Mardia's normalized estimate of multivariate kurtosis in AMOS 20 (Mardia, 1970). Values greater than 5.00 indicate that the data may lack a multivariate normal distribution (Byrne, 2010). The Mardia normalized coefficient was assessed prior to running analyses on each model. All values were greater than 7.00; therefore, the data were outside the bounds of multivariate normality. Another consideration was the small sample size; therefore, a post-hoc power analysis was conducted based on the RMSEA test of close fit (MacCallum et al., 1996). Power was established to be greater than 0.80, and thus sufficient to detect statistical differences.
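Mardia's normalized kurtosis estimate can be reproduced outside AMOS. The sketch below, with our own function and variable names, implements the standard formula: the mean squared Mahalanobis distance has expectation p(p + 2) under multivariate normality, and the normalized statistic is approximately standard normal.

```python
import numpy as np

def mardia_normalized_kurtosis(X: np.ndarray) -> float:
    """Mardia's multivariate kurtosis for an (n, p) data matrix,
    normalized to an approximate z statistic.

    Large positive values (e.g., > 5) suggest the data depart from
    multivariate normality in the direction of heavy tails.
    """
    n, p = X.shape
    centered = X - X.mean(axis=0)
    S = centered.T @ centered / n              # maximum-likelihood covariance
    # Squared Mahalanobis distance of each observation from the centroid
    d2 = np.einsum("ij,jk,ik->i", centered, np.linalg.inv(S), centered)
    b2p = np.mean(d2 ** 2)                     # Mardia's kurtosis estimate
    expected = p * (p + 2)
    return (b2p - expected) / np.sqrt(8.0 * expected / n)
```

For data drawn from a multivariate normal distribution the statistic hovers near zero; heavy-tailed data push it well past the cutoff.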
Fig. 4 Measurement model for ASCI(V2). Theta (θε) represents measurement error in the model. Paths leaving error are set to 1 but removed for simplicity of image. IA and ES are Intellectual Accessibility and Emotional Satisfaction, respectively.
The fit indices for the pretest measurement model were χ2(19, N = 123) = 77.177, p ≤ 0.001, CFI = 0.832, and SRMR = 0.078, while the fit indices for the posttest measurement model were χ2(19, N = 89) = 27.657, p = 0.0903, CFI = 0.958, and SRMR = 0.051. Based on the Hu and Bentler (1999) criteria, the posttest measurement model was determined to have a strong fit, while the fit of the pretest model was poor. In order to assess whether the two-factor structure of the ASCI(V2) was stable across pre and post chemistry instruction, configural invariance testing was performed on the data. The structure was found to vary across implementations, with fit indices of χ2(50) = 121.482, p ≤ 0.001, CFI = 0.868, and SRMR = 0.079. Results show that the pretest data contributed to the overall χ2 value more than the posttest (χ2pretest = 76.964, χ2posttest = 44.518). Since a large χ2 is associated with a small p-value, this result is indicative of the poor fit of the pretest model. This was likely a consequence of the diversity of students' prior experiences with chemistry before college instruction (e.g., secondary school chemistry). Students with varying prior experiences may respond differently, resulting in inconsistent responses and poor model fit. However, the two-factor structure did fit the posttest data well. This result may be founded in the students now having a shared experience, and therefore a shared understanding, of chemistry.
Because the strong fit of the posttest ASCI(V2) measurement model provides evidence of construct validity (based on the fit indices), the posttest data were used for the remaining analyses. All pretest and posttest items had large and significant loadings on their respective factors, as indicated in Table 2. The correlation between the IA and ES factors was large and significant (rpretest = 0.759, p ≤ 0.001; rposttest = 0.828, p ≤ 0.001).
ASCI(V2) item | Factor loading (pre) | Factor loading (post) | Error variance (pre) | Error variance (post)
---|---|---|---|---
Intellectual accessibility | | | |
1r | 0.700b | 0.736b | 0.511b | 0.458b
2 | 0.755b | 0.822b | 0.429b | 0.324a
3 | 0.857b | 0.840b | 0.265b | 0.294b
6 | 0.577b | 0.655b | 0.667b | 0.571b
Emotional satisfaction | | | |
4r | 0.787b | 0.824b | 0.381b | 0.320b
5r | 0.808b | 0.783b | 0.348b | 0.387b
7r | 0.787b | 0.630b | 0.381b | 0.603b
8 | 0.418b | 0.552b | 0.826b | 0.696b

a p ≤ 0.01. b p ≤ 0.001.
The theorized model and the two alternate directional models (Fig. 2, models a–c) had identical and good fit: χ2(25, N = 89) = 35.985, p = 0.0718, χ2/df = 1.44, CFI = 0.955, and SRMR = 0.052. The Akaike Information Criterion (AIC) takes into account both model fit and parsimony and is generally used to compare non-nested models (one model is nested within another when the only difference between the models is that individual paths in one have been removed) (Kline, 2011; Byrne, 2012). However, since models a–c had the same level of parsimony (df = 25) and identical fit, the AIC was equivalent across the three models (AIC = 2962.265). Based purely on statistical fit, we could not rule out any of the proposed non-nested models. All significant parameter estimates for models a–c were large (see Fig. 2).
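The AIC logic invoked here can be made concrete: under maximum likelihood, AIC = -2 log L + 2k, where k is the number of freely estimated parameters, so models with identical likelihoods and identical parsimony necessarily tie. A minimal sketch, with hypothetical numbers rather than values from this study:

```python
def aic(log_likelihood: float, n_free_parameters: int) -> float:
    """Akaike Information Criterion: penalized badness-of-fit; lower is better."""
    return -2.0 * log_likelihood + 2.0 * n_free_parameters

# Two models with the same log-likelihood and the same number of free
# parameters receive identical AIC values, which is why equally fitting,
# equally parsimonious models cannot be distinguished on AIC alone.
```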
Given that all three models had identical and good statistical fit, the best-fitting model could not be determined on empirical evidence alone. If we had tested only one of these models without comparing its fit to that of the other theoretically viable models, an incomplete account of the relationships could have been proposed. Once again, SEM models should always be based on theory and compared to other theoretically viable models. When model specification is based only on empirical evidence and not upon theoretical considerations, the likelihood of a significant or non-significant path arising by chance (Type I or Type II error) is of great concern (Kline, 2011). For this reason, the model of best fit was chosen on the basis of the theoretical framework of meaningful learning (Fig. 2, model a). Feedback from assessment of student performance may also influence students' thinking and feeling, much as students' thinking and feeling affect their performance. If one were to accept one of the other models (Fig. 2, model b or c) based only on empirical evidence, one would fail to maximize the quality of student learning by not taking into account the tripartite equilibrium that exists among thinking, feeling, and performance. The remaining analyses were based upon model a due to the theoretical implications of meaningful learning and considerations of the temporality of the variables.
As shown in Table 3, the theorized model of meaningful learning had a much stronger fit than the seven alternate models. The chi-square fit statistic was influenced by the number of degrees of freedom in the model, so that more parsimonious models (more degrees of freedom) were less likely to emulate the covariance matrix and were easier to reject than less parsimonious models (fewer degrees of freedom) (Raykov and Marcoulides, 2006). Since the alternate models were more parsimonious than the meaningful learning model, a chi-square difference test was used to determine if constraining correlational paths to zero (i.e., removing a path and increasing the degrees of freedom) in the alternate models did, in fact, reduce the fit of the model.
Nested models | χ2a | ΔTb | Δdf | χ2/df | CFI | SRMR
---|---|---|---|---|---|---
Meaningful learning | 35.985 | — | — | 1.44 | 0.955 | 0.052
Alternate 1 | 47.143c | 14.83d | 1 | 1.69 | 0.913 | 0.117
Alternate 2 | 53.529c | 17.54d | 1 | 2.06 | 0.887 | 0.133
Alternate 3 | 78.864d | 37.95d | 1 | 3.03 | 0.783 | 0.255
Alternate 4 | 54.608c | 21.34d | 2 | 2.02 | 0.877 | 0.141
Alternate 5 | 80.239d | 47.28d | 2 | 2.97 | 0.782 | 0.260
Alternate 6 | 85.528d | 44.72d | 2 | 3.17 | 0.760 | 0.265
Alternate 7 | 96.722d | 63.02d | 3 | 3.45 | 0.718 | 0.278

a Satorra–Bentler chi-square corrected for non-normal data. b Satorra–Bentler chi-square difference value corrected by the scaling factor. c p < 0.01. d p < 0.001.
When using the MLR method for parameter estimation, the chi-square values were corrected for non-normality in the data through the use of a scaling correction factor (Muthén and Muthén, 2010). Due to this correction, the difference between two scaled chi-square values no longer follows a chi-square distribution, so the simple difference alone has very little meaning. The Satorra–Bentler scaled chi-square difference test (ΔT) was therefore used to take the scaling factors into account (Satorra and Bentler, 2001; Mplus, 2012).
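Following the procedure described in the Mplus documentation, the scaled difference statistic can be computed from each model's reported (scaled) chi-square, degrees of freedom, and scaling correction factor. A sketch, with our own function name and hypothetical inputs:

```python
def sb_scaled_chi2_diff(t0: float, df0: int, c0: float,
                        t1: float, df1: int, c1: float) -> tuple[float, int]:
    """Satorra-Bentler scaled chi-square difference test.

    (t0, df0, c0): scaled chi-square, df, and scaling correction factor
    of the nested (more restricted) model; (t1, df1, c1): the same for
    the comparison model.  Returns (trd, delta_df); trd is referred to
    a chi-square distribution with delta_df degrees of freedom.
    """
    # Difference-test scaling correction
    cd = (df0 * c0 - df1 * c1) / (df0 - df1)
    # t * c recovers each model's uncorrected ML chi-square
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, df0 - df1
```

With both scaling factors equal to 1, the statistic reduces to the ordinary chi-square difference between nested models.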
Table 3 provides the ΔT and associated p-values between the meaningful learning model and each of the seven alternate models. The theorized meaningful learning model showed a significantly better fit than any of the alternate models, providing evidence that a tripartite relationship exists among the thinking, feeling, and performance elements of meaningful learning. In alternate models 2 and 3 (Fig. 3), the relationship between IA and ACS exam score is non-significant, but in the meaningful learning model the equivalent relationship is strong and significant. A similar result occurred in the alternate model in Fig. 2, model b. From these results, it is possible to conclude that the relationship between thinking and performance was the weakest relationship in the models. A slight positive skew in the IA items (students responded that chemistry was hard, complicated, challenging, and confusing) likely reduced the associated variance, and since the SEM paths are estimated simultaneously, the influence of the ES-ACS exam path may be so great that it suppressed the variance associated with the IA-ACS exam path in Fig. 2, model b and Fig. 3, models 2 and 3 (Velicer, 1978; Maassen and Bakker, 2001). However, since the fit of the theorized model was better than that of either model 2 or 3 in Fig. 3, the conclusion can be drawn that a relationship does exist between thinking and performance. Based on both theory and the results of this analysis, evidence has been generated that not only do thinking, feeling, and performance each need to be present, but each domain also needs to be interconnected, or integrated, into an educational experience, as originally theorized.
Instructors face the challenge of maintaining an equilibrium across thinking, feeling, and performance so that all domains co-exist and are integrated into a meaningful experience. In a chemical equilibrium, increasing the concentration of the reactants shifts the equilibrium to the right so that the concentration of the products also increases. Le Chatelier's principle does not apply, however, to the model of meaningful learning. That is to say, emphasizing thinking and performance in the chemistry classroom is necessary, but not sufficient, to improve feeling. In fact, in this study the weakest association in the meaningful learning model was between thinking and performance, further emphasizing that feeling is vital for students to have meaningful experiences. Teachers must help students grow in the affective domain. Chemistry students' affective ideas are built through teacher and student interactions with thinking and performance (Novak, 2010). By sharing and discussing their thinking with the instructor, or through inquiry-based teaching strategies, students can build their own ideas about how chemical principles relate to them and to their understanding of the world. By providing contextual examples of chemical concepts, instructors may help students assimilate new ideas into their prior understanding. We suggest that future research replicate this study with different samples in order to further generalize Novak's model to a larger population, as well as test different types of measures of thinking, feeling, and performance under Novak's model.
In this study, the model of meaningful learning had strong statistical fit for the posttest data, but the fit for the pretest data was poor. One explanation for this result is that students at the beginning of the semester have not yet had a meaningful experience to integrate their thinking, feeling, and performance, and therefore, meaningful learning of chemistry could not yet have occurred. However, after a semester of chemistry lecture, laboratory, and POGIL-based recitation sections, the thinking, feeling, and performance of these students are more meaningfully integrated. Our data suggest that meaningful learning does not exist at the beginning of general chemistry for these students. More research is needed to explore how thinking, feeling, and performance change at multiple points in time across a semester in order to understand the implications of the lack of invariance across the pretest and posttest measurement models. However, based on the results of this study, educators and researchers should proceed cautiously when designing studies that compare the results of self-report data over time. The pretest results in this study are likely a consequence of the diversity of (if not the lack of) students' prior experiences with chemistry before university instruction. The thinking, feeling, and performance of students with varying prior experiences may well differ, especially when measured at the beginning of a semester before the students have had a chance to share a common meaningful experience.
This journal is © The Royal Society of Chemistry 2013