Jacob D. McAlpin, Ushiri Kulatunga and Jennifer E. Lewis*
Department of Chemistry, University of South Florida, 4202 E. Fowler Avenue, Tampa, Florida 33620, USA. E-mail: jennifer@usf.edu
First published on 16th May 2023
Motivation helps drive students to success in general chemistry, and active learning environments with social interactions have consistently been shown to improve motivation. However, analyzing student outcomes in an interactive environment is best done by considering students not as isolated units but as working together and influencing each other. Therefore, we used social network analysis with self-determination theory as a framework for understanding motivation and social comparison theory as a framework for understanding how students influence each other. Analyzing an undergraduate general chemistry course that incorporates peer-led team learning, we fit a series of progressively sophisticated statistical models to Learning Climate Questionnaire and Intrinsic Motivation Inventory data from 270 students; these models show that perceived competence and relatedness predict student interest in the activities within their peer-led sessions. However, we also found evidence that students tend to become polarized in their interest toward peer-led team learning activities, which is one possible outcome of social comparisons with their peers. In addition to these findings, this project demonstrates how social network analysis can expand how chemistry education researchers consider relational data and account for the effects of non-independent data on statistical analysis.
Considering the potential impacts of motivation on student success, chemistry education researchers have investigated how pedagogy affects motivation. One of the findings from this work is the recognition of a positive effect on motivation from active learning pedagogies such as Study Periods and Discussion Groups (Cicuto and Torres, 2016), hands-on science teaching (Juriševič et al., 2012), and Process Oriented Guided Inquiry Learning (POGIL; Southam and Lewis, 2013). One particular active learning technique with promise to support student motivation in chemistry is Peer-Led Team Learning (PLTL; Gosser and Roth, 1998; Liu et al., 2018). Under this pedagogical method, students in large lecture courses are placed into groups under the supervision of a peer leader, an undergraduate student who has already successfully passed the course. As part of the PLTL environment, students have prolonged engagement with peers in their class over the length of the course. While this prolonged engagement has the potential to support student motivation and persistence, it also has implications for studying PLTL. Because students have the ability to influence each other, research on the potential impacts of PLTL should acknowledge that students are not isolated units but are incorporated into groups (Stevens, 2007; Theobald, 2018).
One method to analyze a population while considering how individuals might affect each other is through use of social influence models which have a few ways to model relational effects (Lane et al., 2019; Leenders, 2002). This technique falls under the broader category of Social Network Analysis (SNA) which is a set of processes and tools largely developed within sociology and mathematics for considering relational data (Wasserman and Faust, 1994) and has been promoted as a valuable tool in discipline-based education research (Grunspan et al., 2014). Social networks are maps of individuals (nodes) and the connections between them (edges/ties). The connections in a social network can either be directed, where a connection is not necessarily reciprocal (student A loans book to student B), or undirected, where the connection is reciprocal (student A and B study together). The methods of SNA were developed from graph theory and can allow education researchers to consider behaviors and social dynamics in the classroom.
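To make the node/edge vocabulary concrete, here is a minimal sketch using hypothetical students A and B; adjacency sets stand in for a dedicated SNA library. An undirected tie is stored in both directions, a directed tie in only one.

```python
# Minimal sketch with hypothetical students; adjacency sets stand in for an
# SNA library. An undirected tie is stored in both directions, a directed
# tie in only one.

def add_undirected(net, a, b):
    net.setdefault(a, set()).add(b)
    net.setdefault(b, set()).add(a)

def add_directed(net, a, b):
    net.setdefault(a, set()).add(b)
    net.setdefault(b, set())  # ensure the target node exists

study = {}  # undirected: "A and B study together" is reciprocal
add_undirected(study, "A", "B")

loans = {}  # directed: "A loans a book to B" is not necessarily reciprocal
add_directed(loans, "A", "B")

print("B" in study["A"], "A" in study["B"])  # True True
print("B" in loans["A"], "A" in loans["B"])  # True False
```

The asymmetry in the second network is exactly what distinguishes directed from undirected ties in SNA.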
The applications of SNA in education research are quite varied, so we present only a sampling here. First, Brewe et al. (2012) used SNA to investigate student communities in a physics learning center. They found that a student's centrality (a term describing how well an individual is connected within a network) in the student community was predicted by the days per week spent in the physics learning center and by whether or not the student was a physics major. However, a student's centrality in the network was not significantly predicted by gender or ethnicity variables, suggesting an equitable learning environment. At the same location, Dou et al. (2016) further found a correlation between a student's self-efficacy and the student's network centrality. Another option within SNA is to look more directly at how individual students affect each other, as demonstrated by Vitale et al. (2016) in a study of graduate students at an Italian university. They found that while formal groups of students formed temporarily did not relate to graduate school performance, informal groups based on mutual interests and goals were predictive of performance. Within chemistry education research specifically, a set of tools similar to SNA has been used to analyze discussion in a POGIL-style physical chemistry course (Liyanage et al., 2021). Within that course, three distinct patterns of student engagement were observed that seemed related to instructor facilitation strategies. Additionally, SNA was used to map the interactions of undergraduates, faculty, and graduate students at a virtual undergraduate poster session to show the different kinds of interactions among those groups (Bongers, 2022).
Students also come into PLTL groups with different experiences attributable to the various demographic categories to which they belong. Particularly when considering features such as motivation, there have been reports of significant differences among groups. For example, it has been reported that female students typically have lower motivation than their peers in general chemistry (Liu et al., 2017). Additionally, there are reports showing that motivation is correlated with early college academic success among Hispanic students (Kaufman et al., 2008), while other reports demonstrate that Hispanic students pass general chemistry at a rate lower than their peers (Mason and Mittag, 2001). When considering how students interact with each other in a PLTL setting, it is valuable to know whether these differences increase, hold steady, or decrease as a result of social influence in order to better plan activities to promote student learning and motivation.
We were interested in understanding how PLTL at a particular university supports the development of students' motivation in chemistry. Further, we wanted to understand more about how PLTL was serving the various demographic subgroups of the course to ensure that no group was being left behind. However, we were concerned about how the relational and non-independent nature of students in this setting might impact the validity of any claims made via traditional statistical methods, due to the increased chance of false positive results (Scariano and Davenport, 1987). We also wanted a plan of analysis that would let us understand more about how students affect each other as a result of their interaction. Therefore, in order to investigate motivation in a particular PLTL setting, we chose to use social influence models, a technique from SNA similar to multiple regression (Leenders, 2002). By using these social influence models, we seek to better understand student motivation in this context along with the nuances of how students can affect each other. Additionally, we wish to use this project to demonstrate to the chemistry education research community a way of considering relational data that can help elucidate features of social learning environments that are not obtainable through traditional statistical methods.
Fig. 1 Illustration of the three basic needs that must be satisfied to promote motivation according to basic needs theory (BNT).
Within chemistry education research, SDT has been used as a framework for understanding motivation in a variety of settings. This research includes finding a positive relationship between student motivation and visuospatial skills in the context of learning group theory in a spectroscopy course taught through process oriented guided inquiry learning in Australia (Southam and Lewis, 2013). SDT was also used by Juriševič et al. (2012), who found that students with higher motivation outperformed their peers in activities related to visible spectrometry.
SDT has also been used as a foundation when designing activities to support student learning and motivation (Ferreira et al., 2022; Wellhöfer and Lühken, 2022; Williams and Dries, 2022). Ferreira et al. (2022) designed inquiry-based laboratory activities for Brazilian secondary school students. The activities promoted autonomy (students approached activities as they saw fit), competence (students developed skills to propose procedures for experiments), and relatedness (the activities required teamwork). Through questionnaires, student interviews, and teachers' observations recorded in a logbook, Ferreira et al. found evidence that the activities promoted intrinsic motivation. Wellhöfer and Lühken (2022) reported on a laboratory course designed around problem-based learning. From student interviews, they found that giving students the ability to autonomously propose experimental procedures was helpful in supporting student motivation. Finally, Williams and Dries (2022) reported on a guided-inquiry laboratory course for intermediate-level chemistry majors (bioanalytical). From survey data, Williams and Dries found that many students credited the ability to approach experiments autonomously, or social factors, as features that were helpful to their learning.
In addition to the previous examples of activities designed explicitly around SDT, the design of PLTL has many elements that would support student motivation according to the principle of fulfilling the basic needs laid out by SDT (Liu et al., 2018). For autonomy, students are not given a strict way to approach problems, and peer leaders are instructed to support students in approaching the activities as the students see fit. For competence, the opportunity to work on the activities within the PLTL session is intended to enable students to gain more confidence in their ability to solve other chemistry problems. For relatedness, students' interactions with their peers provide opportunities for forming lasting relationships that can extend even beyond general chemistry.
One way that this social comparison can manifest is that students become more similar to each other through ‘normative processes’ (Dijkstra et al., 2008). In this case, students’ opinions or abilities become more similar to those of their peers as a result of interaction (Felson and Reed, 1986). These processes would result in students who are above average on some outcome moving down toward the average, and students below average moving up toward it. In an additional type of comparison, some students may see high performance among their peers as a source of negative self-evaluation because they do not meet the same standard as their peers. Simultaneously, the high-performing students can use that high performance as a standard confirming that they are in fact high performing. These two behaviors create a feedback loop, or ‘contrast effect’, which causes students to diverge in their outcome measures such that the range of the outcome measure spreads as a result of the social influence. Generally, many findings in education research examining the social influence of self-concept show a contrast effect (Dijkstra et al., 2008).
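As a toy illustration (ours, not drawn from the social comparison literature), the two processes can be contrasted with a one-step update rule in which each student's score moves toward the group mean (normative process) or away from it (contrast effect). The scores and rate are hypothetical:

```python
# Toy model: normative processes pull scores toward the group mean,
# while a contrast effect pushes them away from it.

def update(scores, rate):
    """One round of social influence; rate > 0 converges, rate < 0 diverges."""
    mean = sum(scores) / len(scores)
    return [s + rate * (mean - s) for s in scores]

scores = [2.0, 4.0, 6.0]           # hypothetical pre-interaction scores
normative = update(scores, 0.5)    # converge toward the mean
contrast = update(scores, -0.5)    # diverge from the mean

print(normative)  # [3.0, 4.0, 5.0] -- range shrinks
print(contrast)   # [1.0, 4.0, 7.0] -- range spreads
```

The spreading range in the second case is the signature of a contrast effect, which is what a negative network influence term later picks up.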
Within chemistry education research, social comparison theory was directly used as a framework to analyze how general chemistry students engaged in a simulated peer review activity (Berg and Moon, 2022). In this study, Berg and Moon found that students were motivated by a desire for self-improvement where the students would take what was good in the reviewed responses to improve their own. This is in contrast to a self-enhancement motivation where students gain confidence in their existing response based on a downward comparison to someone with perceived lower ability.
1. To what degree do three student groups of interest (transfer, female, and Hispanic) differ from their peers within our sampled population on measures of motivation, autonomy, competence, and relatedness?
2. To what degree does interest in PLTL activities follow the pattern described by Basic Needs Theory (BNT) and illustrated in Fig. 1 for general chemistry students within our sampled population?
3. What patterns do we observe in the social influence of motivation toward the end of the semester, and what does social comparison theory suggest about these patterns?
4. To what degree does the amount of social interaction over a semester relate to the intensity of social influence?
The particular semester for this study, Fall 2021, was unique in terms of how it was affected by COVID-19. At the beginning of the semester, all class activities were planned to be in-person. However, whenever a student tested positive for COVID-19, the section of the course and associated PLTL sections were moved online for a two-week period. Shortly before the first exam, the decision was made that, due to the need for direct interaction among students in PLTL, the Friday sessions would be online for the remainder of the semester while the lecture would remain in-person when possible. Online PLTL sessions were conducted in Microsoft Teams both before and after the decision to conduct all sessions online. For online sessions, students met in a Teams room assigned to their particular PLTL section, were split into breakout rooms for group work, and came together as a class to discuss answers after working in their particular group.
For the other three components related to BNT (motivation, competence, and relatedness), we chose to use measures from the Intrinsic Motivation Inventory (IMI; McAuley et al., 1989). The IMI was developed to include a number of potential scales that researchers can mix and match to their particular research questions to approach many aspects of SDT. For our research, we chose to use scales for interest, perceived competence, and relatedness to characterize motivation, competence, and relatedness respectively. These three scales of the IMI have been used in previous reports using SDT to analyze motivation in chemistry classrooms (Southam and Lewis, 2013; Liu, 2017; Ferreira et al., 2022). The interest scale is considered to be the self-report measure of intrinsic motivation (McAuley et al., 1989) and allows us to efficiently gain an understanding of motivation while simultaneously exploring other constructs. For competence, we used perceived competence as the self-reported impression that students are able to successfully answer the prompts in the activities within their PLTL session. For relatedness, the relatedness scale of the IMI gives us a measure of each student's desire to form and maintain interactions with the other members of their PLTL group.
A few adjustments were made to adapt the instruments to our setting. First, the items of the LCQ were presented with the original text of “my instructor” being replaced with “my peer leader”. This change in wording was made to focus the items on the peer leaders in terms of how they are supporting student autonomy. After completing the LCQ, students were presented items from the IMI. While setting up the instruments, we randomized the order of items in the IMI such that items from all scales were shuffled among each other, but each student would see the same items in the same order. For the IMI, the phrasing of “this activity” was replaced with “Peer Leading Activities” along with minor adjustments to the subject-verb agreement and “this person” was replaced with “my peer leading group” to match the context. These choices in wording were decided based on discussion with the peer leading coordinator to best match how the activities within the PLTL sessions are usually described to students by their peer leaders. The items for these instruments are presented in Appendix A.
Data preparation began by removing all responses from students who did not consent to having their responses analyzed for research purposes. Responses were then checked against course enrollment to match responses to other student data. Each case where a student gave an identical response to every item on either the LCQ or IMI was removed from the analysis in order to improve data quality. This choice was made because, as the instrument included a mixture of positively and negatively phrased items, an identical response to every item suggests the respondent was not thoughtfully reading and responding to the particular items. This straight-lining behavior has also been associated with participants speeding through responses in order to finish the instrument without thoughtfully engaging with the items (Zhang and Conrad, 2014). The last data preparation step was to reverse-code all negatively phrased items to aid in interpreting results.
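The straight-lining screen and reverse coding described above can be sketched as follows; the item names, 7-point scale maximum, and responses are hypothetical stand-ins, not our actual instrument or data:

```python
# Hypothetical reverse-keyed (negatively phrased) items and scale maximum.
NEGATIVE_ITEMS = {"interest_3", "relatedness_2"}
SCALE_MAX = 7

def is_straight_lined(response):
    """Flag respondents who gave an identical answer to every item."""
    return len(set(response.values())) == 1

def reverse_code(response):
    """Reverse negatively phrased items so high scores always mean 'more'."""
    return {item: (SCALE_MAX + 1 - v if item in NEGATIVE_ITEMS else v)
            for item, v in response.items()}

responses = [
    {"interest_1": 6, "interest_3": 2, "relatedness_2": 3},
    {"interest_1": 4, "interest_3": 4, "relatedness_2": 4},  # straight-lined
]

# Drop straight-lined cases, then reverse-code the remainder.
cleaned = [reverse_code(r) for r in responses if not is_straight_lined(r)]
print(cleaned)  # [{'interest_1': 6, 'interest_3': 6, 'relatedness_2': 5}]
```

On a 1-7 scale, reversing maps 2 to 6 and 3 to 5, so a high score on every cleaned item reads in the same direction.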
There were a total of 1988 students enrolled in the course. After data processing and cleaning, we had 1179 responses from the first administration (59% of enrolled students) and 1244 from the second administration (63%). Analyzing response rates by demographics (Appendix B, Table 8) did not reveal any concerning trends across demographic groups that might indicate a response bias and therefore merit a change in the analysis plan.
As part of running a social influence model, we needed to define the social network through which students were connected. As part of their responsibilities, peer leaders recorded which students attended each session as well as which students worked together during the sessions. From this attendance data, we constructed social networks based on the group composition for each particular session by considering members within a particular PLTL group as having undirected ties among each other. These particular social networks were then further refined based on the particular research question being addressed.
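A sketch of this construction, assuming hypothetical roster data in which each inner list is one PLTL group within a session:

```python
from itertools import combinations

# Hypothetical attendance roster: each inner list is one PLTL group.
session_groups = [["s1", "s2", "s3"], ["s4", "s5"]]

def cogroup_ties(groups):
    """Every pair within a group receives an undirected tie."""
    ties = set()
    for group in groups:
        for pair in combinations(sorted(group), 2):
            ties.add(pair)
    return ties

print(sorted(cogroup_ties(session_groups)))
# [('s1', 's2'), ('s1', 's3'), ('s2', 's3'), ('s4', 's5')]
```

Sorting each group before pairing keeps ties in a canonical order, so an undirected tie is stored once regardless of roster order.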
Measurement invariance testing (Appendix E) was then conducted in order to check if we had evidence that the internal structure was consistent between the two administrations and across some key demographic groups. This step is important because if measurement invariance does not hold, then the constructs cannot be meaningfully compared as they potentially have different internal structures across time or demographics (Chen, 2007; Rocabado et al., 2019; Rocabado et al., 2020). When comparing the first and second administration, we found evidence of strict invariance supporting that we can directly compare scores produced by the two administrations without need for further adjustments. Additionally, we observed strict invariance between female and male, white and Hispanic, and non-transfer and transfer students suggesting that we can use the factor scores from the instrument to directly compare students in these pairs of groups.
Another form of social influence, network disturbance, is described by the pair of equations y = Xβ + ε and ε = ρWε + υ. This mathematical model suggests that the residual terms of the regression are influenced by the network. For the classroom setting, this kind of finding would imply that students who interact rise or fall together after controlling for the other factors in the regression. The first equation, y = Xβ + ε, is identical to multiple regression and to how the terms were defined for network effects. The additional contribution of network disturbances is that the residual term is influenced by the social network. So for ρWε + υ, ρ is the weighting constant for the network disturbance, W represents the social network as it did in network effects, ε represents the residual terms from the regression, and υ is the remaining residual after accounting for the amount explained by the network influence.
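Solving the disturbance equation for ε makes the network's role explicit; this is a standard rearrangement, assuming I − ρW is invertible:

```latex
\varepsilon = \rho W \varepsilon + \upsilon
\quad\Longrightarrow\quad
(I - \rho W)\,\varepsilon = \upsilon
\quad\Longrightarrow\quad
y = X\beta + (I - \rho W)^{-1}\upsilon
```

Each student's residual is therefore a ρ-weighted mixture of their own disturbance and the disturbances of their network neighbors, which is why interacting students rise or fall together when ρ is positive.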
 | Model 1 | Model 2 | Model 3
---|---|---|---
Factor score computation method | Coarse | Refined | Refined
Social influence terms | N/A | N/A | Network effects (ρ1) and network disturbance (ρ2)
Outcome measure | Interest (time 2) | Interest (time 2) | Interest (time 2)
Predictors | Interest (time 1), Autonomy support (time 2), Perceived competence (time 2), Relatedness (time 2) | Same as Model 1 | Same as Model 1
Before we ran our multiple regression and social influence models, we needed to determine which students would be included in the analysis and how we would establish which students were connected. First, we used the group composition from the session before the second data collection was opened as our social network. As peer leaders were instructed to keep students together to the best degree possible and this particular session occurred late in the semester, we believed this particular session would be representative of students who had significant interactions with each other and had the most opportunities for social influence. Therefore, this was the network we used when running Model 3 for the direct comparison with Model 1 and Model 2.
However, as we also wanted to explore how social influence varied by the amount of interaction among students (research question 4), we needed to establish a different set of networks to address this question. For this task, we started by creating a weighted network object where the weight of a tie between two individuals was the count of the number of times the peer leader recorded the pair being in the same PLTL group. So even if groups were fluid based on the attendance for a particular session, this weighted network would be able to identify pairs of students who worked together consistently even if the other students in the group varied week to week. From this weighted network object, we created a series of unweighted networks by restricting the network to only include ties that reached a certain tie weight threshold. As an example, the network with a weight threshold of 3 includes pairs of students who interacted with each other at least three times, though certain pairs could have interacted more than three times. So low weight thresholds result in networks predominantly composed of interactions that were not maintained throughout the semester, while high weight thresholds result in networks predominantly composed of interactions that persisted throughout the semester.
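The weighting-and-thresholding procedure can be sketched with hypothetical weekly rosters (each inner list is one group within one session):

```python
from collections import Counter
from itertools import combinations

# Hypothetical weekly rosters: who sat in the same PLTL group each session.
weekly_groups = [
    [["s1", "s2", "s3"]],
    [["s1", "s2"], ["s3", "s4"]],
    [["s1", "s2", "s4"]],
]

# Weighted network: tie weight = number of sessions a pair worked together.
weights = Counter()
for session in weekly_groups:
    for group in session:
        for pair in combinations(sorted(group), 2):
            weights[pair] += 1

def threshold_network(weights, k):
    """Unweighted network keeping only pairs who interacted at least k times."""
    return {pair for pair, w in weights.items() if w >= k}

print(weights[("s1", "s2")])                # 3 -- a consistently paired duo
print(sorted(threshold_network(weights, 2)))  # [('s1', 's2')]
```

Even though s3 and s4 drift through several groups, only the pair that worked together repeatedly survives the threshold, mirroring how the unweighted networks isolate persistent interactions.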
From these networks, we needed to further restrict the particular students who would be included in our statistical models. For starters, as we wanted to use data from both timepoints in the analysis, we eliminated students who did not take and consent to both administrations of our instrument. Additionally, as we wanted to consider the impact of social influence on the students, we did not want to include students for whom we were lacking data from their group members as this lack of data could complicate how we interpret social influence. To address this concern, we removed students for whom we did not have complete data from at least 2 group members.
We started with 1179 students who completed the initial instrument and 1244 who completed the final instrument. These numbers account for students completing the instrument, consenting to data analysis, and not straight-lining their responses. In total, 870 students completed both the first and second data collections. Then, after filtering out students for whom we did not have complete data from at least 2 peers, we were left with 270 students to analyze. Tables comparing the 270 included students to the rest of the sample are presented in Appendix F (Tables 20–22). For the separate set of network objects defined by the number of interactions among students, we started with 735 students who completed both data collections and interacted at least 1 time with at least 2 others who also completed both data collections. As the interaction threshold increases, the number of students included decreases. For example, from the data we can model, 370 students interacted with peers at least 3 times, 105 interacted at least 7 times, and 24 interacted at least 11 times.
Table 2 Interest scores by administration and student group

Students | Administration | n | Mean | sd | Cohen's d
---|---|---|---|---|---|
All students | Time 1 | 1179 | 4.18 | 1.40 | |
All students | Time 2 | 1244 | 3.74 | 1.52 | 0.30 |
Non-transfer | Time 1 | 1037 | 4.14 | 1.38 | |
Transfer | Time 1 | 107 | 4.36 | 1.52 | 0.16 |
Non-transfer | Time 2 | 1096 | 3.72 | 1.51 | |
Transfer | Time 2 | 113 | 3.66 | 1.58 | −0.04 |
Female | Time 1 | 759 | 4.17 | 1.40 | |
Male | Time 1 | 420 | 4.19 | 1.41 | 0.02 |
Female | Time 2 | 762 | 3.66 | 1.53 | |
Male | Time 2 | 482 | 3.85 | 1.50 | 0.12 |
White | Time 1 | 475 | 3.99 | 1.39 | |
Hispanic | Time 1 | 261 | 4.22 | 1.36 | 0.17 |
White | Time 2 | 507 | 3.61 | 1.50 | |
Hispanic | Time 2 | 270 | 3.77 | 1.44 | 0.11 |
Table 3 Autonomy support scores by administration and student group

Students | Administration | n | Mean | sd | Cohen's d
---|---|---|---|---|---|
All students | Time 1 | 1179 | 5.56 | 0.93 | |
All students | Time 2 | 1244 | 5.64 | 0.99 | −0.08 |
Non-transfer | Time 1 | 1037 | 5.56 | 0.92 | |
Transfer | Time 1 | 107 | 5.57 | 1.05 | 0.01 |
Non-transfer | Time 2 | 1096 | 5.63 | 0.99 | |
Transfer | Time 2 | 113 | 5.72 | 0.99 | 0.09 |
Female | Time 1 | 759 | 5.56 | 0.96 | |
Male | Time 1 | 420 | 5.56 | 0.87 | 0.00 |
Female | Time 2 | 762 | 5.66 | 1.01 | |
Male | Time 2 | 482 | 5.61 | 0.96 | −0.05 |
White | Time 1 | 475 | 5.56 | 0.94 | |
Hispanic | Time 1 | 261 | 5.53 | 0.92 | −0.03 |
White | Time 2 | 507 | 5.65 | 1.01 | |
Hispanic | Time 2 | 270 | 5.59 | 0.97 | −0.03 |
Table 4 Perceived competence scores by administration and student group

Students | Administration | n | Mean | sd | Cohen's d
---|---|---|---|---|---|
All students | Time 1 | 1179 | 4.80 | 1.30 | |
All students | Time 2 | 1244 | 4.74 | 1.36 | 0.05 |
Non-transfer | Time 1 | 1037 | 4.83 | 1.29 | |
Transfer | Time 1 | 107 | 4.58 | 1.37 | −0.20 |
Non-transfer | Time 2 | 1096 | 4.78 | 1.35 | |
Transfer | Time 2 | 113 | 4.37 | 1.32 | −0.31 |
Female | Time 1 | 759 | 4.67 | 1.35 | |
Male | Time 1 | 420 | 5.04 | 1.18 | 0.28 |
Female | Time 2 | 762 | 4.62 | 1.42 | |
Male | Time 2 | 482 | 4.93 | 1.24 | 0.23 |
White | Time 1 | 475 | 4.73 | 1.28 | |
Hispanic | Time 1 | 261 | 4.77 | 1.31 | 0.03 |
White | Time 2 | 507 | 4.77 | 1.38 | |
Hispanic | Time 2 | 270 | 4.61 | 1.25 | −0.12 |
Table 5 Relatedness scores by administration and student group

Students | Administration | n | Mean | sd | Cohen's d
---|---|---|---|---|---|
All students | Time 1 | 1179 | 5.44 | 1.06 | |
All students | Time 2 | 1244 | 3.48 | 1.38 | 1.59 |
Non-transfer | Time 1 | 1037 | 5.44 | 1.04 | |
Transfer | Time 1 | 107 | 5.44 | 1.18 | 0.00 |
Non-transfer | Time 2 | 1096 | 3.46 | 1.37 | |
Transfer | Time 2 | 113 | 3.56 | 1.46 | 0.08 |
Female | Time 1 | 759 | 5.43 | 1.09 | |
Male | Time 1 | 420 | 5.46 | 0.99 | 0.02 |
Female | Time 2 | 762 | 3.37 | 1.41 | |
Male | Time 2 | 482 | 3.64 | 1.33 | 0.19 |
White | Time 1 | 475 | 5.42 | 1.07 | |
Hispanic | Time 1 | 261 | 5.40 | 1.04 | −0.02 |
White | Time 2 | 507 | 3.38 | 1.32 | |
Hispanic | Time 2 | 270 | 3.40 | 1.33 | 0.01 |
For all students, we see there is a drop in interest (Table 2), perceived competence (Table 4), and relatedness (Table 5) as the semester progresses. By Cohen's d, the drop in interest is small, the drop in perceived competence is trivial (<0.2), and the drop in relatedness is large. Of note is that while the drop in interest is small, it does cross the threshold of going from a positive score (4.18) to a negative score (3.74) considering the neutral score of 4. This suggests that while students started with a generally positive interest in activities within their PLTL sessions, this was not maintained throughout the semester. Additionally, the relatedness score experienced a similar but much larger drop across the two administrations from 5.44 to 3.48. This finding suggests that across the two time points, students went from a generally positive desire to form and maintain relationships with their peers to a generally negative desire. In contrast, while perceived competence dropped trivially between the administrations, it maintained a consistently positive value (4.80 to 4.74) suggesting students consistently felt positive about their ability to succeed in doing the activities within their PLTL sessions. While many of the factor scores dropped as the semester progressed, we observed a slight rise in autonomy support but with a trivial effect size (Table 3). Both coarse averages of autonomy support at the two time points (5.56 and 5.64) were positive relative to the neutral value of 4 suggesting students consistently felt their peer leaders supported them in approaching the activities within their PLTL sessions as the students desired.
When comparing various demographic categories, we see that the differences often have quite small effect sizes. In fact, no comparison for interest, autonomy support, or relatedness for either administration reaches the traditional threshold for a small effect size of 0.2 when looking at Cohen's d. In contrast, we see some small effect size differences when comparing the perceived competence of non-transfer and transfer students and of female and male students. For the comparison between non-transfer and transfer students, transfer students start with a lower perceived competence, and this gap increases as the semester progresses. For female and male students, male students start with a higher perceived competence relative to their peers, though this gap shrinks slightly between the two time points. In the cases where there is a small effect size difference in perceived competence, the difference does not represent a change from positive to negative perceived competence.
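The reported effect sizes are consistent with Cohen's d computed from a pooled standard deviation; as a check (our sketch, assuming that formula), the all-students interest comparison from Table 2 reproduces the reported value:

```python
from math import sqrt

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d using the pooled standard deviation of the two groups."""
    pooled = sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    return (mean1 - mean2) / pooled

# All-students interest, time 1 vs. time 2 (summary statistics from Table 2).
d = cohens_d(4.18, 1.40, 1179, 3.74, 1.52, 1244)
print(round(d, 2))  # 0.30
```

Under this convention, a positive d indicates a drop from time 1 to time 2, matching the sign pattern in the tables.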
When analyzing student interest using multiple regression (Model 1 and Model 2) and a social influence model (Model 3), we first wanted to see evidence that BNT is a predictive framework, followed by seeing how each model added to our understanding of the data. The summary of the results from all of these models is shown in Table 6. From all three models, we see predictive power in most of our measured factors in predicting student interest. Specifically, initial interest, perceived competence, and relatedness are significant and positive predictors of interest. One distinction between Model 1 and the others is that autonomy support falls below the conventional levels of statistical significance when using refined factor scores from the measurement model instead of the coarse score used in Model 1.
Table 6 Results from the multiple regression and social influence models predicting interest (time 2)

Model | Term | B | Std error | Z value | Sig | Fit
---|---|---|---|---|---|---
a p < 0.05. | ||||||
1 | (Intercept) | −1.41 | 0.32 | −4.425 | <0.01a | R 2 = 0.6807 |
Interest (t1) | 0.26 | 0.04 | 6.114 | <0.01a | ||
Autonomy support (t2) | 0.20 | 0.07 | 3.071 | <0.01a | ||
Perceived competence (t2) | 0.25 | 0.05 | 5.486 | <0.01a | ||
Relatedness (t2) | 0.50 | 0.05 | 10.486 | <0.01a | ||
2 | (Intercept) | −0.10 | 0.05 | −2.35 | 0.02a | R 2 = 0.8017 |
Interest (t1) | 0.16 | 0.04 | 4.45 | <0.01a | ||
Autonomy support (t2) | 0.07 | 0.06 | 1.22 | 0.22 | ||
Perceived competence (t2) | 0.27 | 0.04 | 6.52 | <0.01a | ||
Relatedness (t2) | 0.80 | 0.05 | 16.13 | <0.01a | ||
3 | (Intercept) | 0.03 | 0.04 | 0.768 | 0.44 | R 2 = 0.8130 |
Interest (t1) | 0.15 | 0.03 | 4.430 | <0.01a | ||
Autonomy support (t2) | 0.07 | 0.05 | 1.371 | 0.17 | ||
Perceived competence (t2) | 0.26 | 0.04 | 6.534 | <0.01a | ||
Relatedness (t2) | 0.82 | 0.05 | 16.834 | <0.01a | ||
Network effects (ρ1) | −0.05 | 0.02 | −2.163 | 0.03a | ||
Network disturbances (ρ2) | 0.02 | 0.03 | 0.451 | 0.65 |
Looking at Model 3, the social influence model, the network effects term is a significant, negative predictor while the network disturbances term is not significant. A significant, negative network effects term supports the claim that students affect each other, with the effect generally widening the range of interest within a group as members interact.
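The social influence specification in Model 3 belongs to the family of network autocorrelation models described by Leenders (2002). In general form (a sketch using common notation from that literature, not necessarily the exact specification fit here, with W a row-normalized weight matrix built from the interaction data):

```latex
y = \rho_1 W y + X\beta + \epsilon, \qquad
\epsilon = \rho_2 W \epsilon + \nu, \qquad
\nu \sim N(0, \sigma^2 I)
```

Here ρ1 is the network effects term, capturing influence through peers' outcomes, and ρ2 is the network disturbances term, capturing correlated unobserved shocks among connected students.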
Finally, we wanted to understand how the social influence was affected by the amount of interaction between students. To do this, we ran all the terms of Model 3 on the set of network objects produced by varying the threshold for the number of interactions required for a tie to be included in the analysis. Fig. 3 shows how the network effect term varied with this threshold. At low threshold values, the network effect term in the model is not significant (illustrated in the figure by the 95% confidence interval overlapping zero). This behavior changes once the threshold of 8 interactions is reached: from there, we see the significant, negative network effect term observed in the earlier analysis of the network toward the end of the semester. While the 95% confidence interval widens, partly because fewer students are included in the analysis, the magnitude of the effect remains large enough to be significant. This finding suggests that while social influence occurs between students, some threshold of interaction must be reached before the influence is statistically observable.
Fig. 3 Network effect term from Model 3 as a function of the required amount of interaction between students. Bars around each observed value represent 95% confidence intervals.
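The thresholding procedure behind Fig. 3 can be sketched as follows: given a matrix of pairwise interaction counts, ties below the threshold are dropped and the remaining ties are row-normalized into a weight matrix before the influence model is refit. This is a simplified sketch with our own variable names, not the study's code:

```python
import numpy as np

def threshold_network(counts, k):
    """Keep only ties with at least k recorded interactions,
    then row-normalize to form the weight matrix W."""
    adj = (counts >= k).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-ties
    row_sums = adj.sum(axis=1, keepdims=True)
    # isolated rows (no ties at this threshold) get all-zero weights
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(row_sums > 0, adj / row_sums, 0.0)

# Toy 3-student group: counts of shared PLTL sessions (hypothetical)
counts = np.array([[0, 9, 4],
                   [9, 0, 10],
                   [4, 10, 0]])
W = threshold_network(counts, k=8)  # only ties with >= 8 interactions remain
```

At k = 8 the tie between the first and third students drops out, which mirrors how raising the threshold in Fig. 3 both prunes ties and removes students from the analysis.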
Compared to the drop in interest, the drop in relatedness was quite stark, with a large effect size. When interpreting this result, it is important to remember the context in which these data were collected: a semester of PLTL that was intended to be in person but transitioned into an entirely virtual experience. CER studies of this transition reported findings such as the loss of peer communication networks (Jeffery and Bauer, 2020) and a general weakening of student engagement, due in part to students working in settings that were not conducive to learning (Wu and Teets, 2021). Based on these previous findings, we believe the nature of the online interaction during this semester could play a strong role in the large drop in students' feelings of building and maintaining relationships with their peers in their PLTL groups. The online delivery also limited students' ability to interact directly with their peer leader. As part of the design of PLTL is using peers because they can relate to students more directly than professors can (Gosser and Roth, 1998), this lack of interaction could also have contributed to the overall drop in students' feelings of relatedness.
After establishing the baseline for all students, we looked at our groups of interest. We chose three comparisons based on our concerns with potential inequities and the availability of enough data for a meaningful comparison: (1) transfer students (Wesemann, 2005; Stitzel and Raje, 2021), (2) female students (Liu et al., 2017), and (3) Hispanic students (Mason and Mittag, 2001). As with the comparisons between time points, we based these comparisons on effect size rather than significance testing alone, since commenting only on significance can lead us either to dwell on trivial differences that happen to cross a significance threshold or to ignore large differences that happen not to. We saw no difference among the demographic comparisons at either time point for interest, autonomy support, or relatedness that met the traditional threshold for a small effect size. However, we did see a small effect size difference in perceived competence at both time points for transfer versus non-transfer students and for female versus male students. In particular, the perceived competence gap between transfer and non-transfer students widened during the semester while remaining a small effect size. Transfer students were generally isolated in their peer-led groups, so social comparisons would be more likely to occur against non-transfer peers. In contrast, the gap between female and male students shrank while also staying in the small effect size category. Female and male students were typically combined in most PLTL groups, as peer leaders were explicitly instructed to avoid groups homogeneous on this demographic, giving these groups more opportunities throughout the semester to engage in social comparisons with each other.
However, the evidence in our models for autonomy support predicting interest is mixed: autonomy support is a significant positive predictor of interest in Model 1 but not in Model 2 or Model 3. The difference between Model 1 and the others is the method used to calculate factor scores. Model 1 used 'coarse' factor scores, computed as the unweighted average of the items within each factor. This method is simple and usable even when there are not enough students to fit a reliable factor model. In contrast, Model 2 and Model 3 used 'refined' factor scores computed from the measurement model, which can account for features such as items contributing unequally to the factor and measurement error. One possible explanation for the difference in significance, then, is that the significance of autonomy support in Model 1 is an artifact of the coarse factor score treating each item in the scale as an equal contributor. Regardless of this variation between models, the way this course is structured relative to others might also affect the particular value of autonomy in this setting. A previous study directly showed a relationship between autonomy support and success in organic chemistry (Black and Deci, 2000), and some contexts, such as laboratories designed around problem-based learning, suggest that giving students many options in how to approach a task is helpful to motivation (Wellhöfer and Lühken, 2022). In our setting, by contrast, students may strongly believe that their peer leaders will let them take any path they want to find an answer, yet still believe there is a 'correct' way to answer a particular question that should be found. So while autonomy might be supported in this setting, students perceive the goal as finding that 'correct' method and do not necessarily see themselves as acting autonomously.
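The coarse versus refined distinction can be illustrated numerically: a coarse score is the unweighted item mean, while a refined score weights items by the factor model. The sketch below uses made-up loadings and responses, and shows only the weighting idea; true refined scores from a measurement model (e.g. regression-method scores in lavaan) also account for measurement error and item covariances:

```python
import numpy as np

# Hypothetical responses to a 4-item scale (1-7 Likert), not study data
items = np.array([6, 5, 7, 4], dtype=float)

# Coarse score: unweighted average of the items
coarse = items.mean()

# Refined-style score: items weighted by (made-up) loadings,
# so stronger indicators contribute more to the score
loadings = np.array([0.8, 0.7, 0.6, 0.4])
weights = loadings / loadings.sum()
refined_style = float(weights @ items)
```

When loadings are unequal, the two scores diverge, which is the mechanism proposed above for why autonomy support is significant only under coarse scoring.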
Specifically, Model 3 supports our data exhibiting a significant, negative network effect process. As the network effect term generally describes a process in which an individual is influenced by the group average (Leenders, 2002), a negative value suggests movement away from that average. While a full explanation of how this happens is more nuanced because of the autoregressive nature of the model, the general observation holds that individuals become more polarized under a negative network effect. Based on social comparison theory, individuals look to others to better understand themselves (Festinger, 1954). For some affective outcomes in educational settings, it is common to see students influenced by a contrast effect (Felson and Reed, 1986; Dijkstra et al., 2008). The observed negative network effect is consistent with a contrast effect, which we believe is likely at play here.
To better illustrate the concept of network effects within our data, we selected a group to serve as an exemplar. The students in this group are given the pseudonyms Alice, Beth, and Charlie, and their interest throughout the semester is summarized in Table 7. In this table, the refined interest scores are scaled so that 0 represents the average interest for all students, negative values represent below-average interest, and positive values represent above-average interest. The Xβ and ρWy + Xβ columns predict the refined interest score as was done in the modeling steps. At the beginning of the semester, before there was much opportunity for influence to occur, Alice and Beth both reported high levels of interest while Charlie reported a low level of interest. By the end of the semester, Alice and Beth had largely maintained their interest while Charlie had grown more disinterested, expanding the range of interest for this group. Looking at the other terms in the multiple regression model, Charlie's drop in interest goes beyond what would be expected based on the other variables. For someone with Charlie's initial interest, perception of autonomy support, perceived competence, and feelings of relatedness, we would expect a final interest of around −0.440 in isolation instead of the observed −1.256. The negative values indicate below-average interest, with the observed value even farther below average than the modeled one. Adding the network effect term brings Charlie's expected interest to −0.526, which is still above the observed value but reduces the residuals in the model and improves the fit.
Student | Coarse interest (t1) | Coarse interest (t2) | Refined interest (t1) | Refined interest (t2) | Model 3 predicted interest (without social influence) | Model 3 predicted interest (with network effects) |
---|---|---|---|---|---|---|
Alice | 6.2 | 6.4 | 2.080 | 2.067 | 1.588 | 1.576 |
Beth | 6.2 | 6.0 | 1.965 | 1.815 | 1.674 | 1.656 |
Charlie | 3.2 | 2.6 | −0.613 | −1.256 | −0.440 | −0.526 |
Range | 3.0 | 3.8 | 2.693 | 3.323 | 2.114 | 2.183 |
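Charlie's network-adjusted prediction in Table 7 can be approximately reproduced from the rounded values above: the ρ1Wy term adds the (negative) network effect of the peers' mean interest to the covariate-only prediction Xβ. Because the published coefficients are rounded, this arithmetic only approximates the reported −0.526:

```python
# Covariate-only prediction for Charlie (Table 7, "without social influence")
xb = -0.440
# Peers' refined interest at t2 (Alice, Beth), averaged as a row-normalized
# W would do for a three-person group
peer_mean = (2.067 + 1.815) / 2
rho1 = -0.05  # network effects term from Table 6 (rounded)

pred = xb + rho1 * peer_mean  # approximately -0.54, near the reported -0.526
```

Because ρ1 is negative and Charlie's peers sit well above average, the network term pushes his prediction further below average, illustrating the contrast effect described above.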
In interpreting our findings of social influence, we also considered that they were observed in a largely online environment. While we lack comparable data from previous semesters that would let us directly address the potential effects of COVID-19 or online learning, we can make some reasonable connections to other reported findings. For example, Jeffery and Bauer (2020) found that while lectures may not have changed much for students, the ability to focus on online content was reduced relative to in-person instruction, which could also be at play for the students we analyzed. They additionally reported that the modal number of exchanges within a student's peer network dropped by around 90% during this time. Given our finding that the observed social influence depended on the amount of interaction we could quantify from attendance data, a substantial change in how much students interact would be expected to affect how much they influence each other. Furthermore, an analysis of collaborative learning in a chemistry setting (Gemmel et al., 2020) showed that, compared to in-person learning, students online often took longer to solve problems because of challenges in communication, and groups that included students with cameras off tended to work less collaboratively. This lack of collaboration among group members could have affected the type and amount of social influence observed.
Another caution in interpreting these results is that we do not have a causal research design. Any findings should be considered correlative, and any implications of a causal direction are derived from theory and not directly from the data.
Additionally, while our online instruments allowed us to efficiently collect data from many students, we lack some of the nuance available from qualitative data collected by more time-intensive means, such as interviews with students. Our particular measure of motivation also lacks some of the nuance other researchers have captured by distinguishing different types of motivation (e.g. intrinsic and extrinsic) and different elements within each type. We chose our measure because it best fit our research questions and setting, but that decision limits our ability to say which specific type of motivation is being socially influenced, so our claims about motivation are currently quite broad. Another consideration in interpreting scores from our instrument is that the fit statistics of our CFA did not quite reach conventionally accepted guidelines. While we feel the level of fit we observed is adequate for the claims made here, future research using this instrument should consider that continued development work is probably necessary.
Another limitation in interpreting our results comes from challenges intrinsic to any SNA. As SNA requires a high response rate for reliable analysis (Grosser and Borgatti, 2013), we are limited in our ability to speak to social influence in any group where we lack data, which is part of why our models were run on a relatively small portion of the total student body. While we presented evidence that there was no meaningful difference between the students we analyzed and those we did not (Appendix F), we cannot rule out the possibility that our results apply only to the students included in the model and not to those who were excluded.
Another potential implication concerns the particular applications of a concept an instructor chooses to highlight in class. If all examples used in instruction come from one topic (e.g. pharmaceuticals), then students interested in that topic will consistently grow in interest, and this growth may have a continued negative impact on their peers through social comparison. We therefore encourage instructors to use a variety of applications to appeal to as many students as possible. For example, many concepts in general chemistry have been connected to activities such as cooking (Miles and Bachman, 2009; Howell et al., 2021) or analyzing the pigments used in paintings (Nivens et al., 2010; Vyhnal et al., 2020). By using a variety of examples, instructors limit the possibility that students will contrast themselves with their peers in a way that consistently reduces their interest in chemistry.
LCQ items:
1. I feel that my peer leader provides me choices and options.
2. I feel understood by my peer leader.
3. I am able to be open with my peer leader during class.
4. My peer leader conveyed confidence in my ability to do well in this course.
5. I feel that my peer leader accepts me.
6. My peer leader made sure I really understood the goals of the course and what I need to do.
7. My peer leader encouraged me to ask questions.
8. I feel a lot of trust in my peer leader.
9. My peer leader answers my questions fully and carefully.
10. My peer leader listens to how I would like to do things.
11. My peer leader handles people's emotions very well.
12. I feel that my peer leader cares about me as a person.
13. I don’t feel very good about the way my peer leader talks to me.
14. My peer leader tries to understand how I see things before suggesting a new way to do things.
15. I feel able to share my feelings with my peer leader.
Students were then presented with items adapted from the Intrinsic Motivation Inventory (IMI) to characterize interest, perceived competence, and relatedness (McAuley et al., 1989). Items were answered on a 7-point scale ranging from “Not true at all” to “Very true”. In adapting items from the original source, “this activity” was replaced with “Peer Leading Activities” and subject/verb agreement was adjusted as necessary. For items characterizing relatedness, “this person” became “my peer leading group” to match the context. Items 2, 3, 6, 11, 18, 20, and 21 are reverse phrased and were reverse coded for ease of interpretation.
The items break down by intended factor in the following way:
• Interest: 1, 4, 9, 11, 13, 17, 20
• Perceived competence: 3, 5, 7, 10, 15, 19
• Relatedness: 2, 6, 8, 12, 14, 16, 18, 21
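Reverse coding on a 7-point scale is conventionally computed as (scale maximum + 1) minus the response, so strong agreement with a negatively phrased item maps to the low end of the recoded scale. A minimal sketch of this standard transformation:

```python
def reverse_code(response, scale_max=7):
    """Reverse code a Likert response on a 1..scale_max scale."""
    return scale_max + 1 - response

# e.g. for item 20, "I thought Peer Leading Activities were boring",
# a response of 2 becomes 6 after reverse coding, so higher recoded
# values consistently indicate more interest
```

After this recoding, all items within a factor point in the same direction, which is what "for ease of interpretation" refers to above.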
IMI items:
1. I would describe Peer Leading Activities as very interesting.
2. I don't feel like I could really trust my peer leading group.
3. The Peer Leading Activities were ones that I couldn't do very well.
4. I thought Peer Leading Activities were quite enjoyable.
5. After working at Peer Leading Activities for awhile, I felt pretty competent.
6. I felt really distant to my peer leading group.
7. I think I am pretty good at Peer Leading Activities.
8. It is likely that my peer leading group and I could become friends if we interacted a lot.
9. Peer Leading Activities were fun to do.
10. I am satisfied with my performance at Peer Leading Activities.
11. Peer Leading Activities did not hold my attention at all.
12. I feel close to my peer leading group.
13. While I was doing Peer Leading Activities, I was thinking about how much I enjoyed them.
14. I'd like a chance to interact with my peer leading group more often.
15. I think I did pretty well at Peer Leading Activities, compared to other students.
16. I felt like I could really trust my peer leading group.
17. I enjoyed doing Peer Leading Activities very much.
18. I really doubt that my peer leading group and I would ever be friends.
19. I was pretty skilled at Peer Leading Activities.
20. I thought Peer Leading Activities were boring.
21. I'd really prefer not to interact with my peer leading group in the future.
Group | n | t1 | t2 | t1 and t2
---|---|---|---|---
All students | 1988 | 1179 (59%) | 1244 (63%) | 870 (44%) |
Female | 1167 | 759 (65%) | 762 (65%) | 569 (49%) |
Male | 821 | 420 (51%) | 482 (59%) | 301 (37%) |
White | 784 | 475 (61%) | 507 (65%) | 353 (45%) |
Hispanic | 428 | 261 (61%) | 270 (63%) | 201 (47%) |
First time in college | 1711 | 1037 (61%) | 1096 (64%) | 772 (45%) |
Transfer students | 220 | 107 (49%) | 113 (51%) | 75 (34%) |
Item | Mean | SD | Skewness | Kurtosis |
---|---|---|---|---|
lcq.01 | 5.60 | 1.30 | −1.22 | 4.48 |
lcq.02 | 5.67 | 1.23 | −1.11 | 4.26 |
lcq.03 | 5.60 | 1.30 | −1.08 | 4.00 |
lcq.04 | 5.71 | 1.24 | −1.15 | 4.36 |
lcq.05 | 5.96 | 1.08 | −1.24 | 4.82 |
lcq.06 | 5.80 | 1.22 | −1.29 | 4.76 |
lcq.07 | 6.02 | 1.12 | −1.46 | 5.46 |
lcq.08 | 5.52 | 1.31 | −0.88 | 3.57 |
lcq.09 | 5.99 | 1.09 | −1.43 | 5.66 |
lcq.10 | 5.41 | 1.32 | −0.64 | 2.87 |
lcq.11 | 5.47 | 1.22 | −0.47 | 2.61 |
lcq.12 | 5.35 | 1.27 | −0.46 | 2.75 |
lcq.13.r | 5.93 | 1.50 | −1.75 | 5.36 |
lcq.14 | 5.39 | 1.23 | −0.50 | 2.82 |
lcq.15 | 4.94 | 1.43 | −0.37 | 2.70 |
imi.01.int | 4.44 | 1.69 | −0.26 | 2.50 |
imi.02.rel.r | 5.53 | 1.61 | −1.07 | 3.46 |
imi.03.pc.r | 5.35 | 1.63 | −0.83 | 2.94 |
imi.04.int | 4.31 | 1.67 | −0.15 | 2.51 |
imi.05.pc | 5.04 | 1.54 | −0.48 | 2.75 |
imi.06.rel.r | 4.68 | 1.88 | −0.42 | 2.17 |
imi.07.pc | 4.88 | 1.56 | −0.32 | 2.60 |
imi.08.rel | 4.11 | 1.72 | −0.01 | 2.37 |
imi.09.int | 4.10 | 1.68 | 0.03 | 2.44 |
imi.10.pc | 5.05 | 1.61 | −0.56 | 2.74 |
imi.11.int.r | 5.04 | 1.72 | −0.69 | 2.68 |
imi.12.rel | 2.91 | 1.66 | 0.64 | 2.76 |
imi.13.int | 3.00 | 1.73 | 0.62 | 2.67 |
imi.14.rel | 3.72 | 1.76 | 0.19 | 2.34 |
imi.15.pc | 4.35 | 1.60 | −0.11 | 2.61 |
imi.16.rel | 4.03 | 1.65 | 0.03 | 2.50 |
imi.17.int | 3.90 | 1.72 | 0.11 | 2.38 |
imi.18.rel.r | 4.68 | 1.76 | −0.44 | 2.39 |
imi.19.pc | 4.54 | 1.56 | −0.14 | 2.63 |
imi.20.int.r | 4.62 | 1.77 | −0.42 | 2.40 |
imi.21.rel.r | 5.34 | 1.71 | −0.98 | 3.22 |
Additionally, we checked the full inter-item correlation table (630 values), and nothing was flagged as problematic. We also checked the frequency with which each item was left blank by a student and found that no item had more than 0.5% missing data.
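The 630 values correspond to every unordered pair of the 36 administered items (15 LCQ plus 21 IMI):

```python
import math

n_items = 15 + 21                # LCQ items plus IMI items
n_pairs = math.comb(n_items, 2)  # unique inter-item correlations: 630
```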
Results from running a 4-factor EFA on all items are presented in Table 10. In this analysis, reverse phrased items seemed to load onto a single factor together regardless of each item's intended factor; this behavior was also observed in a 3-factor solution and at both administrations. For ease of interpreting the instrument scores, we eliminated the reverse phrased items from our analysis.
Item | Autonomy support | Combined interest and relatedness | Perceived competence | Reverse phrased items
---|---|---|---|---
lcq.01 | 0.69 | |||
lcq.02 | 0.79 | |||
lcq.03 | 0.77 | |||
lcq.04 | 0.79 | |||
lcq.05 | 0.81 | |||
lcq.06 | 0.80 | |||
lcq.07 | 0.76 | |||
lcq.08 | 0.83 | |||
lcq.09 | 0.77 | |||
lcq.10 | 0.79 | |||
lcq.11 | 0.75 | |||
lcq.12 | 0.77 | |||
lcq.14 | 0.71 | |||
lcq.15 | 0.68 | 0.34 | ||
imi.01.int | 0.38 | 0.56 | ||
imi.04.int | 0.60 | 0.41 | ||
imi.09.int | 0.70 | 0.40 | ||
imi.13.int | 0.75 | 0.27 | ||
imi.17.int | 0.74 | 0.34 | ||
imi.08.rel | 0.60 | |||
imi.12.rel | 0.74 | |||
imi.14.rel | 0.66 | |||
imi.16.rel | 0.34 | 0.60 | ||
imi.05.pc | 0.35 | 0.31 | 0.55 | |
imi.07.pc | 0.83 | |||
imi.10.pc | 0.73 | |||
imi.15.pc | 0.71 | |||
imi.19.pc | 0.79 | |||
lcq.13.r | 0.37 | |||
imi.11.int.r | 0.67 | |||
imi.20.int.r | 0.32 | 0.62 | ||
imi.02.rel.r | 0.67 | |||
imi.06.rel.r | 0.62 | |||
imi.18.rel.r | 0.65 | |||
imi.21.rel.r | 0.69 | |||
imi.03.pc.r | 0.43 | 0.41 |
Factor structures were again calculated for both candidate numbers of factors. The item loadings are summarized in Table 11 for a 3-factor solution and Table 12 for a 4-factor solution. From these results, together with the considerations in determining the number of potential factors, we could reasonably propose two different factor structures. One was a 4-factor solution that directly matches the intended design of the instrument (autonomy support, interest, perceived competence, and relatedness). However, we could not rule out a 3-factor structure that divided the items into autonomy support, perceived competence, and a factor combining the interest and relatedness items. Therefore, we used CFA to compare these models and find the one that better approximated our data.
Item | Autonomy support | Perceived competence | Combined relatedness and interest
---|---|---|---
lcq.01 | 0.64 | ||
lcq.02 | 0.79 | ||
lcq.03 | 0.76 | ||
lcq.04 | 0.74 | ||
lcq.05 | 0.74 | ||
lcq.06 | 0.78 | ||
lcq.07 | 0.71 | ||
lcq.08 | 0.82 | ||
lcq.09 | 0.74 | ||
lcq.10 | 0.75 | ||
lcq.11 | 0.74 | ||
lcq.12 | 0.71 | ||
lcq.14 | 0.67 | ||
lcq.15 | 0.70 | ||
imi.05.pc | 0.34 | 0.59 | 0.32 |
imi.07.pc | 0.81 | ||
imi.10.pc | 0.74 | ||
imi.15.pc | 0.75 | ||
imi.19.pc | 0.81 | ||
imi.08.rel | 0.58 | ||
imi.12.rel | 0.70 | ||
imi.14.rel | 0.61 | ||
imi.16.rel | 0.35 | 0.60 | |
imi.01.int | 0.37 | 0.57 | |
imi.04.int | 0.43 | 0.65 | |
imi.09.int | 0.38 | 0.73 | |
imi.13.int | 0.73 | ||
imi.17.int | 0.36 | 0.75 |
Item | Autonomy support | Perceived competence | Relatedness | Interest
---|---|---|---|---
lcq.01 | 0.64 | |||
lcq.02 | 0.79 | |||
lcq.03 | 0.76 | |||
lcq.04 | 0.74 | |||
lcq.05 | 0.74 | |||
lcq.06 | 0.77 | |||
lcq.07 | 0.70 | |||
lcq.08 | 0.81 | |||
lcq.09 | 0.74 | |||
lcq.10 | 0.75 | |||
lcq.11 | 0.74 | |||
lcq.12 | 0.72 | |||
lcq.14 | 0.67 | |||
lcq.15 | 0.71 | 0.33 | ||
imi.05.pc | 0.34 | 0.56 | 0.33 | |
imi.07.pc | 0.81 | |||
imi.10.pc | 0.73 | |||
imi.15.pc | 0.80 | |||
imi.19.pc | 0.84 | |||
imi.08.rel | 0.61 | |||
imi.12.rel | 0.73 | |||
imi.14.rel | 0.63 | |||
imi.16.rel | 0.35 | 0.62 | ||
imi.01.int | 0.33 | 0.53 | 0.60 | |
imi.04.int | 0.36 | 0.36 | 0.59 | |
imi.09.int | 0.33 | 0.57 | 0.53 | |
imi.13.int | 0.37 | 0.41 | 0.67 | |
imi.17.int | 0.62 | 0.38 |
Number of factors | χ2 | df | p-Value | SRMR | RMSEA | CFI | Δχ2 | Δdf | p-Value | ΔSRMR | ΔRMSEA | ΔCFI
---|---|---|---|---|---|---|---|---|---|---|---|---|
4 | 1699 | 344 | <0.001 | 0.058 | 0.082 | 0.888 | — | — | — | — | — | — |
3 | 1986 | 347 | <0.001 | 0.064 | 0.089 | 0.865 | 289 | 3 | <0.001 | 0.006 | 0.007 | −0.023 |
Number of factors | χ2 | df | p-Value | SRMR | RMSEA | CFI | Δχ2 | Δdf | p-Value | ΔSRMR | ΔRMSEA | ΔCFI
---|---|---|---|---|---|---|---|---|---|---|---|---|
4 | 1692 | 344 | <0.001 | 0.064 | 0.079 | 0.896 | — | — | — | — | — | — |
3 | 1919 | 347 | <0.001 | 0.067 | 0.085 | 0.878 | 226 | 3 | <0.001 | 0.003 | 0.006 | −0.018 |
At both time points, we see evidence of a substantial improvement in fit from incorporating a fourth factor into the model. While some improvement is expected from adding any additional parameters, it is noteworthy that the improvement in χ2 is statistically significant. Additionally, RMSEA, a parsimony-adjusted fit index, improves even with the less parsimonious model. Therefore, considering these data and the theoretical design of the instrument, we believe the 4-factor solution provides the better approximation of our data.
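The significance of the χ2 difference can be checked against the critical value of the χ2 distribution with Δdf degrees of freedom; even at α = 0.05, the critical value for 3 degrees of freedom (about 7.815) is far below the reported differences:

```python
# Reported chi-square difference between nested CFA models (time 1):
delta_chi2, delta_df = 289, 3

# Well-known critical value of the chi-square distribution
# at alpha = 0.05 with df = 3
critical_value_05 = 7.815

# The 3-factor model fits significantly worse than the 4-factor model
significantly_worse = delta_chi2 > critical_value_05
```

The same comparison at time 2 (Δχ2 = 226, Δdf = 3) leads to the same conclusion.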
The 4-factor solution calculated at the second administration is illustrated in Fig. 4. Among the model fit indices for the 4-factor solution, χ2 indicates significant misfit. SRMR falls within the generally accepted guidelines for good fit (<0.08; Hu and Bentler, 1999), whereas RMSEA and CFI fall outside the traditional guidelines (RMSEA < 0.06; CFI > 0.95). Some sources suggest that RMSEA < 0.08 (Bandalos and Finney, 2018) and CFI > 0.9 (McDonald and Ho, 2002) can be considered acceptable levels of fit. For interpretability, we elected to retain the 4-factor model.
Fig. 4 Standardized CFA showing the results of the 4-factor solution calculated from the second administration of our instrument.
At this point in the CFA, we chose to explore alternate measurement models that utilized more of our collected data. First, we ran a CFA that included all items assigned to their intended theoretical scales. Next, we ran a CFA incorporating a negative 'methods' factor, a technique that has been shown to improve model fit in cases similar to ours (Naibert and Barbera, 2022). Finally, we ran a bifactor model, which might also have provided a way to better understand our data. The fit statistics for these three models, along with those for the model shown in Fig. 4, are given in Table 15. Since none of these models represented a meaningful improvement in fit over the model in Fig. 4, we retained the four-factor model without reverse phrased items, as it is the most straightforward to interpret.
Model | χ2 | df | SRMR | RMSEA | CFI
---|---|---|---|---|---
Model used (reverse items dropped) | 1692 | 344 | 0.064 | 0.079 | 0.896 |
All items | 3251 | 588 | 0.075 | 0.085 | 0.830 |
With negative ‘method’ factor | 2392 | 577 | 0.065 | 0.071 | 0.884 |
Bifactor | 2581 | 558 | 0.098 | 0.076 | 0.871 |
R code to calculate measurement invariance was adapted from Rocabado et al. (2020) and used the lavaan library (Rosseel, 2012). According to the guidelines for invariance laid out by Chen (2007), it is acceptable to move up the steps of measurement invariance testing as long as the fit indices do not worsen beyond certain thresholds. Generally, ΔSRMR > 0.03, ΔRMSEA > 0.015, or ΔCFI < −0.01 establishes a lack of invariance; for establishing scalar or strict invariance, the ΔSRMR threshold is lowered to 0.01. Chen also laid out stricter guidelines (ΔSRMR > 0.025, ΔRMSEA > 0.01, and ΔCFI < −0.005 for metric invariance, with ΔSRMR > 0.005 for scalar and strict invariance) for when the groups are small or unevenly distributed. As our sample had a limited number of transfer and Hispanic students (n < 300), we applied the stricter guidelines to those models.
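The decision rules above can be expressed as a small check. This sketch encodes only Chen's standard thresholds (the stricter small-sample variants would swap in the tighter cutoffs); the function name and structure are ours, not from the study's R code:

```python
def invariance_holds(d_srmr, d_rmsea, d_cfi, scalar_or_strict=False):
    """Return True if the changes in fit indices stay within Chen's (2007)
    standard thresholds for accepting the next level of invariance."""
    srmr_limit = 0.01 if scalar_or_strict else 0.03
    return (d_srmr <= srmr_limit) and (d_rmsea <= 0.015) and (d_cfi >= -0.01)

# Metric step from Table 16: ΔSRMR = 0.003, ΔRMSEA = −0.001, ΔCFI = −0.001
ok = invariance_holds(0.003, -0.001, -0.001)  # within thresholds, so proceed
```

Applying this check step by step, as the tables below do, is what justifies moving from configural through strict invariance.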
We ran 4 sets of measurement invariance tests to determine if we could fairly compare our data. These comparisons considered whether measurement invariance occurred between the two separate administrations (Table 16), non-transfer and transfer students (Table 17), female and male students (Table 18), and White and Hispanic students (Table 19). In all four cases, we accepted that we met standards for strict invariance allowing us to use factor scores to directly compare the investigated groups.
Step | Testing level | χ2 | df | p-Value | SRMR | RMSEA | CFI | Δχ2 | Δdf | p-Value | ΔSRMR | ΔRMSEA | ΔCFI
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Baseline (time 1) | 2761 | 344 | <0.001 | 0.056 | 0.077 | 0.899 | — | — | — | — | — | — |
0 | Baseline (time 2) | 2705 | 344 | <0.001 | 0.060 | 0.074 | 0.908 | — | — | — | — | — | — |
1 | Configural | 5466 | 688 | <0.001 | 0.058 | 0.076 | 0.904 | — | — | — | — | — | — |
2 | Metric | 5535 | 712 | <0.001 | 0.061 | 0.075 | 0.903 | 69 | 24 | <0.001 | 0.003 | −0.001 | −0.001 |
3 | Scalar | 5643 | 736 | <0.001 | 0.062 | 0.074 | 0.901 | 107 | 24 | <0.001 | 0.001 | −0.001 | −0.002 |
4 | Strict | 5762 | 764 | <0.001 | 0.062 | 0.073 | 0.899 | 119 | 28 | <0.001 | <0.001 | −0.001 | −0.002 |
Step | Testing level | χ2 | df | p-Value | SRMR | RMSEA | CFI | Δχ2 | Δdf | p-Value | ΔSRMR | ΔRMSEA | ΔCFI
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Baseline (transfer) | 703 | 344 | <0.001 | 0.082 | 0.096 | 0.858 | — | — | — | — | — | — |
0 | Baseline (non-transfer) | 2498 | 344 | <0.001 | 0.061 | 0.074 | 0.908 | — | — | — | — | — | — |
1 | Configural | 3202 | 688 | <0.001 | 0.063 | 0.077 | 0.903 | — | — | — | — | — | — |
2 | Metric | 3234 | 712 | <0.001 | 0.064 | 0.075 | 0.903 | 32 | 24 | <0.001 | 0.001 | −0.002 | <0.001 |
3 | Scalar | 3275 | 736 | <0.001 | 0.064 | 0.074 | 0.902 | 40 | 24 | <0.001 | <0.001 | −0.001 | −0.001 |
4 | Strict | 3323 | 764 | <0.001 | 0.065 | 0.073 | 0.901 | 48 | 28 | <0.001 | 0.001 | −0.001 | −0.001 |
Step | Testing level | χ2 | df | p-Value | SRMR | RMSEA | CFI | Δχ2 | Δdf | p-Value | ΔSRMR | ΔRMSEA | ΔCFI
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Baseline (Female) | 1795 | 344 | <0.001 | 0.058 | 0.074 | 0.913 | — | — | — | — | — | — |
0 | Baseline (Male) | 1374 | 344 | <0.001 | 0.068 | 0.079 | 0.888 | — | — | — | — | — | — |
1 | Configural | 3170 | 688 | <0.001 | 0.062 | 0.076 | 0.904 | — | — | — | — | — | — |
2 | Metric | 3198 | 712 | <0.001 | 0.063 | 0.075 | 0.904 | 28 | 24 | 0.240 | 0.001 | −0.001 | <0.001 |
3 | Scalar | 3264 | 736 | <0.001 | 0.065 | 0.074 | 0.902 | 65 | 24 | <0.001 | 0.002 | −0.001 | −0.002 |
4 | Strict | 3342 | 764 | <0.001 | 0.065 | 0.074 | 0.900 | 78 | 28 | <0.001 | <0.001 | <0.001 | −0.002 |
Step | Testing level | χ2 | df | p-Value | SRMR | RMSEA | CFI | Δχ2 | Δdf | p-Value | ΔSRMR | ΔRMSEA | ΔCFI
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Baseline (White) | 1500 | 344 | <0.001 | 0.069 | 0.081 | 0.897 | — | — | — | — | — | — |
0 | Baseline (Hispanic) | 967 | 344 | <0.001 | 0.071 | 0.082 | 0.878 | — | — | — | — | — | — |
1 | Configural | 2467 | 688 | <0.001 | 0.070 | 0.082 | 0.891 | — | — | — | — | — | — |
2 | Metric | 2497 | 712 | <0.001 | 0.072 | 0.080 | 0.891 | 29 | 24 | 0.209 | 0.002 | −0.002 | <0.001 |
3 | Scalar | 2514 | 736 | <0.001 | 0.072 | 0.079 | 0.892 | 17 | 24 | 0.825 | <0.001 | −0.001 | 0.001 |
4 | Strict | 2624 | 764 | <0.001 | 0.073 | 0.079 | 0.887 | 109 | 28 | <0.001 | 0.001 | <0.001 | −0.005 |
Factor | Modeled | Unmodeled | Difference
---|---|---|---
Interest (t1) | 4.04 | 4.21 | −4% |
Interest (t2) | 3.78 | 3.72 | 2% |
Autonomy support (t1) | 5.44 | 5.60 | −3% |
Autonomy support (t2) | 5.60 | 5.65 | −1% |
Perceived competence (t1) | 4.72 | 4.83 | −2% |
Perceived competence (t2) | 4.76 | 4.73 | 1% |
Relatedness (t1) | 5.30 | 5.49 | −3% |
Relatedness (t2) | 3.54 | 3.46 | 2% |
Factor | Modeled | Unmodeled | Difference
---|---|---|---
Interest (t1) | −0.14 | 0.04 | −0.18 |
Interest (t2) | −0.43 | −0.50 | 0.06 |
Autonomy support (t1) | −0.11 | 0.03 | −0.15 |
Autonomy support (t2) | 0.04 | 0.08 | −0.04 |
Perceived competence (t1) | −0.08 | 0.02 | −0.11 |
Perceived competence (t2) | −0.04 | −0.06 | 0.02 |
Relatedness (t1) | −0.09 | 0.03 | −0.11 |
Relatedness (t2) | −0.37 | −0.43 | 0.06 |
Group | Modeled | Unmodeled | Difference
---|---|---|---
Female | 67% | 60% | 7% |
Male | 33% | 40% | −7% |
White | 41% | 40% | 1% |
Hispanic | 23% | 21% | 2% |
Non-transfer | 91% | 87% | 4% |
Transfer | 7% | 10% | −3% |
This journal is © The Royal Society of Chemistry 2023 |