Learning beyond the classroom: using text messages to measure general chemistry students' study habits

Li Ye, Razanne Oueini, Austin P. Dickerson and Scott E. Lewis*
University of South Florida – Chemistry, Center for the Improvement of Teaching & Research in Undergraduate STEM Education, 4202 E. Fowler Avenue CHE205, Tampa, Florida 33620, USA. E-mail: slewis@usf.edu; Fax: +1-813-974-3203; Tel: +1-813-974-3099

Received 28th May 2015, Accepted 27th July 2015

First published on 27th July 2015


Abstract

This study used a series of text message inquiries sent to General Chemistry students asking: “Have you studied for General Chemistry I in the past 48 hours? If so, how did you study?” This method for collecting data is novel to chemistry education research so the first research goals were to investigate the feasibility of the technique and the evidence for validity of the data collected. The results showed that text messages provide ample data on students' study habits though initial participant recruitment may pose a challenge. This study also explored evidence for validity and found that the percent of students reporting studying peaked with each exam date matching the expected trend (content validity) and participants in the study had only small departures from the population of students at the setting (generalizable validity). Second, students' study habits were characterized using cluster analysis finding three clusters: students that knowingly do not study, students who describe mandatory course components as studying and students who study in addition to the mandatory course components. These student groups were compared on a common exam in the course with the last group out-performing those who knowingly do not study. Finally, student study habits were charted across the semester and show signs of adapting, possibly as a result of course expectations or course content.


Introduction

Understanding factors related to student learning in General Chemistry is necessary to design and evaluate implementations to improve academic performance. Considerable effort has been made toward this end through the use of reformed pedagogical techniques. These techniques target in-class activities and have shown a notable impact on metrics for academic performance (Freeman et al., 2014). In post-secondary education, however, students spend only three to five hours per week in class and have the opportunity to spend considerably more time outside of class studying the course materials. This leads to two overlapping possibilities regarding a causal explanation for the effectiveness of in-class pedagogical reform: (i) students' experiences in class cause learning gains or (ii) the reform modifies students' activities outside of class, which in turn cause learning gains. Currently, little is known regarding post-secondary chemistry students' studying of course material outside of class, herein referred to as study habits. This study seeks to examine a novel method for measuring students' study habits and explore the role of study habits in academic performance.

Background

Past work on study habits

Considerable past work on college students' study habits has been carried out in the fields of education and psychology. In a recent meta-analysis of the work on study habits, Crede and Kuncel (2008) described the empirical and theoretical literature on studying behaviors as fragmented. They organized studying behaviors into three constructs: study skills (knowing how to study), study habits (the frequency and type of actions taken toward studying) and study attitudes (the motivation toward studying). These constructs are differentiated from study processes, which describe the depth of processing on a continuum from deep (an effort to relate new material to previously learned contexts) to surface (characterized as memorization without seeking context). In the meta-analysis, the researchers identified 40 studies relating study habits to college GPA and found correlations that average approximately 0.33 with a 90% interval between 0.09 and 0.51. Relationships between study habits and individual course performance were lower, averaging 0.26, which the authors attribute to not being able to correct for reliability in individual course grades. Study habits also featured a weak relationship with established measures of general cognitive ability such as high school GPA or college admissions tests. This suggests that the relationship between study habits and academic college performance is distinct from the well-established relationships between measures of cognitive ability and student performance. Further, it helps to rule out the explanation that stronger students exhibit better study habits and that this is responsible for the observed correlation; instead it suggests that students can benefit from effective study habits regardless of incoming ability.

There has been little research attention toward measuring study habits in the context of post-secondary chemistry. Richards-Babb and Jackson (2011) investigated gender differences in study habits through a survey given at the end of General Chemistry and reported that male students were more likely to procrastinate. Also related, Li et al. (2013) investigated post-secondary chemistry students' conceptions of learning chemistry and approaches to learning chemistry. Conceptions of learning chemistry were measured by a survey developed based on earlier, more general research interviewing students about learning experiences. This work identified memorizing, testing, calculating/practicing and higher order thinking (labeled as transforming) as the relevant themes in students' conceptions of learning. Approaches to learning were measured based on the previously described study processes and the continuum from surface to deep learning. The study found that students who were characterized as deep learners conceived of learning in multiple ways including transforming, memorizing and testing, while learners who used surface strategies conceived of learning as memorizing and testing.

Most studies that investigated students' study habits used a survey administered a single time, which may be problematic for two reasons. First, as a single measure, it presumes that students' study habits are constant, whereas it is possible that students' study habits adapt to the nature of the content and with familiarity toward assessment expectations. Second, as an in-class survey, it relies upon retrospection on behalf of the student, particularly when it is given at the end of the semester. Past research has called into question the accuracy of retrospective accounts, particularly at lengthier time intervals (Bernard et al., 1984). By exploring study habits at multiple time points, both problems may be minimized, as changes over time can be documented and participants would not be asked to reflect upon several months of study habits.

There have also been efforts made to improve students' study habits. Cook et al. (2013) implemented a one-day lecture for General Chemistry students that presented differences in expectations between post-secondary and secondary education as described by Bloom's taxonomy. They also presented metacognitive learning tools including a study cycle. At the conclusion, students made a brief written statement committing to use some of the tools presented. Student attendance to the lecture was voluntary though students received a bonus equivalent to 0.5% of their grade for attending. Attending students were compared to non-attendees on the metric of points earned in the course post-implementation (transformed to follow a normal distribution). The statistical comparison used students' first exam score as a covariate as it preceded the implementation. The results of an ANCOVA showed that students who attended the treatment performed better on the outcome measure than those who did not. These results indicate the potential importance of student study habits to student learning, but the results could also be indicative of a confounding variable such as student motivation. For example, the authors did note that the control group missed more exams than the treatment group and that could be an indication that the groups differed in their motivation to succeed in the class. Incorporating a measure of study habits before and after the intervention would further elucidate the impact of the intervention and help better establish a causal connection such as the intervention impacted student study habits which led to greater student learning.

In summary, considerable research has shown a relationship between study habits and academic performance in post-secondary education but not in the sciences. The following study seeks to address this gap in the research literature by investigating the frequency and types of study habits in General Chemistry. The creation of a detailed measure of student study habits as described below can open two potentially fruitful areas of study. First, measuring study habits can inform efforts to better understand the factors related to academic success (study habits as an independent variable). Second, measuring study habits can aid explorations of instructional efforts to improve students' study habits (study habits as a dependent variable). The following study takes an exploratory approach to examine study habits as an independent variable.

Experience sampling method

The methodology used in this study is the Experience Sampling Method (ESM), which uses technology to measure participants' self-report of their actions or psychological state while the participant is in their natural environment. What follows is a brief summary of the methodology as it applies to the current study. For a complete introduction to ESM, including methodological stance, research antecedents and examples, readers are referred to Hektner et al. (2007). By measuring in the participants' natural environment, researchers can learn about participants' actions outside of a particular research setting (e.g. the classroom) while relying on a much more proximal retrospective account than traditionally done. ESM has been described as systemic phenomenography in that the information collected relies on self-report and remains restricted to describing the participants' perspective on the area of focus. It is considered systemic in that ESM uses technology to facilitate multiple measures of a construct from each participant to establish reliability and investigate patterns within a participant.

ESM has been used in a variety of contexts, particularly in the field of psychology where it has been used to explore constructs as diverse as morality, mental illness and substance abuse (Smyth and Stone, 2003; Hofmann et al., 2014). It has also been used with medical applications to investigate disorders, drug abuse and treatment effectiveness (Hektner et al., 2007). In education it has been used most often at the secondary level to investigate student motivation, satisfaction with the educational environment or the nature of the environment (Csikszentmihalyi, 2014). To date, we could not locate a study that has used ESM to explore a post-secondary chemistry setting or post-secondary student study habits.

Research questions

As ESM has not been previously used to explore post-secondary study habits, the first research goal was to establish the utility of this method to measure post-secondary students' study habits. Additional research goals include relating study habits to academic performance, which speaks to the relative importance of study habits, and investigating the extent to which study habits change. If study habits are found to change over the semester, this would suggest a fruitful line of research to investigate instructional actions to direct student study habits toward effective practices. Specifically, this study was guided by the following research questions:

(1a) To what extent is it feasible to measure student study habits using ESM?

(1b) To what extent is there evidence for the validity of the data collected on student study habits?

(2) Which study habits were related to academic performance in the course as measured by a cumulative final exam?

(3) How did student study habits change over the course of the semester?

Research setting

This study was conducted over one semester at a large research-intensive university in the southeast United States. At the setting four classes of General Chemistry were offered with class sizes between 200 and 225 students. The classes are coordinated where the instructors agree to a common syllabus, textbook, grading scheme, content sequence and pace. The classes also employed common exams where students across all classes take the same exam at the same time. The exams were constructed by contributions from each of the four instructors and used multiple-choice questions and a measure of linked concepts (Ye and Lewis, 2014). The measure of linked concepts provides a brief description of a chemical situation and has students evaluate the legitimacy of a series of statements as true or false. The series of statements span the content throughout the course and are meant to have students consider how concepts throughout the course are linked.

To aid student studying, past exams were posted approximately two weeks before the actual exam and are referred to as practice tests. The textbook used was Tro's Chemistry: A Molecular Approach (2014) and the content sequence was: quantum numbers, periodic trends, Lewis structures, shapes and polarity, gas laws, thermodynamics, intermolecular forces and properties of solutions. Grades were determined largely by performance on three in-class exams (15% each) and the cumulative final exam (25%). A smaller portion of the grade was attributed to three different effort-based measures at 10% each. First, the class used weekly peer-led problem-solving sessions where students worked in groups on problems designed by the instructors with the aid of peer-leaders (Lewis, 2011); attendance and participation in these sessions were graded. Second, students were graded on their performance on eight online homework assignments using Sapling Learning. Third, instructors used clickers to facilitate in-class questions in the large lecture-hall setting.

Methods

Students were recruited for this study from three of the four General Chemistry classes at the research setting. One class was omitted from this study as the instructor for the class was a member of the research team, and there was concern that recruitment might appear coercive to students. Among the three classes that were recruited, 670 students were enrolled (out of 889 students among the four classes). Recruitment occurred on the first day of class by describing the nature of participation in this study. Participants would be asked for their cell phone numbers and would periodically receive a text message that inquired “Have you studied for General Chemistry I in the past 48 hours? If so, how did you study?” The text messages would be sent approximately twice a week at random times between 9 AM and 9 PM. Participants would be asked to respond to the message within 12 hours of receipt if possible and were given an instruction sheet that included example responses. To encourage participation, students who responded to 80% or more of the text messages would be entered into a raffle for a $25 gift card at the end of the semester. The university's Institutional Review Board approved these procedures. The recruitment effort led to 301 participants consenting of the 670 students (44.9%). The text message inquiry was sent out as described above 28 times over the course of the semester.
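
The sending schedule described above can be illustrated with a minimal sketch. It assumes a 14-week window (28 inquiries at two per week), two distinct randomly chosen days per week and a hypothetical start date; none of these details are specified beyond "approximately twice a week at random times between 9 AM and 9 PM", and the function name is illustrative only.

```python
import random
from datetime import date, datetime, time, timedelta


def schedule_inquiries(first_day, weeks=14, per_week=2, seed=0):
    """Return send times: `per_week` random days in each week, each at a
    random minute between 9:00 (inclusive) and 21:00 (exclusive)."""
    rng = random.Random(seed)  # fixed seed so the schedule is reproducible
    sends = []
    for week in range(weeks):
        week_start = first_day + timedelta(weeks=week)
        # Assumption: the two inquiries in a week fall on different days.
        for day_offset in sorted(rng.sample(range(7), per_week)):
            minute = rng.randrange(9 * 60, 21 * 60)
            day = week_start + timedelta(days=day_offset)
            sends.append(datetime.combine(day, time(minute // 60, minute % 60)))
    return sends


# Hypothetical first day of the sending window.
sends = schedule_inquiries(date(2014, 8, 25))
```

Drawing the minute uniformly from the 12-hour window keeps participants from anticipating the message, which is the usual motivation for random timing in experience sampling.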

Student responses to the text messages were combined with data collected in the normal educational setting from either university records or in-class records. This data includes student responses to the revised two-factor Study Process Questionnaire (rSPQ) administered on the first day of class (Biggs, 2001). The rSPQ is a 20-item Likert-scale instrument meant to measure students' study processes. The instrument was revised by the original instrument's author and measures respondents on two sub-scales: deep approach and surface approach. The deep approach can be characterized by intrinsic interest or a motivation to understand. Example items from the rSPQ that measure the deep approach are “I come to most classes with questions in mind that I want answering” and “I find that at times studying gives me a feeling of deep personal satisfaction.” The surface approach can be characterized by a narrow focus on content and memorization with example statements “My aim is to pass the course while doing as little work as possible” and “I find the best way to pass examinations is to try to remember answers to likely questions.” In this study, students were asked to consider their study habits in general, but if they needed to consider a subject, to consider how they would study for chemistry or a science course. The Likert-scale was a five-point range from “this item is never or only rarely true of me” to “this item is always or almost always true of me,” and each factor score represents the sum of ten associated items.
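
The factor scoring described above (ten items summed per sub-scale on a five-point scale) can be sketched as follows. The assignment of item numbers to sub-scales is an assumption taken from the commonly published scoring key for the instrument and should be checked against the instrument itself; the function name is illustrative.

```python
# Assumed scoring key (1-based item numbers); verify against the instrument.
DEEP_ITEMS = (1, 2, 5, 6, 9, 10, 13, 14, 17, 18)
SURFACE_ITEMS = (3, 4, 7, 8, 11, 12, 15, 16, 19, 20)


def score_rspq(responses):
    """responses: 20 integers, each from 1 (never or only rarely true of me)
    to 5 (always or almost always true of me). Returns (deep, surface)."""
    if len(responses) != 20 or any(r not in range(1, 6) for r in responses):
        raise ValueError("expected 20 responses on a 1-5 scale")
    deep = sum(responses[i - 1] for i in DEEP_ITEMS)
    surface = sum(responses[i - 1] for i in SURFACE_ITEMS)
    return deep, surface
```

With ten items per factor, each score falls between 10 and 50, matching the theoretical range reported for both approaches.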

In this study the rSPQ is thought to measure the quality of studying where the deep approach describes the desirable educational process (Biggs, 2001). This is differentiated from study habits, which describe the type and frequency of studying. There are expected relationships between the constructs; for example, students who employ a deep approach are expected to study more frequently. Student scores on the rSPQ are considered in contrast to their cohort as recommended by Biggs (2001). Additional measures include student demographics and SAT scores (a measure of incoming college preparation) from university records, and student performance on exams from in-class records. Descriptive statistics on each of these measures for the population of 899 students are presented in Table 1.

Table 1 Descriptive statistics of measures
Variable Average St. Dev. N Theoretical range Cronbach's α
SAT math 552 68 695 [200, 800]
SAT verbal 548 73 695 [200, 800]
Deep approach 32.1 6.7 797 [10, 50] 0.826
Surface approach 23.8 6.2 797 [10, 50] 0.776
Final exam 48.5 15.7 754 [0, 100] 0.816
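
For readers unfamiliar with the internal-consistency coefficient reported in the last column, a minimal sketch of Cronbach's α computed from item-level scores follows. The item-level data are not part of this article, so the input below is made up purely for illustration.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (no missing values).
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Made-up responses from four respondents on three items.
example = [[4, 5, 4], [2, 2, 3], [3, 4, 3], [5, 5, 4]]
alpha = cronbach_alpha(example)
```

Values closer to 1 indicate that the items vary together, which is the sense in which the coefficient is used as evidence of reliability.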


Results and discussion

Feasibility of ESM for measuring study habits

Over the course of the semester 4775 responses were collected in response to the 28 inquiries. This represents an average of approximately 16 responses per participant. A histogram of the frequency of student responses is presented in Fig. 1. From the histogram, there were 34 students (11.3% of participants) who never responded to the text message inquiries. There were also a sizable number of students who regularly responded, as 188 students (62.5%) responded to at least half of the messages and 137 students (45.5%) responded to at least three-quarters of the messages.
Fig. 1 Histogram of text message responses.

In terms of feasibility, it is plausible to have a substantial portion of the recruited population respond to this data collection technique. It is worth noting that the raffle incentive required responses to 80% of the inquiries (23 or more inquiries), which may partially explain the rise in the number of participants who responded to 23 or more inquiries. The largest source of data attrition in the study was not non-response but the initial recruitment, where out of an initial population of 670 students, 301 students agreed to participate (44.9%). This suggests that future research studies that intend to rely on a large number of responses would benefit by planning for a substantially larger recruitment pool. The current data indicates that roughly one-quarter of the initially recruited population provided responses to at least half of the inquiries.
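
The recruitment and response figures reported in this section can be re-derived with a few lines of arithmetic. The sketch below uses only counts stated in the text; the variable names are illustrative.

```python
import math

enrolled, consented = 670, 301
never_responded, at_least_half, at_least_three_quarters = 34, 188, 137
total_responses, inquiries = 4775, 28


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


consent_rate = pct(consented, enrolled)               # share of enrolled who consented
never_rate = pct(never_responded, consented)          # participants who never replied
half_rate = pct(at_least_half, consented)             # replied to at least half
half_of_enrolled = pct(at_least_half, enrolled)       # same group, of all enrolled
mean_responses = total_responses / consented          # responses per participant
raffle_threshold = math.ceil(0.8 * inquiries)         # replies needed for the raffle
```

Expressing the regular responders as a share of all enrolled students, rather than of participants, is what yields the "roughly one-quarter" planning figure.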

At the close of the semester, an additional text message inquiry was sent asking participants if they would participate in a similar study using text messages in the future. Of the 94 respondents, 78% responded positively compared to 18% negative (with the remainder unsure). The most common negative comment (5 responses) was that students found the messages annoying. However, the most common comment (52 responses) described the convenience in participating with some indicating it was less of a time commitment compared to traditional studies.

Evidence for validity

First, to determine how generalizable the sample is, participants were compared to the non-participants on each variable describing an incoming characteristic: SAT sub-scores and the Surface Approach and Deep Approach scores from the rSPQ. Scores on each metric are compared in Table 2. Using the two one-sided t-tests method (Lewis and Lewis, 2005) for establishing equivalence, with an equivalence interval equal to the small effect size (Cohen's d = 0.2; Cohen, 1988), the two groups were equivalent on Math SAT and the Surface Approach. The departures from equivalence on Verbal SAT and the Deep Approach were minor, and when the interval was expanded to d = 0.25 the two groups were equivalent on all metrics. Participants were also compared to non-participants based on demographic characteristics of gender and minority status. For this comparison the chi-square test was used with the effect size estimated using Cohen's w. The comparison found that both differences were less than a small effect, which Cohen operationalized as w = 0.10. For gender χ² = 3.27, w = 0.06, and for minority status χ² = 2.39, w = 0.05. The above comparisons serve to investigate self-selection bias in this study and found only small departures between the participants and the non-participants on the measures considered. These measures serve only as an indirect check on self-selection bias, as it is still possible that the study habits of the participants and non-participants differ, and the study habits of non-participants could not be investigated with the data collected. It is therefore proposed that, within these limits, no evidence of self-selection bias was found and the sample may be generalizable to the population of General Chemistry students at the setting.
Table 2 Generalizable validity: comparing participants to non-participants
Measure Participants average (St. Dev.) N Non-participants average (St. Dev.) N
SAT Math 554 (65) 237 550 (69) 458
SAT Verbal 552 (70) 237 545 (74) 458
Deep Approach 32.7 (6.6) 288 31.8 (6.7) 509
Surface Approach 23.6 (6.3) 288 23.9 (6.2) 509
Demographic Participants (N) Non-participants (N)
Gender 65% Female (300) 59% Female (594)
Minority 45% URM(a) (277) 40% URM(a) (559)
(a) URM = under-represented minority (as defined by the National Science Foundation).
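
The equivalence and effect-size calculations in this section can be sketched from summary statistics alone. The sketch below makes several assumptions that are not stated in the text: the equivalence margin is taken as d times the pooled standard deviation, a normal approximation stands in for the t distribution (reasonable at these sample sizes), and the function names are illustrative. It is not the authors' procedure, and because Table 2 reports rounded values, borderline comparisons may not reproduce exactly.

```python
from math import sqrt
from statistics import NormalDist


def tost_equivalent(m1, s1, n1, m2, s2, n2, d=0.2, alpha=0.05):
    """Two one-sided tests on summary statistics.
    Returns True when both one-sided nulls are rejected (groups equivalent)."""
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    margin = d * pooled_sd                 # assumed definition of the interval
    se = pooled_sd * sqrt(1 / n1 + 1 / n2)
    diff = m1 - m2
    z = NormalDist()                       # normal approximation to t
    p_lower = 1 - z.cdf((diff + margin) / se)   # H0: diff <= -margin
    p_upper = z.cdf((diff - margin) / se)       # H0: diff >= +margin
    return max(p_lower, p_upper) < alpha


def cohens_w(chi_square, n):
    """Effect size for a chi-square test on n cases."""
    return sqrt(chi_square / n)


math_equivalent = tost_equivalent(554, 65, 237, 550, 69, 458)
verbal_equivalent = tost_equivalent(552, 70, 237, 545, 74, 458)
w_gender = cohens_w(3.27, 300 + 594)
w_minority = cohens_w(2.39, 277 + 559)
```

Under these assumptions the sketch agrees with the reported pattern at d = 0.2 for the SAT sub-scores (Math equivalent, Verbal not) and reproduces the reported w values to two decimal places.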


To explore the content of the responses, the text messages were coded using an open-coding scheme. The coding process resulted in 16 codes as shown in Table 3. Each response was coded and responses could be coded with multiple codes. For example, “Yes the back of book problems, reading the chapter, and doing the online homework assignment” was coded for Textbook, Practice Problems and Homework. To check the inter-rater reliability of the coding scheme, 10% of the text messages were randomly selected and coded by a researcher who was independent of the first coding pass. The resulting codes agreed with the original code for 94% of the responses. Table 3 also presents the relative frequency of the codes as the percent of responses that used a particular code.
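
The agreement check and the code frequencies can be sketched as below. How agreement was scored is not stated, so the sketch assumes that two coders agree on a response only when they assign exactly the same set of codes; the code labels and function names are illustrative.

```python
def percent_agreement(codes_a, codes_b):
    """codes_a, codes_b: lists of code sets, one set per response.
    Assumption: agreement means the two sets are identical."""
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100 * matches / len(codes_a)


def code_frequency(coded_responses):
    """Percent of responses carrying each code. A response may carry several
    codes, so the percentages can sum to more than 100."""
    counts = {}
    for codes in coded_responses:
        for code in codes:
            counts[code] = counts.get(code, 0) + 1
    return {code: 100 * n / len(coded_responses) for code, n in counts.items()}


first_pass = [{"Textbook", "Practice problems", "Homework"},
              {"Did not study"}, {"Notes"}, {"Notes"}]
second_pass = [{"Textbook", "Practice problems", "Homework"},
               {"Did not study"}, {"Notes"}, {"Textbook"}]
agreement = percent_agreement(first_pass, second_pass)
frequency = code_frequency(first_pass)
```

Simple percent agreement does not correct for chance agreement; a chance-corrected statistic would be a different technique from the one sketched here.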

Table 3 Types of study habits and frequency
Code Percent (%)
Did not study 42.2
Reviewed notes or PowerPoint 18.8
Reviewed the textbook 16.4
Online homework 14.2
Practiced problems 6.8
Previous exams or study guides 5.7
Unspecified yes 4.5
Used online materials 2.6
Worked with friends or in a group 2.4
Attended peer leading or reviewed peer leading assignment 2.1
Worked with a tutor 1.9
Attended class 1.1
Made flashcards 0.9
Visited professor 0.3
Attended lab 0.2
Reviewed tables, models or charts 0.2


Text messages were then also coded dichotomously as either a study habit was used or the participant did not study. The codes unspecified yes, attended class, attended peer leading or attended lab were coded as missing in this categorization as it was not clear whether these participants had employed a study habit. With the new dichotomous codes, the percent of participants employing a study habit was determined for each text message inquiry. The percent of participants using a study habit is plotted by date in Fig. 2, using only those participants who responded to at least half of the 28 inquiries (N = 188). The vertical lines in Fig. 2 correspond to the exam dates in the class. From Fig. 2, the percent of students who report studying increases leading up to each exam, peaks at the exam date and subsequently drops off. This matches the pattern expected from instructional experience, where student inquiries tend to ramp up leading up to the date of an exam, lending content validity to the responses received.


Fig. 2 Percent of responses describing a study habit.
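
The dichotomous recoding behind Fig. 2 can be sketched as follows. The code labels are illustrative, and the handling of a response that mixes an unclear code with a clear study habit (counted here as studying) is an assumption, as the text does not address that case.

```python
NOT_STUDY = "Did not study"
# Codes treated as missing because it is unclear whether studying occurred.
UNCLEAR = {"Unspecified yes", "Attended class",
           "Attended peer leading", "Attended lab"}


def dichotomize(codes):
    """Return False (did not study), True (used a study habit) or None (missing)."""
    if NOT_STUDY in codes:
        return False
    if set(codes) - UNCLEAR:      # at least one clearly identified study habit
        return True
    return None


def percent_studying(responses):
    """responses: code sets received for one inquiry. Missing values are
    excluded from the denominator."""
    flags = [dichotomize(codes) for codes in responses]
    valid = [flag for flag in flags if flag is not None]
    return 100 * sum(valid) / len(valid) if valid else None


one_inquiry = [{"Reviewed the textbook"}, {"Did not study"},
               {"Attended class"}, {"Online homework", "Attended class"}]
share = percent_studying(one_inquiry)
```

Repeating the calculation for each of the inquiries gives one point per send date, which is the series a plot such as Fig. 2 displays.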

Ultimately, the measure of students' study habits proposed is still reliant on self-report. Self-reported data may be influenced by factors that cannot be ruled out such as participants' belief in a socially desirable response pattern or errors in participants' efforts to recollect. Such factors would impact the accuracy of the responses as a measure of actual student actions. As study habits by definition occur outside of a controlled research setting, attempts to triangulate the measure without relying on self-report would require extensive observations that would impose on participants' privacy. This serves as an unavoidable limitation of this study, though it is proposed that participants' self-report of study habits does offer value in understanding the factors needed for successful academic performance.

Relationship of study habits to academic performance

Identifying successful study habits can guide efforts to improve study habits through student advising. The knowledge of productive study habits can also inform evaluations of reform pedagogies, allowing an exploration of the extent to which reform pedagogies promote effective study habits. To investigate the relationship of study habits to academic performance, three decisions were made. First, each participant was characterized by the percent of the participant's responses that indicated each of the study habits shown in Table 3. Second, only participants who responded to at least half of the text message inquiries were considered, to promote stability in the percentage. That is, a participant who indicated reviewing notes in 14 out of 21 responses indicates a more stable pattern than another participant who indicated so in 2 out of 3 responses. Finally, academic performance was operationalized by performance on the cumulative final exam discussed earlier. This measure was chosen as the study habit percentages represent study habits across the semester and the final exam was the only measure to occur at the end of the semester.
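
The per-participant characterization described above can be sketched as a small function that applies the response threshold and then computes the fraction of responses carrying each code. The data layout and function name are assumptions made for illustration.

```python
from collections import Counter


def habit_profile(responses, n_inquiries=28, min_fraction=0.5):
    """responses: list of code sets for one participant, one set per reply.
    Returns {code: fraction of replies with that code}, or None when the
    participant replied to fewer than `min_fraction` of the inquiries."""
    if len(responses) < min_fraction * n_inquiries:
        return None
    counts = Counter(code for codes in responses for code in set(codes))
    return {code: n / len(responses) for code, n in counts.items()}


# The two participants contrasted in the text: 14 of 21 versus 2 of 3.
steady = [{"Reviewed notes"}] * 14 + [{"Did not study"}] * 7
sparse = [{"Reviewed notes"}] * 2 + [{"Did not study"}]
steady_profile = habit_profile(steady)
sparse_profile = habit_profile(sparse)
```

Both participants report reviewing notes in two-thirds of their replies, but only the first clears the threshold, which is the stability argument the text makes.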

Initially, correlations between each study habit and the final exam were conducted. Each correlation indicated a weak relationship with the strongest relationship of 0.14 between percent of responses using the textbook and final exam score. Since correlations only indicate the strength of a linear relationship, the data was further explored for the possibility of relationships that do not follow a linear pattern. Owing to the substantial number of study habits present, the decision was made to conduct a cluster analysis to look for patterns among the multiple study habits. Cluster analysis is an algorithm that measures the distance between each case (student) on the variables (frequency of study habits) and combines pairs of students who feature the smallest distance into a cluster. The algorithm continues to combine students and clusters of students until it reaches a user-specified number of clusters. In this way, cluster analysis can be used to find groups of students who have similar profiles across multiple variables (Everitt et al., 2011). Cluster analysis can be used to describe the data in terms of number of students per group and the average study habits within each group. These groupings can then facilitate investigating relationships among other measures.

For the cluster analysis, only the six most prevalent study habits in Table 3 were used, as these were each represented by at least 5% and were also readily interpretable (the next most prevalent code would be the unspecified yes). A hierarchical cluster analysis using Ward's method and squared Euclidean distance was employed to create clusters that were distinct from each other (Aldenderfer and Blashfield, 1984). To determine the number of clusters, the cluster analysis began with six clusters that were evaluated based on sample size in each cluster and the average percent for each study habit. Then an analysis to create five clusters was conducted to determine which two clusters were combined; these clusters were evaluated based on qualitative similarity on study habit percentages and the relative sample size of each cluster. The analysis was continued until reducing the number of clusters meant losing a cluster that was substantially distinct. The intent was to determine the number of clusters that led to a reasonable representation in the sample for each cluster and where each cluster was distinct. This resulted in three clusters that are characterized in Table 4.
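
The merging rule of Ward's method can be illustrated with a short, unoptimized sketch. At each step it merges the pair of clusters whose union gives the smallest increase in within-cluster sum of squares, which for clusters of sizes n_a and n_b equals n_a·n_b/(n_a + n_b) times the squared Euclidean distance between their centroids. The toy data and function name are hypothetical, and the study's analysis was presumably run in statistical software rather than with code like this.

```python
def ward_clusters(points, n_clusters):
    """Agglomerative clustering with Ward's criterion.
    points: list of equal-length numeric tuples. Returns lists of indices."""
    clusters = [{"members": [i], "centroid": list(p)} for i, p in enumerate(points)]

    def merge_cost(a, b):
        na, nb = len(a["members"]), len(b["members"])
        sq_dist = sum((x - y) ** 2 for x, y in zip(a["centroid"], b["centroid"]))
        return na * nb / (na + nb) * sq_dist   # increase in within-cluster SS

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]))
        a, b = clusters[i], clusters[j]
        na, nb = len(a["members"]), len(b["members"])
        merged = {
            "members": a["members"] + b["members"],
            # Size-weighted mean of the two centroids.
            "centroid": [(na * x + nb * y) / (na + nb)
                         for x, y in zip(a["centroid"], b["centroid"])],
        }
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [sorted(c["members"]) for c in clusters]


# Hypothetical profiles: (fraction "did not study", fraction "textbook").
toy = [(0.70, 0.05), (0.66, 0.08), (0.25, 0.35),
       (0.28, 0.33), (0.30, 0.08), (0.33, 0.10)]
groups = ward_clusters(toy, 3)
```

Running the function with decreasing values of `n_clusters` shows which two clusters are combined at each step, mirroring the six-to-three inspection described above.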

Table 4 Cluster analysis results – study habits
Study habit Cluster 1 Cluster 2 Cluster 3
Sample size 64 62 62
Percent of responses, Average (St. Dev.)
Did not study 67% (13%) 26% (14%) 33% (15%)
Reviewed notes or PowerPoint 8% (9%) 22% (22%) 26% (16%)
Reviewed the textbook 8% (11%) 35% (16%) 8% (8%)
Online homework 5% (7%) 11% (9%) 25% (16%)
Practiced problems 4% (6%) 10% (12%) 6% (9%)
Practice tests or study guides 3% (4%) 6% (7%) 9% (8%)
Note: Bold indicates study habit has more than +0.5 standard deviation different than the overall average; italic text indicates study habit is less than −0.5 standard deviations different than the overall average.


Table 4 describes three distinct clusters that arose from the study habits in the sample. To place the values in context, a value of 14% would indicate that a participant reported the study habit at least twice and at most four times over the course of the semester, depending on how many of the 28 inquiries the participant answered. The sample distribution among the three clusters is relatively even, which suggests that each cluster has prevalence among the sample. Participants in Cluster 1 indicated not studying far more often than the rest of the sub-sample (67% versus 42% for the sub-sample) and consequently indicated reviewing notes and the online homework less often than the sub-sample. Cluster 2 was more than one standard deviation greater than the sub-sample on use of the textbook (35% versus 16%). Cluster 2 was also higher on practicing problems and lower on the percentage of not studying. Cluster 3 was noteworthy for describing the online homework as their study habit, but was also higher on reviewing notes and the practice tests.
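
The "at least twice and at most four times" reading of a 14% value follows from the range of responses in the sub-sample, where each participant answered between 14 and 28 inquiries. A minimal sketch of that arithmetic, with an illustrative function name:

```python
def reports_implied(percent, n_responses):
    """Number of replies naming a study habit, given the percent of a
    participant's replies and how many replies that participant sent."""
    return round(percent / 100 * n_responses)


fewest = reports_implied(14, 14)   # participant who answered half of 28
most = reports_implied(14, 28)     # participant who answered all 28
```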

The three clusters were compared on the five other measures with data presented in Table 5. To compare the clusters an Analysis of Variance (ANOVA) was performed with α = 0.05, which provides a group-wise error rate of 0.23 across the five tests. The effect size was also characterized by Cohen's f, where 0.10 is a small effect and 0.25 is a medium effect (Cohen, 1988). Interestingly, the clusters did not differ significantly on either SAT sub-score, with negligible effects for Verbal SAT (F = 0.233; p = 0.792; f = 0.06) and Math SAT (F = 0.135; p = 0.874; f = 0.04). For the study approaches, the clusters differed with medium effects on both the deep approach (F = 4.190; p = 0.017; f = 0.22) and the surface approach (F = 7.315; p < 0.001; f = 0.27). Post hoc comparisons using the Tukey test indicate that Cluster 1 is significantly higher on the surface approach than the other two clusters and that Cluster 2 is significantly higher on the deep approach than Cluster 1. On the final exam metric, the clusters were also different with a near medium effect (F = 3.663; p = 0.028; f = 0.21). Post hoc analysis describes the significant difference as Cluster 2 scoring higher than Cluster 1. An ANCOVA controlling for SAT sub-scores on the final exam measure indicated similar results (F = 3.881; f = 0.24) to the original ANOVA.

Table 5 Study habit clusters compared
Variable Cluster 1 Cluster 2 Cluster 3
(values are averages, with standard deviations in parentheses)
SAT math 566 (62) 561 (71) 560 (57)
SAT verbal 557 (70) 566 (81) 558 (71)
Deep approach 30.4 (6.5) 33.7 (6.4) 31.8 (5.8)
Surface approach 25.7 (5.9) 22.1 (6.1) 22.4 (5.6)
Final exam 43.8 (13.3) 51.1 (15.2) 46.8 (14.0)


Thus, it appears that Cluster 2, which comprises roughly one-third of the sample, had higher scores on average on the final exam measure. This suggests that the study approaches described as reviewing the textbook and practicing problems lead to increased academic performance in the course. Not surprisingly, Cluster 1, which indicated predominantly not studying, performed worse. That Cluster 2 scored higher on the deep approach and Cluster 1 on the surface approach lends external validity to the qualitative difference between these two groups. Cluster 3's performance on the final exam is interesting as it was comparable to that of Cluster 1. The study efforts of Cluster 3 are more concentrated on the online homework. It is hypothesized that these students perceived the completion of the required online homework as suitable preparation for the exams. The central feature of the hypothesis is the emphasis on perception. Since the online homework was a mandatory part of the class, it is likely that the strong majority of students completed it; however, the students in Cluster 3 may have perceived it as satisfactory preparation whereas students in Cluster 2 believed that additional preparation was necessary. Thus, Cluster 1 may be described as knowingly not studying, Cluster 3 as believing the required course components constitute satisfactory studying and Cluster 2 as studying in addition to the required course components by relying on the textbook and practicing problems.

The relationship of the study habits measured by ESM to a measure of academic performance serves as support for external validity of the data collected. The finding that students who study more regularly perform better on the cumulative final exam may not be surprising. However, the finding that approximately one-third of the sample studies regularly, which matches the baseline observed in Fig. 2, is of importance as it suggests that there is ample ground for promoting effective study habits. That the students who study regularly are not distinguishable from the other groups based on SAT scores also partially rules out the competing explanation that these students were more academically prepared prior to the semester. Another possible explanation for differences in academic performance is a difference between clusters in student motivation to succeed in the course; in particular, it is plausible that differences in motivation manifest themselves in more frequent studying.

Study habits change over the semester

To investigate changes in study habits over the course of the semester, the analysis focused on the text message inquiries that were sent out immediately preceding each exam. The decision to focus on these four text message inquiries was based on the increase observed in describable study habits that coincided with the exams, as shown in Fig. 2. These inquiries also lend the most insight into students' exam preparation strategies. As a measure of change in study habits over the semester, the analysis was conducted on only the 113 participants who responded to each of the four messages in question; otherwise, observed changes could result from trends in missing data. A separate lexical analysis was conducted on the responses from each of the four text message inquiries. Lexical analysis is an algorithm designed to automatically categorize written responses. The lexical analyses were conducted using SPSS Text Analytics (IBM, 2011). This program uses linguistics-based text analysis, which combines phrases into a common category if the phrases differ only in the use of synonyms (e.g. practicing problems and doing problems). Some of the resulting categories, such as practice tests and old tests, were then manually combined. Lexical analysis also facilitates an investigation into patterns of overlap among categories, which provides insight into the extent to which study habits are diversified at each time point.
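The two data-handling steps described above, restricting the sample to participants who answered every pre-exam inquiry and grouping synonymous phrases into categories, can be illustrated with a minimal Python sketch. This is not the linguistics-based algorithm in SPSS Text Analytics; it is a simple keyword stand-in, and the pattern lists, category names and function names are all hypothetical.

```python
import re

# Hypothetical synonym patterns; each category groups phrases treated as equivalent.
CATEGORY_PATTERNS = {
    "practice problems": [r"practic\w* problems", r"doing problems"],
    "previous tests": [r"practice tests?", r"old tests?", r"previous tests?"],
    "notes": [r"\bnotes\b"],
    "textbook": [r"\btextbook\b", r"\bbook\b"],
}


def categorize(response):
    """Return the set of categories whose patterns appear in one response."""
    text = response.lower()
    return {
        category
        for category, patterns in CATEGORY_PATTERNS.items()
        if any(re.search(pattern, text) for pattern in patterns)
    }


def complete_responders(responses_by_inquiry):
    """Keep only participant ids that appear in every inquiry.

    responses_by_inquiry is a list of dicts mapping participant id to text.
    """
    ids = set(responses_by_inquiry[0])
    for responses in responses_by_inquiry[1:]:
        ids &= set(responses)
    return ids


inquiries = [
    {1: "Doing problems and reviewing my notes", 2: "read the book"},
    {1: "went over old tests"},
    {1: "notes", 2: "practice test"},
    {1: "textbook and notes", 3: "notes"},
]
kept = complete_responders(inquiries)
print(kept)
print(categorize(inquiries[0][1]))
```

Restricting to complete responders mirrors the rationale in the text: a category that appears to shrink between exams should reflect a change in reported habits rather than a change in who replied.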

The end result was 18 categories created from the responses across the set of four inquiries. Note that these categories were created independently of the codes described in Table 3. A sizable advantage of the lexical analysis technique is the ability to display the categories and the interrelations between them in a web diagram. Web diagrams were created for each exam (Fig. 3 through 6) focusing only on categories with at least five responses. The web diagram represents each category with a node, and the size of the node is proportional to the frequency of the category. The frequency of each category is indicated in parentheses inside each node. Nodes are connected with lines that indicate the extent to which the connected categories were mentioned together in a response. The type of line indicates the extent to which the categories are shared as a proportion of the smaller node. A solid line indicates that 60% or more of the responses that were categorized by the smaller node were also present in the category in the larger node. A long dash line indicates 40% to 59% agreement, a square dotted line indicates 20% to 39% agreement and no line indicates below 20% agreement. Reviewing the web diagrams can provide insight into changes in study patterns that occurred throughout the term. For context in interpreting trends in the web diagrams, the relevant topics from each exam are shown in Table 6.
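The rule for choosing a line type can be stated compactly in code. The Python sketch below treats each category as a set of response identifiers, computes the shared responses as a proportion of the smaller category, and maps that proportion onto the line types given above. The function names are illustrative, and the handling of values that fall between the stated integer ranges (for example 39.5%) is an assumption.

```python
def overlap_fraction(category_a, category_b):
    """Shared responses as a proportion of the smaller category."""
    smaller = min(len(category_a), len(category_b))
    if smaller == 0:
        return 0.0
    return len(category_a & category_b) / smaller


def line_style(category_a, category_b):
    """Map the overlap proportion to the line type used in the web diagram."""
    share = overlap_fraction(category_a, category_b)
    if share >= 0.60:
        return "solid"
    if share >= 0.40:
        return "long dash"
    if share >= 0.20:
        return "square dot"
    return None  # below 20% agreement: no line is drawn


# Hypothetical response ids for two categories.
notes = set(range(10))       # 10 responses mention notes
textbook = {0, 1, 2, 3, 20}  # 5 responses mention the textbook, 4 shared
print(line_style(notes, textbook))  # solid
```

Normalizing by the smaller node means a small category that is almost always mentioned alongside a large one still receives a solid line, which is why the line type speaks to how diversified the study habits are rather than to the raw counts.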


Fig. 3 Exam 1 web diagram.

Table 6 Content on exams
Exam Content
Exam 1 Properties of light, electron configurations, periodic trends
Exam 2 Lewis structures, molecular shapes, polarity
Exam 3 Gas laws, thermodynamics
Final exam Intermolecular forces, colligative properties, cumulative exam included prior content


In the Exam 1 (Fig. 3) diagram, notes, previous tests, textbook and homework are the most prominent, with PowerPoint (PPT) slides also mentioned. The links show moderate overlap among these five categories, though notably no significant overlap was found between previous tests and PowerPoint or previous tests and textbook. In the Exam 2 (Fig. 4) diagram the study pattern is more concentrated on notes, previous tests and textbook with moderate overlap among almost all of the categories. In Exam 3 (Fig. 5) the textbook is reduced in prominence and the studying was more focused on previous tests; also the relations among nodes are generally weaker than in Exam 2 indicating less reliance on multiple study approaches. In preparing for the cumulative Final Exam (Fig. 6) the use of the textbook has returned to prominence along with notes and previous tests, similar to Exam 2. This diagram is also the most interconnected web suggesting a stronger reliance on multiple studying techniques, possibly owing to the cumulative nature of this exam.


Fig. 4 Exam 2 web diagram.

Fig. 5 Exam 3 web diagram.

Fig. 6 Final exam web diagram.

Looking for changes across study patterns, one clear trend is the diminished role of studying homework in preparation for the exams. In Exam 1, homework was among the most prominent nodes, whereas in each subsequent exam it is a minor node. This may describe students' perceiving a lack of relevance of the homework assignments in exam preparation after the first exam. For context, the online homework was always due one to two days before each of these four text message inquiries so that it was likely students were working on the assignments in the time frame indicated. Students could also review the homework assignments after the due date. Incidentally, after the semester had completed, the instructors at the setting discussed deliberately including one or two questions modified from the homework assignment in each of the exams to emphasize to students the importance of understanding the process of problem solving in the homework over simply arriving at the correct answer. By making this change it is possible that students may benefit more from engaging in the homework which would be reflected in their study habits and related to their academic performance.

Another trend among the web diagrams is the diminished role of the textbook and notes in the Exam 3 diagram. Exam 3 relied strongly on math content (see Table 6), unlike the preceding exams. Students may have responded to this by studying the textbook and notes less and focusing more on the instructor-provided materials in the PowerPoint slides and previous tests. Other explanations are also possible, such as time constraints related to other courses giving exams, the perceived challenge of the posted practice tests taking up more student time or students finding the textbook less helpful for this content.

Returning to the research question, there appears to be considerable evidence of changing study habits among a common group of students over the course of the semester. The changing role of homework, textbook, notes and the use of multiple study techniques suggests that student study habits differ across the exams. The changes may occur for many reasons, including students responding to the perceived effectiveness of study techniques for each exam, the perceived nature of the content on each exam, the quality of study materials available or competing demands on students' time. The changes in the nature of the links also indicate that the variety of techniques used by students changes over the course of the semester and is amplified when taking a final exam, possibly as a result of the cumulative nature of the exam.

Conclusions and future work

This study has shown the feasibility of using text messages to provide considerable data on students' self-reported study habits in General Chemistry. Among the principal limiting factors is recruiting students to participate, which may become an issue depending on the intended use. Future work may benefit by modifying the incentive structure for recruiting students. Second, there is evidence for validity of the text messages in that the response pattern matches the expected trend relative to the exam dates in the setting. Additionally, the recruited sample featured minor departures from the overall population on incoming metrics, including a measure of study approaches, lending support to the consideration that the results are generalizable to the population of General Chemistry students at the setting.

Next, the study provided evidence that study habits are related to academic performance in the course, notably through students using study habits that are in addition to the mandated course requirements. In this study, use of the textbook was most prominent as the additional study habit. A direct instructional implication of this study is the potential benefit of discussing with students the need to study beyond mandatory course components. In the current study one-third of the students described mandatory course components as their principal means for studying and these students performed comparably to students who did not report studying. Future research could have instructors discuss the results shown here with students and measure the impact on student study habits or academic performance.

One of the most interesting areas of future work may be an investigation of how instructional techniques affect study habits. Indeed, one of the more surprising outcomes of this study was the infrequent mention of group work or studying with friends (Table 3), as it seemed possible that the weekly group work during the peer-led meetings would facilitate greater use of study groups outside of class. To investigate this area further, the impact of incorporating reform pedagogy or training sessions on how to study can be investigated in either a repeated measures or quasi-experimental design using text messages to measure students' study habits. Such a study may inform causal mechanisms behind evidence-based instructional practices. For example, proponents of cooperative learning have indicated social constructivism as a potential explanation for improved learning outcomes that have been observed (Muthyala and Wei, 2013). The causal mechanism for learning would be that students' social processes within group work have facilitated their conceptual understanding. An alternative causal mechanism, however, is that students engaged in cooperative learning may become dissatisfied with their own progress in comparison to their peers and as a result study more. The plausibility of the alternative explanation is supported by the time available outside of class relative to in class and the observed academic benefits of study habits herein. An investigation into the impact of evidence-based instructional practices on study habits can then support or refute the alternative explanation proposed.

This study has also shown that students' study habits can change over the course of the semester. This finding has relevance for work that relies on a single measure of particular study habits (as opposed to more general study approaches) extrapolated to describe students' habits throughout the term. It also provides some support to the expectation that instructor actions can influence student study habits. Relating changes in study habits to measures of reflective action (e.g. metacognition) or interviewing students regarding their study actions prior to each test may offer additional support for this contention. Additionally future research that investigates how changes in study habits relate to academic performance is warranted, as whether consistent or adaptable study habits are more beneficial remains an open question.

Finally, ESM has the flexibility to potentially support a diverse range of instructional strategies. For example, the action of messaging students outside of class can, by itself, serve as an instructional intervention. Instructors can use text messages to direct students toward online resources, set-up peer study groups or remind students of deadlines in a timely fashion. Additionally, the messages can be tailored for individual students or small groups; for example messages can notify a group of students who haven't completed an online homework assignment of an upcoming due date or inform a student who has struggled that the student's recent test score shows an improvement over past performance. Early research in a wide range of educational settings has shown that such tailored messages have a strong potential for producing positive gains (Dynarski, 2015). This approach may offer a non-intrusive way to show faculty concern for student performance which, when missing, has been cited as a factor in student attrition from the STEM disciplines, particularly among minority students (Tsui, 2007; Museus et al., 2011).

References

  1. Aldenderfer M. S. and Blashfield R. K., (1984), Cluster Analysis, Beverly Hills: Sage Publications.
  2. Bernard H. R., Killworth P., Kronenfeld D. and Sailer L., (1984), The Problem of Informant Accuracy: The Validity of Retrospective Data, Ann. Rev. Anthro., 13, 495–517.
  3. Biggs J. B., (2001), The revised two-factor Study Process Questionnaire: R-SPQ-2F, Brit. J. Educ. Psychol., 71, 133–149.
  4. Cohen J., (1988), Statistical Power Analysis for the Behavioral Sciences, Hillsdale: Lawrence Erlbaum Associates.
  5. Cook E., Kennedy E. and McGuire S. Y., (2013), Effect of Teaching Metacognitive Learning Strategies on Performance in General Chemistry Courses, J. Chem. Educ., 90, 961–967.
  6. Crede M. and Kuncel N. R., (2008), Study Habits, Skills, and Attitudes; The Third Pillar Supporting Collegiate Academic Performance, Perspect. Psychol. Sci., 3, 425–453.
  7. Csikszentmihalyi M., (2014), Applications of flow in human development and education, Dordrecht: Springer.
  8. Dynarski S., (2015), Helping the Poor in Education: The Power of a Simple Nudge, The New York Times, The Upshot, retrieved July 28, 2015, from: http://www.nytimes.com/2015/01/18/upshot/helping-the-poor-in-higher-education-the-power-of-a-simple-nudge.html.
  9. Everitt B. S., Landau S., Leese M. and Stahl D., (2011), Cluster Analysis, 5th edn, West Sussex, UK: John Wiley & Sons.
  10. Freeman S., Eddy S. L., McDonough M., Smith M. K., Okoroafor N., Jordt H. and Wenderoth M. P., (2014), Active learning increases student performance in science, engineering and mathematics, Proc. Natl. Acad. Sci. U. S. A., 111, 8410–8415.
  11. Hektner J. M., Schmidt J. A. and Csikszentmihalyi M., (2007), Experience Sampling Method; Measuring the Quality of Everyday Life, Thousand Oaks, California: Sage Publications, Inc.
  12. Hofmann W., Wisneski D. C., Brandt M. J. and Skitka L. J., (2014), Morality in everyday life, Science, 345, 1340–1343.
  13. IBM, (2011), IBM SPSS Text Analytics for Surveys 4.0.1 User's Guide, retrieved May 21, 2015, from: ftp://public.dhe.ibm.com/software/analytics/spss/documentation/tafs/4.0.1/en/Users_Guide.pdf.
  14. Lewis S. E., (2011), Retention and Reform: An Evaluation of Peer-Led Team Learning, J. Chem. Educ., 88, 703–707.
  15. Lewis S. E. and Lewis J. E., (2005), The same or not the same: Equivalence as an issue in educational research, J. Chem. Educ., 82, 1408–1412.
  16. Li W.-T., Liang J.-C. and Tsai C.-C., (2013), Relational analysis of college chemistry-major students' conceptions of and approaches to learning chemistry, Chem. Educ. Res. Pract., 14, 555–565.
  17. Museus S. D., Palmer R. T., Davis R. J. and Maramba D. C., (2011), Factors That Influence Success Among Racial and Ethnic Minority College Students in the STEM Circuit, San Francisco: Jossey-Bass.
  18. Muthyala R. and Wei W., (2013), Does Space Matter? Impact of Classroom Space on Student Learning in an Organic-First Curriculum, J. Chem. Educ., 90, 45–50.
  19. Richards-Babb M. and Jackson J. K., (2011), Gendered responses to online homework use in general chemistry, Chem. Educ. Res. Pract., 12, 409–419.
  20. Smyth J. M. and Stone A. A., (2003), Ecological momentary assessment research in behavioral medicine, J. Happiness Stud., 4, 35–52.
  21. Tro N. J., (2014), Chemistry: A Molecular Approach, Upper Saddle River, NJ: Pearson Education, Inc.
  22. Tsui L., (2007), Effective Strategies to Increase Diversity in STEM Fields: A Review of the Research Literature, J. Negro. Educ., 76, 555–581.
  23. Ye L. and Lewis S. E., (2014), Looking for Links: Examining Student Responses in Creative Exercises for Evidence of Linking Chemistry Concepts, Chem. Educ. Res. Pract., 15, 576–586.

Footnote

Analyses presented later in this manuscript rely on a subset of the sample based on frequencies of responses to text messages. The correlations between frequency of responses and each of the measures in Table 2 were found to be weak, with r = 0.16 for response rate with Math SAT or Verbal SAT and |r| < 0.14 for the other measures, indicating that subsets generally continue to show only minor departures from the population.

This journal is © The Royal Society of Chemistry 2015