Development of The Grit Scale For Children and Adults and Its Relation To Student Efficacy, Test Anxiety, and Academic Performance
Edward D. Sturman a,⁎, Kerri Zappala-Piemme b
a SUNY Plattsburgh at Queensbury, Department of Psychology, Queensbury, NY, USA
b SUNY Plattsburgh at Queensbury, Educational Leadership, Queensbury, NY, USA
ARTICLE INFO

Keywords:
Grit
Anxiety
Self-efficacy
Standardized test
Achievement

ABSTRACT

We sought to develop a new measure of grit, which would be suitable (i.e. readable) for both schoolchildren and adults. An initial pool of 14 items was administered to a student/community sample in Study 1 and 12 items were selected for the Grit Scale for Children and Adults (GSCA) based on factor loadings. In Study 2, the GSCA was administered to 249 students in grades 3–12. Participants also completed measures of self-efficacy, test anxiety, the Grit-O scale, and standardized tests in ELA, Math and Science. The GSCA demonstrated high internal consistency and test-retest reliability. Construct validity was supported by significant correlations with efficacy, anxiety, and other measures of grit. Scores on the GSCA predicted achievement on the ELA and Science standardized tests, over and above an existing grit scale (the Grit-O). The study found initial evidence for the psychometric properties of the GSCA and its use in school-children.
⁎ Corresponding author at: Psychology, SUNY Plattsburgh at Queensbury, 640 Bay Rd., Queensbury, NY, USA.
E-mail addresses: estur001@[Link] (E.D. Sturman), kzapp002@[Link] (K. Zappala-Piemme).
Received 11 November 2016; Received in revised form 31 July 2017; Accepted 19 August 2017
1041-6080/ © 2017 Elsevier Inc. All rights reserved.
E.D. Sturman, K. Zappala-Piemme Learning and Individual Differences 59 (2017) 1–10
Abuhassàn and Bates (2015) showed that consistency of interests was just a facet of conscientiousness, with the authors going on to say “what is unique to Grit, then, is a single (rather than bi-factor) construct tapping effortful persistence or ‘elbow grease.’” Moreover, perseverance, but not consistency of interest, was significantly related to achievement.

Therefore, we defined grit as follows: To sustain a focused effort to achieve success in a task, regardless of the challenges that present themselves, and the ability to overcome setbacks. The goal of the present study was to develop a measure that would adequately capture this definition and would be readable by young adults and children alike.

1.1. Grit and academic performance

Grit has been shown to predict grade point average (GPA) in Ivy League and state college students (Duckworth et al., 2007); adolescents at a public school (Duckworth & Quinn, 2009); Black male college students at a predominantly White college (Strayhorn, 2014); and doctoral students (Cross, 2014). Most of the associations that have been found between grit and GPA have been in the modest to moderate range. It is notable that grit predicted performance over and above traditional predictors such as SAT scores (Duckworth et al., 2007) or ACT scores and high school GPA (Strayhorn, 2014). Grit has been shown to predict not only GPA but also retention in a variety of roles, as would be expected if existing grit scales are really measuring perseverance and a “stick-to-it-ness.” For instance, Robertson-Kraft and Duckworth (2014) found that grit was able to predict both teacher effectiveness and retention. Eskreis-Winkler, Shulman, Beal, and Duckworth (2014) found grit predicted high school graduation rates, completion of an Army Special Forces course, retention in salespeople, and even the propensity of married males to stay in their marriage. Most of the effect sizes were relatively modest, yet grit accounted for unique variance in the outcomes, even after controlling for other known predictors.

We should note that more recent studies have somewhat tempered the expectations for grit as a key predictor of academic performance. Notably, a meta-analysis by Credé, Tynan, and Harms (2016) showed that grit was correlated at about 0.16 and 0.17 with GPA at the high-school and college levels respectively. The perseverance factor was related to academic performance at a higher level (0.26) than the consistency factor (0.10). Overall grit scores did not show incremental validity over conscientiousness (which was found to be highly correlated with grit; also see Rimfeld, Kovas, Dale, & Plomin, 2016 for similar conclusions). Yet, the perseverance factor yielded substantial incremental validity, over and above conscientiousness. We see this as further support for omitting the consistency facet from our definition of grit, and instead focusing on perseverance.

Of course, there are other personality and mood constructs which are known to predict academic performance, and which are theoretically associated with grit. We included two of these constructs, efficacy and anxiety, in the present study. Their relationship to academic performance is briefly reviewed below.

1.2. Self-efficacy and academic performance

Self-efficacy has been defined by Bandura (1977) as the belief that individuals will be able to produce desired outcomes and has been equated with mastery or competence. Self-efficacy is considered a personality variable that is relatively stable and has important implications for coping and persistence in challenging situations (Bandura, 1977; Blatt, D'Afflitti, & Quinlan, 1976).

Following Bandura's landmark paper, many researchers sought to examine self-efficacy as it related to various domains within the academic sphere. For example, Bandura (1990; Bandura, Barbaranelli, Caprara, & Pastorelli, 1996) himself differentiated the construct to include efficacy as it related to self-regulatory learning (incorporating motivation, planning, and enlisting resources to learn) and meeting others' expectations (ability to live up to expectations of parents, teachers, and peers), among other domains. Subsequent research demonstrated that efficacy in self-regulated learning directly predicted academic achievement and was indirectly related to performance by promoting a prosocial orientation and decreased despondency (Bandura et al., 1996). Diseth (2011) showed that previous academic success, as measured by high-school GPA, predicted higher efficacy, which in turn led to future academic achievement. Other research has demonstrated that self-efficacy predicts GPA partly through persistence of effort and overcoming of difficulties (Komarraju & Nadler, 2013). The impact of self-efficacy on academic performance and persistence is now well-established (see Multon, Brown, & Lent, 1991 and Richardson, Abraham, & Bond, 2012 for meta-analyses).

1.3. Test anxiety and academic performance

Research on the relationship between anxiety and academic performance dates back many years to the pioneering work of Mandler and Sarason (1952) and Sarason's (1960) attempts to define test anxiety and examine its role in academic performance. While it is acknowledged that some anxiety is adaptive in preparing students for upcoming challenges, such as exams or assignments, too much anxiety is clearly detrimental to performance. Seipp (1991) identified 126 studies that tested this association and her meta-analyses showed a clear, albeit modest, relationship across the studies. Of particular note was that specific forms of anxiety, namely test anxiety, showed stronger correlations with performance compared to generalized anxiety.

Liebert and Morris (1967) differentiated test anxiety into an emotional and cognitive component (labelled worry) and found initial evidence that the cognitive aspect may be more involved with performance. Indeed, Hembree (1988) conducted an early meta-analysis and showed that the worry component seemed to interfere with performance to a greater extent than emotionality. Further, he found that test anxiety was linked to personality variables such as low self-esteem (inversely) and an external locus of control. Cassady and Johnson (2002) developed a specific measure for cognitive test anxiety and found it to be related to academic performance on self-reported SAT scores and examination grades. A high degree of stability has been found for both the cognitive and bodily system components of test anxiety, which would indicate that the construct is more a trait than a state (Cassady, 2001).

Studies showing the effects of test anxiety on high stakes tests are less common. Hancock (2001) found that individuals with high test anxiety are more likely to show performance deficits in highly evaluative contexts. Segool, Carlson, Goforth, Von Der Embse, and Barterian (2013) showed that high stakes tests were related to significantly higher levels of test anxiety compared to classroom tests. To our knowledge, the current study represents a unique opportunity to examine cognitive test anxiety in relation to high stakes testing of the Common Core Learning Standards as well as grit. Insofar as grit represents confidence in the ability to overcome challenges (such as tests), we would expect it to be inversely related to test anxiety, which is associated with lower self-confidence and an external locus of control (Hembree, 1988). Studies on this association are rare but Sheridan, Boman, Mergler, and Furlong (2015) found a moderate negative correlation between grit and general anxiety, while Celik and Sarıçam (2016) specifically found higher grit to be related to lower levels of test anxiety in a sample of Turkish school-children and adolescents.

1.4. Study overview

The goal of the present study was to validate a new grit scale that would be comprehensible by young children and adults alike. Items for the new scale were created with this goal in mind and the resulting item
pool was tested in a college/community sample (Study 1). After eliminating items that did not load on a homogenous dimension, the Grit Scale for Children and Adults (GSCA) was validated in a sample of students from grades 3–12 (Study 2). Parents of students provided their own ratings of grit and we also administered measures of efficacy towards self-regulated learning and meeting others' expectations. Test anxiety was assessed along with a measure of anxiety specifically geared towards the high stakes standardized tests. We were able to obtain scores on the high stakes tests in English Language Arts (ELA), Math, and Science in order to evaluate the predictive validity of the new scale.

The specific hypotheses for Study 2 were as follows:

1. Scores on the GSCA would be significantly correlated with parental ratings of grit and scores on the Grit-O scale (Duckworth et al., 2007);
2. Scores on the GSCA would be significantly correlated with efficacy in self-regulated learning and meeting others' expectations;
3. Scores on the GSCA would be correlated with test anxiety generally and test anxiety that was specifically related to standardized tests; and
4. The GSCA would postdict achievement levels on the ELA standardized test and prospectively predict achievement levels on standardized tests for Math and Science.

2. Study 1

In the first study we sought to develop a measure of grit based on the definition put forward in the general introduction. A pool of items was created based on theoretical considerations and we solicited the advice of experts in the field as to the suitability of items. We evaluated the homogeneity of items using factor analysis and repeated the procedure for the Grit-O scale (Duckworth et al., 2007). A mixed student/community sample was used to increase the generalizability of our findings. We should note that the measure was to be used in school-children in Study 2 but the factor analysis utilized only responses from adults. We thought this was acceptable for the following reasons: 1) the GSCA would ultimately be used in adults as well as children; and 2) the purpose of the factor analysis was to simply weed out items at the test development stage that were not correlating with other items. In order to ensure similar factor structure, we conducted another analysis with the school-children (see Study 2).

2.1. Materials and methods

2.1.1. Participants and procedure

The study received approval from the IRB at the institution. Undergraduate students from a state college in upstate New York were asked to complete an online survey posted on Survey Monkey and received course credit for their participation (see Measures). Each student participant was also asked to forward the survey to three family members and/or friends in order to obtain a more representative sample. All participants provided informed consent online before completing the online measures. A total of 109 respondents (36 male, 72 female, 1 unknown) completed the survey. The average age was 38 years (SD = 15.50). The highest education level achieved was as follows: high school or less (n = 25); Associate's Degree (n = 36); Bachelor's Degree (n = 21); Master's Degree (n = 23); and Doctorate (n = 3). The sample was predominantly Caucasian (93.8%), followed by African American (1.8%), Hispanic (0.9%), and other (0.9%). The ethnic breakdown was representative of the larger community.

2.1.2. Measures

In line with our definition of grit, we drafted 14 items corresponding to sustaining a focused effort to achieve success in a task, regardless of the challenges that present themselves, and the ability to overcome setbacks. We sought to develop the items using language that would be understandable by younger children (i.e. grades 3 and up). The initial pool of items was presented to three experts in the field of education to elicit feedback. All of the experts were education professors possessing a doctorate with over two decades of practical teaching experience. They were asked to rate the items as either “essential,” “useful but not essential,” or “not necessary.” Most of the items were deemed essential by all three raters and there was low agreement on the exceptions. Therefore, all 14 items were included in the initial assessment and it was decided that factor analysis would be used as the basis to retain or eliminate items. Any items with a factor loading below 0.40 were eliminated.

[Link]. Grit-O scale (Duckworth et al., 2007). The 12-item Grit-O scale assesses the ability of individuals to sustain effort and maintain consistency of interests. Respondents are asked the degree to which they agree with various statements on a 5-point scale ranging from 1 = not at all like me to 5 = very much like me. Duckworth et al. (2007) obtained a two-factor solution of the scale corresponding to perseverance of effort and consistency of interest. An example perseverance item is “I have overcome setbacks to conquer an important challenge.” An example consistency item is “I become interested in new pursuits every few months.” Both subscales demonstrated adequate internal consistency. However, the authors have also pointed to the value of using total scores in predicting important outcomes. The Grit-O Scale was able to significantly predict adult educational achievement, GPA, retention in West Point cadets, and was related to conscientiousness, supporting its construct validity (Duckworth et al., 2007).

2.2. Results

A factor analysis was conducted on the initial 14 items and returned four factors with eigenvalues above 1. The first factor had an eigenvalue of 4.71 and accounted for 33.66% of the variance. The other factors accounted for 11.16%, 8.08%, and 7.74% of the variance respectively. A scree plot demonstrated that the first factor was predominant and it is notable that none of the items loaded above 0.40 on any of the other factors with the exception of item 3, which loaded at −0.54 on the second factor and item 9, which loaded at −0.52 on the fourth factor. Item 10 loaded 0.398 on the third factor. In all cases, those few items with factor loadings above 0.4 on the other factors nevertheless had higher loadings on the first factor. Therefore, we used factor loadings on the first factor as the basis for retaining or eliminating items.

Table 1 shows the factor loadings of all 14 items. Item 5 (I am able to bounce back when things don't go my way) and item 6 (I don't always care if I do my best work) both had factor loadings below 0.40. The factor loadings for the remaining items were relatively robust, ranging from 0.42 to 0.72. Based on our criteria we eliminated items 5 and 6, resulting in the 12-item Grit Scale for Children and Adults.

As this was a preliminary scale development paper, it would not be appropriate to examine the 12 items that remained from the initial pool as a stand-alone scale. That is, the psychometric properties of the GSCA would need to be established via a proper validation study, which was the goal of Study 2. However, it is noteworthy that the 12 items that were retained demonstrated relatively high internal consistency (Cronbach's alpha = 0.84). When scores on all 12 items were summed, the resulting total score was significantly correlated with the Grit-O scale, r(108) = 0.68, p = 0.001. This provided some evidence for the convergent validity of the new scale but, as mentioned, we recognized that the scale would need to be evaluated as a stand-alone measure that did not include the eliminated items.

A factor analysis of the Grit-O scale revealed two factors with eigenvalues above one, with the first factor accounting for 44.22% of the variance and the second factor accounting for only 12.04% of the variance. No items loaded higher on the second factor than the first
factor and the factors did not seem to correspond to those obtained by Duckworth et al. (2007), owing to the fact that the second factor was uninterpretable. Therefore, there seems to be one factor underlying the items of the Grit-O scale, at least in our sample. In support of the homogeneity of the measure, we obtained a Cronbach's alpha of 0.88 across all items, indicating a high level of internal consistency.

The Flesch-Kincaid Grade Level and Flesch Reading Ease were used to evaluate the readability of the 12 items selected for the GSCA and Grit-O scale. The Flesch-Kincaid formula was developed by Kincaid, Fishburne, Rogers, and Chissom (1975) and was initially used by the military to assess the reading level of technical manuals. Since that time it has become a widely used measure of reading level in other realms. Both the Flesch-Kincaid Grade Level and Flesch Reading Ease incorporate the number of syllables and sentence length in their calculation. For Flesch Reading Ease, scores range from 0 (very hard) to 100 (very easy). The Flesch-Kincaid simply states the grade level for which a passage would be appropriate.

The GSCA received a score of 81 on Flesch Reading Ease (easy to read) and 5.0 on the Flesch-Kincaid Grade Level. The Grit-O scale received a score of 64.7 on Flesch Reading Ease (plain English, suited to 13–15 year olds) and 6.7 on the Flesch-Kincaid Grade Level. We also employed the New Dale-Chall Readability Formula (Chall & Dale, 1995), which is based on the length of sentences and the number of hard words. The GSCA received a final score of 4.8, which corresponded to a grade 4–5 reading level. Using the same index, the Grit-O scale received a final score of 6.7 and a grade 7–8 reading level.

Table 1
Loadings for the initial pool of grit items.

Items                                                      Loadings
                                                          1        2        3        4
I don't always work as hard as I can.                  0.65    −0.31    −0.12    −0.04
I always finish what I start.                          0.72     0.25    −0.33     0.19
I am not always motivated to do my best.               0.62    −0.54    −0.05     0.10
I always stick to the task I am working on until       0.65     0.31    −0.46    −0.01
  it is complete.
I am able to bounce back when things don't go          0.30     0.30     0.11    −0.10
  my way.
I don't always care if I do my best work.              0.37    −0.19     0.05     0.12
I always keep working for what I want even             0.42     0.09     0.31     0.47
  when I don't do as well as I would like to.
Sometimes I am not as focused on my work as I          0.51    −0.15    −0.01    −0.13
  would like to be.
Challenges in my life sometimes make me want           0.57    −0.10     0.06    −0.52
  to stop trying.
No matter what happens to me I will be okay.           0.44     0.19     0.40    −0.11
I always pay attention to what I am working on         0.57     0.29     0.02     0.08
  to make sure I do it well.
Sometimes I don't care about my work as much           0.65    −0.36     0.11     0.09
  as I should.
I never give up even when things get tough.            0.51     0.21     0.19    0.015
I am able to get through tough times without           0.55     0.21     0.15    −0.15
  any difficulty.

2.3. Discussion

The first study was limited in scope, intended only to create the initial pool of items related to our definition of grit, and to eliminate items based on the results of a factor analysis. The sample could largely be considered a community sample given the average age (38 years) and varied educational background. Twelve items were retained which would form the GSCA. Employing two widely used readability indices, we found that the 12 items that formed the GSCA ranged from a 4th to 5th grade reading level whereas the Grit-O scale was generally at the 7th to 8th grade reading level.

The 12 items retained for the GSCA displayed an adequate level of internal consistency and summing scores across the items produced a total score that was significantly related to that of the Grit-O scale. In Study 2, we would extend these findings and gather evidence for the reliability and validity of the finalized GSCA.

3. Study 2

After identifying 12 items that seemed to be tapping a homogenous construct, we sought to explore the reliability and validity of the newly developed scale (see Appendix A). Of particular interest was whether grit could predict academic performance, as measured by achievement on age-appropriate standardized tests across many grade levels. We should note that the scale was administered shortly after the English Language Arts (ELA) test and several weeks in advance of the Math and Science assessments. The reliability of the measure was established by administering it at two time points separated by approximately two weeks. We also sought to test the construct validity of the measure by relating it to parental ratings of grit and associated constructs such as self-efficacy.

3.1. Methods

3.1.1. Participants and setting

A total of 249 schoolchildren (54.8% female and 45.2% male) in grades 3–12 participated in the study. We should note that a large proportion of students in the district participated in the study, with the exception of students in grades 6 and 7, who did not complete the tests due to error (specifically, the teachers did not administer the measures). Table 2 shows the number of students from each grade that participated in the study broken down by gender. Seven hundred and fifty-five students in grades K-12 are enrolled in the small rural public school district. The school district reported that 97% of their students are Caucasian, 1% are African American and 1% are Hispanic. All students who had voluntarily agreed to participate in the study and who had returned parental or guardian permission slips completed the surveys during the school day. Teachers administered the surveys to their students and returned them to the authors.

Table 2
Participants by grade level.

Grade    N     Female    Male
3        24    16        8
4        29    16        13
5        25    12        13
8        27    13        14
9        39    22        17
10       30    18        12
11       41    21        20
12       24    13        11

3.1.2. Measures

[Link]. The Grit Scale for Children and Adults (GSCA). The development of the 12-item self-report scale was described in Study 1. The items assess sustained and focused effort to achieve success in a task, regardless of the challenges that present themselves, and the ability to overcome setbacks. Responses are made on a 5-point Likert scale ranging from 1 = Strongly Disagree to 5 = Strongly Agree. The Results section contains detailed information on the reliability and construct validity of the scale.

[Link]. Grit-O scale (Duckworth et al., 2007). The Grit-O Scale was included in Study 2 to explore the convergent validity of the GSCA. See Study 1 for a description of the scale. In the present study we also separated out the consistency of interest and perseverance of effort factors to assess whether these facets would have different predictive validities than the overall score.
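
The item-retention rule used in Study 1 (eliminate any item whose loading on the first factor falls below 0.40) can be illustrated with the first-factor loadings reported in Table 1. The following is a minimal sketch rather than the authors' analysis script: the function name `split_items` and the numbering of items by their row order in Table 1 are assumptions made for illustration.

```python
# First-factor loadings for the 14 candidate items, in Table 1 row order.
FIRST_FACTOR_LOADINGS = [0.65, 0.72, 0.62, 0.65, 0.30, 0.37, 0.42,
                         0.51, 0.57, 0.44, 0.57, 0.65, 0.51, 0.55]

# Retention criterion stated in the Measures section of Study 1.
THRESHOLD = 0.40


def split_items(loadings, threshold):
    """Return (retained, dropped) item numbers, counting items from 1.

    An item is dropped when its first-factor loading is below the threshold.
    `split_items` is a hypothetical helper, not a function from the paper.
    """
    retained, dropped = [], []
    for item_number, loading in enumerate(loadings, start=1):
        if loading >= threshold:
            retained.append(item_number)
        else:
            dropped.append(item_number)
    return retained, dropped


retained, dropped = split_items(FIRST_FACTOR_LOADINGS, THRESHOLD)
print(dropped)        # [5, 6]
print(len(retained))  # 12
```

Applied to the Table 1 values, the rule drops items 5 and 6 and keeps the remaining 12, matching the item set described for the GSCA.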
[Link]. Parental grit rating. Parents were given the following item at the time they provided consent: Grit has been defined by researchers as possessing high motivation, care, and focus in relation to one's work and the ability to persevere and bounce back from setbacks. Using the above definition please rate your child's ‘grit’ on a scale from 0 to 100. A number of responses were erroneous (i.e. they did not conform to the scale of 0–100) or were missing. In total we were able to use 135 responses to the item. The mean score was relatively high at 80.28 (SD = 20.30), possibly revealing a halo effect.

[Link]. Multidimensional Scales of Perceived Self-Efficacy (MSPSE; Bandura, 1990). The MSPSE is a 57-item self-report measure of self-efficacy in schoolchildren. It consists of nine subscales, measuring various components of efficacy. Items are rated on a 7-point Likert-type scale with 1 = not well at all, 3 = not too well, 5 = pretty well, and 7 = very well. In our study we focused on two subscales: Self-Efficacy for Self-Regulated Learning (11 items) and Self-Efficacy to Meet Others' Expectations (4 items). An example item from the former is “How well can you study when there are other interesting things to do?” An example item from the latter is “How well can you live up to what your teachers expect of you?” These subscales were thought to be most relevant to the construct of grit, and other practical considerations (e.g. time and the age of the participants) caused us to select the most age-appropriate subscales, rather than administer the entire scale. Both Miller, Coombs, and Fuqua (1999) and Choi, Fuqua, and Griffin (2001) have found the scales to have relatively high internal consistency. We were not interested in the Academic Achievement subscale because these questions asked about how well students could learn foreign language, algebra, biology, etc. and we had access to the participants' actual performance. The items were also not appropriate for the 3rd–8th grade curriculum (students in grades 3–8 do not typically take all of these courses).

[Link]. Children's Test Anxiety Scale (CTAS; Wren & Benson, 2004). The CTAS consists of 30 items relating to test anxiety, which respondents rate on a four-point scale (1 = almost never, 2 = some of the time, 3 = most of the time, 4 = almost always). The items reflect the three dimensions of children's test anxiety: Thoughts = 13 items; automatic reactions = 9 items; and off-task behavior = 8 items (Wren & Benson, 2004). The internal consistency for the 30-item CTAS was 0.92 and 0.89 for the Thoughts subscale. In our study we focused on the 13-item Thoughts subscale as it specifically focusses on thoughts that are evident at the time of testing, namely self-critical thoughts, test-related concerns, and test-irrelevant thoughts (Wren & Benson, 2004). An example item from the scale is “I worry about how hard the test is.”

[Link]. Anxiety of standardized tests. Test anxiety for the standardized assessments was measured by the following 3-item self-report scale:

1. Are you worried about the upcoming New York State Tests?
2. Are you worried about the results of the New York State Tests?
3. Are you worried about performance on the New York State Tests?

The scale contained five response options for each item ranging from 1 = not at all worried to 5 = extremely worried. High internal consistency was obtained for the items (0.94), which were summed to form a total anxiety score.

[Link]. Standardized tests for ELA, math, and science. The New York State tests in ELA, Math, and Science were developed by Pearson. Starting in 2013, the assessments were updated to be in accord with the Common Core Learning Standards (CCLS) and are administered every year to children in grades 3–8. The Science test is administered to children in grade 4 and again in grade 8. Students are assigned to 4 different levels based on their proficiency in the standards, as reflected in the following rankings: 1 = well below proficiency; 2 = partially proficient; 3 = proficient; 4 = excels in standards. These performance levels were used as the outcome in our analyses (see ANOVAs in Results) but we also were able to obtain scale scores which were then converted into z-scores. Standardized tests come in different forms; scale scores are based on raw scores but have been transformed so that the different forms of the test are on a consistent scale. This allows for meaningful comparisons. Nevertheless, we would not be able to compare scaled scores from one grade level to another for each of the tests. For this reason we converted the scaled scores to z-scores by subtracting the mean from the scores and dividing by the standard deviation. This procedure places each scaled score on a common standardized metric; the z-score tells us how many standard deviations away from the mean each student scored on the various tests.

The ELA test consists of multiple-choice, short-response, and extended-response questions to assess CCLS in relation to reading, writing, and language. The Mathematics test used the same types of questions to assess students' ability to understand Common Core content conceptually, work with mathematical facts, know which formulas to employ, and solve real-world problems. The Science tests, of course, varied in content depending on grade levels (i.e. grade 4 vs. 8) but also aligned with CCLS and consisted of multiple-choice and open-ended questions. A number of technical reports, issued over several years, support the reliability and validity of all of the standardized tests (see [Link]). The school district used in the study tended to underperform in the standardized tests in relation to other districts.
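
The conversion of scaled scores to z-scores described for the standardized tests (subtract the mean, divide by the standard deviation) can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the text does not say whether the sample or population standard deviation was used, nor exactly which group supplied the mean, so the sketch assumes the sample standard deviation and standardization within each grade-by-test group. The function names and the example scores are hypothetical.

```python
from statistics import mean, stdev


def z_scores(scaled_scores):
    """Convert scaled scores to z-scores: (score - mean) / standard deviation."""
    m = mean(scaled_scores)
    sd = stdev(scaled_scores)  # sample SD (n - 1); an assumption, see lead-in
    return [(score - m) / sd for score in scaled_scores]


def z_scores_by_group(records):
    """Standardize scores within each (grade, test) group.

    `records` is a list of (grade, test, scaled_score) tuples. Grouping by
    grade and test is an assumption about how the conversion was applied.
    """
    groups = {}
    for grade, test, score in records:
        groups.setdefault((grade, test), []).append(score)
    return {key: z_scores(scores) for key, scores in groups.items()}


# Hypothetical scaled scores for illustration only (not data from the study).
records = [
    (3, "ELA", 290), (3, "ELA", 300), (3, "ELA", 310),
    (4, "ELA", 295), (4, "ELA", 305), (4, "ELA", 315),
]
standardized = z_scores_by_group(records)
print(standardized[(3, "ELA")])
```

Because each group is centred on its own mean and scaled by its own spread, a z-score of 1 means the same thing (one standard deviation above the group mean) in grade 3 as in grade 4, which is what makes the scores usable across grade levels.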
[Fig. 1. Chart of the procedure. Recoverable labels: "10–15 days"; "Phase 2".]
3.1.3. Procedures

[Link]. Phase 1. In the first phase of the study, participating parents received a two-page questionnaire that obtained consent, collected demographic information, and assessed grit (see Fig. 1 for a chart of the Procedure). Parents completed the two-page survey 10–21 days before the ELA test was administered for grades 3–8. The following week, students in grades 3–12 received a one-page questionnaire that assessed grit using the GSCA.

[Link]. Phase 2. The second phase of the study began 10–15 days after the completion of Phase 1. Students in grades 3–12 received a six-page questionnaire that assessed grit (using the GSCA and Grit-O scale), efficacy, and anxiety (test anxiety and anxiety for standardized tests). The following week, students in grades 3–8 completed the Mathematics standardized test and three weeks later students in grades 4 and 8 completed the Science standardized test. All the survey and assessment data in Phase 1 and Phase 2 were administered and collected during the 2013–2014 school year.

3.2. Results

The internal consistency of the GSCA was adequate at both time points, with Cronbach's alpha of 0.84 at baseline and 0.86 at Time 2. The test-retest reliability was 0.78. A principal components analysis revealed three components with eigenvalues over 1 (4.48, 1.23, and 1.13), although the first component was clearly predominant, accounting for 37.34% of the variance compared to 10.25% and 9.39% for the second and third factors. With the exceptions of item 2 (0.39) and item 5 (0.39), the items of the GSCA had strong loadings on the first component (in the 0.5 to 0.7 range; see Table 3 for factor loadings and the finalized scale). Items 2 and 12 had relatively strong loadings on the second factor (0.46 and −0.59 respectively) and item 2 also had a relatively strong loading on the third factor (0.51). Item 6 had a relatively strong loading (−0.47) on the third factor but not as strong as on the first factor. Given the clear dominance of the first factor and the lack of interpretability of any other factor, a one factor solution seemed to best describe the associations between items.

No significant gender differences on the GSCA were found at either baseline or Time 2. At baseline, mean scores for males were 39.84 (SD = 8.76) compared to 41.89 (SD = 8.59) for females, t(209) = −1.078, p = 0.089. At Time 2, mean scores for males were 41.52 (SD = 7.67) compared to 41.58 (SD = 9.43) for females, t(188) = −0.050, p = 0.96. Table 4 shows the correlations between scores on the GSCA (at baseline and Time 2) and key variables. The scale displayed a moderate correlation with the Grit-O scale and parental grit rating. We should note that the Grit-O scale was not significantly related to the parental grit rating, r(112) = 0.13, p = 0.156. The GSCA was also significantly and strongly associated with efficacy for self-regulated learning and efficacy meeting others' expectations. It displayed significant negative correlations with test anxiety and anxiety related specifically to standardized assessments, such that higher grit scores were associated with lower test anxiety. The single-item parental grit rating demonstrated relatively high correlations with the standardized tests in ELA, Math, and Science. We should also note that the separate perseverance and consistency of interest subscales, derived from the Grit-O, generally showed lower correlations with other key predictors compared to the overall Grit-O score.

3.2.1. Predicting academic performance – levels achieved on standardized tests

A series of one-way ANOVAs was conducted to determine whether baseline scores on the GSCA were related to performance levels on standardized tests for ELA, Math, and Science. The sample sizes for each of the analyses (and those that follow for the GSCA and standardized tests) were as follows: ELA (n = 88); Math (n = 82); and Science (n = 51). The sample sizes were smaller than that of the overall sample as only students in grades 3–8 took the tests, and, in the case of the Science test, only students in grades 4 and 8 took the test (as directed by the state of New York).

Performance levels on the ELA test were not related to scores on the GSCA at baseline, F(3, 84) = 1.52, p = 0.216, η² = 0.05. Math levels were significantly related to scores on the GSCA at baseline, F(3, 78) = 3.11, p = 0.031, η² = 0.11 (see Fig. 2). LSD post-hoc analyses showed that those scoring a 3 on the Math test had significantly higher baseline scores on the GSCA compared to those scoring 1 (p = 0.007) or 2 (p = 0.040).

A second ANOVA showed that there were significant differences between the levels of the Science test on GSCA scores at baseline, F(3, 47) = 4.38, p = 0.008, η² = 0.22 (see Fig. 3). LSD post-hoc analyses demonstrated that those scoring a 4 on the Science test had significantly higher scores on the GSCA compared to those scoring a 3 (p = 0.011) or 2 (p = 0.001). A series of ANOVAs revealed that the levels on the ELA, Math, and Science tests were not significantly related to scores on the Grit-O scale.
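
As a rough check on the effect sizes for these ANOVAs, eta squared for a one-way design can be recovered from the F-ratio and its degrees of freedom, because η² = SS_between / SS_total = (F × df_between) / (F × df_between + df_within). The sketch below is illustrative only and is not the authors' analysis code; the function name `eta_squared_from_f` is an assumption introduced here. It returns a proportion, which can be multiplied by 100 to read it as a percentage of variance.

```python
def eta_squared_from_f(f_ratio: float, df_between: int, df_within: int) -> float:
    """Eta squared for a one-way ANOVA, recovered from F and its degrees of freedom."""
    if df_between <= 0 or df_within <= 0:
        raise ValueError("degrees of freedom must be positive")
    # F * df_between equals SS_between / MS_within, so dividing by
    # (F * df_between + df_within) gives SS_between / SS_total.
    scaled_between = f_ratio * df_between
    return scaled_between / (scaled_between + df_within)


# F-ratios and degrees of freedom as reported in the Results text.
reported = {
    "ELA": (1.52, 3, 84),
    "Math": (3.11, 3, 78),
    "Science": (4.38, 3, 47),
}

for test_name, (f_ratio, df_b, df_w) in reported.items():
    eta2 = eta_squared_from_f(f_ratio, df_b, df_w)
    print(f"{test_name}: eta squared = {eta2:.3f} ({100 * eta2:.1f}% of variance)")
```

Rounded to two decimals, the three proportions come out at about 0.05, 0.11, and 0.22 for ELA, Math, and Science respectively.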
Table 3
Loadings on the GSCA.

Items                                                       Loadings
                                                               1.     2.     3.
1. I don't always work as hard as I can.                     0.61   0.33  −0.34
2. I always finish what I start.                             0.39   0.46   0.51
3. I am not always motivated to do my best.                  0.70   0.06  −0.27
4. I always stick to the task I am working on until it is    0.64   0.12   0.18
   complete.
5. I always keep working for what I want even when I         0.39   0.33   0.40
   don't do as well as I would like to.
6. Sometimes I am not as focused on my work as I would       0.62  −0.05  −0.47
   like to be.
7. Challenges in my life sometimes make me want to           0.60  −0.37   0.10
   stop trying.
8. No matter what happens to me I will be okay.              0.56  −0.42   0.30
9. I always pay attention to what I am working on to         0.70   0.30   0.03
   make sure I do it well.
10. Sometimes I don't care about my work as much as I        0.74   0.11  −0.33
    should.
11. I never give up even when things get tough.              0.72  −0.14   0.22
12. I am able to get through tough times without any         0.56  −0.59   0.10
    difficulty.

3.2.2. Predicting academic performance – Z-scores on standardized tests

We were provided with scale scores for each participant on the standardized tests for Math, ELA, and Science. The scale scores in themselves would not provide a basis for comparison but we were able to convert the scale scores into z-scores and conduct three hierarchical regression analyses in order to determine whether the GSCA predicted scores on Math, ELA, and Science tests over and above the Grit-O scale.

In all of the regression analyses, the Grit-O scale was entered as a predictor in the first step and the GSCA was entered in the second step. We should note that the sample size for participants who had completed the Grit-O scale, GSCA and standardized tests was fairly low (n = 66 with Math as the DV; n = 72 with ELA as the DV; and n = 46 with Science as the DV). Although larger sample sizes are generally preferred, the sample sizes in the present analyses are more or less in keeping with the recommendations of Harris (1985), who recommended that n = 50 + the number of predictors, and Schmidt (1971), who found that predictive power was not lost so long as the sample size was above 40.

In the first hierarchical regression analysis, standardized scores for Math were entered as the dependent variable. Neither the Grit-O scale (p = 0.316) nor the GSCA (p = 0.091) emerged as significant predictors in the second step. In the second regression analysis ELA scores were entered as the dependent variable. Table 5 shows the standardized beta and p-values for each predictor, along with the change in R² and F-ratio. The predictors together accounted for 12.9% of the variance in ELA scores. Of the predictors only GSCA scores (at Time 1) significantly predicted performance. Note that as GSCA scores (and scores on the other variables) were obtained after the ELA tests it would be more accurate to say that scores postdicted performance and we only use prediction in the statistical sense.

We also conducted a regression analysis to determine whether the grit scales significantly predicted performance on Science standardized scores. In this case, prediction is the most appropriate term as the Science test was administered several weeks after completing the other measures (see Methods). Table 6 contains the results of the analysis. The Grit-O scale and GSCA together accounted for 16.9% of the variance in Science scores. Of the predictors, only GSCA scores (at baseline) significantly predicted performance.

Table 4
Correlations between grit and study variables.
⁎ Correlation significant at p < 0.05.
⁎⁎ Correlation significant at p = 0.001.

3.3. Discussion

The second study was primarily aimed at testing the psychometric properties of the GSCA. The internal consistency of the scale was adequate and the test-retest reliability coefficient indicated that the GSCA was tapping a relatively stable trait. Convergent validity was demonstrated by moderately high correlations with the Grit-O scale. The GSCA was able to predict performance on the Math and Science standardized tests, while the Grit-O scale did not show significant relationships to either outcome.

Somewhat surprisingly, the single-item parental grit rating showed relatively strong associations with the standardized test scores. However, it did not relate to test anxiety or efficacy meeting others' expectations. The parental grit rating was significantly correlated with GSCA scores but not scores on the Grit-O. It could be that parents were largely basing their evaluations of their child's grit on their academic performance, although the study was not equipped to test this hypothesis. While the measure was useful in the present study, we would caution against using it on its own as a measure of grit, for the reasons outlined above, and the lack of reliability inherent in single-item measures.

Another unexpected finding was that the perseverance facet showed lower correlations with our other key variables compared to the Grit-O overall score. Further, it was not significantly correlated with any of the standardized test scores. This was to be expected for consistency of interest, based on past research, but perseverance generally shows significant correlations with performance (see Credé et al., 2016). This finding would seem to support the recommendation of Duckworth et al. (2007) to use the total score, at least for our sample of school-children.

As expected, scores on the GSCA were related to higher efficacy for self-regulated learning and meeting others' expectations. Further, GSCA scores were associated with test anxiety generally (as measured by the CTAS) and, specifically, anxiety of standardized tests. Higher levels of grit predicted lower anxiety. Incremental validity was obtained for the GSCA in postdicting ELA scores and predicting Science scores. These associations, and their implications, are explored further in the General Discussion.
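
The incremental-validity claim rests on the change in R² when the GSCA is entered after the Grit-O scale. A minimal sketch of the standard F-change statistic for nested regression models is given below. It is not the authors' code: the function name `f_change` is an assumption introduced for illustration, and the inputs are the rounded ELA figures reported in the paper (n = 72, ΔR² of 0.034 at step 1, total R² of 0.129 at step 2).

```python
def f_change(r2_reduced: float, r2_full: float, n: int,
             k_reduced: int, k_full: int) -> float:
    """F statistic for the R-squared change between two nested OLS models."""
    if not 0.0 <= r2_reduced <= r2_full < 1.0:
        raise ValueError("require 0 <= r2_reduced <= r2_full < 1")
    df_num = k_full - k_reduced   # number of predictors added in this step
    df_den = n - k_full - 1       # residual degrees of freedom of the fuller model
    if df_num <= 0 or df_den <= 0:
        raise ValueError("degrees of freedom must be positive")
    return ((r2_full - r2_reduced) / df_num) / ((1.0 - r2_full) / df_den)


# Rounded values from the ELA analysis (assumed inputs for this sketch).
n_ela = 72
r2_step1 = 0.034   # Grit-O scale only
r2_step2 = 0.129   # Grit-O scale plus GSCA at baseline

step1 = f_change(0.0, r2_step1, n_ela, k_reduced=0, k_full=1)
step2 = f_change(r2_step1, r2_step2, n_ela, k_reduced=1, k_full=2)
print(f"Step 1 F-change: {step1:.2f}")
print(f"Step 2 F-change: {step2:.2f}")
```

With these rounded inputs the step 2 statistic comes out at roughly 7.5. Exact agreement with a published ΔF is not expected, because the published R² values are rounded and the sample size available at each step may differ.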
[Figure: scores on GSCA (y-axis) by performance level (Level 1, Level 2, Level 3, Level 4); surviving data labels: 42.61, 40.68.]
[Figure: scores by performance level (Level 2, Level 3, Level 4); surviving data labels: 41.13, 34.33.]
Table 5
Postdicting performance on the ELA standardized test.

Predictors            B        β        P        ΔR²      ΔF
Step 1                                           0.034    2.482
  Grit-O scale       −0.029    0.184    0.120
Step 2                                           0.095    7.635⁎
  Grit-O scale        0.006    0.041    0.742
  GSCA at baseline    0.040    0.340    0.007

Note. ⁎ p < 0.05.

Table 6
Predicting performance on the Science standardized test.

Predictors            B        β        P        ΔR²      ΔF

rural, etc.). The internal consistency was relatively high, suggesting that we were tapping a homogenous trait, and the test-retest data suggested that the trait was relatively stable, although a longer time interval between testing sessions would enable a more confident statement about its stability. Evidence for its convergent validity came from relatively high correlations with the Grit-O scale, yet not so high as to suggest that both scales were measuring the same thing. Criterion validity was supported by the significant correlations with anxiety and self-efficacy, measured concurrently, and its ability to predict future performance on the Math and Science tests. Finally, incremental validity was obtained for the GSCA as it was related to performance on the ELA and Science standardized tests over and above the Grit-O scale.

4.1. Limitations and future directions
Heritability estimates are based on the proportion of phenotypic variation in a trait attributable to genetic factors versus environmental factors (which include non-shared and shared environments) in a particular population at a given time. Twin studies have shown that grit has a substantial genetic influence (see Rimfeld et al., 2016; Tucker-Drob, Briley, Engelhardt, Mann, & Harden, 2016). Rimfeld et al. (2016) found genetic factors to account for roughly a third of the variance in grit, and notably, shared environmental factors (e.g. the same school environment) did not account for any variance to a significant extent. The present study was not equipped to examine the heritability of grit, as measured by the GSCA, but this is an important research question that deserves further investigation.

4.2. Implications for educational practices

The educational reform agenda calls for greater accountability and looks to Common Core Learning Standards (CCLS), Science Technology Engineering Math (STEM), Data Driven Instruction, and Teacher and Leader effectiveness to increase student performance. The focus is on student achievement and shifting educational practices to prepare learners for graduation, college and career readiness. In order to achieve this, the field of education is making critical shifts in curriculum, instruction and assessment. For example, the New York State Assessments in Mathematics (which served as one outcome in the present study) are aligned with the CCLS and require students to persevere to solve problems.

We agree with Duckworth and Yeager (2015) that the assessment of grit by self-report methods should not be used for the purpose of program evaluation, teacher accountability, diagnosis, or practice improvements. The temptation to do so is clear given the research behind grit but, as these researchers point out, self-report methods lend themselves to various errors, biases, and social desirability. We would expect that the likelihood of bias, social desirability, and faking would increase if high stakes decisions are to be made on the basis of self-reported grit. Duckworth and Yeager (2015) suggest that a multi-method approach (including performance based measures) and the development of novel measures are necessary for the aforementioned purposes. Nevertheless, we believe quick self-report measures could be quite useful for educators, provided they are not used in high stakes contexts. For example, teachers and administrators may benefit from identifying the levels of grit in their classrooms, along with associated variables such as self-control and efficacy, so that they may intervene to bolster these “non-cognitive” variables and improve both short-term and long-term performance.

5. Conclusions

In the present study we demonstrated that a new tool to assess grit can be used in schoolchildren to predict performance. The scale was found to be reliable and showed incremental validity over and above an existing grit measure and other key predictors of performance. We see the primary advantage of the GSCA as being its relative simplicity and readability by younger schoolchildren.

Schools are utilizing many resources to increase academic performance and ensure that students are working to their full potential. Research by Dweck and Leggett (1988) points to the importance of a “growth mindset,” which has been embraced by educators: the belief that ability is malleable and that learning and effort both play a large role in its expression. Studies have also shown that these mindsets themselves can be malleable and that, when students are taught to have a growth mindset, they are more successful academically (Blackwell, Trzesniewski, & Dweck, 2007).

It should be noted that there is indirect evidence that educational interventions may have some effect on increasing grit. Even brief growth-mindset interventions, which emphasize resilience to setbacks (a component of grit), have been shown to improve student performance (Paunesku et al., 2015). Of course, further studies will be needed to determine whether promoting a growth mindset actually increases grit. In the meantime, as we allocate time and money to increase academic performance, it would appear to be worthwhile to teach students to work hard, to stick to a task, to persevere when they face challenges, and to get back up and continue (try-try-again) when they face setbacks.
Blatt, S. J., D'Afflitti, J., & Quinlan, D. (1976). Experiences of depression in normal young adults. J. Abnorm. Psychol. 85, 383–389.
Cassady, J. C. (2001). The stability of undergraduate students' cognitive test anxiety levels. Practical Assessment, Research & Evaluation, 7, 1–8.
Cassady, J. C., & Johnson, R. E. (2002). Cognitive test anxiety and academic performance. Contemp. Educ. Psychol. 27, 270–295.
Celik, C., & Sarıçam, H. (2016). Grit and test anxiety in Turkish children and adolescents. Presented at ERPA international congresses on education, June, Sarajevo, Bosnia Herzegovina.
Chall, J. S., & Dale, E. (1995). Readability revisited: The new Dale-Chall readability formula. Cambridge, MA: Brookline Books.
Choi, N., Fuqua, D. R., & Griffin, B. W. (2001). Exploratory analysis of the structure of scores from the multidimensional scales of perceived self-efficacy. Educ. Psychol. Meas. 61, 475–489.
Cohen, R. M. (2015). Teaching character: Grit, privilege, and American education's obsession with novelty. American Prospect Longform. (April 10, 2015. Retrieved from) http://[Link]/article/can-grit-save-american-education.
Credé, M., Tynan, M. C., & Harms, P. D. (2016). Much ado about Grit: A meta-analytic synthesis of the grit literature. J. Pers. Soc. Psychol. (Advance online publication) [Link]
Cross, T. M. (2014). The gritty: Grit and non-traditional doctoral student success. Journal of Educators Online, 11, 1–30.
Diseth, A. (2011). Self-efficacy, goal orientations and learning strategies as mediators between preceding and subsequent academic achievement. Learn. Individ. Differ. 21, 191–195.
Duckworth, A. L., Peterson, C., Matthews, M. D., & Kelly, D. R. (2007). Grit: Perseverance and passion for long-term goals. J. Pers. Soc. Psychol. 92, 1087–1101.
Duckworth, A. L., & Quinn, P. D. (2009). Development and validation of the short grit scale (grit-S). J. Pers. Assess. 91, 166–174.
Duckworth, A. L., & Yeager, D. S. (2015). Measurement matters: Assessing personal qualities other than cognitive ability for educational purposes. Educ. Res. 44, 237–251.
Dweck, C. S., & Leggett, E. L. (1988). A social-cognitive approach to motivation and personality. Psychol. Rev. 95, 256–273.
Eskreis-Winkler, L., Shulman, E. P., Beal, S. A., & Duckworth, A. L. (2014). The grit effect: Predicting retention in the military, the workplace, school, and marriage. Front. Psychol. 5, 1–12.
Hancock, D. R. (2001). Effects of test anxiety and evaluative threat on students' achievement and motivation. J. Educ. Res. 94, 284–290.
Harris, R. J. (1985). A primer of multivariate statistics (2nd ed). New York: Academic Press.
Hembree, R. (1988). Correlates, causes, effects, and treatment of test anxiety. Rev. Educ. Res. 58, 47–77.
James, W. (1907, March 1). The energies of men. Science, 25, 321–332.
Kincaid, J. P., Fishburne, R. P., Jr., Rogers, R. L., & Chissom, B. S. (1975). Derivation of new readability formulas (automated readability index, fog count and flesch reading ease formula) for navy enlisted personnel (No. RBR-8-75). Naval Technical Training Command Millington TN Research Branch.
Kohn, A. (2014). The downside of “grit.” What really happens when kids are pushed to be more persistent. Washington post. (April 6, 2014) [Link]downside-grit/.
Komarraju, M., & Nadler, D. (2013). Self-efficacy and academic achievement: Why do implicit beliefs, goals, and effort regulation matter? Learn. Individ. Differ. 25, 67–72.
Liebert, R. M., & Morris, L. W. (1967). Cognitive and emotional components of test anxiety: A distinction and some initial data. Psychol. Rep. 20, 975–978.
Mandler, G., & Sarason, S. B. (1952). A study of anxiety and learning. J. Abnorm. Soc. Psychol. 47, 166–173.
Miller, J. W., Coombs, W. T., & Fuqua, D. R. (1999). An examination of psychometric properties of Bandura's multidimensional scales of perceived self-efficacy. Meas. Eval. Couns. Dev. 31, 186–196.
Multon, K. D., Brown, S. D., & Lent, R. W. (1991). Relation of self-efficacy beliefs to academic outcomes: A meta-analytic investigation. J. Couns. Psychol. 38, 30–38.
Paunesku, D., Walton, G. M., Romero, C., Smith, E. N., Yeager, D. S., & Dweck, C. S. (2015). Mind-set interventions are a scalable treatment for academic underachievement. Psychol. Sci. 0956797615571017.
Richardson, M., Abraham, C., & Bond, R. (2012). Psychological correlates of university students' academic performance: A systematic review and meta-analysis. Psychol. Bull. 138, 353.
Rimfeld, K., Kovas, Y., Dale, P. S., & Plomin, R. (2016). True grit and genetics: Predicting academic achievement from personality. J. Pers. Soc. Psychol. 111(5), 780.
Robertson-Kraft, C., & Duckworth, A. L. (2014). True grit: Trait-level perseverance and passion for long-term goals predicts effectiveness and retention among novice teachers. Teachers college record. 1970. Teachers college record (pp. 116–). [Link][Link]/[Link]?ContentId=17352.
Sarason, I. G. (1960). Empirical findings and theoretical problems in the use of anxiety scales. Psychol. Bull. 57, 403–415.
Schmidt, F. L. (1971). The relative efficiency of regression and simple unit predictor weights in applied differential psychology. Educ. Psychol. Meas. 31, 699–714.
Segool, N. K., Carlson, J. S., Goforth, A. N., Von Der Embse, N., & Barterian, J. A. (2013). Heightened test anxiety among young children: Elementary school students' anxious responses to high-stakes testing. Psychol. Sch. 50, 489–499.
Seipp, B. (1991). Anxiety and academic performance: A meta-analysis of findings. Anxiety Research, 4, 27–41.
Sheridan, Z., Boman, P., Mergler, A., & Furlong, M. J. (2015). Examining well-being, anxiety, and self-deception in university students. Cogent Psychology, 2 (Article 993850).
Stokas, A. G. (2015). A genealogy of grit: Education in the new gilded age. Educational Theory, 65, 513–528.
Strayhorn, T. L. (2014). What role does grit play in academic success of black male collegians at predominately white institutions? J. Afr. Am. Stud. 18, 1–10.
Thomas, P. L. (2014). The “grit” narrative, “grit” research and codes that bind. The becoming radical. (Retrieved from) [Link]30/the-grit-narrative-grit-research-and-codes-that-blind/.
Tucker-Drob, E. M., Briley, D. A., Engelhardt, L. E., Mann, F. D., & Harden, K. P. (2016). Genetically-mediated associations between measures of childhood character and academic achievement. J. Pers. Soc. Psychol. 111(5), 790–815.
Wren, D. G., & Benson, J. (2004). Measuring test anxiety in children: Scale development and internal construct validation. Anxiety Stress Coping, 17, 227–240.