ACCESS Technical Report 2013-2014
Prepared by:
This report has been reviewed by the WIDA ACCESS for ELLs Technical Advisory Committee
(TAC), which is comprised of the following members:
• Jamal Abedi, Ph.D., Professor at the Graduate School of Education at the University of
California, Davis and a research partner at the National Center for Research on
Evaluation, Standards, and Student Testing (CRESST)
• Lyle Bachman, Ph.D., Professor Emeritus, Applied Linguistics, University of California,
Los Angeles
• Akihito Kamata, Ph.D., Professor, Department of Education Policy and Leadership, Department of Psychology, Southern Methodist University
• Timothy Kurtz, Hanover High School, Hanover, New Hampshire
• Carol Myford, Ph.D., Associate Professor, Educational Psychology at the University of Illinois at Chicago
• Elizabeth Peña, Ph.D., Professor, Department of Communication Sciences and Disorders, University of Texas at Austin
More information on the TAC members can be found at the WIDA website
(www.wida.us/assessment/access/TAC/index.aspx).
As in the previous annual technical reports, this report provides background to the test
(Chapter 1). The current report has been modified for Series 302 to introduce an argument-based
validation framework to support the use of ACCESS for ELLs and to contextualize the data so
that its interpretation and use are more transparent to stakeholders (Chapter 2). The rest of the
report consists of paired chapters. The first chapter within each pair contains text that explains
the data tables that follow in the second chapter. Information on the students who participated in
the operational administration is presented (Chapters 3 and 4), followed by an explanation of the
technical analyses conducted on each of the 44 test forms that constitute ACCESS for ELLs
(Chapter 5) and the tables and figures of results (Chapter 6). The final chapters explain (Chapter
7) and present (Chapter 8) technical analyses based on the domain scores and composite scores
by grade-level cluster. Note that Chapters 1–4 are in Volume 1, Chapters 5–6 are in Volume 2,
and Chapters 7–8 are in Volume 3.
Summary Highlights
This report presents a wealth of data documenting the technical properties of the 44 test forms of ACCESS for ELLs Series 302; that body of evidence cannot be fully summarized here. In addition to information on validity, the report presents information on the reliability of test scores and on the accuracy and consistency of proficiency level classifications.
Demographic Data
The Series 302 data set for analyses included the results of 1,372,806 students. The largest grade
was Kindergarten with 204,828 students, while the smallest was Grade 12 with 31,299 students.
Of the participating WIDA states, the largest was Illinois with 176,389 students, while the
smallest was Vermont with 1,533 students. Technical analyses in this report are based on the
performance of all students who were administered Series 302 of ACCESS for ELLs.
Center for Applied Linguistics (forthcoming). ACCESS for ELLs Series 302 Media-Based
Listening Field Test Technical Brief. (WIDA Consortium).
This report (forthcoming) provides detailed information on the conceptualization,
development, and field testing of ACCESS for ELLs Media-Based Listening Test.
Gottlieb, M., & Boals, T. (2005). Considerations in Reconfiguring Cohorts and Resetting Annual
Measurable Achievement Objectives (AMAOs) based on ACCESS for ELLs Data
(WIDA Consortium Technical Report No. 3).
This report is intended to assist states with the transition to a standards-based test and
determining their AMAOs using ACCESS for ELLs.
Gottlieb, M., & Kenyon, D. M. (2006). The Bridge Study between Tests of English Language
Proficiency and ACCESS for ELLs (WIDA Consortium Technical Report No. 2).
This report provides the background, procedures, and results of a study intended to
establish estimates of comparability between ACCESS for ELLs and four other
English language tests used by Consortium member states. Students in Illinois and
Rhode Island were administered ACCESS for ELLs along with one of the other four
tests, and results on the four tests were compared with results on ACCESS for ELLs.
Results allow states, districts, and schools to understand and report ACCESS for ELLs
scores and to establish continuity between previous tests and ACCESS for ELLs.
Kenyon, D. M. (2006). Development and Field Test of ACCESS for ELLs (WIDA Consortium
Technical Report No. 1).
This report provides detailed information on the conceptualization, development, and
field testing of ACCESS for ELLs. It also provides technical data on equating and
scaling procedures, standard setting and operational score reporting, analyses of
reliability and errors of measurement, and two initial validity studies.
Kenyon, D. M., Ryu, J.R. (Willow), & MacGregor, D. (2013). Setting Grade Level Cut Scores
for ACCESS for ELLs. (WIDA Consortium Technical Report No. 4).
This report describes the technical procedures and outcomes of the process to move
from grade-level-cluster cut scores to grade-level cut scores. Proposed cut scores were
determined mathematically and then reviewed and revised in a standard setting process
involving 75 teachers from 14 WIDA Consortium states.
MacGregor, D., Kenyon, D. M., Gibson, S., & Evans, E. (2009). Development and Field Test of
Kindergarten ACCESS for ELLs. (WIDA Consortium).
This report provides detailed information on the conceptualization, development, and
field testing of Kindergarten ACCESS for ELLs. It also provides technical data on
equating and scaling procedures, standard setting and operational score reporting, and
analyses of reliability and errors of measurement.
Bachman, L. F. (2005). Building and supporting a case for test use. Language Assessment
Quarterly, 2(1), 1–34.
This article describes how an argument for test use might be structured so as to provide
a clear linkage from test performance to interpretations and from interpretations to
uses.
Bachman, L. F., & Palmer, A. S. (2010). Language assessment in practice. Oxford: Oxford
University Press.
This book presents the Assessment Use Argument, which provides a framework for
justifying the intended uses of an assessment, as well as a guide for the design and
development of the assessment itself.
Bauman, J., Boals, T., Cranley, E., Gottlieb, M., & Kenyon, D. M. (2007). The Newly
Developed English Language Tests (World-Class Instructional Design and Assessment
– WIDA). In J. Abedi (Ed.), English Language Proficiency Assessment in the
Nation: Current Status and Future Practice. Davis: University of California.
In this book chapter, the authors describe the test development process, from the
development of standards through the development of items, field testing, and
operationalization. They also report on validation of the test, accommodations, the test
administration and technical manuals, and score reporting.
Chapelle, C. A., Enright, M. E., & Jamieson, J. (2010). Does an argument-based approach to
validity make a difference? Educational Measurement: Issues and Practice, 29(1), 3–
13.
Drawing on experience between 2000 and 2007 in developing a validity argument for
the high-stakes Test of English as a Foreign Language™, this paper evaluates the
differences between the argument-based approach to validity as presented by Kane
(2006) and that described in the 1999 AERA/APA/NCME Standards for Educational
and Psychological Testing.
Chapelle, C. A., Enright, M., & Jamieson, J. (Eds.). (2008). Building a validity argument for the
Test of English as a Foreign Language. London: Routledge.
This book uses the Test of English as a Foreign Language™ as a case study for
validating test design. It attempts to meet the standards of educational measurement
while also drawing on theory related to English language proficiency.
Cook, H. G. (2007). Alignment Study Report: The WIDA Consortium’s English Language
Proficiency Standards for English Language Learners in Kindergarten through Grade
12 to ACCESS for ELLs® Assessment. Madison, WI: WIDA Consortium.
In this report, the author describes a study to align the WIDA Standards to the
ACCESS for ELLs test. The study was designed to address two questions: how well
the test measures the proficiency levels described in the Standards, and how well the
different domains of each standard are addressed by the domains of the test. The author
concludes that overall ACCESS for ELLs is adequately aligned to the Standards.
Zieky, M. (1993). Practical questions in the use of DIF statistics in test development. In
P. Holland & H. Wainer (Eds.), Differential item functioning. Hillsdale, NJ: Lawrence
Erlbaum Associates.
These language proficiency levels are thoroughly embedded in the WIDA ELD Standards in a
two-pronged fashion.
First, they appear in the performance definitions. According to the WIDA ELD Standards, the
performance definitions provide a global overview of the stages of the language acquisition
process. As such, they complement the model performance indicators (PIs, see below) for each
language proficiency level. Being general definitions applicable across the PIs, the performance
definitions are not explicitly replicated within the PIs. The performance definitions are based on
three criteria. The first is students’ increasing comprehension and production of the technical
language required for success in the academic content areas. The second criterion is students’
demonstration of oral interaction or writing of increasing linguistic complexity. The final
criterion is the increasing development of phonological, syntactic, and semantic understanding in
receptive skills or control in usage in productive language skills.
Second, the language proficiency levels of the WIDA ELD Standards are fully embedded in the
accompanying PIs, which exemplify the Standards. The PIs describe the expectations for ELL
students for each of the five Standards, at five different grade-level clusters, across four
language domains, and at the five language proficiency levels. That is, within each
combination of standard, grade-level cluster, and language domain is a PI at each of the five
language proficiency levels. Proficiency Level 6, Reaching, represents the end of the continuum
rather than another level of language proficiency. The sequence of these five PIs together
describes a logical progression and accumulation of skills on the path from the lowest level of
English language proficiency to full English language proficiency for academic success. These
groupings of five PIs in logical progression are called a “strand.”
ACCESS for ELLs is based on the 80 strands, containing 400 individual PIs, within the WIDA
ELD Standards. (The Standards and the accompanying model PIs are available at the WIDA website, www.wida.us.) Each selected-response item or performance-based task on ACCESS for
ELLs is carefully developed, reviewed, piloted, and field tested to ensure that it allows students
to demonstrate accomplishment of the targeted PI. (See the sample items at the WIDA web site
for examples.)
1.2.5 Tiers
Obviously, test items and tasks suitable for allowing Entering (Level 1) or Emerging (Level 2)
students to demonstrate accomplishment of the PIs at their level of language proficiency (i.e.,
that allow them to demonstrate what they can do) will not allow Expanding (Level 4) or Bridging
(Level 5) students to demonstrate the full extent of their language proficiency. Likewise, items
and tasks developed to allow Expanding (Level 4) and Bridging (Level 5) students to
demonstrate accomplishment of the PIs at their level would be far too challenging for Entering
(Level 1) or Emerging (Level 2) students. Items that are far too easy for test takers may be
boring and lead to inattentiveness on the part of students. Likewise, items that are far too difficult
for test takers may be frustrating, discouraging them from giving their best performance. But
more importantly, a test is a measure, and items that are too easy or too hard for a student add
very little to the accuracy or quality of the measurement of that student’s proficiency. Tests need
to be at the right difficulty level for individual test takers.
The solution to making ACCESS for ELLs appropriate to the proficiency level of individual
students across the wide range of proficiencies described in the WIDA ELD Standards is to
present the test items in three overlapping tiers for each grade-level cluster: A, B, and C. Figure
1.2.5A shows how the different tiers map to the language proficiency levels.
Thus, Tier A has items and tasks designed to allow students at the lowest language proficiency
levels (Levels 1 and 2) to demonstrate meeting the WIDA ELD Standards at their language
proficiency levels, and it includes some items targeted to Language Proficiency Level 3.
Table 1.2.5A
Unique Components in ACCESS for ELLs

                        Listening/Reading/Writing              Speaking
Grade-Level Cluster     Tier A      Tier B      Tier C         (adaptive)
9-12                    x           x           x              x
6-8                     x           x           x              x
3-5                     x           x           x              x
1-2                     x           x           x              x
K                       x (adaptive)
Table 1.3.1A
Field Test for Listening, Reading and Writing: Students per Grade-level Cluster
Grade-Level Cluster Students
1-2 1,647
3-5 1,850
6-8 1,449
9-12 1,716
Total 6,662
A separate, individually administered field test was conducted for Speaking. One form was
developed for each grade-level cluster, using the adaptive design described in 1.2.5, for a total of
52 tasks. Field testing for Speaking was conducted in Wisconsin and the District of Columbia.
Table 1.3.1.B shows the number of students who participated in the Speaking field test by grade-
level cluster.
Table 1.3.1B
Speaking Field Test: Students per Grade-level Cluster
Grade-Level Cluster Students
1-2 103
3-5 159
6-8 136
9-12 125
Total 523
In addition, a separate field test was conducted in DC for the Kindergarten test. The final version
of the adaptive Kindergarten assessment was produced by first choosing the Listening and
Reading folders (i.e., sets of thematically related items) that contained items that were
empirically the easiest for first graders based on the data collected from the field test. These
folders were placed in the Kindergarten assessment in order from easiest to hardest. The
Speaking portion of the Kindergarten assessment was the same as that for the 1–2 grade-level
cluster, except it included only the SI and LA/SS folders, in order to reduce testing time. Special,
very simple writing tasks were adapted from the 1–2 grade-level cluster Tier A SI writing folder.
The adaptive administration of the Kindergarten assessment is similar to that of the Speaking
test. Thus, in any domain, if a student does not get at least two items in any folder (part) correct,
the administrator stops testing in that domain and moves on to the next domain. (The exception
is Speaking, which operates exactly as the standard ACCESS Speaking assessment.)
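To make the stopping rule concrete, the sketch below shows how an administration script for the Listening, Reading, or Writing domain of the Kindergarten test might apply it. The folder contents, item labels, and scoring function are hypothetical placeholders; only the stop-when-fewer-than-two-items-correct rule comes from the description above.

    def administer_domain(folders, score_item):
        """Administer folders in order, stopping when fewer than two items in a
        folder are answered correctly (Kindergarten Listening/Reading/Writing rule;
        Speaking follows the standard adaptive Speaking administration instead)."""
        total = 0
        for folder in folders:                      # folders ordered easiest to hardest
            correct = sum(score_item(item) for item in folder)
            total += correct
            if correct < 2:
                break                               # stop testing in this domain
        return total

    # Hypothetical example: three folders of four items; the scorer marks items right/wrong.
    responses = iter([1, 1, 0, 1,   1, 0, 0, 0,   1, 1, 1, 1])
    folders = [["L1", "L2", "L3", "L4"], ["L5", "L6", "L7", "L8"], ["L9", "L10", "L11", "L12"]]
    print(administer_domain(folders, lambda item: next(responses)))  # -> 4 (testing stops after the second folder)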
A total of 154 students participated in the Kindergarten field test. Of those, 55% were boys (84 students) and 45% were girls (70 students). Spanish speakers accounted for 90.2% (139) of the students; the only other language with more than one student was Vietnamese (3).
The Comprehension Composite score (based on performances in Listening and Reading) and the
Overall Composite score (based on performances in all four domains) were created with Series
100. Beginning with Series 101, the Oral Language Composite score (based on performances in
Listening and Speaking) and Literacy Composite score (based on performances in Reading and
Writing) were added.
Note 1: The 2005 ACCESS for ELLs field test and standard setting were based on the 2004 WIDA ELP standards.
The WIDA English Language Proficiency (ELP) Standards (2004, 2007) were amplified in 2012 to become English
Language Development (ELD) Standards (WIDA, 2012). In this section, the standards are referred to as ELD
standards for consistency.
proficiency levels and a range in 10-point increments (i.e., 50/50, 60/40, 70/30, 80/20, or 90/10),
or to indicate 100 under one language proficiency level. The results were compiled and discussed
with the panelists as a group. The panelists then were given the opportunity to reconsider and
adjust their bookmarking, if they so chose. The final results were analyzed by CAL using a
logistic regression procedure to determine the points along the underlying writing proficiency
continuum at which at least 50% of the panelists would be expected to agree that the writing
represents the work of the next higher proficiency level than the current proficiency level. The
results from this analysis were used to set the cut scores for the language proficiency levels.
The procedure for Speaking was similar, with the panelists listening to portfolios and recording
their judgments.
Beginning with Series 302, the Listening test transitioned from a traditional test administrator-
read script to a media-delivered format, played either from CD or from streaming audio available
online, for all grade-level clusters except for Kindergarten. For more information, please see the
ACCESS for ELLs Series 302 Media-Based Listening Field Test Technical Brief (Center for
Applied Linguistics, forthcoming).
For Speaking, the SI task is replaced yearly, while the MS and LS tasks are replaced in
alternating years. New items are field tested on separate forms during the operational
administration of ACCESS for ELLs.
Table 1.4A also reflects a change in the Writing test that took effect starting with Series 201. In
that series, the separate Math and Science folders were replaced with a combined Language of
Math/Language of Science folder. Starting with that series, while the IT task will continue to be
replaced yearly, the MS and LA tasks will be replaced in a two-year cycle for Tier A, and the
MS and SI tasks will be replaced in a two-year cycle for Tiers B and C.
From Table 1.4A, we see that between Series 100 and Series 101, the IT Writing task and the MA/SC Speaking task were replaced. In the Listening and Reading portions of the test, various item folders were replaced following analysis of the field test and of operational Series 100. Because ACCESS for ELLs was so new, it was decided that it was more important to improve and/or replace weaker items across all five Standards than to limit replacement to only two Standards.
102 06-07 MA SI LA SC - MA IT SI LS
103 07-08 LA SC MA SS - SC IT SI MS
203 11-12 MA SC LA SS LA SI IT SI MS
301 12-13 LA SS MA SI MS MS IT SI LS
302 13-14 MA SI - - - - IT SI MS
Social and Instructional language (SI); Language of English Language Arts (LA); Language of Math (MA); Language of Science
(SC); Language of Social Studies (SS); Integrated Language of Science, Language of Language Arts, and Language of Social
Studies (IT); Language of Math and Language of Science (MS); Language of English Language Arts and Language of Social
Studies (LS)
*Reading not refreshed for 302 because of full refreshment of 302 Listening. New specs for 303 are LA & SC.
The following paragraphs describe annual procedures currently in place that influence the
development of future items.
Phase I is conducted at the same time as equating (see Section 1.3.2) using two sources of data:
one, all student data available a week before the equating sample is pulled, called Early Return;
two, the equating sample, called Equating Sample. During Phase I analysis, only ethnicity DIF
(Hispanic vs. Non-Hispanic) is investigated. In this phase, items that show high levels of DIF in
both data sets are investigated by a team of content experts to determine if any construct-
irrelevant factors can be identified that may contribute to DIF. Items which are identified as
having construct-irrelevant sources of DIF will not be scored operationally. Two items were
identified as having C-level ethnicity DIF favoring Hispanics in the Early Return data but A-level DIF favoring Hispanics in the Equating Sample; therefore, no further action was required.
For Series 302, no items were unscored because of DIF in Phase I.
Phase II is conducted using all student data available in early May. During Phase II analysis,
ethnicity and gender DIF were investigated. As with Phase I, items that show high levels of DIF
are investigated by a team of content experts to determine if any construct-irrelevant factors can
be identified that may contribute to DIF. Items which are identified as having construct-
irrelevant sources of DIF will be removed from the test in the next operational year. For Series
302, one Listening item was identified as having C-level ethnicity-based DIF, favoring Hispanics; one Reading item was identified as having C-level ethnicity-based DIF, favoring Non-Hispanics.
For the Annual Technical Report, an ethnicity and gender DIF analysis is conducted using all
student data. For Series 302, five items showed DIF. Out of 270 Listening items, two (0.7%)
showed C-level DIF based on ethnicity, favoring Hispanics. Out of 342 Reading items, one
(0.3%) showed C-level DIF based on ethnicity, favoring Non-Hispanics. Out of 43 Writing tasks,
one (2.3%) showed C-level DIF based on ethnicity, favoring Non-Hispanics. Out of 62 Speaking
items, one (1.6%) showed C-level DIF based on ethnicity, favoring Hispanics. These items are
thoroughly analyzed by the Psychometrics/Research team at CAL to determine the potential
sources of DIF. In terms of DIF by ethnicity (Hispanics versus Non-Hispanics), special attention
is paid to the presence of Spanish-English cognates or false cognates that may affect student
performance. That information is provided to the test development team, which makes necessary
revisions to continuing items and keeps a record of such cognates for future reference. The test
development team uses this information to guide the item development and review process for
future items.
For information on the procedures used to calculate DIF, see Section 5.1.4.
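For orientation, the sketch below illustrates one standard way DIF flags of this kind can be computed and categorized: the Mantel-Haenszel delta statistic with the A/B/C ranges commonly associated with Zieky (1993). This is an assumption for illustration only; the operational procedure, including significance testing, is the one documented in Section 5.1.4, and the input format used here is hypothetical.

    from collections import defaultdict
    from math import log

    def mh_delta_dif(responses):
        """Mantel-Haenszel delta-DIF for one dichotomous item.

        `responses` is a list of (group, stratum, correct) tuples, where group is
        "reference" or "focal", stratum is a matched ability band, and correct is
        0/1.  (Hypothetical input format for illustration.)
        """
        # Tally a 2x2 table (group x correct/incorrect) within each stratum.
        tables = defaultdict(lambda: [[0.0, 0.0], [0.0, 0.0]])
        for group, stratum, correct in responses:
            g = 0 if group == "reference" else 1
            tables[stratum][g][1 - correct] += 1

        num = den = 0.0
        for (a, b), (c, d) in tables.values():   # a = ref right, b = ref wrong, c = focal right, d = focal wrong
            n = a + b + c + d
            if n == 0:
                continue
            num += a * d / n
            den += b * c / n
        alpha_mh = num / den                     # Mantel-Haenszel common odds ratio
        return -2.35 * log(alpha_mh)             # ETS delta metric

    def ets_category(delta):
        """Assign the A/B/C label (significance tests omitted in this sketch)."""
        if abs(delta) < 1.0:
            return "A"   # negligible DIF
        if abs(delta) < 1.5:
            return "B"   # moderate DIF
        return "C"       # large DIF; review for construct-irrelevant sources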
Figure 1.5.1A depicts the weighting for each of the composite scores. As shown, the Overall
Composite is computed using scores from all four domains. Each of the other three composites is
shown with the weighting of domains, in terms of the weighting used for the Overall Composite.
As the diagram shows, more weight is given to the literacy skills than to the oral skills in the Overall Composite.
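Although Figure 1.5.1A is not reproduced here, a minimal sketch of how such weighted composites can be computed from the four domain scale scores is shown below. The weights are assumptions based on the commonly cited ACCESS for ELLs weighting (Overall: 35% Reading, 35% Writing, 15% Listening, 15% Speaking; Oral Language: 50% Listening, 50% Speaking; Literacy: 50% Reading, 50% Writing; Comprehension: 70% Reading, 30% Listening) and should be checked against the figure; the rounding shown is also illustrative.

    # Hypothetical composite weights; verify against Figure 1.5.1A.
    WEIGHTS = {
        "Overall":       {"Listening": 0.15, "Speaking": 0.15, "Reading": 0.35, "Writing": 0.35},
        "Oral Language": {"Listening": 0.50, "Speaking": 0.50},
        "Literacy":      {"Reading": 0.50, "Writing": 0.50},
        "Comprehension": {"Listening": 0.30, "Reading": 0.70},
    }

    def composite_scores(domain_scale_scores):
        """Weighted composites computed from the four domain scale scores."""
        return {
            name: round(sum(w * domain_scale_scores[d] for d, w in weights.items()))
            for name, weights in WEIGHTS.items()
        }

    # Example: a student with Listening 350, Speaking 330, Reading 320, Writing 310
    print(composite_scores({"Listening": 350, "Speaking": 330, "Reading": 320, "Writing": 310}))
    # -> {'Overall': 322, 'Oral Language': 340, 'Literacy': 315, 'Comprehension': 329}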
Table 1.5.2B
Cut Scores (Reading)
Grades              Domain   Cut 1/2   Cut 2/3   Cut 3/4   Cut 4/5   Cut 5/6
K (Instructional) Read 121 159 204 228 255
K (Accountability) Read 238 251 261 274 295
1 Read 253 269 283 294 314
2 Read 267 286 303 312 331
3 Read 279 302 320 328 347
4 Read 291 316 336 343 360
5 Read 302 328 350 355 372
6 Read 312 340 360 366 382
7 Read 321 349 369 375 391
8 Read 329 358 376 382 398
9 Read 336 364 381 387 402
10 Read 341 370 383 390 406
11 Read 346 374 384 392 407
12 Read 350 376 385 393 408
Table 1.5.2D
Cut Scores (Speaking)
Grades              Domain   Cut 1/2   Cut 2/3   Cut 3/4   Cut 4/5   Cut 5/6
K (Instructional) Spek 256 285 308 342 365
K (Accountability) Spek 269 314 343 366 383
1 Spek 278 318 344 367 385
2 Spek 286 322 345 368 386
3 Spek 293 326 346 369 389
4 Spek 299 329 348 371 391
5 Spek 305 333 350 374 394
6 Spek 310 337 353 377 397
7 Spek 314 340 358 380 400
8 Spek 317 344 361 384 404
9 Spek 319 347 366 388 407
10 Spek 321 351 371 393 412
11 Spek 322 354 377 399 416
12 Spek 3232 357 384 405 421
Table 1.5.2F
Cut Scores (Literacy Composite)
Grades              Domain   Cut 1/2   Cut 2/3   Cut 3/4   Cut 4/5   Cut 5/6
K (Instructional) Litr 133 189 224 249 291
K (Accountability) Litr 232 255 278 299 323
1 Litr 246 271 296 315 338
2 Litr 259 286 312 330 352
3 Litr 272 300 325 344 366
4 Litr 283 312 338 357 377
5 Litr 295 324 350 368 388
6 Litr 305 335 361 379 397
7 Litr 315 344 370 387 406
8 Litr 324 353 379 395 413
9 Litr 332 360 385 401 419
10 Litr 339 367 390 406 424
11 Litr 345 372 394 410 427
12 Litr 351 377 398 414 430
Table 1.5.2H
Cut Scores (Overall Composite)
Grades              Domain   Cut 1/2   Cut 2/3   Cut 3/4   Cut 4/5   Cut 5/6
K (Instructional) Over 158 206 239 268 307
K (Accountability) Over 237 263 288 307 329
1 Over 249 277 303 321 344
2 Over 261 290 316 335 357
3 Over 272 303 328 347 369
4 Over 283 314 340 359 380
5 Over 293 324 350 369 390
6 Over 302 334 359 379 399
7 Over 311 342 368 386 407
8 Over 319 350 375 394 414
9 Over 327 357 382 400 419
10 Over 333 363 387 405 424
11 Over 340 368 391 409 427
12 Over 3465 372 395 413 430
A proficiency level score is reported as a number with one decimal place (e.g., 4.5). The whole number represents the student's overall language proficiency level, based on the student's scale score; a score of 4.5 indicates that the student is in language proficiency Level 4. The digit to the right of the decimal indicates the proportion of the range between cut scores that the student's scale score represents; a score of 4.5 tells us that the student's scale score is halfway between the cut scores for Levels 4 and 5.
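As a worked illustration of this interpolation, the sketch below converts a scale score to a proficiency level score using cut scores like those in Tables 1.5.2B through 1.5.2H. The handling of scores below the 1/2 cut and at or above the 5/6 cut is an assumption made for this sketch only.

    def proficiency_level_score(scale_score, cuts):
        """Interpolate a proficiency level score (e.g., 4.5) from a scale score.

        `cuts` lists the 1/2, 2/3, 3/4, 4/5, and 5/6 cut scores in order.
        """
        if scale_score >= cuts[-1]:
            return 6.0                      # at or beyond the 5/6 cut (sketch assumption)
        if scale_score < cuts[0]:
            return 1.0                      # below the 1/2 cut (sketch assumption)
        for level, (low, high) in enumerate(zip(cuts, cuts[1:]), start=2):
            if low <= scale_score < high:
                return round(level + (scale_score - low) / (high - low), 1)

    # Grade 3 Reading cuts from Table 1.5.2B: 279, 302, 320, 328, 347
    print(proficiency_level_score(324, [279, 302, 320, 328, 347]))  # -> 4.5 (halfway between the 3/4 and 4/5 cuts)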
1.7 Scoring
Test booklets are returned to MetriTech, where they are electronically scanned in preparation for
scoring. Listening, Reading, and Writing are scored by MetriTech. Speaking is locally scored by
the test administrator. Details of the scoring methods are described below.
1.7.2 Writing
Students’ responses to the Writing tasks are centrally scored at MetriTech by raters who are
trained to follow the WIDA Consortium’s Writing Rubric (see 1.7.2.1). The rubric reflects the
Performance Level Descriptions of the WIDA ELD Standards and is presented in Table 1.7.2A.
In addition to training on the generic rubric, scorers receive training on the expectations for each grade level and for each Writing task. For example, exceptional vocabulary usage in the 1–2 grade-level cluster would not be so exceptional at the 9–12 grade-level cluster. The amount of writing and the sophistication of thought at each performance level generally increase moving
up the grade-level clusters. Thus, a single generic rubric rooted in the WIDA ELD Standards lies
at the core of the scoring of Writing, but developmental differences between grade-level clusters
are part of the additional training that each rater receives.
For example, for all grades on Tier B and C tests the three tasks are given weights of 1, 2, and 3.
Thus, a student who receives scores of 6, 5, and 4 on the three Writing tasks for that test would
have an overall writing raw score of 28 ((6*1) +(5*2) + (4*3)).
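The same calculation, expressed as a one-line sketch (task weights of 1, 2, and 3, as stated above):

    def writing_raw_score(task_scores, weights=(1, 2, 3)):
        """Weighted Writing raw score for a Tier B/C form with three tasks."""
        return sum(s * w for s, w in zip(task_scores, weights))

    print(writing_raw_score((6, 5, 4)))  # (6*1) + (5*2) + (4*3) = 28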
1.7.3 Speaking
The Speaking test is administered individually to each test taker. Each task is immediately scored
by the administrator while the test is being given. The administration and scoring procedure were
designed together to be quite simple to implement. As described previously, the Speaking tasks
are designed around the PIs to allow students to demonstrate mastery of the performance level
for which the task is designed. After administering each task and listening to the student’s
responses, the administrator decides whether the student’s performance exceeds, meets, or
approaches task-level expectations. Specifically, the possible ratings are defined as follows:
Exceeds: The student’s performance exceeds task-level expectations in quantity and/or quality.
Meets: The student’s performance meets task-level expectations in quantity and quality.
Approaches: The student’s performance approaches task-level expectations, but falls short in
quantity and/or quality.
No Response: The student’s performance is quite inadequate: there is no response, the response
is incomprehensible or in a language other than English, or the student is unable to understand
the task directions.
Operationally, a score of 1 is given for every task that either meets or exceeds expectations, and
a 0 is given for any task that is rated as approaches or no response. The sum of those scores is the
total Speaking raw score for that student.
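Expressed as a sketch, this dichotomous scoring rule amounts to the following (rating labels as defined in this section):

    def speaking_raw_score(ratings):
        """Sum of dichotomous task scores: 1 for Exceeds or Meets, 0 otherwise."""
        return sum(1 if r in ("Exceeds", "Meets") else 0 for r in ratings)

    print(speaking_raw_score(["Meets", "Exceeds", "Approaches", "No Response", "Meets"]))  # -> 3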
Table 1.7.3A presents the WIDA Consortium’s Speaking Rubric, which summarizes the
expectations for each task level on the Speaking assessment. These expectations are drawn from
the performance level descriptions of the WIDA ELD Standards and are divided into three
components (Linguistic Complexity, Vocabulary Usage, and Language Control). The training for test administrators consists of familiarizing them with the tasks at each level, listening to responses to those tasks, and determining whether the responses meet the task-level expectations.
In CAL’s Validation Framework, the Plan step involves an examination of possible decisions
states might make and consequences that might result from the assessment. This leads to the
consideration of several models during the Design step, where specifications that answer such
critical questions as “What are we measuring?” and “How do we measure it?” are developed
(Mislevy, Almond, & Lukas, 2004). The subsequent steps of the validation framework highlight
the trialing, implementation, and use of the assessment results, beginning with test takers’
performance on the assessment (Assessment Performance) and continuing through the collection
of test scores (Assessment Records), interpretations of those test scores (Interpretations),
decisions made based on the test scores (Decisions), and the consequences of test use
(Consequences).
The WIDA Consortium is using CAL’s Validation Framework to present a complete validity
argument, which will be updated as needed, for ACCESS for ELLs. To date, information related
to Step 4, Assessment Records, has been explored and is found in this chapter.
Figure 2.2A: Structure of the Argument-Based Approach Supporting Step 4 Contained in this Chapter
C4.6. All test takers are provided comparable opportunities to demonstrate their English
Language Proficiency.
C4.5. All tasks and items are scored consistently for all test takers.
C4.4. Test items/tasks work appropriately together to measure each test taker’s English
Language Proficiency.
C4.3. The same scale scores obtained by test takers in different years retain the same meaning.
C4.2. ACCESS for ELLs measures English Language Proficiency for all test takers in a fair
and unbiased manner.
C4.1. Test takers are classified appropriately according to the proficiency levels defined in the
WIDA English Language Development Standards.
As shown in Figure 2.2.1A, these claims depend upon each other, again moving from (C4.6) up
to (C4.1). Within this organizational structure, each successive claim builds upon the previous
one(s) (e.g., ratings are only useful to test developers and stakeholders if all test takers are
provided comparable opportunities to demonstrate their proficiency). In the next section, these
claims are broken down even further into actions that are taken to ensure the consistency and
reliability of the assessment records.
Claim 4.6 - All test takers are provided comparable opportunities to demonstrate
their English Language Proficiency.
Action 4.6a: Well-specified procedures were developed for test administrators so that they are
able to administer the test consistently.
Evidence: Procedures for administering the test and producing reported scores are documented in
the ACCESS for ELLs Test Administration Manual (WIDA, 2012a).
Action 4.6b: Test administrators document and report any irregularities that may occur so that
appropriate action may be taken.
Evidence: Test administration procedures are documented in the ACCESS for ELLs Test
Administration Manual (WIDA, 2012a).
Claim 4.5 – All items and tasks are scored consistently for all test takers.
Action 4.5a: Raters of performance-based tasks undergo thorough training so that they know how
to score appropriately.
Evidence: Section 1.7 of this report specifies the scoring procedure for ACCESS for ELLs, with
Section 1.7.2 providing information on the Writing domain and Section 1.7.3 explicating the
procedure for Speaking tasks. Raters of Writing tasks are trained by MetriTech to follow the
Writing rubric (see Table 1.7.2B). Since Speaking tasks are scored locally, raters are trained
through an online program on the WIDA website to follow the Speaking rubric (see Table
1.7.3A).
Action 4.5b: Listening and Reading items are scored electronically using a carefully checked
key.
Action 4.5c: Raters of performance-based tasks are certified, demonstrating that they can score
appropriately.
Evidence: Section 1.7 of this report specifies the scoring procedure for ACCESS for ELLs.
Writing tasks are centrally scored at MetriTech, and all raters are pre-screened and subsequently
trained (see Section 1.7.2). Speaking is scored by the test administrator after the completion of
training on test administration and on the Speaking rubric (see Section 1.7.3).
Action 4.5d: Raters of Writing tasks are monitored daily to ensure that they are scoring
appropriately.
Evidence: MetriTech provides raters of Writing tasks with specially prepared calibration sets
each day to monitor that the scoring rubric is being applied consistently across scoring sessions
(see Section 1.7.2.1).
Action 4.5e: Scoring data for Writing tasks are analyzed for rater agreement to understand how
closely raters agree.
Evidence: Interrater reliability is calculated for each of the three or four Writing tasks. The
percentage of agreement between two raters is calculated in terms of three features (i.e.,
Linguistic Complexity, Vocabulary Usage, and Language Control). When the two raters agree on
a score, this is counted as exact agreement. If the two raters provide feature scores that differ by
one point, this is counted as adjacent agreement (see Table 6F for percentages of exact and
adjacent agreement).
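A minimal sketch of how exact and adjacent agreement percentages can be tallied from double-scored responses is shown below; the paired feature scores are illustrative, not ACCESS data.

    def agreement_rates(paired_scores):
        """Percent exact and adjacent agreement for a list of (rater1, rater2) feature scores."""
        n = len(paired_scores)
        exact = sum(1 for a, b in paired_scores if a == b)
        adjacent = sum(1 for a, b in paired_scores if abs(a - b) == 1)
        return {"exact %": 100 * exact / n, "adjacent %": 100 * adjacent / n}

    print(agreement_rates([(4, 4), (3, 4), (5, 3), (2, 2)]))
    # -> {'exact %': 50.0, 'adjacent %': 25.0}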
Claim 4.4 - Test items/tasks work appropriately together to measure each test
taker’s English Language Proficiency.
Action 4.4a: For each test form (e.g., Reading 6–8B), item and task analyses are performed and
psychometric properties of the items and tasks are evaluated to confirm that scores are internally
consistent.
Evidence: Reliability and accuracy information based on Classical Test Theory is calculated for
each test form (i.e., for each tier within each grade-level cluster). This information includes
Cronbach’s alpha, which is a measure of internal consistency. Cronbach’s coefficient alpha is
widely used as an estimate of reliability and expresses how well the items on a test appear to
work together to measure the same construct (see Table 6F).
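As a point of reference, the sketch below computes Cronbach's alpha from a small, purely illustrative persons-by-items score matrix; it is not the operational reliability computation, which is reported in Table 6F.

    def cronbach_alpha(scores):
        """Cronbach's alpha for a persons-by-items matrix of item scores."""
        k = len(scores[0])                                  # number of items
        def var(xs):                                        # sample variance
            m = sum(xs) / len(xs)
            return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
        item_vars = [var([row[j] for row in scores]) for j in range(k)]
        total_var = var([sum(row) for row in scores])
        return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

    # Four test takers, three items (illustrative 0/1 data only)
    print(round(cronbach_alpha([[1, 1, 1], [1, 0, 1], [0, 0, 1], [0, 0, 0]]), 2))  # -> 0.75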
Action 4.4b: For each domain and composite score across tiers, item and task analyses are
performed and psychometric properties of the items and tasks are evaluated to confirm that
scores are internally consistent.
Action 4.4c: Analyses of Rasch model fit statistics are conducted to show that individual tasks
perform appropriately.
Evidence: The Complete Items Properties table includes information on the Rasch fit statistics
for each test item (see Table 6H). These statistics, called outfit mean square and infit mean
square statistics, are calculated by comparing the observed empirical data with the values that the
Rasch model expects test takers to produce. Infit and outfit statistics indicate any consistently
unusual performance in relation to the item’s difficulty measure by measuring the degree to
which examinees’ responses to items deviate from expected responses. Both statistics have an
expected value of 1.0. Items with infit and outfit mean square statistics between 0.5 and 1.5 are
considered “productive for measurement” (Linacre, 2002). Values between 1.5 and 2.0 are
“unproductive for construction of measurement, but not degrading.” Values greater than 2.0
might “distort or degrade the measurement system.” Values below 0.5 are “less productive for
measurement, but not degrading.” Infit helps ensure that test takers within range of the targeted
proficiency level perform as expected. It is not as sensitive to outliers as Outfit. Outfit can be
skewed if test takers with extreme (i.e., high-level or low-level) proficiency do not perform as
expected. High infit is a bigger threat to validity, but is more difficult to explain than high outfit
(Linacre, 2002). The infit and outfit mean square statistics are part of the evaluation criteria used
to select the items and tasks that appear on the final operational forms.
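The interpretation bands quoted above can be collapsed into a simple lookup, sketched below; note that this only labels an already-computed mean square statistic and does not estimate infit or outfit from response data.

    def fit_label(mean_square):
        """Label an infit or outfit mean square using the ranges cited above (Linacre, 2002)."""
        if mean_square > 2.0:
            return "may distort or degrade the measurement system"
        if mean_square > 1.5:
            return "unproductive for construction of measurement, but not degrading"
        if mean_square >= 0.5:
            return "productive for measurement"
        return "less productive for measurement, but not degrading"

    for ms in (0.3, 0.9, 1.7, 2.4):
        print(ms, "->", fit_label(ms))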
Claim 4.3 - The same scale scores obtained by test takers in different years retain
the same meaning.
Action 4.3a: A sufficient number of items and tasks are used as anchor items across adjacent
years to maintain a consistent scale from year to year.
Evidence: Each year, while a certain percentage of items on each ACCESS for ELLs test form is
refreshed, a number of items and tasks are retained from the previous year’s assessment. These
retained “anchor items” ensure that performances on the newer form may be interpreted in the
same frame of reference as the previous year. For Listening and Reading, a majority of test items
are anchor items, while one of three Writing tasks and one of three Speaking folders are retained
annually as anchor tasks. Table 6E displays information on the anchor items for each test form.
Action 4.3b: New items and tasks are calibrated with anchor items to ensure that their difficulty
measures are on the same consistent scale that is used from year to year.
Action 4.3c: The same scaling equation is applied from year to year to ensure that scale scores
are obtained consistently over time.
Evidence: The scaling equation table is used to convert a test taker’s ability measure, which is
calculated based on test performance using Rasch modeling, into an ACCESS for ELLs scale
score (see Table 6D). The same equation is used across all tiers and grade-level clusters within
each domain.
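In general form, such a conversion is a fixed linear transformation of the Rasch ability measure (in logits) onto the reporting scale. The sketch below uses placeholder slope and intercept values; the operational constants for each domain are those given in Table 6D.

    def to_scale_score(logit_ability, slope, intercept):
        """Convert a Rasch ability measure (logits) to a reported scale score.

        slope and intercept are the domain-specific constants from the scaling
        equation table (Table 6D); the values used below are placeholders only.
        """
        return round(slope * logit_ability + intercept)

    # Placeholder constants for illustration only
    print(to_scale_score(1.25, slope=20, intercept=350))  # -> 375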
Claim 4.2 - ACCESS for ELLs measures English Language Proficiency for all test
takers in a fair and unbiased manner.
Action 4.2a: Differential item functioning (DIF) analyses are conducted to determine whether
any items or tasks may be biased against certain subgroups.
Evidence: The Item/Task Analysis Summary provides a summary of the findings of the DIF
analyses, which look for measurement bias in test items (see Table 6G). Analyses search for bias
in contrasting groups based on gender (male versus female) and ethnicity (Hispanic versus non-
Hispanic). This table shows the number of items that favored one group or the other at all levels
of DIF.
The Complete Items Properties table includes more detailed information on the DIF analyses,
showing the degree of measurement bias for each item and which group is favored (ATR Table
6H). Each item is categorized into one of three levels of DIF: A, B, or C (Zieky, 1993). An item exhibiting A-level DIF shows little or no evidence of bias toward a particular group, an item exhibiting B-level DIF displays a moderate amount of bias, and an item exhibiting C-level DIF is considered to display considerable evidence of potential bias and should be closely examined by test developers to identify any construct-irrelevant factors that may contribute to DIF.
Action 4.2b: Items that show evidence of DIF are carefully reviewed so that any that indicate
bias are not used for scoring and are removed from future test forms.
Evidence: As described in Chapter 1.4.5 (DIF Items), ethnicity and gender DIF analyses are
conducted using all test taker data. Information on DIF is gathered at different points in the
testing cycle and is provided to the test development team. The test development team uses this
information to guide the item development and review process for future items.
Claim 4.1 - Test takers are classified appropriately according to the proficiency
levels defined in the WIDA English Language Development Standards.
Action 4.1a: Distributions of scale scores and proficiency levels for each domain are analyzed to
confirm that ACCESS for ELLs effectively measures the performance of test takers across the
range of English Language Proficiency levels as defined by the WIDA English Language
Development (ELD) Standards.
Action 4.1b: Distributions of scale scores and proficiency levels, organized by grade-level
cluster, are analyzed to confirm that ACCESS for ELLs effectively measures the performance of
test takers across the range of English Language Proficiency levels as defined by the WIDA
English Language Development (ELD) Standards.
Evidence: The distribution of test takers’ scale scores on ACCESS for ELLs, organized by grade-
level cluster, shows that ACCESS for ELLs effectively measures the performance of test takers
across the range of ELD abilities as described by the WIDA ELD Standards (see Table 8A; see
Figure 8A).
The proficiency level distribution of test takers’ scores on ACCESS for ELLs, organized by
grade-level cluster, shows that ACCESS for ELLs effectively measures the performance of test
takers across the range of proficiency levels as defined by the WIDA ELD Standards (see Table
8B; see Figure 8B).
The Test Characteristic Curve reflects test takers’ mean raw scores by domain on ACCESS for
ELLs across the entire test for Kindergarten and across the three tiers for the other grade-level
clusters (see Figure 8C). It also graphically illustrates how the tiers differ in difficulty, showing
that ACCESS for ELLs effectively captures a range of ELD ability levels. Tier A is represented
by a dotted curve, Tier B by a light solid curve, and Tier C by a dark solid curve. As shown, Tier
B is more difficult than Tier A, and Tier C is more difficult than Tier B.
Action 4.1d: Across domains, analyses are run to confirm that English Language Proficiency is
measured with high precision at the cut points pertinent to each tier.
Evidence: The conditional standard error of measurement provides information on how precisely
test takers’ performances on ACCESS for ELLs are measured at the cut points between language
proficiency levels. These cut points are critical because they are the points at which decisions are
made about test taker placements. Because the cut points depend on the grade level, information
for each domain is provided for each grade level within the cluster. From Table 8C, it is possible
to examine how well the different tiers measure the English Language Proficiency of test takers
at the appropriate proficiency level cut scores (i.e., PL1 through PL3 for Tier A, PL2 through
PL4 for Tier B, and PL3 and up for Tier C).
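For dichotomous Rasch items, the conditional standard error at a given ability is the reciprocal of the square root of the test information at that ability. The sketch below illustrates that relationship with hypothetical item difficulties; it simplifies by ignoring the polytomous Writing and Speaking tasks and is not the operational computation reported in Table 8C.

    from math import exp, sqrt

    def csem(theta, item_difficulties):
        """Conditional SEM at ability theta for a form of dichotomous Rasch items."""
        info = 0.0
        for b in item_difficulties:
            p = 1.0 / (1.0 + exp(-(theta - b)))  # Rasch probability of success
            info += p * (1.0 - p)                # item information
        return 1.0 / sqrt(info)                  # CSEM = 1 / sqrt(test information)

    # Illustrative: SEM near a hypothetical cut point for a 30-item form
    print(round(csem(0.5, [i * 0.1 - 1.5 for i in range(30)]), 3))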
The Test Information Function reflects the precision of measurement by graphically presenting
the standard error of measurement across tiers for grade-level clusters (see Figure 8D). Tier A is
represented by a dotted curve, Tier B by a light solid curve, and Tier C by a dark solid curve. As
shown, Tier B is more difficult than Tier A, and Tier C is more difficult than Tier B. As in Figure 8C, the cut scores at the highest grade in each cluster are indicated by vertical lines. These
lines make it easy to see that the test forms for different tiers measure most accurately at the
proficiency levels they are meant to capture.
Action 4.1e: Classification and accuracy analyses are conducted by grade level to confirm that
proficiency level classifications are reliable for all domain and composite scores.
Evidence: Information related to the accuracy of test takers’ proficiency-level classifications is
presented in multiple ways (see Table 8E). A separate table is provided for each grade level in a
cluster. The table provides overall indices related to the accuracy and consistency of
classification. These indices indicate the percent of all test takers who would be classified into
the same language proficiency level by both the administered test and either the true score
distribution (accuracy) or a parallel test (consistency). Cohen's kappa, a statistical measure of agreement that takes chance agreement into account, is also presented. A kappa value of 1 indicates complete agreement between the two classifications, while a kappa value of 0 indicates no agreement beyond what would be expected by chance. Table 8E also shows accuracy and consistency information conditional on level and provides indices of classification accuracy and consistency at the cut points.
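A minimal sketch of the percent-agreement and kappa computation from a cross-classification table is shown below (the 3x3 table is illustrative, not ACCESS data).

    def agreement_and_kappa(table):
        """Percent agreement and Cohen's kappa from a square cross-classification table."""
        n = sum(sum(row) for row in table)
        observed = sum(table[i][i] for i in range(len(table))) / n
        row_tot = [sum(row) for row in table]
        col_tot = [sum(col) for col in zip(*table)]
        expected = sum(r * c for r, c in zip(row_tot, col_tot)) / (n * n)
        kappa = (observed - expected) / (1 - expected)
        return observed, kappa

    # Illustrative classification table (rows: administered test level, columns: parallel test level)
    obs, kappa = agreement_and_kappa([[40, 5, 0],
                                      [6, 30, 4],
                                      [1, 3, 11]])
    print(round(obs, 3), round(kappa, 3))  # -> 0.81 0.691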
Summary of Assessment Records Claims, Actions, and Evidence (continued)

5. All items and tasks are scored consistently for all test takers. (continued)
   b. Listening and Reading items are scored electronically onsite at MetriTech. Evidence: Chapter 1.7.1 (Scoring - Listening and Reading)
   d. Raters of Writing tasks are monitored daily to ensure that they are scoring appropriately. Evidence: Chapter 1.7.2.1 (Scoring Procedures for Writing)
   e. Scoring data for Writing tasks are analyzed for rater agreement to understand how closely raters agree. Evidence: Table 6F (Reliability)

4. Test items/tasks work appropriately together to measure each test taker's English Language Proficiency.
   a. For each test form (e.g., Reading 6-8B), item and task analyses are performed and psychometric properties of the items and tasks are evaluated to confirm that scores are internally consistent. Evidence: Table 6F (Reliability)
   b. For each domain and composite score across tiers, item and task analyses are performed and psychometric properties of the items and tasks are evaluated to confirm that scores are internally consistent. Evidence: Table 8D (Reliability)
   c. Analyses of Rasch model fit statistics are conducted to show that individual tasks perform appropriately. Evidence: Table 6H (Complete Item Analysis)

3. The same scale scores obtained by test takers in different years retain the same meaning. (continued)
   c. The same scaling equation is applied from year to year to ensure that scale scores are obtained consistently over time. Evidence: Table 6H (Complete Item Analysis)

2. ACCESS for ELLs measures English Language Proficiency for all test takers in a fair and unbiased manner.
   a. Differential item functioning (DIF) analyses are conducted to determine whether any items or tasks are biased against certain subgroups. Evidence: Table 6H (Complete Item Analysis); Table 6G (Item/Task Analysis Summary)
   b. Items that show evidence of DIF are carefully reviewed so that any that indicate bias are not used for scoring and are removed from future test forms. Evidence: Chapter 1.4.5 (DIF Items)

1. Test takers are classified appropriately according to the proficiency levels defined in the WIDA English Language Development Standards.
   a. Distributions of scale scores and proficiency levels for each domain are analyzed to confirm that ACCESS for ELLs effectively measures the performance of test takers across the range of English Language Proficiency levels as defined by the WIDA English Language Development Standards. Evidence: Figure 6A (Raw Scores) & Table 6A (Raw Score Descriptive Statistics); Figure 6B (Scale Scores) & Table 6B (Scale Score Descriptive Statistics); Figure 6C (Proficiency Level) & Table 6C (Proficiency Level Distribution); Table 6J (Raw Score to Proficiency Level Score Conversion Chart); Figure 6D (Test Characteristic Curve)
   b. Distributions of scale scores and proficiency levels, organized by grade-level cluster, are analyzed to confirm that ACCESS for ELLs effectively measures the performance of test takers across the range of English Language Proficiency levels as defined by the WIDA English Language Development Standards. Evidence: Figure 8A (Scale Scores) & Table 8A (Scale Score Descriptive Statistics); Figure 8B (Proficiency Level) & Table 8B (Proficiency Level Distribution); Figure 8C (Test Characteristic Curve)
   c. For each test form, analyses are run to confirm that English Language Proficiency is measured with high precision at the cut points pertinent to each tier. Evidence: Figure 6E (Test Information Function); Table 6I (Raw Score to Scale Score Conversion Chart)
To use the Visual Guide to Tables and Figures as a navigational tool, click on the links in
Figures 2.5.1 through 2.5.3 to navigate to the selected tables and figures in the Annual Technical
Report. A link is provided at the end of each section in Chapters 4, 6, and 8. Detailed
descriptions of the information in each of the tables and figures are included in the preceding
chapters (i.e., Chapter 5 contains information on tables and figures in Chapter 6, and Chapter 7
contains information on tables and figures in Chapter 8). These descriptions may be accessed
through links in Table 2.4A Summary of Assessment Records Claims, Actions, and Evidence.
Figure 2.5.1 displays the tables in Chapter 4 that provide information on participation, scale
score, and proficiency level results, as well as results by standard. The key in the upper left
corner of the figure describes the tables contained in each section of the chapter. For example,
tables in Section 4.1 contain information about participation. To find specific information in
Chapter 4, select the Grade or Grade Cluster tab, Domain or Tier tab, and then choose from three
categories: Demographic Characteristics, Domain Composites, or Domains. Within each of these
categories, several additional options organize information so that individual tables can be
accessed. For example, to find a table that displays information on the number of female Grade 2
students who completed the Speaking section, refer to Figure 2.5.1 and complete the following
steps: one, select Grade; two, select Domains; three, select Demographic Characteristics; four,
select Gender. The information is found in Table 4.2.2.2. Click on 4.2.2.2 to go to the
appropriate table in Chapter 4.
Figure 2.5.2 displays the sections in Chapter 6 that contain analyses for each ACCESS for
ELLs test form by grade-level cluster, tier, and domain. The key above the figure describes
specific information in each table and figure. For example, to find the Reliability table for Grade-
level Cluster 9–12C in the Reading domain, refer to Figure 2.5.2 and complete the following
steps: one, select Grade Cluster 9–12; two, select Tier C; three, select Reading under Domains.
Information for 9–12C Reading is shown in Section 6.5.2.3. Finally, the key shows that reliability information is located in Table F; the result is Table 6.5.2.3F. Click on 6.5.2.3 to
go to the appropriate section, and then locate Table F.
Figure 2.5.3 displays the sections in Chapter 8 that contain analyses across tiers, organized by
grade-level cluster, domain composites, and domains. The key above the figure describes the
specific information in each table and figure. For example, to find the Conditional Standard Error
of Measurement table for Grade-level Cluster 6–8 in the Writing domain, refer to Figure 2.5.3
and complete the following steps: one, select Grade Cluster 6–8; two, select Domain; three,
select Writing. Information for 6–8 Writing is shown in Section 8.5.3. Finally, the key shows that the Conditional Standard Error of Measurement table is Table C; the result is Table 8.5.3C. Click on 8.5.3 to go to the appropriate section, and then locate Table C.
Figure 2.5.1. Visual Guide to the tables in Chapter 4 (navigational links to individual Chapter 4 tables).
3.1 Participation
Participation in ACCESS for ELLs is shown in three ways: by grade-level cluster, by grade, and by tier.
3.1.2 Grade
Section 4.1.2 gives similar data as in the previous section, but broken out by grade rather than by
grade-level cluster.
3.1.3 Tier
Finally, Section 4.1.3 gives participation by tier.
Table 4.1.3.1 shows this information by cluster, tier, and domain. Because, for example,
Listening in the 1–2 grade-level cluster for Tier A represents a specific test form, this table
indicates how many students took each test form. Note that because Speaking is not administered
by tiers, the total number shows how many took that cluster’s Speaking test.
Table 4.1.3.2 shows the same information, but by grade rather than by grade-level cluster.
Table 4.1.3.3 shows the breakdown by grade-level cluster and tier for gender. When reviewing
data on DIF in Chapter 6, it may be useful to refer to these tables to understand the size of the
comparison groups on each form.
Table 4.1.3.4 shows the same information for ethnicity (Hispanic vs. Non-Hispanic). Consortium
member states use the Census Bureau categories for student ethnicity. Again, this data may be
useful when reviewing analyses of DIF in tables G and H in Chapter 6.
Note that in some circumstances there was a mismatch between a student’s reported grade and
the reported cluster of the test the student took (for example, a student who was reported to be in
Kindergarten but who was administered a test in the 1–2 grade-level cluster). In all, 334 students
were administered a test form from a cluster other than the grade in which they were reported to
be. Table 3.1 below shows the number of students in each grade who were administered out-of-
grade-level tests, and the test form that they were administered. The data for these students was
eliminated from all analyses in this report.
3.2.2 Correlations
Tables 4.2.3A through 4.2.3E show correlations among the four domain scale scores by grade-
level clusters across all tiers, as well as the number of students included in each correlation.
Table 4.2.3A shows the results for Kindergarten, Table 4.2.3B shows the results for the 1–2
grade-level cluster, Table 4.2.3C shows the results for the 3–5 grade-level cluster, Table 4.2.3D
shows the results for the 6–8 grade-level cluster, and Table 4.2.3E shows the results for the 9–12
grade-level cluster. Beginning with Series 101, caps were placed on the scores of students taking Tier A and Tier B test forms in Listening and Reading. This capping of scores may raise the correlation between those two scores, while decreasing the correlation of those two scores with Speaking
and Writing. Note that all correlations in Tables 4.2.3A through 4.2.3E are significant at the 0.01 level (2-tailed).
3.4.2 Writing
Section 4.4.2 shows the results for Writing. Again, the first section (4.4.2.1) shows results by
grade-level cluster, while the second section (4.4.2.2) shows the results by grade. Within each
table, the third column shows the Standard (Social and Instructional Language, Language of
Language Arts/Social Studies, and Language of Mathematics/Science). The next three columns show the mean raw scores (each out of a maximum of 6) for the three Writing subscores: Linguistic Complexity, Vocabulary Usage, and Language Control. The seventh column shows
the total mean raw score for each Standard (out of a maximum of 18). The final column shows
the mean raw score as a percentage of the maximum possible score.
3.4.3 Speaking
Finally, Section 4.4.3 presents the results for Speaking. As in the previous sections, the first
section (4.4.3.1) shows results by grade-level cluster, while the second section (4.4.3.2) shows
the results by grade. Note that the Speaking assessment itself is adaptive but not tiered. Student
results are categorized here by tier according to the tier of the group-administered assessment
that they took. Within each table, the third column shows the Standard (Social and Instructional
Language, Language of Language Arts/Social Studies, and Language of Mathematics/Science).
The fourth column shows the maximum possible score, the fifth column shows the mean raw
score, and the sixth column shows the mean raw score as a percentage of the maximum possible
score.
4.1.1.1 By State
Table 4.1.1.1
Participation by Cluster by State S302
                  Cluster
State     K          1-2        3-5        6-8        9-12       Total
AK 1,569 3,560 4,063 2,924 2,884 15,000
AL 3,762 6,298 3,651 2,105 1,972 17,788
CO 12,525 25,972 30,932 21,189 16,576 107,194
DC 1,149 1,793 1,194 852 1,024 6,012
DE 1,792 2,915 1,854 888 923 8,372
GA 17,917 33,137 23,045 10,822 8,223 93,144
HI 2,456 3,947 3,777 2,985 3,477 16,642
IL 30,799 60,513 45,318 22,284 17,475 176,389
KY 3,280 6,244 5,038 2,898 2,712 20,172
MA 10,003 18,876 19,031 12,608 13,678 74,196
MD 10,110 17,921 13,922 7,573 7,919 57,445
ME 522 1,021 1,273 1,161 1,280 5,257
MI 10,503 19,024 21,483 14,936 15,522 81,468
MN 8,608 16,237 18,901 13,223 11,123 68,092
MO 4,618 7,788 7,423 4,399 3,556 27,784
MP 63 255 543 534 222 1,617
MS 1,312 2,391 2,376 1,357 1,034 8,470
MT 321 790 1,135 697 478 3,421
NC 14,123 27,576 25,140 15,990 13,632 96,461
ND 418 718 830 684 764 3,414
NH 407 1,085 1,092 715 897 4,196
NJ 11,768 19,239 12,843 7,940 10,816 62,606
NM 6,433 13,595 15,266 11,187 8,919 55,400
NV 9,663 20,134 20,690 12,804 7,866 71,157
OK 7,291 13,057 11,260 6,797 5,275 43,680
PA 4,754 10,926 12,025 9,989 11,995 49,689
RI 1,228 2,464 2,589 1,646 1,869 9,796
SD 662 1,107 1,157 826 853 4,605
UT 5,559 10,916 8,207 5,718 5,526 35,926
VA 14,803 27,739 23,153 13,731 15,813 95,239
VT 196 366 392 227 352 1,533
WI 5,792 11,932 13,497 9,005 7,616 47,842
WY 422 726 770 484 397 2,799
Total 204,828 390,262 353,870 221,178 202,668 1,372,806
4.1.1.3 By Ethnicity
Table 4.1.1.3
Participation by Cluster by Ethnicity S302
Hispanic/Non-Hispanic
Cluster Hispanic Other Missing Total
Count 137,915 63,871 3,042 204,828
K
% within Cluster 67.3% 31.2% 1.5% 100.0%
Count 269,735 115,964 4,563 390,262
1-2
% within Cluster 69.1% 29.7% 1.2% 100.0%
Count 242,064 107,037 4,769 353,870
3-5
% within Cluster 68.4% 30.2% 1.3% 100.0%
Count 146,739 70,573 3,866 221,178
6-8
% within Cluster 66.3% 31.9% 1.7% 100.0%
Count 121,693 76,619 4,356 202,668
9-12
% within Cluster 60.0% 37.8% 2.1% 100.0%
Count 918,146 434,064 20,596 1,372,806
Total
% within Cluster 66.9% 31.6% 1.5% 100.0%
4.1.2.1 By State
Table 4.1.2.1
Participation by Grade by State S302
Grade
State K 1 2 3 4 5 6 7 8 9 10 11 12 Total
AK 1,569 1,797 1,763 1,614 1,374 1,075 1,005 1,032 887 1,012 720 573 579 15,000
AL 3,762 3,434 2,864 2,075 924 652 614 706 785 944 467 352 209 17,788
CO 12,525 12,914 13,058 12,212 9,801 8,919 7,658 7,246 6,285 5,529 4,152 3,391 3,504 107,194
DC 1,149 994 799 559 356 279 292 253 307 560 197 123 144 6,012
DE 1,792 1,624 1,291 1,018 484 352 299 286 303 465 225 135 98 8,372
GA 17,917 17,589 15,548 12,871 5,947 4,227 3,540 3,781 3,501 4,418 1,841 1,120 844 93,144
HI 2,456 2,074 1,873 1,895 997 885 819 1,026 1,140 1,602 799 560 516 16,642
IL 30,799 31,002 29,511 25,508 11,678 8,132 6,917 7,782 7,585 8,053 4,111 3,116 2,195 176,389
KY 3,280 3,298 2,946 2,475 1,479 1,084 936 1,033 929 1,085 738 504 385 20,172
MA 10,003 9,742 9,134 7,895 5,946 5,190 4,379 4,316 3,913 5,018 3,435 3,005 2,220 74,196
MD 10,110 9,430 8,491 7,266 3,763 2,893 2,554 2,682 2,337 3,991 2,022 999 907 57,445
ME 522 544 477 505 423 345 347 414 400 394 331 288 267 5,257
MI 10,503 9,732 9,292 8,191 7,150 6,142 5,577 4,941 4,418 4,646 4,480 3,325 3,071 81,468
MN 8,608 8,265 7,972 7,470 6,259 5,172 4,660 4,530 4,033 3,969 2,676 2,464 2,014 68,092
MO 4,618 3,992 3,796 3,315 2,276 1,832 1,528 1,481 1,390 1,336 972 705 543 27,784
MP 63 115 140 276 134 133 206 165 163 151 33 18 20 1,617
MS 1,312 1,282 1,109 1,040 766 570 516 457 384 462 300 168 104 8,470
MT 321 362 428 407 410 318 262 229 206 187 130 94 67 3,421
NC 14,123 13,968 13,608 12,951 6,782 5,407 5,222 5,383 5,385 6,728 3,284 2,071 1,549 96,461
ND 418 389 329 373 246 211 246 236 202 263 189 175 137 3,414
NH 407 511 574 593 276 223 228 234 253 349 227 179 142 4,196
NJ 11,768 10,475 8,764 6,459 3,679 2,705 2,513 2,650 2,777 3,615 3,120 2,421 1,660 62,606
NM 6,433 6,967 6,628 6,253 5,139 3,874 3,681 3,769 3,737 3,885 2,245 1,573 1,216 55,400
NV 9,663 10,261 9,873 9,172 6,414 5,104 4,676 4,516 3,612 2,824 1,876 1,648 1,518 71,157
OK 7,291 6,997 6,060 5,355 3,327 2,578 2,353 2,229 2,215 2,409 1,261 917 688 43,680
PA 4,754 5,396 5,530 4,834 3,821 3,370 3,310 3,404 3,275 3,907 3,246 2,561 2,281 49,689
RI 1,228 1,338 1,126 1,098 839 652 555 565 526 637 534 391 307 9,796
SD 662 580 527 539 329 289 276 268 282 334 233 163 123 4,605
UT 5,559 5,918 4,998 2,869 2,817 2,521 2,100 1,834 1,784 1,543 1,615 1,363 1,005 35,926
VA 14,803 14,438 13,301 11,730 6,747 4,676 4,220 4,596 4,915 7,228 3,836 3,142 1,607 95,239
VT 196 193 173 202 100 90 70 77 80 108 89 92 63 1,533
WI 5,792 6,027 5,905 5,416 4,750 3,331 2,750 3,175 3,080 3,184 1,706 1,476 1,250 47,842
WY 422 368 358 402 221 147 158 168 158 148 115 68 66 2,799
Total 204,828 202,016 188,246 164,838 105,654 83,378 74,467 75,464 71,247 80,984 51,205 39,180 31,299 1,372,806
4.2.1.1 By Cluster
Table 4.2.1.1
Mean Scale Scores by Cluster S302
Cluster List Read Writ Spek Oral Litr Cphn Over
K Mean 269.99 192.18 210.63 302.51 286.48 201.65 215.52 226.90
N 203,841 203,853 203,840 203,827 203,823 203,837 203,837 203,809
1-2 Mean 311.79 296.51 276.02 346.38 329.35 286.56 301.17 299.16
N 388,944 388,669 388,786 388,802 388,584 388,428 388,557 388,056
3-5 Mean 357.86 338.60 345.45 357.86 358.12 342.28 344.52 346.83
N 352,289 351,942 352,025 352,105 351,940 351,617 351,832 351,277
6-8 Mean 384.46 358.42 354.70 372.24 378.64 356.82 366.31 363.16
N 219,478 219,320 219,327 219,219 218,979 219,028 219,163 218,525
9-12 Mean 385.46 375.73 397.54 380.05 383.05 386.92 378.73 385.57
N 197,907 198,037 197,712 197,615 196,703 197,316 197,563 196,059
Table 4.2.1.3
Mean Scale Scores by Cluster by Ethnicity S302
Cluster Ethnicity List Read Writ Spek Oral Litr Cphn Over
K Non-Hispanic Asian Mean 280.02 219.99 233.95 309.54 295.01 227.24 237.98 247.36
N 27,047 27,046 27,044 27,047 27,046 27,042 27,045 27,042
K Non-Hispanic Pacific Islander Mean 260.69 182.03 202.85 303.97 282.56 192.68 205.66 219.45
N 1,891 1,892 1,892 1,891 1,891 1,892 1,891 1,891
K Non-Hispanic Black Mean 277.75 205.70 223.69 317.81 298.02 214.95 227.31 239.67
N 8,944 8,943 8,943 8,942 8,942 8,943 8,943 8,941
K Hispanic (Of Any Race) Mean 265.40 184.23 203.60 298.12 281.99 194.16 208.57 220.31
N 137,415 137,423 137,413 137,404 137,402 137,413 137,414 137,395
K Non-Hispanic American Indian Mean 269.89 181.82 192.49 300.37 285.35 187.39 208.23 216.58
N 3,539 3,540 3,540 3,539 3,538 3,540 3,539 3,538
K Non-Hispanic Multi-racial Mean 304.73 223.73 238.55 336.70 320.95 231.40 248.11 258.14
N 967 969 969 966 966 969 967 966
K Non-Hispanic White Mean 284.48 204.71 225.05 315.90 300.41 215.14 228.63 240.51
N 20,220 20,220 20,219 20,220 20,220 20,218 20,220 20,218
K Missing Mean 264.83 190.13 204.94 296.52 280.88 197.77 212.55 222.53
N 3,818 3,820 3,820 3,818 3,818 3,820 3,818 3,818
1-2 Non-Hispanic Asian Mean 316.21 304.57 284.22 349.09 332.91 294.69 308.13 305.91
N 48,036 48,015 48,009 48,000 47,982 47,984 48,004 47,940
1-2 Non-Hispanic Pacific Islander Mean 305.25 292.39 277.01 341.87 323.86 284.97 296.36 296.42
N 3,467 3,460 3,464 3,459 3,458 3,457 3,458 3,449
1-2 Non-Hispanic Black Mean 312.16 297.97 277.24 351.51 332.10 287.90 302.27 300.92
N 17,362 17,349 17,354 17,358 17,347 17,336 17,343 17,321
1-2 Hispanic (Of Any Race) Mean 310.61 294.56 274.06 344.70 327.92 284.60 299.45 297.36
N 269,122 268,915 269,035 269,048 268,896 268,763 268,840 268,519
1-2 Non-Hispanic American Indian Mean 308.16 292.64 271.65 340.18 324.46 282.45 297.36 294.84
N 7,091 7,085 7,073 7,089 7,079 7,066 7,079 7,050
1-2 Non-Hispanic Multi-racial Mean 322.42 305.90 282.07 363.99 343.49 294.27 310.90 308.83
N 1,675 1,676 1,676 1,674 1,672 1,676 1,674 1,671
1-2 Non-Hispanic White Mean 316.37 301.00 279.99 354.45 335.66 290.79 305.67 304.02
N 36,073 36,055 36,065 36,060 36,047 36,041 36,046 36,013
1-2 Missing Mean 306.01 292.61 273.84 341.87 324.22 283.53 296.68 295.50
N 6,118 6,114 6,110 6,114 6,103 6,105 6,113 6,093
4.2.2.1 By Grade
Table 4.2.2.1
Mean Scale Scores by Grade S302
Grade List Read Writ Spek Oral Litr Cphn Over
K Mean 269.99 192.18 210.63 302.51 286.48 201.65 215.52 226.90
N 203,841 203,853 203,840 203,827 203,823 203,837 203,837 203,809
1 Mean 299.11 283.48 266.82 337.37 318.47 275.44 288.17 288.11
N 201,336 201,177 201,249 201,244 201,132 201,059 201,119 200,853
2 Mean 325.40 310.49 285.90 356.05 341.02 298.49 315.11 311.00
N 187,608 187,492 187,537 187,558 187,452 187,369 187,438 187,203
3 Mean 349.43 331.68 340.82 355.70 352.84 336.47 337.18 341.21
N 164,211 164,072 164,071 164,141 164,066 163,916 164,030 163,789
4 Mean 360.74 340.84 346.85 358.06 359.63 344.14 346.95 348.57
N 105,153 105,032 105,084 105,094 105,043 104,940 104,998 104,822
5 Mean 370.90 349.48 352.81 361.87 366.65 351.43 355.99 355.77
N 82,925 82,838 82,870 82,870 82,831 82,761 82,804 82,666
6 Mean 377.12 351.16 349.34 369.61 373.63 350.51 359.00 357.25
N 73,891 73,817 73,833 73,805 73,741 73,722 73,777 73,582
7 Mean 385.15 358.84 355.17 372.59 379.15 357.27 366.83 363.63
N 74,881 74,844 74,840 74,819 74,722 74,744 74,773 74,562
8 Mean 391.41 365.56 359.80 374.63 383.33 362.95 373.39 368.85
N 70,706 70,659 70,654 70,595 70,516 70,562 70,613 70,381
9 Mean 381.43 372.59 394.61 376.09 379.05 383.90 375.34 382.26
N 79,568 79,602 79,499 79,452 79,128 79,349 79,449 78,900
10 Mean 385.08 374.91 396.88 379.28 382.45 386.19 378.04 384.87
N 50,140 50,148 50,096 50,051 49,875 49,998 50,057 49,719
11 Mean 389.92 379.46 400.95 383.57 387.04 390.50 382.67 389.27
N 38,200 38,245 38,170 38,164 37,978 38,092 38,136 37,857
12 Mean 391.12 380.63 402.05 387.37 389.60 391.62 383.88 390.85
N 29,999 30,042 29,947 29,948 29,722 29,877 29,921 29,583
Table 4.2.3A
Correlations Among Scale Scores: K S302
Listening Reading Writing Speaking
Listening Pearson Correlation 1 .537 .555 .783
N 203,841 203,837 203,824 203,823
Reading Pearson Correlation 1 .720 .496
N 203,853 203,837 203,823
Writing Pearson Correlation 1 .553
N 203,840 203,813
Speaking Pearson Correlation 1
N 203,827
Table 4.2.3B
Correlations Among Scale Scores: 1-2 S302
Listening Reading Writing Speaking
Listening Pearson Correlation 1 .688 .569 .497
N 388,944 388,557 388,626 388,584
Reading Pearson Correlation 1 .671 .447
N 388,669 388,428 388,302
Writing Pearson Correlation 1 .467
N 388,786 388,461
Speaking Pearson Correlation 1
N 388,802
Table 4.2.3C
Correlations Among Scale Scores: 3-5 S302
Listening Reading Writing Speaking
Listening Pearson Correlation 1 .727 .611 .484
N 352,289 351,832 351,878 351,940
Reading Pearson Correlation 1 .676 .472
N 351,942 351,617 351,579
Writing Pearson Correlation 1 .509
N 352,025 351,709
Speaking Pearson Correlation 1
N 352,105
Table 4.2.3E
Correlations Among Scale Scores: 9-12 S302
Listening Reading Writing Speaking
Listening Pearson Correlation 1 .700 .648 .600
N 197,907 197,563 197,133 196,703
Reading Pearson Correlation 1 .694 .530
N 198,037 197,316 196,711
Writing Pearson Correlation 1 .597
N 197,712 196,576
Speaking Pearson Correlation 1
N 197,615
Table 4.3.1.1B
Proficiency Level by Cluster by Tier (Percent): Listening S302
Listening Proficiency Range
Cluster Tier 1 2 3 4 5 6 Total
K (instructional) - 12.1% 5.8% 10.2% 16.0% 30.8% 25.1% 100.0%
K (accountability) - 25.1% 10.2% 8.8% 5.8% 15.8% 34.3% 100.0%
A 7.4% 13.3% 25.0% 54.2% n/a n/a 100.0%
1-2 B 0.5% 1.4% 4.5% 4.3% 89.2% n/a 100.0%
C 0.4% 2.7% 15.6% 9.5% 27.0% 44.7% 100.0%
A 7.8% 29.3% 27.6% 35.4% n/a n/a 100.0%
3-5 B 0.5% 4.1% 14.5% 14.7% 66.2% n/a 100.0%
C 0.1% 0.9% 7.5% 9.5% 26.9% 55.0% 100.0%
A 23.9% 39.9% 19.7% 16.6% n/a n/a 100.0%
6-8 B 1.2% 12.9% 23.0% 26.4% 36.4% n/a 100.0%
C 0.2% 0.6% 7.1% 16.2% 35.7% 40.2% 100.0%
A 48.1% 35.8% 8.8% 7.2% n/a n/a 100.0%
9-12 B 3.8% 12.4% 26.5% 25.5% 31.9% n/a 100.0%
C 1.4% 4.3% 15.5% 31.3% 26.4% 21.1% 100.0%
Table 4.3.1.3B
Proficiency Level by Grade (Percent): Listening S302
Listening Proficiency Range
1 2 3 4 5 6 Total
K (instructional) 12.1% 5.8% 10.2% 16.0% 30.8% 25.1% 100.0%
K (accountability) 25.1% 10.2% 8.8% 5.8% 15.8% 34.3% 100.0%
1 2.4% 5.2% 15.3% 23.3% 45.7% 8.0% 100.0%
2 1.5% 3.5% 8.7% 9.7% 59.0% 17.5% 100.0%
3 0.6% 4.1% 11.8% 9.3% 44.6% 29.6% 100.0%
4 1.0% 4.8% 11.4% 16.9% 41.0% 24.9% 100.0%
5 1.6% 5.5% 13.7% 18.9% 35.2% 25.2% 100.0%
6 2.0% 8.4% 16.0% 19.4% 34.9% 19.3% 100.0%
7 3.1% 9.4% 16.5% 19.2% 33.2% 18.6% 100.0%
8 4.4% 10.8% 10.7% 21.7% 28.0% 24.5% 100.0%
9 9.8% 13.1% 14.9% 22.1% 30.0% 10.0% 100.0%
10 8.6% 11.7% 21.0% 25.9% 23.8% 9.0% 100.0%
11 8.4% 11.2% 18.0% 30.4% 21.4% 10.5% 100.0%
12 9.4% 10.6% 26.2% 27.8% 15.6% 10.3% 100.0%
Table 4.3.2.1B
Proficiency Level by Cluster by Tier (Percent): Reading S302
Reading Proficiency Range
Cluster Tier 1 2 3 4 5 6 Total
K (instructional) - 22.0% 13.8% 19.6% 8.4% 10.1% 26.0% 100.0%
K (accountability) - 67.0% 7.0% 4.6% 5.3% 16.1% 0.0% 100.0%
A 24.2% 24.3% 19.9% 31.6% n/a n/a 100.0%
1-2 B 1.4% 4.1% 17.3% 14.9% 62.3% n/a 100.0%
C 1.3% 4.1% 15.5% 12.9% 27.2% 39.0% 100.0%
A 31.8% 32.7% 14.3% 21.3% n/a n/a 100.0%
3-5 B 1.9% 10.4% 24.5% 9.1% 54.0% n/a 100.0%
C 0.4% 2.4% 13.0% 9.9% 32.4% 41.9% 100.0%
A 39.4% 39.7% 12.2% 8.7% n/a n/a 100.0%
6-8 B 4.6% 25.7% 35.4% 6.8% 27.5% n/a 100.0%
C 1.3% 17.9% 32.5% 13.5% 18.6% 16.1% 100.0%
A 38.6% 37.4% 11.9% 12.2% n/a n/a 100.0%
9-12 B 12.4% 38.8% 18.1% 8.2% 22.4% n/a 100.0%
C 1.7% 14.0% 15.2% 13.6% 22.1% 33.4% 100.0%
Table 4.3.2.3B
Proficiency Level by Grade (Percent): Reading S302
Reading Proficiency Range
1 2 3 4 5 6 Total
K (instructional) 22.0% 13.8% 19.6% 8.4% 10.1% 26.0% 100.0%
K (accountability) 67.0% 7.0% 4.6% 5.3% 16.1% 0.0% 100.0%
1 8.5% 10.2% 18.6% 23.1% 32.4% 7.2% 100.0%
2 4.1% 6.8% 16.1% 12.5% 45.5% 15.1% 100.0%
3 2.2% 6.2% 14.8% 8.3% 46.6% 21.9% 100.0%
4 4.3% 9.3% 17.6% 15.7% 32.3% 20.9% 100.0%
5 5.6% 11.3% 24.7% 8.2% 32.0% 18.2% 100.0%
6 4.7% 20.4% 35.8% 12.4% 19.7% 6.9% 100.0%
7 6.6% 23.4% 31.8% 11.4% 18.9% 7.8% 100.0%
8 8.8% 25.9% 26.3% 7.5% 21.2% 10.3% 100.0%
9 11.2% 24.4% 19.2% 10.1% 18.2% 16.9% 100.0%
10 10.5% 31.0% 15.4% 12.7% 16.8% 13.6% 100.0%
11 11.2% 27.4% 12.2% 12.8% 19.7% 16.6% 100.0%
12 12.6% 26.6% 12.3% 10.3% 23.5% 14.7% 100.0%
Table 4.3.3.1B
Proficiency Level by Cluster by Tier (Percent): Writing S302
Writing Proficiency Range
Cluster Tier 1 2 3 4 5 6 Total
K (instructional) - 17.7% 29.8% 16.8% 14.1% 18.7% 2.8% 100.0%
K (accountability) - 58.5% 20.1% 12.4% 6.3% 2.8% 0.0% 100.0%
A 13.9% 64.4% 21.7% 0.0% 0.0% 0.0% 100.0%
1-2 B 7.1% 44.0% 48.3% 0.6% 0.0% 0.0% 100.0%
C 2.4% 20.0% 67.3% 10.2% 0.0% 0.0% 100.0%
A 12.8% 25.8% 38.4% 22.3% 0.7% 0.0% 100.0%
3-5 B 1.2% 7.1% 30.0% 57.2% 4.5% 0.0% 100.0%
C 0.2% 1.2% 7.6% 63.5% 26.6% 0.9% 100.0%
A 18.9% 33.8% 39.8% 7.3% 0.1% 0.0% 100.0%
6-8 B 4.9% 15.3% 55.3% 24.0% 0.5% 0.0% 100.0%
C 1.7% 8.0% 60.6% 29.2% 0.4% 0.0% 100.0%
A 13.7% 29.8% 49.8% 6.5% 0.2% 0.0% 100.0%
9-12 B 5.1% 6.6% 33.0% 45.6% 9.2% 0.5% 100.0%
C 2.0% 2.0% 14.1% 45.1% 33.3% 3.6% 100.0%
Table 4.3.3.3B
Proficiency Level by Grade (Percent): Writing S302
Writing Proficiency Range
1 2 3 4 5 6 Total
K (instructional) 17.7% 29.8% 16.8% 14.1% 18.7% 2.8% 100.0%
K (accountability) 58.5% 20.1% 12.4% 6.3% 2.8% 0.0% 100.0%
1 8.9% 48.4% 40.3% 2.4% 0.0% 0.0% 100.0%
2 5.5% 34.5% 55.9% 4.1% 0.0% 0.0% 100.0%
3 1.2% 5.1% 16.1% 57.4% 19.5% 0.7% 100.0%
4 1.8% 6.3% 19.8% 58.0% 13.8% 0.4% 100.0%
5 2.5% 6.2% 26.5% 56.7% 8.0% 0.2% 100.0%
6 3.4% 12.5% 47.9% 35.4% 0.8% 0.0% 100.0%
7 4.9% 12.4% 57.8% 24.7% 0.3% 0.0% 100.0%
8 6.1% 15.9% 63.8% 14.1% 0.1% 0.0% 100.0%
9 4.5% 8.7% 20.3% 32.3% 30.7% 3.4% 100.0%
10 4.9% 7.7% 26.6% 43.1% 16.5% 1.2% 100.0%
11 4.8% 6.3% 31.2% 47.1% 10.0% 0.7% 100.0%
12 6.1% 7.8% 37.4% 43.5% 4.8% 0.3% 100.0%
Table 4.3.4.1B
Proficiency Level by Cluster by Tier (Percent): Speaking S302
Speaking Proficiency Range
Cluster Tier 1 2 3 4 5 6 Total
K (instructional) - 23.1% 7.9% 15.2% 16.7% 11.1% 26.0% 100.0%
K (accountability) - 23.1% 23.1% 16.7% 11.1% 26.0% 0.0% 100.0%
A 22.2% 35.9% 16.2% 6.0% 4.3% 15.5% 100.0%
1-2 B 4.1% 20.1% 20.2% 10.2% 9.1% 36.3% 100.0%
C 1.4% 8.1% 12.5% 8.8% 9.7% 59.5% 100.0%
A 51.1% 24.5% 9.8% 3.8% 2.8% 8.0% 100.0%
3-5 B 6.1% 20.9% 22.3% 11.8% 10.5% 28.5% 100.0%
C 2.1% 10.0% 16.1% 12.6% 14.2% 45.1% 100.0%
A 53.3% 20.7% 11.7% 6.5% 2.2% 5.6% 100.0%
6-8 B 4.6% 11.2% 20.5% 22.2% 11.1% 30.5% 100.0%
C 1.3% 4.0% 12.0% 21.1% 14.3% 47.3% 100.0%
A 63.5% 16.4% 8.6% 4.4% 1.8% 5.2% 100.0%
9-12 B 10.1% 14.2% 17.3% 14.5% 9.6% 34.3% 100.0%
C 2.4% 4.8% 10.9% 14.1% 12.6% 55.2% 100.0%
Table 4.3.4.3B
Proficiency Level by Grade (Percent): Speaking S302
Speaking Proficiency Range
1 2 3 4 5 6 Total
K (instructional) 23.1% 7.9% 15.2% 16.7% 11.1% 26.0% 100.0%
K (accountability) 23.1% 23.1% 16.7% 11.1% 26.0% 0.0% 100.0%
1 8.3% 27.1% 18.5% 8.7% 7.4% 30.1% 100.0%
2 6.1% 12.8% 15.7% 9.1% 9.1% 47.0% 100.0%
3 6.9% 18.7% 19.4% 11.4% 11.3% 32.4% 100.0%
4 8.8% 14.3% 17.9% 12.0% 12.0% 34.9% 100.0%
5 8.5% 11.8% 16.1% 11.2% 12.1% 40.2% 100.0%
6 7.1% 8.9% 15.5% 24.7% 11.5% 32.4% 100.0%
7 8.6% 6.2% 13.7% 23.0% 11.9% 36.6% 100.0%
8 8.9% 10.5% 16.3% 11.8% 12.0% 40.5% 100.0%
9 17.9% 6.3% 10.0% 16.6% 9.8% 39.3% 100.0%
10 14.5% 12.7% 14.7% 9.9% 9.9% 38.3% 100.0%
11 11.0% 13.0% 15.4% 10.4% 9.8% 40.3% 100.0%
12 8.8% 12.2% 15.1% 10.6% 10.2% 43.1% 100.0%
Table 4.3.5.1B
Proficiency Level by Cluster by Tier (Percent): Oral S302
Oral Language Proficiency Range
Cluster Tier 1 2 3 4 5 6 Total
K (instructional) - 14.6% 9.1% 12.0% 20.5% 22.6% 21.2% 100.0%
K (accountability) - 25.3% 14.4% 16.5% 8.8% 13.8% 21.2% 100.0%
A 13.3% 25.0% 37.2% 10.0% 14.5% 0.0% 100.0%
1-2 B 0.9% 4.6% 30.7% 19.2% 44.6% 0.0% 100.0%
C 0.5% 3.1% 12.0% 14.0% 32.8% 37.6% 100.0%
A 28.4% 33.5% 24.1% 6.5% 7.5% 0.0% 100.0%
3-5 B 1.2% 7.3% 25.3% 29.5% 36.7% 0.0% 100.0%
C 0.3% 1.7% 8.4% 16.6% 32.9% 40.1% 100.0%
A 43.8% 28.5% 16.3% 7.0% 4.3% 0.0% 100.0%
6-8 B 2.0% 10.1% 20.8% 34.2% 32.8% 0.0% 100.0%
C 0.4% 0.9% 5.6% 17.8% 34.0% 41.2% 100.0%
A 56.5% 27.1% 9.3% 4.6% 2.5% 0.0% 100.0%
9-12 B 4.2% 14.6% 21.0% 26.1% 34.1% 0.0% 100.0%
C 1.6% 3.0% 10.6% 24.5% 34.1% 26.2% 100.0%
Table 4.3.5.3B
Proficiency Level by Grade (Percent): Oral S302
Oral Proficiency Range
1 2 3 4 5 6 Total
K (instructional) 14.6% 9.1% 12.0% 20.5% 22.6% 21.2% 100.0%
K (accountability) 25.3% 14.4% 16.5% 8.8% 13.8% 21.2% 100.0%
1 4.3% 11.9% 32.6% 15.2% 28.6% 7.5% 100.0%
2 2.8% 5.2% 20.7% 16.3% 41.2% 13.9% 100.0%
3 2.3% 6.8% 18.3% 21.7% 29.9% 21.0% 100.0%
4 3.1% 6.5% 16.9% 21.2% 32.4% 19.9% 100.0%
5 4.2% 7.0% 14.1% 20.3% 37.2% 17.3% 100.0%
6 4.7% 7.1% 12.3% 24.9% 31.6% 19.4% 100.0%
7 5.8% 7.1% 13.0% 22.0% 29.9% 22.2% 100.0%
8 6.8% 7.8% 12.2% 21.5% 29.6% 22.3% 100.0%
9 13.2% 10.1% 10.2% 19.4% 31.5% 15.7% 100.0%
10 10.1% 12.2% 15.2% 22.0% 30.4% 10.1% 100.0%
11 8.4% 11.3% 17.6% 24.6% 26.6% 11.5% 100.0%
12 7.8% 11.0% 20.2% 26.9% 26.2% 7.8% 100.0%
Table 4.3.6.1B
Proficiency Level by Cluster by Tier (Percent): Literacy S302
Literacy Proficiency Range
Cluster Tier 1 2 3 4 5 6 Total
K (instructional) - 17.5% 24.6% 18.4% 12.5% 20.3% 6.6% 100.0%
K (accountability) - 64.1% 12.0% 12.1% 8.2% 3.6% 0.0% 100.0%
A 16.9% 47.3% 35.8% 0.0% 0.0% 0.0% 100.0%
1-2 B 1.9% 21.6% 73.0% 3.4% 0.0% 0.0% 100.0%
C 1.0% 11.0% 37.8% 30.9% 17.1% 2.3% 100.0%
A 17.4% 34.6% 31.6% 16.2% 0.2% 0.0% 100.0%
3-5 B 1.0% 7.4% 31.2% 55.8% 4.6% 0.0% 100.0%
C 0.2% 0.5% 9.3% 33.1% 42.9% 14.0% 100.0%
A 26.4% 40.7% 28.7% 4.1% 0.0% 0.0% 100.0%
6-8 B 3.4% 21.2% 51.4% 23.6% 0.4% 0.0% 100.0%
C 0.8% 9.7% 52.4% 28.2% 7.5% 1.4% 100.0%
A 21.3% 43.1% 29.7% 5.8% 0.1% 0.0% 100.0%
9-12 B 5.3% 18.4% 36.7% 31.4% 8.2% 0.0% 100.0%
C 1.3% 3.3% 16.9% 30.7% 33.0% 14.8% 100.0%
Table 4.3.6.3B
Proficiency Level by Grade (Percent): Literacy S302
Literacy Proficiency Range
1 2 3 4 5 6 Total
K (instructional) 17.5% 24.6% 18.4% 12.5% 20.3% 6.6% 100.0%
K (accountability) 64.1% 12.0% 12.1% 8.2% 3.6% 0.0% 100.0%
1 5.9% 29.5% 53.6% 7.6% 2.9% 0.5% 100.0%
2 3.9% 18.6% 56.4% 13.3% 6.9% 0.8% 100.0%
3 1.2% 5.1% 16.5% 40.9% 28.1% 8.3% 100.0%
4 2.0% 6.7% 20.6% 44.1% 20.4% 6.2% 100.0%
5 3.3% 7.9% 28.0% 38.6% 17.1% 5.0% 100.0%
6 2.8% 15.4% 49.1% 27.9% 4.1% 0.8% 100.0%
7 4.6% 16.8% 49.1% 24.6% 4.1% 0.8% 100.0%
8 6.3% 19.9% 50.2% 18.9% 4.0% 0.7% 100.0%
9 6.0% 13.7% 23.1% 25.1% 23.0% 9.2% 100.0%
10 5.4% 15.7% 27.7% 27.9% 17.7% 5.6% 100.0%
11 5.2% 15.3% 28.2% 29.8% 15.5% 6.0% 100.0%
12 6.5% 16.3% 30.7% 29.0% 12.7% 4.8% 100.0%
Table 4.3.7.1B
Proficiency Level by Cluster by Tier (Percent): Comprehension S302
Comprehension Proficiency Range
Cluster Tier 1 2 3 4 5 6 Total
K (instructional) - 12.3% 15.1% 21.5% 14.3% 14.9% 22.0% 100.0%
K (accountability) - 59.0% 6.9% 7.1% 6.6% 12.4% 8.0% 100.0%
A 12.2% 27.3% 35.5% 25.0% n/a n/a 100.0%
1-2 B 0.7% 1.6% 16.7% 22.2% 58.8% n/a 100.0%
C 0.5% 2.3% 13.1% 14.0% 32.3% 37.7% 100.0%
A 16.3% 40.2% 25.7% 17.9% n/a n/a 100.0%
3-5 B 0.6% 6.1% 25.0% 22.4% 45.8% n/a 100.0%
C 0.1% 0.6% 8.8% 10.5% 35.7% 44.3% 100.0%
A 32.2% 43.2% 18.7% 5.9% n/a n/a 100.0%
6-8 B 1.5% 19.5% 38.8% 22.3% 17.9% n/a 100.0%
C 0.3% 3.8% 24.6% 18.4% 32.1% 20.9% 100.0%
A 41.6% 40.4% 14.2% 3.9% n/a n/a 100.0%
9-12 B 5.9% 28.8% 29.5% 20.1% 15.7% n/a 100.0%
C 1.2% 6.9% 18.3% 21.2% 27.3% 25.1% 100.0%
Table 4.3.7.3B
Proficiency Level by Grade (Percent): Comprehension S302
Comprehension Proficiency Range
1 2 3 4 5 6 Total
K (instructional) 12.3% 15.1% 21.5% 14.3% 14.9% 22.0% 100.0%
K (accountability) 59.0% 6.9% 7.1% 6.6% 12.4% 8.0% 100.0%
1 4.0% 10.1% 22.5% 24.4% 32.3% 6.7% 100.0%
2 2.2% 4.5% 16.9% 16.4% 45.1% 14.8% 100.0%
3 0.9% 4.9% 13.0% 15.1% 41.5% 24.6% 100.0%
4 1.8% 6.7% 19.4% 17.8% 34.0% 20.4% 100.0%
5 3.0% 8.2% 22.1% 16.2% 32.0% 18.4% 100.0%
6 2.7% 11.9% 31.7% 20.0% 24.4% 9.4% 100.0%
7 4.3% 13.9% 30.4% 18.3% 22.0% 11.0% 100.0%
8 5.7% 16.1% 25.5% 17.1% 23.5% 12.1% 100.0%
9 9.9% 17.5% 23.2% 16.4% 20.2% 12.8% 100.0%
10 7.7% 22.1% 24.2% 17.9% 18.2% 10.0% 100.0%
11 8.0% 23.0% 18.2% 21.3% 17.4% 12.1% 100.0%
12 9.5% 20.8% 19.8% 20.0% 18.1% 11.8% 100.0%
Table 4.3.8.1B
Proficiency Level by Cluster by Tier (Percent): Overall S302
Overall Proficiency Range
Cluster Tier 1 2 3 4 5 6 Total
K (instructional) - 14.1% 20.3% 19.9% 17.7% 21.3% 6.7% 100.0%
K (accountability) - 53.0% 15.9% 14.6% 9.7% 5.8% 0.9% 100.0%
A 11.9% 42.2% 45.7% 0.1% 0.0% 0.0% 100.0%
1-2 B 0.8% 10.5% 63.2% 25.4% 0.0% 0.0% 100.0%
C 0.4% 4.5% 28.0% 36.1% 26.9% 4.2% 100.0%
A 21.0% 34.7% 30.7% 13.2% 0.5% 0.0% 100.0%
3-5 B 0.6% 5.9% 29.8% 53.4% 10.2% 0.0% 100.0%
C 0.1% 0.4% 6.8% 28.0% 45.7% 19.0% 100.0%
A 33.7% 37.2% 23.7% 5.4% 0.0% 0.0% 100.0%
6-8 B 1.6% 14.9% 43.2% 37.6% 2.6% 0.0% 100.0%
C 0.3% 2.6% 27.4% 45.0% 21.8% 3.1% 100.0%
A 33.4% 42.2% 20.0% 4.2% 0.1% 0.0% 100.0%
9-12 B 3.5% 16.2% 33.7% 33.2% 13.4% 0.0% 100.0%
C 0.9% 2.3% 13.7% 31.8% 35.7% 15.5% 100.0%
Table 4.3.8.3B
Proficiency Level by Grade (Percent): Overall S302
Overall Proficiency Range
1 2 3 4 5 6 Total
K (instructional) 14.1% 20.3% 19.9% 17.7% 21.3% 6.7% 100.0%
K (accountability) 53.0% 15.9% 14.6% 9.7% 5.8% 0.9% 100.0%
1 3.7% 20.9% 54.4% 15.6% 4.6% 0.9% 100.0%
2 2.6% 10.2% 44.1% 30.7% 10.8% 1.5% 100.0%
3 1.3% 4.8% 16.5% 35.8% 30.2% 11.5% 100.0%
4 2.2% 5.9% 19.3% 39.2% 25.2% 8.2% 100.0%
5 3.2% 6.7% 21.5% 38.8% 22.9% 6.9% 100.0%
6 3.1% 9.9% 32.0% 41.2% 12.1% 1.7% 100.0%
7 4.4% 10.7% 33.9% 36.3% 13.2% 1.6% 100.0%
8 5.7% 12.4% 32.8% 36.3% 11.4% 1.5% 100.0%
9 8.2% 12.5% 18.1% 24.8% 26.5% 9.9% 100.0%
10 5.9% 14.7% 23.3% 28.9% 21.4% 5.8% 100.0%
11 5.4% 14.1% 24.8% 31.1% 18.8% 5.9% 100.0%
12 5.8% 13.6% 28.8% 32.8% 14.5% 4.4% 100.0%
4.4.1.1 By Cluster
Table 4.4.1.1
Mean Raw Score by Cluster by Tier by Standard: Comprehension S302
Cluster Tier Standard Maximum Score Mean Score Percent of Maximum
Social Instructional Language 12 7.77 64.74%
Language of Language Arts 9 5.97 66.34%
A Language of Math 9 5.27 58.54%
Language of Science 6 4.16 69.33%
Language of Social Studies 6 3.66 60.97%
Social Instructional Language 6 4.51 75.15%
Language of Language Arts 12 7.36 61.33%
1-2 B Language of Math 12 7.49 62.43%
Language of Science 9 6.65 73.89%
Language of Social Studies 9 6.25 69.47%
Social Instructional Language 6 4.89 81.53%
Language of Language Arts 12 8.86 73.83%
C Language of Math 12 6.95 57.94%
Language of Science 9 5.71 63.46%
Language of Social Studies 9 6.24 69.35%
Social Instructional Language 12 7.10 59.15%
Language of Language Arts 9 4.03 44.76%
A Language of Math 9 4.79 53.18%
Language of Science 6 3.22 53.60%
Language of Social Studies 6 3.26 54.33%
Social Instructional Language 6 4.81 80.09%
Language of Language Arts 12 7.73 64.38%
3-5 B Language of Math 12 6.89 57.42%
Language of Science 9 5.25 58.32%
Language of Social Studies 9 5.42 60.22%
Social Instructional Language 6 3.73 62.17%
Language of Language Arts 12 7.71 64.27%
C Language of Math 12 5.10 42.49%
Language of Science 9 5.05 56.06%
Language of Social Studies 9 4.33 48.12%
4.4.2.1 By Cluster
Table 4.4.2.1
Mean Raw Score by Cluster by Tier by Standard: Writing S302
Cluster Tier Standard Mean Raw Score (Linguistic Complexity, Vocabulary Usage, Language Control) Total Percent of Maximum
A Social Instructional Language 5.18 4.81 4.24 14.23 19.76%
Social Instructional Language 1.18 1.92 1.20 4.30 23.88%
B Language of Math / Science 2.37 2.23 1.92 6.52 36.24%
1-2 Language of Language Arts / Social Studies 2.48 2.31 1.74 6.54 36.32%
Social Instructional Language 2.72 2.45 2.08 7.25 40.26%
C Language of Math / Science 2.85 2.73 2.19 7.76 43.13%
Language of Language Arts / Social Studies 2.94 2.63 2.19 7.75 43.07%
Social Instructional Language 2.05 1.88 1.54 5.47 30.38%
A Language of Math / Science 2.12 2.22 1.73 6.06 33.69%
Language of Language Arts 2.14 1.91 1.64 5.70 31.68%
Social Instructional Language 2.96 3.12 2.46 8.53 47.41%
3-5 B Language of Math / Science 2.94 3.06 2.56 8.55 47.50%
Language of Language Arts / Social Studies 2.84 2.30 2.29 7.43 41.25%
Social Instructional Language 3.23 3.45 2.78 9.46 52.58%
C Language of Math / Science 3.12 2.97 2.67 8.76 48.68%
Language of Language Arts / Social Studies 3.13 2.58 2.61 8.32 46.22%
Social Instructional Language 2.23 1.94 1.81 5.98 33.21%
A Language of Math / Science 2.12 1.67 1.74 5.53 30.70%
Language of Language Arts 2.29 2.07 1.73 6.10 33.88%
Social Instructional Language 3.30 2.92 2.68 8.90 49.43%
6-8 B Language of Math / Science 3.18 3.32 2.65 9.15 50.85%
Language of Language Arts / Social Studies 3.18 2.65 2.56 8.38 46.58%
Social Instructional Language 3.63 3.12 3.01 9.76 54.25%
C Language of Math / Science 3.66 3.73 3.06 10.45 58.05%
Language of Language Arts / Social Studies 3.55 2.98 2.91 9.45 52.48%
Social Instructional Language 2.17 2.03 1.85 6.05 33.61%
A Language of Math / Science 2.23 2.05 1.69 5.98 33.20%
Language of Language Arts 2.42 2.16 1.76 6.34 35.22%
Social Instructional Language 3.52 2.93 3.01 9.45 52.50%
9-12 B Language of Math / Science 3.35 2.96 2.84 9.15 50.86%
Language of Language Arts / Social Studies 3.29 3.17 2.72 9.18 50.99%
Social Instructional Language 3.82 3.24 3.39 10.45 58.07%
C Language of Math / Science 3.36 3.64 3.04 10.04 55.78%
Language of Language Arts / Social Studies 3.73 3.61 3.17 10.52 58.45%
4.4.3.1 By Cluster
Table 4.4.3.2
Mean Raw Score by Cluster by Tier by Standard: Speaking S302
Cluster Tier Standard Maximum Score Mean Raw Score Percentage of Maximum
Social and Instructional Language 3 2.35 78.46%
A Language of Language Arts/Social Studies 5 2.87 57.47%
Language of Mathematics/Science 5 2.41 48.27%
Social and Instructional Language 3 2.84 94.72%
1-2 B Language of Language Arts/Social Studies 5 3.98 79.53%
Language of Mathematics/Science 5 3.57 71.44%
Social and Instructional Language 3 2.93 97.68%
C Language of Language Arts/Social Studies 5 4.50 90.01%
Language of Mathematics/Science 5 4.24 84.82%
Social and Instructional Language 3 1.72 57.27%
A Language of Language Arts/Social Studies 5 2.01 40.21%
Language of Mathematics/Science 5 1.62 32.39%
Social and Instructional Language 3 2.82 93.96%
3-5 B Language of Language Arts/Social Studies 5 3.73 74.52%
Language of Mathematics/Science 5 3.49 69.81%
Social and Instructional Language 3 2.92 97.36%
C Language of Language Arts/Social Studies 5 4.24 84.82%
Language of Mathematics/Science 5 4.10 81.95%
Social and Instructional Language 3 1.69 56.24%
A Language of Language Arts/Social Studies 5 1.80 35.93%
Language of Mathematics/Science 5 1.46 29.30%
Social and Instructional Language 3 2.83 94.24%
6-8 B Language of Language Arts/Social Studies 5 3.97 79.48%
Language of Mathematics/Science 5 3.55 70.91%
Social and Instructional Language 3 2.93 97.55%
C Language of Language Arts/Social Studies 5 4.44 88.88%
Language of Mathematics/Science 5 4.11 82.21%
Social and Instructional Language 3 1.65 55.00%
A Language of Language Arts/Social Studies 5 1.63 32.61%
Language of Mathematics/Science 5 1.47 29.33%
Social and Instructional Language 3 2.77 92.17%
9-12 B Language of Language Arts/Social Studies 5 3.83 76.56%
Language of Mathematics/Science 5 3.44 68.77%
Social and Instructional Language 3 2.90 96.73%
C Language of Language Arts/Social Studies 5 4.50 89.99%
Language of Mathematics/Science 5 4.16 83.23%
Volume 2
5. Analyses of Test Forms: Overview ........................................................................................ 123
5.1 Background ................................................................................................................ 123
5.1.1 Measurement Models Used..................................................................................... 123
5.1.2 Sampling ................................................................................................................. 125
5.1.3 Equating and Scaling .............................................................................................. 125
5.1.4 DIF Analyses .......................................................................................................... 125
5.1.4.1 Dichotomous Items ............................................................................................. 126
5.1.4.2 Polytomous Items................................................................................................ 126
5.2 Descriptions ................................................................................................................ 128
5.2.1 Raw Score Information (Figure A and Table A) .................................................... 128
5.2.2 Scale Score Information (Figure B and Table B) ................................................... 128
5.2.3 Proficiency Level Information (Figure C and Table C).......................................... 129
5.2.4 Scaling Equation Table (Table D) .......................................................................... 130
5.2.5 Equating Summary (Table E) ................................................................................. 130
5.2.6 Test Characteristic Curve (Figure D) ...................................................................... 131
5.2.7 Test Information Function (Figure E) ..................................................................... 131
5.2.8 Reliability (Table F) ................................................................................................ 132
5.2.9 Item/Task Analysis Summary (Table G) ................................................................ 133
5.2.10 Complete Item Analysis Table (Table H) ............................................................... 134
5.2.11 Complete Raw Score to Scale Score Conversion Chart (Table I) .......................... 135
5.2.12 Raw Score to Proficiency Level Score Conversion Table (Table J) ...................... 136
6. Analyses of Test Forms: Results ........................................................................................... 138
6.1 Grade: K ..................................................................................................................... 138
6.1.1 Listening K.............................................................................................................. 138
6.1.2 Reading K ............................................................................................................... 144
6.1.3 Writing K ................................................................................................................ 151
6.1.4 Speaking K .............................................................................................................. 157
6.2 Grades: 1–2 ................................................................................................................ 163
6.2.1 Listening 1-2 ........................................................................................................... 163
6.2.1.1 Listening 1-2 A ............................................................................................... 163
6.2.1.2 Listening 1-2 B ............................................................................................... 170
6.2.1.3 Listening 1-2 C ............................................................................................... 177
6.2.2 Reading 1-2 ............................................................................................................. 184
6.2.2.1 Reading 1-2 A ................................................................................................. 184
6.2.2.2 Reading 1-2 B ................................................................................................. 191
6.2.2.3 Reading 1-2 C ................................................................................................. 198
6.2.3 Writing 1-2 .............................................................................................................. 205
6.2.3.1 Writing 1-2 A .................................................................................................. 205
6.2.3.2 Writing 1-2 B .................................................................................................. 213
6.2.3.3 Writing 1-2 C .................................................................................................. 222
6.2.4 Speaking 1-2 ........................................................................................................... 231
6.3 Grades: 3–5 ................................................................................................................ 237
6.3.1 Listening 3-5 ........................................................................................................... 237
6.3.1.1 Listening 3-5 A ............................................................................................... 237
6.3.1.2 Listening 3-5 B ............................................................................................... 244
6.3.1.3 Listening 3-5 C ............................................................................................... 251
6.3.2 Reading 3-5 ............................................................................................................. 258
6.3.2.1 Reading 3-5 A ................................................................................................. 258
6.3.2.2 Reading 3-5 B ................................................................................................. 265
6.3.2.3 Reading 3-5 C ................................................................................................. 272
6.3.3 Writing 3-5 .............................................................................................................. 279
6.3.3.1 Writing 3-5 A .................................................................................................. 279
6.3.3.2 Writing 3-5 B .................................................................................................. 286
6.3.3.3 Writing 3-5 C .................................................................................................. 295
6.3.4 Speaking 3-5 ........................................................................................................... 304
6.4 Grades: 6–8 ................................................................................................................ 310
6.4.1 Listening 6-8 ........................................................................................................... 310
6.4.1.1 Listening 6-8 A ............................................................................................... 310
6.4.1.2 Listening 6-8 B ............................................................................................... 317
6.4.1.3 Listening 6-8 C ............................................................................................... 324
6.4.2 Reading 6-8 ............................................................................................................. 331
6.4.2.1 Reading 6-8 A ................................................................................................. 331
6.4.2.2 Reading 6-8 B ................................................................................................. 338
6.4.2.3 Reading 6-8 C ................................................................................................. 345
6.4.3 Writing 6-8 .............................................................................................................. 352
6.4.3.1 Writing 6-8 A .................................................................................................. 352
6.4.3.2 Writing 6-8 B .................................................................................................. 359
6.4.3.3 Writing 6-8 C .................................................................................................. 368
6.4.4 Speaking 6-8 ........................................................................................................... 377
6.5 Grades: 9–12 .............................................................................................................. 383
6.5.1 Listening 9-12 ......................................................................................................... 383
6.5.1.1 Listening 9-12 A ............................................................................................. 383
6.5.1.2 Listening 9-12 B ............................................................................................. 391
6.5.1.3 Listening 9-12 C ............................................................................................. 398
6.5.2 Reading 9-12 ........................................................................................................... 405
6.5.2.1 Reading 9-12 A ............................................................................................... 405
6.5.2.2 Reading 9-12 B ............................................................................................... 412
6.5.2.3 Reading 9-12 C ............................................................................................... 419
6.5.3 Writing 9-12 ............................................................................................................ 426
6.5.3.1 Writing 9-12 A ................................................................................................ 426
6.5.3.2 Writing 9-12 B ................................................................................................ 433
6.5.3.3 Writing 9-12 C ................................................................................................ 442
6.5.4 Speaking 9-12 ......................................................................................................... 451
5.1 Background
5.1.1 Measurement Models Used
The measurement model that forms the basis of the analysis for the development of ACCESS for
ELLs is the Rasch measurement model (Wright & Stone, 1979). Additional information on its
use in the development of the test is available in WIDA Technical Report 1, Development and
Field Test of ACCESS for ELLs (Kenyon, 2006). The test was developed using Rasch
measurement principles, and in that sense the Rasch model guided all decisions throughout the
development of the assessment and was not just a tool for the statistical analysis of the data.
Thus, for example, data based on Rasch fit statistics guided the inclusion, revision, or deletion of
items during the development and field testing of the test forms, and will continue to guide the
refinement and further development of the test.
For Listening, Reading, and Speaking, the dichotomous Rasch model was used as the
measurement model. Mathematically, the measurement model may be presented as
log(Pni1 / Pni0) = Bn − Di

where
Pni1 = probability of a correct response “1” by person “n” on item “i”
Pni0 = probability of an incorrect response “0” by person “n” on item “i”
Bn = ability of person “n”
Di = difficulty of item “i”
When the probability of a person getting a correct answer equals the probability of a person
getting an incorrect answer (i.e., 50% probability of getting it right and 50% probability of
getting it wrong), Pni1/Pni0 is equal to 1. The log of 1 is 0. This is the point at which a person’s
ability equals the difficulty of an item. For example, a person whose ability is 1.56 on the Rasch
logit scale encountering an item whose difficulty is 1.56 on the Rasch logit scale would have a
50% probability of answering that question correctly.
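As a numerical illustration of this relationship (a sketch only, not part of the operational calibration), the probability of a correct response can be computed directly from the person ability and item difficulty:

```python
import math

def p_correct(ability_b: float, difficulty_d: float) -> float:
    """Dichotomous Rasch model: probability of a correct response,
    given person ability B_n and item difficulty D_i (both in logits)."""
    return 1.0 / (1.0 + math.exp(-(ability_b - difficulty_d)))

# When ability equals difficulty, the probability is exactly 0.50:
print(p_correct(1.56, 1.56))                 # 0.5
# A person one logit above the item difficulty succeeds about 73% of the time:
print(round(p_correct(2.56, 1.56), 3))       # 0.731
```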
For the Writing tasks, a Rasch Rating Scale model was used. Mathematically, this can be
represented as
log(Pnik / Pni(k−1)) = Bn − Di − Fk

where Pnik and Pni(k−1) are the probabilities of person “n” receiving a rating in category “k” and in the adjacent lower category “k−1” on item “i”, Bn and Di are the person ability and item difficulty as defined above, and Fk is the difficulty of the step from category “k−1” to category “k”, which the Rating Scale model treats as common across items.
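For illustration, one standard way to turn the adjacent-category expression above into category probabilities is sketched below; the step difficulties used in the example are invented and are not operational item parameters.

```python
import math

def rsm_category_probs(b_n: float, d_i: float, steps: list) -> list:
    """Rating Scale model category probabilities for categories 0..K,
    where steps = [F_1, ..., F_K] are step difficulties shared across items.
    Follows log(P_nik / P_ni(k-1)) = B_n - D_i - F_k."""
    # Cumulative sums of (B_n - D_i - F_k); the empty sum for category 0 is 0.
    exponents = [0.0]
    for f_k in steps:
        exponents.append(exponents[-1] + (b_n - d_i - f_k))
    denom = sum(math.exp(e) for e in exponents)
    return [math.exp(e) / denom for e in exponents]

# Example with six illustrative step difficulties (categories 0 through 6):
probs = rsm_category_probs(b_n=0.5, d_i=-0.2, steps=[-2.0, -1.0, -0.3, 0.4, 1.2, 2.1])
print([round(p, 3) for p in probs])   # probabilities sum to 1.0
```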
5.1.2 Sampling
The results presented in most of the tables in Chapter 6 are based on the full data set of all
students who were administered operational Series 302 of ACCESS for ELLs in the academic
year 2013–2014. Exceptions are Tables E, G, H, and I. The equating summary tables (Table E)
use data from a sample of about 1,000 students rather than the entire population of students,
because the equating was done in the midst of the operational scoring. The item or task analysis
summary tables (Table G), the complete item analysis tables (Table H), and the raw score to
scale score conversion tables (Table I) use item and task difficulties from this equating.
(For information on procedures for dealing with items with C-level DIF, see Section 1.4.5.)
For each test form, Figure A shows the distribution of the raw scores. The horizontal axis shows
the raw scores. The vertical axis shows the number of students (count). Each bar shows how
many students were awarded each raw score.
Table A shows, by each grade in the cluster and by total for the cluster:
• The number of students in the analyses (the number of students who were not absent,
invalid, refused, exempt, or in the wrong cluster)
• The minimum observed raw score
• The maximum observed raw score
• The mean (average) raw score
• The standard deviation (std. dev.) of the raw scores
Cronbach’s alpha is also affected by the distribution of ability within the group of students
tested. All things being equal, the greater the heterogeneity of abilities within the group of
students tested (i.e., the more widely the scores are distributed), the higher the reliability. In this
sense, Cronbach’s alpha is sample dependent. It is widely recognized that reliability can be as much a function of the sample of students tested as of the test itself. That is, the exact same test can produce widely disparate reliability indices depending on the ability distribution of the group of students
tested. Because ACCESS for ELLs is a tiered test (that is, because each form in Tier A, B, or C
targets only a certain range of the entire ability distribution), results for reliability on any one
form, particularly for the shorter Listening test, may at times be lower than typically expected.
Cronbach’s alpha is calculated as

α = [n / (n − 1)] × (1 − Σσi² / σt²)

where
n = number of items i
σi² = variance of score on item i
σt² = variance of total score
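As an illustration of this formula (not the operational analysis code), the sketch below computes Cronbach’s alpha from a persons-by-items matrix of simulated dichotomous responses:

```python
import numpy as np

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """Cronbach's alpha for a (persons x items) matrix of item scores."""
    n_items = item_scores.shape[1]
    item_vars = item_scores.var(axis=0, ddof=1)       # sigma_i^2 for each item
    total_var = item_scores.sum(axis=1).var(ddof=1)   # sigma_t^2 of total scores
    return (n_items / (n_items - 1)) * (1.0 - item_vars.sum() / total_var)

# Example with simulated dichotomous responses (illustrative data only):
rng = np.random.default_rng(0)
ability = rng.normal(size=(500, 1))
difficulty = rng.normal(size=(1, 20))
p = 1.0 / (1.0 + np.exp(-(ability - difficulty)))
responses = (rng.random((500, 20)) < p).astype(int)
print(round(cronbach_alpha(responses), 3))
```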
Table F also presents the standard error of measurement (SEM) based on classical test theory.
Unlike IRT, in this approach, SEM is seen as a constant across the spread of test scores (ability
continuum). Thus, it is not conditional on ability being measured. It is, however, a function of
two statistics: the reliability of the test and the (observed) standard deviation of the test scores. It
is calculated as
SEM = SD × √(1 − reliability)
Traditionally, SEM has been used to create a band around an examinee’s observed score, with the assertion, in the view of classical test theory, that the examinee’s true score (i.e., what the
examinee’s score would be if it could be measured without error) would lie with a certain degree
of probability within this band. Statistically speaking, then, there is an expectation that an
examinee’s true score has a 68% probability of lying within the band extending from the
observed score minus 1 SEM to the observed score plus 1 SEM.
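As a worked illustration with invented values (not taken from any particular Table F), a scale score standard deviation of 66 and a reliability of .93 give an SEM of about 17.5 scale score points, so the 68% band around an observed score of 350 runs from roughly 332.5 to 367.5:

```python
import math

def sem(sd: float, reliability: float) -> float:
    """Classical standard error of measurement: SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)

# Illustrative values only:
sd, rel, observed = 66.0, 0.93, 350.0
e = sem(sd, rel)                      # about 17.5 scale score points
band = (observed - e, observed + e)   # approximate 68% band for the true score
print(round(e, 1), [round(x, 1) for x in band])
```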
For the Writing tests (except Kindergarten, which is scored by the test administrator),
information on inter-rater reliability is also provided in Table F. This portion of the table shows,
for each of the three or four Writing tasks, the percent of agreement between two raters in terms
of the three features being rated: Linguistic Complexity (LX), Vocabulary Usage (VU), and
Language Control (LC). In this part of the table, the first column shows the Writing task (i.e., the
first, second, third, or fourth, if applicable). The second column shows the number of Writing
papers that were double scored. This number is generally 25% of all papers scored, chosen at
random during the operational scoring process. The next column shows the feature, while the
following columns show the rates of agreement: exact, adj (adjacent), and total sum of exact and
adjacent. When the two raters agreed on the score, an exact agreement was counted. If the two raters’ scores on that feature differed by one point, an adjacent agreement was counted.
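A minimal sketch of how exact and adjacent agreement rates can be computed for one feature is given below; the rater scores shown are invented for the example.

```python
def agreement_rates(rater1: list, rater2: list) -> dict:
    """Percent exact and adjacent agreement between two raters on one feature
    (e.g., Linguistic Complexity on a set of double-scored papers)."""
    assert len(rater1) == len(rater2) and rater1
    exact = sum(a == b for a, b in zip(rater1, rater2))
    adjacent = sum(abs(a - b) == 1 for a, b in zip(rater1, rater2))
    n = len(rater1)
    return {"exact": 100 * exact / n,
            "adjacent": 100 * adjacent / n,
            "total": 100 * (exact + adjacent) / n}

# Example with made-up scores for ten double-scored papers:
print(agreement_rates([3, 4, 4, 2, 5, 3, 4, 6, 2, 3],
                      [3, 4, 5, 2, 4, 3, 4, 6, 3, 3]))   # 70% exact, 30% adjacent
```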
There are two things to note about this table. First, unlike scale scores, which are determined
psychometrically and have a one-to-one correspondence to raw scores regardless of the grade
level of the student, Proficiency Level scores are interpretations of the scale score. In Series 100
and 101, cut scores between proficiency levels were determined at the grade-level cluster level;
thus, for example, in the 3–5 grade-level cluster, a given scale score was associated with the
same Proficiency Level score for students in Grades 3, 4, and 5. Such a system, however, fails to
take into account that older children can be expected to perform better on the test due to general
cognitive growth over and above growth in English language proficiency. This effect can clearly
be seen in Tables A and B, where average scores on any test form tend to rise, albeit slightly, by
grade level. In other words, we would expect a fifth grader to perform better on the 3–5 grade-
level cluster test form than a third grader at the same underlying level of English proficiency. To
account for this effect, the WIDA Consortium adopted grade-level cut scores beginning with
Series 102 so that, for any given raw score/scale score, the Proficiency Level score now
associated with it differs according to the grade level of the student. (For details on how grade-
level cut scores were determined, see Kenyon et al., 2013.) The effect of this for Table J is to
require a separate column for each grade.
Second, because scale scores are capped on Listening and Reading for Tiers A and B at the scale
score corresponding to the proficiency level score of 4.0 (for Tier A) and 5.0 (for Tier B),
beginning with Series 102, this capped score is now dependent on the grade level (rather than the grade-level cluster) of the student.
Chapter 6 contains proprietary test information and is not publicly available. State educational agencies (SEAs) may
request this information; please contact us at [email protected].
Volume 3
The stratified Cronbach’s alpha for a composite score is calculated as

stratified α = 1 − [Σ j=1..k wj² σj² (1 − ρj)] / σc²

where
k = number of components j
wj = weight of component j
σj² = variance of component j
σc² = variance of the composite
ρj = reliability coefficient of component j.
The data to compute the stratified Cronbach’s alpha is provided in the appropriate tables in
Chapter 8.
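As a check on this computation, the sketch below applies the stratified alpha formula given above to the Kindergarten Oral composite values reported in Table 8.1.5D; it reproduces the 0.952 composite reliability shown there (illustrative code, not the operational implementation).

```python
def stratified_alpha(weights, variances, reliabilities, composite_variance):
    """Stratified Cronbach's alpha for a weighted composite of k components."""
    error = sum(w ** 2 * v * (1.0 - r)
                for w, v, r in zip(weights, variances, reliabilities))
    return 1.0 - error / composite_variance

# Check against the Oral composite for Kindergarten (Table 8.1.5D):
# Listening and Speaking are each weighted 0.50.
alpha = stratified_alpha(weights=[0.50, 0.50],
                         variances=[4997.123, 4852.854],
                         reliabilities=[0.934, 0.894],
                         composite_variance=4393.165)
print(round(alpha, 3))   # 0.952, matching the reported Oral composite reliability
```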
7.2 Descriptions
7.2.1 Scale Score Information (Figure A and Table A)
Figure A and Table A relate to the ACCESS for ELLs scale scores that were achieved by
students in the grade-level cluster. Figure A shows the distribution of the scale scores. The
horizontal axis shows the full range of all scale scores observed for the grade-level cluster. To
provide a full perspective, it extends somewhat below and above the range of observed scale
scores. The vertical axis shows the number of students (count). Each bar shows how many
students were awarded each scale score. Note that for Listening and Reading, the effects of
capping the scores for Tier A and Tier B can often be clearly detected in this figure.
Table A shows, by each grade in the cluster and by total for the cluster:
• Number of students in the analyses (the number of students who were not absent, invalid,
refused, exempt, or in the wrong cluster)
• Minimum observed scale score
• Maximum observed scale score
• The mean (average) scale score
• The standard deviation (std. dev.) of the scale scores
Figure 8.1.1A
Scale Scores: List K S302
[Histogram of the Listening scale score distribution (count by scale score).]
Table 8.1.1Bi
Proficiency Level Distribution: List K S302 (Accountability)
Level Count Percent
1 51,112 25.1%
2 20,746 10.2%
3 17,932 8.8%
4 11,923 5.8%
5 32,179 15.8%
6 69,949 34.3%
Total 203,841 100.0%
Figure 8.1.1Bi
Proficiency Level: List K S302 (Accountability)
[Bar chart of the percentages in Table 8.1.1Bi.]
Table 8.1.1Bii
Proficiency Level Distribution: List K S302 (Instructional)
Level Count Percent
K1 24,626 12.1%
K2 11,794 5.8%
K3 20,693 10.2%
K4 32,677 16.0%
K5 62,805 30.8%
K6 51,246 25.1%
Total 203,841 100.0%
Figure 8.1.1Bii
Proficiency Level: List K S302 (Instructional)
[Bar chart of the percentages in Table 8.1.1Bii.]
Table 8.1.1Ci
Conditional Standard Error of Measurement at Cut Scores: List K S302 (Accountability)
Proficiency Level Cut Score SEM
Table 8.1.1Cii
Conditional Standard Error of Measurement at Cut Scores: List K S302 (Instructional)
Proficiency Level Cut Score SEM
[Figure: expected raw score plotted against ability measure (test characteristic curve), List K S302.]
Figure 8.1.1D
Test Information Function: List K S302
[Plot of test information against ability measure.]
Table 8.1.1E
Accuracy and Consistency of Classification Indices: List (Grade K) S302
(Accountability)
Overall Accuracy Consistency Kappa (k)
Indices 0.679 0.614 0.495
Conditional Level Accuracy Consistency
on Level 1 0.872 0.817
2 0.469 0.351
3 0.327 0.247
4 0.211 0.155
5 0.474 0.363
6 0.824 0.770
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.941 0.033 0.026 0.918
2/3 0.930 0.027 0.043 0.902
3/4 0.916 0.049 0.035 0.883
4/5 0.907 0.045 0.048 0.874
5/6 0.899 0.035 0.066 0.860
Table 8.1.1E
Accuracy and Consistency of Classification Indices: List (Grade K) S302
(Instructional)
Overall Accuracy Consistency Kappa (k)
Indices 0.677 0.574 0.462
Conditional Level Accuracy Consistency
on Level 1 0.886 0.811
2 0.442 0.325
3 0.526 0.401
4 0.566 0.446
5 0.702 0.580
6 0.734 0.656
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.969 0.013 0.018 0.955
2/3 0.959 0.020 0.021 0.940
3/4 0.940 0.030 0.030 0.914
4/5 0.916 0.042 0.042 0.882
5/6 0.884 0.040 0.077 0.838
Figure 8.1.2A
Scale Scores: Read K S302
[Histogram of the Reading scale score distribution (count by scale score).]
Table 8.1.2A
Scale Score Descriptive Statistics: Read K S302
No. of Students Min. Max. Mean Std. Dev.
203,853 100 290 192.18 66.09
Table 8.1.2Bi
Proficiency Level Distribution: Read K S302 (Accountability)
Level Count Percent
1 136,612 67.0%
2 14,186 7.0%
3 9,412 4.6%
4 10,849 5.3%
5 32,794 16.1%
6 0 0.0%
Total 203,853 100.0%
Figure 8.1.2Bi
Proficiency Level: Read K S302 (Accountability)
[Bar chart of the percentages in Table 8.1.2Bi.]
Table 8.1.2Bii
Proficiency Level Distribution: Read K S302 (Instructional)
Level Count Percent
K1 44,927 22.0%
K2 28,198 13.8%
K3 39,920 19.6%
K4 17,097 8.4%
K5 20,656 10.1%
K6 53,055 26.0%
Total 203,853 100.0%
Figure 8.1.2Bii
Proficiency Level: Read K S302 (Instructional)
[Bar chart of the percentages in Table 8.1.2Bii.]
Table 8.1.2Cii
Conditional Standard Error of Measurement at Cut Scores: Read K S302 (Instructional)
[Figure: expected raw score plotted against ability measure (test characteristic curve), Read K S302.]
Figure 8.1.2D
Test Information Function: Read K S302
[Plot of test information against ability measure.]
Table 8.1.2E
Accuracy and Consistency of Classification Indices: Read (Grade K) S302
(Accountability)
Overall Accuracy Consistency Kappa (k)
Indices 0.821 0.787 0.583
Conditional Level Accuracy Consistency
on Level 1 0.943 0.929
2 0.337 0.252
3 0.240 0.176
4 0.293 0.213
5 0.872 0.772
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.934 0.039 0.027 0.910
2/3 0.937 0.032 0.031 0.913
3/4 0.943 0.029 0.028 0.919
4/5 0.948 0.033 0.019 0.926
Table 8.1.2E
Accuracy and Consistency of Classification Indices: Read (Grade K) S302
(Instructional)
Overall Accuracy Consistency Kappa (k)
Indices 0.771 0.699 0.603
Conditional Level Accuracy Consistency
on Level 1 0.902 0.835
2 0.574 0.459
3 0.700 0.589
4 0.388 0.289
5 0.922 0.881
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.944 0.020 0.036 0.922
2/3 0.940 0.033 0.026 0.915
3/4 0.936 0.029 0.034 0.910
4/5 0.940 0.032 0.028 0.914
Figure 8.1.3A
Scale Scores: Writ K S302
[Histogram of the Writing scale score distribution (count by scale score).]
Table 8.1.3A
Scale Score Descriptive Statistics: Writ K S302
No. of Students Min. Max. Mean Std. Dev.
203,840 100 339 210.63 65.58
Table 8.1.3Bi
Proficiency Level Distribution: Writ K S302 (Accountability)
Level Count Percent
1 119,172 58.5%
2 40,889 20.1%
3 25,262 12.4%
4 12,868 6.3%
5 5,649 2.8%
6 0 0.0%
Total 203,840 100.0%
Figure 8.1.3Bi
Proficiency Level: Writ K S302 (Accountability)
[Bar chart of the percentages in Table 8.1.3Bi.]
Table 8.1.3Bii
Proficiency Level Distribution: Writ K S302 (Instructional)
Level Count Percent
K1 36,149 17.7%
K2 60,841 29.8%
K3 34,266 16.8%
K4 28,805 14.1%
K5 38,130 18.7%
K6 5,649 2.8%
Total 203,840 100.0%
Figure 8.1.3Bii
Proficiency Level: Writ K S302 (Instructional)
[Bar chart of the percentages in Table 8.1.3Bii.]
Table 8.1.3Ci
Conditional Standard Error of Measurement at Cut Scores: Writ K S302 (Accountability)
Proficiency Level Cut Score SEM
Table 8.1.3Cii
Conditional Standard Error of Measurement at Cut Scores: Writ K S302 (Instructional)
Proficiency Level Cut Score SEM
1/2 145 31.10
[Figure: expected raw score plotted against ability measure (test characteristic curve), Writ K S302.]
Figure 8.1.3D
Test Information Function: Writ K S302
[Plot of test information against ability measure.]
Table 8.1.3E
Accuracy and Consistency of Classification Indices: Writ (Grade K) S302
(Accountability)
Overall Accuracy Consistency Kappa (k)
Indices 0.740 0.689 0.482
Conditional Level Accuracy Consistency
on Level 1 0.941 0.914
2 0.610 0.465
3 0.386 0.344
4 - 0.259
5 - 0.139
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.922 0.034 0.044 0.893
2/3 0.902 0.021 0.077 0.863
3/4 0.909 0.091 0.000 0.898
4/5 0.972 0.028 0.000 0.971
Table 8.1.3E
Accuracy and Consistency of Classification Indices: Writ (Grade K) S302
(Instructional)
Overall Accuracy Consistency Kappa (k)
Indices 0.680 0.588 0.476
Conditional Level Accuracy Consistency
on Level 1 0.865 0.792
2 0.791 0.707
3 0.515 0.388
4 0.365 0.281
5 0.245 0.603
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.955 0.025 0.020 0.935
2/3 0.916 0.041 0.043 0.885
3/4 0.898 0.031 0.071 0.860
4/5 0.879 0.045 0.077 0.832
Figure 8.1.4A
Scale Scores: Spek K S302
[Histogram of the Speaking scale score distribution (count by scale score).]
Table 8.1.4Bi
Proficiency Level Distribution: Spek K S302 (Accountability)
Level Count Percent
1 47,008 23.1%
2 47,168 23.1%
3 34,064 16.7%
4 22,635 11.1%
5 52,952 26.0%
6 0 0.0%
Total 203,827 100.0%
Figure 8.1.4Bi
Proficiency Level: Spek K S302 (Accountability)
[Bar chart of the percentages in Table 8.1.4Bi.]
Table 8.1.4Bii
Proficiency Level Distribution: Spek K S302 (Instructional)
Level Count Percent
K1 47,008 23.1%
K2 16,092 7.9%
K3 31,076 15.2%
K4 34,064 16.7%
K5 22,635 11.1%
K6 52,952 26.0%
Total 203,827 100.0%
Figure 8.1.4Bii
Proficiency Level: Spek K S302 (Instructional)
[Bar chart of the percentages in Table 8.1.4Bii.]
Table 8.1.4Ci
Conditional Standard Error of Measurement at Cut Scores: Spek K S302 (Accountability)
Proficiency Level Cut Score SEM
1/2 269 18.68
Table 8.1.4Cii
Conditional Standard Error of Measurement at Cut Scores: Spek K S302 (Instructional)
Proficiency Level Cut Score SEM
1/2 256 20.89
[Figure: expected raw score plotted against ability measure (test characteristic curve), Spek K S302.]
Figure 8.1.4D
Test Information Function: Spek K S302
[Plot of test information against ability measure.]
Table 8.1.4E-1
Accuracy and Consistency of Classification Indices: Spek (Grade K) S302
(Accountability)
Overall Accuracy Consistency Kappa (k)
Indices 0.468 0.451 0.321
Conditional Level Accuracy Consistency
on Level 1 0.830 0.760
2 0.662 0.533
3 0.377 0.260
4 0.212 0.194
5 - 0.563
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.928 0.041 0.031 0.900
2/3 0.891 0.032 0.077 0.850
3/4 0.865 0.046 0.089 0.786
4/5 0.740 0.260 0.000 0.755
Table 8.1.4E-1
Accuracy and Consistency of Classification Indices: Spek (Grade K) S302
(Instructional)
Overall Accuracy Consistency Kappa (k)
Indices 0.652 0.563 0.419
Conditional Level Accuracy Consistency
on Level 1 0.871 0.797
2 0.312 0.234
3 0.474 0.357
4 0.360 0.264
5 0.794 0.721
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.938 0.029 0.033 0.910
2/3 0.914 0.047 0.039 0.883
3/4 0.888 0.031 0.081 0.850
4/5 0.859 0.061 0.080 0.790
Figure 8.1.5A
Scale Scores: Oral K S302
[Figure: histogram of scale scores (Scale Score vs. Count)]
Table 8.1.5A
Scale Score Descriptive Statistics: Oral K S302
No. of Students Min. Max. Mean Std. Dev.
203,823 100 369 286.48 66.28
Table 8.1.5Bi
Proficiency Level Distribution: Oral K S302 (Accountability)
Level Count Percent
1 51,598 25.3%
2 29,375 14.4%
3 33,562 16.5%
4 17,908 8.8%
5 28,069 13.8%
6 43,311 21.2%
Total 203,823 100.0%
Figure 8.1.5Bi
Proficiency Level: Oral K S302 (Accountability) [bar chart: Proficiency Level vs. Percent]
Table 8.1.5Bii
Proficiency Level Distribution: Oral K S302 (Instructional)
Level Count Percent
K1 29,747 14.6%
K2 18,588 9.1%
K3 24,444 12.0%
K4 41,756 20.5%
K5 45,977 22.6%
K6 43,311 21.2%
Total 203,823 100.0%
Figure 8.1.5Bii
Proficiency Level: Oral K S302 (Instructional) [bar chart: Proficiency Level vs. Percent]
Figure 8.1.5C
n/a
Figure 8.1.5D
n/a
Table 8.1.5D
Oral Composite Reliability: Oral K S302
Component Weight Variance Reliability
Listening 0.50 4997.123 0.934
Speaking 0.50 4852.854 0.894
Oral 4393.165 0.952
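The composite reliability values in the D tables can be understood through the standard formula for the reliability of a weighted sum of scores with uncorrelated measurement errors: the composite error variance is the sum over components of w_i^2 * Var_i * (1 - rho_i), and the composite reliability is one minus that error variance divided by the observed composite variance. The minimal sketch below is only an illustration of that formula (the function name and layout are ours, not the report's documented procedure); plugging in the Listening and Speaking values from Table 8.1.5D reproduces the tabled Oral composite reliability of 0.952.

```python
# Minimal sketch: reliability of a weighted composite under classical test
# theory, assuming uncorrelated measurement errors across components.
# Input values are taken from Table 8.1.5D (Oral K S302); the report's own
# computational procedure is not documented here, but this formula
# reproduces the tabled composite reliability.

def composite_reliability(components, composite_variance):
    """components: list of (weight, observed variance, reliability) tuples."""
    error_variance = sum(w**2 * var * (1.0 - rel) for w, var, rel in components)
    return 1.0 - error_variance / composite_variance

oral_k = [(0.50, 4997.123, 0.934),   # Listening
          (0.50, 4852.854, 0.894)]   # Speaking
print(round(composite_reliability(oral_k, 4393.165), 3))  # -> 0.952
```

Applying the same computation to the Overall composite in Table 8.1.8D (weights 0.15, 0.35, 0.15, 0.35) likewise yields approximately 0.973, the tabled value.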
Table 8.1.5E
Accuracy and Consistency of Classification Indices: Oral (Grade K) S302
(Accountability)
Overall Accuracy Consistency Kappa (k)
Indices 0.628 0.546 0.447
Conditional Level Accuracy Consistency
on Level 1 0.908 0.862
2 0.633 0.514
3 0.609 0.488
4 0.338 0.226
5 0.359 0.297
6 0.728 0.625
Indices at
Cut Points Accuracy
False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.955 0.023 0.021 0.936
2/3 0.935 0.030 0.035 0.910
3/4 0.927 0.025 0.047 0.899
4/5 0.924 0.027 0.049 0.881
5/6 0.857 0.101 0.042 0.827
Figure 8.1.6A
Scale Scores: Litr K S302
[Figure: histogram of scale scores (Scale Score vs. Count)]
Table 8.1.6A
Scale Score Descriptive Statistics: Litr K S302
No. of Students Min. Max. Mean Std. Dev.
203,837 100 315 201.65 61.11
Table 8.1.6Bi
Proficiency Level Distribution: Litr K S302 (Accountability)
Level Count Percent
1 130,672 64.1%
2 24,411 12.0%
3 24,653 12.1%
4 16,739 8.2%
5 7,362 3.6%
6 0 0.0%
Total 203,837 100.0%
Figure 8.1.6Bi
Proficiency Level: Litr K S302 (Accountability) [bar chart: Proficiency Level vs. Percent]
Table 8.1.6Bii
Proficiency Level Distribution: Litr K S302 (Instructional)
Level Count Percent
K1 35,756 17.5%
K2 50,231 24.6%
K3 37,594 18.4%
K4 25,398 12.5%
K5 41,480 20.3%
K6 13,378 6.6%
Total 203,837 100.0%
Figure 8.1.6Bii
Proficiency Level: Litr K S302 (Instructional) [bar chart: Proficiency Level vs. Percent]
Figure 8.1.6C
n/a
Figure 8.1.6D
n/a
Table 8.1.6D
Literacy Composite Reliability: Litr K S302
Component Weight Variance Reliability
Reading 0.50 4368.024 0.947
Writing 0.50 4299.672 0.922
Literacy 3733.607 0.962
Table 8.1.6E
Accuracy and Consistency of Classification Indices: Litr (Grade K) S302
(Accountability)
Overall Accuracy Consistency Kappa (k)
Indices 0.797 0.747 0.542
Conditional Level Accuracy Consistency
on Level 1 0.961 0.943
2 0.562 0.434
3 0.513 0.399
4 0.452 0.390
5 - 0.264
6 - 0.000
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.947 0.025 0.028 0.925
2/3 0.942 0.024 0.034 0.918
3/4 0.933 0.034 0.033 0.906
4/5 0.964 0.036 0.000 0.955
5/6 1.000 0.000 0.000 1.000
Figure 8.1.7A
Scale Scores: Cphn K S302
[Figure: histogram of scale scores (Scale Score vs. Count)]
Table 8.1.7A
Scale Score Descriptive Statistics: Cphn K S302
No. of Students Min. Max. Mean Std. Dev.
203,837 100 312 215.52 60.34
Table 8.1.7Bi
Proficiency Level Distribution: Cphn K S302 (Accountability)
Level Count Percent
1 120,340 59.0%
2 14,043 6.9%
3 14,430 7.1%
4 13,518 6.6%
5 25,262 12.4%
6 16,244 8.0%
Total 203,837 100.0%
Figure 8.1.7Bi
Proficiency Level: Cphn K S302 (Accountability) [bar chart: Proficiency Level vs. Percent]
Table 8.1.7Bii
Proficiency Level Distribution: Cphn K S302 (Instructional)
Level Count Percent
K1 25,081 12.3%
K2 30,698 15.1%
K3 43,899 21.5%
K4 29,078 14.3%
K5 30,330 14.9%
K6 44,751 22.0%
Total 203,837 100.0%
Figure 8.1.7Bii
Proficiency Level: Cphn K S302 (Instructional) [bar chart: Proficiency Level vs. Percent]
Figure 8.1.7C
n/a
Figure 8.1.7D
n/a
Table 8.1.7D
Comprehension Composite Reliability: Cphn K S302
Component Weight Variance Reliability
Listening 0.30 4997.123 0.934
Reading 0.70 4368.024 0.947
Comprehension 3641.051 0.961
Table 8.1.7E-1
Accuracy and Consistency of Classification Indices: Cphn (Grade K) S302
(Accountability)
Overall Accuracy Consistency Kappa (k)
Indices 0.763 0.708 0.529
Conditional Level Accuracy Consistency
on Level 1 0.961 0.943
2 0.391 0.285
3 0.392 0.285
4 0.350 0.257
5 0.560 0.462
6 0.671 0.531
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.949 0.023 0.028 0.928
2/3 0.950 0.025 0.024 0.928
3/4 0.946 0.029 0.025 0.923
4/5 0.941 0.031 0.028 0.918
5/6 0.945 0.032 0.023 0.925
Figure 8.1.8A
Scale Scores: Over K S302
[Figure: histogram of scale scores (Scale Score vs. Count)]
Table 8.1.8Bi
Proficiency Level Distribution: Over K S302 (Accountability)
Level Count Percent
1 108,070 53.0%
2 32,479 15.9%
3 29,796 14.6%
4 19,836 9.7%
5 11,866 5.8%
6 1,762 0.9%
Total 203,809 100.0%
Figure 8.1.8Bi
Proficiency Level: Over K S302 (Accountability) [bar chart: Proficiency Level vs. Percent]
Table 8.1.8Bii
Proficiency Level Distribution: Over K S302 (Instructional)
Level Count Percent
K1 28,669 14.1%
K2 41,422 20.3%
K3 40,481 19.9%
K4 36,103 17.7%
K5 43,506 21.3%
K6 13,628 6.7%
Total 203,809 100.0%
Figure 8.1.8Bii
Proficiency Level: Over K S302 (Instructional) [bar chart: Proficiency Level vs. Percent]
Figure 8.1.8C
n/a
Figure 8.1.8D
n/a
Table 8.1.8D
Overall Composite Reliability: Over K S302
Component Weight Variance Reliability
Listening 0.15 4997.123 0.934
Reading 0.35 4368.024 0.947
Speaking 0.15 4852.854 0.894
Writing 0.35 4299.672 0.922
Overall Composite 3260.417 0.973
Table 8.1.8E
Accuracy and Consistency of Classification Indices: Over (Grade K) S302
(Accountability)
Overall Accuracy Consistency Kappa (k)
Indices 0.807 0.747 0.616
Conditional Level Accuracy Consistency
on Level 1 0.956 0.936
2 0.704 0.593
3 0.680 0.561
4 0.533 0.444
5 0.253 0.477
6 - 0.149
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.953 0.023 0.024 0.933
2/3 0.951 0.023 0.026 0.931
3/4 0.955 0.021 0.024 0.935
4/5 0.953 0.035 0.011 0.941
5/6 0.991 0.009 0.000 0.991
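The E tables report decision accuracy and consistency overall, conditional on each reported level, and at each cut point, together with false positive and false negative rates. These tables do not state the estimation procedure; the sketch below illustrates one common normal-approximation approach (in the spirit of Rudner-style classification indices) for a single cut point, using per-examinee estimated true scores and conditional SEMs. It is an illustration only, not the method documented for ACCESS for ELLs, and the example inputs are hypothetical.

```python
# Illustrative sketch only: a Rudner-style normal-approximation estimate of
# decision accuracy and consistency at a single cut score. This is not
# necessarily the procedure behind the E tables; it simply shows how such
# indices can be computed from scores, conditional SEMs, and a cut score.
from math import erf, sqrt

def _phi(z):                       # standard normal CDF
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def cut_point_indices(true_scores, sems, cut):
    """true_scores, sems: per-examinee estimated true scores and conditional SEMs."""
    acc = cons = 0.0
    for tau, sem in zip(true_scores, sems):
        p_above = 1.0 - _phi((cut - tau) / sem)      # P(observed score at or above cut)
        acc += p_above if tau >= cut else (1.0 - p_above)
        cons += p_above**2 + (1.0 - p_above)**2      # agreement of two parallel administrations
    n = len(true_scores)
    return acc / n, cons / n

# Hypothetical example: three examinees around a cut score of 300.
accuracy, consistency = cut_point_indices([280.0, 305.0, 340.0], [20.0, 20.0, 20.0], 300.0)
print(round(accuracy, 3), round(consistency, 3))  # -> 0.806 0.736
```

Overall consistency across all levels can be obtained analogously by summing each examinee's squared level-classification probabilities, and kappa then adjusts that agreement for chance, kappa = (Po - Pe) / (1 - Pe).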
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - List 1-2 S302]
Table 8.2.1A
Scale Score Descriptive Statistics: List 1-2 S302
Grade No. of Students Min. Max. Mean Std. Dev.
1 201,336 104 397 299.11 27.35
2 187,608 108 397 325.40 28.51
Total 388,944 104 397 311.79 30.85
Table 8.2.1B
Proficiency Level Distribution: List 1-2 S302
Grade 1 Grade 2 Total
Level Count Percent Count Percent Count Percent
1 4,892 2.4% 2,876 1.5% 7,768 2.0%
2 10,535 5.2% 6,561 3.5% 17,096 4.4%
3 30,868 15.3% 16,335 8.7% 47,203 12.1%
4 46,918 23.3% 18,243 9.7% 65,161 16.8%
5 91,944 45.7% 110,748 59.0% 202,692 52.1%
6 16,179 8.0% 32,845 17.5% 49,024 12.6%
Total 201,336 100.0% 187,608 100.0% 388,944 100.0%
Proficiency Level Grade Cut Score SEM (Tier A) SEM (Tier B) SEM (Tier C)
[Figure: Expected Raw Score vs. Ability Measure by Tier (A, B, C) - List 1-2ABC S302]
Figure 8.2.1D
Test Information Function: List 1-2ABC S302
[Figure: Information vs. Ability Measure by Tier (A, B, C)]
Table 8.2.1E-1
Accuracy and Consistency of Classification Indices: List (Grade 1) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.534 0.395 0.160
Conditional Level Accuracy Consistency
on Level 1 0.808 0.615
2 0.534 0.342
3 0.429 0.274
4 0.354 0.271
5 0.604 0.561
6 - 0.141
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.987 0.003 0.010 0.981
2/3 0.962 0.013 0.025 0.939
3/4 0.853 0.077 0.070 0.770
4/5 0.740 0.097 0.163 0.658
5/6 0.920 0.080 0.000 0.847
Table 8.2.1E-2
Accuracy and Consistency of Classification Indices: List (Grade 2) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.598 0.439 0.149
Conditional Level Accuracy Consistency
on Level 1 0.816 0.618
2 0.548 0.371
3 0.444 0.257
4 0.217 0.135
5 0.675 0.659
6 - 0.265
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.991 0.002 0.007 0.988
2/3 0.975 0.007 0.017 0.962
3/4 0.923 0.036 0.042 0.866
4/5 0.840 0.086 0.074 0.748
5/6 0.825 0.175 0.000 0.731
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Read 1-2 S302]
Table 8.2.2A
Scale Score Descriptive Statistics: Read 1-2 S302
Grade No. of Students Min. Max. Mean Std. Dev.
1 201,177 141 395 283.48 24.82
2 187,492 150 395 310.49 25.84
Total 388,669 141 395 296.51 28.69
Table 8.2.2B
Proficiency Level Distribution: Read 1-2 S302
Grade 1 Grade 2 Total
Level Count Percent Count Percent Count Percent
1 17,131 8.5% 7,724 4.1% 24,855 6.4%
2 20,510 10.2% 12,664 6.8% 33,174 8.5%
3 37,350 18.6% 30,187 16.1% 67,537 17.4%
4 46,516 23.1% 23,355 12.5% 69,871 18.0%
5 65,236 32.4% 85,287 45.5% 150,523 38.7%
6 14,434 7.2% 28,275 15.1% 42,709 11.0%
Total 201,177 100.0% 187,492 100.0% 388,669 100.0%
[Figure: Expected Raw Score vs. Ability Measure by Tier (A, B, C) - Read 1-2ABC S302]
Figure 8.2.2D
Test Information Function: Read 1-2ABC S302
[Figure: Information vs. Ability Measure by Tier (A, B, C)]
Table 8.2.2E-1
Accuracy and Consistency of Classification Indices: Read (Grade 1) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.544 0.428 0.272
Conditional Level Accuracy Consistency
on Level 1 0.814 0.671
2 0.482 0.340
3 0.473 0.354
4 0.433 0.339
5 0.596 0.527
6 - 0.239
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.964 0.015 0.021 0.946
2/3 0.923 0.036 0.041 0.886
3/4 0.860 0.064 0.076 0.806
4/5 0.825 0.073 0.102 0.761
5/6 0.928 0.072 0.000 0.889
Table 8.2.2E-2
Accuracy and Consistency of Classification Indices: Read (Grade 2) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.576 0.459 0.291
Conditional Level Accuracy Consistency
on Level 1 0.802 0.592
2 0.453 0.307
3 0.484 0.349
4 0.259 0.199
5 0.737 0.651
6 0.624 0.474
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.976 0.006 0.018 0.966
2/3 0.944 0.026 0.030 0.913
3/4 0.878 0.067 0.055 0.822
4/5 0.828 0.110 0.061 0.771
5/6 0.892 0.043 0.065 0.838
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Writ 1-2 S302]
Table 8.2.3A
Scale Score Descriptive Statistics: Writ 1-2 S302
Grade No. of Students Min. Max. Mean Std. Dev.
1 201,249 203 365 266.82 20.66
2 187,537 209 363 285.90 21.44
Total 388,786 203 365 276.02 23.10
Table 8.2.3B
Proficiency Level Distribution: Writ 1-2 S302
Grade 1 Grade 2 Total
Level Count Percent Count Percent Count Percent
1 17,984 8.9% 10,252 5.5% 28,236 7.3%
2 97,405 48.4% 64,792 34.5% 162,197 41.7%
3 81,036 40.3% 104,861 55.9% 185,897 47.8%
4 4,808 2.4% 7,602 4.1% 12,410 3.2%
5 15 0.0% 30 0.0% 45 0.0%
6 1 0.0% 0 0.0% 1 0.0%
Total 201,249 100.0% 187,537 100.0% 388,786 100.0%
Proficiency Level Grade Cut Score SEM (Tier A) SEM (Tier B) SEM (Tier C)
[Figure: Expected Raw Score vs. Ability Measure by Tier (A, B, C) - Writ 1-2ABC S302]
Figure 8.2.3D
Test Information Function: Writ 1-2ABC S302
[Figure: Information vs. Ability Measure by Tier (A, B, C)]
Table 8.2.3E-1
Accuracy and Consistency of Classification Indices: Writ (Grade 1) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.863 0.815 0.686
Conditional Level Accuracy Consistency
on Level 1 0.851 0.750
2 0.881 0.838
3 0.846 0.812
4 0.791 0.458
5 - 1.000
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.970 0.013 0.018 0.956
2/3 0.917 0.040 0.043 0.883
3/4 0.977 0.023 0.000 0.975
4/5 1.000 0.000 0.000 1.000
Table 8.2.3E-2
Accuracy and Consistency of Classification Indices: Writ (Grade 2) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.864 0.816 0.668
Conditional Level Accuracy Consistency
on Level 1 0.831 0.727
2 0.865 0.805
3 0.866 0.847
4 - 0.300
5 - 1.000
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.981 0.009 0.010 0.972
2/3 0.923 0.036 0.041 0.892
3/4 0.959 0.041 0.000 0.953
4/5 1.000 0.000 0.000 1.000
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Spek 1-2 S302]
Table 8.2.4A
Scale Score Descriptive Statistics: Spek 1-2 S302
Grade No. of Students Min. Max. Mean Std. Dev.
1 201,244 173 391 337.37 47.84
2 187,558 174 391 356.05 42.90
Total 388,802 173 391 346.38 46.47
Table 8.2.4B
Proficiency Level Distribution: Spek 1-2 S302
Grade 1 Grade 2 Total
Level Count Percent Count Percent Count Percent
1 16,794 8.3% 11,526 6.1% 28,320 7.3%
2 54,483 27.1% 24,077 12.8% 78,560 20.2%
3 37,167 18.5% 29,518 15.7% 66,685 17.2%
4 17,430 8.7% 17,124 9.1% 34,554 8.9%
5 14,834 7.4% 17,116 9.1% 31,950 8.2%
6 60,536 30.1% 88,197 47.0% 148,733 38.3%
Total 201,244 100.0% 187,558 100.0% 388,802 100.0%
Proficiency Level Grade Cut Score SEM
1/2 1 278 20.89
1/2 2 286 19.88
2/3 1 318 18.28
2/3 2 322 18.28
3/4 1 344 19.08
3/4 2 345 19.08
4/5 1 367 20.08
4/5 2 368 20.08
5/6 1 385 20.69
5/6 2 386 20.69
*No equating was performed for S302
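The SEM values in the C tables are conditional standard errors of measurement evaluated at each cut score and are directly related to the test information functions shown in the D figures: under a Rasch-type model with a linear logit-to-scale transformation (scale = a*theta + b), the CSEM on the reporting scale at ability theta is a divided by the square root of the test information at theta. The sketch below illustrates that relationship with a hypothetical slope; it is not the operational scaling used for ACCESS for ELLs.

```python
# Minimal sketch, assuming a Rasch-type model and a linear logit-to-scale
# transformation (scale = a*theta + b). The conditional SEM on the
# reporting scale at ability theta is a / sqrt(I(theta)), where I(theta)
# is the test information plotted in the D figures. The slope "a" below
# is hypothetical, not the transformation used for ACCESS for ELLs.
from math import sqrt

def csem_on_scale(information_at_theta, slope_a):
    return slope_a / sqrt(information_at_theta)

# e.g., information of 1.2 at a cut and a hypothetical slope of 20 scale
# points per logit give a conditional SEM of about 18.3 scale-score points.
print(round(csem_on_scale(1.2, 20.0), 1))  # -> 18.3
```

In this report the C tables already list the CSEM in scale-score units alongside each cut score, so the tabled values can be read directly as the measurement uncertainty of a score falling at that cut.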
[Figure: Expected Raw Score vs. Ability Measure - Spek 1-2 S302]
Figure 8.2.4D
Test Information Function: Spek 1-2 S302
[Figure: Information vs. Ability Measure]
Table 8.2.4E-1
Accuracy and Consistency of Classification Indices: Spek (Grade 1) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.610 0.519 0.409
Conditional Level Accuracy Consistency
on Level 1 0.599 0.467
2 0.717 0.623
3 0.530 0.427
4 0.366 0.249
5 0.277 0.207
6 0.947 0.883
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.938 0.043 0.020 0.908
2/3 0.888 0.047 0.064 0.853
3/4 0.921 0.024 0.055 0.884
4/5 0.949 0.028 0.023 0.916
5/6 0.890 0.098 0.011 0.873
Table 8.2.4E-2
Accuracy and Consistency of Classification Indices: Spek (Grade 2) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.653 0.582 0.441
Conditional Level Accuracy Consistency
on Level 1 0.687 0.542
2 0.576 0.468
3 0.561 0.465
4 0.370 0.269
5 0.305 0.223
6 0.955 0.906
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.964 0.021 0.015 0.945
2/3 0.922 0.039 0.039 0.897
3/4 0.918 0.023 0.060 0.888
4/5 0.945 0.026 0.029 0.910
5/6 0.880 0.103 0.017 0.856
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Oral 1-2 S302]
Table 8.2.5A
Scale Score Descriptive Statistics: Oral 1-2 S302
Grade No. of Students Min. Max. Mean Std. Dev.
1 201,132 139 394 318.47 32.54
2 187,452 141 394 341.02 30.79
Total 388,584 139 394 329.35 33.65
Table 8.2.5B
Proficiency Level Distribution: Oral 1-2 S302
Grade 1 Grade 2 Total
Level Count Percent Count Percent Count Percent
1 8,563 4.3% 5,161 2.8% 13,724 3.5%
2 23,951 11.9% 9,752 5.2% 33,703 8.7%
3 65,602 32.6% 38,724 20.7% 104,326 26.8%
4 30,505 15.2% 30,532 16.3% 61,037 15.7%
5 57,456 28.6% 77,171 41.2% 134,627 34.6%
6 15,055 7.5% 26,112 13.9% 41,167 10.6%
Total 201,132 100.0% 187,452 100.0% 388,584 100.0%
Figure 8.2.5C
n/a
Figure 8.2.5D
n/a
Table 8.2.5D
Oral Composite Reliability: Oral 1-2 S302
Component Weight Variance Reliability
Listening 0.50 950.266 0.688
Speaking 0.50 2155.715 0.891
Oral 1131.001 0.882
*Variances from students who had results in all four domains
Table 8.2.5E-1
Accuracy and Consistency of Classification Indices: Oral (Grade 1) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.630 0.516 0.379
Conditional Level Accuracy Consistency
on Level 1 0.838 0.717
2 0.665 0.528
3 0.773 0.665
4 0.366 0.272
5 0.637 0.574
6 - 0.300
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.984 0.006 0.010 0.976
2/3 0.939 0.033 0.029 0.911
3/4 0.885 0.035 0.080 0.841
4/5 0.882 0.063 0.054 0.828
5/6 0.925 0.075 0.000 0.898
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Litr 1-2 S302]
Table 8.2.6A
Scale Score Descriptive Statistics: Litr 1-2 S302
Grade No. of Students Min. Max. Mean Std. Dev.
1 201,059 172 369 275.44 19.98
2 187,369 180 376 298.49 21.47
Total 388,428 172 376 286.56 23.70
Table 8.2.6B
Proficiency Level Distribution: Litr 1-2 S302
Grade 1 Grade 2 Total
Level Count Percent Count Percent Count Percent
1 11,892 5.9% 7,380 3.9% 19,272 5.0%
2 59,372 29.5% 34,935 18.6% 94,307 24.3%
3 107,734 53.6% 105,639 56.4% 213,373 54.9%
4 15,285 7.6% 24,999 13.3% 40,284 10.4%
5 5,846 2.9% 12,853 6.9% 18,699 4.8%
6 930 0.5% 1,563 0.8% 2,493 0.6%
Total 201,059 100.0% 187,369 100.0% 388,428 100.0%
Figure 8.2.6C
n/a
Figure 8.2.6D
n/a
Table 8.2.6D
Literacy Composite Reliability: Litr 1-2 S302
Component Weight Variance Reliability
Reading 0.50 822.466 0.828
Writing 0.50 532.856 0.925
Literacy 561.317 0.919
Table 8.2.6E-1
Accuracy and Consistency of Classification Indices: Litr (Grade 1) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.813 0.740 0.590
Conditional Level Accuracy Consistency
on Level 1 0.795 0.699
2 0.802 0.721
3 0.880 0.840
4 0.527 0.407
5 0.763 0.673
6 - 0.997
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.975 0.012 0.013 0.966
2/3 0.910 0.049 0.042 0.875
3/4 0.945 0.023 0.032 0.922
4/5 0.978 0.021 0.001 0.978
5/6 0.995 0.005 0.000 0.999
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Cphn 1-2 S302]
Table 8.2.7A
Scale Score Descriptive Statistics: Cphn 1-2 S302
Grade No. of Students Min. Max. Mean Std. Dev.
1 201,119 130 396 288.17 23.06
2 187,438 137 396 315.11 24.50
Total 388,557 130 396 301.17 27.31
Table 8.2.7B
Proficiency Level Distribution: Cphn 1-2 S302
Grade 1 Grade 2 Total
Level Count Percent Count Percent Count Percent
1 8,077 4.0% 4,206 2.2% 12,283 3.2%
2 20,404 10.1% 8,528 4.5% 28,932 7.4%
3 45,271 22.5% 31,646 16.9% 76,917 19.8%
4 49,007 24.4% 30,713 16.4% 79,720 20.5%
5 64,865 32.3% 84,545 45.1% 149,410 38.5%
6 13,495 6.7% 27,800 14.8% 41,295 10.6%
Total 201,119 100.0% 187,438 100.0% 388,557 100.0%
Figure 8.2.7C
n/a
Figure 8.2.7D
n/a
Table 8.2.7D
Comprehension Composite Reliability: Cphn 1-2 S302
Component Weight Variance Reliability
Listening 0.30 950.266 0.688
Reading 0.70 822.466 0.828
Comprehension 745.518 0.871
*Variances from students who had results in all four domains
Table 8.2.7E-1
Accuracy and Consistency of Classification Indices: Cphn (Grade 1) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.623 0.510 0.363
Conditional Level Accuracy Consistency
on Level 1 0.798 0.651
2 0.655 0.505
3 0.605 0.486
4 0.526 0.420
5 0.662 0.584
6 0.632 0.388
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.982 0.007 0.011 0.973
2/3 0.946 0.023 0.031 0.922
3/4 0.879 0.066 0.055 0.831
4/5 0.858 0.059 0.083 0.807
5/6 0.942 0.045 0.013 0.918
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Over 1-2 S302]
Table 8.2.8A
Scale Score Descriptive Statistics: Over 1-2 S302
Grade No. of Students Min. Max. Mean Std. Dev.
1 200,853 162 376 288.11 21.19
2 187,203 168 380 311.00 22.06
Total 388,056 162 380 299.16 24.46
Table 8.2.8B
Proficiency Level Distribution: Over 1-2 S302
Grade 1 Grade 2 Total
Level Count Percent Count Percent Count Percent
1 7,368 3.7% 4,795 2.6% 12,163 3.1%
2 42,025 20.9% 19,187 10.2% 61,212 15.8%
3 109,264 54.4% 82,581 44.1% 191,845 49.4%
4 31,256 15.6% 57,530 30.7% 88,786 22.9%
5 9,231 4.6% 20,218 10.8% 29,449 7.6%
6 1,709 0.9% 2,892 1.5% 4,601 1.2%
Total 200,853 100.0% 187,203 100.0% 388,056 100.0%
Figure 8.2.8C
n/a
Figure 8.2.8D
n/a
Table 8.2.8D
Overall Composite Reliability: Over 1-2 S302
Table 8.2.8E-1
Accuracy and Consistency of Classification Indices: Over (Grade 1) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.829 0.765 0.636
Conditional Level Accuracy Consistency
on Level 1 0.787 0.763
2 0.820 0.741
3 0.896 0.860
4 0.674 0.573
5 0.723 0.591
6 - 0.995
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.985 0.008 0.007 0.983
2/3 0.938 0.035 0.028 0.915
3/4 0.930 0.031 0.040 0.905
4/5 0.970 0.024 0.006 0.966
5/6 0.991 0.009 0.000 0.995
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - List 3-5 S302]
Table 8.3.1A
Scale Score Descriptive Statistics: List 3-5 S302
Grade No. of Students Min. Max. Mean Std. Dev.
3 164,211 112 469 349.43 34.80
4 105,153 116 469 360.74 36.54
5 82,925 120 469 370.90 38.07
Total 352,289 112 469 357.86 37.15
Table 8.3.1B
Proficiency Level Distribution: List 3-5 S302
Grade 3 Grade 4 Grade 5 Total
Level Count Percent Count Percent Count Percent Count Percent
1 939 0.6% 1,042 1.0% 1,298 1.6% 3,279 0.9%
2 6,685 4.1% 5,076 4.8% 4,538 5.5% 16,299 4.6%
3 19,397 11.8% 12,013 11.4% 11,347 13.7% 42,757 12.1%
4 15,298 9.3% 17,721 16.9% 15,638 18.9% 48,657 13.8%
5 73,211 44.6% 43,069 41.0% 29,186 35.2% 145,466 41.3%
6 48,681 29.6% 26,232 24.9% 20,918 25.2% 95,831 27.2%
Total 164,211 100.0% 105,153 100.0% 82,925 100.0% 352,289 100.0%
Proficiency Level Grade Cut Score SEM (Tier A) SEM (Tier B) SEM (Tier C)
[Figure: Expected Raw Score vs. Ability Measure by Tier (A, B, C) - List 3-5ABC S302]
Figure 8.3.1D
Test Information Function: List 3-5ABC S302
[Figure: Information vs. Ability Measure by Tier (A, B, C)]
Table 8.3.1E-1
Accuracy and Consistency of Classification Indices: List (Grade 3) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.537 0.428 0.213
Conditional Level Accuracy Consistency
on Level 1 0.711 0.258
2 0.509 0.270
3 0.414 0.267
4 0.183 0.137
5 0.616 0.540
6 0.677 0.544
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.994 0.000 0.005 0.993
2/3 0.962 0.007 0.031 0.939
3/4 0.883 0.051 0.066 0.818
4/5 0.811 0.120 0.069 0.739
5/6 0.806 0.101 0.093 0.734
Table 8.3.1E-2
Accuracy and Consistency of Classification Indices: List (Grade 4) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.505 0.393 0.186
Conditional Level Accuracy Consistency
on Level 1 0.744 0.396
2 0.524 0.303
3 0.381 0.249
4 0.313 0.237
5 0.555 0.485
6 0.607 0.464
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.992 0.001 0.007 0.988
2/3 0.959 0.011 0.029 0.933
3/4 0.880 0.058 0.062 0.817
4/5 0.795 0.100 0.105 0.726
5/6 0.804 0.099 0.097 0.729
Table 8.3.1E-3
Accuracy and Consistency of Classification Indices: List (Grade 5) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.483 0.378 0.182
Conditional Level Accuracy Consistency
on Level 1 0.764 0.447
2 0.486 0.295
3 0.415 0.284
4 0.341 0.260
5 0.486 0.421
6 0.625 0.468
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.988 0.002 0.010 0.983
2/3 0.953 0.015 0.032 0.924
3/4 0.868 0.059 0.073 0.808
4/5 0.794 0.088 0.119 0.723
5/6 0.802 0.117 0.081 0.730
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Read 3-5 S302]
Table 8.3.2A
Scale Score Descriptive Statistics: Read 3-5 S302
Grade No. of Students Min. Max. Mean Std. Dev.
3 164,072 158 448 331.68 25.63
4 105,032 166 448 340.84 27.35
5 82,838 175 448 349.48 29.27
Total 351,942 158 448 338.60 27.98
Table 8.3.2B
Proficiency Level Distribution: Read 3-5 S302
Grade 3 Grade 4 Grade 5 Total
Level Count Percent Count Percent Count Percent Count Percent
1 3,660 2.2% 4,475 4.3% 4,649 5.6% 12,784 3.6%
2 10,157 6.2% 9,743 9.3% 9,333 11.3% 29,233 8.3%
3 24,294 14.8% 18,439 17.6% 20,440 24.7% 63,173 17.9%
4 13,671 8.3% 16,467 15.7% 6,826 8.2% 36,964 10.5%
5 76,417 46.6% 33,950 32.3% 26,484 32.0% 136,851 38.9%
6 35,873 21.9% 21,958 20.9% 15,106 18.2% 72,937 20.7%
Total 164,072 100.0% 105,032 100.0% 82,838 100.0% 351,942 100.0%
[Figure: Expected Raw Score vs. Ability Measure by Tier (A, B, C) - Read 3-5ABC S302]
Figure 8.3.2D
Test Information Function: Read 3-5ABC S302
[Figure: Information vs. Ability Measure by Tier (A, B, C)]
Tier No. of Students Reliability
A 29,169 0.838
B 148,739 0.805
C 174,034 0.748
Weighted Reliability (across tiers): 0.779
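The weighted reliability shown for the tiered Reading forms is consistent with an average of the tier reliabilities weighted by the number of students taking each tier; with the rounded values in the table this gives about 0.780 versus the tabled 0.779. The sketch below illustrates that computation and is not a statement of the report's exact procedure.

```python
# Minimal sketch: N-weighted average of per-tier reliabilities. With the
# rounded values tabled above for Read 3-5, this gives 0.780, matching the
# tabled 0.779 to within rounding of the inputs; the report's own weighting
# procedure is not documented in the table itself.

def weighted_reliability(tiers):
    """tiers: list of (n_students, reliability) tuples."""
    total_n = sum(n for n, _ in tiers)
    return sum(n * rel for n, rel in tiers) / total_n

read_3_5 = [(29169, 0.838), (148739, 0.805), (174034, 0.748)]  # Tiers A, B, C
print(f"{weighted_reliability(read_3_5):.3f}")  # -> 0.780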
Table 8.3.2E-1
Accuracy and Consistency of Classification Indices: Read (Grade 3) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.567 0.450 0.260
Conditional Level Accuracy Consistency
on Level 1 0.770 0.540
2 0.526 0.348
3 0.490 0.341
4 0.182 0.134
5 0.691 0.605
6 0.597 0.478
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.986 0.004 0.010 0.980
2/3 0.953 0.020 0.028 0.926
3/4 0.885 0.053 0.062 0.829
4/5 0.840 0.093 0.067 0.777
5/6 0.832 0.063 0.105 0.768
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Writ 3-5 S302]
Table 8.3.3A
Scale Score Descriptive Statistics: Writ 3-5 S302
Grade No. of Students Min. Max. Mean Std. Dev.
3 164,071 215 409 340.82 25.58
4 105,084 221 427 346.85 25.47
5 82,870 227 456 352.81 25.34
Total 352,025 215 456 345.45 25.95
Table 8.3.3B
Proficiency Level Distribution: Writ 3-5 S302
Grade 3 Grade 4 Grade 5 Total
Level Count Percent Count Percent Count Percent Count Percent
1 1,982 1.2% 1,857 1.8% 2,031 2.5% 5,870 1.7%
2 8,428 5.1% 6,593 6.3% 5,126 6.2% 20,147 5.7%
3 26,453 16.1% 20,757 19.8% 21,928 26.5% 69,138 19.6%
4 94,135 57.4% 60,918 58.0% 46,976 56.7% 202,029 57.4%
5 31,981 19.5% 14,553 13.8% 6,646 8.0% 53,180 15.1%
6 1,092 0.7% 406 0.4% 163 0.2% 1,661 0.5%
Total 164,071 100.0% 105,084 100.0% 82,870 100.0% 352,025 100.0%
Proficiency Level Grade Cut Score SEM (Tier A) SEM (Tier B) SEM (Tier C)
[Figure: Expected Raw Score vs. Ability Measure by Tier (A, B, C) - Writ 3-5ABC S302]
Figure 8.3.3D
Test Information Function: Writ 3-5ABC S302
[Figure: Information vs. Ability Measure by Tier (A, B, C)]
Table 8.3.3E-1
Accuracy and Consistency of Classification Indices: Writ (Grade 3) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.742 0.670 0.461
Conditional Level Accuracy Consistency
on Level 1 0.785 0.756
2 0.811 0.716
3 0.799 0.702
4 0.826 0.750
5 0.481 0.426
6 - 0.973
Indices at Accuracy
False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.995 0.003 0.002 0.994
2/3 0.983 0.008 0.009 0.977
3/4 0.949 0.024 0.027 0.928
4/5 0.834 0.070 0.097 0.774
5/6 0.993 0.007 0.000 0.994
Table 8.3.3E-2
Accuracy and Consistency of Classification Indices: Writ (Grade 4) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.779 0.717 0.507
Conditional Level Accuracy Consistency
on Level 1 0.747 0.784
2 0.812 0.715
3 0.839 0.749
4 0.762 0.758
5 - 0.304
6 - 1.000
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.992 0.005 0.003 0.993
2/3 0.978 0.012 0.010 0.973
3/4 0.944 0.022 0.034 0.924
4/5 0.858 0.142 0.000 0.828
5/6 0.996 0.004 0.000 0.999
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Spek 3-5 S302]
Table 8.3.4A
Scale Score Descriptive Statistics: Spek 3-5 S302
Grade No. of Students Min. Max. Mean Std. Dev.
3 164,141 175 403 355.70 44.80
4 105,094 176 403 358.06 45.60
5 82,870 177 403 361.87 46.17
Total 352,105 175 403 357.86 45.43
Table 8.3.4B
Proficiency Level Distribution: Spek 3-5 S302
Grade 3 Grade 4 Grade 5 Total
Level Count Percent Count Percent Count Percent Count Percent
1 11,295 6.9% 9,217 8.8% 7,066 8.5% 27,578 7.8%
2 30,686 18.7% 15,070 14.3% 9,789 11.8% 55,545 15.8%
3 31,833 19.4% 18,828 17.9% 13,372 16.1% 64,033 18.2%
4 18,702 11.4% 12,598 12.0% 9,301 11.2% 40,601 11.5%
5 18,485 11.3% 12,654 12.0% 9,990 12.1% 41,129 11.7%
6 53,140 32.4% 36,727 34.9% 33,352 40.2% 123,219 35.0%
Total 164,141 100.0% 105,094 100.0% 82,870 100.0% 352,105 100.0%
Proficiency Level Grade Cut Score SEM
1/2 3 293 19.08
1/2 4 299 19.48
1/2 5 305 19.68
2/3 3 326 20.89
2/3 4 329 21.09
2/3 5 333 21.49
3/4 3 346 22.29
3/4 4 348 22.49
3/4 5 350 22.69
4/5 3 369 24.90
4/5 4 371 25.31
4/5 5 374 25.71
5/6 3 389 27.31
5/6 4 391 27.52
5/6 5 394 27.52
[Figure: Expected Raw Score vs. Ability Measure - Spek 3-5 S302]
Figure 8.3.4D
Test Information Function: Spek 3-5 S302
[Figure: Information vs. Ability Measure]
Table 8.3.4E-1
Accuracy and Consistency of Classification Indices: Spek (Grade 3) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.603 0.512 0.397
Conditional Level Accuracy Consistency
on Level 1 0.650 0.493
2 0.614 0.506
3 0.519 0.431
4 0.353 0.261
5 0.379 0.274
6 0.915 0.844
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.955 0.028 0.017 0.930
2/3 0.890 0.050 0.060 0.857
3/4 0.890 0.029 0.082 0.851
4/5 0.930 0.033 0.037 0.887
5/6 0.903 0.073 0.023 0.872
Table 8.3.4E-2
Accuracy and Consistency of Classification Indices: Spek (Grade 4) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.591 0.501 0.380
Conditional Level Accuracy Consistency
on Level 1 0.718 0.576
2 0.524 0.420
3 0.513 0.426
4 0.360 0.269
5 0.350 0.247
6 0.901 0.825
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.954 0.027 0.019 0.929
2/3 0.898 0.050 0.052 0.868
3/4 0.890 0.029 0.082 0.854
4/5 0.923 0.033 0.044 0.877
5/6 0.884 0.087 0.029 0.844
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Oral 3-5 S302]
Table 8.3.5A
Scale Score Descriptive Statistics: Oral 3-5 S302
Grade No. of Students Min. Max. Mean Std. Dev.
3 164,066 144 436 352.84 34.11
4 105,043 146 436 359.63 35.50
5 82,831 149 436 366.65 36.84
Total 351,940 144 436 358.12 35.62
Table 8.3.5B
Proficiency Level Distribution: Oral 3-5 S302
Grade 3 Grade 4 Grade 5 Total
Level Count Percent Count Percent Count Percent Count Percent
1 3,809 2.3% 3,282 3.1% 3,454 4.2% 10,545 3.0%
2 11,130 6.8% 6,802 6.5% 5,787 7.0% 23,719 6.7%
3 29,982 18.3% 17,709 16.9% 11,681 14.1% 59,372 16.9%
4 35,592 21.7% 22,292 21.2% 16,789 20.3% 74,673 21.2%
5 49,079 29.9% 34,075 32.4% 30,801 37.2% 113,955 32.4%
6 34,474 21.0% 20,883 19.9% 14,319 17.3% 69,676 19.8%
Total 164,066 100.0% 105,043 100.0% 82,831 100.0% 351,940 100.0%
Figure 8.3.5C
n/a
Figure 8.3.5D
n/a
Table 8.3.5D
Oral Composite Reliability: Oral 3-5 S302
Component Weight Variance Reliability
Listening 0.50 1377.820 0.657
Speaking 0.50 2058.602 0.891
Oral 1266.567 0.863
Table 8.3.5E-1
Accuracy and Consistency of Classification Indices: Oral (Grade 3) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.612 0.497 0.358
Conditional Level Accuracy Consistency
on Level 1 0.824 0.657
2 0.586 0.429
3 0.619 0.487
4 0.506 0.402
5 0.596 0.492
6 0.727 0.604
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.989 0.003 0.008 0.984
2/3 0.956 0.023 0.022 0.933
3/4 0.896 0.044 0.060 0.857
4/5 0.870 0.056 0.074 0.819
5/6 0.885 0.057 0.058 0.838
Table 8.3.5E-3
Accuracy and Consistency of Classification Indices: Oral (Grade 5) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.581 0.468 0.308
Conditional Level Accuracy Consistency
on Level 1 0.844 0.714
2 0.561 0.415
3 0.558 0.425
4 0.500 0.384
5 0.616 0.538
6 0.560 0.424
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.983 0.006 0.011 0.976
2/3 0.955 0.022 0.022 0.934
3/4 0.910 0.037 0.052 0.876
4/5 0.869 0.054 0.077 0.816
5/6 0.847 0.082 0.071 0.797
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Litr 3-5 S302]
Table 8.3.6A
Scale Score Descriptive Statistics: Litr 3-5 S302
Grade No. of Students Min. Max. Mean Std. Dev.
3 163,916 187 422 336.47 23.14
4 104,940 194 433 344.14 24.14
5 82,761 201 436 351.43 25.19
Total 351,617 187 436 342.28 24.68
Table 8.3.6B
Proficiency Level Distribution: Litr 3-5 S302
Grade 3 Grade 4 Grade 5 Total
Level Count Percent Count Percent Count Percent Count Percent
1 1,908 1.2% 2,137 2.0% 2,772 3.3% 6,817 1.9%
2 8,438 5.1% 6,992 6.7% 6,543 7.9% 21,973 6.2%
3 26,975 16.5% 21,617 20.6% 23,138 28.0% 71,730 20.4%
4 66,967 40.9% 46,272 44.1% 31,950 38.6% 145,189 41.3%
5 46,008 28.1% 21,384 20.4% 14,182 17.1% 81,574 23.2%
6 13,620 8.3% 6,538 6.2% 4,176 5.0% 24,334 6.9%
Total 163,916 100.0% 104,940 100.0% 82,761 100.0% 351,617 100.0%
Figure 8.3.6C
n/a
Figure 8.3.6D
n/a
Table 8.3.6D
Literacy Composite Reliability: Litr 3-5 S302
Component Weight Variance Reliability
Reading 0.50 781.583 0.779
Writing 0.50 672.252 0.924
Literacy 608.817 0.908
*Variances from students who had results in all four domains
Table 8.3.6E-1
Accuracy and Consistency of Classification Indices: Litr (Grade 3) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.705 0.617 0.471
Conditional Level Accuracy Consistency
on Level 1 0.846 0.725
2 0.759 0.636
3 0.698 0.575
4 0.808 0.723
5 0.606 0.556
6 0.714 0.472
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.995 0.002 0.003 0.993
2/3 0.979 0.009 0.012 0.970
3/4 0.926 0.041 0.033 0.893
4/5 0.885 0.036 0.079 0.842
5/6 0.919 0.079 0.002 0.914
Table 8.3.6E-3
Accuracy and Consistency of Classification Indices: Litr (Grade 5) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.688 0.588 0.445
Conditional Level Accuracy Consistency
on Level 1 0.881 0.794
2 0.695 0.566
3 0.763 0.659
4 0.743 0.641
5 0.513 0.439
6 - 0.334
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.989 0.004 0.007 0.985
2/3 0.964 0.018 0.017 0.948
3/4 0.899 0.047 0.054 0.858
4/5 0.885 0.037 0.078 0.837
5/6 0.950 0.050 0.000 0.946
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Cphn 3-5 S302]
Table 8.3.7A
Scale Score Descriptive Statistics: Cphn 3-5 S302
Grade No. of Students Min. Max. Mean Std. Dev.
3 164,030 144 454 337.18 26.39
4 104,998 151 454 346.95 27.94
5 82,804 159 454 355.99 29.73
Total 351,832 144 454 344.52 28.70
Table 8.3.7B
Proficiency Level Distribution: Cphn 3-5 S302
Grade 3 Grade 4 Grade 5 Total
Level Count Percent Count Percent Count Percent Count Percent
1 1,507 0.9% 1,868 1.8% 2,516 3.0% 5,891 1.7%
2 8,114 4.9% 7,035 6.7% 6,794 8.2% 21,943 6.2%
3 21,263 13.0% 20,392 19.4% 18,307 22.1% 59,962 17.0%
4 24,724 15.1% 18,647 17.8% 13,425 16.2% 56,796 16.1%
5 67,993 41.5% 35,685 34.0% 26,499 32.0% 130,177 37.0%
6 40,429 24.6% 21,371 20.4% 15,263 18.4% 77,063 21.9%
Total 164,030 100.0% 104,998 100.0% 82,804 100.0% 351,832 100.0%
Figure 8.3.7C
n/a
Figure 8.3.7D
n/a
Table 8.3.7D
Comprehension Composite Reliability: Cphn 3-5 S302
Component Weight Variance Reliability
Listening 0.30 1377.820 0.657
Reading 0.70 781.583 0.779
Comprehension 822.711 0.845
*Variances from students who had results in all four domains
Table 8.3.7E-1
Accuracy and Consistency of Classification Indices: Cphn (Grade 3) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.634 0.524 0.356
Conditional Level Accuracy Consistency
on Level 1 0.781 0.548
2 0.653 0.471
3 0.533 0.388
4 0.379 0.286
5 0.691 0.606
6 0.739 0.624
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.994 0.001 0.005 0.992
2/3 0.970 0.011 0.019 0.954
3/4 0.908 0.051 0.041 0.863
4/5 0.856 0.074 0.070 0.805
5/6 0.877 0.056 0.067 0.824
Table 8.3.7E-3
Accuracy and Consistency of Classification Indices: Cphn (Grade 5) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.587 0.477 0.332
Conditional Level Accuracy Consistency
on Level 1 0.826 0.656
2 0.592 0.435
3 0.593 0.471
4 0.356 0.276
5 0.595 0.502
6 0.708 0.564
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.985 0.004 0.012 0.978
2/3 0.949 0.024 0.027 0.922
3/4 0.870 0.068 0.062 0.820
4/5 0.846 0.068 0.086 0.793
5/6 0.891 0.056 0.053 0.842
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Over 3-5 S302]
Table 8.3.8A
Scale Score Descriptive Statistics: Over 3-5 S302
Grade No. of Students Min. Max. Mean Std. Dev.
3 163,789 174 426 341.21 24.49
4 104,822 179 433 348.57 25.59
5 82,666 185 436 355.77 26.81
Total 351,277 174 436 346.83 26.05
Table 8.3.8B
Proficiency Level Distribution: Over 3-5 S302
Grade 3 Grade 4 Grade 5 Total
Level Count Percent Count Percent Count Percent Count Percent
1 2,152 1.3% 2,358 2.2% 2,643 3.2% 7,153 2.0%
2 7,783 4.8% 6,201 5.9% 5,566 6.7% 19,550 5.6%
3 26,956 16.5% 20,187 19.3% 17,803 21.5% 64,946 18.5%
4 58,649 35.8% 41,058 39.2% 32,075 38.8% 131,782 37.5%
5 49,480 30.2% 26,434 25.2% 18,894 22.9% 94,808 27.0%
6 18,769 11.5% 8,584 8.2% 5,685 6.9% 33,038 9.4%
Total 163,789 100.0% 104,822 100.0% 82,666 100.0% 351,277 100.0%
Figure 8.3.8C
n/a
Figure 8.3.8D
n/a
Table 8.3.8D
Overall Composite Reliability: Over 3-5 S302
Component Weight Variance Reliability
Listening 0.15 1377.820 0.657
Reading 0.35 781.583 0.779
Speaking 0.15 2058.602 0.891
Writing 0.35 672.252 0.924
Overall Composite 678.626 0.937
Table 8.3.8E-1
Accuracy and Consistency of Classification Indices: Over (Grade 3) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.756 0.665 0.548
Conditional Level Accuracy Consistency
on Level 1 0.897 0.816
2 0.764 0.655
3 0.751 0.643
4 0.818 0.739
5 0.695 0.621
6 0.774 0.626
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.996 0.001 0.003 0.994
2/3 0.982 0.009 0.009 0.974
3/4 0.938 0.033 0.028 0.912
4/5 0.910 0.031 0.059 0.874
5/6 0.930 0.052 0.018 0.909
Table 8.3.8E-3
Accuracy and Consistency of Classification Indices: Over (Grade 5) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.720 0.635 0.509
Conditional Level Accuracy Consistency
on Level 1 0.904 0.841
2 0.735 0.623
3 0.766 0.662
4 0.808 0.720
5 0.584 0.529
6 - 0.409
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.992 0.003 0.006 0.989
2/3 0.974 0.014 0.012 0.963
3/4 0.925 0.038 0.037 0.893
4/5 0.897 0.029 0.074 0.858
5/6 0.931 0.069 0.000 0.928
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - List 6-8 S302]
Table 8.4.1A
Scale Score Descriptive Statistics: List 6-8 S302
Grade No. of Students Min. Max. Mean Std. Dev.
6 73,891 124 473 377.12 41.48
7 74,881 128 473 385.15 44.36
8 70,706 132 473 391.41 46.32
Total 219,478 124 473 384.46 44.45
Table 8.4.1B
Proficiency Level Distribution: List 6-8 S302
Grade 6 Grade 7 Grade 8 Total
Level Count Percent Count Percent Count Percent Count Percent
1 1,475 2.0% 2,328 3.1% 3,100 4.4% 6,903 3.1%
2 6,193 8.4% 7,003 9.4% 7,637 10.8% 20,833 9.5%
3 11,788 16.0% 12,372 16.5% 7,539 10.7% 31,699 14.4%
4 14,320 19.4% 14,390 19.2% 15,333 21.7% 44,043 20.1%
5 25,821 34.9% 24,829 33.2% 19,773 28.0% 70,423 32.1%
6 14,294 19.3% 13,959 18.6% 17,324 24.5% 45,577 20.8%
Total 73,891 100.0% 74,881 100.0% 70,706 100.0% 219,478 100.0%
Proficiency Level Grade Cut Score SEM (Tier A) SEM (Tier B) SEM (Tier C)
[Figure: Expected Raw Score vs. Ability Measure by Tier (A, B, C) - List 6-8ABC S302]
Figure 8.4.1D
Test Information Function: List 6-8ABC S302
[Figure: Information vs. Ability Measure by Tier (A, B, C)]
Table 8.4.1E-1
Accuracy and Consistency of Classification Indices: List (Grade 6) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.446 0.343 0.155
Conditional Level Accuracy Consistency
on Level 1 0.671 0.376
2 0.519 0.331
3 0.406 0.286
4 0.314 0.242
5 0.480 0.425
6 0.512 0.353
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.984 0.004 0.012 0.974
2/3 0.930 0.023 0.047 0.895
3/4 0.848 0.058 0.093 0.784
4/5 0.777 0.096 0.128 0.698
5/6 0.810 0.124 0.067 0.739
Table 8.4.1E-3
Accuracy and Consistency of Classification Indices: List (Grade 8) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.400 0.326 0.150
Conditional Level Accuracy Consistency
on Level 1 0.699 0.456
2 0.508 0.344
3 0.268 0.184
4 0.353 0.265
5 0.378 0.341
6 0.556 0.406
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.968 0.009 0.023 0.952
2/3 0.912 0.023 0.064 0.875
3/4 0.861 0.060 0.078 0.793
4/5 0.772 0.094 0.133 0.690
5/6 0.768 0.180 0.052 0.709
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Read 6-8 S302]
Table 8.4.2A
Scale Score Descriptive Statistics: Read 6-8 S302
Grade No. of Students Min. Max. Mean Std. Dev.
6 73,817 183 458 351.16 24.27
7 74,844 191 458 358.84 26.35
8 70,659 200 458 365.56 28.63
Total 219,320 183 458 358.42 27.09
Table 8.4.2B
Proficiency Level Distribution: Read 6-8 S302
Grade 6 Grade 7 Grade 8 Total
Level Count Percent Count Percent Count Percent Count Percent
1 3,493 4.7% 4,931 6.6% 6,217 8.8% 14,641 6.7%
2 15,059 20.4% 17,500 23.4% 18,273 25.9% 50,832 23.2%
3 26,439 35.8% 23,810 31.8% 18,598 26.3% 68,847 31.4%
4 9,163 12.4% 8,568 11.4% 5,271 7.5% 23,002 10.5%
5 14,568 19.7% 14,182 18.9% 15,002 21.2% 43,752 19.9%
6 5,095 6.9% 5,853 7.8% 7,298 10.3% 18,246 8.3%
Total 73,817 100.0% 74,844 100.0% 70,659 100.0% 219,320 100.0%
[Figure: Expected Raw Score vs. Ability Measure by Tier (A, B, C) - Read 6-8ABC S302]
Figure 8.4.2D
Test Information Function: Read 6-8ABC S302
[Figure: Information vs. Ability Measure by Tier (A, B, C)]
Table 8.4.2E-1
Accuracy and Consistency of Classification Indices: Read (Grade 6) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.497 0.400 0.223
Conditional Level Accuracy Consistency
on Level 1 0.730 0.549
2 0.686 0.527
3 0.578 0.465
4 0.205 0.170
5 0.412 0.338
6 - 0.177
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.974 0.012 0.014 0.958
2/3 0.887 0.042 0.071 0.840
3/4 0.790 0.077 0.133 0.717
4/5 0.794 0.092 0.114 0.727
5/6 0.931 0.069 0.000 0.897
Table 8.4.2E-3
Accuracy and Consistency of Classification Indices: Read (Grade 8) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.490 0.392 0.241
Conditional Level Accuracy Consistency
on Level 1 0.736 0.559
2 0.648 0.514
3 0.467 0.369
4 0.140 0.108
5 0.443 0.368
6 0.536 0.310
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.950 0.021 0.029 0.924
2/3 0.857 0.056 0.087 0.803
3/4 0.825 0.076 0.099 0.757
4/5 0.827 0.092 0.081 0.760
5/6 0.898 0.095 0.007 0.858
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Writ 6-8 S302]
Table 8.4.3A
Scale Score Descriptive Statistics: Writ 6-8 S302
Grade No. of Students Min. Max. Mean Std. Dev.
6 73,833 233 416 349.34 23.45
7 74,840 239 440 355.17 23.13
8 70,654 245 438 359.80 23.10
Total 219,327 233 440 354.70 23.62
Table 8.4.3B
Proficiency Level Distribution: Writ 6-8 S302
Grade 6 Grade 7 Grade 8 Total
Level Count Percent Count Percent Count Percent Count Percent
1 2,526 3.4% 3,660 4.9% 4,299 6.1% 10,485 4.8%
2 9,221 12.5% 9,253 12.4% 11,238 15.9% 29,712 13.5%
3 35,381 47.9% 43,231 57.8% 45,067 63.8% 123,679 56.4%
4 26,136 35.4% 18,455 24.7% 9,962 14.1% 54,553 24.9%
5 566 0.8% 237 0.3% 85 0.1% 888 0.4%
6 3 0.0% 4 0.0% 3 0.0% 10 0.0%
Total 73,833 100.0% 74,840 100.0% 70,654 100.0% 219,327 100.0%
Proficiency Level Grade Cut Score SEM (Tier A) SEM (Tier B) SEM (Tier C)
[Figure: Expected Raw Score vs. Ability Measure by Tier (A, B, C) - Writ 6-8ABC S302]
Figure 8.4.3D
Test Information Function: Writ 6-8ABC S302
[Figure: Information vs. Ability Measure by Tier (A, B, C)]
Table 8.4.3E-1
Accuracy and Consistency of Classification Indices: Writ (Grade 6) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.812 0.739 0.585
Conditional Level Accuracy Consistency
on Level 1 0.859 0.767
2 0.796 0.698
3 0.848 0.764
4 0.774 0.719
5 - 0.000
6 - 1.000
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.989 0.005 0.006 0.985
2/3 0.959 0.019 0.021 0.942
3/4 0.871 0.047 0.082 0.819
4/5 0.992 0.008 0.000 0.992
5/6 1.000 0.000 0.000 1.000
Table 8.4.3E-3
Accuracy and Consistency of Classification Indices: Writ (Grade 8) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.798 0.732 0.491
Conditional Level Accuracy Consistency
on Level 1 0.885 0.815
2 0.826 0.738
3 0.786 0.791
4 - 0.311
5 - -
6 - 1.000
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.985 0.007 0.008 0.979
2/3 0.955 0.019 0.026 0.936
3/4 0.858 0.142 0.000 0.817
4/5 0.999 0.001 0.000 0.999
5/6 1.000 0.000 0.000 1.000
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Spek 6-8 S302]
Table 8.4.4A
Scale Score Descriptive Statistics: Spek 6-8 S302
Grade No. of Students Min. Max. Mean Std. Dev.
6 73,805 178 416 369.61 47.71
7 74,819 179 416 372.59 48.65
8 70,595 180 416 374.63 49.97
Total 219,219 178 416 372.24 48.81
Table 8.4.4B
Proficiency Level Distribution: Spek 6-8 S302
Grade 6 Grade 7 Grade 8 Total
Level Count Percent Count Percent Count Percent Count Percent
1 5,206 7.1% 6,406 8.6% 6,266 8.9% 17,878 8.2%
2 6,570 8.9% 4,615 6.2% 7,442 10.5% 18,627 8.5%
3 11,431 15.5% 10,240 13.7% 11,505 16.3% 33,176 15.1%
4 18,194 24.7% 17,218 23.0% 8,300 11.8% 43,712 19.9%
5 8,505 11.5% 8,930 11.9% 8,457 12.0% 25,892 11.8%
6 23,899 32.4% 27,410 36.6% 28,625 40.5% 79,934 36.5%
Total 73,805 100.0% 74,819 100.0% 70,595 100.0% 219,219 100.0%
Proficiency Level Grade Cut Score SEM
1/2 6 310 22.09
1/2 7 314 22.29
1/2 8 317 22.69
2/3 6 337 23.50
2/3 7 340 23.50
2/3 8 344 23.70
3/4 6 353 23.50
3/4 7 358 23.30
3/4 8 361 23.30
4/5 6 377 22.69
4/5 7 380 22.29
4/5 8 384 22.09
5/6 6 397 21.49
5/6 7 400 21.49
5/6 8 404 21.49
[Figure: Expected Raw Score vs. Ability Measure - Spek 6-8 S302]
Figure 8.4.4D
Test Information Function: Spek 6-8 S302
[Figure: Information vs. Ability Measure]
Table 8.4.4E-1
Accuracy and Consistency of Classification Indices: Spek (Grade 6) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.563 0.456 0.322
Conditional Level Accuracy Consistency
on Level 1 0.725 0.576
2 0.405 0.299
3 0.453 0.370
4 0.552 0.455
5 0.250 0.179
6 0.804 0.710
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.966 0.022 0.012 0.944
2/3 0.913 0.054 0.034 0.882
3/4 0.874 0.047 0.079 0.842
4/5 0.876 0.022 0.102 0.827
5/6 0.872 0.064 0.063 0.802
Table 8.4.4E-3
Accuracy and Consistency of Classification Indices: Spek (Grade 8) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.533 0.434 0.291
Conditional Level Accuracy Consistency
on Level 1 0.718 0.595
2 0.445 0.356
3 0.526 0.443
4 0.320 0.228
5 0.225 0.175
6 0.805 0.732
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.957 0.029 0.014 0.934
2/3 0.908 0.050 0.042 0.884
3/4 0.891 0.022 0.087 0.864
4/5 0.910 0.021 0.069 0.852
5/6 0.800 0.135 0.066 0.731
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Oral 6-8 S302]
Table 8.4.5A
Scale Score Descriptive Statistics: Oral 6-8 S302
Grade No. of Students Min. Max. Mean Std. Dev.
6 73,741 151 445 373.63 39.11
7 74,722 154 445 379.15 41.12
8 70,516 156 445 383.33 42.98
Total 218,979 151 445 378.64 41.26
Table 8.4.5B
Proficiency Level Distribution: Oral 6-8 S302
Grade 6 Grade 7 Grade 8 Total
Level Count Percent Count Percent Count Percent Count Percent
1 3,450 4.7% 4,331 5.8% 4,761 6.8% 12,542 5.7%
2 5,243 7.1% 5,312 7.1% 5,500 7.8% 16,055 7.3%
3 9,040 12.3% 9,747 13.0% 8,580 12.2% 27,367 12.5%
4 18,364 24.9% 16,435 22.0% 15,133 21.5% 49,932 22.8%
5 23,316 31.6% 22,330 29.9% 20,850 29.6% 66,496 30.4%
6 14,328 19.4% 16,567 22.2% 15,692 22.3% 46,587 21.3%
Total 73,741 100.0% 74,722 100.0% 70,516 100.0% 218,979 100.0%
Figure 8.4.5C
n/a
Figure 8.4.5D
n/a
Table 8.4.5D
Oral Composite Reliability: Oral 6-8 S302
Component Weight Variance Reliability
Listening 0.50 1972.790 0.645
Speaking 0.50 2377.192 0.904
Oral 1700.473 0.863
*Variances from students who had results in all four domains
Table 8.4.5E-1
Accuracy and Consistency of Classification Indices: Oral (Grade 6) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.554 0.447 0.290
Conditional Level Accuracy Consistency
on Level 1 0.840 0.713
2 0.555 0.405
3 0.489 0.364
4 0.571 0.447
5 0.524 0.453
6 0.571 0.441
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.982 0.007 0.012 0.973
2/3 0.954 0.023 0.023 0.932
3/4 0.909 0.042 0.050 0.874
4/5 0.856 0.047 0.097 0.803
5/6 0.831 0.093 0.076 0.781
Table 8.4.5E-3
Accuracy and Consistency of Classification Indices: Oral (Grade 8) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.525 0.429 0.277
Conditional Level Accuracy Consistency
on Level 1 0.838 0.716
2 0.507 0.372
3 0.468 0.349
4 0.506 0.378
5 0.479 0.420
6 0.581 0.458
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.974 0.010 0.016 0.962
2/3 0.945 0.027 0.028 0.921
3/4 0.908 0.038 0.054 0.873
4/5 0.859 0.050 0.092 0.800
5/6 0.806 0.121 0.073 0.759
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Litr 6-8 S302]
Table 8.4.6A
Scale Score Descriptive Statistics: Litr 6-8 S302
Grade No. of Students Min. Max. Mean Std. Dev.
6 73,722 208 436 350.51 20.98
7 74,744 215 435 357.27 21.99
8 70,562 223 440 362.95 23.21
Total 219,028 208 440 356.82 22.63
Table 8.4.6B
Proficiency Level Distribution: Litr 6-8 S302
Grade 6 Grade 7 Grade 8 Total
Level Count Percent Count Percent Count Percent Count Percent
1 2,050 2.8% 3,432 4.6% 4,463 6.3% 9,945 4.5%
2 11,345 15.4% 12,576 16.8% 14,069 19.9% 37,990 17.3%
3 36,171 49.1% 36,732 49.1% 35,389 50.2% 108,292 49.4%
4 20,596 27.9% 18,362 24.6% 13,324 18.9% 52,282 23.9%
5 2,994 4.1% 3,077 4.1% 2,807 4.0% 8,878 4.1%
6 566 0.8% 565 0.8% 510 0.7% 1,641 0.7%
Total 73,722 100.0% 74,744 100.0% 70,562 100.0% 219,028 100.0%
Figure 8.4.6C
n/a
Figure 8.4.6D
n/a
Table 8.4.6D
Literacy Composite Reliability: Litr 6-8 S302
Component Weight Variance Reliability
Reading 0.50 733.056 0.770
Writing 0.50 556.239 0.920
Literacy 511.916 0.896
*Variances from students who had results in all four domains
Table 8.4.6E-1
Accuracy and Consistency of Classification Indices: Litr (Grade 6) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.754 0.673 0.504
Conditional Level Accuracy Consistency
on Level 1 0.660 0.698
2 0.787 0.679
3 0.831 0.752
4 0.650 0.583
5 - 0.201
6 - 0.999
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.983 0.011 0.006 0.984
2/3 0.937 0.033 0.030 0.919
3/4 0.863 0.054 0.083 0.818
4/5 0.952 0.048 0.000 0.946
5/6 0.992 0.008 0.000 0.999
Table 8.4.6E-3
Accuracy and Consistency of Classification Indices: Litr (Grade 8) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.728 0.643 0.472
Conditional Level Accuracy Consistency
on Level 1 0.770 0.745
2 0.768 0.660
3 0.811 0.731
4 0.536 0.456
5 - 0.196
6 - 1.000
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.972 0.015 0.012 0.969
2/3 0.921 0.039 0.040 0.896
3/4 0.862 0.057 0.081 0.815
4/5 0.953 0.047 0.000 0.949
5/6 0.993 0.007 0.000 0.999
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Cphn 6-8 S302]
Table 8.4.7A
Scale Score Descriptive Statistics: Cphn 6-8 S302
Grade No. of Students Min. Max. Mean Std. Dev.
6 73,777 165 463 359.00 26.64
7 74,773 172 463 366.83 29.04
8 70,613 180 463 373.39 31.31
Total 219,163 165 463 366.31 29.61
Table 8.4.7B
Proficiency Level Distribution: Cphn 6-8 S302
Grade 6 Grade 7 Grade 8 Total
Level Count Percent Count Percent Count Percent Count Percent
1 1,957 2.7% 3,244 4.3% 4,022 5.7% 9,223 4.2%
2 8,754 11.9% 10,408 13.9% 11,361 16.1% 30,523 13.9%
3 23,371 31.7% 22,765 30.4% 18,021 25.5% 64,157 29.3%
4 14,763 20.0% 13,690 18.3% 12,073 17.1% 40,526 18.5%
5 17,997 24.4% 16,436 22.0% 16,623 23.5% 51,056 23.3%
6 6,935 9.4% 8,230 11.0% 8,513 12.1% 23,678 10.8%
Total 73,777 100.0% 74,773 100.0% 70,613 100.0% 219,163 100.0%
Figure 8.4.7C
n/a
Figure 8.4.7D
n/a
Table 8.4.7D
Comprehension Composite Reliability: Cphn 6-8 S302
Component Weight Variance Reliability
Listening 0.30 1972.790 0.645
Reading 0.70 733.056 0.770
Comprehension 876.066 0.834
*Variances from students who had results in all four domains
Table 8.4.7E-1
Accuracy and Consistency of Classification Indices: Cphn (Grade 6) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.557 0.456 0.300
Conditional Level Accuracy Consistency
on Level 1 0.781 0.611
2 0.668 0.513
3 0.673 0.558
4 0.393 0.305
5 0.498 0.431
6 0.597 0.347
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.987 0.005 0.008 0.980
2/3 0.934 0.031 0.034 0.903
3/4 0.847 0.063 0.090 0.794
4/5 0.843 0.057 0.101 0.782
5/6 0.907 0.091 0.002 0.882
Table 8.4.7E-3
Accuracy and Consistency of Classification Indices: Cphn (Grade 8) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.532 0.426 0.288
Conditional Level Accuracy Consistency
on Level 1 0.789 0.630
2 0.646 0.505
3 0.566 0.452
4 0.350 0.269
5 0.487 0.407
6 0.608 0.402
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.972 0.011 0.017 0.959
2/3 0.912 0.039 0.049 0.874
3/4 0.854 0.064 0.083 0.800
4/5 0.847 0.065 0.088 0.788
5/6 0.894 0.079 0.027 0.856
[Figures: histogram of scale scores (Scale Score vs. Count) and bar chart of proficiency level distribution (Proficiency Level vs. Percent) - Over 6-8 S302]
Table 8.4.8A
Scale Score Descriptive Statistics: Over 6-8 S302
Grade No. of Students Min. Max. Mean Std. Dev.
6 73,582 191 439 357.25 23.91
7 74,562 197 438 363.63 25.38
8 70,381 203 439 368.85 26.97
Total 218,525 191 439 363.16 25.86
Table 8.4.8B
Proficiency Level Distribution: Over 6-8 S302
Grade 6 Grade 7 Grade 8 Total
Level Count Percent Count Percent Count Percent Count Percent
1 2,314 3.1% 3,266 4.4% 4,007 5.7% 9,587 4.4%
2 7,253 9.9% 7,968 10.7% 8,707 12.4% 23,928 10.9%
3 23,553 32.0% 25,285 33.9% 23,057 32.8% 71,895 32.9%
4 30,308 41.2% 27,047 36.3% 25,561 36.3% 82,916 37.9%
5 8,916 12.1% 9,828 13.2% 8,009 11.4% 26,753 12.2%
6 1,238 1.7% 1,168 1.6% 1,040 1.5% 3,446 1.6%
Total 73,582 100.0% 74,562 100.0% 70,381 100.0% 218,525 100.0%
Figure 8.4.8C
n/a
Figure 8.4.8D
n/a
Table 8.4.8D
Overall Composite Reliability: Over 6-8 S302
Component Weight Variance Reliability
Listening 0.15 1972.790 0.645
Reading 0.35 733.056 0.770
Speaking 0.15 2377.192 0.904
Writing 0.35 556.239 0.920
Overall Composite 668.610 0.930
*Variances from students who had results in all four domains
Table 8.4.8E-1
Accuracy and Consistency of Classification Indices: Over (Grade 6) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.745 0.662 0.519
Conditional Level Accuracy Consistency
on Level 1 0.788 0.814
2 0.771 0.664
3 0.832 0.748
4 0.739 0.672
5 0.512 0.402
6 - 0.982
Indices at Accuracy
Cut Points False False
Cut Point Accuracy Positives Negatives Consistency
1/2 0.988 0.007 0.005 0.988
2/3 0.962 0.022 0.016 0.949
3/4 0.908 0.037 0.055 0.876
4/5 0.889 0.066 0.045 0.854
5/6 0.983 0.017 0.000 0.987
Table 8.4.8E-3
Accuracy and Consistency of Classification Indices: Over (Grade 8) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.719 0.633 0.495
Conditional Level Accuracy Consistency
on Level 1 0.832 0.815
2 0.749 0.640
3 0.810 0.716
4 0.708 0.617
5 0.335 0.369
6 - 0.987
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.981 0.010 0.009 0.979
2/3 0.951 0.027 0.022 0.935
3/4 0.900 0.039 0.061 0.865
4/5 0.908 0.067 0.024 0.855
5/6 0.985 0.015 0.000 0.989
[Figure not reproduced: distributions of scale scores (count) and proficiency levels (percent)]
Table 8.5.1A
Scale Score Descriptive Statistics: List 9-12 S302
Grade No. of Students Min. Max. Mean Std. Dev.
9 79,568 136 499 381.43 50.87
10 50,140 140 499 385.08 46.50
11 38,200 144 499 389.92 45.23
12 29,999 148 499 391.12 45.82
Total 197,907 136 499 385.46 48.13
Table 8.5.1B
Proficiency Level Distribution: List 9-12 S302
Grade 9 Grade 10 Grade 11 Grade 12 Total
Level Count Percent Count Percent Count Percent Count Percent Count Percent
1 7,822 9.8% 4,301 8.6% 3,222 8.4% 2,831 9.4% 18,176 9.2%
2 10,443 13.1% 5,883 11.7% 4,285 11.2% 3,190 10.6% 23,801 12.0%
3 11,839 14.9% 10,538 21.0% 6,883 18.0% 7,848 26.2% 37,108 18.8%
4 17,621 22.1% 12,970 25.9% 11,612 30.4% 8,335 27.8% 50,538 25.5%
5 23,901 30.0% 11,949 23.8% 8,171 21.4% 4,694 15.6% 48,715 24.6%
6 7,942 10.0% 4,499 9.0% 4,027 10.5% 3,101 10.3% 19,569 9.9%
Total 79,568 100.0% 50,140 100.0% 38,200 100.0% 29,999 100.0% 197,907 100.0%
[Table not reproduced: proficiency level cut scores and SEMs by grade and tier (Tier A, Tier B, Tier C)]
[Figure not reproduced: expected raw score plotted against ability measure for Tiers A, B, and C]
Figure 8.5.1D
Test Information Function: List 9-12ABC S302
[Figure not reproduced: test information plotted against ability measure for Tiers A, B, and C]
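A conventional way to relate a test information function such as the one above to score precision is that, under a Rasch-type model, the conditional standard error of measurement at a given ability is the reciprocal square root of the test information at that ability, rescaled to the reporting metric. The sketch below only illustrates that relationship; the slope of 20 scale-score units per logit is a hypothetical value, not the ACCESS scale transformation.

    # Illustrative relationship (not the report's code): conditional SEM from test
    # information under a Rasch-type model, rescaled by a hypothetical slope.
    import math

    def csem_on_scale(information: float, slope: float) -> float:
        # SEM in logits is 1 / sqrt(information); multiplying by the scale slope
        # expresses it in reported scale-score units.
        return slope / math.sqrt(information)

    # Example: information of 3.0 with a hypothetical slope of 20 units per logit
    print(round(csem_on_scale(3.0, 20.0), 1))  # 11.5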
Table 8.5.1E-1
Accuracy and Consistency of Classification Indices: List (Grade 9) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.410 0.316 0.155
Conditional Level Accuracy Consistency
on Level 1 0.819 0.684
2 0.511 0.327
3 0.289 0.186
4 0.287 0.261
5 0.437 0.396
6 - 0.149
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.957 0.016 0.028 0.938
2/3 0.899 0.023 0.078 0.853
3/4 0.817 0.048 0.134 0.716
4/5 0.681 0.169 0.150 0.630
5/6 0.900 0.100 0.000 0.822
Table 8.5.1E-2
Accuracy and Consistency of Classification Indices: List (Grade 10) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.370 0.309 0.134
Conditional Level Accuracy Consistency
on Level 1 0.825 0.687
2 0.484 0.281
3 0.366 0.244
4 0.312 0.301
5 - 0.305
6 - 0.124
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.964 0.013 0.022 0.947
2/3 0.902 0.022 0.076 0.853
3/4 0.768 0.041 0.192 0.667
4/5 0.672 0.328 0.000 0.631
5/6 0.910 0.090 0.000 0.855
Table 8.5.1E-4
Accuracy and Consistency of Classification Indices: List (Grade 12) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.368 0.295 0.109
Conditional Level Accuracy Consistency
on Level 1 0.860 0.694
2 0.367 0.165
3 0.377 0.284
4 0.319 0.311
5 - 0.183
6 - 0.125
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.964 0.011 0.025 0.943
2/3 0.884 0.014 0.102 0.826
3/4 0.673 0.021 0.306 0.599
4/5 0.740 0.260 0.000 0.643
5/6 0.897 0.103 0.000 0.848
[Figure not reproduced: distributions of scale scores (count) and proficiency levels (percent)]
Table 8.5.2A
Scale Score Descriptive Statistics: Read 9-12 S302
Grade No. of Students Min. Max. Mean Std. Dev.
9 79,602 208 468 372.59 32.55
10 50,148 216 468 374.91 30.92
11 38,245 224 468 379.46 30.85
12 30,042 233 468 380.63 31.45
Total 198,037 208 468 375.73 31.82
Table 8.5.2B
Proficiency Level Distribution: Read 9-12 S302
Grade 9 Grade 10 Grade 11 Grade 12 Total
Level Count Percent Count Percent Count Percent Count Percent Count Percent
1 8,899 11.2% 5,279 10.5% 4,293 11.2% 3,796 12.6% 22,267 11.2%
2 19,440 24.4% 15,531 31.0% 10,490 27.4% 7,980 26.6% 53,441 27.0%
3 15,281 19.2% 7,698 15.4% 4,671 12.2% 3,705 12.3% 31,355 15.8%
4 8,059 10.1% 6,381 12.7% 4,901 12.8% 3,080 10.3% 22,421 11.3%
5 14,506 18.2% 8,445 16.8% 7,535 19.7% 7,070 23.5% 37,556 19.0%
6 13,417 16.9% 6,814 13.6% 6,355 16.6% 4,411 14.7% 30,997 15.7%
Total 79,602 100.0% 50,148 100.0% 38,245 100.0% 30,042 100.0% 198,037 100.0%
[Table not reproduced: proficiency level cut scores and SEMs by grade and tier (Tier A, Tier B, Tier C)]
[Figure not reproduced: expected raw score plotted against ability measure for Tiers A, B, and C]
Figure 8.5.2D
Test Information Function: Read 9-12ABC S302
[Figure not reproduced: test information plotted against ability measure for Tiers A, B, and C]
Table 8.5.2E-1
Accuracy and Consistency of Classification Indices: Read (Grade 9) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.491 0.397 0.265
Conditional Level Accuracy Consistency
on Level 1 0.752 0.597
2 0.630 0.499
3 0.379 0.294
4 0.203 0.154
5 0.369 0.298
6 0.635 0.463
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.942 0.026 0.032 0.913
2/3 0.866 0.052 0.083 0.814
3/4 0.839 0.078 0.083 0.777
4/5 0.834 0.074 0.092 0.773
5/6 0.868 0.082 0.050 0.818
Table 8.5.2E-2
Accuracy and Consistency of Classification Indices: Read (Grade 10) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.488 0.396 0.258
Conditional Level Accuracy Consistency
on Level 1 0.715 0.554
2 0.691 0.568
3 0.299 0.231
4 0.256 0.193
5 0.365 0.294
6 0.627 0.422
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.940 0.030 0.030 0.909
2/3 0.850 0.055 0.096 0.795
3/4 0.838 0.076 0.086 0.776
4/5 0.841 0.066 0.094 0.779
5/6 0.886 0.083 0.032 0.843
Table 8.5.2E-4
Accuracy and Consistency of Classification Indices: Read (Grade 12) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.475 0.384 0.247
Conditional Level Accuracy Consistency
on Level 1 0.751 0.605
2 0.633 0.501
3 0.245 0.184
4 0.189 0.144
5 0.434 0.365
6 0.510 0.358
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.937 0.031 0.032 0.905
2/3 0.851 0.054 0.095 0.798
3/4 0.834 0.061 0.106 0.770
4/5 0.823 0.076 0.101 0.755
5/6 0.856 0.087 0.058 0.803
[Figure not reproduced: distributions of scale scores (count) and proficiency levels (percent)]
Table 8.5.3A
Scale Score Descriptive Statistics: Writ 9-12 S302
Grade No. of Students Min. Max. Mean Std. Dev.
9 79,499 251 473 394.61 33.76
10 50,096 257 494 396.88 30.82
11 38,170 263 500 400.95 29.00
12 29,947 269 519 402.05 29.45
Total 197,712 251 519 397.54 31.64
Table 8.5.3B
Proficiency Level Distribution: Writ 9-12 S302
Grade 9 Grade 10 Grade 11 Grade 12 Total
Level Count Percent Count Percent Count Percent Count Percent Count Percent
1 3,581 4.5% 2,438 4.9% 1,818 4.8% 1,830 6.1% 9,667 4.9%
2 6,914 8.7% 3,862 7.7% 2,393 6.3% 2,334 7.8% 15,503 7.8%
3 16,134 20.3% 13,349 26.6% 11,897 31.2% 11,213 37.4% 52,593 26.6%
4 25,717 32.3% 21,588 43.1% 17,976 47.1% 13,024 43.5% 78,305 39.6%
5 24,413 30.7% 8,261 16.5% 3,823 10.0% 1,443 4.8% 37,940 19.2%
6 2,740 3.4% 598 1.2% 263 0.7% 103 0.3% 3,704 1.9%
Total 79,499 100.0% 50,096 100.0% 38,170 100.0% 29,947 100.0% 197,712 100.0%
[Table not reproduced: proficiency level cut scores and SEMs by grade and tier (Tier A, Tier B, Tier C)]
[Figure not reproduced: expected raw score plotted against ability measure for Tiers A, B, and C]
Figure 8.5.3D
Test Information Function: Writ 9-12ABC S302
[Figure not reproduced: test information plotted against ability measure for Tiers A, B, and C]
Table 8.5.3E-1
Accuracy and Consistency of Classification Indices: Writ (Grade 9) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.676 0.567 0.417
Conditional Level Accuracy Consistency
on Level 1 0.840 0.746
2 0.735 0.614
3 0.774 0.668
4 0.632 0.494
5 0.631 0.576
6 - 0.089
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.986 0.007 0.007 0.979
2/3 0.966 0.015 0.019 0.951
3/4 0.934 0.023 0.043 0.907
4/5 0.822 0.071 0.107 0.755
5/6 0.966 0.034 0.000 0.952
Table 8.5.3E-2
Accuracy and Consistency of Classification Indices: Writ (Grade 10) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.695 0.607 0.437
Conditional Level Accuracy Consistency
on Level 1 0.875 0.797
2 0.717 0.591
3 0.819 0.713
4 0.633 0.618
5 - 0.354
6 - 0.038
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.987 0.006 0.007 0.982
2/3 0.969 0.015 0.016 0.955
3/4 0.916 0.026 0.058 0.882
4/5 0.823 0.177 0.000 0.787
5/6 0.988 0.012 0.000 0.986
Table 8.5.3E-4
Accuracy and Consistency of Classification Indices: Writ (Grade 12) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.723 0.633 0.430
Conditional Level Accuracy Consistency
on Level 1 0.892 0.830
2 0.742 0.619
3 0.819 0.621
4 0.664 0.638
5 - 0.091
6 - 0.000
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.987 0.007 0.006 0.982
2/3 0.970 0.013 0.016 0.958
3/4 0.816 0.030 0.154 0.747
4/5 0.948 0.052 0.000 0.933
5/6 0.997 0.003 0.000 0.997
[Figure not reproduced: distributions of scale scores (count) and proficiency levels (percent)]
Table 8.5.4A
Scale Score Descriptive Statistics: Spek 9-12 S302
Grade No. of Students Min. Max. Mean Std. Dev.
9 79,452 181 428 376.09 60.72
10 50,051 182 428 379.28 53.97
11 38,164 183 428 383.57 50.35
12 29,948 184 428 387.37 48.45
Total 197,615 181 428 380.05 55.52
Table 8.5.4B
Proficiency Level Distribution: Spek 9-12 S302
Grade 9 Grade 10 Grade 11 Grade 12 Total
Level Count Percent Count Percent Count Percent Count Percent Count Percent
1 14,234 17.9% 7,263 14.5% 4,211 11.0% 2,650 8.8% 28,358 14.4%
2 5,023 6.3% 6,367 12.7% 4,974 13.0% 3,643 12.2% 20,007 10.1%
3 7,983 10.0% 7,351 14.7% 5,875 15.4% 4,536 15.1% 25,745 13.0%
4 13,195 16.6% 4,964 9.9% 3,980 10.4% 3,163 10.6% 25,302 12.8%
5 7,799 9.8% 4,958 9.9% 3,730 9.8% 3,058 10.2% 19,545 9.9%
6 31,218 39.3% 19,148 38.3% 15,394 40.3% 12,898 43.1% 78,658 39.8%
Total 79,452 100.0% 50,051 100.0% 38,164 100.0% 29,948 100.0% 197,615 100.0%
Proficiency Level Grade Cut Score SEM
9 319 20.49
11 322 20.69
12 323 20.89
9 347 22.49
11 354 22.90
12 357 23.10
9 366 23.90
11 377 24.90
12 384 25.91
9 388 26.51
11 399 29.32
12 405 30.53
9 407 30.93
11 416 32.94
12 421 33.54
[Figure not reproduced: expected raw score plotted against ability measure]
Figure 8.5.4D
Test Information Function: Spek 9-12 S302
[Figure not reproduced: test information plotted against ability measure]
Table 8.5.4E-1
Accuracy and Consistency of Classification Indices: Spek (Grade 9) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.656 0.574 0.444
Conditional Level Accuracy Consistency
on Level 1 0.885 0.809
2 0.319 0.231
3 0.377 0.293
4 0.505 0.402
5 0.259 0.179
6 0.858 0.790
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.956 0.020 0.024 0.934
2/3 0.932 0.041 0.027 0.904
3/4 0.906 0.045 0.049 0.877
4/5 0.900 0.027 0.073 0.862
5/6 0.896 0.047 0.057 0.836
Table 8.5.4E-2
Accuracy and Consistency of Classification Indices: Spek (Grade 10) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.659 0.578 0.463
Conditional Level Accuracy Consistency
on Level 1 0.817 0.722
2 0.508 0.405
3 0.497 0.403
4 0.361 0.268
5 0.359 0.248
6 0.914 0.859
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.949 0.027 0.023 0.926
2/3 0.914 0.043 0.044 0.886
3/4 0.910 0.027 0.063 0.879
4/5 0.934 0.028 0.037 0.898
5/6 0.912 0.057 0.031 0.874
Table 8.5.4E-4
Accuracy and Consistency of Classification Indices: Spek (Grade 12) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.640 0.535 0.407
Conditional Level Accuracy Consistency
on Level 1 0.789 0.683
2 0.607 0.490
3 0.579 0.475
4 0.432 0.307
5 0.272 0.186
6 0.905 0.847
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.965 0.020 0.015 0.949
2/3 0.933 0.030 0.037 0.910
3/4 0.928 0.024 0.048 0.901
4/5 0.938 0.027 0.035 0.901
5/6 0.853 0.113 0.033 0.796
[Figure not reproduced: distributions of scale scores (count) and proficiency levels (percent)]
Table 8.5.5A
Scale Score Descriptive Statistics: Oral 9-12 S302
Grade No. of Students Min. Max. Mean Std. Dev.
9 79,128 159 464 379.05 50.90
10 49,875 161 464 382.45 44.66
11 37,978 164 464 387.04 41.83
12 29,722 166 464 389.60 40.57
Total 196,703 159 464 383.05 46.37
Table 8.5.5B
Proficiency Level Distribution: Oral 9-12 S302
Grade 9 Grade 10 Grade 11 Grade 12 Total
Level Count Percent Count Percent Count Percent Count Percent Count Percent
1 10,466 13.2% 5,041 10.1% 3,198 8.4% 2,322 7.8% 21,027 10.7%
2 7,972 10.1% 6,061 12.2% 4,301 11.3% 3,268 11.0% 21,602 11.0%
3 8,051 10.2% 7,573 15.2% 6,684 17.6% 5,998 20.2% 28,306 14.4%
4 15,346 19.4% 10,984 22.0% 9,328 24.6% 8,010 26.9% 43,668 22.2%
5 24,890 31.5% 15,172 30.4% 10,090 26.6% 7,795 26.2% 57,947 29.5%
6 12,403 15.7% 5,044 10.1% 4,377 11.5% 2,329 7.8% 24,153 12.3%
Total 79,128 100.0% 49,875 100.0% 37,978 100.0% 29,722 100.0% 196,703 100.0%
Figure 8.5.5C
n/a
Figure 8.5.5D
n/a
Table 8.5.5D
Oral Composite Reliability: Oral 9-12 S302
Component Weight Variance Reliability
Listening 0.50 2311.716 0.688
Speaking 0.50 3077.424 0.924
Oral 2149.264 0.889
*Variances from students who had results in all four domains
Table 8.5.5E-1
Accuracy and Consistency of Classification Indices: Oral (Grade 9) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.550 0.449 0.318
Conditional Level Accuracy Consistency
on Level 1 0.879 0.801
2 0.548 0.416
3 0.403 0.288
4 0.478 0.350
5 0.547 0.489
6 0.474 0.359
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.962 0.015 0.023 0.946
2/3 0.943 0.024 0.034 0.917
3/4 0.916 0.040 0.043 0.879
4/5 0.859 0.058 0.083 0.803
5/6 0.836 0.090 0.074 0.789
Table 8.5.5E-3
Accuracy and Consistency of Classification Indices: Oral (Grade 11) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.558 0.448 0.310
Conditional Level Accuracy Consistency
on Level 1 0.834 0.730
2 0.603 0.471
3 0.584 0.457
4 0.511 0.382
5 0.510 0.463
6 - 0.282
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.971 0.014 0.015 0.957
2/3 0.936 0.029 0.035 0.911
3/4 0.904 0.032 0.064 0.867
4/5 0.845 0.062 0.093 0.775
5/6 0.885 0.115 0.000 0.844
[Figure not reproduced: distributions of scale scores (count) and proficiency levels (percent)]
Table 8.5.6A
Scale Score Descriptive Statistics: Litr 9-12 S302
Grade No. of Students Min. Max. Mean Std. Dev.
9 79,349 230 471 383.90 30.76
10 49,998 237 479 386.19 28.25
11 38,092 244 484 390.50 27.26
12 29,877 251 486 391.62 27.62
Total 197,316 230 486 386.92 29.18
Table 8.5.6B
Proficiency Level Distribution: Litr 9-12 S302
Grade 9 Grade 10 Grade 11 Grade 12 Total
Level Count Percent Count Percent Count Percent Count Percent Count Percent
1 4,722 6.0% 2,682 5.4% 1,994 5.2% 1,947 6.5% 11,345 5.7%
2 10,892 13.7% 7,864 15.7% 5,839 15.3% 4,862 16.3% 29,457 14.9%
3 18,306 23.1% 13,851 27.7% 10,751 28.2% 9,186 30.7% 52,094 26.4%
4 19,918 25.1% 13,962 27.9% 11,336 29.8% 8,670 29.0% 53,886 27.3%
5 18,247 23.0% 8,840 17.7% 5,897 15.5% 3,791 12.7% 36,775 18.6%
6 7,264 9.2% 2,799 5.6% 2,275 6.0% 1,421 4.8% 13,759 7.0%
Total 79,349 100.0% 49,998 100.0% 38,092 100.0% 29,877 100.0% 197,316 100.0%
Figure 8.5.6C
n/a
Figure 8.5.6D
n/a
Table 8.5.6D
Literacy Composite Reliability: Litr 9-12 S302
Component Weight Variance Reliability
Reading 0.50 1010.989 0.800
Writing 0.50 996.222 0.916
Literacy 850.309 0.916
*Variances from students who had results in all four domains
Table 8.5.6E-1
Accuracy and Consistency of Classification Indices: Litr (Grade 9) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.636 0.535 0.416
Conditional Level Accuracy Consistency
on Level 1 0.834 0.733
2 0.751 0.636
3 0.691 0.577
4 0.613 0.487
5 0.537 0.484
6 - 0.366
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.980 0.010 0.010 0.970
2/3 0.948 0.022 0.030 0.927
3/4 0.909 0.041 0.050 0.873
4/5 0.885 0.043 0.072 0.837
5/6 0.908 0.092 0.000 0.892
Table 8.5.6E-3
Accuracy and Consistency of Classification Indices: Litr (Grade 11) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.650 0.548 0.420
Conditional Level Accuracy Consistency
on Level 1 0.822 0.717
2 0.766 0.653
3 0.731 0.622
4 0.643 0.523
5 0.465 0.400
6 - 0.289
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.982 0.010 0.009 0.973
2/3 0.943 0.026 0.031 0.919
3/4 0.897 0.041 0.062 0.858
4/5 0.882 0.042 0.076 0.832
5/6 0.940 0.060 0.000 0.933
[Figure not reproduced: distributions of scale scores (count) and proficiency levels (percent)]
Table 8.5.7A
Scale Score Descriptive Statistics: Cphn 9-12 S302
Grade No. of Students Min. Max. Mean Std. Dev.
9 79,449 186 477 375.34 35.56
10 50,057 193 477 378.04 32.81
11 38,136 200 477 382.67 32.26
12 29,921 208 477 383.88 32.75
Total 197,563 186 477 378.73 34.01
Table 8.5.7B
Proficiency Level Distribution: Cphn 9-12 S302
Grade 9 Grade 10 Grade 11 Grade 12 Total
Level Count Percent Count Percent Count Percent Count Percent Count Percent
1 7,850 9.9% 3,870 7.7% 3,059 8.0% 2,849 9.5% 17,628 8.9%
2 13,916 17.5% 11,042 22.1% 8,763 23.0% 6,231 20.8% 39,952 20.2%
3 18,466 23.2% 12,115 24.2% 6,947 18.2% 5,916 19.8% 43,444 22.0%
4 13,021 16.4% 8,939 17.9% 8,124 21.3% 5,991 20.0% 36,075 18.3%
5 16,059 20.2% 9,092 18.2% 6,641 17.4% 5,410 18.1% 37,202 18.8%
6 10,137 12.8% 4,999 10.0% 4,602 12.1% 3,524 11.8% 23,262 11.8%
Total 79,449 100.0% 50,057 100.0% 38,136 100.0% 29,921 100.0% 197,563 100.0%
Figure 8.5.7C
n/a
Figure 8.5.7D
n/a
Table 8.5.7D
Comprehension Composite Reliability: Cphn 9-12 S302
Component Weight Variance Reliability
Listening 0.30 2311.716 0.688
Reading 0.70 1010.989 0.800
Comprehension 1155.625 0.858
*Variances from students who had results in all four domains
Table 8.5.7E-1
Accuracy and Consistency of Classification Indices: Cphn (Grade 9) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.525 0.426 0.302
Conditional Level Accuracy Consistency
on Level 1 0.821 0.699
2 0.631 0.498
3 0.541 0.425
4 0.351 0.266
5 0.437 0.367
6 0.591 0.403
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.961 0.017 0.023 0.943
2/3 0.907 0.041 0.051 0.869
3/4 0.862 0.056 0.081 0.811
4/5 0.854 0.059 0.087 0.796
5/6 0.885 0.087 0.028 0.848
Table 8.5.7E-3
Accuracy and Consistency of Classification Indices: Cphn (Grade 11) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.497 0.414 0.286
Conditional Level Accuracy Consistency
on Level 1 0.749 0.609
2 0.710 0.581
3 0.425 0.328
4 0.439 0.336
5 0.380 0.327
6 3.214 0.388
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.962 0.021 0.017 0.941
2/3 0.892 0.042 0.066 0.852
3/4 0.863 0.055 0.082 0.811
4/5 0.852 0.050 0.098 0.792
5/6 0.879 0.121 0.000 0.857
[Figure not reproduced: distributions of scale scores (count) and proficiency levels (percent)]
Table 8.5.8A
Scale Score Descriptive Statistics: Over 9-12 S302
Grade No. of Students Min. Max. Mean Std. Dev.
9 78,900 208 465 382.26 34.84
10 49,719 214 469 384.87 30.90
11 37,857 220 476 389.27 29.24
12 29,583 226 477 390.85 28.98
Total 196,059 208 477 385.57 32.15
Table 8.5.8B
Proficiency Level Distribution: Over 9-12 S302
Grade 9 Grade 10 Grade 11 Grade 12 Total
Level Count Percent Count Percent Count Percent Count Percent Count Percent
1 6,432 8.2% 2,920 5.9% 2,026 5.4% 1,728 5.8% 13,106 6.7%
2 9,873 12.5% 7,296 14.7% 5,329 14.1% 4,030 13.6% 26,528 13.5%
3 14,283 18.1% 11,586 23.3% 9,388 24.8% 8,509 28.8% 43,766 22.3%
4 19,591 24.8% 14,391 28.9% 11,763 31.1% 9,706 32.8% 55,451 28.3%
5 20,901 26.5% 10,637 21.4% 7,125 18.8% 4,297 14.5% 42,960 21.9%
6 7,820 9.9% 2,889 5.8% 2,226 5.9% 1,313 4.4% 14,248 7.3%
Total 78,900 100.0% 49,719 100.0% 37,857 100.0% 29,583 100.0% 196,059 100.0%
Figure 8.5.8C
n/a
Figure 8.5.8D
n/a
Table 8.5.8D
Overall Composite Reliability: Over 9-12 S302
Component Weight Variance Reliability
Listening 0.15 2311.716 0.688
Reading 0.35 1010.989 0.800
Speaking 0.15 3077.424 0.924
Writing 0.35 996.222 0.916
Overall Composite 1033.737 0.945
Table 8.5.8E-1
Accuracy and Consistency of Classification Indices: Over (Grade 9) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.679 0.588 0.485
Conditional Level Accuracy Consistency
on Level 1 0.889 0.822
2 0.767 0.665
3 0.693 0.579
4 0.693 0.572
5 0.594 0.556
6 - 0.411
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.980 0.009 0.011 0.972
2/3 0.960 0.017 0.023 0.943
3/4 0.931 0.035 0.034 0.903
4/5 0.906 0.036 0.058 0.868
5/6 0.901 0.099 0.000 0.887
Table 8.5.8E-3
Accuracy and Consistency of Classification Indices: Over (Grade 11) S302
Overall Accuracy Consistency Kappa (k)
Indices 0.703 0.611 0.500
Conditional Level Accuracy Consistency
on Level 1 0.849 0.768
2 0.803 0.708
3 0.768 0.669
4 0.725 0.603
5 0.548 0.492
6 - 0.307
Indices at Cut Points
Cut Point Accuracy False Positives False Negatives Consistency
1/2 0.985 0.008 0.007 0.978
2/3 0.956 0.020 0.024 0.938
3/4 0.923 0.032 0.045 0.893
4/5 0.896 0.035 0.069 0.851
5/6 0.941 0.059 0.000 0.938