Knowing What Students Know
The Science and Design of Educational Assessment
Committee on the Foundations of Assessment
James W. Pellegrino, Naomi Chudowsky, and Robert Glaser, editors
Board on Testing and Assessment
Center for Education
Division of Behavioral and Social Sciences and Education
National Research Council
NATIONAL ACADEMY PRESS
Washington, DC
NATIONAL ACADEMY PRESS • 2101 Constitution Avenue N.W. • Washington, DC 20418
NOTICE: The project that is the subject of this report was approved by the Governing Board
of the National Research Council, whose members are drawn from the councils of the National
Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine. The
members of the committee responsible for the report were chosen for their special competences
and with regard for appropriate balance.
This study was supported by Grant No. REC-9722707 between the National Academy of
Sciences and the U.S. National Science Foundation. Any opinions, findings, conclusions, or
recommendations expressed in this publication are those of the author(s) and do not necessar-
ily reflect the views of the organizations or agencies that provided support for the project.
Library of Congress Cataloging-in-Publication Data
Knowing what students know : the science and design of educational assessment /
Committee on the Foundations of Assessment, Center for Education, Division of
Behavioral and Social Sciences and Education, National Research Council ; James
Pellegrino, Naomi Chudowsky, and Robert Glaser, editors.
p. cm.
Includes bibliographical references and index.
ISBN 0-309-07272-7
1. Educational tests and measurements—United States—Design and construction. 2.
Cognitive learning theory. I. Pellegrino, James W. II. Chudowsky, Naomi. III. Glaser,
Robert, 1921- IV. National Research Council (U.S.). Division of Behavioral and Social
Sciences and Education. Committee on the Foundations of Assessment.
LB3051.K59 2001
371.26′1—dc21 2001003876
Additional copies of this report are available from National Academy Press, 2101 Constitution
Avenue, N.W., Lockbox 285, Washington, DC 20055; (800) 624-6242 or (202) 334-3313 (in the
Washington metropolitan area); Internet, http://www.nap.edu
Suggested citation: National Research Council. 2001. Knowing what students know: The science
and design of educational assessment. Committee on the Foundations of Assessment. Pellegrino,
J., Chudowsky, N., and Glaser, R., editors. Board on Testing and Assessment, Center for Educa-
tion. Division of Behavioral and Social Sciences and Education. Washington, DC: National
Academy Press.
Printed in the United States of America
Copyright 2001 by the National Academy of Sciences. All rights reserved.
National Academy of Sciences
National Academy of Engineering
Institute of Medicine
National Research Council
The National Academy of Sciences is a private, nonprofit, self-perpetuating soci-
ety of distinguished scholars engaged in scientific and engineering research, dedi-
cated to the furtherance of science and technology and to their use for the general
welfare. Upon the authority of the charter granted to it by the Congress in 1863, the
Academy has a mandate that requires it to advise the federal government on scien-
tific and technical matters. Dr. Bruce M. Alberts is president of the National Academy
of Sciences.
The National Academy of Engineering was established in 1964, under the charter
of the National Academy of Sciences, as a parallel organization of outstanding engi-
neers. It is autonomous in its administration and in the selection of its members,
sharing with the National Academy of Sciences the responsibility for advising the
federal government. The National Academy of Engineering also sponsors engineer-
ing programs aimed at meeting national needs, encourages education and research,
and recognizes the superior achievements of engineers. Dr. Wm. A. Wulf is presi-
dent of the National Academy of Engineering.
The Institute of Medicine was established in 1970 by the National Academy of
Sciences to secure the services of eminent members of appropriate professions in the
examination of policy matters pertaining to the health of the public. The Institute
acts under the responsibility given to the National Academy of Sciences by its con-
gressional charter to be an adviser to the federal government and, upon its own
initiative, to identify issues of medical care, research, and education. Dr. Kenneth I.
Shine is president of the Institute of Medicine.
The National Research Council was organized by the National Academy of Sci-
ences in 1916 to associate the broad community of science and technology with the
Academy’s purposes of furthering knowledge and advising the federal government.
Functioning in accordance with general policies determined by the Academy, the
Council has become the principal operating agency of both the National Academy of
Sciences and the National Academy of Engineering in providing services to the gov-
ernment, the public, and the scientific and engineering communities. The Council is
administered jointly by both Academies and the Institute of Medicine. Dr. Bruce M.
Alberts and Dr. Wm. A. Wulf are chairman and vice chairman, respectively, of the
National Research Council.
COMMITTEE ON THE FOUNDATIONS OF ASSESSMENT
James W. Pellegrino (Co-chair), Peabody College of Education,
Vanderbilt University
Robert Glaser (Co-chair), Learning Research and Development Center,
University of Pittsburgh
Eva L. Baker, The Center for the Study of Evaluation, University of
California, Los Angeles
Gail P. Baxter, Educational Testing Service, Princeton, New Jersey
Paul J. Black, School of Education, King’s College, London, England
Christopher L. Dede, Graduate School of Education, Harvard University
Kadriye Ercikan, School of Education, University of British Columbia
Louis M. Gomez, School of Education, Northwestern University
Earl B. Hunt, Department of Psychology, University of Washington
David Klahr, Department of Psychology, Carnegie Mellon University
Richard Lehrer, School of Education, University of Wisconsin
Robert J. Mislevy, School of Education, University of Maryland
Willie Pearson, Jr., Department of Sociology, Wake Forest University
Edward A. Silver, School of Education, University of Michigan
Richard F. Thompson, Department of Psychology, University of
Southern California
Richard K. Wagner, Department of Psychology, Florida State University
Mark R. Wilson, School of Education, University of California, Berkeley
Naomi Chudowsky, Study Director
Tina Winters, Research Assistant
M. Jane Phillips, Senior Project Assistant
BOARD ON TESTING AND ASSESSMENT
Eva L. Baker (Chair), The Center for the Study of Evaluation, University
of California, Los Angeles
Lorraine McDonnell (Vice Chair), Departments of Political Science and
Education, University of California, Santa Barbara
Lauress L. Wise (Vice Chair), Human Resources Research Organization,
Alexandria, Virginia
Richard C. Atkinson, President, University of California
Christopher F. Edley, Jr., Harvard Law School
Ronald Ferguson, John F. Kennedy School of Public Policy, Harvard
University
Milton D. Hakel, Department of Psychology, Bowling Green State
University
Robert M. Hauser, Institute for Research on Poverty, Center for
Demography, University of Wisconsin, Madison
Paul W. Holland, Educational Testing Service, Princeton, New Jersey
Daniel M. Koretz, RAND Corporation, Arlington, Virginia
Richard J. Light, Graduate School of Education and John F. Kennedy
School of Government, Harvard University
Barbara Means, SRI International, Menlo Park, California
Andrew C. Porter, Wisconsin Center for Education Research, University
of Wisconsin, Madison
Lorrie A. Shepard, School of Education, University of Colorado, Boulder
Catherine E. Snow, Graduate School of Education, Harvard University
William L. Taylor, Attorney at Law, Washington, D.C.
William T. Trent, Department of Educational Policy Studies, University of
Illinois, Urbana-Champaign
Guadalupe M. Valdes, School of Education, Stanford University
Vicki Vandaveer, The Vandaveer Group, Inc., Houston, Texas
Kenneth I. Wolpin, Department of Economics, University of
Pennsylvania
Pasquale J. DeVito, Director
Lisa D. Alston, Administrative Associate
Acknowledgments
The work of the Committee on the Foundations of Assessment ben-
efited tremendously from the contributions and good will of many people,
and the committee is grateful for their support.
First, we wish to acknowledge the sponsor, the National Science Foun-
dation (NSF). Special thanks go to Larry Suter, who was instrumental in
getting the project off the ground and who provided enthusiastic support
throughout. We also appreciate the support and valuable input of Elizabeth
VanderPutten, Janice Earle, Nora Sabelli, and Eric Hamilton at NSF, as well as
Eamonn Kelly, now at George Mason University.
The committee was aided greatly by individuals who participated in a
series of information-gathering workshops held in conjunction with several
of the committee meetings. We valued the opportunity to hear from a di-
verse group of researchers and practitioners about the complex issues in-
volved in designing and implementing new forms of assessment.
We wish to make special note of Robbie Case from Stanford University
and the Ontario Institute for Studies in Education, who deeply influenced
this study. Robbie shared with us his powerful ideas about children’s con-
ceptual development and the implications for assessment and educational
equity. Several aspects of his thinking and published work can be found
referenced throughout this report. In every respect he was a gentleman and
a scholar. His untimely death in 2000 deeply saddened the members of the
committee on both a personal and a professional level. His passing repre-
sents a major loss for the fields of psychological and educational research.
A number of researchers working at the intersection of cognition and
assessment took time to share their work and ideas with the committee,
including Drew Gitomer of the Educational Testing Service, Irvin Katz of
George Mason University, Jim Minstrell of A.C.T. Systems for Education, Kurt
VanLehn of the Learning Research and Development Center at the University
of Pittsburgh, Ken Koedinger of Carnegie Mellon University, Barbara
White and John Frederiksen of the University of California at Berkeley, and
Jim Greeno of Stanford University. The committee discussed the beliefs and
theories of learning underlying some innovative large-scale assessments with
Phil Daro of the New Standards Project, Steven Leinwand of the Connecticut
State Department of Education, Hugh Burkhardt and Sandy Wilcox of the
Mathematics Assessment Resource Service, and Carol Myford of the Educa-
tional Testing Service. We also heard from teachers who have used various
assessment programs in their classrooms. We thank Guy Mauldin of Science
Hill High School, Johnson City, Tennessee; Elizabeth Jones of Walnut El-
ementary School, Lansing, Michigan; Margaret Davis, Westminster Schools,
Atlanta, Georgia; Ramona Muniz, Roosevelt Middle School, San Francisco,
California; Cherrie Jones, Alice Carlson Applied Learning Center, Fort Worth,
Texas; and Suzanna Loper of the Educational Testing Service, Oakland, California.
Several individuals discussed special considerations related to disadvan-
taged students and the design of new forms of assessment. They included
Bill Trent of the University of Illinois, Urbana-Champaign, Shirley Malcom of
the American Association for the Advancement of Science, Sharon Lewis of
the Council of Great City Schools, and Louisa Moats of the National Institute
of Child Health and Human Development. Developmental psychologists Susan
Goldin-Meadow of the University of Chicago, Robert Siegler of Carnegie
Mellon University, and Micki Chi of the Learning Research and Development
Center at the University of Pittsburgh discussed research methodologies from
their discipline that may have application to educational assessment. A num-
ber of researchers helped the committee explore the future role of technol-
ogy in assessment, including Randy Bennett of the Educational Testing Ser-
vice, Amy Bruckman of the Georgia Institute of Technology, Walter Kintsch
of the University of Colorado, Paul Horwitz of The Concord Consortium,
and Gregory Leazer of the University of California at Los Angeles. Lorraine
McDonnell of the University of California at Santa Barbara, James Kadamus
of the New York State Department of Education, and James Gray of the
Dorchester Public Schools in Maryland provided valuable policy perspec-
tives on the prospects for a new science of assessment.
The committee was provided excellent input on advances in statistics
and measurement by Steven Raudenbush from the University of Michigan
and Brian Junker from Carnegie Mellon University. Their presentations, as
well as Brian’s commissioned review of statistical methods that are poten-
tially useful for cognitively based assessment, greatly informed our discus-
sions. Linda Steinberg of the Educational Testing Service and Geoff Masters
of the Australian Council for Educational Research shared state-of-the-art
work on assessment design.
A number of other education researchers provided reactions and syn-
thesizing remarks at the various workshops. They included Bob Linn of the
University of Colorado, Rich Shavelson of Stanford University, David Ber-
liner of Arizona State University, Barbara Means of SRI International, Ed
Haertel of Stanford University, Goodwin Liu of the U.S. Department of Edu-
cation, and Nora Sabelli of NSF.
The Board on Testing and Assessment, the unit within the National Re-
search Council (NRC) that launched this study, was instrumental in shaping
this project and in providing general guidance and support along the way.
Many board members have been mentioned above as participants in the
committee’s work.
We are especially grateful to several consultants to the project, including
Nancy Kober and Robert Rothman, who helped with the writing of this
report and provided invaluable assistance in thinking about the organization
and presentation of ideas. Rona Briere’s skillful editing brought further clar-
ity to our ideas.
Within the NRC, a number of individuals supported the project. Michael
Feuer, Director of the Center for Education, conceptualized the project and
provided good humor and support along the way. Pasquale DeVito, recently
appointed Director of the Board on Testing and Assessment, enthusiastically
supported us during the final stages of the project. Patricia Morison offered
a great deal of wisdom, advice, and encouragement throughout, and Judy
Koenig lent us her substantive knowledge of psychometrics whenever needed.
Kirsten Sampson Snyder and Genie Grohman expertly maneuvered us through
the NRC review process.
The committee expresses particular gratitude to members of the NRC
project staff for contributing their intellectual and organizational skills through-
out the study. Three deserve particular recognition. Naomi Chudowsky, the
project’s study director, was a pleasure to work with and brought incredible
talents and expertise to the project. She tirelessly assisted the committee in
many ways—serving as a valuable source of information about assessment
issues and testing programs; organizing and synthesizing the committee’s
work; keeping the committee moving forward through its deliberations and
the report drafting process; and providing energy, enthusiasm, and excep-
tional good humor throughout. Her attention to detail while simultaneously
helping the committee focus on the bigger picture was a major asset in the
creation of the final report. Naomi was assisted by Tina Winters, who pro-
vided exceptional research support and adeptly handled preparation of the
manuscript. Jane Phillips expertly managed the finances and arranged the
meetings for the project, always ensuring that the committee’s work pro-
ceeded smoothly.
This report has been reviewed in draft form by individuals chosen for
their diverse perspectives and technical expertise, in accordance with proce-
dures approved by the NRC’s Report Review Committee. The purpose of this
independent review is to provide candid and critical comments that will
assist the institution in making its published report as sound as possible and
to ensure that the report meets institutional standards for objectivity, evi-
dence, and responsiveness to the study charge. The review comments and
draft manuscript remain confidential to protect the integrity of the delibera-
tive process. We wish to thank the following individuals for their review of
this report: James Greeno, Stanford University; Sharon Griffin, Clark Univer-
sity; Suzanne Lane, University of Pittsburgh; Alan Lesgold, University of Pitts-
burgh; Marcia C. Linn, University of California, Berkeley; Michael I. Posner,
Cornell University; Catherine E. Snow, Harvard University; Norman L. Webb,
University of Wisconsin; and Sheldon H. White, Harvard University.
Although the reviewers listed above have provided many constructive
comments and suggestions, they were not asked to endorse the conclusions
or recommendations nor did they see the final draft of the report before its
release. The review of this report was overseen by Lauress Wise, Human
Resources Research Organization, and Lyle V. Jones, University of North
Carolina, Chapel Hill. Appointed by the National Research Council, they
were responsible for making certain that an independent examination of this
report was carried out in accordance with institutional procedures and that
all review comments were carefully considered. Responsibility for the final
content of this report rests entirely with the authoring committee and the
institution.
Finally, we would like to sincerely thank all of the committee members,
who generously contributed their time and intellectual efforts to this project.
A study of the scientific foundations of assessment represents an extraordi-
nary challenge, requiring coverage of an exceedingly broad array of com-
plex topics and issues. We were faced with the task of defining the nature of
the problem to be studied and solved and then charting a path through a
rather ill-defined solution space. Throughout the process, the committee
members displayed an extraordinary ability to tolerate ambiguity as we
navigated through a vast space of issues and possible answers, at times
seemingly without a compass. Simultaneously, they showed a remarkable
commitment to learning from each other’s expertise and from the many individuals
who shared their knowledge with the group. It has been noted before that
the idea of eighteen “experts” collaborating to write a book on any topic, let
alone educational assessment, is an absurdity. And yet were it not for the
collective expertise, thoughtfulness, and good will of all the committee mem-
bers, this report and its consensual substantive messages would not have
been developed. It has been a professionally stimulating and personally
gratifying experience to work with the members of the committee and ev-
eryone at the NRC associated with this effort.
Jim Pellegrino, Co-chair
Bob Glaser, Co-chair
Preface
In recent years, the National Research Council (NRC), through its Board
on Testing and Assessment (BOTA), has explored some of today’s most
pressing and complex issues in educational assessment. Several NRC com-
mittees have examined the role and appropriate uses of assessment in stan-
dards-based reform, a movement that is reshaping education throughout the
country. For example, committees have studied the impact and uses of tests
with high stakes for students, approaches for assessing students with dis-
abilities in a standards-based system, and issues related to proposed volun-
tary national tests. In the process of carrying out this work, the board and its
committees have delved into fundamental questions about educational as-
sessment, such as what its purposes are; which kinds of knowledge and
skills should be assessed; how well current assessments, such as the Na-
tional Assessment of Educational Progress, are fulfilling the various demands
placed on them; and which new developments hold promise for improving
assessment.
At roughly the same time, other NRC committees have been exploring
equally compelling issues related to human cognition and learning. A 1998
report entitled Preventing Reading Difficulties in Young Children consoli-
dates current research findings on how students learn to read and which
approaches are most effective for reading instruction. Most recently, the
NRC Committee on Developments in the Science of Learning examined find-
ings from cognitive science that have advanced understanding of how people
think and learn. The 1999 report of that committee, How People Learn, not
only summarizes major changes in conceptions about learning, but also
examines the implications of these changes for designing effective teaching
and learning environments.
As these multiple committees were progressing with their work, some
NRC staff and members of BOTA decided this would be an ideal time to
address a long-standing issue noted by numerous researchers interested in
problems of educational assessment: the need to bring together advances in
assessment and in the understanding of human learning. Each of these dis-
ciplines had produced a body of knowledge that could enrich the other. In
fact, some scholars and practitioners were already applying findings from
cognitive science in the development of innovative methods of assessment.
Although these efforts were generally small-scale or experimental, they pointed
to exciting possibilities.
Accordingly, the board proposed that an NRC committee be formed to
review advances in the cognitive and measurement sciences, as well as early
work done in the intersection between the two disciplines, and to consider
the implications for reshaping educational assessment. In one sense, this
work would be a natural extension of the conclusions and recommenda-
tions of How People Learn. In another sense, it would follow through on a
desire expressed by many of those involved in the board’s activities to revisit
the foundations of assessment—to explore developments in the underlying
science and philosophy of assessment that could have significant implica-
tions for the long term, but were often glossed over in the short term be-
cause of more urgent demands. The National Science Foundation (NSF),
recognizing the importance and timeliness of such a study, agreed to spon-
sor this new NRC effort.
The Committee on the Foundations of Assessment was convened in
January 1998 by the NRC with support from NSF. The committee comprised
eighteen experts from the fields of cognitive and developmental psychol-
ogy, neuroscience, testing and measurement, learning technologies, math-
ematics and science education, and education policy with diverse perspec-
tives on educational assessment.
During its 3-year study, the committee held nine multi-day meetings to
conduct its deliberations and five workshops to gather information about
promising assessment research and practice. At the workshops, numerous
invited presenters shared with the committee members their cutting-edge
work on the following topics: (1) assessment practices that are based on
cognitive principles and are being successfully implemented in schools and
classrooms, (2) new statistical models with promise for use in assessing a
broad range of cognitive performances, (3) programs that engage students
in self- and peer assessment, (4) innovative technologies for learning and
assessment, (5) cognitively based instructional intervention programs, and
(6) policy perspectives on new forms of assessment. This report presents
the findings and recommendations that resulted from the committee’s
deliberations.
Contents
Executive Summary

Part I
Overview and Background
1 Rethinking the Foundations of Assessment
2 The Nature of Assessment and Reasoning from Evidence

Part II
The Scientific Foundations of Assessment
Introduction
3 Advances in the Sciences of Thinking and Learning
4 Contributions of Measurement and Statistical Modeling to Assessment

Part III
Assessment Design and Use: Principles, Practices, and Future Directions
Introduction
5 Implications of the New Foundations for Assessment Design
6 Assessment in Practice
7 Information Technologies: Opportunities for Advancing Educational Assessment

Part IV
Conclusion
8 Implications and Recommendations for Research, Policy, and Practice

References
Appendix: Biographical Sketches
Index
Knowing What Students Know
The Science and Design of Educational Assessment
Executive Summary
Educational assessment seeks to determine how well students are learn-
ing and is an integral part of the quest for improved education. It provides
feedback to students, educators, parents, policy makers, and the public about
the effectiveness of educational services. With the movement over the past
two decades toward setting challenging academic standards and measuring
students’ progress in meeting those standards, educational assessment is
playing a greater role in decision making than ever before. In turn, educa-
tion stakeholders are questioning whether current large-scale assessment
practices are yielding the most useful kinds of information for informing and
improving education. Meanwhile, classroom assessments, which have the
potential to enhance instruction and learning, are not being used to their
fullest potential.
Advances in the cognitive and measurement sciences make this an op-
portune time to rethink the fundamental scientific principles and philosophical
assumptions serving as the foundations for current approaches to assess-
ment. Advances in the cognitive sciences have broadened the conception of
those aspects of learning that are most important to assess, and advances in
measurement have expanded the capability to interpret more complex forms
of evidence derived from student performance.
The Committee on the Foundations of Assessment, supported by the
National Science Foundation, was established to review and synthesize ad-
vances in the cognitive sciences and measurement and to explore their im-
plications for improving educational assessment. At the heart of the
committee’s work was the critical importance of developing new kinds of
educational assessments that better serve the goal of equity. Needed are
classroom and large-scale assessments that help all students learn and suc-
ceed in school by making as clear as possible to them, their teachers, and
other education stakeholders the nature of their accomplishments and the
progress of their learning.
CONCLUSIONS
The Nature of Assessment and Reasoning from Evidence
This report addresses assessments used in both classroom and large-
scale contexts for three broad purposes: to assist learning, to measure indi-
vidual achievement, and to evaluate programs. The purpose of an assess-
ment determines priorities, and the context of use imposes constraints on
the design. Thus it is essential to recognize that one type of assessment does
not fit all.
Often a single assessment is used for multiple purposes; in general, how-
ever, the more purposes a single assessment aims to serve, the more each pur-
pose will be compromised. For instance, many state tests are used for both
individual and program assessment purposes. This is not necessarily a prob-
lem, as long as assessment designers and users recognize the compromises
and trade-offs such use entails.
Although assessments used in various contexts and for differing purposes
often look quite different, they share certain common principles. One such
principle is that assessment is always a process of reasoning from evidence.
By its very nature, moreover, assessment is imprecise to some degree. As-
sessment results are only estimates of what a person knows and can do.
Every assessment, regardless of its purpose, rests on three pillars: a model
of how students represent knowledge and develop competence in the subject
domain, tasks or situations that allow one to observe students’ performance,
and an interpretation method for drawing inferences from the performance
evidence thus obtained. In the context of large-scale assessment, the inter-
pretation method is usually a statistical model that characterizes expected
data patterns, given varying levels of student competence. In less formal
classroom assessment, the interpretation is often made by the teacher using
an intuitive or qualitative rather than formal statistical model.
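The report does not tie this point to any single model, but a minimal illustration of the idea, a statistical model mapping a student-competence parameter to expected response patterns, is the Rasch (one-parameter item response theory) model. The function name and the ability and difficulty values below are purely illustrative:

```python
import math

def rasch_p_correct(theta, b):
    """Probability that a student with ability theta answers an item
    of difficulty b correctly, under the Rasch (1-parameter IRT) model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# Expected data pattern for an item of middling difficulty (b = 0):
# higher-ability students are more likely to answer correctly.
for theta in (-2.0, 0.0, 2.0):
    print(f"ability {theta:+.1f}: P(correct) = {rasch_p_correct(theta, 0.0):.2f}")
# prints roughly 0.12, 0.50, and 0.88
```

Large-scale assessments use far richer variants of such models, but even this sketch shows the core interpretive move: observed answer patterns are compared against the patterns the model predicts for students at varying competence levels.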
Three foundational elements, comprising what is referred to in this re-
port as the “assessment triangle,” underlie all assessments. These three ele-
ments—cognition, observation, and interpretation—must be explicitly con-
nected and designed as a coordinated whole. If not, the meaningfulness of
inferences drawn from the assessment will be compromised.
The central problem addressed by this report is that most widely used
assessments of academic achievement are based on highly restrictive beliefs
about learning and competence not fully in keeping with current knowl-
edge about human cognition and learning. Likewise, the observation and
interpretation elements underlying most current assessments were created
EXECUTIVE SUMMARY 3
to fit prior conceptions of learning and need enhancement to support the
kinds of inferences people now want to draw about student achievement. A
model of cognition and learning should serve as the cornerstone of the as-
sessment design process. This model should be based on the best available
understanding of how students represent knowledge and develop competence
in the domain.
The model of learning can serve as a unifying element—a nucleus that
brings cohesion to curriculum, instruction, and assessment. This cohesive
function is a crucial one because educational assessment does not exist in
isolation, but must be aligned with curriculum and instruction if it is to
support learning.
Finally, aspects of learning that are assessed and emphasized in the
classroom should ideally be consistent with (though not necessarily the same
as) the aspects of learning targeted by large-scale assessments. In reality,
however, these two forms of assessment are often out of alignment. The
result can be conflict and frustration for both teachers and learners. Thus
there is a need for better alignment among assessments used for different
purposes and in different contexts.
Advances in the Sciences of Thinking and Learning
Contemporary theories of learning and knowing emphasize the way
knowledge is represented, organized, and processed in the mind. Emphasis
is also given to social dimensions of learning, including social and participa-
tory practices that support knowing and understanding. This body of knowl-
edge strongly implies that assessment practices need to move beyond a focus
on component skills and discrete bits of knowledge to encompass the more
complex aspects of student achievement.
Among the fundamental elements of cognition is the mind’s cognitive
architecture, which includes working or short-term memory, a highly limited
system, and long-term memory, a virtually limitless store of knowledge. What
matters in most situations is how well one can evoke the knowledge stored
in long-term memory and use it to reason efficiently about current informa-
tion and problems. Therefore, within the normal range of cognitive abilities,
estimates of how people organize information in long-term memory are likely
to be more important than estimates of working memory capacity.
Understanding the contents of long-term memory is especially critical for
determining what people know; how they know it; and how they are able to
use that knowledge to answer questions, solve problems, and engage in addi-
tional learning. While the contents include both general and specific knowl-
edge, much of what one knows is domain- and task-specific and organized
into structures known as schemas. Assessments should evaluate what schemas
an individual has and under what circumstances he or she regards the information
as relevant. This evaluation should include how a person organizes
acquired information, encompassing both strategies for problem solving and
ways of chunking relevant information into manageable units.
The importance of evaluating knowledge structures comes from research
on expertise. Studies of expert-novice differences in subject domains illumi-
nate critical features of proficiency that should be the targets for assessment.
Experts in a subject domain typically organize factual and procedural knowl-
edge into schemas that support pattern recognition and the rapid retrieval
and application of knowledge.
One of the most important aspects of cognition is metacognition—the
process of reflecting on and directing one’s own thinking. Metacognition is
crucial to effective thinking and problem solving and is one of the hallmarks
of expertise in specific areas of knowledge and skill. Experts use metacognitive
strategies for monitoring understanding during problem solving and for per-
forming self-correction. Assessment should therefore attempt to determine
whether an individual has good metacognitive skills.
Not all children learn in the same way or follow the same paths to
competence. Children’s problem-solving strategies become more effective
over time and with practice, but the growth process is not a simple, uniform
progression, nor is there movement directly from erroneous to optimal solu-
tion strategies. Assessments should focus on identifying the specific strategies
children are using for problem solving, giving particular consideration to
where those strategies fall on a developmental continuum of efficiency and
appropriateness for a particular domain of knowledge and skill.
Children have rich intuitive knowledge of their world that undergoes
significant change as they mature. Learning entails the transformation of
naive understanding into more complete and accurate comprehension, and
assessment can be used as a tool to facilitate this process. To this end,
assessments, especially those conducted in the context of classroom instruc-
tion, should focus on making students’ thinking visible to both their teachers
and themselves so that instructional strategies can be selected to support an
appropriate course for future learning.
Practice and feedback are critical aspects of the development of skill
and expertise. One of the most important roles for assessment is the provision
of timely and informative feedback to students during instruction and learn-
ing so that their practice of a skill and its subsequent acquisition will be
effective and efficient.
Because it is acquired in particular contexts, knowledge frequently develops
in a highly contextualized and inflexible form and often does not transfer
effectively. Transfer depends on the development of an explicit understanding of
when to apply what has been learned. Assessments of academic achieve-
ment need to consider carefully the knowledge and skills required to under-
stand and answer a question or solve a problem, including the context in
which it is presented, and whether an assessment task or situation is func-
tioning as a test of near, far, or zero transfer.
Much of what humans learn is acquired through discourse and interac-
tion with others. Thus, knowledge is often embedded in particular social
and cultural contexts, including the context of the classroom, and it encom-
passes understandings about the meaning of specific practices such as ask-
ing and answering questions. Assessments need to examine how well stu-
dents engage in communicative practices appropriate to a domain of
knowledge and skill, what they understand about those practices, and how
well they use the tools appropriate to that domain.
Models of cognition and learning provide a basis for the design and
implementation of theory-driven instructional and assessment practices. Such
programs and practices already exist and have been used productively in
certain curricular areas. However, the vast majority of what is known has yet
to be applied to the design of assessments for classroom or external evalua-
tion purposes. Further work is therefore needed on translating what is al-
ready known in cognitive science to assessment practice, as well as on devel-
oping additional cognitive analyses of domain-specific knowledge and
expertise.
Many highly effective tools exist for probing and modeling a person’s
knowledge and for examining the contents and contexts of learning. The
methods used in cognitive science to design tasks, observe and analyze cog-
nition, and draw inferences about what a person knows are applicable to
many of the challenges of designing effective educational assessments.
Contributions of Measurement and
Statistical Modeling to Assessment
Advances in methods of educational measurement include the develop-
ment of formal measurement (psychometric) models, which represent a par-
ticular form of reasoning from evidence. These models provide explicit, for-
mal rules for integrating the many pieces of information drawn from
assessment tasks. Certain kinds of assessment applications require the capa-
bilities of formal statistical models for the interpretation element of the assess-
ment triangle. These tend to be applications with one or more of the follow-
ing features: high stakes, distant users (i.e., assessment interpreters without
day-to-day interaction with the students), complex models of learning, and
large volumes of data.
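One way to picture such explicit, formal rules for integrating many pieces of evidence is a Bayesian update over discrete levels of competence. The levels, success probabilities, and task outcomes below are hypothetical, chosen only to show the mechanics of combining prior belief with each new observation:

```python
# Hypothetical proficiency levels and the assumed probability of
# succeeding on a single task at each level (invented numbers).
LEVELS = ["novice", "developing", "proficient"]
P_SUCCESS = {"novice": 0.2, "developing": 0.5, "proficient": 0.85}

def update(prior, outcome):
    """One Bayes step: weight each level's prior probability by the
    likelihood of the observed outcome (True = success), then renormalize."""
    posterior = {}
    for level, p in prior.items():
        likelihood = P_SUCCESS[level] if outcome else 1.0 - P_SUCCESS[level]
        posterior[level] = p * likelihood
    total = sum(posterior.values())
    return {level: p / total for level, p in posterior.items()}

# Start with no information (uniform prior), then fold in a string of
# task outcomes one piece of evidence at a time.
belief = {level: 1.0 / 3.0 for level in LEVELS}
for outcome in [True, True, False, True]:
    belief = update(belief, outcome)
```

Three successes and one failure shift the probability mass toward the "proficient" level; each intermediate belief state is itself an interpretable summary of the evidence so far.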
Measurement models currently available can support the kinds of infer-
ences that cognitive science suggests are important to pursue. In particular, it
is now possible to characterize student achievement in terms of multiple
aspects of proficiency, rather than a single score; chart students’ progress
over time, instead of simply measuring performance at a particular point in
time; deal with multiple paths or alternative methods of valued performance;
model, monitor, and improve judgments on the basis of informed evalua-
tions; and model performance not only at the level of students, but also at
the levels of groups, classes, schools, and states.
Nonetheless, many of the newer models and methods are not widely
used because they are not easily understood or packaged in accessible ways
for those without a strong technical background. Technology offers the pos-
sibility of addressing this shortcoming. For instance, building statistical mod-
els into technology-based learning environments for use in classrooms en-
ables teachers to employ more complex tasks, capture and replay students’
performances, share exemplars of competent performance, and in the pro-
cess gain critical information about student competence.
Much hard work remains to focus psychometric model building on the
critical features of models of cognition and learning and on observations
that reveal meaningful cognitive processes in a particular domain. If any-
thing, the task has become more difficult because an additional step is now
required—determining in tandem the inferences that must be drawn, the
observations needed, the tasks that will provide them, and the statistical
models that will express the necessary patterns most efficiently. Therefore,
having a broad array of models available does not mean that the measure-
ment model problem has been solved. The long-standing tradition of leaving
scientists, educators, task designers, and psychometricians each to their own
realms represents perhaps the most serious barrier to progress.
Implications of the New Foundations for
Assessment Design
The design of high-quality classroom and large-scale assessments is a
complex process that involves numerous components best characterized as
iterative and interdependent, rather than linear and sequential. A design
decision made at a later stage can affect one occurring earlier in the process.
As a result, assessment developers must often revisit their choices and refine
their designs.
One of the main features that distinguishes the committee’s proposed ap-
proach to assessment design from current approaches is the central role of a
model of cognition and learning, as emphasized above. This model may be
fine-grained and very elaborate or more coarsely grained, depending on the
purpose of the assessment, but it should always be based on empirical stud-
ies of learners in a domain. Ideally, the model will also provide a develop-
mental perspective, showing typical ways in which learners progress toward
competence.
Another essential feature of good assessment design is an interpretation
model that fits the model of cognition and learning. Just as sophisticated
interpretation techniques used with assessment tasks based on impover-
ished models of learning will produce limited information about student
competence, assessments based on a contemporary, detailed understanding
of how students learn will not yield all the information they otherwise might
if the statistical tools available to interpret the data, or the data themselves,
are not sufficient for the task. Observations, which include assessment tasks
along with the criteria for evaluating students’ responses, must be carefully
designed to elicit the knowledge and cognitive processes that the model of
learning suggests are most important for competence in the domain. The
interpretation model must incorporate this evidence in the results in a man-
ner consistent with the model of learning.
Validation that tasks tap relevant knowledge and cognitive processes,
often lacking in assessment development, is another essential aspect of the
development effort. Starting with hypotheses about the cognitive demands of
a task, a variety of research techniques, such as interviews, having students
think aloud as they work problems, and analysis of errors, can be used to
analyze the mental processes of examinees during task performance. Con-
ducting such analyses early in the assessment development process can help
ensure that assessments do, in fact, measure what they are intended to mea-
sure.
Well-delineated descriptions of learning in the domain are key to being
able to communicate effectively about the nature of student performance.
Although reporting of results occurs at the end of an assessment cycle, assess-
ments must be designed from the outset to ensure that reporting of the desired
types of information will be possible. The ways in which people learn the
subject matter, as well as different types or levels of competence, should be
displayed and made as recognizable as possible to educators, students, and
the public.
Fairness is a key issue in educational assessment. One way of addressing
fairness in assessment is to take into account examinees’ histories of instruc-
tion—or opportunities to learn the material being tested—when designing
assessments and interpreting students’ responses. Ways of drawing such con-
ditional inferences have been tried mainly on a small scale, but hold prom-
ise for tackling persistent issues of equity in testing.
Some examples of assessments that approximate the above features al-
ready exist. They are illustrative of the new approach to assessment the
committee advocates, and they suggest principles for the design of new
assessments that can better serve the goals of learning.
Assessment in Practice
Guiding the committee’s work were the premises that (1) something
important should be learned from every assessment situation, and (2) the
information gained should ultimately help improve learning. The power of
classroom assessment resides in its close connections to instruction and teachers'
knowledge of their students' instructional histories. Large-scale, standardized
assessments can communicate across time and place, but they do so by
constraining the content and timeliness of the message, and as a result often have
limited utility in the classroom. Thus the contrast between classroom and
large-scale assessments arises from the different purposes they serve and con-
texts in which they are used. Certain trade-offs are an inescapable aspect of
assessment design.
Students will learn more if instruction and assessment are integrally re-
lated. In the classroom, providing students with information about particu-
lar qualities of their work and about what they can do to improve is crucial
for maximizing learning. It is in the context of classroom assessment that
theories of cognition and learning can be particularly helpful by providing a
picture of intermediary states of student understanding on the pathway from
novice to competent performer in a subject domain.
Findings from cognitive research cannot always be translated directly or
easily into classroom practice. Most effective are programs that interpret the
findings from cognitive research in ways that are useful for teachers. Teach-
ers need theoretical training, as well as practical training and assessment
tools, to be able to implement formative assessment effectively in their class-
rooms.
Large-scale assessments are further removed from instruction, but can
still benefit learning if well designed and properly used. Substantially more
valid and useful inferences could be drawn from such assessments if the
principles set forth in this report were applied during the design process.
Large-scale assessments not only serve as a means for reporting on stu-
dent achievement, but also reflect aspects of academic competence societies
consider worthy of recognition and reward. Thus large-scale assessments
can provide worthwhile targets for educators and students to pursue. Whereas
teaching directly to the items on a test is not desirable, teaching to the theory
of cognition and learning that underlies an assessment can provide positive
direction for instruction.
To derive real benefits from the merger of cognitive and measurement
theory in large-scale assessment, it will be necessary to devise ways of cov-
ering a broad range of competencies and capturing rich information about
the nature of student understanding. Indeed, to fully capitalize on the new
foundations described in this report will require substantial changes in the
way large-scale assessment is approached and relaxation of some of the con-
straints that currently drive large-scale assessment practices. Alternatives to
on-demand, census testing are available. If individual student scores are
needed, broader sampling of the domain can be achieved by extracting
evidence of student performance from classroom work produced during the
course of instruction. If the primary purpose of the assessment is program
evaluation, the constraint of having to produce reliable individual student
scores can be relaxed, and population sampling can be useful.
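The idea of relaxing individual-score constraints in favor of population sampling can be illustrated with a small matrix-sampling simulation. All numbers here are invented: each simulated student answers only six items from a pool of sixty, far too few for a dependable individual score, yet the pooled responses still recover the program-level performance:

```python
import random

random.seed(7)  # fixed seed so the simulation is reproducible

NUM_ITEMS = 60          # hypothetical item pool covering the domain
ITEMS_PER_STUDENT = 6   # too few for a reliable individual score
NUM_STUDENTS = 500
TRUE_MEAN_P = 0.65      # assumed average chance of a correct answer

def simulate_program_estimate():
    """Estimate population-level percent correct via matrix sampling:
    each student sees a small random subset of the item pool."""
    correct = 0
    answered = 0
    for _ in range(NUM_STUDENTS):
        p = min(max(random.gauss(TRUE_MEAN_P, 0.1), 0.0), 1.0)
        for _ in random.sample(range(NUM_ITEMS), ITEMS_PER_STUDENT):
            correct += random.random() < p
            answered += 1
    return correct / answered

estimate = simulate_program_estimate()
```

With 500 students the pooled estimate falls close to the assumed true mean even though no student's six responses would support an individual score, which is the trade-off the text describes.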
For classroom or large-scale assessment to be effective, students must
understand and share the goals for learning. Students learn more when they
understand (and even participate in developing) the criteria by which their
work will be evaluated, and when they engage in peer and self-assessment
during which they apply those criteria. These practices develop students’
metacognitive abilities, which, as emphasized above, are necessary for ef-
fective learning.
The current educational assessment environment in the United States
assigns much greater value and credibility to external, large-scale assess-
ments of individuals and programs than to classroom assessment designed
to assist learning. The investment of money, instructional time, research, and
development for large-scale testing far outweighs that for effective class-
room assessment. More of the research, development, and training invest-
ment must be shifted toward the classroom, where teaching and learning
occur.
A vision for the future is that assessments at all levels—from classroom to
state—will work together in a system that is comprehensive, coherent, and
continuous. In such a system, assessments would provide a variety of evi-
dence to support educational decision making. Assessment at all levels would
be linked back to the same underlying model of student learning and would
provide indications of student growth over time.
Information Technologies: Opportunities for
Advancing Educational Assessment
Information technologies are helping to remove some of the constraints
that have limited assessment practice in the past. Assessment tasks no longer
need be confined to paper-and-pencil formats, and the entire burden of
classroom assessment no longer need fall on the teacher. At the same time,
technology will not in and of itself improve educational assessment. Improved
methods of assessment require a design process that connects the three
elements of the assessment triangle to ensure that the theory of cognition,
the observations, and the interpretation process work together to support
the intended inferences. Fortunately, there exist multiple examples of tech-
nology tools and applications that enhance the linkages among cognition,
observation, and interpretation.
Some of the most intriguing applications of technology extend the nature
of the problems that can be presented and the knowledge and cognitive pro-
cesses that can be assessed. By enriching task environments through the use
of multimedia, interactivity, and control over the stimulus display, it is possible
to assess a much wider array of cognitive competencies than has heretofore
been feasible. New capabilities enabled by technology include di-
rectly assessing problem-solving skills, making visible sequences of actions
taken by learners in solving problems, and modeling and simulating com-
plex reasoning tasks. Technology also makes possible data collection on
concept organization and other aspects of students’ knowledge structures,
as well as representations of their participation in discussions and group
projects. A significant contribution of technology has been to the design of
systems for implementing sophisticated classroom-based formative assessment
practices. Technology-based systems have been developed to support indi-
vidualized instruction by extracting key features of learners’ responses, ana-
lyzing patterns of correct and incorrect reasoning, and providing rapid and
informative feedback to both student and teacher.
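A hypothetical sketch of the core of such a system: a routine that compares a student's answers both with the correct procedure and with a known misconception model (here the classic "smaller-from-larger" subtraction error studied in cognitive science) and returns targeted feedback. The bug model, problems, and feedback messages are invented for illustration:

```python
def smaller_from_larger(a, b):
    """Simulate the classic subtraction misconception: in each column,
    subtract the smaller digit from the larger, ignoring borrowing."""
    da, db = str(a).rjust(4, "0"), str(b).rjust(4, "0")
    digits = [str(abs(int(x) - int(y))) for x, y in zip(da, db)]
    return int("".join(digits))

def diagnose(problems, answers):
    """Compare answers with the correct results and with the bug model,
    and return a feedback message for student and teacher."""
    correct = sum(ans == a - b for (a, b), ans in zip(problems, answers))
    buggy = sum(ans == smaller_from_larger(a, b)
                for (a, b), ans in zip(problems, answers))
    if correct == len(problems):
        return "correct"
    if buggy > correct:
        return "smaller-from-larger bug: review borrowing"
    return "errors do not match a known pattern"

problems = [(52, 17), (63, 28), (41, 19), (85, 42)]
# A student who consistently subtracts the smaller digit from the larger:
student = [smaller_from_larger(a, b) for a, b in problems]
feedback = diagnose(problems, student)
```

Because the diagnosis names the pattern of incorrect reasoning rather than merely counting wrong answers, the feedback can point directly at the procedure the student needs to revisit.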
A major change in education has resulted from the influence of technol-
ogy on what is taught and how. Schools are placing more emphasis on teach-
ing critical content in greater depth. Examples include the teaching of ad-
vanced thinking and reasoning skills within a discipline through the use of
technology-mediated projects involving long-term inquiry. Such projects of-
ten integrate content and learning across disciplines, as well as integrate
assessment with curriculum and instruction in powerful ways.
A possibility for the future arises from the projected growth across cur-
ricular areas of technology-based assessment embedded in instructional set-
tings. Increased availability of such systems could make it possible to pursue
balanced designs representing a more coordinated and coherent assessment
system. Information from such assessments could possibly be used for mul-
tiple purposes, including the audit function associated with many existing
external assessments.
Finally, technology holds great promise for enhancing educational as-
sessment at multiple levels of practice, but its use for this purpose also raises
issues of utility, practicality, cost, equity, and privacy. These issues will need
to be addressed as technology applications in education and assessment
continue to expand, evolve, and converge.
RECOMMENDATIONS FOR RESEARCH,
POLICY, AND PRACTICE
Like groups before us, the committee recognizes that the bridge be-
tween research and practice takes time to build and that research and prac-
tice must proceed interactively. It is unlikely that insights gained from cur-
rent or new knowledge about cognition, learning, and measurement will be
sufficient by themselves to bring about transformations in assessment such
as those described in this report. Research and practice need to be con-
nected more directly through the building of a cumulative knowledge base
that serves both sets of interests. In the context of this study, that knowledge
base would focus on the development and use of theory-based assessment.
Furthermore, it is essential to recognize that research affects practice indi-
rectly through the influence of the existing knowledge base on four impor-
tant mediating arenas: instructional materials, teacher education and profes-
sional development, education policies, and public opinion and media
coverage. By influencing each of these arenas, an expanding knowledge
base on the principles and practices of effective assessment can help change
educational practice. And the study of changes in practice, in turn, can help
in further developing the knowledge base.
The recommendations presented below collectively form a proposed
research and development agenda for expanding the knowledge base on
the integration of cognition and measurement, and encompass the implica-
tions of such a knowledge base for each of the four mediating arenas that
directly influence educational practice. Before turning to this agenda, we
offer two guidelines for how future work should proceed:
• The committee advocates increased and sustained multidis-
ciplinary collaboration around theoretical and practical matters of
assessment. We apply this precept not only to the collaboration between
researchers in the cognitive and measurement sciences, but also to the col-
laboration of these groups with teachers, curriculum specialists, and assess-
ment developers.
• The committee urges individuals in multiple communities,
from research through practice and policy, to consider the concep-
tual scheme and language used in this report as a guide for stimulat-
ing further thinking and discussion about the many issues associated
with the productive use of assessments in education. The assessment
triangle provides a conceptual framework for principled thinking about the
assumptions and foundations underlying an assessment.
Recommendations for Research
Recommendation 1: Accumulated knowledge and ongoing ad-
vances from the merger of the cognitive and measurement sciences
should be synthesized and made available in usable forms to multiple
educational constituencies. These constituencies include educational
researchers, test developers, curriculum specialists, teachers, and policy
makers.
Recommendation 2: Funding should be provided for a major pro-
gram of research, guided by a synthesis of cognitive and measurement
principles, focused on the design of assessments that yield more
valid and fair inferences about student achievement. This research should
be conducted collaboratively by multidisciplinary teams comprising both
researchers and practitioners. A priority should be the development of mod-
els of cognition and learning that can serve as the basis for assessment
design for all areas of the school curriculum. Research on how students
learn subject matter should be conducted in actual educational settings and
with groups of learners representative of the diversity of the student popula-
tion to be assessed. Research on new statistical measurement models and
their applicability should be tied to modern theories of cognition and learn-
ing. Work should be undertaken to better understand the fit between various
types of cognitive theories and measurement models to determine which
combinations work best together. Research on assessment design should
include exploration of systematic and fair methods for taking into account
aspects of examinees’ instructional background when interpreting their re-
sponses to assessment tasks. This research should encompass careful exami-
nation of the possible consequences of such adaptations in high-stakes as-
sessment contexts.
Recommendation 3: Research should be conducted to explore how
new forms of assessment can be made practical for use in classroom
and large-scale contexts and how various new forms of assessment
affect student learning, teacher practice, and educational decision
making. This research should also explore how teachers can be assisted in
integrating new forms of assessment into their instructional practices. It is
particularly important that such work be done in close collaboration with
practicing teachers who have varying backgrounds and levels of teaching
experience. The research should encompass ways in which school struc-
tures (e.g., length of time of classes, class size, and opportunity for teachers
to work together) affect the feasibility of implementing new types of assess-
ments and their effectiveness.
Recommendation 4: Funding should be provided for in-depth
analyses of the critical elements (cognition, observation, and inter-
pretation) underlying the design of existing assessments that have
attempted to integrate cognitive and measurement principles (includ-
ing the multiple examples presented in this report). This work should
also focus on better understanding the impact of such exemplars on student
learning, teaching practice, and educational decision making.
Recommendation 5: Federal agencies and private-sector organi-
zations concerned with issues of assessment should support the es-
tablishment of multidisciplinary discourse communities. The purpose
of such discourse would be to facilitate cross-fertilization of ideas among
researchers and assessment developers working at the intersection of cogni-
tive theory and educational measurement.
Recommendations for Policy and Practice
Recommendation 6: Developers of assessment instruments for
classroom or large-scale use should pay explicit attention to all three
elements of the assessment triangle (cognition, observation, and in-
terpretation) and their coordination. All three elements should be based
on modern knowledge of how students learn and how such learning is best
measured. Considerable time and effort should be devoted to a theory-driven
design and validation process before assessments are put into operational
use.
Recommendation 7: Developers of educational curricula and class-
room assessments should create tools that will enable teachers to
implement high-quality instructional and assessment practices, con-
sistent with modern understanding of how students learn and how
such learning can be measured. Assessments and supporting instructional
materials should interpret the findings from cognitive research in ways that
are useful for teachers. Developers are urged to take advantage of the op-
portunities afforded by technology to assess what students are learning at
fine levels of detail, with appropriate frequency, and in ways that are tightly
integrated with instruction.
Recommendation 8: Large-scale assessments should sample the
broad range of competencies and forms of student understanding that
research shows are important aspects of student learning. A variety of
matrix sampling, curriculum-embedded, and other assessment approaches
should be used to cover the breadth of cognitive competencies that are the
goals of learning in a domain of the curriculum. Large-scale assessment tools
and supporting instructional materials should be developed so that clear
learning goals and landmark performances along the way to competence
are shared with teachers, students, and other education stakeholders. The
knowledge and skills to be assessed and the criteria for judging the desired
outcomes should be clearly specified and available to all potential examin-
ees and other concerned individuals. Assessment developers should pursue
new ways of reporting assessment results that convey important differences
in performance at various levels of competence in ways that are clear to
different users, including educators, parents, and students.
Recommendation 9: Instruction in how students learn and how
learning can be assessed should be a major component of teacher
preservice and professional development programs. This training should
be linked to actual experience in classrooms in assessing and interpreting
the development of student competence. To ensure that this occurs, state
and national standards for teacher licensure and program accreditation should
include specific requirements focused on the proper integration of learning
and assessment in teachers’ educational experience.
Recommendation 10: Policy makers are urged to recognize the
limitations of current assessments, and to support the development
of new systems of multiple assessments that would improve their abil-
ity to make decisions about education programs and the allocation of
resources. Important decisions about individuals should not be based on a
single test score. Policy makers should instead invest in the development of
assessment systems that use multiple measures of student performance, par-
ticularly when high stakes are attached to the results. Assessments at the
classroom and large-scale levels should grow out of a shared knowledge
base about the nature of learning. Policy makers should support efforts to
achieve such coherence. Policy makers should also promote the develop-
ment of assessment systems that measure the growth or progress of students
and the education system over time and that support multilevel analyses of
the influences responsible for such change.
Recommendation 11: The balance of mandates and resources
should be shifted from an emphasis on external forms of assessment
to an increased emphasis on classroom formative assessment designed
to assist learning.
Recommendation 12: Programs for providing information to the
public on the role of assessment in improving learning and on con-
temporary approaches to assessment should be developed in coop-
eration with the media. Efforts should be made to foster public under-
standing of the basic principles of appropriate test interpretation and use.
1 RETHINKING THE FOUNDATIONS OF ASSESSMENT 15
Part I
Overview and Background
1
Rethinking the
Foundations of Assessment
The time is right to rethink the fundamental scientific principles and
philosophical assumptions that underlie current approaches to educational
assessment. These approaches have been in place for decades and have
served a number of purposes quite well. But the world has changed sub-
stantially since those approaches were first developed, and the foundations
on which they were built may not support the newer purposes to which
assessments may be put. Moreover, advances in the understanding and
measurement of learning bring new assumptions into play and offer the
potential for a much richer and more coherent set of assessment practices.
In this volume, the Committee on the Foundations of Assessment outlines
these new understandings and proposes a new approach to assessment.
CHARGE TO THE COMMITTEE
The Committee on the Foundations of Assessment was convened in
January 1998 by the National Research Council (NRC) with support from the
National Science Foundation. The committee’s charge was to review and
synthesize advances in the cognitive sciences and to explore their implica-
tions for improving educational assessment in general and assessment of
science and mathematics education in particular. The committee was also
charged with evaluating the extent to which evolving assessment practices
in U.S. schools were derived from research on cognition and learning, as
well as helping to improve public understanding of current and emerging
assessment practices and uses. The committee approached these three ob-
jectives as interconnected themes rather than as separate tasks.
SCOPE OF THE STUDY
The committee considered the implications of advances in the cognitive
and measurement sciences for both classroom and large-scale assessment.
Consistent with its charge, the committee focused primarily on assessment
in science and mathematics education. Although new concepts of assess-
ment could easily apply to other disciplines, science and mathematics hold
particular promise for rethinking assessment because of the substantial body
of important research and design work already done in these disciplines.
Because science and mathematics also have a major impact on the nation’s
technological and economic progress, they have been primary targets for
education reform at the national and state levels, as well as a focus of con-
cern in international comparative studies. Furthermore, there are persistent
disparities among ethnic, geographic, and socioeconomic groups in access
to quality K-12 science and mathematics instruction. Black, Hispanic, and
Native American youth continue to lag far behind Whites and Asians in the
amount of coursework taken in these subjects and in levels of achievement;
this gap negatively affects their access to certain careers and workforce skills.
Better assessment, curriculum, and instruction could help educators diag-
nose the needs of at-risk students and tailor improvements to meet those
needs.
The committee also focused on the assessment of school achievement,
or the outcomes of schooling, and gave less emphasis to predictive tests
(such as college selection tests) that are intended to project how successful
an individual will be in a future situation. We had several reasons for this
emphasis. First, when one considers the use of assessments at the class-
room, district, state, and national levels in any given year, it is clear that the
assessment of academic achievement is far more extensive than predictive
testing. Second, many advances in cognitive science have already been ap-
plied to the study and design of predictive instruments, such as assessments
of aptitude or ability. Much less effort has been expended on the application
of advances in the cognitive and measurement sciences to issues of assess-
ing academic content knowledge, including the use of such information to
aid teaching and learning. Finally, the committee believed that the principles
and practices uncovered through a focus on the assessment of academic
achievement would generally apply also to what we view as the more cir-
cumscribed case of predictive testing.
Our hope is that by reviewing advances in the sciences of how people
learn and how such learning can be measured, and by suggesting steps for
future research and development, this report will help lay the foundation for
a significant leap forward in the field of assessment. The committee envi-
sions a new generation of educational assessments that better serve the goal
of equity. Needed are assessments that help all students learn and succeed
in school by making as clear as possible the nature of their accomplishments
and the progress of their learning.
CONTEXT
In this first chapter we embed the discussion of classroom and large-
scale assessment in a broader context by considering the social, technologi-
cal, and educational setting in which it operates. The discussion of context is
organized around four broad themes:
• Any assessment is based on three interconnected elements or foun-
dations: the aspects of achievement that are to be assessed (cognition), the
tasks used to collect evidence about students’ achievement (observation),
and the methods used to analyze the evidence resulting from the tasks (in-
terpretation). To understand and improve educational assessment, the prin-
ciples and beliefs underlying each of these elements, as well as their interre-
lationships, must be made explicit.
• Recent developments in society and technology are transforming
people’s ideas about the competencies students should develop. At the same
time, education policy makers are attempting to respond to many of the
societal changes by redefining what all students should learn. These trends
have profound implications for assessment.
• Existing assessments are the product of prior theories of learning
and measurement. While adherence to these theories has contributed to the
enduring strengths of these assessments, it has also contributed to some of
their limitations and impeded progress in assessment design.
• Alternative conceptions of learning and measurement now exist that
offer the possibility to establish new foundations for enhanced assessment
practices that can better support learning.
The following subsections elaborate on each of these themes in turn. Some
of the key terms used in the discussion and throughout this report are de-
fined in Box 1-1.
The Significance of Foundations
From teachers’ informal quizzes to nationally administered standardized
tests, assessments have long been an integral part of the educational pro-
cess. Educational assessments assist teachers, students, and parents in deter-
mining how well students are learning. They help teachers understand how
to adapt instruction on the basis of evidence of student learning. They help
principals and superintendents document the progress of individual students,
classrooms, and schools. And they help policy makers and the public gauge the
effectiveness of educational systems.

BOX 1-1 Some Terminology Used in This Report

The cognitive sciences encompass a spectrum of researchers and theorists
from diverse fields—including psychology, linguistics, computer science, anthro-
pology, and neuroscience—who use a variety of approaches to study and under-
stand the workings of human minds as they function individually and in groups.
The common ground is that the central subject of inquiry is cognition, which in-
cludes the mental processes and contents of thought involved in attention, per-
ception, memory, reasoning, problem solving, and communication. These processes
are studied as they occur in real time and as they contribute to the acquisition,
organization, and use of knowledge.

The terms educational measurement, assessment, and testing are used almost
interchangeably in the research literature to refer to a process by which educators
use students’ responses to specially created or naturally occurring stimuli to draw
inferences about the students’ knowledge and skills (Popham, 2000). All of these
terms are used in this report, but we often opt for the term “assessment” instead
of “test” to denote a more comprehensive set of means for eliciting evidence of
student performance than the traditional paper-and-pencil, multiple-choice instru-
ments often associated with the word “test.”
Every educational assessment, whether used in the classroom or large-
scale context, is based on a set of scientific principles and philosophical
assumptions, or foundations as they are termed in this report. First, every
assessment is grounded in a conception or theory about how people learn,
what they know, and how knowledge and understanding progress over
time. Second, each assessment embodies certain assumptions about which
kinds of observations, or tasks, are most likely to elicit demonstrations of
important knowledge and skills from students. Third, every assessment is
premised on certain assumptions about how best to interpret the evidence
from the observations to draw meaningful inferences about what students
know and can do. These three cornerstones of assessment are discussed and
further developed with examples throughout this report.
The foundations influence all aspects of an assessment’s design and use,
including content, format, scoring, reporting, and use of the results. Even
though these fundamental principles are sometimes more implicit than ex-
plicit, they are still influential. In fact, it is often the tacit nature of the foun-
dations and the failure to question basic assumptions that create conflicts
about the meaning and value of assessment results.
Advances in the study of thinking and learning (cognitive science) and
in the field of measurement (psychometrics) have stimulated people to think
in new ways about how students learn and what they know, what is there-
fore worth assessing, and how to obtain useful information about student
competencies. Numerous researchers interested in problems of educational
assessment have argued that, if brought together, advances in the cognitive
and measurement sciences could provide a powerful basis for refashioning
educational assessment (e.g., Baker, 1997; Glaser and Silver, 1994; Messick,
1984; Mislevy, 1994; National Academy of Education, 1996; Nichols, 1994;
National Research Council [NRC], 1999b; Pellegrino, Baxter, and Glaser, 1999;
Snow and Lohman, 1989; Wilson and Adams, 1996). Indeed, the merger
could be mutually beneficial, with the potential to catalyze further advances
in both fields.
Such developments, if vigorously pursued, could have significant long-
term implications for the field of assessment and for education in general.
Unfortunately, the theoretical foundations of assessment seldom receive ex-
plicit attention during most discussions about testing policy and practice.
Short-term issues of implementation, test use, or score interpretation tend to
take precedence, especially in the context of many large-scale testing pro-
grams (NRC, 1999b). It is interesting to note, however, that some of today’s
most pressing issues, such as whether current assessments for accountability
encourage effective teaching and learning, ultimately rest on an analysis of
the fundamental beliefs about how people learn and how to measure such
learning that underlie current practices. For many reasons, the present climate
offers an opportune time to rethink these theoretical underpinnings of
assessment, particularly in an atmosphere such as that surrounding the
committee’s deliberations, one not charged with the polarities and politics that
often envelop discussions of the technical merits of specific testing programs
and practices.
Changing Expectations for Learning
Major societal, economic, and technological changes have transformed
public conceptions about the kinds of knowledge and skills schools should
teach and assessments should measure (Secretary’s Commission on Achiev-
ing Necessary Skills, 1991). These developments have sparked widespread
debate and activity in the field of assessment. The efforts under way in every
state to reform education policy and practice through the implementation of
higher standards for students and teachers have focused to a large extent on
assessment, resulting in a major increase in the amount of testing and in the
emphasis placed on its results (Education Week, 1999). The following sub-
sections briefly review these trends, which are changing expectations for
student learning and the assessment of that learning.
Societal, Economic, and Technological Changes
Societal, economic, and technological changes are transforming the world
of work. The workforce is becoming more diverse, boundaries between jobs
are blurring, and work is being structured in more varying ways (NRC, 1999a).
This restructuring often increases the skills workers need to do their jobs.
For example, many manufacturing plants are introducing sophisticated in-
formation technologies and training employees to participate in work teams
(Appelbaum, Bailey, Berg, and Kalleberg, 2000). Reflecting these transfor-
mations in work, jobs requiring specialized skills and postsecondary educa-
tion are expected to grow more quickly than other types of jobs in the
coming years (Bureau of Labor Statistics, 2000).
To succeed in this increasingly competitive economy, all students, not
just a few, must learn how to communicate, to think and reason effectively,
to solve complex problems, to work with multidimensional data and sophis-
ticated representations, to make judgments about the accuracy of masses of
information, to collaborate in diverse teams, and to demonstrate self-motiva-
tion (Barley and Orr, 1997; NRC, 1999a, 2001). As the U.S. economy contin-
ues its transformation from manufacturing to services and, within services,
to an “information economy,” many more jobs are requiring higher-level
skills than in the past. Many routine tasks are now automated through the
use of information technology, decreasing the demand for workers to per-
form them. Conversely, the demand for workers with high-level cognitive
skills has grown as a result of the increased use of information technology in
the workplace (Bresnahan, Brynjolfsson, and Hitt, 1999). For example, orga-
nizations have become dependent upon quick e-mail interactions instead of
slow iterations of memoranda and replies. Individuals who are not prepared to
reflect quickly yet effectively are at a disadvantage in such an environment.
Technology is also influencing curriculum, changing what and how stu-
dents are learning, with implications for the types of competencies that should
be assessed. New information and communications technologies present
students with opportunities to apply complex content and skills that are
difficult to tap through traditional instruction. In the Weather Visualizer pro-
gram, for example, students use sophisticated computer tools to observe
complex weather data and construct their own weather forecasts (Edelson,
Gordon, and Pea, 1999).
These changes mean that more is being demanded of all aspects of
education, including assessment. Assessments must tap a broader range of
competencies than in the past. They must capture the more complex skills
and deeper content knowledge reflected in new expectations for learning.
They must accurately measure higher levels of achievement while also pro-
viding meaningful information about students who still perform below ex-
pectations. All of these trends are being played out on a large scale in the
drive to set challenging standards for student learning.
An Era of Higher Standards and High-Stakes Tests
Assessment has been greatly influenced by the movement during the
past two decades aimed at raising educational quality by setting challenging
academic standards. At the national level, professional associations of sub-
ject matter specialists have developed widely disseminated standards outlin-
ing the content knowledge, skills, and procedures schools should teach in
mathematics, science, and other areas. These efforts include, among others,
the mathematics standards developed by the National Council of Teachers
of Mathematics (2000), the science standards developed by the NRC (1996),
and the standards in several subjects developed by New Standards (e.g.,
New Standards™, 1997), a privately funded organization.
In addition, virtually every state and many large school districts have
standards in place outlining what all students should know and be able to
do in core subjects. These standards are intended to guide both practice and
policy at the state and district levels, including the development of large-
scale assessments of student performance. The process of developing and
implementing standards at the national and local levels has advanced public
dialogue and furthered professional consensus about the kinds of knowl-
edge and skills that are important for students to learn at various stages of
their education. Many of the standards developed by states, school districts,
and professional groups emphasize that it is important for students not only
to attain a deep understanding of the content of various subjects, but also to
develop the sophisticated thinking skills necessary to perform competently
in these disciplines.
By emphasizing problem solving and inquiry, many of the mathematics
and science standards underscore the idea that students learn best when
they are actively engaged in learning. Several of the standards also stress the
need for students to build coherent structures of knowledge and be able to
apply that knowledge in much the same manner as people who work in a
particular discipline. For instance, the national science standards (NRC, 1996)
state:
Learning science is something students do, not something that is done
to them. In learning science, students describe objects and events, ask ques-
tions, organize knowledge, construct explanations of natural phenomena,
test those explanations in many different ways, and communicate their
ideas to others. . . . Students establish connections between their current
knowledge of science and the scientific knowledge found in many sources;
they apply science content to new questions; they engage in problem solv-
ing, planning, and group discussions; and they experience assessments that
are consistent with an active approach to learning. (p. 20)
In these respects, the standards represent an important start toward in-
corporating findings from cognitive research about the nature of knowledge
and expertise into curriculum and instruction. Standards vary widely, how-
ever, and some have fallen short of their intentions. For example, some state
standards are too vague to be useful blueprints for instruction or assessment.
Others call upon students to learn a broad range of content rather than
focusing in depth on the most central concepts and methods of a particular
discipline, and some standards are so detailed that the big ideas are lost or
buried (American Federation of Teachers, 1999; Finn, Petrilli, and Vanourek,
1998).
State standards, whatever their quality, have significantly shaped class-
room practices and exerted a major impact on assessment. Indeed, assess-
ment is pivotal to standards-based reforms because it is the primary means
of measuring progress toward attainment of the standards and of holding
students, teachers, and administrators accountable for improvement over
time. This accountability, in turn, is expected to create incentives for modi-
fying and improving performance.
Without doubt, the standards movement has increased the amount of
testing in K-12 schools and raised the consequences, expectations, and con-
troversies attached to test results. To implement standards-based reforms,
many states have put in place new tests in multiple curriculum areas and/or
implemented tests at additional grade levels. Currently, 48 states have state-
wide testing programs, compared with 39 in 1996, and many school districts
also have their own local testing programs (in addition to the range of class-
room tests teachers regularly administer). As a result of this increased em-
phasis on assessment as an instrument of reform, the amount of spending
on large-scale testing has doubled in the past 4 years, from $165 million in
1996 to $330 million in 2000 (Achieve, 2000).
Moreover, states and school districts have increasingly attached high
stakes to test results. Scores on assessments are being used to make deci-
sions about whether students advance to the next grade or graduate from
high school, which students receive special services, how teachers and ad-
ministrators are evaluated, how resources are allocated, and whether schools
are eligible for various rewards or subject to sanctions or intervention by the
district or state. These efforts have particular implications for equity if and
when certain groups are disproportionately affected by the policies. As a
result, the courts are paying greater attention to assessment results, and
lawsuits are under way in several states seeking to use measures of educational
quality to determine whether states are fulfilling their responsibility to
provide all students with an adequate education (NRC, 1999c).
Although periodic testing is a critical part of any education reform, some
of the movement toward increased testing may be fueled by a misguided
assumption that more frequent testing, in and of itself, will improve educa-
tion. At the same time, criticism of test policies may be predicated on an
equally misguided assumption that testing, in and of itself, is responsible for
most of the problems in education. A more realistic approach is to address
education problems not by stepping up the amount of testing or abandoning
assessments entirely, but rather by refashioning assessments to meet current
and future needs for quality information. However, it must be recognized
that even very well-designed assessments cannot by themselves improve
learning. Improvements in learning will depend on how well assessment,
curriculum, and instruction are aligned and reinforce a common set of learn-
ing goals, and on whether instruction shifts in response to the information
gained from assessments.
With so much depending on large-scale assessment results, it is more
crucial than ever that the scores be reliable in a technical sense and that the
inferences drawn from the results be valid and fair. It is just as important,
however, that the assessments actually measure the kinds of competencies
students need to develop to keep pace with the societal, economic, and
technological changes discussed above, and that they promote the kinds of
teaching and learning that effectively build those competencies. By these
criteria, the heavy demands placed on many current assessments generally
exceed their capabilities.
Impact of Prior Theories of Learning and Measurement
Current assessment practices are the cumulative product of theories of
learning and models of measurement that were developed to fulfill the so-
cial and educational needs of a different time. This evolutionary process is
described in more detail in Chapters 3 and 4. As Mislevy (1993, p. 19) has
noted, “It is only a slight exaggeration to describe the test theory that domi-
nates educational measurement today as the application of 20th century sta-
tistics to 19th century psychology.” Although the core concepts of prior theo-
ries and models are still useful for certain purposes, they need to be augmented
or supplanted to deal with newer assessment needs.
Early standardized tests were developed at a time when enrollments in
public schools were burgeoning, and administrators sought tools to help
them educate the rapidly growing student populations more efficiently. As
described in Testing in American Schools (U.S. Congress, Office of Technol-
ogy Assessment, 1992), the first reported standardized written achievement
exam was administered in Massachusetts in the mid-19th century and in-
tended to serve two purposes: to enable external authorities to monitor
school systems and to make it possible to classify children in pursuit of more
efficient learning. Thus it was believed that the same tests used to monitor
the effectiveness of schools in accomplishing their missions could be used
to sort students according to their general ability levels and provide school-
ing according to need. Yet significant problems have arisen in the history of
assessment when it has been assumed that tests designed to evaluate the
effectiveness of programs and schools can be used to make judgments about
individual students. (Ways in which the purpose of an assessment should
influence its design are discussed in Chapter 2 and more fully in Chapter 6.)
At the same time, some educators also sought to use tests to equalize oppor-
tunity by opening up to individuals with high ability or achievement an
educational system previously dominated by those with social connections—
that is, to establish an educational meritocracy (Lemann, 1999). The achieve-
ment gaps that continue to persist suggest that the goal of equal educational
opportunity has yet to be achieved.
Some aspects of current assessment systems are linked to earlier theo-
ries that assumed individuals have basically fixed dispositions to behave in
certain ways across diverse situations. According to such a view, school
achievement is perceived as a set of general proficiencies (e.g., mathematics
ability) that remain relatively stable over situations and time.
Current assessments are also derived from early theories that character-
ize learning as a step-by-step accumulation of facts, procedures, definitions,
and other discrete bits of knowledge and skill. Thus, the assessments tend to
include items of factual and procedural knowledge that are relatively cir-
cumscribed in content and format and can be responded to in a short amount
of time. These test items are typically treated as independent, discrete enti-
ties sampled from a larger universe of equally good questions. It is further
assumed that these independent items can be accumulated or aggregated in
various ways to produce overall scores.
Limitations of Current Assessments
The most common kinds of educational tests do a reasonable job with
certain functions of testing, such as measuring knowledge of basic facts and
procedures and producing overall estimates of proficiency for an area of the
curriculum. But both their strengths and limitations are a product of their
adherence to theories of learning and measurement that fail to capture the
breadth and richness of knowledge and cognition. The limitations of these
theories also compromise the usefulness of the assessments. The growing
reliance on tests for making important decisions and for improving educa-
tional outcomes has called attention to some of their more serious limita-
tions.
One set of concerns relates to whether the most widely used assess-
ments effectively capture the kinds of complex knowledge and skills that
are emphasized in contemporary standards and deemed essential for suc-
cess in the information-based economy described above (Resnick and Resnick,
1992; Rothman, Slattery, Vranek, and Resnick, in press). Traditional tests do
not focus on many aspects of cognition that research indicates are impor-
tant, and they are not structured to capture critical differences in students’
levels of understanding. For example, important aspects of learning not ad-
equately tapped by current assessments include students’ organization of
knowledge, problem representations, use of strategies, self-monitoring skills,
and individual contributions to group problem solving (Glaser, Linn, and
Bohrnstedt, 1997; NRC, 1999b).
The limits on the kinds of competencies currently being assessed also
raise questions about the validity of the inferences one can draw from the
results. If scores go up on a test that measures a relatively narrow range of
knowledge and skills, does that mean student learning has improved, or has
instruction simply adapted to a constrained set of outcomes? If there is ex-
plicit “teaching to the test,” at what cost do such gains in test scores accrue
relative to acquiring other aspects of knowledge and skill that are valued in
today’s society? This is a point of considerable current controversy (Klein,
Hamilton, McCaffrey, and Stecher, 2000; Koretz and Barron, 1998; Linn, 2000).
A second issue concerns the usefulness of current assessments for im-
proving teaching and learning—the ultimate goal of education reforms. On
the whole, most current large-scale tests provide very limited information
that teachers and educational administrators can use to identify why stu-
dents do not perform well or to modify the conditions of instruction in ways
likely to improve student achievement. The most widely used state and
district assessments provide only general information about where a student
stands relative to peers (for example, that the student scored at the 45th
percentile) or whether the student has performed poorly or well in certain
domains (for example, that the student performs “below basic in mathemat-
ics”). Such tests do not reveal whether students are using misguided strategies
to solve problems or failing to understand key concepts within the subject
matter being tested. They do not show whether a student is advancing to-
ward competence or is stuck at a partial understanding of a topic that could
seriously impede future learning. Indeed, it is entirely possible that a student
could answer certain types of test questions correctly and still lack the most
basic understanding of the situation being tested, as a teacher would quickly
learn by asking the student to explain the answer (see Box 1-2). In short,
many current assessments do not offer strong clues as to the types of educa-
tional interventions that would improve learners’ performance, or even pro-
vide information on precisely where the students’ strengths and weaknesses
lie.
A third limitation relates to the static nature of many current assess-
ments. Most assessments provide “snapshots” of achievement at particular
points in time, but they do not capture the progression of students’ conceptual
understanding over time, which is at the heart of learning. This limitation
exists largely because most current modes of assessment lack an underlying
theoretical framework of how student understanding in a content domain develops
over the course of instruction, and predominant measurement methods are not
designed to capture such growth.

BOX 1-2 Rethinking the Best Ways to Assess Competence

Consider the following two assessment situations:

Assessment #1
Question: What was the date of the battle of the Spanish Armada?
Answer: 1588 [correct].
Question: What can you tell me about what this meant?
Answer: Not much. It was one of the dates I memorized for the exam. Want to
hear the others?

Assessment #2
Question: What was the date of the battle of the Spanish Armada?
Answer: It must have been around 1590.
Question: Why do you say that?
Answer: I know the English began to settle in Virginia just after 1600, not
sure of the exact date. They wouldn’t have dared start overseas explorations if
Spain still had control of the seas. It would take a little while to get
expeditions organized, so England must have gained naval supremacy somewhere in
the late 1500s.

Most people would agree that the second student showed a better understanding
of the Age of Colonization than the first, but too many examinations would
assign the first student a better score. When assessing knowledge, one needs to
understand how the student connects pieces of knowledge to one another. Once
this is known, the teacher may want to improve the connections, showing the
student how to expand his or her knowledge.
A fourth and persistent set of concerns relates to fairness and equity.
Much attention has been given to the issue of test bias, particularly whether
differences occur in the performance of various groups for reasons that are
irrelevant to the competency the test is intended to measure (Cole and Moss,
1993). Standardized test items are subjected to judgmental and technical review.