Performance and Authentic Assessments
In recent years, considerable attention has been given to what is referred to as authentic
assessment. All authentic assessments are performance assessments, but the converse is not always true. An
authentic assessment involves a real application of a skill beyond its instructional context. The term
authentic assessment is intended to convey that the “assessment tasks themselves are real instances of
extended criterion performances, rather than proxies or estimators of actual learning goals.” Other authors
describe authentic assessments as involving a “performance of tasks that are valued in their own right. In
contrast, paper-and-pencil, multiple-choice tests derive their value primarily as indicators or correlates of
other valued performances.”
Performance assessments are typically used in conjunction with written tests. Although
performance assessments are needed to measure complex skills, written tests tend to be more efficient at
assessing the information, concepts, and rules that provide the fundamental knowledge for complex skills.
Before a performance assessment asks students to create hypotheses, written tests can be used to establish
students’ knowledge that is prerequisite to creating such hypotheses.
Performance assessments have a variety of unique characteristics. For example, they can measure
a process as well as products resulting from a process. They can occur in natural or structured settings.
Most important, performance assessments can measure skills that paper-and-pencil tests cannot. However,
performance assessments have significant limitations. For instance, they are usually time-consuming to
develop, administer, and score. Also, as with essay tests, subjectivity in scoring, unless controlled, results
in substantial measurement error.
Narrowing our focus in this way does not imply that observing general behaviors is inappropriate
or that providing students with experiences for which there is no correct or preferred solution should be
avoided. Our definition of performance assessments is restricted so as to focus on procedures other than
written tests to which educational measurement principles can be readily applied.
As with essay tests, performance assessments can be scored analytically or holistically. With
analytical scoring, the appropriateness of a student’s response is judged on each of a series of attributes.
Checklists and rating scales are used with analytical scoring. The scoring of many performance
assessments, however, cannot be broken into distinct attributes. Instead, an overall or holistic judgment is
made. Scoring rubrics are used to facilitate holistic scoring.
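The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the checklist attributes, the rubric levels and their wording, and the function names are all hypothetical and are not taken from the text. Analytic scoring records a judgment for each attribute, whereas holistic scoring records one overall judgment.

```python
# Hypothetical checklist attributes for some task (assumed for illustration).
CHECKLIST = ["states a hypothesis", "identifies variables", "describes a procedure"]

# Hypothetical holistic rubric: each level describes the performance as a whole.
RUBRIC = {
    3: "Complete, well-organized performance",
    2: "Adequate performance with minor gaps",
    1: "Incomplete performance",
}


def score_analytic(observed):
    """Judge each attribute separately and keep the per-attribute record."""
    marks = {item: bool(observed.get(item, False)) for item in CHECKLIST}
    return {"marks": marks, "total": sum(marks.values())}


def score_holistic(level):
    """Record a single overall judgment against the rubric."""
    if level not in RUBRIC:
        raise ValueError(f"level must be one of {sorted(RUBRIC)}")
    return {"level": level, "description": RUBRIC[level]}


analytic = score_analytic({"states a hypothesis": True, "describes a procedure": True})
holistic = score_holistic(2)
```

In both cases the sketch keeps the descriptive record (the individual marks or the rubric description) alongside any number, so the result can still be discussed with the student.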
Scoring amounts to making a summary statement concerning a student’s performance. This
summary may, but does not need to, involve numbers. Although numbers are convenient, the actual
descriptions provided by the checklist, rating scale, or scoring rubric are generally more useful than
numbers generated through the scoring process. However, numbers are particularly helpful when
summative evaluations are involved, such as when grades are to be assigned. With formative evaluations,
the completed checklist, rating scale, or scoring rubric provides a framework for discussing results with
students.
Anecdotal records can also be used to document a student’s behavior and facilitate later feedback.
Creating anecdotal records generally does not constitute scoring a performance assessment.
Rating scales are similar to checklists, except they provide a scale or range of responses for
each item. Table 38 illustrates a series of rating scales. In this example, students are assigned a
score between 1 and 5 on different qualities associated with delivering a speech.
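As a rough sketch of how such ratings might be recorded, the following Python example checks that each quality receives a score between 1 and 5 and returns the full profile of ratings. The three quality names and the function name are assumptions made for illustration; they are not the qualities listed in the table.

```python
SCALE_MIN, SCALE_MAX = 1, 5

# Hypothetical speech qualities (assumed; not the ones in the table).
QUALITIES = ["organization", "eye contact", "vocal clarity"]


def rate_speech(ratings):
    """Validate one rating per quality and return (profile, mean rating)."""
    missing = [q for q in QUALITIES if q not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    for quality in QUALITIES:
        value = ratings[quality]
        if not SCALE_MIN <= value <= SCALE_MAX:
            raise ValueError(f"{quality}: rating {value} is outside 1-5")
    profile = {q: ratings[q] for q in QUALITIES}
    return profile, sum(profile.values()) / len(profile)


profile, mean_rating = rate_speech(
    {"organization": 4, "eye contact": 3, "vocal clarity": 5}
)
```

The mean is one possible numeric summary; the per-quality profile is kept because it shows where a particular speech was strong or weak.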
Rating scales can assume a variety of forms. For instance, the scales in Table 38 use
words involving comparisons among students. Ratings such as “exceptional,” “good,” and “class
standards” gain meaning through prior experience with other students. Points on a rating scale
can also refer to specific behaviors, as shown in Table 39.
This last point is often confusing. Involving real-world applications does not necessarily
refer to activities in which students will directly participate later in life. Instead, real-world
applications involve direct applications of knowledge that are highly relevant to situations
outside the classroom. Authentic tasks in science, for instance, relate to real-world situations in
which the science is applied or discussed. Likewise, authentic tasks in math relate to real-world
situations in which mathematicians or other consumers of mathematics use math knowledge.
2. Identify Concise Goals That Will Be Assessed
Prior to creating a performance assessment, the goals to be assessed must be identified.
Whereas a performance objective identifies a specific behavior, a goal is stated more broadly.
Performance assessments normally are used to evaluate broader objectives or goals.
An example of a performance objective:
Information: Identify qualities desired in short-answer items.
An example of a goal is as follows: Produce a short-answer test.
A goal is often the equivalent of several objectives. Goals rather than performance
objectives typically have to be used with performance assessments because the task cannot be
meaningfully expressed as a series of specific behaviors. Alternatively, listing the full series of
specific performances associated with the task may be impractical.
Nevertheless, to develop a performance assessment, a goal must be expressed in
operational terms. The actual task to be performed may represent one of several options for
operationally expressing the goal. The task that is selected must be a meaningful representation
of the goal.
When constructing performance assessments, there is a temptation to increase realism by
involving associated skills irrelevant to the goal being assessed. Including such skills, even when
they increase realism, is equivalent to including irrelevant information in a written test item. This
extraneous information tends to confuse students and, more important, confounds the
assessment. When a student performs unsatisfactorily, the teacher cannot determine whether the
student has yet to achieve the goal being assessed or simply is unable to work through the
irrelevant material. When determining content to be included in a performance assessment, keep
it free of irrelevant skills.