
PERFORMANCE AND AUTHENTIC ASSESSMENTS

In recent years, considerable attention has been given to what is referred to as authentic
assessment. All authentic assessments are performance assessments, but the inverse is not always true. An
authentic assessment involves a real application of a skill beyond its instructional context. The term
authentic assessment is intended to convey that the “assessment tasks themselves are real instances of
extended criterion performances, rather than proxies or estimators of actual learning goals.” Other authors
describe authentic assessments as involving a “performance of tasks that are valued in their own right. In
contrast, paper-and-pencil, multiple-choice tests derive their value primarily as indicators or correlates of
other valued performances.”
Performance assessments are typically used in conjunction with written tests. Although
performance assessments are needed to measure complex skills, written tests tend to be more efficient at
assessing the information, concepts, and rules that provide the fundamental knowledge for complex skills.
Before a performance assessment asks students to create hypotheses, written tests can be used to establish
students’ knowledge that is prerequisite to creating such hypotheses.
Performance assessments have a variety of unique characteristics. For example, they can measure
a process as well as products resulting from a process. They can occur in natural or structured settings.
Most important, performance assessments can measure skills that paper-and-pencil tests cannot. However,
performance assessments have significant limitations. For instance, they are usually time-consuming to
develop, administer, and score. Also, as with essay tests, subjectivity in scoring, unless controlled, results
in substantial measurement error.

Characteristics of Performance Assessments


Specifying the characteristics of performance assessments depends in part on deciding exactly
what a performance assessment is. The qualities associated with written formats such as multiple-choice
and essays are more established. The nature of performance assessments is highly diverse and may
include observation and evaluation of laboratory experiments, artwork, speaking, work habits, social
interactions, and feelings about issues. For our purposes, a performance assessment will be limited to
those situations that meet the following criteria:
1. Specific behavior or outcomes of behaviors are to be observed.
2. It is possible to judge the appropriateness of students’ actions or at least to identify whether one
possible response is more appropriate than some alternative response.
3. The process or outcome cannot be directly measured using a paper-and-pencil test, such as a test
involving a multiple-choice, essay, or another written format.

Narrowing our focus in this way does not imply that observing general behaviors is inappropriate
or that providing students with experiences for which there is no correct or preferred solution should be
avoided. Our definition of performance assessments is restricted so as to focus on procedures other than
written tests to which educational measurement principles can be readily applied.

Categories of Performance Assessments


Performance assessments can be categorized in several ways. This chapter contrasts performance
assessments that measure a process versus a product, use simulated versus real settings, and depend on
natural versus structured stimuli.
1. Process versus Product Measures. A process is the procedure that a student uses to complete a task. A
product is a tangible outcome that may be the result of completing a process. For example, the way in
which an individual uses woodworking tools to build a piece of furniture would be a process; the piece of
furniture resulting from working with the tools would be the product. Generally, a performance
assessment is concerned with only the process or only the product, or at least emphasizes one over the
other. For instance, the completed watercolor might be assessed, but not the student’s technique in
producing the painting.
2. Simulated versus Real Settings. Many performance assessments represent simulations because the real
situations are too expensive or dangerous to use, or are unavailable or impractical for other reasons. Some
simulations are so realistic that they unquestionably represent adequate substitutes for the real thing.
Sophisticated flight simulators train and examine new pilots so thoroughly that the student pilots can
safely pilot basic aircraft solo without in-flight experience. Authentic assessments place considerable
emphasis on realism, simulated or otherwise. As with any performance assessment, however, realism
must go beyond appearances. Having an assessment look real is not sufficient to make an assessment
authentic. The appearance of realism may not even be necessary. Instead, an authentic assessment must
incorporate all conditions relevant to the realistic use of the assessed knowledge.
3. Natural versus Structured Stimuli. A stimulus is natural when it occurs without the intervention of the
observer. For instance, a student’s social skills are typically evaluated without prompts. In some
situations, however, a stimulus may be structured to ensure that the behavior being evaluated occurs or
occurs in a particular setting. Examples of structured stimuli include asking a student to prepare and
deliver a speech, perform a lab experiment, or read aloud.
Natural stimuli facilitate observation of typical performance, whereas structured stimuli tend to
elicit maximum performance. Therefore, natural stimuli are preferred for assessing personality traits,
work habits, and willingness to follow prescribed procedures such as safety rules. Structured stimuli are
needed to determine how well a student can explain a concept orally, write a paper, play a musical
instrument, or perform other tasks.
Structuring the stimulus also ensures that the performance to be observed will occur. Observation
time can be reduced by asking a student to do something rather than waiting for it to happen naturally.
Structuring also helps determine whether the lack of a particular performance results because the student
is avoiding a behavior in which he or she is not proficient or because the appropriate condition for
eliciting that behavior has yet to occur.

Advantages of Performance Assessments


The most significant advantage of performance assessments is that they allow evaluation of
complex skills which are difficult to assess by written tests. A complex skill draws on previously learned
information, concepts, and rules, and involves problems that can be solved in a variety of ways. A written
test can measure students’ understanding of grammar rules; however, a performance assessment must be
used to determine whether students can apply these rules when writing. Parallel examples exist in every
academic area. The learning of complex skills is often the ultimate justification of many subjects taught in
school. Without performance assessments, proficiency with these skills generally cannot be evaluated.
A second basic advantage of performance assessments is their effect on instruction and learning.
For example, students are more likely to learn to write in the active voice, rather than merely learning the
difference between the active and passive voice, if writing performance itself is going to be assessed. Similarly, the teacher is
more likely to teach how to write in the active voice if the performance assessment is part of the lesson
plan.
A third advantage is that performance assessments can be used to evaluate the process as well as
the product. Whereas written tests (and portfolio assessments) focus on the product that results from
performing the task, the focus of a performance assessment is often on the process a student uses to get to
that product. Examples include observing how students formulate a hypothesis or techniques they use in a
science lab. Observing the process is particularly important in diagnostic evaluations. If a student is
having difficulty solving a math problem, an effective approach is for the teacher to watch the student work
through the problem. This activity is a performance assessment.

Limitations of Performance Assessments


One major limitation of performance assessments is the considerable amount of time they require
to administer. Whereas written tests are usually administered simultaneously to an entire class,
performance assessments often must be administered to one student or to a small group of students. This
limitation makes it difficult to use performance assessments for measuring a substantial number of skills.
Because administering performance assessments is time intensive, more efficient techniques, such as
written tests, should be used when possible.
A second limitation of performance assessments is that the student responses often cannot be
scored later. In particular, when a process rather than a product is being assessed, the observer has to
score or record events as they happen. If a pertinent behavior goes unobserved, it goes unmeasured.
A third limitation pertains to scoring. Like that of essay tests, the scoring of performance
assessments is susceptible to rater error. Bias, expectations and inconsistent standards can easily cause
teachers to interpret the same observation differently. As with essay tests, this problem can be controlled
by developing a careful scoring plan.
A fourth limitation involves the inconsistencies in performance on alternative skills within the
same domain. Will a student who can deliver a persuasive speech also be able to deliver an entertaining
speech? Will an education student who can create a good short-answer test also be able to create a good
essay test or a good performance assessment? Performance often does not generalize across alternative
skills of a domain. The way to resolve this problem is to observe the student performing each task.
However, doing so is often not possible because performance assessments are time-consuming to
administer.

Scoring Performance Assessments


Another major component of a performance assessment specification is establishing the scoring
plan. This may result in a numerical score, but often, particularly in classroom assessments, it produces a
qualitative description of a student’s performance. The scoring plan establishes what the teacher will
observe within each student’s performance.
The content of a scoring plan is heavily dependent on whether the process or product of a
student’s response is to be scored. Some authors propose that the process should be scored if generally
accepted procedures for completing the task have been taught and a student’s departure from these
procedures can be detected. However, the product should be scored if a variety of procedures are
appropriate, particular procedures have not been explicitly taught, or the use of appropriate procedures
cannot be ascertained by watching the student’s performance. Considering the example in Table 41, the
student’s actual construction of the short-answer test will not be observed; on the other hand,
characteristics that should be present in a copy-ready short-answer test have been established. The
scoring, therefore, focuses on the product rather than on the process, while Table 42 shows the scoring
rubric focusing on the process.
Three issues should be considered when establishing a scoring plan.
1. Is each quality to be measured directly observable?
2. Does the scoring plan delineate essential qualities of a satisfactory performance?
3. Will the scoring plan result in different observers’ assigning similar scores to a student’s
performance?
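The third issue, whether different observers assign similar scores, can be examined empirically by having two raters score the same set of performances and computing their rate of exact agreement. The sketch below is illustrative only; the scores are hypothetical.

```python
# Illustrative check of scoring consistency between two raters (hypothetical data).
# Each rater assigns a 1-5 rating to the same ten student performances.

def exact_agreement(rater_a, rater_b):
    """Proportion of performances on which both raters assign the same score."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Raters must score the same performances")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)

rater_a = [4, 3, 5, 2, 4, 3, 3, 5, 2, 4]
rater_b = [4, 3, 4, 2, 4, 3, 2, 5, 2, 4]

print(f"Exact agreement: {exact_agreement(rater_a, rater_b):.0%}")  # 80%
```

A low agreement rate signals that the scoring plan's qualities are not observable or well delineated enough for different observers to apply them consistently.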

As with essay tests, performance assessments can be scored analytically or holistically. With
analytical scoring, the appropriateness of a student’s response is judged on each of a series of attributes.
Checklists and rating scales are used with analytical scoring. The scoring of many performance
assessments, however, cannot be broken into distinct attributes. Instead, an overall or holistic judgment is
made. Scoring rubrics are used to facilitate holistic scoring.
Scoring amounts to making a summary statement concerning a student’s performance. This
summary may, but does not need to, involve numbers. Although numbers are convenient, the actual
descriptions provided by the checklist, rating scale, or scoring rubric are generally more useful than
numbers generated through the scoring process. However, numbers are particularly helpful when
summative evaluations are involved, such as when grades are to be assigned. With formative evaluations,
the completed checklist, rating scale, or scoring rubric provides a framework for discussing results with
students.
Anecdotal records can also be used to document a student’s behavior and facilitate later feedback.
Creating anecdotal records generally does not constitute scoring a performance assessment.

Checklists and Rating Scales


A checklist is a list of actions or descriptions; the teacher or rater checks off items as the given
behavior or outcome is observed. Checklists can be used in a variety of settings to establish the presence
or absence of a series of conditions. They also help structure complex observations. A checklist can
structure observations of a student within a performance assessment.
The checklist in Table 37 is from part of an assessment of students’ learning to use a word
processor. Students were provided both a diskette and printed copy of a short paper. Changes that the
students were to make were handwritten on the paper copy. The checklist was used to observe each
student’s use of a word processor to make the required changes. Notice that an option is provided for
recording a “did not observe” response.
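A checklist of this kind can be sketched as a simple data structure. The word-processing items below are invented for illustration (the actual Table 37 items are not reproduced here); note the third recording option for behaviors the rater had no chance to observe.

```python
OBSERVED = "yes"
NOT_OBSERVED = "no"
NOT_SEEN = "did not observe"  # option for behaviors the rater had no chance to see

# Hypothetical checklist items; not the actual Table 37 content.
checklist = {
    "Opens the document file": OBSERVED,
    "Inserts the handwritten changes": OBSERVED,
    "Saves the revised document": NOT_OBSERVED,
    "Prints the final copy": NOT_SEEN,
}

# Summarize only the behaviors the rater actually had a chance to observe.
rated = {item: mark for item, mark in checklist.items() if mark != NOT_SEEN}
completed = sum(1 for mark in rated.values() if mark == OBSERVED)
print(f"{completed} of {len(rated)} observed behaviors completed")  # prints: 2 of 3 observed behaviors completed
```

Separating "not observed" from "did not observe" keeps unobserved behaviors from being scored as failures.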

Rating scales are similar to checklists, except they provide a scale or range of responses for
each item. Table 38 illustrates a series of rating scales. In this example, students are assigned a
score between 1 and 5 on different qualities associated with delivering a speech.
Rating scales can assume a variety of forms. For instance, the scales in Table 38 use
words involving comparisons among students. Ratings such as “exceptional,” “good,” and “class
standards” gain meaning through prior experience with other students. Points on a rating scale
can also refer to specific behaviors, as shown in Table 39.
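A rating scale can be sketched in much the same way as a checklist. The speech qualities and anchor labels below are hypothetical stand-ins for the content of Table 38; each quality receives a score between 1 and 5.

```python
# Hypothetical rating scales for a speech (stand-ins for Table 38 content).
# Each quality is rated from 1 (lowest) to 5 (highest).
SCALE_LABELS = {
    1: "poor",
    2: "below class standards",
    3: "meets class standards",
    4: "good",
    5: "exceptional",
}

ratings = {
    "Organization of ideas": 4,
    "Eye contact with audience": 3,
    "Clarity of delivery": 5,
}

for quality, score in ratings.items():
    print(f"{quality}: {score} ({SCALE_LABELS[score]})")

total = sum(ratings.values())
print(f"Total: {total} of {5 * len(ratings)}")
```

Unlike a checklist's yes/no judgment, each scale point carries a label, so the completed form describes the performance as well as scoring it.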

Rubrics (also refer to Chapter 15)


Performance assessments sometimes must be scored holistically because it is not always
possible to analyze performance into a series of separate attributes. Sometimes holistic scoring is
used simply because it is faster than analytical scoring; it is quicker to obtain an overall
impression than to make a series of judgments. Scoring rubrics are often used when performance
is being holistically scored. A rubric is like a rating scale: it consists of a scale with descriptions of
performance that range from higher to lower. Unlike a rating scale, a scoring rubric addresses
several qualities simultaneously within the same scale. Table 40 illustrates a rubric for scoring
writing samples. Take note that the same set of variables is used across the range of the
continuum.
When developing a scoring rubric, it is important to list the variables that are to be judged
and then to establish specific descriptions of these characteristics for each point along the
continuum. These characteristics should fit together at each point. It is not sufficient for each
variable to change from lower to higher across the continuum. Instead, the description of all
variables should match what is typically seen in students’ performance at a particular level.
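A holistic rubric can be viewed as a scale whose points each describe all of the judged variables at once. The levels and descriptions below are invented for illustration (the actual Table 40 is not reproduced); notice that every level describes the same variables, as the text requires.

```python
# Hypothetical holistic rubric for scoring writing samples (not the actual Table 40).
# Every level describes the same variables: focus, organization, and mechanics.
rubric = {
    4: "Clear focus throughout; ideas logically organized; few mechanical errors.",
    3: "Focus generally maintained; organization mostly logical; some mechanical errors.",
    2: "Focus drifts; organization loose; frequent mechanical errors.",
    1: "No clear focus; little organization; mechanical errors obscure meaning.",
}

def score_holistically(level: int) -> str:
    """A holistic score is a single level, not a sum of separate attribute scores."""
    return f"Level {level}: {rubric[level]}"

print(score_holistically(3))
```

Contrast this with analytical scoring, where focus, organization, and mechanics would each receive a separate rating and the ratings would then be combined.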

Requirements Prior to Creating a Performance Assessment


It will be useful to identify other tasks that should be completed before the performance
assessment is produced. Performance assessments are a very valuable tool. They can assess skills
that other techniques cannot assess, and these skills tend to represent the ultimate goals of
instruction. Here are some strategies that can be useful:

1. Identify Authentic Tasks


Performance assessments should focus on authentic tasks. Here are qualities that make a
task authentic:
 The task is an actual performance outcome that students are to achieve through your
class. The task is not a convenient substitute for that performance or an indirect indication
that students have achieved that performance.
 The task requires students to draw on previously learned knowledge.
 The task clearly involves important themes and ideas associated with the content being
taught.
 The task involves a real-world application of the content being taught.

This last point is often confusing. Involving real-world applications does not necessarily
refer to activities in which students will directly participate later in life. Instead, real-world
applications involve direct applications of knowledge that are highly relevant to situations
outside the classroom. Authentic tasks in science, for instance, relate to real-world situations in
which the science is applied or discussed. Likewise, authentic tasks in math relate to real-world
situations in which mathematicians or other consumers of mathematics use math knowledge.
2. Identify Concise Goals That Will Be Assessed
Prior to creating a performance assessment, the goals to be assessed must be identified.
Whereas a performance objective identifies a specific behavior, a goal is stated more broadly.
Performance assessments normally are used to evaluate broader objectives or goals.
An example of a performance objective is as follows:
Information: Identify qualities desired in short-answer items.
An example of a goal is as follows: Produce a short-answer test.
A goal is often the equivalent of several objectives. Goals rather than performance
objectives typically have to be used with performance assessments because the task cannot be
meaningfully expressed as a series of specific behaviors. Alternatively, listing the full series of
specific performances associated with the task may be impractical.
Nevertheless, to develop a performance assessment, a goal must be expressed in
operational terms. The actual task to be performed may represent one of several options for
operationally expressing the goal. The task that is selected must be a meaningful representation
of the goal.
When constructing performance assessments, there is a temptation to increase realism by
involving associated skills irrelevant to the goal being assessed. Including such skills, even when
they increase realism, is equivalent to including irrelevant information in a written test item. This
extraneous information tends to confuse students and, more important, confounds the
assessment. When a student performs unsatisfactorily, the teacher cannot determine whether the
student has yet to achieve the goal being assessed or simply is unable to work through the
irrelevant material. When determining content to be included in a performance assessment, keep
it free of irrelevant skills.

3. Determine Whether the Assessment Will Focus on Process or Product


As mentioned earlier, a performance assessment can uniquely evaluate the process that a
student uses as well as the product that results from completing the task. The process is
particularly important for diagnostic evaluation when a teacher tries to find out why a student
encounters difficulty. For some tasks, the way in which a student tackles the problem is more
relevant to the instructional goal than the specific solution the student derives. Generally, the
performance assessment focuses on either process or product, not both. The particular focus
affects the direction given to the student and the plan the teacher must devise to score students’
responses.

4. Tell Students What Will Be Expected of Them


One advantage of performance assessments is that teachers usually can tell students
exactly what is on the test, something that cannot be done with a written test. With performance
assessments, telling students what they will be asked to do focuses their effort on the right
instructional goals. (This also is an effective way to help assure that instruction is consistent with
the instructional goals!) Telling students what they are to do increases the likelihood that they
will successfully perform the task. Given that performance assessments are time-consuming to
administer, teachers and students can save considerable time when students know what they are
expected to accomplish.

5. Determine That Students Have Mastered Prerequisite Skills


When used appropriately, performance assessments require students to integrate
knowledge. Obviously, without knowledge of prerequisite information, concepts, and rules,
students will not succeed with the performance assessment. Considerable time will be saved if
prerequisite skills are identified and students’ achievement of these skills is evaluated before the
performance assessment is administered. Again, performance assessments on which students
perform well are much more quickly scored.
There are three basic steps to produce a performance assessment: establishing the
capability to be assessed, the performance to be observed, and the plan for scoring students’
performance. These will be discussed in the sections that follow.

Establishing the Capability to be Assessed


When creating a performance assessment, it is useful to structure the process using
a specification, such as the one illustrated in Table 41. The specification breaks the construction of
a performance assessment into three major components, the first involving specification of the
capability to be assessed.
Specifying the capability to be assessed is the most fundamental step in creating a
performance assessment. Throughout the discussion in the different chapters, a very deliberate
distinction has been made between the student’s capability and the student’s performance. The
capability is the student’s knowledge, which we would like to assess but cannot assess directly,
because we cannot see what another person knows or is thinking. Instead, we must identify a
performance that provides an indication of the student’s knowledge.
A performance assessment is used to observe this performance systematically. The
selection of the performance to be observed must be based on a clear awareness of the capability
being assessed. Identifying that capability helps us determine the type of behavior we should
observe and how broadly our observation must generalize. For instance, if we are trying to learn
whether a student can find the meaning of an unknown word without the help of others, we
might ask the student to look up a few words in a dictionary. However, we may want our
observations to generalize to both paper and electronic dictionaries and possibly to other
resources. We must clearly establish the capabilities we are trying to assess and use this
framework to control the content of our performance assessments.
With performance assessments, the capability is often expressed as a goal. Goals involve
statements such as these:
Using commercial references, find the meaning of unknown words.
Produce a short-answer test.
Summarize the plot of a short story.
Calculate the volume of an irregularly shaped object.
Use ratios of area to solve common problems.
When creating a performance assessment, it may be necessary to elaborate on the goal in
order to define the capability adequately. In Table 41, for instance, simply stating that the student
will “produce a short-answer test” does not adequately establish a target; therefore, additional
detail is given parenthetically.
When establishing the capability to be assessed, it is important to identify the type of
knowledge involved (declarative or procedural). In this book, the term information is used to
refer to declarative knowledge and terms concept, rule, and complex skill are used as the names
for the subtypes of procedural knowledge.
Identifying the type of knowledge involved is also necessary, because different types of
student performance are used to indicate whether a particular type of capability has been learned.
Table 43 involves demonstrating how to use the phone to call the police in an emergency. Unlike
the previous example, this skill involves declarative knowledge, or what is called information.
This particular skill requires the student to restate what he or she can do. No application of a
concept or rule to a new situation is involved. Instead, the student recalls what is to be done.
With this skill, the important performance is the immediate dialing of the police emergency
number. The specification in Table 42 indicates that the phone used is to be similar to what the
student would use at home.

Establishing the Performance to be Observed


As illustrated in Table 41, the second major component of a performance assessment
specification establishes the performance to be observed. This is divided into five parts:
Description of Performance is a brief narrative description of what the teacher will see the
student do. In some cases, particularly when the goal to be assessed is clearly stated, this
description of performance is similar or even identical to the goal. With many goals, however, any
of a number of performances could be used in the performance assessment. For instance, either
an electronic or book version of a dictionary might be used to see if students can find the
meaning of unknown words. Or any of a number of tasks could be selected to see whether
students can use ratios of area to solve common problems. This part of our specification
establishes the specific behavior to be observed in the performance assessment.
Required materials identify physical items that will be required in the performance
assessment.
Guidelines for administration establish what the teacher must do in order to administer
the performance assessment. This particular part of the specification also identifies prompts that
can be provided to students during the performance assessment.
Instructions to students state either verbatim or in outline form the directions that will be
provided to students at the beginning of the performance assessment.
Which will be scored, process or product? Here, establish whether the performance
assessment will focus on a process or product. This decision usually affects the performance to
be observed, and certainly influences the scoring plan.
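The five parts listed above can be gathered into a single record. This is only a structural sketch; the field contents shown are hypothetical, not the actual Table 41 entries.

```python
from dataclasses import dataclass

@dataclass
class PerformanceSpecification:
    """Sketch of the 'performance to be observed' component of a specification."""
    description: str                # narrative of what the teacher will see the student do
    required_materials: list        # physical items needed for the assessment
    administration_guidelines: str  # what the teacher must do, including allowed prompts
    student_instructions: str       # directions given at the start of the assessment
    scoring_focus: str              # "process" or "product"

# Hypothetical example, not the actual Table 41 content.
spec = PerformanceSpecification(
    description="Student uses a dictionary to find the meaning of unfamiliar words.",
    required_materials=["dictionary", "list of five unfamiliar words"],
    administration_guidelines="Administer individually; prompt only if the student stalls.",
    student_instructions="Look up each word and write its meaning in your own words.",
    scoring_focus="product",
)
print(spec.scoring_focus)  # prints: product
```

Keeping the five parts together in one record makes it easy to check, before administering the assessment, that none of them has been left unspecified.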

When establishing the performance to be observed, four issues should be considered:


1. Does this performance assessment present a task relevant to the instructional goal?
2. Are the number and nature of qualities to be observed at one time sufficiently limited to
allow accurate assessment?
3. Are the conditions under which the performance assessment will occur clearly
established?
With performance assessments, one can either wait for a student’s behavior to occur
naturally, without prompting, or one can structure the stimulus by telling the student what
to do. When the stimulus is structured, the instructions to the student play the major role
in specifying conditions. If a naturally occurring stimulus is to be used, the instructions to
the observer must address the following issues;
 The conditions that must exist for the observation to begin or what conditions will
cause the observation to begin.
 How long the observation is to be or what condition will terminate the
observation.
 To complete the assessment, how many times a student must be observed.
 During the observation, what actions by others and what circumstances nullify the
assessment.
4. If the stimulus is structured, are instructions to the student concise and complete?
