
Faa Ada280477 Dot Faa Rd-93 5


REPORT DOCUMENTATION PAGE

OMB No. 0704-0188

Public reporting burden for this collection of information is estimated to average 1 hour per response, including the
time for reviewing instructions, searching existing data sources, gathering and maintaining the data needed, and
completing and reviewing the collection of information. Send comments regarding this burden estimate or any other
aspect of this collection of information, including suggestions for reducing this burden, to Washington Headquarters
Services, Directorate for Information Operations and Reports, 1215 Jefferson Davis Highway, Suite 1204,
Arlington, VA 22202-4302, and to the Office of Management and Budget, Paperwork Reduction Project (0704-0188),
Washington, DC 20503.

1. AGENCY USE ONLY (Leave blank)

2. REPORT DATE
July 1993

3. REPORT TYPE AND DATES COVERED
Final Report - June 1989-Sept 1992

4. TITLE AND SUBTITLE

Human Factors for Flight Deck Certification Personnel

5. FUNDING NUMBERS
FA3E2/A3093

6. AUTHOR(S)
Kim Cardosi and M. Stephen Huntley
7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES)
U.S. Department of Transportation
Research and Special Programs Administration
John A. Volpe National Transportation Systems Center
Cambridge, MA 02142-1093

8. PERFORMING ORGANIZATION REPORT NUMBER
DOT-VNTSC-FAA-93-4

9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES)
U.S. Department of Transportation
Federal Aviation Administration
Research and Development Service
800 Independence Ave., S.W.
Washington, DC 20591

10. SPONSORING/MONITORING AGENCY REPORT NUMBER
DOT/FAA/RD-93/5

11. SUPPLEMENTARY NOTES


12a. DISTRIBUTION/AVAILABILITY STATEMENT
This document is available to the public through the National Technical Information Service,
Springfield, VA 22161

12b. DISTRIBUTION CODE

13. ABSTRACT (Maximum 200 words)
This document is a compilation of proceedings and lecture material on human
performance capabilities that was presented to FAA flight deck certification
personnel. A five-day series of lectures was developed to provide certification
specialists with information on fundamental characteristics of the human operator that
are relevant to flight deck operations. The series was designed to proceed from the
presentation of basic information on human sensory capabilities, through human
cognition, to the application of this knowledge to the design of controls and displays
in the automated cockpit. The initial lectures were prepared and presented by
published academic researchers. The later ones were presented by senior human factors
practitioners employed by major American airframe manufacturers.

14. SUBJECT TERMS
Human Factors, Cockpit, Automation, Display Design, Human
Performance, Human Engineering, Perception, Sensation,
Attention, Workload, Evaluation

15. NUMBER OF PAGES
416

16. PRICE CODE

17. SECURITY CLASSIFICATION OF REPORT
Unclassified

18. SECURITY CLASSIFICATION OF THIS PAGE
Unclassified

19. SECURITY CLASSIFICATION OF ABSTRACT
Unclassified

20. LIMITATION OF ABSTRACT

NOTICE
This document is disseminated under the sponsorship
of the Department of Transportation in the interest
of information exchange. The United States Government
assumes no liability for its contents or use thereof.

NOTICE
The United States Government does not endorse
products or manufacturers. Trade or manufacturers'
names appear herein solely because they are considered
essential to the object of this report.

U.S. Department of Transportation
Research and Special Programs Administration

John A. Volpe National Transportation Systems Center
Kendall Square
Cambridge, Massachusetts 02142

May 1994

Human Factors for Flight Deck Certification Personnel


Final Report - July 1993

ERRATA

Due to an oversight on the part of the printer, certain last-minute changes to this Final Report
were not added. As a result, it is necessary for us to enclose a revised copy of page 44. Please
accept our apologies. In the future, we intend to print a new edition of Human Factors for
Flight Deck Certification Personnel that contains these additional changes.

Human Factors for Flight Deck Certification Personnel

that color discriminations that depend on S cones will be impaired if the image
is sufficiently small to fall only on the center of the fovea. This is illustrated by
Figure 3.3. When viewed close, so that the visual angle of each circle subtends
several degrees, it is easy for an individual with normal color vision to
discriminate the yellow vs. white and red vs. green. Viewed from a distance of
several feet, however, the yellow and white will be indiscriminable. This is
called small-field tritanopia, because tritanopes are individuals who completely
lack S cones. A tritanope would not be able to discriminate the yellow from the
white in Figure 3.3 regardless of their sizes. With certain small fields, even
normal individuals behave like tritanopes. Notice that even from a distance, the
red-green pair is still discriminable because S cones are not necessary for this
discrimination. Thus, the small-field effect is limited to discriminations that
depend on S cones. (Note: Due to technical difficulties in reproducing colors,
individuals with normal color vision may still be able to discriminate the yellow
and white semicircles at a distance.)
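
The dependence on viewing distance comes from the visual angle that each circle subtends at the eye. A minimal sketch of the standard geometric relation follows. The 1-inch circle diameter and the two viewing distances are hypothetical values chosen for illustration (the report does not state the size of the circles in Figure 3.3), and the function name is an assumption.

```python
import math


def visual_angle_deg(size, distance):
    """Visual angle in degrees subtended by an object.

    `size` and `distance` must be in the same units. Uses the standard
    relation angle = 2 * atan(size / (2 * distance)).
    """
    return math.degrees(2.0 * math.atan(size / (2.0 * distance)))


# Hypothetical example: a 1-inch circle viewed from 1 foot and from 6 feet.
circle_diameter_in = 1.0  # assumed size, not taken from the report
for distance_in in (12.0, 72.0):
    angle = visual_angle_deg(circle_diameter_in, distance_in)
    print(f"{distance_in:5.0f} in away: {angle:.2f} degrees of visual angle")
```

Under these assumed values the circle subtends several degrees up close and under one degree from six feet, which is the kind of change in image size the passage describes.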

Figure 3.3. Colors (yellow and white) not discriminable at a distance due to small-field
tritanopia


METRIC/ENGLISH CONVERSION FACTORS

ENGLISH TO METRIC

LENGTH (APPROXIMATE)
1 inch (in) = 2.5 centimeters (cm)
1 foot (ft) = 30 centimeters (cm)
1 yard (yd) = 0.9 meter (m)
1 mile (mi) = 1.6 kilometers (km)

AREA (APPROXIMATE)
1 square inch (sq in, in²) = 6.5 square centimeters (cm²)
1 square foot (sq ft, ft²) = 0.09 square meter (m²)
1 square yard (sq yd, yd²) = 0.8 square meter (m²)
1 square mile (sq mi, mi²) = 2.6 square kilometers (km²)
1 acre = 0.4 hectares (he) = 4,000 square meters (m²)

MASS - WEIGHT (APPROXIMATE)
1 ounce (oz) = 28 grams (gr)
1 pound (lb) = .45 kilogram (kg)
1 short ton = 2,000 pounds (lb) = 0.9 tonne (t)

VOLUME (APPROXIMATE)
1 teaspoon (tsp) = 5 milliliters (ml)
1 tablespoon (tbsp) = 15 milliliters (ml)
1 fluid ounce (fl oz) = 30 milliliters (ml)
1 cup (c) = 0.24 liter (l)
1 pint (pt) = 0.47 liter (l)
1 quart (qt) = 0.96 liter (l)
1 gallon (gal) = 3.8 liters (l)
1 cubic foot (cu ft, ft³) = 0.03 cubic meter (m³)
1 cubic yard (cu yd, yd³) = 0.76 cubic meter (m³)

TEMPERATURE (EXACT)
[(x - 32)(5/9)] °F = y °C

METRIC TO ENGLISH

LENGTH (APPROXIMATE)
1 millimeter (mm) = 0.04 inch (in)
1 centimeter (cm) = 0.4 inch (in)
1 meter (m) = 3.3 feet (ft)
1 meter (m) = 1.1 yards (yd)
1 kilometer (km) = 0.6 mile (mi)

AREA (APPROXIMATE)
1 square centimeter (cm²) = 0.16 square inch (sq in, in²)
1 square meter (m²) = 1.2 square yards (sq yd, yd²)
1 square kilometer (km²) = 0.4 square mile (sq mi, mi²)
1 hectare (he) = 10,000 square meters (m²) = 2.5 acres

MASS - WEIGHT (APPROXIMATE)
1 gram (gr) = 0.036 ounce (oz)
1 kilogram (kg) = 2.2 pounds (lb)
1 tonne (t) = 1,000 kilograms (kg) = 1.1 short tons

VOLUME (APPROXIMATE)
1 milliliter (ml) = 0.03 fluid ounce (fl oz)
1 liter (l) = 2.1 pints (pt)
1 liter (l) = 1.06 quarts (qt)
1 liter (l) = 0.26 gallon (gal)
1 cubic meter (m³) = 36 cubic feet (cu ft, ft³)
1 cubic meter (m³) = 1.3 cubic yards (cu yd, yd³)

TEMPERATURE (EXACT)
[(9/5)y + 32] °C = x °F

QUICK INCH-CENTIMETER LENGTH CONVERSION
[Scale figure: inches 0 to 10 aligned with centimeters 0 to 25.40]

QUICK FAHRENHEIT-CELSIUS TEMPERATURE CONVERSION
°F   -40   -22    -4    14    32    50    68    86   104   122   140   158   176   194   212
°C   -40   -30   -20   -10     0    10    20    30    40    50    60    70    80    90   100

For more exact and or other conversion factors, see NBS Miscellaneous Publication 286, Units of Weights and
Measures. Price $2.50. SD Catalog No. C13 10286.
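
The two exact temperature formulas in the conversion table can be sketched as a pair of functions. This is a minimal illustration only: the function names are assumptions, and x and y follow the table's own usage (x in degrees Fahrenheit, y in degrees Celsius).

```python
def fahrenheit_to_celsius(x):
    """Exact conversion from the table: y = (x - 32)(5/9)."""
    return (x - 32.0) * 5.0 / 9.0


def celsius_to_fahrenheit(y):
    """Exact conversion from the table: x = (9/5)y + 32."""
    return y * 9.0 / 5.0 + 32.0


# Spot-check a few points that also appear on the quick-conversion scale.
for deg_f in (-40.0, 32.0, 212.0):
    deg_c = fahrenheit_to_celsius(deg_f)
    print(f"{deg_f:6.1f} F = {deg_c:6.1f} C")
```

The two scales read the same value at -40, which makes that point a convenient sanity check for either function.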

Preface
Flight test pilots who perform aircraft certification and evaluation
functions for the FAA are frequently required to make important
decisions regarding the human factors aspects of cockpit design. Such
decisions require a thorough understanding of the operational conditions
under which cockpit systems are used as well as the performance limits
and capabilities of the flight crews who will use these systems. In the
past, the limits of control and display technology and the test pilot's
familiarity with the knobs and dials of traditional aircraft have provided
useful references from which to judge the safety and utility of cockpit
displays and controls. Today, however, with the advent of the
automated cockpit, and the almost limitless information configurations
possible with CRT and LCD displays, evaluators are being asked to go far
beyond their personal experience to make certification judgments.

A survey of human factors handbooks, advisory circulars and even formal
human factors courses revealed little material on human performance
that was formatted in a fashion that would provide useful guidelines to
certification personnel for human factors evaluations in the cockpit.
Most sources of human factors information are of limited use in
evaluating advanced technology cockpits because they are out of date
and do not consider the operational and cockpit context within which the
newly designed controls and displays are to be used.

It will be some time before the human factors issues concerning
interaction with electronic cockpits are well defined and there is
sufficient information and understanding available to support the
development of useful handbooks. In lieu of such guidance, a series of
one-week seminars on human factors issues relevant to cockpit display
design was conducted for approximately 120 FAA certification personnel.
The lectures were given by researchers and practitioners working in the
field. The lectures included material on the special abilities and
limitations of the human perceptual and cognitive system, concepts in
display design, testing and evaluation, and lessons learned from the
designers of advanced cockpit display systems. The contents of this
document were developed from the proceedings of the seminars.

I wish to thank a number of my friends and colleagues for their
important contribution to this document. I am deeply indebted to Dr.
Kim Cardosi, Dr. Peter D. Eimas, Mr. Delmar M. Fadden, Dr. Richard F.
Gabriel, Dr. Christopher D. Wickens, and Dr. John Werner, the principal
authors of the material in this book. Each of them is a respected and
highly productive professional in his or her own field and has
contributed rare and valuable time to this activity. Clearly there would
be no document without their contributions.

I am particularly grateful to Dr. Kim Cardosi, the project manager, for
her editing and able administration of much of the work that culminated
in this report. This work included the management of the four seminars
as well as the organization of the resulting proceedings into the textbook
format of the current document.

Special thanks are due to Mr. Paul McNeil, Mr. Arthur H. Rubin, and Mr.
Jim Green of EG&G Dynatrend Corp. for their many hours and tireless
efforts in assembling and publishing the manuscripts included herein, and
to Ms. Rowena Morrison of Battelle for her insightful and thoughtful
support in editing particularly troublesome sections of this work.

The four seminars and the publishing of the resulting report were
generously funded through the Federal Aviation Administration's Flight
Deck Human Factors Research Program managed by Mr. William F.
White.

M.S. Huntley, Ph.D.
Manager, Cockpit Human Factors Program


CONTENTS

Chapter 1  Auditory Perception
    Physical Properties of Sound
    Frequency and Intensity Relations to Perception
    The Effects of Aging
    Effects of Exposure
    Sound Localization
    Habituation and Adaptation
    Ambient Noise (Masking)

Chapter 2  Basic Visual Processes
    Physical Properties of Light
    The Eye
        Accommodation
        Aging and Presbyopia
        Ocular Media Transmission and Aging
    Rods and Cones
        Variation with Retinal Eccentricity
        Spectral Sensitivity
        Luminance
        Dark Adaptation
        Sensitivity/Resolution Trade-Off
        Damage Thresholds
    Eye Movements
    Temporal Vision
        Flicker
        Motion

Chapter 3  Color Vision
    Color Mixture
    Variation in Cone Types with Retinal Eccentricity
    Color Vision Deficiencies
        Congenital Deficiencies
        Acquired Deficiencies
        Variation with Age
        Testing
        Note of Caution
    Color Appearance
        Chromatic and Achromatic Colors
        Variations with Intensity
        Variations with Retinal Eccentricity
    Wavelength Discrimination and Identification
        Range of Discrimination
        Range of Identification
        Implications for Color Displays
    Contrast Effects
        Successive Contrast
        Simultaneous Contrast
        Assimilation
    Adaptation
        Chromatic Adaptation
        Variation Under Normal Conditions
    Color Specification
        CIE System
        Munsell System
        Implications for Displays

Chapter 4  Form and Depth
    Edges and Borders
    Contrast Sensitivity
        Variation with Luminance
        Variation with Retinal Eccentricity
        Variation with Age
        Implications for Displays
    Form-Color Interactions
    Depth Perception
        Monocular Depth Cues
        Ocular Convergence
        Stereopsis
        Binocular Rivalry
        Color Stereopsis
        Implications for Displays

Chapter 5  Information Processing
    What Is the Mind?
    The Brain as an Information Processor
    Attention
    Selective Attention
    The Cost of Multiple Tasks
    Automatic and Controlled Processing
    Expectation
    Pattern Recognition
    Speech Perception
    Memory
        The Sensory Store
        Short-Term Memory
        Long-Term Memory

Chapter 6  Display Compatibility and Attention
    Display Compatibility
    Attention
        Focused Attention
        Divided Attention
        Selective Attention
        Head-Up Displays
        HUD Optics
        Physical Characteristics
        Symbology
        Attention Issues

Chapter 7  Decision Making
    The Decision-Making Process
    Pilot Judgment
    Biases in Situation Assessment
        Salience Bias
        Confirmation Bias
        Anchoring Heuristic
        Base Rate of Probability
        Availability Heuristic
        Representativeness Heuristic
        Overconfidence Bias
        Risk Assessment
        Stress and Decision Making
    Lessening Bias in Decision Making
    High Speed Decision Making: The Choice of Action
        Decision Complexity
        Expectancy
        Context
        The Speed-Accuracy Trade-off
        Signal and Response Discriminability
        Practice
        The Decision Complexity Advantage
        Following Checklist Procedures
        Response Feedback
        Display-Control (Stimulus-Response) Compatibility
        Stress and Action Selection
    Negative Transfer

Chapter 8  Timesharing, Workload, and Human Error
    Divided Attention and Timesharing
        Sampling and Scheduling
        Confusion
        Resources
        Workload
        Workload Prediction
        Multiple Resources
        Workload Assessment
        Primary Task Performance Measures
        Secondary Task Performance
        Subjective Measures of Workload
        Physiological Measures of Workload
        A Closed-Loop Model of Workload
        Underload
    Sleep Disruption
        Characteristics of Sleep
        Sleep Disruption in Pilots
        Recommendations
    Human Error
        Categories of Human Error
        Error Remediation and Safeguards
        Error in a Systems Context

Chapter 9  Cockpit Automation
    Introduction
    Definition
    Summary of Aviation Automation Concerns
    Experience with Automation in Nonaviation Systems
    Nuclear Power Studies
    Office Automation
    Experience with Automation in Aviation
    Accident Data
    Incident Data
    Pilot Opinion
    Reasons Cited for Automating Systems
    Some Automation Concerns
        Loss of Situation Awareness
        Loss of Proficiency
        Reduced Job Satisfaction
        Overconfidence in the Aircraft
        Intimidation by Automation and/or Complacency
        Increased Training Requirements
        Inability of the Crew to Exercise Authority
        Design-Induced Error
    Design Practices
        Traditional Design Approach
        Automation Philosophy
        The Influence of Crew Role on Design
    Human Factors
        How Human Factors Relate to Automation Design
        "Soft" Sciences and the Need for Testing
        The Problem of Criteria
    Conclusions
    Recommendations

Chapter 10  Display Design
    Display Development Process
    Requirements
    Design
    Evaluation
    Operation
    General Design Issues
        Opportunities for Standardization
    Use of Color
    Eye Fatigue
    Time Shared Information
    Command vs. Situation-Prediction Displays
    Future Display Issues

Chapter 11  Workload Assessment
    Workload Methodology
    Commercial Aircraft Workload
        Workload Assessment Scheduling
        Workload Assessment Criteria
        Timeline Analysis
        Task-Time Probability
        Pilot Subjective Evaluation
    Certification Considerations
        Early Requirements Determination
        Mandatory Indicators and Displays in Integrated Cockpits
        Airline Differences
    Coping with Pilot Error
        Error Types
        Error Tolerant Design
    Future Workload Issues

Chapter 12  Human Factors Testing and Evaluation
    Introduction
    When Is a Human Factors Test Warranted?
    How Is Human Performance Measured
        Components of Response Time
        Factors Affecting Human Performance
        "Real World" Data on Pilot Response Time
    What Method of Testing Should Be Used
        Field Observations
        Questionnaires
        Rating Scales
        Laboratory Experiments
    Experimental Validity and Reliability
        Operationally Defined Variables
        Representative Subject Pool
        Controlling Subject Bias
        Representative Test Conditions
        Counter-balancing
    How Should Test Results Be Analyzed
        Descriptive Statistics
        Inferential Statistics
        Statistical Significance
        Analysis of Variance
        An Example
        Regression Analysis
        Statistical vs. Operational Significance

REFERENCES
INDEX

Human Factors for Fliht Deck Certification Penonnel

LIST OF FIGURES
Figure 1.1

Figure 1.2
Figure 1.3
Figure 1.4
Figure 2.1
Figure 2.2

Page

Changes in air pressure shown for two sound waves


differing in frequency and amplitude (top). When
added together (bottom), the two pure tones form
a complex sound...............................

Amplitude spectra of a C note played on three


different instruments ...........................

Variation of absolute threshold with sound frequency


for a young adult .. ...........................

High and low frequency sound waves emanating


from a source to the right of a person's head ...........

Regions of the electromagnetic spectrum and their


corresponding wavelengths ......................

12

Relative energy of fluorescent lamps plotted as a


function of wavelength: 1 = standard warm white,
2 = white, 3 = standard cool white, 4 = daylight. .....

13

Sunlight energy plotted as a function of wavelength


for a surface facing away from (300 solar altitude)
or toward the sun (80 solar altitude) ................

14

Figure 2.4

Cross section of the human eye; visual angle .........

15

Figure 2.5

Image formation in emmetropic (normal),


hypermetropic, and myopic eyes ...................

16

Figure 2.6

Near point plotted as a function of age..............

17

Figure 2.7

Optical density of the human lens plotted as a


function of wavelength.. .......................

18

Optical density of human ocular media at 400 um


plotted as a function of age.....................

18

Various cell types in the primate retina ..............

19

Figure 2.3

Figure 2.8
Figure 2.9

xiv

Contents

Figure 2.10

The number of rods and cones plotted as a function


of retinal eccentricity ..........................

20

Recommended placement of visual alert and other


high priority signals relative to the line of sight. .......

21

Log relative sensitivity plotted separateiy for rods


and cones as a function of wavelength.. ............

22

Threshold decrease during adaptation to darkness


showing that cones (top branch) and rods
(bottom branch) adapt at different rates .............

24

Figure 2.14

Visual acuity plotted as a function of log luminance .....

26

Figure 2.15

Relative sensitivity to retinal damage plotted as a


function of wavelength.. .......................

27

Figure 2.16

Visual field for humans: about 1800 ................

28

Figure 2.17

Eye movements while viewing pictures; small dots


are fixations ................................

29

Eye movement records illustrating physiological


nystagmus .. ...............................

31

The dark lines show retinal blood vessels. The


central circle, relatively free of blood vessels,
represents the fovea ...........................

32

Critical flicker fusion for a centrally viewed stimulus


plotted as a function of log luminance. Different curves
show different stimulus sizes .....................

33

Critical flicker fusion for a 20 stimulus plotted


as a function of log luminance. Different curves show
CFF for different retinal loci .....................

34

Critical flicker fusion plotted as a function of age


for six different studies which used different stimulus
conditions ..................................

35

Figure 2.11
Figure 2.12
Figure 2.13

Figure 2.18
Figure 2.19

Figure 2.20

Figure 2.21

Figure 2.22

xv

Human Factors for Flight Deck Certification Personnel

Figure 2.23

Figure 2.24

Figure 2.25

Modulation amplitude (r%/o) of the fundamental


component contained in the waves plotted as a
function of flicker frequency ....................

36

If the two sets of dots are superimposed, no pattern


can be detected. However, if one set of dots moves
relative to the other, the word "motion" will be
clearly visible ..............................

37

Illustration of experiment by Brown (1931). The left


circle must move faster than the one on the right for
the two to be perceived as moving at the same speed ...

38

Absorption of the cone and rod photopigments plotted


as a function of wavelength. The curves have been
normalized to the same heights ..................

42

The number of short-, middle-, and long-wave


sensitive cones per square mm in a baboon retina
as a function of retinal eccentricity ................

43

Colors (yellow and white) not discriminable at a


distance due to small field tritanopia ...............

44

Log sensitivity of short-, middle-, and long-wave cones,


measured psychophysically, plotted as a function of
observer age .................................

48

A schematic of the split field produced by an


anomaloscope ...............................

49

A pseudoisochromatic plate from the Dvorine Plate


Test for color vision deficiencies ...................

50

Farnsworth Dichotomous Test of Color Blindness,


Panel D-15 ..................................

50

Figure 3.8

An illustration of simultaneous brightness contrast ......

52

Figure 3.9

Average color-naming data obtained for three normal


trichromats plotted for wavelengths presented at equal
luminance ..................................

53

Illustration of relations between hue and saturation

55

Figure 3.1

Figure 3.2

Figure 3.3
Figure 3.4

Figure 3.5
Figure 3.6
Figure 3.7

Figure 3.10

xvi

....

Contents

Figure 3.11  Color-naming results plotted as a function of stimulus intensity for four observers .... 56
Figure 3.12  Zones in the visual field of the right eye in which various colors can be seen .... 57
Figure 3.13  Wavelength difference required for discrimination independent of intensity plotted as a function of wavelength .... 58
Figure 3.14  A demonstration of successive color contrast .... 61
Figure 3.15  A demonstration of simultaneous color contrast .... 61
Figure 3.16  A demonstration of assimilation, the Bezold spreading effect .... 62
Figure 3.17  Chromaticity diagram showing stimuli that appear white under dark-adapted condition (central x) and following adaptation to chromatic backgrounds (filled circles) .... 63
Figure 3.18  Chromaticity diagram showing how the color gamut of a display decreases with increasing sunlight .... 64
Figure 3.19  CIE tristimulus values for a 2° standard observer plotted as a function of wavelength .... 66
Figure 3.20  CIE color diagram .... 67
Figure 3.21  CIE, u', v' chromaticity diagram based on a 2° standard observer .... 68
Figure 3.22  Schematic of the Munsell color solid .... 69
Figure 4.1  Demonstration of the Craik-Cornsweet-O'Brien illusion .... 73
Figure 4.2  Illustration of Mach bands .... 74
Figure 4.3  Vertical sine-wave gratings and their luminance distributions .... 75

Figure 4.4  Contrast sensitivity as a function of spatial frequency .... 76
Figure 4.5  Demonstration of size-selective adaptation .... 77
Figure 4.6  Contrast sensitivity is plotted as a function of spatial frequency for young, adult observers .... 78
Figure 4.7  Contrast sensitivity measured at different retinal eccentricities is plotted in the graph on the left as a function of spatial frequency. The graph on the right shows contrast sensitivity obtained at the same retinal eccentricities but with a stimulus that was scaled according to "neural" coordinates .... 79
Figure 4.8  Contrast sensitivity as a function of spatial frequency for different age groups .... 80
Figure 4.9  Sine-wave (left) and square-wave (right) gratings of the same spatial frequency .... 81
Figure 4.10  Illustration of Fourier synthesis of a square-wave (top left) and waveform changes as various sinusoidal components are added .... 81
Figure 4.11  Illustration of Fourier synthesis of a complex image by the successive addition of sinusoidal components in two dimensions .... 82
Figure 4.12  Illustration showing how the size of an object influences the perception of distance .... 84
Figure 4.13  Illustration of interposition as a monocular cue for distance .... 85
Figure 4.14  Illustration of how linear perspective makes the same size objects appear to be different sizes .... 86
Figure 4.15  Illustration of texture gradients as a cue to distance .... 87
Figure 4.16  Illustration of motion parallax .... 88
Figure 4.17  Illustration of motion perspective for a person who is moving and fixating straight ahead .... 88

Figure 4.18  Schematic illustration of binocular disparity .... 89
Figure 4.19  A random-dot stereogram .... 90
Figure 4.20  An illustration of color stereopsis .... 92
Figure 5.1  Boxology diagram of mental processing .... 94
Figure 5.2  Example of use of contextual cues to identify an ambiguous signal .... 103
Figure 5.3  Example of a four-by-three matrix of letters and numbers shown to subjects to illustrate sensory store capacity of short-term memory .... 110
Figure 6.1  Different altimeter displays illustrating the principles of pictorial realism and of the moving part .... 117
Figure 6.2  Vertical column engine indicators for a four-engine aircraft .... 121
Figure 6.3  (a) Example of good display organization. (b) Example of poor display organization .... 122
Figure 6.4  Sample of head-up display (HUD) .... 123
Figure 6.5  Touchdown dispersions with and without HUD for nonprecision approaches .... 125
Figure 6.6  Example of HUD stimuli used in experiment, and graph showing results of tests of pilot's ability to switch from near to far information .... 127
Figure 7.1  A model of information processing .... 134
Figure 7.2  A model of human decision making .... 136
Figure 7.3  Example of how an illustration can be used to avoid technical jargon and improve comprehension .... 151
Figure 7.4  Different possible orthogonal display-control configurations .... 154

Figure 7.5  Examples of population stereotypes in control relations .... 155
Figure 7.6  Illustration of how a "cant," i.e., angling controls to be partially parallel to displays will reduce compatibility ambiguity .... 157
Figure 7.7  Example of display-to-control compatibility on a vertical speed window .... 158
Figure 7.8  The 'sweep-on' switch position concept which is slowly replacing the earlier 'forward-on' arrangement .... 159
Figure 8.1  Graph of how performance is a function of the difficulty of primary and secondary tasks .... 168
Figure 8.2  Model of workload .... 169
Figure 8.3  Example of workload time history profile as produced by Timeline Analysis Program .... 172
Figure 8.4  Pilot internal vision tasking in advanced flight deck for weather avoidance .... 173
Figure 8.5  Graph showing workload assessment .... 179
Figure 8.6  Relationship between gain and error .... 181
Figure 8.7a  The Bedford pilot workload rating scale .... 184
Figure 8.7b  The Cooper-Harper pilot workload rating scale .... 184
Figure 8.8  Graph plotting inter-beat time intervals for heartbeat over a two-minute period. A birdstrike appears at approximately 35 seconds .... 187
Figure 8.9  Graph plotting inter-beat time intervals for heartbeat over a two-minute period. Note the reduction in variability at t=40, with no corresponding change in mean heart rate .... 188
Figure 8.10  (a) Static and (b) dynamic concept of workload .... 189
Figure 8.11  Graph of sleep duration and its relationship to circadian rhythm .... 191

Figure 8.12  Mean sleep latencies for 21-year-olds and 70-year-olds .... 192
Figure 8.13  Graphs showing how human performance varies during the day with a rhythm corresponding to body temperature .... 194
Figure 8.14  Graphs showing desynchronization on east- and westbound flights across time zones .... 196
Figure 8.15  Average resynchronization of variables for eight post-flight days .... 198
Figure 9.1  A Timeline of the Development of Aircraft Automation .... 211
Figure 9.2  Boeing statistical summary of primary cause factors for accidents .... 216
Figure 9.3  Human Error in ASRS Reports .... 218
Figure 9.4  Boeing guidelines for crew function assignment .... 230
Figure 9.5  Hypothetical relationship between workload and performance .... 236
Figure 9.6  Beneficial adjustment of flight crew workload by phase of flight .... 237
Figure 10.1  Display Development Flowchart .... 245
Figure 10.2  Initial 747-400 PFD and ND Heading and Track Symbology .... 253
Figure 10.3  Navigation Display Heading Pointer Shape Change .... 254
Figure 10.4  Consistent Shapes for Heading and Track Pointers .... 255
Figure 10.5  Consistent 747-400 PFD and ND Heading and Track Symbology .... 256
Figure 10.6  Variable radius circular arc symbol whose radius varies with the current turn rate .... 266

Figure 11.1  A Typical Five-Year Workload Assessment Program .... 275
Figure 11.2  Systems Normal Procedures Workload Results for Various Airplanes .... 280
Figure 11.3  Systems Nonnormal Procedures Workload Results for Various Airplanes .... 281
Figure 11.4  Mission Profile Visual Activity Time Demand, 747-400 and 737-200 .... 288
Figure 11.5  Mission Activity Channel Time Demand Summary .... 289
Figure 11.6  Description of Workload Evaluation Function and Factor Combinations .... 293
Figure 11.7  Evaluator Background Data Sheet, Pilot Subjective Evaluation Questionnaire .... 294
Figure 11.8  Departure Information Data Sheet, Pilot Subjective Evaluation Questionnaire .... 295
Figure 11.9  Departure Workload Rating Sheet, Pilot Subjective Evaluation Questionnaire .... 296
Figure 11.10  Nonnormal Operations Workload Rating Sheet, Pilot Subjective Evaluation Questionnaire .... 297
Figure 11.11  Rating Boxes Used in the Boeing Pilot Subjective Evaluation Questionnaire .... 298

List of Tables
Table 1.1  The Decibel Scale
Table 3.1  Congenital Color Vision Deficiencies .... 46
Table 5.1  Memory Structures .... 108
Table 7.1  Matrix Showing Error Probability Due to Transfer .... 162
Table 8.1  Workload Component Scales for the UH-60A Mission/Task/Workload Analysis .... 176
Table 8.2  Shift Rates after Transmeridian Flights for Some Biological and Performance Functions .... 197
Table 9.1  The Spectrum of Automation in Decision Making .... 212
Table 9.2  Conclusions Based on Research in Nonaviation Automation .... 214
Table 9.3  Probability of an Individual Being Killed on a Non-Stop U.S. Domestic Trunkline Flight .... 217
Table 9.4  ASRS Flight Management System (FMS)/Control Display Unit (CDU) Analysis .... 218
Table 9.5  Analysis of DC-9/MD-80 Service Difficulty Reports .... 219
Table 9.6  Reasons Cited for Automating Systems .... 221
Table 9.7  Ironies of Automation .... 222
Table 9.8  Cognitive Factors Influencing Design Elements .... 227
Table 9.9  Boeing's Automation Philosophy .... 228
Table 9.10  Design Philosophies .... 231
Table 9.11  Functions of the Human Operator .... 232
Table 9.12  Crew Role .... 233

Table 9.13  Some Psychological Topics Relevant to Automation .... 238
Table 9.14  Role of FAA Cockpit Certification Specialist(s) - Human Factors .... 241
Table 11.1  Subsystems Workload Data Summary - Normal Inflight Procedures .... 278
Table 11.2  Subsystems Workload Data Summary - Non-Normal Inflight Procedures .... 279
Table 11.3  Flight Procedure Workload Data Summary - Chicago to St. Louis Flight Totals .... 284
Table 11.4  Flight Instrument Visual Scan - Dwell Time and Transition Time and Transition Probability Summary .... 285
Table 11.5  Line Operation Visual Activity Time Demand - Average Percent of Time Available Devoted to Visual Tasks .... 286
Table 11.6  Average Time Devoted to Motor Tasks .... 287
Table 11.7  Probability of Being Busy with a Visual Task .... 290
Table 11.8  Probability of Being Busy with a Motor Task .... 291

ACKNOWLEDGEMENT OF PERMISSION TO REPRINT

Figure 1.2

From Fletcher, H. Speech and Hearing in Communication. (2nd
edit.) (Copyright © 1953) New York: Van Nostrand Co.
Reproduced by permission of the publisher.

Figure 1.3

From Fletcher, H. & Munson, W.A. Loudness, its definition,
measurement and calculation. (Copyright © 1933) Journal of the
Acoustical Society of America, 5, 82-108. Reproduced by
permission of the publisher.

Figure 1.4

From Werner, J.S. & Schlesinger, K. Psychology: Science of Mind,
Brain and Behavior. (Copyright © 1991) New York: McGraw-Hill.
Reproduced by permission of McGraw-Hill.

Figure 2.1

From figures in Sensation and Perception, Second Edition by
Stanley Coren, Clare Porac, and Lawrence M. Ward, copyright ©
1984 by Harcourt Brace & Company. Reproduced by permission
of the publisher.

Figure 2.2

From Wyszecki, G. & Stiles, W.S. Color Science: Concepts and
Methods, Quantitative Data and Formulae. (2nd ed.) (Copyright ©
1982) New York: John Wiley & Sons, Inc. Reproduced by
permission of John Wiley & Sons, Inc.

Figure 2.3

From Walraven, J. The colours are not on the display: a survey
of non-veridical perceptions that may turn up on a colour
display. This figure and article were first published in Displays
Vol 6, January 1985, pp 35-42 and the figure is reproduced here
with the permission of Butterworth-Heinemann, Oxford, UK.

Figure 2.4

From figures from Visual Perception by Tom N. Cornsweet,
copyright © 1970 by Harcourt Brace & Company. Reproduced by
permission of the publisher.

Figure 2.5

From figures in Sensation and Perception, Second Edition, by
Stanley Coren, Clare Porac, and Lawrence M. Ward, copyright ©
1984 by Harcourt Brace & Company. Reproduced by permission
of the publisher.

Figure 2.6

From Helps, E.P.W. Physiological effects of aging (copyright ©
1973). Proceedings of the Royal Society of Medicine, 66, 815-818.
Reproduced by permission of the Journal of the Royal Society of
Medicine.

Figure 2.8

From Werner, J.S., Peterzell, D. & Scheetz, A.J. Light, vision, and
aging - A brief review. (Copyright © 1990) Optometry and Vision
Science, 67, 214-229. Reproduced by permission of Williams &
Wilkins, American Academy of Optometry.

Figure 2.9

From Wyszecki, G. & Stiles, W.S. Color Science: Concepts and
Methods, Quantitative Data and Formulae. (2nd ed.) (Copyright ©
1982) New York: John Wiley & Sons, Inc. Reproduced by
permission of John Wiley & Sons, Inc.

Figure 2.10

From Osterberg, G. Topography of the layer of rods and cones in
the human retina. Acta Ophthalmologica, Supplement, (copyright
© 1935). Reproduced by permission of Scriptor Publisher ApS,
Copenhagen, Denmark.

Figure 2.12

From Judd, D.B. Basic correlates of the visual stimulus. In S.S.
Stevens (Ed.) Handbook of Experimental Psychology, (copyright ©
1951) New York: John Wiley & Sons, Inc., 811-867. Reproduced
by permission of John Wiley & Sons, Inc.

Figure 2.13

From Graham, C.H. Some fundamental data. In C.H. Graham
(Ed.), Vision and Visual Perception. (Copyright © 1965a) New
York: John Wiley & Sons, Inc., 68-80. Reproduced by permission
of John Wiley & Sons, Inc.

Figure 2.14

From Hecht, S. Vision: II. The nature of the photoreceptor
process. In C. Murchison (Ed.), A Handbook of General
Experimental Psychology. (Copyright © 1934) Worcester, MA:
Clark University Press, 704-828. Reproduced by permission of the
publisher.

Figure 2.16

From Sekuler, R. & Blake, R. Perception. (Copyright © 1985) New
York: McGraw-Hill, Inc. Reproduced by permission of McGraw-Hill.

Figure 2.17

From Yarbus, A.L. Eye Movements and Vision. (Copyright © 1967)
New York: Plenum Publishing Corp. Reproduced by permission of
the publisher.

Figure 2.18

From Ditchburn, R.W. Eye movements in relation to retinal
action. (Copyright © 1955) Optica Acta, 1, 171-176. Reproduced
by permission of Taylor & Francis Ltd., London, U.K.

Figure 2.19

From Polyak, S. The Retina. (Copyright © 1941) Chicago:
University of Chicago Press. Reproduced by permission of the
publisher. All rights reserved.

Figure 2.20

From Hecht, S. & Smith, E.L. Intermittent stimulation by light: VI.
Area and the relation between critical frequency and intensity.
Reproduction from the Journal of General Physiology, 1936, 19,
979-989. Reproduced by copyright permission of the Rockefeller
University Press.

Figure 2.21

From Hecht, S. & Verrijp, C.D. Intermittent stimulation by light. III.
The relation between intensity and critical fusion frequency for
different retinal locations. Reproduction from the Journal of
General Physiology, 1933, 17, 251-265. Reproduced by copyright
permission of the Rockefeller University Press.

Figure 2.22

From Weale, R.A. A Biography of the Eye. (Copyright © 1982)
London: H.K. Lewis & Co. Reproduced by permission of Chapman
& Hall, London, U.K.

Figure 2.23

From deLange, H. Research into the nature of the human
fovea-cortex systems with intermittent and modulated light. I.
Attenuation characteristics with white and colored light. (Copyright
© 1958) Washington, D.C.: Journal of the Optical Society of
America, 48, 777-784. Reproduced by permission of the publisher.

Figure 2.25

From Goldstein, E.B. Sensation and Perception 2/E, (copyright ©
1984) by Wadsworth, Inc. Reproduced by permission of the
publisher.

Figure 3.1

From Bowmaker, J.K., Dartnall, H.J.A. & Mollon, J.D.
Microspectrophotometric demonstration of four classes of
photoreceptors in an old world primate Macaca fascicularis.
(Copyright © 1980) Journal of Physiology, 298, 131-143.
Reproduced by permission of the Physiological Society, Oxford,
U.K.

Figure 3.2

From Marc, R.E. & Sperling, H.G. Chromatic organization of
primate cones. (Copyright © 1977 by AAAS) Science, 196, 454-456.
Reproduced by permission of the AAAS, Washington, D.C.

Figure 3.4

From Werner, J.S. & Steele, V.G. Sensitivity of human foveal color
mechanisms throughout the life span. (Copyright © 1988)
Washington, D.C.: Journal of the Optical Society of America, A, 5,
2122-2130. Reproduced by permission of the publisher.

Figure 3.6

From the Farnsworth Dichotomous Test of Color Blindness, Panel
D-15. (Copyright © 1947) The Psychological Corporation.
Reproduced by permission of The Psychological Corporation. All
rights reserved.

Figure 3.9

From Werner, J.S. & Wooten, B.R. Opponent-chromatic
mechanisms: Relation to photopigments and hue naming.
(Copyright © 1979) Washington, D.C.: Journal of the Optical Society
of America, 69, 422-434. Reproduced by permission of the
publisher.

Figure 3.10

From Hurvich, L.M. Color Vision. (Copyright © 1981) Sunderland,
MA: Sinauer Associates. Reproduced by permission of Leo Hurvich.

Figure 3.11

From Volbrecht, V.J., Aposhyan, H.M. & Werner, J.S. Perception of
electronic display colours as a function of retinal illuminance.
(Copyright © 1988) Displays, 9, 56-64. Reproduced by permission
of Butterworth-Heinemann Ltd., London, U.K.

Figure 3.12

From Hurvich, L.M. Color Vision. (Copyright © 1981) Sunderland,
MA: Sinauer Associates. Reproduced by permission of Leo Hurvich.

Figure 3.13

From Wright, W.D. & Pitt, F.H.G. Hue discrimination in normal
colour-vision. (Copyright © 1934) Proceedings of the Physical
Society, 46, 459-473. Reproduced by permission of the IOP, Bristol,
U.K.

Figure 3.14

From Hurvich, L.M. Color Vision. (Copyright © 1981) Sunderland,
MA: Sinauer Associates. Reproduced by permission of Leo Hurvich.

Figure 3.15

From Albers, J. Interaction of Color. (Copyright © 1975) New
Haven, CT: Yale University Press. Reproduced by permission of the
publisher.

Figure 3.16

From Evans, R.M. An Introduction to Color. (Copyright © 1948)
New York: John Wiley & Sons, Inc. Reproduced by courtesy of
Pauline Evans.

Figure 3.17

From Werner, J.S. & Walraven, J. Effect of chromatic adaptation
on the achromatic locus: The role of contrast, luminance and
background color. (Copyright © 1982) Vision Research, 22,
929-943. Reproduced by permission of Pergamon Press, plc.,
Elmsford, NY.

Figure 3.18

From Viveash, J.P. & Laycock, J. Computation of the resultant
chromaticity coordinates and luminance of combined and filtered
sources in display design. This figure and article were first
published in Displays Vol 4, January 1983, pp 17-23. The figure
was reproduced here with the permission of
Butterworth-Heinemann, Oxford, UK.

Figure 3.19

From Wyszecki, G. & Stiles, W.S. Color Science: Concepts and
Methods, Quantitative Data and Formulae. (2nd ed.) (Copyright ©
1982) New York: John Wiley & Sons, Inc. Reproduced by
permission of John Wiley & Sons, Inc.

Figure 3.20

From Wyszecki, G. & Stiles, W.S. Color Science: Concepts and
Methods, Quantitative Data and Formulae. (2nd ed.) (Copyright ©
1982) New York: John Wiley & Sons, Inc. Reproduced by
permission of John Wiley & Sons, Inc.

Figure 3.21

From Volbrecht, V.J., Aposhyan, H.M. & Werner, J.S. Perception
of electronic display colours as a function of retinal illuminance.
(Copyright © 1988) Displays, 9, 56-64. Reproduced by permission
of Butterworth-Heinemann Ltd., London, U.K.

Figure 3.22

From Wyszecki, G. & Stiles, W.S. Color Science: Concepts and
Methods, Quantitative Data and Formulae. (2nd ed.) (Copyright ©
1982) New York: John Wiley & Sons, Inc. Reproduced by
permission of John Wiley & Sons, Inc.

Figure 4.1

From figures in Visual Perception by Tom N. Cornsweet, copyright
© 1970 by Harcourt Brace & Company. Reproduced by permission
of the publisher.

Figure 4.2

From figures in Visual Perception by Tom N. Cornsweet, copyright
© 1970 by Harcourt Brace & Company. Reproduced by permission
of the publisher.

Figure 4.3

From figures in Visual Perception by Tom N. Cornsweet, copyright
© 1970 by Harcourt Brace & Company. Reproduced by permission
of the publisher.

Figure 4.4

From Campbell, F.W. & Robson, J.G. Application of Fourier
analysis to the visibility of gratings. (Copyright © 1968) Journal
of Physiology, 197, 551-566. Reproduced by permission of the
Physiological Society, Oxford, U.K.

Figure 4.5

From Blakemore, C.B. & Sutton, P. Size adaptation: A new
aftereffect. (Copyright © 1969 by AAAS) Science, 166, 245-247.
Reproduced by permission of the AAAS, Washington, D.C.

Figure 4.6

From DeValois, R.L., Morgan, H. & Snodderly, D.M.
Psychophysical studies of monkey vision--III. Spatial luminance
contrast sensitivity tests of macaque and human observers.
(Copyright © 1974) Vision Research, 14, 75-81. Reproduced by
permission of Pergamon Press, plc., Elmsford, NY.

Figure 4.7

From Rovamo, J., Virsu, V. & Nasanen, R. Cortical magnification
factor predicts the photopic contrast sensitivity of peripheral
vision. (Copyright © 1978) Nature, 271, 54-56. Reproduced by
permission of Macmillan Magazines, London, U.K.

Figure 4.8

From Owsley, C., Sekuler, R. & Siemsen, D. Contrast sensitivity
throughout adulthood. (Copyright © 1983) Vision Research, 23,
689-699. Reproduced by permission of Pergamon Press, plc.,
Elmsford, NY.

Figure 4.9

From DeValois, R.L. & DeValois, K.K. Spatial Vision. (Copyright ©
1988) New York: Oxford University Press. Reproduced by
permission of the publisher.

Figure 4.10

From DeValois, R.L. & DeValois, K.K. Spatial Vision. (Copyright ©
1988) New York: Oxford University Press. Reproduced by
permission of the publisher.

Figure 4.11

From DeValois, R.L. & DeValois, K.K. Spatial Vision. (Copyright ©
1988) New York: Oxford University Press. Reproduced by
permission of the publisher.

Figure 4.14

From Sekuler, R. & Blake, R. Perception. (Copyright © 1985) New
York: McGraw-Hill, Inc. Reproduced by permission of McGraw-Hill.

Figure 4.15

From Gibson, J.J. The Senses Considered as Perceptual Systems.
Copyright © 1966. Reproduced by Houghton Mifflin Company.
Used with permission.

Figure 4.16

From figures from Sensation and Perception, Second Edition by
Stanley Coren, Clare Porac, and Lawrence M. Ward, copyright ©
1984 by Harcourt Brace & Company. Reproduced by permission
of the publisher.

Figure 4.17

From Matlin, M.W. Sensation and Perception. (2nd edit.)
(copyright © 1983) Boston: Allyn and Bacon. Reproduced by
permission of the publisher.

Figure 4.18

From Werner, J.S. & Schlesinger, K. Psychology: Science of Mind,
Brain and Behavior. (Copyright © 1991) New York: McGraw-Hill.
Reproduced by permission of McGraw-Hill.

Figure 4.19

From Julesz, B. Foundations of Cyclopean Perception. (Copyright ©
1971) Chicago: University of Chicago Press. Reproduced by
permission of the publisher. All rights reserved.

Figure 6.1

From Wickens, C.D. Engineering Psychology and Human
Performance (2nd edit.). (Copyright © 1992) New York:
HarperCollins. Reproduced by permission of HarperCollins,
College Publishers.

Figure 6.2

From Wickens, C.D. Engineering Psychology and Human
Performance (2nd edit.). (Copyright © 1992) New York:
HarperCollins. Reproduced by permission of HarperCollins,
College Publishers.

Figure 6.3

From Wickens, C.D. Engineering Psychology and Human
Performance (2nd edit.). (Copyright © 1992) New York:
HarperCollins. Reproduced by permission of HarperCollins,
College Publishers.

Figure 6.4

From Desmond, J. Improvements in aircraft safety and operational
dependability from a projected flight path guidance display,
reproduced with permission from SAE Paper No. 861732,
copyright © 1986, Society of Automotive Engineers, Inc.

Figure 6.5

From Desmond, J. Improvements in aircraft safety and operational
dependability from a projected flight path guidance display,
reproduced with permission from SAE Paper No. 861732,
copyright © 1986, Society of Automotive Engineers, Inc.

Figure 7.1

From Wickens, C.D. Engineering Psychology and Human
Performance (2nd edit.). (Copyright © 1992) New York:
HarperCollins. Reproduced by permission of HarperCollins,
College Publishers.

Figure 7.2

From Wickens, C.D. & Flach, J. Human Information Processing, in
E. Wiener & D. Nagel (Eds.), Human Factors in Aviation,
111-155, (copyright © 1988) Orlando, FL: Reproduced by permission
of Academic Press.

Figure 7.3

From Wright, P. Presenting technical information: a survey of
research findings. (Copyright © 1977) Instructional Science, 6,
93-134. Reproduced by permission of Kluwer Academic Publishers,
Dordrecht, Netherlands.

Figure 7.7

From Wickens, C.D. Engineering Psychology and Human
Performance (2nd edit.). (Copyright © 1992) New York:
HarperCollins. Reproduced by permission of HarperCollins,
College Publishers.

Figure 7.8

From Hawkins, F.H. Human Factors in Flight, (copyright © 1987)
Brookfield, VT: Gower Technical Press. Reproduced by permission
of Ashgate Publishing Co., Brookfield, VT.

Figure 7.9

From Braune, R.J. The common/same type rating: human factors and
other issues. Reproduced with permission from SAE Paper No.
892229, copyright © 1989, Society of Automotive Engineers, Inc.

Figure 8.1

From Wickens, C.D. Engineering Psychology and Human Performance
(2nd edit.). (Copyright © 1992) New York: HarperCollins.
Reproduced by permission of HarperCollins, College Publishers.

Figure 8.3

From Parks, D.L. & Boucek, G.P., Jr. Workload prediction,
diagnosis and continuing challenges. In G.R. McMillan et al.
(Eds.), Applications of Human Performance Models to System Design,
(copyright © 1989) New York: Plenum Publishing Corp. 47-63.
Reproduced by permission of the publisher.

Figure 8.4

From Groce, J.L. & Boucek, G.P., Jr. Air transport crew tasking in an
ATC data link environment. Reproduced with permission from SAE
Paper No. 871764, copyright © 1987, Society of Automotive
Engineers, Inc.

Figure 8.5

From Wickens, C.D. Engineering Psychology and Human Performance
(2nd edit.). (Copyright © 1992) New York: HarperCollins.
Reproduced by permission of HarperCollins, College Publishers.

Figure 8.7a

From Roscoe, A.H. The original version of this material was first
published by the Advisory Group for Aerospace Research and
Development, North Atlantic Treaty Organisation (AGARD/NATO)
in AGARDograph AG-282 "The practical assessment of pilot
workload" in 1987.

Figure 8.7b

From Cooper, G.E. & Harper, R.P., Jr. The original version of this
material was first published by the Advisory Group for Aerospace
Research and Development, North Atlantic Treaty Organisation
(AGARD/NATO) in AGARDograph AG-567 "The use of pilot rating
in the evaluation of aircraft handling qualities" in 1969.

Figure 8.8

From Wilson, G.F., Skelly, J., & Purvis, G. The original version of
this material was first published by the Advisory Group for
Aerospace Research and Development, North Atlantic Treaty
Organisation (AGARD/NATO) in AGARDograph CP-458 "Reactions
to emergency situations in actual and simulated flight. In Human
Behavior in High Stress Situations in Aerospace Operations" in
1989.

Figure 8.9

From Wilson, G.F., Skelly, J., & Purvis, G. The original version of
this material was first published by the Advisory Group for
Aerospace Research and Development, North Atlantic Treaty
Organisation (AGARD/NATO) in AGARDograph CP-458
"Reactions to emergency situations in actual and simulated flight.
In Human Behavior in High Stress Situations in Aerospace
Operations" in 1989.

Figure 8.11

From Czeisler, C.A., Weitzman, E.D., Moore-Ede, M.C.,
Zimmerman, J.C., & Knauer, R.S. Human sleep: its duration and
organization depend on its circadian phase. (Copyright © 1980)
Science, 210, 1264-1267. Reproduced by permission of AAAS.

Figure 8.12

From Richardson et al. Circadian Variation in Sleep Tendency in
Elderly and Young Subjects. (Copyright © 1982) Sleep, 5 (suppl.
2), A.P.S.S., 82. Reproduced by permission of the publisher.

Figure 8.13

From Klein, K.G., Wegmann, H.M. & Hunt, B.I. Desynchronization
of body temperature and performance circadian rhythms as a
result of outgoing and homegoing transmeridian flights.
(Copyright © 1972) Aerospace Medicine, 43, 119-132. Reproduced
by permission of Aerospace Medical Association, Alexandria, VA.

Figure 8.15

From Wegman, H.M., Gundel, A., Naumann, M., Samel, A.,
Schwartz, E. & Vejvoda, M. Sleep, sleepiness, and circadian
rhythmicity in aircrews operating on transatlantic routes.
(Copyright © 1986) Aviation, Space, and Environmental Medicine,
57, (12, suppl.), B53-B64. Reproduced by permission of
Aerospace Medical Association, Alexandria, VA.

Figure 9.2

From Nagel, D.C. Aviation safety: needs for human factors
research. (Copyright © 1987) Presentation to Air Transport
Association (ATA) Airlines Operations Forum, 19-21 October,
Annapolis, Maryland. Reproduced by permission of the ATA.

Figure 9.4

From a document of the Boeing Commercial Airplane Group,
Seattle, Washington. Reproduced by permission of Boeing
Company.

Figure 10.1

From Society of Automotive Engineers, Human interface design
methodology for integrated display symbology. (Copyright ©
1990) from SAE document ARP 4155. Reproduced by permission
of the Society of Automotive Engineers, Inc.

Figures 11.1 - 11.11

From a document of the Boeing Commercial Airplane
Group, Seattle, Washington. Reproduced by permission of Boeing
Company.

TABLES

Table 8.2

From Klein, K.G., Wegmann, H.M. & Hunt, B.I. Desynchronization
of body temperature and performance circadian rhythms as a
result of outgoing and homegoing transmeridian flights.
(Copyright © 1972) Aerospace Medicine, 43, 119-132. Reprinted
by permission of Aerospace Medical Association, Alexandria, VA.

Table 9.3

From Barnett, A. & Higgins, M.K. Airline safety: the last decade.
Management Science, 35, January 1989, 1-21, (copyright © 1989).
Reprinted by permission of The Institute of Management
Sciences, 290 Westminster St., Providence, Rhode Island 02903,
USA.

Table 9.7

From Bainbridge, L. Ironies of automation. In New Technology
and Human Error. Ed. M.J. Rasmussen, K. Duncan, and J. Leplat.
(Copyright © 1987) West Sussex, U.K.: John Wiley & Sons Ltd.
Reprinted by permission of John Wiley & Sons Ltd.

Table 9.8

From Meister, D. A cognitive theory of design and requirements
for a behavioral design aid in system design. In Behavioral
Perspectives for Designers, Tools and Organizations. Ed. W.B. Rouse
and K.R. Boff. (Copyright © 1987) New York: North Holland
Publications. Reprinted by permission of W.B. Rouse.

Table 9.9

From a document of the Boeing Commercial Airplane Group,
Seattle, Washington. Reprinted by permission of Boeing
Company.

Executive Summary
A series of one-week seminars was developed to provide FAA certification

specialists with information on fundamental characteristics of the human


operator relevant to cockpit operations with examples of applications of
this information to aviation problems. The series was designed to
proceed from the development of basic information on human sensory
capabilities, through human cognition, to the application of this

knowledge to the design of controls and displays in the automated


cockpit.
The earlier lectures were prepared and presented by published academic
researchers, the later ones by human factors practitioners employed by
the major airframe manufacturers in the United States.
The lecture series was presented on four separate occasions and was
attended by approximately 120 FAA ffight test and evaluation group
pilots and engineers. This text is a compilation of the lecture material
presented to these professionals during the four occasions.

Chapter 1
Auditory Perception
by John S. Werner, Ph.D., University of Colorado at Boulder
Hearing, like vision, provides information about objects and events at a
distance. There are some important practical differences between hearing and
vision. For example, the stimulus for vision, light, cannot travel through solid
objects, but many sounds can. Unlike vision, hearing is not entirely dependent
on the direction of the head. This makes auditory information particularly
useful as a warning system. A pilot can process an auditory warning regardless
of the direction of gaze, and while processing other critical information through
the visual channel. Auditory information is also less degraded than visual
signals by turbulence during flight, making auditory warnings an appropriate
replacement for some visual display warnings (Stokes & Wickens, 1988). No
doubt these considerations formed the basis of FAA voluntary guidelines on the
use of aural signals as part of aircraft alerting systems (RD-81/38,II, page 89).

Physical Properties of Sound


Let us start by exploring what happens in the physical world to generate sound.
When you pluck the string of a guitar, it vibrates back and forth compressing a
small surrounding region of air. When the vibrating string moves away, it
pushes air in the opposite direction, creating a region of decompression. As the
string vibrates back and forth, it creates momentary increases and decreases in
air pressure, or sound waves. These alternating increases and decreases travel
through the air at a speed of approximately 740 miles per hour (Mach 1, the
speed of sound). Eventually they arrive at our ear, where the tympanic
membrane, our eardrum, vibrates in synchrony with the pulsations of air
pressure.
The simplest pattern of such pressure pulsations is generated for a "pure" tone,
or sine wave. One important characteristic of the sine wave is its frequency.
Frequency is the number of high to low variations in pressure, called cycles,
that occur within a unit amount of time. The units we use to describe sound
frequency are cycles per second, or Hertz (Hz). Waveforms of low and high
frequency tones are illustrated in Figure 1.1.
Figure 1.1. Changes in air pressure shown for two sound waves differing in frequency
and amplitude (top). When added together (bottom), the two pure tones form
a complex sound. (original figure)
[Panels: Low Frequency; High Frequency; Low + High Frequency. Axes: amplitude against time (sec).]

Another important characteristic of pure tones is the degree of change from
maximum to minimum pressure, which we call the amplitude or intensity, also
illustrated in Figure 1.1. Sound amplitude is usually measured in dynes per
square centimeter, which is a measure of force per unit area. The human
auditory system is sensitive to an enormous range of variations in amplitude of
a sound wave -- from about 1 to 10 billion. Thus, intensity is more conveniently
specified by a logarithmic scale using units called decibels (dB). The level in
dB is $20 \log_{10}(p_1/p_0)$, where $p_1$ refers to the sound under consideration
and $p_0$ is a standard reference (0.0002 dynes per square centimeter). Table 1.1
shows some representative sounds on the dB scale.
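The logarithmic relation above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the reference pressure is passed in explicitly rather than fixed, so the sketch works with pressure ratios only.

```python
import math

def sound_level_db(pressure, reference):
    """Level in decibels of a sound pressure relative to a reference pressure.

    Both pressures must use the same unit (for example dynes per square centimeter).
    """
    if pressure <= 0 or reference <= 0:
        raise ValueError("pressures must be positive")
    return 20.0 * math.log10(pressure / reference)

def pressure_ratio(level_db):
    """Pressure ratio p1/p0 that corresponds to a level in decibels."""
    return 10.0 ** (level_db / 20.0)

# A tenfold increase in pressure adds 20 dB; doubling the pressure adds about 6 dB.
tenfold_db = sound_level_db(10.0, 1.0)
doubling_db = sound_level_db(2.0, 1.0)

# A 120 dB sound has a pressure one million times the reference pressure.
ratio_at_120_db = pressure_ratio(120.0)
```

Because the scale is logarithmic, equal steps in decibels correspond to equal ratios of pressure, which is why the wide range of audible amplitudes fits into the short list of values in Table 1.1.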
Table 1.1
The Decibel Scale

dB     Example
0      Threshold of Hearing
10     Normal Breathing
20     Leaves Rustling
30     Empty Office
40     Residential Neighborhood at Night
50     Quiet Restaurant
60     Two-Person Conversation
70     Busy Traffic
80     Noisy Auto
90     City Bus
100    Subway Train
120    Propeller Plane at Takeoff
130    Machine-Gun Fire, Close Range
140    Jet at Takeoff
160    Wind Tunnel

Comment column entries: Prolonged Exposure Can Impair Hearing; Threshold of Pain

(Adapted from Sekuler & Blake, 1985)


Sine-wave tones are considered pure because we can describe any waveform as
a combination of a set of sine waves each of which has a specific frequency
and amplitude. This fact was initially demonstrated by Fourier. A sound
comprised of more than a single sine wave is termed a complex sound. Most of
the sounds we hear are complex sounds. The bottom panel of Figure 1.1 shows
how two sine waves of different frequencies can be combined to form a
complex sound. Some typical complex sounds are shown in Figure 1.2. Here, we
see the same note from the musical scale played by three different musical
instruments. Below each waveform is shown the amplitude of each frequency in
the sound. That is, each complex sound was broken down into a set of sine
waves of different frequencies using a method called Fourier analysis.
The three instruments sound different because they contain different amplitude
spectra (amplitudes as a function of frequency).
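As a rough illustration of Fourier analysis, the sketch below builds a complex sound from two pure tones and recovers the amplitude of each frequency component with a direct discrete Fourier transform. The sample rate, tone frequencies, and amplitudes are arbitrary values chosen for the example, not data from the text, and the function name is hypothetical.

```python
import cmath
import math

def amplitude_spectrum(samples):
    """Amplitude of each frequency bin (0 .. N/2) of a real signal, by a direct DFT."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        total = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        # Bins other than 0 and N/2 share their energy with a mirrored negative frequency.
        scale = 1.0 if k == 0 or (n % 2 == 0 and k == n // 2) else 2.0
        spectrum.append(scale * abs(total) / n)
    return spectrum

SAMPLE_RATE = 400   # samples per second (illustrative)
N_SAMPLES = 400     # one second of signal, so bin k corresponds to k Hz

# Complex sound: a 50 Hz tone of amplitude 1.0 plus a 120 Hz tone of amplitude 0.5.
signal = [
    1.0 * math.sin(2 * math.pi * 50 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 120 * t / SAMPLE_RATE)
    for t in range(N_SAMPLES)
]

spectrum = amplitude_spectrum(signal)
peaks = [(hz, round(amp, 3)) for hz, amp in enumerate(spectrum) if amp > 0.01]
# peaks == [(50, 1.0), (120, 0.5)]
```

The list of peaks is a small amplitude spectrum of the same kind as those plotted for the three instruments: each entry gives a frequency and the amplitude of the sine wave at that frequency.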

Figure 1.2. Amplitude spectra of a C note played on three different instruments. (from
Fletcher, 1929)

Typically, the pitch we hear in a complex sound corresponds to the pitch of the
lowest frequency component of that sound. This component is called the
fundamental frequency. Frequency components higher than the fundamental are
called harmonics, and these harmonics affect the quality or the timbre of the
sound. Two musical instruments, say a trumpet and piano, playing the same
note will generate the same fundamental. However, their higher frequency
components, or harmonics, will differ, as illustrated in Figure 1.2. These
harmonics produce the characteristic differences in quality between different
instruments. If we were to remove the harmonics, leaving only the fundamental,
a trumpet and a piano playing the same note would sound identical.

Frequency and Intensity Relations to Perception


How do the physical properties of sound relate to our perceptions? First,
consider the range of frequencies over which we are sensitive. The lower curve
in Figure 1.3 shows how absolute threshold varies with sound frequency for a
young adult.
The range over which sounds can be detected is from about 20 to 20,000 Hz.
As you can see, we are most sensitive to sounds between 500 and 5,000 Hz.
These are also the frequencies of human speech. The frequency range
recommended for aircraft warning signals is 250 to 4,000 Hz (FAA voluntary
guidelines based on FAA RD-81/38,II, page 91). Our sensitivity declines sharply,
i.e., the threshold increases, for higher and lower frequencies. What this means
is that sounds of different frequencies require different amounts of energy to be
loud enough to be heard.

Figure 1.3. Variation of absolute threshold with sound frequency for a young adult. (from
Fletcher & Munson, 1933)
[Curves shown: audibility curve (threshold of hearing), equal loudness curves, threshold of feeling. Horizontal axis: Frequency (Hz).]

It is interesting to consider our absolute sensitivity under optimal conditions. At
about 2,500 Hz, we are so sensitive that we can detect a sound that moves the
eardrum less than the diameter of a hydrogen molecule (Békésy & Rosenblith,
1951). In fact, if we were any more sensitive, we would hear air molecules
hitting our eardrums and blood moving through our head.
To make the frequency scale a little more intuitive, consider that the range on a
piano is from about 27.5 Hz to about 4,186 Hz. Middle C is 262 Hz. As sound
frequency is increased from 20 to 20,000 Hz, we perceive an increase in pitch.
It is important to note though that our perception of pitch does not increase in
exact correspondence to increases in frequency.
As we increase the amplitude, or physical intensity, of a particular frequency, its
loudness increases. Loudness, however, is a perceptual attribute referring to our
subjective experience of the intensity of a sound, not a physical property of the
sound. To measure the relative loudness of a sound, researchers typically
present a tone of a particular frequency at a fixed intensity and then ask
subjects to increase or decrease the intensity of another tone until it matches
the loudness of the standard. This is repeated for many different frequencies to
yield an equiloudness contour. Figure 1.3 shows equiloudness contours for
standards of 40 and 80 dB above threshold. Note that the shape of the contour
changes with increasing intensity. That is, the increase in the loudness of a
sound with increasing intensity occurs at different rates for different
frequencies. Thus, we are much more sensitive to intermediate frequencies of
sound than to extremes in frequency. However, with loud sounds, indicated by
higher intensity standards in Figure 1.3, this difference in our sensitivity to
various frequencies decreases.
Sensitivity to loudness depends on the sound frequency in a way that changes
with the level of sound intensity. You have probably experienced this
phenomenon when listening to music. Listen to the same piece of music at high
and low volumes. Attend to how the bass and treble become much more
noticeable at the higher volume. Some high-fidelity systems compensate for this
change by providing a loudness control that can boost the bass and treble at
low volume. The fact that the loudness of a tone depends not only on its
intensity but also on its frequency is a further illustration that physical and
perceptual descriptions are not identical.
While pitch depends on frequency, as mentioned, it also depends on intensity.
When we increase the intensity of a low frequency sound, its pitch decreases.
When we increase the intensity of a high frequency sound, its pitch increases.
The Effects of Aging
The frequency range for an individual observer is commonly measured by
audiologists and is known as an audiogram. Figure 1.3 showed that the
frequency sensitivity of a young adult ranged from about 20 to 20,000 Hz. This
range diminishes with increasing age, however, so that few people over age 30
can hear above approximately 15,000 Hz. By age 50 the high frequency limit is
about 12,000 Hz and by age 70 it is about 6,000 Hz (Davis & Silverman,
1960). This loss with increasing age is known as presbycusis, and is usually
greater in men than in women.
The cause of presbycusis is not known. As with all phenomena of aging, there
are large individual differences in the magnitude of high frequency hearing loss.
One possibility is that changes in vasculature with increasing age limit the
blood supply to sensitive neural processes in the ear. Another possibility is that
there is some cumulative pathology that occurs with age. For example, cigarette
smokers have a greater age-related loss in sensitivity than nonsmokers (Zelman,
1973) and this may be due to the interfering effects of nicotine on blood
circulation. There are other possibilities, but perhaps the most important to
consider is the cumulative effect of sound exposure.

Effects of Exposure
Sudden loud noises have been known to cause hearing losses. This is a common
problem for military personnel exposed to gun shots. Even a small firecracker
can cause a permanent loss in hearing under some conditions (Ward &
Glorig, 1961).
Exposure to continuous sound is common in modern industrial societies. Even
when the sounds are not sufficiently intense to cause immediate damage,
continuous exposure may produce loss of hearing, especially for high
frequencies. Unprotected workers on assembly lines or at airports have hearing
losses that are correlated with the amount of time on the job (Taylor, 1965).
Similar studies have shown deleterious effects of attending loud rock concerts.
The potentially damaging effects of sound exposure on hearing depend on both
the intensity and duration of the sounds. Thus, cumulative exposure to sound
over the life span might be related to presbycusis.

Sound Localization
The separated locations of our ears allow us to judge the source of a sound.
We use incoming sound from a single source to localize sounds in space in two
different ways. To begin with, suppose a tone above 1,200 Hz is sounded
directly to your right, as illustrated in Figure 1.4.
The intensity of high frequency sounds will be less in the left ear than the right
because your head blocks the sounds before they reach your left ear. This
intensity difference only exists for sounds above 1,200 Hz, however. At lower
frequencies, sound can travel around your head without any significant
reduction in intensity.
Whenever a sound travels farther to reach one ear or the other, a time difference
exists between the arrival of the sound at each ear. Thus, if the sound source is
closer to one ear, the pulsations in air pressure will hit that ear first and the
other a bit later. We can use a time difference as small as 10 microseconds
between our two ears to localize a sound source (Durlach & Colburn, 1978),
but this information is only useful for low frequency sounds. Thus, localization
of high frequency sounds depends primarily on interaural intensity differences,
but low frequency sounds are localized by interaural time differences.
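To give a feel for the magnitudes involved, the following sketch estimates the interaural time difference with a simple straight-line path model. The model and the ear separation of 0.18 m are assumptions made for illustration only; they are not taken from the text, and a real head adds a curved path around the skull. The speed of sound is the approximate 740 miles per hour quoted earlier in the chapter.

```python
import math

MPH_TO_M_PER_S = 0.44704
SPEED_OF_SOUND_M_PER_S = 740 * MPH_TO_M_PER_S  # roughly 331 m/s, from the figure quoted in the text
EAR_SEPARATION_M = 0.18                        # assumed, illustrative distance between the ears

def interaural_time_difference(azimuth_deg,
                               ear_separation=EAR_SEPARATION_M,
                               speed=SPEED_OF_SOUND_M_PER_S):
    """Arrival-time difference in seconds for a distant source.

    azimuth_deg is 0 for a source straight ahead and 90 for a source
    directly to one side. Uses a straight-line path difference only.
    """
    return ear_separation * math.sin(math.radians(azimuth_deg)) / speed

def wavelength_m(frequency_hz, speed=SPEED_OF_SOUND_M_PER_S):
    """Wavelength in meters of a tone of the given frequency."""
    return speed / frequency_hz

itd_side = interaural_time_difference(90.0)   # source directly to one side
itd_ahead = interaural_time_difference(0.0)   # source straight ahead
wavelength_1200 = wavelength_m(1200.0)        # tone at the 1,200 Hz boundary
```

Under these assumptions a source directly to one side produces a difference of roughly half a millisecond, far larger than the 10 microseconds the text cites as usable. The wavelength of a 1,200 Hz tone comes out somewhat larger than the assumed ear separation, which fits the observation that lower frequencies pass around the head with little loss of intensity.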

Figure 1.4. Sound waves of high and low frequency traveling from a sound source to the
two ears of a person's head. (from Werner & Schlesinger, 1981)

Habituation and Adaptation
Our ability to detect sounds is not static, but rather changes as a sound is
repeatedly presented. This can be due to adaptation, a physiological change in
sensitivity of the auditory system following exposure to sounds. However,
changes in the ability to detect sounds need not occur for us to "tune out"
sounds around us. When a stimulus is repeatedly presented, there is a tendency
to decrease responsiveness over time. For example, when sitting in a room we
may notice a fan when it is first turned on, but over time the noise of the fan is
not noticeable at all. This is called habituation, a decrease in response or of
noticing the sound that cannot be attributed to fatigue or adaptation. To
distinguish between adaptation and habituation, the same tone might suddenly
be reduced in intensity. If the reduced response is due to habituation, there may
be a recovery of response even though the stimulus is weaker.

The importance of habituation is clear when an individual must engage in a
task that involves attending to repetitive stimuli. There is a natural tendency to
tune out what is repeated and renew attention to what is novel. Tuning out
what is repeated, and presumably irrelevant, keeps the sensory channels open to
process new information. Habituation and adaptation phenomena are not
limited to detecting auditory stimuli, but they can be demonstrated for any of
the senses.

Ambient Noise (Masking)


Detection of pure tones is affected by background noise. We require more
intense tones for detection in the presence of background noise, and the shape
of the frequency sensitivity curve changes with the characteristics of the
ambient noise. The experience of detecting sounds in the presence of
background noise is a familiar one. In the laboratory we call the sound that an
individual is trying to detect the target, and the sound that is interfering with
detection the masking stimulus. Not surprisingly, the effectiveness of a masking
stimulus increases with its intensity. This corresponds to our experience in
which we must speak more loudly to be heard as the sounds around us increase
in loudness. Perhaps not so intuitive, however, are the results of masking
studies which show that masking sounds do not affect all tones equally, but
rather act selectively to reduce sensitivity for tones of the same and somewhat
higher frequencies than the mask (Zwicker, 1958).
There are some conditions in which having two ears makes it possible to reduce
the effects of masking stimuli. To demonstrate this effect, sounds are played
separately to the two ears by use of headphones. Suppose that a tone is
delivered to the right ear and it becomes inaudible when masking noise is
delivered to that same ear. Now, if the same noise stimulus (without the tone)
is played to the other ear, the tone will become audible again. It is as though
the stimuli to both ears can be separated from the target that is presented to
only one ear. This is known as binaural unmasking.
Binaural unmasking is probably one factor that helps an individual to focus on
one set of sounds in the presence of others. This is a familiar experience at
parties, in which you can listen to one conversation while tuning out
conversations in the background. If your name happens to be mentioned in
another conversation, however, you may find yourself unable to resist switching
the conversation to which you are listening. This is known as the cocktail party
phenomenon and it underscores our ability to monitor incoming information
that we are not actively processing.

Chapter 2
Basic Visual Processes
by John S. Werner, Ph.D., University of Colorado at Boulder
Vision is our dominant sensory channel, not only in guiding aircraft, but also in
most tasks of everyday life. For example, we can recognize people in several
ways -- by their appearance, their voice, or perhaps even their odor. When we
rearrange stimuli in the laboratory so that what one hears or feels conflicts with
what one sees, subjects consistently choose responses based on what they saw
rather than on what their other senses told them (Welch and Warren, 1980).
Most of us apparently accept the idea that "seeing is believing."
Physical Properties of Light
Light is a form of electromagnetic energy that is emitted from a source in small,
indivisible packets called quanta (or photons). A quantum is the smallest unit of
light. As with sound energy, the movement of light energy through space is in a
sinusoidal pattern. Sound waves were described in terms of their frequency, but
light waves are more commonly described in terms of the length of the waves
(i.e., the distance between two successive peaks). This description is equivalent
to one based on frequency because wavelength and frequency are inversely
related. Figure 2.1 illustrates two waves differing in their length. As can be seen
in the figure, the electromagnetic spectrum encompasses a wide range, but our
eyes are sensitive only to a small band of radiation which we perceive as light.
Normally, we can see quanta with wavelengths between about 400 and 700
nanometers (nm; 1 nm is one billionth of a meter). Thus, the two major
physical variables for discussing light are quanta and wavelength. The number
of quanta falling on an object describes the light intensity, whereas the
wavelength tells us where the quanta lie in the spectrum. Most naturally
occurring light sources emit quanta of many wavelengths (or a broadband of
the spectrum), but in a laboratory, we use specialized instruments that emit
only a narrow band of the spectrum; such lights are called monochromatic. If a
person with normal color vision were to view monochromatic lights in a dark room,
the appearance would be violet at 400 nm, blue at 470 nm, green at 550 nm,
yellow at 570 nm, and red at about 680 nm. Note that this description is for
one set of conditions; later we will illustrate how the appearance can change
for the same monochromatic lights when viewed under other conditions.

Figure 2.1. Regions of the electromagnetic spectrum and their corresponding
wavelengths. (from Coren, Porac & Ward, 1984)
[Panels: short wavelength; long wavelength. Lower scale: wavelength in nanometers (nm), 400 to 700, violet through red.]
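The inverse relation between wavelength and frequency mentioned above can be made concrete with a short sketch. It assumes the standard relation frequency = speed of light / wavelength and uses the vacuum speed of light, neither of which is stated in the text; the function name is hypothetical, and the color names and wavelengths are the ones listed in the paragraph.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # speed of light in vacuum

def frequency_thz(wavelength_nm):
    """Frequency in terahertz of light with the given wavelength in nanometers."""
    wavelength_m = wavelength_nm * 1e-9
    return SPEED_OF_LIGHT_M_PER_S / wavelength_m / 1e12

# Appearance of monochromatic lights in a dark room, as listed in the text.
MONOCHROMATIC_LIGHTS_NM = {
    "violet": 400,
    "blue": 470,
    "green": 550,
    "yellow": 570,
    "red": 680,
}

frequencies = {name: frequency_thz(nm) for name, nm in MONOCHROMATIC_LIGHTS_NM.items()}
# Shorter wavelengths map to higher frequencies: violet is the highest, red the lowest.
```

Either description locates a light in the spectrum equally well, which is why the text can switch from frequency for sound to wavelength for light without loss of information.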

Figure 2.2 shows the distribution of energy for some familiar light sources,
fluorescent lamps. The four different curves show four different types of lamp.

Figure 2.2. Relative energy of fluorescent lamps plotted as a function of wavelength: 1 =
standard warm white, 2 = white, 3 = standard cool white, 4 = daylight. (from
Wyszecki & Stiles, 1982)

While they all may be called "white," they differ in their relative distribution of
energy. They also appear different in their color although this is not always
noticed unless they are placed side-by-side. Variations in the intensity and
spectral distribution of energy can sometimes be quite large without affecting
our color perception. Indeed, Figure 2.3 shows the energy of sunlight plotted as
a function of wavelength for a surface facing away from the sun or toward the
sun. If these two light distributions were placed side-by-side you would say that
one is bluish and the other yellowish, but if either one was used to illuminate a
whole scene by itself, you would most likely call this illuminant white and
objects would appear to have their usual color. Objects usually do not change
their color with these changes in the source of illumination. This perceptual
phenomenon is called color constancy.
When light travels from one medium to another, several things can happen.
First, some or all of the quanta can be lost by absorption and the energy in the
absorbed quanta is converted into heat or chemical energy. Second, when
striking another medium some or all of the quanta can bounce back into the
initial medium, a familiar phenomenon known as reflection. Third, the light can
be transmitted, or move forward, from one medium to another, but in doing so
the path may change somewhat; that is, the rays of light will be bent by
refraction. The extent to which each of these phenomena will occur depends
upon the medium that the light is striking, and the angle of incidence between
the light rays and the medium.
Absorption, reflection, and refraction all occur at the various structures in the
eye. It is, therefore, important to consider these phenomena in attempting to
understand the formation of optical images in the eye.

Figure 2.3. Sunlight energy plotted as a function of wavelength for a surface facing away
from the sun or toward the sun. (from Walraven et al., 1990)

The Eye
Figure 2.4 is a diagram of the human eye. The eyeball is surrounded by a
tough, white tissue called the sclera, which becomes the clear cornea at the
front. Light that passes through the cornea continues on through the pupil, a
hole formed by a ring of muscles called the iris. It is the outer, pigmented layer
of the iris that gives our eyes their color.
Contraction and expansion of the iris opens or closes the pupil to adjust the
amount of light entering the eye. Light then passes through the lens and strikes
the retina, several layers of cells at the back of the eye. The retina includes
receptors that convert energy in absorbed quanta into neural signals. One part of
the retina, called the fovea, contains the highest number of receptors per unit
area. When we want to look at an object, or fixate it, we move our head and
eyes so that the light will travel along the visual axis and the image of the
object will fall on the fovea.
The sizes of visual stimuli are often specified in terms of the region of the
retina that they subtend (cover). This concept is illustrated in panel (b) of Figure
2.4. Consider what happens when we look at an object, say a tree (Figure 2.4).
Imagine the tree as many points of light, and we are looking at the light coming
from the top of the tree. When we focus on the tree, our cornea and lens bend
the light so that an image of the tree is formed at the back of the eye, much as
an image is made on photographic film by a camera. Note that the optics of the
eye bend the light so that the image of the tree on the retina is upside down and
reversed left to right. The area of the retina covered by the image is called the
visual angle, which is measured in degrees. The angle depends on the object's
size and distance from us. In Figure 2.4, we can deduce that smaller and smaller
trees at closer and closer distances could all subtend the same visual angle. The
same principles hold for two equally sized objects at differing distances; they
will produce different visual angles and appear as different sizes. This relation
is such that as the distance of an object from the eye doubles, the size of the
image produced by the object is halved. Artists use this information to create an
illusion of three-dimensional space on a flat surface by making background
figures smaller than foreground figures.

Figure 2.4. Cross section of the human eye; visual angle. (from Cornsweet, 1970)

The visual angle x is calculated by: arctan (size/distance), and is specified in
degrees. (Note that the distance is between the object and the cornea, plus the
distance between the cornea and point 'p' in Figure 2.4. The latter value is
seven mm.) By definition, one degree equals 60 minutes of arc, and one
minute of arc equals 60 seconds of arc. A rough rule of thumb (no pun
intended) is that the visual angle 'x' of your thumb nail at arm's length is about
2°.
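The visual angle calculation can be sketched as follows. The seven mm offset is the value quoted in the text; the thumbnail width (2 cm) and arm's length (57 cm) are assumed, illustrative values, and the function names are hypothetical.

```python
import math

CORNEA_TO_POINT_P_M = 0.007  # seven mm, the cornea-to-point-'p' distance quoted in the text

def visual_angle_deg(size_m, distance_to_cornea_m):
    """Visual angle in degrees, computed as arctan(size / distance)."""
    distance_m = distance_to_cornea_m + CORNEA_TO_POINT_P_M
    return math.degrees(math.atan(size_m / distance_m))

def arc_minutes(angle_deg):
    """Convert degrees to minutes of arc (60 minutes per degree)."""
    return angle_deg * 60.0

# Assumed example values: a thumbnail 2 cm wide held at 57 cm from the cornea.
thumbnail_angle = visual_angle_deg(0.02, 0.57)      # close to 2 degrees
thumbnail_arcmin = arc_minutes(thumbnail_angle)

# Doubling the viewing distance roughly halves the visual angle.
near_angle = visual_angle_deg(1.0, 50.0)
far_angle = visual_angle_deg(1.0, 100.0)
```

For small angles the arctangent is nearly proportional to size divided by distance, which is why doubling the distance comes so close to halving the angle in the last two lines.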

An eye that properly focuses distant objects on the retina is said to be
emmetropic. However, as Figure 2.5 illustrates, some individuals have an eyeball
that is abnormally short or the optics of their eye do not sufficiently refract the
incoming light with the result that the light is focused behind the retina. This
condition is called hypermetropia or farsightedness. Other individuals may have
an eye that is too long or optics that refract the light from distant objects too
much with the result that the object is imaged in front of the retina. This
condition is known as myopia or nearsightedness. In both hypermetropia and
myopia, the image falling on the retina is not properly focused and vision may
be blurred. Fortunately, this problem can be corrected by prescribing spectacles
or contact lenses that cause distant images to be focused on the retina.
Accommodation
At any one time, the eye can focus objects clearly only if those objects fall
within a limited range of distance. To look at close objects we require more
bending of the light to properly focus the image on the retina. In humans, this
is accomplished by a somewhat flexible lens in the eye. The lens is attached to
muscles that can be contracted or relaxed to change the lens curvature. When
the shape is changed, the light will be refracted or bent differently, a process
known as accommodation.
It is not clear what triggers the eye to change its state of accommodation, but
one likely source is a defocused image. Since shifts in fixation from far to near
objects will be associated with some image blur, accommodation will occur.
The reaction time for accommodation is about 360 milliseconds (Campbell &
Westheimer, 1960). Although this is a short reaction time, it is nevertheless
long enough to produce noticeable blur when shifting focus from a display
panel or head-up display (HUD) to a distant object, or vice versa. It may be
noted that the need to accommodate to HUD symbology is theoretically
unexpected because it is produced by optically collimated virtual images;
however, collimated images do not necessarily lead to focus at optical infinity
(Iavecchia, Iavecchia & Roscoe, 1988).

Figure 2.5. Image formation in emmetropic (normal), hypermetropic, and myopic eyes.
(from Coren, Porac, & Ward, 1984)


Aging and Presbyopia
The flexibility of the eye lens decreases with age and thereby limits the ability
to accommodate, both in terms of the amount of change in the lens and the
time required to respond to changes that occur when shifting fixation from far
to near objects (Weale, 1982). The loss in accommodative ability, known as
presbyopia, is often quantified in terms of the near point, or the closest distance
at which an object can be seen without blur. As illustrated by Figure 2.6, the
near point increases with advancing age. By about age 40, the near point is
such that reading can only be accomplished when the print is held at some
distance or if reading glasses are used.
Some individuals require one lens correction for their distance vision and a
different correction for their presbyopia. This can be accomplished by bifocal
lenses -- lenses which require the individual to look through different parts in
order to properly focus near and far objects.

Figure 2.6. Near point plotted as a function of age. (from Helps, 1973)

Ocular Media Transmission and Aging
The various optical components
of the eye -- the ocular media -- shown in Figures 2.4 and 2.5 are not
completely transparent. The lens of the eye, in particular, has a yellowish color.
It absorbs quite strongly at the short wavelengths of the visible spectrum
(around 400 to 450 nm) and even more strongly in the ultraviolet portion of
the spectrum from 300 to 400 nm. This is illustrated by Figure 2.7 which shows
optical density plotted as a function of wavelength. Optical density refers to the
log of the reciprocal of transmission and can be thought of as a logarithmic
measure of absorption. Thus, optical density 2.0 corresponds to ten times less
transmitted light than optical density 1.0.

Figure 2.7. Optical density of the human lens plotted as a function of wavelength. (data
from Boettner, 1967, original figure)

Figure 2.8. Optical density of human ocular media at 400 nm plotted as a function of age.
(from Werner, Peterzell & Scheetz, 1990)

Figure 2.8 shows the variation in ocular media density (at 400 nm) as a
function of advancing age. One can see that at each age there is a great deal of
individual variation, about 1 log unit or a factor of 10-to-1. In addition, the
optical density of the lens increases markedly with advancing age. It can be
deduced from the solid line fit to the data that the average 70-year-old eye
transmits about 22 times less light at 400 nm (1.34 optical density difference)
than does the eye of the average 1-month-old infant. This difference between
young and old diminishes with increasing wavelength.
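The arithmetic behind these figures can be checked with a brief sketch. It assumes only the definition given in the text, that optical density is the base-10 log of the reciprocal of transmission; the function names are hypothetical.

```python
import math

def transmission_from_density(density):
    """Fraction of light transmitted by a medium of the given optical density."""
    return 10.0 ** (-density)

def density_from_transmission(transmission):
    """Optical density, the base-10 log of the reciprocal of transmission."""
    if not 0.0 < transmission <= 1.0:
        raise ValueError("transmission must be in the interval (0, 1]")
    return -math.log10(transmission)

def attenuation_factor(density_difference):
    """How many times less light is transmitted for a given difference in density."""
    return 10.0 ** density_difference

# One log unit of individual variation is a factor of 10 in transmitted light.
one_log_unit = attenuation_factor(1.0)

# The 1.34 density difference between the average infant and 70-year-old eye
# at 400 nm corresponds to a factor of a little under 22.
age_factor = attenuation_factor(1.34)
```

The factor works out to just under 22, in line with the "about 22 times" stated in the text for the 1.34 optical density difference.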
Because the lens increases its absorption with age, the visual stimulus arriving
at the receptors will be less intense with age. In addition, for stimuli with a
broad spectrum of wavelengths, there will be a change in the relative
distribution of light energy because the short wavelengths will be attenuated
more than middle or long wavelengths. Since the stimulus at the retina is
changing with age, there will be age-related decreases in the ability to detect
short wavelengths of light. The amount of light absorbed by the lens will also
directly influence our ability to discriminate short wavelengths (blue hues).
Thus, the large range of individual variation in the lens leads to large
individual differences in discrimination of blue hues and in how a specific blue
light will appear to different observers.
While an increase in the absorption of light with advancing age is considered
normal, some individuals experience an excessive change which leads to a lens
opacity known as a cataract. A cataractous lens severely impairs vision and is
typically treated by surgical removal and implantation of a plastic, artificial lens.
These artificial lenses eliminate the ability to accommodate, but in most cases of
cataract the individual is above about 55 or 60 years of age and has lost this
ability anyway.

Figure 2.9. Various cell types in the primate retina. (original figure from
Dowling & Boycott, 1966; modified by Wyszecki & Stiles, 1982)

Rods and Cones


Figure 2.9 shows the various cell layers of the retina. Quanta falling on the
retina are absorbed by photopigments contained in the visual receptors. Energy
contained in an absorbed quantum changes the structure of the photopigment,
which causes the receptor to respond. These responses are passed along to other
cells in the retina. In this diagram, light enters from the bottom of the
picture and before it reaches the receptors it must travel through the
different cell layers. This does not affect the image, however, since these
other cells are essentially transparent. The human retina contains two types of
visual receptors, the rods and cones, so named because of their different
shapes.

Variation with Retinal Eccentricity


There are approximately 6 million cones and about 125 million rods in the
human retina. These receptor cells are not evenly distributed across the
retina, as shown in Figure 2.10. The cones are most densely packed in the
fovea. To look at something directly or fixate on it, we turn our eyes so that
the object's image falls directly on the fovea. This is advantageous because the
fovea contains the greatest number of cones, providing us with our best visual
acuity, or ability to see fine details. Outside the fovea, where the density of the
cones decreases, there is a corresponding decrease in visual acuity. The density
of rods is greatest about 20° from the fovea and decreases toward the periphery.
The periphery has many more rods than cones, but a careful reading of the
figure shows that there are as many as 7,500 cones per square mm even in the
peripheral retina.

Figure 2.10. The number of rods and cones plotted as a function of retinal
eccentricity. (data from Osterberg, 1935; figure from Cornsweet, 1970)

When light falls on the rods and cones, they send signals to other retinal cells,
the horizontal, bipolar, and amacrine cells located in different retinal layers
(Figure 2.9). These cells organize incoming information from receptors in
complex ways. For example, one of these cells can receive information from
many receptors as well as from other retinal cells. Then these cells send their
information on to ganglion cells, which can further modify and reorganize the
neural information. The activity of these ganglion cells is sent to the brain
along neural fibers called axons. Thus, the only information that our brain can
process must be coded in the signals from the ganglion cells. The interactions
among the different retinal cell types provide the physiological basis for many
important perceptual phenomena.
The axons of ganglion cells form a bundle of approximately one million fibers
called the optic nerve. These fibers leave the eyeball in the region termed the
optic disc. Because this area is devoid of receptors, it is called the blind spot. As
can be seen in Figure 2.10, the blind spot is located at about 15° on the side of
the nose (or nasal retina) from the fovea.
As a practical matter, one can now see why FAA guidelines (see RD-81/38, II,
page 40) stress the importance of placing master visual alerts within 15° of each
pilot's normal line of sight as illustrated by Figure 2.11. This is the area of the
visual field with best visual acuity and typically the center of attention. High
priority signals placed in this area will be detected more quickly than signals
placed more peripherally.

Figure 2.11. Recommended placement of visual alerts and other high priority
signals relative to the line of sight. (from DOT/FAA/RD-81/38, II)

Spectral Sensitivity
The functional difference between rods and cones was discovered in 1825, when
the Czech medical doctor Purkinje realized that he was most sensitive to a
part of the spectrum in complete darkness that was different from the part
he was most sensitive to in daylight. From this, he hypothesized the existence
of different receptors for day (photopic) vision and night (scotopic) vision. Shortly
after this, a German biologist named Schultze described two types of receptors
in the retina which he named rods and cones based on their shapes. He noted
that rods were the main type of receptor in animals active at night and cones
predominated in animals active during the day. From this, he concluded that
rods are the receptors of scotopic or "night" vision and cones the receptors for
photopic or "daylight" vision.

Figure 2.12. Log relative sensitivity plotted separately for rods and cones as a
function of wavelength. (data from Wald, 1945; figure from Judd, 1951)

Rods and cones differ in their sensitivity to different wavelengths of light, or
their spectral sensitivity. If we measure spectral sensitivity with light focused on
the retina where the rods are most numerous, the maximal sensitivity will be at
about 505 nm. The top curve in Figure 2.12 shows scotopic spectral sensitivity, or
the sensitivity of rods to different wavelengths. The shape of this curve is due
to the fact that the photopigment contained in rods absorbs some wavelengths
better than others.
Under scotopic conditions, we do not perceive hue -- the chromatic quality in
colors that we identify with names such as red, green, blue, and yellow. If we
observed lights of different wavelengths emitting the same numbers of quanta,
light at 510 nm would appear brighter to us than other wavelengths because of
our greater sensitivity to it, but all the wavelengths would appear to have the
same color under scotopic or dark-adapted conditions.
If we measure spectral sensitivity for cones by focusing light directly onto the
fovea where there are virtually no rods, we see that cone sensitivity is
dramatically lower than for the rods -- as much as a thousand times lower at
some wavelengths. The wavelength of maximal sensitivity for the cones (555
nm) is different than for the rods. This photopic spectral sensitivity is shown in
Figure 2.12. Not only do the cones differ from the rods in their spectral
sensitivity, but they produce different perceptual experiences. Under photopic or
daylight conditions, we can see different hues as wavelength varies. Thus,
perception of hue is dependent on cone receptors.

Let us briefly digress to consider a practical implication of the spectral
sensitivity functions. We have seen that lights can be specified in terms of the
number of quanta emitted at various wavelengths of the visible spectrum.
However, because the eye is not equally sensitive to all wavelengths, the
specification of the intensity of a light in terms of a purely physical metric does
very little to describe its effectiveness as a stimulus for vision. For this reason,
the International Commission on Illumination (Commission Internationale de
l'Éclairage, CIE) has developed a system of specifying the intensity of the
stimulus weighted according to the spectral sensitivity of the human observer.
The spectral sensitivity function used by the CIE is called the standard
observer's visibility function, or $V_\lambda$ when specifying lights under photopic
conditions and $V'_\lambda$ when specifying lights viewed under scotopic conditions.
Luminance, the intensity of light per unit area reflected from a surface toward
the eye, is thus defined as:
$$K \int E_\lambda V_\lambda \, d\lambda,$$
where $E_\lambda$ is the radiant energy contained in wavelength interval $d\lambda$ and
$V_\lambda$ is the relative photopic spectral sensitivity function for the standard
observer. For scotopic conditions, the same formula applies except that
$V'_\lambda$ is used instead of $V_\lambda$. The $K$ is related to the units in which
luminance is specified, the most common in current usage being the candela per
square meter (cd/m²). In the literature, one may find luminance specified in
different units by different investigators. Conversion factors needed to compare
the various studies are tabled by Wyszecki and Stiles (1982).
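
As a rough illustration of the weighting described above, the sketch below approximates the integral with the trapezoidal rule. The function name, the sample spectrum, and the coarse sensitivity values are assumptions made for illustration only; a real calculation would use the tabulated CIE standard observer values and a value of K appropriate to the chosen units.

```python
def weighted_intensity(wavelengths_nm, energy, sensitivity, k=1.0):
    """Approximate K * integral(E * V d-lambda) with the trapezoidal rule.

    wavelengths_nm: sample wavelengths in increasing order
    energy:         radiant energy E at each wavelength (hypothetical data)
    sensitivity:    relative spectral sensitivity V at each wavelength
    k:              constant that sets the units (assumed 1.0 here)
    """
    total = 0.0
    for i in range(len(wavelengths_nm) - 1):
        width = wavelengths_nm[i + 1] - wavelengths_nm[i]
        left = energy[i] * sensitivity[i]
        right = energy[i + 1] * sensitivity[i + 1]
        total += 0.5 * (left + right) * width
    return k * total

# Illustrative values only: a flat spectrum and a crude sensitivity curve
# peaking at 550 nm (this is NOT the CIE table).
wavelengths = [450, 500, 550, 600, 650]
flat_energy = [1.0, 1.0, 1.0, 1.0, 1.0]
crude_v = [0.0, 0.5, 1.0, 0.5, 0.0]
print(weighted_intensity(wavelengths, flat_energy, crude_v))  # 100.0
```

The same physical energy contributes little when it falls at wavelengths where the sensitivity is low, which is the point of weighting by the standard observer rather than using a purely physical metric.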
There are a few points to note about luminance specifications. First, there is no
subjectivity inherent in the measurement of luminance. One simply measures the
energy at each wavelength and multiplies this value by the relative sensitivity of
the standard observer at that wavelength. Alternatively, one may directly
measure luminance with a meter -- a meter that has been calibrated to have the
sensitivity of the CIE standard observer. Second, while there is no subjectivity in
the measurement of luminance, it was the original intent of the CIE to develop
a metric that would be closely related to the brightness or subjective intensity
of a visual stimulus. As we shall see, the brightness of a stimulus depends on
many variables such as the preceding or surrounding illumination. These
variables are not taken into account in specifying luminance. Thus, the
luminance of a stimulus is often of no value in specifying brightness (Kinney,
1983). The term luminance should be reserved for the specification of light
intensity, and the term brightness should be reserved for a description of the
appearance of a stimulus.
Dark Adaptation

Most of us have groped around in a dark movie theater until our eyes adjusted
to the dim level of illumination. This process is called dark adaptation, and it
occurs, in part, because our receptors need time to achieve their maximum
sensitivity, or minimum threshold. If we were to measure the minimum amount of
light required to see at various times, i.e., our threshold, after we entered a
darkened room, we could plot a dark adaptation curve such as that shown in
Figure 2.13 (reprinted by permission from C.H. Graham, Ed., Vision and Visual
Perception, John Wiley & Sons, Inc., New York, NY, 1965, p. 75). This curve
indicates that the eye becomes progressively more sensitive in the dark, but
notice that the curve has two distinct phases. The first phase, which lasts
about seven minutes, is attributed to the cone system, and the second phase, to
the rod system. When we first enter the dark our cones are more sensitive than
the rods, but after about seven minutes, the rods become more sensitive.

Figure 2.13. Threshold decrease during adaptation to darkness, showing the cone
(top) branch and the rod branch as a function of time in the dark (min). (from
Graham, 1965)

What explains the greater sensitivity of rods over cones in a dark theater? Part
of the answer is related to the fact that there are many more rods than cones.
Second, because the rods contain more photopigment than cones, they absorb
more quanta. To consider the third explanation for the difference in scotopic
and photopic sensitivity, we must look at the connections of rods and cones to
other neural elements in the retina. Several cones are often connected to a
single bipolar cell. This is termed convergence because the signals from several
cones come together at one cell. The more receptors converging on a single cell,
the greater the chances are of activating that cell.
Sensitivity-Resolution Trade-Off

A dim light that produces a weak signal in many rods has a greater chance of
being detected because many rods summate their signals on another cell. Their
combined effects can produce a signal strong enough for visual detection.
Detection of a weak signal by cones is less likely because their spatial
summation of signals occurs over much smaller regions than rods. Convergence,
a structural property of many neural-sensory systems, thus enhances sensitivity.
(Signals in receptors can also be added together over time, a process known as
temporal summation, and this occurs over longer durations for rods than for
cones.)
While it may seem advantageous to summate visual signals over a wide region
of the retina to enhance sensitivity, it should be noted that this is associated
with a loss of resolution or acuity. That is, whenever signals are combined,
information about which receptors generated the signals is lost. Conversely, if
information from receptors is separated, there is a greater possibility of
localizing which receptors are activated and thereby resolving the locus of
stimulation. Thus, there is a trade-off between sensitivity and resolution.
Because rods pool their signals over larger retinal regions than cones, they
enhance sensitivity at the cost of spatial resolution. Cones, on the other hand,
summate information over small regions of retina and favor high resolution at
the expense of sensitivity.
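
The trade-off can be illustrated with a toy model. This model is an assumption made here for illustration and is not taken from the report: each receptor contributes a small signal, a downstream cell sums the signals from a pool of neighboring receptors, and a stimulus is detected when any pooled sum reaches a threshold.

```python
def pooled_responses(receptor_signals, pool_size):
    """Sum receptor signals over consecutive, non-overlapping pools."""
    return [sum(receptor_signals[i:i + pool_size])
            for i in range(0, len(receptor_signals), pool_size)]

def detected(receptor_signals, pool_size, threshold=1.0):
    """True if any pooled sum reaches the (hypothetical) threshold."""
    return any(r >= threshold
               for r in pooled_responses(receptor_signals, pool_size))

# A dim light gives each of 8 neighboring receptors a weak signal of 0.25.
dim_light = [0.25] * 8
print(detected(dim_light, pool_size=1))  # False: no single receptor reaches 1.0
print(detected(dim_light, pool_size=4))  # True: four summed signals reach 1.0

# Resolution suffers as pools grow: with pool_size=1 the response
# distinguishes 8 positions, with pool_size=4 only 2.
print(len(pooled_responses(dim_light, 1)), len(pooled_responses(dim_light, 4)))  # 8 2
```

Larger pools detect weaker stimuli, but the pooled response only identifies which pool was stimulated, not which receptor within it.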
Visual acuity, or resolution, is often defined in terms of the smallest detail that
an observer can see. This is measured by the familiar eye chart with varying
letter sizes viewed at a fixed distance. Visual acuity tested with such a chart is
defined by the smallest letter that can be read. When an individual has, for
example, an acuity of 20/40 or 0.5 it means that at a distance of 20 feet, the
individual just resolves a gap in a letter that would subtend 1 minute of arc at
a distance of 40 feet (see Riggs, 1965 for other details). In many states, a
person is legally blind if visual acuity is 20/400 or worse.
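
The Snellen notation converts directly to a decimal acuity and to the size of the smallest resolvable gap. A minimal sketch follows, with function names chosen here for illustration and a small-angle approximation assumed (the angle subtended by a fixed gap is taken to be inversely proportional to viewing distance).

```python
def decimal_acuity(test_distance_ft, reference_distance_ft):
    """Decimal acuity from a Snellen fraction such as 20/40."""
    return test_distance_ft / reference_distance_ft

def smallest_gap_arcmin(test_distance_ft, reference_distance_ft):
    """Gap size just resolved at the test distance, in minutes of arc,
    given that the same gap subtends 1 minute of arc at the reference
    distance (small-angle approximation)."""
    return reference_distance_ft / test_distance_ft

print(decimal_acuity(20, 40))        # 0.5
print(smallest_gap_arcmin(20, 40))   # 2.0
```

An observer with 20/40 acuity therefore needs details about twice as large, in angular terms, as an observer with 20/20 acuity.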
Visual acuity varies with luminance, as shown in Figure 2.14. In the scotopic
range, visual acuity is dependent on rods and is very poor. As light intensity
increases into the photopic range, visual acuity is more dependent on cones and
dramatically improves. Note, however, that even after cones "take over," visual
acuity continues to vary with light intensity. The data in Figure 2.14 represent
more or less ideal conditions. When a stimulus is moving or the display is
vibrating (as in turbulence), visual acuity may be considerably reduced.

Figure 2.14. Visual acuity plotted as a function of log luminance. (from Hecht,
1934)

Damage Thresholds
The human visual system is extremely sensitive to light -- so sensitive that when
light is very intense, the receptors of the retina can be permanently damaged. A
common example of this is the blindness that occurs subsequent to viewing a
solar eclipse.
Ham and colleagues (1982) conducted experiments with rhesus monkeys (who
had their lenses surgically removed) to determine which wavelengths of light
are most damaging to the retina. Because rhesus monkeys have a retina that is
nearly identical to that of humans, the results of these experiments can be
generalized to humans. The results are presented in Figure 2.15 in terms of
relative sensitivity to damage as a function of wavelength.
Damage observed by Ham et al. occurred to the receptors as well as to the cells
behind the receptors that are necessary for receptor function, cells in the layer
known as the retinal pigment epithelium. The data points in Figure 2.15
indicate that any wavelength of light, in sufficient intensity, can be damaging to
the retina. Note, however, that the short wavelengths in the visible spectrum
(ca. 450 nm) and the ultraviolet wavelengths (300 to 400 nm) are most
effective in producing damage.

Figure 2.15. Relative sensitivity to retinal damage plotted as a function of
wavelength. (data from Ham et al., 1982; original figure)


The absorption of light at different wavelengths by the human lens and the
macular pigment, a yellow pigment concentrated around the fovea, is indicated
in Figure 2.15 by the hatched and screened areas, respectively. Since these
pigments absorb the light indicated by the areas shown, they substantially
reduce the intensity of the most hazardous wavelengths before that light
reaches the retinal receptors. Thus, our lens and macular pigment provide a
natural source of protection from light damage. Unfortunately, these natural
filters do not always provide sufficient protection against the hazardous effects
of ultraviolet radiation and many researchers advise additional protection
against the long-term effects of radiation which may accumulate over the life
span and contribute to aging of the retina (Werner, Peterzell & Scheetz, 1990)
and possibly certain diseases of the retina such as age-related macular
degeneration (Young, 1988). Because the intensity of ultraviolet radiation
increases with increasing altitude, these concerns may be especially important to
airline pilots.
Eye Movements
The field of view for humans is about 180° for the two eyes combined, as
shown in Figure 2.16, but as we have seen in Figure 2.10, the receptor mosaic
varies with retinal eccentricity and so we rely on a smaller portion of the retina
for processing detailed information. In particular, the act of fixation involves
head and eye movements that position the image of objects of interest onto the
fovea. This can be seen by recording the eye movements of someone who is
viewing a picture. Figure 2.17 shows some recordings made by Yarbus (1967)
of eye movements while viewing pictures. Notice that the eye moves to a point,
fixates momentarily (producing a small dot on the record), and then jumps to
another point of interest. Notice also that much of the fixation occurs to
features or in areas of light-dark change. Homogeneous areas normally do not
evoke prolonged inspection. For information to be recognized or identified
quickly and accurately, movements of the eye must be quick and accurate. This
is accomplished by six muscles that are attached to the outside of each eye.
These muscles are among the fastest in the human body.

Figure 2.16. Visual field for humans: about 180°. (from Sekuler & Blake, 1985)

There are two general classes of eye movements: vergence and conjunctive.
Movements of the two eyes in different directions -- for example, when both
eyes turn inward toward the nose -- are called vergence movements. These
movements are essential for fixating objects that are close. The only way both
eyes can have a near object focused on both foveas is by moving them inward.
Eye movements that displace the two eyes together relative to the line of sight
are known as conjunctive eye movements. There are three types of conjunctive
eye movements: saccadic, pursuit, and vestibular. Saccadic eye movements are
easily observed when asking a person to change fixation from one point in
space to another. A fast ballistic movement is engaged to move the eye from
one point to the next. Careful measurements show that the delay between
presentation of a peripheral stimulus and a saccade to that stimulus is on the
order of 180 to 250 msec. The movement itself only requires about 100 msec
for the eyes to travel a distance of 40° (Alpern, 1971). Saccadic movements of
the eyes are necessary to extract information from our environment. For
example, during reading we may make as many as four small saccades across a
line of type and one large saccade to return the eye to the beginning of the
next line. We engage in many thousands of saccadic eye movements each day.

Figure 2.17. Eye movements while viewing pictures; small dots are fixations.
(from Yarbus, 1967)

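
The saccade timing figures quoted above give a feel for how quickly gaze can be redirected. The sketch below simply combines them; treating the movement as having a constant average velocity is a simplification assumed here, and the function names are illustrative.

```python
def average_saccade_velocity(amplitude_deg, duration_ms):
    """Average angular velocity in degrees per second."""
    return amplitude_deg * 1000.0 / duration_ms

def time_to_refixate_ms(latency_ms, duration_ms):
    """Total time from stimulus onset until the eye lands on the target."""
    return latency_ms + duration_ms

# A 40 degree movement completed in about 100 msec
print(average_saccade_velocity(40, 100))   # 400.0
# Latency of 180 to 250 msec plus a 100 msec movement
print(time_to_refixate_ms(180, 100), time_to_refixate_ms(250, 100))  # 280 350
```

Under these figures, most of the time needed to bring a peripheral stimulus onto the fovea is spent in the delay before the eye starts to move, not in the movement itself.
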
One of the great mysteries in eye movement research has to do with why we
don't notice our eye movements. If the visual image in a motion picture were
moved around the way the eyes move the visual image, it would be very
disconcerting. The same motion of the image due to movement of the eye
results in the appearance of a stable world. Part of the reason that saccadic eye
movements are not disruptive has to do with an active suppression of visual
sensitivity for about 50 msec before and after a saccadic eye movement
(Volkmann, 1962). A similar reduction in visual sensitivity also occurs during
blinks (Riggs, Volkmann & Moore, 1981), which is probably why we do not
notice "the lights dimming" for one-third second, every four seconds, which is
about the duration and frequency of eye blinking. The light is reduced by about
99% during a blink, but this change is seldom noticed.
Still another reason we may fail to notice blurring during a saccadic eye
movement is visual masking. When two stimuli are presented in quick
succession, one stimulus may interfere with seeing the other. For example,
threshold for detecting a weak visual stimulus will increase if a more intense
stimulus is presented just before or just after the weak stimulus is presented.
Similarly, the sharp images seen just before and after an eye movement may
mask the blurred stimulus created during the saccade (Campbell & Wurtz,
1978).
While saccadic eye movements allow the eye to "jump" from one point to
another, pursuit eye movements allow the eye to move slowly and steadily to
fixate a moving object. These movements are very different from saccades and
are controlled by different mechanisms in the brain. Saccadic eye movements
are programmed to move the eye between two points with no changes in the
direction of movement once the saccade has begun. Pursuit movements require
brain mechanisms to determine the direction and velocity of a moving object for
accurate tracking. Indeed, accurate tracking for slow moving objects is possible,
but the accuracy decreases with increasing target speed.
Vestibular movements of the eye are responsible for maintaining fixation when
the head or body moves. To maintain fixation during head movement, there
must be compensatory changes in the eyes. The movement of the head is
detected by a specialized sensory system called the vestibular system, and
head-position information is relayed from the vestibular system to the brainstem
areas controlling eye movements. Although we are seldom aware of vestibular
eye movements, they are essential for normal visual perception. Some antibiotics
have been known to temporarily impair function of the vestibular system and
eliminate vestibular eye movements. Under these conditions, it is virtually
impossible to read signs or recognize objects because the lack of eye movements
to compensate for head movements makes the world appear to jump about.

Figure 2.18. Eye movement records illustrating physiological nystagmus. (from
Ditchburn, 1955)

Even when we are intently fixating an object, small random contractions of the
eye muscles keep the eyes moving to some extent. These tiny eye movements
are known as physiological nystagmus and include tiny drifting eye movements
and microsaccades (Ditchburn, 1955). Physiological nystagmus is illustrated by
eye movement recordings shown in Figure 2.18. The numbered dots represent
successive time intervals of 200 msec. The large circle encompasses only 5
minutes of arc, so the movements are quite small, some on the order of the
diameter of two photoreceptors.
One might wonder what would happen if the eyes did not move at all. To
answer this question, Riggs et al. (1953) designed a clever apparatus in which
the observer wore a contact lens with a mirror attached to it. Light from a
projector that bounced off the mirror was projected onto the wall. Thus, when
the eye moves, the mirror moves and, of course, so does the visual stimulus. As
a consequence, the projected image always falls on the same part of the retina,
and is called a stabilized retinal image. The visual experience with a stabilized
image is startling. Borders fade away and eventually the entire visual image
disappears. In other words, when the retina is uniformly stimulated, the eye
becomes temporarily blind to the image. Small movements of the eye destabilize
the retinal image and make vision possible.
The fact that stabilized images disappear explains why we don't see the blood
vessels in our own eyes. Figure 2.19, for example, shows the blood vessels in
the eye which lie in front of the receptors. This means that when light passes
into the eye and strikes the vessels, a shadow is cast on the retina. Because this
shadow moves wherever the eye moves, it is stabilized and we don't see it. You
can actually see the blood vessels in your eye by doing the following. Take a
small flashlight and position it close to the outside corner of your eye. Look
straight ahead in a dark room and shine the light directly into the eye while
moving it quickly back and forth. The moving light causes the shadows to move
and hence the images are no longer stabilized and become visible.

Figure 2.19. The dark lines show retinal blood vessels. The central circle,
devoid of blood vessels, represents the fovea. (from Polyak, 1941)

Temporal Vision
Many visual stimuli change over time, and the change itself can provide
compelling information about the stimulus. Indeed, sometimes it is only the
temporal variation of a stimulus that allows it to be detected, discriminated,
or recognized.
Flicker

If a light is turned on and off in rapid succession, we will experience a
sensation that we call flicker. If the frequency of oscillations, measured in
cycles per second (cps or Hz), is high enough, the flicker will no longer be
perceptible. This is known as the critical flicker fusion (CFF) frequency. At
high light levels, CFF may occur at
frequencies as high as 60 Hz. The fact that flicker fuses at high frequencies
explains why fluorescent lamps appear to be steady even though they are going
on and off at 120 cps. Here, we will discuss some of the main parameters that
determine CFF; see Brown (1965) for a complete review of the literature.
Our ability to detect flicker depends on the light level; as luminance increases,
flicker is easier to detect. Figure 2.20 shows CFF as a function of luminance
(Hecht & Smith, 1936). It is clear that CFF depends on both the light level and
the stimulus area. When the area is large enough to stimulate both rods and
cones, the curve has two branches. When cones dominate sensitivity, CFF
increases linearly with light level over a wide range before reaching an
asymptote. The lower branch of each two-part curve is mediated by rods which
have relatively low sensitivity to flicker.
The data shown in Figure 2.20 were obtained by having the subjects view the
center of the stimulus. The effect of increasing area was, therefore, partially
confounded with retinal location. You may have noticed this yourself under
nonlaboratory conditions: a large object like a computer screen may appear
steady when viewed directly, but flicker when seen in your periphery. A
flickering stimulus (e.g., part of a display) in the periphery can be very
distracting as it efficiently attracts attention.

Figure 2.20. Critical flicker fusion for a centrally viewed stimulus plotted as a
function of log luminance. Different curves show different stimulus sizes. (from
Hecht & Smith, 1936)

In Figure 2.21, data are presented showing how sensitivity to flicker varies with
the intensity of the stimulus and retinal location (Hecht & Verrijp, 1933). Data
were obtained with a 2° stimulus that was viewed foveally (0° in the figure)
and at 5° and 15° eccentric to the fovea. It appears that a single curve can
account for the changes in CFF with intensity in the fovea, but to describe CFF
at more peripheral locations requires curves with two branches. From the data
in the figure, one can see that the relationship between CFF and retinal
location is complex. At high light levels there is a decrease in CFF from the
fovea to the periphery, whereas the reverse is true at low light levels. Flicker
sensitivity declines rather markedly as a function of increasing observer age, as
shown in Figure 2.22. This is to be expected at least in part because the light
transmitted by the lens decreases with age, and flicker sensitivity is dependent
on light level. It is still not entirely clear whether there are additional neural
changes associated with age-related changes in CFF or whether these changes in
flicker sensitivity are secondary to changes in light level alone (Weale, 1982). In
any case, this means that a display may appear to be flickering for one observer
while an older observer would see the display as steady, i.e., no flicker.

Figure 2.21. Critical flicker fusion for a 2° stimulus plotted as a function of
log luminance. Different curves show CFF for different retinal loci. (from
Hecht & Verrijp, 1933)

The CFF measurements discussed so far were obtained with a stimulus that was
either completely on or completely off. If we were to draw a graph of the
intensity over time it would look like shape 1 illustrated in Figure 2.23, and is
known as a square wave. With specialized equipment, deLange (1958) also
measured flicker sensitivity using other waveforms (changes in light intensity
over time) that are shown by the inset in Figure 2.23, and at three different
light levels. In each case, the stimulus was repeatedly made brighter and
dimmer at the frequency specified on the horizontal axis. The vertical axis plots
the "ripple ratio," or amplitude of modulation, which refers to the amount that
the light must be increased and decreased relative to the average light level to
just detect flicker. Figure 2.23 thus illustrates our sensitivity to flicker at all
different frequencies. It can be seen that we are most sensitive to flicker at
about 10 Hz. At higher frequencies, the amplitude of modulation must be
increased in order for flicker to be detected.
Figure 2.22. Critical flicker fusion plotted as a function of age for six different studies which
used different stimulus conditions. (from Weale, 1982)

We noted in our discussion of hearing that the response of the human auditory
system to complex sounds can be predicted by decomposing the complex tones
into a set of pure tones, or sinusoidal waveforms. deLange (1958) applied this
approach to the different waveforms of his flickering stimuli by mathematically
analyzing them in terms of a set of sine-wave components (using Fourier
analysis). Figure 2.23 shows a plot of the amplitude of modulation for the
fundamental component of the different waveforms, i.e., the amplitude of the
lowest frequency contained in the complex wave. When analyzed in this way, it
appears as though sensitivity to flicker for complex waves can be predicted by
the response of the eye to the various sinusoidal components into which the
complex wave can be decomposed.
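As a minimal sketch of what "the amplitude of the fundamental component" means, the following computes the fundamental of a square wave in two ways: from the standard Fourier-series result (4/pi times the square-wave amplitude) and by numerical integration over one period. The function names, the 10% modulation value, and the sample count are illustrative assumptions, not values taken from deLange (1958).

```python
import math

def square_wave_fundamental(modulation: float) -> float:
    """Closed-form amplitude of the fundamental sine component of a
    symmetric square wave that swings +/- `modulation` about its mean."""
    return 4.0 * modulation / math.pi

def fundamental_by_integration(waveform, n_samples: int = 10_000) -> float:
    """Estimate the fundamental amplitude of a periodic waveform
    (period 1, expressed as deviation from the mean) by midpoint integration."""
    a = b = 0.0
    for k in range(n_samples):
        t = (k + 0.5) / n_samples
        y = waveform(t)
        a += y * math.cos(2.0 * math.pi * t)
        b += y * math.sin(2.0 * math.pi * t)
    a *= 2.0 / n_samples
    b *= 2.0 / n_samples
    return math.hypot(a, b)

def square(t: float, modulation: float = 0.10) -> float:
    """Hypothetical stimulus: light is above the mean for half of each cycle."""
    return modulation if (t % 1.0) < 0.5 else -modulation

closed_form = square_wave_fundamental(0.10)
numeric = fundamental_by_integration(square)
print(f"fundamental amplitude: closed form {closed_form:.4f}, numeric {numeric:.4f}")
```

Under these assumptions the fundamental is somewhat larger than the square-wave amplitude itself, which is why waveforms of different shapes must be compared in terms of their fundamental rather than their raw modulation.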
Motion
If you doubt that motion is a fundamental perceptual quality, try to imagine
what life would be like without the ability to experience it. A rare case of
damage to a part of the brain (called area MT) that appears to be specialized
for analysis of motion occurred in a woman in Munich. The scientists who
studied this woman noted what it was like:
She had difficulty, for example, in pouring tea or coffee into a
cup because the fluid appeared to be frozen, like a glacier. In
addition, she could not stop pouring at the right time since she
was unable to perceive the movement in the cup (or a pot) when
the fluid rose.... In a room where more than two other people
were walking she felt very insecure and unwell, and usually left
the room immediately, because 'people were suddenly here or
there but I have not seen them moving.'... She could not cross
the street because of her inability to judge the speed of a car,
but she could identify the car itself without difficulty. 'When I'm
looking at the car first, it seems far away. But then, when I want
to cross the road, suddenly the car is very near.'
(From Zihl, von Cramon & Mai, 1983, p. 315).

Figure 2.23. Modulation amplitude (r%) of the fundamental component contained in the waves
plotted as a function of flicker frequency. (from deLange, 1958)

Figure 2.24 shows a square comprised of dots that are arranged in a random
order, and a set of dots arranged so that they spell a word. If the two sets of
dots are printed on a transparent sheet and superimposed, no word can be read
and one observes only a set of dots. However, if one sheet moves relative to the
other, the dots that move together form a clearly legible word. Structure
emerges from the motion information. This illustrates one of the many functions
of motion -- to separate figure and ground. When an object moves relative to a
background, the visual system separates the scene into figure and ground.

Figure 2.24. If the two sets of dots are superimposed, no pattern can be detected. However,
if one set of dots moves relative to the other, the word "motion" will be clearly
visible.

Our perception of motion is influenced by many factors. Our perception of
motion speed is affected by the sizes of moving objects and background.
Measures of motion thresholds indicate that we can detect movement of an object
against a stationary background at speeds on the order of 1 to 2 minutes of arc per second.
However, when the background cues are removed, motion thresholds increase
by about a factor of ten (see Graham, 1965b). These thresholds also depend on
the size of the moving object and background. For example, Brown (1931)
compared movement of circles inside rectangles of different size, as illustrated
by Figure 2.25. Observers were asked to adjust the speed of one of the dots to
match the experimenter-controlled speed of the other. He found that in the
large rectangle, the spot had to move much faster than in the small rectangle to
be perceived as moving at the same speed. As a general rule, when different
size objects are moving at the same speed, the larger one will appear to be
moving more slowly than the small one. Leibowitz (1983) believes that this is
the reason for the large number of fatalities at railroad crossings. Large
locomotives are easily seen from the road, but they are perceived to be moving
more slowly than they really are. As a consequence, motorists misjudge the
amount of time they have to cross the tracks.
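The motion thresholds quoted above are angular speeds, so whether a given object is seen to move depends on both its linear speed and its distance. The sketch below is a rough illustration under stated assumptions: the object moves perpendicular to the line of sight, the small-angle approximation holds, the threshold is taken as 2 minutes of arc per second, and removing background cues multiplies it by ten. The function names and the example speed and distance are hypothetical.

```python
import math

ARCMIN_PER_RADIAN = 60.0 * 180.0 / math.pi

def angular_speed_arcmin_per_s(speed_m_per_s: float, distance_m: float) -> float:
    """Small-angle angular speed of an object crossing the line of sight."""
    return (speed_m_per_s / distance_m) * ARCMIN_PER_RADIAN

def motion_detectable(speed_m_per_s: float, distance_m: float,
                      background_visible: bool = True,
                      threshold_arcmin_per_s: float = 2.0) -> bool:
    """Compare angular speed with the threshold; without background cues
    the threshold is assumed to be ten times higher."""
    factor = 1.0 if background_visible else 10.0
    speed = angular_speed_arcmin_per_s(speed_m_per_s, distance_m)
    return speed > threshold_arcmin_per_s * factor

# Hypothetical case: an object drifting sideways at 1 m/s, seen from 1000 m.
print(angular_speed_arcmin_per_s(1.0, 1000.0))                  # about 3.4 arcmin/s
print(motion_detectable(1.0, 1000.0, background_visible=True))
print(motion_detectable(1.0, 1000.0, background_visible=False))
```

In this example the same physical movement is above threshold when a stationary background is visible and below threshold when it is not.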
Figure 2.25. Illustration of experiment by Brown (1931). The left circle must move faster than
the one on the right for the two to be perceived as moving at the same speed.
(from Goldstein, 1984)

Most of the motion that we observe involves actual displacement of objects over
time, but this is not a necessary condition for the experience of motion. For
example, a compelling sense of motion occurs if we view two lights, separated
in space, that alternately flash on and off with a brief time interval between the
flashes (about 60 msec). This movement is called stroboscopic motion, and it is
very important for motion pictures because films are merely a set of still
pictures flashed in quick succession. Stroboscopic movement is also important
for understanding how motion is actually perceived because it demonstrates that
it is a perceptual quality of its own, rather than a derivative of our sense of
time and space.
In early studies of stroboscopic movement, Wertheimer (1912) discovered that
the apparent movement of two spots of light in the above demonstration goes
through several different stages depending on the time interval between the
flashes. If the interval was less than 30 msec, no movement was detected.
Between about 30 and 60 msec there was partial or jerky movement, while at
about 60 msec intervals the movement appeared smooth and continuous.
Between about 60 and 200 msec, movement could be perceived, but the form of
the object could not (objectless movement). Above about 200 msec, no
movement was detected. Of course, these values depend on the distance
between the two stimuli, but at all distances the different stages could be
identified.
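The stages described above can be summarized as a simple lookup on the inter-flash interval. This is only a sketch: the boundaries are the approximate values quoted in the text, and the width of the band treated as "about 60 msec" (the `smooth_band_ms` parameter) is an assumption introduced here for illustration. As the text notes, the actual values shift with the distance between the two lights.

```python
def apparent_motion_stage(interval_ms: float, smooth_band_ms: float = 5.0) -> str:
    """Classify the percept for two alternating flashes by their time interval.

    Boundaries follow the approximate values in the text; `smooth_band_ms`
    is a hypothetical tolerance around 60 msec, not a measured quantity.
    """
    if interval_ms < 30.0:
        return "no movement"
    if abs(interval_ms - 60.0) <= smooth_band_ms:
        return "smooth, continuous movement"
    if interval_ms < 60.0:
        return "partial or jerky movement"
    if interval_ms <= 200.0:
        return "objectless movement"
    return "no movement"

for ms in (10, 45, 60, 120, 300):
    print(ms, "msec:", apparent_motion_stage(ms))
```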
Still another type of movement perception occurs without actual movement of
the object. For example, induced movement occurs when a background moves in
the presence of a stationary object, but it is the object not the background that
is seen as moving. You may have had this experience looking at the moon
when clouds were moving quickly in the wind; it is not unusual to have the
experience of the moon moving across the sky.


On a clear and quiet night looking at a star against a dark sky you may also
have experienced illusory movement of the star. The effect is easily
demonstrated by looking at a small light on a dark background. It may start to
move, even though it is rigidly fixed in place. This illusory movement is known
as the autokinetic effect. It is not well understood, but some researchers believe
it may be due to drifting movements of the eyes (Matin & MacKinnon, 1964).
Whatever the cause, one can imagine practical situations in which the
autokinetic effect has the potential to cause errors in judgment.


Chapter 3
Color Vision
by John S. Werner, Ph.D., University of Colorado at Boulder
Color Mixture
From the scotopic spectral sensitivity curve (Figure 2.12, p. 22) it is clear that
rods are not equally sensitive to all wavelengths. Why, then, do all wavelengths
look the same to us when they stimulate only the rods? The answer is that a
rod can only produce one type of signal regardless of the wavelength that
stimulates it. That is, all absorbed quanta have the same effect on a single
receptor, and, therefore, it can only pass on one type of signal to the brain.
Thus, even though some wavelengths are more easily absorbed than others,
once absorbed they all have the same effect.


If each receptor cell has only one type of response, what explains how we use
our cones to see color? The answer is that we have three different types of
cones. They differ because each type contains a different photopigment.
Figure 3.1 shows the absorption spectra -- plots of relative absorption as a
function of wavelength -- for the three types of photopigment contained in
human cones. Note that each type is capable of absorbing over a broad
wavelength range. One type maximally absorbs quanta at about 440 nm,
another at about 530 nm, and the third type at about 560 nm. We call these
three types of receptors short-, middle-, and long-wave cones, based on their
wavelength of maximal sensitivity.

Figure 3.1. Absorption of the cone and rod photopigments as a function of wavelength.
The curves have been normalized to the same height. (after Bowmaker et al., 1980)

Now suppose we look at two monochromatic lights presented side-by-side. If the
wavelengths are 450 and 605 nm respectively, we would probably describe the
lights as reddish blue and yellowish red. Note that these two wavelengths are
equally absorbed by the middle-wave cones. The same quantal absorption for two
lights means that a single receptor must produce the same signal for the two
lights. The 450 nm light will, however, elicit a much stronger signal in the
short-wave cones than in the long-wave cones, and the opposite will occur for
the 605 nm light. Thus, both monochromatic lights will produce signals in all
three cone types, but the pattern of activity will differ among them. This
pattern of receptor activity is transmitted to the brain and allows us to
discriminate a difference in the two wavelengths.
Because our cone system produces these different patterns of response to
different wavelengths, it can distinguish changes in intensity and wavelength.
But not all differences can be discriminated. In the mid-1800s, Helmholtz and
Maxwell performed experiments by having a subject match a light composed of
three different wavelengths with a light containing only one wavelength. They
discovered that any single wavelength of light can be perfectly matched by a
mixture of three other wavelengths. This match is possible because the three
combined wavelengths produce the same pattern of activity in the different cone
types that is produced by the one wavelength alone. Thus, an observer perceives
the two physically different patches of light as identical.
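The matching experiment can be viewed as a small linear-algebra problem: find the amounts of three primaries whose combined effect on the three cone types equals the effect of the test light. The sketch below solves that system with Cramer's rule. All numbers are hypothetical cone responses invented for illustration, not measured sensitivities, and the function names are assumptions.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(a, b):
    """Solve a 3x3 linear system a @ x = b by Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("primaries are not independent")
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        solution.append(det3(m) / d)
    return solution

# Rows: L, M, S cone response. Columns: three primaries (hypothetical values).
PRIMARY_RESPONSES = [
    [0.80, 0.60, 0.05],
    [0.30, 0.80, 0.10],
    [0.00, 0.05, 0.90],
]
# Hypothetical L, M, S responses to the single-wavelength test light.
TEST_LIGHT = [0.70, 0.55, 0.05]

def cone_responses(weights):
    """Cone responses produced by a mixture of the three primaries."""
    return [sum(PRIMARY_RESPONSES[r][c] * weights[c] for c in range(3))
            for r in range(3)]

weights = solve3(PRIMARY_RESPONSES, TEST_LIGHT)
print("primary amounts:", [round(w, 4) for w in weights])
print("mixture responses:", [round(v, 4) for v in cone_responses(weights)])
```

When the mixture responses equal the test-light responses, the two patches are indistinguishable to this model observer even though they differ physically. If a solved amount comes out negative, the usual interpretation is that the primary would have to be added to the test light rather than to the mixture.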
Our three types of cone receptors allow us to discriminate different
wavelengths, but the example above showed how this system can be fooled.
Actually, it is this very limitation that allows us to have electronic color
displays. The image on the display consists of many small spots of light, or
pixels. Three contiguous pixels, containing different phosphors, may produce
either red, green, or blue. These three phosphors are so small and close
together that the light produced by them is blended in the retinal image. The
colors on the display are created by electrically exciting these phosphors to
produce the amounts of the three lights that produce the color we see.
Variation in Cone Types with Retinal Eccentricity
Figure 2.10 (Chapter 2, p. 20) showed the distribution of rods and cones with
varying eccentricity. A careful examination of the distribution of cones would
show that there are asymmetries in the distribution of cones. At any given
eccentricity, the nasal retina has a higher density of cones than the temporal
retina. There appear to be no asymmetries along the superior to inferior
meridian. The practical consequences of the retinal asymmetry in cone
distribution are not clear, although it has been shown that color vision is, in
some sense, better in the nasal compared to the temporal retina (Uchikawa,
Kaiser & Uchikawa, 1982).

Figure 3.2. The number of short-, middle-, and long-wave sensitive cones per square mm
in a baboon retina as a function of retinal eccentricity. (after Marc & Sperling, 1977)

The distribution of the three cone types also varies with retinal eccentricity,
as shown by Figure 3.2. The data presented in this figure are actually from a
baboon retina and are believed to be similar to the human cone distribution
with an important exception. Whereas the baboon has more M than L cones,
humans have more L than M cones. In fact, for the central 2° of retina, the
ratio of L:M:S cones is about 32:16:1 (Vos & Walraven, 1971). The relative
scarcity of S cones has important implications for visual perception. Partly
because of their numbers and partly because of their neural connections, the S
cones make a negligible contribution to high spatial acuity and high temporal
sensitivity (Kelly, 1974). The S cones are important for color discriminations.
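To make the scarcity of S cones concrete, the 32:16:1 ratio quoted above can be converted into fractions of the cone population. This is simple arithmetic on the ratio given in the text; the function name is an assumption.

```python
def cone_fractions(l: float = 32, m: float = 16, s: float = 1) -> dict:
    """Convert an L:M:S cone ratio into the fraction of cones of each type."""
    total = l + m + s
    return {"L": l / total, "M": m / total, "S": s / total}

fractions = cone_fractions()
for cone_type, share in fractions.items():
    print(f"{cone_type} cones: {100 * share:.1f}% of cones")
```

On this ratio, S cones make up only about one cone in 49, or roughly 2% of the cones in the central retina.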
The inset of Figure 3.2 shows a magnified scale of the retinal distribution of S
cones. There are virtually no S cones in the center of the fovea. This means
that color discriminations that depend on S cones will be impaired if the image
is sufficiently small to fall only on the center of the fovea. This is illustrated by
Figure 3.3. When viewed close, so that each circle subtends a visual angle of
several degrees, it is easy for an individual with normal color vision to
discriminate the various pairs: yellow vs. white, blue vs. green, and red vs.
green. Viewed from a distance of several feet, however, the yellow and white,
as well as the blue and green, pairs will be indiscriminable. This is called
small-field tritanopia, because tritanopes are individuals who completely lack S
cones. A tritanope would not be able to discriminate the yellow from the white
in Figure 3.3 regardless of their sizes. With certain small fields, even normal
individuals behave like tritanopes. Notice that with the small field condition, the
red-green pair is still discriminable because S cones are not necessary for this
discrimination. Thus, the small-field effect is limited to discriminations that
depend on S cones.
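Whether a colored patch is "small" in this sense depends on the visual angle it subtends, which follows from its physical size and the viewing distance. The sketch below uses the standard visual-angle formula; the 2 cm circle and the two viewing distances are hypothetical values chosen for illustration.

```python
import math

def visual_angle_deg(size: float, distance: float) -> float:
    """Visual angle (degrees) subtended by an object of a given size seen
    face-on at a given distance; both arguments must use the same unit."""
    return math.degrees(2.0 * math.atan(size / (2.0 * distance)))

# Hypothetical 2 cm circle viewed from reading distance and from across a room.
print(f"at 40 cm:  {visual_angle_deg(2.0, 40.0):.2f} degrees")   # about 2.9 degrees
print(f"at 200 cm: {visual_angle_deg(2.0, 200.0):.2f} degrees")  # about 0.57 degrees
```

The same circle shrinks from a few degrees to a fraction of a degree as the viewer steps back, which is the change in viewing condition the demonstration in Figure 3.3 relies on.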

Figure 3.3. Colors (yellow and white) not discriminable at a distance due to small field.

[Errata note: No blue and green pairs are shown on this page. There is no
reference to blue and green pairs in the new text. Refer to the enclosed errata
sheet for the correct text.]

Color Vision Deficiencies


We take the colorfulness of our world so much for granted that it is hard to
imagine a form of color vision different from our own. Normal color vision is
based on three types of cone receptors, and such individuals are known as
trichromats. An individual can be classified as trichromatic if he or she requires
a mixture of three lights (known as primaries) to match all wavelengths of the
spectrum.
Congenital Deficiencies
Many individuals require three primaries to match all wavelengths of the
spectrum, but the intensity ratio of the three lights needed is not normal. Such
individuals are called anomalous trichromats. The reason for anomalous
trichromacy is that one or more of the cone receptor classes contains a
photopigment that is shifted along the wavelength scale relative to normals.
Since there are three types of cones, there can be at least three types of
anomalous trichromacy, depending on which type of photopigment is shifted.
Anomalous trichromats can be classified as tritanomalous (shifted pigment in the
short-wave cones), deuteranomalous (shifted pigment in the middle-wave cones)
or protanomalous (shifted pigment in the long-wave cones). Tritanomalous color
vision is extremely rare -- so rare that some authorities doubt its existence.
Deuteranomaly and protanomaly are not rare, as can be seen in Table 3.1. In
both of these forms of anomalous vision, the middle- and long-wave pigments
overlap in their sensitivity by a greater degree than normal. This affects not
only their color matching, but also the ability of anomalous trichromats to
discriminate certain wavelengths of light. A more severe form of color deficiency
exists when an individual is completely missing one type of photopigment in the
cones. It should be mentioned that the normal number of cones is present in
such individuals, but the cones are segregated into two classes rather than
three. These individuals are called dichromats because they require only two
primaries to match all wavelengths of the spectrum. There are three types of
dichromat. A person who is missing the short-wave cone photopigment is called
a tritanope, and would have difficulty discriminating white from yellow, for
example. Persons missing the normal middle-wave cone photopigments are
known as deuteranopes and would not be able to discriminate red from green
based on wavelength alone (see Figure 3.3). Red-green discriminations are also
impaired in protanopes, individuals missing the normal long-wave cone
photopigment. Finally, there are some individuals, known as monochromats,
who require only one wavelength of light to match all others of the spectrum.
This implies that the individual is using only one type of receptor in color
matching. Such a person could be a monochromat due to having only one type
of cone (a cone monochromat) or because the individual has no cones (a rod
monochromat). The cone monochromat has one type of cone for photopic
vision and rods for scotopic vision. The rod monochromat has no cones so is
severely impaired in functioning under photopic (day vision) conditions.

Table 3.1
Congenital Color Vision Deficiencies

Type            Abnormality               Prevalence (Males)   Prevalence (Females)
Tritanomaly     Shifted S-Cone Pigment    ?                    ?
Deuteranomaly   Shifted M-Cone Pigment    5.1%                 0.5%
Protanomaly     Shifted L-Cone Pigment    1.0%                 0.02%
Tritanopia      Missing S-Cone Pigment    0.0007%              0.0007%
Deuteranopia    Missing M-Cone Pigment    1.1%                 0.01%
Protanopia      Missing L-Cone Pigment    1.0%                 0.02%

S-Cone: short-wave cone   M-Cone: middle-wave cone   L-Cone: long-wave cone

When anomalous trichromacy or dichromacy is present from birth, the
deficiency is called congenital. The incidence of all forms of color vision
deficiency combined varies across populations; about 8% in Caucasian males,
5% among Asian males, and only 3% in Black and Native American males.
Table 3.1 summarizes the incidence of congenital color vision deficiency in
North America and Western Europe. It is clear that these forms are inherited,
with the most common forms carried by the sex chromosomes. This is why
middle- and long-wave cone deficiencies are about ten times more
prevalent in males than in females.
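As a quick check on how the individual rows of Table 3.1 relate to the overall figure of about 8% quoted above, the listed prevalences can simply be added. The sketch below does this for each sex, skipping tritanomaly because its prevalence is given as unknown. The data structure and function name are assumptions; the numbers are those of Table 3.1.

```python
# (males %, females %) from Table 3.1; None marks the unknown tritanomaly entries.
PREVALENCE_PERCENT = {
    "Tritanomaly":   (None, None),
    "Deuteranomaly": (5.1, 0.5),
    "Protanomaly":   (1.0, 0.02),
    "Tritanopia":    (0.0007, 0.0007),
    "Deuteranopia":  (1.1, 0.01),
    "Protanopia":    (1.0, 0.02),
}

def total_prevalence(sex_index: int) -> float:
    """Sum the known prevalences for males (index 0) or females (index 1)."""
    return sum(row[sex_index] for row in PREVALENCE_PERCENT.values()
               if row[sex_index] is not None)

male_total = total_prevalence(0)
female_total = total_prevalence(1)
print(f"males:   {male_total:.2f}%")
print(f"females: {female_total:.2f}%")
```

The male total of roughly 8.2% agrees with the figure of about 8% cited for Caucasian males, and the female total is well under 1%.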
Acquired Deficiencies

Not all deficiencies of color vision are congenital; some are acquired in later
life. Unlike congenital deficiencies which are due to abnormalities at the level
of the photopigments, acquired deficiencies can be due to disruption of
processing at any level of the visual system. For example, on rare occasions
following a stroke, an individual may experience damage to a particular region
of the brain involved in color processing that will render him or her
permanently color blind. Such a case was reported for a customs official who
had passed color vision screening tests as a condition of employment, but he
could not do so after his stroke (Pearlman, Birch & Meadows, 1971). The man
had good memory for colors, but when given crayons to color a picture, he
appeared completely confused in his selections. Fortunately, such cases of
cortical color blindness are extremely rare.
Other acquired deficiencies of color vision are not rare. Glaucoma and diabetes,
for example, often impair functioning of S cones (Adams et al., 1987). In some
cases, these changes in color vision occur before there are any physical changes
that can be detected by standard clinical testing and before there are changes in
visual acuity. Many acquired defects of color vision do not fit neatly into the
categories of color deficiency that are used to classify congenital losses
(Verriest, 1963). Early in the disease a loss of yellow-blue discrimination is
typically noticed, but this may be followed by impairment of red-green
discriminations. The incidence of acquired defects of color vision in the
population has been estimated at about 5%, but these figures are not
unequivocal.
Some drugs (both recreational and prescription) can cause defects of color
vision. For example, blue-yellow color defects have been associated with certain
medications used in the treatment of psychiatric disorders (e.g., phenothiazine
[Thorazine] and thioridazine hydrochloride [Mellaril]). These effects can persist
even after the medication is withdrawn. A more commonly used drug,
chloroquine, prescribed as an antimalarial drug, has also been associated with
blue-yellow defects. Red-green defects have also been reported as a side effect
of certain medications. Among the drugs involved are certain antibiotics such as
streptomycin and cardiovascular drugs such as Digoxin. The list of drugs that
may impair color vision is actually quite large (see Pokorny et al., 1979), but
the patient is seldom made aware of this possible side effect.
Variation with Age
We have seen that as we get older, the lens of the eye becomes less efficient at
transmitting light, particularly light at short wavelengths. Since color
discrimination is impaired by a reduction in light intensity, it is perhaps not
surprising that performance on color vision tests can change with age. Verriest
(1963) has shown age-related losses in performance on the Farnsworth-Munsell
100-Hue test of color discrimination, and these changes can be mimicked with
young observers who are tested with short-wave absorbing filters placed in
front of their eyes. In general, these losses in discrimination of the elderly are
similar to deficits associated with congenital deficiencies of short-wave cones.


While it might be thought that changes in color vision with advancing age are
only secondary to reductions in light transmission by the lens, this is not the
case. Werner and Steele (1988) measured the sensitivity of each of the three
classes of cones using subjects between the ages of 10 and 84 years. Figure 3.4
presents a summary of their results. Each symbol represents a different
observer's cone sensitivity, and each panel represents one of the three types of
cone receptors. You can see that there are large individual differences at each
age, but there is also a significant reduction in sensitivity throughout life.
Converting from the logarithmic scale used to plot the data, there is a
reduction of approximately 25% in cone sensitivity for each decade of life. This
means that our ability to distinguish between subtly different colors
deteriorates with age.
Figure 3.4. Log sensitivity of short-, middle-, and long-wave cones, measured
psychophysically, plotted as a function of observer age. (data from Werner &
Steele, 1988, figure from Werner et al., 1990)

Testing
Individuals with abnormal color vision are often unaware that their color vision
differs from normals. Even dichromats can often name colors quite well in their
natural environment because reds and greens, or blues and yellows, for example,
may differ in their brightness. Thus, to properly test for color vision
deficiencies, special tests are required.

The most definitive way to measure color vision is through color matching. A
yellow light (590 nm) can be matched with a mixture of a yellowish red (670
nm) and a yellowish green (545 nm). The stimulus used for such a test is
illustrated by Figure 3.5 and is produced by an instrument called an
anomaloscope. Deuteranomalous and protanomalous individuals will differ from
normal in the ratio of the two light intensities in the mixture that is required to
match the yellow. Deuteranopes and protanopes can match the yellow using
only one of the two lights simply by adjusting the intensity. Other wavelength
mixtures can be used to diagnose deficiencies of the short-wave cones.
Unfortunately, anomaloscopes are expensive and not readily available to most
clinicians.

Figure 3.5. A schematic of the split field produced by an anomaloscope.

Perhaps the most familiar test for assessing color vision deficiency involves a
series of plates composed of dots of one color which form a number or simple
geometric form such as a circle or square. Surrounding these dots are others of
a different color. The dots are carefully chosen so that, when illuminated with
the proper lamp, normal individuals will be able to see the number or form but
individuals with color vision deficiencies will not. Various color combinations
are provided by different plates in order to detect different forms of deficiency.
Figure 3.6 shows one of these pseudoisochromatic plates used for testing color
vision. Normal trichromats see a number 46 in this plate, but monochromats,
certain dichromats and anomalous trichromats will not. This test provides an
assessment of deficiencies involving middle- and long-wave cones, but most of
the plate tests are not useful for detecting deficiencies of short-wave cones. This
means that individuals who confuse certain reds and greens are more likely to
be identified than individuals who confuse yellows and blues (or yellows and
whites).
To detect abnormalities of any of the three cone types, a clinician could use the
Panel D-15 test shown in Figure 3.7. This test consists of a number of caps of
different colors. The object of the test is to arrange the caps in a logical color
sequence. One of the caps is fixed in the tray and the subject is asked to place
the one that is most similar next to it in the tray, and then to place the next
most similar near the second cap and so forth. Each of the different types of
color deficient observers will choose a different arrangement of the caps, which
can be scored by reference to numbers on the bottom of the caps.

Figure 3.6. A pseudoisochromatic plate from the Dvorine Plate Test for color
vision deficiencies.

Figure 3.7. Farnsworth Dichotomous Test of Color Blindness, Panel D-15.
Copyright © 1947 by The Psychological Corporation. Reproduced by
permission. All rights reserved.


Noa &tCoai:w
According to the Society of Automotive Engineers, ARP4032 (1988):
"Approximately 3% of private pilots, 2% of commercial pilots, and 1% of airline
transport pilots are known to have some form of color vision deficiency"
(page 12). As already mentioned, individuals with abnormal color vision are
often good at naming colors. People with such deficiencies learn to use other
cues to discriminate colors; they learn, for example, that on a stop light, red is
on top. Many color deficient observers could name the colors in most aircraft
cockpits without having learned position cues. This does not, however, imply
that they can process the colors normally. Discriminating between the colors
may not be normal, especially under conditions in which the colors are
desaturated ("washed out"). Search and reaction times are also impaired in color
deficient observers. Cole and Macdonald (1988) demonstrated this using
cockpit displays with redundant color coding (the meaning of the display
symbols is coded by color and another cue such as shape).
Finally, we have already noted that screening for color vision deficiency requires
certain tests, but it should be emphasized that these tests are only valid when
administered under the proper conditions. The proper illumination of the tests
can be obtained with specialized lamps, but because of their expense they are
not always used. Failure to use the proper illuminant may result in misdiagnosis
or failure to detect a color deficiency. Many of these testing considerations are
summarized in a review by the Vision Committee of the National Research
Council (1981).

Color Appearance
Color is defined by three properties: brightness, hue, and saturation. It would
be convenient for engineers if these three psychological properties were related
in one-to-one correspondence to physical properties of light, but they are not.
Imagine that you are sitting in a dark room viewing a moderately bright
monochromatic light of 550 nm. A normal trichromat would say it is yellowish
green. If we increased the number of quanta the light emits, you would say that
the light is now brighter. What you experience as brightness increases with the
light intensity, but before you conclude that brightness depends only on light
intensity, look at Figure 3.8 which demonstrates simultaneous brightness contrast.
The two central patches are identical, but their brightness is influenced by the
surroundings. All things being equal, brightness increases with intensity, but it
is also affected by other factors.
Figure 3.8. An illustration of simultaneous brightness contrast.

As we increase the intensity of our 550 nm light, you will probably notice that
what appeared as green with just a tinge of yellow now has a much more vivid
yellow component. You might say that the color has changed, but this change
in appearance is described more precisely as a change in hue. Hue refers to our
chromatic experience with light, such as redness and greenness. Many people
think that particular wavelengths produce definite hues, but this is not entirely
correct. Wavelength is related to hue, but one must consider other variables as
well, such as intensity. In our example, a single wavelength produced somewhat
different hues at different intensities.
A third change in the appearance of our 550 nm light as we increase the
intensity is that the tinge of whiteness that was detectable at low intensities has
now become clearer. The whiteness or blackness component is another
dimension of our color experience known as saturation. A light with little white
is said to be highly saturated and appears vivid; a light with more whiteness is
less saturated and appears more "washed out."
Thus, there are three dimensions of color experience: brightness, hue, and
saturation. These dimensions are not uniquely related to quanta and
wavelength. As we increased the number of quanta yet kept the wavelength
constant, we saw a clear change in brightness, but also a change in hue and
saturation.

Chromatic and Achromatic Colors


Suppose we look at two physically different spots of light that perfectly match,
and that we call orange. The existence of three types of cone receptors in the
color-normal person explains why the two colors cannot be discriminated, but it
does not explain why we see the particular hue as orange. Hering (1920)
proposed a theory to explain the appearance of hues. He proposed that all our
experiences of hue can be reduced to four fundamental sensations: red, green,
yellow, and blue. Thus orange is nothing more than a yellow-red. Consistent
with this observation, modern experimental evidence has shown that the four
basic terms are both necessary and sufficient to describe all hues. Figure 3.9
shows how these hue names are used to describe monochromatic lights from
400 to 700 nm. Notice that the percentage of red or green is plotted from 0 to
100 on the left and the percentage blue or yellow is plotted on the right from
100 to 0. The data could be plotted in this way because, when describing a
uniform patch of the visual field, observers do not use the terms red and green
simultaneously, that is, they do not call it "reddish green," nor do they use the
terms blue and yellow simultaneously ("bluish yellow"). The arrows in the
graphs indicate the wavelengths perceived to be uniquely blue, green, or yellow.

Figure 3.9. Average color-naming data obtained for three normal trichromats
plotted for wavelengths presented at equal luminance. (after Werner &
Wooten, 1979)

Hering further argued that by studying our color experiences carefully, we could
discover other properties of how the brain codes for hue. For example, while
we can experience red in combination with either yellow (to produce orange)
or blue (to produce violet), we cannot experience red and green at the same
time and place. When red and green lights are combined, they cancel each
other. The same is true of blue and yellow lights. Hering proposed that this
happens because red and green (as blue and yellow) are coded by a single
process with two opposing modes of response, excitation and inhibition. A
red-green channel can be activated in one direction to signal redness or in the
opposite direction to signal greenness, but it cannot simultaneously signal both
red and green -- the neural excitation in one cancels the inhibition from the
other. Like a seesaw, when one is up the other is down, so red and green
cancel each other out. The same holds true for yellow and blue. For this reason,
Hering termed red and green, and yellow and blue, opponent colors. Subsequent
research on how the brain codes color strongly supports Hering's
opponent-colors theory (Zrenner et al., 1990).
As we shall see, the fact that there are a limited number of fundamental hues
and that certain color pairs are mutually exclusive can have important practical
implications for the appearance of colors in displays. One example is a
course selector in which the manual radio function is displayed in green and
the planned course selector is displayed in magenta. When these two are
superimposed, they look white, but white is coded to mean a proposed course
modification. In this case, color cancellation on the display may produce
confusion.
While the four basic hue terms are sufficient to describe all hues, an account of
color appearance must also take into account the achromatic aspects coded by
an opponent process that signals black and white. This achromatic channel
provides the physiological process for the perception of light and dark colors
such as pinks and browns. For example, pink is a bluish red with a substantial
white component, and brown is a yellow or yellow red with a substantial black
component.
A representation of perceptual color space is shown in Figure 3.10. From our
previous discussion, it is apparent that such a representation requires two
chromatic dimensions in which red and green are mutually exclusive and yellow
and blue are mutually exclusive. In addition, achromatic dimensions must be
represented orthogonally to the chromatic dimensions to show the varying
degrees of blackness or whiteness in colors.

Figure 3.10. Illustration of relations between hue and saturation. (from
Hurvich, 1981)

Variations with Intensity


Since the mid-1800s it has been known that as the intensity of a light of fixed
spectral composition increases, the hue will change. Specifically, the blue or
yellow hue component increases relative to the red or green component. So, for
example, as the intensity of a violet light is increased, it will appear more blue
than red. This is known as the Bezold-Brücke hue shift. Purdy (1931) quantified
this effect and, in addition, reported that three wavelengths, corresponding to
the loci of unique hues, were invariant with changes in intensity. There are
individual differences in the wavelength of the unique hues.
Figure 3.11 presents data obtained from four observers who were asked to
describe the color of a monochromatic light when it was presented at different
intensities. The wavelength of the light was 609 nm, which is equivalent to a
commonly used red on cathode-ray tube (CRT) displays. Notice that at low light
levels, redness is a minor component relative to black and white, but redness
and yellowness increase with increasing intensity. Similar results, consistent
with a Bezold-Brücke hue shift, were obtained for other CRT display colors
(Volbrecht et al., 1988).
The data in Figure 3.11 represent a 1° stimulus viewed by the fovea for 1
second. In addition to the loss of hue at low luminances, perception of hue can
be further degraded if the stimulus is made smaller and the viewing time is
shorter (Kaiser, 1968). When stimuli were presented in a color-naming
experiment using small field sizes (less than 15 minutes of arc) and short
presentations (50-200 msec), monochromatic or "colored" stimuli were called
white 50% of the time (Bouman & Walraven, 1957; Walraven, 1971).

Figure 3.11. Color-naming results plotted as a function of stimulus intensity
(log trolands) for four observers. (from Volbrecht et al., 1988)

Variations with Retinal Eccentricity


We have already looked at how the different cone types vary in their
distribution with retinal eccentricity. These receptors, of course, provide the
input to the neural processes that code the fundamental colors. Thus, it follows
that there ought to be some variation in color perception with retinal
eccentricity, or with location in the visual field. Sensitivity to color is greatest in
the fovea and decreases toward the periphery.
Visual field measurements using stimuli of different color are shown in Figure
3.12. These results are from the right eye of a normal trichromat. The center of
the diagram corresponds to the point in the visual field that falls on the fovea
and the concentric circles represent positions that move away from the center of
the visual field in steps of 10°. The outer, irregularly shaped contour shows the
limit of the visual field. Nothing outside this area can be seen with a stationary
right eye. Inside the visual field are other irregularly shaped contours that
define regions in which particular hues can be experienced. Within the central
10° the observer is responsive to all the basic colors: red, green, yellow, blue,
black, and white. As we move out from the center, sensitivity to red and green
diminishes. Objects that were previously described as reddish yellow and bluish
green are now simply seen as yellow or blue. With further eccentricity, the
yellow and blue zones diminish and color responses are limited to black and
white. Thus, the accuracy with which we can identify colors in a display depends
on whether we are looking at them directly or viewing them peripherally.

Figure 3.12. Zones in the visual field of the right eye in which colors can be
seen. (Hurvich, 1981)

There are three points that should be noted about these color zones in the
visual field. First, it is evident that the same visual stimulus can be perceived
differently depending on the area of visual field that is stimulated. For example,
at the fovea, a stimulus might appear orange or reddish yellow, at about 40°
away from the fovea it might be yellow, and at 70° it may appear gray. Second,
the figure again illustrates that red and green are linked, as are yellow and
blue. The linkage is through an opponent code as discussed earlier. Third, these
zones were measured under one set of conditions; under other conditions, such
as larger fields, they will change somewhat.
Wavelength Discrimination and Identification
Discriminating color requires an observer to compare two lights and to decide
whether they are the same or different. Identification involves an absolute
judgment about a color name or category that must be made regardless of
whether other colors are present.


Range of Discrimination
To measure wavelength discrimination, the experimenter typically uses a split
field such as that shown by the inset of Figure 3.13. One half-field is
illuminated by a standard wavelength and the other half-field by a variable
wavelength. If the two half-fields are seen as different, the experimenter
increases or decreases the intensity of the variable wavelength to determine
whether it is discriminable at all intensities. If there is any intensity at which
the fields are indiscriminable, it is said that the observer does not discriminate
the wavelength pairs. Thus, when we say that two wavelengths can be
discriminated, it is implied that this discrimination is made independent of
intensity. The object of such an experiment is to find the minimum wavelength
difference, or Δλ, that can be discriminated.
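
The decision rule described above can be sketched in code. This is only an illustration of the logic, not the procedure of any particular experiment: `looks_different` is a hypothetical stand-in for the observer's judgment, the toy observer and its thresholds are invented, and the search covers only longer variable wavelengths in an assumed 0.5 nm step.

```python
from typing import Callable, List, Optional

# Hypothetical judgment function: (standard_nm, variable_nm, intensity) -> bool
Judgment = Callable[[float, float, float], bool]


def discriminable(standard_nm: float, variable_nm: float,
                  intensities: List[float], looks_different: Judgment) -> bool:
    """True only if the half-fields look different at every intensity tried.

    A single intensity at which the fields match means the pair is NOT
    counted as discriminated.
    """
    return all(looks_different(standard_nm, variable_nm, i) for i in intensities)


def smallest_delta(standard_nm: float, intensities: List[float],
                   looks_different: Judgment,
                   step_nm: float = 0.5,
                   max_delta_nm: float = 20.0) -> Optional[float]:
    """Smallest wavelength difference (nm) discriminated at all intensities."""
    delta = step_nm
    while delta <= max_delta_nm:
        if discriminable(standard_nm, standard_nm + delta,
                         intensities, looks_different):
            return delta
        delta += step_nm
    return None  # nothing discriminable within the search range


def toy_observer(standard_nm: float, variable_nm: float, intensity: float) -> bool:
    """Invented observer: needs 3 nm when the variable field is dim, else 1.5 nm."""
    needed = 3.0 if intensity < 0.5 else 1.5
    return abs(variable_nm - standard_nm) >= needed


print(smallest_delta(590.0, [0.25, 1.0, 2.0], toy_observer))
```

Because the toy observer needs a larger difference at the dimmest intensity, that intensity sets the reported threshold, which mirrors the requirement that discrimination hold independent of intensity.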
Figure 3.13. Wavelength difference required for discrimination independent of
intensity, plotted as a function of wavelength. Inset: photometric field, 2
degrees, approx. 70 trolands. (after Wright & Pitt, 1934)

An average wavelength discrimination function is shown in Figure 3.13. It is
plotted as a function of wavelength. There are two minima in the function. At
about 500 nm and at about 590 nm some observers can discriminate a
wavelength difference of only about 1 nm, regardless of the intensities of the
wavelengths. Wavelength discrimination, as with other aspects of color vision,
depends on field size. Smaller field sizes are associated with poorer wavelength
discrimination (Bedford & Wyszecki, 1958). This means that, all other things
being equal, it will be easier to notice a color difference between two relatively
large display symbols than two smaller symbols.
The data in Figure 3.13 pertain only to the discriminability of monochromatic
lights. To determine the number of discriminable colors requires some account
of nonspectral lights. Based on the number of discriminable hues, number of
discriminable steps along the achromatic continuum, and the number of
discriminable saturation steps, there are an estimated 7,295,000 color
combinations that can be discriminated by the normal human eye (Nickerson &
Newhall, 1943).
Range of Identification
According to Chapanis (1965), a set of colors that must be identified on an
absolute basis must fulfill several criteria. First, every member of the set must
seldom be confused with any other member. Second, every color in the set must
be associated with a common color name. Third, use of the color codes should
not require specialized training, but should be naturally understood by
individuals with normal color vision. To this end, Chapanis asked 40 observers
to name 1,359 different color samples (from the Munsell system described on
page 69). He then analyzed the data to determine which color names were
used most consistently across observers. Chapanis found that in addition to the
achromatic colors (black, white, and gray) which were applied consistently,
subjects were most consistent in their use of the terms red, green, yellow, blue,
and orange.
Recommendations about the optimum number of colors that ought to be
available for visual displays range from about three or four (Murch & Huber,
1982) to ten (Teichner, 1979), the number that can be absolutely identified
without extensive training (Ericsson & Faivre, 1988). Use of more than about
six or seven colors will lead to errors in identification.
Implications for Color Displays
One often hears of displays that are capable of presenting a large number of
colors. In some applications, such as map displays, it may be useful to access a
large color palette. However, if colors must be identified, not just discriminated,
a large color palette may be of little value. For colors to be identified reliably,
they must be distinct under a wide range of viewing conditions. The maximum
number that fulfills this requirement is probably not greater than six. Of course,
in applications that do not require absolute identification (e.g., cartography),
the number of discriminable colors that can be used will increase. The number
of colors might also be increased when they are only used to reduce clutter and
need not be specifically identified.
In addition to all these considerations, one should heed the conventions for
various color choices. For this reason, FAA guidelines (RD-81/38,II, page 50)
stress that red should be used for warning indicators and amber for caution
signals. A third color, of unspecified hue, is recommended to indicate advisory
level alerts (RD-81/38,II, page 60).
Contrast Effects
The appearance of a color can be altered by another color next to it or another
color seen just before or after it. As we scan a scene, we view colors with an
eye that has been tuned from moment-to-moment through exposure to
preceding and surrounding colors. These contrast effects are dependent on the
intensity, duration, and sizes of the stimuli. Here we will illustrate and describe
contrast effects, but for detailed summaries of the literature see Graham and
Brown (1965) or Jameson and Hurvich (1972).
Successive Contrast
Figure 3.14 illustrates a temporal color-contrast effect. Fixate on one of the dots
on the right for a while and then shift your gaze to one of the dots on the
white surface to the left. You will see an afterimage of colors complementary to,
that is, opposite, those in the picture. This contrast effect produced over time
makes sense if we assume that an opponent-color channel is first driven in one
direction by color stimulation and then experiences a rebound effect (of neural
activity) in the opposite direction when the stimulus is removed. Thus, we see
the opposing color though no external stimulus exists. Wooten (1984) has
provided a detailed description of changes in color appearance resulting from
successive color contrast.

Simultaneous Contrast
Figure 3.15 illustrates a spatial, color-contrast effect. The thin bars in the two
patterns are identical, but they look different when surrounded by different
colors. This is called simultaneous color contrast because it occurs
instantaneously. The color induced into the focal area is opposite to that of the
surround. This is attributable to opponent processes that operate over space; the
neural activity in one region of the retina produces the opponent response in
adjacent regions. While the effect noticed here is primarily from the surround
altering the appearance of the bars, the opposite also occurs.

Figure 3.14. A demonstration of successive color contrast. (from Hurvich, 1981)

Figure 3.15. A demonstration of simultaneous color contrast. (from Albers, 1975)

Through simultaneous contrast we can experience many colors that are not seen
when viewing spectral lights. For example, the color brown is experienced only
under conditions of color contrast. If a yellow spot of light is surrounded by a
dim white ring of light it will look yellow. As the luminance of the surround is
increased (without changing the luminance of the center), there will be
corresponding changes in the central color. First it will look beige or tan, then
light brown, followed by dark brown (Fuld et al., 1983). If the ring is still
further increased in luminance, the central spot will look black. The color black
is different from the other fundamental colors because it arises only from the
indirect influence of light. That is, like brown, the color black is a contrast
color and is only perceived under conditions of contrast. Any wavelength can be
used in the center or surround and if the luminance ratio is sufficiently high,
the center will appear black (Werner et al., 1984).
Assimilation
Sometimes a pattern and background of different colors will not oppose each
other as in simultaneous contrast, but will seem to blend together. This is
known as assimilation or the Bezold Spreading Effect and is illustrated by Figure
3.16 (reprinted from Evans, R.M., An Introduction to Color, Plate XI, p. 192,
© John Wiley & Sons, Inc., New York, NY). Here we see that the saturation of the
red background of the top left and center looks different depending on whether
it is interlaced with white or black patterns, even though the background is
physically the same in the two sections. The lower illustration shows the effect
of assimilation with a blue background. Assimilation is not well understood, but
it is known that it cannot be explained by light scatter from one region of the
image to another. The phenomenon arises from the way in which colors are
processed by the brain.

Figure 3.16. A demonstration of assimilation, the Bezold spreading effect.
(from Evans, 1948)
Adaptation
We have already seen from the dark adaptation curve that the visual system
changes its sensitivity according to the surrounding level of illumination. We
have also seen that visual acuity increases with increased light level. Here we
shall briefly discuss some of the changes in color perception that occur with
changes in ambient light.
Chromatic Adaptation

The appearance of a color can be altered by preceding or surrounding colors
that are only momentarily in the field of view. Even larger effects can be
observed when an individual is fully adapted to a chromatic background. This is
demonstrated by an experiment of Werner and Walraven (1982) in which the
subject was instructed to adjust the ratio of two lights so that the mixture
would appear pure white. The subject then viewed an 8° chromatic adapting
background for seven minutes and again adjusted the ratio of the two lights so
that it looked white. The results are shown in Figure 3.17 using the CIE color
diagram that will be explained below. For now, consider that the color diagram
represents all mixtures of colors. The central x designates the mixture that
appeared white in the neutral state (dark background) and the lines radiating
outward connect the neutral white point with the chromaticity of the adapting
background (on the perimeter of the diagram). The individual data points show
the light mixture that appeared white after chromatic adaptation. You can see
that the light mixture that appears white is dramatically altered by chromatic
adaptation.
In part of the experiment, the intensity of the chromatic background was kept
constant, but the intensity of the test spot was varied. Contrast refers to the
ratio of the increment to the background. The results show that lower contrasts
are associated with larger shifts in the white point. Indeed, nearly any light
mixture can appear white under the appropriate conditions of adaptation and
contrast.
In natural settings one does not ordinarily adjust the chromaticity of a stimulus
to maintain a constant color, although devices to implement such a scheme on
aircraft displays have been proposed (Kuo & Kalmanash, 1984). What ordinarily
happens is that adaptation alters the color of a stimulus in a direction opposite
to that of the adapting field color. For example, white letters may be tinged
with yellow when viewed on a blue background or tinged with green when the
observer has adapted to a red background. These effects of chromatic
adaptation can be made to work in favor of color identification or detection.
For example, detection of a yellow stimulus may be enhanced by presenting it
on a blue background.

Figure 3.17. Chromaticity diagram showing stimuli that appear white under
dark-adapted conditions (central x) and following adaptation to chromatic
backgrounds (filled circles). (after Werner & Walraven, 1982)

Variation Under Normal Conditions
The effects of ambient light in altering the state of adaptation are not
fundamentally different from those already shown in Figure 3.17. However,
since most ambient lights contain a broad distribution of wavelengths, the
receptors are not adapted as selectively as in laboratory experiments.
One important consideration in evaluating changes in ambient illumination
under natural conditions is that in addition to altering the perceptual state of
an observer, there often can be substantial changes in the display itself. CRT
screens typically reflect a high percentage of incident light. The light emitted
from the display is therefore seen against this background of ambient light.
Figure 3.18 shows how sunlight alters the spectral composition of the colors
available on a display. As sunlight is added to the display, the gamut of
chromaticities shrinks, as illustrated by the progressively smaller triangles
(Viveash & Laycock, 1983). To an observer this would be experienced as a
desaturation or "wash out" of the display colors as well as a shift in hue that
accompanies changes in saturation, called the Abney effect (see Kurtenbach,
Sternheim & Spillmann, 1984). Some colors that were previously discriminable
may no longer be so. Finally, not illustrated by the figure is the substantial
reduction in luminance contrast with increasing ambient illumination. Some
visual displays on aircraft are automatically adjusted in their luminance by
sensors that respond to the ambient illumination (e.g., all CRTs on Boeing 757
and 767). This is an important innovation, and indeed consistent with FAA
recommendations (RD-81/38,II, page 47) that alerting signals be automatically
adjusted according to the ambient illumination level. However, manual override
control is also recommended (RD-81/38,II, page 73) to compensate for individual
differences in sensitivity, adaptation, and other factors such as use of
sunglasses.

Figure 3.18. Chromaticity diagram showing how the color gamut of a display
decreases with increasing sunlight. (after Viveash & Laycock, 1983)
Color Specification
There are many situations in which it is useful to have an objective method for
specifying color. Since color perception of a fixed spectral distribution depends
upon many conditions, a system of color specification could be based on
appearance or on some physical or psychophysical description of the stimulus.
Each system of color specification has advantages and disadvantages.
CIE System
We have seen that a normal trichromat can match any wavelength (or any
mixture of wavelengths) by some combination of three other wavelengths or
primaries. The choice of wavelengths for the primaries is somewhat arbitrary,
but different sets of primaries will necessarily involve different intensity ratios.
Since any color can be matched by some mixture of three primaries, any color
can be represented in terms of the proportional contribution of each primary to
the mixture. For example, a light that is matched with 10 units of wavelength
450 nm, 5 units of 550 nm, and 20 units of 670 nm has a ratio of the three
primaries of 2:1:4. While our ratio of primaries would provide an exact match
to the light of interest, other primaries could also be used to provide an exact
match. To be useful in a wide variety of applications, it would be helpful if
specifications of a color could all be made in terms of the same set of
primaries. Thus, in 1931 the CIE developed a set of imaginary primaries to
represent the color-matching functions for a standard observer. Since these
primaries are not real, they are given the arbitrary labels X, Y, and Z.
Figure 3.19 shows the relative amount of these theoretical primaries needed to
match any wavelength of unit energy. The values plotted here are designated
x̄, ȳ, and z̄, and are known as the spectral tristimulus values. Among the
nuances of this system, the ȳ tristimulus value is identical to the Vλ function
(photopic sensitivity of the standard observer). Thus, when the ȳ tristimulus
value is integrated with the energy distribution (by multiplying the energy by ȳ
at each wavelength and summing), we have the total value of the Y primary which
is equal to the luminance. It should also be mentioned that the CIE actually
developed two sets of tristimulus values, one for 2° stimuli and one for 10°
stimuli.

Figure 3.19. CIE tristimulus values for a 2° standard observer plotted as a
function of wavelength. (from Wyszecki & Stiles, 1982)

To specify the chromaticity of a particular color in the CIE system, the energy at
each wavelength is multiplied by the x̄ spectral tristimulus value at that
wavelength and the products are summed across wavelengths to yield the
tristimulus value (not to be confused with the spectral tristimulus values)
designated as X. Similarly, the energy across wavelengths is weighted by the ȳ
and z̄ spectral tristimulus values and summed to yield Y and Z. The X, Y, Z
values can be quite useful in specifying a color. For example, given the values
for a color of interest, we can be certain that it can be matched with respect to
the standard observer by an individual who creates these same X, Y, Z values
using any other wavelength combination.
We now have the ingredients for representing a color in question in an x,y
chromaticity diagram that represents all conceivable colors. The chromaticity
coordinates are defined as: x = X/(X + Y + Z); y = Y/(X + Y + Z);
z = Z/(X + Y + Z). Notice that x, y, and z are proportions that sum to 1.0. Thus, it
is only necessary to plot x and y since z = 1 - (x + y). The resulting
chromaticity diagram is shown in Figure 3.20. Notice that monochromatic lights
all plot around the perimeter of the diagram, a region known as the spectrum
locus. The area inside the diagram represents all physically realizable mixtures
of color. Given the chromaticity coordinates of a color, a perfect match can be
made by various mixtures determined using the chromaticity diagram. If we also
wanted the match to include information about luminance, we would have to
specify Y as well as the x,y coordinates.

Figure 3.20. CIE color diagram. (from Wyszecki & Stiles, 1982)
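
The computation of X, Y, Z and of the chromaticity coordinates can be sketched as follows. This is a minimal illustration: the function names are hypothetical, and the energy and spectral tristimulus numbers are placeholders at three sample wavelengths, not values from the CIE tables. A real calculation would use the tabulated values at many wavelengths.

```python
def tristimulus(energy, xbar, ybar, zbar):
    """Weight the energy at each sampled wavelength by the spectral
    tristimulus values and sum across wavelengths to get X, Y, Z."""
    X = sum(e * v for e, v in zip(energy, xbar))
    Y = sum(e * v for e, v in zip(energy, ybar))
    Z = sum(e * v for e, v in zip(energy, zbar))
    return X, Y, Z


def chromaticity_coordinates(X, Y, Z):
    """x, y, z are the proportions of X, Y, Z in their sum."""
    total = X + Y + Z
    return X / total, Y / total, Z / total


# Placeholder numbers at three sample wavelengths; NOT real CIE table values.
energy = [1.0, 2.0, 1.0]
xbar = [0.3, 0.4, 1.0]
ybar = [0.1, 1.0, 0.6]
zbar = [1.7, 0.0, 0.0]

X, Y, Z = tristimulus(energy, xbar, ybar, zbar)
x, y, z = chromaticity_coordinates(X, Y, Z)
print(X, Y, Z)
print(round(x, 3), round(y, 3), round(z, 3))
```

Since x, y, and z sum to 1, only x and y need to be stored, with Y kept alongside when luminance matters.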
A useful property of the CIE chromaticity diagram stems from the fact that a
mixture of two lights always plots on a straight line that connects the points
representing the lights within the diagram. The position along the line that
represents the mixture depends on the energy ratio of the two lights. Thus, if
we plot the points representing the chromaticity coordinates of three phosphors
on a color display, we can connect the points to create a triangle representing
the color gamut of the display. This triangle would represent all chromaticities
that can be generated by the display.
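
The straight-line mixing property and the gamut triangle can be checked numerically. The sketch below is illustrative only: the three phosphor chromaticities and luminances are invented, each light is given as (x, y, Y), and the helper names `mix`, `cross`, and `in_gamut` are hypothetical.

```python
def mix(light_a, light_b):
    """Chromaticity of the additive mixture of two lights given as (x, y, Y)."""
    (xa, ya, Ya), (xb, yb, Yb) = light_a, light_b
    sa, sb = Ya / ya, Yb / yb  # X + Y + Z of each light
    return ((xa * sa + xb * sb) / (sa + sb), (Ya + Yb) / (sa + sb))


def cross(a, b, c):
    """Signed area test: zero when c lies on the line through a and b."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_gamut(point, triangle):
    """True if the chromaticity lies inside (or on) the phosphor triangle."""
    p1, p2, p3 = triangle
    d = (cross(p1, p2, point), cross(p2, p3, point), cross(p3, p1, point))
    return not (min(d) < 0 and max(d) > 0)


# Invented phosphor values for illustration: (x, y, luminance)
RED = (0.62, 0.34, 20.0)
GREEN = (0.30, 0.60, 60.0)
BLUE = (0.15, 0.07, 8.0)
GAMUT = [RED[:2], GREEN[:2], BLUE[:2]]

yellowish = mix(RED, GREEN)
print(yellowish)                         # lies on the red-green edge
print(cross(RED[:2], GREEN[:2], yellowish))
print(in_gamut((0.3127, 0.3290), GAMUT))  # a white inside the triangle
print(in_gamut((0.10, 0.80), GAMUT))      # a saturated green outside it
```

Raising the luminance of one light relative to the other slides the mixture point along the connecting line toward that light, which is the energy-ratio dependence noted above.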
The CIE chromaticity diagram is useful for specifying color in many
applications, but it does have some drawbacks. Perhaps the most important
problem is that equal distances between sets of points in the diagram are not
necessarily equal distances in perceptual space. To rectify this problem the CIE
developed a new chromaticity diagram, shown in Figure 3.21, in an attempt to
provide more uniform color spacing. The coordinates of this diagram are called
u',v' and can be obtained by a simple transformation from the x,y coordinates
(for further details see Wyszecki & Stiles, 1982). The smaller triangle in Figure
3.21 shows the gamut of many typically used displays while the larger triangle
shows the maximum envelope of
currently used displays.
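
The transformation itself is not given in the text. The sketch below uses the standard CIE 1976 formulas, u' = 4x / (-2x + 12y + 3) and v' = 9y / (-2x + 12y + 3); the function names are hypothetical, and treating straight-line distance in the u',v' plane as a rough index of color difference is an approximation.

```python
import math


def xy_to_uv_prime(x, y):
    """Convert CIE 1931 x,y chromaticity to CIE 1976 u',v' coordinates."""
    denom = -2.0 * x + 12.0 * y + 3.0
    return 4.0 * x / denom, 9.0 * y / denom


def uv_distance(xy_a, xy_b):
    """Straight-line distance between two chromaticities in the u',v' plane."""
    ua, va = xy_to_uv_prime(*xy_a)
    ub, vb = xy_to_uv_prime(*xy_b)
    return math.hypot(ua - ub, va - vb)


# Equal-energy white (x = y = 1/3) maps to u' = 4/19, v' = 9/19.
print(xy_to_uv_prime(1.0 / 3.0, 1.0 / 3.0))
```

Comparing `uv_distance` for candidate color pairs gives a more perceptually even measure of separation than distances taken directly in the x,y diagram.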

Figure 3.21. CIE u',v' chromaticity diagram based on a 2° standard observer.
(from Volbrecht et al., 1988)

Munsell System
The CIE system is useful for specifying the chromaticity of a visual stimulus,
but no information about color appearance is preserved. The appearance of
lights of a fixed chromaticity will depend on many variables, as was illustrated
in Figure 3.17. Several systems for specifying color that are easier to use and
more closely related to perception than the CIE system are available, perhaps
the best known system being the one developed by Munsell in 1905. In its
current form,
the Munsell System consists of a series of colored paint chips arranged in an
orderly array, as illustrated by Figure 3.22. Each entry is characterized by three
numbers that specify hue, blackness and whiteness from 0 to 10 (called
lightness), and ratio of chromatic and achromatic content (called chroma). Hue
is represented by a circular arrangement in 40 steps that are intended to be
equal in perceptual space. Lightness varies from bottom to top in nine equally
spaced steps from black to white. Chroma, or saturation, represents the hue and
lightness ratios in 16 steps that vary from the center outward. To use this
system, one merely finds the chip that most closely matches an item of interest.
Each chip is specified by three parameters: hue, lightness, and chroma. Since
the steps between chips are nearly equal, the Munsell system can be useful in
the selection of colors that are equal distances in perceptual space.
While the Munsell system is easy to use and the arrangement corresponds more
closely to color appearance than the CIE system, it still has many limitations.
The influences of surrounding colors and state of adaptation which are
important for color appearance are not taken into account by the Munsell
designations. Thus, the appearance and discriminability of colors expected from
a Munsell designation may not be obtained when the conditions of viewing are
altered.

Figure 3.22. Schematic of the Munsell color solid. (from Wyszecki & Stiles, 1982)

Implications for Displays
Color can significantly enhance search and identification of information on
visual displays. It is more effective than shape or size in helping to locate
information quickly (Christ, 1975). The attention-getting nature of color
facilitates search while at the same time providing a good basis for grouping or
organizing information on a display which may help display operators segregate
multiple types of information and reduce clutter. For example, an experiment by
Carter (1979) showed that when the number of display items was increased from
30 to 60, search time increased by 108% when only one color was used, but
increased by only 17% for redundant color-coded displays.
There are severe constraints on the effective usage of color information (see
also Walraven, 1985). The attention-getting value of a color is dependent on its
being used sparingly. Only a limited number of colors should be used in order
to avoid overtaxing the ability of an observer to classify colors. If each color is
to have meaning, only about six or seven can be utilized effectively.
In addition, we have seen that perception of a fixed stimulus will be changed as
a function of many variables including the intensity, surrounding conditions,
temporal parameters, and state of adaptation of an observer. If color is a
redundant code, these problems, as well as loss of color due to aging of the
display, will have substantially less impact on operator performance.
The choice of colors can be facilitated by considering the physiological
principles by which hues are coded -- red opposes green and blue opposes
yellow. These colors are also separated well in CIE chromaticity diagrams.
Colors that are barely discriminable at low ambient conditions may not be at all
discriminable at high ambient conditions because of a physical change in the
color gamut.
The use of blue stimuli can be problematic for displaying characters requiring
good resolution. The blue phosphors on many displays only produce relatively
low luminances, but the main difficulty is a physiological problem in processing
short wavelengths. One problem already mentioned that might result from using
small blue stimuli is related to small-field tritanopia. Because the short-wave
cones are distributed more sparsely across the retina, they contribute very little
to detail vision. Short-wave cone signals are not used in defining borders or
contours (Boynton, 1978). In addition, focusing of short-wavelength stimuli is
not as easily achieved as for middle- and long-wave stimuli, making blue a
color to avoid in displaying thin lines and small symbols. A major advantage of
blue and yellow is that our sensitivity to these colors extends further out in the
visual field than our sensitivity to red and green. Blue hues also provide good
contrast with yellow. Thus, while blue may be a good color to avoid when
legibility is a consideration, it may be a good color to use for certain
backgrounds on displays.


Chapter 4
Form and Depth
by John S. Werner, Ph.D., University of Colorado at Boulder
We have already discussed how two objects of different sizes, placed at different
distances from us can cast images of identical size and shape on our retina.
Despite this, we can still tell that one is small and close and the other is large
and far away. How do we do this? Either we have additional information
about physical distance or we know something about the physical size.
We encounter another aspect of the same perceptual problem when we consider
the fact that as an object changes position with respect to us, because either it
is moving or we are moving, the retinal image formed by the object
continuously changes shape and size. These changes depend on both the
object's distance and our angle of view. For example, an object moving away
"grows" smaller. Or the image of a square on our retina may become in turn a
rectangle or a trapezoid depending on our angle of view. The amazing fact in
the face of such retinal contortions is that our perceptions of the object's shape
and size remain relatively constant; we still see a square. These perceptual
constancies, termed shape and size constancy, require information about distance,
not only the distance of objects in relation to each other, but also the distances
between points on the same object, say the corners of a square. Somehow we
process the information we take in through our retinas at higher levels in the
nervous system in terms of information we hold about size, shape, distance -- in
other words, our concepts about the physical world. It is important to realize
that we are usually unaware of this process when perceiving size and distance;
we do it automatically. In this section we will discuss some of the ways in
which form and depth information are processed.

Edges and Borders


The eye movement records shown before in Figure 2.17 suggested that borders
and edges of a stimulus were often the target of visual fixation. Our ability to
separate figure and ground in a complex scene requires differences in light
level. It is not the overall light level that typically defines an object's edge or
border; it is a difference in light levels -- the contrast. There are several ways to
define contrast, but in this section we will define it as:

$$\frac{L_{\max} - L_{\min}}{L_{\max} + L_{\min}}$$

where $L_{\max}$ is the maximum luminance in the pattern and $L_{\min}$ is the minimum
luminance in the pattern. With this definition, contrast can vary between 0 and
1.0.
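The contrast definition above (commonly called Michelson contrast) can be sketched in a few lines of Python. The function name and the luminance values are illustrative assumptions, not part of the original text.

```python
def michelson_contrast(l_max: float, l_min: float) -> float:
    """Contrast as defined in the text: (L_max - L_min) / (L_max + L_min)."""
    if l_min < 0 or l_max < l_min:
        raise ValueError("expected 0 <= l_min <= l_max")
    if l_max + l_min == 0:
        raise ValueError("contrast is undefined when both luminances are zero")
    return (l_max - l_min) / (l_max + l_min)


# Hypothetical luminances in cd/m^2, chosen only for illustration.
print(michelson_contrast(100.0, 100.0))         # uniform field: 0.0
print(michelson_contrast(100.0, 0.0))           # minimum luminance of zero: 1.0
print(round(michelson_contrast(85.0, 5.0), 3))  # 0.889
```

The first two calls show the limits of the definition: a uniform field has a contrast of 0, and a pattern whose minimum luminance is zero has a contrast of 1.0.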
The importance of contrast in defining the brightness or lightness of an object
or area was already illustrated by simultaneous brightness contrast. The
brightness of a point of light within a pattern is partly determined by its own
characteristics but also by the brightness of points surrounding it. Many of the
processes responsible for simultaneous contrast originate within the retina. The
information that retinal cells send to the brain has little to do with overall light
level, rather they are coding small differences in light level from one region to
the next.
A striking consequence of the way in which the visual system extracts
brightness information is illustrated by Figure 4.1. The top panel represents a
black-and-white disk that can be mounted to a motor and spun rapidly. The
black region reflects about 5% of the light falling on it and the white region
reflects about 85%. Now imagine that the disk is spun rapidly so that you
cannot discern the separate black and white regions. This is shown by the
middle panel. If we measure the light reflected from the disk by passing a small
probe from left to right, the intensity of light would vary with the ratio of
black-to-white areas. The bottom panel shows a luminance profile of this
stimulus; that is, a graph of the light intensity or luminance plotted as a
function of spatial position. Notice that the inside and outside of the pattern
are separated by a change in light level, or border, but beyond this change they
have the same black-to-white ratio and hence the same luminance is reflected to
the eye. If brightness depended on light intensity alone, these two regions
should be perceived as identical. This, however, is not what we perceive; the
inside region is perceived as darker than the outside region. This effect is
known as the Craik-Cornsweet-O'Brien illusion. It shows that the brightness of a
region of light is dependent on the contrast at the border.

Figure 4.1. Demonstration of the Craik-Cornsweet-O'Brien illusion. (from Cornsweet, 1970)

Figure 4.2. Illustration of Mach bands. (from Cornsweet, 1970)

The bottom panel of Figure 4.2 shows a luminance profile in which there is an
increase in intensity from left to right. The photograph above shows a stimulus
that changes according to this luminance distribution, but notice that our
perception does not follow it exactly. Rather, at the border one perceives small
bands of exaggerated darkness and brightness, labelled D and B in the
photograph. These are called Mach bands in honor of Ernst Mach (1865) who
first described them. The pattern we perceive exaggerates the abrupt light-dark
transitions.
There are many other phenomena in which the brightness or darkness of a
region depends on border contrast or on changes in contrast over time (see
Fiorentini et al., 1990). These phenomena reveal the visual system's attempt to
extract information at the borders because borders and edges define objects or
parts of objects.

Contrast Sensitivity
The forms of objects are defined by contrast. It is, therefore, important to
characterize the sensitivity of the visual system to contrast. One approach to this
problem is to measure contrast sensitivity using grating stimuli in which the
luminance is varied sinusoidally as illustrated by Figure 4.3. If one were to
measure the intensity of the stimuli on the left, by passing a light meter across
it, the sinusoidal luminance profile on the right would be found. The profile of
the stimuli could be characterized by the contrast, which was defined above by
the difference between the luminance maximum and minimum, divided by their
sum. The frequency of oscillation of the sine wave is defined in
terms of the number of cycles per degree of visual angle (cpd). For example,
the stimulus on the top of Figure 4.3 has a lower spatial frequency than the
one on the bottom.

Figure 4.3. Vertical sine-wave gratings and their luminance distributions. (from Cornsweet, 1970)

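As a rough sketch of how a spatial frequency in cycles per degree follows from display geometry, the Python example below converts the physical width of one grating cycle and a viewing distance into cpd. The function name and the example dimensions are assumptions made for illustration.

```python
import math


def cycles_per_degree(cycle_width: float, viewing_distance: float) -> float:
    """Spatial frequency of a grating in cycles per degree of visual angle.

    cycle_width is the width of one full cycle (one light bar plus one dark
    bar) on the display; viewing_distance is in the same units.
    """
    degrees_per_cycle = math.degrees(
        2.0 * math.atan(cycle_width / (2.0 * viewing_distance))
    )
    return 1.0 / degrees_per_cycle


# Hypothetical grating with 1 cm cycles viewed from several distances (cm).
for distance_cm in (57.3, 114.6, 229.2):
    print(distance_cm, "cm ->", round(cycles_per_degree(1.0, distance_cm), 2), "cpd")
```

The same physical grating therefore has a higher spatial frequency, in cpd, when it is viewed from farther away.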
Contrast threshold is measured by determining the minimum contrast required
for detection of a grating having a particular spatial frequency (usually
generated on a CRT display). Contrast sensitivity is the reciprocal of contrast
threshold. Thus, the contrast sensitivity function represents the sensitivity of an
individual to sine-wave gratings plotted as a function of their spatial frequency.
Figure 4.4 shows a typical contrast sensitivity function (Campbell & Robson, 1968).
These data were obtained with a set of static sine-wave gratings (like those in
Figure 4.3), but contrast sensitivity functions vary as a function of luminance,
temporal characteristics of the grating stimuli (e.g., flickering vs. steady), and
stimulus motion characteristics (e.g., drifting vs. stationary gratings). The shape
of the contrast sensitivity function also varies with the individual observer and
the orientation of the grating. For example, many individuals are more sensitive
to vertical and horizontal gratings of high spatial frequency than to oblique
(45° or 135° from horizontal) gratings (Appelle, 1972).

Figure 4.4. Contrast sensitivity as a function of spatial frequency (c/deg). (from
Campbell & Robson, 1968)

It can be deduced from the contrast sensitivity function that we are not equally
sensitive to the contrast of objects of all sizes. High spatial frequency sensitivity
is related to visual acuity; both are a measure of resolution, or the finest detail
that can be seen. When spatial vision is measured by an optometrist or
ophthalmologist, only visual acuity is typically measured. While a more
complete evaluation of spatial vision would include contrast sensitivity
measurements over a range of spatial frequencies, it is the high frequency
sensitivity that is most impaired by optical blur (Westheimer, 1964). Thus, high
frequency sensitivity is what is improved by spectacle corrections.
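The relation between contrast threshold and contrast sensitivity described earlier in this section can be sketched as follows. The threshold values are invented for illustration and are not read from Figure 4.4; only the reciprocal relation comes from the text.

```python
def contrast_sensitivity(threshold: float) -> float:
    """Contrast sensitivity is the reciprocal of the contrast threshold."""
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be a contrast in the range (0, 1]")
    return 1.0 / threshold


# Hypothetical thresholds: spatial frequency (c/deg) -> threshold contrast.
thresholds = {0.5: 0.02, 1.0: 0.01, 4.0: 0.004, 16.0: 0.02, 32.0: 0.2}

# The contrast sensitivity function is sensitivity plotted against frequency.
csf = {freq: contrast_sensitivity(t) for freq, t in thresholds.items()}
peak_freq = max(csf, key=csf.get)

for freq, sens in csf.items():
    print(f"{freq:5.1f} c/deg  sensitivity {sens:6.1f}")
print("peak at", peak_freq, "c/deg")
```

With these made-up numbers the function peaks at a middle spatial frequency and falls off on either side, which is the general shape the text describes.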
One explanation for our contrast sensitivity is that cells in the visual cortex
respond selectively to a small band of spatial frequencies. The contrast
sensitivity function may thus represent the envelope of sensitivity of these cells.
This is analogous to the photopic spectral sensitivity function representing the
relative activity of three classes of cones. In the case of contrast sensitivity, the
model implies that different cells respond selectively to stimuli of different sizes.
A demonstration consistent with this idea is presented in Figure 4.5. Notice that
the two patterns on the right are of identical spatial frequency. Now, stare at
the bar between the two gratings on the left, allowing your eyes to move back
and forth along the bar. This scanning prevents the buildup of a traditional
afterimage. It is intended to fatigue cells responsive to gratings of a particular
size. After about 45 seconds of fixating along the bar on the left, shift your
gaze to the small bar on the right. The two patterns on the right will now
appear to have different spatial frequencies. According to theory (Blakemore &
Sutton, 1969), size-selective cells responsive to gratings on the left were
fatigued during fixation. This shifted the balance of activity when looking at the
patterns on the right compared to the activity produced by the gratings prior to
adaptation.

Figure 4.5. Demonstration of size-selective adaptation. (from Blakemore & Sutton, 1969)

Variation with Luminance
The effects on contrast sensitivity of changing the space average luminance of
the stimulus were systematically investigated by DeValois, Morgan and
Snodderly (1974). Their data are shown in Figure 4.6. Contrast sensitivity is
plotted as a function of spatial frequency. Different symbols and curves from
top to bottom correspond to luminance decreases in steps of 1.0 log unit. This
figure shows that overall contrast sensitivity is reduced as luminance decreases,
but the reduction in sensitivity is much greater for high than low spatial
frequencies. This shifts the peak of the function to lower frequencies with
reduced luminance. In general, high spatial frequency sensitivity decreases as a
function of the square root of the luminance (Kelly, 1972).

Figure 4.6. Contrast sensitivity is plotted as a function of spatial frequency for
young, adult observers. (from DeValois, Morgan & Snodderly, 1974)

Variation with Retinal Eccentricity
As we have seen, the peak of the spatial contrast sensitivity function occurs at about
3-5 cpd at moderately high luminances and declines at lower and higher frequencies
around the peak. When the same function is measured at different retinal eccentricities
using a display of fixed size, the results depend strongly on distance from the fovea,
as shown by the panel on the left of Figure 4.7 (Rovamo, Virsu & Näsänen, 1978).
Measurements were obtained with a 1° x 2° vertical grating. Different curves refer to
different retinal eccentricities; from the highest curve down these were 0°, 1.5°, 4.0°,
7.5°, 14°, and 30° from the fovea. There are a number of reasons for this dependence on
eccentricity, including the variation in receptor distribution with eccentricity and
the way in which receptor signals are pooled in the retina and at higher levels in the
brain (see Wilson et al., 1990). When larger stimuli were used to compensate for these
factors, Rovamo et al. obtained the results shown in the panel on the right side of
Figure 4.7. These results are important because they show how stimuli can be scaled in
size to be equally visible at all eccentricities.

Variation with Age
Average contrast sensitivity for various spatial frequencies is shown in Figure
4.8 plotted as a function of age. These data represent averages from 91
clinically normal, refracted observers tested by Owsley, Sekuler and Siemsen
(1983). Age-related declines in contrast sensitivity, like declines related to
decreased retinal illuminance, are most pronounced at high spatial frequencies
(see also Higgins, Jaffe, Caruso & deMonasterio, 1988). These findings are
consistent with studies which examined the relation between age and static
visual acuity (Pitts, 1982), a measure that is primarily dependent on the
transmission of high spatial frequencies, and known to decline with advancing
age. Because the lens transmits less light and the pupil is smaller in elderly
observers, the change in contrast sensitivity may be partly a luminance effect,
although changes in neural structures also play a role.

Figure 4.7. Contrast sensitivity measured at different retinal eccentricities is
plotted in the graph on the left as a function of spatial frequency. The
graph on the right shows contrast sensitivity obtained at the same
retinal eccentricities but with a stimulus that was scaled according to
neural coordinates. Horizontal axes: spatial frequency in cycles/deg (left) and
cycles/mm (right). (from Rovamo, Virsu, & Näsänen, 1978)

Implications for Displays


The contrast sensitivity function has several areas of application. First, as a
predictor of visual performance, the contrast sensitivity function may be more
useful than traditional measures of visual acuity. The visual acuity chart varies
only the size of the stimuli to evaluate spatial vision while contrast sensitivity
testing requires variation in both size and contrast. The importance of this
additional information about contrast was demonstrated by Ginsburg et al.
(1982). They conducted an experiment with experienced pilots and an aircraft
simulator. The simulated visibility was poor and half of the simulated landings
had to be aborted due to an obstacle placed on the runway. Performance was
measured by how close the pilots flew to the obstacle before aborting the
landing. Pilot responses on this task (times required to abort the landing) varied
considerably. Individual differences in performance were not well correlated
with visual acuity, but were well predicted by individual variation in contrast
sensitivity. Thus, contrast sensitivity testing may be more useful than traditional
measures of visual performance for predicting responses in complex settings.
A second application of the contrast sensitivity function is for predicting the
visibility of complex patterns presented on displays. It may not be feasible to test
every unit of symbology directly, but knowing the contrast sensitivity function, it
may be possible to make some predictions using a spatial frequency analysis of the
stimulus. This approach is based on Fourier's theorem, according to which any
complex waveform can be described in terms of a set of sinusoidal waves of known
frequency, amplitude and phase (the alignment of the waves when added together).
This approach was briefly introduced (page 2) when
discussing the processing of complex tones by decomposing them into individual
pure tones, and it was used to predict temporal sensitivity to complex
waveforms on the basis of sensitivity to sinusoidal waves.

Figure 4.8. Contrast sensitivity as a function of spatial frequency (c/deg) for
different age groups. (from Owsley, Sekuler & Siemsen, 1983)

To illustrate how a complex pattern can be described in terms of a set of sine
waves, consider the difference between a sine-wave grating and a square-wave
grating. Figure 4.9 shows these two types of grating at the same spatial
frequency. The square wave is so named because it has sharp, or "square,"
edges. If the luminance of the square wave was plotted as a function of spatial
position, it would look like the function shown at the top of Figure 4.10.
According to Fourier's theorem, the square wave is composed of a sine wave of
the same spatial frequency, called the fundamental frequency, plus a series of
sine waves at odd multiples of the fundamental frequency, each with a smaller
amplitude. The latter waves are called harmonics. In the case of the
square wave, these harmonics include sine waves that are three times the
fundamental frequency and one-third the amplitude, five times the fundamental
frequency and one-fifth the amplitude, seven times the fundamental frequency
and one-seventh the amplitude and so on to infinity. Figure 4.10 shows how
the addition of each successive harmonic component makes the combined sine
waves look more and more like a square wave. While a mathematically perfect
square wave requires an infinite number of sine waves, only the frequencies to
which we are sensitive (as defined by the contrast sensitivity function) need be
used. This can be demonstrated by producing a set of sine waves and adding
various components until the complex wave becomes indiscriminable from a
true square wave.

Figure 4.9. Sine-wave (left) and square-wave (right) gratings of the same spatial frequency.
(from De Valois & De Valois, 1988)

Figure 4.10. Illustration of Fourier synthesis of a square wave (top left) and waveform changes
as various sinusoidal components are added. (from DeValois & DeValois, 1988)

Our demonstration of Fourier synthesis of a square wave involved energy
variations only along one dimension, i.e., a vertical grating only changes from
left to right. To synthesize natural images from a set of sine waves, one must
add the sinusoidal energy variations in two dimensions. Figure 4.11 shows how
a set of sine waves can be progressively summed in two dimensions to produce
a complex pattern. The top left shows the fundamental frequency and to the
immediate right is the power spectrum -- a two-dimensional graph of the
frequency, amplitude, and orientation of the sine wave components. Each
successive frame shows the number of spatial frequency components in the
picture. Although the computer screen generated the image using about 65,000
points, the picture is recognizable with only about 164 spatial frequencies.
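The one-dimensional square-wave synthesis described earlier can be sketched in Python. This is a minimal illustration, not a procedure from the original report: the function names, the sampling scheme, and the 4/pi scaling (which makes the limiting waveform alternate between -1 and +1) are assumptions made for the example.

```python
import math


def square_wave_partial_sum(x: float, n_components: int, f: float = 1.0) -> float:
    """Sum the first n odd harmonics of a square wave at position x.

    The k-th term has frequency k*f and amplitude 1/k (k = 1, 3, 5, ...),
    matching the series described in the text.
    """
    total = 0.0
    for i in range(n_components):
        k = 2 * i + 1  # odd harmonic number: 1, 3, 5, ...
        total += math.sin(2 * math.pi * k * f * x) / k
    return 4.0 / math.pi * total


def rms_error(n_components: int, samples: int = 1000) -> float:
    """Root-mean-square difference from an ideal +/-1 square wave over one cycle."""
    err = 0.0
    for j in range(samples):
        x = (j + 0.5) / samples  # sample midpoints, avoiding the edges
        ideal = 1.0 if x < 0.5 else -1.0
        err += (square_wave_partial_sum(x, n_components) - ideal) ** 2
    return math.sqrt(err / samples)


for n in (1, 2, 4, 16):
    print(n, "components, RMS error:", round(rms_error(n), 3))
```

The printed error shrinks as harmonics are added, which is the numerical counterpart of the waveforms in Figure 4.10 looking progressively more square.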
Fourier analysis has been used in psychophysical experiments to successfully
predict performance on visual detection, discrimination, and recognition tasks
with complex stimuli (for reviews see Sekuler, 1974; DeValois & DeValois,
1988). This approach involves a number of assumptions that are true only
under a restricted set of conditions. The advantages of this approach, however,
should be obvious for evaluating displays. Under some conditions, the contrast
sensitivity function might be used as a filter through which the visibility of
various components of a pattern or the whole pattern can be predicted.

Figure 4.11. Illustration of Fourier synthesis of a complex image by the successive addition
of sinusoidal components in two dimensions. (from DeValois & DeValois, 1988)

Studies conducted at the Boeing company have used the contrast sensitivity
function to predict image quality. Since the contrast sensitivity function defines
the energy required for a threshold response, the energy above threshold should
contribute to pattern visibility. When this suprathreshold energy is summed
across spatial frequencies, it correlates highly with subjective measures of image
quality (Klingberg et al., 1970). The contrast sensitivity function has also been
used with success to predict image quality of other display parameters such as
target size, display background, and clutter (Snyder, 1988).
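The idea of summing suprathreshold energy across spatial frequencies can be illustrated with a small Python sketch. It is not the metric used in the Boeing studies, whose details are not given here. The function name, the use of contrast as the measure of energy, and all numbers are assumptions chosen to show the principle.

```python
def suprathreshold_sum(component_contrast: dict, threshold_contrast: dict) -> float:
    """Sum, over spatial frequencies, the contrast that exceeds threshold.

    Both arguments map spatial frequency (c/deg) to contrast. Components at or
    below threshold are treated as invisible and contribute nothing.
    """
    total = 0.0
    for freq, contrast in component_contrast.items():
        threshold = threshold_contrast[freq]
        total += max(0.0, contrast - threshold)
    return total


# Hypothetical observer thresholds (the reciprocal of contrast sensitivity).
thresholds = {1.0: 0.01, 4.0: 0.004, 16.0: 0.02}

# Hypothetical component contrasts for a sharp and a blurred version of a display.
sharp = {1.0: 0.5, 4.0: 0.3, 16.0: 0.1}
blurred = {1.0: 0.5, 4.0: 0.15, 16.0: 0.01}

print("sharp:  ", round(suprathreshold_sum(sharp, thresholds), 3))
print("blurred:", round(suprathreshold_sum(blurred, thresholds), 3))
```

In this example the blurred display scores lower because its high-frequency component falls below threshold and its mid-frequency component is weakened.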
Form-Color Interactions
If the eye is alternately exposed to a red vertical grating and a green horizontal
grating for about 5-10 minutes while the observer freely scans, there will be a
powerful aftereffect. If a black-and-white grating is used as a test stimulus,
following adaptation the observer will see the white region as green when the
stripes are vertical and as red when they are horizontal. These aftereffects are
in the opposite direction to the adapting condition and are contingent on the
orientation of the test pattern. This is known as the McCollough (1965) effect.
Color-contingent aftereffects under these conditions are quite long-lasting -- up
to months in some cases -- and cannot be attributed to traditional after-images.
Effects of this sort are not uncommon for individuals who work on video
display units. Exposure to red or green symbology on a display with a dark
background would later be expected to cause white letters to appear green or
red, respectively.
Depth Perception
Information about size, color, contrast, and motion are not all that we need to
understand our visual environment. We also need to perceive the positions of
objects in space, an ability called depth perception. There are two major classes
of cues that we use to perceive depth. Monocular depth cues provide information
about depth that can be extracted using only one eye. Binocular depth cues rely
on an analysis of slightly different information available from each of the two
eyes.
Monocular Depth Cues
If you close one eye and look around, you will probably not be confused about
the relative distances of most objects. Your perception of distance in this case is
based on monocular cues which are even more powerful than some of the
binocular cues to depth (Kaufman, 1974).


The size of objects can sometimes indicate their relative depth. If several similar
items are presented together, the larger items will be judged as closer. For
example, the series of circles in Figure 4.12 appears to be receding into the
distance. This makes sense because, in fact, the size of an object's image on the
retina becomes progressively smaller as it moves away.
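The shrinking of the retinal image with distance can be made concrete by computing the visual angle an object subtends. The sketch below assumes an object viewed face-on and centered on the line of sight; the function name and the example sizes and distances are illustrative only.

```python
import math


def visual_angle_deg(size: float, distance: float) -> float:
    """Visual angle (degrees) subtended by an object of a given size.

    size and distance must be in the same units; the object is assumed to be
    viewed face-on and centered on the line of sight.
    """
    return math.degrees(2.0 * math.atan(size / (2.0 * distance)))


# A hypothetical 10 m wide object seen from increasing distances (meters).
for d in (100, 200, 400, 800):
    print(d, "m ->", round(visual_angle_deg(10.0, d), 3), "deg")

# Two objects of different sizes can subtend the same angle:
print(visual_angle_deg(1.0, 10.0), visual_angle_deg(2.0, 20.0))
```

The last line shows why image size alone is ambiguous: a small, near object and a large, far object can produce the same visual angle.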

Figure 4.12. Illustration showing how the size of an object influences the perception of
distance.

The ability to infer distance from image size often depends on familiarity with
the true size of the objects. At great distances, such as looking down from an
airplane, we perceive objects to be smaller than when they are near. In this
situation, our familiarity with objects and their constancy of size serve as a
source of information about distance. Although from the air a house seems like
a toy, our knowledge about the actual size of houses informs us that the house
is only farther away, not smaller.
The relation between size and distance can lead not only to faulty inferences
about distance, as illustrated by Figure 4.12, but assumptions about distance
can also lead to faulty inferences about size. When we are misinformed about
distance, our perceptions of size and shape will be affected. You have probably
noticed, for example, how much larger the moon appears when it is low on the
horizon than high in the evening sky. This is called the moon illusion. The
change in the moon's appearance is only slightly affected by atmospheric
phenomena; by far the greatest effect is perceptual. Our retinal image of the
moon is the same size in both positions. You can prove this by holding at arm's
length a piece of cardboard just large enough to block the moon from view.
The same piece of cardboard blocks the moon at the horizon and at its zenith
equally. Though they look different, they measure the same. The moon illusion
seems to be caused by inaccurate distance information about very far objects
(Kaufman & Rock, 1962). Because we see intervening objects on the earth's
surface when we look at the moon near the horizon, our internal distance
analyzers apparently cue us that the moon is farther away than when it is at its
zenith. An object analyzed as more distant has to be larger to produce an image
of the same size. Thus we perceive the moon as larger on the horizon than
when it is at its zenith.
The relationship between size and distance is important for understanding not
only harmless illusions such as the size of the moon, but also situations of
more significance. As mentioned above, changing fixation from a head-up
display to distant objects often requires a change in the state of
accommodation. Change in the focus of the eye is accompanied by a change in
the apparent visual angle of distant objects. Thus, when a pilot shifts fixation
from a HUD to a distant surface in the outside world, the objects in the
distance may appear smaller and more distant than they really are (Iavecchia,
Iavecchia & Roscoe, 1988). While the resultant spatial errors in perception are
temporary, Iavecchia et al. believe they could introduce a significant safety hazard
under some conditions.
Any ambiguity about relative distance in relation to size can be rectified when
one object partially occludes another, as shown in Figure 4.13. We perceive the
partially occluded object as being more distant. This cue to depth is called
interposition.

Figure 4.13. Illustration of interposition as a monocular cue for distance.

If a distant object is not partially occluded, we may still be able to judge its
distance using linear perspective. When you look at a set of parallel lines, such
as railroad tracks going off into the distance, the retinal images of these lines
converge because the visual angle separating corresponding points on the two
lines decreases as the points are farther away. This cue to depth is so powerful that
it may cause objects of the same size to be perceived as different, as in Figure
4.14.

Figure 4.14. Illustration of how linear perspective makes the same size objects appear to
be different sizes. (from Sekuler & Blake, 1985)

If you look at a textured surface such as a lawn, two blades of grass the same
distance apart would be separated by a smaller distance in the retinal image the
farther away they are because they cover a smaller visual angle. Most surfaces
have a certain pattern, grain, or texture such as pebbles on the beach or the
grain of a wood floor. Whatever the texture, it becomes denser with distance.
This information can provide clear indications of distances (Newman, Whinham
& MacRae, 1973). Figure 4.15 shows how discontinuities in the texture also
indicate a change such as an edge or corner.
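The way texture becomes denser with distance can be sketched with simple geometry. The example assumes an observer whose eye is at a fixed height above a flat surface and texture elements that are evenly spaced along the ground; the function names, eye height, and spacing are hypothetical values chosen for illustration.

```python
import math


def depression_angle_deg(eye_height: float, ground_distance: float) -> float:
    """Angle below the horizon (degrees) of a point on flat ground."""
    return math.degrees(math.atan2(eye_height, ground_distance))


def angular_spacing_deg(eye_height: float, near: float, spacing: float) -> float:
    """Visual angle between two ground points `spacing` apart, starting at `near`."""
    return (depression_angle_deg(eye_height, near)
            - depression_angle_deg(eye_height, near + spacing))


# Hypothetical values: eye 1.6 m above a flat surface, texture elements 0.5 m apart.
for d in (2, 4, 8, 16, 32):
    print(d, "m ->", round(angular_spacing_deg(1.6, d, 0.5), 3), "deg")
```

The same physical spacing covers a progressively smaller visual angle as the ground distance grows, which is the texture gradient described above.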
Of special relevance to aircraft pilots is the depth cue known as aerial
perspective. As light travels through the atmosphere, it is scattered by molecules
in the air such as dust and water. The images of more distant objects are thus
less clear. Under different atmospheric conditions, the perceived distance of an
object of fixed size may vary. For example, an airport will appear farther away
on a hazy day than on a clear day.

Figure 4.15. Illustration of texture gradients as a cue to distance. (from Gibson, 1966)

Some monocular cues to depth are not static, but are dependent on relative
movement. When we are moving, objects appear to move relative to the point
of fixation. The direction and speed of movement are related to their relative
distances. This is illustrated by Figure 4.16. Objects that are more distant than
the point of fixation move in the same direction as the observer. Objects in
front of the point of fixation move opposite to the direction of the observer.
You can demonstrate this by holding two fingers in front of you at different
distances and then observing their relative displacement as you move your head
back and forth. The difference in how near and far objects move, called motion
parallax, is probably our most important monocular source of information about
distance. Motion parallax occurs from any relative motion -- moving the whole
body, the head, or the eyes.
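A rough numerical sketch of motion parallax is given below. It assumes an observer translating sideways while fixating a point, with both the object and the fixation point close to the line of sight, so that angular velocity is approximately speed divided by distance. The function name, sign convention, and values are assumptions made for this example.

```python
import math


def relative_angular_velocity_deg(speed: float, object_distance: float,
                                  fixation_distance: float) -> float:
    """Approximate angular velocity (deg/s) of an object relative to fixation.

    Positive values mean the object appears to move opposite to the observer;
    negative values mean it appears to move with the observer.
    """
    return math.degrees(speed / object_distance - speed / fixation_distance)


# Hypothetical case: observer moving sideways at 1 m/s, fixating a point 10 m away.
for d in (5.0, 10.0, 20.0):
    rate = relative_angular_velocity_deg(1.0, d, 10.0)
    print(f"object at {d:4.1f} m: {rate:+.2f} deg/s")
```

Objects nearer than fixation get a positive value (apparent motion against the observer), objects beyond fixation get a negative value (apparent motion with the observer), and the speed grows with the separation in depth from the fixation point.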
Motion perspective is a phenomenon related to motion parallax. It refers to the
fact that as we move straight ahead, the images of objects surrounding the
point of fixation tend to flow away from that point. Figure 4.17 illustrates
motion perspective for an individual walking through the stacks of books in a
library. If the observer were to back up, the flow pattern would contract rather
than expand. These optic flow patterns carry information about direction,
distance and speed, and are believed to be an important depth cue used by
pilots to land planes (Regan, Beverly & Cynader, 1979).

Figure 4.16. Illustration of motion parallax. (from Coren, Porac, & Ward, 1984)

Figure 4.17. Illustration of motion perspective for a person who is moving and fixating
straight ahead. (from Matlin, 1983)

Ocular Convergence
When fixating a distant object, the image of the object will fall on the fovea of
each eye. As the object is brought nearer, maintenance of fixation will require
that the two eyes move inward or converge. This information about
convergence can be used to gauge the absolute distance of objects, provided the
objects are not more than about 10 feet away. Beyond this distance, the
convergence angle of the two eyes approaches zero.
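The fall-off of the convergence angle with distance can be checked with a short calculation. The sketch assumes a fixation point straight ahead and uses an eye separation of about 3 inches, the approximate figure given in this chapter; the function and constant names are illustrative.

```python
import math


def convergence_angle_deg(distance: float, eye_separation: float) -> float:
    """Angle (degrees) between the two lines of sight when fixating a point
    straight ahead at `distance`. Both arguments use the same units."""
    return math.degrees(2.0 * math.atan(eye_separation / (2.0 * distance)))


EYE_SEPARATION_IN = 3.0  # inches; approximate value used in this chapter

for feet in (1, 2, 5, 10, 50):
    angle = convergence_angle_deg(feet * 12.0, EYE_SEPARATION_IN)
    print(f"{feet:3d} ft -> {angle:.2f} deg")
```

Under these assumptions the angle is already below 1.5 degrees at 10 feet and changes very little beyond that, which is why convergence is informative only for near objects.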

Because the two eyes are


separated by about 3 inches, the
visual fields are slightly different
for the two eyes (refer back to
Figure 2.16 in chapter 2). In the
region where the two eyes have
overlapping visual fields, they will
receive slightly different images of
objects. This is easily verified by
alternately fixating an object a
few feet away with one eye and
then the other. With the left eye
you will see more of the left side
of the object, and with the right
eye you will see more of the right
side of the object. This difference
between the images in the two
eyes is referred to as retinal
disparity or binoculardisparity.

F
/"

Figure 4.18 shows how binocular


disparity arises. When we fixate
on point F, both eyes are
Figure 4.18.
Schematic illustration of binocular
disparity. (from Werner &
oriented so that the image falls
1991)
Schlesinger,
in
fovea
the
of
on the center
each eye. Images from objects at
other distances from our eyes -- for example, the tree in Figure 4.18 -- will fall
onto different locations in relationship to the foveas. This happens because the
two eyes have different angles of view. Images of objects that are either inside
or outside the half-circle in Figure 4.18 will strike the two retinas differently.
Thus, disparate signals from each eye will be sent to the brain where
comparisons are made by specialized cells; different cells are tuned to respond
according to the amount of disparity (Pettigrew, 1972). The amount of
binocular disparity (specified in arc units) provides us with information about
how far in front of or behind our fixation point (F) an object is. The ability to
judge depth using retinal disparity is known as stereopsis.
There are several ways to demonstrate stereoscopic depth from two-dimensional
images. Wheatstone (1838) showed that if one image is presented to one eye
and another image to the other eye through a stereoscope, the images could be
fused and a three-dimensional image could be seen. Today, 3-D movies are
created by projecting two (disparate) images on a screen. Separation of the
images is made possible by projecting them with polarized light of orthogonal
orientations. If the viewer has polarizing glasses, the two images will be
separately projected to each retina, fused, and perceived as three-dimensional.
The remarkable ability of the brain to extract information about depth was
demonstrated by Julesz (1971) through patterns called random-dot stereograms.
A random-dot stereogram is shown in Figure 4.19. The two squares consist of
dots placed randomly within the frame. However, in one frame the dots from a
small square region were displaced (moved) slightly. When these two images
are presented separately to each eye, the displaced dots will produce retinal
disparity. Thus, we will perceive the subset of dots within the square as lying in
front of or behind the other dots. The ability of the visual system to correlate all
of these random dots shows that retinal disparity does not require a comparison
of specific forms or features of objects. One possible basis for extracting the
information in the two eyes quickly might be for the visual system to process
the spatial frequency content in the two images (Frisby & Mayhew, 1976).

Figure 4.19. A random-dot stereogram. (from Julesz, 1971)
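The construction described above can be sketched in a few lines of code. This is
a minimal illustration, not Julesz's actual procedure: the image size, patch
size, shift, and function name are all assumed for the example. One image is
filled with random dots; the other is a copy in which a central square patch is
moved sideways and the uncovered strip is refilled with fresh random dots.

```python
import random

def random_dot_stereogram(size=40, patch=12, shift=2, seed=0):
    """Return (left, right) images as lists of rows of 0/1 dots.
    The right image copies the left, except that a central square patch is
    displaced horizontally by `shift` dots; the strip uncovered by the
    displacement is refilled with new random dots."""
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]
    start = (size - patch) // 2
    for r in range(start, start + patch):
        for c in range(start, start + patch):
            right[r][c - shift] = left[r][c]      # displaced copy of the patch
        for c in range(start + patch - shift, start + patch):
            right[r][c] = rng.randint(0, 1)       # refill the uncovered strip
    return left, right

left, right = random_dot_stereogram()
# Show the top rows of the left image; no square is visible in either image alone.
for row in left[:5]:
    print("".join("#" if dot else "." for dot in row))
```

Neither image contains a visible square when inspected alone; the square is
defined only by the horizontal offset between corresponding dots in the two
images.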


Random-dot stereograms demonstrate the keen sensitivity of the human visual
system to binocular disparity. There are many practical implications of this
ability. For example, if a counterfeit dollar bill is placed on one side of a
stereoscopic viewer and a genuine dollar bill on the other, the two can be
compared and differences of 0.005 mm can be detected because they will stand
out in depth. Other virtues of stereovision are well known to aerial surveyors
and experts in aerial surveillance. Under optimal conditions, stereoscopic depth
can be used to resolve displacement in depth of about 2 sec of arc. This
corresponds to a difference that is smaller than the diameter of a single cone
receptor.
Stereoacuity varies with the distance of the object. Beyond about 100 feet,
retinal disparity diminishes so greatly that this cue to depth is not useful. Thus,
it is sometimes noted that routine aspects of flying an airplane do not require
stereopsis, but it is helpful when moving the plane into the hangar (DeHaan,
1982).
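The loss of stereoscopic precision with distance can be illustrated with the
standard small-angle relation between disparity and depth: disparity is
approximately the eye separation times the depth difference, divided by the
square of the viewing distance. The sketch below assumes the best-case threshold
of 2 seconds of arc mentioned above and an eye separation of 3 inches (0.25
feet); thresholds under ordinary viewing conditions are coarser, so these values
are best read as a lower bound.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi  # about 206,265

def smallest_depth_step_ft(distance_ft, threshold_arcsec=2.0, separation_ft=0.25):
    """Approximate depth difference (in feet) that produces a disparity equal
    to the threshold, using the small-angle relation
    disparity ~= separation * depth_step / distance**2."""
    threshold_rad = threshold_arcsec / ARCSEC_PER_RADIAN
    return threshold_rad * distance_ft ** 2 / separation_ft

# Print the smallest resolvable depth step at a few viewing distances (in feet).
for d in (3, 10, 100, 1000):
    print(f"{d:>5} ft: {smallest_depth_step_ft(d):.4f} ft")
```

Because the distance enters as a square, a tenfold increase in viewing distance
makes the smallest resolvable depth step a hundred times larger.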
Binocular Rivalry
If the scenes presented to each eye are very different, such as when the images
of objects are too binocularly disparate, the visual system does not fuse the
images. Rather, views of the two scenes may alternate from one eye to the
other or a mosaic that combines portions of the two images may alternate. This
is known as binocular rivalry and can occur whenever the images presented to
each eye are too different to be combined. Apparently the visual system
attempts to match the images from the two eyes and when this cannot be done,
one of the images or at least portions of one image are suppressed.
During early life, the images to the two eyes may be chronically discordant due
to the two eyes being improperly aligned, a condition known as strabismus. If
this condition is not corrected in early childhood, the input from one of the
eyes may become permanently suppressed and the individual will be stereoblind,
that is, incapable of using stereoscopic cues to depth. Whether due to
strabismus or other causes, about 5-10% of the population is stereoblind
(Richards, 1970).
Color Stereopsis
When deeply saturated colors are viewed on a display, it sometimes appears
that the different colors lie at different depths. This phenomenon, known as
color stereopsis or chromostereopsis, is illustrated in Figure 4.20. The effect is
most clearly seen with colors that are maximally separated in the spectrum. On
displays, red may appear to be nearer than blue.


Figure 4.20. An illustration of color stereopsis.

Color stereopsis is due to retinal disparity arising from chromatic dispersion by
the optics of the eye. Short wavelengths are imaged more nasally than long
wavelengths and the resultant retinal disparity leads to the perception that the
different colors are at different depth planes. As pointed out by Walraven
(1985), display operators can minimize this effect by using less saturated colors
or brighter backgrounds.
Implications for Displays

Stereopsis provides a little-used channel for presenting information on visual
displays. By using retinally disparate images, it is possible to create more
realistic portrayals of the external environment than would be possible on
displays carrying only monocular information. Whether stereo imagery on
displays would improve performance in the cockpit should be further studied.
There is some evidence that it can decrease response time, increase recognition,
and reduce workload (Tolin, 1987).
Perhaps the most interesting applications of stereo displays are not in the
cockpit, but in the control tower (Williams & Garcia, 1989). The workload of
traffic controllers could conceivably be reduced if aircraft could be seen in
three-dimensional rather than two-dimensional space. Methods for generating
such "volumetric" displays and evaluation of human performance with these
displays provide an interesting challenge for the future.


Chapter 5
Information Processing
by Kim M. Cardosi, Ph.D., Volpe Center
based on material presented by Peter D. Eimas, Ph.D., Brown University
What Is the Mind?
An important belief shared by cognitive psychologists is that the mind has many
components that perform different functions. We can measure the time it takes
for the different parts of the mind to do their jobs, even though our experience
of information processing or of any cognitive function is that it happens
instantaneously. In laboratory research, psychologists can parcel our mental
processes into component parts and measure the time it takes for each
component task to be accomplished.


The Brain as an Information Processor


Figure 5.1 shows one representation of the mind as an information processing
system. This system is a product of the brain. There are at least 10 billion,
and probably 100 billion, cells called neurons in the brain. Each neuron has
between 100 and 10,000 connections. It is a very large system, and it is its size
that permits us to perform the many mental tasks that we do so well, for example,
communicate by means of language, solve problems, and monitor complex
physical systems that inform us about events in the environment.
Figure 5.1. Boxology diagram of mental processing. (original figure)
[Diagram labels: Environment; Sensory Systems & Sensory Buffers; Attention;
Perception - Pattern Recognition; Long-Term Memory (Knowledge, Autobiographical
Memories, Rules); Intentions, Expectations, Goals, Desires; Working (Short-Term)
Memory; Consciousness (?); Response System: Decisions, Response Execution;
Responses]


Information from the environment comes in through sensory systems, both
internal and external. An example of an internal sense is hunger pangs. External
senses are sight, hearing, smell, taste, and touch. Both internal and external
senses provide our minds with information that flows through the system, and
results in a response. Our responses or behavior can change the environment
and create a new situation. Then we may respond in a different way to the new
situation we have helped to create. That's why the diagram shows the arrows
going from response back to the environment.
The first box in Figure 5.1 represents the senses, which actually detect the
things and events in the environment, and the sensory register. In the sensory
register, information is held for a very brief period of time (less than one
second) while it is selected or filtered and ultimately processed so as to provide
us with the percepts that we experience. Information is processed by means of
what has been called a pattern recognition system. We have patterns (such as
your name, a familiar voice, an aircraft call sign) stored in long-term memory
that help us with the recognition process. Long-term memory is the memory you
have for your entire life. This includes all of the knowledge you have, what
you've learned in school through all the years, the expertise you've gained in
your work, etc. It also has your autobiographical memories, what you did when,
with whom. To recognize a pattern, to know what something is and its
significance, means you have matched it to something you already know.
When we recognize selected information, we hold it in short-term memory, also
called working memory. Short-term memory is like the central processing unit of
a computer. It's where we do our work, where we solve our problems -- at least
partially -- where we bring information together from short- and long-term
memory that begins to answer the questions that are posed to us by our
environment. Short-term memory has a limited capacity. In it, we can store
approximately five to nine items (e.g., letters) or chunks of information (e.g.,
words) for up to one minute. Information that can be retrieved after one
minute is said to have been transferred to long-term memory. Long-term
memory has unlimited capacity, but retrieval can be a problem. That is, the
information is known to be in long-term memory but it, at least temporarily,
cannot be transferred to short-term memory for use. Memory will be discussed
in more detail later in this chapter.
In summary, information can be viewed as constantly moving back and forth
between the outer world and the mind through our internal and external
senses. The information is filtered, processed by pattern recognition systems and
stored briefly in short-term memory, which may also be the site of
consciousness, and can under the right circumstances, be stored in long-term
memory indefinitely. This information in working memory can also be used to
make a decision and initiate a response. Decision making and response selection
will be covered in Chapter 7.
We can classify our stored knowledge as explicit or implicit. Explicit knowledge
is knowledge that you have direct and immediate access to. This includes your
name, your phone number, what you do for a living, who your spouse is, what
your children's names are, all the knowledge you have about your expertise and
your profession, etc. All of these are explicit forms of knowledge that you can
describe in detail as well as use for many tasks of a cognitive nature.
Implicit knowledge is knowledge that you have, but you are not able to describe;
that is to say, you do not have direct access to this knowledge. Good examples
of implicit knowledge are things like riding a bicycle, playing tennis, catching a
baseball, the syntactic rules of your language, etc. Most likely, unless you're a
physicist, you have no explicit knowledge of the laws of physics that you use
when doing such things as riding a bicycle. Nevertheless, you can do them
properly. Your implicit knowledge enables you to do so -- it is available for
certain tasks, but it is not available to consciousness.
Figure 5.1 breaks things up rather neatly, as if these processes occur separately,
taking a lot of time. However, information processing, perception, speaking, and
listening go on very, very quickly. The diagram shows mental activity occurring
in accord with a serial processing system; that is, we do one thing at a time.
However, there is the belief that parallel processing (doing more than one thing
at a time) also occurs. It is difficult to substantiate that parallel processing goes
on in the mind because the measuring instruments are limited. We can measure
the electrical activity of someone's brain and say that the brain is working
because we see the blips on the electroencephalogram. We can be much more
precise and say that certain areas are working. What appears to be true is that
some of those areas are working in parallel. Indeed, if we think of all the
events that must occur during perception of visual scenes or spoken language,
parallel processing would seem to be absolutely necessary if we are to explain
how these processes, these mental activities, could occur so quickly.
Another box in Figure 5.1 is attention. Attention is simply the part of the mental
system that directs us to one sort of information rather than another. We are
able to attend to a particular stimulus even in the presence of an enormous
amount of other stimulation. This ability to selectively attend to specific
information will be discussed in detail in Chapter 8.
Information processing takes time, as noted above. The time required to process
information depends upon many factors. In most cases, information will be
processed only to the extent that is required by the task. The more complex the
task, the more time will be required. For example, in any array of colored
numbers, more time will be needed to count the blue numbers than to decide
whether or not blue numbers are present in the display. Still more time will be
needed to add the blue numbers. This type of difference in the level of
processing is referred to as "depth" of processing. The more, or "deeper," the
information is processed, the easier it will be to remember (Craik and Lockhart,
1972). For example, a controller is more likely to remember "seeing" an aircraft
that he or she has communicated with several times than one with which no
communication was required. In our previous example of an array of colored
numbers, the person who added the blue numbers would have more success in
recalling them than the person who counted the same numbers. Information
that is not specifically attended to is not likely to be remembered. The more
attentional resources spent on processing the information, the more accurately
the information will be remembered. This has implications for complex tasks in
which it is important to remember certain pieces of information. We can
maximize the chances of being able to remember information by requiring that
the information be used or processed in some way. Information that is not
actively attended to will not be easily recalled from memory when needed.
Attention
In Principles of Psychology (1890), William James defined attention as "the mind
taking possession, in clear and vivid form, of one of what seems several
simultaneous possible objects or trains of thought. It implies withdrawal from
some things to deal effectively with others." In processing information we can
focus on specific information at the expense of other information, and we can
shift our attention from one thing to another. What are the costs of focused
attention? How do you move attention around?
Attention directs us to something particular. Some researchers consider human
mental processing to be, for the most part, a serial processing system like the
central processing system of most computers. Computers do one thing at a time,
but they do them very, very quickly; performing millions of operations per
second. Our neurons are not as fast. In fact, they're incredibly slow. So what we
probably do is group great masses of them together to do things and use
parallel processing. One mass of neurons in one section of the brain does one
thing, while another mass in another section does another thing.
The attention mechanism that directs our processing energy works both within
a sensory modality (i.e., within vision or within audition) and across
modalities. There may be two types of attention: one that directs you to a
modality, and one that works within a modality. Alternatively, there may be a
single central processor in the mind that is responsible for prioritizing incoming
information.

Selective Attention

Some of the early scientific work on attention began around 1950 in a group
led by Donald Broadbent in England. It took its impetus from a phenomenon
that came to be called "the cocktail party effect." If you go to a cocktail party
where there are only a couple of people and the noise level is not bad, it's easy
to understand the person you're talking to. After 150 people have arrived, the
noise level is overpowering; if you recorded it, it would sound like gibberish. It
would be very difficult to pick out one conversation on the tape and pay
attention to it. However, an individual at the cocktail party can begin to and
continue to attend to a speaker and understand what that speaker is saying
despite distracting noise.
One factor that makes this possible is the distinctiveness of the voice of the
person who is speaking to you; it is easier to attend to an individual if the
voice is distinctive in some way. For example, it would be easier to attend to a
woman's voice when the distracting voices were men's voices, because the
pitch of a woman's voice tends to be very different from a man's. Another factor
is the direction of the voice. You can focus on a voice by virtue of the direction
it comes from: a voice coming from a certain direction hits one ear earlier than
the other by a very precise amount of time. Other factors that allow you to
attend to a particular voice at a noisy cocktail party are the coherence or
meaning of the speech, the nature of the voice, and the emphasis given to the
words. These kinds of simple matters were related by Broadbent to what was
called "picking up a channel of information and staying attached to it."
Neisser and Becklen (1975) performed interesting experiments that show the
power of selective attention. They showed videotapes of games to subjects and
had them perform simple tasks. In one tape, three men bounced a basketball
back and forth to each other. The subjects' task was to count the bounces.
Then, Neisser showed a tape of two people playing a handslapping game. The
subjects' task here was to count the number of hits. If either task was
performed alone, counting accuracy was near perfect. When the two tapes were
superimposed, it was still quite easy to count either the number of ball bounces
or the number of hand slaps. Trying to count both at the same time, however,
was quite difficult. It was so difficult that the subjects failed to notice the "odd"
events of the ball disappearing or the men being replaced by women. This is
one example of the filtering of information. We can attend to and process
complex information quite efficiently. However, if the task is attentionally
taxing, we may not process all of the information available to us.
Filtering can occur at both high and low levels. In low-level filtering, also called
early selection, the person can respond to stimuli more quickly because simpler
processing (e.g., male vs. female voice) allows him or her to decide which
information is pertinent. High-level filtering, also called late selection, demands
more effort because you have to process the meaning of something, not just the
simple, physical characteristics of it. In this case, it is more difficult for the
person to filter out the unimportant information and decide what is pertinent.
Whether your selection of pertinent information occurs early or late depends
upon the task.

The Cost of Multiple Tasks


Johnston and Heinz (1978) sat people in front of a display box and instructed
them to press a button whenever a light came on. The light came on at random
intervals. Subjects simultaneously listened to a tape of excerpts from Reader's
Digest articles. Their task was to listen to the tape and press the button when
the light came on. The participants also had to answer simple true/false
questions about the passage at the end of each trial. These questions were
asked to ensure that the subjects attended to the tape and didn't neglect the
button-pressing task. Adding the task of listening to a message raised the time
required to respond to the light from 320 msec to 355 msec. Thus, there was a
small, but statistically significant, rise in response time for a very simple task
(i.e., a button press) when another simple and unrelated task (i.e., listening)
was added to it. As the experimenters made the listening task more difficult
(e.g., attend to one of two stories), response time rose with the difficulty of the
task. For example, it took an average of 387 msec to press the button in
response to the light as subjects tried to pay attention to one of two very
different messages (i.e., on different topics with one spoken by a man and one
spoken by a woman), and an average of 429 msec to respond to the light as
subjects tried to attend to one of two very similar messages (i.e., with same sex
speakers and similar content).
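The size of the dual-task cost in each condition follows directly from the
response times quoted above. The short sketch below simply subtracts the 320 msec
single-task baseline from each reported mean; the condition labels and the
function name are shorthand introduced here for illustration.

```python
BASELINE_MSEC = 320  # button press alone, as reported in the text

# Mean response times (msec) reported above for each listening condition.
conditions = {
    "one message": 355,
    "one of two dissimilar messages": 387,
    "one of two similar messages": 429,
}

def attentional_cost(rt_msec, baseline_msec=BASELINE_MSEC):
    """Added response time relative to performing the button press alone."""
    return rt_msec - baseline_msec

# Report the added time and the percentage slowdown for each condition.
for name, rt in conditions.items():
    cost = attentional_cost(rt)
    print(f"{name}: +{cost} msec ({100 * cost / BASELINE_MSEC:.0f}% slower)")
```

The added time grows from 35 msec for a single message to 109 msec when the
competing messages are similar, roughly a one-third increase over the baseline.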
These experiments demonstrate three things. First, the time required to conduct
even the simplest task will increase as other, even simple and unrelated tasks,
are added to it. Second, the more difficult the added task is, the higher the
attentional cost due to the additional burden on the attentional mechanism.
Third, this attentional cost can be measured in the laboratory. On the average
for these subjects, it took 320 milliseconds to simply press the button when the
light came on without any information being broadcast to the ears. If a stimulus
(e.g., a warning light or text message) appears directly in front of a person,
response time to it will be faster than if eye movements are required to fixate,
or focus on, the information. Similarly, if the stimulus appears within the
person's visual field, but in the periphery rather than at the fixation point,
response time will be lower than when an eye movement is required, but higher
than when the target appears at the fixation point. While we usually move our
eyes when we shift attention, this is not always necessary. We can shift our
mental focus, or internal attention. Even when shifting internal attention does
not involve eye movements, it does take time. The time required to shift
internal attention increases with the distance from the fixation point; attention
travels at a velocity of about 1° per 8 msec (Tsal, 1983). Furthermore, some
information that is presented during the time that it takes to make this shift
may not be processed (Reeves and Sperling, 1986).
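As a back-of-the-envelope illustration, the shift rate attributed to Tsal (1983)
can be turned into an estimate of how long an internal shift of attention takes.
The sketch assumes the rate is 1 degree of visual angle per 8 msec, treats the
relation as strictly linear, and ignores any fixed start-up time; the function
name is illustrative only.

```python
MSEC_PER_DEGREE = 8.0  # assumed: 1 degree of visual angle per 8 msec (Tsal, 1983)

def attention_shift_time_msec(eccentricity_deg):
    """Estimated time to shift internal attention to a target located
    `eccentricity_deg` degrees of visual angle from the fixation point."""
    if eccentricity_deg < 0:
        raise ValueError("eccentricity must be non-negative")
    return eccentricity_deg * MSEC_PER_DEGREE

# Print the estimate for a few target eccentricities (in degrees).
for deg in (2, 5, 10, 20):
    print(f"{deg:>2} deg from fixation: about {attention_shift_time_msec(deg):.0f} msec")
```

On these assumptions a target 5 degrees from fixation would take about 40 msec
to reach, and information presented during that interval is at risk of being
missed.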
Automatic and Controlled Processing
Automatic processing occurs in highly practiced activities like driving a car,
riding a bike, etc. You do it without necessarily being aware of what you're
doing. It just happens. Automatic processing is fast. It appears to be parallel,
that is, you can do more than one thing at a time, and it's fairly effortless.
Controlled processing means voluntary, one-step-at-a-time processing. It is a
rather slow process. It requires focusing attention on specific parts of complex
tasks. Acquiring controlled processing can be done simply by saying "Pay
attention to this." Acquiring automatic processing, on the other hand, may be
very slow or fast, depending on the task. At very low levels where the
distinctions are being made by simple kinds of physical stimuli, e.g., search for
a red object among varying colored objects, search for a curved line among all
straight lines, automatic processing can be achieved quickly. Things tend to
jump out. It's called the popout effect. If you ask subjects, "How did you find the
red square?" they say, "Well, it was kind of there. It popped out at me."
Whether there were one, two, or four choices that they were trying to
search for, it really didn't make any difference. It just seemed to show up to
them. They had to do less processing. It popped out. Something was
"automatically" happening to them. If you have to do high-level processing, such
as searching for particular letters in a field of other letters over and over again,
achieving automatic processing is much more difficult and takes much more
time. Automatic processing allows for development of fast, highly skilled
behaviors without eating up attentional resources.
Many things we do acquire a quality of automaticity, which is to say we do
these things automatically, without thinking much about them. For example,
learning to drive a car is a complex, difficult task. It is attentionally taxing and
even simple conversation is very distracting. An experienced driver, however,
can drive and carry on a conversation with ease. This is what we mean by
automatic processing. You can perform your primary task (e.g., driving) and
simultaneously perform another task (e.g., conversing) and do each as well as if
you were doing it alone. And, you are doing both without a great deal of stress
and effort because one of these tasks is being done automatically.
There are many examples of complex, difficult tasks becoming easier and less
taxing with practice. Any difficult task is, at first, attentionally all consuming;
extraneous or unexpected information is not likely to be processed. Sufficient
practice, however, can make even the most difficult tasks sufficiently easy to
deal with other incoming information. This is the advantage of automaticity.
When tasks or parts of tasks (subtasks), such as flying straight and level, are
performed automatically, resources are available to perform other tasks
simultaneously. While it is easy to see the benefits of automaticity, it is
important to be aware of the hidden costs. One of these costs is commonly
referred to as complacency. Since we devote less attention to tasks we can
perform automatically, it is easy to miss some incoming information - even
when this information is important (such as a subtle course deviation or a new
stop sign on a road traveled daily). We are most likely to miss or misinterpret
information when what we expect to see or hear differs only slightly from what
is actually there.

Expectation
Expectations are powerful shapers of perception. We are susceptible --
particularly under high workload -- to seeing what we expect to see and hearing
what we expect to hear. Even when we do notice the difference between the
expected and the actual message, there is a price to pay; it takes much longer
to process the correct message when another one is expected than when the
correct one is expected or when there are no expectations.
Scharf, Quigley, Aoki, Peachey, and Neeves (1987) demonstrated that even the
simplest of information processing shows a detrimental effect of a discrepancy
between the expected and actual information. They played a pure tone between
600 and 1500 Hz that was just barely audible and told subjects that this tone
would be played again during one of two time intervals. No tone was played in
the other interval. The subjects' task was to decide in which interval the tone
was played. When the tone that the subjects had to listen for (the target) was
the same frequency as the one they had heard first (the prime), subjects were
90% correct in identifying the interval that contained the tone. When the
frequency of the target was changed, performance suffered. For example, when
a 600 Hz tone was expected and a 600 Hz tone was the target, performance
was near perfect with 90% accuracy. When a 1000 Hz tone was expected and a
600 Hz tone was the target, performance was near chance with subjects
guessing which interval contained the tone with only 55% accuracy. The same
was true when the target tone was 1500 Hz and the prime was 1000 Hz. Even
a difference of only 75 Hz (with targets of 925 and 1075 Hz) resulted in a
drop in accuracy from 90% to 64%. This supports transfer of training principles
which will be discussed in detail in Chapter 7. The closer auditory warnings are
to what is expected (e.g., from simulator training or experience in other
aircraft), the easier it will be to "hear," all other things being equal.
The powers of expectancy are even more obvious in higher level processing,
such as speech perception. If you quickly read aloud, "the man went to a
restaurant for dinner and ordered state and potatoes," chances are any listeners
would hear "the man went to a restaurant for dinner and ordered steak and
potatoes." It is not surprising that there have been many ASRS reports of pilots
accepting clearances not intended for them after requesting higher or lower
altitudes. Again, we are most likely to make such mistakes when what we
expect to hear is only slightly different from what should be heard (as with
similar call signs).

Pattern Recognition
Pattern recognition is one of the components of our model of information
processing (Figure 5.1). The word "pattern" refers to anything we see or hear or
really sense by any means. Our ability to perceive and identify patterns --
whether words or objects -- depends heavily on our ability to match the pattern
that we see or hear with the representations of patterns that are stored in
memory. We refer to this matching as pattern recognition.
There have been many theories of pattern recognition. The template theory states
that there are entire patterns stored in our brains as whole patterns. When we
see or hear something, we match this to one of the stored patterns to identify
it. The problem with this theory is that we would need an infinite number of
templates to match the innumerable ways in which an object may be presented
to us -- one stored pattern for each different pattern in a different size and
orientation. For example, consider an individual letter "Z." This letter may be
presented to us in print (in either upper or lower case) or handwritten by many
different writers. While no one template would fit all of these Z's, we usually
have no trouble recognizing Z's as such.
A similar theory, the feature theory, states that incoming information is broken
down into its component physical characteristics or features and their relations.
A "Z" for example, can be broken down into two horizontal parallel lines, an
oblique line, and two acute angles. There is some physiological evidence to
suggest that our brains do process some information in this way. There are
brain cells that respond only to horizontal lines, others that respond only to
vertical lines, etc. But that does not mean that we process all information in
this way. In fact, it would be difficult to explain the identification of most real
world objects in this way. For example, by what features do we recognize a
dog? There are barkless dogs, tailless dogs, dogs with three legs, hairless dogs,
etc. Whatever feature we might consider using to define "dog," we are sure to
think of an exception.
The template and feature theories both assume a "bottom-up" mode of
information processing. That is, they assert that we process information by
beginning with the physical aspects of the stimulus and working up to its
meaning. In a "top-down" approach, the meaning is accessed first or at least in
parallel with other information (usually with the aid of contextual cues), and
then that information helps us process the physical features. For example, none
of the characters in Figure 5.2(a) appear at all ambiguous; there is a clearly
definable "A", "B", "C", and "D". However, in a different context, the "B"
now appears to be a 13. If you only saw Figure 5.2(b), you wouldn't think that
any of those numbers were ambiguous, yet the "13" and the "B" are exactly the
same. This is an example of the use of contextual cues to identify an ambiguous
signal. When surrounded by the letters "A", "C", and "D", we see a "B"; when
surrounded by the numbers "12", "14", and "15", we see a "13". There are many
studies that show that an appropriate context aids our ability to identify visual
stimuli. For example, lines are easier to identify when they are presented in the
context of an object, such as a box, than when they are presented alone (e.g.,
Weisstein and Harris, 1974). Letters are easier to identify when they are
presented in a word than when they are presented alone (Reicher, 1969).

[Figure 5.2. Example of the use of contextual cues to identify an ambiguous signal: (a) the ambiguous character among letters; (b) the same character among numbers. (original figure)]

Palmer (1975) showed subjects pictures of a loaf of bread, a mailbox, and a
drum. The bread and the mailbox were physically very similar. The subjects'
task was to decide which of the three pictures they saw. The subjects saw the
pictures for such a short period of time that they could not be sure of which
picture they saw. Sometimes, before seeing one of these pictures, subjects were
presented with a scene such as a kitchen scene (i.e., a picture of a kitchen
counter with utensils, food, etc.). When subjects saw a scene that was
appropriate for the target picture (such as seeing the kitchen scene before
seeing the loaf of bread), accuracy was significantly better than when they saw
nothing before seeing the target. Performance suffered when subjects were "led
down the garden path" with an inappropriate context and a target object that
was physically similar to an appropriate object. For example, after seeing the
kitchen scene, many subjects were sure they had seen the loaf of bread even if,
in fact, they had been shown the mailbox.
In most cases, context helps or hurts us by setting the stage for expectations.
When what we see or hear is compatible with what we expect, we process the
information quickly and accurately. When it is incompatible, performance
suffers. Examples of this can be found in videotapes of simulation studies where
pilots say what they are thinking throughout the session. In an early TCAS
simulation study, one pilot saw the traffic display and was so convinced that a
"climb" advisory would follow that he never heard the many repetitions of the
"descend" command. (See pp. 313-314 for a detailed discussion.)
Our pattern recognition system is set into motion every time our senses perceive
something. It is the first step toward processing complex information and
problem solving. It is important to understand that pattern recognition cannot
be considered in isolation. When we want to know how easy it will be to see
or hear a particular stimulus (whether a simple line or tone or a complex
message), we must consider the physical attributes of the stimulus, the context
in which it will be presented, and the knowledge or expectancies of the
perceiver.

Speech Perception
One example of complex pattern recognition is the comprehension of speech.
Speech perception is a very interesting problem. Almost any small computer is
capable of producing intelligible speech with the appropriate software and
hardware. Nevertheless, it is incredibly difficult to get even the most
sophisticated supercomputer to understand what the small stupid one said.
These computers fail almost completely when they listen to a variety of human
speakers say a variety of different things.
The French equivalent of Bell Laboratories has developed an automatic
telephone where the caller speaks the number into the phone rather than
dialing it. It works remarkably well with one notable exception. The phone does
not usually work for Americans or other non-native French speakers, even
though they may speak French very well. It appears totally unable to process
the call. Why can't this computer recognize American French as well as French
people can? The speech recognition systems that work best are "trained" to
individual speakers who use a limited vocabulary. The speaker says the words
to be used into the computer several times. The computer system then "learns"
to recognize this limited set of words under ideal conditions. One necessary
condition is a quiet environment since the computer can't differentiate between
speech sounds and similar noises. Once the speech recognition system is trained
to a speaker, it cannot tolerate much change in the speaker's voice, such as the
rise in pitch that is often induced by stress.
To understand why speech recognition is so difficult, we must first examine the
complexities of the speech signal. A spectrogram is a physical representation of
the speech signal. It plots the frequencies (in Hz) of the speech sounds as a
function of time. An examination of a spectrogram of normal speech reveals
that it is impossible to say where syllables begin and end; words can only be
differentiated when they are separated by silent pauses and these pauses do not
always exist in natural speech which is quite rapid. This presents a problem for
computers, since they are limited to the physical information in processing
speech. We, on the other hand, use our knowledge of language to help parse
the acoustic signal into comprehendible units such as words.
Another problem for speech recognition systems is the tremendous amount of
variability in the speech signal. Ask one person to say "ba" five times, and these
five simple sounds will all be slightly different (e.g., in terms of how long
before the vocal folds vibrate after the initial release of the sound -- the initial
opening of the vocal tract at the region of the lips). When these sounds are
produced in context, they are even more variable. The "ba" in "back," for
example, is slightly different from the "ba" in "bag."
There is even more variability from speaker to speaker. An examination of a
physical representation of different English vowel sounds spoken by several
native English speakers reveals a tremendous amount of overlap (Peterson and
Barney, 1952). In many cases, it is only context that allows us to differentiate
one from the other. This type of variability increases further if we include
non-native English speakers. Being a non-native speaker affects not only how we
produce speech sounds but also how we hear them. Unless we are exposed to
the subtleties of the speech sounds as youngsters, we do not develop the
capability to use the cues to the differences between these sounds in a speech
context. The most famous example of this is the ra/la distinction. This
distinction is used in German and English, for example, but not in many Eastern
languages including Japanese. To native Japanese speakers, who learned English
from other native Japanese speakers, "ra" is the same as "la" and "la" is "ra."
They cannot distinguish one from the other even though they can distinguish
the acoustic cues that differentiate these sounds for native English speakers
when they are presented outside of a speech context (Miyawaki et al., 1975).
There are several other factors that influence our reception of speech sounds.
One obvious one is the signal-to-noise ratio. In a noisy environment, some of
the critical speech information can be masked. Generally, as the noise level
increases, intelligibility decreases markedly. Specifically, the sounds that will be
masked are the sounds of the same or nearby frequencies that exist in the
ambient noise. Two other factors that add to the effect of
noise are the rate of speech and the age of the listener. When a person speaks
quickly in a noisy environment, much more information is lost than when a
person speaks quickly in a quiet environment or speaks slowly in a noisy
environment. The effects of age on speech perception are two-fold. First, there
is a loss of sensitivity, particularly to higher frequencies, that makes it more
difficult to hear certain speech sounds. There is also a more subtle and intricate
loss in sensitivity. After about age 50 we see a spreading in the widths of
critical bands. This further compromises our ability to differentiate the speech
signal from ambient noise. One result is that it is difficult to hear casual
conversation at a noisy gathering. What do we do when we miss a word or part
of a word? Based on context and our knowledge of language, we fill in the
blanks - and we do so with utmost confidence. Studies have shown that if part
of a word in a sentence is replaced with a noise, such as a cough or tone, the
listeners fill in the missing syllables when they are asked to repeat what they
heard. They are not able to locate the noise in time, even though they expect
the noise somewhere in the sentence (Warren, 1970; Warren and Obusek,
1971).

To add to the problems of a 55-year-old (e.g., pilot) in a noisy environment
(e.g., cockpit) trying to attend to a fast-talking speaker (e.g., controller), devices
that transmit speech sounds, such as telephones, radios and headphones,
selectively attenuate certain frequencies. The best earphones transmit everything
from 25 to 15,000 Hz. These earphones wouldn't be very useful in the cockpit
because the radios don't come close to this level of fidelity. Some of the
frequencies that are lost (usually above 3000 Hz) are likely to contain some
speech information since these frequencies are within the speech domain.
While many factors (e.g., age, noise, and transmitting devices) can degrade our
ability to understand speech, there are very few factors that can destroy it. One
thing that can, however, is delayed auditory feedback, more commonly referred
to as an echo. It is very disruptive for a speaker to have to listen to his or her
own speech slightly delayed. Similarly, if we present speech in one ear and the
same speech slightly delayed (beyond 30-40 msec) in the other, it makes the
listener distressed and unable to understand the message. Delays below 30
msec aren't as disruptive to comprehension, but are annoying and distracting.
A study conducted with air traffic controllers showed that even a 5 msec delay
can be annoying (Nadler, Mengert, Sussman, Grossberg, Salomon, and Walker,
unpublished manuscript). Fortunately, this is an artificial situation (i.e., induced
by equipment) that can usually be avoided.
It is almost amazing that we understand speech as well as we do. The speech
signal is incredibly complex and often embedded in noise. Yet, under most
circumstances, the system works very well and failures to comprehend spoken
messages are the exception rather than the rule. Unless the workload and stress
levels are terribly high and/or the environment is excessively noisy, we usually
do OK. Armed with our knowledge of language and aided by context, we are
able to decipher the signal and understand the message. And then, sometimes,
we just fill in the blanks.

Memory
Memory is a key component in our information processing system. Simple
recognition requires that the pattern in front of us match a pattern in memory
and most complex problem solving requires applying information stored in
memory to the task at hand.

The

Stm

Scientists usually think of memory as three different memory structures:
sensory, short-term (also called working), and long-term. Table 5.1 summarizes
the key characteristics of these three structures. A sensory memory structure
probably exists for each of the five senses. These five sensory modalities take in
information automatically; there is no way to avoid it. If you open your eyes,
information comes in. Unless you plug your ears, auditory information enters.
This information that enters sensory memory automatically cannot be
maintained intentionally. You can only look again or listen again to the same
message. Otherwise, the scene or message is gone in a short time, from
five-tenths of a second to two seconds. For some auditory information, sensory
memory has been demonstrated to last about a quarter of a second, which is the
length of most syllables. Our capacity for sensory storage is very large. The
information is held in an unprocessed mode. The meaning of a word, for
example, is not yet accessed. The information must proceed to short-term
memory with the aid of pattern recognition procedures for further processing.
The sensory store takes in a lot of information but holds it for such a short
time that only a small portion of this information can be recognized and
transferred to short-term memory, and thus made available for further conscious
processing. The rest of the information is lost and this loss usually goes
unnoticed.

Table 5.1
Memory Structures

FEATURES               SENSORY        (WORKING) SHORT-TERM       LONG-TERM
Information input      Automatic      Requires attention         Rehearsal; higher-order processing
Information duration   0.5 to 2 sec   20-30 sec                  Decades
Information capacity   Large          7 plus or minus 2 items    No known limit

Sperling (1960) conducted a series of experiments that demonstrate the
capacity of this sensory store. In one experiment, he showed a card with a
four-by-three matrix of letters and numbers (Figure 5.3) to subjects for
50 msec. When subjects were asked to recall all the letters and numbers they
saw, they remembered seeing twelve but could only name three or four. Were
more items available in sensory store but lost before they could be reported?
To investigate this, Sperling showed the same type of matrix for the same
amount of time, but this time, he also played a high-, medium-, or low-pitched
tone. If the tone was high-pitched, then the subject was to report the top row.
If the tone was medium-pitched, then the subject was to report the middle row.
If the tone was low-pitched, then the subject was to report the bottom row. The
tone was played immediately after the display disappeared. The subjects were
asked to report only the letters and numbers that had appeared in that row. In
order for subjects to report all four items in the row correctly, the full array
of twelve items would have to be available in sensory memory when the tone
sounded. This was, in fact, the case. The subjects were able to recall all four
letters and numbers, no matter which row was cued. Without the cue,
however, most of the items were "lost" before they could be reported.
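The reasoning behind the cued-row procedure can be written out as simple arithmetic. Because the subject does not know which row will be cued until after the display is gone, the proportion of the cued row that is reported can be taken as the proportion of the whole array that was available when the tone sounded. The following is a minimal sketch of that calculation; the function name and its parameters are illustrative assumptions, not part of the original experiments.

```python
def estimate_available_items(reported_in_cued_row, row_size, total_items):
    """Estimate how many items were available in sensory memory.

    Hypothetical helper: scales the proportion of the cued row that was
    reported up to the size of the whole array.
    """
    if row_size <= 0 or total_items <= 0:
        raise ValueError("row_size and total_items must be positive")
    if not 0 <= reported_in_cued_row <= row_size:
        raise ValueError("reported_in_cued_row must be between 0 and row_size")
    proportion_reported = reported_in_cued_row / row_size
    return proportion_reported * total_items


# Values taken from the description above: rows of four items, twelve in all.
whole_report = 4  # upper end of the "three or four" items named without a cue
cued_estimate = estimate_available_items(reported_in_cued_row=4,
                                         row_size=4,
                                         total_items=12)
print(whole_report, cued_estimate)
```

With all four items of any cued row reported, the estimate is the full array of twelve, compared with the three or four items named when no cue is given.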
Sensory memory has also been demonstrated in the auditory domain. With
the use of earphones, we can present letters or digits that appear to come
from three different places. For example, in the right ear, we present "1, 2,
3" and simultaneously present, i.e., superimpose "4, 5, 6". In the left ear, we
present "7, 8, 9" with the same "4, 5, 6" superimposed. What the subject
"hears" is "1, 2, 3" in the right ear, "7, 8, 9" in the left ear, and "4, 5, 6" in
the center of the head. If one of these locations is randomly cued after
presentation, recall for the numbers presented there is nearly perfect.
Without the cue, only three or four of the numbers can be recalled. In
sensory memory, much visual and auditory information is stored but lost
quickly. A small proportion of the stored information is transferred to
short-term memory for further processing.

[Figure 5.3. The stimulus card: example of a four-by-three matrix of letters and numbers shown to subjects to illustrate sensory store capacity of short-term memory. (original figure)]

Short-Term Memory

Our second memory structure is working or short-term memory (STM). We
can think of the information stored in short-term memory as what is
immediately available in consciousness. It is what we are thinking about at
the time. Maintenance of information in short-term memory requires
attention. That is, if you want to keep information available here you must
focus on it or use it in some way. How many times has someone introduced
you to someone and one minute later you can't recall the name? You heard
the name clearly, but you didn't expend any cognitive effort to process the
information. Unlike the information in your sensory store, you can keep the
information in short-term store by rehearsing it, that is, repeating it.
Without rehearsal, the information will be available for only 20 to 30
seconds. Even with rehearsal, the information in short-term memory is
fragile. If someone tells you a phone number, repeating it will keep it
available on your way to the phone. If someone approaches you while
you're rehearsing and asks you the time, your response of 3:45 could
displace the phone number from STM.
The information in STM is very susceptible to interference. The more similar
the interfering information is to the information in STM, the stronger the
interference will be. For example, numbers can displace other numbers more
easily than names can displace numbers.

The capacity for storage in short-term memory is relatively small: seven
items, plus or minus two (five to nine items). The "items" can be digits in a
phone number, for example, or they can be packages or chunks of
information. If someone read a string of letters such as "F, C, J, M, U, B, I,
F, T, H, F, V, K, A, I", then asked you to recall them, you would probably
be able to recall between seven and ten of them. If the same letters were
read in logical groupings, such as "FBI, CIA, JFK, MTV, UHF", you would
probably be able to recall all of them. Fifteen items have been "chunked" or
grouped into a meaningful set of five. Similarly, "2, 0, 6, 3, 8, 4, 5, 7, 9, 1"
will be easier to recall as "206, 384, 5791", particularly if it is a familiar
phone number. These two examples illustrate two points. First, the capacity
of short-term memory is increased when the information is organized.
Second, if the information to be stored in STM is familiar, that is, already
exists in long-term memory, then it will be easier to maintain in short-term
memory. It should be noted, however, that the definition of a "chunk" of
information can be arbitrary. For example, whether a radio frequency can
be considered as one chunk of information or as four separate pieces of
information is debatable (and should probably depend on whether or not
the frequency is a familiar one).
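
The letter example can be checked with a few lines of code. The sketch below simply counts items before and after grouping and compares each count with the five-to-nine-item span given above; the function names and the use of the span limits as hard cut-offs are simplifying assumptions made for illustration only.

```python
# Letters as read one at a time (from the example in the text).
letters = ["F", "C", "J", "M", "U", "B", "I", "F", "T", "H",
           "F", "V", "K", "A", "I"]

# The same letters read in familiar groupings.
chunks = ["FBI", "CIA", "JFK", "MTV", "UHF"]

# "Seven plus or minus two" items, treated here as fixed limits (an assumption).
STM_SPAN_LOW, STM_SPAN_HIGH = 5, 9


def same_content(single_letters, grouped):
    """Return True if the groupings contain exactly the same letters."""
    return sorted(single_letters) == sorted("".join(grouped))


def within_span(items, limit):
    """Return True if the number of items does not exceed the given limit."""
    return len(items) <= limit


print(same_content(letters, chunks))        # same fifteen letters either way
print(len(letters), within_span(letters, STM_SPAN_HIGH))
print(len(chunks), within_span(chunks, STM_SPAN_LOW))
```

Fifteen separate letters exceed even the upper end of the span, while the five familiar groupings fit within its lower end, although the underlying letters are identical.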


Long-Term Memory

Long-term memory is the "warehouse" of information stored up over a
lifetime. There is no known limit to the amount of information we are able
to store in long-term memory or to the length of time we are able to store
this information. Information is rarely lost from memory, but it is frequently
more difficult to retrieve than we would like. Often, we know we are very
close before we successfully access the information. For example, in trying
to recall a name, we may be able to recall what letter it begins with, the
number of syllables, or what the name "sounds like" but the actual name
escapes us. This is called the "tip-of-the-tongue" phenomenon. The
information is in long-term memory but we can't recall it into short-term
memory at that moment. Eventually, we are usually able to reconstruct the
name from the descriptive information that we can retrieve.
Much of memory is reconstructive. Information may be available even though
it isn't encoded in the same form as the information for which you're
searching; it may have to be derived. For example, the number of rooms in
the house that you lived in when you were five years old is not something
you consciously stored. It is also not something that you can recall quickly.
However, you probably are able to recall an image of the house and "walk
through" and count the rooms.
Memory is also constructive in the sense that we not only store information
that is given directly to us, but we also store whatever can be derived
from that information. Bransford, Barclay, and Franks (1972) read many
sentences to subjects in their experiments and later asked them if the test
sentences were ones they had heard before. They found that subjects could
not distinguish between the sentences they heard and ones that could be
logically inferred from the ones they heard. It was the processed meaning of
the sentences, not the specific words, that was stored in long-term memory.
There is some physiological evidence for the existence of short- and
long-term memories as separate, distinct structures in the brain. The following
case illustrates this point. H.M. incurred brain damage as the result of
surgery performed to treat severe epilepsy. Because of this damage to the
temporal lobes, H.M. was unable to transfer information from short-term into
long-term memory. The information stored in long-term memory before the
surgery remained intact
and could easily be recalled. This, along with his functioning short-term
memory enabled H.M. to carry on normal conversations with his doctor and
others. Without the ability to transfer the information to long-term memory,
however, the conversations were forgotten. If the doctor left the room and
returned minutes later, there was no evidence that H.M. had any memory of
the conversation that took place just minutes before.
With diseases such as Alzheimer's, there is also evidence of a separation of
short- and long-term memory. In the beginning stage of the disease,
transferring information from short-term memory into long-term memory is
problematic. Later, long-term memory degenerates and eventually the disease
invades so deep into the memory system that even language can be
forgotten.
In the absence of brain disease or damage, there are things we can do to
help store information effectively in long-term memory. If the material to be
learned can be organized around existing knowledge structures, (i.e.,
information already known), then it will be more efficiently stored and,
thus, easier to recall. It is easier to learn more about something you already
know than to learn the same amount of material about something totally
foreign or to learn it as isolated facts. Cognitive effort can also help to store
information in long-term memory. This effort can be intentional or
incidental. We can study to memorize facts (intentional) or we can use
information so often, e.g., a phone number, that we learn it whether or not
we intend to do so. On the other hand, information that we would like to
keep easily accessible (such as memory items on a checklist) may not be
readily available without regular review. Our memory for important,
complex information that is not used regularly but does need to be quickly
accessed requires periodic maintenance - particularly if this information is
expected to be recalled in stressful situations.


Chapter 6
Display Compatibility and Attention
by Christopher D. Wickens, Ph.D., University of Illinois

Display Compatibility
As we follow the sequence through which information is processed by the pilot,
the first critical stage is that of perception, that is, interpreting or understanding
displayed information. However, there are features in display design that can
allow this interpretation to proceed automatically and correctly or, alternatively,
to require more effort with the possibility of error. This is the issue of the
compatibility between displayed information (stimulus) and its cognitive
interpretation. Based on that understanding, a response is triggered.
Compatibility generally refers to the relationship between a display's
representation, the way in which the display's meaning is interpreted, and the
way in which the response is carried out. S-C compatibility refers to the
relationship between how a stimulus changes on a display, and how it is to be
cognitively interpreted. S-R compatibility refers to the relation between displayed
stimulus change and the appropriate response. The important design issue in
S-C compatibility, which we shall now consider, is whether the change in a
display state naturally fosters the correct cognitive interpretation. We provide
several examples below.
Color is one important component of display compatibility. When a display
changes color, does the color on that display immediately give the correct
interpretation to the pilot of what that color is supposed to mean? The meaning
of certain colors is related to population stereotypes, which must be kept in mind
by designers. A designer might think "I have a meaning I want to convey, what
color should I use to convey that meaning?" This is really working backwards
because it does not address other population stereotypes a color might have.
What the designer really wants to do is say: "When a color appears on a
display, what will the pilot automatically interpret it to mean?" The problem
occurs when colors have multiple stereotypes, and so the pilot may instinctively
interpret one that is different from what the designer intended. Red has a
stereotype of both "danger" and "stop" or "retard speed." Now a pilot sees red,
in the context of airspeed control. Does it mean "slow down" or does it mean
that "airspeed is already too slow and there is danger of a stall?" Possible
conflicts of color stereotypes must be carefully thought through by the designer,
to make sure that a given color has an association that can't possibly confuse or
be confused and trigger the incorrect interpretation.
The second component of display compatibility is the spatial interpretation of
display orientation and movement. This relates to the movement of a display
and how a pilot interprets what that movement signals. Roscoe (1968) cited
two principles that define display compatibility. The first is the principle of
pictorial realism. The spatial layout of a display, that is, the picture of a display,
should be an analogical representation of the information it is supposed to
represent. The second principle that helps define display compatibility is the
principle of the moving part. The moving element of a display should move in
the same orientation and direction as the pilot's mental model of systems
moving in the real world.
A good way to illustrate these two principles is with examples of hypothetical
airspeed indicator designs as shown in Figure 6.1. These are not necessarily the
ideal ways of designing an airspeed indicator, but they either confirm or violate
the principle of pictorial realism or the principle of the moving part. A pilot's
mental model of airspeed is something with a "high" and "low" value.
Therefore, according to the principle of pictorial realism, a vertical
representation is more compatible than the circular one as shown in design (d).

[Figure 6.1. Display designs (a) through (d) illustrating the principles of pictorial realism and of the moving part; the panels include a moving pointer and a moving scale. (from Wickens)]

Also, our mental model has high airspeed at the top and low airspeed at the
bottom. So a fixed scale moving pointer indicator with the high airspeed
represented at the top, as shown in design (a), is compatible with the principle
of pictorial realism, whereas a moving scale indicator with the high airspeed
represented at the bottom violates that principle (d).
Consider the display of altitude as another example. There has been a good
deal of research suggesting that pilots think of the aircraft as the moving
element through the stable airspace, not as the stable element in a moving
airspace (Johnson and Roscoe, 1969). So when an aircraft gains altitude, it is
compatible with both the principle of the moving part and the principle of
pictorial realism for the moving part of the display to move upwards, and when
the plane descends, for the moving part to move downwards. This is exactly
what we get with a fixed scale moving pointer display (a). You have high
altitudes at the top, low altitudes at the bottom, and your moving pointer is in
a direction of motion that is compatible with the pilot's mental representation
of what is happening in the environment. That is, it conforms to the principle
of the moving part. With a fixed pointer moving tape display, there are two
possible design orientations. The situation in design (b) has the high altitude at
the top of the tape and the low altitude at the bottom; again, conforming to
the principle of pictorial realism. But an increase in altitude is signaled by a
downward movement on the display--a violation of the principle of the moving
part. The alternative is to present the low altitude at the top of the tape
and the high altitude at the bottom. In that case, when the plane climbs, the tape
moves upwards, and you've satisfied the principle of the moving part but
violated the principle of pictorial realism. This is one of those cases of
competing principles.
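The trade-off among these vertical designs can be laid out as a small truth table. The sketch below encodes the two principles as described above for a vertical scale: pictorial realism holds when high values are at the top, and the moving part principle holds when the moving element travels upward as the value increases. The class and function names are hypothetical and serve only to make the logic explicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerticalDisplay:
    moving_element: str  # "pointer" (fixed scale) or "tape" (fixed pointer)
    high_at_top: bool    # True if high values are printed at the top


def satisfies_pictorial_realism(display):
    """High values at the top match the mental model of 'high' and 'low'."""
    return display.high_at_top


def satisfies_moving_part(display):
    """True if the moving element travels upward when the value increases."""
    if display.moving_element == "pointer":
        # The pointer travels toward the higher numbers on the fixed scale.
        return display.high_at_top
    if display.moving_element == "tape":
        # The tape slides so that higher numbers arrive at the fixed pointer,
        # so it travels away from the end where the high numbers are printed.
        return not display.high_at_top
    raise ValueError("moving_element must be 'pointer' or 'tape'")


designs = {
    "fixed scale, moving pointer, high at top": VerticalDisplay("pointer", True),
    "moving tape, high at top": VerticalDisplay("tape", True),
    "moving tape, high at bottom": VerticalDisplay("tape", False),
}

for name, design in designs.items():
    print(name,
          satisfies_pictorial_realism(design),
          satisfies_moving_part(design))
```

Only the fixed scale with a moving pointer satisfies both principles; each moving tape orientation satisfies one and violates the other, which is the competition between principles described above.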
While it would seem therefore that the fixed scale display is ideal because it
conforms to both Roscoe's principles, it turns out that even this is not
necessarily the ideal solution because of a problem with scale resolution. For
variables like altitude, you can't print the whole scale unless it is printed so
small it is nearly impossible to read. That is the nice thing about moving scale
displays. They can accommodate a much longer scale because they are not
constrained by space. A compromise solution which could be adopted here is
called "frequency separation," in which the pointer moves rapidly across a fixed,
partially exposed scale to reflect high frequency changes. But lower frequency,
longer duration changes that will require exposing a different scale range are
accomplished by moving the scale.
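One way to picture frequency separation is to split the displayed value into a slowly changing part that positions the scale and a rapidly changing remainder that positions the pointer. The sketch below does this with a simple first-order smoothing filter. The filter, the smoothing constant `alpha`, and the function name are assumptions chosen for illustration; the text does not specify how any actual display divides the signal.

```python
def frequency_separation(samples, alpha=0.1):
    """Split each sample into a scale position and a pointer offset.

    The scale center follows a smoothed (low frequency) version of the
    signal; the pointer offset is whatever remains (high frequency).
    This is an illustrative assumption, not a documented display algorithm.
    """
    if not samples:
        return []
    scale_center = samples[0]
    result = []
    for value in samples:
        # Move the scale a fraction of the way toward the current value.
        scale_center += alpha * (value - scale_center)
        # The pointer shows the part of the change the scale has not absorbed.
        pointer_offset = value - scale_center
        result.append((scale_center, pointer_offset))
    return result


# A sudden 100-unit change followed by a steady value.
step = [100.0] + [200.0] * 5
for scale_center, pointer_offset in frequency_separation(step):
    print(round(scale_center, 1), round(pointer_offset, 1))
```

In this sketch the pointer jumps immediately when the value changes, and the scale then drifts toward the new value while the pointer offset shrinks back toward zero.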

Attention
Attention may be characterized as a limited capacity available to process a lot
of information. Our discussion of attention here will lead in two directions:
discussing the principles of multi-element display design, and the use of
head-up displays. Then in Chapter 11, we shall discuss the issue of dividing
attention when trying to perform several tasks at once, and measuring the
attention demands of tasks: the issue of pilot workload. The issue of attention
can really be divided into three different aspects of human abilities. One aspect
is focused attention--how easily we can focus on one source of information and
ignore the distraction of other information. Successful focus is the opposite of
distraction. Another aspect of attention is divided attention--how easily we can
divide attention between two activities and do two things at once, or process
two display channels at once. These activities could involve the pilot flying at
the same time he or she is communicating, or perceiving vertical velocity at the
same time that heading is perceived. Finally, we have the aspect called selective
attention, and this describes how easily and how carefully the pilot selects
particular channels of information to be processed at the right time (e.g., is the
pilot sampling an instrument when he should be looking outside, or attending
to data entry on the FMC when he should be attending to airspeed control).
Focused Attention
A discussion of focused attention and distraction leads to consideration of the
electronic display issue. One of the things that we know from basic psychology
is that all information that falls in about one degree of visual angle is going to
get processed whether you want it processed or not. We know in aviation
displays that clutter is going to be an inevitable consequence of putting more
information in a smaller and smaller space. This will be important in the
discussion of head-up displays to follow. The issue now is to minimize the
confusion caused by clutter, and images that are too close together in the visual
field. How can we increase the pilot's ability to focus attention on one
displayed item and ignore other things that may not be relevant? We are
finding in research that color is an extremely useful technique for segregating
different sources of information. Coloring all of one type of information in one
color and different information in a different color can allow us to focus in on,
say, all of the information that is of one type, and ignore the information that
is of the other, even if they are in the same spatial location.
With auditory messages too, the issue of confusion and distraction is relevant.
How do we allow the pilot to focus attention on one auditory channel of
information (say a synthesized voice message from the cockpit), while filtering
out conversation from the copilot or controller, so that the latter will not get
confused with the cockpit alert? The answer here is again in terms of physical
differences, in this case making messages sound as different from each other as
possible, perhaps by purposefully making computer-driven messages sound
artificial.

Divided Attention
When we consider divided attention, particularly attention divided between
different aspects of a display, designers are interested in creating for the pilot a
sense that two (or more) parts of a display that are to be related can be
perceived at the same time. This objective can sometimes be achieved by
bringing them close together in space. This, of course, is the principle
underlying the development of the head-up display. Also, any sort of static
display ought to have the labels of an indicator very close to the indicator's
actual moving part. In fact, the analysis of the USS Vincennes incident when
the Navy ship shot down the Iranian airliner, revealed that the label on the
Navy's radar system that indicated whether the altitude was increasing or
decreasing was considerably separated from the actual indicator of XY position
itself. So the separation of these two pieces of information may, in part, have
caused the controllers on the radar display to misinterpret what that altitude
trend information was showing, assuming that it represented a descending
attacking fighter, rather than a climbing neutral airliner.
Of course, spatial closeness can be overdone. As we noted above, too much
closeness can create display clutter and thereby be counterproductive. Thus,
relative closeness between related display channels is probably more important
than absolute closeness.
In addition to spatial closeness, it is also possible to use a common color to
bring together in the mind two things that may be spatially separated, and
make it easier to divide attention between them. As we note in the next
chapter, for example, it may be useful to use a common color to show the
relationship between a display and its associated control, when these are not
colocated; or, in an air traffic status display, to code all aircraft with similar
characteristics (e.g., common altitude) with the same color. Because color can
be processed in parallel with other features of a display, it is often useful to use
the color coding of an object to facilitate divided attention.
A third display feature that can improve the ability to divide attention between
two indicators is to present them as two dimensions of a single object. Perhaps
the best example of this is the attitude display indicator (ADI) that represents
two independent dimensions of flight control. It represents both pitch and roll
as the vertical location and the angle of the horizon. That design feature greatly
improves the ability to divide attention between those two critical pieces of
flight information for integrated lateral and vertical flight control.
Another important way of designing displays to facilitate parallel processing is
through the creation of emergent features. These are perceptual characteristics of
a set of displays that are not the property of any single display. A good
example of an emergent feature is the imagined horizontal line, that connects
the tops of four vertical column engine indicators on a four-engine aircraft,
when all engines are running at the same level as shown in Figure 6.2.

Figure 6.2. Vertical column engine indicators for a four-engine aircraft
(vertical axis: engine power; horizontal axis: engine number).

Two other characteristics that will improve the ability to process information in
parallel will be discussed in more detail in our later section on workload. These
are the automaticity with which information is perceived (the more
automatically we process one symbol or piece of information, the better we can
do so in parallel with other display processing), and the use of separate
modalities of information display (i.e., auditory and visual channels).

The pilot's ability to select information that is needed on the display at the
appropriate time can be improved by three factors. First, and most obviously,
training can improve selective attention. There is reasonably good evidence that
pilots' scan patterns (good indices of what is being attended when) change as a
function of their skill level, indicating an evolution of selective attention ability.
Second, display organization provides a good way of enabling the pilot to find
(look at) the information needed at the right time. One can contrast the more
organized display in Figure 6.3a, with the less organized one in Figure 6.3b, to
see the difference. However, it is important that the physical organization of the
display be compatible with the mental organization that defines the pilot's
information needs. That is, displays that are clustered or grouped together
would be those that are also used together.

Figure 6.3. (a) Example of good display organization. (b) Example of poor
display organization.

Display consistency is a third variable that affects the pilot's ability to selectively
attend to the right sources of information at the right time. Where possible,
similar types of displays should be located at similar places, across different
viewing opportunities. This applies both for display locations across different
cockpits, and for multifunction displays across different pages that may contain
similar material. Finally, as we described above, display clutter will be a
hindrance to effective selective attention. It is difficult to visually find what you
want on a cluttered display.

The design and use issues of the head-up display highlight many of the issues
of attention discussed in the previous pages. Figure 6.4 shows a sample of a
head-up display (HUD) developed by Flight Dynamics, Incorporated. It is flown
in Alaskan Airlines planes. The HUD was designed primarily to bring visual
channels closer together in space so as to improve the ability to divide attention
between them. Instead of having critical flight instrumentation physically
separate from the outside world, the HUD overlays certain aspects of this
information on the view of the outside world.

Figure 6.4. Sample of head-up display (HUD): FDI HGS symbology, TOGA mode
with windshear alerting and guidance. Labeled elements include wind shear
warning, artificial horizon, speed error tape, flight path acceleration, AOA
limit, wind shear guidance, flight path, baro altitude, airspeed, wind vector
and magnitude, vertical speed, radio altitude, and groundspeed. (Desmond, 1986)

The goal of the head-up display
is twofold. One, as noted above, is to reduce the need for visual scanning
between instruments and the outside world. The second goal is to portray
certain critical pieces of information that conform with the environment so they
can be directly superimposed on that environment. These would include,
certainly, the runway symbol, the horizon line, a flight path representation, and
a symbol of the aircraft's current and predicted position. This conformal
symbology, then, can be interpreted by the pilot as belonging at locations along
his or her line of sight beyond the HUD.
HUD display development and research has a very long history in the military.
There are a number of issues in the military, like flying inverted and getting out
of high-G combat situations, that are less relevant for the design of civil
aircraft. On the other hand, the HUD has recently been introduced and successfully
flown in Alaskan Airlines planes and has had very good reception (Steenblik,
1990). The first category three landing was at Seattle-Tacoma Airport in late
1989. The pilots who have flown with it generally have liked it and have found
that it does a good job of allowing maneuvering and landing in very low
visibility conditions. At the same time, it keeps them actively involved in the
control loop rather than turning over control to automatic landing systems,
thereby maintaining a level of involvement which pilots generally value. Flight
tests with the HUD have been quite successful. Figure 6.5 shows an example of
the "footprints" of landing touchdowns made on a series of category one and
category two landings done in simulations with and without a HUD. It shows
greater touchdown dispersion without the HUD than with it. It also tells us that
there were six go-arounds in the approach without the HUD and no go-arounds
with the HUD. Desmond (1986) reviewed the development of the HUD and its
implementation in the aircraft.
The critical issues in HUD design relate not so much to whether they are a
good thing or bad thing, although some researchers have phrased it that way,
but rather to the appropriate design guidelines to follow, how HUDs can be
improved, and to identification of the potential pitfalls in HUD use (Weintraub
& Ensing, 1992).

In the analysis of HUDs, there are three conceptually different domains. One
domain has to do with the optics of the HUD, that is, how they are collimated,
how the lenses are configured, and where they are located (the visual angle
between the HUD instrumentation and the line of sight out the cockpit toward
the runway during approach). A second is the symbology of the HUD. What
exactly should be placed on the HUD, and in what format? How much of this
should be nonconformal symbology? The third domain concerns the whole issue
of pilot attention in the HUD. How does human attention switch back and forth
between the HUD instrumentation and distant objects in the far environment?
How well can human attention be divided between instrumentation and things
in the far domain? What are the consequences of focusing attention on the near
HUD and ignoring information that is out there in the environment?
In addition to these three issues of HUD research, there are four important
categories of differences between typical HUDs and conventional flight
instruments. First, HUDs are, of course, displaced upwards to overlap the visual
scene. Second, conventional displays are presented at a short optical distance.
HUDs are typically collimated out to near optical infinity. Third, there are
significant differences in the symbology between conventional instruments,
which often, although not necessarily, have an older round dial symbology, and
HUD instrumentations which typically have a much more novel symbology.
Fourth, the different symbologies represent the movement of the airplane
differently. Most conventional instrumentation for presenting guidance
information is based on the relationship of the airplane to the air mass. Some
HUD symbology (e.g., that used by Flight Dynamics), in contrast, may be based
on the inertial guidance of the plane and therefore provides information with
respect to the ground surface. Differences in flight test performance between
HUD and conventional instrumentation could result from any or all of these
differences in design features.

Figure 6.5. Touchdown comparison for CAT I, CAT II, and non-precision
approaches: touchdown dispersions without HGS (46 flights, 6 go-arounds) and
with HGS (51 flights, no go-arounds), plotted against runway distance from the
threshold (markers at 1050, 3000, and 5000 feet). Tests carried out in a Boeing
727 simulator; flights flown by air carrier line pilots. (From Desmond, 1986)

When we view objects up close, the light rays from the object reach the eyeball in
a diverging orientation. They are not parallel. The muscles surrounding the
lens must activate or "refract" to bring that image into focus. For objects more
than five or six meters away, the light rays travel in a roughly parallel
orientation. The lens relaxes its shape and the more distant object is brought
into focus. This change in lens shape we call accommodation. Accommodation is
not instantaneous, which is why we have a difficult time going from viewing
something far away to suddenly reading something up close. This problem with
accommodation increases with age. The goal in the design of the HUD is to
present the information, which is superimposed on the windscreen, so the light
rays travel in parallel to the eyeballs, and so the information is essentially
perceived as being out at a great distance (i.e., at optical infinity). This is all
done by a series of collimating lenses down at the bottom of the HUD that take
the image generated on a CRT and transform it into parallel rays. Hence, the
rays from the far domain of the runway or distant aircraft and the rays from
the near domain of the instrumentation all reach the eye in parallel.
Information from both domains therefore requires very little accommodation at
all.
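The optical argument can be made concrete with the standard relation between viewing distance and accommodative demand, which in diopters is the reciprocal of the distance in meters. The sketch below is illustrative only; the function names are hypothetical and the 0.7 m panel distance is an assumed value, not one given in this report.

```python
import math


def accommodation_demand(distance_m):
    """Accommodative demand in diopters for an object at distance_m meters."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    # Optical infinity needs no accommodation; 1.0 / math.inf evaluates to 0.0.
    return 1.0 / distance_m


def refocus_change(from_m, to_m):
    """Change in demand (diopters) when shifting gaze between two distances."""
    return abs(accommodation_demand(from_m) - accommodation_demand(to_m))


if __name__ == "__main__":
    panel_m = 0.7         # assumed head-down instrument distance
    runway_m = math.inf   # distant scene, treated as optical infinity
    hud_m = math.inf      # imagery collimated to optical infinity
    print("panel to runway:", round(refocus_change(panel_m, runway_m), 2), "D")
    print("HUD to runway:  ", round(refocus_change(hud_m, runway_m), 2), "D")
    print("6 m object:     ", round(accommodation_demand(6.0), 2), "D")
```

On this simple model, an object five or six meters away already demands less than a fifth of a diopter, which is consistent with treating such distances as nearly parallel-ray viewing.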
In making any sort of comparison between the HUD and conventional
instrumentation, one of the issues is the fact that conventional instrumentation
is usually presented at close range while head-up display information is
presented out at optical infinity. There has been some dispute in the human
factors literature regarding whether or not it is appropriate to collimate the HUD
instrumentation out towards optical infinity. The issue is conceptually simple. At
different times, the pilot has to have the eyeball accommodated to two different
distances. On the one hand, he has to look out of the cockpit and focus on the
things that require far accommodation like the runway, distant aircraft, targets
in space, etc. On the other hand, the pilot has to spend time looking at close
things, particularly airport approach plates and maps in the cockpit. So, a
decision must be made about where to put other aspects of the critical flight
information. Should it be projected in close, where processing is more
compatible with the maps, or projected "out there," where processing is more
compatible with the distant world? The general guideline followed by HUD
designers seems to be that it is more important for the pilot's eye to be well
accommodated to the distant features. Therefore imagery is either collimated
out to optical infinity or a little less than that, but still fairly far out, which
keeps the light rays almost parallel.
Despite the decision which has been made for pilots to view HUD
instrumentation at optical infinity, there isn't a lot of data to suggest how pilots
really do accommodate back and forth between the "near" and "far" domains.
One of the few studies in this area to date has been done by Weintraub,
Haines, and Randall (1985). They used a static test in which they examined the
pilot's ability to switch between near information and far information. The near
information was a digital altitude and air speed display on a HUD. The far
information was the presence of an X at the end of a runway, which would
signal that the runway was closed (Figure 6.6). The experiment would present
the HUD information, then would suddenly present the runway information,
and determine how long it took the pilot to confirm appropriate altitude and
airspeed, and then make the decision about whether the runway was open or
closed. Essentially they were asking the pilot to switch attention from the near
domain, (the air speed and altitude), to the far domain, and then make a
response of whether there was an X present or not. In one condition of their
experiment, the instrumentation was presented head down, and optically close.
Therefore the pilots not only had to switch attention from the near to the far,
but they had to accommodate from the HUD to the distant runway.
Figure 6.6 shows the results from this condition. The solid line represents the
state of accommodation, changing from the near to the far symbology. This is
called the accommodative response. The important point to note in this figure is
that the time to make this decision is influenced partially by how far they have
to accommodate, but also they can make the response well before they have
completely reaccommodated to the greater distance. This finding suggests that
you don't need to have perfect visual information in the far domain before you
are able to process it and use it. Nevertheless, this was the first experiment that
really documented a major cost to reaccommodate, and that cost showed up in
performance. It is an experiment that strongly suggests the importance of
keeping that imagery close to optical infinity rather than close in.

Figure 6.6. Example of HUD used in experiment and graph showing results of
test of pilot's ability to switch from near to far information. Graph labels:
static HUD, switch near to far, static runway, RT, accommodative response
(near, far), runway stimulus, response, time. (adapted from Weintraub, Haines
& Randall, 1984)

Weintraub, Haines and Randall also varied the visual angle between the HUD
and the runway information. They compared two conditions. In both conditions,
the HUD imagery was collimated to optical infinity. In one condition, the HUD
imagery was overlapping the runway and "head-up." In the other condition, it
was not overlapping the runway and "head-down." In the head-down condition,
the imagery was still optically far, but was no longer superimposed on the
runway. Instead, it was positioned at the same location as the true conventional
instrumentation. So to get information from the runway and HUD in the
head-down condition, the pilot still had to visually scan up and down, but didn't
have to reaccommodate. The investigators found almost no difference in
performance between the head-up and head-down conditions in terms of the
ability with which judgments could be made. These results suggest that the
advantages of the head-up display may lie more in its symbology and in the
reduced need to reaccommodate than in the fact that there is overlapping
imagery.

In addition to the physical and optical placement issues, there are a set of other
physical characteristics of the HUD that are worth noting. Many of these are
taken from a series of guidelines presented by Richard Newman, who did a
fairly extensive review for the Air Force, and whose findings are applicable to
civil aviation as well (Newman, 1985). One of the guidelines concerns the eye
reference point. It turns out that in viewing a HUD, the imagery changes and the
ability to interpret it changes a little bit, depending on where the eye is
positioned relative to the HUD. Newman argues very strongly that the HUD
positioning should be adjustable to allow different seating postures, so it could
be moved when the pilot is scrunched forward or sitting back. A second issue
concerns the field of view. That is, how much of the outside world should the
HUD incorporate? A lot of technological effort has been put into designing
HUDs that can present a wide field of view. One of the guidelines is that the
field of view should be at least wide enough so that when you are landing into
a crosswind with a very substantial crab angle, the runway is still visible on the
HUD, even as the aircraft is crabbed maximally into the wind. This difference
between aircraft heading and velocity vector indicates how wide the field of
view should be on the HUD.
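To give a feel for the numbers, the following sketch estimates the crab angle from the crosswind component and true airspeed, and compares it with half of a HUD's horizontal field of view. The wind-triangle relation is standard, but the function names, the example speeds, and the field-of-view figures are assumptions for illustration rather than values from Newman's guidelines.

```python
import math


def crab_angle_deg(crosswind_kt, true_airspeed_kt):
    """Crab angle (degrees) needed to hold track against a crosswind component."""
    if true_airspeed_kt <= 0:
        raise ValueError("true airspeed must be positive")
    ratio = crosswind_kt / true_airspeed_kt
    if abs(ratio) > 1.0:
        raise ValueError("crosswind exceeds airspeed; track cannot be held")
    return math.degrees(math.asin(ratio))


def runway_in_view(crosswind_kt, true_airspeed_kt, horizontal_fov_deg):
    """True if a runway lying along the ground track stays inside the HUD."""
    # The HUD is centered on aircraft heading, so the runway appears offset
    # from center by the crab angle; it is visible if within half the FOV.
    crab = abs(crab_angle_deg(crosswind_kt, true_airspeed_kt))
    return crab <= horizontal_fov_deg / 2.0


if __name__ == "__main__":
    # Hypothetical approach: 30 kt crosswind component at 130 kt true airspeed.
    print(round(crab_angle_deg(30, 130), 1), "degrees of crab")
    for fov in (24, 30):
        print(f"FOV {fov} deg: runway visible = {runway_in_view(30, 130, fov)}")
```

The comparison treats the runway as a point on the ground track; a real assessment would also account for runway width, range, and the vertical field of view.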
Another issue that isn't well-resolved concerns what happens when conformal
symbology on the HUD moves out of the field of view. Suppose a pilot is flying
directly towards the runway, and then changes course so that now the runway
symbol slides off to the side of the HUD. Should it disappear, or just freeze on
the side so the pilot still clearly perceives that it is off to the left or the right,
but now underestimates the magnitude of the deviation?
Another physical characteristic concerns the frequency with which the HUD is
updated. For analog information on the HUD, a guideline is that the variables
should be refreshed at around 10 to 12 hertz, which is sufficient to give good
performance. For digital information, on the other hand, you certainly don't
want updating that fast, because rapidly changing digits tend to be unreadable.
Therefore, something like 3 to 4 hertz is probably appropriate.
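One way to picture the two update rates is a simple rate limiter that accepts every incoming sample but only refreshes the displayed value when enough time has passed. The class below is a minimal sketch under that assumption; the name `RateLimitedReadout`, the 60 Hz sample clock, and the specific rates picked from within the quoted ranges are all hypothetical.

```python
class RateLimitedReadout:
    """Refreshes a displayed value no more often than a chosen update rate."""

    def __init__(self, update_hz):
        if update_hz <= 0:
            raise ValueError("update_hz must be positive")
        self.period_s = 1.0 / update_hz
        self.last_update_s = None
        self.shown_value = None

    def offer(self, time_s, value):
        """Offer a new sample; return True if the displayed value was refreshed."""
        # The small tolerance guards against floating-point rounding of times.
        due = (self.last_update_s is None
               or time_s - self.last_update_s >= self.period_s - 1e-9)
        if due:
            self.last_update_s = time_s
            self.shown_value = value
        return due


if __name__ == "__main__":
    analog = RateLimitedReadout(update_hz=10)   # within the 10 to 12 Hz guideline
    digital = RateLimitedReadout(update_hz=4)   # within the 3 to 4 Hz guideline
    analog_updates = 0
    digital_updates = 0
    for tick in range(60):                      # one second of 60 Hz samples
        t = tick / 60.0
        analog_updates += analog.offer(t, tick)
        digital_updates += digital.offer(t, tick)
    print("analog refreshes:", analog_updates, "digital refreshes:", digital_updates)
```

Holding the last shown value between refreshes is only one option; a digital readout could also show a short average of the samples received since the previous refresh.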

The symbology issue can be broken into two major domains. The first relates to
some of the sensory factors that relate to issues in visual and auditory
perception. For example, what should be the intensity of the HUD imagery?
How bright should it be? What is the necessary intensity to perceive across the
conditions ranging from night viewing, in which you can get by with fairly low
intensity, to incredibly bright snow cover? Is a single fixed intensity adequate,
or should there be automatic or manual intensity control? A related issue
concerns the transmittance. Newman has recommended that no less than 70
percent of the outside world light should be transmitted through the HUD.
Weintraub argues instead that it should really be more like 90 percent
(Weintraub & Ensing, 1992). In fact, the Flight Dynamics HUD used by Alaskan
Airlines has about 90 percent transmittance.
Color is another issue in HUD design. The current HUD designs tend to be
monochrome (green). One of the reasons is that the monochrome display
transmits a lot more light than a color HUD. Color of course has benefits, but
color, as viewed on the HUD, may have some real problems in terms of
interpretation, particularly when several colors are to be used. Under the varied
conditions of illumination in which a HUD may be used, any more than four or
five colors will create a real risk of confusion.
Cognitive issues in the design of HUD symbology are also relevant. The Air
Force has done some good research in terms of the nature of the HUD
symbology and how that can be best interpreted (Weinstein, 1990). The nature
of the pitch ladder is one example. How do you make the pitch ladder as
unambiguous as possible in depicting whether the aircraft is nose-up or nose-down?
Here is where color comes in. One of the problems with the HUD is that
its graphic representation of what is up and down is not as good as the colored
representation on the typical Attitude Display Indicator using blue and brown.

There may be a role for color in HUDs to help make the simple discrimination
of what is above and what is below the horizon.
The use of the inertial guidance system is an important cognitive issue. Its
importance is suggested by the fact that the evaluation of the HUD flown in
Alaskan Airlines revealed that the characteristic that pilots seemed to like most
is the fact the guidance given by the HUD is based on inertial guidance rather
than air mass guidance. In other words, the pilot actually gets a representation
on the HUD instrumentation of where the plane is heading relative to the
ground, rather than relative to the air mass through which it is flying. So this
indicates that possibly the major benefits may be in what the HUD presents
rather than where it is physically presented.
Some issues have to do with the development of flight director displays on
HUDs. These correlate very closely with the same issues of the flight director
for presenting head-down information. What is the appropriate tuning? What
are the appropriate rules to guide the flight director?
One major symbology issue concerns how much information should be on the
HUD. Should a HUD only present the necessary conformal flight information,
the things that are necessary for actual flight path guidance, and, therefore,
conform to (and can be superimposed on) the world outside? Should the HUD
also present different kinds of flight parameter and alerting information, and if
so, how much? As we see below, this impacts the issue of display clutter.
Finally, there is the issue of multimode operations. Some HUD designs present a
lot of information in a relatively small space. If this is viewed as a problem,
then designers often recommend that the pilots be given the option of calling
up alternative forms of information. However, any time the designer creates
multimode situations, you start dealing with problems of menu selection, forcing
the pilot into computer keyboard operation. Such operations have a number of
potential dangers at critical high workload times during the flight, when the
HUDs are likely to be in use.
Attention Issues
The initial goal of the HUD was to resolve the problems of divided attention by
superimposing the two images. Once that decision was made, then there
followed the issue of how to improve the symbology, and the decision to
collimate the images at optical infinity. The real question is whether or not
simply superimposing images of nonconformal symbology does address the
problems of divided attention, or whether it creates the potential for other
problems.
There are three possible attention problems that are created by superimposing
visual images. One is whether or not the resulting clutter disrupts the ability to
focus attention. Are there problems trying to focus attention on the far world
(the runway out there) when there is a large amount of symbology in the near
domain that may be partially obscuring it? Can these problems be addressed by
reducing HUD intensity? The second, related problem concerns
divided attention and confusion. If a pilot is actually trying to process the
far-world information and the near-world symbology simultaneously, is there a
possibility of confusion? For example, when the aircraft moves and the
far-world runway then moves relative to the HUD, could the motion of the runway
be misinterpreted as being part of the movement of analog symbology on the
HUD? The third problem, attentional tunneling or fixation, we now
discuss in some detail, in the context of research at NASA Ames.
One of the few studies that has been conducted with a dynamic head-up display
to examine attentional issues has received a fair amount of publicity, although
it has some methodological problems. It is a study done by Fischer, Haines, and
Price (1980). Ten pilots flew a simulated instrument landing approach. The
HUD was compared with conventional head-down instrumentation (not
collimated). Although most of the landings were normal, on the very last trial,
there was a runway incursion. As the pilot was approaching the simulated
runway, another aircraft pulled onto the runway. The investigators found that,
although the HUD gave better performance under normal landing conditions, a
significant number of pilots failed to notice the plane coming onto the runway
when flying with the HUD. Furthermore, those that did notice the runway
incursion took longer to notice it when they were flying with the HUD than
when they were flying with conventional head-down instrumentation. However,
this finding was not replicated in a more carefully controlled study by Wickens,
Martin-Emerson, and Larish (1993).
The way the NASA investigators interpreted the fixation data was to state that
in flying with conventional instrumentation, there is a very regular scan pattern
required to check the clearance of the runway; but with the HUD, the imagery
may obscure the distant runway, and the scan pattern is disrupted in a way that
doesn't allow the routine and automatic examination of the imagery out in the
far domain. In the evaluation by Steenblik (1989) of the operational use of the
HUD in Alaskan Airlines, some pilots report that in the last few seconds of the
approach, coming into and through the flare, they find the imagery on the HUD
distracting. They have a tendency to tunnel attention exclusively on that
imagery and, therefore, they prefer to turn off the HUD to avoid this tunneling.
Also, earlier evaluations done by NASA indicate a substantial problem with
tunneling in on the HUD instrumentation and potentially ignoring the outside
world. Finally, some research on military applications of the HUD done by
Opatek indicated a lot of problems, at least with early HUD designs, that arose
from them being too cluttered, so that pilots had a tendency to turn them off.
A summary of the attentional issues highlights the following points. First, the
distinction between conformal and nonconformal symbology is critical.
Conformal symbology will not create clutter and clearly is desirable to be
presented head-up, particularly when driven by inertial guidance information.
Nonconformal symbology, whether digital or analog, may lead to clutter and
confusion, and its addition to a HUD, while reducing scanning, should be
considered only with caution. Secondly, attentional tunneling on either
conformal or nonconformal symbology, to the exclusion of attention to the far
domain, is a potentially real problem. Consideration should be given as to how
to "break through" the tunnel (e.g., by turning off the HUD or reducing its
intensity). Third, there is some suggestion that the tunneling problem may be
exacerbated in head-up rather than head-down presentations.
In conclusion, there has been some debate in the aviation psychology literature
regarding whether the HUD is an advancement or a detriment to aviation
safety. One way of addressing this debate is to point to the strong endorsements
provided by pilots who have flown with the current versions. A second way is
to consider what HUD has done. It has pushed the performance envelope of
aircraft into a whole new domain, and clearly in that new domain there are
going to be more chances for risk and accidents, for example, flying lower to
the ground in low to zero visibility. In this sense, it is analogous to headlights
which, by allowing night driving, have placed the driver in a consistently more
dangerous environment (Weintraub & Ensing, 1992).

Chapter 7

Decision Making
by Christopher D. Wickens, Ph.D., University of Illinois
The Decision-Making Process
Figure 7.1 shows a model of information processing. This is similar to the
model presented in Chapter 5, Figure 5.1. In the preceding chapters, the
discussion focused on basic characteristics of the senses, how the eyes and ears
perceive stimuli, and how information from the world around us is perceived or
understood. This chapter deals with the decision-making process that takes
place after the sensory information is perceived.
Figure 7.1 provides a framework for discussing the decision-making process. A
pilot senses a stimulus, for example, the VASI on a runway. That information
becomes an understood piece of knowledge when the pilot recognizes the visual
information based upon past experience which is stored as long-term memory.

Figure 7.1. A model of information processing (from Wickens, 19...). Elements shown: stimuli; senses (eyes, ears); perception; decision and response selection; response execution; responses; attention resources; working memory; long-term memory; feedback.


Once perception is complete, the pilot has a mental representation of the
state of things--a situational awareness. He or she is now able to engage in
decision and response selection. First, a decision about what to do is made. The
decision may be to defer action and hold the information in working memory,
or the decision may be how to carry out the response: vocally, manually, by
foot movement, etc. After a particular response is selected, the pilot executes
the response, that is, carries out some action by coordinated muscular action.
The response execution, of course, changes the environment. The new
environment provides feedback and new stimuli for the senses, and processing
returns to the beginning of the loop shown in Figure 7.1.
Our attentional resources, pictured as a reservoir of limited capacity in Figure
7.1, are critically involved in the decision-making process. Our attention
resources are directly applied to perception, working memory, decision/response
selection, and response execution. Attention has a limited capacity. It allows us
to perceive only so much information at one time, store so much in working
memory at once, make only one decision at a time about which responses to
execute, and execute so many responses at once. Working memory is
particularly subject to the limits of attention resources. Working memory is the
very limited capacity buffer where we store temporary information like
waypoints, radio frequencies, etc., that we have just received and will
immediately forget if we stop rehearsing. Very often, working memory guides
our decisions and responses.
Our discussion will focus on the process of decision making at two levels. First,
we consider pilot judgment--the decisions under uncertainty that pilots carry
out, generally with considerable thought and effort. Then we consider the rapid
and relatively automatic decisions that involve direct selection of an action. This
second class has direct relevance to cockpit design issues, and this will lead us
to a discussion of the transfer between different designs on different aircraft.
Pilot Judgment
When we talk about decision making, we begin with the concept of uncertainty.
Decisions can be made with certainty or with uncertainty. A pilot's decision to
lower the landing gear, for example, is made with certainty. The pilot knows he
or she must lower the landing gear to touch down on the runway and the
consequences of the decision are well known in advance. On the other hand, a
decision to proceed with a flight in bad weather or to carry on with a landing
where the runway is not visible is a decision with uncertainty, because of the
uncertain consequences of the actions. What will happen if the pilot continues
with the flight in bad weather can't be predicted.
A lot of the conclusions in decision making that will be discussed here come
directly from studies and experiments that have not been related to pilot
judgment. There are, of course, databases about aviation accidents and incidents
that attribute a large percentage of these to poor pilot judgment and faulty
decisions (Jensen, 1977; Nagel, 1988). The problem, of course, is going back
after the fact of an accident or incident. It is easy to attribute a particular
disaster to poor judgment when, in fact, there may be, and usually are, a lot of
other causes. Poor judgment may have been only one of a large number of
contributing causes all of which cannot be identified. For this reason, it is
helpful to study judgment and decision making in other fields besides aviation,
like the nuclear power industry, or to draw inferences from some experimental
laboratory research. Much of the information in this section is based upon
conclusions from these other nonaviation areas.
Figure 7.2 (Wickens and Flach, 1988) shows a general model of human
decision making that highlights the information processing components which
are relevant to the decision process. To the left of the figure, we represent the
pilot sampling, processing and integrating a number of cues or sources of
information. If it is a judgment about flying into instrument meteorological
conditions, these cues may be weather reports, direct observation of the
weather, anecdotal reports from other pilots in the air, etc. All of the cues help
the pilot form a situation assessment--what we might call a diagnosis of what
is going on. In making situational assessments, we are often dependent upon
our ability to generate hypotheses about what is going on: hypotheses about
icing conditions, or severe turbulence for example. Or in the case of diagnosing
engine failures, hypotheses about possible failure states of the aircraft. This
diagnosis, in turn, depends on the information available from long-term
memory, the stored results of training we have had in the past about the things
that possibly could go wrong. Having made a temporary situation assessment,
we often follow this up by perceiving and attending to further information. In
other words, we seek out more cues to either support or refute our hypothesis.
So this is very much a closed-loop process. You form a tentative hypothesis.
You go out and get more information, perhaps call for updated weather
information, or do more observation to try to confirm the hypothesis.

Figure 7.2. A model of human decision making (from Wickens & Flach, 1988). Elements shown: perception; attention; working memory; diagnosis; hypothesis generation; criterion; risk assessment; action generation; action; long-term memory. Bias labels: salience bias (S); representativeness heuristic (R); as-if heuristic (As); availability heuristic (Av); confirmation bias (C); framing bias.

Eventually, you reach a point where a choice is required. The choice is between
actions also learned and thereby stored in long-term memory. Do you go
through with the flight? Do you return to an airport? Do you request an
alternate flight path? The choice of an action is sometimes based upon a
criterion setting, that is, how much information is needed before you carry
through with a given action or decision. In aviation, the criterion setting is very
often based upon risk assessment. What is the risk of continuing in bad
weather? What is guiding our choice? What are the consequences of failure?
And then in the final box in the model in Figure 7.2 we perform an action, and
observe its consequences which themselves generate more cues.
Biases in Situation Assessment
The model in Figure 7.2 includes codes (S, R, As, etc.) in small boxes which
represent biases that can cause errors in human decision making. Some of these
biases are also called heuristics: shortcuts or mental "rules of thumb" that
people use to approximate the correct way of making a decision because they
take less mental effort (Kahneman, Slovic, & Tversky, 1982).
Salience Bias
The first of these biases is called a salience bias (S). The salience bias means
that when someone is forming a hypothesis based on a lot of different cues of
perceptual information, he or she tends to pay more attention to the most
salient cue. For example, a pilot may be processing various sources of auditory
information including weather reports, reports from air traffic control and other
pilots, conversation from the first officer, etc., to form a hypothesis. The
salience bias is reflected in the fact that it is often the loudest sound or loudest
voice that has the most influence. Another example of the salience bias occurs
in dealing with a multi-element display. We tend to pay most attention to
information displayed at the center of the display rather than the information at
the bottom. These are physical characteristics of a display that aren't necessarily
related to how important that information is. The brightness of lights creates a
bias: the brighter the light, the more we tend to pay attention to it in making
our situation assessment.
Confirmation Bias
Early in the decision-making process, we form a tentative hypothesis about our
situation and we go back to the environment for more cues. At the tentative
hypothesis stage, we may experience a second form of bias called the
confirmation bias (C). The confirmation bias states that once a tentative
hypothesis is chosen, we tend to seek and find information to confirm that
hypothesis, but we also tend to ignore information that disputes the hypothesis,
information that tells us we are wrong. An example of the confirmation bias at
work is airport misidentification. It seldom happens in commercial aviation, but
rather frequently in private aviation. The pilot simply approaches or lands at the
wrong airport. There is a tendency when the pilot is lost and disoriented to try
to interpret the ground information as consistent with the airport that he is
expecting to approach rather than the airport that he is actually approaching,
particularly at night. The visual world (i.e., the pattern of runway lights or
surrounding features) is distorted in a way that confirms the pilot's
expectations.
The confirmation bias is supported by expectancy. What you expect to see helps
you confirm what you believe your state is. A major concern in private pilot
aviation is continued flight into deteriorating weather. This has been
documented by some research at Ohio State University (Griffin & Rockwell,
1989). Pilots continue to pay attention to information saying the weather is
good if they have initially filed their flight plan under the assumption of good
weather. They ignore the contrary evidence that the weather is deteriorating.
Misdiagnosis of failure is another area where expectancy reinforces the
confirmation bias. We don't have documented aviation examples of this, but in
the nuclear industry there are some very definite situations where operators
have formed a hypothesis about a failure state in the plant, and then have
sought information to confirm that hypothesis and ignored information that says
otherwise. The Three Mile Island disaster can be directly attributed to the effect
of the confirmation bias. The operators had a hypothesis that the water level in
the plant was too high. They continued to process information that confirmed
that, and they ignored much of the other information that indicated, in fact,
that the pressure was dropping, and the radioactive core was about to be
exposed.
Anchoring Heuristic
A heuristic closely related to the confirmation bias is called anchoring. The
anchoring heuristic states that if there are a couple of hypotheses you might
have, you tend to anchor your beliefs to one and ignore information supporting
the other. As new information comes in that supports the other hypothesis, the
one you have not anchored to, you don't give it much credibility. So your
degree of belief in one versus the other hypothesis doesn't change very much.
You are open primarily to the information that confirms the hypothesis to
which you are anchored. Then if you get one piece of information that supports
what you already believe (the hypothesis to which you are anchored), you give that
information a lot more weight. Again, we can use the example of continued
flight into bad weather. If you initially believe the weather is good and that is
your hypothesis, you are more likely to process new information that says that
the weather is indeed good, and ignore information that says it is poor. One
might imagine that different pilots have different beliefs about whether a
particular aircraft is a good aircraft or a bad aircraft, or an aircraft system has
faults or works well. Biased with these beliefs, the pilot is likely to pay a lot of
attention to information that confirms those hypotheses, and ignore information
that doesn't. So if you believe a system works well, you are likely to pay less
attention to incidents where the system fails. You are also less likely to notice
when the system does fail. If you believe the system is faulty, you are going to
be very sensitive to instances in which the system does indeed fail. You may
also assume the system has failed when it is, in fact, operating correctly.
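The anchoring effect described above can be illustrated with a small numerical sketch. It is not taken from the studies cited here: the function names, the starting belief, the strength of each cue, and the discount factor are all illustrative assumptions. The sketch updates belief in a hypothesis one cue at a time and shows what may happen when cues that argue against the anchored hypothesis are given only a fraction of their proper weight.

```python
import math

def update_log_odds(log_odds, likelihood_ratios, discount_against=1.0):
    """Update belief (in log-odds) cue by cue.

    A likelihood ratio above 1 supports the hypothesis; below 1 argues
    against it. discount_against < 1 shrinks the weight given to cues
    that argue against the hypothesis (an assumed model of anchoring).
    """
    for ratio in likelihood_ratios:
        step = math.log(ratio)
        if step < 0:
            step *= discount_against
        log_odds += step
    return log_odds

def to_probability(log_odds):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical starting belief: 70 percent sure the weather is good.
prior = math.log(0.7 / 0.3)
# Hypothetical cues: three argue against good weather, one supports it.
cues = [0.5, 0.5, 2.0, 0.5]

unbiased = to_probability(update_log_odds(prior, cues))
anchored = to_probability(update_log_odds(prior, cues, discount_against=0.25))

print(f"unbiased belief that weather is good: {unbiased:.2f}")
print(f"anchored belief that weather is good: {anchored:.2f}")
```

With these assumed numbers, the unbiased observer ends up doubting the good-weather hypothesis, while the anchored observer finishes slightly more confident in it than at the start, even though most of the cues pointed the other way.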
Base Rate of Probability
One of the fundamental theories of situation assessment is known as Bayes
theorem. Expressed intuitively, Bayes theorem says that whenever you are trying
to evaluate or form a hypothesis about what is going on, your belief in the
most likely state of the world should be based upon an equal consideration of
two things. One is the probability of each state of the world. We call that the
base rate. Independent of what you see, how likely is it that the weather is
going to be bad versus good? Independent of what you see, how likely is it that
your hydraulic system will fail, as opposed to some other failure? In addition to the
base rate, the second thing is the similarity of the actual data (the available
visual or auditory information) to the mental representation of the pattern of
symptoms caused by that particular failure. Do the symptoms you see match the
pattern of symptoms expected for a given failure?
Bayes theorem can be summarized by the following equation:

Belief = (B x Base Rate) + (S x Similarity)


Here's an example. You're viewing a particular state of meteorological
information. You are trying to form one of two hypotheses: the weather is
going to be bad on the route which you are flying or the weather is going to
be good. The hypothesis formation should be based upon the similarity between
the actual weather that you are viewing and the weather conditions when it is
good or bad, and upon the base rate: the probability that the weather indeed
will be bad versus the probability that the weather will indeed be good along
your route. For example, the base rate probability may be the overall actuarial
data that says that at a given location the weather is going to be clear 90
percent of the time, on a given day of the year.
The two elements in Bayes theorem, base rate and similarity, should compensate
for each other. So if you don't have much data on which to base similarity,
(you haven't got a good weather report and maybe you don't have very good
observation of the weather), you should pay most attention to the base rate in
making your forecast. That is, what the overall probability is that there will be
good or bad weather along your route. On the other hand, if you don't have
base rate data (if you don't know what those overall probabilities are), and
you have a lot of weather forecasts and a lot of good observations, you should
pay more attention to the degree of similarity between the hypothesis and the
existing conditions.
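The weighted sum given above is an intuitive summary. In its standard form, Bayes theorem multiplies the base rate of each hypothesis by how well the observed evidence matches that hypothesis, and then normalizes across the hypotheses. The following is a minimal sketch of that calculation for the two weather hypotheses. It uses the 90 percent clear-weather base rate from the example; the function name and the similarity (likelihood) values are assumptions chosen only for illustration.

```python
def posterior_bad_weather(base_rate_bad, p_obs_given_bad, p_obs_given_good):
    """Belief that the weather is bad, given what has been observed.

    base_rate_bad:    probability of bad weather before looking at any data
    p_obs_given_bad:  how likely the observation is if the weather is bad
    p_obs_given_good: how likely the observation is if the weather is good
    """
    base_rate_good = 1.0 - base_rate_bad
    evidence = (p_obs_given_bad * base_rate_bad
                + p_obs_given_good * base_rate_good)
    return p_obs_given_bad * base_rate_bad / evidence

# Clear 90 percent of the time, so the base rate of bad weather is 10 percent.
BASE_RATE_BAD = 0.10

# Poor data: the observation fits both hypotheses equally well (assumed values).
weak_data = posterior_bad_weather(BASE_RATE_BAD, 0.5, 0.5)

# Good data: the observation looks much more like bad weather (assumed values).
strong_data = posterior_bad_weather(BASE_RATE_BAD, 0.9, 0.1)

print(f"belief in bad weather with poor data: {weak_data:.2f}")
print(f"belief in bad weather with good data: {strong_data:.2f}")
```

When the data do not discriminate between the hypotheses, the belief stays at the base rate; when the data strongly resemble bad weather, similarity pulls the belief well away from the base rate. This is the compensation between the two elements described above.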
Availability Heuristic
There are two very important heuristics that we use to approximate the base
rate and the similarity of the data to the hypothesis. These are availability and
representativeness, respectively. We approximate the base rate, how frequent or
how probable a certain condition is, by the availability heuristic. The availability
heuristic leads us to consider a hypothesis most likely if it is most available in
memory. Your estimated base rate of a hypothesis or of a particular risk is
based on how easily you can recall that hypothesis from memory. For example,
suppose you are trying to diagnose a particular failure state in an aircraft. How
probable is it that the failure state exists? There is probably data somewhere
about the likelihood that a given system will fail. There is certainly data in the
nuclear industry about the probability that certain systems will fail and that
data is what you really ought to go on. However, the availability in your
memory is governed heavily by recency, by how fresh the information is in your
mind. So, according to the availability heuristic, if you recently experienced a
certain kind of failure or perhaps you read about it (in an FAA, company, or
other aviation publication) that makes it very available in your memory and,
therefore, that failure will seem highly probable.
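The effect of recency can be sketched numerically. The model below is an assumption made for illustration, not a finding from the literature cited in this chapter: it supposes that each remembered flight is weighted by how recent it is, with the weight halving for every flight further back. The flight history and the decay value are hypothetical.

```python
def true_base_rate(history):
    """Actual proportion of flights on which the failure occurred."""
    return sum(history) / len(history)

def recency_weighted_estimate(history, decay=0.5):
    """Estimated failure rate when recent flights dominate memory.

    history is ordered oldest to newest (1 = failure, 0 = no failure).
    The newest flight has weight 1; each older flight is weighted by
    a further factor of decay (an assumed model of availability).
    """
    newest = len(history) - 1
    weights = [decay ** (newest - i) for i in range(len(history))]
    weighted_failures = sum(w * h for w, h in zip(weights, history))
    return weighted_failures / sum(weights)

# Hypothetical record: nine uneventful flights, then a failure on the latest one.
history = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

actual = true_base_rate(history)
remembered = recency_weighted_estimate(history)

print(f"actual failure rate:     {actual:.2f}")
print(f"remembered failure rate: {remembered:.2f}")
```

Under these assumptions the single recent failure makes the event seem several times more probable than the record supports.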
In many domains, availability is very much based on publicity. For the flying
public, there is a greatly elevated fear or estimation of the probability of a fatal
aircraft crash simply because of the high publicity given to aircraft accidents.
Because they are very highly publicized, the public generally has available this
idea that aircraft accidents are fairly frequent and, therefore, overestimates how
likely they are to occur.
Availability is also often governed by simplicity. It is easier to remember or to
think about simple situations than complex situations. And this is very much
true in trying to diagnose a failure. Multiple failures are fairly complex;
therefore, it is hard for people in doing failure diagnosis to think that those
multiple failures could happen, because they are simply not easy to recall from
memory. It is much easier to think about simple failures; a single element
failure rather than a complex failure.

Representativeness Heuristic
We have said that people should rest their belief in part on the base rate
probability. We have also said that the way people actually use base rate
probability is not by the true probability, but by how easily they can recall
instances of an event. However, it seems that people frequently do not use
probability at all in making diagnoses. Instead, they attend only to the similarity
or representativeness of the current evidence or data to one hypothesis or
another. The representativeness heuristic further states that the only time we
use base rate is when there isn't much data to go on. For example, a pilot may
be flying in a particular area, and it is highly probable that the local weather
conditions may be severe, based upon past data. If the present weather actually
looks clear outside the cockpit, the pilot would tend to ignore the base rate
information which might state that in this particular region, at this particular
time of year, the weather is likely to degrade. The representativeness heuristic
also makes us tend to ignore differences in the probability of different failure
states if a set of symptoms that you observe looks like the prototypical case of a
particular failure you have well represented in memory. In the case of landing
at the wrong airport, the representativeness heuristic would make you ignore
the fact that this is really not a likely place for your target airport to be,
because the airport runway and the pattern of lights look like the airport you
think you should be approaching. The wrong runway is representative of an
image you have in memory of the correct runway.
Overconfidence Bias
In understanding where you are, what your situation is, and what you should
do next, the overconfidence bias can be at work. This seems to be a fairly
pervasive bias that underlies performance of both novices and experts in a lot
of different domains. We tend to be overconfident in our own judgments based
on our own memory and our own cognitive ability. In other words, when we
have solved a problem, we are more confident than we should be that the
problem is solved correctly. One important example of overconfidence occurs in
eyewitness testimony. A lot of data coming from research on judicial procedures
indicate that eyewitnesses to a crime, or to a significant event such as an
aircraft accident, tend to be far more confident about what they saw than
the accuracy of their own testimony will reflect. An eyewitness, for example,
might state with high confidence that a plane was on fire before it crashed,
when, in fact, it was not. The point is that you can't give much credibility to
the eyewitnesses' asserted confidence of what they saw or heard, and instead
you must down-weight that confidence appropriately. There has also been some
laboratory work done with pilots' decision making at the University of Illinois
where it was found that pilots are more confident that their judgments are
correct than they really have a right to be (Wickens et al, 1988).
For the pilot, the consequence of overconfidence in the correctness of a decision
that he or she has just made, is that the next course of action will be taken
without adequately considering the alternative actions, should the decision in
fact be the wrong one, and will be taken without adequately monitoring the
evolving consequences of the decision just made.
Risk Assessment
A characteristic of many judgments both on the ground and in the air is the
need to choose between a risky option and a sure thing option. A risky option
has two possible outcomes, neither one of them assured. A sure thing option
has only one, certain, outcome. It is almost guaranteed. The classic example of
choosing between a risky option and a sure thing option is delaying takeoff on
a flight. The sure thing option is that you are going to sit on the ground for a
long period of time and nothing is going to happen except a certain delayed
flight. The risky option involves going ahead with the takeoff into potentially
uncertain weather, a decision with two possible outcomes: an accident or
incident due to severe weather, or a safe trip. With the sure thing option,
staying on the ground, it is highly probable that everything will be fine, and
the consequences of the decision will be generally good (safe, but with a
delay). The risky option really has a relatively high probability that things will
go very well (a safe flight but no delay), but a very severe negative
consequence if the bad weather leads to disaster.
How do people make these choices? Do they tend to go for the sure thing or
the risky option? These sorts of decision problems can be expressed intuitively
in terms of gambling choices. Here's the choice: you can receive five dollars
guaranteed, or you can flip a coin and either win ten dollars or nothing at all.
This is really a choice between two positive outcomes with the same expected
value in the long run. One is keeping the five dollars, a sure thing. The other is
that you have a 50/50 chance of getting something good, ten dollars, or
nothing at all. With either option, you have everything to gain and nothing to
lose. In contrast, we can also represent these two decision choices in terms of
negative outcomes. So I can say, I will take five dollars from you, or you can
flip a coin and have a 50/50 chance of either losing ten dollars or nothing at
all.
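The claim that the sure thing and the gamble have the same expected value in the long run can be checked with a few lines of arithmetic. The sketch below uses the dollar amounts from the text; the function and variable names are illustrative only.

```python
def expected_value(option):
    """Long-run average payoff of an option.

    option is a list of (probability, dollar amount) pairs.
    """
    return sum(probability * amount for probability, amount in option)

# Positive framing: five dollars for sure, or a coin flip for ten or nothing.
sure_gain = [(1.0, 5)]
risky_gain = [(0.5, 10), (0.5, 0)]

# Negative framing: lose five dollars for sure, or a coin flip to lose ten or nothing.
sure_loss = [(1.0, -5)]
risky_loss = [(0.5, -10), (0.5, 0)]

for name, option in [("sure gain", sure_gain), ("risky gain", risky_gain),
                     ("sure loss", sure_loss), ("risky loss", risky_loss)]:
    print(f"{name}: expected value = {expected_value(option):+.2f} dollars")
```

Within each pair the expected values are identical, so any consistent preference for one option over the other cannot be explained by the long-run payoff alone.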
The research in psychology has studied people confronted with these gambling
choices (including also trained business people making investments). The results
reveal that whether people choose the risky option or the sure thing option
depends upon whether the choice is between two positive outcomes as in the
first example, or two negative outcomes as in the second example. Given a
choice between two positive outcomes, people tend to take the sure thing. They
tend to be averse to risk and, as the expression goes, they "take the money and
run." So more people would be likely to take the five dollars than to take the
bet of getting more or nothing at all. But given the choice between two
negative outcomes, people tend to be risk-seeking. The expression for them is,
they "throw good money after bad." They are more likely to take the gamble
and hope they come out with no loss rather than accepting a guaranteed loss.
This difference in choice preference is called framing of decisions, because the
way in which a decision is made depends on how it is framed: Whether it is a
choice between positives or a choice between negatives (Kahneman, Slovic, &
Tversky, 1982).
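One way to picture this pattern, offered here only as an assumption and not spelled out in the text, is to suppose that the subjective value of money grows more slowly than its dollar amount, for gains and for losses alike. The sketch below uses a square-root curve purely for illustration; the curvature value and the function names are hypothetical.

```python
def subjective_value(amount, curvature=0.5):
    """Assumed subjective value of a dollar gain or loss.

    Each extra dollar gained, or lost, counts for a little less than
    the one before it. The curvature of 0.5 is an illustrative choice.
    """
    if amount >= 0:
        return amount ** curvature
    return -((-amount) ** curvature)

def subjective_worth(option):
    """Probability-weighted subjective value of an option."""
    return sum(p * subjective_value(amount) for p, amount in option)

sure_gain = [(1.0, 5)]
risky_gain = [(0.5, 10), (0.5, 0)]
sure_loss = [(1.0, -5)]
risky_loss = [(0.5, -10), (0.5, 0)]

prefers_sure_gain = subjective_worth(sure_gain) > subjective_worth(risky_gain)
prefers_risky_loss = subjective_worth(risky_loss) > subjective_worth(sure_loss)

print("positive frame, sure thing preferred:", prefers_sure_gain)
print("negative frame, gamble preferred:   ", prefers_risky_loss)
```

Under this assumed curve the same decision maker takes the sure thing when the choice is framed as a gain and takes the gamble when it is framed as a loss, which matches the reversal described above.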
Consider, for example, a physician making choices between a sure thing medical
treatment and risky treatment. Investigations have found that the physician
recommendations are very much influenced by whether words are phrased in
terms of saving the patient, or the probability that the patient will die. Saving
the patient is the positive outcome; the probability of death is the negative
outcome.
How do we translate framing into an aviation-relevant example? Again, let's
consider a decision between, say, canceling or delaying a flight and taking off
into bad weather. We can talk about the sure thing characteristics of delaying
or canceling the flight. There is a certain good characteristic to delay or
cancellation, and that is you are guaranteeing safety. A certain bad
characteristic is that you are guaranteeing a lot of irate passengers, a disrupted
crew schedule, etc. The risky option of flying into bad weather has a good (but
uncertain) outcome: it is likely that you are going to proceed in a more timely
fashion. It also has a potentially bad characteristic: with a low probability, it
could happen that there is going to be severe delay and possibly disaster. The
issue here is that the bias towards one choice or the other can be based on the
way in which the positive outcomes are framed or emphasized. Say the decision
is between guaranteeing a safe flight or a high probability of getting a timely
flight to the destination. That is a decision framed in terms of a positive sure
thing and a positive risk. The framing bias suggests that under these
circumstances, the bias would be towards delaying the flight and just staying on
the ground. Whereas, if the decision were framed in terms of negatives, a sure
thing of delay with irate passengers or a relatively small possibility of a crisis
because of being in the air in bad weather, there would be a greater bias
towards choosing the risky option.

Stress and Decision Making
It is important to consider some of the ways in which stress amplifies the
various biases, or otherwise affects decision making. These conclusions are
based on both accident and incident reports as well as some experimental data.
By stress, we mean the perception of being in a highly dangerous environment,
and refer to the kind of experiences that result when alarms start to go off and
the cockpit systems start to fail or when the aircraft encounters very serious
meteorological conditions. The effects of stress seem to enhance the
confirmation bias, also called cognitive tunneling. This occurs when you
continue to believe in the hypothesis that you initially formulated, regardless of
what the new data say. The analysis of the Three Mile Island nuclear incident
shows very graphically how the operators, under the stress of a crisis situation
after the initial alarms sounded, and knowing they had a critical situation,
continued to tunnel in and focus on that one belief that the water level was too
high, not too low. In cognitive tunneling, one not only tunnels to a particular
hypothesis, but also tends to focus onto particular elements of a display under
high levels of stress and, therefore, process less information. It is as if the
searchlight of attention narrows down onto certain critical cues; you pay most
attention to those you believe are most important and you tend to ignore other
information. Cognitive tunneling and display tunneling work very much hand in
hand, in the sense that the higher the stress the more you pay attention to the
information that confirms the hypothesis you believe to be the case. A recent
study of errors made by RAF pilots indicated that cognitive tunneling of
displays under stress was a significant cause of the accidents they examined.
Approximately 16 or 17 percent of the accidents were related to this (Allnut,
1987).
Stress contributes to a loss in working memory: the ability to rehearse digits,
(navigational waypoints, radio frequencies), and the ability also to form a
mental model of the visual airspace. Research has been done at Illinois that
indicates that these imaging capabilities seem to go down under high levels of
stress as well (Wickens et al, 1988). Clearly, the more we are stressed, the less
we use working memory, and the more we try to use very simple heuristics,
simple mental rules of thumb. Under stress, the heuristics or biases tend to
dominate our decision-making process.
It is also important to point out that at least some data indicate that there are
processes that are stress-resistant. In particular, a lot of times decisions can be
made, not by going through this process of weighing all of the information and
integrating it with mental calculations, but rather by direct long-term memory
retrieval. Decision making by expert pilots in familiar situations is usually
automatic and almost unconscious. The pilot sees a situation, it matches
something he or she has experienced before, and that's the diagnosis. The pilot
has carried out an action that worked before under those same conditions, so
the pilot carries it out again and doesn't go through a time-consuming process
of risk evaluation and calculated action choice.
Both the research at Illinois (Wickens et al, 1988) and a lot of the research that
Klein (1989) has done with tank crew commanders and with fire fighters
indicate that this type of decision making seems to be much more resistant to
stress. Finally, it has been found that people's ability to evaluate the risk of
different options, again, does not appear to be degraded by stress. There isn't a
tendency to be more risky or less risky under stress.
Lessening Bias in Decision Making
So where does all this lead to? What steps can be taken to address bias
problems in decision making? Clearly, training and developing expertise is one
step. Experts tend to use decision strategies that are based more on directly and
rapidly retrieving the right action or diagnosis from long-term memory, on the
basis of similarity with past experience, rather than using working memory to
generate or ponder the alternatives in an effortful manner (Klein, 1989).
Another step that can be taken is de-biasing. There has been some successful
research in de-biasing, that is, making pilots or decision makers aware of the
kind of biases already mentioned in this chapter. Weather forecasters, for
example, if given explicit training about the tendency to be overconfident in
their forecasts, can learn to calibrate those forecasts quite accurately. Planning,
that is, rehearsing alternatives in advance of a crisis situation, is another step in
addressing the bias problem. Effective pilot training naturally strives to get the
student to plan for alternative courses of action, and their consequences in
different and possible circumstances. Finally, one of the more controversial
means used to deal with bias, one that is emerging in the commercial flight
deck and is already used in the military flight deck, is the expert system. Expert
systems can, at least according to some scientists, replace some of the pilot
decision making necessary in the cockpit, or at least can recommend judgments
to the pilot in the cockpit.
The mention of expert systems leads us directly to consider the advantages and
limitations of automation, an issue covered more thoroughly in Chapter 9. By
and large, automated systems are far more helpful at this stage if they can
provide sufficient ways of integrating and presenting information rather than
actually replacing judgment and decision making. There is too little known
about the way in which pilots make decisions to trust all of those decision
recommendations to the expert systems, but there is much to be gained from
using automation to integrate and present information.

High Speed Decision Making: The Choice of Action


A decision to take off rather than abort is one that is made 99 times out of
100, or maybe more frequently than that, without a lot of conscious thought;
there is very little choice or uncertainty about the consequence of doing one
versus the other. The decision of what key to press on a control display unit is
also one you don't really have to think about. You know the consequences of
hitting the right keys are good and the consequences of hitting the wrong keys
are bad. The decision to respond to a TCAS advisory to engage in a particular
flight maneuver is also made without a lot of thought. One doesn't weigh the
consequences (expected costs and benefits) of doing one thing versus another.
These are all examples of decision making under certainty. There are many
factors that affect how quickly aviators respond to TCAS commands, etc. An
important thing to keep in mind as we discuss these factors is that almost
anything that makes a decision take longer will also be more likely to make
that decision incorrect. The things that prolong response time are also the same
things that will lead to an increased likelihood of error. (The one exception is
the person's choice to proceed more cautiously. The longer decision will
probably be more accurate.)
Decision Complexity

The first factor that affects response selection speed is decision complexity.
The complexity of a decision is literally the number of possible alternatives.
Think of a two-choice decision. You are accelerating for takeoff. Do you rotate
or abort the takeoff? There are two possible choices available. A more complex
example is a choice between four possible alternatives. A TCAS warning might
tell you to turn right or left, or to climb or descend. It might even present more
detailed choices: right and descend, left and descend, etc. The response time
increases with the number of possible response alternatives. In fact, we have a
nice equation that can be used to express how long the response time will be
as a function of the number of possible alternatives that are available.

RT = a + b log2(N)
You can plot this function to show that each time we double those alternatives,
we get a constant increase in response time (and an increase in the probability
of making a mistake). Again, simple choices are easier and made more rapidly
than complex ones.
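
As a rough numerical illustration of the equation above, the sketch below evaluates RT = a + b log2(N) for a few values of N. The intercept and slope (0.2 s and 0.15 s per doubling) are illustrative assumptions, not values taken from this chapter, and the function name is hypothetical.

```python
import math

def choice_rt(n_alternatives, a=0.2, b=0.15):
    """Predicted response time (s) from RT = a + b * log2(N).

    a and b are assumed, illustrative constants; real values have to be
    measured for the task and the population of interest.
    """
    if n_alternatives < 1:
        raise ValueError("need at least one alternative")
    return a + b * math.log2(n_alternatives)

if __name__ == "__main__":
    # Each doubling of N adds the same increment, b, to the prediction.
    for n in (2, 4, 8, 16):
        print(f"N = {n:2d}: predicted RT = {choice_rt(n):.2f} s")
```

With these assumed constants, going from 2 to 4 alternatives costs the same extra time as going from 8 to 16, which is the constant increase per doubling described in the text.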

A second important factor in response time is probability, or expectancy. We
tend to perceive and respond very fast to things we expect, and take a long time to
respond to (or do not perceive at all) things that we do not expect. For example,
in accelerating for takeoff, the pilot very much expects the conditions to be
favorable to rotating and going through with the takeoff. He does not expect
conditions that will force an abandonment of takeoff procedures. Coming in for
a landing, the pilot expects a clear and open runway, and does not expect an
obstacle to appear on the runway. We have a formula for the effect of
expectancy or probability on reaction time (Hyman, 1953).

RT = a + b log2[1/p(a)]
The lower the probability of the event (a), the less frequent it is, and the
longer is the reaction time. These equations provide some evidence, which
psychologists are always seeking, for fairly well-defined mathematical laws of
human performance. To some extent and in some circumstances, these laws can
be balanced against the very strong mathematical laws of engineering
performance.
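
To make the expectancy formula concrete, here is a minimal sketch that evaluates RT = a + b log2[1/p] for a highly expected and a rare event. The constants and the two probabilities are illustrative assumptions only, and the function name is hypothetical.

```python
import math

def expectancy_rt(p_event, a=0.2, b=0.15):
    """Predicted response time (s) from RT = a + b * log2(1 / p).

    p_event is the probability of the event; a and b are assumed constants.
    """
    if not 0.0 < p_event <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return a + b * math.log2(1.0 / p_event)

if __name__ == "__main__":
    expected = expectancy_rt(0.99)  # e.g., a normal takeoff roll (assumed p)
    rare = expectancy_rt(0.01)      # e.g., a condition forcing an abort (assumed p)
    print(f"expected event: {expected:.2f} s")
    print(f"rare event:     {rare:.2f} s")
```

Note that when N alternatives are equally likely, p = 1/N and this expression reduces to the complexity equation RT = a + b log2(N) given earlier.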

A third factor influencing response selection speed is the context in which an
event occurs. We respond very rapidly if the context makes the event likely, and
more slowly if the context makes it unlikely. So a crew will respond to a windshear
alert quite rapidly if it is in the context of flight into very turbulent thunderstorm
conditions near the ground. The crew will respond to the windshear alert
much more slowly if it is in the context of a clear air approach where the
weather is good and there is no prior evidence that it is a likely condition to
occur. Similarly, the response to a collision or potential collision will be
relatively fast in a very dense airspace and will be relatively slow if there is
minimal traffic because the latter context is not one that suggests you're likely
to encounter traffic.
The Speed-Accuracy Trade-off
Response selection speed is also affected by other factors. The first is a very
intuitive one, speed stress. The more we are stressed to go fast, the faster we
go, but the more likely we are to make errors. This is called the speed-accuracy
trade-off. On the other hand, the more we try to be accurate in our responses,
the slower we are going to be. You can sum this up by saying the higher the
accuracy, the longer the response time. A pilot, rushing through a checklist to
reach the next phase of flight as soon as possible, will be more likely to make
an error. There is an interesting application of the speed-accuracy trade-off in
nuclear power plant design. Designers have found that operators, under
emergency conditions, tend to put a self-imposed time stress on themselves.
When warning signals start to go off, they want to respond very rapidly. The
consequences have been, in a couple of accidents, that people respond fast and
make mistakes. Of course, mistakes in dealing with a crisis are the last things
you want to have happen. There have been some implicit recommendations in
this country, and explicit recommendations in Germany in the nuclear industry,
to tell operators not to respond immediately when something starts to go wrong.
Germany has actually specified a period during which operators may not take any
action, so that they first form an understanding of exactly what is happening. In
other words, control room operators have been given instructions that combat
the tendency to respond fast and make more errors in times of crisis.
Signal and Response Discriminability

Low discriminability between signals is another factor that slows response
selection speed. For example, when a pilot is responding to signals rapidly, the
likelihood of confusion is greater if the signals are similar to each other.
Consider the air traffic controller, for example, who must respond to one of two
aircraft that have similar designations, for example, B4723 and B4724. The only
difference is the single digit at the end -- 3 and 4, and the controller will take a
relatively longer time to respond in this case. On the other hand, if we take
away all of those common features and leave only the different features, 3 and
4, the response time can be relatively rapid. Another example of potential
discriminability problems might be digital information on head-up displays. If
very similar information like air speed and altitude is displayed digitally in a
common format, then the high degree of similarity between the representation
of each of these may seriously impede a pilot's ability to respond rapidly to a
change in one or the other. Auditory alerting tones are another major
culprit for similarity-induced slowing or confusion if there are several different
tones, each with a different meaning.
Just as two highly similar signals can be confusing and slow down response
time to one or the other, so also highly similar switches that have to be used in
similar fashion can delay response. If there are two switches that function
exactly the same way for different purposes, it will take a pilot longer to pick
the right one in an emergency. There is a book called The Psychology of
Everyday Things (Norman, 1988). It is very readable, nontechnical, and it
demonstrates how the selection of action is influenced by the design of
everyday things like automobile dashboards, VCR controls, light switch controls,
etc. For example, it discusses the problems associated with clock-radios having
what the manufacturer calls "human-engineered" direct input pushbutton
control, in which all of the controls look identical. This is exactly the opposite
of good human engineering principles where you would want to have a high
degree of discriminability between one control and another.
Practice
Practice is still another influence on response time. The more practiced we are
at responding in certain ways under specific conditions, the more rapid those
responses will be.

The Decision Complexity Advantage


We have already seen that complex choices take longer than simple choices. A
four-choice reaction takes longer than a two-choice reaction. However, there are
situations in which it is better to have a smaller number of complex choices
than a large number of simple choices. We call that the decision complexity
advantage. A good example of this is going through a menu on a flight
management computer. What does a pilot need to do if he goes through a
menu? There may be a total of 16 options, one of which has to be selected.
How do you get to those 16 options to choose the one you want? One
possibility is to put all 16 options on a single menu page and have the pilot
choose from the 16 items. This is called a "broad/shallow" menu and involves
one complex decision. Another option might be to break them into four groups
of four options, and have the pilot first choose which group of four he wants to
use. Then once he gets the group of four options, he makes another choice
within the remaining four. This is called a "narrow/deep" menu and involves
two simpler decisions. The suggestion is that there are a lot of different ways of
getting down from the beginning of a menu to the final option you want.
So which is better: broad/shallow menus, with many options per menu, or
narrow/deep menus? It is generally better to make a smaller number of more complex
choices (broad/shallow) than a larger number of slightly simpler choices
(narrow/deep). That is a fairly well-established principle in human factors design
(Wickens, 1992), and it is typical of the guidance that human factors can
offer on this kind of issue.
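
One way to see why the broad/shallow menu can come out ahead is to apply the response time equation from the Decision Complexity discussion, RT = a + b log2(N), to each menu page. The sketch below does this for the 16-option example. It assumes that the equation carries over to menu selection and that every page costs one full decision; the constants a and b and the function name are illustrative assumptions, not values from this chapter.

```python
import math

def menu_time(options_per_level, a=0.2, b=0.15):
    """Total predicted selection time (s) for a menu hierarchy.

    options_per_level lists how many options appear on each successive
    page, e.g. [16] for broad/shallow or [4, 4] for narrow/deep.
    a and b are assumed constants in RT = a + b * log2(N).
    """
    return sum(a + b * math.log2(n) for n in options_per_level)

broad_shallow = menu_time([16])   # one 16-way choice
narrow_deep = menu_time([4, 4])   # two successive 4-way choices
print(f"broad/shallow: {broad_shallow:.2f} s")
print(f"narrow/deep:   {narrow_deep:.2f} s")
```

Under these assumptions the logarithmic term is identical in both cases (log2 16 = 2 log2 4), so the narrow/deep menu is slower only because it pays the fixed cost a twice. The sketch ignores factors such as the time needed to scan a long page visually, so it should be read as an illustration of the principle rather than a prediction.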
Following Checklist Procedures
Menu choice is one case where operators have to execute a number of
responses in sequence. Another area in which multiple responses are relevant is
in following checklist procedures, a topic that is well discussed from a human
factors viewpoint by Degani & Wiener (1990). One of the greatest potential
causes of human error is in following a checklist. Here again, there are some
human factors guidelines that are relevant. A major point in checklist design is
to avoid negatives. Negatives in any sort of checklist or procedural instruction
do two things. First of all, they provide an added cognitive load. Any time you
hear or read "do not do something" you have to mentally represent what it is
that you do and then mentally reverse that representation. Psychologists have
shown that this added cognitive transformation takes longer, and it also
increases mental workload. A second danger of negatives in a checklist is that
there is always the possibility of missing the negative and assuming that it is
the positive. Negatives are particularly dangerous in command information.
When someone needs to know what should be done, the instruction should be
a positive one. If a pilot should ascend, it is confusing to command "don't
descend," but saying "ascend" or "climb" is clear. Negatives should also be
avoided in communicating status information. To be told that the status of
something is "not" this or that places that extra burden on the mind to translate
the negative information to a positive. Also, saying what something is "not" is
ambiguous, because there are often several things that it could be instead.
Another important issue related to checklists is the idea of congruence. Anytime
there is a checklist or verbal narrative of things to be done that will be played
out, in some sequence, over time, it is important to make sure that the ordering
of words over time is congruent with the ordering of time. If you are reading a
checklist that says "do X then do Y," you encounter the letter X before Y, and
that is the correct, congruent order. If you have a checklist that says "before
you do Y, make sure X is done," then you have an ordering of the words in the
checklist that is opposite or incongruent from the ordering of actions that are to
be accomplished. So somebody who quickly glances at the sequence sees Y then
X, and perhaps does Y first, which is reversed from the intended order.
There is a lot to be gained by the use of pictures in checklists and procedure
following. Here we are getting more into the maintenance guidelines rather
than the flight deck guidelines, but it is still relevant. Figure 7.3 includes an
instruction written in text and the same instruction illustrated with a picture
and text. The text-only verbal instructions are "See that the sliding cog
associated with the reverse drive bevel is rotating freely before tightening the
long differential casing." A better presentation is the drawing combined with
brief instructions, numbered (1) and (2). This is a clear case where a picture
speaks far more clearly than words. Another characteristic of this picture that
might be considered in terms of the logical way of processing information is
that we typically read from left to right. Therefore, following the principle of
congruence, it would be better to have instruction (1) to the left of (2), so you
encounter instruction (1) first as you scan from left to right.


Figure 7.3. Example of how an illustration can be used to avoid technical jargon and
improve comprehension (from Wright, 1977). The text-only version reads: "See that the
sliding cog associated with the reverse drive bevel is rotating freely before tightening
the long differential casing." The illustrated version labels the parts: (1) check that
this turns freely; (2) tighten this screw.

Response Feedback
Another issue in response selection, particularly relevant to making several
responses in a row, is the issue of feedback from the responses. There are two
different classes of feedback. Extrinsic feedback is separate from the act of
making the response itself. Extrinsic feedback is often visual. For example, when
you press a key on a CDU (control-display unit), you see a visual indicator on
the display corresponding with the key that was pressed. Intrinsic feedback, on
the other hand, is directly tied to the act itself. It may be tactile feedback where
you press a button and feel the click as it makes contact, and perhaps you hear
a click. Intrinsic feedback is very useful if it is immediate; that is, if it occurs
immediately after the action. For example, pushbutton phones that give you a
tone each time you press a button provide better intrinsic feedback than those
that don't. There is a great advantage to making sure any keyboard design
includes this intrinsic, more immediate feedback.

On the other hand, it is clear that delayed feedback is harmful, particularly for
novices. It disrupts the ability to make sequential responses, particularly when
that feedback is attended to and is necessary, or particularly if it is intrinsic.
One of the things we have known for a long time is that delayed auditory
feedback has a tremendously disruptive effect. If you are hearing your own
voice and it is delayed by as little as a quarter second, the voice transmission is
profoundly degraded. Looking toward the future, design considerations for
the data-link system between pilots and air traffic controllers will need to be
concerned with feedback issues, as the pilot communicates through the
computer interface with the ground using various forms of non-natural displays
and non-natural controls (i.e., keyboard controls, computer-based voice
recognition, and voice synthesis).
Display-Control (Stimulus-Response) Compatibility
The compatibility between a display and its associated control has two
components. One relates to the relative location of the control and display; the
second to how the display reflects (or commands) control movement.
In its most general form, the principle of location compatibility says that the
location of a control should correspond to the location of a display. But there
are several ways of describing this correspondence. Most directly this is satisfied
by the principle of colocation, which says that each display should be located
adjacent to its appropriate control. But this is not always possible in cockpit
design when the displays themselves may be grouped together. Then the
compatibility principle of congruence takes over, which states that the spatial
arrangement of a set of two or more displays should be congruent with the
arrangement of their controls. Unfortunately, some aviation systems violate the
congruence principle (Hartzell et al., 1980). In the traditional helicopter, for
example, the collective, controlled with the left hand, controls altitude which is
displayed to the right; whereas the cyclic, controlled by the right hand, affects
airspeed which is displayed to the left.
The distinction between "left" and "right" in designing for compatibility can be
expressed either in relative terms (the airspeed indicator is to the left of the
altitude indicator), or in absolute terms, relative to some prominent axis. This
axis may be the body midline (i.e., left hand, right hand), or it may be a
prominent axis of symmetry in the aircraft, like that bisecting the ADI on an
instrument panel, or that bisecting the cockpit on a twin seat design. Care
should be taken that compatibility mappings are violated in neither relative nor
absolute terms. For example, in the Kegworth crash in the United Kingdom in
1989, in which pilots shut down the remaining, working (right) engine on a
Boeing 737, there is some suggestion that they did so because the diagnostic
indicator (engine vibration) of the malfunctioning (left) engine was positioned
to the right of the cockpit centerline (Flight International, 1990).
Sometimes an array of controls (e.g., four throttles) is to be associated with
an array of displays (e.g., four engine indicators). Here, congruence can be
maintained (or violated) in several ways. Compatibility will be best maintained
if the control and display arrays are parallel. It will be reduced if they are
orthogonal (Figure 7.4, i.e., a vertical display array with a horizontal left-right
or fore-aft control array). But even where there is orthogonality, compatibility
can be improved by adhering to two guidelines: (1) the right of a horizontal
array should map to the front of a fore-aft array; (2) the display (control) at
the end of one array should map to the control (display) at the end of the
other array to which it is closest (see Figure 7.4). It should be noted in closing,
however, that the association of the top (or bottom) of a vertical array with the
right (or high) level of a horizontal array is not strong. Therefore, ordered
compatibility effects with orthogonal arrays will not be strong if one of them is
vertical. Some other augmenting cue should be used to make sure that the
association of each end of the array is clear (e.g., a common color code on
both, or a painted line between them).
The movement aspect of SR compatibility is called cognitive-response-stimulus
compatibility, or CRS compatibility. This means that the pilot has a cognitive
intention to do something: increase, activate, set an air speed, turn something
on, adjust a command altitude, etc. Given that intention, the pilot makes a
response, an adjustment. Given that response, some stimulus is displayed as
feedback from what has been done. There is a set of rules for this kind of
mapping between an intention to respond, a response, and the display stimulus.
The rules are based on the idea that, first of all, people generally have a
conception of how a quantity is ordered in space. As we noted in the previous
chapter, when we think about something increasing, we think about a
movement of a display that is either upwards, to the right, forward, or
clockwise. Secondly, there is a set of guidelines having to do with the
relationship between control and display movement that is most compatible, or
that is most natural. These guidelines are shown in Figure 7.5. Whenever one is
dealing, for example, with a rotary control, there are certain expectations we
have about how the movement of that control will be associated with the
corresponding movement of a display. We think of these as stereotypes, and
there are three important stereotypes.
The first is the clockwise-to-increase stereotype, meaning anytime we grab a rotary
control, if we want to increase the quantity, we automatically think we have to
rotate the rotary control in a clockwise direction (c and d). The second
stereotype is what is called the proximity of movement stereotype. It says that
with any rotary control, the arc of the rotating element that is closest to the
moving display is assumed to move in the same direction as that display.
Looking at (c) in Figure 7.5, we see that rotating the control clockwise is
assumed to move the needle to the right, while rotating it counterclockwise is
assumed to move the needle to the left. It is as if the human's "mental model" is
that there is a mechanical linkage between the rotating object and the moving
element, even though that mechanical linkage may not really be there.

Figure 7.4. Different possible orthogonal display-control configurations, panels (a)
through (d) (from Andre & Wickens, 1990). Panel labels: X-Y, X-Z, Y-Z, Y-X.

Figure 7.5. Examples of population stereotypes in control relations, panels (a)
through (h) (from Wickens).

The important point is that it is very easy to come up with designs of control
display relations that conform to one principle and violate another. A good
example is (e). It shows a moving vertical scale display with a rotating
indicator. If the operator wants to increase the quantity, he or she grabs the
dial and rotates it clockwise. That will move the needle on the vertical scale up,
thus violating the proximity of movement stereotype. You can almost hear the
grinding of teeth as one part moves down while the adjacent part moves up.
How do we solve the confusion? Simply by putting the rotary control on the
right side rather than the left side of the display. We have now created a
display-control relationship that conforms to both the proximity of movement
stereotype and the clockwise-to-increase stereotype. Simply by improving the
control-to-display relationship, designers can reduce the sorts of blunder errors
that may occur when an operator sets out to, say, increase an air speed
bug by doing what seems to be compatible, and instead moves it in the
opposite direction.
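
The conflict between the two stereotypes can be checked mechanically. The following is a minimal sketch, assuming a rotary knob mounted beside a vertical scale whose pointer moves up when the quantity increases. The function names and the string encodings ('left', 'right', 'cw', 'ccw') are hypothetical and chosen only for illustration.

```python
def near_arc_direction(control_side, rotation):
    """Direction ('up' or 'down') moved by the knob arc nearest the display.

    control_side: 'left' or 'right', the side of the display the knob is on.
    rotation: 'cw' or 'ccw', as seen by the operator.
    """
    if control_side not in ("left", "right") or rotation not in ("cw", "ccw"):
        raise ValueError("unexpected control_side or rotation")
    # On a clockwise turn the left edge of a knob moves up and the right
    # edge moves down. The edge nearest the display is the right edge when
    # the knob sits to the left of the display, and the left edge otherwise.
    near_edge = "right" if control_side == "left" else "left"
    cw_direction = "up" if near_edge == "left" else "down"
    if rotation == "cw":
        return cw_direction
    return "down" if cw_direction == "up" else "up"

def stereotype_conflict(control_side):
    """True if clockwise-to-increase conflicts with proximity of movement.

    Assumes the pointer moves up when the quantity increases.
    """
    return near_arc_direction(control_side, "cw") != "up"

print("knob left of display conflicts: ", stereotype_conflict("left"))
print("knob right of display conflicts:", stereotype_conflict("right"))
```

Under these assumptions the knob on the left produces the conflict described above, and moving it to the right of the display satisfies both stereotypes at once.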
The third component of movement compatibility relates to congruence. Just as
we saw with location compatibility, so movement compatibility is also preserved
when controls and displays move in a congruent fashion: linear controls parallel
to linear displays [(f), but not (g)], and rotary controls congruent with rotary
displays [(b) and (h). Note, however, that (h) violates proximity of movement].
When displays and controls move in orthogonal directions, as in (g), the
movement relation between them is ambiguous. Such ambiguity, however, can
often be reduced by placing a modest "cant" on either the control or display
surface, so that some component of the movement axes are parallel, as shown
in Figure 7.6.
Figure 7.6. Illustration of how a "cant," i.e., angling controls to be partially parallel to
displays, will reduce compatibility ambiguity (from Andre & Wickens, 1990).

As we have seen with the proximity of movement principle, movement
compatibility is often tied to a pilot's "mental model" of the quantity being
controlled and displayed. Figure 7.7 shows one particular example of
display-to-control compatibility that indicates how consideration of the mental model can
increase the complexity of compatibility relations. This example is taken from
an aircraft manual on a vertical speed window. It is a thumbwheel control
mounted in the panel, and to adjust the speed down, you rotate the wheel
upward. The label next to the thumbwheel shows an arrow pointing up to
bring down (DN) vertical speed and an arrow pointing down to bring vertical
speed up (UP). From the human factors point of view, this is an incompatible
relationship between control and display. If you want to go down, you should
push something down, not up. If you want to go up, you should push
something up. However, consideration of the mental model makes the relation
more compatible than it first appears. If you think about this as a vertical
wheel, mounted into the cockpit along the longitudinal axis, you are basically
rotating the nose of the aircraft down or up. So moving it up rotates the nose
of the aircraft down, thereby creating a descent. How pilots think of this is not
altogether clear, but it illustrates an important principle: a pilot's mental
model of what a control is doing has tremendous implications for whether that
control will be activated in the correct or incorrect direction.

Figure 7.7. Example of display-control compatibility on a vertical speed window. Figure
labels: vertical speed window; Down; Up; glareshield; vertical speed selector (UP/DN
sets vertical speed in the vertical speed window).
Compatibility concerns also address the issue of how a toggle switch should
move to activate or provide power to a system. To configure a control mounted
on a front panel in a way that its movement will increase the quantity of
something or activate it, we might well have it move to the right or upward. If
it is mounted along a side panel, we might want to move forward to increase
(on) and backward to decrease (off). What happens when we have it mounted
on a panel which is at an angle between the right side and the front? We now
have a competition between whether this panel is being viewed as closer to the
forward position, in which case an increase should be to the right, or closer to
the sideward position, in which case an increase should be forward--but in the
opposite direction. Which way should this control go to increase? One answer is:
Why fight the stereotypes? Why not instead go with the one direction that is
unambiguous, that is, make sure upward increases? If there is a zone of
ambiguity, where you have one stereotype fighting against the other stereotype,
good human factors should consider that battle and take advantage of designs
that make sure that neither stereotype is violated.

The idea that "on" is indicated by up, right and forward moving switches is
contradicted by at least one design philosophy. Figure 7.8 shows the "sweep-on"
switch position concept illustrated for a pilot in a cockpit. The sweep-on
concept says that to turn switches on, a pilot can do so with a single
continuous sweep of the hands. So the direction for on is forward at the
bottom, but backward at the top of the cockpit control panel. While
there is a certain amount of logic behind this design, given the simplicity of
movement, it also presents a concern if a pilot must suddenly focus on a switch
overhead and make a rapid decision whether it is on or off. Does the fact that it
is thrown in a backward position counteract the stereotype that forward means on?
Again, it is not an issue that is easily settled. It is the kind of issue for which
a lot more data should be collected to find out how these different stereotypes
can come into conflict with each other, and when they do, which one "wins."

Figure 7.8. The "sweep-on" switch position concept, which is slowly replacing the
earlier "forward-on" arrangement (from Hawkins, 1987).

SR compatibility is also related to modality: voice versus visual display, as well as
voice versus manual control. Not a lot of work has been done in this area. We are
going to see, and already are seeing in the military, more and more voice-activated
controls replacing manual controls. Certain guidelines seem to exist
that suggest that voice control is well-suited (compatible) for certain kinds of
cognitive tasks, but poorly suited (incompatible) for other kinds of tasks. The
voice is very good for making categorical output, describing a state. On the
other hand, using the voice for any sort of tracking task, describing the location
of things, or movement of things in space, is relatively poor. One reason for this
is that our understanding of space is directly connected with our manipulation
of the hands. Therefore, the hands, whether using a key or joystick, are much
more appropriate for continuous analog control when responding to continuous
analog displays. The one possible benefit for voice control of continuous
variables would occur if the hands were already heavily involved with other
manual control activities. (See Chapter 8.)
Stress and Action Selection
As we have mentioned before, high stress tends to shift one towards fast but
inaccurate performance. People tend to react rapidly, but they tend to make
more mistakes. It is also clear that under stress, people shift to the most
compatible habits and actions. This is probably the strongest reason for keeping
stimulus-response compatibility high. Under low stress, people can be effective
using an incompatible design like an overhead switch that goes back to turn
something on. However, the data suggests that under high levels of stress, the
incompatible design is likely to cause an accident, even for the skilled pilot.
Somebody wants to turn it off, so by habit they move it backward (which is
really on). So compatibility is most beneficial under stress, and, of course, the 1
percent of the time when stress is high is when we are most concerned about
good cockpit design, because this is the period in which the environment may
be least forgiving of human error.
Stress also has other effects on action selection. It biases operators to perform
the best learned habits, in place of more recently learned habits. Stress leads to
a sort of "action tunneling," which is analogous to the cognitive tunneling we
discussed above. In action tunneling, the pilot may repeat the same
(unsuccessful) action over and over. Because stress reduces the capacity of
working memory, it may have a particularly degrading effect on multimode
systems--like a multimode autopilot--in which the pilot must remember what
mode of operation a system is in, in order to select an appropriate action. (We
discuss these systems again under the topic of human error in the next
chapter.) If the memory fails (because of stress), the multimode system
becomes particularly vulnerable to an inappropriate action.
Finally, stress has implications for voice control, where either a pilot or air
traffic controller is talking to voice recognition systems. A major concern in the
research on voice control is the extent to which high levels of stress distort the
voice quality and, therefore, distort the computer's ability to recognize and
categorize the voice message. This has been one of the biggest bottlenecks to
the use of voice control in military systems. What happens when a pilot comes
under stress when talking to the aircraft, and the aircraft does not recognize his
voice commands?

Negative Transfer
The topics of stress and action selection are closely related to the issue of
negative transfer. Negative transfer is the bringing of habits used in one
environment into another environment where those transported habits now
conflict with the actions that are called for. There are problems of negative
transfer when a pilot transfers from one aircraft to another, when a pilot deals
with, say, a modification in his or her customary aircraft, or even when a pilot
deals with two different systems within the same aircraft, like two different
keyboards. Wiener (1988), for example, has called attention to the negative
transfer between the ACARS and FMC keyboards in many modern commercial
aircraft. The negative transfer issue is directly relevant to the whole issue of
pilot certification and common type rating. At what point should two aircraft
have different type ratings that require major differences in training?
An example of an accident that was directly related to negative transfer
occurred on the DC-9 that crashed on an ILS approach. The new, modified
DC-9 involved replacement of the flight director system. In the old system, a full
clockwise rotation of the mode selector switch engaged an approach mode. In
the new system, the same clockwise rotation of the mode selector engaged a
go-around mode. So the same action produced two very different results in the
old and new systems. In the analysis of the accident, Rolf Braune reconstructed
a sequence in which the crew presumably intended to do an approach, and
inadvertently selected a go-around mode by turning the mode selector
clockwise. That caused the confusion that led to the accident.
Given potentially catastrophic confusions such as that described above, designers
need to be concerned with the causes of a negative transfer, as well as positive
transfer in which experience with the previous system helps performance with
the new system. The most general principle of negative transfer is that unless
two designs are identical in both appearance and procedure, the following
design changes will increase the potential for crew error:
o   The appearance of the new design is the same or similar to the old.
o   The procedure is similar, but not exactly the same.

Table 7.1 is a matrix showing error probability due to transfer of previous
learning and experience. Almost any task that a pilot must perform can be
characterized by some perceived information read from a display and a required
action. This matrix portrays whether the perceived information and the required
action are the same between the old and the new systems.


Table 7.1.
Matrix Showing Error Probability Due to Transfer. (from Braune, 1989)

          Perceived     Required     Transfer of Previous       Error Probability
          Information   Action       Learning and Experience    Due to Transfer

Case 1    Same          Same         Maximum Positive           None
Case 2    Different     Same         Positive                   Intermediate
Case 3    Different     Different    Little or None             Low
Case 4    Same          Different    Negative                   High

In Case 1 in Table 7.1, the perceived information is the same and the required
action is the same. With two identical systems, therefore, everything that was
learned in the old system is going to transfer to performance in the new system.
There is going to be a maximum positive transfer of previous learning and
experience from the old system to the new. There is really no possibility for
errors in the transfer.
Case 2 is where there is a different representation of the perceived information,
but the same required action. For example, the old system might have an
analog display and the new system has a digital CRT display. The information is
perceived differently because it is presented in two different formats but the
required action is the same. The transfer of previous learning and experience
will be positive. Error probability is intermediate, so that some errors will occur
but not a great many.
In the Case 3 example, both the displays and the controls are different.
Therefore, there is little or no transfer of previous learning and experience. The
probability of error due to transfer in Case 3 is low. In Case 4, the perceived
information is the same, but there is a different required action. This was the
situation in the DC-9 crash. The same mode switch in two cockpits performed
different actions. The mode switch had to be set differently in the old system
than in the new system, and here is where the transfers of previous learning
and experience are highly negative. These are the "red flags" for potential error
in transferring from one design to the other.
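The four cases of Table 7.1 can be read as a simple lookup on two yes/no questions. The sketch below is only an illustration of that reading; the names `TRANSFER_MATRIX`, `classify_transfer`, and `red_flags` are hypothetical and do not come from Braune (1989).

```python
from typing import NamedTuple


class TransferOutcome(NamedTuple):
    transfer: str            # transfer of previous learning and experience
    error_probability: str   # error probability due to transfer


# Keys: (perceived information is the same?, required action is the same?)
TRANSFER_MATRIX = {
    (True, True): TransferOutcome("maximum positive", "none"),      # Case 1
    (False, True): TransferOutcome("positive", "intermediate"),     # Case 2
    (False, False): TransferOutcome("little or none", "low"),       # Case 3
    (True, False): TransferOutcome("negative", "high"),             # Case 4
}


def classify_transfer(same_information: bool, same_action: bool) -> TransferOutcome:
    """Look up the Table 7.1 outcome for one old-to-new design change."""
    return TRANSFER_MATRIX[(same_information, same_action)]


def red_flags(changes):
    """Return the names of design changes that fall in Case 4.

    changes: iterable of (name, same_information, same_action) tuples.
    """
    return [name for name, same_info, same_action in changes
            if classify_transfer(same_info, same_action).error_probability == "high"]
```

Under these assumptions, a change such as the DC-9 mode selector (same appearance, different required position) would be entered as `("mode selector", True, False)` and would be returned by `red_flags`.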
It is important to note that the potential for negative transfer is greatest when
the required action is actually similar, but incompatible with the old action. In
the DC-9 crash described above for example, the identically appearing rotary
switch was turned in both cases; only the turn was to a different position in
the old and new (two incompatible responses). The nature of the transfer
relationship shown in the matrix is such that negative transfer may sometimes
be avoided by making the appearance of the new response device substantially
different from the old (e.g., a pushbutton select, rather than a rotary control, in
the above case). One of the greatest problems with the different aircraft
manufacturers doing their own thing is the extent to which there is a lack of
standardization of those kinds of display-action relations across aircraft. In
particular, there is a lack of consistency in the relationship between computer
systems and control that leads operators to make errors when transferring from
one to the other.


Chapter 8
Timesharing, Workload, and
Human Error
by Christopher D. Wickens, Ph.D., University of Illinois
Divided Attention and Timesharing
In Chapter 6, we talked about attention in terms of ability to divide attention
between two different sources of displayed information. We talk now of
attention in the broader sense of being able to divide attention between a large
number of different tasks such as between flying and communicating, between
navigating and talking, or between understanding the airspace and diagnosing
the failure. Discussion of attention in these terms describes issues of
timesharing. Each of these shall now be described in turn, before addressing
the broader issues of workload and human error.


Sampling and Scheduling


The first mechanism relates to task sampling and scheduling; that is, how well
does an individual know what perceptual channel or task to attend to at what
time. Effective timesharing is being able to attend to the right thing at the right
time. Much of your ability to take notes at a lecture is based on your ability to
write when the speaker is not saying anything important, then switch your
attention to listening when the speaker is saying something important. A lot of
research on selective attention, on being able to attend to the right place at the
right time, particularly in aviation, has focused on the visual world and pilots'
successful ability to look at the right instrument at the right time. The general
conclusion of research at NASA Langley is that pilots are fairly good at
attending to the right place at the right time.
On the other hand, there is also some good evidence that task scheduling and
information sampling are not always optimal. Accident reports may be cited in
which pilots have clearly "tunneled" their attention onto tasks of lower priority,
while neglecting those of higher priority (e.g., maintaining stability and safe
altitude). The Eastern Airlines crash into the Florida Everglades in 1972 is
perhaps the most prominent example. Furthermore, experiments done at Illinois
find that student pilots do not adequately postpone lower priority tasks when
workload becomes high.
There is some interesting research that Gopher (1991) has done with the Israel
Air Force which looks at ways to train pilots to better allocate their attention
flexibly between tasks. This training device was found to be fairly effective in
qualifying pilots for fighter aircraft duty.
Confusion
A second cause of poorly divided attention in doing two things at the same
time relates to confusion, a topic discussed in our section on HUDs. You can
think of two channels of information, and two responses, but the responses that
should have been made for B show up in A, and the responses that should have
been made for A show up in B. Recall our discussion of a pilot flying an HUD.
There is a motion in the outside runway because the plane changes attitude,
and the pilot interprets that motion as being motion on the HUD. This is an
example of confusion. One possible way of avoiding confusion between HUD
imagery and the far domain is by the use of color. Certainly confusion often
occurs in verbally dependent environments where there are two verbal messages
arriving at once; for example, a pilot listening to a copilot and simultaneously
listening to an air traffic controller. There is confusion when a message coming
from one person gets attributed to the other person, or when the digits or the
words in the two messages get confused. The main guideline to avoid confusion
is to maximize the differences between the voices. You are less likely to confuse
the voice of the copilot with the voice of the controller if one is male and the
other is female than if both are male or both are female. The same thing could
probably be said regarding digital voice messages. Make sure the voice quality
of the digital message is very distinctive and very clear, perhaps by making it
sound mechanical, which differs markedly from the voices typically heard on the
flight deck. Differences that help us to distinguish between voices include
location (or source) and pitch.

Resources
The third mechanism that is involved in timesharing and attention when doing
several things at a time is the concept of resources. We have limited capacity,
resources, or a supply of "mental effort" that is available for different tasks.
Because this limitation exists, the concept of processing resources is important
to the issue of pilot workload prediction and assessment, a topic to be discussed
later in the chapter. We allocate our limited attentional resources to tasks; as
we try to do two tasks at once, for example, fly and communicate, one task gets
a certain amount of resources and another task receives the remainder. Our
ability to do the two activities at once depends upon the demand of the task for
resources and the available supply. In discussing task demand and supply of
resources, psychologists describe a function that relates the level of performance
on a given task to the amount of resources that are invested in that task. This
function is known as the performance-resource function. If you take a very
difficult task, for example, flying through heavy turbulence and landing under
low visibility conditions, it requires a full investment of all of one's resources.
One hundred percent of the resources are required to obtain a given level of
performance, and that level of performance isn't very good. However, if you
consider an easy task, like cruising through clear weather, one can obtain very
good performance by only investing half of the attentional resources; and trying
harder (investing more resources) can't improve performance any further. You
can get maximum performance by giving only a small amount of your resources.
Figure 8.1 presents the performance-resource functions for an easy task (top), a
difficult task (bottom), and one of intermediate difficulty. The difference
between the bottom and top curve is important not only in the level of
performance that is attainable, but also in the amount of "residual resources"
that are available to devote to a second (concurrent) task. For the difficult task,
as for the intermediate one, any diversion of resources to a secondary task will
sacrifice its performance. But for the easy task, a good portion of resources can
be diverted with no loss in performance.
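The shape of these curves can be sketched numerically. The example below assumes a piecewise-linear performance-resource function, which is only one convenient form; the text gives the qualitative shape, not an equation. The parameter names and the values for the easy and difficult tasks are hypothetical.

```python
def performance(resources: float, max_performance: float, saturation: float) -> float:
    """Assumed piecewise-linear performance-resource function.

    resources: fraction of total resources invested (0.0 to 1.0).
    max_performance: best performance attainable on the task.
    saturation: fraction of resources at which performance stops improving.
    """
    if not 0.0 <= resources <= 1.0:
        raise ValueError("resources must be between 0 and 1")
    return max_performance * min(resources / saturation, 1.0)


def residual_resources(saturation: float) -> float:
    """Resources that can be diverted to a concurrent task at no cost."""
    return max(0.0, 1.0 - saturation)


# Hypothetical parameters for the two examples in the text.
EASY = {"max_performance": 1.0, "saturation": 0.5}       # cruising in clear weather
DIFFICULT = {"max_performance": 0.6, "saturation": 1.0}  # turbulence, low visibility
```

With these assumed values, the easy task reaches its maximum at half of the resources and leaves the other half as residual, while any diversion from the difficult task lowers its performance.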


The curves in Figure 8.1 are also related to training. Extensive practice on any
given task will shift the performance-resource function from the bottom, to the
middle, to the top curve. As the task can be performed with fewer resources, we
say that its performance has become automatized. Compare the middle and top
curves. Note that there are no differences in maximum levels of performance
between the intermediate and high skill level. But those with high skill will be
able to perform more automatically, and will allow successful performance of
concurrent tasks with the "residual resources." One important characteristic of
human resources is that they exist in more than one variety. The specific nature
of these "multiple resources" will be discussed in the following section on
workload prediction.

[Figure 8.1. Graph of how performance is a function of the (word illegible in source) of primary and secondary tasks. Axis labels: Resources Allocated to Primary Task; Resources Allocated to Secondary Task. (from Wickens, 1992)]


Our discussion of attention and timesharing in the previous section has set the
stage for the treatment of workload here. Figure 8.2 is one representation of
workload. Loosely speaking, we can think of workload as the relationship
between the capacity of a human operator and the demands of a system. That
human operator interacts with the system in two ways. First, he or she is
involved with control--doing things to it and watching what happens. Second,
he or she is also involved with putting effort into this performance, and the
system itself drains effort from the operator. The human and the system
together work under the influence of an environment. The human outputs
behavior. The system outputs performance. For example, in an aircraft, the
human is doing things to the control yoke, and the aircraft is performing (i.e.,
following some flight profile). The human also outputs workload, which is the
experience of the effort involved in controlling or monitoring the system. This is
what we measure when we measure workload, and these are the factors that
basically drive workload.

[Figure 8.2. Model of workload. Diagram labels: Human Operator, Workload, Strategy, Behavior, Effort, Control, Environment, System, Performance.]
There are a number of important case studies in which pilot workload has
played a major role. Right now a major issue in the Army is whether one or
two pilots should fly the LHX Light Attack Helicopter. That is very much of a
workload issue. Can one crew member manage the task load requirement with
sufficiently low workload to make it fly satisfactorily with sufficient residual
resources to handle the unexpected? An analogous choice was posed around
1980 regarding two- versus three-person flight crews on the generation of more
automated commercial aircraft (e.g., the Boeing 757). The President established
a workload task force to look at the issue of whether the flight engineer was
necessary. The decision came down to allow two-crew operations, in part,
because the mental workload was deemed to be allowable with this
complement. FAR 25.1523, Appendix D, talks about certifying aircraft for their
workload. In such certification, workload estimations are used to compare
systems. Does the old system impose less workload or more workload than the
new system? Workload is also relevant in examining the impact of data-link
based automation versus traditional communications with air traffic control.
Finally, there is the issue of using workload measures to examine the level of
training of a pilot. As we saw in the previous section, although two pilots may
fly the mission at the same level, if one flies with a lot less workload than the
other, does that make a difference in predicting how the pilots will do later on
or how well the pilots may transition from simulator training to the air?
What exactly is workload? How does workload relate to performance? How a
plane performs in terms of its landing or deviation from the flight path tells you
a good deal, but doesn't tell you all there is to know about the cost imposed on
pilot workload by flying the aircraft. A good metaphor for workload is of a
"dipstick to the brain." If workload depends upon this reservoir of resources we
have, as shown in Figure 7.1, we would like to be able to push a little dipstick
into the brain, find out how much workload there is, then just pull it out like
we measure the amount of oil in a car. We'd like to be able to say the
workload of this task is a 0.8 relative to some absolute capacity. This measure
of absolute workload is a goal we are a long way from achieving. We will
probably never be able to achieve it with a high degree of accuracy. Far more
realistic is being able to make judgments of relative workload; for example that
the workload of the new system is less than or greater than the workload of
the old system. This is different than saying the workload is excessive or not
excessive.
In addition to the distinction between absolute and relative workload measures,
a second distinction is between workload prediction and workload assessment. A
major objective of design is to be able to predict workload of an aircraft before
flying a mission, as opposed to assessing the workload of the pilot actually
flying. In this chapter we shall first contrast these two approaches: prediction
and assessment. While our discussion in these sections will focus on conditions
of overload (is workload excessive?), we will then turn to the other extreme of
work underload, and the closely allied issue of sleep disruption. Finally, the
chapter concludes with a discussion of human error, a topic closely related to
both underload and overload.
Workload Prediction

Timeline Analysis
The simplest model or technique for predicting workload is the timeline model.
The timeline model is based on the assumption that during any flight task, the
pilot, over time, performs a number of different tasks, and each task has some
particular time duration. Therefore, we can estimate the workload on the pilot
as being the proportion of total time that he or she has been occupied doing
something. When applying this method, it doesn't matter what the difficulty of
that task is. The only thing that matters is how long it takes to carry out the
task. It doesn't make much of a difference whether two tasks are done at the
same time or done at different periods of time. Timeline analysis has been
developed extensively in the work that Parks and Boucek (1989) have done at
Boeing, where they have developed specialized software for doing such analysis.
As shown in Figure 8.3, the Timeline Analysis Program (TLAP) simply codes a
time record by lines, whose vertical position indicates the type of task, and
whose length indicates the duration of time each task segment is performed.
The time line is divided up into lengths of equal duration. Then the program
sums within each unit of time the total amount of time the tasks are being
done and the total time available. It computes the fraction of the time required
to do each task and divides that by the time available within the interval. From
that, the software comes up with a workload score for each interval.
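The interval computation described above can be sketched in a few lines. This is an illustration of the time-occupancy idea only, not the TLAP software; the function name and the data layout are assumptions. Overlapping task segments are simply summed, so a score can exceed 100 percent.

```python
def interval_workload(tasks, interval, horizon):
    """Percent time occupancy for each scoring interval of a timeline.

    tasks: list of (start, end) task segments, in seconds.
    interval: length of each scoring interval, in seconds.
    horizon: total length of the timeline, in seconds.
    """
    scores = []
    t = 0.0
    while t < horizon:
        window_end = min(t + interval, horizon)
        # Time required: how long each task segment overlaps this window.
        required = sum(max(0.0, min(end, window_end) - max(start, t))
                       for start, end in tasks)
        # Workload score: time required divided by time available.
        scores.append(100.0 * required / (window_end - t))
        t = window_end
    return scores
```

For example, two overlapping 4-second tasks inside a 6-second interval give 8 seconds required against 6 seconds available, a score of about 133 percent. Note that task difficulty does not enter the calculation at all.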
The program can generate a chart for a particular activity that shows peaks and
valleys. Figure 8.3 shows an example of a workload time history profile. Using
such a technique, it is possible to establish a "red line" of absolute workload
level, a workload you would say is "excessive." Then you can determine where
design problems are in the epochs when the task demands exceed the red line.
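Locating the epochs that exceed a red line is then a simple scan over the interval scores. The sketch below assumes a list of per-interval workload scores in percent and a default red line of 100 percent time occupancy; the function name and the default are illustrative choices, and an analyst would set the limit to suit the system being assessed.

```python
def overload_epochs(scores, interval, red_line=100.0):
    """Return (start, end) times, in seconds, of runs where workload exceeds red_line.

    scores: workload score for each consecutive interval, in percent.
    interval: length of each interval, in seconds.
    """
    epochs = []
    start = None
    for i, score in enumerate(scores):
        if score > red_line and start is None:
            start = i * interval                   # an overload run begins
        elif score <= red_line and start is not None:
            epochs.append((start, i * interval))   # the run ends
            start = None
    if start is not None:                          # run continues to the end
        epochs.append((start, len(scores) * interval))
    return epochs
```

Each returned pair marks a stretch of the mission where the tasks scheduled by the analyst demand more time than is available, and therefore where the design deserves a closer look.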
As one example, Parks and Boucek (1989) carried out an analysis of their view
of the implication of the data-link system on flight crew workload. The scenario
they fabricated was one with a weather deviation, an approach to landing, some
major weather, a wind shear warning, missed approach, and a number of other
events. They first traced out the pattern of activities carried out by the
pilot-flying and the pilot-not-flying, under the conventional instrumentation and the
conventional interaction with controllers. The task analysis was then repeated
assuming their conception of the data-link system, which posited a data-link
display on which, at the bottom of the CDU, there was a message board that
presented the necessary information from the data-link (the automated
information given from the controllers).

[Figure 8.3. Example of workload time history profile as produced by Timeline Analysis Program. (from Parks & Boucek, 1989) Workload histogram; Crewmember: Captain; Flight Phase: Eng Start and Taxi; Channel: Right Hand; Configuration: Config. A. Horizontal axis: Time (Seconds). Vertical axis: WL = Time Required / Time Available, with a Peak Workload Limit line.]


The particular conclusions that they drew from this analysis are less important
than the simple illustration of the technique. The way in which they applied it
was one of looking at the change in workload for the copilot and for the pilot,
from the conventional system to the data-link system. Using a more detailed
analysis, they also broke down the tasks in terms of different channels of
human resources that were loaded: internal vision (vision that was head-down), external vision, the left-hand, the right-hand, cognitive activity, and
"auditive activity" (listening and speaking). They found that with the data-link
system for the copilot, there was a very substantial increase in internal vision;
the eyes were much less frequently out the window and far more focused on
head-down operations, because of the necessity of monitoring the CDU. Also,
there was much more left-handed activity. There was also less auditory activity
for the copilot, a reduction related of course to the decrease in voice
communications with ATC. A timeline for one of those particular channels,
internal vision, is shown in Figure 8.4 for the advanced flight deck with a
weather avoidance segment. Workload is plotted as a function of time in
seconds. The heavy black line indicates an increase from the data-link system
over the conventional system. The investigators found that at particular
locations in time, something about the mission drove internal vision above the
red overload line, where there is 100 percent workload (time occupancy). These
events had to do with monitoring data-link for heading and altitude,
concurrently with an instrument scan.

[Figure 8.4. Internal vision tasking, advanced flight deck, for weather avoidance. (from Groce & Boucek, 1987) Horizontal axis: Time (Seconds). Vertical axis: workload in percent. Plot labels: Increase Due to Data Link; Overload; The "Red Line."]

There are some other examples of timeline analysis. For example, McDonnell
Douglas has a slightly different version of a timeline program. Either version
provides a good way of auditing what the tasks are and where the potential
periods of peak overload may be. The technique has certain limitations however
because it assumes that the workload of a task is only defined by how long it
takes and not how intensive or demanding it is. We all know intuitively that
there is a difference between how long something takes and how much demand
it imposes on our mental process. For example, the pilot may have to retain
three digits of information from ATC in short-term memory for five seconds, or
seven digits of information in short-term memory for five seconds. Either way,
that task takes five seconds, but certainly keeping seven digits in mind is more
demanding on our mental resources than keeping three digits in mind.
Similarly, flight control with an easily controlled system may involve just as
much stick activity but a lot less cognitive demand than flight control with a
system that has long lags and is very difficult to predict. Timeline analysis
doesn't really take into account the demand of the tasks.
A second problem is that the way timeline analysis is derived, the definition of
a task is usually something you can see the operator doing, and it doesn't
handle very well the sort of cognitive thinking activities that pilots go through
(planning, problem solving), although timeline analysis is beginning to address them.
A third problem is that timeline analysis doesn't account for the fact that certain
tasks can be timeshared more easily than others. Pilots can do a fairly good job
of controlling the stick at the same time they are listening. Visual and vocal
activity can be timeshared very easily. Visual and manual activities can be less
easily shared. In other words, scanning the environment at the same time as
entering information into a keyboard is much more difficult than speaking to a
controller while looking outside the cockpit. Rehearsing digits is also quite
difficult while talking or listening. Timeline analysis does not account for the
fact that certain tasks are easy to timeshare and others are hard. These
differences in timesharing will be elaborated below when we discuss multiple
resources.
Finally, a fourth problem is that timeline analysis is fairly rigid. It sets up a
timeline in advance and sees where different tasks will be performed, but in
reality, pilots do a fairly good job of scheduling and moving tasks around. So if
two tasks overlap in time according to the timeline set up by the analyst, pilots
may simply postpone one in a way that avoids overlap.
Elaborations of Timeline Analysis
There are a number of more sophisticated workload prediction techniques that
address some of these limitations of timeline analysis. Table 8.1 shows workload
component scales for the UH-60A mission/task/workload analysis. It is an
attempt by Aldrich, Szabo, & Bierbaum (1989), who have been working with
the Army on the helicopter design to code the tasks in terms of how demanding
or how difficult they are. The left column has a number for the difficulty scale
of the task. A higher number means the task is more difficult. The first task on
the list is "Visually Register/Detect (Detect Occurrence of Image)." It has a
difficulty value of 1. The authors have also defined six channels of task
demand, analogous in some respects to the different channels used by Boeing.
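One way to picture how such ratings extend a timeline is to total the scale values of all tasks active at a given moment, separately for each channel. The sketch below is an illustration only: summing the ratings within a channel is an assumed aggregation rule, and the function name and data layout are hypothetical rather than taken from Aldrich, Szabo, & Bierbaum (1989).

```python
from collections import defaultdict


def channel_load(tasks, t):
    """Sum the demand ratings of all tasks active at time t, per channel.

    tasks: list of dicts with "start" and "end" (seconds) and "demands",
    a mapping from channel name to a scale value (1.0 to 7.0).
    """
    load = defaultdict(float)
    for task in tasks:
        if task["start"] <= t < task["end"]:
            for channel, rating in task["demands"].items():
                load[channel] += rating   # assumed rule: ratings add within a channel
    return dict(load)
```

Unlike a pure time-occupancy score, two tasks of equal duration now contribute differently depending on how demanding each one is and on which channels it loads.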
Another way of accounting for the demands of a task is through a demand
checklist. That is, if you do an analysis of the task that a pilot has to do, there
are certain characteristics of any given task that influence whether it is difficult
or easy, independent of how long it takes. Consider, for example, the
signal-to-noise ratio. It obviously is a lot easier to search for a runway if it is clearly
defined than if it is partially masked by poor visibility. Other characteristics that
influence display processing demand are the discriminability between different
display symbols, the clutter on a display, the compatibility between a display and
its meaning, as discussed in the earlier chapter, and the consistency of
symbology across displays. Variables that influence the demand for central
processing resources are the number of modes in which a system may operate,
the requirements for prediction, the need for mental rotation (as a pilot must
often do when using an approach plate to plan a south-flying approach), the
amount of working memory demands (time and number of chunks), and the
need to follow unprompted procedures. Demands on response processes are
imposed by low S-R compatibility, the absence of feedback from action, and the
need for precision of action.


Table 8.1
Workload Component Scales for the UH-60A Mission/Task/Workload Analysis

Scale
Value   Descriptors

        Visual-Unaided
1.0     Visually Register/Detect (Detect Occurrence of Image)
3.7     Visually Discriminate (Detect Visual Differences)
4.0     Visually Inspect/Check (Discrete Inspection/Static Condition)
5.0     Visually Locate/Align (Selective Orientation)
5.4     Visually Track/Follow (Maintain Orientation)
5.9     Visually Read (Symbol)
7.0     Visually Scan/Search/Monitor (Continuous/Serial Inspection,
        Multiple Conditions)

        Visual-Aided (Night Vision Goggles [NVG])
1.0     Visually Register/Detect (Detect Occurrence of Image) With NVG
4.8     Visually Inspect/Check (Discrete Inspection/Static Condition) With NVG
5.0     Visually Discriminate (Detect Visual Differences) With NVG
5.6     Visually Locate/Align (Selective Orientation) With NVG
6.4     Visually Track/Follow (Maintain Orientation) With NVG
7.0     Visually Scan/Search/Monitor (Continuous/Serial Inspection, Multiple
        Conditions) With NVG

        Auditory
1.0     Detect/Register Sound (Detect Occurrence of Sound)
2.0     Orient to Sound (General Orientation/Attention)
4.2     Orient to Sound (Selective Orientation/Attention)
4.3     Verify Auditory Feedback (Detect Occurrence of Anticipated Sound)
4.9     Interpret Semantic Content (Speech)
6.6     Discriminate Sound Characteristics (Detect Auditory Differences)
7.0     Interpret Sound Patterns (Pulse Rates, Etc.)

        Kinesthetic
1.0     Detect Discrete Activation of Switch (Toggle, Trigger, Button)
4.0     Detect Preset Position or Status of Object
4.8     Detect Discrete Adjustment of Switch (Discrete Rotary or Discrete Lever
        Position)
5.5     Detect Serial Movements (Keyboard Entries)
6.1     Detect Kinesthetic Cues Conflicting with Visual Cues
6.7     Detect Continuous Adjustment of Switches (Rotary Rheostat,
        Thumbwheel)
7.0     Detect Continuous Adjustment of Controls

        Cognitive
1.0     Automatic (Simple Association)
1.2     Alternative Selection
3.7     Sign/Signal Recognition
4.6     Evaluation/Judgment (Consider Single Aspect)
5.3     Encoding/Decoding, Recall
6.8     Evaluation/Judgment (Consider Several Aspects)
7.0     Estimation, Calculation, Conversion

        Psychomotor
1.0     Speech
2.2     Discrete Actuation (Button, Toggle, Trigger)
2.6     Continuous Adjustive (Flight Control, Sensor Control)
4.6     Manipulative
5.8     Discrete Adjustive (Rotary, Vertical Thumbwheel, Lever Position)
6.5     Symbolic Production (Writing)
7.0     Serial Discrete Manipulation (Keyboard Entries)

(from Aldrich, Szabo, & Bierbaum, 1989)

These are a series of guidelines that can be used to predict the amount of load
on a task. There are other approaches to predicting task demand as well. Parks
and Boucek have used an information complexity measure for computing task
demands. However, what has been discussed up to now has still been a view of
attention that really assumes that there is one pool of resources that are used
for all tasks, or a series of separate and completely independent channels. That
assumption of how the attentional system works is not in line with the fact that
not all of the interference between tasks can be accounted for by difficulty. For
example, entering data into a keyboard interferes a lot more with flying
performance when it is done manually than when it is done by voice. When we
change the structure of the task like this we can sometimes find a large
difference in the amount of interference with flying. We also find another
characteristic of dual task performance which indicates that not all tasks
compete for the same resources, and this is called difficulty insensitivity. This is a
situation when increasing the difficulty does not increase the interference with
another task. Given the assumption that there is one pool of resources, then if
we make one task more difficult, we pull resources away from the other task,
and the performance of the other task ought to decline. But there are situations
when this doesn't happen. For example, we can increase the difficulty of flying
and a pilot's ability to communicate will not change much unless the flying
becomes very, very difficult.
Multiple Resources
The above findings and others suggest that there is not a single pool of
resources, but rather that there are multiple resources. So to the extent that two
tasks share many common characteristics, and therefore common resources, the
amount of interference between them will increase. For example, if we have
two tasks that both demand the same resource, like controlling aircraft stability
while adjusting a navigational instrument, there will be a trade-off in
performance between them. However, if we have one task that demands
resource A, and a second task that demands resource B, like listening, while
flying a coordinated turn, there will be little or no mutual interference. As an
analogy, if you have one home that relies on gas, and another home that relies
on oil, there is not going to be any competition for heating resources between
these homes if, say, the demand for gas suddenly increases.

A second characteristic of multiple resources is that we can talk about
increasing the workload of a task, in terms of increasing the demands on a
specific type of resource. If this resource is also shared with concurrent tasks,
the difficulty increase will be more likely to lead to a loss of performance. In
other words, if two tasks demand the same resources, there will be a trade-off
between the difficulty of one and performance of the other. If they use different
resources, we can change the demand of one and not affect the performance of
the other.

We have argued elsewhere that there are three distinctions that define
resources. First, auditory resources are different from visual resources.
Therefore, it is easier to divide attention between the eye and ear than between
messages from two visual sources or two auditory sources. Second, the
resources that are used in perceptual and cognitive processes in seeing, hearing,
and understanding the world are different from the resources that are involved
in responding, whether with the voice or with the hands. Third, we have
contrasted spatial and verbal resources.
As we are perceiving words on a printed page or spoken words, we are using
verbal resources. When employed in central processing, we use verbal resources
for logical problem solving, rehearsal of digits or words, and mental arithmetic.
For a pilot this could involve rehearsing navigational frequencies given by ATC
or computing fuel problems. Anything that has to do with the voice uses verbal
response resources.
In perceiving spatial information, we do a variety of things. We do visual
search; we process analog quantities like moving tapes or moving meter
displays. We also process flow fields, that is, estimate the velocity over the
ground, from the flow of texture past the aircraft. We recognize spatial patterns
on maps, to help form a guidance of where to fly. Spatial central processing
involves imagining the airspace, or mentally rotating maps from say a north-up
to a heading-up orientation. Spatial responses are anything that involves
manually guiding the hands, fingers, feet or eyes through space: using the
control yoke, the rudder pedals, and the keyboards or engaging in visual search.
Thus the idea behind multiple resources models is that you can predict how
tasks will interfere with each other or how much workload will be experienced
not only by how long those tasks take to perform and by how demanding those
tasks are, but also by the extent to which two tasks demand common resources.
There are now a number of different efforts in the research design community,
more directly focused on military systems, that have elaborated upon versions of
multiple resources theories to come up with computation models that will take
a timeline and a task demand coding, and make predictions of the workload on
the pilot. Both Honeywell and the Boeing people have been involved in
developing a model of this sort (North & Riley, 1989).
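
As a rough illustration of how a demand coding and resource overlap might be
combined, the following sketch scores a pair of concurrent tasks. The three
dimensions follow the distinctions described above (modality, processing stage,
and spatial versus verbal code). The Task structure, the demand values, and the
scoring rule are all illustrative assumptions, not the models cited.

```python
from dataclasses import dataclass


# Hypothetical coding of a task along the three resource distinctions in the
# text. The demand values below are illustrative only.
@dataclass(frozen=True)
class Task:
    name: str
    modality: str   # "visual" or "auditory"
    stage: str      # "perceptual-cognitive" or "response"
    code: str       # "spatial" or "verbal"
    demand: float   # task difficulty on an arbitrary scale


def shared_dimensions(a: Task, b: Task) -> int:
    """Count how many of the three resource dimensions two tasks share."""
    return sum((a.modality == b.modality, a.stage == b.stage, a.code == b.code))


def interference(a: Task, b: Task) -> float:
    """Assumed rule: total demand scaled by the fraction of shared dimensions."""
    return (a.demand + b.demand) * shared_dimensions(a, b) / 3


flying = Task("fly coordinated turn", "visual", "response", "spatial", 4.0)
keying = Task("keyboard data entry", "visual", "response", "spatial", 4.0)
listening = Task("listen to ATC", "auditory", "perceptual-cognitive", "verbal", 4.0)

print(interference(flying, keying))     # all three dimensions shared
print(interference(flying, listening))  # no dimensions shared
```

Under this assumed rule, two equally demanding task pairs receive very
different scores depending only on how many resources they have in common,
which is the central prediction of the multiple resources view.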

Workload Assessment
A framework for understanding workload assessment is presented in Figure 8.5,
which plots along the horizontal axis the resources demanded by a task or set
of tasks. The farther to the right on this axis, the more tasks the pilot is
having to do or the more difficult those tasks are. The pilot has available
multiple resources that can be given to those tasks. These can be supplied up to
a maximum, shown as the horizontal dashed line. As the graph moves from doing
nothing at all (on the left end) to doing something that is moderately difficult
at the middle of the graph, more resources are demanded but the pilot can
adequately supply those resources, so there is a nice linear supply-demand
curve. As long as this linear function remains, resource supply keeps up with
demand, and the pilot's performance is going to be perfect. This region where
supply satisfies demand is called the "underload" region. By underload, we don't
really mean the region of boredom where the pilot is doing nothing at all, but
rather the region where he is not asked to do more than can possibly be done.

[Figure 8.5. Graph of resources supplied, reserve resources, and primary task
performance against resources demanded (more tasks, more difficult tasks),
showing the maximum resource supply and the underload and overload regions
(from Wickens, 1982).]


If you look at how well the pilot is performing the task at hand (e.g.,
maintaining the flight path) when demands are in the left side of the graph,
what you will see as demand increases is not a change in performance but less
and less reserve resources available to do other things. As we push the demand
beyond the maximum supply at the middle of the figure, the pilot is getting into
the "overload" region. There is an excess of demands and the pilot needs more
resources than he can give. As a result, performance of the task of interest is
going to begin to deteriorate. The measurement of workload requires looking
across this whole range of task demands, from underload to overload. This
suggests that how we measure workload may vary depending on where the pilot
falls in the underload and overload regions. At the left, we must measure
residual resources. At the right, we may measure performance directly. Four
major techniques of measuring workload are generally proposed: measuring the
primary task itself, measuring performance on a secondary task, taking
subjective measurements, and recording physiological measurements.
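
The supply-demand relationship just described can be sketched numerically. In
this minimal sketch the maximum supply is normalized to 1.0, and the rule for
how performance degrades in overload (the fraction of demand that can be met)
is an assumption made only for illustration; the text says only that
performance begins to deteriorate.

```python
def resource_state(demand: float, max_supply: float = 1.0) -> dict:
    """Return supplied resources, reserve, performance, and region for a demand."""
    supplied = min(demand, max_supply)   # supply tracks demand up to the maximum
    reserve = max_supply - supplied      # residual resources left for other tasks
    # Assumed rule: perfect while supply meets demand, then proportional to the
    # fraction of demand that can still be met.
    performance = 1.0 if demand <= max_supply else max_supply / demand
    region = "underload" if demand <= max_supply else "overload"
    return {
        "supplied": supplied,
        "reserve": reserve,
        "performance": performance,
        "region": region,
    }


for demand in (0.25, 0.75, 2.0):
    print(demand, resource_state(demand))
```

In the underload region only the reserve changes, which is why residual
resources must be measured there; in the overload region the reserve is zero
and performance itself becomes the informative measure.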
Primary Task Performance Measures

In aviation, the critical primary task is flight performance. How well is a pilot
actually doing keeping the plane in the air along a predefined flight path
trajectory? The direct measure of primary task performance might be some
measure of error or deviations off of that trajectory. However, it is also
important to measure not only performance, but some index of control activity;
that is, how much effort the pilot is putting into keeping the plane on the
trajectory. We need to measure control activity because we can get two aircraft
that fly the same profile with the same error, but one requires a lot of control
activity and one needs very little control activity. It turns out that one good
measure of control activity is the open loop gain, which is the ratio of the pilot's
control output (yoke displacement) to a given flight path deviation.
Figure 8.6 shows the relationship between gain (effort) and error. The upper
left box represents a timeline of a pilot flying a particular profile under low
workload because there is little error and little control effort being made. This
is an unambiguous measure of low workload; performance (flight path error) is
good and effort is low. In the upper right box, we have a situation where the
error is low but the pilot is putting in a lot of control activity to maintain that
low error. We would see there is a high gain or high effort invested in the
flight performance. This is probably a high workload situation and suggests that
there is some sort of control problem. That is, some sort of problem in the way
the information is represented or the handling of the aircraft, so it is taking a
lot of effort to keep the plane flying steadily. This situation may also reflect
flying in high turbulence.

[Figure 8.6. Relationship between gain and error (original figure). Gain
(effort) runs from low to high across the columns, and error from low to high
down the rows. Low gain, low error: low workload. High gain, low error: high
workload, control problems. Low gain, high error: high workload, neglect. High
gain, high error: high workload.]

In the lower left box is represented the opposite situation in which there is not
much control activity going on, but there is a fairly high amount of error. It is
almost as if the plane is flying through turbulence and the pilot is not doing
anything with the stick. This pattern may very well signal neglect, where the
pilot is neglecting the flight control and allocating resources to something
else, such as system problems or problems with other aspects of the aircraft.
It is also an
indicator that there is high workload, but the high workload is not associated
with the flight control itself, but with some aspect of the aircraft environment.
Finally, the lower right box shows the worst situation, in which the pilot is
producing a lot of control activity and is still generating a lot of error for
whatever reason. Thus there is very high workload in this situation.
The important point illustrated in this figure is that looking at performance of
the primary task itself as an indicator of workload is not sufficient. You have to
look jointly at performance of the system and at the behavior of the pilot.
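
The joint reading of gain and error can be sketched as a simple four-way
classification. This is only an illustration: estimating gain as the ratio of
RMS control output to RMS flight path deviation, and the two threshold values,
are assumptions chosen for the example rather than values from the text.

```python
import math


def rms(values):
    """Root-mean-square of a sequence of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def estimate_gain(control, deviation):
    """Assumed estimate: RMS control output over RMS flight path deviation."""
    return rms(control) / rms(deviation)


def classify(gain, error, gain_threshold, error_threshold):
    """Map a (gain, error) pair onto the four boxes of the gain-error figure."""
    high_gain = gain > gain_threshold
    high_error = error > error_threshold
    if not high_gain and not high_error:
        return "low workload"
    if high_gain and not high_error:
        return "high workload: control problems"
    if not high_gain and high_error:
        return "high workload: neglect"
    return "high workload"


# Hypothetical traces: small deviations held down by large control inputs.
deviation = [1.0, -1.0, 1.0, -1.0]
control = [3.0, -3.0, 3.0, -3.0]

gain = estimate_gain(control, deviation)
error = rms(deviation)
print(gain, error, classify(gain, error, gain_threshold=2.0, error_threshold=2.0))
```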


Secondary Task Performance


A second approach to workload measurement is the secondary task. This
technique assesses the extent to which the pilot has enough residual resources
to perform another, secondary task at the same time as a primary task without
letting performance in the primary task drop. When doing a difficult primary
task, if we give the pilot a secondary task, he is going to either have no
resources for that secondary task, or, if resources are diverted, performance
of the primary task is going to drop (Wickens, 1991).
One example of a secondary task is time estimation. Suppose the pilot is flying
along and is asked to give a voice report every time he thinks 10 seconds has
passed. Time estimation generally becomes more variable and the intervals
longer as the workload increases. Another secondary task that has received a
fair amount of interest is the task of a memory comparison. While flying along,
the pilot hears a series of probe signals. Maybe they represent call signs. Every
time he hears the call sign of his own aircraft, he presses a button. Every time
he hears the call sign of another aircraft, he does nothing. So he compares each
call sign to his memory. If it matches he responds. This task is sometimes called
the Sternberg Task. The response time to acknowledge call signs is longer with
higher levels of workload. Random number generation is another possible
secondary task. The pilot is asked to generate a series of random numbers and
the more difficult the primary task, the less random the numbers become.
Another secondary task is the critical instability tracking task, in which a second
tracking task is built into the pilot's primary flight control loop. Error on this
task directly reflects the difficulty of the flight dynamics of the primary task.
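
As one example of how a secondary task might be scored, the sketch below
summarizes a time estimation task from the times at which the pilot made each
voice report. The function name, the use of the standard deviation as the
variability measure, and the sample report times are assumptions for
illustration.

```python
from statistics import mean, stdev


def time_estimation_summary(report_times, target=10.0):
    """Summarize produced intervals from a list of report times in seconds."""
    intervals = [later - earlier
                 for earlier, later in zip(report_times, report_times[1:])]
    return {
        "mean_interval": mean(intervals),             # longer under high workload
        "variability": stdev(intervals),              # larger under high workload
        "mean_error": mean(intervals) - target,       # signed error from the target
    }


# Hypothetical report times for the same pilot under two conditions.
low_workload = [0.0, 10.0, 20.0, 30.0, 40.0]
high_workload = [0.0, 11.0, 25.0, 36.0, 52.0]

print(time_estimation_summary(low_workload))
print(time_estimation_summary(high_workload))
```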
All of these types of secondary tasks have various problems. One problem they
have in common is that they are all sensitive to multiple resources. If you have
a secondary task that demands resources that are different from the primary
task, you are going to underestimate workload. If you have a primary task that
is heavy, in terms of perceptual-cognitive load--rehearsing digits would be a
good example--and you have a secondary task that is heavily motor, like
performing a critical tracking task, it is like you are looking in one corner of a
room for something that exists in a different part of the room. So you need to
have your secondary tasks demand the same resources as the primary task.
Perhaps even more critical, at least for in-flight secondary task measures of
mental workload, is the problem of intrusiveness. We can all imagine the
resistance that a pilot would give if he were trying to fly the aircraft through
high workload conditions, and at the same time he had to generate a continuous
stream of random numbers, or had to continuously control a side-tracking task.
He simply wouldn't want to do it. This is the biggest bottleneck towards the
introduction and the use of secondary tasks--they tend to be intrusive into the
primary task and disrupt the primary task; and this is a major problem when
the primary task is one involving a high-risk environment (i.e., in-flight
recording, rather than simulation).
A solution to the problem of intrusiveness is a technique called the embedded
secondary task; that is, use of a secondary task which is an officially designated
part of the pilot's primary responsibilities, but is fairly low in the hierarchy of
importance for the pilot. In flying, there is a certain intrinsic task priority
hierarchy. For example, there is the standard command hierarchy to aviate,
navigate, and communicate in that order of priority. With more precision we
can further rank order tasks in terms of those that have very high priority, say
maintaining stability of the aircraft, those of extremely low priority, like
answering service calls from the back of the aircraft, and those things in
between. The idea behind this prioritization scheme is that as the workload
increases from low to high, the lowest priority tasks are going to drop out, so
when the workload is very, very high, the only thing that will be left to do is
the highest priority task. Thus good embedded measures of secondary tasks are
those tasks that are naturally done but are lower down in the priority hierarchy.
An example might be acknowledging call signs. To the extent that this is a
legitimate part of the communication channel, one can measure how long it
takes the pilot to acknowledge the call sign as an embedded secondary task.
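
The prioritization scheme above can be sketched as a simple shedding rule. The
task names follow the examples in the text, but the priority numbers, the
demand values, and the idea of representing rising workload as a shrinking
available capacity are assumptions made for illustration.

```python
def tasks_retained(tasks, capacity):
    """Keep tasks in priority order until the available capacity is used up.

    tasks: list of (name, priority, demand); a lower priority number means a
    more important task.
    """
    kept, used = [], 0.0
    for name, _priority, demand in sorted(tasks, key=lambda task: task[1]):
        if used + demand > capacity:
            break  # this task and everything of lower priority is shed
        kept.append(name)
        used += demand
    return kept


# Hypothetical demands on an arbitrary scale.
tasks = [
    ("answer cabin call", 4, 1.0),
    ("aviate", 1, 3.0),
    ("navigate", 2, 2.0),
    ("communicate", 3, 1.5),
]

for capacity in (7.5, 5.0, 3.0):
    print(capacity, tasks_retained(tasks, capacity))
```

An embedded secondary task would be one of the tasks near the bottom of this
ordering, since those are the first to show degradation as capacity runs out.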
Our research has indicated that airspeed control is a good embedded secondary
task. The control of airspeed around some target is of lower priority, or at least
seems to be reduced in its accuracy more, when the demands for the control of
the inner-loop flight path error (heading and altitude error) become excessively
difficult. So as the demand goes up, the airspeed errors seem to increase, more
so than do the other types of errors.
Subjective Measures of Workload

The third category of workload measures, which is often the most satisfactory to
the pilot, is the subjective measure. There are a number of different techniques
of subjective workload measurement. One is a unidimensional scale. An example
of this is the Bedford Scale shown in Figure 8.7a, and involves a decision tree
logic. There are a series of questions: Was workload satisfactory without
reduction? Was workload tolerable for the task? Was it possible to complete the
task? If the answer is yes or no, then you go on up to some higher levels that
eventually allow you to categorize the workload of a task on a 10-point scale.
Similar to the Bedford Scale is the modified Cooper-Harper Scale (Figure 8.7b),
which is taken more directly from the Cooper-Harper scale of flight handling
quality, but now has questions phrased in terms of workload. The important
point is that you can get a single number, and that number is guided by a
series of verbal decision rules about how it is that you ought to interact with
the task. Both of these unidimensional scales, the Bedford and the modified
Cooper-Harper, are simple. Because they are simple, they have a certain amount
of ambiguity. It is not always clear why a task is rated difficult, because the
scale won't tell you if it is difficult, for example, because it had difficult
response characteristics, or because the displays were hard to interpret, or there
was heavy time pressure or heavy cognitive demands, etc.

[Figure 8.7a. The Bedford pilot rating scale. Decision tree questions: Was
workload satisfactory without reduction? Was workload tolerable for the task?
Was it possible to complete the task?

RATING   WORKLOAD DESCRIPTION
WL 1     Workload insignificant
WL 2     Workload low
WL 3     Enough spare capacity for all desirable additional tasks
WL 4     Insufficient spare capacity for easy attention to additional tasks
WL 5     Reduced spare capacity: additional tasks cannot be given the desired
         amount of attention
WL 6     Little spare capacity: level of effort allows little attention to
         additional tasks
WL 7     Very little spare capacity, but maintenance of effort in the primary
         tasks not in question
WL 8     Very high workload with almost no spare capacity, difficulty in
         maintaining level of effort
WL 9     Extremely high workload. No spare capacity. Serious doubts as to
         ability to maintain level of effort
WL 10    Task abandoned. Pilot unable to apply sufficient effort]

[Figure 8.7b. The Cooper-Harper rating scale. Decision tree questions: Is it
controllable? Is adequate performance attainable with a tolerable pilot
workload? Is it satisfactory without improvement? The answers lead to a pilot
rating.]
Multidimensional scales, in contrast, assume that there are several dimensions
underlying subjective workload, and reveal what these dimensions are. The two
major candidates for multidimensional scales are the Subjective Workload
Assessment Technique (SWAT) and the NASA TLX Scale. The SWAT, which was
developed for the Air Force at Wright Patterson AFB, assumes that we
experience workload in terms of three dimensions: the time demands of the
task, the effort of the task, and the stress the task imposes on us. It asks the
pilot to indicate for each of these scales, on a three-point rating, whether the
time, effort, and stress levels are low, medium, or high. By a fairly elaborate
procedure which uses all 27 possible workload ratings derived from low,
medium, and high combinations for each of these three scales, it is possible to
determine which scale is more important for a particular pilot. This procedure is
used as a way of coming up with a single measure of workload from these
three ratings on each of the different scales.
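
To make the arithmetic concrete, the sketch below enumerates the 27 rating
combinations and collapses one set of ratings into a single number. The
weighted average used here is a simplified stand-in, not the SWAT sorting and
scaling procedure itself, and the weights are hypothetical values standing for
how important each dimension is to a particular pilot.

```python
from itertools import product

LEVELS = {"low": 1, "medium": 2, "high": 3}
DIMENSIONS = ("time", "effort", "stress")

# Every possible combination of a three-level rating on three dimensions.
combinations = list(product(LEVELS, repeat=len(DIMENSIONS)))
print(len(combinations))


def combined_rating(ratings, weights):
    """Assumed rule: weighted mean of the three ratings, rescaled to 0-100."""
    total_weight = sum(weights[d] for d in DIMENSIONS)
    score = sum(weights[d] * LEVELS[ratings[d]] for d in DIMENSIONS) / total_weight
    return 100 * (score - 1) / 2  # map the 1-3 range onto 0-100


# Hypothetical weights for a pilot who finds time pressure most important.
weights = {"time": 5, "effort": 3, "stress": 2}
ratings = {"time": "high", "effort": "medium", "stress": "low"}
print(combined_rating(ratings, weights))
```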
Two major problems have been found with the SWAT technique. The sorting
procedure it uses, which seems to be a mandatory part of SWAT, is
time-consuming. The other problem has to do with the scale resolution; that is,
SWAT only allows you to say that workload is low, medium, or high on each
scale. If you consider your own flight experience, you are able to give a lot
more precision to workload than three levels. You have more power of
discrimination between the resource demands of the task than simply low,
medium, and high. What happens when only three rating levels are available is
that people tend to choose the middle level, and pretty soon you don't get
much resolution at all.
A different technique, as an alternative to SWAT, is the NASA Task Load
Index, or TLX scale. This was developed by Sandra Hart at NASA and assumes
that there really are six dimensions of subjective workload: mental demand,
physical demand, temporal (time) demand, the level of performance the pilot
thinks he or she has achieved, amount of effort, and frustration level with the
task. For each of these, there is a verbal description of what it means, and,
furthermore each of these different demand levels can be rated on a 13-point
scale. You do it by putting a mark on a piece of paper somewhere along the
13-point scale. The scale gives the pilots more freedom and flexibility to rate on
different dimensions without a lot of extra effort, and probably provides more
information. In fact, some comparisons of how well the two different scales
have differentiated loads indicate that the TLX scale does a better job than the
SWAT. TLX also has a procedure that allows the six dimensions to be combined
into a single workload rating. For many purposes, the single-dimensional rating
scales are probably adequate for picking up most of what there is in workload.
There are really three problems with subjective workload measures. One of
them is response bias. If you are simply asking for a rating of workload, we all
know there are individual differences among pilots. One may not ever admit
that the workload is greater than three, no matter how difficult things are.
Another may be very quick to admit to high levels of workload whether they
exist or not. A second problem with subjective workload measures is related to
memory. An example would be if we were evaluating two tasks, flown on two
different systems, and the pilot is asked to compare their workload. Since the
pilot's memory for the first one may have degraded, he may not be able to
make an accurate judgment based on memory. The third problem with
subjective workload measures is that they do not always agree with
performance. It sometimes happens that when two systems are compared, one
gives better performance than the other. However, the one that gives better
performance is, in fact, shown to have higher measures of subjective workload.
Which measure should then be trusted by the designer?
Physiological Measures of Workload
The fourth category of workload measures is physiological measures. Several
of these have been proposed: heart rate (both mean rate and variability), visual
scanning, blinking and various measures of electroencephalogram (EEG) that
can measure fatigue and, finally, the evoked potential, the momentary changes
in the EEG that are caused by a discrete event, like the sudden onset of a light
or a tone. The prevailing view is that most of these techniques have some uses,
but as far as being reliable measures of pilot workload, particularly in civil
aviation, there are more problems than there are benefits. The most successful
measures appear to be those that relate to heart rate. Here, there are two
specific measures. There is the mean heart rate. That is, the number of beats
per minute. The faster the heart beat, presumably the higher the level of mental
workload. That does hold true more or less, but there are other factors,
unrelated to mental workload that cause the heart to beat fast. Certainly two of
these are arousal and stress. Another one is simply physical load. So in a
physically taxing environment, even though the mental workload may be low,
the heart rate may still be very rapid. Thus the mean heart rate is not a terribly
good indicator of mental workload by itself.


A better measure of cognitive load is the variability of the heart beat interval
(Vicente, et al. 1987). It has been found that as the workload gets higher, the
variability of the heart rate gets lower. Figure 8.8 shows some data taken at
Wright Patterson (Wilson, et al. 1988). It is a timeline of two minutes which
plots, at the bottom, the interval between each heartbeat. The fact that the
curve oscillates suggests that the heartbeat interval is itself variable. In
some periods the beats are close together, then they get slower, then they get
faster, then they get slower. So this oscillation represents variability in the
inter-beat interval. The overall level represents the overall inter-beat
interval or the mean heart rate, plotted at the top. When the level is low, that
means the heart is beating very fast. In the figure, note that at 35 seconds
into the flight test, a bird struck the windshield. This was a fairly traumatic
event, and you can see very dramatically an increase in heart rate (decrease in
the inter-beat interval) and a reduction in the variability. So both emotional
stress and the cognitive load of dealing with this unexpected event made the
heartbeat faster and caused much less variation. Figure 8.9 (top) shows another
case of relatively low variability in heartbeat, indicating high workload.
Figure 8.9 (bottom) shows the change from high to low variability (low to high
workload) with little corresponding change in emotional load.

[Figure 8.8. Graph plotting inter-beat intervals for heartbeat over a two-minute
period. A bird strike appears at approximately 35 seconds (from Wilson, et al.
1988).]
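
The two heart rate measures can be computed from a series of inter-beat
intervals as sketched below. Using the standard deviation of the intervals as
the variability measure is an assumption for illustration, since the text does
not specify a particular statistic, and the interval values are hypothetical.

```python
from statistics import mean, stdev


def heart_measures(inter_beat_ms):
    """Return mean heart rate (beats per minute) and inter-beat variability (ms)."""
    mean_interval = mean(inter_beat_ms)
    return {
        "mean_rate_bpm": 60000 / mean_interval,      # 60,000 ms in one minute
        "ibi_variability_ms": stdev(inter_beat_ms),  # assumed variability measure
    }


# Hypothetical inter-beat intervals in milliseconds.
routine_flight = [800, 900, 750, 950, 800, 900]
after_bird_strike = [600, 610, 595, 605, 600, 590]

print(heart_measures(routine_flight))
print(heart_measures(after_bird_strike))
```

In the second series the intervals are both shorter (faster heart rate) and
more uniform (lower variability), the pattern described for the bird strike.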
Collectively, it is hard to say which technique of workload measure is best. In
civil aviation tests by both Airbus Industrie and Douglas there has been some
success with the physiological measures. The best approach is probably one that
involves comparisons across primary task performance measures, and embedded
secondary tasks, augmented by subjective and possibly physiological measures,
with an emphasis on the heart rate measures.

[Figure 8.9. Graph plotting inter-beat time intervals for heartbeat over a
two-minute period. Note the reduction in variability at t=40, with no
corresponding change in mean heart rate (from Wilson, et al. 1988).]
A Closed-Loop Model of Workload
The traditional view of workload has involved a fairly static concept expressed
in Figure 8.10a, which proposes that there are a certain number of things that
we could call drivers of workload. These are things that vary in a task or
environment to increase the workload. Drivers of workload are task
requirements, available resources, time available, and operator experiences.
Drivers imposed on a task change the physical and mental actions required for
the task and produce workload and performance as a result. This is an
open-loop approach to workload. Simply stated, something is done to the operator,
and it produces workload.
More recently, a dynamic closed-loop concept of workload has been proposed
(Hart, 1989). This is illustrated in Figure 8.10b. The FAA, NASA, and the Air
Force have cooperatively sponsored a program to look at workload as a more
dynamic and adaptive phenomenon. As in the static concept, all of the drivers
of workload are again represented. But there are also a set of fairly
sophisticated cognitive activities, assumed to be carried out by the pilot. These
include planning, setting priorities, establishing a schedule, allocating effort,
focusing attention on certain tasks, ignoring others, etc. As a result of this
adjustment, the pilot experiences some mental and physical demands, which we
call workload, but the workload experienced at one moment in time is used to
continuously adjust performance, establish priorities, and change task

[Figure 8.10. (a) Traditional concept of workload: task requirements, available
resources, and time available determine actions, which produce outcomes. (b)
Dynamic concept of workload: initial conditions and task requirements feed
planning activities, which lead to actions and resulting outcomes.]

then express it. Instead, if they experience workload, and the workload is too
high, they drop tasks. If the workload is too low, they assume tasks.
Unfortunately, we really do not have a very strong database on how well
people conform to this model. For example, there aren't good data regarding
how good a job people do at shedding tasks appropriately, or whether optimal
task shedding is done well under normal conditions or done poorly under
stress. A program of research at NASA and the Air Force is beginning to examine
this issue, and there is a similar research program at Illinois to investigate
task shedding.

One important implication of the closed-loop model, which we have not yet
addressed, is that as people become underloaded they will tend to assume
"pick up" tasks. The goal of a pilot is not to minimize workload, but rather to
keep workload at some moderate, stable, intermediate level. This obviously has
long-term implications for the system designer who is considering the
appropriate level of automation. The goal of automation should not be to
eliminate the pilot and reduce the pilot's workload to zero, but rather to
simply address the overload conditions, and consider problems of the underload
condition as well.
There has been a slight disconnect between the approach that more automation
is invariably better, and the approach that automation ought to be designed to
keep workload at an intermediate level rather than to eliminate all tasks from
the pilot's repertoire. The problems of excessively low workload, and their close
relation to issues of sleep disruption, will now be addressed.
Underload
The flip side of high workload is underload. As we discuss underload in this
section, it refers to situations of long periods of relative inactivity. Transoceanic
flights or long cross-continental flights are examples of underload, where very
little is actually happening. It is not surprising that very long periods of low
workload really are not optimal. The pilot will try to create some level of
workload, whether it is flight-related or not, in order to avoid sleeping. Some
interesting studies of air traffic controllers by Paul Stager in Canada found that
a predominance of ATC error seems to occur at relatively low workloads rather
than periods of high overload.
One of the things we know about low workload periods is that these interact
negatively with sleep loss. Pilots under sleep loss conditions are much more
likely to perform poorly under low workload periods than pilots who are well
rested, and so we now turn to a discussion of this important topic.
Sleep Disruption
There have not been many systematic studies of the effects of sleep deprivation
on pilots' performance. Perhaps the best of these was a study carried out by
Farmer and Green (1985) in the UK, in which they worked with 16 pilots. The
pilots were deprived of one night's sleep, by being kept awake for 24 straight
hours. Then they did a series of in-flight maneuvers, with a wide-awake check
pilot to make sure that nothing disastrous happened. Farmer and Green looked
at the kind of errors that were made, and found that the errors occurred mostly
during the low activity portion of the flight, at the times when not much was
going on, except for an occasional need to respond to, for example,
unpredictable and infrequent warning signals. These are what psychologists call
"vigilance tasks."

Because we know that sleep loss has consequences that are harmful in low
workload environments, it is important to understand some of the characteristics
of sleep. We have two different forms of sleep. One is rapid eye movement
(REM) sleep in which the eyes are twitching, there is a lot of dreaming, and
there is actually a fairly high level of brain activity. The other is slow wave
sleep, so named because the EEG is very slowly changing during this type of
sleep. The brain is very quiescent during slow wave sleep. There is not much
dreaming activity going on. REM sleep takes place later in the night. Slow wave
sleep takes place predominately during the first part of the night. There is good
evidence that both kinds of sleep are important for the overall health of the
individual.
The whole sleep-wake cycle is defined not only in terms of staying awake and
being asleep, but also by a set of body rhythms, called circadian rhythms, that
reflect different characteristics of the efficiency of performance. These circadian
rhythms run on a 24-hour cycle and can be defined by body temperature, the
depth of sleep, sleep latency, and performance. Figure 8.11 shows the average
duration of sleep episodes and the body temperature of a person during a 48-hour time period.
What the function shows is that temperature is lowest in the night and the very
early morning period. It begins to climb during the day, reaches its peak in the
late afternoon and evening, then declines at night. The graph of temperature
coincides with the bar graph that plots the duration of sleep. This graph shows
that if you go to sleep sometime in the early morning hours, your sleep
duration will be relatively short. If you go to sleep during the evening, your
duration of sleep will be longer.

Figure 8.11. Graph of sleep duration and body temperature plotted against circadian time (hrs).

A third characteristic of the circadian rhythms has to do with sleep latency.
Figure 8.12 shows a graph of the mean sleep latency of subjects who received
the Sleep Latency Test. Sleep latency is how long it takes you to fall asleep. If
there is a long latency, it means you are wide awake, and so you are not about
to nod off to sleep. If there is a relatively short latency, it means you are very
prone to fall into a deep sleep. Figure 8.12 covers results of a 24-hour period
from 9:30 am to 9:30 am. Eight 21-year-old subjects and eight 70-year-old
subjects received the Multiple Sleep Latency Test (MSLT), while awake, during the
day, followed by four brief awakenings at 2-hour intervals during the night
(shaded). In the afternoon, there is a "post-lunch dip," which indicates that in
the afternoon we tend to fall asleep rapidly. Sleep latency gets
longer in the evening (it takes longer to fall asleep), but then again
becomes very short in the morning, and rises again during the daytime. The
measures of temperature and sleep duration show only one cycle during the
day, while sleep latency has the same general cycle but with this little extra dip
in it in the afternoon.

Figure 8.12. Mean sleep latency (young men and old men) for 21-year-olds and 70-year-olds, by time of day. (From Richardson et al.)

Performance is the all-important measure related to sleep deprivation. Figure
8.13 shows how human performance on various tasks changes during the day.
The performance tends to correspond with body temperature, but also shows
hints of the "post lunch dip" characteristic of sleep latency. One graph shows
psychomotor performance, like a tracking task. You do progressively better
during the day, best in the early afternoon, and do relatively poorly at night
and in the early morning hours. The other graphs show the measurement of
reaction time, and of ability to do symbol cancellation and digit summation.
The collective implication of all of these effects is that we have a
regularly trained rhythm that describes how fast we go to sleep, how long we
sleep, our body temperature, and the level of performance, all of which show a
very pronounced dip in the time from midnight until about six in the morning.
The data strongly suggest that when possible, flight schedules ought to be
arranged to take advantage of the capacity for sleep. Flight schedules that allow
pilots to sleep at times when they go to sleep fastest and sleep for the longest
are better than those that give pilots the opportunity to sleep at times when
they have a hard time sleeping because their sleep latency is long.
Sleep Disruption in Pilots

A lot of the research on sleep disruption has been based upon subjects who
either were not pilots or were military pilots, so there are not a lot of data that
generalize directly to civil aviation. There are two important studies that were
carried out at NASA that do have a direct bearing on the civilian piloting
community (Graeber 1988). One of these is a short-haul study in which a large
number of pilots were evaluated during a series of domestic short hauls. They
flew for three or four days before returning to the home base. Out of that study
came the first systematic conclusions of the effects of sleep cycle on the short
haul. First, the pilots began the trip with a sleep loss, because they were apt to
sleep less than the normal amount the night before they took off for the first
leg. Thus they started out behind the eight ball. This is interesting, because it is
precisely the opposite of a concept that has proved to be an effective antidote
against sleep loss, the concept of prophylactic sleep. This is defined as getting
extra sleep in advance of a period of time when you are going to miss a lot of
sleep. It can do a very good job of compensating for the later loss of sleep.

Figure 8.13. Graphs showing how human performance (psychomotor performance, symbol cancellation, digit summation, and reaction time) varies during the day with a rhythm corresponding to body temperature. (From Klein et al., 1972)

A second finding from the short-haul study was that sleep loss each night is
greater on layovers than at home. Generally, the pilots were sleeping less per
night on the layovers. The sleep was also more fragmented during the layovers.
Graeber also examined the buildup of fatigue across the four days of flying, and
found that this buildup (measured by the pilots' subjective rating of how tired
they were), was really greatest after the first day of the trip, with a more
modest increase in fatigue after the third and fourth days.

Now consider what each day of the trip is like. Some days are very fragmented
and consist of three or four different legs on different aircraft -- up to seven or
eight takeoffs and landings at different airports. Other days may involve only
one flight with a fairly long layover. Thus we can distinguish between busy
days and relatively nonbusy days in terms of takeoffs and landings. Graeber's
third conclusion was that sleep was better following a busy day than following
a relatively light day. That is not altogether surprising. The busier the day, the
more takeoffs and landings, the more fatigue within a day, and, therefore, the
better the sleep will be after that day is over. A fourth conclusion from
Graeber's study is that down-line changes of schedules are bad for sleep planning.
If, after the second or third day into the short haul, the pilot was informed of a
sudden change in the flight schedule, this change seriously disrupted the pilot's
sleep schedules. It was almost as if the crews could preprogram themselves for
how much sleep they were going to need each night into the short haul.
However, if that schedule was suddenly disrupted by a change, that change
disrupted the preprogramming. For pilots who have done operational flying for
commercial airlines, most of these conclusions are probably not surprising. The
important point is, for the first time, they are firmly documented in an objective
study with data.

The second major component of Graeber's work was a study of long-haul
flights. These are transoceanic flights that typically involve time-zone changes of
six or more hours. To understand the effects of those long-haul flights, we need
to consider a little bit more about this natural circadian rhythm. It turns out
that the period of the natural rhythm is not exactly 24 hours, but it is actually
about 25 hours. Studies of people who have gone into caves where they have
no sense of waking in the natural day/night cycle reveal that these subjects
tend to adopt a 25-hour schedule rather than a 24-hour schedule. There are
interesting reasons why this is the case, but it is very clear that our natural
schedules tend to be longer than the daylight forces us into. When left to our
own devices during the week, we tend to stay up later and later each night,
and we tend to be late stayers more than early risers. What happens,
nevertheless, when we go into a long-haul flight is that we have suddenly
moved to a situation where the day/night cycle in the environment where we
land is different from the day/night cycle that our brain had adapted to when
we took off. This phenomenon is called desynchronization.
Desynchronization is represented by Figure 8.14. The upper graph represents
the westbound flight and the lower graph represents the eastbound flight. The
dotted line is the natural circadian rhythm that was formed when we left our
home base. So it is the same no matter whether we are flying west or east. The
solid line for the west- and eastbound flights is the circadian rhythm at the
destination. As we fly west, we are flying with the sun, and initially undergo a
very long day. As we reach the new destination, now we have a day/night
cycle, but it is shifted ahead of what our natural cycle is. So when our brain
thinks it's night, it is still afternoon. When we are flying east, on the other
hand, we have a very fast day initially. When we reach our destination, again
there is desynchronization. Now when our brain thinks it is night, it is morning.
The data in either case obviously suggest that there is a mismatch between our
circadian rhythms and the post-flight day/night cycle.

Figure 8.14. Graphs showing desynchronization of the natural rhythm on an east- and westbound flight across time zones.

The data also suggest that it is considerably easier to adapt to westbound
flights than eastbound flights. When flying west, the natural rhythms have an
easier time lengthening themselves to get in synchrony with the local day/night
cycle. On the other hand, when flying east, it is as if the rhythms don't know
whether to contract and make a very short day, or expand to make a doubly
long day. There are data indicating that the eastbound
flights, which condense the day, are worse than the westbound flights, which
stretch the day. These data come, in part, from examining the way in which
different characteristics of the physiological systems adapt to the new rhythms.
In other words, you have got a natural rhythm which was in existence when
you left, and you acquire a new rhythm which you should take on when you
reach your destination. The longer you stay at your destination, the more the
old rhythm is going to shift into phase with the new rhythm. We can then plot
how rapidly that shift takes place.
Table 8.2 shows the shift rates for different variables after transmeridian flights,
either westbound or eastbound.
Table 8.2
Shift Rates after Transmeridian Flights for Some Biological and Performance
Functions (minutes per day)

                            Westbound    Eastbound
Adrenaline                      90           60
Noradrenaline                  160          120
Psychomotor performance         52           38
Reaction time                  150           74
Heart rate                      90           60
Body temperature                60           39
17-OHCS                         47           32

(from Klein et al., 1972)

The numbers in the table are expressed in terms of the amount of shift in
minutes per day, so that a higher number indicates a more rapid shift. What
you see is that generally the numbers for the westbound flights are higher than
the numbers for the eastbound flights. In fact, sometimes the westbound shifts
are as much as two times faster than the eastbound. The table shows the rate
of uptake of adrenaline and noradrenaline, psychomotor performance and
reaction time, heart rate, body temperature, and a body chemistry measure (17-OHCS). Each of these different rhythms seems to shift at a slightly different rate.
Therefore, in transcontinental or transoceanic flight not only is your rhythm out
of synchrony with the rhythm of day and night at your new destination, but all
of your different rhythms are out of synchrony with each other because of the
different shift rates. Thus there is kind of a "double whammy" to readaptation.
Different things are lost at different times, and different things are regained at
different times.
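
As a rough worked example of the arithmetic behind Table 8.2, the sketch below divides a time-zone shift (in minutes) by each shift rate to estimate how many days each function might need to realign. It assumes a constant daily shift rate, which is a simplification the source does not claim, and a six-hour time-zone change chosen only for illustration. The names `SHIFT_RATES`, `days_to_resync`, and `resync_table` are hypothetical.

```python
# Shift rates in minutes per day, as read from Table 8.2 (Klein et al., 1972).
# Each entry is (westbound, eastbound).
SHIFT_RATES = {
    "Adrenaline": (90, 60),
    "Noradrenaline": (160, 120),
    "Psychomotor performance": (52, 38),
    "Reaction time": (150, 74),
    "Heart rate": (90, 60),
    "Body temperature": (60, 39),
    "17-OHCS": (47, 32),
}


def days_to_resync(zone_shift_hours, rate_min_per_day):
    """Days needed to absorb a time-zone shift at a constant daily shift rate.

    The constant rate is an illustrative assumption, not a claim from the source.
    """
    return zone_shift_hours * 60 / rate_min_per_day


def resync_table(zone_shift_hours):
    """Return {function: (westbound_days, eastbound_days)} for a given shift."""
    return {
        name: (days_to_resync(zone_shift_hours, west),
               days_to_resync(zone_shift_hours, east))
        for name, (west, east) in SHIFT_RATES.items()
    }


if __name__ == "__main__":
    # Hypothetical six-hour time-zone change.
    for name, (west, east) in resync_table(6).items():
        print(f"{name:25s} west {west:4.1f} days   east {east:4.1f} days")
```

Under these assumptions each function finishes realigning on a different day, which is one way to picture the rhythms falling out of synchrony with each other as well as with the local day.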
The last conclusion of the long-haul flights study is that the return to normal is
a relatively gradual one that on the average takes about four to five days before
the rhythms regain synchrony with the local environment. This figure is
probably more like five to six days after an eastbound flight, and perhaps three
to four days after a westbound flight. Figure 8.15 shows some more data
representing this shift. It shows how much resynchronization took place for
different variables (body temperature, performance, etc.) after the first through
the eighth day. Notice that even after eight days, subjects still haven't
completely resynchronized with the new rhythms, although most of the
resynchronization took place after the second and third day. The bottom line
question of course is whether this desynchronization leads to a higher number
of pilot-induced accidents or poorer pilot performance. At this point, there isn't
a good database to suggest that is the case. In other words, there aren't
accidents that have been directly attributed to the desynchronization problem,
but there are certainly suggestions that it may have been a contributing cause
in some instances.

Figure 8.15. Average resynchronization of variables (means of 8 variables) for eight postflight days. (From Wegmann et al., 1986)

Recommendations
There are a number of recommendations that have come out of the research on
sleep resynchronization, and these are, again, taken from Graeber's work
(Graeber, 1988, 1989). His chapter recommends that pilots should sleep when it
is most effective and do so within the natural cycle. Where possible, sleep ought
to be scheduled at late night or early morning hours, in phase with the
rhythms to which the body is accustomed. Extra sleep, rather than deprivation
prior to a short haul, is advised. Following a long transoceanic flight, Graeber
argues it is better not to sleep immediately after one's arrival, but simply try to
stay awake until the local bedtime, particularly if one is going to be adjusting
for some time to new rhythms. During any 24-hour period, sleep is relatively
more effective before takeoff than after landing during a layover. So following a
landing, sleep is going to be more effective just prior to the
subsequent takeoff. This is consistent with the idea of prophylactic sleep,
sleeping in advance of a period where one knows sleep deprivation is likely to
occur. Prophylactic sleep is helpful and much more restorative than sleeping just
after a period of time without sleep.
A somewhat more controversial issue, but one that is certainly receiving some
research interest, concerns controlled napping. How effective is controlled
napping in flight, assuming, obviously, that somebody else is awake at the
controls? The studies that have been done of napping indicate there are really
two sorts of napping. First, there is micro-sleep, where one may doze off for a
couple of seconds or a very short period of time. There is very little evidence
that micro-sleep, in itself, is effective in restoring sleep loss. Then there is a
bona fide nap. There is a minimum amount of time, about 10 minutes, before a
nap can be effective in terms of restoring some sort of sleep loss.
Another phenomenon that relates to naps is the concept of sleep inertia. It is
something that is intuitively familiar to all of us. Sleep inertia describes the
cognitive inertia we experience immediately after waking up. In fact, for 10
minutes or so after one wakes up, there is an inertia that inhibits our ability to
respond quickly, think fast, and so forth. This is well-documented in the
research of Chuck Czeisler at Harvard, which suggests that any program of
controlled napping has got to be one in which the wake-up time is well in
advance of the time one may have to carry out some sort of high-level cognitive
activity or rapid action. If this is applied to a pilot flying transoceanic, you don't
want to wake up just before you start making the important decisions required
on the approach, but rather with sufficient time to dissipate that sleep inertia
before such decisions are required.
In conclusion, it should be noted that the findings and recommendations
reported here result from pooling information from a lot of data sources, many
of them not taken from aviation. Furthermore, the causal links between the
different forms of sleep disruption and pilot error have not always been
conclusively established. Nevertheless, it is prudent to believe that there are
some direct implications to aviation performance.

Human Error
Anytime one talks about human error, there is a tendency to do a lot of finger-pointing. Pilot error comes up with a red flag as being a frequent cause of a
disaster or accident. Training in engineering psychology, however, leads one to
conclude that when errors do occur, they rarely occur as a result of a mistake
made exclusively by the pilot. Typically errors are caused by some training-induced, schedule-induced, or design-induced factor that made that error almost
an inevitable consequence--something that was bound to happen sooner or
later. This is actually a positive philosophy, for it suggests that there are usually
steps that can be taken to reduce the likelihood of error.
A number of studies that have looked at pilot errors have tried to categorize the
nature of the various errors in terms of where they occurred, how they
occurred, and what they were the result of. The approach to pilot error
classification that is consistent with the information processing model presented
in Chapter 7 is one that identifies four major kinds of errors. In this model of
information processing, there are the stages of perception and understanding
the situation (situation awareness or diagnosis), formulating some intention for
action, (deciding what to do about it and making a choice), and finally
executing the action. When taking an action, we often rely upon our memory,
both short- and long-term, to help us recall the rules of what it is we are
supposed to do. Within this context, two researchers, Norman (1988) of the
U.S., and Reason (1990) of the U.K. have come up with similar ways of
classifying errors. Classification is important because the different kinds of
errors seem to have different remediations, or different fixes. This classification
is nicely applied to aviation in Nagel's chapter in Wiener and Nagel's book on
Human Factors in Aviation (Academic Press, 1988).

Categories of Human Error


In Reason and Norman's classification scheme, there are, first of all, what are
called mistakes, a misunderstanding of the situation. Knowledge-based mistakes
occur when you don't have the knowledge to understand what is going on.
Rule-based mistakes occur when you select the wrong rule to make a decision.
Forgetting is another type of error. You forget what is going on, what mode
you are in, and you make a mistake. You have lapses, where you simply forget
what you are doing and therefore do the wrong thing. Finally, you have errors
of the execution of action, which we call slips. A slip occurs when you know
what to do, but you slip and do the wrong thing. You hit the wrong button on
the control display unit, for example.

We can represent these different types of errors in terms of different
characteristics of a pilot's behavior. A knowledge-based mistake might be a
misdiagnosis when the pilot doesn't understand what is wrong with an engine.
A rule-based mistake would characterize the situation when the pilot knows
what is wrong, but chooses the wrong action. The pilot realizes that an engine
is malfunctioning, but intentionally reduces power to the engine, rather than
shutting it down completely. With a slip, the pilot intends to perform the
correct action, but simply executes it incorrectly. For example, the right engine
is known to be failing, and the pilot intends to shut it off but shuts off the left
one instead.
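
One way to picture how the categories line up with the stages of information processing is as a sequence of checks, each asking whether a stage succeeded. The sketch below is only an illustration of that ordering, not a method taken from Reason or Norman; the enum, the function `classify_error`, and its boolean parameters are all hypothetical names.

```python
from enum import Enum


class ErrorType(Enum):
    NONE = "no error"
    KNOWLEDGE_BASED_MISTAKE = "knowledge-based mistake"
    RULE_BASED_MISTAKE = "rule-based mistake"
    LAPSE = "lapse"
    SLIP = "slip"


def classify_error(diagnosis_correct, rule_correct,
                   steps_remembered, executed_as_intended):
    """Return the first stage at which things went wrong.

    The checks run in the order of the processing stages described in the
    text: understanding the situation, choosing an action, remembering what
    to do, and executing it. This ordering is an illustrative assumption.
    """
    if not diagnosis_correct:
        return ErrorType.KNOWLEDGE_BASED_MISTAKE
    if not rule_correct:
        return ErrorType.RULE_BASED_MISTAKE
    if not steps_remembered:
        return ErrorType.LAPSE
    if not executed_as_intended:
        return ErrorType.SLIP
    return ErrorType.NONE


# The three engine examples from the text:
misdiagnosis = classify_error(False, True, True, True)   # does not understand the fault
wrong_action = classify_error(True, False, True, True)   # reduces power instead of shutting down
wrong_engine = classify_error(True, True, True, False)   # shuts off the left engine instead of the right
```

The point of separating the cases is the one the text makes: each category calls for a different remediation.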
Considering these error types in more detail, knowledge-based mistakes typically
result from inadequate knowledge, usually a consequence of insufficient training
or the inadequate or confusing display of information. A good example of a
knowledge-based mistake would be misinterpreting flight path information and
ground-based features, and landing at the wrong airport. Somehow your
knowledge and interpretation of the available information is simply wrong, and
you have made a mistake about where you are. Knowledge-based mistakes often
occur when attention is directly focused on the task in which the error is made.
The pilot who lands at the wrong airport typically doesn't do so because of
failure to pay attention to where he or she was going. In fact, the pilot is
usually paying fairly careful attention to the aircraft's course at the time, but is
simply confused. Knowledge-based mistakes often occur at times of very high
working memory load. The operator is usually in a state of uncertainty and
hesitancy. Finally, the detection of knowledge-based mistakes is often very slow.
As a consequence, you often don't realize the mistake was made until it is too
late. These are often characteristics of diagnosing system failures. The pilot is
focusing a lot of attention on the demanding diagnostic task. Some human error
analyses have been carried out in the domain of nuclear process control. One
study looked at 80 process control errors committed in actual plant operations
and found that out of the 80 errors, half of them were knowledge-based
mistakes. The operators were never aware that they made any of the mistakes.
They always thought they made the right decision until the consequences were
felt later on. The main remediations for knowledge-based mistakes are (1)
training, thereby giving people better knowledge, and (2) displays that provide
operators with better, more integrated information.
Rule-based mistakes also result from inadequate knowledge. The diagnosis is
correct, one may know the correct status of the world, but one's decision of
what to do about it is wrong. It is as if the pilot has a rule of thumb of what
to do in case of failure X. Failure X is correctly diagnosed, but the rule is
wrong, and therefore the wrong corrective action is carried out. Rule-based
mistakes occur when attention is highly focused on the task. Once you diagnose
the situation, you act with a high degree of certainty, even though you are
acting incorrectly. As Reason says, one's actions are "strong but wrong."
Training is one antidote. Automation assistance is also a possible aid in
lessening the likelihood of rule-based errors. It can provide some guidance,
given a certain kind of diagnostic condition, of what the appropriate rule to be
followed should be.
Two kinds of memory errors have been referred to. One of these, more common
in computerized systems, is called a mode error. A mode error occurs when the
operator forgets the currently active mode of operation. The simplest example is
the typewriter or computer keyboard. Suppose you are typing along and you
press the CAPS LOCK key, which makes everything you type in capital letters.
Then you forget what mode you're in, and start typing digits. On the
conventional typewriter keyboard you will instead get symbols such as "$&&@#(&!" This is a
mode error. Mode errors are likely to occur in any multimodal system in which
the same response can generate various results, depending on the mode setting.
Mode errors are not likely to occur if the operator is new at the system, and is
concentrating very intensely on remembering what mode the system is in. The
more familiar you get with the system, the more you stop paying attention to
what mode you are in, and the more likely you are to make a mode error.
As we deal with automation devices that are increasingly based upon different
modes of operations, like multimode autopilots, mode errors are likely to occur
with increasing frequency. The remediation for mode errors is to provide very
strong reminders of what mode of operations one is operating in. Consider, for
example, multiple modes of autopilot control where the level of guidance is
controlled by a wings leveler or heading control. There should be something
highly visible and continually available to remind the pilot what mode the
system is operating in. Another remediation for some mode errors in computer
operations is simply to use dedicated keys or one-to-one mappings between key
and function. This means you press a key and it always does only one thing.
This feature avoids a design where a given key can activate very different
functions depending on the mode setting of some other key. However, it is
often a more economical design to have multimode keys rather than one-to-one
mapping as far as space is concerned.
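
The typewriter example can be sketched in a few lines to show why the same keypress yields different results depending on a mode setting, and how a visible reminder exposes that setting. This is a minimal illustration that assumes a US keyboard layout for the digit row; the class `Keyboard` and its methods are hypothetical names, not part of any real system.

```python
# Digit row of a US-layout keyboard and the symbols produced when shifted.
SHIFTED_DIGITS = str.maketrans("1234567890", "!@#$%^&*()")


class Keyboard:
    """A multimode keyboard: one key, two results, depending on shift lock."""

    def __init__(self):
        self.shift_lock = False

    def toggle_shift_lock(self):
        self.shift_lock = not self.shift_lock

    def type_keys(self, keys):
        """Return what appears on the page for the given keypresses."""
        if self.shift_lock:
            return keys.translate(SHIFTED_DIGITS)
        return keys

    def mode_reminder(self):
        """A continually available indication of the active mode."""
        return "SHIFT LOCK ON" if self.shift_lock else "shift lock off"


keyboard = Keyboard()
normal_output = keyboard.type_keys("1993")
keyboard.toggle_shift_lock()
locked_output = keyboard.type_keys("1993")   # same keys, different result
reminder = keyboard.mode_reminder()
```

A dedicated-key design would instead give digits and symbols separate keys, so that no keypress depends on a hidden setting, at the cost of the extra space the text mentions.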
A second form of memory error is the lapse. Lapses result
whenever a procedure is forgotten. One simply forgets to do something in a
series of steps. Lapses often occur when a long series of actions is required to
reach the goal. This is obviously the case in many checklist operations, like pre-takeoff, pre-landing, etc. Lapses are more likely to occur when a procedure
sequence is interrupted, then later resumed. Perhaps in following a set sequence
of A, B, C, there is some interruption; later on the operator jumps back in and
forgets that step D was not yet performed and goes right on to E, F, and G.
The National Transportation Safety Board (NTSB) report of the Northwest
Airlines crash in Detroit, in which the flaps were not deployed, indicates that a
lapse was a very likely cause. The pilot was going through the taxi checklist on
the runway. Then there was a series of disruptions by air traffic control
requesting a change in the runway. Investigators inferred that somehow that
checklist was resumed, but that one critical step of deploying the flaps had been
left out. Other contributing causes to the disaster were, of course, also
identified. There were a number of fail-safe operations that did not work and
thereby allowed the error to occur. Many of these fail-safes were also related to
automation, but it is clear that the checklist procedure contributed a major
potential source of error. A remediation of this kind of situation would be a
checklist design which avoided forcing pilots to go through multistep sequences
that do not have a clear prompt that guides them through the checklist, saying
"do this, do this, do that, do the other, check this, check that." Even with such
external prompts, there is no guarantee that the steps will all be done, but it
certainly is an important safeguard. Degani & Wiener (1990) have written a
nice summary of the human factors of pilot checklists.
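
The idea of a checklist that prompts the next outstanding step, rather than relying on the pilot to remember where the sequence was interrupted, can be sketched as follows. This is an illustrative sketch only: the class `Checklist`, its methods, and the step names A through E are hypothetical and do not represent any actual aircraft checklist.

```python
class Checklist:
    """Tracks which steps are done and always prompts the first open one."""

    def __init__(self, steps):
        self.steps = list(steps)
        self.done = [False] * len(self.steps)

    def complete(self, step):
        """Mark a named step as performed."""
        self.done[self.steps.index(step)] = True

    def next_prompt(self):
        """Return the first step not yet completed, or None when all are done."""
        for step, finished in zip(self.steps, self.done):
            if not finished:
                return step
        return None


# Steps A, B, C are completed, then an interruption occurs.
checklist = Checklist(["A", "B", "C", "D", "E"])
for step in ("A", "B", "C"):
    checklist.complete(step)

# On resuming, the operator jumps ahead to E; the prompt still points at D.
checklist.complete("E")
skipped_step = checklist.next_prompt()
```

As the text notes, a prompt of this kind does not guarantee that every step is performed, but it removes the need to remember where the sequence was interrupted.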
A slip is an error which occurs when you have diagnosed a situation correctly,
you have formulated the correct intention, your rules of what to do are correct,
but somehow there is an incorrect execution. The error category of slips
sometimes includes mode errors and lapses. You either left out a step or did an
extra step. One example of a slip is hitting the wrong key on a keyboard.
Another example is grabbing the orange juice instead of the syrup, and pouring
it on your pancakes. Certainly in aviation, there are lots of situations where the
wrong control has been activated. The pilot may activate the flaps rather than
the landing gear, when the pilot surely knows the landing gear and not the
flaps is what should be activated. What are the conditions that cause slips in
the first place? There are really three triggering conditions. First of all, a slight
deviation from the most expected or frequent behavior sequence is intended.
There is a familiar pattern of activity you carry out most of the time, and the
needed pattern is similar, but slightly different. Second, the conditions,
location, and feel of the intended action are similar to those of the
more frequent action. So most of the time you are doing A, B, and C. Under
these circumstances, you plan to do A, B, and C', which is slightly different than
C. It may be a slightly different control, a control located close by, but in a
slightly different location to the normal control C, or a control pulled upward
(C'), instead of downward (C). A third triggering condition for slips is that
performance in carrying out the sequence of actions is fairly automated, so
attention is usually directed elsewhere.

A general characteristic of slips is that they are "strong but wrong." An operator
commits to the action, and usually does it with the same degree of certainty as
the correct action. Fortunately, we are usually fairly good at detecting our own
slips just as they are made. As we type or enter data into a CDU, it is very
obvious when we make a slip, as if the finger knows before the brain knows
that it has gone to the wrong place or setting. With a particular switch in an
aircraft, you may know immediately that you made the wrong choice.
The fact that we are good at catching ourselves making slips has some
important implications for how we remediate them. Remediation of slips is a
major issue in system design. Since slips usually occur when attention is
directed elsewhere, that means slips usually occur on sequences of behavior that
are fairly well learned for operators that are highly trained. So remediation is
really not so much in training as it is in system design--remediation includes
such things as avoiding the design of similar controls with similar physical
actions which must be used in similar conditions. Good design avoids
circumstances where you have two similar switches that are flipped in similar
conditions but for different purposes. Always try to adhere to stimulus-response (S-R) compatibility.
One of the major culprits causing slips is the incompatible response mapping,
discussed in Chapter 7. Here without paying attention, the pilot may have a
tendency to move something in the wrong direction because the right direction
was an incompatible response.
Error Remediation and Safeguards
In this section we review and present a series of recommendations that
psychologists have proposed to remediate human error -- eliminate it, or reduce
the likelihood of its unpleasant consequences. First, there is the issue of
allowing for mvenifbily of actions. Such an allowance creates what we call a
fagiin system. As we noted, operators are usually pretty good at monitoring
their own performance and detecting their own errors if there are slips. Once
you've made an error, it is nice to have a chance to correct it. Some systems
have an "error capture" mechanism, which captures and delays the response a
little bit before its consequences can affect the system. That's not always a
feasible design option, but there are situations in which it can be made feasible.
There are computer systems that, whenever you press a button that involves
deleting a major file, will come back with a message that says, "Are you sure
you want to delete this?" That is like capturing your behavior before it gets
passed on to the system. Slips often involve throwing things away. Don
Norman, the author of The Psychology of Everyday Things, stores all of the trash
baskets in his office for 24 hours in a separate room before they are emptied. If
someone in the office realizes the next day that they inadvertently threw out
something important, they can go into the room and pull out the information.
This is a forgiving system. On the other hand, if, on an airplane, you slip some
paperwork into the seatback pocket, then forget it when you exit from the
plane, your chances of getting it back are slim. As soon as the plane is empty,
the maintenance crew will have almost immediately cleaned out the seatbacks
and destroyed it. That is not a forgiving system that acknowledges the fact that
people do have lapses of this sort.
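
The trash-basket example can be sketched in code. This is a minimal illustration, not a design taken from this report: the class name ForgivingTrash, its methods, and the 24-hour default are assumptions chosen to mirror the example above. A discarded item is held for a retention period, during which the slip can still be reversed, and it is destroyed only after that period has passed.

```python
from dataclasses import dataclass, field

RETENTION_HOURS = 24.0  # assumed, mirroring the 24-hour trash-basket example


@dataclass
class ForgivingTrash:
    """Holds discarded items for a retention window before destroying them."""

    retention_hours: float = RETENTION_HOURS
    held: dict = field(default_factory=dict)  # item name -> hour it was discarded

    def discard(self, name: str, now_hours: float) -> None:
        # The action is captured here; it is not yet passed on to the system.
        self.held[name] = now_hours

    def purge(self, now_hours: float) -> None:
        # Destroy only the items whose retention window has expired.
        expired = [n for n, t in self.held.items()
                   if now_hours - t >= self.retention_hours]
        for n in expired:
            del self.held[n]

    def recover(self, name: str, now_hours: float) -> bool:
        # Reversal succeeds only while the item is still being held.
        self.purge(now_hours)
        return self.held.pop(name, None) is not None
```

With these assumptions, an item discarded at hour 0 can be recovered at hour 23 but not at hour 24 or later. The seatback-pocket case corresponds to a retention window of zero.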
The idea of reversible actions, or forgiving systems, where a slip can be reversed
and undone before it is passed on to the system has led to a philosophy of
human error that is somewhat of a marked departure from an earlier
philosophy. That earlier philosophy was that human errors are bad, and
whenever they occur, we ought to try to remediate them. Therefore, we ought
to try to redesign the system to make sure an error doesn't occur in the first
place. This philosophy has led to two approaches. One is called "bandaids." In
the bandaid approach, the system gets more and more complex, because every
human error is a cause for another design feature (i.e. a bandaid) that tries to
eliminate the human error. This correction, by making the system more
complex, very often creates conditions conducive for another error (mistakes
become more likely with more complex systems) and doesn't acknowledge the
fact that errors are probably always going to happen to some extent in any
case; any fix for one sort of error may be likely to produce another error. The
second approach characterizing the old philosophy that all human errors are
bad is one which pushes automation as an ideal because of the belief that a
computer can perform better than a human if there is a mistake. The problem
with automation is that the designer is usually transferring the responsibility for
human error to someone else. For example, this responsibility may be
transferred from the pilot to the computer programmer who is just as likely to
make the errors as the pilot.
In contrast to the earlier philosophy, the proponents of forgiving systems make
two assertions about errors. They say that an error is, first of all, unpredictable
and inevitable. No matter how we design the system, and patch it with
bandaids, errors are always going to occur to some extent. Furthermore, they
say that error is sometimes a necessary consequence of the fact that the human
is a flexible performer. It is that very flexibility that makes us want to keep
humans involved in the first place. Pilots have flexible problem-solving skills,
and that's good. There is an inevitable cost to that flexibility, and that
sometimes is going to lead to the wrong action in inappropriate circumstances,
but we still want to maintain that flexibility because of its positive qualities. We
have to accept the consequences, which are the occasional errors; therefore, our
philosophy of redesigning the system should be one that says errors are going
to occur but let's design the system in a way in which they can be tolerated.
This is the philosophy of the error tolerant system.

In this vein, Earl Wiener has discussed the concept of the electronic cocoon. The
idea here is that a pilot ought to be free to make a lot of different responses,
some of which may be incorrect. The appropriate role of automation would be
to simply monitor the performance envelope of the aircraft, and only intervene
if the errors are serious enough to bring about a serious consequence. The idea
is to have some master computer monitoring the pilot, but allow the pilot a lot
of opportunities to make errors and to correct them before things get bad. Bill
Rouse and his associates have done a lot of work on this concept for the Air
Force, as part of the Pilot's Associate program, designing electronic copilots that
can monitor the pilot's performance and act as a cooperative crew member.
Their concept is that of an intelligent system which can monitor human
performance and infer the intentions of the human control actions. You have a
pilot interacting with a task under intelligent monitoring. The pilot's behavior is
providing information to the monitor. The monitor, in turn, can take a series of
actions in the face of the pilot's behavior, if the monitor detects that the pilot
might be making mistakes. Rather than just simply taking over for the pilot,
Rouse and his colleagues suggest that this intelligent monitor might go through
a hierarchy of guidance. At the very first level, if the intelligent monitor infers
that the pilot is doing something that is amiss, it might do nothing more than
increase vigilance. If there is continued evidence that the pilot's behavior is
inappropriate, the intelligent monitoring system might say some things to the
pilot, like "Are you sure you want to do this? Are you watching your airspeed?"
If the error worsens, the monitoring system might prompt the operator with
some advice like lowering or increasing airspeed, etc. Only under the most
serious error circumstances will the intelligent monitor assume command
automatically and correct the error.
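
One way to picture this hierarchy of guidance is as an ordered set of responses, selected by how serious and how persistent the inferred error is. The sketch below is illustrative only. The level names, the 0-to-1 severity scale, and the thresholds are assumptions introduced here and do not come from the work of Rouse and his colleagues.

```python
from enum import IntEnum


class Guidance(IntEnum):
    NONE = 0                # no evidence of a problem
    INCREASE_VIGILANCE = 1  # monitor watches more closely, says nothing
    QUERY = 2               # "Are you sure you want to do this?"
    ADVISE = 3              # prompt the pilot with specific advice
    TAKE_OVER = 4           # assume command and correct the error


def select_guidance(severity: float, persistent: bool) -> Guidance:
    """Pick the mildest response consistent with the inferred error.

    severity: assumed scale from 0.0 (no problem) to 1.0 (most serious).
    persistent: True if evidence of the error has continued over time.
    """
    if severity <= 0.0:
        return Guidance.NONE
    if severity >= 0.9:   # hypothetical threshold for the most serious errors
        return Guidance.TAKE_OVER
    if severity >= 0.6:   # hypothetical threshold for a worsening error
        return Guidance.ADVISE
    if persistent:
        return Guidance.QUERY
    return Guidance.INCREASE_VIGILANCE
```

The ordering matters more than the particular numbers: the monitor takes over only at the top of the scale, and every lower level leaves the pilot in command.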
Error in a Systems Context

In conclusion, it is important to consider the concept of human error in a much
larger domain of overall system integration. Jim Reason has done so by
introducing the concept of error as a "resident pathogen." Reason speaks of a
"latent error" or resident pathogen as a virus that sits in the system not causing
any particular abnormality, but waiting for some conditions to trigger it. Reason
examined a lot of different case studies of major disasters such as Chernobyl,
Three Mile Island, the Bhopal incident at the Union Carbide plant in India, and
the sinking of the ferry boat "Herald of Free Enterprise." This was the ferry boat
that sank crossing the English Channel after the captain left the loading door
open in heavy seas. The boat filled up with water, sank, and scores of lives
were lost. All of these were disastrous events that were directly attributable to
operator error at some final point in the chain of events. However, Reason
concludes that, in fact, the operating conditions in these complex systems were
conditions that were poorly designed with a potential error lurking there
somewhere (like the pathogen). All that was needed was for one operator to
trigger the system, and make these inevitable errors occur. Furthermore, he
argues that there are a large number of potential causes of these catastrophes
within complex systems. Rather than pointing a finger of blame at a particular
operator who commits the final triggering error, Reason argues that the real
remediation should be accomplished by considering a number of mediating
factors that made the disaster a nearly inevitable consequence of a triggering
human error.
One of these factors is the collection of hardware defects related to poor human
factors concerns of design, construction, and location. System goals that are
incompatible with safety also contribute to errors. Very often in industry, system
goals are designed towards production rather than safety. These two goals are
not always totally compatible. Poor operating conditions have a tremendous
impact on the extent to which the goals are or are not compatible with safety.
Inadequate training is another factor. Just checking off a box and saying
somebody has been through the simulator is inadequate. Poor maintenance
procedures is an additional factor that creates conditions for error. The Three
Mile Island disaster was a case where maintenance procedures were sloppily
carried out, and it wasn't clear to the control room personnel on duty what
systems were and were not in operational status. Finally, management attitudes
(or lack of guidance) can lead to violations by operators that will help
propagate unsafe acts. The operators at Chernobyl provide an example: people at
the plant simply did things that they knew they weren't supposed to do, because
the guidelines had said it was all right to do so. We all commit violations
every time we exceed the speed limit. We knowingly go over the limit by a few
mph because we have no incentive not to.
Reason's final point is that sometimes even though a system is very well
designed from a human factors point of view, following the sort of prescriptions
we have discussed here, there will still be human errors because of the failures
at all of these other levels. This is a systemwide approach to human error.


Cockpit Automation
by Richard F. Gabriel, McDonnell-Douglas, retired

Introduction
The Federal Aviation Administration (FAA) has a direct and pervasive influence
on aircraft design through its certification process, and on operations through
its design and operation of the Air Traffic Control System (ATC). In spite of the
FAA's broad regulatory administrative role, it is difficult for rules and
regulations to keep pace with rapid technological advances in aircraft design
and operation. It is therefore important that FAA personnel have an
understanding of the impact that advanced technology (automation) may have
on those who operate these systems, so that the benefits of automation can be
realized without unacceptable side effects.

In recent years, increasing levels of automation have shaped and changed the
aviation industry. These effects include:
* Economic impacts - growth in passenger demand, increase in fuel prices
  and other operating costs, increased competition among airlines;
* Changes in airspace and airport configuration - capacity limitations,
  hub-and-spoke concepts, air traffic control requirements;
* Effects on equipment - increased equipment reliability, increase in aircraft
  longevity, aircraft design and performance improvements, increased
  automation of flight decks;
* Effects on operators - reductions in crew size, reduced emphasis by airlines
  on training, changes in crew qualifications and availability.
This review will consider the human factors issues of automation from the
operator's standpoint. Although the discussion is relevant to ATC as well as
flight crews, emphasis will be on cockpit applications.

Definition
Automation has been defined as the incorporation or use of a system in which
many or all of the processes...are automatically performed by self operating
machinery [and] electronic devices. (Webster's New World Dictionary, 1970).
Figure 9.1 depicts the progress of automation in aircraft and indicates
automation has been increasing since the origin of heavier-than-air flight.
Automation is not an all-or-nothing proposition. Sheridan (1980) has identified
ten levels of automation, from totally manual (100 percent human controlled),
to systems in which a computer makes and implements a decision if it feels it
should and the human may not even be informed (100 percent computer
controlled). Current systems generally fall between these extremes, but the trend
is to reduce the role of the human and move away from human control even in
decisionmaking. Self-correcting systems are becoming commonplace in newer
aircraft. Table 9.1 presents Sheridan's levels of automation.

Summary of Aviation Automaon Concerns


Some human factors specialists have expressed concern about designers'
overreliance on automation to perform flight functions. Recent developments--particularly the availability of small, powerful digital computers--have led to
systems designs that not only control the aircraft for much of its flight, but may
even replace crew decision functions in the hope of reducing human error. An
example is envelope protection, in which certain maneuvers such as a stall
cannot be induced by the crew either intentionally or unintentionally.

Figure 9.1. Timeline of the development of aircraft automation, from early gyroscopic stabilizer patents (Maxim, Sperry) through autopilots, flight directors (B-707, DC-8), and autoland (Trident) to flight management systems (MD-80, B-767) and envelope protection.

Another concern is the possibility that even redundant systems may fail. In these
situations, flight crews may experience difficulty in diagnosing problems and
performing corrective actions if they have been lulled into overconfidence by
highly automated flight systems, or have lost the fine edge of their skills as a
result of disuse.
Some of these automation concerns are illustrated by the following scenario:
A pilot of average skill is captain of an advanced, highly automated
aircraft incorporating features such as relaxed static stability, full-time
augmentation, a sophisticated flight guidance and control system, and
"envelope protection" with most failures detected and corrected. The
captain has flown in this type of aircraft for some years and has recently
upgraded to his position. The crew flies in the automatic modes most of
the time. They are making an automated approach and landing, when, at
the middle marker, a major electrical failure causes the aircraft to revert
back to its basic characteristics. The crew has to take over control of the
aircraft, make the correct decisions, and take appropriate actions.
Additional factors may complicate their decisionmaking: night, bad
weather, the start of a bid cycle, fatigue and other plausible and realistic
influences.
The ultimate question for designers, manufacturers, operators, and certifiers is
whether safety will be enhanced by incorporating a specific automated
capability. The answer lies in the crew's ability to interact with the automated
system effectively and to take over in the event of a failure or a situation not
foreseen by the designers.
Table 9.1
The Spectrum of Automation in Decision Making (Sheridan, 1980)

100% HUMAN CONTROL

 1. Human considers alternatives, makes and implements the decision.
 2. Computer offers a set of alternatives which human may ignore in
    making decision.
 3. Computer offers a restricted set of alternatives, and human decides
    to implement.
 4. Computer offers a restricted set of alternatives and suggests one,
    but human still makes and implements the decision.
 5. Computer offers a restricted set of alternatives and suggests one,
    which it will implement if human approves.
 6. Computer makes decision, but gives human option to veto before
    implementation.
 7. Computer makes and implements decision, but must inform human
    after the fact.
 8. Computer makes and implements decision, and informs human only if
    asked to.
 9. Computer makes and implements decision, and informs human only if
    it feels this is warranted.
10. Computer makes and implements decision if it feels it should, and
    informs human only if it feels this is warranted.

100% COMPUTER CONTROL
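
Table 9.1 can be read as a lookup from automation level to the role left to the human. The sketch below is one hypothetical encoding. The function name and the short role labels are paraphrases introduced here for illustration, not Sheridan's wording.

```python
def human_role(level: int) -> str:
    """Summarize the human's role at a given level of Table 9.1 (1 to 10)."""
    if not 1 <= level <= 10:
        raise ValueError("level must be between 1 and 10")
    if level <= 4:
        # Levels 1-4: the computer at most offers or suggests alternatives.
        return "human decides and implements"
    if level == 5:
        return "human must approve before computer implements"
    if level == 6:
        return "human may veto before computer implements"
    if level == 7:
        return "human is informed after the fact"
    if level == 8:
        return "human is informed only on request"
    # Levels 9-10: the computer decides whether the human is told at all.
    return "human is informed only if computer chooses"


for lvl in (1, 5, 6, 10):
    print(lvl, human_role(lvl))
```

Grouping the levels this way shows where the human stops being a required step in the loop (between levels 5 and 6) and where the human stops being reliably informed (between levels 7 and 8).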

Experience with Automation in Nonaviation Systems

Experience gained in the design and operation of automated systems in nonaviation environments is often relevant for aircraft systems. Process plants such
as power plants, oil refineries, factories, and offices have adopted various levels
of automation.
Nuclear Power Studies
Designers of nuclear power plants have incorporated many automated features
in their control systems to avoid catastrophic human error. This is because they
fear that the human operator may not be able to respond to system emergencies
that occur with manual systems. They believe that automation, because of the
complexity of nuclear power plant design and processes, can solve this problem.
Yet, it has been found that automated safety systems aren't necessarily the
answer. An evaluation of 30,000 nuclear plant incidents revealed that 50
percent occurred through unique combinations of machine and human error
(Woods, 1987).
The Three Mile Island accident is a case in point. The initial blame for this
incident was assigned to humans. Investigators found, however, that the design
of the human interface was greatly deficient. Designers had not considered the
human functions systematically. They had paid little attention to display/control
design or work station layout. The control room was filled with banks of almost
identical controls and displays that made it difficult to identify the appropriate
information source or required response to system problems. For some
functions, the operator could not see the display and the corresponding controls
simultaneously. To compound these problems, training of control room
operators had been inadequate.
After the Three Mile Island accident, the response of managers and designers
was to further divorce the human operator from system control through even
more automation. Extensive programs for redesigning the displays and controls
were initiated. One involved changing the warning system from a tile (e.g.,
legend light) system to a computer-based system. The purpose was to automate
the alarm system and reduce display clutter. The result was disappointing. The
computer-based system wasn't programmed to anticipate all the possible
combinations of events that could occur; the operators lost the ability to
integrate display information by recognizing patterns of lights and thus gain
insight into the fundamental problem.

Additional multimillion dollar studies were initiated in several countries to
develop a computer-based fault diagnosis system. The goal was to reduce the
operator's role in fault diagnosis. It was found that fault diagnosis could not be
totally automated. The computer solved the easy problems, but the tough ones
were left for the operator. The operator tended to be overloaded with data and
also tended to be deskilled; that is, he or she lost the value of practicing on the
easy problems. Ultimately, the effort to automate fault diagnosis was
abandoned.

Office Automation
Research in office automation has shown that no system, even a very simple
one, is ever completely defined by designers. One reason is that the system is
not always used for the purpose initially intended. A screwdriver offers a simple
example. It was designed to drive or loosen screws. But it is also used to open
lids of cans, scrape surfaces, clean fingernails, and even as a weapon. Similarly,
a wire coat hanger may be used to help open a locked car.
The same variability in application is found with automated systems. Inventory
systems may be used differently as business grows, shrinks, and/or conditions
change. Accounting systems may have to be altered as tax laws change. Even
office electronic mail systems may be used variably as security needs or capacity
requirements change (Card, 1987).
According to articles in the public press, many of the increases in productivity
anticipated from office automation have not been realized. Moreover, the costs
in personal satisfaction and well-being have been high. Worker motivation has
suffered as jobs have been changed and depersonalized.
Table 9.2 offers some conclusions various authorities have reached after
studying automation in arenas other than aviation.

Table 9.2
Conclusions Based on Research in Nonaviation Automation

* Humans tend to be less catastrophically affected than computers when
  subjected to severe overload (Sinaiko, 1972).
* Human performance is degraded when automated systems perform very well
  (Rouse, 1977).
* In situations that require strict vigilance, information sampling and transfer is
  done better by humans than by automated systems (Crossman, Cooke, and
  Beishon, 1974).
* About 20% of human input errors go undetected (DoD, 1985).
* Automation can lead to sloppiness (Card, 1987).
* The nuclear power industry's evaluation of 30,000 incidents revealed that
  about one-half occurred through unique combinations of machine and human
  errors. Trying to change the automated system sometimes created new
  difficulties because the interaction between humans and machines was changed
  in unforeseen ways (Woods, 1987).
* Automated systems usually solve simple problems but fall down in more
  complex cases (Roth, Bennett, and Woods, 1987).
* We need to complement the design for prevention of trouble with the design
  for management of trouble (Roth, Bennett, and Woods, 1987).
* Computer systems should be designed as a tool, not as a replacement for the
  human (Roth, Bennett, and Woods, 1987).

Experience with Automation in Aviation


The effects of automation on human performance are difficult to assess. Many
of these, especially boredom and loss of skill, occur fully only after extended
periods of time. To quantify the effects of automation on human performance
would require time-consuming longitudinal studies under controlled conditions.
There are sources of data, however, such as opinion surveys and accident-incident data that can help provide insight. Some of these data will be
summarized and discussed in this section.

Accident Data
Errors on the part of the flight crew have historically been cited as a primary
cause in most accidents. Figure 9.2 presents data tabulated by Boeing
Commercial Aircraft Company and cited by Nagel (1989). It shows that
flight crews have been identified as a primary cause for accidents about 65
percent of the time. The next largest primary cause--airframe, power plant, or
aircraft system failure--accounts for less than 20 percent of accidents.
As shown in Figure 9.2, the flight crew has remained a primary cause of
accidents at about the same frequency over the years since 1957. The overall
improvement in system safety is probably not the result of any single
factor. Reliability of equipment, better knowledge of weather, and almost
universal availability of instrument landing systems have undoubtedly
contributed. The largest gain in safety of air travel was made during the
1977-1981 period. (This was before the introduction of third generation jets that
dramatically increased automation in the cockpit.) The following period
(during which the MD-80, 757, and 767 were introduced) suggests a slight
reduction in safety, but this change may not be statistically significant. Even
though the share of accidents attributed to flight crew error has remained
constant, flight crew performance probably has improved through better training
(use of simulators, for example), better human factors engineering, and other
performance enhancements.

Figure 9.2. Boeing statistical summary of primary cause factors for accidents, worldwide commercial jet fleet, 1959-1986, with the last 10 years (1977-1986) shown separately. Source: Statistical Summary - Boeing, from Nagel.

The trend in commercial aviation has been toward dramatic improvements in
safety. Table 9.3 shows accident trends in terms of the probability that an
individual will be killed due to an accident on any nonstop flight in the United
States in 5-year increments since 1957, the date jet service was initiated. The
data indicate that a traveller is approximately 10 times safer now than in the
1950s.

Table 9.3
Probability of an Individual Being Killed
on a Non-Stop U.S. Domestic Trunkline Flight

PERIOD      RISK LEVEL
1957-61     1 in 1.0 million
1962-66     1 in 1.1 million
1967-71     1 in 2.1 million
1972-76     1 in 2.6 million
1977-81     1 in 11.0 million
1982-86     1 in 10.2 million
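
The "approximately 10 times safer" figure follows directly from Table 9.3. The short calculation below uses only the values in the table. The variable and function names are illustrative.

```python
# Risk levels from Table 9.3, expressed as "1 in N million" per flight.
ONE_IN_MILLIONS = {
    "1957-61": 1.0,
    "1962-66": 1.1,
    "1967-71": 2.1,
    "1972-76": 2.6,
    "1977-81": 11.0,
    "1982-86": 10.2,
}


def safety_ratio(earlier: str, later: str) -> float:
    """How many times safer the later period is than the earlier one."""
    return ONE_IN_MILLIONS[later] / ONE_IN_MILLIONS[earlier]


print(round(safety_ratio("1957-61", "1982-86"), 1))  # 10.2
print(round(safety_ratio("1977-81", "1982-86"), 2))  # 0.93
```

The second ratio is below 1, which reflects the slight reduction in safety after 1977-81 noted in the discussion of accident data.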

Incident Data
An incident has been described as an accident that didn't happen--an event that
could have resulted in an accident but did not because the crew recovered
(avoidance maneuver) or other factors intervened. Since incidents occur more
frequently than accidents, they provide sufficient data to identify trends that
may allow detection of unsafe conditions and allow corrective measures to be
initiated before accidents occur.
The Aviation Safety Reporting System (ASRS) was established by NASA to
provide an incident database. The ASRS database includes data from all
segments of aviation, including commercial aviation, general aviation, and air
traffic control. It is interesting that ASRS incident data presented in Figure 9.3
mirror almost exactly the proportion of human error depicted in Figure 9.2.
A NASA study on classification and reduction of pilot error used the ASRS
database to identify problems associated with Control-Display Units (CDU) in
cockpits (Rogers, Logan, and Boley, 1989). The CDU is a common feature of
automated systems and is a common source of crew errors. It allows the
operator to program and observe the state of automated equipment. In modern
cockpits, it generally consists of a cathode ray tube (CRT) and a related
keyboard.
Of the approximately 29,000 reports in the ASRS database at the time of the
NASA study, 309 involved CDUs. Table 9.4 provides some specific problems
found with CDUs. This analysis of CDUs shows that both human and machine
error occurred, with human error predominant. Clearly, humans make errors
even in automated systems.

Figure 9.3. ASRS incident data.

Table 9.4. Percentage of reported ASRS incidents attributed to CDU problems, such as insufficient time to program and runway changes.

One potential weakness of the ASRS is that reports are voluntary. Not everyone
experiencing an unsafe condition reports it. Aircraft equipment malfunctions
probably occur more frequently than are reported to the ASRS, particularly
when they do not lead to incidents or near accidents.
However, the FAA requires significant equipment problems to be reported as
Service Difficulty Reports (SDRs). An analysis of SDRs for DC-9/MD-80 aircraft
during one time period is provided in Table 9.5. This table reveals that of the
445 events included, 201 required crew intervention. Of these, 160 required an
unscheduled landing or aborted takeoff. Only four SDRs involved cockpit crew
error. In focusing on accidents, it is easy to forget just how significant the
crew's role is in averting accidents caused by equipment malfunction.
Data from the Douglas Aircraft Accident/Incident Database supports this
conclusion. Table 9.5 shows that of the 736 reports in the McDonnell-Douglas
database, 65 percent are related to equipment malfunction. Only 12 percent are
related to crew error.

Pilot Opinion
The opinion of the flight crews operating the aircraft provides an important
source of information on cockpit design. Although crew opinion is subject to
many sources of bias and is not by itself adequate for design decisions, it is a
rich source for hypotheses about design advantages, disadvantages, and areas
needing intensive study.
Table 9.5
Analysis of DC-9/MD-80 Service Difficulty Reports

TOTAL EVENTS FOR TIME FRAME          445
NUMBERS OF CREW INTERVENTIONS        201
    ABORT TAKEOFF                     29
    UNSCHEDULED LANDING              131
    EMERGENCY DESCENT                 11
    FUEL DUMPING                       3
    DEACTIVATE SYSTEM                 13
    ENGINE SHUTDOWN                   12
    OTHER                             29
AUTOMATIC SYSTEM FAILURES             45
OTHER EQUIPMENT FAILURES             230
CREW ERROR                            12*

*Eight of these were related to flight attendants (galley problems, etc.)

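The figures quoted in the text for the Service Difficulty Reports can be checked against Table 9.5 with a few lines of arithmetic. The variable names below are illustrative, and the values are those in the table. The itemized interventions add up to 228, more than the 201 events requiring intervention. A plausible reading, which is an assumption here, is that a single event can involve more than one crew action.

```python
# Counts taken from Table 9.5 (DC-9/MD-80 Service Difficulty Reports).
total_events = 445
crew_interventions = 201
abort_takeoff = 29
unscheduled_landing = 131
crew_error_reports = 12
flight_attendant_related = 8  # from the table footnote

# Unscheduled landings plus aborted takeoffs, as quoted in the text.
landing_or_abort = abort_takeoff + unscheduled_landing               # 160

# Crew-error reports that involved the cockpit crew.
cockpit_crew_errors = crew_error_reports - flight_attendant_related  # 4

# Shares of all reported events.
intervention_share = crew_interventions / total_events    # about 0.45
cockpit_error_share = cockpit_crew_errors / total_events  # under 0.01

print(landing_or_abort, cockpit_crew_errors)
print(round(intervention_share, 2), round(cockpit_error_share, 3))
```

On these numbers the crew intervened in roughly 45 percent of reported events, while cockpit crew error accounted for fewer than 1 percent.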
NASA performed several field studies of crew acceptance after the
introduction of the MD-80 and B-757/767 aircraft. Data sources included
direct observation of crew performance on the flight deck during normal
revenue service, interviews, and questionnaires. The results reported were
generally as follows (Curry, 1985; Wiener, 1985):
* Crews liked automated aircraft.
* There was a slight trend toward reduced workload.
* Late changes by ATC created problems in reprogramming.
* There was a slight trend toward fewer errors with automation.
* As crew experience increased, there was a tendency to turn off
  the automation (Flight Management Systems) during busy times.
  Several factors were cited to explain this: mismatch of automated
  system capability with ATC instructions; slow response of
  autopilots; problems in crew interfaces; and training
  inadequacies.

This brief review suggests that many of the same difficulties encountered in
non-aviation environments are experienced in automated cockpits.

Reasons Cited for Automating Systems


Operators and manufacturers want to maximize the return on their
investment. The decision to invest the huge sums required to develop and
certify new systems is not undertaken lightly. For operators and
manufacturers to seriously consider automating systems, there must be
strong potential for a dividend. For the manufacturer the dividend is
increased sales and safety. Table 9.6 lists benefits commonly cited to justify
automation.
Most of these reasons emphasize the need for increased efficiency in
operation and use of airspace and airports, and improved operations in
varying environments. To meet these requirements, designers seek ways of
providing lower fuel, maintenance, and crew costs while improving
efficiency through higher reliability, greater payloads, and more precise
flight path control. Trends in aircraft flight deck design indicate that airline
and manufacturer decisionmakers believe that automation will help meet
their goals.

Table 9.6
Reasons Cited for Automating Systems

* Enhanced safety
* Reduced human error
* Improved human performance
* Reduced approach noise
* Reduced weight
* Increased capacity
* Reduced crew workload and fatigue
* Reduced crew training requirements
* Improved passenger comfort and ride quality
* Reduction of boring, tedious, and/or unpleasant tasks
* Improved reliability and schedule performance
* Reduced crew size
* Improved efficiency
* Improved management control
* Reduced costs
* Improved speed and quality of learning
* Competitive posture
* Increased precision, accuracy, stability
* Performance of functions beyond human capability
* Reduced task difficulty, more convenience and ease of use
* Increased operational capability

Some Automation Concerns


Designing a new aircraft system as complex and sophisticated as a modern
airliner is a formidable challenge, particularly with typical time and budget
constraints. Meeting this challenge requires a design team to focus intensely
on their objective.
Historically, designers have emphasized hardware development. They have
relied on the crews to adapt to their flight deck designs rather than
designing cockpits to accommodate the performance characteristics of the
crews. Allocating budgets for human factors considerations has been a hard
sell for many reasons: lack of understanding, uncertain or unspecified
payoffs, undefined criteria, threats to established budgets and schedules, lack
of a recognized and/or easily accessed human factors database, and mistrust
of human factors practitioners.
It is perhaps ironic that automation--intended to reduce the reliance on
humans--may require greater attention devoted to human factors. A British
author who has worked extensively in studying automation in process plants
and offices has identified a number of ironies associated with automation
(Bainbridge, 1987). Table 9.7 lists some of the author's observations.

Table 9.7
Ironies of Automation (Bainbridge, 1987)

* By taking away the easy parts of the task, automation can make the
  operator's task more difficult.

* The classic aim of automation is to replace human manual control, planning
  and problem solving by automated devices. But even highly automated
  systems need humans for supervision, adjustment, maintenance, expansion,
  improvement, etc.

* The more advanced a system is, the more crucial may be the contribution of
  the human operator.

* Designers may view the human operator as unreliable and inefficient, to be
  eliminated if possible. There are two (2) ironies in this: design error can be
  a major source of operating problems; and designers seeking to eliminate the
  human operator still leave him/her to do the tasks which the designers can't
  automate.

* Efficient retrieval of knowledge from long-term memory depends on
  frequency of use. (Consider any course which you passed and haven't
  thought about since.) Knowledge about how to cope with abnormal
  conditions develops only through use and feedback. Yet the operator is
  expected to cope with such situations when the reliability of the automated
  system is the justification for acquisition.

* Current automated systems work because they are being monitored and
  aided by formerly manual workers. Later generations of operators may not
  have the requisite skill and knowledge to make the automated system work.

* A paradox is that with some automated systems the human operator is given
  a task which is only possible for someone who has on-line control.

* Catastrophic breaks or failures are relatively easy to identify. Automated
  control can, however, camouflage a system failure by controlling against the
  variable that is changing, so trends do not become apparent until they are
  beyond control.

* If a human is not involved in on-line control, he does not have detailed
  knowledge of current system state. The straightforward solution in the event
  of a detected failure is to shut down. Problems arise when, because of some
  factor, the process must be stabilized rather than shut down.

* It is not adequate to expect an operator to react to unfamiliar events solely
  by consulting the operating procedures. These cannot cover all of the
  possibilities, so the operator is expected to fill the gaps.

* It is ironic that the most successful automated systems, with rare need for
  normal intervention, may need the greatest investment in operator training.
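
One of the ironies listed in Table 9.7, that automated control can camouflage a failure by controlling against the changing variable, can be illustrated with a toy simulation. The sketch below is only an illustration under stated assumptions: a single monitored variable, a simple integral-style controller with limited authority, and a fault modeled as a steadily growing disturbance. The function names and all numeric values are hypothetical and are not taken from Bainbridge or from any aircraft system.

```python
def simulate_masked_drift(steps=100, drift_per_step=0.02,
                          actuator_limit=1.0, gain=1.0):
    """Toy model (hypothetical): a growing fault is cancelled by a controller
    until the controller runs out of authority."""
    setpoint = 0.0
    control = 0.0
    history = []
    for step in range(steps):
        # The underlying fault grows a little on every step.
        disturbance = drift_per_step * step
        # What the operator sees is the fault plus the controller's correction.
        observed = disturbance + control
        history.append({"step": step, "disturbance": disturbance,
                        "control": control, "observed": observed})
        # Integral-style update: push the control against the observed error,
        # but never beyond the actuator limit.
        error = setpoint - observed
        control = control + gain * error
        control = max(-actuator_limit, min(actuator_limit, control))
    return history


def first_step_exceeding(history, key, threshold):
    """Return the first step at which |history[key]| exceeds threshold, or None."""
    for sample in history:
        if abs(sample[key]) > threshold:
            return sample["step"]
    return None


if __name__ == "__main__":
    run = simulate_masked_drift()
    print("fault exceeds threshold at step",
          first_step_exceeding(run, "disturbance", 0.15))
    print("displayed variable exceeds threshold at step",
          first_step_exceeding(run, "observed", 0.15))
```

In a run of this sketch the underlying fault crosses the chosen threshold long before the displayed variable does, because the controller absorbs the drift until it saturates. A monitor who watches only the displayed variable would see the trend late; displaying the control effort as well is one way such a trend might be exposed earlier.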

Aviation includes a number of features that make inappropriate or failed
automation more critical than most other applications. These include:
* The need for rapid action if a failure occurs near the ground.
* The inability to just shut down the system to troubleshoot and fix the
  problem.
* The potential for large numbers of deaths and/or injuries if an accident
  occurs.
The Society of Automotive Engineers (SAE) committee on Behavioral
Technology has identified a number of specific concerns related to cockpit
automation. The following discussion elaborates on these concerns.
Loss of Situation Awareness
Humans focus attention on the tasks they perform. They obtain information
related to the task, make decisions, and take actions as a matter of course.
The task of monitoring an automated system tends to be boring. If the
system is reliable, only rarely is there a need for the operator to intervene
and exercise his/her ability. Consequently, the operator becomes easily
distracted and may tend to allocate attention to other interests. As a result,
the operator may lose a sense of what is happening that is relevant to the
operation for periods of time during the activity. As an operator spends
months and years performing the same monitoring functions, boredom and
distraction may increase, exacerbating loss of situation awareness.
Loss of Proficiency
A high degree of competence in any skill requires practice. The keen edge of
finely honed skills may be rapidly lost if not used. A safe pilot needs a
high degree of proficiency in psychomotor, cognitive, and communication
skills. Automated systems tend to eliminate opportunities for operators to
practice their skills. There is concern about how these skills will be retained
in an automated system.
Reduced Job Satisfaction
A worker doesn't have to be entirely content to perform well, but
satisfaction with a job is important to long-term performance and employee
retention. Several factors lead to job satisfaction. These include the feeling
that the job is important and that there is an opportunity of using one's
abilities to meet a challenge. Automation may have an adverse effect on job
satisfaction if it reduces the opportunity to experience these feelings.
Overconfidence/Complacency
The negative potential of a highly reliable system is crew overconfidence in
the system. The system always works, but if it doesn't, the crew can be
surprised and unprepared to compensate.
Intimidation by Automation
If one is below average in operating ability, deskilled through disuse,
inexperienced, or overconfident in a system, he or she may be reluctant to
take over when the automated system doesn't perform. If the system has
been designed so that it is difficult for a flight crew to know what the
automated system is doing or why it is operating in a certain way, this may
add to their uncertainty. This reluctance and uncertainty will reduce the
ability of the crew to fulfill its responsibility for taking over when it is
appropriate.
Increased Training Requirements
Automated systems may be complex. For example, aircraft flight guidance
and control systems have many modes; failures in these systems may require
reprogramming or assumption of manual control. Even reprogramming to
accommodate a change in the flight plan may require many actions. Thus,
operators may require training in order to maintain proficiency in both the
automatic and manual modes, modes which require different skills. As
system complexity increases, there may be a corresponding increase in
training requirements.
Inability of the Crew to Exercise Authority
This concern is related to others: the potential of automation to intimidate
operators, the design of the crew interface, and fundamental design features
such as envelope protection. As shown in Table 9.1, the further the crew is
removed from decisionmaking by higher and higher levels of automation,
the greater the danger of reducing the crew's ability to intervene or exercise
authority.
Design-Induced Error
One reason designers incorporate automation is to reduce human error.
Automation may reduce the frequency of human error, but the consequences
may be more critical because of the extent of the control exerted by the
automation. For example, multimode displays and keyboards may require
more disciplined cross checking and special procedures to assure the desired
mode is selected before control actions are taken.
On the other hand, many errors attributed to humans are facilitated by
poorly designed crew interfaces such as difficult-to-use displays and controls.
In fact, many of the early problems that supported the development of
human factors engineering as a separate design discipline were knob and
dial problems. Design-induced error became recognized as a real contributor
to human error. Automation does not completely eliminate this type of
error, and, in some cases, may facilitate it.
Display design is one of the areas where automation may contribute to
human error. The common sense approach so often used in the past by
display designers will certainly not be adequate to evaluate the varieties of
new format designs possible with electronic presentation. What is common
sense to a designer sitting at his desk may not be common sense to a pilot
flying in a crisis environment. For example, electronic displays introduced to
date have presented information in formats similar to those available in
conventional cockpits. These may need to be augmented by displays more
suitable for the specific monitoring function required.
Developing displays with formats that facilitate quick, accurate
understanding and aid problemsolving and decisionmaking can greatly
enhance crew performance and acceptance of automated systems. Designing
control systems that allow the crew to accurately insert information and/or
control the aircraft is essential for reducing error in programming systems
for normal operation as well as for making effective responses in an
emergency or abnormal situation.

Design Practices
The information provided to this point indicates that although automation is
advancing rapidly, it has not always lived up to its promises. One reason for
this lack of complete success arises from the design processes followed. This
section will consider what might be considered a typical engineering design
process. While specific design teams differ in many respects, and both
personnel and practices constantly change, designers historically have had
enough in common so that it is possible to identify areas where the design
process can be improved, particularly during design of automated systems.
Traditional Design Approach
Historically, the main drivers of new airliner developments have been the
engineering disciplines of aerodynamics, propulsion, and structures. Improved
performance in speed, payload range, and efficiency have generally resulted
from developments in these areas. These disciplines have therefore received the
greatest emphasis (and budgets) during preliminary and advanced design phases
of a project. As new customer needs are identified and technology
improvements are achieved, a design team is established consisting almost
entirely of designers from these disciplines.
The design team reviews accident/incident data to identify safety and reliability
data that will lead to additional product improvement opportunities. They
establish goals, develop alternative configurations, and perform trade studies to
identify the most promising configuration(s). They calculate projected
performance data and contact customers to generate or determine product
interest. Customer feedback is used to refine the design. This process is iterated
until a marketable design is evolved. Once enough sales have been achieved to
justify the required investment and financing is obtained, the authority to
proceed into the detail design and development phases is given.
The development of a new airliner may cost several billion dollars. A number of
preliminary designs may be required before one is accepted for development.
Consequently, preliminary and advanced design teams are usually kept as small
as possible to minimize expenditure of company funds on projects that do not
go forward. Frequently, cockpit human factors issues are not considered except
in a cursory way during early design stages.
The average time spent from advance technical planning to certification is about
3 years. Design, subcontracting, tooling, fabrication, assembly, and testing must
all be completed during this period. Generally, one year is allocated for flight
testing, which reduces engineering, fabrication, and assembly to about 2 years.
Thus, there is little time for research and redesign. Issues must be addressed
and resolved quickly. Redesign, particularly after drawings have been released
from engineering, may greatly increase costs and jeopardize contractual
deadlines. For all these reasons, there is great resistance among aircraft
manufacturers to changing procedures that have proven successful in prior
programs.

The process described above has a number of weaknesses. One is the limited ability of
design engineers to fully understand and weigh all the factors that influence
their designs (see Table 9.8). Research into actual engineering practices has
revealed a number of areas where design teams depart from the ideal (Meister,
1987). Designers often deviate from a deliberate, logical process. Behavioral
data, even if available to a designer, may be ignored. Managers may reject
designers' recommendations if they believe these make no difference in
traditional aircraft performance parameters--reliability, cost, or development
time.
Table 9.8
Cognitive Factors Influencing Design Elements (Meister, 1987)

* Statement of the problem
* Statement of criteria and priorities
* Identification of constraints
* The engineer's design style (logical, intuitive)
* Information obtained or retained
* Experience
* Preconditions (i.e., other design decisions)
* A mental outline of what must be done

Due to the revolutionary nature of cockpit changes brought about by
automation and the need for experimental testing due to our incomplete
understanding of human-computer interaction, cockpit design should be one of
the earliest issues addressed in the design of advanced aircraft. As much time as
possible should be provided to identify and address human factors issues in the
cockpit. This is particularly true since human factors have received little
emphasis in past designs.
Both Boeing and McDonnell-Douglas have recognized the need for increased
consideration of human factors issues in the design process. These
manufacturers have added to their professional staffs in the human factors
disciplines and also drawn on simulation studies to support the design process.
Automation Philosophy
Past design practices have generally not made a cockpit design philosophy
explicit. The general approach has been to incorporate advanced technology
whenever it appeared to have a payoff or whenever the manufacturers'
customers wanted it. There has also been an interest in reducing crew workload
through automation of various flight functions. The latter area received
particular emphasis during the development of the MD-80 and B-757/767 designs
in order to justify a two-person crew. Recently, as cockpit automation has
developed and its impact on safety has generated concern, more attention has
been devoted to identifying a philosophy of automation. Boeing has published a
paper illustrating its philosophy for some recent aircraft (Fadden and Weener,
1984).
A primary approach to reducing flight deck workload has been to simplify
system design to make the aircraft easier to operate. As an example, the number
of fuel tanks has been reduced to simplify fuel transfer procedures. System
redundancy has been the next most common approach to increasing flight
safety. Automation has been incorporated only if design goals cannot be
achieved otherwise. Table 9.9 provides reasons commonly used to justify
automation in Boeing's view.
Figure 9.4 illustrates Boeing's process for determining the level of crew
involvement in flight deck operations. A number of automation philosophies
have been proposed for making such determinations. Table 9.10 lists some of
them and their limitations.
Although none of these philosophies seems to be completely adequate at
present, there appears to be growing support for the concept of
human-centered automation, as evidenced by the conclusions of the NASA conference
attenders cited later in this discussion. It should be apparent that if an
effective cockpit is to be designed at minimal cost, the cockpit design
philosophy should be specified early in the design process and made clear to all
on the design team.

Table 9.9
Boeing's Automation Philosophy
(Reasons to Automate)

* Simplified/minimized crew procedures for subsystem operation
  - reduces random and systematic error
  - increases time for primary pilot functions
  - prevents requiring any immediate crew action
  - reduces subsystem mismanagement accidents
  - centralizes crew alerting for error reduction
  - allows fire walling engine controls
  - allows two-person crew operation

* Improved navigation information
  - provides more exact airplane position indication
  - reduces fuel usage
  - provides higher reliability and improves accuracy
  - reduces crew error
  - reduces workload, allows more preplanning

* Improved guidance and control
  - reduces workload
  - allows operation at lower minimums
  - allows manual, semiautomatic, or automatic pilot flight
  - increases precision of guidance information

The Influence of Crew Role on Design
Design of displays and controls depends on the role that is assumed for the
operator. It is therefore imperative to define the role of the flight crew
operating automated aircraft prior to designing cockpit displays and controls.
In designs with a low degree of automation, the operator must be present for
the system to perform properly. As the degree of automation increases, the
function and duties of the crew become less clear, and designers find it possible
to exclude the crew from consideration. This is partially due to designers' overconfidence--their belief that systems won't fail, or that flight crews are
adaptable and able to adequately resolve any problems that may arise from
system malfunction or failure.
The pilot's role has traditionally been described in terms of four primary tasks:
aviate, navigate, operate, and communicate. Aviate means to fly the aircraft by
keeping its altitude, speed, and configuration within safe operating ranges.
Navigate means to perform the actions required to fly from the present position
to a desired position. Operate means to manipulate the controls required to
make all of the systems--control, navigation, hydraulic, electrical, pneumatic,
etc.--perform as intended and/or to compensate for equipment malfunctions.
Communicate means to understand human messages and interpret display
information so that others inside and outside the cockpit know the aircraft's
current status and intentions; it also includes providing information as required.
No system designed to date has the range of capabilities to perform all of these
complex tasks. Only the human is uniquely qualified to perform the functions
necessary to fly an aircraft.

[Figure 9.4. Boeing guidelines for crew function allocation. Flow diagram leading to the crew system configuration.]

Table 9.11 depicts the processes, activities, and specific behaviors that are
characteristic of all task-oriented activities. An assessment of the impact of
automation on the crew role reveals that flight crews are still required to
perform the operator functions shown in Table 9.11. The amount and
scheduling of time allocated to various tasks changes, but not the need for all
of these traditional functions and activities.
A 1988 NASA conference and workshop was dedicated to identifying and
addressing cockpit automation issues (Norman and Orlady, 1988).
Representatives of airlines, manufacturers, pilot associations, academia, and
government participated. The conclusion reached by this group was that
advanced cockpits bring about both task structure and culture changes.

Some of the task changes identified were a decreased need for computations by
flight crews, reduced opportunity to practice motor skills, less active systems
monitoring, and more evenly balanced workload between the pilot flying (PF)
and pilot not flying (PNF). In an advanced cockpit, the PF has more of a
managerial function than previously, while the PNF does more work but less
active systems monitoring.
Table 9.10
Design Philosophies

PHILOSOPHY: Operator should be manager and make decisions at knowledge-based
level (skill, rule, knowledge).
LIMITATION: Doesn't consider operator role in compensating for system failures.

PHILOSOPHY: Operator should work as a manager.
LIMITATION: Term manager is poorly defined. Cockpit management functions must
play backup role in aviation since process of flying the aircraft cannot
be shut down.

PHILOSOPHY: Let the crew do what they want to do and let automation handle the
rest.
LIMITATION: In order for concept to work, have to communicate intentions to
system. Because crew desires are variable, requirements to keep computer
informed may be overwhelming.

PHILOSOPHY: Design envelope around system. As long as crew stays within
envelope, crew can fly any way it wants. If envelope is approached computer
intervenes (warns or takes control).
LIMITATION: Variation of prior philosophy. May not be feasible from technical or
cost standpoint. Envelope may vary for different routes, environments, etc.

PHILOSOPHY: Automate everything feasible and let crew handle the rest.
LIMITATION: Crew may not be well adapted to assigned role. Problem of who is
ultimately responsible for aircraft.

PHILOSOPHY: Human-centered automation.
LIMITATION: Not defined well enough to aid designers.

Cockpit cultural changes included a more even division of responsibility, less
crosschecking, and role reversal in terms of information flow, with the PNF
transmitting more information to the PF than previously.
Participants in the NASA conference felt these changes were not serious in
normal operations, but they might be a concern in abnormal situations
involving minor and major systems failure, particularly situations involving
unexpected systems failure. They concluded that it was absolutely essential for
the flight crew to maintain situation dominance in all flight-related functions. In
other words, the crew should have all the information and controls necessary to
perform all of the traditional functions, even in automated systems. Table 9.12
presents participants' conclusions as to how the crew role has been altered in
recent aircraft design.
Table 9.11
Functions of the Human Operator

PROCESS: PERCEPTUAL
  ACTIVITY: SEARCHING FOR AND RECEIVING INFORMATION
  SPECIFIC BEHAVIORS: DETECTS, INSPECTS, OBSERVES, READS, RECEIVES, SCANS, SURVEYS
  ACTIVITY: IDENTIFYING OBJECTS, ACTIONS, EVENTS
  SPECIFIC BEHAVIORS: DISCRIMINATES, IDENTIFIES, LOCATES

PROCESS: MEDIATIONAL
  ACTIVITY: INFORMATION PROCESSING
  SPECIFIC BEHAVIORS: CATEGORIZES, CALCULATES, CODES, COMPUTES, INTERPOLATES,
  ITEMIZES, TABULATES, TRANSLATES
  ACTIVITY: PROBLEMSOLVING AND DECISIONMAKING
  SPECIFIC BEHAVIORS: ANALYZES, CALCULATES, CHOOSES, COMPARES, COMPUTES,
  ESTIMATES, PLANS

PROCESS: COMMUNICATION
  SPECIFIC BEHAVIORS: ADVISES, ANSWERS, COMMUNICATES, DIRECTS, INDICATES,
  INFORMS, INSTRUCTS, REQUESTS, TRANSMITS

PROCESS: MOTOR
  ACTIVITY: COMPLEX-CONTINUOUS
  SPECIFIC BEHAVIORS: ADJUSTS, ALIGNS, REGULATES, SYNCHRONIZES, TRACKS
  ACTIVITY: SIMPLE-DISCRETE
  SPECIFIC BEHAVIORS: ACTIVATES, CLOSES, CONNECTS, DISCONNECTS, JOINS, MOVES,
  PRESSES, SETS

Human Factors
Human factors may be defined as the application of knowledge about human
characteristics to the design, operation, and maintenance of systems. This
discipline gained recognition during World War II when the military recognized
that performance and safety could be enhanced by improving the harmony
between machine and human characteristics.

Initial human factors interest was largely in knobs and dials--in improving
displays, such as altimeters, and controls such as levers, knobs, and cranks.
Fundamental principles such as control-display compatibility, and color and
position coding, are products of this work. Many of the early contributors to
human factors were experimental psychologists drawn from academia and
employed by the armed forces to study specific problems. At the end of the war,
most of these professionals returned to civilian status. A few remained in
government laboratories.
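Table 9.12
Crew Role

PRIMARY RESPONSIBILITY
  HISTORICALLY: SAFETY
  WITH AUTOMATED AIRCRAFT: SAME

PRIMARY FUNCTIONS
  HISTORICALLY: AVIATE, NAVIGATE, COMMUNICATE, OPERATE
  WITH AUTOMATED AIRCRAFT: SAME, SAME, SAME, SAME

PRIMARY TASK CHARACTERISTICS
  HISTORICALLY: DIRECT CONTROL
  WITH AUTOMATED AIRCRAFT: INDIRECT CONTROL

  HISTORICALLY: MANAGER, OPERATOR
  WITH AUTOMATED AIRCRAFT: MANAGER, MONITOR

  HISTORICALLY: PRIMARY BACKUP TO SYSTEMS
  WITH AUTOMATED AIRCRAFT: SECONDARY BACKUP TO SYSTEMS

  HISTORICALLY: DIRECT INVOLVEMENT CONTINUOUSLY
  WITH AUTOMATED AIRCRAFT: INTERMITTENT DIRECT INVOLVEMENT

  HISTORICALLY: MULTIPLE SOURCES OF INFORMATION
  WITH AUTOMATED AIRCRAFT: FEWER INFORMATION SOURCES

  HISTORICALLY: INFORMATION GENERALLY AVAILABLE
  WITH AUTOMATED AIRCRAFT: INFORMATION MAY HAVE TO BE RETRIEVED

  HISTORICALLY: PERCEPTUAL/PSYCHOMOTOR SKILLS USED FREQUENTLY
  WITH AUTOMATED AIRCRAFT: PERCEPTUAL/PSYCHOMOTOR SKILLS NOT DEMANDED VERY
  FREQUENTLY

  HISTORICALLY: CAPTAIN'S AUTHORITY FINAL
  WITH AUTOMATED AIRCRAFT: CAPTAIN'S AUTHORITY MAY BE PARTIALLY ABROGATED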

As system complexity increased, the U.S. Air Force recognized the need for more
emphasis in human factors and mandated contractors to employ specialists in
this area. Few schools offered courses in the discipline and companies found it
difficult to employ properly qualified people. There was uncertainty also
regarding the role and organizational placement of human factors specialists.
Often they became internal consultants who were used to make
recommendations or perform studies to solve problems after these were
identified.
Because solutions to these problems called for consideration of many issues and
an adequate database was not available, the human factors specialists
recommended experimental studies to resolve the issues. Contributing to their
desire to experiment was the fact that most were trained as scientists, not as
engineers. Design engineers, on the other hand, often couldn't afford the time
and/or budget to accommodate experiments. In addition, human factors
specialists often did not have a background in either aviation or design. As a
result of these and other considerations, the human factors discipline has been
slow to gain whole-hearted support from either the design or the operational
communities.
As system complexity has continued to increase, however, the need for
consideration of human capabilities and limitations has been increasingly
recognized. Most large aircraft manufacturers now maintain a human factors
staff. In fact, human factors disciplines have expanded to include not only
experimental-industrial psychologists, but also physiologists, anthropometrists
and other life and social scientists. Some human factors departments also
include aerospace medicine physicians and training specialists because of the
commonality of their interests and academic backgrounds.
Human factors staffs have become much more knowledgeable about flight
operations and design constraints as they have grown in experience. As a result
of their increased knowledge in these areas, they have also become more
responsive to management and design needs.
In spite of these advances, however, most organizations do not accept human
factors as a core discipline with a status comparable to structures,
aerodynamics, avionics, and more traditional engineering disciplines. (One
exception to this is the U.S. Air Force which has established the Human Systems
Division as one of its prime Research and Development organizations.)
Organizations often support human factors only reluctantly. There are many
reasons for this reluctance:

* Overconfidence in the human ability to adapt to their designs
* Faith that training can compensate for design shortcomings
* Belief that human factors involves only common sense
* Belief that the sciences upon which human factors is based are "soft"
  and pilot experience is better than human factors data
* Judgment that the system will benefit more from an additional
  engineer from one of the traditional engineering disciplines than from
  a human factors specialist.
Perhaps the most significant contributor to organizations' reluctance to support
human factors is the lack of objective human factors criteria in the Federal
Aviation Regulations or in typical design specifications.
There is ample evidence that attention to human factors is warranted based on
accident and laboratory (including simulator) data. It would seem prudent, for
example, that designers should invest most heavily in the area which has been
found to be the largest contributor to accidents, i.e., human error. Since many
of these errors result from design-induced causes, human factors should be a
major concern of both designers and certifiers.
How Human Factors Relate to Automation Design
Flexibility and adaptability are prominent characteristics that make humans
essential to a system. But this flexibility and adaptability are achieved at a cost.
There are design trade-offs for humans as there are for other systems
components. One of the reasons humans are so flexible and adaptable is their
complexity. Many interacting variables can influence their behavior and
performance. This section will briefly address some of the most fundamental
human characteristics related to working with automated systems.
In many automated systems, the role of the human is that of a monitor. If
something fails, the human is expected to detect the failure, determine the
problem, decide what action to take, and execute the appropriate response. The
human must act as a sensor, decisionmaker, and controller. The performance of
humans as monitors has been studied extensively since World War II. The
development of radar and sonar put some people in a position where it was
necessary to detect small stimulus changes which do not occur very often.
Experiments investigating human performance in this type of situation became
known as vigilance studies. Generally, it has been found that humans do not
perform well as monitors. If their interest is not maintained, they become easily
bored or distracted and direct their attention to other considerations.
Physiological studies have determined that periods of inactivity with few
demands are not conducive to good performance. The need for stimulus change
may cause the monitor to attend to other nonwork-related interests. Attention
to an outside stimulus may make it difficult physiologically for work-related
stimuli to be perceived. The brain inhibits the neural response to stimuli that
are not related to its primary focus (Hilgard and Atkinson, 1967). There is also
evidence that if attention is dedicated to one channel of information for a
period of time, information from other channels may tend to receive increased
priorities (Broadbent, 1957).
An additional human cognitive characteristic is the need for warm-up. Once a
task has been deferred for a while and then reinitiated, it takes some time
before the person is able to perform the task at peak effectiveness. The degree
of cognitive warm-up required depends on the person's level of skill, the
difficulty of the task, the time since performing the task, the degree of similarity
between the task and intervening activities, and other factors.
These and other findings lead to the conclusion that too low a workload
degrades human performance. Similarly, if the workload is too high,
performance can suffer. Optimal performance generally is obtained when the
relevant variable is in the middle range. Figure 9.5 illustrates this relationship
for workload. Figure 9.6 is an example of how the relationship could be applied
to cockpit design.
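The inverted-U relationship described above can be sketched numerically. The snippet below is only an illustration: the quadratic shape, the arbitrary 0-to-1 workload scale, the limit values, and the function name `relative_performance` are all assumptions made for this example, not a model or measure taken from the report.

```python
def relative_performance(workload: float, lower: float = 0.2, upper: float = 0.8) -> float:
    """Hypothetical inverted-U: 0 at or beyond the limits, 1 at the midpoint.

    `workload`, `lower`, and `upper` are on an arbitrary 0-1 scale (assumed).
    """
    if not lower < upper:
        raise ValueError("lower limit must be below upper limit")
    if workload <= lower or workload >= upper:
        return 0.0
    mid = (lower + upper) / 2.0
    half_width = (upper - lower) / 2.0
    # Parabola peaking at the midpoint and reaching zero at both limits.
    return 1.0 - ((workload - mid) / half_width) ** 2


for w in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"workload={w:.1f}  relative performance={relative_performance(w):.2f}")
```

Any curve that falls off on both sides of a middle range would make the same point; the parabola is used only because it is the simplest such shape.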

[Figure 9.5. Hypothetical relationship between workload and performance (original figure). Horizontal axis: workload, with lower and upper limits marked; vertical axis: performance.]

Research dedicated to finding valid, reliable measures of workload has been
recently emphasized in a number of laboratories and progress has been made. A
recent FAA contract supported an extensive review of the workload
measurement literature (Corwin et al., 1989). Physiological, behavioral and task
analysis measures were investigated. Although no totally acceptable assessment
method was identified, a number of useful techniques are available.


Table 9.13 is a list of psychological phenomena relevant to human performance
with automated systems. The literature includes a great deal of research in each
area. Understanding and interpreting this literature requires specialists. This
need for specialists is becoming increasingly recognized by many agencies. The
Air Transport Association, for example, has not only established a standing task
force to identify human factors issues and promote their resolution, it has
encouraged the elevation of human factors to a core discipline in aircraft design
commensurate with such engineering disciplines as aerodynamics.

[Figure 9.6. Beneficial adjustment of crew workload by phase of flight (original figure). Horizontal axis: time over the mission scenario (Takeoff, Climbout, Cruise, Descent, Approach, Landing). Annotations: Add Work; Subtract Work; Provide Added Anticipation; Reduce Work to Provide Safety Margin.]

"Soft" Sciences and the Need for Testing
One of the reservations many organizations have about human factors is that
it is supposedly based on "soft" sciences. This perception is not accurate.
Research into human characteristics has generated a great deal of "hard"
information. Sensory processes are reasonably well understood, and a great deal
is known about perception, learning, memory, motivation and emotion. Useful
data are also available regarding decisionmaking.
It is true, however, that in spite of the amount of data available, few theories
are available to integrate these data in useful human factors applications.
Adding to the difficulty is the fact that many interacting variables may influence
a person's performance in unpredictable ways at any specific time. These
limitations make prediction of an individual's behavior at any instant difficult.
This problem is not unique to the behavioral and social sciences, however.
Medicine and pharmacology are similarly affected. More to the point, perhaps,
is the fact that many "hard" disciplines such as aerodynamics and meteorology
also have similar problems. In all of these disciplines, there is a need for
extensive testing to determine the efficacy of a particular design or model.

Table 9.13
Some Psychological Topics Relevant to Automation

* Arousal                           * Influence of learning/practice on perception
* Motivation (Yerkes-Dodson Law)    * Response time and mental set (anticipation)
* Stress                            * Warm-up
* Inhibition (Hernandez-Peon)       * Short-term memory and distractions
* Inverted U-shaped curve           * Long-term memory
* Attention                         * Need for practice
* Isolation                         * Transfer of training
* Vigilance                         * Stress
* Overload                          * Biases in decisionmaking
* Sensation and perception

Testing is heavily emphasized in most aircraft design. Aircraft structure is
stressed to destruction at a cost of many millions of dollars to demonstrate that
design requirements are met. Millions of dollars a week are spent on wind-tunnel
testing during some phases of design. In contrast, simulator tests of
cockpit design have not been as frequently or effectively used as other modes of
aircraft design testing. This seems inconsistent in view of the much greater
confidence in the "hard" data of the more traditional disciplines and the
identification of human error as a major contributor to accidents.

The Problem of Criteria
Frequently, nonspecific criteria are used in making human factors assessments.
Subjective pilot judgments are probably the most frequently used criteria for
design acceptability. Pilot judgment has a number of advantages from an
engineer's point of view, such as its apparent validity and quick response.
Rarely, however, are project pilots knowledgeable in the scientific disciplines of
experimental psychology, physiology, and anthropometry that are the foundation
of the human factors specialty. Test pilots, while well trained for their job,
may lack an understanding of the line pilot's environment, such as flying the
same aircraft for years, flying many legs late at night, or flying long
intercontinental flights.
Ideally, design decisions should be based on criteria related to overall system
performance, but designers have generally deemed human performance difficult
to assess. Part of the difficulty may arise from the designers' relatively poor
understanding of human factors testing. It seems apparent that more attention
should be devoted to valid, reliable human performance measures.
In addition, if critical human tasks involve reprogramming and/or taking over
for automated systems in the event of a significant failure, it should always be
demonstrated that representative crews can perform adequately under
representative (including worst-case) scenarios. It is also desirable that human
performance be tested near the limits of its capabilities to assure adequate
safety margins.
Conclusions
This review far from exhausts the relevant information regarding cockpit
automation. Training issues have not been addressed at all, for example. Many
years of further study and of industry experience will be required for designers
to be fully confident in how to design automated systems that are compatible
with human characteristics. Several preliminary conclusions seem appropriate,
however:
* Automation will continue to increase.
* Successful automation depends on proper integration of human
  capabilities.
* The discipline of human factors has a store of knowledge and methods
  which can be useful to good systems design.
* Automation is generating and/or highlighting human factors issues.
* Attention to human factors has generally been lagging in both design
  and certification.
* Adequate emphasis on human factors will be facilitated by a System
  Engineering Approach, established human factors performance criteria,
  early investment in cockpit definition and development, developmental
  simulation and testing of human factors issues, and the development
  of a design-oriented Human Factors Research and Development
  program.
Recommendations
This review suggests several basic automation and human factors questions that
need to be addressed in certification:
* Will the crew be exposed to potentially catastrophic failures in which
  their actions will be crucial?
* If so, will the crew be able to execute the appropriate actions in the
  time available without making a catastrophic error?
* What is the probability of each of the above?
* Based on the responses to these questions, will safety be degraded or
  enhanced by the automation?
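The probability question in this list can be made concrete with simple arithmetic. The following is a minimal sketch, assuming for illustration only that exposure to the failure and an inadequate crew response are independent events. The function name and every number are invented placeholders, not data from this report.

```python
def catastrophic_outcome_probability(p_exposure: float, p_crew_success: float) -> float:
    """Probability of exposure to the failure AND an inadequate crew response.

    Assumes the two events are independent (an illustrative simplification).
    """
    for p in (p_exposure, p_crew_success):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_exposure * (1.0 - p_crew_success)


# Hypothetical values chosen only to show the calculation.
p = catastrophic_outcome_probability(p_exposure=1e-6, p_crew_success=0.99)
print(f"{p:.1e}")
```

In practice the two probabilities are unlikely to be independent, since the same failure that exposes the crew may also raise workload and degrade the response, which is one reason the text calls for demonstration with representative crews.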
To obtain valid, reliable answers to these questions it is necessary to consider
not only the aircraft features but the whole aviation system. This consideration
should include crew functions in normal and abnormal operations; interactions
between the crew and the system; and crew selection, training, and
composition. Without question, the FAA certification process provides for this.
However, the process needs to be strengthened in several areas if automation
and human factors issues are to be adequately addressed.
First, consideration should be given to making the human factors certification
criteria more explicit and objective. These criteria should be stated in terms of
performance rather than rules of design so as to minimize difficulties as
technology advances.
Second, the FAA should add human factors specialists to its certification team.
The basic issues supporting this recommendation have been reviewed in this
discussion. Attention to human factors by the Agency would encourage
manufacturers to increase their emphasis in this area, and the addition of basic
human factors knowledge would greatly enrich the FAA's assessment process.
Table 9.14 identifies activities the FAA certification specialists could perform.
The FAA has recently added a human factors specialist at Headquarters.
Although human factors emphasis on policy and research is desirable, the real
payoff will be obtained only if human factors are incorporated in the
certification process.
Table 9.14
Role of FAA Cockpit Certification Specialist(s) - Human Factors

o Review, critique, assess, and enrich manufacturer's cockpit development plan
o Participate in selected development activities to assure adequacy
o Review and assess cockpit-relevant reports of tests, analyses, etc., submitted
  by manufacturer
o Participate in development of FAA cockpit certification requirements
o Participate in certification testing:
  - experimental design
  - definition of criteria and performance measures
  - adequacy of statistical analysis
  - subjective assessment methods
  - human factors checklist(s) development and application
o Noncockpit certification activities:
  - helping to identify and structure FAA/NASA human factors R&D efforts
  - identification of crew training issues
o Alternative approach - DERs for human factors


Chapter 10
Display Design
by Delmar M. Fadden, Chief Engineer--Flight Deck, Boeing Commercial Airplane
Group
The rapid and reliable display of visual information in the flight deck requires a
thorough understanding of the functions being supported, thoughtful
application of available human performance knowledge, and careful selection of
the appropriate display media. This chapter explores some display characteristics
of special relevance to achieving highly effective human performance in flight
situations.
The measure of a truly effective display is how well it supports consistent
accomplishment of the tasks assigned to the person who will be using it.
Display design is as concerned with task design as it is with presentation
symbology and display devices. The process of identifying the full range of tasks
and the associated information requirements for a modern, highly integrated
display can be formidable indeed.
The core of the design process usually involves resolving contentions between
system functional requirements and operator capabilities and limitations.
Through actual design examples, this chapter illustrates the issues associated
with balancing system and human needs. The examples are based on display
development work at Boeing Commercial Airplane Group in support of the 757,
767, and 747-400 airplanes.

Display Development Process


Figure 10.1 is a flowchart representing the primary display development
activities (ARP 4155, SAE G-10 Committee, 1990). The flowchart provides a
useful basis for discussing the fundamental elements of display design. Some
steps require considerably more effort than others, depending largely on the
scope and phase of the project. Some of the steps can be accomplished using
traditional engineering tools and methods; others are better suited to techniques
more commonly associated with the sciences of psychology, operations research,
and human factors. Many successful displays have been developed without
explicit attention to this process, though their development histories often show
evolutionary improvements that can be mapped to these steps.

Requirements
Displays exist to provide information to a human being who is asked to achieve
some objective. Accurately recognizing that objective in terms of required
outcomes is crucial to successful display design. Once the top level objectives
are identified, the focus shifts to determination of the detailed tasks necessary
to accomplish the objective and the information requirements that support those
tasks.
There is an understandable tendency to skip the formal definition of the
detailed tasks and associated information requirements and start the design by
developing display formats. Working on display formatting can be a useful aid
in initiating an understanding of the information requirements. However, the
understanding gained by first developing information requirements from the
related tasks is virtually always more accurate and complete. There are two
significant side effects that can follow a design which begins with display
format selection. First, the format selected likely will be based on the similarity
of information content to that of other displays rather than any actual linkage
to the tasks this specific display supports. Second, the conceptualizations of the
information required and its organization will be well established before the full
range of task possibilities has been explored. Together these effects can result in
excessive display complexity, more operator errors, and less efficient task
performance by the pilot.

[Figure 10.1. Display development flowchart (from SAE document ARP 4155). The flowchart runs through four phases: Requirements, Design, Evaluation, and Operation. Recoverable steps and decision points: recognize new or modified task, function, or technology; task properly defined?; perform detailed task analysis; define information requirements; information content adequately defined?; develop candidate symbols and formats; suitable display technology?; legible?; symbology supports primary task?; task performance achieved?]

Each element of information to be displayed is integrally linked to the dynamic
performance requirements for the associated task. Characterization of the task
with measurable performance objectives is a key step in understanding the
specific contribution of each information element and the effectiveness of the
symbology used to portray that information.

In most circumstances it is best to trace the task from the top level mission
objectives. While this is tedious work, it provides essential insight into the
interrelationships between tasks and provides a basis for meaningful discussions
with those who will use the system. The top-down task analysis also yields a
complete description of the steps believed necessary to execute each particular
task. To some, it may seem premature to prepare a task analysis before the
hardware is designed. However, the system designer knows conceptually what
has to take place. Specific switches and controls are not yet known, neither are
the overhead tasks which will be necessary to operate the system, so the initial
analysis must be done at a top level. As the design develops, the analysis can
be expanded to a more detailed level. Proceeding in this fashion provides a
good check on the correctness and efficiency of the specific design. By
comparing the detailed analysis with the initial top level analysis, the designer
determines how much overhead has been added and checks to see that the
functional design remains consistent with the stated objectives.
The task analysis should identify at least the following information:
o the objective of the task, stated in measurable terms;
o the timing for the task (any task initiation dependencies should be
  defined, along with constraints on execution or completion time);
o expectations about task performance, including accuracy, consistency
  and completeness;
o possible errors and related consequences (be sure to consider errors of
  omission along with errors of commission);
o task dependencies, other than those associated with timing already
  identified (dependencies might include other tasks, specific events,
  combinations of flight conditions, etc.);
o criticality of the task objective in relationship to the safety and
  efficiency of the flight.
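The items listed above could be captured in a simple record so that incomplete analyses are easy to spot. This is a minimal sketch: the class name, field names, and the example task text are hypothetical and are not taken from ARP 4155 or from any Boeing tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskAnalysisRecord:
    """One task from a top-down task analysis (field names are illustrative)."""
    objective: str                      # stated in measurable terms
    timing: str                         # initiation dependencies, time constraints
    performance_expectations: str       # accuracy, consistency, completeness
    possible_errors: List[str] = field(default_factory=list)  # omission and commission
    dependencies: List[str] = field(default_factory=list)     # other tasks, events, conditions
    criticality: str = ""               # relation to safety and efficiency of flight

    def missing_items(self) -> List[str]:
        """Return the names of items that have not been filled in yet."""
        return [name for name, value in vars(self).items() if not value]


# Hypothetical, partially completed record.
record = TaskAnalysisRecord(
    objective="Capture the localizer within a stated lateral tolerance",
    timing="Begins at approach clearance; complete before glideslope intercept",
    performance_expectations="",
)
print(record.missing_items())
```

An empty dependency list may be legitimate for some tasks; the check only flags items that still deserve a deliberate answer.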

Once a detailed task analysis is complete, the tasks can be linked with the
information necessary to accomplish each task. This would normally include:
o definition of information requirements
o the accuracy and range needed
o the context within which that information will be used
o the necessary dynamic response
o any special relationships with other events or information
At this point, it is useful to examine similar tasks and the information necessary
to support them. Task similarities provide valuable insight into the range of
human performance that can be expected. Where tasks are new or involve a
change in the required precision or dynamics, the designer will have to turn to
rapid prototyping, part task simulation, experimental tests, or other human
factors testing to identify and quantify the specific information requirements.
Tasks involving continuous dynamic control are further complicated by the
complex interaction between the dynamics of the control device, the vehicle
dynamics, the dynamics of the displayed information, and sometimes the
dynamics of the pilots' response. In difficult cases, this step will be iterated
many times in a series of progressive refinements until satisfactory performance
is achieved. Often these iterations are accomplished in conjunction with
iterations of the previously discussed task analysis and the symbology
development step that follows.
Not all information requirements need to be satisfied through on-board displays.
There are various other sources for required information that can be just as
effective. One of these sources is the knowledge that the pilot carries in his or
her mind through previous experience or training. Also, information can be
carried on board with the pilot or the pilot can derive it from other information
available on the flight deck. Information from an alternate source may be easier
for the pilot to integrate with the task than if it were contained in a flight deck
display. Taking the time to examine alternate sources for required information
can simplify display design considerably and aid the pilot by simplifying access
to the required information.

Design
Once the information requirements for a display have been defined, the next
step is to determine how to present the information. Symbology selection
determines how specific information elements will be represented within the
display. (In this context, symbology encompasses any form of character, graphic,
or textual entity.) By contrast, display format selection determines the
conceptual framework within which the information will be presented. The two
selections are closely related. For highly integrated displays, the selection of
formatting will be heavily influenced by the top level tasks the display supports,
while the symbology selection often will be guided by specific requirements of
the detailed tasks. The necessity for joint and iterative refinement of symbology
and formatting frequently increases as display complexity increases.
It is standard practice to pay particular attention to how information has been
represented and related to tasks in similar successful displays. Building on past
successes has numerous advantages. Training can be simplified, if the pilot is
familiar with a significant portion of the display. The risks of introducing a new
display can be reduced, if the human performance expectations are based on
operational use of a similar display. These benefits are often perceived to be of
sufficient value as to preclude serious consideration of alternative symbology
and formats. However, examine the underlying tasks carefully, since subtle
differences in the current tasks may require that different information be
portrayed or that formatting be adjusted to highlight different relationships.
Changes in the technology used for display can force a change in the selection
of symbology even when the task and information requirements remain the
same. This would be the case when the change in technology alters important
characteristics used in creating symbology. For example, line widths that can be
presented using practical CRT technology are considerably thicker than those
which can be produced using print technology. This changes the amount of
detail that can be presented successfully in a given area. In effect, print media
have a much greater upper limit for information density when compared with a
CRT display. Another difference concerns the manner in which displays generate
brightness. Since CRTs emit light, the overall brightness of a CRT display will
be a direct function of the information content and an inverse function of the
ambient light. Reflective displays, on the other hand, change brightness as a
direct function of ambient light with a much smaller contribution based on the
information content.
When technology changes are involved, it should not be assumed that
symbology that has been successful in the past will carry over equally
successfully to the new display. Each display technology has unique
characteristics or capabilities that can be exploited to enhance the effectiveness
of information transfer. The common ground for assessing the impact of any
limitations and the value of any enhancements is the task performance
achievable. Objective evaluation of these issues has a profound impact on the
decision about which technology to use.
If there isn't an existing presentation format for the task, new symbols and
formats must be created. Simplicity, quick recognition, and directness are
characteristics of proven value in effective symbology. Regardless of how the
symbol is conceived, there needs to be an appropriate performance measure
(agreed to in advance) to determine how well the symbol performs its job. User
preference is a significant factor in the development of symbology. If the users
don't like a symbol, there is little to be gained by continuing its use. However,
just because the users like a symbol does not mean that they can use it
effectively. The only way to know that a symbol really works is to have the
pilot use it and to measure the resulting performance.
As in all human performance testing, the test engineer is faced with the
challenge of obtaining an appropriate performance measurement yardstick. In
this case, it comes from the detailed task analysis. How much tracking accuracy
does the pilot have to achieve? What probability of error can be tolerated?
How quickly do decisions have to be made? These questions can be quantified
based on the pilot's top level task and the details of the task analysis.
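The three questions above translate naturally into pass/fail checks against criteria agreed in advance. The sketch below assumes one possible way of quantifying them (RMS tracking error, error rate over trials, and slowest decision time). The names, units, and threshold values are hypothetical and would in practice come from the detailed task analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerformanceCriteria:
    """Limits derived from the task analysis (all values used here are hypothetical)."""
    max_rms_tracking_error: float   # units set by the task, e.g. dots of deviation
    max_error_rate: float           # tolerated fraction of trials with an error
    max_decision_time_s: float      # seconds allowed for a decision


def rms(values):
    """Root-mean-square of a non-empty sequence of tracking errors."""
    if not values:
        raise ValueError("need at least one sample")
    return (sum(v * v for v in values) / len(values)) ** 0.5


def meets_criteria(criteria, tracking_errors, error_trials, total_trials, decision_times_s):
    """Return a dict of pass/fail results, one per criterion."""
    if total_trials <= 0 or not decision_times_s:
        raise ValueError("need at least one trial and one decision time")
    return {
        "tracking": rms(tracking_errors) <= criteria.max_rms_tracking_error,
        "error_rate": error_trials / total_trials <= criteria.max_error_rate,
        "decision_time": max(decision_times_s) <= criteria.max_decision_time_s,
    }


# Invented measurements for a single candidate symbol.
criteria = PerformanceCriteria(0.5, 0.05, 3.0)
result = meets_criteria(criteria, [0.3, -0.4, 0.2, -0.1], 1, 40, [1.8, 2.4, 3.6])
print(result)
```

Reporting each criterion separately, rather than a single overall verdict, shows which aspect of the symbol needs rework.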
Finally, designers have to look at factors of legibility, so that the displayed
information can be seen in the operating environment. Legibility is a complex
issue in a modern airplane. Several factors contribute to the potential for less
than optimal viewing: the geometrical requirements for the aerodynamic shape
of the flight deck, external environmental influences, and the large vision
variability between pilots. Vision is one of the more variable of human
capabilities. It is not unusual for otherwise similar pilots to have quite different
visual capability. Corrective lenses can reduce the effects of individual acuity
differences; however, accommodation time, color perception, and critical flicker
fusion frequency remain highly variable individual characteristics. The pilot's
external environment varies from virtually pitch black to extremely bright
sunlight. The distance between the pilot's eyes and the display is generally
greater than a person would choose to read a book or a newspaper. Accordingly
the size of text and graphics must be increased to compensate. The pilot and
the displays vibrate at different rates when the airplane is in turbulence. The
resulting relative motion can severely hamper readability, particularly the
readability of small symbols or fine detail. If all or a portion of the information
must be read in turbulence, both the design and the legibility testing must take
that into account.
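The need to enlarge text as viewing distance grows follows from simple geometry: a character of height h viewed from distance d subtends a visual angle of 2·atan(h / 2d). The sketch below inverts that relation. The target angle and both distances are assumed values chosen only for illustration, not requirements from this report or from any standard.

```python
import math


def character_height_mm(viewing_distance_mm: float, visual_angle_arcmin: float) -> float:
    """Height that subtends `visual_angle_arcmin` at `viewing_distance_mm`."""
    if viewing_distance_mm <= 0 or visual_angle_arcmin <= 0:
        raise ValueError("distance and angle must be positive")
    angle_rad = math.radians(visual_angle_arcmin / 60.0)
    return 2.0 * viewing_distance_mm * math.tan(angle_rad / 2.0)


ANGLE_ARCMIN = 20.0        # assumed target character size, illustration only
BOOK_DISTANCE_MM = 400.0   # assumed typical reading distance
PANEL_DISTANCE_MM = 750.0  # assumed eye-to-display distance on a flight deck

for label, d in (("book", BOOK_DISTANCE_MM), ("panel", PANEL_DISTANCE_MM)):
    print(f"{label}: {character_height_mm(d, ANGLE_ARCMIN):.2f} mm")
```

For a fixed visual angle the required height scales linearly with distance, so any increase in eye-to-display distance calls for a proportional increase in symbol size before vibration and lighting are even considered.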

Evaluation
The first part of the evaluation cycle determines whether the primary task
performance defined in the early requirements phase has been achieved. As with
the early development work, it is having clearly identified and measurable
performance criteria that makes efficient testing possible. Once it has been
determined that the expected performance can be achieved for the intended
task, it is important to determine that the performance of other tasks has not
been degraded. This second portion of the evaluation process is generally more
difficult.
Knowledge of the various mechanisms that have contributed to performance
degradation in the past is a good place to start in developing an evaluation
strategy. Typical conflict mechanisms include the following:
o Apparent symbol motion caused by actual motion of nearby symbols.
o Poor recognition of a symbol or alphanumeric caused by excessive
  dominance of an unrelated nearby symbol. Such dominance may be
  due to relative size, color, brightness, or shape differences.
o Symbol uses or format interpretations that are inconsistent with pilot
  expectations. The pilot's expectations derive from many sources,
  including other associated tasks or displays, his mental
  conceptualization of the situation, cultural influences, training, and
  previous experiences.
o Similar symbols that support different tasks but can be confused. This
  problem is particularly difficult to identify if the information is
  identical or highly similar but the task or task performance level is
  subtly different.
Integrated displays present a great deal of information, and have many tasks
associated with them. Therefore, if a new task is being added to an already
complex display, it is important to confirm that the required level of
performance for previous tasks can still be achieved. Once task performance has
been confirmed for all tasks associated with the integrated display, the check
should be expanded to examine all applicable task-display combinations on the
flight deck.

Operation
The final phase in the display development process is operational follow-up.
Comments about problems or concerns are readily available from both
certification and airline personnel. Over the life of a typical display system, it is
not unusual to find that some of the tasks for which the display was originally
designed get redefined in subtle ways. This may be due to changes in the
operating environment, changes in the skill or knowledge base of the pilots
using the display, or it may be the result of refinement of a partially understood
task. In any case it is important that user comments be recognized and
evaluated against the design intent. The quality of future decisions about use of
the display, associated training, or operational enhancements depends on
accurate understanding of the pilots' tasks and how they are supported by
displayed information.
General Design Issues
Opportunities for Standardization
A recurring question directed to display designers deals with the notion of
standardization. A typical question might be, "Since these displays contain the
same information, can't we standardize on common symbols and formats?" The
answer focuses on the detailed nature of the pilots' tasks. If the tasks are
indeed common, then a common display would be operationally attractive.
However, if the tasks are different in any significant way, a standardized display
may result in degraded pilot performance or an increased error rate.
An example may clarify this counter-intuitive situation. The relative merits of
vertical tape engine instruments as compared with round dial displays have
been debated by developers since the mid 1960's. With the appearance of CRT
engine displays in the 1980's, it became feasible to provide whichever format an
airline preferred. During development of the 757 and 767, Boeing conducted
numerous simulator tests and demonstrations designed to aid airline personnel
in the selection of the preferred format. The unanimous selection was the round
dial format, in spite of initial pilot expectations that a more balanced preference
would exist. Similar simulation testing was conducted during development of
the 747-400 in the late 1980's. This time, the unanimous selection was the
vertical tape format.
The engines on the 767 and 747-400 are identical; the same part number. The
parameters displayed to the pilot are identical. Why should there be such a
marked difference in selection? As discussions between pilots and researchers
probed this issue, it became clear that while the high level task objective is
indeed the same for the engine-related tasks on the two airplanes, the task
execution strategies the pilots preferred were distinctly different.
For the four-engine 747-400, the pilot monitored for an engine anomaly by
comparing the same parameter on all four engines and focusing on the engine
whose parameter was inconsistent with the other three. For the twin-engine
767, the strategy involved comparing the parameters for each engine with the
pilot's expectations and his knowledge of past performance. In this case, the
pilot was concerned with relating the different parameters for a single engine.
Cross comparisons for the twin-engine airplane would be inconclusive for many
failure conditions.
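The two monitoring strategies can be contrasted in a few lines. This is a simplified sketch: it checks a single parameter, whereas the twin-engine strategy described above also relates different parameters within one engine. The function names, the tolerance, and all readings are invented for illustration.

```python
from statistics import median


def odd_engine_out(readings, tolerance):
    """Four-engine style: flag engines whose value differs from the median of all engines.

    Returns the indices of engines outside `tolerance` of the median.
    """
    m = median(readings)
    return [i for i, r in enumerate(readings) if abs(r - m) > tolerance]


def deviates_from_expectation(reading, expected, tolerance):
    """Twin-engine style: compare one engine's value with what the pilot expects."""
    return abs(reading - expected) > tolerance


# Hypothetical readings of one engine parameter; all numbers are invented.
print(odd_engine_out([612, 608, 655, 610], tolerance=20))   # four engines
print(odd_engine_out([612, 655], tolerance=20))             # two engines
print(deviates_from_expectation(655, expected=610, tolerance=20))
```

With four engines the cross-comparison isolates a single engine. With two, both engines sit equally far from their common median, so the comparison cannot say which one is abnormal, and only the check against an expected value can.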
Understanding this difference in task execution strategy provides a good basis
for understanding why there was such a clear difference in the display format
selection for the two airplanes. Where task differences do exist, the issue of
standardization can be reduced to comparing the cost saving which might result
from standard display hardware and software with the cost of the associated
degradation in performance and the added compensatory training that would be
necessary.
Flight functions that are common across many airplane types come under
significant market forces that, over time, promote de facto standardization. This
tends to apply to functions that are well known and quite stable. As would be
expected, the bulk of industry attention is focused on functions that are new,
incompletely understood, and rapidly changing. It should be possible to achieve
a reasonably high level of display standardization provided that detailed tasks
can be standardized. The crucial factor is whether the tasks are truly common.
That is a difficult question to answer in a business climate involving intense
competition and rapid technological change both on the flight deck and in the
ATC environment. In many ways, it is a tribute to the entire industry that the degree
of standardization that exists now has been achieved at all.
An example illustrates the subtlety of the pilot's use of dynamic symbology. The
primary instrument arrangement for the Boeing 767 has the map display
directly below the primary attitude display. The localizer deviation display is at
the bottom of the ADI. Since the track scale is at the top of the map display,
there is no need for repeating any heading information on the ADI. The Boeing
747-400 has larger CRT displays in a side-by-side arrangement. In this case, the
track scale is separated from the localizer deviation. Since this altered the "basic
T" instrument arrangement, it was decided to place a heading scale at the
bottom of the primary flight display (PFD). The initial format for this
information was selected to emphasize airplane heading, thus maintaining a
strong link with past HSI displays. The scale at the top of the navigation
display (ND) is track oriented, as it is on most 767 airplanes. The two different
orientations were believed to match a difference between the localizer capture
and runway alignment tasks. In separate applications, both of these orientations
had been in wide-spread use for an extended period of time, each with highly
successful operational histories. During initial 747-400 flight testing, it was
found that a significant number of pilots were having difficulty with the
transition between instrument and visual conditions during initial departure and
the final phase of ILS approaches. Having the two scales in close proximity and
with a different orientation was suspected as contributing to the problem, since
the basic information contents of the displays on the 767 and the 747-400 are
identical. Identification of the specific sources of the performance difficulty was
done by a team led by John Wiedemann. The steps they accomplished in
resolving the difficulty provide an interesting perspective on the complexity of
designing highly integrated displays.
Figure 10.2 shows the original 747-400 heading and track symbology on the
primary flight display (left side of the figure) and navigation display (right side
of the figure). On the navigation display, track is fixed at the top of the display
and heading is shown by a modified, triangular pointer which moves along the
inside of the compass scale. Conversely, on the PFD heading is indicated by a
fixed pointer (which appears as a mirror image of the ND track pointer) and
track is shown by a small triangular symbol which also moves along the inside
of the compass scale. In both displays, the selected heading value is indicated
by a split rectangle which moves along the outside of the compass scale. On the
ND, the selected heading is reinforced by a dashed line emanating from the
airplane position and leading to the split rectangle.

PROBLEM: HEADING/TRACK SYMBOLOGY INCONSISTENCY BETWEEN THE PFD AND THE ND.
SOURCES OF CONFUSION: 1. ND HEADING BUG; 2. PFD TRACK BUG.
Figure 10.2 Initial 747-400 PFD and ND Heading and Track Symbology. (original figure)

The first step in clearing up the problem was to minimize confusion caused by
the different pointers by changing the ND heading pointer to make it more
distinctive. Figure 10.3 shows the new shape.

SOLUTION #1: CHANGE SHAPE OF ND HEADING BUG.
PROBLEM: INCONSISTENCY STEMS FROM RELATIONSHIP BETWEEN ALL PFD/ND HEADING/TRACK SYMBOLOGY.
SOURCES OF CONFUSION: 1. ND HEADING BUG; 2. PFD TRACK BUG; 3. READOUT BOX; 4. ND TRACK LINE; 5. SELECTED HEADING BUG.
Figure 10.3 Navigation Display Heading Pointer Shape Change. (original figure)

This simple change did not solve the problem. At this point, a thorough review
of task information relationships was accomplished beginning with an
assessment of how these tasks were supported by earlier displays. This review
confirmed that the information content was correct but indicated three areas of
potential confusion brought about by the close proximity of the PFD and ND
presentations. The next step involved changes in each of the three areas
(shown in figure 10.4):
o make both heading pointers the same shape, but put them outside the
compass scale circle,

o locate both digital readout boxes at the top of the display,

o add a moveable track line on the PFD, analogous to the fixed track
line on the ND.

SOLUTION #2: CHANGE SHAPE OF HEADING BUGS; PFD TRACK BUG; READOUT BOXES.
PROBLEM: READOUT BOX ORIENTATION CONFUSION; PFD TRACK LOOKS LIKE NEEDLE; PFD HDG TAPE NOT SYMMETRICAL.
SOURCES OF CONFUSION: 1. READOUT BOXES; 2. PFD TRACK BUG; 3. SELECTED HEADING BUG.
Figure 10.4 Consistent Shapes for Heading and Track Pointers. (original figure)

Performance with this format was better; however there was now confusion
associated with the digital readouts and the track information. The results of
simulator testing suggested three more changes (shown in figure 10.5):
o remove the digital readout box from the PFD, so there is no read-out
confusion;

o add a tick to the PFD track line to strengthen the association with the
ND track line;

o move the selected heading split rectangle to the inside of the compass
arc to avoid conflict with the heading triangle.

FINAL SOLUTION: CONSISTENT SYMBOLOGY BETWEEN DISPLAYS.

Figure 10.5 Consistent 747-400 PFD and ND Heading and Track Symbology. (original figure)

This combination performed well. Clearly the success of this symbology suggests
that the actual task for which the pilots use the scale on the PFD is closer to
the capture and track task associated with the map display than the runway
alignment task that had been presumed. This interpretation follows from the
relatively small change in the ND format compared with previous map displays
and the much more significant changes to the PFD when compared with
previous HSI presentations. Note that no information was added or removed
from either display. All seven of the changes involved symbology and formatting
only. The number of changes and the sequential manner in which they were
identified emphasizes the high degree of interaction among the symbols in these
two displays.

Use of Color
The first commercial CRT displays developed by Boeing (originally intended for
the Boeing SST) were integrated in the NASA TCV airplane after the SST
program was canceled in the early 1970s. (The NASA Terminal Configured
Vehicle, TCV, is a Boeing 737 airplane with a reconfigurable research flight
deck and extensive avionics designed to address a wide range of systems
integration and pilot performance issues.) The TCV CRT displays were
monochromatic because there was no suitable color display on the market that
could be viewed in even bright room light, much less full sunlight. Because of
the extreme ambient light range typical of commercial flight decks (0.1 to 8000
foot lamberts), it was not possible to use more than two levels of symbol
brightness without a risk that the lower brightness symbols would disappear
under some conditions. Extensive laboratory and simulator testing was done to
develop symbols that were easily recognized and correctly associated with the
information they represented. The primary coding tools available were symbol
shape, line type, and text. Simple shapes were used as much as possible to
minimize display clutter, improve readability in turbulence, and control the
amount of drawing time required. Even with all this attention, the displays
became quite busy.
In the late 1970s, color display technology improved enough to warrant its
consideration for flight deck use. The general presumption was that color would
simplify the information coding problem. In fact, coding was a secondary reason
for using color; the primary objective was to separate the various classes of
information to operationally declutter the display.
Two characteristics of human color vision played a key role in establishing this
objective. Color is recognized over a much larger field of view than the small
zone of sharp visual acuity where details of shape will be perceived. This
permits a different, and potentially quicker, search strategy to be associated
with color information.
Color CRT displays are affected by bright sunlight in two ways. The contrast of
the symbol against its background is reduced in much the same manner as with
monochromatic displays. More subtly, the color of the sunlight mixes with the
color of the symbol, shifting the hue and saturation that the pilot perceives.
Accurate recognition of color is marked by significant individual differences.
Testing conducted by Boeing showed that no more than six colors (seven under
certain conditions) could be discriminated by the full range of pilots having
"normal" color vision under all anticipated brightness conditions. This finding is
not as operationally restrictive as one might first believe. In fact, it is
conveniently appropriate. Psychological research consistently reports that human
beings will have the least difficulty dealing with memory-based codes of not
more than 7 (±2) dimensions.

A third consideration was stereotypical meanings associated with various colors
by different cultures. Red is widely associated with warning or alert conditions.
Amber (or yellow) is often recognized as indicating some form of caution or
heightened awareness. Other colors have more diverse cultural meanings and
are, therefore, more suited to general grouping than detailed operational
information coding.
Using color this way does not reduce the need for shape and line style coding.
However, it does permit higher information density to be used without
incurring a pilot performance penalty. Dissimilar redundancy in the coding can
improve pilot confidence in the display and help maintain good performance
under marginal operating conditions.

Eye Fatigue
The use of CRT displays introduced new opportunities for eye fatigue. To
minimize this potential, several characteristics of the displays were carefully
controlled. Eye fatigue results when the muscles controlling the eye are subject
to overuse.
The muscles that change the shape of the lens respond to the sharpness of the
edges in the image falling on the retina. For conventional mechanical displays,
edge sharpness is very high. The manner by which a CRT image is created
produces a Gaussian-like distribution of light across each line in the display. If
line widths, along with phosphor dot arrangement and spacing, are not
carefully selected, the resulting soft edges can cause excessive refocusing and
eventual eye fatigue.
Laboratory testing with a variety of pilots revealed that the optimum line widths
for color CRT displays were significantly wider than for monochromatic CRTs
and that the desired widths varied with the color of the line. This latter finding
appears, at least in part, to be related to the fact that misconvergence can cause
color fringing along those lines composed of two or more primary colors.
Eye fatigue can also result from fixating in one location for an extended time.
Fortunately the distributed nature of information in a modern flight deck
encourages the pilot to change his point-of-regard frequently. When the large
format CRTs were first proposed for the 767, there was concern that the
novelty of the display along with the large amount of information they
contained would result in much greater dwell time on these instruments than
was true of previous displays. The original performance criteria for the displays
included graphic symbology that could be interpreted quickly by the pilot. Eye
track records confirmed that dwell times remained quite similar to those
associated with conventional displays.
A third potential source of eye fatigue is the apparent motion in a display
caused by flicker. Rapid motion is a powerful means of attracting visual
attention. This is true for any visual scene, real or created. The motion attention
response is so automatic that it is not under the conscious control of the pilot
in most situations. The human visual system's sensitivity to flicker is not
uniform throughout the visual field. For most people, it is greatest in the
peripheral region between 45 and 60 degrees away from the eye
point-of-regard. In this region, the critical flicker fusion frequency generally will
not be less than 45 Hz nor more than 62 Hz. This is significantly higher than
in the foveal region where critical flicker fusion frequencies below 30 Hz are
common.
Unfortunately the zone of greatest flicker sensitivity overlaps the location of the
other pilot's displays in most side-by-side two-pilot flight decks. Thus the
required refresh frequency for flight displays is set by the flight deck geometry.
For displays used on the 757 and 767, the nominal refresh rate is 80 Hz. This
is allowed to drop, under high data presentation conditions, to as low as 65 Hz.
Below that frequency, a message appears alerting the pilot to the data overload
condition and allowing him to deselect unneeded information.
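
The refresh policy described above amounts to a small decision rule, which can
be sketched in Python. This is an illustration only: the function name, the
status labels, and the idea of expressing the policy as a single function are
assumptions. Only the 80 Hz nominal rate and the 65 Hz floor come from the text.

```python
NOMINAL_REFRESH_HZ = 80.0  # nominal 757/767 refresh rate cited in the text
MIN_REFRESH_HZ = 65.0      # lowest rate allowed under high data load (from the text)


def refresh_status(achievable_hz: float) -> str:
    """Classify the refresh rate the display can sustain (hypothetical helper).

    Returns "nominal" at or above 80 Hz, "reduced" between 65 and 80 Hz, and
    "overload" below 65 Hz, the condition in which the text says the pilot is
    alerted and may deselect unneeded information.
    """
    if achievable_hz >= NOMINAL_REFRESH_HZ:
        return "nominal"
    if achievable_hz >= MIN_REFRESH_HZ:
        return "reduced"
    return "overload"
```

Both thresholds sit above the 45 to 62 Hz peripheral flicker fusion range quoted
earlier, which is the point of tying the refresh rate to flight deck geometry.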
Though glare and reflections do not cause eye fatigue directly, they are likely to
be reported as such if the pilot becomes aware of them on a continuing basis.
The use of an anti-reflective coating on the external surface of the display along
with careful matching of the index of refraction for the various layers of the
display face plate greatly reduces the opportunity for perceived reflections.
Finally, the flight deck geometry is established to ensure that sunlight on the
pilot's white shirt will not reflect off the screen and into the pilot's eyes in his
normal seated position.
Attention to all of these details has resulted in displays that pilots regard as
highly readable and with which they achieve consistently high performance.
Future technology changes will likely alter the specific requirements
characteristic of current displays. Even some of the areas of concern might
change. However, by understanding the factors that influence both perceptions
and performance, it will be possible to ensure that the next display technology
evolution is at least as successful as the transition to CRTs has been.

Time Shared Information
When the primary display devices were mechanical, there were few
opportunities to time-share display space. True enough, VOR course deviation,
ILS localizer deviation, and possibly inertial cross track deviation can be
displayed on an electro-mechanical HSI. However, most other electro-mechanical
displays have a fixed information content and a fixed format for that
information. The change to CRT displays presented the opportunity to change
the conventional one-display, one-function relationship. With this opportunity
came the necessity of understanding the circumstances under which time
sharing would alter pilot performance. The potential to improve performance is
there, and along with it, the potential to degrade performance.
Clearly a complete understanding of all the tasks that might be affected by time
sharing of information is the appropriate starting point. The simplest cases
involve tasks that can be isolated. It helps if these tasks are done relatively
infrequently and under very clearly identifiable conditions. Slightly more
sophisticated cases involve a change in priority or importance for a task, or
tasks, which are necessarily serial in execution. The greatest challenge occurs
when one or more tasks can occur in parallel with any number of other tasks
and the relative priority of the tasks is known only by the pilot.
The first question asked by the designer should be: is it desirable to time share
information for this task? If task execution is continuous, or nearly so, the
answer is obviously no. For infrequently executed and logically isolated tasks,
the answer is probably yes. The vast majority of tasks fall between these two
extremes. In these cases the answer depends upon the composite impact of the
total information display requirements on the pilot and the means available to
effect the time sharing.
The map displays on all Boeing airplanes incorporate manually selected time
sharing for supplemental navigation data. This includes depiction of navaids,
intersections, and airports other than those currently in use or formally defined
as part of the flight plan route. Manual selection is used since the specific
circumstances favoring use depend on conditions known best by the pilot. All
information that is mandatory for proper execution and monitoring of the
defined flight plan is presented without specific action by the pilot. For
example, the navaids currently being used for navigation updating are shown
whether or not manual navaid data has been selected. The same is true of the
departure and destination airports and any intersections that are identified as
waypoints along the route of flight.

A variety of performance data is available when the pilot takes deliberate action
that indicates such data would be useful. For example, the normal procedure for
changing altitude is to select the new altitude on the mode select panel and
then initiate a climb or descent, as appropriate. The two actions generate a
prediction of how far ahead of the current position the aircraft will be when
the new altitude is reached. This prediction is shown as a green arc on the map
display. Once the new altitude has been captured, the prediction is no longer
meaningful and it is automatically removed from the display.
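
A minimal sketch of how such an altitude range prediction could be computed is
given below. The source does not describe the actual avionics computation, so
the constant vertical speed and ground speed assumption, the function name, and
the parameter names are all illustrative assumptions.

```python
def altitude_range_arc_nm(current_alt_ft, target_alt_ft,
                          vertical_speed_fpm, ground_speed_kt):
    """Distance ahead (nautical miles) at which the target altitude is reached.

    Hypothetical helper assuming constant vertical speed and ground speed.
    Returns None when no prediction is meaningful: the altitude is already
    captured, the airplane is level, or it is moving away from the target.
    """
    delta_ft = target_alt_ft - current_alt_ft
    if delta_ft == 0 or vertical_speed_fpm == 0:
        return None
    minutes_to_target = delta_ft / vertical_speed_fpm
    if minutes_to_target < 0:
        return None  # climbing while the target is below, or the reverse
    return ground_speed_kt * minutes_to_target / 60.0
```

Returning None mirrors the time sharing described in the text: once the
prediction is no longer meaningful, nothing is drawn.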
A similar strategy is used to support the temporary engine exhaust gas
temperature (EGT) limit that applies during engine start. A red radial is shown
on the EGT gauge at the start limit value from the time the start is initiated by
the pilot until the start cycle is completed. If the start operation occurs while
the airplane is in flight, additional information is needed to ensure that enough
air flow is available to complete the start. In this case, appropriate information
about the airspeeds necessary for an unassisted start is displayed near the
primary engine indicators when the engine is not running during flight. If the
airplane is not at a speed sufficient for an unassisted start, the need for
cross-bleed assistance is shown directly on the appropriate engine rpm indicator.
The time sharing illustrated by these examples would not have been possible
without the flexibility of a general purpose display device like the CRT. The
obvious benefit obtained from the engine and performance time sharing
discussed above is the heightened awareness of the time shared data that occurs
during the interval when that data is significant to the pilot. The corollary
benefit may not be so obvious but is, nevertheless, one of the fundamental
operational reasons for considering time sharing. This benefit can best be
illustrated by noting that the most effective displays are those kept simple.
Every extra display element takes time to interpret and introduces additional
opportunities for misinterpretation and error. Further, the errors will not be
confined to the extra data. As noted in the section discussing evaluation, the
presence of nearby symbols, particularly dynamic symbols, can be a significant
enabling factor for error.
A human characteristic that points toward the desirability of simple displays is
the notion of selective attention, or "tunneling." In essence, under certain
conditions many people have a tendency to fixate on selected data or tasks and
ignore others. The circumstances that trigger this phenomenon are highly
individual; but excessive workload, high stress, fatigue, or fear are often
precursors. The task that is attended to may or may not be the most
appropriate for the existing circumstances. Indeed, if tunneling continues for
any significant time, it is likely that the data that would aid the pilot in
recognizing the need for a priority change has, itself, been biased by the lack of
attention. The simpler the normal displays are, the more likely they are to avoid
the tunneling phenomenon. If tunneling does occur and the displays are kept
simple, there is a greater chance that the pilot will see only high priority
information.
Another aspect of human perception that may play a part in the decision to
time share is our human tendency to see what we expect to see. If data are
continuously presented and are normal for an extended time, it is likely that the
threshold at which a pilot will recognize that an abnormality exists will become less
precise. Many tools are available to deal with this characteristic. Most depend
on some form of alerting triggered by a parameter exceeding a limit value. Two
examples illustrate ways of dealing with this phenomenon.
Exhaust gas temperature (EGT) is a basic engine health parameter on most jet
engines. As such, EGT is required to be displayed in the flight deck. It has no
other operational use. The actual value of EGT varies with engine power setting
and altitude in a rather complex way. Thus, over a typical flight, the pilot can
expect to see the EGT value vary from some low value to quite close to the
limit value. Thus, proximity to the limit is not necessarily a concern, but
exceeding the limit is. The reliability of modern jet engines suggests that, on
average, a pilot would see an over limit condition not more than once every
few years. That represents many hours of seeing normal values for every case of
an abnormal value.
Simple limit values can usually be sensed precisely and reliably by the
instrumentation system. That is the case for EGT. Several elements of the EGT
presentation change color when the established EGT limit is exceeded. The
color change affects the EGT pointer, the related EGT digital readout, and the
box drawn around the digital readout. Since the majority of the Engine
Indicating and Crew Alerting System (EICAS) display is white-on-black, this
change to red-on-black is highly visible. With the color change, there is no
doubt that the limit has been exceeded and which engine has the problem.
There are three engine types available for the 767 and the 757. A common type
rating was planned between the two airplanes. None of the engines have
exactly the same values for their limits, but they are displayed in exactly the
same way. Therefore, the pilot doesn't have to memorize a new number when
transitioning between airplane types. Instead, he uses the display exactly the
same way on both airplanes. This is one of the versatile things that can be
done with a CRT instrument.
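
The EGT exceedance coding can be sketched as follows. This is a hedged
illustration rather than the EICAS implementation: the class and function
names are hypothetical, and the white normal color is inferred from the
white-on-black description above. The limit is passed in as a parameter to
reflect the point that each engine type has its own limit value while the
presentation stays the same.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EgtPresentation:
    """Colors of the three EGT elements named in the text (hypothetical type)."""
    pointer: str
    readout: str
    box: str


def egt_presentation(egt_c: float, limit_c: float) -> EgtPresentation:
    """Return the element colors for one engine's EGT indication.

    All three elements turn red together only when the engine-specific limit
    is exceeded; a value at or near the limit stays in the normal color.
    """
    color = "red" if egt_c > limit_c else "white"
    return EgtPresentation(pointer=color, readout=color, box=color)
```

Because the pilot reads the color change rather than a memorized number, the
same function serves any engine type by changing only the limit argument.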
The secondary engine instruments present a slightly different challenge. In this
case, there are five or more parameters per engine. The values of some of these
parameters are only subtly linked to the pilot's operation of the engine. They
may or may not have limits associated with them. Fundamentally, these
parameters are used for long-term engine performance assessment, for backup if
a primary indication fails, or for maintenance assessment if abnormal engine
operation is encountered.
These secondary indications are grouped on the lower EICAS display. The
design of this display is such that the data can be turned off without loss of
limit indication. The computer monitors track those parameters that have limits
and pop up the appropriate information on the display if a parameter goes out
of limits.
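
The pop-up behavior of the blanked lower display can be sketched in the same
spirit. The parameter names, the dictionary layout, and the function name are
assumptions made for illustration; the source states only that monitored
parameters appear automatically when they go out of limits.

```python
def parameters_to_show(display_selected, readings, limits):
    """Names of the secondary engine parameters to draw (hypothetical helper).

    readings: dict mapping parameter name to current value.
    limits:   dict mapping parameter name to a (low, high) pair, only for
              those parameters that have limits.
    When the pilot has selected the display, everything is shown. When it is
    blanked, only parameters outside their limits pop up.
    """
    if display_selected:
        return sorted(readings)
    out_of_limits = []
    for name, value in readings.items():
        if name in limits:
            low, high = limits[name]
            if not (low <= value <= high):
                out_of_limits.append(name)
    return sorted(out_of_limits)
```

Parameters without an entry in limits never pop up, which matches the remark
that some secondary parameters may not have limits associated with them.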
Recommended usage of this feature for most engine-airframe combinations is to
have the lower display active during engine start and then to blank the display
for normal flight operations. Of course the pilot should activate the lower
display any time he wishes to check any of the secondary data. The flexibility of
use of this feature allows airlines and pilots to tailor operations to fit their
particular operating style. At the same time, the availability of the feature
recognizes that it is unreasonable to expect that all pilots will be properly
attentive to displayed information regardless of the circumstances and the
quantity of data actively displayed.
All the time sharing discussed to this point has involved changes to the data
content of an existing display. In all cases, the basic conceptual framework for
each display remains intact. The most general form of time sharing involves
conceptual changes in the content of the display. In the extreme, this could
mean that the display surface is used sequentially for totally independent tasks
involving completely different information.
Successful implementation of this type of time sharing requires careful attention
to the details of all related tasks and to the circumstances under which
switching from one task to another will occur. Recognizing and supporting all
the task linkages that can occur, particularly those associated with non-normal
operation, is a key prerequisite for success.
Selecting the various modes of a time shared display will be most successful if
the conceptual model used to implement the switching matches the pilots'
understanding of system usage. For complex systems, this is a difficult task since
the level of system usage understanding will likely be different from pilot to
pilot. Understanding will also be different for a single pilot as his skill with the
system evolves from novice to expert. For example, a tree-structured selection
concept is often preferred during initial training but shifts as experience is
gained to a preference for direct selection, particularly for frequently used
features.
Accommodating this shift can be accomplished in many different ways involving
design or training or some combination. Deciding what is best in a particular
application is a complex task. No one answer is correct for all situations. A
thorough understanding of the tasks, and their criticality in relation to other
flight tasks, is the best basis for initiating the decision.

Command vs. Situation-Prediction Displays
Most flight deck displays support continuous control tasks, decision-making
tasks, monitoring tasks, or some combination of these. At the task level, the
supporting display information can:
1) show the current situation;
2) show what should be done to accomplish an established goal, or
3) show what will happen if the current action is maintained.
These three types of information can be categorized as: situation, command,
and prediction respectively. Various combinations of these data can be used to
optimize support for specific tasks.
Situation data is fundamental to many monitoring tasks and most, if not all,
decision-making tasks. Command information is often associated with high
precision control tasks. Prediction information can be used with all three task
types though usually it is not used alone but in conjunction with situation
information.
Situation information has the broadest applicability across tasks. It often entails
more information transfer to the pilot than other means. The minimum situation
data to support the lateral control task would involve:
o current airplane location with respect to the desired location,

o airplane heading (or track angle),

o airplane speed,

o airplane bank angle,

o limits associated with any of these parameters, and

o any other applicable constraints.
Understanding all of these data places the pilot in an excellent position to
recognize subtle deviations from plan or in expected performance. It also gives
him the widest possible range of task execution strategies. At the same time, it
requires considerable skill to correlate all of this information correctly and to
select the proper control strategy. Even for highly skilled pilots, there are
practical limits on how fast this task can be completed correctly.
Command information simplifies the information processing load on the pilot by
integrating the relevant information into a new piece of information indicating
how much control should be applied. By presenting to the pilot the difference
between the computed desired control input and the actual input, he can see
immediately what should be done. This greatly reduces the information
processing workload on the pilot and reduces his response time essentially to
that associated with simple eye-hand coordination.
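
The idea of a command indication as the difference between a computed desired
input and the actual input can be sketched as below. The proportional control
law, the gains, the bank limit, and the sign conventions are hypothetical
values chosen for illustration; they are not a real flight director law and do
not come from the source.

```python
def bank_command_deg(cross_track_error_nm, track_error_deg, actual_bank_deg,
                     k_xte=10.0, k_trk=1.0, bank_limit_deg=25.0):
    """Bank correction to display to the pilot (hypothetical sketch).

    Sign convention (assumed): positive values mean right of course, tracking
    right of the desired course, or right bank. The desired bank integrates
    several situation parameters into one number, and the displayed command
    is the difference between that desired bank and the actual bank.
    """
    desired = -(k_xte * cross_track_error_nm + k_trk * track_error_deg)
    desired = max(-bank_limit_deg, min(bank_limit_deg, desired))
    return desired - actual_bank_deg
```

A result of zero means the pilot is already holding the computed input, so the
task reduces to nulling one indication rather than correlating several.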
There are several costs associated with command information. The reduction in
processing load on the pilot means that his awareness of the situation is also
reduced. Similarly the choice of execution strategy is handled by the command
generator rather than the pilot. Where performance demands are high, these
costs may be considered acceptable or they may be reduced by procedurally
involving the other pilot in some portion of the task.
Predictive information, like command information, combines data to reduce the
processing workload. However, while the command information is based on a
predefined control strategy, predictive information is based on the existing
control strategy. Furthermore interpretation of the prediction requires enough
understanding of the situation to determine the suitability of the current control
input. This explains why most predictive information is presented in the context
of a situation display.
The 767 map display contains several predictions. Those associated with lateral
maneuvering clearly illustrate the differences between prediction and command
information. In determining how to maneuver laterally, the pilot has a number
of decisions to make. One involves how much turn rate is needed and another
involves how quickly to roll out of the turn. A command display would indicate
how much bank to use for a pre-established turn rate by showing a bank
command to the pilot at the appropriate time. Then, at the appropriate time for
roll out, an opposite bank command would indicate how quickly the pilot
should reduce the bank angle to re-establish level flight.

The corresponding predictive information on the map (see figure 10.6) consists
of a variable radius circular arc symbol whose radius varies with the current
turn rate. In this case, the pilot can see that he has selected the proper bank
angle when the arc is tangent to the desired path or when it passes through the
desired point ahead of the aircraft. A fixed straight line from the airplane
symbol to the top of the display shows the path the airplane would follow if
the turn rate were zero. The separation between this fixed line and the desired
path line or target waypoint, together with its rate of closure, provides the
position and rate information the pilot needs to select and control his roll out
to level flight again. These predictions are very simple but very powerful.

Figure labels: Track Line; Desired Path; Curved Trend Vector; Current Position.
Figure 10.6 Variable radius circular arc symbol whose radius varies with the current turn rate. (original figure)

The length of the curved trend vector is proportional to the airplane ground
speed. Gaps in the curved trend vector show where the airplane will be 30, 60,
and 90 seconds ahead of current position. Of course the pilot can get some
sense of speed from how fast the map information is moving beneath the
airplane symbol. However, the fixed time intervals of the arc symbol provide the
pilot with a relative time reference to use in interpreting the rest of the display
information.
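
The geometry of the curved trend vector can be illustrated with a short
sketch. It assumes a steady turn at constant ground speed with no wind
modeling, and a track-up map frame with the airplane at the origin, x to the
right and y ahead. The function name and units are assumptions. Only the 30,
60, and 90 second marks come from the text.

```python
import math


def trend_vector_points(ground_speed_kt, turn_rate_deg_s,
                        times_s=(30.0, 60.0, 90.0)):
    """Predicted positions (x, y) in nautical miles at each look-ahead time.

    For a steady turn the path is a circular arc of radius r = V / omega, so
    the radius shrinks as turn rate grows. With zero turn rate the prediction
    falls on the straight track line. Positive turn rate is a right turn.
    """
    v_nm_s = ground_speed_kt / 3600.0
    omega = math.radians(turn_rate_deg_s)
    points = []
    for t in times_s:
        if abs(omega) < 1e-9:
            x, y = 0.0, v_nm_s * t
        else:
            r = v_nm_s / omega
            x = r * (1.0 - math.cos(omega * t))
            y = r * math.sin(omega * t)
        points.append((x, y))
    return points
```

The arc length to each point is ground speed times look-ahead time, which is
why the length of the symbol is proportional to ground speed.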
The predictive information does not directly tell the pilot when to maneuver nor
does it demand a particular maneuvering strategy. The pilot must make these
decisions. In order to make them, he must have an understanding of the current
flight situation. After a little practice with the predictive information, the pilot
can make those decisions very accurately. Because of his interaction with the
rest of the map information, good situation awareness is ensured.
Predictive displays are best suited to tasks where both deviations from some
plan or standard and some form of rate information are involved. These displays
are usually superior to other forms where both control and a related
monitoring task must be performed.
Which type of display is best? The answer is the one that most consistently and
accurately enables the pilot to achieve the performance goals associated with
the task he is doing. Here again is strong support for the necessity of
understanding the task and the related information requirements before
selecting the display format or symbology.

Future Display Issues


The broad acceptance of computer-generated data and the trend toward
graphical user interfaces suggest that the flight decks of the future will contain
more general purpose displays and that the pilots will expect to see much of
the data presented in a graphic form. Technology trends indicate that flat panel
displays may well replace CRTs as the display of choice for many applications.
The detailed human factors issues associated with flat panel displays are quite
different from those of the CRT since the image generation mechanisms are
completely different. Though the technology details are different, the
methodology for developing and evaluating such displays will remain consistent
with the process outlined in figure 10.1. In the past, there has been a steady
trend towards more and more data being made available to the pilot. Large
format, computer-generated displays can readily overwhelm the pilot with
information. Adherence to a structured process for evaluating pilot performance
when using these displays will become increasingly necessary. Techniques such
as time-sharing and adaptive selection of display information will be primary
aids to the designer in coping with the information expansion. The certification
issues raised by these techniques will need thoughtful consideration and debate.
Effective management of the rapidly expanding flight deck information system
will require the cooperation of many people and organizations that support the
pilot. A common understanding of both desired performance and actual
performance along with the means to share this understanding across the
industry will be very helpful. Human engineering plays a significant part in this
process by providing a common understanding of the pilot and his performance
to all of the participants in this endeavor.

Chapter 11
Workload Assessment
by Delmar M. Fadden, Chief Engineer-Flight Deck, Boeing Commercial Airplane
Group
Workload assessment became a formal part of the certification of large
commercial transports with the adoption of Appendix D to FAR Part 25. While
Appendix D identifies the need for such assessment, it does not define the
means. In retrospect, the fortuitous lack of rigidly defined methodology
prompted considerable research and development that otherwise might not have
occurred. The expansion of workload understanding and of the methods for
assessing workload has enabled the industry to keep pace with the rapidly
evolving character of crew workload over the last quarter century.
Operational differences in airplanes, such as the 737, 757/767, and 747-400,
cause changes in the workload the pilot experiences. The nature of these
changes has led to changes in the tools used to assess workload. On the 737,
the workload of primary concern was the shift of system management
responsibility to the two pilots. The flying task assigned to the pilots did not
change significantly between the 727 and the 737. The physical layout of the
column and wheel, primary flight displays, and the cockpit windows remained
very similar to the 727. The tasks that did change were those associated with
engines and systems management. The engine management tasks were subtly
different, reflecting the twin engine configuration of the 737. The systems
underwent substantial change to bring them into conformity with the two-pilot
operating concept.
By the time the 767 design was initiated, extensive experience had been
obtained from a wide range of two-crew operations around the world. This
experience confirmed the soundness of the basic principles underlying the
design of systems for two-crew operation. However, airline desire for improved
operating efficiency, coupled with the increasing complexity of the air traffic
control environment, argued for significant enhancements to the primary flight
information. A new flight management system concept was devised featuring
cathode ray tube (CRT) flight instruments and digital computers handling many
of the navigation, flight planning, and performance assessment calculations.
These changes altered the pilots' tasks in ways that achieved improved efficiency
and greater overall situational awareness. These changes produced
corresponding changes in the pilots' experience of workload.
The 747-400 incorporates both the systems enhancements that had been
pioneered on the 737 and the flight management capabilities first introduced on
the 757 and 767. In addition, the primary instrument panel is modified,
permitting the use of larger CRT displays. Finally, a number of new information
management features assist the pilot in coping with the increasing quantity of
flight, engine, and systems information available. These changes, along with a
complete redesign of the airplane systems, made it possible to change the crew
size from three, as it had been on previous 747 models, to two. The workload
concerns in this case focused on the integration effectiveness of the overall
flight deck design.
This chapter reviews the evolving techniques that have been found useful for
assessing workload in modern jet transports. Emphasis is placed on workload
assessment in the early stages of design, since that is the time when
quantitative workload data is the most effective in shaping the product. The
techniques that have been developed to add structure to the subjective
assessments of the evaluation pilots are described. Several issues that have
significant effect on workload and the workload certification process are
presented. The chapter concludes with a discussion of pilot error and a glimpse
at future workload issues.

Workload Methods

Commercial aviation, during the jet age, has established an excellent record for
safety. The skills of many pilots have been a vital factor in that achievement.
Nevertheless, when accidents do occur, history indicates that some type of pilot
error will be involved in over 70% of the cases. Any work that leads to a
reduction in the consequences of pilot error has the potential to improve the
future accident record. While pilot workload, per se, has never been cited as the
cause of an accident, there is a common perception that workload and error are
related in some fashion.
Workload on a commercial airliner seldom, if ever, reaches the absolute limits
of the flight crew. However, circumstances do arise which result in a significant
elevation of workload. Whether or not such increases are large enough to cause
concern about the potential for error is one of the reasons for doing workload
assessment. The general relationship between workload and error is not well
understood, even within the human engineering community. There is general
agreement that error increases at both extremely low and extremely high
workload levels. In between, evidence for any direct relationship is weak or
nonexistent. Individual differences between pilots contribute to the difficulty of
establishing a useful working relationship for workload and error. There
appear to be significant variations in the level at which workload is considered
extremely high or extremely low from one individual to another and, even, for
the same individual under different personal and environmental circumstances.
Regulations applicable to commercial aircraft treat workload as a series of
factors that must be considered for each of the primary flight functions. The
workload factors, identified in Appendix D to FAR Part 25, constitute several of
the key dimensions through which a pilot experiences workload. The
characteristics describing these factors remain reasonably consistent for any one
pilot across a variety of vehicles and flight conditions. Differences among
individuals, however, tend to be large. The workload functions, also identified
in Appendix D, encompass the major functional tasks normally assigned to the
pilot. The details of these tasks, the related specific performance objectives, and
the relative task priorities, vary considerably from one aircraft type to another.
Workload assessment plays a dual role in the design and development process.
During the design cycle, workload assessment provides insights about the design
that identify opportunities for improving the pilot interface. Workload
assessment during the certification process provides a structured method for
examining the various workload issues that are relevant to the particular
aircraft type under scrutiny. Because it is very difficult to change the
fundamental factors that establish crew workload after the airplane is built,
manufacturers place heavy emphasis on the selection and use of assessment
methods that correlate well across these two roles.
The design development role argues for assessment methods that are both
sensitive to detail and quantitative. The number, type, and timing of required
tasks are important elements in determining how the design of the flight deck
will influence the pilots' subjective experience of workload. Yet the pace of most
development programs is such that workload assessment methods must be
simple enough for timely application. Furthermore, since the entire airplane
design does not approach maturity at a constant rate, the workload
methodology must support assessments of isolated systems as well as
assessments of the entire airplane.
For certification assessment, the diagnostic sensitivity of the workload method is
less important than its overall vehicle applicability. The reality of certification in
a social and political, as well as technical, environment means that particular
attention must be paid to any unique or unusual features of the vehicle or its
environment. Thus, the certification methodology must be flexible enough to
adapt quickly to new tasks, new technologies, or new human performance
concerns.
Since aviation progress is normally evolutionary, each new airplane type will
contain a mixture of significant design changes and designs closely linked to
previous airplanes. History during the jet age indicates that the elements of
design undergoing the greatest change shift focus from one generation of
airplanes to the next. It is, therefore, not surprising that the analytical methods
that have been developed depend on comparisons between the new design and
existing designs having an established safety and operational performance
record.
The multidimensional nature of the workload experience makes it unlikely that
a single absolute workload scale will ever be developed. Indeed there is reason
to suspect that creation of such a scale would be of little practical utility in the
development of commercial cockpits. Instead, all current workload assessment
techniques involve multiple measures, most of which depend on some form of
comparison. The comparison will determine if the new design has the higher
workload. Whether the difference is significant depends on the magnitude of
the difference, the length of time the difference remains, and the phase of flight
when the difference occurs.

Commercial Aircraft Workload
Commercial aircraft workload can be divided into two broad regimes: normal
and nonnormal. The former constitutes all the tasks associated with planned
operation of the aircraft, including:
o all allowable flight operations,
o all certified weather operations,
o certified minimum crew size,
o selected equipment unavailability under the minimum equipment list,
and
o normal flight operations following probable equipment fault or failure
conditions (exclude tasks associated directly with management of the
fault or failure).
Normal workload presumes compliance with all operating and performance
requirements along with adherence to all restrictions, limitations and established
policies. Under nonnormal conditions, strict compliance with normal operating
requirements can be relaxed, as long as aircraft or personnel safety is not
further compromised. In addition, through appropriate coordination, it may be
possible to relax adherence to certain externally imposed restrictions or
performance standards. Such relaxation plays a significant part in mitigating
additional workload that might otherwise accrue from nonnormal events.
All remaining tasks are considered nonnormal. Both the consequences of
occurrence and the probability of occurrence are considered in determining
which nonnormal tasks are identified with specific procedures in the operational
documentation and the training the pilot receives. During design, assessments
are made of all possible ways in which safety hazards can occur. In this
manner, the relevance of every nonnormal event is determined. Experience
shows that particular attention is needed for events that are associated with:
o other than normal flight conditions,
o incapacitation of a required crew member,
o management of equipment fault or failure conditions,
o flight operations subsequent to improbable equipment fault or failure
conditions, or
o flight operations following combinations of faults and nonnormal
events.
An important aspect of nonnormal workload management concerns the design
of equipment and procedures that minimize the consequences of failures on
subsequent aircraft operations. This focus has the obvious benefit of reducing
the aggregate workload, but, what is more important, it also reduces the
opportunities for error that would accompany a sustained change in procedures.
This principle is embedded in the systems design for Boeing airplanes and has
produced many nonnormal procedures that are independent, time-limited task
sequences. This results in getting the pilot back to normal flight operations and
normal procedures very quickly for most first failure conditions.
While care must be exercised to avoid unnecessary workload buildup, staying
well below the pilot's maximum workload capability is a relatively
straightforward task to accomplish on a commercial flight deck. Important as it
is, attention to task loading alone is not sufficient to ensure an error-tolerant
flight deck. The timing of tasks plays a significant role in determining what
opportunities for error may be encountered. Thus, it is recognized as desirable
to organize the normal task loading with the following timing-related guidelines
in mind:
o it should be possible to interrupt any procedural task sequence at any
point to accomplish time or event-driven actions,
o abrupt changes in normal task loading should be avoided, particularly,
during the departure and arrival phases of flight,
o the need for precisely timed tasks should be minimized,
o where task start time constraints are necessary, task completion time
requirements should be relaxed,
o similarly, where task completion time constraints exist, the start time
requirements should be flexible.
Rigid application of these guidelines is not necessary, but deviations should be
treated as circumstances meriting special attention.

Workload Assessment Scheduling


Figure 11.1 shows a typical workload assessment program. Workload assessment
is initiated early in the process so that the results can be used in optimizing the
design. A typical airplane development program at Boeing usually takes five to
six years. The fundamental decisions that shape the basic airplane itself are
frequently made in the first 12 to 24 months of the design activity. Structured
workload assessments usually begin about 50 months before certification. The
assessment tools selected at this point provide useful insight even though many
of the details of the design are not yet finished. Where a reasonable degree of
task similarity exists, comparative analyses based on these tasks can provide an
anchored reference. Where the task or the information presented is new and
cannot be quantitatively linked to previous designs, some form of laboratory
assessment, part-task simulation, or even experimental flight test may be
necessary, particularly if the task is important to the safety or operating success
of the airplane.

Figure 11.1 A Typical Five-Year Workload Assessment Program.
(Figure labels recoverable from the original: Reference Mission Analysis; Selective Comparative Analysis; New Airplane Verification; Subsystems; Part Task and Full Mission Simulation; Avionics; Airplane; Certification Flight Test; Product Development; Certification; and the program stages Go Ahead, Specifications, Design, Test, Roll Out, and Delivery.)
The costs of using these tools, particularly simulation or flight test, are not
limited to dollars but extend to the time and human resources they absorb.
Since committing resources reduces their availability for other developmental
work, those issues selected for this type of testing are carefully considered and
prioritized.
The first step in many new airplane workload assessment programs is a
comparative analysis of the internal airplane systems: electrical power,
hydraulic, pneumatic, environmental control, and fuel being the most important.
This analysis depends on system knowledge but requires little detail about
events external to the airplane. Any impacts associated with external events or
inter-system effects will be incorporated in subsequent analyses. Negative or
neutral analytic results indicate where to focus further design attention. As with
all analytic methods, these analyses provide visibility based on known or
hypothesized relationships. Additional testing must be done when the possibility
of unanticipated relationships between design elements or crew tasks cannot
otherwise be reduced to an acceptable level.
During design of the 767, the analytic workload assessment process resulted in
two additional design optimization cycles for the hydraulic system and one
added cycle for the pneumatic system. These cycles occurred well before
hardware was built, at a time when significant design flexibility remained.
Similarly, the fuel system of the 757 was changed from a five-tank to a
three-tank configuration based on workload considerations. The fuel tank issue
is particularly interesting because it illustrates the complexity of achieving truly
effective designs.
Fuel is a major element of weight in a long range airplane. The distribution of
that weight in the wing affects the stress each portion of the wing will
experience during flight. The structural weight of the wing is directly related to
these stresses. Naturally, the more the structure weighs, the more fuel must be
carried to lift the extra weight. It is advantageous to reduce the bending stress
by having more weight remain within the outboard portion of the wing than
the inboard portion as fuel is burned during flight. Consideration of these
factors for an airplane the size of the 757 suggested that the best structural
design solution would involve five fuel tanks: two outboard, two mid-wing, and
a center tank. However, such a system would make necessary additional routine
actions to manage the flow of fuel to each engine.
On a three-tank system, the center tank pumps normally operate at higher
output pressure than the wing tank pumps. This ensures that center fuel will be
used first. Operating procedures can be kept very simple: turn on all pumps
before take-off and turn off the center pumps when center fuel is exhausted.
Managing a five-tank system is more complicated for the pilot, unless a system
is added to sequence the fuel automatically. Considering the criticality of the
fuel system and the additional complexity that would be necessary to
compensate for new failure modes, such added automation would result in an
increase in electronics weight, several new nonnormal procedures, and increased
maintenance requirements. Several design iterations addressed each of these
issues and resulted in a revised fuel system that achieved lower total weight
and the simplicity of a three-tank design. Reaching this decision required
agreement across several, otherwise independent, functional groups within the
design organization, the regulatory organization, and the airlines, thereby
adding considerable time and effort to the design. The in-service results suggest
that the effort was worthwhile.
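
The pressure-priority behavior of the three-tank arrangement described above can be sketched in a few lines of Python. The pressure values, fuel quantities, burn step, and names below are hypothetical and chosen only to show the ordering effect; the check for remaining fuel stands in for an empty tank delivering no flow.

```python
# Illustrative values only; actual pump pressures and quantities are not given in the text.
PUMP_PRESSURE_PSI = {"center": 20, "wing": 10}

def feeding_source(pumps_on, fuel_lb):
    """Among tanks whose pumps are on and that still hold fuel, return the one
    with the highest pump output pressure (None if no tank can feed)."""
    live = [t for t, on in pumps_on.items() if on and fuel_lb[t] > 0]
    return max(live, key=PUMP_PRESSURE_PSI.get) if live else None

fuel = {"center": 3000, "wing": 9000}
pumps = {"center": True, "wing": True}    # all pumps on before takeoff
order = []
while (src := feeding_source(pumps, fuel)) is not None:
    if not order or order[-1] != src:
        order.append(src)                 # record each change of feeding tank
    fuel[src] -= 1000                     # burn a fixed amount per step
print(order)  # ['center', 'wing']
```

No sequencing logic appears anywhere in the sketch; the ordering falls out of the pressure difference alone, which is the point the text makes about keeping the procedure simple.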
This example points out how important the early workload estimates are.
Redesigning the tank layout would not have been practical had the workload
assessment been delayed until a full mission simulation or a flight test vehicle
was available. In this case, the workload concern was identified by the
manufacturer who then took timely steps to resolve it. Had the issue been a
regulatory concern, it would have been equally important to identify early.
Workload Assessment Criteria
Should analytic workload techniques be used for certification? It is convenient
for the manufacturer if they are, because the manufacturer has already applied
them, and based a design upon them. If the regulatory agency and the
manufacturer both agree on the scope and validity of such methods, then they
can be highly useful.
Boeing starts with a subsystem analysis program called Subsystems Workload
Assessment Tool (SWAT). The SWAT program assesses both normal and
nonnormal procedures. The primary purpose of this program is to relate the
operating procedures, the display and control devices, and the geometry of the
cockpit using a common measure. The subsystem analyses are not related to a
specific mission so all the normal and nonnormal procedures are accomplished
serially. The analysis encompasses time and motion assessments for hand and
eye tasks. Such ergonomic data is essential for ensuring that displays and
controls are properly located within the system panel. Time and complexity
assessments for aural, verbal, eye, hand, and cognitive tasks are also examined.
The complexity score is a method of estimating the mental effort related to
gathering information. It characterizes the information content of the displays
and the number of discrete operating choices available to the pilot using a
logarithmic measure (BITs). SWAT generates summary statistics for each system
and for all systems.
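
The logarithmic measure mentioned above follows the classic information-theory definition, in which a choice among n equally likely alternatives carries log2(n) bits. The sketch below applies that definition to a short list of controls. The control names, the number of positions, and the equal-likelihood assumption are illustrative; they are not the actual SWAT scoring rules, which the text does not give.

```python
import math

def choice_bits(n_alternatives):
    """Information, in bits, of one choice among equally likely alternatives."""
    if n_alternatives < 1:
        raise ValueError("need at least one alternative")
    return math.log2(n_alternatives)

def procedure_complexity(steps):
    """Sum the bits over every step of a procedure.
    `steps` maps a (hypothetical) control name to its number of positions."""
    return sum(choice_bits(n) for n in steps.values())

# Hypothetical procedure: two 2-position pump switches and one 4-position selector
example = {"center pump L": 2, "center pump R": 2, "crossfeed selector": 4}
print(procedure_complexity(example))  # 1 + 1 + 2 = 4.0 bits
```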
Table 11.1 is a systems workload data summary comparing 767-200 normal
inflight procedures with those for the 737. The data reflect that the 767-200
systems require that the pilot only switch off the two center tank fuel pumps
when the center tank is depleted. The 737 requires a few more hand and eye
tasks.
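Table 11.1
Subsystems Workload Data Summary
Normal Inflight Procedures (Boeing, 1982)

(Table layout lost in extraction. The table has one row each for the 737 and the 767-200. Its columns cover the Eye Activity Channel and the Hand Activity Channel, each with Motion, Time, Complexity (BITs), and Tasks; hand motion is in inches. Value pairs as extracted, 737 / 767-200, in extraction order: 212 / 1; 17 / 1; 32 / 3; 32 / 2; 17 / 3; [illegible] / 2; 14 / 2; [illegible] / 2.)

Combines results for electrical, hydraulic, ECS, and fuel subsystems.
The BIT score derives from the classic definition of information.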

Table 11.2 is a similar systems workload data summary comparing all
nonnormal inflight procedures for the same aircraft. This table provides a gross
check on the overall effect of nonnormal procedures. If any of the 767-200
statistics had exceeded the comparable data for the 737, that would have been
an indicator that additional investigation is essential. To understand how
individual systems fare in the comparison, it is necessary to examine individual
systems data at a more detailed level. Interpretation of these data requires
thorough knowledge of the system operation and the intended pilot interface.
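
The gross check described above amounts to comparing each summary statistic of the new design with the same statistic for the reference airplane and flagging any that is higher. A minimal sketch follows; the statistic names and the numbers are hypothetical and are not values from Table 11.2.

```python
def flag_exceedances(reference, candidate):
    """Return the statistics for which the candidate design's value exceeds
    the reference airplane's value."""
    return [name for name in reference if candidate[name] > reference[name]]

# Hypothetical summary statistics, not values from Table 11.2
reference_737 = {"eye_time_s": 300, "eye_tasks": 150, "hand_time_s": 180, "hand_tasks": 90}
new_design = {"eye_time_s": 280, "eye_tasks": 120, "hand_time_s": 195, "hand_tasks": 80}
print(flag_exceedances(reference_737, new_design))  # ['hand_time_s']
```

A flagged statistic only indicates where to look; as the text notes, interpreting it still requires the detailed system-level data.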
Table 11.2
Subsystems Workload Data Summary
Nonnormal Inflight Procedures (Boeing, 1982)

(Table layout lost in extraction. The table has one row each for the 737 and the 767-200. Its columns cover the Eye Activity Channel and the Hand Activity Channel, each with Motion, Time, Complexity (BITs), and Tasks. Value pairs as extracted, 737 / 767-200, in extraction order: 2214 / 1348; 330 / 183; 458 / 3[illegible]5; 348 / 297; 169 / 126; 183 / 154; 140 / 134; 88 / 81.)

Combines results for electrical, hydraulic, ECS, and fuel subsystems.
The BIT score derives from the classic definition of information.

Figures 11.2 and 11.3 summarize workload evaluation results for various
airplanes under normal and nonnormal procedures, respectively. The two-crew
747-400 evaluation shown in the top two graphs of figure 11.2, for example,
has a lower average number of tasks and lower average time to complete the
tasks than the three-crew 747-200. These graphs show that the 747-400 results
are similar to the results for the 737 and 767. These comparisons give the
designer an initial indication of how the workload associated with a new design
will compare with other airplanes. The normal procedure eye analysis (upper
left graph) shows that the 747-400 has a larger average number of tasks, but
that on average, the tasks take less time to do than on the 737. The goal in
this case is ensuring that total task time is similar to the total task time of
another airplane having a good operating history. The 737 has been used as a
reference by Boeing since the development of the 767, because the 737 has an
excellent safety record and is flown by more customers, in more environments,
using a wider diversity of pilots, than any other Boeing airplane. Experience
indicates that the 737 is highly tolerant of pilot error and that it supports many
different operating strategies.
A workload assessment summary for nonnormal procedures is shown in figure
11.3. Numeric totals are not particularly interesting by themselves because, even
under the worst of circumstances, the pilot will use only a small percentage of
the nonnormal procedures at any one time. Minimizing the number of tasks per
procedure is considered desirable. While the 767 nonnormal workload is
consistently the lowest, the corresponding 747-400 workload is significantly
closer to that of the 737 on these logarithmic graphs than to the three-crew
747-200.
Figure 11.2 Systems Normal Procedures Workload Results for Various Airplanes. (Boeing, 1989)
(Four logarithmic panels: Time vs. Tasks - Hands, Time vs. Tasks - Eyes, Motion vs. Tasks - Hands, and Motion vs. Tasks - Eyes. The vertical axes are average time in seconds, average hand motion in inches, and average eye motion in degrees; the horizontal axis is average number of tasks. Points are plotted for the 737, 767, 747-200, and 747-400.)

Figure 11.3 Systems Nonnormal Procedures Workload Results for Various Airplanes. (Boeing, 1989)
(Same four panels and axes as figure 11.2.)

In successfully reducing workload, designers can establish circumstances where
the crew has limited opportunities to experience certain events. If the crew
response to such an event requires proficiency at a physical skill, a mental data
manipulation, or a complex decision, then some alternate means for developing
and maintaining that proficiency may be required. It is certainly a poor trade-off
to sacrifice achievable reliability and efficient operations simply to retain skills. As
an example of the trade-off, the CRT displays on many newer airplanes are two
to four times more reliable than the horizontal situation indicators (HSIs) on
previous airplanes. As a result, the pilot doesn't have to use the standby
instruments very often. The CRT presents data in a map format that is much
easier to interpret than the symbolic presentations of the standby navigation
instruments. Most pilots would say this is a good trade-off. However, this
trade-off means that when the pilot must use the standby instruments, it is likely
he will be less proficient with them than would have been the case on previous
airplanes. Of course, simulator training can, at least partially, compensate for the
loss of line exposure to the actual condition. If these possibilities are recognized
early in the design process there may be other options. The designer may be able
to design standby instrument flight procedures that are tolerant of a reduced skill
level or allow for a longer transition period during which the pilot regains the
needed skill.
Timeline Analysis
Once all the individual systems and panels are defined by hardware, functional
description, operating procedures, and layout, the assessment process can be
expanded to incorporate realistic operational scenarios. This permits quantitative
evaluations of issues related to panel location within the cockpit, multiple
system operations, and, most importantly, the time criticality of functions.
Time is one of the key dimensions of workload. Is sufficient time available for
the pilot to complete all the tasks necessary to operate the airplane efficiently
and safely? Timeline analysis is a structured methodology for examining this
question. The fundamental equation for timeline analysis is the ratio of the time
it takes to complete a task to the time available for the task. This sounds like a
very simple idea. In practice there are many issues that must be addressed to
accomplish the analysis. At the point in the development process where timeline
analysis is first done, actual operating hardware is not yet available. Estimates
of the time required to complete each task must be made. When hardware or a
suitable simulator is available, the time estimates can be checked and
appropriate adjustments made to the analytic data.
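
The ratio at the heart of timeline analysis can be written as time required divided by time available. The following sketch computes that ratio for each 300-second block from a list of estimated task times. The task list is hypothetical, and charging each task to the block in which it starts is a simplifying assumption, not a rule stated in the text.

```python
def block_loading(tasks, block_s=300):
    """Ratio of time required to time available for each time block.
    `tasks` is a list of (start_s, duration_s) tuples; a task is charged
    to the block in which it starts (a simplification)."""
    required = {}
    for start_s, duration_s in tasks:
        block = int(start_s // block_s)
        required[block] = required.get(block, 0.0) + duration_s
    return {b: t / block_s for b, t in sorted(required.items())}

# Hypothetical task estimates: (start time, duration), both in seconds
tasks = [(10, 45), (120, 60), (290, 30), (400, 90)]
print(block_loading(tasks))  # {0: 0.45, 1: 0.3}
```

When hardware or a simulator becomes available, the estimated durations would be replaced with measured ones and the ratios recomputed.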
Timeline analysis is accomplished by examining what the pilot does in every
300-second (5-minute) time block along the course of an entire flight. The
average workload over the entire flight is of little interest because the workload
during departure and arrival is much higher than that during the, often much
longer, cruise phase of the flight. Consequently, statistics are focused on the
arrival and departure phases of flight or on each of the 300-second blocks. Four
separate channels of activity are examined: visual (eyes), motor (hands), aural,
and verbal. Modal initiation and execution times for each task are recorded and
each task is assumed to require 100% channel capacity for the duration of that
task. Tasks are shifted as necessary to avoid overlap. Recent research results
indicate that the 100% channel capacity assumption is significantly more
conservative than necessary. However, it has proven useful, in the relatively
benign workload environment of commercial aviation, by ensuring early
identification of any brief periods when the assumption might be violated. It
also avoids the necessity of collecting data justifying the selection of a lesser
percentage.
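
The 100% channel capacity assumption, with tasks shifted to avoid overlap, can be illustrated for a single channel as follows. The task names and times are hypothetical, and processing tasks in order of earliest start is one plausible shifting rule, assumed here because the text does not specify one.

```python
def schedule_channel(tasks):
    """Place tasks on one channel so that none overlap.
    `tasks` is a list of (name, earliest_start_s, duration_s). Each task takes
    the whole channel, so a task that would overlap is shifted later."""
    timeline, free_at = [], 0.0
    for name, earliest, duration in sorted(tasks, key=lambda t: t[1]):
        start = max(earliest, free_at)    # wait until the channel is free
        free_at = start + duration
        timeline.append((name, start, free_at))
    return timeline

# Hypothetical visual-channel tasks
visual = [("check airspeed", 0, 2), ("read altimeter", 1, 2), ("scan map", 10, 4)]
for name, start, end in schedule_channel(visual):
    print(f"{name}: {start:.0f}-{end:.0f} s")
```

Running the same routine separately for the visual, motor, aural, and verbal channels keeps the four channels independent, in line with the approach described in the text.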

The decision to keep the four channels separate has a similar expediency basis.
From the design point of view, knowledge of the specific channel workload is
essential if any adjustments are required. Thus, a combined statistic would
serve only as an intermediate step toward getting to the specific channel workload.
Combining the channel workload data immediately raises the question of the
basis for the combination. With the exception of the aural-verbal pair,
experience indicates that all the pairings can overlap successfully most of the
time. The circumstances where complete overlap may not work appear to
involve task events unfamiliar to the pilot or tasks of unusual complexity. The
idiosyncratic nature of these circumstances makes a rule for identifying them
difficult to develop and even harder to defend. The reason usually given for
wanting a single workload number is to simplify the decision of whether the
overall workload is acceptable. The lack of a firm basis for combining the
channels has led Boeing to focus on the individual channel statistics.
Timeline analysis provides visibility of both dwell time and transition time.
Dwell time is the time taken to read or operate the specific control or display
device; for example: adjusting a control, reading information from a display,
entering a way point name, or selecting a new switch position. Transition time
is the time taken to switch from one activity to another. Examples of transition
time are: moving the eye-point-of-regard from one display to another, moving
the hand from the control column to the throttles, or changing from looking
outside the cockpit to focusing on the instrument panel.
Comparing dwell time and transition time data for different flight decks
provides useful information about the effectiveness of a particular design. If the
dwell times for the design are high, then the system designer needs to consider
using alternative display formats or control devices. If the transition times are
high, the flight deck designer is prompted to examine alternate physical
arrangements of the various controls and displays. Table 11.3 is a flight
procedure workload data summary for a Chicago to St. Louis flight depicting
dwell and transition times for eye and hand activities. These are total dwell and
transition times needed for the entire flight. These particular data were
generated early in the 767 development as a gross indication of the design
progress.
Table 11.3
Flight Procedure Workload Data Summary
Chicago to St. Louis Flight Totals (Boeing, 1979)
Captain

Eye Activity Channel


Dwell
Time

(sc.

737
767-200

550
372

Transition
Time

(se

24
20

BTs_
2,271
1,811

Hand Activity Channel


737

767-200

737
767-200

199

161

510
331

119

91

629

370

First Officer
Eye Activity Channel
30
2,423
26
1,844
Hand Activity Channel

737

274

181

895

767-200

196

136

488

Other useful statistics generated by the timeline analysis program include the
average amount of dwell time spent on a particular instrument and the
probability of transitioning between various instrument pairs. Samples of these
statistics are shown in Table 11.4. These two statistics are very useful in
developing the most effective flight deck layout. Where these statistics depart
significantly from those associated with current airplanes, the designer has
reason to conduct more detailed studies.
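The two scan statistics mentioned above, average dwell time per instrument and transition probability between instrument pairs, might be computed from a recorded fixation sequence along the following lines. This is a sketch under assumptions: the input format is hypothetical, and the transition probability is taken here to be the fraction of departures from one instrument that go to a given other instrument, which the source does not define explicitly.

```python
from collections import Counter, defaultdict


def scan_statistics(fixations):
    """Compute average dwell times and transition probabilities.

    fixations: list of (instrument, dwell_seconds) in scan order (hypothetical format).
    Returns (avg_dwell, transition_prob) where transition_prob[(a, b)] is the
    assumed conditional probability that a look away from a goes to b.
    """
    dwell_total = defaultdict(float)
    dwell_count = Counter()
    for instrument, dwell in fixations:
        dwell_total[instrument] += dwell
        dwell_count[instrument] += 1
    avg_dwell = {i: dwell_total[i] / dwell_count[i] for i in dwell_total}

    # Count each move between two different instruments.
    links = Counter(
        (a, b)
        for (a, _), (b, _) in zip(fixations, fixations[1:])
        if a != b
    )
    departures = Counter()
    for (a, _b), n in links.items():
        departures[a] += n
    transition_prob = {(a, b): n / departures[a] for (a, b), n in links.items()}
    return avg_dwell, transition_prob


# Hypothetical scan record; the dwell values are illustrative only.
scan = [("ADI", 1.0), ("HSI", 0.8), ("ADI", 1.2), ("Airspeed", 0.6), ("ADI", 1.1)]
avg, prob = scan_statistics(scan)
print(avg)
print(prob)
```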
The next two tables show the activity demands on the captain and first officer
during each five-minute block of the one-hour, Chicago to St. Louis, flight. The
total flight time is divided into 300-second (5-minute) blocks beginning at
brake release and the time demands during each interval are shown as a
percentage. The purpose of this form of data presentation is to examine the
distribution of workload throughout the flight. Several characteristics are of
interest in these tables. While none of the following trigger levels should be
considered a limit, exceeding any of these levels is sufficient reason to
conduct a detailed analysis of the activities within the interval. The results of
the analysis will indicate whether the activities in the interval warrant
adjustment.
Table 11.4

Flight Instrument Visual Scan
Dwell Time and Transition Probability Summary (Boeing, 1979)
Average Dwell Time (Seconds)
Instrument
ADI
HSI

"Z-AirR

Takeo/Climb
1.17
0.81
0.64
0.47

Dcen,nd
1.11
1.05
0.68
0.50

Average Transition Probability


uns-unent Links
Airstpeed t ADI
ee to AD!

Takeoff/Climb

nescenvian

0.90
0.87

0.86
0.79

0.25

0.23

HSI to ADI

0.78

AD! to
maw
ADI to HSI

0.36
0.31

ADI to Airsee

0.80
0.28
0.36

Representative time-demand workload trigger levels are:
  o interval workload greater than 25%,
  o workload increase greater than 10% of total for consecutive intervals,
  o workload greater than the reference airplane for two consecutive
    intervals,
  o interval workload greater than 5% of total above the reference
    airplane.
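The trigger levels listed above could be screened automatically along the following lines. This is a hedged sketch: the function name and sample data are hypothetical, and "10% of total" and "5% of total" are interpreted here as percentage points of the interval time, which is an assumption about the source's meaning. A flagged interval calls for detailed analysis; it is not treated as a limit.

```python
def flag_intervals(test, reference):
    """Return {interval_index: [reasons]} for intervals that reach a trigger level.

    test, reference: percent of time busy per interval for the test airplane
    and the reference airplane (hypothetical inputs of equal length).
    """
    flags = {}
    for i, (t, r) in enumerate(zip(test, reference)):
        reasons = []
        if t > 25:
            reasons.append("interval workload above 25%")
        if i > 0 and t - test[i - 1] > 10:
            reasons.append("increase of more than 10 points over previous interval")
        if i > 0 and t > r and test[i - 1] > reference[i - 1]:
            reasons.append("above reference airplane for two consecutive intervals")
        if t - r > 5:
            reasons.append("more than 5 points above reference airplane")
        if reasons:
            flags[i] = reasons
    return flags


# Hypothetical interval data (percent of time busy), not taken from the tables.
test_airplane = [23, 19, 12, 30, 28]
reference_airplane = [28, 20, 10, 22, 26]
for index, reasons in flag_intervals(test_airplane, reference_airplane).items():
    print(index, reasons)
```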
Table 11.5 shows the visual activity time demands, while table 11.6 shows the
corresponding data for motor activity time demands on the same flight. The
flight scenario for this mission begins with a takeoff from Chicago (O'Hare) and
a planned instrument departure. Once airborne, ATC provides radar vectors until
the airplane is above FL240 when responsibility for "normal navigation" is
returned to the pilot. Now the pilot returns to the cleared flight plan
proceeding toward St. Louis. The cruise segment of the flight lasts for about
five minutes after which the crew begins a standard arrival into the St. Louis
area. The cleared arrival routing is different from the original flight plan.
Table 11.5
Line Operation Visual Activity Time Demand (Boeing, 1982)
Average Percent of Time Available Devoted to Visual Tasks*
Flight phases, in order: Takeoff, Climb, Cruise, Descent, Land
Columns, left to right: Captain 767, Captain 737, First Officer 767, First Officer 737

Time Interval
in Seconds        Entries legible in the source, left to right
1 - 300           23   28   19   30
301 - 600         19   20
601 - 900         (none legible)
901 - 1200        12
1201 - 1500       10   16
1501 - 1800       12   24   10   13
1801 - 2100       15   12   12   19
2101 - 2400       13   17   13
2401 - 2700       18   26   10   13
2701 - 3000       15   12
3001 - 3300       15   17   13   15
3301 - 3600       10   14   17

*Excludes flight path control and outside watch.

During the descent, there is a runway change at St. Louis (Lambert). A
thunderstorm on the descent flight path requires a detour. Finally, the visibility
at St. Louis is low enough to require a precision instrument approach. Both
airplanes are flown using the equipment provided on their respective flight
decks. The performance of each airplane dictates the exact timing for the
various events that occur. The data for the 767 generally indicates lower time
demands than for the 737. The motor demands, in particular, are lower except
early in the descent phase where the 767 pilots are receiving and programming
the new arrival routing on the FMC-CDU. The 737 pilots have to respond to the
revised routing as well, but without an FMC, they must wait to set their
equipment until the airplane reaches the various maneuvering points in the
procedure. This points out one of the advantages of having a flight
management system: the ability to move selected tasks away from the later,
lower altitude, portions of the flight path. As has been pointed out by many
people, the introduction of new flight deck systems does not necessarily result
in lower total workload. Often the objective of a new system is to shift
workload from one phase of flight to another. This is particularly true where
routine involvement of the pilot is necessary to maintain the proper level of
situational awareness. The flight management system is just such a system. By
storing and displaying the flight plan before it is needed, the pilot is given the
option of performing some tasks at a time of his choosing rather than having
the task timing be established by the position of the airplane. While unexpected
external events may occasionally reduce the value of this option, the data in
tables 11.5 and 11.6 show the option can have an overall positive effect on
terminal area workload.

Table 11.6
Average Time Devoted to Motor Tasks* (Boeing, 1982)
(Percent of Available Time)
Flight phases, in order: Takeoff, Climb, Cruise, Descent, Land
Columns, left to right: Captain 767, Captain 737, First Officer 767, First Officer 737

Time Interval
in Seconds        Entries legible in the source, left to right
1 - 300           17   20   15   27
301 - 600         11   10   21
601 - 900         (none legible)
901 - 1200        (none legible)
1201 - 1500       11
1501 - 1800       11   14   11
1801 - 2100       10   13
2101 - 2400       20   12
2401 - 2700       11   11   12
2701 - 3000       12
3001 - 3300       11   11   15
3301 - 3600       10   less than 1

*Excludes flight path control and outside watch.

The same interval-based time demand workload data can be shown graphically
making the comparison somewhat easier. Visual channel timeline analysis data
for the 747-400 is shown graphically in figure 11.4. Here again, the reference
airplane was the 737 and the mission was a flight from Chicago to St. Louis.

Figure 11.4 Visual Activity Time Demand, 747-400 and 737-200 (Boeing). Two
panels, one for the captain and one for the first officer, plot percent of time
busy on the visual channel against mission time in minutes for the Chicago to
St. Louis flight.

The final type of timeline analysis is a comparative plot of the total workload
time for each of the four channels: eyes, hands, verbal, and auditory. An
example is shown in figure 11.5 for both the 767-200 and the 737. The white
bar represents the 767-200. The black bar represents the 737. This figure
involves data for the same flight scenario used to generate the data in tables
11.5 and 11.6.
Figure 11.5 Mission Activity Channel Time Demand Summary (From Boeing, 1981).
Horizontal bars show total time workload in seconds (scale 0 to 5000, with
mission duration marked) for the captain and first officer on each of the four
channels: eyes, hands, verbal, and auditory. Analysis based on takeoff brake
release at Chicago (O'Hare) to touchdown at St. Louis (Lambert).

Task-Time Probability
Another technique for examining task demands on the crew is called task-time
probability. This method estimates the probability that the pilot will be busy
with a task at each point along the flight. Since the method is probabilistic, it is
possible to account for a range of pilot performance. Each task is associated
with separate initiation and execution times, as was true for timeline analysis.
However in this case, instead of being assigned discrete values, these two times
are assigned probability densities. Tasks are allowed to overlap. Task-time
probability is computed for each one-second interval along the flight path.
The probability density functions are centered on the modal initiation and
execution times. As a first estimate, a nonsymmetrical, triangular distribution is
assumed unless more specific test data are available to support a different
distribution. The nonsymmetrical distribution for task initiation or completion
times recognizes that many flight tasks have either constrained starting or
constrained ending times. The nonsymmetrical execution time distribution
accounts for small variations in individual performance for highly skilled
behavior and larger variation in performance where behavior is less skilled or
involves more conscious effort. Examination of keyboarding test results using a
number of different military pilots indicates that, at least for some tasks, the
two distributions may not be entirely independent. Results show that the pilot
who is slow executing a task is also likely to be slow initiating the task.

Table 11.7
Probability of Being Busy with a Visual Task* (Boeing, 1982)
(Root-Mean-Square Probability)
Flight phases, in order: Takeoff, Climb, Cruise, Descent, Land

Time Interval       Captain         First Officer
in Seconds          767     737     767     737
1 - 300             .41     .49     .38     .52
301 - 600           .26     .40     .25     .43
601 - 900           .15     .19     .12     .22
901 - 1200          .16     .26     .12     .30
1201 - 1500         .29     .37     .19     .27
1501 - 1800         .31     .47     .30     .34
1801 - 2100         .35     .32     .32     .42
2101 - 2400         .27     .33     .37     .31
2401 - 2700         .38     .47     .26     .31
2701 - 3000         .25     .35     .16     .31
3001 - 3300         .35     .38     .32     .35
3301 - 3600         .20     .30     .36     .40

*Excludes flight path control and outside watch.

The value of this method is not how accurately the density functions
characterize the pilot population but rather the insight that can be gained into
interactive system performance at a point well before test hardware is available.
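One way to illustrate the task-time probability idea is a small Monte Carlo simulation in Python. This is a sketch, not the Boeing program: the `Task` fields, the trial count, and the sampling approach are all assumptions. Initiation time and execution time are each drawn from a nonsymmetrical triangular distribution, tasks are allowed to overlap, and the result is the estimated probability of being busy during each one-second interval.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task description: (low, mode, high) for start and duration, in seconds."""
    start_lo: float
    start_mode: float
    start_hi: float
    dur_lo: float
    dur_mode: float
    dur_hi: float


def busy_probability(tasks, horizon_s, trials=2000, seed=0):
    """Estimate, for each second, the probability that at least one task is active."""
    rng = random.Random(seed)
    counts = [0] * horizon_s
    for _ in range(trials):
        busy = [False] * horizon_s
        for t in tasks:
            # random.triangular takes (low, high, mode) in that order.
            start = rng.triangular(t.start_lo, t.start_hi, t.start_mode)
            end = start + rng.triangular(t.dur_lo, t.dur_hi, t.dur_mode)
            first = max(0, math.floor(start))
            last = min(horizon_s, math.ceil(end))
            for s in range(first, last):
                busy[s] = True  # overlapping tasks still count as busy once
        for s, is_busy in enumerate(busy):
            counts[s] += is_busy
    return [c / trials for c in counts]


# Hypothetical task with a constrained earliest start and a long right tail.
probs = busy_probability([Task(10, 12, 20, 4, 5, 9)], horizon_s=40)
print([round(p, 2) for p in probs[8:30]])
```

The independence of the two draws is itself an assumption; the keyboarding results cited above suggest that initiation and execution times may be correlated for some tasks.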
The task-time probability statistics can be combined into the same five-minute
blocks that were used for the timeline analysis. The various activity channels
remain separate for the same reasons as were discussed in the timeline analysis
section. For each channel, the second-by-second probabilities are combined into
a single number representing the root-mean-square probability statistic for each
five-minute interval. Table 11.7 shows the root-mean-square probability of being
busy with a visual task during each of the five-minute blocks of the same
Chicago to St. Louis flight that was characterized in Table 11.5. Similarly, Table
11.8 depicts the root-mean-square probability of being busy with a motor task.
Table 11.8
Probability of Being Busy with a Motor Task* (Boeing, 1982)
(Root-Mean-Square Probability)
Flight phases, in order: Takeoff, Climb, Cruise, Descent, Land

Time Interval       Captain         First Officer
in Seconds          767     737     767     737
1 - 300             .39     .42     .35     .50
301 - 600           .21     .30     .29     .4
601 - 900           .15     .18     .22     .27
901 - 1200          .15     .16     .14     .22
1201 - 1500         .26     .26     .23     .31
1501 - 1800         .28     .31     .37     .32
1801 - 2100         .28     .21     .23     .34
2101 - 2400         .19     .22     .41     .30
2401 - 2700         .27     .29     .30     .28
2701 - 3000         .16     .17     .19     .31
3001 - 3300         .18     .29     .31     .35
3301 - 3600         .26     .31     .00     .16

*Excludes flight path control and outside watch.
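The combination of second-by-second probabilities into one root-mean-square value per five-minute block can be sketched as follows. The function name and the sample probabilities are hypothetical; only the root-mean-square step and the 300-second block length come from the text.

```python
import math


def rms_by_block(probabilities, block_s=300):
    """Return the root-mean-square of per-second probabilities for each block."""
    result = []
    for i in range(0, len(probabilities), block_s):
        block = probabilities[i:i + block_s]
        result.append(math.sqrt(sum(p * p for p in block) / len(block)))
    return result


# Hypothetical per-second busy probabilities for one channel over ten minutes.
per_second = [1.0] * 150 + [0.0] * 150 + [0.5] * 300
print([round(v, 2) for v in rms_by_block(per_second)])
```

Because the values are squared before averaging, a block with a short period of near-certain activity scores higher than a block with the same average probability spread evenly, which keeps brief peaks visible in the summary.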

Workload assessment using timeline analysis and task-time probability analysis


is usually accomplished before there is a mission simulator in operation. As
soon as a simulator is operating, the key task is to examine those segments of
the flight where the analysis suggests that workload will be the highest.
Simulator results can then be used to update the analysis. Spot checks in the
low and medium workload segments provide increased confidence in the
analysis and provide the opportunity to uncover any performance characteristics
that were unanticipated.

Unless unresolved questions remain after the simulator testing, it should not be
necessary to conduct instrumented flight tests simply to verify the analysis.
Flight testing for quantitative time demand workload is extremely difficult to
accomplish and is easily confounded by external circumstances beyond the
control of the test conductor.

Quantitative workload testing in the actual airplane is much more difficult than
in the simulator. The single biggest contributor to the difficulty is the
unpredictability of the actual flight environment. At the same time, the actual
flight environment improves the pilot's conscious sensitivity to variations in his
experience of workload. That sensitivity can be focused and standardized using
a well designed, structured workload questionnaire. Sample pages from a
Boeing questionnaire for assessing pilot workload on the 757/767 airplanes are
shown in Figures 11.6 to 11.10. By completing the questionnaire, the evaluation
pilots indicate their experience of workload while operating either airplane. The
specific workload functions and factors are related to those identified in FAR
25, Appendix D.
The questionnaire is structured to ensure that the pilot specifically thinks about
the departure and the arrival phases of the flight, each type of activity that
occurred, and each dimension of workload. Becoming consciously aware of the
various aspects of workload requires training. Figure 11.6 provides descriptions
of workload function and factor combinations that each pilot is asked to
evaluate. A copy of this matrix is reviewed before each flight and is available
with the questionnaire at the end of each flight leg. At the end of the
questionnaire there is space for comments the pilot may have concerning any
aspect of the questionnaire or the flight. After completion of each flight
sequence an analyst reviews the completed questionnaire with the pilot and
solicits more detailed information about any unusual events or any particularly
high or low workload experiences.
The bottom section in Figure 11.7 shows the part of the questionnaire where
the pilot specifies the reference airplane used in his or her evaluation of the
test airplane. Currency in the reference airplane is established by indicating
whether the pilot has flown the reference airplane within the preceding 90
days. The identification of a reference airplane by the evaluation pilot serves to
anchor the pilot's ratings and comments. It also helps to temper any biases a
pilot may have for or against an individual design or design feature.

Figure 11.6 Description of Workload Evaluation Function and Factor
Combinations (Boeing). A matrix for departure (arrival) that crosses each
workload function with the workload factors Mental Effort, Physical Difficulty,
Time Required, Understanding of Horizontal Position, Time Available, and
Usefulness of Information, with a short comparison instruction in each cell.

Figure 11.7 Evaluator Background Data Sheet, Pilot Subjective Evaluation
Questionnaire (Boeing). The sheet records airplane model (757 or 767),
airplane number, date of flight, flight number, test number, pilot's name,
flight crew assignment (Captain or First Officer), organization (Boeing or FAA,
Flight Test or Other), the reference airplane used for comparison, and
whether the pilot has flown the reference airplane (or an approved simulator
of that airplane) in the last 90 days.

Figure 11.8 Departure Information Data Sheet, Pilot Subjective Evaluation
Questionnaire. Section 1.0 records general departure information (airport,
take-off time), flight conditions for departure (airport weather,
precipitation, meteorological conditions, turbulence, other significant
weather), ATC data associated with departure (procedures used, amended route
entered into the FMC/CDU after takeoff, level of interaction with ATC, number
of altitude clearance changes), and flight modes used during departure (EHSI,
autopilot, flight director, and A/T use).

Figure 11.9 Departure Workload Rating Sheet, Pilot Subjective Evaluation
Questionnaire. Section 2.0 asks the pilot to compare 757/767 flight crew
operations with the reference airplane, using rating boxes for Mental Effort,
Physical Difficulty, Time Required, Understanding of Horizontal Position, Time
Available, and Usefulness of Information.

Figure 11.10 Nonnormal Operations Workload Rating Sheet, Pilot Subjective
Evaluation Questionnaire. Section 5.0 provides one block for each nonnormal
procedure completed. Each block records the name of the procedure, whether it
was planned or unplanned, two ratings of the alerting indications (Attention
Getting Quality and Mental Effort to Understand Problem, for unplanned events
only), and three ratings of the procedure (Complexity, Physical Difficulty, and
Ease of Maintaining Other Piloting Functions, for planned and unplanned
events).

Figure 11.11 is an enlargement of the rating boxes used in the Boeing
questionnaire. This particular rating is for the Physical Difficulty of a function.
The response boxes are arranged with increasing "goodness" to the right. The
leftmost box indicates the greatest workload in comparison with the other
airplane, while the rightmost box indicates the least workload. For example, if
the pilot must exert much more physical effort to perform the task in question
when flying the 757 than when flying the reference airplane, the pilot simply
checks the box farthest to the left. The condition where workload is essentially
the same for both the 757 and the reference airplane is indicated by the
diamond in the center of the workload scale. For some of the workload or
evaluation factors in the questionnaire, the labels "More" and "Less" are
reversed from the sense in this figure. Consistency is retained in that goodness
continues to increase to the right in all cases.

Figure 11.11 Rating Boxes Used in the Boeing Pilot Subjective Evaluation
Questionnaire (Boeing, 1982)

The evaluation forms developed by Boeing for the 757/767 focus on departure
and arrival activities and nonnormal procedures where workload is highest and
most variable. The questionnaire was originally drafted as eighteen pages of
text-based questions. In this form, it was explicit enough to be used
without training; however, after using the form several times, many evaluators
objected to having to read so much material. Along with the objections, the rate
of inconsistent answers on the questionnaire increased. With the help of a
consultant skilled in questionnaire development, the text format was changed to
a graphical one reducing the page count by two-thirds.
The basic portion of the questionnaire asks for ratings for mental effort,
physical difficulty, and time for each of the significant workload functions (see
Figure 11.6). Where both equipment and procedures on the new airplane are
conceptually identical to those on current airplanes, the rating request is
deleted. The time required rating presents a particular problem for the evaluator
and the analyst. Studies by Sandra Hart at NASA-Ames have shown that people
are poor judges of time when they are involved in highly skilled tasks. To make
matters worse, the pilot is not likely to recognize when his time estimates are
good and when they are not. In deciding when to ask for time estimates, we
gave significant weight to those tasks that have a high conscious activity
content. These choices were then subjected to review during the simulator
validation of the questionnaire.
While the core content of the form applies equally well to any commercial
airplane, the unique features of any new model might warrant special
consideration. For example, the 767 included a CRT map display and a full-time
flight management system. There was some concern that these devices would
add workload. To understand better the total impact of these devices, two
questions were added to the questionnaire dealing with the information
supplied by these systems. These questions were integrated into the
questionnaire and appear in the far right column of Figure 11.6. They are titled,
"Understanding of Horizontal Position" and "Usefulness of Information."
The nonnormal operations portion of the questionnaire (Figure 11.10) provides
additional workload information about equipment failures or abnormal flight
conditions. These events always involve two aspects: recognition of the event
or condition and accomplishment of any special handling required to restore
normal operations. The questionnaire asks for two ratings regarding the alerting
indications and three ratings about the nonnormal procedure itself. In this case,
the mental effort rating is titled "Complexity" and the time required rating is
titled "Ease of Maintaining Other Piloting Functions." These enhancements
resulted from discussions with pilots who found these titles easier to relate to
the specific events of a nonnormal procedure.
During flight test operations, there is the possibility that actual equipment
failures or nonnormal flight environments will occur. Even though these
unplanned events are not specified in the test plan, they are included in the
nonnormal portion of the questionnaire process. Where possible, simulated
inflight faults are introduced in a way that will produce the appropriate alerting
and recognition indications to the pilot. These events are also treated as
unplanned on the questionnaire, since they appear to be unplanned from the
viewpoint of the evaluation pilot. Safety concerns limit the failure event realism
that can be simulated inflight. Where such concerns come into play, the alerting
indications will be missing or incorrect. However, the procedure portion of the
questionnaire is still valid and useful.
Normally, the questionnaire is completed by the evaluation pilots for both the
departure and arrival phases of the current flight leg immediately after landing
and before any discussion takes place. On occasion, the departure sheet can be
completed once the aircraft reaches cruise altitude; however, the requirements
of the test program generally place heavy demands on the pilots while airborne.
The post flight debriefing involving the evaluation pilots and a human
performance analyst is an important element in the total process. Through this
debriefing, additional material is collected giving a complete understanding of
the events that each pilot felt were significant contributors to the ratings.
Initial validation of the questionnaire was done in the simulator using a variety
of test and training pilots. This was followed by trial use during developmental
flight testing of the 767. The questionnaire was used during the minimum crew
size proving flights for the 767 and later for the 757. With appropriate
adjustments it was also used during the minimum crew size proving flights for
the 747-400.
The pilot subjective evaluation process provides nonscalar ratings for specific
workload functions; as such, the ratings are not amenable to summary
combination. Various people on all sides of aircraft certification would like to
have a single number or rating to characterize the airplane. The present state of
human performance knowledge does not provide a simple and meaningful basis
for combining the PSE ratings. Future research may provide new insights that
will make such a combination meaningful. For the time being it will continue
to be necessary to repeat the explanation of why arithmetical combinations of
the ratings are not meaningful.
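Because the ratings are not amenable to arithmetic combination, one hedged way to summarize them is to tally how many pilots checked each box for each function and factor, without computing any average. The sketch below assumes a hypothetical five-box scale and hypothetical box labels; the actual number of boxes on the Boeing form is not stated in the text.

```python
from collections import Counter

# Hypothetical five-box scale, ordered left to right (goodness increases to the right).
BOXES = ("much worse", "worse", "same", "better", "much better")


def tally_ratings(responses):
    """Count checked boxes per (workload function, workload factor).

    responses: iterable of (function, factor, box) tuples (hypothetical format).
    Returns a dict mapping (function, factor) to a Counter of box labels.
    """
    table = {}
    for function, factor, box in responses:
        if box not in BOXES:
            raise ValueError(f"unknown box: {box!r}")
        table.setdefault((function, factor), Counter())[box] += 1
    return table


def format_row(counts):
    """Render one distribution in scale order, keeping empty boxes visible."""
    return "  ".join(f"{box}: {counts.get(box, 0)}" for box in BOXES)


# Hypothetical responses from three evaluation pilots.
responses = [
    ("FMS operation", "Mental Effort", "better"),
    ("FMS operation", "Mental Effort", "better"),
    ("FMS operation", "Mental Effort", "same"),
]
for key, counts in tally_ratings(responses).items():
    print(key, format_row(counts))
```

Reporting the full distribution keeps each rating tied to its specific function and factor, consistent with the point that no single number characterizes the airplane.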
One final issue surrounds the use of subjective ratings as a part of aircraft
certification. Who should do the evaluations? Clearly pilots from the
responsible regulatory agency must be involved. The manufacturer has a central
role since it is the manufacturer who is offering the aircraft for certification,
and it is the manufacturer who bears total responsibility for the aircraft until it
is delivered to the final customer. Test pilots from the manufacturer and the
regulatory agency are the best trained evaluators. They know the airplane well
through exposure during the development program and have seen it perform
through many tests, some of which exceeded the flight envelope boundaries of
line operation. The regulatory agency pilots who are responsible for training
and overseeing line operations have an insight into the full variety of airline
operations that exceed the experience of most line pilots who fly with a single
airline. These two groups should constitute the bulk of the evaluation pilot
pool.
The use of line pilots in the certification program has been suggested on
various occasions. We believe that line pilots are better used early in the
development program and for simulator tests of new functions and features
where airplane performance can be measured along with the pilot's opinion and
differences can be resolved through discussion and further testing. In any case,
if line pilots were to be directly involved in the evaluation, it is likely that
significant changes would have to be made to the overall test program to
compensate for the lack of evaluator training and to assure sufficient
standardization of this subjective process that the results obtained can be
interpreted. Such steps would be necessary to protect the regulatory agency, the
manufacturer, and the line pilot himself from errors of commission and
omission in the evaluations. In recent certification programs, the FAA has asked
a few retired industry pilots to consult during the crew size flight testing. In
this way, the ultimate authority for the certification decision has remained with
the FAA while an additional source of information and review has been made
available. This program appears to have worked satisfactorily for all parties.

Certification Considerations
Early Requirements Determination

One of the driving issues in airplane manufacturing today is reshaping the
structure of the design-build cycle in ways that will improve the efficiency of
the process so that the right airplane is designed and the airplane is built right
the first time. The factors that make this effort mandatory are deeply rooted in
the commercial aviation marketplace. Cost is a major factor but, so too, is
time-to-market. These changes cannot be accomplished at the expense of safety.
At the same time, safety cannot be used as an excuse for not finding ways to
satisfy market demands. Many people believe that the needed process changes
mandate both earlier and more complete determination of requirements. In this
context, the word "early" means that requirements are understood and
documented before the airplane is built. This places a significant burden on
FAA certification personnel. The certification system itself is designed to place
primary emphasis on near term certification programs. Furthermore, there is a
strong tradition of withholding judgment until the completed product is
available. Finding ways to uncover the majority of concerns while the design is
still on paper, and yet maintain the objectivity necessary for the final approval,
will be challenging indeed.
Mandatory Items for Standardized Cockpit

Mandatory displays, particularly those defined explicitly in terms of their format,
are another problem. Most mandatory displays are the result of previous
accidents or highly focussed public concerns. Required displays reflect aircraft
operations and the pilot interface understanding that exists at the time they are
first developed. Over time, both aircraft operations and the pilot interface
understanding evolve and, as they do, the displays, indicators, and procedures
that characterize the flight deck change as well. Eventually the gap between the
current displays and the mandatory displays becomes great enough that there is
concern about whether effective pilot performance will be retained.
A good example of this difficulty is the handling of indications alerting the
flight crew to equipment failures or abnormal operational conditions. By the
mid-1960s, the number of independent indicators had grown to the point where
people within the FAA, the airlines, and the manufacturers were concerned. An
FAA-sponsored study, done jointly by Boeing and Douglas, developed and
validated the concept of a centralized caution and warning system. The concept
has been widely embraced and is implemented in the 767 and subsequent
airplanes. Certifying the system on the 767 required an equivalent safety ruling
from the FAA. Even today, if a manufacturer abides strictly by the rules, the
resulting flight deck will contain an array of dedicated red and yellow lights
and a multitude of alerting sounds. No one questions the intent of those who
established the initial mandatory display requirements. The concern is that
conditions have changed. Our collective understanding of human performance
has improved and the technology available to satisfy operational needs has
changed. It is time to recast some of the very specific design rules in terms of
the performance they are meant to achieve.
Airline Differences
Another certification consideration that poses a problem for both the FAA and
manufacturers is airline difference. Airlines are different. They have different
fleet mixes. They operate in different regions. They have different crewing
policies. They have different strategies for achieving operating standardization.
These differences exist among domestic airlines and even more among foreign
carriers. It is important that these differences be understood and accommodated
in the certification process. The apparent efficiency that some believe would
follow from enforced flight deck standardization may be an illusion. There
certainly are standardized features that benefit the entire industry; e.g.,
direction of movement of primary controls, general layout of the primary
instrument panel, and minimum instruments for IFR flight. However, each
feature should be judged on its own merits before concluding that
standardization is the appropriate path. Even when standardization is chosen,
the choice should be re-evaluated at regular intervals to determine if it is still
the appropriate action.
Equipment standardization and operations standardization are not synonymous.
If fundamental airplane or equipment performance differences force operations
to be different, standardizing equipment will not achieve operation
standardization. It may, in fact, interfere with safe and efficient operations by
creating the illusion of consistency where it does not, and should not, exist. The
standardization debate would be better served by addressing the fundamental
principles that underlie effective human performance. This approach has
significantly greater potential to combat the consequences of human error,
though it is much more difficult to accomplish.

Coping with Pilot Error
Error Types
Since accidents are the most serious consequence of human error, significant
time and effort are spent evaluating accident and near accident situations. A
consistent finding is that several errors occurred before the accident was
unavoidable. Studying crew-related accidents helps identify possible error
sequences and patterns and may lead to an understanding of the factors that
kept the crew from recognizing the seriousness of the situation until it was too
late. The ultimate goal is preventing errors that cause accidents. Helping the
pilot break the error chain before an accident is inevitable is one of the ways of
achieving the goal. Errors that result from clearly understood events or
circumstances can be handled more directly by the designer and the pilot than
those resulting from unknown conditions. Because of the difference in
management and coping strategies, we find it convenient to classify errors as
either systematic or random, respectively.
Through careful design, systematic errors can be reduced to a very small
number and the pilot can be trained to recognize and deal with those
systematic errors that cannot be eliminated. Minimizing systematic errors
involves careful attention to human factors data and rigorous attention to the
design development process. The unspecific nature of random errors makes their
elimination more problematical. Human performance research will, over time,
uncover the knowledge that converts random errors into systematic errors that
then can be eliminated. Meanwhile, design strategies, such as system
simplification and the minimization of time critical procedures, can reduce the
opportunities for random error. In the end, however, ensuring that the pilot can
detect that an error has occurred and can do something about it, is the best
means of preventing the error from compounding into a more serious situation.
This is the essence of error tolerance--detection and effective action.

Error Tolerant Design
If the pilot is to cope with the error, the pilot must first detect it or have it
pointed out. Direct feedback of pilot actions is an obvious way of helping the
pilot to detect an error. In some circumstances, direct feedback is not practical
or simply cannot be done. Under these circumstances, enhancing situational
awareness for the pilot can provide a framework within which certain errors
can be detected. Providing redundant, dissimilar cues is another useful error
detection technique, particularly where the consequences of an error would be
costly. This technique is particularly valuable where the human tendency to
perceive what is expected, even in the presence of contrary cues, is the root
cause of the error. Of course, detection of certain errors can be done by systems
on the airplane and their existence announced to the pilot. This widely used
technique can be highly effective where response time requirements are
compatible with human capabilities.
Once an error has been detected, the pilot must be able to react in a way that
reduces the likelihood of the error sequence continuing. Often the reaction will
be to accomplish some physical action. Under other circumstances the
appropriate reaction may be a change in planning or strategy for the remainder
of the flight. Recognizing the full range of possible responses is the key step in
ensuring that the pilot is provided with the appropriate controls, information,
knowledge, and skills to react effectively.
One of the most difficult aspects of pilot error is recognizing what errors are
most likely. It is nearly impossible for one human being to imagine how
another human being could understand and interpret the same circumstances
differently, yet evidence abounds that such is the case. Add to this that pilots
vary considerably in their decision-making styles and it is evident that
understanding error is a team effort. Collective wisdom is consistently one of
the more effective means of seeking out possible error patterns and their causes.
For collective wisdom to work, it must be nonjudgmental with an emphasis on
understanding as many ways of interpreting the display or control device as
possible. There are no wrong answers, except to believe that one interpretation
is correct and the others are wrong. The goal must be to help all pilots catch
their mistakes.
Pilot error can be triggered by unrecognized and subtle mismatches between the
information that is presented and the tasks that information is meant to
support. It is easy for the pilot to assume that if the information presented is
the same, then the associated tasks must be the same as well. Conditions where
identical indications are used to support different tasks are an invitation to
error. To make matters worse, error detection by the pilot under these
circumstances is particularly difficult. Making the design error-tolerant means
that the possibility of this error is acted upon during design. If the assumption
that the tasks are the same is false, the simplest design solution is selection of
different display formats, indications, or controls.
As an example, the hydraulic systems of the 757 and 767 are slightly different
operationally, because of different load assignments to the individual hydraulic
systems. Slight differences in system management and post-failure planning
result from this difference. Because of the task difference, the hydraulic system
control panels on the two airplanes are intentionally different. Even though the
same number of control devices is required on both airplanes, the types of
switches and the physical layout of the panel are different.
Boredom, fatigue, and time-of-day are among the factors that influence pilot
attentiveness. Their effects will normally vary during a single flight. Given these
facts, it is obvious that the pilot cannot be at maximum attentiveness all the
time. The design of nonnormal procedures can be made error-tolerant by
ensuring that the pilot has extra time to recognize and respond to situations
that, from his perspective, are new or unexpected. Once alerted to the
possibility of a problem or unusual condition, virtually all pilots can achieve
significantly increased levels of attention within a short time. This heightened
attention can then be sustained, if the circumstances warrant, for much longer
than it took to reach the heightened attention initially.
Future Workload Issues
In the future, crew workload will be influenced strongly by the strategy used to
prioritize flight deck information. Pilots are expected to look at, and be aware
of, an ever increasing array of information. Human beings can be exceptionally
versatile at handling large quantities of information. However, the time pressure
of flight can lead to impromptu prioritizing strategies that may not be well
suited to the actual circumstances. While certifying an individual system, the
composite effect of that system on the total flight deck information load may
not be evident. Yet the overall flight deck information management issues can
only be addressed by managing the contribution of each system. This means
that everyone involved in development and certification of specific equipment or
systems must share responsibility for the impact of those systems on the overall
pilot-airplane interface effectiveness.
A related issue is the potential for information overload that could follow the
addition of a general purpose data link capability to the airplane. Conceptually,
such systems could allow the nearly limitless information sources stored in
ground-based computers to be available in the cockpit. The potential for good is
great but so is the potential for excessive information management workload.
The knowledge and the tools are available to ensure that realistic consideration
is made of the pilot's human capabilities and limitations. The question is, how
will we, as an industry, use this information to ensure that new data sources
are managed in a manner that improves the effectiveness of the pilot and
protects the aviation system from new human error risks?

In the future, it is conceivable that the basic reliability of some of the control
and display equipment will approach the lifetime of the airplane. This implies
that many crews will go through their entire careers without seeing certain first
failure conditions on the actual airplane. While this will reduce nonnormal
workload, it presents some interesting challenges for selecting appropriate fault
management strategies and training. Certainly the strategies of today, based on
memorized or highly practiced procedures, will be inefficient and may not be
effective. The assistance of computer-based expert systems may be desirable.
Alternatively, it may be better to create designs that ensure the pilot will have
time to develop a suitable response by applying his knowledge of the system or
event.
A final issue concerns the increasing performance demands placed on pilots and
systems by the increasing need for aviation system efficiency. Many of the
improvements in efficiency are likely to result from better matching of: the
information available to the pilot, the procedures established for the various
tasks, and the training the pilot receives. To avoid any unnecessary increase in
pilot workload, coordination of these improvements will require more
communication and understanding among all the organizations and agencies
involved. It will take foresight and initiative to weld the traditionally
independent domains of aviation equipment and operations into a team that
enables the American aviation system to remain the best in the world.

Human Factors Testing and Evaluation
by Kim M. Cardosi, Ph.D., Volpe Center

Introduction
Many different types of questions are best answered with the results of a
human factors test. Some of the most common human factors questions include:
- Which of two or more proposed designs (of displays, controls, training
programs, etc.) is best from a human factors standpoint?
- What performance benefits are achieved from a specified design change?
- Is a design of a new system or subsystem viable from a human factors
perspective?
- What changes, if any, need to be made to a prototype system to minimize
operator error?
- Is a proposed training program (e.g., for new equipment) adequate?
- How long will it take for an operator to perform a task, or part of a task,
with a new system?

Human factors specialists, working with operations specialists, can often
anticipate human factors problems by examining specifications documents,
proposed designs, and prototypes of new systems and subsystems. Still, human
factors tests are often required to identify problems that are not self-evident or
to be able to quantify the impact of new systems on line operations. Formal
evaluations are always needed to ensure that the new system or procedure is
ready for implementation.
This chapter will address the following questions.
- When is a human factors test warranted?
- How is operator performance measured and what factors can affect these
measures?
- What method of testing should be employed?
- How should test results be analyzed and interpreted?
Understanding the principles and philosophy behind human factors testing is
useful even to people who never conduct human factors tests because it helps
operations specialists critique tests conducted by manufacturers, universities or
industrial labs and determine the validity of their conclusions.
When Is a Human Factors Test Warranted?
It is not always easy to predict all of the ways in which an operator will use or
misuse a new system or a new component of an existing system. Nor is it
always evident what types of errors operators are likely to make. One
example of a faulty display design that should never have made it to
implementation is the case of a major air carrier that wanted to give the flight
attendants a cue as to when sterile cockpit was in effect. The airline installed a
small indicator light above the cockpit door that was to be illuminated when
sterile cockpit was in effect. Problems arose because the light that was chosen
was green. In most cultures, the color green is not associated with "stop" or "no
admittance." The lights had to be changed to red, at no small expense to the
airline. In that case, a human factors test was not needed to predict the
problems that were experienced by the airline; it is common knowledge how a
green light is likely to be interpreted by a crewmember. However, most
questions about training, displays, controls, and how the operators may use or
abuse them are much more complex and require controlled testing to be
answered effectively.
The findings of basic research, such as information about our sensory and
cognitive capabilities and limitations, can steer us away from what is known to
be troublesome and can help us to identify desirable design options. However,
each specific application of a technology, training program, or procedure should
be evaluated under the same or similar conditions as it will be used, by the
same type of operator that will be using it, and while the operators are
performing the same types of tasks that actual operations require.
When a human factors evaluation of a system or subsystem is warranted, it
should be designed by both a human factors specialist and an operations
specialist. Operations specialists are intimately familiar with the operational
environment (e.g., a specific cockpit or ATC facility). They represent the
potential users and are usually operators (e.g., pilots or controllers) themselves.
As long as they are operationally current (i.e., knowledgeable of current issues,
procedures, and practices), they are the most appropriate source for information
on user preferences and suggestions for symbology, terminology, display layout,
etc. However, even the most experienced users should not be solely responsible
for the user-machine interface. In fact, many years of experience can
occasionally be a liability in making such decisions, since the skills and
knowledge that develop with extensive experience can often compensate for
design flaws that may then remain unnoticed. For these and other reasons, it is
important for operations specialists to work with human factors specialists in
the planning and conduct of a human factors test. Human factors specialists are
intimately familiar with the capabilities and limitations of the human system,
testing methods, and appropriate data analysis techniques. They can point to
potential problems that operational specialists might otherwise overlook. While
working together, the two specialists can predict problems and head them off
before they occur in actual operations. Together, they are best equipped to
decide exactly what needs to be tested and how it should be tested.

How Is Human Performance Measured?
Measures of human performance can be subjective or objective. Subjective
measures use responses that are measured in terms of the person's own units.
Such measures can be influenced by the individual's expectations and
motivations. An example of a subjective measure of workload is a pilot's
opinion as to how difficult a task is. What constitutes a high workload situation
for one person may not be considered high workload for another person.
Subjective measures are used whenever objective measures either aren't
available, or aren't appropriate. They're also used to complement objective
measures.
Objective measures of human performance use units that are clearly defined,
such as seconds, percent errors, heart beats per minute, blood pressure, etc.
The most commonly used objective measures of performance are response
accuracy and response time. Response time measures the time required for a
person to perform a specific task, or component of a task. Response accuracy
measures the percentage of errors made while completing the task or the
precision with which a specific task is accomplished (such as flying a predetermined route, as measured by cross-track error).
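The objective measures named above can be computed directly from recorded trial data. The following is a minimal sketch, assuming a hypothetical record format (a response time in seconds plus a correct/incorrect flag) and hypothetical cross-track error samples; none of the names or values come from an actual test.

```python
import math
from statistics import mean

# Hypothetical trial records: (response_time_seconds, response_was_correct)
trials = [(1.8, True), (2.4, True), (3.1, False), (2.0, True)]

# Hypothetical cross-track error samples (nautical miles, signed) along a route
cross_track_errors = [0.1, -0.2, 0.2]

def mean_response_time(records):
    """Average time taken to respond, in seconds."""
    return mean(rt for rt, _ in records)

def percent_errors(records):
    """Response accuracy expressed as the percentage of incorrect responses."""
    errors = sum(1 for _, correct in records if not correct)
    return 100.0 * errors / len(records)

def rms_error(samples):
    """Precision of task performance as root-mean-square deviation."""
    return math.sqrt(mean(e * e for e in samples))

print(mean_response_time(trials))
print(percent_errors(trials))
print(rms_error(cross_track_errors))
```

Root-mean-square error is used here as one common way of summarizing tracking precision; the text itself does not prescribe a particular statistic.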
When measuring only response accuracy, it is possible to obtain insignificant
results due to either a ceiling effect or a floor effect. That is, the response being
measured may be so skilled (e.g., a baseline of 95 percent accuracy) that any
manipulated factor is not likely to have an observable effect. This is called a
ceiling effect. Conversely, initial performance may be so poor that any
manipulation will not have a measurable effect. This is called a floor effect. The
tests may not be sensitive enough to measure an effect beyond this very high or
very low baseline.
If baseline performance on the measured task is extremely accurate,
and it is not desirable to induce more errors by manipulating other factors (e.g.,
workload), then response time is generally a more sensitive measure than error
rates. Differences in the response times may be observable even when the
differences in response accuracy are not.
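As a rough planning aid, a baseline accuracy can be screened for this risk before a test is run. The sketch below is illustrative only: the 95 percent ceiling threshold echoes the example in the text, while the 5 percent floor threshold and the function name are assumptions made for this example.

```python
def sensitivity_risk(baseline_accuracy_pct, ceiling_pct=95.0, floor_pct=5.0):
    """Classify a baseline accuracy as at risk of a ceiling or floor effect.

    The thresholds are illustrative assumptions, not established cutoffs.
    """
    if baseline_accuracy_pct >= ceiling_pct:
        return "ceiling"  # little room left to observe an improvement
    if baseline_accuracy_pct <= floor_pct:
        return "floor"    # performance too poor to show a further change
    return "none"

for baseline in (97.0, 70.0, 3.0):
    print(baseline, sensitivity_risk(baseline))
```

When the result is "ceiling", the text's advice applies: response time is likely to be the more sensitive measure.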
Components of Response Time
While response time appears to be a simple measure of human performance, it
is actually quite complex. Response times have several components and each of
these components can be affected by many different factors. These factors must
be considered in any human factors test so that the controls necessary for
confident interpretation of the data can be employed.

A complex response, such as one to a cockpit warning system, may be broken
down into four components: detection time, time to identify and interpret the
message, decision time, and time to initiate (or complete) the appropriate
response. When a warning signal appears, for example, the first component of
the required response is to detect the presence of that signal (i.e., the warning
message), that is, to notice that it is there. The second component of the
response is the interpretation of the message. The operator needs to identify the
message. For example, is it TCAS, or GPWS? While this stage may sound
simplistic, the task becomes more difficult as the number of alarms and
warnings increases. After deciding which message it is, the next response
component required is to decide what physical action, if any (e.g., a climb or
turn) is required. Then, and only then, can a physical response be initiated.
Results of a series of flight simulation studies indicate that, with an executive
system (that is, one that requires immediate action), it will take approximately
two to three seconds to detect that the message is there, five to six seconds to
decide what to do about it, and one to two seconds to initiate a response
(Boucek, White, Smith, and Kraus, 1982; Boucek, Po-Chedley, Berson, Hanson,
Leffler, and White, 1981; Boucek, Erickson, Berson, Hanson, Leffler, Po-Chedley,
1980; see also Berson, Po-Chedley, Boucek, Hanson, Leffler, and Wasson, 1981).
This leads to a total of eight to eleven seconds that should be allotted for a
pilot to respond.
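The total quoted above follows from adding the lower and upper bounds of each component. A minimal sketch of that arithmetic, using the durations given in the text (the dictionary keys are labels chosen for this example):

```python
# Component durations in seconds, as (low, high) ranges taken from the text.
components = {
    "detect": (2, 3),
    "decide": (5, 6),
    "initiate": (1, 2),
}

def total_range(ranges):
    """Sum the lower bounds and the upper bounds of all components."""
    low = sum(lo for lo, _ in ranges.values())
    high = sum(hi for _, hi in ranges.values())
    return low, high

print(total_range(components))  # (8, 11)
```

Keeping the components separate, instead of storing only the total, makes it easier to see which stage a design change is expected to shorten.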
The most stable of these components, that is, the one that has the most
predictable duration, is the initiation of the physical response. Since the
decision as to what action is required has already been made, the initiation of
the response constitutes the smallest component of response time. The time
required to complete the response will, of course, depend upon the task.
Factors Affecting Human Performance
There are many factors that are known to affect human performance, and
hence, response time. Some of these factors are characteristics of the stimulus,
that is, of the visual or auditory display. Others are characteristics of the
operator, such as, previous experience, skill, fatigue, etc. Still others are
characteristics of the test or operational environment, such as workload,
consequences of errors, etc. Each of these factors needs to be considered from
the test design to the interpretation of the results and controlled as much as
possible during a test.

Stimulus Factors
Factors that influence detection of visual signals include location in the visual
field, and presentation format (e.g., blinking vs. steady text, brightness, etc.).
(See Chapters 1 and 2 of this text.) Response time will be faster if the signal is
presented in the center of the visual field, as opposed to out on the periphery.
If it is presented in the periphery, but flickering, detection time will be faster
than if it is in the periphery but steady. (This is one reason why a flickering
display can be distracting.) Intensity is also an important factor. Within limits,
a higher intensity stimulus will attract attention more efficiently than a less
intense stimulus. In the visual domain, intensity translates into brightness
(although other factors, such as contrast, are also critical). For auditory displays
(e.g., a tone or spoken warning message), intensity translates into loudness,
with frequency as a critical variable. The frequencies that are contained in the
ambient noise must be considered in deciding which frequencies should be
contained in the alert. The relative intensity of a message (tone or voice) must
always be measured in the environment in which it will be used. A warning
signal that sounds very loud on the bench may be inaudible in a 727 with the
windshield wipers on. In fact, the original Traffic Alert and Collision Avoidance
System (TCAS) voice alerts passed the bench test, but were found to be
unusable in the cockpit (Boucek, personal communication).

Another factor that can affect how quickly a signal can be recognized and
interpreted is how meaningful the signal is. Personally meaningful stimuli, such
as one's own name, and culturally meaningful stimuli, such as the color red or
a European siren (both of which are associated with danger) will attract
attention more efficiently than other stimuli of equal intensity. One exception to
this, however, is if one of these "meaningful" signals is presented repeatedly
without accompanying important information (as with false alarms). In this
case, it is not difficult to learn to ignore a signal that previously attracted
attention efficiently.

Ease of Interpretation
Another factor that affects response time is how intuitive the meaning of the
symbol is to the user. For example, one of the first TCAS prototypes used a red
arrow to convey to the pilot the urgency of the alert (red) and the direction in
which the pilot should fly. Even after training, some pilots felt that there could
be instances in which pilots would be unsure as to whether a red arrow
pointing up meant that they should climb or that the traffic was above them.
The arrow was changed to a red or amber arc on the IVSI (Instantaneous
Vertical Speed Indicator) with the instructions to the pilot to keep the IVSI
needle out of the lit (red or amber) band. This provided a more consistent
coding between the urgency of the alert and the required action.

Expectations and Context
Expectations and context have a strong influence on response time. Responses
to a stimulus that occurs very frequently, or one that we expect to occur, will
be faster than to one that occurs once every month. However, expectations may
also lead to inaccurate responses, when what is expected is not what occurs. In
many situations, particularly ambiguous ones, we see what we expect to see and
we hear what we expect to hear.
The following ASRS report (October, 1989) entitled "Something Blue" illustrates
the power of expectation:
"On a clear, hazy day with the sun at our backs we were being vectored for
an approach...at 6000' MSL. Approach advised us of converging IFR traffic at
10 o'clock, 5000, NE bound. After several checks in that position I finally
spotted him maybe 10 seconds before he passed beneath us... When I looked
up again I saw the small cross-section and very bright landing light of a jet
fighter at exactly 12:00 at very close range at our altitude... I overrode the
autopilot and pushed the nose over sharply. As I was pulling back the thrust
levers and cursing loudly, the "fighter" turned into a silver mylar balloon with
a blue ribbon hanging from it! I could see what it was when it zipped just
over our heads and the sunlight no longer reflected directly back in my eyes
(the landing light). I was convinced it was a military fighter, complete with
the usual trail of dark smoke coming out the back (the blue ribbon?)!
Then -- I remembered the traffic directly below us! I pulled the nose up just
as sharply as before. Fortunately, everyone was seated in the back, and there
were no injuries or damage... Our total altitude deviation was no more than
200'."
In this case, the expectation or "set" to spot traffic led to a false identification
of an object and, consequently, an inappropriate response to it.
Another good example of the powers of expectation is seen in the videotapes
that Boeing made of their original TCAS simulation studies. In this study, the
pilots had the traffic information display available to them and often tried to
predict what TCAS was going to do. In one case with a crew of two
experienced pilots, the pilot flying looked at the traffic alert (TA) display and
said, "I think we'll have to go above these two guys" (meaning other aircraft).
This set up the expectation for both crewmembers for a "climb" advisory. The
crew started to climb when they received their first TCAS message, "Don't
climb." The pilot flying told the pilot not flying to call Air Traffic Control (ATC)
and tell them what action they were taking. Without reservation, the pilot not
flying called ATC, said that in response to a TCAS alert, they were climbing to
avoid traffic. He also requested a block altitude. He then told the pilot flying
that they were cleared to climb. Meanwhile, as the climb was being executed,
"Descend" was repeated in the background over 25 times. Eventually, the pilot
not flying said, "I think it's telling us to go down." The next thing that is heard
on the tape is "[expletive], it changed. What a mess." Crash. (Boucek, personal
communication).
Anyone could have made a similar mistake. It is human nature to assess a
situation and form expectations. In support of the pilot's expectation, and
perhaps because of it, he didn't hear the first syllable, which was "don't" - he
heard the action word "climb". The idea was then cemented. It takes much more
information to change an original thought than it does to induce a different
original thought.
Practice
Another factor that affects response time is how practiced the response is. If the
response is a highly-practiced one, then response times will be quicker than if it
is a task that isn't performed very often.
User Confidence
Another important factor is trust in the system. This may, or may not, develop
with exposure to the system. Response time will increase with the time required
to evaluate the validity of the advisory. Confidence in the system and a
willingness to follow it automatically will result in shorter response times.
Number of Response Alternatives

Another factor that influences the decision component of response time is the
number of response alternatives. In Ground Proximity Warning System (GPWS)
for example, once you decide to respond, there is only one possible response: to
climb. In TCAS II there are two response alternatives: to climb, or to descend.
With TCAS III, there are at least four alternatives: climb, descend, turn right, or
turn left. Studies have shown that the response time increases with the number
of response alternatives (see Boff and Lincoln, 1988 p. 1862 for a review).
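One widely used description of this relationship, which the text above does not itself cite, is the Hick-Hyman law, in which decision time grows with the logarithm of the number of equally likely alternatives. The sketch below is illustrative only: the function name is an assumption, and the intercept and slope are hypothetical placeholders, not measured pilot data.

```python
import math

def hick_hyman_rt(n_alternatives, intercept_s=0.2, slope_s=0.15):
    """Predicted choice response time (seconds) for n equally likely alternatives.

    Uses one common form of the law, RT = a + b * log2(n + 1). The intercept
    and slope are hypothetical values chosen only to illustrate the trend.
    """
    if n_alternatives < 1:
        raise ValueError("need at least one response alternative")
    return intercept_s + slope_s * math.log2(n_alternatives + 1)

# One alternative (GPWS: climb), two (TCAS II), four (TCAS III as described above)
for label, n in [("GPWS", 1), ("TCAS II", 2), ("TCAS III", 4)]:
    print(f"{label}: {n} alternative(s) -> {hick_hyman_rt(n):.2f} s")
```

The absolute values mean nothing here; the point is only that each added alternative lengthens the predicted decision time, and by a smaller amount each time.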

Real World Data on Pilot Response Time


It is difficult, if not impossible, to fully simulate the operational environment in
even the most sophisticated simulation facilities. For this reason, data on pilot
response time that is obtained unobtrusively from observational studies of "real
world" events is extremely valuable (but rarely available). There are at least two
such studies of pilot behavior. One examines pilot response times to GPWS and
the other to time-critical ATC communications.
Ground Proximity Warning System (GPWS)
Several large overseas international airlines measured pilot response times to a
time-critical GPWS warning - mode 2 "Terrain-Terrain" (which indicates high
speed flight toward rising terrain). This information was collected during actual
flights and indicated that the pilot response times ranged from 1.2 to 13
seconds with an average of 5.4 seconds (Flight Safety Foundation, Accident
Prevention Bulletin, January 1986). No other statistical information on pilot
response time (such as how many data points were included in this sample or
the response time at the 90th percentile) was reported. It is also interesting to
note from this study that even though the Boeing recommendation was an
initial pull-up of 15 degrees, and the Douglas recommendation was an initial
pull-up of 20 degrees, the average pull-up observed was 8.5 degrees with a
rotation rate of 1.4 degrees per second. This may be inadequate in many terrain
encounters.
Air Traffic Control (ATC)
In an analysis of pilot response time to time-critical ATC transmissions in an en
route environment, Cardosi and Boole (1991) analyzed 46 hours of controller to
pilot communications from three Air Route Traffic Control Centers (ARTCCs). In
these 46 hours of voice tapes, 80 communications from controllers to pilots
were found to contain time-critical messages, such as maneuvers required for
traffic avoidance, or maneuvers followed by words expressing urgency
(e.g., "now" or "immediately"). The pilots' verbal response times, as measured
from the end of the controller's transmission to the beginning of the pilot's
acknowledgement, ranged from one to 31 seconds with a mean (i.e., average) of
three seconds (standard deviation = 5). The 90th percentile was 13 seconds.
This means that we would expect most (90%) of pilot responses to be initiated
within 13 seconds. The average response time, as measured from the end of the
controller's transmission to the end of the pilot's initial transmission (even if it
was only a "say again") was six seconds.
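The summary statistics quoted above (range, mean, standard deviation, and 90th percentile) can be computed from a list of measured response times as in the following minimal sketch. The sample values are hypothetical and are not the Cardosi and Boole data; the function name and the nearest-rank percentile method are assumptions made for illustration.

```python
import statistics

def summarize_response_times(times_s, percentile=90):
    """Return range, mean, sample standard deviation, and a nearest-rank percentile."""
    ordered = sorted(times_s)
    # Nearest-rank method: the smallest value with at least `percentile`
    # percent of the observations at or below it (integer ceiling division)
    rank = max(1, (percentile * len(ordered) + 99) // 100)
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.mean(ordered),
        "sd": statistics.stdev(ordered),
        "percentile": ordered[rank - 1],
    }

# Hypothetical response times in seconds (NOT the Cardosi and Boole data)
sample = [1, 1, 2, 2, 2, 3, 3, 4, 13, 31]
print(summarize_response_times(sample))
```

With a skewed sample like this one, a few slow responses pull the 90th percentile well above the mean, which is why the percentile is the more useful figure when a system must allow for nearly all responses.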
To measure response time from a systems approach, Cardosi and Boole
examined the total time required for successful transmission of a time-critical
message. This was measured from the beginning of the controller's transmission
to the end of the pilot's correct acknowledgement (and included "say agains"
and other requests for repeats). This total time ranged from four to 40 seconds
and averaged 10 seconds. Ninety percent of the transmissions were successfully
completed within 17 seconds. Interestingly, times required to complete similar,
but not time-critical transmissions, such as turns issued by controllers for
reasons other than traffic avoidance, were very similar. The time required for
successful transmission of such calls ranged from four to 52 seconds with a
mean of 10 seconds.
Finally, it is interesting to note that many pilots' (and controllers') perception is
that a pilot's responses to GPWS and to time-critical calls are immediate. While
this is largely true, analysis of the data shows that even the immediate takes
time.
What Method of Testing Should Be Used?
The testing method of choice depends on the specific problem or question
under investigation and the available resources. Most importantly, the method
must be appropriate to the issue. For example, one would not consider a
questionnaire for measuring the time required to complete a small task, nor
would one collect data on pilot eye movements by asking the pilots where and
when they moved their eyes. Another necessary consideration is the amount and
type of testing resources available. Often, the most desirable type of test is too
expensive and many compromises are necessary. These compromises need to be
recognized, as do their implications for the interpretation of the test results.
Field Observations
One evaluation technique that is often used is field observation. This includes
any over-the-shoulder evaluations, such as sitting behind the pilot and observing
a specific pilot activity or sitting behind a controller team and observing their
interactions. One advantage to this method is that it allows investigators to
make observations in the most natural setting possible. It can increase our
understanding of the nature of processes and problems in the work
environment. Specifically, valuable insights can be gained as to where problems
might occur with a specific system or procedure and why they might occur.
One task in which field observations are helpful is in trying to determine the
information or cues that people use in performing a task. We, as humans, are
rarely aware of all of the information that we use in performing a task. This is
illustrated in a "problem" that Boeing Commercial Airplanes once had with one
of their engineering simulators. After flying the simulator, one pilot reported
that, "It felt right last week, but it just doesn't feel right this week." The
mechanics examined everything that could possibly affect the handling qualities
of the simulator. They took much of it apart and put it back together. They
fine-tuned a few things, but made no substantive changes. The pilot flew the
simulator again, but again reported that it still didn't "feel right." It seemed a
little better, but it just wasn't right. Someone finally realized that the engine
noise had inadvertently been turned off. The engine noise was turned back on
and suddenly, the simulator once again "handled" like the aircraft (Fadden,
personal communication).
While field observations are often useful as initial investigations into a problem,
their limitations often preclude objective conclusions. Their findings may be
more subjective than objective, are dependent on the conditions under which
the observations were made and can actually be affected by the observation
process itself.
One factor that affects the reliability of findings based on field observations is
the number of observations made. For example, a conclusion based on 10 test
flights is going to be more reliable (i.e., more repeatable) than one based on
three flights. Furthermore, the findings based on field observations are
condition-dependent. That is, the findings must be qualified with respect to the
specific conditions under which the observations were made. For example, if
you observed five test flights and they all happened to be in good weather,
with no malfunctions, et cetera, you may have observed only low or moderate
workload flights. Any findings based on these flights cannot then be
generalized to situations involving high workload.
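One way to see why a conclusion based on more observations is more repeatable is the standard error of a mean, which shrinks with the square root of the number of observations. This is a standard statistical result rather than something the report itself presents; the function name and the spread value below are hypothetical.

```python
import math

def standard_error(sample_sd, n_observations):
    """Standard error of a mean estimated from n independent observations."""
    if n_observations < 1:
        raise ValueError("need at least one observation")
    return sample_sd / math.sqrt(n_observations)

# Hypothetical spread of some measured quantity across flights (arbitrary units)
sd = 2.0
for n in (3, 10):
    print(f"{n} flights: standard error = {standard_error(sd, n):.2f}")
```

Note that this only addresses the number of observations. It does nothing about condition dependence: ten good-weather flights still say little about high-workload flights.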


Another, and more subtle, consideration is that the very process of observation
can alter what is being observed. An observer's activities, or even his or her
mere presence, can affect performance. For example, depending on who the
observer is (and their stated or implied mission), a flight crew may change their
behavior. They may, for example, become more conscientious (e.g., about
checklists). It is easy to envision how different observers (e.g., a university
researcher, an air traffic controller, or an FAA inspector) might observe slightly
different behaviors exhibited by the same crew, all of which may be different
from what occurs when no observer is present.
Another possibility is that the observer's presence might make a crewmember
nervous and induce a classic case of "checkitis". In this case, performance would
be poorer than when no observer is present. Observers, or their questions, may
also be distracting and this may adversely affect performance.
Questionnaires
Questionnaires are important research tools that allow investigators to collect
information from many people at minimal cost. They are very useful in
surveying user opinion, company procedures, individual practices and
preferences, etc. Developing a useful questionnaire is not a simple process.
There are experts available in questionnaire development and guidelines for
developing and administering useful questionnaires (see Kidder, 1981).
The first rule of questionnaire design is that the questions should be simple and
direct. The likelihood that different people will interpret the questions or
rating scales differently should be minimized, so confusing or ambiguous
questions need to be eliminated. The best way to
accomplish this is to administer the questionnaire to a small group of
individuals who are part of the target population (e.g., pilots) and see how they
interpret the questions. It is also very helpful to ask for their feedback on the
format of the questionnaire, the clarity of the questions, etc.
While it is true that the best questions are simple and direct, care must be
taken in the specific wording of the questions. A question with an obviously
desirable answer will not yield informative results. For example, in a survey on
cockpit and cabin crew coordination, Cardosi and Huntley (1988) wanted to
assess crewmembers' knowledge of sterile cockpit procedures. The most direct
question, "Do you know your airline's procedure for sterile cockpit?" would
probably have resulted in crewmembers answering in the affirmative, whether
or not they were certain of the procedure. Instead, they asked, "What is your
airline's procedure for sterile cockpit?" It was an interesting finding in itself
that different crewmembers from the same airlines gave different answers.
Second, the questions need to be unbiased, both individually and as a set.
Individual questions can be biased in terms of their wording. For example,
asking "How much easier is it to use trackball X than trackball Y," presumes
that trackball X is easier, and respondents are unlikely to report that X is
actually more difficult. An unbiased way to present the same question is
"Compare the ease of using trackball X to using trackball Y." This question
would be answered with a scale ranging from "X is more difficult" to "Y is more
difficult" with a midpoint of "X and Y are the same."
Just as any individual question can be biased, a questionnaire may also be
biased in its entirety. For example, if there are more questions about possible
problems with a system than about its advantages, respondents may report
feeling less favorably toward the system than if the questionnaire had more
positive than negative questions.
Finally, the questionnaire should be administered as soon as possible after the
experience or task that is under investigation. Because memory for detail can be
very fleeting, it would not be advisable to show a pilot a new display, and then
a week later administer the questionnaire. The sooner after exposure the
questionnaire is administered, the more useful the results are likely to be. One
exception to this rule is a questionnaire that is used to examine the
effectiveness of a training program, that is, how much of the information
presented in training is retained over a given period of time. For such a
"test," a significant time interval (e.g., one month or longer) between exposure
to the training and the questionnaire would be useful. A test with such a delay
would be more effective than a test with no delay in predicting what
information will be remembered and accessible for use when needed in actual
operations.
Rating Scales
Rating scales are often very useful. Most scales offer five or seven choices.
Fewer than five choices is confining; more than seven makes it difficult to
define the differences between consecutive numbers on the scale.
Unless it is desirable to force questionnaire respondents to choose between two
alternatives, rating scales should always have a mid-point. (This is one reason
why an odd number of choices is recommended.) The scale should also have
descriptive "anchors," that is, at least both ends and the middle values should
have a word or phrase that identifies exactly what is meant by that number.
This helps to minimize differences in people's own standards. For example, if
the question asks for a rating of the ease or difficulty of the use of a system Y
as compared to system X, anchors should be given where the number '1' means
much easier than X; the number '3' corresponds to 'no difference' and '5' means
much more difficult than X. The results will be easier to interpret and,
therefore, much more valuable, than those obtained by simply asking for a
rating of ease or difficulty on a scale of one to five.
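The anchored five-point scale described above could be represented and tallied as in the following sketch. The anchor wording follows the System Y versus System X example in the text; the variable names, function name, and response data are hypothetical.

```python
from collections import Counter

# Hypothetical five-point scale with anchors at both ends and the midpoint,
# following the System Y versus System X example in the text
ANCHORS = {
    1: "Y is much easier than X",
    3: "No difference",
    5: "Y is much more difficult than X",
}
VALID_RATINGS = range(1, 6)

def tally_ratings(ratings):
    """Count how many respondents chose each point on the scale."""
    for r in ratings:
        if r not in VALID_RATINGS:
            raise ValueError(f"rating {r} is outside the 1-5 scale")
    counts = Counter(ratings)
    return {point: counts.get(point, 0) for point in VALID_RATINGS}

# Hypothetical responses from eight respondents
responses = [1, 2, 3, 3, 3, 4, 2, 5]
for point, count in tally_ratings(responses).items():
    print(point, ANCHORS.get(point, ""), count)
```

Reporting the full distribution next to the anchors, rather than only an average, keeps the meaning of each number visible when the results are interpreted.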
While user opinion is extremely valuable, there are many problems with making
important design decisions by vote or consensus alone. We, as humans, are not
very good at estimating our own response times, or predicting our own errors;
nor do our initial preferences always match what will be most efficient in actual
operations. Furthermore, there is also a tendency to prefer what is most familiar
to us. Initial perceptions of new systems or subsystems may change with
experience. For example, pilots who first used the B747-400 primary flight
displays rated them as "very cluttered." With experience, however, these ratings
changed to "just right" (Boucek, personal communication). Also, the first line
pilots to fly the B767 thought they preferred the electronic Horizontal Situation
Indicator (HSI), until they used the electronic map display (Boucek, personal
communication).
It has also been the case that pilots have preferred one thing on the ground
(e.g., a display with lots of high-tech options and information) and something
else (usually a simpler, less cluttered, version) once they tried to use it in actual
operations.
Even simple behaviors do not lend themselves to accurate judgments about our
own actions. As part of an evaluation of a prototype navigation display, the
Boeing flight deck integration team monitored pilots' eye movements as they
used the display. The team also asked the pilots to report
where they thought they were spending most of their time looking. There was
no systematic relation between where the pilots thought they were looking the
most and where the data actually showed that they were looking most (Fadden,
personal communication).
Laboratory Experiments
It is difficult, if not impossible, to investigate issues by manipulating factors in
actual operations. Such control is usually only available in a laboratory setting.
The goal of an experiment is to manipulate the variables under investigation
while keeping everything else constant. This careful manipulation of the key
variables allows investigators to determine which of them has an effect.
One common type of laboratory experiment is a part-task simulation. Part-task
simulations are useful for studying simple questions, such as: "How long does it
take to notice a particular change in the display?" or "Will the user immediately
know what that symbol means?" A part-task simulation is an ideal way to
conduct an in-depth test of a new display. It allows attention to be focussed on
the details of the display before it is tested operationally in a full-mission
simulation. In addition to providing valuable results, a part-task simulation
often points to specific areas that should be tested in a full-mission simulation.
The full-mission simulation is, of course, a very desirable type of test because it
preserves the most realism, and thus, yields results that are easy to generalize to
the real world. Full-mission simulation can give the same degree of control as a
laboratory experiment, with the added benefits afforded by the realism.
The major drawback of full-mission simulation is that it is very expensive. The
costs for computer time, simulator time, the salary for the pilots and/or
controllers who participate, in addition to the other costs of research, can be
prohibitive for all but the largest, and most well-funded, of projects. Also, there
are only a few places in the country that have the capability to conduct
full-mission simulation studies.
Another limitation of simulation studies that must be considered when
interpreting the results is the priming effect. When pilots walk into a simulator
knowing that they are going to participate in a test of Warning System X, they
are expecting to see that system activated. They will see System X activated
more times in one hour than they are likely to see in an entire day of line
flying. This expectation leads to a priming effect which yields faster response
times than can be expected when the activation of System X is not anticipated.
For this reason, the response times obtained in simulations are faster than can
be expected in the real-world and must be considered as examples of best-case
performance. How much faster the response times will be in simulation than in
actual operations is difficult to say as it depends on a variety of factors,
particularly the specific task. In addition to response times being faster, they are
also more homogeneous in simulation studies than would be expected in actual
operations. This reduced variability can result in a higher likelihood of
obtaining a statistically significant difference between two groups or conditions
in a simulation study than in actual operations. However, since data from
actual operations are rarely obtainable, data from realistic simulation studies are
a good alternative.
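The effect of reduced variability on statistical significance can be illustrated with a two-sample test statistic. The sketch below uses Welch's t statistic, a technique chosen here for illustration rather than one named in the report, and all response times are hypothetical. Both pairs of samples differ by the same one second on average, but the less variable "simulator" pair yields a much larger statistic.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Welch's t statistic for the difference between two sample means."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    return (mean_a - mean_b) / math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))

# Hypothetical response times (seconds). Each pair differs by 1 s on average,
# but the "simulator" samples are less variable than the "line" samples.
sim_a, sim_b = [4.8, 5.0, 5.2, 5.0], [5.8, 6.0, 6.2, 6.0]
line_a, line_b = [3.0, 5.0, 7.0, 5.0], [4.0, 6.0, 8.0, 6.0]

print(abs(welch_t(sim_a, sim_b)), abs(welch_t(line_a, line_b)))
```

A difference that looks clearly significant in simulation may therefore be much harder to detect in the noisier data from actual operations.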
Experimental Validity and Reliability
The goal of any evaluation is to have reliable and valid results. Reliability refers
to the repeatability of the results. If another investigator were to run the same
test with the same equipment and same type of test participants, what are the
chances that they would get the same results? In order to have repeatable
results, the results obtained need to be due to the factors that were
manipulated, and not to extraneous factors, chance, or anything peculiar to the
testing situation or individuals tested.
In any experiment, it is necessary to carefully manipulate the factors that will
be examined in the study and control all other variables (if only by keeping
them constant). Careful controls help to ensure that the results of the study are,
in fact, due to the factors examined and not to extraneous factors.
Validity refers to measuring what the test purports to measure. A classic
example of this is the IQ test. Does it really measure one's ability to learn? Do
the Scholastic Aptitude Tests (SATs) actually measure one's ability to succeed
in college? If the answer to this type of question is "no," then the test is not
valid.
One way to help ensure that the results of the study are valid and reliable is to
employ careful controls of critical factors of interest and of extraneous factors
(such as fatigued participants) that may influence the results of the study. This
is easier said than done because it is often very difficult to even identify all of
the factors that may contribute to your results. However, careful selection of
test participants and testing conditions, in addition to a sound experimental
design, will help to ensure valid and reliable results. A sound experimental
design ensures that an adequate number of test participants ("subjects") are
properly selected and tested (in an appropriate number and order of conditions)
and that careful controls of the variables are included in the test.
Operationally Defined Variables

One fundamental component of an evaluation that often gets neglected is the
requirement that the test variables be operationally defined. This means that the
factors under investigation must be defined in ways that can be measured. For
example, a test to determine whether the use of the Traffic Alert and Collision
Avoidance System (TCAS) increases Air Traffic Control (ATC) frequency
congestion, would begin with an operational definition of frequency congestion.
A suitable measure, in this case, would be the number of ATC calls generated
by TCAS-equipped aircraft (e.g., pilots contacting ATC to inform the controller
of a maneuver or ask a question concerning a traffic alert) per unit time as
compared to the number of traffic-related calls generated by aircraft without
TCAS under similar conditions.
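The operational definition above reduces "frequency congestion" to a rate that can be counted and compared. A minimal sketch of that comparison follows; the function name, call counts, and observation periods are hypothetical.

```python
def calls_per_hour(n_calls, observation_hours):
    """Operational measure of frequency congestion: ATC calls per hour."""
    if observation_hours <= 0:
        raise ValueError("observation period must be positive")
    return n_calls / observation_hours

# Hypothetical counts of traffic-related calls observed under similar conditions
tcas_rate = calls_per_hour(n_calls=18, observation_hours=6.0)
non_tcas_rate = calls_per_hour(n_calls=12, observation_hours=6.0)
print(f"TCAS-equipped: {tcas_rate:.1f}/h, unequipped: {non_tcas_rate:.1f}/h, "
      f"difference: {tcas_rate - non_tcas_rate:.1f}/h")
```

Once the variable is defined this way, the question of whether TCAS increases congestion becomes a comparison of two measured rates rather than a matter of impression.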
Whether a test is designed to examine something simple (such as display
clutter) or complex (such as situational awareness), all variables must be
defined in terms of units that can be measured in the study.
Representative Subject Pool

Another necessary component of an evaluation is a representative subject pool.
Since most research on basic perceptual and cognitive processes is conducted
using college students as subjects, a question often arises as to whether or not
we may generalize the results to specific populations, such as pilots or
controllers.

One rule of thumb is that if the study purports to examine an aspect of
behavior in which the target population would be expected to be different from
college students in key ways, then the results will not be applicable. The
differences between the target and test populations may be in terms of physical
differences (such as age), or intellectual abilities (such as specific skills or
knowledge). Whether or not these differences prevent a generalization of test
results depends upon the task. These differences can be quite subtle, but
important.
For example, one approach to studying the similar call-sign problem might
involve determining which numbers are most likely to be confused when
presented auditorily. A sample research question would be "Is 225 more
confusable with 252 or with 235?" This is a relatively simple task and the results
would comprise a confusability matrix. Because this is a simple auditory task,
pilots would not be expected to perform much differently than college students
(with the exception of the differences attributable to hearing loss due to age
and exposure to noise). In this case, performance depends solely on the ability
to hear the differences between numbers and results of experiments performed
with college students as subjects are likely to be applicable to pilots.
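A confusability matrix of the kind mentioned above simply counts, for each number presented, how often listeners reported each possible response. The following sketch shows one way to build it; the function name and trial data are hypothetical.

```python
from collections import defaultdict

def confusability_matrix(trials):
    """Count how often each presented number was reported as each response.

    `trials` is an iterable of (presented, reported) pairs.
    """
    matrix = defaultdict(lambda: defaultdict(int))
    for presented, reported in trials:
        matrix[presented][reported] += 1
    # Convert to plain dicts so the result prints and compares cleanly
    return {p: dict(row) for p, row in matrix.items()}

# Hypothetical listening trials: what was spoken versus what the listener reported
trials = [("225", "225"), ("225", "252"), ("225", "225"), ("225", "235"), ("225", "252")]
print(confusability_matrix(trials))
```

In this made-up sample, "225" was misheard as "252" more often than as "235", which is the kind of comparison the sample research question asks for.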
Now consider a superficially similar, but technically very different, task. If the
experimental task was to look at the effect of numerical grouping on memory
for air traffic control messages, subjects might listen to messages with numerical
information presented sequentially (e.g., "Descend and maintain one, zero
thousand. Reduce speed to two two zero. Contact Boston Approach one one
niner point six five"), and messages with numerical information presented in
grouped form (e.g., "Descend and maintain ten thousand. Reduce speed to two
twenty. Contact Boston Approach one nineteen point sixty-five.") Since a pilot's
memory for that type of information is going to be very different from a college
student's memory of that information, (mostly because it is meaningful to the
pilot), results obtained by using college students would probably not be directly
applicable to pilot populations.
One important aspect in which subjects should be representative of the target
population is in terms of skill level. It is highly unlikely that a test pilot can
successfully train himself to react or think like a line pilot. A below-average
pilot (or an average pilot on a bad day) is likely to experience more difficulties
with a new system than a skilled test pilot, or an Aircraft Evaluation Group
(AEG) pilot. It is very difficult for a highly experienced operator to predict how
people without prior knowledge or specific experiences will perform a certain
task or what mistakes they are likely to make. Exceptional skill can enable an
operator to compensate for design flaws - flaws which, because of the skill, may
go unnoticed.

Controlling Subject Bias

While it is important that the people used as subjects are as similar as possible
to the people to whom you want to generalize the results, it is also important
that the subjects' biases don't affect the results of the test. If the participants
have their own ideas as to how the results should come out, it is possible for
them to influence the results, either intentionally or unintentionally. It is not
unusual for subjects to be able to discern the "desirable" test outcome and
respond accordingly. To prevent this, investigators must take steps to control
subject bias. For example, studies designed to test the efficacy of a new drug
often employ a control group that receives a placebo (sugar pill). None of the
subjects knows whether he or she is in the group receiving the new drug or in
the group given the placebo. Some studies are conducted "double-blind,"
meaning that even the experimenters who deal with the subjects do not know
who is receiving the placebo and who is receiving the drug.
In aviation applications, it is usually impossible to conduct a test (e.g., of new
equipment) without the participants knowing the purpose of the test.
Furthermore, concealing the purpose is often undesirable, since subjects' opinions (e.g., of the new
display) can be a vital component of the data. One solution to the problem of
controlling or balancing the effects of biases and expectations is the use of a
control group. This group of subjects is tested under the same conditions (and
presumably would have the same expectations) as the experimental group, but
is not exposed to the tested variable.
For example, consider a test designed to examine the effectiveness of a new
training program for wind shear (e.g., on the time required to maneuver based
on a recognition of wind shear, number of simulated crashes, etc.). If the new
training program is to be compared to an existing program, then the
performance of pilots who were trained in each program could be compared.
Pilots trained in the new program would be the experimental group and pilots
trained in the existing program would constitute the control group. If the
training program was a prototype and there was no such comparison to be
made, then the performance of pilots trained with the new program
(experimental group) could be compared to that of pilots who did not receive
this training (control group). In this case, however, it would be important to
control for test expectations. If, for example, the test wind shear scenarios were
presented within days of the training, then the pilots would naturally expect
wind shear to occur in the simulation sessions. This expectation would be
likely to improve their performance over what it would be if wind shear was
not anticipated. In this case, for the comparison between the two groups to be
meaningful, pilots in both groups would need either to be informed of the
purpose of the test or to be caught by surprise.
Another way to control subject bias is with careful subject selection. A good
example of this is illustrated in a test conducted at the FAA Civil Aeromedical
Institute to look at low-visibility minimums for passive auto-land systems
(Huntley, unpublished study). The Air Transport Association (ATA) wanted
lower minimums than the Air Line Pilots' Association (ALPA) thought was safe.
Clearly, both of these groups had a stake in the outcome of the test. When the
simulation study was conducted, a portion of the subject pilots came from ALPA
and an equal number of pilots came from ATA (Huntley, personal
communication). While it is impossible to get rid of the biases that people bring
to a test, it is usually possible to balance them out.

Representative Test Conditions

It is usually desirable for the test conditions to be as representative as possible
of "real world" conditions. While the engineer looks at a system and asks, "Does
it perform its intended function?", the human factors specialist wants to know if
the pilots (or other operators) are able to use the system effectively under the
conditions under which it will be used. Because of this, the key conditions
included in the test must be as representative as possible of actual operating
conditions so that the results of the test can be generalized to actual operations.
Important conditions may include (but are not limited to): varied workload
levels, weather conditions, ambient illumination levels (i.e., lighting conditions),
ambient noise conditions, traffic levels, etc. For example, if a data input device
is designed to be used in the cockpit, then it is important to ensure that it is
easily used in a wide variety of lighting conditions and in turbulence (when it
is difficult to keep a steady hand).
It is often important to include the "worst-case" scenario in addition to
representative conditions in a test. Most human factors evaluations must include
a worst case test condition, since it is the worst case (e.g., combination of
failures) that often results in a dangerous outcome. For example, if it is
important that a time-critical warning system be usable in all conditions, then
the operator response time that is assumed by the software's algorithm needs to
take this into account. In this case, in addition to measuring how long it will
take the average person under average conditions to respond to the system, the
longest possible response time, or response time at the 95th or 99th percentile,
should also be measured. Such "worst case" response times should be obtained
under "worst case" conditions.
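As a minimal sketch of the measurement described above, the following Python fragment summarizes a set of response times by the mean and by the 95th and 99th percentiles. The response times are invented for illustration, and the nearest-rank definition of a percentile is an assumption; other definitions give slightly different values.

```python
import math

def nearest_rank_percentile(scores, pct):
    """Smallest score such that at least pct percent of the scores are <= it."""
    ordered = sorted(scores)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]

# Hypothetical response times (msec) collected under worst-case conditions.
response_times = [620, 640, 655, 670, 700, 710, 730, 760, 790, 820,
                  850, 880, 900, 940, 980, 1020, 1100, 1250, 1400, 2100]

mean_rt = sum(response_times) / len(response_times)
p95 = nearest_rank_percentile(response_times, 95)
p99 = nearest_rank_percentile(response_times, 99)
print(f"mean = {mean_rt:.0f} msec, 95th = {p95} msec, 99th = {p99} msec")
```

With these made-up data the upper percentiles are well above the mean, which is why a response time assumed by an algorithm should not be based on the average alone.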

Counter-balancing
One control that is not necessary in the engineering world but can be critical in
the human factors world is counter-balancing. When measuring the noise level
of two engines, it doesn't matter which one is tested first; the test of the first
engine will not affect the outcome of the test of the second engine. When
testing human performance, however, such order effects are common.

There are two possibilities of how human performance can change during the
course of the test; it can get better or worse. Performance may improve because
exposure to the first system gives subjects some information that helps them in
using the second system. This is called positive transfer. For example, in a test
of two data input devices, it would be reasonable to have pilots use each of
them and measure the time required to perform specific tasks (response time)
with each system. The number of errors made in the data input process
(response accuracy) would also be measured. Performance with System A could
be compared to performance with System B to determine which of the two
systems is preferable. If the procedures for two systems are similar (e.g., in
terms of keypad layout, the required order of the information input, etc.) but
new to the pilot, then the practice acquired during test of System A might
improve his or her performance with System B over what it would have been
without the experience gained during the first test.
However, if the two systems are physically similar, but require different
procedures to operate, then the experience acquired with the use (test) of
System A would probably impair performance with System B. Performance with
System B would have been better with no previous experience with System A.
This phenomenon is referred to as negative transfer.
One way to avoid the possibility of positive or negative transfer influencing test
results is to balance the order of conditions. For example, in a comparison of
two navigation displays, a test could be conducted in which half of the pilots
are tested with one display and half the pilots are tested using the other
display. In this case, it is particularly important to ensure that there are no
important differences in the two pilot populations (e.g., in terms of skill level).
Alternatively, each pilot could be tested using both displays, with half of them
using Display A first and half of them using Display B first. This is referred to
as "counter-balancing."
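The counter-balanced assignment described above can be sketched in a few lines of Python. This is only an illustration: the pilot identifiers, the fixed random seed, and the function name `counterbalance` are assumptions made for the example.

```python
import random

def counterbalance(pilots, seed=0):
    """Randomly split pilots so half use Display A first and half use Display B first."""
    shuffled = list(pilots)
    random.Random(seed).shuffle(shuffled)  # random assignment guards against selection bias
    half = len(shuffled) // 2
    orders = {}
    for pilot in shuffled[:half]:
        orders[pilot] = ("Display A", "Display B")
    for pilot in shuffled[half:]:
        orders[pilot] = ("Display B", "Display A")
    return orders

pilots = [f"pilot_{i:02d}" for i in range(1, 13)]  # hypothetical roster of 12 pilots
orders = counterbalance(pilots)
a_first = sum(1 for order in orders.values() if order[0] == "Display A")
print(f"{a_first} pilots start with Display A, {len(pilots) - a_first} with Display B")
```

Because every pilot uses both displays and each order occurs equally often, any positive or negative transfer affects both displays about equally.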
There is another reason why performance may deteriorate over the course of a
test. If the test is extremely long or the task is very tedious, performance may
suffer due to a fatigue effect. When fatigue may be a factor in a test, careful
controls (such as the use of an appropriate control group or balancing the order
of conditions) must be considered. One study of the effects of fatigue on flight
crews illustrates this point. Foushee, Lauber, Baetge, and Acomb (1986)
investigated the effects of fatigue on flight crew errors. They had two groups of
active line pilots fly a LOFT-type scenario in a full-mission simulation. Ten
flightcrews flew the scenario within two to three hours after completing a
three-day, high density, short-haul duty cycle. The other ten flightcrews flew the
test scenario after a minimum of three days off. The results showed that while
the "Post-Duty" crews were more fatigued than the "Pre-Duty" crews, their
performance was significantly better than that of the "Pre-Duty" crews. Of
course, the better performance was not attributable to fatigue, but to a personal
familiarity that developed over their duty cycle. The crews who had flown
together on the duty cycle prior to the simulation got to know each other and
knew what to expect from each other. This is often considered to be the birth
of cockpit (or crew) resource management.
The first part of this study did not have the control group of pilots who flew
together for the same amount of time right before the simulation, but weren't
fatigued. As the second part of this study, a subsequent analysis of the data
showed that the superior performance was, indeed, due to familiarity with the
other crewmembers and not to fatigue.
How Should Test Results Be Analyzed?
Once the human factors test has been conducted, the next step is to analyze the
results and present them in the simplest and most straightforward manner. The
goals of data analysis are to describe the results and, where applicable, to
determine whether there are important differences between groups or conditions
of interest. Data analysis is used to summarize and communicate the meaning of
a large set of numbers (such as response times or error rates) with the fewest
possible numbers.

Measures of Central Tendency

Measures of central tendency seek to describe a set of data (e.g., a set of
reaction times) with a single value. The most commonly cited measures of
central tendency are the arithmetic mean, the median, and the mode.
The mean. The mean is computed as the sum of all the scores (e.g., response
times or error rates) divided by the number of scores. For example, if response
times (in msec.) for eight different pilots on a particular task were measured to
be:
200, 225, 275, 300, 400, 400, 500, 1450
then the mean would be (200 + 225 + 275 + 300 + 400 + 400 + 500 + 1450)/8,
or 469 msec. The mean is considered to be the fulcrum of a data set
because the deviations of the scores above it balance the deviations of the scores
below it. The sum of the deviations about the mean is always zero. Because of this,
the mean is very sensitive to outlying scores, that is, scores that are very
different from the rest. A very high or very low score will tend to pull the mean
in the direction of that score. In our example data set cited above, the mean of
the first seven scores is 329 msec (compared to the mean of 469 with the score
of 1450). While the mean is more frequently cited than the median or the
mode, it is not always appropriate to cite it alone for this reason.
The median. The median is the score at which 50 percent of the scores fall
above it and 50 percent of the scores are below it. With an odd number of
scores, the median is the score in the middle when the scores are arranged from
lowest to highest. With an even number of scores, the median is the average of
the two middle scores. In the example array of data cited above, the median
would be the average of 300 and 400 or 350 msec. One advantage of the
median is that it is less sensitive to outlying data points. When there are a few
scores that are very different from the rest, then the median should be
considered as well as the mean.
The mode. The mode is the most frequently occurring score. In our example
data set, the mode is 400, since it is the only score that occurs more than once.
It is also possible, especially with very small data sets, to have no mode. In
very large data sets, it is possible to have multiple modes. While the mode is
the most easily computed measure of central tendency, it is also less stable than
the mean or median, and hence, usually not as useful.
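The three measures can be checked against the example data set with a short Python sketch that uses only the standard `statistics` module. The variable names are chosen for this illustration.

```python
import statistics

# Response times (msec) for the eight pilots in the example above.
response_times = [200, 225, 275, 300, 400, 400, 500, 1450]

mean_all = statistics.mean(response_times)      # sum of scores / number of scores
median_all = statistics.median(response_times)  # average of the two middle scores
mode_all = statistics.mode(response_times)      # most frequently occurring score

# Dropping the outlying score of 1450 shows how strongly it pulls the mean.
mean_without_outlier = statistics.mean(response_times[:-1])

print(f"mean = {mean_all:.0f}, median = {median_all:.0f}, mode = {mode_all}")
print(f"mean of first seven scores = {mean_without_outlier:.0f}")
```

The median and the mode are unaffected by how large the single outlying score is, while the mean shifts by about 140 msec.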

Measures of Variability
A measure of central tendency, when presented in isolation, cannot fully
describe the test results. In addition to the mean or median, we also need to
know how close or disparate the scores were. In other words, how
homogeneous were the scores as a group? For example, did half of the pilots
take five seconds to perform the task and half of them require ten seconds or
did they all take about 7.5 seconds? To answer this type of question, we need
to compute a measure of variability, also known as a measure of dispersion.
The most commonly used measure of dispersion is the standard deviation. The
standard deviation takes into account the number of scores and how close the
scores are to the mean.
The standard deviation (abbreviated as "s" or "s.d.") is the square root of the
variance. The variance ($s^2$) equals the sum of the squared deviations of each score from the
mean divided by the total number of scores. One equation for computing the
variance is as follows:

$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

Where:
$\sum$ is the summation sign
$X$ represents each score
$\bar{X}$ equals the mean of the distribution, and
$n$ equals the number of scores in the distribution.
To compute the standard deviation in this way, we subtract each score from the
mean, square each difference, add the squares of the differences, divide this
sum by the number of scores (or the number of scores minus one), and take
the square root of the result. Relatively small standard deviation values are
indicative of a homogeneous set of scores. If all of the scores are the same, for
example, the standard deviation equals zero. In our sample set of data used to
compute the mean, the standard deviation equals 383 msec when the sum is
divided by the number of scores (dividing by the number of scores minus one
gives 409 msec).
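The steps listed above can be followed literally in Python. This sketch computes the standard deviation of the example data both ways, dividing by the number of scores and by the number of scores minus one; the variable names are chosen for this illustration.

```python
import math

scores = [200, 225, 275, 300, 400, 400, 500, 1450]  # response times in msec
n = len(scores)
mean = sum(scores) / n

# Square each deviation from the mean and add the squares.
sum_of_squares = sum((x - mean) ** 2 for x in scores)

# Divide by n (or by n - 1) and take the square root.
sd_n = math.sqrt(sum_of_squares / n)
sd_n_minus_1 = math.sqrt(sum_of_squares / (n - 1))

print(f"s.d. dividing by n     = {sd_n:.0f} msec")
print(f"s.d. dividing by n - 1 = {sd_n_minus_1:.0f} msec")
```

The two divisors give noticeably different answers for a data set this small, so a report should state which one was used.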
Another use of the standard deviation is that it helps us to determine what
scores, if any, we are justified in discarding from the data set. Studies in visual
perception, for example, often use stimuli that are presented for very brief
exposure durations (e.g., less than one-half of a second). In this case, a sneeze,
lapse in attention, or other chance occurrence, could produce an extraordinarily
long response time. This data point would not be representative of the person's
performance, nor would it be useful to the experimenter. What objective
criterion could be used to decide whether this data point should be included in
the analysis?
In the behavioral sciences, it is considered acceptable to discard any score that
is at least three standard deviations above or below the mean. In our sample set
of data, if we discard the outlying score of 1450, the standard deviation
becomes 100. Leaving this score out of the analysis would not be acceptable,
however, using the convention of discarding scores three standard deviations
above or below the mean. In this example, only scores above about 1618 (the
mean of 469 plus three times the standard deviation of 383) would be
legitimately left out of the analysis. (In this case, it is impossible to have a
score three standard deviations below the mean, because it would indicate a
negative response time.)
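The three-standard-deviation convention can be applied mechanically, as in the sketch below. It assumes the standard deviation is computed by dividing by the number of scores (`statistics.pstdev`), and it works from unrounded values, so the cutoff it prints may differ from the rounded figure quoted in the text.

```python
import statistics

scores = [200, 225, 275, 300, 400, 400, 500, 1450]  # response times in msec

mean = statistics.mean(scores)
sd = statistics.pstdev(scores)  # divides by the number of scores

upper_cutoff = mean + 3 * sd
lower_cutoff = mean - 3 * sd    # negative here, so no score can fall below it

kept = [x for x in scores if lower_cutoff <= x <= upper_cutoff]
discarded = [x for x in scores if x not in kept]

print(f"cutoffs: {lower_cutoff:.0f} to {upper_cutoff:.0f} msec")
print(f"discarded scores: {discarded}")
```

Under this convention the score of 1450 stays in the analysis, even though it looks extreme next to the other seven scores.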

Correlation
Correlation is a commonly used descriptive statistic that describes the relation
between two variables. A correlation coefficient is reported as "r = x", where "x"
equals some number between negative one and one. When two variables are
unrelated (e.g., number of rainy days per month in Kansas and cost of airline
fares), the correlation coefficient is near zero. A high positive "r" indicates that
high values in one variable are associated with high values in the other
variable. A high negative "r" indicates that high values in one variable are
associated with low values in the other variable. A correlation of .7 or greater
(or -.7 or less) is usually regarded as indicative of a strong relation between the
two factors. An important note about correlation is that even a very high
correlation (e.g., r = .90) does not imply causality or a cause-effect relationship.
A correlation coefficient merely indicates the degree to which two factors varied
together, perhaps as a result of a third variable that remains to be identified.
Another way in which the correlation coefficient is useful is that, when squared,
it indicates the percentage of the variance that is accounted for by the
manipulated factors. For example, with a correlation coefficient of .7, the
factors that were examined in the analysis account for only 49 percent of the
variance (i.e., the variability in the data). The other 51 percent is due to chance
or things that were not controlled.
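A correlation coefficient and its square can be computed directly from paired scores. The sketch below uses the standard Pearson product-moment formula; the workload ratings and error counts are invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)

# Hypothetical data: workload rating and number of errors for eight pilots.
workload = [1, 2, 3, 4, 5, 6, 7, 8]
errors = [2, 1, 4, 3, 6, 5, 8, 7]

r = pearson_r(workload, errors)
print(f"r = {r:.2f}, variance accounted for = {100 * r ** 2:.0f} percent")
```

Even with a correlation this high, the calculation says nothing about whether workload causes the errors.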

The statistics discussed above describe the test results and are, therefore,
referred to as "descriptive statistics." Inferential statistics are used to determine
whether two or more samples of data are significantly different, for example, if
performance on System A is significantly better or worse than performance on
System B.
The most commonly cited inferential statistics are the t-test and analysis of
variance. Each method of analysis has an underlying set of assumptions. If these
assumptions are seriously violated, or the analysis is inappropriate for the
experimental design, then the conclusions based on the analysis are
questionable.
Student's t Ratio

Student's t ratio (commonly referred to as a t-test) compares two different
groups of scores and determines the likelihood that the differences found
between them are due to chance. For example, t-tests would be appropriate
when comparing the results of two groups of scores, whether it be the
performance of the same group of pilots with System A and with System B, or
the performance of two groups of pilots - one using System A and the other
using System B. When both sets of scores are taken from the same group of
people, Student's t ratio for correlated samples is appropriate. When the scores
of two different groups of people are examined, Student's t ratio for
independent samples is appropriate. The formulas for computing a t-ratio (and
all of the statistics discussed in this chapter) can be found in Experimental
Statistics (Natrella, 1966) and in most statistics textbooks. Both types of t-tests
look at the differences between the two groups of scores with reference to the
variability found within the groups. They provide an indication as to whether or
not the difference between the two groups of scores is statistically significant.
The results of a t-test are typically reported in the following format:
t(df) = x, (p < p.)
Where:
"df" equals the number of degrees of freedom
"x" equals the computed t-value
"p." equals the probability value.
For example: t(20) = 3.29, p < .01.
Degrees of freedom (df) refers to the number of values that are free to vary,
once we have placed certain restrictions on the data. In the case of a t-test for
correlated samples, the number of degrees of freedom equals the number of
subjects minus one. For independent samples, df equals the number of subjects
in one group added to the number of subjects in the other group minus two. In
both cases, as the number of subjects increases (and, hence, the number of df
increases), a lower t-value is required to achieve significance.
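The two forms of the t ratio and their degrees of freedom can be sketched as follows. The task times are invented for illustration, and the independent-samples version assumes equal variances in the two groups (a pooled variance estimate); converting the t-value to a probability requires a t table or a statistics package and is not shown.

```python
import math
import statistics

def t_correlated(a, b):
    """t ratio for correlated samples: the same subjects measured twice."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1  # df = number of subjects minus one

def t_independent(a, b):
    """t ratio for independent samples, assuming equal variances."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2  # df = total number of subjects minus two

# Hypothetical task times (seconds) with System A and System B.
system_a = [12, 15, 11, 14, 13, 16]
system_b = [10, 12, 10, 11, 12, 13]

t_c, df_c = t_correlated(system_a, system_b)
t_i, df_i = t_independent(system_a, system_b)
print(f"correlated samples:  t({df_c}) = {t_c:.2f}")
print(f"independent samples: t({df_i}) = {t_i:.2f}")
```

The same numbers give a larger t-value when treated as correlated samples, because differences between individual pilots are removed from the comparison.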

Statistical Significance
The p value relates to the probability that this specific result was achieved by
chance. This is true not only for the t-values, but for all other statistics as well.
A "p < .01" indicates that the probability that this result would be achieved by
chance (and not due to the manipulated factors) is less than one in 100. When
the results are significant at the .05 level (i.e., p ≤ .05), the chances of the
results occurring by chance are 5 in 100, or less. Very often, the statistic is cited
at the end of a statement of the results. For example, "The number of errors
was significantly higher in the high workload condition than in the low
workload condition (t(15) = 2.25, p < .05)." It can also be used to show that
there were no statistically significant differences between two conditions. For
example, "The number of errors in the high workload conditions was
comparable to the number of errors in the moderate workload condition (t(15) =
0.92, p > .10)." It cannot, however, be used to prove that there are no
differences between the two groups, or that the two groups are the same.
For comparisons among more than two groups or more than two conditions in
the same test, performing t-tests between all of the possible pairs would not be
the best approach. A more appropriate test is Analysis of Variance (ANOVA).
Analysis of Variance is similar to a t-test in that it examines the differences
between groups with respect to the differences within groups. In fact, when
there are only two groups, an analysis of variance yields the same probability
value as the t-ratio.
Analysis of Variance
Analysis of variance (ANOVA) permits us to divide all of the potential
information contained in the data into distinct, non-overlapping, components.
Each of these portions reflects a certain part of the experiment, such as the
effect of an individual variable (i.e., a main effect), the interaction of any of the
variables, or the differences due solely to individual subjects. Each main effect
and interaction is reported separately, in the following format:
F(df, df) = x, (p < p.)
Where:
"df" equals the number of degrees of freedom
"x" equals the computed F statistic
"p." equals the probability value.
For example: F(2,24) = 7.78, p < .01. For an ANOVA, the two reported
degrees of freedom are dependent upon the number of subjects and the number
of conditions or levels of effects.
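To show where the F statistic and its two degrees of freedom come from, here is a minimal sketch of the simplest case, a one-way analysis of variance with a different group of subjects in each condition. The error counts are invented, and this simple layout is an assumption made for illustration; designs with several variables or repeated measurements on the same subjects divide the variation into more components.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for independent groups of scores."""
    scores = [x for group in groups for x in group]
    grand_mean = sum(scores) / len(scores)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical error counts for nine pilots in each of three workload conditions.
low = [2, 3, 1, 2, 4, 3, 2, 3, 2]
moderate = [3, 4, 3, 5, 4, 3, 4, 5, 4]
high = [5, 6, 4, 7, 5, 6, 5, 6, 7]

f_ratio, df1, df2 = one_way_anova([low, moderate, high])
print(f"F({df1},{df2}) = {f_ratio:.2f}")
```

Three conditions and 27 scores give the degrees of freedom (2, 24) that appear in the reporting example above.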
An Example
As a hypothetical example, consider a simulation study of the operational effects
of transmitting pilot-to-controller and controller-to-pilot communications via
satellites. (For an actual study that is very similar to the hypothetical one
described here, see Nadler et al., 1992.) This method of transmission would
impose a delay (of approximately one-half second) between the time the
controller keys the microphone and the time the pilot was able to hear the
beginning of the transmission. Pilot transmissions to controllers would be
similarly affected.
One effect that satellite transmission might be expected to have on operations is
to increase the number of blocked transmissions ("step-ons"), since the delay
makes it possible for both controllers and pilots to key their microphones
without realizing that there is an incoming transmission. (The number of
pilot-pilot step-ons would not change, as the pilots would still be able to hear the
beginning of the other pilots' transmissions without a delay.) Without this
delay induced by satellites, blocked transmissions are due solely to two or more
people (controller and pilot or two pilots) attempting to transmit at the same
time and to stuck mikes. Since the probability of two individuals trying to
transmit simultaneously is logically a function of frequency congestion, the
number of transmissions on the frequency would be an important experimental
variable.
In this simulation study, two independent variables - the number of aircraft on
the frequency and whether or not there is a communication delay - would be
manipulated. Their effect on the number of step-ons (the dependent variable)
would be measured. In this example, we have two levels of delay (500 msec. to
simulate the satellite condition and no delay to simulate the present system).
We are careful to ensure that the number of aircraft on the frequency generates
different levels of frequency congestion. We categorize these levels of frequency
congestion into "low," "moderate," and "high," based on data obtained from
actual operations. Since we have two levels of delay and three levels of
frequency congestion, this is referred to as a two-by-three experimental design.
Furthermore, we have a completely balanced design. This means that we have
an equal number of hours of voice recordings in each combination of
delay-frequency congestion conditions. We are statistically confident that we have an
adequate number of different controllers and number of hours of data. We are
also careful to keep all other conditions constant (which is always easier said
than done).
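The two-by-three design described above can be laid out explicitly, which makes it easy to confirm that it is balanced. In this sketch the number of recorded hours per cell is a made-up value.

```python
from itertools import product

delays_msec = [0, 500]                          # no delay vs. simulated satellite delay
congestion_levels = ["low", "moderate", "high"]
hours_per_cell = 4                              # hypothetical amount of voice recording

# One entry per combination of delay and frequency congestion.
design = {cell: hours_per_cell for cell in product(delays_msec, congestion_levels)}

for (delay, congestion), hours in design.items():
    print(f"delay = {delay:3d} msec, congestion = {congestion:8s}: {hours} hours")

is_balanced = len(set(design.values())) == 1    # same amount of data in every cell
print(f"{len(design)} cells, balanced: {is_balanced}")
```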
The three sources of variation in our ANOVA are the effects of delay, frequency
congestion, and subjects (i.e., differences in the number of step-ons associated
with different controllers). The results may show that the only significant effect
is that of delay, meaning that the number of step-ons was significantly different
for the two delay conditions. Graphically, this possibility might look like the
following:

[Figure: Number of Step-Ons plotted against Frequency Congestion (low, moderate, high), with separate lines for the Delay and No delay conditions.]

Another possible result is that the only significant effect was due to frequency
congestion. This could mean that the number of step-ons increased with
frequency congestion regardless of the delay condition. Graphically, this
possibility might look like this:
[Figure: Number of Step-Ons plotted against Frequency Congestion (low, moderate, high), with separate lines for the Delay and No delay conditions.]

In addition to one, both, or neither of the effects of delay and frequency
congestion being significant, a significant interaction may occur. A significant
interaction would occur if, for example, there was no difference between the
delay conditions at the lowest level of frequency congestion, but there was a
significant difference at the highest level of frequency congestion. Graphically,
this possibility might look like this:

[Figure: Number of Step-Ons plotted against Frequency Congestion (low, moderate, high), with separate lines for the Delay and No delay conditions.]

These are only examples of the type of results that may produce significant
main effects or a significant interaction. There are many other possibilities. Of
course, only a statistical analysis can determine whether differences portrayed
on a graph are significant. Interpretation of test results is usually not simple,
particularly with complex experimental designs. For this reason, human factors
specialists with expertise in experimental design, but preferably statisticians,
should be involved in the design of the research and the analysis of the results.
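The interaction pattern described above can be made concrete with a table of cell means. The numbers below are invented, and the sketch only describes the pattern; as noted, whether such a pattern is significant has to be determined by a statistical analysis.

```python
# Hypothetical mean number of step-ons per hour in each cell of the design.
cell_means = {
    ("no delay", "low"): 2.0, ("no delay", "moderate"): 3.0, ("no delay", "high"): 4.0,
    ("delay", "low"): 2.0, ("delay", "moderate"): 4.5, ("delay", "high"): 8.0,
}
levels = ["low", "moderate", "high"]

# Effect of delay at each level of frequency congestion.
delay_effect = {
    level: cell_means[("delay", level)] - cell_means[("no delay", level)]
    for level in levels
}

# If delay had the same effect at every congestion level, this would be zero.
interaction_contrast = delay_effect["high"] - delay_effect["low"]

for level in levels:
    print(f"{level:8s}: delay adds {delay_effect[level]:.1f} step-ons per hour")
print(f"difference between high and low congestion: {interaction_contrast:.1f}")
```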

Regression Analysis
A special case of analysis of variance that is often used is regression analysis.
Regression analysis takes the data and fits it to a mathematical function. The
function may be a straight line, a parabola, or any other function. The analysis
provides an indication of how well the data fits that particular function.
One of the advantages of regression analysis is that it is very forgiving of empty
cells in an experimental design (i.e., conditions in the design that do not have
as many data points as the other conditions). For example, if we wanted to test
how many mistakes pilots were likely to make with a certain system, but were
most interested in the number of errors to be expected under conditions of high
workload, then we might run a test with the majority of responses being in
high workload conditions. Perhaps some pilots would only be tested in the high
workload condition. Because of this asymmetry of data points in the high and
moderate workload conditions, ANOVA would not be the most appropriate
analysis; regression analysis, however, would still be appropriate.
Regression analysis also has some predictive value that analysis of variance does
not. Regression analysis is often used to project from the data obtained in an
experiment to situations that were not included in the test. In our hypothetical
example, regression analysis would be appropriate if communication delays of 0
msec., 250 msec., and 500 msec. were tested and we wanted an estimate of the
number of step-ons that could be expected at delays of 300 or 600 msec. When
using regression analysis in this way, it is important to remember three points.
First, the projection can only be as good as the fit of the data to the
mathematical function. Second, all other things being equal, an estimate
between two data points inspires more confidence than a projection beyond
(above or below) the values included in the test. Third, confidence in the
projection decreases as the distance between the hypothetical or projected point
and the value that was included in the test increases.
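As a sketch of the projection described above, the following Python fragment fits a straight line by least squares to step-on counts at delays of 0, 250, and 500 msec. and estimates the counts at 300 and 600 msec. The step-on counts are invented, and a straight line is only one of the functions that could be fitted.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data: two sessions at each tested delay.
delays = [0, 0, 250, 250, 500, 500]   # msec
step_ons = [3, 5, 6, 8, 10, 12]       # step-ons per hour

slope, intercept = fit_line(delays, step_ons)

def predict(delay):
    return intercept + slope * delay

# Goodness of fit: proportion of variance accounted for by the line.
mean_y = sum(step_ons) / len(step_ons)
ss_total = sum((y - mean_y) ** 2 for y in step_ons)
ss_residual = sum((y - predict(x)) ** 2 for x, y in zip(delays, step_ons))
r_squared = 1 - ss_residual / ss_total

for delay in (300, 600):
    inside = min(delays) <= delay <= max(delays)
    kind = "between tested values" if inside else "beyond tested values"
    print(f"{delay} msec: about {predict(delay):.1f} step-ons per hour ({kind})")
print(f"fit (r squared) = {r_squared:.2f}")
```

The estimate at 300 msec. lies between tested delays, while the estimate at 600 msec. is a projection beyond them and deserves less confidence.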
Statistical vs. Operational Significance

A final note about data analysis concerns the differences between statistically
significant and operationally significant results. Most statisticians only seriously
consider results that are statistically significant at the .05 level or better. This
enables the investigator to be reasonably certain that the findings were not due
to chance. A statistically significant difference may, however, be very small as
long as it is consistent. This may or may not be operationally useful. This
difference between statistical significance and operational significance is often
overlooked. A difference in response times of half of a second may be
statistically significant, but may not be operationally important, depending upon
the task.
On the other hand, when the experimental focus is actual operations, results
that are not statistically significant at the .05 level may still be important. For
example, if the focus of the experiment is serious operator errors that could
significantly affect flight safety, then we may choose to conservatively consider
results that are statistically significant only at the .1 level. The standard criteria
for acceptance of statistical significance at the .05 level should not be used to
ignore potentially interesting findings. It may also be the case that statistically
significant results would be attainable with a more powerful test or change in
research design (e.g., by utilizing better experimental controls or by increasing
the number of subjects). The decision as to what level of significance is to be
used should be dependent on the nature of the question that the test is
designed to answer.

REFERENCES
Chafteu 1-4
Albers, J. (1975) Intfudioun of Color. New Haven, CT: Yale University Press.
Alpern, M. (1971) Effector mechanisms in vision. In J.W. Kling & L.A. Riggs
(Eds.) Fjpfjw muta PIdwho
. New York: Holt, Rinehart & Winston, 367-394.
Appelle, S. (1972) Perception and discrimination as a function of stimulus
orientation: The oblique effect in man and animals. Pycholoical BuRlad, 78,
266-278.
Bedford, R.E. & Wyszecki, G. (1958) Wave length discrimination for point
sources. Jouin of he (4calSociety ofAnueca, 48, 129-135.
Bik~sy, G. & Rosenblith, WA. (19S1) The mechanical properties of the ear. In
S.S. Stevens (Ed.), Handbook of EpVabnental Psychoiqo. New York: John Wiley
& Sons, Inc., 1075-1115.
Blakemore, C.B. & Sutton, P. (1969) Size adaptation: A new aftereffect. Sa"e,
166, 245-247.
Boettner, EA. (1967) SpecbW Thamiguia of the Eye (data only). Report for
contract AF41(609)-2966. University of Michigan. Prepared for USAF School of
Aerospace Medicine, Aerospace Medical Division (AFSC), Brooks Air Force Base,
Texas.
Bouman, MA. & Walraven, P.L (1957) Some color naming experiments for red
and green monochromatic lights. kounal of the OpIcal Soety of Ameica, 47,
834-839.
Boynton, R.M (1978) Color in contour and object perception. In E.C. Carterette
& M.P. Friedman (Eds.), Handbiok of Parqim.New York: Academic Press.
Bowmaker, J.L., Dartnafl, HJ.A. & Mollon, J.D. (1980) Microspectrophotometric
demonstration of four classes of photoreceptors in an old world primate Macaca
fascicularis. .kwual ofaPhjwioiog, 298, 131-143.
Brown, J.F, (1931) The thresholds for visual velocity. PIyhokohe Foauzmng,
14, 249-268.

R-1

H-uman FWact= for Fiht Deck Crdficat

Peuun

Brown, J.L (1965) Flicker and intermittent stimulation. In C.H. Graham (Ed.),
Hsi=an
udmal cq m. New York: John Wiley & Sons, Inc., 251-320.
Campbell, F.W. & Robson, J.G. (1968) Application of Fourier analysis to the
visibility of gratings. Joana of APWio, 197, 551-566.
Campbell, F.W. & Westheimer, G. (1960) Dynamics of accommodation responses
of the human eye. Jounalof Phjysiol,
151, 285-295.
Campbell, F.W. & Wurtz, R.H. (1978) Saccadic omission: Why we do not see a
grey-out during a saccadic eye movement. ViMan Rawrh, 18, 1297-1303.
Carter, R.C. (1979) Visual search and color coding. rovceedkV of die Human
Facds Socdiety 23rd Annual Meetg, 369-373.
Chipanis, A. (1965) Color names for color space. Amerkan Scientht, 53,
327-346.
Christ, R.E. (1975) Review and analysis of color coding research for visual
displays. Human Factors, 17, 542-570.
Cole, B.L. & Macdonald (1988) Defectiv Color Vi.uon Can Impede Informaio
Aapaitianfrom Colour Coded Video Dijya. Department of Aviation Report
No. 3, Melbourne, Australia.
Coren, S., Porac, C., & Ward, L.M. (1984) Sewation and PI&qion. (2nd edit.)
Orlando, FL: Harcourt Brace & Company.
Cornsweet, T.N. (1970) V'aud hnqio.
Company.

Orlando, FL: Harcourt Brace &

C.Qaik, F.I.M. & Lockhart, RIS. (1972) Levels of processing: A framework for
memory research. Journal of Verbal Learbng and Vera/ Bdeaviour, 11, 671-684.
Davis, R.G. & Silverman, S.R. (1960) Heanmg and Deafueu. New York: Holt,
Rinehart & Winston.
DeHaan, W.V. (1982) The Optometrist'sand OphdiabioloSW's Guide to P7o's
iuvon. Boulder, Co: American Trend Publishing Company.
deLange, H. (1958) Research into the nature of the human fovea-cortex systems
with intermittent and modulated light. I. Attenuation characteristics with white
and colored light. Journal of de Optical Society of Ameica, 48, 777-784.

R-2

a Rdknuces
ofs
DeValois, R.L & DeValois, K.K. (1988) Spada VJiaw
University Press.

New York: Oxford

DeValois, R.L, Morgan, H. & Snodderly, D.M. (1974) Psychophysical studies of


monkey vision--Ill. Spatial luminance contrast sensitivity tests of macaque and
human observers. V's
Rnawah, 14, 75-81.
Ditchburn, RLW. (1955) Eye movements in relation to retinal action. Opica
Aca, 1, 171-176.
Dowling, J.E. & Boycott, B.B. (1966) Organization of the primate retina:
Electron microscopy. Phme
of the Roya Sockey of Lnda, Series B, 166,
80-111.
Durlach, I. & Colburn, H.S. (1978) Binaural phenomena. In E.C. Carterette &
e
. New York: Academic Press,
M.P. Friedman (Eds.), Handbno of
365-466.
Ericsson, K.A & Faivre, lA. (1988) What's exceptional about exceptional
abilities? In L.K. Obler & D. Fein (Eds.), The ErceptonalBrnin- Neumopychology
of Taent and Special Abifites, 436-473.
Evans, R.M. (1948) An I1ntrdudi
Inc.

to Color. New York: John Wiley & Sons,

Fiorentini, A., Baumgartner, G., Magnussen, S., Schiller, P.H. & Thomas, J.P.
(1990) The perception of brightness and darkness: Relations to neuronal
receptive fields. In L. SpflLmann & J.S. Werner (Eds.), Vfulal
-tT7e
NeurraphysiolicalFoundadonm. New York: Academic Press, 129-161.
Fletcher, H. (1953) Speech and Hearing in Cmnrnuicado. (2nd edit.) New
York: Van Nostrand Company.
Fletcher, H. & Munson, W.A., (1933) Loudness, its definition, measurement and
calculation Journalof the Acoustcal Society of America, 5, 82-108.
Frisby, J.P. & Mayhew, J.E.W. (1976) Rivalrous texture stereograms. Natwr,
264, 53-56.
Fuld, K., Werner, J.S. & Wooten, B.R. (1983) The possible elemental nature of
brown. Vuion Reearch, 23, 631-637.
Gibson, J.J. (1966) The Sew Cousidered as Percepual Sywteu. Boston:
Houghton Mifflin.
R-3

Human Factors for Mwht Deck Certification Pemonnel


Ginsburg, A.P., Evans, D.W., Sekuler, RL & Harp, S.A. (1982) Contrast sensitivity
predicts pilots' performance in aircraft simulators. American Journal of Optomeby

& Phydoklc

Opdcx, 59, 105-108.

Goldstein, E.B. (1984) Sensation and PcaqgEon. Belmont, CA: Wadsworth


Publishing Company, (2nd ed.).
Graham, C.H. (1965a) Some fundamental data. In C.H. Graham (Ed.), Vuion
and Viual Percepton. New York: John Wiley & Sons, Inc., 68-80.
Graham, C.H. (1965b) Perception of movement. In C.H. Graham (Ed.), VIsion
and Visual Pereqfi, New York: John Wiley & Sons, Inc., 575-588.
Graham, C.H. & Brown, J.L. (1965) Color contrast and color appearances:
Brightness constancy and color constancy. In C.H. Graham (Ed.), Vision and
Viuual Peaepton. New York: John Wiley & Sons, Inc., 452-478.
Ham, W.T., Mueller, H.A., Ruffolo, J.J., Guerry, D. & Guerry, R.K. (1982) Action
spectrum for retinal injury from near-ultraviolet radiation in the aphakic
monkey. American Journal of Ophthalmology, 93, 299-306.
Hecht, S. (1934) Vision: II. The nature of the photoreceptor process. In C.
Murchison (Ed.), A Handbook of General Experimental Psychology. Worcester,
MA: Clark University Press, 704-828.
Hecht, S. & Smith, E.L. (1936) Intermittent stimulation by light: VI. Area and
the relation between critical frequency and intensity. Journal of General
Physiology, 19, 979-989.
Hecht, S. & Verrijp, C.D. (1933) Intermittent stimulation by light. 11. The
relation between intensity and critical fusion frequency for different retinal
locations. Journal of General Physiology, 17, 251-265.
Helps, E.P.W. (1973) Physiological effects of aging. Proceedings of the Royal
Society of Medicine, 66, 815-818.
Hering, E. (1920; translation published 1964) Outlines of a Theory of the Light
Sense. (trans. by L.M. Hurvich and D. Jameson; originally published in 1920).
Cambridge, MA: Harvard University Press.


Higgins, K.E., Jaffe, M.J., Caruso, R.C. & deMonasterio, F.M. (1988) Spatial
contrast sensitivity: Effects of age, test-retest, and psychophysical method.
Journal of the Optical Society of America A, 5, 2173-2180.

Hunt, G. (1982) Cathode ray tubes. In R.D. Bosman (Ed.) Modern Display
Technologies and Applications. AGARD Advisory Report No. 169, 37-53.
Hurvich, L.M. (1981) Color Vision. Sunderland, MA: Sinauer Associates.

Iavecchia, J.H., Iavecchia, H.P. & Roscoe, S.N. (1988) Eye accommodation to
head-up virtual images. Human Factors, 30, 689-702.
Jameson, D. & Hurvich, L.M. (1972) Color adaptation: Sensitivity, contrast,
after-images. In D. Jameson & L.M. Hurvich (Eds.), Handbook of Sensory
Physiology, Vol. VI/4 (pp. 568-581). Berlin: Springer-Verlag.
Judd, D.B. (1951) Basic correlates of the visual stimulus. In S.S. Stevens (Ed.),
Handbook of Experimental Psychology. New York: John Wiley & Sons, Inc.,
811-867.
Julesz, B. (1971) Foundations of Cyclopean Perception. Chicago: University of
Chicago Press.
Kaiser, P.K. (1968) Color names of very small fields varying in duration and
luminance. Journal of the Optical Society of America, 58, 849-852.
Kardon, D. (1989) Electroluminescent backlights for liquid crystal displays.
Information Display, 6, 17-21.
Kaufman, L. (1974) Sight and Mind: An Introduction to Visual Perception. New
York: Oxford University Press.
Kaufman, L. & Rock, I. (1962) The moon illusion. Scientific American, 207,
120-132.
Kelly, D.H. (1972) Adaptation effects on spatiotemporal sine wave thresholds.
Vision Research, 12, 89-101.
Kelly, D.H. (1974) Spatio-temporal frequency characteristics of color-vision
mechanisms. Journal of the Optical Society of America, 64, 983-990.
Kinney, J.A.S. (1983) Brightness of colored self-luminous displays. Color
Research and Application, 8, 82-89.
Klingberg, C.S., Elworth, C.S. & Filleau, C.R. (1970) Image quality and
detection performance of military photointerpreters. Boeing Company Report
D162-10323-1.

Krupka, D.C. & Fukui, H. (1973) The determination of relative critical flicker
frequencies of raster-scanned CRT displays by analysis of phosphor persistence
characteristics. Proceedings of the Society for Information Display, 14, 89-91.
Kuo, K.H. & Kalmanash, M.H. (1984) Automatic chrominance compensation for
cockpit color displays. SID 84 Digest, 65-67.
Kurtenbach, W., Sternheim, C.E. & Spillmann, L. (1984) Change in hue of
spectral colors by dilution with white light (Abney effect). Journal of the Optical
Society, A, 1, 365-372.
Laycock, J. (1982) Recommended colours for use in airborne displays. Technical
Report 82110, Royal Aircraft Establishment, Farnborough, UK.
Leibowitz, H.W. (1983) A behavioral and perceptual analysis of grade crossing
accidents. Operation Lifesaver National Symposium 1982. Chicago: National
Safety Council.
Mach, E. (1865) Über die Wirkung der räumlichen Verteilung des Lichtreizes auf
die Netzhaut. Sitzungsberichte der mathematisch-naturwissenschaftlichen Classe
der kaiserlichen Akademie der Wissenschaften, 52, 303-322.
Marc, R. E. & Sperling, H. G. (1977) Chromatic organization of primate cones.
Science, 196, 454-456.
Matlin, M.W. (1983) Sensation and Perception. (2nd edit.) Boston: Allyn and
Bacon.

McCollough, C. (1965) Color adaptation of edge-detectors in the human visual
system. Science, 149, 1115-1116.
Meier, G. (1982) Electrochemical displays. In R.D. Bosman (Ed.) Modern Display
Technologies and Applications. AGARD Advisory Report No. 169, 165-180.
Mikoshiba, S. (1988) Plasma displays. Information Display, 4, 14-18.
Murch, G.M. & Huber, J. (1982) Colour--the logical next step. New Electronics,
15, 31-32.
Newman, C.V., Whinham, E.A. & MacRae, A.W. (1973) The influence of texture
on judgment of slant and relative distance in a picture with suggested depth.
Perception & Psychophysics, 14, 280-284.

Nickerson, D. & Newhall, S.M. (1943) A psychological color solid. Journal of the
Optical Society of America, 33, 419-422.
Osterberg, G. (1935) Topography of the layer of rods and cones in the human
retina. Acta Ophthalmologica, Supplement 6.
Owsley, C., Sekuler, R. & Siemsen, D. (1983) Contrast sensitivity throughout
adulthood. Vision Research, 23, 689-699.
Pearlman, A.L., Birch, J. & Meadows, J.C. (1979) Cerebral color blindness: An
acquired defect in hue discrimination. Annals of Neurology, 5, 253-261.
Pettigrew, J.D. (1972) The neurophysiology of binocular vision. Scientific
American, 227, 84-96.
Pitts, D.G. (1982) The effects of aging on selected visual functions: Dark
adaptation, visual acuity, stereopsis and brightness contrast. In R. Sekuler, D.W.
Kline and K. Dismukes (Eds.), Aging and Human Visual Function. New York:
Liss.
Pokorny, J., Smith, V.C., Verriest, G. & Pinckers, A.J.L.G. (1979) Congenital and
Acquired Color Vision Defects. New York: Grune & Stratton.
Polyak, S. (1941) The Retina. Chicago: University of Chicago Press.
Purdy, D.M. (1931) Spectral hue as a function of intensity. American Journal of
Psychology, 43, 541-559.
Regan, D., Beverly, K. & Cynader, M. (1979) The visual perception of motion in
depth. Scientific American, 241, 136-151.
Richards, W. (1970) Stereopsis and stereoblindness. Experimental Brain Research,
10, 380-388.
Riggs, L.A. Visual acuity. In C.H. Graham (Ed.), Vision and Visual Perception.
New York: John Wiley & Sons, Inc., 321-349.
Riggs, L.A., Ratliff, F., Cornsweet, J.C. & Cornsweet, T.N. The disappearance of
steadily fixated visual test objects. Journal of the Optical Society of America, 43,
495-501.
Riggs, L.A., Volkmann, F.C. & Moore, R.K. (1981) Suppression of the blackout
due to blinks. Vision Research, 21, 1075-1079.

Rovamo, J., Virsu, V. & Näsänen, R. (1978) Cortical magnification factor
predicts the photopic contrast sensitivity of peripheral vision. Nature, 271,
54-56.
Schlam, E. (1988) Thin-film electroluminescent displays. Information Display, 4,
10-13.
Sekuler, R. (1974) Spatial vision. Annual Review of Psychology, 25, 195-232.

Sekuler, R. & Blake, R. (1985) Perception. New York: McGraw Hill, Inc.
Sherr, S. (1979) Electronic Displays. New York: John Wiley & Sons, Inc.
Snyder, H.L. (1980) Human Visual Performance and Flat Panel Display Image
Quality. Technical Report HFL-80-1/ONR-80-1.
Snyder, H.L. (1988) Image Quality. In M. Helander (Ed.), Handbook of
Human-Computer Interaction. Amsterdam: Elsevier Science Publishers.
Society of Automotive Engineers (1988) Human Engineering Considerations in the
Application of Color to Electronic Aircraft Displays. ARP4032. Warrendale, PA:
SAE.
Stokes, A.F. & Wickens, C.D. (1988) Aviation displays. In E.L. Wiener & D.C.
Nagel (Eds.), Human Factors in Aviation. New York: Academic Press.
Taylor, C.A. (1965) The Physics of Musical Sounds. New York: Elsevier.
Teichner, W.H. (1979) Color and visual information coding. Proceedings of the
Society for Information Display, 20, 3-9.


Tolin, P. (1987) Maintaining the three-dimensional illusion. Information Display,
3, 10-12.
Turnage, R.E. (1966) The perception of flicker in cathode ray tube displays.
Information Display, 3, 38.
Uchikawa, H., Kaiser, P. K. & Uchikawa, K. (1982) Color-discrimination
perimetry. Color Research and Application, 7, 264-272.
Verriest, G. (1963) Further studies on acquired deficiency of color
discrimination. Journal of the Optical Society of America, 53, 185-195.

Vision Committee, National Research Council (1981) Procedures for Testing
Color Vision. Report of Working Group 41. Washington, D.C.: National
Academy Press.
Viveash, J.P. & Laycock, J. (1983) Computation of the resultant chromaticity
coordinates and luminance of combined and filtered sources in display design.
Displays, 4, 17-23.
Volbrecht, V.J., Aposhyan, H.M. & Werner, J.S. (1988) Perception of electronic
display colours as a function of retinal illuminance. Displays, 9, 56-64.
Volkmann, F.C. (1962) Vision during voluntary saccadic eye movements. Journal
of the Optical Society of America, 52, 571-578.

Vos, J. J. & Walraven, P. L. (1971) On the derivation of the foveal cone
primaries. Vision Research, 11, 799-818.
Wald, G. (1945) Human vision and the spectrum. Science, 101, 653-658.
Walraven, J. (1985) The colours are not on the display: a survey of
non-veridical perceptions that may turn up on a colour display. Displays, 6,
35-42.
Walraven, J., Enroth-Cugell, C., Hood, D.C., MacLeod, D.I.A. & Schnapf, J.L.
(1990) The control of visual sensitivity: Receptoral and postreceptoral processes.
In L. Spillmann & J.S. Werner (Eds.), Visual Perception: The Neurophysiological
Foundations. New York: Academic Press, 53-101.
Walraven, P.L. (1962) On the Mechanisms of Colour Vision. Ph.D. Thesis,
University of Utrecht.

Ward, W.D. & Glorig, A. (1961) A case of fire-cracker induced hearing loss.
Laryngoscope, 71, 1590-1596.
Weale, R.A. (1982) A Biography of the Eye. London: H.K. Lewis & Co.
Weale, R.A. (1988) Age and the transmittance of the human crystalline lens.
Journal of Physiology, 395, 577-587.
Welch, R.B. & Warren, D.H. (1980) Immediate perceptual response to
intersensory discrepancy. Psychological Bulletin, 88, 638-667.

Werner, J.S. (1982) Development of scotopic sensitivity and the absorption
spectrum of the human ocular media. Journal of the Optical Society of America,
72, 247-258.
Werner, J.S., Cicerone, C.M., Kliegl, R. & DellaRosa, D. (1984) Spectral
efficiency of blackness induction. Journal of the Optical Society of America, 1,
981-986.
Werner, J.S., Peterzell, D. & Scheetz, A.J. (1990) Light, vision, and aging -- A
brief review. Optometry and Vision Science, 67, 214-229.
Werner, J.S. & Schlesinger, I. (1991) Psychology: Science of Mind, Brain, and
Behavior. New York: McGraw-Hill.
Werner, J.S. & Steele, V.G. (1988) Sensitivity of human foveal color
mechanisms throughout the life span. Journal of the Optical Society of America,
A, 5, 2122-2130.
Werner, J.S. & Walraven, J. (1982) Effect of chromatic adaptation on the
achromatic locus: The role of contrast, luminance and background color. Vision
Research, 22, 929-943.
Werner, J.S. & Wooten, B.R. (1979) Opponent-chromatic mechanisms: Relation
to photopigments and hue naming. Journal of the Optical Society of America, 69,
422-434.
Wertheimer, M. (1912) Experimentelle Studien über das Sehen von Bewegung.
Zeitschrift für Psychologie, 61, 161-265.
Westheimer, G. (1964) Pupil size and visual resolution. Vision Research, 4,
39-45.
Wheatstone, C. (1838) Some remarkable phenomena of binocular vision.
(Reprinted in 1964) In W.N. Dember (Ed.), Visual Perception: The Nineteenth
Century. New York: John Wiley & Sons, Inc., 114-129.
Williams, R.D. & Garcia, F. (1989) Volume visualization displays. Information
Display, 5, 8-10.
Wilson, H.R., Levi, D., Maffei, L., Rovamo, J. & DeValois, R. (1990) The
perception of form: Retina to striate cortex. In L. Spillmann & J.S. Werner
(Eds.), Visual Perception: The Neurophysiological Foundations. New York:
Academic Press, 231-272.

Wooten, B.R. (1984) The effects of successive chromatic contrast on spectral
hue. In L. Spillmann & B.R. Wooten (Eds.), Sensory Experience, Adaptation, and
Perception. (pp. 471-494) Hillsdale, NJ: Lawrence Erlbaum Associates.
Wright, W.D. (1946) Researches on Normal and Defective Colour Vision. London:
Henry Kimpton.
Wright, W.D. & Pitt, F.H.G. (1934) Hue discrimination in normal colour-vision.
Proceedings of the Physical Society, 46, 459-473.
Wurtz, J.E. (1989) The not-so-amazing survival of the CRT. Information Display,
5, 5-18.
Wyszecki, G. & Stiles, W.S. (1982) Color Science: Concepts and Methods,
Quantitative Data and Formulae. (2nd ed.) New York: John Wiley & Sons, Inc.
Yarbus, A.L. (1967) Eye Movements and Vision. New York: Plenum Publishing
Corp.
Young, R.W. (1988) Solar radiation and age-related macular degeneration.
Survey of Ophthalmology, 32, 252-269.
Zelman, S. (1973) Correlation of smoking history with hearing loss. Journal of
the American Medical Association, 223, 920.
Zihl, J., von Cramon, D. & Mai, N. (1983) Selective disturbance of movement
vision after bilateral brain damage. Brain, 106, 313-340.
Zrenner, E., Abramov, I., Akita, M., Cowey, A., Livingstone, M. & Valberg, A.
(1990) Color perception: Retinal, geniculate, and cortical mechanisms. In L.
Spillmann & J.S. Werner (Eds.), Visual Perception: The Neurophysiological
Foundations. New York: Academic Press, 163-203.
Zwicker, E. (1958) Über psychologische und methodische Grundlagen der
Lautheit. Acustica, 8, 237-258.
Chapter 5
Bransford, J. D., Barclay, J. R. & Franks, J.J. (1972) Sentence memory: A
constructive versus interpretive approach. Cognitive Psychology, 3, 193-209.
Craik, F. & Lockhart, R. (1972) Levels of Processing: A Framework for Memory
Research. Journal of Verbal Learning and Verbal Behavior, 11, 671-684.
James, W. (1890) The Principles of Psychology. New York: Holt.
Johnston, W. A., & Heinz, S. P. (1978) Flexibility and capacity demands of
attention. Journal of Experimental Psychology: General, 107, 420-435.
Kahneman, D. (1973) Attention and Effort. Englewood Cliffs, NJ: Prentice Hall.
Miyawaki, K., Strange, W., Verbrugge, R., Liberman, A.M., Jenkins, J.J., &
Fujimura, O. (1975) An effect of linguistic experience: The discrimination of [r]
and [l] by native speakers of Japanese and English. Perception & Psychophysics,
18, 331-340.
Nadler, E., Mengert, P., Sussman, E.D., Grossberg, M., Salomon, A., & Walker,
K., (unpublished manuscript). Effect of Binaural Delays in the Communication
Channel Linking Radar and Data Controller.
Neisser, U. (1967) Cognitive Psychology. New York: Appleton-Century-Crofts.
Neisser, U. & Becklen, R. (1975) Selective Looking: Attending to visually
specified events. Cognitive Psychology, 7, 480-494.
Palmer, S. E. (1975) The effects of contextual scenes on the identification of
objects. Memory and Cognition, 3, 519-526.
Peterson, G. E. & Barney, H.L. (1952) Control methods used in the study of
vowels. Journal of the Acoustical Society of America, 24, 175-184.
Reicher, G. M. (1969) Perceptual recognition as a function of meaningfulness of
stimulus material. Journal of Experimental Psychology, 81, 275-280.
Scharf, B., Quigley, S., Aoki, C., Peachey, N., & Reeves, A. (1987) Focused
attention and frequency selectivity. Perception & Psychophysics, 42, 215-221.
Sperling, G. S. (1960) The information available in brief visual presentations.
Psychological Monographs, 74, 1-29.
Tsal, Y. (1983) Movements of attention across the visual field. Journal of
Experimental Psychology: Human Perception and Performance, 9, 523-530.
Warren, R.M. (1970) Perceptual Restoration of Missing Speech Sounds. Science,
167, 392-393.
Warren, R.M. & Obusek, C.J. (1971) Speech Perception and Phonemic
Restorations. Perception and Psychophysics, 9 (38), 358-362.
Weisstein, N. & Harris, C. S. (1974) Visual detection of line segments: An
object-superiority effect. Science, 186, 752-755.
Chapters 6-8
Aldrich, T.B., Szabo, S.M., & Bierbaum, C.R. (1989) The development and
application of models to predict operator workload during system design. In
G.R. McMillan, D. Beevis, E. Salas, M.H. Strub, R. Sutton, & L. Van Breda
(Eds.), Applications of Human Performance Models to System Design, New York:
Plenum Publishing Corp., 65-80.
Allnut, M.F. (1987). Human factors in accidents. British Journal of Anaesthesia,
59, 856-864.
Andre, A.D. & Wickens, C.D. (1990). Display control compatibility in the
cockpit: guidelines for display layout analysis. Tech Report ARL-90-12/NASA-A3I-90-1.
Savoy, Illinois: University of Illinois Aviation Research Lab.
Braune, R.J. (1989) The common/same type rating: human factors and other
issues. Paper No. 892229, Warrendale, PA: SAE.
Cooper, G.E. & Harper, R.P., Jr. (1969) The use of pilot rating in the evaluation
of aircraft handling qualities. NATO: AGARD-AG-567 Paris, France (DTIC).
Czeisler, C.A., Weitzman, E.D., Moore-Ede, M.C., Zimmerman, J.C., & Knauer,
R.S. (1980) Human sleep: its duration and organization depend on its circadian
phase. Science, 210, 1264-1267.
Degani, A. & Weiner, E.L. (1990) Human factors of flight deck checklists: the
normal checklist. NASA Contractor Report 177549, NASA Ames Research Center,
Moffett Field, CA.
Desmond, J. (1986) Improvements in aircraft safety and operational
dependability from a projected flight path guidance display. Paper No. 861732,
Warrendale, PA: SAE.
Farmer, E.W. & Green, R.G. (1985) The sleep-deprived pilot: performance and
EEG response. 16th Conference for Western European Association for Aviation
Psychology, Helsinki, 1985, 24-285.
Fischer, E., Haines, R. & Price, T. (1980, December) Cognitive Issues in Head-Up
Displays, NASA Technical Paper 1171, NASA, Washington, DC.

Gopher, D. (1991) The skill of attention control: acquisition and execution of
attention strategies. In D. Meyer & S. Kornblum (Eds.), Attention and
Performance XIV, Hillsdale, NJ: Erlbaum Associates.
Graeber, R.C. (1988) Aircrew fatigue and circadian rhythmicity. In E. Weiner &
D. Nagel (Eds.), Human Factors in Aviation, New York: Academic Press, 305-344.
Graeber, R.C. (1988, 1989) Jet lag and sleep disruption. In M.H. Kryger, T.
Roth, & W.C. Dement (Eds.), Principles and Practice of Sleep Medicine,
Philadelphia, PA: W.B. Saunders Company, 324-331.
Griffin & Rockwell (1989) The confirmation bias is supported by expectancy.
What you expect to see helps you confirm what you believe your state is. A
major concern in private pilot aviation is continued flight into deteriorating
weather. This has been documented by research at Ohio State University
(Griffin & Rockwell, 1989).
Groce, J.L. & Boucek, G.P., Jr. (1987). Air transport crew tasking in an ATC
data link environment. Paper No. 871764. Warrendale, PA: SAE.
Hart, S.G. (1989) Crew workload management strategies: a critical factor in
system performance. Fifth Annual Symposium on Aviation Psychology, Columbus,
OH.
Hartzell, E.J., Dunbar, S., Beveridge, R. & Cortilla, R. (1982) Helicopter pilot
response latency as a function of the spatial arrangement of instruments and
controls. Proceedings of the 28th Annual Conference on Manual Control,
Wright-Patterson AFB, Dayton, OH.
Hawkins, F.H. (1987) Human Factors in Flight, Brookfield, VT: Gower Technical
Press.
Hyman, R. (1953) Stimulus information as a determinant of reaction time.
Journal of Experimental Psychology, 45, 423-432.
Jensen, R.J. & Benel, R. (1977) Judgment evaluation and instruction in civil
pilot training. Final Report FAA-RD-78-24, Springfield, VA, National Technical
Information Service.
Johnson, S.L. & Roscoe, S.N. (1972) What moves, the airplane or the world?
Human Factors, 14, 107-129.

Kahneman, D., Slovic, P., & Tversky, A. (Eds.), (1982) Judgment Under
Uncertainty: Heuristics and Biases, New York: Cambridge University Press.
Klein, G.A. (1989a) Do decision biases explain too much? Human Factors
Society Bulletin, 32, 1-3.
Klein, G.A. (1989b) Recognition-primed decisions. In W. Rouse (Ed.) Advances
in Man-Machine Systems Research, Vol. 5, Greenwich, CT: JAI Press, 47-92.
Klein, K.G., Wegmann, H.M. & Hunt, B.I. (1972) Desynchronization of body
temperature and performance circadian rhythms as a result of outgoing and
homegoing transmeridian flights. Aerospace Medicine, 43, 119-132.
Nagel, D.C. (1988) Human error in aviation operations. In E. Weiner & D.
Nagel (Eds.), Human Factors in Aviation, New York: Academic Press, 263-303.
Newman, R.L. (1987) Improvement of head-up display standards. Volumes I, II,
III. (AFWAL-TR-3055). Wright-Patterson Air Force Base, OH: Flight Dynamics
Laboratory.
Norman, D. (1988) The Psychology of Everyday Things, New York: HarperCollins.
North, R.A., & Riley, V.A. (1989) A predictive model of operator workload. In
G.R. McMillan, D. Beevis, E. Salas, M.H. Strub, R. Sutton, & L. Van Breda
(Eds.), Applications of Human Performance Models to System Design, New York:
Plenum Publishing Corp., 81-99.
Parks, D.L. & Boucek, G.P., Jr. (1989) Workload prediction, diagnosis and
continuing challenges. In G.R. McMillan et al. (Eds.), Applications of Human
Performance Models to System Design, New York: Plenum Publishing Corp., 47-63.
Reason, J. (1990) Human Error, New York: Cambridge University Press.
Richardson et al. (1982) Circadian Variation in Sleep Tendency in Elderly and
Young Subjects. Sleep, 5 (suppl. 2), A.P.S.S., 82.
Roscoe, A.H. (1987). The practical assessment of pilot workload. NATO:
AGARD-AG-282. Loughton, U.K.: Specialized Printing Services Ltd.
Roscoe, S.N. (1968) Airborne displays for flight and navigation. Human Factors,
10, 321-322.

Steenblik, J.W. (1989, December) Alaska Airlines' HGS. Air Line Pilot, 10-14.
Vicente, K.J., Thornton, D.C., & Moray, N. (1987) Spectral analysis of sinus
arrhythmia: a measure of mental effort. Human Factors, 29(2), 171-182.
Wegman, H.M., Gundel, A., Naumann, M., Samel, A., Schwartz, E. & Vejvoda,
M. (1986). Sleep, sleepiness, and circadian rhythmicity in aircrews operating on
transatlantic routes. Aviation, Space, and Environmental Medicine, 57,
(12, suppl), B53-B64.
Weiner, E.L. (1988) Cockpit automation. In E.L. Weiner & D.C. Nagel (Eds.),
Human Factors in Aviation, New York: Academic Press, 263-303.
Weinstein, L.F. (1990) The reduction of central-visual overload in the cockpit.
Proceedings of the 12th Symposium on Psychology in the Department of Defense,
U.S. Air Force Academy, Colorado Springs.
Weintraub, D.J. & Ensing, M. (1992) Human Factors Issues in Head-Up Display
Design: The Book of HUD. CSERIAC. Wright-Patterson Air Force Base, OH: Crew
System Ergonomics Information Analysis Center (CSERIAC).
Weintraub, D.J., Haines, R.F. & Randall, R.J. (1984) The utility of head-up
displays: eye focus vs. decision times. Proceedings of the 28th Annual Meeting of
the Human Factors Society, 1, 529-533.
Wickens, C.D. (1984). Engineering Psychology and Human Performance (1st
edit.), New York, Harper/Collins.
Wickens, C.D. (1992). Engineering Psychology and Human Performance (2nd
edit.), New York, Harper/Collins.
Wickens, C.D. & Flach, J. (1988) Human Information Processing, in E. Weiner &
D. Nagel (Eds.), Human Factors in Aviation, 111-155, New York: Academic
Press.
Wickens, C.D., Martin-Emerson, R., & Larish, I. (1993) Attentional tunneling
and the head-up display. Proceedings from the 7th International Symposium on
Aviation Psychology. Columbus, Ohio: Ohio State University, Department of
Aviation.
Wickens, C.D., Stokes, A.F., Barnett, B. & Hyman, F. (1988). Stress and pilot
judgment: an empirical study using MIDIS, a microcomputer-based simulation.
Proceedings of the 32nd Meeting of the Human Factors Society, Human Factors
Society, Santa Monica, CA.
Wilson, G.F., Skelly, J., & Purvis, G. (1989). Reactions to emergency situations
in actual and simulated flight. In Human Behavior in High Stress Situations in
Aerospace Operations. NATO: AGARD-CP-458. Loughton, U.K.: Specialized
Printing Services Ltd.
Wright, P. (1977) Presenting technical information: a survey of research
findings. Instructional Science, 6, 93-134.

Chapters 9-11
Bainbridge, L. (1987) Ironies of automation. In New Technology and Human
Error. Ed. M.J. Rasmussen, K. Duncan, and J. Leplat. West Sussex, U.K.: John
Wiley & Sons Ltd.
Barnett, A. & Higgins, M.K. (1989) Airline safety: the last decade. Journal of
Management Science, 35, January 1989, 1-21.
Broadbent, D.E. (1971) Decision and Stress. London: Academic Press.
Card, J. K. (1987). Conversation with Richard Gabriel, Palo Alto, California.
Corwin, W.H., Sandry-Garza, D.L., Biferno, M.H., Boucek, G.P., Logan, A.L.,
Jonsson, J.E., & Metalis, S.A. (1989) Assessment of crew workload measurement
methods, techniques, and procedures (WRDC-TR-89-7006). Wright-Patterson Air
Force Base, Wright Research and Development Center, Dayton, Ohio.
Crossman, E.R., Cooke, J.E., & Beishon, R.J. (1974) Visual attention and
sampling of displayed information in process control. In The Human Operator in
Process Control. Ed. M. E. Edwards and J. Lees, 25-50.
Curry, R.E. (1985) The introduction of new cockpit technology: a human factors
study. NASA Technical Memorandum 86659. Moffett Field, California: NASA
Ames Research Center.
Department of Defense. (1985) Human engineering guidelines for management
information systems (DOD-HBK-761). Wright-Patterson Air Force Base, Air Force
Systems Command, Dayton, Ohio.
Fadden, D.M., & Weener, E.F. (1984) Selecting effective automation. Paper
presented at ALPA Air Safety Workshop Washington, D.C.
Gabriel, R.F. (1987) Internal memorandum. Society of Automotive Engineers
(SAE) Committee on Human Behavioral Technology.
Hilgard, E.R., & Atkinson, R.C. (1967) Introduction to Psychology. New York:
Harcourt, Brace and World.
Meister, D. (1987) A cognitive theory of design and requirements for a
behavioral design aid in system design. In Behavioral Perspectives on Designers,
Tools and Organizations. Ed. W.B. Rouse and K.R. Boff. New York: North
Holland Publications.
Nagel, D.C. (1987) Aviation safety: needs for human factors research.
Presentation to Air Transport Association (ATA) Airline Operations Forum, 19-21
October, Annapolis, Maryland.
Norman, S.D. & Orlady, H.W. (Eds.) (1988) Flight Deck Automation: Promises
and Realities. Final Report of a NASA/FAA/Industry Workshop. Moffett Field,
California: NASA Ames Research Center.
Rogers, Logan, & Boley (1989) Classification and reduction of pilot error. NASA
CR-181867, DOT/FAA/DS-89/24. Hampton, VA: NASA.
Roth, E.M., Bennett, K.B., & Woods, D.D. (1987) Human interaction with an
intelligent machine. International Journal of Man-Machine Studies (27).
Rouse, W.B. (1977) Human-Computer interaction in multitask situations. IEEE
Transactions on Systems, Man and Cybernetics, 384-392.
Rouse, W.B., & Boff, K.R. (1987) System design. In Behavioral Perspectives on
Designers, Tools, and Organizations. New York: North Holland Publications.
Sheridan, T.B. (1980) Computer control and human alienation. Technology
Review, 61-73.
Sinaiko, H.W. (1972) Human intervention and full automation in control
systems. Applied Ergonomics, 3-7.
Society of Automotive Engineers (1990) Human interface design methodology for
integrated display symbology. ARP 4155. Warrendale, PA: SAE.
Webster's New World Dictionary, (1970) Guralnik, D.B. (Ed.) New York: World
Publishing Co.

Wiener, E.L. (1985) Human factors of cockpit automation: A field study of
flight crew transition. NASA Contractor Report 177333. Moffett Field, California:
NASA Ames Research Center.

Woods, D.D. (1987) Technology alone is not enough: reducing the potential for
disaster in risky technologies. Pittsburgh, Pennsylvania: Westinghouse Research
and Development Center.

Chapter 12
Berson, B. L, Po-Chedley, D. A., Boucek, G. P., Hanson, D. C., Leffler, M. F.,
& Wasson, R. L. (1981) AircraftAletig Stms Stw ddiation Study Volume H.
Akrmu Aleing Stmem Daiu Guiddeie, Report No. DOT/FAA/RD-81/II.
Boff, K. R., & Lincoln, J. E., (Eds.) (1988) Eigbweeing Data ComperdiuHuman Pacqei and Performance, Volume III, Harry G. Armstrong Aerospace
Medical Research Laboratory, Wright-Patterson Air Force Base, Ohio, 1862.
Boucek, G. P., Erickson, J. B., Berson, B. L., Hanson, D. C., Leffler, M. F., & Po-

Chedley, D. A., (1980) Arft

Alng Ssm

Stada'diai

Sdy Phase L

Final Report. Report No. FAA/RD-80-68.


Boucek, G. P., Pfaff, T. A., White, R. W., & Smith, W. D. (1985) Traffic Alert
and Collision Avoidance System - Operational Simulation. Report No.
DOT/FAA/PM-85/10.
Boucek, G. P., Po-Chedley, D. A., Berson, B. L., Hanson, D. C., Leffler, M. F., &
White, R. W. (1981) Aircraft Alerting Systems Standardization Study, Volume I.
Candidate System Validation and Time-Critical Display Evaluation. Report No.
FAA-RD-81-381.
Boucek, G. P., White, R. W., Smith, W. D., & Kraus, J. M. (1982) Traffic Alert
and Collision Avoidance System - Developmental Simulation, Report No.
DOT/FAA/RD-82/49.
Cardosi, K. & Boole, P. (1991) Analysis of Pilot Response Time to Time-Critical
Air Traffic Control Calls. Report No. DOT/FAA/RD-91/20.
Cardosi, K. & Huntley, M. S. (1988) Cockpit and Cabin Crew Coordination.
Report No. DOT/FAA/FS-88/1.
Foushee, C., Lauber, J., Baetge, M., & Acomb, D. (1986) Crew Factors in Flight
Operations III: The operational significance of exposure to short-haul air transport
operations. NASA Technical Memorandum 88322, Moffett Field, CA.
Hopkin, D. (1980) The measurement of the air traffic controller. Human
Factors, 22(5).
Kidder, L. H. (1981) Selltiz, Wrightsman and Cook's Research Methods in Social
Relations (4th edit.), New York: Holt, Rinehart and Winston, 483.
Nadler, E.D., DiSario, R., Mengert, P., & Sussman, E.D. (1990, reprinted 1992)
A simulation study of the effects of communication delay on air traffic control,
Report No. DOT/FAA/CT-90/6.
Natrella, M. G., (1966) Experimental Statistics, National Bureau of Standards
Handbook 91, 1966.

INDEX

Absorption, 13
increase with cataracts, 19
spectra, 42
Accident data, see automation, Workload assessment
Accommodation, 16, 126, 249
as a consideration in Display design, 249
effects of aging on, 126
to a display panel, 16
to a head-up display (HUD), 16, 126
Adaptation, 8, 24, 63-65
chromatic, 63-65
to dark, 24
to sound, 8
Afterimage, see color contrast effects
Amacrine cells, see retina
Ambient noise, 8, 106
Amplitude,
spectra, 4
Angle of incidence, 14
Anomaloscope, see Color discrimination tests
Arctan, see retina
Assimilation, see color contrast effects
Attention, 96-97, 118-122, 165-169
automaticity in information display, 121
color as a focusing mechanism for, 119-120
divided, 119-121
early selection, 99
effect of display consistency on, 122
effect of display organization on, 121
effects of display clutter on, 122
electronic display issues of, 119
emergent display features of, 120
focused, 119
improvement with training in, 121
internal, 100
late selection of, 99
limited capacity of, 134
selective, 119, 121-122, 261
separate modalities in information display, 121
timesharing issues of, 165-169
two types of, 97
USS Vincennes incident, 120
Audiogram, 6
Auditory information, 1
Auditory warning, 1
Automation, 189, 205, 209-241
"deskilling" through use of, 214, 224
"knobs and dials" problems of, 225, 233
"soft" sciences and the need for human factors testing in, 237-238
accident data as a source of insight into, 215-216
advanced cockpit culture changes, 230-231
advanced cockpit task structure changes, 230-231
approaches to reducing flight deck workload through, 228
Aviation Safety Reporting System (ASRS) incident database, 217-218
Boeing philosophy of, 228-229
Cathode Ray Tube (CRT), 217, 256, 270
complacency as a result of, 224
concerns related to aviation automation, 210, 221-224
Control Display Unit (CDU), 217
crew role in averting accidents caused by equipment malfunction, 219
defined, 210
design-induced error in, 222
Douglas Aircraft Accident/Incident Database, 219
effects in non-aviation systems, 214-215
effects of Three Mile Island accident on design of, 213
effects on aviation industry, 210
envelope protection, 211, 224
experience with in aviation, 215-220
experience with in non-aviation systems, 213-214
FAA review of workload measurement literature, 236
flight crews as a primary cause for accidents, 215-216
Flight Management System (FMS), 220
general philosophies and limitations of, 231
how human factors relate to automation design, 235-237
human characteristics related to, 235-237
human factors defined, 232
human factors disciplines, 234
human-centered, 228
incident data as a source of insight into, 217-219
incident defined, 217
increased training requirements as a result of, 224
influence of crew role on design of, 229-231
intimidation by, 224
ironies of, 222
lack of human factors support by organizations, 234-235
lack of objective human factors criteria in FARs and design specifications, 235
loss of proficiency as a result of, 223
loss of situation awareness as a result of, 223
main drivers of new airliner development, 226
manufacturer use of human factors in, 227
NASA conference on cockpit automation issues, 230-232
NASA studies of advanced aircraft acceptance, 220
NASA study of CDU problems, 217-218
need for FAA human factors specialists in certification, 240
need for situation dominance in advanced cockpits, 231
nuclear power studies of, 213
office applications of, 214
operator functions for task-related activities, 230
overconfidence as a result of, 224
pilot opinion as a source for cockpit design information, 219
pilot's role defined, 229-230
problem of non-specific human factors criteria for, 239
reasons cited for, 220-221
reduced job satisfaction as a result of, 223
role of human as a systems monitor in, 235
Sheridan's ten levels of, 210
traditional design practices in, 226-227
variability in applications of, 214
weaknesses of traditional design approach, 227
workload effects in relation to cockpit design, 236
Axons, 21
Bezold Spreading Effect, see color contrast effects
Bezold-Brücke hue shift, see hue
Bifocal lenses, see presbyopia
Binaural unmasking, 9
Bipolar cells, see retina
Blind spot, see optic disc
Brightness, 24, 51
saturation, 52, 257
simultaneous brightness contrast, 51
Broadbent, Donald, 98, 235
Cataract, 19
Cathode Ray Tube (CRT), see automation
Certification, 170, 272, 301-303
human factors criteria for, 240
issues raised by new display technology, 267
methodology requirements for workload assessment, 272
need for human factors specialists in, 239-240
workload assessment considerations, 301-303
Certification, 170, 239-240, 267, 272, 301-303
for aircraft, 170
Checklist design principles, 150-151
Chromatic adaptation, 63-65
Abney effect, 65
effects of ambient light on, 64
effects of sunlight on display colors, 65, 257
on aircraft displays, 64
sensor adjustment of luminance, 65
Chromatic and achromatic colors, 53-57
CIE, 23, 65-68
color specification system, 65-68
standard observer's visibility function, 23
standard observer, 23
Cockpit automation, see automation
Cocktail party effect, 9, 98
Color appearance, 51-57
Color blindness, see color vision deficiencies
Color constancy, 13
Color contrast effects, 60-63
afterimage, 60
assimilation, 62
contrast colors, 62
simultaneous contrast, 60
successive contrast, 60
Color contrast sensitivity, 75-83
as a predictor of image quality in displays, 83
as a predictor of visual performance, 79-83
Boeing studies, 83
envelope, 76
Fourier analysis, 80
function, 75
high frequency, 76
optical blur, 76
reduction in, 77
variation with age, 78
variation with luminance, 77
variation with retinal eccentricity, 78
Color contrast, 72
threshold, 75
Color discrimination tests, 47
anomaloscope, 48
color matching, 48
Farnsworth-Munsell 100-Hue test, 47
pseudoisochromatic plates, 49
Color identification, 59
optimum number of colors for visual displays, 59
Color specification, 65-69
advantages of blue stimuli, 70
CIE spectral tristimulus value, 66
CIE spectrum locus, 68
CIE tristimulus value, 66
CIE V function, 66
constraints on use of colors, 70
implications for displays, 69-70
Munsell chroma, 69
Munsell color chip parameters, 69
Munsell system, 68-69
problems with blue stimuli, 70
search time for display items, 69-70
Color vision deficiencies, 45-51
blue-yellow color defects, 47
color blindness, 47
deuteranomaly, 45-46
deuteranopia, 45-46
diabetes-related, 47
drug-related, 47
glaucoma, 47
occurrence in pilot population, 51
protanomaly, 45-46
protanopia, 45-46
red-green color defects, 47
tritanomaly, 45-46
tritanopia, 45-46
Color, see hue
Complex sound, 2
Cones, 20, 42-43
long-wave, 42
middle-wave, 42
retinal asymmetry in distribution of, 43
S cones, 43
short-wave, 42
Control Display Unit (CDU), see automation
Convergence, 24, 89
ocular, 89
to bipolar cells, 24
Cornea, 14
Critical flicker fusion (CFF), see flicker
Cycles, 2
Decibels (dB), 3
Decision making, 133-163
"broad/shallow" FMC menus, 149
"narrow/deep" FMC menus, 149
action tunneling in, 160
anchoring heuristic in, 138
automatic, 144
availability heuristic in, 140
availability in, 140
avoiding negatives in checklist design, 150
base rate concept in Bayes theorem, 139-140
Bayes theorem, 139
belief in, 139
biases in situation assessment, 137-142
choice of action in, 136
clockwise increase stereotype of control movement, 153
cognitive tunneling in, 144
cognitive-response-stimulus (CRS) compatibility, 153, 159
colocation principle in display-control compatibility, 152
complexity, 146
complexity advantage, 149
concept of uncertainty in, 135
confirmation bias in, 137-138
congruence in checklist design, 150
congruence principle in display-control compatibility, 152, 156
congruence stereotype of control movement, 156
control movement compatibility and pilots' mental models, 156-160
control movement related to displays, 152
control movement stereotypes, 153-156
cues to, 135
de-biasing techniques in, 145
degrading effects of stress on multimode systems, 160
design considerations for data link, 152
diagnosis of situations in, 136
display tunneling in, 144
display-control compatibility in, 152-160
effect of context on response selection speed, 147
effect of modality on SR-compatibility, 159
effect of practice on response selection speed, 149
effect of publicity on availability, 140
effect of recency on availability, 140
effect of signal discriminability on response selection, 148
expectancy effect in confirmation bias, 138
expectancy effect on reaction time, 147
expert systems effects on, 145
extrinsic feedback in, 151
factors affecting response selection speed, 146-160
flight management computer (FMC) menu choices, 149
following checklist procedures, 149-150
heuristics defined, 137
heuristics types in, 138-141
high-speed, 146
implications of stress for voice control, 160
intrinsic feedback in, 151
lessening bias in, 145
model of, 136
negative transfer and the common type rating, 161
negative transfer design issues in, 161-163
overconfidence bias in, 141
pilot laboratory experiments in, 141-142
positive transfer design issues in, 161
proximity stereotype of control movement, 153
relative location of controls and displays, 152
representativeness in, 140
response execution, 134
response feedback in, 151-152
response selection, 134
response time equation, 146
risk assessment in, 137, 142-143
salience bias in, 137
selection, 134
similarity concept in Bayes theorem, 139-140
situation assessment in, 136
situational awareness, 134
speed-accuracy trade-off in, 147
stress effects on, 144-145, 160-161
stress-induced losses in working memory, 144
stress-resistant decision processes, 144
under certainty, 146
voice-activated controls, 160
Depth perception, 83-92
aerial perspective, 86
binocular depth cues, 83
binocular disparity, 89
binocular rivalry, 91
chromostereopsis, 91
color stereopsis, 91
cues used by pilots, 87
interposition, 85
linear perspective, 85
monocular cues in relation to size and distance, 84-86
monocular cues in relation to size, 84
monocular depth cues, 83
moon illusion, 84
motion parallax, 87
motion perspective, 87
occurrence of strabismus in population, 91
optic flow patterns, 87
perception of texture, 86
random-dot stereograms, 90
spatial errors in, 85
stereo imagery on displays, 92
stereopsis, 89
strabismus, 91
use of binocular cues in aerial surveillance, 91
Dichromats, 45
Display compatibility, 115-118
applications to aviation, 116-118
meaning of colors, 116, 233, 258
multiple stereotypes, 116
perception of displayed information, 115
population stereotypes, 116, 258
principle of pictorial realism, 116
principle of the moving part, 116, 233
principles of multi-element display design, 118-122
S-C compatibility, 116
S-R compatibility, 116
spatial interpretation, 116
Display design, 243-267
advantages of building on past successes, 248
analysis of alternate sources for required information in, 247
analysis of continuous dynamic control tasks in, 247
analysis of new tasks in, 247
analysis of similar tasks in, 247
benefits of top-down task analysis for, 246
certification issues raised by new technology, 267
characteristics of proven value in symbology, 249
command information in, 264-265
command vs. situation-prediction displays, 264
costs associated with command information in, 265
direct selection concepts, 263
display development process, 244
dwell time in CRTs, 258, 283
effects of ignoring information requirements in, 244
effects on performance of time shared information, 260-261
Engine Indicating and Crew Alerting System (EICAS) display, 263
evaluation of, 250
evaluation strategy in, 250
expectation as a factor in, 262
eye fatigue factors in, 258-259
factors of legibility in, 249
format selection defined, 248
fundamental elements of, 244-245
future issues in, 267
general design issues of, 251-256
human factors issues associated with flat panel displays, 267
importance of task execution strategies in standardization of, 252
information requirements of, 244, 247
integrated displays, 250
need for appropriate performance measures in, 249
need for refinement of symbology and formatting, 248
nominal refresh rate of displays, 259
operational follow-up in, 251
optimum line widths for color CRT displays, 258
prediction information in, 264, 265-267
problem of "soft edges" in CRTs, 258
problem of fixation in CRTs, 258
problem of flicker in CRTs, 259
problem of glare and reflection in CRTs, 259
reasons for using color in, 257
role of NASA Terminal Configured Vehicle (TCV), 256-257
situation data in, 264-265
standardization issues in, 251-252
symbology selection defined, 248
task analysis for, 244-246
time shared information in, 260-264
time sharing benefits in, 261
time sharing of conceptual changes in display content, 263
time sharing of EGT gauge data, 261
time sharing of supplemental navigation data, 260
tree-structured selection concepts, 263
tunneling as a factor in, 261
typical conflict mechanisms in evaluation of, 250
use of color in, 256-258
value of alternative symbology and formats, 248
Dynes, see Decibels
Electroencephalogram, 96
Electromagnetic spectrum, see spectrum
Emmetropia, 16
Energy, 2
spectral distribution of, 2
Equiloudness contour, see loudness
Expectation, see information processing
Expert systems, see decision making
Eye blinking, 30
Eye movements, 28-30
conjunctive, 28
in sleep, 190
pursuit, 28, 30
saccadic (ballistic), 28-30
vergence, 28
vestibular, 28, 30
FAA guidelines, 21, 59
for advisory level alerts in displays, 60
for caution signals in displays, 59-60
for master visual alerts, 21
Figure, 37, 72
perception during motion, 37
visual separation of, 72
Fixation, 28, 72
Flicker, 32, 249, 259
critical flicker fusion, 32, 249, 259
sensitivity to, 33, 259
Flight Management System (FMS), see automation
Form-Color interactions, 83
color-contingent aftereffects, 83
Fourier,
analysis, 4, 34, 80
use of in psychophysical experiments, 81
Fovea, 14
Frequency,
fundamental, 4, 80
intensity, 2
range for aircraft warning, 5
range of human speech, 4
sensitivity, 6
tones, 6
Ganglion cells, see optic nerve
Ground, 37, 72
perception during motion, 37
visual separation of, 72
Habituation, see sound habituation
Harmonics, see frequency fundamental
Head-Up Display (HUD), 122-132
accommodation, 126
accommodative response, 127
advantages of, 132
amount of information displayed, 130
attention issues, 124, 130-132
attentional tunneling, 131
cognitive issues, 129-130
conformal symbology, 123, 132
confusion issues, 131, 166
divided attention issues, 131
effects of display clutter, 131
eye reference point, 128
field of view, 128
goals of, 122-123
issues in optics design, 124-128
issues in symbology design, 124, 129-130
military research in, 123
multimode operations, 130
NASA studies of, 132
nonconformal symbology, 132
physical characteristics of, 128-129
simulation experiments in, 124-125, 131
transmittance, 129
updating of information, 129
use by Alaska Airlines, 122, 124, 131
use of color in, 129, 166
use of optical infinity in, 126
Hering's theory, see Hue
Hertz, 2
Heuristics, see decision making
Horizontal cells, see retina
Hue, 53-57
appearance of in displays, 54, 117
Bezold-Brücke hue shift, 55
cancellation in displays, 54
degraded perception of, 55
Hering's four fundamental hues, 53
Hering's opponent-colors theory, 54
Hering's theory of hue appearance, 53
zones in visual field, 57
Human error, 200-207
"bandaid" approach to, 205
as a "resident pathogen," 206
automation as an approach to, 205, 220-221
categories of, 200-204
electronic cocoon approach to, 206
error remediation and safeguards, 204-206
error-tolerant systems as a safeguard against, 204-205
forgetting as a type of, 200
in a systems context, 206-207
knowledge-based mistakes of, 200
lapses, 200, 202
latent, 206
mediating factors of, 207
mode errors, 202
Reason and Norman Classification scheme of, 200-202
remediation for knowledge-based mistakes, 201
remediation for lapses, 202-203
remediation for mode errors, 202
remediation for slips, 204
reversibility of actions as a safeguard against, 204
rule-based mistakes of, 200
slips, 200, 203
system design issues in slip remediation, 204
triggering conditions for slips, 203
two types of memory errors, 202
Human Factors testing, 307-336
"double-blind" studies in, 324
"real-world" studies in, 315
advantage of the arithmetic median in, 329
advantages and disadvantages of the arithmetic mode in, 329
advantages of regression analysis in, 335-336
Air Traffic Control (ATC), 315
Analysis of Variance (ANOVA) test in, 333-335
ceiling effect in response accuracy, 310
common questions in, 307-308
commonly used objective measures in, 310
complex response components in, 311
components of response time in, 311
concept of statistical significance in, 333
controlling subject bias in, 325
correlation defined, 331
correlation in, 331
counter-balancing defined, 327-328
counter-balancing in, 327-328
criteria for rating scales used in, 320-321
data analysis in, 328-337
degrees of freedom (df) defined, 332
descriptive statistics in, 329-336
determining task cues in, 317
effect of available response alternatives on, 314-315
effect of differences between target and test populations in, 324
effect of ease of interpretation on, 312-313
effect of expectations and context on, 313-314
effect of fatigue in, 327
effect of meaningfulness factors on, 312
effect of practice on, 324
effect of skill level in, 325
effect of stimulus factors on, 312
effect of user confidence on, 314
evaluation design, 309
example of analysis of variance in, 331-332
experimental controls, 323
experimental reliability defined, 322
experimental validity and reliability in, 322-323
experimental validity defined, 323
factors affecting response time in, 311-312
field observations in, 316-319
floor effect in response accuracy, 310
full-mission simulation, 322
Ground Proximity Warning System (GPWS), 315
guidelines for developing and administering questionnaires in, 319
human performance measurement in, 310
importance of "worst-case" scenario in, 327
inferential statistics in, 331-332
laboratory experiments used in, 321-322
limitations of field observations, 318
limitations of full-mission simulation, 322
measures of central tendency in, 329
measures of variability in, 330
need for descriptive "anchors" in subjective scales of, 310, 320-321
negative transfer in, 327
objective measures in, 310
operationally defined variables in, 323
p value in, 331-332
part-task simulation, 321-322
population stereotypes in, 308-309
positive transfer in, 326
priming effect in, 322
questionnaire bias in, 320
questionnaires in, 319-320
rating scales used in, 320-321
regression analysis in, 336
representative subject pools in, 324-325
representative test conditions in, 326-327
response accuracy in, 310
response time as a sensitive measure in, 310
response time in, 310
response time to an executive system in, 311
role of human factors specialists in, 308-309
role of operations specialists in, 308-309
sensitivity of arithmetic mean to outlying scores in, 329
significance of three standard deviations in, 331
standard deviation defined, 330
standard deviation in, 330-331
statistical vs. operational significance in, 336
subject selection in, 326
subjective measures in, 310
t-test in, 332
test methods of, 307
the arithmetic mean in, 329
the arithmetic median in, 329
the arithmetic mode in, 329
use of a control group in, 325
usefulness of, 325-326
uses of the correlation coefficient in, 331
uses of the standard deviation in, 330-331
variance defined, 330-331
variance in, 330-331
Human Factors, 232-239
certification criteria, 240
defined, 232
disciplines, 234
issues associated with flat panel displays, 267
lack of objective criteria in FARs and design specifications, 235
lack of support for by organizations, 234-235
need for extensive testing in, 238
need for in FAA certification process involving, 240
non-specific criteria in assessments of, 239
relation to automation design, 235-237
role of human as a systems monitor, 235
Hypermetropia, 16
Identification, see color identification
Incident data, see automation
Induced movement, 38
Information processing, 93-113
attention, 97-98, 118-122
automatic, 100-101
bottom-up, 103
capacity of short-term memory in, 111
complexities of speech signals in, 105
constructive memory in, 112
contextual cues to pattern recognition in, 103
controlled, 100
depth of processing, 97
display implications for parallel processing, 120-121
echo effect, 107
effect of Alzheimer's disease on memory, 112
effects of listener's age on speech perception, 106
effects of speech rate on speech perception, 106
expectation in ASRS reports, 102
expectation in speech perception, 102
expectation, 101-102, 105, 138, 262
expected information versus actual, 101
feature theory, 102
hidden costs of automatic processing, 101
long-term memory, 111-113
memory as distinct brain structures, 112
memory, 107-113
model of, 133-134
pattern recognition, 95
popout effect, 100
ra/la distinction, 106
reconstructive memory, 111-112
sensory memory, 108-109
serial processing, 96
short-term memory "chunking," 111
short-term memory interference, 110
short-term memory, 110-111
signal-to-noise ratio, 106, 175
speech frequency attenuation, 107
speech signals, 106-107
TCAS simulation study, 104
template theory, 102
time required for, 96
tip-of-tongue phenomenon, 111
top-down, 103
types of attention, 97
variability in speech signals, 106
working memory, 95, 108, 134
Intensity, see sound intensity
Iris, 14
James, William, 97
Principles of Psychology, 97
Knowledge, 95-96
explicit, 96
implicit, 96
Lens, 14
cataract of, 19
increase in absorption with age, 18
Light, 11
Long-term memory, see memory, information processing
Long-wave cone, see cone
Loudness,
equiloudness contour, 5
sensitivity to, 6
Luminance, 23
Mach number, see speed of sound
Macular degeneration, 27
Macular pigment, 27
Masking, see sound masking, visual masking
McCollough effect, 83
Memory
"chunking," 111
constructive, 111
effect of Alzheimer's disease on, 112
long-term, 95, 108, 111-113
reconstructive, 111-112
sensory, 94, 108-111
short-term memory capacity, 110
short-term memory interference, 110
short-term, 95, 108, 110-111
tip-of-tongue phenomenon, 111
working, 95, 108, 134
see also short-term memory
Memory, 107-113, see also information processing
Middle-wave cone, see cone
Monochromatic lights, 12
Monochromats, 45
cone, 46
rod, 46
Motion perception, 37-39
functions of, 37
illusions of, 38-39
of figure, 37
of ground, 37
stroboscopic, 37
thresholds of, 37
Motion perspective, see Depth perception
Munsell system, see Color specification
Myopia, 16
Nanometers, 12
Nasal retina, see retina
Neurons, 94
Nystagmus, 31
Ocular media transmission, 17
Optic disc, 21
blind spot of, 21
Optic nerve, 21
Optical density, 17
Parallel processing, see information processing
Pattern recognition, see information processing
Pertinence Theory, see Attention
Photons, 11
Photopic spectral sensitivity, see spectral sensitivity
Photopic vision, 22, 46
Photopigments, 20
Physiological nystagmus, see nystagmus
Pilot judgment, 135-142
cited in aviation accident/incident databases, 135
concept of uncertainty in, 135
decision making under certainty, 146
defined, 135
Pitch, 4, 5
Presbycusis, 6
Presbyopia, 17
correction for, 17
Pupil, 14
Pure tone, see sine wave
Quanta, see quantum, 11
Quantum, 11
Receptor, 14, 42
Reflection, 13
Refraction, 14, 125
Resolution, see visual acuity
Retina, 14
amacrine cells of, 21
bipolar cells of, 21
horizontal cells of, 21
nasal, 43
stabilized image on, 31
temporal, 43
visual angle of, 15
Retinal eccentricity, 20, 43
Risk assessment, 137, 142-143
choice between negative outcomes, 143
choice between positive outcomes, 143
framing of decisions, 143
gambling choices, 142
risky option, 142
sure thing option, 142
Risk assessment, see also decision making
Rods, 20
Sclera, 14
Scotopic vision, 22
Sensitivity, see sound sensitivity
Sensory memory, see memory, information processing
Sensory modality, 97
Sensory register, see sensory store
Sensory store, 108-109
capacity of, 108
Sensory systems
external, 95
internal, 95
Serial processing, see information processing
Shape constancy, 72
Short-term memory, see memory, information processing
Short-wave cone, see cone
Sine wave, 2
Situational awareness, see decision making, workload assessment
Size constancy, 72
Sleep cycle, 191
circadian rhythms of, 191
defining characteristics of, 190
desynchronization, 195
Mean Sleep Latency Test (MSLT)
resynchronization, 198-199
sleep latency, 192-193
Sleep disruption, 190-199
characteristics of sleep, 190
controlled napping as an antidote to, 199
desynchronization, 195
in pilots, 193-198
micro-sleep, 199
NASA long-haul study of, 195-198
NASA short-haul study of, 193-195
performance as a measure of, 193
prophylactic sleep as an antidote to, 193
rapid eye movement (REM) sleep, 190
shift rates of biological and performance functions after transmeridian flights, 197
sleep inertia as a phenomenon of, 199
sleep resynchronization, 198-199
slow wave sleep, 191
Sound adaptation, 8
Sound exposure, 7
Sound habituation, 8
Sound intensity, 2
stimulus, 8
Sound intensity, 2
interaural differences in, 7
Sound masking, 8, 106
Sound sensitivity, 5-6
absolute, 5
loss in, 6
Sound time differences, 7
Spectral sensitivity, 22
function, 23
Spectrum, 17
broadband of, 12
ultraviolet portion of, 17
visible portion of, 17
Speech perception, see information processing
Speed of sound, 2
Stabilized retinal image, see retina
Statistical tests and concepts, see human factors testing
Stereograms, see Depth perception, binocular depth cues
Stereopsis, see Depth perception
Stereovision, see Depth perception, binocular depth cues
Strabismus, see Depth perception
Stroboscopic motion, see motion perception
Temporal retina, see retina
Temporal vision, 32
Timbre, 4
Time differences, see sound time differences
Timesharing, 165-169
automatized performance in, 168
confusion in verbally dependent environments, 166-167
confusion in, 166
importance of voice quality in, 167
NASA Langley research in, 166
performance resource function in, 167
residual resources in, 167, 179
resources and, 167
sampling and scheduling in, 166
Trichromats, 45
Tunneling, see decision making, Display design
Tympanic membrane, 2
Ultraviolet radiation, 27
hazardous effects of, 27
Visual acuity, 20, 25, 72, 249
as a measure of resolution, 72
in display design, 249
loss of, 25
related to high spatial frequency sensitivity, 72
Visual fixation, see fixation
Visual masking, 30
Wavelength discrimination, 58-59
color combinations, 59
color difference between display symbols
field size, 58
Workload assessment, 269-306
absolute scale of, 272
accident data in analysis of pilot error, 303
advantages of flight management system (FMS), 286-287
Boeing airplane development program, 275
Boeing data summaries of, 278-282
Boeing Subsystems Workload Assessment Tool (SWAT), 277
Boeing use of ergonomic data in, 277-278
burden on FAA certification personnel of early requirements determination, 301
certification considerations, 301-303
certification methodology requirements for, 272
changes in pilot's experience of workload, 269-270
commercial aircraft workload types, 273-274
comparative analysis of internal airplane systems, 220-277
computation base for timeline analysis of, 282
costs of using simulation and flight test tools in, 276
criteria, 277-282
design methodology requirements for, 272
dissimilar cues as an error detection technique, 303-304
dual role in design and development, 271
dwell time defined, 283
error tolerant design, 303-305
factors and functions identified in Appendix D of FAR Part 25, 269, 271, 292
four channels of activity in timeline analysis of, 283
future issues in, 305-306
individual channel statistics in timeline analysis of, 283
issues involving airline differences, 302
issues involving fault management strategies and training in, 306
issues involving impromptu task prioritizing strategies in, 305
issues involving information overload in, 305
issues involving mandatory indicators and displays, 301-302
methodology, 271-272
minimizing random errors, 303
minimizing systematic errors, 303
need for early requirements determination, 301
non-normal flight deck workload defined, 273-274
non-normal procedures in, 274
normal flight deck workload defined, 273
pilot error considerations in, 303-305
pilot error triggers relating to, 304
pilot subjective evaluations (PSE) of, 292-301
probability densities in task-time analysis of, 289
problems of subjective pilot evaluations in, 298, 300-301
PSE ratings of, 300
questionnaires as a tool in, 292-301
relationship between workload and human error, 271
scheduling, 275-277
situational awareness as an error detection technique, 287, 303
structured type of, 275
task-time probability analysis of, 289-292
time-demand workload trigger levels, 285
timeline analysis of, 282-287
timing-related guidelines for normal task loading, 274
transition time defined, 283
use of Boeing 737 as a data reference in, 279
value of task-time probability analysis of, 291
Workload, 169-190
"red line" of, 171
absolute, 170
Airbus and Douglas use of physiological measures of, 187
aircraft certification for, 170
assessment, 170-171, 179-190, 269-306
aviation examples of embedded secondary tasks, 183
Bedford Scale of workload measurement, 183
Boeing and Honeywell computation models of, 179
case studies in, 170
component scales, 174
computation models of, 179
Cooper-Harper Scale of workload measurement, 183-185
critical instability tracking as a secondary task, 182
defined, 169
demand checklist, 175
difficulty insensitivity, 177
distinctions defining resources, 178
drivers of, 188
dynamic closed-loop concept of, 188-189
effects in relation to cockpit design, 236
embedded secondary tasks, 183
FAA review of measurement literature, 236
four major techniques for measuring, 180-188
goals of automation in, 189, 228
heartbeat as a physiological measure of, 187
intrusiveness problem in secondary tasks, 182
memory comparison as a secondary task, 182
multidimensional scales of workload measurement, 185
multiple resources, 177-179
NASA TLX Scale, 185
open loop gain, 180
overload defined, 180
physiological measures of, 186-188
prediction, 170-179
primary task performance measures, 180-181
problems associated with secondary tasks, 182
problems in timeline analysis, 174
problems with subjective measures of, 186
random number generation as a secondary task, 182
reduction in through automation, 228
related to sleep disruption, 190-199
relative, 170
residual resources, 167, 180
response bias in subjective measures of, 186
secondary task performance, 182-183
static open-loop concept of, 188-189
Sternberg Task, 182
subjective measures of workload, 183-186, 292-301
Subjective Workload Assessment Technique (SWAT), 185
task demand, 174
task shedding in, 189
time estimation as a secondary task, 182
Timeline Analysis Program (TLAP), 171
timeline analysis, 171-176
timeline model, 171
underload defined, 179
underload, 190
unidimensional scales of workload measurement, 183-185
variables influencing central processing resources demand, 174
variables influencing display processing demand, 174-175
variables influencing response processes demand, 174-175