TESTING ENGLISH Book For Deveoloped Guide of Study

about testing

Uploaded by

Jorge El Curioso
H. DOUGLAS BROWN
PRIYANVADA ABEYWICKRAMA

Language Assessment: Principles and Classroom Practices, Second Edition

Copyright © 2010 by Pearson Education, Inc. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior permission of the publisher.

Pearson Education, 10 Bank Street, White Plains, NY 10606

Text font: Garamond Book

Library of Congress Cataloging-in-Publication Data
Brown, H. Douglas.
Language assessment: principles and classroom practices / H. Douglas Brown, Priyanvada Abeywickrama.
1. Language and languages: Study and teaching. 2. Language and languages: Examinations. 3. Language acquisition. I. Abeywickrama, Priyanvada. II. Title.

Pearsonlongman.com offers online resources for teachers and students. Access our Companion Websites, our online catalog, and our local offices around the world. Visit us at pearsonlongman.com.
Printed in the United States of America

CONTENTS

Preface
Text Credits

Chapter 1 Assessment Concepts and Issues
  Assessment and Testing
  Measurement and Evaluation
  Assessment and Learning
  Informal and Formal Assessment
  Formative and Summative Assessment
  Norm-Referenced and Criterion-Referenced Tests
  Types and Purposes of Assessment
  Achievement Tests
  Diagnostic Tests
  Placement Tests
  Proficiency Tests
  Aptitude Tests
  Issues in Language Assessment: Then and Now
  Behavioral Influences on Language Testing
  Integrative Approaches
  Communicative Language Testing
  Performance-Based Assessment
  Current "Hot Topics" in Classroom-Based Assessment
  Multiple Intelligences
  Traditional and "Alternative" Assessment
  Computer-Based Testing
  Other Current Issues
  Exercises
  For Your Further Reading

Chapter 2 Principles of Language Assessment
  Practicality
  Reliability
  Student-Related Reliability
  Rater Reliability
  Test Administration Reliability
  Test Reliability
  Validity
  Content-Related Evidence
  Criterion-Related Evidence
  Construct-Related Evidence
  Consequential Validity (Impact)
  Face Validity
  Authenticity
  Washback
  Applying Principles to the Evaluation of Classroom Tests
  1. Are the test procedures practical?
  2. Is the test itself reliable?
  3. Can you ensure rater reliability?
  4. Does the procedure demonstrate content validity?
  5. Has the impact of the test been carefully accounted for?
  6. Is the procedure "biased for best"?
  7. Are the test tasks as authentic as possible?
  8. Does the test offer beneficial washback to the learner?
  Exercises
  For Your Further Reading

Chapter 3 Designing Classroom Language Tests
  Four Assessment Scenarios
  Scenario 1: Reading Quiz
  Scenario 2: Grammar Unit Test
  Scenario 3: Midterm Essay
  Scenario 4: Listening/Speaking Final Exam
  Determining the Purpose of a Test
  Designing Clear, Unambiguous Objectives
  Drawing Up Test Specifications
  Devising Test Items
  Designing Multiple-Choice Items
  Design each item to measure a single objective.
  State both stem and options as simply and directly as possible.
  Make certain that the intended answer is clearly the only correct one.
  (Optional) Use item indices to accept, discard, or revise items.
  Administering the Test
  Scoring, Grading, and Giving Feedback
  Scoring
  Grading
  Giving Feedback
  Exercises
  For Your Further Reading

Chapter 4 Standards-Based Assessment
  The Role of Standards in Standardized Tests
  Standards-Based Education
  Designing English Language Standards
  Standards-Based Assessment
  CASAS and SCANS
  Teacher Standards
  The Consequences of Standards-Based and Standardized Testing
  Test Bias
  Test-Driven Learning and Teaching
  Ethical Issues: Critical Language Testing
  Exercises
  For Your Further Reading

Chapter 5 Standardized Testing
  Advantages and Disadvantages of Standardized Tests
  Developing a Standardized Test
  Determine the purpose and objectives of the test.
  Design test specifications.
  Design, select, and arrange test tasks/items.
  Make appropriate evaluations of different kinds of items.
  Specify scoring procedures and reporting formats.
  Perform ongoing construct validation studies.
  Standardized Language Proficiency Testing
  Exercises
  For Your Further Reading

Chapter 6 Beyond Tests: Alternatives in Assessment
  The Dilemma of Maximizing Both Practicality and Washback
  Performance-Based Assessment
  Rubrics
  Portfolios
  Journals
  Conferences and Interviews
  Observations
  Self- and Peer-Assessments
  Types of Self- and Peer-Assessment
  Guidelines for Self- and Peer-Assessment
  A Taxonomy of Self- and Peer-Assessment Tasks
  Exercises
  For Your Further Reading

Chapter 7 Assessing Listening
  Integration of Skills in Language Assessment
  Assessing Grammar and Vocabulary
  Observing the Performance of the Four Skills
  The Importance of Listening
  Basic Types of Listening
  Micro- and Macroskills of Listening
  Designing Assessment Tasks: Intensive Listening
  Recognizing Phonological and Morphological Elements
  Paraphrase Recognition
  Designing Assessment Tasks: Responsive Listening
  Designing Assessment Tasks: Selective Listening
  Listening Cloze
  Information Transfer
  Sentence Repetition
  Designing Assessment Tasks: Extensive Listening
  Dictation
  Communicative Stimulus-Response Tasks
  Authentic Listening Tasks
  Exercises
  For Your Further Reading

Chapter 8 Assessing Speaking
  Basic Types of Speaking
  Micro- and Macroskills of Speaking
  Designing Assessment Tasks: Imitative Speaking
  Versant®
  Designing Assessment Tasks: Intensive Speaking
  Directed Response Tasks
  Read-Aloud Tasks
  Sentence/Dialogue Completion Tasks and Oral Questionnaires
  Picture-Cued Tasks
  Translation (of Limited Stretches of Discourse)
  Designing Assessment Tasks: Responsive Speaking
  Question and Answer
  Giving Instructions and Directions
  Paraphrasing
  Test of Spoken English (TSE® Test)
  Designing Assessment Tasks: Interactive Speaking
  Interview
  Role Play
  Discussions and Conversations
  Games
  ACTFL Oral Proficiency Interview (OPI)
  Designing Assessments: Extensive Speaking
  Oral Presentations
  Picture-Cued Storytelling
  Retelling a Story, News Event
  Translation (of Extended Prose)
  Exercises
  For Your Further Reading

Chapter 9 Assessing Reading
  Genres of Reading
  Microskills, Macroskills, and Strategies for Reading
  Types of Reading
  Designing Assessment Tasks: Perceptive Reading
  Reading Aloud
  Written Response
  Multiple-Choice
  Picture-Cued Items
  Designing Assessment Tasks: Selective Reading
  Multiple-Choice (for Form-Focused Criteria)
  Matching Tasks
  Editing Tasks
  Picture-Cued Tasks
  Gap-Filling Tasks
  Designing Assessment Tasks: Interactive Reading
  Cloze Tasks
  Impromptu Reading Plus Comprehension Questions
  Short-Answer Tasks
  Editing (Longer Texts)
  Scanning
  Ordering Tasks
  Information Transfer: Reading Charts, Maps, Graphs, Diagrams
  Designing Assessment Tasks: Extensive Reading
  Skimming Tasks
  Summarizing and Responding
  Notetaking and Outlining
  Exercises

Chapter 10 Assessing Writing
  Genres of Written Language
  Types of Writing Performance
  Micro- and Macroskills of Writing
  Tasks in [Hand]writing Letters, Words, and Punctuation
  Spelling Tasks and Detecting Phoneme-Grapheme Correspondences
  Designing Assessment Tasks: Intensive (Controlled) Writing
  Dictation and Dicto-Comp
  Grammatical Transformation Tasks
  Picture-Cued Tasks
  Vocabulary Assessment Tasks
  Ordering Tasks
  Short-Answer and Sentence-Completion Tasks
  Issues in Assessing Responsive and Extensive Writing
  Designing Assessment Tasks: Responsive and Extensive Writing
  Paraphrasing
  Guided Question and Answer
  Paragraph Construction Tasks
  Strategic Options
  Standardized Tests of Responsive Writing
  Scoring Methods for Responsive and Extensive Writing
  Holistic Scoring
  Primary Trait Scoring
  Analytic Scoring
  Beyond Scoring: Responding to Extensive Writing
  Assessing Initial Stages of the Process of Composing
  Assessing Later Stages of the Process of Composing
  Exercises
  For Your Further Reading

Chapter 11 Assessing Grammar and Vocabulary
  Assessing Grammar
  Defining Grammatical Knowledge
  Designing Assessment Tasks: Selected Response
  Multiple-Choice Tasks
  Discrimination Tasks
  Noticing Tasks or Consciousness-Raising Tasks
  Designing Assessment Tasks: Limited Production
  Gap-Filling Tasks
  Short-Answer Tasks
  Dialogue-Completion Tasks
  Designing Assessment Tasks: Extended Production
  Information Gap Tasks
  Role-Play or Simulation Tasks
  Assessing Vocabulary
  The Nature of Vocabulary
  Defining Lexical Knowledge
  Some Considerations in Designing Assessment Tasks
  Designing Assessment Tasks: Receptive Vocabulary
  Designing Assessment Tasks: Productive Vocabulary
  Exercises
  For Your Further Reading

Chapter 12 Grading and Student Evaluation
  The Philosophy of Grading: What Should Grades Reflect?
  Guidelines for Selecting Grading Criteria
  Methods for Calculating Grades
  Teachers' Perceptions of Appropriate Grade Distributions
  Institutional Expectations and Constraints
  Cross-Cultural Factors and the Question of Difficulty
  What Do Letter Grades "Mean"?
  Calculating Grades
  Alternatives to Letter Grading
  Some Principles and Guidelines for Grading and Evaluation
  Exercises
  For Your Further Reading

Appendix: Commercial Tests
Glossary
Bibliography
Name Index
Subject Index

PREFACE

As this second edition of Language Assessment: Principles and Classroom Practices goes to press, we are embarking on the second decade of this new millennium. In that first ten-year period alone, the field of second language acquisition and pedagogy saw remarkable advances in our stockpile of methodological options for teaching languages. The subdiscipline of language assessment kept pace with this growth. In this second edition, we have almost doubled the number of bibliographic entries found in the first edition, a sign of a full research agenda. Also, in that period of time several new journals, exclusively devoted to language assessment, have been published, a further index of academic prosperity.

Assessment in general, but in particular language assessment, is an area of intense fascination. No longer a field exclusively relegated to psychometricians and testing experts, assessment has caught the interest of classroom teachers, students, parents, and political action groups. How can I (a teacher) design an effective classroom test? What can I (a student) do to prepare for a test and to make assessments of all kinds enhancing learning experiences? Are the standardized tests of language (that my child has to take) accurate measures of ability? And do I (as an advocate for fair testing practices) believe that the plethora of tests that students are exposed to are culture-fair, free from bias, and not instruments of a powerful elite designed to further the gap between the haves and the have-nots?

All of these and many more questions now being addressed by teachers, researchers, and specialists can be overwhelming to the novice language teacher, who is already baffled by linguistic and psychological paradigms and a multitude of
Tis book provides teachers~and excherstobe—with 2 ‘Gent readeriesdly preeataton ofthe esential foundzton stones of language sscessment, with ample praccl examples to iustrate their application i language ‘assooms Isa book ta simples the isues without oresmpliffing I esu't dodge complex questions, and it eas them in ways hat cseroam ceachers can comprehend Readers do nothze to become eng expers~orstiiclas adept jn manipulating mated equations and advanced cacus—to understand and apply the concepts in this book, S111 ii SEA J 4 AH ef i 3 TH ater INF etice i + Glossary of terminology. In this edton you will fad 2 ghssary of ascessnent ems and concep al of which hae been boc in the tent of the book We hope ths wi bea usefl way to quit cant he tenia ofthe myriatof ems invoduced inthis book. + Appendix listing commercially avaiable tests. Curent don Wiel ‘muted tenis now preseared in an appends ths ists perdnen infer ration, specications, nd lternet eferences. PERSONAL WORDS OF APPRECIATION Fist, I want to welcome my coauthor for this edition, De Payunrads ‘Aberwickama, profesor of English at San Francisco Sue Univers. thas been my pleasure to work with rps in writing this second eon, as she has been expe- dally helpful in iMdentfing new, cutiagedge research inthe field 2s wel asia feria insights on standards-based and foruMocused asessneat ‘We can both hear aser that this book is very much the producto our own teaching of language assessmene. Our stents have colecively taught us more than we have taught them, which prompts us ro chan them a, everwhere, for these gis of knowledge. 
And ofcourse the embracing suppor ofthe NATESOL fc ‘uly at Sen Francisco Sate Universi 2n upifing source of stinulation and afi Tm farther indebged to teachers in many counties around the world where [ hve had the honor of fering workshops and seminars on anguapeasessmeat[ have memorble impressions of such sessons in Bri, Casada, Chile, the Dominican Republic, Egypt, Japan, Korea, Mexico, Peri, Spin, Taiwan, Thailand, “Tuckey, Uruguay, and the former repubc of Yugosavia, wheve cosrcukural issues inascesment have beea especialy stimulating, Final, we wish co teak our respective paraes, Scot and Mary, for toe sting authors in theic ome who need all oo many hours of univerrupted fous ‘uring periods of book wring. Tei love and support isa marvelous afiemation of our work ‘We would lke thank the folowing reviewers who offered valuable insights about the Ses edition and eaty drat ofthe revised edion: Jorge H. Cables, Universy of Delaware, Newark, Detware; Fernando Fleuzquia, University of Maryland Bakimore County, Ealtinore, Maryland; Vasie Kelos, Seneca Coleg, “Foroato,Cansex; ezanne Medina, Caforia Sate Unversity, California (he Hie) yon Stafford-Yilmaz, Belevue Community Colles, Belerue, Weshingor; Diane Steong-Krause, Brigham Young Universi, Provo, Uy; Latricia Tits, Musa ‘Sue Universe, Kennucy. De. H.Dougha Brown, Sepeember 209 TEXT CREDITS __ oaefuleknoledgmen i made tothe flowing publishers and authors for per smision to reprint copyihted materia. “anercan Cound 0a Tenciog Foreign Languages (ACT, fr mate from ACTHL Proficiency Guidelines Speaking (1986), Oral Proficiency Inventory (OPD: ‘Sumamary Hight ‘Buackwel Publishers, for mater rom Browa, Janes Dean & Balle, Kathleen M. (1980), A categorical instrument fr scoring second language wing sil language Learning 34,242 ‘Cabfoia Deparment of Eduction, for mate from Cafornie Engl Language Deveopment ELD) Standards Usening and Speaking. 
"BocxtonlTesig Service (TS, fe mater fom Ts of Engl asa Foreign Language TOEEL* Tet) Tt of Spon Engle (TSE* Te Tit of Writen Ege 15 ugg ie sp etn Mee Eng Language Assessment Batiery QIELAB). ‘Georetown Universi Press, for material from Swi, Mesid. (1990). Tae ne guage of French inmersion studens: apcations fr thecryand practice. la ames Alas (Ed), Georgetoun Univesity round table on languages and lings. ‘Washington: Georgetown University ress. Osi iver Pres, fr mare rom Bachaan, fF (1990). Fundamental considerations language esting New Yodc Oxford Uaivessy res. Pearson Education, for mitral fom Versan# Test Pearson Longman ESL, and Deborah Philip, for mate fom Pais, Deborah 001), Longman Introductory Course or be TOEFL Test Wit Pits, Ni;Pearsen ivcaion. Second Langage Testing, Inc. (LTD, for mater tom Moder Language Apitude Tet. ‘alversy of Cambridge Local Examinations Spaicate (UCLES, for mate ftom International Englis Language Testing Stor, “Yasir nz, Rosian Khar, Ec Philp, and Shela Vi, for unpublished ster Pater i PURPOSE AND AUDIENCE ‘This book is designed to oe a compreheaie survey of eset principles and tool for second langage asesrent. thas ben succes used in its frst eition in eacbertining courses teacher certifcaion curcula and TESOL master of as programs. As the dd in a trogy of teacher education textbooks, its designed to fotow H, Douglas Brown’ other two books, Principe of Language Learning and Teaching (ith edkon, Pearson ivcaton, 207) and Teaching by Principles (ied ‘edton, Pearson Education, 2007), References to those two books are sprinkled throughout the curent book In keeping with the toe st in the previo tO ‘ok this one erties uncomplicated prose and a systematic, spring organize on. Conceps ae inuodhced wit a maximum of prc! exemplifcaton and 3 ‘minimum of weighty defo. Supportive esearch i acknoredged aad suciacly explained wihoutburdning the reader with ponderous debate ore miouize. 
The testing discipline sometimes possesses an aura of sanctity that can cause teachers to feel inadequate as they approach the task of mastering principles and designing effective instruments. Some testing manuals, with their heavy emphasis on jargon and mathematical equations, don't help to dissipate that mystique. By the end of Language Assessment, readers will have gained access to this not-so-frightening field. They will have a working knowledge of a number of useful, fundamental principles of assessment and will have applied those principles to practical classroom contexts. They will also have acquired a storehouse of useful, comprehensible tools for evaluating and designing practical, effective assessment techniques for their classrooms.

PRINCIPAL FEATURES

Notable features of this book include the following:

• Clearly framed fundamental principles for evaluating and designing assessment procedures of all kinds
• Focus on the most common pedagogical challenge: classroom-based assessment
• Many practical examples to illustrate principles and guidelines
• Concise but comprehensive treatment of assessing all four skills (listening, speaking, reading, writing)
• In each skill, classification of assessment techniques that range from controlled to open-ended item types on a specified continuum of micro- and macroskills of language
• Explanation of standards-based assessment: what it is, why it is so popular, and its pros and cons
• Thorough discussion of large-scale standardized tests: their purpose, design, validity, and utility
• Consideration of the ethics of testing in an educational and commercial world driven by tests
• A comprehensive presentation of alternatives in assessment, namely, portfolios, journals, conferences, observations, interviews, and self- and peer-assessment
• A systematic discussion of letter grading and overall evaluation of student performance in a course
• End-of-chapter exercises that suggest whole-class discussion and individual, pair, and group work for the classroom
• Suggested additional readings at the end of each chapter

IMPROVEMENTS IN THE SECOND EDITION

In this second edition of Language Assessment, a number of changes are present, reflecting advances in the field as well as enhancements based on feedback from the first edition. Some of these changes are as follows:

• Advance organizers at the beginning of each chapter. Each chapter now begins with a brief list of objectives, which serve as prereading organizers for students and instructors.
• A new chapter. The domain of form-focused assessment is now addressed in a separate chapter (Chapter 11). Although assessing the four skills involves, in some cases, assessing pertinent grammar and vocabulary, it is appropriate to treat such forms as a separate issue. Background information and a survey of research are presented along with practical examples of techniques for assessing grammar and vocabulary.
• Updated references and new information. The six years of research and practice that have transpired since the first edition of this book create the obvious need to encapsulate that progress in the form of reports of new research, updated references, and the insertion of new information. The latter includes the following:
  • Reorganization of the first three chapters for a more logical progression of steps toward understanding classroom-based assessment
  • A newer and more inclusive discussion of the history of language assessment
  • Updated descriptions of current issues and challenges throughout
  • Recent research and practice with regard to standards-based assessment, in a completely redesigned chapter (Chapter 4)
  • A discussion of rubrics in the chapter on "alternatives" (Chapter 6)
  • The aforementioned treatment of assessing grammar and vocabulary, now with the attention it deserves in a separate chapter (Chapter 11)
  • An additional section in the chapter on grading and evaluation (Chapter 12) on calculating grades, with references to Web-based resources

OBJECTIVES: After reading this chapter, you will be able to

• understand differences between assessment and testing, along with other basic assessment concepts and terms
• distinguish among five different types of language tests, cite examples of each, and apply them for different purposes and contexts
• appreciate historical antecedents of present-day trends and research in language assessment
• grasp some of the major current issues that assessment researchers are now addressing

Tests have a way of scaring students. How many times in your school days did you feel yourself tense up when your teacher mentioned a test? The anticipation of the upcoming "moment of truth" provoked feelings of anxiety and self-doubt along with a fervent hope that you would come out on the other end with at least a sense of worthiness. The fear of failure is perhaps one of the strongest negative emotions a student can experience, and the most common instrument inflicting such fear is the test. You are not likely to view a test as positive, pleasant, or affirming, and, like most ordinary mortals, you intensely wish for a miraculous exemption from the ordeal.

And yet tests seem as unavoidable as tomorrow's sunrise in virtually all
Courses of sexy in every dcipine ire marked by these periodic milestones of progress (or sometimes in the pe Ception ofthe lamer, confirmations cf inaequacy) tha hae become conve: femal methods of measurement. The guckeeping function of tesis—from CGassroom achievement tess to largescale standardized (ets—Ius become eo acceptable norm. "Now, ust fo fun, take the following quiz. Al five of the word are found ia standard English diconaries, so you sould be able to answer al ie items cs right? Okay, go fori 2 wrt Asan Cac an es Deco in exch of he i ites belo seth definition that conectly ‘how unequivocally the tose Kad of asks predic communicative sucess in language especialy wrteedacqston ofthe ante evrce oft Limitation, sandzdined apne ex are seo wed oe, withthe exception, praps, ef Senifying foreign nguageesming diseiity (Gnsteié & Reed, 2008, led, aes to measure ngage apirde more fren provide learners with information aout tec peered ses andthe ote fal seengts and weskneses, with flowy sacl for capitling 08 che Secaghs and overcoming the weaknesses Robins, 205; Seen, 202) Any test has custo pred soees nang a lnguage is undoubted fed becase see now mow that with appropae elinow edge ace sates ivaveetia Teaing anor stegesbeed aston viral everyone can every se eed Te pigrntole lamers prio, before they have even attempted co eam heage to presippose aur or success without substantial ens. (fret “Gseusson of guage aptnde can be foand in H.D. rowats 2007] Prinses of Language Leorning and Teaching PLT, Chapter 43" ISSUES IN LANGUAGE ASSESSMENT: THEN AND NOW Before moving on to the practicalities of creating casroom tests and asessments, you wil beter appreciate the intricacies ofthe process by taking rif histone Took at language esting over the pat half centuy and taking not of some curéat ‘souesin the Held. 
istorii, langugesesting trends and practices hae flowed the siting sade of teaching methodology (or a description ofthese trends, see HD, Bron {a00rby, acing by Principles (TBP, Chapter 2). Fr rape inthe 19206 {950s aa eraofbehavorsm and special atention to contastiv nals, language tees foewsed on specific lingisie elements such asthe pbonologicel, gramme: fea, an lexical contrasts berween two laguages. nthe 1970 and 1880s, com tnunicative theories of language Brought with them 2 more integrative view of tenting in which specials cained tat‘the whole ofthe commuriceve event 7 puma levees ae mde ini Book wo compos vtumes by #- Dovghs esa of Language Learning ond Teaching PEL, ta etion, 200782 ‘on which pedagogical pracces te based. Teacing by Princes CBE, til elton, ‘oon spels oat tha peeagogy in practical terms forte language teacher | ume Aces Cece adios 13 wis consideabiy grt than the sum ofits linguistic elements” (Cask, 1983, 1.432) Today, et designers ae stil callenged in their quest for more authentic, ‘lid instruments that simulate real-wodd interaction (Leung & Lewkowcr, 2006) Behavioral Influences on Language Testing ‘Theoogh the mile ofthe twentieth century, language teaching and esting were both stone infuenced by bebaviol porehology and srvtu linguists. Both indiions enptoszed sentenceievel grammatical praigns, definitions of voeabu lary ‘tems, an transation fom fist language to second language an paced oly tnor focus on realword authentic communication, Typical, tes consisted of yaar and vocabulary items in mulplechoie format along with 2 varie of transition exercises ranging from words to sentence to short paragraphs Such diserete point formats sl preva nd, especaly in largescale stand sned‘estance examination” wed to admit students tiation of higher edcaion ‘around the wo. 
(See Barnwell, 1996, and Spolsky, 1978, 1995, for a summary.) Essentially, assessments were designed on the assumption that language can be broken down into its component parts and that those parts can be tested successfully. These components are the skills of listening, speaking, reading, and writing and the various units of language (discrete points) of phonology/graphology, morphology, lexicon, syntax, and discourse. It was claimed that an overall language proficiency test, then, should sample all four skills and as many linguistic discrete points as possible.

Discrete-point testing provided fertile ground for what Spolsky (1978, 1995) called the psychometric-structuralist approach to language assessment, in which test designers seized the tools of the day to focus on issues of validity, reliability, and objectivity. Standardized tests of language blossomed in this scientific climate, and the language teaching/testing world saw such tests as the Michigan Test of English Language Proficiency (1961) and the Test of English as a Foreign Language (1963) become extraordinarily popular. The science of measurement and the art of teaching appeared to have made a revolutionary alliance.

Integrative Approaches

In the midst of this fervor, language pedagogy was rapidly moving in more communicative directions, and testing specialists were forced into a debate that would soon respond to the changes. The discrete-point approach presupposed a decontextualization that was proving to be inauthentic. So, as the profession emerged into an era of emphasizing communication, authenticity, and context, new approaches were sought. John Oller (1979) argued that language competence was a unified set of interacting abilities that could not be tested separately. His claim was that communicative competence is so global and requires such integration that it cannot be captured in additive tests of grammar, reading, vocabulary, and other discrete points of language. Others (among them Cziko, 1982, and Savignon, 1982) soon followed in their support for what became known as integrative testing.

What does an integrative test look like?
Two types of tests were, atthe ime, claimed tobe examples of incegraive tests cloze ests and dications, A cloze test isareading pasage (perhaps 150 to 300 words in which roughly every ssthorser- ‘nth word las bea deleted the ressakeris required 0 sup words that fi ito those blanks, Gee Chapter 8 fora fll discussion of coz testing) (ter (1979) ciaimed thc cloze tes: ests Were good measures Of Oral peo- ficiency. According to theoretical constrvts undeing this cin, the ability to supply appropeite words in blanks requires competence in 2 language, which inchdes Knowedge of vocabulary, grammatical sirvcure, discourse structure, reading sills and staesies; aad an incerazed “expectancy” grammar (enabling ‘ne to predict antes that will come next in a sequence). I was argued that suc cessful completion of cloze tems taps ar all of those abies, which were sed ro be the essence of gota langage proficiency. ‘Dictation, in which learners listen toa sort passage and write what they hea 4s familar language eaching techie that evolved into a testing techaiue, Gee (Ctaprer 6 fora dscuson of dictation a an astessment device) Supporters argued thar dictation was an integrate test because ps ino grammatical and discourse competencies required foc other modes of peformasce ina language. Seccess on 2 dctaon est requies careful Istening, repeoduton in wrkdag of wit is head, ef ent shortierm memory, and, to an extent, some expectancy rules to ad the shore ‘erm memory. Further, dictation tes resus tend to corlate suongly with thee ‘ess of proficiency. For lrgescule testing, the usualy casroomentred dictation techaigue ca be practical and refable trough the design of multiple choice items. Proponents of integrative test methods scoa centered thei argumens on Wit became inown as the wnltary trait hypothests, which suggested an indie” view of angvage proficiency: tar vocabulcy,graumar, phonology, the our skis? 
and other discrete points of language could not be disentangled from each other in language performance. The unitary trait hypothesis argued that there is a general factor of language proficiency such that all the discrete points do not add up to that whole. However, in light of a series of debates and research evidence (Farhady, 1982; Oller, 1983), the unitary trait hypothesis was abandoned.

Communicative Language Testing

By the mid-1980s, especially in the wake of Canale and Swain's (1980) seminal work on communicative competence, the language-testing field had begun to focus on designing communicative language-testing tasks. Bachman and Palmer (1996) included among "fundamental" principles of language testing the need for a correspondence between language test performance and language use: "In order for a particular language test to be useful for its intended purposes, test performance must correspond in demonstrable ways to language use in nontest situations" (p. 9). The problem that language assessment experts faced was that tasks tended to be artificial, contrived, and unlikely to mirror language use in real life. As Weir (1990) noted, "Integrative tests such as cloze only tell us about a candidate's linguistic competence. They do not tell us anything directly about a student's performance ability" (p. 6).
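
For readers who want to see the mechanics of the cloze procedure that Weir refers to (a passage with roughly every sixth or seventh word deleted), the following is a minimal sketch rather than a recommended test-construction tool. The function names make_cloze and score_exact, the default deletion interval of 7, and the exact-word scoring rule are all assumptions made for illustration; the discussion above does not prescribe a scoring method.

```python
import re


def make_cloze(passage, n=7, blank="_____"):
    """Replace every n-th word of the passage with a blank.

    Returns the cloze text and the list of deleted words (the answer key).
    The interval n is an illustrative assumption (every seventh word).
    """
    cloze_tokens = []
    answer_key = []
    for position, token in enumerate(passage.split(), start=1):
        if position % n != 0:
            cloze_tokens.append(token)
            continue
        # Separate the word from any punctuation attached to it,
        # so that "today." becomes "_____." and the period survives.
        match = re.fullmatch(r"(\W*)(\w[\w'-]*)(\W*)", token)
        if match is None:
            # Not a plain word (for example, a lone dash): leave it intact.
            cloze_tokens.append(token)
            continue
        lead, word, trail = match.groups()
        answer_key.append(word)
        cloze_tokens.append(lead + blank + trail)
    return " ".join(cloze_tokens), answer_key


def score_exact(responses, answer_key):
    """Count responses that match the deleted word exactly, ignoring case.

    Exact-word scoring is an assumption of this sketch.
    """
    return sum(
        given.strip().lower() == expected.lower()
        for given, expected in zip(responses, answer_key)
    )


passage = ("The quick brown fox jumps over the lazy dog "
           "near the old river bank today.")
cloze_text, key = make_cloze(passage, n=7)
print(cloze_text)
print(key)
```

A mechanical deletion rule like this one says nothing about whether the resulting blanks tap the "expectancy" grammar that supporters of cloze testing had in mind; it only shows how the blanks and the answer key are produced.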
Thus a quest for authenticity was launched, as test designers centered on communicative performance. Following Canale and Swain's (1980) model, Bachman (1990) proposed a model of language competence consisting of organizational and pragmatic competence, respectively subdivided into grammatical and textual components and into illocutionary and sociolinguistic components (see Figure 1.2).

Language Competence
    Organizational Competence
        Grammatical Competence
        Textual Competence
    Pragmatic Competence
        Illocutionary Competence
        Sociolinguistic Competence

Figure 1.2. Components of language competence (Bachman, 1990, p. 87)

Bachman and Palmer (1996, pp. 70-75) also emphasized the importance of strategic competence (the ability to employ communicative strategies to compensate for breakdowns as well as to enhance the rhetorical effect of utterances) in the process of communication. All elements of the model, especially pragmatic and strategic abilities, needed to be included in the constructs of language testing and in the actual performance required of test-takers.

Communicative testing presented challenges to test designers, as we will see in subsequent chapters of this book. Test designers began to identify the kinds of real-world tasks that language learners were called on to perform. It was clear that the contexts for those tasks were extraordinarily varied and that the sampling of tasks for any assessment procedure needed to be validated by what language users actually do with language. Weir (1990) reminded his readers that "to measure language proficiency ... account must now be taken of where, when, how, with whom, and why the language is to be used, and on what topics, and with what effect" (p. 11).
The assessment field also became more concerned with the authenticity of tasks and the genuineness of texts. (See Skehan, 1988, 1989, and Fulcher, 2000, for surveys of communicative testing research.)

Performance-Based Assessment

In language courses and programs around the world, test designers are now tackling this new and more student-centered agenda (Alderson & Banerjee, 2001, 2002; Bachman, 2002; Leung & Lewkowicz, 2006; Weir, 2005). Instead of offering paper-and-pencil multiple-choice tests of a plethora of separate items, performance-based assessment of language typically involves oral production, written production, open-ended responses, integrated performance (across skill areas), group performance, and other interactive tasks. To be sure, such assessment is time-consuming and therefore expensive, but those extra efforts result in more direct and more accurate testing because students are assessed as they perform actual or simulated real-world tasks. In technical terms, higher content validity (see Chapter 2 for an explanation) is achieved because learners are measured in the process of performing the targeted linguistic acts.

In an English language-teaching context, performance-based assessment means that you may have a difficult time distinguishing between formal and informal assessment. If you rely a little less on formally structured tests and a little more on evaluation while students are performing various tasks, you will be taking some steps toward meeting the goals of performance-based assessment. (See Chapter 10 for a further discussion of performance-based assessment.)

A characteristic of many (but not all) performance-based language assessments is the presence of interactive tasks, hence the alternative term, task-based assessment, for such approaches. In such cases, the assessments involve learners in actually performing the behavior that we want to measure. In interactive tasks, test-takers are measured in the act of speaking, requesting, responding, or in combining listening and speaking, and in integrating reading and writing. Paper-and-pencil tests certainly do not elicit such communicative performance.

A prime example of an interactive language assessment procedure is an oral interview. The test-taker is required to listen accurately to someone else and to respond appropriately. If care is taken in the test design process, language elicited and volunteered by the student can be personalized
and meaningful, and tasks can approach the authenticity of real-life language use (see Chapter ).

CURRENT "HOT TOPICS" IN CLASSROOM-BASED ASSESSMENT

Designing communicative, performance-based assessment rubrics continues to challenge assessment experts and classroom teachers alike. In addition, three new issues in the field are shaping our current understanding of effective assessment. These are (1) the effect of new theories of intelligence on the testing industry in general, (2) the advent of what has come to be called "alternative" assessment, and (3) the increasing use of computer technology in assessments of various kinds. We briefly explore these issues here.

Multiple Intelligences

Intelligence was once viewed strictly as the ability to perform linguistic and logical-mathematical problem solving. This IQ (intelligence quotient) concept of intelligence has permeated the Western world and its way of testing for almost a century. Because smartness in general is measured by timed, discrete-point tests consisting of a hierarchy of separate items, why shouldn't every field of study be so measured? For many years, we have lived in a world of standardized, norm-referenced tests that are timed in a multiple-choice format and consist of a multiplicity of logic-constrained items, many of which are inauthentic.

However, in the last two decades of the twentieth century, research on intelligence began to turn the psychometric world upside down. Howard Gardner (1983, 1999), for example, extended the traditional view of intelligence to eight different components.¹ He accepted the traditional conceptualizations of linguistic intelligence and logical-mathematical intelligence on which standardized IQ tests are based but included other "frames of mind" in his theory of multiple intelligences: spatial, musical, kinesthetic, naturalist, interpersonal, and intrapersonal. Robert
Rober ‘Steberg (198, 1997) aso cared ew triton in ineigence research a eo ining ceive inking ed manipulate satis 2 pat ofinteligece, Ukewse, ‘anal Glenn's (1995) concept of £Q (emotional quotient) spurred us 0 une ‘scare the imporance ofthe emtoas in our cogitve processing “Tse conceptions of inceligece were not universally ecceped by te seadenie comunity (Se White, 1988, forexample). Aeral, bow does one objec: trely mesure such bypotbedalconsnics as iterpesonlineligenc, cea, vid elreceer? Nevertbeles, thes lantive appeal infuses educators with a sess ft borb fetdor and responsi in thei caching and testing agendas, a8 dleaced by educatioal reforms atthe tie (Armstong, 1994) adhe language asesssent il in particular, the recognition of nile inte genes tas had an invert fect. On the one band, communicate das aarvais in textbooks and programs have gid increasing attention 10 divers of tesrng ables and ses. Cisson (2005), for camp feed more tan 150 satvtes for langeage lamers, exch exphasizing specie ineigences, On the ‘rhe hand, in casroom asesment, new ews 0 neigence bare helped to free Tanguageiastrecion progam fro rejiag excunely on timed, dire pin tea tess in measuring guage. Cassrom language eachers have ben fecely prod 0 cao combat te potential tranny of obectiviy” an —_——_—____——. “TJeranmaazy ol are? ear ofiarligene, DB (07, pp 107-110) 12 cure Aseinen Cones sate accompanying impersonal approich. Teachers and adminsators have also been urged to measure whe inguage sl, earning proceses, andthe ability 1 nego. ‘iste meaning. 
Our challenge continues to be designing assessments tat tp sto tespersona, creative, communicative, an incerctive sis and in dogo pace Some test in ou subject ad ntition, ‘Traditional and Alternative Assessment Implied in some ofthe eutier description of performance based clasroom asses ‘ment isa trend wo supplement tadtional tet design with atemaives tat ae more authea inthe ection of meaningful communication Tbe 1.1 highligh df ‘erences beoween th two approaches adapted from Armstrong, 1934, and Bailey, 1988, p. 207. ‘Two caveats need to be sated here. Fest, the concepsin Table 1. represent some overgeneraizations and should therefore be ncerpreted with cation. is di Sut inact, to daa car ineofdsincon berween what anmstzong (1954) and Baley (1996) called wadional and alternative assessment Many forms of aes ‘ment fla berweea the two, and some combize the best of both, ‘Secon itis obvious tat the ale shows abs toward aternative assessment, ‘and one shoud not be mised into thinking that evecything on the leehand side i tainted wibereas the ist oa the righthand side offers station to the feld of a guage asessient. As Brown aad Hudson (1998) pty poiated out, the assesment traions sralable tows should be valued and lized forthe funcios they pro- ide, At the same time, we might al be stimalated to lok athe rghtiand Hs and ‘ask ourselves if among those concept, cere ae alternatives to assesment tat we ‘can use constructively in our chssrooms. 1k should be noted here tat consideibly more time sad higher insrtoaal budges are required to sdminister and score assesment that presuppose more subjective eration, more individualzation, znd more iteration inthe proces Table 1.1. Taitonal and atematve. 
Condinvous longterm assesment Timed, multiple choice fomat Untimed, open-ended responses Deconietualze est ies CConteualized communicaive asks ‘Sens suficefr feedback lndividuelized fecback Norm-eferenced seres Citerionseferenced sores Focus on derte anowers Operended, creative ansvert Summative Formative Oriented t product Oriented t process Nenintratve priomance Interactive psfrmance etisic motvaton Faster invinsic matvaton a 10 been jt aego- |p ato iso place | ec eeerreeereereeeet over | wsesme Coc and et 19 of otfeing feedback. The pay for the later, however, comes with more usefil feedback to setents, the potential for intrinsic motivation, and ukimately a store complete description ofa tudea’s abl. (See Chaper6 for a complete treatneat of aliernziesin assessment) More educstors 2nd atvocates for edison refora, axe arguing fora deemphasis of large scale sandaried tess in ive of contexte alized, communicative, pecormancebased asesoment that Wil beter fitate tearing in our schools. la Chapters 4 and 5, isues surrounding saadardzed testing ae addressed ar length) Computer-Based Testing Receat years have seen a burgeoning of computer technology and applications of that technology to language learning and teaching. Virwaly every language leamer wortdwide is 0 a lesser or grestr extent, a user of computers, the Tere, Pods, cel phones, the Web, and other common cyberecheolog. I's 0 surprise, then, that an orerwelming aunber of language courses ullze some ‘orm of computer assisted language learning (CALL) to achieve thet goals, 28 recent publiitions show (Chapelle, 2005; Chapelle & Jamieson, 2008; de ‘Seendelly, 2005; H.D. Brown, 20073). 
The assessment of language learning is no exception to the mushrooming growth of computer technology in educational contexts (see Chapelle & Douglas, 2006; Douglas & Hegelheimer, 2008; Jamieson, 2005, for overviews of computer-based second language testing). Some computer-based tests are small-scale "homegrown" tests available on a plethora of Web sites. Others are standardized, large-scale tests in which tens of thousands of test-takers may be involved. Students receive prompts (or probes, as they are sometimes referred to) in the form of spoken or written stimuli from a preprogrammed algorithm and are required to type (or in some cases, speak) their responses. Most computer-based test items have fixed, closed-ended responses; however, tests such as the Test of English as a Foreign Language (TOEFL® Test) now offer a written essay section and an oral production section, both of which are scored by humans (as opposed to automatic, electronic, or machine scoring).

Recent developments in computer-based assessment include contributions of corpus linguistics in providing more authenticity, the design of more complex tasks in computer-delivered tests, the utilization of speech and writing recognition software to score oral and written production (Jamieson, 2005), and some intriguing questions about "whether and how the delivery medium [of computer-based language testing] changes the nature of the construct being measured" (Douglas & Hegelheimer, 2008, p. 116).

A specific type of computer-based test, a computer-adaptive test (CAT), has been available for many years but has recently gained momentum. In a computer-adaptive test, each test-taker receives a set of questions that meet the test specifications and are generally appropriate for his or her performance level. The CAT starts with questions of moderate difficulty. As test-takers answer each question, the computer scores the question and uses that information, as well as the responses to previous questions, to determine which question will be presented next.
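
The selection loop of a computer-adaptive test can be sketched in a few lines. What follows is a deliberately simplified illustration under stated assumptions, not a description of how any operational test is built: the five-level difficulty scale, the one-step-up/one-step-down rule, and the names Item, pick_item, next_difficulty, and run_cat are all hypothetical. Operational CATs typically rely on statistically calibrated item banks rather than a fixed step rule.

```python
from dataclasses import dataclass


@dataclass
class Item:
    prompt: str
    answer: str
    difficulty: int  # hypothetical scale: 1 (easiest) to 5 (hardest)


def pick_item(unused, target):
    """Take an unused item at the target level, or at the nearest level."""
    for level in sorted(unused, key=lambda lv: (abs(lv - target), lv)):
        if unused[level]:
            return unused[level].pop(0)
    return None  # the item bank is exhausted


def next_difficulty(current, was_correct, lowest=1, highest=5):
    """Assumed rule: step up after a correct answer, down after an error."""
    step = 1 if was_correct else -1
    return max(lowest, min(highest, current + step))


def run_cat(bank, respond, n_questions=5, start=3):
    """Administer one question at a time, scoring each before the next.

    bank maps a difficulty level to a list of Items; respond is any
    callable that takes an Item and returns the test-taker's answer.
    Returns a list of (difficulty, was_correct) pairs.
    """
    unused = {level: list(items) for level, items in bank.items()}
    level = start  # begin with an item of moderate difficulty
    history = []
    for _ in range(n_questions):
        item = pick_item(unused, level)
        if item is None:
            break
        was_correct = respond(item).strip().lower() == item.answer.lower()
        history.append((item.difficulty, was_correct))
        level = next_difficulty(item.difficulty, was_correct)
    return history


# A toy bank: three items at each of five difficulty levels.
demo_bank = {
    level: [Item(f"Question {level}.{k}", f"answer-{level}-{k}", level)
            for k in range(3)]
    for level in range(1, 6)
}

# A test-taker who answers everything correctly climbs to the top level.
print(run_cat(demo_bank, lambda item: item.answer))
```

Because each answer is scored before the next item is chosen, a design like this cannot let the test-taker skip ahead or go back, which is exactly the constraint noted for CATs in this section.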
As long as examinees respond correctly, the computer typically selects questions of greater or equal difficulty. Incorrect answers, however, typically bring questions of lesser or equal difficulty. The computer is programmed to fulfill the test design as it continuously adjusts to find questions of appropriate difficulty for test-takers at all performance levels. In CATs, the test-taker sees only one question at a time, and the computer scores each question before selecting the next one. As a result, test-takers cannot skip questions, and once they have entered and confirmed their answers, they cannot return to questions or to any earlier part of the test.

Computer-based testing, with or without CAT technology, offers these advantages:

• a variety of easily administered classroom-based tests
• self-directed testing on various aspects of a language (vocabulary, grammar, discourse, one or all of the four skills, etc.)
• practice for upcoming high-stakes standardized tests
• some individualization, in the case of CATs
• large-scale standardized tests that can be administered easily to thousands of test-takers at many different stations, then scored electronically for rapid reporting of results
• improved (but imperfect) technology for automated essay evaluation and speech recognition (Douglas & Hegelheimer, 2008)

Of course, some disadvantages are present in our current predilection for computer-based testing. Among them:

• Lack of security and the possibility of cheating are inherent in unsupervised computerized tests.
• Occasional "homegrown" quizzes that appear on unofficial Web sites may be mistaken for validated assessments.
• The multiple-choice format preferred for most computer-based tests contains the usual potential for flawed item design (see Chapter 3).
• Open-ended responses are less likely to appear due to (a) the expense and potential unreliability of human scoring or (b) the complexity of recognition software for automated scoring.
• The human interactive element (especially in oral production) is absent.
• Validation issues stem from test-takers approaching tasks as test tasks rather than as real-world language use (Douglas & Hegelheimer, 2008).

Some argue that computer-based testing, pushed to its ultimate level, might militate against recent efforts to return testing to its artful form of (a) being tailored by teachers for their classrooms, (b) being designed to be performance based, and (c) allowing a teacher-student dialogue to form the basis of assessment. This need not be the case. "While computer-assisted language tests [CALTs] have not fully lived up to their promise ... research and development of CALTs continues in interesting and principled directions" (Douglas & Hegelheimer, 2008, p. 127). Computer technology can be a boon to both communicative language teaching and testing (Chapelle & Jamieson, 2008; Jamieson, 2005). Teachers and test-makers now have access to an ever-increasing range of tools to make computer-based testing less formulaic. By using technological innovations creatively, testers will be able to enhance authenticity, increase interactive exchange, and promote autonomy.

OTHER CURRENT ISSUES

A survey of recent articles and books on language assessment yields several other current issues, beyond those mentioned above, that are being probed in assessment circles around the world. They will be discussed in subsequent chapters of this book but deserve some mention now, as you begin this journey into the intricacies of language assessment.

Direct assessment of speaking and writing. Whether by means of human evaluators or automated computer-based software, the testing industry has begun a new era in the direct assessment of productive skills. "Direct" means that test-takers must actually "do" language, not just respond to questions "about" language. In bygone years, test designers shrank from involving test-takers in actual language performance, especially in large-scale assessment, because of the cost and unreliability of the endeavor.
Now, with advances in discourse analysis, better accounting of examiner-examinee interaction, improved rubrics, and technology-enhanced scoring, the testing industry is taking on the challenge of direct assessment (ior, 200).

Advances in corpus linguistics. As literally billions of words and sentences are gathered from the real world, logged into linguistic corpora, and catalogued into manageable, retrievable data, the design of assessment instruments is being revolutionized. The old complaint that the language of standardized tests was too "phony" and contrived should no longer hold in the near future, as we have access to real language spoken and written in the real world (Conrad, 2005).

Standards-based assessment. In Chapter 4, we go into detail on issues surrounding standards-based assessment, or, as the process is known in some circles, establishing benchmarks or frameworks of reference. Around the world, educational institutions are demanding common criteria for evaluation of students entering into programs, advancing from one course to another, and graduating from one grade to the next. Such benchmarks are a hotbed of controversy at times as teachers, administrators, and politicians clash over ethical issues (McNamara & Roever, 2006). In more constructive moments, they provide much-needed standardization.

Consequential validity (impact). As you will soon read in the next chapter and then again in Chapter 4, according to some researchers, the impact of standardized tests, ubiquitous in many societies, governs and determines people's education (Shohamy, 2001). The antithesis of the negative aspects of such impact is the potential for reexamining the washback that tests and assessments can provide in the language classroom (Taylor, 2005).

* * * * *

As you read this book, we hope you will do so with an appreciation for the place of testing in assessment and with a sense of the interconnection of assessment and teaching. Assessment is an integral part of the teaching-learning cycle.
In an interactive, communicative curriculum, assessment is almost constant. Tests, which are a subset of assessment, can provide authenticity, motivation, and feedback to the learner. Tests are essential components of a successful curriculum and one of several partners in the learning process. Keep in mind these basic principles:

1. Periodic assessments, both formal and informal, can increase motivation by serving as milestones of student progress.
2. Appropriate assessments aid in the reinforcement and retention of information.
3. Assessments can confirm areas of strength and pinpoint areas needing further work.
4. Assessments can provide a sense of periodic closure to modules within a curriculum.
5. Assessments can promote student autonomy by encouraging students' self-evaluation of their progress.
6. Assessments can spur learners to set goals for themselves.
7. Assessments can aid in evaluating teaching effectiveness.

EXERCISES

[Note: (I) Individual work; (G) Group or pair work; (C) Whole-class discussion.]

1. (G) In a small group, look at Figure 1.1 on page 6 that shows, among other relationships, tests as a subset of assessment and the latter as a subset of teaching. Consider the following classroom teaching techniques: choral drill, pair pronunciation practice, reading aloud, information gap task, singing songs in English, writing a description of the weekend's activities. In your group, specifically describe aspects of each that could be used for assessment purposes. Share your conclusions with the rest of the class.

2. (G) The chart on the next page shows a hypothetical line of distinction between formative and summative assessment and between informal and formal assessment. In a group, reproduce the chart on a large sheet of paper (or four sheets). Then come to a consensus on which of the four cells each of the following techniques/procedures would be placed in and justify your decision. Share your results with other groups and discuss any differences of opinion.
   Placement tests
   Diagnostic tests
   Periodic achievement tests
   Short pop quizzes
   Standardized proficiency tests
   Final exams
   Portfolios
   Journals
   Speeches (prepared and rehearsed)
   Oral presentations (prepared but not rehearsed)
   Impromptu student responses to teacher's questions
   Student-written response (one paragraph) to a reading assignment
   Drafting and revising writing
   Final essays (after several drafts)
   Student oral responses to teacher questions after a videotaped lecture
   Whole-class open-ended discussion of a topic

                Formative       Summative
   Informal
   Formal

3. (I/C) On your own, research the distinction between norm-referenced and criterion-referenced testing and how bell-shaped and other distributions are derived. Then consider this question: If norm-referenced tests typically yield a distribution of scores that resemble a bell-shaped curve, what kinds of distributions are typical of classroom achievement tests in your experience? Report your findings to the class.

4. (C) In a whole-class discussion, ask volunteers to describe personal experiences (tell their own stories) about either taking or administering the five types of tests described on pages 9-12. In each case, have the volunteers assess the extent to which the test was successful in accomplishing its purpose.

5. (C) In a whole-class discussion, brainstorm a variety of test tasks (e.g., multiple-choice, true/false, essay questions) that class members have experienced in learning a foreign language, and make a list on the board of all the tasks that are mentioned. Then decide which of those tasks are performance based, which are communicative, and which are both, neither, or perhaps fall in between.

6. (G) In a small group, look at the list of Gardner's eight intelligences. Take one or two intelligences, as assigned to your group, and brainstorm some teaching activities that foster that type of intelligence. Then brainstorm some assessment tasks that may presuppose the same intelligence to perform well. Share your results with other groups.

7. (G) Table 1.1 lists traditional and alternative assessment tasks and characteristics. In pairs, brainstorm both the positive and the negative aspects of tasks in each list. Share your conclusions, which should yield a balanced perspective, with the rest of the class.

8. (C) Ask class members to share any experiences with computer-based testing and evaluate the advantages and disadvantages of those experiences.

9.
(G) With a partner, visit Dave's ESL Café at http://www.eslcafe.com/ and do one of the dozens of quizzes that are presented. Can you determine the extent to which such quizzes are useful for a classroom teacher and the extent to which they may present some problems and disadvantages? Report your findings back to the class.

FOR YOUR FURTHER READING

McNamara, Tim. (2000). Language testing. Oxford: Oxford University Press.

One of a number of Oxford University Press's brief introductions to various areas of language study, this 140-page primer on testing offers definitions of basic terms in language testing with brief explanations of fundamental concepts. It is a useful little reference book to check your understanding of testing jargon and issues in the field.

Mousavi, Seyyed Abbas. (2008). An encyclopedic dictionary of language testing (4th ed.). Tehran: Rahnama Publications.

This is a highly useful, very detailed compilation of virtually every term in the field of language testing, with definitions, background history, and research references. It provides comprehensive explanations of theories, principles, issues, tools, and tasks and an exhaustive bibliography. A shorter version of this 946-page tome may be found in Mousavi's (1999) Dictionary of Language Testing (Tehran: Rahnama Publications).

OBJECTIVES: After reading this chapter, you will be

production and the behaviors (listening, reading, grammaticality detection, and writing) actually sampled on such tests (Duran, Canale, Penfield, Stansfield, & Liskin-Gasparro, 1985). Because of the crucial need to offer a financially affordable proficiency test and the high cost of administering and scoring oral production tests, the omission of oral content was justified as an economic necessity. However, in the last decade, with advances in developing rubrics for scoring oral production tasks and in automated speech recognition software, more general language proficiency tests include oral production tasks, largely stemming from the demands of the professional community for authenticity and content validity.

Consequential Validity (Impact)

As well as these three widely accepted forms of evidence that may be introduced to support the validity of an assessment, two other categories may be of some interest and utility in your own quest for validating classroom tests. Messick (1989), T. McNamara (2000), Brindley (2001), Fulcher and Davidson (2007), and Gronlund and Waugh (2008), among others, underscore the potential importance of the consequences of using an assessment. Consequential validity encompasses all the consequences of a test, including such considerations as its accuracy in measuring intended criteria, its effect on the preparation of test-takers, and the (intended and unintended) social consequences of a test's interpretation and use.

Bachman and Palmer (1996), McKay (2000), Davies (2003), and Choi (2008) use the term impact to refer to consequential validity, perhaps more broadly encompassing the many consequences of assessment, before and after a test administration. The impact of test-taking and the use of test scores can, according to Bachman and Palmer (p. 30), be seen at both a macro level (the effect on society and educational systems) and a micro level (the effect on individual test-takers). At the macro level, Choi argued that the wholesale employment of standardized tests for such gatekeeping purposes as college admission "deprive students of crucial opportunities to learn and acquire productive language skills," causing test consumers to be "increasingly disillusioned with EFL testing" (p. 58).
More will be said about impact and related issues of values, social consequences, ethics, and fairness in Chapter 4.

As high-stakes assessment has gained ground in the last two decades, one aspect of consequential validity has drawn special attention: the effect of test preparation courses and manuals on performance. T. McNamara (2000) cautioned against test results that may reflect socioeconomic conditions such as opportunities for coaching that are "differentially available to the students being assessed (for example, because only some families can afford coaching, or because children with more highly educated parents get help from their parents)" (p. 54).

At the micro level, specifically the classroom instructional level, another important consequence of a test falls into the category of washback, to be defined and more fully discussed later in this chapter. Gronlund and Waugh (2008) encouraged teachers to consider the effect of assessments on student motivation, subsequent performance in a course, independent learning, study habits, and attitude toward school work.

Face Validity

A further facet of consequential validity is the extent to which "students view the assessment as fair, relevant, and useful for improving learning" (Gronlund, 1998, p. 210), or what has popularly been called (or misnamed) face validity. "Face validity refers to the degree to which a test looks right, and appears to measure the knowledge or abilities it claims to measure, based on the subjective judgment of the examinees who take it, the administrative personnel who decide on its use, and other psychometrically unsophisticated observers" (Mousavi, 2008, p. 247).

Despite the intuitive appeal of the concept of face validity, it remains a notion that cannot be empirically measured or theoretically justified under the category of validity. It is purely a factor of the "eye of the beholder": how the test-taker, or possibly the test giver, intuitively perceives an instrument. For this reason, many assessment experts (see Bachman, 1990, pp. 285-289) view face validity as a superficial factor that is too dependent on the whim of the perceiver. In Bachman's "post-mortem" on face validity, he echoes Mosier's (1947, p. 194) decades-old contention that face validity is a "pernicious fallacy ... [that should be] purged from the technician's vocabulary."

At the same time, Bachman (1990) and other assessment experts "grudgingly" agree that test appearance does indeed have an effect that neither test-takers nor test designers can ignore. Students may for a variety of reasons feel that a test isn't testing what it's supposed to test, and this might affect their performance and, consequently, create the student-related unreliability referred to previously. So student perception of a test's fairness is significant to classroom-based assessment because it can affect student performance and reliability. Teachers can increase student perception of fair tests by using:

• a well-constructed, expected format with familiar tasks
• tasks that can be accomplished within an allotted time limit
• items that are clear and uncomplicated
• directions that are crystal clear
• tasks that have been rehearsed in their previous course work
• tasks that relate to their course work (content validity)
• a difficulty level that presents a reasonable challenge

Finally, the issue of face validity reminds us that the psychological state of the learner (confidence, anxiety, etc.) is an important ingredient in peak performance. Students can be distracted and their anxiety increased if you "throw a curve" at them on a test. They need to have rehearsed test tasks before the fact and feel comfortable with them. A classroom test is not the time to introduce new tasks, because you won't know if student difficulty is a factor of the task itself or of the objectives you are testing.

I once administered a dictation test and a cloze test (see Chapter 9 for a discussion of cloze tests) as a placement test for a group of learners of English as a second language. Some learners were upset because such tests, on the face of it, did not appear to them to test their true abilities in English. They felt that a multiple-choice grammar test would have been the appropriate format to use. A few claimed they didn't perform well on the cloze and dictation because they were not accustomed to these formats. As it turned out, the tests served as superior instruments for placement, but the students did not think so.

As already noted above, validity is a complex concept, yet it is indispensable to the teacher's understanding of what makes a good test. We do well to heed Messick's (1989, p. 33) caution that validity is not an all-or-none proposition and that various forms of validity may need to be applied to a test in order to be satisfied with its overall effectiveness. If in your language assessment procedures you can make a point of primarily focusing on content and criterion validity, then you are well on your way to making accurate judgments about the competence of the learners with whom you are working.

AUTHENTICITY

A fourth major principle of language testing is authenticity, a concept that is a little slippery to define, especially within the art and science of evaluating and designing tests. Bachman and Palmer (1996) defined authenticity as "the degree of correspondence of the characteristics of a given language test task to the features of a target language task" (p. 23) and then suggested an agenda for identifying those target language tasks and for transforming them into valid test items.

As mentioned, authenticity is not a concept that easily lends itself to empirical definition or measurement. (Lewkowicz [2000] discussed the difficulties of operationalizing authenticity in language assessment.) After all, who can certify whether a task or language sample is "real-world" or not? Often such judgments are subjective, and yet authenticity is a concept that language-testing experts have paid a great deal of attention to (Bachman & Palmer, 1996; Fulcher & Davidson, 2007). Further, according to Chun (2006), many test types fail to simulate real-world tasks.

Essentially, when you make a claim for authenticity in a test task, you are saying that this task is likely to be enacted in the real world. Many test item types fail to simulate real-world tasks. They may be contrived or artificial in their attempt to target a grammatical form or a lexical item. The sequencing of items that bear no relationship to one another lacks authenticity. One does not have to look very long to find reading comprehension passages in proficiency tests that do not reflect a real-world passage.

In a test, authenticity may be present in the following ways:

AN AUTHENTIC TEST ...
• contains language that is as natural as possible
• has items that are contextualized rather than isolated
• includes meaningful, relevant, interesting topics
• provides some thematic organization to items, such as through a story line or episode
• offers tasks that replicate real-world tasks

The authenticity of test tasks in recent years has increased noticeably. Two or three decades ago, unconnected, boring, contrived items were accepted as a necessary component of testing. Things have changed. It was once assumed that large-scale testing could not include performance of the productive skills and stay within budgetary constraints, but now many such tests offer speaking and writing components. Reading passages are selected from real-world sources that test-takers are likely to have encountered or will encounter.
Listening comprehension sections feature natural language with hesitations, white noise, and interruptions. More tests offer items that are "episodic" in that they are sequenced to form meaningful units, paragraphs, or stories.

We invite you to take up the challenge of authenticity in your classroom tests. As we explore many different types of tasks in this book, especially in Chapters 6 through 9, the principle of authenticity will be very much in the forefront.

WASHBACK

A facet of consequential validity is washback: the effect of testing on teaching and learning. [...] how tests influence both teaching and learning. Cheng, Watanabe, and Curtis (2004) devoted an entire anthology to the issue of washback, and Spratt (2005) challenged teachers to become agents of beneficial washback in their language classrooms.

The following factors comprise the concept of washback:

A TEST THAT PROVIDES BENEFICIAL WASHBACK
* positively influences what and how teachers teach
* positively influences what and how learners learn
* offers learners a chance to adequately prepare
* gives learners feedback that enhances their language development
* is more formative in nature than summative
* provides conditions for peak performance by the learner

[...]

One way to enhance washback is to comment generously and specifically on test performance. Many overworked teachers return tests to students with a single letter grade or numerical score and consider their job done. In reality, letter grades and numerical scores give absolutely no information of intrinsic interest to the student.
Grades and scores alone, without comments and other feedback, reduce the linguistic and cognitive performance data available to the student to almost nothing. At best, they give a relative indication of a formulaic judgment of performance as compared to others in the class, which fosters competitive, not cooperative, learning.

With this in mind, when you return a written test or a data sheet from an oral production test, consider giving more than a number, grade, or phrase as your feedback. Even if your evaluation is not a neat little paragraph appended to the test, you can respond to as many details throughout the test as time will permit. Give praise for strengths (the "good stuff") as well as constructive criticism of weaknesses. Give strategic hints on how a student might improve certain elements of performance. In other words, take some time to make the test performance an intrinsically motivating experience from which a student will gain a sense of accomplishment and challenge.

A little bit of washback may also help students through a specification of the numerical scores on the various subsections of the test. A subsection on verb tenses, for example, that yields a relatively low score may serve the diagnostic purpose of showing the student an area of challenge.

Another viewpoint on washback is achieved by a quick consideration of differences between formative and summative tests, mentioned in Chapter 1. Formative tests, by definition, provide washback in the form of information to the learner on progress toward goals. But teachers might be tempted to feel that summative tests, which provide assessment at the end of a course or program, do not need to offer much in the way of washback. Such an attitude is unfortunate because the end of every language course or program is always the beginning of further pursuits, more learning, more goals, and more challenges to face. Even a final examination in a course should carry with it some means for giving washback to students.
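The idea of reporting scores by subsection, mentioned above as a modest source of diagnostic washback, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the subsection names, the point values, and the helpers `Subsection`, `weakest`, and `report` are hypothetical and do not come from the chapter or from any testing package.

```python
from dataclasses import dataclass


@dataclass
class Subsection:
    """One subsection of a test (hypothetical structure for illustration)."""
    name: str
    earned: float    # points the student earned
    possible: float  # points available in this subsection

    @property
    def percent(self) -> float:
        # Share of available points earned, as a percentage.
        return 100.0 * self.earned / self.possible


def weakest(subsections):
    """Return the subsection with the lowest percentage score."""
    return min(subsections, key=lambda s: s.percent)


def report(subsections):
    """Build a short per-subsection report that names one area of challenge."""
    lines = [
        f"{s.name}: {s.earned:g}/{s.possible:g} ({s.percent:.0f}%)"
        for s in subsections
    ]
    lines.append(f"Area of challenge: {weakest(subsections).name}")
    return "\n".join(lines)


# Invented scores for one student; the numbers are assumptions, not data.
scores = [
    Subsection("verb tenses", 6, 12),
    Subsection("vocabulary", 9, 10),
    Subsection("reading comprehension", 14, 18),
]

print(report(scores))
```

A report like this is no substitute for written comments; it simply makes the relative standing of each subsection visible to the student alongside whatever feedback the teacher adds.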
In my courses I never give a final examination as the last scheduled classroom session. I always administer a final exam during the penultimate session, then complete the evaluation of the exams in order to return them to students during the last class. At this time, the students receive scores, grades, and comments on their work, and I spend some of the class session addressing material on which the students were not completely clear. My summative assessment is thereby enhanced by some beneficial washback that is usually not expected of final examinations.

Finally, washback implies that students have ready access to you to discuss the feedback and evaluation you have given. Whereas you almost certainly have known teachers with whom you wouldn't dare argue about a grade, an interactive, cooperative, collaborative classroom can promote an atmosphere of dialogue between students and teachers regarding evaluative judgments. For learning to continue, students need to have a chance to feed back on your feedback, to seek clarification of any issues that are fuzzy, and to set new and appropriate goals for themselves for the days and weeks ahead.

APPLYING PRINCIPLES TO THE EVALUATION OF CLASSROOM TESTS

The five principles of practicality, reliability, validity, authenticity, and washback go a long way toward providing useful guidelines for both evaluating an existing assessment procedure and designing one on your own. Quizzes, tests, final exams, and standardized proficiency tests can all be scrutinized through these five lenses.

Are there other principles that should be invoked in evaluating and designing assessments? The answer, of course, is yes. Language assessment is an extraordinarily broad discipline with many branches, interest areas, and issues. The process of designing effective assessment instruments is far too complex to be reduced to five principles. Good test construction, for example, is governed by research-based rules of test preparation, sampling of tasks, item design and construction, scoring responses, ethical standards, and so on. But the five principles cited here serve as an excellent foundation on which to evaluate existing instruments and to build your own.

We will look at how to design tests in Chapter 3.
The tips and checklists that follow in this chapter, indexed by the five principles, will help you evaluate existing tests for your own classroom. It is important for you to remember, however, that the sequence of these questions does not imply a priority order. Validity, for example, is certainly the most significant cardinal principle of assessment evaluation. Practicality could be a secondary issue in classroom testing. Or, for a particular test, you may need to place authenticity as your primary consideration. When all is said and done, however, if validity is not substantiated, all other considerations may be rendered useless.

1. Are the test procedures practical?

Practicality is determined by the teacher's (and the students') time constraints, costs, and administrative details, and to some extent by what occurs before and after the test. To determine whether a test is practical for your needs, you may want to use the checklist below.

PRACTICALITY CHECKLIST
□ 1. [...]
□ 2. Can students complete the test reasonably within the set time frame?
□ 3. Can the test be administered smoothly, without procedural "glitches"?
□ 4. Are all printed materials accounted for?
□ 5. Has equipment been pretested?
□ 6. Is the scoring/evaluation system feasible in the teacher's time frame?
□ 7. Is the cost of the test within budgeted limits?
□ 8. Are methods for reporting results determined in advance?

As this checklist suggests, after you account for the administrative details of giving a test, you need to think about the practicality of your plans for scoring the test. In teachers' busy lives, time often emerges as the most important factor, one that overrides other considerations in evaluating an assessment. If you need to tailor a test to fit your own time frame, as teachers frequently do, you need to accomplish this without damaging the test's validity and washback. Teachers should, for example, avoid the temptation to offer only quickly scored multiple-choice selection items that may be neither appropriate nor well designed. Everyone knows teachers secretly hate to grade tests (almost as much as students hate to take them) and will do almost anything to get through that task as quickly and effortlessly as possible.
Yet good teaching almost always implies an investment of the teacher's time in giving feedback (comments and suggestions) to students on their tests.

2. Is the test itself reliable?

Reliability applies to the student, the test administration, the test itself, and the teacher. At least four sources of unreliability must be guarded against, as noted in this chapter on pages 27-29. Test and test administration reliability can be achieved by making sure that all students receive the same quality of input, whether written or auditory. The following checklist should help you to determine if a test is itself reliable:

TEST RELIABILITY CHECKLIST
□ 1. Does every student have a cleanly photocopied test sheet?
□ 2. Is sound amplification clearly audible to everyone in the room?
□ 3. Is video input clearly and uniformly visible to all?
□ 4. Are lighting, temperature, extraneous noise, and other classroom conditions equal (and optimal) for all students?
□ 5. For closed-ended responses, do scoring procedures leave little debate about correctness of an answer?

3. Can you ensure rater reliability?

Rater reliability, another common issue in assessments, may be more difficult, perhaps because we too often overlook this as an issue. Because classroom tests rarely involve two scorers, inter-rater reliability is seldom an issue. Instead, intra-rater reliability is of constant concern to teachers: What happens to our fallible concentration and stamina over the period of time during which we are evaluating a test? Teachers need to find ways to maintain their focus and energy over the time it takes to score assessments. In open-ended response tests, this issue is of paramount importance. It is easy to let mentally established standards erode over the hours required to evaluate the test.

Intra-rater reliability for open-ended responses may be enhanced by answering these questions:

INTRA-RATER RELIABILITY CHECKLIST
□ 1. Have you established consistent criteria for correct responses?
□ 2. Can you give uniform attention to those criteria throughout the evaluation time?
□ 3. Can you guarantee that scoring is based only on the established criteria and not on extraneous or subjective variables?
□ 4. Have you read through tests at least twice to check for consistency?
□ 5. If you have made "midstream" modifications of what you consider a correct response, did you go back and apply the same standards to all?
□ 6. Can you avoid fatigue by reading the tests in several sittings, especially if the time requirement is a matter of several hours?

4. Does the procedure demonstrate content validity?

The major source for establishing validity in a classroom test is content validity: the extent to which the assessment requires students to perform tasks that were included in the previous classroom lessons and that directly represent the objectives of the unit on which the assessment is based. If you have been teaching an English language class to students who have been reading, summarizing, and responding to short passages, and if your assessment is based on this work, then to be content valid, the test needs to include performance in those skills. For classroom assessments, content and criterion validity are closely linked, because lesson or unit objectives are essentially the criterion of an assessment covering that lesson or unit. Several steps might be taken to evaluate the content validity of a classroom test:

CONTENT VALIDITY CHECKLIST (FOR A TEST ON A UNIT)
□ 1. Are unit objectives clearly identified?
□ 2. Are unit objectives represented in the form of test specifications? (See the next page for details on test specifications.)
□ 3. Do the test specifications include tasks that have already been performed as part of the course procedures?
□ 4. Do the test specifications include tasks that represent all (or most) of the objectives for the unit?
□ 5. Do those tasks involve actual performance of the target task(s)?

A primary issue in establishing content validity is recognizing that underlying every good classroom test are the objectives of the lesson, module, or unit of the course in question. So the first measure of an effective classroom test is the identification of objectives.
Sometimes this is easier said than done. Too often teachers "work through" lessons day after day with little or no cognizance of the objectives they seek to fulfill. Or perhaps those objectives are so poorly defined that determining whether they were accomplished is impossible.

A second issue in content validity is test specifications (specs). Don't let this word scare you. It simply means that a test should have a structure that follows logically from the lesson or unit you are testing. Many tests have a design that

* divides them into a number of sections (corresponding, perhaps, to the objectives that are being assessed)
* offers students a variety of item types
* gives an appropriate relative weight to each section

Some tests, of course, do not lend themselves to this kind of structure. A test in a course in academic writing at the university level might justifiably consist of an in-class written essay on a given topic: only one "item" and one response, in a manner of speaking. But in this case the specs would be embedded in the prompt itself and in the scoring or evaluation rubric used to grade it and give feedback. We return to the concept of test specs in the next chapter.

The content validity of an existing classroom test should be apparent in how the objectives of the unit being tested are represented in the form of the content of items, clusters of items, and item types. Do you clearly perceive the performance of test-takers as reflective of the classroom objectives? If so (and you can argue this), content validity has most likely been achieved.

5. Has the impact of the test been carefully accounted for?

This question integrates the concept of consequential validity (impact) and the importance of structuring an assessment procedure to elicit the optimal performance of the student. Remember that even though it is an elusive concept, the appearance of a test from a student's point of view is important to consider.

The following factors might help you to pinpoint some of the issues surrounding the impact of a test:

CONSEQUENTIAL VALIDITY CHECKLIST
□ 1.
Have you offered students appropriate review and preparation for the test?
□ 2. Have you suggested test-taking strategies that will be beneficial?
□ 3. Is the test structured so that, if possible, the best students will be modestly challenged and the weaker students will not be overwhelmed?
□ 4. Does the test lend itself to your giving beneficial washback?
□ 5. Are the students encouraged to see the test as a learning experience?

6. Is the procedure "biased for best"?

A phrase that has come to be associated with consequential validity is "biased for best," a term that goes a little beyond how the student views the test to a degree of strategic involvement on the part of student and teacher in preparing for, setting up, and following up on the test itself. According to Swain (1984), to give an assessment procedure that is biased for best, a teacher provides conditions for a student's optimal performance. In such a case, your role is not to be "tricky" or to scare your students but to encourage them and bring out the best in their performance.

Cohen (2006) supported Swain's concept in a comprehensive discussion of research that showed the positive effects of students' awareness and utilization of test-taking strategies. It's easy for teachers to forget how challenging some tests can be, and so a well-planned testing experience will include some strategic suggestions on how students might optimize their performance. In evaluating a classroom test, consider the extent to which before-, during-, and after-test options are fulfilled.

7. Are the test tasks as authentic as possible?

Evaluate the extent to which a test is authentic by asking the following questions:

AUTHENTICITY CHECKLIST
□ 1. Is the language in the test as natural as possible?
□ 2. Are items as contextualized as possible rather than isolated?
□ 3. Are topics and situations interesting, enjoyable, and/or humorous?
□ 4. Is some thematic organization provided, such as through a story line or episode?
□ 5. Do tasks represent, or closely approximate, real-world tasks?

Consider the following two excerpts from tests, and the concept of authenticity may become a little clearer.

Multiple-choice tasks: contextualized

"Going To"
Directions: After answering the questions, [...]

1. Amanda: What ___ this weekend?
   ○ [...]  ○ [...]  ○ [...]
2. Gwen: I'm not sure. ___ anything special?
   ○ [...]  ○ [...]  ○ [...]
3. Amanda: Melissa and I ___ a party. Would you like to come?
   ○ [...]  ○ [...]  ○ [...]
4. Gwen: I'd love to! ___?
   ○ [...]  ○ [...]  ○ [...]
5. Amanda: It's ___ to be at [...].
   ○ [...]  ○ [...]  ○ [...]

(Adapted from [...])

Multiple-choice tasks: decontextualized

"Going To"

1. What ___ this summer?
   A. [...]  B. [...]  C. [...]
2. ___ anything special next weekend?
   A. [...]  B. [...]  C. [...]
3. [...] and I ___ my English class tomorrow.
   A. [...]  B. [...]  C. [...]
4. The Giants are playing baseball on Wednesday. ___
   A. [...]  B. [...]  C. [...]
5. The ocean's ___ to be at low tide later this morning.
   A. [...]  B. [...]  C. [...]

The sequence of items in the contextualized tasks achieves a modicum of authenticity by contextualizing all the items in a story line. The conversation is one that might occur in the real world, even if with a little less formality. The sequence of items in the decontextualized tasks takes the test-taker into five different topic areas with no context for any, with the grammatical category as the only unifying element. Each sentence is likely to be written or spoken in the real world but only perhaps in five different contexts. Given the constraints of a multiple-choice format, on a measure of authenticity we would say the first excerpt is good and the second excerpt is only fair.

8. Does the test offer beneficial washback to the learner?

The design of an effective test should point the way to beneficial washback. A test that achieves content validity demonstrates relevance to the curriculum in question and thereby sets the stage for washback. When test items represent the various objectives of a unit, and/or when sections of a test clearly focus on major topics of the unit, classroom tests can serve in a diagnostic capacity even if they aren't specifically labeled as such.
The following checklist should help you to maximize beneficial washback in a test:

WASHBACK CHECKLIST
□ 1. Is the test designed in such a way that you can give feedback that will be relevant to the objectives of the unit being tested?
□ 2. Have you given students sufficient pre-test opportunities to review the subject matter of the test?
□ 3. In your written feedback to each student, do you include comments that will contribute to students' formative development?
□ 4. After returning tests, do you spend class time "going over" the test and offering advice on what students should focus on in the future?
□ 5. After returning tests, do you encourage questions from students?
□ 6. If time and circumstances permit, do you offer students (especially the weaker ones) a chance to discuss results in an office hour?

[...] groups are more valuable, in terms of measurable washback, than the test itself. By spending classroom time after the test reviewing the content, students discover their areas of strength and weakness. Teachers can raise the washback potential [...]. Journal writing may provide students a specific place to record their feelings, what they learned, and their resolutions for future effort.

The five basic principles of language assessment have been expanded here into eight essential questions you might ask yourself about an assessment. As you use the principles and guidelines to evaluate various forms of tests and procedures, be sure to allow each one of the five to take on greater or lesser importance, depending on the context. In large-scale standardized testing, for example, practicality is usually more important than washback, but the reverse may be true of most classroom tests. Validity is of course always the final arbiter. Remember, too, that these principles, important as they are, are not the only considerations in evaluating or making an effective test. Leave some space for other factors to enter in.

The next chapter focuses on how to design a test. These same five principles underlie test construction as well as test evaluation, along with some new concepts that expand your ability to apply principles to the practice of language assessment in your own classroom.

EXERCISES

[Note: (I) Individual work; (G) Group or pair work; (C) Whole-class discussion.]

1.
(C) Ask the class to volunteer brief descriptions of tests they have taken or given that illustrate, either positively or negatively, each of the five basic principles of language assessment that are defined and explained in this chapter. In the process, try to come up with examples of tests that illustrate (and differentiate) four kinds of reliability as well as the four types of evidence that support the validity of a test.

Scenario
1. Standardized multiple-choice proficiency test, no oral or written production. Student (S) receives a report form listing a total score and subscores for listening, grammar, proofreading, and reading comprehension.
2. Timed impromptu test of written English. S receives a report form listing one holistic score ranging between 0 and 6.
3. One-on-one oral interview to assess overall oral production ability. S receives one holistic score ranging between 0 and 5.

2. (I/C) Some assessment experts contend that face validity is not a legitimate form of validity because it relies solely on the perception of the test-taker rather than an external measure. Nevertheless, a number of educational assessment experts recognize the perception of the test-taker as a very important factor in test design and administration. What is your opinion? How would you reconcile the two views?

3. (G) In the section on washback, it is stated that "Washback enhances a number of basic principles of language acquisition: intrinsic motivation, autonomy, self-confidence, language ego, interlanguage, and strategic investment, among others" (page 38). In a group, discuss the connection between washback and each of the above-named general principles of language learning and teaching. Describe specific examples or illustrations of each connection. If time permits, report your examples to the class.
4. (G) In a small group, evaluate the assessment scenarios in the chart on pages 49-50 by ranking the six factors listed there from 1 to 5 (with a score of 5 indicating that the principle is highly fulfilled and a score of 1 indicating very low or no fulfillment). Evaluate the scenarios by using your best intuition in the absence of complete information for each context. Report your group's findings to the rest of the class and compare.

Scenario (continued)
4. S gives a five-minute prepared oral presentation in class. Teacher (T) evaluates by filling in a rating sheet indicating S's success in delivery, rapport, pronunciation, grammar, and content.
5. S listens to a fifteen-minute video lecture and takes notes. T makes individual comments on each of S's notes.
6. S writes a take-home (overnight) one-page essay on an assigned topic. T reads paper and comments on organization and content only, then returns essay to S for a subsequent draft.
7. S creates multiple drafts of a [...] essay, peer- and T-reviewed, and turns in a final version. T comments on grammatical/rhetorical errors [...].
8. S assembles a portfolio of materials over a semester-long course. T conferences with S on the portfolio at the end of the semester.

FOR YOUR FURTHER READING

[...]
This highly informative "state-of-the-art" article summarizes a number of issues in classroom-based language assessment. [...]

Weir, C. J. (2005). Language testing and validation: An evidence-based approach. Basingstoke, England: Palgrave Macmillan.
[...]

5. (G) This chapter gives checklists to help you evaluate [...] principles of language assessment. In your group, [...]. Report [...] to the rest of the class.

6. (C) In our discussion of impact in this chapter, the suggestion was made that teachers can prepare students for tests by [...] strategies [...] preparing, taking, and learning from tests.
These might be categorized as "before, during, and after" strategies. [...] giving information about what to expect [...]. [...] in small groups, [...] each group [...]. Report your [...] back to the class.

7. (I/G) In an accessible language class, ask the teacher to allow you to observe an assessment procedure that is about to take place [...]. Do the following:
a. [...] the teacher [...] on the purpose of the assessment [...].
b. Observe [...] the actual administration [...].
[...]

CHAPTER 3
DESIGNING CLASSROOM LANGUAGE TESTS

OBJECTIVES: After reading this chapter, you will be able to
* go beyond evaluating an existing test to actually designing one on your own
* analyze the purpose of a test
* state in explicit terms the objectives of a proposed test
* create test specifications for a proposed test
* design a variety of items (test methods) for a proposed test
* carry out the administration of a test, after checking a number of essential details involved
* construct a guideline for scoring, grading, and giving feedback on a proposed test

[...] formative and summative tests and norm- and criterion-referenced tests. Different types or purposes of assessment have been introduced to you. You've traced some of the historical lines of thought in the field of language assessment. You have a sense of major current trends in language assessment, especially the present focus on turning ordeals into challenging and intrinsically motivating learning experiences. By now, certain foundational principles have entered your working vocabulary: practicality, reliability, validity, authenticity, and washback. You should now also possess a few tools with which you can evaluate the effectiveness of an existing classroom test.

In this chapter, you will draw on those foundations and tools to begin the process of designing tests or revising existing tests. As always, the primary focus of this book is on classroom-based assessment, because that's the down-to-earth context in which you are regularly involved. We'll deal with issues in large-scale, standardized testing in the next chapter.
So for now, for classroom purposes, let's start the process by asking some critical questions.

1. What is the purpose of the test? Why are you creating this test, or why was it created by, say, a textbook writer? What is its significance relative to your course (for example, to evaluate overall proficiency or place a student in a course)? How important is the test compared to other student performance? What will its impact be on you and your students before and after the assessment? Once you have established the major purpose of a test, it then becomes easier to specify its objectives.

2. What are the objectives of the test? What exactly are you trying to find out? Establishing appropriate objectives involves a number of issues, from relatively simple ones about forms and functions covered in a course unit to much more complex ones about constructs to be represented on the test. Included here are decisions about what language abilities are to be tested.

3. How will the test specifications reflect both the purpose and the objectives? To design or evaluate a test, you must make sure that the test has a structure that logically follows from the unit or lesson it is testing. The class objectives should be present in the test through appropriate task types and weights, a logical sequence, and a variety of tasks.

4. How will the test item types (tasks) be selected and the separate items arranged? The tasks need to be practical (as defined in Chapter 2). To have content validity, they should also mirror tasks of the course, lesson, or segment. They should also be authentic, with a progression biased for best performance. Finally, the tasks must be ones that can be evaluated reliably by the teacher or scorer.

5. In administering the test, what details should I attend to in order to help students achieve optimal performance? Once the test has been created and is ready to administer, students need to feel well prepared for their performance. An otherwise effective, valid test might fail to reach its goal if the conditions for test-taking are inadequately established.
How will you reduce unnecessary anxiety in students, raise their confidence, and help them view the test as an opportunity to learn?

6. What kind of scoring, grading, and/or feedback is expected? The appropriate form of feedback on tests will vary, depending on their purpose. For every test, the way results are reported is an important consideration. Under some circumstances a letter grade or a holistic score may be appropriate; other circumstances may require that a teacher offer substantive washback to the learner.

These six questions should form the basis of your approach to designing, administering, and making maximum use of tests in your classroom.

FOUR ASSESSMENT SCENARIOS

For the purposes of making practical applications in this chapter, we will consider four scenarios as we proceed through the six steps for designing an assessment. These common classroom contexts should enable you to identify with real-world assessment situations.

Scenario 1: Reading Quiz

The first context is an intermediate English class in [...]. The students, [...] secondary school students, [...] been assigned a two-page short story to [...]. The quiz will (a) give students a sense of how well they understood the story and (b) act as a starting point for a teacher-led discussion on each of the items. Results of the quiz will not be recorded in the teacher's record book.

Scenario 2: Grammar Unit Test

This test comes at the end of a three-week unit in a grammar-focus course at a high-beginning (Level 2) class in an adult school in the United States. Students have completed Level 1 or have been placed into Level 2 by a placement test. All the students are simultaneously taking two integrated-skills classes (listening/speaking and reading/writing), and the grammar class serves to reinforce the grammatical forms that have been encountered in the other two classes.

The grammar unit has covered verb tenses. The curriculum specifies that the [...]-minute test is to be divided into three sections: multiple-choice items, fill-in-the-blank (cloze) items, and a grammar editing task (where students must
detect errors in several written paragraphs). The test will be handed in, graded by the teacher, and returned to students a few days later.

Scenario 3: Midterm Essay

In a writing course in a university in Thailand, students at the advanced level have been working for half a semester on writing essays, mostly narrative and description essays. In the second half of the course, students will move on to cause, argument, and opinion essays.

The midterm test is an opportunity for students to demonstrate their ability to write a coherent essay with relatively few grammatical and rhetorical errors. The essay will be given in class (in a 90-minute class period). The students do not know the topic ahead of time but are allowed to use a bilingual dictionary to look up words or spelling. The curriculum specifies quality of writing over quantity. The teacher will read essays over the weekend and make comments but not give a grade or a score. During the next week, there will be peer conferences with the goal of each student to revise his or her essay, followed by a student-teacher conference after revision has taken place.

Scenario 4: Listening/Speaking Final Exam

Children in the fifth grade of a private school in Japan have been taking a 15-week course in oral communication skills (listening and speaking). This is their third year of English courses (they began in the third grade), and by now they are
able to comprehend simple English sentences, distinguish many phonemic contrasts, orally produce (repeat) sentences that have been modeled for them, and carry on very rudimentary oral exchanges, mostly using words and phrases they [...]. [...] grammatical accuracy is perhaps passable with the minimal amount [...] of language they [...].

The final exam for the class, according to the prescribed curriculum, consists of (a) listening to an audio program with most of the course's grammatical and phonological elements represented in a variety of stimulus types and responding to written multiple-choice items (the suggested time limit for this listening portion is 20 minutes), followed by (b) a three-minute oral interview, one-on-one, with the teacher. (Because this is a private school, the class size is quite small [15 students], which gives the teacher time to complete oral interviews within the allotted time for the final examination.) While students are going one by one into the oral interview in a separate room, others are doing Internet-based English activities and games in the school's computer lab. Because this is a final examination, the only anticipated follow-up to the two-part exam is a score report by the teacher, which parents and students will see in a few weeks.

Keep these four assessment situations in mind as you read this chapter. All four will be referred to as we look at the five steps in designing an effective test.

DETERMINING THE PURPOSE OF A TEST

You may think that every test you devise must be a wonderfully innovative instrument that impresses your colleagues and students alike. Not so. First, new and innovative testing formats take a lot of effort to design and a long time to refine through trial and error. Second, traditional testing techniques can, with a little creativity, conform to the spirit of an interactive, communicative language curriculum. Your best course of action as a new teacher is to work within the guidelines of accepted, known, traditional testing techniques. Slowly, with experience, you can get bolder in your designs.
In that spirit, let's consider some practical steps in constructing classroom tests.

The first and perhaps most important step in designing any sort of classroom assessment (or in determining the appropriateness of an existing test) is to step back and consider the overall purpose of the exercise that your students are about to perform. The purpose of an assessment is what Bachman and Palmer (1996, pp. 17-19) refer to as test usefulness, or to what use you will put an assessment. Consider the checklist below for determining purpose and usefulness:

PURPOSE AND USEFULNESS CHECKLIST
□ 1. Do I need to administer a test at this point in my course? If so, what purpose will it serve the students and/or me?
□ 2. What is its significance relative to my course?
□ 3. Is it simply an expected way to mark the end of a lesson, unit, or period of time?
□ 4. How important is it compared to other student performance?
□ 5. Do I want to use results to determine if my students have met certain predetermined curricular standards?
□ 6. Do I genuinely want students to be recipients of beneficial washback?
□ 7. Will I use the results as a means to allocate my own pedagogical efforts in the days or weeks to follow?
□ 8. What will its impact be on what I do, and what students do, before and after the assessment?

Now look back at each of the four assessment scenarios described on pages 54-55 and think about the purpose of each. Before reading on, do some personal brainstorming (see Exercise 1 at the end of this chapter) on just how the eight questions in the checklist will be answered for each scenario.

Reading Quiz. To start your thinking process, let's look at the purpose of the first scenario, the reading quiz. The quiz is designed to be an instructional tool to guide classroom discussion for one classroom period. Its significance is minor but not trivial when viewed against the backdrop of the whole course. Because it is a surprise test and a tool for teaching and self-assessment, the results will justifiably not be recorded, and so one student's performance compared to others' is irrelevant. It is entirely formative in nature, with its almost exclusive purpose of providing beneficial washback.
Forcing students to think independently about the reading passage allows them to see areas of strength and weakness in their comprehension skills.

Can you now consider the other three scenarios and think about the overall purpose of each one, given the context described and the information given? Your understanding of the purpose of an assessment procedure governs, to a great extent, the next steps you take in identifying clear objectives, designing test specifications, constructing tasks, and determining scoring and reporting criteria.

DESIGNING CLEAR, UNAMBIGUOUS OBJECTIVES

In addition to knowing the purpose of the test you're creating, you need to know as specifically as possible what it is you want to test. Sometimes teachers give tests simply because it's Friday in the third week of the course; after hasty glances at the chapter(s) covered during those three weeks, they dash off some test items so that students will have something to do during the class. This is no way to approach a test. Instead, begin by taking a careful look at everything that you think your students should "know" or be able to "do" based on the material that the students are responsible for. In other words, examine the objectives for the unit you are testing.

Remember that every curriculum should have appropriately framed, assessable objectives, that is, objectives that are stated in terms of overt performance by students. Thus an objective that states "Students will learn tag questions" or simply names the grammatical focus of "tag questions" is not testable. You don't know whether students should be able to understand them in spoken or written language, or whether they should be able to produce them orally or in writing. Nor do you know in what context (a conversation? an essay? an academic lecture?) those linguistic forms should be used. Your first task in designing a test, then, is to determine appropriate objectives, stated as explicitly as possible.

Grammar Unit Test.
If you're lucky, someone will have already stated objectives clearly in performance terms. If you're less fortunate, you may have to go back through a unit and formulate them yourself. Let's say you find yourself teaching the grammar focus class described in Scenario 2, and the objectives given by the course guide simply specify the following for the unit on verb tenses:

Students will understand and produce the following verb tenses in appropriate oral and written contexts:
1. simple present (review from an earlier level)
2. present continuous
3. simple past
4. present perfect

Elsewhere in the curriculum, "appropriate contexts" are described as a continuation of the material introduced and practiced in the other two (listening/speaking and reading/writing) classes. So you're left with a sketchy but workable set of objectives on which to base your unit test. You will certainly need to flesh these out in more detail before you can be satisfied that you have clear, assessable objectives.

Where do you begin? In this grammar course, students equally use all four skills as they work with the grammar focus structures. So to achieve content validity, your objectives should reflect all four modes of performance and sample all four verb tenses. Here is a possible set of objectives for you to work from:
(Table of objectives, grouped under comprehension and production, for the four verb tenses.)

Although these objectives may seem a bit overstated, it's usually helpful to itemize all the possible elements of both comprehension and production to give you a ready checklist for your test specifications (see next section). Notice that each objective is stated in terms of the performance elicited and the target linguistic context.

An oral elicitation can be paired with either an oral or a written response, and likewise for written elicitations. Granted, not all of the response modes correspond to all of the elicitation modes. For example, it is unlikely that a prompt of minimal pairs would be matched with a "yes/no" response, nor would a monologue as a prompt elicit spelling a word as a response. A modicum of intuition will eliminate these non sequiturs.

In Scenario 4, the fifth-grade English class in Japan, the curriculum dictates a listening section of 20 minutes and a three-minute oral interview. This may be a tall order for a final examination that ostensibly covers a semester's work in oral communication skills, but we'll begin with the course objectives, which the curriculum states in shortened form.

As you design your final exam, it is important to consider the age of the students. Fifth graders are approximately 10 years old, and at this age explicit focus on form is appropriate as part of the curriculum. So your objectives imply implicit use of the forms indicated but also explicit identification.
A great deal of the instruction throughout the year has consisted of auditory input from Internet-based activities, DVDs, and a CD supplied with the textbook for the course. Activities range from games to repetition drills, and most oral production is rehearsed.

Listening Comprehension Section. Because of the constraints of your curriculum, the listening part of the final exam must take no more than 20 minutes, as already noted. The students have become accustomed to taking multiple-choice tests and quizzes in their classwork. So for reasons of practicality and impact, you decide to design a multiple-choice listening comprehension test with three different tasks. Your school has the latest computer technology available, so you can make a good-quality audio recording using your voice and that of one other person (a colleague in the school who is a native speaker of Japanese but has excellent oral skills in English). Here's the format you decide to use:

Listening comprehension format
Test method: Audio prompts, multiple-choice response
Specification: Each item uses familiar, rehearsed content.
Part 1. Minimal pairs in words and sentences (10 items: 5 minutes)
Part 2. Vocabulary comprehension of objects, clothing, and classroom vocabulary (10 items: 7 minutes)
Part 3. A mix of negatives, contractions, and prepositions of location (10 items: 8 minutes)
Scoring: Record the number of correct responses out of 30.

This informal, classroom-oriented outline gives you an indication of
• the implied elicitation and response formats for items
• the objectives you will cover
• the number of items in each section
• the time to be allocated for each

Notice that a number of the possible listening objectives are not directly tested. This decision may be based on the time you devoted to these objectives, the importance you place on each objective, and of course the finite number of minutes available to administer the test. Is this an appropriate decision?

The final item in your test outline specifies scoring. For the listening section, scoring is simple.
For the oral interview, it becomes considerably more complex, as we shall see. We'll look again at scoring, grading, and feedback later in this chapter and then much more comprehensively in Chapter 12.

What will those multiple-choice listening comprehension items look like? How will you design appropriate stems (see pages 68-70 for a description of multiple-choice question design), each with a key, or correct response, and multiple distractors? Can you ensure enough authenticity and also provide some variety for your students? We'll take up these questions in the next main section of this chapter. Meanwhile, we turn our attention to the oral production section.

Oral Production Section. Your curriculum allows you to design your own oral interview protocol, and so you draft questions to conform to the accepted pattern of oral interviews (see Chapter 8 for information on constructing oral interviews). You have decided to conduct the interviews one-on-one because you have enough time with a small class to do so. You begin and end with nonscored items (warm-up and wind-down) designed to set students at ease and then sandwich between them items intended to test the objectives (level check) and a little beyond (probe).

Because these are 10-year-old children, you have decided, on the advice of other teachers, to make use of a file of pictures for stimuli. They have responded well to pictures in previous one-on-one situations. Here is the outline you decide to follow:

Oral interview format
A. Warm-up: greetings and setting the child at ease
B. Level-check questions
   1. Identifying objects, singular and plural
   2. Present progressive tense
   3. Adjectives (big, little)
C. Probe questions
   1. Talk about this picture.
   2. Ask me a question.
D. Wind-down: comments and reassurance

You're now ready to draft actual test items, with matching pictures and expected responses. Here's what you come up with as a first draft:

Warm-up
Hello, [name]. How are you?
(Give a compliment to the child. They have practiced giving compliments in class.)
(Briefly explain, in Japanese, the procedure for the interview.
Reassure the child.)

Level check
(Show a picture of an object.) What is this? What color is it?
This is a picture of a boy. (The picture shows a boy walking.) What is the boy doing? (Repeat this procedure with other pictures depicting actions.)
(Show a picture of two girls, one big and one small.) Are they the same size? (They have practiced simple comparisons.)

Probe
(Show the child a picture of a family eating a meal at home, or of children playing in a park.) Talk about this picture.
Good job! Now can you ask me a question? (This has been practiced in class many times.)

Wind-down
(Comment on how well the child did and offer reassurance.)

As you continue to put yourself in the place of the teacher in this school in Japan, are there any changes you would make to your protocol for the oral interview?

The final step in the process is to devise a method of scoring. You need to make this as simple and straightforward as possible, so let's say you decide to give two scores for each separate question, one for pronunciation and one for grammar. Because the course has focused a good deal on grammatical and phonological form, and because students expect such an assessment, you justify the exclusion of such elements as content and social skills. Here's what you come up with:

• virtually perfect pronunciation/grammar: 2
• some error(s) in the response: 1
• wrong or no response: 0

You prepare a card for each student with your list of questions. Beside each question are the numbers 2, 1, and 0. You circle the numbers as the interview proceeds. When all interviews are complete, you can add up the numbers for a total score. Because it's a final examination and the school's policy doesn't offer a means to give more than a score report to the child (and the parent), your numerical scores appear to be sufficient.

DESIGNING MULTIPLE-CHOICE ITEMS

Soon we'll return to the specific task of designing the multiple-choice listening comprehension test for the Japanese fifth graders, but we first need to turn our attention to some important principles and tips for designing multiple-choice tests.
Multiple-choice items, which may on the surface appear to be simple items to construct, are actually very difficult to design correctly. Hughes (2003, pp. 76-78) cautions against a number of weaknesses of multiple-choice items:

• The technique tests only recognition knowledge.
• Guessing may have a considerable effect on test scores.
• The technique severely restricts what can be tested.
• It is very difficult to write successful items.
• Beneficial washback may be minimal.
• Cheating may be facilitated.

The two principles that stand out in support of multiple-choice formats are, of course, practicality and reliability. With their predetermined correct responses and time-saving scoring procedures, multiple-choice items offer overworked teachers the tempting possibility of an easy and consistent process of scoring and grading. But is the preparation phase worth the effort? Sometimes it is, but you might spend even more time designing such items than you save in grading the test. Of course, if your objective is to design a large-scale standardized test for repeated administrations, then a multiple-choice format does indeed become viable.

As you face the task of designing the listening comprehension section of the fifth-grade English exam, let's first consider some important terminology.

1. Multiple-choice items are all receptive, or selective, response items in that the test-taker chooses from a set of responses rather than creating a response (the latter is commonly called a supply type of response). Other receptive item types include true/false questions and matching lists. (In the discussion here, the guidelines apply primarily to multiple-choice item types and not necessarily to other receptive types.)
2. Every multiple-choice item has a stem (the "body" of the item that presents a stimulus) and several (usually between three and five) options or alternatives to choose from.
3. One of those options, the key, is the correct response, whereas the others serve as distractors.

Because there will be occasions when multiple-choice items are appropriate, consider the following four guidelines for designing multiple-choice items for both classroom-based and large-scale situations (adapted from Gronlund, 1998, pp. 60-75, and J. D. Brown, 2005, pp. 48-50).

1. Design each item to measure a single objective.

Consider the following item from a secondary school class in English at the intermediate level. The objective is wh- questions:

Test-takers hear: Where did George go after the party last night?
Test-takers read:
A. Yes, he did.
B. Because he was tired.
C. To Elaine's place for another party.
D. Around eleven o'clock.

Distractor A is designed to ascertain that the student knows the difference between an answer to a wh- question and a yes/no question. Distractors B and D, as well as the key item, C, test comprehension of the meaning of where as opposed to why and when. The objective has been directly addressed.

On the other hand, here is an item that was designed to test recognition of the correct word order of indirect questions:

Excuse me, do you know ______?
A. where is the post office
B. where the post office is
C. where post office is

Distractor A is designed to lure students who don't know how to frame indirect questions and therefore serves as an efficient distractor. But what does distractor C actually measure? In fact, the missing definite article (the) is what J. D. Brown (2005) calls an "unintentional clue" (p. 48), a flaw that could cause the test-taker to eliminate C automatically. In the process, no assessment has been made of indirect questions in this distractor. Can you think of a better distractor for C that would focus more clearly on the objective?

2. State both stem and options as simply and directly as possible.

We're sometimes tempted to make multiple-choice items too wordy. A good rule of thumb is to get directly to the point. Here's a negative example:

My eyesight has really been deteriorating lately. I wonder if I need glasses. I think I'd better go to the ______ to have my eyes checked.
A. pediatrician
B. dermatologist
C.
optometrist

You might argue that the first two sentences of this item give it some authenticity and accomplish a bit of schema setting. But if you simply want a student to identify the type of medical professional who deals with eyesight issues, those sentences are superfluous. Moreover, by lengthening the stem, you have introduced a potentially confounding lexical item, deteriorate, that could distract the student unnecessarily.

Another rule of succinctness is to remove needless redundancy from your options. In the following item, "which were" is repeated in all three options. It should be placed in the stem to keep the item as succinct as possible.

We went to visit the temples, ______ fascinating.
A. which were beautiful
B. which were especially
C. which were holy

Then, because students have also been reading English words and can recognize them well, the last five items use visual cues. (Sample items: printed options that test-takers see, paired with spoken prompts that they hear.)

Part 2. For vocabulary, you again choose to use picture-cued items, this time for six items. Pictures allow you to depict objects easily and unambiguously. You have chosen to have four multiple-choice options for each item. (Sample items: pictures with four options each.)

For the last four items, you choose to have students do a matching exercise, to test knowledge of objects in the classroom. The item, worth four points for all four objects, looks like this:

Test-takers see: (a picture in which items are lettered A through G)
Example: A B C D E F G
7. A B C D E F G
8. A B C D E F G
9. A B C D E F G
10. A B C D E F G

Test-takers hear (in Japanese): Look at the lettered items in the picture as you listen. For each word you hear, circle the correct letter that matches the word. An example has been done for you. Example: pencil. (The letter that points to the pencil has been circled.) The four words that follow include desk and computer.
For Part 3, in which you are to test negatives, contractions, and prepositions of location, once again you have decided to rely on pictures, given the age of your test-takers and what they are accustomed to responding to in your classroom activities. Here's what two of those items look like:

Test-takers see: (pictures)
Test-takers hear:
1. The shoe's under the box.
2. The cat isn't on the chair.

As you can see, these items are quite traditional. The format lends itself to practicality and reliability, paving the way to quick, consistent scoring. The items are all very clearly formulated, within all the expected objectives of the course, and students are accustomed to such testing techniques, so various aspects of validity are accounted for. You might self-critically admit that the format of some of the items is contrived, thus lowering the level of authenticity. But your students are in quite a traditional educational system in which they will need to function in traditional test formats, so these items may help them, in a "friendly" way, to handle such assessments in the future. No washback is built into the system, so you have to be satisfied with what you hope was ample washback in the 15 weeks of classroom activity that led up to this day.

As you look over the items, are there some that need to be revised before you finalize them? In revising your draft, ask yourself the following questions:

Suggestions for revising your test:
1. Are the directions to each section absolutely clear?
2. Is there an example item for each section? If not, are the directions and format so familiar to students that they will clearly understand the tasks they are being asked to perform?
3. Does each item measure a specified objective?
4. Is there a single correct answer for each question?
5. Is each item stated in clear, simple language?
6. Does each multiple-choice item have appropriate distractors; that is, are the wrong items clearly wrong and yet sufficiently "alluring" that they aren't ridiculously easy?
7. Is the difficulty of each item appropriate for your students?
8. Is the language of each item sufficiently authentic?
9. Is there a balance between easy and difficult items?
10. Do the sum of the items and the test as a whole adequately reflect the learning objectives?

Ideally, you would try out all your tests on a sample of students not in your class before actually administering the test, but in our daily classroom teaching, such a tryout phase is almost impossible. Alternatively, you could enlist the aid of a colleague to look over your test or, better yet, take the test as a trial run. You must do what you can to bring to your students an instrument that is, to the best of your ability, practical and reliable.

In the final revision of your test, imagine that you're a student taking the test. Go through each set of directions and all items slowly and deliberately. Time yourself. (Often we underestimate the time students need to complete a test.) If the test should be shortened or lengthened, make the necessary adjustments. Make sure your test is neat and uncluttered on the page and that it is clear and unambiguous, reflecting all the care and precision you have put into its construction. If there is an audio component, as there is in the listening test for Japanese fifth graders, make sure that the script is clear, that your voice and any other voices are clear, and that the audio equipment is in working order before starting the test.

ADMINISTERING THE TEST

The moment has arrived. You have designed your test based on your carefully considered purposes, objectives, and specs. Could anything now go awry in these best-laid plans? Of course, you know the answer is yes. So consider some of the measures you can take to ensure that the actual administration of the test accomplishes everything you want it to. Here's a list of pointers:

Pre-test considerations (the day before the in-class essay)
1.
Provide appropriate pre-test information on
   a. the conditions for the test (time limits, portable electronics, breaks, etc.)
   b. materials that students should bring with them
   c. the kinds of items (item types) that will be on the test
   d. suggestions of strategies for optimal performance
   e. evaluation criteria (rubrics, show benchmark samples)
2. Offer a review of components of narrative and description essays.
3. Give students a chance to ask any questions, and provide responses.

Test administration details
4. Arrive early and see to it that the classroom conditions (lighting, temperature, a clock that all can see clearly, furniture arrangement, etc.) are conducive.
5. If audio or video or other technology is needed for administration, try everything out in advance.
6. Have extra paper, writing instruments, or other response materials on hand.
7. Start on time.
8. Distribute the test itself.
9. Remain quietly seated at the teacher's desk, available for questions from students as they proceed.
10. For a timed test, warn students when time is about to run out, and encourage their completion of their work.

This is not an exhaustive list, and it does not cover all possible testing situations, but it should serve as a starting point for you as you attempt to cover all the details involved in an administration.

SCORING, GRADING, AND GIVING FEEDBACK

Scoring

As you design a classroom test, you must consider how the test will be scored and graded. Your scoring plan reflects the relative weight that you place on each section and on the items in each section. In the four scenarios that we have been discussing in this chapter, scoring is (in mathematical terms) a factor in only two of them, the grammar unit test and the final listening/speaking examination for Japanese fifth graders. Let's look at each.

The grammar test has three sections, each with a number of scorable items. Logically, then, you could place equal weight on each section and mathematically calculate a score.
However, this may not reflect your own conception of the importance of each task type, and so one decision you might make is to place more weight on, perhaps, the grammar editing section. Your argument might simply be that you feel those tasks represent more general or integrative language ability and therefore are deserving of greater weight.

The listening/speaking final exam for Japanese fifth graders presents a potentially complex challenge, because school policy requires one grade for the final examination to be reported for each student. It's your job to determine the relative weight of the listening section and the oral interview. You could argue that the oral interview involves both comprehension and production and therefore give it more weight, or you might simply consider them equal components. To make a final decision on this issue, you would need to know more about the particular context than has been described here.

As a classroom teacher, after administering a test once, you may decide to revise your scoring plan for the course the next time you teach it. At that point you'll have valuable information about how easy or difficult a test was, about whether the time limit was reasonable, about your students' affective reaction to it, and about their general performance. Finally, you will have an intuitive judgment about whether the test correctly assessed your students. Take note of these impressions, even though they are not empirical data, and use them for revising a test in another term.

Grading

Your first thought might be that assigning grades to student performance on this test would be easy: Just give an A for 90 to 100 percent, a B for 80 to 89 percent, and so on. Not so fast! Grading is such a thorny issue that all of Chapter 12 is devoted to the topic. How you assign letter grades to this test is a product of

• the country, culture, and context of the English classroom
• institutional expectations (most of them unwritten)
• explicit and implicit definitions of grades that you have set forth
• the relationship you have established with this class
• student expectations that have been engendered in previous tests and quizzes in the class

For the time being, then, we will set aside issues that deal with grading the four scenarios in particular in favor of the comprehensive treatment of grading in Chapter 12.

Giving Feedback

A section on scoring and grading would not be complete without some consideration of the forms in which you can offer feedback to your students, feedback that you want to become beneficial washback. Consider just a few of the many possible manifestations of feedback associated with tests (this is not an exhaustive list):

In general, scoring/grading for a test:
a. a letter grade
b. a total score
c. subscores (e.g., of separate skills or sections of a test)

For responses to listening and reading items:
a. indication of correct/incorrect responses
b. diagnostic set of scores (e.g., scores on certain grammatical categories)
c. checklist of areas needing work and strategic options

For oral production tests:
a. scores for each element being rated
b. checklist of areas needing work and strategic options
c. oral feedback after performance
d. post-interview conference to go over the results

For written essays:
a. scores for each element being rated
b. checklist of areas needing work and strategic options
c. marginal and end-of-essay comments, suggestions
d. post-test conference to go over work

Additional/alternative feedback for a test:
a. on all or selected parts, peer conferences on results
b. whole-class discussion of results of the test
c. individual conferences with each student to review the whole test
d. self-assessment in various manifestations

In the four example scenarios that we have been referring to in this chapter, there are a multitude of options for giving feedback:

Reading Quiz. The primary if not exclusive purpose of the reading quiz was to prompt self-assessment and class discussion. With no scoring or grading, feedback was to some degree self-induced through the knowledge of what questions one got right or wrong, but more extensively in the form of whole-class discussion of the reading passage.

Grammar Unit Test.
The most salient form of feedback is in total score and subscores, but perhaps the most useful feedback could come in the form of diagnostic scores, a checklist of areas needing work, and class discussion of the results of the test.

Midterm Essay. All the types of feedback listed are feasible and potentially useful, but perhaps the kind of feedback that would most contribute to beneficial washback would be the subsequent peer conferences and individual conferences between student and teacher.

Listening/Speaking Final Exam. Eventually the children in this class will receive a letter grade for the course, which may include scores and subscores of the final examination, with little else possible within the system. One might venture to say that the teacher could give some minimal oral feedback after the oral interview.

In this chapter, guidelines and tools were provided to enable you to address the five questions posed at the outset: (a) how to determine the purpose of the test, (b) how to state objectives, (c) how to design test specifications, (d) how to design or select test tasks, including evaluating those tasks with item indices, and (e) how to begin to address scoring and grading. This five-part template, shown in Figure 3.2, can serve as a pattern as you design classroom tests.

Figure 3.2. Steps to designing an effective test: determine purpose/usefulness; state objectives; draw up specifications; select tasks and item types and arrange them; in administering the test, construct a situation that helps students to achieve optimal performance; construct a system of scoring/grading and providing student feedback.

In the next two chapters (Chapters 4 and 5), you will explore the extent to which many of these principles and guidelines apply to standards-based (and standardized) large-scale testing. You will then consider an array of possibilities of what has come to be called "alternative" assessment (Chapter 6), only because portfolios, conferences, journals, and self- and peer-assessments are not always comfortably categorized among more traditional forms of assessment.
Subsequent chapters (7 through 10) will lead you through a wide selection of test tasks in the separate skills of listening, speaking, reading, and writing, as well as provide a sense of how testing for form-focused objectives fits into the picture (Chapter 11). Finally, in Chapter 12, you will take a long, hard look at the dilemmas of grading students.

EXERCISES

[Note: (I) Individual work; (G) Group or pair work; (C) Whole-class discussion.]

1. (G) The first issue discussed in this chapter was determining the purpose of a proposed test, and a checklist was offered. In small groups, have one or two members share an experience they had either taking or giving a test, then systematically discuss the probable answers to each item on the checklist. The group may be able to help solve certain problems or dilemmas that came up. Report back to the class any notable surprises or questions that were unresolved.
2. (G) Look again at the discussion of objectives (pages 56-58). You will note that we have not explicitly discussed objectives for Scenarios 1, 3, and 4. What would those objectives look like? With a partner or in a small group, see if you can jot down objectives for all three scenarios. Share your thoughts with the rest of the class.
3. (I/C) Figure 3.1 depicts various modes of elicitation and response. Are there other modes of elicitation and response that could be included in such a chart? Justify your additions with an example of each.
4. (G) With a partner or in a small group, look at the items in each part of the listening comprehension test for Japanese fifth graders (page 64) and design several more items for each part. Share your results with another pair (as a group of four) and talk about strengths and weaknesses of items.
5. (G) In the oral interview for Japanese fifth graders discussed earlier in this chapter (pages 65-67), can you justify the decisions made? Would you suggest any changes?
Discuss with a partner.
6. (G) This exercise could be a challenge, especially to those who have never designed test specifications before, so plenty of time and assistance may be necessary. Look at the following selected objectives from a low-intermediate integrated-skills course. In four different groups, craft a set of test specs and sample test items for the objectives your group has been assigned. Report your findings to the rest of the class.

Form-focused objectives (listening and speaking)
Students will
1. recognize and produce tag questions, with the correct grammatical form and final intonation pattern, in simple social conversations
2. recognize and produce wh- information questions with correct final intonation pattern

Communication skills (speaking)
Students will
3. state completed actions and events in a social conversation
4. ask for confirmation in a social conversation
5. give opinions about an event in a social conversation
6. produce language with contextually appropriate intonation, stress, and rhythm

Reading skills (simple essay or story)
Students will
7. recognize irregular past tense of selected verbs in a story or essay

Writing skills (simple essay or story)
Students will
8. write a one-paragraph story about a simple event in the past
9. use conjunctions so and because in a statement of opinion

7. (G) Select a language class in your immediate environment for the following project. In small groups, design an achievement test for a reasonably short segment of the course (perhaps a lesson or unit for which there is no current test or for which the present test is inadequate). Follow the guidelines in this chapter for developing an assessment procedure. When it is completed, present your assessment project to the rest of the class.
8. (C) If possible, locate an existing, recently used standardized multiple-choice test for which there are accessible data on student performance. Calculate the item facility (IF) and item discrimination (ID) index for selected items. If there are no data for an existing test, select some items on the test and analyze the structure of those items in a distractor analysis to determine if they have (a) any bad distractors, (b) any bad stems, or (c) more than one potentially correct answer.
9. (I/C) Earlier in this chapter, a number of options are listed for giving feedback to students on assessments.
Review the practicality of each and determine the extent to which practicality (principally, more time expended) is justifiably sacrificed in order to offer better washback to learners.

FOR YOUR FURTHER READING

Bachman, Lyle F., & Palmer, Adrian S. (1996). Language testing in practice. New York: Oxford University Press.

Bachman and Palmer's book remains a standard in the field of language assessment. More practically oriented than Bachman's (1990) seminal theoretical exposition on language testing, this one is aimed at providing a comprehensive set of tools for the development of language tests. It is difficult reading for the beginning-level graduate student or novice teacher, but others will appreciate its thoroughness, theoretical soundness, and practical examples.

Gronlund, Norman E., & Waugh, C. Keith. (200[?]). Assessment of student achievement (9th ed.). Boston: Allyn & Bacon.

This widely used general manual of testing across educational subject areas provides useful information for language assessment. In particular, Chapters 5, 6, 7, and 8 describe detailed steps for designing tests and writing multiple-choice, true/false, and short-answer items.

Brown, James Dean. (2005). Testing in language programs (2nd ed.). New York: McGraw-Hill.

Chapters 3 and 4 of this language-testing manual offer some further information on developing tests and test items, including formulas for calculating item facility and item discrimination.
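The item facility (IF) and item discrimination (ID) indices named in Exercise 8 and in the Brown (2005) entry can be tried out with a short script. What follows is only a sketch under stated assumptions, not the chapter's own procedure: IF is taken as the proportion of test-takers who answered an item correctly, and ID as the difference between the number of correct answers in a high-scoring and a low-scoring group, divided by the size of one group. The function names, the choice of top and bottom thirds as comparison groups, and the sample data are all hypothetical.

```python
from typing import Sequence


def item_facility(item_responses: Sequence[int]) -> float:
    """Proportion of test-takers who got the item right (1 = correct, 0 = wrong)."""
    if not item_responses:
        raise ValueError("at least one response is required")
    return sum(item_responses) / len(item_responses)


def item_discrimination(
    item_responses: Sequence[int],
    total_scores: Sequence[float],
    group_fraction: float = 1 / 3,  # assumption: compare top and bottom thirds
) -> float:
    """(high-group correct - low-group correct) / size of one comparison group."""
    if len(item_responses) != len(total_scores):
        raise ValueError("responses and total scores must be the same length")
    n = len(total_scores)
    group_size = max(1, round(n * group_fraction))
    if 2 * group_size > n:
        raise ValueError("comparison groups would overlap")

    # Rank test-takers from highest to lowest total test score.
    ranked = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    high_group = ranked[:group_size]
    low_group = ranked[-group_size:]

    high_correct = sum(item_responses[i] for i in high_group)
    low_correct = sum(item_responses[i] for i in low_group)
    return (high_correct - low_correct) / group_size


# Hypothetical data for one item taken by nine students.
responses = [1, 1, 1, 1, 0, 1, 0, 0, 0]        # right/wrong on this item
totals = [28, 27, 25, 22, 20, 19, 15, 12, 10]  # total scores on a 30-item test

print(f"IF = {item_facility(responses):.2f}")                # IF = 0.56
print(f"ID = {item_discrimination(responses, totals):.2f}")  # ID = 1.00
```

In this invented data set the item is of middling difficulty (a little over half the class answered it correctly) and separates the top three scorers from the bottom three perfectly. An ID near zero would suggest that the item does not distinguish stronger from weaker test-takers and may need revising.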
OBJECTIVES: After reading this chapter, you will be able to
• understand the crucial role of standards in educational instruction and assessment, especially in standardized testing
• appreciate the two-edged sword of large-scale standards-based testing: its social, political, and ideological consequences
• be prepared to take action in your own teaching and assessing to ensure fairness and openness for your students
• apply principles of standardization to the construction of teacher-based standards
• examine a set of standards for a specified age, level, and context and apply them to contexts of your own
• analyze the purpose, advantages, and disadvantages of standards-based assessment

Throughout history, people have been tested to prove their capabilities or qualifications. Some of the earliest formal examinations or tests have been traced back almost 2,000 years to the Han Dynasty in China, where they were used to select the highest officials in the country (Cheng, 2008). Even earlier, in the Bible (Judges 12:5-6), we have a recorded instance of the use of the so-called "shibboleth test," by means of which two ethnically and linguistically different groups of people were distinguished. When an Ephraimite was ordered to pronounce the word "shibboleth," his language caused him to say "sibboleth" with an /s/, as opposed to the Gileadites' pronunciation that used a /ʃ/. The consequences were dire: Ephraimites were thus exposed and killed. In another example, before World War II, the Australian government used language tests as a method to keep out immigrants from other countries (T. McNamara, 2000). The government officer could select any language for the dictation test that the immigrant had to take.

Today, we are all still deeply affected by tests and examinations, especially high-stakes standardized tests.
For almost a century, schools, universities, businesses, and governments have looked to standardized measures for economical, reliable, and valid assessment of those who would enter, continue in, or exit their institutions. Proponents of these large-scale instruments make strong claims for their usefulness when great numbers of people must be measured quickly and effectively. Those claims are well supported by reams of research data that comprise construct validations of their efficacy and the specification of standards or benchmarks that are to be incorporated into assessment instruments. We have become a world that abides by the results of standardized tests as if they were sacrosanct, having been blessed by research findings and institutional [...].

[...] we look at just what these standards are that drive so many large-scale standardized tests, where they come from, and whether the ways [...]. Our purpose is to raise our awareness of standards-based assessment: assessments that are used to evaluate student academic achievement and show that students have reached certain performance levels or standards. With this backdrop, we will then turn, in Chapter 5, to the issues surrounding the standardized tests that such standards are intended to support.

THE ROLE OF STANDARDS IN STANDARDIZED TESTS

Ask non-language specialists what a standardized test is, and they are likely to say that it is a multiple-choice test, and then they will give you an example such as the SAT or GRE. By now you know that this is not a complete answer. A standardized test, among other things, presupposes certain standard objectives or performance levels, now better known as standards (and also known as benchmarks), that are held constant across one form of the test to another. The standards that underlie a standardized test are usually a set of clearly defined competencies that [...] a course, a curriculum, a yearlong program, or even multiple-year objectives for, say, a K-12 program or secondary school graduation certificate. Standards-based assessment refers to procedures that have been specifically designed to measure such competencies.

Where do these standards come from?
Who designs them, and how are they incorporated into assessment instruments? The past 30 years have seen a mushrooming of efforts on the part of educational leaders worldwide to base the plethora of school-administered standardized tests on clearly specified criteria within each content area being measured. For example, most departments of education at the state level in the United States have now specified the appropriate standards (that is, criteria or objectives) for each grade level (kindergarten to grade 12) and each content area (math, language, sciences, arts).

[Table: California English Language Development standards for listening and speaking (California Department of Education). Table contents not recoverable from the scan.]

Between 1999 and 2002, the California English Language Development Test (CELDT) was developed. The CELDT is a set of instruments designed to assess the attainment of ELD standards across grade levels. (For reasons of test security, specifications for the test are not available to the public.) Llosa (2007) examined the extent to which the English Language Development (ELD) Classroom Assessment used in a large urban school district in California measures the same constructs as the CELDT and concluded that the evidence gathered via the ELD Classroom Assessment is consistent with that provided by the CELDT, the standardized measure. For more information on the CELDT, consult the appendix at the end of this book.

The process of administering a comprehensive, valid, and fair assessment of ELD students continues to be perfected.
Stringent budgets within departments of education worldwide predispose many in decision-making positions to rely on traditional standardized tests for ELD assessment, but rays of hope lie in the exploration of more student-centered approaches to learner assessment. Stack and colleagues (2000), for example, reported on a portfolio assessment system in the San Francisco Unified School District called the Language and Literacy Assessment Rubric (LALAR), in which multiple forms of evidence of students' work are collected. Teachers observe students year-round and record their observations on scannable forms. The use of the LALAR system provides useful data on students' performance at all grade levels for oral production and for reading and writing performance in elementary and middle school grades (1-8). Further research is ongoing for high school levels (grades 9-12).

We can find similar standards-based assessments in other countries such as Hong Kong, China, which recently moved from norm-referenced to standards-based assessments in the Hong Kong Certificate of Education Examination (Davison, 2009). In Brazil, Peru, and Ecuador, standards-based assessment is now commonplace, with all the advantages and disadvantages that come with such educational reform (Godoy, B. Lena, & M. Nagdany, personal communication, 200[...]). Advantages include common criteria nationwide for teachers to pursue in their curricula, whereas the most obvious disadvantage is the temptation to teach to the test. In the United States, a perennial complaint of school teachers is the potential loss, due to the "[...]" of testing, of the freedom of the teacher to emphasize what he or she judges to be important, and instead having to give equal time to 193 different competencies in a course unit (Brown, personal communication, 2008).

These shortcomings notwithstanding, standards in their best intention are of course designed to be anchors, aligning curriculum, instruction, and assessment.

CASAS and SCANS

At the higher levels of education (colleges, community colleges, adult schools, language schools, and workplace settings), standards-based assessment systems have also had an enormous impact.
The Comprehensive Adult Student Assessment System (CASAS), for example, is a program designed to provide broadly based assessments of ESL curricula across the United States. The system includes more than 80 standardized assessment instruments used to place learners in programs, diagnose learners' needs, monitor progress, and certify mastery of functional basic skills. CASAS assessment instruments are used to measure functional reading, writing, listening, and speaking skills, as well as higher-order thinking skills. CASAS scaled scores report learners' language ability levels in employment and adult life skills contexts. For a review of the CASAS test, see Gorman and Ernst (200[...]); some details are also provided in the appendix at the end of this book.

A similar set of standards compiled by the U.S. Department of Labor, now known as the Secretary's Commission on Achieving Necessary Skills (SCANS), outlines competencies necessary for language in the workplace. The competencies cover language functions in terms of standards:
• resources (allocating time, materials, staff, etc.)
• interpersonal skills (teamwork, customer service, etc.)
• information (processing, evaluating data, organizing files, etc.)
• systems (e.g., understanding social and organizational systems)
• technology use and application

These five competencies are acquired and maintained through training in the basic skills (reading, writing, listening, speaking); thinking skills, such as reasoning and creative problem solving; and personal qualities, such as self-esteem and sociability. For more information on SCANS, see [...] & Thompson (200[...]).

Teacher Standards

In addition to the movement to create standards for learning, an equally strong movement has emerged to design standards for teaching.
Cloud (2001) noted that a student's "performance [on an assessment] depends on the quality of the instructional program provided, . . . which depends on the quality of the professional development" (p. 3) of teachers. Kuhlman (2001) emphasized the importance of teacher standards in three domains:
1. linguistics and language development
2. culture and the interrelationship between language and culture
3. planning and managing instruction

In the education of new teachers, the University of California (2008) advocates attention to six domains, one of which is the assessment of student learning, with emphasis on five factors:
1. establishing and communicating learning goals for all students
2. collecting and using multiple sources of information to assess student learning
3. involving and guiding all students in assessing their own learning
4. using the results of assessment to guide instruction
5. communicating with students, families, and other audiences about student progress

Professional teaching standards have also been the focus of several committees in the international association of TESOL (Gottlieb et al., 2006). Correspondingly, the Australian Council of TESOL Associations has developed standards for teachers and outlines the expectations of TESOL practitioners in relation to three different orientations: working in a multicultural society, second language education, and the practice of TESOL (Garment & Antenucd, 2006).

How to assess whether teachers have met standards remains a complex issue.
Can pedagogical expertise be assessed through a traditional standardized test? In the first of Kuhlman's (2001) domains (linguistics and language development), knowledge can perhaps be so evaluated, but the cultural and interactive characteristics of effective teaching cannot be so easily assessed in such a test. TESOL's standards committee advocates performance-based assessment of teachers for the following reasons:
• Teachers can demonstrate the standards in their teaching.
• Teaching can be assessed through what teachers do with their learners in their classrooms or virtual classrooms (their performance).
• This performance can be detailed in what are called "indicators": examples of evidence that the teacher can meet a part of a standard.
• The processes used to assess teachers need to draw on complex evidence of performance. In other words, indicators are more than simple "how to" statements.
• Performance-based assessment of the standards is an integrated system. It is neither a checklist nor a series of discrete assessments.
• Each assessment within the system has performance criteria against which the performance can be measured.
• Performance criteria identify to what extent the teacher meets the standard.
• Student learning is at the heart of the teacher's performance.

The standards-based approach to teaching and assessment presents the profession with many challenges. However thorny those issues are, the social consequences of this movement cannot be ignored, especially in terms of student assessment.

THE CONSEQUENCES OF STANDARDS-BASED AND STANDARDIZED TESTING

We have already noted that standards-based assessments are not without their share of problems. Although standards are implemented to improve education, a growing body of research has found a number of unintended or negative consequences of standards-based assessments (Jones, Jones, & Hargrove, 2003; [...], 2001; Wang, Beckett, & Brown, 2006). Some studies have found that standards-based tests can narrow the curriculum, pushing instruction toward lower-order rather than higher-order cognitive skills. Further, lower test scores result in grade retention, which appears not to improve educational achievement for those students who are held back (Darling-Hammond, 2004).

One of the strongest arguments made against standards-based assessments is the issue of accountability. That is, test results are used to hold school districts accountable for raising student academic achievement and identifying schools in need of improvement. A prime example in the United States is the aforementioned No Child Left Behind Act. Thus for many schools, government funding and support depend on student performance, which puts pressure not only on the students but also on teachers, school principals, and school district administrators. To avoid being penalized, schools and school districts sometimes put low-scoring students into special education or retain or hold them back a grade (thereby encouraging them to drop out) so that the schools' average test scores will look better. In this case, some argue (e.g., Darling-Hammond, 2004) that the standards-based assessments do not improve student achievement but rather prohibit some students from [...].

Another major challenge in standards-based education is the close link to the standardized testing industry that we all are very familiar with. Stories abound, some of them of blockbuster spy-novel proportions, on the high price test-takers are willing to pay to pass such tests, dramatically illustrating the gatekeeping role of those tests. We have already alluded to the widespread global acceptance of standardized tests as valid procedures for assessing individuals in many walks of life.
Those tests bring with them certain consequences that fall under the category of consequential validity or impact, discussed in Chapter 2.

Some of those consequences are positive. Standardized tests offer high levels of practicality and reliability and are often supported by impressive construct validation studies. They are therefore capable of accurately placing hundreds of thousands of test-takers onto a norm-referenced scale with high reliability ratios (most ranging between 80 and 90 percent). For decades, university admissions offices around the world have relied on the results of tests such as the Scholastic Aptitude Test (SAT®), the Graduate Record Exam (GRE®), and the Test of English as a Foreign Language (TOEFL® Test) to screen applicants. The respectably moderate correlations between test results and academic performance are used to justify determining the students' educational future on the basis of one relatively inexpensive [...] multiple-choice test. Thus has emerged the term high-stakes testing, based on the gatekeeping function that standardized tests perform.

Are the institutions that produce and utilize high-stakes standardized tests justified in their decisions? An impressive array of research would seem to say yes. Consider the fact that correlations between "old" TOEFL scores (before the advent of computer-based TOEFL) and academic performance in the first year of college were impressively high (Henning & Cascallar, 1992), and more recent validation [...]. Institutions commonly use such findings [...]

1. Should the educational and business worlds be satisfied with high but not perfect probabilities of accurately assessing test-takers on standardized instruments? In other words, what about the small minority who are not fairly assessed?
2. Regardless of construct validation studies and correlation statistics, should further types of performance be elicited to obtain a more comprehensive picture of the test-taker?
3. Does the proliferation of standardized tests throughout a young person's life give rise to test-driven curricula, diverting the attention of students from creative or personal interests and in-depth pursuits?
4. Is the standardized test industry in effect promoting a cultural, social, and political agenda that maintains existing power structures by assuring opportunity to an elite (wealthy) class of people (Shohamy, 2000)?

Test Bias

It is no secret that standardized tests can involve a number of types of test bias. [...] is not new in language testing, but a recent surge of interest [...] shows a widespread concern over test bias. Some researchers argue that both test developers and test users must seek to create fair and unbiased tests, as well as use tests in a way that is fair [...]. That bias can come in many forms: language, culture, race, gender, and learning styles (Kohn, 2000; Medina & Neill, 1990; Ross & Okabe, 2006). The National Center for Fair and Open Testing, in its [...] newsletter, every year offers dozens of instances of claims of test bias from teachers, parents, students, and legal consultants. [...] the American Psychological Association's Joint Committee on Testing Practices has published guidelines for professionals to promote tests that are fair to all test-takers "regardless of age, gender, disability, race, ethnicity, national origin, religion, sexual orientation, linguistic background, or other personal characteristics" (Code of Fair Testing Practices in Education, 2004, p. 1).

For example, reading selections in standardized tests may use a passage from a literary piece that reflects a middle-class, white, Anglo-Saxon norm. Lectures used for listening stimuli can easily promote a biased sociopolitical view. Consider the following prompt for an essay in "general writing ability" on the International English Language Testing System (IELTS):

    You rent a house through an agency. The heating system has stopped working. You phoned the agency a week ago, but it has still not been mended. Write a letter to the agency. Explain the situation and tell them what you want them to do about it.

Although this task favorably illustrates the principle of authenticity, a number of cultural presuppositions are evident in such a prompt.
For example, accepted norms for "complaining" to an agency or expressing power relationships between renters and landlords/agents may be unfamiliar discourse for test-takers, calling into question a potential cultural bias.

A further issue extends beyond the fact that a biased test measures features that are irrelevant to the test construct. Bias also can systematically harm one group of test-takers, thereby making the test unfair. In an era when we seek to recognize the multiple intelligences present within every student (Christison, 2005; H. Gardner, 1985, 1999), is it not likely that standardized tests promote logical-mathematical and verbal-linguistic intelligences to the virtual exclusion of the other contextualized, integrative intelligences? Only very recently have traditionally receptive tests begun to include written and oral production in their test battery, which is a positive sign. But is it enough? It is also clear that many otherwise "smart" people do not perform well on standardized tests. They may excel in cognitive styles that are not amenable to a standardized format. Perhaps they need to be assessed by such performance-based evaluation as interviews, portfolios, samples of work, demonstrations, and observation reports. Perhaps, as Weir (2001, p. 122) suggested, learners and teachers need to be given the freedom to choose more formative assessment rather than the summative assessment inherent in standardized tests.

Expanding test batteries to include such measures would help to solve the problem of test bias (which is extremely difficult to control for in standardized items) and to account for the small but significant number of test-takers who are not fairly assessed by standardized tests. Those who are using the tests for gatekeeping purposes, with few if any other assessments, would do well to consider multiple measures before attributing infallible predictive power to standardized tests.

On the surface, such efforts sound laudable with only beneficial outcomes, as they have the potential for transferring the "gatekeeper" role of tests to that of "door opener" (Bachman & Purpura, 2008).
On the other hand, assessment experts also warn that principles of practicality and validity can be threatened by [...] We must somehow find [...]

Test-Driven Learning and Teaching

Yet another consequence of standardized testing is the danger of test-driven learning and teaching. When students and other test-takers know that [...], they are less likely to take a positive attitude toward learning. The motives in such a context are [...] extrinsic, with little likelihood of [...] intrinsic interests. [...] in such countries as Japan, Korea, and Taiwan [...] college entrance examinations, [...] English (Choi, 2008; Kuba, 2002). Little attention is given to any topic or task that does not directly contribute to passing that one exam.

In the United States, [...] undue pressure on [...]

ETHICAL ISSUES: CRITICAL LANGUAGE TESTING

Some researchers believe that one of the by-products of a rapidly growing testing industry is the danger of an abuse of power. In a special report on "fallout from the testing explosion," Medina and Neill (1990) noted the following:
The poe which has bese pf ak and om chien xr eration wes @ 20) i i umes Sadan Avecinest 99 soot: sess ised by olay (957 Ta ye ent cei pees ‘hice pene be casa ecg fore ad eo Te satel bya fen ie ese Seman ene ct Catton Sed ober sel tae ean, tame scoped sas ey ose epee, ve EE Qin mou en rus oped er cat eno (ots Ts sant bng nthe ern cil esa sel one imc O97, 2002) ad ers ap as 20 ne Ft a etn Bp, 37 sets Sah cnn ochre lee plaor ome sere eue tal langage etn Ger BF Cape 2 sone com ‘Sten ete tana pepe gear,Popones cite pps a iy cist eed steed eng rca wich Cac etter apc of cas poe ne {GST iar to dope hee of intl perp, ees. fe Gi 7, 9 ed oe a + Psychometric traditions are challenged by interpre, individualized proce ares fr predicting success and evaluating ail. «Tes designers have «responsibilty roofer multiple modes of pesformaace to accoune fr varying ses and abies among testakes + Tess ae deeply embeded in clare and ideology. + Teseakers ate politica subjects in political comtet “These iswes are oot om: More than a cea agp, Erith edo E ss lo aa a pen oc nape ig ‘examinations for nvr enanceIn recent ear te dette haseated wp Tess fee ore peralent ino ves and ar often sed to make deco that are signe Can Thus it was ot upg hatin 197, an ene sof the joumal Language Testing was devoted to questions about etcs in agg eng Moe recent in 2 ct ey i in angugeteing Futhemore, he nto Langage Tesing soci Spe Nee bs wh eos ow el pop ‘gprpate professional conduc. The code of ics sates ais athe seit aor regulation adit doe not provide guidelines fr racic, buts intended to fer a benchnatk of ssc tical beaver by al aaguage ess" (.D. ‘One ofthe problns highlhved bythe push or cra ngage esingis tbe widespread conviction, aeady aed to above, th cael conse 0. Canned vests designed by reputable west manufrs ar infill i thei re“ ‘sve vay. Too ofe one sundaized testis deemed tobe sufi, and flloeup mearuses re conseed tobe to cost,
