Introduction to the Speaking Tasks (Task Design & Equivalence)

The researcher examines two standard school-based assessments of English: the Key English Test (KET) Speaking component and the Shanghai Junior High School Oral English Test. The KET Speaking Test, a Cambridge English examination at CEFR Level A2, evaluates a candidate's ability to communicate in English in real-life situations (Cambridge Assessment English, 2020).

The Shanghai test is a regionally designed achievement test that gauges how well junior secondary students can speak English by the time they graduate. Although the KET measures overall language proficiency while the Shanghai test tracks curriculum outcomes, the two tests are constructed in ways that make them directly comparable.

Both tests consist of several speaking tasks and last roughly 15–20 minutes. Each offers students opportunities for open-ended speech and interaction through picture-based discussions, question-and-answer exchanges and short dialogues. The tasks in the two tests were matched for length, difficulty and the speaking skills they measure. In particular, both were examined along four analytic dimensions: Grammar & Vocabulary, Pronunciation, Fluency & Coherence and Interactive Communication. This alignment makes the two assessments suitable for a comparison of their validity and reliability.

Cognitive Validity Comparison

Cognitive validity concerns whether a test engages the mental processes a speaker uses during real-world language use. It reflects how effectively the tasks let students demonstrate the language they would normally produce in everyday communication. Following the framework of our midterm essay, cognitively valid speaking tasks require students to engage in processes such as retrieving words, formulating sentences, constructing discourse and negotiating meaning with an interlocutor. In this analysis, we compare the cognitive demands of the KET Speaking component and the Shanghai Junior High School Oral English Test, paying particular attention to four features: Grammar & Vocabulary, Pronunciation, Fluency & Coherence and Interactive Communication.

The purpose of the KET Speaking Test, which matches the CEFR A2 level, is to see how well test-takers can exchange information about themselves, where they live and their immediate surroundings (Council of Europe, 2001).

Both tests emphasize using language to perform daily activities, such as socializing, asking and answering basic questions and describing everyday experiences. Local educators designed the Shanghai test to match tasks common in the classroom, including describing images, answering everyday questions and stating opinions in English, as required by the official curriculum (Weir, 2005).

The questions in both tests involve simple sentence forms and familiar vocabulary. Even so, KET tasks may require candidates to command a wider vocabulary and a broader range of grammatical forms, because KET pictures are more diverse and the topics tend to be less familiar than the Shanghai test's more academic questions. Both tests require sustained speech, but the KET places more weight on how candidates organize their language and respond spontaneously, whereas the Shanghai test is more structured and can be prepared for in advance. Pronunciation matters in both tests: examiners attend to how clearly candidates articulate words, handle stress and use intonation. Control of rhythm and individual sounds is central in the KET, while the Shanghai test centers on overall clarity and intelligibility. Furthermore, the KET contains paired activities modeled on everyday conversations, so candidates must take turns and negotiate meaning, whereas the Shanghai format is mainly teacher-led (Field, 2011).

Both tests are designed to measure basic speaking abilities and share much cognitive ground, but the KET tends to use language in subtler and less predictable ways, making its cognitive demands somewhat higher throughout. Even so, the Shanghai test remains an effective way to evaluate spoken English at the regional level.

Data Collection Process


Ensuring that the data collection method was standardized, impartial and sound was essential. All twelve students in the study were second-grade Chinese students aged between 9 and 11 with similar academic and English-learning backgrounds. This homogeneity reduced differences in how much language the participants used outside the study, allowing us to focus on the two speaking tests themselves.

All students took both the KET Speaking Test and the Shanghai Junior High School Oral English Test over Zoom on a Sunday in May. The virtual mode was chosen to give everyone equal and convenient access to the testing. Students took the tests one after the other, each in a 20-minute session under standard conditions. Instructions were given clearly in advance, and the testing took place in a calm, quiet room.

Every session was recorded to support accurate assessment and allow rewatching if needed. All scores were provided by three experienced raters, Greta, Vicky and Elena, each drawing on their expertise and rating skills. A 1–5 analytic scale was applied across the four dimensions of Grammar & Vocabulary, Pronunciation, Fluency & Coherence and Interactive Communication. Handling all tests in the same way, against the same standards, made the results trustworthy.

Criterion Validity (Pearson r)

Strong criterion validity means that results on one test largely match those of a well-established comparison assessment because both measure the same construct. For this study, we compared the Shanghai Junior High School Oral English Test to the Key English Test, an internationally recognized proficiency exam. The relationship between the two sets of speaking scores was measured with Pearson's correlation coefficient (r) for the subskills of Grammar & Vocabulary, Pronunciation, Fluency & Coherence and Interactive Communication.
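The calculation behind Pearson's r can be sketched in a few lines of Python. The helper below and the score lists it is applied to are illustrative only; the numbers are invented and do not reproduce the study's data.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient for two paired score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance term (numerator) and the two standard-deviation terms (denominator)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical overall speaking scores for twelve students on the two tests
ket      = [3.5, 4.0, 2.5, 4.5, 3.0, 5.0, 2.0, 3.5, 4.0, 4.5, 3.0, 2.5]
shanghai = [3.0, 4.0, 3.0, 4.5, 3.5, 5.0, 2.5, 3.0, 4.0, 4.5, 3.5, 2.0]
print(round(pearson_r(ket, shanghai), 4))
```

In practice the same coefficient could be obtained from a statistics package, but the sketch shows that r depends only on how the two score lists co-vary relative to their own spreads.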

The results indicate that the total scores of the two tests are strongly and positively correlated, with a coefficient of r = 0.9715. Subskill-level analysis also revealed high correlations: Grammar & Vocabulary (r = 0.9146), Pronunciation (r = 0.9097) and Interactive Communication (r = 0.8905). Fluency & Coherence showed a fairly strong relationship (r = 0.7616), suggesting that this subskill is operationalized slightly differently in the two tests. For example, the KET rewards natural rhythm and spontaneous delivery, while the Shanghai test rewards coherent, well-organized answers.

It appears that the two speaking tests are closely aligned in measuring communication skills, particularly grammar, pronunciation and interaction. By comparison, the slightly lower correlation in Fluency & Coherence points to an opportunity for realignment, whether by adjusting the rubric or redesigning the task. The strong criterion validity confirms that the Shanghai test can be relied on to represent international speaking standards such as the KET within the region. Because Shanghai results can predict how students will perform on international tests, they can usefully inform student placement and curriculum design.

Subskill | Pearson's r | Interpretation
Grammar & Vocabulary | 0.9146 | Very strong correlation
Pronunciation | 0.9097 | Very strong correlation
Fluency & Coherence | 0.7616 | Strong correlation (moderate variance)
Interactive Communication | 0.8905 | Very strong correlation
Overall Speaking Score | 0.9715 | Extremely strong correlation

Inter-Rater Reliability & Standardization Process

Inter-rater reliability refers to the degree to which two or more raters score the same performance in the same way. For speaking-test scores to accurately reflect what a candidate can do, there must be high consistency among assessors. A strict rating and standardization procedure was used in this study to keep results from the KET Speaking Test and the Shanghai Junior High School Oral English Test as comparable as possible (Taylor & Galaczi, 2011). Greta, Vicky and Elena were chosen as raters because of their professional knowledge of English, solid understanding of testing and familiarity with both international and local testing norms. After their training was complete, all raters took part in a standardization session to review the rating scales used in both tests. Each scale covered four subskills, Grammar & Vocabulary, Pronunciation, Fluency & Coherence and Interactive Communication, each judged on a scale from 1 to 5. To establish benchmark scores and ensure everyone agreed on the criteria, sample responses were rated and discussed during training (Cohen, 1988).

Subskill | Rater Agreement (%) | Standard Deviation (SD) | Interpretation
Grammar & Vocabulary | 92% | 0.31 | High agreement
Pronunciation | 94% | 0.27 | Very high agreement
Fluency & Coherence | 89% | 0.38 | Moderate to high agreement
Interactive Communication | 90% | 0.35 | High agreement
Overall Agreement | 91.25% | 0.33 | Consistently reliable
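Agreement percentages and per-student score spreads like those in the table can be computed along the following lines. This is a minimal sketch under assumed definitions (exact agreement, population SD per student averaged across students); the study does not specify its formulas, and all score values here are hypothetical.

```python
import statistics

def agreement_pct(scores_by_rater, tolerance=0):
    """Percent of students on whom all raters agree within `tolerance` points."""
    n_students = len(scores_by_rater[0])
    agreed = sum(
        1 for i in range(n_students)
        if max(r[i] for r in scores_by_rater) - min(r[i] for r in scores_by_rater) <= tolerance
    )
    return 100 * agreed / n_students

def rating_sd(scores_by_rater):
    """Mean per-student standard deviation of scores across raters."""
    n_students = len(scores_by_rater[0])
    sds = [statistics.pstdev([r[i] for r in scores_by_rater]) for i in range(n_students)]
    return sum(sds) / n_students

# Invented scores from three raters for six students on one subskill
greta = [4, 3, 5, 2, 4, 3]
vicky = [4, 3, 5, 3, 4, 3]
elena = [4, 4, 5, 2, 4, 3]
raters = [greta, vicky, elena]
print(agreement_pct(raters), round(rating_sd(raters), 2))
```

A tolerance of one point (adjacent agreement) is another common convention; which definition is used should always be reported alongside the figure.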

After training, each rater assessed every student recording independently. Rating was carried out under quiet conditions, with close attention paid to avoiding interruptions. After the initial scoring, two rounds of verification were carried out. First, raters reviewed each other's scores and flagged any ratings that differed substantially. Where there was disagreement, the raters discussed the recordings and reached a decision that satisfied the set criteria. Because the team worked closely together, disagreements over how to rate the evidence were settled thoughtfully and on the basis of the evidence available.

This thorough review produced a high level of agreement among the raters. Because well-developed rubrics, regular training and specific verification steps were used, the scoring process was unbiased and repeatable. The final scores therefore genuinely reflect how well each student speaks, with limited room for rater bias or error. Since variation in rater judgment could mask differences between the tests, this level of reliability assures us that observed score differences come from the tests themselves and not from the rating. Consequently, the study's findings are accurate and fair.

Score | Grammar & Vocabulary | Pronunciation | Fluency & Coherence | Interactive Communication
5 | Wide range, accurate use | Clear, natural stress & intonation | Fluent with logical connections | Initiates, responds, negotiates well
4 | Good range, minor errors | Mostly clear with few issues | Generally fluent, minor hesitation | Participates actively and effectively
3 | Moderate range, noticeable errors | Understandable but some pronunciation issues | Some fluency, occasional pauses | Responds appropriately when prompted
2 | Limited range, frequent errors | Often unclear, stress issues | Frequent pauses, hard to follow | Limited interaction, relies on prompts
1 | Very limited and inaccurate | Very unclear, hard to understand | No fluency, fragmented speech | No interaction or understanding
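Assembling a final result from this rubric can be sketched as follows, assuming (as described above) four subskills rated 1–5 by three raters whose scores are averaged. The function name and the example ratings are hypothetical; the study does not state its exact aggregation formula.

```python
SUBSKILLS = ["Grammar & Vocabulary", "Pronunciation",
             "Fluency & Coherence", "Interactive Communication"]

def final_score(ratings):
    """ratings: {rater_name: {subskill: score 1-5}}.
    Returns (per-subskill means across raters, overall mean of those means)."""
    per_subskill = {
        s: sum(r[s] for r in ratings.values()) / len(ratings)
        for s in SUBSKILLS
    }
    overall = sum(per_subskill.values()) / len(SUBSKILLS)
    return per_subskill, overall

# Invented ratings for one student from the three raters
ratings = {
    "Greta": {"Grammar & Vocabulary": 4, "Pronunciation": 3,
              "Fluency & Coherence": 4, "Interactive Communication": 4},
    "Vicky": {"Grammar & Vocabulary": 4, "Pronunciation": 4,
              "Fluency & Coherence": 3, "Interactive Communication": 4},
    "Elena": {"Grammar & Vocabulary": 5, "Pronunciation": 3,
              "Fluency & Coherence": 4, "Interactive Communication": 4},
}
per_subskill, overall = final_score(ratings)
print(per_subskill, round(overall, 2))
```

Averaging subskill means rather than raw scores keeps each dimension equally weighted in the overall result, which matches the equal prominence the rubric gives the four subskills.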

Conclusion

The KET Speaking Test and the Shanghai Junior High School Oral English Test have similar task lengths, organization and difficulty. Cognitive validity and criterion validity were high for both tests, with subskill correlations ranging from 0.7616 to 0.9146 and an overall Pearson correlation of 0.9715. Thanks to proper training and standardization procedures, the scoring process achieved an excellent level of inter-rater reliability. Taken together, these findings suggest that the Shanghai speaking test strengthens the assessment of spoken English and, with a few small refinements, can be trusted as a reliable regional alternative to tests such as the KET.

References

Cambridge Assessment English. (2020). A2 Key Speaking test. https://www.cambridgeenglish.org/

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.).
Lawrence Erlbaum Associates.

Council of Europe. (2001). Common European Framework of Reference for Languages: Learning, teaching, assessment. Cambridge University Press.

Field, J. (2011). Cognitive validity in speaking test tasks. Studies in Language Testing,
30, 115–147.

Taylor, L., & Galaczi, E. D. (2011). Scoring validity. In L. Taylor (Ed.), Examining
speaking: Research and practice in assessing second language speaking (pp. 171–
233). Cambridge University Press.

Weir, C. J. (2005). Language testing and validation: An evidence-based approach. Palgrave Macmillan.
