The COSMIN checklist
Contact
CB Terwee, PhD
VU University Medical Center
Department of Epidemiology and Biostatistics
EMGO Institute for Health and Care Research
1081 BT Amsterdam
The Netherlands
Website: [Link], [Link]
E-mail: [Link]@[Link]
Step 1. Evaluated measurement properties in the article
Internal consistency Box A
Reliability Box B
Measurement error Box C
Content validity Box D
Structural validity Box E
Hypotheses testing Box F
Cross-cultural validity Box G
Criterion validity Box H
Responsiveness Box I
Interpretability Box J
Step 2. Determining if the statistical methods used in the article are based on CTT or
IRT
Box General requirements for studies that applied Item Response Theory (IRT) models
yes no ?
1 Was the IRT model used adequately described? e.g. One Parameter Logistic
Model (OPLM), Partial Credit Model (PCM), Graded Response Model (GRM)
2 Was the computer software package used adequately described? e.g.
RUMM2020, WINSTEPS, OPLM, MULTILOG, PARSCALE, BILOG, NLMIXED
3 Was the method of estimation used adequately described? e.g. conditional
maximum likelihood (CML), marginal maximum likelihood (MML)
4 Were the assumptions for estimating parameters of the IRT model checked? e.g.
unidimensionality, local independence, and item fit (e.g. differential item
functioning (DIF))
Step 3. Determining if a study meets the standards for good methodological quality
Box A. Internal consistency
yes no ?
1 Does the scale consist of effect indicators, i.e. is it based on a reflective model?
Design requirements yes no ?
2 Was the percentage of missing items given?
3 Was there a description of how missing items were handled?
4 Was the sample size included in the internal consistency analysis adequate?
5 Was the unidimensionality of the scale checked? i.e. was factor analysis or an IRT
model applied?
6 Was the sample size included in the unidimensionality analysis adequate?
7 Was an internal consistency statistic calculated for each (unidimensional)
(sub)scale separately?
8 Were there any important flaws in the design or methods of the study?
Statistical methods yes no NA
9 for Classical Test Theory (CTT): Was Cronbach's alpha calculated?
10 for dichotomous scores: Was Cronbach's alpha or KR-20 calculated?
11 for IRT: Was a goodness of fit statistic at a global level calculated? e.g. χ²,
reliability coefficient of estimated latent trait value (index of (subject or item)
separation)
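The internal consistency statistic of items 9–10 can be computed directly from the item-score matrix. A minimal pure-Python sketch (the function name and data layout are illustrative, not prescribed by the checklist):

```python
def cronbach_alpha(items):
    """Cronbach's alpha for one (sub)scale.

    items: one list of scores per item, all over the same respondents.
    For dichotomous 0/1 items this reduces to KR-20.
    """
    k = len(items)                      # number of items
    n = len(items[0])                   # number of respondents

    def var(xs):                        # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    sum_item_var = sum(var(item) for item in items)
    totals = [sum(item[r] for item in items) for r in range(n)]
    return k / (k - 1) * (1 - sum_item_var / var(totals))
```

Consistent with item 7, alpha should be computed once per unidimensional (sub)scale, not for a multidimensional total score.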
Box B. Reliability: relative measures (including test-retest reliability, inter-rater reliability and
intra-rater reliability)
Design requirements yes no ?
1 Was the percentage of missing items given?
2 Was there a description of how missing items were handled?
3 Was the sample size included in the analysis adequate?
4 Were at least two measurements available?
5 Were the administrations independent?
6 Was the time interval stated?
7 Were patients stable in the interim period on the construct to be measured?
8 Was the time interval appropriate?
9 Were the test conditions similar for both measurements? e.g. type of
administration, environment, instructions
10 Were there any important flaws in the design or methods of the study?
Statistical methods yes no NA ?
11 for continuous scores: Was an intraclass correlation coefficient (ICC)
calculated?
12 for dichotomous/nominal/ordinal scores: Was kappa calculated?
13 for ordinal scores: Was a weighted kappa calculated?
14 for ordinal scores: Was the weighting scheme described? e.g. linear,
quadratic
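For items 13–14, a weighted kappa with an explicitly described weighting scheme can be sketched as follows (pure Python; function and argument names are illustrative):

```python
def weighted_kappa(r1, r2, n_cats, scheme="linear"):
    """Weighted kappa for two raters over ordinal categories 0..n_cats-1."""
    n = len(r1)

    def w(i, j):                        # disagreement weight
        d = abs(i - j) / (n_cats - 1)
        return d if scheme == "linear" else d ** 2   # "quadratic" scheme

    # observed disagreement
    d_obs = sum(w(a, b) for a, b in zip(r1, r2)) / n
    # chance-expected disagreement from the two marginal distributions
    p1 = [sum(1 for a in r1 if a == i) / n for i in range(n_cats)]
    p2 = [sum(1 for b in r2 if b == i) / n for i in range(n_cats)]
    d_exp = sum(p1[i] * p2[j] * w(i, j)
                for i in range(n_cats) for j in range(n_cats))
    return 1 - d_obs / d_exp
```

With quadratic weights, weighted kappa approximates the ICC, which is one reason the weighting scheme (item 14) must be reported.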
Box C. Measurement error: absolute measures
Design requirements yes no ?
1 Was the percentage of missing items given?
2 Was there a description of how missing items were handled?
3 Was the sample size included in the analysis adequate?
4 Were at least two measurements available?
5 Were the administrations independent?
6 Was the time interval stated?
7 Were patients stable in the interim period on the construct to be measured?
8 Was the time interval appropriate?
9 Were the test conditions similar for both measurements? e.g. type of
administration, environment, instructions
10 Were there any important flaws in the design or methods of the study?
Statistical methods yes no ?
11 for CTT: Was the Standard Error of Measurement (SEM), Smallest Detectable
Change (SDC) or Limits of Agreement (LoA) calculated?
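The three absolute agreement parameters of item 11 can all be derived from the test–retest difference scores. A sketch under the usual assumptions (no systematic change between administrations; the 1.96 multiplier assumes normally distributed differences):

```python
import math

def agreement_stats(t1, t2):
    """SEM, SDC (individual level) and Bland-Altman LoA from a test-retest pair."""
    diffs = [b - a for a, b in zip(t1, t2)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    sem = sd_d / math.sqrt(2)                 # SEM via the difference method
    sdc = 1.96 * math.sqrt(2) * sem           # smallest detectable change
    loa = (mean_d - 1.96 * sd_d, mean_d + 1.96 * sd_d)
    return sem, sdc, loa
```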
Box D. Content validity (including face validity)
General requirements yes no ?
1 Was there an assessment of whether all items refer to relevant aspects of the
construct to be measured?
2 Was there an assessment of whether all items are relevant for the study
population? (e.g. age, gender, disease characteristics, country, setting)
3 Was there an assessment of whether all items are relevant for the purpose of the
measurement instrument? (discriminative, evaluative, and/or predictive)
4 Was there an assessment of whether all items together comprehensively reflect
the construct to be measured?
5 Were there any important flaws in the design or methods of the study?
Box E. Structural validity
yes no ?
1 Does the scale consist of effect indicators, i.e. is it based on a reflective model?
Design requirements yes no ?
2 Was the percentage of missing items given?
3 Was there a description of how missing items were handled?
4 Was the sample size included in the analysis adequate?
5 Were there any important flaws in the design or methods of the study?
Statistical methods yes no NA
6 for CTT: Was exploratory or confirmatory factor analysis performed?
7 for IRT: Were IRT tests for determining the (uni-) dimensionality of the items
performed?
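As a crude complement to the factor analysis of item 6, the dominance of the first principal component of the inter-item correlation matrix gives a first signal of unidimensionality. A pure-Python sketch using power iteration (an illustrative screening check, not a substitute for proper CFA or IRT fit testing):

```python
import math

def pearson(x, y):
    """Pearson correlation between two item-score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def first_component_share(items, iters=500):
    """Share of total variance on the first principal component of R."""
    k = len(items)
    R = [[pearson(items[i], items[j]) for j in range(k)] for i in range(k)]
    v = [1.0] * k
    for _ in range(iters):                    # power iteration
        w = [sum(R[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    lam = sum(v[i] * sum(R[i][j] * v[j] for j in range(k)) for i in range(k))
    return lam / k                            # trace of R equals k
```

A share close to 1 suggests one dominant dimension; a share near 1/k suggests essentially uncorrelated items.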
Box F. Hypotheses testing
Design requirements yes no ?
1 Was the percentage of missing items given?
2 Was there a description of how missing items were handled?
3 Was the sample size included in the analysis adequate?
4 Were hypotheses regarding correlations or mean differences formulated a priori
(i.e. before data collection)?
yes no NA
5 Was the expected direction of correlations or mean differences included in the
hypotheses?
6 Was the expected absolute or relative magnitude of correlations or mean
differences included in the hypotheses?
7 for convergent validity: Was an adequate description provided of the comparator
instrument(s)?
8 for convergent validity: Were the measurement properties of the comparator
instrument(s) adequately described?
9 Were there any important flaws in the design or methods of the study?
Statistical methods yes no NA
10 Were design and statistical methods adequate for the hypotheses to be tested?
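Items 4–6 require a priori hypotheses that state both direction and magnitude. One way to make such hypotheses checkable is to encode them explicitly, e.g. as intervals of expected correlations (the instrument names and bounds below are hypothetical examples, not COSMIN requirements):

```python
# A priori hypotheses: expected Pearson r with each comparator instrument.
# Names and thresholds are illustrative assumptions only.
hypotheses = {
    "comparator_same_construct": (0.50, 1.00),   # convergent: high positive r
    "comparator_unrelated":      (-0.30, 0.30),  # divergent: negligible r
}

def hypothesis_confirmed(name, observed_r):
    """True if the observed correlation falls inside the pre-specified interval."""
    lo, hi = hypotheses[name]
    return lo <= observed_r <= hi
```

Formulating the bounds before data collection is what distinguishes hypothesis testing from post-hoc interpretation of correlations.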
Box G. Cross-cultural validity
Design requirements yes no ?
1 Was the percentage of missing items given?
2 Was there a description of how missing items were handled?
3 Was the sample size included in the analysis adequate?
4 Were both the original language in which the HR-PRO instrument was developed,
and the language in which the HR-PRO instrument was translated described?
5 Was the expertise of the people involved in the translation process adequately
described? e.g. expertise in the disease(s) involved, expertise in the construct to
be measured, expertise in both languages
6 Did the translators work independently from each other?
7 Were items translated forward and backward?
8 Was there an adequate description of how differences between the original and
translated versions were resolved?
9 Was the translation reviewed by a committee (e.g. original developers)?
10 Was the HR-PRO instrument pre-tested (e.g. cognitive interviews) to check
interpretation, cultural relevance of the translation, and ease of comprehension?
11 Was the sample used in the pre-test adequately described?
12 Were the samples similar for all characteristics except language and/or cultural
background?
13 Were there any important flaws in the design or methods of the study?
Statistical methods yes no NA
14 for CTT: Was confirmatory factor analysis performed?
15 for IRT: Was differential item functioning (DIF) between language groups assessed?
Box H. Criterion validity
Design requirements yes no ?
1 Was the percentage of missing items given?
2 Was there a description of how missing items were handled?
3 Was the sample size included in the analysis adequate?
4 Can the criterion used be considered a reasonable gold standard?
5 Were there any important flaws in the design or methods of the study?
Statistical methods yes no NA
6 for continuous scores: Were correlations, or the area under the receiver operating
characteristic (ROC) curve calculated?
7 for dichotomous scores: Were sensitivity and specificity determined?
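The statistics of items 6–7 can be sketched as follows: a rank-based AUC for continuous scores and sensitivity/specificity at a chosen cutoff for dichotomized scores (the cutoff direction assumes higher scores indicate criterion-positive cases; names are illustrative):

```python
def sens_spec(scores, criterion, cutoff):
    """Sensitivity and specificity, treating score >= cutoff as test-positive."""
    tp = sum(1 for s, c in zip(scores, criterion) if s >= cutoff and c)
    fn = sum(1 for s, c in zip(scores, criterion) if s < cutoff and c)
    tn = sum(1 for s, c in zip(scores, criterion) if s < cutoff and not c)
    fp = sum(1 for s, c in zip(scores, criterion) if s >= cutoff and not c)
    return tp / (tp + fn), tn / (tn + fp)

def auc(scores, criterion):
    """Area under the ROC curve: P(random positive outscores random negative)."""
    pos = [s for s, c in zip(scores, criterion) if c]
    neg = [s for s, c in zip(scores, criterion) if not c]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```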
Box I. Responsiveness
Design requirements yes no ?
1 Was the percentage of missing items given?
2 Was there a description of how missing items were handled?
3 Was the sample size included in the analysis adequate?
4 Was a longitudinal design with at least two measurements used?
5 Was the time interval stated?
6 If anything occurred in the interim period (e.g. intervention, other relevant events),
was it adequately described?
7 Did a proportion of the patients change (i.e. improve or deteriorate)?
Design requirements for hypotheses testing yes no ?
For constructs for which a gold standard was not available:
8 Were hypotheses about changes in scores formulated a priori (i.e. before data
collection)?
yes no NA
9 Was the expected direction of correlations or mean differences of the change
scores of HR-PRO instruments included in these hypotheses?
10 Were the expected absolute or relative magnitude of correlations or mean
differences of the change scores of HR-PRO instruments included in these
hypotheses?
11 Was an adequate description provided of the comparator instrument(s)?
12 Were the measurement properties of the comparator instrument(s) adequately
described?
13 Were there any important flaws in the design or methods of the study?
Statistical methods yes no NA
14 Were design and statistical methods adequate for the hypotheses to be tested?
Design requirement for comparison to a gold standard yes no ?
For constructs for which a gold standard was available:
15 Can the criterion for change be considered as a reasonable gold standard?
16 Were there any important flaws in the design or methods of the study?
Statistical methods yes no NA
17 for continuous scores: Were correlations between change scores, or the area
under the receiver operating characteristic (ROC) curve calculated?
18 for dichotomous scales: Were sensitivity and specificity (changed versus not
changed) determined?
Box J. Interpretability
yes no ?
1 Was the percentage of missing items given?
2 Was there a description of how missing items were handled?
3 Was the sample size included in the analysis adequate?
4 Was the distribution of the (total) scores in the study sample described?
5 Was the percentage of the respondents who had the lowest possible (total) score
described?
6 Was the percentage of the respondents who had the highest possible (total)
score described?
7 Were scores and change scores (i.e. means and SD) presented for relevant (sub)
groups? e.g. for normative groups, subgroups of patients, or the general
population
8 Was the minimal important change (MIC) or the minimal important difference
(MID) determined?
9 Were there any important flaws in the design or methods of the study?
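The floor and ceiling percentages of items 5–6 are simple to compute from the total-score distribution. A sketch (a commonly used rule of thumb, which COSMIN itself does not prescribe, flags a floor or ceiling effect above roughly 15%):

```python
def floor_ceiling(totals, min_score, max_score):
    """Percentage of respondents at the lowest/highest possible total score."""
    n = len(totals)
    floor_pct = 100 * sum(1 for t in totals if t == min_score) / n
    ceiling_pct = 100 * sum(1 for t in totals if t == max_score) / n
    return floor_pct, ceiling_pct
```

Large floor or ceiling percentages limit the interpretability of change scores, since patients at an extreme cannot record further deterioration or improvement.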
Step 4. Determining the generalisability of the results
Box Generalisability
yes no NA
Was the sample in which the HR-PRO instrument was evaluated adequately
described? In terms of:
1 median or mean age (with standard deviation or range)?
2 distribution of sex?
3 important disease characteristics (e.g. severity, status, duration) and
description of treatment?
4 setting(s) in which the study was conducted? e.g. general population,
primary care or hospital/rehabilitation care
5 countries in which the study was conducted?
6 language in which the HR-PRO instrument was evaluated?
7 Was the method used to select patients adequately described? e.g. convenience,
consecutive, or random
yes no ?
8 Was the percentage of missing responses (response rate) acceptable?