Module 4: Health Data Analytics
Data Analytics
4-1
Data Types and
Measurement
4-10
pp. 219-220
1. Categorical (count)
2. Continuous (measurement)
• There are different sampling methods, data collection, and
analysis for each type of data
• Performance and process improvement work involves both
types of data and associated statistics
4-11
p. 219
• Nominal - Also known as count, discrete, qualitative;
considered attributes data with no quantitative value
Binary data: 2 possibilities/values
Nominal Values Categories
Surgical patients Preoperative or postoperative
Gender Female or male
Patient education Attended or did not attend class
4-12
p. 219
• Ordinal – Nominal data put into categories and rank-ordered
Ordinal Values Categories
Nursing staff rank Nurse level I, II, III, IV, V
Education AD, BS, MS, PhD
Attitude toward research scale Agree, neutral, disagree
4-13
p. 220
• Interval – Measured on scales that theoretically have no
gaps; considered variables data; no true zero
Interval Data
Equal distance between each point (e.g., values
on a thermometer); no true zero
4-14
p. 220
• Ratio – Measured on scales that theoretically have no gaps;
considered variables data; has a true zero
Ratio Data
Equal distance between each point, but there is
a true zero – no value goes below zero (e.g.,
height and weight)
4-15
p. 220
• A critical issue is whether right data are measured or counted
• Most quality improvement data (continuous/count) readily available are
analyzed because they are easy to retrieve but are not always the best
data to use
Categorical Data: Least statistical power
Example: hypertensive versus non-hypertensive
Continuous Data: Most power and need fewer data points
Example: systolic and diastolic BP values
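
The contrast above can be sketched in a few lines of Python. This is only an illustration: the readings are invented and the 140 mmHg cutoff is an assumed threshold, not something stated in the source. It shows how converting measurements to counts discards the size of each difference.

```python
# Hypothetical systolic BP readings (mmHg); invented for illustration only.
systolic = [118, 126, 139, 141, 152, 168]
CUTOFF = 140  # assumed illustrative threshold, not taken from the source

# Continuous (measurement) view: every reading keeps its magnitude.
mean_systolic = sum(systolic) / len(systolic)

# Categorical (count) view: each reading collapses to one of two labels.
labels = ["hypertensive" if v >= CUTOFF else "non-hypertensive" for v in systolic]
hypertensive_count = labels.count("hypertensive")

print(f"Mean systolic BP: {mean_systolic:.1f} mmHg")
print(f"Hypertensive: {hypertensive_count} of {len(systolic)}")
```

Readings of 139 and 141 fall into different categories even though they differ by only 2 mmHg, while 141 and 168 share a category. That lost detail is one way to see why count data carry less statistical power.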
4-16
p. 225
4-17
Are the following data categorical or continuous?
• How many patients had surgery this month?
• A patient's temperature was 103 degrees. You medicated the
patient with Tylenol and his temperature came down to 101
degrees.
• You want to know what the average length of stay was for
patients in the intensive care unit in the first six months of
the year.
4-18
p. 204
• Healthcare data must be carefully defined and systematically
collected and analyzed
• Tremendous amounts of healthcare data and information are
available; not all is useful
• Mature quality improvement information revolves around
clearly established patterns of care, not individual cases
4-19
p. 204
Data: Abstract representations of things, facts, concepts, and instructions that are stored in a defined format and structure
Information: Obtained when data are translated into results and statements that are useful for decision making
Decisions
4-20
pp. 204-206,
219
• Identify current available data sources
• Identify critical information needs
• Define data elements
• Determine data collection plan
• Acquire/collect data
• Aggregate and display data
4-21
pp. 204-206,
219
• Analyze data
• Interpret data/information
• Act on information
• Report data/information/knowledge/decision
• Collect more data to monitor/analyze the decision
4-22
pp. 204-206,
219
• Identify who needs to know information
• Determine what specific information they need
• Develop a system where the right people receive the right
information at the right time in the right way
4-23
pp. 211-214
• Clinical information systems support direct care processes
(laboratory/radiology results, etc.)
• Administrative (non-clinical) support systems aid day-to-day
operations (billing, financial, human resources, etc.)
• Decision support systems deal with strategic planning
functions (resource allocation, performance evaluation,
product evaluation)
4-24
pp. 213-214
• Helps in making comparisons with competitors
• Identifies practitioners and providers who meet acceptable
levels of quality
• Allows providers to respond rapidly to market changes
• Justifies pay for exceptional performance
• Analyzes and interprets outcomes data
• Used to develop outcomes information management plan
4-25
pp. 213-214
• Identifies positive and negative outcomes
• Includes risk/severity adjustment data
• Facilitates cross-functional analyses
• Integrates clinical and financial data
4-26
pp. 213-214
• Strategic planning and marketing
• Resource allocation
• Performance evaluation and monitoring
• Product evaluation and services
• Medical management
Evidence-based practice
Clinical guidelines
Clinical pathways
4-27
p. 212
• Does system allow for:
Capture, storage, and retrieval of clinical and financial information
from a variety of sources?
Interface with other information systems?
Establishment of trigger or threshold measures?
Critical alerts (e.g., for abnormal values)?
Rules-based processing?
4-28
p. 212
• Does system allow for:
Both concurrent and retrospective reviews?
Supporting accreditation and regulatory requirements?
Data mining reporting or statistical analysis?
Multiple users accessing at same time?
Various operating systems?
Secure access to reports?
4-29
p. 209
• Chart-based (EHR-based) system
Nursing or medical record analysts review records/EHRs
Input diagnostic, procedural, and detailed clinical findings
Has higher cost and smaller sample size
To identify severity-adjusted and risk-adjusted information
– Severity-adjusted: Measurement data
– Risk-adjusted: Count data
4-30
• Code-based system
Based on retrospective administrative data such as
Uniform bill document (UB-92)
Uses clinical information spanning entire stay
Has lower cost and larger sample size
Submission of payer data deemed public information
required by states
4-31
Performance Improvement
and Research
4-32
pp. 206-207
• Quality, performance improvement activities and research
exist on a continuum of rigor
• May be viewed more like a soft science
• Scientific approach involves inductive and deductive
reasoning which might be considered superior
• Underlying assumptions of design, measurement,
interpretation are similar
4-33
p. 206
4-34
pp. 206-207
• Research utilization is a key aspect of QI process and critical
to achieving healthcare quality as defined by local, state, and
federal regulations and standards
• Healthcare quality professionals use the level of research
rigor that best answers specific performance improvement
question and area of study balancing rigor and practicality
4-35
p. 207
4-36
Collect, Interpret,
and Use Data
4-37
p. 219
• Determine who, what, when, where, how,
and why
• Structure the design of the collection methodology
• Choose and develop the sampling method
• Determine and conduct data collection training
4-38
p. 219
• Delegate responsibilities for data collection
• Facilitate coordination among involved groups
• Forecast budget
• Conduct pilots of forms and collection process
4-39
pp. 234-235
• Healthcare quality professionals assist organizations and
practitioners by interpreting both comparison and
benchmarking results
• Comparison: Examine processes and results against a
reference point either internally or externally with
competitors and other organizations providing similar
services
• Benchmarking: Examine processes and results that
represent best practices for similar activities inside or
outside the healthcare industry
4-40
pp. 234-235
• Involves asking the right questions
What is the best practice?
What are we doing?
How are we doing it?
How well are we doing it?
What are the measurement results?
Why are we looking for improvement?
4-41
pp. 234-235
• Enables organization to set target or goal for process
improvement activities
• Potential data sources for benchmarking
Government (CMS, CDC, state agencies)
Large healthcare alliances or systems
State peer review organizations and hospital associations
For-profit database companies
4-42
pp. 204-205
• Step 1: Plan and organize
Anticipate barriers, identify responsibilities, and lay groundwork for
multidisciplinary collaboration
Develop data dictionary
• Step 2: Verify and correct
Begin limited data collection as a pilot test
Identify data limitations and errors
Modify data collection plan, if needed
Collect data as planned
4-43
p. 205
• Step 3: Identify and present findings
How do data compare with data from other organizations?
What is the trend over time?
How are data likely to be interpreted?
Is there an opportunity for improvement?
Who should receive the data?
For what purpose?
4-44
pp. 205-206
• Step 4: Study and develop recommendations
Perform variation analysis
Review additional data
Conduct retrospective medical reviews
Perform process analysis
4-45
p. 206
• Step 5: Take action
Empower teams to make decisions and implement changes based on
information discovered
Educate and train staff
Report findings
Make necessary changes in policies and processes
Implement changes in practice patterns
4-38
p. 206
• Step 6: Monitor performance
Have proposed changes actually been implemented?
How could compliance with changes be enhanced?
What effect are changes having on patient outcomes?
Should changes be modified and then tested further, tested longer, or
ended?
4-39
p. 206
• Step 7: Communicate results
Barriers to interpretation and utilization of information
–Human (fear of data, resentment of external data, unrealistic
expectations about data such as perfect data)
–Statistical (flawed data, missing data, untimely data, poorly
displayed data, difficult to integrate with other organizational data)
–Organizational (data overload, poor data retrieval system, lack of
resources such as time, people, money)
4-40
pp. 232-234,
238
• Report and analyze data regularly
Consider timeliness of internally gathered data, internal data
gathered by external sources, and external data
• Validate accurate data collection
• Display data in easily understood format
• Provide a brief summary of data
• Provide contextual background
• Use graphs to display data and include a table of values
4-41
pp. 209-210,
232-234
• Explain data collection specifics (how, when, where, from whom)
• Summarize meaning of values and how they were computed
• Identify removed outliers
• Include time order
• Analyze variances and identify unexpected
patterns
Common cause
Special cause
4-42
p. 220
• Population (N): total aggregate or group
• Sample (n): a portion of the population
• Sampling:
Provides a logical way of making statements about a larger group
Allows quality professionals to make statements or generalize from
the sample to the population depending on the type of sampling used
4-43
p. 221
• Probability sampling
Every element in the population has an
equal or random chance of being selected
• Nonprobability sampling
It is not possible to estimate the probability
that every element has been included
4-44
p. 221
• Simple random sampling: Each individual in the population has
an equal chance to be chosen
Put all names in a hat; pull one for a door prize
• Systematic sampling: After random selection of first case, draw
every nth case from population
Every fifth patient
• Stratified random sampling: The population is divided into
groups; each member of the group has an equal probability of
being selected
Patients with particular diseases
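
A minimal sketch of the three probability sampling methods above, using only Python's standard library. The patient list, the disease groupings, and the sample sizes are hypothetical and chosen only to make the example run.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable
patients = [f"patient_{i:03d}" for i in range(1, 101)]  # hypothetical population, N = 100

# Simple random sampling: every patient has an equal chance of selection.
simple = rng.sample(patients, 10)

# Systematic sampling: random start, then every nth (here, fifth) case.
n = 5
start = rng.randrange(n)
systematic = patients[start::n]

# Stratified random sampling: divide the population into groups,
# then sample randomly within each group (group names are invented).
strata = {
    "diabetes": patients[:40],
    "heart_failure": patients[40:70],
    "copd": patients[70:],
}
stratified = {name: rng.sample(group, 5) for name, group in strata.items()}

print(len(simple), len(systematic), {k: len(v) for k, v in stratified.items()})
```

Drawing the same number from each stratum is just one design choice; sampling in proportion to stratum size is another common option.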
4-45
p. 221
• Convenience sampling: Any available group of subjects is used
(lack of randomization)
Selecting participants who took one instructor’s CPHQ class but not
including all classes taught
• Snowball sampling: Subjects suggest other subjects (subtype of
convenience sampling)
Asking cancer patients in a clinic to identify other cancer patients they
know
• Purposive or judgment sampling: A particular group is
subjectively selected based on criteria
Using a group of nurses because researcher believes they represent a
cross section of women
4-46
p. 221
• Quota sampling
A judgment is made about the most representative sample
– 15 charts per month
– 5% or 30 – whichever is greater
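
The "5% or 30, whichever is greater" rule can be written as a small function. This is a sketch under two assumptions that the source does not state: fractional results are rounded up, and the sample can never exceed the number of charts available. The function name is hypothetical.

```python
import math

def quota_sample_size(population_size, percent=5, minimum=30):
    """Return `percent`% of the population or `minimum`, whichever is greater.

    Assumptions (not from the source): round up, and cap at the population size.
    """
    target = max(math.ceil(population_size * percent / 100), minimum)
    return min(target, population_size)

for charts in (20, 200, 1000):
    print(charts, "charts ->", quota_sample_size(charts))
```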
4-47
p. 221
• With exception of case studies, larger samples yield a more
valid and accurate study and are more representative of the
population
• As the actual difference between groups gets smaller, the
size of the sample required to detect the difference gets
bigger
• Using too large a sample to answer the research question is
a waste of time and resources
4-48
p. 221
• Regardless of the shape of the original population
distribution, as sample size increases, the shape of the sampling
distribution of the mean approaches a normal bell curve
• Factors that influence sample size include: 1) research
purpose and design, 2) level of confidence desired, 3)
anticipated degree of difference between study groups, and
4) size of population
• Continuous data require a smaller sample size than
categorical data
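
The first point above can be explored with a short simulation. This is an illustrative sketch, not a proof: the skewed "population" is simulated (think of lengths of stay in days), and the sample sizes of 5 and 50 are arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)
# Hypothetical right-skewed population (simulated values, mean near 4).
population = [rng.expovariate(1 / 4) for _ in range(10_000)]

def sample_means(n, draws=1_000):
    """Means of `draws` simple random samples, each of size n."""
    return [statistics.fmean(rng.sample(population, n)) for _ in range(draws)]

means_small = sample_means(5)
means_large = sample_means(50)

# Spread of the sample means for each sample size.
spread_small = statistics.stdev(means_small)
spread_large = statistics.stdev(means_large)
print(f"n=5: spread of sample means = {spread_small:.2f}")
print(f"n=50: spread of sample means = {spread_large:.2f}")
```

The sample means cluster more tightly around the population mean as the sample size grows. Plotting a histogram of `means_large` next to one of `population` would show the bell shape emerging even though the population itself is skewed.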
4-49
Which of these sampling methods is NOT probability
sampling?
A. Simple random sampling
B. Quota sampling
C. Systematic sampling
D. Stratified random sampling
50
The quality professional evaluated hypertension rates in his
internal medicine clinic, looking at ages <35, 35-50, 50-65 and
>65. He evaluated a sample from each age group. What type of
sampling did he use?
A. Purposive sampling
B. Systematic sampling
C. Simple random sampling
D. Stratified random sampling
51
The quality team is evaluating handwashing rates in the three
intensive care units. They use a checklist and have someone
stationed for 1 hour daily covering both 12-hour shifts for one
week. What type of sampling best describes this method?
A. Simple random sampling
B. Convenience sampling
C. Purposive sampling
D. Systematic sampling
52
Statistical Analysis
Methods
4-53
p. 222
• Reliability: Extent to which an instrument yields the same
result on repeated trials
Test-retest
Split half
Reliability by equivalence
Reliability coefficient: Also known as correlation coefficient;
numerical index of comparison’s reliability; greater than 0.70
acceptable
Inter-rater reliability: Degree to which two raters, operating
independently, will assign same ratings
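
As a hedged illustration of the reliability coefficient, the sketch below computes a Pearson correlation between two administrations of the same instrument (test-retest) and compares it with the 0.70 threshold above. The scores are invented, and Pearson's r is only one of several coefficients that could be used here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scores for six respondents on two occasions.
first_trial = [12, 15, 9, 20, 17, 11]
second_trial = [13, 14, 10, 19, 18, 10]

r = pearson_r(first_trial, second_trial)
acceptable = r > 0.70
print(f"r = {r:.2f}, acceptable: {acceptable}")
```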
4-54
pp. 222-223
• Validity: Degree to which instrument measures what it is intended
to measure
• Content (face) validity: Degree to which instrument adequately
represents universe of content; includes judgments by experts
about the degree to which a test appears to measure what it is
supposed to measure
• Criterion-related validity: Extent that score on instrument can be
related to a criterion (behavior instrument is supposed to predict);
for example risk-adjustment scales are tools used to measure
morbidity and mortality
• Can be predictive (future) or concurrent (present)
4-55
Statistical Techniques
4-56
p. 223
• Measures of central tendency are statistical indexes that
describe where a set of scores/values of a distribution
cluster
Central refers to middle value
Tendency refers to general trend of the numbers
• Type and distribution of the data determine which measures
of central tendency are most appropriate – mean, median, or
mode
4-57
pp. 223-224
• Mean = average
• Median = middle
• Mode = most frequently occurring
4-58
p. 223
• The mean of a set of measurements is the sum of all
scores/values divided by the total number of scores
• Most commonly used
• Most sensitive to extreme scores
• Used with interval and ratio data
4-59
The mean is used with interval and ratio data. These types of
data are
A. Categorical or count data
B. Continuous or measurement data
60
p. 223
• Zero is a numerical value and must be included
Example 1
Apgar scores: 7, 8, 8, 1, 8
Sum of scores = 32
Divide by 5 scores = 6.4
Mean = 6.4
Example 2
Infection rates: 0, 0, 0, 0, 1.5, 3.2, 4.3, 5.4
Sum of rates = 14.4
Divide by 8 rates = 1.8
Mean = 1.8
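
The two mean calculations can be checked in Python. The last step is an added illustration, not part of the source examples, showing what happens if the zeros are left out by mistake.

```python
apgar_scores = [7, 8, 8, 1, 8]
infection_rates = [0, 0, 0, 0, 1.5, 3.2, 4.3, 5.4]

# Mean = sum of all values divided by the number of values.
mean_apgar = sum(apgar_scores) / len(apgar_scores)            # 32 / 5
mean_infection = sum(infection_rates) / len(infection_rates)  # 14.4 / 8, zeros counted

# Leaving out the zeros changes both the sum's divisor and the result.
nonzero = [rate for rate in infection_rates if rate != 0]
mean_without_zeros = sum(nonzero) / len(nonzero)              # 14.4 / 4

print(f"{mean_apgar:.1f} {mean_infection:.1f} {mean_without_zeros:.1f}")
```

Dropping the four zero rates would double the reported mean, which is why zero must be treated as a real value.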
4-61
pp. 223-224
• Measure that corresponds to middle score; does not take
quantitative value of individual scores into account
• Point on a numerical scale above and below which 50% of
data falls
• Arrange values in rank order
• If number of values is:
Odd, count from ends to middle value
Even, compute mean of two middle values
4-62
pp. 223-224
Example 1
Values: 2, 2, 3, 4, 5, 6, 6, 8, 9
5 is the middle number
Median = 5
Mean = 5
Example 2
Values: 2, 2, 2, 3, 4, 5, 6, 6, 8, 9
Add 4 plus 5 (middle numbers) and divide by 2 = 4.5
Median = 4.5
Mean = 4.7
Example 3
Values: 2, 2, 2, 3, 4, 5, 6, 6, 8, 84
Quantitative values of individual #s not taken into account
Median = 4.5
Mean = 12.2
4-63
p. 224
• Score or value that occurs most frequently and is easiest to
determine
• Can be calculated quickly and easily
• Can vary widely from sample to sample (unstable)
4-64
p. 224
Examples
Values: 30, 31, 31, 32, 33, 33, 33, 33, 33, 34, 35, 36
Mode = 33
Values: 2, 3, 6, 8, 10
Mode = none (no value occurs more than once)
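
A small sketch of how the mode could be found programmatically. The helper name `modes` is hypothetical, and two conventions are assumed: an empty list means "no mode" when every value occurs once, and tied values are all returned.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s); an empty list means there is no mode."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs once, so no mode is reported
    return sorted(v for v, c in counts.items() if c == top)

print(modes([30, 31, 31, 32, 33, 33, 33, 33, 33, 34, 35, 36]))
print(modes([2, 3, 6, 8, 10]))
```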
4-65
p. 224
• How spread out the measures are
• Degree to which values differ
4-66
p. 224
• Difference between the highest and lowest values in a
distribution of scores
• Best reported as the values themselves and not as the
distance between the values
• Provides quick estimate of variability
• Varies easily and affected by extreme values
Example
Test scores: lowest score = 60, highest score = 98
Therefore the range is 60 to 98
4-67
pp. 223-224
Values Statistics
A. 2, 4, 6, 8, 10 Mean = Median =
B. 2, 4, 6, 8, 100 Mean = Median =
C. 0, 2, 4, 6, 7, 8, 10 Mean = Median =
D. 2, 4, 6, 6, 8, 10 Mode =
E. 2, 4, 4, 6, 6, 8, 10 Mode =
F. 2, 4, 4, 6, 6, 6, 8, 8, 10 Mode =
G. 2, 4, 6, 8, 10 Range =
H. 102, 104, 106, 108, 110 Range =
4-68
pp. 224-225
• Measure of variability – Roughly the average deviation of scores from the mean
(computed as the square root of the average squared deviation)
Standard: Average spread of scores around mean
Deviation: How much each score is scattered from the mean
• Most frequently used statistic for measuring degree of
variability
• σ (sigma) is the symbol for standard deviation
4-69
pp. 224-225
• A normal distribution is a standard bell curve
• Used with normally distributed interval or ratio data
• The greater the spread of distribution, the greater dispersion
or variability from the mean (heterogeneous, more
differences)
• The more values cluster around the mean, the smaller the
variability or deviation (homogeneous, more similar)
• All scores are taken into consideration
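
To make the homogeneous versus heterogeneous contrast concrete, the sketch below compares two invented data sets that share the same mean. It uses `statistics.pstdev`, the population form of the standard deviation; the sample form (`statistics.stdev`, which divides by n - 1) would give slightly larger values.

```python
import statistics

# Two hypothetical sets of scores with the same mean (50).
homogeneous = [48, 49, 50, 51, 52]    # values cluster around the mean
heterogeneous = [30, 40, 50, 60, 70]  # values spread far from the mean

# Population standard deviation: square root of the average squared deviation.
sd_homogeneous = statistics.pstdev(homogeneous)
sd_heterogeneous = statistics.pstdev(heterogeneous)

print(f"Homogeneous:   mean = {statistics.mean(homogeneous)}, sd = {sd_homogeneous:.2f}")
print(f"Heterogeneous: mean = {statistics.mean(heterogeneous)}, sd = {sd_heterogeneous:.2f}")
```

Both sets have a mean of 50, yet the second has ten times the standard deviation of the first, which is the greater dispersion the bullets describe.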
4-70
pp. 224-225
4-71
pp. 224-225
4-72
Read & Review
• HQ Solutions:
166 – 176
216 – 230
Homework
• Review the following topics (links provided in the participant workbook):
Quality Essential Toolkit
Run Chart
Pareto Chart
Histogram
73