
Introduction to Statistics in Healthcare


Section 1.1
What is Statistics?
Learning Objectives
At the end of this lecture, the student should be able to:
 State at least one definition of statistics.
 Give one example of a population parameter and one example of a sample statistic.
 Classify a variable as quantitative or qualitative, and as nominal,
ordinal, interval or ratio.
Outline

 Definition of Statistics
 Population Parameter & Sample Statistic
 Classifying Levels of Measurement
Definition of Statistics
Concepts in Statistics

 What is statistics?
 Individuals and Variables
 Examples of statistics, individuals and variables in healthcare.
What is statistics?

 Statistics is the study of how to collect, organize, analyze, and interpret numerical information
and data.
 Statistics is both the science of uncertainty and the technology of extracting information from
data.
 Statistics is used to help us make decisions. This is especially important in health care and public
health.
Example: CDC & the flu vaccine
o During the year, the United States Centers for Disease Control and Prevention (CDC)
collects, organizes, analyzes, and interprets numerical information and data.
o They extract information from the data and make decisions about what to include in next
year’s flu vaccine.
Individuals & Variables
Meaning outside Statistics

 Individuals are people.
- We expect 50 individuals at the graduation.
 A variable is a factor that can vary, possibly causing a problem.
- The time the shop takes with my car is an unknown variable, and I can’t predict it.
Meaning in Statistics

 Individuals are people or objects included in a study.
- 5 individuals could be 5 people, 5 records, or 5 reports.
 A variable is a characteristic of the individual to be measured or observed.
- The age of an individual person
- The time an individual record was entered
- The diagnosis listed on an individual report
A few examples

Individual                                     Variable
Kidney dialysis patient                        Number of blood transfusions
Baby born to a mother who smokes cigarettes    Birthweight
Post-menopausal woman                          Compliance with health screening recommendations
State with casinos                             Rate of pathological gambling
County with hospitals                          Medication error rate
Urban city                                     Rate of overdose deaths

Concepts in Statistics

 Statistics is used in healthcare and other disciplines to aid decision-making.
 Understanding statistics is necessary to understand certain processes in healthcare.
Parameters vs. Statistics

 What is a population and what is a sample?


 Difference between population and sample data
 Population parameters and sample statistics
 Examples of parameters and statistics
What is a Population?
Definition
 A population is a group of people or objects with a common theme.
 When every member of that group is considered, it is a population.
 Example
- Theme: nurses who work at Massachusetts General Hospital (MGH)
- Population: list from Human Resources of every currently employed nurse at MGH
What is a Sample?
Definition

 A sample is a portion of the population, usually a small one.
 It can be a representative sample.
 But it can also be a biased sample.
 Example
- Surveying only ICU nurses at MGH: not a representative sample.
- Surveying at least one nurse from each department: a more representative sample.
Population vs. Sample Data
Population Data

 In population data, data from every individual in the population is available.
 Entire population = census
Sample Data
 In sample data, data is only available from some of the individuals in the population.
 Very commonly used in research studies of patients.
Examples of Population data

 Medicare Claims Dataset
- Contains all the insurance claims filed by the Medicare population.
 United States Census (conducted every 10 years)
Examples of Sample Data
 The Medicare Beneficiary Survey (MBS) is a survey of a sample of individuals on Medicare.
 The American Community Survey (ACS) is conducted yearly by the United States Census Bureau.
Statistical Notation
Number of individuals in the population = N
Number of individuals in the sample = n
Parameter vs. Statistic
A parameter is a measure that describes the entire population.
A statistic is a measure that describes only a sample of a population.
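The difference can be sketched in a few lines of Python. This is only an illustration: the ages below are made-up values (not real Medicare data), and the variable names are hypothetical. The mean of all N ages is a parameter; the mean of a random sample of n ages is a statistic.

```python
import random
import statistics

# Hypothetical population: ages of N = 10 people (made-up values for illustration).
population_ages = [67, 72, 81, 69, 90, 75, 66, 84, 78, 70]
N = len(population_ages)

# Parameter: a measure that describes the entire population.
population_mean = statistics.mean(population_ages)

# Statistic: a measure that describes only a sample of n individuals.
rng = random.Random(1)  # fixed seed so the sketch is repeatable
n = 4
sample_ages = rng.sample(population_ages, n)
sample_mean = statistics.mean(sample_ages)

print(f"N = {N}, parameter (population mean) = {population_mean}")
print(f"n = {n}, statistic (sample mean) = {sample_mean}")
```

The sample mean will usually be close to, but not exactly equal to, the population mean.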
Example of Parameters & Statistics

Parameter                                         Statistic
Mean age of every American on Medicare            Mean age of Americans on Medicare, estimated using the MBS
Proportion of Americans addicted to cigarettes    Proportion of Americans in the Behavioral Risk Factor Surveillance System (BRFSS) who admit they are addicted to cigarettes
Actual voter turnout                              Proportion of people in opinion polls who say they plan to vote

Don’t Get Confused!

 When you hear a number on the television or radio, do they mention whether it is a population
parameter or a sample statistic?
 Clues that the number is a population parameter:
- A dataset was used that encompasses the entire population (like Medicare).
- Analyses were done by the government or on behalf of the government.
 Clues that the number is a sample statistic:
- It was from a study recruiting volunteers.
- The report mentions only surveying or measuring a sample of individuals.
Describing vs. Inferring

 Descriptive statistics involves methods of organizing, picturing, and summarizing
information from samples and populations.
 Inferential statistics involves methods of using information from a sample to draw
conclusions regarding the population.
Parameters vs. Statistics

 In statistics, it is important to properly identify measures as population parameters or sample
statistics.
 Different types of data are used for parameters and statistics.

Classifying Levels of Measurement


Classifying Variables

 Quantitative vs. qualitative


- Interval vs. ratio data
- Nominal vs. ordinal data
 Examples of how to classify healthcare data.
Four-level Data Classification
Human research data
- Quantitative (continuous)
  - Interval
  - Ratio
- Qualitative (categorical)
  - Nominal
  - Ordinal
Quantitative vs. Qualitative


Quantitative is a numerical measurement of something.
Examples of quantitative variables in healthcare.
 Time of admit
 Year of diagnosis
 Systolic blood pressure
 Platelet count
Qualitative refers to a “quality” or categorical characteristic of something.
Examples of Qualitative Variables in Healthcare
 Type of Health insurance
 Country of origin
 Stage of cancer
 Trauma center level.

Interval vs. Ratio


 Interval
- Quantitative (continuous) data
- Differences between data values are meaningful.
- There is no true zero.
 Ratio
- Quantitative (continuous) data
- Differences between data values are meaningful.
- There is a true zero.
Examples of quantitative variables in healthcare.
 Time of admit.
 Year of diagnosis
 Systolic blood pressure
 Platelet count

Ratio = there is a true zero
- A person who has died has a systolic blood pressure (SBP) of 0.
- The same is true of platelet count.
Interval = no true zero
- “Time” cannot have a true zero.
- Time of admit = 8:09 am
- Year of diagnosis = 1999

Four-level Data Classification


Nominal vs. Ordinal

Nominal
 Nominal applies to categories, labels or names, and cannot be ordered from smallest to largest.
Ordinal
 Ordinal applies to categories that can be arranged in order, but the differences
between data values cannot be determined or are meaningless.

Examples of Qualitative Variables in Healthcare


 Type of health insurance
 Country of origin
 Stage of cancer
 Trauma center level
Nominal = cannot be ordered (type of health insurance, country of origin).
Ordinal = natural order, but differences between levels are meaningless (stage of cancer, trauma center level).

Classifying Variables
 All data can be classified as quantitative or qualitative.
 Data can be further classified as interval, ratio, nominal or ordinal.
 It is important to know how to classify data in healthcare.
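The classification rules above can be sketched as a small decision function. This is an illustrative sketch only: the function name `classify` and its yes/no inputs are hypothetical, and the answers for each variable follow the examples given in this section.

```python
def classify(is_numeric: bool, has_true_zero: bool = False,
             has_natural_order: bool = False) -> str:
    """Return the level of measurement under the four-level scheme."""
    if is_numeric:
        # Quantitative: ratio if there is a true zero, otherwise interval.
        return "ratio" if has_true_zero else "interval"
    # Qualitative: ordinal if the categories have a natural order, otherwise nominal.
    return "ordinal" if has_natural_order else "nominal"


examples = {
    "Systolic blood pressure": classify(is_numeric=True, has_true_zero=True),
    "Year of diagnosis": classify(is_numeric=True, has_true_zero=False),
    "Stage of cancer": classify(is_numeric=False, has_natural_order=True),
    "Type of health insurance": classify(is_numeric=False, has_natural_order=False),
}

for variable, level in examples.items():
    print(f"{variable}: {level}")
```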

Conclusions
 Definition of Statistics
 Population Parameter & Sample Statistic
 Classifying Levels of Measurement

Section 1.2
Sampling

Learning Objectives
At the end of this lecture, the student should be able to:
 Define “sampling frame” and “sampling error”
 Give one example of how to do simple random sampling, and one example
of how to do systematic sampling.
 Explain one reason to choose stratified sampling over other approaches.
 State two differences between cluster sampling and convenience sampling.
 Give an example of a national survey that uses multi-stage sampling.
Outline
 Sampling Definitions
 Simple Random Sampling
 Stratified Sampling
 Systematic Sampling
 Convenience & Multi-stage Sampling

Sampling Definitions
Concepts in Sampling
 What is a “sample”?
 Sampling frames, and errors in representing sampling frames.
 Summary of definitions presented.

Sampling and Samples


 We take a sample of the population because we want to do “inferential
statistics”.
 We want to infer from the sample to the population
 Reasons not to measure the whole population:
- Impractical
- Unnecessary

Sampling Frame
 List of individuals from which a sample is actually selected.
 “List” may be a physical, concrete list
- List of students enrolled at a nursing college
 “List” may also be a theoretical list that does not exist yet
- List of patients who will present to the Emergency Department today.
 The sampling frame is the part of the population from which you want to draw a
sample.
 Therefore, you want everyone in your sampling frame to have a chance of being
selected for your sample.
Undercoverage
 What is it?
- Omitting population members from the sampling frame.
 How can it happen?
- List of nursing students may not include everyone for administrative
reasons.
- People who present to the Emergency Department at night might be
different from those who present during the day, so sampling only during the day leaves them out.
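As a rough sketch, undercoverage can be pictured as the gap between the population and the list actually used as the sampling frame. The names below are made up for illustration.

```python
# Hypothetical population of nursing students and the list used as the sampling frame.
population = {"Ana", "Ben", "Chloe", "Dev", "Elena", "Farid"}
sampling_frame = {"Ana", "Ben", "Chloe", "Dev"}  # two students missing from the list

# Undercoverage: population members who can never be selected
# because they do not appear in the sampling frame.
omitted = population - sampling_frame
coverage = len(sampling_frame & population) / len(population)

print("Omitted from frame:", sorted(omitted))
print(f"Coverage: {coverage:.0%}")
```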
Errors in Statistics
Fact-of-Life Error
 Sampling error
- The population mean will probably be different from your sample mean.
- The population percentage will probably be different from your sample
percentage.
Error You Want to Avoid
 Non-sampling error
- Example: using a bad list.
- Make sure that everyone in the population who is supposed to be represented in
your sampling frame is actually in it.
Causes of error
Fact-of-Life Error
 Sampling error
- Caused by the fact that, regardless of what you do, your sample will not
perfectly represent the population.
Error You Want to Avoid
 Non-sampling error
- Caused by poor sample design, sloppy data collection, inaccurate measurement
instruments, bias in data collection, and other problems introduced by the researcher.
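Sampling error can be seen in a short simulation. The sketch below assumes a made-up population of 1,000 simulated systolic blood pressure readings (not real data) and draws five random samples of 30; each sample mean lands near the population mean, but rarely exactly on it.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 1,000 systolic blood pressure readings (simulated).
population = [rng.gauss(120, 15) for _ in range(1000)]
population_mean = statistics.mean(population)

# Draw several random samples and compare each sample mean to the population mean.
sampling_errors = []
for _ in range(5):
    sample = rng.sample(population, 30)
    sampling_errors.append(statistics.mean(sample) - population_mean)

print(f"Population mean: {population_mean:.1f}")
for i, err in enumerate(sampling_errors, start=1):
    print(f"Sample {i}: sample mean differs from the population mean by {err:+.1f}")
```

No mistake was made in any of these samples; the differences are the fact-of-life sampling error.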

Simulation
 A simulation is defined as a “numerical facsimile or representation of a real-world
phenomenon.”
 It is essentially working through a pretend situation to see how it would
come out if it were real.
 That is why this course includes many simulations, or real-life examples.
Concepts in Sampling
 It is important to do your best to avoid non-sampling error.
 This is achieved by making sure you do not have undercoverage when
sampling from your sampling frame.

Simple Random Sampling


 What is simple random sampling?
 Two methods of randomly sampling from a list
 Limits of simple random sampling.
Definition & example
 “A simple random sample of n measurements from a population is a subset
of the population selected in such a manner that every sample of size n from
the population has an equal chance of being selected.”
Example:
- You have a list of the population of students in a class.
- You want to take a sample of 5 (n=5).
- If you take a simple random sample (SRS) from the class list, it means all
the different possible groups of 5 students you could pick from the list
have an equal chance of being the sample (group) you actually pick.
One Method of SRS
 Label every individual in the population with a unique number.
- Like a student ID number.
 Put all the student ID numbers in a place from which you can draw randomly
without looking (like a hat).
 Draw 5 IDs and use those students as your sample.
Another Method of SRS
 Generate a list of random numbers as long as the list of the population.
 Randomly assign these numbers to the individuals in the list.
 Take the first 5 numbers (whoever gets assigned 1 through 5).
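Both methods can be sketched in Python. This is a minimal illustration with a hypothetical class list of 20 student IDs. The second method is a variant of the steps above: it gives each student a random number and keeps the 5 students with the smallest numbers, which plays the same role as being assigned 1 through 5.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical class list (sampling frame) of 20 student IDs.
student_ids = [f"S{i:03d}" for i in range(1, 21)]
n = 5

# Method 1, the "hat": draw n IDs without replacement.
hat_sample = rng.sample(student_ids, n)

# Method 2, the electronic "hat": give every student a random number,
# sort by that number, and keep the n students who come first.
random_numbers = {sid: rng.random() for sid in student_ids}
ranked = sorted(student_ids, key=random_numbers.get)
electronic_sample = ranked[:n]

print("Hat sample:       ", hat_sample)
print("Electronic sample:", electronic_sample)
```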

SRS means Equal Chance of Being Selected


 First method: old-fashioned “hat”
 Second method: electronic “hat”
 In both methods, all members of the population had an equal probability of
being selected into the sample.
Limits of Simple Random Sampling
 You need a list.
- If you don’t know who will present at the Emergency Department that day,
how do you sample?
- SRS is fine when a list is available.
 You need a good list.
- Otherwise, you risk undercoverage.
- What if part-time students were not on the list?
- This is non-sampling error.
Simple Random Sampling
 Characteristics of SRS
 Two methods of randomly sampling from a list
 Limits of SRS.

Stratified Sampling
 First, the list is divided into groups, or strata.
 This is a way to make it so that there are certain proportions of groups in the
final sample.
 Next, simple random sampling (SRS) takes place for each of the strata.
Steps in stratified sampling
1. Divide entire population into distinct subgroups called strata.
2. The strata are based on a specific characteristic, such as age, income,
education level, and so on.
3. All members of a stratum share this specific characteristic.
4. Draw an SRS from each stratum.
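The steps above can be sketched as follows. The frame is hypothetical (10 made-up students in each of four grades), and drawing the same number from each stratum is just one possible choice.

```python
import random
from collections import defaultdict

rng = random.Random(3)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: (student, grade) pairs, 10 students per grade.
grades = ["freshman", "sophomore", "junior", "senior"]
frame = [(f"{g}-{i}", g) for g in grades for i in range(1, 11)]

# Steps 1-3: divide the population into strata that share one characteristic (grade).
strata = defaultdict(list)
for student, grade in frame:
    strata[grade].append(student)

# Step 4: draw an SRS of the same size from each stratum.
per_stratum = 3
stratified_sample = {g: rng.sample(members, per_stratum) for g, members in strata.items()}

for g, chosen in stratified_sample.items():
    print(g, chosen)
```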
Examples of stratified Sampling
 In a high school, sampling a set number of students from each of the grades
(freshman, sophomore, junior, senior)
 In hospitals, sampling a set number of patients or providers from each department
(different intensive care units)

Limitations of Stratified Sampling


 Oversampling one group means an unweighted summary statistic is skewed toward that group.
 It is not possible to do without a list beforehand (as with SRS).
 It is also harder because you have to split the list into groups (“strata”), then
take an SRS from each stratum.
Stratified Sampling
 Stratified = taken from groups.
 Several steps are involved.
 Useful if it is necessary to make all strata equal, or to sample from groups that are
small in the large population.

Systematic Sampling
 Systematic sampling can be done with or without a list.
 Systematic sampling is best described through the steps one takes to do it.
Steps in Systematic Sampling
1. Arrange all individuals of the population in a particular order.
2. Pick a random individual as a starting point.
3. Then take every kth member of the population into the sample.
- “kth” means “every so many”
Example of Systematic Sampling from a List
 Take out a list of classes available next semester.
 Pick a random number that is small – like 3. Go to the third class.
 Pick another random number – like 5. Pick every 5th class after that.
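A minimal sketch of this procedure, assuming a hypothetical ordered list of 40 classes. Here the interval k is fixed at 5 and only the starting position is drawn at random from the first k classes, which is one common way to set it up.

```python
import random

rng = random.Random(11)  # fixed seed so the sketch is repeatable

# Hypothetical ordered list of 40 classes available next semester.
classes = [f"Class {i}" for i in range(1, 41)]

k = 5                      # sampling interval: take every 5th class
start = rng.randrange(k)   # random starting position within the first k classes (0-based)

# Take the starting class, then every kth class after it.
systematic_sample = classes[start::k]

print(f"start position = {start + 1}, k = {k}")
print(systematic_sample)
```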

Characteristics of Systematic Sampling
 Avoid this when there is a repeating pattern in the order of the data (boy/girl/girl),
because every kth individual may then be the same type.
 You can do systematic sampling in a clinical setting, where you do not
know who is going to come in that day.
 Systematic sampling is easy to do with or without a list. Just pick a random
starting point, then pick every “kth” individual.

Cluster Sampling
 Why use cluster sampling when you could use stratified, systematic or
simple random sampling?
 Because the problem is in a particular geographic location.
Why use cluster sampling?
 The problem is localized to a particular location
 In cluster sampling, we begin by dividing the map into geographic areas.
 Then we randomly pick clusters, or areas, from the map. We take all the
people in the cluster.
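As a rough sketch, cluster sampling picks whole areas at random and then keeps everyone in them. The four areas and their residents below are made up for illustration.

```python
import random

rng = random.Random(5)  # fixed seed so the sketch is repeatable

# Hypothetical map divided into geographic areas (clusters), each with its residents.
clusters = {
    "North": ["N1", "N2", "N3"],
    "South": ["S1", "S2", "S3", "S4"],
    "East": ["E1", "E2"],
    "West": ["W1", "W2", "W3"],
}

# Randomly pick whole clusters, then take all the people who live in them.
chosen_areas = rng.sample(list(clusters), 2)
cluster_sample = [person for area in chosen_areas for person in clusters[area]]

print("Chosen areas:", chosen_areas)
print("Sample:", cluster_sample)
```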
Problems with Cluster Sampling
 Sometimes, the people located in a cluster are all similar in a way that makes
the problem hard to study.
 If cancer rates are high all over the clusters, it’s hard to see if a geographic
location is causing higher rates.
Cluster Sampling
 Cluster sampling is used when geography is important in sampling. The map is divided
into areas, and all the people in the selected areas are sampled. It is biased
toward the type of people living in those areas.
Convenience & Multi-Stage Sampling

Convenience Sampling
 Convenience sampling means using results or data that are conveniently or readily obtained.
 It can be used under low-risk circumstances.
- Example: Which ice cream is the best at the restaurant next to the hospital?
 However, the results are often not reliable.
 It can be useful if few resources are allocated to the study.
 It often uses an already-assembled group for surveys.
- Example: ask patients in the waiting room, or students in a class, to fill out a survey.

What are the problems with convenience sampling?


 There is a bias in every group.
 Often miss important subpopulations (what stratified sampling addresses).
 Results can be severely biased.

Multi-stage Sampling
 Combination of sampling strategies layered in stages.
 Example:
- Stage 1: Cluster sample of states (two census regions)
- Stage 2: Simple random sample of counties (from each state)
- Stage 3: Stratified sample of schools (urban/rural)
- Stage 4: Stratified sample of classrooms
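The layering can be sketched with just the first two stages. This is a simplified illustration, not the full four-stage design: the states and counties are placeholders, and the later school and classroom stages are left out.

```python
import random

rng = random.Random(9)  # fixed seed so the sketch is repeatable

# Hypothetical states and their counties (placeholder names).
counties_by_state = {
    "State A": ["A1", "A2", "A3", "A4"],
    "State B": ["B1", "B2", "B3", "B4"],
    "State C": ["C1", "C2", "C3", "C4"],
}

# Stage 1: randomly choose whole states (clusters).
chosen_states = rng.sample(list(counties_by_state), 2)

# Stage 2: simple random sample of counties within each chosen state.
chosen_counties = {s: rng.sample(counties_by_state[s], 2) for s in chosen_states}

print(chosen_counties)
```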

Convenience & Multi-Stage Sampling


 Avoid using convenience sampling unless the question is low risk.
 Use it if it is the only type of sampling possible under the circumstances.
 It is also used when resources are low.
 Multi-stage sampling is usually used in large, governmental studies.

Section 1.3
Introduction to Experimental Design

Learning Objectives:
At the end of this lecture, the student should be able to:
 State the steps of conducting a statistical study.
 Select one step of developing a statistical study and state the reason for this
step.
 Name one common mistake that can introduce bias into a survey and give
an example.
 Explain what a lurking variable is and give an example.
 Define what a completely randomized experiment is.
Introduction
 Steps to Conducting Statistical Study
 Basic Terms & Definitions
 Avoiding Bias in Survey Design
 Topics in Randomization
Basic Terms & Definitions
 Review the steps to conducting a statistical study.
 Define vocabulary terms.
 Examples provided from healthcare.
Basic Guidelines for Planning a Statistical Study
1. State a hypothesis.
2. Identify the individuals of interest.
3. Specify the variables to measure.
4. Determine if you will use the entire population or a sample.
- If you choose a sample, choose a sampling method.
5. Address ethical concerns before data collection.
6. Collect the data.
7. Use descriptive or inferential statistics to test your hypothesis.
8. Note any concerns about your data collection or analysis.
- Make recommendations for future studies.
Hypothesis & Variables
 Hypothesis: Air pollution causes asthma in children who live in urban
settings.
 Individuals: children in urban settings
 Variables: air pollution and asthma.
Sampling, Ethics & Data Collection
 Either collect data or use an existing dataset.
- Can use a government dataset for population measures.
 Can collect data from a sample for estimates.
- Need to choose a sampling approach.
- Will need consent if the study is legally found to be “human research.”
- May need consent from parents to collect data about children.
Census vs. Sample
 In a census, measurements or observations from the entire population are
used.
 In a sample, measurements or observations from part of the population
are used.
Experiment vs. Observational Study
Experiment
 A treatment or intervention is deliberately assigned to the individuals.
 The purpose is to study the effect of a treatment or intervention on the
variables measured.
Observational study
 Observations and measurements of individuals are taken.
 However, no treatment or intervention is assigned by the researcher.
