Applied Biostatistics 2020 - 01 Basics, Centrality and Dispersion

This document provides an overview of an elective module on applied biostatistics taught by Alexandr Parlesak from February 3rd to March 13th, 2020. The module consists of 72 hours of face-to-face learning, 84 hours of directed learning, and 84 hours of autonomous learning, totaling 10 ECTS points. Key topics that will be covered include basics, centrality, dispersion, probability, statistical inference, and how to perform statistical analyses and interpret results using the R software package. The overall objectives are for students to understand fundamental statistical principles and analyses and develop skills to evaluate quantitative data and research claims.

Global Nutrition & Health

Elective Module “Applied Biostatistics”

Teacher: Alexandr Parlesak

February 3rd – March 13th, 2020

University College
Copenhagen

72 h Face-to-face learning;
84 h Directed learning;
84 h Autonomous learning

10 ECTS Points
Applied Biostatistics 2020 A. Parlesak 1

Chapter 1: Basics, Centrality and Dispersion


Literature:
• Dalgaard, P.: Introductory Statistics with R. 2nd ed., 2008
• Urdan, T.C.: Statistics in Plain English. 3rd ed., Routledge, New York
• Kirkwood, B.R.; Sterne, J.A.C.: Essential Medical Statistics. 2nd ed., Blackwell, Malden, MA

About Me

• I am not a formally trained theoretical statistician.
• My competence in statistics developed from the need to evaluate more than 150 clinical/experimental studies, more than 45 of them published. The urge to understand statistical evaluation made me close the theoretical gaps.
• Accordingly, the focus of this course will be on the practical application of statistical procedures.
• Nevertheless, the most important pieces of statistical theory will be built into the teaching so that you will have a basic understanding of the principles of statistical testing.
Why Should You be Here?

• You have an interest in statistical reasoning.
• You have a desire to learn to use statistics properly in data analysis.
• You want to evaluate the (quantitative) data of your bachelor thesis with your own skills.
• You want to develop your ability to critically assess scientific arguments and reveal pseudo-scientific abuse of statistics.
• You want to have a common basis of communication with professional statisticians when bringing up problems of your work (statistical literacy).

Why Do Statistics at All?
When searching for e.g. clinical studies, the abstract may look like this:
[Figure: example abstract of a clinical study]

• In quantitative science, statements on differences, correlations, etc. ALWAYS need to be backed up by an appropriate statistical test.
• This test usually gives you key indicators such as the effect size and the probability that a finding like this would occur by chance alone.
• There is NO evidence without statistical proof.
Objectives

• Understand the fundamental principles of statistical inference.
• Understand the general principles underlying the most common tests.
• Know the assumptions of common tests and understand the impact of violations.
• Be able to perform standard statistical analyses with “R”.
• Develop skills to extrapolate bivariate tests to multivariate formats.
Competencies You Will Have After the Course (1)

1. Describe the roles biostatistics serves in the discipline of public health.
2. Distinguish among the different measurement scales (e.g., categorical, ordinal and interval) and the implications of these distinctions for the selection of statistical methods.
3. Apply descriptive techniques commonly used to summarize public health data, including data display (tables and figures) and measures of distribution shape, central tendency, variability, correlation, and risk assessment.
4. Understand key concepts of probability, random variation, and commonly used statistical probability distributions such as “normal” and “F” that shape the practice of biostatistics.
Competencies You Will Have After the Course (2)

5. Apply common statistical methods for inference, including estimation, confidence intervals, and hypothesis testing.
6. Specify preferred methodological alternatives to commonly used statistical methods when assumptions are not met.
7. Apply descriptive and inferential methodologies (consisting of: sample selection, hypothesis development and testing, decision errors, power, and sample size) according to the type of study design (e.g., cross-sectional) for answering a particular research question.
8. Interpret results of statistical analyses found in public health studies, including assessing the assumptions, quality of data (objectivity, reliability, and validity), appropriateness of statistical methods, and validity and utility of conclusions.
Questions About This Class
(You might want to ask – or not)

• Is this class going to be hard?
- No. The concepts are easy and the procedures are clear.
• Why do we spend time on theoretical stuff?
- It is helpful for understanding the applications and their potential.
• Do we need to know all the stuff?
- You may not need all of it, but be prepared.

“It was my understanding that there would be no math”
- Chevy Chase, ‘Spies Like Us’
Class Preparation

• Read the appropriate chapter(s) in the books (as given in the corresponding Wiki) and on Intrapol beforehand, and bring questions to class.
• There is no such thing as a stupid question!
• If you’ve got a question, ask it immediately!
Is Biostatistics Hard to Study?

Factors that make it hard for some students to learn biostatistics:
• The terminology is deceptive. You have to understand that the meaning of statistical terms such as significant, error and hypothesis is distinct from the ordinary use of these words.
• Statistics requires mastering abstract concepts. Theoretical concepts such as skewness of populations, probability distributions, and null hypotheses are not easy to handle.
• There are always 2 sides of learning: the rational (knowledge) and the emotional (self-confidence, motivation). Not understanding these abstract concepts might drive you into feelings of despair (“I will never understand this, so why should I go on?”).
Is Biostatistics Hard to Study? NO!

Reasons why you should be optimistic about your future statistical skills:
• The derivation of most statistical tests involves difficult math. However, you can learn to use statistical tests and interpret the results even if you do not fully understand how they work. You only need to know enough about how a tool works to use it in appropriate situations. The type of data determines the right test.
• Basically, you can calculate statistical tests and interpret results even if you don’t understand how the equations were derived, as long as you know how and when to use the statistical tests appropriately.
Why Choose “R” as Software for Statistical Evaluations?

o Most statistical software packages are expensive. R is one of the most widely accepted open-source programs with built-in statistical procedures.
o Key advantages:
- Freely available (http://www.r-project.org/)
- Obtainable from every computer connected to the internet, all over the world
- High creative potential
- Constantly updated with routine commands for modern statistical applications
o Key disadvantage: computer-language-like commands that need to be typed in (no “simple click” solutions)

Principles of Statistics are
Independent of the Software Applied

o When working in (quantitative) science, you will be faced with specific questions on differences, correlations, prevalence, etc.
o For each of these questions, a specific statistical test is available, which usually has a name (frequently after its inventor, such as Spearman, Kendall, Kolmogorov-Smirnov, etc.).
o Independently of the software applied, both the statistical procedures and their names are identical.
o Having learned standard procedures of statistical testing in one program (here: R), you will easily find your way through other software packages (if you can afford them).

ACTIVITY 1:

1. Install “R” ([R]) on your computer (http://cran.r-project.org/bin/windows/base/).
2. Install “RStudio” on your computer (http://www.rstudio.com/products/rstudio/download/).
3. Create a shortcut for “RStudio” and check whether the programs are running.
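One possible way to check that the installation works is to type two commands into the R console. This is only a minimal sketch; the exact version string depends on the release you installed.

```r
# Print the version of the R installation that is currently running
R.version.string

# A trivial calculation to confirm that commands are evaluated
1 + 1
```

If both lines return a result without an error message, R is running.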
What Statistics Can and Can’t Do

Can:
• Deliver a basis for hypothesis generation (explorative research; help to detect patterns in messy data)
• Provide objective criteria for evaluating hypotheses and raising arguments
• Condense cluttered information (not without information loss, so keep your raw data!)
• Translate data into statements (e.g. “Increased intake of folic acid significantly reduces incidence of spina bifida.”)
• Help you critically evaluate arguments of others

Can’t:
• Tell the truth (probabilistic conclusions only!)
• Compensate for poor design (Sir Ronald Fisher)
• Indicate biological/social significance: statistical significance does not mean biological/social significance, nor vice versa!

Some Opinions on Statistics

“There are three types of lies: lies, damn lies, and statistics!”
Benjamin Disraeli, former British Prime Minister (source: Mark Twain)

“If your experiment needs statistics, you should have done a better experiment.”
Ernest Rutherford, physicist, Nobel prize winner

“To call in a statistician after the experiment is done may be no more than asking him to perform a postmortem examination; he may be able to say what the experiment died of.”
Sir Ronald Fisher, inventor of ANOVA
Biostatistics vs. Statistics

• The tools of statistics are employed in many fields, e.g. economics, business, education, and psychology.
• When the data being analyzed are derived from life science, we use the term biostatistics (also applying to social life) to distinguish this particular application of statistical tools and concepts.
• In the current course, most of the statistical procedures will refer to health issues and nutrition. However, you will also learn how to extrapolate your knowledge to other fields of science (e.g. quantitative social science).

Biostatistics (cont.)

“A habit of basing convictions upon evidence, and of giving to them only that degree of certainty which the evidence warrants, would, if it became general, cure most of the ills from which the world suffers.”
Bertrand Russell - British philosopher, mathematician, social reformer

All biostatistics begins with description. Before you do anything else, you look at the data and summarize the data.
Hence, our first goal:
- to show how to get a first look at the data and
- to get ready to do more elaborate procedures.
A statistic is just a numerical summary of the data, like the largest number in the data set or its average (mean).
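As a minimal sketch of what “a statistic” means in practice, the two summaries named above can be computed in R for a small illustrative vector of ages. The variable names `largest` and `average` are chosen here for illustration only.

```r
# Illustrative ages (in years) of nine persons
age <- c(18, 32, 56, 23, 32, 27, 40, 38, 23)

largest <- max(age)    # statistic 1: the largest value in the data set
average <- mean(age)   # statistic 2: the arithmetic mean

largest
average
```

Each call reduces the nine raw values to a single number, which is all that is meant by a statistic here.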
The Framework We (and ALL Other
Scientists) Will Stay In: INFERENCE
Hypothesis formulation and data collection

In (bio)statistics, you can make a GENERAL statement for the population ONLY if you infer this statement from a sample that is (more or less) representative of the population. This happens in the following steps:
• From qualitative science (e.g. interviews with focus groups) you form a hypothesis (e.g. you note that overweight persons describing their daily life quite frequently mention “watching TV” => hypothesis: “Prolonged TV watching is associated with overweight.”).
• Then you collect data from a SAMPLE to test this hypothesis.
• By appropriate data evaluation, the hypothesis is ACCEPTED or REJECTED.
• Based on the outcome of the statistical evaluation, you predict with a certain PROBABILITY a general rule for the POPULATION.
Hence: without statistical evidence, NO generalized STATEMENT, NO general RULE, NO general RECOMMENDATION.
Sample and Population

• Population: all items/persons that have something in common (e.g. disease, sex, education).
• Sample: collection of observations taken from the study population.
• Representative sample: part of the population with characteristics as close as possible to those of the population.
• Random sample: each element of the population has some chance of being selected.
• Simple random sample: the chance of being selected is the same for each element of the population.
[Figure: population and sample, linked by estimation and prediction]
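The definition of a simple random sample can be sketched in R with `sample()`. The population of 1000 ID numbers, the sample size of 50 and the seed are assumptions made only for this illustration.

```r
# Hypothetical study population: ID numbers 1 to 1000
population <- 1:1000

set.seed(42)  # arbitrary seed so that the draw can be repeated

# Simple random sample: every ID has the same chance of being selected,
# and no ID can be drawn twice (replace = FALSE)
srs <- sample(population, size = 50, replace = FALSE)

length(srs)
head(srs)
```

Whether such a sample is also representative depends on how well the list of IDs covers the real population, which the code cannot check.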
GROUP ACTIVITY 1:
Qualitative and Quantitative Science
In groups of 2-3,
1. Formulate definitions of “Qualitative Science” and “Quantitative Science” so that you can explain the main goals and the difference in an oral exam;
2. State whether the hypothesis is part of qualitative or quantitative science;
3. Explain why qualitative science is useless without quantitative science and why quantitative science is useless without qualitative science;
4. Explicate whether you need statistics in qualitative science. If so, which one?
5. Indicate how results from quantitative science foster new qualitative science approaches.
Introductory Concepts –
Descriptive Techniques
Chapter 1:
• Types of Data
Sample, population, variable,
scales of measurement
• Descriptive Measures
Centrality, dispersion, skewness
Chapter 2:
• Probability and Distributions
• Estimation Techniques
Confidence intervals
Types of Data Dependencies

• If you measure the change of a dependent variable (outcome) on a single independent variable (exposure) only, you apply univariate statistics.
• If you measure 2 exposures, you follow a bivariate experimental design.
• And if you have multiple exposures, it is called a multivariate design.

Variables, Data Sets and Parameters
            Age [y]   Sex   Years of education   …   Variable m
Subject 1   21        m     7.8                  …   x1j
Subject 2   34        f     8.8                  …   x2j
Subject 3   18        f     7.3                  …   x3j
Subject 4   37        m     9.8                  …   x4j
…           …         …     …                    …   …
Subject n   xi1       xi2   xi3                  …   xij

• Variable: an attribute that varies from one element of the sample to the next (e.g. weight of preschool children, sex of bosses, education of parents, income of teachers, etc.).
• Data set (= data frame in R): each column is headed by the variable name and contains in each cell the value of this variable measured in that particular case. The position of the value in the data set is crucial for correct evaluation. The data set without its headers is a matrix.
• Parameter: descriptive measure of a variable based on probability distributions, derived from a population, estimated from the sample. For example, the sample mean ($\bar{x}$) can be used as an estimate of the mean parameter ($\mu$) of the population from which the sample was drawn.
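The first four rows of the table above can be entered as a data frame in R. This is a sketch; the object name `subjects` and the column names are chosen here for illustration.

```r
# One column per variable, one row per subject
subjects <- data.frame(
  age             = c(21, 34, 18, 37),        # years
  sex             = c("m", "f", "f", "m"),
  education_years = c(7.8, 8.8, 7.3, 9.8)
)

str(subjects)        # shows the variable names and their data types

# The sample mean of 'age' serves as an estimate of the population mean
mean(subjects$age)
```

Each variable is addressed by its column name (`subjects$age`), which is why a value placed in the wrong column would silently distort the evaluation.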
Principal Steps in a
Statistical Analysis Process

There are basically three steps in the statistical analysis process, starting once you have your data.

1. Data description
(averages, standard deviation, distribution, etc.)
2. Data analysis
(significance of differences, correlations, etc.)
3. Data interpretation
(meaning of the differences/correlations found or not found; linking statistics to the hypothesis).

Step I: Data Description

The goal of data description is to describe the empirical frequency distribution of the variable.

The manner of the description will depend on the class of the variable.

Scales of Measurement

• Scales of measurement: different types of variables.
• The scale of measurement [of the investigated/interacting variable(s)] defines the rules for the further statistical procedure.
• They are commonly broken down into four types (mnemonic: NOIR):
– Nominal (categorical) N
– Ordinal (categorical) O
– Interval (numerical) I
– Ratio (numerical) R
Categorical Nominal Variables

• A nominal variable is measured on a categorical scale: its values can be placed in categories that do NOT possess a natural hierarchy.
• It is characterized by its incapability to be ordered or measured on a continuous scale.
Examples:
• Place (e.g. city, country)
• Disease type (lung cancer, cervix cancer, colon cancer, ...)
• Ethnicity (Hispanic, Caucasian, ...)
• Sex (male/female) (dichotomous)
• Smoker (yes/no) (dichotomous)
Categorical Ordinal Variables
• An ordinal variable is measured on a categorical scale: its values can be placed in categories that DO possess a natural hierarchy (inherent order).
• It is characterized by its capability to be ordered but NOT to be measured on a continuous scale.
Examples:
• Severity of disease (healthy, stages 0, I, II, III, IV)
• Weight status (underweight, normal, overweight, obesity)
• Pain level (none, mild, moderate, severe, unbearable)
• Likert scale levels (strongly agree, agree, indifferent, disagree, strongly disagree)
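In R, both categorical scales are commonly stored as factors, and the difference between nominal and ordinal is expressed by the `ordered` argument. The following is a sketch with made-up observations.

```r
# Nominal variable: categories without a natural order
smoker <- factor(c("yes", "no", "no", "yes"))

# Ordinal variable: categories with an inherent order, given by 'levels'
pain <- factor(
  c("mild", "none", "severe", "moderate"),
  levels  = c("none", "mild", "moderate", "severe", "unbearable"),
  ordered = TRUE
)

is.ordered(smoker)   # FALSE
is.ordered(pain)     # TRUE

# "Less than" comparisons are only defined for the ordered factor
pain < "severe"
```

Declaring the order explicitly matters because R would otherwise sort the categories alphabetically.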
Numerical Interval Variables
• An interval variable is measured on a numerical scale; its values can be placed in categories that DO possess a natural hierarchy (inherent order).
• It is characterized by its capability to be ordered; WITHIN the intervals, the variable can be measured on a continuous scale.
• Note: single intervals may be missing, and the width of the intervals may differ.
Examples:
• Age group (0-0.99; 1-5.99; 12-18; …)
• Weight range (14-18.49; 18.5-24.99; 25-29.99; 30-34.99; 35+)
• Yearly income level, US$ (<12 999, 13 000-20 000, …)
Numerical Ratio Variables (1)

• A ratio variable is measured on a numerical scale (quantitative observation).
• It is characterized by its direct comparability and has a “zero” value; the values can be added, subtracted, multiplied and divided. It can take a limited number of (discrete) values (in R: integer) or an infinite number of values between any two other values (continuous).
Examples:
• Weight (12.3 kg, 14.6 kg, …)
• Distance (14.6 km, 23.5 km, …)
• Energy content of food (2043.5 kcal, …)
• Number of admissions (21, 136, 13 453, …)
Numerical Ratio Variables (2)
• There are ratio variables that do not meet the criterion of having a “true zero” value (e.g. temperature in degrees Celsius) but which are treated in statistics as ratios anyway.
• In the literature, these values are also frequently called “interval variables”.
Examples:
• BMI (12.3 kg/m2, 14.6 kg/m2, …)
• Time (1.2 min, 23.5 h, …)
• Temperature (Kelvin) (0 K (= -273.15 °C), 298 K, …)
Levels of Measurement
• Higher level variables (ratio variables) can always be expressed as lower level variables (other types), but this never works the other way around.
• Lowering the level of variables is always associated with RELEVANT INFORMATION LOSS.
• Therefore, in any study, you should always record data at the highest possible level (ratios).

Variable type   Variable                                                Unit
Ratio           Weight, height                                          kg, m
Interval        BMI                                                     kg/m2
Ordinal         Being underweight, normal weight, overweight or obese   none
Nominal         Overweight or not                                       none

Conversion downwards in this table (from ratio towards nominal): YES. Conversion upwards: NO.
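The downward conversion in the table can be sketched in R. The weight and height values are illustrative, and the cut points 18.5, 25 and 30 kg/m2 are taken from the weight ranges listed earlier; treating them as category limits is an assumption of this example.

```r
# Ratio level: measured weight and height
weight <- c(60, 72, 57, 90, 95, 72)               # kg
height <- c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91)   # m

# Derived BMI
bmi <- weight / height^2                          # kg/m2

# Ordinal level: four ordered weight-status categories
status <- cut(bmi,
              breaks = c(-Inf, 18.5, 25, 30, Inf),
              labels = c("underweight", "normal", "overweight", "obese"),
              right = FALSE, ordered_result = TRUE)

# Nominal (dichotomous) level: overweight or not
overweight <- bmi >= 25

data.frame(weight, height, bmi = round(bmi, 1), status, overweight)
```

Going from `weight` and `height` to `overweight` is a one-line calculation, but the original weights cannot be recovered from `overweight` alone, which is the information loss the slide warns about.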
Criteria of a Satisfactory Scale

A satisfactory scale meets the following requirements:

• Appropriate
• Practicable
• Powerful
• Clearly defined
• Sufficient number of categories
• Collectively exhaustive
• Mutually exclusive


Appropriateness of a Satisfactory Scale

Keep in mind the conceptual definition of the variable and the objectives of the study. Ask yourself: does the variable represent the parameter necessary to answer the research question?
Occupations, for instance, may be classified in different ways, depending on whether the purpose is to use them as a measure of
• social class,
• habitual physical activity,
• exposure to specific physical and chemical hazards.

Practicability of a Satisfactory Scale

The practicability of a satisfactory scale is linked to the methods that will be used in collecting the data.
• E.g. in a food frequency questionnaire, asking for an estimate of calorie intake in kJ might be unacceptable because the study participants are not familiar with this unit. Solution: provide pictures with standardized portion sizes.
• Usually, high precision is linked to high effort. Always ask yourself: is it worth it? (E.g. detailed data on income, or broad income categories? Weighing in tenths of a kilogram, or in whole kilograms?)
Powerfulness of a Satisfactory Scale

The powerfulness of a satisfactory scale is linked to the type of scale:
If there is a choice,
• an ordinal scale should be used rather than a nominal one
• and a numerical scale rather than an ordinal one.

E.g., an analysis using the whole spectrum of weight is more informative than one using only two categories, such as “below 70 kg” and “more than 70 kg”.
Clarity of Definition of a Satisfactory Scale

Operational definitions are obligatory for variables and categories.
• Nominal and ordinal scales
E.g. cases of malnutrition: “present” or “absent” depends on the applied definition!
• Numerical measurements
- Number of decimal places to be used: realistic, e.g. body weight to ± 0.1 kg, NOT to ± 0.1 g
- Your calculated value has ONLY the precision of the least precise measurement used, e.g. body mass: 67 kg, height: 178.8 cm => BMI = 21 and NOT 20.96 or 21.0
- Rule: calculate with one digit more than your measurement allows, but report the final result with the precision of the least precise measurement (here: weight in whole kg)
- Rounding: values can be rounded down, rounded up, or rounded to the nearest number (preferred)
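The BMI example can be reproduced in R. This sketch assumes that “precision” is counted in significant digits, so `signif()` is used for the final result; the variable names are illustrative.

```r
body_mass <- 67      # kg, measured in whole kilograms (2 significant digits)
height    <- 1.788   # m, i.e. 178.8 cm (4 significant digits)

bmi_raw <- body_mass / height^2   # full calculator precision

round(bmi_raw, 2)    # 20.96: more digits than the measurements justify
signif(bmi_raw, 2)   # 21: limited by the 2 significant digits of body mass
```

The calculation itself is carried out at full precision; only the reported value is cut back to the precision of the least precise input.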
Recap Exercise: Clarity of Definition of a
Satisfactory Scale

If the concept of PRECISION INDICATION of RESULTS does not seem familiar to you, please go through the exercise sheets “Precision and significant digits 1” and “Precision and significant digits 2” in the shared folders on Intrapol.

In the final exam, points will be deducted for every result that has been indicated with the wrong number of digits!

Sufficiency of Categories of a Satisfactory Scale

Avoid compression of data into too few categories, as this may lead to a loss of useful information!
E.g., if immigrants from North Africa have a particularly high rate of mortality from a disease, this fact may be completely masked if they are included in a broader category (immigrants).
Hence: collect data in a detailed form and decide later whether to use the full scale, a “collapsed” one, or both.

Collective Exhaustiveness of a Satisfactory Scale

Provide sufficient categories to place every subject. This may necessitate the inclusion of one or more of the following categories: “other” or “not applicable”.
E.g. duration of marriage (also covering the case “not married”):

“under 5 years”,
“5-9.9 years”,
“10-19.9 years”,
“20-29.9 years”,
“30-39.9 years”,
“40 years and more”,
“not applicable”.

Mutual Exclusiveness of a Satisfactory Scale

• Each item of information fits in only one place along the scale. E.g., an age scale including both “70 to 80” and “80 to 90” is unacceptable, as “80” fits into either of these categories.
• A scale for measuring the conditions producing disability that includes the categories “blindness” and “deafness” needs combined categories so that every subject fits exactly one of them:
“not blind, not deaf”,
“blind, not deaf”,
“deaf, not blind”,
“blind and deaf”.
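One way to obtain age classes in which the value 80 fits only one category is `cut()` with half-open intervals. This is a sketch with four made-up ages.

```r
age <- c(70, 79.9, 80, 85)

# right = FALSE gives intervals [70, 80) and [80, 90):
# the lower limit is included, the upper limit is excluded
age_group <- cut(age, breaks = c(70, 80, 90), right = FALSE)

table(age_group)
```

With this definition, a person aged exactly 80 is counted in the second class only.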

Description of Qualitative Variables

The empirical frequency distribution of a nominal or an ordinal variable is usually summarized as a list of:
• Frequencies (counts)
• Proportions
• Percentages

These numbers are also used in the description of incidence and prevalence (epidemiology).
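The three summaries can be obtained in R with `table()` and `prop.table()`. The eight observations below are invented for illustration.

```r
# Hypothetical weight-status observations of eight persons
status <- c("normal", "overweight", "normal", "obese", "normal",
            "underweight", "overweight", "normal")

counts      <- table(status)         # frequencies (counts)
proportions <- prop.table(counts)    # proportions (sum to 1)
percentages <- 100 * proportions     # percentages (sum to 100)

counts
round(percentages, 1)
```

Proportions and percentages carry the same information; which one to report is a matter of convention.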

Descriptive Statistics (DS)

• A way to summarize data from one or more samples or populations.
• Reducing the data of the sample to a small number of summary measures is called statistics.
• DS illustrate the central tendency, variability, and shape of one or more sets of data.
• DS should be clear and easily interpreted. They should not mislead you about the data they are summarizing and should retain sufficient information to allow characterization.
Summations
$$\sum_{i=1}^{5} x_i = x_1 + x_2 + x_3 + x_4 + x_5 \qquad \sum_{i=1}^{n} x_i = x_1 + x_2 + x_3 + \dots + x_n$$

$$\sum_{i=1}^{3} x_i^2 = x_1^2 + x_2^2 + x_3^2$$

$$\sum_{i=1}^{n} c\,x_i = c x_1 + c x_2 + c x_3 + \dots + c x_n = c \sum_{i=1}^{n} x_i$$

Sum of the numbers 1 to 50, added in pairs:
1 + 50 = 51
2 + 49 = 51
3 + 48 = 51
…
25 + 26 = 51
25 * 51 = 1275

Hence:
$$\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$$
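The summation rules above can be checked numerically in R. This is a sketch; the vector `x` and the constant `c_const` are arbitrary illustrative values.

```r
n <- 50
sum(1:n)           # adds 1 + 2 + ... + 50
n * (n + 1) / 2    # closed form; gives the same value

# A constant factor can be moved in front of the sum
x <- c(2, 4, 6)
c_const <- 3
sum(c_const * x) == c_const * sum(x)
```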

Descriptive Measures

After we have recorded our data, we want to be able to characterize them quantitatively:

Measures of Central Tendency (Measures of Location) give you a POINT estimate
(Arithmetic) Mean, Median, Mode
Measures of Variability give you a RANGE estimate
Range, Variance, Standard Deviation
Measures of Relative Standing
Z-Scores, Percentiles, Quartiles
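Most of these measures have a built-in R function. The overview below is a sketch on an illustrative age vector; the mode is left out because base R has no built-in function for the statistical mode.

```r
age <- c(18, 32, 56, 23, 32, 27, 40, 38, 23)

# Measures of central tendency
mean(age)
median(age)

# Measures of variability
range(age)   # smallest and largest value
var(age)     # sample variance
sd(age)      # sample standard deviation

# Measures of relative standing
quantile(age, probs = c(0.25, 0.5, 0.75))   # quartiles
(age - mean(age)) / sd(age)                 # z-scores
```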

Vectors and Variables

In statistics, a vector is a one-dimensional array:

(18, 32, 56, 23, 32, 27, 40, 38, 23)

Usually, such a vector represents a variable (e.g. age, body mass, marital status, sex, etc.).

The vector has a name (variable name) and occasionally a unit (if ordinal/continuous, e.g. kg, years, m, etc.).

(Arithmetic) Mean – The ”Average”
• Suppose we have n measurements (x1, x2, x3, …, xn) of a particular parameter in a sample. We denote these n measurements as a vector and can calculate the average (mean).
• Definition of the arithmetic mean:
$$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{\sum X}{n}$$
• More accurately called the arithmetic mean, it is defined as the sum of the observed values divided by the number of observations.
• We can use the arithmetic mean of the empirical frequency distribution to estimate the mean value of the variable in the study population.
• [R: mean(vector)]
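The definition can be followed step by step in R and compared with the built-in `mean()`. This is a sketch; `manual_mean` is a name chosen here for illustration.

```r
age <- c(18, 32, 56, 23, 32, 27, 40, 38, 23)

n <- length(age)               # number of observations
manual_mean <- sum(age) / n    # sum of the values divided by n

manual_mean
mean(age)                      # built-in function, same result
```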
[R] Exercise: Data Type

• Open RStudio.
• In the Console, type each of the following lines, followed by ENTER:
> typeof(3.4)
> typeof(TRUE)
> typeof("Sky")
> typeof("Bambi")
> typeof(cars)
> list(cars)

[R] Exercise: Calculation of the
Arithmetic Mean (Average)
• Open RStudio.
• Type
> age<-c(18,32,56,23,32,27,40,38,23)
and press ENTER (‘c’ stands for concatenate and combines the single values into one vector; ‘<-’ assigns the vector on its right to the name on its left).
• Type
> age
and press ENTER.
• Type
> mean(age)
and press ENTER. What is the correct value (precision) for the mean of ‘age’?
• Try the same with the vectors
> weight<-c(60,72,57,90,95,72)
> height<-c(1.75,1.80,1.65,1.90,1.74,1.91)

[R] Exercise: Mathematical Operations
with Values and Vectors
• In [R], you can perform any mathematical operation
such as
> 2+5 > 3-9
> 8*7 > 372/56
> log(1000,10) (Note: with log, the argument comes first
and then the base. Omitting the base automatically implies
Euler’s number (2.71828…) as the base.)
• Each mathematical operation is completed with ENTER.
The return is the result.
• Accordingly, you can perform mathematical operations
with vectors (if they have the same length). Try
> BMI<-weight/height^2 ENTER. How do you interpret
the resulting vector?
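The vector operation can be sketched as follows, assuming the weights are in kg and the heights in m (the slide does not state the units). The result is one BMI value per person, computed element by element.

```r
weight <- c(60, 72, 57, 90, 95, 72)              # assumed unit: kg
height <- c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91)  # assumed unit: m

# Element-wise: each weight is divided by the square of the matching height
BMI <- weight / height^2
round(BMI, 1)    # 19.6 22.2 20.9 24.9 31.4 19.7

# log() takes the argument first, then the base; the default base is e
log(1000, 10)    # 3
log(exp(1))      # 1
```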


The Arithmetic Mean – Sensitive to Outliers

Sample: $\hat{\mu} = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is an estimate of µ (population)

The (arithmetic) mean is extremely sensitive to outliers:

E.g. let n=3; x1=6, x2=7, x3=5:
$$\bar{x} = \frac{1}{3}\sum_{i=1}^{3} x_i = \frac{1}{3}(6 + 7 + 5) = \frac{1}{3} \cdot 18 = 6$$
E.g. let n=4; x1=1, x2=6, x3=2, x4=91:
$$\bar{x} = \frac{1}{4}\sum_{i=1}^{4} x_i = \frac{1}{4}(1 + 6 + 2 + 91) = 25$$
[R: sum(vector); length(vector); mean(vector)]
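The two worked examples can be reproduced with the three functions named above. This is a minimal sketch; the vector names `x_small` and `x_outlier` are chosen here for illustration.

```r
x_small   <- c(6, 7, 5)
x_outlier <- c(1, 6, 2, 91)

# The mean is the sum of the values divided by their count
manual_small   <- sum(x_small) / length(x_small)       # 18 / 3 = 6
manual_outlier <- sum(x_outlier) / length(x_outlier)   # 100 / 4 = 25

# mean() gives the same results
mean(x_small)     # 6
mean(x_outlier)   # 25

# Without the single extreme value 91, the mean of the remaining values is 3
mean(x_outlier[x_outlier != 91])   # 3
```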
Median –The “Middle Value” (1)
Frequently used if there are single extreme values in a distribution
Definition
Value that divides the ‘ordered array’ into two equal parts
If an odd number of observations, the median will be the
(n+1)/2th observation
E.g., the median of 11 observations is the 6th observation
In the case of an even number of observations, the median is the
midpoint between the two central values.
E.g., the median of 12 observations is the midpoint (average)
between the 6th and 7th observation
[R: median(vector)]


Median –The “Middle Value” (2)

• Suppose there are n observations

• Order data from smallest to largest observation
• The sample median is the observation in the center of the
ordered observations:
1. The observation in the center, i.e. the (n+1)/2th largest,
for n odd
2. The average of the two most central observations, i.e. the
(n/2)th and (n/2 + 1)th largest, for n even

[Figure: 11 values: the median is the 6th value. 12 values: the
median is the average of the 6th and the 7th value.]


Median –The “Middle Value”: Examples

Example 1: odd number of observations:

Data: 3, 8, 15, 1, 9, 4, 6, 20, 14 n=9
Ordered: 1, 3, 4, 6, 8, 9, 14, 15, 20
median is (n+1)/2th largest value, (n+1)/2 = 5
Md (median) = 8

Example 2: even number of observations:

Data: 3, 8, 15, 1, 9, 4, 6, 20, 14, 10 n=10
Ordered: 1, 3, 4, 6, 8, 9, 10, 14, 15, 20
median is the average of the (n/2)th and (n/2+1)th largest values,
n/2 = 5, n/2+1 = 6
Md = (8+9)/2 = 8.5
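The rank rule used in both examples can be written out as a short function. `median_by_rank` is a hypothetical helper introduced only to mirror the rule; in practice the built-in median() does the same job.

```r
# Hypothetical helper following the rank rule; median() is the built-in equivalent
median_by_rank <- function(x) {
  s <- sort(x)                        # ordered array
  n <- length(s)
  if (n %% 2 == 1) {
    s[(n + 1) / 2]                    # odd n: the central observation
  } else {
    (s[n / 2] + s[n / 2 + 1]) / 2     # even n: average of the two central ones
  }
}

odd_data  <- c(3, 8, 15, 1, 9, 4, 6, 20, 14)
even_data <- c(odd_data, 10)

median_by_rank(odd_data)    # 8
median_by_rank(even_data)   # 8.5
median(odd_data)            # 8
median(even_data)           # 8.5
```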


[R] Exercise: Calculation of the
Median

• Type
> median(age), median(weight), median(height), and
median(BMI) and compare these values with the mean
values of the corresponding vectors.

• How do you explain the difference between the mean and
the median?

• Going by your gut feeling: does the mean or the median
represent the central value of each vector better?


Choosing Between Mean and Median
Robustness: The robustness of a statistic is
related to the statistic’s resistance to being affected
by extreme values. The arithmetic mean is a non-robust
statistic while the median is a robust
statistic. If the empirical distribution is skewed or
extreme values are present, the median will
provide a better measure of central location than
the arithmetic mean.
Strength of median: insensitive to outliers
Weakness of median: completely guided by central
values.
[Figure: example data with the arithmetic mean and the median marked]
Summarizing Capability: The arithmetic mean is
a more appropriate statistic if the data can be
described by a particular mathematical model such
as the normal (Gaussian) distribution.
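Robustness can be made concrete with a small experiment. The sketch below reuses the age vector from the earlier exercise and assumes a hypothetical data-entry error (56 typed as 560); the names `age_err`, `shift_mean` and `shift_median` are illustrative.

```r
age <- c(18, 32, 56, 23, 32, 27, 40, 38, 23)

# Hypothetical data-entry error: 56 typed as 560
age_err <- replace(age, age == 56, 560)

# The mean moves by 504 / 9 = 56 years, the median does not move at all
shift_mean   <- mean(age_err) - mean(age)       # 56
shift_median <- median(age_err) - median(age)   # 0
print(c(shift_mean = shift_mean, shift_median = shift_median))
```

A single extreme value is enough to shift the mean far away from every real observation, while the median stays at the central value of 32.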
Mode – The “Most Frequent Observation(s)”
Definition
Value that occurs most frequently in data set

Example:
Question to pupils in a class:
“How often do you visit a doctor each year?”

Mode (Mo): “4”

• If all values are different, there is no mode
• There may be more than one mode (bimodal, multimodal)
• The mode is a poor measure of central tendency
• Not used frequently in practice

[Figures: bar chart of the answers with the mode marked; density
curves of a bimodal and a trimodal distribution]

[R: names(sort(-table(vector)))[1]]
[R] Exercise: Calculation of the Mode

• A physician wants to know what the most frequent yearly

number of customer visits is. For this purpose, she
recorded the number of visits for her 19 customers:
> docfreq<-c(2,3,4,3,4,3,5,3,4,2,3,5,4,3,7,6,3,4,5)
> names(sort(-table(docfreq)))[1]

• How do you interpret the result?

• Type > mode(docfreq) and interpret the result.
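One way to work through this exercise is sketched below. The variable names `counts` and `stat_mode` are introduced here; the tie-aware selection is an alternative to the one-liner above, not the slide's own method.

```r
docfreq <- c(2, 3, 4, 3, 4, 3, 5, 3, 4, 2, 3, 5, 4, 3, 7, 6, 3, 4, 5)

counts <- table(docfreq)   # frequency of each distinct value
counts

# All values that reach the highest frequency (also works for bimodal data)
stat_mode <- as.numeric(names(counts)[counts == max(counts)])
stat_mode         # 3 (it occurs 7 times)

# mode() is unrelated to the statistical mode: it reports how R stores the object
mode(docfreq)     # "numeric"
```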


Statistics Derived from Sample: Percentiles (1)

• The Pth percentile of a sample of n observations
is the value of the variable that has ordered rank
(P/100)(1+n). As an example, for P=20 and n=100,
(20/100)(1+100) = 20.2. So, approximately, the 20th smallest
value of the variable is the value at the 20th percentile.

• The 50th percentile of a sample of n observations

is referred to as the sample median.

• We can use the median of the empirical frequency

distribution to estimate the median value of the
variable in the study population.
[R: quantile(vector, percentile_value)]
percentile_value: 0 ≤ value ≤ 1, e.g. 0.25 for the upper
limit of the lowest quartile
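R's quantile() supports several interpolation rules, so its default result need not match the rank rule (P/100)(1+n) exactly. The sketch below uses illustrative data (the integers 1 to 100, so each value equals its rank) to compare `type = 6`, which interpolates at rank p(n+1), with the default `type = 7`.

```r
x <- 1:100      # illustrative data: each value equals its rank
p <- 0.20

rank_rule <- p * (length(x) + 1)      # 20.2, the ordered rank from the rule above

# type = 6 interpolates at rank p * (n + 1)
q_type6 <- quantile(x, p, type = 6)   # 20.2

# the default (type = 7) interpolates at rank 1 + p * (n - 1)
q_default <- quantile(x, p)           # 20.8

print(c(rank_rule = rank_rule, type6 = unname(q_type6), default = unname(q_default)))
```

For large samples the differences between the rules are small, but they can be noticeable for the short vectors used in the exercises.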
Statistics Derived from Sample: Percentiles (2)

• The 25th percentile of a sample of n observations is referred

to as the lower quartile, and the 75th percentile of a
sample of n observations is referred to as the upper
quartile.
- Example: Scores of 20, 30, 50, 60, 67, 67, 70, 80, 90, 95
- 1st Quartile = 50, 3rd Quartile = 80 (the medians of the lower and upper halves)

• The difference between the upper quartile value and the

lower quartile value is referred to as the interquartile
range of the empirical frequency distribution.

• We can use the interquartile range (IQR) of the

empirical frequency distribution to estimate the interquartile
range of the distribution of the values of the variable in the
study population.
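The quartiles quoted for the score example (50 and 80) correspond to the medians of the lower and upper halves of the data. As a hedged check, the sketch below compares this with R's defaults; `hinge_iqr` and `q_default` are names introduced here.

```r
scores <- c(20, 30, 50, 60, 67, 67, 70, 80, 90, 95)

# Tukey's five-number summary: min, lower hinge, median, upper hinge, max
five <- fivenum(scores)          # 20 50 67 80 95
hinge_iqr <- five[4] - five[2]   # 80 - 50 = 30

# Default quantile() (type = 7) interpolates between neighbouring values
q_default <- quantile(scores, c(0.25, 0.75))   # 52.5 and 77.5
IQR(scores)                                    # 25
```

Both conventions are in common use, so it is worth stating which one was applied when reporting quartiles or an IQR.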


Visualizing Data from 1 Sample/Population
- Bar Charts and Box-and-Whiskers-Plots -
Data set: 22 values, mean: 9.5, SD: 1.77, SEM: 0.377
Values: 6, 7, 7, 8, 8, 8, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 11, 11, 11, 12, 12, 13
[Figure: bar charts showing Mean ± SD and Mean ± SEM; box-and-whiskers plot with non-parametric indicators]
http://www.physics.csbsju.edu/stats/simple.box.defs.gif
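The summary numbers for this data set can be recomputed as follows. This is a sketch; `values`, `m`, `s`, `sem` and `box` are names chosen here, and boxplot.stats() is used to obtain the numbers behind a box-and-whiskers plot without drawing it.

```r
values <- c(6, 7, 7, 8, 8, 8, 9, 9, 9, 9, 9,
            10, 10, 10, 10, 10, 11, 11, 11, 12, 12, 13)

n   <- length(values)    # 22
m   <- mean(values)      # 9.5
s   <- sd(values)        # about 1.77
sem <- s / sqrt(n)       # about 0.377

# Lower whisker, lower hinge, median, upper hinge, upper whisker
box <- boxplot.stats(values)$stats   # 6 8 9.5 11 13
box

# boxplot(values) would draw the corresponding plot
```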
Measures of central tendency (cont.)

• Each of the three methods of measuring

central tendency has certain advantages and
disadvantages
• Which method should be used?
• It depends on the
type of data that is
being analyzed,
mainly on their
DISTRIBUTION and
SKEWNESS

http://herdingcats.typepad.com/photos/uncategorized/statistics.jpg

Measurement Scales and Indicators
of Centrality

Measurement scale | Permissible mathematical operations | Best measure of central tendency
Nominal | Counting | Mode
Ordinal | Greater or less than operations | Median
Interval | Addition and subtraction | (Symmetrical – Mean); Skewed – Median
Ratio | Addition, subtraction, multiplication and division | Symmetrical – Mean; Skewed – Median


Measure of Distribution: Dispersion

• RANGE
What is the highest, what is the lowest value?
• STANDARD DEVIATION
How closely do values cluster around the mean value?
• SKEWNESS
How symmetrical is the curve of distribution?


Descriptive Measures
After we have recorded our data, we want to be able to
characterize it quantitatively:

Measures of Central Tendency (=Measures of Location)

(Arithmetic) Mean, Median, Mode
Measures of Variability (Dispersion)
Range, Variance, Standard Deviation
Measures of Relative Standing
Z-Scores, Percentiles, Quartiles


Measure of Distribution: Range

Range is the difference between the largest and
smallest values in the data set
R = Max(Xi) − Min(Xi)
[R: max(vector)-min(vector); range(vector) returns the minimum and
the maximum, so diff(range(vector)) gives their difference]
Set 1: 100, 30, 20, 7, –20, –30, –100
Set 2: 10, 3, 2, 7, -2, -3, -10

R1=200; R2=20

Heavily influenced by two most extreme values and

ignores the rest of the distribution.
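The two ranges can be verified as sketched below; `set1`, `set2`, `r1` and `r2` are names chosen to match the slide's notation.

```r
set1 <- c(100, 30, 20, 7, -20, -30, -100)
set2 <- c(10, 3, 2, 7, -2, -3, -10)

range(set1)                    # -100  100 (minimum and maximum, not the difference)
r1 <- max(set1) - min(set1)    # 200
r2 <- diff(range(set2))        # 20
print(c(R1 = r1, R2 = r2))
```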


Measure of Dispersion: Example

Look at these two data sets:

Set 1: 100, 30, 20, 7, –20, –30, –100
Set 2: 10, 3, 2, 7, -2, -3, -10

If we calculate mean:
Set 1: n = 7, $\bar{x}$ = 1
Set 2: n = 7, $\bar{x}$ = 1

How to measure dispersion (spread, variability)?


Variance and Standard Deviation (Population)
Suppose we have N measurements of a particular
variable in a population: X1, X2, X3,…,XN,

The mean is µ. As $\sum (X_i - \mu) = 0$, we define:

$$\sigma^2 = \frac{1}{N}(X_1 - \mu)^2 + \frac{1}{N}(X_2 - \mu)^2 + \dots + \frac{1}{N}(X_N - \mu)^2 = \frac{\sum (X_i - \mu)^2}{N}$$

as variance, unit is (X unit)²

$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}}$$

as standard deviation,
unit is X unit.
Variance and Standard Deviation (Sample)

Suppose we have n measurements of a particular

variable in a sample: X1, X2, X3,…,Xn,
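The mean is $\bar{x}$, we define:

$$s^2 = \frac{\sum (X_i - \bar{x})^2}{n - 1}$$

as sample variance (s²) and

$$s = \sqrt{\frac{\sum (X_i - \bar{x})^2}{n - 1}}$$

as standard deviation (s)
(of the sample: note “n-1”
in the denominator!)
[Figure: population (µ ± σ) versus sample ($\bar{x}$ ± s)]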
[R: var(vector), sd(vector)]
http://www.six-sigma-material.com/images/PopSamples.GIF
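The difference between the two denominators can be checked by hand. The sketch below uses Set 2 from the dispersion example; `ss`, `var_sample`, `var_pop` and `sd_sample` are illustrative names. Note that var() and sd() in R always use the sample formula with n − 1.

```r
x <- c(10, 3, 2, 7, -2, -3, -10)   # Set 2 from the dispersion example
n <- length(x)

ss <- sum((x - mean(x))^2)     # sum of squared deviations: 268

var_sample <- ss / (n - 1)     # about 44.67, identical to var(x)
var_pop    <- ss / n           # about 38.29, if x were the whole population
sd_sample  <- sqrt(var_sample) # about 6.68, identical to sd(x)

print(c(var_sample = var_sample, var_pop = var_pop, sd_sample = sd_sample))
```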
Standard Error of the Mean (Sample)

• The standard error of the mean (SEM)
is the standard deviation (SD), divided
by the square root of the number of
observed values:
$$SE_m = \frac{SD}{\sqrt{n}}$$
• Hence, the SEM describes the estimated standard deviation of the
sample mean, i.e. how precisely the sample mean estimates the
population mean, rather than the spread of the individual values
in the sample. When the sample size increases, the SEM decreases,
whereas the SD settles near the value of the population.

• [R: sd(vector)/sqrt(length(vector))]
http://www.six-sigma-material.com/images/PopSamples.GIF
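The effect of the sample size on SD and SEM can be explored with simulated data. Everything in this sketch is an assumption made for illustration: the helper `sem()`, the seed, and the simulated normal measurements with mean 50 and SD 5.

```r
# Hypothetical helper; base R has no built-in SEM function
sem <- function(x) sd(x) / sqrt(length(x))

set.seed(1)                               # arbitrary seed, for reproducibility
small <- rnorm(10,   mean = 50, sd = 5)   # simulated measurements, n = 10
large <- rnorm(1000, mean = 50, sd = 5)   # simulated measurements, n = 1000

# The SD stays in the region of 5, the SEM shrinks with growing n
c(sd = sd(small), sem = sem(small))
c(sd = sd(large), sem = sem(large))
```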
When to Use SEM, When SD?

• The standard deviation (SD) is used when describing or
quantifying the variation around the mean of a sample.
The standard deviation is an important statistic when determining if two
samples likely originated from the same underlying population.
• Central limit theorem: “sample means are normally distributed”
(approximately, for sufficiently large samples)
• The standard error (of the mean) (SEM or SE) is used when
estimating the mean of the underlying population (from which
the sample originated).
SEM is the important statistic for use in calculating the
confidence of your sample statistic (sample mean), and it is
determined by both SD of the sample & the sample size.

http://www.six-sigma-material.com/images/PopSamples.GIF
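How the SEM enters a confidence statement can be sketched as follows. The slide does not specify the interval, so a two-sided 95 % confidence interval based on the t distribution is assumed here; the age vector is reused from the earlier exercise.

```r
age <- c(18, 32, 56, 23, 32, 27, 40, 38, 23)

n      <- length(age)
sem    <- sd(age) / sqrt(n)
t_crit <- qt(0.975, df = n - 1)    # assumed: two-sided 95 % level

# Confidence interval of the mean: mean plus/minus t * SEM
ci <- mean(age) + c(-1, 1) * t_crit * sem
ci

# t.test() reports the same interval
t.test(age)$conf.int
```

The interval narrows as n grows because the SEM shrinks, even if the SD of the sample stays the same.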

Coefficient of Variation

The Coefficient of Variation (COV or CV, “unit”: %) can be
understood as the relative standard deviation:

Definition of CV: $CV = \frac{SD}{\bar{x}} \times 100$

Useful in comparing variation between two distributions

Used particularly in comparing laboratory measures to

identify those methods that give you a lower variation
(higher precision)

[R: sd(vector)/mean(vector)*100]


Example: Mean, Variance, Standard Deviation,
and Coefficient of Variation

Set 1: 100, 30, 20, 7, –20, –30, –100

Set 2: 10, 3, 2, 7, -2, -3, -10

Calculate x , s2, s and CV (e.g. in “R”):

Set | $\bar{x}$ | s² | s | CV [%]
1 | 1 | 3773.7 | 61.4 | 6143.0
2 | 1 | 44.7 | 6.7 | 668.3
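The table can be recomputed in R as sketched below. `describe()` is a hypothetical helper written for this example. When the CV is computed from the unrounded standard deviation, it comes out at about 6143 % for Set 1 and about 668 % for Set 2.

```r
set1 <- c(100, 30, 20, 7, -20, -30, -100)
set2 <- c(10, 3, 2, 7, -2, -3, -10)

# Hypothetical helper returning the four statistics of the table
describe <- function(x) {
  c(mean = mean(x), var = var(x), sd = sd(x), cv = sd(x) / mean(x) * 100)
}

result <- rbind(set1 = describe(set1), set2 = describe(set2))
round(result, 1)
```

Both sets share the same mean, so the CV differs by the same factor of about 9.2 as the SD.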


Which measure to use?

•Range. It’s not often used because it’s very sensitive to

outliers.
•Interquartile range. It’s pretty robust to outliers. It’s used a
lot in combination with the median, i.e. when using non-
parametric tests.
•Variance. It’s hard to interpret directly because it isn’t
expressed in the same units as the data (its unit is the squared
data unit). It’s almost never used except as a mathematical tool.
•Standard deviation. This is the square root of the variance.
It’s expressed in the same units as the data. The standard
deviation is often used in the situation where the mean is the
measure of central tendency, along with parametric tests.


Converting to Standard Normal
Frequently in science, you have to compare distributions that change
over time.
Example: BMI of children. Problem: compare BMIs of children from
different classes (age groups): how? =>
Convert N(µ,σ²) to a standard normal (Z-Score):
$$Z = \frac{X - \mu}{\sigma}$$
[Figure: BMI-for-age chart with the bands obese, overweight, normal,
underweight and extremely underweight]
Child 1: 5 years, BMI: 18.8, mean (5 ys): 15.45 kg/m2; s=2.05 => Z= +1.63

Child 2: 16 years, BMI: 19.0, mean (16 ys): 20.48 kg/m2; s=1.25 => Z= -1.18

So: what is the call for z-scores in R?
[R: (vector-mean(vector))/sd(vector)]
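A minimal sketch of the z-score calculation follows. `z_score()` is a hypothetical helper; it is evaluated with the numbers quoted for the two children, which give about +1.63 and about −1.18. The second part standardizes a whole vector against its own mean and SD, which is what the call above does.

```r
# Hypothetical helper for a single value against a reference mean and SD
z_score <- function(x, mean_ref, sd_ref) (x - mean_ref) / sd_ref

z_child1 <- z_score(18.8, mean_ref = 15.45, sd_ref = 2.05)   # about +1.63
z_child2 <- z_score(19.0, mean_ref = 20.48, sd_ref = 1.25)   # about -1.18

# Standardizing a whole vector against its own mean and SD
age   <- c(18, 32, 56, 23, 32, 27, 40, 38, 23)
z_age <- (age - mean(age)) / sd(age)

# scale() does the same and returns a one-column matrix
all.equal(z_age, as.numeric(scale(age)))   # TRUE
```

After standardization the vector has mean 0 and SD 1, which makes values from different age groups comparable.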
Recap: Centrality and Dispersion

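Statistic | Population | Sample
Mean | $\mu = \frac{\sum X_i}{N}$ | $\bar{x} = \frac{\sum X_i}{n}$
Variance | $\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$ | $s^2 = \frac{\sum (X_i - \bar{x})^2}{n - 1}$
SD | $\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}}$ | $s = \sqrt{\frac{\sum (X_i - \bar{x})^2}{n - 1}}$
CV | | $CV = \frac{s}{\bar{x}} \times 100$
SEM | | $SE_m = \frac{s}{\sqrt{n}}$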
Tools in [R] that Make Your Life Easier (1)

• mean(V). Reports the average value (= arithmetic mean) of
the vector’s elements
• median(V). Reports the median value of the vector’s
elements
• var(V). Reports the variance of the vector’s elements
• sd(V). Reports the standard deviation of the vector’s
elements
• quantile(V). Reports the minimum, the maximum and the
0.25, 0.50 and 0.75 quantiles (= quartiles).
• IQR(V). Reports the interquartile range of the vector
(difference between the 0.75 and the 0.25 quantile)


Tools in [R] that Make Your Life Easier (2)

• summary(DF). Summary of a data frame: summary() is
automatically applied to each column. The
format of the result depends on the type of the data
contained in the column. For example:
• If the column is a numeric variable, mean, median, min, max
and quartiles are returned.
• If the column is a factor variable, the number of
observations in each group is returned.
• You can also apply the summary() function to single vectors.
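The behaviour for numeric and factor columns can be tried on a small data frame. The data frame below is hypothetical: the weights and heights are reused from the earlier exercise, while the `sex` column and all column names are invented for illustration.

```r
# Hypothetical data frame; the sex column is invented for illustration
my_data <- data.frame(
  weight = c(60, 72, 57, 90, 95, 72),
  height = c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91),
  sex    = factor(c("f", "m", "f", "m", "m", "f"))
)

# Numeric columns: min, quartiles, median, mean, max; factor column: counts
summary(my_data)

# summary() also works on a single vector
s_weight <- summary(my_data$weight)
s_weight
```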


Tools in [R] that Make Your Life Easier (3)

• sapply() function - used to repeatedly apply a particular
function over a list or data frame. You can use it to compute,
for each vector (column) in a data frame, the mean, sd, var,
min, quantile, …

# Compute the mean of each column
sapply(my_data, mean)
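Since `my_data` is not defined on the slide, the sketch below builds a hypothetical data frame from the weight and height vectors of the earlier exercise and applies sapply() to it.

```r
# Hypothetical data frame standing in for my_data
my_data <- data.frame(
  weight = c(60, 72, 57, 90, 95, 72),
  height = c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91)
)

col_means <- sapply(my_data, mean)   # named numeric vector, one entry per column
col_sds   <- sapply(my_data, sd)
col_means
col_sds

# A function that returns several values yields a matrix (one column per variable)
sapply(my_data, quantile)
```

sapply() only makes sense here if every column is numeric; a factor column would need to be excluded first.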


Tools in [R] that Make Your Life Easier (4)
stat.desc(DF): The function stat.desc() (in the pastecs package)
provides other useful statistics, including:
• the median
• the mean
• the standard error of the mean (SE.mean)
• the confidence interval of the mean (CI.mean) at the p level (default is 0.95)
• the variance (var)
• the standard deviation (std.dev)
• and the variation coefficient (coef.var) defined as the standard deviation
divided by the mean

Install pastecs package: install.packages("pastecs")

Use the function stat.desc() to compute descriptive statistics
# Compute descriptive statistics
library(pastecs)
res <- stat.desc(my_data)
round(res, 2)
Visualization of Centrality, Variation, and
Skewness of Data: Histograms

• Histograms are used to visually depict
frequency distributions of continuous data.
• A histogram is a type of bar chart without
spaces between the bars.
• The height of each bar gives the number of observations
within the category.
• The variables on the X-axis (abscissa) are
either “categorical ordinal” (rarely) or
“numerical interval” (in most cases), without
gaps.
Example: age groups of pupils in a class


Visualizing Data from 1 Sample/Population
- Histograms -
Histogram of 100 randomly generated values, µ=0, σ=0.0316:

X-axis (abscissa):
Categorical ordinal variables or numerical interval variables
http://grants.hhp.coe.uh.edu/doconnor/PEP6305/Topic%20005%20Normal%20Distribution_files/SEM%20histogram.jpg
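A comparable histogram can be generated in R. The sketch assumes normally distributed random values with the µ and σ given in the caption; the seed is arbitrary, so the bin counts will not match the original figure.

```r
set.seed(42)                              # arbitrary seed, for reproducibility
x <- rnorm(100, mean = 0, sd = 0.0316)    # 100 values, as in the caption

# plot = FALSE returns the bin breaks and counts without drawing;
# hist(x) alone draws the histogram
h <- hist(x, plot = FALSE)
h$breaks    # bin limits on the X-axis, without gaps
h$counts    # number of observations per bin (bar heights)
```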
Did I get it?
Yes, if you can …
o Define the terms population, sample, random sample, variable,
parameter, data set
o List the 3 principal levels of statistical data evaluation
o Explain the 4 types of scales that can be used in statistics
o Define the criteria for a satisfactory scale in an investigation
o Work with the summation symbol
o Define the terms centrality, dispersion, and skewness on a
statistical basis
o Explain the terms mean, median, and mode, and calculate
them
o Explain the terms range, standard deviation, standard error of
the mean, coefficient of variation, standard normal, and
histogram
