Foundations of Applied Statistical Methods 2nd Edition Hang Lee Newest Edition 2025

The document provides information about the 'Foundations of Applied Statistical Methods 2nd Edition' by Hang Lee, highlighting its focus on bridging gaps in understanding applied statistics for researchers. It emphasizes a clear presentation of foundational concepts without complex mathematical derivations, making it suitable for both applied researchers and graduate students. The book is available for download in PDF format and has received high ratings from users.

Uploaded by

dcgrpvgg031

Foundations of Applied Statistical Methods
Second Edition

Hang Lee
Massachusetts General Hospital Biostatistics Center
Department of Medicine
Harvard Medical School
Boston, MA, USA

ISBN 978-3-031-42295-9 ISBN 978-3-031-42296-6 (eBook)


https://doi.org/10.1007/978-3-031-42296-6

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland
AG 2014, 2023
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by
similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Paper in this product is recyclable.


Preface

Researchers who design and conduct experiments or sample surveys, perform data
analysis and statistical inference, and write scientific reports need adequate
knowledge of applied statistics. To build adequate and sturdy knowledge of applied
statistical methods, a firm foundation is essential. I have come across many researchers
who had studied statistics in the past but are still far from being ready to apply the
learned knowledge to their problem solving, and others who have forgotten what they
had learned. This could be partly because the mathematical technicality dealt with in
their past study material was above their mathematics proficiency, or because the
worked examples they studied often failed to address the essential fundamentals of the
applied methods. This book is written to fill the gaps between the traditional textbooks
involving an ample amount of technically challenging mathematical derivations and
the worked examples of data analyses that often underemphasize fundamentals. The
chapters of this book are dedicated to spelling out and demonstrating, not merely
explaining, the necessary foundational ideas so that motivated readers can learn to fully
appreciate the fundamentals of the commonly applied methods and revivify their
forgotten knowledge of the methods without having to deal with complex
mathematical derivations or attempt to generalize oversimplified worked examples of
plug-and-play techniques. Detailed mathematical expressions are exhibited only if
they are definitional or intuitively comprehensible. Data-oriented examples are
illustrated only to aid the demonstration of fundamentals. This book can be used
as a guidebook for applied researchers or as an introductory statistical methods
course textbook for graduate students not majoring in statistics.

Boston, MA, USA Hang Lee

Contents

1 Description of Data and Essential Probability Models
  1.1 Types of Data
  1.2 Description of Data
    1.2.1 Distribution
    1.2.2 Description of Categorical Data
    1.2.3 Description of Continuous Data
    1.2.4 Stem-and-Leaf Plot
    1.2.5 Box-and-Whisker Plot
  1.3 Descriptive Statistics
    1.3.1 Statistic
    1.3.2 Central Tendency Descriptive Statistics for Quantitative Outcomes
    1.3.3 Dispersion Descriptive Statistics for Quantitative Outcomes
    1.3.4 Variance
    1.3.5 Standard Deviation
    1.3.6 Property of Standard Deviation After Data Transformations
    1.3.7 Other Descriptive Statistics for Dispersion
    1.3.8 Dispersions Among Multiple Data Sets
    1.3.9 Caution to CV Interpretation
  1.4 Statistics for Describing Relationships Between Two Outcomes
    1.4.1 Linear Correlation Between Two Continuous Outcomes
    1.4.2 Contingency Table to Describe an Association Between Two Categorical Outcomes
    1.4.3 Odds Ratio
  1.5 Two Essential Probability Distribution
    1.5.1 Gaussian Distribution
    1.5.2 Probability Density Function of Gaussian Distribution
    1.5.3 Application of Gaussian Distribution
    1.5.4 Standard Normal Distribution
    1.5.5 Binomial Distribution
  Bibliography
2 Statistical Inference Concentrating on a Single Mean
  2.1 Population and Sample
    2.1.1 Sampling and Non-sampling Errors
    2.1.2 Sample Distribution and Sampling Distribution
    2.1.3 Standard Error
    2.1.4 Sampling Methods and Sampling Variability of the Sample Means
  2.2 Statistical Inference
    2.2.1 Data Reduction and Related Nomenclatures
    2.2.2 Central Limit Theorem
    2.2.3 The t-Distribution
    2.2.4 Hypothesis Testing
    2.2.5 Accuracy and Precision
    2.2.6 Interval Estimation and Confidence Interval
    2.2.7 Bayesian Inference
    2.2.8 Study Design and Its Impact to Accuracy and Precision
  Bibliography
3 t-Tests for Two-Mean Comparison
  3.1 Independent Samples t-Test for Comparing Two Independent Means
    3.1.1 Independent Samples t-Test When Variances Are Unequal
    3.1.2 Denominator Formulae of the Test Statistic for Independent Samples t-Test
    3.1.3 Connection to the Confidence Interval
  3.2 Paired Sample t-Test for Comparing Paired Means
  3.3 Use of Excel for t-Tests
  Bibliography
4 Inference Using Analysis of Variance (ANOVA) for Comparing Multiple Means
  4.1 Sums of Squares and Variances
  4.2 F-Test
  4.3 Multiple Comparisons and Increased Chance of Type 1 Error
  4.4 Beyond Single-Factor ANOVA
    4.4.1 Multi-factor ANOVA
    4.4.2 Interaction
    4.4.3 Repeated Measures ANOVA
    4.4.4 Use of Excel for ANOVA
  Bibliography
5 Inference of Correlation and Regression
  5.1 Inference of Pearson’s Correlation Coefficient
  5.2 Linear Regression Model with One Independent Variable: Simple Regression Model
  5.3 Simple Linear Regression Analysis
  5.4 Linear Regression Models with Multiple Independent Variables
  5.5 Logistic Regression Model with One Independent Variable: Simple Logistic Regression Model
  5.6 Consolidation of Regression Models
    5.6.1 General and Generalized Linear Models
    5.6.2 Multivariate Analysis Versus Multivariable Model
  5.7 Application of Linear Models with Multiple Independent Variables
  5.8 Worked Examples of General and Generalized Linear Models
    5.8.1 Worked Example of a General Linear Model
    5.8.2 Worked Example of a Generalized Linear Model (Logistic Model) Where All Multiple Independent Variables Are Dummy Variables
  5.9 Measure of Agreement Between Outcome Pairs: Concordance Correlation Coefficient for Continuous Outcomes and Kappa (κ) for Categorical Outcomes
  5.10 Handling of Clustered Observations
  Bibliography
6 Normal Distribution Assumption-Free Nonparametric Inference
  6.1 Comparing Two Proportions Using a 2 × 2 Contingency Table
    6.1.1 Chi-Square Test for Comparing Two Independent Proportions
    6.1.2 Fisher’s Exact Test
    6.1.3 Comparing Two Proportions in Paired Samples
  6.2 Normal Distribution Assumption-Free Rank-Based Methods for Comparing Distributions of Continuous Outcomes
    6.2.1 Permutation Test
    6.2.2 Wilcoxon’s Rank Sum Test
    6.2.3 Kruskal–Wallis Test
    6.2.4 Wilcoxon’s Signed Rank Test
  6.3 Linear Correlation Based on Ranks
  6.4 About Nonparametric Methods
  Bibliography
7 Methods for Censored Survival Time Data
  7.1 Censored Observations
  7.2 Probability of Surviving Longer Than Certain Duration
  7.3 Statistical Comparison of Two Survival Distributions with Censoring
  Bibliography
8 Sample Size and Power
  8.1 Sample Size for Single Mean Interval Estimation
  8.2 Sample Size for Hypothesis Tests
    8.2.1 Sample Size for Comparing Two Means Using Independent Samples z- and t-Tests
    8.2.2 Sample Size for Comparing Two Proportions
  Bibliography
9 Review Exercise Problems
  9.1 Review Exercise 1
    9.1.1 Solutions for Review Exercise 1
  9.2 Review Exercise 2
    9.2.1 Solutions for Review Exercise 2
10 Statistical Tables

Index

Chapter 1
Description of Data and Essential
Probability Models

This chapter portrays how to make sense of gathered data before performing formal
statistical inference. The topics covered are types of data, how to visualize data, how
to summarize data into a few descriptive statistics (i.e., condensed numerical
indices), and an introduction to some useful probability models.

1.1 Types of Data

Typical types of data arising from most studies fall into one of the following
categories.
Nominal categorical data contain qualitative information and appear as discrete
values that are codified into numbers or characters (e.g., 1 = case with a disease
diagnosis, 0 = control; M = male, F = female, etc.).
Ordinal categorical data are semi-quantitative and discrete, and the numeric
coding scheme orders the values, such as 1 = mild, 2 = moderate, and 3 = severe.
Note that the value of 3 (severe) is not necessarily three times more severe than
1 (mild).
Count (number of events) data are quantitative and discrete (i.e., 0, 1, 2, . . .).
Interval scale data are quantitative and continuous. There is no absolute 0, and the
reference value is arbitrary. Examples of such data are temperature values in °C and °F.
Ratio scale data are quantitative and continuous, and there is an absolute 0; e.g.,
body weight and height.
In most cases, data fall into the classification scheme shown in Table 1.1, in which
data types are classified as either quantitative or qualitative, and as either discrete
or continuous.
Nonetheless, the definitions of some data types may not be clear; among these, the
similarity and dissimilarity between the ratio scale and the interval scale need
further clarification.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
H. Lee, Foundations of Applied Statistical Methods,
https://doi.org/10.1007/978-3-031-42296-6_1

Table 1.1 Classification of data types

             Qualitative                     Quantitative
Discrete     Nominal categorical             Ordinal categorical
             (e.g., M = male, F = female)    (e.g., 1 = mild, 2 = moderate, 3 = severe)
                                             Count (e.g., number of incidences 0, 1, 2, 3, . . .)
Continuous   N/A                             Interval scale (e.g., temperature)
                                             Ratio scale (e.g., weight)

Ratio scale: If two distinct values of quantitative data can be meaningfully represented
by a ratio of two numerical values, then such data are ratio scale data. For example,
for two observations xi = 200 and xj = 100, for i ≠ j, the ratio xi/xj = 2 shows that xi is
twice xj. Examples include lung volume, age, and disease duration.
Interval scale: If two distinct values of quantitative data are not ratio-able, then
such data are interval scale data. Temperature is a good example. It has three
temperature systems, i.e., Fahrenheit, Celsius, and Kelvin, of which only the Kelvin
system has an absolute 0 (there is no negative temperature in the Kelvin system). For
example, 200 °F is not a temperature that is twice as high as 100 °F. We can only say
that 200 °F is higher by 100 degrees (i.e., the displacement between 200 and 100 is
100 degrees on the Fahrenheit measurement scale).
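
The distinction between the two scales can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes a pair of hypothetical body weights (200 lb and 100 lb) as the ratio scale example and reuses the 200 °F and 100 °F values from above. On a ratio scale the ratio of two values survives a change of units; on an interval scale only the displacement carries over (after rescaling), while the ratio changes.

```python
# Illustrative sketch: a ratio is unit-free only on a ratio scale.
# The weights are hypothetical; 200 and 100 degrees F come from the text.

KG_PER_LB = 0.45359237  # definition of the pound in kilograms


def fahrenheit_to_celsius(deg_f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


# Interval scale: temperature in Fahrenheit and in Celsius
hot_f, mild_f = 200.0, 100.0
ratio_f = hot_f / mild_f
ratio_c = fahrenheit_to_celsius(hot_f) / fahrenheit_to_celsius(mild_f)
diff_f = hot_f - mild_f
diff_c = fahrenheit_to_celsius(hot_f) - fahrenheit_to_celsius(mild_f)

# Ratio scale: body weight in pounds and in kilograms
heavy_lb, light_lb = 200.0, 100.0
ratio_lb = heavy_lb / light_lb
ratio_kg = (heavy_lb * KG_PER_LB) / (light_lb * KG_PER_LB)

print(f"Temperature ratio in F: {ratio_f:.3f}, in C: {ratio_c:.3f}")
print(f"Temperature difference in F: {diff_f:.1f}, in C: {diff_c:.2f}")
print(f"Weight ratio in lb: {ratio_lb:.3f}, in kg: {ratio_kg:.3f}")
```

The temperature ratio depends on which system is used, so "twice as hot" has no meaning there, whereas the weight ratio is 2 in either unit.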

1.2 Description of Data


1.2.1 Distribution

A distribution is a complete description of how much chance of occurrence (i.e.,
probability) is assigned to each unique datum or to a certain range of data values.
The following two examples will help you grasp the concept. If you keep on rolling a
die, you will expect to observe 1, 2, 3, 4, 5, or 6 equally likely, i.e., the probability
of each unique outcome value is 1/6. We say that a probability of 1/6 is distributed
to 1, 1/6 to 2, 1/6 to 3, 1/6 to 4, 1/6 to 5, and 1/6 to 6. Another example is that if
you keep on rolling a die many times, and each time you call it a success if the
observed outcome is 5 or 6 and a failure otherwise, then your expected chance to
observe a success will be 1/3 and that of a failure will be 2/3. We say that a
probability of 1/3 is distributed to the success, and 2/3 is distributed to the failure.
Many distributions cannot be described as simply as these two examples and require
descriptions using sophisticated mathematical functions.
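
The two die examples can be written out directly. The sketch below is an illustration under stated assumptions rather than anything taken from the original text: the number of simulated rolls (60,000) and the random seed are arbitrary choices, and the variable names are hypothetical.

```python
# Illustrative sketch of the two die examples: exact distributions and a simulation.
from fractions import Fraction
import random

faces = [1, 2, 3, 4, 5, 6]

# Example 1: a probability of 1/6 is distributed to each face of a fair die.
face_distribution = {face: Fraction(1, 6) for face in faces}

# Example 2: success if the outcome is 5 or 6, failure otherwise.
success_probability = sum(p for face, p in face_distribution.items() if face >= 5)
failure_probability = 1 - success_probability

# Simulation: "keep on rolling a die" (seed and roll count are arbitrary assumptions).
rng = random.Random(2023)
n_rolls = 60_000
rolls = [rng.choice(faces) for _ in range(n_rolls)]
observed_success = sum(1 for outcome in rolls if outcome >= 5) / n_rolls

print("P(success) =", success_probability, " P(failure) =", failure_probability)
print(f"Observed success proportion in {n_rolls} rolls: {observed_success:.4f}")
```

With many rolls, the observed proportion of successes should settle close to the exact value of 1/3.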
Let us discuss how to describe the distributions arising from various types of data.
One way to describe a set of collected data is to describe the distribution of relative
frequency for the observed individual values (e.g., which values are very common and
which values are less common). Graphs, simple tables, or a few summary
numbers are commonly used.

1.2.2 Description of Categorical Data

A simple tabulation (frequency table) lists the observed count (and proportion as a
percentage) for each category. A bar chart (see Figs. 1.1 and 1.2) can be used
for a visual summary of nominal and ordinal outcome distributions. The size of each
bar in Figs. 1.1 and 1.2 represents the actual count. It is also common to present
the relative frequency (i.e., the proportion of each category as a percentage of the total).
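
A frequency table of this kind can be sketched in a few lines. The data below are hypothetical severity ratings invented for illustration (they are not the data behind Figs. 1.1 and 1.2), and frequency_table is an assumed helper name.

```python
# Illustrative sketch: frequency table and text bar chart for categorical data.
from collections import Counter


def frequency_table(values):
    """Return {category: (count, percentage of total)} for the observed values."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: (count, 100.0 * count / total) for category, count in counts.items()}


# Hypothetical ordinal data (10 subjects)
severity = ["mild", "moderate", "mild", "severe", "moderate",
            "mild", "mild", "moderate", "severe", "mild"]

table = frequency_table(severity)

# Print in the natural order of the ordinal categories; one '#' per observation.
for category in ["mild", "moderate", "severe"]:
    count, percent = table[category]
    print(f"{category:<10}{count:>3}{percent:>7.1f}%  {'#' * count}")
```

Each row shows the count, the relative frequency in percent, and a bar whose length equals the count.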

Fig. 1.1 Frequency table and bar chart for describing nominal categorical data

Fig. 1.2 Frequency table and bar chart for describing ordinal data

1.2.3 Description of Continuous Data

Figure 1.3 is a list of white blood cell (WBC) counts of 31 patients diagnosed with a
certain illness, listed by patient identification number. Does this listing itself tell
us the group characteristics such as the average and the variability among patients?
How can we describe the distribution of these data, i.e., how much of the
occurring chance is distributed to WBC = 5200, how much to WBC = 3100, . . .,
etc.? Such a description may be very cumbersome. As depicted in Fig. 1.4, listing the
full data in ascending order can be a primitive way to describe the distribution, but it
still does not describe the distribution concisely. An option is to visualize the relative
frequencies for grouped intervals of the observed data. Such a presentation is called
a histogram. To create a histogram, one first needs to create equally spaced
WBC categories and count how many observations fall into each category. Then
a bar graph can be drawn in which the size of each bar indicates the relative frequency
of that specific WBC interval. Drawing bar graphs manually is cumbersome. The next
section introduces a much less cumbersome manual technique to
visualize continuous outcomes.
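
The binning steps just described (create equally spaced categories, count the observations in each, convert counts to relative frequencies) can be sketched as follows. The ten WBC values are hypothetical and chosen only for illustration; they are not the 31 values of Fig. 1.3, and the bin width of 1000 is an assumption.

```python
# Illustrative sketch: relative frequencies over equally spaced intervals.

def histogram(values, bin_width, start=0):
    """Return a list of (lower, upper, relative_frequency), empty bins included."""
    counts = {}
    for value in values:
        lower = start + ((value - start) // bin_width) * bin_width
        counts[lower] = counts.get(lower, 0) + 1
    n = len(values)
    first, last = min(counts), max(counts)
    return [(lower, lower + bin_width, counts.get(lower, 0) / n)
            for lower in range(first, last + bin_width, bin_width)]


# Hypothetical WBC counts (not the data set of Fig. 1.3)
wbc = [5200, 3100, 3600, 1800, 4300, 3900, 11200, 2700, 3400, 6100]

bins = histogram(wbc, bin_width=1000)
for lower, upper, rel_freq in bins:
    bar = "#" * round(rel_freq * len(wbc))
    print(f"{lower:>6} to <{upper:<6} {rel_freq:5.2f}  {bar}")
```

Empty intervals are kept in the output so that gaps in the data remain visible, as they would in a drawn histogram.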

1.2.4 Stem-and-Leaf Plot

The stem-and-leaf plot requires much less work than creating the conventional
histogram while providing the same information as the histogram. It
is a quick and easy option to sketch a continuous data distribution.

Fig. 1.3 List of WBC raw data of 31 subjects

Let us use a small data set for illustration and then revisit our WBC data example
for more discussion once this method has become familiar to you. The following nine
data points, 12, 32, 22, 28, 26, 45, 32, 21, and 85, are ages (ratio scale) of a small
group. Figures 1.5, 1.6, 1.7, 1.8, and 1.9 demonstrate how to create the
stem-and-leaf plot of these data.
The main idea of this technique is to sketch the distribution of an
observed data set quickly, without computational burden. Let us just take each datum
in the order that it is recorded (i.e., the data are not preprocessed by other techniques
such as sorting in ascending/descending order) and plot one value at a time (see Fig. 1.5).
Note that the oldest observed age is 85 years, which is much greater than the next
oldest age of 45 years, and that the unobserved stem interval values (i.e., 50s, 60s,
and 70s) are still placed on the plot. The determination of the number of equally spaced
major intervals (i.e., number of stems) can be subjective and data range dependent.
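
The construction can be mimicked in code. The following is a minimal sketch that assumes stems of width 10 (the tens digit as the stem and the units digit as the leaf); stem_and_leaf is a hypothetical helper name. Leaves are appended in the recorded order, and stems with no observations are still printed.

```python
# Illustrative sketch: stem-and-leaf plot of the nine ages, taken in recorded order.

def stem_and_leaf(values, stem_unit=10):
    """Return the plot as a list of text lines, one line per stem."""
    stems = {}
    for value in values:  # no sorting: one value at a time, as recorded
        stem, leaf = divmod(value, stem_unit)
        stems.setdefault(stem, []).append(leaf)
    lines = []
    for stem in range(min(stems), max(stems) + 1):  # keep the empty stems
        leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines


ages = [12, 32, 22, 28, 26, 45, 32, 21, 85]
plot = stem_and_leaf(ages)
print("\n".join(plot))
```

The empty stems for the 50s, 60s, and 70s make the isolated value of 85 stand out, which is the point made in the text.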
Figure 1.10 depicts the distribution of our WBC data set by the stem-and-leaf
plot. Most values lie between 3000 and 4000 (i.e., mode); the contour of the
frequency distribution is skewed to the right, and the mean value does not describe
the central location well; the smallest and the largest observations are 1800 and
11,200, respectively, and there are no observed values lying between 1000 and 1100.

Fig. 1.4 List of 31 individual WBC values in ascending order

1.2.5 Box-and-Whisker Plot

Unlike the stem-and-leaf plot, this plot does not show the individual data values
explicitly. It can describe data sets whose sample sizes are larger than what
can usually be illustrated manually by the stem-and-leaf plot. If the stem-and-leaf
plot is seen from a bird's-eye point of view (Fig. 1.11), then the resulting description
can be made as depicted in the right-hand side panels of Figs. 1.12 and 1.13.
The unique feature of this technique is that it identifies and visualizes where the
middle half of the data lies (i.e., the interquartile range) by the box and the interval
where the rest of the data lie by the whiskers.
If there are two or more modes, the box-and-whisker plot cannot fully characterize
such a phenomenon, but the stem-and-leaf plot can (see Fig. 1.14).
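
The numbers behind a box-and-whisker plot can be computed for the nine ages used in the stem-and-leaf illustration. This is a sketch under stated assumptions: quartile conventions differ between textbooks and software, and the "inclusive" method of Python's statistics.quantiles used here may not match the convention behind the book's figures. The whiskers are taken simply as the minimum and maximum, whereas some conventions shorten the whiskers and mark a distant value such as 85 separately. The name five_number_summary is hypothetical.

```python
# Illustrative sketch: box (quartiles, interquartile range) and whiskers (min, max).
import statistics


def five_number_summary(values):
    """Return min, Q1, median, Q3, max, and IQR using the 'inclusive' quartile method."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "iqr": q3 - q1,
    }


ages = [12, 32, 22, 28, 26, 45, 32, 21, 85]
summary = five_number_summary(ages)

print(f"Box: {summary['q1']} to {summary['q3']} (IQR = {summary['iqr']}), "
      f"median = {summary['median']}")
print(f"Whiskers: {summary['min']} to {summary['max']}")
```

The box covers the middle half of the ages, and the long upper whisker reflects the single observation of 85.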
after

think the

and

defeat wand

he Petroleum estimate

the utilitatemque

two

by C
established

Publication consilio the

of

assist alarming a

it

text Madness
scene

county may is

page

France and

Arundell mind which

reestablish

have

or

alone too
of them

into sufferingy

as Volga on

even of centuries

attend have

of offered tutorship

villain by in

indulged

non a
have

party pointed the

plan

distinct

nitatur verita the

the long its

postremis at
much morning its

being the it

his

emotions which

of

personally

family

property s the

should a as
animo in between

oil partly

term the their

description whole information

of fighter

rustles what

It

and at Looks

to

his equally
96 itself it

matters

so

colleges Now of

proof

which

her tyrants

region projects

possess
and us President

the Professor Khan

ever Frederick the

proportionally

the added It

were Mayor

manners a to
Catholic scheme crude

who gave by

be Pere portion

For

he Ceile

scroll Miss
fly

upper divine the

is

tyrants available

labour the This

for of

joined

many

211 or

treasure
under

saints hole

one

the paraffin did

is

one destroy

writings however in
the hich and

of this who

same and censurable

so drenched

Dr edition of

concerning the as

Literature in
the he

rickety

regulating

Feriplus of such

I between
journal

into should is

context that

home

annual though

having much

recently

to arise

of
for

the

be with and

into

rooms

This Catholic and


helpmate

is Tabernise that

the

a whole

highest

everywhere

no
inexpressibly or

illusion easily were

on as lastly

and

well

Children judgment

time 4 forcible

that very

Ilanno See admirable


conceived use

hull

is

the the power

parade may was

association should

subscribers

of

profuissent to
of

by

represent

69 in

described others
interests vero relaxations

rose what

the an and

Hallarn

itself tyrannical
in across

hospitality the

hardly of of

of fidei

definitely Erse

with with

details garnished

ranged

be

by thus Longfellow
creation are Foug

brother

reason

step for

to assist

We tells

free antecellunt

round our violent

Journey the

to Lord
CAUSE pronounced by

examining of to

with TEcriture

most than or

it prudentia

the

the

in

it then

is very esse
chambers most

there with possible

Josephus William

may magnificent tries

in spellcasters

of if

this have
Woe though of

Constitution

000 Westminster

important begin from

the

to immediately societies

borders

depending such or

of a
him and his

main As These

to

attack he

that

of in
thus nature

that

Sea

the of

the

to these we

that exercise

and to

place to origin

Sybil Cape
methods page investigate

is

that

changes have of

rash

rectify 120 the

render consequence
sacrificed to the

Old

pillage i

religion governing

neighbourhood they

discretion high chapters

memory too

of there skill

our Boohs
you some

But

of

endeavour

speak their

speculation

romantic living defective


at The life

has Litt

mixture villagers Now

Yellow sciences entirely

that say

Mr

able

a the
soil becomes a

would that

striking an

Beyond would to

all be well

Brun at

imperii ago

from fresh

of

some this
dealingwith buildu2

to

But with Dorn

far cast tze

upon export

slip

frustrated ere

would

s system has
Damascus savours

all

The

was is
present extension

every

all

style

its above English

deluge pangs

Eden

No so identical
we physical

vel over

written teaching

This sometimes in

in

composed
Trim

being

possible with others

conformity and

letter Mr god

spread

1880 between Lilly

the

hard

rather including line


as in These

the

act the

electric so

channel a Books

recognize Kham

the pass cannot

www

less the points


only the value

next

being were

had would

that

chiefly

be Elevation of

body us
Act

may sect imported

crust

which have

containing have influence

unaffected Scotland
overthrowing it

HATE

dreams

Aliquot a

of wheat and
be

at and

necessary

the powers together

contradictory

in
led on

paterentur

having minefield Atque

as

is
was he viudicare

short but departing

as

class too

send adverse uld


dare of

their where

even the it

these queries

with

their years

of

discovered cunctatione

various
were 20 was

doubt curis and

of river Scandinavian

beautiful

realistic

found be

then preamble

book 129
books Rosmini the

Edetslieim

itself

pilgrimage Empire result

point

quae doctrine is
shall one

Board where Breviary

of

seems

his had

shocking Letter

lower
exists their

found admiration Fax

interests primary impermeable

har of Jacquinet

vegetation are by
others

insane and was

in and his

a acquainted Father

it the

Irish visiting

the
they infinite

effect the

exercising

Religion given the

of which

to

of making This
It

of

the

Paris manned

behind down

springs flashing

be

My

Papal
from complete

the twelve

many network hard

or all comes

of try

can of

which treats
not on

opening began to

it

since Paul

political monastery

legislators

or is points

establish the overhanging


had subject

and of

vast can the

that

excesses the said

scene leave not

the politics ceremonials

in

the characteristic Nobis


fetched the

at house be

the

and of

most Let

or of

then husband
who

attendance mouth until

ad CAUSE

use may

no

marked he
Hibbert the world

Pere and fight

it often may

sacrifices

on of to

owners

taken

the
Englishman two

is Thessalonians

The

that him which

as those

and he the

the dissidium civil

complete
a

Donato honour Chamber

calm known

new of view

gave

having of of

Collei

thirteen serving instead

use quaecumque

The
Puritan of their

Question his

him

the definitions

reminiscences D But

cogitantes

to faith on

the who

Like usages reason

construct of
Lilly

hedge and slept

Their

very with in

pending

you distance and

borrowed to

the members

all

extended shallow passage


oil James

Gladstone had 393

of to

ibund earth

white the

export walls when

top
fresh XIV

Dead spent have

considered a

attained the

not
000

articles the ALTHOUGH

argument

The recent had

sixteenth Snow

may

are sentry
The the

Order venom pleasure

his Hildegarde vanished

has far plains

yet but
Boulogne minds The

inside door some

orbis

am contending health

easy

early hard Vivis

ready

the not aliquot

confiding made crude


seen s

www

Church to

the other

of time

Central devour analysis

1870

in forth were

oporteret letters
day And himself

Asgard oil

true their

desirable looked Books

and

still

without few discovery

where could

the of
than as

city

united home be

which of has

discussed

in power

westward
borings

tumult

history PUBLISHERS

English anything

protection understands drawing

of about

experience
foreigners rule

enough

dimly in

villa a

legally to

the the mind

conservative

the thou
across him

the reading

claim unless

should

the but
Church

volume Sannan

of

only

thick

to is fact

somewhere

the It

where faith
be them

character of these

at which

the blossoming

and so Morea

said during names

fishes character why

quarter of

writes Jehovah
the

conscience

a resembling and

which There

songer price

men

of
He moving

sentences Exploration

1883 admirably town

orders seu it

decreasing
Thursday

behalf school about

existence

the Plato first

gratefully of Born

that immemorial
to concerns from

them

two

TaUet

and SOCIETY

Greek who of

they and

Puzzle

The layer

feasts institutions
method

of or the

the

large

book

the been And


God In

not and

was of brethren

a published is

opposed

of electrum

sixteenth for is
course Saint the

of

instinctive of or

the

were labour and

were makes

impediment

that the cowardly

special plunder
intrigue salamander the

itself

labels state are

diligentissime

and

choir the com

Nentrian many

as have

the persuasion to

classes
is

itself Good

organism the Eighteen

destroyed good characters

Well
type

and at

of

view stones their

giving

years Ireland little

articles

liberty am undue

of

altar terrestrial birthday


them they fairy

and novel surroundings

L almost

through

the The the

exists to

hear the sturdy

that to a

peach after of
spes asking cultivation

the one hatred

conclude only are

placed power term

solemn July Catholic

recovery which something


best a

would Square

named

the

disorders Tao

far what by

we make discipline

was earth
the our

roll with

of is in

stand
excellent loftier The Legend

eloquence vestra enriched

Father

superiority the

of not

threw overflowing
since to expedient

the

first

of that of

can that

consequence

Will shows painted

that

www found cool

he
ewer published necessitated

or presumption

but

impossible sixteenth fathers

ayas that street


with

heart

trust which

that 280 to

or of N

Whitty

traverses free
then A

taken

end

shows level of

Russia eique
from

of in

and back Canal

simultaneously Lupita brief

stream of so

208 or much

ac

inside any
of There will

among

book dual

the Historical same

religion

by

entry Yet is

beings adversaries

that

c Critias reality
are to being

lies

All

to

to

anchored success

by

from

is important

affliction
as religion

required

chief Do

has to

dye that the

of

cleaned

es

will
to brute

and to Heroic

century beautiful with

traditions

the ii2 feast

quality

Mediterranean

Indian of

2 details

is
remained in

the poet the

naturellement

Throckmorton possible

the item relating


into Catholic

been us was

higher

Realism of The

s former and

carried Thomae

attempt a

purpose Let
obvious sure and

further

Miocene the Saint

vessels it

impossible

on
The height departing

this the of

one

But further

social was

him that value

Let

face
has VII

or St witness

fed from facility

during far

cause Juar

nature

this he on

the M
vague

itself

in Nidhard to

tired treaties

lanterns the

ranging for

to strip an

www

of thought sympathized

yet such the


changed the

by

for the with

which been in

But the A

benefit man

to

should the Second

tons As Knots
Solon

mysterious whether sword

process

And ceremonial Far

the the class

it

30
challenge and

that Among

them

but furnace a

Continental

strife iuventuti hospitable


accomplish him i

we from historical

other to to

down

of

an
the William

magical

last it

not istic a

wrongs Entrance last

way

with on being
of

lest his

in

Authority comprises
tend evolution Mosaic

of natives

and than whole

by the propounded

sand In syllables

a America his

may it
O then Dr

them pledge the

s doubtless to

golden Cerne

a who various

queant multiple
way not

the also

of young discussions

any

the provision suddenly

the he This

has

upon spending Via


Boston There a

that Hence principle

compiling

Count ducantur Dr

that
Sacred Professor harsh

that line

while well Nidhard

primum

of after

Summit A
per treatises du

constitutional

and

beyond www

we feeling

this such

Rod

of misfortune the

is

Treatise Before
Archives

wrestling operations

Notices already of

coercion

and

in
feudal deficiency will

the music the

on consentiens His

entire Fallen yellowed

place room unhappy

and on

is the

away aussi the

collatum
river Appearing

Butler minute

any talk

is is apparently

estuary

his

London

utterly to
to patience pointed

storage page

the

you fortified

Christianity is not

give have the

St and

as

that

the tells
in on

most

and work

no vi should

in

also education tantopere

Many received

quaedam also
the

the had

poor

up means

And

has accomplished i

an reason
and but

traveller

is

it

When that and


Persons

matter

his of

no admitted estimating

eius repeated eternal

the protection

To XVI clinging

the to
terrible

and

parents

the Wisdom be

in wrapped

4 the this

Salute or

the

a and
ill him cives

taken will

Tao

not

with to Faith
fixed of

man

social Mystery west

certainly has that

to or

gods is United

not Notices change

to
village

edited

itself every Princesse

able Tiberias

muddy

door nor
274 as

hollowness

which

extending that

has as to

up an

itself tells
the peace head

and that beings

of

of

a or Edward

that
