Power and Sample Size in R, 1st Edition
Reasonable efforts have been made to publish reliable data and information, but the author and
publisher cannot assume responsibility for the validity of all materials or the consequences of their use.
The authors and publishers have attempted to trace the copyright holders of all material reproduced
in this publication and apologize to copyright holders if permission to publish in this form has not
been obtained. If any copyright material has not been acknowledged please write and let us know so
we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access
www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive,
Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact mpkbook-
[email protected]
Trademark notice: Product or corporate names may be trademarks or registered trademarks and are
used only for identification and explanation without intent to infringe.
DOI: 10.1201/9780429488788
Publisher’s note: This book has been prepared from camera-ready copy provided by the authors.
To my husband and children, and all of the students
who have taught me how to teach over the years.
Taylor & Francis
Taylor & Francis Group
http://taylorandfrancis.com
Contents
Preamble
List of Tables
1 Preliminaries
1.1 R implementation
1.2 Probability distributions
1.3 Notation for common distributions
1.4 R functions for common distributions
1.5 Symmetric property of the normal distribution
1.6 Standardizing a normal distribution
Bibliography
Index
Preamble
This book was developed out of a course on the topic of power and sample size
calculation that I teach at the UCLA Jonathan and Karin Fielding School of
Public Health in Los Angeles, California. Through developing and teaching the
course, I found that while there are some excellent reference-style textbooks on
sample size calculation, such as Chow et al. [33], textbooks that were suitable
for teaching a rigorous course or for self-learning and that were also integrated
with software were in short supply. Hence this book was written.
I have attempted to make the material in the book accessible while not
sacrificing rigor. The book assumes the reader understands basic probability
concepts such as random variables, distributions of random variables and
hypothesis testing, although there is a short refresher in Chapter 1. Most
of the book should be accessible to graduate students who have had a one-year
core sequence in statistics or biostatistics, who are the audience for the
UCLA course, as well as scientists and investigators with some understanding
of statistics. However, it is unavoidable that more statistical sophistication
is needed to understand some of the more sophisticated modeling techniques
and study designs. The material gets more challenging starting with Chapter
11 on generalized linear models.
The goal of the book is to emphasize statistical thinking about the factors
that influence power for different study designs and outcomes. Therefore the
book develops many of the mathematical formulas that lie behind sample size
and power calculations rather than simply presenting formulas. This is not
a cookbook of recipes. It is also not intended to be a complete inventory of
sample size and power methods.
The book demonstrates calculations in R using examples. Only a basic
proficiency in R is assumed. We use functions from the powertools package,
which was developed as a companion R package to this book, as well as other
packages. There are many packages for various types of power and sample size
calculations available in R, and we try to summarize available resources in
each chapter. The index includes not only concepts and topics but also the R
functions and packages mentioned in the book.
This book focuses on applications to clinical and health-related research.
However, the methods themselves can be applied in other areas. We focus
on techniques for sample size and power analysis when planning a study in
which the data will be analyzed using traditional “frequentist” methods of
statistical inference rather than, for example, Bayesian methods. Data from
most clinical studies are analyzed using traditional frequentist methods, e.g., t
tests, linear and logistic regression, and linear mixed models. Although there is
a lively discussion in the scientific community about the merits and demerits of
hypothesis testing (see, for example, Wasserstein and Lazar [225]), we sidestep
these issues and assume that the study data will be analyzed using hypothesis
testing or (less commonly) confidence intervals.
I am deeply indebted to the many individuals who have helped me produce
this book and the companion R package. These include Analissa Avila, Zichen
Liu, Kristen McGreevy, Phillip Sundin and Yixuan Zhou. You forever have
my gratitude.
Catherine M. Crespi
List of Figures
2.1 Rejection and acceptance regions and critical values for one- and two-sided tests
2.2 Distribution of test statistic under null and alternative hypotheses for a lower-tailed z test
2.3 Distribution of test statistic under null and alternative hypotheses for an upper-tailed z test
2.4 Distribution of test statistic under null and alternative hypotheses for a two-tailed z test
2.5 Power for two-sided, one-sample z test for range of µA values
6.1 Data set-up and notation for two correlated proportions
17.1 Sample sizes per group for two co-primary endpoints
17.2 Sample sizes per group for two alternative (“at least one”) primary endpoints using Bonferroni adjustment
1 Preliminaries
1.1 R implementation
This book explains power and sample size calculations for many different
study designs and outcomes and demonstrates their implementation in R. We
assume that the reader has a basic proficiency in R. The book uses functions
from the powertools package as well as other packages. To use the functions
in an R package, it is necessary to first install the package and then load it in
the current session using the library function, e.g.,
install.packages("powertools")
library(powertools)
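In a script that is run repeatedly, it can be convenient to install a package only when it is missing. The following is a minimal sketch of that pattern; the helper name load_or_install is hypothetical (it is not part of powertools or base R), and the base package stats is used in the demonstration only so that the example runs without an internet connection.

```r
# Hypothetical helper (an illustration, not part of powertools):
# install a package only if it is not already available, then attach it.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    # Needs internet access and a configured CRAN mirror
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
  # Report (invisibly) whether the package namespace is now loaded
  invisible(pkg %in% loadedNamespaces())
}

# 'stats' ships with R, so no installation is triggered here
load_or_install("stats")

# For the book's companion package one would call:
# load_or_install("powertools")
```

The argument character.only = TRUE is needed because the package name is passed to library as a character string rather than typed literally.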
The reader should be sure to consult the latest R documentation when using
various functions and packages. To see the documentation for the function
ttest.2samp, for example, enter at the R prompt
?ttest.2samp
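Base R offers a few other ways to inspect a function besides the ? shortcut. The sketch below uses power.t.test from the base stats package as a stand-in, purely so that it runs without installing anything; the same calls should apply to ttest.2samp once powertools is loaded, which is an assumption to check against the package documentation.

```r
# help() is the long form of '?'; assigning the result avoids opening the pager
h <- help("power.t.test", package = "stats")
found <- length(h) > 0   # TRUE when a help page exists for the topic

# args() prints the argument list without opening the help page
args(power.t.test)

# formals() returns the arguments as a named list, useful for checking defaults
arg_names <- names(formals(power.t.test))
print(arg_names)

# With powertools installed and loaded, the analogous calls would be:
# help("ttest.2samp", package = "powertools")
# help(package = "powertools")   # index of the package's documented topics
```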
DOI: 10.1201/9780429488788-1