An Introduction to Medical Statistics, 4th Edition
Great Clarendon Street, Oxford, OX2 6DP,
United Kingdom
Oxford University Press is a department of the University of Oxford.
It furthers the University’s objective of excellence in research, scholarship,
and education by publishing worldwide. Oxford is a registered trade mark of
Oxford University Press in the UK and in certain other countries
© Oxford University Press 2015
The moral rights of the author have been asserted
First Edition published in 1987
Second Edition published in 1995
Third Edition published in 2000
Fourth Edition published in 2015
Impression: 1
All rights reserved. No part of this publication may be reproduced, stored in
a retrieval system, or transmitted, in any form or by any means, without the
prior permission in writing of Oxford University Press, or as expressly permitted
by law, by licence or under terms agreed with the appropriate reprographics
rights organization. Enquiries concerning reproduction outside the scope of the
above should be sent to the Rights Department, Oxford University Press, at the
address above
You must not circulate this work in any other form
and you must impose this same condition on any acquirer
Published in the United States of America by Oxford University Press
198 Madison Avenue, New York, NY 10016, United States of America
British Library Cataloguing in Publication Data
Data available
Library of Congress Control Number: 2014959481
ISBN 978–0–19–958992–0
Printed in Italy by
L.E.G.O. S.p.A.
Oxford University Press makes no representation, express or implied, that the
drug dosages in this book are correct. Readers must therefore always check
the product information and clinical procedures with the most up-to-date
published product information and data sheets provided by the manufacturers
and the most recent codes of conduct and safety regulations. The authors and
the publishers do not accept responsibility or legal liability for any errors in the
text or for the misuse or misapplication of material in this work. Except where
otherwise stated, drug dosages and recommendations are for the non-pregnant
adult who is not breast-feeding
Links to third party websites are provided by Oxford in good faith and
for information only. Oxford disclaims any responsibility for the materials
contained in any third party website referenced in this work.
To Emily and Nicholas Bland
Preface to the Fourth Edition
This book is for medical students, doctors, medical researchers, nurses, members
of professions allied to medicine, and all others concerned with medical data.
When I wrote the first edition of An Introduction to Medical Statistics, I based
the contents on the statistical methods which appeared frequently in the Lancet
and the British Medical Journal. I continued to do this with each succeeding
edition. Each time, the range and complexity of the methods used increased. There
are two reasons for this. One is that the size and complexity of medical research
studies has increased greatly and, I think, the quality has increased greatly,
too. The other reason is that developments in computing have enabled
statisticians to develop and bring into use new, computer-intensive methods of
analysis and these have been applied in medical research.

In this fourth edition, I have added new chapters on meta-analysis and on
handling missing data by multiple imputation, methods now seen routinely in major
journals. I have also added a chapter explaining the Bayesian approach to data,
including Markov Chain Monte Carlo methods of analysis. I have added a new
chapter collecting together and expanding the material on time to event or
survival data. I have also added new sections on allocation by minimization,
bootstrap methods, Poisson and negative binomial regression, kappa statistics for
agreement between observers, and the creation of composite scales using principal
components and factor analysis, all things you will see in medical journals.

Apart from changes in the practice of statistics in medicine in general, I hope
that I have changed a bit, too. Since writing the third edition, I have moved to
a different university, where I now spend a lot more time on clinical trials. I
have also spent 6 years on the Clinical Evaluation and Trials Board of the Health
Technology Assessment programme, reading and criticising hundreds of grant
applications. I hope that I have learned something along the way and I have
revised the text accordingly.

I have included some new examples, though many of the old ones remain, being too
good to replace, I thought. I have changed most of the exercises, to remove all
calculations. I never touch a calculator now, so why should my readers? Instead,
I have concentrated on understanding and interpreting analyses. I have dropped
the stars for sections with material which was beyond the undergraduate course. I
no longer teach medical or nursing students and I do not have my finger on that
pulse. All the graphs have been redrawn using Stata12, except for one pie chart,
done using Excel.

This is a book about data, not statistical theory. The fundamental concepts of
study design, data collection, and data analysis are explained by illustration
and example. Only enough mathematics and formulae are given to make clear what is
going on. For those who wish to go a little further in their understanding, some
of the more mathematical background to the techniques described is given as
appendices to the chapters rather than in the main text.

The book is firmly grounded in medical data, particularly in medical research,
and the interpretation of the results of calculations in their medical context is
emphasized. Except for a few obviously invented numbers used to illustrate the
mechanics of calculations, all the data in the examples and exercises are real,
from my own research and statistical consultation or from the medical literature.

There are two kinds of exercise in this book. Each chapter has a set of multiple
choice questions of the ‘true or false’ type, 122 in all. Multiple choice
questions can cover a large amount of material in a short time, so are a useful
tool for revision. As MCQs are widely used in postgraduate examinations, these
exercises should also be useful to those preparing for memberships. All the MCQs
have solutions, with reference to an appropriate part of the text or a detailed
explanation for most of the answers. Each chapter also has a long exercise, also
with suggested answers, mostly on the interpretation of data in published
studies.

I wish to thank many people who have contributed to the writing of this book.
First, there are the many medical students, doctors, research workers, nurses,
physiotherapists, and radiographers whom it has been my pleasure to teach, and
from whom I have learned so much. Second, the book contains many examples drawn
from research carried out with other statisticians, epidemiologists, and social
scientists, particularly Douglas Altman, Ross Anderson, Mike Banks, Barbara
Butland, Beulah Bewley, Nicky Cullum, Jo Dumville, Walter Holland, and David
Torgerson. These studies could not have been done without the assistance of Patsy
Bailey, Bob Harris, Rebecca McNair, Janet Peacock, Swatee Patel, and Virginia
Pollard. Third, the clinicians and scientists with whom I have collaborated or
who have come to me for statistical advice not only taught me about medical data
but many of them have left me with data which are used here, including Naib
Al-Saady, Thomas Bewley, Frances Boa, Nigel Brown, Jan Davies, Caroline Flint,
Nick Hall, Tessi Hanid, Michael Hutt, Riahd Jasrawi, Ian Johnston, Moses
Kapembwa, Pam Luthra, Hugh Mather, Daram Maugdal, Douglas Maxwell, Georgina
Morris, Charles Mutoka, Tim Northfield, Andreas Papadopoulos, Mohammed Raja, Paul
Richardson, and Alberto Smith. I am particularly indebted to John Morgan, as
Chapter 21 is partly based on his work.

I thank Douglas Altman, Daniel Heitjan, David Jones, Klim McPherson, Janet
Peacock, Stuart Pocock, and Robin Prescott for their helpful comments on earlier
drafts and Dan Heitjan for finding mistakes in this one. I am very grateful to
Julian Higgins and Simon Crouch for their comments on my new chapters on
meta-analysis and Bayesian methods, respectively. I am grateful to John Blase for
help with converting my only Excel graphic. I have corrected a number of errors
from earlier editions, and I am grateful to colleagues who have pointed them out
to me. Most of all I thank Pauline Bland for her unfailing confidence and
encouragement.

Since the last edition of this book, my children, Nick and Em, have grown up and
have both become health researchers. It is to them I dedicate this fourth
edition.

M.B.
York, April 2015
Contents
Detailed Contents xi
Chapter 1 Introduction 1
Chapter 2 The design of experiments 5
Chapter 3 Sampling and observational studies 25
Chapter 4 Summarizing data 41
Chapter 5 Presenting data 57
Chapter 6 Probability 73
Chapter 7 The Normal distribution 85
Chapter 8 Estimation 101
Chapter 9 Significance tests 115
Chapter 10 Comparing the means of small samples 131
Chapter 11 Regression and correlation 159
Chapter 12 Methods based on rank order 177
Chapter 13 The analysis of cross-tabulations 193
Chapter 14 Choosing the statistical method 213
Chapter 15 Multifactorial methods 223
Chapter 16 Time to event data 251
Chapter 17 Meta-analysis 265
Chapter 18 Determination of sample size 295
Chapter 19 Missing data 305
Chapter 20 Clinical measurement 313
Chapter 21 Mortality statistics and population structure 347
Chapter 22 The Bayesian approach 357
Appendix 1: Suggested answers to multiple choice questions and exercises 367
References 397
Index 411
Detailed Contents
Chapter 1 Introduction 1
1.1 Statistics and medicine 1
1.2 Statistics and mathematics 1
1.3 Statistics and computing 2
1.4 Assumptions and approximations 2
1.5 The scope of this book 3
Chapter 2 The design of experiments 5
2.1 Comparing treatments 5
2.2 Random allocation 6
2.3 Stratification 10
2.4 Methods of allocation without random numbers 10
2.5 Volunteer bias 12
2.6 Intention to treat 13
2.7 Cross-over designs 13
2.8 Selection of subjects for clinical trials 15
2.9 Response bias and placebos 15
2.10 Assessment bias and double blind studies 17
2.11 Laboratory experiments 18
2.12 Experimental units and cluster randomized trials 18
2.13 Consent in clinical trials 20
2.14 Minimization 21
2.15 Multiple choice questions: Clinical trials 23
2.16 Exercise: The ‘Know Your Midwife’ trial 23
Chapter 3 Sampling and observational studies 25
3.1 Observational studies 25
3.2 Censuses 26
3.3 Sampling 26
3.4 Random sampling 27
3.5 Sampling in clinical and epidemiological studies 29
3.6 Cross-sectional studies 31
3.7 Cohort studies 32
3.8 Case–control studies 33
3.9 Questionnaire bias in observational studies 35
3.10 Ecological studies 36
3.11 Multiple choice questions: Observational studies 37
3.12 Exercise: Campylobacter jejuni infection 38
Chapter 4 Summarizing data 41
4.1 Types of data 41
4.2 Frequency distributions 41
4.3 Histograms and other frequency graphs 44
4.4 Shapes of frequency distribution 47
4.5 Medians and quantiles 49
4.6 The mean 50
4.7 Variance, range, and interquartile range 51
4.8 Standard deviation 52
4.9 Multiple choice questions: Summarizing data 53
4.10 Exercise: Student measurements and a graph of study numbers 54
Appendix 4A: The divisor for the variance 55
Appendix 4B: Formulae for the sum of squares 56
Chapter 5 Presenting data 57
5.1 Rates and proportions 57
5.2 Significant figures 58
5.3 Presenting tables 60
5.4 Pie charts 61
5.5 Bar charts 61
5.6 Scatter diagrams 63
5.7 Line graphs and time series 65
5.8 Misleading graphs 66
5.9 Using different colours 68
5.10 Logarithmic scales 68
5.11 Multiple choice questions: Data presentation 69
5.12 Exercise: Creating presentation graphs 70
Appendix 5A: Logarithms 70
Chapter 6 Probability 73
6.1 Probability 73
6.2 Properties of probability 73
6.3 Probability distributions and random variables 74
6.4 The Binomial distribution 75
6.5 Mean and variance 77
6.6 Properties of means and variances 77
6.7 The Poisson distribution 79
6.8 Conditional probability 79
6.9 Multiple choice questions: Probability 81
6.10 Exercise: Probability in court 81
Appendix 6A: Permutations and combinations 82
Appendix 6B: Expected value of a sum of squares 82
Chapter 7 The Normal distribution 85
7.1 Probability for continuous variables 85
7.2 The Normal distribution 86
7.3 Properties of the Normal distribution 89
7.4 Variables which follow a Normal distribution 92
7.5 The Normal plot 93
7.6 Multiple choice questions: The Normal distribution 96
7.7 Exercise: Distribution of some measurements obtained by students 97
Appendix 7A: Chi-squared, t, and F 98
Chapter 8 Estimation 101
8.1 Sampling distributions 101
8.2 Standard error of a sample mean 102
8.3 Confidence intervals 104
8.4 Standard error and confidence interval for a proportion 105
8.5 The difference between two means 105
8.6 Comparison of two proportions 106
8.7 Number needed to treat 108
8.8 Standard error of a sample standard deviation 109
8.9 Confidence interval for a proportion when numbers are small 109
8.10 Confidence interval for a median and other quantiles 110
8.11 Bootstrap or resampling methods 111
8.12 What is the correct confidence interval? 112
8.13 Multiple choice questions: Confidence intervals 112
8.14 Exercise: Confidence intervals in two acupuncture studies 113
Appendix 8A: Standard error of a mean 114
Chapter 9 Significance tests 115
9.1 Testing a hypothesis 115
9.2 An example: the sign test 116
9.3 Principles of significance tests 116
9.4 Significance levels and types of error 117
9.5 One and two sided tests of significance 118
9.6 Significant, real, and important 119
9.7 Comparing the means of large samples 120
9.8 Comparison of two proportions 121
9.9 The power of a test 122
9.10 Multiple significance tests 123
9.11 Repeated significance tests and sequential analysis 125
9.12 Significance tests and confidence intervals 126
9.13 Multiple choice questions: Significance tests 126
9.14 Exercise: Crohn’s disease and cornflakes 127
Chapter 10 Comparing the means of small samples 131
10.1 The t distribution 131
10.2 The one sample t method 134
10.3 The means of two independent samples 136
10.4 The use of transformations 138
10.5 Deviations from the assumptions of t methods 141
10.6 What is a large sample? 142
10.7 Serial data 142
10.8 Comparing two variances by the F test 144
10.9 Comparing several means using analysis of variance 145
10.10 Assumptions of the analysis of variance 147
10.11 Comparison of means after analysis of variance 148
10.12 Random effects in analysis of variance 150
10.13 Units of analysis and cluster randomized trials 152
10.14 Multiple choice questions: Comparisons of means 153
10.15 Exercise: Some analyses comparing means 155
Appendix 10A: The ratio mean/standard error 156
Chapter 11 Regression and correlation 159
11.1 Scatter diagrams 159
11.2 Regression 160
11.3 The method of least squares 160
11.4 The regression of X on Y 162
11.5 The standard error of the regression coefficient 163
11.6 Using the regression line for prediction 164
11.7 Analysis of residuals 165
11.8 Deviations from assumptions in regression 166
11.9 Correlation 167
11.10 Significance test and confidence interval for r 169
11.11 Uses of the correlation coefficient 170
11.12 Using repeated observations 171
11.13 Intraclass correlation 172
11.14 Multiple choice questions: Regression and correlation 173
11.15 Exercise: Serum potassium and ambient temperature 174
Appendix 11A: The least squares estimates 174
Appendix 11B: Variance about the regression line 175
Appendix 11C: The standard error of b 175
Chapter 12 Methods based on rank order 177
12.1 Non-parametric methods 177
12.2 The Mann–Whitney U test 177
12.3 The Wilcoxon matched pairs test 182
12.4 Spearman’s rank correlation coefficient, ρ 185
12.5 Kendall’s rank correlation coefficient, τ 187
12.6 Continuity corrections 188
12.7 Parametric or non-parametric methods? 189
12.8 Multiple choice questions: Rank-based methods 190
12.9 Exercise: Some applications of rank-based methods 190
Chapter 13 The analysis of cross-tabulations 193
13.1 The chi-squared test for association 193
13.2 Tests for 2 by 2 tables 195
13.3 The chi-squared test for small samples 196
13.4 Fisher’s exact test 197
13.5 Yates’ continuity correction for the 2 by 2 table 199
13.6 The validity of Fisher’s and Yates’ methods 199
13.7 Odds and odds ratios 200
13.8 The chi-squared test for trend 202
13.9 Methods for matched samples 204
13.10 The chi-squared goodness of fit test 205
13.11 Multiple choice questions: Categorical data 207
13.12 Exercise: Some analyses of categorical data 208
Appendix 13A: Why the chi-squared test works 209
Appendix 13B: The formula for Fisher’s exact test 210
Appendix 13C: Standard error for the log odds ratio 211
Chapter 14 Choosing the statistical method 213
14.1 Method oriented and problem oriented teaching 213
14.2 Types of data 213
14.3 Comparing two groups 214
14.4 One sample and paired samples 215
14.5 Relationship between two variables 216
14.6 Multiple choice questions: Choice of statistical method 218
14.7 Exercise: Choosing a statistical method 218
Chapter 15 Multifactorial methods 223
15.1 Multiple regression 223
15.2 Significance tests and estimation in multiple regression 225
15.3 Using multiple regression for adjustment 227
15.4 Transformations in multiple regression 228
15.5 Interaction in multiple regression 230
15.6 Polynomial regression 231
15.7 Assumptions of multiple regression 232
15.8 Qualitative predictor variables 233
15.9 Multi-way analysis of variance 234
15.10 Logistic regression 237
15.11 Stepwise regression 239
15.12 Seasonal effects 239
15.13 Dealing with counts: Poisson regression and negative binomial regression 240
15.14 Other regression methods 244
15.15 Data where observations are not independent 244
15.16 Multiple choice questions: Multifactorial methods 245
15.17 Exercise: A multiple regression analysis 246
Chapter 16 Time to event data 251
16.1 Time to event data 251
16.2 Kaplan–Meier survival curves 251
16.3 The logrank test 256
16.4 The hazard ratio 258
16.5 Cox regression 259
16.6 Multiple choice questions: Time to event data 261
16.7 Exercise: Survival after retirement 263
Chapter 17 Meta-analysis 265
17.1 What is a meta-analysis? 265
17.2 The forest plot 265
17.3 Getting a pooled estimate 267
17.4 Heterogeneity 268
17.5 Measuring heterogeneity 268
17.6 Investigating sources of heterogeneity 270
17.7 Random effects models 272
17.8 Continuous outcome variables 274
17.9 Dichotomous outcome variables 279
17.10 Time to event outcome variables 282
17.11 Individual participant data meta-analysis 283
17.12 Publication bias 284
17.13 Network meta-analysis 289
17.14 Multiple choice questions: Meta-analysis 290
17.15 Exercise: Dietary sugars and body weight 292
Chapter 18 Determination of sample size 295
18.1 Estimation of a population mean 295
18.2 Estimation of a population proportion 296
18.3 Sample size for significance tests 296
18.4 Comparison of two means 297
18.5 Comparison of two proportions 299
18.6 Detecting a correlation 300
18.7 Accuracy of the estimated sample size 301
18.8 Trials randomized in clusters 302
18.9 Multiple choice questions: Sample size 303
18.10 Exercise: Estimation of sample sizes 304
Chapter 19 Missing data 305
19.1 The problem of missing data 305
19.2 Types of missing data 306
19.3 Using the sample mean 307
19.4 Last observation carried forward 307
19.5 Simple imputation 308
19.6 Multiple imputation 309
19.7 Why we should not ignore missing data 310
19.8 Multiple choice questions: Missing data 311
19.9 Exercise: Last observation carried forward 312
Chapter 20 Clinical measurement 313
20.1 Making measurements 313
20.2 Repeatability and measurement error 315
20.3 Assessing agreement using Cohen’s kappa 317
20.4 Weighted kappa 322
20.5 Comparing two methods of measurement 324
20.6 Sensitivity and specificity 326
20.7 Normal range or reference interval 329
20.8 Centile charts 331
20.9 Combining variables using principal components analysis 332
20.10 Composite scales and subscales 335
20.11 Internal consistency of scales and Cronbach’s alpha 341
20.12 Presenting composite scales 341
20.13 Multiple choice questions: Measurement 342
20.14 Exercise: Two measurement studies 344