This is an outstanding introduction to the topic of multilevel modeling. The new edition
is even more detailed with key chapter revisions and additions, all combined with
insightful computer-based examples and discussions. It is an excellent resource for
anyone wanting to learn about multilevel analysis.
—George A. Marcoulides, University of California, Santa Barbara

This is a comprehensive book that takes the reader from the basics of multilevel
modeling through to advanced extensions into models used for meta-analysis and
survival analysis. It also describes the links with structural equation modeling and
other latent models such as path models and factor analysis models. The book offers a
great exposition of both the models and the estimation methods used to fit them and is
accessible and links each chapter well to available software for the models described.
The book also covers topics such as Bayesian estimation and power calculations in
the multilevel setting. This edition is a valuable addition to the multilevel modeling
literature.
—William Browne, Centre for Multilevel Modelling, University of Bristol

This book has been a staple in my research diet. The author team is at the developing
edge of multilevel modeling and as they state about multilevel analysis in general,
‘both the statistical techniques and the software tools are evolving rapidly.’ Their book
is the perfect melding of being an introduction to multilevel modeling as well as a
researcher’s resource when it comes to the recent advances (e.g., Bayesian multilevel
modeling, bootstrap estimation). It’s clearly written. With a light and unpretentious
voice, the book narrative is not only accessible, it is also inviting.
—Todd D. Little, Director and Founder, Institute for Measurement, Methodology,
Analysis, and Policy, Texas Tech University; Director and Founder of Stats Camp
Multilevel Analysis

Applauded for its clarity, this accessible introduction helps readers apply multilevel
techniques to their research. The book also includes advanced extensions, making it useful
as both an introduction for students and as a reference for researchers. Basic models and
examples are discussed in nontechnical terms with an emphasis on understanding the
methodological and statistical issues involved in using these models. The estimation and
interpretation of multilevel models is demonstrated using realistic examples from various
disciplines including psychology, education, public health, and sociology. Readers are
introduced to a general framework for multilevel modeling that covers both observed and
latent variables in the same model, whereas most other books focus on observed variables. In
addition, Bayesian estimation is introduced and applied using accessible software.

Joop J. Hox is Emeritus Professor of Social Science Methodology at Utrecht University,
the Netherlands.

Mirjam Moerbeek is Associate Professor of Statistics for the Social Sciences at Utrecht
University, the Netherlands.

Rens van de Schoot is Associate Professor of Bayesian Statistics at Utrecht University,
the Netherlands, and Extra-Ordinary Professor at North-West University, South Africa.
Quantitative Methodology Series

George A. Marcoulides, Series Editor

This series presents methodological techniques to investigators and students. The goal is
to provide an understanding and working knowledge of each method with a minimum of
mathematical derivations. Each volume focuses on a specific method (e.g. factor analysis,
multilevel analysis, structural equation modeling).

Proposals are invited from interested authors. Each proposal should consist of: a brief
description of the volume’s focus and intended market; a table of contents with an outline of
each chapter; and a curriculum vitae. Materials may be sent to Dr. George A. Marcoulides,
University of California – Santa Barbara, [email protected].

Published titles
Marcoulides • Modern Methods for Business Research
Marcoulides/Moustaki • Latent Variable and Latent Structure Models
Heck • Studying Educational and Social Policy: Theoretical Concepts and
Research Methods
van der Ark/Croon/Sijtsma • New Developments in Categorical Data
Analysis for the Social and Behavioral Sciences
Duncan/Duncan/Strycker • An Introduction to Latent Variable Growth
Curve Modeling: Concepts, Issues, and Applications, Second Edition
Cardinet/Johnson/Pini • Applying Generalizability Theory Using EduG
Creemers/Kyriakides/Sammons • Methodological Advances in Educational
Effectiveness Research
Heck/Thomas/Tabata • Multilevel Modeling of Categorical Outcomes
Using IBM SPSS
Heck/Thomas/Tabata • Multilevel and Longitudinal Modeling with IBM
SPSS, Second Edition
McArdle/Ritschard • Contemporary Issues in Exploratory Data Mining in
the Behavioral Sciences
Heck/Thomas • An Introduction to Multilevel Modeling Techniques: MLM
and SEM Approaches Using Mplus, Third Edition
Hox/Moerbeek/van de Schoot • Multilevel Analysis: Techniques and
Applications, Third Edition
Multilevel Analysis

Techniques and Applications

Third Edition

Joop J. Hox, Mirjam Moerbeek,
Rens van de Schoot
Third edition published 2018
by Routledge
711 Third Avenue, New York, NY 10017

and by Routledge
2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN

Routledge is an imprint of the Taylor & Francis Group, an informa business

© 2018 Taylor & Francis

The right of Joop J. Hox, Mirjam Moerbeek, and Rens van de Schoot to be
identified as authors of this work has been asserted by them in accordance
with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this book may be reprinted or reproduced or
utilised in any form or by any electronic, mechanical, or other means, now
known or hereafter invented, including photocopying and recording, or in
any information storage or retrieval system, without permission in writing
from the publishers.

Trademark notice: Product or corporate names may be trademarks or
registered trademarks, and are used only for identification and explanation
without intent to infringe.

First edition published by Lawrence Erlbaum Associates, Publishers 2002

Second edition published by Routledge 2010

Library of Congress Cataloging-in-Publication Data


Names: Hox, J. J., author. | Moerbeek, Mirjam, 1973- author. | Schoot, Rens
van de, author.
Title: Multilevel analysis : techniques and applications / Joop J. Hox,
Mirjam Moerbeek, Rens van de Schoot.
Description: Third edition. | New York, NY : Routledge, 2017. |
Series: Quantitative methodology series | Includes bibliographical references
and index.
Identifiers: LCCN 2017013032| ISBN 9781138121409 (hard back :
alk. paper) | ISBN 9781138121362 (paper back : alk. paper) | ISBN
9781315650982 (ebook)
Subjects: LCSH: Social sciences--Statistical methods. | Analysis of variance.
| Regression analysis.
Classification: LCC HA29 .H783 2017 | DDC 001.4/22--dc23
LC record available at https://lccn.loc.gov/2017013032

ISBN: 978-1-138-12140-9 (hbk)
ISBN: 978-1-138-12136-2 (pbk)
ISBN: 978-1-315-65098-2 (ebk)

Typeset in Times New Roman by HWA Text and Data Management, London

Visit the companion website: www.routledge.com/cw/hox


Contents

Preface

1. Introduction to Multilevel Analysis
   1.1 Aggregation and Disaggregation
   1.2 Why Do We Need Special Multilevel Analysis Techniques?
   1.3 Multilevel Theories
   1.4 Estimation and Software

2. The Basic Two-Level Regression Model
   2.1 Example
   2.2 An Extended Example
   2.3 Three- and More-Level Regression Models
   2.4 Notation and Software

3. Estimation and Hypothesis Testing in Multilevel Regression
   3.1 Which Estimation Method?
   3.2 Bayesian Methods
   3.3 Bootstrapping
   3.4 Significance Testing and Model Comparison
   3.5 Software

4. Some Important Methodological and Statistical Issues
   4.1 Analysis Strategy
   4.2 Centering and Standardizing Explanatory Variables
   4.3 Interpreting Interactions
   4.4 How Much Variance Is Explained?
   4.5 Multilevel Mediation and Higher-Level Outcomes
   4.6 Missing Data in Multilevel Analysis
   4.7 Software

5. Analyzing Longitudinal Data
   5.1 Introduction
   5.2 Fixed and Varying Occasions
   5.3 Example with Fixed Occasions
   5.4 Example with Varying Occasions
   5.5 Advantages of Multilevel Analysis for Longitudinal Data
   5.6 Complex Covariance Structures
   5.7 Statistical Issues in Longitudinal Analysis
   5.8 Software

6. The Multilevel Generalized Linear Model for Dichotomous Data and Proportions
   6.1 Generalized Linear Models
   6.2 Multilevel Generalized Linear Models
   6.3 Example: Analyzing Dichotomous Data
   6.4 Example: Analyzing Proportions
   6.5 The Ever-Changing Latent Scale: Comparing Coefficients and Explained Variances
   6.6 Interpretation
   6.7 Software

7. The Multilevel Generalized Linear Model for Categorical and Count Data
   7.1 Ordered Categorical Data
   7.2 Count Data
   7.3 Explained Variance in Ordered Categorical and Count Data
   7.4 The Ever-Changing Latent Scale, Again
   7.5 Software

8. Multilevel Survival Analysis
   8.1 Survival Analysis
   8.2 Multilevel Survival Analysis
   8.3 Multilevel Ordinal Survival Analysis
   8.4 Software

9. Cross-Classified Multilevel Models
   9.1 Introduction
   9.2 Example of Cross-Classified Data: Pupils Nested Within (Primary and Secondary) Schools
   9.3 Example of Cross-Classified Data: Sociometric Ratings in Small Groups
   9.4 Software

10. Multivariate Multilevel Regression Models
   10.1 The Multivariate Model
   10.2 Example of Multivariate Multilevel Analysis: Multiple Response Variables
   10.3 Example of Multivariate Multilevel Analysis: Measuring Group Characteristics

11. The Multilevel Approach to Meta-Analysis
   11.1 Meta-Analysis and Multilevel Modeling
   11.2 The Variance-Known Model
   11.3 Example and Comparison with Classical Meta-Analysis
   11.4 Correcting for Artifacts
   11.5 Multivariate Meta-Analysis
   11.6 Software

12. Sample Sizes and Power Analysis in Multilevel Regression
   12.1 Sample Size and Accuracy of Estimates
   12.2 Power Analysis
   12.3 Methods for Randomized Controlled Trials
   12.4 Methods for Observational Studies
   12.5 Methods for Meta-Analysis
   12.6 Software for Power Analysis

13. Assumptions and Robust Estimation Methods
   13.1 Introduction
   13.2 Example Data and Some Issues with Non-Normality
   13.3 Checking Assumptions: Inspecting Residuals
   13.4 The Profile Likelihood Method
   13.5 Robust Standard Errors
   13.6 Multilevel Bootstrapping
   13.7 Bayesian Estimation Methods
   13.8 Software

14. Multilevel Factor Models
   14.1 Introduction
   14.2 The Within and Between Approach
   14.3 Full Maximum Likelihood Estimation
   14.4 An Example of Multilevel Factor Analysis
   14.5 Standardizing Estimates in Multilevel Structural Equation Modeling
   14.6 Goodness of Fit in Multilevel Structural Equation Modeling
   14.7 Software

15. Multilevel Path Models
   15.1 Example of a Multilevel Path Analysis
   15.2 Statistical and Software Issues

16. Latent Curve Models
   16.1 Introduction
   16.2 Example of Latent Curve Modeling
   16.3 A Comparison of Multilevel Regression Analysis and Latent Curve Modeling
   16.4 Software

Appendix A: Checklist for Multilevel Reporting
Appendix B: Aggregating and Disaggregating
Appendix C: Recoding Categorical Data
Appendix D: Constructing Orthogonal Polynomials
Appendix E: Data and Stories
References
Index
Preface

To err is human, to forgive divine;
but to include errors into your design is statistical.
—Leslie Kish

This book is intended as an introduction to multilevel analysis for students and researchers.
The term ‘multilevel’ refers to a hierarchical or nested data structure, usually subjects within
organizational groups, but the nesting may also consist of repeated measures within subjects,
or respondents within clusters, as in cluster sampling. The expression multilevel model is
used as a generic term for all models for nested data. Multilevel analysis is used to examine
relations between variables measured at different levels of the multilevel data structure.
This book presents two types of multilevel model in detail: the multilevel regression model
and the multilevel structural equation model. Although multilevel analysis is used in many
research fields, the examples in this book are mainly from the social and behavioral sciences.
In the past decades, multilevel analysis software has become available that is both
powerful and accessible, either as special packages or as part of a general software package.
In addition, several handbooks have been published, including the earlier editions of this
book. There is a continuing interest in multilevel analysis, as evidenced by the appearance
of several reviews and monographs, applications in different fields ranging from psychology
and sociology to education and medicine, a thriving Internet discussion list with more than
1400 subscribers, and a biennial International Multilevel Conference that has been running
for more than 20 years. The view of ‘multilevel analysis’ applying to individuals nested
within groups has changed to a view that multilevel models and analysis software offer a
very flexible way to model complex data. Thus, multilevel modeling has contributed to the
analysis of traditional individuals within groups data, repeated measures and longitudinal
data, sociometric modeling, twin studies, meta-analysis and analysis of cluster randomized
trials.
This book treats two classes of multilevel models: multilevel regression models, and
multilevel structural equation models (MSEM).
Multilevel regression models are essentially a multilevel version of the familiar multiple
regression model. As Cohen and Cohen (1983), Pedhazur (1997) and others have shown, the
multiple regression model is very versatile. Using dummy coding for categorical variables,
it can be used to analyze analysis of variance (ANOVA) type models, as well as the more
usual multiple regression models. Since the multilevel regression model is an extension
of the classical multiple regression model, it too can be used in a wide variety of research
problems.
Chapter 2 of this book contains a basic introduction to the multilevel regression model,
also known as the hierarchical linear model, or the random coefficient model. Chapter 3 and
Chapter 4 discuss estimation procedures, and a number of important methodological and
statistical issues. They also discuss some technical issues that are not specific to multilevel
regression analysis, such as centering of predictors and interpreting interactions.
Chapter 5 introduces the multilevel regression model for longitudinal data. The model is
a straightforward extension of the standard multilevel regression model, but there are some
specific complications, such as autocorrelated errors, which are discussed.
Chapter 6 treats the generalized linear model for dichotomous data and proportions.
When the response (dependent) variable is dichotomous or a proportion, standard regression
models should not be used. This chapter discusses the multilevel version of the logistic and
the probit regression model.
Chapter 7 extends the generalized linear model introduced in Chapter 6 to analyze data
that are ordered categorically and to data that are counts of events. In the context of counts,
it presents models that take an overabundance of zeros into account.
Chapter 8 introduces multilevel modeling of survival or event history data. Survival
models are for data where the outcome is the occurrence or non-occurrence of a certain
event, in a certain observation period. If the event has not occurred when the observation
period ends, the outcome is said to be censored, since we do not know whether or not the
event has taken place after the observation period ended.
Chapter 9 discusses cross-classified models. Some data are multilevel in nature, but
do not have a neat hierarchical structure. Examples are longitudinal school research
data, where pupils are nested within schools, but may switch to a different school in later
measurements, and sociometric choice data. Multilevel models for such cross-classified
data can be formulated, and estimated with standard software provided that it can handle
restrictions on estimated parameters.
Chapter 10 discusses multilevel regression models for multivariate outcomes. These can
also be used to assess the reliability of multilevel measurements.
Chapter 11 describes a variant of the multilevel regression model that can be used in
meta-analysis. It resembles the weighted regression model often recommended for meta-
analysis. Using standard multilevel regression procedures, it is a flexible analysis tool,
especially when the meta-analysis includes multivariate outcomes.
Chapter 12 deals with the sample size needed for multilevel modeling, and the problem
of estimating the power of an analysis given a specific sample size. An obvious complication
in multilevel power analysis is that there are different sample sizes at the distinct levels
which should be taken into account.

Chapter 13 discusses the statistical assumptions made and presents some ways to check
these. It also discusses more robust estimation methods, such as the profile likelihood
method and robust standard errors for establishing confidence intervals, and multilevel
bootstrap methods for estimating bias-corrected point-estimates and confidence intervals.
This chapter also contains an introduction into Bayesian (MCMC) methods for estimation
and inference.
Multilevel structural equation models (MSEM) are a powerful tool for the analysis of
multilevel data. Recent versions of structural equation modeling software such as LISREL
and Mplus include at least some multilevel features. The general statistical model for
multilevel covariance structure analysis is quite complicated. Chapter 14 describes two
different approaches to estimation in multilevel confirmatory factor analysis. In addition,
it deals with issues of calculating standardized coefficients and goodness-of-fit indices in
multilevel structural models. Chapter 15 extends this to multilevel path models.
Chapter 16 describes structural models for latent curve analysis. This is an SEM
approach to analyzing longitudinal data, which is very similar to the multilevel regression
models treated in Chapter 5.
This book is intended as an introduction to the world of multilevel analysis. Most of
the chapters on multilevel regression analysis should be readable by social and behavioral
scientists who have a good general knowledge of analysis of variance and classical multiple
regression analysis. Some of these chapters contain material that is more difficult, but these
are generally a discussion of specialized problems, which can be skipped at first reading.
An example is the chapter on longitudinal models, which contains a long discussion of
techniques to model specific structures for the covariances between adjacent time points.
This discussion is not needed in understanding the essentials of multilevel analysis
of longitudinal data, but it may become important when one is actually analyzing such
data. The chapters on multilevel structural equation modeling obviously require a strong
background in multivariate statistics and some background in structural equation modeling,
equivalent to, for example, the material covered in Tabachnick and Fidell’s (2013) book on
multivariate analysis. On the other hand, in addition to an adequate background in structural
equation modeling, the chapters on multilevel structural equation modeling do not require
knowledge of advanced mathematical statistics. In all these cases, we have tried to keep
the discussion of the more advanced statistical techniques theoretically sound, but non-
technical.
In addition to its being an introduction, this book describes many extensions and special
applications. As an introduction, it is useable in courses on multilevel modeling in a variety
of social and behavioral fields, such as psychology, education, sociology, and business.
The various extensions and special applications also make it useful to researchers who
work in applied or theoretical research, and to methodologists who have to consult with
these researchers. The basic models and examples are discussed in non-technical terms; the
emphasis is on understanding the methodological and statistical issues involved in using
these models. Some of the extensions and special applications contain discussions that are
more technical, either because that is necessary for understanding what the model does,
or as a helpful introduction to more advanced treatments in other texts. Thus, in addition
to its role as an introduction, the book should be useful as a standard reference for a large
variety of applications. The chapters that discuss specialized problems, such as the chapter
on cross-classified data, the meta-analysis chapter, and the chapter on advanced issues in
estimation and testing, can be skipped entirely if preferred.

New to this edition

One important change compared to the second edition is the introduction of two co-authors.
This reflects the expansion of multilevel analysis; the field has become so broad that it is
virtually impossible for a single author to keep up with the new developments, both in
statistical theory and in software.
Compared to the second edition, some chapters have changed much, while other
chapters have mostly been updated to reflect recent developments in statistical research and
software development. One important development is increased use of Bayesian estimation
and development of robust maximum likelihood estimation. We have chosen not to add a
separate chapter on Bayesian estimation; instead, Bayesian estimation is discussed in those
places where its use improves estimation. The chapters on multilevel logistic regression and
on multilevel ordered regression have been expanded with a better treatment of the linked
problems of latent scale and explained variance. In multilevel structural equation modeling
(MSEM) the developments have been so fast that the chapters on multilevel confirmatory
factor analysis and on multilevel path analysis have been significantly revised, in part by
removing discussion of estimation methods that are now clearly outdated. The chapter on
sample size and power and the chapter on multilevel survival analysis have been extensively
rewritten.
An updated website (https://multilevel-analysis.sites.uu.nl/) holds the data sets for all
the text examples formatted using the latest versions of SPSS, HLM, MLwiN and Mplus,
plus some software introductions with updated screen shots for each of these programs. Most
analyses in this book can be carried out by any multilevel regression program, although the
majority of the multilevel regression analyses were carried out in HLM and MLwiN. The
multilevel SEM analyses all use Mplus. System files and setups using these packages are
also available at the website.
Some of the example data are real, while others have been simulated especially for this
book. The data sets are quite varied so as to appeal to those in several disciplines, including
education, sociology, psychology, family studies, medicine, and nursing; Appendix E
describes the various data sets used in this book in detail. Further example data will be
added to the website for use in computer labs.

Acknowledgments

We thank Dick Carpenter, Lawrence DeCarlo, Brian Gray, Ellen Hamaker, Don Hedeker,
Peter van der Heijden, Herbert Hoijtink, Suzanne Jak, Bernet Sekasanvu Kato, Edith de
Leeuw, Cora Maas, George Marcoulides, Cameron McIntosh, Herb Marsh, Allison O’Mara,
Ian Plewis, Ken Rowe, Elif Unal, Godfried van den Wittenboer, and Bill Yeaton for their
comments on the manuscript of the current book or on earlier editions. Their critical
comments still shape this book. We also thank numerous students for the feedback they
gave us in our multilevel courses.
We thank our colleagues at the Department of Methodology and Statistics of the Faculty
of Social Sciences at Utrecht University for providing us with many discussions and a
generally stimulating research environment. Our research has also benefited from the
lively discussions by the denizens of the Internet Multilevel Modeling and the Structural
Equations Modeling (SEMNET) discussion lists.
We also express our gratitude to the reviewers that reviewed our proposal for the new
edition. They provided valuable feedback on the contents and the structure of the proposed
book.
As always, any errors remaining in the book are entirely our own responsibility. We
appreciate hearing about them, and will keep a list of errata on the homepage of this book.

Joop J. Hox
Mirjam Moerbeek
Rens van de Schoot

Utrecht, August 2017


1
Introduction to Multilevel Analysis

Summary

Social research regularly addresses problems that concern the relationship between
individuals and the social contexts in which they live, work, or learn. The general concept
is that individuals interact with the social contexts to which they belong, that individual
persons are influenced by the contexts or groups to which they belong, and that those groups
are in turn influenced by the individuals who make up that group. The individuals and
the social groups are conceptualized as a hierarchical system of individuals nested within
groups, with individuals and groups defined at separate levels of this hierarchical system.
Naturally, such systems can be observed at different hierarchical levels, and variables may
be defined at each level. This leads to research into the relationships between variables
characterizing individuals and variables characterizing groups, a kind of research that is
generally referred to as ‘multilevel research’.
In multilevel research, the data structure in the population is hierarchical, and the sample
data are a sample from this hierarchical population. For example, in educational research,
the population typically consists of classes and pupils within these classes, with classes
organized within schools. The sampling procedure often proceeds in successive stages: first,
we take a sample of schools, next we take a sample of classes within each sampled school,
and finally we take a sample of pupils within each sampled class. Of course, in real research
one may have a convenience sample of schools, or one may decide not to sample pupils but to
study all available pupils in each class. Nevertheless, one should keep firmly in mind that the
central statistical model in multilevel analysis is one of successive sampling from each level
of a hierarchical population.
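This successive-sampling idea can be sketched in a few lines of general-purpose code. The sketch below is a hypothetical illustration in plain Python (the population structure, sample sizes, and function name are all invented, and no multilevel package is assumed): stage one samples schools, stage two samples pupils within each sampled school.

```python
import random

random.seed(1)

# Hypothetical population: 20 schools with 30 pupils each,
# where each pupil is identified by a (school, pupil) pair.
population = {s: [(s, p) for p in range(30)] for s in range(20)}

def two_stage_sample(population, n_schools, n_pupils):
    """Stage 1: sample schools; stage 2: sample pupils within each school."""
    schools = random.sample(sorted(population), n_schools)
    return [pupil
            for s in schools
            for pupil in random.sample(population[s], n_pupils)]

sample = two_stage_sample(population, n_schools=5, n_pupils=10)
print(len(sample))                  # 50 observations in total
print(len({s for s, _ in sample}))  # but they cluster in only 5 schools
```

The point of the sketch is that the resulting observations are not independent draws from the pupil population: all 50 sampled pupils come from just 5 schools, which is precisely the dependence structure that multilevel models are designed to handle.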
In this example, pupils are nested within classes. Other examples are cross-national
studies where the individuals are nested within their national units, organizational research
with individuals nested within departments within organizations, family research with
family members within families and methodological research into interviewer effects with
respondents nested within interviewers. Less obvious applications of multilevel models
are longitudinal research and growth curve research, where a series of several distinct
observations are viewed as nested within individuals, and meta-analysis where the subjects
are nested within different studies.


1.1 Aggregation and Disaggregation

In multilevel research, variables can be defined at any level of the hierarchy. Some of these
variables may be measured directly at their ‘own’ natural level; for example, at the school
level we may measure school size and denomination, at the class level we measure class
size, and at the pupil level, intelligence and school success. In addition, we may move
variables from one level to another by aggregation or disaggregation. Aggregation means
that the variables at a lower level are moved to a higher level, for instance, by assigning to
the classes the class mean of the pupils’ intelligence scores. Disaggregation means moving
variables to a lower level, for instance by assigning to all pupils in the schools a variable
that indicates the denomination of the school they belong to.
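The two operations can be sketched in a few lines of plain Python. The pupil records and school denominations below are invented purely for illustration:

```python
from statistics import mean

# Pupil-level records (pupil_id, class_id, iq); all values invented for illustration.
pupils = [
    (1, "A", 110), (2, "A", 100), (3, "A", 90),
    (4, "B", 120), (5, "B", 100),
]
# Class-level global variable: the denomination of each class's school.
denomination = {"A": "public", "B": "catholic"}

# Aggregation: move pupil IQ up a level by computing the mean per class.
class_ids = sorted({c for _, c, _ in pupils})
class_mean_iq = {c: mean(iq for _, cls, iq in pupils if cls == c) for c in class_ids}

# Disaggregation: move the class-level variable down, repeating it for every pupil.
pupil_denomination = {pid: denomination[c] for pid, c, _ in pupils}

print(class_mean_iq)          # class A: 100, class B: 110
print(pupil_denomination[4])  # catholic
```

Note that after disaggregation the class-level value is simply repeated for every pupil in the class; it carries no new information beyond the single class-level measurement.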
The lowest level (level 1) is usually defined by the individuals. However, this is not
always the case. For instance, in longitudinal designs, repeated measures within individuals
are the lowest level. In such designs, the individuals are at level two, and groups are at level
three. Most software allows for at least three levels, and some software has no formal limit
to the number of levels. However, models with many levels can be difficult to estimate, and
even if estimation is successful, they are unquestionably more difficult to interpret.
At each level in the hierarchy, we may have several types of variables. The distinctions
made in the following are based on the typology offered by Lazarsfeld and Menzel (1961),
with some simplifications. In our typology, we distinguish between global, structural and
contextual variables.
Global variables are variables that refer only to the level at which they are defined,
without reference to other units or levels. A pupil’s intelligence or gender would be a global
variable at the pupil level. School denomination and class size would be global variables at
the school and class level. Simply put: a global variable is measured at the level at which
that variable actually exists.
Structural variables are operationalized by referring to the sub-units at a lower level.
They are constructed from variables at a lower level, for example, in defining the class
variable ‘mean intelligence’ as the mean of the intelligence scores of the pupils in that
class. Using the mean of a lower-level variable as an explanatory variable at a higher level
is called aggregation, and it is a common procedure in multilevel analysis. Other functions
of the lower-level variables are less common, but may also be valuable. For instance, using
the standard deviation of a lower-level variable as an explanatory variable at a higher level
could be used to test hypotheses about the effect of group heterogeneity on the outcome
variable (cf. Klein and Kozlowski, 2000).
Contextual variables are the result of disaggregation; all units at the lower level
receive the value of a global variable for the context to which they belong at the higher
level. For instance, we can assign to all pupils in a school the school size, or the mean
intelligence, as a pupil-level variable. Disaggregation is not needed in a proper multilevel
analysis. For convenience, multilevel data are often stored in a single data file, in which
the group-level variables are repeated for each individual within a group, but the statistical
Introduction to Multilevel Analysis 3

model and the software will correctly recognize these as a single value at a higher level.
The term contextual variable, however, is still used to denote a variable that models how
the context influences an individual.
In order to analyze multilevel models, it is not important to assign each variable to its
proper place in the typology. The benefit of the scheme is conceptual; it makes clear to
which level a measurement properly belongs. Historically, multilevel problems have led to
analysis approaches that moved all variables by aggregation or disaggregation to one single
level of interest followed by an ordinary multiple regression, analysis of variance, or some
other ‘standard’ analysis method. However, analyzing variables from different levels at one
single common level is inadequate, and leads to two distinct types of problems.
The first problem is statistical. If data are aggregated, the result is that different data
values from many sub-units are combined into fewer values for fewer higher-level units. As
a result, much information is lost, and the statistical analysis loses power. On the other hand,
if data are disaggregated, the result is that a few data values from a small number of super-
units are ‘blown up’ into many more values for a much larger number of sub-units. Ordinary
statistical tests treat all these disaggregated data values as independent information from the
much larger sample of sub-units. The proper sample size for these variables is of course the
number of higher-level units. Using the larger number of disaggregated cases for the sample
size leads to significance tests that reject the null-hypothesis far more often than the nominal
alpha level suggests. In other words, investigators come up with many ‘significant’ results
that are totally spurious.
The second problem is conceptual. If the analyst is not very careful in the interpretation
of the results, s/he may commit the fallacy of the wrong level, which consists of analyzing
the data at one level, and formulating conclusions at another level. Probably the best-known
fallacy is the ecological fallacy, which is interpreting aggregated data at the individual
level. It is also known as the ‘Robinson effect’ after Robinson (1950). Robinson presents
aggregated data describing the relationship between the percentage of blacks and the
illiteracy level in nine geographic regions in 1930. The ecological correlation, that is, the
correlation between the aggregated variables at the region level is 0.95. In contrast, the
individual-level correlation between these global variables is 0.20. Robinson concludes
that in practice an ecological correlation is almost certainly not equal to its corresponding
individual-level correlation. For a statistical explanation, see Robinson (1950) or Kreft and
de Leeuw (1987). Formulating inferences at a higher level based on analyses performed at a
lower level is just as misleading. This fallacy is known as the atomistic fallacy.
A better way to look at multilevel data is to realize that there is not one ‘proper’ level
at which the data should be analyzed. Rather, all levels present in the data are important
in their own way. This becomes clear when we investigate cross-level hypotheses, or
multilevel problems. A multilevel problem is a problem that concerns the relationships
between variables that are measured at a number of different hierarchical levels. For
example, a common question is how a number of individual and group variables influence

one single individual outcome variable. Typically, some of the higher-level explanatory
variables may be structural variables, for example the aggregated group means of lower-
level global (individual) variables. The goal of the analysis is to determine the direct effect
of individual- and group-level explanatory variables, and to determine if the explanatory
variables at the group level serve as moderators of individual-level relationships. If group-
level variables moderate lower-level relationships, this shows up as a statistical interaction
between explanatory variables from different levels. In the past, such data were analyzed
using conventional multiple regression analysis with one dependent variable at the lowest
(individual) level and a collection of disaggregated explanatory variables from all available
levels (cf. Boyd & Iversen, 1979). This approach is completely outdated: since it analyzes
all available data at one single level, it suffers from all of the conceptual and statistical
problems mentioned above.

1.2 Why Do We Need Special Multilevel Analysis Techniques?

Multilevel research concerns a population with a hierarchical structure. A sample from such
a population can be described as a multistage sample: first, we take a sample of units from
the higher level (e.g., schools), and next we sample the sub-units from the available units
(e.g., we sample pupils from the schools). In such samples, the individual observations are
in general not independent. For instance, pupils in the same school tend to be similar to
each other, because of selection processes (for instance, some schools may attract pupils
from higher social economic status (SES) levels, while others attract lower SES pupils) and
because of the common history the pupils share by going to the same school. As a result,
the average correlation (expressed as the so-called intraclass correlation) between variables
measured on pupils from the same school will be higher than the average correlation between
variables measured on pupils from different schools. Standard statistical tests lean heavily on
the assumption of independence of the observations. If this assumption is violated (and with
nested data this is almost always the case) the estimates of the standard errors of conventional
statistical tests are much too small, and this results in many spuriously ‘significant’ results.
The effect is generally not negligible: even small dependencies, combined with medium to
large group sizes, result in large biases in the standard errors. The strong biases that can
result from violating the independence assumption of standard statistical tests have been
known for a long time (Walsh, 1947), and independence remains a very important
assumption to check in statistical analyses (Stevens, 2009).
The problem of dependencies between individual observations also occurs in survey
research, if the sample is not taken at random but cluster sampling from geographical areas
is used instead. For similar reasons as in the school example given above, respondents from
the same geographical area will be more similar to each other than respondents from different
geographical areas are. This leads again to estimates for standard errors that are too small
and produce spurious ‘significant’ results. In survey research, this effect of cluster sampling

is well known (cf. Kish, 1965, 1987). It is called a ‘design effect’, and various methods are
used to deal with it. A convenient correction procedure is to compute the standard errors by
ordinary analysis methods, estimate the intraclass correlation between respondents within
clusters, and finally employ a correction formula to the standard errors. For instance, Kish
(1965, p. 259) corrects the sampling variance using veff = v [1 + (nclus − 1) ρ], where veff is
the effective sampling variance, v is the sampling variance calculated by standard methods
assuming simple random sampling, nclus is the cluster size, and ρ is the intraclass correlation.
The intraclass correlation is described in Chapter 2, together with its estimation. The
following example makes clear how important the assumption of independence is. Suppose
that we take a sample of 10 classes, each with 20 pupils. This comes to a total sample
size of 200. We are interested in a variable with an intraclass correlation of 0.10, which is
a rather low intraclass correlation. However, the effective sample size in this situation is
200 / [1 + (20 – 1)0.1] = 69.0, which is far less than the apparent total sample size of 200.
Clearly, using a sample size of 200 will lead to standard errors that are much too low.
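The calculation above is easy to verify in code. The following is a minimal sketch of Kish's design-effect correction, assuming a common cluster size:

```python
def effective_n(n_total, n_clus, rho):
    """Effective sample size under Kish's design-effect correction:
    n_eff = n_total / (1 + (n_clus - 1) * rho)."""
    design_effect = 1 + (n_clus - 1) * rho
    return n_total / design_effect

# The example from the text: 10 classes of 20 pupils, intraclass correlation 0.10.
n_eff = effective_n(200, 20, 0.10)
print(round(n_eff, 1))  # 69.0
```

With ρ = 0 the design effect is 1 and the effective sample size equals the nominal one; as ρ or the cluster size grows, the effective sample size shrinks toward the number of clusters.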
Since the design effect depends on both the intraclass correlation and the cluster size, large
intraclass correlations are partly compensated by small group sizes. Conversely, small intraclass
correlations at the higher levels are offset by the usually large cluster sizes at these levels.
Some of the correction procedures developed for cluster and other complex samples are
quite powerful (cf. Skinner et al., 1989). In principle such correction procedures could also
be applied in analyzing multilevel data, by adjusting the standard errors of the statistical
tests. However, multilevel models are multivariate models, and in general the intraclass
correlation and hence the effective N is different for different variables. In addition, in most
multilevel problems we have not only clustering of individuals within groups, but we also
have variables measured at all available levels, and we are interested in the relationships
between all of these variables. Combining variables from different levels in one statistical
model is a different and more complicated problem than estimating and correcting for
design effects. Multilevel models are designed to analyze variables from different levels
simultaneously, using a statistical model that properly includes the dependencies.
To provide an example of a clearly multilevel problem, consider the ‘frog pond’ theory
that has been utilized in educational and organizational research. The ‘frog pond’ theory
refers to the notion that a specific individual frog may be a medium-sized frog in a pond
otherwise filled with large frogs, or a medium-sized frog in a pond otherwise filled with
small frogs. Applied to education, this metaphor points out that the effect of an explanatory
variable such as ‘intelligence’ on school career may depend on the average intelligence of
the other pupils in the school. A moderately intelligent pupil in a highly intelligent context
may become demotivated and thus become an underachiever, while the same pupil in a
considerably less intelligent context may gain confidence and become an overachiever.
Thus, the effect of an individual pupil’s intelligence depends on the average intelligence
of the other pupils in the class. A popular approach in educational research to investigate
‘frog pond’ effects has been to aggregate variables like the pupils’ IQ into group means, and

then to disaggregate these group means again to the individual level. As a result, the data
file contains both individual-level (global) variables and higher-level (contextual) variables
in the form of disaggregated group means. Already in 1976 the educational researcher
Cronbach suggested to express the individual scores as deviations from their respective
group means (Cronbach, 1976), a procedure that has become known as centering on the
group mean, or group mean centering. Centering on the group means makes very explicit
that the individual scores should be interpreted relative to their group’s mean. The example
of the ‘frog pond’ theory and the corresponding practice of centering the predictor variables
makes clear that combining and analyzing information from different levels within one
statistical model is central to multilevel modeling.
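Cronbach's group-mean centering can be sketched in a few lines of plain Python; the IQ scores below are invented for illustration:

```python
from statistics import mean

# Hypothetical pupil IQ scores per class.
iq = {"class1": [95, 105, 100], "class2": [120, 130, 125]}

# Group-mean centering: express each score as a deviation from its class mean,
# and keep the class means themselves as a separate (contextual) variable.
class_means = {c: mean(scores) for c, scores in iq.items()}
centered = {c: [x - class_means[c] for x in scores] for c, scores in iq.items()}

print(centered["class1"])     # deviations from the class-1 mean
print(class_means["class2"])  # the aggregated class-2 mean
```

After centering, a pupil's score says only how that pupil stands relative to classmates; the between-class information is carried entirely by the class means, which makes the within-group and between-group parts of a predictor explicit.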

1.3 Multilevel Theories

Multilevel data must be described by multilevel theories, an area that seems underdeveloped
compared to the advances made in the modeling and computing machinery. Multilevel models
in general require that the grouping criterion is clear, and that variables can be assigned
unequivocally to their appropriate level. In reality, group boundaries are sometimes fuzzy
and somewhat arbitrary, and the assignment of variables is not always obvious and simple.
In multilevel research, decisions about group membership and operationalizations involve a
range of theoretical assumptions (Klein & Kozlowski, 2000). If there are effects of the social
context on individuals, these effects must be mediated by intervening processes that depend
on characteristics of the social context. When the number of variables at the different levels
is large, there is an enormous number of possible cross-level interactions (discussed in more
detail in Chapter 2). Ideally, a multilevel theory should specify which direct effects and cross-
level interaction effects can be expected. Theoretical interpretation of cross-level interaction
effects between the individual and the context level require a specification of processes within
individuals that cause those individuals to be differentially influenced by certain aspects of the
context. Attempts to identify such processes have been made by, among others, Stinchcombe
(1968), Erbring and Young (1979), and Chan (1998). The common core in these theories is
that they all postulate processes that mediate between individual variables and group variables.
Since a global explanation by ‘group telepathy’ is generally not acceptable, communication
processes and the internal structure of groups become important concepts. These are often
measured as structural variables. In spite of their theoretical relevance, structural variables
are infrequently used in multilevel research. Another theoretical area that has been largely
neglected by multilevel researchers is the influence of individuals on the group. In multilevel
modeling, the focus is on models where the outcome variable is at the lowest level. Models
that investigate the influence of individual variables on group outcomes are scarce. For a
review of this issue see DiPrete and Forristal (1994); an example is discussed by Alba and
Logan (1992). Croon and van Veldhoven (2007) discuss analysis models for multilevel data
where the outcome variable is at the highest level.

1.4 Estimation and Software

A relatively new development in multilevel modeling is the use of Bayesian estimation
methods. Bayesian estimation offers solutions to some estimation problems that are
common in multilevel analysis, for example small sample sizes at the higher levels. Earlier
editions of this book already introduced Bayesian estimation; in this edition the discussion
of Bayesian estimation is expanded. We have chosen to do this by expanding the discussion
of Bayesian methods where appropriate, rather than inserting a separate chapter on Bayesian
methods. This book is not intended as a full introduction to Bayesian modeling. Our aim is
to get the reader interested in Bayesian modeling by showing when and where it is helpful,
and providing the necessary information to get started in this exciting field.
Many of the techniques and their specific software implementations discussed in this
book are the subject of active statistical and methodological research. In other words: both
the statistical techniques and the software tools are evolving rapidly. As a result, increasing
numbers of researchers are applying increasingly advanced models to their data. Of course,
researchers still need to understand the models and techniques that they use. Therefore,
in addition to being an introduction to multilevel analysis, this book aims to let the reader
become acquainted with some advanced modeling techniques that might be used, such as
bootstrapping and Bayesian estimation methods. At the time of writing, these are specialist
tools, and not part of the standard analysis toolkit. But they are developing rapidly, and are
likely to become more popular in applied research as well.
2 The Basic Two-Level Regression Model

Summary

The multilevel regression model has become known in the research literature under a
variety of names, such as ‘random coefficient model’ (Kreft & de Leeuw, 1998), ‘variance
component model’ (Searle et al., 1992; Longford, 1993), and ‘hierarchical linear model’
(Raudenbush & Bryk, 2002; Snijders & Bosker, 2012). Statistically oriented publications
generally refer to the model as a ‘mixed-effects’ or ‘mixed linear model’ (Littell et al.,
1996) and sociologists refer to it as ‘contextual analysis’ (Lazarsfeld & Menzel, 1961).
The models described in these publications are not exactly the same, but they are highly
similar, and we refer to them collectively as ‘multilevel regression models’. The multilevel
regression model assumes that there is a hierarchical data set, often consisting of subjects
nested within groups, with one single outcome or response variable that is measured at the
lowest level, and explanatory variables at all existing levels. The multilevel regression model
can be extended by adding an extra level for multiple outcome variables (see Chapter 10),
while multilevel structural equation models are fully multivariate at all levels (see Chapter
14 and Chapter 15). Conceptually, it is useful to view the multilevel regression model as
a hierarchical system of regression equations. In this chapter, we explain the multilevel
regression model for two-level data, providing both the equations and an example, and later
extend this model with a three-level example.

2.1 Example

Assume that we have data from J classes, with a different number of pupils nj in each class.
On the pupil level, we have the outcome variable ‘popularity’ (Y), measured by a self-rating
scale that ranges from 0 (very unpopular) to 10 (very popular). We have two explanatory
variables on the pupil level: pupil gender (X1: 0 = boy, 1 = girl) and pupil extraversion
(X2, measured on a self-rating scale ranging from 1–10), and one class-level explanatory
variable teacher experience (Z: in years, ranging from 2–25). There are data on 2000 pupils
in 100 classes, so the average class size is 20 pupils. The data are described in Appendix
E. The data files and other support materials are also available online (at https://multilevel-analysis.sites.uu.nl/).
To analyze these data, we can set up separate regression equations in each class to predict
the outcome variable Y using the explanatory variables X as follows:


Yij = β0j + β1j X1ij + β2j X2ij + eij .     (2.1)

Using variable labels instead of algebraic symbols, the equation reads:

popularityij = β0j + β1j genderij + β2j extraversionij + eij .     (2.2)

In this regression equation, β0j is the intercept, β1j is the regression coefficient (regression
slope) for the dichotomous explanatory variable gender (i.e., the difference between boys
and girls), β2j is the regression coefficient (slope) for the continuous explanatory variable
extraversion, and eij is the usual residual error term. The subscript j is for the classes (j = 1…J)
and the subscript i is for individual pupils (i = 1…nj). The difference with the usual regression
model is that we assume that each class has a different intercept coefficient β0j, and different
slope coefficients β1j and β2j. This is indicated in Equations 2.1 and 2.2 by attaching a subscript
j to the regression coefficients. The residual errors eij are assumed to have a mean of zero, and
a variance to be estimated. Most multilevel software assumes that the variance of the residual
errors is the same in all classes. Different authors (cf. Goldstein, 2011; Raudenbush & Bryk,
2002) use different systems of notation. This book uses σe² to denote the variance of the
lowest-level residual errors.
Figure 2.1 shows a single-level regression line for a dependent variable Y regressed on a
single explanatory variable X. The regression line represents the predicted values ŷ for Y. The
regression coefficient b0 is the intercept, that is, the predicted value for Y if X = 0. The
regression slope b1 indicates the predicted increase in Y if X increases by one unit.
Since in multilevel regression the intercept and slope coefficients vary across the classes,
they are often referred to as random coefficients. Of course, we hope that this variation is
not totally random, so we can explain at least some of the variation by introducing higher-
level variables. Generally, we do not expect to explain all variation, so there will be some
unexplained residual variation. In our example, the specific values for the intercept and the

Figure 2.1 Example single-level regression line (intercept b0, slope b1, residual error e).



slope coefficients are a class characteristic. In general, a class with a high intercept is predicted
to have more popular pupils than a class with a low value for the intercept. Since the model
contains a dummy variable for gender, the value of the intercept reflects the predicted value
for the boys (who are coded as zero). Varying intercepts shift the average value for the
entire class, both boys and girls. Differences in the slope coefficient for gender or extraversion
indicate that the relationship between the pupils’ gender or extraversion and their predicted
popularity is not the same in all classes. Some classes may have a high value for the slope
coefficient of gender; in these classes, the difference between boys and girls is relatively large.
Other classes may have a low value for the slope coefficient of gender; in these classes, gender
has a small effect on the popularity, which means that the difference between boys and girls
is small. Variance in the slope for pupil extraversion is interpreted in a similar way; in classes
with a large coefficient for the extraversion slope, pupil extraversion has a large impact on their
popularity, and vice versa.
Figure 2.2 presents an example with two groups. The panel on the left portrays two groups
with no slope variation, and as a result the two slopes are parallel. The intercepts for both groups
are different. The panel on the right portrays two groups with different slopes, or slope variation.
Note that variation in slopes also has an effect on the difference between the intercepts!
Across all classes, the regression coefficients β0j … β2j are assumed to have a multivariate
normal distribution. The next step in the hierarchical regression model is to explain the
variation of the regression coefficients β0j … β2j by introducing explanatory variables at the
class level, for the intercept

β0j = γ00 + γ01 Zj + u0j ,     (2.3)

and for the slopes


1 j =  10 +  11Z j + u1 j
(2.4)
2 j =  20 +  21Z j + u2 j .

Figure 2.2 Two groups without (left) and with (right) random slopes. In the left panel the
two group lines share a common slope b; in the right panel the slopes b1 and b2 differ.

Equation 2.3 predicts the average popularity in a class (the intercept β0j) by the teacher’s
experience (Z). Thus, if γ01 is positive, the average popularity is higher in classes with a more
experienced teacher. Conversely, if γ01 is negative, the average popularity is lower in classes
with a more experienced teacher. The interpretation of the equations under 2.4 is a bit more
complicated. The first equation under 2.4 states that the relationship, as expressed by the slope
coefficient β1j, between the popularity (Y) and the gender (X) of the pupil, depends upon the
amount of experience of the teacher (Z). If γ11 is positive, the gender effect on popularity is
larger with experienced teachers. Conversely, if γ11 is negative, the gender effect on popularity
is smaller with more experienced teachers. Similarly, the second equation under 2.4 states, if
γ21 is positive, that the effect of extraversion is larger in classes with an experienced teacher.
Thus, the amount of experience of the teacher acts as a moderator variable for the relationship
between popularity and gender or extraversion; this relationship varies according to the value
of the moderator variable.
The u-terms u0j, u1j and u2j in Equations 2.3 and 2.4 are (random) residual error terms at the
class level. These residual errors uj are assumed to have a mean of zero, and to be independent
from the residual errors eij at the individual (pupil) level. The variance of the residual errors
u0j is specified as σu0², and the variances of the residual errors u1j and u2j are specified as σu1²
and σu2². The covariances between the residual error terms are denoted by σu01, σu02 and σu12,
which are generally not assumed to be zero.
Note that in Equations 2.3 and 2.4 the regression coefficients γ are not assumed to vary
across classes. They therefore have no subscript j to indicate to which class they belong.
Because they apply to all classes, they are referred to as fixed coefficients. All between-
class variation left in the β coefficients, after predicting these with the class variable Zj, is
assumed to be residual error variation. This is captured by the residual error terms uj, which
do have subscripts j to indicate to which class they belong.
Our model with two pupil-level and one class-level explanatory variables can be written
as a single complex regression equation by substituting Equations 2.3 and 2.4 into Equation
2.1. Substitution and rearranging terms gives:
Yij = γ00 + γ10 X1ij + γ20 X2ij + γ01 Zj + γ11 X1ij Zj + γ21 X2ij Zj
     + u1j X1ij + u2j X2ij + u0j + eij .     (2.5)

Using variable labels instead of algebraic symbols, we have

popularityij = γ00 + γ10 genderij + γ20 extraversionij + γ01 experiencej
           + γ11 genderij × experiencej + γ21 extraversionij × experiencej
           + u1j genderij + u2j extraversionij + u0j + eij .

The segment [γ00 + γ10 X1ij + γ20 X2ij + γ01Zj + γ11 X1ijZj + γ21 X2ijZj] in Equation 2.5 contains the
fixed coefficients. It is often called the fixed (or deterministic) part of the model. The segment
[u1jX1ij + u2jX2ij + u0j + eij] in Equation 2.5 contains the random error terms, and it is often called

the random (or stochastic) part of the model. The terms X1ijZj and X2ijZj are interaction terms
that appear in the model as a consequence of modeling the varying regression slope βj of a
pupil-level variable Xij with the class-level variable Zj. Thus, the moderator effect of Z on the
relationship between the dependent variable Y and the predictor X, is expressed in the single
equation version of the model as a cross-level interaction. The interpretation of interaction
terms in multiple regression analysis is complex, and this is treated in more detail in Chapter
4. In brief, the point made in Chapter 4 is that the substantive interpretation of the coefficients
in models with interactions is much simpler if the variables making up the interaction are
expressed as deviations from their respective means.
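As a sketch of how Equations 2.1 through 2.5 fit together, the following Python fragment generates one class's worth of data from the combined model. All γ values and variance components are invented for illustration; they are not estimates from the example data:

```python
import random

random.seed(1)  # reproducible draws

# Fixed part: illustrative gamma values (invented, not estimated from any data).
g00, g10, g20, g01, g11, g21 = 2.0, 0.8, 0.45, 0.09, -0.02, -0.03
# Random part: standard deviations of u0j, u1j, u2j and eij (also invented).
sd_u0, sd_u1, sd_u2, sd_e = 0.5, 0.2, 0.1, 0.7

def simulate_class(n_pupils, experience):
    """Generate (gender, extraversion, popularity) tuples for one class
    from the combined model of Equation 2.5."""
    u0 = random.gauss(0, sd_u0)  # class-level intercept residual
    u1 = random.gauss(0, sd_u1)  # class-level gender-slope residual
    u2 = random.gauss(0, sd_u2)  # class-level extraversion-slope residual
    pupils = []
    for _ in range(n_pupils):
        gender = random.randint(0, 1)
        extraversion = random.randint(1, 10)
        e = random.gauss(0, sd_e)
        y = (g00 + g10 * gender + g20 * extraversion + g01 * experience
             + g11 * gender * experience + g21 * extraversion * experience
             + u0 + u1 * gender + u2 * extraversion + e)
        pupils.append((gender, extraversion, y))
    return pupils

one_class = simulate_class(20, experience=10)
print(len(one_class))  # 20
```

Note how the class-level residuals u0, u1 and u2 are drawn once per class but applied to every pupil in it; this is exactly the source of the within-class dependence that multilevel models account for.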
Note that the random error terms u1j and u2j are connected to X1ij and X2ij. Since each
explanatory variable Xij and its corresponding error term uj are multiplied, the resulting
error term will be different for different values of Xij, a situation that in ordinary
multiple regression analysis is called ‘heteroscedasticity’. The usual multiple regression
model assumes ‘homoscedasticity’, which means that the variance of the residual errors is
independent of the values of the explanatory variables. If this assumption is not true, ordinary
multiple regression does not perform very well. This is another reason why analyzing
multilevel data with ordinary multiple regression techniques does not perform well.
As explained in the introduction in Chapter 1, multilevel models are needed because
grouped data observations from the same group are generally more similar to each
other than the observations from different groups, and this violates the assumption of
independence of all observations. The amount of dependence can be expressed as a
correlation coefficient: the intraclass correlation. The methodological literature contains a
number of different formulas to estimate the intraclass correlation ρ. For example, if we
use one-way analysis of variance with the grouping variable as independent variable to test
the group effect on our outcome variable, the intraclass correlation is given by
ρ = [MS(B) − MS(error)] / [MS(B) + (n − 1) × MS(error)], where MS(B) is the between-groups mean
square and n is the common group size. Shrout and Fleiss (1979) give an overview of
formulas for the intraclass correlation for a variety of research designs.
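The ANOVA-based formula can be sketched as follows for a balanced design. The two toy groups are invented, and note that this moment-based estimator can turn out negative when groups differ less than chance alone would predict:

```python
from statistics import mean

def icc_anova(groups):
    """Intraclass correlation from one-way ANOVA mean squares:
    rho = (MS_between - MS_within) / (MS_between + (n - 1) * MS_within),
    assuming a common group size n (balanced design)."""
    k = len(groups)              # number of groups
    n = len(groups[0])           # common group size
    grand = mean(x for g in groups for x in g)
    ms_between = n * sum((mean(g) - grand) ** 2 for g in groups) / (k - 1)
    ms_within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (k * (n - 1))
    return (ms_between - ms_within) / (ms_between + (n - 1) * ms_within)

# Two tight clusters far apart: nearly all variance is between groups.
rho = icc_anova([[10, 11, 9], [20, 21, 19]])
print(round(rho, 3))
```

For these two groups almost all variance lies between the group means, so the estimate is close to 1; groups with identical means would drive it to zero or below.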
The multilevel regression model can also be used to produce an estimate of the intraclass
correlation. The model used for this purpose is a model that contains no explanatory
variables at all, the so-called intercept-only or empty model (also referred to as baseline
model). The intercept-only model is derived from Equations 2.1 and 2.3 as follows. If there
are no explanatory variables X at the lowest level, Equation 2.1 reduces to

Yij = β0j + eij . (2.6)

Likewise, if there are no explanatory variables Z at the highest level, Equation 2.3 reduces
to

β0j = γ00 + u0j . (2.7)



We find the single equation model by substituting 2.7 into 2.6:

Yij = γ00 + u0j + eij. (2.8)

The intercept-only model of Equation 2.8 does not explain any variance in Y. It only
decomposes the variance into two independent components: σe², which is the variance of the
lowest-level errors eij, and σu0², which is the variance of the highest-level errors u0j. These two
variances sum up to the total variance, hence they are often referred to as variance components.
Using this model, we can define the intraclass correlation ρ by the equation

ρ = σ²u0 / (σ²u0 + σ²e) .     (2.9)

The intraclass correlation ρ indicates the proportion of the total variance explained by the
grouping structure in the population. Equation 2.9 simply states that the intraclass correlation
is the proportion of group-level variance compared to the total variance.1 The intraclass
correlation ρ can also be interpreted as the expected correlation between two randomly drawn
units that are in the same group.
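Equation 2.9 amounts to a one-line computation. As a minimal sketch (the function name is ours; the variance estimates are the intercept-only values reported in Table 2.1):

```python
# Intraclass correlation from the intercept-only model (Equation 2.9):
# rho = sigma2_u0 / (sigma2_u0 + sigma2_e).

def icc(sigma2_u0, sigma2_e):
    """Proportion of total variance located at the group level."""
    return sigma2_u0 / (sigma2_u0 + sigma2_e)

# Intercept-only estimates from the pupil popularity example (Table 2.1):
rho = icc(sigma2_u0=0.69, sigma2_e=1.22)
print(round(rho, 2))  # 0.36
```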
In the intercept-only model we defined the variance of the lowest-level errors and the variance
of the highest-level errors. Both terms can be interpreted as unexplained variance at their
respective levels, since no predictors have been specified in the model yet. After adding
predictors, just as in ordinary regression analysis, an R², interpreted as the proportion of
variance modeled by the explanatory variables, can be calculated. In multilevel analyses,
however, there is variance to be explained at every level (and also for random slopes). The
interpretation of these separate R² values depends on the ICC values. For example, if the R² at
the highest level is 0.20 and the ICC is 0.40, then 20 percent of the 40 percent of total variance
at that level is explained. This is further explained in Chapter 4.
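The worked example in this paragraph can be written out as a short calculation (variable names are ours):

```python
# An R-squared at a given level is a proportion of that level's variance,
# so its share of the *total* variance is scaled by the ICC.

icc_level2 = 0.40  # proportion of total variance at the highest level
r2_level2 = 0.20   # proportion of level-2 variance explained by predictors

share_of_total = round(icc_level2 * r2_level2, 2)
print(share_of_total)  # 0.08: 8 percent of the total variance is explained
```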

2.2 An Extended Example

The intercept-only model is useful as a null-model that serves as a benchmark with which
other models are compared. For our pupil popularity example data, the intercept-only model
is written as

popularityij = γ00 + u0j + eij.

The model that includes pupil gender, pupil extraversion and teacher experience, but not
the cross-level interactions, is written as

popularityij = γ00 + γ10 genderij + γ20 extraversionij + γ01 experiencej + u1j genderij
           + u2j extraversionij + u0j + eij.
14 Multilevel Analysis: Techniques and Applications

Table 2.1 Intercept-only model and model with explanatory variables

                      Single-level model   M0: intercept only   M1: with predictors
Fixed part            Coefficient (s.e.)   Coefficient (s.e.)   Coefficient (s.e.)
Intercept             5.08 (.03)           5.08 (.09)           0.74 (.20)
Pupil gender                                                    1.25 (.04)
Pupil extraversion                                              0.45 (.03)
Teacher experience                                              0.09 (.01)
Random part a
σ²e                   1.91 (.06)           1.22 (.04)           0.55 (.02)
σ²u0                                       0.69 (.11)           1.28 (.47)
σ²u1                                                            0.00 (–)
σ²u2                                                            0.03 (.008)
Deviance              6970.4               6327.5               4812.8

a For simplicity the covariances are not included.

Table 2.1 presents the parameter estimates and standard errors for both models.2 For
comparison, the first column presents the parameter estimates of a single-level model. The
intercept is estimated correctly, but the variance term combines the level-one and level-two
variances, and is for that reason not meaningful. M0, the intercept-only two-level model, splits
this variance term in a variance at the first and a variance at the second level. The intercept-
only two-level model estimates the intercept as 5.08, which is simply the average popularity
across all classes and pupils. The variance of the pupil-level residual errors, symbolized
by σ²e, is estimated as 1.22. The variance of the class-level residual errors, symbolized by
σ²u0, is estimated as 0.69. All parameter estimates are much larger than the corresponding
standard errors, and calculation of the Z-test shows that they are all significant at p < 0.005.3
The intraclass correlation, calculated by Equation 2.9 as ρ = σ²u0 / (σ²u0 + σ²e), is 0.69 / 1.91,
which equals 0.36. Thus, 36 percent of the variance of the popularity scores is at the group
level, which is very high for social science data. Since the intercept-only model contains
no explanatory variables, the residual variances represent unexplained error variance. The
deviance reported in Table 2.1 is a measure of model misfit; when we add explanatory
variables to the model, the deviance will go down.
The second model in Table 2.1 includes pupil gender and extraversion and teacher
experience as explanatory variables. The regression coefficients for all three variables are
significant. The regression coefficient for pupil gender is 1.25. Since pupil gender is coded
0 = boy, 1 = girl, this means that on average the girls score 1.25 points higher than boys on the
popularity measure, when all other variables are kept constant. The regression coefficient for
pupil extraversion is 0.45, which means that with each scale point higher on the extraversion
measure, the popularity is expected to increase by 0.45 scale points. The regression
coefficient for teacher experience is 0.09, which means that for each year of experience of
the teacher, the average popularity score of the class goes up by 0.09 points. This does not
seem very much, but the teacher experience in our example data ranges from 2 to 25 years,
so the predicted difference between the least experienced and the most experienced teacher is
(25 – 2 = ) 23 × 0.09 = 2.07 points on the popularity measure. The value of the intercept is
generally not interpreted; it is the expected value of the dependent variable if all explanatory
variables have the value zero. We can use the standard errors of the regression coefficients
reported in Table 2.1 to construct a 95 percent confidence interval. For the regression coefficient
of pupil gender, the 95 percent confidence interval runs from 1.17 to 1.33, the confidence
interval for pupil extraversion runs from 0.39 to 0.51, and the 95 percent confidence interval
for the regression coefficient of teacher experience runs from 0.07 to 0.11.4 Note that the
interpretation of the regression coefficients in the fixed part is no different than in any other
regression model (cf. Aiken & West, 1991).
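These confidence intervals follow the usual Wald construction, coefficient ± 1.96 × standard error. A minimal sketch (the function name is ours; estimates are from Table 2.1, M1):

```python
# 95 percent Wald confidence interval for a regression coefficient.

def wald_ci(coef, se, z=1.96):
    return coef - z * se, coef + z * se

# Pupil gender, pupil extraversion, and teacher experience (Table 2.1, M1):
print([round(v, 2) for v in wald_ci(1.25, 0.04)])  # [1.17, 1.33]
print([round(v, 2) for v in wald_ci(0.45, 0.03)])  # [0.39, 0.51]
print([round(v, 2) for v in wald_ci(0.09, 0.01)])  # [0.07, 0.11]
```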
The model with the explanatory variables includes variance components for the regression
coefficients of pupil gender and pupil extraversion, symbolized by σ²u1 and σ²u2 in Table 2.1.
The variance of the regression coefficients for pupil extraversion across classes is estimated
as 0.03, with a standard error of 0.008. The variance of the regression coefficients for pupil
gender is estimated as zero and not significant, so the hypothesis that the regression slopes for
pupil gender vary across classes is not supported by the data. We should remove the residual
variance term for the gender slopes from the model, and estimate the new model again. Table
2.2 presents the estimates for the model with a fixed slope for the effect of pupil gender.
Table 2.2 also includes the covariance between the class-level errors for the intercept and the
extraversion slope. These covariances are rarely interpreted (for an exception see Chapter 5

Table 2.2 Model with explanatory variables, extraversion slope random

                      M1: with predictors
Fixed part            Coefficient (s.e.)
Intercept             0.74 (.20)
Pupil gender          1.25 (.04)
Pupil extraversion    0.45 (.02)
Teacher experience    0.09 (.01)
Random part
σ²e                   0.55 (.02)
σ²u0                  1.28 (.28)
σ²u2                  0.03 (.008)
σu02                  –0.18 (.05)
Deviance              4812.8

and Chapter 16 where growth models are discussed), and for that reason they are often not
included in the reported tables. However, as Table 2.2 demonstrates, they can be quite large
and significant, so as a rule they are always included in the model.
The significant variance of the regression slopes for pupil extraversion implies that we
should not interpret the estimated value of 0.45 without considering this variation. In an
ordinary regression model, without multilevel structure, the value of 0.45 means that for each
point difference on the extraversion scale, the pupil popularity goes up by 0.45, for all pupils
in all classes. In our multilevel model, the regression coefficient for extraversion varies across
the classes, and the value of 0.45 is just the expected value (the mean) across all classes. The
varying regression slopes for pupil extraversion are assumed to follow a normal distribution.
The variance of this distribution is in our example estimated as 0.034. Interpretation of this
variation is easier when we consider the standard deviation, which is the square root of the
variance and equal to 0.18 in our example data. A useful characteristic of the standard deviation
is that with normally distributed observations, about 67 percent of the observations lie between
one standard deviation below and above the mean, and about 95 percent of the observations lie
between two standard deviations below and above the mean. If we apply this to the regression
coefficients for pupil extraversion, we conclude that about 67 percent of the regression coefficients
are expected to lie between (0.45 – 0.18 = ) 0.27 and (0.45 + 0.18 = ) 0.63, and about 95 percent
are expected to lie between (0.45 – 0.37 = ) 0.08 and (0.45 + 0.37 = ) 0.82. The more precise
value of Z.975 = 1.96 leads to the 95 percent predictive interval calculated as 0.09 to 0.81. We can
also use the standard normal distribution to estimate the percentage of regression coefficients
that are negative. As it turns out, if the mean regression coefficient for pupil extraversion is
0.45, given the estimated slope variance, less than 1 percent of the classes are expected to have
a regression coefficient that is actually negative. Note that the 95 percent interval computed
here is totally different from the 95 percent confidence interval for the regression coefficient of
pupil extraversion, which runs from 0.41 to 0.50. The 95 percent confidence interval applies to
γ20, the mean value of the regression coefficients across all the classes. The 95 percent interval
calculated here is the 95 percent predictive interval, which expresses that 95 percent of the
regression coefficients of the variable ‘pupil extraversion’ in the classes are predicted to lie
between 0.09 and 0.81.
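The predictive interval and the expected share of negative slopes can be reproduced with the standard normal distribution (a sketch; the slope variance 0.034 is the estimate discussed above):

```python
import math

mean_slope = 0.45   # average extraversion slope (gamma_20)
slope_var = 0.034   # estimated variance of the slopes across classes
sd = math.sqrt(slope_var)

# 95 percent predictive interval for the class-specific slopes:
lo, hi = mean_slope - 1.96 * sd, mean_slope + 1.96 * sd

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Expected proportion of classes with a negative extraversion slope:
frac_negative = norm_cdf((0.0 - mean_slope) / sd)

print(round(lo, 2), round(hi, 2))   # 0.09 0.81
print(frac_negative < 0.01)         # True: under 1 percent
```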
Given the significant variance of the regression coefficient of pupil extraversion across the
classes, it is attractive to attempt to predict its variation using class-level variables. We have
one class-level variable: teacher experience. The individual level regression equation for this
example, using variable labels instead of symbols, is given by:

popularityij = β0j + β1 genderij + β2j extraversionij + eij .     (2.10)

The regression coefficient β1 for pupil gender does not have a subscript j, because it is not
assumed to vary across classes. The regression equations predicting β0j, the intercept in class
j, and β2j, the regression slope of pupil extraversion in class j, are given by Equation 2.3 and
Equation 2.4, which are rewritten below using variable labels

β0j = γ00 + γ01 experiencej + u0j
β2j = γ20 + γ21 experiencej + u2j .     (2.11)

By substituting 2.11 into 2.10 we get


popularityij = γ00 + γ10 genderij + γ20 extraversionij + γ01 experiencej
             + γ21 extraversionij × experiencej + u2j extraversionij + u0j + eij .     (2.12)

The algebraic manipulations of the equations above make clear that to explain the variance
of the regression slopes β2j, we need to introduce an interaction term in the model. This
interaction, between the variables pupil extraversion and teacher experience, is a cross-level
interaction, because it involves explanatory variables from different levels. Table 2.3 presents
the estimates from a model with this cross-level interaction. For comparison, the estimates for
the model without this interaction are also included in Table 2.3.

Table 2.3 Model without and with cross-level interaction

                        M1A: main effects    M2: with interaction
Fixed part              Coefficient (s.e.)   Coefficient (s.e.)
Intercept               0.74 (.20)           –1.21 (.27)
Pupil gender            1.25 (.04)           1.24 (.04)
Pupil extraversion      0.45 (.02)           0.80 (.04)
Teacher experience      0.09 (.01)           0.23 (.02)
Extra × T.experience                         –0.03 (.003)
Random part
σ²e                     0.55 (.02)           0.55 (.02)
σ²u0                    1.28 (.28)           0.45 (.16)
σ²u2                    0.03 (.008)          0.005 (.004)
σu02                    –0.18 (.05)          –0.03 (.02)
Deviance                4812.8               4747.6

The estimates for the fixed coefficients in Table 2.3 are similar for the effect of pupil gender,
but the regression slopes for pupil extraversion and teacher experience are considerably larger in
the cross-level model. The interpretation remains the same: extraverted pupils are more popular.
The regression coefficient for the cross-level interaction is –0.03, which is small but significant.
This interaction is formed by multiplying the scores for the variables ‘pupil extraversion’ and
‘teacher experience’, and the negative value means that with experienced teachers, the advantage
of being extraverted is smaller than expected from the main effects alone. Thus, the difference between
extraverted and introverted pupils is smaller with more experienced teachers.
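Following Equation 2.12, the model-implied extraversion slope in class j is γ20 + γ21 × experiencej. A sketch with the M2 estimates from Table 2.3, treating teacher experience on its original 2–25 year scale (the function name is ours):

```python
# Model-implied extraversion slope as a function of teacher experience,
# using the M2 estimates from Table 2.3.

def expected_slope(experience, gamma_20=0.80, gamma_21=-0.03):
    return gamma_20 + gamma_21 * experience

print(round(expected_slope(2), 2))   # 0.74 for the least experienced teacher
print(round(expected_slope(25), 2))  # 0.05 for the most experienced teacher
```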

Comparison of the other results between the two models shows that the variance component
for pupil extraversion goes down from 0.03 in the main effects model to 0.005 in the cross-level
model. Apparently, the cross-level model explains some of the variation of the slopes for pupil
extraversion. The deviance also goes down, which indicates that this model fits better than the
previous model. The other differences in the random part are more difficult to interpret. Much
of the difficulty in reconciling the estimates in the two models in Table 2.3 stems from adding
an interaction effect. This issue is discussed in more detail in Chapter 4.
The coefficients in the tables are all unstandardized regression coefficients. To interpret
them properly, we must take the scale of the explanatory variables into account. In multiple
regression analysis, and structural equation models (SEM) for that matter, the regression
coefficients are often standardized because that facilitates the interpretation when one wants
to compare the effects of different variables within one sample. If the goal of the analysis is
to compare parameter estimates from different samples to each other, one should always use
unstandardized coefficients. To standardize the regression coefficients, as presented in Table
2.1 or Table 2.3, one could standardize all variables before putting them into the multilevel
analysis. However, this would in general also change the estimates of the variance components,
and their standard errors as well. Therefore, it is better to derive the standardized regression
coefficients from the unstandardized coefficients:

standardized coefficient = unstandardized coefficient × (s.d. explanatory variable / s.d. outcome variable) .     (2.13)

In our example data, the standard deviations are: 1.38 for popularity, 0.51 for gender, 1.26
for extraversion, and 6.55 for teacher experience. Table 2.4 presents the unstandardized and
standardized coefficients for the second model in Table 2.2. It also presents the estimates that
we obtain if we first standardize all variables, and then carry out the analysis.
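Equation 2.13 applied to the Table 2.2 estimates, with the standard deviations just given (a sketch; variable names are ours):

```python
# Standardized coefficient = unstandardized coefficient
#                            * sd(explanatory variable) / sd(outcome variable).

sd_popularity = 1.38

def standardized(coef, sd_x, sd_y=sd_popularity):
    return coef * sd_x / sd_y

print(round(standardized(1.25, 0.51), 2))  # 0.46  pupil gender
print(round(standardized(0.45, 1.26), 2))  # 0.41  pupil extraversion
print(round(standardized(0.09, 6.55), 2))  # 0.43  teacher experience
```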
Table 2.4 shows that the standardized regression coefficients are almost the same as
the regression coefficients estimated for standardized variables. The small differences in
Table 2.4 are simply due to rounding errors. However, if we use standardized variables
in our analysis, we find very different variance components and a very different value
for the deviance. This is not only the effect of scaling the variables differently; the
covariance between the slope for pupil extraversion and the intercept is significant for
the unstandardized variables, but not significant for the standardized variables. This kind
of difference in results is general. The fixed part of the multilevel regression model is
invariant for linear transformations, just as the regression coefficients in the ordinary
single-level regression model. This means that if we change the scale of our explanatory
variables, the regression coefficients and the corresponding standard errors change by
the same multiplication factor, and all associated p-values remain exactly the same.
However, the random part of the multilevel regression model is not invariant for linear
transformations. The estimates of the variance components in the random part can and do
change, sometimes dramatically. This is discussed in more detail in Section 4.2 in Chapter 4.

Table 2.4 Comparing unstandardized and standardized estimates

                       Standardization using 2.13             Standardized variables
Fixed part             Coefficient (s.e.)   Standardized      Coefficient (s.e.)
Intercept              0.74 (.20)           –                 –0.03 (.04)
Pupil gender           1.25 (.04)           0.46              0.45 (.01)
Pupil extraversion     0.45 (.02)           0.41              0.41 (.02)
Teacher experience     0.09 (.01)           0.43              0.43 (.04)
Random part
σ²e                    0.55 (.02)                             0.28 (.01)
σ²u0                   1.28 (.28)                             0.15 (.02)
σ²u2                   0.03 (.01)                             0.03 (.01)
σu02                   –0.18 (.01)                            –0.01 (.01)
Deviance               4812.8                                 3517.2

The conclusion to be drawn here is that, if we have a complicated random part, including
random components for regression slopes, we should think carefully about the scale of our
explanatory variables. If our only goal is to present standardized coefficients in addition
to the unstandardized coefficients, applying Equation 2.13 is safer than transforming our
variables. On the other hand, we may estimate the unstandardized results, including the
random part and the deviance, and then re-analyze the data using standardized variables,
merely using this analysis as a computational trick to obtain the standardized regression
coefficients without having to do hand calculations.

2.3 Three- and More Level Regression Models

2.3.1 Multiple-Level Models

In principle, the extension of the two-level regression model to three and more levels is
straightforward. There is an outcome variable at the first, the lowest level. In addition,
there may be explanatory variables at all available levels. The problem is that three-
and more level models can become complicated very fast. In addition to the usual fixed
regression coefficients, we must entertain the possibility that regression coefficients for
first-level explanatory variables may vary across units of both the second and the third
levels. Regression coefficients for second-level explanatory variables may vary across units
of the third level. To explain such variation, we must include cross-level interactions in the
model. Regression slopes for the cross-level interaction between first-level and second-
level variables may themselves vary across third-level units. To explain such variation, we
need a three-way interaction involving variables at all three levels.
The equations for such models are complicated, especially when we do not use the more
compact summation notation but write out the complete single equation-version of the model
in an algebraic format (for a note on notation see Section 2.4).
The resulting models are not only difficult to follow from a conceptual point of view;
they may also be difficult to estimate in practice. The number of estimated parameters is
considerable, and at the same time the highest level sample size tends to become relatively
smaller. As DiPrete and Forristal (1994, p. 349) put it, the imagination of the researchers “…
can easily outrun the capacity of the data, the computer, and current optimization techniques
to provide robust estimates.”
Nevertheless, three- and more level models have their place in multilevel analysis.
Intuitively, three-level structures such as pupils in classes in schools, or respondents nested
within households, nested within regions, appear to be both conceptually and empirically
manageable. If the lowest level is repeated measures over time, having repeated measures on
pupils nested within schools again does not appear to be overly complicated. In such cases, the
solution for the conceptual and statistical problems mentioned is to keep models reasonably
small. Especially specification of the higher-level variances and covariances should be driven
by theoretical considerations. A higher-level variance for a specific regression coefficient
implies that this regression coefficient is assumed to vary across units at that level. A higher-
level covariance between two specific regression coefficients implies that these regression
coefficients are assumed to covary across units at that level. Especially when models become
large and complicated, it is advisable to avoid higher-order interactions, and to include in the
random part only those elements for which there is strong theoretical or empirical justification.
This implies that an exhaustive search for second-order and higher-order interactions is not
a good idea. In general, we should seek higher-order interactions only if there is strong
theoretical justification for their importance, or if an unusually large variance component for
a regression slope calls for explanation. For the random part of the model, there are usually
more convincing theoretical reasons for the higher-level variance components than for the
covariance components. Especially if the covariances are small and non-significant, analysts
sometimes do not include all possible covariances in the model. This is defensible, with some
exceptions. First, it is recommended that the covariances between the intercept and the random
slopes are always included. Second, it is recommended to include covariances corresponding
to slopes of dummy variables belonging to the same categorical variable, and for variables that
are involved in an interaction or belong to the same polynomial expression.

2.3.2 Intraclass Correlations in Three-Level Models

In a two-level model, the intraclass correlation is calculated in the intercept-only model
using Equation 2.9, which is repeated below:

ρ = σ²u0 / (σ²u0 + σ²e) .     (2.9, repeated)

The intraclass correlation is an indication of the proportion of variance at the second level,
and it can also be interpreted as the expected (population) correlation between two randomly
chosen individuals within the same group.
If we have a three-level model, for instance pupils nested within classes, nested within
schools, there are two ways to calculate the intraclass correlation. First, we estimate an
intercept-only model for the three-level data, for which the single-equation model can be
written as follows:

Yijk = γ000 + v0k + u0jk + eijk . (2.15)

The variances at the first, second, and third level are respectively σ²e, σ²u0, and σ²v0. The first
method (cf. Davis & Scott, 1995) defines the intraclass correlations at the class and school level as

ρclass = σ²u0 / (σ²v0 + σ²u0 + σ²e) ,     (2.16)

and

ρschool = σ²v0 / (σ²v0 + σ²u0 + σ²e) .     (2.17)

The second method (cf. Siddiqui et al., 1996) defines the intraclass correlations at the class
and school level as

ρclass = (σ²v0 + σ²u0) / (σ²v0 + σ²u0 + σ²e) ,     (2.18)

and

ρschool = σ²v0 / (σ²v0 + σ²u0 + σ²e) .     (2.19)

Actually, both methods are correct (Algina, 2000). The first method identifies the
proportion of variance at the class and school level. This should be used if we are interested
in a decomposition of the variance across the available levels, or if we are interested in how
much variance is located at each level (a topic discussed in Section 4.5). The second method
represents an estimate of the expected (population) correlation between two randomly chosen
elements in the same group. So ρclass as calculated in Equation 2.18 is the expected correlation
between two pupils within the same class, and it correctly takes into account that two pupils
who are in the same class must by definition also be in the same school. For this reason, the
variance components for classes and schools must both be in the numerator of Equation 2.18.
If the two sets of estimates are different, which may happen if the amount of variance at the
school level is large, there is no contradiction involved. Both sets of equations express two
different aspects of the data, which happen to coincide when there are only two levels. The first
method, which identifies the proportion of variance at each level, is the one most often used.
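The two definitions can be computed side by side. A sketch using the intercept-only variance estimates from the hospital example in Section 2.3.3 (Table 2.5, M0; variable names are ours):

```python
# Three-level intraclass correlations, both definitions.

sigma2_e, sigma2_u0, sigma2_v0 = 0.30, 0.49, 0.16  # nurse, ward, hospital
total = sigma2_v0 + sigma2_u0 + sigma2_e

# Method 1 (Equations 2.16 and 2.17): proportion of variance per level.
icc_ward_prop = sigma2_u0 / total
icc_hospital = sigma2_v0 / total

# Method 2 (Equation 2.18): expected correlation between two nurses in the
# same ward, which includes the hospital-level variance in the numerator.
icc_ward_corr = (sigma2_v0 + sigma2_u0) / total

print(round(icc_ward_prop, 2), round(icc_hospital, 2))  # 0.52 0.17
print(round(icc_ward_corr, 2))                          # 0.68
```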

2.3.3 An Example of a Three-Level Model

The data in this example are from a hypothetical study on stress in hospitals. The data
are from nurses working in wards nested within hospitals. In each of 25 hospitals, four
wards are selected and randomly assigned to an experimental and control condition. In the
experimental condition, a training program is offered to all nurses to cope with job-related
stress. After the program is completed, a sample of about 10 nurses from each ward is given
a test that measures job-related stress. Additional variables are: nurse age (years), nurse
experience (years), nurse gender (0 = male, 1 = female), type of ward (0 = general care,
1 = special care), and hospital size (0 = small, 1 = medium, 2 = large).
This is an example of an experiment where the experimental intervention is carried
out on a higher level, in this example the ward level. In biomedical research this design is
known as a multisite cluster randomized trial. They are quite common also in educational
and organizational research, where entire classes or schools are assigned to experimental and
control conditions. Since the design variable Experimental versus Control group (ExpCon)
is manipulated at the second (ward) level, we can study whether the experimental effect is
different in different hospitals, by defining the regression coefficient for the ExpCon variable as
random at the hospital level.
In this example, the variable ExpCon is of main interest, and the other variables are
covariates. Their function is to control for differences between the groups, which can occur
even if randomization is used, especially with small samples, and to explain variance in the
outcome variable stress. To the extent that these variables successfully explain variance, the
power of the test for the effect of ExpCon will be increased. Therefore, although logically
we can test if explanatory variables at the first level have random coefficients at the second
or third level, and if explanatory variables at the second level have random coefficients at the
third level, these possibilities are not pursued. We do test a model with a random coefficient
for ExpCon at the third level, where there turns out to be significant slope variation. This
varying slope can be predicted by adding a cross-level interaction between the variables
expcon and hospsize. In view of this interaction, the variables expcon and hospsize have been
centered on their overall mean.5 Table 2.5 presents the results for a series of models.
The equation for the first model, the intercept-only model is

stressijk = γ000 + v0k + u0jk + eijk .     (2.20)

This produces the variance estimates in the M0 column of Table 2.5. The proportion of
variance (ICC) is 0.52 at the ward level, and 0.17 at the hospital level, calculated following
Equations 2.16 and 2.17. The nurse-level and the ward-level variances are evidently significant.

Table 2.5 Models for stress in hospitals and wards

                 M0:                 M1:                 M2: with random     M3: with cross-level
                 intercept only      with predictors     slope ExpCon        interaction
Fixed part       Coefficient (s.e.)  Coefficient (s.e.)  Coefficient (s.e.)  Coefficient (s.e.)
Intercept        5.00 (0.11)         5.50 (.12)          5.46 (.12)          5.50 (.11)
ExpCon a                             –0.70 (.12)         –0.70 (.18)         –0.50 (.11)
Age                                  0.02 (.002)         0.02 (.002)         0.02 (.002)
Gender                               –0.45 (.03)         –0.45 (.03)         –0.45 (.03)
Experience                           –0.06 (.004)        –0.06 (.004)        –0.06 (.004)
Ward type                            0.05 (.12)          0.05 (.07)          0.05 (.07)
Hospital size a                      0.46 (.12)          0.29 (.12)          0.46 (.12)
Exp × HSize                                                                  1.00 (.16)
Random part
σ²e              0.30 (.01)          0.22 (.01)          0.22 (.01)          0.22 (.01)
σ²u0             0.49 (.09)          0.33 (.06)          0.11 (.03)          0.11 (.03)
σ²v0             0.16 (.09)          0.10 (.05)          0.166 (.06)         0.15 (.05)
σ²u1                                                     0.66 (.22)          0.18 (.09)
Deviance         1942.4              1604.4              1574.2              1550.8

a Centered on grand mean.

The test statistic for the hospital-level variance is Z = 0.162 / 0.0852 = 1.901, which produces
a one-sided p-value of 0.029. The hospital-level variance is significant at the 5 percent level.
The sequence of models in Table 2.5 shows that all predictor variables have a significant effect,
except the ward type, and that the experimental intervention significantly lowers stress. The
experimental effect varies across hospitals, and a large part of this variation can be explained
by hospital size; in large hospitals the experimental effect is smaller.
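The Wald test quoted above can be reproduced directly (a sketch; 0.162 and 0.0852 are the more precise estimate and standard error cited in the text):

```python
import math

estimate, se = 0.162, 0.0852  # hospital-level variance and its s.e.
z = estimate / se

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# One-sided p-value, since a variance cannot be negative:
p_one_sided = 1.0 - norm_cdf(z)

print(round(z, 3))            # 1.901
print(round(p_one_sided, 3))  # 0.029
```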

2.4 Notation and Software

2.4.1 Notation

In general, there will be more than one explanatory variable at the lowest level and more
than one explanatory variable at the highest level. Assume that we have P explanatory
variables X at the lowest level, indicated by the subscript p (p = 1…P). Likewise, we have
Q explanatory variables Z at the highest level, indicated by the subscript q (q = 1…Q). Then,
Equation 2.5 becomes the more general equation:

Yij = γ00 + γp0 Xpij + γ0q Zqj + γpq ZqjXpij + upj Xpij + u0j + eij . (2.21)

Using summation notation, we can express the same equation as

Yij = γ00 + Σp γp0 Xpij + Σq γ0q Zqj + Σp Σq γpq Xpij Zqj + Σp upj Xpij + u0j + eij .     (2.22)
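Equation 2.22 is just a sum over the fixed and random terms; it can be evaluated directly for one observation (a sketch with made-up numbers; all names are ours):

```python
# Direct evaluation of Equation 2.22 for a single observation ij.

def y_ij(gamma00, gamma_p0, gamma_0q, gamma_pq, x, z, u_pj, u_0j, e_ij):
    fixed = gamma00
    fixed += sum(g * xv for g, xv in zip(gamma_p0, x))          # X effects
    fixed += sum(g * zv for g, zv in zip(gamma_0q, z))          # Z effects
    fixed += sum(gamma_pq[p][q] * x[p] * z[q]                   # cross-level
                 for p in range(len(x)) for q in range(len(z)))
    random = sum(u * xv for u, xv in zip(u_pj, x)) + u_0j + e_ij
    return fixed + random

# One X variable, one Z variable, illustrative values:
y = y_ij(gamma00=1.0, gamma_p0=[2.0], gamma_0q=[4.0], gamma_pq=[[6.0]],
         x=[3.0], z=[5.0], u_pj=[0.1], u_0j=0.2, e_ij=0.3)
print(round(y, 1))  # 117.8 = 1 + 6 + 20 + 90 + 0.3 + 0.2 + 0.3
```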

The errors at the lowest level eij are assumed to have a normal distribution with a mean
of zero and a common variance σ²e in all groups. The u-terms u0j and upj are the residual
error terms at the highest level. They are assumed to be independent of the errors eij at
the individual level, and to have a multivariate normal distribution with means of zero.
The variance of the residual errors u0j is the variance of the intercepts between the groups,
symbolized by σ²u0. The variances of the residual errors upj are the variances of the slopes
between the groups, symbolized by σ²up. The covariances between the residual error terms
σupp′ are generally not assumed to be zero; they are collected in the higher-level variance/
covariance matrix Ω.6
Note that in Equation 2.21, γ00, the regression coefficient for the intercept, is not associated
with an explanatory variable. We can expand the equation by providing an explanatory
variable that is a constant equal to one for all observed units. This yields the equation

Yij = γp0 Xpij + γpq ZqjXpij + upj Xpij + eij (2.23)

where X0ij = 1, and p = 0…P. Equation 2.23 makes clear that the intercept is a regression
coefficient, just like the other regression coefficients in the equation. Some multilevel
software, for instance HLM (Raudenbush et al., 2011) puts the intercept variable X0 = 1 in
the regression equation by default. Other multilevel software, for instance MLwiN (Rasbash
et al., 2015), requires that the analyst includes a variable in the data set that equals one in all
cases, which must be added explicitly to the regression equation.
Equation 2.23 can be made very general if we let X be the matrix of all explanatory variables
in the fixed part, symbolize the residual errors at all levels by u(l) with l denoting the level, and
associate all error components with predictor variables Z, which may or may not be equal to
the X. This produces the very general matrix formula Y = Xβ + Z(l)u(l) (cf. Goldstein, 2011,
Appendix 2.1). Since this book is more about applications than about mathematical statistics,
it generally uses the algebraic notation, except when multivariate procedures such as structural
equation modeling are discussed.
The notation used in this book is close to the notation used by Goldstein (2011) and Kreft
and de Leeuw (1998). The most important difference is that these authors indicate the higher-
level variance by σ00 instead of our σ²u0. The logic is that, if σ01 indicates the covariance
between variables 0 and 1, then σ00 is the covariance of variable 0 with itself, which is its
variance. Raudenbush and Bryk (2002), and Snijders and Bosker (2012) use a different
notation; they denote the lowest level error terms by rij, and the higher-level error terms by uj.
The Basic Two-Level Regression Model 25

The lowest-level variance is σ2 in their notation. The higher-level variances and covariances
are indicated by the Greek letter τ (tau); for instance, the intercept variance is given by τ00. The
τpp are collected in the matrix Tau, symbolized as T. The HLM program and manual in part use
a different notation, for instance when discussing longitudinal and three-level models.
In models with more than two levels, two different notational systems are used. One
approach is to use different Greek characters for the regression coefficients at different levels,
and different (Greek or Latin) characters for the variance terms at different levels. With many
levels, this becomes cumbersome, and it is simpler to use the same character, say β for the
regression slopes and u for the residual variance terms, and let the number of subscripts
indicate to which level these belong.

2.4.2 Software

Multilevel models can be formulated in two ways: (1) by presenting separate equations
for each of the levels, and (2) by combining all equations by substitution into a single
model-equation. The programs HLM (Raudenbush et al., 2011) and Mplus (Muthén &
Muthén, 1998–2015) require specification of the separate equations at each available level.
Most other software, e.g., MLwiN (Rasbash et al., 2015), SAS Proc Mixed (Littell et al.,
1996), SPSS command Mixed (Norusis, 2012), and the R package LME4 (Bates et al.,
2015) use the single equation representation. Both representations have their advantages
and disadvantages. The separate-equation representation has the advantage that it is always
clear how the model is built up. The disadvantage is that it hides from view that modeling
regression slopes by other variables is equivalent to adding a cross-level interaction to the
model. As will be explained in Chapter 4, estimating and interpreting interactions correctly
requires careful thinking. On the other hand, while the single-equation representation
makes the existence of interactions obvious, it conceals the role of the complicated error
components that are created by modeling varying slopes. In practice, to keep track of the
model, it is recommended to start by writing the separate equations for the separate levels,
and to use substitution to arrive at the single-equation representation.
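For a model with one level-1 predictor X and one level-2 predictor Z, this substitution step can be written out as follows (a sketch in the notation of Chapter 2):

```latex
% Separate equations, one per level
Y_{ij}     = \beta_{0j} + \beta_{1j} X_{ij} + e_{ij}
\beta_{0j} = \gamma_{00} + \gamma_{01} Z_{j} + u_{0j}
\beta_{1j} = \gamma_{10} + \gamma_{11} Z_{j} + u_{1j}

% Substituting the level-2 equations into the level-1 equation
% produces the single-equation representation
Y_{ij} = \gamma_{00} + \gamma_{10} X_{ij} + \gamma_{01} Z_{j}
       + \gamma_{11} Z_{j} X_{ij} + u_{1j} X_{ij} + u_{0j} + e_{ij}
```

The substitution makes the cross-level interaction term γ11ZjXij and the complicated error component u1jXij visible at the same time, which is why writing out both representations is helpful.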
To take a quote from Singer’s excellent introduction to using SAS Proc Mixed for multilevel
modeling (Singer, 1998, p. 350): ‘Statistical software does not a statistician make. That said,
without software, few statisticians and even fewer empirical researchers would fit the kinds of
sophisticated models being promulgated today.’ Indeed, software does not make a statistician,
but the advent of powerful and user-friendly software for multilevel modeling has had a
large impact in research fields as diverse as education, organizational research, demography,
epidemiology, and medicine. This book focuses on the conceptual and statistical issues that
arise in multilevel modeling of complex data structures. It assumes that researchers who apply
these techniques have access to and familiarity with some software that can estimate these
models. Specific software is mentioned in some places, but only if a technique is discussed
that requires specific software features or is only available in a specific program.
26 Multilevel Analysis: Techniques and Applications

Since statistical software evolves rapidly, with new versions of the software coming
out much faster than new editions of general handbooks such as this, we do not discuss
software setups or output in detail. As a result, this book is more about the possibilities
offered by the various techniques than about how these things can be done in a specific
software package. The techniques are explained using analyses on small but realistic data
sets, with examples of how the results could be presented and discussed. At the same time, if
the analysis requires that the software used have some specific capacities, these are pointed
out. This should enable interested readers to determine whether their software meets these
requirements, and assist them in working out the software setups for their favorite package.
In addition to the relevant program manuals, several software programs have been
discussed in introductory articles. Using SAS Proc Mixed for multilevel and longitudinal data
is discussed by Singer (1998). Peugh and Enders (2005) discuss SPSS Mixed using Singer’s
examples. Both Arnold (1992), and Heck and Thomas (2009) discuss multilevel modeling
using HLM and Mplus as the software tool. Sullivan, Dukes and Losina (1999) discuss HLM
and SAS Proc Mixed. West, Welch and Galecki (2007) present a series of multilevel analyses
using SAS, SPSS, R, Stata and HLM. Heck, Thomas and Tabata (2012, 2014) discuss SPSS.
Finally, the multilevel modeling program at the University of Bristol maintains a multilevel
homepage that contains a series of software reviews. The homepage for this book
(https://multilevel-analysis.sites.uu.nl/) contains links to these and other multilevel
resources. It also contains the data sets used in the examples, which are described in
Appendix E.

Notes
1 The intraclass correlation is an estimate of the proportion of group-level variance in the population. The
proportion of group-level variance in the sample is given by the correlation ratio η² (eta-squared, cf.
Tabachnick & Fidell, 2013, p. 54): η² = SS(B)/SS(Total).
2 For reasons to be explained later, different options for the details of the maximum likelihood
estimation procedure may result in slightly different estimates. So, if you re-analyze the example
data from this book, the results may differ slightly from the results given here. However, these
differences should never be so large that you would draw entirely different conclusions.
3 Testing variances is preferably done with a test based on the deviance, which is explained in Chapter 3.
4 Chapter 3 treats the interpretation of confidence intervals in more detail.
5 Chapter 4 discusses the interpretation of interactions and centering.
6 We may attach a subscript to Ω to indicate to which level it belongs. As long as there is no risk of
confusion, the simpler notation without the subscript is used.
3
Estimation and Hypothesis Testing in
Multilevel Regression

Summary

The usual method to estimate the values of the regression coefficients and the intercept
and slope variances is the maximum likelihood estimation method. This chapter gives a
non-technical explanation of maximum likelihood estimation, to enable analysts to make
informed decisions on the estimation options offered by current software. Some alternatives
to maximum likelihood estimation are briefly discussed. Other estimation methods, such as
Bayesian estimation methods and bootstrapping, are also briefly introduced in this chapter.
Finally, this chapter describes some procedures that can be used to compare nested and non-
nested models, which are especially useful when variance terms are tested.

3.1 Which Estimation Method?

Estimation of parameters (regression coefficients and variance components) in multilevel
modeling is mostly done by the maximum likelihood method. The maximum likelihood
(ML) method is a general estimation procedure, which produces estimates for the population
parameters that maximize the probability (produce the ‘maximum likelihood’) of observing
the data that are actually observed, given the model (cf. Eliason, 1993). Other estimation
methods that have been used in multilevel modeling are generalized least squares (GLS),
generalized estimating equations (GEE), bootstrapping methods and Bayesian methods
such as Markov chain Monte Carlo (MCMC). In this section, we will discuss these methods
briefly.

3.1.1 Maximum Likelihood (ML): Full and Restricted ML Estimation

Maximum likelihood (ML) is the most commonly used estimation method in multilevel
modeling. The results presented in Chapter 2 are all obtained using full ML estimation.
An advantage of the maximum likelihood estimation method is that it is generally robust,
and produces estimates that are asymptotically (i.e., when the sample size approaches
infinity) efficient and consistent. With large samples, ML estimates are usually robust
against mild violations of the assumptions, such as having non-normal errors. Maximum
likelihood estimation proceeds by maximizing a function called the likelihood function.


Two different likelihood functions are used in multilevel regression modeling. One is
full maximum likelihood (FML); in this method, both the regression coefficients and the
variance components are included in the likelihood function. The other estimation method
is restricted maximum likelihood (RML); here only the variance components are included in
the likelihood function, and the regression coefficients are estimated in a second estimation
step. Both methods produce parameter estimates with associated standard errors and an
overall model deviance, which is a function of the likelihood. FML treats the regression
coefficients as fixed but unknown quantities when the variance components are estimated,
but does not take into account the degrees of freedom lost by estimating the fixed effects.
RML estimates the variance components after removing the fixed effects from the model (cf.
Searle et al., 1992, Chapter 6). As a result, FML estimates of the variance components are
biased; they are generally too small. RML estimates have less bias (Longford, 1993). RML
also has the property, that if the groups are balanced (have equal group sizes), the RML
estimates are equivalent to ANOVA estimates, which are optimal (Searle et al., 1992, p. 254).
Since RML is more realistic, it should, in theory, lead to better estimates, especially when
the number of groups is small (Bryk & Raudenbush, 1992; Longford, 1993). In practice, the
differences between the two methods are usually small (cf. Hox, 1998; Kreft & de Leeuw,
1998). For example, if we compare the FML estimates for the intercept-only model for the
popularity data in Table 2.1 with the corresponding RML estimates, the only difference
to two decimal places is the intercept variance at level two. FML estimates this as 0.69,
and RML as 0.70. The size of this difference is absolutely trivial. If nontrivial differences
are found, the RML method is preferred (Browne, 1998). FML still continues to be used,
because it has two advantages over RML. Firstly, the computations are generally easier, and
secondly, since the regression coefficients are included in the likelihood function, an overall
chi-square test based on the likelihood can be used to compare two models that differ in the
fixed part (the regression coefficients). With RML, only differences in the random part (the
variance components) can be compared with this test. Most tables in this book have been
produced using FML estimation; if RML is used this is explicitly stated in the text.
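The direction of the FML bias has a familiar single-level analogue: the ML estimate of a variance divides the sum of squared deviations by N, ignoring the degree of freedom spent on estimating the mean, while the restricted estimate divides by N − 1. A minimal sketch in Python (this is only an analogy, not the multilevel likelihoods themselves; the function name is ours):

```python
def variance_ml_vs_reml(sample):
    # ML divides the sum of squared deviations by N (biased downward,
    # like FML variance components); the REML-style estimate divides
    # by N - 1, correcting for the estimated mean.
    n = len(sample)
    mean = sum(sample) / n
    ss = sum((y - mean) ** 2 for y in sample)
    return ss / n, ss / (n - 1)

ml_var, reml_var = variance_ml_vs_reml([1, 2, 3, 4, 5])
# ML: 10/5 = 2.0; REML: 10/4 = 2.5 -- the ML estimate is the smaller one
```

As in the multilevel case, the discrepancy between the two estimates shrinks as the sample (here N, there the number of groups) grows.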
Computing the maximum likelihood estimates requires an iterative procedure. At the
start, the computer program generates reasonable starting values for the various parameters
(for example based on single-level regression estimates). In the next step, an ingenious
computation procedure tries to improve upon the starting values, to produce better estimates.
This second step is repeated (iterated) many times. After each iteration, the program inspects
how much the estimates actually changed compared to the previous step. If the changes are
very small, the program concludes that the estimation procedure has converged and that it is
finished. Using multilevel software, we generally take the computational details for granted.
However, computational problems do sometimes occur. A problem common to programs
using an iterative maximum likelihood procedure is that the iterative process is not always
guaranteed to stop. There are models and data sets for which the program may go through an
endless sequence of iterations, which can only be ended by stopping the program. Because of
this, most programs set a built-in limit to the maximum number of iterations. If convergence
is not reached within this limit, the computations can be repeated with a higher limit. If the
computations do not converge after an extremely large number of iterations, we suspect that
they may never converge.1 The problem is how one should interpret a model that does not
converge. The usual interpretation is that a model for which convergence cannot be reached is
a bad model, using the simple argument that if estimates cannot be found, this disqualifies the
model. However, the problem may also lie with the data. Especially with small samples, the
estimation procedure may fail even if the model is valid. In addition, it is even possible that,
if only we had a better computer algorithm, or better starting values, we could find acceptable
estimates. Still, experience shows that if a program does not converge with a data set of
reasonable size, the problem often is a badly misspecified model. In multilevel analysis, non-
convergence often occurs when we try to estimate too many random (variance) components
that are actually close or equal to zero. The solution is to simplify the model by leaving out
some random components; often the estimated values from the non-converged solution
provide an indication which random components can be omitted. The strategy you apply to
solve convergence issues should be reported in your logbook and/or paper.

3.1.2 Generalized Least Squares

Generalized least squares (GLS) is an extension of the standard estimation ordinary least
squares (OLS) method that allows for heterogeneity and observations that differ in sampling
variance. GLS estimation approximates ML estimates, and they are asymptotically
equivalent. Asymptotic equivalence means that in very large samples they are in practice
indistinguishable. ‘Expected GLS’ estimates can be obtained from a maximum likelihood
procedure by restricting the number of iterations to one. Since GLS estimates are obviously
faster to compute than full ML estimates, they can be used as a stand-in for ML estimates
in computationally intensive procedures such as extremely large data sets. They can also be
used when ML procedures fail to converge; inspecting the GLS results may help to diagnose
the problem. Simulation research has shown that GLS estimates are less efficient, and that
the GLS-derived standard errors are inaccurate (cf. Hox, 1998; van der Leeden et al., 2008;
Kreft, 1996). Therefore, in general, ML estimation should be preferred.

3.1.3 Generalized Estimating Equations

The generalized estimating equations method (GEE, cf. Liang & Zeger, 1986) estimates
the variances and covariances in the random part of the multilevel model directly from
the residuals, which makes them faster to compute than full ML estimates. Typically, the
dependences in the multilevel data are accounted for by a very simple model, represented by
a working correlation matrix. For individuals within groups, the simplest assumption is that
the respondents within the same group all have the same correlation. For repeated measures,
a simple autocorrelation structure is usually assumed. After the estimates for the variance
components are obtained, GLS is used to estimate the fixed regression coefficients. Robust
standard errors are generally used to counteract the approximate estimation of the random
structure. For non-normal data this results in a population average model, where the emphasis
is on estimating average population effects and not on modeling individual differences.
According to Goldstein (2011) and Raudenbush & Bryk (2002), GEE estimates are less
efficient than full ML estimates, but they make weaker assumptions about the structure
of the random part of the multilevel model. If the model for the random part is correctly
specified, ML estimators are more efficient, and the model-based (ML) standard errors are
generally smaller than the GEE-based robust standard errors. If the model for the random
part is incorrect, the GEE-based estimates and robust standard errors are still consistent.
So, provided the sample size is reasonably large, GEE estimators are robust against
misspecification of the random part of the model, including violations of the normality
assumption. A drawback of the GEE approach is that it only approximates the random
effects structure, and therefore the random effects cannot be analyzed in detail. Most
software will simply estimate a full unstructured covariance matrix for the random part,
which makes it impossible to estimate random effects for the intercept or slopes. Given
the general robustness of ML methods, it is preferable to use ML methods when these are
available, and use robust estimators or bootstrap corrections when there is serious doubt
about the assumptions of the ML method. Robust estimators, which are used with GEE
estimators (Burton et al., 1998), are treated in more detail in Chapter 13 of this book.
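The working correlation structures mentioned above are simple to write down. A Python sketch (illustrative function names are ours) of the exchangeable structure for group members and the first-order autocorrelation structure for repeated measures:

```python
def exchangeable_corr(n, rho):
    # all members of a group share the same correlation rho
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]

def ar1_corr(n, rho):
    # repeated measures: correlation decays as rho ** |i - j|
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]
```

For example, exchangeable_corr(3, 0.2) gives a 3 × 3 matrix with ones on the diagonal and 0.2 everywhere else, while ar1_corr(3, 0.5) has 0.5 between adjacent occasions and 0.25 between the first and third.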

3.2 Bayesian Methods

In many different fields, including the field of multilevel analysis, Bayesian statistics is gaining
popularity (van de Schoot et al., 2017), mainly because it can deal with all kinds of technical
issues, for example multicollinearity (Can et al., 2014) or non-normality (see Chapter 13), or
because it can deal with smaller sample sizes on the highest level (e.g., Baldwin & Fellingham,
2013). This section does not aim to provide a full introduction to Bayesian multilevel
modeling; for that we refer to Hamaker and Klugkist (2011). For a very gentle introduction to
Bayesian modeling, we refer the novice reader to, among many others, Kaplan (2014), or
van de Schoot et al. (2014). More detailed information about Bayesian multilevel modeling
can be found in Gelman and Hill (2007). For a discussion in the context of MLwiN see
Browne (2005). In the current chapter, and see also Section 13.5, we want to highlight some
important characteristics of Bayesian estimation.
There are three essential ingredients underlying Bayesian statistics. The first ingredient is
the background knowledge of the parameters in the model being tested. This first ingredient
refers to all knowledge available before seeing the data and is captured in the so-called prior
distribution. The prior is a probability distribution reflecting the researchers’ beliefs about the
value of the parameter in the population, and the amount of uncertainty the researcher has
regarding this belief. Researchers may have a great degree of certainty in their belief, and
therefore specify an “informative prior”—that is, a prior with a low variance. In contrast, they
may have very little certainty in this belief, and consequently specify a non-informative prior—
that is, a prior with a large variance, also known as a diffuse or flat prior. The informativeness
of a prior is governed by hyperparameters. For example, the hyperparameters for a normal
distribution are the mean and variance terms that dictate the location and spread of the normal
distribution. A normally distributed prior would be written N(μ,σ2), where N denotes that the
prior follows a normal distribution (other distributions can also be specified in a model), the
mean of the prior is given by μ, and σ2 is the prior variance. Consequently, μ can be based on
background information about the model parameter value, and σ2 can be used to specify how
certain we are about the value of μ. The more informative a prior, the larger the impact it will
have on final model results, especially if the prior is combined with small sample sizes. If a non-
informative prior is desired, this is accomplished by specifying a very large variance for the
prior. Many simulation studies have shown that the more information is captured via the prior
distribution, the smaller the sample size can be while maintaining power and precision.
The second ingredient in Bayesian estimation is the information in the data itself. It is
the observed evidence expressed in terms of the likelihood function of the data given the
parameters. In other words, the likelihood function asks: “given a set of parameters, such as the
mean and/or the variance, what is the likelihood or probability of the data at hand?”
The third ingredient is based on combining the first two ingredients, which is called
posterior inference. Both (1) and (2) are combined via Bayes Theorem and are summarized
by the so-called posterior distribution, which is a combination of the prior knowledge
and the observed evidence. The posterior distribution reflects one’s updated knowledge,
balancing prior knowledge with observed data. Given that the posterior is a combination of
information from the prior and the data, a more informative prior has a larger impact on the
posterior (or final result).
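For the simple case of a normal mean with known variance, the way prior and data combine into the posterior can be written in closed form: precisions (inverse variances) add, and the posterior mean is the precision-weighted average of prior mean and data mean. A Python sketch (the function name is ours):

```python
def posterior_normal_mean(prior_mean, prior_var, data_mean, data_var, n):
    # Precisions (1 / variance) add; an informative prior (small
    # prior_var) pulls the posterior toward prior_mean, while a
    # diffuse prior (large prior_var) leaves it near data_mean.
    prior_prec = 1.0 / prior_var
    data_prec = n / data_var
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean
                            + data_prec * data_mean)
    return post_mean, post_var
```

With a diffuse prior (a very large prior variance) the posterior mean is essentially the data mean; with an informative prior it is pulled toward the prior mean, and this pull is strongest in small samples.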
The use of prior knowledge is one of the main elements that separate Bayesian and
frequentist methods. However, the process of estimating a Bayesian model can also be quite
different. Typically, Markov chain Monte Carlo (MCMC) methods are used, where estimation
is conducted through the use of a Markov chain—or a chain that captures the nature of the
posterior. Given that the posterior is a distribution (rather than a single, fixed number), we
need to sample from it in order to obtain a “best guess” of what the posterior looks like.
These samples from the posterior distribution form what we refer to as a chain. Every model
parameter has a chain associated with it, and once that chain has converged (i.e., the mean, or
horizontal middle of the chain, and the variance, or height of the chain, have stabilized),
we use the information in the chain to derive the final model estimates. Often, the beginning
portion of the chain is discarded because it represents an unstable part before convergence is
reached; this portion of the chain is called the burn-in phase. The last portion of the chain, the
post burn-in phase of the chain, is then used as the estimated posterior distribution where final
model estimates are obtained.
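A toy random-walk Metropolis sampler makes the chain, the burn-in phase, and the posterior summary concrete. The sketch below (plain Python, our own illustrative code; real multilevel software uses far more sophisticated samplers) estimates a normal mean with a normal prior:

```python
import math
import random

def log_posterior(mu, data, prior_mean, prior_var, sigma2):
    # log prior + log likelihood, up to an additive constant
    log_prior = -(mu - prior_mean) ** 2 / (2 * prior_var)
    log_lik = -sum((y - mu) ** 2 for y in data) / (2 * sigma2)
    return log_prior + log_lik

def metropolis(data, prior_mean=0.0, prior_var=100.0, sigma2=1.0,
               n_iter=5000, burn_in=1000, step=0.5, seed=1):
    rng = random.Random(seed)
    mu = prior_mean                       # arbitrary starting value
    lp = log_posterior(mu, data, prior_mean, prior_var, sigma2)
    chain = []
    for _ in range(n_iter):
        proposal = mu + rng.gauss(0, step)     # random-walk proposal
        lp_prop = log_posterior(proposal, data,
                                prior_mean, prior_var, sigma2)
        if math.log(rng.random()) < lp_prop - lp:   # accept / reject
            mu, lp = proposal, lp_prop
        chain.append(mu)
    return chain[burn_in:]                # discard the burn-in phase
```

The chain starts at an arbitrary value, wanders toward the region of high posterior probability during the burn-in, and the post burn-in draws are then summarized (for instance by their mean) to obtain the final estimate.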

The prior has the potential to have a rather large impact on final model results (even if
it is non-informative). As a result, it is important to report all details surrounding the prior
(see Depaoli & van de Schoot, 2017), which include: the distribution shape selected, the
hyperparameters (i.e., the level of informativeness), and the source of the prior information.
Equally important is to report a sensitivity analysis of priors to illustrate how robust final
model results are when priors are slightly (or even greatly) modified; this provides a better
understanding of the role of the prior in the analysis. Finally, it is also important to report all
information surrounding the assessment of chain convergence. Final model estimates are only
trustworthy if the Markov chain has successfully converged for every model parameter, and
reporting how this was assessed is a key component to a Bayesian analysis.
Bayesian multilevel estimation methods are discussed in more detail in Chapter 13 where
robust estimation methods are discussed to deal with non-normality, and in Chapter 12 where
sample size issues are discussed.

3.3 Bootstrapping

Bootstrapping is not, by itself, a different estimation method. In its simplest form, the
bootstrap (Efron, 1982; Efron & Tibshirani, 1993) is a method to estimate the parameters
of a model and their standard errors strictly from the sample, without reference to a
theoretical sampling distribution.2 The bootstrap directly follows the logic of statistical
inference. Statistical inference assumes that in repeated sampling, the statistics calculated
in the sample will vary across samples. This sampling variation is modeled by a theoretical
sampling distribution, for instance a normal distribution, and estimates of the expected
value and the variability are taken from this distribution. In bootstrapping, we draw b samples
(with replacement) from the observed sample at hand. In each sample, we estimate
the statistic(s) of interest, and the observed distribution of the b statistics is used for the
sampling distribution. Estimates of the expected value and the variability of the statistics are
taken from this empirical sampling distribution (Stine, 1989; Mooney & Duval, 1993; Yung
& Chan, 1999). Thus, in multilevel bootstrapping, in each bootstrap sample the parameters
of the model must be estimated, which is usually done with ML.
Since bootstrapping takes the observed data as the sole information about the population,
it needs a reasonable original sample size. Good (1999, p. 107) suggests a minimum sample
size of 50 when the underlying distribution is not symmetric. Yung and Chan (1999) review
the evidence on the use of bootstrapping with small samples. They conclude that it is not
possible to give a simple recommendation for the minimal sample size for the bootstrap
method. However, in general the bootstrap appears to compare favorably over asymptotic
methods. A large simulation study involving complex structural equation models (Nevitt
& Hancock, 2001) suggests that, for accurate results despite large violations of normality
assumptions, the bootstrap needs an observed sample of more than 150. Given such results,
the bootstrap is not the best approach when the major problem is a small sample size.
When the problem is violations of assumptions, or establishing bias-corrected estimates
and valid confidence intervals for variance components, the bootstrap appears to be a viable
alternative to asymptotic estimation methods.
The number of bootstrap iterations b is typically large, with b between 1000 and 2000
(Booth & Sarkar, 1998; Carpenter & Bithell, 2000). If the interest is in establishing very
accurate confidence intervals, we need an accurate estimate of percentiles close to 0 or 100,
which requires an even larger number of iterations, such as b > 5000.
The bootstrap is not without its own assumptions. A key assumption of the bootstrap is
that the resampling properties of the statistic resemble its sampling properties (Stine, 1989).
As a result bootstrapping does not work well for statistics that depend on a very “narrow
feature of the original sampling process” (Stine, 1989, p. 286), such as the maximum value.
Another key assumption is that the resampling scheme used in the bootstrap must reflect
the actual sampling mechanism used to collect the data (Carpenter & Bithell, 2000). This
assumption is very important in multilevel modeling, because in multilevel data we have a
hierarchical sampling mechanism, which must be mimicked in the bootstrapping procedure.
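For two-level data, this means resampling whole groups rather than individual observations (a cases bootstrap at the group level). A Python sketch (our own illustrative code) for bootstrapping an overall mean with a percentile confidence interval:

```python
import random

def cluster_bootstrap_means(groups, b=1000, seed=7):
    """Bootstrap the overall mean by resampling whole groups
    (with replacement), mirroring the two-stage sampling mechanism."""
    rng = random.Random(seed)
    boot_stats = []
    for _ in range(b):
        resampled = rng.choices(groups, k=len(groups))
        ys = [y for g in resampled for y in g]
        boot_stats.append(sum(ys) / len(ys))
    boot_stats.sort()
    # simple percentile confidence interval
    lower = boot_stats[int(0.025 * b)]
    upper = boot_stats[int(0.975 * b)]
    return boot_stats, (lower, upper)
```

Resampling individual cases instead of groups would ignore the dependence within groups and understate the sampling variability of group-level statistics.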
If we carry out a bootstrap estimation for our example data introduced in Chapter 2, the
results are almost identical to the asymptotic FML results reported in Table 2.2. The estimates
differ by 0.01 at most, which is a completely trivial difference. Of course, the example data
in Chapter 2 are simulated, and all assumptions are fully met. Bootstrap estimates are most
attractive when we have reasons to suspect the asymptotic results, because we have non-
normal data. Bootstrapping is described in more detail in Chapter 13 where robust estimation
methods are discussed to deal with non-normality.

3.4 Significance Testing and Model Comparison

This section discusses procedures for testing significance and model comparison for the
regression coefficients and variance components.

3.4.1 Testing Regression Coefficients and Variance Components

Maximum likelihood estimation produces parameter estimates and corresponding standard
errors. These can be used to carry out a significance test of the form Z = (estimate) / (standard
error of estimate), where Z is referred to the standard normal distribution. This test is
known as the Wald test (Wald, 1943). The standard errors are asymptotic, which means that
they are valid for large samples. As usual, it is not precisely known when a sample is large
enough to be confident about the precision of the estimates. Simulation research suggests
that for accurate standard errors for level-2 variances, a relatively large level-2 sample size
is needed. For instance, simulations by van der Leeden, Busing and Meijer (1997) suggest
that with fewer than 100 groups, ML estimates of variances and their standard errors are
not very accurate. In ordinary regression analysis, a rule of thumb is to require 104 + p
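The Wald test itself is simple to compute: the estimate is divided by its standard error, and the resulting Z is referred to the standard normal distribution. A Python sketch (the function name is ours; the two-sided p-value uses the complementary error function):

```python
import math

def wald_test(estimate, se):
    # Z = estimate / standard error, referred to the standard normal;
    # two-sided p-value = P(|Z| > |z|) under the null hypothesis
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

z, p = wald_test(1.96, 1.0)
# z = 1.96, and p is close to the familiar 0.05
```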
Another random document with
no related content on Scribd:
The Project Gutenberg eBook of By motor to
the Golden Gate
This ebook is for the use of anyone anywhere in the United States
and most other parts of the world at no cost and with almost no
restrictions whatsoever. You may copy it, give it away or re-use it
under the terms of the Project Gutenberg License included with this
ebook or online at www.gutenberg.org. If you are not located in the
United States, you will have to check the laws of the country where
you are located before using this eBook.

Title: By motor to the Golden Gate

Author: Emily Post

Release date: June 6, 2024 [eBook #73784]

Language: English

Original publication: New York: D. Appleton, 1916

Credits: Peter Becker and the Online Distributed Proofreading Team


at https://www.pgdp.net (This file was produced from
images generously made available by The Internet Archive)

*** START OF THE PROJECT GUTENBERG EBOOK BY MOTOR TO


THE GOLDEN GATE ***
BY MOTOR TO
THE GOLDEN GATE

The Pacific at Last!

BY MOTOR
to the
GOLDEN GATE

BY
EMILY POST
ILLUSTRATED WITH
PHOTOGRAPHS and ROAD MAPS

NEW YORK AND LONDON


D. APPLETON AND COMPANY
1916

Copyright, 1916, by
D. APPLETON AND COMPANY
Copyright, 1915, by P. F. Collier & Son, Inc.

Printed in the United States of America


TO
MY YOUNGER SON
BRUCE
PREFACE
“Qui s’excuse s’accuse.” Which, I suppose, proves this a defence
to start with! But having been a few times accused, there are a few
explanations I want very much to make.
When this cross-continent story was first suggested, it seemed the
simplest sort of thing to undertake. All that was necessary was to
put down experiences as they actually occurred. No imagination, or
plot or characterization—could anything be easier? But when the
serial was published and letters began coming in, it became
unhappily evident that writing fact must be one of the most
unattainably difficult accomplishments in the world.
In the first place, only those who, having lived long in a particular
locality and knowing it in all its varying seasons, are qualified truly to
present its picture. The observations of a transient tourist are
necessarily superficial, as of one whose experiences are merely a
series of instantaneous impressions; at one time colored perhaps too
vividly, at another fogged; according to the sun or rain at one brief
moment of time.
It would be very pleasant to write nothing but eulogies of people
and places, but after all if a personal narrative were written like an
advertisement, praising everything, there would be no point in
praising anything, would there?
Compared with crossing the plains in the fifties, the worst stretch
of our most uninhabited country is today the easiest road
imaginable. There are no longer any dangers, any insurmountable
difficulties. To the rugged sons of the original pioneers, comments
upon “poor roads”—that are perfectly defined and traveled-over
highways—or “poor hotels”—where you can get not only a room to
yourself, but steam heat, electric light, and generally a private bath
—must seem an irritatingly squeamish attitude. “Poor soft
weaklings” is probably not far from what they think of people with
such a point of view.
On the other hand if I, who after all am a New Yorker, were to
pronounce the Jackson House perfect, the City of Minesburg
beautiful, the Trailing Highway splendid, everyone would naturally
suppose the Jackson House a Ritz, Minesburg an upper Fifth Avenue,
and the Trailing Highway a duplicate of our own state roads, to say
the least!
I am more than sorry if I offend anyone—it is the last thing I
mean to do—at the same time I think it best to let the story stand as
it was written; taking nothing back that seems to me true, but
acknowledging very humbly at the outset, that after all mine is only
one out of a possible fifty million other American opinions.
CONTENTS
I. It Can’t Be Done—But Then, It Is Perfectly Simple
II. Albany, First Stop
III. A Breakdown
IV. Pennsylvania, Ohio and Indiana
V. Luggage and Other Luxuries
VI. Did Anybody Say “Chicken”?
VII. The City of Ambition
VIII. A Few Chicagoans
IX. Tins
X. Mud!!
XI. In Rochelle
XII. The Weight of Public Opinion
XIII. Muddier!
XIV. One of the Fogged Impressions
XV. A Few Ways of the West
XVI. Halfway House
XVII. Next Stop, North Platte!
XVIII. The City of Recklessness
XIX. A Glimpse of the West That Was
XX. Our Little Sister of Yesterday
XXI. Ignorance With a Capital I
XXII. Some Indians and Mr. X
XXIII. With Nowhere to Go But Out
XXIV. Into the Desert
XXV. Through the City Unpronounceable to an Exposition Beautiful
XXVI. The Land of Gladness
XXVII. The Mettle of a Hero
XXVIII. San Francisco
XXIX. The Fair
XXX. “Unending Sameness” Was What They Said
XXXI. To Those Who Think of Following in Our Tire Tracks—To the Man Who Drives
XXXII. On the Subject of Clothes—Food Equipment—Expenses—Daily Expense Account
XXXIII. How Far Can You Go in Comfort?—Some Day
LIST OF ILLUSTRATIONS
The Pacific at last! (Frontispiece)
What we finally carried
Stowing the luggage
Leaving Gramercy Park, New York
Still in New York State
The crowd in less than a minute. “Out of the window” in Cleveland
One of the exciting things in motoring is wondering what sort of a hotel you will arrive at for the night
Hours and hours, across land as flat and endless as the ocean
A bedroom in the Union Pacific Hotel, North Platte—not much of a hardship, is it?
A straight, wide road; not even a shack in sight—and a speed limit of twenty miles an hour
Wyoming in the ranch country
Cripple Creek
In the Garden of the Gods
Colorado. Pike’s Peak in the distance
First cowboys and cattle
Halfway across a thrilling ford, wide and deep, on the Huerfano River
A glimpse of the West of yesterday
Your route leads through many Mexican and Indian villages
The Indian pueblo of Taos
To see the sleeping beauty of the Southwest, the path is by no means a smooth one to the motorist
Across the real desert
Our chauffeur takes a day off at the Grand Canyon of the Colorado
This is not a gallery in a Spanish palace, but a gallery in the Mission Inn at Riverside, California
In a California garden
Under Santa Barbara skies
Ostrich Rock, Monterey, California
On the seventeen-mile drive at Monterey
On a beautiful ocean road of California
The portico of a California house
Sometimes we struck a bad road
In order to cross here, E. M. built a bridge with the logs at the right
On the famous “staked plains” of the Southwest
BY MOTOR TO
THE GOLDEN GATE
CHAPTER I
IT CAN’T BE DONE—BUT THEN, IT IS
PERFECTLY SIMPLE
“Of course you are sending your servants ahead by train with your
luggage and all that sort of thing,” said an Englishman.
A New York banker answered for me: “Not at all! The best thing is
to put them in another machine directly behind, with a good
mechanic. Then if you break down the man in the rear and your own
chauffeur can get you to rights in no time. How about your
chauffeur? You are sure he is a good one?”
“We are not taking one, nor servants, nor mechanic, either.”
“Surely you and your son are not thinking of going alone! Probably
he could drive, but who is going to take care of the car?”
“Why, he is!”
At that everyone interrupted at once. One thought we were insane
to attempt such a trip; another that it was a “corking” thing to do.
The majority looked upon our undertaking with typical New York
apathy. “Why do anything so dreary?” If we wanted to see the
expositions, then let us take the fastest train, with plenty of books so
as to read through as much of the way as possible. Only one, Mr. B.,
was enthusiastic enough to wish he was going with us. Evidently,
though, he thought it a daring adventure, for he suggested an
equipment for us that sounded like a relief expedition: a block and
tackle, a revolver, a pickaxe and shovel, tinned food—he forgot
nothing but the pemmican! However, someone else thought of
hardtack, after which a chorus of voices proposed that we stay
quietly at home!
“They’ll never get there!” said the banker, with a successful man’s
finality of tone. “Unless I am mistaken, they’ll be on a Pullman inside
of ten days!”
“Oh, you wouldn’t do that, would you?” exclaimed our one
enthusiastic friend, B.
I hoped not, but I was not sure; for, although I had promised an
editor to write the story of our experience, if we had any, we were
going solely for pleasure, which to us meant a certain degree of
comfort, and not to advertise the endurance of a special make of car
or tires. Nor had we any intention of trying to prove that motoring in
America was delightful if we should find it was not. As for breaking
speed records—that was the last thing we wanted to attempt!
“Whatever put it into your head to undertake such a trip?”
someone asked in the first pause.
“The advertisements!” I answered promptly. They were all so
optimistic, that they went to my head. “New York to San Francisco in
an X— car for thirty-eight dollars!” We were not going in an X— car,
but the thought of any machine’s running such a distance at such a
price immediately lowered the expenditure allowance for our own.
“Cheapest way to go to the coast!” agreed another folder. “Travel
luxuriously in your own car from your own front door over the
world’s greatest highway to the Pacific Shore.” Could any motor
enthusiasts resist such suggestions? We couldn’t.
We had driven across Europe again and again. In fact I had in
1898 gone from the Baltic to the Adriatic in one of the few first
motor-cars ever sold to a private individual. We knew European
scenery, roads, stopping-places, by heart. We had been to all the
resorts that were famous, and a few that were infamous, but our
own land, except for the few chapter headings that might be read
from the windows of a Pullman train, was an unopened book—one
that we also found difficulty in opening. The idea of going occurred
to us on Tuesday and on Saturday we were to start, yet we had no
information on the most important question of all—which route was
the best to take. And we had no idea how to find out!
The 1914 Blue Book was out of print, and the new one for this
year not issued. I went to various information bureaus—some of
those whose advertisements had sounded so encouraging—but their
personal answers were more optimistic than definite. Then a friend
telegraphed for me to the Lincoln Highway Commission asking if
road conditions and hotel accommodations were such that a lady
who did not want in any sense to “rough it” could motor from New
York to California comfortably.
We wasted a whole precious thirty-six hours waiting for this
answer. When it came, a slim typewritten enclosure helpfully
informed us that a Mrs. Somebody of Brooklyn had gone over the
route fourteen months previously and had written them many
glowing letters about it. As even the most optimistic prospectus
admitted that in 1914 the road was as yet not a road, and hotels
along the sparsely settled districts had not been built, it was evident
that Mrs. Somebody’s idea of a perfect motor trip was independent
of roads or stopping-places.
Meanwhile I had been told that the best information was to be
had at the touring department of the Automobile Club. So I went
there.
A very polite young man was answering questions with a facility
altogether fascinating. He told one man about shipping his car—even
the hours at which the freight trains departed. To a second he gave
advice about a suit for damages; for a third he reduced New York’s
traffic complications to simplicities in less than a minute; then it was
my turn:
“I would like to know the best route to San Francisco.”
“Certainly,” he said. “Will you take a seat over here for a
moment?”
“This is the simplest thing in the world,” I thought, and opened my
notebook to write down a list of towns and hotels and road
directions. He returned with a stack of folders. But as I eagerly
scanned them, I found they were all familiarly Eastern.
“Unfortunately,” he said suavely, “we have not all our information
yet, and we seem to be out of our Western maps! But I can
recommend some very delightful tours through New England and
the Berkshires.”
“That is very interesting, but I am going to San Francisco.”
His attention was fixed upon a map of the “Ideal Tour.” “The New
England roads are very much better,” he said.
“But, you see, San Francisco is where I am going. Do you know
which route is, if you prefer it, the least bad?”
“Oh, I see.” He looked sorry. “Of course if you must cross the
continent, there is the Lincoln Highway!”
“Can you tell me how much work has been done on it—how much
of it is finished? Might it not be better on account of the early season
to take a Southern route? Isn’t there a road called the Santa Fé
trail?”
“Why, yes, certainly,” said the nice young man. “The road goes
through Kansas, New Mexico and Arizona. It would be warmer
assuredly.”
“How about the Arizona desert? Can we get across that?”
“That is the question!”
“Perhaps we had better just start out and ask the people living
along the road which is the best way farther on?”
The young man brightened at once. “That would have been my
suggestion from the beginning.”
Once outside, however, the feasibility of asking our road as we
came to it did not seem very practical, so I went to Brentano’s to
buy some maps. They showed me a large one of the United States
with four routes crossing it, equally black and straight and inviting. I
promptly decided upon the one through the Allegheny Mountains to
Pittsburgh and St. Louis when two women I knew came in, one of
them Mrs. O., a conspicuous hostess in the New York social world,
and a Californian by birth. “The very person I need,” I thought. “She
knows the country thoroughly and her idea of comfort and mine
would be the same.”
“Can you tell me,” I asked her, “which is the best road to
California?”
Without hesitating she answered: “The Union Pacific.”
“No, I mean motor road.”
Compared with her expression the worst skeptics I had
encountered were enthusiasts. “Motor road to California!” She
looked at me pityingly. “There isn’t any.”
“Nonsense! There are four beautiful ones and if you read the
accounts of those who have crossed them you will find it impossible
to make a choice of the beauties and comforts of each.”
She looked steadily into my face as though to force calmness to
my poor deluded mind. “You!” she said. “A woman like you to
undertake such a trip! Why, you couldn’t live through it! I have
crossed the continent one hundred and sixty odd times. I know
every stick and stone of the way. You don’t know what you are
undertaking.”
“It can’t be difficult; the Lincoln Highway goes straight across.”
“In an imaginary line like the equator!” She pointed at the map
that was opened on the counter. “Once you get beyond the
Mississippi the roads are trails of mud and sand. This district along
here by the Platte River is wild and dangerous; full of the most
terrible people, outlaws and ‘bad men’ who would think nothing of
killing you if they were drunk and felt like it. There isn’t any hotel.
Tell me, where do you think you are going to stop? These are not
towns; they are only names on a map, or at best two shacks and a
saloon! This place North Platte—why, you couldn’t stay in a place like
that!”
I began to feel uncertain and let down, but I said, “Hundreds of
people have motored across.”
“Hundreds and thousands of people have done things that it
would kill you to do. I have seen immigrants eating hunks of fat pork
and raw onions. Could you? Of course people have gone across, men
with all sorts of tackle to push their machines over the high places
and pull them out of the deep places; men who themselves can
sleep on the roadside or on a barroom floor. You may think ‘roughing
it’ has an attractive sound, because you have never in your life had
the slightest experience of what it can be. I was born and brought
up out there and I know.” She quietly but firmly folded the map and
handed it to the clerk. “I am sorry,” she said, “if you really wanted to
go! By and by maybe if they ever build macadam roads and put up
good hotels—but even then it would be deadly dull.”
For about five minutes I thought I had better give it up, and I
called up my editor. “It looks as though we could not get much
farther than the Mississippi.”
“All right,” he said, cheerfully, “go as far as the Mississippi. After
all, your object is merely to find out how far you can go pleasurably!
When you find it too uncomfortable, come home!”
What We Finally Carried

No sooner had he said that than my path seemed to stretch
straight and unencumbered to the Pacific Coast. If we could get no
further information, we would start for Philadelphia, Pittsburgh and
St. Louis, as we had many friends in these cities, and get new
directions from there, but as a last resort I went to the office of a
celebrated touring authority and found him at his desk.
“I would like to know whether it will be possible for me to go from
here to San Francisco by motor?”
“Sure, it’s possible! Why isn’t it?”
“I have been told the roads are dreadful and the accommodations
worse.”
He surveyed me from head to foot with about the same
expression that he might have been expected to use if I had asked
whether one could safely travel to Brooklyn.
“You won’t find Ritz hotels every few miles, and you won’t find
Central Park roads all of the way. If you can put up with less than
that, you can go—easy!” Whereupon he reached up over his head
without even looking, took down a map, spread it on the table
before him, and unhesitatingly raced his blue pencil up the edge of
the Hudson River, exactly as the pencil of Tad draws cartoons at the
movies.
“You go here—Albany, Utica, Syracuse.”
“No, please!” I said. “I want to go by way of Pittsburgh and St.
Louis.”
“You asked for the best route to San Francisco!” He looked rather
annoyed.
“Yes, but I want to go by way of St. Louis.”
“Why do you want to go to St. Louis?”
“Because we have friends there.”
“Well, then, you had better take the train and go and see them!”
Indifferently he took down another map and made a few casual blue
marks on the mountains of Pennsylvania. “They’re rebuilding roads
that will be fine later in the season, but at the moment [April, 1915]
all of these places are detours. You’ll get bad grades and mud over
your hubs! Of course, if you’re set on going that way, if you want to
burn any amount of gasoline, cut your tires to pieces, and strain
your engine—go along to St. Louis. It’s all the same to me; I don’t
own the roads! But you said you wanted to take a motor trip.”
“Then Chicago is much the best way?”
“It is the only way!”
He did not wait for my agreement, but throwing aside the second
map and turning again to the first, his pencil swooped down upon
Buffalo and raced to Cleveland as though it fitted in a groove. He
seemed to be in a mental aeroplane looking actually down upon the
roads below.
“There is a detour you will have to take here. You turn left at a
white church. This stretch is dusty in dry weather, but along here,”
his pencil had now reached Iowa and Nebraska, “you will have no
trouble at all—if it doesn’t rain.”
“And if it rains?”
“Well, you can get out your solitaire pack!”
“For how long?” The vision of the sort of road it must be if that
man thought it impassable was hard to imagine.
“Oh, I don’t know; a week or two, even three maybe. But when
they are dry there are no faster roads in the country. What kind of a
car are you going in?”
I told him proudly. Instead of being impressed by its make and
power he remarked: “Humph! You’d better go in a Ford! But suit
yourself! At any rate, you can open her wide along here, as wide as
you like if the weather is right.” At the foot of the Rocky Mountains
his pencil swerved far south.
“Way down there?” I asked. “That is all desert. Can we cross the
desert?”
“Why can’t you?” He looked me over from head to foot. I had felt
he held small opinion of me from the start. “I only wondered if the
roads were passable,” I answered meekly.
“The roads are all right.” He accented the word “roads.”
“I was wondering if there were hotels.”
“And what if there aren’t? Splendid open dry country; won’t hurt
anyone to sleep out a night or two. It’d do you good! A doctor’d
charge you money for that advice. I’m giving it to you free!”
On the doorstep at home I met my amateur chauffeur.
“Have you found out about routes?” he asked.
“We go by way of Cleveland and Chicago.”
He looked far from pleased. “Is that so much the best way?”
“It is the only way,” and I imitated unconsciously the voice of the
oracle of the touring bureau.

One would have thought that we were starting for the Congo or
the North Pole! Friends and farewell gifts poured in. It was quite
thrilling, although myself in the rôle of a venturesome explorer was a
miscast somewhere. Every little while Edwards, our butler, brought in
a new package.
One present was a dark blue silk bag about twenty inches square
like a pillow-case. At first sight we wondered what to do with it. It
turned out afterward to be the most useful thing we had except a tin
box, the story of which comes later. The silk bag held two hats
without mussing, no matter how they were thrown in, clean gloves,
veils, and any odd necessities, even a pair of slippers. The next
friend of mine going on a motor trip is going to be sent one exactly
like it!
By far the most resplendent of our presents was a marvel of a
luncheon basket. Edwards staggered under its massiveness, and we
all gathered around its silver-laden contents; bottles and jars, boxes
and dishes, flat silver and cutlery, enamelware and glass, food
paraphernalia enough to set before all the kings of Europe.
“I could not bear,” wrote the giver, “to think of your starving in the
desert.”
Stowing the Luggage

Mr. B. brought us a block and tackle and two queer-looking canvas
squares that he explained were African water buckets. All we needed
further, he told us, were fur sleeping-bags and we would be quite
fixed!
Another thing sent us was an air cushion. Air cushions make me
feel seasick, but the lady who traveled with us loved them. By the
way, we added a passenger at the last moment. On Friday
afternoon, a member of our family announced she was going with us
to protect us.
“The only thing is,” we said, “there is no place for you to sit except
in the back underneath the luggage.”
“I adore sitting under luggage; it is my favorite way of traveling,”
she replied. And as we adore her, our party became three.
We had expected to leave New York about nine o’clock in the
morning, but at eleven we were still making selections of what we
most needed to take with us, and finally choosing the wrong things
with an accuracy that amounted to a talent. Besides our regular
luggage, the sidewalk was littered with all the entrancing-looking
traveling equipment that had been sent us, and nowhere to stow it.
By giving it all the floor space of the tonneau, we managed to get
the big lunch basket in. Then we helped in the lady who traveled
with us and added a collection of six wraps, two steamer rugs, and
three dressing-cases, a typewriter, a best big camera and a little
better one—with both of which we managed to take the highest
possible percentage of worst pictures that anyone ever brought
home—a medicine chest, and various other paraphernalia neatly
packed over and around her. Of this collection our passenger was
allowed one of the dressing-cases, two wraps and a big bag. As
there was not room for three bags on the back, my son and I
divided a small motor trunk between us; I took the trays and he the
bottom. It seemed at the time a simple enough arrangement.
On our way up Fifth Avenue, two or three times in the traffic
stops, we found the motors of friends next to us. Seeing our
quantity of luggage, each asked: “Where are you going?”
Very importantly we answered: “To San Francisco!”
“No, really, where are you going?”
“SAN-FRAN-CIS-CO!!!” we called back. But not one of them
believed us.
CHAPTER II
ALBANY, FIRST STOP[1]

We had intended making Syracuse our first night’s stopping-place.
It can easily be done, but as we were so late starting—it was nearly
half-past one—we decided upon Albany instead. We felt very self-
important; it even seemed that people ought to cheer us a little as
we passed. A number of persons, especially boys, did look with
curiosity at our unusually foreign type of car—solid wheels and
exhaust tubes through the side of the hood always attract attention
in America—but no one seemed to divine or care about the thrilling
adventure we were setting out upon!
For about thirty miles outside of New York the road grew worse
and worse. Through Dobbs Ferry and Ardsley the surface looked
fairly good, but was full of brittle places. Our chauffeur says that the
word brittle has no sense, but it is the only one I can think of to
convey the sudden sharp flaked-off places that would snap the
springs of a car going at fair speed.
I was rather perturbed; because if the road was as bad as this
near home, what would it be further along? But the further we went
the better it became, and for the latter seventy or eighty miles it was
perfect.
The Hudson River scenery, the lower end of it, always oppressed
me; I can never think of anything but the favorite fiction descriptions
of the “mansions where the wealthy reside.” Such overwhelmingly
serious piles of solid masonry, each set squarely in the middle of a
seed catalog painter’s dream of pictorial lawn! Steep hills, steep
houses, steep expenditure, typify the lower Hudson, but the scenery
a hundred miles above the river’s mouth is enchanting! Wide,
beautiful views of rolling country; great comfortable-looking houses
with hundreds of acres about them; here, though many are worth
fortunes, one feels that they were built solely to answer the
individual need of their owners, and as homes.
Out on a knoll, with the river spread like a great silver mirror in
the distance, we christened our tea-basket. It took us five minutes
to burrow down and unpile all the things we had on top of it, and
five more to find in which compartment were huddled a few
sandwiches and in which other box was the cake. For twenty
minutes we boiled water in our beautiful little silver kettle, but as at
the end of that time the boiling water was tepid, we gave it up and
ate our sandwiches as recommended by the Red Queen in “Alice”
who offered her dry biscuits for thirst. Then we spent fifteen minutes
in putting everything away again.

Leaving Gramercy Park, New York

“When we get out on the prairies, where can we get supplies
enough to fill it?” I wondered. Our “chauffeur” mumbled something
about “strain on tires” and “not driving a motor truck.”
“It is a most wonderfully magnificent basket,” said the lady who
was traveling with us, rather wistfully, as she braced all the heaviest
pieces of luggage between her and it.
Not counting the time out for tea, which we didn’t have, it took us
five hours and a half from Fifty-ninth Street, New York, to the Ten
Eyck at Albany.
The run should have been one hundred and fifty miles, but we
made it one hundred and sixty because we lost our way at Fishkill.
We had no Blue Book, but had been told we need only follow the
river all the way. At Fishkill the road runs into the woods and the
river disappears until it seems permanently lost! We wandered
around and around a mountain in a wood for about ten miles before
we discovered a signpost pointing the way to Albany!
Fortunately we had telegraphed ahead for rooms at the Ten Eyck,
or they would not have been able to take us in. The hotel was filled
to overflowing with senators and assemblymen, but we had very
comfortable rooms and delicious coffee in the morning before we left
for Syracuse.
