Chapter 2

This document discusses Generalized Linear Models (GLMs), which extend ordinary regression models to accommodate nonnormal response distributions and multiple explanatory variables. It outlines the three components of GLMs: the random component (response variable and its distribution), the systematic component (explanatory variables in a linear predictor), and the link function (connecting the random and systematic components). The document also introduces specific GLMs for binary and count data, including logistic regression and Poisson loglinear models.


Generalized Linear Models

In the previous chapter we focused on methods for two-way contingency tables. Most studies, however, have several explanatory variables, which may be continuous as well as categorical. The goal is usually to describe their effects on response variables, and modeling the effects helps us do this efficiently. A good-fitting model evaluates effects, includes relevant interactions, and provides smoothed estimates of response probabilities. This chapter focuses on model building for categorical response variables. It introduces a family of generalized linear models that contains the most important models for categorical responses as well as standard models for continuous responses.

2.1 GENERALIZED LINEAR MODELS

Generalized linear models (GLMs) extend ordinary regression models to encompass nonnormal response distributions and modeling of functions of the mean. Three components specify a generalized linear model: a random component identifies the response variable Y and its probability distribution; a systematic component specifies the explanatory variables used in a linear predictor function; and a link function specifies the function of E(Y) that the model equates to the systematic component. Nelder and Wedderburn (1972) introduced the class of GLMs.

2.1.1 Components of Generalized Linear Models

The random component of a GLM consists of a response variable Y with independent observations (y_1, . . . , y_N) from a distribution in the natural exponential family. This family has probability density function or mass function of the form

f(y_i; θ_i) = a(θ_i) b(y_i) exp[ y_i Q(θ_i) ]     (eq. 1)

Several important distributions are special cases, including the Poisson and binomial. The value of the parameter θ_i may vary for i = 1, 2, . . . , N as a function of values of the explanatory variables. The term Q(θ_i) is called the natural parameter of the distribution.

The systematic component of a GLM relates a vector (η_1, η_2, . . . , η_N) to the explanatory variables through a linear model. Let x_ij denote the value of explanatory variable j (j = 0, 1, 2, . . . ) for subject i. Then

η_i = Σ_j β_j x_ij,   i = 1, 2, . . . , N

This linear combination of explanatory variables is called the linear predictor.

The third component of a GLM is a link function that connects the random and systematic components. Let μ_i = E(Y_i), i = 1, 2, . . . , N. The model links μ_i to η_i by η_i = g(μ_i), where the link function g is a monotonic, differentiable function. Thus, g links μ_i to the explanatory variables through the formula

g(μ_i) = Σ_j β_j x_ij,   i = 1, 2, . . . , N

The link function g(μ_i) = μ_i, called the identity link, has η_i = μ_i. It specifies a linear model for the mean itself. This is the link function for ordinary regression with normally distributed Y. The link function that transforms the mean to the natural parameter is called the canonical link. For it, g(μ_i) = Q(θ_i), and Q(θ_i) = Σ_j β_j x_ij.
In summary, a GLM is a linear model for a transformed mean of a response variable that has distribution in the
natural exponential family. We now illustrate the three components by introducing the key GLMs for discrete
response variables.

2.1.2 Binomial Logit Models for Binary Data


Many categorical response variables have only two categories. The observation for each subject might be classified as a "success" or a "failure". Represent these outcomes by 1 and 0. The Bernoulli distribution for binary random variables specifies probabilities for the two outcomes, for which P(Y = 1) = π and P(Y = 0) = 1 − π.

When Y has a Bernoulli distribution with parameter π, the probability mass function is

f(y; π) = π^y (1 − π)^(1−y) = (1 − π) [π/(1 − π)]^y = (1 − π) exp[ y log(π/(1 − π)) ]

for y = 0 and 1. This is in the natural exponential family, identifying θ with π, a(π) = 1 − π, b(y) = 1, and Q(π) = log[π/(1 − π)]. The natural parameter is the log odds of response outcome 1, the logit of π. This is the canonical link function. GLMs using the logit link are often called logistic regression models or logit models.
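As a quick numerical illustration (a minimal Python sketch, not from the text), the logit link and its inverse, the logistic function, can be written as:

```python
import math

def logit(p):
    """Canonical link for the Bernoulli GLM: the log odds of p."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Inverse link (logistic function): maps a linear predictor back to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

# The two functions are inverses of each other
p = 0.3
print(inv_logit(logit(p)))  # recovers p
```

Because inv_logit always returns a value in (0, 1), any linear predictor value is mapped to a legitimate probability.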

2.1.3 Poisson Loglinear Models for Count Data

Some response variables have counts as their possible outcomes. In a health survey, each observation might be the number of illnesses in the past year for which the subject visited a doctor. Counts also occur as entries in contingency tables.

The simplest distribution for count data is the Poisson. Like counts, Poisson variates can take any nonnegative integer value. Let Y denote a count and let μ = E(Y). The Poisson probability mass function for a count Y is

f(y; μ) = e^(−μ) μ^y / y!,   y = 0, 1, 2, . . .

This has natural exponential form (eq. 1) with θ = μ, a(μ) = exp(−μ), b(y) = 1/y!, and Q(μ) = log(μ). The natural parameter is log(μ), so the canonical link function is the log link η = log(μ). The model using this link function is

log(μ_i) = Σ_j β_j x_ij

This model is called a Poisson loglinear model.
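The exponential-family factorization of the Poisson can be checked numerically (an illustrative Python sketch, using the a(μ), b(y), and Q(μ) identified above):

```python
import math

def poisson_pmf(y, mu):
    """Standard Poisson probability mass function."""
    return math.exp(-mu) * mu**y / math.factorial(y)

def poisson_expfam(y, mu):
    """Same pmf written in natural exponential form a(mu) * b(y) * exp(y * Q(mu))."""
    a = math.exp(-mu)            # a(mu) = exp(-mu)
    b = 1.0 / math.factorial(y)  # b(y) = 1/y!
    Q = math.log(mu)             # natural parameter Q(mu) = log(mu)
    return a * b * math.exp(y * Q)

# The two expressions agree for every count y
for y in range(6):
    assert abs(poisson_pmf(y, 2.5) - poisson_expfam(y, 2.5)) < 1e-12
print("factorizations agree")
```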

Generalized Linear Models for Binary Data

Linear Probability Model

For a binary response Y, let π(x) denote P(Y = 1) at a given setting x of the explanatory variable, so that E(Y) = π(x) = P(Y = 1). The regression model

π(x) = α + βx

is called a linear probability model. The linear probability model has a major structural defect: probabilities must fall between 0 and 1, whereas linear functions take values over the entire real line.
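The defect is easy to demonstrate numerically (hypothetical intercept and slope, chosen only for illustration):

```python
# Linear probability model pi(x) = alpha + beta * x,
# with illustrative (hypothetical) coefficient values
alpha, beta = 0.02, 0.05

def linear_prob(x):
    return alpha + beta * x

print(linear_prob(10))  # still a legitimate probability
print(linear_prob(25))  # exceeds 1, so it cannot be a probability
```

For large enough x the fitted value leaves the unit interval, which motivates link functions that constrain the mean.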

Logistic Regression Model


Because of the structural problems with the linear probability model, it is more fruitful to study models implying a curvilinear relationship between x and π(x). The model

π(x) = exp(α + βx) / [1 + exp(α + βx)]

is called the logistic regression model.

 When the model holds with β = 0, the binary response is independent of x.
 When β > 0, π(x) increases as x increases.
 When β < 0, π(x) decreases as x increases.

For this model, the odds of making response "YES = 1" are

π(x) / [1 − π(x)] = exp(α + βx) = e^α (e^β)^x

This formula provides a basic interpretation for β: the odds increase multiplicatively by e^β for every unit increase in x. The log odds have the linear relationship

log[ π(x) / (1 − π(x)) ] = α + βx

For multiple predictors,

log[ π(x) / (1 − π(x)) ] = α + β_1 x_1 + β_2 x_2 + . . . + β_p x_p
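The multiplicative-odds interpretation can be verified directly (illustrative α and β, not estimates from the text):

```python
import math

alpha, beta = -2.0, 0.7  # hypothetical logistic regression coefficients

def pi(x):
    """Logistic regression model: pi(x) = exp(alpha + beta*x) / (1 + exp(alpha + beta*x))."""
    eta = alpha + beta * x
    return math.exp(eta) / (1.0 + math.exp(eta))

def odds(x):
    return pi(x) / (1.0 - pi(x))

# A unit increase in x multiplies the odds by exp(beta)
print(odds(3.0) / odds(2.0))
print(math.exp(beta))
```

The two printed values coincide, since the odds are exactly e^α (e^β)^x under the model.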

Binomial GLM for 2×2 Contingency Tables

Among the simplest GLMs for a binary response is the one having a single explanatory variable X that is also binary. Label its values by 0 and 1. For a given link function, the GLM

link( π(x) ) = α + βx

has the effect of X described by β = link[ π(1) ] − link[ π(0) ].

For the identity link, β=π ( 1 )−π ( 0 ) is the difference between proportions.

For the log link, β = log[ π(1) ] − log[ π(0) ] = log[ π(1)/π(0) ] is the log relative risk.

For the logit link,

β = logit[ π(1) ] − logit[ π(0) ] = log[ π(1)/(1 − π(1)) ] − log[ π(0)/(1 − π(0)) ] = log[ (π(1)/(1 − π(1))) / (π(0)/(1 − π(0))) ]

is the log odds ratio.
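For a hypothetical 2×2 table (counts invented purely for illustration), the three link functions give the three familiar effect measures:

```python
import math

# Hypothetical 2x2 table: rows are x = 1 and x = 0,
# with counts of successes and failures in each row
success = {1: 30, 0: 10}
failure = {1: 70, 0: 90}

def prop(x):
    """Sample proportion of successes in row x."""
    return success[x] / (success[x] + failure[x])

p1, p0 = prop(1), prop(0)

diff = p1 - p0                     # identity link: difference of proportions
log_rr = math.log(p1 / p0)         # log link: log relative risk
log_or = math.log((p1 / (1 - p1)) / (p0 / (1 - p0)))  # logit link: log odds ratio

print(diff, math.exp(log_rr), math.exp(log_or))
```

With these counts p1 = 0.3 and p0 = 0.1, so the difference of proportions is 0.2 and the relative risk is 3; the odds ratio is larger, illustrating that the three links measure the effect of X on different scales.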

Example: Snoring and Heart Disease

Table 2: Relationship between Snoring and Heart Disease

Snoring (score x)        Heart Disease: Yes    No
Never (0)                24                    1355
Occasionally (2)         35                    603
Nearly every night (4)   21                    192
Every night (5)          30                    224

We illustrate the linear probability model with Table 2, from an epidemiological survey of 2484 subjects to
investigate snoring as a risk factor for heart disease. Those surveyed were classified according to their spouses’
report of how much they snored. The model states that the probability of heart disease is linearly related to the level
of snoring x. We treat the rows of the table as independent binomial samples. No obvious choice of scores exists for
categories of x. We used (0, 2, 4, 5), treating the last two levels as closer than the other adjacent pairs. ML estimates
and standard errors are the same if we use a data file of 2484 binary observations or if we enter the four binomial
totals of “yes” and “no” responses listed in Table 2.

From SPSS, we get π̂(x) = 0.0172 + 0.0198x.


For nonsnorers (x = 0), the estimated proportion of subjects having heart disease is 0.0172. Table 2 shows the sample proportions and the fitted values (estimated values of E(Y) for a GLM) for this model.

For the snoring data in Table 2, SPSS reports the logistic regression fit

logit[ π̂(x) ] = −3.87 + 0.40x

The positive β̂ = 0.40 reflects the increased incidence of heart disease at higher snoring levels. Figure 1 displays the fit. The fit is close to linear over this narrow range of estimated probabilities, and results are similar to those for the linear probability model.
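The two fitted models can be compared at the snoring scores (0, 2, 4, 5); this Python sketch simply evaluates the fitted probabilities from the coefficient estimates reported above:

```python
import math

scores = [0, 2, 4, 5]  # snoring levels: never, ..., every night

def linear_fit(x):
    """Fitted linear probability model reported in the text."""
    return 0.0172 + 0.0198 * x

def logit_fit(x):
    """Fitted logistic regression model reported in the text."""
    eta = -3.87 + 0.40 * x
    return math.exp(eta) / (1.0 + math.exp(eta))

for x in scores:
    print(x, round(linear_fit(x), 4), round(logit_fit(x), 4))
```

Over this narrow range of small probabilities the two sets of fitted values are close, consistent with the near-linear fit noted above.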
