1
The Hedonic Pricing Method
Professor A. Markandya
Department of Economics and International
Development University of Bath
[email protected] tel. +44 1225 386954
Environmental Economics 2
March 2006
2
Hedonic pricing method
Revealed preferences
No questionnaire!
We gather data that come from the market
No need to build a hypothetical market
Where do we decide to live?
Why do we choose a specific location?
Which factors push companies to choose one location
rather than another?
Which characteristics of an area affect housing prices?
Which are the important elements of a house that
determine its price?
3
The choice of location
Housing is a composite good
Distance from work, availability of public services, distance from
schools, availability of green areas, availability of sport facilities,
characteristics of housing (# of bedrooms, # of bathrooms, flat,
detached, etc.) etc.
We assume that buyers choose houses that maximize their utility
The constraints in the maximization problem are given by income,
the price of the houses and the level of taxes
=> therefore, the housing market gives us some information on
buyers' preferences for housing and for its location
4
Composite goods
In the conjoint analysis and in the multi-site travel cost model we
saw that goods (or sites) can be described by a set of attributes or
characteristics.
The hedonic pricing method uses the same idea that goods are
composed of a set of characteristics.
Consider the characteristics of a house:
Number of floors, presence of a garden, GCH, number of bedrooms,
number of bathrooms, square footage of the house, type of house,
age, materials, etc.
And also:
Distance from public transport, distance from the city centre,
distance from main roads, distance from shops, distance from sport
facilities, crime rate, average income of inhabitants, presence of a
university, etc.
The composite good has a price, but there is no explicit price for
each characteristic that composes the good.
5
The hedonic pricing method
Problem of estimating hedonic equations
Hedonic prices are identified through a comparison of similar goods
that differ in the quality of one characteristic
The basic idea is to use the systematic variation in the price of
a good that can be explained by an environmental
characteristic of the good. This is the starting point to assess
the WTP for the environmental characteristic
We look at market data!
Real transactions!
6
Example
Let's consider 2 residential properties identical in all
other characteristics and in location.
The only difference is that house A has 2 bedrooms,
while house B has 3 bedrooms.
In a competitive market, the price difference between the
two houses reflects the value of the additional room of
house B.
If the price difference between the two houses is less
than buyers' WTP for the additional room, then buyers
will try to buy house B, driving up its price until the
equilibrium is reached.
In the same way, if house A costs much less than house
B, buyers will increase the demand for house A, driving
up its price.
7
The hedonic pricing method applies this simple concept
to the environmental characteristics of residential
properties
The price difference between houses that have different
levels of environmental quality, keeping constant all
other characteristics, reflects the WTP for the different
level of environmental quality
=> we can assess the value of an environmental quality,
according to market prices of residential properties
=> variation in environmental quality affects the price of
housing
8
A bit of history
1926 Waugh studies the variation of prices of vegetables
1938 Court looks at the car market in Detroit
1967 first application to the housing market: Ridker and
Henning => effects of air pollution on prices of housing
1974 Rosen describes the first formal model of the
hedonic pricing method
Other applications:
Agricultural goods
Cars
Wine
Job market
9
Rosen's model
Consumers (buyers) have a utility function:
U(s,n,c)
s = house characteristics
n = characteristics of the area where the house is located
c = other consumption goods
Budget constraint:
m = c + p(s,n)
m = income
p(s,n) expenditure for a house
p(s,n) is assumed to change in a non-linear relationship with the
characteristics of houses. That is, the cost of a house changes in an
unknown way with the number of rooms, etc.
c is the expenditure for all other goods
10
The maximization of the utility function subject to the budget
constraint gives the usual first order conditions.
That is, the marginal rate of substitution between each characteristic
n and the consumption of other goods is equal to the ratio of the price
(coefficient) of n to the price of c.
The price of c is our numeraire and we put it equal to 1.
The price of n describes the price of a marginal change in n.
The first order conditions are:
$$\frac{U_n(s,n,c)}{U_c(s,n,c)} = p_n(s,n), \qquad p_n = \frac{\partial p(s,n)}{\partial n}$$
(U_n is the partial derivative of U with respect to n)
First order conditions simply say that the consumer (buyer) is willing
to pay p_n for a marginal change of n
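A short derivation may make the step to the first order condition explicit. It is a sketch that uses a Lagrangian with a multiplier \lambda, which is notation assumed here and not taken from the slides; s is held fixed and the price of c is 1:

```latex
\begin{aligned}
\mathcal{L} &= U(s,n,c) + \lambda\,\bigl[\,m - c - p(s,n)\,\bigr] \\
\frac{\partial \mathcal{L}}{\partial n} &= U_n - \lambda\, p_n = 0 \\
\frac{\partial \mathcal{L}}{\partial c} &= U_c - \lambda = 0 \\
\Rightarrow \quad \frac{U_n}{U_c} &= p_n
\end{aligned}
```

Dividing the first condition by the second eliminates \lambda, which is why the marginal rate of substitution equals the implicit price of n.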
11
Utility maximization and budget constraint
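This looks like a normal example from your microeconomics class.
We only add a non-linear constraint for a given value of s, s*:
[Figure: indifference curve U(s*,n,c) and the non-linear budget constraint m = c + p(s*,n), drawn in (n, c) space; the optimum is marked U_n/U_c = p_n.]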
12
The hedonic price function
The function that describes how housing price changes
when housing characteristics change:
p(s,n)
is the hedonic price function
The derivative of the function with respect to one of the
characteristics n is the implicit price of n.
If we knew the hedonic price function and the implicit
price of n, we could estimate buyers' WTP for n, given
that this is equal to the marginal rate of substitution
between n and the other goods (numeraire)
13
Indifference curves
The budget constraint says that what we don't spend on other goods
is spent on housing, p(s,n):
c = m - p(s,n)
The utility function can be written in this way:
U(s,n,c) = U(s,n,m - p(s,n))
Therefore we can describe the utility function of consumers (buyers)
with indifference curves (for given values of m and s):
Each indifference curve gives, for a constant level of utility, the
combinations of expenditure on housing and n, for a given level of income and s.
[Figure: indifference curve U, with n on one axis and p(s*,n) on the other.]
14
Heterogeneous consumers
People with different incomes have different indifference curves,
even if they have the same preferences (U has the same functional
form for all respondents)
People with different preferences have different indifference curves
In a world of heterogeneous consumers (buyers) that have different
levels of income, we have a continuum of indifference curves:
[Figure: a continuum of indifference curves, with n on one axis and p(s*,n) on the other.]
15
Hedonic equilibrium
Suppose that consumers (buyers) take the hedonic
price function as exogenous
Consumers (buyers) maximize utility subject to the budget constraint
and to the hedonic price function:
[Figure: hedonic price function p(s*,n) plotted against n.]
16
Hedonic equilibrium considering the supply
The hedonic price function comes from the equilibrium of demand
and supply of housing. Both are considered exogenous.
Sellers have isoprofit curves ()
[Figure: p(s*,n) plotted against n, with the buyers' curves U_i and U_k and the sellers' curve b.]
17
Marginal Willingness To Pay
The main characteristic of the model is that buyers and sellers are
efficiently matched along the hedonic price function
At any point along the hedonic price function, buyers' marginal
willingness to pay (and sellers' willingness to accept) for a change in
n is given by the derivative of the hedonic price function with respect
to n.
This implicit price changes with n if the hedonic price function is non
linear.
The model can be generalized to the case where we consider
several characteristics of residential properties and of the area
where houses are located:
p(x_1, x_2, ..., x_k)
18
Model estimate
Now we need to specify a functional form for p.
A common functional form is the semi-log:
$$\ln p_i = \alpha + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_k x_{ki} + \varepsilon_i$$
The coefficients of the regression function give the implicit price, in
natural logarithm terms, of the characteristics of the house
The implicit price can be estimated for specific values of the
characteristics of houses (for example, the average value)
For the semi-log function, the implicit price of x_1 is given by:
$$\frac{\partial p}{\partial x_1} = \beta_1 p$$
\beta_1 gives the proportional change in the price of housing for a
one-unit change in x_1 (approximately 100 \beta_1 percent)
We usually estimate the implicit price at the average value of housing
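As a minimal sketch of the semi-log calculation: the coefficient and the average price below are hypothetical numbers chosen only for illustration, and the function name is not from the slides.

```python
def implicit_price_semilog(beta: float, price: float) -> float:
    """Implicit price dp/dx for ln(p) = alpha + beta * x + ...; equals beta * p."""
    return beta * price


# Hypothetical values: a coefficient of 0.05 and an average house price of 200,000.
beta_1 = 0.05
average_price = 200_000.0

marginal_price = implicit_price_semilog(beta_1, average_price)
print(f"Implicit price of one more unit of x1: {marginal_price:,.0f}")
```

Because the implicit price is proportional to p, evaluating it at a different price level (for example the median rather than the mean) gives a different number for the same coefficient.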
19
Some limitations and assumptions
Perfect information:
Buyers observe the characteristics of houses and are able to perfectly
describe the hedonic price function
Buyers can purchase whatever combination of characteristics they
desire.
They can always find the combination of bedrooms, bathrooms, location
of the house that they want
Implicit prices allow us only to assess marginal variations in the
characteristics of houses (if we assume that all buyers are
identical, we can consider non-marginal changes as well, but this is
too strong an assumption!)
Example: if the average house has 3 bedrooms and costs X, I cannot
say that buyers are willing to pay Y for a house that has 7 bedrooms.
We can't say that an increase of 4 bedrooms is a marginal change
The estimate of non-marginal variations requires the estimate of
individual demand parameters, which is very difficult
20
Econometric problems
Multicollinearity
if a house has several bedrooms, it will likely have several bathrooms,
etc.
distances: don't use too many distances in your function
Heteroskedasticity
Spatial autocorrelation
The value of one house will be influenced by the value of surrounding
houses
Market extension: homogeneous markets => bias
If I only use the data of sold properties and do not consider the
characteristics of unsold properties, my coefficient can be biased
(sample selection bias)
Solution: two-step estimate. 1) Probit model for the probability of a sale,
using both sold and unsold properties. 2) Regression model with only sold
properties, adding the inverse Mills ratio calculated in step 1. Check whether
the coefficient of the inverse Mills ratio is significantly different from zero.
If it is not, then delete it from the regression
21
Sample selection problem
Formally, we observe the sale of a property i only if the profit from
the sale is greater than the discounted sum of profits from keeping
the property. Therefore, the net profits from selling a property can
be described by the following:
$$y_i^* = z_i \alpha + \eta_i$$
where z_i is a vector of site characteristics and location attributes,
and \eta_i is the error term, assumed to be i.i.d. standard normal.
However, we only observe net profits when $y_i^* > 0$, that is, only
when a sale occurs. If the error terms in the above equation and
in the hedonic equation are correlated, the coefficient estimates of
the hedonic equation may be biased. The solution to this problem
is to explicitly model the selection problem by conditioning the
logarithm of price on the fact that a sale occurs. Indicating a sale
by y_i, we have that y_i = 1 if a sale occurs, and 0 otherwise.
Formally, we have:
$$E(lprice_i \mid y_i = 1) = E(lprice_i \mid y_i^* > 0) = \beta_0 + x_i \beta_1 + \rho_{\eta\varepsilon} \frac{\phi(z_i \alpha)}{\Phi(z_i \alpha)}$$
22
where $\phi(\cdot)$ is the standard normal pdf and $\Phi(\cdot)$ is the standard
normal cdf. The last part of the above equation, $\phi(z_i \alpha) / \Phi(z_i \alpha)$, is the
inverse Mills ratio. The dependence on $\rho_{\eta\varepsilon}$, the correlation between
the error terms of the hedonic price function and the above equation,
is apparent through this formulation:
When there is none, estimate the hedonic price function as usual.
However, if there is correlation, include the inverse Mills ratio as an
independent variable in the hedonic price function.
This requires estimating $\alpha$, which can be estimated with a probit
model that predicts the probability of observing a sale:
$$E(y_i = 1) = \Pr(y_i^* > 0) = \Phi(z_i \alpha)$$
Introducing the inverse Mills ratio in the hedonic price function
produces consistent estimation of the parameters in the hedonic
pricing equation and also provides a test for the presence of
selection bias: if the coefficient on the inverse Mills ratio is not
statistically distinguishable from zero, then the hypothesis that
$\rho_{\eta\varepsilon}$ is zero cannot be rejected and analysis can proceed as usual.
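The inverse Mills ratio that enters the second step can be sketched with the Python standard library alone. This is an illustration under assumptions: the index values below are hypothetical stand-ins for the fitted probit index z_i times alpha, and the probit estimation itself is not shown.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def inverse_mills_ratio(index: float) -> float:
    """Return phi(index) / Phi(index), where index stands for z_i * alpha."""
    return STD_NORMAL.pdf(index) / STD_NORMAL.cdf(index)


# Hypothetical probit index values for three sold properties.
for index in (-1.0, 0.0, 1.5):
    ratio = inverse_mills_ratio(index)
    print(f"index={index:+.1f}  inverse Mills ratio={ratio:.4f}")
```

The ratio falls as the index rises: properties that were very likely to sell get a small correction, while unlikely sales get a large one.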
23
Mortality
How do we estimate the value of a life?
Key concept: the Value of a Statistical Life (VSL).
Suppose each member of a group of 10,000 is willing to pay $30 for
a 1 in 10,000 (0.0001) reduction in the risk of death (=1 life saved in
this group). Then, the VSL is $30/0.0001=$300,000.
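The arithmetic of this example can be written out as a small sketch (the function and variable names are illustrative, not from the slides):

```python
def value_of_statistical_life(wtp_per_person: float, risk_reduction: float) -> float:
    """VSL = individual WTP divided by the reduction in the risk of death."""
    return wtp_per_person / risk_reduction


group_size = 10_000
wtp_per_person = 30.0        # dollars, as in the example above
risk_reduction = 1 / 10_000  # a 1 in 10,000 reduction in the risk of death

vsl = value_of_statistical_life(wtp_per_person, risk_reduction)

# Equivalent view: total payment by the group per expected life saved.
total_wtp = group_size * wtp_per_person
expected_lives_saved = group_size * risk_reduction
print(f"VSL = ${vsl:,.0f}")
print(f"Total WTP per expected life saved = ${total_wtp / expected_lives_saved:,.0f}")
```

Both routes give the same figure, which is why the VSL can be read as what the group as a whole pays to save one statistical life.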
VSL is used to estimate the life-saving benefits of proposed
environmental policies and other safety regulations.
24
How do we estimate VSL?
Traditional approach: discounted stream of income from future years
of life. Disadvantage: places very low values on the life of
homemakers and retirees.
The appropriate measure is Willingness to Pay for the risk reduction.
WTP for the risk reduction can be estimated using
Compensating wage studies (labor market)
Other hedonic approaches (housing market, price of a car)
Revealed preference approaches
Contingent valuation
25
Compensating wage studies
Based on the idea that firms must pay workers more for them to
accept higher workplace risks
Use wage data and information about fatal risks in one's job to
estimate the regression equation:
$$w_i = \alpha + \beta p_i + \gamma q_i + \delta (q_i \, WC_i) + \sum_{m=1}^{M} \theta_m x_{mi} + \varepsilon_i$$
Where w = wage rate, p = fatal risk in the workplace faced by worker i,
q = risk of non-fatal injury, WC = workers' compensation, and the x's are other
worker and job characteristics (age, experience, etc.)
26
Compensating wage studies
Many studies available in the literature.
VSL ranges from a few hundred thousand dollars to several million
dollars; best range is $1-9 million.
But: the risk premium might in reality be an inter-industry wage
differential.
But: the approach relies on workers knowing exactly the correct risks
they face.
27
Problems with compensating wage studies
But: this is the risk of an accident in the next year, while most
environmental risks are incurred later in life.
But: this is the risk for males in their prime age, whereas environmental
risk affects mostly older and sick people.
But: workplace risks are voluntary, while environmental risks are
not.
And finally, compensating wage studies do not obtain WTP from the
demand function of individuals, but only points at the tangency
between demand for safety and supply of safety by firms.
28
Example 1: Air quality
Air pollution is one of the first applications of the hedonic pricing
model
Ridker, Henning (1967) "The determinants of residential property values with
special reference to air pollution", Review of Economics and
Statistics.
No sale prices of individual residential properties, but census tract data from St.
Louis, 1960.
Dependent variable: median value of property prices
Independent variables: median characteristics of houses in a census
tract, quality of schooling, access to highway, neighbourhood
characteristics, tax levels, public services
Air quality (SO2, SO3, H2S, H2SO4) measured as direct effects on
houses and on human health.
29
Results (some variables)
Variable                               Coefficient   Standard error
Air pollution                              -245.0          88.1
Rooms                                       488.5          41.1
Distance from city centre (minutes)         320.2         138.7
New buildings (%)                           48.36          7.20
Access to highway (dummy)                   922.5         278.9
Number of persons in a house              -3210.0         548.7
Median income per family                    0.937        0.1057
Linear model
House price falls by US$245 if pollution is present
Ridker and Henning estimate the environmental damage of air pollution
in St. Louis to be 82 million dollars => need to compare this estimate
with the cost of a public program to clean pollution
30
Problems with the Ridker and Henning example
Multicollinearity
Omitted variables
Positive sign for the coefficient of the distance from the city centre
They do not consider the price of single houses, but the median
value of the houses sold in a census tract
31
Example 2: Water quality
Leggett and Bockstael (2000) Journal of Environmental Economics
and Management
741 observations
Effects of Chesapeake Bay water quality on prices of houses
located along the bay
Rather than using the characteristics of houses (rooms, bathrooms,
etc.), Leggett and Bockstael use the appraised value of houses.
Water quality is measured using information on the level of pollution
of the bay published by the Maryland Department of Health
32
Descriptive Statistics
Variable        Description                                 Mean (N=741)
Price ($1000)   Sale price                                        335.91
VSTRU           Appraised value of the house                      125.84
ACRES           House acreage                                       0.90
ACSQ            Acreage squared                                     2.42
DISBA           Distance from Baltimore                            26.40
DISAN           Distance from Annapolis                            13.30
ANBA            DISBA*DISAN                                       352.50
BDUM            DISBA*(% commuters)                                 8.04
PLOD            % of land not intensively developed                 0.18
PWAT            % of land with water or wetlands                    0.32
DBAL            Minimum distance from a polluting source            3.18
F.COL           Median concentration of fecal coliform            109.70
33
Results
Dependent variable = sale price; Linear model
Variable     Coefficient   Standard error
Intercept        238.69         47.44
VSTRU              1.37         0.040
ACRES             116.9          7.62
ACSQ              -7.33          0.79
DISBA             -3.96          1.74
DISAN            -11.80          2.50
ANBA               0.36          0.09
BDUM              -10.2         -0.03
PLOD              71.69          0.27
PWAT             119.97          0.35
DBAL               2.78          2.50
F.COL            -0.052         0.025
34
Welfare change
The coefficient on fecal coliform is -0.052: since prices are in
$1,000, each additional unit of fecal coliform lowers the house
price by about $52
Suppose fecal coliform increases from 109 (average
value) to 159:
The welfare change (in $1,000) is equal to:
(159-109)*(-0.052) = -2.6
This means that a person who is buying a house is
willing to pay $2,600 more to avoid the increase in the
concentration of fecal coliform.
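The calculation above can be sketched in a few lines. The coefficient is the F.COL estimate from the results table; the function name and the conversion from thousands of dollars to dollars are added here for illustration.

```python
COEF_FCOL = -0.052  # change in sale price ($1000) per unit of fecal coliform


def price_change(fcol_before: float, fcol_after: float, coef: float = COEF_FCOL) -> float:
    """Change in house price, in $1000, implied by the linear hedonic model."""
    return (fcol_after - fcol_before) * coef


change_thousands = price_change(109, 159)
print(f"Price change: {change_thousands:.1f} thousand dollars")
print(f"Implied WTP to avoid the increase: ${-change_thousands * 1000:,.0f}")
```

Because the model is linear, the same coefficient applies at every concentration level, so any 50-unit increase gives the same price change.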
35
Measuring Welfare Via Hedonic
Methods
The above calculations give the change in the
house price due to fecal coliform changes but do
not really give the changes in welfare, in the
sense of the additional consumer surplus you get
from the change.
The following example shows how that
consumer surplus can be calculated. It entails a
two-stage estimation procedure: the first stage is
the hedonic price estimate as above and the
second stage is a follow-on regression based on
households' willingness to pay the implicit price.
36
Case Study - Overview
This exercise presents an application of the
Hedonic Price Method for the valuation of
benefits brought about by the improvement of
the broadleaf coverage rate in an urban area.
The local government decided to improve the
quality of urban parks and green spaces near
residential areas.
For details see Markandya et al. 2002.
37
Case Study - Overview
This study focuses on the valuation of only one
of them, namely the increase of broadleaf
coverage.
To elicit the value assigned to a change in
broadleaf coverage, the prices of houses in
areas with different coverage rates are
observed.
38
Methodology
Collection of data:
Resident households were randomly selected from the
council directory of resident households.
The broadleaf coverage rate within a radius of 300 meters
of every house was calculated.
The price of the house was determined by looking at estate
agency bulletins and recent transactions. Expert advice on
prices for some properties was also required. (NB: sample
selection bias not addressed.)
The collection of information on the socio-economic
features of the household was done by looking at recent census
data.
39
Methodology
Calculation of the value of environmental quality:
Estimation of the House Price Function.
Calculation of the Implicit Marginal Price of the
environmental good (the responsiveness of the house price
function with respect to the environmental quality, i.e. the
first derivative $\partial P / \partial Z_m$) for each observation.
Estimation of the Implicit Inverse Demand Function for
the environmental good. (implicit price as a function of the
environmental good and socio-economic features of
individuals).
Calculation of the Consumer Surplus
40
Database for Case Study
41
OBSERV PRICEX NUMROO INDEPE DISTAN MURDER BROADL REDHOU COMPON
1 50,847 1 0 6 1.8 2 21 2
2 53,593 2 0 44 3.1 4 25 3
3 54,019 2 0 45 4.5 6 23 4
4 59,940 3 0 41 2.5 5 28 5
5 60,849 2 0 7 3.5 8 32 4
6 61,947 3 0 39 2.2 1 34 4
7 75,908 2 0 10 1 6 30 3
8 81,304 4 0 50 4.5 5 36 2
9 85,028 3 0 48 2.1 30 43 2
10 88,484 4 0 35 3 10 38 2
11 98,648 2 1 13 1.5 15 49 3
12 98,920 3 0 9 0.9 13 55 3
13 111,049 3 0 24 1.2 18 72 2
14 121,345 4 0 50 0.5 22 68 3
15 132,049 4 1 6 0.1 11 62 4
16 136,018 4 0 15 0.5 7 78 5
17 142,546 6 0 43 0.1 28 74 4
18 145,584 4 0 3 1.5 5 80 3
19 173,394 4 1 5 1.4 42 92 4
20 173,904 3 0 3 0.9 23 87 5
21 180,394 4 1 4 0.3 11 85 3
22 198,765 3 0 5 0.7 40 93 4
23 212,038 5 1 0.5 0.1 45 96 6
24 234,194 5 1 0.5 0.6 90 80 4
25 241,879 5 1 7 0.4 70 93 5
26 267,944 4 0 0.1 0.1 85 78 4
27 267,975 7 1 24 0.6 80 83 6
28 271,039 6 1 10 0.1 75 98 3
29 294,048 5 1 12 0.4 39 105 4
30 295,536 4 1 1 0.1 80 110 4
Sample average 148,973 3.70 0.37 18.67 1.34 29.20 64.93 3.67
42
Key
OBSERV - Number of the observation
PRICEX - Price of house (US$, 1995)
NUMROO - Number of rooms in the house
INDEPE - Dummy variable: INDEPE=1 Detached house. 0
otherwise.
DISTAN - Distance from downtown (Km)
MURDER - Murder rate of the area (murders/year per 1000
residents)
BROADL - Broadleaf tree coverage rate (covered area/total
area)
REDHOU - Annual income of the household (US$, 1995)
COMPON - Number of members in the household
43
Estimation of Hedonic House
Price Function
Regression of LnPriceX on other variables:
VARIABLE     COEFFICIENT   ST.ERROR   T-RATIO
lnBROADL         0.16871    0.04898    3.44456
lnMURDER        -0.05785    0.04570   -1.26595
lnDISTAN        -0.08952    0.03133   -2.85766
lnINDEPE         0.13697    0.09606    1.42595
lnNUMROO         0.50855    0.13320    3.81788
CONSTANT        10.78918    0.15815   68.22257
R2 = 0.90026; ST.ERR.Y^ = 0.19924
F test = 43.32314; degrees of freedom = 24
Explained sum of squares = 8.59852; sum of squared residuals = 0.95268
Most of the variables included in the model are highly
significant from a statistical point of view in explaining the
variability of the price of houses.
The coverage rate of broadleaves exhibits a positive and significant
relationship with the price, other things equal.
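As a quick check on the regression output, each t-ratio is the coefficient divided by its standard error. The sketch below recomputes them and compares each with 2.064, the two-sided 5% critical value of Student's t with 24 degrees of freedom; the choice of the 5% level is an assumption made here, not something stated in the slides.

```python
# Coefficients and standard errors as reported in the regression output.
results = {
    "lnBROADL": (0.16871, 0.04898),
    "lnMURDER": (-0.05785, 0.04570),
    "lnDISTAN": (-0.08952, 0.03133),
    "lnINDEPE": (0.13697, 0.09606),
    "lnNUMROO": (0.50855, 0.13320),
    "CONSTANT": (10.78918, 0.15815),
}
CRITICAL_T = 2.064  # two-sided 5% critical value, Student's t, 24 degrees of freedom

t_ratios = {name: coef / se for name, (coef, se) in results.items()}
for name, t in t_ratios.items():
    verdict = "significant" if abs(t) > CRITICAL_T else "not significant"
    print(f"{name:<9} t = {t:8.3f}  ({verdict} at the 5% level)")
```

At this level the coverage, distance and rooms variables pass the test, while the murder rate and the detached-house dummy do not.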
44
Estimation of Hedonic House
Price Function
Estimated Hedonic Price function is not linear:
[Figure: price of the house (US$, axis from 0 to 200,000) plotted against coverage of broadleaf trees (axis from 0 to 100).]
45
Calculation of Implicit Price
This function, as described in the methodology, is
the first derivative of the house price function with
respect to the broadleaf tree rate. The implicit price
function is:
$$\mathrm{IMPLIP} = \frac{0.16871}{\mathrm{BROADL}} \times \mathrm{PRICEX}$$
This function is used for estimating, observation by
observation, the implicit price of an additional unit of
broadleaf coverage:
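A minimal sketch of this observation-by-observation calculation, using the first three rows of the case-study database (the function name is illustrative). The formula follows from the log-log form of the house price function: the derivative of price with respect to coverage is the coefficient times PRICEX divided by BROADL.

```python
BETA_BROADL = 0.16871  # coefficient on lnBROADL in the house price function


def implicit_price(pricex: float, broadl: float, beta: float = BETA_BROADL) -> float:
    """IMPLIP = (beta / BROADL) * PRICEX: price of one more unit of coverage."""
    return beta / broadl * pricex


# (OBSERV, PRICEX, BROADL) for the first three observations in the database.
sample = [(1, 50_847, 2), (2, 53_593, 4), (3, 54_019, 6)]
for observ, pricex, broadl in sample:
    print(f"obs {observ}: IMPLIP = {implicit_price(pricex, broadl):,.0f}")
```

For a given house price, the implicit price falls as coverage rises, which is what a downward-sloping inverse demand curve needs in the second stage.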
46
47
Estimation of inverse demand
curve
The estimation of the inverse demand function is a
second stage estimation based on the result of the first
estimation (i.e. the house price function and related
first derivative).
The estimated implicit price of the broadleaf coverage
unit (in this case, the percent point) is regressed on
the observed coverage rate and the socio-economic
features of the owners.
48
Estimation of inverse demand
curve
Results of regression analysis:
VARIABLE     COEFFICIENT   ST.ERROR   T-RATIO
LNCOMPON          0.1073     0.0991     1.0827
LNREDHOU          0.7619     0.0930     8.1942
LNBROADL         -0.8530     0.0388   -21.9976
LNCONST           6.3412     0.2950    21.4966
R2 = 0.9620; ST.ERR.Y^ = 0.1581
F test = 219.4173; degrees of freedom = 26
Explained sum of squares = 16.4636; sum of squared residuals = 0.6503
The inverse demand function is therefore estimated
as:
$$\mathrm{IMPLIP} = e^{6.34} \times \mathrm{REDHOU}^{0.76} \times \mathrm{COMPON}^{0.11} \times \mathrm{BROADL}^{-0.85}$$
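The estimated inverse demand function can be evaluated directly. The sketch below uses the unrounded coefficients from the regression output and the household values of observation 10 in the database (REDHOU = 38, COMPON = 2); the function name is an assumption made for illustration.

```python
import math

# Coefficients from the second-stage regression output.
LN_CONST, B_REDHOU, B_COMPON, B_BROADL = 6.3412, 0.7619, 0.1073, -0.8530


def inverse_demand(broadl: float, redhou: float, compon: float) -> float:
    """Implicit marginal price of coverage predicted by the inverse demand function."""
    return (math.exp(LN_CONST) * redhou ** B_REDHOU
            * compon ** B_COMPON * broadl ** B_BROADL)


# Household of observation 10 (REDHOU = 38, COMPON = 2) at several coverage rates.
for broadl in (10, 20, 40, 80):
    print(f"BROADL={broadl:>2}: IMPLIP = {inverse_demand(broadl, 38, 2):,.0f}")
```

The predicted price falls with coverage and rises with income, matching the signs of the LNBROADL and LNREDHOU coefficients.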
49
Inverse Demand Function
The inverse demand function can be shown for
three different levels of income: the sample
first quartile, the sample median and the
sample upper quartile. The variable COMPON
is fixed at the sample mean level.
50
Inverse Demand Function
[Figure: implicit marginal price of coverage (axis from 0 to 12,000) plotted against broadleaf tree coverage rate (axis from 0 to 100), with one curve for each income level: first quartile, median, third quartile.]
51
Calculation of Consumer
Surplus
The consumer surplus is calculated by estimating the
area under the demand curve.
This is done by integrating the inverse demand curve
with respect to the coverage rate and calculating the
definite integral observation by observation (variable
CONSUR) between the present coverage rate
(E1=BROADL) and the coverage rate (UPPLIM)
planned by the policy maker.
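A sketch of this integration, assuming the second-stage coefficients reported in the slides (6.3412, 0.7619, 0.1073, -0.8530) and using the closed-form antiderivative of a power function rather than numerical integration. The inputs are those of observation 10 in the database (BROADL = 10, planned coverage 20, REDHOU = 38, COMPON = 2); the function name is illustrative.

```python
import math

# Second-stage (inverse demand) coefficients as reported in the slides.
LN_CONST, B_REDHOU, B_COMPON, B_BROADL = 6.3412, 0.7619, 0.1073, -0.8530


def consumer_surplus(broadl: float, upplim: float, redhou: float, compon: float) -> float:
    """Definite integral of the inverse demand curve over coverage, BROADL to UPPLIM."""
    scale = math.exp(LN_CONST) * redhou ** B_REDHOU * compon ** B_COMPON
    power = B_BROADL + 1.0  # antiderivative of x**b is x**(b + 1) / (b + 1)
    return scale * (upplim ** power - broadl ** power) / power


# Observation 10: BROADL = 10, planned coverage = 20, REDHOU = 38, COMPON = 2.
cs_10 = consumer_surplus(10, 20, 38, 2)
print(f"CONSUR for observation 10: {cs_10:,.0f}")
```

With these inputs the result should land close to the CONSUR value of 10,000 reported for observation 10; small differences can come from rounding of the published coefficients.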
52
OBSERV BROADL UPPLIM CONSUR
1 2 12 14,113
2 4 14 12,508
3 6 16 9,856
4 5 15 12,902
5 8 18 10,794
6 1 11 27,797
7 6 16 11,700
8 5 15 14,161
9 30 40 5,200
10 10 20 10,000
11 15 25 9,782
12 13 23 11,735
13 18 28 11,098
14 22 32 9,645
15 11 21 14,746
16 7 17 23,520
17 28 38 8,913
18 5 15 27,178
19 42 52 7,757
20 23 33 11,910
21 11 21 18,183
22 40 50 8,119
23 45 55 7,936
24 90 100 3,817
25 70 80 5,367
26 85 95 3,922
27 80 90 4,510
28 75 85 5,004
29 39 49 9,079
30 80 90 5,351
Average 29 39 11,220
53
Discussion
This information can be used first to calculate the
average consumer surplus per household and can be
multiplied by the number of households to get a
measure of the total benefits which can be compared
with the cost of the intervention.
On distributional grounds, notice that a 10 percentage point increase
for people living in areas with a high coverage rate
changes the consumer surplus much less than it does for
those living in low coverage rate areas,
which could be lower income neighbourhoods.