\chapter{Separability}\label{ch:separability}
\section{The construction of utility}\label{sec:construction-value}
When a possible outcome looks attractive, then this is usually because it has
attractive aspects. It may also have unattractive aspects, but the attractive
aspects (the ``pros'') outweigh the unattractive aspects (the ``cons''). In this
chapter, we will explore how this weighing of different aspects might work.
Take a concrete example. You are looking for a flat to rent. There are two
options. $A$ is a small and central flat that costs £800/month. $B$ is a larger
flat in the suburbs for £600/month. You might draw up a list of pros and cons
for each option, and give them a weight, like so:
\medskip
\begin{center}
% \setlength{\arrayrulewidth}{1pt}
\arrayrulecolor{lightgray}
\def\arraystretch{1} % horizontal padding
\begin{tabular}{|c|c|}
% \arrayrulecolor{lightgray}
\rowcolor{gray!20}
$A$ & $B$ \\\hline
good location (+2) & bad location (-2) \\
a little small (-1) & good size (+3) \\
expensive (-3) & a little expensive (-1) \\\hline
\end{tabular}
\end{center}
\medskip\noindent%
You might then determine the \emph{total} utility of each option as the sum of
these numbers, so that $\U(A)$ is $+2-1-3 = -2$, while $\U(B)$ is $-2+3-1 = 0$.
Is this a reasonable approach? It looks OK in this example. But we have to be
careful. Suppose you had drawn up the following table.
\medskip
\begin{center}
% \setlength{\arrayrulewidth}{1pt}
\arrayrulecolor{lightgray}
\def\arraystretch{1} % horizontal padding
\begin{tabular}{|c|c|}
% \arrayrulecolor{lightgray}
\rowcolor{gray!20}
$A$ & $B$ \\\hline
good location (+2) & bad location (-2) \\
short commute (+1) & long commute (-1) \\
can get up later (+1) & have to get up earlier (-1) \\
a little small (-1) & good size (+3) \\
expensive (-3) & a little expensive (-1) \\\hline
\end{tabular}
\end{center}
\medskip\noindent%
Now $\U(A)$ comes out as $0$ and $\U(B)$ as $-2$. Do you see what's wrong with
this table?
The problem is that the first three criteria in the list aren't independent.
Once you've taken ``good location'' into account, you shouldn't
\emph{additionally} take into account ``short commute'' and ``can get up
later''. Location, size, and costs are independent criteria. Location and
commute time are not.
But what, exactly, does independence mean here? There is no \emph{logical}
connection between ``good location'' and ``short commute''. And there may well
be a strong statistical connection between (say) location and costs.
\section{Additivity}\label{sec:additivity}
Let's stick with the flat example. We assume that you care about certain aspects
of a flat: size, location, and costs. We'll call these aspects
\textbf{attributes}. Let's assume that size, location, and costs are all the
attributes that ultimately matter to you. Your preferences between possible
flats are then determined by your preferences between combinations of these
attributes. If two flats perfectly agree in each of the three attributes then
you are always indifferent between them. If you prefer one flat to another,
that's always because you prefer the combined attributes of the first to those
of the second.
Instead of talking about the desirability of a particular flat, we can therefore
talk about the desirability of its attributes. We'll write combinations of
attributes as lists enclosed in angular brackets.
`$\t{40 \m^2, \text{central}, \text{£500}}$', for example, would represent any
flat with a size of 40 $\m^2$, central location, and monthly costs of £500. We
are interested in the utility you assign to any such list.
% Your intrinsic utility function then assigns the same value to all flats
% represented by the list $\t{40 \m^2, \text{central}, \text{£500}}$.
\cmnt{%
An \textbf{attribute}, on this usage, is not a particular property of any
particular flat. Rather, an attribute is a set of related properties -- a set
that divides all possible flats into groups. For example, monthly cost of rent
(an attribute) divides all possible flats into those that cost £300, those that
cost £310, those that cost £320, and so on.%
} %
Strictly speaking, of course, utility functions don't assign numbers to lists,
or even to flats. When I say that you prefer one kind of flat over another, what
I really mean is that you prefer living in the first kind of flat over living in
the other. In full generality, we should speak about attributes of worlds, not
of flats. To keep things simple, we currently assume that the only thing you
ultimately care about is what kind of flat you are living in (or going to live
in). A list like $\t{40 \m^2, \text{central}, \text{£500}}$ therefore settles
everything you ultimately care about. It represents one of your ``concerns'', in
the terminology of section \ref{sec:basic-desire}.
In the example from section \ref{sec:basic-desire}, we assumed that you care
about two things: being free from pain and being admired. We pretended that
these are all-or-nothing matters. The resulting four concerns could be
represented by the following lists:
\[
\t{\emph{Pain}, \emph{Admired}}, \t{\neg\emph{Pain}, \emph{Admired}}, \t{\emph{Pain}, \neg\emph{Admired}}, \t{\neg\emph{Pain}, \neg\emph{Admired}}.
\]
Here, there are two attributes, each of which can take two values. The first
attribute specifies whether you are in pain, and the answer is either yes or no.
The second attribute similarly specifies whether you are admired. If we allowed
for different degrees of pain, then the first attribute would have more than two
possible values. We could, for example, distinguish
$\t{\emph{Little Pain}, \emph{Admired}}$ from
$\t{\emph{Strong Pain}, \emph{Admired}}$.
% It's the same with possible worlds. If all you care about is the degree of
% pleasure of you and your three best friends, then we can represent your basic
% desires by a value function that assigns numbers to lists like
% $\t{10, 1, 2, 3}$, specifying degrees of pleasure for you and your friends (in
% some fixed order). Any such list effectively specifies one of your concerns: a
% maximal conjunction of propositions you care about.
In the flat example, we have three attributes, each of which can take many
different values: size, location, and costs. Your intrinsic utility function
assigns a desirability score to all possible combinations of these values.
If you're like most people, we can say more about how these scores are
determined. For example, you probably prefer cheaper flats to more expensive
flats, and larger flats to smaller flats. The ``weighing up pros and cons'' idea
suggests that the overall score for a flat is determined by adding up individual
scores for the flat's properties. Let's spell out how this might work.
We want to compute the utility of any given attribute list as the sum of numbers
assigned to the elements in the list. We'll call these numbers
\textbf{subvalues}. A size of 40 m$^{2}$ might have subvalue
$V_{S}(40\, \m^2) = 1$. Central location might have subvalue
$V_{L}(\text{central}) = 2$. Monthly costs of £500 might have subvalue
$V_{C}(\text{£500}) = -1$. Note that we have three different subvalue functions:
one for size, one for location, one for costs. The overall value (utility) of
$\t{40\, \m^2, \text{central}, \text{£500}}$ would then be the sum of these
subvalues:
\[
\U(\t{40\, \m^2, \text{central}, \text{£500}}) = V_{S}(40 \m^2) +
V_{L}(\text{central}) + V_{C}(\text{£500}) = 2.
\]
%
If $\U$ is determined by adding up subvalues in this manner, then it is called
\textbf{additive} relative to the attributes in question.
Additivity may seem to imply that you assign the same weight to all the
attributes: that size, location, and price are equally important to you. To
allow for different weights, we could introduce scaling factors $w_S, w_L, w_C$,
so that
\[
\U(\t{40\, \m^2, \text{central}, \text{£500}}) = w_S \cdot V_{S}(40\, \m^2) +
w_L \cdot V_{L}(\text{central}) + w_C \cdot V_{C}(\text{£500}).
\]
For convenience, we will omit the weights by folding them into the subvalues. We
will let $V_S(200\, \m^2)$ measure not just how awesome it would be to have a 200
$\m^2$ flat, but also how important this feature is compared to cost and
location.
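To make the additive recipe concrete, here is a minimal Python sketch of a
utility function for the flat example. The subvalues for
$\t{40\, \m^2, \text{central}, \text{£500}}$ are the ones just given; the other
numbers are made up for illustration, with any weights already folded into the
subvalue functions.
\begin{verbatim}
# Minimal sketch of an additive utility function (flat example).
# Subvalue numbers are illustrative assumptions; weights are folded in.

def v_size(m2):            # subvalue function for size
    return {40: 1, 50: 2, 60: 3}[m2]

def v_location(loc):       # subvalue function for location
    return {"central": 2, "suburbs": -1, "beach": 1}[loc]

def v_cost(gbp):           # subvalue function for monthly cost (GBP)
    return {500: -1, 600: -2, 800: -4}[gbp]

def utility(m2, loc, gbp):
    # Additivity: the utility of an attribute list is the sum
    # of the subvalues of its items.
    return v_size(m2) + v_location(loc) + v_cost(gbp)

print(utility(40, "central", 500))  # 1 + 2 - 1 = 2, as in the text
\end{verbatim}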
Subvalue functions are typically defined over propositions that don't have
uniform utility. Recall that, strictly speaking, `$200\, \m^{2}$' expresses the
proposition that you are going to live in a 200 $\m^{2}$ flat. Some of the worlds
where you live in such a flat are great. Others are bad. That's because you also
care about location and costs, and the $200\, \m^{2}$ worlds differ in these respects. An
(improbable) world in which you rent a 200 $\m^{2}$ central flat for £100/month
is better than a (more probable) world in which you rent a 200 $\m^{2}$ flat in
the suburbs for £1000/month. As a result, the utility of 200 $\m^2$ may
be low, even though the subvalue is high.
Informally, the \emph{utility} of 200 $\m^{2}$ measures the desirability of the relevant
proposition. Would you be glad to learn that you are going to rent a 200 $\m^{2}$
flat? Perhaps not, because the large size indicates high costs and bad location.
The \emph{subvalue} of 200 $\m^{2}$ is not sensitive to your beliefs. It measures the
intrinsic desirability of that size, no matter what it implies or
suggests about other attributes. It measures how much a size of 200 $\m^{2}$
contributes to the overall desirability of a flat, holding fixed the other
attributes.
% \begin{exercise}{2}\label{e:subv-not-u}
% % Like utility functions, subvalue functions assign numbers to propositions that
% % needn't be of uniform utility. Unlike utility functions, however, subvalue
% % functions are insensitive to belief. For example,
% If you can afford to pay £600 in monthly rent, then $V_{C}(\text{£300})$ is
% plausibly high, even though the utility you assign to renting a flat for £300
% is plausibly low. Can you explain why?
% \end{exercise}
\cmnt{%
If we want to decompose the overall desirability of a given flat into the
desirability of the flat's individual aspects, we need to assume that the
desirability for the aspects has more than just an ordinal scale.
Consider number of rooms. Perhaps you'd really like to have more than one
room, but you don't care much about whether you have three rooms or four. Your
ranking of the possibilities is 4 rooms $\succ_r$ 3 rooms $\succ_r$ 2 rooms
$\succ_r$ 1 room, but the difference in desirability between 4 rooms and 3
rooms is smaller than that between 2 rooms and 1 room. (`$\succ_r$' is
supposed to represent your basic preferences over room number, setting aside
your knowledge that more rooms typically cost more, etc.) To make such
comparisons between differences meaningful, we need an interval scale.
So let's assume that if we arbitrarily measure the desirability of 1 room as 0
and the desirability of 2 rooms as 10, then your basic desires fix the number
assigned to 3 rooms and 4 rooms. Let $V_r$ be the resulting value function. It
is a value function not a utility function because we take it to be defined
just over room numbers. We don't ``look inside'' the room number
possibilities, taking into account what else is likely to be the case if a
flat has four rooms.
} %
\begin{exercise}{3}
We could define a concept of additivity purely in terms of utility. Let's say
that a utility function $\U$ is \emph{utility-additive} with respect to
attributes $A_{1},\ldots,A_{n}$ iff
$\U(\t{A_{1},\ldots,A_{n}}) = \U(A_{1})+\ldots+\U(A_{n})$. Explain why your
utility function in the flat example isn't utility-additive with respect to
size, location, and costs.
% Consider <200m, central>. If you think that central location and large size
% are inversely correlated, then U(200m) and U(central) are relatively low
% although U(200m \land central) is relatively high. We can use Jeffrey's
% axiom to expand U(A&B) = U(A)+U(B) and show that if Cr(A&B/A) = Cr(A&B/B) =
% 0.1 and U(A&¬B) = U(¬A&B) = x, then the equation holds iff U(A&B) =
% 0.444444x. I.e., if A&B is unlikely given both A and B and its utility is
% positive, then A&¬B and ¬A&B have greater utility.
\end{exercise}
\begin{exercise}{2}
Additivity greatly simplifies an agent's psychology. Suppose an agent's basic
desires pertain to 10 logically independent propositions
$A_1,A_2,\ldots,A_{10}$. There are $2^{10} = 1024$ conjunctions of these
propositions and their negations (such as
$A_1 \land A_2 \land \neg A_3 \land \neg A_4 \land A_5 \land A_6 \land \neg A_7 \land A_8 \land A_9 \land \neg A_{10}$).
To store the agent's intrinsic utility function in a database, we would therefore need to
store up to 1024 numbers. How many numbers do we need to store in the database
if the agent's intrinsic utility function is additive?
\end{exercise}
\section{Separability}\label{sec:separability}
Under what conditions is intrinsic utility determined by adding subvalues? How are different
subvalue functions related to one another? We can get some insight into these
questions by following an idea from the previous chapter and studying how intrinsic
utility might be derived from preferences.
% For the moment, we want to set aside the influence of the agent's beliefs, so we
% are not interested in an agent's preferences between lotteries or gambles.
% Rather, we are interested in an agent's preferences between complete attribute
% lists, assuming the relevant attributes comprise everything the agent cares
% about.
The main motivation for starting with preferences is, as always, the problem of
measurement. We need to explain what it means that your subvalue for a given
attribute is 5 rather than 29. Since the numbers are supposed to reflect, among
other things, the importance (or weight) of the relevant attribute in comparison
to other attributes, it makes sense to determine the subvalues from their effect
on the overall ranking of attribute lists.
So assume we have preference relations $\succ$, $\succsim$, $\sim$ between lists
of attributes. (We aren't interested in lotteries or gambles this time, only in
complete concerns.) To continue the illustration in terms of flats, if you
prefer a central \mbox{40 $\m^2$} flat for £500 to a central 60 $\m^2$ flat for £800,
then we have
\[
\t{40 \m^2, \text{central}, \text{£500}} \succ
\t{60 \m^2, \text{central}, \text{£800}}.
\]
If, like most people, you prefer to pay less rather than more, then your
subvalue function $V_C$ is a decreasing function of monthly costs: the higher
the costs $c$, the lower $V_C(c)$. This doesn't mean that you prefer \emph{any}
cheaper flat to \emph{any} more expensive flat. You probably don't prefer a 5
$\m^2$ flat for £499 to a 60 $\m^2$ flat for £500. The other attributes also
matter. But the following should hold: whenever two flats agree in size and
location, and one is cheaper than the other, then you prefer the cheaper one.
Let's generalize this idea.
Consider an attribute list $\t{A_{1}, A_{2}, \ldots A_{n}}$, and let $A_{1}'$ be
an alternative to $A_{1}$. If, for example, the first position in an attribute
list represents monthly costs, then $A_{1}$ might be £400 and $A_{1}'$ £500. We
can now compare $\t{A_{1}, A_{2}, \ldots A_{n}}$ to
$\t{A_{1}', A_{2}, \ldots A_{n}}$ -- a hypothetical flat that's like the first
in terms of size and location, but costs £100 more. If
\[
\t{A_1,A_2,\ldots,A_n} \succ \t{A_1',A_2,\ldots,A_n},
\]
we say that you prefer $A_1$ to $A_1'$ \emph{conditional on}
$A_{2},\ldots,A_{n}$.
Suppose you prefer $A_{1}$ to $A_{1}'$ conditional on any way of filling in the
remainder $A_{2},\ldots,A_{n}$ of the attribute list. In that case, we can say
that your preference of $A_{1}$ over $A_{1}'$ is \emph{independent} of the other
attributes.
In the flat example, your preference of £400 over £500 is plausibly independent
of the other attributes: whenever two possible flats agree in size and location,
but one costs £400 and the other £500, you plausibly prefer the one for £400.
(We are still assuming that size, location, and costs are all you care about.)
We can similarly consider alternatives $A_{2}$ and $A_{2}'$ that may figure in
the second position of an attribute list, and alternatives $A_{3}$ and $A_{3}'$
in the third positions, and so on. If we find that your preferences between
$A_{i}$ and $A_{i}'$ are always independent of the other attributes, we say that
your preferences between attribute lists are \textbf{weakly separable}.
Weak separability means that your preference between two attribute lists that
differ only in one position does not depend on the attributes in the other
positions.
Consider the following preferences between four possible flats.
\begin{gather*}
\t{50 \m^2, \text{central}, \text{£500}} \succ \t{40 \m^2, \text{beach}, \text{£500}}\\
\t{40 \m^2, \text{beach}, \text{£400}} \succ \t{50 \m^2, \text{central}, \text{£400}}
\end{gather*}
Among flats that cost £500, you prefer central 50 $\m^2$ flats to 40 $\m^2$
flats at the beach. But among flats that cost £400, your preferences are
reversed: you prefer 40 $\m^2$ beach flats to 50 $\m^2$ central flats. In a
sense, your preferences for size and location depend on price. But we don't have
a violation of weak separability, simply because the relevant attribute lists
differ in more than one position.
That's why weak separability is called `weak'. To rule out the present kind of
dependence, we need to strengthen the concept of separability. Preferences are
called \textbf{strongly separable} if the ranking of lists that differ in
\emph{one or more positions} does not depend on the attributes in the remaining
positions, in which they do not differ. In the example, your ranking of
$\t{50 \m^2, \text{central}, -}$ and $\t{40 \m^2, \text{beach}, -}$ depends on
how the blank (`$-$') is filled in. Your preferences aren't strongly separable.
(Are they weakly separable? We can't say. I have only specified how you rank two
pairs of lists. Your preferences are presumably defined for many other
combinations of flat size, location, and costs. There's no violation of weak
separability in the two data points I have given. But there might be a violation
elsewhere.)
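If you'd like to experiment, here is a small Python sketch of how one might
scan a finite table of strict preferences for violations of weak separability.
The preference data are hypothetical. The check looks for two comparisons that
rank the same pair of items at a single position in opposite ways, depending
only on the attributes held fixed.
\begin{verbatim}
from itertools import combinations

# Hypothetical strict preferences over attribute lists (tuples):
# each pair (a, b) means that a is strictly preferred to b.
prefs = [
    ((40, "central", 500), (40, "central", 600)),  # cheaper preferred here...
    ((50, "beach", 600), (50, "beach", 500)),      # ...but reversed here
]

def differing_position(a, b):
    """Position where a and b differ, if they differ in exactly one."""
    diffs = [i for i in range(len(a)) if a[i] != b[i]]
    return diffs[0] if len(diffs) == 1 else None

def weak_separability_violations(prefs):
    """Pairs of preferences that rank the same two items at one
    position in opposite ways, depending on the other attributes."""
    out = []
    for (a, b), (c, d) in combinations(prefs, 2):
        i, j = differing_position(a, b), differing_position(c, d)
        if i is not None and i == j and a[i] == d[i] and b[i] == c[i]:
            out.append(((a, b), (c, d)))
    return out

print(weak_separability_violations(prefs))  # reports the reversal above
\end{verbatim}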
\begin{exercise}{2}
Suppose all you care about is the degree of pleasure of you and your three
friends, which we can represent by a list like $\t{10,1,2,3}$. Suppose further
that you prefer states in which you all experience equal pleasure to states in
which your degrees of pleasure are different. For example, you prefer
$\t{2,2,2,2}$ to $\t{2,2,2,8}$, and you prefer $\t{8,8,8,8}$ to $\t{8,8,8,2}$.
Are your preferences weakly separable? Are they strongly separable?
\end{exercise}
\cmnt{%
Exercise: Show that strong entails weak?%
} %
\begin{exercise}{2}
Which of the following preferences violate weak separability, based on the information provided? Which violate strong separability?
\medskip
{\small
\noindent\hspace{-2mm}\begin{tabular}{lll}
(a) & (b) & (c)\\
$\t{A_1,B_1,C_3} \!\succ\! \t{A_3,B_1,C_1}$ & $\t{A_1,B_3,C_1} \!\succ\! \t{A_1,B_3,C_2}$ & $\t{A_1,B_3,C_2} \!\succ\! \t{A_1,B_1,C_2}$ \\
$\t{A_3,B_2,C_1} \!\succ\! \t{A_1,B_2,C_3}$ & $\t{A_1,B_2,C_2} \!\succ\! \t{A_1,B_2,C_3}$ & $\t{A_2,B_3,C_2} \!\succ\! \t{A_2,B_1,C_2}$ \\
$\t{A_3,B_2,C_3} \!\succ\! \t{A_3,B_2,C_1}$ & $\t{A_3,B_2,C_3} \!\succ\! \t{A_3,B_1,C_3}$ & $\t{A_1,B_1,C_1} \!\succ\! \t{A_1,B_3,C_1}$
\end{tabular}
}
\cmnt{%
Answer:
In a., there's no counterex to w.s., but there is one to s.s.: hold fixed B in row 1 and 2.
In b., there's no counterex to either.
In c., there's a counterex to w.s.: hold fixed A,C in rows 1 and 3.
} %
\end{exercise}
In 1960, G\'erard Debreu proved that strong separability is exactly what is
needed to ensure additivity.
% I am following Bergstrom's 2018 "Lecture notes on separable preferences"
To state Debreu's result, let's say that an agent's preferences over attribute
lists have an \textbf{additive representation} if there is a function $\U$,
assigning numbers to the lists, together with subvalue functions
$V_1, V_2, \ldots, V_n$, assigning numbers to the items on the lists, such that
the following two conditions are satisfied. First, the preferences are
represented by $\U$. That is, for any two lists $A$ and $B$,
\begin{gather*}
A \pref B \text{ iff } \U(A) > \U(B), \text{ and }\\
A \sim B \text{ iff }\U(A) = \U(B).
\end{gather*}
Second, the $\U$-value assigned to any list $\t{A_1,A_2,\ldots,A_n}$ equals
the sum of the subvalues assigned to the items on the list:
\[
\U(\t{A_1,A_2,\ldots,A_n}) = V_1(A_1) + V_2(A_2) + \ldots + V_n(A_n).
\]
Now, in essence, Debreu's theorem states that if preferences over attribute
lists are complete and transitive, then they have an additive representation if
and only if they are strongly separable.
A further technical condition is needed if the number of attribute combinations
is uncountably infinite; we'll ignore that. Curiously, the result also requires
that there are at least three attributes that matter to the agent. For two
attributes, a stronger condition called `double-cancellation' is required.
Double-cancellation says that if $\t{A_1,B_1} \succsim \t{A_2,B_2}$ and
$\t{A_2,B_3} \succsim \t{A_3,B_1}$ then $\t{A_1,B_3} \succsim \t{A_3,B_2}$. But
let's just focus on cases with at least three relevant attributes.
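As a numerical sanity check, the following Python sketch confirms that an
additive utility function over two attributes satisfies double-cancellation.
The subvalue numbers are arbitrary assumptions; the assertion can never fire,
because adding the two premise inequalities and cancelling the subvalues of
$A_2$ and $B_1$ from both sides yields the conclusion.
\begin{verbatim}
from itertools import product

# Arbitrary subvalue numbers for two attributes (assumptions).
VA = {"A1": 0.0, "A2": 1.3, "A3": 2.1}
VB = {"B1": 0.5, "B2": 0.9, "B3": 3.0}

def U(a, b):
    return VA[a] + VB[b]  # additive utility over two attributes

# Double cancellation: if <A1,B1> >= <A2,B2> and <A2,B3> >= <A3,B1>,
# then <A1,B3> >= <A3,B2>.
for a1, a2, a3 in product(VA, repeat=3):
    for b1, b2, b3 in product(VB, repeat=3):
        if U(a1, b1) >= U(a2, b2) and U(a2, b3) >= U(a3, b1):
            assert U(a1, b3) >= U(a3, b2)
print("double cancellation holds for this additive U")
\end{verbatim}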
Debreu's theorem has an interesting corollary. Suppose a utility function $\U$
has an additive representation in terms of certain attributes. One can show that
if the attributes are sufficiently fine-grained, and small differences to the
attributes make for small differences in overall utility, then every utility
function $\U'$ that has an additive representation in terms of the relevant
attributes differs from $\U$ at most in the choice of unit and zero.
% See Theorem 4 in Bergstom's 2018 lecture notes.
This suggests a new response to the ordinalist challenge. The ordinalists
claimed that utility assignments are arbitrary as long as they respect the
agent's preference order. In response, one might argue that rational (intrinsic)
preferences should be strongly separable and that an adequate representation of
such preferences should involve an additive utility function. The only arbitrary
aspect of a utility representation would then be the choice of unit and zero.
\begin{exercise}{2}
Show that whenever $\U$ additively represents an agent's preferences, then so
does any function $\U'$ that differs from $\U$ only by the choice of zero and
unit. That is, assume that $\U$ additively represents an agent's preferences,
so that for some subvalue functions $V_1,V_2,\ldots,V_n$,
\[
\U(\t{A_1,A_2,\ldots,A_n}) = V_1(A_1) + V_2(A_2) + \ldots + V_n(A_n).
\]
Assume $\U'$ differs from $\U$ only by a different choice of unit and zero,
which means that there are numbers $x>0$ and $y$ such that
$\U'(\t{A_1,A_2,\ldots,A_n}) = x\cdot \U(\t{A_1,A_2,\ldots,A_n}) + y$. From
these assumptions, show that there are subvalue functions
$V_1',V_2',\ldots,V_n'$ such that
\[
\U'(\t{A_1,A_2,\ldots,A_n}) = V_1'(A_1) + V_2'(A_2) + \ldots +
V_n'(A_n).
\]
\end{exercise}
\cmnt{%
This proves that whenever $V$ additively represents an agent's preferences,
then so does any function $V'$ that differs only by the choice of zero and
unit. The converse can also be shown, but it's a little harder: if there are
\emph{no} numbers $x>0$ and $y$ for which
$V'(\t{A_1,A_2,\ldots,A_n}) = x\cdot V(\t{A_1,A_2,\ldots,A_n}) + y$, then $V'$
does \emph{not} additively represent the agent's preferences.
To see that additive representation is not preserved under arbitrary positive
transformations of $U$, assume that
$U(X_1,X_2,X_3) = \log(X_1) + \log(X_2) + \log(X_3)$. If we transform $U$ by
the exponential function, then
$U'(X_1,X_2,X_3) = e^{U(X_1,X_2,X_3)} = e^{\log(X_1) + \log(X_2) + \log(X_3)} = e^{\log(X_1)} e^{\log(X_2)} e^{\log(X_3)} = X_1 X_2 X_3$.
The product of three numbers cannot be represented as a sum of some
function of the three numbers. By contrast, if we take any positive linear
transform of $U$, then additivity is preserved:
\begin{align*}
a U(X_1,X_2,X_3) + b &= a(u_1(X_1) + u_2(X_2) + u_3(X_3)) + b\\
&= (a\,u_1(X_1) + b) + a\,u_2(X_2) + a\,u_3(X_3).
\end{align*}
Indeed, the only transformations that preserve additive representation are
increasing linear transformations. Hence additive separability implies
cardinality.
} %
\begin{exercise}{3}
Assume all you care about are your wealth and your height. On one way of
representing your preferences, the utility you assign to any combination of
wealth $w$ (in GBP) and height $h$ (in meters) is $\U(\t{w,h}) = w \cdot h$.
Do your preferences have an additive representation? Explain your answer.
% Any monotonic transformation of U also represents your preferences. Take the
% log-transform. Could also find the answer by looking at double-cancellation,
% but that seems hard.
\end{exercise}
Why might one think that rational preferences should be separable? Remember that
we are talking about preferences over ``attribute lists'' that settle everything
the agent ultimately cares about, with each position in a list settling one
question that intrinsically matters to the agent. In our toy example, these were
the size, location, and costs of their flat. More realistically, items in the
attribute list might be the agent's level of happiness, their social standing,
the well-being of their relatives, etc. Now, if an agent has a basic desire for,
say, happiness, then we would expect that increasing the level of happiness,
while holding fixed everything else the agent cares about, is always a change
for the better. That is, if two worlds $w_{1}$ and $w_{2}$ agree in all respects
that matter to the agent except that the agent is happier in $w_{1}$ than in
$w_{2}$, then we would expect the agent to prefer $w_{1}$ over $w_{2}$. From
this perspective, separability might be understood as a condition on how to
identify basic desires: if an agent's preferences over some attribute lists are
not separable, then the attributes don't represent (all) the agent's basic
(intrinsic) desires.
\cmnt{%
From Decision Analysis, pp.35f.:
Call two outcomes (attribute lists) \textbf{aspect equivalent} if the agent is
indifferent between each of their aspects (items); that is, if
$a(w) \sim a(w')$ for all aspects $a$. Now suppose an agent is indifferent
between any two outcomes that are aspect equivalent. In that case there are
scaling coefficients with which the addition rule correctly represents her
preferences over the outcomes.
How do we construct the agent's value function?
Let's pretend there are two aspects $a$ and $b$, both of which have a minimum
and a maximum value (in terms of preference). Let $V_a$ and $V_b$ range
between 0 and 1 accordingly. Now compare three outcomes that agree in terms of
$b$, but differ in $a$: in $o_0$, $a$ takes its minimum value, in $o_1$ it
takes its maximum value, in $o$ it takes some intermediate value. For some
value $x$, you should be indifferent between $o$ and a lottery that yields
$o_1$ with probability $x$ else $o_0$. So
\[
U(o) = x U(o_1) + (1-x)U(o_0).
\]
If you are indifferent between aspect equivalent outcomes,
\begin{gather*}
U(o_0) = s_a V_a(a(o_0)) + s_b V_b(b(o_0)) = 0 + s_b V_b(b(o_0)).\\
U(o_1) = s_a V_a(a(o_1)) + s_b V_b(b(o_1)) = s_a + s_b V_b(b(o_1)).\\
U(o) = s_a V_a(a(o)) + s_b V_b(b(o)).
\end{gather*}
Substituting these in the previous equation, we get
\[
s_a V_a(a(o)) + s_b V_b(b(o)) = x(s_a + s_b V_b(b(o_1))) + (1-x) s_b V_b(b(o_0))
\]
Since $o_0, o_1, o$ agree in aspect $b$, this simplifies:
\begin{align*}
s_a V_a(a(o)) + k &= x(s_a + k) + (1-x) k\\
s_a V_a(a(o)) + k &= xs_a + xk + k-xk\\
s_a V_a(a(o)) + k &= xs_a + k\\
s_a V_a(a(o)) &= xs_a \\
V_a(a(o)) &= x.
\end{align*}
So we can determine the individual value functions. How do we determine the
scaling factors? Define $o_{00}$ to be the outcome in which $a$ and $b$ both
take minimum value; similarly for $o_{11}$ and maximum value. Let $o_{01}$
have minimum value for $a$ and maximum for $b$, conversely for $o_{10}$. For
some value $x_a$, you should be indifferent between $o_{10}$ and a lottery
$x_a o_{11} + (1-x_a) o_{00}$. So
\[
U(o_{10}) = x_a U(o_{11}) + (1-x_a)U(o_{00}).
\]
If you are indifferent between aspect equivalent outcomes,
\begin{gather*}
U(o_{00}) = 0\\
U(o_{11}) = 1 \text{ provided $s_a +s_b = 1$}\\
U(o_{10}) = s_a.
\end{gather*}
Plugging these into the previous equation, we get
\[
s_a = x_a.
\]
Suppose for each $x_1,y_1$ and $z$, if
$\t{x_1,x_2,\ldots,x_n} \succsim z \succsim \t{y_1,x_2,\ldots,x_n}$ then there
is some $t_1$ such that $z \sim \t{t_1, x_2,\ldots, x_n}$. Then $\succsim$ is
said to have \textbf{restricted solvability} w.r.t. the first attribute.
(Similarly for the other attributes.) Restricted solvability is not quite
enough for additive representation. We need to add non-triviality of each
position: that for each $i$ there are $x_i, y_i$ such that
$\t{x_1,\ldots,x_i\ldots,x_n} \succ \t{x_1,\ldots,y_i\ldots,x_n}$. For the
infinite case, we also need an Archimedean condition blocking lexical
orderings. We also need to assume weak separability.
} %
% Could do the Kant example: happiness is good for virtuous people and bad for
% non-virtuous people. Is this a counterexample to additivity?
\section{Separability across time}\label{sec:separability-time}
According to psychological hedonism, the only thing people ultimately care about
is their personal pleasure. But pleasure isn't constant. The hedonist conjecture
leaves open how people rank different ways pleasure can be distributed over a
lifetime. Unless an agent just cares about their pleasure at a single point in
time, a basic desire for pleasure is really a concern for a lot of things:
pleasure now, pleasure tomorrow, pleasure the day after, and so on. We can think
of these as the ``attributes'' in the agent's intrinsic utility function. The
hedonist's intrinsic utility function somehow aggregates the value of pleasure
experienced at different times.
To keep things simple, let's pretend that pleasure does not vary within any
given day. We might then model a hedonist utility function as a function that
assigns numbers to lists like $\t{1,10,-1,2,\ldots}$, where the elements in the
list specify the agent's degree of pleasure today (1), tomorrow (10), the day
after (-1), and so on. Such attribute lists, in which successive positions
correspond to successive points in time, are called \textbf{time streams}.
A hedonist agent would plausibly prefer more pleasure to less at any point in
time, no matter how much pleasure there is before or afterwards. If so, their
preferences between time streams are weakly separable. Strong separability is
also plausible: whether the agent prefers a certain amount of pleasure on some
days to a different amount of pleasure on these days should not depend on how
much pleasure the agent has on other days. It follows by Debreu's theorem that
the utility the agent assigns to a time stream can be determined as the sum of the
subvalues she assigns to the individual parts of the stream. That is, if $p_1$,
$p_2$, \ldots, $p_n$ are the agent's degrees of pleasure on days
$1, 2, \ldots, n$ respectively, then there are subvalue functions
$V_1,V_2,\ldots,V_n$ such that
\[
\U(\t{p_1,p_2,\ldots,p_n}) = V_1(p_1) + V_2(p_2) + \ldots + V_n(p_n).
\]
We can say more if we make one further assumption. Suppose an agent prefers
stream $\t{p_1,p_2,\ldots,p_n}$ to an alternative $\t{p_1',p_2',\ldots,p_n'}$.
Now consider the same streams with all entries pushed one day into the future,
and prefixed with the same degree of pleasure $p_0$. So the first stream turns
into $\t{p_0, p_1,p_2,\ldots,p_n}$ and the second into
$\t{p_0, p_1',p_2',\ldots,p_n'}$. Will the agent prefer the modified first
stream to the modified second stream, given that she preferred the original
first stream? If the answer is yes, then her preferences are called
\textbf{stationary}. From a hedonist perspective, stationarity seems plausible:
if there's more aggregated pleasure in $\t{p_1,p_2,\ldots,p_n}$ than in
$\t{p_1',p_2',\ldots,p_n'}$, then there is also more pleasure in
$\t{p_0,p_1,p_2,\ldots,p_n}$ than in $\t{p_0,p_1',p_2',\ldots,p_n'}$.
It is not hard to show that if preferences over time streams are separable and
stationary (as well as transitive and complete), then they can be represented by
a function of the form
\[
\U(\t{A_1,\ldots,A_n}) = V_1(A_1) + \delta \cdot V_1(A_2) +
\delta^2 \cdot V_1(A_3) + \ldots + \delta^{n-1} \cdot V_1(A_n),
\]
where $\delta$ is a fixed positive number. The interesting thing here is
that the subvalue function for time $i$ equals the subvalue function $V_1$ for
the first time, scaled by an exponential \textbf{discounting factor} $\delta^{i-1}$.
\cmnt{%
Here's the argument, roughly. By separability,
$U(p_1,\ldots,p_n) = u_1(p_1) + \ldots$. By stationarity,
$u_1(p_1) + \ldots \geq u_1(p_1') + \ldots$ iff
$u_2(p_1) + \ldots \geq u_2(p_1') + \ldots$. By cardinal uniqueness there
exist $\delta > 0$ and $b_i$ such that $u_{i+1} = \delta u_i + b_i$, which by
cardinal uniqueness again means we can find another representation with
$u_{i+1} = \delta u_i$.
} %
If a hedonist has strongly separable and stationary preferences, then her
preferences over time streams are fixed by two things: how much she values
present pleasure, and how much she discounts the future. If $\delta = 1$, the
agent values pleasure equally, no matter when it occurs. If
$\delta = \nicefrac{1}{2}$, then one unit of pleasure tomorrow is worth half as
much to the agent as one unit today; the day after tomorrow it is worth a
quarter; and so on.
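Here is the discounted sum as a short Python sketch; the identity subvalue
function $V_1(p) = p$ is an assumption made purely for illustration.
\begin{verbatim}
def discounted_utility(stream, delta, v1=lambda p: p):
    """U(<p1,...,pn>) = v1(p1) + delta*v1(p2) + ... + delta^(n-1)*v1(pn).
    v1 is the subvalue function for the present moment; the identity
    default is an illustrative assumption."""
    return sum(delta**i * v1(p) for i, p in enumerate(stream))

print(discounted_utility((10, 0, 0), delta=0.5))  # 10.0
print(discounted_utility((0, 11, 0), delta=0.5))  # 5.5: 10 now beats 11 tomorrow
print(discounted_utility((0, 11, 0), delta=1.0))  # 11.0: undiscounted, 11 wins
\end{verbatim}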
\begin{exercise}{1}
Consider the following streams of pleasure:
\begin{enumerate}
\itemsep-0.3em
\item[S1:] $\t{1,2,3,4,5,6,7,8,9}$
\item[S2:] $\t{9,8,7,6,5,4,3,2,1}$
\item[S3:] $\t{1,9,2,8,3,7,4,6,5}$
\item[S4:] $\t{9,1,8,2,7,3,6,4,5}$
\item[S5:] $\t{5,5,5,5,5,5,5,5,5}$
\end{enumerate}
Assuming present pleasure is valued in proportion to its degree, so
that $V_1(p) = p$ for all degrees of pleasure $p$, how would a
hedonist agent with separable and stationary preferences rank these
streams, provided that (a) $\delta = 1$, (b)
$\delta < 1$, (c) $\delta > 1$? (You need to give three answers.)
\end{exercise}
Even if you're not a hedonist, you probably care about some things that can
occur (and re-occur) at different times: talking to friends, going to concerts,
having a glass of wine, etc. The formal results still apply. If your preferences
over the relevant time streams are separable and stationary, then they are fixed
by your subvalue function for the relevant events (talking to friends, etc.)
right now and by a discounting parameter $\delta$.
Some have argued that stationarity and separability across time are
requirements of rationality. Some have even suggested that the only rationally
defensible discounting factor is 1, on the ground that we should be impartial
with respect to different parts of our life.
One argument in favour of stationarity is that it is often thought to be
required to protect the agent from a kind of disagreement with her future self.
To illustrate, suppose you prefer $\t{10, 0, 0, 0, \ldots}$ to
$\t{0, 11, 0, 0, \ldots}$ because you care more about today's pleasure than
about tomorrow's. You care less about the difference between getting pleasure in
four days and getting it in five days, so you prefer
$\t{0, 0, 0, 0, 0, 11, 0, \ldots}$ (11 units in five days) to
$\t{0, 0, 0, 0, 10, 0, 0, \ldots}$ (10 units in four days). These
preferences violate stationarity. Stationarity would imply that if you prefer
$\t{10, 0, 0, 0, \ldots}$ to $\t{0, 11, 0, 0, \ldots}$ then you also prefer the
first stream to the second if both are prefixed with 0, and therefore also if
both are prefixed with two 0s, three 0s, or four 0s. Now suppose your
(non-stationary) preferences remain the same for the next 4 days. At the end of
this time, you'd still rather have 10 units of pleasure today than 11 tomorrow:
you still prefer $\t{10, 0, 0, 0, \ldots}$ to $\t{0, 11, 0, 0, \ldots}$. But
your ``today'' is what used to be ``in 4 days''. Your new preferences disagree
with those of your earlier self, in the sense that the worlds your former self
regarded as better you now regard as worse. This kind of disagreement is called
\textbf{time inconsistency}.
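The reversal can be made vivid with a short computation. The Python sketch
below compares exponential discounting, which is stationary, with a hyperbolic
discount function $d(t) = 1/(1+kt)$ of the kind often used to model such
preferences; the particular parameter values are illustrative assumptions.
\begin{verbatim}
# Stationary (exponential) vs non-stationary (hyperbolic) discounting.
# Parameter values are illustrative assumptions.

def prefers_10_sooner(discount, t):
    """Is 10 units of pleasure at time t worth more than 11 at t+1?"""
    return 10 * discount(t) > 11 * discount(t + 1)

exponential = lambda t: 0.9 ** t            # delta = 0.9
hyperbolic  = lambda t: 1 / (1 + 0.15 * t)  # d(t) = 1/(1 + k*t), k = 0.15

for name, disc in [("exponential", exponential), ("hyperbolic", hyperbolic)]:
    print(name, prefers_10_sooner(disc, 0), prefers_10_sooner(disc, 4))
# exponential: True True  -- no reversal: stationary
# hyperbolic:  True False -- prefers 11 in five days, yet 10 today: reversal
\end{verbatim}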
% xxxx I should explain that time inconsistency is about preference over lists.
% It's not enough to look at the present entry in the list.
% I say that stationarity "appears to be required" for time inconsistency
% because I think this is only true under certain (problematic) modeling
% assumptions: see https://www.umsu.de/blog/2017/671
Empirical studies suggest that time inconsistency is pervasive. People often
prefer their future selves to study, eat well, and exercise, but choose burgers
and TV for today.
These preferences do look problematic. Other apparent violations of
stationarity, and even separability across time, however, look OK. Suppose you
like to have a glass of wine every now and then. But only now and then; you
don't want to have wine every day. It seems to follow that your preferences
violate both separability and stationarity. You violate stationarity because
even though you might prefer a stream $\t{\text{wine}, \text{no wine}, \text{no
wine}, \ldots}$ to $\t{\text{no wine}, \text{no wine}, \text{no
wine}, \ldots}$, your preference reverses if both streams are prefixed with
wine (or many instances of wine). You violate separability because whether you
regard having wine in $n$ days as desirable depends on whether you will have
wine right before or after these days.
Even if an agent only cares about pleasure, it is not obvious why a rational
agent might not (say) prefer relatively constant levels of pleasure over wildly
fluctuating levels, or the other way round.
One might argue, however, that in these cases the items in the time streams do
not represent your basic desires, or not all of them. If, for example, you have a
preference for constant levels of pleasure, then your basic desires don't just
pertain to how much pleasure you have today, how much pleasure you have
tomorrow, and so on. You have a further basic desire: that your pleasure be
constant from day to day.
\begin{exercise}{2}
Are your preferences in the wine example time-inconsistent, in the sense that
what you prefer for your future self is not what your future self prefers for
itself?
\end{exercise}
\begin{exercise}{2}
If you care about whether you have wine on consecutive days, then arguably an
adequate time stream for your concerns shouldn't simply specify, for each day,
whether you do or do not have wine, but also whether you are \emph{having wine
after having had wine the previous day}. An adequate representation of a
week in which you have wine on days 2, 4, and 5 would therefore be
$\t{\bar{W} \bar{P}, W \bar{P}, \bar{W} P, W \bar{P}, W P, \bar{W} P, \bar{W} \bar{P}}$,
where $W$ means that you have wine, $\bar{W}$ that you don't have wine, $P$
that you had wine the previous day, and $\bar{P}$ that you didn't have wine
the previous day. Do your preferences over such streams satisfy separability
and stationarity?
\end{exercise}
Let's briefly return to the problematic kind of time-inconsistency, manifested
by the common desire for vice today and virtue tomorrow. What could explain this phenomenon?
Part of the explanation might be that our preferences have different sources (as
I emphasized in chapter \ref{ch:utility}). When we reflect on having fries or
salad now, we are more influenced by spontaneous cravings than when we consider
the same options for tomorrow.
We could represent different sources of value by different subvalue functions.
We might, for example, have a subvalue function $V_{c}$ that measures the extent
to which a proposition satisfies your present cravings, and another subvalue
function $V_{m}$ that measures to what extent it matches your moral convictions.
Your intrinsic utility function is some kind of aggregate of these components.
Here, too, separability is plausible. If, for example, you think that one world
is morally better than another, and the two worlds are equally good with respect
to all your other motives (your cravings are equally satisfied in either, etc.),
then you plausibly prefer the first world to the second. This suggests that
different sources of intrinsic utility combine in an additive manner.
\cmnt{%
An interesting observation in behavioural economics concerns the difference
between single and repeated offers of gambles. Many people would reject a
gamble $[0.5? \$200 : -\$100]$; but hardly anyone would reject a long sequence
of such gambles. It is clear why: if losses hurt more than gains are
pleasurable, then a single gamble looks much less attractive than a sequence
of gambles. With a hundred instances of the above gamble, the chance of a net
loss is down from $1/2$ to $1/2300$. This is interesting because it shows that
we mustn't assume that if an agent prefers $A$ over $B$ in a single choice,
she also prefers $A$ over $B$ when the choice is repeated, or known to be part
of a sequence. As Kahneman \citey[338f.]{kahneman11thinking} points out, it
also suggests that we are often irrational when we evaluate choices by
themselves, not regarding them in a wider context. After all, every choice is
in a sense part of a long sequence. If you reject gambles with positive
expected payoff out of loss aversion, you do worse in the long run.
} %
\cmnt{%
Dual-ranking?
Harder if morality imposes fixed side constraints.
Compare Sen on commitment vs preference?
Compare also http://www.tnr.com/article/books-and-arts/true-lies on the book
``Private Truths, Public Lies: The Social Consequences of Preference
Falsification'' 1995, By Timur Kuran.
Kuran's real interest is not in moral evaluation, but in explaining individual
and collective choices. To this end, he offers a simple economic framework
based on three factors, which he describes (somewhat awkwardly) as intrinsic
utility, reputational utility and expressive utility. A person's purely
private preference is based on the intrinsic utility, to him, of the options
under consideration. Some people really want to get rid of affirmative action
or welfare programs, because they think that these are bad things, but their
private preference may not be expressed publicly, because of the loss of
reputational utility that would come from expressing it. The importance of
reputational utility in a particular case depends on the extent of the risk to
your reputation, and also on how much you care about your reputation. And
people get what Kuran calls expressive utility from bringing their public
statements into alignment with their private judgments. We all know people who
hate to bow before social pressures; such people are willing to risk their
reputation because what they especially hate is to speak or act in a way that
does not reflect their true beliefs.
} %
% \section{Separability across states}
% An agent faces a choice between some acts. According to the MEU Principle, the
% agent should evaluate each option $A$ by its expected utility
% \[
% \EU(A) = \U(O_1)\cdot \Cr(S_1) + \U(O_2)\cdot \Cr(S_2) + \ldots + \U(O_n)\cdot\Cr(S_n),
% \]
% where $S_1,S_2,\ldots,S_n$ are the relevant states and $O_1,O_2,\ldots,O_n$ are
% the outcomes of act $A$ in those states. Holding fixed the states, we can
% represent each available act by the list of its outcomes:
% $\t{O_{1}, O_{2},\ldots,O_{n}}$. In the mushroom problem from chapter
% \ref{ch:overview}, for example, eating the mushroom can be represented by the
% list $\t{\text{satisfied}, \text{dead}}$, and not eating by
% $\t{\text{hungry}, \text{hungry}}$, with the understanding that the first item
% in the list comes about if the mushroom is a paddy straw and the second if it is
% a death cap.
% Suppose the agent ranks the available acts by their expected utility. Her
% preference over the relevant outcome lists then have an additive representation: they
% are represented by a function $\U$ that assigns numbers to lists in such a way
% that the number assigned to any list is determined by adding up subvalues
% assigned to individual items on the list. This function $\U$ is the $\EU$
% function; the subvalues are the credence-weighted utilities of the outcomes. The
% subvalue of outcome $O_1$, for example, is $\U(O_1)\cdot\Cr(S_1)$.
% By Debreu's theorem, rational preferences have an additive representation if and
% only if they are strongly separable. The MEU Principle therefore implies that an
% agent's preferences between the acts in a decision problem are (strongly)
% separable \emph{across states}, meaning that the desirability of an outcome in
% one state does not depend on the outcomes in other states.
% Admittedly, this is a very roundabout path to a fairly obvious result. I mention
% it for two reasons. First, it shows that the response to the ordinalist
% challenge from the previous section is closely related to the response that we
% met in chapter \ref{ch:preference}. Von Neumann and Ramsey, in effect, assume
% that rational preferences are separable across states, and that the right way to
% measure separable preferences construes the net utility of an act as the sum of
% certain values assigned to the individual outcomes.
% Second, a general consequence of separability is that the relevant preferences
% are insensitive to ``shapes'' in the distribution of subvalues. For example,
% separable preferences cannot prefer even distributions to uneven distributions.
% This may seem to point at a problem with the MEU Principle. Consider the following schematic decision problem:
% %
% \begin{center}
% \begin{tabular}{|r|c|c|}\hline
% \gr & \gr State 1 (\nicefrac{1}{2}) & \gr State 2 (\nicefrac{1}{2}) \\\hline
% \gr $A$ & Outcome 1 (+10) & Outcome 1 (+10) \\\hline
% \gr $B$ & Outcome 2 (-10) & Outcome 3 (+30) \\\hline
% \end{tabular}
% \end{center}
% %
% Option $A$ leads to a guaranteed outcome with utility 10, while option $B$ leads
% either to a much better outcome or to a much worse one. The expected utilities
% are the same, but one might think an agent might rationally prefer $A$ because
% the utility distribution $\t{10,10}$ is more even than $\t{\text{-10},30}$.
% Intuitively, $A$ is safe, while $B$ is risky. We will return to this issue in
% the next chapter.
% \begin{exercise}{2}
% Where in their axioms do von Neumann and Morgenstern assume a kind
% of separability across states?
% \end{exercise}
\section{Harsanyi's ``proof of utilitarianism''}\label{sec:utilitarianism}
The ordinalist movement posed a challenge not only to the MEU Principle, but
also to utilitarianism in ethics. Utilitarianism is a combination of two claims.
The first says that an act is right iff it brings about the best available state
of the world. The second says that the ``goodness'' of a state is the sum of the
utilities of all people. Without a numerical (and not just ordinal) measure of
personal utility, this second claim makes no sense. We would need a new
criterion for ranking states of the world.
One such criterion was proposed by Pareto. Recall that Pareto did not deny that
people have preferences. If we want to know which of two states is better, we
can still ask which of them people prefer. This allows us to define at least a
partial order on the possible states:
%
\begin{genericthm}{The Pareto Condition}
If everyone is indifferent between $A$ and $B$, then $A$ and $B$ are equally
good; if at least one person prefers $A$ to $B$ and no one prefers $B$ to $A$,
then $A$ is better than $B$.
\end{genericthm}
%
Unlike classical utilitarianism, however, the Pareto Condition offers little
moral guidance. For instance, while classical utilitarianism suggests that one
should harvest the organs of an innocent person in order to save ten others, the
Pareto Condition does not settle whether it would be better or worse to harvest
the organs, given that the person to be sacrificed ranks the options differently
than those who would be saved.
\cmnt{%
When Ramsey, Savage, and von Neumann and Morgenstern showed how a meaningful
numerical quantity of utility could be derived from an agent's preferences,
they helped to rescue the MEU Principle, but they did not help classical
utilitarianism. For remember that the utility functions derived from personal
preferences have arbitrary zero and unit. Thus if according to one adequate
representation of our preferences, my utility for a given state is 10 and
yours is 0, then on another equally adequate representation, my utility for
the state will be -100 and yours 20.
} %
\begin{exercise}[The Condorcet Paradox]{1}
A ``democratic'' strengthening of the Pareto condition might say that whenever
\emph{a majority} of people prefer $A$ to $B$, then $A$ is better than $B$.
But consider the following scenario. There are three relevant states: $A,B,C$,
and three people. Person 1 prefers $A$ to $B$ to $C$. Person 2 prefers $B$ to
$C$ to $A$. Person 3 prefers $C$ to $A$ to $B$. If betterness is decided by
majority vote, which of $A$ and $B$ is better? How about $A$ and $C$, and $B$
and $C$?
\end{exercise}
In 1955, John Harsanyi proved a remarkable theorem that seemed to rescue, and
indeed vindicate, classical utilitarianism.
As a first step, Harsanyi adopts von Neumann's response to the ordinalist
challenge. He assumes that each individual has preferences not only among the
relevant states, but also among lotteries involving the states, and that their
preferences conform to the von Neumann and Morgenstern axioms. We can then
represent their preferences by personal utility functions $\U_{1},\ldots,\U_{n}$
(one for each individual) that are unique up to the choice of unit and zero.
Our goal is to derive a ``social preference'' relation between states that
settles whether a state is overall better than another. Harsanyi assumes that
this social preference relation can be extended to lotteries in a way that
conforms to the von Neumann and Morgenstern axioms. It follows that social
preference is also represented by a (``social'') utility function $\U_{s}$
that is unique up to the choice of unit and zero.
Harsanyi now showed that if we add the Pareto condition (for both states and
lotteries), then the individual and social preferences are represented by
utility functions $\U_1,\ldots,\U_n$ and $\U_s$ in such a way that social
utility is simply the sum of the individual utilities: for any state $A$,
\[
\U_s(A) = \U_1(A) + \ldots + \U_n(A).
\]
Once we have allowed lotteries into the picture, the Pareto condition entails
full-blown utilitarianism! How is this possible?
The Pareto condition implies that the social utility of any state is determined
by the personal utility each individual assigns to the state. For suppose the
social utility of some state $A$ depends on an aspect of $A$ that doesn't affect
the personal utilities. Then there is an alternative $B$ to $A$ (that differs
from $A$ in this aspect) for which $\U_{s}(B) \not= \U_{s}(A)$ even though every
individual assigns the same utility to $A$ and $B$. This contradicts the Pareto
condition.
So the only ``attributes'' of a state that are relevant to its social utility
are its personal utility scores. We can represent a state by a list of numbers
$\t{u_{1}, \ldots, u_{n} }$, each of which specifies how desirable the state is
for a particular individual.
Most non-utilitarians would disagree on this point. They would hold that even if
everyone is indifferent between two states $A$ and $B$, $A$ might still be worse
than $B$, if it involves gratuitous human rights violations, animal suffering,
sin, or whatever.
The really surprising part of Harsanyi's theorem is that the social utility of a
state is simply the sum of its personal utility scores $u_{1} + \ldots + u_{n}$.
This tells us that social preference is separable across the personal utilities,
and that each personal utility (each attribute) simply contributes its value to
social utility. How does this come about? Couldn't an even distribution
$\t{10,10,10,10,\ldots }$ be better than an uneven distribution
$\t{0,20,0,20,\ldots}$? Relatedly, couldn't personal utility have ``declining
social value'', so that adding 1 unit of personal utility to an individual whose
utility is already at 1000 contributes less to social utility than adding 1 unit
to an individual who stands at 0?
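To see what is at stake, here is a small Python sketch contrasting the
utilitarian sum with a hypothetical rule on which personal utility has
declining social value. The square-root aggregation is an illustrative
assumption, not part of Harsanyi's framework.
\begin{verbatim}
from math import sqrt

even   = [10, 10, 10, 10]
uneven = [0, 20, 0, 20]

def utilitarian(us):   # social utility as the plain sum
    return sum(us)

def declining(us):     # hypothetical concave rule (assumption)
    return sum(sqrt(u) for u in us)

print(utilitarian(even), utilitarian(uneven))  # 40 40: the sum is indifferent
print(declining(even), declining(uneven))      # ~12.6 vs ~8.9: prefers even
\end{verbatim}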
These possibilities are ruled out by three assumptions that look harmless in
isolation, but have great power when combined.