\chapter{Separability}\label{ch:separability}
\section{The construction of utility}\label{sec:construction-value}
When a possible outcome looks attractive, then this is usually because it has
attractive aspects. It may also have unattractive aspects, but the attractive
aspects (the ``pros'') outweigh the unattractive aspects (the ``cons''). In this
chapter, we will explore how this weighing of different aspects might work.
Take a concrete example. You are looking for a flat to rent. There are two
options. $A$ is a small and central flat that costs £800/month. $B$ is a larger
flat in the suburbs for £600/month. You might draw up a list of pros and cons
for each option, and give them a weight, like so:
\medskip
\begin{center}
% \setlength{\arrayrulewidth}{1pt}
\arrayrulecolor{lightgray}
\def\arraystretch{1} % horizontal padding
\begin{tabular}{|c|c|}
% \arrayrulecolor{lightgray}
\rowcolor{gray!20}
$A$ & $B$ \\\hline
good location (+2) & bad location (-2) \\
a little small (-1) & good size (+3) \\
expensive (-3) & a little expensive (-1) \\\hline
\end{tabular}
\end{center}
\medskip\noindent%
You might then determine the \emph{total} utility of each option as the sum of
these numbers, so that $\U(A)$ is $+2-1-3 = -2$, while $\U(B)$ is $-2+3-1 = 0$.
Is this a reasonable approach? It looks OK in this example. But we have to be
careful. Suppose you had drawn up the following table.
\medskip
\begin{center}
% \setlength{\arrayrulewidth}{1pt}
\arrayrulecolor{lightgray}
\def\arraystretch{1} % horizontal padding
\begin{tabular}{|c|c|}
% \arrayrulecolor{lightgray}
\rowcolor{gray!20}
$A$ & $B$ \\\hline
good location (+2) & bad location (-2) \\
short commute (+1) & long commute (-1) \\
can get up later (+1) & have to get up earlier (-1) \\
a little small (-1) & good size (+3) \\
expensive (-3) & a little expensive (-1) \\\hline
\end{tabular}
\end{center}
\medskip\noindent%
Now $\U(A)$ comes out as $0$ and $\U(B)$ as $-2$. Do you see what's wrong with
this table?
The problem is that the first three criteria in the list aren't independent.
Once you've taken ``good location'' into account, you shouldn't
\emph{additionally} take into account ``short commute'' and ``can get up
later''. Location, size, and costs are independent criteria. Location and
commute time are not.
But what, exactly, does independence mean here? There is no \emph{logical}
connection between ``good location'' and ``short commute''. And there may well
be a strong statistical connection between (say) location and costs.
\section{Additivity}\label{sec:additivity}
Let's stick with the flat example. We assume that you care about certain aspects
of a flat: size, location, and costs. We'll call these aspects
\textbf{attributes}. Let's assume that size, location, and costs are all the
attributes that ultimately matter to you. Your preferences between possible
flats are then determined by your preferences between combinations of these
attributes. If two flats perfectly agree in each of the three attributes then
you are always indifferent between them. If you prefer one flat to another,
that's always because you prefer the combined attributes of the first to those
of the second.
Instead of talking about the desirability of a particular flat, we can therefore
talk about the desirability of its attributes. We'll write combinations of
attributes as lists enclosed in angular brackets.
`$\t{40 \m^2, \text{central}, \text{£500}}$', for example, would represent any
flat with a size of 40 $\m^2$, central location, and monthly costs of £500. We
are interested in the utility you assign to any such list.
% Your intrinsic utility function then assigns the same value to all flats
% represented by the list $\t{40 \m^2, \text{central}, \text{£500}}$.
\cmnt{%
An \textbf{attribute}, on this usage, is not a particular property of any
particular flat. Rather, an attribute is a set of related properties -- a set
that divides all possible flats into groups. For example, monthly cost of rent
(an attribute) divides all possible flats into those that cost £300, those that
cost £310, those that cost £320, and so on.%
} %
Strictly speaking, of course, utility functions don't assign numbers to lists,
or even to flats. When I say that you prefer one kind of flat over another, what
I really mean is that you prefer living in the first kind of flat over living in
the other. In full generality, we should speak about attributes of worlds, not
of flats. To keep things simple, we currently assume that the only thing you
ultimately care about is what kind of flat you are living in (or going to live
in). A list like $\t{40 \m^2, \text{central}, \text{£500}}$ therefore settles
everything you ultimately care about. It represents one of your ``concerns'', in
the terminology of section \ref{sec:basic-desire}.
In the example from section \ref{sec:basic-desire}, we assumed that you care
about two things: being free from pain and being admired. We pretended that
these are all-or-nothing matters. The resulting four concerns could be
represented by the following lists:
\[
\t{\emph{Pain}, \emph{Admired}}, \t{\neg\emph{Pain}, \emph{Admired}}, \t{\emph{Pain}, \neg\emph{Admired}}, \t{\neg\emph{Pain}, \neg\emph{Admired}}.
\]
Here, there are two attributes, each of which can take two values. The first
attribute specifies whether you are in pain, and the answer is either yes or no.
The second attribute similarly specifies whether you are admired. If we allowed
for different degrees of pain, then the first attribute would have more than two
possible values. We could, for example, distinguish
$\t{\emph{Little Pain}, \emph{Admired}}$ from
$\t{\emph{Strong Pain}, \emph{Admired}}$.
% It's the same with possible worlds. If all you care about is the degree of
% pleasure of you and your three best friends, then we can represent your basic
% desires by a value function that assigns numbers to lists like
% $\t{10, 1, 2, 3}$, specifying degrees of pleasure for you and your friends (in
% some fixed order). Any such list effectively specifies one of your concerns: a
% maximal conjunction of propositions you care about.
In the flat example, we have three attributes, each of which can take many
different values: size, location, and costs. Your intrinsic utility function
assigns a desirability score to all possible combinations of these values.
If you're like most people, we can say more about how these scores are
determined. For example, you probably prefer cheaper flats to more expensive
flats, and larger flats to smaller flats. The ``weighing up pros and cons'' idea
suggests that the overall score for a flat is determined by adding up individual
scores for the flat's properties. Let's spell out how this might work.
We want to compute the utility of any given attribute list as the sum of numbers
assigned to the elements in the list. We'll call these numbers
\textbf{subvalues}. A size of 40 m$^{2}$ might have subvalue
$V_{S}(40\, \m^2) = 1$. Central location might have subvalue
$V_{L}(\text{central}) = 2$. Monthly costs of £500 might have subvalue
$V_{C}(\text{£500}) = -1$. Note that we have three different subvalue functions:
one for size, one for location, one for costs. The overall value (utility) of
$\t{40\, \m^2, \text{central}, \text{£500}}$ would then be the sum of these
subvalues:
\[
\U(\t{40\, \m^2, \text{central}, \text{£500}}) = V_{S}(40 \m^2) +
V_{L}(\text{central}) + V_{C}(\text{£500}) = 2.
\]
%
If $\U$ is determined by adding up subvalues in this manner, then it is called
\textbf{additive} relative to the attributes in question.
Additivity may seem to imply that you assign the same weight to all the
attributes: that size, location, and price are equally important to you. To
allow for different weights, we could introduce scaling factors $w_S, w_L, w_C$,
so that
\[
\U(\t{40\, \m^2, \text{central}, \text{£500}}) = w_S \cdot V_{S}(40\, \m^2) +
w_L \cdot V_{L}(\text{central}) + w_C \cdot V_{C}(\text{£500}).
\]
For convenience, we will omit the weights by folding them into the subvalues. We
will let $V_S(200\, \m^2)$ measure not just how awesome it would be to have a 200
$\m^2$ flat, but also how important this feature is compared to cost and
location.
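To make the additive recipe concrete, here is a minimal Python sketch of a
utility function for the flat example. The subvalues for
$\t{40\, \m^2, \text{central}, \text{£500}}$ are the ones just given; the other
numbers are made up for illustration, with any weights already folded into the
subvalue functions.
\begin{verbatim}
# Minimal sketch of an additive utility function (flat example).
# Subvalue numbers are illustrative assumptions; weights are folded in.

def v_size(m2):            # subvalue function for size
    return {40: 1, 50: 2, 60: 3}[m2]

def v_location(loc):       # subvalue function for location
    return {"central": 2, "suburbs": -1, "beach": 1}[loc]

def v_cost(gbp):           # subvalue function for monthly cost (GBP)
    return {500: -1, 600: -2, 800: -4}[gbp]

def utility(m2, loc, gbp):
    # Additivity: the utility of an attribute list is the sum
    # of the subvalues of its items.
    return v_size(m2) + v_location(loc) + v_cost(gbp)

print(utility(40, "central", 500))  # 1 + 2 - 1 = 2, as in the text
\end{verbatim}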
Subvalue functions are typically defined over propositions that don't have
uniform utility. Recall that, strictly speaking, `$200\, \m^{2}$' expresses the
proposition that you are going to live in a 200 $\m^{2}$ flat. Some of the worlds
where you live in such a flat are great. Others are bad. That's because you also
care about location and costs, and the $200\, \m^{2}$ worlds differ in these respects. An
(improbable) world in which you rent a 200 $\m^{2}$ central flat for £100/month
is better than a (more probable) world in which you rent a 200 $\m^{2}$ flat in
the suburbs for £1000/month. As a result, the utility of 200 $\m^2$ may
be low, even though the subvalue is high.
Informally, the \emph{utility} of 200 $\m^{2}$ measures the desirability of the relevant
proposition. Would you be glad to learn that you are going to rent a 200 $\m^{2}$
flat? Perhaps not, because the large size indicates high costs and bad location.
The \emph{subvalue} of 200 $\m^{2}$ is not sensitive to your beliefs. It measures the
intrinsic desirability of that size, no matter what it implies or
suggests about other attributes. It measures how much a size of 200 $\m^{2}$
contributes to the overall desirability of a flat, holding fixed the other
attributes.
% \begin{exercise}{2}\label{e:subv-not-u}
% % Like utility functions, subvalue functions assign numbers to propositions that
% % needn't be of uniform utility. Unlike utility functions, however, subvalue
% % functions are insensitive to belief. For example,
% If you can afford to pay £600 in monthly rent, then $V_{C}(\text{£300})$ is
% plausibly high, even though the utility you assign to renting a flat for £300
% is plausibly low. Can you explain why?
% \end{exercise}
\cmnt{%
If we want to decompose the overall desirability of a given flat into the
desirability of the flat's individual aspects, we need to assume that the
desirability for the aspects has more than just an ordinal scale.
Consider number of rooms. Perhaps you'd really like to have more than one
room, but you don't care much about whether you have three rooms or four. Your
ranking of the possibilities is 4 rooms $\succ_r$ 3 rooms $\succ_r$ 2 rooms
$\succ_r$ 1 room, but the difference in desirability between 4 rooms and 3
rooms is smaller than that between 2 rooms and 1 room. (`$\succ_r$' is
supposed to represent your basic preferences over room number, setting aside
your knowledge that more rooms typically cost more, etc.) To make such
comparisons between differences meaningful, we need an interval scale.
So let's assume that if we arbitrarily measure the desirability of 1 room as 0
and the desirability of 2 rooms as 10, then your basic desires fix the number
assigned to 3 rooms and 4 rooms. Let $V_r$ be the resulting value function. It
is a value function not a utility function because we take it to be defined
just over room numbers. We don't ``look inside'' the room number
possibilities, taking into account what else is likely to be the case if a
flat has four rooms.
} %
\begin{exercise}{3}
We could define a concept of additivity purely in terms of utility. Let's say
that a utility function $\U$ is \emph{utility-additive} with respect to
attributes $A_{1},\ldots,A_{n}$ iff
$\U(\t{A_{1},\ldots,A_{n}}) = \U(A_{1})+\ldots+\U(A_{n})$. Explain why your
utility function in the flat example isn't utility-additive with respect to
size, location, and costs.
% Consider <200m, central>. If you think that central location and large size
% are inversely correlated, then U(200m) and U(central) are relatively low
% although U(200m \land central) is relatively high. We can use Jeffrey's
% axiom to expand U(A&B) = U(A)+U(B) and show that if Cr(A&B/A) = Cr(A&B/B) =
% 0.1 and U(A&¬B) = U(¬A&B) = x, then the equation holds iff U(A&B) =
% 0.444444x. I.e., if A&B is unlikely given both A and B and its utility is
% positive, then A&¬B and ¬A&B have greater utility.
\end{exercise}
\begin{exercise}{2}
Additivity greatly simplifies an agent's psychology. Suppose an agent's basic
desires pertain to 10 logically independent propositions
$A_1,A_2,\ldots,A_{10}$. There are $2^{10} = 1024$ conjunctions of these
propositions and their negations (such as
$A_1 \land A_2 \land \neg A_3 \land \neg A_4 \land A_5 \land A_6 \land \neg A_7 \land A_8 \land A_9 \land \neg A_{10}$).
To store the agent's intrinsic utility function in a database, we would therefore need to
store up to 1024 numbers. How many numbers do we need to store in the database
if the agent's intrinsic utility function is additive?
\end{exercise}
\section{Separability}\label{sec:separability}
Under what conditions is intrinsic utility determined by adding subvalues? How are different
subvalue functions related to one another? We can get some insight into these
questions by following an idea from the previous chapter and studying how intrinsic
utility might be derived from preferences.
% For the moment, we want to set aside the influence of the agent's beliefs, so we
% are not interested in an agent's preferences between lotteries or gambles.
% Rather, we are interested in an agent's preferences between complete attribute
% lists, assuming the relevant attributes comprise everything the agent cares
% about.
The main motivation for starting with preferences is, as always, the problem of
measurement. We need to explain what it means that your subvalue for a given
attribute is 5 rather than 29. Since the numbers are supposed to reflect, among
other things, the importance (or weight) of the relevant attribute in comparison
to other attributes, it makes sense to determine the subvalues from their effect
on the overall ranking of attribute lists.
So assume we have preference relations $\succ$, $\succsim$, $\sim$ between lists
of attributes. (We aren't interested in lotteries or gambles this time, only in
complete concerns.) To continue the illustration in terms of flats, if you
prefer a central \mbox{40 $\m^2$} flat for £500 to a central 60 $\m^2$ flat for £800,
then we have
\[
\t{40 \m^2, \text{central}, \text{£500}} \succ
\t{60 \m^2, \text{central}, \text{£800}}.
\]
If, like most people, you prefer to pay less rather than more, then your
subvalue function $V_C$ is a decreasing function of monthly costs: the higher
the costs $c$, the lower $V_C(c)$. This doesn't mean that you prefer \emph{any}
cheaper flat to \emph{any} more expensive flat. You probably don't prefer a 5
$\m^2$ flat for £499 to a 60 $\m^2$ flat for £500. The other attributes also
matter. But the following should hold: whenever two flats agree in size and
location, and one is cheaper than the other, then you prefer the cheaper one.
Let's generalize this idea.
Consider an attribute list $\t{A_{1}, A_{2}, \ldots A_{n}}$, and let $A_{1}'$ be
an alternative to $A_{1}$. If, for example, the first position in an attribute
list represents monthly costs, then $A_{1}$ might be £400 and $A_{1}'$ £500. We
can now compare $\t{A_{1}, A_{2}, \ldots A_{n}}$ to
$\t{A_{1}', A_{2}, \ldots A_{n}}$ -- a hypothetical flat that's like the first
in terms of size and location, but costs £100 more. If
\[
\t{A_1,A_2,\ldots,A_n} \succ \t{A_1',A_2,\ldots,A_n},
\]
we say that you prefer $A_1$ to $A_1'$ \emph{conditional on}
$A_{2},\ldots,A_{n}$.
Suppose you prefer $A_{1}$ to $A_{1}'$ conditional on any way of filling in the
remainder $A_{2},\ldots,A_{n}$ of the attribute list. In that case, we can say
that your preference of $A_{1}$ over $A_{1}'$ is \emph{independent} of the other
attributes.
In the flat example, your preference of £400 over £500 is plausibly independent
of the other attributes: whenever two possible flats agree in size and location,
but one costs £400 and the other £500, you plausibly prefer the one for £400.
(We are still assuming that size, location, and costs are all you care about.)
We can similarly consider alternatives $A_{2}$ and $A_{2}'$ that may figure in
the second position of an attribute list, and alternatives $A_{3}$ and $A_{3}'$
in the third positions, and so on. If we find that your preferences between
$A_{i}$ and $A_{i}'$ are always independent of the other attributes, we say that
your preferences between attribute lists are \textbf{weakly separable}.
Weak separability means that your preference between two attribute lists that
differ only in one position does not depend on the attributes in the other
positions.
Consider the following preferences between four possible flats.
\begin{gather*}
\t{50 \m^2, \text{central}, \text{£500}} \succ \t{40 \m^2, \text{beach}, \text{£500}}\\
\t{40 \m^2, \text{beach}, \text{£400}} \succ \t{50 \m^2, \text{central}, \text{£400}}
\end{gather*}
Among flats that cost £500, you prefer central 50 $\m^2$ flats to 40 $\m^2$
flats at the beach. But among flats that cost £400, your preferences are
reversed: you prefer 40 $\m^2$ beach flats to 50 $\m^2$ central flats. In a
sense, your preferences for size and location depend on price. But we don't have
a violation of weak separability, simply because the relevant attribute lists
differ in more than one position.
That's why weak separability is called `weak'. To rule out the present kind of
dependence, we need to strengthen the concept of separability. Preferences are
called \textbf{strongly separable} if the ranking of lists that differ in
\emph{one or more positions} does not depend on the attributes in the remaining
positions, in which they do not differ. In the example, your ranking of
$\t{50 \m^2, \text{central}, -}$ and $\t{40 \m^2, \text{beach}, -}$ depends on
how the blank (`$-$') is filled in. Your preferences aren't strongly separable.
(Are they weakly separable? We can't say. I have only specified how you rank two
pairs of lists. Your preferences are presumably defined for many other
combinations of flat size, location, and costs. There's no violation of weak
separability in the two data points I have given. But there might be a violation
elsewhere.)
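If you'd like to experiment, here is a small Python sketch of how one might
scan a finite table of strict preferences for violations of weak separability.
The preference data are hypothetical. The check looks for two comparisons that
rank the same pair of items at a single position in opposite ways, depending
only on the attributes held fixed.
\begin{verbatim}
from itertools import combinations

# Hypothetical strict preferences over attribute lists (tuples):
# each pair (a, b) means that a is strictly preferred to b.
prefs = [
    ((40, "central", 500), (40, "central", 600)),  # cheaper preferred here...
    ((50, "beach", 600), (50, "beach", 500)),      # ...but reversed here
]

def differing_position(a, b):
    """Position where a and b differ, if they differ in exactly one."""
    diffs = [i for i in range(len(a)) if a[i] != b[i]]
    return diffs[0] if len(diffs) == 1 else None

def weak_separability_violations(prefs):
    """Pairs of preferences that rank the same two items at one
    position in opposite ways, depending on the other attributes."""
    out = []
    for (a, b), (c, d) in combinations(prefs, 2):
        i, j = differing_position(a, b), differing_position(c, d)
        if i is not None and i == j and a[i] == d[i] and b[i] == c[i]:
            out.append(((a, b), (c, d)))
    return out

print(weak_separability_violations(prefs))  # reports the reversal above
\end{verbatim}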
\begin{exercise}{2}
Suppose all you care about is the degree of pleasure of you and your three
friends, which we can represent by a list like $\t{10,1,2,3}$. Suppose further
that you prefer states in which you all experience equal pleasure to states in
which your degrees of pleasure are different. For example, you prefer
$\t{2,2,2,2}$ to $\t{2,2,2,8}$, and you prefer $\t{8,8,8,8}$ to $\t{8,8,8,2}$.
Are your preferences weakly separable? Are they strongly separable?
\end{exercise}
\cmnt{%
Exercise: Show that strong entails weak?%
} %
\begin{exercise}{2}
Which of the following preferences violate weak separability, based on the information provided? Which violate strong separability?
\medskip
{\small
\noindent\hspace{-2mm}\begin{tabular}{lll}
(a) & (b) & (c)\\
$\t{A_1,B_1,C_3} \!\succ\! \t{A_3,B_1,C_1}$ & $\t{A_1,B_3,C_1} \!\succ\! \t{A_1,B_3,C_2}$ & $\t{A_1,B_3,C_2} \!\succ\! \t{A_1,B_1,C_2}$ \\
$\t{A_3,B_2,C_1} \!\succ\! \t{A_1,B_2,C_3}$ & $\t{A_1,B_2,C_2} \!\succ\! \t{A_1,B_2,C_3}$ & $\t{A_2,B_3,C_2} \!\succ\! \t{A_2,B_1,C_2}$ \\
$\t{A_3,B_2,C_3} \!\succ\! \t{A_3,B_2,C_1}$ & $\t{A_3,B_2,C_3} \!\succ\! \t{A_3,B_1,C_3}$ & $\t{A_1,B_1,C_1} \!\succ\! \t{A_1,B_3,C_1}$
\end{tabular}
}
\cmnt{%
Answer:
In a., there's no counterex to w.s., but there is one to s.s.: hold fixed B in row 1 and 2.
In b., there's no counterex to either.
In c., there's a counterex to w.s.: hold fixed A,C in rows 1 and 3.
} %
\end{exercise}
In 1960, G\'erard Debreu proved that strong separability is exactly what is
needed to ensure additivity.
% I am following Bergstrom's 2018 "Lecture notes on separable preferences"
To state Debreu's result, let's say that an agent's preferences over attribute
lists have an \textbf{additive representation} if there is a function $\U$,
assigning numbers to the lists, together with subvalue functions
$V_1, V_2, \ldots, V_n$, assigning numbers to the items on the lists, such that
the following two conditions are satisfied. First, the preferences are
represented by $\U$. That is, for any two lists $A$ and $B$,
\begin{gather*}
A \pref B \text{ iff } \U(A) > \U(B), \text{ and }\\
A \sim B \text{ iff }\U(A) = \U(B).
\end{gather*}
Second, the $\U$-value assigned to any list $\t{A_1,A_2,\ldots,A_n}$ equals
the sum of the subvalues assigned to the items on the list:
\[
\U(\t{A_1,A_2,\ldots,A_n}) = V_1(A_1) + V_2(A_2) + \ldots + V_n(A_n).
\]
Now, in essence, Debreu's theorem states that if preferences over attribute
lists are complete and transitive, then they have an additive representation if
and only if they are strongly separable.
A further technical condition is needed if the number of attribute combinations
is uncountably infinite; we'll ignore that. Curiously, the result also requires
that there are at least three attributes that matter to the agent. For two
attributes, a stronger condition called `double-cancellation' is required.
Double-cancellation says that if $\t{A_1,B_1} \succsim \t{A_2,B_2}$ and
$\t{A_2,B_3} \succsim \t{A_3,B_1}$ then $\t{A_1,B_3} \succsim \t{A_3,B_2}$. But
let's just focus on cases with at least three relevant attributes.
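As a numerical sanity check, the following Python sketch confirms that an
additive utility function over two attributes satisfies double-cancellation.
The subvalue numbers are arbitrary assumptions; the assertion can never fire,
because adding the two premise inequalities and cancelling the subvalues of
$A_2$ and $B_1$ from both sides yields the conclusion.
\begin{verbatim}
from itertools import product

# Arbitrary subvalue numbers for two attributes (assumptions).
VA = {"A1": 0.0, "A2": 1.3, "A3": 2.1}
VB = {"B1": 0.5, "B2": 0.9, "B3": 3.0}

def U(a, b):
    return VA[a] + VB[b]  # additive utility over two attributes

# Double cancellation: if <A1,B1> >= <A2,B2> and <A2,B3> >= <A3,B1>,
# then <A1,B3> >= <A3,B2>.
for a1, a2, a3 in product(VA, repeat=3):
    for b1, b2, b3 in product(VB, repeat=3):
        if U(a1, b1) >= U(a2, b2) and U(a2, b3) >= U(a3, b1):
            assert U(a1, b3) >= U(a3, b2)
print("double cancellation holds for this additive U")
\end{verbatim}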
Debreu's theorem has an interesting corollary. Suppose a utility function $\U$
has an additive representation in terms of certain attributes. One can show that
if the attributes are sufficiently fine-grained, and small differences to the
attributes make for small differences in overall utility, then every utility
function $\U'$ that has an additive representation in terms of the relevant
attributes differs from $\U$ at most in the choice of unit and zero.
% See Theorem 4 in Bergstom's 2018 lecture notes.
This suggests a new response to the ordinalist challenge. The ordinalists
claimed that utility assignments are arbitrary as long as they respect the
agent's preference order. In response, one might argue that rational (intrinsic)
preferences should be strongly separable and that an adequate representation of
such preferences should involve an additive utility function. The only arbitrary
aspect of a utility representation would then be the choice of unit and zero.
\begin{exercise}{2}
Show that whenever $\U$ additively represents an agent's preferences, then so
does any function $\U'$ that differs from $\U$ only by the choice of zero and
unit. That is, assume that $\U$ additively represents an agent's preferences,
so that for some subvalue functions $V_1,V_2,\ldots,V_n$,
\[
\U(\t{A_1,A_2,\ldots,A_n}) = V_1(A_1) + V_2(A_2) + \ldots + V_n(A_n).
\]
Assume $\U'$ differs from $\U$ only by a different choice of unit and zero,
which means that there are numbers $x>0$ and $y$ such that
$\U'(\t{A_1,A_2,\ldots,A_n}) = x\cdot \U(\t{A_1,A_2,\ldots,A_n}) + y$. From
these assumptions, show that there are subvalue functions
$V_1',V_2',\ldots,V_n'$ such that
\[
\U'(\t{A_1,A_2,\ldots,A_n}) = V_1'(A_1) + V_2'(A_2) + \ldots +
V_n'(A_n).
\]
\end{exercise}
\cmnt{%
This proves that whenever $V$ additively represents an agent's preferences,
then so does any function $V'$ that differs only by the choice of zero and
unit. The converse can also be shown, but it's a little harder: if there are
\emph{no} numbers $x>0$ and $y$ for which
$V'(\t{A_1,A_2,\ldots,A_n}) = x\cdot V(\t{A_1,A_2,\ldots,A_n}) + y$, then $V'$
does \emph{not} additively represent the agent's preferences.
To see that additive representation is not preserved under arbitrary positive
transformations of $U$, assume that
$U(X_1,X_2,X_3) = \log(X_1) + \log(X_2) + \log(X_3)$. If we transform $U$ by
the exponential function, then
$U'(X_1,X_2,X_3) = e^{U(X_1,X_2,X_3)} = e^{\log(X_1) + \log(X_2) + \log(X_3)} = e^{\log(X_1)} e^{\log(X_2)} e^{\log(X_3)} = X_1 X_2 X_3$.
The product of three numbers cannot be represented as a sum of some
function of the three numbers. By contrast, if we take any positive linear
transform of $U$, then additivity is preserved:
\begin{align*}
a U(X_1,X_2,X_3) + b &= a(u_1(X_1) + u_2(X_2) + u_3(X_3)) + b\\
&= (a\,u_1(X_1) + b) + a\,u_2(X_2) + a\,u_3(X_3).
\end{align*}
Indeed, the only transformations that preserve additive representation are
increasing linear transformations. Hence additive separability implies
cardinality.
} %
\begin{exercise}{3}
Assume all you care about are your wealth and your height. On one way of
representing your preferences, the utility you assign to any combination of
wealth $w$ (in GBP) and height $h$ (in meters) is $\U(\t{w,h}) = w \cdot h$.
Do your preferences have an additive representation? Explain your answer.
% Any monotonic transformation of U also represents your preferences. Take the
% log-transform. Could also find the answer by looking at double-cancellation,
% but that seems hard.
\end{exercise}
Why might one think that rational preferences should be separable? Remember that
we are talking about preferences over ``attribute lists'' that settle everything
the agent ultimately cares about, with each position in a list settling one
question that intrinsically matters to the agent. In our toy example, these were
the size, location, and costs of their flat. More realistically, items in the
attribute list might be the agent's level of happiness, their social standing,
the well-being of their relatives, etc. Now, if an agent has a basic desire for,
say, happiness, then we would expect that increasing the level of happiness,
while holding fixed everything else the agent cares about, is always a change
for the better. That is, if two worlds $w_{1}$ and $w_{2}$ agree in all respects
that matter to the agent except that the agent is happier in $w_{1}$ than in
$w_{2}$, then we would expect the agent to prefer $w_{1}$ over $w_{2}$. From
this perspective, separability might be understood as a condition on how to
identify basic desires: if an agent's preferences over some attribute lists are
not separable, then the attributes don't represent (all) the agent's basic
(intrinsic) desires.
\cmnt{%
From Decision Analysis, pp.35f.:
Call two outcomes (attribute lists) \textbf{aspect equivalent} if the agent is
indifferent between each of their aspects (items); that is, if
$a(w) \sim a(w')$ for all aspects $a$. Now suppose an agent is indifferent
between any two outcomes that are aspect equivalent. In that case there are
scaling coefficients with which the addition rule correctly represents her
preferences over the outcomes.
How do we construct the agent's value function?
Let's pretend there are two aspects $a$ and $b$, both of which have a minimum
and a maximum value (in terms of preference). Let $V_a$ and $V_b$ range
between 0 and 1 accordingly. Now compare three outcomes that agree in terms of
$b$, but differ in $a$: in $o_0$, $a$ takes its minimum value, in $o_1$ it
takes its maximum value, in $o$ it takes some intermediate value. For some
value $x$, you should be indifferent between $o$ and a lottery that yields
$o_1$ with probability $x$ else $o_0$. So
\[
U(o) = x U(o_1) + (1-x)U(o_0).
\]
If you are indifferent between aspect equivalent outcomes,
\begin{gather*}
U(o_0) = s_a V_a(a(o_0)) + s_b V_b(b(o_0)) = 0 + s_b V_b(b(o_0)).\\
U(o_1) = s_a V_a(a(o_1)) + s_b V_b(b(o_1)) = s_a + s_b V_b(b(o_1)).\\
U(o) = s_a V_a(a(o)) + s_b V_b(b(o)).
\end{gather*}
Substituting these in the previous equation, we get
\[
s_a V_a(a(o)) + s_b V_b(b(o)) = x(s_a + s_b V_b(b(o_1))) + (1-x) s_b V_b(b(o_0))
\]
Since $o_0, o_1, o$ agree in aspect $b$, this simplifies:
\begin{align*}
s_a V_a(a(o)) + k &= x(s_a + k) + (1-x) k\\
s_a V_a(a(o)) + k &= xs_a + xk + k-xk\\
s_a V_a(a(o)) + k &= xs_a + k\\
s_a V_a(a(o)) &= xs_a \\
V_a(a(o)) &= x.
\end{align*}
So we can determine the individual value functions. How do we determine the
scaling factors? Define $o_{00}$ to be the outcome in which $a$ and $b$ both
take minimum value; similarly for $o_{11}$ and maximum value. Let $o_{01}$
have minimum value for $a$ and maximum for $b$, conversely for $o_{10}$. For
some value $x_a$, you should be indifferent between $o_{10}$ and a lottery
$x_a o_{11} + (1-x_a) o_{00}$. So
\[
U(o_{10}) = x_a U(o_{11}) + (1-x_a)U(o_{00}).
\]
If you are indifferent between aspect equivalent outcomes,
\begin{gather*}
U(o_{00}) = 0\\
U(o_{11}) = 1 \text{ provided $s_a +s_b = 1$}\\
U(o_{10}) = s_a.
\end{gather*}
Plugging these into the previous equation, we get
\[
s_a = x_a.
\]
Suppose for each $x_1,y_1$ and $z$, if
$\t{x_1,x_2,\ldots,x_n} \succsim z \succsim \t{y_1,x_2,\ldots,x_n}$ then there
is some $t_1$ such that $z \sim \t{t_1, x_2,\ldots, x_n}$. Then $\succsim$ is
said to have \textbf{restricted solvability} w.r.t. the first attribute.
(Similarly for the other attributes.) Restricted solvability is not quite
enough for additive representation. We need to add non-triviality of each
position: that for each $i$ there are $x_i, y_i$ such that
$\t{x_1,\ldots,x_i\ldots,x_n} \succ \t{x_1,\ldots,y_i\ldots,x_n}$. For the
infinite case, we also need an Archimedean condition blocking lexical
orderings. We also need to assume weak separability.
} %
% Could do the Kant example: happiness is good for virtuous people and bad for
% non-virtuous people. Is this a counterexample to additivity?
\section{Separability across time}\label{sec:separability-time}
According to psychological hedonism, the only thing people ultimately care about
is their personal pleasure. But pleasure isn't constant. The hedonist conjecture
leaves open how people rank different ways pleasure can be distributed over a
lifetime. Unless an agent just cares about their pleasure at a single point in
time, a basic desire for pleasure is really a concern for a lot of things:
pleasure now, pleasure tomorrow, pleasure the day after, and so on. We can think
of these as the ``attributes'' in the agent's intrinsic utility function. The
hedonist's intrinsic utility function somehow aggregates the value of pleasure
experienced at different times.
To keep things simple, let's pretend that pleasure does not vary within any
given day. We might then model a hedonist utility function as a function that
assigns numbers to lists like $\t{1,10,-1,2,\ldots}$, where the elements in the
list specify the agent's degree of pleasure today (1), tomorrow (10), the day
after (-1), and so on. Such attribute lists, in which successive positions
correspond to successive points in time, are called \textbf{time streams}.
A hedonist agent would plausibly prefer more pleasure to less at any point in
time, no matter how much pleasure there is before or afterwards. If so, their
preferences between time streams are weakly separable. Strong separability is
also plausible: whether the agent prefers a certain amount of pleasure on some
days to a different amount of pleasure on these days should not depend on how
much pleasure the agent has on other days. It follows by Debreu's theorem that
the utility the agent assigns to a time stream can be determined as the sum of the
subvalues she assigns to the individual parts of the stream. That is, if $p_1$,
$p_2$, \ldots, $p_n$ are the agent's degrees of pleasure on days
$1, 2, \ldots, n$ respectively, then there are subvalue functions
$V_1,V_2,\ldots,V_n$ such that
\[
\U(\t{p_1,p_2,\ldots,p_n}) = V_1(p_1) + V_2(p_2) + \ldots + V_n(p_n).
\]
We can say more if we make one further assumption. Suppose an agent prefers
stream $\t{p_1,p_2,\ldots,p_n}$ to an alternative $\t{p_1',p_2',\ldots,p_n'}$.
Now consider the same streams with all entries pushed one day into the future,
and prefixed with the same degree of pleasure $p_0$. So the first stream turns
into $\t{p_0, p_1,p_2,\ldots,p_n}$ and the second into
$\t{p_0, p_1',p_2',\ldots,p_n'}$. Will the agent prefer the modified first
stream to the modified second stream, given that she preferred the original
first stream? If the answer is yes, then her preferences are called
\textbf{stationary}. From a hedonist perspective, stationarity seems plausible:
if there's more aggregated pleasure in $\t{p_1,p_2,\ldots,p_n}$ than in
$\t{p_1',p_2',\ldots,p_n'}$, then there is also more pleasure in
$\t{p_0,p_1,p_2,\ldots,p_n}$ than in $\t{p_0,p_1',p_2',\ldots,p_n'}$.
It is not hard to show that if preferences over time streams are separable and
stationary (as well as transitive and complete), then they can be represented by
a function of the form
\[
\U(\t{A_1,\ldots,A_n}) = V_1(A_1) + \delta \cdot V_1(A_2) +
\delta^2 \cdot V_1(A_3) + \ldots + \delta^{n-1} \cdot V_1(A_n),
\]
where $\delta$ is a fixed positive number. The interesting thing here is
that the subvalue function for time $i$ equals the subvalue function $V_1$ for
the first time, scaled by an exponential \textbf{discounting factor} $\delta^{i-1}$.
\cmnt{%
Here's the argument, roughly. By separability,
$U(p_1,\ldots,p_n) = u_1(p_1) + \ldots$. By stationarity,
$u_1(p_1) + \ldots \geq u_1(p_1') + \ldots$ iff
$u_2(p_1) + \ldots \geq u_2(p_1') + \ldots$. By cardinal uniqueness there
exist $\delta > 0$ and $b_i$ such that $u_{i+1} = \delta u_i + b_i$, which by
cardinal uniqueness again means we can find another representation with
$u_{i+1} = \delta u_i$.
} %
If a hedonist has strongly separable and stationary preferences, then her
preferences over time streams are fixed by two things: how much she values
present pleasure, and how much she discounts the future. If $\delta = 1$, the
agent values pleasure equally, no matter when it occurs. If
$\delta = \nicefrac{1}{2}$, then one unit of pleasure tomorrow is worth half as
much to the agent as one unit today; the day after tomorrow it is worth a
quarter; and so on.
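Here is the discounted sum as a short Python sketch; the identity subvalue
function $V_1(p) = p$ is an assumption made purely for illustration.
\begin{verbatim}
def discounted_utility(stream, delta, v1=lambda p: p):
    """U(<p1,...,pn>) = v1(p1) + delta*v1(p2) + ... + delta^(n-1)*v1(pn).
    v1 is the subvalue function for the present moment; the identity
    default is an illustrative assumption."""
    return sum(delta**i * v1(p) for i, p in enumerate(stream))

print(discounted_utility((10, 0, 0), delta=0.5))  # 10.0
print(discounted_utility((0, 11, 0), delta=0.5))  # 5.5: 10 now beats 11 tomorrow
print(discounted_utility((0, 11, 0), delta=1.0))  # 11.0: undiscounted, 11 wins
\end{verbatim}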
\begin{exercise}{1}
Consider the following streams of pleasure:
\begin{enumerate}
\itemsep-0.3em
\item[S1:] $\t{1,2,3,4,5,6,7,8,9}$
\item[S2:] $\t{9,8,7,6,5,4,3,2,1}$
\item[S3:] $\t{1,9,2,8,3,7,4,6,5}$
\item[S4:] $\t{9,1,8,2,7,3,6,4,5}$
\item[S5:] $\t{5,5,5,5,5,5,5,5,5}$
\end{enumerate}
Assuming present pleasure is valued in proportion to its degree, so
that $V_1(p) = p$ for all degrees of pleasure $p$, how would a
hedonist agent with separable and stationary preferences rank these
streams, provided that (a) $\delta = 1$, (b)
$\delta < 1$, (c) $\delta > 1$? (You need to give three answers.)
\end{exercise}
Even if you're not a hedonist, you probably care about some things that can
occur (and re-occur) at different times: talking to friends, going to concerts,
having a glass of wine, etc. The formal results still apply. If your preferences
over the relevant time streams are separable and stationary, then they are fixed
by your subvalue function for the relevant events (talking to friends, etc.)
right now and by a discounting parameter $\delta$.
Some have argued that stationarity and separability across time are
requirements of rationality. Some have even suggested that the only rationally
defensible discounting factor is 1, on the ground that we should be impartial
with respect to different parts of our life.
One argument in favour of stationarity is that it is often thought to be
required to protect the agent from a kind of disagreement with her future self.
To illustrate, suppose you prefer $\t{10, 0, 0, 0, \ldots}$ to
$\t{0, 11, 0, 0, \ldots}$ because you care more about today's pleasure than
about tomorrow's. You care less about the difference between getting pleasure in
four days and getting it in five days, so you prefer
$\t{0, 0, 0, 0, 0, 11, 0, \ldots}$ (11 units in five days) to
$\t{0, 0, 0, 0, 10, 0, 0, \ldots}$ (10 units in four days). These
preferences violate stationarity. Stationarity would imply that if you prefer
$\t{10, 0, 0, 0, \ldots}$ to $\t{0, 11, 0, 0, \ldots}$ then you also prefer the
first stream to the second if both are prefixed with 0, and therefore also if
both are prefixed with two 0s, three 0s, or four 0s. Now suppose your
(non-stationary) preferences remain the same for the next 4 days. At the end of
this time, you'd still rather have 10 units of pleasure today than 11 tomorrow:
you still prefer $\t{10, 0, 0, 0, \ldots}$ to $\t{0, 11, 0, 0, \ldots}$. But
your ``today'' is what used to be ``in 4 days''. Your new preferences disagree
with those of your earlier self, in the sense that the worlds your former self
regarded as better you now regard as worse. This kind of disagreement is called
\textbf{time inconsistency}.
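The reversal can be made vivid with a short computation. The Python sketch
below compares exponential discounting, which is stationary, with a hyperbolic
discount function $d(t) = 1/(1+kt)$ of the kind often used to model such
preferences; the particular parameter values are illustrative assumptions.
\begin{verbatim}
# Stationary (exponential) vs non-stationary (hyperbolic) discounting.
# Parameter values are illustrative assumptions.

def prefers_10_sooner(discount, t):
    """Is 10 units of pleasure at time t worth more than 11 at t+1?"""
    return 10 * discount(t) > 11 * discount(t + 1)

exponential = lambda t: 0.9 ** t            # delta = 0.9
hyperbolic  = lambda t: 1 / (1 + 0.15 * t)  # d(t) = 1/(1 + k*t), k = 0.15

for name, disc in [("exponential", exponential), ("hyperbolic", hyperbolic)]:
    print(name, prefers_10_sooner(disc, 0), prefers_10_sooner(disc, 4))
# exponential: True True  -- no reversal: stationary
# hyperbolic:  True False -- prefers 11 in five days, yet 10 today: reversal
\end{verbatim}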
% xxxx I should explain that time inconsistency is about preference over lists.
% It's not enough to look at the present entry in the list.
% I say that stationarity "appears to be required" for time inconsistency
% because I think this is only true under certain (problematic) modeling
% assumptions: see https://www.umsu.de/blog/2017/671
Empirical studies suggest that time inconsistency is pervasive. People often
prefer their future selves to study, eat well, and exercise, but choose burgers
and TV for today.
These preferences do look problematic. Other apparent violations of
stationarity, and even separability across time, however, look OK. Suppose you
like to have a glass of wine every now and then. But only now and then; you
don't want to have wine every day. It seems to follow that your preferences
violate both separability and stationarity. You violate stationarity because
even though you might prefer a stream $\t{\text{wine}, \text{no wine}, \text{no
wine}, \ldots}$ to $\t{\text{no wine}, \text{no wine}, \text{no
wine}, \ldots}$, your preference reverses if both streams are prefixed with
wine (or many instances of wine). You violate separability because whether you
regard having wine in $n$ days as desirable depends on whether you will have
wine right before or after these days.
Even if an agent only cares about pleasure, it is not obvious why a rational
agent might not (say) prefer relatively constant levels of pleasure over wildly
fluctuating levels, or the other way round.
One might argue, however, that in these cases the items in the time streams do
not represent your basic desires, or not all of them. If, for example, you have a
preference for constant levels of pleasure, then your basic desires don't just
pertain to how much pleasure you have today, how much pleasure you have
tomorrow, and so on. You have a further basic desire: that your pleasure be
constant from day to day.
\begin{exercise}{2}
Are your preferences in the wine example time-inconsistent, in the sense that
what you prefer for your future self is not what your future self prefers for
itself?
\end{exercise}
\begin{exercise}{2}
If you care about whether you have wine on consecutive days, then arguably an
adequate time stream for your concerns shouldn't simply specify, for each day,
whether you do or do not have wine, but also whether you are \emph{having wine
after having had wine the previous day}. An adequate representation of a
week in which you have wine on days 2, 4, and 5 would therefore be
$\t{\bar{W} \bar{P}, W \bar{P}, \bar{W} P, W \bar{P}, W P, \bar{W} P, \bar{W} \bar{P}}$,
where $W$ means that you have wine, $\bar{W}$ that you don't have wine, $P$
that you had wine the previous day, and $\bar{P}$ that you didn't have wine
the previous day. Do your preferences over such streams satisfy separability
and stationarity?
\end{exercise}
Let's briefly return to the problematic kind of time-inconsistency, manifested
by the common desire for vice today and virtue tomorrow. What could explain this phenomenon?
Part of the explanation might be that our preferences have different sources (as
I emphasized in chapter \ref{ch:utility}). When we reflect on having fries or
salad now, we are more influenced by spontaneous cravings than when we consider
the same options for tomorrow.
We could represent different sources of value by different subvalue functions.
We might, for example, have a subvalue function $V_{c}$ that measures the extent
to which a proposition satisfies your present cravings, and another subvalue
function $V_{m}$ that measures to what extent it matches your moral convictions.
Your intrinsic utility function is some kind of aggregate of these components.
Here, too, separability is plausible. If, for example, you think that one world
is morally better than another, and the two worlds are equally good with respect
to all your other motives (your cravings are equally satisfied in either, etc.),
then you plausibly prefer the first world to the second. This suggests that
different sources of intrinsic utility combine in an additive manner.
\cmnt{%
An interesting observation in behavioural economics concerns the difference
between single and repeated offers of gambles. Many people would reject a
gamble $[0.5? \$200 : -\$100]$; but hardly anyone would reject a long sequence
of such gambles. It is clear why: if losses hurt more than gains are
pleasurable, then a single gamble looks much less attractive than a sequence
of gambles. With a hundred instances of the above gamble, the chance of a net
loss is down from $1/2$ to $1/2300$. This is interesting because it shows that
we mustn't assume that if an agent prefers $A$ over $B$ in a single choice,
she also prefers $A$ over $B$ when the choice is repeated, or known to be part
of a sequence. As Kahneman \citey[338f.]{kahneman11thinking} points out, it
also suggests that we are often irrational when we evaluate choices by
themselves, not regarding them in a wider context. After all, every choice is
in a sense part of a long sequence. If you reject gambles with positive
expected payoff out of loss aversion, you do worse in the long run.
} %
\cmnt{%
Dual-ranking?
Harder if morality imposes fixed side constraints.
Compare Sen on commitment vs preference?
Compare also http://www.tnr.com/article/books-and-arts/true-lies on the book
``Private Truths, Public Lies: The Social Consequences of Preference
Falsification'' 1995, By Timur Kuran.
Kuran's real interest is not in moral evaluation, but in explaining individual
and collective choices. To this end, he offers a simple economic framework
based on three factors, which he describes (somewhat awkwardly) as intrinsic
utility, reputational utility and expressive utility. A person's purely
private preference is based on the intrinsic utility, to him, of the options
under consideration. Some people really want to get rid of affirmative action
or welfare programs, because they think that these are bad things, but their
private preference may not be expressed publicly, because of the loss of
reputational utility that would come from expressing it. The importance of
reputational utility in a particular case depends on the extent of the risk to
your reputation, and also on how much you care about your reputation. And
people get what Kuran calls expressive utility from bringing their public
statements into alignment with their private judgments. We all know people who
hate to bow before social pressures; such people are willing to risk their
reputation because what they especially hate is to speak or act in a way that
does not reflect their true beliefs.
} %
% \section{Separability across states}
% An agent faces a choice between some acts. According to the MEU Principle, the
% agent should evaluate each option $A$ by its expected utility
% \[
% \EU(A) = \U(O_1)\cdot \Cr(S_1) + \U(O_2)\cdot \Cr(S_2) + \ldots + \U(O_n)\cdot\Cr(S_n),
% \]
% where $S_1,S_2,\ldots,S_n$ are the relevant states and $O_1,O_2,\ldots,O_n$ are
% the outcomes of act $A$ in those states. Holding fixed the states, we can
% represent each available act by the list of its outcomes:
% $\t{O_{1}, O_{2},\ldots,O_{n}}$. In the mushroom problem from chapter
% \ref{ch:overview}, for example, eating the mushroom can be represented by the
% list $\t{\text{satisfied}, \text{dead}}$, and not eating by
% $\t{\text{hungry}, \text{hungry}}$, with the understanding that the first item
% in the list comes about if the mushroom is a paddy straw and the second if it is
% a death cap.
% Suppose the agent ranks the available acts by their expected utility. Her
% preference over the relevant outcome lists then have an additive representation: they
% are represented by a function $\U$ that assigns numbers to lists in such a way
% that the number assigned to any list is determined by adding up subvalues
% assigned to individual items on the list. This function $\U$ is the $\EU$
% function; the subvalues are the credence-weighted utilities of the outcomes. The
% subvalue of outcome $O_1$, for example, is $\U(O_1)\cdot\Cr(S_1)$.
% By Debreu's theorem, rational preferences have an additive representation if and
% only if they are strongly separable. The MEU Principle therefore implies that an
% agent's preferences between the acts in a decision problem are (strongly)
% separable \emph{across states}, meaning that the desirability of an outcome in
% one state does not depend on the outcomes in other states.
% Admittedly, this is a very roundabout path to a fairly obvious result. I mention
% it for two reasons. First, it shows that the response to the ordinalist
% challenge from the previous section is closely related to the response that we
% met in chapter \ref{ch:preference}. Von Neumann and Ramsey, in effect, assume
% that rational preferences are separable across states, and that the right way to
% measure separable preferences construes the net utility of an act as the sum of
% certain values assigned to the individual outcomes.
% Second, a general consequence of separability is that the relevant preferences
% are insensitive to ``shapes'' in the distribution of subvalues. For example,
% separable preferences cannot prefer even distributions to uneven distributions.
% This may seem to point at a problem with the MEU Principle. Consider the following schematic decision problem:
% %
% \begin{center}
% \begin{tabular}{|r|c|c|}\hline
% \gr & \gr State 1 (\nicefrac{1}{2}) & \gr State 2 (\nicefrac{1}{2}) \\\hline
% \gr $A$ & Outcome 1 (+10) & Outcome 1 (+10) \\\hline
% \gr $B$ & Outcome 2 (-10) & Outcome 3 (+30) \\\hline
% \end{tabular}
% \end{center}
% %
% Option $A$ leads to a guaranteed outcome with utility 10, while option $B$ leads
% either to a much better outcome or to a much worse one. The expected utilities
% are the same, but one might think an agent might rationally prefer $A$ because
% the utility distribution $\t{10,10}$ is more even than $\t{\text{-10},30}$.
% Intuitively, $A$ is safe, while $B$ is risky. We will return to this issue in
% the next chapter.
% \begin{exercise}{2}
% Where in their axioms do von Neumann and Morgenstern assume a kind
% of separability across states?
% \end{exercise}
\section{Harsanyi's ``proof of utilitarianism''}\label{sec:utilitarianism}
The ordinalist movement posed a challenge not only to the MEU Principle, but
also to utilitarianism in ethics. Utilitarianism is a combination of two claims.
The first says that an act is right iff it brings about the best available state
of the world. The second says that the ``goodness'' of a state is the sum of the
utilities of all people. Without a numerical (and not just ordinal) measure of
personal utility, this second claim makes no sense. We would need a new
criterion for ranking states of the world.
One such criterion was proposed by Pareto. Recall that Pareto did not deny that
people have preferences. If we want to know which of two states is better, we
can still ask which of them people prefer. This allows us to define at least a
partial order on the possible states:
%
\begin{genericthm}{The Pareto Condition}
If everyone is indifferent between $A$ and $B$, then $A$ and $B$ are equally
good; if at least one person prefers $A$ to $B$ and no one prefers $B$ to $A$,
then $A$ is better than $B$.
\end{genericthm}
%
Unlike classical utilitarianism, however, the Pareto Condition offers little
moral guidance. For instance, while classical utilitarianism suggests that one
should harvest the organs of an innocent person in order to save ten others, the
Pareto Condition does not settle whether it would be better or worse to harvest
the organs, given that the person to be sacrificed ranks the options differently
than those who would be saved.
\cmnt{%
When Ramsey, Savage, and von Neumann and Morgenstern showed how a meaningful
numerical quantity of utility could be derived from an agent's preferences,
they helped to rescue the MEU Principle, but they did not help classical
utilitarianism. For remember that the utility functions derived from personal
preferences have arbitrary zero and unit. Thus if according to one adequate
representation of our preferences, my utility for a given state is 10 and
yours is 0, then on another equally adequate representation, my utility for
the state will be -100 and yours 20.
} %
\begin{exercise}[The Condorcet Paradox]{1}
A ``democratic'' strengthening of the Pareto condition might say that whenever
\emph{a majority} of people prefer $A$ to $B$, then $A$ is better than $B$.
But consider the following scenario. There are three relevant states: $A,B,C$,
and three people. Person 1 prefers $A$ to $B$ to $C$. Person 2 prefers $B$ to
$C$ to $A$. Person 3 prefers $C$ to $A$ to $B$. If betterness is decided by
majority vote, which of $A$ and $B$ is better? How about $A$ and $C$, and $B$
and $C$?
\end{exercise}
In 1955, John Harsanyi proved a remarkable theorem that seemed to rescue, and
indeed vindicate, classical utilitarianism.
As a first step, Harsanyi adopts von Neumann's response to the ordinalist
challenge. He assumes that each individual has preferences not only among the
relevant states, but also among lotteries involving the states, and that their
preferences conform to the von Neumann and Morgenstern axioms. We can then
represent their preferences by personal utility functions $\U_{1},\ldots,\U_{n}$
(one for each individual) that are unique up to the choice of unit and zero.
Our goal is to derive a ``social preference'' relation between states that
settles whether a state is overall better than another. Harsanyi assumes that
this social preference relation can be extended to lotteries in a way that
conforms to the von Neumann and Morgenstern axioms. It follows that social
preference is also represented by a (``social'') utility function $\U_{s}$
that is unique up to the choice of unit and zero.
Harsanyi now showed that if we add the Pareto condition (for both states and
lotteries), then the individual and social preferences are represented by
utility functions $\U_1,\ldots,\U_n$ and $\U_s$ in such a way that social
utility is simply the sum of the individual utilities: for any state $A$,
\[
\U_s(A) = \U_1(A) + \ldots + \U_n(A).
\]
Once we have allowed lotteries into the picture, the Pareto condition entails
full-blown utilitarianism! How is this possible?
The Pareto condition implies that the social utility of any state is determined
by the personal utility each individual assigns to the state. For suppose the
social utility of some state $A$ depends on an aspect of $A$ that doesn't affect
the personal utilities. Then there is an alternative $B$ to $A$ (that differs
from $A$ in this aspect) for which $\U_{s}(B) \not= \U_{s}(A)$ even though every
individual assigns the same utility to $A$ and $B$. This contradicts the Pareto
condition.
So the only ``attributes'' of a state that are relevant to its social utility
are its personal utility scores. We can represent a state by a list of numbers
$\t{u_{1}, \ldots, u_{n} }$, each of which specifies how desirable the state is
for a particular individual.
Most non-utilitarians would disagree on this point. They would hold that even if
everyone is indifferent between two states $A$ and $B$, $A$ might still be worse
than $B$, if it involves gratuitous human rights violations, animal suffering,
sin, or whatever.
The really surprising part of Harsanyi's theorem is that the social utility of a
state is simply the sum of its personal utility scores $u_{1} + \ldots + u_{n}$.
This tells us that social preference is separable across the personal utilities,
and that each personal utility (each attribute) simply contributes its value to
social utility. How does this come about? Couldn't an even distribution
$\t{10,10,10,10,\ldots }$ be better than an uneven distribution
$\t{0,20,0,20,\ldots}$? Relatedly, couldn't personal utility have ``declining
social value'', so that adding 1 unit of personal utility to an individual whose
utility is already at 1000 contributes less to social utility than adding 1 unit
to an individual who stands at 0?
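To see what is at stake, here is a small Python sketch contrasting the
utilitarian sum with a hypothetical rule on which personal utility has
declining social value. The square-root aggregation is an illustrative
assumption, not part of Harsanyi's framework.
\begin{verbatim}
from math import sqrt

even   = [10, 10, 10, 10]
uneven = [0, 20, 0, 20]

def utilitarian(us):   # social utility as the plain sum
    return sum(us)

def declining(us):     # hypothetical concave rule (assumption)
    return sum(sqrt(u) for u in us)

print(utilitarian(even), utilitarian(uneven))  # 40 40: the sum is indifferent
print(declining(even), declining(uneven))      # ~12.6 vs ~8.9: prefers even
\end{verbatim}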
These possibilities are ruled out by three assumptions that look harmless in
isolation, but have great power when combined.