
Game Theory

Explained
A Mathematical Introduction with Optimization
Game Theory
Explained
A Mathematical Introduction with Optimization

Christopher Griffin
Pennsylvania State University, USA

NEW JERSEY • LONDON • SINGAPORE • GENEVA • BEIJING • SHANGHAI • TAIPEI • CHENNAI


Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

Library of Congress Cataloging-in-Publication Data


Names: Griffin, Christopher, 1979- author.
Title: Game theory explained : a mathematical introduction with optimization /
Christopher Griffin, Pennsylvania State University, USA.
Description: New Jersey : World Scientific, [2025] | Includes bibliographical references and index.
Identifiers: LCCN 2024035441 | ISBN 9789811297212 (hardcover) |
ISBN 9789819812875 (paperback) | ISBN 9789811297229 (ebook for institutions) |
ISBN 9789811297236 (ebook for individuals)
Subjects: LCSH: Game theory. | Mathematical optimization.
Classification: LCC QA269 .G75 2025 | DDC 519.3--dc23/eng/20241226
LC record available at https://lccn.loc.gov/2024035441

British Library Cataloguing-in-Publication Data


A catalogue record for this book is available from the British Library.

Copyright © 2025 by World Scientific Publishing Co. Pte. Ltd.


All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means,
electronic or mechanical, including photocopying, recording or any information storage and retrieval
system now known or to be invented, without written permission from the publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance
Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy
is not required from the publisher.

For any available supplementary material, please visit


https://www.worldscientific.com/worldscibooks/10.1142/13962#t=suppl

Desk Editors: Nambirajan Karuppiah/Rosie Williamson

Typeset by Stallion Press


Email: [email protected]

Printed in Singapore
This book is dedicated to Beverly Leech (Kate Monday), Joe
Howard (George Frankly), and Toni DiBuono (Pat Tuesday) of
Mathnet and every person who made Square One TV possible
when I was a kid. Without you, this book would not exist.
Preface

Why did I write this book? This book started in 2010 as a series
of lecture notes called “Game Theory: Penn State Math 486 Lecture
Notes.” Game theory was the second course I ever taught, and I wrote
the notes during a cold State College winter. People always ask me,
“Why do you write such detailed lecture notes?” And I always tell
them, “Sheer terror!” As a young faculty member, one is always
worried about something, and I figured if I had a well-prepared set of
typed lecture notes, the chances of me making a bone-headed mistake
in class would be minimized. When I started the class, I had intended
to use Luce and Raiffa’s book [1], with elements of Morris’ book [2]
added for good measure. But the further I got into the class, the more
I realized I wanted to cover the material in a way that emphasized the
connection to other parts of mathematics, especially optimization.
Nisan, Roughgarden, Tardos, and Vazirani had published their book,
Algorithmic Game Theory [3], a few years earlier, and there was a
sudden emphasis on finding equilibria in games algorithmically. I was
aware of the work from the 1960s on using optimization methods
to find Nash equilibria; consequently, a rather unique course was
born that emphasized both classical results from game theory and
interesting results from optimization theory. The material was aimed
squarely at undergraduates and (in my opinion) provided a deep but
not impenetrable introduction to the mathematics of games that is
often missing at the undergraduate level.
When I taught at the United States Naval Academy for a few
years, I developed a course on “Computational Game Theory,” where I continued to polish and refine the lecture notes. Finally,
during the COVID-19 pandemic, a very nice editor named Rochelle
from World Scientific contacted me and asked if I’d be willing to turn
some of my notes into a book. I started with my notes on graph the-
ory, and the result was my first book, Applied Graph Theory. With
that project finished, I was finally ready to turn my notes on game
theory into a book, and the result is this text. Between you and me,
I’m glad I wrote this one second. The graph theory notes were more
mature and easier to convert; I had found my voice (for the most
part). Writing this book has been a bit like taking a trip down mem-
ory lane to a time when I was less confident in myself as a teacher
and had not yet found my voice as a writer. Frankly, parts of the
process were painful, but I’m reasonably pleased with the outcome.
I will give you a fair warning: There are parts of this book that are
dense. I don’t like the “fluffy” approach to game theory, and I feel
that such an approach hides too much of the beautiful intricacies of
the subject and its deeper connections to the mathematical world.
That being said, I’ve included plenty of examples and worked prob-
lems so that the density is broken up by practical applications of
game theory, along with some mathematical interludes that I hope
you enjoy.

How could you use this book? This book can be used entirely
for self-study or in a classroom. I’ve had e-mails from all around
the world saying that people have used the lecture note form for self-
study, so it is possible. The book is really designed for a one-semester
course and is geared toward undergraduates who have familiarity
with differential vector calculus and matrices. The book emphasizes
the proofs. However, many of these can be skipped on a first reading,
though you will lose something if you skip too many of them. Almost
all the proofs are derivations, though a few are by contradiction. So,
while a “proofs class” is not needed, it won’t hurt.
The book is organized into three parts and written in a “theorem-
proof-example” style. There are remarks throughout and chapter
notes that try to emphasize some history or other examples. Part 1
covers classical game theory, including games against the house
(casino games) and utility theory. Part 2 covers the relationship
between game theory and optimization. Part 3 covers cooperative
game theory. There is an appendix that introduces evolutionary
game theory by way of replicator dynamics. I used this material
just after the pandemic, when we were running a week ahead of
schedule in the course. In general, I favor a separate class on evolu-
tionary games rather than wedging it into a course on game theory.
Of course, part of my research focuses on evolutionary game theory,
so I would say that. Each part emphasizes not only the results but
also the proofs along with examples. A nice feature of the book is
the relationship between Nash bargaining theory and multi-criteria
optimization, which (as far as I know) is not covered in any other
book.
Sample curriculum paths are shown in the following figure.
[Figure: Sample curriculum paths. Appendices A and B contain prerequisite material; Chapters 1–10 cover classical game theory, game theory and optimization, and cooperative game theory; Appendix C on replicator dynamics is extra material.]

Classic route: The first time I taught this course, I went from
Chapter 1 to Chapter 10, with limited information coming from the
appendices. This is by far the easiest way to teach the course and
gives a thorough understanding of classical game theory and its rela-
tionship to optimization. The material on optimization and probabil-
ity is self-contained, so there is no need to use an external resource,
unless so desired. It has been my experience that students like the
material on games against the house in Chapter 1, which helps offset
the formal introduction to probability that it provides.
Skip utility theory: When I taught game theory in 2021, I decided
to skip Chapter 2. I like the material in it, but the students often
find it a little dry. It is perfectly fine to skip Chapter 2 and go from
Chapter 1 to Chapter 3. In this case, one could proceed at a slower
pace and really emphasize all the proofs in the book or include
Appendix C on evolutionary game theory, which is self-contained
enough that students don’t need to have seen differential equa-
tions (though it would help). In 2021, we covered the material in
Appendix C and did most of the proofs.
Skip Chapters 1, 2 and (parts of) 3: I’ve never done this
because the students like the games in Chapter 1, but it is possi-
ble to skip Chapters 1 and 2, assuming that the students are familiar
enough with probability theory. In that case, Chapter 3 could be con-
densed to just the material on game trees that do not include games
of chance. (Essentially, you’re skipping poker.) It’s also possible to
skip all of Chapter 3 and start immediately at Chapter 4 with normal
form games. I would only do this if I wanted to really emphasize the
proofs or review the material on calculus and matrices thoroughly.
You would probably have to include Appendix C to make a 15-week
course, though this could work in a 10-week term.
How do you handle optimization? The book is sensitive to
the fact that optimization can be a niche interest in some math
departments. Consequently, all the material on optimization is self-
contained, including the coverage of the Karush–Kuhn–Tucker con-
ditions. Penn State has no math course that covers these as a prereq-
uisite for the game theory class, and students had no problems with
them. Appendices A and B can also be used to help bring students
up to speed on elements of matrix arithmetic and calculus that they
may have forgotten or missed.
My favorite aspect of the book is the connection between games
and optimization, especially the connection between optimization
and cooperative games, which, I think, is wholly underemphasized.
The quadratic programming method for finding Nash equilibria in
general-sum bimatrix games is not the most “modern” way to solve
such problems, but it is both understandable to undergraduates and
can be performed on a computer algebra system. Moreover, it pro-
vides critical mathematical foundations for more advanced study in
game theory, using, for example, the book by González-Díaz, García-
Jurado, and Fiestras-Janeiro [4].
What isn’t in this book? Since this is geared toward a one-
semester class, there are several omissions. There is no mention
of dynamic or differential games, unless you consider multi-layered
game trees to be a form of dynamic game, which I do not. Likewise,
iterated games are not covered because they could form an entire text
on their own. Cooperative game theory of the kind studied deeply
in economics is only introduced in Chapter 10. It is not a feature
of the book. Evolutionary game theory is covered in Appendix C.
As this is one of my main areas of research, I would prefer a separate
treatment of the subject. However, the material on the replicator is
both exciting and produces nice pictures, so it is worth putting into
an appendix as “bonus material” to be used as needed. Formal treat-
ment of evolutionarily stable strategies (ESS) is explicitly omitted.
ESS are subtle, and in my experience, the students hate it. When
presented poorly, it is literally a way to turn students off from evo-
lutionary games.
Explicit optimization algorithms such as the simplex algorithm
are not presented. Instead, I show how to solve optimization problems
with a computer. For those readers interested in understanding the
methods used by the computer for solving the optimization problems
that arise in this book, there are ample references.
Acknowledgements

This book was created with LaTeX2e using Overleaf, TeXShop,
BibDesk, and the grammar checker LanguageTool. Figures were cre-
ated using Mathematica™ and OmniGraffle™, unless otherwise
noted. In acknowledging those people who helped make this work
possible, first let me say thank you to all the other scholars who
have worked in game theory. Without your pioneering work, there
would be nothing to write about. Also, I must thank individuals who
found typos in my original lecture notes and wrote to tell me about
them: James Fan, George Kesidis, Nicolas Aumar, Arlan Stutler, and
Sarthak Shah. I would also like to thank Volmir Eugênio Wilhelm,
who selflessly translated a version of my game theory lecture notes
into Spanish. If there is anyone else I have forgotten, I apologize, but
your contributions are appreciated. I would very much like to thank
Andrew Belmonte, at Penn State, for encouraging me to turn my
notes into a book, even though I ignored him for 14 years. Finally,
I owe a debt to Rochelle Kronzek Miller of World Scientific, who
gave me the final push to write this, and to my desk editor, Rosie
Williamson, and her editorial staff, who dealt with the mechanics of
getting this published. Thank you all.

About the Author

Christopher Griffin is a Research Professor at
the Applied Research Laboratory (ARL), where
he holds a courtesy appointment as Professor of
Mathematics at Penn State. He was a Eugene
Wigner Fellow in the Computational Science
and Engineering Division of the Oak Ridge
National Laboratory and has also taught in the
Mathematics Department at the United States
Naval Academy. His favorite part of teaching is
connecting complex mathematical concepts to
their applications. He finds most students are excited by math when
they know how much of our modern world depends on it.
When he is not teaching (which is most of the time), Dr. Griffin’s
research interests are in applied mathematics, where he focuses on
applied dynamical systems (especially on graphs), game theory, and
optimization. His research has been funded by the National Science
Foundation, the Office of Naval Research, the Army Research Office,
the Intelligence Advanced Research Projects Agency, and the Defense
Advanced Research Projects Agency. He has published over 100 peer-
reviewed research papers in various forms of applied mathematics.
His book, Applied Graph Theory: An Introduction with Graph Opti-
mization and Algebraic Graph Theory, was also published by World
Scientific in 2023.

Contents

Preface vii
Acknowledgements xiii
About the Author xv

Part 1: Classical Game Theory 1


1. Games Against the House with
an Introduction to Probability Theory 3
1.1 Probability . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Random Variables and Expected Values . . . . . . 8
1.3 Some Specialized Results on Probability . . . . . . 12
1.4 Conditional Probability . . . . . . . . . . . . . . . . 15
1.5 Independence . . . . . . . . . . . . . . . . . . . . . 18
1.6 Blackjack: A Game of Conditional Probability . . . 19
1.7 The Monty Hall Problem and Decision Trees . . . . 22
1.8 Bayes’ Theorem . . . . . . . . . . . . . . . . . . . . 25
1.9 Chapter Notes . . . . . . . . . . . . . . . . . . . . . 27
1.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . 29
2. Elementary Utility Theory 33
2.1 Decision-Making Under Certainty . . . . . . . . . . 33
2.2 Preference and the von Neumann–Morgenstern
Assumptions . . . . . . . . . . . . . . . . . . . . . . 35
2.3 Expected Utility Theorem . . . . . . . . . . . . . . 40
2.4 Chapter Notes . . . . . . . . . . . . . . . . . . . . . 46
2.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . 48

3. Game Trees and Extensive Form 51


3.1 Graphs and Trees . . . . . . . . . . . . . . . . . . . 51
3.2 Game Trees with Complete Information
and No Chance . . . . . . . . . . . . . . . . . . . . 56
3.3 Game Trees with Incomplete Information . . . . . . 61
3.4 Games of Chance . . . . . . . . . . . . . . . . . . . 65
3.5 Payoff Functions and Equilibria . . . . . . . . . . . 66
3.6 Chapter Notes . . . . . . . . . . . . . . . . . . . . . 81
3.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . 82
4. Games and Matrices: Normal and
Strategic Forms 87
4.1 Normal and Strategic Forms . . . . . . . . . . . . . 87
4.2 Strategic-Form Games . . . . . . . . . . . . . . . . 89
4.3 Strategy Vectors and Matrix Games . . . . . . . . . 93
4.4 Chapter Notes . . . . . . . . . . . . . . . . . . . . . 95
4.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . 96
5. Saddle Points, Mixed Strategies,
and Nash Equilibria 97
5.1 Equilibria in Zero-Sum Games: Saddle Points . . . 98
5.2 Zero-Sum Games without Saddle Points . . . . . . 103
5.3 Mixed Strategies . . . . . . . . . . . . . . . . . . . . 105
5.4 Dominated Strategies and Nash Equilibria . . . . . 111
5.5 The Indifference Theorem . . . . . . . . . . . . . . 117
5.6 The Minimax Theorem . . . . . . . . . . . . . . . . 120
5.7 Existence of Nash Equilibria . . . . . . . . . . . . . 125
5.8 Finding Nash Equilibria in Simple Games . . . . . 127
5.9 Nash Equilibria in General-Sum Games . . . . . . . 131
5.10 Chapter Notes . . . . . . . . . . . . . . . . . . . . . 134
5.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . 135

Part 2: Optimization and Game Theory 137


6. An Introduction to Optimization and the
Karush–Kuhn–Tucker Conditions 139
6.1 Motivating Example . . . . . . . . . . . . . . . . . 139
6.2 A General Maximization Formulation . . . . . . . . 141

6.3 Gradients, Constraints, and Optimization . . . . . 143


6.4 Convex Sets and Combinations . . . . . . . . . . . 145
6.5 Convex and Concave Functions . . . . . . . . . . . 147
6.6 Karush–Kuhn–Tucker Conditions . . . . . . . . . . 149
6.7 Relating Back to Game Theory . . . . . . . . . . . 154
6.8 Chapter Notes . . . . . . . . . . . . . . . . . . . . . 156
6.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . 157
7. Linear Programming and Zero-Sum Games 159
7.1 Linear Programs . . . . . . . . . . . . . . . . . . . . 160
7.2 Intuition on the Solution of Linear Programs . . . . 162
7.3 A Linear Program for Zero-Sum Game Players . . . 168
7.4 Solving Linear Programs Using a Computer . . . . 171
7.5 Standard Form, Slack and Surplus Variables . . . . 173
7.6 Optimality Conditions for Zero-Sum Games
and Duality . . . . . . . . . . . . . . . . . . . . . . 175
7.7 Chapter Notes . . . . . . . . . . . . . . . . . . . . . 183
7.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . 184
8. Quadratic Programs and General-Sum Games 187
8.1 Introduction to Quadratic Programming . . . . . . 187
8.2 Solving Quadratic Programming Problems
Using Computers . . . . . . . . . . . . . . . . . . . 188
8.3 General-Sum Games and Quadratic Programming . 189
8.4 Chapter Notes . . . . . . . . . . . . . . . . . . . . . 202
8.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . 204

Part 3: Cooperation in Game Theory 205


9. Nash’s Bargaining Problem and
Cooperative Games 207
9.1 Payoff Regions in Two-Player Games . . . . . . . . 208
9.2 Collaboration and Multi-Criteria Optimization . . . 213
9.3 Nash’s Bargaining Axioms . . . . . . . . . . . . . . 217
9.4 Nash’s Bargaining Theorem . . . . . . . . . . . . . 219
9.5 Chapter Notes . . . . . . . . . . . . . . . . . . . . . 227
9.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . 228

10. An Introduction to N-Player Cooperative Games 229
10.1 Motivating Cooperative Games . . . . . . . . . . . 229
10.2 Coalition Games . . . . . . . . . . . . . . . . . . . . 231
10.3 Division of Payoff to the Coalition . . . . . . . . . . 233
10.4 The Core . . . . . . . . . . . . . . . . . . . . . . . . 234
10.5 Shapley Values . . . . . . . . . . . . . . . . . . . . 237
10.6 Chapter Notes . . . . . . . . . . . . . . . . . . . . . 239
10.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . 240
Appendix A. Introduction to Matrix Arithmetic 243
A.1 Matrices, Row and Column Vectors . . . . . . . . . 243
A.2 Matrix Multiplication . . . . . . . . . . . . . . . . . 245
A.3 Special Matrices and Vectors . . . . . . . . . . . . . 246
A.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . 247
Appendix B. Essential Concepts from
Vector Calculus 249
B.1 Geometry for Vector Calculus . . . . . . . . . . . . 249
B.2 Gradients . . . . . . . . . . . . . . . . . . . . . . . 252
B.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . 254
Appendix C. Introduction to Evolutionary
Games Using the Replicator Equation 257
C.1 Differential Equations . . . . . . . . . . . . . . . . . 257
C.2 Fixed Points, Stability, and Phase Portraits . . . . 261
C.3 The Replicator Equation . . . . . . . . . . . . . . . 264
C.4 Appendix Notes . . . . . . . . . . . . . . . . . . . . 271
C.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . 272

References 273
Index 281
Part 1

Classical Game Theory


Chapter 1

Games Against the House with an
Introduction to Probability Theory

Chapter Goals: The goal of this chapter is to introduce prob-


ability theory and optimal decision-making in the context of
casino games. By “casino game,” we mean any game that has
an element of chance and features a player (you) against the
house (casino). Roulette, blackjack, craps, etc., all fall into this
category, as do many television game shows. Probability is intro-
duced in a formal setting, as are random variables, expectation,
and Bayes’ theorem. A discussion of the Monty Hall problem
concludes the chapter.

1.1 Probability

Remark 1.1. Our study of game theory begins with a characteri-


zation of optimal decision-making for an individual in the absence
of any other players. The games we often see on television fall into
this category. TV game shows (that do not pit players against each
other in knowledge tests) often require a single player (who is, in a
sense, playing against the house) to make a decision that will affect
only her life.
Remark 1.2. In this chapter, we frequently discuss the TV game
show Deal or No Deal. In this game, players are shown unmarked

boxes¹ containing various amounts of money. At any given time, a


banker offers the player an amount to quit the game. Players choose
boxes, thus eliminating them. The objective is to retain (by luck)
the box with the largest amounts of money.
Example 1.3. Congratulations! You have made it to the very final
stage of Deal or No Deal. Two suitcases with money remain in play:
One contains $0.01, while the other contains $1,000,000. The banker
has offered you a payoff of $499,999 to quit. Do you accept the
banker’s safe offer or do you risk it all to try for $1,000,000? Suppose
the banker offers you $100,000; what about $500,000 or $10,000?
Remark 1.4. Example 1.3 may seem contrived, but it has real-world
implications and most of the components needed for a serious discus-
sion of decision-making under risk. In order to study these concepts
formally, we need a grounding in probability theory. Unfortunately, a
formal study of probability requires a heavy dose of measure theory,
which is well beyond the scope of an introductory course on game
theory. Therefore, the following definitions are meant to be intuitive
rather than mathematically rigorous.
Definition 1.5 (Outcome). Let Ω be a finite set of elements
describing the outcome of a chance event (a coin toss, a roll of the
dice, etc.). We call Ω the sample space. Each element of Ω is called
an outcome.
Example 1.6. In the case of Example 1.3, the only thing we care
about is the position of $1,000,000 and $0.01 within the boxes. In this
case, Ω consists of two possible outcomes: either $1,000,000 is in box
number 1 (while $0.01 is in box number 2) or $1,000,000 is in box
number 2 (while $0.01 is in box number 1).
Formally, let us refer to the first outcome as A and the second
outcome as B. Then, Ω = {A, B}.
Definition 1.7 (Event). If Ω is a sample space, then an event is
any subset of Ω. We write this as E ⊆ Ω to indicate that event E is a
subset of Ω. If we are certain that E is a proper subset (i.e., E ≠ Ω),
we write E ⊂ Ω.

¹ Some versions of the show use suitcases, others use boxes.

Example 1.8. The sample space in Example 1.3 consists of precisely


four events: ∅ (the empty event), {A}, {B}, and {A, B} = Ω. These
four sets represent all possible subsets of the set Ω = {A, B}.
Definition 1.9 (Union). If E, F ⊆ Ω are both events, then E ∪ F is
the union of the sets E and F and consists of all outcomes in either
E or F . Event E ∪ F occurs if event E or event F occurs.
Example 1.10. Consider the roll of a fair six-sided die. The out-
comes are Ω = {1, . . . , 6}. If E = {1, 3} and F = {2, 4}, then
E ∪ F = {1, 2, 3, 4} and will occur as long as we don’t roll a 5 or 6.
Definition 1.11 (Intersection). If E, F ⊆ Ω are both events, then
E ∩ F is the intersection of the sets E and F and consists of all
outcomes in both E and F . Event E ∩ F occurs if both event E and
event F occur.
Example 1.12. Again, consider the roll of a fair six-sided die. The
outcomes are 1, . . . , 6. If E = {1, 2} and F = {2, 4}, then E ∩F = {2}
and will occur only if we roll a 2.
Definition 1.13 (Mutual Exclusivity). Two events E, F ⊆ Ω are
said to be mutually exclusive if and only if E ∩ F = ∅.
Definition 1.14 (Discrete Probability Distribution Func-
tion). Given a discrete sample space Ω, let F be the set of all events
on Ω. A discrete probability function is a mapping P : F → [0, 1]
with the following properties:
(1) P (Ω) = 1; and
(2) if E, F ∈ F and E ∩ F = ∅, then P (E ∪ F ) = P (E) + P (F ).
Remark 1.15 (Power Set). In this definition, we consider the set
F as the set of all events over a set of outcomes Ω. This is an example
of the power set: the set of all subsets of a set. We sometimes denote
this set as 2^Ω. Thus, if Ω is a set, then 2^Ω is the power set of Ω or
the set of all subsets of Ω.
Remark 1.16. Definition 1.14 is surprisingly technical and probably
does not conform to your ordinary sense of what probability is. It’s
best not to think of probability in this very formal way. Instead,
it suffices to think that a probability function assigns a number to
an outcome (or event) that tells you the chances of it occurring.
Put more simply, suppose we could run an experiment where the
result of that experiment will be an outcome in Ω. Then, the function
P simply tells us the proportion of times we will observe an event
E ⊂ Ω if we run this experiment an exceedingly large number of
times.
Example 1.17. Suppose we could play the Deal or No Deal example
over and over again and observe where the money ends up. A smart
game show would mix the money up so that approximately one-half
of the time we observe $1,000,000 in Suitcase 1, and the other half
of the time we observe $1,000,000 in Suitcase 2.
A probability distribution formalizes this notion and might assign
1/2 to event {A} and 1/2 to event {B}. However, to obtain a true
probability distribution, we must also assign probabilities to ∅ and
{A, B}. In the former case, we know that something must happen.
Therefore, we can assign 0 to the event ∅. In the latter case, we know
for certain that either outcome A or B must occur, and so, in this
case, we assign a value of 1 to P (Ω).
Example 1.18. In a fair six-sided die, the probability of rolling any
value is 1/6. Formally, Ω = {1, 2, . . . , 6}, and any roll is an event with
only one element: {ω}, where ω is some value in Ω. If we consider
the event E = {1, 2, 3}, then P (E) gives us the probability that we
will roll a 1, 2, or 3. Since {1}, {2}, and {3} are disjoint sets and
{1, 2, 3} = {1} ∪ {2} ∪ {3}, we know that
P(E) = 1/6 + 1/6 + 1/6 = 1/2.
Definition 1.19 (Discrete Probability Space). The triple
(Ω, F, P ) is called a discrete probability space over Ω.
Definition 1.20 (Set of All Probability Spaces). Let Ω be a
discrete sample space. The set of all probability spaces defined on Ω
is denoted Δ(Ω).
Remark 1.21. The previous definition is used in our study of general
utility formulations. In the case when Ω = {ω1 , . . . , ωn }, we can also
think of Δ as being in one-to-one correspondence with the set of all
points (p1 , . . . , pn ) ∈ Rn such that:
(1) p1 + p2 + · · · + pn = 1 and
(2) pi ≥ 0.
That is, if (Ω, F, P) ∈ Δ(Ω), then P(ωi) = pi. We will encounter the set

Δn = {(p1, . . . , pn) ∈ Rⁿ : Σ_i pi = 1, pi ≥ 0, i ∈ {1, . . . , n}}

again in Chapter 5.
Lemma 1.22. Let (Ω, F, P ) be a discrete probability space. Then,
P (∅) = 0.
Proof. The sets Ω ∈ F and ∅ ∈ F are disjoint (i.e., Ω ∩ ∅ = ∅).
Thus,

P (Ω ∪ ∅) = P (Ω) + P (∅).

We know that Ω ∪ ∅ = Ω. Thus, we have

P (Ω) = P (Ω) + P (∅) =⇒ 1 = 1 + P (∅) =⇒ 0 = P (∅).




Lemma 1.23. Let (Ω, F, P ) be a discrete probability space, and let


E, F ∈ F. Then,

P (E ∪ F ) = P (E) + P (F ) − P (E ∩ F ). (1.1)

Proof. If E ∩ F = ∅, then by definition, P (E ∪ F ) = P (E) + P (F )


but P (∅) = 0; therefore, P (E ∪ F ) = P (E) + P (F ) − P (E ∩ F ).
Suppose E ∩ F ≠ ∅. Then, let

E′ = {ω ∈ E | ω ∉ F} and
F′ = {ω ∈ F | ω ∉ E}.

This is illustrated in Fig. 1.1. Then, we know the following:

(1) E′ ∩ F′ = ∅,
(2) E′ ∩ (E ∩ F) = ∅,
(3) F′ ∩ (E ∩ F) = ∅,
(4) E = E′ ∪ (E ∩ F), and
(5) F = F′ ∪ (E ∩ F).

Fig. 1.1. An illustration of the probabilities used in the proof of Lemma 1.23.

Thus (by the inductive extension of Property 2 in Definition 1.14), we know that

P(E ∪ F) = P(E′ ∪ F′ ∪ (E ∩ F)) = P(E′) + P(F′) + P(E ∩ F). (1.2)

We also know that

P(E) = P(E′) + P(E ∩ F) =⇒ P(E′) = P(E) − P(E ∩ F) (1.3)

and

P(F) = P(F′) + P(E ∩ F) =⇒ P(F′) = P(F) − P(E ∩ F). (1.4)

Combining these three equations yields

P(E ∪ F) = P(E) − P(E ∩ F) + P(F) − P(E ∩ F) + P(E ∩ F)
         = P(E) + P(F) − P(E ∩ F). (1.5)

This completes the proof. 
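Lemma 1.23 can also be checked by brute force on a small sample space. The following sketch is an added illustration (not drawn from the text); it verifies the identity for several events of the fair-die space under a uniform probability function.

    from fractions import Fraction
    from itertools import combinations

    omega = {1, 2, 3, 4, 5, 6}

    def prob(event):
        return Fraction(len(event & omega), len(omega))

    # Check P(E ∪ F) = P(E) + P(F) - P(E ∩ F) for every pair of events
    # built from a few representative subsets of omega.
    subsets = [set(), {1}, {1, 3}, {2, 4}, {1, 2, 3}, omega]
    for E, F in combinations(subsets, 2):
        assert prob(E | F) == prob(E) + prob(F) - prob(E & F)
    print("Lemma 1.23 holds on all tested pairs.")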

1.2 Random Variables and Expected Values

Remark 1.24. Intuitively, a random variable is a variable X whose


value is not known a priori and which is determined according to
some probability distribution P that is part of a probability space
(Ω, F, P ). Naturally, we can make this more rigorous, but it is not
necessary.

Example 1.25. Suppose that we consider flipping a fair coin. Then,


the probability of seeing heads (or tails) should be 1/2. If we let X
be a random variable that provides the outcome of the flip, then it
will take on a value of either heads or tails, and it will take each value
half of the time (in the long run).
Remark 1.26. The problem with allowing a random variable to
take on arbitrary values (like heads or tails) is that it makes it dif-
ficult to use random variables in formulas involving numbers. There
is a very technical definition of a random variable that arises in for-
mal probability theory. However, it is well beyond the scope of this
book. We can, however, get a flavor of this definition in the following
restricted form.
Definition 1.27. Let (Ω, F, P ) be a discrete probability space. Let
D ⊆ R be a finite discrete subset of the real numbers. A random
variable X is a function that maps each element of Ω to an element
of D. Formally, X : Ω → D.
Remark 1.28. Clearly, if S ⊆ D, then X −1 (S) = {ω ∈ Ω|X(ω) ∈
S} ∈ F. We can think of the probability of X taking on a value in
S ⊆ D as precisely P (X −1 (S)).
Using this observation, if (Ω, F, P ) is a discrete probability distri-
bution function, X : Ω → D is a random variable, and x ∈ D, then
let P (x) = P (X −1 ({x})). That is, the probability of X taking the
value x is the probability of the element in Ω corresponding to x.
Example 1.29. Consider our coin-flipping random variable. Instead
of having X take a value of either heads or tails, we can instead let
X take on value of 1 if the coin comes up heads and 0 if the coin
comes up tails. Thus, if Ω = {heads, tails}, then X(heads ) = 1 and
X(tails ) = 0.
Example 1.30. When Ω is already a subset of R, then defining ran-
dom variables is easy. The random variable can simply be the obvi-
ous mapping from Ω into itself. For example, if we consider rolling
a fair die, then Ω = {1, . . . , 6}, and any random variable defined on
(Ω, F, P ) will take on values 1, . . . , 6.

Definition 1.31 (Expected Value). Let (Ω, F, P ) be a discrete


probability distribution, and let X : Ω → D be a random variable.
Then, the expected value of X is

E(X) = Σ_{x∈D} x P(x). (1.6)
Example 1.32. Let’s play a die-rolling game. You put up your own
money. Even numbers lose $10 times the number rolled, while odd
numbers win $12 times the number rolled. What is the expected
amount of money you will win in this game?
Let Ω = {1, . . . , 6}. Then, D = {12, −20, 36, −40, 60, −60}; these
are the dollar values you will win for various dice outcomes. Then,
the expected value of X is

E(X) = 12(1/6) + (−20)(1/6) + 36(1/6) + (−40)(1/6) + 60(1/6) + (−60)(1/6) = −2.

Would you still want to play this game considering the expected
payoff is −$2?
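As a purely illustrative check (the book does not provide code), the expected value in Example 1.32 can be computed directly in Python with exact fractions:

    from fractions import Fraction

    # Payoff for each face: even faces lose $10 * face, odd faces win $12 * face.
    def payoff(face):
        return -10 * face if face % 2 == 0 else 12 * face

    # E(X) = sum over outcomes of payoff * probability (Definition 1.31).
    expected = sum(Fraction(1, 6) * payoff(face) for face in range(1, 7))
    print(expected)  # -2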
Example 1.33 (Roulette). A roulette wheel consists of 38 pockets
(slots) numbered 0–36, with an extra pocket labeled 00. Pockets 0 and
00 are green. The remaining pockets are black or red. Eighteen are
black, and eighteen are red. You bet by placing chips on a board that
has the numbers arranged in rows and columns. Depending on the bet
you are making the sample space may change. For example, if you are
betting on color, the sample space is Ωcolor = {Red, Black, Green}.
If you are betting strictly on numbers, the sample space is Ωnumber =
{00, 0, 1, . . . , 36}. A representation of a roulette board and wheel is
shown in Fig. 1.2. Payoffs in roulette are given in ratios. For example,
a bet on any number gives a payout of 35 to 1. That means if you
bet $1 on number 1 and win, you get the original $1 back and $35
additional dollars. The payoff ratio establishes the mapping X : Ω →
R, which is the random variable (payoff) in this case.
Ignoring the notational complexity, on a single-number bet, the
expected profit is

35 · (1/38) − 1 · (37/38) = −1/19 ≈ −0.053. (1.7)

More complex betting strategies are possible, but they lead to similar
results. Playing red returns a payout of 1-to-1, giving an expected
profit of

1 · (18/38) − 1 · (20/38) = −1/19 ≈ −0.053. (1.8)

This is because there are 18 pockets that are red and 20 pockets that
are not red (18 black and 2 green).

Fig. 1.2. An (American) roulette wheel. A French roulette wheel lacks the 00
pocket. This image was obtained from https://commons.wikimedia.org/wiki/File:American_roulette.svg under the CC-BY-SA-3.0 license.
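The two roulette computations above can be reproduced in a few lines; the sketch below is an added illustration (not part of the original text), using exact fractions to avoid rounding error.

    from fractions import Fraction

    # American roulette: 38 pockets (1-36 plus 0 and 00); 18 red pockets.
    # A single-number bet pays 35 to 1; a bet on red pays 1 to 1.
    single_number = Fraction(1, 38) * 35 + Fraction(37, 38) * (-1)
    red = Fraction(18, 38) * 1 + Fraction(20, 38) * (-1)
    print(single_number, float(single_number))  # -1/19 ≈ -0.0526
    print(red, float(red))                      # -1/19 ≈ -0.0526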
Example 1.34 (Nick the Greek). Nick the Greek was a profes-
sional gambler in Las Vegas during the 1940s. Though he died poor
(due to poker losses) while in Las Vegas, he developed a system in
which he would gamble against the players at the table and not the
games themselves. For example, if a person was convinced that the
roulette wheel was going to come up red, he (Nick) might say, “I’ll
give you 1-to-1 odds that it’s not going to be red. If you’re right,
you’ll double your bet!” People’s superstitions would work against
them. If the person lost the spin, Nick got, say, $1. If they won the
spin, e.g., Nick lost $1. In effect, Nick became the house. Now, on a
red bet, Nick’s expected payoff would be
   
18 20 1
−1 · +1· = . (1.9)
38 38 19

That means, in the long run, Nick would come out ahead on all
his side bets. Since Nick was wealthy to begin with (his wealth was
partially inherited), he could cover small losses periodically and reap
the marginal long-run gain. In essence, Nick was acting like a one-
man hedge fund.

Example 1.35. Return to Example 1.3. You are on Deal or No Deal,


and the two boxes remaining in play contain $0.01 and $1,000,000,
but you do not know which is which. The banker has offered you
$499, 999 to quit. You have a 50% chance of choosing the correct
box. Let W be the random variable giving the winnings if you play.
It has an expected value of

E(W) = 1,000,000 · (1/2) + 0.01 · (1/2) = 500,000.005.

This value is larger than the banker’s offer. So, one school of thought
argues you should play on.
However, here is another way to look at the problem. The banker’s
money is yours with no risk; suppose you consider it already your
money. Rather than assuming you started with nothing, assume
you’re starting with $499, 999. Let G be the random variable denot-
ing your gain from deciding to play. Then, we have

E(G) = (1/2)(0.01 − 499,999) + (1/2)(500,000 − 499,999) = −249,998.995.

Viewed in this way, the banker’s offer seems to be a good deal.


The decision to take the banker’s offer has no risk, and you keep
the $499, 999. The decision to play on is risky and, viewed from the
perspective of gain, will likely cost you almost a quarter of a million
dollars.
This example illustrates a fundamental problem in game theory
and decision theory: Determining the objective or payoff function
in complex situations is challenging. In Chapter 2, we discuss von
Neumann and Morgenstern’s [5] approach to this problem.
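The two framings in Example 1.35 are easy to tabulate. The snippet below is an added illustration, not part of the original text; it simply re-evaluates the numbers as they appear in the example above.

    # Two framings of the final Deal or No Deal decision from Example 1.35.
    offer = 499_999.00
    boxes = [0.01, 1_000_000.00]

    # Framing 1: expected winnings from playing on.
    expected_winnings = sum(boxes) / 2
    print(expected_winnings, expected_winnings > offer)   # ≈ 500000.005 True

    # Framing 2: treat the offer as money already in hand and look at the
    # expected gain from playing, using the same figures the text uses.
    expected_gain = 0.5 * (0.01 - offer) + 0.5 * (500_000 - offer)
    print(expected_gain)                                   # ≈ -249998.995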

1.3 Some Specialized Results on Probability

Remark 1.36. This section is optional. The results herein are used
only once in Chapter 2.

Remark 1.37. The proof of the following lemma is left as an


exercise.

Lemma 1.38. Let (Ω, F, P ) be a discrete probability space, and let


E, F ∈ F. Then,
P(E) = P(E ∩ F) + P(E ∩ Fᶜ). (1.10)
Remark 1.39. There are several ways to prove the following lemma.
We choose a classic set-theoretic proof.
Lemma 1.40. Let Ω be a set (sample space), and suppose that
E, F1 , . . . , Fn are subsets of Ω. Then,

E ∩ (⋃_{i=1}^n Fi) = ⋃_{i=1}^n (E ∩ Fi). (1.11)

That is, intersection distributes over union.


Proof. We prove set containment in both directions. That is, we prove that

E ∩ (⋃_{i=1}^n Fi) ⊆ ⋃_{i=1}^n (E ∩ Fi) and
E ∩ (⋃_{i=1}^n Fi) ⊇ ⋃_{i=1}^n (E ∩ Fi).

For simplicity, define

F = ⋃_{i=1}^n Fi.

Suppose we have ω ∈ E ∩ F. Then, ω ∈ E and ω ∈ F. That means there is at least one i so that ω ∈ Fi. Then, ω ∈ E ∩ Fi; consequently,

ω ∈ ⋃_{i=1}^n (E ∩ Fi).

Now, we proceed in the opposite direction. Suppose that

ω ∈ ⋃_{i=1}^n (E ∩ Fi).

Then, there is at least one i so that ω ∈ E and ω ∈ Fi. This implies that ω ∈ E ∩ F. This completes the proof. □

Theorem 1.41. Let (Ω, F, P) be a discrete probability space, and let E ∈ F. Let F1, . . . , Fn be any pairwise disjoint collection of sets that partition Ω. That is, assume

Ω = ⋃_{i=1}^n Fi, (1.12)

and Fi ∩ Fj = ∅ if i ≠ j. Then,

P(E) = Σ_{i=1}^n P(E ∩ Fi). (1.13)

Proof. We proceed by induction on n. If n = 1, then F1 = Ω, and we know that P(E) = P(E ∩ Ω) by necessity. Therefore, suppose the statement is true for k ≤ n. We show that the statement is true for n + 1.
Let F1, . . . , Fn+1 be pairwise disjoint subsets satisfying Eq. (1.12). Let

F = ⋃_{i=1}^n Fi. (1.14)

Clearly, if x ∈ F, then x ∉ Fn+1 since Fn+1 ∩ Fi = ∅ for i = 1, . . . , n. Also, if x ∉ F, then x ∈ Fn+1 since, from Eq. (1.12), we must have F ∪ Fn+1 = Ω. Thus, Fᶜ = Fn+1, and we can conclude inductively that

P(E) = P(E ∩ F) + P(E ∩ Fn+1). (1.15)

We may apply Lemma 1.40 to show that

E ∩ F = E ∩ (⋃_{i=1}^n Fi) = ⋃_{i=1}^n (E ∩ Fi). (1.16)

Note that if i ≠ j, then (E ∩ Fi) ∩ (E ∩ Fj) = ∅ because Fi ∩ Fj = ∅; therefore,

P(E ∩ F) = P(⋃_{i=1}^n (E ∩ Fi)) = Σ_{i=1}^n P(E ∩ Fi). (1.17)

Thus, we may write

P(E) = Σ_{i=1}^n P(E ∩ Fi) + P(E ∩ Fn+1) = Σ_{i=1}^{n+1} P(E ∩ Fi). (1.18)

This completes the proof. □

Example 1.42. In the casino game craps, we roll two dice, and
winning combinations are determined by the sum of the values on
the dice. An ideal first craps roll is 7. The sample space Ω in
which we are interested has 36 elements, one each for the possi-
ble values the dice will show (the related set of sums can be easily
obtained).
Suppose that the dice are colored blue and red (so they can be
distinguished). Let’s suppose we are interested in the event that we
roll 1 on the blue die and that the pair of values obtained sums to 7.
There is only one way this can occur, namely, we roll 1 on the blue
die and 6 on the red die. Thus, the probability of this occurring is 1/36.
In this case, event E is the event that we roll 7 in our craps game,
and event F1 is the event that the blue die shows a 1. We could also
consider the event F2, in which the blue die shows a 2. By similar
reasoning, we know that the probability of both E and F2 occurring
is 1/36. In fact, if Fi is the event that the blue die shows a value of i
(i = 1, . . . , 6), then we know that

P(E ∩ Fi) = 1/36.

Clearly, the events Fi (i ∈ {1, . . . , 6}) are pairwise disjoint (you
cannot have both 1 and 2 on the same die). Furthermore, Ω =
F1 ∪ F2 ∪ · · · ∪ F6 because some number has to come up on the
blue die. Thus, we can compute

P(E) = Σ_{i=1}^6 P(E ∩ Fi) = 6/36 = 1/6,

which is precisely what we would expect. The probability of rolling
7 with two dice is 1/6.
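Theorem 1.41 and Example 1.42 can be checked by enumerating all 36 dice outcomes. The following Python sketch is an added illustration (the names E and partition mirror the example, but the code itself is not from the book):

    from fractions import Fraction
    from itertools import product

    # All 36 equally likely (blue, red) outcomes for two fair dice.
    omega = list(product(range(1, 7), repeat=2))

    def prob(event):
        return Fraction(len(event), len(omega))

    E = {(b, r) for (b, r) in omega if b + r == 7}          # roll a 7
    partition = [{(b, r) for (b, r) in omega if b == i} for i in range(1, 7)]

    # Theorem 1.41: P(E) = sum over i of P(E ∩ F_i), partitioning on the blue die.
    total = sum((prob(E & F_i) for F_i in partition), Fraction(0))
    print(prob(E), total)  # 1/6 1/6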

1.4 Conditional Probability

Remark 1.43. Suppose we are given a discrete probability space,


(Ω, F, P ), and we are told that an event E has occurred. We now
wish to compute the probability that some other event F will (or has)
occurred. This value is called the conditional probability of event F
given event E and is written P (F |E).

Example 1.44. Consider an experiment where we roll a fair six-


sided die twice. The sample space in this case is the set Ω =
{(x, y)|x = 1, . . . , 6, y = 1, . . . , 6}. Suppose I roll a 2 on the first
try. I want to know what the probability of rolling a combined score
of 8 is. That is, given that I have rolled a 2, I wish to determine the
conditional probability of rolling a 6.
Since the die is fair, the probability of rolling any pair of values
(x, y) ∈ Ω is equally likely. There are 36 elements in Ω, and so each
is assigned a probability of 1/36. That is, (Ω, F, P ) is defined so that
P ((x, y)) = 1/36 for each (x, y) ∈ Ω.
Let E be the event that we roll a 2 on the first try. We wish to
assign a new set of probabilities to the elements of Ω to reflect the fact
that the event E has occurred. We know that our final outcome must
have the form (2, y), where y ∈ {1, . . . , 6}. In essence, E becomes
our new sample space. Further, we know that each outcome with
form (2, y) is equally likely because the dice are fair. Thus, we may
assign
P[(2, y)|E] = 1/6

for each y ∈ {1, . . . , 6} and P[(x, y)|E] = 0 just in case x ≠ 2. That is,
any outcome (x, y) that is not in E is assigned a probability of 0. We
do this because we know that we have already observed the number
2 on the first roll, so it’s impossible to see a first number not equal
to 2.
At last, we can answer the question we originally posed. The only
way to obtain a sum equal to 8 is to roll a 6 on the second attempt.
Thus, the probability of rolling a combined score of 8, given that we
roll 2 on the first roll, is 1/6.
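The same sample-space filtering gives a quick computational check of Example 1.44. This sketch is an added illustration (not from the text) and applies the definition of conditional probability directly:

    from fractions import Fraction
    from itertools import product

    omega = list(product(range(1, 7), repeat=2))

    def prob(event):
        return Fraction(len(event), len(omega))

    E = {(x, y) for (x, y) in omega if x == 2}          # first roll is a 2
    F = {(x, y) for (x, y) in omega if x + y == 8}      # combined score is 8

    # P(F|E) = P(F ∩ E) / P(E), as in Definition 1.47.
    print(prob(F & E) / prob(E))  # 1/6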
Lemma 1.45. Let (Ω, F, P ) be a discrete probability space, and sup-
pose that event E ⊆ Ω. Then, (E, FE , PE ) is a discrete probability
space, with
PE(F) = P(F)/P(E) (1.19)

for all F ⊆ E and PE(ω) = 0 for any ω ∉ E.
Proof. Our objective is to show that (E, FE , PE ) is a properly
defined probability space.
If ω ∉ E, then we can assign PE(ω) = 0. Suppose that ω ∈ E.
For (E, FE, PE) to be a discrete probability space, we must have
PE(E) = 1, or

PE(E) = Σ_{ω∈E} PE(ω) = 1. (1.20)

We know from Definition 1.14 that

P(E) = Σ_{ω∈E} P(ω).

Thus, if we assign PE(ω) = P(ω)/P(E) for all ω ∈ E, then Eq. (1.20)
will be satisfied automatically. Since for any F ⊆ E we know that

P(F) = Σ_{ω∈F} P(ω),

it follows that PE (F ) = P (F )/P (E). Finally, if F1 , F2 ⊆ E and


F1 ∩F2 = ∅, then the fact that PE (F1 ∪F2 ) = PE (F1 )+PE (F2 ) follows
from the properties of the original probability space (Ω, F, P ). Thus,
(E, FE , PE ) is a discrete probability space. 
Remark 1.46. The previous lemma gives us a direct way to con-
struct P (F |E) for arbitrary F ⊆ Ω. Clearly, if F ⊆ E, then
P(F|E) = PE(F) = P(F)/P(E).

Now, suppose that F is not a subset of E but that F ∩ E ≠ ∅. Then,
clearly, the only possible events that can occur in F , given that E has
occurred, are the ones that are also in E. Thus, PE (F ) = PE (E ∩ F ).
More to the point, we have
P(F|E) = PE(F ∩ E) = P(F ∩ E)/P(E). (1.21)
This leads to the following definition.
Definition 1.47 (Conditional Probability). Given a discrete
probability space (Ω, F, P ) and an event E ∈ F, the conditional
probability of event F ∈ F given event E is
P(F|E) = P(F ∩ E)/P(E). (1.22)

1.5 Independence

Definition 1.48 (Independence). Let (Ω, F, P ) be a discrete


probability space. Two events E, F ∈ F are called independent if
P (E|F ) = P (E) and P (F |E) = P (F ).
Theorem 1.49. Let (Ω, F, P ) be a discrete probability space.
If E, F ∈ F are independent events, then P (E ∩ F ) = P (E)P (F ).
Proof. We know that
P(E|F) = P(E ∩ F)/P(F) = P(E).
Multiplying by P (F ), we obtain P (E ∩ F ) = P (E)P (F ). This com-
pletes the proof. 
Example 1.50. Consider rolling a fair die twice in a row. Let Ω be
the sample space of possible results. Thus,

Ω = {(x, y)|x = 1, . . . , 6, y = 1, . . . , 6}.

Let E be the event that we obtain a 6 on the first roll. Then,

E = {(6, y) : y = 1, . . . , 6},

and let F be the event that we obtain a 6 on the second roll, so that

F = {(x, 6) : x = 1, . . . , 6}.

These two events are independent. The first roll cannot affect the
outcome of the second roll, thus P (F |E) = P (F ). We know that
P(E) = P(F) = 1/6. That is, there is a one in six chance of observing
a 6. Thus, the chance of rolling double sixes in two rolls is precisely
the probability of both events E and F occurring. Using our result
on independent events, we can see that

P(E ∩ F) = P(E)P(F) = (1/6)² = 1/36,
just as we expect it to be.
Example 1.51. Suppose we are interested in the probability of
rolling at least one 6 in two rolls of a single die. Again, the rolls

are independent. Let’s consider the probability of not rolling a 6 at


all. Let E be the event that we do not roll a 6 in the first roll.
Then, P (E) = 5/6 (as there are five ways to not roll a 6). If F
is the event that we do not roll a 6 on the second roll, then again
P (F ) = 5/6. Since these events are independent (as before), we can
compute P (E ∩ F ) = (5/6)(5/6) = 25/36. This is the probability of
not rolling a 6 on the first roll and not rolling a 6 on the second roll.
We are interested in rolling at least one 6. Thus, if G is the event of
not rolling a 6 at all, then Gᶜ must be the event of rolling at least
one 6. Thus, P(Gᶜ) = 1 − P(G) = 1 − 25/36 = 11/36.
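A short added check of Example 1.51 (not from the text), using independence and the complement:

    from fractions import Fraction

    # Probability of at least one 6 in two independent rolls,
    # computed through the complement of "no 6 at all".
    p_no_six_one_roll = Fraction(5, 6)
    p_no_six_both = p_no_six_one_roll * p_no_six_one_roll   # independence
    print(1 - p_no_six_both)  # 11/36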

1.6 Blackjack: A Game of Conditional Probability

Remark 1.52. In the card game blackjack, a dealer deals cards to


players and himself or herself. Non-face cards have the value shown;
i.e., a two-card is worth the value of two. Face cards all have a value
of 10, except aces, which are either worth 11 or 1, depending on what
is best for the player. The object is to continue requesting cards (by
saying “hit me”) to achieve a total value as close to 21 as possible
without going over. A player who wishes to receive no more cards is
said to be “standing” or “standing pat.” If a player achieves a value
closer to 21 than the dealer, then he/she wins and doubles his/her
bet. Otherwise, the dealer (house) wins. There are several nuances to
the game (doubling down, insurance, etc.) that are outside the scope
of this discussion. See Ref. [6] for additional details.
When blackjack is played with a dealer and a single player, it
is a simple game against the house since the dealer is required to
follow specific rules with respect to his/her cards. As a result, there
is only one decision-maker (the solo player), as opposed to a game
like poker in which there are multiple decision-makers (players). It is
interesting to note that multi-player blackjack admits only a weak
coupling between the players and the rewards they receive because all
players share the same deck(s) used by the dealer. Even in this case,
all players play directly against the house and not against each other.
Player decisions only affect the conditional probability distribution on
the next card. As such, blackjack (and similar games) admit certain
strategies that can help players improve their chances of winning.
These are usually called card-counting strategies.

Table 1.1. A table of three example card counting strategies. The level of the
count is the number of different distinct non-zero values that can be assigned to
a card.

Strategy 2 3 4 5 6 7 8 9 10/Face A Level

Hi-Lo +1 +1 +1 +1 +1 0 0 0 −1 −1 1
Hi-Opt I 0 +1 +1 +1 +1 0 0 0 −1 0 1
Hi-Opt II +1 +1 +2 +2 +1 +1 0 0 −2 0 2

Card counting in blackjack is designed to allow the player to deter-


mine when the house has an advantage (without computing condi-
tional probabilities) so that he/she can adjust the betting strategy
(amount) accordingly. As cards are shown, the player adjusts the
count according to a table of values for different cards. Three exam-
ples of card-counting strategies are shown in Table 1.1. Additional
strategies can be found in Ref. [7]. In a counting strategy, the level
of the count is the number of different distinct non-zero values that
can be assigned to a card.
High counts favor the player, while low counts favor the dealer.
This is (in some sense) intuitively clear. The more high cards that
are seen, the lower the count goes and the less likely a player will
see a large initial value on a deal. When many low cards are in play,
players face the risk of busting (going over 21) as they try to increase
the value of the hand dealt.

Example 1.53. We use a simple blackjack example to illustrate


card counting. Suppose you are sitting at the blackjack table, and
you see the configuration shown in Fig. 1.3. Assume you have already
observed the following cards in a previous round of play: A♥, J♠,
6♣, 2♥, 5♦, 10♣, 8♦, and 6♠. You must decide whether to hit or not.
Assuming you are playing with a 52-card deck, deciding whether to
hit or not can be accomplished by computing the conditional proba-
bility of going over 21 (busting) given the current state of the cards
and whether you hit or not. In this case, you will not bust if you get
a 4 or less. Given the cards that have already been drawn, we see
that there are 37 cards remaining in the deck. Of the 15 cards that
have been drawn, you know the value of 13 of them. This implies the
following:

Fig. 1.3. You are sitting at a blackjack table. The dealer holds a king and
something. You hold a 7 and a king. Do you hit?

• There are at most four 4’s remaining in the deck.
• There are at most three 3’s remaining in the deck.
• There are at most three 2’s remaining in the deck.
• There are at most three A’s remaining in the deck.
Therefore, there are at most 13 cards out of the remaining 37 that
will help you. A rough back-of-the-envelope computation tells you
that
Pr(Drawing a card with value ≤ 4) ≤ 13/37 ≈ 0.35. (1.23)
Thus, the probability of busting is at least 65%. Consequently, you
might consider holding with your 17, even though the dealer is show-
ing a king. The probability we have just estimated is the conditional
probability of drawing a card with a value of 4 or less, given the
current state of the cards in play. The event we conditioned on is the
current set of cards that have been drawn, or, equivalently, the cards
remaining in the deck.
Let’s investigate what a simple hi-lo count says about this situa-
tion. Using the counting system, we see that we have a count of

−1(A♥) − 1(J♠) + 1(6♣) + 1(2♥) + 1(5♦) − 1(10♣) + 0(8♦)
+ 1(6♠) − 1(K♠) − 1(Q♣) + 1(3♥) + 0(7♦) − 1(K♥) = −1.

This suggests a slight advantage for the dealer at this point. In the
next round of play, the count suggests that some caution be exercised

in placing a bet, which is consistent with our observations thus far.


At the moment, the probability of busting is quite high.
It is worth noting that the count is about the next set of bets.
It’s not strictly about what to do next in terms of hitting, though
it certainly can inform that decision. In this case, the odds favor
standing pat in order to avoid going over 21, which is a guaranteed
loss.
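Both the back-of-the-envelope bust bound and the Hi-Lo running count from Example 1.53 can be reproduced mechanically. The sketch below is an added illustration, not code from the book; the hi_lo helper is my encoding of the Hi-Lo row of Table 1.1.

    from fractions import Fraction

    # Rough bound from Example 1.53: 13 of the 37 unseen cards keep you at or
    # under 21, so the chance of a "safe" draw is at most 13/37.
    p_safe_at_most = Fraction(13, 37)
    print(float(p_safe_at_most), float(1 - p_safe_at_most))  # ≈ 0.351, ≈ 0.649

    # Hi-Lo running count for the cards visible in the example (Table 1.1):
    # 2-6 count +1, 7-9 count 0, 10/face/ace count -1.
    def hi_lo(rank):
        if rank in {"2", "3", "4", "5", "6"}:
            return 1
        if rank in {"7", "8", "9"}:
            return 0
        return -1  # 10, J, Q, K, A

    seen = ["A", "J", "6", "2", "5", "10", "8", "6", "K", "Q", "3", "7", "K"]
    print(sum(hi_lo(card) for card in seen))  # -1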

1.7 The Monty Hall Problem and Decision Trees

Remark 1.54. The game show Let’s Make a Deal originated in


the United States in 1963 [8] and was originally hosted by Monty
Hall. The show features the host offering various deals and trades to
contestants, who often have incomplete information.
Example 1.55 (The Monty Hall Problem). You are a contestant
on Let’s Make a Deal. You must choose between Door Number 1,
Door Number 2, and Door Number 3. Behind one of these doors is
a fabulous prize. Behind the other two doors are goats. Once you
choose your door, the host, Monty Hall, will reveal a door that does
not have a big deal. At this point, you can decide if you want to keep
the original door you chose or switch doors. When the time comes,
what do you do?
It is tempting at first to suppose that it doesn’t matter whether
you switch or not. You have a 1/3 chance of choosing the correct door
on your first try, so why would that change after you are provided
information about an incorrect door? It turns out that it does matter.
To solve this problem, it helps to understand the set of potential
outcomes and the information associated with the decision. There are
really three possible pieces of information that determine an outcome:
(1) which door the producer chooses for the big deal,
(2) which door you choose first, and
(3) whether you switch or not.
For the first decision, there are three possibilities (three doors). For
the second decision, there are again three possibilities (again, three
doors). For the third decision, there are two possibilities (you either
switch or not). Thus, there are 3 × 3 × 2 = 18 possible outcomes.
These outcomes can be visualized in the order in which the decisions
are made (more or less), which is shown in Fig. 1.4. The first step
(where the producers choose a door to hide the prize) is not observ-
able by the contestant, so we adorn this part of the diagram with
a box. When we discuss game trees in Chapter 3, we explain this
notation more completely.
The next to last row (labeled “Switch”) of Fig. 1.4 illustrates the
18 elements of the probability space. We assume that they are all
equally likely (i.e., you randomly choose a door, you randomly decide
to switch, and the producers of the show randomly choose a door for
hiding the prize). In this case, the probability of any outcome is 1/18.
Now, let’s focus exclusively on the outcomes in which we decide to
switch. In Fig. 1.4, these appear with bold, black borders. This is the
conditioning event, that is, the event set E. Let the event F consist
of those outcomes for which the contestant wins. This is shown in the
bottom row of Fig. 1.4, with a W . We are interested in P (F |E). That
is, what are the chances of winning, given that we actively choose to
switch?
Within E, there are precisely 6 outcomes in which we win. If each
of these mutually exclusive outcomes has a probability of 1/18, then
$$P(E \cap F) = 6\left(\frac{1}{18}\right) = \frac{1}{3}.$$
Obviously, we switch in 9 of the 18 possible independent outcomes, so
$$P(E) = 9\left(\frac{1}{18}\right) = \frac{1}{2}.$$
Thus, we can compute
$$P(F \mid E) = \frac{P(E \cap F)}{P(E)} = \frac{1/3}{1/2} = \frac{2}{3}.$$

If we switch, there is a 2/3 chance we will win the prize. If we don’t
switch, there is only a 1/3 chance we will win the prize. Thus, switching
is better than not switching.
If this reasoning does not appeal to you, there is another way
to see that the chance of winning given a switch is 2/3. In the case
of switching, we are making a conscious decision; there is no prob-
abilistic voodoo that is affecting this part of the outcome. So, just
consider the outcomes in which we switch and count. Note that there
Prize is Behind:        1                    2                    3
Choose Door:       1    2    3          1    2    3          1    2    3
Switch:           Y N  Y N  Y N        Y N  Y N  Y N        Y N  Y N  Y N
Win/Lose:         L W  W L  W L        W L  L W  W L        W L  W L  L W

Fig. 1.4. The Monty Hall problem is a multi-stage decision problem whose solution relies on conditional probability. The
stages of decision-making are shown in the diagram. We assume that the prizes are randomly assigned to the doors. We cannot
see this step, so we adorn this decision with a square box. We discuss these boxes further when we talk about game trees.
You, the player, must first choose a door. Lastly, you must decide whether to switch doors after being shown an incorrect
door.
are 9 outcomes in which we switch from our original door to a door
we did not pick first. In 6 of these 9, we win the prize, while in 3, we
fail to win the prize. Thus, the chance of winning the prize when we
switch is 6/9 = 2/3.
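For readers who prefer to see the 2/3 emerge empirically, the following Python sketch simulates the game many times under the assumptions above (the prize is placed uniformly at random, and the host always opens a goat door you did not pick). The variable names and structure are our own illustration.

```python
import random

def play(switch, trials=100_000):
    """Simulate the Monty Hall game; return the fraction of wins."""
    wins = 0
    for _ in range(trials):
        prize = random.randrange(3)
        choice = random.randrange(3)
        # The host opens a door that is neither the prize nor the player's choice.
        opened = next(d for d in range(3) if d != prize and d != choice)
        if switch:
            choice = next(d for d in range(3) if d != choice and d != opened)
        wins += (choice == prize)
    return wins / trials

print(play(switch=True))   # approximately 2/3
print(play(switch=False))  # approximately 1/3
```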

1.8 Bayes’ Theorem

Remark 1.56. In this final section, we turn to Bayes’ theorem and


use it to consider the most serious decision problem of all: one related
to health. This section is not used in the remainder of the text, but
it rounds out the treatment of elementary probability in decision
problems. It can be safely skipped with no loss of continuity.

Remark 1.57. In its simplest form, Bayes’ theorem can be proved


using the definition of conditional probability. Proving the following
lemma is left as an exercise.

Lemma 1.58 (Bayes’ Theorem – Special Case). Let (Ω, F, P )


be a discrete probability space, and suppose that E, F ∈ F. Then,

$$P(F \mid E) = \frac{P(E \mid F)\,P(F)}{P(E)}. \tag{1.24}$$

Remark 1.59. We can generalize this result when we have a col-


lection of sets, F1 , . . . , Fn ∈ F, that partition Ω and are pairwise
disjoint.

Theorem 1.60 (Bayes’ Theorem – General Form). Let


(Ω, F, P ) be a discrete probability space, and suppose that
E, F1 , . . . , Fn ∈ F, with F1 , . . . , Fn being pairwise disjoint and
$$\Omega = \bigcup_{i=1}^{n} F_i.$$
Then,
$$P(F_i \mid E) = \frac{P(E \mid F_i)\,P(F_i)}{\sum_{j=1}^{n} P(E \mid F_j)\,P(F_j)}. \tag{1.25}$$

Proof. Consider the fact that
$$\sum_{j=1}^{n} P(E \mid F_j)\,P(F_j) = \sum_{j=1}^{n} \frac{P(E \cap F_j)}{P(F_j)}\,P(F_j) = \sum_{j=1}^{n} P(E \cap F_j) = P(E),$$
by Theorem 1.41. From Lemma 1.58, we conclude that
$$\frac{P(E \mid F_i)\,P(F_i)}{\sum_{j=1}^{n} P(E \mid F_j)\,P(F_j)} = \frac{P(E \mid F_i)\,P(F_i)}{P(E)} = P(F_i \mid E).$$
This completes the proof. □

Example 1.61. Here’s a rather morbid example: Suppose that a


specific disease occurs with a probability of 1 in 1,000,000. A simple
test exists to determine whether an individual has this disease. When
an individual has the disease, the test will detect it 99 times out of
100. The test also has a false positive rate of 1 in 1,000 (that is, there
is a 0.001 probability that the test returns positive when the disease
is absent). The treatment for this disease
is costly and unpleasant. You have just tested positive. What do you
do?
There are two events to consider: (i) the event of having the dis-
ease (F ) and (ii) the event of testing positive (E). We are interested
in computing

P (F |E) = The probability of having the disease given a


positive test.

We know the following information:

(1) P (F ) = 1 × 10−6 : There is a 1 in 1,000,000 chance of having this


disease.
(2) P (E|F ) = 0.99: The probability of testing positive given that
you have the disease is 0.99.
(3) P (E|F c ) = 0.001: The probability of testing positive given that
you do not have the disease is 1 in 1,000.

We can apply Bayes’ theorem to see that
$$P(F \mid E) = \frac{P(E \mid F)\,P(F)}{P(E \mid F)\,P(F) + P(E \mid F^c)\,P(F^c)}
= \frac{(0.99)(1 \times 10^{-6})}{(0.99)(1 \times 10^{-6}) + (0.001)(1 - 1 \times 10^{-6})} \approx 0.00098. \tag{1.26}$$

Thus, the probability of having the disease given the positive test
is less than 1 in 1,000. You should follow up with additional tests
before committing to the unpleasant and costly treatment.
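The arithmetic in Eq. (1.26) is easy to check numerically. Below is a small Python sketch of the same calculation; the function and argument names are our own illustrative choices, not a standard library routine.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Bayes' theorem for a single binary test: P(disease | positive test)."""
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive

# Values from Example 1.61.
print(posterior(prior=1e-6, sensitivity=0.99, false_positive_rate=0.001))
# approximately 0.00098: a positive test still leaves the disease very unlikely
```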

1.9 Chapter Notes

Modern probability theory began with a letter from the Chevalier de


Méré (pen name of Antoine Gombaud) to Blaise Pascal (of Pascal’s
triangle fame) [9]. (Though Cardano did work on games of chance
well before this [10].) Gombaud was a gambler and wanted to know
how to divide winnings in a game of dice when play was inter-
rupted (and the game could not be finished). It is clear from the
correspondence that Gombaud kept detailed records of his gambling
and was able to furnish Pascal with significant details. Pascal, in
turn, corresponded with Pierre de Fermat (of Fermat’s last theo-
rem [9] fame), and together they pieced together the essentials of
discrete probability distributions. Devlin [11] presents an outstand-
ing history of this interaction.
The work of Pascal and Fermat was extended by Huygens [12],
and the classical definition of probability was completed by Laplace
[13]. During this time, the reverend Thomas Bayes worked as an
amateur statistician and independently published the theorem that
now bears his name [14]. Among others, Gauss would later develop
additional statistical methods as part of his duties as an astronomer
and was among the first to use error analysis in tracking the dwarf
planet Ceres using what we now call Gaussian distributions [15],
which is why the distribution bears his name. It is worth noting
that some authors credit de Moivre with the invention of the normal
distribution [16], though this is not widely accepted.

By the late 19th century, it was clear that this treatment of


probability would not suffice. David Hilbert challenged the math-
ematical community to axiomatize probability in the service of sta-
tistical physics in his 6th problem. This problem was taken up by
Kolmogorov [9], who provided the first measure-theoretic axioma-
tization of probability theory, which is similar in style to the one
presented in this chapter. Kolmogorov was a student of Markov, who
contributed to the theory of stochastic processes [9]. For those inter-
ested in an introduction to measure theory, Bear [17] has a short and
readable book on the subject.
Card counting is almost an immediate consequence of
Gombaud’s letter to Pascal, though it was the mathematician
Edward O. Thorp who proved that card counting could be used
to overcome house advantage in blackjack in his book Beat the
Dealer [18]. Thorp’s work was highly original and influential, but
was influenced by the prior work of J. L. Kelly of Bell Labs, who
established the Kelly criterion for formulating bet size [19].
The chief roadblock to card counters is knowing the count before
sitting at the table. The MIT card-counting team (featured in the
movie 21 ) used a big player team strategy [20]. In this strategy, card
counters would sit at a table and make safe bets, winning or losing
very little over the course of time. They would keep the card count
and signal big players from their team, who would arrive at the table
and make large bets when the count was high (in their favor). The
big players would leave once signaled that the count had dropped.
Using this strategy, the MIT players cleared millions from the casinos
using basic probability theory.
In response to the academic work on card counting, casinos
have invested time and money in detection methods. Detecting card
counting has become a major part of casino intelligence operations.
There are several extremely sophisticated methods casinos employ
for detecting counting strategies. In general, it is difficult to make
counting work in a modern casino without using a team system, and
careful hedging is required to stay below the radar of the pit boss and
the eye in the sky. We note that the gambling example given in this
chapter is of a very unsophisticated version of blackjack. Most casinos
use six decks at a blackjack table, introduce fresh packs of cards, and
employ automatic shufflers designed to negatively impact counting.


When all else fails, dealers may intentionally distract players they
suspect are counters to break their concentration and ruin the count.
The Monty Hall problem discussed in this chapter first appeared
in 1975 in the American Statistician by Steve Selvin [21]. It was sub-
sequently described by Morgan, Chaganty, Dahiya, and Doviak [22],
again in the American Statistician, who pointed out that the assump-
tions in the Monty Hall problem setup are critical to the solution.
The decision problem was then discussed in follow-up letters and arti-
cles [23–25], and in 2002, a “quantum” version was formulated and
analyzed [26]. A historical account of the problem and its solutions
was written by Rosenhouse in 2009 [27]. If the problem is rephrased
as a two-player game in which the host stands to gain, it can rad-
ically alter the solution structure. The problem then becomes one
of determining problem setups in which switching is always better.
Interestingly, in a 1991 interview [28], Monty Hall revealed that he
had control over which deals to offer and was therefore able to manip-
ulate the contestants psychologically and affect their decisions. Thus,
in real life, Let’s Make a Deal was more similar to a two-player game
than a game against the house.

– ♠♣♥♦ –

1.10 Exercises

1.1 A fair four-sided die is rolled. Assume that the sample space
of interest is the number appearing on the die, and the numbers
run from 1 to 4. Identify the space Ω precisely and all the possible
outcomes and events within the space. What is the (logical) fair
probability distribution in this case? [Hint: See Example 1.18.]

1.2 Compute the probability of rolling a double 6 in 24 rolls of a


pair of dice. [Hint: Each roll is independent of the last roll. Let E
be the event that you do not roll a double 6 on a given roll. The
probability of E is 35/36 (i.e., there are 35 other ways the dice could
come out other than double 6). Now, compute the probability of
not seeing a double six in all 24 rolls using independence. (You will


get a power of 24.) Let this probability be p. Finally, note that the
probability of a double 6 occurring is precisely 1 − p. To see this,
note that p is the probability of the event that a double six does not
occur. Thus, the probability of the event that a double 6 does occur
must be 1 − p.]

1.3 Prove the following: Let E ⊆ Ω and define E c to be the set


of elements of Ω not in E (which is called the complement of E).
Suppose (Ω, F, P ) is a discrete probability space. Show that P (E c ) =
1 − P (E).

1.4 Prove Lemma 1.38. [Hint: Show that E ∩ F and E ∩ F c are


mutually exclusive events. Then, show that E = (E ∩ F ) ∪ (E ∩ F c ).]

1.5 Use Definition 1.47 to compute the probability of obtaining a


sum of 8 in two rolls of a die, given that in the first roll, a 1 or
2 appears. [Hint: The space of outcomes is still Ω = {(x, y)|x =
1, . . . , 6, y = 1, . . . , 6}. First, identify the event E within this space.
How many elements within this set will enable you to obtain an 8 in
two rolls? This is the set E ∩ F . What is the probability of E ∩ F ?
What is the probability of E? Use the formula in Definition 1.47. It
might help to write out the space Ω.]

1.6 Show (in any way you like) that the probability of winning in
the Monty Hall problem given that you do not switch doors, is 1/3.

1.7 In the little-known Lost Episodes of Let’s Make a Deal, Monty


(or Wayne) introduces a fourth door. Suppose that you choose a door
and then you are shown two incorrect doors and given the chance
to switch. Should you switch? Why? [Hint: Build a figure similar
to Fig. 1.4. It will be a bit large. Use the same reasoning we used
to compute the probability of successfully winning the prize in the
previous example.]

1.8 Prove the simple form of Bayes’ theorem. [Hint: Use Defini-
tion 1.47.]

1.9 In Example 1.61, for what probability of having the disease is


there a 1 in 100 chance of having the disease, given that you have
tested positive? [Hint: I’m asking for what value of P (F ) is the value
of P (F |E) 1 in 100. Draw a graph of P (F |E) and use a calculator.]

1.10 There are several other ways to analyze the Monty Hall prob-
lem. Use Bayes’ theorem to show that the probability of winning in
the Monty Hall problem is 2/3.

Chapter 2

Elementary Utility Theory

Chapter Goals: The goal of this chapter is to introduce utility


theory, as defined by von Neumann and Morgenstern [5], and to
use it to show how even non-monetary prizes can be transformed
into numerical payoffs. We also introduce the expectation maxi-
mization theorem. This chapter can be skipped by those readers
who want to get into multiplayer game theory, as long as you
accept the central premise that all prizes can be converted into
numeric payoffs under certain conditions.

2.1 Decision-Making Under Certainty

Remark 2.1. The game show The Price is Right premiered in the
United States in 1972 and features contestants competing in vari-
ous games against the house to win prizes and occasionally money.
Almost all games involve an element of chance and knowledge about
the prices of commercial products [29]. This represents a stark con-
trast from Deal or No Deal, where the prizes are all monetary in
nature.
Definition 2.2 (Lottery). A lottery, L = {A1 , . . . , An }, P , is
a collection of prizes (or rewards, or costs), {A1 , . . . , An }, along
with a discrete probability distribution, P , with the sample space
{A1 , . . . , An }. We denote the set of all lotteries over A1 , . . . , An by L.


Remark 2.3. In a lottery (of this type), we do not assume that we


will determine the probability distribution P as a result of repeated
exposure. (This is not like a state lottery.) Instead, the probability
is given ab initio and does not change. Also, ab initio is a fancy way
of saying “from the beginning.”
Remark 2.4. To simplify notation, we state that

L = (A1 , p1 ), . . . , (An , pn )

is the lottery consisting of prizes A1 –An , where you receive prize A1


with a probability of p1 , prize A2 with a probability of p2 , etc. Note
that it is assumed in the definition that

p1 + p2 + · · · + pn = 1

because these probabilities arise from a discrete probability function


with sample space A1 , . . . , An .
Remark 2.5. The lottery in which we win the prize Ai with a prob-
ability of 1 and all other prizes with a probability of 0 will be denoted
as Ai as well. Thus, the prize Ai is equivalent to a lottery in which
one always wins the prize Ai .
Example 2.6. On The Price is Right, in the game Temptation, the
contestant is offered four (small) prizes and given their dollar value.
From the dollar values, the contestant must then construct the price
of a large prize (e.g., a car). Once the player is shown all the prizes
(and constructs a guess for the price of the car), the player must
make a choice between taking the small prizes and leaving or risking
the prizes and playing for the large prize.
In this example, there are two lotteries: the small prize option and
the large prize option. The small prize option contains a single reward
consisting of the various items seen by the contestant. Denote this
lottery as A1 . This lottery is (A1 , P1 ), where P1 (A1 ) = 1. The other
lottery option contains two rewards: the large prize (car) A2 and the
null prize A0 (where the contestant leaves with nothing). This lottery
has the form {A0 , A2 }, P2 , where P2 (A0 ) = p, P2 (A2 ) = 1 − p, and
p ∈ (0, 1), and depends on the nature of the prices of the prizes in
A1 , which were used to construct the guess for the price of the large
prize.

2.2 Preference and the von Neumann–Morgenstern


Assumptions

Definition 2.7 (Preference). Let L1 and L2 be lotteries. We write
L1 ⪰ L2 to indicate that an individual prefers lottery L1 to lottery
L2 . If both L1 ⪰ L2 and L2 ⪰ L1 , then L1 ∼ L2 , and L1 and L2 are
considered equivalent to the individual.
Remark 2.8. The axiomatic treatment of utility theory rests on
certain assumptions about an individual’s behavior when they are
confronted with a choice of two or more lotteries. We have already
seen this type of scenario in Example 2.6. We assume that these
choices are governed by preference. Preferences can vary from indi-
vidual to individual.
Remark 2.9. For the remainder of this chapter, we assume that
every lottery consists of prizes A1 , . . . , An and that the prizes are
preferred in the order

A1 ⪰ A2 ⪰ · · · ⪰ An . (2.1)

We now introduce five assumptions that will be needed to prove the


expected utility theorem.
Assumption 1. Let L1 , L2 and L3 be lotteries:
(1) Either L1 ⪰ L2 or L2 ⪰ L1 or L1 ∼ L2 .
(2) If L1 ⪰ L2 and L2 ⪰ L3 , then L1 ⪰ L3 .
(3) If L1 ∼ L2 and L2 ∼ L3 , then L1 ∼ L3 .
(4) If L1 ⪰ L2 and L2 ⪰ L1 , then L1 ∼ L2 .
Remark 2.10. Item 1 of Assumption 1 states that the ordering ⪰
is a total ordering on the set of all lotteries with which an individual
may be presented. That is, we can compare any two lotteries to each
other and always be able to decide which one is preferred or whether
they are equivalent. Item 2 of Assumption 1 states that this ordering
is transitive.
It is clear that preference should be reflexive (i.e., L1 ∼ L1 for
all lotteries L1 ) and symmetric (L1 ∼ L2 if and only if L2 ∼ L1 for
all lotteries L1 and L2 ). The assumption of transitivity implies that
preferential equivalence is an equivalence relation over the set of all
lotteries.


Example 2.11 (Is Transitivity a Problem?). For this example,


you must use your imagination and think like a preschooler. Imag-
ine a scenario in which we present a preschooler with the following
choices (lotteries with only one item): a ball, a puzzle, and a crayon
(and paper). If we present the choice of the puzzle and the crayon,
the child may choose the crayon. In presenting the puzzle and the
ball, the child may choose the puzzle. On the other hand, suppose
we present the crayon and the ball. If the child chooses the ball,
then transitivity is violated. We can ask why this might happen.
It is possible that the child’s preferences will change depending upon
the current requirements of their imagination. Such transitivity vio-
lations have been observed in real-world experiments [30].
Definition 2.12 (Compound Lottery). Let L1 , . . . , Ln be a set
of lotteries, and suppose that the probability of being presented with
lottery i (i = 1, . . . , n) is qi . A lottery Q = (L1 , q1 ), . . . , (Ln , qn ) is
called a compound lottery.
Example 2.13. Consider a fictional game called Flip of a Coin!.
Two contestants are randomly assigned heads or tails. A coin is
flipped:
• The winner is offered a chance to flip a second coin or to leave with
a guaranteed $500. If the second coin comes up heads, the winner
receives $1,000; if the coin comes up tails, the contestant flips an
unfair coin that comes up heads 10% of the time. If the contestant
flips heads, she wins $10,000. If the contestant flips tails, she leaves
with nothing.
• The loser is offered the choice of leaving with nothing or flipping a
second coin. If the second coin comes up heads, the loser receives
$100. If the coin comes up tails, the contestant flips an unfair coin
that comes up heads 10% of the time. If the contestant flips heads,
he wins $1,000. If the contestant flips tails, he falls into a tank of
water and leaves with nothing but wet clothes.
We can work backward to model the entire show as a collection of
(compound) lotteries. Unless stated otherwise, assume all coin flips
are fair and that the loser opts to flip the second coin. Then, there
are two possible lotteries he will see:
LT = ($1,000, 1/10), (Wet, 9/10).
LH = ($100, 1).

The lottery LT occurs if the second coin flip comes up tails. The
lottery LH occurs if the second coin flip comes up heads. Note that
this is a trivial lottery with a guaranteed prize.
The losing contestant must choose between a trivial lottery,
L1 = (0, 1),
in which he gets nothing, and a compound lottery,
   
L2 = (LH , 1/2), (LT , 1/2).
For the winning contestant, if she opts to flip the second coin, the
outcomes can be again modeled with two lotteries:
 1
  9

WT = $10,000, 10 , $0, 10
and
WH = ($1,000, 1).
Since these are again decided by a coin flip, the compound lottery in
the winning contestant’s choice is
   
W2 = (WH , 1/2), (WT , 1/2).
The winning contestant chooses between this compound lottery and
the trivial lottery
W1 = ($500, 1).
The possible lotteries and outcomes (with decisions) are illustrated
in Fig. 2.1.
Assumption 2. Let (L1 , q1 ), . . . , (Ln , qn ) be a compound lottery,
and suppose each Li (i = 1, . . . , n) is composed of prizes A1 , . . . , Am
with probabilities pij (j = 1, . . . , m). Then, this compound lottery is
equivalent to a simple lottery in which the probability of prize Aj is
rj = q1 p1j + q2 p2j + · · · + qn pnj .
Remark 2.14. Assumption 2 states that compound lotteries can be
transformed into equivalent simple lotteries. Note further that the
probability of prize j (Aj ) is actually
$$P(A_j) = \sum_{i=1}^{n} P(A_j \mid L_i)\,P(L_i). \tag{2.2}$$


Fig. 2.1. The possible outcomes for “Flip a Coin!” are illustrated. Simple lotteries
are labeled in the circles in which a random event occurs. Compound lotteries
are shown with the boxes.

This statement should be clear from Theorem 1.41, when we define


a probability space in the correct manner.
Example 2.15. Consider the winner’s compound lottery from
Example 2.13. Recall that the lotteries were
   
W2 = (WH , 1/2), (WT , 1/2),
WT = ($10,000, 1/10), ($0, 9/10),
WH = ($1,000, 1).
The prizes in this case are $10,000, $1,000, and $0. Then, we can
compute the probabilities associated with the different prizes:
$$p_{\$0} = \underbrace{\tfrac{1}{2} \cdot \tfrac{9}{10}}_{\text{From } W_T} + \underbrace{\tfrac{1}{2} \cdot 0}_{\text{From } W_H} = \tfrac{9}{20},$$
$$p_{\$1{,}000} = \underbrace{\tfrac{1}{2} \cdot 0}_{\text{From } W_T} + \underbrace{\tfrac{1}{2} \cdot 1}_{\text{From } W_H} = \tfrac{1}{2},$$
$$p_{\$10{,}000} = \underbrace{\tfrac{1}{2} \cdot \tfrac{1}{10}}_{\text{From } W_T} + \underbrace{\tfrac{1}{2} \cdot 0}_{\text{From } W_H} = \tfrac{1}{20}.$$

Since these prizes are all monetary, we can compute the expected
payoff to the winner of the first prize using the simple lottery repre-
sentation. If W is the random variable (not the lottery) giving the
payoff to the winner, assuming she flips the second coin, the payoff is
$$E(W) = 0 \cdot \tfrac{9}{20} + 1000 \cdot \tfrac{1}{2} + 10000 \cdot \tfrac{1}{20} = 1000.$$
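The reduction of a compound lottery to a simple one (Assumption 2) is mechanical enough to script. The following Python sketch encodes the winner’s lotteries from this example as lists of (prize, probability) pairs and flattens the compound lottery; the representation and helper function are our own illustration.

```python
# Lotteries as lists of (prize, probability) pairs.
W_T = [(10_000, 0.1), (0, 0.9)]
W_H = [(1_000, 1.0)]
W_2 = [(W_H, 0.5), (W_T, 0.5)]   # a compound lottery over lotteries

def flatten(compound):
    """Reduce a compound lottery to a simple lottery (Assumption 2)."""
    simple = {}
    for lottery, q in compound:
        for prize, p in lottery:
            simple[prize] = simple.get(prize, 0.0) + q * p
    return simple

simple = flatten(W_2)
print(simple)  # {1000: 0.5, 10000: 0.05, 0: 0.45}
print(round(sum(prize * p for prize, p in simple.items()), 2))  # 1000.0 = E(W)
```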
Remark 2.16. In Example 2.15, the winner receives only monetary
prizes; therefore, it is easy to compare her possible decisions. In con-
trast, it is possible the loser will land in a tank of water, which does
not have an immediate monetary value. The remaining assumptions
will allow us to construct a mapping from lotteries to numeric val-
ues, which can then be compared directly so that we do not have to
convert cars to dollars or understand the monetary loss associated
with wet clothes.
Assumption 3. For each prize (or lottery) Ai , there is a number ui ∈
[0, 1] so that the prize Ai (or lottery Li ) is preferentially equivalent
to the lottery in which you win prize A1 with a probability of ui , An
with a probability of 1 − ui , and all other prizes with a probability
of 0. This lottery will be denoted Ãi .
Remark 2.17. Assumption 3 is often called the continuity assump-
tion, and it is a little strange. It assumes that for any ordered set of
prizes (A1 , . . . , An ), a person would view winning any specific prize
(Ai ) as equivalent to playing a game of chance in which either the
worst or best prize could be obtained.
This assumption may not be valid in all cases. Suppose that the
best prize A1 is a new car, while the worst prize An is spending 10
years in jail. If the prize in question (Ai ) is that you receive $100,
then the continuity assumption implies that there is a game of chance
you would play involving a new car or 10 years in jail that would be
equal to receiving $100.
Assumption 4. If L = (A1 , p1 ), . . . , (Ai , pi ), . . . , (An , pn ) is
a lottery, then L is preferentially equivalent to the lottery
(A1 , p1 ), . . . , (Ãi , pi ), . . . , (An , pn ).
Remark 2.18. Assumption 4 only asserts that we can substitute
any equivalent lottery for a prize and not change the preferential
ordering.


Assumption 5. A lottery L in which A1 is obtained with a proba-


bility of p and An is obtained with a probability of (1 − p) is always
preferred or equivalent to a lottery in which A1 is obtained with a
probability of p′ and An is obtained with a probability of (1 − p′) if
and only if p ≥ p′.
Remark 2.19. Assumption 5 states that anyone would prefer (or at
least be indifferent to) winning A1 with a higher probability and
An with a lower probability. This assumption is reasonable when we
have the case A1 ⪰ An . In Ref. [1], it is pointed out that there are
psychological reasons why this assumption may be violated.

2.3 Expected Utility Theorem

Remark 2.20. The expected utility theorem is the result of all these
assumptions. It provides a formal way of assigning numerical values
to prizes, even if those prizes have no obvious numerical value. Its
proof rests on the five assumptions. If any of those assumptions are
violated, then this theorem need not be true.
Theorem 2.21 (Expected Utility Theorem). Let ⪰ be a prefer-
ence relation satisfying Assumptions 1–5 over the set of all lotteries
L defined over the prizes A1 , . . . , An . Furthermore, assume that

A1 ⪰ A2 ⪰ · · · ⪰ An .

Then, there is a function u : L → [0, 1] with the property that

u(L1 ) ≥ u(L2 ) ⇐⇒ L1 ⪰ L2 . (2.3)

Proof. We begin by defining the utility function:


(1) Define u(A1 ) = 1. Recall that A1 is not only prize A1 but also
the lottery in which we receive A1 with a probability of 1, that
is, the lottery in which p1 = 1 and p2 . . . , pn = 0.
(2) Define u(An ) = 0. Again, recall that An is also the lottery in
which we receive An with a probability of 1.
(3) By Assumption 3, for lottery Ai (i = 1 and i = n), there is a ui
so that Ai is equivalent to Ãi : the lottery in which you win prize
A1 with a probability of ui , An with a probability of 1 − ui , and
all other prizes with a probability of 0. Define u(Ai ) = ui .

(4) Let L ∈ L be a lottery in which we win prize Ai with a probability


of pi . Then,
u(L) = p1 u1 + p2 u2 + · · · + pn un . (2.4)
Here, u1 ≡ 1 and un ≡ 0.
We now show that this utility function satisfies Eq. (2.3).
(⇐) Let L1 , L2 ∈ L, and suppose that L1 ⪰ L2 . Suppose that
L1 = (A1 , p1 ), (A2 , p2 ), . . . , (An , pn ),
L2 = (A1 , q1 ), (A2 , q2 ), . . . , (An , qn ).
By Assumption 3, for each Ai , (i = 1, i = n), we know that Ai ∼ Ãi ,
with Ãi ≡ (A1 , ui ), (An , 1 − ui ). Then, by Assumption 4, we know
that
L1 ∼ (A1 , p1 ), (Ã2 , p2 ), . . . , (Ãn−1 , pn−1 ), (An , pn ),
L2 ∼ (A1 , q1 ), (Ã2 , q2 ), . . . , (Ãn−1 , qn−1 ), (An , qn ).
These are compound lotteries, and we can expand them as
L1 ∼ (A1 , p1 ), ((A1 , u2 ), (An , (1 − u2 )), p2 ), . . . ,
((A1 , un−1 ), (An , (1 − un−1 )), pn−1 ), (An , pn ). (2.5)
L2 ∼ (A1 , q1 ), ((A1 , u2 ), (An , (1 − u2 )), q2 ), . . . ,
((A1 , un−1 ), (An , (1 − un−1 )), qn−1 ), (An , qn ). (2.6)
We can apply Assumption 2 to transform these compound lotter-
ies into simple lotteries. By combining the like prizes, we see that
L1 ∼ (A1 , p1 + u2 p2 + · · · + un−1 pn−1 ),
(An , (1 − u2 )p2 + · · · + (1 − un−1 )pn−1 + pn ) ≜ L̃1 ,
and
L2 ∼ (A1 , q1 + u2 q2 + · · · + un−1 qn−1 ),
(An , (1 − u2 )q2 + · · · + (1 − un−1 )qn−1 + qn ) ≜ L̃2 .
Here, ≜ simply means that we define L̃1 and L̃2 by these expressions.
We can apply Assumption 1 to see that L1 ∼ L̃1 , L2 ∼ L̃2 , and


L1 ⪰ L2 . This implies that L̃1 ⪰ L̃2 . By Assumption 5, we conclude


that

p1 + u2 p2 + · · · + un−1 pn−1 ≥ q1 + u2 q2 + · · · + un−1 qn−1 . (2.7)

Note, however, that

u(L1 ) = p1 + u2 p2 + · · · + un−1 pn−1 ,


u(L2 ) = q1 + u2 q2 + · · · + un−1 qn−1 .

Thus, we have u(L1 ) ≥ u(L2 ).


(⇒) Suppose now that L1 , L2 ∈ L and that u(L1 ) ≥ u(L2 ). Then
we know that,

u(L1 ) = u1 p1 + u2 p2 + · · · + un−1 pn−1 + un pn


≥ u1 q1 + u2 q2 + · · · + un−1 qn−1 + un qn = u(L2 ). (2.8)

As before, we have L1 ∼ L̃1 and L2 ∼ L̃2 . We also have u(L1 ) =


u(L̃1 ) and u(L2 ) = u(L̃2 ). To see this, note that in L̃1 , the probability
associated with prize A1 is

p1 + u2 p2 + · · · + un−1 pn−1 .

Thus (since u1 = 1 and un = 0), we know that

u(L̃1 ) = u1 (p1 + u2 p2 + · · · + un−1 pn−1 )


= u1 p1 + u2 p2 + · · · + un−1 pn−1 + un pn .

A similar statement holds for L̃2 , and thus we can conclude that

p1 + u2 p2 + · · · + un−1 pn−1 ≥ q1 + u2 q2 + · · · + un−1 qn−1 . (2.9)

We can now apply Assumption 5 (which is an if-and-only-if state-


ment) to see that

L̃1 ⪰ L̃2 .

We can now conclude from Assumption 1 that, since L1 ∼ L̃1 ,


L2 ∼ L̃2 , and L̃1 ⪰ L̃2 , we have L1 ⪰ L2 . This completes the proof. □
Remark 2.22. This theorem is called the expected utility theorem
because the utility of any lottery is in fact the expected utility of any
of the prizes. That is, let U be the random variable that takes value
ui if prize Ai is received. Then,
$$E(U) = \sum_{i=1}^{n} u_i\, p(A_i) = u_1 p_1 + u_2 p_2 + \cdots + u_n p_n. \tag{2.10}$$

This is simply the utility of the lottery in which prize i is received


with a probability of pi .
Example 2.23. Suppose you are a contestant on Let’s Make a Deal,
and the following prizes are available:
(1) A1 : A new car (worth $20,000).
(2) A2 : A gift card (worth $1,000).
(3) A3 : A new iPad (worth $800).
(4) A4 : A donkey (technically worth $500, but somewhat challeng-
ing).
We assume that you prefer these prizes in the order in which they
appear. The host offers you the following deal: You can compete in
either of the following games (lotteries):
(1) L1 = (A1 , 0.25), (A2 , 0.25), (A3 , 0.25), (A4 , 0.25);
(2) L2 = (A1 , 0.15), (A2 , 0.4), (A3 , 0.4), (A4 , 0.05).
Which games should you choose to make you the most happy? The
problem here is valuing the prizes. Maybe you really need a new car
(or you just bought a new car). The car may be worth more than
its dollar value. Alternatively, suppose you actually want a donkey.
Suppose you know that donkeys are expensive to own and the “retail”
value ($500) is false. Perhaps it would be difficult to use a gift card
of that size.
For the sake of argument, let’s suppose that you determine that
the donkey is worth nothing to you. You might say that:
(1) A2 ∼ (A1 , 0.1), (A4 , 0.9);
(2) A3 ∼ (A1 , 0.05), (A4 , 0.95).
This implies that u(A2 ) = u2 = 0.1 and u(A3 ) = u3 = 0.05. The
numbers really don’t make any difference; you can supply any val-
ues you want for 0.1 and 0.05 as long as the other numbers enforce


Assumption 3. Then, we can write


L1 ∼ (A1 , 0.25), ((A1 , 0.1), (A4 , 0.9), 0.25),
((A1 , 0.05), (A4 , 0.95), 0.25), (A4 , 0.25),
L2 ∼ (A1 , 0.15), ((A1 , 0.1), (A4 , 0.9), 0.4),
((A1 , 0.05), (A4 , 0.95), 0.4), (A4 , 0.05).
We can now simplify this by expanding these compound lotteries into
simple lotteries in terms of A1 and A4 . To see how we do this, consider
only Lottery 1. Lottery 1 is a compound lottery that contains the
following sub-lotteries:
(1) S1 : A1 with a probability of 0.25;
(2) S2 : (A1 , 0.1), (A4 , 0.9) with a probability of 0.25;
(3) S3 : (A1 , 0.05), (A4 , 0.95) with a probability of 0.25;
(4) S4 : A4 with a probability of 0.25.
To convert this lottery into a simpler lottery, apply Assumption 2.
We have
$$P(A_1) = \underbrace{P(A_1 \mid S_1)}_{1} \cdot \underbrace{P(S_1)}_{0.25} + \underbrace{P(A_1 \mid S_2)}_{0.1} \cdot \underbrace{P(S_2)}_{0.25} + \underbrace{P(A_1 \mid S_3)}_{0.05} \cdot \underbrace{P(S_3)}_{0.25} + \underbrace{P(A_1 \mid S_4)}_{0} \cdot \underbrace{P(S_4)}_{0.25} = 0.2875.$$
Using a similar computation, we deduce that
$$P(A_4) = 0.7125.$$
We can conclude that
L1 ∼ (A1 , 0.2875), (A4 , 0.7125) ≜ L̃1 .
We can perform a similar calculation for L2 to get
L2 ∼ (A1 , 0.21), (A4 , 0.79) ≜ L̃2 .
It is now easy to compute the utility of the two games. We know
immediately that
u(L1 ) = u(L̃1 ) = 0.2875,
u(L2 ) = u(L̃2 ) = 0.21.

Computing u(L1 ) and u(L2 ) using Eq. (2.4) would yield the same
result:
$$u(L_1) = \underbrace{p_1}_{0.25}\,\underbrace{u_1}_{1} + \underbrace{p_2}_{0.25}\,\underbrace{u_2}_{0.1} + \underbrace{p_3}_{0.25}\,\underbrace{u_3}_{0.05} + \underbrace{p_4}_{0.25}\,\underbrace{u_4}_{0} = 0.2875,$$
$$u(L_2) = \underbrace{p_1}_{0.15}\,\underbrace{u_1}_{1} + \underbrace{p_2}_{0.4}\,\underbrace{u_2}_{0.1} + \underbrace{p_3}_{0.4}\,\underbrace{u_3}_{0.05} + \underbrace{p_4}_{0.05}\,\underbrace{u_4}_{0} = 0.21.$$

Thus, you should opt to play the first game (L1 ), even though there
is a greater chance of winning the donkey because L1 has a higher
expected utility than L2 .
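The same comparison can be scripted directly from Eq. (2.4). In the Python sketch below, each lottery is a vector of probabilities over the prizes A1–A4, and the utility values are the ones chosen in this example; the rest is our own illustrative scaffolding.

```python
# Utilities of the four prizes in Example 2.23: car, gift card, iPad, donkey.
u = [1.0, 0.1, 0.05, 0.0]

# Lotteries as probability vectors over (A1, A2, A3, A4).
L1 = [0.25, 0.25, 0.25, 0.25]
L2 = [0.15, 0.40, 0.40, 0.05]

def utility(lottery, utilities):
    """Expected utility of a simple lottery, as in Eq. (2.4)."""
    return sum(p * ui for p, ui in zip(lottery, utilities))

print(round(utility(L1, u), 4))  # 0.2875
print(round(utility(L2, u), 4))  # 0.21 -> L1 is preferred
```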
Definition 2.24 (Linear Utility Function). We say that a utility
function u : L → R is linear if, given any lotteries L1 , L2 ∈ L and
some q ∈ [0, 1], then
u [(L1 , q), (L2 , (1 − q))] = qu(L1 ) + (1 − q)u(L2 ). (2.11)
Here, (L1 , q), (L2 , (1 − q)) is the compound lottery made up of the
lotteries L1 and L2 , each having probabilities of q and (1−q), respec-
tively.
Lemma 2.25. Let L be the collection of lotteries defined over prizes
A1 , . . . , An , with A1 ⪰ A2 ⪰ · · · ⪰ An . Let u : L → [0, 1] be the
utility function defined in Theorem 2.21. Then, L1 ∼ L2 if and only
if u(L1 ) = u(L2 ).
Theorem 2.26. The utility function u : L → [0, 1] defined in Theo-
rem 2.21 is linear.
Proof. Let
L1 = (A1 , p1 ), (A2 , p2 ), . . . , (An , pn ),
L2 = (A1 , r1 ), (A2 , r2 ), . . . , (An , rn ).
Thus, we know that
$$u(L_1) = \sum_{i=1}^{n} p_i u_i, \qquad u(L_2) = \sum_{i=1}^{n} r_i u_i.$$


Choose q ∈ [0, 1]. The lottery L = (L1 , q), (L2 , (1 − q)) is equiv-
alent to a lottery in which prize Ai is obtained with a probability of

Pr(Ai ) = qpi + (1 − q)ri .

Thus, applying Assumption 2, we have
$$\tilde{L} \triangleq (A_1, [q p_1 + (1-q) r_1]), \ldots, (A_n, [q p_n + (1-q) r_n]) \sim L.$$
Applying Lemma 2.25, we can compute
$$u(L) = u(\tilde{L}) = \sum_{i=1}^{n} [q p_i + (1-q) r_i]\, u_i = \sum_{i=1}^{n} q p_i u_i + \sum_{i=1}^{n} (1-q) r_i u_i = q \sum_{i=1}^{n} p_i u_i + (1-q) \sum_{i=1}^{n} r_i u_i = q\, u(L_1) + (1-q)\, u(L_2). \tag{2.12}$$
Thus, u is linear. □
Theorem 2.27. Suppose that a, b ∈ R with a > 0. Then, the function
u′ : L → R, given by

u′(L) = au(L) + b, (2.13)

also has the property that u′(L1 ) ≥ u′(L2 ) if and only if L1 ⪰ L2 ,
where u is the utility function given in Theorem 2.21. Furthermore,
this utility function is linear.
Remark 2.28. A generalization of Theorem 2.27 simply shows that
the class of linear utility functions is closed under a subset of affine
transforms. That means, given a linear utility function, we can con-
struct another by multiplying by a positive constant and adding
another constant.

2.4 Chapter Notes

The expected utility theorem and the theory of lotteries, as pre-


sented in this chapter, are due to von Neumann and Morgenstern [5]
and described in their book Theory of Games and Economic Behav-


ior. Oskar Morgenstern was a German economist and is the co-
founder of modern game theory, along with von Neumann. Von
Neumann, a child prodigy, made seminal contributions to both pure
and applied mathematics, as well as physics and computer science
[31]. As two examples, the von Neumann architecture is the de facto
standard computer architecture now used, and von Neumann alge-
bras are named in his honor. Von Neumann was famously witty.
Freeman Dyson recalled Enrico Fermi quoting von Neumann as say-
ing, “[W]ith four parameters I can fit an elephant, and with five I
can make him wiggle his trunk.” [32]. This is in reference to the
ability of models with multiple parameters to fit arbitrarily com-
plex datasets without providing any additional insight. Fitting an
“elephant” became a problem in recreational mathematics, with the
current best “fit” being the parametric equations,

x = 30 sin(t) − 8 sin(2t) + 10 sin(3t) − 60 cos(t)


y = 50 sin(t) + 18 sin(2t) − 12 cos(3t) + 14 cos(5t),

for t ∈ [0, 2π]. See Fig. 2.2.


This was discovered by Mayer, Khairy, and Howard [33] in 2010
using Fourier analysis with four parameters. The result was then con-
verted into the parametric equations given above. A fifth parameter
can be used to make the elephant’s eye.
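For the curious, the parametric curve above is easy to draw. The short Python sketch below (using numpy and matplotlib) traces it; the sampling density and plotting details are arbitrary choices of ours.

```python
import numpy as np
import matplotlib.pyplot as plt

# The parametric "elephant" curve quoted above, for t in [0, 2*pi].
t = np.linspace(0, 2 * np.pi, 1000)
x = 30 * np.sin(t) - 8 * np.sin(2 * t) + 10 * np.sin(3 * t) - 60 * np.cos(t)
y = 50 * np.sin(t) + 18 * np.sin(2 * t) - 12 * np.cos(3 * t) + 14 * np.cos(5 * t)

plt.plot(x, y)
plt.axis('equal')  # keep the elephant from being squashed
plt.show()
```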
A counterexample to the continuity assumption in Remark 2.17
is a variation of Bernoulli’s St. Petersburg paradox, which posits a
lottery with an infinite payoff and asks how much anyone would
be willing to risk on such a (non-guaranteed) payoff [34]. From
this, Bernoulli concluded that taking only the expected reward into
account was not rational. Consequently, acceptance of von Neumann
and Morgenstern’s axioms implies rejection of Bernoulli’s conclusion
(and vice versa).
There are several variations of utility theory. The one presented
in this chapter is usually called the von Neumann–Morgenstern util-
ity. Other theories of utility may assume determinism. Fishburn [35]
has a survey of classical utility theory from the perspective of man-
agement science, which is valid up to 1968. A more recent survey
by Karni and Schmeidler [36] provides information on utility theory


Fig. 2.2. This elephant fitted by four parameters was discovered by Mayer,
Khairy, and Howard in 2010 [33].

under uncertainty, as considered in this chapter. In general, this topic


is considered a subset of microeconomic theory. See Mankiw’s book
on the subject for details [37]. For a generalization of the results in
this chapter, see Myerson’s text on game theory, which provides an
economic perspective [38].

– ♠♣♥♦ –

2.5 Exercises

2.1 Follow the steps in Example 2.15 to construct a simple lottery


describing the case when the loser decides to flip the second coin.
Decide whether it is in either the winner’s or the loser’s interest to
Elementary Utility Theory 49

flip the second coin (or determine whether it is not possible to do so


without more information).

2.2 Use YouTube to find a clip of the Price is Right game Tempta-
tion. Assume that you have no knowledge of the price of the big prize
(usually a car). Model the game as a set of lotteries and compute the
probabilities. [Hint: The zero knowledge assumption is critical for
this to be possible.]

2.3 Explain what happens to the computation in Example 2.23 if


you replace the “donkey prize” with something more unpleasant, e.g.,
being imprisoned for 10 years. How does this affect your acceptance
of the continuity assumption?

2.4 Prove Lemma 2.25. [Hint: We know L1 ⪰ L2 and L2 ⪰ L1 if and
only if L1 ∼ L2 . We also know L1 ⪰ L2 if and only if u(L1 ) ≥ u(L2 ).]

2.5 Prove Theorem 2.27.

Chapter 3

Game Trees and Extensive Form

Chapter Goals: The goal of this chapter is to introduce games


in extensive form, which are sometimes called game trees. This
provides a visual representation for a game, similar to the one
in Fig. 1.4 or Fig. 2.1. In doing this, we will slowly build more
complex games from simpler ones. We begin with deterministic
games of complete information (such as checkers or chess) and
proceed to games of incomplete information and games with
probabilistic moves. We then define and prove the existence of
an equilibrium strategy in the case of complete information, and
we conclude with Zermelo’s theorem.

3.1 Graphs and Trees

Remark 3.1. We have already seen two diagrams (Figs. 1.4 and
2.1) with a branching structure modeling decisions that individuals
can make. We now formalize these diagrams using the language of
graph theory [39].
Definition 3.2 (Graph). A directed graph (digraph) is a pair G =
(V, E) where V is a finite set of vertexes and E ⊆ V × V is a finite
set of directed edges composed of ordered two-element subsets of V .
Remark 3.3. For the sake of simplicity, we assume that edges of
the form (v, v) (called self-loops) are not permissible in the graph we
draw because they have no game-theoretic meaning.


Fig. 3.1. There are 64 = 2^6 distinct graphs on three vertices. The increased
number of edges is caused by the fact that the edges are now directed.

Example 3.4. Under the previous assumption, there are 2^6 = 64
possible digraphs on three vertices. This can be computed by con-
sidering the number of permutations of two elements chosen from a
three-element set. This yields six possible ordered pairs of vertices
(directed edges). For each of these edges, there are two possibilities:
either the edge is in the edge set or not. Thus, the total number of
digraphs on three vertices is 2^6 = 64. A few are illustrated in Fig. 3.1.
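This count is small enough to verify by brute force. The short Python sketch below enumerates every subset of the six possible directed edges on three labeled vertices (self-loops excluded, as assumed above); it is only a sanity check, not a construction used later in the text.

```python
from itertools import combinations, permutations

vertices = [1, 2, 3]
# All ordered pairs of distinct vertices: the 6 possible directed edges.
possible_edges = list(permutations(vertices, 2))

# Every subset of the possible edges is a distinct digraph on these vertices.
digraphs = [set(edges)
            for k in range(len(possible_edges) + 1)
            for edges in combinations(possible_edges, k)]

print(len(possible_edges), len(digraphs))  # 6 64
```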

Definition 3.5 (Directed Path). Let G = (V, E) be a digraph.


Then, a directed path in G is a sequence of vertices (v0 , v1 , . . . , vn ) so
that (vi , vi+1 ) ∈ E for each i = 0, . . . , n−1 and no vertex is repeated.
We say that the path goes from vertex v0 to vertex vn . The number
of edges in a path is called its length.
Example 3.6. We illustrate a short path in Fig. 3.2. The directed
edges in this graph prevent many long paths.

Definition 3.7 (Directed Tree). A digraph G = (V, E) that pos-


sesses a unique vertex r ∈ V called the root so that (i) there is a
Fig. 3.2. A short path consisting of three vertices is illustrated in a directed
graph.

Fig. 3.3. We illustrate a directed tree. Every directed tree has a unique vertex
called the root. The root is connected by a unique directed path to every other
vertex in the directed tree.

unique path from r to every vertex v ∈ V and (ii) there is no v ∈ V


so that (v, r) ∈ E is called a directed tree.
Example 3.8. Figure 3.3 illustrates a simple directed tree. Note
that there is exactly one directed path connecting the root to every
other vertex in the tree.


Remark 3.9. We note that there are other ways to define trees
(including directed trees). This one is most convenient for us because
we will always be able to identify an obvious root vertex. See Ref. [39]
for additional information on graphs and trees.

Definition 3.10 (Descendants). If T = (V, E) is a directed tree


and v, u ∈ V with (v, u) ∈ E, then u is called a child of v, and v is
called the parent of u. If there is a path from v to u in T , then u is
called a descendant of v, and v is called an ancestor of u.

Definition 3.11 (Out-Edges). If T = (V, E) is a directed tree and


v ∈ V , then we denote the out-edges of vertex v by Eo (v). These are
edges that connect v to its children. Thus,

Eo (v) = {(v, u) : u ∈ V and (v, u) ∈ E}.

Definition 3.12 (Terminal Vertex). If T = (V, E) is a directed


tree and v ∈ V so that v has no descendants, then v is called a
terminal or terminating vertex. All vertices that are not terminal
are non-terminal or intermediate vertices.

Remark 3.13. In other contexts, terminal vertices are called leaves


of the tree, and a terminal vertex is called a leaf [39]. Because
digraphs (and more generally graphs) are used in many disciplines,
a wide range of terminology is in use, depending on the context.

Definition 3.14 (Tree Height). Let T = (V, E) be a tree. The


height of the tree is the length of the longest path in T .

Example 3.15. The height of the tree shown in Fig. 3.3 is 3. There
are three paths of length 3 in the tree that start at the root of the
tree and lead to three of the four terminal vertices.
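These notions translate directly into code. The Python sketch below stores a directed tree as an adjacency dictionary and computes its terminal vertices and height; the particular tree used here is a small example of our own, not the tree drawn in Fig. 3.3.

```python
# A directed tree as an adjacency dictionary: vertex -> list of children.
tree = {
    'r': ['a', 'b'],
    'a': ['c', 'd'],
    'b': [],
    'c': [],
    'd': ['e'],
    'e': [],
}

terminal = [v for v, children in tree.items() if not children]
print(terminal)  # ['b', 'c', 'e']: the terminal vertices of this example tree

def height(v):
    """Length (number of edges) of the longest path starting at v."""
    if not tree[v]:
        return 0
    return 1 + max(height(child) for child in tree[v])

print(height('r'))  # 3: the longest path is r -> a -> d -> e
```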

Lemma 3.16. Let T = (V, E) be a directed tree. If v is a vertex of
T and u is a descendant of v, then there is no path from u to v.

Proof. Let r be the root of the tree. If v = r, the theorem is proved,
since no edge of T enters the root, so no path can end at v. Suppose,
then, that v ≠ r and, for the sake of contradiction, that there is a
path (w0 , w1 , . . . , wn ) from u to v, with
w0 = u and wn = v. Let (x0 , x1 , . . . , xm ) be the path from the root of
the tree to the node v (thus x0 = r and xm = v). Let (y0 , y1 , . . . , yk )
be the path leading from r to u (thus y0 = r and yk = u). Then, we


can construct a new path,

r = y0 , y1 , . . . , yk = u = w0 , w1 , . . . , wn = v,

from r (the root) to the vertex v. Thus, there are two paths leading
from the root to vertex v, contradicting our assertion that T was a
tree. 

Theorem 3.17. Let T = (V, E) be a tree. Suppose u ∈ V is a vertex,


and let

V (u) = {v ∈ V : v = u or v is a descendant of u}.

Let E(u) be the set of all edges defined in paths connecting u to a


vertex in V (u). Then, the graph Tu = (V (u), E(u)) is a tree with root
u and is called the subtree of T descended from u.
Example 3.18. A subtree of the tree in Example 3.8 is shown in
Fig. 3.4. Subtrees can be useful in analyzing decisions in games.

Proof of Theorem 3.17. If u is the root of T , then the statement


is clear. There is a unique path from u (the root) to every vertex in
T , by definition. Thus, Tu is the whole tree.

Fig. 3.4. We illustrate a subtree. This tree is the collection of all nodes that
are descended from a vertex u.


Suppose that u is not the root of T . The set V (u) consists of all
descendants of u and u itself. Thus, between u and each v ∈ V (u),
there is a path p = (v0 , v1 , . . . , vn ), where v0 = u and vn = v. To see
that this path must be unique, suppose that it is not. Then, there
is at least one other distinct path (w0 , w1 , . . . , wm ) with w0 = u and
wm = v. But if that’s so, we know there is a unique path (x0 , . . . , xk )
with x0 being the root r of T and xk = u. It follows that there are
two paths,
(r = x0 , . . . , xk = v0 = u, v1 , . . . , vn = v) and
(r = x0 , . . . , xk = w0 = u, w1 , . . . , wm = v),
between the root x0 and the vertex v. This is a contradiction of our
assumption that T was a directed tree.
To see that there is no path leading from any element in V (u) back
to u, we apply Lemma 3.16. Since, by definition, every edge found
in the paths connecting u with its descendants is in E(u), it follows
that Tu is a directed tree and u is the root since there is a unique
path from u to each element of V (u) and there is no path leading
from any element of V (u) back to u. This completes the proof. 

3.2 Game Trees with Complete Information


and No Chance

Remark 3.19. We now define a special type of game tree with per-
fect information and no chance moves. For this, we consider a player
set P = {P1 , . . . , PN } and a finite set of allowable moves S that
the players can make. Formally, players and moves are just labels
to be assigned to the vertices and edges of a tree. Terminal vertices
in the tree correspond to end-game conditions (e.g., checkmate in
chess), and each terminal vertex is assigned a payoff (score or prize).
We start with the assumption that there are no “chance moves” and
all players “know” precisely who is allowed to move and what moves
were made. By this, we mean that when reasoning about player deci-
sions, we assume players have complete knowledge of the game tree
and any prior moves made. We introduce chance moves later.
Definition 3.20 (Player Vertex Assignment). Let T = (V, E)
be a directed tree, with F ⊆ V the terminal vertices of T and D
the non-terminal vertices of T . (See Definition 3.12.) An assignment


of players to vertices is an onto function ν : D → P that assigns to
each non-terminal vertex v ∈ D a player ν(v) ∈ P. Player ν(v) is
said to own or control vertex v.

Remark 3.21. In the context of game trees, the non-terminal ver-


tices of a game tree are sometimes called decision vertices, which is
why we use D to represent them in Definition 3.20.

Definition 3.22 (Move Assignment). Let T = (V, E) be a


directed tree. A move assignment function is a mapping μ : E → S
where S is a finite set of player moves, where we enforce the con-
dition that if v, u1 , u2 ∈ V and (v, u1 ) ∈ E and (v, u2 ) ∈ E, then
μ(v, u1 ) = μ(v, u2 ) if and only if u1 = u2 .

Remark 3.23. The condition given in Definition 3.22 simply states


that we must label each out-edge from a vertex v with a unique
move. That is, there cannot be any question in the player’s mind
about what will happen when she chooses a move.

Definition 3.24 (Payoff Function). If T = (V, E) is a directed


tree, let F ⊆ V be the terminal vertices. A payoff function is a
mapping π : F → RN that assigns to each terminal vertex of T a
numerical payoff for each player in P = {P1 , . . . , PN }.

Remark 3.25. It is possible that the payoffs from a game may not
be real-valued but instead be tangible assets, prizes, or penalties.
We assume that the assumptions of the expected utility theorem
(Theorem 2.21) are in force; therefore, a linear utility function can
be defined that provides the real values required for the definition of
the payoff function π.

Definition 3.26 (Game Tree with Complete Information and


No Chance Moves). A game tree with complete information and
no chance moves is a tuple (ordered list), G = (T, P, S, ν, μ, π), such
that T is a directed tree, P is the set of players, S is the set of moves,
ν is a player vertex assignment on intermediate vertices of T , μ is a
move assignment on the edges of T , and π is a payoff function on T .

Example 3.27 (Rock-Paper-Scissors). Consider an odd version


of rock-paper-scissors played between two people, in which Player 1


P1
R                          P                          S
P2                         P2                         P2
R      P      S            R      P      S            R      P      S
(0,0) (-1,1) (1,-1)        (1,-1) (0,0) (-1,1)         (-1,1) (1,-1) (0,0)

Fig. 3.5. Rock-paper-scissors with perfect information: Player 1 moves first


and holds up a symbol for either rock, paper, or scissors. This is illustrated by
the three edges leaving the root node, which is assigned to Player 1. Player 2
then holds up a symbol for either rock, paper, or scissors. Payoffs are assigned to
Players 1 and 2 at terminal nodes. The index of the payoff vector corresponds to
the players.

throws and then Player 2 throws. If we assume that the winner


receives +1 points and the loser receives −1 points (and in ties,
both players win 0 points), then the game tree for this scenario is
shown in Fig. 3.5. You may think this game is not entirely fair (though
“fair” is not a mathematically defined notion here), because it looks
like Player 2 has an advantage in knowing Player 1’s move before
making his own move. Irrespective of this feeling, this is a valid game
tree.
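Because Player 2 sees Player 1’s throw before moving, Player 2’s problem at each of her three vertices is simply to pick the payoff-maximizing reply. The Python sketch below encodes the terminal payoffs from Fig. 3.5 and computes that reply; the dictionary layout is our own illustration.

```python
# Payoffs (Player 1, Player 2) at the terminal vertices of Fig. 3.5,
# indexed by (Player 1's throw, Player 2's throw).
payoff = {
    ('R', 'R'): (0, 0),  ('R', 'P'): (-1, 1), ('R', 'S'): (1, -1),
    ('P', 'R'): (1, -1), ('P', 'P'): (0, 0),  ('P', 'S'): (-1, 1),
    ('S', 'R'): (-1, 1), ('S', 'P'): (1, -1), ('S', 'S'): (0, 0),
}

for move1 in 'RPS':
    # Player 2 maximizes her own payoff (the second coordinate).
    best = max('RPS', key=lambda move2: payoff[(move1, move2)][1])
    print(move1, '->', best, payoff[(move1, best)])
# R -> P, P -> S, S -> R: Player 2 always wins when she moves second.
```

This makes precise the intuition that moving second with full information is decisive in this game.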

Definition 3.28 (Strategy-Perfect Information). Let G =


(T, P, S, ν, μ, π) be a game tree with complete information and no
chance, with T = (V, E). Let Vi ⊂ V be the vertices controlled
by Player i. A pure strategy for player Pi (in a perfect information
game) is a mapping σi : Vi → S with the property that if v ∈ Vi
and σi (v) = s, then there is some u ∈ V so that (v, u) ∈ E and
μ(v, u) = s. (Thus, σi will only choose a move that labels an edge
leaving v.)

Remark 3.29. Definition 3.28 tells us that a pure strategy for player
Pi is to choose one out-edge from each vertex controlled by that
player. This is the move the player will make at that point in the
game.

Remark 3.30 (Rationality). As we build methods to find strate-


gies for players, we assume that players are rational, that at any time
they know the entire game tree, and that each player will attempt
to maximize her payoff at the end of the game by choosing an appro-
priate strategy function σi in reference to all the strategy functions
any other player might choose.

Example 3.31 (The Battle of the Bismarck Sea – Part 1¹).


Games can be used to illustrate the importance of intelligence in
combat. In February 1943, the battle for New Guinea had reached a
critical juncture in World War II. The Allies controlled the southern
half of New Guinea and the Japanese the northern half. Reports
indicated that the Japanese were amassing troops to reinforce their
army in New Guinea in an attempt to control the entire island. These
troops had to be delivered by a naval convoy. The Japanese had a
choice of sailing either north of New Britain, where rain and poor
visibility were expected, or sailing south of New Britain, where the
weather was expected to be good. Either route required the same
amount of sailing time. See Fig. 3.6.
General Kenney, the Allied forces commander in the Southwest
Pacific, had been ordered to do as much damage to the Japanese
convoy fleet as possible. He had reconnaissance aircraft to detect
the Japanese fleet but had to determine whether to concentrate his
search planes on the northern or southern route.
The game tree in Fig. 3.7 summarizes the choices for the Japanese
(J) and American (A) commanders (players), with payoffs given as
the number of days available for the bombing of the Japanese fleet.
(Since the Japanese cannot benefit, their payoff is reported as the
negative of these values.) The moves for each player are to either sail
north or sail south for the Japanese and to either search north or
search south for the Americans.
In this game tree, we assume perfect information. Thus, the
Americans (somehow) know which route the Japanese will sail.
Knowing this, they can make an optimal choice for each contingency.
If the Japanese sail north, then the Americans search north and will
be able to bomb the Japanese fleet for two days. Similarly, if the

¹This example is discussed in detail by Brams in Ref. [40].


[Map: the Bismark Sea, with New Britain lying between the northern and southern convoy routes and Papua New Guinea to the south.]

Fig. 3.6. New Guinea is located in the South Pacific and was a major region of
contention during World War II. The northern half was controlled by Japan until
1943, while the southern half was controlled by the Allies. The routes shown are
approximate.

Japanese sail south, the Americans will search south and be able to
bomb the Japanese fleet for three days.
The Japanese, however, also have access to this game tree and,
reasoning that the Americans are payoff maximizers, will choose a
path to minimize their exposure to attack. They must choose to go
north and accept two days of bombing. If they choose to go south,
then they know they will be exposed to three days of bombing. Thus,
their optimal strategy is to sail north.
Naturally, the Allies did not know which route the Japanese would
take. We will return to this case later.
Remark 3.32. The complexity of a game (especially one with per-
fect information and no chance moves) can often be measured by
how many nodes are in its game tree. Certain games, such as chess
and Go, have huge game trees. When computers play games, they
often attempt to explore a game tree in order to determine optimal
moves (even when using neural networks, as in AlphaGo [41]).

[Figure: the Japanese root branches N/S to two Allied vertices; each Allied vertex branches N/S, ending in the payoffs (–2,2), (–1,1), (–2,2), (–3,3).]

Fig. 3.7. The Japanese could choose to sail either north or south of New Britain.
The Americans (Allies) could choose to concentrate their search efforts on either
the northern or southern routes. Given this game tree, the Americans would
always choose to search the north if they knew the Japanese had chosen to sail
on the north side of New Britain. Alternatively, they would search the south route
if they knew the Japanese had taken that route. Assuming the Americans had
perfect intelligence, the Japanese would always choose to sail the northern route,
as they would expose themselves to only two days of bombing as opposed to three
with the southern route.

Another measure of complexity is the length of the longest path


in the game tree. In Example 3.31, the length of the longest path
in the game tree is two edges (moves) containing three nodes. This
reflects the fact that there are only two moves in the game: first,
Player 1 moves, and then, Player 2 moves.

3.3 Game Trees with Incomplete Information

Remark 3.33 (Power Set and Partitions). Recall from


Remark 1.15 that, if X is a set, then 2X is the power set of X or the
set of all subsets of X.

Definition 3.34 (Partition). Let X be a set. A partition of X is a


set I ⊆ 2X so that for all x ∈ X, there is exactly one element I ∈ I
so that x ∈ I.


Definition 3.35 (Information Sets). If T = (V, E) is a tree and


D ⊂ V are the intermediate (decision) nodes of the tree, ν is a player
assignment function, and μ is a move assignment, then information
sets are a set of subsets I ⊂ 2D satisfying the following:

(1) For all v ∈ D, there is exactly one set Iv ∈ I such that v ∈ Iv .


This is the information set of the vertex v.
(2) If v1 , v2 ∈ Iv , then ν(v1 ) = ν(v2 ).
(3) If (v1 , v) ∈ E and μ(v1 , v) = m, and v2 ∈ Iv1 (i.e., v1 and v2 are
in the same information set), then there is some w ∈ V so that
(v2 , w) ∈ E and μ(v2 , w) = m.

Thus, I is a partition of D.

Remark 3.36. Definition 3.35 says that every vertex in a game tree
is assigned to one information set. It also says that if two vertices
are in the same information set, then they must both be controlled
by the same player. Finally, the definition says that two vertices can
be in the same information set only if the moves from these vertices
are indistinguishable.
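The three conditions of Definition 3.35 can be checked mechanically. The sketch below is an illustration of ours (the vertex names and function names are invented); it validates a candidate collection of information sets against the player assignment ν and the move labels μ.

```python
# Illustrative check of Definition 3.35 (names and encoding are ours).
def valid_information_sets(info_sets, player_of, out_moves):
    """info_sets: list of sets of decision vertices.
    player_of[v]: the player nu assigns to vertex v.
    out_moves[v]: the set of move labels mu places on edges leaving v."""
    listed = [v for I in info_sets for v in I]
    # (1) Every decision vertex belongs to exactly one information set.
    if sorted(listed) != sorted(player_of):
        return False
    for I in info_sets:
        # (2) All vertices in a set are controlled by one player;
        # (3) they offer exactly the same move labels.
        if len({player_of[v] for v in I}) != 1:
            return False
        if len({frozenset(out_moves[v]) for v in I}) != 1:
            return False
    return True

# Two vertices Player 2 cannot tell apart; both offer the moves {a, b}.
player_of = {"r": 1, "u": 2, "v": 2}
out_moves = {"r": {"L", "R"}, "u": {"a", "b"}, "v": {"a", "b"}}
print(valid_information_sets([{"r"}, {"u", "v"}], player_of, out_moves))       # True
print(valid_information_sets([{"r"}, {"r", "u", "v"}], player_of, out_moves))  # False
```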

Remark 3.37. An information set is used to capture the notion that


a player doesn’t know what vertex of the game tree she is at, i.e., she
cannot distinguish between two nodes in the game tree. All that is
known is that the same moves are available at all vertices in a given
information set.
In a case like this, it is possible that the player doesn’t know
which vertex in the game tree will come next as a result of choosing
a move, but she can certainly limit the number of possible vertices.

Remark 3.38. We can also think of the information set as a map-


ping ξ : V → I, where I is a finite set of information labels and the
labels satisfy requirements like those in Definition 3.35.

Definition 3.39 (Game Tree with Incomplete Information


and No Chance Moves). A game tree with incomplete information
and no chance is a tuple, G = (T, P, S, ν, μ, π, I), such that T is a
directed tree, ν is a player vertex assignment on intermediate vertices
of T , μ is a move assignment on the edges of T , and π is a payoff
function on T , and I are information sets.

Definition 3.40 (Strategy). Let G = (T, P, S, ν, μ, π, I) be a game


tree with incomplete information and no chance moves, with T =
(V, E). Let Ii be the information sets controlled by Player i. A pure
strategy for Player Pi is a mapping σi : Ii → S with the property
that if I ∈ Ii and σi (I) = s, then for every v ∈ I, there is some edge
(v, w) ∈ E so that μ(v, w) = s. The set of all strategies for player i
is denoted by Σi .

Proposition 3.41. If G = (T, P, S, ν, μ, π, I) and I consists of


only singleton sets, then G is equivalent to a game with complete
information.

Proof. The information sets are used only for defining strategies.
Since each I ∈ I is a singleton, we know that for each I ∈ I, we have
I = {v}, where v ∈ D. Here, D is the set of decision nodes in V , with
T = (V, E). Thus, any strategy σ_i : I_i → S can easily be converted
into σ_i : V_i → S by stating that σ_i(v) = σ_i({v}) for all v ∈ V_i. This
completes the proof. 

Remark 3.42. Note that Definition 3.40 fully generalizes Defini-


tion 3.28 since, in a perfect information game, each decision vertex
is considered to be in its own information set.

Example 3.43 (The Battle of the Bismark Sea – Part 2).


Recall Example 3.31. Obviously, General Kenney did not know
a priori which route the Japanese would take. This can be modeled
using information sets. In this game, the two nodes that are owned by
the Allies in the game tree are in the same information set. General
Kenney doesn’t know whether the Japanese will sail north or south.
He could (in theory) have reasoned that they should sail north, but
he doesn’t know. The information set for the Japanese is likewise
shown in Fig. 3.8 as dashed rectangles.
From the perspective of the Japanese, since the routes will take
the same amount of time, the northern route is more favorable. To see
this, consider Table 3.1.
If the Japanese sail north, then the worst they will suffer is two
days of bombing, while the best they will suffer is one day of bomb-
ing. If the Japanese sail south, the worst they will suffer is three days
of bombing, while the best they will suffer is two days of bombing.


[Figure: the tree of Fig. 3.7, with the Japanese root in its own information set and the two Allied vertices enclosed in a single Allied information set; terminal payoffs are (–2,2), (–1,1), (–2,2), (–3,3).]

Fig. 3.8. The game tree for the Battle of the Bismark Sea with incomplete
information. Obviously, Kenney could not have known a priori which path the
Japanese would choose to sail. He could have reasoned (as they might have) that
their best plan was to sail north, but he wouldn’t really know. We can capture
this fact by showing that when Kenney chooses his move, he cannot distinguish
between the two intermediate nodes that belong to the Allies.

Table 3.1. Various strategies and payoffs in the Battle of the Bismark Sea.
The northern route is favored by the Japanese, who will always do no worse
taking it than taking the southern route.

                   Sail North                 Sail South
  Search North     Bombed for 2 days     ≤    Bombed for 2 days
  Search South     Bombed for 1 day      ≤    Bombed for 3 days

Thus, the northern route should be preferable, as the cost of tak-


ing it is never worse than taking the southern route. We say that
the northern route strategy dominates the southern route strategy.
(We discuss this more formally later.) If General Kenney could reason
this out, then he might choose to commit his reconnaissance forces to
searching the north, even without being able to determine whether
the Japanese sailed north or south.

3.4 Games of Chance

Remark 3.44. In games of chance, there is always a point in the


game where a chance move is made. In card games, the initial deal is
one of these points. To accommodate chance moves, we assume the
existence of a Player 0, who is sometimes called Nature. When dealing
with games of chance, we assume that the player vertex assignment
function assigns some vertices the label P0 .
Definition 3.45 (Moves of Player 0). Let T = (V, E) and ν be
a player vertex assignment function. For all v ∈ D such that ν(v) =
P0 , there is a probability assignment function pv : Eo (v) → [0, 1]
satisfying

∑_{e ∈ E_o(v)} p_v(e) = 1.    (3.1)

Remark 3.46. The probability function(s) pv in Definition 3.45


essentially defines a roll of the dice. When game play reaches a ver-
tex owned by P0 , Nature (or Player 0 or Chance) probabilistically
advances the game by moving along a randomly chosen edge. The
fact that Eq. (3.1) holds simply asserts that the chance moves of
Nature form a probability space at that point, whose outcomes are
all the possible chance moves.
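As a small illustration (ours, not from the text), a chance vertex can be stored as a map from out-edge labels to probabilities; the sketch below checks Eq. (3.1) and then draws a random move the way Nature would.

```python
import random

def nature_moves(p_v, tol=1e-9):
    """p_v maps each out-edge (move label) of a vertex owned by Player 0
    to its probability.  Checks Eq. (3.1) and returns a random edge."""
    if any(p < 0 for p in p_v.values()) or abs(sum(p_v.values()) - 1) > tol:
        raise ValueError("p_v is not a valid probability assignment")
    return random.choices(list(p_v), weights=list(p_v.values()))[0]

# Player 0's coin flip in the poker game of Example 3.49 below.
print(nature_moves({"Heads": 0.5, "Tails": 0.5}))
```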
Definition 3.47 (Game Tree). Let T = (V, E) be a directed tree,
F ⊆ V be the terminal vertices, and D = V \F be the intermediate
(or decision) vertices. Let P = {P0 , P1 , . . . , PN } be a set of players,
including the chance player P0 . Let S be a set of moves for the
players. Let ν : D → P be a player vertex assignment function and
μ : E → S be a move assignment function. Let P denote the set of
Player 0 move functions. That is,

P = {pv : ν(v) = P0 }.

Let π : F → RN be a payoff function. Let I ⊆ 2D be the set of


information sets.
A game tree is a tuple: G = (T, P, S, ν, μ, π, I, P). In this form,
the game defined by the game tree G is said to be in extensive form.


Remark 3.48. A strategy for Player i in a game tree like the one
in Definition 3.47 is the same as that in Definition 3.40.
Example 3.49 (Coin Flip Poker). At the beginning of this game,
each player antes up $1 into a common pot. Player 1 flips a coin
and knows its outcome. Player 2 does not. Player 1 has the option
of passing or raising:
(1) If Player 1 passes, she shows the coin to Player 2; if the coin is
heads, Player 1 wins the pot, whereas if the coin is tails, Player 1
loses the pot.
(2) If Player 1 raises, then she adds another dollar to the pot, and
Player 2 must decide whether to call or fold:
(a) If Player 2 folds, then the game ends and Player 1 takes the
money, irrespective of the coin.
(b) If Player 2 calls, then he adds $1 to the pot. Player 1 shows
the coin; if the coin shows heads, then she wins the pot ($2)
and Player 2 loses the pot, whereas if the coin shows tails,
then she loses the pot and Player 2 wins the pot ($2).
The game tree for this game is shown in Fig. 3.9. The root node
of the game tree is controlled by Nature (Player 0). This corresponds
to the initial coin flip of Player 1, which is random and will result in
heads 50% of the time and tails 50% of the time.
Note that the nodes controlled by P2 are in the same information
set. This is because it is impossible for Player 2 to know whether
Player 1 has flipped heads or tails.
The payoffs shown on the terminal nodes are determined by how
much each player will win or lose.
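A compact encoding of this game tree may be helpful. The sketch below is ours (the nesting scheme and names are invented for illustration); it reproduces the payoffs that appear at the terminal nodes of Fig. 3.9.

```python
# Illustrative encoding of the Example 3.49 tree (names are ours).
# Chance assigns Heads/Tails with probability 0.5 each; terminal entries
# are payoff vectors (Player 1, Player 2).
def flip_subtree(heads):
    pass_payoff = (1, -1) if heads else (-1, 1)    # Player 1 shows the coin
    call_payoff = (2, -2) if heads else (-2, 2)    # the raise is called
    return {"P1": {"Pass": pass_payoff,
                   "Raise": {"P2": {"Call": call_payoff,
                                    "Fold": (1, -1)}}}}

game = {"P0": {("Heads", 0.5): flip_subtree(True),
               ("Tails", 0.5): flip_subtree(False)}}

print(game["P0"][("Tails", 0.5)]["P1"]["Raise"]["P2"]["Call"])   # (-2, 2)
```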

3.5 Payoff Functions and Equilibria

Remark 3.50 (Some Notation). To study equilibria, we have to


work with subtrees. Unfortunately, the notation can get cumbersome,
but it will drive all our proofs.
Let G = (T, P, S, ν, μ, π, I, P) be a game tree, and let u ∈ D,
where D is the set of non-terminal vertices of T . Recall that Tu is
the subtree of T rooted at u. Then, V (Tu ) denotes the vertex set of
Tu . Suppose ν : D → P is a player vertex assignment function, then

[Figure: the chance root P0 branches Heads (0.5)/Tails (0.5) to two P1 vertices; each P1 vertex branches Pass/Raise, and each Raise leads to a P2 vertex (the two P2 vertices form one information set) branching Call/Fold. Terminal payoffs on the heads side: (1,–1), (2,–2), (1,–1); on the tails side: (–1,1), (–2,2), (1,–1).]

Fig. 3.9. The root node of the game tree is controlled by Nature. Player 1’s
coin flip happens at this point. Player 1 can then decide whether to end the game
by passing (and thus receiving a payoff or not) or continue the game by raising.
At this point, Player 2 can then decide whether to call or fold, thus potentially
receiving a payoff.

ν|Tu : V (Tu ) ∩ D → P is the function ν restricted to the vertices in


the subtree. Other functions restricted to Tu are denoted similarly.
Theorem 3.51. Let G = (T, P, S, ν, μ, π, I, P) be a game tree, and
let u ∈ D, where D is the set of non-terminal vertices of T . Then,
the following is a game tree:

G′ = (T_u, P, S, ν|_{T_u}, μ|_{T_u}, π|_{T_u}, I|_{T_u}, P|_{T_u}),

where I|Tu is the set of information sets restricted to the vertices of


Tu . That is,

I|Tu = {I ∩ V (Tu ) : I ∈ I}.

Finally, P|Tu is the set of probability assignment functions in P


restricted only to the edges in Tu .
Proof. By Theorem 3.17, we know that Tu is a subtree of T .
The functions ν|Tu , μ|Tu , and π|Tu are simply restrictions of the
respective functions to the subtree.


Let v be a descendant of u controlled by chance. Since all descen-


dants of u are included in Tu , it follows that all descendants of v are
contained in Tu . Thus,

∑_{e ∈ E_o(v)} p_v|_{T_u}(e) = 1,

as required. Thus, P|Tu is an appropriate set of probability functions.


Finally, since I is a partition of Tu , we may compute I|Tu by
simply removing the vertices in the subsets of I that are not in
T_u. This set, I|_{T_u}, is a partition of T_u and necessarily satisfies the
requirements set forth in Definition 3.35 because all the descendants
of u are elements of V (Tu ). 

Example 3.52. If we consider the game in Example 3.49, but sup-


pose that Player 1 is known to have flipped heads, then the new game
tree is derived by considering only the subtree in which Player 1 sees
heads. This is shown in Fig. 3.10. It is worth noting that when we
restrict our attention to this subtree, a game that was originally an
incomplete-information game becomes a complete-information game.
That is, each vertex is now the sole member of its information set.
Additionally, we have removed chance from the game.

[Figure: P1 branches Pass, ending at payoff (1,–1), or Raise, leading to P2, who branches Call (2,–2) or Fold (1,–1).]

Fig. 3.10. We are told that Player 1 flips heads. The resulting game tree is
substantially simpler. Because the information set on Player 2’s controlled nodes
indicated a lack of knowledge of Player 1’s card, we can see that this sub-game
is now a complete-information game.

Theorem 3.53. Let G = (T, P, S, ν, μ, π, I) be a game with no


chance. Let σ1 , . . . , σN be a set of strategies for Players 1 − N . Then,
these strategies determine a unique path through the game tree.
Proof. To see this, suppose we begin at the root node r. If this
node is controlled by Player i, then node r exists in the information
set Ir ∈ Ii . Then, σi (Ir ) = s ∈ S, and there is some edge (r, u) ∈ E
so that μ(r, u) = s. We have a two-vertex path (r, u).
Consider the game tree G′ constructed from subtree T_u as in
Theorem 3.51. This game tree has root u. We can apply the same
argument to construct a two-vertex path (u, u′), which, when joined
with the initial path, forms the three-node path (r, u, u′). Repeating
this argument inductively will yield a path through the game tree
that is determined by the strategy functions of the players. Since the
number of vertices in the tree is finite, this process will end, produc-
ing the desired path. The uniqueness of the path is ensured by the
fact that the strategies are functions, and thus, at any information
set, exactly one move will be chosen by the player in control. 

Example 3.54. In the Battle of the Bismark Sea, the strategy we


defined in Example 3.31 clearly defines a unique path through the
tree. Since each player determines a priori the unique edge he/she
will select when confronted with a specific information set, a path
through the tree can be determined from these selections. This is
illustrated in Fig. 3.11.
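The path construction of Theorem 3.53 is easy to imitate in code. The following sketch (ours; vertex names are invented) walks the Battle of the Bismark Sea tree under the strategies of Example 3.31 and recovers the unique path of Fig. 3.11.

```python
# Decision vertices: (player, {move: child}); terminal entries are payoffs.
tree = {"root": ("Japan", {"N": "jn", "S": "js"}),
        "jn": ("Allies", {"N": (-2, 2), "S": (-1, 1)}),
        "js": ("Allies", {"N": (-2, 2), "S": (-3, 3)})}

# Strategies from Example 3.31: Japan sails north; with perfect information
# the Allies search wherever the Japanese actually sailed.
sigma = {"Japan": {"root": "N"}, "Allies": {"jn": "N", "js": "S"}}

def play(v="root"):
    """Follow the strategies from the root and return (path, payoff)."""
    path = []
    while isinstance(v, str):                 # terminal entries are tuples
        path.append(v)
        player, children = tree[v]
        v = children[sigma[player][v]]
    return path, v

print(play())    # (['root', 'jn'], (-2, 2))
```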

Theorem 3.55. Let G = (T, P, S, ν, μ, π, I, P). Let σ1 , . . . , σN be


a collection of strategies for Players 1 − N . Then, these strategies
determine a discrete probability space (Ω, F, P ), where Ω is a set of
paths leading from the root of the tree to a subset of the terminal
nodes, and if ω ∈ Ω, then P (ω) is the product of the probabilities of
the chance moves defined by the path ω.
Proof. We proceed inductively from the height of the tree T . Sup-
pose the tree T has a height of 1. Then, there is only one decision
vertex (the root). If that decision vertex is controlled by a player
other than chance, then applying Theorem 3.53, we know that the
strategies σ1 , . . . , σN define a unique path through the tree. The only
paths in a tree of height 1 have the form r, u, where r is the root
of T and u is a terminal vertex. Thus, Ω is a singleton consisting of


[Figure: the Battle of the Bismark Sea tree of Fig. 3.7 with the unique path selected by the players' strategies highlighted.]

Fig. 3.11. A unique path through the game tree of the Battle of the Bismark
Sea, determined by the strategies of the two players.

only the path r, u determined by the strategies, and it is assigned


a probability of 1.
If chance controls the root vertex, then we can define
Ω = {(r, u) : u ∈ F},
where F is the set of terminal nodes in V . The probability assigned
to a path (r, u) is simply the probability that chance (Player P0 )
selects the edge (r, u) ∈ E. The fact that

∑_{u ∈ F} p_r(r, u) = 1

ensures that we can define the probability space (Ω, F, P ). Thus, we


have shown that the theorem is true for game trees of height 1.
Suppose the statement is true for game trees with heights up to
k ≥ 1. We show that the theorem is true for game trees of height
k + 1. Let r be the root of the tree T , and consider the set of children
of U = {u ∈ V : (r, u) ∈ E}. For each u ∈ U , we can define a game
tree of height k with tree Tu using Theorem 3.51. The fact that
this tree has a height of k implies that we can define a probability
space (Ωu , Fu , Pu ), with Ωu composed of paths from u to the terminal
vertices of Tu .

Suppose that vertex r is controlled by Player Pj (j ≠ 0). Then,


the strategy σj determines a unique move that will be made by Player
j at vertex r. Suppose that move m is specified by σj at vertex r so
that μ(r, u) = m for edge (r, u) ∈ E with u ∈ U . That is, edge (r, u)
is labeled m. We define the new event set Ω of paths in the tree T
from root r to a terminal vertex. The probability function on paths
is defined as

P[(r, u_1, . . . , u_k)] = { P_u[(u_1, . . . , u_k)]   if (u_1, . . . , u_k) ∈ Ω_u and u_1 = u,
                           { 0                         otherwise.

The fact that Pu is a properly defined probability function over Ωu


implies that P is a properly defined probability function over Ω, and
thus (Ω, F, P ) is a probability space over the paths in T .
Now, suppose that chance (Player P0 ) controls r in the game tree.
Again, Ω is the set of paths leading from r to a terminal vertex of T .
The probability function on paths can then be defined as

P [(r, v1 , . . . , vk )] = pr (r, v1 )Pv1 [(v1 , . . . , vk )],

where v_1 ∈ U and (v_1, . . . , v_k) ∈ Ω_{v_1}, the set of paths leading from


v1 to a terminal vertex in the tree Tv1 , and p(r, v1 ) is the probability
chance assigned to the edge (r, v1 ) ∈ E.
To see that this is a properly defined probability function, suppose
that ω ∈ Ωu . That is, ω is a path in the tree Tu leading from u to
a terminal vertex of Tu . Then, a path in Ω is constructed by joining
the path that leads from vertex r to vertex u and then following a
path ω ∈ Ω_u. Let (r, ω) denote such a path. Then, we know that

∑_{u∈U} ∑_{ω∈Ω_u} P[(r, ω)] = ∑_{u∈U} ∑_{ω∈Ω_u} p(r, u) P_u(ω)
                            = ∑_{u∈U} p(r, u) ∑_{ω∈Ω_u} P_u(ω) = ∑_{u∈U} p(r, u) = 1.    (3.2)

This is because ∑_{ω∈Ω_u} P_u(ω) = 1. Since clearly P[(r, ω)] ∈ [0, 1]
and the paths through the game tree are independent, it follows that
(Ω, F, P ) is a properly defined probability space. The theorem follows
by induction. 


[Figure: the coin-flip poker tree of Fig. 3.9 with the strategies (Raise, Call) highlighted. The resulting sample space is Ω = { (Heads, Raise, Call) with probability 50%, (Tails, Raise, Call) with probability 50% }.]

Fig. 3.12. The probability space constructed from fixed player strategies in
a game of chance. The strategy space is constructed from the unique choices
determined by the strategy of the players and the independent random events
that are determined by the chance moves.

Example 3.56. Consider the coin-flip game from Example 3.49.


Suppose we fix strategies in which Player 1 always raises and Player 2
always calls. Then, the resulting probability distribution defined as
in Theorem 3.55 contains two paths: one when the coin is heads and
the other when the coin is tails. This is shown in Fig. 3.12. The
sample space consists of the possible paths through the game tree.
As in Theorem 3.53, the paths through the game tree are completely
specified because the only time probabilistic moves occur is when
chance causes the game to progress.
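The probability space of Theorem 3.55 can be enumerated directly. The self-contained sketch below (ours; the encoding is illustrative) lists the paths and probabilities of Fig. 3.12 for the strategies (Raise, Call).

```python
# Terminal nodes: ("payoff", (x, y)); chance nodes carry edge probabilities.
def leaf(x, y):
    return ("payoff", (x, y))

game = ("chance",
        {"Heads": (0.5, ("P1", {"Pass": leaf(1, -1),
                                "Raise": ("P2", {"Call": leaf(2, -2),
                                                 "Fold": leaf(1, -1)})})),
         "Tails": (0.5, ("P1", {"Pass": leaf(-1, 1),
                                "Raise": ("P2", {"Call": leaf(-2, 2),
                                                 "Fold": leaf(1, -1)})}))})

def paths(node, sigma, prob=1.0, path=()):
    """Yield (path, probability, payoff) triples; the probability of a path
    is the product of the chance probabilities along it (Theorem 3.55)."""
    kind, data = node
    if kind == "payoff":
        yield path, prob, data
    elif kind == "chance":
        for outcome, (p, child) in data.items():
            yield from paths(child, sigma, prob * p, path + (outcome,))
    else:                                   # a decision vertex of Player `kind`
        yield from paths(data[sigma[kind]], sigma, prob, path + (sigma[kind],))

for path, p, payoff in paths(game, {"P1": "Raise", "P2": "Call"}):
    print(path, p, payoff)
# ('Heads', 'Raise', 'Call') 0.5 (2, -2)
# ('Tails', 'Raise', 'Call') 0.5 (-2, 2)
```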

Example 3.57. Suppose we play a game in which Players 1 and 2


ante $1 each. One card each is dealt to Player 1 and Player 2. Player 1
can choose to raise (and add $1 to the pot) or fold (and lose the

pot). Player 2 can then choose to call (adding $1) or fold (and lose
the pot). Player 1 wins if both cards are black. Player 2 wins if
both cards are red. The pot is split if the cards have opposite colors.
Suppose that Player 1 always chooses to raise and Player 2 always
chooses to call. Then, the game tree and strategies are shown in
Fig. 3.13. The sample space in this case consists of four distinct
paths, each with a probability of 1/4, assuming that the cards are dealt
with equal probability. In this example, constructing the probabilities
of the various events requires multiplying the probabilities of the
chance moves in each path. Note that the information sets define
the information that the players have. In this case, Player 1 knows
the color of her card, and Player 2 knows the color of his card, but
neither player knows the color of the other player’s card until the
game ends.
Definition 3.58 (Strategy Space). Let Σi be the set of all strate-
gies for Player i in a game tree G. Then, the entire strategy space is
Σ = Σ1 × Σ2 × · · · × Σn .
Definition 3.59 (Strategy Payoff Function). Let G be a game
tree with no chance moves. The strategy payoff function is a mapping
π : Σ → RN . If σ1 , . . . , σN are strategies for Players 1 − N , then
π(σ1 , . . . , σN ) is the vector of payoffs assigned to the terminal node
of the path determined by the strategies σ1 , . . . , σN in the game tree
G. For each i = 1, . . . , N, π_i(σ_1, . . . , σ_N) is the payoff to Player i in
π(σ_1, . . . , σ_N).
Example 3.60. Consider the Battle of the Bismark Sea game from
Example 3.43. Then, there are four distinct strategies in Σ with the
following payoffs:
π (Sail North, Search North) = (−2, 2),
π (Sail South, Search North) = (−2, 2),
π (Sail North, Search South) = (−1, 1),
π (Sail South, Search South) = (−3, 3).
Definition 3.61 (Expected Strategy Payoff Function). Let G
be a game tree with chance moves. The expected strategy payoff func-
tion is a mapping π : Σ → RN defined as follows: If σ1 , . . . , σN

[Figure: the game tree for Example 3.57. Chance deals Player 1 a red or black card (probability 0.5 each) and then deals Player 2 a red or black card (probability 0.5 each); Player 1 raises or folds, and after a raise Player 2 calls or folds, with terminal payoffs as described in the example. With the strategies (Raise, Call) fixed, the sample space Ω consists of the four deal sequences (Red, Red), (Red, Black), (Black, Red), and (Black, Black), each with probability 25%.]

Fig. 3.13. The probability space constructed from fixed player strategies in a game of chance. The strategy space is
constructed from the unique choices determined by the strategy of the players and the independent random events that are
determined by the chance moves. In this example, the probabilities of the various paths are constructed by multiplying the
probabilities of the chance moves in each path.

are strategies for Players 1 − N , then let (Ω, F, P ) be the probabil-


ity space over the paths constructed by these strategies, as given in
Theorem 3.55. Let Πi be a random variable that maps ω ∈ Ω to the
payoff for Player i at the terminal node in the path ω. Let

πi (σ1 , . . . , σN ) = E(Πi ).

Then,

π(σ_1, . . . , σ_N) = (π_1(σ_1, . . . , σ_N), . . . , π_N(σ_1, . . . , σ_N)).

As before, πi (σ1 , . . . , σN ) is the expected payoff to Player i in


π(σ1 , . . . , σN ).

Example 3.62. Consider the coin-flipping game from Example 3.49.


There are four distinct strategies in Σ:

(Pass, Call),  (Pass, Fold),  (Raise, Call),  (Raise, Fold).

Note that the strategies specify moves at every vertex, even if they
cannot be reached because of decisions by certain players.
Focus on the strategy (Pass, Call). Then, the resulting paths in
the graph defined by these strategies are shown in Fig. 3.14. There are
two paths, and we note that the decision made by Player 2 makes
no difference in this case because Player 1 passes. Each path has
a probability of 1/2. Our random variable Π_1 maps the top path in

[Figure: the two paths P0: Heads (0.5) → P1: Pass → (1,–1) and P0: Tails (0.5) → P1: Pass → (–1,1).]

Fig. 3.14. Game tree paths derived from the game in Example 3.49 are the result of the strategy (Pass, Fold). The probability of each of these paths is 1/2.


Fig. 3.14 to a $1 payoff for Player 1 and the bottom path in Fig. 3.14
to a payoff of −$1 for Player 1. Thus, we can compute

π_1(Pass, Call) = (1/2)(1) + (1/2)(−1) = 0.

Likewise,

π_2(Pass, Call) = (1/2)(−1) + (1/2)(1) = 0.

Thus, we compute

π (Pass, Call) = (0, 0).

Using this approach, we can compute the expected payoff function


to be

π (Pass, Call) = (0, 0),


π (Pass, Fold) = (0, 0),
π (Raise, Call) = (0, 0),
π (Raise, Fold) = (1, −1).
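These expected values are just weighted sums over the chance outcomes, so they are easy to reproduce. The following sketch is ours and simply mirrors the computation above:

```python
def expected_payoff(p1_move, p2_move):
    """Expected payoff vector for the coin-flip game of Example 3.49."""
    total = [0.0, 0.0]
    for heads, prob in ((True, 0.5), (False, 0.5)):
        if p1_move == "Pass":
            outcome = (1, -1) if heads else (-1, 1)
        elif p2_move == "Fold":
            outcome = (1, -1)                       # Player 2 folds to the raise
        else:
            outcome = (2, -2) if heads else (-2, 2) # the raise is called
        total[0] += prob * outcome[0]
        total[1] += prob * outcome[1]
    return tuple(total)

for s1 in ("Pass", "Raise"):
    for s2 in ("Call", "Fold"):
        print((s1, s2), expected_payoff(s1, s2))
# Only (Raise, Fold) gives a nonzero expected payoff: (1.0, -1.0).
```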

Definition 3.63 (Equilibrium). A strategy (σ_1^*, . . . , σ_N^*) ∈ Σ is an
equilibrium if, for all i,

π_i(σ_1^*, . . . , σ_i^*, . . . , σ_N^*) ≥ π_i(σ_1^*, . . . , σ_i, . . . , σ_N^*),

where σ_i ∈ Σ_i.

Remark 3.64. An equilibrium strategy is one in which no player


can improve his/her payoff by unilaterally changing his/her strategy.

Example 3.65. Consider the Battle of the Bismark Sea. We can


show that (Sail North, Search North) is an equilibrium strategy.

Recall that

π (Sail North, Search North) = (−2, 2).

Now, suppose that the Japanese deviate from this strategy and
decide to sail south. Then, the new payoff is

π (Sail South, Search North) = (−2, 2).

Thus,

π1 (Sail North, Search North) ≥ π1 (Sail South, Search North).

Now, suppose that the Allies deviate from the strategy and decide
to search south. Then, the new payoff is

π (Sail North, Search South) = (−1, 1).

Thus,

π2 (Sail North, Search North) > π2 (Sail North, Search South).

Now, change only one player strategy at a time and evaluate whether
that improves that player’s payoff. In this case, neither player benefits
from changing strategy.
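The deviation check in this example is mechanical and can be automated. The sketch below (ours) applies Definition 3.63 to the payoff table of Example 3.60:

```python
# Strategic-form payoffs from Example 3.60 (Japanese choice, Allied choice).
payoff = {("North", "North"): (-2, 2), ("South", "North"): (-2, 2),
          ("North", "South"): (-1, 1), ("South", "South"): (-3, 3)}
moves = ("North", "South")

def is_equilibrium(s1, s2):
    """Definition 3.63: no player gains by a unilateral deviation."""
    best1 = all(payoff[(s1, s2)][0] >= payoff[(d, s2)][0] for d in moves)
    best2 = all(payoff[(s1, s2)][1] >= payoff[(s1, d)][1] for d in moves)
    return best1 and best2

print(is_equilibrium("North", "North"))  # True
print(is_equilibrium("South", "South"))  # False: the Japanese would sail north
```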
Remark 3.66. The next proof is long and proves the existence of
equilibria in games of complete information. It is safe to skip this
proof, but it does introduce the idea of backward induction, which
is used in dynamic programming [42], which is in turn used in more
dynamic game theory.
Theorem 3.67. Let G = (T, P, S, ν, μ, π, I, P) be a game tree
with complete information. Then, there is an equilibrium strategy
(σ_1^*, . . . , σ_N^*) ∈ Σ.

Proof. We apply induction on the height to the game tree


T = (V, E). Before proceeding to the proof, recall that a game with
complete information is one in which if v ∈ V and Iv ∈ I is the infor-
mation set of vertex v, then Iv = {v}. Thus, we can think of a strat-
egy σi for Player Pi as a mapping from V to S, as in Definition 3.28.
We now proceed to the proof.


Suppose the height of the tree is 1. Then, the tree consists of a


root node r and a collection of terminal nodes F so that if u ∈ F ,
then (r, u) ∈ E. If chance controls r, then there is no strategy for
any of the players, and they are randomly assigned a payoff. Thus,
we can think of the empty strategy as the equilibrium strategy. On
the other hand, if player Pi controls r, then we let σi (r) = m ∈ S
so that if μ(r, u) = m for some u ∈ F , then πi (u) ≥ πi (v) for all
other v ∈ F . That is, the vertex reached by making the move m
has a payoff for Player i that is greater than or equal to any other
payoff Player i might receive at another vertex. All other players
are assigned empty strategies (as they never make a move). Thus,
it is easy to see that this is an equilibrium strategy since no player
can improve their payoff by changing their strategies. Thus, we have
proved that there is an equilibrium strategy in this case.
Now, suppose that the theorem is true for a game tree G of some
height k ≥ 1 with complete information. We show that the statement
holds for game trees of height k + 1. Let r be the root of the tree, and
let U = {u ∈ V : (r, u) ∈ E} be the set of children of r in T . If r is
controlled by chance, then the first move of the game is controlled by
chance. For each u ∈ U , we can construct a game tree with tree Tu
by Theorem 3.51. By the induction hypothesis, we know that there
is some equilibrium strategy (σ_1^{u*}, . . . , σ_N^{u*}). Let π_i^{u*} be the payoff
associated with using this strategy for Player P_i. Now, consider any
alternative strategy (σ_1^{u*}, . . . , σ_{i−1}^{u*}, σ_i^u, σ_{i+1}^{u*}, . . . , σ_N^{u*}). Let π_i^u be the
payoff to Player P_i that results from using this new strategy in the
game with game tree T_u. It must be that

π_i^{u*} ≥ π_i^u   ∀i ∈ {1, . . . , N}, u ∈ U.    (3.3)

Thus, we construct a new strategy for Player Pi so that if chance


causes the game to transition to vertex u in the first step, then Player

P_i will use strategy σ_i^{u*}. Eq. (3.3) ensures that Player i will never
have a motivation to deviate from this strategy, as the assumption of
complete information assures us that Player i will know for certain
to which u ∈ U the game has transitioned.
Alternatively, suppose that the root is controlled by Player P_j.
Let U and π_i^{u*} be as above. Then, let σ_j(r) = m ∈ S so that if
μ(r, u) = m, then

π_j^{u*} ≥ π_j^{v*}    (3.4)

for all v ∈ U . That is, Player Pj chooses a move that will yield a
new game tree T_u that has the greatest terminal payoff using the
equilibrium strategy (σ_1^{u*}, . . . , σ_N^{u*}) in that game tree. We can now
define a new strategy:

(1) At vertex r, σ_j(r) = m.
(2) Every move in the tree T_u is governed by (σ_1^{u*}, . . . , σ_N^{u*}).
(3) If v ≠ r and v ∉ T_u and ν(v) = i, then σ_i(v) may be chosen at
random from S (because this vertex will never be reached during
game play).

We can show that this is an equilibrium strategy. To see this, consider


any other strategy. If Player i ≠ j deviates, then we know that this
player will receive the payoff π_i^u (as above) because Player j will force
the game into the tree T_u after the first move. We know further that
π_i^{u*} ≥ π_i^u. Thus, there is no incentive for Player P_i to deviate from
the given strategy, (σ_1^{u*}, . . . , σ_N^{u*}), in T_u.
On the other hand, suppose Player j deviates at some vertex in
T_u; then we know Player j will receive the payoff π_j^u ≤ π_j^{u*}. Thus,
once the game play takes place inside tree T_u, there is no reason
to deviate from the given strategy. If Player j deviates on the first
move and chooses a move m′ so that μ(r, v) = m′, then there are two
possibilities:

(1) π_j^{v*} = π_j^{u*},
(2) π_j^{v*} < π_j^{u*}.

In the first case, we can construct a strategy as before, in which Player
P_j will still receive the same payoff as if he played the strategy in
which σ_j(r) = m (instead of σ_j(r) = m′). In the second case, the
best payoff Player P_j can obtain is π_j^{v*} < π_j^{u*}, so there is certainly
no reason for Player P_j to deviate by defining σ_j(r) = m′. Thus, we
have shown that the strategy we constructed is an equilibrium, and
it follows that there is an equilibrium strategy for this tree of height
k + 1. The theorem follows by induction. 

Remark 3.68. The equilibrium constructed in this theorem is called


a sub-game perfect equilibrium because it contains an equilibrium for
every possible sub-game.


Example 3.69. We can illustrate the construction from the theorem


using the Battle of the Bismark Sea with complete information. In fact,
you have already seen this construction once. Consider the game tree
in Fig. 3.11. We construct the equilibrium solution from the bottom
of the tree up. Consider the vertex controlled by the Allies, in which
the Japanese sail north. In the subtree below this node, the best move
for the Allies is to search north (they receive the highest payoff).
This is highlighted in blue. Now, consider the vertex controlled by
the Allies, where the Japanese sail south. The best move for the
Allies is to search south. Now, consider the root node controlled by
the Japanese. The Japanese can examine the two subtrees below this
node and determine that the payoffs resulting from the equilibrium
solutions in these trees are −2 (from sailing north) and −3 (from
sailing south). Naturally, the Japanese will choose to make the move
of sailing north, as this is the highest payoff they can achieve. The
equilibrium strategy is shown in red and blue in the tree in Fig. 3.15.
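The bottom-up construction illustrated here is ordinary backward induction, and for a small perfect-information tree it takes only a few lines of code. The sketch below is ours (the tree encoding is invented for illustration) and recovers the sub-game perfect moves shown in Fig. 3.15.

```python
# Decision vertices: (player, {move: child}); terminal entries are payoffs.
tree = {"root": ("J", {"N": "jn", "S": "js"}),
        "jn": ("A", {"N": (-2, 2), "S": (-1, 1)}),
        "js": ("A", {"N": (-2, 2), "S": (-3, 3)})}
INDEX = {"J": 0, "A": 1}          # which payoff coordinate each player wants

def backward_induction(v, strategy):
    """Return the equilibrium payoff of the subtree rooted at v, recording
    the chosen move at every decision vertex in 'strategy'."""
    if isinstance(v, tuple):                      # terminal payoff vector
        return v
    player, children = tree[v]
    values = {m: backward_induction(child, strategy)
              for m, child in children.items()}
    best = max(values, key=lambda m: values[m][INDEX[player]])
    strategy[v] = best
    return values[best]

strategy = {}
print(backward_induction("root", strategy))   # (-2, 2)
print(strategy)                                # {'jn': 'N', 'js': 'S', 'root': 'N'}
```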

Remark 3.70. We note that the equilibrium identified in the previ-


ous example also happens to be an equilibrium strategy for this game

[Figure: the Battle of the Bismark Sea tree with the best Allied reply highlighted at each Allied vertex and the Japanese choice of sailing north highlighted at the root.]

Fig. 3.15. The game tree for the Battle of the Bismark Sea. If the Japanese sail
north, the best move for the Allies is to search north. If the Japanese sail south,
then the best move for the Allies is to search south. The Japanese, observing the
payoffs, note that given these best strategies for the Allies, their best course of
action is to sail north.

if we introduce incomplete information. This is not always the case,


which can be seen by investigating rock-paper-scissors with incom-
plete information.
Corollary 3.71 (Zermelo’s Theorem). Let G = (T, P, S, ν, μ, π)
be a two-player game with complete information and no chance.
Assume that the payoff is such that
(1) the only payoffs are +1 (win), −1 (lose);
(2) Player 1 wins +1 if and only if Player 2 wins −1;
(3) Player 2 wins +1 if and only if Player 1 wins −1.
Finally, assume that the players take alternate turns. Then, one of
the two players must have a strategy to obtain +1.
Remark 3.72. In particular, Zermelo’s theorem implies that if we
ensure that a chess game ends in a finite number of moves with no
draws, then there is some strategy to ensure either white or black
wins. We just don’t know what that strategy is.

3.6 Chapter Notes

Game trees are a special case of decision trees [43], which appear fre-
quently in artificial intelligence. In fact, game play was one of the ear-
liest applications of artificial intelligence – in the classical sense [44].
In this case, the so-called α−β pruning [45] is used to explore a game
tree, allowing a machine player to make optimal moves. Variants of
this approach reached their peak with the creation of Deep Blue [46],
the chess-playing algorithm that defeated Kasparov in 1996, and the
solution of checkers in 2007 [47]. Games with complete information
and no chance moves (such as chess and checkers) are often called
combinatorial games and are covered in depth by Berlekamp, Con-
way, and Guy’s four-volume Winning Ways for Your Mathematical
Plays [48–51].
The challenge with game-tree analysis is the combinatorial explo-
sion of the tree size in most games. Modern deep-learning-based
methods still use a type of tree exploration (often called fictitious
play) but adopt a radically different approach to computing the long-
run payoff estimates compared to the methods used by Deep Blue.
The approach used by Deep Mind to win at Go [52] is effectively a

combination of simulation and payoff function approximation. Fu [41]


offers an excellent summary of the approach from the perspective of
simulation.
Game trees are extensively used in economics, where they are
sometimes referred to as dynamic games – though this is not uni-
versal. Myerson [38] provides a detailed introduction to game theory
from an economics perspective and refers to these games as being
in extensive form, as we do here. A classic example of an economic
game in extensive form is Rosenthal’s centipede game [53]. This game
is interesting because the equilibrium identified by backward induc-
tion is almost never played in practical tests with humans. This type
of investigation falls under the area of behavioral game theory or
behavioral economics [54].
In addition to its use in economics, game theory is also used in
political science. The Battle of the Bismark Sea example comes from
Brams’ book, Game Theory and Politics [40]. Game theory has been
extensively used in military analysis with varying levels of success
(see, e.g., Refs. [55–58]).
– ♠♣♥♦ –

3.7 Exercises

3.1 Compute the number of directed graphs on four vertices. [Hint:


How many different pairs of vertices are there?]

3.2 Using the approach from Example 3.31, derive a strategy for
Player 2 in the rock-paper-scissors game (Example 3.27) assuming
she will attempt to maximize her payoff. Similarly, show that it
doesn’t matter whether Player 1 chooses rock, paper, or scissors in
this game, and thus any strategy for Player 1 is equally good (or
bad).

3.3 Consider a simplified game of tic-tac-toe where the objective is


to fill in a board shown in Fig. 3.16.

[Figure: the small game board used in this exercise, together with a sample filled-in board in which X wins.]
Fig. 3.16. Players in this game try to get two in a row.

Assume that X goes first. Construct the game tree for this game
by assuming that the winner receives +1 while the loser receives −1,
and draws result in 0 for both players. Compute the depth of the
longest path in the game tree. Show that there is a strategy so that
the first player always wins. [Hint: You will need to consider each
position on the board as one of the moves that can be made.]

3.4 On a standard 3 × 3 tic-tac-toe board, compute the length of


the longest path in the game tree. [Hint: Assume you draw in this
game.]

3.5 Consider the information sets as a collection of labels I, and


let ξ : V → I. Write down the constraints that ξ must satisfy so that
this definition of information set is analogous to Definition 3.35.

3.6 Identify the information sets for a regular game of rock-paper-


scissors, and draw the game tree to illustrate the incomplete infor-
mation. You do not need to identify an optimal strategy for either
player.

3.7 Define a strategy for rock-paper-scissors, and show the unique


path through the tree in Fig. 3.5 determined by this strategy. Do the
same for the game tree describing the Battle of the Bismark Sea with
incomplete information.

3.8 Draw a game tree for the following game: At the beginning of
this game, each player antes up $1 into a common pot. Player 1 takes
a card from a randomized (shuffled) deck. After looking at the card,
Player 1 decides whether to raise or fold:
(1) If Player 1 folds, he shows the card to Player 2; if the card is
red, then Player 1 wins the pot and Player 2 loses the pot,
whereas if the card is black, then Player 1 loses the pot and
Player 2 wins the pot.
(2) If Player 1 raises, then Player 1 adds another dollar to the pot,
and Player 2 picks a card and must decide whether to call or
fold.
(a) If Player 2 folds, then the game ends, and Player 1 takes the
money, irrespective of any cards drawn.
(b) If Player 2 calls, then he adds $1 to the pot. Both players
show their cards; if both cards are of the same suit, then
Player 1 wins the pot ($2) and Player 2 loses the pot, whereas
if the cards are of opposite suits, then Player 2 wins the pot
and Player 1 loses.

3.9 Continuing from Exercise 3.8, draw the game tree when we
know that Player 1 is dealt a red card. Illustrate in your drawing
how it is a subtree of the tree you drew in Exercise 3.8. Determine
whether this game is still (i) a game of chance and (ii) whether it is
a complete-information game or not.

3.10 Suppose that players always raise and call in the game defined
in Exercise 3.8. Compute the probability space defined by these
strategies in the game tree you developed.

3.11 Decide whether the strategy (Raise, Call) is an equilibrium


strategy in the game in Example 3.49.

3.12 Show that in rock-paper-scissors with perfect information,


there are three equilibrium strategies.

3.13 Prove Zermelo’s theorem. Can you illustrate a game of this


type? [Hint: Use Theorems 3.67 and 3.53. There are many games of
this type.]
Chapter 4

Games and Matrices:


Normal and Strategic Forms

Chapter Goals: The goal of this chapter is to introduce games


in normal and strategic forms. Games in strategic form are usu-
ally called matrix games. We also discuss strategy vectors and
show how payoffs can be computed using simple matrix arith-
metic. Equilibria are defined in terms of strategy vectors. This
chapter forms the foundation for results in later chapters. An
introduction to the elements of matrix arithmetic needed to
understand this chapter is provided in Appendix A.

4.1 Normal and Strategic Forms

Definition 4.1 (Normal Form). A game in normal form is a triple


G = (P, Σ, π), where P is the player set, Σ = Σ1 × Σ2 × · · · × ΣN
is a discrete strategy space, and π : Σ → RN is a strategy payoff
function.
Remark 4.2. If G = (P, Σ, π) is a normal-form game, then the
function πi : Σ → R is the payoff function for Player Pi and returns
the ith component of the function π.
Remark 4.3. The notation used in Definition 4.1 is identical to that
introduced in Chapter 3. Consequently, it is straightforward to see


that a game in extensive form can be converted into a game in normal


form.
Definition 4.4 (Constant/General-Sum Game). Let G =
(P, Σ, π) be a game in normal form. If there is a constant C ∈ R
so that for all strategy tuples (σ_1, . . . , σ_N) ∈ Σ, we have

∑_{i=1}^{N} π_i(σ_1, . . . , σ_N) = C,    (4.1)

then G is called a constant-sum game. If C = 0, then G is called


a zero-sum game. Any game that is not constant sum is called a
general-sum game.
Example 4.5. This example is a variation on the one by Brian
Burke [59] on his blog. A North American Football play (in which
the score does not change) is an example of a zero-sum game when
the payoff is measured by yards gained or lost. In this game, there
are two players: the Offense (P1 ) and the Defense (P2 ). The Offense
may choose between two strategies:
Σ1 = {Pass, Run}. (4.2)
The Defense may choose between three strategies:
Σ2 = {Pass Defense, Run Defense, Blitz}. (4.3)
The yards gained by the Offense are lost by the Defense. Suppose the
following payoff function (in terms of yards gained or lost by each
player) π is defined:
π(Pass, Pass Defense) = (−2, 2),
π(Pass, Run Defense) = (8, −8),
π(Pass, Blitz) = (−4, 4),
π(Run, Pass Defense) = (6, −6),
π(Run, Run Defense) = (−2, 2),
π(Run, Blitz) = (5, −5).
If P = {P1 , P2 } and Σ = Σ1 × Σ2 , then the tuple G = (P, Σ, π) is a
zero-sum game in normal form. Note that each pair in the definition
of the payoff function sums to zero.

Remark 4.6. Just as in a game in extensive form, we can define an


equilibrium. This definition is identical to the definition we gave in
Definition 3.63.
Definition 4.7 (Equilibrium). A strategy (σ_1^*, . . . , σ_N^*) ∈ Σ is an
equilibrium if, for all i,

π_i(σ_1^*, . . . , σ_i^*, . . . , σ_N^*) ≥ π_i(σ_1^*, . . . , σ_i, . . . , σ_N^*),

where σ_i ∈ Σ_i.

4.2 Strategic-Form Games

Remark 4.8. For the remainder of this book, we assume that the
reader is familiar with matrix arithmetic. A review of all the neces-
sary facts can be found in Appendix A. It is worth noting that we
use some notation common to operations research for convenience.
The ith row of a matrix A ∈ Rm×n (i.e., an m × n rectangular array
of real numbers) is denoted Ai· , while the jth column of that matrix
is denoted A·j . This notation appears throughout the remainder of
the book.
Definition 4.9 (Strategic Form – Two-Player Games). Let
G = (P, Σ, π) be a normal-form game, with P = {P1 , P2 } and Σ =
Σ1 × Σ2 . Suppose the strategies in Σi (i = 1, 2) are ordered so that
Σ_i = {σ_1^i, . . . , σ_{n_i}^i} (i = 1, 2). Then, there are two matrices A, B ∈
R^{n_1×n_2} so that

A_{rc} = π_1(σ_r^1, σ_c^2),
B_{rc} = π_2(σ_r^1, σ_c^2).

That is, the (r, c) element of the matrices are given by the payoff
functions. Then, the tuple G = (P, Σ, A, B) is a two-player game in
strategic form.
Remark 4.10. Games with two players given in strategic form are
also sometimes called matrix games or bimatrix games because they
are defined completely by matrices. Note also that, by convention,
Player P1 ’s strategies correspond to the rows of the matrices, while
Player P2 ’s strategies correspond to the columns of the matrices.

Example 4.11. Consider the two-player game defined in the Battle


of the Bismark Sea. If we assume that the strategies for the players
are
Σ1 = {Sail North, Sail South},
Σ2 = {Search North, Search South},
then the payoff matrices for the two players are

A = ⎡ −2  −1 ⎤
    ⎣ −2  −3 ⎦,

B = ⎡ 2  1 ⎤
    ⎣ 2  3 ⎦.
Here, the rows represent the different strategies of Player 1, and
the columns represent the strategies of Player 2. Thus, the (1, 1)
entry in the matrix A is the payoff to Player 1 when the strat-
egy pair (Sail North, Search North) is played. The (2, 1) entry
in matrix B is the payoff to Player 2 when the strategy pair
(Sail South, Search North) is played, etc. Note in this case that
A = −B. This is because the Battle of the Bismark Sea is a zero-sum
game.
Example 4.12 (Chicken). Consider the following two-player game:
Two cars face each other and begin driving (quickly) toward each
other (see Fig. 4.1.). The player who swerves first loses 1 point,
while the other player wins 1 point. If both players swerve, then
each receives 0 points. If neither player swerves, a bad crash occurs
and both players lose 10 points. (In reality, a crash is worse than
losing points, but we must assign a numeric value.)
Assuming that the strategies for Player 1 are in the rows and the
strategies for Player 2 are in the columns, then the two matrices for
the players are

Fig. 4.1. Illustration of a game of Chicken.



                   Swerve      Don't Swerve
Player 1
  Swerve              0             −1
  Don't Swerve        1            −10

Player 2
  Swerve              0              1
  Don't Swerve       −1            −10

From this, we can see that the matrices are

A = ⎡ 0   −1 ⎤
    ⎣ 1  −10 ⎦,

B = ⎡  0    1 ⎤
    ⎣ −1  −10 ⎦.
Note that Chicken is not a zero-sum game; it is a general-sum game.

Remark 4.13. Chicken (sometimes called Hawk-Dove or Snowdrift)


can be generalized in the sense that we could write the payoff matrix
for Player 1 as

A = ⎡ T  L ⎤
    ⎣ W  X ⎦,
where W > T > L > X are arbitrary values. As expected, the payoff
matrix for Player 2 is then B = AT . For our examples, we use the
numerical payoff matrices and leave generalization to the reader as
appropriate.
Remark 4.14. Definition 4.9 can be extended to N -player games.
However, we no longer have matrices with payoff values for various
strategies. Instead, we construct N N -dimensional arrays. So, a game
with 3 players yields 3 arrays with dimension 3. This is illustrated in
Fig. 4.2.
Multidimensional arrays are easy to represent on computers but
difficult to represent on paper. They have multiple indices instead of
just one index like a vector or two indices like a matrix. The elements
of the array for Player i store the various payoffs for Player i under
different strategic combinations of the different players. If there are

[Figure: a three-dimensional array whose axes are labeled "Strategies for Player 1", "Strategies for Player 2", and "Strategies for Player 3"; the entries are payoff values.]

Fig. 4.2. A three-dimensional array is like a matrix with an extra dimension.


They are difficult to capture on a page. The elements of the array for Player i
store the various payoffs for Player i under different strategic combinations of
the different players. If there are three players, then there will be three different
arrays.

three players, then there will be three different arrays, one for each
player.

Remark 4.15. The normal form of a (two-player) game is essentially


the recipe for transforming a game in extensive form into a game in
strategic form. Any game in extensive form can be transformed in
this way, and the strategic form can be analyzed. Reasons for doing
this include the fact that the strategic form is substantially more
compact. However, it can be complex to compute if the size of the
game tree in extensive form is very large.

Definition 4.16 (Zero-Sum Matrix Game). Suppose G =


(P, Σ, A, B) is a game in strategic form and B = −A, so the game
is zero sum. Then, we denote such a game as G = (P, Σ, A), because
the information in B is redundant.

Definition 4.17 (Symmetric Game). Let G = (P, Σ, A, B). If


A = BT , then G is called a symmetric game.

Example 4.18. Chicken is an example of a symmetric game.



4.3 Strategy Vectors and Matrix Games

Remark 4.19. Our next proposition relates the strategy set Σ to


pairs of standard basis vectors and reduces the payoff function com-
putation to simple matrix multiplication. Recall that
e_i = ⟨0, 0, . . . , 0, 1, 0, 0, . . . , 0⟩  (with i − 1 zeros before the 1 and n − i zeros after it)

is the ith standard basis vector, as discussed in Appendix A.


Proposition 4.20. Let G = (P, Σ, A, B) be a two-player game in
strategic form with Σ_1 = {σ_1^1, . . . , σ_m^1} and Σ_2 = {σ_1^2, . . . , σ_n^2}. If
Player P_1 chooses strategy σ_r^1 and Player P_2 chooses strategy σ_c^2,
then

π_1(σ_r^1, σ_c^2) = e_r^T A e_c,    (4.4)
π_2(σ_r^1, σ_c^2) = e_r^T B e_c.    (4.5)
Proof. For any matrix A ∈ Rm×n , Aec returns column c of
matrix A, that is, A·c . Likewise, eTr A·c is the rth element of this
vector. Thus, eTr Aec is the (r, c)th element of the matrix A. By defi-
nition, this must be the payoff for the strategy pair (σr1 , σc2 ) for Player
P1 . A similar argument follows for Player P2 and matrix B. 
Remark 4.21. Proposition 4.20 says that for two-player matrix
games, we can relate any choice of strategy that Player Pi makes
with a unit vector. Thus, we can define the payoff function in terms
of vector and matrix multiplication. We will see that this can be
generalized to cases when the strategies of the players are not repre-
sented by standard basis vectors.
Example 4.22. Consider the game of Chicken. Suppose Player P1
decides to swerve, while Player P2 decides not to swerve. Then, we
can represent the strategy of Player P_1 by the vector

e_1 = ⎡ 1 ⎤
      ⎣ 0 ⎦,

while the strategy of Player P_2 is represented by the vector

e_2 = ⎡ 0 ⎤
      ⎣ 1 ⎦.

Recall that the payoff matrices for this game are

A = ⎡ 0   −1 ⎤
    ⎣ 1  −10 ⎦,

B = ⎡  0    1 ⎤
    ⎣ −1  −10 ⎦.

Then, we can compute

π_1(Swerve, Don't Swerve) = e_1^T A e_2 = [1 0] A [0 1]^T = −1,
π_2(Swerve, Don't Swerve) = e_1^T B e_2 = [1 0] B [0 1]^T = 1.

We can also consider the case when both players swerve. Then, we can
represent the strategies of both Players by e_1. In this case, we have

π_1(Swerve, Swerve) = e_1^T A e_1 = [1 0] A [1 0]^T = 0,
π_2(Swerve, Swerve) = e_1^T B e_1 = [1 0] B [1 0]^T = 0.
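With a numerical library, these computations are one-liners. The following sketch is ours and uses NumPy (which the text does not assume); it evaluates Eqs. (4.4) and (4.5) for Chicken:

```python
import numpy as np

A = np.array([[0, -1], [1, -10]])     # Player 1's payoffs (rows: P1 moves)
B = np.array([[0, 1], [-1, -10]])     # Player 2's payoffs

def e(i, n=2):
    """Standard basis vector e_{i+1} of R^n (0-based index for convenience)."""
    return np.eye(n)[i]

# Proposition 4.20: pi_1 = e_r^T A e_c and pi_2 = e_r^T B e_c.
print(e(0) @ A @ e(1), e(0) @ B @ e(1))   # -1.0 1.0  -> (Swerve, Don't Swerve)
print(e(0) @ A @ e(0), e(0) @ B @ e(0))   #  0.0 0.0  -> (Swerve, Swerve)
```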

Remark 4.23. We now define equilibria in terms of matrix opera-


tions.
Proposition 4.24 (Equilibrium). Let G = (P, Σ, A, B) be a two-
player game in strategic form with Σ = Σ1 × Σ2 . The expressions

e_i^T A e_j ≥ e_k^T A e_j   ∀k ≠ i    (4.6)

and

e_i^T B e_j ≥ e_i^T B e_l   ∀l ≠ j    (4.7)

hold if and only if (σ_i^1, σ_j^2) ∈ Σ_1 × Σ_2 is an equilibrium strategy.



Proof. From Proposition 4.20, we know that

π_1(σ_i^1, σ_j^2) = e_i^T A e_j,    (4.8)
π_2(σ_i^1, σ_j^2) = e_i^T B e_j.    (4.9)

From Eq. (4.6), we know that for all k ≠ i,

π_1(σ_i^1, σ_j^2) ≥ π_1(σ_k^1, σ_j^2).    (4.10)

From Eq. (4.7), we know that for all l ≠ j,

π_2(σ_i^1, σ_j^2) ≥ π_2(σ_i^1, σ_l^2).    (4.11)

Thus, from Definition 4.7, it is clear that (σ_i^1, σ_j^2) ∈ Σ is an equilib-
rium strategy. The converse is clear from this as well. 

rium strategy. The converse is clear from this as well. 
Remark 4.25. We can now think of relating a strategy choice for
player i, σki ∈ Σi , with the unit vector ek . From context, we will be
able to identify to which player’s strategy vector ek corresponds.

4.4 Chapter Notes

Matrices (in the form used in this chapter) are a relatively recent
invention, though arrays of numbers have been used for centuries.
The term matrix was first used by J. J. Sylvester in Ref. [60]. At this
time, matrices were not used for computational simplicity, as they
are today. In the following chapters, we will see how the matrix rep-
resentation of games can simplify our analysis.
Some texts list both the matrices A and B together. In this form,
for example, Chicken is described by the table

⎡ (0, 0)    (−1, 1)    ⎤
⎣ (1, −1)   (−10, −10) ⎦.

This can be compact but has the potential to cause confusion. As a


rule, the row player is always player one, while the column player is
always player two.
Interestingly, Chicken formed the basis for the early American
and Soviet nuclear policies under the concept of Mutual Assured
Destruction (MAD) [27]. In the MAD policy, the idea was to build

in an automated nuclear response to any attack, thus ensuring that


“Don’t Swerve” (i.e., attack) was played by both players (the United
States and the Soviet Union) automatically. In this way, no side act-
ing rationally would ever execute a first strike because the result
would be catastrophic.
– ♠♣♥♦ –

4.5 Exercises

4.1 Compute the payoff matrices for Example 4.5.

4.2 Construct payoff matrices for rock-paper-scissors. Also, con-


struct the normal form of the game.

4.3 Compute the strategic form of the two-player coin-flipping


game using the expected payoff function defined in Example 3.62.

4.4 Confirm that (e1 , e2 ) is an equilibrium strategy for the game


of Chicken using the numeric payoff matrices given in Example 4.12.
Does this generalize to the case with arbitrary values W , T , L, and X,
as in Remark 4.13? Use symmetry to find a second pure strategy
equilibrium.
Chapter 5

Saddle Points, Mixed Strategies, and Nash Equilibria

Chapter Goals: In this chapter, we focus on equilibria in


games. We introduce the concept of a Nash equilibrium, which
fully generalizes the previous notions of equilibria we have dis-
cussed. We study dominated strategies and their relationship
with Nash equilibria. We then prove the minimax theorem and
study Nash’s original proofs of the existence of Nash equilibria.
The results in this chapter are used as a basis for the remainder
of the book, which uses techniques from optimization to find
equilibria.

Remark 5.1 (Notational Remark). For the remainder of the book, unless otherwise noted, we assume that G = (P, Σ, π) is a game in normal form, with P = {P1 , . . . , PN } and Σi = {σ1i , . . . , σni i }. When we discuss two-player games, we assume that Σ = Σ1 × Σ2 , Σ1 = {σ11 , . . . , σm1 }, and Σ2 = {σ12 , . . . , σn2 }. Therefore, a zero-sum game is the tuple G = (P, Σ, A), with A ∈ Rm×n , while a bimatrix game is the tuple G = (P, Σ, A, B), with A, B ∈ Rm×n .


5.1 Equilibria in Zero-Sum Games: Saddle Points

Theorem 5.2. Let G = (P, Σ, A) be a zero-sum two-player game.


A strategy pair, (ei , ej ), is an equilibrium strategy if and only if
eTi Aej = max min Akl = min max Akl . (5.1)
k∈{1,...,m} l∈{1,...,n} l∈{1,...,n} k∈{1,...,m}

Example 5.3. Before we prove Theorem 5.2, let’s first con-


sider an example from television before streaming. Two net-
work corporations believe that there are 100M viewers to be
had during Thursday night prime time (8–9 pm). The cor-
porations must decide which type of programming to run:
science fiction, drama, or comedy. If the two networks initially split
the 100M viewers evenly, we can think of the payoff matrix as deter-
mining how many excess viewers the networks’ strategies will yield
over 50M . The payoff matrix (in millions) for Network 1 is shown in
Eq. (5.2):
A = \begin{bmatrix} -14 & -34 & 11 \\ -4 & 9 & 0 \\ -11 & -35 & 21 \end{bmatrix}.  (5.2)
That is, if Network 1 and Network 2 both choose strategy one, then
Network 1 has 36 = 50 − 14 million viewers and Network 2 has
64 = 50 + 14 million viewers. The expression
min max Akl
l∈{1,...,n} k∈{1,...,m}

asks us to compute the maximum value in each column to create the


set
Cmax = {c∗l = max{Akl : k ∈ {1, . . . , m}} : l ∈ {1, . . . , n}}
and then choose the smallest value in this case. If we look at this
matrix, the column maximums are

−4 9 21 .
We then choose the minimum value in this case, which is −4. This
value occurs at position (2, 1).
The expression
max min Akl
k∈{1,...,m} l∈{1,...,n}

asks us to compute the minimum value in each row to create the set

Rmin = {rk∗ = min{Akl : l ∈ {1, . . . , n}} : k ∈ {1, . . . , m}}

and then choose the largest value in this case. Again, if we look at
the matrix in Eq. (5.2), we see that the minimum values in the rows
are
\begin{bmatrix} -34 \\ -4 \\ -35 \end{bmatrix}.

The largest value in this set is −4. Again, this value occurs at position (2, 1). This process is captured in Fig. 5.1. The two values are equal, and so by Theorem 5.2, the equilibrium strategy pair is (e2 , e1 ), which returns the value in the second row and first column to Network 1. Network 2 receives the negative of this value.
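The row-minimum and column-maximum bookkeeping in this example is mechanical, so it is easily automated. The following Python sketch (a minimal illustration assuming NumPy; the matrix is the one from Eq. (5.2)) computes the maximin and minimax values and reports the saddle point when they agree.

import numpy as np

A = np.array([[-14, -34, 11],
              [ -4,   9,  0],
              [-11, -35, 21]])

row_mins = A.min(axis=1)        # [-34  -4 -35]
col_maxs = A.max(axis=0)        # [ -4   9  21]
maximin = row_mins.max()        # -4
minimax = col_maxs.min()        # -4

if maximin == minimax:
    i = int(row_mins.argmax())  # row attaining the maximin
    j = int(col_maxs.argmin())  # column attaining the minimax
    print("equilibrium at row", i + 1, "column", j + 1, "with value", A[i, j])
else:
    print("no saddle point in pure strategies")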
Remark 5.4. To understand why this works, consider the following
logic. The row player (Player 1) knows that Player 2 (the column
player) is trying to maximize her (Player 2’s) payoff. Since this is
a zero-sum game, any increase in Player 2’s payoff will come at the
expense of Player 1. So, Player 1 looks at each row independently
(since he chooses rows) and asks, “What is the worst possible out-
come I could see if I played a strategy corresponding to this row?”
Having obtained these worst possible scenarios, he chooses the row
with the highest value.

[Figure 5.1: the payoff matrix of Eq. (5.2) annotated with its row minima and column maxima; maxmin = minmax = −4.]
Fig. 5.1. The minimax analysis of the game of competing networks. The
row player knows that Player 2 (the column player) is trying to maximize her
(Player 2’s) payoff. Thus, Player 1 asks: “What is the worst possible outcome I
could see if I played a strategy corresponding to this row?” Having obtained these
worst possible scenarios, he chooses the row with the highest value. Player 2 does
something similar in columns.

Player 2 faces a similar problem. She knows that Player 1 wishes


to maximize his payoff and that any gain will come at her expense. So,
Player 2 looks across each column of the matrix A and asks, “What
is the best possible score Player 1 can achieve if I (Player 2) choose
to play the strategy corresponding to the given column?” Remember
that the negation of this value will be Player 2’s payoff in this case.
Having done that, Player 2 then chooses the column that minimizes
this value and thus maximizes her payoff. If these two values are
equal, then the theorem claims that the resulting strategy pair is an
equilibrium.
Remark 5.5. We are now ready to prove Theorem 5.2.

Proof of Theorem 5.2. (⇒) Suppose that (ei , ej ) is an equilibrium


solution. Then, we know that

eTi Aej ≥ eTk Aej ,


eTi (−A)ej ≥ eTi (−A)el ,

for all k ∈ {1, . . . , m} and l ∈ {1, . . . , n}. We can write this as

eTi Aej ≥ eTk Aej (5.3)

and

eTi Aej ≤ eTi Ael . (5.4)

We know that eTi Aej = Aij and that Eq. (5.3) holds if and only if

Aij ≥ Akj , (5.5)

for all k ∈ {1, . . . , m}. That is, the element i must be maximal in the
column A·j . Note that for a fixed row k ∈ {1, . . . , m},

Akj ≥ min{Akl : l ∈ {1, . . . , n}}.

This means that if we compute the minimum value in a row k, then


the value in column j, Akj , must be at least as large as that minimal
value. Combining this with Eq. (5.5), we conclude that for each row
k ∈ {1, . . . , m},

Aij ≥ min{Akl : l ∈ {1, . . . , n}}. (5.6)



However, Eq. (5.6) implies that


eTi Aej = Aij = max min Akl . (5.7)
k∈{1,...,m} l∈{1,...,n}

Likewise, Eq. (5.4) holds if and only if


Aij ≤ Ail , (5.8)
for all l ∈ {1, . . . , n}. Arguing as before, for a fixed column l ∈
{1, . . . , n}, we have
Ail ≤ max{Akl : k ∈ {1, . . . , m}}.
This means that if we compute the maximum value in a column l,
then the value in row i, Ail , must not exceed that maximal value.
Combining this with Eq. (5.8) gives
Aij ≤ max{Akl : k ∈ {1, . . . , m}}. (5.9)
However, Eq. (5.9) implies that
eTi Aej = Aij = min max Akl . (5.10)
l∈{1,...,n} k∈{1,...,m}

Thus, it follows that


Aij = eTi Aej = max min Akl = min max Akl .
k∈{1,...,m} l∈{1,...,n} l∈{1,...,n} k∈{1,...,m}

(⇐) To prove the converse, suppose that


eTi Aej = max min Akl = min max Akl .
k∈{1,...,m} l∈{1,...,n} l∈{1,...,n} k∈{1,...,m}

Consider the quantity


eTk Aej = Akj .
The fact that
Aij = max min Akl
k∈{1,...,m} l∈{1,...,n}

implies that Aij ≥ Akj for any k ∈ {1, . . . , m}. To see this, remember
that
Cmax = {c∗l = max{Akl : k ∈ {1, . . . , m}} : l ∈ {1, . . . , n}} (5.11)

and Aij ∈ Cmax by construction. Thus, it follows that


eTi Aej ≥ eTk Aej ,
for any k ∈ {1, . . . , m}. By a similar argument, we know that
Aij = min max Akl ,
l∈{1,...,n} k∈{1,...,m}

which implies that Aij ≤ Ail for any l ∈ {1, . . . , n}. To see this,
remember that
Rmin = {rk∗ = min{Akl : l ∈ {1, . . . , n}} : k ∈ {1, . . . , m}}
and Aij ∈ Rmin by construction. Thus, it follows that
eTi Aej ≤ eTi Ael ,
for any l ∈ {1, . . . , n}. We conclude that (ei , ej ) is an equilibrium
solution. This completes the proof. 
Theorem 5.6. Suppose that G = (P, Σ, A) is a zero-sum two-player
game. Let (ei , ej ) be an equilibrium strategy pair for this game. If
(ek , el ) is a second equilibrium strategy pair, then
Aij = Akl = Ail = Akj .
Definition 5.7 (Saddle Point). Let G = (P, Σ, A) be a zero-sum
two-player game. If (ei , ej ) is an equilibrium, then it is called a saddle
point.
Definition 5.8 (Game Value). Let G = (P, Σ, A) be a zero-sum
game. If there exists a strategy pair (ei , ej ) so that
max min Akl = min max Akl ,
k∈{1,...,m} l∈{1,...,n} l∈{1,...,n} k∈{1,...,m}

then
VG = eTi Aej (5.12)
is the value of the game.
Remark 5.9. We see that we can define the value of a zero-sum
game even when there is no equilibrium point in strategies in Σ.
Using Theorem 5.6, we can see that this value is unique, that is, any
equilibrium pair for a game will yield the same value for a zero-sum
game. This is not the case in a general-sum game.

5.2 Zero-Sum Games without Saddle Points

Remark 5.10. Not all games have saddle points of the kind found in
Example 5.3. The easiest way to show that this is true is to illustrate
it with an example.

Example 5.11. In August 1944, after the invasion of Normandy, the


Allies broke out of their beachhead at Avranches, France, and headed
toward the main part of the country (see Fig. 5.2).1 The German
General von Kluge, commander of the ninth army, faced two
options:

[Figure 5.2: a map sketch of the breakout at Avranches, showing Bradley's options (Reinforce Gap, Move East, Wait) and von Kluge's options (Attack, Retreat).]
Fig. 5.2. In August 1944, the allies broke out of their beachhead at Avranches.
Each commander faced several troop movement choices. These choices can be
modeled as a game. (Diagrammed troop movements are approximate.)

1 This example is discussed in detail by Brams in Ref. [40].

Table 5.1. Game matrix of the battle of Avranches showing that this game has no saddle-point solution. There is no position in the matrix where an element is simultaneously the maximum value in its column and the minimum value in its row.

                          von Kluge's Strategies
Bradley's Strategy        Attack    Retreat    Row Min
Reinforce Gap                2         3          2
Move East                    1         5          1
Wait                         6         4          4
Column Max                   6         5       maxmin = 4
                                               minmax = 5

(1) Stay and attack the advancing Allied armies.


(2) Withdraw into the mainland and regroup.
Simultaneously, General Bradley, commander of the Allied ground
forces, faced a similar set of options regarding the German ninth
army:
(1) Reinforce the gap between the US and Canadian forces created
by troop movements at Avranches.
(2) Send his forces east to cut off a German retreat.
(3) Do nothing and wait a day to see what the adversary did.
The player set P consists of Bradley (Player 1) and von Kluge
(Player 2). The strategy sets are:
Σ1 = {Reinforce the gap, Send forces east, Wait},
Σ2 = {Attack, Retreat}.
In real life, there were no obvious pay-off values; however, General
Bradley’s diary indicates the scenarios he preferred in order. There
are six possible scenarios, i.e., six elements in Σ = Σ1 × Σ2 . Bradley
ordered them from most to least preferable, and using this ranking,
we can construct the game matrix shown in Table 5.1.
Note that the maximin2 value of the rows is not equal to the
minimax value of the columns. This is indicative of the fact that

2 The maxmin value is usually written maximin. Similarly, the minmax value is written minimax.

[Figure 5.3: the best-response cycle through the outcomes (Retreat, Wait) = (4, −4), (Retreat, Move East) = (5, −5), (Attack, Move East) = (1, −1), and (Attack, Wait) = (6, −6).]

Fig. 5.3. The Payoff values cause a cycle to occur in pure strategies.

there is not a pair of strategies that form an equilibrium for this


game.
To illustrate this more clearly, suppose that von Kluge plays his
minimax strategy to retreat. Then, Bradley would do better to not
play his maximin strategy (wait) and instead move east, cutting
off von Kluge’s retreat, thus obtaining a payoff of (5, −5). But von
Kluge would realize this and deduce that he should attack, which
would yield a payoff of (1, −1). However, Bradley could deduce this
as well and would know to play his maximin strategy (wait), which
yields a payoff of (6, −6). However, von Kluge would realize that
this would occur, in which case he would decide to retreat, yielding
a payoff of (4, −4). The cycle then repeats. This logic is illustrated
in Fig. 5.3.
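The absence of a saddle point, and the cycle of Fig. 5.3, can be reproduced numerically. In the Python sketch below (assuming NumPy; the matrix is Bradley's payoff from Table 5.1), each commander in turn switches to his pure best response against the other's current choice.

import numpy as np

# Rows: Reinforce Gap, Move East, Wait.  Columns: Attack, Retreat.
A = np.array([[2, 3],
              [1, 5],
              [6, 4]])

print(A.min(axis=1).max(), A.max(axis=0).min())   # maximin = 4, minimax = 5

row, col = 2, 1                      # start at (Wait, Retreat): payoff 4 to Bradley
for _ in range(4):
    row = int(A[:, col].argmax())    # Bradley's best response to von Kluge's column
    print("Bradley plays row", row, "-> payoff", A[row, col])
    col = int(A[row, :].argmin())    # von Kluge's best response to Bradley's row
    print("von Kluge plays column", col, "-> payoff", A[row, col])
# The responses repeat (Move East, Attack, Wait, Retreat, ...) and never settle down.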

5.3 Mixed Strategies

Remark 5.12. Heretofore, we have assumed that Player Pi will


deterministically choose a strategy in Σi . It’s possible, however, that
Player Pi might choose a strategy at random. In this case, we assign
a probability to each strategy in Σi .

Definition 5.13 (Mixed Strategy). A mixed strategy for Player


Pi ∈ P is a discrete probability distribution function ρi defined over
the sample space Σi . That is, we can define a discrete probability
space (Σi , FΣi , ρi ) where Σi is the discrete sample space, FΣi is the
power set of Σi , and ρi is the discrete probability function that assigns
probabilities to events in FΣi .

Table 5.2. The payoff matrix for Player P1 in rock-paper-scissors. This payoff matrix can be derived from Fig. 3.5.

              Rock    Paper    Scissors
Rock            0      −1         1
Paper           1       0        −1
Scissors       −1       1         0

Remark 5.14. We assume that players choose their mixed strategies


independently. Thus, we can compute the probability of a strategy
element, (σ 1 , . . . , σ N ) ∈ Σ, as

ρ(σ 1 , . . . , σ N ) = ρ1 (σ 1 )ρ2 (σ 2 ) · · · ρN (σ N ). (5.13)

Using this, we can define a discrete probability distribution over the


sample space Σ as (Σ, FΣ , ρ). Define Πi as a random variable that
maps Σ into R so that Πi returns the payoff to Player Pi as a result
of the random outcome (σ 1 , . . . , σ N ). Therefore, the expected payoff
for Player Pi for a given mixed strategy (ρ1 , . . . , ρN ) is

E(\Pi_i) = \sum_{\sigma^1 \in \Sigma_1} \sum_{\sigma^2 \in \Sigma_2} \cdots \sum_{\sigma^N \in \Sigma_N} \pi_i(\sigma^1, \ldots, \sigma^N)\, \rho_1(\sigma^1)\, \rho_2(\sigma^2) \cdots \rho_N(\sigma^N).

Example 5.15. Consider the rock-paper-scissors game. The payoff


matrix for Player 1 is given in Table 5.2.
Suppose that each strategy is chosen with a probability of 1/3 by
each player. Then, the expected payoff to Player P1 with this strategy
is

E(\Pi_1) = \tfrac{1}{3}\cdot\tfrac{1}{3}\,\pi_1(\text{Rock},\text{Rock}) + \tfrac{1}{3}\cdot\tfrac{1}{3}\,\pi_1(\text{Rock},\text{Paper}) + \tfrac{1}{3}\cdot\tfrac{1}{3}\,\pi_1(\text{Rock},\text{Scissors}) + \tfrac{1}{3}\cdot\tfrac{1}{3}\,\pi_1(\text{Paper},\text{Rock}) + \tfrac{1}{3}\cdot\tfrac{1}{3}\,\pi_1(\text{Paper},\text{Paper}) + \tfrac{1}{3}\cdot\tfrac{1}{3}\,\pi_1(\text{Paper},\text{Scissors}) + \tfrac{1}{3}\cdot\tfrac{1}{3}\,\pi_1(\text{Scissors},\text{Rock}) + \tfrac{1}{3}\cdot\tfrac{1}{3}\,\pi_1(\text{Scissors},\text{Paper}) + \tfrac{1}{3}\cdot\tfrac{1}{3}\,\pi_1(\text{Scissors},\text{Scissors}) = 0.

We can likewise compute the same value, E(\Pi_2), for Player P2 .
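The sum above is tedious by hand but immediate by machine. A minimal Python sketch (assuming NumPy, with the payoff matrix of Table 5.2):

import numpy as np

A = np.array([[ 0, -1,  1],          # Player 1's payoffs for rock-paper-scissors
              [ 1,  0, -1],
              [-1,  1,  0]])

rho1 = np.array([1/3, 1/3, 1/3])     # Player 1's mixed strategy
rho2 = np.array([1/3, 1/3, 1/3])     # Player 2's mixed strategy

expected = sum(A[i, j] * rho1[i] * rho2[j]
               for i in range(3) for j in range(3))
print(expected)                      # 0.0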
Remark 5.16. For the remainder of this book, we will be dealing with vectors of the form x = ⟨x1 , . . . , xm ⟩ ∈ Rm . In general, a bold lower-case symbol is a vector, and corresponding non-bold and indexed symbols are its entries. Vectors are discussed in Definition A.9 in Appendix A.
Definition 5.17 (Mixed-Strategy Vector). To any mixed strategy, ρi , for Player Pi , we may associate a mixed-strategy vector, xi = ⟨xi1 , . . . , xini ⟩, where

x_k^i = \rho_i(\sigma_k^i).

Remark 5.18. From Definition 5.17, we can deduce that if xi is a


mixed-strategy vector, then it must satisfy:
(1) xij ≥ 0 for j ∈ {1, . . . , ni },
(2) \sum_{j=1}^{n_i} x_j^i = 1.

Moreover, these two properties are sufficient to ensure that we are


defining a mathematically correct probability distribution over the
strategy set Σi .
Definition 5.19 (Player Mixed-Strategy Space). The set

\Delta_{n_i} = \left\{ \langle x_1, \ldots, x_{n_i} \rangle \in \mathbb{R}^{n_i} : \sum_{i=1}^{n_i} x_i = 1;\ x_i \geq 0,\ i = 1, \ldots, n_i \right\}  (5.14)

is the mixed-strategy space in ni dimensions for Player Pi .

Fig. 5.4. In three-dimensional space, Δ3 is the face of a tetrahedron. In four-
dimensional space, it would be a tetrahedron, which would itself be the face of a
four-dimensional object.

Remark 5.20. There is a pleasant geometry to the space Δn , which


is usually called the unit simplex embedded in Rn . In three dimen-
sions, for example, the space is an equilateral triangle. (See Fig. 5.4.)
Remark 5.21. The space Δn is always an n − 1-dimensional object
that is embedded in (or lives in) Rn . For this reason, some authors
write Δn−1 for Δn . Other authors use Sn or Sn−1 , where S denotes
simplex. For pedagogic purposes, we have chosen to use Δn to remind
the reader that there are n elements in a vector in Δn . More advanced
texts may use different notation.
Definition 5.22 (Pure-Strategy Vector). The standard basis
vector ej ∈ Δni corresponds to the strategy σji ∈ Σi and, as such, is
called a pure-strategy vector, or sometimes simply a pure strategy.
Definition 5.23 (Mixed-Strategy Space). The mixed-strategy
space for the game G is the set
Δ = Δn1 × Δn2 × · · · × ΔnN . (5.15)
Definition 5.24 (Mixed-Strategy Payoff Function). The expected payoff function, written in terms of the tuple of mixed-strategy vectors (x1 , . . . , xN ), is

u_i(\mathbf{x}^1, \ldots, \mathbf{x}^N) = \sum_{i_1=1}^{n_1} \sum_{i_2=1}^{n_2} \cdots \sum_{i_N=1}^{n_N} \pi_i(\sigma_{i_1}^{1}, \ldots, \sigma_{i_N}^{N})\, x_{i_1}^{1} x_{i_2}^{2} \cdots x_{i_N}^{N}.  (5.16)

Here, xji is the ith element of the vector xj . The function ui : Δ →


R, defined in Eq. (5.16), is the mixed-strategy payoff function for
Player Pi .
Example 5.25. For rock-paper-scissors, since each player has three
strategies, n = 3 and Δ3 consists of the vectors x1 , x2 , x3  so that
x1 , x2 , x3 ≥ 0 and x1 + x2 + x3 = 1. For example, the vectors
\mathbf{x} = \mathbf{y} = \begin{bmatrix} 1/3 \\ 1/3 \\ 1/3 \end{bmatrix}

are mixed strategies for Players 1 and 2, respectively, that instruct


the players to play rock 1/3 of the time, paper 1/3 of the time, and
scissors 1/3 of the time.
Definition 5.26 (Nash Equilibrium). A Nash equilibrium is a tuple of mixed strategies, (x1∗ , . . . , xN ∗ ) ∈ Δ, so that for all i ∈ {1, . . . , N },

u_i(\mathbf{x}^{1*}, \ldots, \mathbf{x}^{i*}, \ldots, \mathbf{x}^{N*}) \geq u_i(\mathbf{x}^{1*}, \ldots, \mathbf{x}^{i}, \ldots, \mathbf{x}^{N*}),  (5.17)

for all xi ∈ Δni . If the inequality in the definition is strict, then it is a strict Nash equilibrium.
Remark 5.27. What Definition 5.26 states is that a tuple of mixed strategies, (x1∗ , . . . , xN ∗ ), is a Nash equilibrium if no player has any reason to deviate unilaterally from her mixed strategy.
Remark 5.28 (Notational Remark). In many texts, it becomes
cumbersome in N -player games to denote the mixed-strategy tuple
(x1 , . . . , xN ), especially since we are usually only interested in the
arbitrary Player Pi . To deal with this, textbooks sometimes adopt
the notation (xi , x−i ). Here, xi is the mixed strategy for Player Pi ,
while x−i denotes the mixed-strategy tuple for the other players (who
are not Player Pi ). When expressed this way, Eq. (5.17) is written as
u_i(\mathbf{x}^{i*}, \mathbf{x}^{-i*}) \geq u_i(\mathbf{x}^{i}, \mathbf{x}^{-i*}),

for all i ∈ {1, . . . , N }. While notationally convenient, we restrict


our attention to two-player games, so this will generally not be
necessary.

Proposition 5.29. Let G = (P, Σ, A, B) be a two-player matrix


game. Let x ∈ Δm and y ∈ Δn be mixed strategies for Play-
ers 1 and 2, respectively. Then,
u1 (x, y) = xT Ay, (5.18)
u2 (x, y) = xT By. (5.19)
Proof. For simplicity, let x = x1 , . . . , xm  and y = y1 , . . . , yn .
We know that π1 (σi1 , σj2 ) = Aij . Simple matrix multiplication yields

xT A = xT A·1 · · · xT A·n .
That is, xT A is a row vector whose jth element is xT A·j . For fixed j,
we have
\mathbf{x}^T A_{\cdot j} = x_1 A_{1j} + x_2 A_{2j} + \cdots + x_m A_{mj} = \sum_{i=1}^{m} \pi_1(\sigma_i^1, \sigma_j^2)\, x_i.
From this, we can conclude that
\mathbf{x}^T A \mathbf{y} = \begin{bmatrix} \mathbf{x}^T A_{\cdot 1} & \cdots & \mathbf{x}^T A_{\cdot n} \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}.
This simplifies to
xT A·1 y1 + · · · + xT A·n yn
= (x1 A11 + x2 A21 + · · · + xm Am1 ) y1
+ · · · + (x1 A1n + x2 A2n + · · · + xm Amn ) yn . (5.20)
Distributing the multiplication through, we can simplify Eq. (5.20)
as
\mathbf{x}^T A \mathbf{y} = \sum_{i=1}^{m} \sum_{j=1}^{n} A_{ij}\, x_i y_j = \sum_{i=1}^{m} \sum_{j=1}^{n} \pi_1(\sigma_i^1, \sigma_j^2)\, x_i y_j = u_1(\mathbf{x}, \mathbf{y}).  (5.21)
A similar argument shows that u2 (x, y) = xT By. This completes the
proof. 
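Proposition 5.29 is also easy to spot-check numerically: the double sum defining the expected payoff collapses to a single matrix product. The sketch below (Python with NumPy; the Chicken matrices are reused purely for illustration) compares the two computations for randomly drawn mixed strategies.

import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0, -1], [1, -10]])
B = np.array([[0, 1], [-1, -10]])

x = rng.random(2); x /= x.sum()      # a random mixed strategy for Player 1
y = rng.random(2); y /= y.sum()      # a random mixed strategy for Player 2

u1 = sum(A[i, j] * x[i] * y[j] for i in range(2) for j in range(2))
u2 = sum(B[i, j] * x[i] * y[j] for i in range(2) for j in range(2))

print(np.isclose(u1, x @ A @ y), np.isclose(u2, x @ B @ y))   # True True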

5.4 Dominated Strategies and Nash Equilibria

Definition 5.30 (Weak Dominance). A mixed strategy, xi ∈ Δni ,


for Player Pi weakly dominates another strategy, yi ∈ Δni , for Player
Pi if for all mixed strategies z−i , we have

ui (xi , z−i ) ≥ ui (yi , z−i ), (5.22)

and for at least one z−i , the inequality in Eq. (5.22) is strict.
Definition 5.31 (Strict Dominance). A mixed strategy, xi ∈ Δni ,
for Player Pi strictly dominates another strategy, yi ∈ Δni , for Player
Pi if for all mixed strategies z−i , we have

ui (xi , z−i ) > ui (yi , z−i ). (5.23)

Definition 5.32 (Dominated Strategy). A strategy, yi ∈ Δni ,


for Player Pi is said to be weakly (strictly) dominated if there is a
strategy, xi ∈ Δni , that weakly (strictly) dominates it.
Remark 5.33. In a two-player matrix game G = (P, Σ, A, B), the
mixed strategy x ∈ Δm for Player 1 weakly dominates the strategy
y ∈ Δm if for all z ∈ Δn (mixed strategies for Player 2) we have

xT Az ≥ yT Az (5.24)

and the inequality is strict for at least one z ∈ Δn . If x strictly


dominates y, then we have

xT Az > yT Az, (5.25)

for all z ∈ Δn .
Example 5.34 (Prisoner’s Dilemma). The following example is
called prisoner’s dilemma and is a classic example in game theory.
There are several variations of this example, but they all have the
same structure: Two criminals – we call them Bonnie and Clyde –
commit a bank robbery. They hide the money and are driving around
wondering what to do next when they are pulled over and arrested for
carrying an illegal weapon. The police suspect Bonnie and Clyde of

the bank robbery but do not have any hard evidence. They separate
the prisoners and offer the following options to Bonnie:
(1) If neither Bonnie nor Clyde confess, they will each go to prison
for 1 year for carrying the illegal weapon.
(2) If Bonnie confesses, but Clyde does not, then Bonnie can go free
while Clyde will go to jail for 10 years.
(3) If Clyde confesses and Bonnie does not, then Bonnie will go to
jail for 10 years while Clyde will go free.
(4) If both Bonnie and Clyde confess, then they will each go to jail
for 5 years.
A similar offer is made to Clyde. The scenario is described by a two-
player bimatrix game with the player set P = {Bonnie, Clyde}, the
strategy sets Σ1 = Σ2 = {Don’t Confess, Confess}, and the payoff
matrices
   
A = \begin{bmatrix} -1 & -10 \\ 0 & -5 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} -1 & 0 \\ -10 & -5 \end{bmatrix}.

Here, payoffs are given in negative years (for years lost to prison).
Bonnie’s matrix is A, and Clyde’s matrix is B. The rows (columns)
correspond to the strategies “Don’t Confess” and “Confess.” Thus,
we see that if Bonnie does not confess and Clyde does (row 1, col-
umn 2), then Bonnie loses 10 years and Clyde loses 0 years.
We can show that the strategy Confess dominates Don’t Confess
for Bonnie. Recall that pure strategies correspond to standard basis
vectors. We claim that e2 strictly dominates e1 for Bonnie. From
Remark 5.33, we must show that

eT2 Az > eT1 Az (5.26)

for all z ∈ Δ2 . We know that

\mathbf{z} = \begin{bmatrix} z_1 \\ z_2 \end{bmatrix},

where z1 + z2 = 1 and z1 , z2 ≥ 0. For simplicity, let’s redefine

\mathbf{z} = \begin{bmatrix} z \\ 1 - z \end{bmatrix},

with z ∈ [0, 1]. We know that

\mathbf{e}_2^T A = \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} -1 & -10 \\ 0 & -5 \end{bmatrix} = \begin{bmatrix} 0 & -5 \end{bmatrix}, \qquad \mathbf{e}_1^T A = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} -1 & -10 \\ 0 & -5 \end{bmatrix} = \begin{bmatrix} -1 & -10 \end{bmatrix}.

Then,

\mathbf{e}_2^T A \mathbf{z} = \begin{bmatrix} 0 & -5 \end{bmatrix} \begin{bmatrix} z \\ 1 - z \end{bmatrix} = -5(1 - z) = 5z - 5, \qquad \mathbf{e}_1^T A \mathbf{z} = \begin{bmatrix} -1 & -10 \end{bmatrix} \begin{bmatrix} z \\ 1 - z \end{bmatrix} = -z - 10(1 - z) = 9z - 10.

There are many ways to show that if z ∈ [0, 1], 5z − 5 > 9z − 10, but
the easiest way is to plot the two functions. This is shown in Fig. 5.5.
It is also possible to show this analytically, but the visual proof is
far more appealing. We can see the line corresponding to the payoff
for Confess is always above the line corresponding to the payoff for
Don’t Confess.

[Figure 5.5: a plot of the two payoff lines 5z − 5 (Confess) and 9z − 10 (Don't Confess) for z ∈ [0, 1]; the Confess line lies strictly above the Don't Confess line over the whole interval.]

Fig. 5.5. Payoff for Bonnie’s pure strategies for varying probabilities that Clyde
confesses.
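The picture can also be checked directly: the gap (5z − 5) − (9z − 10) = 5 − 4z is positive for every z ∈ [0, 1]. The short Python sketch below (assuming NumPy) confirms the strict inequality on a grid of Clyde's mixed strategies.

import numpy as np

A = np.array([[-1, -10],
              [ 0,  -5]])                 # Bonnie's payoff matrix
e1, e2 = np.eye(2)

for z in np.linspace(0.0, 1.0, 11):       # probability that Clyde plays Don't Confess
    clyde = np.array([z, 1.0 - z])
    assert e2 @ A @ clyde > e1 @ A @ clyde    # 5z - 5 > 9z - 10
print("Confess did strictly better than Don't Confess at every grid point.")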

Remark 5.35. In general, prisoner’s dilemma is presented as an abstract symmetric game, with Player 1’s payoff matrix being

A = \begin{bmatrix} R & S \\ T & P \end{bmatrix}

and Player 2’s payoff matrix being B = A^T . We assume that T > R > P > S. In this way, row 2 of A strictly dominates row 1, while column 2 of B strictly dominates column 1.
Remark 5.36. Strict dominance can be extremely useful for identi-
fying pure Nash equilibria. This is especially true for matrix games.
We summarize this in the following two theorems.
Theorem 5.37. Let G = (P, Σ, A, B) be a two-player matrix game,
with A, B ∈ Rm×n . If

eTi Aek > eTj Aek , (5.27)

for k ∈ {1, . . . , n}, then ei strictly dominates ej for Player 1.


Remark 5.38. We know that eTi A is the ith row of A. Theorem 5.37
states that if every element in Ai· (the ith row of A) is greater than
its corresponding element in Aj· (the jth row of A), then Player 1’s
ith strategy strictly dominates Player 1’s jth strategy.
Proof. For all k ∈ {1, . . . , n}, we know that

eTi Aek > eTj Aek .

Suppose that z1 , . . . , zn ∈ [0, 1], with z1 + · · · + zn = 1. Then, for


each k ∈ {1, . . . , n}, we know that

eTi Aek zk > eTj Aek zk .

Adding these inequalities together gives

eTi Ae1 z1 + · · · + eTi Aen zn > eTj Ae1 z1 + · · · + eTj Aen zn .

Factoring, we have

eTi A (z1 e1 + · · · + zn en ) > eTj A (z1 e1 + · · · + zn en ).



Define
⎡ ⎤
z1
⎢.⎥
z = z1 e1 + · · · + zn en = ⎢ ⎥
⎣ .. ⎦.
zn

Since the original z1 , . . . , zn were chosen arbitrarily from [0, 1] so that


z1 + · · · + zn = 1, we know that

eTi Az > eTj Az,

for all z ∈ Δn . Thus, ei strictly dominates ej by Definition 5.31. 


Remark 5.39. There is an analogous theorem for Player 2 which
states that if each element of column B·i is greater than the corre-
sponding element in column B·j , then ei strictly dominates strategy
ej for Player 2.
Remark 5.40. Theorem 5.37 can be generalized to N players; how-
ever, stating this theorem is notationally cumbersome and leads to
no further insight.
Theorem 5.41. Let G = (P, Σ, A, B) be a two-player matrix game.
Suppose that the pure strategy ej ∈ Δm for Player 1 is strictly domi-
nated by the pure strategy ei ∈ Δm . If (x∗ , y∗ ) is a Nash equilibrium,
then x∗j = 0. Similarly, if the pure strategy ej ∈ Δn for Player 2 is
strictly dominated by the pure strategy ei ∈ Δn , then yj∗ = 0.
Remark 5.42. Theorem 5.41 states that pure strategies that are
dominated have no support (i.e., occur with a probability of zero) in
a Nash equilibrium. We can use this fact in a process called analy-
sis by iterated dominance to find pure strategy Nash equilibria and
occasionally to simplify games with many strategies.

Proof of Theorem 5.41. We prove the theorem for Player 1; the


proof for Player 2 is completely analogous. We proceed by contra-
diction. Let x∗ = x∗1 , . . . , x∗m , and suppose that x∗j > 0. We know
that

eTi Ay∗ > eTj Ay∗



because ei strictly dominates ej . We can write


 
x∗T Ay∗ = (x∗1 eT1 + · · · + x∗i eTi + · · · + x∗j eTj + · · · + x∗m eTm ) Ay∗ . (5.28)
Since x∗j > 0, we know that

x∗j eTi Ay∗ > x∗j eTj Ay∗ .


Replacing x∗j ej by x∗j ei in Eq. (5.28) yields the inequality

(x∗1 eT1 + · · · + x∗i eTi + · · · + x∗j eTi + · · · + x∗m eTm ) Ay∗ > (x∗1 eT1 + · · · + x∗i eTi + · · · + x∗j eTj + · · · + x∗m eTm ) Ay∗ . (5.29)
If we define z = ⟨z1 , . . . , zm ⟩ ∈ Δm so that

z_k = \begin{cases} x_i^* + x_j^* & k = i, \\ 0 & k = j, \\ x_k^* & \text{otherwise}, \end{cases}  (5.30)

then Eq. (5.29) implies

zT Ay∗ > x∗ T Ay∗ . (5.31)


Thus, (x∗ , y∗ ) could not have been a Nash equilibrium. This com-
pletes the proof. 
Remark 5.43. The preceding proof worked by showing that trans-
ferring the probability placed on strategy ej to strategy ei improved
the payoff to Player 1 precisely because we assumed that ei dominates
ej . Consequently, no Nash equilibrium strategy can assign non-zero
probability to a dominated strategy.
Example 5.44. We can use the two previous theorems to our advan-
tage. Consider the prisoner’s dilemma (Example 5.34). The payoff
matrices (again) are
   
−1 −10 −1 0
A= and B = .
0 −5 −10 −5
For Bonnie, row (strategy) 1 is strictly dominated by row (strat-
egy) 2. Thus, Bonnie will never play strategy 1 (Don’t Confess) in a

Nash equilibrium. That is,


 
A1· < A2· ≡ [−1  −10] < [0  −5].
We can consider a new game in which we remove this strat-
egy for Bonnie (since Bonnie will never play this strategy). The
new game has P = {Bonnie, Clyde}, Σ1 = {Confess}, Σ2 =
{Don’t Confess, Confess}. The new game matrices are
 
A = [0  −5] and B = [−10  −5].
In this new game, we note that for Clyde (Player 2), column (strat-
egy) 2 strictly dominates column (strategy 1). That is,
B·1 < B·2 ≡ −10 < −5.
Clyde will never play Strategy 1 (Don’t Confess) in a Nash equi-
librium. We can construct a new game with P = {Bonnie, Clyde},
Σ1 = {Confess}, Σ2 = {Confess} and (trivial) payoff matrices as
A = −5 and B = −5.
In this game, there is only one Nash equilibrium in which both players
confess, and this equilibrium is the Nash equilibrium of the original
game.
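The elimination procedure in this example can be automated. The following Python sketch (assuming NumPy; the helper name iterated_dominance is ours, purely for illustration) is a minimal version that only removes pure strategies strictly dominated by other pure strategies, which is all the prisoner's dilemma requires.

import numpy as np

def iterated_dominance(A, B):
    """Iteratively delete pure strategies strictly dominated by other pure strategies."""
    rows, cols = list(range(A.shape[0])), list(range(A.shape[1]))
    changed = True
    while changed:
        changed = False
        for r in rows[:]:    # Player 1: compare rows of A on the surviving columns
            if any(np.all(A[s, cols] > A[r, cols]) for s in rows if s != r):
                rows.remove(r); changed = True
        for c in cols[:]:    # Player 2: compare columns of B on the surviving rows
            if any(np.all(B[rows, d] > B[rows, c]) for d in cols if d != c):
                cols.remove(c); changed = True
    return rows, cols

A = np.array([[-1, -10], [0, -5]])
B = np.array([[-1, 0], [-10, -5]])
print(iterated_dominance(A, B))   # ([1], [1]): both players Confess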
Remark 5.45 (Iterative Dominance). A game whose Nash equi-
librium is computed using the method from Example 5.44, in which
strictly dominated strategies are iteratively eliminated for the two
players, is said to be solved by iterated dominance. A game that can
be analyzed in this way is said to be strictly dominance solvable.

5.5 The Indifference Theorem

Theorem 5.46 (Indifference Theorem). Let G = (P, Σ, A, B)


be a two-player matrix game, and suppose that (x∗ , y∗ ) is a Nash
equilibrium. If x∗i > 0, then

x∗T Ay∗ = eTi Ay∗ .


Likewise, if yj∗ > 0, then

x∗T Ay∗ = x∗T Aej .



Remark 5.47. The indifference theorem states that if Player 2 plays


a Nash equilibrium, then Player 1 is indifferent between playing her
mixed-strategy Nash equilibrium and some pure strategy that has
a non-zero probability in the Nash equilibrium. That is, Player 1
will receive the same payoff no matter whether she plays the mixed-
strategy equilibrium or a pure-strategy equilibrium. This does not
mean that both players can switch to a pure strategy equilibrium, as
that could clearly change the payoffs.
Remark 5.48. The proof uses the same trick we have already used
twice, though it does require a bit more intricacy because we will
be ignoring those strategies that have zero probability in the Nash
equilibrium.

Proof of the Indifference Theorem. We prove this for Player 1.


The result follows by symmetry for Player 2. We know that we cannot
have

eTi Ay∗ > x∗T Ay∗

by the definition of a Nash equilibrium. Let I ⊆ {1, . . . , m} be the


set of indices so that k ∈ I if and only if x∗k > 0. That is, I is the set
of strategy indexes that occur with non-zero probability. We know
that

\sum_{k \in I} x_k^* = 1.

By construction,

\sum_{k \in I} x_k^* \mathbf{e}_k = \sum_{i} x_i^* \mathbf{e}_i = \mathbf{x}^*,

which is simply a variation of Eq. (5.28) written more compactly.


Assume that

eTi Ay∗ < x∗ T Ay∗ , (5.32)

for some i ∈ I. We also know that

eTk Ay∗ ≤ x∗ T Ay∗ ,



for all k ∈ I, by the definition of the Nash equilibrium. Since x∗k > 0
for all k ∈ I, we can rewrite these inequalities as
x∗i eTi Ay∗ < x∗i x∗ T Ay∗ ,
x∗k eTk Ay∗ ≤ x∗k x∗ T Ay∗ ∀k ∈ I, k = i.
Adding all the inequalities yields
 
x_i^* \mathbf{e}_i^T A \mathbf{y}^* + \sum_{k \in I,\, k \neq i} x_k^* \mathbf{e}_k^T A \mathbf{y}^* = \sum_{k \in I} x_k^* \mathbf{e}_k^T A \mathbf{y}^* = \mathbf{x}^{*T} A \mathbf{y}^*
< x_i^* \mathbf{x}^{*T} A \mathbf{y}^* + \sum_{k \in I,\, k \neq i} x_k^* \mathbf{x}^{*T} A \mathbf{y}^* = \sum_{k \in I} x_k^* \mathbf{x}^{*T} A \mathbf{y}^* = \mathbf{x}^{*T} A \mathbf{y}^*.

The strictness of the inequality follows from our assumption in


Eq. (5.32). But this is a contradiction. Therefore, we know that,
for all i ∈ I, we cannot have eTi Ay∗ < x∗T Ay∗ ; therefore, it follows
that we must have eTi Ay∗ = x∗T Ay∗ for all i ∈ I. The argument for
Player 2 is identical. This completes the proof. 
Remark 5.49. Note that the strategies with zero probability are
free to yield a lower payoff without affecting the proof. That is, if
k ∉ I in the proof above, it is perfectly fine that
eTk Ay∗ < x∗ T Ay∗
because x∗k = 0, and so those terms will never appear in any of the
sums used in the proof.
Example 5.50. Recall that the payoff matrix for the zero-sum game
rock-paper-scissors is
A = \begin{bmatrix} 0 & -1 & 1 \\ 1 & 0 & -1 \\ -1 & 1 & 0 \end{bmatrix}.

The Nash equilibrium is x∗ = y∗ = ⟨1/3, 1/3, 1/3⟩. Note that

A\mathbf{y}^* = \begin{bmatrix} 0 & -1 & 1 \\ 1 & 0 & -1 \\ -1 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1/3 \\ 1/3 \\ 1/3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.

Thus, it is easy to see that the indifference theorem holds in this case. In fact, for any x ∈ Δ3 , we see at once that xT Ay∗ = 0.
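A quick numerical check of the same computation (Python with NumPy):

import numpy as np

A = np.array([[ 0, -1,  1],
              [ 1,  0, -1],
              [-1,  1,  0]])
y_star = np.array([1/3, 1/3, 1/3])
x_star = np.array([1/3, 1/3, 1/3])

print(A @ y_star)              # [0. 0. 0.]: every pure strategy earns 0 against y*
print(x_star @ A @ y_star)     # 0.0: so does the mixed equilibrium strategy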

Remark 5.51. In light of the previous example, we see that the


indifference theorem can be generalized a bit. This generalization
appears in the exercises.

5.6 The Minimax Theorem

Remark 5.52. We now return to zero-sum games and show that


there is a Nash equilibrium for every zero-sum game. Before pro-
ceeding, we recall the definition of a Nash equilibrium as it applies
to a zero-sum game. A mixed strategy, (x∗ , y∗ ) ∈ Δ, is a Nash equi-
librium for a zero-sum game, G = (P, Σ, A) with A ∈ Rm×n , if we
have
x∗ T Ay∗ ≥ xT Ay∗ ,
for all x ∈ Δm and
x∗ T Ay∗ ≤ x∗ T Ay,
for all y ∈ Δn .
Remark 5.53. Let G = (P, Σ, A) be a zero-sum game, with A ∈
Rm×n . Define the function v1 : Δm → R as
v1 (x) = min xT Ay. (5.33)
y∈Δn
That is, given x ∈ Δm , we choose a vector y that minimizes the value
xT Ay. This value is the best possible result Player 1 can expect if
she announces to Player 2 that she will play strategy x. Player 1
then faces the problem that she would like to maximize this value
by choosing x appropriately. That is, Player 1 hopes to solve the
problem
max v1 (x). (5.34)
x∈Δm
Thus, Player 1’s problem is to solve
max v1 (x) = max min xT Ay. (5.35)
x∈Δm x y
By a similar argument, define the function v2 : Δn → R as
v2 (y) = max xT Ay. (5.36)
x∈Δm

That is, given y ∈ Δn , we choose a vector x that maximizes xT Ay.


This value is the best possible result that Player 2 can expect if he
announces to Player 1 that he will play strategy y. Player 2 then faces

the problem that he would like to minimize this value by choosing y


appropriately. That is, Player 2 hopes to solve the problem
min v2 (y). (5.37)
y∈Δn

Player 2’s problem is to solve


min v2 (y) = min max xT Ay. (5.38)
y∈Δn y x

Note that this is the precise analogue in mixed strategies to the


concept of a saddle point. The functions v1 and v2 are called the
value functions for Players 1 and 2, respectively. The main problem
we must tackle now is to determine whether these maximization and
minimization problems can be solved.
Remark 5.54. The proof of the following lemma is left as an
exercise.
Lemma 5.55. Let G = (P, Σ, A) be a zero-sum game, with A ∈
Rm×n . Then,
max v1 (x) ≤ min v2 (y). (5.39)
x∈Δm y∈Δn

Remark 5.56. The proof of the following theorem, the minimax


theorem, is long and uses some trickery. It is best to read it a few
times, or to skip it on the first reading, since the insight is all in the
theorem rather than in the proof.
Theorem 5.57 (Minimax Theorem). Let G = (P, Σ, A) be a
zero-sum game, with A ∈ Rm×n . Then, the following are equivalent:
(1) There is a Nash equilibrium (x∗ , y∗ ) for G.
(2) The following equation holds:
v1 = max min xT Ay = min max xT Ay = v2 . (5.40)
x y y x

(3) There exists a real number v and x∗ ∈ Δm and y∗ ∈ Δn so that the following inequalities hold:

\sum_{i} A_{ij} x_i^* \geq v, \quad \text{with } j \in \{1, \ldots, n\}, \quad \text{and} \quad \sum_{j} A_{ij} y_j^* \leq v, \quad \text{with } i \in \{1, \ldots, m\}.

Proof. (A version of this proof is given in Ref. [1], Appendix 2.)


(1 =⇒ 2): Suppose that (x∗ , y∗ ) ∈ Δ is a Nash equilibrium. By
the definition of a minimum, we know that
v2 = min max xT Ay ≤ max xT Ay∗ .
y x x

The fact that for all x ∈ Δm ,


x∗ T Ay∗ ≥ xT Ay∗ ,
implies that
x∗ T Ay∗ = max xT Ay∗ .
x

Thus, we have
v2 = min max xT Ay ≤ max xT Ay∗ = x∗ T Ay∗ .
y x x

Again, the fact that for all y ∈ Δn ,


x∗T Ay∗ ≤ x∗T Ay,
implies that
x∗T Ay∗ = min x∗T Ay.
y

Thus,
v2 = min max xT Ay ≤ max xT Ay∗ = x∗ T Ay∗ = min x∗ T Ay.
y x x y

Finally, by the definition of a maximum, we know that


v2 = min max xT Ay ≤ max xT Ay∗ = x∗ T Ay∗
y x x

= min x∗T Ay ≤ max min xT Ay = v1 . (5.41)


y x y

By Lemma 5.55, we know that v1 ≤ v2 . We have just proved that


v2 ≤ v1 . Therefore, v1 = v2 , as required.
(2 =⇒ 3): Let v = v1 = v2 , let x∗ be the vector that maximizes v1 (x), and let y∗ be the vector that minimizes v2 (y). For fixed j, we know that

\sum_{i} A_{ij} x_i^* = \mathbf{x}^{*T} A \mathbf{e}_j.

Note that we are summing down a column of A in the preceding equation. By the definition of minimum, we know that

\sum_{i} A_{ij} x_i^* = \mathbf{x}^{*T} A \mathbf{e}_j \geq \min_{\mathbf{y}} \mathbf{x}^{*T} A \mathbf{y}.

We defined x∗ so that it is the maximin value, and thus,

\sum_{i} A_{ij} x_i^* = \mathbf{x}^{*T} A \mathbf{e}_j \geq \min_{\mathbf{y}} \mathbf{x}^{*T} A \mathbf{y} = \max_{\mathbf{x}} \min_{\mathbf{y}} \mathbf{x}^{T} A \mathbf{y} = v = \min_{\mathbf{y}} \max_{\mathbf{x}} \mathbf{x}^{T} A \mathbf{y}.

By a similar argument, we defined y∗ so that it is the minimax value, and thus,

\sum_{i} A_{ij} x_i^* = \mathbf{x}^{*T} A \mathbf{e}_j \geq \min_{\mathbf{y}} \mathbf{x}^{*T} A \mathbf{y} = \max_{\mathbf{x}} \min_{\mathbf{y}} \mathbf{x}^{T} A \mathbf{y} = v = \min_{\mathbf{y}} \max_{\mathbf{x}} \mathbf{x}^{T} A \mathbf{y} = \max_{\mathbf{x}} \mathbf{x}^{T} A \mathbf{y}^*.

Finally, for fixed i, we know that

\sum_{j} A_{ij} y_j^* = \mathbf{e}_i^T A \mathbf{y}^*,

and thus, we conclude that

\sum_{i} A_{ij} x_i^* = \mathbf{x}^{*T} A \mathbf{e}_j \geq \min_{\mathbf{y}} \mathbf{x}^{*T} A \mathbf{y} = \max_{\mathbf{x}} \min_{\mathbf{y}} \mathbf{x}^{T} A \mathbf{y} = v = \min_{\mathbf{y}} \max_{\mathbf{x}} \mathbf{x}^{T} A \mathbf{y} = \max_{\mathbf{x}} \mathbf{x}^{T} A \mathbf{y}^* \geq \mathbf{e}_i^T A \mathbf{y}^* = \sum_{j} A_{ij} y_j^*.  (5.42)

From Eq. (5.42), we can read off the two inequalities

\sum_{i} A_{ij} x_i^* \geq v \quad \text{for } j \in \{1, \ldots, n\}, \qquad \sum_{j} A_{ij} y_j^* \leq v \quad \text{for } i \in \{1, \ldots, m\}.

(3 =⇒ 1): For any fixed j, we know that


x∗ T Aej ≥ v.
Thus, if y1 , . . . , yn ∈ [0, 1] and y1 + · · · + yn = 1 for each
j ∈ {1, . . . , n}, we know that
x∗T Aej yj ≥ vyj .
Thus, we can conclude that
x∗T Ae1 y1 + · · · + x∗T Aen yn = x∗T A (e1 y1 + · · · + en yn ) ≥ v.
Letting y = y1 , . . . , yn , we can conclude that
x∗ T Ay ≥ v, (5.43)
for any y ∈ Δn . By a similar argument, we know that
xT Ay∗ ≤ v, (5.44)
for all x ∈ Δm . From Eq. (5.44), we conclude that
x∗T Ay∗ ≤ v, (5.45)
and from Eq. (5.43), we conclude that
x∗ T Ay∗ ≥ v. (5.46)
Thus, v = x∗T Ay∗ , and we know that, for all x and y,
x∗ T Ay∗ ≥ xT Ay∗ ,
x∗ T Ay∗ ≤ x∗ T Ay.
Thus, (x∗ , y∗ ) is a Nash equilibrium. This completes the proof. 
Remark 5.58. Theorem 5.57 does not assert the existence of a
Nash equilibrium; it simply provides insight into what happens if
one exists. In particular, we know that the game has a unique value:
v = max min xT Ay = min max xT Ay. (5.47)
x y y x

Proving the existence of a Nash equilibrium can be accomplished in


several ways, the oldest of which uses a topological argument, which
we present in the following. We can also use a linear programming-
based argument, which we explore in Chapter 7.

5.7 Existence of Nash Equilibria

Lemma 5.59 (Brouwer Fixed Point Theorem). Let Δ be the


mixed strategy space of a two-player zero-sum game. If T : Δ → Δ
is continuous, then there exists a pair of strategies, (x∗ , y∗ ), so that
T (x∗ , y∗ ) = (x∗ , y∗ ). That is, (x∗ , y∗ ) is a fixed point of the map-
ping T .
Remark 5.60. In the previous lemma, we are casually avoiding a
formal definition of function continuity and relying on the fact that
the reader has probably encountered this definition in a calculus
class. Moreover, the proof of Brouwer’s fixed point theorem is well
outside the scope of this book. It is a deep theorem in topology. The
interested reader should consult Ref. [61] (pp. 351–353), which also
has a definition of function continuity.
Remark 5.61. Before proving that every zero-sum game has a Nash
equilibrium, we state a lemma. The proof is left as an exercise.
Lemma 5.62. Let G = (P, Σ, A) be a zero-sum game, with A ∈
Rm×n . Let x∗ ∈ Δm and y∗ ∈ Δn . If
x∗T Ay∗ ≥ eTi Ay∗ ,
for all i ∈ {1, . . . , m}, and
x∗T Ay∗ ≤ x∗T Aej ,
for all j ∈ {1, . . . , n}, then (x∗ , y∗ ) is an equilibrium.
Theorem 5.63. Let G = (P, Σ, A) be a zero-sum game with A ∈
Rm×n . Then, there is a Nash equilibrium, (x∗ , y∗ ).

Nash’s Proof. (A version of this proof is given in Ref. [1],


Appendix 2.) Let (x, y) ∈ Δ be a pair of mixed strategies for Play-
ers 1 and 2. Define the following functions:

c_i(\mathbf{x}, \mathbf{y}) = \begin{cases} \mathbf{e}_i^T A \mathbf{y} - \mathbf{x}^T A \mathbf{y} & \text{if this quantity is positive} \\ 0 & \text{otherwise} \end{cases}  (5.48)

d_j(\mathbf{x}, \mathbf{y}) = \begin{cases} \mathbf{x}^T A \mathbf{y} - \mathbf{x}^T A \mathbf{e}_j & \text{if this quantity is positive} \\ 0 & \text{otherwise,} \end{cases}  (5.49)

for i ∈ {1, . . . , m} and j ∈ {1, . . . , n}. Note that ci (respectively, dj )


is non-zero just in case the pure strategy i (resp., j) offers a better
payoff to Player 1 (resp., Player 2) than the strategy x (resp., y).
Let T : Δ → Δ, where T (x, y) = (x′ , y′ ), so that for i ∈ {1, . . . , m}, we have

x_i' = \frac{x_i + c_i(\mathbf{x}, \mathbf{y})}{1 + \sum_{k=1}^{m} c_k(\mathbf{x}, \mathbf{y})},  (5.50)

and for j ∈ {1, . . . , n}, we have

y_j' = \frac{y_j + d_j(\mathbf{x}, \mathbf{y})}{1 + \sum_{k=1}^{n} d_k(\mathbf{x}, \mathbf{y})}.  (5.51)

Since x1 + x2 + · · · + xm = 1, we know that

x_1' + \cdots + x_m' = \frac{x_1 + \cdots + x_m + \sum_{k=1}^{m} c_k(\mathbf{x}, \mathbf{y})}{1 + \sum_{k=1}^{m} c_k(\mathbf{x}, \mathbf{y})} = 1.  (5.52)

It is also clear that since xi ≥ 0 for all i ∈ {1, . . . , m}, we know that x′i ≥ 0 for all i. A similar argument shows that y′j ≥ 0 for all j ∈ {1, . . . , n} and y′1 + y′2 + · · · + y′n = 1. Thus, as we have defined it, T is a map from Δ to Δ. The fact that T is continuous follows
from the continuity of the payoff function. Now, we show that (x, y)
is a Nash equilibrium if and only if it is a fixed point of T .
To see this, note that ci (x, y) measures the amount that the pure
strategy ei is better than x as a response to y. That is, if Player
2 decides to play strategy y, then ci (x, y) tells us if and how much
playing the pure strategy ei is better than playing x ∈ Δm . Similarly,
dj (x, y) measures how much better ej is as a response to Player 1’s
strategy x than the strategy y for Player 2. Suppose that (x∗ , y∗ )
is a Nash equilibrium. Then, necessarily, ci (x∗ , y∗ ) = 0 = dj (x∗ , y∗ )
for all i and j, by the definition of equilibrium. Thus, xi = x∗i for all
i and yj = yj∗ for all j. Thus, we have shown that (x∗ , y∗ ) is a fixed
point of T .
To show the converse, suppose that (x, y) is a fixed point of T .
It suffices to show that there is at least one i so that xi > 0 and
ci (x, y) = 0. We know that there is at least one i for which xi > 0

because x1 + · · · + xm = 1. Note that


\mathbf{x}^T A \mathbf{y} = \sum_{i=1}^{m} x_i\, \mathbf{e}_i^T A \mathbf{y}.

Thus, xT Ay < eTi Ay cannot hold for all i ∈ {1, . . . , m} with xi > 0
(otherwise, the previous equation would not hold). Therefore, for at
least one i with xi > 0, we must have ci (x, y) = 0. But for this
specific i, the fact that (x, y) is a fixed point implies that
x_i = \frac{x_i}{1 + \sum_{k=1}^{m} c_k(\mathbf{x}, \mathbf{y})}.  (5.53)

This implies that

\sum_{k=1}^{m} c_k(\mathbf{x}, \mathbf{y}) = 0.

For this to be true, we conclude that ck (x, y) = 0 because ck (x, y) ≥ 0


for all k ∈ {1, . . . , m}. A similar argument applies to y. Thus, we
know that ci (x, y) = 0 = dj (x, y) for all i ∈ {1, . . . , m} and j ∈
{1, . . . , n}. We have shown that

xT Ay ≥ eTi Ay and xT Ay ≤ xT Aej ,

for all i ∈ {1, . . . , m} and j ∈ {1, . . . , n}. Therefore, by Lemma 5.62,


we have that (x, y) is a Nash equilibrium.
Now, apply Lemma 5.59 (Brouwer’s fixed point theorem) to see
that T must have a fixed point, and thus every two-player zero-sum
game has a Nash equilibrium. This completes the proof. 
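The map T in this proof is concrete enough to implement. The sketch below (Python with NumPy; the helper name nash_map is ours, purely for illustration) codes Eqs. (5.48)-(5.51) for a zero-sum game and confirms that the rock-paper-scissors equilibrium is a fixed point while a non-equilibrium point is moved. Note that iterating T is not, by itself, a practical method for computing equilibria; the proof only uses the existence of a fixed point.

import numpy as np

def nash_map(A, x, y):
    """One application of the map T from Eqs. (5.48)-(5.51) for a zero-sum game."""
    c = np.maximum(A @ y - x @ A @ y, 0.0)     # c_i(x, y): Player 1's pure-strategy gains
    d = np.maximum(x @ A @ y - x @ A, 0.0)     # d_j(x, y): Player 2's pure-strategy gains
    return (x + c) / (1.0 + c.sum()), (y + d) / (1.0 + d.sum())

A = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]])    # rock-paper-scissors

x_star = np.array([1/3, 1/3, 1/3])
print(nash_map(A, x_star, x_star))     # returns (x*, x*): the equilibrium is a fixed point

e1 = np.array([1.0, 0.0, 0.0])         # "always Rock" is not an equilibrium
print(nash_map(A, e1, e1))             # T moves this pair toward better responses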

5.8 Finding Nash Equilibria in Simple Games

Remark 5.64. It is relatively straightforward to find a Nash equilib-


rium in 2 × 2 zero-sum games, assuming that a saddle point cannot
be identified using the approach from Example 5.3 or the method
of iterated dominance. We illustrate this approach using the Battle
of Avranches, which we know does not have an equilibrium in pure
strategies.

Example 5.65. Consider the Battle of Avranches (Example 5.11).


The payoff matrix is
A = \begin{bmatrix} 2 & 3 \\ 1 & 5 \\ 6 & 4 \end{bmatrix}.

Note first that Row 1 (Bradley’s first strategy) is strictly dominated


by Row 3 (Bradley’s third strategy), and thus we can reduce the
payoff matrix to
 
A = \begin{bmatrix} 1 & 5 \\ 6 & 4 \end{bmatrix}.
Suppose that Bradley chooses the strategy
 
\mathbf{x} = \begin{bmatrix} x \\ 1 - x \end{bmatrix}
with x ∈ [0, 1]. If von Kluge chooses to attack (column one), then
Bradley’s expected payoff will be
  
\mathbf{x}^T A \mathbf{e}_1 = \begin{bmatrix} x & 1 - x \end{bmatrix} \begin{bmatrix} 1 & 5 \\ 6 & 4 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = x + 6(1 - x) = 6 - 5x.
A similar argument shows that if von Kluge chooses to retreat (col-
umn two), then Bradley’s expected payoff will be
xT Ae2 = 5x + 4(1 − x) = x + 4.
We can visualize these two possibilities by plotting the functions, as
shown in Fig. 5.6 (left). Reasoning that von Kluge will attempt to
minimize his payoff, Bradley should consider the function
u1 (x) = min{6 − 5x, x + 4}
and find a value x that maximizes u1 (x). This maximizing point
comes at x = 1/3, where the two lines intersect.
Put another way, when x ≤ 1/3, von Kluge does better if he retreats
because x + 4 is below 6 − 5x (remember, von Kluge wishes to
minimize Bradley’s payoff). That is, the best Bradley can hope to
get is x + 4 if he announces to von Kluge that he is playing a mixed

[Figure 5.6 has two panels: on the left, Player 1's expected payoff lines 6 − 5x and x + 4 plotted against x, intersecting at x = 1/3; on the right, the corresponding lines 2y + 4 and 5 − 4y plotted against y, intersecting at y = 1/6.]

Fig. 5.6. Plotting the expected payoff to Bradley by playing the mixed strategy
x, (1 − x) when von Kluge plays pure strategies shows which strategy von Kluge
should pick. When x ≤ 1/3, von Kluge does better if he retreats because x + 4
is below 6 − 5x. On the other hand, if x ≥ 1/3, then von Kluge does better if he
attacks because 6 − 5x is below x + 4. Remember, Von Kluge wants to minimize
the payoff to Bradley. The point at which Bradley does best (i.e., maximizes his
expected payoff) comes at x = 1/3. By a similar argument, when y ≤ 1/6, Bradley
does better if he chooses to move east (strategy one), while when y ≥ 1/6, Bradley
does best when he waits (strategy two). Remember, Bradley is minimizing von
Kluge’s payoff since we are working with −A.

strategy with x ≤ 1/3. On the other hand, if x ≥ 1/3, then von Kluge


does better if he attacks because 6 − 5x is below x + 4. That is, the
best Bradley can hope to get is 6 − 5x if he announces to von Kluge
that he is playing a mixed strategy with x ≥ 1/3. The break-even point
occurs at x = 1/3, at which it does not matter what strategy von Kluge
plays.
By a similar argument, we can compute the expected payoff to von
Kluge when he plays the mixed strategy y, (1 − y) with y ∈ [0, 1]
and Bradley plays pure strategies. The expected payoff to von Kluge
when Bradley plays strategy one is
eT1 (−A)y = −y − 5(1 − y) = 4y − 5.
When Bradley plays strategy two, the expected payoff to von Kluge is
eT2 (−A)y = −6y − 4(1 − y) = −2y − 4.
Reasoning that Bradley will attempt to minimize his payoff, von
Kluge should consider the function
u2 (y) = min{4y − 5, −2y − 4}
and find a value y that maximizes u2 (y). Again, this maximizing
point occurs at the intersection of the two lines, or when y = 1/6.

Note that we have done something subtle here. We are now treating
both players as agents attempting to maximize their payoffs, rather
than thinking of one agent as a minimizer and another agent as a
maximizer. We take this approach when discussing computational
methods for finding Nash equilibria in later chapters.
We now define

\mathbf{x}^* = \begin{bmatrix} 1/3 \\ 2/3 \end{bmatrix} \quad \text{and} \quad \mathbf{y}^* = \begin{bmatrix} 1/6 \\ 5/6 \end{bmatrix}.

The pair (x∗ , y∗ ) is the Nash equilibrium for this problem.
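The intersection argument above works for any 2 × 2 zero-sum game without a saddle point, and it is worth recording the closed form. The sketch below (Python with NumPy; the helper name mixed_equilibrium_2x2 is ours, and the formulas assume the denominator is non-zero, which holds exactly when there is no pure-strategy saddle point) recovers x∗, y∗, and the game value:

import numpy as np

def mixed_equilibrium_2x2(A):
    """Equalizing strategies for a 2x2 zero-sum game with no saddle point."""
    (a, b), (c, d) = A
    denom = a - b - c + d
    x = (d - c) / denom              # probability Player 1 places on row 1
    y = (d - b) / denom              # probability Player 2 places on column 1
    value = (a * d - b * c) / denom
    return np.array([x, 1 - x]), np.array([y, 1 - y]), value

A = np.array([[1, 5], [6, 4]])       # Avranches game after deleting the dominated row
x, y, v = mixed_equilibrium_2x2(A)
print(x, y, v)                       # [1/3 2/3], [1/6 5/6], value 13/3
print(np.isclose(x @ A @ y, v))      # True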


Example 5.66 (Saddle-Point Visualization). Continuing on
from the previous example, we note that any Nash equilibrium for a
zero-sum game is called a saddle point. To see why, consider the pay-
off function for Player 1 as a function of x and y (from the previous
example). This function is

\begin{bmatrix} x & 1 - x \end{bmatrix} \begin{bmatrix} 1 & 5 \\ 6 & 4 \end{bmatrix} \begin{bmatrix} y \\ 1 - y \end{bmatrix} = 4 + x + 2y - 6xy.  (5.54)

We plot this in Fig. 5.7. The surface (with the corresponding contour
plot) is a hyperbolic saddle. In three-dimensional space, it looks like
a combination of an upside-down parabola going in one direction
and a right-side-up parabola going in the other. The maximum of
one parabola and the minimum of the other parabola occur precisely
at the point (x∗ , y ∗ ) = (1/3, 1/6). This is the point in the (x, y)-plane
corresponding to the Nash equilibrium. We discuss this further in
Section 6.7.
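A quick numerical check of the saddle property of Eq. (5.54) (Python with NumPy): holding y = 1/6 fixed, Player 1 cannot push the payoff above 13/3, and holding x = 1/3 fixed, Player 2 cannot push it below 13/3.

import numpy as np

def payoff(x, y):
    return 4 + x + 2 * y - 6 * x * y       # Eq. (5.54)

grid = np.linspace(0.0, 1.0, 101)

print(payoff(1/3, 1/6))                    # 13/3 = 4.333...
print(payoff(grid, 1/6).max())             # still 13/3: no better x for Player 1
print(payoff(1/3, grid).min())             # still 13/3: no better y for Player 2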
Remark 5.67. The techniques discussed in Examples 5.65 and 5.66
can be extended to cases when one player has two strategies and the
other player has more than two strategies; however, these methods
are not efficient for finding Nash equilibria in general. In the fol-
lowing chapters, we show how to find Nash equilibria for games by
solving specific optimization problems corresponding to the games in
question. These techniques will work for general two-player zero-sum
games with an arbitrary number of strategies. We also discuss the
problem of finding Nash equilibria in arbitrary bimatrix games.

Fig. 5.7. The payoff function for Player 1 is shown as a function of x and y.
Note: The Nash equilibrium occurs at a saddle point of the function.

5.9 Nash Equilibria in General-Sum Games

Remark 5.68. We now generalize our discussion on the existence


of Nash equilibria to general-sum games with N ≥ 2 players. The
results in this section are (almost) completely analogous to those we
have already seen, only that here they are in a more general setting.
For the remainder of this section, all results will be about an N -player
game, G = (P, Σ, π), in normal form with Σi = {σ1i , . . . , σni i }, where
Δ is the mixed-strategy space for this game.
Definition 5.69 (Player Best Response). If y = (y1 , . . . ,
yi , . . . , yN ) ∈ Δ is a mixed strategy for all players, then the best
reply for Player Pi is the set
 
Bi (y) = {xi ∈ Δni : ui (xi , y−i ) ≥ ui (zi , y−i ) ∀zi ∈ Δni }. (5.55)
Recall that y−i = (y1 , . . . , yi−1 , yi+1 , . . . , yN ) is the collection of
mixed strategies not including Player i’s.
Remark 5.70. There is a lot going on in Definition 5.69. What we
are really saying is that, given some strategies, y−i = (y1 , . . . , yi−1 ,
yi+1 , . . . , yN ), find all the strategies xi ∈ Δni for Player i that return
the best possible payoffs, given all the other players are using y−i .

We can see that if a Player Pi is confronted by some collection of


strategies y−i , then the best thing she can do is to choose some
strategy xi ∈ Bi (y).
Remark 5.71. Note that Eq. (5.55) defines the function Bi : Δ →
2Δni . That is, Bi is a point-to-set map. We can generalize this to the
best response function, which combines these individual functions for
each player.
Definition 5.72 (Best Response Function). The mapping
B: Δ → 2Δ , given by

B(x) = B1 (x) × B2 (x) × · · · × BN (x), (5.56)

is called the best response mapping.


Theorem 5.73. The strategy x∗ ∈ Δ is a Nash equilibrium for G if
and only if x∗ ∈ B(x∗ ).
Proof. Suppose that x∗ is a Nash equilibrium. Then, for all i ∈ {1, . . . , N },

u_i(\mathbf{x}^{i*}, \mathbf{x}^{-i*}) \geq u_i(\mathbf{z}^{i}, \mathbf{x}^{-i*}),

for every zi ∈ Δni . Thus,

\mathbf{x}^{i*} \in \left\{ \mathbf{x}^{i} \in \Delta_{n_i} : u_i(\mathbf{x}^{i}, \mathbf{x}^{-i*}) \geq u_i(\mathbf{z}^{i}, \mathbf{x}^{-i*})\ \forall \mathbf{z}^{i} \in \Delta_{n_i} \right\}.

It follows that xi∗ ∈ Bi (x∗ ). Since this holds for each i ∈ {1, . . . , N }, it follows that x∗ ∈ B(x∗ ).
To prove the converse, suppose that x∗ ∈ B(x∗ ). Then, for all i ∈ {1, . . . , N }, we have

\mathbf{x}^{i*} \in \left\{ \mathbf{x}^{i} \in \Delta_{n_i} : u_i(\mathbf{x}^{i}, \mathbf{x}^{-i*}) \geq u_i(\mathbf{z}^{i}, \mathbf{x}^{-i*})\ \forall \mathbf{z}^{i} \in \Delta_{n_i} \right\}.

However, this implies that for all i ∈ {1, . . . , N },

u_i(\mathbf{x}^{i*}, \mathbf{x}^{-i*}) \geq u_i(\mathbf{z}^{i}, \mathbf{x}^{-i*}),

for every zi ∈ Δni . This is the definition of a Nash equilibrium. 


Remark 5.74. Theorem 5.73 shows that in the N -player, general-
sum game setting, every Nash equilibrium is a kind of fixed point

of the mapping B: Δ → 2Δ . This fact, along with a more general


topological fixed point theorem called Kakutani’s fixed point theorem,
is sufficient to establish the following theorem.
Theorem 5.75 (Existence of Nash Equilibria). Let G = (P,
Σ, π) be an N -player game in normal form. Then, G has at least one
Nash equilibrium.
Remark 5.76. The proof based on Kakutani’s fixed point theorem is
neither useful nor satisfying. Moreover, to apply Kakutani’s theorem,
we would have to prove that the mapping B has additional properties,
which we have not discussed. Nash constructed an alternative proof
using Brouwer’s fixed point theorem, following the same steps we
used to prove Theorem 5.63, which we now generalize.

Proof of the Existence of Nash Equilibria. Define the function

J_k^i(\mathbf{x}) = \begin{cases} u_i(\mathbf{e}_k, \mathbf{x}^{-i}) - u_i(\mathbf{x}^{i}, \mathbf{x}^{-i}) & \text{if this quantity is positive,} \\ 0 & \text{otherwise.} \end{cases}  (5.57)
This function measures the benefit of changing to the pure strategy
ek for Player Pi when all other players hold their strategy fixed at
x−i . Now, define the transformation T : Δ → Δ so that

x_j^{i\,\prime} = \frac{x_j^i + J_j^i(\mathbf{x})}{1 + \sum_{k=1}^{n_i} J_k^i(\mathbf{x})}.  (5.58)

This is a generalization of Eqs. (5.50) and (5.51), and it follows from the same reasoning as in the proof of Theorem 5.63 that x_1^{i\,\prime} + \cdots + x_{n_i}^{i\,\prime} = 1 and x_j^{i\,\prime} \geq 0 for all i and j, thus establishing that T maps Δ to itself.
It now follows by the same reasoning as in Theorem 5.63 that
x∗ is a fixed point of T if and only if x∗ is a Nash equilibrium. We
assert that T is continuous by the continuity of the payoff function,
and thus a fixed point for T exists by Brouwer’s fixed point theo-
rem. Consequently, every general-sum game has at least one Nash
equilibrium. This completes the proof. 
Remark 5.77. Unfortunately, this is still not a very useful way to
construct a Nash equilibrium. In the following chapters, we explore

this problem in depth for two-player zero-sum games and then


proceed further to explore the problem for two-player general-sum
games. The story of computing Nash equilibria takes on a life of its
own. It is an important study within computational game theory
and has had a substantial impact on the literature in mathematical
programming (optimization), computer science, and economics.

5.10 Chapter Notes

John F. Nash studied at Princeton University, completing his the-


sis, Non-Cooperative Games, in 1950 and publishing it in 1951 [62].
During this time, Nash also worked on algebraic geometry, differential
geometry, and partial differential equations. The influence these fields
had on his work in game theory is clear from his use of the Kakutani
and Brouwer fixed-point theorems. While he was awarded the Nobel
Prize in Economics for his work on game theory, he is better remem-
bered in some mathematical circles for his work in geometry and par-
tial differential equations. Nash’s embedding theorem [63, 64] shows
that any Riemannian manifold (a fancy shape with special prop-
erties) can be isometrically embedded into some Euclidean space.
Here, isometric means that lengths in the manifold are preserved
during the process of embedding the shape in Euclidean space. By
embedding, we mean a way of placing a shape inside a space. For
example, imagine picking a circle up off a piece of paper and plac-
ing it in midair in front of you. This is a simple embedding of the
circle into R3 . Nash’s approach to the proof worked by solving a
system of partial differential equations and laid the foundations for
what would become the Nash–Moser theorem [65, 66], which gener-
alizes the inverse function theorem (see [67, Ch. 9]). His work, later
recognized as the De Giorgi–Nash theorem, resolved Hilbert’s 19th
problem.
Thanks to the well-known biography A Beautiful Mind by Sylvia
Nasar [68], it is widely known that Nash suffered from schizophrenia
but slowly improved after 1970. The book was made into a movie of
the same name. Interestingly, the depiction of Nash equilibrium in
the movie is fantastically incorrect and is closer to describing Nash’s
bargaining theorem (which we discuss in Chapter 9).

– ♠♣♥♦ –

5.11 Exercises

5.1 Show that the strategy (e2 , e1 ) is an equilibrium for the game in
Example 5.3. That is, show that the strategy (drama, science fiction)
is an equilibrium strategy for the networks.
5.2 Show that (Sail North, Search North) is an equilibrium solution
for the Battle of the Bismark Sea using the approach from Exam-
ple 5.3 and Theorem 5.2.
5.3 Prove Theorem 5.6.
5.4 Show that rock-paper-scissors does not have a saddle-point
strategy.
5.5 Complete the proof of Proposition 5.29, and show explicitly
that u2 (x, y) = xT By.
5.6 Recall from Remark 5.33 that we wrote down what it meant
for a Player 1 strategy to (weakly) dominate another in a two-player
matrix game. Write these conditions for Player 2, assuming that we
have the game G = (P, Σ, A, B). [Hint: Remember, Player 2 multi-
plies on the right-hand side of the payoff matrix. Also, you need to
use B.]
5.7 Show that confess strictly dominates don’t confess for Clyde in
Example 5.34 (prisoner’s dilemma).
5.8 Using Theorem 5.37, state and prove an analogous theorem for
Player 2.
5.9 Consider the game matrix (matrices) in Example 5.3. Show
that this game is strictly dominance solvable. Recall that the game
matrix is

    A = [ −15  −35   10 ]
        [  −5    8    0 ]
        [ −12  −36   20 ] .
[Hint: Start with Player 2 (the column player) instead of Player 1.
Note that column 3 is strictly dominated by column 1, so you can
remove column 3. Proceed from there. You can eliminate two rows
(or columns) at a time.]
5.10 Prove Lemma 5.55. [Hint: Argue that, for all x ∈ Δm and for
all y ∈ Δn , we know that v1 (x) ≤ v2 (y) by showing that v2 (y) ≥
xT Ay ≥ v1 (x). From this, conclude that miny v2 (y) ≥ maxx v1 (x).]

5.11 Suppose that (x∗ , y∗ ) is a Nash equilibrium for G =


(P, Σ, A, B), with A, B ∈ Rm×n , and suppose I is a set of indices so
that i ∈ I if and only if x∗i > 0. Suppose that if x ∈ Δm with the
property that xi > 0 if and only if i ∈ I. Show that

xT Ay∗ = x∗T Ay∗ ,

thus generalizing the indifference theorem. [Hint: Use the indifference


theorem, stating that eTi Ay∗ = x∗ T Ay∗ for all i ∈ I and the fact
that x1 + · · · + xm = 1.]

5.12 Prove Lemma 5.62.

5.13 Consider the football game in Example 4.5. Compute the Nash
equilibrium strategy. [Hint: Find the matrices A and B = −A from
the data. Use the method in Example 5.65 with Player 1, who has two
strategies, and find x∗ = x∗ , 1 − x∗ . Compute the row x∗T B. You
will find that one strategy should be eliminated. This tells you the two
remaining strategies to use. Now, apply the method in Example 5.65
to Player 2 and the two remaining strategies.]

5.14 Verify that the function T in Theorem 5.63 is continuous.


Part 2

Optimization and Game Theory


Chapter 6

An Introduction to Optimization and


the Karush–Kuhn–Tucker Conditions

Chapter Goals: Many of the results we have encountered


in game theory so far can be rephrased using the language of
optimization. Our goal in this chapter is to introduce the basic
elements of optimization theory necessary to derive optimization
problems whose solutions will provide Nash equilibria for matrix
games. Using a running example, we introduce the reader to the
dual variables (called Lagrange multipliers in vector calculus),
convex sets and functions, and the Karush–Kuhn–Tucker neces-
sary conditions for optimality. Unlike in the previous chapters,
we will not prove many of the theorems we state in this chapter,
and instead take them as given. Detailed proofs for the results
(or variations thereof) can be found in Ref. [69] or Ref. [70].
We will see in later chapters that these conditions can be used
to derive the necessary (and sufficient) conditions for a pair of
strategies to be a Nash equilibrium in a two-player matrix game.

6.1 Motivating Example

Example 6.1. We begin with a simple example that can be solved


using techniques from an introductory calculus course. Suppose we
wish to build a pen to keep some goats (or whatever farm animal you
prefer). We are given 100 meters of fencing, and we wish to build the


Fig. 6.1. Goat pen with unknown side lengths: The objective is to identify the
values of x and y that maximize the area of the pen (and thus the number of
goats that can be kept).

pen in a rectangle with the largest possible area. How long should
the sides of the rectangle be to maximize the resulting area?
The problem is illustrated in Fig. 6.1. Assume that we define the
rectangle to have vertex points (0, 0), (x, 0), (0, y), and (x, y), as
illustrated in Fig. 6.1. We know that
2x + 2y = 100 (6.1)
because 2x + 2y is the perimeter of the pen, and we are given 100
meters of fencing. The area of the pen is A(x, y) = xy. We can use
Eq. (6.1) to solve for y in terms of x,
y = 50 − x; (6.2)
therefore, A(x) = x(50 − x). To maximize A(x), recall that we take
the first derivative of A(x) with respect to x, set this derivative to
zero, and solve for x:
    dA/dx = 50 − 2x = 0.          (6.3)
Thus, x = 25 and y = 50 − x = 25. Recall from basic calculus that
we can confirm that this is a maximum by evaluating the second
derivative:

    d²A/dx² |x=25 = −2 < 0.          (6.4)

The fact that this is negative implies that x = 25 is a local maximum


for this function. Another way of seeing this is to note that A(x) =
50x − x2 is an “upside-down” parabola. As we could have guessed,
a square will maximize the area available for holding goats.
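For readers who want to check such a calculation numerically, the following short sketch (not part of the original derivation; it assumes Python's SciPy, a package mentioned later in this chapter) minimizes the negated area over the interval [0, 50]:

from scipy.optimize import minimize_scalar

# Area as a function of x alone, after substituting y = 50 - x from Eq. (6.2).
def neg_area(x):
    return -(x * (50.0 - x))  # negated because minimize_scalar minimizes

res = minimize_scalar(neg_area, bounds=(0.0, 50.0), method="bounded")
print(res.x, -res.fun)  # approximately 25.0 and 625.0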

6.2 A General Maximization Formulation

Remark 6.2. We now generalize the goat pen example to study arbi-
trary optimization problems. The area function is a mapping from R2
to R, written A : R2 → R. The domain of A is the two-dimensional
space R2 , and its range is R. Our objective in Example 6.1 is to max-
imize the function A by choosing values for x and y. In optimization
theory, the function we are trying to maximize (or minimize) is called
the objective function. In general, an objective function is a mapping,
z : D ⊆ Rn → R. Here, D is the domain of the function z, usually
taken to be all of Rn .

Definition 6.3. Let z : D ⊆ Rn → R. The point x∗ ∈ D is a global


maximum for z if for all x ∈ D, z(x∗ ) ≥ z(x). A point x∗ ∈ D is a
local maximum for z if there is a set S ⊆ D with x∗ ∈ S so that for
all x ∈ S, z(x∗ ) ≥ z(x).

Remark 6.4. In Example 6.1, we are constrained in our choice of


x and y by the fact that 2x + 2y = 100. This is called a constraint
in the optimization problem. More specifically, it’s called an equality
constraint. If we did not need to use all the fencing, then we could
write the constraint as 2x + 2y ≤ 100, which is called an inequality
constraint. In complex optimization problems, we can have many
constraints. The set of all points in Rn for which the constraints are
true is called the feasible set (or feasible region). In the goat pen
problem, we wanted to find the best values of x and y to maximize
the area A(x, y). The variables x and y are called decision variables.
We now generalize this.

Definition 6.5. Let z : D ⊆ Rn → R be a function. For


i ∈ {1, . . . , m}, let gi : D ⊆ Rn → R be functions. Finally, for
j ∈ {1, . . . , l}, let hj : D ⊆ Rn → R be functions. Then, the
general maximization problem with objective function z(x1 , . . . , xn ),
inequality constraints gi (x1 , . . . , xn ) ≤ bi (i ∈ {1, . . . , m}), and equality
constraints hj (x1 , . . . , xn ) = rj (j ∈ {1, . . . , l}) is written as

    max  z(x1 , . . . , xn ),
    s.t. g1 (x1 , . . . , xn ) ≤ b1 ,
         ...
         gm (x1 , . . . , xn ) ≤ bm ,          (6.5)
         h1 (x1 , . . . , xn ) = r1 ,
         ...
         hl (x1 , . . . , xn ) = rl .

Remark 6.6. Eq. (6.5) is also called a mathematical programming


problem. Naturally, when constraints are involved, we define the
global and local maxima for the objective function, z(x1 , . . . , xn ),
in terms of the feasible region instead of the domain of D since
we are only concerned with the values of x1 , . . . , xn that satisfy the
constraints.
Example 6.7 (Continuation of Example 6.1). We can rewrite
the problem in Example 6.1 as

    max  A(x, y) = xy,
    s.t. 2x + 2y ≤ 100,          (6.6)
         x ≥ 0,
         y ≥ 0.

Note that we have the option to change the constraint 2x + 2y =


100 to the inequality constraint 2x + 2y ≤ 100. In reality, there is
no requirement to use all the fencing, though that certainly seems
prudent.
We have also added two inequality constraints, x ≥ 0 and
y ≥ 0, because it doesn’t really make any sense to have negative
lengths. We can rewrite these constraints as −x ≤ 0 and −y ≤ 0,
where g1 (x, y) = −x and g2 (x, y) = −y, to make Eq. (6.6) look like
Eq. (6.5). We use this formulation of the problem in later examples.

Remark 6.8. We have formulated the general maximization prob-


lem in Eq. (6.5). Suppose that we are interested in finding a value
that minimizes an objective function z(x1 , . . . , xn ), subject to certain
constraints. Then, we can write Eq. (6.5), replacing max with min.
Remark 6.9. An alternative way of dealing with minimization is
to transform a minimization problem into a maximization prob-
lem. If we want to minimize z(x1 , . . . , xn ), we can maximize
−z(x1 , . . . , xn ). In maximizing the negation of the objective func-
tion, we are actually finding a value that minimizes z(x1 , . . . , xn ).
This is particularly useful when dealing with code for optimization
(e.g., SciPy’s linprog at the time of this writing [71]).
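As a concrete illustration (a sketch only, on a small made-up problem), SciPy's linprog always minimizes, so a maximization problem is passed in with a negated objective vector and the reported optimal value is negated back:

from scipy.optimize import linprog

# Maximize z = 3*x1 + 2*x2 subject to x1 + x2 <= 4 and x1, x2 >= 0.
c = [-3.0, -2.0]                  # negate the objective to maximize
A_ub = [[1.0, 1.0]]
b_ub = [4.0]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)            # optimal point and the (un-negated) maximum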

6.3 Gradients, Constraints, and Optimization

Remark 6.10. We now make use of the relationship between the


gradient of a function, its level sets, and optimization. The details
needed for this are usually taught in a vector calculus class and can
be found in Appendix B.
Definition 6.11 (Binding Constraint). Let g(x) ≤ b be a con-
straint in an optimization problem. If at point x0 ∈ Rn we have
g(x0 ) = b, then the constraint is said to be binding. Equality con-
straints h(x) = r are always binding.
Example 6.12 (Continuation of Example 6.1). Recall Exam-
ple 6.1. Consider the level curves of the objective function z = xy
and the constraint 2x + 2y ≤ 100, which is binding at the optimal
point x = y = 25. In Fig. 6.2, we see the level curves of the objective
function (the hyperbolae) and the feasible region shown as a shaded
triangle. The elements in the feasible regions are all values for x and
y for which 2x + 2y ≤ 100 and x, y ≥ 0. Note that at the point of
optimality, the level curve, defined by the equation
A(x, y) = xy = 625
is tangent to the line 2x + 2y = 100. That is, the level curve of the
objective function is tangent to the binding constraint.
The gradient of A(x, y) = xy at this point is given by

    ∇A(25, 25) = (y, x)|x=25, y=25 = (25, 25).
Fig. 6.2. At optimality, the level curve of the objective function is tangent to
the binding constraints.

We see that it is pointing in the direction of increase for the function


A(x, y), as expected. Now, let g(x, y) = 2x + 2y. Note that

    ∇g(25, 25) = (2, 2).
This gradient is simply a scaled version of the gradient of the objec-
tive function. Let (x∗ , y ∗ ) = (25, 25), and define λ = 25/2. Then, we have

    ∇A(x∗ , y ∗ ) = λ∇g(x∗ , y ∗ ).
That is, the gradient of the objective function is simply a dilation
of the gradient of the binding constraint. This is also illustrated in
Fig. 6.2.
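The scaling relationship can also be checked numerically; the following sketch (not from the original text, assuming NumPy is available) verifies that ∇A(25, 25) = λ∇g(25, 25) with λ = 25/2:

import numpy as np

grad_A = np.array([25.0, 25.0])   # gradient of A(x, y) = xy at (25, 25) is (y, x)
grad_g = np.array([2.0, 2.0])     # gradient of g(x, y) = 2x + 2y
lam = 25.0 / 2.0
print(np.allclose(grad_A, lam * grad_g))  # True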
Remark 6.13. The elements illustrated in the previous example
are true in general. You may recognize λ as a Lagrange multiplier
from vector calculus [72]. We formalize this idea when we discuss
Theorem 6.36, but first we need to divert and discuss a bit more
geometry.

6.4 Convex Sets and Combinations

Remark 6.14. To understand optimization as it applies to the kinds


of game theory we have been discussing, we need to introduce the
concept of convexity.

Definition 6.15 (Convex Set). Let X ⊆ Rn . The set X is convex


if and only if, for all points x1 , x2 ∈ X, we have λx1 + (1 − λ)x2 ∈ X
for all λ ∈ [0, 1].

Remark 6.16. This definition seems complex, but it is easy to


understand. First, recall that if λ ∈ [0, 1], then the point λx1 + (1 −
λ)x2 is on the line segment connecting x1 and x2 in Rn . For example,
when λ = 1/2, then λx1 + (1 − λ)x2 is the midpoint between x1 and
x2 . In fact, for every point x on the line connecting x1 and x2 , we
can find a value λ ∈ [0, 1] so that x = λx1 + (1 − λ)x2 . From this, we
deduce that a set is convex if, given any pair of points x1 , x2 ∈ X,
the line segment connecting these points lies entirely inside X.

Example 6.17. Fig. 6.3 illustrates a convex and non-convex set.


In two dimensions, non-convex sets have some pieces that resemble
crescent shapes.

Theorem 6.18. The intersection of a finite number of convex sets


in Rn is convex.


Fig. 6.3. The set on the left (an ellipse and its interior) is a convex set. Every
pair of points inside the ellipse can be connected by a line contained entirely in
the ellipse. The set on the right is clearly not convex. We have found two points
whose connecting line is not contained inside the set.

Proof. Let C1 , . . . , Cn ⊆ Rn be a finite collection of convex sets.


Let
n
C= Ci (6.7)
i=1

be the set formed from the intersection of these sets. Choose x1 , x2 ∈


C and λ ∈ [0, 1]. Consider x = λx1 +(1−λ)x2 . We know that x1 , x2 ∈
C1 , . . . , Cn by the definition of C. We know that x ∈ C1 , . . . , Cn by
the convexity of each set. Therefore, x ∈ C. Thus, C is a convex
set. 
Definition 6.19 (Linear, Conical, and Convex Combina-
tions). Let x1 , . . . , xm be vectors in Rn , and let α1 , . . . , αm ∈ R
be scalars. Then,
y = α1 x1 + · · · + αm xm (6.8)
is a linear combination of the vectors x1 , . . . , xm .
If α1 , . . . , αm ≥ 0, then Eq. (6.8) is called a conical combination
(or sometimes a non-negative combination) of the vectors. Moreover,
if α1 , . . . , αm ∈ [0, 1] and
α1 + · · · + αm = 1,
then Eq. (6.8) is called a convex combination of the vectors. If αi < 1
for all i ∈ {1, . . . , m}, then Eq. (6.8) is called a strict convex combi-
nation.
Remark 6.20. We can see that we move from the very general
to the very specific as we go from linear combinations to conical
combinations to convex combinations. A linear combination of points
or vectors allows us to choose any real value for the coefficients. A
conical combination restricts us to non-negative values, while a convex
combination asserts that those values must be non-negative and sum
to 1.
Theorem 6.21. Let x1 , . . . , xm be vectors in Rn . Then, the set

    Cone(x1 , . . . , xm ) = { y ∈ Rn : y = α1 x1 + · · · + αm xm , αi ≥ 0 for all i }

is convex.
Fig. 6.4. The cone generated from the vectors x1 = ⟨1, 2⟩ and x2 = ⟨2, 1⟩. This
looks like a squashed version of the cone discussed in elementary school.

Remark 6.22. The set Cone(x1 , . . . , xn ) is called the cone or conical


hull, generated by the vectors x1 , . . . , xn . This geometric structure
appears regularly in the study of optimization.
Example 6.23. The cone generated by the vectors x1 = ⟨1, 2⟩ and
x2 = ⟨2, 1⟩ is shown in Fig. 6.4. It is easy to see through visual
inspection that this set is convex. In three dimensions, a cone looks
more like the “cones” we are familiar with from elementary school
or vector calculus.

6.5 Convex and Concave Functions

Definition 6.24 (Convex Function). A function f : Rn → R is


convex if it satisfies

f (λx1 + (1 − λ)x2 ) ≤ λf (x1 ) + (1 − λ)f (x2 ) (6.9)

for all x1 , x2 ∈ Rn and for all λ ∈ [0, 1]. The function is strictly
convex if the inequality is replaced with a strict inequality.
Fig. 6.5. A convex function satisfies the expression f (λx1 + (1 − λ)x2 ) ≤
λf (x1 ) + (1 − λ)f (x2 ) for all x1 and x2 and λ ∈ [0, 1].

Example 6.25. A convex function is illustrated in Fig. 6.5. Here,


we see that the graph of the function lies below all of its secant lines.
Recall that a secant line connects two points on the graph of the
function.
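As a quick numerical illustration (a sketch only; the function f(x) = x² is our own choice and not necessarily the one plotted in Fig. 6.5), one can sample the inequality in Definition 6.24 at random points:

import numpy as np

rng = np.random.default_rng(0)
f = lambda x: x**2  # a familiar convex function

ok = True
for _ in range(1000):
    x1, x2 = rng.uniform(-5, 5, size=2)
    lam = rng.uniform(0, 1)
    lhs = f(lam * x1 + (1 - lam) * x2)
    rhs = lam * f(x1) + (1 - lam) * f(x2)
    ok = ok and (lhs <= rhs + 1e-12)
print(ok)  # True: every sampled secant lies above the graph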
Definition 6.26 (Concave Function). A function f : Rn → R is
concave if it satisfies

f (λx1 + (1 − λ)x2 ) ≥ λf (x1 ) + (1 − λ)f (x2 ) (6.10)

for all x1 , x2 ∈ Rn and for all λ ∈ [0, 1]. The function is strictly
concave if the inequality is replaced with a strict inequality.
Remark 6.27. To visualize this definition, simply flip Fig. 6.5 upside
down. The graphs of concave functions lie above their secant lines.
In calculus, these functions are called “concave down,” while convex
functions are called “concave up.” Those terms are not used outside
of calculus, and it is best to ignore (or forget) them in favor of the
terms convex and concave.
Remark 6.28. The following theorem relates convex functions and
convex sets. Its proof is outside the scope of this book but can be
found in Ref. [69]. Note that there is no such thing as a concave set.
Theorem 6.29. Let f : Rn → R be a convex function. Then, the set
C = {x ∈ Rn : f (x) ≤ c}, where c ∈ R, is a convex set.

Definition 6.30 (Linear Function). A function g : Rn → R is


linear if there are constants c1 , . . . , cn ∈ R so that

g(x1 , . . . , xn ) = c1 x1 + · · · + cn xn . (6.11)

Example 6.31. We have had experience with many linear functions


already. The left-hand side of the constraint 2x + 2y ≤ 100 is a linear
function. That is, the function g(x, y) = 2x + 2y is a linear function
of x and y.

Remark 6.32. It is worth noting that linear functions can be defined


in a much more general context. This is usually handled in a linear
algebra course. See Ref. [73] for details. For us, Definition 6.30 is
more than sufficient.

Definition 6.33 (Affine Function). A function g : Rn → R is


affine if g(x) = l(x) + b, where l : Rn → R is a linear function and
b ∈ R.

Remark 6.34. It is easy to see that every linear function is affine


by setting b = 0. While linear functions are easier to deal with, it is
often easier to state theorems in terms of affine functions, especially
when these affine functions appear as constraints.

6.6 Karush–Kuhn–Tucker Conditions

Remark 6.35. We now make use of all the machinery we have con-
structed to state the Karush–Kuhn–Tucker (KKT) theorem. This
powerful theorem provides the necessary conditions for when a point
x∗ ∈ Rn will maximize an objective function z(x), subject to some
constraints given by additional functions. We state this theorem
(and its corollaries) but do not prove them. Proofs can be found
in Ref. [69, 70].

Theorem 6.36. Let z : Rn → R be a differentiable objective


function, gi : Rn → R be differentiable constraint functions for
i ∈ {1, . . . , m}, and hj : Rn → R be differentiable constraint functions

for j ∈ {1, . . . , l}. If x∗ ∈ Rn is an optimal point satisfying an appro-


priate regularity condition for the following optimization problem,



    P :  max  z(x1 , . . . , xn ),
         s.t. g1 (x1 , . . . , xn ) ≤ 0,
              ...
              gm (x1 , . . . , xn ) ≤ 0,
              h1 (x1 , . . . , xn ) = 0,
              ...
              hl (x1 , . . . , xn ) = 0,

then there are constants λ1 , . . . , λm ∈ R and μ1 , . . . , μl ∈ R so that
we have the following:

Primal feasibility:
    gi (x∗ ) ≤ 0 for i ∈ {1, . . . , m},
    hj (x∗ ) = 0 for j ∈ {1, . . . , l}.

Dual feasibility:
    ∇z(x∗ ) − ∑_{i=1}^{m} λi ∇gi (x∗ ) − ∑_{j=1}^{l} μj ∇hj (x∗ ) = 0,
    λi ≥ 0 for i ∈ {1, . . . , m},
    μj ∈ R for j ∈ {1, . . . , l}.

Complementary slackness:
    λi gi (x∗ ) = 0 for i ∈ {1, . . . , m}.

Remark 6.37. The regularity condition mentioned in Theorem 6.36


is sometimes called a constraint qualification. A common one is that
the gradients of the binding constraints are all linearly independent
at x∗ . There are many variations of constraint qualifications. These
are detailed in texts on optimization [69, 70]. Suffice it to say that
all the problems we consider will automatically satisfy a constraint
qualification, meaning that the KKT theorem holds.
Remark 6.38. The KKT theorem provides the necessary conditions
for a point to be an optimal solution to a constrained maximization

problem. The following theorem provides a sufficient condition for


a point to be a global optimal solution to a constrained maximiza-
tion problem. You will note that the conditions are nearly identical –
it is the order of implication that has changed, as has the imposition
of concavity and convexity requirements on the functions involved in
the problem.
Theorem 6.39. Let z : Rn → R be a differentiable concave function,
gi : Rn → R be differentiable convex functions for i ∈ {1, . . . , m}, and
hj : Rn → R be affine functions for j ∈ {1, . . . , l}. Suppose there are
constants λ1 , . . . , λm ∈ R and μ1 , . . . μl ∈ R so that we have the
following:
Primal feasibility:
    gi (x∗ ) ≤ 0 for i ∈ {1, . . . , m},
    hj (x∗ ) = 0 for j ∈ {1, . . . , l}.

Dual feasibility:
    ∇z(x∗ ) − ∑_{i=1}^{m} λi ∇gi (x∗ ) − ∑_{j=1}^{l} μj ∇hj (x∗ ) = 0,
    λi ≥ 0 for i ∈ {1, . . . , m},
    μj ∈ R for j ∈ {1, . . . , l}.

Complementary slackness:
    λi gi (x∗ ) = 0 for i ∈ {1, . . . , m}.

Then, x∗ is a global maximum for the problem

    P :  max  z(x1 , . . . , xn ),
         s.t. g1 (x1 , . . . , xn ) ≤ 0,
              ...
              gm (x1 , . . . , xn ) ≤ 0,
              h1 (x1 , . . . , xn ) = 0,
              ...
              hl (x1 , . . . , xn ) = 0.
Remark 6.40. The values λ1 , . . . , λm and μ1 , . . . , μl are some-
times called Lagrange multipliers or dual variables. Primal feasibil-
ity, dual feasibility, and complementary slackness are called the KKT
conditions.

Remark 6.41. Note that Theorem 6.36 is true even if z(x) is not
concave, the functions gi (x) (i ∈ {1, . . . , m}) are not convex, or the
functions hj (x) (j ∈ {1, . . . , l}) are not linear. This is because it is
the direction of the implication in Theorem 6.36 that matters. In
particular, the fact that a triple, (x, λ, μ) ∈ Rn × Rm × Rl , satisfies
the KKT conditions does not ensure that this is an optimal solution
for problem P . However, finding such triples and then evaluating the
objective function at x∗ is one way to solve optimization problems.
Remark 6.42. Looking more closely at the dual feasibility condi-
tions, we see something interesting. Suppose that there are no equal-
ity constraints, i.e., no constraints of the form hj (x) = 0. Then, the
statements

    ∇z(x∗ ) − ∑_{i=1}^{m} λi ∇gi (x∗ ) − ∑_{j=1}^{l} μj ∇hj (x∗ ) = 0,
    λi ≥ 0 for i ∈ {1, . . . , m}

imply that

    ∇z(x∗ ) = ∑_{i=1}^{m} λi ∇gi (x∗ ),
    λi ≥ 0 for i ∈ {1, . . . , m}.


Specifically, this says that the gradient of z at x∗ is a conical com-
bination of the gradients of the constraints at x∗ . But more impor-
tantly, since we also have complementary slackness, we know that if
gi (x∗ ) = 0, then λi = 0 because λi gi (x∗ ) = 0 for i = 1, . . . , m. Thus,
what dual feasibility is really saying is that the gradient of z at x∗
is a conical combination of the gradients of the binding constraints
at x∗ . Remember that a constraint is binding if gi (x∗ ) = 0, in which
case λi ≥ 0.
Remark 6.43. Continuing from the previous remark, in the general
case when we have some equality constraints, then dual feasibility
states that
    ∇z(x∗ ) = ∑_{i=1}^{m} λi ∇gi (x∗ ) + ∑_{j=1}^{l} μj ∇hj (x∗ ),
    λi ≥ 0 for i ∈ {1, . . . , m},
    μj ∈ R for j ∈ {1, . . . , l}.

Since equality constraints are always binding, this states that the
gradient of z at x∗ is a linear combination of the gradients of the
binding constraints at x∗ .
Example 6.44. We now come full circle and revisit Example 6.1.
First, we reformulate the problem to match the style of Theorem 6.36:

    max  A(x, y) = xy,
    s.t. 2x + 2y − 100 = 0,          (6.12)
         −x ≤ 0,
         −y ≤ 0.

Note that the greater-than inequalities x ≥ 0 and y ≥ 0 in Eq. (6.6)


have been changed to less-than inequalities by multiplying them by
−1. The constraint 2x + 2y = 100 has simply been transformed to
2x + 2y − 100 = 0. Thus, if h(x, y) = 2x + 2y − 100, we can see that
h(x, y) = 0 is our constraint. We can let g1 (x, y) = −x and g2 (x, y) =
−y. Then, we have g1 (x, y) ≤ 0 and g2 (x, y) ≤ 0 as our inequality
constraints. We already know that x = y = 25 is our optimal solution.
Thus, we know that there must be Lagrange multipliers μ, λ1 , and
λ2 corresponding to the constraints h(x, y) = 0, g1 (x, y) ≤ 0, and
g2 (x, y) ≤ 0 that satisfy the KKT conditions.
Let’s investigate the three components of the KKT conditions:
Primal feasibility. If x = y = 25, then h(25, 25) = 0, g1 (25, 25) =
−25 ≤ 0, and g2 (25, 25) = −25 ≤ 0. So, primal feasibility is
satisfied.
Complementary slackness. We know that g1 (x, y) = g2 (x, y) =
−25. Since neither of these functions is 0, we know that
λ1 = λ2 = 0. This will force the conditions of complementary
slackness to hold, namely,

λ1 g1 (25, 25) = 0,
λ2 g2 (25, 25) = 0.

Dual feasibility. We now know that λ1 = λ2 = 0. That means we


need to find μ ∈ R so that

∇A(25, 25) − μ∇h(25, 25) = 0.


We know that

    ∇A(x, y) = ∇(xy) = (y, x),
    ∇h(x, y) = ∇(2x + 2y − 100) = (2, 2).

Evaluating ∇A(25, 25) − μ∇h(25, 25) = 0 yields

    (25, 25) − μ(2, 2) = (0, 0).

Thus, setting μ = 25/2 will accomplish our goal.
We have identified the Lagrange multipliers λ1 = 0, λ2 = 0, and
μ = 25/2 corresponding to the optimal point x = y = 25.
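The multipliers found above can be checked numerically. The following sketch (not part of the text, assuming NumPy is available) verifies primal feasibility, dual feasibility, and complementary slackness at x = y = 25 with λ1 = λ2 = 0 and μ = 25/2:

import numpy as np

x = y = 25.0
lam1, lam2, mu = 0.0, 0.0, 25.0 / 2.0

grad_A  = np.array([y, x])        # gradient of A(x, y) = xy
grad_g1 = np.array([-1.0, 0.0])   # gradient of g1(x, y) = -x
grad_g2 = np.array([0.0, -1.0])   # gradient of g2(x, y) = -y
grad_h  = np.array([2.0, 2.0])    # gradient of h(x, y) = 2x + 2y - 100

stationarity = grad_A - lam1 * grad_g1 - lam2 * grad_g2 - mu * grad_h
print(np.allclose(stationarity, 0.0))                 # dual feasibility (stationarity)
print(2*x + 2*y - 100 == 0 and -x <= 0 and -y <= 0)   # primal feasibility
print(lam1 * (-x) == 0 and lam2 * (-y) == 0)          # complementary slackness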

6.7 Relating Back to Game Theory

Remark 6.45. Game theory and optimization theory are intimately


related. When you play a game, you’re trying to maximize your pay-
off, subject to constraints on your moves and to the actions of the
other players. We can formalize this idea.
Remark 6.46. Consider a game G = (P, Σ, π) in normal form.
Assume that P = {P1 , . . . , PN } and Σi = {σ1i , . . . , σni i }. If we assume
a fixed mixed strategy x ∈ Δ, Player Pi ’s objective when choosing a
response xi ∈ Δni is to solve the following problem:


    Player Pi :  max  ui (xi , x−i ),
                 s.t. xi1 + · · · + xini = 1,          (6.13)
                      xij ≥ 0,  j ∈ {1, . . . , ni }.

Recall that x−i are the strategies for all the players except for
Player i. Eq. (6.13) is a mathematical programming problem, pro-
vided that ui (xi , x−i ) is known. This specific optimization problem
assumes that all players, except for Player i, are holding their strat-
egy constant, e.g., playing x−i . In reality, each player is solving a

version of this problem simultaneously. Thus, an equilibrium solution


solves the following simultaneous system of optimization problems:


    ∀i :  max  ui (xi , x−i ),
          s.t. xi1 + · · · + xini = 1,          (6.14)
               xij ≥ 0,  j ∈ {1, . . . , ni }.
This formulation leads to a rich class of problems in mathematical
programming, which we discuss in the following chapters. However,
we can easily illustrate this for a two-strategy game.
Example 6.47 (Finding the Nash equilibrium of the Battle
of Avranches). Consider the Player 1 game matrix from the Battle
of Avranches:

    A = [ 1  5 ]
        [ 6  4 ] .

Let x = ⟨x, 1 − x⟩ and y = ⟨y, 1 − y⟩, where x, y ∈ [0, 1]. We can
construct the payoff function for Player 1 as

    u1 (x, y) = xT Ay = 4 + x + 2y − 6xy.          (6.15)
Recall that Player 1 wishes to maximize this function, while Player 2
wishes to minimize this function because this is a zero-sum game,
where the gains of Player 1 are the losses of Player 2, and vice versa.
For a moment, we ignore the constraints that 0 ≤ x ≤ 1 and 0 ≤
y ≤ 1 and simply compute the necessary conditions for optimality.
In this case, we have

    Player 1:  ∂u1 /∂x = 1 − 6y = 0,
    Player 2:  ∂u1 /∂y = 2 − 6x = 0.
Note that this could simply be written as
∇u1 = 0,
where ∇ is computed in terms of x and y.
We can solve the equations for Player 1 and Player 2 to obtain
x∗ = 1/3 and y ∗ = 1/6. This yields a candidate Nash equilibrium solution:

    x∗ = ⟨1/3, 2/3⟩,    y∗ = ⟨1/6, 5/6⟩.

One can easily verify whether this is a Nash equilibrium by let-


ting ⟨x, 1 − x⟩ and ⟨y, 1 − y⟩ be arbitrary strategies for Player 1 and
Player 2, respectively, and computing that

    13/3 = ⟨1/3, 2/3⟩ A ⟨1/6, 5/6⟩T ≥ ⟨x, 1 − x⟩ A ⟨1/6, 5/6⟩T = 13/3,
    13/3 = ⟨1/3, 2/3⟩ A ⟨1/6, 5/6⟩T ≤ ⟨1/3, 2/3⟩ A ⟨y, 1 − y⟩T = 13/3.

Thus, neither player has an incentive to change strategies,1 and so


the strategy we identified must be a Nash equilibrium.
We saw this solution visualized in Fig. 5.7. The equilibrium occurs
at a saddle point of the function. This is a point that is simultane-
ously a minimum in one direction and a maximum in another direc-
tion, justifying the notion that one player is pushing the value of the
payoff down and the other player is pushing its value up.
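The verification above is easy to repeat numerically. The sketch below (not from the text, assuming NumPy) confirms that the payoff is 13/3 at the candidate equilibrium and that neither player can improve by deviating:

import numpy as np

A = np.array([[1.0, 5.0], [6.0, 4.0]])
x_star = np.array([1/3, 2/3])
y_star = np.array([1/6, 5/6])

print(x_star @ A @ y_star)  # 13/3 = 4.333...

for p in np.linspace(0.0, 1.0, 11):
    x = np.array([p, 1 - p])  # an arbitrary Player 1 strategy
    y = np.array([p, 1 - p])  # an arbitrary Player 2 strategy
    assert x @ A @ y_star <= x_star @ A @ y_star + 1e-12   # Player 1 cannot gain
    assert x_star @ A @ y >= x_star @ A @ y_star - 1e-12   # Player 2 cannot gain
print("no profitable deviation found")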

6.8 Chapter Notes

It has only been in the past few decades that the KKT theorem has
been named after Karush, Kuhn, and Tucker. Before this, the theo-
rem was named after only Harold Kuhn and Albert Tucker [74]. It
was rediscovered [75] that William Karush, in fact, published a ver-
sion of the theorem in 1939 as part of his master’s thesis [76], though
Kuhn and Tucker did not try to hide this and were aware of the prior
work in their 1951 publication. The theorem has subsequently been
renamed the Karush–Kuhn–Tucker theorem in popular use, though

Footnote 1: You can also use Nash's map, T : Δ → Δ, to show that this point is a Nash
equilibrium, if you like.

older texts will still refer to it as the Kuhn–Tucker theorem. Interest-


ingly, Karush worked for the Manhattan Project during World War
II, ultimately becoming a devout advocate for peace [75].
Harold W. Kuhn was a mathematician and colleague of John Nash
(after whom the Nash equilibrium is named) and was instrumental in
bringing Nash’s work to the attention of the Nobel Prize committee
[77]. In addition to working with Albert W. Tucker (Nash’s advisor)
[77], he is also known for his general work on optimization, including
the “Hungarian Algorithm” [78], which provides a solution to the
assignment problem and can be used for various practical purposes.
There are ways to generalize the various theorems presented in
this chapter. The most common way is to consider generalizations
of convex functions to “quasi-convex” or “pseudo-convex” functions
[70]. These generalizations still allow the KKT conditions to be suffi-
cient for an optimal solution. Convex optimization is a considerable
field of study in itself. Boyd and Vandenberghe’s text on the subject
provides a thorough starting point [79].
From a practical perspective, optimization is one of the three pri-
mary pillars of machine learning [80] (the other two being probabil-
ity theory and linear algebra, for use in optimization). Prior to this,
substantial work in numerical optimization (i.e., finding optimal solu-
tions to real-world problems) has resulted in several algorithms for
identifying solutions to optimization problems. Nocedal and Wright
provide a thorough introduction to this topic [81]. In the coming
chapters, we exploit the theory of optimization (in the form of the
KKT conditions), along with existing numerical algorithms, to find
methods for solving matrix games.

– ♠♣♥♦ –

6.9 Exercises

6.1 Find the radius and height of a cylinder that has minimal sur-
face area, assuming that its volume must be 27 cubic units. [Hint:
Suppose that the can has radius r and height h. The formula for the
surface area of a cylinder is 2πrh + 2πr². The volume of a cylinder
is πr²h and is constrained to be equal to 27.]

6.2 Write the problem from Exercise 6.1 as a general minimization


problem. Add any appropriate non-negativity constraints. [Hint: You
must change max to min.]

6.3 Plot the level sets of the objective function and the feasible
region in Exercise 6.1. At the point of optimality you identified, show
that the gradient of the objective function is a scaled version of the
gradient (linear combination) of the binding constraints.

6.4 Find the values of the dual variables for the optimal point in
Exercise 6.1. Show that the KKT conditions hold for the values you
found.

6.5 Using analogous reasoning, write the definitions for global and
local minima. [Hint: Think about what a minimum means and find
the correct direction for the ≥ sign in the definition above.]

6.6 Prove Theorem 6.29.

6.7 Prove that every affine function is both convex and concave.
Chapter 7

Linear Programming and


Zero-Sum Games

Chapter Goals: In this chapter, we show how to convert any


two-player zero-sum matrix game into an optimization prob-
lem that has a linear objective function and linear constraints.
Such a problem is called a linear programming problem (for his-
torical reasons). Readily available computer software can then
be used to find the Nash equilibria as optimal solutions to
these optimization problems. Thus, we provide an algorithmic
way to solve a two-player zero-sum game with any number
of strategies. To accomplish this, we explore a special prop-
erty of linear programming problems, namely the fact that
each linear programming problem has a partner linear program-
ming problem called its dual. These two problems will provide
the Nash equilibria for the two players in a zero-sum matrix
game.


7.1 Linear Programs

Definition 7.1. A linear programming problem is an optimization


problem with the following form:

    max  z(x1 , . . . , xn ) = c1 x1 + · · · + cn xn ,
    s.t. a11 x1 + · · · + a1n xn ≤ b1 ,
         ...
         am1 x1 + · · · + amn xn ≤ bm ,          (7.1)
         h11 x1 + · · · + h1n xn = r1 ,
         ...
         hl1 x1 + · · · + hln xn = rl .
Here, aij , hkj , bi , and rk are all real numbers for i ∈ {1, . . . , m},
k ∈ {1, . . . , l}, and j ∈ {1, . . . , n}.
Remark 7.2. We can use matrices to write these problems more
compactly. Consider the following system of equations:

    a11 x1 + a12 x2 + · · · + a1n xn = b1 ,
    a21 x1 + a22 x2 + · · · + a2n xn = b2 ,
    ...                                          (7.2)
    am1 x1 + am2 x2 + · · · + amn xn = bm .
Then, we can write this in matrix notation as
Ax = b, (7.3)
where Aij = aij for i ∈ {1, . . . , m}, j ∈ {1, . . . , n}, x is a column
vector in Rn with entries xj for j ∈ {1, . . . , n}, and b is a column
vector in Rm with entries bi for i ∈ {1, . . . , m}. If we replace the
equalities in Eq. (7.3) with inequalities, we can also express systems
of inequalities in the form
Ax ≤ b. (7.4)
Using this representation, we can write our general linear pro-
gramming problem using matrix and vector notation. Equation (7.1)

becomes


    max  z(x) = cT x,
    s.t. Ax ≤ b,          (7.5)
         Hx = r.
Here, cT is the transpose of the column vector c. We use this notation
later to simplify our analysis.
Example 7.3. Consider the problem of a toy company that produces
toy planes and toy boats. The toy company can sell its planes for $10
and its boats for $8. It costs $3 in raw materials to make a plane and
$2 in raw materials to make a boat. A plane requires 3 hours to make
and 1 hour to finish, while a boat requires 1 hour to make and 2 hours
to finish. The toy company knows that it will not sell any more
than 35 planes per week. Further, given the number of workers, the
company cannot spend any more than 160 hours per week finishing
toys and 120 hours per week making toys. The company wishes to
maximize the profit it makes by choosing how much of each toy to
produce.
We can represent the profit maximization problem of the company
as a linear programming problem. Let x1 be the number of planes
the company will produce, and let x2 be the number of boats the
company will produce. The profit for each plane is $10 − $3 = $7 per
plane, and the profit for each boat is $8 − $2 = $6 per boat. Thus,
the total profit the company will make is
z(x1 , x2 ) = 7x1 + 6x2 . (7.6)
The company can spend no more than 120 hours per week making
toys, and since a plane takes 3 hours to make and a boat takes 1 hour
to make, we have
3x1 + x2 ≤ 120. (7.7)
Likewise, the company can spend no more than 160 hours per week
finishing toys, and since it takes 1 hour to finish a plane and 2 hours
to finish a boat, we have
x1 + 2x2 ≤ 160. (7.8)
Finally, we know that x1 ≤ 35 since the company will make no more
than 35 planes per week. Thus, the complete linear programming
problem is given by

    max  z(x1 , x2 ) = 7x1 + 6x2 ,
    s.t. 3x1 + x2 ≤ 120,
         x1 + 2x2 ≤ 160,          (7.9)
         x1 ≤ 35,
         x1 ≥ 0,
         x2 ≥ 0.

Remark 7.4. In strict terms, the linear programming problem in


Example 7.3 is not a true linear programming problem because we
don’t want to manufacture a fractional number of boats or planes;
therefore, x1 and x2 must be drawn from the integers and not the
real numbers (a requirement for a linear programming problem). This
type of problem is called an integer programming problem [82]. In
Example 7.3, we ignore this fact and assume that we can indeed
manufacture a fractional number of boats and planes.

7.2 Intuition on the Solution of Linear Programs

Remark 7.5. Linear programs (LPs) with two variables can be


solved graphically by plotting the feasible region along with the level
curves of the objective function. We illustrate the method using the
problem from Example 7.3.
Example 7.6 (Continuation of Example 7.3). To solve the lin-
ear programming problem from Example 7.3 graphically, begin by
drawing the feasible region. This is shown in the blue shaded region
of Fig. 7.1. The dashed lines are labeled by the corresponding con-
straints and make up the boundary of the feasible region, including
the constraints x1 , x2 ≥ 0.
After plotting the feasible region, the next step is to plot the level
curves of the objective function. In our problem, the level sets have
the form
    7x1 + 6x2 = c  =⇒  x2 = −(7/6)x1 + c/6.

Fig. 7.1. The shaded region in the plot is the feasible region and represents
the intersection of the five inequalities constraining the values of x1 and x2 . The
optimal solution is the “last” point in the feasible region that intersects a level
set as we move in the direction of increasing profit.

This is a collection of parallel lines with slope −7/6 and intercept c/6,


where c is varied as needed. In Fig. 7.1, the level curves are shown
in colors ranging from purple to red depending upon the value of c.
Larger values of c are closer to red.
To solve the linear programming problem, follow the level sets
along the gradient (shown as the black arrow) until the last level set
(line) intersects the feasible region. When doing this by hand, you
can draw a single line of the form 7x1 +6x2 = c and then simply draw
parallel lines in the direction of the gradient (7, 6). At some point,
these lines will fail to intersect the feasible region. The last line to
intersect the feasible region will do so at a point that maximizes the
profit. In this case, the point that maximizes z(x1 , x2 ) = 7x1 + 6x2 ,
subject to the constraints given, is (x∗1 , x∗2 ) = (16, 72).
Note that the point of optimality (x∗1 , x∗2 ) = (16, 72) is at a corner
of the feasible region. This corner is formed by the intersection of the
two lines 3x1 + x2 = 120 and x1 + 2x2 = 160. These lines correspond

to the constraints

3x1 + x2 ≤ 120,
x1 + 2x2 ≤ 160.

At this point, both of these constraints are binding, while the other
constraints are non-binding. (See Definition 6.11.) In general, when
an optimal solution to a linear programming problem exists, it is at
the intersection of several binding constraints; that is, it will occur
at a vertex of a higher-dimensional polyhedron or polytope.
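The graphical answer can also be confirmed with a solver. The following sketch (not part of the text) passes the toy maker's problem to SciPy's linprog, negating the objective as discussed in Remark 6.9:

from scipy.optimize import linprog

c = [-7.0, -6.0]                  # negate to maximize 7*x1 + 6*x2
A_ub = [[3.0, 1.0],               # 3*x1 +   x2 <= 120
        [1.0, 2.0],               #   x1 + 2*x2 <= 160
        [1.0, 0.0]]               #   x1        <= 35
b_ub = [120.0, 160.0, 35.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)            # approximately [16. 72.] and 544.0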

7.2.1 Karush–Kuhn–Tucker conditions for linear programs
Remark 7.7. As with any mathematical programming problem, we
can derive the Karush–Kuhn–Tucker (KKT) conditions for a linear
programming problem. We illustrate this by deriving the KKT con-
ditions for Example 7.3. Note that, since linear (affine) functions are
both convex and concave functions, we know that finding Lagrange
multipliers satisfying the KKT conditions is necessary and sufficient
for proving that a point is an optimal solution.

Example 7.8. Let z(x1 , x2 ) = 7x1 + 6x2 , the objective function in


Eq. (7.9). We have argued that the point of optimality is (x∗1 , x∗2 ) =
(16, 72). The KKT conditions for Eq. (7.9) are as follows:
Primal feasibility
                                                  Lagrange Multiplier
    g1 (x∗1 , x∗2 ) = 3x∗1 + x∗2 − 120 ≤ 0             (λ1 ),
    g2 (x∗1 , x∗2 ) = x∗1 + 2x∗2 − 160 ≤ 0             (λ2 ),
    g3 (x∗1 , x∗2 ) = x∗1 − 35 ≤ 0                     (λ3 ),          (7.10)
    g4 (x∗1 , x∗2 ) = −x∗1 ≤ 0                         (λ4 ),
    g5 (x∗1 , x∗2 ) = −x∗2 ≤ 0                         (λ5 ).
Dual feasibility

    ∇z(x∗1 , x∗2 ) − ∑_{i=1}^{5} λi ∇gi (x∗1 , x∗2 ) = ⟨0, 0⟩,          (7.11)
    λi ≥ 0,  i = 1, . . . , 5.
Complementary slackness
    λi gi (x∗1 , x∗2 ) = 0,  i = 1, . . . , 5.          (7.12)
The vector ⟨0, 0⟩ occurs in our dual feasibility conditions because the
gradients of our functions will all be two-dimensional vectors (there
are two variables). Specifically, we can compute

    ∇z(x∗1 , x∗2 ) = ⟨7, 6⟩,
    ∇g1 (x∗1 , x∗2 ) = ⟨3, 1⟩,
    ∇g2 (x∗1 , x∗2 ) = ⟨1, 2⟩,
    ∇g3 (x∗1 , x∗2 ) = ⟨1, 0⟩,
    ∇g4 (x∗1 , x∗2 ) = ⟨−1, 0⟩,
    ∇g5 (x∗1 , x∗2 ) = ⟨0, −1⟩.
Note that g3 (16, 72) = 16 − 35 = −19 ≠ 0. This means that for
complementary slackness to be satisfied, we must have λ3 = 0. By
the same reasoning, λ4 = 0 because g4 (16, 72) = −16 ≠ 0 and λ5 = 0
because g5 (16, 72) = −72 ≠ 0. Thus, dual feasibility can be simplified
to
    ⟨7, 6⟩ − λ1 ⟨3, 1⟩ − λ2 ⟨1, 2⟩ = ⟨0, 0⟩,          (7.13)
    λ1 , λ2 ≥ 0.
This is just a set of linear equations with some non-negativity con-
straints, which we ignore for now. We have
7 − 3λ1 − λ2 = 0 =⇒ 3λ1 + λ2 = 7, (7.14)
6 − λ1 − 2λ2 = 0 =⇒ λ1 + 2λ2 = 6. (7.15)
We can solve these linear equations (and hope that the solution is
positive). Doing so yields

    λ1 = 8/5 ≥ 0,          (7.16)
    λ2 = 11/5 ≥ 0.         (7.17)

Thus, we have found a KKT point:

    x∗1 = 16,  x∗2 = 72,
    λ1 = 8/5,  λ2 = 11/5,          (7.18)
    λ3 = λ4 = λ5 = 0.
This proves (via Theorem 6.36) that the point we found graphically
is, in fact, the optimal solution to Eq. (7.9).
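Equations (7.14) and (7.15) form a small linear system, which can of course also be solved by machine; a sketch (not from the text) using NumPy:

import numpy as np

M = np.array([[3.0, 1.0],     # 3*lam1 +   lam2 = 7
              [1.0, 2.0]])    #   lam1 + 2*lam2 = 6
rhs = np.array([7.0, 6.0])
lam = np.linalg.solve(M, rhs)
print(lam)                     # [1.6 2.2], i.e., lam1 = 8/5 and lam2 = 11/5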

7.2.2 Problems with an infinite number of solutions


Remark 7.9. Linear programming problems can have an infinite
number of solutions. We can construct such a problem by modifying
the objective function in Example 7.3.
Example 7.10. Suppose the toy maker in Example 7.3 finds that
he or she can sell planes for a profit of $18 each instead of $7 each.
The new linear programming problem becomes


    max  z(x1 , x2 ) = 18x1 + 6x2 ,
    s.t. 3x1 + x2 ≤ 120,
         x1 + 2x2 ≤ 160,          (7.19)
         x1 ≤ 35,
         x1 ≥ 0,
         x2 ≥ 0.

Fig. 7.2. An example of infinitely many alternative optimal solutions in a linear


programming problem: The level curves for z(x1 , x2 ) = 18x1 + 6x2 are parallel
to one face of the polygonal boundary of the feasible region. Moreover, this side
contains the points of greatest value for z(x1 , x2 ) inside the feasible region. Any
combination of (x1 , x2 ) on the line 3x1 + x2 = 120 for x1 ∈ [16, 35] will provide
the largest possible value z(x1 , x2 ) can take in the feasible region.

Applying the graphical method for finding optimal solutions to linear


programming problems yields the plot shown in Fig. 7.2. The level
curves for the function z(x1 , x2 ) = 18x1 + 6x2 are parallel to one
face of the polygonal boundary of the feasible region. In particular,
they are parallel to the line defined by the equation 3x1 + x2 = 120.
Hence, as we move further up and to the right in the direction of the
gradient, corresponding to larger and larger values of z(x1 , x2 ), we
see that there is not one point on the boundary of the feasible region
that intersects that level set with the greatest value, but instead a
side of the polygonal boundary described by the line 3x1 + x2 = 120,
where x1 ∈ [16, 35] intersects the level set with the greatest value.
Let

S = {(x1 , x2 ) | 3x1 + x2 ≤ 120, x1 + 2x2 ≤ 160, x1 ≤ 35, x1 , x2 ≥ 0},

that is, S is the feasible region of the problem. Then, for any value
of x∗1 ∈ [16, 35] and any value x∗2 so that 3x∗1 + x∗2 = 120, we will have

z(x∗1 , x∗2 ) ≥ z(x1 , x2 ) for all (x1 , x2 ) ∈ S. Since x1 and x2 may take
on infinitely many values, we see that this problem has an infinite
number of alternative optimal solutions.
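A quick check (a sketch, not from the text) of two corners of this optimal face shows that they share the same objective value, as do all points between them:

z = lambda x1, x2: 18 * x1 + 6 * x2

# Two feasible points on the line 3*x1 + x2 = 120 with x1 in [16, 35].
print(z(16, 72), z(35, 15))   # 720 and 720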
Remark 7.11 (Other Possibilities). In addition to the two sce-
narios above, in which a linear programming problem has a unique
solution or an infinite number of alternative optimal solutions, there
are two other possibilities:
(1) A linear programming problem can have no solution, which
occurs when the feasible region is empty. (The problem is said
to be infeasible.) An empty feasible region occurs if there are
inconsistent constraints. The simplest examples of such a situa-
tion are the constraints x ≥ 0 and x ≤ −1. When dealing with
large linear programming problems, it is sometimes difficult to
spot (visually) inconsistent constraints, though there are tests to
determine whether constraints are inconsistent.
(2) The linear programming problem can have an unbounded solu-
tion, which can occur if the feasible region is an unbounded set
and there are feasible solutions that make the objective function
arbitrarily large in a maximization problem or arbitrarily small
in a minimization problem.
Fortunately, we will not encounter either of those situations in our
study of zero-sum games, and so we can ignore these possibilities.
The interested reader can consult Ref. [78] or a related text on linear
programming for details.
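Solvers report these situations rather than returning a point. For instance, the following sketch (not from the text) feeds the inconsistent constraints x ≥ 0 and x ≤ −1 from item (1) to SciPy's linprog:

from scipy.optimize import linprog

# Minimize x subject to x <= -1 and x >= 0: the feasible region is empty.
res = linprog(c=[1.0], A_ub=[[1.0]], b_ub=[-1.0], bounds=[(0, None)])
print(res.success, res.status)   # False and status code 2 (infeasible)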

7.3 A Linear Program for Zero-Sum Game Players

Remark 7.12. We are now in a position to derive a linear program-


ming problem for finding the Nash equilibrium of a zero-sum game.
Derivation 7.13. Let G = (P, Σ, A) be a zero-sum game with A ∈
Rm×n . We use the components of A and thus assume that
    A = [ a11  a12  · · ·  a1n ]
        [ a21  a22  · · ·  a2n ]
        [  ⋮    ⋮    ⋱     ⋮   ]
        [ am1  am2  · · ·  amn ] .

Recall from Theorem 5.57 that the following are equivalent:

(1) There is a Nash equilibrium (x∗ , y∗ ) for G.


(2) The following equation holds:

    v1 = max_x min_y xT Ay = min_y max_x xT Ay = v2 .          (7.20)

(3) There exists a real number v and x∗ ∈ Δm and y∗ ∈ Δn so that
    the following inequalities hold:

    ∑_i Aij x∗i ≥ v, with j ∈ {1, . . . , n}, and          (7.21)
    ∑_j Aij y∗j ≤ v, with i ∈ {1, . . . , m}.              (7.22)

The fact that x∗ ∈ Δm implies that

    x∗1 + · · · + x∗m = 1          (7.23)

and x∗i ≥ 0 for i ∈ {1, . . . , m}. Similar conditions will hold for y∗ .
Combining Eq. (7.21) and incorporating the constraints imposed
by x∗ ∈ Δm , we arrive at a set of constraints for a linear programming
problem. That is,

    a11 x∗1 + · · · + am1 x∗m − v ≥ 0,
    a12 x∗1 + · · · + am2 x∗m − v ≥ 0,
    ...                                          (7.24)
    a1n x∗1 + · · · + amn x∗m − v ≥ 0,
    x∗1 + · · · + x∗m = 1,
    x∗i ≥ 0,  i ∈ {1, . . . , m}.

In this set of constraints, we have m + 1 variables: x∗1 , . . . , x∗m and v,


the value of the game. We know that Player 1 (the row player) is a
value maximizer. Therefore, Player 1 solves the linear programming
problem

    max  v,
    s.t. a11 x∗1 + · · · + am1 x∗m − v ≥ 0,
         a12 x∗1 + · · · + am2 x∗m − v ≥ 0,
         ...                                      (7.25)
         a1n x∗1 + · · · + amn x∗m − v ≥ 0,
         x∗1 + · · · + x∗m = 1,
         x∗i ≥ 0,  i ∈ {1, . . . , m}.
By a similar argument, we know that Player 2’s equilibrium strat-
egy y∗ is constrained by
    a11 y∗1 + · · · + a1n y∗n − v ≤ 0,
    a21 y∗1 + · · · + a2n y∗n − v ≤ 0,
    ...                                          (7.26)
    am1 y∗1 + · · · + amn y∗n − v ≤ 0,
    y∗1 + · · · + y∗n = 1,
    y∗i ≥ 0,  i ∈ {1, . . . , n}.
We know that Player 2 (the column player) is a value minimizer.
Therefore, Player 2 solves the linear programming problem


    min  v,
    s.t. a11 y1 + · · · + a1n yn − v ≤ 0,
         a21 y1 + · · · + a2n yn − v ≤ 0,
         ...                                      (7.27)
         am1 y1 + · · · + amn yn − v ≤ 0,
         y1 + · · · + yn = 1,
         yi ≥ 0,  i ∈ {1, . . . , n}.
Thus, we have derived the two linear programming problems for play-
ers in a zero-sum game.

Example 7.14. Consider the game matrix from Example 5.3. In this
example, two television networks (streaming services) were attempt-
ing to decide what content to provide. The payoff matrix for Player
1 is given as
    A = [ −15  −35   10 ]
        [  −5    8    0 ]
        [ −12  −36   20 ] .
This is a zero-sum game, so the payoff matrix for Player 2 is simply
the negation of this matrix. The linear programming problem for
Player 1 is
    max  v,
    s.t. −15x1 − 5x2 − 12x3 − v ≥ 0,
         −35x1 + 8x2 − 36x3 − v ≥ 0,
         10x1 + 20x3 − v ≥ 0,                    (7.28)
         x1 + x2 + x3 = 1,
         x1 , x2 , x3 ≥ 0.
Note that we simply work our way down each column of the matrix
A to form the constraints of the linear programming problem. To
form the problem for Player 2, we work our way across the rows of
A and obtain
    min  v,
    s.t. −15y1 − 35y2 + 10y3 − v ≤ 0,
         −5y1 + 8y2 − v ≤ 0,
         −12y1 − 36y2 + 20y3 − v ≤ 0,            (7.29)
         y1 + y2 + y3 = 1,
         y1 , y2 , y3 ≥ 0.

7.4 Solving Linear Programs Using a Computer

Remark 7.15. Solving linear programs can be accomplished by


using the simplex algorithm or an interior point method, the details of

which are outside the scope of this book. The interested reader should
consult a text on linear programming, such as Ref. [78]. It suffices
to say that each approach provides a series of steps that can be per-
formed by a computer to solve a linear programming problem. While
there are several software packages that can solve linear program-
ming problems (e.g., Python’s SciPy, MATLAB, etc.), we illustrate
how to solve the problems in Eqs. (7.28) and (7.29) using Mathemat-
ica because it has a simple interface that most closely resembles the
linear programming problems as given.
Example 7.16. Consider the linear programming problem in
Eq. (7.28). A Mathematica code to solve this problem is shown in
the following. Note that in Mathematica, comments are written as
“(*Comment*).” These comments are optional, as is the spacing.
FindMaximum[
{
v, (*Objective*)
-15*x1 - 5*x2 -12*x3 >= v, (*Constraint 1*)
-35*x1 + 8*x2 -36*x3 >= v, (*Constraint 2*)
10*x1 +20*x3 >= v, (*Constraint 3*)
x1 + x2 + x3 == 1, (*Probability
Constraint*)
x1 >= 0, x2 >= 0, x3 >=0 (*Non-negativity
Constraints*)
},
{x1, x2, x3, v} (*Variables*)
]
Note that we have replaced x1 , x2 , and x3 with x1, x2, and x3,
respectively, to ease the representation of the code in the text. The
code makes use of FindMaximum, which can automatically deduce
that a linear program is being passed into it. The answer reported
by Mathematica is
{-5., {x1 -> 0., x2 -> 1., x3 -> 0., v -> -5.}}
This tells us that at equilibrium, Player 1 receives a payoff of −5 and
should use a pure strategy, e2 = 0, 1, 0.
The corresponding code to solve Player 2’s linear programming
problem given in Eq. (7.29) is shown as follows.

FindMinimum[
{
v, (*Objective*)
-15*y1 - 35*y2 + 10*y3 <= v, (*Constraint 1*)
-5*y1 + 8*y2 <= v, (*Constraint 2*)
-12*y1 - 36*y2 + 20*y3 <= v, (*Constraint 3*)
y1 + y2 + y3 == 1, (*Probability
Constraint*)
y1 >= 0, y2 >= 0, y3 >= 0 (*Non-negativity
Constraints*)
},
{y1, y2, y3, v} (*Variables*)
]

This code makes use of the FindMinimum function, which operates in


the same way as FindMaximum used above. The solution provided by
Mathematica is

{-5., {y1 -> 1., y2 -> 0., y3 -> 0., v -> -5.}}

This tells us that at equilibrium, Player 2 should use a pure strategy,


e1 = 1, 0, 0. The payoff at equilibrium to Player 2 is not −5; this
occurs because we are using Player 1’s payoff matrix. Instead, the
payoff to Player 2 at equilibrium is −(−5) = 5.
Thus, we have used two linear programming problems to confirm
that (e2 , e1 ) is the Nash equilibrium of the game between the two networks from
Example 5.3, and the value of the game is v = −5 = eT2 Ae1 = A21 .
This is precisely the answer we got using the minimax theorem.
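For readers working in Python, the same computation can be done with SciPy's linprog, which this chapter mentions among the available packages. The sketch below (not part of the text) solves Player 1's problem in Eq. (7.28); the decision variables are (x1, x2, x3, v), and the objective is negated because linprog minimizes:

import numpy as np
from scipy.optimize import linprog

A = np.array([[-15.0, -35.0, 10.0],
              [ -5.0,   8.0,  0.0],
              [-12.0, -36.0, 20.0]])
m, n = A.shape

c = np.zeros(m + 1); c[-1] = -1.0                      # minimize -v (maximize v)
A_ub = np.hstack([-A.T, np.ones((n, 1))])              # column j: -sum_i a_ij x_i + v <= 0
b_ub = np.zeros(n)
A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])  # x1 + x2 + x3 = 1
b_eq = np.array([1.0])
bounds = [(0, None)] * m + [(None, None)]              # x >= 0, v unrestricted

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print(res.x[:m], res.x[-1])                            # approximately [0. 1. 0.] and v = -5.0

Player 2's problem in Eq. (7.29) can be set up in the same way by working across the rows of A and flipping the inequality directions.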

7.5 Standard Form, Slack and Surplus Variables

Remark 7.17. Before completing our analysis of zero-sum games


and linear programming, we require one additional definition for com-
pleteness.

Definition 7.18 (Standard Form). A linear programming prob-


lem is in standard form if it is written with the constraints Ax = b
and x ≥ 0, as in

    max  z(x) = cT x,
    s.t. Ax = b,          (7.30)
         x ≥ 0.
We note that this could be a minimization problem as well. It is the
structure of the constraints that is important.
Remark 7.19. It is relatively easy to convert any inequality con-
straint into an equality constraint. Consider the inequality constraint
ai1 x1 + ai2 x2 + · · · + ain xn ≤ bi . (7.31)
We can add a new slack variable, si ≥ 0, to this constraint to obtain
ai1 x1 + ai2 x2 + · · · + ain xn + si = bi .
The slack variable then becomes just another variable whose value
we must discover as we solve the linear program for which Eq. (7.31)
is a constraint, as is the non-negativity of si . That is, we add si ≥ 0
as a constraint to the modified linear programming problem as well.
We can deal with constraints of the form
ai1 x1 + ai2 x2 + · · · + ain xn ≥ bi (7.32)
similarly. In this case, we subtract a surplus variable, si ≥ 0, to obtain
ai1 x1 + ai2 x2 + · · · + ain xn − si = bi .
Thus, we have shown that any linear programming problem can be
converted into a problem in standard form.
Example 7.20. We can convert Player 1’s problem from Eq. (7.28)
into the standard form by adding three surplus variables, s1 , s2 , and
s3 . The resulting problem is
    max  v,
    s.t. −15x1 − 5x2 − 12x3 − v − s1 = 0,
         −35x1 + 8x2 − 36x3 − v − s2 = 0,
         10x1 + 20x3 − v − s3 = 0,               (7.33)
         x1 + x2 + x3 = 1,
         x1 , x2 , x3 , s1 , s2 , s3 ≥ 0.

7.6 Optimality Conditions for Zero-Sum Games


and Duality

Remark 7.21. Our final goal in this chapter is to prove a deep


connection between the linear programming problems for the two
players. To achieve this, we make use of the concept of duality, which
we define as we proceed. This section is dense, but the reward for
working through it is an interesting theoretical property of zero-sum
games. Perhaps more importantly, we will use this same approach
when we tackle general-sum games in the following chapter.

Theorem 7.22. Let G = (P, Σ, A) be a zero-sum two-player game


with A ∈ Rm×n . Then, the linear program for Player 1,

    max  v
    s.t. a11 x1 + · · · + am1 xm − v ≥ 0,
         a12 x1 + · · · + am2 xm − v ≥ 0,
         ...
         a1n x1 + · · · + amn xm − v ≥ 0,
         x1 + · · · + xm − 1 = 0,
         xi ≥ 0,  i = 1, . . . , m,

has an optimal solution, x = ⟨x1 , . . . , xm ⟩, if and only if there exist


Lagrange multipliers, y1 , . . . , yn , ρ1 , . . . , ρm , and ν, and surplus vari-
ables, s1 , . . . , sn , such that the following hold:
Primal feasibility:
    ∑_{i=1}^{m} aij xi − v − sj = 0,  j ∈ {1, . . . , n},
    x1 + · · · + xm = 1,
    xi ≥ 0,  i ∈ {1, . . . , m},
    sj ≥ 0,  j ∈ {1, . . . , n},
    v unrestricted.
Dual feasibility:
    ∑_{j=1}^{n} aij yj − ν + ρi = 0,  i ∈ {1, . . . , m},
    y1 + · · · + yn = 1,
    yj ≥ 0,  j ∈ {1, . . . , n},
    ρi ≥ 0,  i ∈ {1, . . . , m},
    ν unrestricted.

Complementary slackness:
    yj sj = 0,  j ∈ {1, . . . , n},
    ρi xi = 0,  i ∈ {1, . . . , m}.

Proof. We begin by showing the statements that primal feasibility


must hold. Clearly, v is unrestricted and xi ≥ 0 for i ∈ {1, . . . , m}.
The fact that x1 + · · · + xm = 1 is also clear from the statement of
the problem. We can rewrite each constraint of the form

a1j x1 + · · · + amj xm − v ≥ 0, (7.34)

where j ∈ {1, . . . , n}, as

a1j x1 + · · · + amj xm − v − sj = 0, (7.35)

where sj ≥ 0 is a surplus variable for j ∈ {1, . . . , n}. From this, it


is clear that if x1 , . . . , xm  is a feasible solution, then at least the
variables s1 , . . . , sn ≥ 0 exist and primal feasibility must hold.
We now prove that complementary slackness and dual feasibility
hold. Rewrite the constraints of the form in Eq. (7.34) as

−a1j x1 − · · · − amj xm + v ≤ 0 j ∈ {1, . . . , n} (7.36)

and each non-negativity constraint as

−xi ≤ 0 i ∈ {1, . . . , m}. (7.37)

We know that each affine function is both concave and convex; there-
fore, by Theorem 6.36 (the KKT theorem), there are Lagrange mul-
tipliers, y1 , . . . , yn , corresponding to the constraints in Eq. (7.36) and
Lagrange multipliers, ρ1 , . . . , ρm , corresponding to the constraints in


Eq. (7.37). Lastly, there is a Lagrange multiplier, ν, corresponding
to the constraint

x1 + x2 + · · · + xm − 1 = 0. (7.38)

From Theorem 6.36, we know that

yj ≥ 0 j ∈ {1, . . . , n},
ρi ≥ 0 i ∈ {1, . . . , m},
ν unrestricted.

To see that complementary slackness holds, note that by Theo-


rem 6.36, we know that

yj (−a1j x1 − · · · − amj xm + v) = 0 j ∈ {1, . . . , n},


ρi (−xi ) = 0 i ∈ {1, . . . , m}.

If ρi (−xi ) = 0, then ρi xi = 0 for all i. From Eq. (7.35),

a1j x1 + · · · + amj xm − v − sj = 0 =⇒ sj = a1j x1 + · · · + amj xm − v.

Therefore, we can write

yj (−a1j x1 − · · · − amj xm + v) = 0 =⇒ yj (−sj ) = 0 =⇒ yj sj = 0,

for all j. Thus, we have shown that

yj sj = 0 j ∈ {1, . . . , n}, (7.39)


ρi xi = 0 i ∈ {1, . . . , m}. (7.40)

We now complete the proof by showing that dual feasibility holds.


Let

gj (x1 , . . . , xm , v) = −a1j x1 − · · · − amj xm + v j ∈ {1, . . . , n},


fi (x1 , . . . , xm , v) = −xi i ∈ {1, . . . , m},
h(x1 , . . . , xm , v) = x1 + x2 + · · · + xm − 1,
z(x1 , . . . , xm , v) = v.
Then, we can apply Theorem 6.36 to see that


∇z(x1 , . . . , xm , v) − y1 ∇g1 (x1 , . . . , xm , v) − · · · − yn ∇gn (x1 , . . . , xm , v)
− ρ1 ∇f1 (x1 , . . . , xm , v) − · · · − ρm ∇fm (x1 , . . . , xm , v) − ν∇h(x1 , . . . , xm , v) = 0.   (7.41)
Computing the gradients yields

∇z(x1 , . . . , xm , v) = (0, 0, . . . , 0, 1)T ∈ R(m+1)×1 ,
∇h(x1 , . . . , xm , v) = (1, 1, . . . , 1, 0)T ∈ R(m+1)×1 ,
∇fi (x1 , . . . , xm , v) = −ei ∈ R(m+1)×1 ,

and

∇gj (x1 , . . . , xm , v) = (−a1j , −a2j , . . . , −amj , 1)T ∈ R(m+1)×1 .

Before proceeding, note that in computing ∇fi (x1 , . . . , xm , v) = −ei ∈ R(m+1)×1 with i ∈ {1, . . . , m}, we will never see the vector

−em+1 = (0, 0, . . . , 0, −1)T ∈ R(m+1)×1
because there is no function fm+1 (x1 , . . . , xm , v). We can now rewrite


Eq. (7.41) as
(0, 0, . . . , 0, 1)T − y1 (−a11 , −a21 , . . . , −am1 , 1)T − · · · − yn (−a1n , −a2n , . . . , −amn , 1)T
− ρ1 (−e1 ) − · · · − ρm (−em ) − ν(1, 1, . . . , 1, 0)T = 0.   (7.42)

Consider the ith row of that expression, where 1 ≤ i ≤ m. Adding


term-by-term, we have
0 + ai1 y1 + · · · + ain yn + ρi − ν = 0.   (7.43)

Now, consider row m + 1. We have


1 − (y1 + · · · + yn ) + 0 + 0 = 0.   (7.44)

From these two equations, we conclude that


ai1 y1 + · · · + ain yn + ρi − ν = 0,   i ∈ {1, . . . , m},
y1 + · · · + yn = 1.

Thus, we have shown that dual feasibility holds. The fact that these
conditions are necessary and sufficient follows from the fact that a
linear programming problem is a convex optimization problem and
from Theorem 6.36. 

Remark 7.23. The proof of the following theorem is nearly identical


to the proof of the previous theorem. It is left to the reader as an
exercise.
Theorem 7.24. Let G = (P, Σ, A) be a zero-sum two-player game


with A ∈ Rm×n . Then, the linear program for Player 2,

min ν
s.t. a11 y1 + · · · + a1n yn − ν ≤ 0,
a21 y1 + · · · + a2n yn − ν ≤ 0,
..
.
am1 y1 + · · · + amn yn − ν ≤ 0,
y1 + · · · + yn − 1 = 0,
yj ≥ 0 j = 1, . . . , n,

has an optimal solution, y = y1 , . . . , yn , if and only if there exist


Lagrange multipliers, x1 , . . . , xm , s1 , . . . , sn , and v, and slack vari-
ables, ρ1 , . . . , ρm , such that we have the following:
Primal feasibility:
    ai1 y1 + · · · + ain yn − ν + ρi = 0,   i ∈ {1, . . . , m},
    y1 + · · · + yn = 1,
    yj ≥ 0,   j ∈ {1, . . . , n},
    ρi ≥ 0,   i ∈ {1, . . . , m},
    ν unrestricted.

Dual feasibility:
    a1j x1 + · · · + amj xm − v − sj = 0,   j ∈ {1, . . . , n},
    x1 + · · · + xm = 1,
    xi ≥ 0,   i ∈ {1, . . . , m},
    sj ≥ 0,   j ∈ {1, . . . , n},
    v unrestricted.

Complementary slackness:
    yj sj = 0,   j ∈ {1, . . . , n},
    ρi xi = 0,   i ∈ {1, . . . , m}.

Remark 7.25. Theorems 7.22 and 7.24 show that the KKT condi-
tions for the linear programming problems for Player 1 and Player 2
in a zero-sum game are identical, but with the primal and dual feasi-
bility conditions exchanged. Linear programming problems with this
property are naturally related, as we discuss in the following.
Definition 7.26. Let P and D be two linear programming prob-
lems. If the KKT conditions for Problem P are equivalent to the
KKT conditions for Problem D with primal feasibility and dual fea-
sibility exchanged, then Problems P and D are called dual linear
programming problems.
Corollary 7.27. The linear programming problem for Player 1 is
the dual problem of the linear programming problem for Player 2 in
a zero-sum two-player game, G = (P, Σ, A), with A ∈ Rm×n .
Remark 7.28. There is a very deep theorem about dual linear pro-
gramming problems, which is beyond the scope of this book. The
interested reader can consult Ref. [78] for a proof. We make use of it
to prove the minimax theorem in a completely novel way.
Theorem 7.29 (Strong Duality Theorem). Let P and D be
dual linear programming problems. Then, exactly one of the following
statements holds:
(1) Both P and D have a solution, and at optimality, the objective
function value for Problem P is identical to the objective function
value for Problem D.
(2) Problem P has no solution because it is unbounded, and Problem
D has no solution because it is infeasible.
(3) Problem D has no solution because it is unbounded, and Problem
P has no solution because it is infeasible.
(4) Both Problem P and Problem D are infeasible.
Theorem 7.30 (Minimax Theorem). Let G = (P, Σ, A) be a
zero-sum two-player game with A ∈ Rm×n , then there exists a Nash
equilibrium, (x∗ , y∗ ) ∈ Δ. Furthermore, for every Nash equilibrium
strategy pair, (x∗ , y∗ ) ∈ Δ, there is one unique value, v ∗ = x∗T Ay∗ .
Sketch of Proof. Let Problems P1 and P2 be the linear program-


ming problems in Eqs. (7.25) and (7.27) for Players 1 and 2, respec-
tively. We have already shown these linear programming problems are
dual; therefore, if Problem P1 has a solution, then so does problem
P2 . More importantly, at these optimal solutions (x∗ , v ∗ ), (y∗ , ν ∗ ), we
know that v ∗ = ν ∗ as the objective function values must be equal by
Theorem 7.29.
Consider Problem P1 : We know that x = x1 , . . . , xm  ∈ Δm and
Δm must be bounded. The value v clearly cannot exceed maxij Aij
as a result of the constraints and the fact that x ∈ Δm . Although v can be made as small as we like, this does not matter because P1 is a maximization problem on v. The fact that v is bounded from above and x ∈ Δm implies that there is at least one solution (x∗ , v ∗ ) to Problem P1 . Likewise,
there is a solution (y ∗ , ν ∗ ) to Problem P2 and v ∗ = ν ∗ . Since the
constraints for Problems P1 and P2 were taken from Theorem 5.57,
we know that (x∗ , y∗ ) is a Nash equilibrium, and therefore such an
equilibrium must exist.
Furthermore, while we have not proved this explicitly, one can
prove that if (x∗ , y∗ ) is a Nash equilibrium, then it must be a part of
a set of solutions (x∗ , v ∗ ), (y∗ , ν ∗ ) to Problems P1 and P2 . Thus, any
two equilibrium solutions are simply alternative optimal solutions to
P1 and P2 , respectively. Thus, for any Nash equilibrium pair, we have

ν ∗ = v ∗ = x∗T Ay∗ . (7.45)

This completes the proof sketch. 
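As a numerical illustration of Theorems 7.29 and 7.30 (not taken from the text), one can solve both players' linear programs for a small zero-sum game and confirm that the optimal objective values coincide. The 2 × 2 payoff matrix used below is hypothetical and serves only for this check.

(* Hypothetical payoff matrix A = [[3, -1], [-2, 1]] for Player 1. *)
(* Player 1: maximize v subject to each column constraint, x in the simplex. *)
p1 = NMaximize[{v,
     3 x1 - 2 x2 - v >= 0,
     -x1 + x2 - v >= 0,
     x1 + x2 == 1, x1 >= 0, x2 >= 0}, {x1, x2, v}];

(* Player 2: minimize nu subject to each row constraint, y in the simplex. *)
p2 = NMinimize[{nu,
     3 y1 - y2 - nu <= 0,
     -2 y1 + y2 - nu <= 0,
     y1 + y2 == 1, y1 >= 0, y2 >= 0}, {y1, y2, nu}];

(* By strong duality, the two optimal values should agree. *)
{First[p1], First[p2]}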


Remark 7.31 (A Remark on Complementary Slackness).
Consider the KKT conditions for Players 1 and 2 given in The-
orems 7.22 and 7.24. Suppose that in an optimal solution of the
problem for Player 1, sj > 0. Then, it follows that yj = 0 by com-
plementary slackness. We can understand this from a game-theoretic
perspective. The expression

a1j x1 + · · · + amj xm

is the expected payoff to Player 1, assuming Player 2 plays column j.


If sj > 0, then

a1j x1 + · · · + amj xm > v.


However, that means if Player 2 ever played column j, then Player 1


could do better than the equilibrium value v of the game. Thus,
Player 2 has no incentive to ever play this strategy, and the result is
that yj = 0 (as required by complementary slackness).
This implies something stronger. Let (x∗ , y∗ ) be a Nash equilib-
rium strategy pair. If yj∗ > 0, then
a1j x∗1 + · · · + amj x∗m = v.
Thus, if Player 2 were to play the pure strategy ej against x∗ , he
would expect to receive a payoff of −v, which is precisely what he
would receive using y∗ . That is, if yj∗ > 0, then

x∗ T Ay∗ = x∗ T Aej .
A similar statement is true for Player 1, and thus we see that
complementary slackness also implies the indifference theorem,
Theorem 5.46.

7.7 Chapter Notes

The problem of solving a system of linear inequalities was studied


as early as the 1800s, and a solution was given by Fourier. This
method is now called the Fourier–Motzkin [83] method and is not
usually covered in modern treatments of linear programming (except
in its historical context). Foundational contributions to linear pro-
gramming were made by George B. Dantzig, whose solution to two
open problems in statistics (which he mistook for homework as a
student) [84–86] may have formed the basis for part of the script
of Good Will Hunting. Dantzig developed the simplex algorithm for
solving arbitrary linear programming problems and is considered the
father of modern algorithmic optimization. He first proved the dual-
ity results discussed in this chapter; however, it was von Neumann
who conjectured them on meeting with Dantzig to discuss linear
programming [87]. Dantzig’s work (at first derided by colleagues for
being linear in a nonlinear world) was defended by von Neumann.
Interestingly, linear programming problems are often at the core of
many problems in optimization [85, 86].
The connection between zero-sum games and linear programming
is substantially deeper than the material in this chapter suggests.
We have proved that finding a Nash equilibrium strategy for a zero-


sum game is equivalent to solving a pair of dual linear programming
problems. Dantzig asserted that every linear programming prob-
lem (and its dual) can be converted into a corresponding zero-sum
game [88]. His proof is incomplete and, as a result, so are the proofs
presented in several textbooks, including Luce and Raiffa [1] and
Raghavan [89]. A complete proof was only given by Adler in 2013 [90].
Thus, the study of zero-sum games is essentially the study of linear
programs.
Using the equivalence, we can deduce how hard it is to solve a
linear programming problem. Solving a linear programming problem
can be accomplished in polynomial time [91]. By this, we mean that
a solution can be produced using a set of steps, the number of which
is a polynomial function of the size of the problem, as measured in
the number of constraints and variables. This relatively recent result
was improved upon and made practical by Karmarkar’s interior-point
method [92], which has spawned an entirely new field in computa-
tional optimization. As a consequence, we can conclude that zero-sum
games are relatively “easy” to solve. This is not true of general-sum
games, which we discuss in the following chapter.
– ♠♣♥♦ –

7.8 Exercises

7.1 A chemical manufacturer produces three chemicals: A, B, and


C. These chemicals are produced by two processes: 1 and 2. Run-
ning process 1 for 1 hour costs $4 and yields 3 units of chemical A,
1 unit of chemical B, and 1 unit of chemical C. Running process 2
for 1 hour costs $1 and produces 1 unit of chemical A and 1 unit
of chemical B (but none of chemical C). To meet customer demand,
at least 10 units of chemical A, 5 units of chemical B, and 3 units
of chemical C must be produced daily. Assume that the chemical
manufacturer wants to minimize the cost of production. Develop a
linear programming problem describing the constraints and objec-
tives of the chemical manufacturer. [Hint: Let x1 be the amount of
time Process 1 is executed, and let x2 be the amount of time Pro-


cess 2 is executed. Use the coefficients above to express the cost of
running Process 1 for x1 time and Process 2 for x2 time. Do the
same to compute the amounts of chemicals A, B, and C that are
produced.]

7.2 Modify the linear programming problem from Exercise 7.1 to


obtain a linear programming problem with an infinite number of
alternative optimal solutions. Solve the new problem and obtain a
description for the set of alternative optimal solutions.

7.3 Construct the two linear programming problems for Bradley


and von Kluge in the Battle of Avranches.

7.4 Prove Theorem 7.24.

7.5 Show that the Nash equilibrium given in Example 5.65 solves
the linear programming problems from Exercise 7.3.

7.6 Use the approach in Remark 7.31 to prove the indifference the-
orem for Player 1.
Chapter 8

Quadratic Programs and


General-Sum Games

Chapter Goals: In this chapter, we show how to convert any


two-player bimatrix game into an optimization problem that
has a quadratic objective function and linear constraints. Such
a problem is called a quadratic programming problem. Readily
available computer software can then be used to find the Nash
equilibria as optimal solutions to these optimization problems.
When combined with the results in Chapter 7, this provides
an algorithmic way to solve any two-player (bi)matrix game.
We achieve these results by exploring the Karush–Kuhn–Tucker
(KKT) conditions of the individual player’s optimization prob-
lems, showing how to combine them into a single quadratic prob-
lem whose solution yields the Nash equilibrium.

8.1 Introduction to Quadratic Programming

Definition 8.1 (Quadratic Programming Problem). Let Q ∈


Rn×n be a symmetric matrix, and let c ∈ Rn×1 be a vector. As in the
definition of the general linear programming problem in Eq. (7.5), let
matrix A ∈ Rm×n , vector b ∈ Rm×1 , matrix H ∈ Rl×n , and vector
r ∈ Rl×1 . Then, a quadratic (maximization) programming problem

is the nonlinear programming problem



QP :  max (1/2) xT Qx + cT x,
      s.t. Ax ≤ b,              (8.1)
           Hx = r.

Example 8.2. Example 6.1 is an instance of a quadratic program-


ming problem. Recall that we had

max A(x, y) = xy,
s.t. 2x + 2y = 100,
     x ≥ 0,
     y ≥ 0.

We can write this as

max (1/2) [x y] [[0, 1], [1, 0]] [x, y]T ,
s.t. [2 2] [x, y]T = 100,
     [x, y]T ≥ [0, 0]T .
Obviously, we can put this problem in precisely the format given in
Eq. (8.1), if so desired.
Remark 8.3. Quadratic programs are simply a special instance of
nonlinear (or mathematical) programming problems. There are many
applications for quadratic programs that are beyond the scope of this
book, as well as many solution techniques for quadratic programs.
Interested readers should consult Refs. [69, 70, 81] for details.

8.2 Solving Quadratic Programming Problems Using Computers

Remark 8.4. Just as was the case for linear programming problems,
there are several computer-based solvers for quadratic programming
problems. We illustrate how to use Mathematica to solve one such


problem.
Example 8.5. We use Mathematica to solve the goat pen problem
from Examples 6.1 and 8.2. The code to do this is shown as follows.
NMaximize[
  {
    x*y,                 (*Objective*)
    2*x + 2*y == 100,    (*Constraint*)
    x >= 0, y >= 0       (*Non-negativity constraints*)
  },
  {x, y},                            (*Variables*)
  Method -> "DifferentialEvolution"  (*Method to use*)
]
This code uses the NMaximize function, which will look for global
optimal solutions numerically as opposed to local optimal solutions.
This function can be customized by choosing a different method. We
will need global optimal solutions when seeking Nash equilibria with
quadratic programs. The output is
{625., {x -> 25., y -> 25.}}
Remark 8.6. Note that the input syntax for Mathematica is rela-
tively natural. Other solvers such as Gurobi [93] or CPLEX [94] have
a less natural syntax, but are generally more powerful. When using
a solver, care must be taken. Many quadratic programming solvers
require the problem to be convex. That is, they require the objective
function,
1
z(x) = xT Qx + cT x,
2
to be convex. The quadratic programming problem we construct for
finding Nash equilibria will not have this property as a general rule.
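Whether the quadratic objective is convex can be checked by examining the eigenvalues of Q: the function (1/2) xT Qx + cT x is convex exactly when Q is positive semidefinite, i.e., when all of Q's eigenvalues are non-negative. The following sketch (not from the text) applies this test to the Q from Example 8.2.

Q = {{0, 1}, {1, 0}};    (*The Q matrix from Example 8.2*)
Eigenvalues[Q]           (*Eigenvalues 1 and -1: Q is indefinite, so the*)
                         (*goat pen objective is neither convex nor concave.*)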

8.3 General-Sum Games and Quadratic Programming

Remark 8.7. The material in this section can be found originally


in the work of Mangasarian and Stone [95], which we discuss in the
chapter notes.
Derivation 8.8. Consider a two-player general-sum game G =


(P, Σ, A, B), with A, B ∈ Rm×n . Let 1m ∈ Rm×1 be the vector of all
ones with m elements, and let 1n ∈ Rn×1 be the vector of all ones
with n elements. By Theorem 5.75, there is at least one Nash equilib-
rium (x∗ , y∗ ). Suppose we treat this information as exogenous, i.e.,
provided somehow through a mechanism outside the problem. Then,
the optimization problems for the two players are


P1 :  max xT Ay∗ ,
      s.t. 1Tm x = 1,
           x ≥ 0;

P2 :  max x∗T By,
      s.t. 1Tn y = 1,
           y ≥ 0.
Individually, these are linear programs. Unfortunately, the Nash equi-
librium (x∗ , y∗ ) is not available a priori. However, we can draw
insight from these problems to derive the necessary and sufficient
conditions for a Nash equilibrium.
Remark 8.9. In the proof of the following lemma, we use ∇ in
multiple contexts, and so we append a subscript to it to indicate
the variables involved in the differentiation. For example, if x =
⟨x1 , . . . , xm ⟩, then

∇x = ⟨∂/∂x1 , . . . , ∂/∂xm ⟩.

Likewise, if y = ⟨y1 , . . . , yn ⟩, then

∇y = ⟨∂/∂y1 , . . . , ∂/∂yn ⟩.
Remark 8.10. The proofs of the following lemma and theorem are
long. However, they make up the core of the basic theory surrounding
general-sum games. On a first reading, it is perfectly fine to skip the
details of the proofs and move on to the example. However, it is
worth the trouble to understand the proofs because they elucidate
the connections between the KKT conditions and the Nash equilibria
in general-sum games.
Lemma 8.11. Let G = (P, Σ, A, B) be a general sum, two-player


bimatrix game, with matrices A, B ∈ Rm×n . A pair (x∗ , y∗ ) ∈ Δ
is a Nash equilibrium if and only if there exist real values α and β
such that
x∗ T Ay∗ − α = 0,
x∗ T By∗ − β = 0,
Ay∗ − α1m ≤ 0,
x∗T B − β1Tn ≤ 0,
1Tm x∗ − 1 = 0,
1Tn y∗ − 1 = 0,
x∗ ≥ 0,
y∗ ≥ 0.
Proof. Assume that x∗ = x∗1 , . . . , x∗m  and y∗ = y1∗ , . . . , yn∗ . Con-
sider the KKT conditions for the linear programming problem for P1 .
The objective function is
z(x1 , . . . , xm ) = xT Ay∗ = cT x,
where c ∈ Rm×1 and
ci = Ai· y∗ = ai1 y1∗ + ai2 y2∗ + · · · + ain yn∗ .
The vector x∗ is an optimal solution for this problem if and only
if there are Lagrange multipliers λ1 , . . . , λm , corresponding to the
constraints x ≥ 0, and α, corresponding to the constraint 1Tm x = 1,
such that we have the following:
Primal feasibility:
    x∗1 + · · · + x∗m = 1,
    x∗i ≥ 0,   i ∈ {1, . . . , m}.

Dual feasibility:
    ∇x z(x∗ ) − λ1 (−e1 ) − · · · − λm (−em ) − α1m = 0,
    λi ≥ 0,   i ∈ {1, . . . , m},
    α unrestricted.

Complementary slackness:
    λi x∗i = 0,   i ∈ {1, . . . , m}.
We observe first that ∇x z(x) = Ay∗ . Therefore, we can write the


first equation in the dual feasibility condition as
Ay∗ − α1m = −(λ1 e1 + · · · + λm em ).   (8.2)

Since λi ≥ 0 and ei is simply the ith standard basis vector, we know


that λi ei ≥ 0, and thus
Ay∗ − α1m ≤ 0. (8.3)
Now, again consider the first equation in the dual feasibility condition
written as
m
Ay∗ + λi ei − α1m = 0.
i=1

If we multiply this by x∗ T on the left, we obtain


x∗T Ay∗ + λ1 x∗T e1 + · · · + λm x∗T em − αx∗T 1m = x∗T 0.   (8.4)

However, λi x∗T ei = λi x∗i = 0 by complementary slackness and


αx∗T 1m = α1Tm x∗ = α by the primal feasibility conditions. Clearly,
x∗ T 0 = 0. Thus, we conclude from Eq. (8.4) that
x∗ T Ay∗ − α = 0. (8.5)
If we consider the problem for Player 2, then
 
z(y1 , . . . , yn ) = z(y) = x∗T B y (8.6)

so that the jth component of ∇y z(y) is x∗T B·j . If we consider the


KKT conditions for Player 2, we know that y∗ is an optimal solu-
tion if and only if there exists Lagrange multipliers, μ1 , . . . , μn , cor-
responding to the constraints y ≥ 0, and β, corresponding to the
constraint y1 + · · · + yn = 1, so that we have the following:
Primal feasibility:
    y1∗ + · · · + yn∗ = 1,
    yj∗ ≥ 0,   j ∈ {1, . . . , n}.
Dual feasibility:
    ∇y z(y∗ ) − μ1 (−e1 ) − · · · − μn (−en ) − β1n = 0,
    μj ≥ 0,   j ∈ {1, . . . , n},
    β unrestricted.

Complementary slackness:
    μj yj∗ = 0,   j ∈ {1, . . . , n}.

By the same arguments we used in the case for Player 1, we can


show that

x∗T B − β1Tn ≤ 0, (8.7)

and

x∗ T By∗ − β = 0. (8.8)

Thus, we have shown (from the necessity and sufficiency of KKT


conditions for the two problems) that

x∗ T Ay∗ − α = 0,
x∗ T By∗ − β = 0,
Ay∗ − α1m ≤ 0,
x∗T B − β1Tn ≤ 0,
1Tm x∗ − 1 = 0,
1Tn y∗ − 1 = 0,
x∗ ≥ 0,
y∗ ≥ 0

are necessary and sufficient conditions for (x∗ , y∗ ) to be a Nash equi-


librium of the game G. 

Theorem 8.12. Let G = (P, Σ, A, B) be a general-sum, two-player


bimatrix game, with matrices A, B ∈ Rm×n . A pair (x∗ , y∗ ) ∈ Δ is
a Nash equilibrium if and only if the tuple (x∗ , y∗ , α∗ , β ∗ ) is a global


maximizer for the quadratic programming problem
max xT (A + B)y − α − β,
s.t. Ay − α1m ≤ 0,
xT B − β1Tn ≤ 0,
1Tm x − 1 = 0, (8.9)

1Tn y − 1 = 0,
x ≥ 0,
y ≥ 0.

Proof. Before proceeding to the proof, note that if we assume


Ay − α1m ≤ 0,
then multiplying both sides of the equation by xT yields
xT Ay − αxT 1m ≤ 0.
Now, if x ∈ Δm , then xT 1m = 1Tm x = 1, which implies that
xT Ay − α ≤ 0.
Multiplying xT B − β1Tn ≤ 0 on the right by y yields a similar
inequality:
xT By − β ≤ 0.
Adding these results together yields
z(x, y, α, β) = xT (A + B)y − α − β ≤ 0.
Thus, any set of variables (x∗ , y∗ , α∗ , β ∗ ) so that z(x∗ , y∗ , α∗ , β ∗ ) = 0
is a global maximum, assuming the constraints given in the theorem.
Note also that the constraint
Ay − α1m ≤ 0
is shorthand for m distinct constraints of the form
Ai· y − α ≤ 0.
Similarly, the constraint


xT B − β1Tn ≤ 0
is shorthand for n constraints of the form
xT B·j − β ≤ 0.
Finally, x ≥ 0 and y ≥ 0 are shorthand for m + n additional con-
straints of the form xi ≥ 0, with i ∈ {1, . . . , m}, and yj ≥ 0, with
j ∈ {1, . . . , n}. We use this information when we form KKT condi-
tions. We now proceed to the proof.
(⇐) We show that the KKT conditions for Eq. (8.9) are identical
to the conditions given in Lemma 8.11. Thus, if (x∗ , y∗ , α∗ , β ∗ ) is a
global optimum for Eq. (8.9), then it must satisfy the KKT conditions
by Theorem 6.36; consequently, by Lemma 8.11, (x∗ , y∗ ) must be a
Nash equilibrium for G.
At an optimal point (x∗ , y∗ , α∗ , β ∗ ), there are the following mul-
tipliers:
(1) λ1 , . . . , λm , corresponding to the constraints Ay − α1m ≤ 0;
(2) μ1 , . . . , μn , corresponding to the constraints xT B − β1Tn ≤ 0;
(3) ν1 , corresponding to the constraint 1Tm x − 1;
(4) ν2 , corresponding to the constraint 1Tn y − 1 = 0;
(5) φ1 , . . . , φm , corresponding to the constraints x ≥ 0; and
(6) θ1 , . . . , θn , corresponding to the constraints y ≥ 0.
Write x ≥ 0 as −x ≤ 0 and y ≥ 0 as −y ≤ 0. When computing the
gradients of the various constraint and objective functions, note that
we can compute the gradients of the various constraint functions and
the objective function. Each gradient has m + n + 2 components: one
for each variable in x and y and two additional components for α
and β. Throughout the remainder of this proof, the vector 0 will vary
in size to ensure that all vectors have the correct size. The gradients
are:
(1) ∇z(x, y, α, β) = ((A + B)y, (A + B)T x, −1, −1)T ;
(2) ∇(Ai· y − α) = (0, ATi· , −1, 0)T ,   i ∈ {1, . . . , m};
(3) ∇(xT B·j − β) = (B·j , 0, 0, −1)T ,   j ∈ {1, . . . , n};
(4) ∇(1Tm x − 1) = (1m , 0, 0, 0)T ;
(5) ∇(1Tn y − 1) = (0, 1n , 0, 0)T ;
(6) ∇(−xi ) = (−ei , 0, 0, 0)T ;
(7) ∇(−yj ) = (0, −ej , 0, 0)T .
In the final two gradient computations, ei ∈ Rm×1 and ej ∈ Rn×1


so that the standard basis vectors agree with the dimensionality of
x and y, respectively. The dual feasibility constraints of the KKT
conditions for the quadratic program assert that:
(1) λ1 , . . . , λm ≥ 0,
(2) μ1 , . . . , μn ≥ 0,
(3) φ1 , . . . , φm ≥ 0,
(4) θ1 , . . . , θn ≥ 0,
(5) ν1 ∈ R, and
(6) ν2 ∈ R.
The KKT equality that appears in the dual feasibility conditions is
((A + B)y, (A + B)T x, −1, −1)T − Σi λi (0, ATi· , −1, 0)T − Σj μj (B·j , 0, 0, −1)T
− ν1 (1m , 0, 0, 0)T − ν2 (0, 1n , 0, 0)T − Σi φi (−ei , 0, 0, 0)T − Σj θj (0, −ej , 0, 0)T = 0,   (8.10)

where the first term is ∇z, the sums over i run from 1 to m, and the sums over j run from 1 to n.

We can analyze this expression row by row, starting with the last
row. Writing the last row on its own yields
−1 + μ1 + · · · + μn = 0.

This implies

μ1 + · · · + μn = 1.
Similarly, analyzing the third row on its own yields


λ1 + · · · + λm = 1.   (8.11)

Thus, dual feasibility shows that (λ1 , . . . , λm ) ∈ Δm and (μ1 , . . .


μn ) ∈ Δn .
Rows 1 and 2 are more complex and represent m and n equa-
tions, corresponding to the m elements in x and the n elements in y.
Consider the jth component of the second row, which corresponds to
the variable yj . Isolating this component and simplifying yields the
equation
xT (A·j + B·j ) − (λ1 A1j + · · · + λm Amj ) − ν2 + θj = 0.

Note that we have used a transpose on the first term to write every-
thing in terms of the jth column of A + B, rather than the jth row
of (A + B)T . In the remaining terms, we are simply extracting the
jth component. This equation implies
xT (A·j + B·j ) − (λ1 A1j + · · · + λm Amj ) − ν2 ≤ 0.   (8.12)

We can similarly analyze the ith component of the first row corre-
sponding to the variable xi . The result is the (simplified) equation

(Ai· + Bi· )y − (μ1 Bi1 + · · · + μn Bin ) − ν1 ≤ 0.   (8.13)

Eqs. (8.12) and (8.13) can be generalized to a system of equations


(constraints) using matrix arithmetic. We have
xT (A + B) − λT A − ν2 1Tn ≤ 0 (8.14)
since j ∈ {1, . . . , n}, and
(A + B)y − Bμ − ν1 1m ≤ 0 (8.15)
since i ∈ {1, . . . , m}. Here, λ = λ1 , . . . , λm  and μ = μ1 , . . . , μn .
There is now a trick required to complete the proof. Suppose we
choose the Lagrange multipliers so that λ = x and μ = y. We are free
to do so because we have already proven that λ ∈ Δm and μ ∈ Δn .


Furthermore, suppose we choose ν1 = α and ν2 = β. Then, if x∗ , y∗ ,
α∗ , β ∗ is an optimal solution, then Eqs. (8.14) and (8.15) become
x∗T (A + B) − x∗T A − β ∗ 1Tn ≤ 0,
(A + B)y∗ − By∗ − α∗ 1m ≤ 0.
Simplifying yields the inequalities
x∗ T B − β ∗ 1Tn ≤ 0,
Ay∗ − α∗ 1m ≤ 0.
We also know that:
(1) 1Tm x∗ = 1,
(2) 1Tn y∗ = 1,
(3) x ≥ 0, and
(4) y ≥ 0.
Finally, if we apply complementary slackness for the quadratic
programming problem, we see that
λi (Ai· y − α) = 0,   i ∈ {1, . . . , m},
(xT B·j − β) μj = 0,   j ∈ {1, . . . , n}.
Using our assumptions that λi = x∗i and μj = yj∗ and summing over
i yields
x∗1 (A1· y∗ − α∗ ) + · · · + x∗m (Am· y∗ − α∗ ) = 0.
However, this implies
x∗1 A1· y∗ + · · · + x∗m Am· y∗ − α∗ (x∗1 + · · · + x∗m ) = 0.
Rewriting this as a matrix equation yields
x∗ T Ay∗ − α∗ = 0. (8.16)
Performing the same computation but this time summing over j
yields
(x∗T B·1 − β ∗ ) μ1 + · · · + (x∗T B·n − β ∗ ) μn = 0.
Following a similar analysis as before at last yields the expression

x∗T By∗ − β ∗ = 0. (8.17)

If we add Eqs. (8.16) and (8.17), we obtain

x∗ T (A + B)y∗ − α∗ − β ∗ = 0.

Thus, we have shown that any tuple, (x∗ , y∗ , α∗ , β ∗ ), satisfying the


KKT conditions for the quadratic programming problem must also
be a global maximizer because Section 8.3 asserts that the objective
function of the quadratic programming problem takes on its maxi-
mum value. Moreover, by Lemma 8.11, it follows that such a point
must also be a Nash equilibrium for G.
(⇒) The converse of the theorem states that if (x∗ , y∗ ) is a Nash
equilibrium for G, then setting α∗ = x∗T Ay∗ and β ∗ = x∗T By∗
gives an optimal solution, (x∗ , y∗ , α∗ , β ∗ ), to the quadratic program.
It follows from Lemma 8.11 that when (x∗ , y∗ ) is a Nash equilibrium,
we know that

x∗T Ay∗ − α∗ = 0,
x∗ T By∗ − β ∗ = 0,

and thus we know at once that

x∗ T (A + B)y∗ − α∗ − β ∗ = 0

holds. Therefore, it follows that (x∗ , y∗ , α∗ , β ∗ ) must be a global


maximizer for the quadratic program because the objective function
achieves its upper bound. This completes the proof. 
Example 8.13. Recall the explicit Chicken game, which had the
payoff matrices

A = [[0, −1], [1, −10]],
B = [[0, 1], [−1, −10]].

From Exercise 4.4, we know that the pure strategy (e1 , e2 ) is a


Nash equilibrium. Symmetry tells us that the pure strategy (e2 , e1 )
must also be a Nash equilibrium. We can find a third, non-obvious,


mixed-strategy Nash equilibrium using the quadratic programming
formulation.
The quadratic program for this instance of Chicken is
max [x1 x2 ] [[0, 0], [0, −20]] [y1 , y2 ]T − α − β,
s.t. [[0, −1], [1, −10]] [y1 , y2 ]T − [α, α]T ≤ [0, 0]T ,
     [x1 x2 ] [[0, 1], [−1, −10]] − [β, β] ≤ [0, 0],
     [1 1] [x1 , x2 ]T = 1,                              (8.18)
     [1 1] [y1 , y2 ]T = 1,
     [x1 , x2 ]T ≥ [0, 0]T ,
     [y1 , y2 ]T ≥ [0, 0]T .

We can write this without matrix notation as




max −20x2 y2 − α − β,
s.t. −y2 − α ≤ 0,
     y1 − 10y2 − α ≤ 0,
     −x2 − β ≤ 0,
     x1 − 10x2 − β ≤ 0,          (8.19)
     x1 + x2 = 1,
     y1 + y2 = 1,
     x1 , x2 , y1 , y2 ≥ 0.
In Mathematica, the problem can be solved using the following code.


NMaximize[
{
-20x2*y2-a-b, (*Objective *)
-y2 - a <= 0, (*Constraint 1*)
y1 - 10y2 - a <= 0, (*Constraint 2*)
-x2 - b <= 0, (*Constraint 3*)
x1 - 10x2 - b <= 0, (*Constraint 4*)
x1 + x2 == 1, (*Constraint 5*)
y1 + y2 == 1, (*Constraint 6*)
x1>=0, x2>=0, y1>=0, y2>=0
},
{x1,x2,y1,y2,a,b} (*Variables*),
Method -> "DifferentialEvolution" (*Method to use*)
]

We obtain the solution as


{0., {x1 -> 0.9, x2 -> 0.1,
y1 -> 0.9, y2 -> 0.1,
a -> -0.1, b -> -0.1}}.

That is, we have

x∗1 = y1∗ = 9/10,   x∗2 = y2∗ = 1/10.
This is the mixed-strategy Nash equilibrium for this specific payoff
matrix. We know that this must be a Nash equilibrium because the
objective function correctly evaluates to zero at this point. It is pos-
sible (though a little challenging) to show that Chicken games always
have the two pure Nash equilibria given above and a mixed-strategy
Nash equilibrium, which can be found through this quadratic pro-
gramming approach, if so desired.

8.4 Chapter Notes

Mangasarian and Stone’s work [95], as presented in this chapter,


was not the first to develop an algorithmic approach to computing
Nash equilibria in general-sum bimatrix games. Lemke and How-


son [96] provided what is now called the Lemke–Howson algorithm
and is generally considered more useful than the quadratic pro-
gramming approach. However, the Lemke–Howson algorithm is more
complex to describe and requires a custom implementation, mak-
ing it less suitable for a first presentation. Lemke and Howson’s
approach yields an additional fact: Non-degenerate, two-player bima-
trix games (where non-degenerate is defined in Ref. [96]) always have
an odd number of Nash equilibria. This explains why the games we
have seen so far seem to have one equilibrium or three equilibria.
While it is relatively easy to construct degenerate games, this fact
is not evident from the quadratic programming formulation. Wil-
son [97] generalized this result to N -player games, showing that
in non-degenerate cases, they too have an odd number of equilib-
ria. Lemke and Howson’s approach has been further generalized to
a theory of linear (and nonlinear) complementarity problems [98],
which forms an entire class of mathematical programming problems.
This has also led to methods for studying more general kinds of
games, called Stackelberg games [99], using mixed complementarity
approaches [100].
The computational methods developed for finding Nash equilibria,
beginning with the work of Lemke and Howson [96] and Mangasarian
and Stone [95], also mark the beginning of algorithmic game
theory [3,101–103], a topic of study in its own right. This area mixes
computer science, mathematics, and economics to identify algorith-
mic foundations for analyzing games and finding equilibria. One of
the main results to emerge from algorithmic game theory is the
study of the computational complexity of finding a Nash equilib-
rium. Unlike linear programming problems, which we noted could be
solved in polynomial time, arbitrary quadratic programming prob-
lems can be much harder to solve. In fact, it is possible to prove
that solving an arbitrary quadratic programming problem is NP-
hard, which is a formal way of classifying the difficulty of solving
this problem. This would seem to settle the question of how dif-
ficult finding Nash equilibria is; however, it does not since we are
not deriving an arbitrary quadratic programming problem but one
with a specific form. Consequently, this has led to an exceptionally
deep collection of results on the computational complexity of finding
or approximating Nash equilibria. See Refs. [104–108] for examples.
Many of these results use a specialized computational complexity


class called PPAD, introduced by Papadimitriou.
– ♠♣♥♦ –

8.5 Exercises

8.1 Construct the quadratic programming problem for an instance


of prisoner’s dilemma. Show that the strategy in which both players
defect is a Nash equilibrium.

8.2 Consider a Chicken game with the payoff matrices


 
A = [[T, L], [W, X]],   B = [[T, W], [L, X]],

where W > T > L > X. Define


p = (W − T ) / [(W − T ) + (L − X)],   γ = (LW − T X) / [(W − T ) + (L − X)].

Show that x = ⟨1 − p, p⟩ and y = ⟨1 − p, p⟩ is the mixed-strategy


Nash equilibrium for Chicken using α = β = γ in the quadratic
programming formulation. If you like, you can derive this equilibrium
yourself: Take the two constraints that come from Ay − α1m ≤ 0,
and subtract them to remove α. Treat the resulting inequality as
an equation. Assuming y1 = 1 − p and y2 = p, solve for p. You
get the same result if you apply this to the constraints coming from
xT B − β1Tn ≤ 0. Now, take the constraints and add them instead,
using the values you just found for y1 and y2 (or x1 and x2 ). Treat
this as an equation and solve for α (or β). You will obtain the formula
that defines γ. It is also worth noting that this is not the simplest
way to derive this equilibrium.

8.3 Show that when B = −A (i.e., we have a zero-sum game),


the quadratic programming problem reduces to the two dual linear
programming problems we already identified in the previous chapter
for solving zero-sum games.
Part 3

Cooperation in Game Theory


Chapter 9

Nash’s Bargaining Problem and


Cooperative Games

Chapter Goals: Heretofore, we have considered games in


which the players were unable to communicate before play began
or in which players have no way of trusting each other with
certainty (remember prisoner’s dilemma). In this chapter, we
remove this restriction and consider those games in which play-
ers may put in place a pre-play agreement in an attempt to
identify a solution with which both players can live happily.
We begin by defining the cooperative expected payoff and pay-
off regions in both cooperative and competitive games. We then
consider cooperative play as a multi-criteria optimization prob-
lem, defining the necessary machinery to formalize this, includ-
ing Pareto optimality. We then introduce Nash’s bargaining
axioms and prove the Nash bargaining theorem using some tools
from real analysis.


9.1 Payoff Regions in Two-Player Games

Remark 9.1. We will be dealing with bimatrix games throughout


this chapter. We assume the two matrices have component forms:
A = [[a11 , a12 , . . . , a1n ], [a21 , a22 , . . . , a2n ], . . . , [am1 , am2 , . . . , amn ]],
B = [[b11 , b12 , . . . , b1n ], [b21 , b22 , . . . , b2n ], . . . , [bm1 , bm2 , . . . , bmn ]].
Definition 9.2 (Cooperative Mixed Strategy). Let G =
(P, Σ, A, B) be a two-player bimatrix game, with A, B ∈ Rm×n .
A cooperative strategy is a collection of probabilities, xij (i =
1, . . . , m, j = 1, . . . , n), so that
x11 + x12 + · · · + xmn = 1,
xij ≥ 0,   i ∈ {1, . . . , m}, j ∈ {1, . . . , n}.


We associate a vector x ∈ Δmn with this cooperative strategy.
Remark 9.3. For any cooperative strategy xij with i ∈ {1, . . . , m}
and j ∈ {1, . . . , n}, xij gives the probability that Player 1 plays row
i while Player 2 plays column j. Note that x could be considered a
matrix, but for the sake of notational consistency, it is easier to think
of it as a vector with a strange indexing scheme.
Definition 9.4 (Cooperative Expected Payoff ). Let G =
(P, Σ, A, B) be a two-player bimatrix game, with A, B ∈ Rn×m , and
let x ∈ Δmn be a cooperative strategy for Players 1 and 2. Then,
u1 (x) = a11 x11 + a12 x12 + · · · + amn xmn       (9.1)

is the expected payoff for Player 1, while

u2 (x) = b11 x11 + b12 x12 + · · · + bmn xmn       (9.2)

is the expected payoff for Player 2.
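As a quick sketch (not from the text), these cooperative payoffs are easy to compute: if x is stored as a matrix of probabilities xij , then ui (x) is the sum of the elementwise product of the payoff matrix with x. The matrices below are those of Example 9.9 later in this section, and the joint strategy places probability 1/2 on each diagonal outcome.

(* Cooperative expected payoffs of Definition 9.4 for the matrices of Example 9.9. *)
A = {{2, -1}, {-1, 1}};
B = {{1, -1}, {-1, 2}};
x = {{1/2, 0}, {0, 1/2}};        (*Joint probability that (row i, column j) is played*)
{Total[A*x, 2], Total[B*x, 2]}   (*{u1, u2} = {3/2, 3/2}*)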


Definition 9.5 (Competitive Game Payoff Region). Let G =


(P, Σ, A, B) be a two-player bimatrix game, with A, B ∈ Rn×m. The
payoff region of the competitive game is

Q(A, B) = {(u1 (x, y), u2 (x, y)): x ∈ Δm , y ∈ Δn }, (9.3)

where

u1 (x, y) = xT Ay,
u2 (x, y) = xT By

are the standard player payoff functions.


Definition 9.6 (Cooperative Game Payoff Region). Let G =
(P, Σ, A, B) be a two-player matrix game, with A, B ∈ Rn×m . The
payoff region of the cooperative game is

P (A, B) = {(u1 (x), u2 (x)): x ∈ Δmn }, (9.4)

where u1 and u2 are the cooperative payoff functions for Players 1


and 2, respectively.
Remark 9.7. The following lemma is left as an exercise.
Lemma 9.8. Let G = (P, Σ, A, B) be a two-player matrix game, with
A, B ∈ Rn×m . The competitive payoff region Q(A, B) is contained
in the cooperative payoff region P (A, B).
Example 9.9. Consider a two-player bimatrix game with matrices

A = [[2, −1], [−1, 1]],   B = [[1, −1], [−1, 2]].

Games of this type are historically referred to as the Battle of the


Sexes game, though in modern parlance they are called the Battle
of the Buddies or, simply, coordination games. The story describes
the decision-making process of a married couple or a pair of friends
as they attempt to decide what to do on a given evening. In the
classic Battle of the Sexes game, the players must decide whether
to attend a boxing match or a ballet. One clearly prefers the boxing
match (strategy 1 for each player), and the other prefers the ballet
(strategy 2 for each player). Neither derives much benefit from going
to an event alone, which is indicated by the payoff of −1 in the off-diagonal elements. The competitive payoff region, the cooperative payoff region, and an overlay of the two regions for the Battle of the Sexes are shown in Fig. 9.1. These figures are constructed by brute force in Mathematica.

Fig. 9.1. The three plots show (a) the competitive payoff region, (b) the cooperative payoff region, and (c) an overlay of the regions for the Battle of the Sexes game. Note that the cooperative payoff region completely contains the competitive payoff region.
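The brute-force construction can be sketched as follows (this is an assumed reconstruction, not the author's original code): the competitive region is traced by sampling independent mixed strategies for the two players, while the cooperative region is traced by sampling joint distributions over the four strategy pairs.

(* Sketch of brute-force sampling for the two payoff regions of Fig. 9.1. *)
A = {{2, -1}, {-1, 1}};
B = {{1, -1}, {-1, 2}};

(* Competitive region: independent mixed strategies x and y. *)
competitive = Flatten[Table[
    With[{x = {t, 1 - t}, y = {s, 1 - s}},
     {x . A . y, x . B . y}],
    {t, 0, 1, 0.02}, {s, 0, 1, 0.02}], 1];

(* Cooperative region: random joint distributions over the four outcomes. *)
cooperative = Table[
   With[{p = #/Total[#, 2] &[RandomReal[{0, 1}, {2, 2}]]},
    {Total[A*p, 2], Total[B*p, 2]}],
   {10000}];

ListPlot[{competitive, cooperative}]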
Remark 9.10. Coordination games are generally described by pay-
off matrices in which the largest values occur on the diagonal, but in
which players do not (necessarily) agree on the relative value of coor-
dinated strategies. An example generalization of the example above
would be a pair of payoff matrices,
A = [[R, L], [L, S]],   B = [[S, L], [L, R]],
where R > S > L. Thus, Player 1 prefers it when both players play
strategy 1, while Player 2 prefers it when both players play strategy 2
and neither prefers using opposite strategies. Thus, the players prefer
to coordinate strategies. In contrast, an anti-coordination game has
large values on the off-diagonal but smaller values on the diagonal.
Remark 9.11. Coordination games offer the possibility for coop-
eration since both players agree that coordination is preferred. The
remainder of this chapter illustrates one way to build such a coordi-
nated cooperative strategy.
Theorem 9.12. Let G = (P, Σ, A, B) be a two-player bimatrix
game, with A, B ∈ Rn×m . The cooperative payoff region P (A, B)
is a convex set.
Proof. The set P (A, B) is defined as a set of (u1 , u2 ) satisfying
the constraints
a11 x11 + · · · + amn xmn − u1 = 0,
b11 x11 + · · · + bmn xmn − u2 = 0,
x11 + · · · + xmn = 1,                       (9.5)
xij ≥ 0,   i ∈ {1, . . . , m}, j ∈ {1, . . . , n}.

This set is defined by linear equalities (which are both convex


and concave). Since linear functions are convex, the set of tuples
(u1 , u2 , x) that satisfy these constraints is a convex set by Theo-
rems 6.18 and 6.29. Denote this set by X. It follows immediately that
if (u1 , u2 ) ∈ P (A, B), then there is a x ∈ Δmn so that (u1 , u2 , x) ∈ X.
Choose two points, (u11 , u12 ), (u21 , u22 ) ∈ P (A, B), and suppose that
(u11 , u12 , x1 ) and (u21 , u22 , x2 ) are the corresponding elements of X.

Since X is convex, for all λ ∈ [0, 1], we have

λ(u11 , u12 , x1 ) + (1 − λ)(u21 , u22 , x2 ) ∈ X. (9.6)

Let

(u1 , u2 , x) = λ(u11 , u12 , x1 ) + (1 − λ)(u21 , u22 , x2 ).


But then, (u1 , u2 ) ∈ P (A, B) because (u1 , u2 , x) ∈ X. Therefore,

λ(u11 , u12 ) + (1 − λ)(u21 , u22 ) ∈ P (A, B) (9.7)

for all λ ∈ [0, 1]. It follows that P (A, B) is convex. 

Remark 9.13. The following theorem assumes familiarity with the


definition of a closed and bounded set in Rn . A set is bounded in
Rn if it can be enclosed in a ball (hypersphere) of sufficient size.
There are many consistent definitions for a closed set in Rn , all of
which are outside the scope of this book and require a little topology
or real analysis. As a consequence, we will not provide a complete
proof of the following theorem but will take the whole statement
as true. Those readers interested in a formal definition of closure
(and compactness, which is what we are dealing with) might consult
Munkres’ text on topology [61] (Chapter 3).
Theorem 9.14. Let G = (P, Σ, A, B) be a two-player bimatrix
game, with A, B ∈ Rn×m. Let

X = {(x, u1 , u2 ) ∈ Δmn × R × R : a11 x11 + · · · + amn xmn − u1 = 0,
                                    b11 x11 + · · · + bmn xmn − u2 = 0}.

Then, both X and the cooperative payoff region P (A, B) are bounded
and closed sets.

Sketch of Boundedness Proof. We note that X is equivalently


defined by the expressions in Eq. (9.5). This set must be bounded
because xij ∈ [0, 1] for i ∈ {1, . . . , m} and j ∈ {1, . . . , n}. As a result
of this, the value of u1 is bounded above and below by the largest
and smallest values in A, while the value of u2 is bounded above and
below by the largest and smallest values in B. The fact that P (A, B)
is bounded is clear from this argument as well. 
Remark 9.15. We make use of this theorem when we analyze an
optimization problem whose feasible region is X later in this chapter.
9.2 Collaboration and Multi-Criteria Optimization

Remark 9.16. Recall the generic optimization problem:





max z(x1 , . . . , xn ),
s.t. g1 (x1 , . . . , xn ) ≤ 0,
     ...
     gm (x1 , . . . , xn ) ≤ 0,
     h1 (x1 , . . . , xn ) = 0,
     ...
     hl (x1 , . . . , xn ) = 0.

Here, z : Rn → R, gi : Rn → R (i = 1, . . . , m), and hj : Rn → R.


This problem has a single objective function: z(x1 , . . . , xn ).
Definition 9.17. A multi-criteria optimization problem is an opti-
mization problem with several simultaneous objective functions,
z1 , . . . , zs : Rn → R, written as

max ⟨z1 (x1 , . . . , xn ), z2 (x1 , . . . , xn ), . . . , zs (x1 , . . . , xn )⟩,
s.t. g1 (x1 , . . . , xn ) ≤ 0,
     ...
     gm (x1 , . . . , xn ) ≤ 0,
     h1 (x1 , . . . , xn ) = 0,
     ...
     hl (x1 , . . . , xn ) = 0.

Remark 9.18. Note that the objective function has now been
replaced with a vector of objective functions. Multi-criteria optimiza-
tion problems can be challenging to solve because, among other rea-
sons, making z1 (x1 , . . . , xn ) larger may make z2 (x1 , . . . , xn ) smaller,
and vice versa.
Example 9.19 (The Green Toy Maker). For the sake of argu-
ment, consider the toy maker problem from Example 7.3:



max z(x1 , x2 ) = 7x1 + 6x2 ,
s.t. 3x1 + x2 ≤ 120,
     x1 + 2x2 ≤ 160,
     x1 ≤ 35,
     x1 ≥ 0,
     x2 ≥ 0.

Suppose a certain amount of pollution is created each time a toy


is manufactured. Each plane generates 3 units of pollution during
manufacturing, while each boat generates only 2 units of pollution.
Since x1 is the number of planes produced and x2 is the number of
boats produced, we can create a multi-criteria optimization problem
in which we simultaneously attempt to maximize profit 7x1 + 6x2
and minimize pollution 3x1 + 2x2 . Since every minimization prob-
lem can be transformed into a maximization problem by negating
the objective, we have the problem

max ⟨7x1 + 6x2 , −3x1 − 2x2 ⟩,
s.t. 3x1 + x2 ≤ 120,
     x1 + 2x2 ≤ 160,
     x1 ≤ 35,
     x1 ≥ 0,
     x2 ≥ 0.

Remark 9.20. For n > 1, we can choose many ways to order ele-
ments in Rn . For example, on the plane, there are many ways to
decide that a point (x1 , y1 ) is greater than, less than, or equivalent
to another point (x2 , y2 ). We can think of these as the various ways
of assigning a preference relation  to points on the plane (or more
generally points in Rn ). Here are three common ways to assign an
order to points on the plane:
(1) Points can be ordered based on their Euclidean distance to the


origin, i.e.,
 
(x1 , y1 )  (x2 , y2 ) ⇐⇒ √(x1² + y1²) > √(x2² + y2²).

(2) A lexicographic ordering can be used by comparing the first com-


ponent and then the second component.
(3) A parameter, λ ∈ R, can be specified, and the ordering from R
can be used via the formula

(x1 , y1 )  (x2 , y2 ) ⇐⇒ x1 + λy1 > x2 + λy2 .

For this reason, a multi-criteria optimization problem may have many


equally good solutions, depending on the order chosen. The following
definition gives us a way to think about the tradeoffs that may occur
for various ordering choices.
Definition 9.21 (Pareto Optimality). Let gi : Rn → R (i =
1, . . . , m), hj : Rn → R, and zk : Rn → R (k = 1, . . . , s). Consider
the multi-criteria optimization problem

max ⟨z1 (x1 , . . . , xn ), z2 (x1 , . . . , xn ), . . . , zs (x1 , . . . , xn )⟩,
s.t. g1 (x1 , . . . , xn ) ≤ 0,
     ...
     gm (x1 , . . . , xn ) ≤ 0,
     h1 (x1 , . . . , xn ) = 0,
     ...
     hl (x1 , . . . , xn ) = 0.

A payoff vector, z(x∗ ), dominates another payoff vector, z(x) (for


two feasible points x, x∗ ), if:
(1) zk (x∗ ) ≥ zk (x) for all k ∈ {1, . . . , s} and
(2) zk (x∗ ) > zk (x) for at least one k ∈ {1, . . . , s}.
A solution, x∗ , is said to be Pareto optimal if z(x∗ ) is not dominated
by any other vector, z(x), where x is any other feasible solution.
Remark 9.22. A solution, x∗ , is Pareto optimal if changing the


strategy can only benefit one objective function at the expense of
another objective function. Put in terms of Example 9.19, a produc-
tion pattern (x∗1 , x∗2 ) is Pareto optimal if there is no way to change
either x1 or x2 to both increase profit and decrease pollution.
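One common (though not exhaustive) way to generate Pareto optimal solutions is to maximize a positively weighted combination of the objectives, sweeping over the weights. The following sketch (not from the text) applies this scalarization idea to the green toy maker of Example 9.19.

(* Sketch: tracing Pareto optimal production plans for Example 9.19 by maximizing
   a weighted sum of profit and (negated) pollution for several weights. *)
Table[
  NMaximize[{
     lambda (7 x1 + 6 x2) + (1 - lambda) (-3 x1 - 2 x2),
     3 x1 + x2 <= 120, x1 + 2 x2 <= 160, x1 <= 35,
     x1 >= 0, x2 >= 0},
    {x1, x2}],
  {lambda, 0.1, 0.9, 0.2}]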
Definition 9.23 (Multi-criteria Optimization Problem for
Cooperative Games). Let G = (P, Σ, A, B) be a two-player matrix
game, with A, B ∈ Rn×m . Then, the cooperative game multi-criteria
optimization problem is
max ⟨u1 (x) − u01 , u2 (x) − u02 ⟩,
s.t. x11 + x12 + · · · + xmn = 1,                    (9.8)
     xij ≥ 0,   i ∈ {1, . . . , m}, j ∈ {1, . . . , n}.
Here, x is a cooperative mixed strategy;
u1 (x) = a11 x11 + · · · + amn xmn ,
u2 (x) = b11 x11 + · · · + bmn xmn

are the cooperative expected payoff functions; and u01 and u02 are
status quo payoff values – usually assumed to be Nash equilibrium
payoff values for the two players.
Example 9.24. The three Nash equilibria of the Battle of the Sexes
game, along with the Pareto optimal payoff points, are illustrated
in Fig. 9.2. Note that in a maximization problem, the Pareto payoff
points are always up and to the right and are at the boundary of the
region containing all the possible payoffs. In a sense, this makes these
points very similar to the solutions of linear programming problems.
The set of points that are all Pareto optimal is sometimes called the
Pareto frontier.
Remark 9.25. There is a substantial amount of research on solving
multi-criteria optimization problems, which arise frequently in the
real world. The interested reader might consider Ref. [109].
Fig. 9.2. The Pareto payoff points in P(A, B), along with the payoff points of
the three Nash equilibria of the Battle of the Sexes.

9.3 Nash’s Bargaining Axioms

Remark 9.26. For a two-player matrix game, G = (P, Σ, A, B),


with A, B ∈ Rn×m , Nash studied the problem of finding a cooper-
ative mixed strategy, x ∈ Δmn , that would maximally benefit both
players. The resulting strategy x∗ is referred to as an arbitration
procedure and is agreed to by the two players before play begins.
In essence, we imagine this as a pre-play contract that can be enforced
by charging a large penalty if either player breaks the contract to
return the Nash equilibrium.

Remark 9.27. In solving this problem, Nash quantified six axioms


(or assumptions) that he wished to ensure. We describe them as
follows.

Assumption 1 (Rationality). If x∗ is an arbitration procedure, we


must have u1 (x∗ ) ≥ u01 and u2 (x∗ ) ≥ u02 , where (u01 , u02 ) ∈ Q(A, B) is
a non-cooperative status quo, usually a Nash equilibrium payoff.

Remark 9.28. Assumption 1 asserts that no player should be incen-


tivized to return to the non-cooperative status quo.
Assumption 2 (Pareto Optimality). Any arbitration proce-


dure, x∗ , is a Pareto-optimal solution to the two-player coopera-
tive game multi-criteria optimization problem in Eq. (9.8). That is,
(u1 (x∗ ), u2 (x∗ )) is Pareto optimal.

Assumption 3 (Feasibility). The arbitration procedure x∗ ∈ Δmn


and (u1 (x∗ ), u2 (x∗ )) ∈ P (A, B).

Assumption 4 (Independence of Irrelevant Alternatives).


If x∗ is an arbitration procedure and P  ⊆ P (A, B), with
(u01 , u02 ), (u1 (x∗ ), u2 (x∗ )) ∈ P  , then x∗ is still an arbitration pro-
cedure when we restrict our attention to P  (and the corresponding
subset of Δmn ).

Remark 9.29. Assumption 4 may seem odd. It was constructed to


deal with restrictions on the payoff space, which in turn result in a
restriction on the space of feasible solutions to the two-player coop-
erative game multi-criteria optimization problem. It states that if we
add constraints to Eq. (9.8) but the resulting multi-criteria prob-
lem still has u01 and u02 as valid status quo values and the current
arbitration procedure is still available because (u1 (x∗ ), u2 (x∗ ), x∗ )
remains in the reduced feasible region, then the arbitration proce-
dure will not change, even though we have restricted the feasible
region.

Assumption 5 (Invariance Under Linear Transformation).


If u1 (x) and u2 (x) are replaced by u′i (x) = αi ui (x) + βi with αi > 0, and u0i is replaced by u′0i = αi u0i + βi for i ∈ {1, 2}, and x∗ is an arbitration procedure for the original problem, then it is also an arbitration procedure for the transformed problem defined in terms of u′i and u′0i .

Remark 9.30. Assumption 5 states that arbitration procedures are


not affected by linear transformations of the underlying (linear) util-
ity function. (See Theorem 2.27.)

Definition 9.31 (Symmetry of P (A, B)). Let G = (P, Σ, A, B)


be a two-player matrix game, with A, B ∈ Rn×m . The set P (A, B) is
symmetric if, whenever (u1 , u2 ) ∈ P (A, B), then (u2 , u1 ) ∈ P (A, B).
Assumption 6 (Symmetry). If P (A, B) is symmetric and u01 = u02 ,


then the arbitration procedure x∗ has the property that u1 (x∗ ) =
u2 (x∗ ).
Remark 9.32. Assumption 6 states that if P (A, B) is symmetric,
i.e., P (A, B) is symmetric in R2 about the line y = x, then both
players will obtain the same payoff in an arbitration procedure, x∗ .
An inspection of Fig. 9.1 reveals that this is a symmetric payoff
region.
Remark 9.33. Our goal is to show that there is an arbitration
procedure, x∗ ∈ Δnm , that satisfies these assumptions and that the
resulting pair (u1 (x∗ ), u2 (x∗ )) ∈ P (A, B) is unique. This is Nash’s
bargaining theorem.

9.4 Nash’s Bargaining Theorem

Remark 9.34. We begin our proof of Nash’s bargaining theorem


with two lemmas. We will not prove the first, as it requires a bit of
analysis. The interested reader can refer to Rudin’s little book on
analysis [67].
Lemma 9.35 (Weierstrass’ Theorem). Let X be a non-empty
closed and bounded set in Rn , and let z be a continuous mapping
with z : X → R. Then, the optimization problem

max z(x),
(9.9)
s.t. x ∈ X,

has at least one solution: x∗ ∈ X.


Remark 9.36. The final lemma sets the stage for Nash’s bargaining
theorem. Note that, in this lemma, we choose an explicit method of
changing the multi-criteria problem in Definition 9.23 to a single-
criterion optimization problem. That is, we choose a way to order
pairs of real numbers. We also note that the proof of this result is
rather long. On a first reading, it is fine to skip the proof or simply
read the existence portion. Also, this proof is adapted from the one
in Ref. [2], with some simplifications.
Lemma 9.37. Let G = (P, Σ, A, B) be a two-player matrix game,


with A, B ∈ Rn×m . Let (u01 , u02 ) ∈ P (A, B). The quadratic program-
ming problem


⎪ max (u1 − u01 )(u2 − u02 ),



⎪ m  n



⎪ s.t. aij xij − u1 = 0,



⎪ i=1 j=1



⎪ m  n

⎪ 

⎪ bij xij − u2 = 0,


⎨ i=1 j=1

⎪ m 
 n



⎪ xij = 1,



⎪ i=1 j=1



⎪ xij ≥ 0 i ∈ {1, . . . , m}, j ∈ {1, . . . , n},





⎪ u1 ≥ u01 ,





u2 ≥ u02
has at least one global optimal solution, (u∗1 , u∗2 , x∗ ). Furthermore, if (u′1 , u′2 , x′ ) is an alternative optimal solution, then u∗1 = u′1 and u∗2 = u′2 .
Proof. As before, denote the feasible region by X. By the same
argument as in the proof of Theorem 9.14, X is a closed, bounded
set. Moreover, since (u01 , u02 ) ∈ P (A, B), we know that there is some
x0 satisfying the constraints given in Eq. (9.5) and that the tuple
(u01 , u02 , x0 ) is feasible for this problem. Thus, the feasible region is
non-empty. Applying Lemma 9.35, we know that there is at least one
(global optimal) solution to this problem.
To see the uniqueness of (u∗1 , u∗2 ), suppose that V = (u∗1 − u01 )(u∗2 − u02 ) and that we have a second solution, (u′1 , u′2 , x′ ), so that (without loss of generality) u′1 > u∗1 and u′2 < u∗2 but V = (u′1 − u01 )(u′2 − u02 ). By Theorem 9.12, we know X is convex; therefore,

(ū1 , ū2 , x̄) = (1/2)(u∗1 , u∗2 , x∗ ) + (1/2)(u′1 , u′2 , x′ ) ∈ X.
2 2
Evaluate the objective function at this point to obtain
  
V̄ = (ū1 − u01 )(ū2 − u02 ) = ((1/2)u∗1 + (1/2)u′1 − u01 )((1/2)u∗2 + (1/2)u′2 − u02 ).
Here, we are simply defining V̄ as the value of the objective function


at the new point. Note that for i ∈ {1, 2}, we can write
1 ∗ 1  1 1
ui + ui − u0i = (u∗i − u0i ) + (ui − u0i )
2 2 2 2
1 ∗ 
= (ui − u0i ) + (ui − u0i ) .
2
Then, the objective function at the new point becomes
1 ∗  
V̄ = (u1 − u01 ) + (u1 − u01 ) (u∗2 − u02 ) + (u2 − u02 ) .
4
Our objective is to show that this is an improved objective value. Let
Q∗i = u∗i − u0i Qi = ui − u0i .
Using these expressions, we can write
1 ∗  
V̄ = Q1 + Q1 Q∗2 + Q2 .
4
Before proceeding, note that from our definition of V and our
assumptions on u1 and u2 , we have
V = Q∗1 Q∗2 = Q1 Q2 .
Using this information, expand the expression for V̄ to obtain
⎛ ⎞
1
V̄ = ⎝Q∗1 Q∗2 + Q1 Q2 +Q∗1 Q2 + Q1 Q∗2 ⎠
4      
V V
1 
= 2V + Q∗1 Q2 + Q1 Q∗2 . (9.10)
4
The next part of the proof requires an algebraic trick. Expanding Q∗i
and Qi in u∗i and ui for i ∈ {1, 2} and applying some arithmetic1 will
show that the following equality holds:
Q∗1 Q2 + Q1 Q∗2 − 2V = Q∗1 Q2 + Q1 Q∗2 − Q∗1 Q∗2 − Q1 Q2
= (u∗1 − u1 )(u2 − u∗2 ).

1
You can check this by hand or use a computer algebra system, such as
MathematicaTM .
222 Game Theory Explained: A Mathematical Introduction with Optimization

We initially assumed that u∗1 > u1 and u2 > u∗2 ; therefore,

Q∗1 Q2 + Q1 Q∗2 − 2V > 0.

Let R = Q∗1 Q2 + Q1 Q∗2 − 2V . Assembling the pieces, we can rewrite
Eq. (9.10) as
1  1 R
V̄ = 4V + Q∗1 Q2 + Q1 Q∗2 − 2V = (4V + R) = V + > V
4 4 4
because R > 0. However, this contradicts our assumption that
(u∗1 , u∗2 , x∗ ) and (u1 , u2 , x ) were optimal solutions to the problem
because V̄ is an improved objective value. Therefore, we must have
u∗1 = u1 and u∗2 = u2 . This completes the proof. 
Theorem 9.38 (Nash’s Bargaining Theorem). Let G =
(P, Σ, A, B) be a two-player matrix game, with A, B ∈ Rm×n and
with (u01 , u02 ) ∈ P (A, B). Then, there is at least one arbitration pro-
cedure, x∗ ∈ Δmn , satisfying the six bargaining assumptions and,
moreover, the payoffs u1 (x∗ ) and u2 (x∗ ) are the unique payoff opti-
mal points in P (A, B).
Proof. Consider the quadratic programming problem from
Lemma 9.37:

    max  (u1 − u01 )(u2 − u02 ),
    s.t. Σ_{i=1}^{m} Σ_{j=1}^{n} aij xij − u1 = 0,
         Σ_{i=1}^{m} Σ_{j=1}^{n} bij xij − u2 = 0,
         Σ_{i=1}^{m} Σ_{j=1}^{n} xij = 1,                        (9.11)
         xij ≥ 0,  i = 1, . . . , m, j = 1, . . . , n,
         u1 ≥ u01 ,
         u2 ≥ u02 .
It suffices to show that the solution to this quadratic program pro-
vides an arbitration procedure, x, satisfying Nash’s assumptions.

Uniqueness follows immediately from Lemma 9.37. As before, denote


the feasible region of this problem by X.
Before proceeding, recall that Q(A, B), the payoff region for
the competitive game G, is contained in P (A, B). If (u01 , u02 ) is cho-
sen as an equilibrium for the competitive game, we know that
(u01 , u02 ) ∈ P (A, B). Thus, there is an x0 so that (u01 , u02 , x0 ) ∈ X,
and it follows that 0 is a lower bound for the maximal value of the
objective function. We now prove that the Nash bargaining assump-
tions hold.
Assumption 1. By the construction of this problem, we know that
u1 (x∗ ) ≥ u01 and u2 (x∗ ) ≥ u02 .
Assumption 2. By Lemma 9.37, any solution, (u∗1 , u∗2 , x∗ ), has
unique u∗1 and u∗2 . Thus, any other feasible solution, (u1 , u2 , x), must
have the property that either u1 < u∗1 or u2 < u∗2 . Therefore, (u∗1 , u∗2 )
must be Pareto optimal.
Assumption 3. Since the constraints of Eq. (9.11) properly contain
the constraints in Eq. (9.5), the assumption of feasibility is ensured.
Assumption 4. Suppose that P′ ⊆ P (A, B). Then, there is a subset
X′ ⊆ X corresponding to P′. If both (u∗1 , u∗2 ) ∈ P′ and (u01 , u02 ) ∈ P′,
it follows that there is a pair of cooperative mixed strategies, x∗ , x0 ∈
Δmn , so that (u∗1 , u∗2 , x∗ ) ∈ X′ and (u01 , u02 , x0 ) ∈ X′. Then, we can
define the new optimization problem

    max  (u1 − u01 )(u2 − u02 ),
    s.t. (u1 , u2 , x) ∈ X,                                      (9.12)
         (u1 , u2 , x) ∈ X′.

These constraints are consistent, and since

    (u∗1 − u01 )(u∗2 − u02 ) ≥ (u1 − u01 )(u2 − u02 )               (9.13)

for all (u1 , u2 , x) ∈ X, it follows that Eq. (9.13) must also hold for all
(u1 , u2 , x) ∈ X′ ⊆ X. Thus, (u∗1 , u∗2 , x∗ ) is also an optimal solution
for Eq. (9.12).
Assumption 5. Replace the objective function in Eq. (9.11) with

    [α1 u1 + β1 − (α1 u01 + β1 )][α2 u2 + β2 − (α2 u02 + β2 )]
        = α1 α2 (u1 − u01 )(u2 − u02 ).                            (9.14)

The constraints of the problem will not be changed since we assume
that α1 , α2 ≥ 0. To see this, note that a linear transformation of the
payoff values implies the new constraints

    Σ_{i=1}^{m} Σ_{j=1}^{n} (α1 aij + β1 )xij − (α1 u1 + β1 ) = 0  ⟺
    α1 Σ_{i=1}^{m} Σ_{j=1}^{n} aij xij + β1 Σ_{i=1}^{m} Σ_{j=1}^{n} xij − (α1 u1 + β1 ) = 0  ⟺
    α1 Σ_{i=1}^{m} Σ_{j=1}^{n} aij xij + β1 − α1 u1 − β1 = 0  ⟺
    Σ_{i=1}^{m} Σ_{j=1}^{n} aij xij − u1 = 0,                    (9.15)

where the second equivalence uses the fact that Σ_{i=1}^{m} Σ_{j=1}^{n} xij = 1.
A similar result holds for the constraints

    Σ_{i=1}^{m} Σ_{j=1}^{n} bij xij − u2 = 0.

Since the constraints are identical, it is clear that changing the objec-
tive function to the function in Eq. (9.14) will not affect the solution
since we are simply scaling the value by a positive number.
Assumption 6. Suppose that u0 = u01 = u02 and P (A, B) is sym-
metric. Assuming that P (A, B) is symmetric implies that (u∗2 , u∗1 ) ∈
P (A, B) and that

    (u∗1 − u01 )(u∗2 − u02 ) = (u∗1 − u0 )(u∗2 − u0 ) = (u∗2 − u0 )(u∗1 − u0 ).   (9.16)

Thus, for some x′ ∈ Δmn , we know that (u∗2 , u∗1 , x′ ) ∈ X since
(u∗2 , u∗1 ) ∈ P (A, B). However, this feasible solution achieves the same
objective value as the optimal solution, (u∗1 , u∗2 , x∗ ) ∈ X, and thus by
Lemma 9.37, we know that u∗1 = u∗2 .
Thus, we have shown that the six Nash bargaining assumptions
are satisfied. This completes the proof. 

Example 9.39. Consider the Battle of the Sexes game. Recall that

    A = [  2  −1 ]        B = [  1  −1 ]
        [ −1   1 ],           [ −1   2 ].

We can now find the arbitration process that produces the best coop-
erative strategy for the two players. This game has three Nash equi-
libria. The first two are in pure strategies:

(1) (e2 , e1 ): In this equilibrium, Player 2 always gets to go to the


ballet and Player 1 never goes to the boxing match.
(2) (e1 , e2 ): In this equilibrium, Player 1 always gets to go to the
boxing match and Player 2 never goes to the ballet.

The third equilibrium is a mixed-strategy equilibrium. We leave the


details of finding this equilibrium as Exercise 9.2. However, the payoffs
to the two players at this equilibrium are u01 = u02 = 1/5. Note that
the two pure strategy equilibria are unfair, while the mixed-strategy
equilibrium yields low total happiness.
We can apply Nash bargaining to try to improve the scenario.
Assume that we begin with the mixed-strategy Nash equilibrium.
Using the status quo, u01 = u02 = 1/5, the problem we must
solve is

    max  (u1 − 1/5)(u2 − 1/5),
    s.t. 2x11 − x12 − x21 + x22 − u1 = 0,
         x11 − x12 − x21 + 2x22 − u2 = 0,
         x11 + x12 + x21 + x22 = 1,                              (9.17)
         xij ≥ 0,  i ∈ {1, 2}, j ∈ {1, 2},
         u1 ≥ 1/5,
         u2 ≥ 1/5.

We use Mathematica to construct a solution with the following code.

Maximize[
{
(u1-1/5)*(u2-1/5), (*Objective function*)
2x11 - x12 - x21 + x22 - u1 == 0,
(*Constraint 1*)
x11 - x12 - x21 + 2x22 - u2 == 0,
(*Constraint 2*)
x11 + x12 + x21 + x22 == 1, (*Constraint 3*)
x11 >= 0, x12 >= 0, x21 >= 0, x22 >= 0,
u1 >= 1/5, u2 >= 1/5
},
{x11,x12,x21,x22,u1,u2} (*Variables*)
]

The solution is

{169/100, {x11 -> 1/2, x12 -> 0, x21 -> 0,


x22 -> 1/2, u1 -> 3/2, u2 -> 3/2}}.

Note that we have u1 = u2 = 3/2, as required by symmetry. Interpret-


ing the solution, this means that Players 1 and 2 should flip a fair
coin to decide whether they will both follow strategy 1 or strategy 2
(i.e., boxing or ballet). This solution is shown on the set P (A, B)
in Fig. 9.3. Note that the resulting solution is now on the Pareto
frontier, as expected.

Example 9.40. It is interesting to consider what happens when bar-


gaining begins from one of the other Nash equilibrium points. Both
of those points are already Pareto optimal. (See Fig. 9.2). Conse-
quently, starting at, say, x1 = 1, x2 = 0, y1 = 1, and y2 = 0, we see
that there is no way to ensure that each player receives at least as
much under the bargaining strategy as they do under the Nash equi-
librium while moving to a fairer position. Thus, the Nash bargaining
solution when starting at x1 = 1, x2 = 0, y1 = 1, and y2 = 0 is again
x1 = 1, x2 = 0, y1 = 1, and y2 = 0. This illustrates a known prin-
ciple: Always negotiate from a position of strength. In this example,
Player 1 is already getting what (s)he wants and has no incentive to
negotiate anything away.

Fig. 9.3. The Pareto-optimal Nash bargaining solution to the Battle of the
Sexes game is for each player to do what makes them happiest 50% of the time.
This seems like the basis for a fairly happy marriage, and it yields a Pareto-
optimal solution, shown by the green dot.

9.5 Chapter Notes

Nash presented this approach to bargaining in his 1953 paper on the


topic [110]. It is straightforward to generalize these results to games
with N > 2 players. The objective function of the Nash bargaining
quadratic programming problem is modified to obtain a product of
more than two terms. Nash bargaining is a subtopic of the broader
field of cooperative bargaining or bargaining theory [111]. Because
of its algorithmic nature, Nash bargaining has found several uses
in computer science, especially in resource allocation and network
design (see, e.g., Refs. [112–115]).
Multi-criteria optimization is also called multi-objective optimiza-
tion and occurs in a wide range of real-world settings. The simple
act of buying a car, where price, fuel efficiency, acceleration, cargo
capacity, and visual appeal all matter, is an exercise in multi-criteria optimiza-
tion. As already discussed, there are several ways of transforming
this vector optimization problem into a single-objective optimization
problem. A method not already mentioned is sometimes called an
ε-constraint method [116], though in more restricted settings, it can
be called goal programming. In this setting, a single-objective func-
tion is chosen for optimization, and the remaining objective functions

are given minimum (or maximum) targets and used in additional


constraints. These problems then have the form
    max  zk (x1 , . . . , xn ),
    s.t. zj (x1 , . . . , xn ) ≥ εj   for all j ≠ k,
         x ∈ X,
where X represents the remaining feasible region from the original
problem.
Because solutions to these problems usually become philosophical
(i.e., which objective is the most important?), it is sometimes more
helpful to quantify the Pareto frontier. This can be accomplished by
varying the objective function weights when using a linear combina-
tion of the objective functions to make a single objective. That is,

    z(x; λ) = Σ_{i=1}^{s} λi zi (x).

Here, λ = ⟨λ1 , . . . , λs ⟩ are the weights. Varying λ and solving the
single-criterion optimization problem with the objective function
z(x; λ) will define the Pareto frontier [116]. This process can be time-
consuming, and recent approaches attempt to use deep learning [117]
to estimate the Pareto frontier from a few examples.
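To make the weight-sweeping idea concrete, the following short Mathematica
sketch (our own illustration; the toy objectives z1 = x and z2 = y and the disk
constraint are assumptions, not an example from the text) traces points on the
Pareto frontier of a simple bi-objective problem by varying the weight:

(* Sketch: sweep the weight lambda to trace the Pareto frontier of the
   toy problem max (z1, z2) = (x, y) subject to x^2 + y^2 <= 1. *)
Table[
 {lambda,
  Maximize[{lambda*x + (1 - lambda)*y, x^2 + y^2 <= 1}, {x, y}]},
 {lambda, 0, 1, 1/10}
]

Each entry pairs a weight with the maximizer of the scalarized objective;
collecting the resulting (x, y) points sketches the Pareto frontier.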
– ♠♣♥♦ –

9.6 Exercises

9.1 Prove Lemma 9.8. [Hint: Argue that any pair of mixed strate-
gies can be used to generate a cooperative mixed strategy.]
9.2 Find a Nash equilibrium for the Battle of the Sexes game using
a quadratic programming problem.
9.3 Use Nash’s bargaining theorem to show that players should
trust each other and cooperate rather than defect in prisoner’s
dilemma.
9.4 Solve the Nash bargaining problem for the Chicken game for
each of its three Nash equilibria.
Chapter 10

An Introduction to N -Player
Cooperative Games

Chapter Goals: In this final chapter, we introduce some


elementary results about N -player cooperative games, which
extend the work we began in the previous chapter on bargain-
ing games. We continue to assume that the players in this game
can communicate with each other. This subject is a bit different
from what has been presented in the previous chapters. Here,
the goal is to study games in which it is in the players’ best inter-
est to come together in a grand coalition of cooperating players.
Thus, we define what a coalition is and how payoffs may be
assigned within groups of players. We introduce the concept of
essential games and the core. We also use our work from linear
programming to discuss the Bondareva–Shapley theorem as well
as Shapley values.

10.1 Motivating Cooperative Games

Definition 10.1 (Coalition of Players). Consider an N -player


game G = (P, Σ, π). Any set S ⊆ P is called a coalition of players.
The set S c = P\S is the dual coalition. The coalition P itself is called
the grand coalition.


Remark 10.2. Heretofore, we have always written P = {P1 , . . . ,


PN }; however, for the remainder of the chapter, we assume that
P = {1, . . . , N }. This will substantially simplify our notation.
Remark 10.3 (Two Coalition Games). Let G = (P, Σ, π) be
an N -player game. Suppose within a coalition S ⊆ P with S =
{i1 , . . . , i|S| }, the players i1 , . . . , i|S| agree to play some strategy,
σS = (σi1 , . . . , σi|S| ) ∈ Σi1 × · · · × Σi|S| ,
while players in S c = {j1 , . . . , j|S c | } agree to play the strategy
σS c = (σj1 , . . . , σj|S c | ).
Under these assumptions, we may suppose that the net payoff to
coalition S is

    KS = Σ_{i∈S} πi (σS , σS c ).                                 (10.1)

That is, the cumulative payoff to coalition S is simply the sum of


the payoffs of the members of the coalition from the payoff function
π in the game G. The payoff to the players in S c is defined similarly
as KS c . Then, we can think of the coalitions as playing a two-player
general-sum game, with payoff functions given by KS and KS c .
Definition 10.4 (Two-Coalition Game). Given an N -player
game G = (P, Σ, π) and a coalition S ⊆ P, with S = {i1 , . . . , i|S| }
and S c = {j1 , . . . , j|S c | }. The two-coalition game is the two-player
game

    GS = ({S, S c }, (Σi1 × · · · × Σi|S| ) × (Σj1 × · · · × Σj|S c | ), (KS × KS c )).

Remark 10.5. We leave the proof of the following lemma as an


exercise.
Lemma 10.6. For any two-coalition game GS , there is a Nash equi-
librium strategy for both the coalition S and its dual S c .
Definition 10.7 (Characteristic (Value) Function). Let S be a
coalition defined over an N -player game G. Then, the value function
v: 2^P → R is the expected payoff to S in the game GS when both
coalitions S and S c play their Nash equilibrium strategy in GS .

Remark 10.8. The characteristic or value function can be consid-


ered the net worth of the coalition to its members. Clearly,

v(∅) = 0

because the empty coalition can achieve no value. On the other hand,

v(P) = the largest sum of all payoff values possible

because a two-player game against the empty coalition will try to


maximize the value of Eq. (10.1). In general, v(P) answers the fol-
lowing question: If all N players worked together to maximize the
sum of their payoffs, which strategy would they all agree to choose,
and what would that sum be?
Theorem 10.9. If S, T ⊆ P and S ∩ T = ∅, then v(S) + v(T ) ≤
v(S ∪ T ).
Proof. Within S and T , the players may choose a strategy (jointly)
and independently to ensure that they receive at least v(S) + v(T );
however, the value of the game GS∪T to Player 1 (S ∪T ) may be larger
than the result yielded when S and T make independent choices, thus
v(S ∪ T ) ≥ v(S) + v(T ). 
Definition 10.10 (Superadditivity). A function v: 2^P → R is
superadditive if
(1) v(∅) = 0;
(2) v(S) + v(T ) ≤ v(S ∪ T ), for all S, T ⊆ P, assuming S ∩ T = ∅.
Remark 10.11. Theorem 10.9 states that when the function
v: 2^P → R is defined in terms of the Nash equilibrium payoff in
the two-coalition game, then the resulting function is superadditive.

10.2 Coalition Games

Remark 10.12. The goal of cooperative N -player games is to define


scenarios in which the grand coalition, P, is stable; that is, it is in
everyone’s interest to work together in one large coalition, P. The
problem then becomes to divide the coalition payoff v(S) among the
members of the coalition S, and by being in a coalition, the players
will improve their payoff over competing on their own.

Definition 10.13 (Coalition Game). A coalition game is a pair


(P, v) where P is the set of players and v: 2^P → R is a superadditive
characteristic function.
Remark 10.14. At this point, v can be any superadditive function.
It is not necessarily defined by the two-coalition game.
Definition 10.15 (Inessential Game). A game is inessential if

    v(P) = Σ_{i=1}^{N} v({i}).
A game that is not inessential is called essential.
Remark 10.16. An inessential game is one in which the total value
of the grand coalition does not exceed the sum of the values to the
players if they each played against all other players individually. That
is, there is no incentive for any player to join the grand coalition
because there is no chance that they will receive a better payoff,
assuming the total payoff of the grand coalition is divided among its
members.
Theorem 10.17. Let S ⊆ P. In an inessential game,

    v(S) = Σ_{i∈S} v({i}).

Proof. We proceed by contradiction. Suppose not; then,

    v(S) > Σ_{i∈S} v({i})

by superadditivity. Now,

    v(S c ) ≥ Σ_{i∈S c} v({i}),

and v(P) ≥ v(S) + v(S c ), which implies that

    v(P) ≥ v(S) + v(S c ) > Σ_{i∈S} v({i}) + Σ_{i∈S c} v({i}) = Σ_{i∈P} v({i}).

Thus,

    v(P) > Σ_{i=1}^{N} v({i});

therefore, the coalition game is not inessential. This is a direct con-


tradiction of our assumption. 
Corollary 10.18. A two-player zero-sum game produces an inessen-
tial coalition game.

10.3 Division of Payoff to the Coalition

Remark 10.19. Given a coalition game, (P, v), the goal is to equi-
tably divide v(S) among the members of the coalition in such a way
that the individual players prefer to be in the coalition rather than
to leave it.
Definition 10.20 (Imputation). Let (P, v) be a coalition game.
A tuple (x1 , . . . , xN ) (of payoffs to the individual players in P) is
called an imputation if:

(1) xi ≥ v({i}) for all i ∈ P, and
(2) Σ_{i∈P} xi = v(P).

Remark 10.21. The first criterion for a tuple (x1 , . . . , xN ) to be


an imputation states that each player must do better in the grand
coalition than they would on their own (against all other players).
The second criterion says that the total allotment of payoff to the
players cannot exceed the payoff received by the grand coalition itself.
Essentially, this second criterion asserts that the coalition cannot go
into debt to maintain its members. It is also worth noting that the
condition

    Σ_{i∈P} xi = v(P)

is equivalent to a statement on Pareto optimality in so far as play-


ers all together cannot expect to do any better than the net payoff
accorded to the grand coalition.
Definition 10.22 (Dominance). Let (P, v) be a coalition game.
Suppose x = (x1 , . . . , xN ) and y = (y1 , . . . , yN ) are two imputations.
Then, x dominates y over some coalition S ⊂ P if:

(1) xi > yi for all i ∈ S, and
(2) Σ_{i∈S} xi ≤ v(S).

Remark 10.23. The previous definition states that players in coali-


tion S prefer the payoffs they receive under x to the payoffs they
receive under y. Furthermore, these same players can threaten to
leave the grand coalition P because they may actually improve their
payoff by playing coalition S.

Definition 10.24 (Stable Set). A stable set X ⊆ Rn of imputa-


tions is a set satisfying the following criteria:

(1) No payoff vector x ∈ X is dominated over any coalition by


another imputation, y ∈ X.
(2) All payoff vectors y ∉ X are dominated by at least one vector,
x ∈ X.

Remark 10.25. Stable sets are (in some way) good sets of imputa-
tions in so far as they represent imputations that will make players
want to remain in the grand coalition.

10.4 The Core

Definition 10.26 (Core). Given a coalition game (P, v), the core
is the set of imputations

    C(v) = { x ∈ Rn : Σ_{i=1}^{N} xi = v(P) and Σ_{i∈S} xi ≥ v(S) for all S ⊆ P }.

Remark 10.27. A vector x ∈ C(v) is an imputation. This is clear, as

    Σ_{i∈P} xi = v(P),

and since {i} ⊂ P, we know that xi ≥ v({i}). However, the core


(when it is non-empty) is more than that.

Theorem 10.28. The core is contained in every stable set.

Proof. Let X be a stable set. If the core is empty, then it is con-
tained in X. Therefore, suppose x ∈ C(v). If x is dominated by any
vector z, then there is a coalition S ⊂ P so that zi > xi for all i ∈ S
and Σ_{i∈S} zi ≤ v(S). However, by the definition of the core, we know
that

    Σ_{i∈S} zi > Σ_{i∈S} xi ≥ v(S).

Thus, Σ_{i∈S} zi > v(S) and z cannot dominate x – a contradiction.
Hence, x is not dominated by any imputation, so by the second
criterion of Definition 10.24, x ∈ X, and the core is contained in X. □

Remark 10.29. We can use linear programming and the definition
of the core to develop a test for the emptiness of the core. The fol-
lowing theorem follows immediately from the definition of the core.
Theorem 10.30. Let (P, v) be a coalition game. Consider the linear
programming problem

    min  x1 + · · · + xN ,
    s.t. Σ_{i∈S} xi ≥ v(S) for all S ⊆ P.                        (10.2)

If there is no solution x∗ so that Σ_{i=1}^{N} x∗i = v(P), then C(v) = ∅.
Remark 10.31. We minimize the sum in the objective since we
want to find small enough values for x1 , . . . , xN so that we can form
an imputation.
Corollary 10.32. The core of a coalition game, (P, v), may be
empty.
Remark 10.33. We can now use linear programming duality to
prove the Bondareva–Shapley Theorem. We state a lemma to help
compute a dual linear programming problem. The lemma can be
proved by showing that the two problems share the Karush–Kuhn–
Tucker (KKT) conditions, with primal feasibility and dual feasibility
swapped.
Lemma 10.34. Suppose A ∈ Rm×n , b ∈ Rm , c ∈ Rn , x ∈ Rn is a set of
variables, and y ∈ Rm is a set of dual variables (Lagrange multipliers).
The linear programming problem

    min  cT x,
    s.t. Ax ≥ b,
         x unrestricted

has the dual problem

    max  bT y,
    s.t. AT y = c,
         y ≥ 0.

Theorem 10.35 (Bondareva–Shapley Theorem). Let (P, v) be


a coalition game with |P| = N . The core C(v) is non-empty if and
only if there exists y1 , . . . , y_{2^N} , where each yi corresponds to a set
Si ⊆ P so that

    v(P) = Σ_{i=1}^{2^N} yi v(Si ),
    Σ_{Si ⊇ {j}} yi = 1   ∀j ∈ P,
    yi ≥ 0   ∀Si ⊆ P.

Proof. Apply Lemma 10.34 to compute the dual linear program-


ming problem for Eq. (10.2) as

    max  Σ_{i=1}^{2^N} yi v(Si ),
    s.t. Σ_{Si ⊇ {j}} yi = 1   ∀j ∈ P,                            (10.3)
         yi ≥ 0   ∀Si ⊆ P.

To see this, we note that there are 2^N constraints in Eq. (10.2) and
N variables. Thus, there will be N constraints in the dual prob-
lem, but 2^N variables (Lagrange multipliers), and the resulting dual
problem can be written as Eq. (10.3). By Theorem 7.29 (the strong
duality theorem), Eq. (10.3) has a solution if and only if Eq. (10.2)
does; moreover, the objective functions at optimality coincide. This
completes the proof. 

Corollary 10.36. A non-empty core is not necessarily a singleton.

Remark 10.37. The core can be considered the possible “equilib-


rium” imputations that smart players will agree to and that cause
the grand coalition to hold together; i.e., no players or group of play-
ers have any motivation to leave the grand coalition. Unfortunately,
the fact that the core may be empty is not helpful.

10.5 Shapley Values

Definition 10.38 (Shapley Values). Let (P, v) be a coalition game


with N players. Then, the Shapley value for Player i is

    xi = φi (v) = Σ_{S ⊆ P\{i}} [ |S|!(N − |S| − 1)! / N! ] (v(S ∪ {i}) − v(S)).   (10.4)

Remark 10.39. The Shapley value is the average extra value


Player i contributes to each possible coalition that might form. Imag-
ine forming the grand coalition one player at a time. There are N !
ways to do this. Hence, in an average, N ! is in the denominator of
the Shapley value.
Now, if we have formed a coalition S (on our way to forming P),
then there are |S|! ways we could have done this. Each of these ways
yields v(S) in value because the characteristic function does not value
how a coalition is formed, only the members of the coalition.
Once we add Player i to the coalition S, the new value is
v (S ∪ {i}), and the value Player i added was v (S ∪ {i}) − v(S). We
then add the other N − |S| − 1 players to achieve the grand coalition.
There are (N − |S| − 1)! ways of doing this.
Thus, the extra value Player i adds in each case is v (S ∪ {i}) −
v(S) multiplied by |S|!(N − |S| − 1)! for each of the possible ways
this exact scenario occurs. Summing over all possible subsets S and
dividing by N !, as noted, yields the average excess value Player i
brings to a coalition.

Remark 10.40. We state, but do not prove, the following theorem.


The proof rests on the linear properties of averages. That is, we note
that Eq. (10.4) is a linear expression in v(S) and v (S ∪ {i}).

Theorem 10.41. For any coalition game (P, v) with N players:

(1) φi (v) ≥ v({i}).
(2) Σ_{i∈P} φi (v) = v(P).
(3) From (1) and (2), we conclude that (φ1 (v), . . . , φN (v)) is an
imputation.
(4) If for all S ⊆ P, v (S ∪ {i}) = v (S ∪ {j}) with i, j ∉ S, then
φi (v) = φj (v).
(5) If v and w are two characteristic functions in coalition games
(P, v) and (P, w), then φi (v + w) = φi (v) + φi (w) for all i ∈ P.
(6) If v (S ∪ {i}) = v(S) for all S ⊆ P with i ∉ S, then φi (v) = 0
because Player i contributes nothing to the grand coalition.

Example 10.42 (Extended Example). Consider the coalition


game (P, v), with P = {1, 2, 3}, and v defined by
v({1, 2, 3}) = 6,
v({1, 2}) = 2,
v({1, 3}) = 6,
v({2, 3}) = 4,
v({1}) = v({2}) = v({3}) = 0.
We can see at once that this game is essential because
v(P) ≠ v({1}) + v({2}) + v({3}).
We can also write the linear programming problem to identify
the core or determine whether it is empty. There are seven sets to
consider: {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, and {1, 2, 3}. We can
ignore the empty set. This gives the linear programming problem
min x1 + x2 + x3 ,
s.t. x1 ≥ 0,
x2 ≥ 0,
x3 ≥ 0,
x1 + x2 ≥ 2,
x1 + x3 ≥ 6,

x2 + x3 ≥ 4,
x1 + x2 + x3 ≥ 6.

A solution to this linear programming problem is x1 = 2, x2 = 0,


and x3 = 4. Thus, the core is not empty.
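This linear programming problem can also be handed directly to a solver.
The following Mathematica sketch mirrors the code used earlier for Nash
bargaining; the particular optimal point returned may differ from the one
quoted above, but the optimal objective value is 6 = v(P), confirming that
the core is non-empty.

Minimize[
 {
  x1 + x2 + x3, (*Objective function*)
  x1 >= 0, x2 >= 0, x3 >= 0,
  x1 + x2 >= 2,
  x1 + x3 >= 6,
  x2 + x3 >= 4,
  x1 + x2 + x3 >= 6
 },
 {x1, x2, x3} (*Variables*)
]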
Lastly, we can find the Shapley values for this game. First, con-
sider the set P\{1} = {2, 3}. The non-empty subsets are: (i) S = {2},
with |S|! = 1 and (N − |S| − 1)! = 1; (ii) S = {3}, with |S|! = 1
and (N − |S| − 1)! = 1; and (iii) S = {2, 3}, with |S|! = 2 and
(N − |S| − 1)! = 1. (The empty set contributes nothing here because
v({i}) = v(∅) = 0.) Using the formula for Shapley values, we have

    x1 = (1/6)(2 − 0) + (1/6)(6 − 0) + (2/6)(6 − 4) = 2.

Now, consider the set P\{2} = {1, 3}. Using the same approach
as above, we have

    x2 = (1/6)(2 − 0) + (1/6)(4 − 0) + (2/6)(6 − 6) = 1.

Finally, consider the set P\{3} = {1, 2}. Then, we have

    x3 = (1/6)(6 − 0) + (1/6)(4 − 0) + (2/6)(6 − 2) = 3.
A quick check shows that x1 + x2 + x3 = 6, as required. You will
also notice, by plugging these values into the linear programming
constraints, that this is not an alternative optimal solution to the
linear programming problem. This shows that the Shapley values do
not have to produce an imputation in the core.
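The hand computation above can also be checked mechanically. The following
Mathematica sketch (our own construction; the helper names v and shapley
are not from the text) encodes the characteristic function as a lookup table
and evaluates Eq. (10.4) directly, reproducing the values 2, 1, and 3.

players = {1, 2, 3};
nn = Length[players];
(* Characteristic function stored as a lookup on sorted subsets. *)
v[s_List] := Lookup[
   <|{} -> 0, {1} -> 0, {2} -> 0, {3} -> 0, {1, 2} -> 2,
    {1, 3} -> 6, {2, 3} -> 4, {1, 2, 3} -> 6|>, Key[Sort[s]]];
(* Shapley value of player i, computed directly from Eq. (10.4). *)
shapley[i_] := Sum[
   (Length[s]!*(nn - Length[s] - 1)!/nn!)*(v[Append[s, i]] - v[s]),
   {s, Subsets[DeleteCases[players, i]]}];
shapley /@ players (* {2, 1, 3} *)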

10.6 Chapter Notes

Cooperative games and coalition games are of substantial interest


in economics. Luce and Raiffa [1] and Myerson [38] go into signifi-
cantly more detail on the subject. This chapter provides only a basic
introduction to an exceptionally broad field.
The stable set approach was originally developed by von Neu-
mann and Morgenstern [5]. The core was not considered by von
Neumann and Morgenstern because their text focused on zero-sum
games, whereas we have shown that such games are inessential and

consequently the core is always empty. Donald Gillies, a Canadian


mathematician and computer scientist, introduced the idea of the
core in 1959 [118]. Shapley values were introduced by Lloyd Shap-
ley in 1953 [119]. Shapley worked in several areas of game theory,
including stochastic games, cooperative games, and the stable mar-
riage problem, and, with Alvin Roth, won the 2012 Nobel Memorial
Prize in Economic Sciences for his contributions to the subject.
– ♠♣♥♦ –

10.7 Exercises

10.1 Prove Lemma 10.6. [Hint: Use Nash’s theorem.]


10.2 Prove Corollary 10.18.
10.3 Consider the coalition game (P, v), with P = {1, 2, 3}, and v
defined by
v({1, 2, 3}) = 10,
v({1, 2}) = 4,
v({1, 3}) = 8,
v({2, 3}) = 6,
v({1}) = v({2}) = v({3}) = 0.
(a) Is this game essential or inessential? Why?
(b) Write a linear programming problem that will determine whether
the core is empty. (You do not need to solve this linear program-
ming problem.)
(c) Find the Shapley value for each player.

10.4 Use a linear programming solver to find an imputation in the


core of the game in Exercise 10.3 or determine whether the core is
empty.
10.5 Explicitly construct the dual linear programming problem in
Theorem 10.35 and fill in the details of the proof using Lemma 10.34.

10.6 Prove Corollary 10.36. [Hint: Think about alternative optimal


solutions.]

10.7 Show that computing the core is an exponential problem, even


though solving a linear programming problem is known to be poly-
nomial in the size of the problem. [Hint: How many constraints are
there?]

10.8 Prove Theorem 10.41.


Appendix A

Introduction to Matrix Arithmetic

Appendix Goals: This appendix introduces the essentials of


matrix arithmetic needed for studying matrix games. Proofs are
omitted for brevity. The goals of this appendix are to introduce
matrices and vectors in Rn as well as special matrices, such as
the identity matrix. Matrix operations like addition, multiplica-
tion, and transpose are also discussed.

A.1 Matrices, Row and Column Vectors

Definition A.1 (Matrix). An m × n matrix is a rectangular array


of values (scalars), drawn from R. We write Rm×n to denote the set
of m × n matrices with entries drawn from R.
Example A.2. Here is an example of a 2 × 3 matrix drawn from
R2×3 :

    A = [ 3    1   7/2 ]
        [ 2   √2    5  ].

Remark A.3. We denote the element at position (i, j) of


matrix A as Aij . Thus, in the example above, A2,1 = 2.
Remark A.4. We restrict our attention to matrices with real entries.
In general, a matrix can take entries from any field (such as the
complex numbers). See Ref. [73] for details.


Definition A.5 (Matrix Addition). If A and B are both in Rm×n ,


then C = A + B is the matrix sum of A and B in Rm×n , and

Cij = Aij + Bij for i = 1, . . . , m and j = 1, . . . , n. (A.1)

Here, + is the standard operation for addition.


Example A.6.

    [ 1  2 ]   [ 5  6 ]   [ 1 + 5  2 + 6 ]   [ 6   8  ]
    [ 3  4 ] + [ 7  8 ] = [ 3 + 7  4 + 8 ] = [ 10  12 ].         (A.2)

Definition A.7 (Scalar-Matrix Multiplication). If A is a matrix


from Rm×n and c ∈ R, then B = cA = Ac is the scalar-matrix
product of c and A in Rm×n , and

Bij = cAij for i = 1, . . . , m and j = 1, . . . , n. (A.3)

Example A.8. Let

    B = [ 3  7 ]
        [ 6  3 ].

When we multiply by the scalar 3 ∈ R, we obtain

        [ 3  7 ]   [ 9   21 ]
    3 · [ 6  3 ] = [ 18   9 ].

Definition A.9 (Row/Column Vector). A 1 × n matrix is called


a row vector, and an m × 1 matrix is called a column vector. Unless
specified otherwise, every vector is considered a column vector.
A column vector x in Rn×1 (or Rn ) is x = ⟨x1 , . . . , xn ⟩.
Remark A.10. It should be clear that any row of the matrix A could
be considered a row vector, and any column of A could be considered
a column vector. The ith row of A is denoted Ai· , while the jth
column is denoted A·j . Also, any row/column vector is nothing more
sophisticated than tuples of numbers (a point in space). You are free
to think of these things however you like.

A.2 Matrix Multiplication

Definition A.11 (Dot Product). If x, y ∈ Rn are two


n-dimensional vectors, then the dot product is

    x · y = Σ_{i=1}^{n} xi yi ,                                   (A.4)

where xi is the ith component of the vector x.


Remark A.12. The dot product is an example of a more general
concept called an inner product, which maps two vectors to a scalar.
Not all inner products behave according to Eq. (A.4), and the defi-
nition of the inner product will be context dependent.
Definition A.13 (Matrix Multiplication). If A ∈ Rm×n and
B ∈ Rn×p , then C = AB is the matrix product of A and B, and
Cij = Ai· · B·j . (A.5)
Note that Ai· ∈ R1×n (an n-dimensional vector) and B·j ∈
Rn×1 (another n-dimensional vector), thus making the dot product
meaningful. Note also that C ∈ Rm×p .
Example A.14.

    [ 1  2 ] [ 5  6 ]   [ 1(5) + 2(7)  1(6) + 2(8) ]   [ 19  22 ]
    [ 3  4 ] [ 7  8 ] = [ 3(5) + 4(7)  3(6) + 4(8) ] = [ 43  50 ].   (A.6)
Remark A.15. Note that we cannot multiply any pair of arbitrary
matrices. If we have the product AB of two matrices A and B, then
the number of columns in A must be equal to the number of rows
in B.
Definition A.16 (Matrix Transpose). If A ∈ Rm×n is an m × n
matrix, then the transpose of A, denoted AT , is an n × m matrix
defined as

    ATij = Aji .                                                  (A.7)
Example A.17.

    [ 1  2 ]T   [ 1  3 ]
    [ 3  4 ]  = [ 2  4 ].                                         (A.8)

Essentially, we are just reading down the columns of A to obtain its


transpose.
Remark A.18. The matrix transpose is a particularly useful opera-
tion and makes it easy to transform column vectors into row vectors,
which enables multiplication. For example, suppose x is an n × 1
column vector (i.e., x is a vector in Rn ), and suppose y is an n × 1
column vector. Then,
x · y = xT y. (A.9)
Proposition A.19. Suppose A, B ∈ Rn×n . Then,
(AB)T = BT AT .
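These operations are easy to experiment with in Mathematica, the system used
elsewhere in this book. The following sketch (our own illustration, not part of
the appendix) redoes the product from Example A.14 and checks Proposition A.19
numerically; the operator . denotes matrix multiplication and Transpose gives
the transpose.

A = {{1, 2}, {3, 4}};
B = {{5, 6}, {7, 8}};
A . B (* {{19, 22}, {43, 50}}, matching Example A.14 *)
Transpose[A . B] == Transpose[B] . Transpose[A] (* True *)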

A.3 Special Matrices and Vectors

Remark A.20. There are many special (and useful) square matri-
ces, as we discuss in the following.
Definition A.21 (Standard Basis Vector). The ith standard
basis vector in Rn is the vector ei = ⟨0, . . . , 1, 0, . . . , 0⟩, where the
number 1 occurs at position i.
Definition A.22 (Diagonal Matrix). A diagonal matrix is a
square matrix with the property that Dij = 0 for i ≠ j and where
Dii may take any value in R.
Example A.23. Consider the matrix

    D = [ 2  0  0 ]
        [ 0  4  0 ]
        [ 0  0  6 ].
This is a diagonal matrix.
Definition A.24 (Identity Matrix). The n × n diagonal matrix
In with ones along the diagonal is called the identity matrix. That is,

    In = [ 1  · · ·  0 ]
         [ ⋮    ⋱    ⋮ ]
         [ 0  · · ·  1 ].                                         (A.10)

Remark A.25. Note that the identity matrix consists of rows and
columns made up of standard basis vectors.
Proposition A.26. Let A ∈ Rn×n . Then,

AIn = In A.

Definition A.27 (Zero Vector). The n × 1 zero vector, denoted 0,


is an n × 1 vector consisting of all zeros. The dimension n is deter-
mined from context, when possible. If the dimension of the zero vec-
tor cannot be determined from context, we denote it 0n .
Definition A.28 (One Vector). The n × 1 one vector, denoted 1,
is an n × 1 vector consisting of all ones. The dimension n is deter-
mined from context, when possible. If the dimension of the one vec-
tor cannot be determined from context, we denote it 1n .
Definition A.29 (Symmetric Matrix). Let M ∈ Rn×n be a
matrix. The matrix M is symmetric if M = MT .
Example A.30. Suppose that

    A = [ 1  2  3 ]
        [ 2  4  1 ]
        [ 3  1  7 ].

This matrix is symmetric.

– ♠♣♥♦ –

A.4 Exercises

A.1 Show that if two matrices A, B ∈ Rm×n , then

A + B = B + A,

that is, matrix addition is commutative.

A.2 Show by counterexample that if A, B ∈ Rn×n , then it is


not necessarily the case that AB = BA, thus showing that

matrix multiplication is not commutative. [Hint: Almost any pair of


square matrices that are chosen at random will not commute when
multiplied.]

A.3 Prove Proposition A.26.

A.4 Prove Proposition A.19.


Appendix B

Essential Concepts from


Vector Calculus

Appendix Goals: This appendix introduces the essentials of


vector calculus needed for optimization and hence game theory.
Most proofs are omitted for brevity but can be found in any vec-
tor calculus text, including Ref. [72]. The goals of this appendix
are to introduce dot products, directional derivatives, gradients,
and level sets, along with their properties.

B.1 Geometry for Vector Calculus

Remark B.1. The dot product of two vectors is given in Defini-


tion A.11 in Appendix A. We now give an alternate characterization,
which is useful for optimization.
Definition B.2. Let x = ⟨x1 , . . . , xn ⟩ be a vector in Rn . The norm
(or length) of x is

    ‖x‖ = √(x · x) = √(x1² + x2² + · · · + xn²).

Theorem B.3 (Dot Product). Let θ be the angle between the


vectors x and y in Rn . Then, the dot product of x and y is

    x · y = ‖x‖ ‖y‖ cos(θ).                                      (B.1)


Remark B.4. This fact can be proved using the law of cosines from
trigonometry. As a result, we have the following short proposition
(which is proved as Theorem 1 in Ref. [72]).
Proposition B.5. Let x, y ∈ Rn . Then, the following hold:
(1) The angle between x and y is less than π/2 (i.e., acute) if and
only if x · y > 0.
(2) The angle between x and y is exactly π/2 (i.e., the vectors are
orthogonal) if and only if x · y = 0.
(3) The angle between x and y is greater than π/2 (i.e., obtuse) if
and only if x · y < 0.
Definition B.6 (Graph). Let z: D ⊆ Rn → R be a function. Then,
the graph of z is the set of n + 1 tuples,
G(z) = {(x, z(x)) ∈ Rn+1 |x ∈ D}. (B.2)
Remark B.7. When z: D ⊆ R → R, the graph is precisely what
you’d expect. It is the set of pairs, (x, y) ∈ R2 , so that y = z(x).
This is the “graph” from a first course in elementary algebra.
Remark B.8. It is inconvenient that the set G(z) is called a graph,
as this term is also used to refer to the completely distinct discrete
structure (which is sometimes also called a network) studied in the
subject of graph theory [39]. Fortunately, these two topics are rarely
covered in the same class.
Definition B.9 (Level Set). Let z: Rn → R be a function, and let
c ∈ R. Then, the level set with value c for function z is the set
Lc (z) = {x ∈ Rn |z(x) = c} ⊆ Rn . (B.3)
Example B.10. Consider the function z(x, y) = x2 + y 2 . The level
set of z at c = 4 is the set of points (x, y) ∈ R2 such that
x2 + y 2 = 4. (B.4)
This is the equation for a circle with a radius of 2. We illustrate
this in two ways in Fig. B.1. Figure B.1 (left) shows the level sets of
z as they sit on the 3D plot of the function, while Fig. B.1 (right)
shows the level sets of z in R2 . The plot in Fig. B.1 (right) is called
a contour plot. It is important to remember that if z: Rn → R, then
its level sets exist in Rn , while the graph exists in Rn+1 .


Fig. B.1. (Left) Plot with level sets projected on the graph of z. The level
sets exist in R2 , while the graph of z exists in R3 . (Right) Contour Plot of z =
x2 + y 2 . The circles in R2 are the level sets of the function.

Definition B.11 (Line). Let x0 , v ∈ Rn . The line defined by vectors


x0 and v is the parametric function l(t) = x0 + tv. The vector v is
called the direction of the line, and l: R → Rn .
Example B.12. Let x0 = (2, 1) and v = (2, 2). Then, the line
defined by x0 and v is shown in Fig. B.2. The set of points on this
line is the set
C = {(x, y) ∈ R2 : x = 2 + 2t, y = 1 + 2t, t ∈ R}.

Definition B.13 (Directional Derivative). Let z: Rn → R, and


let v ∈ Rn be a vector (direction). Then, the directional derivative
of z at a point x0 ∈ Rn in the direction of v is

    (d/dt) z(x0 + tv) |_{t=0} ,                                  (B.5)
when this derivative exists.
Proposition B.14. The directional derivative of z at x0 in the direc-
tion v is equal to

    lim_{h→0} [z(x0 + hv) − z(x0 )] / h.                         (B.6)

Fig. B.2. A line function: The points in the graph shown in this figure are in
the set produced using the expression x0 + vt, where x0 = (2, 1) and assuming
v = (2, 2).

B.2 Gradients

Definition B.15 (Gradient). Let z: Rn → R be a function, and


let x0 ∈ Rn . Then, the gradient of z at x0 is the vector in Rn given
by

    ∇z(x0 ) = [ ∂z/∂x1 ]
              [    ⋮    ]
              [ ∂z/∂xn ]   (evaluated at x = x0 ).               (B.7)

Theorem B.16. If z: Rn → R is a differentiable function, then all


directional derivatives exist. Furthermore, the directional derivative
of z at x0 in the direction of v is given by

∇z(x0 ) · v. (B.8)

Remark B.17. We now come to the two most important results


about gradients: (i) they always point in the direction of steepest
increase with respect to the level curves of a function; (ii) they are
perpendicular (normal) to the level curves of the function used to

compute them. These facts are exploited when we seek to maximize


(or minimize) functions.
Theorem B.18. Let z: Rn → R be a differentiable function, and let
x0 ∈ Rn . If ∇z(x0 ) = 0, then ∇z(x0 ) points in the direction in which
z is increasing fastest.
Proof. Recall that ∇z(x0 ) · n is the directional derivative of z in
the direction n at x0 . Assume that n is a unit vector (i.e., n = 1).
We know that

∇z(x0 ) · n = ||∇z(x0 )|| cos θ, (B.9)

where θ is the angle between the vectors ∇z(x0 ) and n. The function
cos θ is largest when θ = 0, that is, when n and ∇z(x0 ) are parallel
vectors. (If ∇z(x0 ) = 0, then the directional derivative is zero in all
directions.) 
Theorem B.19. Let z: Rn → R be differentiable, and let x0 lie in
the level set Lc (z) defined by z(x) = c for fixed c ∈ R. Then, ∇z(x0 )
is normal to the set Lc (z) in the sense that if v is a tangent vector
at t = 0 of a path r(t) contained entirely in Lc (z) with r(0) = x0 ,
then ∇z(x0 ) · v = 0.
Proof. As stated, let r(t) be a curve in Lc (z). Then, r: R → Rn
and z[r(t)] = c for all t ∈ R. Let v be the tangent vector to r at
t = 0; that is,

    dr(t)/dt |_{t=0} = v.                                        (B.10)

Differentiating z[r(t)] with respect to t using the chain rule and eval-
uating at t = 0 yields

    (d/dt) z[r(t)] |_{t=0} = ∇z[r(0)] · v = ∇z(x0 ) · v = 0 = dc/dt.   (B.11)

Thus, ∇z(x0 ) is perpendicular to v and therefore normal to the set


Lc (z), as required. 
Example B.20. We illustrate this theorem in Fig. B.3. The function
is z(x, y) = x⁴ + y² + 2xy and x0 = (1, 1). At this point, ∇z(x0 ) =
⟨6, 4⟩. In the figure, we have scaled the vector to improve legibility.


Fig. B.3. A level curve plot with gradient vector. We’ve scaled the gradient
vector to make the picture reasonably compact. Note that the gradient is perpen-
dicular to the level set curve at the point (1, 1), where the gradient was evaluated.
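The gradient in this example is easy to verify with Mathematica (a sketch of
ours, not part of the original example). Implicit differentiation of z(x, y(x)) = 4
gives dy/dx = −(4x³ + 2y)/(2y + 2x), which equals −3/2 at (1, 1), so ⟨2, −3⟩ is a
tangent direction to the level curve there.

z[x_, y_] := x^4 + y^2 + 2 x y;
grad = Grad[z[x, y], {x, y}] /. {x -> 1, y -> 1} (* {6, 4} *)
(* A tangent direction to the level curve z == 4 at (1, 1), from the slope -3/2. *)
grad . {2, -3} (* 0, so the gradient is normal to the level curve *)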

– ♠♣♥♦ –

B.3 Exercises

B.1 Use the value of the cosine function and the fact that x · y =
||x||||y|| cos θ to prove the lemma. [Hint: For what values of θ is
cos θ > 0?]

B.2 Prove Proposition B.14. [Hint: Use the definition of derivative


for a univariate function, apply it to the definition of directional
derivative, and evaluate t = 0.]

B.3 In this exercise, you will use elementary calculus (and a little
bit of vector algebra) to show that the gradient of a simple function
is perpendicular to its level sets:

(a) Plot the level sets of z(x, y) = x2 + y 2 . Draw the gradient at the
point (x, y) = (2, 0). Convince yourself that it is normal to the
level set x2 + y 2 = 4.
(b) Now, choose any level set x2 + y 2 = k. Use implicit differenti-
ation to find dy/dx. This is the slope of a tangent line to the
circle x2 + y 2 = k. Let (x0 , y0 ) be a point on this circle.
(c) Find an expression for a vector parallel to the tangent line at
(x0 , y0 ). [Hint: You can use the slope you just found.]
(d) Compute the gradient of z at (x0 , y0 ), and use it and the vec-
tor expression you just computed to show that two vectors are
perpendicular. [Hint: Use the dot product.]
Appendix C

Introduction to Evolutionary Games


Using the Replicator Equation

Appendix Goals: This appendix contains a brief introduction


to evolutionary game theory using replicator dynamics. It is
designed to be a self-contained introduction. The goals of the
appendix are to introduce differential equations and systems,
define fixed points of a system of ordinary differential equa-
tions, and illustrate the phase portrait of a system. We then
discuss stability concepts, derive the replicator, and show some
examples.

C.1 Differential Equations

Definition C.1. An ordinary differential equation (ODE) is an


equation involving an unknown function of one variable, such as y(x)
or x(t), and any number of its derivatives. We generally assume that
the dependent variable is restricted to a known interval I, which
could be R.
Remark C.2 (Notation). Let y(x) be a function of the independent
variable x. Then, we can write the first derivative of y as

    dy/dx = y′(x).                                               (C.1)


The nth derivative can be written as

    dⁿy/dxⁿ = y⁽ⁿ⁾(x),                                           (C.2)

with the second derivative usually written y′′(x) and the third deriva-
tive written y′′′(x). The exception to this notation is when the inde-
pendent variable is time (or time-like), in which case we may have a
function x(t). Then, the first derivative can (but doesn’t have to be)
written as

    dx/dt = ẋ,                                                   (C.3)
    d²x/dt² = ẍ,                                                  (C.4)
        ⋮                                                         (C.5)
    dⁿx/dtⁿ = x⁽ⁿ⁾(t).                                           (C.6)
The “dot notation” is a holdover from Newton’s fluxion notation,
while the rest of the notation is due to Leibniz.
Remark C.3 (Notation Remark). For the remainder of this
appendix, we will use the dot notation, as we will be taking deriva-
tives with respect to time.
Remark C.4. In general, we can write an ODE as
F (x, ẋ, . . . , x(n) , t) = 0, (C.7)
where F represents a function acting on the unknown function x(t),
its derivatives, and its independent variable t.
Definition C.5 (Order). Consider the ODE F (x, ẋ, . . . , x(n) , t) =
g(t). The order of the ODE is n, the degree of the highest derivative
appearing in the equation.
Example C.6 (Exponential Growth/Decay). The following is
a simple first-order ODE:
ẋ − αx = 0, (C.8)
where α ∈ R is a constant. This differential equation models expo-
nential growth (or decay). We study it in much greater detail shortly.

Definition C.7 (Initial Value Problem). An initial value problem


(IVP) is a differential equation F (x, ẋ, . . . , x(n) , t) = 0 along with the
conditions
x(t0 ) = x0 ,
ẋ(t0 ) = x1 ,
..
.
x(n−1) (t0 ) = xn−1 .
Example C.8 (Exponential Growth/Decay). Consider the IVP
ẋ − αx = 0,
x(0) = x0 > 0.
We can write the ODE as

    dx/dt = αx.

Multiplying by dt, dividing by x, and integrating yields

    ∫ dx/x = ∫ α dt.

A little computation shows that

    log(x) = αt + C,

where C is a constant of integration. You can put it on either side;
however, by convention, it usually goes on the side with the indepen-
dent variable. Since x(0) = x0 , it’s easy to see that when t = 0, we
have

    log(x0 ) = C.

Thus, log(x) − αt − log(x0 ) = 0 implicitly solves the ODE. We can
do better. Note that

    log(x) − log(x0 ) = log(x/x0 ) = αt.

Taking the exponential yields

    x/x0 = e^{αt},

or
x(t) = x0 eαt .
When α > 0, the system is said to experience exponential growth.
When α < 0, the system is said to experience exponential decay.
This model is frequently used to produce simple models of biological
populations.
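This initial value problem can also be solved symbolically. A minimal
Mathematica sketch (ours, with α written as a):

DSolve[{x'[t] == a*x[t], x[0] == x0}, x[t], t]
(* {{x[t] -> E^(a t) x0}}, i.e., x(t) = x0 e^(alpha t) *)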
Remark C.9. Note that, if x0 > 0 (as we assumed), then x0 eαt > 0
for all t. If x0 < 0, we need to use the (more correct) formula

    ∫ dx/x = log |x|.
The result is the same. If x0 ∈ R, we always obtain the solution
x(t) = x0 eαt for the equation ẋ = αx with x(0) = x0 .
Definition C.10. A system of ODEs is a set of equations involving
a set of unknown functions y1 , . . . , yn and their derivatives, with each
being a function of one independent variable.
Remark C.11. In this appendix, we focus on autonomous systems
of first-order differential equations with the form

    ẋ1 = f1 (x1 , . . . , xn ),
     ⋮                                                            (C.9)
    ẋn = fn (x1 , . . . , xn ).

Here, x1 (t), . . . , xn (t) are n unknown functions with the independent


variable t (time). We assume that for i = 1, . . . , n, fi (x1 , . . . , xn ) has
derivatives of all orders. The system is autonomous because time
does not appear explicitly on the right-hand sides of the equations.
We may be given initial conditions for such a system in the form
x1 (0) = a1 , . . . , xn (0) = an ,
where a1 , . . . , an are constants.
Example C.12. Consider the system of differential equations
ẋ = −y,
(C.10)
ẏ = x,


Fig. C.1. A solution curve for a specific initial condition (x0 , y0 ) for Eq. (C.10).

with the initial condition x(0) = x0 and y(0) = y0 . This system has
the solution
x(t) = x0 cos(t) − y0 sin(t),
y(t) = x0 sin(t) + y0 cos(t),
which one can check by taking derivatives. For a given initial con-
dition, (x0 , y0 ), we can plot the parametric curve r(t) = x(t), y(t),
called a solution curve. The result is shown in Fig. C.1.

C.2 Fixed Points, Stability, and Phase Portraits

Remark C.13. The system in Eq. (C.9) can be written compactly


as the vector equation
ẋ = F(x), (C.11)
where F : Rn → Rn is a vector-valued function of a vector of inputs.
That is,

    F(x) = [ f1 (x1 , . . . , xn ) ]
           [          ⋮           ]
           [ fn (x1 , . . . , xn ) ].

Definition C.14 (Fixed Points). Consider the system of differen-


tial equations given by Eq. (C.11). A vector x∗ ∈ Rn is a fixed or
equilibrium point of Eq. (C.11) if F(x∗ ) = 0.
Example C.15 (Lotka–Volterra Equations). Suppose x and y
are the quantities of a prey species and a predator species, respec-
tively. The prey species grows exponentially in the absence of any
predators. The predators decay exponentially in the absence of any
prey. When prey and predator are together, the prey are consumed by
the predators and are removed at a rate proportional to the product
of the numbers of predators and prey. Similarly, predators convert
their consumed prey into new predators, which are added at a rate
proportional to the product of the numbers of predator and prey.
The resulting system of differential equations describing this
model is
ẋ = αx − βxy,
ẏ = γxy − δy.
We can solve for the fixed points of this system by solving the system
of nonlinear equations
αx − βxy = 0,
γxy − δy = 0.
Doing this yields two fixed points: x∗ = y∗ = 0 and

    x∗ = δ/γ,
    y∗ = α/β.
This system of equations is called the Lotka–Volterra system.
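The fixed-point computation is a small exercise in solving nonlinear equations
and can be checked in Mathematica. In the sketch below (ours), the Greek
parameters α, β, γ, δ are written a, b, g, d:

Solve[{a*x - b*x*y == 0, g*x*y - d*y == 0}, {x, y}]
(* Generically: {{x -> 0, y -> 0}, {x -> d/g, y -> a/b}} *)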
Definition C.16 (Stable Fixed Point). Let
ẋ = F(x)
be an autonomous system of ODEs with the fixed point x∗ . Let x(t)
be a solution curve with initial time t0 and initial position x(t0 ) = x0 .
Then, x∗ is stable if for all  > 0, there exists a δ > 0 so that if
x0 − x∗  < δ, then for all t > t0 , x(t) − x∗  < .

Remark C.17. The stability of a fixed point sounds complex, but


it’s relatively simple conceptually. If a solution starts near a fixed
point and stays near that fixed point over time, then the fixed point
is stable. There is a stronger form of stability, called asymptotic
stability, in which the trajectory approaches the fixed point in the
long run.
Definition C.18. A fixed point, x∗ , is asymptotically stable if it is
both stable and there is a δ0 > 0 so that if x0 − x∗  < δ0 , then

lim x(t) = x∗ .
t→∞

Remark C.19. Fixed points can be further categorized. This is dis-


cussed at length in texts on nonlinear dynamics. Strogatz provides a
thorough introduction [120]. It is possible to determine the stability
of fixed points in many (but not all) settings using some basic calcu-
lus. For two-dimensional systems, a picture called a phase portrait
can generally show whether a fixed point is stable, asymptotically
stable, or neither (usually called unstable).
Definition C.20 (Phase Portrait). Consider Eq. (C.9), and sup-
pose we have a general solution, x1 (t; a1 ), . . . , xn (t; an ), parameter-
ized by the initial values a1 , . . . , an . (That is, suppose we have a
specific orbit.) Then, a phase portrait is a geometric representation
of the behavior of the differential equation system obtained by cre-
ating multiple parametric curves (x1 (t; a1 ), . . . , xn (t; an )) for various
starting values a1 , . . . , an , i.e., we are visualizing multiple solutions
of the system.
Example C.21. Take a specific instance of the Lotka–Volterra
dynamics:

ẋ = x − xy,
ẏ = xy − y.

This system has two fixed points: x∗ = y ∗ = 0 and x∗ = y ∗ = 1.


A phase portrait can help determine the stability of the fixed points.
This is shown in Fig. C.2. The figure shows that the fixed point
x∗ = y ∗ = 1 is stable, but not asymptotically stable. This is because
the solutions orbit around the fixed point, never getting too far away,
just like in Fig. C.1. The fixed point x∗ = y ∗ = 0 is not stable since


Fig. C.2. A phase portrait for specific Lotka–Volterra dynamics shows that one
fixed point is stable, but not asymptotically stable. The other fixed point is not
stable.

the solution curves can extend arbitrarily far away from the fixed
point, depending on the initial condition.

C.3 The Replicator Equation

Remark C.22. We now derive a specific class of differential equa-


tions, called the replicator equations, using a game matrix and a
biological interpretation of game play.

Derivation C.23. Suppose there are n species of organisms that


interact in a pairwise fashion by playing a symmetric1 game (such
as prisoner’s dilemma or rock-paper-scissors). Let A ∈ Rn×n be the
payoff matrix for this game so that when a member of species i
interacts with species j, it receives payoff, eTi Aej .

1
Although this assumption is not needed, it makes the derivation sensible.

Suppose there are Xi ≥ 0 members of the species i in the popu-


lation, with i ∈ {1, . . . , n}. For simplicity, we allow fractional popu-
lation counts since we are building a notional model. Let

    M = Σ_{i=1}^{n} Xi ,                                          (C.12)

be the total population. Then,

    xi = Xi / M
is the proportion of the population composed of the species i. The
vector x = x1 , . . . , xn  is the proportion vector. Since we associate
species with strategies in the game, x acts like a mixed strategy.
Now, suppose that the fitness of the species i is its expected payoff
in the game. That is,
fi (x) = eTi Ax.
In a game-theoretic sense, this is the expected payoff to a player
playing a pure strategy, i, against a mixed strategy, x. Now, suppose
that the population of each species grows (or decays) exponentially
with parameter fi . That is,

    Ẋi = fi Xi = Xi (eTi Ax).                                     (C.13)
This is a reasonable assumption in the sense that populations with
positive fitness will grow, while those with negative fitness will shrink.
Raw population counts are unwieldy. We seek to compute ẋi , the
rate of change of the proportion of species i in the whole population.
That is, we want

    ẋi = (d/dt)(Xi / M).

Both Xi and M are time-varying. Apply the quotient rule to find
that

    ẋi = (Ẋi M − Xi Ṁ) / M² = Ẋi / M − (Xi / M)(Ṁ / M).

Using the definition of Ẋi in Eq. (C.13), we have

    Ẋi / M = (Xi / M)(eTi Ax) = xi (eTi Ax).

Now, use Eqs. (C.12) and (C.13) together to obtain

    Xi Ṁ / (M M) = xi (1/M) Σ_{j=1}^{n} Ẋj = xi Σ_{j=1}^{n} (Xj / M)(eTj Ax)
                  = xi Σ_{j=1}^{n} xj (eTj Ax) = xi (xT Ax).

Combining these together yields the expression

    ẋi = xi (eTi Ax) − xi (xT Ax) = xi (eTi Ax − xT Ax).           (C.14)

This system of differential equations (for i ∈ {1, . . . , n}) is called


replicator dynamics or the replicator equation.
Remark C.24. In general, if fi (x) is a fitness function for a popu-
lation in terms of population proportions and

    f̄(x) = Σ_{i=1}^{n} xi fi (x)

is the mean fitness, then the replicator can be written as

    ẋi = xi (fi (x) − f̄(x)).

In this way, the replicator can be generalized from fitness defined in


terms of symmetric two-player games.
Remark C.25. A little work shows that as long as the initial con-
ditions of the replicator dynamics are in Δn, i.e., x(0) = x0 ∈ Δn,
then the dynamics remain within Δn. This requires showing that

    Σ_{i=1}^{n} ẋ_i = 0.

Therefore, as time evolves, if x0 ∈ Δn, then

    Σ_{i=1}^{n} x_i(t) = 1,

for all time.
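
The claim in Remark C.25 is easy to check numerically (Exercise C.3 asks
for a proof). Below is a small sketch evaluated at an arbitrarily chosen
point of the simplex with a randomly generated payoff matrix; both choices
are illustrative assumptions.

# Numerical sanity check for Remark C.25: the components of the
# replicator vector field sum to zero on the simplex, so the total
# proportion is conserved.  The random matrix and point are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.normal(size=(n, n))       # arbitrary payoff matrix
x = rng.random(n)
x /= x.sum()                      # a point of the unit simplex

xdot = x * (A @ x - x @ A @ x)    # Eq. (C.14)
print(abs(xdot.sum()))            # zero up to floating-point round-off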



Remark C.26. We call the branch of game theory where we have
populations dynamically changing (evolving) as a result of game play
evolutionary game theory. There are several additional results in this
area. However, there is a nice result relating Nash equilibria in sym-
metric games with fixed points in the replicator dynamics. We state
but do not prove the folk theorem of evolutionary games, found in
Ref. [121]. But first, we require a definition.
Definition C.27 (Interior of Δn). A point x = ⟨x_1, . . . , x_n⟩ ∈ Δn
is in the interior of Δn, written x ∈ int(Δn), if x_i > 0 for all i ∈
{1, . . . , n}.
Theorem C.28 (Folk Theorem of Evolutionary Games). Let
A be a game matrix for the row player in a two-player symmetric
game. Then, the following hold:
(1) If x∗ is a Nash equilibrium for the game, then x∗ is a fixed point
of the replicator dynamics defined by A.
(2) If x∗ is a strict Nash equilibrium, then x∗ is an asymptotically
stable fixed point of the replicator dynamics defined by A.
(3) If x∗ is a fixed point of the replicator dynamics in the interior of
Δn and it is the limit as t → ∞ of a solution curve lying entirely
in the interior of Δn , then x∗ is a Nash equilibrium.
(4) If x∗ is a stable equilibrium point of the replicator dynamics, then
it is a Nash equilibrium.

Example C.29 (Prisoner’s Dilemma). Consider the game matrix
we have used for prisoner’s dilemma:

        ⎡ −1  −10 ⎤
    A = ⎣  0   −5 ⎦ ,

and let x = ⟨x_1, x_2⟩. The replicator dynamics are the system of
equations

    ẋ_1 = x_1 [(−x_1 − 10x_2) − ((−10x_1 − 5x_2)x_2 − x_1²)],
    ẋ_2 = x_2 [(−5x_2) − ((−10x_1 − 5x_2)x_2 − x_1²)],

where in each equation the first parenthesized term is e_i^T A x and
the second is x^T A x.

If you carefully add these two expressions together and substitute in
the fact that x2 = 1 − x1, you will see that they do sum to zero.
One can also verify that the Nash equilibrium e2 is a fixed point
of the dynamics. We note that not every fixed point needs to be a
Nash equilibrium. For example, x∗ = e1 (the “cooperate” strategy)
is also a fixed point; however, it is not a Nash equilibrium. This is
due to the fact that if there are no players playing “defect,” then
that population will never grow.
From Theorem C.28, we expect x∗ = e2 to be an asymptotically
stable fixed point. We can illustrate this by plotting some solution
curves for x2 (t). This is shown in Fig. C.3. From the figure, we see
that unless x0 = e1 , the solutions all approach x2 = 1 (and hence
x1 = 0) as t → ∞. That is, e2 is an asymptotically stable fixed point.
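
A figure like Fig. C.3 can be reproduced by numerically integrating the
two equations above. The following Python sketch is illustrative only; the
initial conditions and time grid are assumptions.

# Minimal sketch: integrate the prisoner's dilemma replicator dynamics
# and watch the defect proportion x2 approach 1 (cf. Fig. C.3).
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[-1.0, -10.0],
              [0.0, -5.0]])

def rhs(t, x):
    return x * (A @ x - x @ A @ x)    # Eq. (C.14)

for x2_start in (0.01, 0.1, 0.5):     # assumed initial defect proportions
    sol = solve_ivp(rhs, (0.0, 5.0), [1.0 - x2_start, x2_start],
                    t_eval=np.linspace(0.0, 5.0, 6))
    print(np.round(sol.y[1], 3))      # x2(t) increases toward 1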

Example C.30 (Rock-Paper-Scissors). The dynamics get more
interesting with three-strategy games. Consider the payoff matrix for
rock-paper-scissors:

        ⎡  0  −1   1 ⎤
    A = ⎢  1   0  −1 ⎥ .
        ⎣ −1   1   0 ⎦


Fig. C.3. From any starting point, except x1 = 1 and x2 = 0, the solution to
the replicator dynamics using the prisoner’s dilemma matrix converges to x2 = 1.

This matrix is interesting because

    x^T A x = 0.                                          (C.15)

Here, x = ⟨x_1, x_2, x_3⟩, with x_1, x_2, and x_3 being the proportions of
the population playing rock, paper, and scissors, respectively. The
resulting replicator dynamics become

    ẋ_1 = x_1(x_3 − x_2),
    ẋ_2 = x_2(x_1 − x_3),
    ẋ_3 = x_3(x_2 − x_1).

We know that x∗ = ⟨1/3, 1/3, 1/3⟩ is a fixed point since it is a Nash
equilibrium. We can construct a plot to determine its stability. Recall
that Δ3 is a triangle in R3 (see Fig. 5.4). We can construct the
trajectories of the dynamics and show them on this triangle, as they
would exist in R3 , but flattened onto the page. This is shown in
Fig. C.4. This figure shows that the Nash equilibrium is a stable but
not asymptotically stable fixed point. This figure is sensible from
a biological perspective. If there are many players playing rock but


Fig. C.4. Example solution curves for rock-paper-scissors showing the cyclic
nature of the game under the replicator dynamics.

only a few playing paper and scissors, then, ultimately, the paper
players will “consume” the rock players, making more paper players,
who will then be “consumed” by the scissors players, who are then
“consumed” by the rock players, leading to the cyclic behavior.
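
This cycling is easy to observe numerically. The sketch below (the initial
condition and time horizon are assumptions) integrates the three equations;
the proportions orbit the interior fixed point rather than converging to it.

# Minimal sketch: integrate the rock-paper-scissors replicator dynamics
# from a point away from (1/3, 1/3, 1/3) and print a few snapshots;
# the proportions cycle rather than converge (cf. Fig. C.4).
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, -1.0, 1.0],
              [1.0, 0.0, -1.0],
              [-1.0, 1.0, 0.0]])

def rhs(t, x):
    return x * (A @ x)        # x^T A x = 0 here, by Eq. (C.15)

sol = solve_ivp(rhs, (0.0, 30.0), [0.5, 0.3, 0.2],
                t_eval=np.linspace(0.0, 30.0, 7), rtol=1e-9)
print(np.round(sol.y.T, 3))   # each row is (rock, paper, scissors) at one time
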
Remark C.31. Intriguingly, we can modify the stability of the fixed
point in the rock-paper-scissors game by adding a winner bias so that
        ⎡   0    −1   1+a ⎤
    A = ⎢  1+a    0   −1  ⎥ .                             (C.16)
        ⎣  −1   1+a    0  ⎦
The resulting theorem is a special case of a more general theorem of
games like rock-paper-scissors that was proved by Zeeman [122], who
characterized the behavior of the replicator dynamics for all three-
strategy games. His result is summarized by Hofbauer and Sigmund
in Ref. [121].
Theorem C.32. Consider the replicator dynamics generated by the
payoff matrix in Eq. (C.16). If a > 0, then the Nash equilibrium
fixed point x∗ = ⟨1/3, 1/3, 1/3⟩ is asymptotically stable. If a < 0, then this
fixed point is unstable. If a = 0, then the fixed point is stable but not
asymptotically stable.
Example C.33. We illustrate the previous theorem for the cases
when a = 1/10 and a = −1/10 in Fig. C.5. As stated in the theorem,
when a > 0, the Nash equilibrium of the game is stable, and when
a < 0, the Nash equilibrium of the game is unstable.
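
A numerical illustration of this behavior (a sketch with an assumed
starting point and horizon, not a proof) is to integrate the biased dynamics
for a = ±1/10 and track the distance of the orbit from the interior fixed
point.

# Minimal sketch for Theorem C.32 / Example C.33: with winner bias a,
# track the orbit's distance from (1/3, 1/3, 1/3).  The distance shrinks
# when a = 0.1 and grows when a = -0.1.  The start point is an assumption.
import numpy as np
from scipy.integrate import solve_ivp

def biased_rps(a):
    return np.array([[0.0, -1.0, 1.0 + a],
                     [1.0 + a, 0.0, -1.0],
                     [-1.0, 1.0 + a, 0.0]])

x_star = np.ones(3) / 3.0
for a in (0.1, -0.1):
    A = biased_rps(a)
    sol = solve_ivp(lambda t, x: x * (A @ x - x @ A @ x),
                    (0.0, 60.0), [0.4, 0.35, 0.25], rtol=1e-9)
    dist = np.linalg.norm(sol.y - x_star[:, None], axis=0)
    print(f"a = {a:+.2f}: distance {dist[0]:.3f} -> {dist[-1]:.3f}")
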
Remark C.34 (Bimatrix Games). We have assumed that the
replicator dynamics are constructed from a symmetric game with
Player 1 matrix A, thereby using the fact that B = A^T. In the case
when we have a non-symmetric game, we can still define a system of
replicator equations, as discussed by Hofbauer in Ref. [123].
Suppose A, B ∈ R^{m×n}, and let x = ⟨x_1, . . . , x_m⟩ and y =
⟨y_1, . . . , y_n⟩. The bimatrix replicator dynamics are

    ẋ_i = x_i (e_i^T A y − x^T A y),    i ∈ {1, . . . , m},
    ẏ_j = y_j (x^T B e_j − x^T B y),    j ∈ {1, . . . , n}.
Interestingly, if you use these dynamics in the symmetric game case,
you sometimes get different behaviors than when using the ordinary

Fig. C.5. (Left) A sample trajectory from the biased rock-paper-scissors game,
Eq. (C.16), when a = 1/10. Here, the Nash equilibrium of the game is asymp-
totically stable. (Right) A sample trajectory from the biased rock-paper-scissors
game, Eq. (C.16), when a = −1/10. Note that the Nash equilibrium of the game is
unstable.

replicator dynamic with a single matrix (see Exercise C.2). This illus-
trates the importance of modeling choices when applying mathemat-
ics versus studying it in an abstract setting.
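
The bimatrix system is just as easy to simulate. Below is a minimal Python
sketch; the game chosen (matching pennies, where B = −A, so the game is
genuinely non-symmetric) and the starting mixes are illustrative assumptions.

# Minimal sketch of the bimatrix replicator dynamics:
#   xdot_i = x_i (e_i^T A y - x^T A y),  ydot_j = y_j (x^T B e_j - x^T B y).
# Matching pennies (B = -A) and the starting mixes are assumptions.
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])
B = -A

def rhs(t, z):
    x, y = z[:2], z[2:]
    xdot = x * (A @ y - x @ A @ y)
    ydot = y * (B.T @ x - x @ B @ y)   # (B^T x)_j equals x^T B e_j
    return np.concatenate([xdot, ydot])

sol = solve_ivp(rhs, (0.0, 40.0), [0.6, 0.4, 0.3, 0.7],
                t_eval=np.linspace(0.0, 40.0, 5), rtol=1e-9)
print(np.round(sol.y.T, 3))   # rows are (x1, x2, y1, y2); both mixes orbit (1/2, 1/2)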

C.4 Appendix Notes

Evolutionary game theory originated with the work of Price and
Smith [124]. Since this early work, the field has grown substantially.
This appendix offers a mere glimpse into a much broader area of
work that is summarized in books by Hofbauer and Sigmund [125],
Weibull [126], and Friedman and Sinervo [127]. It is worth not-
ing that most discussions on evolutionary games begin with the
concept of evolutionary stability and evolutionarily stable strate-
gies (ESS), which can be considered a refinement of the concept of
Nash equilibrium. This is similar to the way that the Folk theorem
(Theorem C.28) relates the stability of fixed points of the replicator
equation to the Nash equilibria of the corresponding game. Evolu-
tionary stability can be a subtle concept, which is why this brief
introduction focuses on replicator dynamics, which are a bit more
concrete and produce nice visualizations.

There are several variations of the replicator dynamics, some
with more exotic properties. Examples include the replicator–
mutator equation [128], which incorporates mutation into the
replicator dynamics, allowing species to spontaneously emerge.
Hofbauer and Sigmund [121] provide a more comprehensive list
of variations. Several authors have modified the replicator equa-
tion to include spatial dynamics, thus creating partial differ-
ential equations. See Vickers’ work for an example [129, 130].
Other authors have added higher-order interactions to model N -
player games, including Gokhale and Traulsen [131] and Grif-
fin and Wu [132]. Simple replicators can exhibit various forms
of chaotic behavior (see, e.g., Refs. [133, 134]), making them
interesting dynamical systems to study in isolation. Among other
applications, replicator equations feature prominently in theoretical
ecology, where they are used to study the qualitative properties of
ecosystems [135, 136].
– ♠♣♥♦ –

C.5 Exercises

C.1 Find the replicator dynamics for the Chicken game and com-
pute its fixed points. Try to determine their stability by using a
differential equation solver, like the one in Mathematica™.

C.2 Find the replicator dynamics for the Chicken game using the
bimatrix formulation and compute its fixed points. Try to determine
their stability by using a differential equation solver, like the one in
Mathematica™.

C.3 Prove the statement in Remark C.25.

C.4 Explain why Theorems C.32 and C.28 do not contradict one
another.
References

[1] R. D. Luce and H. Raiffa, Games and Decisions: Introduction and
Critical Survey. Dover Press (1989).
[2] P. Morris, Introduction to Game Theory. Springer (1994).
[3] N. Nisan, T. Roughgarden, E. Tardos and V. V. Vazirani (eds.),
Algorithmic Game Theory. Cambridge University Press (2007).
[4] J. González-Díaz, I. García-Jurado and M. G. Fiestras-Janeiro,
An Introductory Course on Mathematical Game Theory and Appli-
cations, Vol. 238. American Mathematical Society (2023).
[5] J. von Neumann and O. Morgenstern, The Theory of Games and
Economic Behavior, 60th edn. Princeton University Press (2004).
[6] J. Scarne, Scarne’s New Complete Guide to Gambling. Simon &
Schuster (1986).
[7] J. Archer, The Archer Method of Winning at 21. Henry Regnery Co.
(1973).
[8] U. Today, Remembering ‘let’s make a deal’ host monty hall,
https://www.usatoday.com/picture-gallery/life/tv/2017/09/30/
remembering-lets-make-a-deal-host-monty-hall/106172564/ (2017).
[9] C. B. Boyer and U. C. Merzbach, A History of Mathematics. John
Wiley & Sons (2011).
[10] A. E. Moyer, Liber de ludo aleae, Renaissance Quarterly 60, 4,
pp. 1419–1420 (2007).
[11] K. Devlin, The Unfinished Game: Pascal, Fermat, and the
Seventeenth-Century Letter that Made the World Modern. Basic
Books (2008).
[12] O. Ore, Pascal and the invention of probability theory, The Ameri-
can Mathematical Monthly 67, 5, pp. 409–419 (1960).
[13] L. J. Daston, Probabilistic expectation and rationality in classical
probability theory, Historia Mathematica 7, 3, pp. 234–260 (1980).


[14] T. Bayes, LII. An essay towards solving a problem in the doctrine
of chances. By the late Rev. Mr. Bayes, F. R. S. communicated by
Mr. Price, in a letter to John Canton, A. M. F. R. S, Philosophical
Transactions of the Royal Society of London, 53, pp. 370–418 (1763).
[15] C. F. Gauß, Theoria Motus Corporvm Coelestivm In Sectionibvs
Conicis Solem Ambientivm. Perthes et Besser (1809).
[16] N. L. Johnson, S. Kotz and N. Balakrishnan, Continuous Univariate
Distributions, Volume 1, Vol. 289. John Wiley & Sons (1995).
[17] H. S. Bear, A Primer of Lebesgue Integration. Academic Press
(2002).
[18] E. O. Thorp, Beat the Dealer: A Winning Strategy for the Game of
Twenty-one, Vol. 310. Vintage (1966).
[19] J. L. Kelly, A new interpretation of information rate, The Bell Sys-
tem Technical Journal 35, 4, pp. 917–926 (1956).
[20] B. Mezrich, 21: Bringing Down the House-Movie Tie-In: The Inside
Story of Six MIT Students Who Took Vegas for Millions. Simon &
Schuster (2008).
[21] S. Selvin, Monty Hall problem, American Statistician 29, 3,
pp. 134–134 (1975).
[22] J. P. Morgan, N. R. Chaganty, R. C. Dahiya and M. J. Doviak, Let’s
make a deal: The player’s dilemma, The American Statistician 45,
4, pp. 284–287 (1991).
[23] R. G. Seymann, [let’s make a deal: The player’s dilemma]: Comment,
The American Statistician 45, 4, pp. 287–288 (1991).
[24] E. Barbeau, Fallacies, flaws, and flimflam, College Mathematics
Journal 32, 2, pp. 149–154 (1993).
[25] S. Lucas, J. Rosenhouse and A. Schepler, The Monty Hall problem,
reconsidered, Mathematics Magazine 82, 5, pp. 332–342 (2009).
[26] A. P. Flitney and D. Abbott, Quantum version of the monty hall
problem, Physical Review A 65, 6, p. 062318 (2002).
[27] J. Rosenhouse, The Monty Hall Problem: The Remarkable Story
of Math’s most Contentious Brain Teaser. Oxford University Press
(2009).
[28] J. Tierney, Behind Monty Hall’s doors: Puzzle, debate and answer,
New York Times 21, p. 1 (1991).
[29] E. Aslanian, ‘the price is right’ hits 9,000 episodes — the game show
by the numbers, TV Insider (2019).
[30] G. Loomes, C. Starmer and R. Sugden, Observing violations of
transitivity by experimental methods, Econometrica: Journal of the
Econometric Society, 59, 2, pp. 425–439 (1991).
[31] C. Blair Jr, Passing of a great mind, Life 25, p. 96 (1957).
[32] F. Dyson, A meeting with Enrico Fermi, Nature 427, 6972,
pp. 297–297 (2004), https://doi.org/10.1038/427297a.

[33] J. Mayer, K. Khairy and J. Howard, Drawing an elephant with
four complex parameters, American Journal of Physics 78, 6,
pp. 648–649 (2010).
[34] D. Bernoulli, Specimen Theoriae Novae de Mensura Sortis: Trans-
lated into German and English, Letter to Pierre Raymond de
Montmort (1713).
[35] P. C. Fishburn, Utility theory, Management Science 14, 5,
pp. 335–378 (1968).
[36] E. Karni and D. Schmeidler, Utility theory with uncertainty, Hand-
book of Mathematical Economics 4, pp. 1763–1831 (1991).
[37] N. G. Mankiw, Principles of Microeconomic Theory, 9th edn.
Cenage (2021).
[38] R. B. Myerson, Game Theory: Analysis of Conflict. Harvard
University Press (2001).
[39] C. H. Griffin, Applied Graph Theory: An Introduction with Graph
Optimization and Algebraic Graph Theory. World Scientific (2023).
[40] S. J. Brams, Game Theory and Politics. Dover Press (2004).
[41] M. C. Fu, AlphaGo and Monte Carlo tree search: The simulation
optimization perspective, in 2016 Winter Simulation Conference
(WSC). IEEE, pp. 659–670 (2016).
[42] R. Bellman, Dynamic programming, Science 153, 3731, pp. 34–37
(1966).
[43] S. B. Kotsiantis, Decision trees: A recent overview, Artificial Intel-
ligence Review 39, pp. 261–283 (2013).
[44] S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach.
Pearson (2016).
[45] D. E. Knuth and R. W. Moore, An analysis of alpha-beta pruning,
Artificial Intelligence 6, 4, pp. 293–326 (1975).
[46] F.-H. Hsu, Behind Deep Blue: Building the Computer that Defeated
the World Chess Champion. Princeton University Press (2002).
[47] J. Schaeffer, N. Burch, Y. Bjornsson, A. Kishimoto, M. Muller,
R. Lake, P. Lu and S. Sutphen, Checkers is solved, Science 317,
5844, pp. 1518–1522 (2007).
[48] E. R. Berlekamp, J. H. Conway and R. K. Guy, Winning Ways for
Your Mathematical Plays, Vol. 1. A. K. Peters (2001a).
[49] E. R. Berlekamp, J. H. Conway and R. K. Guy, Winning Ways for
Your Mathematical Plays, Vol. 2. A. K. Peters (2001b).
[50] E. R. Berlekamp, J. H. Conway and R. K. Guy, Winning Ways for
Your Mathematical Plays, Vol. 3. A. K. Peters (2001c).
[51] E. R. Berlekamp, J. H. Conway and R. K. Guy, Winning Ways for
Your Mathematical Plays, Vol. 4. A. K. Peters (2001d).
[52] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van
Den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam,

M. Lanctot et al., Mastering the game of go with deep neural net-
works and tree search, Nature 529, 7587, pp. 484–489 (2016).
[53] R. W. Rosenthal, Games of perfect information, predatory pricing
and the chain-store paradox, Journal of Economic theory 25, 1,
pp. 92–100 (1981).
[54] C. F. Camerer, Behavioral Game Theory: Experiments in Strategic
Interaction. Princeton University Press (2011).
[55] O. G. Haywood Jr, Military decision and game theory, Journal of the
Operations Research Society of America 2, 4, pp. 365–385 (1954).
[56] I. Ravid, Military decision, game theory and intelligence: An anec-
dote, Operations Research 38, 2, pp. 260–264 (1990).
[57] R. Franck and F. Melese, A game theory view of military conflict in
the Taiwan strait, Defense & Security Analysis 19, 4, pp. 327–348
(2003).
[58] W. N. Caballero, B. J. Lunday and R. F. Deckro, Leveraging behav-
ioral game theory to inform military operations planning, Military
Operations Research 25, 1, pp. 5–22 (2020).
[59] B. Burke, Game theory and run/pass balance, http://www.
advancedfootballanalytics.com/2008/06/game-theory-and-runpass-
balance.html (2008).
[60] J. J. Sylvester, XLVII Additions to the articles in the September
number of this journal,“on a new class of theorems,” and on Pas-
cal’s Theorem, The London, Edinburgh, and Dublin Philosophical
Magazine and Journal of Science 37, 251, pp. 363–370 (1850).
[61] J. Munkres, Topology. Prentice Hall (2000).
[62] J. Nash, Non-cooperative games, Annals of Mathematics, 54, 2,
pp. 286–295 (1951).
[63] J. Nash, C1 isometric imbeddings, Annals of Mathematics, 60, 3,
pp. 383–396 (1954).
[64] J. Nash, The imbedding problem for Riemannian manifolds, Annals
of Mathematics 63, 1, pp. 20–63 (1956).
[65] J. Moser, A rapidly convergent iteration method and non-linear par-
tial differential equations-I, Annali della Scuola Normale Superiore
di Pisa-Scienze Fisiche e Matematiche 20, 2, pp. 265–315 (1966a).
[66] J. Moser, A rapidly convergent iteration method and non-linear par-
tial differential equations-II, Annali della Scuola Normale Superiore
di Pisa-Scienze Fisiche e Matematiche 20, 2, pp. 499–535 (1966b).
[67] W. Rudin et al., Principles of Mathematical Analysis, Vol. 3.
McGraw-Hill New York (1976).
[68] S. Nasar, A Beautiful Mind. Simon & Schuster (2011).
[69] D. Bertsekas, Nonlinear Programming, 4th edn. Athena Scientific
(2016).

[70] M. S. Bazaraa, H. D. Sherali and C. M. Shetty, Nonlinear Program-
ming: Theory and Algorithms. John Wiley & Sons (2006).
[71] SciPy, SciPy v1.12.0 manual (scipy.optimize.linprog), https://docs.
scipy.org/doc/scipy/reference/generated/scipy.optimize.linprog.html
(2024).
[72] J. E. Marsden and A. Tromba, Vector Calculus, 5th edn. W. H.
Freeman (2003).
[73] S. Lang, Introduction to Linear Algebra. Springer Science & Business
Media (2012).
[74] H. W. Kuhn and A. W. Tucker, Nonlinear programming, in Proceed-
ings of the Second Berkeley Symposium on Mathematical Statistics
and Probability, California (1951).
[75] A. H. Foundation, William Karush, https://ahf.nuclearmuseum.
org/ahf/profile/william-karush/ (2024).
[76] W. Karush, Minima of Functions of Several Variables with Inequal-
ities as Side Constraints (1939).
[77] H. Kuhn and S. Nasar (eds.), The Essential John Nash. Princeton
University Press (2002).
[78] M. S. Bazaraa, J. J. Jarvis and H. D. Sherali, Linear Programming
and Network Flows. Wiley-Interscience (2004).
[79] S. P. Boyd and L. Vandenberghe, Convex Optimization. Cambridge
University Press (2004).
[80] R. O. Duda, P. E. Hart and D. G. Stork, Pattern Classification.
Wiley Interscience (2000).
[81] J. Nocedal and S. J. Wright, Numerical Optimization. Springer
(1999).
[82] L. A. Wolsey and G. L. Nemhauser, Integer and Combinatorial Opti-
mization. John Wiley & Sons (2014).
[83] G. Sierksma and Y. Zwols, Linear and Integer Optimization: Theory
and Practice. CRC Press (2015).
[84] D. J. Albers and C. Reid, An interview with George B. Dantzig: The
father of linear programming, The College Mathematics Journal 17,
4, pp. 292–314 (1986).
[85] R. W. Cottle, George B. Dantzig: Operations research icon, Opera-
tions Research 53, 6, pp. 892–898 (2005).
[86] R. W. Cottle, George B. Dantzig: A legendary life in math-
ematical programming, Mathematical Programming 105, 1,
pp. 1–8 (2006).
[87] G. B. Dantzig, Reminiscences about the origins of linear program-
ming, in Mathematical Programming the State of the Art. Springer,
pp. 78–86 (1983).

[88] G. B. Dantzig, A proof of the equivalence of the programming prob-
lem and the game problem, Activity Analysis of Production and
Allocation 13, pp. 330–335 (1951).
[89] T. Raghavan, Zero-sum two-person games, Handbook of Game The-
ory with Economic Applications 2, pp. 735–768 (1994).
[90] I. Adler, The equivalence of linear programs and zero-sum games,
International Journal of Game Theory 42, pp. 165–177 (2013).
[91] L. G. Khachiyan, A polynomial algorithm in linear programming,
in Doklady Akademii Nauk, Vol. 244. Russian Academy of Sciences,
pp. 1093–1096 (1979).
[92] N. Karmarkar, A new polynomial-time algorithm for linear program-
ming, in Proceedings of the Sixteenth Annual ACM Symposium on
Theory of Computing, pp. 302–311 (1984).
[93] Gurobi Optimization, Gurobi, https://www.gurobi.com/academia/
academic-program-and-licenses/ (2024).
[94] IBM, IBM ILOG CPLEX Optimizer, https://www.ibm.com/pro
ducts/ilog-cplex-optimization-studio/cplex-optimizer (2024).
[95] O. L. Mangasarian and H. Stone, Two-person nonzero-sum games
and quadratic programming, Journal of Mathematical Analysis and
Applications 9, pp. 348–355 (1964).
[96] C. E. Lemke and J. T. Howson, Equilibrium points of bimatrix
games, Journal of the Society for Industrial and Applied Mathe-
matics 12, 2, pp. 413–423 (1961).
[97] R. Wilson, Computing equilibria of N-person games, SIAM Journal
on Applied Mathematics 21, 1, pp. 80–87 (1971).
[98] R. W. Cottle, J.-S. Pang and R. E. Stone, The Linear Complemen-
tarity Problem. SIAM (2009).
[99] T. Li and S. P. Sethi, A review of dynamic Stackelberg game models,
Discrete & Continuous Dynamical Systems-B 22, 1, p. 125 (2017).
[100] S. P. Dirkse and M. C. Ferris, The path solver: A nonmonotone sta-
bilization scheme for mixed complementarity problems, Optimiza-
tion Methods and Software 5, 2, pp. 123–156 (1995).
[101] M. Mavronicolas, V. Papadopoulou and P. Spirakis, Algorithmic
game theory and applications, Handbook of Applied Algorithms:
Solving Scientific, Engineering and Practical Problems, pp. 287–315
(2008).
[102] G. Persiano, Algorithmic Game Theory. Springer (2011).
[103] T. Roughgarden, Algorithmic game theory, Communications of the
ACM 53, 7, pp. 78–86 (2010).
[104] N. Bitansky, O. Paneth and A. Rosen, On the cryptographic hard-
ness of finding a Nash equilibrium, in 2015 IEEE 56th Annual
Symposium on Foundations of Computer Science. IEEE, pp. 1480–
1498 (2015).

[105] C. Daskalakis, P. W. Goldberg and C. H. Papadimitriou, The com-
plexity of computing a Nash equilibrium, Communications of the
ACM 52, 2, pp. 89–97 (2009).
[106] E. Hazan and R. Krauthgamer, How hard is it to approximate
the best Nash equilibrium? SIAM Journal on Computing 40, 1,
pp. 79–91 (2011).
[107] R. Mehta, Constant rank two-player games are PPAD-hard, SIAM
Journal on Computing 47, 5, pp. 1858–1887 (2018).
[108] A. Rubinstein, Settling the complexity of computing approxi-
mate two-player Nash equilibria, ACM SIGecom Exchanges 15, 2,
pp. 45–49 (2017).
[109] J. L. Cohen, Multiobjective Programming and Planning. Dover
(2003).
[110] J. Nash, Two-person cooperative games, Econometrica: Journal of
the Econometric Society, pp. 128–140 (1953).
[111] A. Muthoo, Bargaining Theory with Applications. Cambridge Uni-
versity Press (1999).
[112] K. Avrachenkov, J. Elias, F. Martignon, G. Neglia and L. Petrosyan,
Cooperative network design: A Nash bargaining solution approach,
Computer Networks 83, pp. 265–279 (2015).
[113] M. Bateni, M. Hajiaghayi, N. Immorlica and H. Mahini, The coop-
erative game theory foundations of network bargaining games, in
Automata, Languages and Programming: 37th International Collo-
quium, ICALP 2010, Bordeaux, France, July 6-10, 2010, Proceed-
ings, Part I 37. Springer, pp. 67–78 (2010).
[114] M. M. Hassan and A. Alamri, Virtual machine resource allocation
for multimedia cloud: A Nash bargaining approach, Procedia Com-
puter Science 34, pp. 571–576 (2014).
[115] C. Liu, K. Li, Z. Tang and K. Li, Bargaining game-based scheduling
for performance guarantees in cloud computing, ACM Transactions
on Modeling and Performance Evaluation of Computing Systems
(TOMPECS) 3, 1, pp. 1–25 (2018).
[116] K. Miettinen, Nonlinear Multiobjective Optimization, Vol. 12.
Springer Science & Business Media (1999).
[117] A. Navon, A. Shamsian, G. Chechik and E. Fetaya, Learning the
Pareto front with hypernetworks, arXiv preprint arXiv:2010.04104
(2020).
[118] D. B. Gillies, Solutions to general non-zero-sum games, Contribu-
tions to the Theory of Games 4, 40, pp. 47–85 (1959).
[119] L. S. Shapley, A Value for N-person Games, RAND Number P-295
(1952).

[120] S. H. Strogatz, Nonlinear Dynamics and Chaos: With Applica-
tions to Physics, Biology, Chemistry, and Engineering. CRC Press
(2018).
[121] J. Hofbauer and K. Sigmund, Evolutionary game dynamics, Bulletin
of the American Mathematical Society 40, 4, pp. 479–519 (2003).
[122] E. C. Zeeman, Population dynamics from game theory, in Global
Theory of Dynamical Systems, no. 819 in Springer Lecture Notes in
Mathematics. Springer (1980).
[123] J. Hofbauer, Evolutionary dynamics for bimatrix games: A Hamil-
tonian system? Journal of Mathematical Biology 34, pp. 675–688
(1996).
[124] J. M. Smith and G. R. Price, The logic of animal conflict, Nature
246, 5427, pp. 15–18 (1973).
[125] J. Hofbauer and K. Sigmund, Evolutionary Games and Population
Dynamics. Cambridge University Press (1998).
[126] J. W. Weibull, Evolutionary Game Theory. MIT Press (1997).
[127] D. Friedman and B. Sinervo, Evolutionary Games in Natural, Social,
and Virtual Worlds. Oxford University Press (2016).
[128] M. A. Nowak, Evolutionary Dynamics. Harvard University Press
(2006).
[129] G. Vickers, Spatial patterns and ESS’s, Journal of Theoretical Biol-
ogy 140, 1, pp. 129–135 (1989), http://www.sciencedirect.com/
science/article/pii/S0022519389800335.
[130] G. Vickers, Spatial patterns and travelling waves in population
genetics, Journal of Theoretical Biology 150, pp. 329–337 (1991).
[131] C. S. Gokhale and A. Traulsen, Evolutionary games in the mul-
tiverse, Proceedings of the National Academy of Sciences 107, 12,
pp. 5500–5504 (2010).
[132] C. Griffin and R. Wu, Higher-order dynamics in the replicator equa-
tion produce a limit cycle in rock-paper-scissors, Europhysics Letters
142, 3, p. 33001 (2023).
[133] B. Skyrms, Chaos in game dynamics, Journal of Logic, Language
and Information 1, 2, pp. 111–130 (1992).
[134] J. Paik and C. Griffin, Completely integrable replicator dynam-
ics associated to competitive networks, Physical Review E 107, 5,
p. L052202 (2023).
[135] S. Allesina and J. M. Levine, A competitive network theory of
species diversity, Proceedings of the National Academy of Sciences
108, 14, pp. 5638–5642 (2011).
[136] J. M. Levine, J. Bascompte, P. B. Adler and S. Allesina, Beyond
pairwise mechanisms of species coexistence in complex communities,
Nature 546, 7656, pp. 56–64 (2017).
Index

α − β pruning, 81 C
card counting, 19, 28
A big player team, 28
characteristic (value) function, 230
affine function, 149
chicken, 90–91, 93, 200
algorithmic game theory, 203
coalition, see cooperative game,
AlphaGo, 60
coalition
coalition game, see cooperative game,
B coalition game
battle of Avranches, 103, 128 column vector, 244
competitive payoff region, see payoff
battle of the Bismark Sea, 59, 63, 73,
76, 80, 82, 90 region, competitive
complete information, 57
battle of the networks, 98
concave function, 148
battle of the sexes (buddies), 209,
conditional probability, see
225
probability, conditional
Bayes’ theorem, 25
conical combination, 146
Bernoulli, Johann, 47 constraint
best response function, 132 binding, 143
best response strategy, see strategy, equality, 141
best response inequality, 141
bimatrix replicator dynamics, see convex combination, 146
replicator dynamics (equation), convex function, 147
bimatrix convex set, see set, convex
blackjack, 19, 28 Conway, John, 81
Blaise Pascal, see Pascal, Blaise cooperative game
Bondareva–Shapley theorem, coalition game, 229–230, 232
236 core, 234
Brouwer fixed point theorem, dominance, 233
125 grand coalition, 229


imputation, 233 elephant, 46


inessential game (zero-sum), equilibrium, 76, 89, 94
232–233 existence, 77, 125, 133
cooperative payoff region, see payoff Nash, 109, 115
region, cooperative subgame perfect, 79
coordinate game, see game, zero-sum game, 98, 100, 120
coordination evolutionary game theory, 267
core, see cooperative game, core expected utility theorem, 40, 42
craps, 15 expected value, 9
exponential growth (decay), 258–259
D
Dantzig, George B., 183 F
de Méreé, Chevalier, 27 Fermat, Pierre de, 27
deal or no deal, 3, 6, 12 fixed point
decision tree, see tree, decision asymptotic stability, 263
Deep Blue, 81 stability, 262
descendent, see tree, descendent folk theorem of evolutionary games,
diagonal matrix, see matrix, diagonal 267
differential equation football (North American), 88
fixed point, 262
initial value problem, 259 G
order, 258 game
phase portrait, 263 against the house, 3
system, 260 chance, 3, 65
autonomous, 260 complete information, 57
differential equation, 257 complexity, 60
directed graph, see graph, directed constant sum, 88
directed path, see path, directed coordination, 210
directed tree, see tree, directed general sum (KKT conditions), 88,
directional derivative, 251–252 191
discrete probability distribution, see incomplete information, 61–62
probability distribution normal form, 87, 92
discrete probability space, see strategic form, 89
probability space
symmetric, 92
dominance (imputation), see
value, 102
cooperative game, dominance
zero-sum, 88
dominated strategy, see strategy,
zero-sum (linear program), see
dominated
linear programming problem,
dot product, 245, 249 zero-sum game
dual feasibility, 152 zero-sum (matrix), 92
game tree, see tree, game
E
Gauss, Carl, 27
edge general-sum game, see game, constant
move assignment, 57 sum, see game, general sum
out, 54 generalized convexity, 157

Gillies, Donald, 239 standard form, 173


global maximum, 141 zero-sum game, 168, 175
gradient, 252–253 local maximum, 141
graph, 51 Lotka–Volterra equations, 262
directed, 51 lottery, 33–34
graph (function), 250 compound, 36

H M
Hawk-Dove, see Chicken Markov, Andrey, 28
height, see tree, height matrix, 243
Huygens, Christiaan, 27 addition, 244
diagonal, 246
I identity, 246
identity matrix, see matrix, identity multiplication, 245
imputation, see cooperative game, scalar multiplication, 244
imputation symmetric, 247
independence, see probability, transpose, 245
independent events minimax theorem, 121, 181
indifference theorem, 117, 182 mixed strategy, see strategy, mixed
information set, see set, information mixed-strategy space, 108
initial value problem, see differential Monty Hall problem, 22, 29
equation, initial value problem Morgenstern, Oskar, 46
inner product, 245 move assignment, see edge, move
intersection, see set, intersection assignment
iterative dominance, 117 mutual assured destruction, 95
K
N
Karush–Kuhn–Tucker theorem, 149,
Nasar, Silvia, 134
151
Nash bargaining theorem, 220, 222
Kasparov, Gary, 81
axioms, 217
Kelly, J. L., 28
Kolmogorov, Andrey, 28 Nash equilibrium, see equilibrium,
Kuhn, Harold W., 157 Nash
Nash, John F., 134
L Nick the Greek, 11
norm, see vector, norm
lagrange multipliers, 151
Laplace, Pierre-Simon, 27
O
leave, see vertex, terminal
Lemke–Howson algorithm, 202 one vector, see vector, one
level set, see set, level, 253 optimization problem, 141
line (function), 251 general sum game, 190
linear combination, 146 multi-criteria, 213, 227
linear programming problem, 160 cooperative game, 216
computer solution, 172 ordinary differential equation, 257
dual problem, 181, 235 out edge, see edge, out
infinite solutions, 166 outcome, see probability, outcome

P Q
Pareto frontier, 216, 226 quadratic programming problem, 187
Pareto optimality, 215 computer solution, 189
partition, 61 general sum game, 193
Pascal, Blaise, 27 quasi-convex function, 157
path, 69
directed, 52 R
payoff function, 57, 73
random variable, 9
expected, 73
rationality, 59
cooperative, 208
replicator dynamics (equation), 264
mixed strategy, 108
bimatrix, 270
payoff region
rock-paper-scissors, 57, 106, 109, 119,
competitive, 209
268
cooperative, 209
Roth, Alvin, 239
perfect information strategy, see
roulette, 10
strategy, perfect inforamtion
row vector, 244
phase portrait, see differential
equation, phase portrait
Pierre de Fermat, see de Fermat, S
Pierre saddle point, 102, 130
player sample space, see probability, event,
mixed-strategy space, 107 see probability, sample space
player 0, 65 set
player vertex assignment, see vertex, bounded, 212
player assignment closed, 212
poker, 72 convex, 145–146, 211
coin, 66, 68, 72, 75 information, 62
power set, see set, power set intersection, 5
preference, 35 level, 250
transitivity, 35 power set, 5
prisoner’s dilemma, 111, 114, stable, 234
116, 267 union, 5
probability, 3 set, convex, 148
conditional, 15, 17 Shapley, Lloyd, 239
event, 4 Shapley values, 237
mutually exclusive events, 5 slack variable, 174
history, 27 Snowdrift, see Chicken
independent events, 18 St. Petersburg paradox, 47
outcome, 4 stable fixed point, see fixed point,
sample space, 4 stability
probability distribution, 5 stable set, see set, stable
probability space, 6 strategy
pseudo-convex function, 157 best response, 131

column dominance, 115 U


cooperative mixed, 208 union, see set, union
dominated, 111 unit simplex, 108
imperfect information, 63 utility function
mixed, 105 affine transformation, 46
Nash, see equilibrium, Nash linear, 45
perfect information, 58 utility theory, 47
pure, 108
row dominance, 114 V
strict dominance, 111
vector
weak dominance, 111
mixed strategy, 107
strategy space, 73
norm, 249
strong duality theorem,
one, 247
181
pure strategy, 108
subtree, see tree, subtree
standard basis, 93, 246
superadditivity, 231
tangent, 253
surplus variable, 174
zero, 247
symmetric matrix, see matrix,
vertex
symmetric
player assignment, 56
terminal, 54
T
von Neumann, John, 46, 183
tangent vector, see vector,
tangent W
terminal vertex, see vertex, terminal
weak dominance, see strategy, strict
The Price is Right, 33–34
dominance, see strategy, weak
Thorp, Edward O., 28
dominance
transpose, see matrix, transpose
Weierstrass’ theorem, 219
tree, 52
Wilson’s theorem, 202
decision, 81
descendent, 54
Z
directed, 52
game, 56–57, 62, 65, 67 Zeeman’s theorem, 270
height, 54 Zermelo’s theorem, 81
subtree, 55 zero vector, see vector, zero
Tucker, Albert W., 157 zero-sum game, see game, zero-sum
