Lecture11_AdversarialSearch
Game Playing
Outline
● Adversarial Search
● MiniMax Algorithm
● Alpha-beta Pruning
Game Playing
What Kinds of Games?
Mainly games of strategy with the following
characteristics:
1. Sequence of moves to play
2. Rules that specify possible moves
3. Rules that specify a utility for each move
4. Objective is to maximize your utility
Games vs. Search Problems
Two-Player Game
[Figure: flowchart of the two-player game loop. Boxes recovered: Opponent's Move; Game Over? (yes/no); Generate Successors; Evaluate Successors; Game Over? (yes/no).]
Environment Type (Games)
⚫ Turn-taking: Semi-dynamic
⚫ Deterministic and non-deterministic
[Figure: decision tree over environment properties (Fully Observable? Multi-agent? Sequential? Discrete?) with yes/no branches leading to Game Tree Search, Game Matrices, and Continuous Action Games.]
CMPT 310 - Blind Search
Adversarial Search
● Examine the problems that arise when we try to plan
ahead in a world where other agents are planning
against us.
● Utility values for each agent are the opposite of the other
- creates the adversarial situation
⚫ Games – adversary
⚪ Solution is strategy (strategy specifies move for every
possible opponent reply).
⚪ Optimality depends on opponent. Why?
⚪ Time limits force an approximate solution
⚪ Evaluation function: evaluate “goodness” of game position
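Types of Games
[Table: games classified by deterministic vs. chance moves (columns) and by the information available to the players (rows). Surviving entry: blackjack (initial chance moves).]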
Game Setup
● Two players: MAX and MIN
● MAX moves first and they take turns until the game is over
- Winner gets award, loser gets penalty.
Games as search:
- Initial state: e.g. board configuration of chess
- Successor function: list of (move,state) pairs specifying legal moves.
- Terminal test: Is the game finished?
- Utility function: Gives numerical value of terminal states, e.g. win (+1),
lose (-1) and draw (0) in tic-tac-toe or chess
● Chess
⚪ branching factor b ~ 35
⚪ game depth D ~ 100
- search tree has ~ 35^100 ≈ 10^154 nodes (!!)
- completely impractical to search this
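The size estimate follows directly from b and D. Here is a quick check of the arithmetic using Python's exact integers (the variable names are just for this sketch):

```python
import math

b = 35    # approximate branching factor of chess (from the slide)
D = 100   # approximate depth of a game (from the slide)

tree_size = b ** D                  # exact: Python integers are unbounded
exponent = D * math.log10(b)        # log10 of the tree size

print(f"35^100 is about 10^{exponent:.1f}")    # exponent is roughly 154.4
print(len(str(tree_size)), "decimal digits")   # 155 decimal digits
```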
[Figure: game tree whose levels alternate between the computer's turn and the opponent's turn; the leaf positions are evaluated.]
The computer is Max. The opponent is Min.
Mini-Max Terminology
• move: a move by both players
• ply: a half-move
• utility function: the function applied to leaf nodes
• backed-up value
– of a max-position: the value of its largest successor
– of a min-position: the value of its smallest successor
• minimax procedure: search down several levels; at the bottom level apply
the utility function, back-up values all the way up to the root node, and
that node selects the move.
Minimax strategy: Look ahead and reason
backwards
● Find the optimal strategy for MAX assuming an infallible MIN
opponent
○ Need to compute this all the way down the tree
[Figure: game tree with leaf values (left to right) -1 8 -3 -1 2 1 -3 4]
Example I
Example II
Example III
Example:
Consider a game with 4 final states, where the paths to the final
states run from the root to the 4 leaves of a perfect binary tree,
as shown below. Assume you are the maximizing player and you move
first, i.e., you are at the root and your opponent is at the next level.
Which move would you make as the maximizing player, given that
your opponent also plays optimally?
Since this is a backtracking-based algorithm, it tries all possible
moves, then backtracks and makes a decision.
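The slide's tree is not reproduced here, so the sketch below uses made-up leaf values (3, 5, 2, 9) purely to illustrate the reasoning. The names `leaves`, `min_values`, `best_move`, and `root_value` are likewise assumptions.

```python
# Hypothetical leaf utilities of a perfect binary tree with 4 leaves.
# The first pair is reached by moving left at the root, the second by moving right.
leaves = [[3, 5], [2, 9]]

# The opponent (MIN) moves at the second level and picks the smaller leaf.
min_values = [min(pair) for pair in leaves]

# We (MAX) move at the root and pick the child with the larger backed-up value.
best_move = max(range(len(min_values)), key=lambda i: min_values[i])
root_value = min_values[best_move]

print(min_values)                      # [3, 2]
print(["left", "right"][best_move])    # left
print(root_value)                      # 3
```

Moving right is tempting because of the 9, but an optimal opponent would answer with 2, so the left move (which guarantees 3) is the better choice.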
[Figure sequence: step-by-step minimax evaluation of the example tree; the levels alternate between max and min.]
Minimax Strategy
• Why do we take the min value every other level of the
tree?
• The computer assumes that the human will choose the
move that is of least value to the computer.
Minimax algorithm: Adversarial analogue of DFS
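A minimal recursive sketch of the procedure, written as a depth-first traversal. The tree representation (nested Python lists with numbers at the leaves) and the function name `minimax` are assumptions for illustration. The leaf values are the eight listed with the earlier look-ahead figure; arranging them as a binary tree of depth 3 with MAX at the root is also an assumption, since the figure itself is not available.

```python
def minimax(node, maximizing):
    """Return the backed-up value of `node`.

    A node is either a number (a leaf holding its utility) or a list of
    child nodes. `maximizing` is True when MAX is to move at this node.
    """
    if not isinstance(node, list):        # leaf: apply the utility function
        return node
    # Depth-first: fully evaluate each child before moving to the next one.
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


# Assumed shape: MAX at the root, then MIN, then MAX, then the leaves.
tree = [[[-1, 8], [-3, -1]],
        [[2, 1], [-3, 4]]]

print(minimax(tree, maximizing=True))     # 2
```

Under this assumed shape the two MIN nodes back up -1 and 2, so MAX chooses the right-hand move.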
Example of Algorithm Execution
MAX to move
Properties of Minimax
• Complete?
– Yes (if tree is finite)
• Optimal?
– Yes (against an optimal opponent)
– No (does not exploit opponent weakness against suboptimal opponent)
• Time complexity?
– O(b^m), for branching factor b and maximum depth m
• Space complexity?
– O(bm) (depth-first exploration)
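To make the two bounds concrete, the sketch below walks a uniform tree depth-first, counting visited nodes and tracking how many nodes sit on the current path. The function `explore` and the uniform-tree assumption (every internal node has exactly b children, all leaves at depth m) are illustrative only.

```python
def explore(b, m):
    """Depth-first walk of a uniform tree with branching factor b and depth m.

    Returns (nodes_visited, max_path_length), where max_path_length is the
    largest number of nodes on the current root-to-node path at any time.
    """
    visited = 0
    deepest = 0

    def visit(depth):
        nonlocal visited, deepest
        visited += 1
        deepest = max(deepest, depth + 1)
        if depth < m:
            for _ in range(b):
                visit(depth + 1)

    visit(0)
    return visited, deepest


for m in (2, 4, 6):
    nodes, path = explore(3, m)
    # Closed form for the node count: (b^(m+1) - 1) / (b - 1), i.e. O(b^m).
    print(m, nodes, (3 ** (m + 1) - 1) // 2, path)
```

The node count grows like b^m, while the path held in memory grows only linearly with m. Keeping the b siblings generated at each level of that path gives the O(bm) space bound.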
Activity
In the following two-ply game tree, the terminal nodes show the utility
values computed by the utility function. Use the Minimax algorithm to
compute the utility values for other nodes in the given game tree.
Good Enough?
• Chess:
– branching factor b≈35
• The Universe:
– number of atoms ≈ 10^78
– age ≈ 10^18 seconds
Effectiveness of Alpha-Beta Search
⚫ Worst-Case
⚪ branches are ordered so that no pruning takes place. In this case alpha-beta gives
no improvement over exhaustive search
⚫ Best-Case
⚪ each player’s best move is the left-most alternative (i.e., evaluated first)
⚪ in practice, performance is closer to best rather than worst-case
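A small experiment illustrating the effect of move ordering. It counts how many leaves alpha-beta evaluates on the same two-level tree when the best move comes first at every node versus last. The tree values and the names `alphabeta` and `count_leaves` are assumptions made for this sketch.

```python
import math


def alphabeta(node, alpha, beta, maximizing, counter):
    """Alpha-beta value of a nested-list tree; counter[0] counts evaluated leaves."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, counter))
            alpha = max(alpha, value)
            if alpha >= beta:          # MIN above will never allow this line
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, counter))
        beta = min(beta, value)
        if alpha >= beta:              # MAX above already has something better
            break
    return value


def count_leaves(tree):
    """Return (root value, number of leaves evaluated) with MAX at the root."""
    counter = [0]
    value = alphabeta(tree, -math.inf, math.inf, True, counter)
    return value, counter[0]


# Same game, two orderings. MAX moves at the root, MIN at the second level.
good_order = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]   # best move first at every node
bad_order = [[2, 1, 0], [5, 4, 3], [8, 7, 6]]    # best move last at every node

print(count_leaves(good_order))   # (6, 5)
print(count_leaves(bad_order))    # (6, 9)
```

Both orderings return the same root value, but the good ordering evaluates 5 of the 9 leaves while the bad ordering evaluates all 9.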
[Figure sequence: alpha-beta walk-through on a game tree with alternating max and min levels. The slides ask "Do we need to check this node?" and mark the subtrees that can be skipped with an X.]
Alpha-Beta
MinVal(state, alpha, beta){
    if (terminal(state))
        return utility(state);
    for (s in children(state)) {
        child = MaxVal(s, alpha, beta);
        beta = min(beta, child);
        if (alpha >= beta) return child;   // cutoff: MAX has a better option elsewhere
    }
    return beta;
}
alpha = the highest value for MAX along the path
beta = the lowest value for MIN along the path
Alpha-Beta
MaxVal(state, alpha, beta){
    if (terminal(state))
        return utility(state);
    for (s in children(state)) {
        child = MinVal(s, alpha, beta);
        alpha = max(alpha, child);
        if (alpha >= beta) return child;   // cutoff: MIN has a better option elsewhere
    }
    return alpha;                          // a MAX node returns alpha, not beta
}
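A runnable Python rendition of the two routines, offered as a sketch rather than a reference implementation. Assumptions: the game tree is a nested list with utilities at the leaves, so `terminal`, `utility` and `children` collapse into simple list operations; `max_val` returns alpha when its loop finishes (the mirror image of `min_val` returning beta); and the example tree is made up.

```python
import math


def min_val(node, alpha, beta):
    """Value of a MIN node, given the best options found so far on the path."""
    if not isinstance(node, list):          # terminal state
        return node
    for child in node:
        value = max_val(child, alpha, beta)
        beta = min(beta, value)
        if alpha >= beta:                   # MAX has a better option elsewhere
            return value
    return beta


def max_val(node, alpha, beta):
    """Value of a MAX node, given the best options found so far on the path."""
    if not isinstance(node, list):          # terminal state
        return node
    for child in node:
        value = min_val(child, alpha, beta)
        alpha = max(alpha, value)
        if alpha >= beta:                   # MIN has a better option elsewhere
            return value
    return alpha


# Made-up two-ply tree: MAX at the root, three MIN nodes below it.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(max_val(tree, -math.inf, math.inf))   # 3
```

In this example the second MIN node is abandoned after its first leaf (2), because MAX already has a move worth 3.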
[Figure sequence: alpha-beta trace on an example tree with alternating max and min levels. Each node is annotated with its current α (the best value for max along the path) and β (the best value for min along the path), starting from α = -∞, β = ∞ at the root. A branch is pruned (marked X) as soon as α ≥ β at a node.]
Properties of α-β
• Pruning does not affect the final result: alpha-beta returns exactly the same value
at the root as full minimax.
Standard approach:
⚫ cutoff test: (where do we stop descending the tree)
⚪ depth limit
⚪ better: iterative deepening
⚪ cutoff only when no big changes are expected to occur next (quiescence
search).
⚫ evaluation function
⚪ When the search is cut off, we evaluate the
current state by estimating its utility using an
evaluation function.
Cutting off Search
MinimaxCutoff is identical to MinimaxValue except
1. Terminal? is replaced by Cutoff?
2. Utility is replaced by Eval
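A sketch of the two substitutions, using a hypothetical toy game so that the code is self-contained: players alternately take 1 or 2 sticks from a pile, and whoever takes the last stick wins. The game, the depth-limit cutoff, and the names `successors`, `evaluate`, and `minimax_cutoff` are assumptions for illustration and do not come from the slides.

```python
# Hypothetical toy game: a state is (sticks, player), where player is +1
# when MAX is to move and -1 when MIN is to move.

def successors(state):
    """All states reachable by taking 1 or 2 sticks."""
    sticks, player = state
    return [(sticks - take, -player) for take in (1, 2) if take <= sticks]


def evaluate(state):
    """Eval: exact at the end of the game, a neutral guess of 0 otherwise."""
    sticks, player = state
    if sticks == 0:
        return -player      # the player to move did not take the last stick
    return 0                # placeholder heuristic: outcome unknown


def minimax_cutoff(state, depth, limit):
    """Minimax with Cutoff? in place of Terminal? and Eval in place of Utility."""
    sticks, player = state
    if sticks == 0 or depth >= limit:       # Cutoff? (here: a depth limit)
        return evaluate(state)              # Eval
    values = [minimax_cutoff(s, depth + 1, limit) for s in successors(state)]
    return max(values) if player == +1 else min(values)


start = (4, +1)                             # 4 sticks, MAX to move
print(minimax_cutoff(start, 0, limit=2))    # 0: search stops before the win is visible
print(minimax_cutoff(start, 0, limit=10))   # 1: deep enough to reach every terminal state
```

With a shallow limit the result is only as good as the evaluation function, which is why the choice of Eval matters once the search is cut off.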
[Figure: game tree with alternating min and max levels and a cutoff line below which the search does not descend.]
Static (Heuristic) Evaluation Functions
⚫ An Evaluation Function:
⚪ estimates how good the current board configuration is for a player.
⚪ Typically, one estimates how good the position is for the player and how good it is for
the opponent, and subtracts the opponent's score from the player's.
⚪ Othello: Number of white pieces - Number of black pieces
⚪ Chess: Value of all white pieces - Value of all black pieces
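A sketch of the material-count idea for chess. The piece values (pawn 1, knight 3, bishop 3, rook 5, queen 9) are the common textbook convention rather than something stated in these slides, and the input encoding (a string of piece letters, uppercase for White, lowercase for Black) and the function name `material_eval` are assumptions for illustration.

```python
# Conventional piece values; kings are omitted because both sides always have one.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}


def material_eval(pieces):
    """Value of all white pieces minus value of all black pieces.

    `pieces` is any iterable of piece letters: uppercase = White,
    lowercase = Black. Other characters (kings, empty squares) score 0.
    """
    score = 0
    for piece in pieces:
        value = PIECE_VALUES.get(piece.lower(), 0)
        score += value if piece.isupper() else -value
    return score


# White: king, queen, rook, 3 pawns. Black: king, two rooks, 2 pawns.
print(material_eval("KQRPPP" + "krrpp"))   # 17 - 12 = 5
```

A positive score favours White and a negative score favours Black, so White plays the role of the Max player in this sketch.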
● Evaluation functions estimate the quality of a given board configuration for the Max
player.
● Minimax is a procedure which chooses moves by assuming that the opponent will always
choose the move which is best for them
● Alpha-Beta is a procedure which can prune large parts of the search tree and allow
search to go deeper
● For many well-known games, computer algorithms based on heuristic search match or
out-perform human world experts.