06-Adversarial_Search
ARTIFICIAL INTELLIGENCE
Games and
Adversarial
Search
Games: Outline of Unit
o Part I: Games as Search
Motivation
Game-playing AI successes
Game Trees
Evaluation Functions
o Part II: Adversarial Search
The Minimax Rule
Alpha-Beta Pruning
Ratings of human and computer chess champions
May 11, 1997: Deep Blue defeats Garry Kasparov
https://srconstantin.wordpress.com/2017/01/28/performance-trends-in-ai/
o Examples: Chess, Checkers, Go, Mancala, Tic-Tac-Toe, Othello …
More complicated games
o Most card games (e.g. Hearts, Bridge, etc.) and Scrabble
Stochastic, not deterministic
Not fully observable: lacking in perfect information
o Real-time strategy games, e.g. Warcraft
Continuous rather than discrete
No pause between actions, don’t take turns
o Cooperative games
Pac-Man
https://youtu.be/-CbyAk3Sn9I
Formalizing the Game setup
1. Two players: MAX and MIN; MAX moves first.
2. MAX and MIN take turns until the game is over.
3. Winner gets a reward, loser gets a penalty.
o Games as search:
Initial state: e.g. board configuration of chess
Successor function: list of (move, state) pairs specifying legal moves.
Terminal test: Is the game finished?
Utility function: Gives numerical value of terminal states.
e.g. win (+∞), lose (-∞) and draw (0)
MAX uses search tree to determine next move.
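The formalization above can be sketched as a small Python interface. This is a minimal sketch; the game itself (a one-heap subtraction game) and all names are illustrative, not from the slides:

```python
import math

class SubtractionGame:
    """Toy instance of the games-as-search formalization: players
    alternately take 1 or 2 counters from a heap of n; whoever takes
    the last counter wins.  A state is (counters_left, player_to_move)."""

    def __init__(self, n=5):
        self.n = n

    def initial_state(self):
        return (self.n, 'MAX')                    # MAX moves first

    def successors(self, state):
        """List of (move, state) pairs specifying the legal moves."""
        counters, player = state
        other = 'MIN' if player == 'MAX' else 'MAX'
        return [(take, (counters - take, other))
                for take in (1, 2) if take <= counters]

    def is_terminal(self, state):
        return state[0] == 0                      # no counters left

    def utility(self, state):
        """Value of a terminal state from MAX's viewpoint: the player
        to move has nothing left to take, so the previous mover won."""
        _, player = state
        return -math.inf if player == 'MAX' else math.inf
```

Any two-player, zero-sum, perfect-information game fits this shape; only the four methods change.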
How to Play a Game by Searching
o General Scheme
1. Consider all legal successors to the current state (‘board position’)
2. Evaluate each successor board position
3. Pick the move which leads to the best board position.
4. After your opponent moves, repeat.
o Design issues
1. Representing the ‘board’
2. Representing legal next boards
3. Evaluating positions
4. Looking ahead
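The general scheme above, taken only one ply deep, can be sketched as a greedy chooser; `successors` and `evaluate` are hypothetical stand-ins for a concrete game's move generator and evaluation function:

```python
def choose_move(board, successors, evaluate):
    """One-ply greedy chooser: generate all legal successors of `board`,
    evaluate each resulting position, and pick the best.
    `successors(board)` yields (move, next_board) pairs and
    `evaluate(next_board)` scores a position for the mover."""
    best_move, best_score = None, float('-inf')
    for move, next_board in successors(board):
        score = evaluate(next_board)
        if score > best_score:
            best_move, best_score = move, score
    return best_move, best_score

# Toy usage: three moves whose "boards" are just their scores.
print(choose_move(None, lambda b: [('a', 1), ('b', 3), ('c', 2)],
                  lambda nb: nb))                 # -> ('b', 3)
```

Note that looking only one ply ahead ignores the opponent's reply; fixing that is exactly what minimax, later in this unit, is for.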
Hexapawn: A very simple Game
o Hexapawn is played on a 3×3 chessboard, with three pawns per side.
[Diagram: the initial position, with Black's three pawns (⬤) on the top row and White's three pawns (□) on the bottom row. White moves first.]
Game Trees
o Represent the game problem space by a tree:
Nodes represent ‘board positions’; edges represent legal moves.
Root node is the first position in which a decision must be made.
Hexapawn: Simplified Game Tree for 2 Moves
[Diagram: the root shows the initial position (White to move); below it, the positions reachable after White's first move (Black to move); below those, the positions after Black's reply (White to move again).]
Adversarial Search
Battle of Wits
https://www.youtube.com/watch?v=rMz7JBRbmNo
MAX & MIN Nodes: An egocentric view
o Two players: MAX, MAX’s opponent MIN
o All play is computed from MAX’s vantage point.
o When MAX moves, MAX attempts to MAXimize MAX’s outcome.
o When MAX’s opponent moves, they attempt to MINimize MAX’s outcome.
o WE TYPICALLY ASSUME MAX MOVES FIRST:
[Utility scale from MAX's perspective: −∞ (loss) … 0 (draw) … +∞ (win)]
Chess Evaluation Functions
o Claude Shannon argued for a chess evaluation function in a 1950 paper
o Alan Turing defined a simple material function in 1948:
f(n) = (sum of A’s piece values) − (sum of B’s piece values)
o More complex: weighted sum of positional features:
Σi wi featurei(n)
o Deep Blue had >8000 features
Positive features: rooks on open files, knights in closed positions, control of the center, developed pieces
Piece values for a simple Turing-style evaluation function, often taught to novice chess players:
Pawn 1.0, Knight 3.0, Bishop 3.25, Rook 5.0, Queen 9.0
• For a MAX node, the backed-up value is the maximum of the values of its
children (i.e. the best for MAX)
• For a MIN node, the backed-up value is the minimum of the values of its
children (i.e. the best for MIN)
The Minimax Procedure
o Until the game is over: compute backed-up values for the current search tree and make the best move.
[Diagram: a two-ply tree with leaf evaluation-function values 2, 7, 1, 8; the MIN nodes back up 2 and 1, and the MAX root backs up 2.]
Adversarial Search (Minimax)
o Minimax search:
A state-space search tree
Players alternate turns
Compute each node’s minimax value: the best achievable utility against a rational (optimal) adversary
[Diagram: terminal values 8, 2, 5, 6 are part of the game; the MIN nodes’ minimax values 2 and 5, and the MAX root’s value 5, are computed recursively.]
Minimax Implementation
def max-value(state):
    initialize v = -∞
    for each successor of state:
        v = max(v, value(successor))
    return v

def min-value(state):
    initialize v = +∞
    for each successor of state:
        v = min(v, value(successor))
    return v
Minimax Example
[Diagram: MAX root over three MIN nodes with leaf values 3 12 8 | 2 4 6 | 14 5 2; the MIN values are 3, 2, 2, so the minimax value at the root is 3.]
o But if MIN does not play optimally, MAX will do even better.
This theorem is not hard to prove
Comments on Minimax Search
o Depth-first search with a fixed depth limit of m ply.
O(b^m) time complexity – as usual!
O(bm) space complexity
o Minimax forms the basis for other game tree search algorithms.
Alpha-Beta Pruning (AIMA 5.3)
Alpha-Beta Pruning
o A way to improve the performance of the
Minimax Procedure
o Basic idea: “If you have an idea which is surely bad,
don’t take the time to see how truly awful it is” ~ Pat
Winston
[Diagram: MAX root with a left MIN child worth 2, so the root is ≥ 2; the right MIN child is already bounded ≤ 1, so we don’t need to compute the value at that node.]
o MAX will never allow a move that could lead to a worse score (for MAX)
than α
o MIN will never allow a move that could lead to a better score (for MAX)
than β
[Diagram: the two-ply tree with leaf evaluation-function values 2, 7, 1, 8 again; the MIN nodes back up 2 and 1, the MAX root backs up 2.]
New point: the values are actually calculated by DFS!
Minimax Algorithm
function MINIMAX-DECISION(state) returns an action
    inputs: state, current state in game
    v ← MAX-VALUE(state)
    return an action in SUCCESSORS(state) with value v
o MAX will never choose a move that could lead to a worse score (for MAX)
than α
o MIN will never choose a move that could lead to a better score (for MAX)
than β
Prune below a MIN node when its β value becomes ≤ the α value of its ancestors.
• MIN nodes update β based on children’s returned values.
• Idea: player MAX at the node above won’t pick that value anyway; she can do better.
Pseudocode for Alpha-Beta Algorithm
function ALPHA-BETA-SEARCH(state) returns an action
    inputs: state, current state in game
    v ← MAX-VALUE(state, −∞, +∞)
    return an action in ACTIONS(state) with value v
An Alpha-Beta Example
Do DF-search until first leaf. α, β initial values:
MAX: α = −∞, β = +∞
α, β passed to kids:
MIN: α = −∞, β = +∞
An Alpha-Beta Example (continued)
MAX: α = −∞, β = +∞
MIN updates β based on kids.
MIN: α = −∞, β = 3 (node value so far ≤ 3)
Leaf visited: 3
An Alpha-Beta Example (continued)
MAX: α = −∞, β = +∞
MIN updates β based on kids. No change.
MIN: α = −∞, β = 3 (node value so far ≤ 3)
Leaves visited: 3, 12
An Alpha-Beta Example (continued)
MAX updates α based on kids.
MAX: α = 3, β = +∞
3 is returned as the value of the left MIN node (α = −∞, β = 3).
Leaves visited: 3, 12, 8
An Alpha-Beta Example (continued)
MAX: α = 3, β = +∞
α, β passed to kids:
MIN: α = 3, β = +∞
Leaves visited so far: 3, 12, 8
An Alpha-Beta Example (continued)
MAX: α = 3, β = +∞
MIN updates β based on kids.
MIN: α = 3, β = 2
Leaves visited: 3, 12, 8, 2
An Alpha-Beta Example (continued)
MAX: α = 3, β = +∞
MIN: α = 3, β = 2 — α ≥ β, so prune.
The remaining leaves under this MIN node are cut off (X X).
Leaves visited: 3, 12, 8, 2
An Alpha-Beta Example (continued)
MAX updates α based on kids. No change.
MAX: α = 3, β = +∞
2 is returned as the node value (the middle MIN node is ≤ 2).
Leaves visited: 3, 12, 8, 2 (two leaves pruned: X X)
An Alpha-Beta Example (continued)
MAX: α = 3, β = +∞
α, β passed to kids:
MIN: α = 3, β = +∞
Leaves visited so far: 3, 12, 8, 2 (two pruned: X X)
An Alpha-Beta Example (continued)
MAX: α = 3, β = +∞
MIN updates β based on kids.
MIN: α = 3, β = 14
Leaves visited: 3, 12, 8, 2, 14
An Alpha-Beta Example (continued)
MAX: α = 3, β = +∞
MIN updates β based on kids.
MIN: α = 3, β = 5
Leaves visited: 3, 12, 8, 2, 14, 5
An Alpha-Beta Example (continued)
MAX: α = 3, β = +∞
2 is returned as the node value.
Leaves visited: 3, 12, 8, 2, 14, 5, 2
An Alpha-Beta Example (continued)
MAX now makes its best move (the left branch, with value 3),
as computed by Alpha-Beta.
Final picture: leaves 3, 12, 8, 2, 14, 5, 2 visited; two leaves pruned (X X).
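The whole walkthrough can be checked in code. Below is a sketch of alpha-beta over the same example tree, recording every leaf it actually evaluates so the pruning is observable (the tree literal reuses the leaf values from the minimax example):

```python
import math

def alphabeta(node, alpha, beta, maximizing, visited):
    """Alpha-beta over an explicit tree (leaf = number, internal node =
    list of children).  `visited` records each leaf actually evaluated."""
    if isinstance(node, (int, float)):
        visited.append(node)
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, v)
            if alpha >= beta:          # MIN above won't allow this branch
                break
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, v)
        if alpha >= beta:              # prune: MAX above can already do better
            break
    return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited = []
print(alphabeta(tree, -math.inf, math.inf, True, visited))  # -> 3
print(visited)  # -> [3, 12, 8, 2, 14, 5, 2]; the 4 and 6 were pruned
```

Exactly as in the walkthrough: the middle MIN node is abandoned after its first leaf (2 ≤ α = 3), and the pruned result agrees with full minimax.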
o For Deep Blue, alpha-beta pruning reduced the average branching factor from
35-40 to about 6. This matches theory: with good move ordering, alpha-beta
examines only O(b^(m/2)) nodes in the best case, roughly doubling the search
depth reachable in the same time.
Real systems use a few more tricks
o Expand the proposed solution a little farther
Just to make sure there are no surprises
o Learn better board evaluation functions
E.g., for backgammon
o Learn model of your opponent
E.g., for poker
o Do stochastic search
E.g., for Go