
Adversarial Search and

Game- Playing
Outline

● Adversarial Search
● MiniMax Algorithm
● Alpha-beta Pruning
Game Playing

Why do AI researchers study game playing?

1. It’s a good reasoning problem, formal and nontrivial.

2. Direct comparison with humans and other computer programs is easy.
What Kinds of Games?
Mainly games of strategy with the following
characteristics:
1. Sequence of moves to play
2. Rules that specify possible moves
3. Rules that specify a utility for each move
4. Objective is to maximize your utility

Games vs. Search Problems

• Unpredictable opponent 🡪 a solution must specify a move for every possible opponent reply

• Time limits 🡪 unlikely to find the goal; must approximate
Two-Player Game

The basic game-playing loop:
1. Opponent moves; generate the new position.
2. If the game is over, stop.
3. Generate successors of the current position.
4. Evaluate the successors.
5. Move to the highest-valued successor.
6. If the game is over, stop; otherwise go back to step 1.
Environment Type (Games)

[Figure: a classification of game environments.] Games here are fully observable, multi-agent, and turn-taking (semi-dynamic); they may be deterministic or non-deterministic.
⚫ Sequential and discrete 🡪 game tree search
⚫ Sequential but continuous 🡪 continuous action games
⚫ Not sequential 🡪 game matrices
Adversarial Search
● Examine the problems that arise when we try to plan
ahead in a world where other agents are planning
against us.

● A good example is in board games.

● Adversarial games, while much studied in AI, are a small part of game theory in economics.
Typical AI assumptions
● Two agents whose actions alternate

● Utility values for each agent are the opposite of the other
- creates the adversarial situation

● Fully observable environments

● In game theory terms: zero-sum games of perfect information.

● We’ll relax these assumptions later.


Search versus Games
⚫ Search – no adversary
⚪ Solution is (heuristic) method for finding goal
⚪ Heuristic techniques can find optimal solution
⚪ Evaluation function: estimate of cost from start to goal through given
node
⚪ Examples: path planning, scheduling activities

⚫ Games – adversary
⚪ Solution is strategy (strategy specifies move for every
possible opponent reply).
⚪ Optimality depends on opponent. Why?
⚪ Time limits force an approximate solution
⚪ Evaluation function: evaluate “goodness” of game position
Types of Games

                          Deterministic                   Chance moves
Perfect information       Chess, checkers, go, Othello    Backgammon, Monopoly
Imperfect information     Bridge, Skat                    Poker, Scrabble, blackjack
(initial chance moves)
Game Setup
● Two players: MAX and MIN
● MAX moves first; the players take turns until the game is over
- The winner gets a reward, the loser a penalty.

Games as search:
- Initial state: e.g. the board configuration in chess
- Successor function: list of (move, state) pairs specifying legal moves.
- Terminal test: Is the game finished?
- Utility function: gives a numerical value for terminal states, e.g. win (+1), lose (-1) and draw (0) in tic-tac-toe or chess

● MAX uses a search tree to determine its next move.

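A minimal sketch of this games-as-search formalization in Python (the class and method names here are illustrative assumptions, not from the lecture):

from typing import Iterable, Tuple

class Game:
    """Games-as-search interface: initial state, successors, terminal test, utility."""
    def initial_state(self):
        raise NotImplementedError
    def successors(self, state) -> Iterable[Tuple[object, object]]:
        # yield (move, state) pairs for every legal move
        raise NotImplementedError
    def is_terminal(self, state) -> bool:
        raise NotImplementedError
    def utility(self, state) -> float:
        # value of a terminal state for MAX, e.g. +1 win, -1 loss, 0 draw
        raise NotImplementedError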

Size of search trees
● b = branching factor
● d = number of moves by both players
● Search tree is O(b^d)

● Chess
⚪ b ≈ 35
⚪ d ≈ 100
- search tree is ~ 10^154 (!!); see the quick check below
- completely impractical to search exhaustively

● Game-playing emphasizes being able to make optimal decisions in a finite amount of time
⚪ Somewhat realistic as a model of a real-world agent
⚪ Even if games themselves are artificial
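A quick sanity check of that 10^154 figure (a sketch in Python):

import math

b, d = 35, 100                          # chess: branching factor, game length
print(f"about 10^{d * math.log10(b):.0f} nodes")   # -> about 10^154 nodes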
Game Tree (2-player, Deterministic, Turns)

[Figure: a game tree whose levels alternate between the computer’s turn and the opponent’s turn.] The computer is Max; the opponent is Min. At the leaf nodes, the utility function is employed: a big value means good, a small value means bad.
Mini-Max Terminology
• move: a move by both players
• ply: a half-move (one player’s turn)
• utility function: the function applied to leaf nodes
• backed-up value
– of a max-position: the value of its largest successor
– of a min-position: the value of its smallest successor
• minimax procedure: search down several levels; at the bottom level apply the utility function, back up values all the way to the root node, and let the root select the move.
Minimax strategy: Look ahead and reason
backwards
● Find the optimal strategy for MAX assuming an infallible MIN
opponent
○ Need to compute this all the way down the tree

● Assumption: Both players play optimally!


● Given a game tree, the optimal strategy can be determined by
using the minimax value of each node.
Example of Minimax

[Figure: a three-ply tree. Max root A; Min nodes B, C; Max nodes D, E, F, G; leaf utilities D: (-1, 8), E: (-3, -1), F: (2, 1), G: (-3, 4).]

Each Max node backs up its largest child and each Min node its smallest: D = 8, E = -1, F = 2, G = 4; then B = min(8, -1) = -1 and C = min(2, 4) = 2; finally the root A = max(-1, 2) = 2.
Example:
Consider a game with 4 final states, where the paths to the final states run from the root to the 4 leaves of a perfect binary tree (leaf values 3, 5, 2, 9). Assume you are the maximizing player and you get the first chance to move, i.e., you are at the root and your opponent is at the next level. Which move would you make as the maximizing player, given that your opponent also plays optimally?
Since minimax is a backtracking-based algorithm, it tries all possible moves, then backtracks and makes a decision.

● Maximizer goes LEFT: it is now the minimizer’s turn. The minimizer has a choice between 3 and 5; being the minimizer, it will choose the smaller of the two, that is 3.
● Maximizer goes RIGHT: it is now the minimizer’s turn. The minimizer has a choice between 2 and 9, and will choose 2 as it is the smaller of the two values.
● Being the maximizer, you would choose the larger value, that is 3. Hence the optimal move for the maximizer is to go LEFT and the optimal value is 3.
Now the game tree looks as below: it shows the two possible scores when the maximizer makes the left and right moves. Note: even though there is a value of 9 in the right subtree, the minimizer will never pick it; we must always assume that our opponent plays optimally. (A short code check follows.)
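A quick check of this example in Python (a sketch; encoding the tree as the leaf list [3, 5, 2, 9] of a perfect binary tree is an illustrative assumption):

def minimax(depth, index, is_max, leaves):
    # depth 0 means we are at a leaf of the perfect binary tree
    if depth == 0:
        return leaves[index]
    left = minimax(depth - 1, 2 * index, not is_max, leaves)
    right = minimax(depth - 1, 2 * index + 1, not is_max, leaves)
    return max(left, right) if is_max else min(left, right)

print(minimax(2, 0, True, [3, 5, 2, 9]))   # -> 3: the maximizer goes LEFT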
Minimax
• Perfect play for deterministic games
• Idea: choose move to position with highest minimax value
= best achievable payoff against best play
• E.g., a 2-ply game:

[Figure: a 2-ply game tree with alternating max and min levels; values are backed up from the leaves to the root.]
Minimax Strategy
• Why do we take the min value every other level of the tree?

• These nodes represent the opponent’s choice of move.

• The computer assumes that the human will choose the move that is of least value to the computer.
Minimax Algorithm
The minimax algorithm is the adversarial analogue of depth-first search (DFS).
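A sketch of the recursion in Python (is_terminal, utility, and successors are assumed helpers matching the game setup given earlier; successors yields (move, state) pairs):

def minimax_value(state, is_terminal, utility, successors, max_to_move=True):
    # Depth-first: recurse to the leaves, then back values up
    if is_terminal(state):
        return utility(state)
    values = [minimax_value(s, is_terminal, utility, successors, not max_to_move)
              for _, s in successors(state)]
    return max(values) if max_to_move else min(values)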
Example of Algorithm Execution
[Figure: a step-by-step minimax execution, MAX to move.]
Properties of Minimax
• Complete?
– Yes (if tree is finite)
• Optimal?
– Yes (against an optimal opponent)
– No (does not exploit opponent weakness against a suboptimal opponent)
• Time complexity?
– O(b^m)
• Space complexity?
– O(bm) (depth-first exploration)
Activity
In the following two-ply game tree, the terminal nodes show the utility
values computed by the utility function. Use the Minimax algorithm to
compute the utility values for other nodes in the given game tree.
Good Enough?
• Chess:
– branching factor b ≈ 35
– game length m ≈ 100
– search space b^m ≈ 35^100 ≈ 10^154

• The Universe:
– number of atoms ≈ 10^78
– age ≈ 10^18 seconds
– 10^8 moves/sec × 10^78 atoms × 10^18 seconds = 10^104

Practical problem with minimax search
● The number of game states is exponential in the number of moves, so minimax gets slow on complex problems.
⚪ Solution: do not examine every node
=> pruning
⯍ Remove branches that do not influence the final decision
Alpha-Beta Procedure
• The alpha-beta procedure can speed up a depth-first minimax search.
• Alpha: a lower bound on the value that a max node may ultimately be assigned
v > α
• Beta: an upper bound on the value that a min node may ultimately be assigned
v < β
Effectiveness of Alpha-Beta Search

⚫ Worst case
⚪ Branches are ordered so that no pruning takes place. In this case alpha-beta gives no improvement over exhaustive search.

⚫ Best case
⚪ Each player’s best move is the left-most alternative (i.e., evaluated first).
⚪ In practice, performance is closer to the best case than the worst case.

⚫ In practice we often get O(b^(d/2)) rather than O(b^d)
⚪ This is the same as having a branching factor of sqrt(b),
⯍ since (sqrt(b))^d = b^(d/2)
⯍ i.e., we have effectively gone from b to the square root of b
⚪ e.g., in chess, from b ≈ 35 to b ≈ 6
⯍ This permits much deeper search in the same amount of time, typically twice as deep.
Example: which nodes can be pruned?

[Figure: a MAX / MIN / MAX tree whose leaf values are 1 through 8.]
[Figure: an alpha-beta trace on a max/min tree.] Do we need to check the remaining node? No: that branch is guaranteed to be worse than what max already has, so it is pruned.
Alpha-Beta
MinVal(state, alpha, beta){
  if (terminal(state))
    return utility(state);
  for (s in children(state)) {
    child = MaxVal(s, alpha, beta);
    beta = min(beta, child);
    if (alpha >= beta) return beta;   // cutoff: MAX will never allow this node
  }
  return beta;
}

alpha = the highest value for MAX along the path
beta = the lowest value for MIN along the path
MaxVal(state, alpha, beta){
  if (terminal(state))
    return utility(state);
  for (s in children(state)) {
    child = MinVal(s, alpha, beta);
    alpha = max(alpha, child);
    if (alpha >= beta) return alpha;  // cutoff: MIN will never allow this node
  }
  return alpha;                       // a max node returns alpha, not beta
}
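A Python rendering of the MinVal/MaxVal pair above, run on the earlier 4-leaf example (leaves [3, 5, 2, 9]); the leaf-list encoding of the tree is an illustrative assumption:

def alphabeta(leaves, depth, index, alpha, beta, max_to_move):
    if depth == 0:
        return leaves[index]
    for i in (2 * index, 2 * index + 1):
        child = alphabeta(leaves, depth - 1, i, alpha, beta, not max_to_move)
        if max_to_move:
            alpha = max(alpha, child)
        else:
            beta = min(beta, child)
        if alpha >= beta:
            break                      # prune the remaining sibling(s)
    return alpha if max_to_move else beta

print(alphabeta([3, 5, 2, 9], 2, 0, float("-inf"), float("inf"), True))  # -> 3

Note that the leaf 9 is never examined: when the right min node sees the value 2, its β drops to 2 while α is already 3, so the search cuts off.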
[Figures: a step-by-step alpha-beta trace. α is the best (highest) value found for MAX along the path; β is the best (lowest) value found for MIN along the path. Each node starts from the (α, β) bounds passed down by its parent and tightens them as children return; a subtree is pruned as soon as β < α at a node.]
Properties of α-β
• Pruning does not affect the final result: alpha-beta returns exactly the same move and value as full minimax.

• Good move ordering improves the effectiveness of pruning.

• With "perfect ordering," time complexity = O(b^(m/2))
🡪 doubles the achievable depth of search

• Alpha-beta is a simple example of reasoning about ‘which computations are relevant’ (a form of metareasoning).
Shallow Search Techniques

1. limited search for a few levels
2. reorder the level-1 successors
3. proceed with α-β minimax search


Practical Implementation
How do we make these ideas practical in real game trees?

Standard approach:
⚫ cutoff test (where do we stop descending the tree?)
⚪ depth limit
⚪ better: iterative deepening
⚪ cut off only when no big changes are expected to occur next (quiescence search)

⚫ evaluation function
⚪ When the search is cut off, we evaluate the current state by estimating its utility using an evaluation function.
Cutting off Search
MinimaxCutoff is identical to MinimaxValue except that
1. Terminal? is replaced by Cutoff?
2. Utility is replaced by Eval

Does it work in practice? With b^m = 10^6 and b = 35 we get m ≈ 4, and 4-ply lookahead is a hopeless chess player!
– 4-ply ≈ human novice
– 8-ply ≈ typical PC, human master
– 12-ply ≈ Deep Blue, Kasparov
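A sketch of that substitution in Python (eval_fn, is_terminal, and successors are assumed helpers; here the Cutoff? test is a plain depth limit):

def minimax_cutoff(state, depth, eval_fn, is_terminal, successors, max_to_move=True):
    if is_terminal(state) or depth == 0:    # Cutoff? instead of Terminal?
        return eval_fn(state)               # Eval instead of Utility
    values = [minimax_cutoff(s, depth - 1, eval_fn, is_terminal, successors,
                             not max_to_move)
              for _, s in successors(state)]
    return max(values) if max_to_move else min(values)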
[Figure: a game tree with alternating max and min levels, cut off below a fixed depth; Eval is applied at the cutoff frontier.]
Static (Heuristic) Evaluation Functions
⚫ An evaluation function:
⚪ estimates how good the current board configuration is for a player.
⚪ Typically, one measures how good the position is for the player and how good it is for the opponent, and subtracts the opponent’s score from the player’s.
⚪ Othello: number of white pieces - number of black pieces
⚪ Chess: value of all white pieces - value of all black pieces

⚫ Typical values run from -infinity (loss) to +infinity (win), or [-1, +1].
⚫ If the board evaluation is X for one player, it is -X for the opponent.
⚫ There are many clever ideas about how to use the evaluation function,
⚪ e.g. the null move heuristic: let the opponent move twice.
⚫ Examples: evaluating chess boards, checkers, tic-tac-toe. A sketch of a chess material count follows below.
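A minimal sketch of the chess material-count evaluation described above (the piece values and board encoding are illustrative assumptions):

PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

def material_eval(board):
    # board: iterable of piece codes; uppercase = white, lowercase = black
    score = 0
    for piece in board:
        sign = 1 if piece.isupper() else -1
        score += sign * PIECE_VALUES.get(piece.upper(), 0)
    return score

print(material_eval(["Q", "P", "P", "r", "n"]))   # 11 - 8 = 3: white is ahead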
Iterative (Progressive) Deepening
● In real games, there is usually a time limit T on making a move

● How do we take this into account?
○ Using alpha-beta, we cannot use “partial” results with any confidence unless the full breadth of the tree has been searched
○ So we could be conservative and set a conservative depth limit which guarantees that we will find a move in time < T
⯍ The disadvantage is that we may finish early, when we could have searched more

● In practice, iterative deepening search (IDS) is used

● IDS runs depth-first search with an increasing depth limit
○ When the clock runs out, we use the solution found at the deepest completed depth (a sketch follows below)
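A sketch of IDS under a time limit in Python (search_to_depth is an assumed helper that runs a complete depth-limited search and returns the chosen move; real implementations also check the clock inside the search):

import time

def ids_move(state, search_to_depth, time_limit):
    # Run depth-limited search with an increasing limit; keep the move
    # from the deepest fully completed depth when time runs out.
    deadline = time.monotonic() + time_limit
    best_move, depth = None, 1
    while time.monotonic() < deadline:
        best_move = search_to_depth(state, depth)   # completes one full depth
        depth += 1
    return best_move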
Summary
● Game playing can be effectively modeled as a search problem

● Game trees represent alternating computer/opponent moves

● Evaluation functions estimate the quality of a given board configuration for the Max player

● Minimax is a procedure that chooses moves by assuming that the opponent will always choose the move which is best for them

● Alpha-beta is a procedure that can prune large parts of the search tree, allowing search to go deeper

● For many well-known games, computer algorithms based on heuristic search match or out-perform human world experts
