CCS 3101 - Lecture 5 - Adversarial Search Techniques

The document discusses adversarial search techniques in game playing, highlighting the importance of understanding multi-agent environments and the necessity of adversarial search in competitive settings. It covers various types of games, including perfect and imperfect information games, and introduces concepts such as game trees, minimax algorithms, and alpha-beta pruning to optimize decision-making in games. The document emphasizes the effectiveness of AI in games, showcasing historical milestones in AI achievements in games like checkers and chess.


Adversarial search techniques: Game playing

CCS 3101
ARTIFICIAL INTELLIGENCE
Games

 Multi-agent environments: any given agent must consider the actions of other agents and how those actions affect its own welfare.

 The unpredictability of these other agents can introduce many possible contingencies.

 Environments may be competitive or cooperative.

 Competitive environments, in which the agents' goals are in conflict, require adversarial search – these problems are known as games.
Why study games
 Fun
 Clear criteria for success
 Interesting, hard problems which require minimal “initial structure”
 Games often define very large search spaces
 chess: ~10^120 nodes
 Historical reasons
 Offer an opportunity to study problems involving {hostile, adversarial,
competing} agents.
Why do we use AI to play games
 Games are an intelligent activity.

 They provide a structured task in which it is very easy to measure success or failure.

 They do not require large amounts of knowledge.

 They were thought to be solvable by straightforward search from the starting state to a winning position.
What kind of games?
 Abstraction: To describe a game we must capture every relevant aspect of the game.
Such as:
 Chess
 Tic-tac-toe
…
 Accessible environments: such games are characterized by perfect information.

 Search: game playing then consists of a search through possible game positions.

 Unpredictable opponent: introduces uncertainty, so game playing must deal with contingency problems.
Types of Games
 Perfect information: a game in which agents can see the complete board. Agents have all the information about the game, and they can see each other's moves as well. Examples: Chess, Checkers, Go, etc.
 Imperfect information: a game in which agents do not have all the information about the game and are not aware of everything that is going on, such as Battleship, Bridge, Poker, etc.
 Deterministic games: games which follow a strict set of rules, with no randomness associated with them. Examples: Chess, Checkers, Go, Tic-tac-toe, etc.
 Non-deterministic games: games which have various unpredictable events with a factor of chance or luck, introduced by either dice or cards. These are random, and each action response is not fixed. Such games are also called stochastic games. Examples: Backgammon, Monopoly, Poker, etc.
Types of Games

 AI games are a specialized kind: deterministic, turn-taking, two-player, zero-sum games of perfect information.

 A zero-sum game is a mathematical representation of a situation in which a participant's gain (or loss) of utility is exactly balanced by the losses (or gains) of utility of the other participant(s).
Typical assumptions
 Two agents whose actions alternate

 Utility values for each agent are the opposite of the other
 This creates the adversarial situation

 Fully observable environments

 In game theory terms: "deterministic, turn-taking, zero-sum games of perfect information"
Game Trees
 A game tree is a tree where nodes of the tree are the game states and Edges of the
tree are the moves by players.
 Root node represents the “board” configuration at which a decision must be made as
to what is the best single move to make next. (not necessarily the initial
configuration)
 Evaluator function rates a board position. f(board) (a real number).
 Arcs represent the possible legal moves for a player (no costs associated with arcs).
 Terminal nodes represent end-game configurations (the result must be one of “win”,
“lose”, and “draw”, possibly with numerical payoff)
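The definitions above can be sketched as a small data structure. This is a minimal illustration, assuming a generic board representation and a placeholder evaluator f(board); all names here are my own, not from the slides.

```python
# A minimal sketch of a game-tree node: board configuration, whose turn it
# is, and one child per legal move (arc). A node with no children is a
# terminal (end-game) configuration.

class GameNode:
    def __init__(self, board, player, children=None):
        self.board = board              # the "board" configuration at this node
        self.player = player            # 'MAX' or 'MIN': whose turn it is
        self.children = children or []  # one child per legal move

    def is_terminal(self):              # end-game configuration: no legal moves
        return not self.children

def f(board):
    # Evaluator function: rates a board position as a real number
    # (e.g. win/lose/draw payoff at terminal nodes). Placeholder here.
    return 0.0

root = GameNode(board="initial", player='MAX',
                children=[GameNode("after-move-1", 'MIN'),
                          GameNode("after-move-2", 'MIN')])
print(len(root.children), root.children[0].player)   # 2 MIN
```

Note that the root need not be the initial configuration: it is simply the position at which a decision must be made next.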
Game Trees
 If it is my turn to move, then the root is labeled a "MAX" node; otherwise it is labeled
a "MIN" node indicating my opponent's turn.

 Each level of the tree has nodes that are all MAX or all MIN; nodes at level i are of the
opposite kind from those at level i+1

 Complete game tree: includes all configurations that can be generated from the root
by legal moves (all leaves are terminal nodes)

 Incomplete game tree: includes all configurations that can be generated from the
root by legal moves to a given depth (looking ahead to given steps)
Deterministic Single-Player
 Deterministic, single player, perfect information:
 Know the rules
 Know what actions do
 Know when you win
 E.g. Freecell, 8-Puzzle… it’s just search!

 Slight reinterpretation:
 Each node stores a value: the best outcome it can reach
 This is the maximal outcome of its children (the max value)

 After search, can pick move that leads to best node


Deterministic Two-Player
 E.g. tic-tac-toe, chess, checkers
 Zero-sum games
 One player maximizes result
 The other minimizes result
 Minimax search
 A state-space search tree
 Players alternate
 Each layer of the tree is called a ply and corresponds to one move by a single player
 Choose move to position with highest minimax value = best
achievable utility against best play
(Partial) game tree for Tic-Tac-Toe

• f(n) = +1 if the position is a win for X.
• f(n) = -1 if the position is a win for O.
• f(n) = 0 if the position is a draw.

How do we search this tree to find the optimal move?
Search versus Games
 Search – no adversary
 Solution is (heuristic) method for finding goal
 Heuristic techniques can find an optimal solution
 Evaluation function: estimate of cost from start to goal through given node
 Examples: path planning, scheduling activities

 Games – adversary
 Solution is strategy
 strategy specifies move for every possible opponent reply.
 Time limits force an approximate solution
 Evaluation function: evaluate “goodness” of game position
 Examples: chess, checkers, Othello, backgammon
Games as Search
 Two players: MAX and MIN
 MAX moves first and they take turns until the game is over
 Winner gets reward, loser gets penalty.

 Formal definition as a search problem:
 Initial state: It specifies how the game is set up at the start.
 Player(s): Defines which player has the move in a state.
 Actions(s): Returns the set of legal moves in a state.
 Result(s, a): Transition model defines the result of a move.
 Terminal-Test(s): Is the game finished? True if finished, false otherwise.
 Utility function(s, p): Gives numerical value of terminal state s for player p.
 E.g., win (+1), lose (-1), and draw (0) in tic-tac-toe.
 MAX uses search tree to determine next move.
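The six components above can be sketched concretely. The snippet below is my own toy example, not from the slides: a hypothetical game in which players alternately take 1 or 2 matches from a pile, and whoever takes the last match wins.

```python
# A minimal sketch of the formal game definition for a toy "last take wins"
# match game. A state is a pair (pile, mover); all names are my own.

def player(s):
    """Player(s): defines which player has the move in state s."""
    return s[1]

def actions(s):
    """Actions(s): returns the set of legal moves in state s."""
    pile, _ = s
    return [a for a in (1, 2) if a <= pile]

def result(s, a):
    """Result(s, a): transition model, the state after move a in state s."""
    pile, mover = s
    return (pile - a, 'MIN' if mover == 'MAX' else 'MAX')

def terminal_test(s):
    """Terminal-Test(s): True if the game is finished (pile is empty)."""
    return s[0] == 0

def utility(s, p):
    """Utility(s, p): value of terminal state s for player p (win +1, lose -1)."""
    # The player recorded as "to move" in a terminal state did not take
    # the last match, so that player loses.
    return -1 if p == s[1] else +1

s0 = (4, 'MAX')                  # initial state: MAX moves first
print(player(s0), actions(s0))   # MAX [1, 2]
```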
An optimal procedure: The Min-Max method
Designed to find the optimal strategy for Max and find best move:

1. Generate the whole game tree, down to the leaves.
2. Apply the utility (payoff) function to each leaf.
3. Back up values from the leaves through the branch nodes:
   A MAX node computes the max of its child values.
   A MIN node computes the min of its child values.
4. At the root: choose the move leading to the child of highest value.
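The back-up procedure can be sketched in a few lines. This is a minimal illustration on an explicit tree of my own (leaves are utility values, internal nodes are lists of children), not a full game-playing program.

```python
# A minimal sketch of the minimax back-up. A node is either a number
# (leaf utility from the payoff function) or a list of child nodes.

def minimax(node, maximizing):
    if isinstance(node, (int, float)):   # leaf: apply the utility function
        return node
    # back up values: MAX takes the max of its children, MIN the min
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Two-ply tree: MAX to move at the root, MIN at the next level.
# The three MIN nodes back up min values 3, 2, 2; the root takes max = 3.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, True))               # minimax value of the root: 3
```

At the root, MAX would then choose the move leading to the first child, whose backed-up value (3) is highest.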
Adversarial Search for the minimax procedure
 It aims to find the optimal strategy for MAX to win the game.
 It follows the approach of Depth-first search.
 In the game tree, optimal leaf node could appear at any depth of the tree.
 Minimax values are propagated up the tree as terminal nodes are discovered.

 In a given game tree, the optimal strategy can be determined from the minimax
value of each node, which can be written as MINIMAX(n).

 MAX prefers to move to a state of maximum value and MIN prefers to move to a state of minimum value.
Game Trees
Two-Ply Game Tree
• The Mini-Max algorithm uses recursion to search through the game tree.

• The minimax algorithm performs a depth-first exploration of the complete game tree.

• The algorithm proceeds all the way down to the terminal nodes of the tree, then backs the values up as the recursion unwinds.
Two-Ply Game Tree
Minimax maximizes the utility for the worst-case outcome for max

The minimax decision


Properties of minimax
 Complete?
 Yes (if tree is finite). It will definitely find a solution (if one exists).
 Optimal?
 Yes (against an optimal opponent).
 Can it be beaten by an opponent playing sub-optimally?
 No.
 Time complexity?
O(b^m), where b is the branching factor of the game tree, and m is the maximum depth
of the tree.
 Space complexity?
 O(bm) (depth-first search, generate all actions at once)
 O(m) (backtracking search, generate actions one at a time)
Alpha-beta(α-β) pruning
 We can improve on the performance of the minimax algorithm through alpha-beta
pruning.
 Basic idea: “If you have an idea that is surely bad, don't take the time to see how truly awful it is.” --
Pat Winston

[Figure: a two-ply tree with leaf values 2, 7, 1, ?. The first MIN node has value 2, so the root is >= 2; after seeing the leaf 1, the second MIN node is <= 1.]
• We don't need to compute the value at the "?" node.
• No matter what it is, it can't affect the value of the root node.
Alpha-beta (α-β) pruning
 Traverse the search tree in depth-first order - only considers nodes along a single path at any
time
 At each MAX node n, α(n) = maximum value found so far
 Start with -∞ and only increase
 Increases if a child of n returns a value greater than the current α
 At each MIN node n, β(n) = minimum value found so far
 Start with +∞ and only decrease
 Decreases if a child of n returns a value less than the current β

 Update the values of α and β during the search and prune the remaining branches as soon as the value is known to be worse than the current α or β value for MAX or MIN
Alpha-beta(α-β) pruning
Condition for Alpha-beta pruning:
 The main condition which is required for alpha-beta pruning is: α>=β

Key points about alpha-beta pruning:
 The MAX player will only update the value of alpha.
 The MIN player will only update the value of beta.
 While backtracking the tree, the node values are passed to upper nodes instead of the values of alpha and beta.
 Only the alpha and beta values are passed down to the child nodes.
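These rules can be sketched as a recursive procedure. This is a minimal illustration on an explicit tree of my own (leaves are numbers, internal nodes are lists of children), not a production game engine.

```python
# A minimal sketch of minimax with alpha-beta pruning. MAX nodes only
# update alpha, MIN nodes only update beta, and a branch is cut off as
# soon as the pruning condition alpha >= beta holds.

def alphabeta(node, maximizing, alpha=float('-inf'), beta=float('inf')):
    if isinstance(node, (int, float)):          # leaf: utility value
        return node
    if maximizing:
        value = float('-inf')
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)           # MAX only updates alpha
            if alpha >= beta:                   # prune remaining children
                break
        return value
    else:
        value = float('inf')
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta))
            beta = min(beta, value)             # MIN only updates beta
            if alpha >= beta:
                break
        return value

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, True))                    # same root value as minimax: 3
```

Note that the value returned at the root is identical to plain minimax; only the amount of work differs.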

Alpha-Beta (α-β) Example

Step 1:
The MAX player starts the first move from node A, where α = -∞ and β = +∞. These values of alpha and beta are passed down to node B, where again α = -∞ and β = +∞, and node B passes the same values to its child D.
Alpha-Beta (α-β) Example

Step 2:
At node D, the value of α will be calculated, as it is MAX's turn. The value of α is compared first with 2 and then with 3, and max(2, 3) = 3 becomes the value of α at node D; the node value will also be 3.
Alpha-Beta (α-β) Example

Step 3:
The algorithm now backtracks to node B, where the value of β will change, as this is MIN's turn. β = +∞ is compared with the available successor node value, i.e. min(∞, 3) = 3; hence at node B, α = -∞ and β = 3.

In the next step, the algorithm traverses the next successor of node B, which is node E, and the values α = -∞ and β = 3 are passed down as well.
Alpha-Beta (α-β) Example
Step 4:
At node E, MAX takes its turn, and the value of alpha changes. The current value of alpha is compared with 5, so max(-∞, 5) = 5; hence at node E, α = 5 and β = 3, where α >= β.

The right successor of E is therefore pruned, and the algorithm will not traverse it; the value at node E will be 5.
Alpha-Beta (α-β) Example
Step 5:
In the next step, the algorithm again backtracks the tree, from node B to node A. At node A, the value of alpha is changed: the maximum available value is 3, as max(-∞, 3) = 3, and β = +∞. These two values are now passed to the right successor of A, which is node C.

At node C, α = 3 and β = +∞, and the same values are passed on to node F.
Alpha-Beta (α-β) Example
Step 6:
At node F, the value of α is again compared with the left child, which is 0, so max(3, 0) = 3; it is then compared with the right child, which is 1, and max(3, 1) = 3. α remains 3, but the node value of F becomes 1.
Alpha-Beta (α-β) Example

Step 7:
Node F returns the node value 1 to node C; at C, α = 3 and β = +∞. Here the value of beta is changed: it is compared with 1, so min(∞, 1) = 1. Now at C, α = 3 and β = 1, which again satisfies the condition α >= β, so the next child of C, which is G, is pruned, and the algorithm does not compute the entire subtree of G.
Alpha-Beta (α-β) Example
Step 8:
C now returns the value 1 to A. Here the best value for A is max(3, 1) = 3.

The final game tree shows the nodes which were computed and the nodes which were never computed. Hence the optimal value for the maximizer is 3 in this example.
Effectiveness of Alpha-beta pruning
 Alpha-Beta is guaranteed to compute the same value for the root node as computed by
Minimax.

 Worst case: no pruning; the branches are ordered so that no pruning takes place. In this case alpha-beta gives no improvement over exhaustive search: the best move occurs on the right side of the tree, and the time complexity for such an ordering is O(b^m).

 Best case: each player's best move is the leftmost alternative, i.e. at MAX nodes the child with the largest value is generated first, and at MIN nodes the child with the smallest value is generated first. With this ordering, alpha-beta examines only O(b^(m/2)) nodes, effectively reducing the branching factor from b to √b.
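The effect of move ordering can be demonstrated with an instrumented alpha-beta that counts leaf evaluations. The trees below are my own small examples: the same leaf values, visited best-branch-first versus best-branch-last.

```python
# A small sketch: the same two-ply tree evaluated with alpha-beta, counting
# how many leaves are touched. Good ordering (best MIN branch first) prunes
# more than bad ordering (best MIN branch last).

def alphabeta_counting(node, maximizing, alpha, beta, counter):
    if isinstance(node, (int, float)):
        counter[0] += 1                  # count each leaf actually evaluated
        return node
    value = float('-inf') if maximizing else float('inf')
    for child in node:
        v = alphabeta_counting(child, not maximizing, alpha, beta, counter)
        if maximizing:
            value, alpha = max(value, v), max(alpha, v)
        else:
            value, beta = min(value, v), min(beta, v)
        if alpha >= beta:                # pruning condition
            break
    return value

good = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]   # best MIN branch explored first
bad  = [[2, 4, 6], [14, 5, 2], [3, 12, 8]]   # best MIN branch explored last
for tree in (good, bad):
    c = [0]
    alphabeta_counting(tree, True, float('-inf'), float('inf'), c)
    print(c[0])                          # good: 7 leaves, bad: all 9 leaves
```

Both orderings return the same root value; only the number of leaves examined differs, which is why real game programs invest heavily in move-ordering heuristics.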
Games : State-of-the-Art
 Checkers: Chinook ended 40-year-reign of human world champion Marion Tinsley in
1994. Used an endgame database defining perfect play for all positions involving 8 or fewer
pieces on the board, a total of 443,748,401,247 positions. Checkers is now solved!
 Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue examined 200 million positions per second, used very sophisticated evaluation, and employed undisclosed methods for extending some lines of search up to 40 ply. Current programs are even better, if less historic.
 Othello: In 1997, Logistello defeated the human champion by six games to none. Human champions refuse to compete against computers, which are too good.
 Go: Human champions are beginning to be challenged by machines, though the best
humans still beat the best machines. In Go, b > 300, so most programs use pattern
knowledge bases to suggest plausible moves, along with aggressive pruning.
 Backgammon: the neural-net learning program TD-Gammon is among the world's top three players.
