GP-Module 5
Content
• Games for Artificial Intelligence, the Game AI Panorama;
• AI Methods: Tree Search, Evolutionary Computation,
Supervised Learning & Reinforcement Learning.
Games for Artificial Intelligence
• There are a number of reasons why games offer an ideal domain for the study of
artificial intelligence.
• It is the complexity and interestingness of games as problems that make them
desirable for AI.
• From a computational complexity perspective, many games are NP-hard (NP refers to
“nondeterministic polynomial time”), meaning that the worst-case complexity of
“solving” them is very high.
• In other words, in the general case an algorithm for solving a particular game could run
for a very long time.
• Depending on a game’s properties, complexity can vary substantially.
• Examples of NP-hard games include Mastermind, the arcade game Lemmings
(Psygnosis, 1991), and the Minesweeper game by Microsoft.
Why Artificial Intelligence for Games?
1. Methods (Computer) Perspective
• The first panoramic view of game AI we present is centered around the AI methods used in the field.
• Evolutionary computation is a dominant method for playing to win, for generating content (in an
assisted/mixed-initiative fashion or autonomously), and for modeling players.
• It has also been considered in research on believable play (play for experience).
• Supervised learning is of substantial use across the game AI areas and appears to be dominant in
player experience and behavioral modeling, as well as in the area of AI that plays for experience.
• Behavior authoring, on the other hand, is useful solely for game-playing.
• Reinforcement learning and unsupervised learning find limited use across the game AI areas, being
dominant only in AI that plays to win and in player behavior modeling, respectively.
• Finally, tree search finds use primarily in playing to win, and it is also considered—as a form of
planning—for controlling play for experience and in computational narrative.
2. End User (Human) Perspective
• The second panoramic view of the game AI field puts an
emphasis on the end user of the AI technology or its general
outcome (product or solution).
• Towards that aim we investigate three core dimensions of the
game AI field and classify all game AI areas with respect to the
process AI follows, the game context under which algorithms
operate and, finally, the end user that benefits most from the
resulting outcome.
Three core dimensions of game AI
• In general, what can AI do within games? AI can model or generate.
For instance, an artificial neural network can model a playing pattern,
or a genetic algorithm can generate game assets.
• What can AI methods model or generate in a game? The two possible
classes here are content and behavior. For example, AI can model a
player’s affective state, or generate a level.
• Finally, the third dimension is the end user: AI can model, or generate,
either content or behavior; but for whom? The classes under the third
dimension are the designer, the player, the AI researcher, and the
producer/publisher.
3. Player–Game Interaction Perspective
• Putting an emphasis on player experience and behavior, player modeling directly
focuses on the interaction between a player and the game context.
• Game content is influenced primarily by research on autonomous procedural
content generation. In addition to other types of content, most games feature
NPCs, the behavior of which is controlled by some form of AI.
• NPC behavior is informed by research on NPCs that play the game to win or for any
other playing-experience purpose such as believability.
What is “Game AI”?
• The term “Game AI” is used to refer to a broad set of algorithms that also
include techniques from control theory, robotics, computer
graphics and computer science in general.
• Most video games include various non-player characters (NPCs). These are
controlled by the computer software in some way, known as Game AI.
• In fact, for many game developers “Game AI” refers to the program code that
controls the NPCs, regardless of how simple or sophisticated that code is.
Game AI
• The first application of Game AI was in the development of Nim,
a mathematical game of strategy in which two players take
turns removing objects from distinct heaps. On each turn, a
player must remove at least one object, and may remove any
number of objects provided they all come from the same heap.
The goal of the game is to be the player who removes the last
object. The AI for the game was developed in 1951 by
Christopher Strachey.
• The second was a chess-playing program developed in 1952 by Dietrich Prinz.
Nim and Chess games
Examples of games with very good AI
1. Playing To Win In A Player Role
The most common use of AI together with games in academic settings is to play to
win, while taking the role of a human player. This is especially common when using
games as an AI testbed. AI programs have been used to give tough competition to
human players. Human-like playing is possible too!

Games are excellent testbeds for artificial intelligence for a number of reasons.
Games are made to test human intelligence, and as a result they offer the kind of
gradual skill progression that allows for testing of AI at different capability levels.

AI is used for testing games too! When designing a new game, or a new game level,
you can use a game-playing agent to test whether the game or level is playable,
so-called simulation-based testing.

There are some games where you need strong AI to provide a challenge to players,
including many strategic games such as classic board games: Chess, Checkers and Go.
However, for games with hidden information, it is often easier to provide challenge
by simply “cheating”, for example, by giving the AI player access to the hidden state
of the game or even by modifying the hidden state so as to make it harder to play
for the human.
2. Playing To Win In A Non-Player Role
• The role most commonly associated with AI in the gaming industry is
controlling Non-Playable Characters (NPCs).
• Non-player characters are very often designed not to offer maximum
challenge or otherwise be as effective as possible, but instead to be
entertaining or humanlike.
• Strategy games such as Civilization and XCOM: Enemy Unknown need
play-to-win NPC AI for performing roles that humans do not play.
• NPC playing to win is an essential component of AI in many racing
games, where the NPC cars adapt to the speed of the human players
so that they are never too far behind or too far ahead.
Playing To Win In A Non-Player Role (contd)
3. Playing For Experience In The Player Role
4. Playing For Experience In A Non-Player Role
• The most common goal for game-playing AI in the game industry is to make non-
player characters act, almost always in ways which are not primarily meant to beat
the player or otherwise “win” the game.
• NPCs may exist in games for many, sometimes overlapping, purposes: to act as
adversaries, to provide assistance and guidance, to form part of a puzzle, to tell a
story, to provide a backdrop to the action of the game, to be emotively expressive,
and so on.
• NPCs’ roles vary widely, ranging from side characters to the game’s main boss.
It all depends on the coding and the algorithm used behind the AI.
• A large part of the challenge posed by AI is for the player to memorize the challenge
and use counters against it.
• The AI’s algorithm should be developed in a way that is not boring for the player.
Which Games Can AI Play?
Which Games Can AI Play? (contd)
Board Games
• Easy to implement AI algorithms
• Chess is commonly used for AI research
• Adversarial planning
• Skill demands are narrow
• Most board games have very simple
discrete state representations and
deterministic forward models
Card Games
• Card games are games centered on one
or several decks of cards
• Almost all card games feature a large
degree of hidden information
• Poker is a good example
• Prediction, action, reaction
Which Games Can AI Play? (contd)
Strategy Games
• Strategy games are games where the player controls multiple characters or units,
and the objective of the game is to prevail in some sort of conquest or conflict.
• They come in turn-based and real-time varieties. They are difficult domains for
developing AI due to the large number of possible outcomes.
Which Games Can AI Play? (contd)
Racing Games
• Racing games are games where the player is tasked with controlling some
kind of vehicle or character so as to reach a goal in the shortest possible
time, or to traverse as far as possible along a track in a given time.
• Most racing games actually require multiple simultaneous tasks to be
executed and have significant skill depth.
• The agent is trained for the various tracks and conditions.
• Forza is a prominent example of training AI based on how humans drive on
the tracks.
Which Games Can AI Play? (contd)
First Person Shooter
• Shooters are often seen as fast-paced games where speed of
perception and reaction is crucial, and this is true to an extent,
although the speed of gameplay varies between different shooters.
• AI has the advantage of quicker reaction and movement over real people.
• But there are other cognitive challenges as well, including visual input,
orientation and movement in a complex three-dimensional
environment, predicting actions and locations of multiple adversaries,
and in some game modes also team-based collaboration.
AI Algorithms used in Games
• Ad-hoc authoring
• Tree search
• Evolutionary computation
• Supervised learning
• Reinforcement learning
• Unsupervised learning
Path Finding Algorithms
Pac-Man (1980)
Path Finding Algorithms
• Breadth First Search
• Depth First Search
• Dijkstra's Algorithm
• Greedy Search
• A*
• D*
Problem Statement
BREADTH FIRST SEARCH
Treat the neighbourhood as layers: first visit all nodes one step from the start, then
all nodes two steps away, and so on. In an unweighted graph, the first time the goal
is reached the path found is a shortest one.
[Figure sequence: a step-by-step BFS grid walkthrough; each frame marks cells as
Path, Visited, or Tentative (the frontier).]
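To make the layered expansion concrete, here is a minimal Python sketch on a
hypothetical 0/1 occupancy grid (the grid, function name, and 4-connected movement
rule are illustrative assumptions, not taken from the slides):

from collections import deque

def bfs_shortest_path(grid, start, goal):
    # Layer-by-layer search on a 4-connected grid of 0 (free) / 1 (blocked).
    rows, cols = len(grid), len(grid[0])
    frontier = deque([start])          # the current "tentative" layer
    parent = {start: None}             # doubles as the visited set
    while frontier:
        cell = frontier.popleft()
        if cell == goal:               # reconstruct the path back to the start
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in parent):
                parent[(nr, nc)] = cell
                frontier.append((nr, nc))
    return None                        # goal unreachable

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(bfs_shortest_path(grid, (0, 0), (2, 0)))
# [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0)]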
DFS
Depth first search in Trees
A tree is an undirected graph in which any two vertices are connected by exactly one path. In other words, any
acyclic connected graph is a tree. For a tree, we have traversal methods such as preorder, inorder, and postorder.
Depth first search in Graph
Recursive Approach:
Depth first search is a way of traversing graphs which is closely related to preorder
traversal of a tree. To turn tree traversal into a graph traversal algorithm, we
basically replace “child” by “neighbor”. But to prevent infinite loops, we keep track
of the vertices that are already discovered and do not visit them again.
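The recursive implementation was shown as a figure on the original slide; a minimal
Python sketch of the same idea (the adjacency-dict encoding and names are
illustrative) might look like this:

def dfs(graph, vertex, visited=None):
    # Preorder-style traversal: process a vertex, then recurse on its neighbors.
    if visited is None:
        visited = set()
    visited.add(vertex)
    print(vertex)                      # "visit" the vertex (preorder position)
    for neighbor in graph[vertex]:     # "child" replaced by "neighbor"
        if neighbor not in visited:    # skip already-discovered vertices
            dfs(graph, neighbor, visited)

graph = {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A', 'D'], 'D': ['B', 'C']}
dfs(graph, 'A')                        # prints A B D C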
Depth first search in Graph
Iterative Approach:
The non-recursive implementation of DFS is similar to the non-recursive implementation of BFS, but
differs from it in two ways: it uses a stack instead of a queue, and it delays checking whether a vertex
has been discovered until the vertex is popped from the stack, rather than checking before pushing it.
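A possible Python sketch of the stack-based version (again using an illustrative
adjacency-dict encoding):

def dfs_iterative(graph, start):
    # Stack-based DFS; a vertex is marked visited when popped, not when pushed.
    visited, stack, order = set(), [start], []
    while stack:
        vertex = stack.pop()           # LIFO stack instead of BFS's FIFO queue
        if vertex in visited:          # a vertex can be pushed more than once
            continue
        visited.add(vertex)
        order.append(vertex)
        stack.extend(graph[vertex])    # push all neighbors; duplicates filtered on pop
    return order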
Tic Tac Toe Using DFS
[Figure sequence: a step-by-step search walkthrough, with cells marked Path,
Visited, or Tentative.]
Dijkstra’s Algorithm
Dijkstra’s Algorithm
• Create a set (the shortest-path-tree set) that keeps track of vertices included in the
shortest path tree, i.e., whose minimum distance from the source is calculated and
finalized. Initially, this set is empty.
• Assign a distance value to all vertices in the input graph. Initialize all distance
values as ∞. Assign a distance value of 0 to the source vertex so that it is picked
first.
Continue…
• Pick a vertex u which is not in the set and has the minimum distance value.
• Include u in the set.
• Update the distance values of all vertices adjacent to u: if the distance to a vertex
through u is shorter than its current distance value, overwrite it. Repeat until the
set includes all vertices (see the sketch below).
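A compact Python sketch of these steps, using a binary heap to pick the minimum-
distance vertex; the edge weights are illustrative and only loosely echo the worked
example that follows:

import heapq

def dijkstra(graph, source):
    # graph: dict mapping vertex -> list of (neighbor, edge_weight) pairs.
    dist = {v: float('inf') for v in graph}
    dist[source] = 0                      # source is picked first
    done = set()                          # the finalized shortest-path-tree set
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)        # unfinalized vertex with minimum distance
        if u in done:
            continue
        done.add(u)                       # include u in the set
        for v, w in graph[u]:             # update distances of u's neighbors
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

graph = {'A': [('B', 6), ('C', 2)], 'B': [('D', 2)],
         'C': [('B', 1), ('D', 6)], 'D': []}
print(dijkstra(graph, 'A'))               # {'A': 0, 'B': 3, 'C': 2, 'D': 5}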
[Figure sequence: a worked Dijkstra example on a small graph with vertices A, B, C, D.
Starting from A, the table of tentative distances (initially 0, 6, 2, ∞, …) is updated
step by step; entries such as 3B and 5C record a new distance together with the
predecessor through which it was reached, until every vertex is finalized.]
Application
Google Maps! It uses more complex and efficient algorithms, but Dijkstra's algorithm
is the basis.
Greedy
“Focus on choosing the next best choice, whether or not it gives the best
overall solution”
Greedy
• A greedy algorithm is an algorithmic paradigm that follows the problem-solving heuristic of making the
locally optimal choice at each stage with the intent of finding a global optimum.
• In many problems a greedy strategy does not produce an optimal solution, but nonetheless a greedy
heuristic may yield locally optimal solutions that approximate a globally optimal solution in a reasonable
amount of time.
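In pathfinding terms this becomes greedy best-first search: always expand the node
that looks closest to the goal according to a heuristic h, ignoring the cost paid so
far. A minimal Python sketch (the graph, heuristic values, and names are illustrative):

import heapq

def greedy_best_first(graph, h, start, goal):
    # Always expand the node with the smallest h(n); fast but not optimal.
    frontier = [(h[start], start)]
    parent = {start: None}
    while frontier:
        _, node = heapq.heappop(frontier)   # next best-looking choice, cost ignored
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                heapq.heappush(frontier, (h[neighbor], neighbor))
    return None

graph = {'S': ['A', 'B'], 'A': ['G'], 'B': ['G'], 'G': []}
h = {'S': 3, 'A': 2, 'B': 6, 'G': 0}
print(greedy_best_first(graph, h, 'S', 'G'))  # ['S', 'A', 'G']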
Crystal Quest
• Crystal Quest is an action game originally for the Macintosh. It was written by Patrick Buckland for Casady &
Greene in 1987, and was ported to the Apple IIgs in 1989 by Bill Heineman. Ports were also made to
the Amiga, Game Boy, iOS, and Palm. It was notable for being the first game to support the color displays on
the Macintosh II.
• In the Macintosh computer game Crystal Quest the objective is to collect crystals, in a fashion similar to
the travelling salesman problem. The game has a demo mode, where the game uses a greedy algorithm to
go to every crystal. The artificial intelligence does not account for obstacles, so the demo mode often ends
quickly.
[Figure sequence: two candidate nodes A (H=2) and B (H=6); greedy search always
expands the node with the smaller heuristic value, here A.]
A* Algorithm
A* Algorithm – What does it do?
At each step, A* Search picks the node according to a value ‘f’, which is a parameter
equal to the sum of two other parameters, ‘g’ and ‘h’. At each step it picks the
node/cell having the lowest ‘f’ and processes that node/cell.
g = the movement cost to move from the starting point to a given square on the grid,
following the path generated to get there.
h = the estimated movement cost to move from that given square on the grid to the
final destination. This is often referred to as the heuristic (a smart guess).
A* Algorithm
1. Initialize the open list
2. Initialize the closed list
   put the starting node on the open list (you can leave its f at zero)
3. while the open list is not empty
   a) find the node with the least f on the open list, call it "q"
   b) pop q off the open list
   c) generate q's 8 successors and set their parents to q
   d) for each successor
      i) if successor is the goal, stop search
         successor.g = q.g + distance between successor and q
         successor.h = distance from goal to successor
         successor.f = successor.g + successor.h
      ii) if a node with the same position as successor is in the OPEN list
          which has a lower f than successor, skip this successor
      iii) if a node with the same position as successor is in the CLOSED list
          which has a lower f than successor, skip this successor
          otherwise, add the node to the open list
      end (for loop)
   e) push q on the closed list
end (while loop)
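The pseudocode above maps quite directly onto Python. The sketch below uses a
4-connected grid (rather than the 8 successors above) and a Manhattan-distance
heuristic; the grid encoding and all names are illustrative assumptions:

import heapq

def a_star(grid, start, goal):
    # A* on a 4-connected grid of 0 (free) / 1 (blocked); f = g + h.
    def h(cell):                                # heuristic: the "smart guess"
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])
    rows, cols = len(grid), len(grid[0])
    open_list = [(h(start), 0, start)]          # entries are (f, g, cell)
    parent, best_g = {start: None}, {start: 0}
    while open_list:
        f, g, q = heapq.heappop(open_list)      # node with the least f
        if q == goal:                           # goal reached: walk parents back
            path = []
            while q is not None:
                path.append(q)
                q = parent[q]
            return path[::-1]
        r, c = q
        for s in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= s[0] < rows and 0 <= s[1] < cols and grid[s[0]][s[1]] == 0:
                if g + 1 < best_g.get(s, float('inf')):  # better route to s found
                    best_g[s] = g + 1
                    parent[s] = q
                    heapq.heappush(open_list, (g + 1 + h(s), g + 1, s))
    return None                                 # open list empty: no path exists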
A* Algorithm
[Figure sequence: a worked example on nodes A, B, C, D annotated with heuristic
values (H) and path costs (G); at each step the node with the lowest F = G + H is
expanded until the goal is reached.]
D* Algorithm
D* behaves like A*, except that arc costs can change as the algorithm runs.
D* Algorithm
• This is the algorithm most games use when the terrain is not known/visible
to the player.
• There are three variants of the algorithm: the original D* (an informed incremental
search algorithm), Focused D* (an informed incremental heuristic search algorithm),
and D* Lite (built on Lifelong Planning A*).
D* Algorithm – An Example
Imagine exploring an unknown planet using a robotic vehicle. The robot moves along
the rugged terrain while using a range scanner to make precise measurements of the
ground in its vicinity. As the robot moves, it may discover that some parts are easier
to traverse than it originally thought. In other cases, it might realize that some direction
it was intending to go is impassable due to a large boulder or a ravine. If the goal is to
arrive at some specified coordinates, this problem can be viewed as a navigation
problem in an unknown environment. Sounds like pure Artificial Intelligence, doesn't it?
The Automated Cross-Country Unmanned Vehicle (XUV) is
equipped with laser radar and other sensors, and uses Stentz's
algorithm (D*) to navigate (courtesy of General Dynamics
Robotic Systems).
D* Algorithm
Bye – Bye Nobita!
“I am not going to help you anymore, you have to do it yourself now!”
[Figure sequence: a grid-world walkthrough in which the cell cost values are updated
step by step as blocked cells are discovered and the path is replanned.]
D* Algorithm
D* and its variants have been widely used for mobile robot and autonomous vehicle
navigation. Such navigation systems include a prototype system tested on the Mars
rovers Opportunity and Spirit and the navigation system of the winning entry in the
DARPA Urban Challenge, both developed at Carnegie Mellon University.
Minimax Algorithm
Tic-Tac-Toe
Initial State: board position, a 3×3 matrix of O's and X's.
Operators: putting O's or X's in vacant positions, alternately.
Terminal test: determines when the game is over.
Utility function:
e(p) = (number of complete rows, columns, or diagonals that are still open for the
player) – (number of complete rows, columns, or diagonals that are still open for
the opponent)
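A small Python sketch of this utility function (the board encoding and names are
illustrative, not from the slides):

def e(board, player, opponent):
    # e(p) = (lines still open for player) - (lines still open for opponent);
    # a line is "open" for a side if it contains none of the other side's marks.
    lines  = [[(r, c) for c in range(3)] for r in range(3)]                # rows
    lines += [[(r, c) for r in range(3)] for c in range(3)]                # columns
    lines += [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]  # diagonals
    def open_lines(blocker):
        # lines containing none of blocker's marks are open for the other side
        return sum(all(board[r][c] != blocker for r, c in line) for line in lines)
    return open_lines(opponent) - open_lines(player)

board = [['X', ' ', ' '],
         [' ', 'O', ' '],
         [' ', ' ', ' ']]
print(e(board, 'X', 'O'))   # 4 lines open for X, 5 for O -> prints -1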
Minimax Algorithm
• Generate the game tree
• Apply the utility function to each terminal state to get its value
• Use these values to determine the utility of the nodes one level
higher up in the search tree
– From bottom to top
– For a MAX level, select the maximum value of its successors
– For a MIN level, select the minimum value of its successors
• From the root node, select the move which leads to the highest value
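Minimax is easy to express over an explicit game tree. In the illustrative Python
sketch below, a tree is encoded as nested lists whose leaves are utility values:

def minimax(node, is_max):
    # node: either a leaf utility value or a list of child nodes.
    if not isinstance(node, list):      # terminal state: apply the utility function
        return node
    values = [minimax(child, not is_max) for child in node]
    return max(values) if is_max else min(values)  # MAX picks largest, MIN smallest

# A depth-2 tree: the root is MAX, its children are MIN nodes over leaf utilities.
tree = [[3, 5], [6, 9], [1, 2]]
print(minimax(tree, True))   # MIN values are 3, 6, 1 -> MAX picks 6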
Alpha Beta pruning
• Alpha is the best value that the maximizer currently can guarantee at that level or
above.
• Beta is the best value that the minimizer currently can guarantee at that level or
above.
Alpha Beta pruning (contd)
• D now looks at its right child, which returns a value of 5. At D, alpha = max(3, 5), which is 5.
Now the value of node D is 5.
• D returns a value of 5 to B. At B, beta = min(+INF, 5), which is 5. The minimizer is now
guaranteed a value of 5 or less. B now calls E to see if it can get a lower value than 5.
• At E the values of alpha and beta are not -INF and +INF but instead -INF and 5, respectively,
because the value of beta was changed at B and that is what B passed down to E.
• Now E looks at its left child, which is 6. At E, alpha = max(-INF, 6), which is 6. Here the cutoff
condition becomes true: beta is 5 and alpha is 6, so beta <= alpha holds. Hence E breaks and
returns 6 to B.
• Note how it did not matter what the value of E's right child is. It could have been +INF or -INF;
we never even had to look at it, because the minimizer was guaranteed a value of 5 or less. As
soon as the maximizer saw the 6, it knew the minimizer would never come this way, because it
can get a 5 on the left side of B. This way we didn't have to look at that subtree at all.
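The same nested-list tree encoding used in the minimax sketch above makes the
cutoff easy to see; the example tree reproduces the D/E subtree from this walkthrough:

def alphabeta(node, is_max, alpha=float('-inf'), beta=float('inf')):
    if not isinstance(node, list):     # leaf: return its utility value
        return node
    if is_max:
        value = float('-inf')
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)  # best the maximizer can guarantee so far
            if beta <= alpha:
                break                  # cutoff: MIN above will never allow this branch
        return value
    else:
        value = float('inf')
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta))
            beta = min(beta, value)    # best the minimizer can guarantee so far
            if beta <= alpha:
                break                  # cutoff: MAX above already has a better option
        return value

# The subtree rooted at B (a MIN node): D = [3, 5], E = [6, 9].
# At E the first leaf (6) already exceeds beta = 5, so the 9 is never examined.
print(alphabeta([[3, 5], [6, 9]], False))   # 5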
After pruning and repeating the steps for all the nodes
Use of Alpha Beta Pruning
• It reduces the computation time by a huge factor.
• This allows us to search much faster and even go into deeper
levels in the game tree.
• It cuts off branches in the game tree which need not be
searched because there already exists a better move available.
Reinforcement Learning
Many faces of Reinforcement Learning
Branches of Machine Learning
Characteristics of Reinforcement Learning
What makes reinforcement learning different from other
machine learning paradigms?
• There is no supervisor, only a reward signal
• Feedback is delayed, not instantaneous
• Time really matters (sequential, non i.i.d data)
• Agent's actions affect the subsequent data it receives
Examples of Reinforcement Learning
• Fly stunt manoeuvres in a helicopter
• Defeat the world champion at Backgammon
• Manage an investment portfolio
• Control a power station
• Make a humanoid robot walk
• Play many different Atari games better than humans
Rewards
• Reinforcement learning is based on the reward hypothesis
• All goals can be described by the maximisation of expected
cumulative reward
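In the standard formulation (as in Sutton and Barto's textbook, not spelled out on
this slide), the cumulative reward is the discounted return

G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \cdots
    = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1}, \qquad 0 \le \gamma \le 1,

and the agent's objective is to select actions that maximize the expected return E[G_t].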
Examples of Rewards
Agent and Environment
A* Algorithm
Graph example
[Figure sequence: a graph annotated with the heuristic function h(n) and the cost
function g(n), expanded step by step by A*.]
A Real Time application of A*
[Figure sequence.]
D* Algorithm
• The algorithm is named D* because it resembles A*, except
that it is dynamic in the sense that arc costs can change during
the traverse of the solution path.
• Provided that the traverse is properly coupled to the
replanning process, it is guaranteed to be optimal.
D* Lite
• Consider a goal-directed robot-navigation task in unknown terrain,
where the robot always observes which of its eight adjacent cells are
traversable and then moves with cost one to one of them.
• The robot starts at the start cell and has to move to the goal cell. It
always computes a shortest path from its current cell to the goal cell
under the assumption that cells with unknown blockage status are
traversable.
• It then follows this path until it reaches the goal cell, in which case it
stops successfully, or it observes an untraversable cell, in which case
it recomputes a shortest path from its current cell to the goal cell.
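A naive Python sketch of this sense-plan-act loop, reusing the a_star function
sketched earlier (this replans from scratch on every discovery; D* Lite produces the
same routes but repairs the previous search incrementally. The 4-connected planner
is a simplification of the paper's 8-connected robot, and all names are illustrative):

def navigate(true_grid, start, goal):
    # Freespace-assumption navigation: plan on the belief map, move one step,
    # sense the adjacent cells, and replan whenever new knowledge arrives.
    rows, cols = len(true_grid), len(true_grid[0])
    belief = [[0] * cols for _ in range(rows)]   # unknown cells assumed traversable
    pos, route = start, [start]
    while pos != goal:
        for dr in (-1, 0, 1):                    # sense the eight adjacent cells
            for dc in (-1, 0, 1):
                r, c = pos[0] + dr, pos[1] + dc
                if 0 <= r < rows and 0 <= c < cols:
                    belief[r][c] = true_grid[r][c]
        path = a_star(belief, pos, goal)         # replan from the current cell
        if path is None:
            return None                          # goal unreachable given what we know
        pos = path[1]                            # take one step along the plan
        route.append(pos)
    return route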
D* Lite (Contd)
• Figure 1 shows the goal distances of all traversable cells and the shortest paths from
the robot's current cell to the goal cell, both before and after the robot has moved
along the path and discovered the first blocked cell it did not know about.
D* Lite (Contd)
• Cells whose goal distances have changed are shaded gray. The goal distances are
important because one can easily determine a shortest path from the robot's current
cell to the goal cell by greedily decreasing the goal distances once the goal
distances have been computed.
• Notice that the number of cells with changed goal distances is small and most of the
changed goal distances are irrelevant for recalculating a shortest path from the current
cell to the goal cell.
• Thus, one can efficiently recalculate a shortest path from the current cell to the goal
cell by recalculating only those goal distances that have changed (or have not been
calculated before) and are relevant for recalculating the shortest path.
• This is what D* Lite does. The challenge is to identify these cells efficiently.
• Reference: https://www.youtube.com/watch?v=q77-uxsDZow
Navigation Mesh
• A navigation mesh (navmesh) is a collection of convex polygons describing the walkable surfaces
of a game environment. Pathfinding within one of these polygons can be done trivially in a straight
line because the polygon is convex and traversable. Pathfinding between polygons in the mesh can
be done with one of the large number of graph search algorithms, such as A*.
• Agents on a navmesh can thus avoid computationally expensive collision detection checks with
obstacles that are part of the environment.
• Navigation meshes can be created manually, automatically, or by some combination of the two. In
video games, a level designer might manually define the polygons of the navmesh in a level editor.
• This approach can be quite labor intensive. Alternatively, an application could be created that takes the
level geometry as input and automatically outputs a navmesh.
• It is commonly assumed that the environment represented by a navmesh is static – it does not change
over time – and thus the navmesh can be created offline and be immutable. However, there has been
some investigation of online updating of navmeshes for dynamic environments.
• Reference: https://www.youtube.com/watch?v=SMWxCpLvrcc