3. Agent Definition
An agent is an entity which is:
• Situated in some environment.
• Autonomous, in the sense that it can act
without direct intervention from humans or
other software processes, and has control over
its own actions and internal state.
• Flexible, which means:
• Responsive (reactive): agents should perceive
their environment and respond to changes that
occur in it;
• Proactive: agents should not simply act in
response to their environment, they should be
able to show opportunistic, goal-directed
behavior and take the initiative when
appropriate;
• Social: agents should be able to interact with
humans or other artificial agents.
4. Structure of an Agent
An agent has two parts:
1. Architecture: hardware with sensors and
actuators.
2. Program: converts percepts into actions.
The agent takes sensory input from its environment
and produces as output actions that affect it.
The agent function maps from percept histories to
actions.
5. Agents
• An agent is anything that can be viewed as
perceiving its environment through sensors
and acting upon that environment through
actuators
• Software agent:
– keystrokes, file contents, and received network
packets for sensors;
– screen output, written files, and sent network
packets for actuators
• Human agent:
– eyes, ears, and other organs for sensors;
– hands, legs, mouth, and other body parts for
actuators
• Robotic agent:
– cameras and infrared range finders for sensors
– various motors for actuators
7. Rationality and Rational Agents
“Rationality is the quality or state of being rational –
that is, being based on or agreeable to reason.
Rationality implies the conformity of one's beliefs with
one's reasons to believe, and of one's actions with
one's reasons for action” (Wikipedia).
We humans exhibit rational behavior from birth.
Rational Agent: For each possible percept sequence, a
rational agent should select an action that is expected
to maximize its performance measure, given the
evidence provided by the percept sequence and
whatever built-in knowledge the agent has.
A rational agent is one that does the right thing;
Obviously, doing the right thing is better than doing
the wrong thing, but what does it mean to do the right
thing?
8. So how do we identify whether an agent is doing
the right thing?
We answer this by considering the consequences of
the agent’s behavior.
• When an agent is plunked down in an environment,
it generates a sequence of actions according to the
percepts it receives.
• This sequence of actions causes the environment to
go through a sequence of states.
• If the sequence is desirable, then the agent
has performed well.
• This concept of desirability is captured by a
performance measure that evaluates any given
sequence of environment states.
Generally speaking, it is better to design performance
measures according to what one actually wants in the
environment, rather than according to how one thinks
the agent will behave.
9. Agents and environments
[Figure: the agent receives percepts from the
environment through sensors, its agent program ("?")
maps them to actions, and its actuators act on the
environment.]
Agents include humans, robots, softbots,
thermostats, etc.
The agent function maps from percept histories to
actions:
f : P* → A
The agent program runs on the physical architecture
to produce f: agent = architecture + program
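The mapping from percept histories to actions can be sketched in Python. This is a hypothetical illustration (the names `table_driven_agent_program` and the lookup table are invented here, not from any library): the program accumulates the percept history and looks it up in a table realizing f : P* → A.

```python
# Sketch of an agent program implementing f : P* -> A as a table lookup.
# All names here are illustrative.

def table_driven_agent_program(table):
    """Return an agent program closed over a table mapping
    percept histories (tuples of percepts) to actions."""
    percepts = []
    def program(percept):
        percepts.append(percept)          # extend the percept history
        return table.get(tuple(percepts)) # look up the whole history
    return program

# A tiny table for a two-square vacuum world:
table = {
    (("A", "Dirty"),): "Suck",
    (("A", "Clean"),): "Right",
    (("A", "Clean"), ("B", "Dirty")): "Suck",
}

agent = table_driven_agent_program(table)
print(agent(("A", "Clean")))  # Right
print(agent(("B", "Dirty")))  # Suck
```

The table-driven approach makes the agent function explicit but is impractical in general: the table grows with every possible percept history.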
11. A vacuum-cleaner agent
Percept sequence          Action
[A, Clean]                Right
[A, Dirty]                Suck
[B, Clean]                Left
[B, Dirty]                Suck
[A, Clean], [A, Clean]    Right
[A, Clean], [A, Dirty]    Suck
...
function Reflex-Vacuum-Agent( [location,status]) returns an action
if status = Dirty then return Suck else
if location = A then return Right else
if location = B then return Left
What is the right function?
Can it be implemented in a small agent program?
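The pseudocode above can indeed be implemented in a small program. A minimal Python sketch, assuming percepts arrive as (location, status) pairs:

```python
def reflex_vacuum_agent(percept):
    """Condition-action rules for the two-square vacuum world:
    suck if the current square is dirty, otherwise move to the
    other square."""
    location, status = percept
    if status == "Dirty":
        return "Suck"
    if location == "A":
        return "Right"
    if location == "B":
        return "Left"

print(reflex_vacuum_agent(("A", "Dirty")))  # Suck
print(reflex_vacuum_agent(("B", "Clean")))  # Left
```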
12. PEAS
To design a rational agent, we must specify the task
environment.
In our discussion of the rationality of the simple
vacuum-cleaner agent, we had to specify the
performance measure, the environment, and the
agent's actuators and sensors. We group all of these
under the heading of the task environment and call
it PEAS:
• Performance measure
• Environment
• Actuators
• Sensors
Now consider the task of designing a self-driving car.
15. Internet shopping agent
Performance measure? price, quality, appropriateness,
efficiency
Environment? current and future WWW sites, vendors,
shippers
Actuators? display to user, follow URL, fill in form
Sensors? HTML pages (text, graphics, scripts)
17. Fully observable (vs. partially observable)
• Is everything the agent requires to choose its
actions available to it via its sensors? If so, the
agent has perfect (full) information and the
environment is fully observable.
• If not, parts of the environment are inaccessible,
and the agent must make informed guesses about
the world.
• In decision theory: perfect information vs.
imperfect information.
Cross Word          Fully
Backgammon          Fully
Taxi driver         Partially
Part picking robot  Partially
Poker               Partially
Image analysis      Fully
18. Deterministic (vs. stochastic)
If the next state of the environment is completely
determined by the current state and the actions of
the agent, then the environment is deterministic;
otherwise, it is non-deterministic
• Does the change in world state depend only on
the current state and the agent's action?
• Non-deterministic environments:
have aspects beyond the control of the agent;
utility functions have to guess at changes in
the world.
Cross Word          Deterministic
Backgammon          Stochastic
Taxi driver         Stochastic
Part picking robot  Stochastic
Poker               Stochastic
Image analysis      Deterministic
19. Episodic (vs. sequential)
• In episodic environments:
the choice of the current action does not
depend on previous actions.
• In non-episodic environments:
Agent has to plan ahead, current choice will
affect future actions.
Episodic environments: mail-sorting systems,
expert-advice systems, etc.
Non-episodic environment: a game of chess.
20. Static (vs. dynamic)
• Static environments don't change while the
agent is deliberating over what to do.
• Dynamic environments do change, so the agent
should/could consult the world when choosing
actions.
• Semi dynamic: If the environment itself does not
change with the passage of time but the agent's
performance score does.
Example: Off-line route planning vs. on-board
navigation system
21. Discrete (vs. continuous)
An environment is said to be discrete if there are a
finite number of percepts and actions that can be
performed within it, as opposed to a range of values
(continuous).
Discrete environment:
A game of chess or checkers where there are a set
of moves.
Continuous environment:
taxi driving, where a route could go from
anywhere to anywhere else.
22. Single agent (vs. multiagent)
• An agent may operate by itself in an environment,
or many agents may work together.
• The environment may contain other agents, which
may be of the same or a different kind than the
agent.
Cross Word          Single
Backgammon          Multi
Taxi driver         Multi
Part picking robot  Single
Poker               Multi
Image analysis      Single
23. Environment types (summary)

                    Observable  Deterministic  Episodic    Static   Discrete    Agents
Cross Word          Fully       Deterministic  Sequential  Static   Discrete    Single
Backgammon          Fully       Stochastic     Sequential  Static   Discrete    Multi
Taxi driver         Partially   Stochastic     Sequential  Dynamic  Continuous  Multi
Part picking robot  Partially   Stochastic     Episodic    Dynamic  Continuous  Single
Poker               Partially   Stochastic     Sequential  Static   Discrete    Multi
Image analysis      Fully       Deterministic  Episodic    Semi     Continuous  Single
24. Environment types (summary)

                Peg Solitaire  Backgammon  Internet shopping      Taxi
Observable?     Yes            Yes         No                     No
Deterministic?  Yes            No          Partly                 No
Episodic?       No             No          No                     No
Static?         Yes            Semi        Semi                   No
Discrete?       Yes            Yes         Yes                    No
Single-agent?   Yes            No          Yes (except auctions)  No

The environment type largely determines the agent design.
The real world is (of course) partially observable, stochastic,
sequential, dynamic, continuous, and multi-agent.
25. Agent types
Four basic types, in order of increasing generality:
– simple reflex agents
– model-based reflex agents (reflex agents with state)
– goal-based agents
– utility-based agents
All of these can be turned into learning agents.
26. Simple Reflex Agent
• Simple reflex agents ignore the rest of the percept
history and act only on the basis of the current
percept.
• The percept history is everything the agent has
perceived to date.
• The agent function is based on condition-action
rules.
• Very limited intelligence.
• No knowledge of non-perceptual parts of the state.
• The full condition-action table is usually too big
to generate and store.
• A boundary-following robot is a simple reflex
agent.
28. Example
function Reflex-Vacuum-Agent( [location,status]) returns an action
if status = Dirty then return Suck else
if location = A then return Right else
if location = B then return Left
(setq joe (make-agent :body (make-agent-body)
                      :program
                      #'(lambda (percept)
                          (destructuring-bind (location status) percept
                            (cond ((eq status 'Dirty) 'Suck)
                                  ((eq location 'A) 'Right)
                                  ((eq location 'B) 'Left))))))
29. Problems with simple reflex agents
Simple reflex agents fail in partially observable
environments.
E.g., suppose the vacuum-cleaner agent's location
sensor is missing.
The agent (presumably) Sucks if Dirty; but what if Clean?
⇒ infinite loops are unavoidable
Randomization helps (why?), but not that much.
Not flexible: the rules must be updated whenever the
environment changes.
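Why randomization helps: a location-blind agent that always moves the same way can shuttle forever between clean squares, while a random move eventually reaches the dirty one. A minimal sketch, assuming the sensor now reports only the status:

```python
import random

def randomized_reflex_vacuum_agent(status):
    """Location-blind reflex agent: Suck when Dirty; otherwise
    pick a direction at random to break deterministic loops.
    (Illustrative sketch, not from any specific library.)"""
    if status == "Dirty":
        return "Suck"
    return random.choice(["Left", "Right"])
```

Randomization only mitigates the problem: the expected time to reach the dirty square grows with the environment, which is why internal state (a model) is the better fix.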
30. Model-Based Agents
• Finds a rule whose condition matches the current
situation. It can handle partially observable
environments by using a model.
• The agent has internal state, adjusted by each
percept, that depends on the percept history.
• The current state is stored inside the agent,
describing the part of the world that cannot be
seen.
• Updating the state requires information about:
how the world evolves independently of the
agent;
how the agent's actions affect the world.
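The update loop above can be sketched in Python. This is a hypothetical illustration (all names are invented here): the model is an `update_state` function applied to the old state, the last action, and the new percept, after which a matching condition-action rule fires.

```python
def make_model_based_agent(update_state, rules, initial_state):
    """Model-based reflex agent sketch: internal state tracks
    the parts of the world the sensors cannot currently see."""
    state = dict(initial_state)
    last_action = None

    def program(percept):
        nonlocal state, last_action
        # Model step: how the world evolves and how the agent's
        # own last action affected it.
        state = update_state(state, last_action, percept)
        # Find a rule whose condition matches the current state.
        for condition, action in rules:
            if condition(state):
                last_action = action
                return action
        last_action = None
        return "NoOp"
    return program

# Example model for the two-square vacuum world: remember the
# last observed status of each square.
def update(state, last_action, percept):
    location, status = percept
    new_state = dict(state, location=location)
    new_state[location] = status
    return new_state

rules = [
    (lambda s: s.get(s["location"]) == "Dirty", "Suck"),
    (lambda s: s["location"] == "A", "Right"),
    (lambda s: s["location"] == "B", "Left"),
]

agent = make_model_based_agent(update, rules, {})
print(agent(("A", "Dirty")))  # Suck
print(agent(("A", "Clean")))  # Right
```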
33. Goal-based agents
• An extension of model-based agents.
• Take decisions based on how far they currently
are from their goal.
• Every action is intended to reduce the distance
to the goal.
• The agent chooses among multiple possibilities,
selecting one that reaches a goal state.
• Requires searching and planning.
• The agent needs some way of looking into the
future.
Has a goal: a destination to get to.
Uses knowledge about the goal to guide its actions,
e.g., search and planning.
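The "looking into the future" can be made concrete with a search. A minimal breadth-first search sketch (names and the toy world are illustrative): the agent plans a shortest action sequence to the goal and executes its first step.

```python
from collections import deque

def goal_based_action(start, goal, successors):
    """Return the first action on a shortest action sequence
    from start to goal; successors(state) yields
    (action, next_state) pairs. Breadth-first search sketch."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            return path[0] if path else None  # already at goal
        for action, nxt in successors(state):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, path + [action]))
    return None  # goal unreachable

# Toy world: cells 0..3 on a line; the goal is the rightmost cell.
def successors(s):
    moves = []
    if s < 3:
        moves.append(("Right", s + 1))
    if s > 0:
        moves.append(("Left", s - 1))
    return moves

print(goal_based_action(0, 3, successors))  # Right
```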
35. Utility-based agents
• The main focus is on utility, not just the goal.
• Used when there are multiple possible
alternatives.
• Goals alone are not always enough:
many action sequences get the taxi to its
destination;
consider other things: how fast, how safe, ...
• A utility function maps a state onto a real number
that describes the associated degree of
"happiness", "goodness", or "success".
• Where does the utility measure come from?
Economics: money.
Biology: number of offspring.
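The action-selection step can be sketched in a few lines. A hypothetical illustration (route names, numbers, and the utility formula are all invented for the example): the agent predicts the outcome of each action with its model and picks the one maximizing utility.

```python
def utility_based_action(state, actions, result, utility):
    """Choose the action whose predicted outcome has the highest
    utility; result() plays the role of the agent's world model.
    Illustrative sketch."""
    return max(actions, key=lambda a: utility(result(state, a)))

# Toy taxi choice: trade travel time against risk.
routes = {"highway":  {"time": 10, "risk": 0.30},
          "backroad": {"time": 25, "risk": 0.05}}

def result(state, action):
    return routes[action]

def utility(outcome):
    # "Happiness" as one real number: penalize time and risk.
    return -outcome["time"] - 100 * outcome["risk"]

print(utility_based_action("start", ["highway", "backroad"],
                           result, utility))  # backroad
```

Changing the weight on risk flips the choice, which is exactly the "how fast vs. how safe" trade-off a bare goal cannot express.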
36. Utility-based agents
[Figure: utility-based agent architecture. Sensors report
what the world is like now; the model (how the world
evolves, what my actions do) predicts what it will be like
if I do action A; the utility function scores how happy I
will be in such a state; the agent then decides what
action to do now and executes it through actuators.]
37. Learning Agents
• Learn from past experiences.
• The performance element is what was previously
the whole agent:
input: sensors;
output: actions.
• The learning element modifies the performance
element.
38. Learning Agents
A learning agent has four components:
1. Learning element: responsible for making
improvements by learning from the environment.
2. Critic: the learning element takes feedback from
the critic, which describes how well the agent is
doing with respect to a fixed performance
standard.
3. Performance element: responsible for selecting
external actions, based on percepts and feedback
from the learning element.
4. Problem generator: tries things differently instead
of only optimizing; suggests actions that will lead
to new and informative experiences.
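The wiring of the four components can be sketched as follows. This is a deliberately simplified illustration (the class, its rule table, and the numeric feedback are all invented here): the critic compares a reward against the fixed standard, and the learning element revises the performance element's rules when feedback is bad.

```python
class LearningAgent:
    """Sketch of the four components; all names illustrative."""

    def __init__(self, rules):
        self.rules = rules      # percept -> action knowledge
        self.standard = 0.0     # fixed performance standard

    def performance_element(self, percept):
        # Selects the external action.
        return self.rules.get(percept, "NoOp")

    def critic(self, reward):
        # Feedback relative to the fixed performance standard.
        return reward - self.standard

    def learning_element(self, percept, feedback):
        # Modifies the performance element when feedback is bad.
        if feedback < 0:
            self.rules[percept] = "NoOp"

    def problem_generator(self, percept):
        # Suggests an informative experiment for unseen percepts.
        return "Explore" if percept not in self.rules else None

agent = LearningAgent({"Dirty": "Suck"})
action = agent.performance_element("Dirty")   # Suck
feedback = agent.critic(reward=-1.0)          # below the standard
agent.learning_element("Dirty", feedback)     # rule gets revised
print(agent.performance_element("Dirty"))     # NoOp
```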
39. Learning Agent Example
• Performance element: how the taxi currently
drives.
• The taxi driver makes a quick left turn across
three lanes:
the critic observes the shocking language from
the passenger and other drivers and reports a
bad action;
the learning element tries to modify the
performance element for the future;
the problem generator suggests experimenting,
e.g., trying the brakes on different road
conditions.
• Exploration vs. exploitation:
learning experience can be costly in the short
run:
shocking language from other drivers;
less tip;
fewer passengers.
40. Different forms of learning
• Rote learning or memorization:
least amount of inferencing;
knowledge is copied into the knowledge base.
• Learning through instruction.
• Learning by analogy:
development of new concepts through already
known similar concepts.
• Learning by induction:
conclusions drawn based on a large number of
examples.
• Learning by deduction:
an irrefutable form of reasoning;
conclusions drawn are always correct, if the
given facts are correct.
• Learning based on feedback:
supervised;
unsupervised.
41. Summary
Agents interact with environments through actuators and sensors.
The agent function describes what the agent does in all
circumstances. The performance measure evaluates the
environment sequence.
A perfectly rational agent maximizes expected
performance. Agent programs implement (some)
agent functions.
PEAS descriptions define task environments.
Environments are categorized along several
dimensions:
Observable? Deterministic? Episodic? Static?
Discrete? Single-agent?
Several basic agent architectures exist: simple reflex,
model-based, goal-based, utility-based, and learning
agents.