COURSE TITLE: ARTIFICIAL INTELLIGENCE

COURSE CODE: CSC 412

COURSE CONTENT

MODULE ONE: ARTIFICIAL INTELLIGENCE AND AGENTS
MODULE TWO: INTELLIGENT AGENTS
MODULE THREE: KNOWLEDGE REPRESENTATION IN ARTIFICIAL INTELLIGENCE
MODULE FOUR: STATE SPACE SEARCH PROBLEM
MODULE FIVE: HEURISTIC SEARCH
MODULE SIX: NATURAL LANGUAGE PROCESSING
MODULE SEVEN: PATTERN RECOGNITION
MODULE EIGHT: INTRODUCTION TO EXPERT SYSTEMS
MODULE TEN: MACHINE LEARNING AND CONCEPT FORMATION
MODULE ELEVEN: LISP LANGUAGE
MODULE ONE

ARTIFICIAL INTELLIGENCE AND AGENTS


1.1. Learning Outcomes
The following are the main points to be learnt from this module:
• Definition of Artificial Intelligence (AI) and its techniques
• Approaches and main sub-fields of AI
• Applications of AI
• History of AI

1.2. Introduction
Artificial Intelligence (AI) is a branch of science concerned with helping machines discover solutions to complex problems in a more human-like fashion. This generally involves adopting characteristics of human intelligence and applying them as algorithms in a computer-friendly way. Though AI is generally associated with Computer Science, it has many important links with other fields such as Mathematics, Psychology, Cognitive Science, Biology and Philosophy, among many others. Our ability to combine knowledge from all these fields will ultimately benefit our progress in the quest of creating an intelligent artificial being.

1.3. Definitions of Artificial Intelligence (AI)


AI is the part of computer science concerned with designing intelligent computer systems, that is, computer systems that exhibit the characteristics we associate with intelligence in human behaviour, such as understanding language, learning, reasoning and solving problems.

AI is the exciting new effort to make computers think: machines with minds, in the full and literal sense.

AI is the study and design of intelligent agents, where an intelligent agent is a system that perceives its environment and takes actions that maximize its chances of success. In brief, AI is concerned with developing computer systems that can store knowledge and effectively use that knowledge to help solve problems and accomplish tasks.

The above definitions give us four possible goals to pursue in artificial intelligence:
• Systems that think like humans.
• Systems that think rationally. (A system is rational if it does the right thing.)
• Systems that act like humans.
• Systems that act rationally.
 Thinking humanly: The cognitive modeling approach
In this approach the focus is not just on behaviour and input/output, but on the reasoning process: computational models of how the results are obtained. The goal is not merely to produce human-like behaviour, but a sequence of reasoning steps similar to the ones a human follows in solving the same task.

 Thinking rationally: The laws of thought approach
The focus in this approach is on deduction mechanisms that are demonstrably correct and assure an optimal solution. It develops systems of representation that allow deductions such as "Socrates is a man. All men are mortal. Therefore Socrates is mortal." The goal of the approach is to formalize the reasoning process as a system of logical rules and procedures for inference.

 Acting humanly: The Turing Test approach
The Turing Test, proposed by Alan Turing (Turing, 1950), was designed to provide a satisfactory operational definition of intelligence. Turing defined intelligent behaviour as the ability to achieve human-level performance in all cognitive tasks, sufficient to fool an interrogator. The test uses three rooms containing a person, a computer and an interrogator, where the interrogator can communicate with the other two by teletype. The interrogator tries to determine which is the person and which is the machine, while both try to convince the interrogator that they are human. If the machine succeeds in convincing the interrogator, then the machine is judged intelligent. The focus here is on behaviour, not on the internal reasoning process: the approach is not concerned with how the results are obtained, but with how similar the results are to those of a human.

 Acting rationally: The rational agent approach
Acting rationally means acting so as to achieve one's goals, given one's beliefs. An agent is just something that perceives and acts. In this approach, AI is viewed as the study and construction of rational agents: systems that act well, if not optimally, in all situations.

In the "laws of thought" approach to AI, emphasis is laid on correct inference. The ability to make correct inferences is sometimes part of being a rational agent, because one way to act rationally is to reason logically to the conclusion that a given action will achieve one's goals, and then act on that conclusion. On the other hand, correct inference is not all of rationality, because in some situations there is no demonstrably correct thing to do, yet something must still be done.

There are also ways of acting rationally that cannot reasonably be said to involve inference. For example, pulling one's hand off a hot stove is a reflex action that is more successful than a slower action taken after careful deliberation.

1.4 Approaches to AI
Various approaches to AI development, according to Ndunagu (2012), are:
i. Strong AI aims to build machines that can truly reason and solve problems. These machines should be self-aware, and their overall intellectual ability needs to be indistinguishable from that of a human being. Excessive optimism in the 1950s and 1960s concerning strong AI has given way to an appreciation of the extreme difficulty of the problem. Strong AI maintains that suitably programmed machines are capable of cognitive mental states.

ii. Weak AI deals with the creation of some form of computer-based artificial intelligence that cannot truly reason and solve problems but can act as if it were intelligent. Weak AI holds that suitably programmed machines can simulate human cognition.

iii. Applied AI aims to produce commercially viable "smart" systems, such as a security system that is able to recognize the faces of people who are permitted to enter a particular building. Applied AI has already enjoyed considerable success.

iv. Cognitive AI: computers are used to test theories about how the human mind works, for example theories about how we recognize faces and other objects, or about how we solve abstract problems.

1.5 Applications of AI
Some of the applications of AI include:
• Game Playing.
• Speech Recognition.
• Natural Language Understanding and Semantics.
• Automated Reasoning and Theorem Proving.
• Robotics.
• Computer Vision.
• Expert Systems.

1.6 AI Techniques
Various techniques can be applied to a variety of AI tasks. These techniques are concerned with how knowledge is represented, manipulated and reasoned with in order to solve problems. They include:

• Describe and match.
• Constraint satisfaction.
• Goal reduction.
• Generate and test.
• Tree searching.
• Rule-based systems.
• Neural networks.
• Genetic algorithms.
• Reinforcement learning.

1.7 A Brief History of AI
About 400 years ago people started to write about the nature of thought and reason. Hobbes (1588-1679), described by Haugeland (1985) as the "Grandfather of AI," espoused the position that thinking was symbolic reasoning, like talking out loud or working out an answer with pen and paper. The idea of symbolic reasoning was further developed by Descartes (1596-1650), Pascal (1623-1662), Spinoza (1632-1677), Leibniz (1646-1716), and others who were pioneers in the philosophy of mind.

The idea of symbolic operations became more concrete with the development of computers.
The first general-purpose computer designed (but not built until 1991, at the Science Museum
of London) was the Analytical Engine by Babbage (1792-1871). In the early part of the 20th
century, there was much work done on understanding computation. Several models of
computation were proposed, including the Turing machine by Alan Turing (1912-1954), a
theoretical machine that writes symbols on an infinitely long tape, and the lambda calculus of
Church (1903-1995), which is a mathematical formalism for rewriting formulas. It can be shown
that these very different formalisms are equivalent in that any function computable by one is
computable by the others. This leads to the Church-Turing thesis:

Any effectively computable function can be carried out on a Turing machine (and thus also in the lambda calculus or any of the other equivalent formalisms).

Here "effectively computable" means following well-defined operations; "computers" in Turing's day were people who followed well-defined steps, and computers as we know them today did not exist. The Church-Turing thesis cannot be proved, but it is a hypothesis that has stood the test of time: no one has built a machine that carries out computation that cannot be computed by a Turing machine, and there is no evidence that people can compute functions that are not Turing computable. An agent's actions are a function of its abilities, its history, and its goals or preferences. This provides an argument that computation is more than just a metaphor for intelligence: reasoning is computation, and computation can be carried out by a computer.

Once real computers were built, some of the first applications of computers were AI programs. For example, Samuel built a checkers program in 1952 and, in the late 1950s, implemented a program that learns to play checkers (Samuel, 1959). Newell and Simon (1956) built the Logic Theorist, a program that discovers proofs in propositional logic. In addition to this work on high-level symbolic reasoning, there was also much work on low-level learning inspired by how neurons work. McCulloch and Pitts (1943) showed how a simple thresholding "formal neuron" could be the basis for a Turing-complete machine. The first learning rule for these neural networks was described by Minsky (1952). One of the early significant works was the Perceptron of Rosenblatt (1958).

In 1961 James Slagle (in his PhD dissertation at MIT) wrote SAINT, a symbolic integration program. It was written in LISP and solved calculus problems at the college-freshman level. In 1963, Thomas Evans's program Analogy was developed, which could solve IQ-test-type analogy problems.

In 1963, Edward A. Feigenbaum & Julian Feldman published Computers and Thought, the first
collection of articles about artificial intelligence.

In 1965, J. Allen Robinson invented a mechanical proof procedure, the Resolution Method, which allowed programs to work efficiently with formal logic as a representation language. In 1967, the Dendral program (Feigenbaum, Lederberg, Buchanan and Sutherland at Stanford) was demonstrated; it could interpret mass spectra of organic chemical compounds and was the first successful knowledge-based program for scientific reasoning. In 1969 the SRI robot Shakey demonstrated combining locomotion, perception and problem solving.

The years from 1969 to 1979 marked the early development of knowledge-based systems.

In 1974, MYCIN demonstrated the power of rule-based systems for knowledge representation and inference in medical diagnosis and therapy. Knowledge representation schemes were developed, including frames, developed by Minsky. Logic-based languages like Prolog and Planner were developed.

We will now mention a few of the AI systems that were developed over the years.

The Meta-Dendral learning program produced new results in chemistry (rules of mass spectrometry). In the 1980s, Lisp machines were developed and marketed. Around 1985, neural networks returned to popularity, and in 1988 there was a resurgence of probabilistic and decision-theoretic methods.

The early AI systems used general methods and little knowledge. AI researchers realized that specialized knowledge is required to focus reasoning in rich tasks. The 1990s saw major advances in all areas of AI, including:

• Machine learning and data mining,
• Intelligent tutoring,
• Case-based reasoning,
• Multi-agent planning and scheduling,
• Uncertain reasoning,
• Natural language understanding and translation,
• Vision, virtual reality, games, and other topics.

Rod Brooks' COG Project at MIT, with numerous collaborators, made significant progress in building a humanoid robot. The first official RoboCup soccer match, featuring table-top matches with 40 teams of interacting robots, was held in 1997.

In the late 1990s, web crawlers and other AI-based information-extraction programs became essential to widespread use of the World Wide Web. Interactive robot pets ("smart toys") became commercially available, realizing the vision of the 18th-century novelty toy makers. In 2000, the Nomad robot explored remote regions of Antarctica looking for meteorite samples.

Review Questions
1. Explain the term Artificial Intelligence.
2. Discuss four approaches to Artificial Intelligence.
3. Write briefly on the characteristics of agent environments.

1.8 References and Further Reading

Bobrow, D.G. (1993). Artificial intelligence in perspective: a retrospective on fifty volumes of Artificial Intelligence. Artificial Intelligence, 59: 5-20.

Bowling, M. and Veloso, M. (2002). Multiagent learning using a variable learning rate. Artificial Intelligence, 136(2): 215-250.

Buchanan, B.G. (2005). A (very) brief history of artificial intelligence. AI Magazine, 26(4): 53-60.

Chrisley, R. and Begeer, S. (2000). Artificial Intelligence: Critical Concepts in Cognitive Science. Routledge, London and New York.

Gardner, H. (1985). The Mind's New Science. Basic Books, New York.

Haugeland, J. (1997). Mind Design II: Philosophy, Psychology, Artificial Intelligence. MIT Press, Cambridge, MA, revised and enlarged edition.

Kok, J.R. and Vlassis, N. (2006). Collaborative multiagent reinforcement learning by payoff propagation. Journal of Machine Learning Research, 7: 1789-1828.

Luger, G.F. (2004). Artificial Intelligence: Structures and Strategies for Complex Problem Solving (Fifth Edition). Chapter 1, pages 3-35.

Nilsson, N.J. (2009). The Quest for Artificial Intelligence: A History of Ideas and Achievements. Cambridge University Press, Cambridge, England.

Posner, M.I. (1989). Foundations of Cognitive Science. MIT Press, Cambridge, MA.

Russell, S.J. and Norvig, P. (2002). Artificial Intelligence: A Modern Approach (Second Edition). ISBN 0-13-790395-2, Chapters 1-2, pages 1-56.

Sandholm, T. (2007). Expressive commerce and its application to sourcing: how we conducted $35 billion of generalized combinatorial auctions. AI Magazine, 28(3): 45-58.

Serenko, A. and Detlor, B. (2004). Intelligent agents as innovations. AI and Society, 18(4).

Stillings, N.A. et al. (1987). Cognitive Science: An Introduction. MIT Press, Cambridge, MA.

MODULE TWO
INTELLIGENT AGENTS
2.1 Learning Outcomes
The following are the main points to be learnt from this module:
1. Definition of intelligent agents and examples
2. Agent environments
3. Agent architectures
4. Overview of multi-agent systems

2.2 Intelligent Agents

An agent is a system that acts in an environment. An intelligent agent must sense, act, and be autonomous (to some extent); it must also be rational. Agents include worms, dogs, thermostats, airplanes, robots, humans, companies, and countries. We judge an agent by its actions: what it does and how it acts. A rational agent always does the right thing. An agent acts intelligently when:

• what it does is appropriate for its circumstances and its goals,
• it is flexible to changing environments and changing goals, and
• it learns from experience and makes appropriate choices given its limitations.
2.2.1 Examples of Agents
1. Humans can be looked upon as agents. They have eyes, ears, skin, taste buds, etc. for sensors, and hands, fingers, legs and mouth for effectors.
2. Robots are agents. Robots may have cameras, sonar, infrared sensors, bumpers, etc. for sensors.
3. We also have software agents, or softbots, that have some functions as sensors and some functions as actuators.
4. Expert systems such as cardiology advisors, autonomous spacecraft and intelligent buildings are agents.

2.2.2 Rational Agents

Perfect rationality assumes that the rational agent knows everything and will take the action that maximizes its utility. Human beings do not satisfy this definition of rationality.

A rational action is the action that maximizes the expected value of the performance measure given the percept sequence to date. However, a rational agent is not omniscient: it does not know the actual outcome of its actions, and it may not know certain aspects of its environment. Rationality must therefore take into account the limitations of the agent. The agent has to select the best action to the best of its knowledge, depending on its percept sequence, its background knowledge and its feasible actions. An agent also has to deal with the expected outcome of actions whose effects are not deterministic.
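The idea of rational action as maximizing the expected value of the performance measure can be sketched in a few lines of Python. This is a minimal illustration, not part of the course material: the action names, outcome lists and scores are all invented for the example.

```python
# Hypothetical sketch: a rational action maximizes the expected value
# of the performance measure. Action outcomes are modelled as
# (probability, score) pairs; all names are illustrative.

def expected_value(outcomes):
    """Expected performance score of one action."""
    return sum(p * score for p, score in outcomes)

def rational_action(actions):
    """Pick the action with the highest expected value.

    `actions` maps an action name to its possible (probability, score)
    outcomes. The agent is not omniscient: it reasons over expected,
    not actual, outcomes.
    """
    return max(actions, key=lambda a: expected_value(actions[a]))

# A toy vacuum agent unsure whether sucking will actually clean:
actions = {
    "suck": [(0.9, 10), (0.1, 0)],   # usually cleans (+10)
    "move": [(1.0, 1)],              # small guaranteed gain
}
print(rational_action(actions))      # suck: expected value 9 > 1
```

Note that the agent never sees the actual outcome of an action, only the expected value it computes, which is exactly the limitation described above.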

2.3 Agents Situated in Environments

AI is about practical reasoning: reasoning in order to do something. It involves perception, reasoning, and acting, which together make an agent. An agent's environment may include other agents. An agent together with its environment is called a world. An agent could be, for example, a coupling of a computational engine with physical sensors and actuators, called a robot, where the environment is a physical setting. It could be the coupling of an advice-giving computer, an expert system, with a human who provides perceptual information and carries out the task. An agent could also be a program that acts in a purely computational environment: a software agent.

Figure 2.1: An agent interacting with an environment

Figure 2.1 above shows the inputs and outputs of an agent. At any time, what an agent does depends on its:

• prior knowledge about the agent and the environment;
• history of interaction with the environment, which is composed of
  o observations of the current environment, and
  o past experiences of previous actions and observations, or other data, from which it can learn;
• goals that it must try to achieve, or preferences over states of the world; and
• abilities, which are the primitive actions it is capable of carrying out.
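The four ingredients above (prior knowledge, history, goals, abilities) can be sketched as a tiny Python class. This is an illustrative skeleton only; the trivial policy inside `act` is invented for the example and is not a real agent program.

```python
# Minimal sketch of the agent abstraction described above: what the
# agent does is a function of its prior knowledge, its interaction
# history, its goals, and its abilities. All names are illustrative.

class Agent:
    def __init__(self, prior_knowledge, goals, abilities):
        self.knowledge = prior_knowledge   # what it starts out knowing
        self.goals = goals                 # what it tries to achieve
        self.abilities = abilities         # primitive actions available
        self.history = []                  # observations seen so far

    def act(self, observation):
        """Choose a primitive action given the current observation."""
        self.history.append(observation)   # interaction history grows
        # Trivial illustrative policy: act on a goal the observation
        # mentions, otherwise fall back to the first ability.
        for goal in self.goals:
            if goal in observation and goal in self.abilities:
                return goal
        return self.abilities[0]

agent = Agent(prior_knowledge={}, goals=["recharge"],
              abilities=["wait", "recharge"])
print(agent.act("battery low, recharge available"))  # recharge
```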

2.3.1 Agent Environment

Environments in which agents operate can be defined in different ways. It is helpful to view the following definitions as referring to the way the environment appears from the point of view of the agent itself.

Observability
In terms of observability, an environment can be characterized as fully observable or partially observable. In a fully observable environment, the entire environment relevant to the action being considered is observable, so the agent does not need to keep track of changes in the environment. A chess-playing system is an example of a system that operates in a fully observable environment. In a partially observable environment, the relevant features of the environment are only partially observable. A bridge-playing program is an example of a system operating in a partially observable environment.

Determinism
In a deterministic environment, the next state of the environment is completely determined by the current state and the agent's action. If an element of interference or uncertainty occurs, the environment is stochastic. Note that a deterministic yet partially observable environment will appear stochastic to the agent. If the environment state is wholly determined by the preceding state and the actions of multiple agents, then the environment is said to be strategic. Example: chess.

Episodicity
In an episodic environment, subsequent episodes do not depend on the actions that occurred in previous episodes. In a sequential environment, the agent engages in a series of connected episodes.

Dynamism
A static environment does not change from one state to the next while the agent is considering its course of action; the only changes to the environment are those caused by the agent itself. A dynamic environment changes over time independent of the actions of the agent, so if an agent does not respond in a timely manner, this counts as a choice to do nothing.

Continuity
If the number of distinct percepts and actions is limited, the environment is discrete; otherwise it is continuous.
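The five dimensions just described can be recorded as a small data structure. The chess and bridge entries follow the examples in the text; the taxi-driving row is a common textbook judgment added here for contrast, not taken from this document.

```python
# Sketch: the five environment dimensions described above, recorded
# for a few example task environments. Chess and bridge follow the
# text; the driving row is an added illustrative judgment.

from dataclasses import dataclass

@dataclass
class TaskEnvironment:
    observable: str    # "fully" or "partially"
    determinism: str   # "deterministic", "strategic", or "stochastic"
    episodic: bool     # are episodes independent of one another?
    static: bool       # unchanged while the agent deliberates?
    discrete: bool     # finitely many percepts and actions?

chess = TaskEnvironment("fully", "strategic",
                        episodic=False, static=True, discrete=True)
bridge = TaskEnvironment("partially", "stochastic",
                         episodic=False, static=True, discrete=True)
driving = TaskEnvironment("partially", "stochastic",
                          episodic=False, static=False, discrete=False)

print(chess.observable, bridge.observable)  # fully partially
```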

2.4 Agent Architectures

2.4.1 Table-Based Agent
In a table-based agent, the action is looked up from a table based on information about the agent's percepts. A table is a simple way to specify a mapping from percepts to actions. The mapping may also be defined implicitly by a program, and implemented by a rule-based system, a neural network, or a procedure.

There are several disadvantages to a table-based system. The table may become very large, and learning the table may take a very long time, especially if it is large. Such systems usually have little autonomy, as all actions are pre-determined.
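A table-based agent is simple enough to sketch directly. The two-square vacuum world used here is a toy domain invented for illustration.

```python
# Sketch of a table-based agent: the action is a direct table lookup
# from the percept. The toy two-square vacuum world is illustrative.

action_table = {
    ("A", "dirty"): "suck",
    ("A", "clean"): "right",
    ("B", "dirty"): "suck",
    ("B", "clean"): "left",
}

def table_agent(percept):
    """Look the action up directly from the percept; no reasoning."""
    return action_table[percept]

print(table_agent(("A", "dirty")))  # suck
```

The table has one entry per percept; if actions depended on percept *sequences*, the table would grow exponentially with history length, which is exactly the size disadvantage noted above.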

2.4.2 Percept-Based Agent or Reflex Agent

In percept-based agents:
1. information comes from sensors (percepts);
2. the percept changes the agent's view of the current state of the world;
3. actions are triggered through the effectors.

Such agents are called reactive agents or stimulus-response agents. Reactive agents have no notion of history: the current state is as the sensors see it right now, and the action is based on the current percepts only.

The following are some of the characteristics of percept-based agents:
• they are efficient;
• they have no internal representation for reasoning or inference;
• they do no strategic planning or learning;
• they are not good for multiple, opposing goals.
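A reflex agent can be sketched as a list of condition-action rules fired on the current percept only. The rules and percept format below are invented for illustration.

```python
# Sketch of a percept-based (simple reflex) agent: condition-action
# rules fire on the current percept only; there is no history.
# The rules and the percept dictionary format are illustrative.

rules = [
    (lambda p: p["status"] == "dirty", "suck"),
    (lambda p: p["location"] == "A",   "right"),
    (lambda p: p["location"] == "B",   "left"),
]

def reflex_agent(percept):
    """Return the action of the first rule whose condition matches."""
    for condition, action in rules:
        if condition(percept):
            return action
    return "noop"   # no rule matched

print(reflex_agent({"location": "A", "status": "dirty"}))  # suck
print(reflex_agent({"location": "A", "status": "clean"}))  # right
```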

2.4.3 Subsumption Architecture
We now briefly describe the subsumption architecture (Rodney Brooks, 1986). This architecture is based on reactive systems. Brooks notes that in lower animals there is no deliberation: actions are based on sensory inputs, and yet even lower animals are capable of many complex tasks. His argument is to follow the evolutionary path and build simple agents for complex worlds.

The main features of Brooks' architecture are:
• there is no explicit knowledge representation;
• behaviour is distributed, not centralized;
• response to stimuli is reflexive;
• the design is bottom-up, and complex behaviours are fashioned from the combination of simpler underlying ones;
• individual agents are simple.

The subsumption architecture is built in layers of behaviour, where the higher layers can override lower layers. Each activity is modeled by a finite state machine. One such system is built in three layers:
1. Layer 0: avoid obstacles
2. Layer 1: wander behaviour
3. Layer 2: exploration behaviour

Layer 0 (avoid obstacles) has the following capabilities:
• Sonar: generate a sonar scan.
• Collide: send a HALT message to forward.
• Feel force: send a signal to run-away and turn.

Layer 1 (wander behaviour):
• generates a random heading;
• Avoid reads the repulsive force, generates a new heading, and feeds it to turn and forward.

Layer 2 (exploration behaviour):
• Look notices idle time and looks for an interesting place.
• Pathplan sends a new direction to Avoid.
• Integrate monitors the path and sends it to Pathplan.
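The layered override idea can be sketched with plain functions, where the highest layer with an opinion subsumes the layers below it. This is a much-simplified arbitration scheme with invented sensor fields and actions, not Brooks' original finite-state machines.

```python
# Hedged sketch of layered control in the subsumption architecture:
# each layer proposes an action from the current sensor reading, and
# the highest layer with an opinion overrides (subsumes) the layers
# below it. Sensor fields and actions are illustrative only.

def layer0_avoid(sensors):
    """Layer 0: base reflex behaviour; always has an opinion."""
    return "turn-away" if sensors["obstacle_close"] else "forward"

def layer1_wander(sensors):
    """Layer 1: overrides layer 0 with a wander heading when safe."""
    return "wander" if not sensors["obstacle_close"] else None

def layer2_explore(sensors):
    """Layer 2: overrides wandering when an interesting place is seen."""
    if sensors.get("interesting_place") and not sensors["obstacle_close"]:
        return "goto-interesting"
    return None

def subsumption_step(sensors):
    # Highest layer with an opinion wins; None means "defer downward".
    for layer in (layer2_explore, layer1_wander, layer0_avoid):
        action = layer(sensors)
        if action is not None:
            return action

print(subsumption_step({"obstacle_close": True}))   # turn-away
print(subsumption_step({"obstacle_close": False}))  # wander
```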

2.4.4 State-Based Agent or Model-Based Reflex Agent
State-based agents differ from percept-based agents in that they maintain some sort of state based on the percept sequence received so far. The state is updated regularly based on what the agent senses and on the agent's actions. Keeping track of the state requires that the agent has knowledge about how the world evolves and how the agent's actions affect the world.

Thus a state-based agent works as follows:
1. information comes from sensors (percepts);
2. based on the percepts, the agent updates its current state of the world;
3. based on the state of the world and its knowledge (memory), it triggers actions through the effectors.
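The three steps above can be sketched in the same toy vacuum world used earlier; the point of the example is that the remembered model, not just the current percept, drives the choice of action. The world model and policy are invented for illustration.

```python
# Sketch of a state-based (model-based reflex) agent: it folds each
# percept into an internal model and picks an action from that model.
# The toy two-square vacuum world is illustrative.

class ModelBasedAgent:
    def __init__(self):
        # Internal state: what the agent believes about each square.
        self.model = {"A": "unknown", "B": "unknown"}

    def act(self, percept):
        location, status = percept
        self.model[location] = status          # update state from percept
        if status == "dirty":
            return "suck"
        other = "B" if location == "A" else "A"
        if self.model[other] != "clean":       # memory drives the choice
            return "right" if location == "A" else "left"
        return "noop"

agent = ModelBasedAgent()
print(agent.act(("A", "clean")))  # right (B still unknown)
print(agent.act(("B", "clean")))  # noop  (remembers that A was clean)
```

The second call returns a different action for the same "clean" status only because the agent remembers the first percept: exactly what distinguishes it from a reflex agent.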

2.4.5 Goal-Based Agent

A goal-based agent has some goal which forms a basis for its actions. Such agents work as follows:
1. information comes from sensors (percepts);
2. based on the percepts, the agent updates its current state of the world;
3. based on the state of the world, its knowledge (memory) and its goals/intentions, it chooses actions and performs them through the effectors.

Goal formulation based on the current situation is a way of solving many problems, and search is a universal problem-solving mechanism in AI. The sequence of steps required to solve a problem is not known a priori and must be determined by a systematic exploration of the alternatives.

2.4.6 Utility-Based Agent

Utility-based agents provide a more general agent framework. When the agent has multiple goals, this framework can accommodate different preferences for the different goals. Such systems are characterized by a utility function that maps a state, or a sequence of states, to a real-valued utility. The agent acts so as to maximize expected utility.
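A utility-based agent can be sketched as a utility function over states plus a transition model used to predict the next state for each action. The state encoding, transition model and weights below are all invented for illustration.

```python
# Sketch of a utility-based agent: a utility function maps states to
# real values, and the agent picks the action whose predicted next
# state has the highest utility. The toy domain is illustrative.

def utility(state):
    """Real-valued score: prefer clean squares, then battery charge."""
    return 10 * state["clean_squares"] + state["battery"]

def predict(state, action):
    """Hypothetical transition model for the toy world."""
    s = dict(state)
    if action == "suck":
        s["clean_squares"] += 1
        s["battery"] -= 2
    elif action == "recharge":
        s["battery"] += 5
    return s

def utility_agent(state, actions):
    """Choose the action leading to the highest-utility next state."""
    return max(actions, key=lambda a: utility(predict(state, a)))

state = {"clean_squares": 1, "battery": 3}
print(utility_agent(state, ["suck", "recharge", "noop"]))  # suck
```

The weights in `utility` encode the agent's preferences across its multiple goals (cleanliness versus charge), which is the extra expressiveness over a goal-based agent.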

2.4.7 Learning Agent

Learning allows an agent to operate in initially unknown environments. The learning element modifies the performance element. Learning is required for true autonomy.

2.5 Overview of Multi-Agent Systems

The research literature indicates that there is no consensus on the definition and classification of software agents. From the different definitions, however, essential properties can still be identified clearly, such as autonomy, goal-directedness, environment awareness and collaboration (Maes, 1994; Nwana et al., 1996; Wooldridge, 2002).

The major attributes of software agents are:

(i) Situatedness: an agent receives input from the environment in which it operates and can perform actions which change the environment in some way.

(ii) Autonomy: an agent is able to operate without direct, continuous supervision; it has full control over its own actions.

(iii) Flexibility: an agent is not only reactive but also pro-active. Reactivity means that it perceives the world inside which it is acting and reacts to change in quasi-real-time fashion. Proactiveness means that its behaviour is not exclusively reactive but is also driven by internal goals, i.e., it may take the initiative.

(iv) Reasoning: through learned knowledge and expertise, software agents perform reasoning tasks in a rational way.

Agents are seldom stand-alone systems. In many situations they coexist and interact with other agents in different ways. A system that consists of a group of agents that can potentially interact with each other is called a multi-agent system (MAS). The corresponding subfield of AI that deals with the principles and design of multi-agent systems is called Distributed Artificial Intelligence (DAI).

A multi-agent system is a system in which there are several autonomous agents in the same environment which co-operate at least part of the time to solve a global goal together (Lin and Micheal, 2004). The agents co-operate to perform some task that a single agent cannot do on its own: when a single agent does not have all the capabilities or knowledge required to perform the task, or when it would take a single agent much longer to perform the task.

Co-operation in multi-agent systems can take a number of forms:

(i) The agents implicitly or explicitly share a common goal and benevolently work to achieve the overall objectives of the system, even when this conflicts with the agents' own goals (e.g., when the agents are 'owned' by the same organization or individual).

(ii) The agents are self-interested and do not share a common goal (e.g., they are designed to represent the interests of different individuals or organisations); agents co-operate because it helps them achieve their own goals.

2.5.1 Characteristics of Multi-Agent Systems

A multi-agent system (MAS) could be a set of agents organised in either a mesh or a hierarchical structure (Liu and Mary-Anne, 2002). In a mesh structure, every agent is connected to all other agents in the system, while agents in a hierarchical structure are connected only to their parents and children. The organizational structure defines the communication topology of the MAS, i.e., an agent can only interact with those agents to which it is connected.
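The two organisational structures can be made concrete as sets of directed communication links. The agent names and the parent map below are invented for illustration.

```python
# Sketch of the two organisational structures described above as
# communication topologies: in a mesh every agent can interact with
# every other agent; in a hierarchy agents connect only to parents
# and children. All names are illustrative.

def mesh_links(agents):
    """Every ordered pair of distinct agents can interact."""
    return {(a, b) for a in agents for b in agents if a != b}

def hierarchy_links(parent_of):
    """Agents interact only along parent-child edges (both ways)."""
    links = set()
    for child, parent in parent_of.items():
        links.add((child, parent))
        links.add((parent, child))
    return links

agents = ["a1", "a2", "a3"]
print(len(mesh_links(agents)))                         # 6 directed links
print(len(hierarchy_links({"a2": "a1", "a3": "a1"})))  # 4 directed links
```

With n agents, a mesh has n(n-1) directed links while a hierarchy has only 2(n-1), which is one reason hierarchical organisations scale more cheaply.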

The fundamental aspects that characterize a MAS and distinguish it from a single-agent system, according to Kok and Vlassis (2006), are:

(i) Environment

Agents deal with environments that can be either static (time-invariant) or dynamic (non-stationary). Most existing AI techniques for single agents have been developed for static environments because these are easier to handle and allow for more rigorous mathematical treatment. In a MAS, the mere presence of multiple agents makes the environment appear dynamic from the point of view of each agent.

(ii) Perception

The collective information that reaches the sensors of the agents in a MAS is typically distributed: the agents may observe data that differ spatially (appear at different locations), temporally (arrive at different times), or even semantically (require different interpretations). This automatically makes the world state partially observable to each agent, which has various consequences for the decision making of the agents. An additional issue is sensor fusion: how the agents can optimally combine their perceptions in order to increase their collective knowledge about the current state.

(iii) Control

Contrary to single-agent systems, the control in a MAS is typically distributed (decentralized). This means that there is no central process that collects information from each agent and then decides what action each agent should take; the decision making of each agent lies, to a large extent, within the agent itself. Distributed decision making results in asynchronous computation and certain speedups, but it has the downside that appropriate coordination mechanisms need to be developed. Coordination ensures that the individual decisions of the agents result in good joint decisions for the group.

(iv) Communication
Interaction is often associated with some form of communication. Typically, we view communication in a MAS as a two-way process, where all agents can potentially be senders and receivers of messages. Communication can be used in several cases: for coordination among cooperative agents or for negotiation among self-interested agents. Moreover, communication raises the issues of what network protocols to use in order for the exchanged information to arrive safely and on time, and what language the agents must speak in order to understand each other, especially if they are heterogeneous.

(v) Collaboration

Multi-agent systems provide a framework where individual agents can communicate
among themselves and take shared responsibility for accomplishing common tasks. This
important characteristic of multi-agent systems (MAS) is adopted in this study. Agent models
are proliferating; some include learning capabilities, others include intelligent agendas based on
statistics, genetic algorithms and so on. However, a simple yet powerful and mature model
coming from cognitive science and philosophy that has received a great deal of attention,
notably in artificial intelligence, is the Belief-Desire-Intention (BDI) model (Hofsteller, 1998).
This approach has been used intensively to study the design rationale of agents and is proposed
as a keystone model in numerous agent-oriented development environments such as JACK or
JADE.

Beliefs represent the informational state of a BDI agent (i.e., what it knows about itself
and the world); desires, or goals, are its motivational state (i.e., what the agent is trying to
achieve); intentions represent the deliberative state of the agent (i.e., which plans the agent
has chosen for possible execution).

In more detail, a BDI agent has a set of plans, which define sequences of actions and
steps available to achieve a certain goal or react to a specific situation. The agent reacts to
events, which are generated by modifications to its beliefs, additions of new goals, or messages
arriving from the environment or from another agent.
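The belief/desire/intention cycle described above can be sketched in a few lines of Python. This is an illustrative toy, not the API of any real BDI platform such as JACK or JADE; the class, the `deliver_parcel` goal and the plan steps are all invented for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class BDIAgent:
    """Toy BDI agent: beliefs are facts, desires are goals, intentions are adopted plans."""
    beliefs: set = field(default_factory=set)
    desires: set = field(default_factory=set)
    intentions: list = field(default_factory=list)
    plans: dict = field(default_factory=dict)  # plan library: goal -> sequence of actions

    def perceive(self, fact):
        """An event (a new percept) modifies the agent's beliefs."""
        self.beliefs.add(fact)

    def deliberate(self):
        """Adopt a plan (intention) for each desire the plan library can achieve."""
        for goal in self.desires:
            if goal in self.plans and self.plans[goal] not in self.intentions:
                self.intentions.append(self.plans[goal])

agent = BDIAgent(desires={"deliver_parcel"},
                 plans={"deliver_parcel": ["pick_up", "navigate", "drop_off"]})
agent.perceive("parcel_at_dock")
agent.deliberate()
print(agent.intentions)  # [['pick_up', 'navigate', 'drop_off']]
```

The point of the sketch is only the separation of the three states: percepts update beliefs, deliberation turns desires into intentions by selecting plans.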

Contrary to Remote Method Invocation (RMI), multi-agent systems are composed of several
autonomous organizations (with their own agents) which cannot share their common goals at
design time. Therefore, agents need to communicate by requesting actions to be performed,
rather than invoking methods.

Wooldridge (2002) summarized the main differences between objects and agents:
(i) the autonomy of agents is stronger than that of objects. Indeed, agents decide for
themselves whether or not to perform an action on request from other agents.

(ii) agents are capable of flexible (reactive, proactive, social) behaviour, and the standard
object model has nothing to say about such types of behaviour; each agent is always
physically autonomous by using the thread mechanism.

2.5.2 Categories of Multi-Agent Systems

The proliferation of computer systems led to a new conception about how computer or
non-human entities such as agents might work together. The concept of “open systems” groups
together a large number of systems of different design that can interact and cooperate in order
to accomplish some specific tasks. Multi-agent systems represent a modern approach for
modeling so-called open systems. Communication enables the agents to coordinate their actions
and behaviour, resulting in systems that are more coherent.

Coordination in a multi-agent system refers to the state of a community of agents in which
the actions of some agents fit in well with each other, as well as to the process of achieving this
state. In the view of Gaston and DesJardins (2005), coordination in multi-agent systems could
be:

(i) Homogeneous (same platform) and heterogeneous (different platform).

(ii) Competitive and cooperative.

Cooperative and competitive multi-agent system coordination are briefly discussed in the
following sections.

Cooperative multi-agent system (MAS) coordination

This coordination is among non-antagonistic agents. Cooperative agents work together to
solve complex problems with local information. Three properties that must be met to have
"shared cooperative activity" are: mutual responsiveness, commitment to the joint activity and
commitment to mutual support. The major applications of cooperative multi-agent systems
are:

(1) Distributed problem solving (DPS)

Cooperation through planning is suited for distributed problem-solving environments,
where entities with a priori known capabilities are pushed to work together under tight rules.
Cooperative distributed problem solving studies how a loosely coupled network of problem
solvers can work together to solve problems that are beyond their individual capabilities.
Problem solving in the classical AI sense is distributed among multiple agents, where the
agents formulate a solution/answer to some complex question. Such agents may be
heterogeneous or homogeneous.

(2) Distributed planning (DP)

This presents a model of joint planning among distributed multi-agent systems. This
includes shared plans (a formal model for joint activity) and joint intentions. Notable
applications of cooperative MAS are:

(i) Distributed Sensor Network Management (DSNM).

(ii) Traffic Vehicle Movement (TVM) using multiple sensors.

(iii) Distributed Vehicle Monitoring.

(iv) Distributed Delivery.

Competitive multi-agent system (MAS) coordination

Competition is coordination among competitive or self-interested agents. This is
achieved through negotiation. Each agent's goal is to maximise its own interests while
attempting to reach agreement with other agents. Examples are buying/selling agents, where
negotiation to resolve conflicts is at a competitive level: each is trying to obtain the
lowest/highest price possible for its own good and not for the good of the market community as
a whole. Hence these agents work on behalf of an individual user and not as part of a
unified community. Examples of systems developed using competitive multi-agent
systems are:

(i) Buying/Selling Systems e.g., Bargain Finder, FireFly, Kasbah.

(ii) Electronic Commerce e.g., MAGMA

(iii) Virtual Enterprises e.g., AVE, InterRAP.

Review Questions

1. What is a Multi-Agent System?

2. List four characteristics of a Multi-Agent System.

References

Gaston, M.E. and DesJardins, M. (2005). Agent-Organized Networks for Dynamic Team Formation. In
Proceedings of the 2005 International Conference on Autonomous Agents and Multi-Agent
Systems (AAMAS-05), Utrecht, Netherlands.

Kok, J.R. and Vlassis, N. (2006). Collaborative Multi-Agent Reinforcement Learning by Payoff
Propagation. Journal of Machine Learning Research, 7(22): 178–182.

Liu, W. and Williams, M.-A. (2002). Trustworthiness of Information Sources and Information Pedigrees.
In Intelligent Agents VIII (Agent Theories, Architectures and Languages), J.-J. Ch. Meyer and
M. Tambe (Eds.), Springer.

Padgham, L. and Winikoff, M. (2004). Developing Intelligent Agent Systems: A Practical Guide. John
Wiley and Sons.

Nwana, H.S., Lee, L.C. and Jennings, N.R. (1996). Co-ordination in Software Agent Systems.

MODULE THREE

KNOWLEDGE REPRESENTATION IN ARTIFICIAL INTELLIGENCE


3.1 Learning Outcomes
The following are the main points to be learnt from this module:
• What knowledge representation is.
• How to identify a problem, provide a solution to it, and know when a problem is solved.
• How an agent acts in an environment when it has access only to its prior knowledge, its
history of observations, and its goals and preferences.

• How to solve a problem by computer.


3.2 Knowledge Representation


Typically, a problem to solve or a task to carry out, as well as what constitutes a solution, is
only given informally, such as "deliver parcels promptly when they arrive" or "fix whatever is
wrong with the electrical system of the house."

Figure 2.1: The Role of Representations in Solving Problems


The general framework for solving problems by computer is given in Figure 2.1. To solve a
problem, the designer of a system must

• identify the task and determine what constitutes a solution;


• represent the problem in a language with which a computer can reason;
• use the computer to compute an output, that is, an answer presented to the user or a
sequence of actions to be carried out in the environment; and
• interpret the output as a solution to the identified problem.

Knowledge is information about a domain that can be used to solve problems in that
domain. A representation scheme is the form of the knowledge that is used by an agent.

A representation of some piece of knowledge is the internal representation of the knowledge.
A representation scheme specifies the form of the knowledge. A knowledge base is the
representation of all the knowledge stored by an agent.

A representation should be
• rich enough to express the knowledge needed to solve the identified problem.
• as close to the problem as possible; it should be compact, natural, and maintainable. It
should be easy to see the relationship between the representation and the domain being
represented, so that it is easy to determine whether the knowledge represented is correct.
A small change in the problem should result in a small change in the representation of
the problem.

• amenable to efficient computation; this means that it is able to express features of the
problem that can be exploited for computational gain and to trade off accuracy and
computation time.

• able to be acquired from people, data and past experiences.

3.3 Defining a Solution


Given an informal description of a problem, before even considering a computer, a knowledge
base designer should determine what would constitute a solution. This question arises not only
in AI but in any software design.

Typically, problems are not well specified. For example, if you ask a trading agent to find out
all the information about resorts that may have health issues, you do not want the agent to return
the information about all resorts, even though all of the information you requested is in the
result. However, if the trading agent does not have complete knowledge about the resorts,
returning all of the information may be the only way for it to guarantee that all of the requested
information is there. Similarly, you do not want a delivery robot, when asked to take all of the
trash to the garbage can, to take everything to the garbage can, even though this may be the only
way to guarantee that all of the trash has been taken. Much work in AI is motivated by
commonsense reasoning; we want the computer to be able to make commonsense conclusions
about the unstated assumptions.

Given a well-defined problem, the next issue is whether it matters if the answer returned is
incorrect or incomplete. For example, if the specification asks for all instances, does it matter if
some are missing? Does it matter if there are some extra instances? Often a person does not

want just any solution but the best solution according to some criteria. There are four common
classes of solutions:
 Optimal solution
An optimal solution to a problem is one that is the best solution according to some measure of
solution quality. This measure is typically specified as an ordinal, where only the order matters.
However, in some situations, such as when combining multiple criteria or when reasoning under
uncertainty, you need a cardinal measure, where the relative magnitudes also matter. An
example of an ordinal measure is for the robot to take out as much trash as possible; the more
trash it can take out, the better. As an example of a cardinal measure, you may want the delivery
robot to take as much of the trash as possible to the garbage can, minimizing the distance
traveled, and explicitly specify a trade-off between the effort required and the proportion of the
trash taken out. It may be better to miss some trash than to waste too much time.

 Satisficing solution
Often an agent does not need the best solution to a problem but just needs some solution. A
satisficing solution is one that is good enough according to some description of which solutions
are adequate. For example, a person may tell a robot that it must take all of the trash out, or tell
it to take out three items of trash.

 Approximately optimal solution

One of the advantages of a cardinal measure of success is that it allows for approximations. An
approximately optimal solution is one whose measure of quality is close to the best that could
theoretically be obtained. Typically, agents do not need optimal solutions to problems; they only
need to get close enough. For example, the robot may not need to travel the optimal distance to
take out the trash but may only need to be within, say, 10% of the optimal distance.

 Probable solution
A probable solution is one that, even though it may not actually be a solution to the problem,
is likely to be a solution. This is one way to approximate, in a precise manner, a satisficing
solution. For example, in the case where the delivery robot could drop the trash or fail to pick
it up when it attempts to, you may need the robot to be 80% sure that it has picked up three
items of trash. Often you want to distinguish the false-positive error rate (the proportion of the
answers given by the computer that are not correct) from the false-negative error rate (which
is the proportion of those answers not given by the computer that are indeed correct). Some
applications are much more tolerant of one of these errors than of the other.
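The two error rates can be computed from the set of answers the computer returns and the set of correct answers. The sketch below is a plain illustration (the function name and the resort identifiers are invented); it follows the usual convention that the false-negative rate is taken as a proportion of the correct answers:

```python
def error_rates(given, correct):
    """given: set of answers returned; correct: set of true answers.
    false-positive rate: fraction of given answers that are not correct.
    false-negative rate: fraction of correct answers that were not given."""
    fp = len(given - correct) / len(given) if given else 0.0
    fn = len(correct - given) / len(correct) if correct else 0.0
    return fp, fn

# Hypothetical resort query: one wrong answer given, one correct answer missed.
fp, fn = error_rates(given={"r1", "r2", "r3"}, correct={"r2", "r3", "r4"})
# each rate is 1/3 here
```

An application that tolerates extra answers but not missing ones would weight fn heavily, and vice versa.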

3.3.1 Representations
Once you have some requirements on the nature of a solution, you must represent the problem so
a computer can solve it.

Computers and human minds are examples of physical symbol systems. A symbol is a
meaningful pattern that can be manipulated. Examples of symbols are written words, sentences,
gestures, marks on paper, or sequences of bits. A symbol system creates copies, modifies, and
destroys symbols. Essentially, a symbol is one of the patterns manipulated as a unit by a symbol
system. The term physical is used, because symbols in a physical symbol system are physical
objects that are part of the real world, even though they may be internal to computers and brains.

An agent can use physical symbol systems to model the world. A model of a world is a
representation of the specifics of what is true in the world or of the dynamics of the world. An
agent can have a very simplistic model of the world, or it can have a very detailed model of the
world. An agent can have multiple, even contradictory, models of the world. The models are
judged not by whether they are correct, but by whether they are useful.

Example: A delivery robot can model the environment at a high level of abstraction in terms of
rooms, corridors, doors, and obstacles, ignoring distances, its size, the steering angles needed,
the slippage of the wheels, the weight of parcels, the details of obstacles, the political situation
in Canada, and virtually everything else. The robot could model the environment at lower levels
of abstraction by taking some of these details into account. Some of these details may be
irrelevant for the successful implementation of the robot, but some may be crucial for the robot
to succeed. For example, in some situations the size of the robot and the steering angles may be
crucial for not getting stuck around a particular corner. In other situations, if the robot stays
close to the center of the corridor, it may not need to model its width or the steering angles.

3.3.2 Reasoning and Acting


The manipulation of symbols to produce action is called reasoning.
One way that AI representations differ from computer programs in traditional languages is that
an AI representation typically specifies what needs to be computed, not how it is to be
computed. We might specify that the agent should find the most likely disease a patient has, or
specify that a robot should get coffee, but not give detailed instructions on how to do these
things. Much AI reasoning involves searching through the space of possibilities to determine
how to complete a task.
In deciding what an agent will do, there are three aspects of computation that must be
distinguished: (1) the computation that goes into the design of the agent, (2) the computation

that the agent can do before it observes the world and needs to act, and (3) the computation that
is done by the agent as it is acting.

• Design time reasoning is the reasoning that is carried out to design the agent. It is
carried out by the designer of the agent, not the agent itself.

• Offline computation is the computation done by the agent before it acts. It can include
compilation and learning. Offline, the agent takes background knowledge and data and
compiles them into a usable form called a knowledge base. Background knowledge
can be given either at design time or offline.

• Online computation is the computation done by the agent between observing the
environment and acting in the environment. A piece of information obtained online is
called an observation. An agent typically must use both its knowledge base and its
observations to determine what to do.

It is important to distinguish between the knowledge in the mind of the designer and the knowledge
in the mind of the agent. Consider the extreme cases:

• At one extreme is a highly specialized agent that works well in the environment for
which it was designed, but it is helpless outside of this niche. The designer may have
done considerable work in building the agent, but the agent may not need to do very
much to operate well. An example is a thermostat. It may be difficult to design a
thermostat so that it turns on and off at exactly the right temperatures, but the thermostat
itself does not have to do much computation. Another example is a painting robot that
always paints the same parts in an automobile factory. There may be much design time
or offline computation to get it to work perfectly, but the painting robot can paint parts
with little online computation; it senses that there is a part in position, but then it carries
out its predefined actions. These very specialized agents do not adapt well to different
environments or to changing goals. The painting robot would not notice if a different
sort of part were present and, even if it did, it would not know what to do with it. It
would have to be redesigned or reprogrammed to paint different parts or to change into
a sanding machine or a dog washing machine.

• At the other extreme is a very flexible agent that can survive in arbitrary environments
and accept new tasks at run time. Simple biological agents such as insects can adapt to
complex changing environments, but they cannot carry out arbitrary tasks. Designing
an agent that can adapt to complex environments and changing goals is a major

challenge. The agent will know much more about the particulars of a situation than the
designer. Even biology has not produced many such agents. Humans may be the only
extant example, but even humans need time to adapt to new environments.

Two broad strategies have been pursued in building agents:

• The first is to simplify environments and build complex reasoning systems for these
simple environments. For example, factory robots can do sophisticated tasks in the
engineered environment of a factory, but they may be hopeless in a natural environment.
Much of the complexity of the problem can be reduced by simplifying the environment.
This is also important for building practical systems because many environments can be
engineered to make them simpler for agents.

• The second strategy is to build simple agents in natural environments. This is inspired
by seeing how insects can survive in complex environments even though they have very
limited reasoning abilities. Researchers then make the agents have more reasoning
abilities as their tasks become more complicated.

3.4 References and Further Readings

Bobrow, D.G. (1967). Natural language input for a computer problem solving system. In M.
Minsky (Ed.), Semantic Information Processing, pp. 133-215. MIT Press, Cambridge
MA.

Brachman, R. and Levesque, H. (2004). Knowledge Representation and Reasoning. Morgan
Kaufmann.

Shapiro, S.C. (Ed.) (1992). Encyclopedia of Artificial Intelligence. Wiley, New York, second
edition.

Webber, B.L. and Nilsson, N.J. (Eds.) (1981). Readings in Artificial Intelligence. Morgan
Kaufmann, San Mateo, CA.

MODULE FOUR

STATE SPACE SEARCH PROBLEM


4.1 Learning Outcomes
The following are the main points to be learnt from this chapter:
• Definition of a state space
• What State space search is
• Importance of the search problem

4.2 State Spaces


One general formulation of intelligent action is in terms of state space. A state contains all of
the information necessary to predict the effects of an action and to determine if it is a goal state.
State-space searching assumes that

• the agent has perfect knowledge of the state space and can observe what state it is in (i.e.,
there is full observability);

• the agent has a set of actions that have known deterministic effects;
• some states are goal states, the agent wants to reach one of these goal states, and the agent
can recognize a goal state; and

• a solution is a sequence of actions that will get the agent from its current state to a

goal state.
A state-space problem consists of
• a set of states;
• a distinguished set of states called the start states;
• a set of actions available to the agent in each state;
• an action function that, given a state and an action, returns a new state;
• a set of goal states, often specified as a Boolean function, goal(s), that is true when s is a
goal state; and

• a criterion that specifies the quality of an acceptable solution. For example, any sequence
of actions that gets the agent to the goal state may be acceptable, or there may be costs
associated with actions and the agent may be required to find a sequence that has minimal
total cost. This is called an optimal solution. Alternatively, the agent may be satisfied with
any solution that is within 10% of optimal.
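The components listed above can be collected into a single problem description. The following sketch is one possible packaging in Python; the names and the toy counting instance are invented for illustration, not taken from the text:

```python
from dataclasses import dataclass
from typing import Callable, Hashable, Iterable

@dataclass
class StateSpaceProblem:
    """A state-space problem as a bundle of the components listed above."""
    start: Hashable                                   # a distinguished start state
    actions: Callable[[Hashable], Iterable]           # actions available in each state
    result: Callable[[Hashable, object], Hashable]    # action function: (state, action) -> new state
    is_goal: Callable[[Hashable], bool]               # Boolean goal function goal(s)
    cost: Callable[[Hashable, object], float] = lambda s, a: 1.0  # per-action cost

# Toy instance: count from 0 up to 5 in steps of +1 or +2.
problem = StateSpaceProblem(
    start=0,
    actions=lambda s: [1, 2],
    result=lambda s, a: s + a,
    is_goal=lambda s: s == 5,
)
print(problem.result(0, 2), problem.is_goal(5))  # 2 True
```

The default cost of 1 per action corresponds to the common case where an optimal solution is simply a shortest action sequence.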

4.3 State Space Search
An initial state is the description of the starting configuration of the agent. An action or
an operator takes the agent from one state to another state which is called a successor state. A
state can have a number of successor states.

A plan is a sequence of actions. The cost of a plan is referred to as the path cost. The path cost
is a positive number, and a common path cost may be the sum of the costs of the steps in the
path. The goal state is a partial description of the solution.

4.4 Goal Directed Agent


We earlier discussed intelligent agents. In this unit we will study a type of intelligent
agent which we will call a goal-directed agent.

A goal directed agent needs to achieve certain goals. Such an agent selects its actions based on
the goal it has. Many problems can be represented as a set of states and a set of rules
of how one state is transformed to another. Each state is an abstract representation of the
agent's environment. It is an abstraction that denotes a configuration of the agent.

Let us look at a few examples of goal directed agents.


1. 15-puzzle: The goal of an agent working on a 15-puzzle problem may be to reach a
configuration which satisfies the condition that the top row has the tiles 1, 2 and 3. The
details of this problem will be described later.

2. The goal of an agent may be to navigate a maze and reach the HOME position.
The agent must choose a sequence of actions to achieve the desired goal.

4.5 Problem Space


A problem space is a set of states and a set of operators. The operators map from one state to
another. There will be one or more states that can be called initial states, one or more
states which we need to reach, known as goal states, and there will be states between the
initial states and the goal states, known as intermediate states. So what is the solution? The
solution to the given problem is nothing but a sequence of operators that map an initial state to
a goal state. This sequence forms a solution path. What is the best solution? Obviously the
shortest path from the initial state to the goal state is the best one: the shortest path has the
fewest operations compared with all other possible solution paths.

The solution paths form a tree structure where each node is a state. So searching is nothing but
exploring this tree from the root node.

4.6 Search Problem


A search problem consists of the following:
• S: the full set of states
• s0: the initial state
• A: S → S, a set of operators
• G: the set of final states. Note that G ⊆ S.

The search problem is to find a sequence of actions which transforms the agent from the
initial state to a goal state g ∈ G. A search problem is represented by a 4-tuple {S, s0, A, G},
where S is the set of states, s0 ∈ S is the initial state, A: S → S is the set of operators/actions
that transform one state to another, and G ⊆ S is the set of goal states.

This sequence of actions is called a solution plan. It is a path from the initial state to a goal state.
A plan P is a sequence of actions, P = {a0, a1, ..., aN}, which leads to traversing a number of
states {s0, s1, ..., sN+1}, where sN+1 ∈ G. A sequence of states is called a path. The cost of a path
is a positive number. In many cases the path cost is computed by taking the sum of the costs of
each action.

4.7 Representation of search problems


A search problem is represented using a directed graph.
• The states are represented as nodes.
• The allowed actions are represented as arcs.
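As a minimal illustration, such a directed graph can be encoded as an adjacency mapping in which each state lists the states reachable by one action; the particular node names below are hypothetical:

```python
# States as nodes, allowed actions as directed arcs (hypothetical example graph).
graph = {
    "S": ["A", "D"],
    "A": ["B"],
    "D": ["E"],
    "B": ["C"],
    "E": ["G"],
    "C": [],
    "G": [],
}

def successors(state):
    """States reachable from `state` by a single action (arc)."""
    return graph.get(state, [])

print(successors("S"))  # ['A', 'D']
```

All of the search algorithms discussed later only need such a successor function; they never need the whole graph up front.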

4.8 Searching process


The generic searching process can be described very simply in terms of the following steps.
Do until a solution is found or the state space is exhausted:
1. Check the current state.
2. Execute allowable actions to find the successor states.
3. Pick one of the new states.
4. Check if the new state is a solution state.
If it is not, the new state becomes the current state and the process is repeated.
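The steps above can be sketched as a generic loop over a frontier of paths; how a state is picked in step 3 is what distinguishes one search strategy from another (FIFO below gives breadth-first behaviour). This is an illustrative sketch, not the text's own code:

```python
from collections import deque

def generic_search(start, successors, is_goal):
    """Generic search over a frontier of paths; FIFO frontier -> breadth-first."""
    frontier = deque([[start]])          # paths, each ending in a state to examine
    visited = set()
    while frontier:                      # until solution found or space exhausted
        path = frontier.popleft()        # step 3: pick one of the new states
        state = path[-1]
        if is_goal(state):               # step 4: check if it is a solution state
            return path
        if state in visited:
            continue
        visited.add(state)
        for s in successors(state):      # steps 1-2: generate successor states
            frontier.append(path + [s])
    return None                          # state space exhausted, no solution

g = {"S": ["A", "D"], "A": ["G"], "D": [], "G": []}
print(generic_search("S", g.__getitem__, lambda s: s == "G"))  # ['S', 'A', 'G']
```

Replacing the deque with a stack (depth-first) or a priority queue (best-first) changes the strategy without touching the rest of the loop.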

4.8.1 Illustration of a search process
We will now illustrate the searching process with the help of an example. Consider the
problem depicted in Figure 3.1.

 s0 is the initial state. The successor states are the adjacent states in the graph.
 There are three goal states.
 The two successor states of the initial state are generated.
 The successors of these states are picked and their successors are generated.
Example

In the 8-puzzle problem we have a 3×3 square board and 8 numbered tiles. The board has one
blank position. Tiles can be slid to adjacent blank positions. We can alternatively and
equivalently look upon this as the movement of the blank position up, down, left or right.
The objective of this puzzle is to move the tiles, starting from an initial position, and arrive
at a given goal configuration. The 15-puzzle problem is similar to the 8-puzzle. It has a 4×4
square board and 15 numbered tiles.

The state space representation for this problem is summarized below:


States: A state is a description of each of the eight tiles in each location that it can occupy.
Operators/Actions: The blank moves left, right, up or down.
Goal Test: The current state matches a certain state.
Path Cost: Each move of the blank costs 1.

A small portion of the state space of the 8-puzzle is shown below.

Note that we do not need to generate all the states before the search begins. The states can be
generated when required.
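A successor (action) function that generates states on demand can be written directly from this description. The tuple encoding of the board (read row by row, 0 for the blank) is an assumption of this sketch:

```python
def puzzle_successors(state):
    """Successors of an 8-puzzle state given as a tuple of 9 entries (0 = blank),
    read row by row on the 3x3 board. Returns (move, new_state) pairs."""
    i = state.index(0)                 # position of the blank
    row, col = divmod(i, 3)
    moves = []
    for dr, dc, name in [(-1, 0, "up"), (1, 0, "down"), (0, -1, "left"), (0, 1, "right")]:
        r, c = row + dr, col + dc
        if 0 <= r < 3 and 0 <= c < 3:  # the blank must stay on the board
            j = 3 * r + c
            s = list(state)
            s[i], s[j] = s[j], s[i]    # slide the adjacent tile into the blank
            moves.append((name, tuple(s)))
    return moves

# With the blank in the centre, all four moves are available:
print([m for m, _ in puzzle_successors((1, 2, 3, 4, 0, 5, 6, 7, 8))])
# ['up', 'down', 'left', 'right']
```

A corner blank yields only two successors, which is why states need not be generated before the search begins.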

Figure 3.2: 8-puzzle partial state space

4.9 Types of AI Search Techniques


A solution can be found with less information or with more information; it all depends on the
problem we need to solve. Usually, when we have more information it is easier to solve the
problem. The following are the types of AI search: uninformed search, list search,
tree search, graph search, SQL search, tradeoff-based search, informed search, and adversarial
search. This module will deal only with uninformed search, informed search and tree search.

4.10 REFERENCES/FURTHER READING


Dechter, R. and Pearl, J. (1985). Generalized Best-First Search Strategies and the Optimality of
A*. Journal of the ACM, 32(3): 505–536. doi:10.1145/3828.3830.

Koenig, S., Likhachev, M., Liu, Y. and Furcy, D. (2004). Incremental Heuristic Search in AI. AI
Magazine, 25(2): 99–112.

Nilsson, N.J. (1980). Principles of Artificial Intelligence. Palo Alto, California: Tioga Publishing
Company. ISBN 0-935382-01-1.

Pearl, J. (1984). Heuristics: Intelligent Search Strategies for Computer Problem Solving. Addison-
Wesley Longman Publishing Co., Inc. ISBN 0-201-05594-5.

Russell, S.J. and Norvig, P. (2003). Artificial Intelligence: A Modern Approach. Upper Saddle
River, NJ: Prentice Hall. pp. 97–104. ISBN 0-13-790395-2.

MODULE FIVE

HEURISTIC SEARCH
5.1 Learning Outcomes
At the end of this module, students should be able to:
• explain informed search
• describe best-first search
• describe greedy search
• solve simple problems on informed search.

5.2 What is Heuristic?


Heuristic search methods explore the search space "intelligently"; that is, they evaluate
possibilities without having to investigate every single possibility.

Heuristic search is an AI search technique that employs heuristics for its moves. Heuristic means
"rule of thumb". To quote Judea Pearl, "Heuristics are criteria, methods or principles for
deciding which among several alternative courses of action promises to be the most effective
in order to achieve some goal". In heuristic search, or informed search, heuristics are used to
identify the most promising search path.

In a general sense, the term heuristic is used for any advice that is often effective, but is not
guaranteed to work in every case.

5.3 Examples of Heuristic Function


A heuristic function at a node n is an estimate of the optimum cost from the current node to a
goal. It is denoted by h(n): h(n) = estimated cost of the cheapest path from node n to a goal
node.

Example 1: We want a path from Kolkata to Guwahati. A heuristic for Guwahati may be the
straight-line distance between Kolkata and Guwahati: h(Kolkata) = Euclidean
Distance(Kolkata, Guwahati).

Example 2 (8-puzzle, misplaced tiles): The heuristic is the number of tiles out of place.
The first picture shows the current state n, and the second picture the goal state;
h(n) = 5 because the tiles 2, 8, 1, 6 and 7 are out of place.

Manhattan Distance Heuristic: Another heuristic for 8-puzzle is the Manhattan distance
heuristic. This heuristic sums the distance that the tiles are out of place. The distance of a tile is
measured by the sum of the differences in the x-positions and the y-positions. For the above
example, using the Manhattan distance heuristic, h(n) = 1 + 1 + 0 + 0 + 0 + 1 + 1 + 2 = 6
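Both heuristics can be written down in a few lines. The goal layout below is an assumption made for illustration; the text's figures, which are not reproduced here, may use a different goal configuration:

```python
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)   # assumed goal layout, row by row, 0 = blank

def misplaced_tiles(state, goal=GOAL):
    """Number of tiles (blank excluded) not in their goal position."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def manhattan(state, goal=GOAL):
    """Sum over tiles of |dx| + |dy| between current and goal positions."""
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue
        j = goal.index(tile)
        total += abs(i // 3 - j // 3) + abs(i % 3 - j % 3)
    return total

# Swapping the first two tiles of the goal misplaces two tiles, each one step away:
state = (2, 1, 3, 4, 5, 6, 7, 8, 0)
print(misplaced_tiles(state), manhattan(state))  # 2 2
```

Manhattan distance dominates the misplaced-tiles count (it is never smaller), which generally makes it the more informative of the two.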

5.4 Best-First Search


Best-first search is a search algorithm which explores a graph by expanding the most
promising node chosen according to a specified rule. Judea Pearl described best-first search
as estimating the promise of node n by a "heuristic evaluation function f(n) which, in general,
may depend on the description of n, the description of the goal, the information gathered
by the search up to that point, and, most important, on any extra knowledge about the problem
domain".

Uniform Cost Search is a special case of the best first search algorithm. The algorithm maintains
a priority queue of nodes to be explored. A cost function f(n) is applied to each node. The nodes
are put in OPEN in the order of their f values. Nodes with smaller f(n) values are expanded
earlier.
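A best-first loop with a priority queue (the OPEN list) can be sketched as follows; passing f(n, g) = g turns it into Uniform Cost Search. This is an illustrative sketch under an assumed interface (successors yields (state, step_cost) pairs), not a definitive implementation:

```python
import heapq

def best_first(start, successors, is_goal, f):
    """Best-first search: OPEN is a priority queue ordered by f; nodes with
    smaller f values are expanded earlier. f(state, g) may use the path cost g."""
    open_list = [(f(start, 0), 0, start, [start])]
    closed = set()
    while open_list:
        _, g, state, path = heapq.heappop(open_list)  # smallest f first
        if is_goal(state):
            return path, g
        if state in closed:
            continue
        closed.add(state)
        for nxt, step in successors(state):
            g2 = g + step
            heapq.heappush(open_list, (f(nxt, g2), g2, nxt, path + [nxt]))
    return None, float("inf")

# Uniform Cost Search as the special case f(n, g) = g, on a hypothetical graph:
graph = {"S": [("A", 1), ("B", 5)], "A": [("G", 1)], "B": [("G", 1)], "G": []}
path, cost = best_first("S", graph.__getitem__, lambda s: s == "G", lambda n, g: g)
print(path, cost)  # ['S', 'A', 'G'] 2
```

Supplying a different f (for example one involving a heuristic) gives the other best-first variants discussed below without changing the loop.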

5.5 Greedy Search


In greedy search, the idea is to expand the node with the smallest estimated cost to reach
the goal. We use a heuristic function f(n) = h(n). h(n) estimates the distance remaining to a goal.
A greedy algorithm is any algorithm that follows the problem solving heuristic of making
the locally optimal choice at each stage with the hope of finding the global optimum. In
general, greedy algorithms are used for optimization problems. Greedy algorithms often
perform very well. They tend to find good solutions quickly, although not always optimal ones.
The resulting algorithm is not optimal. The algorithm is also incomplete, and it may fail to find
a solution even if one exists. This can be seen by running greedy search on the following
example. A good heuristic for the route-finding problem would be straight-line distance to the
goal.

S is the starting state, G is the goal state. Figure 2 shows an example of a route-finding
problem, and Figure 3 shows the straight-line distance heuristic estimates for the nodes. Let
us run the greedy search algorithm on the graph given in Figure 2, using the estimates in
Figure 3.

Step 1: S is expanded. Its children are A and D.

Step 2: D has smaller cost and is expanded next.
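The failure mode can be made concrete with a small sketch. The graph, arc costs and heuristic values below are invented for this illustration (they are not those of Figures 2 and 3): node A looks closer to the goal but lies on a dearer route.

```python
import heapq

def greedy_search(graph, h, start, goal):
    """Greedy best-first search: always expand the node with the smallest
    h(n), ignoring the path cost accumulated so far."""
    frontier = [(h[start], start, [start], 0)]   # (h, node, path, cost)
    visited = set()
    while frontier:
        _, node, path, cost = heapq.heappop(frontier)
        if node == goal:
            return path, cost
        if node in visited:
            continue
        visited.add(node)
        for nbr, arc in graph[node].items():
            heapq.heappush(frontier, (h[nbr], nbr, path + [nbr], cost + arc))
    return None

graph = {"S": {"A": 2, "D": 2}, "A": {"G": 10}, "D": {"G": 3}, "G": {}}
h = {"S": 6, "A": 1, "D": 4, "G": 0}  # A misleads the heuristic
path, cost = greedy_search(graph, h, "S", "G")
print(path, cost)  # ['S', 'A', 'G'] 12 -- not the optimal S-D-G route of cost 5
```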

5.6 A* Search
A* search is a combination of lowest-cost-first and best-first searches that considers both path
cost and heuristic information in its selection of which path to expand. For each path on the
frontier, A* uses an estimate of the total path cost from a start node to a goal node constrained
to start along that path. It uses cost(p), the cost of the path found, as well as the heuristic
function h(p), the estimated path cost from the end of p to the goal.

For any path p on the frontier, define f(p)=cost(p)+h(p). This is an estimate of the total path cost
to follow path p then go to a goal node.

If n is the node at the end of path p, this can be depicted as follows:


         cost(p)          h(p)
         (actual)       (estimate)
start ------------> n ------------> goal
      |-------------------------------|
                    f(p)

If h(n) is an underestimate of the path costs from node n to a goal node, then f(p) is an underestimate
of a path cost of going from a start node to a goal node via p.
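A minimal sketch of this definition follows; the graph, arc costs and heuristic values are invented for illustration. Because h underestimates the true remaining costs, the first solution returned is a least-cost one:

```python
import heapq

def astar(graph, h, start, goal):
    """A*: expand the frontier path p with smallest f(p) = cost(p) + h(n),
    where n is the node at the end of p."""
    frontier = [(h[start], 0, start, [start])]   # (f, cost, node, path)
    best_cost = {}
    while frontier:
        f, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in best_cost and best_cost[node] <= cost:
            continue                             # already reached more cheaply
        best_cost[node] = cost
        for nbr, arc in graph[node].items():
            g = cost + arc
            heapq.heappush(frontier, (g + h[nbr], g, nbr, path + [nbr]))
    return None

graph = {"S": {"A": 2, "D": 2}, "A": {"G": 10}, "D": {"G": 3}, "G": {}}
h = {"S": 5, "A": 1, "D": 3, "G": 0}   # admissible underestimates
print(astar(graph, h, "S", "G"))       # (5, ['S', 'D', 'G'])
```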

These strategies can be compared by asking whether each is guaranteed to halt on a (possibly
infinite) graph in which every node has finitely many neighbours and arc costs are bounded
below by some positive lower bound. Those search strategies where the answer is "Yes" have
worst-case time complexity which increases exponentially with the size of the path length. Those
algorithms that are not guaranteed to halt have infinite worst-case time complexity. Space refers
to the space complexity, which is either "Linear" in the path length or "Exponential" in the path
length.

The depth-first methods are linear in space with respect to the path lengths explored but are not
guaranteed to find a solution if one exists. Breadth-first, lowest-cost-first, and A* may be
exponential in both space and time, but they are guaranteed to find a solution if one exists, even
if the graph is infinite (as long as there are finite branching factors and positive nontrivial arc
costs).

Lowest-cost-first and A* searches are guaranteed to find the least-cost solution as the first solution
found.

REFERENCES/FURTHER READING
Lowerre, B. (1976). The Harpy Speech Recognition System, Ph.D. thesis, Carnegie Mellon
University.

Russell, S. J. & Norvig, P. (2003). Artificial Intelligence: A Modern Approach (2nd Ed.). Upper
Saddle River, New Jersey: Prentice Hall, pp. 111–114, ISBN 0-13-790395-2.

Zhou, R. & Hansen, E. (2005). Beam-Stack Search: Integrating Backtracking with Beam Search.
[Link]

MODULE SIX
NATURAL LANGUAGE PROCESSING
6.1 Learning Outcomes
At the end of this unit, students should be able to:
• Describe the history of natural language processing
• List the major tasks in NLP
• Mention the different types of evaluation of NLP.
6.2 Natural Language Processing
Natural language processing (NLP) is a collection of techniques used to extract grammatical
structure and meaning from input in order to perform a useful task; as a counterpart, natural
language generation builds output based on the rules of the target language and the task at hand.
NLP is useful in tutoring systems, duplicate detection, computer-supported instruction and
database interfaces, as it provides a pathway for increased interactivity and productivity.

6.3 Natural Language Understanding


There are many advantages of natural language as a communication channel between humans
and machines. One is that the user already knows the natural language, and so does not have
to learn an artificial language or bear the burden of remembering its conventions over periods
of disuse. Occasions arise where the user knows what he or she wants the machine to do and
can express it in natural language, but does not know exactly how to express it to the machine.
A facility for machine understanding of natural language could greatly improve the efficiency
of expression in such situations, both in speed and convenience, and in decreased likelihood
of error.

                    understanding              generation
NATURAL LANGUAGE  --------------->  COMPUTER  --------------->  NATURAL LANGUAGE
     INPUT                                                           OUTPUT

6.4 Steps of Natural Language Processing


There are six phases involved in natural language processing:

 Morphological Analysis: The lexicon of a language is its vocabulary, including its words
and expressions. Morphology is the analysis, identification and description of the structure
of words. Words are generally accepted as being the smallest units of syntax, where syntax
refers to the rules and principles that govern the sentence structure of any individual
language.

 Lexical Analysis: It involves dividing a text into paragraphs, words and sentences.
 Syntactic Analysis: This involves analysing the words in a sentence to depict the
grammatical structure of the sentence. The words are transformed into a structure that
shows how the words are related to each other. Some word sequences may be rejected
if they violate the rules of the language for how words may be combined; e.g. "the girl
the go to the school" would be rejected by an English syntactic analyser.

 Semantic Analysis: This abstracts the dictionary meaning or the exact meaning of a
sentence from context. The structures which are created by the syntactic analyzer are
assigned meaning.

 Discourse Integration: The meaning of any single sentence depends upon the
sentences that precede it and may also influence the meaning of the sentences that follow it.
 Pragmatic Analysis: This means deriving the purposeful use of language in situations,
importantly those aspects of language which require world knowledge; the main focus is
on reinterpreting what was said in terms of what it actually means.
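The earliest of these phases can be sketched with plain string handling. This is a toy illustration: the "grammar" here is a single assumed forbidden bigram, whereas real systems use trained morphological analysers and full parsers:

```python
import re

def lexical_analysis(text):
    """Lexical analysis: divide text into sentences, then into words."""
    sentences = [s.strip() for s in re.split(r"[.!?]", text) if s.strip()]
    return [re.findall(r"[A-Za-z']+", s) for s in sentences]

def syntactic_check(words, forbidden):
    """Toy syntactic analysis: reject a word sequence if it contains a
    bigram the (hypothetical) grammar forbids."""
    return all((a, b) not in forbidden for a, b in zip(words, words[1:]))

forbidden_bigrams = {("the", "go")}   # assumed fragment of a grammar
tokens = lexical_analysis("The girl goes to school. The girl the go to the school.")
print(tokens[0])  # ['The', 'girl', 'goes', 'to', 'school']
print(syntactic_check([w.lower() for w in tokens[0]], forbidden_bigrams))  # True
print(syntactic_check([w.lower() for w in tokens[1]], forbidden_bigrams))  # False
```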

6.5 Major tasks in NLP


The following is a list of some of the most commonly researched tasks in NLP. Note that some
of these tasks have direct real-world applications, while others more commonly serve as
subtasks that are used to aid in solving larger tasks. What distinguishes these tasks from other
potential and actual NLP tasks is not only the volume of research devoted to them but the fact
that for each one there is typically a well-defined problem setting, a standard metric for
evaluating the task, standard corpora on which the task can be evaluated, and competitions
devoted to the specific task.

• Automatic summarization.
• Coreference resolution.
• Machine translation.
• Morphological segmentation.

• Named entity recognition (NER).
• Natural language generation.
• Optical character recognition (OCR).
• Parsing.
• Relationship extraction.
• Sentiment analysis.
• Speech recognition.
• Speech segmentation.
• Topic segmentation and recognition.
• Word segmentation.
• Word sense disambiguation.
In some cases, sets of related tasks are grouped into subfields of NLP that are often considered
separately from NLP as a whole. Examples include:

• Information retrieval (IR)


• Information extraction (IE)
• Speech processing

6.6 References and Further Reading


Bates, M. (1995). Models of Natural Language Understanding. Proceedings of the National
        Academy of Sciences of the United States of America, Vol. 92, No. 22 (Oct. 24, 1995),
        pp. 9977–9982.

Elaine, R. and Kevin, K. (2006). Artificial Intelligence. McGraw-Hill Companies Inc., Chapter 15,
        pp. 377–426.

George, F. L. (2002). Artificial Intelligence: Structures and Strategies for Complex Problem
        Solving. Chapter 15, pp. 619–632.

Natural Language Processing, [Link], downloaded March 18, 2015.

Stuart, R. and Peter, N. (2002). Artificial Intelligence: A Modern Approach. Prentice Hall,
        Chapter 23, pp. 834–861.

MODULE SEVEN
PATTERN RECOGNITION
7.1 Learning Outcomes
At the end of this lesson, students should be able to:

• Define pattern recognition and its applications.

• Know its usage.

7.2 Pattern
A pattern is an entity, vaguely defined, that could be given a name, e.g. fingerprint image,
handwritten word, human face, speech signal, DNA sequence. Patterns can be represented as

(i) Vectors of real numbers,

(ii) Lists of attributes,
(iii) Descriptions of parts and their relationships.

7.3 Pattern Recognition


Pattern recognition techniques are used to automatically classify physical objects (2D or 3D) or
abstract multidimensional patterns (n points in d dimensions) into known or possibly unknown
categories. A number of commercial pattern recognition systems exist for character recognition,
handwriting recognition, document classification, fingerprint classification, speech and speaker
recognition, white blood cell (leukocyte) classification, military target recognition among
others.

The design of a pattern recognition system requires the following modules: sensing, feature
extraction and selection, decision making, and system performance evaluation. The availability
of low cost and high resolution sensors (e.g., CCD cameras, microphones and scanners) and
data sharing over the Internet have resulted in huge repositories of digitized documents (text,
speech, image and video). Need for efficient archiving and retrieval of this data has fostered the
development of pattern recognition algorithms in new application domains (e.g., text, image
and video retrieval, bioinformatics, and face recognition).

7.3.1 Models in Pattern Recognition


Pattern recognition systems can be designed using the following main approaches:

1) Template matching.

2) Statistical pattern recognition.

3) Artificial Neural Networks.


4) Syntactic pattern recognition.
Approach                Representation              Recognition Function            Typical Criterion
Template matching       Samples, pixels, curves     Correlation, distance measure   Classification error
Statistical             Features                    Discriminant function           Classification error
Syntactic or structural Primitives                  Rules, grammar                  Acceptance error
Neural network          Samples, pixels, features   Network function                Mean square error

Table 1: Models in Pattern Recognition

7.3.2 Important Issues in the Design of a PR System

- Definition of pattern classes.
- Sensing environment.
- Pattern representation.
- Feature extraction and selection.
- Cluster analysis.
- Selection of training and test examples.
- Performance evaluation.

7.3.3 Design of a Pattern Recognition System

Collect data --> Select features --> Select model --> Train classifier --> Evaluate classifier

The Design Cycle: a pattern recognition system is designed in various steps, expressed below:


(i) Data collection: During this step collect training and testing data. Next the question arises,
how can we know when we have adequately large and representative set of samples?

(ii) Feature selection: During this step various details have to be investigated, such as: domain
     dependence and prior information; computational cost and feasibility; discriminative
     features (similar values for similar patterns, different values for different patterns); invariant
     features with respect to translation, rotation and scale; robust features with respect to
     occlusion, distortion, deformation, and variations in environment.

(iii) Model selection: During this phase, select models based on the following criteria: domain
     dependence and prior information, definition of design criteria, parametric vs. non-parametric
     models, handling of missing features, and computational complexity. The various types
     of models are: templates, decision-theoretic or statistical, syntactic or structural, neural, and
     hybrid.

(iv) Training: The training phase deals with how we can learn the rule from data. In supervised
     learning, a teacher provides a category label or cost for each pattern in the training set.
     In unsupervised learning, the system forms clusters or natural groupings of the input patterns.
     In reinforcement learning, no desired category is given, but the teacher provides feedback to
     the system, such as whether the decision is right or wrong.

(v) Evaluation: During this phase in the design cycle, some questions have to be answered, such
     as: How can we estimate the performance with the training samples? How can we predict the
     performance with future data? How do we handle the problems of overfitting and
     generalization?

7.3.4 Process Design of a Pattern Recognition System

The pattern recognition process has the following steps.


1) Data acquisition and sensing: Measurements of physical variables like bandwidth,
resolution, sensitivity, distortion, SNR, latency, etc.

2) Pre-processing: Removal of noise in data, isolation of patterns of interest from the
background.

3) Feature extraction: Finding a new representation in terms of features


4) Model learning and estimation: Learning a mapping between features and pattern groups
and categories.

5) Classification: Using features and learned models to assign a pattern to a category.


6) Post-processing: Evaluation of confidence in decisions, exploitation of context to improve
performance, combination of experts.
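The steps above can be sketched end-to-end with a toy minimum-distance (nearest-centroid) classifier; the two-feature inspection data below is invented for illustration:

```python
# Toy pattern recognition pipeline: feature vectors -> model learning
# (class centroids) -> classification by minimum distance.

def train_centroids(samples):
    """Model learning: average the feature vectors of each class."""
    centroids = {}
    for features, label in samples:
        sums, count = centroids.get(label, ([0.0] * len(features), 0))
        centroids[label] = ([s + f for s, f in zip(sums, features)], count + 1)
    return {lbl: [s / n for s in sums] for lbl, (sums, n) in centroids.items()}

def classify(features, centroids):
    """Classification: assign the pattern to the nearest class centroid."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda lbl: sqdist(features, centroids[lbl]))

training = [([1.0, 1.2], "defective"), ([0.9, 1.0], "defective"),
            ([3.0, 3.1], "good"), ([3.2, 2.9], "good")]
model = train_centroids(training)
print(classify([1.1, 1.1], model))  # defective
print(classify([3.0, 3.0], model))  # good
```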

7.4 Applications of Pattern Recognition
Pattern recognition is used in many areas of science and engineering that study the structure of
observations. It is now frequently used in many applications in the manufacturing industry, health
care and the military. The table below shows some examples of pattern recognition
applications.

Problem Domain               Application                         Input Pattern                     Pattern Classes
Bioinformatics               Sequence analysis                   DNA/protein sequence              Known types of genes or patterns
Data mining                  Searching for meaningful patterns   Points in multidimensional space  Compact and well-separated clusters
Document classification      Internet search                     Text document                     Semantic categories
Document image analysis      Optical character recognition       Document image                    Alphanumeric characters, words
Industrial automation        Printed circuit board inspection    Intensity or range image          Defective/non-defective product
Multimedia database retrieval Internet search                    Video clip                        Video genres (e.g. action, dialogue)
Biometric recognition        Personal identification             Face, iris, fingerprint           Authorized users for access control
Remote sensing               Forecasting crop yield              Multispectral image               Land-use categories, growth patterns of crops
Speech recognition           Telephone directory assistance      Speech waveform                   Spoken words
Medical                      Computer-aided diagnosis            Microscopic image                 Cancerous/healthy cells
Military                     Automatic target recognition        Optical or infrared image         Target type
Natural language processing  Information extraction              Sentences                         Parts of speech

7.5 Review Questions


1. Explain Model of Pattern Recognition
2. Discuss Pattern Recognition Process

3. Mention important issues involved in pattern recognition

References and Further Reading


Fu, K.S. (1983). "A step towards unification of syntactic and statistical pattern recognition". IEEE
        Trans. on Pattern Analysis and Machine Intelligence, vol. 5, no. 2.

Jain, A.K., Duin, R.P. and Mao, J. (2000). "Statistical Pattern Recognition: A Review". IEEE
        Trans. on Pattern Analysis and Machine Intelligence, vol. 22, no. 1.

Shinde, S. P. and Deshmukh, V.P. (2011). Implementation of Pattern Recognition Techniques
        and Overview of Its Applications in Various Areas of Artificial Intelligence. International
        Journal of Advances in Engineering & Technology. ISSN: 2231-1963. [Link]

MODULE EIGHT
INTRODUCTION TO EXPERT SYSTEM

8.1 Learning Outcomes


Students should be able to:

• Define expert system.

• Describe the need for and structure of an expert system.

• Describe its uses, benefits and applications in the real world.

8.2 Definition of Expert Systems


Expert system is a computer program that simulates the thought process of a human expert to
solve complex decision problems in a specific domain. Expert systems provide expert advice
and guidance in a wide variety of activities, from computer diagnosis to delicate medical
surgery. An expert system may be viewed as a computer simulation of a human expert. These
expert systems have proven to be quite successful. Most applications of expert systems will fall
into one of the following categories:

• Interpreting and identifying.


• Predicting.
• Diagnosing.
• Designing.
• Planning.
• Monitoring.
• Debugging and testing.
• Instructing and training.
• Controlling.

8.3 The Need for Expert Systems


Expert systems are necessitated by the limitations associated with conventional human decision-
making processes, including:

1. Human expertise is very scarce.


2. Humans get tired from physical or mental workload.

3. Humans forget crucial details of a problem.
4. Humans are inconsistent in their day-to-day decisions.
5. Humans have limited working memory.
6. Humans are unable to comprehend large amounts of data quickly.
7. Humans are unable to retain large amounts of data in memory.
8. Humans are slow in recalling information stored in memory.
9. Humans are subject to deliberate or inadvertent bias in their actions.
10. Humans can deliberately avoid decision responsibilities.
11. Humans lie, hide, and die.
Coupled with these human limitations are the weaknesses inherent in conventional
programming and traditional decision-support tools. Despite the mechanistic power of
computers, they have certain limitations that impair their effectiveness in implementing human-
like decision processes. Conventional programs:

1. Are algorithmic in nature and depend only on raw machine power


2. Depend on facts that may be difficult to obtain
3. Do not make use of the effective heuristic approaches used by human experts
4. Are not easily adaptable to changing problem environments
5. Seek explicit and factual solutions that may not be possible

8.4 Benefits of Expert Systems


Expert systems offer an environment where the good capabilities of humans and the power of
computers can be incorporated to overcome many of the limitations discussed in the previous
section. Expert systems:

• Increase the probability, frequency, and consistency of making good decisions.


• Help distribute human expertise.
• Facilitate real-time, low-cost expert-level decisions by the non-expert.
• Enhance the utilization of most of the available data.
• Permit objectivity by weighing evidence without bias and without regard for the user's
personal and emotional reactions.

• Permit dynamism through modularity of structure.


• Free up the mind and time of the human expert to enable him or her to concentrate on more
creative activities.

• Encourage investigations into the subtle areas of a problem.
8.5 When to Use an Expert System

An expert system can be used when:

• Final users agree that payoff will be high


• Application is knowledge intensive
• A human expert exists
• Not a natural-language intensive application.
• A wide range of test cases are available.
• Neither creativity nor physical skills are required.

8.6 Expert System Structure


Complex decisions involve intricate combination of factual and heuristic knowledge. In order
for the computer to be able to retrieve and effectively use heuristic knowledge, the knowledge
must be organized in an easily accessible format that distinguishes among data, knowledge, and
control structures. For this reason, expert systems are organized in three distinct levels:

1. Knowledge base consists of problem-solving rules, procedures, and intrinsic data relevant to
the problem domain.

2. Working memory refers to task-specific data for the problem under consideration.
3. Inference engine is a generic control mechanism that applies the axiomatic knowledge in the
knowledge base to the task-specific data to arrive at some solution or conclusion.
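These three levels can be sketched as a tiny forward-chaining system. The rules below describe a hypothetical screening domain invented for illustration, not any real medical knowledge:

```python
# Minimal expert system skeleton: a knowledge base of IF-THEN rules, a
# working memory of known facts, and a forward-chaining inference engine.

knowledge_base = [  # hypothetical rules for a toy screening domain
    ({"fever", "rash"}, "measles_suspected"),
    ({"measles_suspected", "unvaccinated"}, "refer_to_doctor"),
]

def infer(rules, facts):
    """Inference engine: fire every rule whose conditions hold, adding its
    conclusion to working memory, until no new facts appear."""
    working_memory = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= working_memory and conclusion not in working_memory:
                working_memory.add(conclusion)
                changed = True
    return working_memory

# Working memory grows to include 'measles_suspected' and 'refer_to_doctor'.
print(infer(knowledge_base, {"fever", "rash", "unvaccinated"}))
```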

8.7 Building an Expert System


The expert system life cycle entails much more prototyping and revising than normal software
development. The three broad steps in the process are as follows. Note that each stage might be
iterated and one might go back to the previous step.

1. Problem selection & Prototype construction.


2. Formalization & Implementation.
3. Evaluation & Evolution.

8.8 References and Further Reading


Lauritzen, S.L. and Spiegelhalter, D.J. (1988). Local computations with probabilities on
        graphical structures and their application to expert systems. Journal of the Royal
        Statistical Society (Series B), vol. 50, pp. 157–224.

MODULE NINE

PROBLEM REDUCTION AND GAME PLAYING


9.1 Learning Outcomes

Students should know:

• The definition of problem reduction and And/Or graphs.

• What game playing is.
• The attributes and characteristics of game playing.

9.2 Problem Reduction (And/Or Graphs)

Decomposable problem: the idea is that any solution has to use a bridge, i.e. for a path a-z,
find a path a-z via f or a path a-z via g.

Decomposition: for the path a-z via f, find a path a-f and a path f-z; analogously for the path
a-z via g.

9.3 And/Or Graphs

In an And/Or graph, or-nodes are drawn as ellipses and and-nodes as boxes. Goal "nodes" are
sub-problems that are trivial or atomic, e.g. a direct route from a to c.

Solution tree T:
• the problem is the root node of the solution tree T
• if P is an or-node, exactly one successor (with its solution tree) is in T
• if P is an and-node, all of its successors (with their solution trees) are in T

Example: a solution tree whose leaf sub-problems are a-b and b-d.

Costs: for this, one has to consider
• arc costs
• node costs

Calculation: or-nodes have the form X-Z, and-nodes the form X-Z via Y.
• A node X-Z is primitive if X and Z are directly connected.
• In that case, the cost of node X-Z is the distance between X and Z.
• The cost of all other nodes is 0.

End games: consider
• a game with only win/loss outcomes
• 2 players, a and b
• playing alternately
• solution: a win for a

Interpretation: the game is won if a solution tree exists, i.e. the tree begins with
• an or-node: there is a choice for a, leading to
• an and-node: such that all possible choices for b lead to
• an or-node: and so on, until
• the goal: a successful solution (a win) is found.

Interpretation

It means: a has won (has a solution tree) if it is either in a winning position or it can always choose
a move leading to a losing position for b, i.e. a position such that all moves that b can choose
lead to a winning position of a (i.e. again to a solution tree).

Note: a does not have to have a solution tree. Either b could have a solution tree (in which case a
loses), or neither of them has one, so that neither player can force a win.

Endgame Algorithm (for a):

1. consider final (0-step) winning positions for a

2. compute 1-step losing positions for b, i.e. all positions for b from which all immediate successors
lead to a 0-step winning position for a

3. compute 2-step winning positions for a, i.e. all positions where a can choose one immediate
successor to lead to a 1-step losing position for b

4. compute 3-step losing positions for b, i.e. all positions for b where all successors lead to a
less-than-3-step (i.e. 2- or 0-step) winning position for a.

5. and so on, until no more new positions are collected or the maximum depth is exhausted

Result: if no maximum depth limit, the final outcome is a list of winning positions for a (with
maximum depths), a list of losing positions for b (with maximum depths) and a list of tied
positions
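The algorithm above can be sketched as retrograde analysis over an abstract game graph; the toy take-away game below is an invented example (remove 1 or 2 counters from a pile, taking the last counter wins):

```python
# Retrograde endgame analysis: label positions as wins or losses for the
# player to move, working backwards from terminal positions.

def solve(moves, terminal_loss):
    """moves maps position -> list of successor positions (for the player
    to move); positions in terminal_loss are lost for the player to move.
    Returns a dict position -> 'WIN'/'LOSS' (unlabelled = no forced result)."""
    label = {p: "LOSS" for p in terminal_loss}
    changed = True
    while changed:
        changed = False
        for pos, succs in moves.items():
            if pos in label:
                continue
            if any(label.get(s) == "LOSS" for s in succs):
                label[pos] = "WIN"      # one move reaches a lost position
                changed = True
            elif succs and all(label.get(s) == "WIN" for s in succs):
                label[pos] = "LOSS"     # every move hands the opponent a win
                changed = True
    return label

# Pile of n counters, remove 1 or 2; a pile of 0 means the player to move lost.
moves = {n: [n - k for k in (1, 2) if n - k >= 0] for n in range(7)}
print(solve(moves, {0}))  # piles 0, 3 and 6 are losses for the player to move
```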

9.4 Game playing
Game playing is the specific way in which players interact with a game, and in particular with
video games. Game playing is the pattern defined through the game rules, connection between
player and the game, challenges and overcoming them, plot and player's connection with it.

There are three components to gameplay: “Manipulation rules,” defining what the player can
do in the game, “Goal Rules,” defining the goal of the game, and “Metarules,” defining how a
game can be tuned or modified. Various gameplay types are listed below.

• Asymmetric gameplay
• Cooperative gameplay
• Death match
• Hack and slash
• Leveled gameplay
• Nonlinear gameplay
• Passive gameplay
• Twitch gameplay
The experience of gameplay is one of interacting with a game design in the performance of
cognitive tasks, with a variety of emotions arising from or associated with different elements
of motivation, task performance and completion.

9.5 Playability

Playability is the ease or quantity or duration by which the game can be played and is a common
measure of the quality of gameplay. Playability is defined as: a set of properties that describe
the Player experience using a specific game system whose main objective is to provide
enjoyment and entertainment, by being credible and satisfying, when the player plays alone or
in company. Playability is characterized by different attributes and properties to measure the
video game player experience.

• Satisfaction: the degree of gratification or pleasure of the player on completing a video
game or some aspect of it, such as its mechanics, graphics, user interface or story. Satisfaction
is a highly subjective attribute that is difficult to measure, because player preferences
and pleasures influence the satisfaction derived from specific game elements: characters,
virtual world, challenges, and so on.

• Learning: the facility to understand and master the game system and mechanics
(objectives, rules, how to interact with the video game, etc.).

• Efficiency: the necessary time and resources to offer fun and entertainment to players while
they achieve the different game objectives and reach the final goal.

• Immersion: the capacity to believe in the video game contents and integrate the player into
the virtual game world. Immersion makes the player feel involved in the virtual world,
becoming part of it and interacting with it, because the user perceives the virtual world
represented by the video game, with the laws and rules that characterize it.

• Motivation: the characteristics that prompt the player to perform concrete actions and
persist in them until their culmination.

• Emotion: the involuntary impulse originated in response to the stimulus of the video game
that induces feelings or unleashes automatic reactions and conduct. The use of emotions in
video games helps to obtain a better player experience and leads players to different emotional
states (happiness, fear, intrigue, curiosity, sadness) using the game challenges, story,
aesthetic appearance or music, which are capable of moving and affecting the player,
making him or her smile or cry.

• Socialization: the degree to which game attributes, elements and resources promote the
social dimension of the game experience in a group. Socialization allows players to
have a totally different game experience when they play with other people, and promotes
new social relationships through the interaction among them.

9.5.1 Playability's facets


Playability analysis is a very complex process due to the different points of view from which
the different parts of a video game architecture can be analyzed. Each facet allows us to identify
the different playability attributes and properties affected by the different elements of the video
game architecture. The facets of playability are:

• Intrinsic Playability.
• Mechanical Playability.
• Interactive Playability.
• Artistic Playability.
• Intrapersonal Playability or Personal Playability.
• Interpersonal Playability or Social Playability.

9.6 References and Further Reading
Adams, E. and Rollings, A. (2003). Andrew Rollings and Ernest Adams on game design. New
Riders Publishing. ISBN 1-59273-001-9.

Björk, S. and Holopainen, J. (2005). Patterns in Game Design. Charles River Media. ISBN
158450-354-8.

Lindley, C. (2004). "Narrative, Game Play, and Alternative Time Structures for Virtual
Environments". Technologies for Interactive Digital Storytelling and Entertainment:
Proceedings of TIDSE 2004. Darmstadt, Germany: Springer. pp. 183–194.

Nacke, L.E. et al. (2009). "Playability and Player Experience Research". Proceedings of
DiGRA 2009: Breaking New Ground: Innovation in Games, Play, Practice and Theory
(London, UK: DiGRA)

Salen, K. and Zimmerman, E. (2004). Rules of Play: Game Design Fundamentals.


Cambridge, Massachusetts: The MIT Press. ISBN 978-0-262-24045-1.

MODULE TEN
MACHINE LEARNING AND CONCEPT FORMATION
10.1 Learning Outcomes

Students should be able to:

• State what machine learning is and what concept formation is.

• Describe its relation to other fields and its applications.
• Describe the relation of machine learning to concept formation, and the types of concept
formation.

10.2 What is Machine Learning?


Machine Learning is a scientific discipline that addresses the following question: "How can we
program systems to automatically learn and to improve with experience?" Learning in this
context is not learning by heart but recognizing complex patterns and making intelligent decisions
based on data. The difficulty lies in the fact that the set of all possible decisions given all
possible inputs is too complex to describe. To tackle this problem, the field of Machine Learning
develops algorithms that discover knowledge from specific data and experience, based on sound
statistical and computational principles.

The field of Machine Learning integrates many distinct approaches such as probability theory,
logic, combinatorial optimization, search, statistics, reinforcement learning and control theory.

10.3 The History of Machine Learning


In 1946 the first computer system, ENIAC, was developed. At that time the word "computer"
meant a human being who performed numerical computations on paper, and ENIAC was called
a numerical computing machine. The idea at the time was that human thinking and learning
could be rendered logically in such a machine.

In 1950 Alan Turing proposed a test to measure machine intelligence. The Turing test is based on
the idea that we can only determine whether a machine can actually learn if we communicate with
it and cannot distinguish it from another human. Although no system has yet passed the Turing
test, many interesting systems have been developed.

Around 1952 Arthur Samuel (IBM) wrote the first game-playing program, for checkers, to
achieve sufficient skill to challenge a world champion. Samuel's machine learning programs
worked remarkably well and were helpful in improving the performance of checkers players.

Another milestone was the ELIZA system, developed in the early 1960s by Joseph Weizenbaum.
ELIZA simulated a psychotherapist by using tricks like string substitution and canned responses
based on keywords. When the original ELIZA first appeared, some people actually mistook her
for a human.

Although the overall performance of ELIZA was disappointing, it was a nice proof of concept.
Later on, many other systems were developed. Important among them was the work of Ted
Shortliffe's group on MYCIN (Stanford), which demonstrated the power of rule-based systems for
knowledge representation and inference in the domain of medical diagnosis and therapy. This
system is often called the first expert system.

10.4 Tasks
Machine learning tasks are typically classified into three broad categories, depending on the nature
of the learning "signal" or "feedback" available to a learning system. These are:

• Supervised learning
• Unsupervised learning
• Reinforcement learning
Another categorization of machine learning tasks arises when one considers the desired output of
a machine-learned system:

(A support vector machine, for example, is a classifier that divides its input space into two regions
separated by a linear boundary, and can learn to distinguish two classes of points.)

• In classification, inputs are divided into two or more classes, and the learner must produce a
model that assigns unseen inputs to one (or multi-label classification) or more of these classes.
Spam filtering is an example of classification.

• In regression, also a supervised problem, the outputs are continuous rather than discrete.

• In clustering, a set of inputs is to be divided into groups.
• Density estimation finds the distribution of inputs in some space.
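
The categories above can be made concrete with a toy example. The sketch below is not from the text: it is a minimal 1-nearest-neighbour classifier written in Common Lisp (the language covered in Module Eleven), with hypothetical function names DISTANCE and CLASSIFY. It labels an unseen point with the class of the closest training example — the simplest form of supervised classification.

```lisp
;; Euclidean distance between two equal-length coordinate lists.
(defun distance (a b)
  (sqrt (reduce #'+ (mapcar (lambda (x y) (expt (- x y) 2)) a b))))

;; TRAINING is a list of (features . label) pairs; return the label
;; of the training example nearest to POINT (1-nearest neighbour).
(defun classify (point training)
  (cdr (reduce (lambda (best example)
                 (if (< (distance point (car example))
                        (distance point (car best)))
                     example
                     best))
               training)))

(classify '(1 1) '(((0 0) . black) ((5 5) . white)))  ; => BLACK
```

The "learning" here is trivially memorizing the labeled examples; more sophisticated learners instead build a compact model from them.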
10.5 Concept Formation
Concept learning is also known as category learning, concept attainment, and concept formation.
It refers to a learning task in which a human or machine learner is trained to classify objects
by being shown a set of example objects along with their class labels. The learner simplifies
what has been observed by condensing it into the form of an example. Concept learning may be
simple or complex because learning takes place over many areas.

10.5.1 Types of Concepts


Concept learning must be distinguished from learning by reciting something from memory
(recall) or discriminating between two things that differ (discrimination). However, these issues
are closely related, since memory recall of facts could be considered a "trivial" conceptual
process where prior exemplars representing the concept are invariant. Similarly, while
discrimination is not the same as initial concept learning, discrimination processes are involved
in refining concepts by means of the repeated presentation of exemplars.

• Concrete or perceptual concepts vs. abstract concepts
• Defined (or relational) and associated concepts
• Complex concepts

10.5.2 Methods of Learning a Concept


• Discovery - Every baby discovers concepts for itself, such as discovering that each of its
fingers can be individually controlled. Although this is perception driven, formation of the
concept is more than memorizing perceptions.

• Words - Hearing or reading new words leads to learning new concepts, but forming a new
concept is more than learning a dictionary definition.

• Invention - When people who lacked tools used their fingernails to scrape food from killed
animals or smashed melons, they noticed that a broken stone sometimes had a sharp edge
like a fingernail and was therefore suitable for scraping food. Inventing a stone tool to avoid
broken fingernails was a new concept.

10.6 Machine learning approaches to concept learning
Unlike the situation in psychology, the problem of concept learning within machine learning is
not one of finding the 'right' theory of concept learning, but of finding the most effective method
for a given task. In the machine learning literature, concept learning is more typically called
supervised learning or supervised classification, in contrast to unsupervised learning or
unsupervised classification, in which the learner is not provided with class labels. In machine
learning, algorithms of exemplar theory are also known as instance learners or lazy learners.

10.7 References and Further Readings

Bruner, J. et al. (1967). A study of thinking. New York: Science Editions.


Feldman, J. (2003). "The Simplicity Principle in Human Concept Learning". Current Directions
in Psychological Science 12: 227–232. doi: 10.1046/j.0963-7214.2003.01267.x.

Hammer, R. (2009). "The development of category learning strategies: What makes the difference?".
Cognition 112 (1): 105–119. doi: 10.1016/[Link].2009.03.012.

Huang, T. M. et al. (2006). Kernel Based Algorithms for Mining Huge Data Sets: Supervised,
Semi-supervised, and Unsupervised Learning, Springer-Verlag, Berlin, Heidelberg, 260
pp., 96 illus., Hardcover, ISBN 3-540-31681-7.

Mohri, M. et al. (2012). Foundations of Machine Learning, The MIT Press. ISBN
9780262018258.

Duda, R. O. et al. (2001). Pattern Classification (2nd edition), Wiley, New York, ISBN
0-471-05669-3.

Theodoridis, S. and Koutroumbas, K. (2009). Pattern Recognition (4th edition), Academic Press,
ISBN 978-1-59749-272-0.

Watanabe, S. (1985). Pattern Recognition: Human and Mechanical. New York: Wiley.

[Link], Greg (Copyright 1994–2007). "Concept". Retrieved 2007-12-04.

MODULE ELEVEN
LISP LANGUAGE
11.1 Learning Outcomes
Students should be able to
• Define LISP Language
• Explain the history of LISP Language
• State the application of LISP Language
• Write a simple LISP program

11.2 LISP Language


Programming languages in Artificial Intelligence (AI) are the major tool for exploring and
building computer programs that can be used to simulate intelligent processes such as learning,
reasoning and understanding symbolic information in context.

Lisp is the first functional programming language: It was invented to support symbolic
computation using linked lists as the central data structure (Lisp stands for List Processor). LISP
was founded on the mathematical theory of recursive functions (in which a function appears in
its own definition). A LISP program is a function applied to data, rather than being a sequence
of procedural steps as in FORTRAN and ALGOL. LISP uses a very simple notation in which
operations and their operands are given in a parenthesized list. For example, (+ a (* b c)) stands
for a + b*c.
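
This functional, recursive style can be sketched with a standard textbook example (not taken from the text above): a factorial function, where the program is simply a function applied to its data.

```lisp
;; Factorial defined recursively: the function appears in its own
;; definition, in keeping with LISP's recursive-function roots.
(defun factorial (n)
  (if (<= n 1)
      1
      (* n (factorial (- n 1)))))

(factorial 5)  ; => 120
```

Note that the whole program is a single parenthesized expression; there is no sequence of procedural steps as in FORTRAN or ALGOL.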

11.3 History of LISP


LISP was invented by John McCarthy in 1958 while he was at the Massachusetts Institute of
Technology (MIT). McCarthy published its design in a paper in Communications of the ACM
in 1960, entitled "Recursive Functions of Symbolic Expressions and Their Computation by
Machine, Part I" ("Part II" was never published). He showed that with a few simple operators
and a notation for functions, one can build a Turing-complete language for algorithms. LISP
was first implemented by Steve Russell on an IBM 704 computer. Russell had read McCarthy's
paper, and realized (to McCarthy's surprise) that the Lisp eval function could be implemented
in machine code. The result was a working Lisp interpreter which could be used to run Lisp
programs, or more properly, 'evaluate Lisp expressions.'

The first complete Lisp compiler, written in Lisp, was implemented in 1962 by Tim Hart and
Mike Levin at MIT. This compiler introduced the Lisp model of incremental compilation, in which
compiled and interpreted functions can intermix freely. Lisp was a difficult system to implement
with the compiler techniques and stock hardware of the 1970s. Garbage collection routines,
developed by then-MIT graduate student Daniel Edwards, made it practical to run Lisp on general-
purpose computing systems, but efficiency was still a problem. This led to the creation of Lisp
machines: dedicated hardware for running Lisp environments and programs. Advances in both
computer hardware and compiler technology soon made Lisp machines obsolete.
During the 1980s and 1990s, a great effort was made to unify the work on new Lisp dialects
(mostly successors to MacLisp like ZetaLisp and NIL (New Implementation of Lisp)) into a
single language. The new language, Common Lisp, was somewhat compatible with the dialects
it replaced (the book Common Lisp the Language notes the compatibility of various constructs).
In 1994, ANSI published the Common Lisp standard, "ANSI X3.226-1994 Information
Technology Programming Language Common Lisp."

11.4 Significant Language Features


• Atoms & Lists - Lisp uses two different types of data structures, atoms and lists. Atoms are
similar to identifiers, but can also be numeric constants. Lists can be lists of atoms, lists, or
any combination of the two.

• Functional Programming Style - all computation is performed by applying functions to
arguments. Variable declarations are rarely used.

• Uniform Representation of Data and Code - for example, the list (A B C D), interpreted as
data, is a list of four elements; interpreted as code, it is the application of the function
named A to the three arguments B, C, and D.

• Reliance on Recursion - a strong reliance on recursion has allowed Lisp to be successful in
many areas, including Artificial Intelligence.

• Garbage Collection - Lisp has built-in garbage collection, so programmers do not need to
explicitly free dynamically allocated memory.
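
The uniform representation of data and code listed above can be sketched as follows; the variable name *EXPR* is illustrative. The same list is data when quoted and code when evaluated.

```lisp
;; A quoted list is plain data: four elements, the first of which
;; happens to name a function.
(defparameter *expr* '(+ 1 2 3))

(car *expr*)   ; => +   (inspected as data)
(eval *expr*)  ; => 6   (interpreted as code)
```

This property is what makes Lisp macros possible: programs can build and transform other programs as ordinary lists.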

11.5 Applications of LISP


Below is a short list of the areas where Lisp is used:

• Symbolic Algebraic Manipulation
• Pattern Recognition
• Robotics.
• Expert System: Diagnosis, Identification and Design.

• Natural Language Understanding.
• Machine Translation.
• Formal Logical Reasoning.
• Perception (Vision, Speech, Understanding).
• Implementation of Real-Time, embedded Knowledge-Based Systems
• List Handling and Processing
• Tree Traversal (Breadth/Depth-First Search)

11.6 Advantages of LISP


Common Lisp is well suited to large programming projects and explorative programming. The
language has a dynamic semantics which distinguishes it from languages such as C and Ada. It
features automatic memory management, an interactive incremental development environment,
a module system, a large number of powerful data structures, a large standard library of useful
functions, a sophisticated object system supporting multiple inheritance and generic functions,
an exception system, user-defined types and a macro system which allows programmers to
extend the language.

• Common Lisp is a general-purpose, interactive programming language.

• Common Lisp programs are
o easy to test (interactive)
o easy to maintain (depending on programming style)
o portable across hardware/OS platforms and implementations.

• Common Lisp provides
o clear syntax and carefully designed semantics
o several data types: numbers, strings, arrays, lists, characters, symbols, structures, streams, etc.
o runtime typing: the programmer need not bother about type declarations, but gets notified on type violations
o many generic functions: 88 arithmetic functions for all kinds of numbers (integers, ratios, floating-point numbers, complex numbers), 44 search/filter/sort functions for lists, arrays and strings
o automatic memory management (garbage collection)
o packaging of programs into modules
o an object system, with generic functions and powerful method combination
o macros: every programmer can make his own language extensions.

11.7 The List Data Type
Programming in Lisp actually means defining functions that operate on lists, e.g., create,
traverse, copy, modify and delete lists. Since this is central to Lisp, every Lisp system comes
with a basic set of primitive built-in functions that efficiently support the main list operations.
Type predicate: First, we have to know whether a given s-expression is a list or not (i.e., an
atom). This job is accomplished by the function LISTP, which expects any s-expression EXPR
as an argument and returns the symbol T if EXPR is a list and NIL otherwise.

Examples (the right arrow ⇒ will be used to point to the result of a function call):
(LISTP '(1 2 3)) ⇒ T
(LISTP '()) ⇒ T
(LISTP '3) ⇒ NIL
Selection of list elements: There are two basic functions for accessing the elements of a list:
CAR and CDR; both expect a list as their argument. The function CAR returns the first element
of the list, or NIL if the empty list is the argument; CDR returns the same list with the first
element removed, or NIL if the empty list was the argument.

Examples:
(CAR '(A B C)) ⇒ A
(CDR '(A B C)) ⇒ (B C)
(CAR '()) ⇒ NIL
(CDR '(A)) ⇒ NIL
(CAR '((A B) C)) ⇒ (A B)
(CDR '((A B) C)) ⇒ (C)
By means of a sequence of CAR and CDR function calls, it is possible to traverse a list from left
to right and from outer to inner list elements. For example, during evaluation of

(CAR (CDR '(SEE THE QUOTE)))

the Lisp interpreter will first evaluate the expression

(CDR '(SEE THE QUOTE))

which returns the list (THE QUOTE); this is then passed to the function CAR, which returns the
symbol THE. Here are some further examples:

(CAR (CDR (CDR '(SEE THE QUOTE)))) ⇒ QUOTE
(CAR (CDR (CDR (CDR '(SEE THE QUOTE))))) ⇒ NIL
(CAR (CAR '(SEE THE QUOTE))) ⇒ ???
What will happen during evaluation of the last example? Evaluation of (CAR '(SEE THE
QUOTE)) returns the symbol SEE. This is then passed as argument to the outer call of CAR.
However, CAR expects a list as argument, so the Lisp interpreter will immediately stop further
evaluation with an error like ERROR: ATTEMPT TO TAKE THE CAR OF SEE, WHICH IS
NOT A LIST.

CAR stands for "contents of address register" and CDR stands for "contents of decrement
register". In order to write more readable Lisp code, Common Lisp comes with two equivalent
functions, FIRST and REST. We have used the older names here as they enable reading and
understanding of older AI Lisp code.
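
As a short sketch of the modern names (assuming any Common Lisp system), the two pairs behave identically:

```lisp
(first '(see the quote))  ; => SEE          (same as CAR)
(rest  '(see the quote))  ; => (THE QUOTE)  (same as CDR)
```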

Construction of lists: Analogously to CAR and CDR, a primitive function CONS exists which
is used to construct a list. CONS expects two s–expressions and inserts the first one as a new
element in front of the second one. Consider the following examples:

(CONS 'A '(B C)) ⇒ (A B C)
(CONS '(A D) '(B C)) ⇒ ((A D) B C)
(CONS (FIRST '(1 2 3)) (REST '(1 2 3))) ⇒ (1 2 3)
In principle, CONS together with the empty list suffice to build very complex lists, for example:

(CONS 'A (CONS 'B (CONS 'C '()))) ⇒ (A B C)
(CONS 'A (CONS (CONS 'B (CONS 'C '())) (CONS 'D '()))) ⇒ (A (B C) D)
However, since this is quite cumbersome work, most Lisp systems come with a number of more
advanced built-in list functions.
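
As a sketch combining the three primitives above, a list can be rebuilt element by element with a recursive function; the name MY-COPY-LIST is hypothetical (Common Lisp's built-in equivalent is COPY-LIST).

```lisp
;; Rebuild a list using only CAR, CDR and CONS: keep the first
;; element and recurse on the rest, until the empty list is reached.
(defun my-copy-list (lst)
  (if (null lst)
      '()
      (cons (car lst)
            (my-copy-list (cdr lst)))))

(my-copy-list '(a (b c) d))  ; => (A (B C) D)
```

Most list-processing functions in Lisp follow this same pattern: a base case for the empty list and a CONS of the first element onto a recursive call on the rest.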

11.8 References
Novak, G. S. Jr. Lisp Programming Lecture Notes, AI-TR-85-06.
[Link]

Neumann, G. Programming Languages in Artificial Intelligence, German Research Center for
Artificial Intelligence (LT-Lab, DFKI). [Link]
