AI and ML - Units 1-4 (Notes)

Artificial intelligence and machine learning involve the study of how to make computers behave intelligently, as humans do. Key areas of AI include perception, reasoning, learning, language understanding, problem solving, and robotics. While modern AI has achieved success in tasks such as computer vision, robotics, and language processing, it cannot yet match human-level intelligence in robustly understanding language, learning, and exhibiting true autonomy. These notes discuss the history and applications of AI, including game playing, theorem proving, natural language processing, vision, speech, robotics, and expert systems.

Artificial Intelligence and Machine Learning

Unit 01:
What is Artificial Intelligence?

Definition:
Artificial Intelligence is the branch of computer science concerned with the
study of how to make computers do things which, at the moment, people do
better.
Artificial Intelligence is concerned with the design of intelligence in an artificial
device.
The term was coined by John McCarthy in 1956.
There are two ideas in the definition.
1. Intelligence
2. Artificial device
What is intelligence?
Accordingly, there are two possibilities:
– A system with intelligence is expected to behave as intelligently as a human
– A system with intelligence is expected to behave in the best possible manner.
Intelligent behavior
This discussion brings us back to the question of what constitutes intelligent
behavior. Some of these tasks and applications are:
i. Perception involving image recognition and computer vision
ii. Reasoning
iii. Learning
iv. Understanding language involving natural language processing, speech
processing
v. Solving problems
vi. Robotics
Today’s AI systems have been able to achieve limited success in some of
these tasks.
• In Computer vision, the systems are capable of face recognition
• In Robotics, we have been able to make vehicles that are mostly autonomous.
• In Natural language processing, we have systems that are capable of simple
machine translation.

Dr.Harish Naik T,DCA,Presidency College. Page 1


• Today’s expert systems can carry out medical diagnosis in a narrow domain.
• Speech understanding systems are capable of recognizing several thousand
words of continuous speech.
• Planning and scheduling systems have been employed in scheduling
experiments.
Achievements of Artificial Intelligence:
1. ALVINN:
Autonomous Land Vehicle In a Neural Network
The system drove a car from the East Coast of the USA to the West Coast, a total
of about 2850 miles. Of this, about 50 miles were driven by a human, and the
rest solely by the system.
2. Deep Blue
In 1997, the Deep Blue chess program created by IBM beat the reigning world
chess champion, Garry Kasparov.
3. Machine translation
A system capable of translations between people speaking different languages
4. Autonomous agents
In space exploration, robotic space probes autonomously monitor their
surroundings, make decisions and act to achieve their goals.
NASA's Mars rovers successfully completed their primary three-month missions
in April 2004.
5. Internet agents
The explosive growth of the internet has also led to growing interest in internet
agents that monitor users' tasks, seek needed information, and learn which
information is most useful.
What can AI systems NOT do yet?
• Understand natural language robustly (e.g., read and understand articles in a
newspaper)
• Surf the web
• Interpret an arbitrary visual scene
• Learn a natural language
• Construct plans in dynamic real-time domains
• Exhibit true autonomy and intelligence
We will now look at a few famous AI systems that have been developed over the
years.



****Applications of Artificial Intelligence
a) Game Playing
b) Speech Recognition
c) Understanding natural language
d) Expert System
e) Robotics
f) Computer Vision
g) E-Commerce
h) Theorem proving
a) Game playing
Game playing is a search problem defined by:
– Initial state
– Successor function
– Goal test
– Path cost / utility / payoff function
b) Theorem Proving:
Theorem proving has the property that people who do it well are considered to be
displaying intelligence.
There are two basic methods of theorem proving:
1. Start with the given axioms, use the rules of inference, and prove the theorem.
2. Prove that the negation of the result cannot be TRUE.
c) Natural Language Processing:
The goal of natural language processing is to enable people and computers to
communicate in a “natural” (human) language, such as English, rather than in a
computer language.
Natural language generation strives to have computers produce ordinary
English so that people can understand computers more easily.
d) Vision and Speech Processing.
The goal of speech processing research is to allow computers to understand human
speech so that they can hear our voices and recognize the words we are speaking.
The goal of computer vision research is to give computers this same powerful facility
for understanding their surroundings. Currently, one of the primary uses of computer
vision is in the area of robotics.
e) Robotics
A robot is an electro-mechanical device that can be programmed to perform manual
tasks. The Robotic Industries Association formally defines a robot as “a



reprogrammable multi-functional manipulator designed to move material, parts, tools
or specialized devices through variable programmed motions for the performance of a
variety of tasks.”
f) Expert System

An expert system is a computer program designed to act as an expert in a particular
domain (area of expertise). Also known as a knowledge-based system, an expert
system typically includes a sizable knowledge base, consisting of facts about the
domain and heuristics (rules) for applying those facts.

** History of Artificial Intelligence:


 Aristotle (384-322 BC) developed an informal system of syllogistic logic, which
is the basis of the first formal deductive reasoning system.
 Early in the 17th century, Descartes proposed that bodies of animals are nothing
more than complex machines.
 Pascal in 1642 made the first mechanical digital calculating machine.
 In the 19th century, George Boole developed a binary algebra representing
(some) "laws of thought."
 Charles Babbage & Ada Byron worked on programmable mechanical
calculating machines.
 In the late 19th century and early 20th century, mathematical philosophers like
Gottlob Frege, Bertrand Russell, Alfred North Whitehead, and Kurt Gödel built
on Boole's initial logic concepts to develop mathematical representations of
logic problems.


 In 1950, Turing published the paper “Computing Machinery and Intelligence”.
 In 1956, a famous conference took place at Dartmouth College. The conference
brought together the founding fathers of artificial intelligence for the first time.
At this meeting the term “Artificial Intelligence” was adopted.
 In 1963, Edward A. Feigenbaum & Julian Feldman published Computers and
Thought, the first collection of articles about artificial intelligence.
 In 1965, J. Alan Robinson invented a mechanical proof procedure, the
Resolution Method.
 The years from 1969 to 1979 marked the early development of knowledge-based
systems.
 In the 1980s, Lisp machines were developed and marketed.
 Around 1985, neural networks returned to popularity.
 In 1988, there was a resurgence of probabilistic and decision-theoretic methods.
 The 1990s saw major advances in all areas of AI, including the following:
• machine learning, data mining
• intelligent tutoring,
• case-based reasoning,
• multi-agent planning, scheduling,
• uncertain reasoning,
• natural language understanding and translation,
• vision, virtual reality, games, and other topics.



What is an agent? Explain agents with examples.

An agent is something that perceives and acts. An agent acts in an environment. An
agent perceives its environment through sensors.
The complete set of inputs at a given time is called a percept. The current percept, or a
sequence of percepts, can influence the actions of an agent. The agent can change the
environment through actuators or effectors. An operation involving an effector is
called an action. Actions can be grouped into action sequences. The agent can have
goals which it tries to achieve.
Thus, an agent can be looked upon as a system that implements a mapping from
percept sequences to actions.
Examples of Agents :
An agent is something that acts in an environment - it does something. Agents include
thermostats, airplanes, robots, humans, companies, and countries. We are interested in
what an agent does; that is, how it acts. We judge an agent by its actions.
1. Humans can be looked upon as agents. They have eyes, ears, skin, taste buds, etc.
for sensors; and hands, fingers, legs, mouth for effectors.
2. Robots are agents. Robots may have cameras, sonar, infrared, bumpers, etc. for
sensors. They can have grippers, wheels, lights, speakers, etc. for actuators.
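
The percept-to-action mapping described above can be sketched as a minimal Python agent. The thermostat below is a hypothetical example (the class name, target temperature, and action strings are illustrative, not from the notes):

```python
class ThermostatAgent:
    """A minimal reflex agent: maps each percept (a sensed temperature) to an action."""

    def __init__(self, target=21.0):
        self.target = target  # desired temperature in degrees Celsius

    def act(self, percept):
        # The percept comes from a temperature sensor; the returned action
        # is what an effector (the heater switch) would carry out.
        if percept < self.target - 1:
            return "heat_on"
        elif percept > self.target + 1:
            return "heat_off"
        return "do_nothing"

agent = ThermostatAgent()
print(agent.act(18.0))  # → heat_on
```

Even this trivial agent fits the definition: it implements a mapping from percepts to actions that change its environment.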

PROBLEM SOLVING
The steps required to build a system to solve a particular problem are:

1. Problem definition, which must include precise specifications of what the
initial situation will be, as well as what final situations constitute acceptable
solutions to the problem.
2. Problem analysis, which can have immense impact on the appropriateness of
various possible techniques for solving the problem.
3. Selection of the best technique(s) for solving the particular problem.

***Define the problem as A state Space Search:-

Consider the problem of playing chess. To build a program that could play chess, we
have to specify the starting position of the chess board, the rules that define legal
moves, and the board positions that represent a win. The goal of winning the game,
if possible, must be made explicit.



The starting position can be described by an 8 × 8 array in which each element
square(x, y), with x varying from 1 to 8 and y varying from 1 to 8, describes the
board position of an appropriate chess piece. The goal is any board position in which
the opponent does not have a legal move and his or her king is under attack. The
legal moves provide the way of getting from the initial state to a final state.

The legal moves can be described as a set of rules consisting of two parts: A left side
that gives the current position and the right side that describes the change to be made to
the board position. An example is shown in the following figure.

Current Position:
    pawn at square (5, 2)
    AND square (5, 3) is empty
    AND square (5, 4) is empty
Change to Board Position:
    move pawn from square (5, 2) to square (5, 4)

The current position of a piece on the board is its STATE, and the set of all possible
STATES is the STATE SPACE. One or more states where the problem terminates
form the FINAL STATE or GOAL STATE. The state space representation forms the
basis of most of the AI methods.
It allows for a formal definition of the problem as the need to convert some given
situation into some desired situation using a set of permissible operations. It permits
the problem to be solved with the help of known techniques and control strategies
that move through the problem space until a goal state is found.

***State Space Search Notations :

Let us begin by introducing certain terms. An initial state is the description of the
starting configuration of the agent. An action or an operator takes the agent from one
state to another state, which is called a successor state. A state can have a number of
successor states.

A plan is a sequence of actions. The cost of a plan is referred to as the path cost. The
path cost is a positive number, and a common path cost is the sum of the costs of
the steps in the path.



Example for state space problem is water jug problem.

****Explain Water Jug Problem with proper production rules

Statement: We are given two jugs, a 4-liter one and a 3-liter one. Neither has any
measuring marks on it. There is a pump that can be used to fill the jugs with water.
How can we get exactly 2 liters of water into the 4-liter jug?

Solution:-

The state space for this problem can be defined as

{ (i, j) | i = 0, 1, 2, 3, 4 and j = 0, 1, 2, 3 }

where ‘i’ represents the number of liters of water in the 4-liter jug and ‘j’ represents
the number of liters of water in the 3-liter jug. The initial state is (0, 0), i.e., no water
in either jug. The goal state is to reach (2, n) for any value of ‘n’.

To solve this we have to make some assumptions not mentioned in the problem. They
are

1. We can fill a jug from the pump.

2. We can pour water out of a jug to the ground.

3. We can pour water from one jug to another.

4. There is no measuring device available.

The various operators (production rules) available to solve this problem can be stated
as rules of the form precondition → action.
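
The original figure listing the rules is not reproduced here; the sketch below restates the standard textbook rule set in Python, together with a breadth-first search over the state space (the function names are illustrative):

```python
from collections import deque

CAP4, CAP3 = 4, 3  # jug capacities

def successors(state):
    """Apply every production rule (precondition -> new state) to state (i, j)."""
    i, j = state
    moves = []
    if i < CAP4: moves.append((CAP4, j))        # fill the 4-liter jug from the pump
    if j < CAP3: moves.append((i, CAP3))        # fill the 3-liter jug from the pump
    if i > 0:    moves.append((0, j))           # empty the 4-liter jug on the ground
    if j > 0:    moves.append((i, 0))           # empty the 3-liter jug on the ground
    if i > 0 and j < CAP3:                      # pour 4-liter jug into 3-liter jug
        t = min(i, CAP3 - j); moves.append((i - t, j + t))
    if j > 0 and i < CAP4:                      # pour 3-liter jug into 4-liter jug
        t = min(j, CAP4 - i); moves.append((i + t, j - t))
    return moves

def solve(start=(0, 0)):
    """Breadth-first search until the 4-liter jug holds exactly 2 liters."""
    fringe, seen = deque([[start]]), {start}
    while fringe:
        path = fringe.popleft()
        if path[-1][0] == 2:                    # goal test: state (2, n) for any n
            return path
        for nxt in successors(path[-1]):
            if nxt not in seen:
                seen.add(nxt)
                fringe.append(path + [nxt])

print(solve())  # a shortest solution: 6 moves from (0, 0) to a state with 2 liters in the 4-liter jug
```

Because breadth-first search explores the state space level by level, the first solution it returns uses the minimum number of rule applications.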

What is Production system?

• A production system (or production rule system) is a computer program,
typically used to provide some form of artificial intelligence, which consists
primarily of a set of rules about behavior, together with the mechanism
necessary to follow those rules as the system responds to states of the world.
Those rules, termed productions, are a basic representation found useful in
automated planning, expert systems and action selection.
• Productions consist of two parts: a sensory precondition (or "IF" statement) and
an action (or "THEN"). If a production's precondition matches the current state
of the world, then the production is said to be triggered. If a production's action
is executed, it is said to have fired. A production system also contains a
database, sometimes called working memory, which maintains data about
current state or knowledge, and a rule interpreter. The rule interpreter must
provide a mechanism for prioritizing productions when more than one is
triggered.
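
The match-trigger-fire cycle described above can be sketched as a tiny rule interpreter. This is a hypothetical example: the kettle rules and the "first triggered production fires" conflict-resolution strategy are illustrative only.

```python
# Working memory is a set of facts; each production is a (precondition, action) pair.
def interpreter(rules, memory, max_cycles=10):
    """Repeatedly fire the first triggered production until no rule matches."""
    for _ in range(max_cycles):
        triggered = [r for r in rules if r[0](memory)]   # match phase
        if not triggered:
            break
        precondition, action = triggered[0]              # conflict resolution: pick first
        action(memory)                                   # the production fires
    return memory

rules = [
    (lambda m: "kettle_full" in m and "kettle_on" not in m,
     lambda m: m.add("kettle_on")),
    (lambda m: "kettle_on" in m and "water_boiled" not in m,
     lambda m: m.add("water_boiled")),
]
print(interpreter(rules, {"kettle_full"}))  # final working memory gains kettle_on and water_boiled
```

Note how each production's action changes working memory, which in turn changes which productions are triggered on the next cycle.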
Control Strategies:
A control strategy in Artificial Intelligence is a technique or strategy that tells
us which rule to apply next while searching for the solution of a problem
within the problem space. It helps us decide which rule to apply next without
getting stuck at any point. These rules decide the way we approach the
problem, how quickly it is solved, and even whether the problem is finally
solved at all.
A control strategy helps find the solution when more than one rule (or too few
rules) can be applied at a point in the problem space. A good control strategy
has two main characteristics:
– It should cause motion.
– It should be systematic.
• One systematic control strategy for the search process is breadth-first search.
Other systematic control strategies are also available. For example, we can
follow one single branch of the tree until it yields a solution or until some
pre-specified depth has been reached; if not, we go back and explore other
branches. This is called depth-first search.

Search in AI:
Another crucial general technique required when writing AI programs is search. Often
there is no direct way to find a solution to some problem. However, you do know how
to generate possibilities. For example, in solving a puzzle you might know all the
possible moves, but not the sequence that would lead to a solution. When working out
how to get somewhere, you might know all the roads/buses/trains, just not the best
route to get you to your destination quickly. Developing good ways to search through
these possibilities for a good solution is therefore vital. Brute-force techniques, where
you generate and try out every possible solution, may work, but they are often very
inefficient, as there are just too many possibilities to try. Heuristic techniques are often
better: you only try the options which you think (based on your current best guess)
are most likely to lead to a good solution.

Search Algorithm Terminologies:

o Search: Searching is a step-by-step procedure to solve a search problem in a given
search space. A search problem can have three main factors:
a. Search space: the set of possible solutions which a system may have.
b. Start state: the state from where the agent begins the search.
c. Goal test: a function which observes the current state and returns whether the
goal state has been achieved or not.
o Search tree: A tree representation of the search problem is called a search tree. The
root of the search tree is the root node, which corresponds to the initial state.
o Actions: A description of all the actions available to the agent.
o Transition model: A description of what each action does, represented as a
transition model.
o Path cost: A function which assigns a numeric cost to each path.
o Solution: An action sequence which leads from the start node to the goal node.
o Optimal solution: A solution that has the lowest cost among all solutions.

Properties of Search Algorithms:

Following are the four essential properties of search algorithms to compare the efficiency of
these algorithms:

Completeness: A search algorithm is said to be complete if it is guaranteed to return a
solution whenever at least one solution exists for any input.

Optimality: If the solution found by an algorithm is guaranteed to be the best solution
(lowest path cost) among all other solutions, then such a solution is said to be an optimal
solution.



Time Complexity: Time complexity is a measure of time for an algorithm to complete its
task.

Space Complexity: Space complexity is the maximum storage space required at any
point during the search.

Types of search algorithms

1. Blind search: we move through the space without worrying about what is coming
next, but recognising the answer if we see it.

2. Informed search: we guess what is ahead, and use that information to decide where
to look next. We may want to search for the first answer that satisfies our goal, or we
may want to keep searching until we find the best answer.

***What is Breadth First Search?

Breadth-first search, like depth-first search, is a brute-force search. Here, searching
progresses level by level, unlike depth-first search, which goes deep into the tree. An
operator is employed to generate all possible children of a node. Being a brute-force
search, breadth-first search generates all the nodes while identifying the goal.



Algorithm for Breadth first search:

Let fringe be a list containing the initial state
Loop
    If fringe is empty, return failure
    Node ← remove-first(fringe)
    If Node is a goal
        then return the path from the initial state to Node
    else
        generate all successors of Node (expanding the shallowest node first) and
        add the generated nodes to the back of the fringe
End Loop
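
The pseudocode above can be sketched in Python as follows; the adjacency map `graph` is a made-up example used only for illustration:

```python
from collections import deque

def breadth_first_search(graph, start, goal):
    """Expand the shallowest node first; the fringe is a FIFO queue of paths."""
    fringe = deque([[start]])
    visited = {start}
    while fringe:
        path = fringe.popleft()              # remove-first(fringe)
        node = path[-1]
        if node == goal:
            return path                      # path from the initial state to the goal
        for successor in graph.get(node, []):
            if successor not in visited:
                visited.add(successor)
                fringe.append(path + [successor])   # add to the back of the fringe
    return None                              # fringe empty: failure

graph = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": ["F"], "E": ["F"]}
print(breadth_first_search(graph, "A", "F"))  # → ['A', 'B', 'D', 'F']
```

Because the fringe is a queue, all depth-1 nodes are expanded before any depth-2 node, which is exactly the level-by-level behaviour described above.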

Applications of Breadth first search:

1. Shortest path and minimum spanning tree for an unweighted graph: in an
unweighted graph, the shortest path is the path with the least number of edges, and
breadth-first search finds it.

2. Peer-to-peer networks: in peer-to-peer networks like BitTorrent, breadth-first
search is used to find all neighbour nodes.

3. Crawlers in search engines: crawlers build their index using breadth-first search.

4.Social Networking Websites: In social networks, we can find people within a given
distance ‘k’ from a person using Breadth First Search till ‘k’ levels.

5.GPS Navigation systems: Breadth First Search is used to find all neighboring
locations.

6. Broadcasting in Network: In networks, a broadcasted packet follows Breadth First


Search to reach all nodes.



***Depth First Search:
The searching process in AI can be broadly classified into two major types, viz.,
brute-force search and heuristic search. Brute-force search does not have any
domain-specific knowledge. All it needs is the initial state, the final state, and a set
of legal operators. Depth-first search is one of the important techniques of brute-force
search.

In depth-first search, search begins by expanding the initial node, i.e., by using an
operator to generate all successors of the initial node and test them.

Algorithm for Depth first search:

Let fringe be a list containing the initial state
Loop
    If fringe is empty, return failure
    Node ← remove-first(fringe)
    If Node is a goal
        then return the path from the initial state to Node
    else
        generate all successors of Node (expanding the deepest node first) and
        add the generated nodes to the front of the fringe
End Loop
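
Only one detail changes from breadth-first search: generated nodes go to the front of the fringe, making it a stack. A sketch, using a small hypothetical graph:

```python
def depth_first_search(graph, start, goal):
    """Expand the deepest node first; the fringe is a LIFO stack of paths."""
    fringe = [[start]]
    while fringe:
        path = fringe.pop()                  # remove-first: the most recently added node
        node = path[-1]
        if node == goal:
            return path
        for successor in graph.get(node, []):
            if successor not in path:        # avoid cycles along the current path
                fringe.append(path + [successor])  # add to the front of the fringe
    return None

graph = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": ["F"], "E": ["F"]}
print(depth_first_search(graph, "A", "F"))  # → ['A', 'C', 'E', 'F']
```

Note that the path found here is not necessarily the shortest: depth-first search dives down one branch before backtracking.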

Applications of Depth First Search


1) For an unweighted graph, DFS traversal of the graph produces a spanning tree.
2) Solving puzzles with only one solution, such as mazes.
3) Social media applications.



***What is Best first search?
Best first search combines the advantages of both depth first search and breadth first
search.

One way to combine the two is to follow a single path at a time, but switch path
whenever some competing path looks more promising than the current one.

At each step, we select the most promising of the nodes we have generated so far. This
is done by applying an appropriate heuristic function to each of them. We then expand
the chosen node by using the rules to generate its successors.

If one of them is a solution, we can quit; if not, all those new nodes are added to the set
of nodes generated so far.

Again the most promising node is selected and the process continues.



Graphs of these types are called OR graphs because each of its branches represents an
alternative problem solving path.

A is the initial node, which is expanded into B, C and D. A heuristic function, say the
estimated cost of reaching the goal, is applied to each of these nodes. Since D is most
promising, it is expanded next, producing two successor nodes E and F. The heuristic
function is applied to them.

Now, out of the four remaining nodes (B, C, E and F), B looks most promising and
hence is expanded, generating nodes G and H. When these are evaluated, E appears to
be the next most promising node and is expanded, giving rise to nodes I and J. In the
next step, J will be expanded, since it is most promising. This process continues until a
solution is found.
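
The expand-the-most-promising-node loop described above can be sketched with a priority queue ordered on heuristic values. The graph and h values below are hypothetical, chosen only to mimic the shape of the walkthrough:

```python
import heapq

def best_first_search(graph, h, start, goal):
    """Greedy best-first search: always expand the open node with the lowest h."""
    fringe = [(h[start], [start])]
    visited = set()
    while fringe:
        _, path = heapq.heappop(fringe)      # the most promising node generated so far
        node = path[-1]
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for successor in graph.get(node, []):
            heapq.heappush(fringe, (h[successor], path + [successor]))
    return None

graph = {"A": ["B", "C", "D"], "D": ["E", "F"], "B": ["G", "H"], "E": ["I", "J"]}
h = {"A": 9, "B": 4, "C": 6, "D": 3, "E": 5, "F": 7, "G": 8, "H": 9, "I": 2, "J": 0}
print(best_first_search(graph, h, "A", "J"))  # → ['A', 'D', 'E', 'J']
```

The heap replays the walkthrough's behaviour: after each expansion, whichever open node currently has the lowest heuristic value is chosen next, even if it lies on a different branch.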

***What is Heuristic Search?


• A Heuristic is a technique that improves the efficiency of a search process,
possibly by sacrificing claims of completeness.
• Heuristics are like tour guides
• They are good to the extent that they point in generally interesting directions;
• They are bad to the extent that they may miss points of interest to particular
individuals.
• On the average they improve the quality of the paths that are explored.
• Using heuristics, we can hope to get good (though possibly non-optimal)
solutions to hard problems such as the Travelling Salesman Problem (TSP) in
non-exponential time.
• There are good general purpose heuristics that are useful in a wide variety of
problem domains.
• Special purpose heuristics exploit domain specific knowledge

Explain Nearest Neighbor Heuristic


• It works by selecting the locally superior alternative at each step.

• Applying to TSP:

1. Arbitrarily select a starting city


2. To select the next city, look at all cities not yet visited and select the one
closest to the current city. Go to next step.
3. Repeat step 2 until all cities have been visited.
– This procedure executes in time proportional to N².
– It is possible to prove an upper bound on the error it incurs. This provides
reassurance that one is not paying too high a price in accuracy for speed.
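
The three steps above can be sketched directly in Python; the city coordinates are made up for illustration:

```python
import math

def nearest_neighbor_tour(cities, start=0):
    """Greedy TSP heuristic: repeatedly visit the closest unvisited city."""
    unvisited = set(range(len(cities))) - {start}   # step 1: arbitrary starting city
    tour = [start]
    while unvisited:                                # step 3: repeat until all visited
        current = cities[tour[-1]]
        nxt = min(unvisited,                        # step 2: closest unvisited city
                  key=lambda c: math.dist(current, cities[c]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

cities = [(0, 0), (5, 0), (1, 1), (0, 4)]           # hypothetical (x, y) coordinates
print(nearest_neighbor_tour(cities))                # → [0, 2, 3, 1]
```

Each step scans all remaining cities, so with N cities the whole tour takes time proportional to N², matching the bound stated above.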



Heuristic Function

• A heuristic function is a function that maps from problem state descriptions to
measures of desirability, usually represented as numbers.
– Which aspects of the problem state are considered,
– how those aspects are evaluated, and
– the weights given to individual aspects
are chosen in such a way that the value of the heuristic function at a given node in
the search process gives as good an estimate as possible of whether that node is on
the desired path to a solution.

Well-designed heuristic functions can play an important part in efficiently guiding a
search process toward a solution.

In heuristic search or informed search, heuristics are used to identify the most
promising search path.

Examples of simple heuristic functions:

• Chess: the material advantage of our side over the opponent.
• TSP: the sum of the distances travelled so far.
• Tic-Tac-Toe: 1 for each row in which we could win and in which we already have
one piece, plus 2 for each such row in which we have two pieces.

****Explain A* ALGORITHM

A* algorithm: basic definition

The A* algorithm is a best-first graph search algorithm that finds a least-cost path
from a given initial node to a goal node.

Functions used in the algorithm:

Evaluation function f(n): at any node n, it estimates the sum of the cost of the minimal-cost
path from the start node s to node n plus the cost of a minimal-cost path from node n
to a goal node:

f(n) = g(n) + h(n)

where g(n) = cost of the path in the search tree from s to n;
h(n) = estimated cost of a path in the search tree from n to a goal node.

Function f*(n): at any node n, it is the actual cost of an optimal path from node s to node n
plus the cost of an optimal path from node n to a goal node:

f*(n) = g*(n) + h*(n)

where g*(n) = cost of the optimal path in the search tree from s to n;
h*(n) = cost of the optimal path in the search tree from n to a goal node.

h*(n) is the cost of the minimal-cost path from n to a goal node, and any path from
node n to a goal node that achieves h*(n) is an optimal path from n to a goal. h is an
estimate of h*.

h(n) is calculated using heuristic information from the problem domain.

Algorithm:

Step 1: Create a queue consisting of the root node.

Step 2: If the first element of the queue is the goal node, go to Step 5.

Step 3: If it is not, remove it from the queue, add it to the list of visited nodes,
consider its child nodes (if any), and evaluate them with the evaluation function
f(n) = g(n) + h(n). Add them to the queue and reorder the states in the queue on the
basis of heuristic merit.

Step 4: If the queue is empty, go to Step 6; else go to Step 2.

Step 5: Print success and stop.

Step 6: Print unsuccessful and stop.

Example: find the shortest path from A to J using the A* algorithm.



Step 1:
We start with node A.
Node B and Node F can be reached from node A.
A* Algorithm calculates f(B) and f(F)
 f(B) = 6 + 8 = 14
 f(F) = 3 + 6 = 9
Since f(F) < f(B). So it decides to go to node F.
Path- A F
Step 2:
Node G and Node H can be reached from node F.
A* Algorithm calculates f(G) and f(H).
 f(G) = (3+1) + 5 = 9
 f(H) = (3+7) + 3 = 13
Since f(G) < f(H), it decides to go to node G.
Path- A F G
Step 3:
Node I can be reached from node G.
A* Algorithm calculates f(I).
 f(I) = (3+1+3) + 1 = 8
It decides to go to node I.
Path- A F G I
Step 4:
Node E, Node H and Node J can be reached from node I.
A* Algorithm calculates f(E), f(H) and f(J).
 f(E) = (3+1+3+5) + 3 = 15
 f(H) = (3+1+3+2) + 3 = 12
 f(J) = (3+1+3+3) + 0 = 10
Since f(J) is least, so it decides to go to node J.
Path- A F G I J
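
The walkthrough can be reproduced in code. The edge costs and heuristic values below are reconstructed from the f-calculations in the steps above (the graph figure itself is not reproduced here), so treat them as assumptions; h(A) in particular is never used in the steps and is set arbitrarily:

```python
import heapq

graph = {  # successor: edge cost, reconstructed from the worked example
    "A": {"B": 6, "F": 3},
    "F": {"G": 1, "H": 7},
    "G": {"I": 3},
    "I": {"E": 5, "H": 2, "J": 3},
}
h = {"A": 10, "B": 8, "F": 6, "G": 5, "H": 3, "I": 1, "E": 3, "J": 0}

def a_star(start, goal):
    """Expand the open node with the lowest f(n) = g(n) + h(n)."""
    fringe = [(h[start], 0, [start])]          # entries are (f, g, path)
    closed = set()
    while fringe:
        f, g, path = heapq.heappop(fringe)
        node = path[-1]
        if node == goal:
            return path, g
        if node in closed:
            continue
        closed.add(node)
        for succ, cost in graph.get(node, {}).items():
            heapq.heappush(fringe, (g + cost + h[succ], g + cost, path + [succ]))
    return None

print(a_star("A", "J"))  # → (['A', 'F', 'G', 'I', 'J'], 10)
```

Run on these values, the search pops F, G, I and finally J in exactly the order of the four steps, returning the path A F G I J with total cost 10.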



Applications of A* Algorithm:
• It is commonly used in web-based maps and games to find the shortest path at the highest
possible efficiency.
• A* is used in many artificial intelligence applications, such as search engines.
• It is closely related to other shortest-path algorithms such as Dijkstra's algorithm
(to which A* reduces when h(n) = 0) and the Bellman-Ford algorithm.
• Shortest-path graph search of this kind also underlies network routing protocols,
such as RIP, OSPF, and BGP, which calculate the best route between two nodes.

****PROBLEM REDUCTION - AND - OR graphs - AO * Algorithm

When a problem can be divided into a set of sub-problems, where each sub-problem
can be solved separately and a combination of the solutions will be a solution to the
whole, AND-OR graphs or AND-OR trees are used for representing the solution.

The decomposition of the problem, or problem reduction, generates AND arcs. One
AND arc may point to any number of successor nodes, all of which must be solved
for the arc to point to a solution. Several arcs may also emerge from a single node,
indicating several possible solutions. Hence the graph is known as an AND-OR graph
rather than simply an AND graph. The figure shows an AND-OR graph.

An algorithm to find a solution in an AND-OR graph must handle AND arcs
appropriately. The A* algorithm cannot search AND-OR graphs efficiently. This can
be understood from the given figure.



In figure (a), the top node A has been expanded, producing two arcs, one leading to B
and one (an AND arc) leading to C-D. The numbers at each node represent the value
of f' at that node (the estimated cost of getting to the goal state from the current state).
For simplicity, it is assumed that every operation (i.e., applying a rule) has unit cost,
i.e., each arc has a cost of 1.

With the information available so far, it appears that C is the most promising node to
expand, since its f' = 3 is the lowest. But going through B would be better: to use C
we must also use D, and the cost would be 9 (3+4+1+1), while through B it would be
6 (5+1).
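
The comparison above can be checked with a small cost function over the arcs of an AND-OR graph. The data layout here is a hypothetical sketch: each arc is a list of (successor, edge-cost) pairs, and an AND arc simply has more than one pair:

```python
def arc_cost(arc, f_prime):
    """Cost of an arc = sum over its components of (edge cost + f' of the child)."""
    return sum(edge + f_prime[child] for child, edge in arc)

f_prime = {"B": 5, "C": 3, "D": 4}           # f' estimates at the leaf nodes
arcs_from_A = [
    [("B", 1)],                              # plain OR arc to B
    [("C", 1), ("D", 1)],                    # AND arc to both C and D
]
costs = [arc_cost(a, f_prime) for a in arcs_from_A]
print(costs)          # → [6, 9]: through B costs 5+1, the AND arc C-D costs 3+4+1+1
best = min(costs)     # the revised f'(A) becomes 6
```

This is why C's low individual f' is misleading: the AND arc forces D to be solved too, so the whole arc, not the single node, must be costed.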

Thus the choice of the next node to expand depends not only on its f' value but also
on whether that node is part of the current best path from the initial node. Figure (b)
makes this clearer.

In the figure, node G appears to be the most promising node, with the least f' value.
But G is not on the current best path, since to use G we must also use H, giving the
arc G-H a cost of 9, and this in turn demands that further arcs be used (with a cost of
27). The path from A through B and the AND arc E-F is better, with a total cost of
18 (17+1). Thus we can see that to search an AND-OR graph, the following three
things must be done.

1. traverse the graph starting at the initial node and following the current best path, and
accumulate the set of nodes that are on the path and have not yet been expanded.

2. Pick one of these unexpanded nodes and expand it. Add its successors to the graph
and computer f ' (cost of the remaining distance) for each of them.

3. Change the f ' estimate of the newly expanded node to reflect the new information
provided by its successors. Propagate this change backward through the graph, and at
each node decide which of its successor arcs is now part of the current best path.
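The three steps above can be sketched as a recursive cost evaluation over an AND-OR graph. This is a simplified illustration, not the full AO* algorithm (which also maintains the marked best path and re-examines expanded nodes); the graph, node names and unit arc costs mirror the figure (a) example and are otherwise assumptions.

```python
# Evaluate the cheapest solution cost of an AND-OR graph by recursion.
# Each node maps to a list of arcs; an arc is a list of successor nodes
# (a one-element list is an OR arc, a multi-element list is an AND arc).
# Every arc costs 1 per successor, matching the unit-cost assumption above.

def solution_cost(graph, heuristic, node):
    """Return the estimated cost f' of solving `node`."""
    arcs = graph.get(node)
    if not arcs:                       # leaf: fall back to the heuristic h'
        return heuristic[node]
    # For each arc, pay 1 per successor plus the cost of solving each one;
    # then choose the cheapest arc (the OR choice).
    return min(sum(1 + solution_cost(graph, heuristic, s) for s in arc)
               for arc in arcs)

# Hypothetical graph mirroring figure (a): A -> B, or A -> (C and D).
graph = {"A": [["B"], ["C", "D"]]}
heuristic = {"B": 5, "C": 3, "D": 4}
print(solution_cost(graph, heuristic, "A"))   # 6: via B (5+1), not C-D (3+4+1+1 = 9)
```

Note that C alone looks cheapest (f ' = 3), but the AND arc forces D to be solved as well, which is exactly why the arc, not the node, must be costed.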
This backward propagation of revised cost estimates through the tree is not necessary in
the A* algorithm. In the AO* algorithm, however, expanded nodes are re-examined so
that the current best path can be selected. The working of the AO* algorithm is illustrated
in the figure as follows:

Referring to the figure: the initial node is expanded and D is marked initially as the most
promising node. D is expanded, producing an AND arc E-F, and the f ' value of D is
updated to 10. Going backwards we can see that the AND arc B-C is now better; it is
marked as the current best path, and B and C are expanded next.

This process continues until a solution is found or all paths have led to dead ends,
indicating that there is no solution. In the A* algorithm the path from one node to
another is always the one of lowest cost, and it is independent of the paths through other
nodes.

The algorithm for performing a heuristic search of an AND-OR graph is given below.
Unlike the A* algorithm, which uses two lists OPEN and CLOSED, the AO* algorithm
uses a single structure G, which represents the part of the search graph generated so far.

Each node in G points down to its immediate successors and up to its immediate
predecessors, and carries the value h', the estimated cost of a path from itself to a set of
solution nodes. The cost of getting from the start node to the current node, g, is not
stored as in the A* algorithm.

This is because it is not possible to compute a single such value, since there may be
many paths to the same state. In the AO* algorithm h' alone serves as the estimate of the
goodness of a node. A threshold value called FUTILITY is also used.



If the estimated cost of a solution grows greater than FUTILITY, the search is
abandoned as too expensive to be practical.

For searching the above graphs, the AO* algorithm is as follows:

**** AO* ALGORITHM:

***MEANS - ENDS ANALYSIS:-

Most search strategies reason either forward or backward; often, however, a mixture of
the two directions is more appropriate. Such a mixed strategy makes it possible to solve
the major parts of a problem first and then go back and solve the smaller problems that
arise when combining the parts together. Such a technique is called "Means - Ends
Analysis".

The means-ends analysis process centres on finding the difference between the
current state and the goal state. The problem space of means-ends analysis has an initial
state and one or more goal states, a set of operators with preconditions for their
application, and a difference function that computes the difference between two states
s(i) and s(j). A problem is solved using means-ends analysis by:

1. Comparing the current state s1 to a goal state s2 and computing their difference D12.

2. Selecting an operator OP relevant to reducing the difference D12, after satisfying its
preconditions.

3. Applying the operator OP if possible. If not, the current state is saved, a subgoal is
created, and means-ends analysis is applied recursively to reduce the subgoal.

4. If the subgoal is solved, the saved state is restored and work resumes on the original problem.

(The first AI program to use means-ends analysis was GPS, the General Problem
Solver.)
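The four steps above can be sketched as a recursive procedure. The state representation (sets of facts) and the two hypothetical operators below are illustrative assumptions, not GPS itself:

```python
# Minimal means-ends analysis sketch: a state is a set of facts, an operator
# has preconditions, facts it adds and facts it deletes.

OPERATORS = {
    "visit-canteen": {"pre": {"have-money"}, "add": {"not-hungry"}, "del": {"hungry"}},
    "visit-bank":    {"pre": set(),          "add": {"have-money"}, "del": set()},
}

def solve(state, goal, depth=5):
    """Return a list of operators transforming `state` so it contains `goal`."""
    if goal <= state:
        return []                      # no difference left
    if depth == 0:
        return None
    diff = goal - state                # step 1: compute the difference
    for name, op in OPERATORS.items():
        if not (op["add"] & diff):     # step 2: pick an operator relevant to it
            continue
        # step 3: recursively reduce the subgoal of satisfying preconditions
        prefix = solve(state, op["pre"], depth - 1)
        if prefix is None:
            continue
        mid = state
        for p in prefix:
            mid = (mid - OPERATORS[p]["del"]) | OPERATORS[p]["add"]
        mid = (mid - op["del"]) | op["add"]
        rest = solve(mid, goal, depth - 1)   # step 4: resume the original goal
        if rest is not None:
            return prefix + [name] + rest
    return None

print(solve({"hungry"}, {"not-hungry"}))  # ['visit-bank', 'visit-canteen']
```

The recursion on preconditions is exactly the "create a subgoal" step: visiting the canteen requires money, so visiting the bank is planned first.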

Means-ends analysis is useful for many human planning activities. Consider the
example of planning for an office worker. Suppose we have a difference table of three
rules:

1. If in our current state we are hungry, and in our goal state we are not hungry, then
either the "visit hotel" or the "visit canteen" operator is recommended.

2. If in our current state we do not have money, and in our goal state we have money,
then the "visit our bank" operator or the "visit secretary" operator is recommended.

3. If in our current state we do not know where something is, and in our goal state we
do know, then either the "visit office enquiry", "visit secretary" or "visit co-worker"
operator is recommended.

Assume the robot in this domain was given the problem of moving a desk with two
things on it from one room to another.

The difference between the start state and the goal state would be the location of the
desk.

To reduce this difference, either push or carry could be chosen.



Problem for household robot: moving desk with 2 things on it from one room to
another.

Main difference between start and goal state is location.

Choose PUSH and CARRY

                       Push   Carry   Walk   Pickup   Putdown   Place
Move object             *       *
Move robot                              *
Clear object                                    *
Get object on object                                              *
Get arm empty                                             *       *
Be holding object                               *



Operator            Preconditions                         Results

PUSH(obj, loc)      at(robot, obj) & large(obj) &         at(obj, loc) & at(robot, loc)
                    clear(obj) & armempty

CARRY(obj, loc)     at(robot, obj) & small(obj)           at(obj, loc) & at(robot, loc)

WALK(loc)           none                                  at(robot, loc)

PICKUP(obj)         at(robot, obj)                        holding(obj)

PUTDOWN(obj)        holding(obj)                          ~holding(obj)

PLACE(obj1, obj2)   at(robot, obj2) & holding(obj1)       on(obj1, obj2)
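The operator table can be encoded directly as data, with a helper that tests whether an operator's preconditions hold in a given state. The predicate spellings and the set-of-strings state encoding below are assumptions for illustration:

```python
# Encode some of the household-robot operators as precondition/result pairs.
# A state is a set of ground predicate strings.

ROBOT_OPS = {
    "WALK":   {"pre": set(),
               "result": {"at(robot,loc)"}},
    "PICKUP": {"pre": {"at(robot,obj)"},
               "result": {"holding(obj)"}},
    "PUSH":   {"pre": {"at(robot,obj)", "large(obj)", "clear(obj)", "armempty"},
               "result": {"at(obj,loc)", "at(robot,loc)"}},
    "CARRY":  {"pre": {"at(robot,obj)", "small(obj)"},
               "result": {"at(obj,loc)", "at(robot,loc)"}},
}

def applicable(op, state):
    """An operator is applicable when all of its preconditions are in the state."""
    return ROBOT_OPS[op]["pre"] <= state

state = {"at(robot,obj)", "small(obj)"}
print(applicable("CARRY", state))  # True
print(applicable("PUSH", state))   # False: large/clear/armempty missing
```

This is the check the planner performs before step 3 of means-ends analysis: a failing precondition becomes a new subgoal.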

Mini-Max Algorithm



Chapter 2: Knowledge and Knowledge Representation

Definition: Knowledge can be defined as the body of facts and principles accumulated
by humankind, or the act, fact or state of knowing. Knowledge is also a
familiarity with language, concepts, procedures, rules, ideas, abstractions, places,
customs, beliefs, facts and associations, coupled with an ability to use these notions
effectively in modelling different aspects of the world.

Intelligence requires the possession of, and access to, knowledge. AI technology is
a method that exploits knowledge.

***Properties of knowledge:
 Knowledge requires data
 Knowledge is voluminous
 It is hard to characterize accurately
 It is constantly changing
 It differs from data by being organized in a way that corresponds to the
ways it will be used.



Certain constraints should be kept in mind while designing an AI technique to solve
any AI problem.
 Knowledge captures generalizations. i.e. Situations that share
important properties are grouped together.
 Knowledge can be understood by people who must provide it
although in most of the cases data can be acquired automatically.
 It can be easily modified to correct errors and to reflect changes
in the world.
 It can be used in a great many situations even if it is not totally
accurate or complete.
 It can be used to help overcome its own sheer bulk by helping to
narrow the range of possibilities that must usually be considered.

***Types of knowledge:
1. Procedural Knowledge is compiled knowledge related to the performance of
some task. E.g. the steps used to solve an algebraic equation are expressed as
procedural knowledge.
2. Declarative Knowledge is passive knowledge expressed as statements of fact
about the world. E.g. personnel data in a database is an explicit piece of independent
knowledge.
3. Heuristic Knowledge is a special type of knowledge used by humans to solve
complex problems. Heuristics are the strategies, tricks or rules of thumb used to
simplify the solution of problems, and are usually acquired with much
experience. E.g. a fault in a television set is located by an experienced technician
without doing numerous voltage checks.
At this juncture we must distinguish knowledge from other concepts such as belief and
hypothesis.
Belief: A belief can be defined as essentially any meaningful and coherent expression that
can be represented. A belief may be true or false.
Hypothesis: A hypothesis can be defined as a justified belief that is not known to be true.
Thus a hypothesis is a belief which is backed up with some supporting evidence, but
it may still be false. Finally, we define knowledge as true justified belief.
Two other terms which we shall occasionally use are epistemology and meta-knowledge.
Epistemology is the study of the nature of knowledge.
Meta-knowledge is knowledge about knowledge, i.e. knowledge about what we know.



Importance of knowledge: AI has given new meaning to knowledge. It is now possible to
"package" specialized knowledge and sell it with a system that can use it to reason and
draw conclusions. Such a system is an untiring and reliable advisor that
gives high-level professional advice in specialized areas, such as
 Manufacturing techniques
 Sound financial strategies
 Ways to improve one’s health
 Marketing strategies
 Optimal farming plans, etc,

Knowledge-Based Systems: Knowledge-based systems depend on a rich base of
knowledge to perform difficult tasks, whereas general-purpose problem solvers use a
limited number of laws or axioms, which are too weak to be effective in solving
problems of any complexity.
Edward Feigenbaum emphasized that the real power of an expert system comes
from the knowledge it possesses. E.g. MYCIN, an expert system developed to
diagnose infectious blood diseases.

Components of a knowledge-based system:

I/O unit ↔ Inference-control unit ↔ Knowledge base
Knowledge – based systems get their power from the expert knowledge that has been
coded into facts, rules, heuristics and procedures. The knowledge is stored in a
knowledge base separate from the control and inferencing programs. This greatly
simplifies the construction and maintenance of knowledge-based systems.
***Representation of knowledge
Representation of knowledge has become one of the top research priorities in AI.
Knowledge can be represented in different forms: as mental images in one's thoughts,
as spoken or written words in some language, as graphical or other pictures, or as
character strings or collections of magnetic spots stored in a computer.
Different levels of knowledge representation:

Mental Images
Written Text
Character Strings
Binary Numbers
Magnetic Spots
Various Knowledge Representation Schemes
 PL (Propositional Logic)
 FOPL (First Order Predicate Logic)
 Frames
 Associative Networks
 Modal Logics
 Object Oriented Methods etc.

Knowledge may be vague, contradictory or incomplete, yet we would still like to be able
to reason and make decisions. Humans do remarkably well with fuzzy, incomplete
knowledge, and we would like our AI programs to demonstrate the same versatility.

Mapping between facts and representations:

Facts ↔ Internal Representations (operated on by reasoning programs)
English Representation ↔ Internal Representations (English understanding / English generation)

Acquisition of Knowledge: One of the greatest bottlenecks in building knowledge-rich
systems is the acquisition and validation of the knowledge. Knowledge may be
acquired from various sources such as experts, textbooks, reports and technical articles;
it must be accurate, presented at the right level for encoding, complete, free
of inconsistencies, and so on. The acquisition problem has also stimulated much
research in machine learning, that is, systems that learn new knowledge
automatically without the aid of humans and continually improve the quality of the
knowledge they possess.

Knowledge Organization: The organization of knowledge in memory is the key to
efficient processing. Knowledge-based systems may require tens of thousands of facts
and rules to perform their intended tasks. Quick access to knowledge is achieved
by having indexes and keywords that point to knowledge groups.

Knowledge Manipulation: Decisions and actions in knowledge-based systems come
from manipulating the knowledge in specified ways. Typically, some form of input
(from a user keyboard or from sensors) initiates a search for a goal or decision. This
requires that known facts in the knowledge base be located, compared (matched) and
possibly altered in some way. These manipulations are the computational equivalent of
reasoning, and require a form of inference or deduction using the knowledge and
inference rules. All forms of reasoning require a certain amount of computation time
in AI systems, so it is important to have techniques that limit the amount of search and
matching required to complete a task.

Formalized Symbolic Logics

Standard Logic Symbols

Symbol     Meaning

→          Implication

↔          Double implication / equivalence / if and only if

¬ or ~     Negation

∨          OR / Disjunction

∧ or &     AND / Conjunction

∀          For all (universal quantifier)

∃          There exists (existential quantifier)

Logic is a formal method for reasoning with a sound theoretical foundation. This
is especially important in our attempts to mechanize or automate the reasoning process,
in that inferences should be correct and logically sound. The structure of PL or FOPL
must be flexible enough to permit reasonably accurate representation of natural
language. Many concepts which can be verbalized can be translated into
symbolic representations which closely approximate their meaning. These
symbolic structures can then be manipulated in programs to deduce various facts
and carry out a form of automated reasoning.

In FOPL, statements from a natural language like English are translated into symbolic
structures comprising predicates, functions, variables, constants, quantifiers and logical
connectives. These symbols form the basic building blocks for the knowledge, and their
combination into valid structures is accomplished using the syntax (rules of
combination) of FOPL. Once structures have been created to represent basic facts,
procedures or other types of knowledge, inference rules may be applied to compare,
combine and transform these "assumed" structures into new "deduced" structures. This
is how automated reasoning or inference is performed.

Some simple facts in propositional logic

It is raining

RAINING

It is sunny

SUNNY

It is windy

WINDY

If it is raining then it is not sunny.

RAINING → ~SUNNY

We can conclude, from the fact that it is raining, the fact that it is not sunny.

But we want to represent the obvious fact stated by the classical sentence.

Gandhi is a Man.

We could write the above statement as GANDHIMAN --------- PL

If we also wanted to represent

Einstein is a Man.

We will have to write something such as EINSTEINMAN -------- PL

Which would be a totally separate assertion and we would not be able to draw any
conclusion about similarities between Gandhi and Einstein.

It would be much better to represent these facts as

MAN (GANDHI) - FOPL

MAN (EINSTEIN) - FOPL



****Predicate Logic:
Predicate – contents of statements.

The logic based upon the analysis of predicates in any statement is called predicate
logic.

FOPL – First order predicate logic

Introduction:

FOPL was developed to extend the expressiveness of PL. It is a generalization of PL
that permits reasoning about world objects as relational entities, as well as about
classes and subclasses of objects. This generalization comes from the introduction of
predicates in place of propositions, the use of functions, and the use of variables
together with quantifiers.

The syntax for FOPL like PL is determined by the allowable symbols and rules of
combinations. The semantics of FOPL are determined by interpretations assigned to
predicates rather than propositions. This means that an interpretation must also assign
values to other terms including constants, variables and functions.

Syntax of FOPL:

The symbols and rules of combination permitted in FOPL are defined as follows.

Connectives: ~, ∧, ∨, →, ↔

Quantifiers: ∃ (existential quantification)

∀ (for all) (universal quantification)

Constants: Fixed-value terms that belong to a given domain, denoted by numbers or
words, e.g. Flight-102, AK-47, etc.

Variables: Terms that can assume different values over a given domain, denoted by
words.

Functions: Function symbols denote relations defined on a domain D; they map n
elements (n >= 0) to a single element of the domain. Symbols f, g, h and words such as
father-of, age-of represent functions.

E.g. f(t1, t2, ..., tn) where the ti are terms (constants, variables or functions) defined
over some domain, n >= 0.
A 0-ary function is a constant.

Predicates: Predicate symbols denote relations or functions mapping from the


elements of a domain D to the values true or false. Capital letters and capitalized words
such as P, Q, R, EQUAL, GOOD, MARRIED are used to represent predicates. Like
functions, predicates may have n (n >= 0) terms as arguments, written as
P(t1, t2, t3, t4, ..., tn).

A 0-ary predicate is a proposition, that is, a constant predicate.

In addition we use the brackets { }, ( ) and [ ].

Example:

Represent the following statements in symbolic form:

E1: All employees earning Rs. 10,00,000 or more per year pay taxes.

E2: Some employees are sick today.

E3: No employee earns more than the president.

Abbreviations for above statements:

E(x) for x is an employee.

P(x) for x is a president.

i(x) for the income of x.

GE(u, v) for u is greater than or equal to v.

S(x) for x is sick today.

T(x) for x pays taxes.

Using the above abbreviations we can represent E1, E2 and E3 as:

E1': ∀x ((E(x) ∧ GE(i(x), 10,00,000)) → T(x))

E2': ∃y (E(y) ∧ S(y))

E3': ∀x ∀y ((E(x) ∧ P(y)) → ~GE(i(x), i(y)))

E1', E2' and E3' are known as well-formed formulae, or wffs (pronounced "woofs").



Valid Examples:

MECHANIC(Arun)

∀x ∀y ∀z: ((FATHER(x, y) ∧ FATHER(y, z)) → GRANDFATHER(x, z))

Examples of invalid statements:

∀P: P(x) → Q(x)

In FOPL, quantification over the predicate P is not permitted.

MAN(~Kunal)

Negation cannot be applied to a constant; predicate arguments must be terms.


Semantics of FOPL: When considering specific wffs, the domain D must be remembered.

D is the set of all elements or objects from which fixed assignments are made to
constants and over which the domains and ranges of functions are defined. The
arguments of predicates must be terms (constants, variables or functions); therefore,
the domain of each n-place predicate is also defined over D.

E.g. Our Domain - All entities that make up the MCA Dept. in PESIT.

Constants - Lecturers (NGP, SKY, DU, CMH ……)

Staff (Sriram, Raghavendra, Shankar) books, labs, office…

Functions - lab – capacity(x)

Dept-grade average(y)

Advisor of (z)

Predicates - HOD(x)

-----------------

-------------------



****Representation of simple facts as a set of wffs in predicate Logic:(10M)

1. Marcus was a man.


MAN (Marcus)
2. Marcus was a Pompeian
POMPEIAN (Marcus)
3. All Pompeians were Romans
∀x: POMPEIAN(x) → ROMAN(x)
4. Caesar was a Ruler.
RULER (Caesar)
5. All Romans were either loyal to Caesar or hated him.

∀x: ROMAN(x) → LOYALTO(x, Caesar) ∨ HATE(x, Caesar)

Or, for an exclusive or:
∀x: ROMAN(x) → [(LOYALTO(x, Caesar) ∨ HATE(x, Caesar)) ∧
~(LOYALTO(x, Caesar) ∧ HATE(x, Caesar))]

6. Everyone is loyal to someone.

∀x: ∃y: LOYALTO(x, y)

or
∃y: ∀x: LOYALTO(x, y) (there exists someone to whom
everyone is loyal)

7. People only try to assassinate rulers they are not loyal to.

∀x: ∀y: PERSON(x) ∧ RULER(y) ∧ TRYASSASSINATE(x, y)
→ ~LOYALTO(x, y)

8. Marcus tried to assassinate Caesar.

TRYASSASSINATE(Marcus, Caesar)

Now suppose that we want to use these statements to answer the question

Was Marcus Loyal to Caesar?

It seems that using 7 and 8, we should be able to prove that Marcus was
not loyal to Caesar.

Now let’s try to produce a formal proof reasoning backward from the
desired goal:



~LOYALTO(Marcus, Caesar)

Statements in Symbolic form:

Something is good: ∃x (G(x))

Everything is good: ∀x (G(x))

Nothing is good: ∀x (¬G(x))
Something is not good: ∃x (¬G(x))
Note: the symbols ¬ or ~ can be used for negation.

All true, none false:

∀x (P(x)) ↔ ~∃x (~P(x))

All false, none true:

∀x (~P(x)) ↔ ~∃x (P(x))

Not all true, at least one false:

~∀x (P(x)) ↔ ∃x (~P(x))

Not all false, at least one true:

~∀x (~P(x)) ↔ ∃x (P(x))

Rules for quantification:

To negate a statement governed by one quantifier, change the quantifier from universal
to existential (or from existential to universal) and negate the statement which it
quantifies.

Statement       Its negation

∀x (P(x))       ∃x (~P(x))

∃x (P(x))       ∀x (~P(x))
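Over a finite universe these negation rules can be checked mechanically, since ∀ corresponds to Python's `all` and ∃ to `any`. The universe and the predicate P below are arbitrary illustrations:

```python
# Verify the quantifier-negation rules over a small finite universe:
# ~(forall x P(x)) <-> exists x ~P(x), and ~(exists x P(x)) <-> forall x ~P(x).

universe = range(-3, 4)
P = lambda x: x > 0          # an arbitrary predicate

forall = all(P(x) for x in universe)   # forall x P(x)
exists = any(P(x) for x in universe)   # exists x P(x)

assert (not forall) == any(not P(x) for x in universe)
assert (not exists) == all(not P(x) for x in universe)
print("quantifier negation rules hold on this universe")
```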

Consider the following four statements:

1. All monkeys have tails.


2. No Monkeys have tails.



3. Some Monkeys have tails.
4. Some Monkeys have no tails.

If the universe for the statement "All monkeys have tails" consists only of monkeys,
then this statement is merely

∀x (P(x)), where P(x): x has a tail.

However, if the universe consists of objects some of which are not monkeys, a further
refinement is necessary.

Let M(x): x is a Monkey.

P(x) : x has a tail.

"All monkeys have tails" makes no statement about objects in the universe which are
not monkeys. If an object is a monkey and does not have a tail, the statement is
false; otherwise it is true.

The statement (1) can be rephrased as follows:

"For all x, if x is a monkey, then x has a tail", and it can be written as
∀x [M(x) → P(x)].

The statement (2) means "for all x, if x is a monkey, then x has no tail" and can be
written as ∀x [M(x) → ~P(x)].

The statement (3) means "there is an x such that x is a monkey and x has a tail" and
can be written as ∃x [M(x) ∧ P(x)].

The statement (4) means "there is an x such that x is a monkey and x has no tail" and
can be written as ∃x [M(x) ∧ ~P(x)].

Statement                       Symbolic Form

All monkeys have tails          ∀x [M(x) → P(x)]

No monkey has a tail            ∀x [M(x) → ~P(x)]

Some monkeys have tails         ∃x [M(x) ∧ P(x)]

Some monkeys have no tails      ∃x [M(x) ∧ ~P(x)]



****Write the following sentences in symbolic form:

a. Some people who trust others are rewarded.

b. If anyone is good then John is good.
c. He is ambitious or no one is ambitious.
d. Someone is teasing.
e. It is not true that all roads lead to Rome.

Solution:

Let P(x) : x is a person.

T(x) : x trusts others.

R(x) : x is rewarded.

G(x) : x is good.

A(x) : x is ambitious.

Q(x) : x is teasing.

S(x) : x is a road.

L(x) : x leads to Rome.

Then

a. "Some people who trust others are rewarded" can be rephrased as
"There is an x such that x is a person, x trusts others and x is rewarded."
Symbolic form: (∃x) [P(x) ∧ T(x) ∧ R(x)]
b. "If anyone is good, then John is good" can be worded as
"If there is an x such that x is a person and x is good, then John is good."
Symbolic form: (∃x) [P(x) ∧ G(x)] → G(John)
c. 'He' represents a particular person; let that person be y. The statement is then
"y is ambitious, or for all x, if x is a person then x is not ambitious."

Symbolic form: A(y) ∨ (∀x) [P(x) → ~A(x)]

d. "Someone is teasing" can be written as "There is an x such that x is a person
and x is teasing."
Symbolic form: (∃x) [P(x) ∧ Q(x)]



e. The statement can be written as ~(∀x) [S(x) → L(x)], or equivalently
(∃x) [S(x) ∧ ~L(x)].

Associative Networks

In a huge knowledge base (KB), the storage of voluminous and complicated
information and the comprehension, use and maintenance of the knowledge can become
difficult. In such cases some form of knowledge structuring and organization
becomes a necessity. Real-world problem domains typically involve a number and
variety of different objects interacting with each other in different ways. The objects
themselves may require extensive characterization, and their interaction
relationships with other objects may be very complex.

Network representations provide a means of structuring the knowledge. In a
network, pieces of knowledge are clustered together into coherent semantic groups.
Network representations give a pictorial presentation of objects, their attributes and
the relationships that exist between them and other entities.

Associative networks are depicted as directed graphs with labelled nodes and arcs or
arrows. The language used in constructing a network is based on selected domain
primitives for objects and relations, as well as some general primitives.

Here, a class of objects known as bird is depicted. The class has some properties,
and a specific member of the class named tweety is shown; the colour of tweety is
seen to be yellow. Associative networks were introduced by Quillian in 1968
to model the semantics of English sentences and words.

Quillian’s model of semantic networks has a certain intuitive appeal in that related
information is clustered and bound together through relational links; the knowledge
required for the performance of some task is typically contained within a narrow
domain or semantic vicinity of the task. This type of organization in some ways
resembles the way knowledge is stored and retrieved in humans. The graphical
portrayal of knowledge can also be somewhat more expressive than other
representation schemes. Associative networks have been used in a variety of systems
such as natural language understanding, information retrieval, deductive databases,
learning systems, computer vision and speech generation systems.



Syntax and Semantics of Associative Networks:

There is neither a generally accepted syntax nor a generally accepted semantics for
associative networks. Most network systems are based on PL or FOPL with
extensions. The syntax for any system is determined by the object and relation
primitives chosen and by any special rules used to connect nodes. Basically, the
language of associative networks is formed from letters of the alphabet (both upper
and lower case), relational symbols, set membership and subset symbols, decimal
digits, square and oval nodes, and directed arcs of arbitrary length. The word symbols
used are those which represent object constants and n-ary relation constants. Nodes
are commonly used for objects or nouns, and arcs for relations. The direction of an
arc is usually taken from the first to subsequent arguments as they appear in a
relational statement.

Thus, OWNS(anand, house) would be drawn as a directed arc labelled OWNS from
the node anand to the node house.

A number of arc relations have become common among users. They include ISA,
MEMBER-OF, SUBSET-OF, AKO (a-kind-of), HAS-PARTS, INSTANCE-OF,
AGENT, ATTRIBUTE, SHARED-LIKE and so forth. Less common arcs
have also been used to express modality relations (time, manner, mood),
linguistic case relations (theme, source, goal), logical connectives, quantifiers,
set relations, attributes and quantification (ordinal, count).

The ISA link is most often used to represent the fact that an object is of a certain
type. The ISA predicate has been used to exhibit the following types of structures:
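Such labelled directed graphs can be stored as (node, relation, node) triples, with the ISA/MEMBER-OF links supporting simple property inheritance. The sketch below is a minimal illustration based on the tweety/bird example above; the lookup strategy is an assumption, not a standard API:

```python
# An associative network as labelled arcs, with property lookup that
# follows ISA/MEMBER-OF links upward (simple inheritance).

arcs = [
    ("tweety", "MEMBER-OF", "bird"),
    ("bird", "HAS-PARTS", "wings"),
    ("tweety", "COLOR", "yellow"),
]

def lookup(node, relation):
    """Find a relation's value at `node`, inheriting via MEMBER-OF/ISA."""
    for a, rel, b in arcs:
        if a == node and rel == relation:
            return b                   # directly stored at this node
    for a, rel, b in arcs:             # otherwise climb the hierarchy
        if a == node and rel in ("ISA", "MEMBER-OF"):
            return lookup(b, relation)
    return None

print(lookup("tweety", "COLOR"))      # yellow  (directly stored)
print(lookup("tweety", "HAS-PARTS"))  # wings   (inherited from bird)
```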

****FUZZY LOGIC

In fuzzy logic, we consider what happens if we make fundamental changes to our idea
of set membership, and corresponding changes to our definitions of logical operations.

While traditional set theory defines set membership as a Boolean predicate,

fuzzy set theory allows us to represent set membership as a possibility distribution,

such as the distributions for the set of tall people and the set of very tall people.

This contrasts with the standard Boolean definition of tall people, where one is either
tall or not, and there must be a specific height that defines the boundary; the same is
true for very tall. In fuzzy logic, one's tallness increases with one's height until the
value 1 is reached, so membership is a distribution.

Once set membership has been redefined in this way, it is possible to define a
reasoning system based on techniques for combining distributions.
The motivation for fuzzy sets is provided by the need to represent propositions such
as:

John is very tall

Mary is slightly tall

Most Frenchmen are not very tall

Sue and Linda are close friends
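A possibility distribution for "tall" can be sketched as a piecewise-linear membership function; the hedge "very" is often modelled by squaring the membership value. The breakpoints 150 cm and 190 cm below are arbitrary assumptions:

```python
# Fuzzy membership for "tall": 0 below 150 cm, rising linearly to 1 at 190 cm.
def tall(height_cm):
    if height_cm <= 150:
        return 0.0
    if height_cm >= 190:
        return 1.0
    return (height_cm - 150) / 40

def very_tall(height_cm):
    # A common model of the hedge "very" is squaring the membership value.
    return tall(height_cm) ** 2

for h in (150, 170, 190):
    print(h, tall(h), very_tall(h))
# 150 0.0 0.0 / 170 0.5 0.25 / 190 1.0 1.0
```

Note how "very tall" is always at most "tall": squaring a value in [0, 1] can only shrink it, which matches the intuition that "very tall" is a stricter property.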

***Structured Representation of Knowledge

Representing knowledge using a logical formalism, like predicate logic, has several
advantages: it can be combined with powerful inference mechanisms like
resolution, which makes reasoning with facts easy.

But using a logical formalism, the complex structures of the world (objects and their
relationships, events, sequences of events, etc.) cannot be described easily.

A good system for the representation of structured knowledge in a particular domain
should possess the following four properties:

(i) Representational Adequacy:- The ability to represent all kinds of knowledge that
are needed in that domain.

(ii) Inferential Adequacy :- The ability to manipulate the represented structure and
infer new structures.

(iii) Inferential Efficiency:- The ability to incorporate additional information into the
knowledge structure that will aid the inference mechanisms.

(iv) Acquisitional Efficiency :- The ability to acquire new information easily, either
by direct insertion or by program control.

The techniques that have been developed in AI systems to accomplish these objectives
fall under two categories:

1. Declarative Methods:- In these, knowledge is represented as a static collection of
facts which are manipulated by general procedures. The facts need to be stored
only once, and they can be used in any number of ways. Facts can easily be added to
declarative systems without changing the general procedures.



2. Procedural Methods:- In these, knowledge is represented as procedures. Default
reasoning and probabilistic reasoning are examples of procedural methods. In these,
heuristic knowledge of "how to do things efficiently" can be easily represented.

In practice, most knowledge representations employ a combination of both. Most
knowledge representation structures have been developed to handle programs
that process natural language input. One of the reasons that knowledge structures are so
important is that they provide a way to represent information about commonly
occurring patterns of things.

Such descriptions are sometimes called schemas. One definition of a schema is: "Schema
refers to an active organization of past reactions, or of past experience, which must
always be supposed to be operating in any well-adapted organic response."

By using schemas, people as well as programs can exploit the fact that the real world
is not random. There are several types of schemas that have proved useful in AI
programs. They include

(i) Frames:- Used to describe a collection of attributes that a given object possesses
(e.g. the description of a chair).

(ii) Scripts:- Used to describe a common sequence of events
(e.g. a restaurant scene).

(iii) Stereotypes:- Used to describe characteristics of people.

(iv) Rule models:- Used to describe common features shared among a
set of rules in a production system.

Frames and scripts are used very extensively in a variety of AI programs. Before
selecting any specific knowledge representation structure, the following issues have to
be considered.

(i) The basic properties of objects, if any, which are common to every problem
domain must be identified and handled appropriately.

(ii) The entire knowledge should be represented as a good set of primitives.

(iii) Mechanisms must be devised to access relevant parts in a large knowledge base.



All these structures share a common notion: complex entities can be described as a
collection of attributes and associated values (hence they are often called "slot-and-
filler structures"), i.e., these structures have the form of ordered triples

OBJECT x ATTRIBUTE x VALUE

Information can be retrieved from the knowledge base by an associative search on
VALUE.
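The slot-and-filler idea can be sketched as a small store of (object, attribute, value) triples with associative retrieval. This is a minimal illustration, not a standard library; the object names ("chair1", "desk1") and attributes are made up for the example.

```python
# A minimal slot-and-filler store: facts are (OBJECT, ATTRIBUTE, VALUE)
# triples, and retrieval is an associative search on any field.
triples = [
    ("chair1", "is-a",  "chair"),
    ("chair1", "color", "brown"),
    ("chair1", "legs",  4),
    ("desk1",  "is-a",  "desk"),
    ("desk1",  "color", "brown"),
]

def lookup(obj=None, attr=None, value=None):
    """Return all triples matching the fields that are given."""
    return [t for t in triples
            if (obj   is None or t[0] == obj)
            and (attr  is None or t[1] == attr)
            and (value is None or t[2] == value)]

# Associative search by VALUE: everything that is brown.
print(lookup(attr="color", value="brown"))
# Fill a slot of a known object: how many legs has chair1?
print(lookup(obj="chair1", attr="legs"))
```

Here retrieval by VALUE scans the triples, mirroring the associative-search idea of the text; real systems index the triples for efficiency.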

****Conceptual Dependency (CD)

This representation is used in natural language processing to represent the meaning
of sentences in such a way that inferences can be drawn from them.

It is independent of the language in which the sentences were originally stated. CD
representations of a sentence are built out of primitives, which are not words belonging
to the language but are conceptual; these primitives are combined to form the meanings
of the words. As an example, consider the event represented by the sentence.

In the above representation the symbols have the following meanings:

Arrows indicate the direction of dependency

A double arrow indicates a two-way link between the actor and the action

p indicates past tense

ATRANS is one of the primitive acts used by the theory; it indicates transfer of
possession

o indicates the object case relation



R indicates the recipient case relation

Conceptual dependency provides a structure in which knowledge can be represented


and also a set of building blocks from which representations can be built. A typical set
of primitive actions are

ATRANS - Transfer of an abstract relationship(Eg: give)


PTRANS - Transfer of the physical location of an object(Eg: go)
PROPEL - Application of physical force to an object (Eg: push)
MOVE - Movement of a body part by its owner (eg : kick)
GRASP - Grasping of an object by an actor(Eg: throw)
INGEST - Ingesting of an object by an animal (Eg: eat)
EXPEL - Expulsion of something from the body of an animal (Eg: cry)
MTRANS - Transfer of mental information(Eg: tell)
MBUILD - Building new information out of old(Eg: decide)
SPEAK - Production of sounds(Eg: say)
ATTEND - Focusing of sense organ toward a stimulus (Eg: listen)

A second set of building blocks is the set of allowable dependencies among the
conceptualizations described in a sentence.
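A CD structure can be sketched as a nested record built from the primitives above. The example sentence "John gave Mary a book" and the slot names are illustrative assumptions; only ATRANS, the object (o) and recipient (r) case relations, and the past-tense marker come from the theory.

```python
# A minimal sketch of a Conceptual Dependency structure for the
# (assumed) example sentence "John gave Mary a book".
cd = {
    "act":    "ATRANS",   # transfer of an abstract relationship (possession)
    "tense":  "p",        # past tense
    "actor":  "John",
    "object": "book",     # o: object case relation
    "to":     "Mary",     # r: recipient case relation
    "from":   "John",
}

def paraphrase(event):
    """Draw an inference from the structure, not from the words."""
    if event["act"] == "ATRANS":
        return f'{event["to"]} now possesses the {event["object"]}'
    return "no inference"

print(paraphrase(cd))   # Mary now possesses the book
```

Because the inference reads off the ATRANS primitive rather than the English verb "gave", the same rule would apply to any language whose sentence maps to this structure, which is the point of CD's language independence.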

****Truth Maintenance System (TMS)

A truth maintenance system (TMS) is a means of providing the ability to do
dependency-directed backtracking and thus to support non-monotonic reasoning.

A TMS allows assertions to be connected via a network of dependencies.

A justification-based truth maintenance system (JTMS) is often referred to simply as a
TMS.

A truth maintenance system maintains consistency in the knowledge representation of a
knowledge base.

The roles of a TMS are to:

. Provide justification for conclusions

. Recognize inconsistencies

. Support default reasoning

Provide justification for conclusions



When a problem-solving system gives an answer to a user's query, an explanation of
the answer is required.

. Recognize inconsistencies

The inference engine (IE) may tell the TMS that some sentences are contradictory.
The TMS may then find that all those sentences are believed true, and report this to the
IE, which can eliminate the inconsistencies by determining the assumptions used and
changing them appropriately.

Eg. A statement that either x, y or z is guilty, together with other statements that x is not
guilty, y is not guilty, and z is not guilty, forms a contradiction.

Support default reasoning

In the absence of any firm knowledge, in many situations we want to reason from
default assumptions.
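The justification-network idea can be sketched as a toy: a belief is IN if at least one of its justifications has all antecedents IN. This is a simplified illustration (no cycle handling, no OUT-lists); the Tweety example is the classic default-reasoning illustration, not from this text.

```python
# A toy justification network in the spirit of a JTMS.
# belief -> list of alternative justifications (each a list of antecedents).
justifications = {
    "flies(tweety)": [["bird(tweety)", "not abnormal(tweety)"]],
    "bird(tweety)":  [[]],              # premise: empty justification
    "not abnormal(tweety)": [[]],       # default assumption
}

def believed(node, js):
    """A node is IN if some justification has all antecedents IN."""
    return any(all(believed(a, js) for a in j) for j in js.get(node, []))

print(believed("flies(tweety)", justifications))   # True

# New evidence retracts the default assumption; the conclusion that
# depended on it goes OUT, without re-deriving unrelated beliefs.
del justifications["not abnormal(tweety)"]
print(believed("flies(tweety)", justifications))   # False
```

Retracting one assumption flips only the beliefs justified by it, which is the dependency-directed behaviour the text describes.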

Unit III-PLANNING

What is planning?

The planning problem in Artificial Intelligence is about the decision making performed
by intelligent creatures like robots, humans, or computer programs when trying to
achieve some goal. It involves choosing a sequence of actions that will transform the
state of the world, step by step, so that it will satisfy the goal.

What does planning involve?

• Planning problems are hard problems:
• They are certainly non-trivial (important).
• Solutions involve many aspects that we have studied so far:
• Search and problem-solving strategies.
• Knowledge representation schemes.
• Problem decomposition -- breaking a problem into smaller pieces and trying to
solve these first.

We have seen that it is possible to solve a problem by considering the appropriate form
of knowledge representation and using algorithms to solve parts of the problem and
also to use searching methods.



***Blocks World Planning Examples

What is the Blocks World? -- The world consists of:

A flat surface such as a tabletop

An adequate set of identical blocks which are identified by letters.

The blocks can be stacked one on top of another to form towers of apparently unlimited
height.

The stacking is achieved using a robot arm which has fundamental operations and
states which can be described using logic and combined using logical operations.

The robot can hold one block at a time and only one block can be moved at a time.

We shall use the four actions:

UNSTACK(A,B)
pick up clear block A from block B;
STACK(A,B)
place block A using the arm onto clear block B;
PICKUP(A)
lift clear block A with the empty arm;
PUTDOWN(A)
place the held block A onto a free space on the table.
and the five predicates:
ON(A,B)
block A is on block B.
ONTABLE(A)
block A is on the table.
CLEAR(A)
block A has nothing on it.
HOLDING(A)
the arm holds block A.
ARMEMPTY
the arm holds nothing
Using logic but not logical notation we can say that
 If the arm is holding a block it is not empty
 If block A is on the table it is not on any other block
 If block A is on block B, block B is not clear.



Why Use the Blocks world as an example?
 The blocks world is chosen because:
 it is sufficiently simple and well behaved.
 easily understood
 yet still provides a good sample environment to study planning:
 problems can be broken into nearly distinct subproblems
 we can show how partial solutions need to be combined to form a realistic
complete solution.

Planning System Components


Simple problem solving tasks basically involve the following tasks:

 Choose the best rule based upon heuristics.


 Apply this rule to create a new state.
 Detect when a solution is found.
 Detect dead ends so that they can be avoided.
More complex problem solvers often add a fifth task:
 Detect when a nearly solved state occurs and use special methods to make it a
solved state.
 Now let us look at what AI techniques are generally used in each of the above
tasks. We will then look at specific methods of implementation.

i. Choice of best rule

ii. Rule application

iii. Detecting Progress

i.Choice of best rule

Methods used involve

 finding the differences between the current states and the goal states, and then
 choosing the rules that reduce these differences most effectively.
 Means-ends analysis is a good example of this.
If we wish to travel by car to visit a friend:
 the first thing to do is to fill up the car with fuel.
 If we do not have a car then we need to acquire one.
 The largest difference must be tackled first.



ii.Rule application
 Previously rules could be applied without any difficulty as complete systems
were specified and rules enabled the system to progress from one state to the
next.
 Now we must be able to handle rules which only cover parts of systems.
 A number of approaches to this task have been used.

STRIPS:

STRIPS proposed another approach:


Basically each operator has three lists of predicates associated with it:

 a list of things that become TRUE called ADD.


 a list of things that become FALSE called DELETE.
 a set of prerequisites that must be true before the operator can be applied.
 Anything not in these lists is assumed to be unaffected by the operation.
 This method initial implementation of STRIPS -- has been extended to include
other forms of reasoning/planning (e.g. Nonmonotonic methods, Goal Stack
Planning and even Nonlinear planning -- see later)

 Consider the following example in the Blocks World and the fundamental
operations:

STACK

 Requires the arm to be holding block A and the other block B to be clear.
Afterwards, block A is on block B and the arm is empty; these become true (ADD).
The arm is no longer holding a block and block B is no longer clear; these predicates
become false (DELETE).

UNSTACK

 Requires that block A is on block B, that the arm is empty, and that block A is
clear. Afterwards, block B is clear and the arm is holding block A (ADD); the arm
is not empty and block A is not on block B (DELETE).

 We have now greatly reduced the information that needs to be held. If a new
attribute is introduced we do not need to add new axioms for existing operators.



Unlike in Green's method, we remove the state indicator and use a database of
predicates to indicate the current state.

 Thus if the last state was:

 ONTABLE(B) ON(A,B) CLEAR(A)

 after the unstack operation the new state is

 ONTABLE(B) CLEAR(B) HOLDING(A) CLEAR(A)
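The UNSTACK transition above can be sketched as set operations on the state database. This is a minimal sketch of one common STRIPS formulation; note that ARMEMPTY is added to the initial state here, since UNSTACK requires it even though the text's example state omits it.

```python
# Applying a STRIPS operator: remove the DELETE list from the state
# database, then add the ADD list, provided the preconditions hold.
def apply_op(state, preconds, add, delete):
    assert preconds <= state, "preconditions not satisfied"
    return (state - delete) | add

state = {"ONTABLE(B)", "ON(A,B)", "CLEAR(A)", "ARMEMPTY"}

# UNSTACK(A,B): precondition, ADD and DELETE lists (sketched).
new_state = apply_op(
    state,
    preconds={"ON(A,B)", "CLEAR(A)", "ARMEMPTY"},
    add={"HOLDING(A)", "CLEAR(B)"},
    delete={"ON(A,B)", "ARMEMPTY"},
)
print(sorted(new_state))
# ['CLEAR(A)', 'CLEAR(B)', 'HOLDING(A)', 'ONTABLE(B)']
```

Anything not named in the ADD or DELETE lists (here ONTABLE(B) and CLEAR(A)) is carried over unchanged, which is exactly the STRIPS assumption described above.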

Detecting Progress

The final solution can be detected if we can devise a predicate that is true when the
solution is found and false otherwise. Devising such a predicate requires a great deal
of thought, and establishing its correctness requires a proof.

Detecting false trails is also necessary:

E.g. A* search -- if insufficient progress is made then this trail is aborted in favour of a
more hopeful one.

 Sometimes it is clear that solving a problem one way has reduced the problem to
parts that are harder than the original state.

 By moving back from the goal state to the initial state it is possible to detect
conflicts and any trail or path that involves a conflict can be pruned out.

 Reducing the number of possible paths means that there are more resources
available for those left.

Suppose the computer teacher at a school is ill. There are two possible alternatives:
transfer a teacher from mathematics who knows computing, or bring another one in.

Possible Problems:

If the maths teacher is the only teacher of maths, the problem is not solved.

If there is no money left, the second solution could be impossible.

*****Goal Stack Planning

Basic idea: to handle interacting compound goals, use goal stacks. Here the stack
contains goals and operators, with their ADD, DELETE and PREREQUISITE lists, and
a database maintaining the current situation as each operator is used.

Consider the following, where we wish to proceed from the start state to the goal state.

We can describe the start state:

ON(B, A) ONTABLE(A) ONTABLE(C) ONTABLE(D) ARMEMPTY

and goal state:

ON(C, A) ON(B,D) ONTABLE(A) ONTABLE(D)

Initially the goal stack is the goal state.

We then split the problem into four subproblems

Two are solved as they already are true in the initial state -- ONTABLE(A),
ONTABLE(D).

With the other two -- there are two ways to proceed:

1. ON(C,A)
   ON(B,D)
   ON(C,A) ON(B,D) ONTABLE(A) ONTABLE(D)

2. ON(B,D)
   ON(C,A)
   ON(C,A) ON(B,D) ONTABLE(A) ONTABLE(D)
The method is to investigate the first node on the stack, i.e. the top goal.

If a sequence of operators is found that satisfies this goal, it is removed and the
next goal is attempted.

This continues until the goal stack is empty.

The new goal stack becomes;

CLEAR(D)

HOLDING(B)

CLEAR(D) HOLDING(B)

STACK (B, D)

ONTABLE(C) CLEAR(C) ARMEMPTY

PICKUP(C)

At this point the top goal is true, as is the next one, and thus the combined goal,
leading to the application of STACK(B,D), which means that the world model becomes

ONTABLE(A) ONTABLE(C) ONTABLE(D) ON(B,D) ARMEMPTY

This means that we can perform PICKUP(C) and then STACK (C,A)

Now, coming to the goal ON(B,D), we realise that this has already been achieved,
and checking the final goal we derive the following plan

1. UNSTACK(B,A)
2. STACK (B,D)
3. PICKUP(C)
4. STACK (C,A)
This method produces a plan using good Artificial Intelligence techniques such as
heuristics to find matching goals and the A* algorithm to detect unpromising paths
which can be discarded.

*****Sussman Anomaly

The above method may fail to give a good solution. Consider:

The start state is given by:

ON(C, A) ONTABLE(A) ONTABLE(B) ARMEMPTY

The goal by:

ON(A,B) ON(B,C)

This immediately leads to two approaches as given below

1. ON(A,B)
   ON(B,C)
   ON(A,B) ON(B,C)

2. ON(B,C)
   ON(A,B)
   ON(A,B) ON(B,C)
Choosing path 1 and trying to get block A on block B leads to the goal stack:

ON(C,A)

CLEAR(C)

ARMEMPTY

ON(C,A) CLEAR(C) ARMEMPTY

UNSTACK(C,A)

ARMEMPTY

CLEAR(A) ARMEMPTY

PICKUP(A)

CLEAR(B) HOLDING(A)

STACK(A,B)

ON(B,C)

ON(A,B) ON(B,C)
This achieves block A on block B which was produced by putting block C on the table.
The sequence of operators is

1. UNSTACK(C,A)
2. PUTDOWN(C)
3. PICKUP(A)
4. STACK (A,B)
Working on the next goal of ON(B,C) requires block B to be cleared so that it can be
stacked on block C. Unfortunately, we need to unstack block A, which we just stacked.
Thus the list of operators becomes
1. UNSTACK(C,A)
2. PUTDOWN(C)
3. PICKUP(A)
4. STACK (A,B)
5. UNSTACK(A,B)
6. PUTDOWN(A)
7. PICKUP(B)
8. STACK (B,C)

To get back to the state in which block A is on block B, two extra operations are needed:

9. PICKUP(A)
10. STACK(A,B)

Analyzing this sequence, we observe that:

Steps 4 and 5 are opposites and therefore cancel each other out.

Steps 3 and 6 are opposites and therefore cancel each other out as well.

So a more efficient scheme is:

1. UNSTACK(C,A)
2. PUTDOWN(C)
3. PICKUP(B)
4. STACK (B,C)
5. PICKUP(A)
6. STACK(A,B)
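The improved plan can be checked by executing it with STRIPS-style ADD/DELETE lists. The operator definitions below are a sketch of the standard blocks-world formulation, and the initial state adds CLEAR(B) and CLEAR(C), which the text's start state leaves implicit.

```python
# Each operator returns (preconditions, ADD list, DELETE list).
def unstack(x, y): return ({f"ON({x},{y})", f"CLEAR({x})", "ARMEMPTY"},
                           {f"HOLDING({x})", f"CLEAR({y})"},
                           {f"ON({x},{y})", "ARMEMPTY"})
def putdown(x):    return ({f"HOLDING({x})"},
                           {f"ONTABLE({x})", f"CLEAR({x})", "ARMEMPTY"},
                           {f"HOLDING({x})"})
def pickup(x):     return ({f"ONTABLE({x})", f"CLEAR({x})", "ARMEMPTY"},
                           {f"HOLDING({x})"},
                           {f"ONTABLE({x})", "ARMEMPTY"})
def stack(x, y):   return ({f"HOLDING({x})", f"CLEAR({y})"},
                           {f"ON({x},{y})", f"CLEAR({x})", "ARMEMPTY"},
                           {f"HOLDING({x})", f"CLEAR({y})"})

state = {"ON(C,A)", "ONTABLE(A)", "ONTABLE(B)",
         "CLEAR(B)", "CLEAR(C)", "ARMEMPTY"}
plan = [unstack("C", "A"), putdown("C"), pickup("B"),
        stack("B", "C"), pickup("A"), stack("A", "B")]

# Execute the six steps, checking each precondition as we go.
for pre, add, dele in plan:
    assert pre <= state, f"precondition failure: {pre - state}"
    state = (state - dele) | add

assert {"ON(A,B)", "ON(B,C)"} <= state
print("plan achieves the goal")
```

Running the eight-step plan the same way would also reach the goal, but with the two cancelling pairs of steps executed needlessly, which is what makes the six-step scheme more efficient.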

Nonlinear Planning Using Constraint Posting

Let us reconsider the SUSSMAN ANOMALY

Problems such as this one require subproblems to be worked on simultaneously.

Thus a nonlinear plan uses heuristics such as:

 Try to achieve ON(A,B) by clearing block A, putting block C on the table.
 Achieve ON(B,C) by stacking block B on block C.
 Complete ON(A,B) by stacking block A on block B.
Constraint posting builds up a plan by:
 suggesting operators,



 trying to order them, and
 produce bindings between variables in the operators and actual blocks.
The initial plan consists of no steps; by studying the goal state, ideas for possible
steps are generated.
There is no order or detail at this stage.
Gradually more detail is introduced, and constraints about the order of subsets of the
steps are added, until a completely ordered sequence is created.

In this problem, means-ends analysis suggests two steps with end conditions ON(A,B)
and ON(B,C), which indicates the operator STACK, giving the layout shown below,
where each operator is preceded by its preconditions and followed by its postconditions:
CLEAR(B)                      CLEAR(C)
*HOLDING(A)                   *HOLDING(B)
__________________            __________________
STACK(A,B)                    STACK(B,C)
__________________            __________________
ARMEMPTY                      ARMEMPTY
ON(A,B)                       ON(B,C)
~CLEAR(B)                     ~CLEAR(C)
~HOLDING(A)                   ~HOLDING(B)
NOTE:
 There is no order at this stage.
 Unachieved preconditions are starred (*).
 Both of the HOLDING preconditions are unachieved since the arm holds
nothing in the initial state.
 Delete postconditions are marked by (~ ).
Many planning methods have introduced heuristics to achieve goals or preconditions.



****Hierarchical Planning
In order to solve hard problems, a problem solver may have to generate long plans. To
do that efficiently, it is important to be able to eliminate some of the details of the
problem until a solution that addresses the main issues is found. Then an attempt can
be made to fill in the appropriate details.

Early attempt to do this involved the use of macro operators, in which large operators
were built from smaller ones. But in this approach no details were eliminated from the
actual descriptions of the operators.

A better approach was developed in the ABSTRIPS system, which actually planned in
a hierarchy of abstraction spaces, in each of which preconditions at a lower level of
abstraction were ignored.
As an example, suppose you want to visit a friend in Europe but you have a limited
amount of cash to spend. It makes sense to check air fares first, since finding an
affordable flight will be the most difficult part of the task.
You should not worry about getting out of your driveway, planning a route to the
airport, or parking your car.

The ABSTRIPS approach to problem solving is as follows: First solve the problem
completely, considering only preconditions whose criticality value is the highest
possible.

Because this process explores entire plans at one level of detail before it looks at the
lower level details of any one of them, it has been called length-first search.

***Perception
We perceive our environment through many channels: sight, sound, touch, smell, taste.
Many animals possess these same perceptual capabilities, and others are also able to
monitor entirely different channels.

Robots, too, can process visual and auditory information, and they can also be equipped
with more exotic sensors, such as laser rangefinders, speedometers and radar.

Two extremely important sensory channels for human are vision and spoken
language. It is through these two faculties that we gather almost all of the knowledge
that drives our problem-solving behaviors.



Vision: Accurate machine vision opens up a new realm of computer applications.
These applications include mobile robot navigation, complex manufacturing tasks
analysis of satellite images, and medical image processing.

The question is how we can transform raw camera images into useful information
about the world.
A video camera provides a computer with an image represented as a two-
dimensional grid of intensity levels. Each grid element, or pixel, may store a single
bit of information (that is, black/white) or many bits (perhaps a real-valued intensity
measure and color information).
A visual image is composed of thousands of pixels. What kinds of things might we
want to do with such an image? Here are four operations, in order of increasing
complexity:
1. Signal Processing:- Enhancing the image, either for human consumption or as
input to another program.
2. Measurement Analysis:- For images containing a single object, determining the
two-dimensional extent of the object depicted.
3. Pattern Recognition:- For single-object images, classifying the object into a
category drawn from a finite set of possibilities.

4. Image Understanding:- For images containing many objects, locating the objects in
the image, classifying them, and building a three-dimensional model of the scene.
There are algorithms that perform the first two operations. The third operation, pattern
recognition varies in its difficulty. It is possible to classify two-dimensional (2-D)
objects, such as machine parts coming down a conveyor belt, but classifying 3-D
objects is harder because of the large number of possible orientations for each object.
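The first two operations can be illustrated on a tiny grayscale grid. The 6x6 image and the threshold are made-up example data; the point is only to show thresholding (signal processing) and measuring the 2-D extent of a single object (measurement analysis).

```python
# A toy 6x6 grayscale image: one bright object on a dark background.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0, 0],
    [0, 7, 9, 9, 6, 0],
    [0, 0, 8, 7, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

# 1. Signal processing: binarize the image with a fixed threshold.
binary = [[1 if p >= 5 else 0 for p in row] for row in image]

# 2. Measurement analysis: the bounding box of the object's pixels
# gives its two-dimensional extent.
coords = [(r, c) for r, row in enumerate(binary)
                 for c, p in enumerate(row) if p]
rows = [r for r, _ in coords]
cols = [c for _, c in coords]
bbox = (min(rows), min(cols), max(rows), max(cols))
print(bbox)   # (1, 1, 3, 4)
```

Pattern recognition and image understanding would build on such primitives, but, as the text notes, they are much harder than these two pixel-level operations.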

Image understanding is the most difficult visual task, and it has been the subject of
the most study in AI. While some aspects of image understanding reduce to
measurement analysis and pattern recognition, the entire problem remains unsolved,
because of difficulties that include the following:

1. An image is two-dimensional, while the world is three-dimensional; some
information is necessarily lost when an image is created.



2. One image may contain several objects, and some objects may partially occlude
others.

3. The value of a single pixel is affected by many different phenomena, including the
color of the object, the source of the light, the angle and distance of the camera, the
pollution in the air, etc. It is hard to disentangle these effects.

As a result, 2-D images are highly ambiguous. Given a single image, we could
construct any number of 3-D worlds that would give rise to it; it is impossible
to decide which 3-D solid it should portray. In order to determine the most likely
interpretation of a scene, we have to apply several types of knowledge.

****Robot Architecture
Now let us turn to what happens when we put it all together -- perception, cognition, and
action.
There are many decisions involved in designing an architecture that integrates all these
capabilities, among them:
 What range of tasks is supported by the architecture?
 What type of environment(e.g. indoor, outdoor, space) is supported?
 How are complex behaviors turned into sequence of low-level action?
 Is control centralized or distributed?
 How are numeric and symbolic representations merged?
 How does the architecture represent the state of the world?
 How quickly can the architecture react to changes in the environment?
 How does the architecture decide when to plan and when to act?



1. The input to an AI program is symbolic in form (example: a typed English
sentence), whereas the input to a robot is typically an analog signal, such as a two-
dimensional video image or a speech waveform.
2. Robots require special hardware for perceiving and affecting the world, while AI
programs require only general-purpose computers.

3. Robot sensors are inaccurate, and their effectors are limited in precision.

4. Many robots must react in real time. A robot fighter plane, for example, cannot
afford to search optimally or to stop monitoring the world during a LISP garbage
collection.

5. The real world is unpredictable, dynamic, and uncertain. A robot cannot hope to
maintain a correct and complete description of the world. This means that a robot must
consider the trade-off between devising and executing plans. This trade-off has several
aspects.
For one thing, a robot may not possess enough information about the world for it to do
any useful planning. In that case, it must first engage in information-gathering activity.
Furthermore, once it begins executing a plan, the robot must continually monitor the
results of its actions. If the results are unexpected, then re-planning may be necessary.
6. Because robots must operate in the real world, searching and backtracking can be
costly.
Recent years have seen efforts to integrate research in robotics and AI. The old idea of
simply adding sensors and effectors to existing AI programs has given way to a serious
rethinking of basic AI algorithms in light of the problems involved in dealing with the
physical world.

Research in robotics is likewise affected by AI techniques, since reasoning about goals
and plans is essential for mapping perceptions onto appropriate actions.

***Texture:
Texture is one of the most important attributes used in image analysis and pattern
recognition. It provides surface characteristics for the analysis of many types of images
including natural scenes, remotely sensed data, and biomedical modalities and plays an
important role in the human visual system for recognition and interpretation.
• Although there is no formal definition for texture, intuitively this descriptor
provides measures of properties such as smoothness, coarseness, and regularity.



• Texture is one of the most important image characteristics to be found almost
anywhere in nature. It can be used to segment images into distinct object or
regions. The classification and recognition of different surfaces is often based on
texture properties.

Different types of textures.


1. Natural textures
2. Artificial textures
3. Regular textures

***Computer Vision:
• Computer vision is the science and technology of machines that see.
• Concerned with the theory for building artificial systems that obtain information
from images.
• The image data can take many forms, such as a video sequence, depth images,
views from multiple cameras, or multi-dimensional data from a medical scanner.

Unit IV-LEARNING

Learning is the improvement of performance with experience over time.

Learning element is the portion of a learning AI system that decides how to modify the
performance element and implements those modifications.
We all learn new knowledge through different methods, depending on the type of
material to be learned, the amount of relevant knowledge we already possess, and the
environment in which the learning takes place.
***There are five methods of learning. They are,
1. Memorization (rote learning)
2. Direct instruction (by being told)
3. Analogy
4. Induction
5. Deduction



Learning by memorization is the simplest form of learning. It requires the least
amount of inference and is accomplished by simply copying the knowledge, in the same
form that it will be used, directly into the knowledge base.
Example:- Memorizing multiplication tables, formulae, etc.

Direct instruction is a complex form of learning. This type of learning requires more
inference than rote learning, since the knowledge must be transformed into an
operational form before being added to the knowledge base. We use this type of
learning when a teacher presents a number of facts directly to us in a well-organized
manner.

Analogical learning is the process of learning a new concept or solution through the
use of similar known concepts or solutions. We use this type of learning when solving
problems on an exam, where previously learned examples serve as a guide. We
make frequent use of analogical learning.

This form of learning requires still more inferring than either of the previous forms,
since difficult transformations must be made between the known and unknown
situations.

Learning by induction is also one that is used frequently by humans. It is a powerful
form of learning which, like analogical learning, also requires more inferring than the
first two methods. This learning requires the use of inductive inference, a form of
invalid but useful inference.

We use inductive learning when we formulate a general concept after seeing a number
of instances or examples of the concept. For example, we learn the concepts of color
or sweet taste after experiencing the sensations associated with several examples of
colored objects or sweet foods.

Deductive learning is accomplished through a sequence of deductive inference steps
using known facts. From the known facts, new facts or relationships are logically
derived. Deductive learning usually requires more inference than the other methods.



Matching and Learning.

MATCHING:So far, we have seen the process of using search to solve problems as
the application of appropriate rules to individual problem states to generate new states
to which the rules can then be applied, and so forth, until a solution is found.

Clever search involves choosing from among the rules that can be applied at a
particular point, the ones that are most likely to lead to a solution. We need to extract
from the entire collection of rules, those that can be applied at a given point. To do so
requires some kind of matching between the current state and the preconditions of the
rules.

How should this be done?


One way to select applicable rules is to do a simple search through all the rules,
comparing each one's preconditions to the current state and extracting all the ones that
match; alternatively, all the rules can be indexed. But there are two problems with
these simple solutions:

A. It requires the use of a large number of rules. Scanning through all of them would
be hopelessly inefficient.
B. It is not always immediately obvious whether a rule’s preconditions are satisfied by
a particular state.

Sometimes, instead of searching through the rules, we can use the current state as an
index into the rules and select the matching ones immediately. In spite of its limitations,
indexing in some form is very important in the efficient operation of rule-based
systems.
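The indexing idea can be sketched with rules whose preconditions are sets of facts: an index from each fact to the rules mentioning it narrows the candidates, and a full precondition check then confirms applicability. The rule names and blocks-world facts are illustrative.

```python
# Rules as (name, precondition set, action).
rules = [
    ("r1", frozenset({"CLEAR(A)", "ARMEMPTY"}), "PICKUP(A)"),
    ("r2", frozenset({"HOLDING(A)", "CLEAR(B)"}), "STACK(A,B)"),
    ("r3", frozenset({"CLEAR(A)", "ON(A,B)", "ARMEMPTY"}), "UNSTACK(A,B)"),
]

# Index: fact -> rules that mention it in their preconditions.
index = {}
for name, pre, action in rules:
    for fact in pre:
        index.setdefault(fact, []).append((name, pre, action))

def applicable(state):
    """Gather candidate rules via the index, then fully check each."""
    candidates = {r for fact in state for r in index.get(fact, [])}
    return sorted(name for name, pre, _ in candidates if pre <= state)

print(applicable({"CLEAR(A)", "ARMEMPTY", "ONTABLE(A)"}))   # ['r1']
```

With many rules, the index means only rules sharing at least one fact with the current state are examined, rather than every rule in the knowledge base.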

A more complex matching is required when the preconditions of rule specify required
properties that are not stated explicitly in the description of the current state. In this
case, a separate set of rules must be used to describe how some properties can be
inferred from others.

An even more complex matching process is required if rules should be applied when
their preconditions only approximately match the current situation. This is often the
case in situations involving physical descriptions of the world.



***Inductive learning:

Inductive learning is an inherently conjectural process, because any knowledge
created by generalization from specific facts cannot be proven true; it can only be
proven false. Hence, inductive inference is falsity preserving, not truth preserving.
• To generalize beyond the specific training examples, we need constraints or biases
on what f is best. That is, learning can be viewed as searching the Hypothesis Space
H of possible f functions.
• A bias allows us to choose one f over another one
• A completely unbiased inductive algorithm could only memorize the training
examples and could not say anything more about other unseen examples.
• Two types of biases are commonly used in machine learning:
Restricted Hypothesis Space Bias: allow only certain types of f functions, not
arbitrary ones.
Preference Bias: define a metric for comparing fs, so as to determine whether one is
better than another.
Inductive Learning Framework
• Raw input data from sensors are preprocessed to obtain a feature vector, x, that
adequately describes all of the relevant features for classifying examples.
• Each x is a list of (attribute, value) pairs. For example,
x = (Person = Sue, Eye-Color = Brown, Age = Young, Sex = Female)
The number of attributes (also called features) is fixed (positive, finite). Each attribute
has a fixed, finite number of possible values.
Each example can be interpreted as a point in an n-dimensional feature space, where n
is the number of attributes.

****Supervised Machine Learning


Supervised learning is the type of machine learning in which machines are trained using
well "labelled" training data, and on the basis of that data, machines predict the output.
Labelled data means input data that is already tagged with the correct output.

In supervised learning, the training data provided to the machines works as the supervisor
that teaches the machines to predict the output correctly. It applies the same concept as a
student learning under the supervision of a teacher.



Supervised learning is a process of providing input data as well as correct output data to the
machine learning model. The aim of a supervised learning algorithm is to find a mapping
function to map the input variable(x) with the output variable(y).

In the real-world, supervised learning can be used for Risk Assessment, Image
classification, Fraud Detection, spam filtering, etc.

How Supervised Learning Works?

In supervised learning, models are trained using a labelled dataset, where the model learns
about each type of data. Once the training process is completed, the model is tested on
held-out test data, and then it predicts the output.

The working of Supervised learning can be easily understood by the below example and
diagram:

Suppose we have a dataset of different types of shapes, which includes squares, rectangles,
triangles, and polygons. The first step is to train the model on each shape.

o If the given shape has four sides, and all the sides are equal, then it will be labelled as
a Square.
o If the given shape has three sides, then it will be labelled as a triangle.
o If the given shape has six equal sides, then it will be labelled as a hexagon.

Now, after training, we test our model using the test set, and the task of the model is to
identify the shape.



The machine is already trained on all types of shapes, and when it finds a new shape, it
classifies the shape on the basis of its number of sides and predicts the output.

Steps Involved in Supervised Learning:

o First, determine the type of training dataset.
o Collect/gather the labelled training data.
o Split the dataset into a training set, a test set, and a validation set.
o Determine the input features of the training dataset, which should carry enough information
for the model to accurately predict the output.
o Determine a suitable algorithm for the model, such as a support vector machine or a
decision tree.
o Execute the algorithm on the training dataset. Sometimes we also need a validation set to
tune the control parameters; this is a subset held out from the training data.
o Evaluate the accuracy of the model using the test set. If the model predicts the correct
outputs, it is accurate.

Types of supervised Machine learning Algorithms:

Supervised learning can be further divided into two types of problems:

1. Regression

Regression algorithms are used when there is a relationship between the input variable and
the output variable and the goal is to predict a continuous value, as in weather forecasting
or market-trend analysis. Below are some popular regression algorithms which come under
supervised learning:

o Linear Regression



o Regression Trees
o Non-Linear Regression
o Bayesian Linear Regression
o Polynomial Regression

2. Classification

Classification algorithms are used when the output variable is categorical, meaning it falls
into discrete classes such as Yes/No, Male/Female, or True/False.

o Random Forest
o Decision Trees
o Logistic Regression
o Support vector Machines

Advantages of Supervised learning:

o With the help of supervised learning, the model can predict the output on the basis of prior
experience.
o In supervised learning, we can have an exact idea about the classes of objects.
o A supervised learning model helps us solve various real-world problems such as fraud
detection and spam filtering.

Disadvantages of supervised learning:

o Supervised learning models are not suitable for handling very complex tasks.
o Supervised learning cannot predict the correct output if the test data differs substantially
from the training dataset.
o Training requires a lot of computation time.
o In supervised learning, we need sufficient knowledge about the classes of objects.

Unsupervised Machine Learning

In supervised machine learning, models are trained using labelled data under supervision.
But there may be many cases in which we do not have labelled data and need to find the
hidden patterns in a given dataset. To solve such cases in machine learning, we need
unsupervised learning techniques.



What is Unsupervised Learning?

As the name suggests, unsupervised learning is a machine learning technique in which
models are not supervised using a training dataset. Instead, the models themselves find the
hidden patterns and insights in the given data. It can be compared to the learning which
takes place in the human brain while learning new things. It can be defined as:

Unsupervised learning is a type of machine learning in which models are trained using unlabeled dataset
and are allowed to act on that data without any supervision.

Unsupervised learning cannot be directly applied to a regression or classification problem
because, unlike supervised learning, we have the input data but no corresponding output
data. The goal of unsupervised learning is to find the underlying structure of the dataset,
group the data according to similarities, and represent the dataset in a compressed
format.

Example: Suppose the unsupervised learning algorithm is given an input dataset containing
images of different types of cats and dogs. The algorithm is never trained upon the given
dataset, which means it does not have any idea about the features of the dataset. The task of
the unsupervised learning algorithm is to identify the image features on their own.
The unsupervised learning algorithm will perform this task by clustering the image dataset
into groups according to the similarities between images.
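The clustering idea above can be sketched with a minimal 1-D k-means (a simplified illustration with invented values, not a library implementation):

```python
# Minimal 1-D k-means sketch: repeatedly assign each point to its nearest
# centroid, then move each centroid to the mean of its assigned points.
def kmeans_1d(points, k=2, iterations=10):
    centroids = points[:k]                       # naive initialisation
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Two obvious groups of unlabeled values; no output labels are given.
points = [1.0, 1.2, 0.8, 10.0, 10.5, 9.5]
centroids, clusters = kmeans_1d(points)
print(sorted(round(c, 2) for c in centroids))  # → [1.0, 10.0]
```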


Why use Unsupervised Learning?

Below are some main reasons which describe the importance of Unsupervised Learning:

o Unsupervised learning is helpful for finding useful insights in the data.
o Unsupervised learning is similar to how a human learns to think through their own
experiences, which makes it closer to real AI.



o Unsupervised learning works on unlabeled and uncategorized data, which makes
unsupervised learning all the more important.
o In the real world, we do not always have input data with the corresponding output, so to
solve such cases we need unsupervised learning.

Working of Unsupervised Learning

Working of unsupervised learning can be understood by the below diagram:

Here, we have taken unlabeled input data, which means it is not categorized and no
corresponding outputs are given. This unlabeled input data is fed to the machine learning
model in order to train it. The model first interprets the raw data to find the hidden
patterns in it and then applies a suitable algorithm, such as k-means clustering or
hierarchical clustering.

Once it applies the suitable algorithm, the algorithm divides the data objects into groups
according to the similarities and differences between the objects.

Types of Unsupervised Learning Algorithm:

The unsupervised learning algorithm can be further categorized into two types of problems:



o Clustering: Clustering is a method of grouping objects into clusters such that objects
with the most similarities remain in one group and have few or no similarities with the
objects of another group. Cluster analysis finds the commonalities between the data objects
and categorizes them as per the presence and absence of those commonalities.
o Association: An association rule is an unsupervised learning method used for finding
relationships between variables in a large database. It determines the sets of items that
occur together in the dataset. Association rules make marketing strategies more effective;
for example, people who buy item X (say, bread) also tend to purchase item Y (butter or
jam). A typical application of association rules is Market Basket Analysis.
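The market-basket idea can be sketched by counting how often item pairs occur together (the transactions are invented for illustration; a real Apriori implementation would add support and confidence thresholds):

```python
from itertools import combinations
from collections import Counter

# Toy market-basket data: each transaction is a set of purchased items.
transactions = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "jam"},
]

# Count how often each pair of items occurs together (the pair's "support" count).
pair_counts = Counter()
for t in transactions:
    for pair in combinations(sorted(t), 2):
        pair_counts[pair] += 1

print(pair_counts[("bread", "butter")])  # → 2
print(pair_counts[("butter", "jam")])    # → 2
```

Pairs with a high co-occurrence count suggest rules like "people who buy bread also tend to buy butter."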

Unsupervised Learning algorithms:

Below is the list of some popular unsupervised learning algorithms:

o K-means clustering
o Hierarchical clustering
o Anomaly detection
o Neural networks
o Principal Component Analysis
o Independent Component Analysis
o Apriori algorithm
o Singular value decomposition

Advantages of Unsupervised Learning

o Unsupervised learning is used for more complex tasks as compared to supervised learning
because, in unsupervised learning, we don't have labeled input data.



o Unsupervised learning is preferable because it is easier to obtain unlabeled data than
labeled data.

Disadvantages of Unsupervised Learning

o Unsupervised learning is intrinsically more difficult than supervised learning, as it has no
corresponding output to learn from.
o The result of an unsupervised learning algorithm might be less accurate, since the input
data is not labeled and the algorithm does not know the exact output in advance.

K-Nearest Neighbor(KNN) Algorithm for Machine Learning


o K-Nearest Neighbor is one of the simplest machine learning algorithms, based on the
Supervised Learning technique.
o The K-NN algorithm assumes similarity between the new case/data and the available cases
and puts the new case into the category that is most similar to the available categories.
o The K-NN algorithm stores all the available data and classifies a new data point based on
similarity. This means that when new data appears, it can be easily classified into a
well-suited category by using the K-NN algorithm.
o The K-NN algorithm can be used for regression as well as for classification, but mostly it is
used for classification problems.
o K-NN is a non-parametric algorithm, which means it makes no assumptions about the
underlying data.
o It is also called a lazy learner algorithm because it does not learn from the training set
immediately; instead it stores the dataset and, at the time of classification, performs an
action on it.
o The KNN algorithm at the training phase just stores the dataset, and when it gets new data,
it classifies that data into a category that is most similar to the new data.
o Example: Suppose we have an image of a creature that looks similar to both a cat and a
dog, and we want to know whether it is a cat or a dog. For this identification, we can use the
KNN algorithm, as it works on a similarity measure. Our KNN model will find the features of
the new image that are similar to the cat and dog images and, based on the most similar
features, will put it in either the cat or the dog category.



Why do we need a K-NN Algorithm?

Suppose there are two categories, Category A and Category B, and we have a new data
point x1. In which of these categories will this data point lie? To solve this type of
problem, we need a K-NN algorithm. With the help of K-NN, we can easily identify the
category or class of a particular data point. Consider the below diagram:

How does K-NN work?

The K-NN working can be explained on the basis of the below algorithm:

o Step-1: Select the number K of neighbors.
o Step-2: Calculate the Euclidean distance from the new data point to each training point.
o Step-3: Take the K nearest neighbors as per the calculated Euclidean distances.
o Step-4: Among these K neighbors, count the number of data points in each category.
o Step-5: Assign the new data point to the category for which the number of neighbors is
maximum.
o Step-6: Our model is ready.
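The steps above can be sketched as a minimal K-NN classifier in plain Python (the training points and k value are illustrative):

```python
import math
from collections import Counter

# Minimal K-NN classifier following the steps above (illustrative sketch).
def knn_predict(train, new_point, k=5):
    # Steps 2-3: compute Euclidean distances and take the k nearest neighbours.
    nearest = sorted(train, key=lambda item: math.dist(item[0], new_point))[:k]
    # Steps 4-5: count the categories among the neighbours; assign the majority.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Labelled points in two categories, A and B.
train = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
         ((8, 8), "B"), ((8, 9), "B"), ((9, 8), "B")]
print(knn_predict(train, (2, 2)))  # → A
print(knn_predict(train, (9, 9)))  # → B
```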

Suppose we have a new data point and we need to put it in the required category. Consider
the below image:

o Firstly, we will choose the number of neighbors, so we choose k=5.
o Next, we will calculate the Euclidean distance between the data points. The Euclidean
distance is the distance between two points, which we have already studied in geometry.
For points (x1, y1) and (x2, y2) it can be calculated as d = √((x2 − x1)² + (y2 − y1)²).



o By calculating the Euclidean distances we get the nearest neighbors: three nearest
neighbors in category A and two nearest neighbors in category B. Consider the below image:

o As we can see, the 3 nearest neighbors are from category A; hence this new data point
must belong to category A.

How to select the value of K in the K-NN Algorithm?

Below are some points to remember while selecting the value of K in the K-NN algorithm:
there is no particular way to determine the best value for "K", so we need to try several
values to find the best among them. The most commonly preferred value for K is 5.

o A very low value for K, such as K=1 or K=2, can be noisy and expose the model to the
effects of outliers.
o Large values for K are good, but they may cause some difficulties.

Advantages of KNN Algorithm:

o It is simple to implement.
o It is robust to noisy training data.
o It can be more effective if the training data is large.

Disadvantages of KNN Algorithm:

o We always need to determine the value of K, which may be complex at times.



o The computation cost is high, because the distance from the new data point to every
training sample must be calculated.

**Decision Tree Classification Algorithm


o Decision Tree is a supervised learning technique that can be used for both classification
and regression problems, but mostly it is preferred for solving classification problems. It is a
tree-structured classifier, where internal nodes represent the features of a dataset,
branches represent the decision rules, and each leaf node represents the outcome.
o In a decision tree, there are two kinds of nodes: the Decision Node and the Leaf
Node. Decision nodes are used to make a decision and have multiple branches, whereas
leaf nodes are the outputs of those decisions and do not contain any further branches.
o The decisions or tests are performed on the basis of the features of the given dataset.
o It is a graphical representation for getting all the possible solutions to a
problem/decision based on given conditions.
o It is called a decision tree because, similar to a tree, it starts with the root node, which
expands on further branches and constructs a tree-like structure.
o In order to build a tree, we use the CART algorithm, which stands for Classification and
Regression Tree algorithm.
o A decision tree simply asks a question and, based on the answer (Yes/No), further splits
the tree into subtrees.
o The below diagram explains the general structure of a decision tree:



Why use Decision Trees?

There are various algorithms in machine learning, so choosing the best algorithm for the
given dataset and problem is the main point to remember while creating a machine learning
model. Below are two reasons for using a decision tree:

o Decision Trees usually mimic human thinking ability while making a decision, so it is easy to
understand.
o The logic behind the decision tree can be easily understood because it shows a tree-like
structure.

Decision Tree Terminologies

o Root Node: The root node is where the decision tree starts. It represents the entire
dataset, which further gets divided into two or more homogeneous sets.

o Leaf Node: Leaf nodes are the final output nodes; the tree cannot be segregated further
after a leaf node.

o Splitting: Splitting is the process of dividing the decision node/root node into sub-nodes
according to the given conditions.

o Branch/Sub-Tree: A subtree formed by splitting the tree.

o Pruning: Pruning is the process of removing unwanted branches from the tree.

o Parent/Child node: The root node of the tree is called the parent node, and the other
nodes are called the child nodes.

How does the Decision Tree algorithm Work?

In a decision tree, to predict the class of a given record, the algorithm starts from the
root node of the tree. The algorithm compares the value of the root attribute with the
record's (real dataset) attribute and, based on the comparison, follows the corresponding
branch and jumps to the next node.

For the next node, the algorithm again compares the attribute value with the other
sub-nodes and moves further. It continues the process until it reaches a leaf node of the
tree. The complete process can be better understood using the below algorithm:

o Step-1: Begin the tree with the root node, say S, which contains the complete dataset.
o Step-2: Find the best attribute in the dataset using an Attribute Selection Measure (ASM).
o Step-3: Divide S into subsets that contain the possible values of the best attribute.
o Step-4: Generate the decision tree node which contains the best attribute.
o Step-5: Recursively make new decision trees using the subsets of the dataset created in
Step-3. Continue this process until a stage is reached where you cannot further classify the
nodes; the final nodes are called leaf nodes.

Example: Suppose there is a candidate who has a job offer and wants to decide whether he
should accept the offer or not. To solve this problem, the decision tree starts with the
root node (the Salary attribute, chosen by ASM). The root node splits further into the next
decision node (distance from the office) and one leaf node based on the corresponding
labels. The next decision node further splits into one decision node (cab facility) and one
leaf node. Finally, the decision node splits into two leaf nodes (Accepted offer and
Declined offer). Consider the below diagram:
Consider the below diagram:

Attribute Selection Measures


While implementing a decision tree, the main issue that arises is how to select the best
attribute for the root node and for the sub-nodes. To solve such problems there is a
technique called the Attribute Selection Measure, or ASM. With this measurement, we
can easily select the best attribute for the nodes of the tree. There are two popular
techniques for ASM:

o Information Gain



o Gini Index

1. Information Gain:
o Information gain is the measurement of the change in entropy after the segmentation of a
dataset based on an attribute.
o It calculates how much information a feature provides us about a class.
o According to the value of information gain, we split the node and build the decision tree.
o A decision tree algorithm always tries to maximize the value of information gain, and the
node/attribute having the highest information gain is split first. It can be calculated using
the below formula:

Information Gain = Entropy(S) − [(Weighted Avg) × Entropy(each feature)]

Entropy: Entropy is a metric to measure the impurity in a given attribute. It specifies the
randomness in the data. Entropy can be calculated as:

Entropy(S) = −P(yes) log₂ P(yes) − P(no) log₂ P(no)

Where,

o S = the set of all samples
o P(yes) = probability of yes
o P(no) = probability of no

2. Gini Index:
o The Gini index is a measure of impurity or purity used while creating a decision tree in the
CART (Classification and Regression Tree) algorithm.
o An attribute with a low Gini index should be preferred over one with a high Gini index.
o It only creates binary splits, and the CART algorithm uses the Gini index to create them.
o The Gini index can be calculated using the below formula:

Gini Index = 1 − ∑ⱼ Pⱼ²
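Both impurity measures can be sketched for a binary node (a minimal illustration of the two formulas above):

```python
import math

# Entropy for a binary split: -P(yes) log2 P(yes) - P(no) log2 P(no).
def entropy(p_yes, p_no):
    return sum(-p * math.log2(p) for p in (p_yes, p_no) if p > 0)

# Gini index: 1 - sum of squared class probabilities.
def gini(*probabilities):
    return 1 - sum(p ** 2 for p in probabilities)

# A perfectly mixed node is maximally impure:
print(entropy(0.5, 0.5))  # → 1.0
print(gini(0.5, 0.5))     # → 0.5
# A pure node has zero impurity:
print(entropy(1.0, 0.0))  # → 0.0
print(gini(1.0, 0.0))     # → 0.0
```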

Advantages of the Decision Tree

o It is simple to understand, as it follows the same process a human follows while making a
decision in real life.
o It can be very useful for solving decision-related problems.



o It helps to think about all the possible outcomes for a problem.
o There is less requirement for data cleaning compared to other algorithms.

Disadvantages of the Decision Tree

o The decision tree may contain many layers, which makes it complex.


o It may have an overfitting issue, which can be resolved using the Random Forest algorithm.
o For more class labels, the computational complexity of the decision tree may increase.

****Artificial Neural Network

The term "artificial neural network" refers to a biologically inspired sub-field of artificial
intelligence modeled after the brain. An artificial neural network is a computational
network based on the biological neural networks that constitute the structure of the human
brain. Just as a human brain has neurons interconnected with each other, artificial neural
networks also have neurons that are linked to each other in the various layers of the
network. These neurons are known as nodes.

Artificial neural network tutorial covers all the aspects related to the artificial neural network.
In this tutorial, we will discuss ANNs, Adaptive resonance theory, Kohonen self-organizing
map, Building blocks, unsupervised learning, Genetic algorithm, etc.

What is Artificial Neural Network?

The term "Artificial Neural Network" is derived from Biological neural networks that
develop the structure of a human brain. Similar to the human brain that has neurons
interconnected to one another, artificial neural networks also have neurons that are
interconnected to one another in various layers of the networks. These neurons are known
as nodes.



The given figure illustrates the typical diagram of Biological Neural Network.

The typical Artificial Neural Network looks something like the given figure.

Dendrites in a biological neural network represent inputs in artificial neural networks, the
cell nucleus represents nodes, synapses represent weights, and the axon represents the output.



Relationship between Biological neural network and artificial neural network:

Biological Neural Network    Artificial Neural Network

Dendrites                    Inputs
Cell nucleus                 Nodes
Synapse                      Weights
Axon                         Output

An artificial neural network, in the field of artificial intelligence, attempts to mimic the
network of neurons that makes up a human brain so that computers have an option to
understand things and make decisions in a human-like manner. The artificial neural
network is designed by programming computers to behave simply like interconnected
brain cells.

There are on the order of 100 billion neurons in the human brain. Each neuron has
somewhere in the range of 1,000 to 100,000 connection points. In the human brain, data is
stored in a distributed manner, and we can extract more than one piece of this data, when
necessary, from our memory in parallel. We can say that the human brain is made up of
incredibly amazing parallel processors.

We can understand the artificial neural network with an example. Consider a digital logic
gate that takes an input and gives an output: an "OR" gate, which takes two inputs. If one
or both inputs are "On," the output is "On." If both inputs are "Off," the output is "Off."
Here the output depends only on the input. Our brain does not perform the same task: the
output-to-input relationship keeps changing, because the neurons in our brain are
"learning."

The architecture of an artificial neural network:

To understand the architecture of an artificial neural network, we have to understand what
a neural network consists of. A neural network consists of a large number of artificial
neurons, termed units, arranged in a sequence of layers. Let us look at the various types of
layers available in an artificial neural network.

Artificial Neural Network primarily consists of three layers:



Input Layer:

As the name suggests, it accepts inputs in several different formats provided by the
programmer.

Hidden Layer:

The hidden layer sits between the input and output layers. It performs all the
calculations to find hidden features and patterns.

Output Layer:

The input goes through a series of transformations using the hidden layer, which finally
results in output that is conveyed using this layer.

The artificial neural network takes the inputs, computes the weighted sum of the inputs,
and includes a bias. This computation is represented in the form of a transfer function.

The weighted total is then passed as input to an activation function to produce the
output. Activation functions decide whether a node should fire or not. Only the nodes that
fire make it to the output layer. There are distinctive activation functions available that can
be applied depending on the sort of task we are performing.
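A single neuron's weighted sum, bias, and activation can be sketched as follows (the weights and bias are arbitrary illustrative values, and a sigmoid stands in for the activation function):

```python
import math

# One artificial neuron: weighted sum of the inputs plus a bias, passed
# through an activation function (a sigmoid here), as described above.
def neuron(inputs, weights, bias):
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-weighted_sum))   # sigmoid activation

# Example: two inputs with illustrative weights; the weighted sum is 0,
# and sigmoid(0) = 0.5.
output = neuron(inputs=[1.0, 0.0], weights=[2.0, -1.0], bias=-2.0)
print(round(output, 3))  # → 0.5
```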



Advantages of Artificial Neural Network (ANN)

Parallel processing capability:

Artificial neural networks can perform more than one task simultaneously.

Storing data on the entire network:

Unlike traditional programming, where data is stored in a database, the data an ANN uses
is stored across the whole network. The disappearance of a few pieces of data in one place
does not prevent the network from working.

Capability to work with incomplete knowledge:

After training, an ANN may produce output even with inadequate data. The loss of
performance here depends on the significance of the missing data.

Having a memory distribution:

For an ANN to be able to adapt, it is important to determine the examples and to teach
the network according to the desired output by demonstrating these examples to it. The
success of the network is directly proportional to the chosen instances; if the event cannot
be shown to the network in all its aspects, the network can produce false output.

Having fault tolerance:

Corruption of one or more cells of an ANN does not prevent it from generating output,
and this feature makes the network fault-tolerant.

Disadvantages of Artificial Neural Network:

Assurance of proper network structure:

There is no particular guideline for determining the structure of an artificial neural
network. The appropriate network structure is found through experience and trial and error.

Unrecognized behavior of the network:

This is the most significant issue with ANNs. When an ANN produces a solution, it does
not provide insight into why and how it was reached. This decreases trust in the network.

Hardware dependence:

Dr.Harish Naik T,DCA,Presidency College. Page 84


Artificial neural networks need processors with parallel processing power, as per their
structure. The realization of the network is therefore dependent on suitable hardware.

Difficulty of showing the issue to the network:

ANNs can work only with numerical data. Problems must be converted into numerical
values before being introduced to the ANN. The representation mechanism chosen here
directly impacts the performance of the network and relies on the user's abilities.

The duration of the network is unknown:

The network is trained down to a specific value of the error, and this value does not
guarantee optimum results.

Artificial neural networks, which stepped into the world in the mid-20th century, are developing
exponentially. Above, we have examined the pros of artificial neural networks and the issues
encountered in the course of their use. It should not be overlooked that the cons of ANNs, a
flourishing branch of science, are being eliminated one by one, while their pros increase day by
day. This means that artificial neural networks will progressively become an irreplaceable, ever
more important part of our lives.

How do artificial neural networks work?

An artificial neural network can best be represented as a weighted directed graph, where the
artificial neurons form the nodes. The associations between neuron outputs and neuron
inputs can be viewed as directed edges with weights. The artificial neural network
receives an input signal from an external source in the form of a pattern or image,
represented as a vector. These inputs are then mathematically denoted by the notation x(n)
for each of the n inputs.



Afterward, each input is multiplied by its corresponding weight (these weights are the
details the artificial neural network uses to solve a specific problem). In general terms,
these weights represent the strength of the interconnections between neurons inside
the artificial neural network. All the weighted inputs are summed inside the computing unit.

If the weighted sum is zero, a bias is added to make the output non-zero, or otherwise to
scale up the system's response. The bias has a fixed input of 1 with its own weight. Here
the total of the weighted inputs can range from 0 to positive infinity. To keep the response
within the limits of the desired value, a certain maximum value is benchmarked, and the
total of the weighted inputs is passed through the activation function.

The activation function refers to the set of transfer functions used to achieve the desired
output. There are different kinds of activation functions, primarily linear or non-linear
sets of functions. Some of the commonly used activation functions are the binary, linear,
and tan-hyperbolic (sigmoidal) activation functions.

Types of Artificial Neural Network:

There are various types of artificial neural networks (ANNs), which, depending on the
human brain's neuron and network functions, perform tasks in a similar way. The majority
of artificial neural networks have some similarities with their more complex biological
counterpart and are very effective at their intended tasks, for example segmentation or
classification.

Feedback ANN:

In this type of ANN, the output is returned into the network to internally achieve the
best-evolved results. As per the University of Massachusetts Lowell Center for Atmospheric
Research, feedback networks feed information back into themselves and are well suited to
solving optimization problems. Internal system error corrections utilize feedback ANNs.



Feed-Forward ANN:
A feed-forward network is a basic neural network comprising an input layer, an output layer, and
at least one layer of neurons. By assessing its output against its input, the strength of the
network can be observed from the group behavior of the associated neurons, and the output is
decided. The primary advantage of this network is that it learns to evaluate and recognize
input patterns.

****Expert systems

There is a class of computer programs, known as expert systems, that aim to mimic
human reasoning. The methods and techniques used to build these programs are the
outcome of efforts in a field of computer science known as Artificial Intelligence (AI).

Expert systems have been built to diagnose disease (Pathfinder is an expert system that
assists surgical pathologists with the diagnosis of lymph-node diseases), aid in the
design of chemical syntheses, prospect for mineral deposits (PROSPECTOR),
translate natural languages, and solve complex mathematical problems (MACSYMA).

Features
An expert system is a computer program with a set of rules encapsulating knowledge
about a particular problem domain (i.e., medicine, chemistry, finance, flight, et cetera).
These rules prescribe actions to take when certain conditions hold, and define the
effect of the action on deductions or data.



The expert system, seemingly, uses reasoning capabilities to reach conclusions or to
perform analytical tasks. Expert systems that record the knowledge needed to solve a
problem as a collection of rules stored in a knowledge-base are called rule-based
systems.
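A rule-based system of this kind can be sketched in miniature (the rules and facts below are invented for illustration; real shells such as MYCIN use far richer rule languages and inference strategies):

```python
# A toy rule-based system: the knowledge base is a list of (condition, action)
# rules, kept separate from the "shell" that executes them.
rules = [
    # Rule 1: if fever and cough are both present, record a diagnosis.
    (lambda facts: facts.get("fever") and facts.get("cough"),
     lambda facts: facts.update({"diagnosis": "flu suspected"})),
    # Rule 2: if fever is explicitly absent, record the opposite conclusion.
    (lambda facts: facts.get("fever") is False,
     lambda facts: facts.update({"diagnosis": "no infection"})),
]

def run_rules(facts, rules):
    # The "shell": fires the action of every rule whose condition holds.
    for condition, action in rules:
        if condition(facts):
            action(facts)
    return facts

facts = run_rules({"fever": True, "cough": True}, rules)
print(facts["diagnosis"])  # → flu suspected
```

Because the rules live in a plain data structure, a subject matter expert could add or modify them without touching `run_rules`, which is the separation of knowledge base and shell described above.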

Utility of Expert Systems(case study)

One of the early applications, MYCIN, was created to help physicians diagnose and
treat bacterial infections. Expert systems have been used to analyze geophysical data
in our search for petroleum and metal deposits (e.g., PROSPECTOR). They are used
by the investments, banking, and telecommunications industries.

They are essential in robotics, natural language processing, theorem proving, and the
intelligent retrieval of information from databases. They are also used in many other human
endeavors which might be considered more practical.

Rule-based systems have been used to monitor and control traffic, to aid in the
development of flight systems, and by the federal government to prepare budgets.

Advantages of Rule-Based Systems

A rule-based expert system maintains a separation between its knowledge base and
the part of the system that executes rules, often referred to as the expert system shell.

The expert system shell can be applied to many different problem domains with little
or no change. It also means that adding or modifying rules to an expert system can
effect changes in program behavior without affecting the controlling component, the
system shell.

Changes to the Knowledge-base can be made easily by subject matter experts without
programmer intervention, thereby reducing the cost of software maintenance and
helping to ensure that changes are made in the way they were intended.

Rules are added to the knowledge-base by subject matter experts using text or
graphical editors that are integral to the system shell. The simple process by which
rules are added to the knowledge-base is depicted in Figure 1.



Finally, the expert system never forgets, can store and retrieve more knowledge than
any single human being can remember, and makes no errors, provided the rules created
by the subject matter experts accurately model the problem at hand.

Expert System Architecture

An expert system is, typically, composed of two major components, the Knowledge-
base and the Expert System Shell. The Knowledge-base is a collection of rules
encoded as metadata in a file system, or more often in a relational database.

The Expert System Shell is a problem-independent component housing facilities for
creating, editing, and executing rules. A software architecture for an expert system is
illustrated in Figure 2.

The shell portion includes software modules whose purpose is to:
 Process requests for service from system users and application layer modules;
 Support the creation and modification of business rules by subject matter
experts;
 Translate business rules, created by subject matter experts, into machine-
readable forms;
 Execute business rules; and
 Provide low-level support to expert system components (e.g., retrieve metadata
from and save metadata to knowledge base, build Abstract Syntax Trees during
rule translation of business rules, etc.).
Client Interface

 The Client Interface processes requests for service from system-users and from
application layer components. Client Interface logic routes these requests to an
appropriate shell program unit.

 For example, when a subject matter expert wishes to create or edit a rule, they
use the Client Interface to dispatch the Knowledge-base Editor. Other service
requests might schedule a rule, or a group of rules, for execution by the Rule
Engine.

Knowledge Base Editor

 The Knowledge-base Editor is a simple text editor, a graphical editor, or some
hybrid of these two types. It provides facilities that enable a subject matter
expert to compose and add rules to the Knowledge-base.

Rule Translator

 Rules, as they are composed by subject matter experts, are not directly
executable. They must first be converted from their human-readable form into a
form that can be interpreted by the Rule Engine. Converting rules from one
form to another is a function performed by the Rule Translator.

 Once created, AST (Abstract Syntax Tree, an abstract data type) representations
are converted into rule metadata and stored in the Knowledge-base. Rule
metadata is simply a compact representation of ASTs. This translation step
defines the role of the Rule Translator in the rule editing process.
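The translation step can be illustrated with a small sketch. The "IF ... AND ... THEN ..." rule syntax, the nested-tuple AST shape, and the compact metadata string below are all assumptions made for this example, not the notation of any particular shell:

```python
# A sketch of a Rule Translator: a human-readable rule is parsed into
# an AST (nested tuples), and the AST is flattened into a compact
# metadata string for storage in the Knowledge-base. The rule syntax
# and metadata format here are hypothetical.

def translate(rule_text):
    """Parse a rule into an AST: ('rule', ('and', cond...), action)."""
    head, action = rule_text.split(" THEN ")
    conds = head.removeprefix("IF ").split(" AND ")
    return ("rule", ("and", *conds), action.strip())

def to_metadata(ast):
    """Flatten the AST into a compact, storable representation."""
    _, (_, *conds), action = ast
    return "|".join(conds) + "=>" + action

def from_metadata(meta):
    """Rebuild the AST from stored metadata (as the Rule Engine would)."""
    conds, action = meta.split("=>")
    return ("rule", ("and", *conds.split("|")), action)

ast = translate("IF fever AND infection THEN suspect_bacterial")
meta = to_metadata(ast)   # 'fever|infection=>suspect_bacterial'
assert from_metadata(meta) == ast
```

The round trip (rule text to AST to metadata and back) mirrors the editing workflow: the editor produces text, the translator produces the AST, and only the compact metadata is persisted.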



Rule Engine

 The Rule Engine (often referred to as an inference engine in AI literature) is
responsible for executing Knowledge-base rules. It retrieves rules from the
Knowledge-base, converts them to ASTs, and then provides them to its rule
interpreter for execution.

 The Rule Engine interpreter traverses the AST, executing actions specified in the
rule along the way.
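A rule interpreter of this kind can be sketched as a recursive walk over a tuple-based AST. The AST shape ('rule', ('and', conditions...), action) and the fact names below are assumptions made for illustration:

```python
# A sketch of a Rule Engine interpreter. It traverses a hypothetical
# tuple-based AST, evaluates the condition subtree against the current
# facts, and executes the action (adding a deduced fact) when the
# condition holds.

def interpret(node, facts):
    """Recursively evaluate an AST node against the working facts."""
    if isinstance(node, str):              # leaf: a plain fact name
        return node in facts
    kind = node[0]
    if kind == "and":                      # all children must hold
        return all(interpret(child, facts) for child in node[1:])
    if kind == "rule":
        _, condition, action = node
        fired = interpret(condition, facts)
        if fired:
            facts.add(action)              # execute the rule's action
        return fired
    raise ValueError(f"unknown node kind: {kind}")

facts = {"fever", "infection"}
ast = ("rule", ("and", "fever", "infection"), "suspect_bacterial")
interpret(ast, facts)
print("suspect_bacterial" in facts)        # True
```

Adding further node kinds (say, "or" or "not") only requires new branches in the interpreter, which is why the AST representation keeps the Rule Engine independent of the rule syntax.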

Rule Object Classes

The shell component, Rule Object Classes, is a container for object classes
supporting:

 Rule editing;
 AST construction;
 Conversion of ASTs to rule metadata;
 Conversion of rule metadata to ASTs; and
 Knowledge-base operations (query, update, insert, delete).
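The Knowledge-base operations in the last item can be sketched with a simple in-memory class; a production shell would back them with a relational database, and the rule names and metadata strings here are hypothetical:

```python
# A sketch of Knowledge-base operations (query, update, insert, delete),
# assuming rule metadata is stored under a rule name in a plain mapping.
# A real shell would persist this in a relational database.

class KnowledgeBase:
    def __init__(self):
        self._rules = {}                  # rule name -> metadata string

    def insert(self, name, metadata):
        self._rules[name] = metadata

    def query(self, name):
        return self._rules.get(name)      # None if the rule is absent

    def update(self, name, metadata):
        if name not in self._rules:
            raise KeyError(name)          # only existing rules may change
        self._rules[name] = metadata

    def delete(self, name):
        self._rules.pop(name, None)

kb = KnowledgeBase()
kb.insert("r1", "fever|infection=>suspect_bacterial")
print(kb.query("r1"))
```

Keeping these operations behind one class is what lets subject matter experts edit rules through the shell without touching the controlling components.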

*****All the very best students*****

