Artificial Intelligence
Definition of knowledge
Intelligence
Knowledge
Information
Data
Symbol
What is intelligence?
An exact definition of intelligence has proven to be extremely elusive.
Douglas Hofstadter suggests the following characteristics in a list of essential abilities for intelligence.
Turing Test: In 1950, Turing published an article in the journal Mind, which triggered a controversial question: can a machine think?
Turing proposed an ‘imitation game’ which was later modified to Turing test. In the imitation game the players are three humans- a
male, a female and an interrogator. The interrogator who is shielded from the other two, asks questions to both of them and based on
their typewritten answers determines who is female. The aim of the male is to imitate the female and deceive the interrogator and the
role of female is to provide replies that would inform the interrogator about her true sex.
Turing proposed that if the human interrogator in Room C is not able to identify who is in Room A or in Room B, then the machine
possesses intelligence. Turing considered this is a sufficient test for attributing thinking capacity to a machine.
As of today, the Turing test is regarded as the ultimate test a machine must pass in order to be called intelligent.
It also helps in eliminating any bias in favour of living organisms, because the interrogator focuses solely on the content of the answers to the questions.
There is no universal agreement among AI researchers about exactly what constitutes AI.
Various definitions of AI focus on different aspects of this branch of computer science including intelligent behavior, symbolic process,
heuristics and pattern matching.
1. Intelligence
2. Process
3. Understand
4. Make sense
5. Common sense.
1. Expert System:
An expert system is a computer program designed to act as an expert in a particular domain; it is also known as a knowledge-based system.
An expert system is a set of programs that manipulate encoded knowledge to solve problems in a specialized domain that normally requires human expertise. The system performs its inference through symbolic computation.
Expert systems are currently designed to assist experts, not to replace them. They have proven to be useful in diverse areas such as medical diagnosis, chemical analysis, geological exploration and computer system configuration.
Since the expert system field promises a great deal of practical application and commercial potential in the near future, it has begun to attract an enormous amount of attention.
2. Natural Language Processing:
The utility of a computer is often limited by communication difficulties. The effective use of a computer traditionally has involved the use of a programming language or set of commands that you must use to communicate with the computer. The goal of natural language processing is to enable people and computers to communicate in a natural (human) language, such as English, rather than in a computer language.
The field of NLP is divided into two sub-fields:
1. Natural language understanding, which investigates methods of allowing the computer to comprehend instructions given in ordinary English so that computers can understand people more easily.
2. Natural language generation, which strives to have computers produce ordinary English so that people can understand computers more easily.
3. Speech recognition:
The focus of NLP is to enable computers to communicate interactively using English words and sentences that are typed in or displayed on a screen. The primary interactive method of communication used by humans, however, is not reading and writing; it is speech.
The goal of speech recognition research is to allow computers to understand human speech, so that they can hear our voices and recognize the words we are speaking. Speech recognition research seeks to advance the goal of natural language processing by simplifying the process of interactive communication between people and computers.
4.Computer vision:
It is a simple task to attach a camera to a computer so that the computer can receive visual images. It has proven to be a far more difficult task, however, to interpret those images so that the computer can understand exactly what it is seeing.
People generally use vision as their primary means of sensing their environment; we generally see more than we hear, feel, smell or taste. The goal of computer vision research is to give computers this same facility for understanding their surroundings.
5. Robotics:
A robot is an electro-mechanical device that can be programmed to perform manual tasks. The Robotic Industries Association formally defines a robot as "a reprogrammable multifunctional manipulator designed to move material, parts, tools or specialized devices through variable programmed motions for the performance of a variety of tasks".
Not all robotics is considered to be part of AI. A robot that performs only the actions it has been pre-programmed to perform is considered to be a 'dumb' robot, possessing no real intelligence.
An intelligent robot includes some kind of sensory apparatus, such as a camera, that allows it to respond to changes in its environment, rather than just following instructions 'mindlessly'.
Automatic programming:
In simple terms programming is the process of telling the computer exactly what you want it to do. Developing a computer program
frequently requires a great deal of time. A program must be designed, written, tested, debugged and evaluated all as part of the
program development process.
The goal of automatic programming is to create special programs that act as intelligent 'tools' to assist programmers and expedite each phase of the programming process. The ultimate aim of automatic programming is a computer system that could develop programs by itself, in response to and in accordance with the specifications of a program developer.
Applications of AI should be judged according to whether there is well-defined task, an implemented program and a set of identifiable
principles.
AI can help us to solve difficult real world problems, creating new opportunities in business, engineering and many other application
areas.
Characteristics of AI problems:
Before a solution can be found, the prime condition is that the problem must be very precisely defined.
To build a system to solve a particular problem, we need to do four things.
1. Define the problem precisely: this definition must include precise specifications of what the initial situations will be as well as what
final situations constitute acceptable solutions to the problem.
2. Analyze the problem.
3. Isolate and represent the task knowledge that is necessary to solve the problem.
4. Choose the best problem-solving technique and apply it to the particular problem.
A set of all possible states for a given problem is known as the state space of the problem. State space representations are highly
beneficial in AI because they provide all possible states, operations and goals. If the entire state space representation for a problem is
given, it is possible to trace the path from the initial state to the Goal State and identify the sequence of operators necessary for doing
it. The major deficiency of this method is that it is not possible to visualize all states for a given problem. To overcome this deficiency, the problem reduction technique comes in handy.
Example 1: Preparing palatable coffee, viewed as a sequence of states and operations:
Water → (boil) → Boiled water → (add coffee or milk powder, decoction and milk) → Coffee → (add sugar) → Palatable coffee
Example2:
A Water Jug Problem: You are given two jugs, a 4-gallon one and a 3-gallon one. Neither has any measuring markers on it. There is a pump that can be used to fill the jugs with water. How can you get exactly 2 gallons of water into the 4-gallon jug?
The state space for this problem can be described as the set of ordered pairs of integers (x, y), such that x = 0, 1, 2, 3 or 4 and y = 0, 1, 2 or 3; x represents the number of gallons of water in the 4-gallon jug, and y represents the number of gallons of water in the 3-gallon jug. The start state is (0, 0). The goal state is (2, n) for any value of n (since the problem does not specify how many gallons need to be in the 3-gallon jug).
Some of the production rules for this problem:
9. (x, y) if x + y ≤ 4 and y > 0 → (x + y, 0): pour all the water from the 3-gallon jug into the 4-gallon jug.
10. (x, y) if x + y ≤ 3 and x > 0 → (0, x + y): pour all the water from the 4-gallon jug into the 3-gallon jug.
11. (0, 2) → (2, 0): pour the 2 gallons from the 3-gallon jug into the 4-gallon jug.
12. (2, y) → (0, y): empty the 2 gallons in the 4-gallon jug onto the ground.
One solution (the rule in each row is applied to that state to produce the next):

Gallons in the 4-gallon jug | Gallons in the 3-gallon jug | Rule applied
0 | 0 | 2
0 | 3 | 9
3 | 0 | 2
3 | 3 | 7
4 | 2 | 5 or 12
0 | 2 | 9 or 11
2 | 0 | (goal)

An alternative solution:

Gallons in the 4-gallon jug | Gallons in the 3-gallon jug | Rule applied
0 | 0 | 1
4 | 0 | 8
1 | 3 | 6
1 | 0 | 10
0 | 1 | 1
4 | 1 | 8
2 | 3 | 6
2 | 0 | (goal)
Problem Reduction: In this method a complex problem is broken down or decomposed into a set of primitive sub-problems. Solutions for these primitive sub-problems are easily obtained. The solutions for all the sub-problems collectively give the solution for the complex problem.
Example:
We want to evaluate ∫ (x² + 3x + sin²x·cos²x) dx. This decomposes into
∫ x² dx + ∫ 3x dx + ∫ sin²x·cos²x dx
= x³/3 + 3x²/2 + ∫ (1 - cos²x)·cos²x dx
= x³/3 + 3x²/2 + ∫ (cos²x - cos⁴x) dx
The individual values can be combined (Integrated) to get the final result.
1. Knowledge representation
2. Heuristic search
3. AI programming languages and tools
4. AI hardware
A physical symbol system is a machine that produces through time an evolving collection of symbol structures.
It contains a set of operators for the creation, modification, reproduction and destruction of symbol structures.
P.S.S. has the necessary and sufficient means to exhibit intelligence.
There appears to be no way to prove or disprove it on logical grounds. So it must be subjected to empirical validation. We may find that
it is false. We may find that the bulk of the evidence says that it is true. But the only way to determine its truth is by experimentation.
The importance of the physical symbol system hypothesis is twofold. It is a significant theory of the nature of human intelligence and so
is of great interest to psychologists. It also forms the basis of the belief that it is possible to build programs that can perform intelligent
tasks now performed by people.
Properties of AI:
What is an AI Technique? An AI technique is a method that exploits knowledge, and knowledge has the following properties:
It is voluminous.
It is hard to characterize accurately.
It is constantly changing.
It differs from data by being organized in a way that corresponds to the ways it will be used.
Three important AI techniques are:
Search: Provides a way of solving problems for which no more direct approach is available, as well as a framework into which any direct techniques that are available can be embedded.
Use of knowledge: Provides a way of solving complex problems by exploiting the structures of the objects that are involved.
Abstraction: Provides a way of separating important features and variations from unimportant ones that would otherwise overwhelm
any process.
Problem Characteristics:
Heuristic search is a very general method applicable to a large class of problems. In order to choose the most appropriate methods for
a particular problem it is necessary to analyze the problem along several key dimensions.
Is the problem decomposable? A decomposable problem: symbolic integration. To evaluate ∫ (x² + 3x + sin²x·cos²x) dx, decompose it into
∫ x² dx + ∫ 3x dx + ∫ sin²x·cos²x dx
= x³/3 + 3x²/2 + ∫ (1 - cos²x)·cos²x dx
= x³/3 + 3x²/2 + ∫ (cos²x - cos⁴x) dx
The individual values can be combined (Integrated) to get the final result.
A non-decomposable problem: Blocks World problem
1. Clear(x) [block x has nothing on it] → On(x, table) [pick up x and put it on the table]
2. Clear(x) and Clear(y) → On(x, y) [put x on y]
If we now try to combine the two sub-solutions into one solution, we fail: regardless of which one we do first, we will not be able to do the second, i.e. the sub-problems are not independent.
Can solution steps be ignored or undone?
1. Theorem proving: Suppose we want to prove a mathematical theorem. We proceed by first proving a lemma that we think will be useful. Eventually, we realize that the lemma is of no help at all. Are we in trouble? No; the wasted step can simply be ignored.
2. The 8-puzzle: The 8-puzzle is a square tray in which eight square tiles are placed; the remaining ninth square is uncovered. Each tile has a number on it. A tile that is adjacent to the blank space can be slid into that space. A game consists of a starting position and a specified goal position, and the aim is to transform the starting position into the goal position by sliding the tiles around. Here a wrong move can be undone.
3. Chess: Suppose we make a wrong move and realize it a couple of moves later. The move cannot be taken back.
Ignorable problems can be solved using a simple control structure that never back tracks.
Recoverable problems can be solved using a simple control structure that backtracks.
A great deal of effort is needed to solve irrecoverable problems.
Is a good solution absolute or relative? To answer the question "Is Marcus alive?" we can choose either of the two reasoning paths, since each path leads to the answer. If we follow one path successfully to the answer, there is no reason to go back and see whether some other path might also lead to a solution.
Path 1 :
Boston---250---->New York---1450---->Miami---3050---->Dallas---4750---->S.F---7750---->Boston
Path 2 :
Boston---3000---->S.F.---4700---->Dallas---6200---->New York---7400---->Miami---8850---->Boston
We cannot say that one path is the shortest unless we also try the other paths; for this problem the best solution, not just any solution, is required.
Is the solution a state or a path? Consider the sentence "The bank president ate a dish of pasta salad with the fork." Several components of this sentence, each of which in isolation may have more than one interpretation, must together give the whole sentence only one meaning.
Sources of ambiguity:
Bank: a financial institution or the side of a river? Only one of these may have a president. Dish: is it the object of the verb 'eat'? Was the dish itself eaten? No, the pasta salad in the dish was eaten.
So some search is required to find the correct interpretation of the sentence, but there will be only one interpretation in the end.
The solution is not just the state (2,0) but the path from (0,0) to (2,0).
What is the role of knowledge? Chess: the knowledge required is very little (a set of rules for legal moves and a control mechanism that implements an appropriate search procedure; knowledge of good tactics merely speeds up a program that, with unlimited computing power, would play perfectly anyway).
Newspapers: Now consider the problem of scanning daily newspapers to decide which are supporting the Democrats and which are supporting the Republicans in some upcoming election. Again assuming unlimited computing power, how much knowledge would be required by a computer trying to solve this problem? This time the answer is: a great deal.
These two problems, chess and newspaper story understanding, illustrate the difference between problems for which a lot of knowledge is important only to constrain the search for a solution and those for which a lot of knowledge is required even to be able to recognize a solution.
Production system is a mechanism that describes and performs the search process. It consists of
1. A set of rules.
2. One or more knowledge bases or databases.
3. A control strategy that specifies the order in which the rules will be applied.
4. A rule applier.
We have argued that production systems are a good way to describe the operations that can be performed in a search for a solution to
a problem.
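As an illustration only (not the notes' own code), a production system can be sketched in a few lines of Python: a database of facts, a list of condition-action rules, and a control strategy that repeatedly fires the first applicable rule until none applies. The coffee-making rules below are hypothetical and simply echo the earlier coffee example.

# Working database (knowledge base) of the production system.
database = {"water": "cold", "coffee_powder": True, "sugar": True}

# Rules: (condition over the database, action that updates the database).
rules = [
    (lambda db: db["water"] == "cold",
     lambda db: db.update(water="boiled")),
    (lambda db: db["water"] == "boiled" and db["coffee_powder"] and "drink" not in db,
     lambda db: db.update(drink="coffee")),
    (lambda db: db.get("drink") == "coffee" and db["sugar"],
     lambda db: db.update(drink="palatable coffee")),
]

# Control strategy: fire the first applicable rule, repeat until none applies.
while True:
    for condition, action in rules:
        if condition(database):
            action(database)
            break
    else:
        break

print(database.get("drink"))    # -> 'palatable coffee'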
1. Can production systems, like problems, be described by a set of characteristics that shed some light on how they can easily be
implemented?
2. If so, what relationships are there between problem types and the types of production system best suited to solving the problems?
A monotonic production system: It is a production system in which the application of a rule never prevents the later application of
another rule that could also have been applied at the time first rule was selected.
A non-monotonic production system: A non-monotonic production system is one in which this is not true.
Partially commutative production system: A partially commutative production system is one with the property that if the application of a particular sequence of rules transforms state x into state y, then any permutation of those rules that is allowable (i.e. each rule's preconditions are satisfied when it is applied) also transforms state x into state y. Partially commutative, monotonic production systems are useful for solving ignorable problems.
A commutative production system: A commutative production system is a production system that is both monotonic and partially
commutative.
The significance of these categories of production systems lies in the relationship between the categories and appropriate implementation strategies.
(Table: the four categories of production systems, obtained by crossing monotonic vs. non-monotonic with partially commutative vs. not partially commutative.)
Partially commutative, monotonic production systems are important from an implementation standpoint because they can be
implemented without the ability to backtrack to previous states when it is discovered that an incorrect path has been followed.
Although it is often useful to implement such systems with backtracking in order to guarantee a systematic search, the actual database
representing the problem state need not be restored.
Non-monotonic, partially commutative systems, on the other hand are useful for problems in which changes occur but can be reversed
and in which order of operations is not critical.
Production systems that are not partially commutative are useful for many problems in which irreversible changes occur, whereas partially commutative systems are likely to produce the same node many times in the search process.
Searching Techniques
Every AI program has to carry out a process of searching, because the solution steps are not explicit in nature. This searching is needed because the solution steps are not known beforehand and have to be found out. Basically, to carry out a search process the following are needed.
1. The initial state description of the problem.
2. A set of legal operators that changes the state.
3. The final or goal state.
The searching process in AI can be broadly classified into two major parts.
1. Brute force searching techniques (Or) Uninformed searching techniques.
2. Heuristic searching techniques (Or) Informed searching techniques.
In these, no preference is given to the order of successor node generation and selection. The path selected is blindly or mechanically followed. No information is used to determine the preference of one child over another. These are commonly used search procedures which explore all the alternatives during the searching process. They do not have any domain-specific knowledge; all they need are the initial state, the final state and the set of legal operators. Two very important brute force searching techniques are
1. Depth First Search
2. Breadth First Search
Depth first search: This is a very simple type of brute force searching technique. The search begins by expanding the initial node, i.e. by using an operator, generating all successors of the initial node and testing them.
This procedure finds whether the goal can be reached or not, but the path it has to follow is not specified. DFS searches by diving downward into the tree as quickly as possible.
(Figure: a search tree rooted at Root, with children A and B, further nodes D, E, F, I, J, G and H, and a marked goal state at one of the leaves.)
Algorithm:
Step 1: Put the initial node on a list START.
Step 2: If START is empty or START = GOAL, terminate the search.
Step 3: Remove the first node from START. Call this node a.
Step 4: If a = GOAL, terminate the search with success.
Step 5: Else, if node a has successors, generate all of them and add them at the beginning of START.
Step 6: Go to Step 2.
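A short Python sketch of the procedure above (an illustration, not the notes' own code), with the START list kept as an explicit stack, a cutoff depth, and the search space supplied as a successor function:

def depth_first_search(start, goal, successors, cutoff_depth=20):
    """Return a path from start to goal, or None. successors(node) -> list of child nodes."""
    stack = [(start, [start])]                   # the START list; newest nodes on top
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        if len(path) <= cutoff_depth:
            # add successors at the beginning, so the deepest nodes are explored first
            for child in reversed(successors(node)):
                if child not in path:            # avoid cycles along the current path
                    stack.append((child, path + [child]))
    return None

# Example on a small tree given as an adjacency dictionary (hypothetical node names).
tree = {"Root": ["A", "B"], "A": ["D", "E"], "B": ["F", "I", "J"], "E": ["G", "H"]}
print(depth_first_search("Root", "H", lambda n: tree.get(n, [])))   # -> ['Root', 'A', 'E', 'H']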
The major drawback of DFS is the determination of the depth up to which the search has to proceed; this depth is called the cutoff depth. The value of the cutoff depth is essential, because otherwise the search may go on and on. If the cutoff depth is small, the solution may not be found, and if the cutoff depth is large, the time complexity will be greater.
Advantages: DFS requires less memory since only the nodes on the current path are stored.
By chance, DFS may find a solution without examining much of the search space at all.
Breadth First Search (BFS): This is also a brute force search procedure like DFS. Here searching progresses level by level, unlike DFS which goes deep into the tree. An operator is employed to generate all possible children of a node. BFS, being a brute force search, generates all the nodes for identifying the goal. The amount of time taken for generating these nodes is proportional to the depth d and branching factor b, and is given by O(b^d).
(Figure: the same search tree as before, rooted at Root with nodes A, B, D, E, F, I, J, G, H and a marked goal state; BFS visits it level by level.)
ALGORITHM:
Step 1: Put the initial node on a list START.
Step 2: If START is empty or the goal has been reached, terminate the search.
Step 3: Remove the first node from START and call this node a.
Step 4: If a = GOAL, terminate the search with success.
Step 5: Else, if node a has successors, generate all of them and add them at the tail of START.
Step 6. Go to step 2.
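A corresponding Python sketch of BFS (again an illustration, not the notes' own code); the only change from DFS is that new nodes go to the tail of the list, which makes the search proceed level by level and return a path with the fewest steps:

from collections import deque

def breadth_first_search(start, goal, successors):
    """Level-by-level search; returns a shallowest path from start to goal, or None."""
    frontier = deque([(start, [start])])         # the START list; new nodes go to the tail
    visited = {start}
    while frontier:
        node, path = frontier.popleft()
        if node == goal:
            return path
        for child in successors(node):
            if child not in visited:
                visited.add(child)
                frontier.append((child, path + [child]))
    return None

tree = {"Root": ["A", "B"], "A": ["D", "E"], "B": ["F", "I", "J"], "E": ["G", "H"]}
print(breadth_first_search("Root", "H", lambda n: tree.get(n, [])))   # -> ['Root', 'A', 'E', 'H']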
Advantages:
1. BFS will not get trapped exploring a blind alley.
2. If there is a solution, BFS is guaranteed to find it.
Disadvantages:
1. The amount of time needed to generate all the nodes is considerable because of the time complexity.
2. Memory constraint is also a major problem because of the space complexity.
3. The searching process remembers all the unwanted nodes, which are of no practical use for the search process.
Heuristic Search Techniques: In informed or directed search some information about the problem space is used to compute a
preference among the children for exploration and expansion.
The process of searching can be drastically reduced by the use of heuristics. A heuristic is a technique that improves the efficiency of the search process. Heuristics are approximations used to minimize the searching process. Generally, two categories of problems call for heuristics:
1. Problems for which no exact algorithms are known and one needs to find an approximate and satisfying solution, for example computer vision and speech recognition.
2. Problems for which exact solutions are known but are computationally infeasible, for example Rubik's cube and chess.
The following algorithms make use of heuristic evaluation:
1. Generate & test
2. Hill climbing
3. Best first search
4. A* Algorithm
5. AO* Algorithm
6. Constraint satisfaction
7. Means- ends analysis.
1. Generate and test: The generate and test strategy is the simplest of all the approaches. The generate and test algorithm is a depth first search procedure, since complete solutions must be generated before they can be tested. In its most systematic form, it is simply an exhaustive search of the problem space. It is also known as the British Museum algorithm, a reference to a method for finding an object in the British Museum by wandering randomly.
Algorithm
Step 1: Generate possible solutions. For some problems this means generating a particular point in the problem space. For others it
means generating a path from a start state.
Step 2: Test to see if this is actually a solution by comparing the chosen point, or the end point of the chosen path, to the set of acceptable goal states.
Step 3: If a solution has been found, quit, otherwise, return to step 1.
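A tiny illustration of the loop (assuming, for the sake of the example, that candidate solutions can simply be enumerated): generate-and-test here looks for a number whose square is 49, standing in for any 'generate a point in the problem space, then test it' search.

def generate_and_test(candidates, is_solution):
    """Exhaustively generate candidates and return the first one that passes the test."""
    for candidate in candidates:        # Step 1: generate a possible solution
        if is_solution(candidate):      # Step 2: test it against the acceptance criterion
            return candidate            # Step 3: quit if a solution has been found
    return None                         # otherwise the space holds no solution

print(generate_and_test(range(100), lambda n: n * n == 49))   # -> 7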
Hill Climbing: It is a variant of generate and test in which feedback from the test procedure is used to help the generator decide which direction to move in the search space. In a pure generate and test procedure, the test function responds with only a yes or no. Here, however, the test function is augmented with a heuristic function that provides an estimate of how close a given state is to a goal state. Hill climbing is often used when a good heuristic function is available for evaluating states but no other useful knowledge is available. It is also a discrete optimization algorithm and uses a simple heuristic function, namely the amount of distance the node is from the goal node. In fact there is practically no difference between hill climbing and DFS except that the children of the node that has been expanded are sorted by their remaining distance to the goal.
(Figure: the same search tree as before; hill climbing sorts the children of the expanded node by their remaining distance to the goal.)
Algorithm: expand the current node, evaluate its children with the heuristic function, and move to the best child as long as it improves on the current node; stop when no child is better than the current state.
Hill climbing can get stuck at the following:
Local maximum: a state that is better than all its neighbours but is not better than some other states farther away.
Plateau: a flat area of the search space in which all neighbours have the same value.
Ridge: a long and narrow stretch of elevated ground, i.e. a narrow elevation or raised part running along or across a surface.
In order to overcome these problems, adopt one of the following or a combination of the following methods.
1. Backtracking for local maxima. Backtracking helps in undoing what has been done so far and permits trying a different path to attain the global peak.
2. A big jump is the solution to escape from the plateau. A huge jump is recommended because in a plateau all neighboring points
have the same value.
3. Trying different paths at the same time is the solution for circumventing ridges.
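A small sketch of simple hill climbing (illustrative code, not from the notes), assuming the heuristic h estimates the remaining distance to the goal, so smaller values are better. The search moves to the best child only while this improves the heuristic, and therefore stops at a local maximum or plateau; random restarts, the 'big jump' remedy above, can be added by calling it from several start states.

import random

def hill_climbing(start, successors, h, max_steps=1000):
    """Move to a better-valued neighbour until no neighbour improves on the current state."""
    current = start
    for _ in range(max_steps):
        neighbours = successors(current)
        if not neighbours:
            break
        best = min(neighbours, key=h)        # the child closest to the goal by the heuristic
        if h(best) >= h(current):            # no improvement: local maximum or plateau
            break
        current = best
    return current

# Toy example: minimise the distance-to-goal h(x) = (x - 7)**2 by moving +1 or -1.
h = lambda x: (x - 7) ** 2
print(hill_climbing(random.randint(0, 20), lambda x: [x - 1, x + 1], h))   # -> 7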
Best first Search: This is a way of combining the advantages of both depth-first and breadth-first search into a single method. DFS is good because it allows a solution to be found without all competing branches having to be expanded. BFS is good because it does not get trapped on dead-end paths. One way of combining the two is to follow a single path at a time, but switch paths whenever some competing path looks more promising than the current one.
In this procedure, the heuristic function used here called an evaluation function is an indicator of how far the node is from the goal
node. Goal nodes have an evaluation function value of zero.
(Figure: a search graph with start node S, its successors A, B and C, deeper nodes D to M, and goal node L; each node is labelled with its heuristic value: A: 3, B: 6, C: 5, D: 9, E: 8, F: 12, G: 14, H: 7, I: 5, J: 6, K: 1, L: 0, M: 2.)
Trace of best first search on this graph:

Step | Node expanded | Successors generated | Open list | Node chosen next
1 | S | A: 3, B: 6, C: 5 | A: 3, B: 6, C: 5 | A: 3
2 | A | D: 9, E: 8 | B: 6, C: 5, D: 9, E: 8 | C: 5
3 | C | H: 7 | B: 6, D: 9, E: 8, H: 7 | B: 6
4 | B | F: 12, G: 14 | D: 9, E: 8, H: 7, F: 12, G: 14 | H: 7
5 | H | I: 5, J: 6 | D: 9, E: 8, F: 12, G: 14, I: 5, J: 6 | I: 5
6 | I | K: 1, L: 0, M: 2 | D: 9, E: 8, F: 12, G: 14, J: 6, K: 1, L: 0, M: 2 | search stops, goal L reached
There is only a minor variation between hill climbing and best first search. In the former we sorted only the children of the node just expanded; here we have to sort the entire open list to identify the next node to be expanded.
The paths found by best first search are likely to give solutions faster because it expands a node that seems closer to the goal.
However there is no guarantee of this.
Algorithm:
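The algorithm heading above is left empty in the source, so the following is only a sketch of best first search in Python, under the assumption that the open list is a priority queue ordered by the evaluation function and that a node evaluating to zero is a goal. The graph structure and heuristic values below are taken from the worked example; the value assumed for S itself is arbitrary.

import heapq

def best_first_search(start, successors, h):
    """Always expand the open node with the smallest evaluation-function value."""
    open_list = [(h(start), start, [start])]
    closed = set()
    while open_list:
        value, node, path = heapq.heappop(open_list)   # most promising node overall
        if value == 0:                                  # goal nodes evaluate to zero
            return path
        if node in closed:
            continue
        closed.add(node)
        for child in successors(node):
            if child not in closed:
                heapq.heappush(open_list, (h(child), child, path + [child]))
    return None

graph = {"S": ["A", "B", "C"], "A": ["D", "E"], "B": ["F", "G"],
         "C": ["H"], "H": ["I", "J"], "I": ["K", "L", "M"]}
h_values = {"S": 15, "A": 3, "B": 6, "C": 5, "D": 9, "E": 8, "F": 12, "G": 14,
            "H": 7, "I": 5, "J": 6, "K": 1, "L": 0, "M": 2}
print(best_first_search("S", lambda n: graph.get(n, []), lambda n: h_values[n]))
# -> ['S', 'C', 'H', 'I', 'L'], the same expansions as in the trace above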
A* algorithm: The best first search algorithm that was just presented is a simplification of an algorithm called the A* algorithm, which was first presented by Hart, Nilsson and Raphael.
Apart from the evaluation function values, one can also bring in cost functions, which indicate how much of resources like time, energy and money has been spent in reaching a particular node from the start. While evaluation functions deal with the future, cost functions deal with the past. Since the cost function values have actually been expended, they are more concrete than evaluation function values. If it is possible to obtain the evaluation function values, then the A* algorithm can be used. The basic principle is to sum the cost and evaluation function values for a state to get its goodness worth, and to use this as the yardstick instead of the evaluation function value alone, as in best first search. The sum of the evaluation function value and the cost along the path leading to that state is called the fitness number. While best first search uses the evaluation function value for expanding the best node, A* uses the fitness number for its computations.
(Figure: the same search graph as above with edge costs added along each arc; the fitness number of a node is the cost of the path from S plus the node's heuristic value.)
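A compact A* sketch in Python (illustrative only; the small weighted graph below is hypothetical, not the figure's graph). The fitness number f = g + h, the path cost so far plus the evaluation-function estimate, orders the open list:

import heapq

def a_star(start, goal, neighbours, h):
    """neighbours(n) -> iterable of (child, step_cost). Returns (path, cost) or None."""
    open_list = [(h(start), 0, start, [start])]        # entries are (f = g + h, g, node, path)
    best_g = {start: 0}
    while open_list:
        f, g, node, path = heapq.heappop(open_list)
        if node == goal:
            return path, g
        for child, cost in neighbours(node):
            new_g = g + cost
            if new_g < best_g.get(child, float("inf")):   # found a cheaper route to child
                best_g[child] = new_g
                heapq.heappush(open_list, (new_g + h(child), new_g, child, path + [child]))
    return None

edges = {"S": [("A", 2), ("B", 3)], "A": [("G", 7)], "B": [("G", 4)]}
h_values = {"S": 5, "A": 6, "B": 3, "G": 0}
print(a_star("S", "G", lambda n: edges.get(n, []), lambda n: h_values[n]))   # -> (['S', 'B', 'G'], 7)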
Problem Reduction: In this method, a complex problem is broken down or decomposed into a set of primitive sub-problems. Solutions for these primitive sub-problems are easily obtained. The solutions for all the sub-problems collectively give the solution for the complex problem.
Between the complex problem and the sub-problems there exist two kinds of relationships: the AND relationship and the OR relationship.
In an AND relationship, the solution for the problem is obtained by solving all the sub-problems (remember the AND gate truth table condition).
In an OR relationship, the solution for the problem is obtained by solving any one of the sub-problems (remember the OR gate truth table condition).
This is why the structure is called an AND-OR graph.
The problem reduction is used on problems such as theorem proving, symbolic integration and analysis of industrial schedules.
To describe an algorithm for searching an AND-OR graph, we need to exploit a value called FUTILITY. If the estimated cost of a solution becomes greater than the value of FUTILITY, then we give up the search. FUTILITY should be chosen to correspond to a threshold such that any solution with a cost above it is too expensive to be practical, even if it could ever be found.
The problem reduction algorithm we just described is a simplification of an algorithm described by Martelli and Montanari and by Nilsson. Nilsson calls it the AO* algorithm, the name we assume.
1. Place the start node s on open.
2. Using the search tree constructed thus far, compute the most promising solution tree T
3. Select a node n that is both on open and a part of T. Remove n from open and place it on closed.
4. If n is a terminal goal node, label n as solved. If the solution of n results in any of n’s ancestors being solved, label all the ancestors
as solved. If the start node s is solved, exit with success where T is the solution tree. Remove from open all nodes with a solved
ancestor.
5. If n is not a solvable node (operators cannot be applied), label n as unsolvable. If the start node is labeled as unsolvable, exit with
failure. If any of n’s ancestors become unsolvable because n is, label them unsolvable as well. Remove from open all nodes with
unsolvable ancestors.
6. Otherwise, expand node n generating all of its successors. For each such successor node that contains more than one sub problem,
generate their successors to give individual sub problems. Attach to each newly generated node a back pointer to its predecessor.
Compute the cost estimate h* for each newly generated node and place all such nodes that do not yet have descendants on open. Next, recompute the values of h* at n and at each ancestor of n.
7. Return to step 2.
It can be shown that AO* will always find a minimum-cost solution tree if one exists, provided only that h*(n) ≤ h(n) and that all arc costs are positive. Like A*, the efficiency depends on how closely h* approximates h.
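The full AO* bookkeeping above is intricate; the following is only a simplified recursive sketch of AND-OR problem reduction in Python (not AO* itself): OR choices take the cheapest alternative, AND decompositions must solve every sub-problem, and anything estimated above FUTILITY is treated as unsolvable. The example graph and costs are hypothetical.

FUTILITY = 100   # any solution costing more than this is too expensive to be practical

def and_or_cost(node, graph, h):
    """Estimated cost of solving node. graph[node] = list of alternatives (OR);
    each alternative is a list of sub-problems that must all be solved (AND)."""
    if node not in graph:                         # terminal node: use its heuristic estimate
        return h.get(node, FUTILITY)
    best = FUTILITY
    for alternative in graph[node]:               # OR: pick the cheapest alternative
        cost = sum(and_or_cost(sub, graph, h) + 1 for sub in alternative)   # AND, +1 per arc
        best = min(best, cost)
    return best

graph = {"A": [["B"], ["C", "D"]]}                # A is solved by B alone, or by C and D together
h = {"B": 5, "C": 3, "D": 4}
print(and_or_cost("A", graph, h))                 # -> min(5 + 1, (3 + 1) + (4 + 1)) = 6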
Constraint satisfaction:
Many problems in AI can be viewed as problems of constraint satisfaction in which the goal is to discover some problem state that
satisfies a given set of constraints. Constraint satisfaction is a search procedure that operates in a space of constraint sets. The initial
state contains the constraints that are originally given in the problem description. A goal state is any state that has been constrained
“enough”, where enough must be defined for each problem.
Constraint satisfaction is a two-step process. First, constraints are discovered and propagated as far as possible throughout the system. Then, if there is still not a solution, search begins: a guess about something is made and added as a new constraint.
Algorithm:
1. Propagate available constraints. To do this, first set OPEN to the set of all objects that must have values assigned to them in a
complete solution. Then do until an inconsistency is detected or until OPEN is empty :
(a) Select an object OB from OPEN. Strengthen as much as possible the set of constraints that apply to OB.
(b) If this set is different from the set that was assigned the last time OB was examined or if this is the first time OB has been
examined, then add to OPEN all objects that share any constraints with OB
(c) Remove OB from OPEN.
2. If the union of the constraints discovered above defines a solution, then quit and report the solution.
3. If the union of the constraints discovered above defines a contradiction, then return failure.
4. If neither of the above occurs, then it is necessary to make a guess about something in order to proceed. To do this, loop until a solution is found or all possible solutions have been eliminated:
(a) Select an object whose value is not yet determined and select a way of strengthening the constraints on that object.
(b) recursively invoke constrain satisfaction with the current set of constraints augmented by
the strengthening constraint just selected.
To apply this algorithm in a particular problem domain requires the use of two kinds of rules: rules that define the way constraints may validly be propagated, and rules that suggest guesses when guesses are necessary.
The solution process proceeds in cycles, at each cycle two significant things are done.
1. Constraints are propagated by using rules that correspond to the properties of arithmetic.
2. A value is guessed for some letter whose value is not yet determined.
Problem 1:
  S E N D
+ M O R E
---------
M O N E Y
Let M = 1 (the carry out of the leftmost column must be 1, and M cannot be 0).
In the thousands column, C3 + S + M must produce that carry, so C3 + S + 1 is 10 or 11 and hence S = 9 or 8. The units digit of this sum is O, so O = 0 or 1; but 1 is already assigned to M, so O = 0.
Let C3 = 0 and S = 9.
In the hundreds column, C2 + E + O = N. Since O = 0, if C2 = 0 then E = N, which is wrong (the letters must stand for distinct digits), so C2 = 1 and N = E + 1.
Let E = 2; then N = 3.
  9 2 3 D
+ 1 0 R 2
---------
1 0 3 2 Y
In the tens column, C1 + N + R = E + 10, i.e. C1 + 3 + R = 12. R = 9 with C1 = 0 is wrong (9 is taken by S); R = 8 with C1 = 1 is correct.
  9 2 3 D
+ 1 0 8 2
---------
1 0 3 2 Y
To get the carry C1 = 1 we need D + E ≥ 10, i.e. D + 2 ≥ 10, so D must be 8 or 9, which clash with R and S. Similarly E = 3 and E = 4 lead to clashes.
Now try E = 5; then N = 6.
  9 5 6 D
+ 1 0 R 5
---------
1 0 6 5 Y
In the tens column, C1 + 6 + R = 15. C1 = 0 gives R = 9, which is wrong (taken by S); C1 = 1 gives R = 8.
  9 5 6 D
+ 1 0 8 5
---------
1 0 6 5 Y
Now D + 5 > 9, so D > 4.
D = 6 gives Y = 1, which is wrong (1 is taken by M).
D = 7 gives Y = 2, which is correct.
D = 8 or 9 is wrong (those digits are taken).
Result:
  9 5 6 7
+ 1 0 8 5
---------
1 0 6 5 2
Values: S = 9, E = 5, N = 6, D = 7, M = 1, O = 0, R = 8, Y = 2.
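The hand derivation can be checked mechanically. The brute-force Python sketch below (a permutation search rather than the constraint-propagation algorithm above) confirms the assignment:

from itertools import permutations

def solve_send_more_money():
    letters = "SENDMORY"                           # the eight distinct letters of the puzzle
    for digits in permutations(range(10), len(letters)):
        value = dict(zip(letters, digits))
        if value["S"] == 0 or value["M"] == 0:     # leading digits cannot be zero
            continue
        number = lambda word: int("".join(str(value[c]) for c in word))
        if number("SEND") + number("MORE") == number("MONEY"):
            return value

print(solve_send_more_money())
# -> {'S': 9, 'E': 5, 'N': 6, 'D': 7, 'M': 1, 'O': 0, 'R': 8, 'Y': 2}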
Problem2:
  D O N A L D
+ G E R A L D
-------------
  R O B E R T
Column equations (C1 to C5 are the carries, starting from the rightmost column):
D + D = T + 10·C1
C1 + L + L = R + 10·C2
C2 + A + A = E + 10·C3
C3 + N + R = B + 10·C4
C4 + O + E = O + 10·C5
C5 + D + G = R
Means-ends-analysis:
We have presented a collection of search strategies that can reason either forward or backward, but for a given problem, one direction
or the other must be chosen. Often, however, a mixture of the two directions is appropriate. Such a mixed strategy would make it
possible to solve the major parts of a problem first and then go back and solve the small problems that arise in “gluing” the big pieces
together. A technique known as means-ends analysis allows us to do that.
The means-ends analysis process centers on the detection of differences between the current state and the Goal State. Once such a
difference is isolated, an operator that can reduce the difference must be found. But perhaps that operator cannot be applied to the
current state. So we set up a sub problem of getting to a state in which it can be applied. The kind of backward chaining in which
operators are selected and then sub-goals are set up to establish the preconditions of the operators is called operator sub-goaling.
Just like the other problem-solving techniques we have discussed, means-ends analysis relies on a set of rules that can transform one problem state into another. These rules are usually not represented with complete state descriptions on each side; instead, they are represented as a left side that describes the conditions that must be met for the rule to be applicable (these conditions are called the rule's preconditions) and a right side that describes those aspects of the problem state that will be changed by the application of the rule.
Algorithm:
1. Compare CURRENT to GOAL. If there are no differences between them, then return.
2. Otherwise, select the most important difference and reduce it by doing the following until success or failure is signalled:
(a) Select an as yet untried operator O that is applicable to the current difference. If there are no such operators, then signal failure.
(b) Attempt to apply O to CURRENT. Generate descriptions of two states: O-START, a state in which O's preconditions are satisfied, and O-RESULT, the state that would result if O were applied in O-START.
(c) If
(FIRST-PART ← MEA(CURRENT, O-START))
and
(LAST-PART ← MEA(O-RESULT, GOAL))
are successful, then signal success and return the result of concatenating FIRST-PART, O, and LAST-PART.
Ex: Initial state: ((R & (~P → Q)) & S)    Goal state: (((Q ∨ P) & R) & ~S)
The transformation of the first conjunct proceeds as:
(R & (~P → Q))
(~P → Q) & R
(~~P ∨ Q) & R
(P ∨ Q) & R
(Q ∨ P) & R
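A very small sketch of the means-ends loop in Python (illustrative, not the notes' own procedure): states and goals are sets of conditions, operators carry explicit preconditions, add and delete lists, and a 'difference' is simply a goal condition not yet true in the current state. The household operators at the end are hypothetical.

def mea(state, goal, operators, depth=10):
    """Return a list of operator names transforming state so that it satisfies goal, or None."""
    if depth == 0:
        return None
    if goal <= state:                             # no differences between CURRENT and GOAL
        return []
    for name, preconds, add, delete in operators:
        if add & (goal - state):                  # operator reduces a current difference
            first = mea(state, preconds, operators, depth - 1)        # reach O-START
            if first is None:
                continue
            # Simplification: assume the sub-plan achieves the preconditions without
            # destroying anything, then apply the operator to get O-RESULT.
            mid = (state | preconds | add) - delete
            last = mea(mid, goal, operators, depth - 1)               # finish the job
            if last is not None:
                return first + [name] + last
    return None

ops = [("boil water", {"have water"}, {"boiled water"}, set()),
       ("make coffee", {"boiled water", "coffee powder"}, {"coffee"}, set())]
print(mea({"have water", "coffee powder"}, {"coffee"}, ops))   # -> ['boil water', 'make coffee']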
Knowledge Representation
Knowledge is an intellectual acquaintance with, or perception of, fact or truth. A representation is a way of describing certain fragments of information so that any reasoning system can easily adopt it for inferencing purposes. Knowledge representation is a study of the ways in which knowledge is actually pictured and of how effectively the representation resembles the representation of knowledge in the human brain.
A knowledge representation system should provide ways of representing complex knowledge and should possess the following
characteristics.
1. The representation scheme should have a well-defined syntax and semantics. This helps in representing various kinds of knowledge.
2. The knowledge representation scheme should have a good expression capacity. A good expressive capability will catalyze the
inference mechanism in its reasoning process.
3. From the computer system point of view, the representation must be efficient. By this we mean that it should use only limited
resources with out compromising on the expressive power.
In order to solve the complex problems encountered in AI, one needs both a large amount of knowledge and some mechanisms for manipulating that knowledge to create solutions to new problems. A variety of ways of representing knowledge have been exploited in AI programs.
Facts: truths in some relevant world. These are the things we want to represent.
Representations: Representations of facts in some chosen formalism. These are the things we will actually be able to manipulate.
The knowledge level: The knowledge level at which facts are described.
The symbol level: The symbol level at which representations of objects at the knowledge level are defined in terms of symbols that can
be manipulated by programs.
(Figure: facts and internal representations are linked by mappings; English understanding maps English sentences to internal representations and English generation maps them back.)
We will call these links representation mappings. The forward representation mapping maps from facts to representations; the backward representation mapping goes the other way, from representations to facts.
One representation of facts is so common that it deserves special mention: natural language (particularly English) sentences. Regardless of the representation for facts that we use in a program, we may also need to be concerned with an English representation of those facts in order to facilitate getting information into and out of the system. In this case we must also have mapping functions from English sentences to the representation we are actually going to use and from it back to sentences.
Spot is a dog.
The fact represented by that English sentence can also be represented in logic as:
Dog (Spot)
Suppose that we also have a logical representation of the fact that all dogs have tails: ∀x: dog(x) → hastail(x).
Then, using the deductive mechanism of logic, we may generate the new representation object: hastail(Spot).
It is important to keep in mind that usually the available mapping functions are not one-to-one. In fact, they are often not even functions but rather many-to-many relations. This is particularly true of the mappings involving English representations of facts. For example, the two sentences "All dogs have tails" and "Every dog has a tail" could both represent the same fact, namely that every dog has at least one tail. On the other hand, the former could represent either the fact that every dog has at least one tail or the fact that each dog has several tails. The latter may represent either the fact that every dog has at least one tail or the fact that there is one tail that every dog has. As we will see shortly, when we try to convert English sentences into some other representation, such as logical propositions, we must first decide what facts the sentences represent and then convert those facts into the new representation.
Approaches to knowledge in a particular domain should possess the following four properties.
1. Representational Adequacy: The ability to represent all the kinds of knowledge that are needed in that domain.(Ex: Axioms of water-
jug problem(12)).
2. Inferential Adequacy: The ability to manipulate the representational structures in such a way as to derive new structures
corresponding to new knowledge inferred from old. (Ex: Solutions of a water-jug problem).
3. Inferential Efficiency: The ability to incorporate into the knowledge structure additional information that can be used to focus the attention of the inference mechanisms in the most promising directions. (Ex: A doctor diagnosing a patient's disease).
4. Acquisitional Efficiency: The ability to acquire new information easily. The simplest case involves direct insertion by a person of new
knowledge into the database.(Ex: An expert).
• Are there any important relationships that exist among attributes of objects?
• At what level should knowledge be represented? Is there a good set of primitives into which all knowledge can be broken down? Is it helpful to use such primitives?
• Given a large amount of knowledge stored in a database, how can relevant parts be accessed when they are needed?
1. Important Attributes: There are two attributes that are of very general significance: instance and isa. These attributes are important because they support property inheritance. They are called a variety of things in AI systems. They represent class membership and class inclusion, and class inclusion is transitive. In logic-based systems, these relationships may be represented directly or they may be represented implicitly by a set of predicates describing particular classes.
2. Relationships among Attributes: The attributes that we use to describe objects are themselves entities that we represent. Four properties are of interest: (1) inverses, (2) existence in an isa hierarchy, (3) techniques for reasoning about values, and (4) single-valued attributes.
1. Inverses: Entities in the world are related to each other in many different ways. But as soon as we decide to describe those relationships as attributes, we commit to a perspective in which we focus on one object and look for binary relationships between it and others. We used the attributes instance, isa and team, each of which was shown as an attribute of the object being described, with an arrow terminating at the object representing the value of the specified attribute. In many cases, it is important to represent the other view of a relationship as well. There are two good ways to do this. The first is to represent both relationships in a single representation that ignores focus.
Ex: Team = (Sagar, cricket). The second approach is to use attributes that focus on a single entity but to use them in pairs, one the inverse of the other:
• One associated with sagar: Team = Cricket
• One associated with cricket: Team member = Sagar.
2. Existence in an isa hierarchy: Just as there are classes of objects and specialized subsets of those classes, there are attributes and specializations of attributes. These generalization-specialization relationships are important for attributes for the same reason that they are important for other concepts: they support inheritance.
3. Techniques for reasoning about values: Sometimes values of attributes are specified explicitly when a knowledge base is created. But often the reasoning system must reason about values it has not been given explicitly. Several kinds of information can play a role in this reasoning.
4. Single-valued attributes: A specific but very useful kind of attribute is one that is guaranteed to take a unique value. Knowledge representation systems have taken several different approaches to providing support for single-valued attributes.
• Introduce an explicit notation for temporal intervals. If two different values are ever asserted for the same temporal interval, signal a contradiction automatically.
• Assume that the only temporal interval that is of interest is now; so if a new value is asserted, replace the old value.
3. Choosing the granularity of representation: The major advantage of converting all statements into a representation in terms of a small set of primitives is that the rules used to derive inferences need be written only in terms of the primitives rather than in terms of the many ways in which the knowledge may originally have appeared. Several AI programs, including those described by Schank and Abelson and by Wilks, are based on knowledge bases described in terms of a small number of low-level primitives. There are several arguments against the use of low-level primitives. One is that simple high-level facts may require a lot of storage when broken down into primitives. A second but related problem is that if knowledge is initially presented to the system in a relatively high-level form, such as English, then substantial work must be done to reduce the knowledge into primitive form. A third problem with the use of low-level primitives is that in many domains it is not at all clear what the primitives should be.
Ex: John spotted Sue.
Question: Who spotted Sue? Here the answer, 'John', can be read directly from the representation.
Question: Did John see Sue? The obvious answer that we may give is 'yes'.
But for an AI system to reason this out, we need to add the fact: spotted(x, y) → saw(x, y).
Here we break the idea of spotting down into the more primitive concept of seeing.
4. Representing sets of objects: It is important to be able to represent sets of objects for several reasons. One is that there are some properties that are true of sets that are not true of the individual members of a set. There are two ways to state a definition of a set and its elements. The first is to list the members; such a specification is called an extensional definition. The second is to provide a rule that, when a particular object is evaluated, returns true or false depending on whether the object is in the set or not; such a rule is called an intensional definition. While it is trivial to determine whether two sets are identical if extensional descriptions are used, it may be very difficult to do so using intensional descriptions. Intensional representations have two important properties that extensional ones lack. The first is that they can be used to describe infinite sets and sets not all of whose elements are explicitly known; thus we can describe intensionally such sets as prime numbers. The second thing we can do with intensional descriptions is to allow them to depend on parameters that can change, such as time or spatial location.
The advantages that an intensional definition has over the extensional definition are:
1. Intensional representations can be used to describe infinite sets and sets not all of whose elements are explicitly known.
Ex: the set of prime numbers or the kings of England.
2. Intensional definitions allow us to depend on parameters that can change, such as time.
Ex: the President of the United States used to be a Democrat.
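A tiny Python illustration of the distinction: an extensional set lists its members, while an intensional definition is a membership rule and therefore covers an unbounded set such as the primes.

# Extensional definition: the members are listed explicitly.
small_primes = {2, 3, 5, 7}

# Intensional definition: a rule that decides membership, covering infinitely many objects.
def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

print(3 in small_primes, is_prime(101))   # -> True True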
5. Finding the right structures as needed: In order to have access to the right structure for describing a particular situation, it is necessary to solve all of the following problems.
1. How to perform an initial selection of the most appropriate structure.
2. How to fill in appropriate details from the current situation.
3. How to find a better structure if the one chosen initially turns out not to be appropriate.
4. What to do if none of the available structures is appropriate.
5. When to create and remember a new structure.
There is no general-purpose method for solving all these problems; some knowledge representation techniques solve some of them. In this section we survey some solutions to two of these problems: how to select an initial structure to consider, and how to find a better structure if the one selected initially turns out not to be a good match.
1. Selecting an initial structure: Selecting candidate knowledge structures to match a particular problem-solving situation is a hard problem. There are several ways in which it can be done. Three important approaches are the following.
1. Index the structures directly by the significant English words that can be used to describe them.
Disadvantages:
a. Many words may have several different meanings.
Ex: I. John flew to New York.
II. John flew the kite.
'Flew' here has a different meaning in the two different contexts.
2. Consider each major concept as a pointer to all of the structures in which it might be involved.
Ex:
I. The concept steak might point to two scripts, one for restaurant and the other for supermarket.
II. The concept bill might point to a restaurant script or a shopping script. We take the intersection of these sets to get the structure that involves all the content words.
Disadvantages:
I. If the problem contains extraneous concepts, then the intersection will turn out empty.
II. It may require a great deal of computation to compute all the possible sets and then to intersect them.
3. Locate one major clue in the problem description and use it to select an initial structure.
Disadvantages:
I. We can’t identify a major clue in some situation.
II. It is difficult to anticipate which clues are important and which are not.
2. Revising the choice when necessary: Depending on the representation we are using, the details of the matching process will vary. It may require variables to be bound to objects, or it may require attributes to have their values compared. In any case, if values that satisfy the required restrictions imposed by the knowledge structure can be found, they are put into the appropriate places in the structure. If no appropriate structure can be found, then a new structure must be selected; the way in which the attempt to instantiate the first structure failed may provide useful cues as to which one to try next. If, on the other hand, appropriate values can be found, then the current structure can be taken to be appropriate for describing the current situation.
FRAME PROBLEM
Frame problem is a problem of representing the facts that change as well as those that do not change.
For example, consider a table with a plant on it under a window. Suppose we move the table to the centre of the room. Here we must infer that the plant is now in the centre too, but the window is not.
Frame axioms are used to describe all the things that do not change when an operator is applied in state n to go to another state, say n+1.
Ex: colour(x, y, s1) ∧ move(x, s1, s2) → colour(x, y, s2)
This axiom says that if an object x has colour y in state s1, then moving x from state s1 to state s2 will not change the colour of the object x.
Once a change of state occurs, how do we undo the changes if we need to backtrack? The two ways that are provided are:
I. Do not modify the initial state description. At each node, simply store an indication of the specific change that should be made. In order to refer to the current state, we start from the initial state and apply, in order, the changes recorded along the path from the start state to the current state.
II. Make the changes to the state description as they occur, but at every node where a change takes place record what must be done to undo the move or change if we need to backtrack.
One can store information about an object or an event by means of a database management system. Even though a DBMS holds information, it does not provide the facility for representing and manipulating facts like 'a cheetah is a carnivore' and 'carnivores have sharp teeth'.
From these first two statements, the last one, 'a cheetah has sharp teeth', can be inferred. In a DBMS, until one explicitly specifies that a cheetah has sharp teeth, it is not possible to get this information.
Declarative representation of knowledge: This controversy raged in 1970’s where in there was a heavy debate on which type of
representation should be used in AI programs.
A declarative representation declares every piece of knowledge and permits the reasoning system to use rules of inference like modus ponens, modus tollens, etc., to come out with new pieces of information.
Using these two representations, it is possible to deduce that "a cheetah has sharp teeth".
Procedural representation of knowledge: This represents knowledge as procedures and the inferencing mechanism manipulates these
procedures to arrive at the result.
Advantages: Procedural representations also have many advantages. First and foremost, heuristic knowledge can be easily
represented which is vital. Secondly, one has the control over search, which is not available in declarative knowledge representation.
A knowledge representation scheme should have both procedural and declarative schemes for effective organization of the knowledge
base.
Knowledge may be declarative or procedural. Procedural knowledge is compiled knowledge related to the performance of some task.
For example, the steps used to solve an algebraic equation are expressed as procedural knowledge. Declarative knowledge, on the other hand, is passive knowledge expressed as statements of facts about the world. Personnel data in a database is typical of declarative knowledge; such data are explicit pieces of independent knowledge. We define knowledge as justified belief.
Two other knowledge terms which we shall use occasionally are epistemology and meta-knowledge.
Epistemology is the study of the nature of knowledge, whereas meta-knowledge is knowledge about knowledge, that is, knowledge about what we know.
Different kinds of widely known knowledge representation:
Semantic Networks: Network representations provide a means of structuring and exhibiting the structure in knowledge. In a network,
pieces of knowledge are clustered together into coherent semantic groups. Networks also provide a more natural way to map to and
from natural language than do other representation schemes. Network representation gives a pictorial presentation of objects, their
attributes and the relationships that exist between them and other entities. These are also known as associative networks. Associative
networks are directed graphs with labelled nodes and arcs or arrows. A semantic network or semantic net is a structure for representing
knowledge as a pattern of interconnected nodes and arcs. It is also defined as a graphical representation of knowledge. The
knowledge used in constructing a network is based on selected domain primitives for objects and relations as well as some general
primitives.
Ex:
1. (Figure: a semantic net in which Tweety is linked to Bird by an 'a kind of' arc and to yellow by a 'colour' arc, while Bird is linked to fly by a 'can' arc and to Wings by a 'has parts' arc.)
2. (Figure: a semantic net in which Scooter and Motor_bike are linked by 'is_a' arcs to Two_wheeler, which is related to Moving_vehicle, with 'has' arcs attaching Brakes, Engine, Electrical_system and Fuel_System to these vehicle nodes.)
Semantic nets were introduced by Quillian (1968) to model the semantics of English sentences and words. He called his structures
semantic networks to signify their intended use.
The main idea behind semantic nets is that the meaning of a concept comes from the ways in which it is connected to other concepts.
In a semantic net, information is represented as a set of nodes connected to each other by a set of labelled arcs, which represent relationships among the nodes.
Semantic nets are a natural way to represent relationships that would appear as ground instances of binary predicates in predicate
logic.
(Figure: a semantic net fragment in which two 'height' nodes are connected by a 'greater than' arc, representing that one individual's height is greater than another's.)
Nodes: In the semantic net, nodes represent entities, attributes, states or events.
Arcs: In the semantic net, arcs give the relationship between the nodes.
Labels: In the semantic net, the labels specify what type relationship actually exists or describe the relation ship.
A generic node is a very general node. In the figure for the semantic network of the Bharathiar University computer centre, the minicomputer system is a generic node because many minicomputer systems exist and that node has to refer to all of them.
Individual or instance nodes explicitly state that they are specific instances of a generic node. HCL Horizon-III is an individual node
because it is a very specific instance of the mini-computer system.
An is-a link is a special type of link because it provides the facility to link a generic node with another generic node, or an individual node with a generic node.
Another major feature of the is-a link is that it generates a hierarchical structure within the network.
The is-a link has another major property, which is called inheritance. The property of inheritance is that the properties which a more generic node possesses are transmitted to the various specific instances of that generic node.
Reasoning using semantic networks: Reasoning using semantic networks is an easy task. All that has to be done is to specify the start
node. From the initial node, other nodes are pursued using the links until the final node is reached. To answer the question “What is
the speed of the line printer?” from the above figure.
The reasoning mechanism first finds the node for the line printer. It identifies the arc that carries the characteristic speed; since it points to the value 300, the answer is 300.
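This traversal is easy to sketch in Python (an illustration, not the notes' own program): the net is a dictionary of labelled arcs, and a query climbs 'is a' links until the requested attribute is found, which is exactly the property-inheritance behaviour described above. The node names echo the computer-centre example; only the speed value 300 comes from the figure, the rest is assumed for illustration.

# Semantic net: node -> {arc label: value or another node}
net = {
    "HCL Horizon-III": {"is a": "mini computer system"},
    "mini computer system": {"has": "line printer"},
    "line printer": {"speed": 300},
}

def query(node, attribute):
    """Follow 'is a' links upward until the attribute is found (simple inheritance)."""
    while node is not None:
        arcs = net.get(node, {})
        if attribute in arcs:
            return arcs[attribute]
        node = arcs.get("is a")          # climb the is-a hierarchy
    return None

print(query("line printer", "speed"))    # -> 300
print(query("HCL Horizon-III", "has"))   # -> 'line printer', inherited from the generic node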
The ‘is a’ link structure can be easily represented using predicate logic. Road vehicle is a land vehicle.
1. Marcus is a man
Man (Marcus) (in predicate logic)
The nodes dog, bite and mail carrier represent the classes of dogs, bitings and mail carriers respectively, while the nodes d, b and m represent a particular dog, a particular biting and a particular mail carrier. This fact can easily be represented by a single net with no partitioning.
But now suppose that we want to represent the fact that every dog has bitten a mail carrier.
To represent this fact, it is necessary to encode the scope of the universally quantified variable d. The node g stands for the assertion given above. Node g is an instance of the special class GS of general statements about the world. Every element of GS has at least two attributes: a form, which states the relation that is being asserted, and one or more ∀ connections, one for each of the universally quantified variables. There is only one such variable here, d, which can stand for any element of the class dogs. The other two variables in the form, b and m, are understood to be existentially quantified. In other words, for every dog d, there exists a biting event b and a mail carrier m, such that d is the assailant of b and m is the victim.
In this net, the node representing the victim lies outside the form of the general statement. Thus it is not viewed as an existentially quantified variable whose value may depend on the value of d; instead it is interpreted as standing for a specific entity (in this case, a particular constant), just as do other nodes in a standard, non-partitioned net.
Example:
Every batter hit a ball. ∀x: Batter(x) → ∃y: Ball(y) ∧ hit(x, y)
Conceptual Graphs: A conceptual graph is a graphical portrayal of a mental perception, which consists of basic or primitive concepts and the relationships that exist between the concepts. A single conceptual graph is roughly equivalent to a graphical diagram of a natural language sentence where the words are depicted as concepts and relationships. Conceptual graphs may be regarded as formal building blocks for associative networks which, when linked together in a coherent way, form a more complex knowledge structure. A concept may be individual or generic.
Diagram: a conceptual graph fragment in which a concept is linked to the concept [Spoon] through an (Instrument) relation.
Conceptual graphs offer the means to represent natural language statements accurately and to perform many forms of inference found
in common sense reasoning.
Frames: Frames were first introduced by Marvin Minsky (1975) as a data structure to represent a mental model of a stereotypical situation such as driving a car, attending a meeting or eating in a restaurant.
Frames are general record-like structures which consist of a collection of slots and slot values. The slots may be of any size and type; slots typically have names and any number of values.
A frame can be defined as a data structure that has slots for various objects, and a collection of frames consists of the expectations for a given situation.
A frame structure provides facilities for describing objects, facts about situations, and procedures on what to do when a situation is encountered. Because of these facilities, frames are used to represent two types of knowledge: declarative/factual and procedural.
Ex :
Declarative and Procedural frames: A frame that merely contains descriptions of objects is called a declarative/factual/situational frame.
Apart from the declarative part of a frame, it is also possible to attach slots which explain how to perform things. In other words, it is possible to have procedural knowledge represented in a frame. Frames which have procedural knowledge embedded in them are called action/procedural frames. An action frame has the following slots.
1. Actor slot: which holds information about who is performing the activity.
2. Object slot: this slot holds information about the item to be operated on.
3. Source slot: source slot holds information from where the action has to begin.
4. Destination slot: holds information about the place where action has to end.
5. Task slot: This generates the necessary sub-frames required to perform the operation.
Ex :
The generic frame merely describes that the expert, in order to clean the nozzle of the scooter, has to perform the following operations:
Reasoning using frames: The task of action frames is to provide a facility for procedural attachment and to help in transforming from the initial to the goal state. It also helps in breaking the entire problem into sub-tasks, which can be described as a top-down methodology. It is possible to represent any task using these action frames.
Reasoning using frames is done by instantiation. The instantiation process begins when the given situation is matched with frames that already exist. The reasoning process tries to match the frame with the situation and later fills up the slots for which values must be assigned. The values assigned to the slots depict a particular situation, and the reasoning process tries to move from one frame to another to match the current situation. This process builds up a wide network of frames, thereby facilitating one to build a knowledge base for representing common-sense knowledge.
Implementation of frame structures: One way to implement frames is with property lists. An atom is used as the frame name and slots are given as properties. Facets and values within slots become lists of lists for the slot property.
Another way to implement frames is with an association list (an a-list), that is, a list of sub-lists where each sub-list contains a key and one or more corresponding values. The same train frame would be represented using an a-list as
It is also possible to represent frame-like structures using object-oriented programming extensions to LISP, such as Flavors.
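As a rough modern analogue of the LISP property-list and a-list encodings, the following Python sketch stores a frame as a dictionary of slots with value and default facets. The train frame's slot names and contents are invented purely for illustration.

# A minimal sketch of a frame as a dictionary of slots, each slot holding facets.
# The slot names and values below are hypothetical.
train_frame = {
    "name": "train",
    "is_a": "vehicle",
    "slots": {
        "engine":  {"value": "diesel", "default": "electric"},
        "coaches": {"value": 18},
        "speed":   {"default": 80},   # default facet only, no filled value
    },
}

def get_slot(frame, slot):
    """Return the value facet if filled, otherwise fall back to the default facet."""
    facets = frame["slots"].get(slot, {})
    return facets.get("value", facets.get("default"))

print(get_slot(train_frame, "engine"))  # -> diesel
print(get_slot(train_frame, "speed"))   # -> 80 (taken from the default facet)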
Scripts: Scripts are another structured representation scheme, introduced by Roger Schank (1977). They are used to represent sequences of commonly occurring events. They were originally developed to capture the meanings of stories or to understand natural language text.
A script is a predefined frame-like structure, which contains expectations, inferences and other knowledge that is relevant to a
stereotypical situation.
Frames represent a general knowledge representation structure which can accommodate all kinds of knowledge. Scripts, on the other hand, help exclusively in representing stereotyped events that take place in day-to-day activity.
All such situations are stereotyped in nature, and the specific properties of the restricted domain can be exploited with special-purpose structures.
A script is a knowledge representation structure that is extensively used for describing stereotyped sequences of actions. It is a special case of the frame structure. Scripts are intended for capturing situations in which behaviour is very stylized. Scripts tell people what can happen in a situation, what events follow and what role every actor plays; they present a way of representing such situations effectively, so that a reasoning mechanism understands exactly what happens in that situation.
Reasoning with Scripts: Reasoning with a script begins with the creation of a partially filled script that describes the current situation. Next, a known script which matches the current situation is recalled from memory. The script name, preconditions or other key words provide index values with which to search for the appropriate script. Inference is accomplished by filling in slots with inherited and default values that satisfy certain conditions.
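The following Python sketch illustrates this, under invented slot names and values: a partially filled situation is matched against a stored restaurant script and its missing slots are filled from the script's defaults.

# A minimal sketch of script-based inference: fill unknown slots of an observed
# situation from a stored script's defaults.  All contents are hypothetical.
restaurant_script = {
    "track": "restaurant",
    "entry_conditions": ["customer is hungry", "customer has money"],
    "roles": {"customer": None, "waiter": "waiter", "cashier": "cashier"},
    "scenes": ["entering", "ordering", "eating", "paying", "leaving"],
    "results": ["customer is not hungry", "owner has more money"],
}

def instantiate(script, situation):
    """Copy the script and overlay the roles observed in the situation."""
    filled = dict(script)
    filled["roles"] = {**script["roles"], **situation.get("roles", {})}
    return filled

observed = {"roles": {"customer": "Ravi"}}
print(instantiate(restaurant_script, observed)["roles"])
# -> {'customer': 'Ravi', 'waiter': 'waiter', 'cashier': 'cashier'}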
Advantages:
1. Permits one to identify what scenes must have preceded an event that takes place.
2. It is possible using scripts to describe each and every event to the minutest detail so that enough light is thrown on implicitly
mentioned events.
3. Scripts provide a natural way of providing a single interpretation from a variety of observations.
4. Scripts are used in natural language understanding system and serve their purpose effectively in areas for which they are applied.
Disadvantages:
1. It is difficult to share knowledge across scripts; what is happening in a script is true only for that script.
2. Scripts are designed to represent knowledge of stereotyped situations only and hence cannot be generalized.
Important components:
1. Entry conditions: Basic conditions that must be fulfilled. Here the customer is hungry and has money to pay for the eatables.
2. Result: Presents the situations which describe what happens after the script has occurred. Here the customer, after satisfying his hunger, is no longer hungry; the amount of money he has is reduced, and the owner of the restaurant now has more money. Optional results can also be stated here, such as the customer being pleased or displeased with the quality of the food, the quality of service and so on.
3. Properties: These indicate the objects that exist in the script. In a restaurant one has tables, chairs, a menu, food, money, etc.
4. Roles: The parts that the various characters play are brought under the roles slot. Some characters are implicitly involved while others play an explicit role. For example, the waiter and the cashier play explicit roles, whereas the cook and the owner are implicitly involved.
5. Track: Represents a specific instance of a generic pattern. Restaurant is a specific instance of a hotel.
This slot permits one to inherit the characteristics of the generic node.
6. Scenes: Sequences of activities are described in detail.
Conceptual Dependency (CD): Conceptual dependency is a theory of how to represent the kind of knowledge about events that is usually contained in natural language sentences. The goal is to represent the knowledge in a way that facilitates drawing inferences from the sentences and is independent of the language in which the sentences were originally stated.
The theory was first described in Schank (1973) and was further developed in Schank (1975). It has been implemented in a variety of programs that read and understand natural language text. Unlike semantic nets, which provide only a structure into which nodes representing information at any level can be placed, conceptual dependency provides both a structure and a specific set of primitives, at a particular level of granularity, out of which representations of particular pieces of information can be constructed.
Conceptual dependency (CD) is a theory of natural language processing which mainly deals with the representation of the semantics of a language. The main motivations for the development of CD as a knowledge representation technique are given below.
Apart from the primitive CD actions, one has to make use of the following six categories of objects.
Major acts :
CD primitive: Explanation
a. ATRANS: Transfer of an abstract relationship (e.g. give)
b. PTRANS: Transfer of the physical location of an object (e.g. go)
c. PROPEL: Application of a physical force to an object (e.g. throw)
d. MOVE: Movement of a body part of an animal by the animal (e.g. kick)
e. GRASP: Grasping of an object by an actor (e.g. hold)
f. INGEST: Taking in of an object by an animal (e.g. drink, eat)
g. EXPEL: Expulsion of an object from inside the body of an animal to the world (e.g. spit)
h. MTRANS: Transfer of mental information between animals or within an animal (e.g. tell)
i. MBUILD: Construction of new information from old information (e.g. decide)
j. SPEAK: Action of producing sound (e.g. say)
3. LOCs: Locations
Every action takes place at some location, which serves as source and destination.
4. Ts : Times
An action can take place at a particular location at a given specified time. The time can be represented on an absolute scale or relative
scale.
5. AAs: Action Aiders
These serve as modifiers of actions. A PROPEL act, for instance, has a speed associated with it, which is an action aider.
6. PAs: Picture Aiders
These serve as aiders of picture producers (PPs). Every object that serves as a PP has certain characteristics by which it is defined; PAs serve PPs by defining these characteristics.
The main goal of CD representation is to make explicit what is implicit. That is why every statement that is made has not only the actors and objects but also the time and location, and the source and destination.
CD brought forward the notion of language independence, because all acts are language-independent primitives.
CD is a special-purpose semantic network in which specific primitives are used in building representations. It still remains today a fundamental knowledge representation structure in natural language processing systems.
Diagram: CD representation of 'Joe ate some soup': the act is INGEST, Joe is the actor, soup is the object and a spoon is the instrument of the act.
CD is a theory of representing fairly simple actions.
Diagram: CD representation of 'Since smoking can kill one, I stopped': ONE INGESTs smoke from a cigarette, with a causal link toward the state of no longer being ALIVE; this causality in turn causes I to stop INGESTing smoke.
The vertical causality link indicates that smoking kills one. Since it is marked only as a possible cause, however, we know only that smoking can kill one, not that it necessarily does.
The horizontal causality link indicates that it is the first causality that made me stop smoking. The qualification p attached to the dependency between I and INGEST indicates that the smoking has stopped and that the stopping happened in the past.
There are three important ways in which representing knowledge using the CD model facilitates reasoning with the knowledge.
1. Fewer inference rules are needed than would be required if knowledge were not broken down into primitives.
2. Many inferences are already contained in the representation itself.
3. The initial structure that is built to represent the information contained in one sentence will have holes that need to be filled. These holes can serve as an attention focuser for the program that must understand the ensuing sentences.
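A small illustration of how a CD structure makes the implicit parts of a sentence explicit: the Python sketch below encodes 'John gave Mary a book' with the ATRANS primitive. Only the primitive name comes from the table above; the sentence, field names and paraphrase helper are assumptions for illustration.

# A minimal sketch of a conceptual-dependency structure for an ATRANS act.
cd_atrans = {
    "act": "ATRANS",    # transfer of an abstract relationship (possession)
    "actor": "John",
    "object": "book",
    "from": "John",     # source of the transfer
    "to": "Mary",       # destination of the transfer
    "time": "past",
}

def paraphrase(cd):
    """Read the explicit actor/object/source/destination back out of the structure."""
    return f'{cd["actor"]} transferred {cd["object"]} from {cd["from"]} to {cd["to"]}'

print(paraphrase(cd_atrans))   # -> John transferred book from John to Mary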
LOGIC
Logic can be defined as a scientific study of the process of reasoning and the system of rules and procedures that help in the
reasoning process.
Basically, the logic process takes in some information (called premises) and produces some outputs (called Conclusions).
Propositional logic: This is the simplest form of logic. Here all statements made are called propositions. A proposition in propositional
logic takes only two values, i.e. either the proposition is True or it is False.
There are two kinds of propositions: atomic propositions (also called simple propositions) and molecular propositions (also called compound propositions).
Combining one or more atomic propositions using a set of logical connectives forms a molecular proposition.
Molecular propositions are much more useful than atomic propositions because real world problems involve more of molecular
propositions.
1. ~A (Negation of A)
2. A & B (Conjunction of A with B)
3. A v B (Inclusive disjunction of A with B)
4. A → B (A implies B)
5. A ↔ B (material bi-conditional of A with B)
6. A ⊕ B (exclusive disjunction of A with B)
7. A ↓ B (joint denial of A with B)
8. A | B (disjoint denial of A with B)
Semantics of logical propositions: A clear meaning of the logical propositions can be arrived at by constructing appropriate truth tables
for the molecular propositions.
Truth table :
Logical equivalences :
1. A ≡ ~~A ≡ A & A
2. A & B ≡ B & A
3. A v B ≡ B v A
4. (A & (B & C)) ≡ ((A & B) & C)
5. (A v (B v C)) ≡ ((A v B) v C)
6. (A & (B v C)) ≡ ((A & B) v (A & C))
7. (A v (B & C)) ≡ ((A v B) & (A v C))
8. ~(A & B) ≡ ~A v ~B
9. ~(A v B) ≡ ~A & ~B
10. A → B ≡ ~A v B ≡ ~(A & ~B) ≡ (~B → ~A)
11. A → (B → C) ≡ ((A & B) → C)
12. A ↔ B ≡ (A & B) v (~A & ~B) ≡ (A → B) & (B → A)
Tautologies: Propositions that are true for all possible combinations of truth values of their atomic parts are called tautologies. This
implies that a tautology is always true.
Ex : ((A&(A?B))?B) is a tautology.
A B A→B A&(A→B) (A&(A→B))→B
T T T T T
T F F F T
F T T F T
F F T F T
Contradiction: Whenever the truth value of a sentence is false for all combinations of its constituents, the sentence is called a contradiction. Ex: A & ~A.
Contingent: A statement is called contingent if its truth table has both true and false among its outputs. Ex: A → B.
Normal forms in propositional logic: There are two major normal forms of statements in propositional logic. They are the Conjunctive Normal Form (CNF) and the Disjunctive Normal Form (DNF).
A formula A is said to be in CNF if it has the form A = A1 & A2 & A3 & … & An, n >= 1, where each of A1, A2, A3, …, An is a disjunction of literals (atoms or negations of atoms).
A formula A is said to be in DNF if it has the form A = A1 v A2 v A3 v … v An, n >= 1, where each of A1, A2, A3, …, An is a conjunction of literals (atoms or negations of atoms).
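As a concrete check of the semantics described above, the following Python sketch enumerates every truth assignment and confirms that ((A & (A → B)) → B) is a tautology. The helper names are illustrative only.

# A minimal sketch of a truth-table check for a molecular proposition.
from itertools import product

def implies(p, q):
    return (not p) or q

def formula(a, b):
    return implies(a and implies(a, b), b)   # ((A & (A -> B)) -> B)

rows = [(a, b, formula(a, b)) for a, b in product([True, False], repeat=2)]
for a, b, value in rows:
    print(a, b, value)
print("tautology:", all(value for _, _, value in rows))   # -> True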
Inference Rules : The inference rules of PL provide the means to perform logical proofs or deductions.
FOPL: It was developed by logicians as a means for formal reasoning, primarily in the area of mathematics.
In FOPL, statements from a natural language like English are translated into symbolic structures comprised of predicates, functions, variables, constants, quantifiers and logical connectives. These symbols form the basic building blocks for the knowledge, and their combination into valid structures is accomplished using the syntax of FOPL.
Ex: "All employees of the AI Software Co. are programmers" might be written in FOPL as
∀x (AI-Software-Co-employee(x) → Programmer(x))
AI-Software-Co-employee(Jim)
Programmer(Jim)
Further examples: 1. All students in computer science must take Pascal. 2. John is a computer science major.
Propositional logic works fine in situations where the result is either true or false but not both. However, there are many real-life situations that cannot be treated this way. In order to overcome this deficiency, first order logic (predicate logic) uses three additional notions: predicates, terms and quantifiers.
Predicates: a predicate is defined as a relation that binds two atoms together.
Quantifiers: a quantifier is a symbol that permits one to declare or identify the range or scope of the variables in a logical expression. There are two basic quantifiers used in logic: the universal quantifier (∀, for all) and the existential quantifier (∃, there exists).
If a is a variable then ∀a is read as 1. for all a, 2. for each a, 3. for every a.
Similarly, if a is a variable then ∃a is read as 1. there exists an a, 2. for some a, 3. for at least one a.
Variables:
Free and bound variables: a variable in a formula is free if and only if an occurrence of it is outside the scope of a quantifier binding that variable.
A variable is also free in a formula if at least one occurrence of it is free.
∀x ∃y (A(x, y, z)) and ∀z (B(y, z)) --------(1)
In this formula, the variable z is free in the first portion ∀x ∃y (A(x, y, z)).
A variable in a formula is bound if and only if its occurrence is within the scope of a quantifier binding that variable; a variable is also bound in situations where at least one occurrence of it is bound.
Ex: ∀x (A(x) → B(x))
In this formula, the quantifier ∀x applies over the entire formula (A(x) → B(x)), so the scope of the quantifier is A(x) → B(x); any change in the quantified variable has an effect on both A(x) and B(x).
Hence the variable x is bound.
Normal forms in predicate logic: in predicate logic there is one normal form, called the prefix (prenex) normal form.
A formula A in predicate logic is said to be in prefix normal form if it has the form
(Q1 x1)(Q2 x2) … (Qn xn) B
where each Qi xi is either ∀xi or ∃xi and B is a formula without any quantifier.
Ex: Convert the formula ∀x (A(x) → ∃y B(x, y)) into prefix normal form.
Sol: ∀x (A(x) → ∃y B(x, y)) = ∀x ∃y (A(x) → B(x, y)).
Connectives: there are five connective symbols: ~, &, v, →, ↔.
Constants: constants are fixed-value terms that belong to a given domain of discourse.
Variables: variables are terms that can assume different values over a given domain.
Predicates: predicate symbols denote relations or functional mappings from the elements of a domain D to the values true or false. Capital letters and capitalized words are used to represent predicates.
Constants, variables and functions are referred to as terms, and predicates are referred to as atomic formulas or atoms.
The symbols left and right parentheses, square brackets, braces and the period are used for punctuation in symbolic expressions.
Ex: 1. All employees earning $1400 or more per year pay taxes.
Clause form: A clause is defined as a WFF consisting of a disjunction of literals. The resolution process, when it is applicable, is applied to a pair of parent clauses and produces a new clause.
Conversion to clausal form: one method we shall examine, called resolution, requires that all statements be converted into a normalized clausal form. We define a clause as the disjunction of a number of literals. A ground clause is one in which no variables occur in the expression. A Horn clause is a clause with at most one positive literal.
To transform a sentence into clasual form requires the following steps:
1. Eliminate all implication and equivalence symbols.
2. Move negation symbols into individual atoms.
3. Rename variables if necessary so that all remaining quantifiers have different variable assignments.
4. Replace existentially quantified variables with special functions and eliminate the corresponding quantifiers.
5. Drop all universal quantifiers and put the remaining expression into CNF (disjunctions are moved down to the literals).
6. Drop all conjunction symbols, writing each clause previously connected by a conjunction on a separate line.
We describe the process of eliminating the existential quantifiers through a substitution process. This process requires that all such variables be replaced by Skolem functions, arbitrary functions which can always assume a correct value required of an existentially quantified variable.
The Skolem form is ∀v ∀x: P(f(v), v, x, g(v, x)) → Q(a, v, g(v, x)).
Example in predicate logic: Suppose we know that "all Romans who know Marcus either hate Caesar or think that anyone who hates anyone is crazy".
3. Standardize variables so that each quantifier binds a unique variable. Since variables are just dummy names, this process cannot affect the truth value of the wff.
4. Move all quantifiers to the left of the formula without changing their relative order.
7. Convert the matrix into a conjunction of disjuncts. In our problem there are no conjunctions, so we simply use the associative property.
Unification: Unification is a procedure that compares two literals and discovers whether there exists a set of substitutions that makes them identical.
Any substitution that makes two or more expressions equal is called a unifier for the expressions. Applying a substitution σ to an expression E produces an instance E' of E, where E' = Eσ. Given two unifiable expressions c1 and c2 with a unifier β such that c1β = c2β, we say that β is a most general unifier (mgu) if every other unifier σ is an instance of β. For example, two unifiers for the literals p(u, b, v) and p(a, x, y) are σ = {a/u, b/x, y/v} (the mgu) and β = {a/u, b/x, c/v, c/y}.
Unification can sometimes be applied to literals within the same single clause. When an mgu exists such that two or more literals within a clause are unified, the clause remaining after deletion of all but one of the unified literals is called a factor of the original clause.
The basic idea of unification is very simple: to attempt to unify two literals, we first check whether their initial predicate symbols are the same. If so, we can proceed; otherwise there is no way they can be unified, regardless of their arguments.
Unification has deep mathematical roots and is useful in many AI programs, for example theorem provers and natural language parsers.
Algorithm:
1. If l1 and l2 are both variables or constants, then
a) If l1 and l2 are identical, then return NIL.
b) Else if l1 is a variable, then if l1 occurs in l2 return {FAIL}, else return {l2/l1}.
c) Else if l2 is a variable, then if l2 occurs in l1 return {FAIL}, else return {l1/l2}.
d) Else return {FAIL}.
2. If the initial predicate symbols in l1 and l2 are not identical, then return {FAIL}.
3. If l1 and l2 have a different number of arguments, then return {FAIL}.
4. Set SUBST to NIL. (At the end of this procedure, SUBST will contain all the substitutions used to unify l1 and l2.)
5. For i <- 1 to the number of arguments in l1:
a) Call Unify with the i-th argument of l1 and the i-th argument of l2, putting the result in S.
b) If S contains FAIL then return {FAIL}.
c) If S is not equal to NIL then
1. Apply S to the remainder of both l1 and l2.
2. SUBST := APPEND(S, SUBST).
6. Return SUBST.
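One possible rendering of this procedure in Python is sketched below for literals written as nested tuples. Marking variables with a leading '?' is a convention assumed here purely for illustration, and the example reuses the p(u, b, v) / p(a, x, y) pair from above.

# A minimal sketch of unification over tuple-encoded literals.
def is_variable(t):
    return isinstance(t, str) and t.startswith("?")

def occurs(var, term):
    if var == term:
        return True
    if isinstance(term, tuple):
        return any(occurs(var, arg) for arg in term)
    return False

def substitute(term, subst):
    if is_variable(term):
        return subst.get(term, term)
    if isinstance(term, tuple):
        return tuple(substitute(arg, subst) for arg in term)
    return term

def unify(l1, l2, subst=None):
    """Return a substitution (dict) unifying l1 and l2, or None on failure."""
    if subst is None:
        subst = {}
    l1, l2 = substitute(l1, subst), substitute(l2, subst)
    if l1 == l2:
        return subst
    if is_variable(l1):
        return None if occurs(l1, l2) else {**subst, l1: l2}
    if is_variable(l2):
        return None if occurs(l2, l1) else {**subst, l2: l1}
    if (isinstance(l1, tuple) and isinstance(l2, tuple)
            and len(l1) == len(l2) and l1[0] == l2[0]):   # same predicate, same arity
        for a1, a2 in zip(l1[1:], l2[1:]):
            subst = unify(a1, a2, subst)
            if subst is None:
                return None
        return subst
    return None   # fail

print(unify(("p", "?u", "b", "?v"), ("p", "a", "?x", "?y")))
# -> {'?u': 'a', '?x': 'b', '?v': '?y'}, a most general unifier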
Resolution: Robinson in 1965 introduced the resolution principle, which can be directly applied to any set of clauses. The principle is: given any two clauses A and B, if there is a literal P1 in A which has a complementary literal P2 in B, delete P1 and P2 from A and B and construct a disjunction of the remaining clauses. The clause so constructed is called the resolvent of A and B.
Ex 1: A: P v Q v R, B: ~P v Q v R, C: ~Q v R
Resolvent of A and B: Q v R
Resolvent of (Q v R) and C: R
Ex 2: A: P v Q v R; B: ~P v R; C: ~Q; D: ~R
X = Q v R, the resolvent of A and B.
Y = R, the resolvent of X and C.
Z = Nil (the empty clause), the resolvent of Y and D.
The resolution procedure is a simple iterative process. At each step two clauses, called the parent clauses, are compared (resolved), yielding a new clause that has been inferred from them. The new clause represents the ways in which the two parent clauses interact with each other.
Resolution works on the principle of identifying complementary literals in two clauses and deleting them, thereby forming a new clause. The process is simple and straightforward when the literals are identical; in other words, for clauses containing no variables resolution is easy. When there are variables the problem becomes complicated and necessitates making proper substitutions.
There are 3 major types of substitutions
1. Substitution of a variable by a constant.
2. Substitution of a variable by another variable.
3. Substitution of a variable by a function that does not contain the same variable.
Algorithm: 1. Convert all the statements of F to clause form.
2. Negate P and convert the result to clause form. Add it to the set of clauses obtained in 1.
3. Repeat until either a contradiction is found, no progress can be made, or a predetermined amount
of effort has been expended.
a) Select two clauses. Call these the parent clauses.
b) Resolve them together. The resolvent will be the disjunction of all the literals of both parent clauses with the appropriate substitutions performed, and with the following exception: if there is one pair of literals T1 and T2 such that one of the parent clauses contains T1 and the other contains T2, and if T1 and T2 are unifiable complementary literals, then neither T1 nor T2 should appear in the resolvent. Use the substitution produced by the unification to create the resolvent. If there is more than one pair of complementary literals, only one pair should be omitted from the resolvent.
c) If the resolvent is the empty clause, then a contradiction has been found. If it is not, then add it to the set of clauses available to the procedure.
Any contradiction found is due to the assumption that was made, i.e. the negation of the result. Hence the negation of the result is false, or in other words the result is true.
Resolution using refutation is much simpler than the method using the rules of inference.
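A minimal propositional version of this refutation procedure can be sketched in Python as follows. Clauses are frozensets of string literals with '~' marking negation, and the example reuses the clause set of Ex 2 above; the function names are illustrative.

# A minimal sketch of propositional resolution by refutation.
def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """Return every resolvent obtainable from the parent clauses c1 and c2."""
    resolvents = []
    for lit in c1:
        if negate(lit) in c2:
            resolvents.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return resolvents

def refute(clauses):
    """Return True if the empty clause (a contradiction) can be derived."""
    clauses = set(clauses)
    while True:
        new = set()
        for a in clauses:
            for b in clauses:
                if a == b:
                    continue
                for r in resolve(a, b):
                    if not r:              # empty clause: contradiction found
                        return True
                    new.add(frozenset(r))
        if new <= clauses:                 # no progress can be made
            return False
        clauses |= new

# Ex 2: A = P v Q v R, B = ~P v R, C = ~Q, D = ~R
kb = [frozenset({"P", "Q", "R"}), frozenset({"~P", "R"}),
      frozenset({"~Q"}), frozenset({"~R"})]
print(refute(kb))    # -> True, the clause set is unsatisfiable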
Question-answering:
Chang (1973) divides questions into four major classes.
Class 1 type: questions that require a "yes" or "no" answer.
Ex: "Yes, flight 312 has left Coimbatore" or "No, the Kerala Express is running 45 minutes late."
Class 2 type: the kind of question that requires a "where is", "who is" or "under what condition" answer.
Ex: Ravi is in Visakhapatnam. Kumar should not light a matchstick if there is an LPG leak.
Class 3 type: here the answer is in the form of a sequence of actions, and the order is important.
Ex: Add concentrated sulphuric acid slowly to water and then add the diluted acid to the salt.
Class 4 type: questions whose answers require testing of some conditions.
Ex: If possible do a colour Doppler evaluation; if not, do echocardiography.
Ex 1: 1. Marcus was a man. man(Marcus)
2. Marcus was a Pompeian. Pompeian(Marcus)
3. Marcus was born in 40 A.D. born(Marcus, 40)
4. All men are mortal. ∀x: man(x) → mortal(x)
5. All Pompeians died when the volcano erupted in 79 A.D.
6. erupted(volcano, 79) ∧ ∀x [Pompeian(x) → died(x, 79)]
7. No mortal lives longer than 150 years.
∀x ∀t1 ∀t2: mortal(x) ∧ born(x, t1) ∧ gt(t2 - t1, 150) → dead(x, t2)
8. It is now 1991. now = 1991
9. Alive means not dead.
∀x ∀t: [alive(x, t) → ¬dead(x, t)] ∧ [¬dead(x, t) → alive(x, t)]
10. If someone dies, then he is dead at all later times.
∀x ∀t1 ∀t2: died(x, t1) ∧ gt(t2, t1) → dead(x, t2)
Question: Is Marcus alive? Show ¬alive(Marcus, now).
¬alive(Marcus, now)
| (9, substitution)
dead(Marcus, now)
| (10, substitution)
died(Marcus, t1) ∧ gt(now, t1)
| (6, substitution)
Pompeian(Marcus) ∧ gt(now, 79)
| (2)
gt(now, 79)
| (8, substitute equals)
gt(1991, 79)
| compute gt
nil
Hence Marcus is dead, i.e. Marcus is not alive.
Ex 2: 1. Steve only likes easy courses.
∀x: easycourse(x) → likes(x, Steve)
2. Science courses are hard.
hard(science courses)
3. All the courses in the basket-weaving department are easy.
∀x: basketweaving-dept(x) → easycourse(x)
4. BK301 is a basket-weaving course.
basketweaving-dept(BK301)
From (1): ¬easycourse(x) v likes(x, Steve) ---------(5)
From (3): ¬basketweaving-dept(x) v easycourse(x) ---------(6)
Resolvent of (5) and (6): ¬basketweaving-dept(x) v likes(x, Steve) -------(7)
Resolvent of (4) and (7): likes(BK301, Steve), with the substitution x/BK301.
∴ Steve would like BK301.
Refutation method: Assume Steve does not like BK301.
¬likes(BK301, Steve)
¬easycourse(BK301) (resolving with (5))
¬basketweaving-dept(BK301) (resolving with (6))
Nil (resolving with (4))
From (1): ¬food(x) v likes(x, John) ………….. (5)
From (2): ¬[person(y) ∧ eats(y, x) ∧ ¬killed(y)] v food(x) …….. (6)
From (3): ¬eats(peanuts, Bill) v person(x)
From (4): ¬eats(x, Bill) v eats(x, Sue)
The refutation then derives, in turn,
¬food(x)
¬person(Bill)
Nil
Note: We need the additional knowledge that Bill is a person. We can justify this knowledge because Bill eats peanuts and is still alive; so Bill is a person.
Several types of resolution exist, depending on the number and types of the parent clauses. Some of them are:
Binary resolution: two clauses having complementary literals are combined as a disjunct to produce a single clause after deleting the complementary literals.
Ex: resolving ~p(x, a) v q(x) with ~q(b) v r(x) gives ~p(b, a) v r(b).
Unit resulting resolution: a number of clauses are resolved simultaneously to produce a unit clause. All except one of the clauses are unit clauses, and that one clause has exactly one more literal than the number of unit clauses.
Linear resolution: when each resolvent Ci is a parent to the clause Ci+1 (i = 1, 2, …, n-1) the process is called linear resolution.
Linear input resolution: if one of the parents in a linear resolution is always from the original set of clauses (the Bi), we have linear input resolution.
1. Logic and theorem proving techniques are monotonic in nature: the derived axioms hold good under all circumstances. The real world is never monotonic, for the information obtained is seldom complete.
2. Logic does not provide facilities for handling uncertainty. Every piece of information it deals with has to be either correct or incorrect, never partially so.
3. Codification of the problem in logic is a tough task and requires considerable effort on the part of the user.
4. Even though various techniques exist for speeding up resolution, it takes a considerable amount of time to prove statements in logic.
5. One major constraint in logic is that unless you are sure that a solution exists, the search may not terminate: we may keep adding clause after clause while the solution remains elusive.
The object of a search procedure is to discover a path through a problem space from an initial configuration to a goal state. There are
actually two directions in which such a search could proceed.
The production system model of the search process provides an easy way of viewing forward and backward reasoning as systematic
processes.
Forward reasoning from initial states:
1. Root: A tree of move sequences, that might be solutions, is built by starting with the initial configuration at the root of the tree.
2. Generate the next level: The next level of the tree is generated by finding all the rules whose left sides match the current state (root node); their right sides are used to generate new nodes by creating new configurations. Generate the next level by taking each node generated at the previous level and continue reasoning forward until a configuration matches the goal state.
Backward reasoning from the goal states:
1. Root: A tree of move sequences, that might be solutions, is built by starting with the goal configuration at the root of the tree.
2. Generate the next level: The next level of the tree is generated by finding all the rules whose right sides match the root node. The left sides are used to generate the new nodes, representing new goal states to achieve. Generate the next level of the tree by taking each node at the previous level and continuing the above reasoning until a node that matches the initial state is generated. This is often called goal-directed reasoning.
Often it does not make much difference whether we reason forward or backward: about the same number of paths will be explored in either case. But this is not always true. Depending on the topology of the problem space, it may be significantly more efficient to search in one direction rather than the other.
Four factors influence the question of whether it is better to reason forward or backward.
Principle:
Forward rules, which encode knowledge about how to respond to certain input configurations.
Backward rules, which encode knowledge about how to achieve particular goals.
1. Instead of starting from the goal state, we start the search with the incoming data.
2. In this, the left sides of rules are matched against the state description.
3. If they match, we apply the rule, and a new state is generated as described by the right side.
This process repeats until the goal state is reached.
4. Matching is more complex than for backward rules.
1. In this, we start the search from the goal state, treating it as an initial state, and move towards the state we want.
2. The states computed along the way form a chain of links.
3. If one state in the path is removed, the entire path collapses.
We can also search both forward from the start state and backward from the goal simultaneously, until the two paths meet somewhere in between. This strategy is called bidirectional search. It seems appealing if the number of nodes at each step grows exponentially with the number of steps that have been taken. In fact, many successful AI applications have been written using a combination of forward and backward reasoning, and most AI programming environments provide explicit support for such hybrid reasoning.
Monotonic vs non-monotonic reasoning:
1. A monotonic system is complete with respect to the domain of interest; a non-monotonic system is incomplete.
2. In a monotonic system new facts can be added only when they are consistent with the facts that have already been asserted; in a non-monotonic system new facts may contradict and invalidate the old knowledge.
A monotonic reasoning system cannot work effectively in real life environments because
1. The information available is always incomplete.
2. As the process goes on, situations change and so do the solutions.
3. Default assumptions are made in order to reduce the search time and to arrive at solutions quickly.
TMS: Truth maintenance systems (TMS), also known as belief revision or revision maintenance systems, are companion components to inference systems. The main job of the TMS is to maintain the consistency of the knowledge being used by the problem solver, not to perform any inference functions.
The TMS also gives the inference component the latitude to perform non-monotonic inferences. When new discoveries are made, this more recent information can displace previous conclusions that are no longer valid. In this way the set of beliefs available to the problem solver will continue to be current and consistent.
Diagram: the inference engine (IE) and the TMS exchange information through Tell and Ask links.
The TMS plays its role as a part of the problem solver: the inference engine (IE) solves domain problems based on its current belief set, while the TMS maintains the currently active belief set. The updating process is incremental. After each inference, information is exchanged between the two components: the IE tells the TMS what deductions it has made, and the TMS in turn asks questions about current beliefs and reasons for failures. It maintains a consistent set of beliefs for the IE to work with, even as new knowledge is added and removed.
The TMS maintains complete records of the reasons or justifications for beliefs. Each proposition having at least one valid justification is made a part of the current belief set. Statements lacking acceptable justifications are excluded from this set. When a contradiction is discovered, the statements responsible for the contradiction are identified and an appropriate one is retracted. This in turn may result in other retractions and additions. The procedure used to perform this process is called dependency-directed backtracking.
The TMS maintains records to reflect retractions and additions so that it will always know its current belief set. The records are maintained in the form of a dependency network. The nodes in the network represent KB entries such as premises, conclusions, inference rules and the like. Attached to the nodes are justifications, which represent the inference steps from which the node was derived. Nodes in the belief set must have valid justifications. A premise is a fundamental belief which is assumed to be always true. Premises need no justifications; they form a base from which all other currently active nodes can be explained in terms of valid justifications.
There are two types of justification records maintained for nodes: support lists (SL) and conditional proofs (CP). SLs are the most common type. They provide the supporting justifications for nodes. The data structure used for the SL contains two lists of other dependent node names, an IN-list and an OUT-list:
(SL <IN-list> <OUT-list>)
CP justifications are used less frequently than SLs. They justify a node as a type of valid hypothetical argument:
(CP <consequent> <IN-hypotheses> <OUT-hypotheses>)
Example of a truth maintenance system.
The TMS maintains the consistency of the knowledge being used by the problem solver.
It maintains the currently active belief set.
It maintains records of the reasons or justifications for beliefs.
Records are maintained in the form of a dependency network.
Representation in TMS:
A TMS dependency network offers a purely syntactic, domain-independent way to represent belief and change it consistently.
Diagram: the node 'Suspect Abbot' is supported by a justification whose IN-list contains 'Beneficiary Abbot' and whose OUT-list contains 'Alibi Abbot'.
Justification:
1. The assertion "Suspect Abbot" has an associated TMS justification. The justification is connected by an arrow to the assertion it supports.
2. Assertions in a TMS dependency network are believed when they have a valid justification.
3. Each justification has two parts:
a. An IN-list [connected to justification by ‘+’]
b. An OUT-list [connected to justification by ‘-‘]
4. If the assertion corresponding to the node should be believed, then in the TMS it is labeled IN.
5. If there is no reason to believe the assertion, then it is labeled OUT.
Premise Justification: Premise justifications are always considered to be valid. Premises need no justifications.
Labeling task of a TMS: The labeling task of a TMS is to label each node so that the three major criteria of the dependency network are met.
1. Consistency
2. Well-founded-ness
3. Resolving contradictions.
The following two cases show how consistency is maintained while changing Abbot’s state.
Case (i): Abbot is the beneficiary. We have no further justification for this fact; we simply accept it.
The following figure shows a consistent labeling for the network with the premise justification.
Diagram: Suspect Abbot [IN], with 'Beneficiary Abbot' on the IN-list (+) and 'Alibi Abbot' on the OUT-list (-).
Case (ii): If Abbot's alibi is obtained, then some more justifications are added, and consistency is maintained by making the following changes.
Diagram: Suspect Abbot [OUT], since 'Alibi Abbot' is now labelled IN.
Well-foundedness criterion: (1) It is defined as the proper grounding of a chain of justifications on a set of nodes that do not themselves depend on the nodes they support.
(2) For example, Cabot's justification for his alibi, that he was at a ski show, is hardly valid:
The only support for the alibi of attending the ski show is that Cabot is telling the truth. ---------(1)
The only support for his telling the truth would be if we knew he was at the ski show. ----------(2)
Statements (1) and (2) above show a circular chain of IN-list links supporting the "Alibi Cabot" node.
So, in such cases the node should be labelled OUT for well-foundedness.
Diagram: 'Alibi Cabot' and 'TellsTruth Cabot' support only each other, so both are labelled [OUT]; the Contradiction node is also labelled [OUT].
(3) Initially there is no valid justification for the other suspects, so the contradiction node is labelled OUT.
(4) Suppose Cabot was seen on TV at the ski slopes; this causes the "Alibi Cabot" node to be labelled IN, and so makes the "Suspect Cabot" node labelled OUT.
(5) The above now gives a valid justification for the contradiction, which is hence labelled IN.
Diagram: Contradiction [IN].
(6) The job of the TMS is to determine how the contradiction can be made OUT, i.e. how its justification can be made invalid.
(7) Non-monotonic justifications can be invalidated by asserting some fact whose absence is required by the justification.
(8) That is, we should install a justification that is valid only as long as it needs to be.
(9) A TMS has algorithms to create such justifications, which are called abductive justifications.
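The labelling criteria can be sketched in Python as follows: each node carries justifications with an IN-list and an OUT-list, and labels are propagated until they are stable. The node names follow the Abbot example above, but the network contents and function names are simplified assumptions, not a full TMS.

# A minimal sketch of IN/OUT labelling over a TMS-style dependency network.
network = {
    "Beneficiary Abbot": [([], [])],                    # premise: unconditionally justified
    "Suspect Abbot":     [(["Beneficiary Abbot"], ["Alibi Abbot"])],
    "Alibi Abbot":       [],                            # no justification yet
}

def label(network):
    labels = {node: "OUT" for node in network}
    changed = True
    while changed:                                      # propagate until stable
        changed = False
        for node, justifications in network.items():
            valid = any(all(labels[n] == "IN" for n in in_list) and
                        all(labels[n] == "OUT" for n in out_list)
                        for in_list, out_list in justifications)
            new = "IN" if valid else "OUT"
            if new != labels[node]:
                labels[node], changed = new, True
    return labels

print(label(network))
# 'Suspect Abbot' comes out IN; if a justification later makes 'Alibi Abbot' IN,
# relabelling would make 'Suspect Abbot' OUT.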
Default reasoning:
Default reasoning is one type of non-monotonic reasoning, which treats conclusions as believed until a better reason is found to believe something else.
Two different approaches to deal with non-monotonic system are:
(1) Non- monotonic logic (2) Default logic
(1) Non-monotonic logic: Non-monotonic logic is one in which the language of FOPL is augmented with a modal operator M, which can be read as "is consistent".
Non-monotonic logic defines the set of theorems that can be derived from a set of WFFs A to be the intersection of the sets of theorems that result from the various ways in which the WFFs of A might be combined.
For example, from the wffs
A ∧ M B → B
¬A ∧ M B → B
we conclude M B → B.
(2) Default Logic:
Another form of uncertainty occurs as a result of incomplete knowledge. One way humans deal with this problem is by making plausible default assumptions; that is, we make assumptions which typically hold but may have to be retracted if new information is obtained to the contrary.
Default reasoning is another form of non-monotonic reasoning. It eliminates the need to explicitly store all the facts regarding a situation. Reiter (1980) developed a theory of default reasoning within the context of traditional logics. A default is expressed as
a(x) : M b1(x), …, M bn(x) / c(x)
where a(x) is a precondition wff for the conclusion wff c(x), M is a consistency operator, and the bi(x) are conditions, each of which must be separately consistent with the KB for the conclusion c(x) to hold.
Default theories consist of a set of axioms and a set of default inference rules with schemata. The theorems derivable from a default system are those that follow from first order logic together with the assumptions made by the default rules. Suppose a KB contains only the statements
Bird(tweety)
Bird(x) : M fly(x) / fly(x)
A default proof of fly(tweety) is then possible. But if the KB also contains the clauses Ostrich(tweety) and
Ostrich(x) → ~fly(x)
then fly(tweety) would be blocked, since the default is now inconsistent. Default rules are especially useful in hierarchical KBs: because the default rules are transitive, property inheritance becomes possible.
Transitivity can also be a problem in KBs with many default rules; rule interactions can make representations very complex.
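The blocking behaviour of the Tweety default can be sketched in Python: conclude fly(x) from bird(x) only when ~fly(x) cannot be shown, i.e. when M fly(x) is consistent with the KB. The fact and rule encodings below are illustrative assumptions, not a general default-logic prover.

# A minimal sketch of a single Reiter-style default with a consistency check.
facts = {("bird", "tweety"), ("ostrich", "tweety")}
hard_rules = [(("ostrich",), ("not_fly",))]      # ostrich(x) -> ~fly(x)

def derivable(pred, individual):
    if (pred, individual) in facts:
        return True
    return any((body[0], individual) in facts
               for body, head in hard_rules if head[0] == pred)

def default_fly(individual):
    """bird(x) : M fly(x) / fly(x), blocked if ~fly(x) is derivable."""
    return derivable("bird", individual) and not derivable("not_fly", individual)

print(default_fly("tweety"))   # -> False, the default is blocked by ostrich(tweety)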
Two kinds of non-monotonic reasoning that can be defined in these logics are:
1. Abduction 2. Inheritance
1. Abduction: Abductive reasoning is described as: "Given two WFFs (A → B) and (B), for any expressions A and B, if it is consistent to assume A then do so."
Example: Suppose it is given that "people who have had too much to drink tend to stagger when they walk". Then we may conclude that a person who is staggering is drunk, though this may sometimes be incorrect.
So we make the conclusion only if it is consistent enough to assume it.
2. Inheritance: Inheritance is another form of non-monotonic reasoning, which inherits attribute values from a prototype description of a class down to the individual entities that belong to the class.
Default rules are useful in hierarchical knowledge bases. Because the default rules are transitive, property inheritance becomes possible.
Minimalist Reasoning:
Minimalist reasoning follows the idea that "there are many fewer true statements than false ones. If something is true and relevant, it makes sense to assume that it has been entered into our knowledge base. Therefore, assume that the only true statements are those that necessarily must be true in order to maintain the consistency of the knowledge base".
Two kinds of minimalist reasoning are:
1. Closed world assumption (CWA).
2. Circumscription.
Closed world assumption: Another form of assumption made with regard to incomplete knowledge is more global in nature than single defaults. This type of assumption is useful in applications where most of the facts are known, and it is therefore reasonable to assume that if a proposition cannot be proven, it is false. This is known as the closed world assumption with failure as negation. It means that in a KB, if the ground literal P(a) is not provable, then ~P(a) is assumed to hold true. CWA is another form of non-monotonic reasoning. CWA is essentially the formalism under which Prolog operates, and Prolog has been shown to be effective in numerous applications.
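A negation-as-failure check in the spirit of the CWA can be sketched in Python as follows; the connection facts and function names are invented for illustration.

# A minimal sketch of the closed world assumption: any ground fact that is not
# provable from the KB is taken to be false.
kb = {("connects", "A", "B"), ("connects", "B", "C")}

def holds(fact):
    return fact in kb                     # provable iff explicitly listed

def cwa_negation(fact):
    return not holds(fact)                # not provable, therefore assumed false

print(holds(("connects", "A", "B")))          # -> True
print(cwa_negation(("connects", "A", "C")))   # -> True: assumed false under the CWA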
2. ~Equal(a, Joe)
Circumscription: Circumscription is another form of minimalist reasoning, introduced by John McCarthy (1980). It is similar to predicate completion in that all objects that can be shown to have a property P are assumed to be the only objects that satisfy P.
CWA does not capture all of this idea; it has two limitations:
1. CWA operates on individual predicates without considering the interactions among predicates that are in the knowledge base.
2. CWA assumes that all predicates have all of their instances listed, but this is not true in many knowledge-based systems.
Circumscription overcomes the above limitations.
To explain what circumscription is, consider a day-to-day activity: we leave our house on a two-wheeler, go to the railway station, park our vehicle, board the train to our destination, reach the destination and attend the college or office. Here a system that works out the solution to this problem is expected to recognize certain ground conditions such as:
1. There is fuel in the two-wheeler.
2. The two-wheeler is in good condition.
3. There is no road block during the travel.
4. The train journey is safe, etc.
In fact, it is not possible to consider all the qualifications that would need to be included. To minimize time and solution space, we make the explicit assumption that if something is not mentioned, it need not be taken into consideration; so for the above problem one need not bother about other such factors. Practically, this corresponds to a minimization of the objects under consideration. In short, in any problem one considers only those things whose existence is required for getting a clear picture of the solution. This principle of avoiding all unnecessary details, and taking into account only what is absolutely required, is called circumscription. The method is based on the use of completion formulas.
Modal logic: The standard approach is to use a modal logic such as that defined in Hintikka (1962). Modal logics extend the expressiveness of classical logic by permitting the notions of possibility, necessity, obligation, belief and the like. A number of different modal logics have been formalized, and inference rules comparable to those of propositional and predicate logic are available to permit different forms of non-monotonic reasoning.
Modal logics are derived as extensions of PL and FOPL by adding modal operators and axioms to express the meanings and relations of concepts such as consistency, possibility, necessity, obligation, belief, known truths and temporal situations like past, present and future. The operators take predicates as arguments and are denoted by special modal symbols.
Modal logics are classified by the types of modality they express. For example, alethic logics are concerned with necessity and possibility, deontic logics with what is obligatory or permissible, epistemic logics with belief and knowledge, and temporal logics with tense modifiers like sometimes, always, what has been, what will be or what is.
Temporal logic: Temporal logics use modal operators in relation to concepts of time, such as past, present, future, sometimes, always, precedes, succeeds and so on. An example of two operators which correspond to necessity and possibility are always (A) and sometimes (S).
A(Q) → S(Q) (if always Q then sometimes Q)
A(Q) → Q (if always Q then Q)
Q → S(Q) (if Q then sometimes Q)
S(Q) → ~A(~Q) (if sometimes Q then not always not Q)
Ex: Tweety flies (present tense)
P(Tweety flies) past tense: Tweety flew
F(Tweety flies) future tense: Tweety will fly
FP(Tweety flies) future perfect tense: Tweety will have flown.
The semantics of a temporal logic is closely related to the frame problem. The frame problem is the problem of managing the changes which occur from one situation to another, or from frame to frame as in a moving sequence.
Fuzzy logic: fuzzy logic was introduced to generalize and extend the expressiveness of traditional logics. Fuzzy logic is based on fuzzy set theory, which permits partial set membership.
Lotfi Zadeh of the University of California, Berkeley first introduced fuzzy sets in 1965. His objective was to generalize the notions of a set and of propositions to accommodate the kind of fuzziness or vagueness found in ordinary language. Since its introduction in 1965, much research has been conducted in fuzzy set theory and logic, such as possibility distributions, fuzzy statistics, fuzzy random variables and fuzzy set functions.
Fuzzy predicates such as small, large, young, safe and much larger than are seen, as are fuzzy quantifiers such as most, many, few, several, often and usually, and fuzzy probabilities expressed as quite possible or almost impossible.
Fuzzy truth values such as very, quite, extremely, somewhat and slightly are also used.
Fuzzy sets are associated with the ability to use linguistic variables. Linguistic variables provide a link between natural language quantification and fuzzy propositions. A linguistic variable is a variable that assumes a value consisting of words or sentences rather than numbers.
A formal, more elegant definition of a linguistic variable is one based on language theory concepts. For this, we define a linguistic variable as the quintuple
(x, T(x), U, G, M)
where x is the name of the variable (e.g. AGE), T(x) is the term set of x (very young, young, not young, not old), U is the universe of discourse (years of life), G is a set of syntactic rules (the grammar) that generates the values of x, and M is a semantic rule which associates each value of x with its meaning M(x), a fuzzy subset of U.
FUZZY SET: Suppose you have been asked by your friends to arrange a small party. What does small mean? Is it possible for us to identify exactly the characteristics that tell us the party is small? If you are affluent, small has one meaning; if you belong to a middle-income or low-income group, the word small has a different meaning.
Hence, one can say that sets for which the boundary is ill defined are called fuzzy sets.
Operations on fuzzy sets are somewhat similar to the operations of standard set theory they are also intuitively acceptable.
The fuzzy sets A and B are equal (A = B) if and only if uA(x) = uB(x) for all x in U (equality).
A is contained in B (A ⊆ B) if and only if uA(x) <= uB(x) for all x in U (containment).
Reasoning with fuzzy logic: The characteristic (membership) function for fuzzy sets provides a direct linkage to fuzzy logic. The degree of membership of x in A corresponds to the truth value of the statement "x is a member of A", where A defines some propositional or predicate class.
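Membership-function arithmetic can be sketched in Python as follows. The grades assigned to 'young' and 'middle_aged' are invented, and the max/min/complement operations follow standard fuzzy set theory rather than anything stated in the text.

# A minimal sketch of fuzzy-set operations over a small universe of ages.
universe = [10, 20, 30, 40, 50]
young       = {10: 1.0, 20: 0.8, 30: 0.5, 40: 0.25, 50: 0.0}
middle_aged = {10: 0.0, 20: 0.2, 30: 0.7, 40: 1.0, 50: 0.6}

complement   = {x: 1 - young[x] for x in universe}                     # NOT young
union        = {x: max(young[x], middle_aged[x]) for x in universe}    # young OR middle-aged
intersection = {x: min(young[x], middle_aged[x]) for x in universe}    # young AND middle-aged
contained    = all(intersection[x] <= young[x] for x in universe)      # containment check

print(union[30], intersection[30], complement[30], contained)
# -> 0.7 0.5 0.5 True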
Generalized modus ponens for fuzzy sets has been proposed by a number of researchers. It differs from the standard modus ponens in that statements which are characterized by fuzzy sets are permitted, and the conclusion need not be identical to the implicand in the implication.
Most information is not obtained first-hand but comes from sources in which one places a lot of belief. Uncertainty prevails when the source does not send information in time or the information provided is not understood.
In laboratory experiments one takes it for granted that the equipment is properly calibrated and error free. If the equipment is faulty, the picture one gets of the situation is not correct, which leads to uncertainty.
In judicial courts we would have seen or heard judges delivering judgements acquitting the accused for (1) lack of evidence or (2) lack of certainty in the evidence, thereby giving the benefit of doubt to the accused.
Suppose we are interested in examining the geological evidence at a particular location to determine whether it would be a good place to dig to find a desired mineral. We may know the prior probabilities of finding each of the various minerals, and the probabilities that, if a mineral is present, certain physical characteristics will be observed.
To handle uncertain data, probability is the oldest technique available. Probabilistic reasoning is sometimes used when outcomes are unpredictable.
Bayes theorem: Bayes theorem is used for the computation of probabilities; it provides a method for reasoning about partial beliefs. Here every event that is happening or is likely to happen is quantified, and rules dictate how these numerical values are to be calculated.
The method was introduced by the clergyman Thomas Bayes in the 18th century. This form of reasoning depends on the use of conditional probabilities of specified events when it is known that other events have occurred. For two events H and E with probability P(E) > 0, the conditional probability of event H, given that event E has occurred, is defined as
P(H/E) = P(H & E) / P(E) ----------(1)
Read this expression as the probability of hypothesis H given that we have observed evidence E.
The conditional probability of event E given that event H occurred can likewise be written as
P(E/H) = P(H & E) / P(H) ----------(2)
Combining (1) and (2) gives Bayes' rule
P(H/E) = P(E/H) P(H) / P(E) ----------(3)
This equation expresses the notion that the probability of event H occurring, when it is known that event E occurred, is the same as the probability that E occurs when it is known that H occurred, multiplied by the ratio of the probabilities of the two events H and E occurring.
Using this result, (3) can also be written as P(H/E) = P(E/H) P(H) / [P(E/H) P(H) + P(E/~H) P(~H)].
The above equation can be generalized for an arbitrary number of hypotheses Hi, i = 1, 2, …, k. Suppose the Hi partition the universe, i.e. the Hi are mutually exclusive and exhaustive. Then for any evidence E we have
P(E) = Σ(i=1..k) P(E & Hi) = Σ(i=1..k) P(E/Hi) P(Hi)
and hence
P(Hi/E) = P(E/Hi) P(Hi) / Σ(j=1..k) P(E/Hj) P(Hj).
Problem: consider an incandescent bulb manufacturing unit. Here machines M1, M2 and M3 make 30%, 30% and 40% of the total bulbs, and of their output 2%, 3% and 4% respectively are defective. A bulb is drawn at random and is found to be defective. What is the probability that the bulb was made by machine M1, M2 or M3?
Solution: let E1, E2 and E3 be the events that a bulb selected at random was made by M1, M2 and M3 respectively, and let Q denote that it is defective.
Prob(E1) = 0.3, Prob(E2) = 0.3 and Prob(E3) = 0.4 (given data).
These represent the prior probabilities.
Probability of drawing a defective bulb made by M1 = Prob(Q/E1) = 0.02
Probability of drawing a defective bulb made by M2 = Prob(Q/E2) = 0.03
Probability of drawing a defective bulb made by M3 = Prob(Q/E3) = 0.04
These are the conditional (likelihood) probabilities; the posterior probabilities are computed as follows.
Prob(E1/Q) = Prob(E1) Prob(Q/E1) / Σ(i=1..3) Prob(Ei) Prob(Q/Ei)
= (0.3 × 0.02) / [(0.3 × 0.02) + (0.3 × 0.03) + (0.4 × 0.04)]
= 0.006 / 0.031 = 0.1935
Similarly, Prob(E2/Q) = (0.3 × 0.03) / [(0.3 × 0.02) + (0.3 × 0.03) + (0.4 × 0.04)] = 0.009 / 0.031 = 0.2903
Prob(E3/Q) = 1 - (Prob(E1/Q) + Prob(E2/Q)) = 0.5161.
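The same computation can be written directly from Bayes' theorem, as in the following Python sketch.

# A minimal sketch of the bulb example: posterior probability that a defective
# bulb came from each machine.
priors      = {"M1": 0.3, "M2": 0.3, "M3": 0.4}       # P(Ei)
likelihoods = {"M1": 0.02, "M2": 0.03, "M3": 0.04}    # P(Q | Ei), Q = defective

evidence = sum(priors[m] * likelihoods[m] for m in priors)       # P(Q) = 0.031

posteriors = {m: priors[m] * likelihoods[m] / evidence for m in priors}
for machine, p in posteriors.items():
    print(machine, round(p, 4))
# -> M1 0.1935, M2 0.2903, M3 0.5161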
BAYESIAN NETWORKS: Network representations of knowledge have been used to graphically exhibit the interdependencies which exist between related pieces of knowledge. Much work has been done in this area to develop a formal syntax and semantics for such representations.
Network representations for uncertain dependencies are further motivated by observations made earlier. If we wish to represent uncertain knowledge related to a set of propositional variables x1, …, xn by their joint distribution P(x1, …, xn), it will require some 2^n entries to store the distribution explicitly. Furthermore, a determination of any one of the marginal probabilities xi requires summing P(x1, …, xn) over the remaining n-1 variables. Clearly the time and storage requirements for such computations quickly become impractical, and inferring with such large numbers of probabilities does not appear to model the human process either. On the contrary, humans tend to single out only a few propositions which are known to be causally linked when reasoning with uncertain beliefs. This metaphor leads quite naturally to a form of network representation.
To represent causal relationships between the propositional variables x1, …, x5, one can write the joint probability P(x1, …, x5) by inspection as a product of conditional probabilities:
P(x1, …, x5) = P(x5/x2, x3) P(x4/x1, x2) P(x3/x1) P(x2/x1) P(x1)
Diagram: a five-node network in which x1 is the parent of x2 and x3, x1 and x2 are the parents of x4, and x2 and x3 are the parents of x5.
Ex2:
A Bayes network is a directed acyclic graph whose nodes are labelled by random variables. Bayes networks are sometimes called causal networks because the arcs connecting the nodes can be thought of as representing causal relationships.
To construct a Bayesian network for a given set of variables, we draw arcs from cause variables to their immediate effects. We preserve the formalism and rely on the modularity of the world we are trying to model. Consider an example:
S: Sprinkler was on last night
W: Grass is wet
R: It rained last night
We can write MYCIN-style rules that describe predictive relationships among these three events.
IF: The sprinkler was on last night then there is evidence that the grass will be wet this morning.
Taken alone, this rule may accurately describe the world. But consider a second rule:
IF: the grass is wet this morning then there is evidence that it rained last night.
Taken alone , this rule makes sense when rain is the most common source of water on the grass.
There are two different ways that propositions can influence the likelihood of each other. The first is that causes influence the likelihood
of their symptoms; the second is that observing a symptom affects the likelihood of all of its possible causes. The basic idea behind the
Bayesian network structure is to make a clear distinction between these two kinds of influence.
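To make the distinction concrete, the sketch below writes the joint distribution of the sprinkler example as the product P(S) P(R) P(W|S,R) and queries it in both directions, causal and diagnostic. The numeric probabilities are invented for illustration only; the text does not specify them.

from itertools import product

# assumed (illustrative) priors and conditional probability table for W given S and R
P_S = {True: 0.2, False: 0.8}                     # S: sprinkler was on last night
P_R = {True: 0.3, False: 0.7}                     # R: it rained last night
P_W = {(True, True): 0.99, (True, False): 0.9,    # P(W = wet | S, R)
       (False, True): 0.9, (False, False): 0.05}

def joint(s, r, w):
    pw = P_W[(s, r)] if w else 1 - P_W[(s, r)]
    return P_S[s] * P_R[r] * pw

# causal direction: the sprinkler (a cause) raises the likelihood of wet grass (its symptom)
p_w_given_s = sum(joint(True, r, True) for r in (True, False)) / P_S[True]

# diagnostic direction: observing wet grass (a symptom) raises the likelihood of rain (a cause)
p_r_given_w = (sum(joint(s, True, True) for s in (True, False)) /
               sum(joint(s, r, True) for s, r in product((True, False), repeat=2)))

print(p_w_given_s, p_r_given_w)   # roughly 0.93 and 0.64 with the assumed numbers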
DEMPSTER-SHAFER THEORY: The Bayesian approach depends on the use of known prior and likelihood probabilities to compute conditional probabilities. The Dempster-Shafer approach, on the other hand, is a generalization of classical probability theory which permits the assignment of probability masses (beliefs) to all subsets of the universe and not just to the basic elements.
The generalized theory was proposed by Arthur Dempster (1968) and extended by his student Glenn Shafer (1976). It has come to be known as the Dempster-Shafer theory of evidence. The theory is based on the notion that separate probability masses may be assigned to all subsets of a universe of discourse rather than just to indivisible single members, as required in traditional probability theory.
In the Dempster-Shafer theory, we assume a universe of discourse U and a set of n propositions corresponding to it, exactly one of which is true. The propositions are assumed to be exhaustive and mutually exclusive. Let 2^U denote the set of all subsets of U, including the empty set and U itself. A probability mass function m is defined as
m: 2^U -> [0,1]
with sum over A subset of U of m(A) = 1.
The function m defines a probability distribution on 2^U; it represents the measure of belief committed exactly to A. A belief function, Bel, corresponding to a specific m is defined for a set A as the sum of the masses committed to every subset of A by m. Bel(A) is a measure of the total support or belief committed to the set A and sets a minimum value for its likelihood:
Bel(A) = summation over B subset of A of m(B).
In the Dempster-Shafer theory, a belief interval can also be defined for a subset A. It is represented as the subinterval [Bel(A), Pl(A)] of [0,1], where Bel(A) is also called the support of A, and Pl(A) = 1 - Bel(~A) is the plausibility of A. When evidence is available from two or more independent knowledge sources Bel1 and Bel2, one would like to pool the evidence to reduce the uncertainty. For this, Dempster has provided a combining function, denoted Bel1 (+) Bel2. The total probability mass committed to a set C is
m(C) = summation over Ai intersection Bj = C of m1(Ai)*m2(Bj).
The sum in the above equation must be normalized to account for the fact that some intersections Ai intersection Bj will be empty, and the positive mass falling on the empty set must be discarded. The final form of Dempster's rule of combination is then given by
m1 (+) m2 (C) = [summation over Ai intersection Bj = C of m1(Ai)*m2(Bj)] / [summation over Ai intersection Bj not empty of m1(Ai)*m2(Bj)],
where the summations are taken over all i and j.
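A small sketch of Dempster's rule of combination, with each mass function stored as a dictionary from (frozen) subsets of the universe to probability masses. The frame of discernment and the masses in the usage example are invented for illustration.

from itertools import product

def combine(m1, m2):
    # Dempster's rule: pool two independent mass functions m1 and m2
    combined, conflict = {}, 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        c = a & b
        if c:                                # non-empty intersection keeps its mass
            combined[c] = combined.get(c, 0.0) + x * y
        else:                                # mass falling on the empty set is conflict
            conflict += x * y
    k = 1.0 - conflict                       # normalisation discards the conflicting mass
    return {c: v / k for c, v in combined.items()}

# illustrative universe {flu, cold} with two independent evidence sources
U = frozenset({"flu", "cold"})
m1 = {frozenset({"flu"}): 0.6, U: 0.4}
m2 = {frozenset({"flu"}): 0.3, frozenset({"cold"}): 0.5, U: 0.2}
print(combine(m1, m2))   # {flu}: 0.6, {cold}: ~0.286, U: ~0.114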
AD HOC METHODS: The so-called ad hoc methods of dealing with uncertainty are methods which have no formal theoretical basis, although they are usually patterned after probabilistic concepts. These methods typically have an intuitive, if not a theoretical, appeal. They are chosen over formal methods as a pragmatic solution to a particular problem when the formal methods impose difficult or impossible conditions.
Different ad hoc procedures have been employed successfully in a number of AI systems, particularly in expert systems. Ad hoc methods have been used in a larger number of knowledge-based systems than have the more formal methods. This is largely because of the difficulties encountered in acquiring a large number of reliable probabilities for a given domain, and because of the complexity of the resulting calculations.
Heuristic methods: Heuristic methods are based on the use of procedures, rules and other forms of encoded knowledge to achieve specified goals. Using such heuristics, one of several alternative conclusions may be chosen through the strength of positive versus negative evidence, presented in the form of justifications and endorsements. The endorsement weights employed in such systems need not be numeric, but some form of ordering or preference selection scheme must be used.
Reasoning using certainty factors: Probability-based reasoning adopts Bayes' theorem for handling uncertainty. Unfortunately, to apply Bayes' theorem one needs to estimate a priori and conditional probabilities, which are difficult to obtain in many domains. Hence, to circumvent this problem, the developers of the MYCIN system adopted certainty factors.
A certainty factor (CF) is a numerical estimate of the belief or disbelief in a conclusion in the presence of a set of evidence. Various methods of using CFs have been adopted; typical of them are as under.
1. Use a scale from 0 to 1, where 0 represents total disbelief and 1 stands for total belief. Other values between 0 and 1 represent varying degrees of belief and disbelief.
2. MYCIN's CF representation is a single scale from -1 to +1, where the value 0 stands for unknown. In expert systems, every production rule has a certainty factor associated with it. The values of the CF are determined by the domain expert who creates the knowledge base (a small illustrative sketch follows this list).
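As a rough illustration of the MYCIN-style scale, the sketch below attaches a CF to each rule and combines two pieces of positive evidence with the usual incremental formula. The rules, the CF values and the combination formula are assumptions added for illustration; they are not taken from the text above.

# MYCIN-style certainty factors on a -1..+1 scale (0 = unknown)
rules = [
    ("fever and rash",      "measles", 0.7),   # rule CFs assigned by the domain expert (hypothetical)
    ("exposure to measles", "measles", 0.5),
]

def combine_positive(cf1, cf2):
    # incremental combination of two positive CFs (assumed formula, not from the text)
    return cf1 + cf2 * (1 - cf1)

cf = 0.0
for evidence, conclusion, rule_cf in rules:
    cf = combine_positive(cf, rule_cf)
print("CF(measles) =", cf)   # 0.7 + 0.5 * (1 - 0.7) = 0.85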
Matching
Matching is a basic function that is required in almost all A.I programs. It is an essential part of more complex operations such as
search and control. In many programs it is known that matching consumes a large fraction of the processing time.
Matching is the process of comparing two or more structures to discover their likeness or differences. The structures may represent a
wide range of objects including physical entities, words or phrases in some language, complete classes of things, general concepts,
relations between complex entities and the like . The representations will be given in one or more of the formalisms like FOPL,
networks or some other scheme and matching will involve comparing the component parts of such structures.
Matching is used in a variety of programs for different reasons. It may serve to control the sequence of operations, to identify or classify
objects, to determine the best of a number of different alternatives or to retrieve items from a database. It is an essential operation in
such diverse programs as speech recognition, natural language understanding, vision, learning, automated reasoning, planning,
automatic programming and expert system as well as many others.
Matching is simply the process of comparing two structures or patterns for equality. The match fails if the patterns differ in any aspect.
Different types of matching are:
(1). Indexing
(2). Matching with variables.
(3). Complex and Approximate Matching
(4). Conflict resolution.
(i) Indexing: In the case of indexing we use the current state as an index into the rules in order to select the matching ones. Consider chess: here we assign a number to each board position and then use a hashing function to treat the number as an index into the rules.
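A minimal sketch of the idea: the current state is hashed and used as a key into a table of rules, so only the rules filed under that key need to be tried. The board encoding and the rule are hypothetical.

# index rules by a hash of the state description they apply to
rule_index = {}

def add_rule(state, action):
    rule_index.setdefault(hash(state), []).append(action)

def applicable_rules(state):
    # only rules stored under this state's hash are candidates for matching
    return rule_index.get(hash(state), [])

add_rule(("white_pawn", "e2"), "advance to e4")   # hypothetical chess-like rule
print(applicable_rules(("white_pawn", "e2")))     # ['advance to e4']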
(ii) Matching with variables: In this case we try to match many rules against many elements in the state description simultaneously. One efficient algorithm is RETE, which gains efficiency from three major sources. These sources are:
(a) The temporal nature of the data. A rule firing does not change working memory radically; it usually adds or deletes only one or two elements.
(b) Structural similarity in rules. There may be situations where some rules differ in only one or two conditions, the other conditions being the same.
For example:
mammal(x) ^ feline(x) ^ carnivorous(x) ^ has_spots(x) -> jaguar(x)
Now if we consider a tiger, the first three conditions are the same but the fourth one changes: instead of has_spots(x) we have has_stripes(x). There is no need to repeat the shared conditions: RETE stores these similar structures in memory so that they can be shared between rules.
(c) Persistence of variable binding consistency. When two or more conditions of a rule share a variable, they can be matched only with bindings that agree; for example, if rules (ii) and (iii) both mention the variable y, they can be matched only by bindings that satisfy the same value of y.
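The sketch below shows the kind of matching with variables that this discussion refers to: a toy matcher that binds variables (written here as strings beginning with "?") consistently across all the conditions of a rule. It illustrates variable binding only; it is not the RETE algorithm itself.

def match(pattern, fact, bindings):
    # match one condition such as ("mammal", "?x") against a fact, extending the bindings
    if len(pattern) != len(fact):
        return None
    bindings = dict(bindings)
    for p, f in zip(pattern, fact):
        if isinstance(p, str) and p.startswith("?"):   # a variable
            if p in bindings and bindings[p] != f:
                return None                            # inconsistent binding
            bindings[p] = f
        elif p != f:
            return None
    return bindings

def match_rule(conditions, facts):
    # find all variable bindings that satisfy every condition of the rule
    results = [{}]
    for cond in conditions:
        results = [b2 for b in results for fact in facts
                   if (b2 := match(cond, fact, b)) is not None]
    return results

facts = [("mammal", "raja"), ("feline", "raja"), ("carnivorous", "raja"), ("has_spots", "raja")]
jaguar_rule = [("mammal", "?x"), ("feline", "?x"), ("carnivorous", "?x"), ("has_spots", "?x")]
print(match_rule(jaguar_rule, facts))   # [{'?x': 'raja'}]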
(iii) Complex and approximate matching: An approximate match is one which is used when the preconditions only approximately match the current situation.
Consider an example of a dialogue between ELIZA and a user. ELIZA will try to match the left side of a rule against the user's last sentence and use the corresponding right side to generate a response. Let us consider the following ELIZA rules:
(X me Y) -> (X you Y)
(I remember X) -> (Why do you remember X just now?)
(My {family-member} is Y) -> (Who else in your family is Y?)
Suppose the user says, "I remember Mary". ELIZA will try to match this response against the left-hand sides of the given rules. It finds that it matches the second rule, so it takes the corresponding right-hand side and asks "Why do you remember Mary just now?"
This is how the conversation proceeds, taking approximate matching into consideration.
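A minimal sketch of this style of approximate matching: each rule's left side is a pattern with a slot, the matcher fills the slot from the user's sentence and substitutes it into the right side. The patterns follow the ELIZA rules listed above; the regular-expression encoding is just one possible implementation.

import re

# (left-hand pattern, right-hand response template), following the rules in the text
eliza_rules = [
    (r"^(.*) me (.*)$",    r"\1 you \2"),
    (r"^I remember (.*)$", r"Why do you remember \1 just now?"),
    (r"^My (mother|father|sister|brother) is (.*)$", r"Who else in your family is \2?"),
]

def respond(sentence):
    for pattern, template in eliza_rules:
        m = re.match(pattern, sentence)
        if m:
            return m.expand(template)
    return "Please go on."

print(respond("I remember Mary"))   # Why do you remember Mary just now?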
(iv) Conflict resolution: Conflict resolution is a strategy in which we incorporate the decision making into the matching process. We have three basic approaches.
(a) Preference based on rules: Here we consider the rules in the order they are given, or we give some priority to special-case rules. There are two ways in which a matcher can decide that one rule is more general than another, which helps in reducing the size of the search. The two ways are:
1. If one rule contains all the preconditions that another rule has plus some additional ones, then the second rule is more general than the first.
2. If one rule contains preconditions with variables and the other contains the same preconditions with constants, then the first rule is more general.
(b) Preference based on objects: Here we use keywords to match into the rules. Consider the example of ELIZA: it takes specific keywords from the user's response and tries to match those keywords against the given rules. As in the previous example, it uses the keyword "remember" to match the LHS of the rule.
(c) Preference based on states: In this case we fire all the rules in the conflict set; they lead us to different states. Using a heuristic function we can then decide which resulting state is the best.
Partial matching: For many AI applications complete matching between two or more structures is inappropriate. For example, input
representations of speech waveforms or visual scenes may have been corrupted by noise or other unwanted distortions. In such
cases, we do not want to reject the input out of hand. Our systems should be more tolerant of such commonly occurring problems.
Instead, we want our systems to be able to find an acceptable or best match between the input and some reference description.
A typical system will contain a Knowledge Base which contains structures representing the domain expert’s knowledge in the form of
rules or productions, a working memory which holds parameters for the current problem, and an inference engine with rule interpreter
which determines which rules are applicable for the current problem.
The basic inference cycle of a production system is match, select, and execute as indicated in figure. These operations are performed
as follows:
Match: During the match portion of the cycle, the conditions in the left hand side(LHS) of the rules in the knowledge base are matched
against the contents of working memory to determine which rules have their LHS conditions satisfied with consistent bindings to
working memory terms. Rules which are found to be applicable (that match) are put in a conflict set.
Select: From the conflict set, one of the rules is selected to execute. The selection strategy may depend on recency of usage,
specificity of the rule, or other criteria.
Execute: The rule selected from the conflict set is executed by carrying out the action or conclusion part of the rule, the right hand side
(RHS) of the rule. This may involve an I/O operation, adding, removing or changing clauses in working Memory or simply causing a
halt.
The above cycle is repeated until no rules are put in the conflict set or until a stopping condition is reached.
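A compact sketch of this match-select-execute cycle. Rules are (name, LHS conditions, RHS clause) triples over a working memory of clauses, and selection here simply prefers the most specific rule (the one with the most conditions); the rule content is hypothetical.

# hypothetical diagnostic rules: (name, LHS conditions, clause added by the RHS)
rules = [
    ("r1", {"engine_wont_start"},               "check_battery"),
    ("r2", {"engine_wont_start", "battery_ok"}, "check_fuel"),
]
working_memory = {"engine_wont_start", "battery_ok"}

while True:
    # Match: rules whose LHS conditions are all satisfied (and not yet applied) form the conflict set
    conflict_set = [r for r in rules if r[1] <= working_memory and r[2] not in working_memory]
    if not conflict_set:
        break                                   # stopping condition: empty conflict set
    # Select: prefer the most specific rule, i.e. the one with the most conditions
    name, lhs, rhs = max(conflict_set, key=lambda r: len(r[1]))
    # Execute: carry out the RHS by adding its clause to working memory
    working_memory.add(rhs)
    print("fired", name, "->", rhs)

print(working_memory)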
A typical knowledge base will contain hundreds or even thousands of rules and each rule will contain several (perhaps as many as ten
or more) conditions. Working memories typically contain hundreds of clauses as well. Consequently, exhaustive matching of all rules
and their LHS conditions against working-memory clauses may require tens of thousands of comparisons. This accounts for the claim
made in the introductory paragraph that as much as 90% of the computing time for such systems can be related to matching
operations.
To eliminate the need to perform thousands of matches per cycle, an efficient match algorithm called RETE has been developed
(Forgy, 1982). It was initially developed as part of the OPS family of programming languages (Brownston, et al., 1985). This algorithm
uses several novel features, including methods to avoid repetitive matching on successive cycles. The main timesaving features of
RETE are as follows.
Advantages:
a. In most expert systems, the contents of working memory change very little from cycle to cycle.
b. Many rules in a knowledge base will have the same conditions occurring in their LHS.
c. The temporal nature of data.
d. Structural similarity in rules.
e. Persistence of variable binding consistency.
PROLOG
The name PROLOG was taken from the phrase "PROgramming in LOGic". The language was originally developed in 1972 by Alain Colmerauer and P. Roussel at the University of Marseille in France. Prolog is unique in its ability to infer facts and conclusions from other facts.
Prolog has been successful as an AI programming language for the following reasons.
1. The syntax and semantics of Prolog are very close to formal logic; by this time it must be clear to you that most AI programs reason using logic.
2. The Prolog language has a built-in inference engine and automatic backtracking facility. This helps in the efficient implementation of various search strategies.
3. Because of the inherent AND-parallelism, the Prolog language can be implemented with ease on parallel machines.
4. The clauses of Prolog have both a procedural and a declarative meaning. Because of this, understanding the language is easier.
5. In Prolog, each clause can be executed separately as though it were a separate program. Hence modular programming and testing are possible.
The main part of a prolog program consists of a collection of knowledge about a specific subject. This collection is called a database
and this database is expressed in facts and rules.
Turbo Prolog permits you to describe facts as symbolic relationships.
Ex: The right speaker is dead.
This same fact can be expressed in Turbo Prolog as
is(right_speaker, dead).
This factual expression in Prolog is called a clause. Notice the period at the end of the clause.
In this example, right_speaker and dead are objects.
The word "is" is the relation in this example. A relation is a name that defines the way in which a collection of objects belong together.
The entire expression before the period in each case is called a predicate. A predicate is a function with a value of true or false; predicates express a property or relationship. The word before the parentheses is the name of the relation. The elements within the parentheses are the arguments of the predicate, which may be objects or variables.
The structure of a Prolog program can be summarised as follows: clauses are either facts or rules; each clause is built from predicates and is terminated by a period; a predicate consists of a relation and its arguments; and the arguments may be objects or variables.
A rule is an expression indicating that the truth of a particular fact depends upon one or more other facts. Consider this example:
IF there is body stiffness or pain in the joints
AND there is sensitivity to infections
THEN there is probably a vitamin C deficiency.
This rule could be expressed as the following Turbo Prolog clause:
hypothesis(vitc_deficiency) if
symptom(arthritis) and symptom(infection_sensitivity).
Notice the general form of the Prolog rule: the conclusion is stated first and is followed by the word if. Every rule has a conclusion and an antecedent. Because rules are an important part of any Prolog program, a shorthand has been developed for expressing them. In this abbreviated form the previous rule becomes:
hypothesis(vitc_deficiency) :-
symptom(arthritis),
symptom(infection_sensitivity).
The :- operator is read as "if"; a comma expresses an AND relationship and a semicolon expresses an OR relationship.
The process of using rules and facts to solve a problem is called formal reasoning.
A Turbo Prolog program consists of two or more sections. The main body of the program, the clauses section, contains the clauses and consists of facts and rules. The relations used in the clauses of the clauses section are defined in the predicates section; each relation in each clause must have a corresponding predicate definition there, and a predicate definition in the predicates section does not end with a period. The domains section is also a part of most Turbo Prolog programs. It defines the type of each object; through the domains section Turbo Prolog controls the typing of objects.
Six basic object types are available to the user:
Character
Integer
Real
String
Symbol
File
Character – a single character (enclosed between single quotation marks)
Integer – an integer from -32768 to 32767
Real – a floating-point number (about 1e-307 to 1e+308)
String – a character sequence (enclosed between double quotation marks)
Symbol – a character sequence of letters, numbers and underscores, with the first character a lower-case letter
File – a symbolic file name.
writedevice: Any output to the display can be directed instead to a printer or a file. To redirect output, use the writedevice built-in predicate.
writedevice(printer)
write("this will appear on the printer")
writedevice(screen)
writef predicate:
Sometimes you may wish to format output; for example, you may want the decimal points in a table to be aligned. With Turbo Prolog you can use the writef built-in predicate to force alignment of numbers and text.
writef(format, E1, E2, … En)
The variable format is a control string containing codes of the form
%-m.p
where the hyphen forces left justification (the default is right justification), m is the minimum field width, and p is the precision of a decimal floating-point number.
Ex: writef("%-10#%5.0$%3.2\n", fan, 23, 3.1)
displays fan#23$3.10 (with each field padded to its specified width).
Input operators:
Turbo Prolog provides several built-in input predicates, for example:
readln – string or symbol
readchar – character
readint – integer
readreal – real
inkey
keypressed
The readln predicate: The readln predicate permits a user to read any string or symbol into a variable.
Ex: readln(Reply)
Reply = "yes"
The readchar predicate: The readchar predicate permits a user to read any character into a variable.
Ex: readchar(Reply)
Reply = 'y'
The readint predicate can be used to read an integer value into a variable.
Ex: readint(Age)
Readreal: The readreal predicate can be used to read floating-point numbers into a variable.
Ex: readreal(Price)
Price = 12.50
Inkey: The inkey predicate reads a single character from the input. The predicate form is inkey(Char).
If there is no input the predicate fails; otherwise the character read is returned bound to Char.
Keypressed: The keypressed predicate determines whether a key has been pressed, without a character being returned.
Controlling execution predicates.
Fail predicate: In Prolog, forcing a rule to fail under certain conditions is a type of control and is essential to goal programming. Failure can be forced in any rule by using the built-in fail predicate. Fail forces backtracking in an attempt to unify with other clauses. Whenever this predicate is invoked, the goal being proved immediately fails and backtracking is initiated. The predicate has no arguments, so the failing of the fail predicate is not dependent on variable bindings; the predicate always fails.
Ex 1:
go :-
test,
write("you will never get here").
test :-
fail.
Goal: go
False
Here the write predicate will never be executed.
Ex 2:
go :-
write("you will get here"),
test.
test :-
fail.
Goal: go
you will get here
False
The cut predicate: The cut is one of the most important and one of the most complex features of Prolog.
The primary purpose of the cut is to prevent or block backtracking based on a specified condition. The cut predicate is written as an exclamation point (!) and has no arguments. The cut predicate always succeeds and, once it succeeds, it acts as a fence, effectively blocking any backtracking beyond the cut. If any premise beyond the cut fails, Prolog can only backtrack as far as the cut to try another path.
The basic purpose of the cut is to eliminate certain search paths in the problem space. You can think of the paths through the database as the limbs of a tree. If you get out on one limb and discover failure, you normally can move back towards the trunk, find another limb and start moving outward again. The cut acts like a fence: from where it is placed in the program, you cannot backtrack beyond it. The cut is typically used:
1. To terminate the search for any further solutions to a goal once a particular rule has been tested.
2. To terminate the search for any further solutions to a goal once a particular rule has been forced to fail with the fail predicate.
3. To eliminate paths in the database in order to speed up execution; here the cut is not always the best solution.
The red type of cut is used to omit explicit conditions. Cuts can improve the clarity and efficiency of most programs; of the two types of cut, the green cut is the more acceptable type. You can often use the not predicate instead of the red cut.
Recursion: Recursion is one of Prolog's most important execution control techniques. It is often used with the fail or cut predicate.
Recursion is a technique in which something is defined in terms of itself. In Prolog, recursion is the technique of using a clause to invoke a copy of itself.
Ex: If you specify the goal count(1) you should see the following output -> 1 2 3 4 5 6 7 8 True
The clause count contains a copy of itself:
count(N)
(statements)
count(NN)
Program:
Predicates
count(integer)
Clauses
count(9).
count(N) :-
write(" ", N),
NN = N + 1,
count(NN).
The repeat predicate: One predicate that uses recursion is the repeat predicate. Although repeat is not a built-in predicate, it is so important that you will probably want to add it to many programs. Its definition is
repeat.
repeat :-
repeat.
The repeat predicate is useful for forcing a program to generate alternative solutions through backtracking. When the repeat predicate is used in a rule, the predicate always succeeds. If a later premise causes failure of the rule, Prolog will backtrack to the repeat predicate.
Basic rules of recursion:
1. A program must include some method of terminating the recursion loop.
2. Variable bindings in a rule or fact apply only to the current layer.
3. In most applications the recursive procedure should do its work on the way down.
Recursion is not always the best programming choice: recursion is complex and the flow of execution is not always easy to follow.
length(List, Length): The length utility simply measures the list, reporting how long it is, which lists are of a specified length, or whether a specified list has the specified length.
length([], 0).
length([_|Tail], Len) :-
length(Tail, Len1),
Len = Len1 + 1.
Location (Position, list, object): This predicate can either find the location of the specified object, find the object at the location or
confirm the presence of the object at the position.
Insert, extract, replace: These three utilities share a common argument pattern:
(Object, Location, List, NewList)
insert(Object, Location, List, NewList): The insert utility differs from append in that it allows you to select the location in the list at which to do the insertion.
insert(Item, 1, List, [Item|List]) :- !.
insert(Item, Loc, [Head|Rest], [Head|Rest1]) :-
Loc1 = Loc - 1,
insert(Item, Loc1, Rest, Rest1).
replace(Object, Location, List, NewList): The replace utility goes to the specified location and substitutes the specified object for the object at that location.
replace(Item, 1, [_|Rest], [Item|Rest]) :- !.
extract(Object, Location, List, NewList): The extract utility shortens the list by removing the item at the specified location.
APPEND (List, List, List): Using append, you can add an element to a list. Since an empty list is still a list, this means you can use append to build a list from scratch.
append([], L, L).
append([X|L1], L2, [X|L3]) :-
append(L1, L2, L3).
DELETE (Object, List, NewList): Deleting an object from a list is also a recursive process.
Domains
integerlist = integer*
item = integer
Predicates
delete(item, integerlist, integerlist)
Clauses
delete(Item, [Item|Tail], Tail).
The same definition can be used with lists of symbols by declaring, for example,
Domains
objectlist = symbol*
item = symbol
and working with a list such as L = [vijay, giri, vamsi, koti].
/* SUM OF THE ELEMENTS OF A LIST */
Domains
list = integer*
sum = integer
Predicates
sum_of_list(list, sum)
Clauses
sum_of_list([], 0).
sum_of_list([Head|Tail], Sum) :-
sum_of_list(Tail, AddTail),
Sum = Head + AddTail.
Result:
Goal: sum_of_list([], Sum)
Sum = 0
Goal: sum_of_list([1,2,3,4,5], Sum)
Sum = 15
/* CALCULATE FACTORIAL */
Domains
x, factx = integer
Predicates
factorial(x, factx)
Clauses
factorial(1, 1).
factorial(X, FactX) :-
Y = X - 1,
factorial(Y, FactY),
FactX = X * FactY.
Result:
Goal: factorial(5, X)
X = 120
/* APPENDING TWO LISTS */
Domains
list = symbol*
Predicates
append(list, list, list)
Clauses
append([], List, List).
append([H|List1], List2, [H|List3]) :-
append(List1, List2, List3).
Result:
Goal: append([rajesh, mahesh], [divya, deepthi], L)
L = [rajesh, mahesh, divya, deepthi]
LISP
LISP (LISt Processing) is an AI programming language developed by John McCarthy in the late 1950s. Lisp is a symbolic processing language that represents information in lists and manipulates these lists to derive information. To date, different dialects of the Lisp language have been developed; the foundations of these dialects remain the same while the syntax and functionality show a marked change.
LISP DIALECTS
S. No Dialects Developed by
1 Common Lisp Gold Hill computers
2 Franz Lisp Franz Inc
3 Inter Lisp Xerox
4 Mu lisp Microsoft Inc ,the software house
5 Portable standard Lisp University of Utah
6 Vax lisp Digital equipment corp.
7 X lisp Blue users group
8 Zeta lisp LMI, symbolic, Xerox
Features of LISP:
1. LISP has an equivalence of form between programs and data in the language, which allows data structures to be executed as programs and programs to be modified as data.
2. LISP relies heavily on recursion as a control structure, rather than the iteration (looping) that is common in most programming languages.
3. LISP has an interpreter, which implies that its debugging facilities are unmatched by any compiled language.
4. LISP encourages the use of dynamic data structures, which leads to the design of flexible systems.
5. The LISP macro facility allows new languages to be defined and embedded within the LISP system.
Preliminaries of Lisp: The LISP language is used extensively for manipulating lists. Its major characteristic is that the basic elements are treated as symbols irrespective of whether they are numeric or alphanumeric. The basic data element in Lisp is the atom, and an atom is indivisible. Lisp has two basic types of atoms: numbers and symbols. Numbers represent numerical values; any type of number, such as a positive or negative integer, floating point or decimal, is acceptable. Symbols represent alphanumeric strings and can be a combination of letters and numerals. Some example lists are:
(apple orange grapes mango)
(millimeter container decimeter meter)
(78 65 71 70 68)
(bullock_cart (hercules atlas hero) (tvs 50 kelvinator mofa))
Such a combination of lists within lists is called a "nested", "multiple" or "complex" list. The only thing to be kept in mind is that the numbers of opening and closing parentheses must be the same; if not, the system will flash an error message. Because of the parenthesis problem, Lisp is jovially referred to as a language with "Lots of Infuriating Stupid Parentheses".
LISP FUNCTIONS: A Lisp program is a collection of small routines which define simple functions. So that complex functions may be written, the language comes with a set of basic functions called primitives. These primitives serve commonly required operations. Apart from these primitives, Lisp permits you to define your own functions. Because of this independence Lisp is highly flexible, and since tailor-made functions can be defined and manipulated, modularity is high. Lisp uses prefix notation for its operations. The basic primitives of Lisp are classified as:
1. Arithmetic primitives
2. Boolean primitives
3. List manipulation primitives
1. Arithmetic primitives: + addition, - subtraction, * multiplication, / division; add1 adds 1 to its argument, and expt raises its first argument to the power specified by its second argument. Quotient and remainder together give the result of division between integers, recip gives the reciprocal, and max and min return the maximum and minimum of their arguments.
(+ 6 2) => 8 ; (* 6 2) => 12
(- 6 2) => 4 ; (/ 6 2) => 3
(remainder 14 3) => 2 ; (recip 5) => 0.2 ; (max 8 3 9) => 9 ; (min 8 3 9) => 3
Boolean primitives: These primitives give a result which is Boolean in nature, i.e. true (T) or false (NIL). Some of them require only one argument while others require more than one. Some such primitives are:
1. atom: finds out whether the element is an atom or not.
Ex: (atom 'raman) => T
(atom '(26 35)) => NIL
2. numberp: determines whether the atom is a number or not.
Ex: (numberp 'raman) => NIL ; (numberp 20) => T
3. listp: determines whether the input is a list or not.
Ex: (listp '(25 35 46 75)) => T
(listp 'raman) => NIL
4. zerop: finds out whether the number is zero or not.
Ex: (zerop 26) => NIL ; (zerop 0) => T
5. oddp: finds out whether the input is odd.
Ex: (oddp 65) => T ; (oddp 60) => NIL
6. evenp: finds out whether the input is even or not.
Ex: (evenp 78) => T ; (evenp 89) => NIL
7. equal: finds out whether the given lists are equal.
Ex: (equal '(janaki raman sarukesi) '(janaki raman sarukesi)) => T
(equal '(75 67 94) '(4 3 65 987)) => NIL
8. greaterp: finds out whether the first argument is greater than the second.
Ex: (greaterp 46 86) => NIL
(greaterp 86 46) => T
9. lessp: finds out whether the first argument is less than the second.
List manipulation primitives: The purposes of the list manipulation primitives are: 1. creating a new list; 2. modifying an existing list by addition, deletion or replacement of an atom; 3. extracting portions of a list.
For these Lisp provides some primitives, and well-specified functions can be developed from them. In Lisp, values are assigned to variables with the setq primitive. The primitive has two arguments, the first being the variable and the second the value to be assigned to the variable. The value could be an atom or a list itself.
Ex: 1. (setq a 22) when evaluated assigns 22 to the variable a.
2. (setq tv 'onida) would assign tv = ONIDA.
3. (setq pressure1 22) would assign pressure1 = 22
(setq pressure2 pressure1) would assign pressure2 = 22
(setq pressure2 'pressure1) would assign pressure2 = PRESSURE1
List construction: Lists are constructed using the cons primitive. Cons adds a new element to the front of an existing list. This primitive needs two arguments and returns a single list.
Ex: 1. (cons 'p '(q r s)) => (P Q R S)
2. (setq a '(b c d))
(setq x '(x y z))
(cons a x)
=> ((B C D) X Y Z)
While the cons primitive adds a new element to the front of the existing list, the primitive append adds it to the tail of the existing list.
The cdr primitive returns the list excluding its first element.
Ex: 1. (cdr '(ram lakshman bharath shatrughnan))
=> (LAKSHMAN BHARATH SHATRUGHNAN)
2. (cdr '((32 36) (56 54) (67 31) (67 87 89)))
=> ((56 54) (67 31) (67 87 89))
Defining functions and conditionals: The function named define is used to define functions. It requires three arguments: 1. the new function name; 2. the parameters for the function; 3. the function body, i.e. the Lisp code which performs the desired operations.
Syntax: (define name (parm1 parm2 ...) body)
Define does not evaluate its arguments; it simply builds a function which may be called like any other function.
Ex: we define a function named average to compute the average of 3 numbers.
-> (define average (n1 n2 n3) (/ (+ n1 n2 n3) 3))
AVERAGE
->
-> (average 10 20 30)
20
->
The conditional (cond): Predicates are one way to make tests in a program and take different actions based on the outcome of the test; in addition we need some construct to permit branching.
Syntax: (cond (<test1> <action1>)
(<test2> <action2>)
...
(<testk> <actionk>))
Each (<testi> <actioni>), i = 1 … k, is called a clause. Each clause consists of a test portion and an action or result portion.
Ex: find the maximum of 2 numbers
-> (defun maximum2 (a b)
(cond ((> a b) a)
(t b)))
MAXIMUM2
->
Note the t in the second clause preceding b. This forces the last clause to be evaluated when the test of the first clause is nil (false).
-> (maximum2 234 320)
320
Find the maximum of 3 numbers:
-> (defun maximum3 (a b c)
(cond ((> a b) (cond ((> a c) a) (t c)))
((> b c) b)
(t c)))
MAXIMUM3
->
-> (maximum3 20 30 25)
30
->
Logical functions: Like predicates, logical functions may also be used for flow of control. The basic logical operations are AND, OR and NOT. Not is the simplest: it takes one argument and returns T if the argument is nil, and nil otherwise.
Input, output and local variables: These operations are performed with the input-output functions. The most commonly used I/O functions are read, print, prin1, princ, terpri and format.
prin1: It is the same as print except that the new-line character and space are not provided.
princ: We can avoid the double quotation marks in the output by using the printing function princ. It is the same as prin1 except that it does not print the unwanted quotation marks.
terpri: It takes no arguments. It introduces a new line (carriage return and line feed) wherever it appears and then returns nil.
format: The format function permits us to create cleaner output than is possible with just the basic printing functions. Its general form is
-> (format <destination> <string> arg1 arg2 ...)
Destination specifies where the output is to be directed. String is the desired output string, intermixed with format directives which specify how each argument is to be represented.
Directives appear in the string in the same order as the arguments to be printed. Each directive is preceded by a tilde character (~) to identify it as a directive.
~F: the argument, which must be a floating-point number, is printed as a decimal floating-point number.
2. The traveling salesman problem involves n cities with paths connecting the cities. The time taken for traversing through all the cities, without knowing in advance the length of a minimum tour, is
(a) O(n)
(b) O(n^2)
(c) O(n!)
(d) O(2^n)
(e) O(n log n).
18. Factors which affect the performance of learner system does not include,
(a) The representation scheme used
(b) The training scenario
(c) The type of feedback
(d) Good data structures
(e) Learning algorithm.
24. In Dempster-Shafer theory, the relation between belief and plausibility is given as,
(a) Pl(s) = 1 + Bel(Øs)
(b) Pl(s) = 1 – Bel(s)
(c) Pl(s) = 1 + Bel(s)
(d) Pl(s) = 1 – Bel(Øs)
(e) Pl(s) = 1 * Bel(s).
27. Knowledge-based systems get their power from the expert knowledge that has been coded into
(a) Facts
(b) Rules
(c) Heuristics
(d) Procedures
(e) Relations.
28. A good system for the representation of knowledge in a particular domain should possess the properties
(a) Representational Adequacy
(b) Inferential Adequacy
(c) Inferential Efficiency
(d) Acquisitional Efficiency
(e) All of the above.
END OF SECTION A
b. What is a Learning Agent? Explain the role of the Learning Agent in Artificial Intelligence. ( 5 marks)
2. a. Briefly explain Uniform Cost Search. ( 3 marks)
A computer vision system needs to be developed to identify abnormalities in ECG patterns. With regard to this vision system, identify a suitable processing technique for each of the processing stages given in the following table and complete the table. ( 7 marks)
b. In many computer vision systems, image segments are used as useful features for providing interpretations about the contents of an
image. Give two techniques which can be used to extract segments of an image. ( 3 marks)
5. a. If the start state is 1 and the goal state is 7 in the diagram given below, describe the search path pursued by a breadth first search
algorithm, uniform cost search algorithm and a depth-first search algorithm. The value given on a link that connects two nodes
describes the cost to travel between the two nodes.
( 5 marks)
b. Compare the depth-first search and breadth-first search algorithms by writing out their advantages and disadvantages. ( 5 marks)
END OF SECTION B
7. Animals can be divided into various categories such as mammals and birds. Mammals have hair. They give milk too. Birds can fly and lay eggs. Animals which eat meat are known as carnivores. These carnivorous animals have pointed teeth, claws and forward-facing eyes. Mammals which have hoofs or chew cud are known as ungulates. The cheetah is a mammal as well as a carnivore. It has a tawny color and dark spots. The giraffe is an ungulate. It has a long neck, long legs and dark spots. The zebra is also an ungulate, with black stripes. Although a penguin is a bird, it cannot fly, but it can swim. It is colored black and white. Represent the above information using a semantic network. ( 10 marks)
END OF SECTION C
Suggested Answers
Artificial Intelligence (MC321) : January 2008
Section A : Basic Concepts
Answer Reason
1. A Transition networks are based on the application of directed graphs and finite state automata.
2. C The time taken for traversing through all the cities, without knowing in advance the length of a minimum tour, is O(n!).
3. A The problem space of means-end analysis has an initial state and one or more goal states.
4. B The engineering goal of artificial intelligence is to solve real-world problems.
5. E Applications of artificial intelligence should be judged according to whether there is a well-defined task, an implemented program and a set of identifiable principles.
6. B An algorithm A is admissible if it is guaranteed to return an optimal solution when one exists.
7. A An algorithm A is optimal over a class of algorithms if A dominates all members of the class.
8. A In informed or directed search, some information about the problem space is used to compute a preference among the children for exploration and expansion.
9. C The action 'ONTABLE(A)' of a robot arm means Block A is on the table.
10. A In the matching process, the conditions in the left hand side (LHS) of the rules in the knowledge base are matched against the contents of working memory.
11. A Logic programming is a programming language paradigm in which logical assertions are viewed as programs.
12. A A PROLOG program is described as a series of logical assertions, each of which is a Horn clause.
14. B In the rule selection process one of the rules is selected from the conflict set to execute.
15. A Analogy is based on previously tested knowledge that bears a strong resemblance to the current situation.
16. B In case grammars, case relates to the semantic role that a noun phrase plays with respect to verbs and adjectives.
17. A Semantic grammars encode semantic information into a syntactic grammar.
18. D Factors which affect the performance of a learner system do not include good data structures.
19. A A model of a set of wff's is an interpretation that satisfies them.
20. A Perception involves sights, sounds, smell and touch.
21. A The third component of a planning system is to detect when a solution has been found.
22. C Robustness is the measure of a learning system's ability to function with unreliable feedback and with a variety of training examples.
23. B In the Closed World Assumption, if a proposition cannot be proven, it is false.
24. D In Dempster-Shafer theory, the relation between belief and plausibility is given as Pl(s) = 1 – Bel(Øs).
25. A Circumscription states that all objects that have some property P are the only objects that satisfy P.
26. B Completion formulas are axioms which are added to a Knowledge Base to restrict the applicability of specific predicates.
27. E Knowledge-based systems get their power from the expert knowledge that has been coded in relations.
28. E Representational Adequacy, Inferential Adequacy, Inferential Efficiency, Acquisitional Efficiency.
29. C A Horn clause is a clause that has at most one positive literal.
30. B Neural networks store information in the strengths of the interconnections.
Section B : Problems
1. a. A simple agent program can be defined mathematically as an agent function which maps every possible percept sequence to a possible action the agent can perform, or to a coefficient, feedback element, function or constant that affects eventual actions:
f: P* -> A
The program agent, instead, maps every possible percept to an action.
It is possible to group agents into five classes based on their degree of perceived intelligence and capability:
simple reflex agents;
model-based reflex agents;
goal-based agents;
utility-based agents;
learning agents.
1. Simple reflex agents
Simple reflex agents act only on the basis of the current percept. The agent function is based on the condition-action rule:
if condition then action
This agent function only succeeds when the environment is fully observable. Some reflex agents can also contain information on their current state, which allows them to disregard conditions whose actuators are already triggered. (A small condition-action sketch is given after this list of agent types.)
2. Model-based reflex agents
Model-based agents can handle partially observable environments. The current state is stored inside the agent, which maintains some kind of structure describing the part of the world that cannot be seen. This behavior requires information on how the world behaves and works. This additional information completes the "World View" model.
3. Goal-based agents
Goal-based agents are model-based agents which store information regarding situations that are desirable. This allows the agent a
way to choose among multiple possibilities, selecting the one which reaches a goal state.
4. Utility-based agents
Goal-based agents only distinguish between goal states and non-goal states. It is possible to define a measure of how desirable a
particular state is. This measure can be obtained through the use of a utility function which maps a state to a measure of the utility of
the state.
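As promised above, here is a minimal sketch of the agent-function idea for a simple reflex agent driven by condition-action rules; the percepts and rules describe a hypothetical two-square vacuum world and are invented for illustration.

# condition-action rules for a hypothetical vacuum-world reflex agent
rules = {
    ("A", "dirty"): "suck",
    ("B", "dirty"): "suck",
    ("A", "clean"): "move_right",
    ("B", "clean"): "move_left",
}

def simple_reflex_agent(percept):
    # agent function: maps the current percept directly to an action
    location, status = percept
    return rules[(location, status)]

print(simple_reflex_agent(("A", "dirty")))   # suck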
b. Learning agents
IAs are also referred to as autonomous intelligent agents, which means they act independently, and will learn and adapt to changing
circumstances. IA systems should exhibit the following characteristics:
• learn and improve through interaction with the environment .
• adapt online and in real time
• learn quickly from large amounts of data
• accommodate new problem solving rules incrementally
• have memory based exemplar storage and retrieval capacities
• have parameters to represent short and long term memory, age, forgetting, etc.
• be able to analyze itself in terms of behavior, error and success.
To actively perform their functions, Intelligent Agents today are normally gathered in a hierarchical structure containing many "sub-agents". Intelligent sub-agents process and perform lower-level functions. Taken together, the intelligent agent and sub-agents create a complete system that can accomplish difficult tasks or goals with behaviors and responses that display a form of intelligence.
Some of the sub-agents (not already mentioned in this treatment) that may be a part of an Intelligent Agent or a complete Intelligent
Agent in themselves are:
1. Temporal Agents (for time-based decisions);
2. Spatial Agents (that relate to the physical real-world);
3 Input Agents (that process and make sense of sensor inputs - example neural network based agents neural network);
4. Processing Agents (that solve a problem like speech recognition);
5. Decision Agents (that are geared to decision making);
6. Learning Agents (for building up the data structures and database of other Intelligent agents);
7. World Agents (that incorporate a combination of all the other classes of agents to allow autonomous behaviors).
2. a. Breadth first search finds the shallowest goal state, and this will be the cheapest solution so long as the path cost is a function
of the depth of the solution. However, if this is not the case, then breadth first search is not guaranteed to find the best (i.e. cheapest)
solution. Uniform cost search remedies this. It works by always expanding the lowest cost node on the fringe, where the cost is the
path cost, g(n).
In fact, breadth first search is a uniform cost search with g(n) = DEPTH(n).
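A short sketch of uniform cost search using a priority queue ordered on the path cost g(n). The weighted graph in the usage example is made up for illustration and is not the one in the question paper.

import heapq

def uniform_cost_search(graph, start, goal):
    # always expand the lowest-g(n) node first; graph maps node -> list of (neighbour, edge cost)
    frontier = [(0, start, [start])]
    explored = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in explored:
            continue
        explored.add(node)
        for neighbour, step in graph.get(node, []):
            heapq.heappush(frontier, (cost + step, neighbour, path + [neighbour]))
    return None

# made-up weighted graph for illustration
graph = {1: [(2, 4), (4, 1)], 4: [(3, 1)], 3: [(7, 2)], 2: [(7, 10)]}
print(uniform_cost_search(graph, 1, 7))   # (4, [1, 4, 3, 7])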
b. A* (pronounced "A star") is a graph/tree search algorithm that finds a path from a given initial node to a given goal node (or one passing a given goal test). It employs a "heuristic estimate" h(x) that ranks each node x by an estimate of the best route that goes through that node. It visits the nodes in order of this heuristic estimate. The A* algorithm is therefore an example of best-first search.
Generally speaking, Depth-first search and breadth-first search are two special cases of A* algorithm. Dijkstra’s algorithm as another
example of a best-first search algorithm, is the special case of A* where h(x) = 0 for all x. For depth-first search, we may consider that
there is a global counter C initialized with a very big value. Every time, we process a node, we assign C to all of its newly discovered
neighbors. After each single assignment, we decrease the counter C by one. Thus the earlier a node is discovered, the higher its h(x)
value.
Consider the problem of route finding, for which A* is commonly used. A* incrementally builds all routes leading from the starting point
until it finds one that reaches the goal. But, like all informed search algorithms, it only builds routes that appear to lead towards the
goal.
To know which routes will likely lead to the goal, A* employs a heuristic estimate of the distance from any given point to the goal. In the
case of route finding, this may be the straight-line distance, which is usually an approximation of road distance.
What sets A* apart from greedy best-first search is that it also takes the distance already travelled into account. This makes A*
complete and optimal, i.e., A* will always find the shortest route if any exists and if h(x) was chosen correctly. While optimal in arbitrary
graphs, it is not guaranteed to perform better than simpler search algorithms that are more informed about the problem domain. In a
maze-like environment, the only way to reach the goal might be to first travel one way (away from the goal) and eventually turn around.
In this case trying nodes closer to your destination first may cost you time.
Properties
Like breadth-first search, A* is complete in the sense that it will always find a solution if there is one.
If the heuristic function h is admissible, meaning that it never overestimates the actual minimal cost of reaching the goal, then A* is
itself admissible (or optimal) if we do not use a closed set. If a closed set is used, then h must also be monotonic (or consistent) for A*
to be optimal. Monotonicity means that the heuristic never overestimates the cost of getting from a node to its neighbor: formally, for all pairs of nodes x, y where y is a successor of x, h(x) <= d(x,y) + h(y), where d(x,y) is the cost of the edge from x to y.
A* is also optimally efficient for any heuristic h, meaning that no algorithm employing the same heuristic will expand fewer nodes than
A*, except when there are several partial solutions where h exactly predicts the cost of the optimal path.
A* is admissible and computationally optimal
A* is both admissible and considers fewer nodes than any other admissible search algorithm with the same heuristic, because A*
works from an “optimistic” estimate of the cost of a path through every node that it considers — optimistic in that the true cost of a path
through that node to the goal will be at least as great as the estimate. But, critically, as far as A* “knows”, that optimistic estimate might
be achievable.
When A* terminates its search, it has, by definition, found a path whose actual cost is lower than the estimated cost of any path
through any open node. But since those estimates are optimistic, A* can safely ignore those nodes. In other words, A* will never
overlook the possibility of a lower-cost path and so is admissible.Suppose now that some other search algorithm A terminates its
search with a path whose actual cost is not less than the estimated cost of a path through some open node. Algorithm A cannot rule
out the possibility, based on the heuristic information it has, that a path through that node might have a lower cost. So while A might
consider fewer nodes than A*, it cannot be admissible. Accordingly, A* considers the fewest nodes of any admissible search algorithm
that uses a no more accurate heuristic estimate.
Complexity
The time complexity of A* depends on the heuristic. In the worst case, the number of nodes expanded is exponential in the length of
the solution (the shortest path), but it is polynomial when the heuristic function h meets the following condition:
|h(x) - h*(x)| = O(log h*(x)), where h* is the optimal heuristic, i.e. the exact cost to get from x to the goal. In other words, the error of h should not grow faster than the logarithm of the "perfect heuristic" h* that returns the true distance from x to the goal. More problematic than its time complexity is A*'s memory usage: in the worst case, it must also remember an exponential number of nodes.
Several variants of A* have been developed to cope with this, including iterative deepening A* (IDA*), memory-bounded A* (MA*) and
simplified memory bounded A* (SMA*) and recursive best-first search (RBFS).
c. Uniform cost search minimises the cost g(n) of the path up to node n. Uniform cost search is both optimal and complete but can be
very inefficient. We combine this evaluation function, g(n), with a heuristic evaluation, h(n) to give the evaluation function for the A*
algorithm,
i.e. f(n) = g(n) + h(n)
As g(n) gives the path cost from the start node to node n and h(n) gives the estimated cost of the cheapest path from node n to the
goal, we have f(n) = estimated cost of the cheapest solution through node n. A* is both optimal and complete if the heuristic is
admissible (i.e. does not overestimate the cost to the goal). It also, in the worst case, has the same time and space complexity as
uniform cost search.
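A corresponding sketch of A*, which orders the frontier on f(n) = g(n) + h(n). The graph and the heuristic values below are invented for illustration; with an admissible heuristic the search returns an optimal path.

import heapq

def a_star(graph, h, start, goal):
    # graph: node -> [(neighbour, edge cost)]; h: heuristic estimate of the cost to the goal
    frontier = [(h[start], 0, start, [start])]        # entries are (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        for neighbour, step in graph.get(node, []):
            g2 = g + step
            if g2 < best_g.get(neighbour, float("inf")):
                best_g[neighbour] = g2
                heapq.heappush(frontier, (g2 + h[neighbour], g2, neighbour, path + [neighbour]))
    return None

# invented example with straight-line-style heuristic values for each node
graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 12)], "B": [("G", 5)]}
h = {"S": 7, "A": 6, "B": 2, "G": 0}
print(a_star(graph, h, "S", "G"))   # (8, ['S', 'A', 'B', 'G'])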
3. a. 1. married(abhaya, dishna).
2. sibling(bimal, chintha).
3. mother(dishna, chintha).
4. son(bimal, abhaya).
b. 5. On the father-son relationship: father(X, Y) :- son(Y, X).
6. On the sibling relationship: sibling(X, Y) :- sibling(Y, X).
7. On the father-sibling relationship: father(X, Y) :- father(X, Z), sibling(Y, Z).
8. On the mother-sibling relationship: mother(X, Y) :- mother(X, Z), sibling(Y, Z).
c. ~father(abhaya, chintha) + father(X, Y) ->
~(father(abhaya, Z), sibling(Z, chintha)) --- rule 7
father(abhaya, Z) + father(X, Z) -> son(Z, abhaya) --- rule 5
son(Z, abhaya) + son(bimal, abhaya) -> NULL --- fact 4
~sibling(bimal, chintha) + sibling(bimal, chintha) -> NULL --- rule 6
Hence ~father(abhaya, chintha) is contradicted, i.e. abhaya IS the father of chintha.
~father(bimal, X) + father(bimal, X) -> ~(son(X, bimal)) --- rule 5
~son(X, bimal) + son(abhaya, bimal) -> NULL --- fact 4
4. a.
Processing stage – Technique
Image capturing – A flat-bed scanner can be used to capture the image.
Preprocessing – The image can be enhanced using sharpening, smoothing and histogram equalization, selecting the appropriate techniques.
Feature extraction – The ECG pattern should be extracted from the background. This could be done using threshold-based segmentation since the signal is darker than the background.
Feature representation – The extracted ECG signal pattern can be given as 1s and the background as 0s in a binary image.
Interpretation of the ECG pattern – Neural-network-based techniques can be used to train the system on normal ECG patterns; abnormal patterns can then be identified by the system.
b. (i) Image segmentation by thresholding. The grey-level histogram can be used as an aid to identify the thresholds required to segment the image into meaningful segments.
(ii) Pixel aggregation and region growing.
5. a. Breadth-first Search: 1, 2, 3, 4, 5, 6, 7
Uniform Cost Search: 1, 4, 3, 2, 8, 9, 7, 6, 5
Depth first Search: 1, 2, 5, 6, 3, 7
b. Depth First Search
Advantages
1) Needs little memory (only nodes in current path need to be stored)
2) May arrive at solutions without examining much of search space.
Disadvantages
1) May explore a single unfruitful path for a long time (forever if loop exists)
2) May settle for a non-optimal solution
Breadth First Search
Advantages
1) Will not go down a blind alley for solution.
2) Optimal solutions are always found; multiple solutions found early.
Disadvantages
1) The entire tree generated must be stored in memory.
2) If solution path is long, the whole tree must be searched up to that depth.