NCVRT Artificial Intelligence Notes

The document introduces artificial intelligence (AI), defining it as the study of creating systems that can perform tasks requiring human-like intelligence, categorized into systems that think and act like humans or rationally. It outlines key components of intelligence, including learning, reasoning, and problem-solving, and discusses historical milestones in AI development from its inception in the 1940s to the rise of machine learning in the 1980s. Additionally, it covers problem formulation in AI, emphasizing the importance of defining initial states, successor functions, goal tests, and path costs in problem-solving algorithms.


UNIT I

INTRODUCTION TO AI AND PRODUCTION SYSTEMS
CHAPTER - 1
What is Artificial Intelligence?
1. INTELLIGENCE
 The capacity to learn and solve problems.
 In particular,
 the ability to solve novel problems (i.e. solve new problems)
 the ability to act rationally (i.e. act based on reason)
 the ability to act like humans

What is involved in intelligence?

• Ability to interact with the real world
– to perceive, understand, and act
– e.g., speech recognition, understanding, and synthesis
– e.g., image understanding
– e.g., ability to take actions, have an effect

• Reasoning and Planning
– modeling the external world, given input
– solving new problems, planning, and making decisions
– ability to deal with unexpected problems, uncertainties

• Learning and Adaptation
– we are continuously learning and adapting
– our internal models are always being "updated"
• e.g., a baby learning to categorize and recognize animals
2. ARTIFICIAL INTELLIGENCE
It is the study of how to make computers do things at which, at the moment, people are better. The term AI is defined by each author in their own way; the definitions fall into 4 categories:
1. Systems that think like humans.
2. Systems that act like humans.
3. Systems that think rationally.
4. Systems that act rationally.
 Building systems that think like humans
"The exciting new effort to make computers think ... machines with minds, in the full
and literal sense" -- Haugeland, 1985
"The automation of activities that we associate with human thinking, ... such as
decision-making, problem solving, learning, ..." -- Bellman, 1978

 Building systems that act like humans
"The art of creating machines that perform functions that require intelligence when
performed by people" -- Kurzweil, 1990
"The study of how to make computers do things at which, at the moment, people are
better" -- Rich and Knight, 1991

 Building systems that think rationally
"The study of mental faculties through the use of computational models" -- Charniak
and McDermott, 1985
"The study of the computations that make it possible to perceive, reason, and act"
-- Winston, 1992

 Building systems that act rationally
"A field of study that seeks to explain and emulate intelligent behavior in terms of computational
processes" -- Schalkoff, 1990
"The branch of computer science that is concerned with the automation of intelligent behavior"
-- Luger and Stubblefield, 1993
Acting Humanly: The Turing Test Approach
 Test proposed by Alan Turing in 1950.
 The computer is asked questions by a human interrogator.
The computer passes the test if the human interrogator, after posing some written questions, cannot tell whether the written responses come from a person or not. To be programmed to pass the test, the computer needs to possess the following capabilities:
 Natural language processing to enable it to communicate successfully in English.
 Knowledge representation to store what it knows or hears.
 Automated reasoning to use the stored information to answer questions and to draw new conclusions.
 Machine learning to adapt to new circumstances and to detect and extrapolate patterns.
To pass the complete Turing Test, the computer will also need:
 Computer vision to perceive objects, and
 Robotics to manipulate objects and move about.
Thinking humanly: The cognitive modeling approach
We need to get inside the actual working of the human mind:
(a) Through introspection – trying to capture our own thoughts as they go by;
(b) Through psychological experiments.
Allen Newell and Herbert Simon, who developed GPS, the "General Problem Solver",
tried to compare the reasoning steps of their program to traces of human subjects
solving the same problems. The interdisciplinary field of cognitive science brings
together computer models from AI and experimental techniques from psychology to try
to construct precise and testable theories of the workings of the human mind.
Thinking rationally: The "laws of thought" approach
The Greek philosopher Aristotle was one of the first to attempt to codify "right
thinking", that is, irrefutable (i.e. impossible to deny) reasoning processes. His
syllogisms provided patterns for argument structures that always yielded correct
conclusions when given correct premises, for example: "Socrates is a man; all men
are mortal; therefore Socrates is mortal." These laws of thought were supposed to
govern the operation of the mind; their study initiated a field called logic.
Acting rationally: The rational agent approach
An agent is something that acts. Computer agents are not mere programs; they are
expected to have the following attributes as well: (a) operating under autonomous
control, (b) perceiving their environment, (c) persisting over a prolonged time
period, (d) adapting to change. A rational agent is one that acts so as to achieve
the best outcome.
3. HISTORY OF AI
• 1943: early beginnings
– McCulloch & Pitts: Boolean circuit model of the brain
• 1950: Turing
– Turing's "Computing Machinery and Intelligence"
• 1956: birth of AI
– Dartmouth meeting: the name "Artificial Intelligence" adopted
• 1950s: initial promise
– Early AI programs, including
– Samuel's checkers program
– Newell & Simon's Logic Theorist
• 1955-65: "great enthusiasm"
– Newell and Simon: GPS, the General Problem Solver
– Gelernter: Geometry Theorem Prover
– McCarthy: invention of LISP
• 1966-73: Reality dawns
– Realization that many AI problems are intractable
– Limitations of existing neural network methods identified
• Neural network research almost disappears
• 1969-85: Adding domain knowledge
– Development of knowledge-based systems
– Success of rule-based expert systems,
• e.g., DENDRAL, MYCIN
• But these were brittle and did not scale well in practice
• 1986--: Rise of machine learning
– Neural networks return to popularity
– Major advances in machine learning algorithms and applications
• 1990--: Role of uncertainty
– Bayesian networks as a knowledge representation framework
• 1995--: AI as Science
– Integration of learning, reasoning, knowledge representation
– AI methods used in vision, language, data mining, etc.
3.1 AI Technique
An AI technique is a method that exploits knowledge, which should be represented in such a way
that:
• The knowledge captures generalizations. In other words, it is not necessary to represent
each individual situation separately. Instead, situations that share important properties are
grouped together. If knowledge does not have this property, inordinate amounts of memory and
updating will be required, so we usually call something without this property "data" rather than
knowledge.
• It can be understood by the people who must provide it. Although for many programs the bulk
of the data can be acquired automatically (for example, by taking readings from a variety of
instruments), in many AI domains most of the knowledge a program has must ultimately be
provided by people in terms they understand.
• It can easily be modified to correct errors and to reflect changes in the world and in our
world view.
• It can be used in a great many situations even if it is not totally accurate or complete.
• It can be used to help overcome its own sheer bulk by helping to narrow the range of
possibilities that must usually be considered.
Although AI techniques must be designed in keeping with these constraints imposed by AI
problems, there is some degree of independence between problems and problem-solving
techniques. It is possible to solve AI problems without using AI techniques (although, as
suggested above, those solutions are not likely to be very good).
Tic-Tac-Toe
Solution 1
 Data Structure: a vector of nine elements representing the board.
 Elements of the vector:
 0: Empty
 1: X
 2: O
 The vector is a ternary number.
 Store inside the program a move-table (lookup table):
 Number of elements in the table: 19683 (= 3^9)
 Element = a vector which describes the most suitable move from the current game-board
Algorithm
1. View the vector as a ternary number. Convert it to a decimal number.
2. Use the computed number as an index into the move-table and access the vector stored there.
3. Set the new board to that vector.
Comments
1. A lot of space is needed to store the move-table.
2. A lot of work is needed to specify all the entries in the move-table.
3. Difficult to extend.
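The board-to-index conversion in step 1 of the algorithm can be sketched as follows (the function name is illustrative, not from the notes):

```python
# Sketch of Solution 1's board-to-index step: the nine-element board vector
# (0 = empty, 1 = X, 2 = O) is read as a ternary number, most significant
# digit first, giving an index into the 19683-entry move-table.

def board_to_index(board):
    """Convert the board vector, viewed as a ternary number, to decimal."""
    index = 0
    for cell in board:
        index = index * 3 + cell
    return index

# An empty board maps to 0; a board of all O's maps to 3**9 - 1 = 19682.
print(board_to_index([0] * 9))   # 0
print(board_to_index([2] * 9))   # 19682

# The move-table would then be a list of 19683 board vectors:
# next_board = move_table[board_to_index(board)]
```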
Solution 2
Data Structure
 Use a vector, called board, as in Solution 1.
 However, the elements of the vector are:
 2: Empty
 3: X
 5: O
 Turn of move: indexed by integer
 1, 2, 3, etc.
Function Library:
1. Make2:
 Returns a location on the game-board.
IF (board[5] = 2)
RETURN 5; // the center cell is empty
ELSE
RETURN any cell that is not at the board's corner;
// (cells: 2, 4, 6, 8)

 Let P represent X or O.

 can_win(P):
 P has already filled at least two cells on a straight line (horizontal, vertical, or
diagonal)
 cannot_win(P) = NOT (can_win(P))
2. Posswin(P):
IF (cannot_win(P))
RETURN 0;
ELSE
RETURN the index of the empty cell on the line of can_win(P)
 Let odd numbers be turns of X
 Let even numbers be turns of O
3. Go(n): make a move.
IF odd(turn) THEN // for X
Board[n] = 3
ELSE // for O
Board[n] = 5
turn = turn + 1
Algorithm:
1. Turn = 1: (X moves)
Go(1) // make a move at the left-top cell
2. Turn = 2: (O moves)
IF board[5] is empty THEN
Go(5)
ELSE
Go(1)
3. Turn = 3: (X moves)
IF board[9] is empty THEN
Go(9)
ELSE
Go(3).
4. Turn = 4: (O moves)
IF Posswin(X) <> 0 THEN
Go(Posswin(X))
// Prevent the opponent from winning
ELSE Go(Make2)
5. Turn = 5: (X moves)
IF Posswin(X) <> 0 THEN
Go(Posswin(X))
// Win for X.
ELSE IF Posswin(O) <> 0 THEN
Go(Posswin(O))
// Prevent the opponent from winning
ELSE IF board[7] is empty THEN
Go(7)
ELSE Go(3).
Comments:
1. Not efficient in time, as it has to check several conditions before making each move.
2. Easier to understand the program's strategy.
3. Hard to generalize.
4. Checking for a possible win is quicker.
5. Humans find the row-scan approach easier, while the computer finds the number-counting
approach more efficient.
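Posswin's "number-counting" trick can be sketched with the 2/3/5 encoding: with 2 = empty, 3 = X, 5 = O, a line holds two P's and a blank exactly when the product of its three cells is P × P × 2. This is a sketch, not the notes' exact code; the line list and names are one reasonable choice:

```python
# Sketch of Posswin(P) using the 2 = empty, 3 = X, 5 = O cell encoding.
# Board cells are numbered 1..9 as in the notes; board[0] is unused.

LINES = [(1, 2, 3), (4, 5, 6), (7, 8, 9),   # rows
         (1, 4, 7), (2, 5, 8), (3, 6, 9),   # columns
         (1, 5, 9), (3, 5, 7)]              # diagonals

def posswin(board, p):
    """Return the cell where P can complete a line, or 0 if none.

    A line of two P's and a blank has cell product p * p * 2,
    which is why the prime encoding makes the check a single multiply."""
    target = p * p * 2
    for a, b, c in LINES:
        if board[a] * board[b] * board[c] == target:
            for cell in (a, b, c):
                if board[cell] == 2:     # the empty cell on that line
                    return cell
    return 0

board = [0] + [2] * 9          # empty board
board[1] = board[2] = 3        # X occupies cells 1 and 2
print(posswin(board, 3))       # 3: X can win by taking cell 3
print(posswin(board, 5))       # 0: O has no winning cell
```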

Solution 3
Data Structure
1. Game-board: use a vector as described for the above program.
2. List:
 Contains the possible game-boards generated from the current game-board
 Each game-board is augmented with a score indicating the likelihood of victory for the
current turn

Algorithm:
1. If it is a win, give it the highest rating.
2. Otherwise, consider all the moves the opponent could make next. Assume the opponent will
make the move that is worst for us. Assign the rating of that move to the current node.
3. The best node is then the one with the highest rating.
Comments:
1. Requires much more time, to consider all possible moves.
2. Could be extended to handle more complicated games.
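The rating rule above is the minimax idea. A minimal, game-independent sketch (the helpers `is_win`, `moves`, and `rate`, and the score values, are illustrative assumptions, not from the notes):

```python
# Minimal minimax sketch of Solution 3's rating rule: a win gets the highest
# rating; otherwise assume the opponent picks the move worst for us.

def minimax(state, our_turn, is_win, moves, rate):
    """Rate `state`; `our_turn` alternates as the game tree is descended."""
    if is_win(state):
        return 100 if our_turn else -100   # win scored from our point of view
    successors = moves(state)
    if not successors:
        return rate(state)                 # leaf / draw: use a static score
    scores = [minimax(s, not our_turn, is_win, moves, rate)
              for s in successors]
    # On our turn take the best score; on the opponent's, the worst for us.
    return max(scores) if our_turn else min(scores)

# Tiny illustration: a one-track "game" where state 2 is the winning state.
print(minimax(0, True,
              is_win=lambda s: s == 2,
              moves=lambda s: [s + 1] if s < 2 else [],
              rate=lambda s: 0))   # 100
```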

2. FORMULATING PROBLEMS
Problem formulation is the process of deciding what actions and states to consider, given
a goal. Formulate goal, formulate problem

Search

Execute
WELL-DEFINED PROBLEMS AND SOLUTIONS
A problem can be defined formally by four components:
1. Initial state
2. Successor function
3. Goal test
4. Path cost
1. Initial State
The starting state, which the agent knows itself to be in.
2. Successor Function
 A description of the possible actions available to the agent.
 Given a state x, SUCCESSOR-FN(x) returns a set of <action, successor> ordered pairs, where
each action is a legal action in state x and each successor is a state that can be
reached from x by applying that action.
State Space
The set of all states reachable from the initial state by any sequence of
actions. The initial state and successor function together define the state space. The state
space forms a graph in which the nodes are states and the arcs between nodes are actions.
Path
A path in the state space is a sequence of states connected by a sequence of actions.
3. Goal Test
A test to determine whether a given state is a goal state. If there is an explicit set of possible
goal states, we can check whether any one of them has been reached.
Example: In chess, the goal is to reach a state called "checkmate", where the opponent's king is
under attack and cannot escape.
4. Path Cost
A function that assigns a numeric cost to each path. The cost of a path can be described as the
sum of the costs of the individual actions along that path.
The step cost of taking an action 'a' to go from state 'x' to state 'y' is denoted by c(x, a, y),
where c is the cost and x, y are states. Step costs are non-negative.
These 4 elements are gathered into a data structure that is given as input to a problem-solving
algorithm. Solution quality is measured by the path cost function. An optimal solution has the
lowest path cost among all solutions.
Total cost = Path cost + Search cost
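The four components can be gathered into a data structure as described; a minimal sketch (class and attribute names are illustrative, and the sample step costs follow the route-finding figure below):

```python
# Sketch of the four-component problem definition given as input to a
# problem-solving algorithm.

class Problem:
    def __init__(self, initial_state, successor_fn, goal_test, step_cost):
        self.initial_state = initial_state
        self.successor_fn = successor_fn   # state -> [(action, next_state)]
        self.goal_test = goal_test         # state -> bool
        self.step_cost = step_cost         # c(x, a, y) -> non-negative number

    def path_cost(self, path):
        """Sum the step costs along a path of (state, action, next_state) triples."""
        return sum(self.step_cost(x, a, y) for (x, a, y) in path)

# Example: two road segments costing 100 and 66 kilometers.
p = Problem("Coimbatore",
            successor_fn=lambda s: [],
            goal_test=lambda s: s == "Chennai",
            step_cost=lambda x, a, y: {"Erode": 100, "Salem": 66}[y])
path = [("Coimbatore", "Go(Erode)", "Erode"),
        ("Erode", "Go(Salem)", "Salem")]
print(p.path_cost(path))   # 166
```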
Example: Route-finding problem

[Fig. 1: Route-finding problem. A road map of cities (Coimbatore, Pollachi, Palani, Palladam,
Mettupalayam, Kangayam, Karur, Erode, Salem, Trichy, Dindigul, Vellore, Chennai) with road
distances marked; the agent starts in Coimbatore and the goal is Chennai.]

Initial State: In(Coimbatore)
Successor Function: {<Go(Pollachi), In(Pollachi)>,
<Go(Erode), In(Erode)>,
<Go(Palladam), In(Palladam)>,
<Go(Mettupalayam), In(Mettupalayam)>}
Goal Test: In(Chennai)
Path Cost: e.g., c(In(Coimbatore), Go(Erode), In(Erode)) = 100 [kilometers]
Path cost of the route through Erode = 100 + 66 + 200 + 140 = 506

TOY PROBLEMS
Example 1: Vacuum World
Problem Formulation
• States
– 2 × 2² = 8 states (two agent locations × 2² dirt configurations)
– Formula: n · 2ⁿ states for n locations
• Initial State
– Any one of the 8 states
• Successor Function
– Legal states that result from the three actions (Left, Right, Suck)
• Goal Test
– All squares are clean
• Path Cost
– Number of steps (each step costs 1)

Fig 1.2: Vacuum World
Fig 1.3: State space for the vacuum world.
Labels on arcs denote L: Left, R: Right, S: Suck

Example 2: Playing chess
Initial State: Described as an 8 x 8 array where each position contains a symbol standing for
the appropriate piece in the standard initial chess position.
Successor function: The legal states that result from a set of rules.
The moves can be described easily as a set of rules consisting of two parts: a left side that
serves as a pattern to be matched against the current board position, and a right side that
describes the change to be made to the board position to reflect the move. An example is shown
in the following figure.

Fig 1.4: A legal-move rule for chess

However, if we write rules like the one above, we have to write a very large number of them,
since there has to be a separate rule for each of the roughly 10^120 possible board positions.
Practical difficulties in implementing a large number of rules:
1. It would take too long to write them all, and they could not be written without
mistakes.
2. No program could easily handle all those rules, and storing them poses serious difficulties.
To minimize such problems, we should write rules describing the legal moves in as general a way
as possible. The following is one way to describe a chess move:

Current Position
While pawn at Square(e, 2), AND Square(e, 3) is empty, AND Square(e, 4) is empty.
Changing Board Position
Move pawn from Square(e, 2) to Square(e, 4).

Goal Test
Any position in which the opponent does not have a legal move and his or her king is under
attack.
These examples illustrate some of the problems that fall within the scope of AI and the kinds
of techniques that are useful for solving them.

Example 3: Water Jug Problem
You are given two jugs, a 4-gallon one and a 3-gallon one; a pump which has unlimited water
which you can use to fill the jugs; and the ground, on which water may be poured. Neither jug
has any measuring markings on it. How can you get exactly 2 gallons of water into the 4-gallon
jug?
State: (x, y), where x = 0, 1, 2, 3, or 4 and y = 0, 1, 2, 3.
x represents the quantity of water in the 4-gallon jug and y the quantity in the 3-gallon jug.
• Start state: (0, 0).
• Goal state: (2, n) for any n. (The problem doesn't specify the quantity of water in the
3-gallon jug.)

1. (x, y) → (4, y)            if x < 4            Fill the 4-gallon jug
2. (x, y) → (x, 3)            if y < 3            Fill the 3-gallon jug
3. (x, y) → (x − d, y)        if x > 0            Pour some water out of the 4-gallon jug
4. (x, y) → (x, y − d)        if y > 0            Pour some water out of the 3-gallon jug
5. (x, y) → (0, y)            if x > 0            Empty the 4-gallon jug on the ground
6. (x, y) → (x, 0)            if y > 0            Empty the 3-gallon jug on the ground
7. (x, y) → (4, y − (4 − x))  if x + y ≥ 4, y > 0 Pour water from the 3-gallon jug into the
                                                  4-gallon jug until the 4-gallon jug is full
8. (x, y) → (x − (3 − y), 3)  if x + y ≥ 3, x > 0 Pour water from the 4-gallon jug into the
                                                  3-gallon jug until the 3-gallon jug is full
9. (x, y) → (x + y, 0)        if x + y ≤ 4, y > 0 Pour all the water from the 3-gallon jug
                                                  into the 4-gallon jug
10. (x, y) → (0, x + y)       if x + y ≤ 3, x > 0 Pour all the water from the 4-gallon jug
                                                  into the 3-gallon jug
11. (0, 2) → (2, 0)                               Pour the 2 gallons from the 3-gallon jug
                                                  into the 4-gallon jug
12. (2, y) → (0, y)                               Empty the 2 gallons in the 4-gallon jug
                                                  on the ground

Production rules for the water jug problem


Trace of steps involved in solving the water jug problem
First solution
Step    Rule applied                                       4-g jug   3-g jug
1       Initial state                                      0         0
2       R2 {Fill 3-g jug}                                  0         3
3       R9 {Pour all water from 3-g to 4-g jug}            3         0
4       R2 {Fill 3-g jug}                                  3         3
5       R7 {Pour from 3-g into 4-g jug until it is full}   4         2
6       R5 {Empty 4-g jug}                                 0         2
7       R9 {Pour all water from 3-g to 4-g jug}            2         0
Goal State
Second Solution
Step    Rule applied                                       4-g jug   3-g jug
1       Initial state                                      0         0
2       R1 {Fill 4-g jug}                                  4         0
3       R8 {Pour from 4-g into 3-g jug until it is full}   1         3
4       R6 {Empty 3-g jug}                                 1         0
5       R10 {Pour all water from 4-g into 3-g jug}         0         1
6       R1 {Fill 4-g jug}                                  4         1
7       R8 {Pour from 4-g into 3-g jug until it is full}   2         3
8       R6 {Empty 3-g jug}                                 2         0
Goal State
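Such a trace can be found mechanically by searching the state space. A compact sketch using breadth-first search over the deterministic rules (rules 3 and 4, which pour out an arbitrary amount d, are omitted since they never help reach the goal; function names are illustrative):

```python
# Breadth-first search over the water jug state space (x, y):
# x = gallons in the 4-gallon jug, y = gallons in the 3-gallon jug.
from collections import deque

def successors(x, y):
    states = [(4, y), (x, 3), (0, y), (x, 0)]          # R1, R2, R5, R6
    states.append((min(4, x + y), max(0, x + y - 4)))  # R7/R9: pour 3 -> 4
    states.append((max(0, x + y - 3), min(3, x + y)))  # R8/R10: pour 4 -> 3
    return [(a, b) for (a, b) in states if (a, b) != (x, y)]

def solve(start=(0, 0)):
    frontier = deque([[start]])          # queue of paths, shortest first
    visited = {start}
    while frontier:
        path = frontier.popleft()
        x, y = path[-1]
        if x == 2:                       # goal: 2 gallons in the 4-gallon jug
            return path
        for s in successors(x, y):
            if s not in visited:
                visited.add(s)
                frontier.append(path + [s])

# Prints a shortest sequence of states from (0, 0) to 2 gallons in the 4-g jug.
print(solve())
```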

Example 5: 8-puzzle Problem
The 8-puzzle consists of a 3 x 3 board with eight numbered tiles and a blank space. A
tile adjacent to the blank space can slide into the space. The object is to reach a specified
goal state.
States: A state description specifies the location of each of the eight tiles and the blank in
one of the nine squares.
Initial state: Any state can be designated as the initial state.
Successor function: This generates the legal states that result from trying the four actions
(blank moves Left, Right, Up, or Down).
Goal test: This checks whether the state matches the goal configuration. (Other goal
configurations are possible.)
Path cost: Each step costs 1, so the path cost is the number of steps in the path.

Fig 1.5: 8-puzzle problem
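The successor function above can be sketched directly; here the state is assumed to be a 9-tuple read row by row, with 0 standing for the blank:

```python
# Sketch of the 8-puzzle successor function. A state is a 9-tuple read
# row by row; 0 marks the blank square.

def successors(state):
    """Return the states reached by moving the blank Left/Right/Up/Down."""
    i = state.index(0)                 # position of the blank
    row, col = divmod(i, 3)
    result = []
    for drow, dcol in [(0, -1), (0, 1), (-1, 0), (1, 0)]:   # L, R, U, D
        r, c = row + drow, col + dcol
        if 0 <= r < 3 and 0 <= c < 3:  # the move must stay on the board
            j = 3 * r + c
            nxt = list(state)
            nxt[i], nxt[j] = nxt[j], nxt[i]   # swap blank with the neighbor
            result.append(tuple(nxt))
    return result

# The blank in a corner has 2 legal moves; in the center it has 4.
print(len(successors((0, 1, 2, 3, 4, 5, 6, 7, 8))))   # 2
print(len(successors((1, 2, 3, 4, 0, 5, 6, 7, 8))))   # 4
```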


Example 6: 8-queens problem
The goal of the 8-queens problem is to place eight queens on a chessboard such that no queen
attacks any other. (A queen attacks any piece in the same row, column or diagonal.)
States: Any arrangement of 0 to 8 queens on the board is a state.
Initial state: No queens on the board.
Successor function: Add a queen to any empty square.
Goal test: 8 queens are on the board, none attacked.
Path cost: Zero (only the search cost matters).

Fig 1.6: A solution to the 8-queens problem


PRODUCTION SYSTEMS
A production system is a mechanism that describes and performs the search process.
A production system consists of four basic components:
1. A set of rules of the form Ci → Ai, where Ci is the condition part and Ai is the action
part. The condition determines when a given rule is applied, and the action determines what
happens when it is applied.
(i.e., a set of rules, each consisting of a left side (a pattern) that determines the
applicability of the rule and a right side that describes the operations to be performed if
the rule is applied)
2. One or more knowledge databases that contain whatever information is relevant for the
given problem. Some parts of the database may be permanent, while others may be
temporary and only exist during the solution of the current problem. The information in
the databases may be structured in any appropriate manner.
3. A control strategy that determines the order in which the rules are applied to the database,
and provides a way of resolving any conflicts that arise when several rules match at once.
4. A rule applier, the computational system that implements the control strategy and applies
the rules.

In order to solve a problem:

 We must first reduce it to one for which a precise statement can be given. This can be
done by defining the problem's state space (start and goal states) and a set of operators for
moving within that space.
 The problem can then be solved by searching for a path through the space from the initial
state to a goal state.
 The process of solving the problem can be usefully modeled as a production system.
Control strategies
A control strategy decides which rule to apply next during the process of searching for a
solution to a problem.
The two requirements of a good control strategy are:
 It should cause motion: consider the water jug problem; if we implement a control strategy
of starting each time at the top of the list of rules, it will never lead to a solution. So we
need a control strategy that leads towards a solution.
 It should be systematic: choosing at random from among the applicable rules is better
than the first strategy, since it causes motion and will eventually lead to a solution.
However, it is not systematic, and it may explore useless sequences of operators several
times before finding the final solution.
A systematic control strategy for the water jug problem
Breadth-First Search (Blind Search)
Let us discuss these strategies using the water jug problem. They may be applied to any search
problem.
 Construct a tree with the initial state as its root.
 Generate all the offspring of the root by applying each of the applicable rules to the
initial state.
(0, 0)

(4, 0) (0, 3)
 Now for each leaf node, generate all its successors by applying all the rules that are
appropriate.
(0, 0)

(4, 0) (0, 3)

(4,3) (0,0) (1,3) (4,3) (0,0) (3,0)


 Continue this process until some rule produces a goal state.

Algorithm
1. Create a variable called NODE-LIST and set it to the initial state.
2. Until a goal state is found or NODE-LIST is empty do:
a. Remove the first element from NODE-LIST and call it E. If NODE-LIST was empty,
quit.
b. For each way that each rule can match the state described in E do:
i. Apply the rule to generate a new state.
ii. If the new state is a goal state, quit and return this state.
iii. Otherwise, add the new state to the end of NODE-LIST.
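The NODE-LIST algorithm above can be sketched generically; `successors` stands in for "each way that each rule can match", and `is_goal` for the goal test (both names are illustrative):

```python
# The NODE-LIST breadth-first algorithm, line for line: a FIFO queue of
# states, expanded from the front, with new states appended at the end.
from collections import deque

def breadth_first_search(initial_state, successors, is_goal):
    node_list = deque([initial_state])     # 1. NODE-LIST holds the initial state
    while node_list:                       # 2. until goal found or NODE-LIST empty
        e = node_list.popleft()            # 2a. remove the first element, call it E
        for new_state in successors(e):    # 2b. each rule that matches E
            if is_goal(new_state):         # 2b-ii. goal: quit and return it
                return new_state
            node_list.append(new_state)    # 2b-iii. else add to end of NODE-LIST
    return None

# Tiny illustration: counting from 0 to 5 by +1 / +2 steps.
print(breadth_first_search(0, lambda n: [n + 1, n + 2], lambda n: n == 5))  # 5
```

Note that, like the algorithm in the notes, this sketch tests states as they are generated rather than as they are removed from NODE-LIST.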
Depth-First Search
Algorithm
1. If the initial state is a goal state, quit and return success.

2. Otherwise, do the following until success or failure is signaled:

a. Generate a successor, E, of the initial state. If there are no more successors,
signal failure.

b. Call depth-first search with E as the initial state.
c. If success is returned, signal success. Otherwise continue in this loop.
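The recursive procedure above can be sketched as follows; the depth limit is an addition not in the notes, included so the recursion cannot run forever down an unbounded path:

```python
# Recursive depth-first search following the three-step algorithm above,
# returning the path of states on success and None on failure.

def depth_first_search(state, successors, is_goal, limit=20):
    if is_goal(state):                     # 1. initial state is a goal: success
        return [state]
    if limit == 0:
        return None                        # assumed depth bound: give up here
    for succ in successors(state):         # 2a. generate a successor E
        path = depth_first_search(succ, successors, is_goal, limit - 1)  # 2b
        if path is not None:               # 2c. success returned: propagate it
            return [state] + path
    return None                            # no more successors: signal failure

# Same counting illustration as for BFS; DFS happens to find the +1 chain.
print(depth_first_search(0, lambda n: [n + 1, n + 2], lambda n: n == 5))
# [0, 1, 2, 3, 4, 5]
```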
Backtracking
 In this search, we pursue a single branch of the tree until it yields a solution or until a
decision to terminate the path is made.
 It makes sense to terminate a path if it reaches a dead-end or produces a previous state.
In such a state, backtracking occurs.
 Chronological backtracking: the order in which steps are undone depends only on the
temporal sequence in which the steps were initially made.
 Specifically, the most recent step is always the first to be undone. This is also called
simple backtracking.
(0, 0)
(4, 0)

(4, 3)
Advantages of Depth-First Search
 DFS requires less memory since only the nodes on the current path are stored.
 By chance, DFS may find a solution without examining much of the search space at all.
Advantages of Breadth-First Search
 BFS cannot be trapped exploring a blind alley.
 If there is a solution, BFS is guaranteed to find it.
 If there are multiple solutions, then a minimal solution will be found.
Traveling Salesman Problem (with 5 cities):
A salesman is supposed to visit each of 5 cities as shown below. There is a road between each
pair of cities and the distance is given next to the road. The start city is A. The problem is
to find the shortest route so that the salesman visits each of the cities only once and returns
to A.

Fig: Travelling Salesman Problem
• A simple, motion-causing and systematic control structure could, in principle, solve this
problem.
• Explore the search tree of all possible paths and return the shortest path.
• This will require 4! paths to be examined.
• If the number of cities grows, say to 25 cities, then the time required for the salesman to
get the information about the shortest path is O(24!), which is not practical.
• This phenomenon is called combinatorial explosion.
• We can improve the above strategy as follows.
Branch and Bound
 Begin generating complete paths, keeping track of the shortest path found so far.
 Give up exploring any path as soon as its partial length becomes greater than the shortest
path found so far.

 This algorithm is more efficient than the first one, but still requires exponential time,
proportional to some number raised to the power N.
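The two branch-and-bound steps above can be sketched for a small TSP. The distances here are made-up placeholders (the notes' figure gives its own values), and the names are illustrative:

```python
# Branch-and-bound sketch for a 5-city TSP starting and ending at A.
# DIST holds symmetric road distances; the values are illustrative only.
import itertools

DIST = {frozenset(p): d for p, d in [
    (("A", "B"), 7), (("A", "C"), 6), (("A", "D"), 10), (("A", "E"), 13),
    (("B", "C"), 7), (("B", "D"), 10), (("B", "E"), 10),
    (("C", "D"), 5), (("C", "E"), 9), (("D", "E"), 6)]}

def tour_search(cities, start="A"):
    best = [float("inf"), None]             # shortest complete tour found so far

    def extend(path, length):
        if length >= best[0]:               # bound: partial path already too long
            return
        if len(path) == len(cities):        # complete path: close the tour
            total = length + DIST[frozenset((path[-1], start))]
            if total < best[0]:
                best[:] = [total, path + [start]]
            return
        for city in cities:                 # branch: try each unvisited city
            if city not in path:
                extend(path + [city], length + DIST[frozenset((path[-1], city))])

    extend([start], 0)
    return best                             # [tour length, tour as a city list]

print(tour_search(["A", "B", "C", "D", "E"]))
```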
Heuristic Search
• Heuristics are criteria for deciding which among several alternatives will be the most
effective in order to achieve some goal.
• A heuristic is a technique that improves the efficiency of a search process, possibly by
sacrificing claims of systematicity and completeness. It no longer guarantees to find the
best answer, but almost always finds a very good answer.
• Using good heuristics, we can hope to get good solutions to hard problems (such as the
travelling salesman) in less than exponential time.
• There are general-purpose heuristics that are useful in a wide variety of problem domains.
• We can also construct special-purpose heuristics, which are domain specific.
General-Purpose Heuristics
• A general-purpose heuristic for combinatorial problems is the nearest neighbor
algorithm, which works by selecting the locally superior alternative at each step.
• For such algorithms, it is often possible to prove an upper bound on the error, which
provides reassurance that one is not paying too high a price in accuracy for speed.
• In many AI problems, it is often hard to measure precisely the goodness of a particular
solution.
• For real-world problems, it is often useful to introduce heuristics based on relatively
unstructured knowledge. It is impossible to define this knowledge in such a way that a
mathematical analysis can be performed.
• In AI approaches, the behavior of algorithms is analyzed by running them on a computer,
in contrast to analyzing the algorithms mathematically.
• There are at least three reasons for the ad hoc approaches in AI:
 It is a lot more fun to see a program do something intelligent than to prove it.
 AI problem domains are usually complex, so it is generally not possible to produce an
analytical proof that a procedure will work.
 It is often not even possible to describe the range of problems well enough to make a
statistical analysis of program behavior meaningful.
• But it is still important to keep the performance question in mind while designing
algorithms.
• One of the most important analyses of the search process is straightforward: the number of
nodes in a complete search tree of depth D and branching factor F is on the order of F^D.
• This simple analysis motivates us to:
 Look for improvements on the exhaustive search.
 Find an upper bound on the search time which can be compared with exhaustive search
procedures.
PROBLEM CHARACTERISTICS
Heuristic search is a very general method applicable to a large class of problems. In order to
choose the most appropriate method (or combination of methods) for a particular problem, it is
necessary to analyze the problem along several key dimensions:
Is the problem decomposable into a set of independent smaller subproblems?
Example: Suppose we want to solve the problem of computing the integral of the following
expression: ∫(x² + 3x + sin²x · cos²x) dx
Fig 1.7: Decomposition of the problem
• We can solve this problem by breaking it down into three smaller subproblems, each of which
can then be solved using a small collection of specific rules.
• Decomposable problems can be solved by the divide-and-conquer technique.

• Uses of decomposing problems:
- Each subproblem is simpler to solve.
- Each subproblem can be handed over to a different processor, and thus can be solved in a
parallel processing environment.
• There are non-decomposable problems. For example, the blocks world problem is
non-decomposable.
Can solution steps be ignored or at least undone if they prove to be unwise?
• In real life, there are three types of problems: ignorable, recoverable and irrecoverable.
• Let us explain each of these through examples.

Example 1 (Ignorable): Theorem proving (solution steps can be ignored)
• Suppose we have proved some lemma in order to prove a theorem and eventually realize that
the lemma is no help at all; we can then ignore it and prove another lemma.
• Such problems can be solved using a simple control strategy.
Example 2 (Recoverable): 8-puzzle (solution steps can be undone)
• 8-puzzle: the objective is to rearrange a given initial configuration of eight numbered tiles
on a 3 x 3 board (one place is empty) into a given final configuration (goal state).
• Rearrangement is done by sliding one of the tiles into the empty square.
• Solved by backtracking, so the control strategy must be implemented using a push-down stack.
Example 3 (Irrecoverable): Chess (solution steps cannot be undone)
• A stupid move cannot be undone.
• Can be solved by a planning process.
Is the knowledge base consistent?
Example: Inconsistent knowledge:
Target problem: A man is standing 150 ft from a target. He plans to hit the target by shooting
a gun that fires bullets with a velocity of 1500 ft/sec. How high above the target should he
aim?
Solution:
• The velocity of the bullet is 1500 ft/sec, i.e., the bullet takes 0.1 sec to reach the target.
• Assume the bullet travels in a straight line.
• Due to gravity, the bullet falls a distance of (1/2)gt² = (1/2)(32)(0.1)² = 0.16 ft.
• So if the man aims 0.16 feet above the target, then the bullet will hit the target.
• But now there is a contradiction with the assumption that the bullet travels in a straight
line, because the bullet will actually travel in an arc. Therefore there is an inconsistency
in the knowledge used.

What is the role of knowledge?
• In the game of chess, knowledge is important to constrain the search for a solution;
otherwise, just the rules for determining legal moves and a simple control mechanism that
implements an appropriate search procedure would be required.
• In scanning newspapers to decide some facts, a lot of knowledge is required even to be able
to recognize a solution.
Is a good solution absolute or relative?
• In the water jug problem there are two ways to solve the problem. If we follow one path
successfully to the solution, there is no reason to go back and see if some other path might
also lead to a solution. Here a solution is absolute.
• In the travelling salesman problem, our goal is to find the shortest route. Unless all
routes are known, the shortest is difficult to know. This is a best-path problem, whereas the
water jug problem is an any-path problem.
• Any-path problems can often be solved in a reasonable amount of time using heuristics that
suggest good paths to explore.
• Best-path problems are in general computationally harder than any-path problems.
Does the task require interaction with a person?
• Solitary problems, in which there is no intermediate communication and no demand for an
explanation of the reasoning process.
• Conversational problems, in which intermediate communication provides either additional
assistance to the computer or additional information to the user.
Problem classification
• There is a variety of problem-solving methods, but there is no one single way of solving all
problems.
• Not all new problems should be considered totally new. Solutions to similar problems can be
exploited.
PRODUCTIONSYSTEM CHARACTERISTICS
Productionsystemsare importantinbuilding intelligentmatcheswhichcanprovideusa good set
of production rules, for solving problems.
Therearefourtypesofproductionsystemcharacteristics,namely
1. Monotonicproductionsystem
2. Non-monotonicproductionsystem
3. Commutativelawbasedproductionsystem,andlastly
4. Partiallycommutativelawbased production system
1. Monotonic Production System (MPS): A monotonic production system is a system in which the application of a rule never prevents the later application of another rule that could also have been applied at the time the first rule was selected.
2. Non-monotonic Production System (NMPS): A non-monotonic production system is a system in which the application of a rule may prevent the later application of another rule that could have been applied at the time the first rule was selected, i.e. a system in which the monotonic property above does not hold.
3. Commutative Production System (CPS): A commutative production system is a system that is both monotonic and partially commutative.
4. Partially Commutative Production System (PCPS): A partially commutative production system is a system with the property that if the application of a particular sequence of rules transforms state x into state y, then any allowable permutation of those rules also transforms state x into state y.
Characteristics                Monotonic             Non-monotonic
Partially commutative          Theorem proving       Robot navigation
Not partially commutative      Chemical synthesis    Bridge game
Table 1.1 The Four Categories of Production Systems
Well, questions may arise here such as:
- Can production systems be described by a set of characteristics?
- Can we draw a relationship between problem types and the types of production systems suited to solve those problems? Yes, we can, by using the above rules.
PROBLEM SOLVING METHODS, HEURISTIC SEARCH TECHNIQUES
Search techniques are general problem-solving methods. When there is a formulated search problem, with a set of search states, a set of operators, an initial state and a goal criterion, we can use search techniques to solve the problem.
Matching:
Problem solving can be done through search. Search involves choosing, among the rules that can be applied at a particular point, the ones that are most likely to lead to a solution. This can be done by extracting rules from a large collection.
How do we extract, from the entire collection of rules, those that can be applied at a given point?
 This can be done by matching between the current state and the preconditions of the rules.
One way to select applicable rules is to do a simple search through all the rules, comparing each one's preconditions to the current state and extracting the ones that match. But there are two problems with this simple solution:
1. In big problems a large number of rules are used. Scanning through all of these rules at every step of the search would be hopelessly inefficient.
2. It is not always obvious which rules' preconditions will be satisfied.
Some of the matching techniques are described below:
Indexing: To overcome the above problems, indexing is used. Instead of searching all the rules, the current state is used as an index into the rules, so the matching rules are selected immediately. For example, consider chess game playing. Here the set of valid moves is very large. To reduce the size of this set, only useful moves are identified. At the time of playing the game, the next move will very much depend upon the current move. As the game goes on there will be only a few moves which are applicable as the next move. Hence it would be wasted effort to check the applicability of all moves. Rather, the important and valid legal moves are directly stored as rules, and through indexing the applicable rules are found. Here, the index will be the current board position. Indexing makes the matching process easy, at the cost of a lack of generality in the statement of rules. Practically there is a trade-off between the ease of writing rules and the simplicity of the matching process. The indexing technique is not very well suited for rule bases where rules are written with high-level predicates. In PROLOG and many theorem-proving systems, rules are indexed by the predicates they contain. Hence all the applicable rules can be found quickly.
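As a concrete illustration of indexing (a minimal sketch with made-up rule names, not from the text), rules can be stored in a dictionary keyed by the state feature they apply to, so only candidate rules are ever examined instead of the whole collection:

```python
# Hypothetical rule base: each rule has a precondition and an action.
rules = [
    {"pre": "at_door", "action": "open_door"},
    {"pre": "at_door", "action": "knock"},
    {"pre": "hungry",  "action": "eat"},
]

# Build the index once: precondition -> list of rules with that precondition.
index = {}
for rule in rules:
    index.setdefault(rule["pre"], []).append(rule)

def applicable(state):
    """Look up matching rules directly instead of scanning all of them."""
    return index.get(state, [])

print([r["action"] for r in applicable("at_door")])  # ['open_door', 'knock']
```

The dictionary lookup replaces a linear scan over the rule base, which is exactly the efficiency gain indexing buys at the cost of requiring exact state keys.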
Example:
Fig 1.8 One Legal Chess Move
Matching with variables: If the preconditions in the rule base are not stated as exact descriptions of particular situations, the indexing technique does not work well. In certain situations they describe properties that the situation must have. In situations where a single condition is matched against a single element in the state description, the unification procedure can be used. However, in practical situations it is required to match the complete set of rules that match the current state. In forward and backward chaining systems, the depth-first search technique is used to select an individual rule. In situations where multiple rules are applicable, a conflict resolution technique is used to choose the appropriate applicable rule. In situations requiring many-to-many matching, unification can be applied recursively, but a more efficient method is to use the RETE matching algorithm.
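The single-condition case mentioned above can be sketched as a simplified unifier for flat patterns, where variables are strings prefixed with "?" (this representation is an illustrative assumption; it is not the RETE algorithm and handles no nested terms):

```python
def unify(pattern, fact, bindings=None):
    """Match one rule precondition (a tuple with '?x'-style variables)
    against one fact; return the extended bindings, or None on failure."""
    bindings = dict(bindings or {})
    if len(pattern) != len(fact):
        return None
    for p, f in zip(pattern, fact):
        if isinstance(p, str) and p.startswith("?"):
            if p in bindings and bindings[p] != f:
                return None          # variable already bound to something else
            bindings[p] = f          # bind the variable to this constant
        elif p != f:
            return None              # constants must match exactly
    return bindings

print(unify(("on", "?x", "table"), ("on", "blockA", "table")))  # {'?x': 'blockA'}
```

Returning the bindings (rather than just True) lets the match be extended across several preconditions of the same rule, which is what recursive application of unification refers to.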
Complex and approximate matching: A more complex matching process is required when the preconditions of a rule specify required properties that are not stated explicitly in the description of the current state. Moreover, the real world is full of uncertainties and sometimes it is practically impossible to define a rule in exact fashion. The matching process becomes more complicated in situations where preconditions only approximately match the current situation. For example, a speech understanding program must contain rules that map from a description of a physical waveform to phones. Because of the presence of noise the signal is so variable that there will be only an approximate match between the rules that describe an ideal sound and the input that describes the unideal world. Approximate matching is particularly difficult to deal with, because as we increase the tolerance allowed in the match, new rules need to be written, which increases the number of rules and the size of the main search process. But approximate matching is nevertheless superior to exact matching in situations such as speech understanding, where exact matching may result in no rule being matched and the search process coming to a grinding halt.
HEURISTIC SEARCH TECHNIQUES
Hill Climbing
 Hill climbing is an optimization technique which belongs to the family of local search. It is relatively simple to implement, making it a popular first choice. Although more advanced algorithms may give better results in some situations, hill climbing often works well.
 Hill climbing can be used to solve problems that have many solutions, some of which are better than others. It starts with a random (potentially poor) solution, and iteratively makes small changes to the solution, each time improving it a little. When the algorithm cannot see any improvement anymore, it terminates. Ideally, at that point the current solution is close to optimal, but it is not guaranteed that hill climbing will ever come close to the optimal solution.
 For example, hill climbing can be applied to the travelling salesman problem. It is easy to find a solution that visits all the cities but is very poor compared to the optimal solution. The algorithm starts with such a solution and makes small improvements to it, such as switching the order in which two cities are visited. Eventually, a much better route is obtained.
 Hill climbing is used widely in artificial intelligence, for reaching a goal state from a starting node. The choice of next node and starting node can be varied to give a list of related algorithms.
 Hill climbing attempts to maximize (or minimize) a function f(x), where x ranges over discrete states. These states are typically represented by vertices in a graph, where edges in the graph encode nearness or similarity of states. Hill climbing will follow the graph from vertex to vertex, always locally increasing (or decreasing) the value of f, until a local maximum (or local minimum) xm is reached. Hill climbing can also operate on a continuous space: in that case, the algorithm is called gradient ascent (or gradient descent if the function is minimized).
 Problems with hill climbing: local maxima (we've climbed to the top of the hill, and missed the mountain), plateaus (everything around is about as good as where we are), and ridges (we're on a ridge leading up, but we can't directly apply an operator to improve our situation, so we have to apply more than one operator to get there). Solutions include: backtracking, making big jumps (to handle plateaus or poor local maxima), and applying multiple rules before testing (helps with ridges). Hill climbing is best suited to problems where the heuristic gradually improves the closer it gets to the solution; it works poorly where there are sharp drop-offs. It assumes that local improvement will lead to global improvement.
 Search methods based on hill climbing get their name from the way the nodes are selected for expansion. At each point in the search path, a successor node that appears to lead most quickly to the top of the hill (the goal) is selected for exploration. This method requires that some information be available with which to evaluate and order the most promising choices. Hill climbing is like depth-first searching where the most promising child is selected for expansion.
 Hill climbing is a variant of generate-and-test in which feedback from the test procedure is used to help the generator decide which direction to move in the search space. Hill climbing is often used when a good heuristic function is available for evaluating states but when no other useful knowledge is available. For example, suppose you are in an unfamiliar city without a map and you want to get downtown. You simply aim for the tall buildings. The heuristic function is just the distance between the current location and the location of the tall buildings, and the desirable states are those in which this distance is minimized.
Simple Hill Climbing
The simplest way to implement hill climbing is simple hill climbing, whose algorithm is given below:
Algorithm: Simple Hill Climbing
Step 1: Evaluate the initial state. If it is also a goal state, then return it and quit. Otherwise continue with the initial state as the current state.
Step 2: Loop until a solution is found or until there are no new operators left to be applied in the current state:
(a) Select an operator that has not yet been applied to the current state and apply it to produce a new state.
(b) Evaluate the new state.
(i) If it is a goal state, then return it and quit.
(ii) If it is not a goal state, but it is better than the current state, then make it the current state.
(iii) If it is not better than the current state, then continue in the loop.
The key difference between this algorithm and the one we gave for generate-and-test is the use of an evaluation function as a way to inject task-specific knowledge into the control process. It is the use of such knowledge that makes this a heuristic search method. It is the same knowledge that gives these methods their power to solve some otherwise intractable problems.
To see how hill climbing works, let's take the puzzle of the four colored blocks. To solve the problem we first need to define a heuristic function that describes how close a particular configuration is to being a solution. One such function is simply the sum of the number of different colors on each of the four sides. A solution to the puzzle will have a value of 16. Next we need to define a set of rules that describe ways of transforming one configuration into another. Actually one rule will suffice. It says simply: pick a block and rotate it 90 degrees in any direction. Having provided these definitions, the next step is to generate a starting configuration. This can either be done at random or with the aid of the heuristic function. Now, using hill climbing, first we generate a new state by selecting a block and rotating it. If the resulting state is better, we keep it. If not, we return to the previous state and try a different perturbation.
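The simple hill-climbing algorithm above can be sketched in Python. The toy objective function below is an illustrative assumption (not the colored-blocks puzzle); the control structure follows the algorithm's steps:

```python
def simple_hill_climb(initial, neighbors, evaluate, is_goal):
    """Simple hill climbing: accept the FIRST successor that improves on
    the current state (steps 1-2 of the algorithm above)."""
    current = initial
    if is_goal(current):                     # step 1: initial state may be the goal
        return current
    while True:
        improved = False
        for candidate in neighbors(current):
            if is_goal(candidate):           # step 2(b)(i): goal found
                return candidate
            if evaluate(candidate) > evaluate(current):
                current = candidate          # step 2(b)(ii): better -> current state
                improved = True
                break                        # simple variant: take first improvement
        if not improved:                     # no operator helps: stuck (local maximum)
            return current

# Toy example: climb toward x = 5 on f(x) = -(x - 5)^2 over the integers.
f = lambda x: -(x - 5) ** 2
result = simple_hill_climb(
    initial=0,
    neighbors=lambda x: [x - 1, x + 1],
    evaluate=f,
    is_goal=lambda x: x == 5,
)
print(result)  # 5
```

Because the first improving move is taken, the successors are never compared against each other; that comparison is exactly what the steepest-ascent variant adds.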
Problems in Hill Climbing
Steepest-Ascent Hill Climbing:
A useful variation on simple hill climbing considers all the moves from the current state and selects the best one as the next state. This method is called steepest-ascent hill climbing or gradient search. Steepest-ascent hill climbing contrasts with the basic method, in which the first state that is better than the current state is selected. The algorithm works as follows.
Algorithm: Steepest-Ascent Hill Climbing
Step 1: Evaluate the initial state. If it is also a goal state, then return it and quit. Otherwise, continue with the initial state as the current state.
Step 2: Loop until a solution is found or until a complete iteration produces no change to the current state:
(a) Let SUCC be a state such that any possible successor of the current state will be better than SUCC.
(b) For each operator that applies to the current state do:
i. Apply the operator and generate a new state.
ii. Evaluate the new state. If it is a goal state, then return it and quit. If not, compare it to SUCC. If it is better, then set SUCC to this state. If it is not better, leave SUCC alone.
(c) If SUCC is better than the current state, then set the current state to SUCC.
To apply steepest-ascent hill climbing to the colored blocks problem, we must consider all perturbations of the initial state and choose the best. For this problem this is difficult since there are so many possible moves. There is a trade-off between the time required to select a move and the number of moves required to get a solution that must be considered when deciding which method will work better for a particular problem. Usually the time required to select a move is longer for steepest-ascent hill climbing and the number of moves required to get to a solution is longer for basic hill climbing.
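The steepest-ascent algorithm above can be sketched the same way as the simple variant, with the one difference that all successors are examined before moving (the toy objective is an illustrative assumption):

```python
def steepest_ascent(initial, neighbors, evaluate, is_goal):
    """Steepest-ascent hill climbing: examine ALL successors of the current
    state, track the best one as SUCC, and move only if SUCC beats the
    current state (steps 2(a)-(c) of the algorithm above)."""
    current = initial
    if is_goal(current):
        return current
    while True:
        succ = None
        for candidate in neighbors(current):     # step 2(b): try every operator
            if is_goal(candidate):
                return candidate
            if succ is None or evaluate(candidate) > evaluate(succ):
                succ = candidate                 # keep the best successor seen
        if succ is None or evaluate(succ) <= evaluate(current):
            return current                       # local maximum, plateau, or ridge
        current = succ                           # step 2(c): move to SUCC

# Toy example: climb toward x = 7 with steps of size 1 and 2.
f = lambda x: -(x - 7) ** 2
result = steepest_ascent(0, lambda x: [x - 2, x - 1, x + 1, x + 2], f,
                         lambda x: x == 7)
print(result)  # 7
```

Here each iteration costs four evaluations instead of stopping at the first improvement, which mirrors the trade-off discussed above: more work per move, fewer moves overall.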
Both basic and steepest-ascent hill climbing may fail to find a solution. Either algorithm may terminate not by finding a goal state but by getting to a state from which no better states can be generated. This will happen if the program has reached either a local maximum, a plateau, or a ridge.
Hill Climbing Disadvantages
Local Maximum: A local maximum is a state that is better than all its neighbors but is not better than some other states farther away.
Plateau: A plateau is a flat area of the search space in which a whole set of neighboring states have the same value.
Ridge: A ridge is a special kind of local maximum. It is an area of the search space that is higher than surrounding areas and that itself has a slope.
Ways out
 Backtrack to some earlier node and try going in a different direction.
 Make a big jump to try to get to a new section.
 Move in several directions at once.
Hill climbing is a local method: it decides what to do next by looking only at the immediate consequences of its choice.
Global information might be encoded in heuristic functions.

Fig 1.9 Three Possible Moves
Local heuristic:
 +1 for each block that is resting on the thing it is supposed to be resting on.
 -1 for each block that is resting on a wrong thing.
Global heuristic:
 For each block that has the correct support structure: +1 to every block in the support structure.
 For each block that has a wrong support structure: -1 to every block in the support structure.
Hill climbing conclusion
 Can be very inefficient in a large, rough problem space.
 A global heuristic may have to pay for its power with computational complexity.
 Often useful when combined with other methods that get it started in the right general neighbourhood.
Simulated Annealing
The problem of local maxima has been overcome in simulated annealing search. In normal hill climbing search, movements downhill are never made. In such algorithms the search may get stuck at a local maximum. Thus this search cannot guarantee complete solutions. In contrast, a random search (or movement) towards a successor chosen randomly from the set of successors would be complete, but it would be extremely inefficient. The combination of hill climbing and random search, which yields both efficiency and completeness, is called simulated annealing. The simulated annealing method was originally developed for the physical process of annealing; that is how the name simulated annealing arose. In the simulated annealing search algorithm, instead of picking the best move, a random move is picked. Standard simulated annealing uses the term objective function instead of heuristic function. If the move improves the situation it is accepted; otherwise the algorithm accepts the move with some probability less than 1.
This probability is
P = e^(-ΔE/kT)
where ΔE is the positive change in the energy level, T is the temperature, and k is Boltzmann's constant. As indicated by the equation, the probability decreases with the badness of the move (the amount ΔE by which the evaluation is worsened). The rate at which T is decreased is called the annealing schedule. A proper annealing schedule is maintained to monitor T.
This process has the following differences from hill climbing search:
 An annealing schedule is maintained.
 Moves to worse states may also be accepted.
 In addition to the current state, a record of the best state is maintained.
The algorithm of simulated annealing is presented as follows:
Algorithm: Simulated Annealing
1. Evaluate the initial state. Mark it as the current state. Till the current state is not a goal state, initialize the best state to the current state. If the initial state is the best state, return it and quit.
2. Initialize T according to the annealing schedule.
3. Repeat the following until a solution is obtained or no operators are left:
a. Apply a yet unapplied operator to produce a new state.
b. For the new state compute ΔE = value of current state - value of new state. If the new state is the goal state then stop, or if it is better than the current state, make it the current state and record it as the best state.
c. If it is not better than the current state, then make it the current state with probability P.
d. Revise T according to the annealing schedule.
4. Return the best state as the answer.
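The algorithm above can be sketched as follows. The toy objective, neighbour move, and cooling schedule are illustrative assumptions, and Boltzmann's constant k is folded into T:

```python
import math
import random

def simulated_annealing(initial, neighbor, value, temperatures, seed=0):
    """A sketch of the algorithm above: improving moves are always accepted,
    worse moves with probability P = exp(-dE / T), T falling per schedule."""
    rng = random.Random(seed)
    current = best = initial
    for T in temperatures:                      # step 3d: T revised per schedule
        candidate = neighbor(current, rng)      # step 3a: a random move
        dE = value(current) - value(candidate)  # positive if the move is worse
        if dE <= 0 or rng.random() < math.exp(-dE / T):
            current = candidate                 # steps 3b/3c: accept the move
        if value(current) > value(best):
            best = current                      # keep a record of the best state
    return best

# Toy objective: maximise f(x) = -(x - 3)^2 over the integers, neighbours x +/- 1.
f = lambda x: -(x - 3) ** 2
move = lambda x, rng: x + rng.choice([-1, 1])
schedule = [10.0 * 0.95 ** i for i in range(400)]   # geometric cooling
result = simulated_annealing(0, move, f, schedule)
```

While T is high the walk explores freely (even worse moves are usually accepted); as T falls the acceptance probability for worse moves shrinks toward zero and the search behaves like plain hill climbing, while the best state seen is kept as the answer.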
Best-First Search:
Best-first search is a general heuristic-based search technique. In best-first search, in the graph of the problem representation, an evaluation function (which corresponds to a heuristic function) is attached to every node. The value of the evaluation function may depend upon the cost or distance of the current node from the goal node. The decision of which node to expand depends on the value of this evaluation function. Best-first search can be understood from the following tree, in which the value attached to each node indicates its utility value. The expansion of nodes according to best-first search is illustrated in the figure below.
Fig 1.10 Tree getting expanded according to best-first search

Here, at any step, the most promising node, i.e. the one having the least value of the utility function, is chosen for expansion.
In the tree shown above, the best-first search technique is applied; however, it is sometimes beneficial to search a graph instead of a tree to avoid searching duplicate paths. In that case, searching is done in a directed graph in which each node represents a point in the problem space. This graph is known as an OR-graph. Each of the branches of an OR-graph represents an alternative problem-solving path.
Two lists of nodes are used to implement the graph search procedure discussed above. These are:
1. OPEN: the nodes that have been generated and have had the heuristic function applied to them but have not been examined yet.
2. CLOSED: the nodes that have already been examined. These nodes are kept in memory if we want to search a graph rather than a tree, because whenever a node is generated, we will have to check whether it has been generated earlier.
Best-first search is a way of combining the advantages of both depth-first and breadth-first search. Depth-first search is good because it allows a solution to be found without all competing branches having to be expanded. Breadth-first search is good because it does not get trapped on dead-end paths. The way of combining the two is to follow a single path at a time but switch between paths whenever some competing path looks more promising than the current one does. Hence at each step of the best-first search process, we select the most promising node out of the successor nodes that have been generated so far.
The functioning of best-first search is summarized in the following steps:
1. It maintains a list OPEN containing just the initial state.
2. Until a goal is found or there are no nodes left in the OPEN list, do:
a. Pick the best node from OPEN.
b. Generate its successors, and for each successor:
i. Check, and if it has not been generated before, evaluate it, add it to OPEN and record its parent.
ii. If it has been generated before, and the new path is better than the previous parent, then change the parent.
The algorithm for best-first search is given as follows:
Algorithm: Best-First Search
1. Put the initial node on a list, say OPEN.
2. If (OPEN = empty or OPEN = goal) terminate the search, else
3. Remove the first node from OPEN (say the node is a).
4. If (a = goal) terminate the search with success, else
5. Generate all the successor nodes of a. Send node a to a list called CLOSED. Find the value of the heuristic function for all nodes. Sort all the children generated so far on the basis of their utility value. Select the node of minimum heuristic value for further expansion.
6. Go back to step 2.
Best-first search can be implemented using a priority queue. There are variations of best-first search. Examples of these are greedy best-first search, A*, and recursive best-first search.
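The priority-queue implementation mentioned above can be sketched with Python's heapq. The toy graph and its heuristic values are illustrative assumptions; OPEN is the heap, CLOSED the set of examined nodes, and parents give the path:

```python
import heapq

def best_first_search(start, successors, h, is_goal):
    """Best-first search: always expand the node with the smallest
    heuristic value h, per the algorithm above."""
    open_list = [(h(start), start)]          # OPEN: generated, unexamined nodes
    closed = set()                           # CLOSED: already examined nodes
    parent = {start: None}
    while open_list:
        _, node = heapq.heappop(open_list)   # pick the most promising node
        if is_goal(node):
            path = []
            while node is not None:          # reconstruct path via parents
                path.append(node)
                node = parent[node]
            return path[::-1]
        closed.add(node)
        for succ in successors(node):
            if succ not in closed and succ not in parent:
                parent[succ] = node          # record parent of new node
                heapq.heappush(open_list, (h(succ), succ))
    return None

# Toy graph: reach 'G' from 'A'; h is a made-up estimate of distance to G.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["G"], "D": [], "G": []}
h = {"A": 3, "B": 2, "C": 1, "D": 2, "G": 0}
print(best_first_search("A", graph.__getitem__, h.__getitem__,
                        lambda n: n == "G"))  # ['A', 'C', 'G']
```

Note that the `parent` dictionary doubles as the "generated before" check from step b(i) of the summary above.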
The A* Algorithm:
The A* algorithm is a specialization of best-first search. It is the most widely known form of best-first search. It provides general guidelines on how to estimate goal distance for a general search graph. At each node along a path to the goal node, the A* algorithm generates all successor nodes and computes an estimate of the distance (cost) from the start node to the goal node through each of the successors. It then chooses the successor with the shortest estimated distance for expansion. It calculates the heuristic function based on the distance of the current node from the start node and the distance of the current node to the goal node.
The form of the heuristic estimation function for A* is defined as follows:
f(n) = g(n) + h(n)
where f(n) = evaluation function,
g(n) = cost (or distance) of the current node from the start node,
h(n) = cost of the current node from the goal node.
In the A* algorithm the most promising node is chosen for expansion. The promising node is decided based on the value of the heuristic function. Normally the node having the lowest value of f(n) is chosen for expansion. We must note that the goodness of a move depends upon the nature of the problem: in some problems the node having the least value of the heuristic function is the most promising node, while in some situations the node having the maximum value of the heuristic function is chosen for expansion. The A* algorithm maintains two lists: one stores the open nodes and the other maintains the list of already expanded nodes. The A* algorithm is an example of an optimal search algorithm. A search algorithm is optimal if it has an admissible heuristic. An algorithm has an admissible heuristic if its heuristic function h(n) never overestimates the cost to reach the goal. Admissible heuristics are always optimistic because with them, the estimated cost of solving the problem is less than what it actually is. The A* algorithm works as follows:
A* Algorithm:
1. Place the starting node s on the OPEN list.
2. If OPEN is empty, stop and return failure.
3. Remove from OPEN the node n that has the smallest value of f*(n). If node n is a goal node, return success and stop, otherwise:
4. Expand n, generating all of its successors n', and place n on CLOSED. For every successor n', if n' is not already on OPEN, attach a back pointer to n, compute f*(n') and place it on OPEN.
5. Each n' that is already on OPEN or CLOSED should be attached to the back pointer which reflects the lowest f*(n') path. If n' was on CLOSED and its pointer was changed, remove it and place it on OPEN.
6. Return to step 2.
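A minimal sketch of the algorithm above, with f(n) = g(n) + h(n) as defined earlier; the toy weighted graph and its (admissible) heuristic values are illustrative assumptions:

```python
import heapq

def a_star(start, goal, neighbors, cost, h):
    """A* search: OPEN is a priority queue ordered by f = g + h;
    back pointers in `parent` give the path, per the algorithm above."""
    open_list = [(h(start), 0, start)]       # entries are (f, g, node)
    g = {start: 0}                           # best known cost from the start
    parent = {start: None}
    closed = set()
    while open_list:
        f_n, g_n, n = heapq.heappop(open_list)   # smallest f first (step 3)
        if n == goal:
            path = []
            while n is not None:             # follow back pointers to the start
                path.append(n)
                n = parent[n]
            return path[::-1], g_n
        if n in closed:
            continue                         # stale queue entry; already expanded
        closed.add(n)
        for m in neighbors(n):               # step 4: expand n
            new_g = g_n + cost(n, m)
            if m not in g or new_g < g[m]:   # step 5: keep the lowest-f path
                g[m] = new_g
                parent[m] = n                # update the back pointer
                heapq.heappush(open_list, (new_g + h(m), new_g, m))
    return None, float("inf")

# Toy weighted graph; h never overestimates the true cost to G (admissible).
edges = {"S": {"A": 1, "B": 4}, "A": {"B": 1, "G": 6}, "B": {"G": 2}, "G": {}}
h = {"S": 3, "A": 2, "B": 2, "G": 0}
path, dist = a_star("S", "G", lambda n: edges[n], lambda a, b: edges[a][b],
                    h.__getitem__)
print(path, dist)  # ['S', 'A', 'B', 'G'] 4
```

Because h is admissible here, the first time the goal is popped from OPEN its g value is the true optimal cost, which is why A* with an admissible heuristic is optimal.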
Constraint Satisfaction
The general problem is to find a solution that satisfies a set of constraints. Here the heuristics are used to decide what node to expand next, and not to estimate the distance to the goal. Examples of this technique are design problems, labelling graphs, robot path planning and cryptarithmetic puzzles.
In constraint satisfaction problems a set of constraints is available. This is the search space. The initial state is the set of constraints given originally in the problem description. A goal state is any state that has been constrained enough. Constraint satisfaction is a two-step process:
1. First, constraints are discovered and propagated throughout the system.
2. Then, if there is not yet a solution, search begins: a guess is made and added as a new constraint. Propagation then occurs with this new constraint.
Algorithm
1. Propagate available constraints:
 Open all objects that must be assigned values in a complete solution.
 Repeat until an inconsistency is found or all objects are assigned valid values:
Select an object and strengthen as much as possible the set of constraints that apply to the object. If the set of constraints is different from the previous set, then open all objects that share any of these constraints. Remove the selected object.
2. If the union of the constraints discovered above defines a solution, return the solution.
3. If the union of the constraints discovered above defines a contradiction, return failure.
4. Make a guess in order to proceed. Repeat until a solution is found or all possible solutions are exhausted:
 Select an object with no assigned value and try to strengthen its constraints.
 Recursively invoke constraint satisfaction with the current set of constraints plus the selected strengthening constraint.
Cryptarithmetic puzzles are examples of constraint satisfaction problems, in which the goal is to discover some problem state that satisfies a given set of constraints. Some cryptarithmetic problems are shown below.
Here each decimal digit is to be assigned to each of the letters in such a way that the answer to the problem is correct. If the same letter occurs more than once it must be assigned the same digit each time. No two different letters may be assigned the same digit.
The puzzle SEND + MORE = MONEY, after solving, will appear like this:

State production and heuristics for the cryptarithmetic problem:
The heuristics and production rules specific to the following example are:
Heuristic Rules
1. If the sum of two n-digit operands yields an (n+1)-digit result, then the (n+1)th digit has to be one.
2. The sum of two digits may or may not generate a carry.
3. Whatever the operands might be, the carry can be either 0 or 1.
4. No two distinct letters can have the same numeric code.
5. Whenever more than one solution appears to exist, the choice is governed by the fact that no two letters can have the same number code.
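Heuristic rule 1 above immediately gives M = 1 for SEND + MORE = MONEY (a five-digit sum of four-digit operands). The remaining letters can then be found by brute-force trial over distinct digits; this sketch is a plain enumeration, not the production-rule propagation described above:

```python
from itertools import permutations

def solve_send_more_money():
    """Assign distinct digits to the letters of SEND + MORE = MONEY.
    M is fixed to 1 by heuristic rule 1; S cannot be 0 (leading digit)."""
    a = {"M": 1}
    others = "SENDORY"                       # the seven remaining letters
    for digits in permutations((0, 2, 3, 4, 5, 6, 7, 8, 9), 7):
        a.update(zip(others, digits))        # distinct digits, none equal to 1
        if a["S"] == 0:
            continue                         # a leading digit cannot be zero
        # Evaluate a word as a number under the current assignment.
        num = lambda w: sum(a[c] * 10 ** i for i, c in enumerate(reversed(w)))
        if num("SEND") + num("MORE") == num("MONEY"):
            return num("SEND"), num("MORE"), num("MONEY")

print(solve_send_more_money())  # (9567, 1085, 10652)
```

Fixing M = 1 first shrinks the search from P(10,8) = 1,814,400 assignments to P(9,7) = 181,440, which is the point of constraint propagation: each discovered constraint prunes the guessing that follows.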
Means-Ends Analysis
Means-ends analysis allows both backward and forward searching. This means we could solve the major parts of a problem first and then return to the smaller problems when assembling the final solution.
The means-ends analysis algorithm can be stated as follows:
1. Until the goal is reached or no more procedures are available:
 Describe the current state, the goal state and the differences between the two.
 Use the difference to select a procedure that will hopefully get nearer to the goal.
 Use the procedure and update the current state.
2. If the goal is reached then succeed, otherwise fail.
For using means-ends analysis to solve a given problem, a mixture of the two directions, forward and backward, is appropriate. Such a mixed strategy solves the major parts of a problem first and then goes back and solves the small problems that arise by putting the big pieces together. The means-ends analysis process detects the differences between the current state and the goal state. Once such a difference is isolated, an operator that can reduce the difference has to be found. The operator may or may not be applicable to the current state. So a subproblem is set up of getting to a state in which this operator can be applied.
In operator subgoaling, backward chaining and forward chaining are used, in which first the operators are selected and then subgoals are set up to establish the preconditions of the operators. If the operator does not produce the goal state we want, then we have a second subproblem of getting from the state it does produce to the goal. The two subproblems could be easier to solve than the original problem, if the difference was chosen correctly and if the operator applied is really effective at reducing the difference. The means-ends analysis process can then be applied recursively.
This method depends on a set of rules that can transform one problem state into another. These rules are usually not represented with complete state descriptions on each side. Instead they are represented as a left side that describes the conditions that must be met for the rule to be applicable and a right side that describes those aspects of the problem state that will be changed by the application of the rule. A separate data structure called a difference table indexes the rules by the differences that they can be used to reduce.
Example: Solve the following problem using means-ends analysis.
A farmer wants to cross the river along with a fox, a chicken and grain. He can take only one of them with him at a time. If the fox and the chicken are left alone, the fox may eat the chicken. If the chicken and the grain are left alone, the chicken may eat the grain. Give the necessary solution.
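The example above can be solved mechanically. This sketch uses a plain breadth-first state-space search rather than means-ends analysis proper; a state records the bank (0 = start side, 1 = far side) of the farmer, fox, chicken and grain in that order:

```python
from collections import deque

def safe(state):
    """A state is unsafe if the fox and chicken, or the chicken and grain,
    are together on a bank without the farmer."""
    farmer, fox, chicken, grain = state
    if fox == chicken != farmer:
        return False
    if chicken == grain != farmer:
        return False
    return True

def solve():
    start, goal = (0, 0, 0, 0), (1, 1, 1, 1)
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:         # rebuild the crossing sequence
                path.append(state)
                state = parent[state]
            return path[::-1]
        farmer = state[0]
        for i in range(4):                   # i = 0: cross alone; else take item i
            if i and state[i] != farmer:
                continue                     # an item must be on the farmer's bank
            nxt = list(state)
            nxt[0] = 1 - farmer              # the farmer always crosses
            if i:
                nxt[i] = nxt[0]              # the carried item crosses with him
            nxt = tuple(nxt)
            if safe(nxt) and nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)

path = solve()
print(len(path) - 1, "crossings")  # 7 crossings
```

Breadth-first search guarantees the shortest plan: take the chicken over, return, take the fox, bring the chicken back, take the grain, return, and finally take the chicken again.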
UNIT-2
REPRESENTATION OF KNOWLEDGE

Introduction

Game playing is one of the oldest sub-fields in AI. Game playing involves an abstract and pure form of competition that seems to require intelligence. It is easy to represent the states and actions. To implement game playing very little world knowledge is required.
The most commonly used AI technique in games is search. Game playing research has contributed ideas on how to make the best use of time to reach good decisions.
Game playing is a search problem defined by:
 Initial state of the game
 Operators defining legal moves
 Successor function
 Terminal test defining end-of-game states
 Goal test
 Path cost / utility / payoff function
The more popular games are too complex to solve completely, requiring the program to take its best guess. For example in chess, the search tree has about 10^40 nodes (with a branching factor of 35). It is because of the opponent that uncertainty arises.
Characteristics of game playing
1. There is always an "unpredictable" opponent:
 The opponent introduces uncertainty.
 The opponent also wants to win.
The solution for this problem is a strategy, which specifies a move for every possible opponent reply.
2. Time limits:
Games are often played under strict time constraints (e.g. chess) and therefore must be handled very effectively.
There are special games where two players have exactly opposite goals. There are also perfect information games (such as chess and Go) where both players have access to the same information about the game in progress (e.g. tic-tac-toe). In imperfect information games (such as bridge, certain card games, and games where dice are used), they do not. Given sufficient time and space, an optimum solution can usually be obtained for the former by exhaustive search, though not for the latter.
Types of games
There are basically two types of games:
 Deterministic games
 Chance games
Games like chess and checkers are perfect-information deterministic games, whereas games like scrabble and bridge are imperfect-information games. We will consider only two-player, discrete, perfect-information games, such as tic-tac-toe, chess, checkers etc. Two-player games are easier to imagine and think about, and more common to play.
Minimax search procedure
A typical characteristic of games is to look ahead at future positions in order to succeed. There is a natural correspondence between such games and state space problems.
In a game like tic-tac-toe:
 States - legal board positions
 Operators - legal moves
 Goal - winning position
The game starts from a specified initial state and ends in a position that can be declared a win for one player and a loss for the other, or possibly a draw. A game tree is an explicit representation of all possible plays of the game. We start with a 3 by 3 grid.
Then the two players take it in turns to place their marker on the board (one player uses the 'X' marker, the other uses the 'O' marker). The winner is the player who gets 3 of their markers in a row, e.g. if X wins:

Another possibility is that no one wins, e.g.:

Or the third possibility is a draw case:
Searchtreefortic-tac-toe
The root node is an initial position of the game. Its successors are the positions that the first
player can reach in one move; their successors are the positions resulting from the secondplayer's
replies and so on. Terminal or leaf nodes are presented by WIN, LOSS or DRAW. Each path
from the root ro a terminal node represents a different complete play of the game. The moves
available to one player from a given position can be represented by OR links whereas the moves
available to his opponent are AND links.
The trees representing games contain two types of nodes:
 MAX nodes (assumed at even levels from the root)
 MIN nodes (assumed at odd levels from the root)
Search tree for tic-tac-toe
The leaf nodes are labeled WIN, LOSS or DRAW depending on whether they represent a win, loss or draw position from MAX's viewpoint. Once the leaf nodes are assigned their WIN, LOSS or DRAW status, each node in the game tree can be labeled WIN, LOSS or DRAW by a bottom-up process.
Game playing is a special type of search, where the intention of all players must be taken into
account.
Minimax procedure
 Starting from the leaves of the tree (with final scores with respect to one player, MAX), go backwards towards the root.
 At each step, one player (MAX) takes the action that leads to the highest score, while the other player (MIN) takes the action that leads to the lowest score.
 All the nodes in the tree will be scored, and the path from the root to the actual result is the one on which all nodes have the same score.
The minimax procedure operates on a game tree and is a recursive procedure in which a player tries to minimize its opponent's advantage while at the same time maximizing its own. The player hoping for a positive number is called the maximizing player; his opponent is the minimizing player. If the player to move is the maximizing player, he is looking for a path leading to a large positive number, and his opponent will try to force the play toward situations with strongly negative static evaluations. In game playing we first construct the tree up to the depth bound and then compute the evaluation function for the leaves. The next step is to propagate the values up to the starting position.
The procedure by which the scoring information passes up the game tree is called the MINIMAX procedure, since the score at each node is either the minimum or the maximum of the scores at the nodes immediately below it.
One-ply search
In this figure, since it is the maximizing ply, the value 8 is transferred upwards to A.

Two-ply search

Static evaluation function
To play an entire game we need to combine search-oriented and non-search-oriented techniques. The ideal way to use a search procedure to find a solution to a problem is to generate moves through the problem space until a goal state is reached. Unfortunately, for games like chess, even with a good plausible-move generator it is not possible to search until a goal state is reached. In the amount of time available it is possible to generate the tree at most 10 to 20 ply deep. Then, in order to choose the best move, the resulting board positions must be compared to discover which is most advantageous. This is done using the static evaluation function. The static evaluation function evaluates individual board positions by estimating how likely they are eventually to lead to a win.
The minimax procedure is a depth-first, depth-limited search procedure.
 If the limit of search has been reached, compute the static value of the current position relative to the appropriate player (maximizing or minimizing). Report the result (value and path).
 If the level is a minimizing level (minimizer's turn): generate the successors of the current position, apply MINIMAX to each of the successors, and return the minimum of the results.
 If the level is a maximizing level: generate the successors of the current position, apply MINIMAX to each of these successors, and return the maximum of the results.
The minimax algorithm uses the following procedures:
1. MOVEGEN(Pos)
The plausible-move generator. It returns a list of successors of 'Pos'.

2. STATIC(Pos, Depth)
The static evaluation function; it returns a number representing the goodness of 'Pos' from the current point of view.
3. DEEP-ENOUGH
It returns true if the search is to be stopped at the current level; else it returns false.
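The three procedures above can be combined into a depth-limited minimax. The following is a minimal Python sketch, not a full game engine: the function names follow the text, and the toy two-level tree at the bottom is an assumption added purely for illustration.

```python
def minimax(pos, depth, maximizing, movegen, static, deep_enough):
    """Depth-first, depth-limited minimax; returns (value, path) for 'pos'."""
    if deep_enough(pos, depth):
        return static(pos, depth), [pos]
    best_value, best_path = None, None
    for succ in movegen(pos):
        value, path = minimax(succ, depth + 1, not maximizing,
                              movegen, static, deep_enough)
        if (best_value is None
                or (maximizing and value > best_value)
                or (not maximizing and value < best_value)):
            best_value, best_path = value, [pos] + path
    return best_value, best_path

# Toy tree (an assumption for illustration): A's successors B and C
# are leaf positions with static values 8 and 3.
tree = {'A': ['B', 'C'], 'B': [], 'C': []}
scores = {'B': 8, 'C': 3}
value, path = minimax('A', 0, True,
                      movegen=lambda p: tree[p],
                      static=lambda p, d: scores[p],
                      deep_enough=lambda p, d: not tree[p])
print(value, path)   # the maximizer at the root picks B
```

Note how MOVEGEN, STATIC and DEEP-ENOUGH are passed in as functions; any concrete game supplies its own versions, and the one-ply example of the text corresponds to the single maximizing level here.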
A MINIMAX example
Another example of the minimax search procedure:

In the above example, a minimax search in a game tree is simulated. Every leaf has a corresponding value, which is approximated from player A's viewpoint. When a path is chosen, the value of the child is passed back to the parent. For example, the value for D is 6, which is the maximum value of its children, while the value for C is 4, which is the minimum value of F and G. In this example the best sequence of moves found by the maximizing/minimizing procedure is the path through nodes A, B, D and H, which is called the principal continuation. The nodes on the path are denoted as PC (principal continuation) nodes. For simplicity we can modify the game tree values slightly and use only maximization operations. The trick is to maximize the scores by negating the returned values from the children, instead of searching for minimum scores, and to estimate the values at the leaves from the player's own viewpoint.
At each node, the alpha and beta values may be updated as we iterate over the node's children. At node E, when alpha is updated to a value of 8, it ends up exceeding beta. This is a point where alpha-beta pruning applies: we know the minimizer would never let the game reach this node, so we do not have to look at its remaining children. In fact, pruning happens exactly when the alpha and beta lines cross at a node value.
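The pruning rule just described can be sketched as a variant of minimax that threads alpha and beta through the recursion. This is a minimal illustration (the tiny tree and its scores are assumptions, not the figure's tree): on the second subtree, the first leaf already drives the value below alpha, so the remaining leaf is never examined.

```python
import math

def alphabeta(pos, depth, alpha, beta, maximizing, movegen, static, deep_enough):
    """Minimax with alpha-beta pruning; returns the backed-up value of 'pos'."""
    if deep_enough(pos, depth):
        return static(pos, depth)
    if maximizing:
        value = -math.inf
        for succ in movegen(pos):
            value = max(value, alphabeta(succ, depth + 1, alpha, beta, False,
                                         movegen, static, deep_enough))
            alpha = max(alpha, value)
            if alpha >= beta:   # pruning point: MIN would never allow this line
                break
        return value
    else:
        value = math.inf
        for succ in movegen(pos):
            value = min(value, alphabeta(succ, depth + 1, alpha, beta, True,
                                         movegen, static, deep_enough))
            beta = min(beta, value)
            if alpha >= beta:   # pruning point: MAX already has something better
                break
        return value

# 2-ply example: MAX at A; B's leaves score 3 and 5, C's leaves score 2 and 9.
# After B yields 3, C's first leaf (2) makes alpha >= beta, so C2 is pruned.
tree = {'A': ['B', 'C'], 'B': ['B1', 'B2'], 'C': ['C1', 'C2'],
        'B1': [], 'B2': [], 'C1': [], 'C2': []}
scores = {'B1': 3, 'B2': 5, 'C1': 2, 'C2': 9}
best = alphabeta('A', 0, -math.inf, math.inf, True,
                 movegen=lambda p: tree[p],
                 static=lambda p, d: scores[p],
                 deep_enough=lambda p, d: not tree[p])
```

The result is identical to plain minimax; pruning only saves work, it never changes the backed-up value.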

Secondary search
One proposed solution is to extend the search beyond the apparently best move to see if something is looming just over the horizon. In that case we can revert to the second-best move. Obviously the second-best move may then have the same problem, and there is not time to search beyond all possible acceptable moves.

Knowledge Representation
Representation and Mapping
 Problem solving requires a large amount of knowledge and some mechanism for manipulating that knowledge.
 Knowledge and representation are distinct entities that play central but distinguishable roles in an intelligent system.
 Knowledge is a description of the world; it determines a system's competence by what it knows.
 Representation is the way knowledge is encoded; it defines the system's performance in doing something.
 Facts: truths about the real world and what we represent. This can be regarded as the knowledge level.

 In simple words, we:
 need to know about the things we want to represent, and
 need some means by which we can manipulate those things.

Thus, knowledge representation can be considered at two levels:
 knowledge level, at which facts are described, and
 symbol level, at which the representations of the objects, defined in terms of symbols, can be manipulated in the programs.
Note: A good representation enables fast and accurate access to knowledge and understanding of the content.

Mapping between Facts and Representation
 Knowledge is a collection of "facts" from some domain.
 We need a representation of "facts" that can be manipulated by a program. Normal English is insufficient; it is currently too hard for a computer program to draw inferences in natural languages.
 Thus some symbolic representation is necessary.
 Therefore, we must be able to map "facts to symbols" and "symbols to facts" using forward and backward representation mappings.
Example: Consider an English sentence

Now, using a deductive mechanism, we can generate a new representation of the object:


hastail(Spot): a new object representation
"Spot has a tail" [it is new knowledge]: using the backward mapping function to generate the English sentence

 Good representation can make a reasoning program trivial.
 The Mutilated Checkerboard Problem: "Consider a normal checker board from which two squares, in opposite corners, have been removed. The task is to cover all the remaining squares exactly with dominoes, each of which covers two squares. No overlapping, either of dominoes on top of each other or of dominoes over the boundary of the mutilated board, is allowed. Can this task be done?"
 Forward and Backward Representation
The forward and backward representations are elaborated below.

 The dotted line on top indicates the abstract reasoning process that a program is intended to model.
 The solid line on the bottom indicates the concrete reasoning process that the program performs.
KR System Requirements
 A good knowledge representation enables fast and accurate access to knowledge and understanding of the content.
 A knowledge representation system should have the following properties:
Representational Adequacy: the ability to represent all kinds of knowledge that are needed in that domain.
Inferential Adequacy: the ability to manipulate the representational structures to derive new structures corresponding to new knowledge inferred from old.
Inferential Efficiency: the ability to incorporate additional information into the knowledge structure that can be used to focus the attention of the inference mechanisms in the most promising direction.
Acquisitional Efficiency: the ability to acquire new knowledge using automatic methods whenever possible, rather than relying on human intervention.
Note: to date, no single system optimizes all of the above properties.
Knowledge Representation Schemes
There are four types of knowledge representation:
Relational, Inheritable, Inferential, and Declarative/Procedural.
 Relational Knowledge:
 provides a framework to compare two objects based on equivalent attributes.
 any instance in which two different objects are compared is a relational type of knowledge.
 Inheritable Knowledge:
 is obtained from associated objects.
 it prescribes a structure in which new objects are created which may inherit all or a subset of attributes from existing objects.
 Inferential Knowledge:
 is inferred from objects through relations among objects.
 e.g., a word alone is simple syntax, but with the help of other words in a phrase the reader may infer more from the word; this inference within linguistics is called semantics.
 Declarative Knowledge:
 a statement in which knowledge is specified, but the use to which that knowledge is to be put is not given.
 e.g., laws, people's names; these are facts which can stand alone, not dependent on other knowledge.
 Procedural Knowledge:
 a representation in which the control information, to use the knowledge, is embedded in the knowledge itself.
 e.g., computer programs, directions, and recipes; these indicate specific use or implementation.

Relational Knowledge
This knowledge associates elements of one domain with another domain.
 Relational knowledge is made up of objects consisting of attributes and their corresponding associated values.
 The result of this knowledge type is a mapping of elements among different domains. The table below shows a simple way to store facts.
 The facts about a set of objects are put systematically in columns.
 This representation provides little opportunity for inference.
Table: Simple Relational Knowledge
Player Height Weight Bats-Throws
Aaron 6-0 180 Right-Right
Mays 5-10 170 Right-Right
Ruth 6-2 215 Left-Left
Williams 6-3 205 Left-Right

 Given the facts alone, it is not possible to answer a simple question such as: "Who is the heaviest player?" But if a procedure for finding the heaviest player is provided, then these facts will enable that procedure to compute an answer.
 We can ask things like who "bats left" and "throws right".
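The point that the facts alone support no inference, while a supplied procedure can compute answers over them, can be illustrated by storing the table as plain records (Python is used here only for illustration; the values are those of the table above):

```python
players = [
    {"player": "Aaron",    "height": "6-0",  "weight": 180, "bats_throws": "Right-Right"},
    {"player": "Mays",     "height": "5-10", "weight": 170, "bats_throws": "Right-Right"},
    {"player": "Ruth",     "height": "6-2",  "weight": 215, "bats_throws": "Left-Left"},
    {"player": "Williams", "height": "6-3",  "weight": 205, "bats_throws": "Left-Right"},
]

# The records themselves cannot answer "Who is the heaviest player?";
# a procedure over them can:
heaviest = max(players, key=lambda p: p["weight"])["player"]

# Queries like who "bats left" and "throws right" are simple selections:
left_right = [p["player"] for p in players if p["bats_throws"] == "Left-Right"]
```

The knowledge sits inertly in `players`; all the inferential work lives in the procedures that scan it, which is exactly the limitation of the relational scheme.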

Inheritable Knowledge
 Here the knowledge elements inherit attributes from their parents.
 The knowledge is embodied in the design hierarchies found in the functional, physical and process domains. Within the hierarchy, elements inherit attributes from their parents, but in many cases not all attributes of the parent elements are prescribed to the child elements.
 Inheritance is a powerful form of inference, but not adequate on its own. The basic KR needs to be augmented with an inference mechanism.
 The KR in hierarchical structure, shown below, is called a "semantic network" or a collection of "frames" or a "slot-and-filler structure". The structure shows property inheritance and a way for insertion of additional knowledge.
 Property inheritance: the objects or elements of specific classes inherit attributes and values from more general classes. The classes are organized in a generalized hierarchy.

Fig. Inheritable knowledge representation (KR)
 The directed arrows represent attributes (isa, instance, team); each originates at the object being described and terminates at an object or its value.
 The box nodes represent objects and values of the attributes.
 Viewing a node as a frame
Example: Baseball-Player
isa: Adult-Male
bats: EQUAL-handed
height: 6.1
batting-average: 0.252
 Algorithm: Property Inheritance
Retrieve a value V for an attribute A of an instance object O.
Steps to follow:
1. Find object O in the knowledge base.

2. If there is a value for the attribute A, then report that value.

3. Otherwise, see if there is a value for the attribute instance; if not, then fail.

4. Otherwise, move to the node corresponding to that value and look for a value for the attribute A; if one is found, report it.
5. Otherwise, do until there is no value for the "isa" attribute or until an answer is found:
(a) Get the value of the "isa" attribute and move to that node.

(b) See if there is a value for the attribute A; if yes, report it.

 This algorithm is simple. It describes the basic mechanism of inheritance. It does not say what to do if there is more than one value of the instance or "isa" attribute.
 This can be applied to the example knowledge base illustrated above to derive answers to the following queries:
 team(Pee-Wee-Reese) = Brooklyn-Dodgers
 batting-average(Three-Finger-Brown) = 0.106
 height(Pee-Wee-Reese) = 6.1
 bats(Three-Finger-Brown) = right
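The inheritance algorithm can be sketched over a small slice of the baseball knowledge base. The network below is an assumption reconstructed from the example queries (the original figure is not reproduced in these notes), so the particular nodes and values are illustrative only:

```python
# A tiny semantic network: each node holds its own slots plus an
# 'instance' link (class membership) or an 'isa' link (class inclusion).
kb = {
    "Person":             {},
    "Adult-Male":         {"isa": "Person", "height": 6.1},
    "Baseball-Player":    {"isa": "Adult-Male", "batting-average": 0.252},
    "Fielder":            {"isa": "Baseball-Player", "batting-average": 0.262},
    "Pitcher":            {"isa": "Baseball-Player", "batting-average": 0.106},
    "Pee-Wee-Reese":      {"instance": "Fielder", "team": "Brooklyn-Dodgers"},
    "Three-Finger-Brown": {"instance": "Pitcher", "team": "Chicago-Cubs"},
}

def get_value(obj, attr):
    """Property inheritance: check the node itself (step 2), then climb
    the 'instance' link (steps 3-4) and the 'isa' chain (step 5)."""
    node_name = obj
    while node_name is not None:
        node = kb[node_name]
        if attr in node:
            return node[attr]
        node_name = node.get("instance", node.get("isa"))
    return None   # fail

print(get_value("Pee-Wee-Reese", "height"))   # inherited from Adult-Male
```

The `team` query is answered at the instance node itself, `batting-average` one hop up at the class, and `height` only after climbing two `isa` links; that is the bottom-up labeling the algorithm describes.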
Inferential Knowledge
 This knowledge generates new information from the given information.
 This new information does not require further data gathering from the source, but does require analysis of the given information to generate new knowledge.
Example:
 given a set of relations and values, one may infer other values or relations.
 predicate logic (a mathematical deduction) is used to infer from a set of attributes.
 inference through predicate logic uses a set of logical operations to relate individual data.
 the symbols used for the logic operations are:

From these three statements we can infer that:

"Wonder lives either on land or on water."
Note: if more information is made available about these objects and their relations, then more knowledge can be inferred.
Declarative/Procedural Knowledge
The difference between declarative and procedural knowledge is not very clear.
Declarative knowledge:
Here, the knowledge is based on declarative facts about axioms and domains.
 axioms are assumed to be true unless a counter-example is found to invalidate them.
 domains represent the physical world and the perceived functionality.
 axioms and domains thus simply exist and serve as declarative statements that can stand alone.
Procedural knowledge:
Here, the knowledge is a mapping process between domains that specifies "what to do when", and the representation is of "how to make it" rather than "what it is". Procedural knowledge:
 may have inferential efficiency, but no inferential adequacy or acquisitional efficiency.
 is represented as small programs that know how to do specific things, how to proceed.
Example: A parser in a natural language system has the knowledge that a noun phrase may contain articles, adjectives and nouns. It accordingly calls routines that know how to process articles, adjectives and nouns.
Issues in Knowledge Representation
 The fundamental goal of knowledge representation is to facilitate inference (drawing conclusions) from knowledge.
 The issues that arise while using KR techniques are many. Some of these are explained below.
 Important Attributes:
Are there any attributes of objects so basic that they occur in almost every problem domain?
 Relationship among attributes:
Are there any important relationships that exist among object attributes?

 Choosing Granularity:
At what level of detail should the knowledge be represented?
 Set of objects:
How should sets of objects be represented?
 Finding the Right Structure:
Given a large amount of knowledge stored, how can the relevant parts be accessed?
 Important Attributes
 There are attributes that are of general significance.
 There are two attributes, "instance" and "isa", that are of general importance. These attributes are important because they support property inheritance.
 Relationship among Attributes
 The attributes used to describe objects are themselves entities they represent.
 The relationship between the attributes of an object, independent of the specific knowledge they encode, may hold properties like:
 inverses, existence in an isa hierarchy, techniques for reasoning about values, and single-valued attributes.
 Inverses:
This is about a consistency check performed when a value is added to one attribute. The entities are related to each other in many different ways. The figure shows attributes (isa, instance, and team), each with a directed arrow, originating at the object being described and terminating either at the object or its value.
There are two ways of realizing this:
 first, represent both relationships in a single representation; e.g., a logical representation, team(Pee-Wee-Reese, Brooklyn-Dodgers), that can be interpreted as a statement about either Pee-Wee-Reese or Brooklyn-Dodgers.
 second, use attributes that focus on a single entity but use them in pairs, one the inverse of the other; e.g., one, team = Brooklyn-Dodgers, and the other, team = Pee-Wee-Reese, . . . .
This second approach is followed in semantic net and frame-based systems, accompanied by a knowledge acquisition tool that guarantees the consistency of inverse slots by checking, each time a value is added to one attribute, that the corresponding value is added to the inverse.
 Existence in an "isa" hierarchy
This is about generalization-specialization: classes of objects and specialized subsets of those classes. There are attributes and specializations of attributes.
Example: the attribute "height" is a specialization of the general attribute "physical-size", which is, in turn, a specialization of "physical-attribute". These generalization-specialization relationships for attributes are important because they support inheritance.
 Techniques for reasoning about values
This is about reasoning with values of attributes not given explicitly. Several kinds of information are used in reasoning, like:
height: must be in a unit of length;
age: of a person cannot be greater than the age of the person's parents.
The values are often specified when a knowledge base is created.
 Single-valued attributes
This is about a specific attribute that is guaranteed to take a unique value.
Example: A baseball player can at any time have only a single height and be a member of only one team. KR systems take different approaches to provide support for single-valued attributes.
 Choosing Granularity
At what level should the knowledge be represented, and what are the primitives?
 Should there be a small number of low-level primitives or a large number of high-level facts?
 High-level facts may not be adequate for inference, while low-level primitives may require a lot of storage.
Example of Granularity:
 Suppose we are interested in the following fact:
John spotted Sue.
 This could be represented as
spotted(agent(John), object(Sue))
 Such a representation would make it easy to answer questions such as:
Who spotted Sue?
 Suppose we want to know:
Did John see Sue?
 Given only the one fact, we cannot discover that answer.
 We can add other facts, such as
spotted(x, y) -> saw(x, y)
 We can now infer the answer to the question.
 Set of Objects
 Certain properties of objects are true as members of a set but not as individuals.
Example: Consider the assertions made in the sentences "there are more sheep than people in Australia" and "English speakers can be found all over the world."
 To describe these facts, the only way is to attach the assertion to the sets representing people, sheep, and English speakers.
 The reason to represent sets of objects is:
If a property is true for all or most elements of a set, then it is more efficient to associate it once with the set rather than to associate it explicitly with every element of the set.
 This is done in different ways:
 in logical representation, through the use of the universal quantifier, and
 in hierarchical structure, where nodes represent sets, the inheritance propagates set-level assertions down to individuals.
Example: assert large(elephant). Remember to make a clear distinction between:
 whether we are asserting some property of the set itself, meaning the set of elephants is large, or
 asserting some property that holds for individual elements of the set, meaning anything that is an elephant is large.
There are three ways in which sets may be represented:
 a name, as in the examples above: in inheritable KR, the node Baseball-Player, and predicates such as Ball and Batter in logical representation;
 an extensional definition, which is to list the members; and
 an intensional definition, which is to provide a rule that returns true or false depending on whether the object is in the set or not.
 Finding the Right Structure
 Access to the right structure for describing a particular situation.
 This requires selecting an initial structure and then revising the choice. While doing so, it is necessary to solve the following problems:
 how to perform an initial selection of the most appropriate structure.
 how to fill in appropriate details from the current situation.
 how to find a better structure if the one chosen initially turns out not to be appropriate.
 what to do if none of the available structures is appropriate.
 when to create and remember a new structure.
 There is no good, general-purpose method for solving all these problems. Some knowledge representation techniques solve some of them.

Knowledge Representation using Predicate Logic
Representing Simple Facts in Logic
An AI system might need to represent knowledge. Propositional logic is one of the fairly good forms of representing it, because it is simple to deal with and a decision procedure for it exists. Real-world facts are represented as logical propositions and are written as well-formed formulas (wff's) in propositional logic, as shown in the figure below. Using these propositions, we may easily conclude it is not sunny from the fact that it is raining. But contrary to the ease of using propositional logic, it has its limitations. This is well demonstrated using a few simple sentences like:

Some simple facts in propositional logic

Socrates is a man.
We could write:
SOCRATESMAN
But if we also wanted to represent
Plato is a man.
we would have to write something such as:
PLATOMAN
which would be a totally separate assertion, and we would not be able to draw any conclusions about similarities between Socrates and Plato. It would be much better to represent these facts as:
MAN(SOCRATES)
MAN(PLATO)
since now the structure of the representation reflects the structure of the knowledge itself. But to do that, we need to be able to use predicates applied to arguments. We are in even more difficulty if we try to represent the equally classic sentence
All men are mortal.
We could represent this as:
MORTALMAN
But that fails to capture the relationship between any individual being a man and that individual being mortal. To do that, we really need variables and quantification, unless we are willing to write separate statements about the mortality of every known man.
Let's now explore the use of predicate logic as a way of representing knowledge by looking at a specific example. Consider the following set of sentences:
1. Marcus was a man.
2. Marcus was a Pompeian.
3. All Pompeians were Romans.
4. Caesar was a ruler.
5. All Romans were either loyal to Caesar or hated him.
6. Everyone is loyal to someone.
7. People only try to assassinate rulers they are not loyal to.
8. Marcus tried to assassinate Caesar.
The facts described by these sentences can be represented as a set of wff's in predicate logic as
follows:
1. Marcus was a man.
man(Marcus)
Although this representation fails to represent the notion of past tense (which is clear in the English sentence), it captures the critical fact of Marcus being a man. Whether this omission is acceptable or not depends on the use to which we intend to put the knowledge.
2. Marcus was a Pompeian.
Pompeian(Marcus)
3. All Pompeians were Romans.
∀x: Pompeian(x) → Roman(x)
4. Caesar was a ruler.
ruler(Caesar)
Since many people share the same name, the fact that proper names are often not references to unique individuals is overlooked here. Occasionally, deciding which of several people of the same name is being referred to in a particular statement may require a somewhat larger amount of knowledge and logic.
5. All Romans were either loyal to Caesar or hated him.
∀x: Roman(x) → loyalto(x, Caesar) ∨ hate(x, Caesar)
Here we have used the inclusive-or interpretation of the two types of or supported by the English language. Some people will argue, however, that this English sentence is really stating an exclusive-or. To express that, we would have to write:
∀x: Roman(x) → [(loyalto(x, Caesar) ∨ hate(x, Caesar)) ∧ ¬(loyalto(x, Caesar) ∧ hate(x, Caesar))]
6. Everyone is loyal to someone.
∀x: ∃y: loyalto(x, y)
The scope of quantifiers is a major problem that arises when trying to convert English sentences into logical statements. Does this sentence say, as we have assumed in writing the logical formula above, that for each person there exists someone to whom he or she is loyal, possibly a different someone for everyone? Or does it say that there is someone to whom everyone is loyal?

An Attempt to Prove ¬loyalto(Marcus, Caesar)
7. People only try to assassinate rulers they are not loyal to.
∀x: ∀y: person(x) ∧ ruler(y) ∧ tryassassinate(x, y) → ¬loyalto(x, y)
Like the previous one, this sentence too is ambiguous, which may lead to more than one conclusion. The use of "tryassassinate" as a single predicate gives us a fairly simple representation with which we can reason about trying to assassinate. But connections between trying to assassinate and actually assassinating cannot be made easily.
8. Marcus tried to assassinate Caesar.
tryassassinate(Marcus, Caesar)
Now, suppose we wish to answer the following question:
Was Marcus loyal to Caesar?
What we do is start reasoning backward from the desired goal, which is represented in predicate logic as:
¬loyalto(Marcus, Caesar)
Figure 4.2 shows an attempt to produce a proof of the goal by reducing the set of necessary but as yet unattained goals to the empty set. The attempt fails, as we do not have any statement to prove person(Marcus). But the problem is solved just by adding an additional statement:

9. All men are people.

∀x: man(x) → person(x)
Now we can satisfy the last goal and produce a proof that Marcus was not loyal to Caesar.
Representing Instance and isa Relationships
 Knowledge can be represented as classes, objects, attributes, and superclass/subclass relationships.
 Knowledge can be inferred using property inheritance, in which elements of specific classes inherit the attributes and values.
 The attribute instance is used to represent the relationship "class membership" (element of the class).
 The attribute isa is used to represent the relationship "class inclusion" (superclass/subclass relationship).
Three ways of representing class membership
These examples illustrate two points. The first is fairly specific: although class and superclass memberships are important facts that need to be represented, those memberships need not be represented with predicates labelled instance and isa. In fact, in a logical framework it is usually unwieldy to do that, and instead unary predicates corresponding to the classes are often used. The second point is more general. There are usually several different ways of representing a given fact within a particular representational framework, be it logic or anything else. The choice depends partly on which deductions need to be supported most efficiently and partly on taste. The only important thing is that within a particular knowledge base, consistency of representation is critical. Since any particular inference rule is designed to work on one particular form of representation, it is necessary that all the knowledge to which that rule is intended to apply be in the form that the rule demands. Many errors in the reasoning performed by knowledge-based programs are the result of inconsistent representation decisions. The moral is simply to be careful.
Computable functions and predicates
 Some computational predicates like less-than and greater-than are used in knowledge representation.
 They generally return true or false for their inputs.
Examples: computable predicates
gt(1, 0) or lt(0, 1)
gt(5, 4) or gt(4, 5)
Computable functions: gt(2+4, 5)
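Such predicates are not stored as explicit facts but are evaluated on demand; a minimal sketch of the idea (the function bodies are the obvious assumption, since the notes only name gt and lt):

```python
def gt(a, b):
    """Computable predicate: evaluated when needed, never looked up as a fact."""
    return a > b

def lt(a, b):
    return a < b

# gt(2+4, 5): the argument itself is a computable function (2+4 evaluates
# to 6 before the predicate is tested), so the predicate holds.
results = [gt(1, 0), lt(0, 1), gt(5, 4), gt(4, 5), gt(2 + 4, 5)]
```

The point is that facts like gt(2+4, 5) need not be enumerated in the knowledge base; a procedure computes the truth value for any arguments.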
Consider the following set of facts, again involving Marcus:
1. Marcus was a man.
man(Marcus)
2. Marcus was a Pompeian.
Pompeian(Marcus)
3. Marcus was born in 40 A.D.
born(Marcus, 40)
4. All men are mortal.

∀x: man(x) → mortal(x)
5. All Pompeians died when the volcano erupted in 79 A.D.
erupted(volcano, 79) ∧ ∀x: Pompeian(x) → died(x, 79)
6. No mortal lives longer than 150 years.

∀x: ∀t1: ∀t2: mortal(x) ∧ born(x, t1) ∧ gt(t2 - t1, 150) → dead(x, t2)

7. It is now 1991.
now = 1991
8. Alive means not dead.

∀x: ∀t: [alive(x, t) → ¬dead(x, t)] ∧ [¬dead(x, t) → alive(x, t)]

9. If someone dies, then he is dead at all later times.

∀x: ∀t1: ∀t2: died(x, t1) ∧ gt(t2, t1) → dead(x, t2)

This representation says that one is dead in all years after the one in which one died. It ignores the question of whether one is dead in the year in which one died.
1. man(Marcus)
2. Pompeian(Marcus)
3. born(Marcus, 40)

4. ∀x: man(x) → mortal(x)
5. ∀x: Pompeian(x) → died(x, 79)
6. erupted(volcano, 79)

7. ∀x: ∀t1: ∀t2: mortal(x) ∧ born(x, t1) ∧ gt(t2 - t1, 150) → dead(x, t2)
8. now = 1991

9. ∀x: ∀t: [alive(x, t) → ¬dead(x, t)] ∧ [¬dead(x, t) → alive(x, t)]

10. ∀x: ∀t1: ∀t2: died(x, t1) ∧ gt(t2, t1) → dead(x, t2)


Two things should be clear from the proofs we have just shown:
 Even very simple conclusions can require many steps to prove.
 A variety of processes, such as matching, substitution, and application of modus ponens, are involved in the production of a proof. This is true even for the simple statements we are using. It would be worse if we had implications with more than a single term on the right, or with complicated expressions involving ands and ors on the left.
Disadvantages:
 Many steps may be required to prove simple conclusions.
 A variety of processes, such as matching and substitution, are used to prove simple conclusions.
Resolution
 Resolution is a proof procedure by refutation.
 To prove a statement using resolution, it attempts to show that the negation of that statement leads to a contradiction with the known statements.
Resolution in Propositional Logic
In propositional logic, the procedure for producing a proof by resolution of a proposition P with respect to a set of axioms F is the following.
ALGORITHM: PROPOSITIONAL RESOLUTION
1. Convert all the propositions of F to clause form.
2. Negate P and convert the result to clause form. Add it to the set of clauses obtained in step 1.
3. Repeat until either a contradiction is found or no progress can be made:
a) Select two clauses. Call these the parent clauses.
b) Resolve them together. The resulting clause, called the resolvent, will be the disjunction of all of the literals of both of the parent clauses with the following exception: if there are any pairs of literals L and ¬L such that one of the parent clauses contains L and the other contains ¬L, then select one such pair and eliminate both L and ¬L from the resolvent.
c) If the resolvent is the empty clause, then a contradiction has been found. If it is not, then add it to the set of clauses available to the procedure.

Suppose we are given the axioms shown in the first column of Table 1 and we want to prove R. First we convert the axioms to clause form (here they are already in clause form). Then we begin selecting pairs of clauses to resolve together. Although any pair of clauses can be resolved, only those pairs that contain complementary literals will produce a resolvent that is likely to lead to the goal, as in the sequence of resolvents shown in the figure. We begin by resolving with the clause ¬R, since that is one of the clauses that must be involved in the contradiction we are trying to find.
One way of viewing the resolution process is that it takes a set of clauses that are all assumed to be true and, based on information provided by the others, generates new clauses that represent restrictions on the way each of those original clauses can be made true. A contradiction occurs when a clause becomes so restricted that there is no way it can be true. This is indicated by the generation of the empty clause.

Table 1: Some Facts in Propositional Logic
Figure: Resolution in Propositional Logic
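The propositional procedure can be sketched with clauses represented as frozensets of literals, a negative literal written as "~P". Since Table 1 itself is not reproduced in these notes, the axiom set below is an assumed example of the usual shape (P; P and Q imply R; S or T implies Q; T), and the prover is a minimal brute-force saturation of step 3, not an efficient one:

```python
def negate(lit):
    """~P <-> P."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """Step 3b: all resolvents of two clauses on a complementary pair."""
    out = []
    for lit in c1:
        if negate(lit) in c2:
            out.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return out

def refute(clauses):
    """Step 3: resolve until the empty clause appears or no progress is made."""
    clauses = set(clauses)
    while True:
        new = set()
        for a in clauses:
            for b in clauses:
                if a != b:
                    for r in resolve(a, b):
                        if not r:
                            return True    # empty clause: contradiction found
                        new.add(r)
        if new <= clauses:
            return False                   # no progress: cannot refute
        clauses |= new

# Assumed axioms in clause form: P; ~P v ~Q v R; ~S v Q; ~T v Q; T.
axioms = [frozenset({"P"}), frozenset({"~P", "~Q", "R"}),
          frozenset({"~S", "Q"}), frozenset({"~T", "Q"}), frozenset({"T"})]
# To prove R, add its negation and look for the empty clause.
contradiction = refute(axioms + [frozenset({"~R"})])
```

Adding ~R drives the saturation to the empty clause, so R follows from the axioms; without a provable goal the loop stops with "no progress" instead.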

UNIFICATION ALGORITHM
In propositional logic it is easy to determine that two literals cannot both be true at the same time: simply look for L and ¬L. In predicate logic, this matching process is more complicated, since the bindings of variables must be considered.
For example, man(John) and ¬man(John) is a contradiction, while man(John) and ¬man(Himalayas) is not. Thus, in order to determine contradictions, we need a matching procedure that compares two literals and discovers whether there exists a set of substitutions that makes them identical. There is a recursive procedure that does this matching. It is called the unification algorithm.
In the unification algorithm each literal is represented as a list, where the first element is the name of a predicate and the remaining elements are arguments. An argument may be a single element (atom) or may be another list. For example, we can have literals such as
(tryassassinate Marcus Caesar)
(tryassassinate Marcus (ruler-of Rome))
To unify two literals , first check if their first elements re same. If so proceed. Otherwise theycan
not be unified. For example the literals
(tryassassinateMarcusCaesar)
(hateMarcusCaesar)
CannotbeUnfied.Theunificationalgorithmrecursivelymatchespairsof elements,onepair ata time.
The matching rules are:
i) Different constants, functions or predicates cannot match, whereas identical ones can.
ii) A variable can match another variable, any constant, or a function or predicate expression,
subject to the condition that the function or predicate expression must not contain any instance
of the variable being matched (otherwise it would lead to infinite recursion).
iii) The substitution must be consistent. Substituting y for x now and then z for x later is
inconsistent. (A substitution of y for x is written as y/x.)
The unification algorithm is listed below as a procedure UNIFY(L1, L2). It returns a list
representing the composition of the substitutions that were performed during the match. The
empty list, NIL, indicates that a match was found without any substitutions; the list consisting of
the single value FAIL indicates that the unification procedure failed.
Algorithm: Unify(L1, L2)
1. If L1 or L2 are both variables or constants, then:
(a) If L1 and L2 are identical, then return NIL.
(b) Else if L1 is a variable, then if L1 occurs in L2 return {FAIL}, else return (L2/L1).
(c) Else if L2 is a variable, then if L2 occurs in L1 return {FAIL}, else return (L1/L2).
(d) Else return {FAIL}.
2. If the initial predicate symbols in L1 and L2 are not identical, then return {FAIL}.
3. If L1 and L2 have a different number of arguments, then return {FAIL}.
4. Set SUBST to NIL. (At the end of this procedure, SUBST will contain all the substitutions
used to unify L1 and L2.)
5. For i ← 1 to number of arguments in L1:
(a) Call Unify with the ith argument of L1 and the ith argument of L2, putting the result in S.
(b) If S contains FAIL, then return {FAIL}.
(c) If S is not equal to NIL, then:
(i) Apply S to the remainder of both L1 and L2.
(ii) SUBST := APPEND(S, SUBST).
6. Return SUBST.
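The procedure above can be sketched in Python, with each literal written as a list whose first element is the predicate name, exactly as in the representation described earlier. The convention that variables are lowercase strings and constants are capitalized, and the helper names `is_variable` and `substitute`, are assumptions for this sketch, not part of the original algorithm.

```python
FAIL = "FAIL"

def is_variable(term):
    # Assumed convention: variables are lowercase strings, constants capitalized.
    return isinstance(term, str) and term[0].islower()

def occurs_in(var, term):
    # Occurs check of step 1: a variable may not match an expression containing it.
    if var == term:
        return True
    return isinstance(term, list) and any(occurs_in(var, t) for t in term)

def substitute(subst, term):
    # Apply a list of (value, var) bindings, each read "value/var", to a term.
    for value, var in subst:
        if term == var:
            term = value
        elif isinstance(term, list):
            term = [substitute([(value, var)], t) for t in term]
    return term

def unify(l1, l2):
    # Step 1: both arguments are atoms (variables or constants).
    if not isinstance(l1, list) and not isinstance(l2, list):
        if l1 == l2:
            return []                                  # NIL: match, no substitution
        if is_variable(l1):
            return FAIL if occurs_in(l1, l2) else [(l2, l1)]
        if is_variable(l2):
            return FAIL if occurs_in(l2, l1) else [(l1, l2)]
        return FAIL
    if isinstance(l1, list) != isinstance(l2, list):
        # One side is compound; it can only match a variable on the other side.
        if is_variable(l1):
            return FAIL if occurs_in(l1, l2) else [(l2, l1)]
        if is_variable(l2):
            return FAIL if occurs_in(l2, l1) else [(l1, l2)]
        return FAIL
    # Steps 2-3: predicate symbols and number of arguments must agree.
    if l1[0] != l2[0] or len(l1) != len(l2):
        return FAIL
    # Steps 4-6: unify arguments left to right, composing substitutions,
    # applying earlier bindings to later arguments for consistency.
    subst = []
    for a1, a2 in zip(l1[1:], l2[1:]):
        s = unify(substitute(subst, a1), substitute(subst, a2))
        if s == FAIL:
            return FAIL
        subst = s + subst                              # SUBST := APPEND(S, SUBST)
    return subst

print(unify(["hate", "x", "y"], ["hate", "Marcus", "Caesar"]))
# [('Caesar', 'y'), ('Marcus', 'x')]
```

Note how the consistency rule falls out of applying the running substitution before each recursive call: unifying (p x x) with (p Marcus Caesar) fails because x is already bound to Marcus when Caesar is tried.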
Resolution in Predicate Logic
ALGORITHM: RESOLUTION IN PREDICATE LOGIC
1. Convert all the statements of F to clause form.
2. Negate P and convert the result to clause form. Add it to the set of clauses obtained in step 1.
3. Repeat until either a contradiction is found, no progress can be made, or a predetermined
amount of effort has been expended:
a) Select two clauses. Call these the parent clauses.
b) Resolve them together. The resulting clause, called the resolvent, will be the disjunction of all
the literals of both parent clauses with appropriate substitutions performed, with the following
exception: if there is a pair of literals T1 and ¬T2 such that one of the parent clauses contains T1
and the other contains ¬T2, and if T1 and T2 are unifiable, then neither T1 nor ¬T2 should
appear in the resolvent. We call T1 and ¬T2 complementary literals. Use the substitution
produced by the unification to create the resolvent. If there is more than one pair of
complementary literals, only one such pair should be omitted from the resolvent.
c) If the resolvent is the empty clause, then a contradiction has been found. If it is not, then add it
to the set of clauses available to the procedure.

If the choice of clauses to resolve together at each step is made in certain systematic ways, then
the resolution procedure will find a contradiction if one exists. However, it may take a very long
time. There exist strategies for making the choice that can speed up the process considerably as
given below.
Axioms in clause form
An Unsuccessful Attempt at Resolution
9. ∀x,y: persecute(x, y) → hate(y, x)
10. ∀x,y: hate(x, y) → persecute(y, x)
Converting to clause form, we get:
9. ¬persecute(x5, y2) ∨ hate(y2, x5)
10. ¬hate(x6, y3) ∨ persecute(y3, x6)
Procedural v/s Declarative Knowledge
 A declarative representation is one in which knowledge is specified, but the use to which
that knowledge is to be put is not given.
 A procedural representation is one in which the control information that is necessary to use
the knowledge is considered to be embedded in the knowledge itself.
 To use a procedural representation, we need to augment it with an interpreter that follows the
instructions given in the knowledge.
 The difference between the declarative and the procedural views of knowledge lies in
where control information resides.
man(Marcus)
man(Caesar)
person(Cleopatra)
∀x: man(x) → person(x)

Now we want to extract from this knowledge base the answer to the question:
∃y: person(y)
Marcus, Caesar and Cleopatra can all be the answers.
 As there is more than one value that satisfies the predicate, but only one value is needed,
the answer depends on the order in which the assertions are examined during the search for
a response.
 If we view the assertions as declarative, then we cannot predict how they will be
examined. If we view them as procedural, then we can.
 We could view these assertions as a nondeterministic program whose output is simply not
defined; on that view there would be no difference between procedural and declarative
statements. But most machines do not work that way: they hold on to whatever control
method they have, either sequential or parallel.
 The focus is therefore on the control model.
man(Marcus)
man(Caesar)
∀x: man(x) → person(x)
person(Cleopatra)
 If we view this as declarative, then there is no difference from the previous knowledge
base. But viewed procedurally, using the control model, the answer we got before was
Cleopatra, whereas now the answer is Marcus.
 The answer can vary by changing the way the interpreter works.
 The distinction between the two forms is often very fuzzy. Rather than trying to prove
which technique is better, we should figure out the ways in which rule formalisms and
interpreters can be combined to solve problems.
Logic Programming
 Logic programming is a programming language paradigm in which logical assertions are
viewed as programs, e.g. PROLOG.
 A PROLOG program is described as a series of logical assertions, each of which is a Horn
clause.
 A Horn clause is a clause that has at most one positive literal.
 E.g. p, ¬p ∨ q, etc. are Horn clauses.
 The fact that PROLOG programs are composed only of Horn clauses and not of arbitrary
logical expressions has two important consequences:
 Because of the uniform representation, a simple and effective interpreter can be written.
 The logic of Horn clause systems is decidable.
 PROLOG works on backward reasoning.
 The program is read top to bottom, left to right, and search is performed depth-first with
backtracking.
 There are some syntactic differences between the logic and the PROLOG representations.
 The key difference between the logic and PROLOG representations is that the PROLOG
interpreter has a fixed control strategy, so the assertions in a PROLOG program define a
particular search path to an answer. Logical assertions, by contrast, define only the set of
answers that they justify; there can be more than one answer, and reasoning can proceed
forward or backward.
 The control strategy for PROLOG states that we begin with a problem statement, which is
viewed as a goal to be proved:
 Look for the assertions that can prove the goal.
 To decide whether a fact or a rule can be applied to the current problem, invoke a
standard unification procedure.
 Reason backward from that goal until a path is found that terminates with assertions in
the program.
 Consider paths using a depth-first search strategy with backtracking.
 Propagate back to the answer by satisfying the conditions.

Forward v/s Backward Reasoning
 The objective of any search is to find a path through a problem space from the initial state
to the final one.
 There are two directions in which to go and find the answer:
 Forward
 Backward
 8-puzzle problem
 Reason forward from the initial states: Begin building a tree of move sequences that
might be solutions by starting with the initial configuration(s) at the root of the tree.
Generate the next level of the tree by finding all the rules whose left sides match the
root node, and use the right sides to create the new configurations. Generate each
succeeding level by taking each node generated at the previous level and applying to it
all of the rules whose left sides match it. Continue.
 Reason backward from the goal states: Begin building a tree of move sequences that
might be solutions by starting with the goal configuration(s) at the root of the tree.
Generate the next level of the tree by finding all the rules whose right sides match the root
node, and use the left sides to create the new configurations. Generate each succeeding
level by taking each node generated at the previous level and applying to it all of the rules
whose right sides match it. Continue. This is also called goal-directed reasoning.
 To summarize: to reason forward, the left sides (preconditions) are matched against the
current state and the right sides (the results) are used to generate new nodes until the goal
is reached.
 To reason backward, the right sides are matched against the current node and the left
sides are used to generate new nodes.
 Factors that influence whether to choose forward or backward reasoning:
 Are there more possible start states or goal states? We would like to go from the smaller
set of states to the larger set of states.
 In which direction is the branching factor (the average number of nodes that can be
reached directly from a single node) greater? We would like to proceed in the
direction with the lower branching factor.
 Will the program be asked to justify its reasoning process to the user? If so, it is
important to proceed in the direction that corresponds more closely with the way the
user will think.
 What kind of event is going to trigger a problem-solving episode? If it is the arrival of a
new fact, forward reasoning should be used. If it is a query to which a response is
desired, use backward reasoning.
 Example: travelling from home to an unfamiliar place.
 MYCIN
 Bidirectional search (the two searches must pass each other)
 Forward rules: encode knowledge about how to respond to certain input
configurations.
 Backward rules: encode knowledge about how to achieve particular goals.
 Backward-chaining rule systems
 PROLOG is an example of this.
 These are good for goal-directed problem solving.
 Hence PROLOG and MYCIN are examples of the same.
 Forward-chaining rule systems
 We work on the incoming data here.
 The left sides of rules are matched against the state description.
 The rules that match dump their right-side assertions into the state.
 Matching is more complex for forward-chaining systems.
 OPS5 and Brownston et al. are examples of the same.
 Combining forward and backward reasoning
 Example: diagnosis of patients.
 In some systems, this is only possible with reversible rules.

Matching
 Till now we have treated search as the application of appropriate rules.
 We applied them to individual problem states to generate new states, to which the rules
can then be applied, until a solution is found.
 We suggested that a clever search involves choosing from among the rules that can be
applied at a particular point, but we have not talked about how to extract, from the entire
collection of rules, those that can be applied at a given point.
 To do this we need matching.
Indexing
 One way is to do a simple search through all the rules, comparing each one's precondition
to the current state and extracting all the ones that match.
 But this has two problems:
 In order to solve very interesting problems, it will be necessary to use a large number of
rules; scanning through all of them at every step of the search would be hopelessly
inefficient.
 It is not always immediately obvious whether a rule's preconditions are satisfied by a
particular state.
 To solve the first problem, use simple indexing. E.g. in chess, index all the moves
applicable at a particular board state together.
Matching with Variables
 The problem of selecting applicable rules is made more difficult when preconditions are
not stated as exact descriptions of particular situations but rather describe properties that
the situations must have.
 Then we need to match a particular situation against the preconditions of a given rule.
 In many rule-based systems, we need to compute the whole set of rules that match the
current state description. Backward-chaining systems usually use depth-first
backtracking to select individual rules, but forward-chaining systems use conflict
resolution strategies.
 One efficient many-to-many match algorithm is RETE.

Complex & Approximate Matching
 A more complex matching process is required when the preconditions of a rule specify
required properties that are not stated explicitly in the description of the current state. In
this case, a separate set of rules must be used to describe how some properties can be
inferred from others.
 An even more complex matching process is required if rules should be applied when their
preconditions only approximately match the current situation, as when listening to a
recording of a telephone conversation.
 For some problems, almost all the action is in the matching of the rules to the problem
state. Once that is done, so few rules apply that the remaining search is trivial.
Example: ELIZA.
Conflict Resolution
 The result of the matching process is a list of rules whose antecedents have matched the
current state description, along with whatever variable bindings were generated by the
matching process.
 It is the job of the search method to decide on the order in which the rules will be applied.
But sometimes it is useful to incorporate some of that decision making into the matching
process. This phase is called conflict resolution.
 There are three basic approaches to the problem of conflict resolution in a production
system:
 Assign a preference based on the rule that matched.
 Assign a preference based on the objects that matched.
 Assign a preference based on the action that the matched rule would perform.
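A minimal Python sketch of the first approach (preference based on the rule that matched): the two bird rules, the `priority` field and the specificity tie-break below are illustrative assumptions, not from the text.

```python
# Hypothetical rules: each has a name, a set of condition facts, an action,
# and a priority used during conflict resolution (higher fires first).
rules = [
    {"name": "R1", "conditions": {"bird"},            "action": "can-fly",  "priority": 1},
    {"name": "R2", "conditions": {"bird", "penguin"}, "action": "cant-fly", "priority": 2},
]

def conflict_set(facts):
    # Matching: every rule whose conditions are all present in the state.
    return [r for r in rules if r["conditions"] <= facts]

def resolve(matched):
    # Preference based on the rule that matched: the higher-priority rule
    # wins; ties are broken in favour of the more specific rule.
    return max(matched, key=lambda r: (r["priority"], len(r["conditions"])))

facts = {"bird", "penguin"}
matched = conflict_set(facts)                  # both R1 and R2 match
chosen = resolve(matched)                      # R2 is preferred
print(chosen["name"], "->", chosen["action"])  # R2 -> cant-fly
```

Preference by objects or by actions would replace only the key function in `resolve`, e.g. ranking by the recency of the matched facts or by how close the action brings the system to its goal.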

STRUCTURED REPRESENTATION OF KNOWLEDGE
 Representing knowledge using a logical formalism, like predicate logic, has several
advantages: it can be combined with powerful inference mechanisms like resolution,
which makes reasoning with facts easy. But complex structures of the world (objects and
their relationships, events, sequences of events, etc.) cannot be described easily using a
logical formalism.
 A good system for the representation of structured knowledge in a particular domain
should possess the following four properties:
(i) Representational Adequacy: the ability to represent all kinds of knowledge that are
needed in that domain.
(ii) Inferential Adequacy: the ability to manipulate the represented structure and infer
new structures.
(iii) Inferential Efficiency: the ability to incorporate additional information into the
knowledge structure that will aid the inference mechanisms.
(iv) Acquisitional Efficiency: the ability to acquire new information easily, either
by direct insertion or by program control.
 The techniques that have been developed in AI systems to accomplish these objectives
fall under two categories:
1. Declarative Methods: knowledge is represented as a static collection of facts
which are manipulated by general procedures. Here the facts need to be stored only once,
and they can be used in any number of ways. Facts can be easily added to declarative
systems without changing the general procedures.
2. Procedural Methods: knowledge is represented as procedures. Default
reasoning and probabilistic reasoning are examples of procedural methods. In these,
heuristic knowledge of how to do things efficiently can be easily represented.
 In practice, most knowledge representations employ a combination of both. Most
knowledge representation structures have been developed to handle programs that process
natural language input. One of the reasons that knowledge structures are so important is
that they provide a way to represent information about commonly occurring patterns of
things. Such descriptions are sometimes called schemas. One definition of schema is:
 "Schema refers to an active organization of past reactions, or of past experience,
which must always be supposed to be operating in any well-adapted organic response."
 By using schemas, people as well as programs can exploit the fact that the real world is
not random. There are several types of schemas that have proved useful in AI programs.
They include:
(i) Frames: used to describe a collection of attributes that a given object possesses (e.g.
description of a chair).
(ii) Scripts: used to describe a common sequence of events (e.g. a restaurant scene).
(iii) Stereotypes: used to describe characteristics of people.
(iv) Rule models: used to describe common features shared among a set of rules in a
production system.
 Frames and scripts are used very extensively in a variety of AI programs. Before selecting
any specific knowledge representation structure, the following issues have to be
considered:
(i) The basic properties of objects, if any, which are common to every problem domain
must be identified and handled appropriately.
(ii) The entire knowledge should be represented as a good set of primitives.
(iii) Mechanisms must be devised to access relevant parts in a large knowledge base.
UNIT-3

KNOWLEDGE INFERENCE

The object of a knowledge representation is to express knowledge in a computer-tractable form,
so that it can be used to enable our AI agents to perform well.
A knowledge representation language is defined by two aspects:
1. Syntax. The syntax of a language defines which configurations of the components of
the language constitute valid sentences.
2. Semantics. The semantics defines which facts in the world the sentences refer to, and
hence the statement about the world that each sentence makes.
This is a very general idea, and not restricted to natural language.
Suppose the language is arithmetic. Then 'x', '≥' and 'y' are components (or symbols or words) of
the language; the syntax says that 'x ≥ y' is a valid sentence in the language, but '≥ ≥ x y' is not;
the semantics says that 'x ≥ y' is false if y is bigger than x, and true otherwise. A good knowledge
representation system for any particular domain should possess the following properties:
1. Representational Adequacy – the ability to represent all the different kinds of
knowledge that might be needed in that domain.
2. Inferential Adequacy – the ability to manipulate the representational structures to
derive new structures (corresponding to new knowledge) from existing structures.
3. Inferential Efficiency – the ability to incorporate additional information into the
knowledge structure which can be used to focus the attention of the inference
mechanisms in the most promising directions.
4. Acquisitional Efficiency – the ability to acquire new information easily. Ideally the
agent should be able to control its own knowledge acquisition, but direct insertion of
information by a 'knowledge engineer' would be acceptable.
In practice, the theoretical requirements for good knowledge representations can usually be
achieved by dealing appropriately with a number of practical requirements:
1. The representations need to be complete – so that everything that could possibly need
to be represented can easily be represented.
2. They must be computable – implementable with standard computing procedures.
3. They should make the important objects and relations explicit and accessible – so that
it is easy to see what is going on, and how the various components interact.
4. They should suppress irrelevant detail – so that rarely used details don't introduce
unnecessary complications, but are still available when needed.
5. They should expose any natural constraints – so that it is easy to express how one
object or relation influences another.
6. They should be transparent – so you can easily understand what is being said.
7. The implementation needs to be concise and fast – so that information can be
stored, retrieved and manipulated rapidly.
A knowledge representation formalism consists of collections of condition-action rules
(production rules or operators), a database which is modified in accordance with the rules,
and a production system interpreter which controls the operation of the rules, i.e. the 'control
mechanism' of a production system, determining the order in which production rules are fired. A
system that uses this form of knowledge representation is called a production system.

Production Based System

A production system consists of four basic components:

1. A set of rules of the form Ci → Ai, where Ci is the condition part and Ai is the action
part. The condition determines when a given rule is applied, and the action determines
what happens when it is applied.
2. One or more knowledge databases that contain whatever information is relevant for the
given problem. Some parts of the database may be permanent, while others may be
temporary and only exist during the solution of the current problem. The information in
the databases may be structured in any appropriate manner.
3. A control strategy that determines the order in which the rules are applied to the
database, and provides a way of resolving any conflicts that can arise when several rules
match at once.
4. A rule applier, which is the computational system that implements the control strategy
and applies the rules.
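The four components can be sketched in a few lines of Python. The weather rules below are invented for illustration, and the control strategy is deliberately the simplest possible one: fire the first applicable rule that adds new facts.

```python
# Rules: a condition part Ci (a predicate over the database) paired with
# an action part Ai (a function that returns the updated database).
rules = [
    (lambda db: "raining" in db,    lambda db: db | {"ground-wet"}),
    (lambda db: "ground-wet" in db, lambda db: db | {"slippery"}),
]

def run(database):
    """Rule applier: apply the control strategy until no rule adds facts."""
    while True:
        # Control strategy: scan the rules in order and fire the first one
        # whose condition holds and whose action adds new knowledge.
        for condition, action in rules:
            if condition(database) and action(database) != database:
                database = action(database)
                break
        else:
            return database        # no rule produced anything new: halt

print(sorted(run({"raining"})))    # ['ground-wet', 'raining', 'slippery']
```

A real conflict-resolution step would replace the "first applicable rule" policy, for example by rule priority or by recency of the matched facts.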

Four classes of production systems:

1. A monotonic production system

2. A non-monotonic production system

3. A partially commutative production system

4. A commutative production system

Advantages of production systems:

1. Production systems provide an excellent tool for structuring AI programs.

2. Production systems are highly modular because the individual rules can be added,
removed or modified independently.

3. The production rules are expressed in a natural form, so the statements contained in the
knowledge base should be a recording of an expert thinking out loud.

Disadvantages of production systems:

One important disadvantage is the fact that it may be very difficult to analyse the flow of control
within a production system because the individual rules don't call each other.

Production systems describe the operations that can be performed in a search for a solution to
the problem. They can be classified as follows.
Monotonic production system:
A system in which the application of a rule never prevents the later application of another
rule that could also have been applied at the time the first rule was selected.
Partially commutative production system:
A production system with the property that if the application of a particular sequence of rules
transforms state X into state Y, then any allowable permutation of those rules also transforms
state X into state Y.
Theorem proving falls under monotonic, partially commutative systems. Blocks world and
8-puzzle problems come under non-monotonic, partially commutative systems. Problems like
chemical analysis and synthesis come under not partially commutative systems. Playing the
game of bridge comes under non-monotonic, not partially commutative systems.
For any problem, several production systems exist. Some will be more efficient than others.
Though it may seem that there is no relationship between kinds of problems and kinds of
production systems, in practice there is a definite relationship.
Partially commutative, monotonic production systems are useful for solving ignorable problems.
These systems are important from an implementation standpoint because they can be
implemented without the ability to backtrack to previous states when it is discovered that an
incorrect path was followed. Such systems increase efficiency, since it is not necessary to keep
track of the changes made in the search process.
Non-monotonic, partially commutative systems are useful for problems in which changes occur
but can be reversed and in which the order of operations is not critical (e.g. the 8-puzzle
problem).
Production systems that are not partially commutative are useful for many problems in which
irreversible changes occur, such as chemical analysis. When dealing with such systems, the
order in which operations are performed is very important, and hence correct decisions have to
be made the first time itself.

Frame Based System

A frame is a data structure with typical knowledge about a particular object or concept. Frames
were first proposed by Marvin Minsky in the 1970s.

Example: Boarding pass frames

QANTAS BOARDING PASS          AIR NEW ZEALAND BOARDING PASS
Carrier: QANTAS AIRWAYS       Carrier: AIR NEW ZEALAND
Name: MR N BLACK              Name: MRS J WHITE
Flight: QF612                 Flight: NZ0198
Date: 29DEC                   Date: 23NOV
Seat: 23A                     Seat: 27K
From: HOBART                  From: MELBOURNE

Each frame has its own name and a set of attributes associated with it. Name, weight, height and
age are slots in the frame Person. Model, processor, memory and price are slots in the frame
Computer. Each attribute or slot has a value attached to it.

Frames provide a natural way for the structured and concise representation of knowledge.

A frame provides a means of organising knowledge in slots to describe various attributes and
characteristics of the object.

Frames are an application of object-oriented programming for expert systems.

Object-oriented programming is a programming method that uses objects as a basis for analysis,
design and implementation.

In object-oriented programming, an object is defined as a concept, abstraction or thing with crisp
boundaries and meaning for the problem at hand. All objects have identity and are clearly
distinguishable. Michael Black, Audi 5000 Turbo and IBM Aptiva S35 are examples of objects.

An object combines both a data structure and its behaviour in a single entity. This is in sharp
contrast to conventional programming, in which data structure and program behaviour have
concealed or vague connections.

When an object is created in an object-oriented programming language, we first assign a name to
the object, then determine a set of attributes to describe the object's characteristics, and at last
write procedures to specify the object's behaviour.

A knowledge engineer refers to an object as a frame (the term which has become AI jargon).

Frames as a knowledge representation technique

The concept of a frame is defined by a collection of slots. Each slot describes a particular
attribute or operation of the frame.

Slots are used to store values. A slot may contain a default value or a pointer to another frame, a
set of rules or a procedure by which the slot value is obtained.

Typical information included in a slot:

Frame name.

Relationship of the frame to the other frames. The frame IBM Aptiva S35 might be a member of
the class Computer, which in turn might belong to the class Hardware.

Slot value. A slot value can be symbolic, numeric or Boolean. For example, the slot Name has
symbolic values, and the slot Age numeric values. Slot values can be assigned when the frame is
created or during a session with the expert system.

Default slot value. The default value is taken to be true when no evidence to the contrary has
been found. For example, a car frame might have four wheels and a chair frame four legs as
default values in the corresponding slots.

Range of the slot value. The range of the slot value determines whether a particular object
complies with the stereotype requirements defined by the frame. For example, the cost of a
computer might be specified between $750 and $1500.

Procedural information. A slot can have a procedure attached to it, which is executed if the slot
value is changed or needed.

Most frame-based expert systems use two types of methods:

WHEN CHANGED and WHEN NEEDED.

A WHEN CHANGED method is executed immediately when the value of its attribute changes.
A WHEN NEEDED method is used to obtain the attribute value only when it is needed.

A WHEN NEEDED method is executed when information associated with a particular attribute
is needed for solving the problem, but the attribute value is undetermined.

Most frame-based expert systems allow us to use a set of rules to evaluate information contained
in frames.
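As a sketch, a frame with slots and attached procedures can be written as a small Python class. The Computer frame, the price formula (chosen to stay within the $750–$1500 range mentioned above) and the method names are illustrative assumptions.

```python
# A frame sketch with slots, a WHEN NEEDED method (computes a missing
# slot value on demand) and a WHEN CHANGED demon (fires immediately
# when a slot value changes).

class Frame:
    def __init__(self, name, **slots):
        self.name = name
        self.slots = dict(slots)
        self.when_changed = {}   # slot -> demon run when the value changes
        self.when_needed = {}    # slot -> method run when the value is absent

    def get(self, slot):
        if slot in self.slots:
            return self.slots[slot]
        if slot in self.when_needed:          # WHEN NEEDED: compute on demand
            value = self.when_needed[slot](self)
            self.slots[slot] = value
            return value
        raise KeyError(slot)

    def set(self, slot, value):
        self.slots[slot] = value
        if slot in self.when_changed:         # WHEN CHANGED: fires immediately
            self.when_changed[slot](self, value)

computer = Frame("Computer", model="IBM Aptiva S35", memory_gb=16)
# WHEN NEEDED: derive the price only when it is asked for.
computer.when_needed["price"] = lambda f: 750 + 25 * f.get("memory_gb")
# WHEN CHANGED: keep the price consistent whenever the memory changes.
computer.when_changed["memory_gb"] = (
    lambda f, v: f.slots.update(price=750 + 25 * v))

print(computer.get("price"))   # 1150, computed on demand
computer.set("memory_gb", 30)
print(computer.get("price"))   # 1500, updated by the demon
```

The demon here has the simple trigger-on-change behaviour described below for IF-THEN demons, while the WHEN NEEDED entry plays the role of a method that derives an undetermined attribute value.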
How does an inference engine work in a frame based system?

In a rule-based system, the inference engine links the rules contained in the knowledge base with
the data given in the database.

When the goal is set up, the inference engine searches the knowledge base to find a rule that has
the goal in its consequent.

If such a rule is found and its IF part matches data in the database, the rule is fired and the
specified object, the goal, obtains its value. If no rules are found that can derive a value for the
goal, the system queries the user to supply that value.

In a frame-based system, the inference engine also searches for the goal. But in a frame-based
system, rules play an auxiliary role. Frames represent the major source of knowledge here, and
both methods and demons are used to add actions to the frames.

Thus the goal in a frame-based system can be established either in a method or in a demon.

Difference between methods and demons:

A demon has an IF-THEN structure. It is executed whenever an attribute in the demon's IF
statement changes its value. In this sense, demons and methods are very similar, and the two
terms are often used as synonyms.

However, methods are more appropriate if we need to write complex procedures. Demons, on
the other hand, are usually limited to IF-THEN statements.

Inference

Two control strategies: forward chaining and backward chaining.

Forward chaining:

Working from the facts to a conclusion; sometimes called the data-driven approach. To chain
forward, match data in working memory against the conditions of rules in the rule base. When
one of them fires, this is liable to produce more data, and so the cycle continues.

Backward chaining:

Working from the conclusion to the facts; sometimes called the goal-driven approach.

To chain backward, match a goal in working memory against the conclusions of rules in the rule
base. When one of them fires, this is liable to produce more goals, and so the cycle continues.

The choice of strategy depends on the nature of the problem. Assume the problem is to get from
facts to a goal (e.g. symptoms to a diagnosis).

Backward chaining is the best choice if:

The goal is given in the problem statement, or can sensibly be guessed at the beginning of the
consultation; or:

The system has been built so that it sometimes asks for pieces of data (e.g. "please now do the
gram test on the patient's blood, and tell me the result"), rather than expecting all the facts to be
presented to it.

This is because (especially in the medical domain) a test may be expensive, unpleasant or
dangerous for the human participant, so one would want to avoid doing such a test unless there
was a good reason for it.

Forward chaining is the best choice if:

All the facts are provided with the problem statement; or:

There are many possible goals, and a smaller number of patterns of data; or:

There isn't any sensible way to guess what the goal is at the beginning of the consultation.

Note also that a backward-chaining system tends to produce a sequence of questions which
seems focussed and logical to the user, while a forward-chaining system tends to produce a
sequence which seems random and unconnected.

If it is important that the system should seem to behave like a human expert, backward chaining
is probably the best choice.

Forward Chaining Algorithm

Forward chaining is a technique for drawing inferences from a rule base. Forward-chaining
inference is often called data-driven.

‡ The algorithm proceeds from a given situation to a desired goal, adding new assertions (facts)
as they are found.

‡ A forward-chaining system compares data in the working memory against the conditions in the
IF parts of the rules and determines which rule to fire.

‡ Data driven.

Example: Forward Chaining

■ Given: A rule base contains the following rule set:

Rule 1: If A and C Then F

Rule 2: If A and E Then G

Rule 3: If B Then E

Rule 4: If G Then D

■ Problem: Prove that if A and B are true, then D is true.

Solution:

(i) Start with the given inputs A and B true; start at Rule 1 and go forward/down until a rule
"fires".

First iteration:

(ii) Rule 3 fires: conclusion E is true; new knowledge found.

(iii) No other rule fires; end of first iteration.

(iv) Goal not found; new knowledge was found at (ii); go for second iteration.

Second iteration:

(v) Rule 2 fires: conclusion G is true; new knowledge found.

(vi) Rule 4 fires: conclusion D is true.

Goal found.

Proved
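The iterations above can be traced with a short data-driven loop in Python; encoding each rule as an (antecedents, consequent) pair is just one possible sketch.

```python
# The four rules from the example, fired repeatedly until the goal D is
# derived or a full pass adds no new facts.
rules = [
    ({"A", "C"}, "F"),   # Rule 1
    ({"A", "E"}, "G"),   # Rule 2
    ({"B"},      "E"),   # Rule 3
    ({"G"},      "D"),   # Rule 4
]

def forward_chain(facts, goal):
    facts = set(facts)
    while goal not in facts:
        fired = False
        for antecedents, consequent in rules:
            # A rule fires when its IF part matches working memory
            # and its conclusion is new knowledge.
            if antecedents <= facts and consequent not in facts:
                facts.add(consequent)
                fired = True
        if not fired:
            return False      # no progress: the goal cannot be derived
    return True

print(forward_chain({"A", "B"}, "D"))   # True: E, then G, then D
```

Starting from {A, B}, Rule 3 adds E, Rule 2 then adds G, and Rule 4 adds D, matching the two iterations of the hand trace.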

Backward Chaining Algorithm

Backward chaining is a technique for drawing inferences from a rule base. Backward-chaining inference is often called goal driven.

‡ The algorithm proceeds from the desired goal, adding new assertions as they are found.

‡ A backward-chaining system looks for the action in the THEN clause of the rules that matches the specified goal.

‡ Goal Driven

Example: Backward Chaining

■ Given: A rule base contains the following rule set

Rule 1: If A and C Then F

Rule 2: If A and E Then G

Rule 3: If B Then E

Rule 4: If G Then D

■ Problem: Prove

If A and B true Then D is true

Solution :

(i) Start with the goal, i.e. D is true;

go backward/up till a rule that "fires" is found.

First iteration :

(ii) Rule 4 fires :

new subgoal: prove G is true

go backward

(iii) Rule 2 "fires"; conclusion: A is true

new subgoal: prove E is true

go backward;

(iv) no other rule fires; end of first iteration.

new subgoal found at (iii);

go for second iteration

Second iteration :

(v) Rule 3 fires :

conclusion B is true (2nd input found)

both inputs A and B ascertained

Proved
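The same rule set can be traced goal-driven with a minimal recursive sketch. The `prove` helper is hypothetical (not from any standard shell); it reduces a goal to the premises of a rule whose THEN part matches it.

```python
# Goal-driven sketch of the same rule set used in the example above.
rules = [
    ({"A", "C"}, "F"),   # Rule 1
    ({"A", "E"}, "G"),   # Rule 2
    ({"B"}, "E"),        # Rule 3
    ({"G"}, "D"),        # Rule 4
]
known = {"A", "B"}       # the given facts

def prove(goal, seen=frozenset()):
    if goal in known:
        return True
    if goal in seen:     # guard against circular rule chains
        return False
    # Look for a rule whose THEN part matches the goal, then try to
    # prove every premise as a new subgoal.
    for premises, conclusion in rules:
        if conclusion == goal:
            if all(prove(p, seen | {goal}) for p in premises):
                return True
    return False

print(prove("D"))   # True: D <- G <- (A, E), and E <- B
```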

FuzzyLogic

Fuzzy Logic (FL) is a method of reasoning that resembles human reasoning. The approach of FL imitates the way of decision making in humans, which involves all intermediate possibilities between the digital values YES and NO.

The conventional logic block that a computer can understand takes precise input and produces a definite output as TRUE or FALSE, which is equivalent to a human's YES or NO.

The inventor of fuzzy logic, Lotfi Zadeh, observed that unlike computers, human decision making includes a range of possibilities between YES and NO, such as −

CERTAINLYYES
POSSIBLY YES
CANNOTSAY
POSSIBLY NO
CERTAINLYNO

Fuzzy logic works on the levels of possibilities of input to achieve the definite output.

Implementation

 It can be implemented in systems with various sizes and capabilities, ranging from small micro-controllers to large, networked, workstation-based control systems.
 It can be implemented in hardware, software, or a combination of both.

Why Fuzzy Logic?

Fuzzy logic is useful for commercial and practical purposes.

 It can control machines and consumer products.
 It may not give accurate reasoning, but acceptable reasoning.
 Fuzzy logic helps to deal with the uncertainty in engineering.
Fuzzy Logic Systems Architecture

It has four main parts as shown −

 Fuzzification Module − It transforms the system inputs, which are crisp numbers, into fuzzy sets.

It splits the input signal into five steps such as −

LP  x is Large Positive

MP  x is Medium Positive

S   x is Small

MN  x is Medium Negative

LN  x is Large Negative

 Knowledge Base − It stores the IF-THEN rules provided by experts.
 Inference Engine − It simulates the human reasoning process by making fuzzy inference on the inputs and IF-THEN rules.
 Defuzzification Module − It transforms the fuzzy set obtained by the inference engine into a crisp value.

The membership functions work on fuzzy sets of variables.

Membership Function

Membership functions allow you to quantify a linguistic term and represent a fuzzy set graphically. A membership function for a fuzzy set A on the universe of discourse X is defined as µA: X → [0, 1].

Here, each element of X is mapped to a value between 0 and 1. It is called the membership value or degree of membership. It quantifies the degree of membership of the element in X to the fuzzy set A.

 The x axis represents the universe of discourse.
 The y axis represents the degrees of membership in the [0, 1] interval.

There can be multiple membership functions applicable to fuzzify a numerical value. Simple membership functions are used, as the use of complex functions does not add more precision to the output.

All membership functions for LP, MP, S, MN, and LN are shown as below −

The triangular membership function shapes are the most common among various other membership function shapes such as trapezoidal, singleton, and Gaussian. Here, the input to the 5-level fuzzifier varies from -10 volts to +10 volts. Hence the corresponding output also changes.
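A triangular membership function of the kind described here can be sketched as follows. The breakpoints are illustrative assumptions; the omitted figure defines the actual shapes for LP, MP, S, MN, and LN.

```python
# Triangular membership function sketch on the universe [-10, +10] volts.
# The feet/peak values used below are illustrative, not from the figure.
def triangular(x, a, b, c):
    """Degree of membership in [0, 1] for a triangle with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

print(triangular(0, -5, 0, 5))     # 1.0 - fully "Small"
print(triangular(2.5, -5, 0, 5))   # 0.5 - partly "Small" ...
print(triangular(2.5, 0, 5, 10))   # 0.5 - ... and partly "Medium Positive"
```

Note that one crisp input (2.5 volts) belongs to two fuzzy sets at once, which is exactly what lets several rules fire in parallel.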

Example of a Fuzzy Logic System

Let us consider an air conditioning system with a 5-level fuzzy logic system. This system adjusts the temperature of the air conditioner by comparing the room temperature and the target temperature value.

Algorithm
 Define linguistic variables and terms.
 Construct membership functions for them.
 Construct a knowledge base of rules.
 Convert crisp data into fuzzy data sets using membership functions. (fuzzification)
 Evaluate rules in the rule base. (inference engine)
 Combine results from each rule. (inference engine)
 Convert output data into non-fuzzy values. (defuzzification)

Logic Development

Step 1: Define linguistic variables and terms

Linguistic variables are input and output variables in the form of simple words or sentences. For room temperature, cold, warm, hot, etc., are linguistic terms.

Temperature (t) = {very-cold, cold, warm, very-warm, hot}

Every member of this set is a linguistic term and it can cover some portion of the overall temperature values.

Step 2: Construct membership functions for them

The membership functions of the temperature variable are as shown −

Step 3: Construct knowledge base rules

Create a matrix of room temperature values versus target temperature values that an air conditioning system is expected to provide.

Room Temp. / Target   Very_Cold   Cold        Warm        Hot         Very_Hot
Very_Cold             No_Change   Heat        Heat        Heat        Heat
Cold                  Cool        No_Change   Heat        Heat        Heat
Warm                  Cool        Cool        No_Change   Heat        Heat
Hot                   Cool        Cool        Cool        No_Change   Heat
Very_Hot              Cool        Cool        Cool        Cool        No_Change

Build a set of rules into the knowledge base in the form of IF-THEN-ELSE structures.

Sr. No.   Condition                                                        Action

1         IF temperature = (Cold OR Very_Cold) AND target = Warm THEN     Heat
2         IF temperature = (Hot OR Very_Hot) AND target = Warm THEN       Cool
3         IF (temperature = Warm) AND (target = Warm) THEN                No_Change

Step 4: Obtain fuzzy value

Fuzzy set operations perform the evaluation of the rules. The operations used for OR and AND are Max and Min respectively. Combine all results of evaluation to form a final result. This result is a fuzzy value.

Step 5: Perform defuzzification

Defuzzification is then performed according to the membership function for the output variable.
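Steps 4 and 5 can be illustrated with a common weighted-average (centroid-style) defuzzification. The rule strengths and output levels below are invented for illustration; they do not come from the air-conditioner tables above.

```python
# Weighted-average defuzzification sketch. Each fired rule contributes its
# firing strength and a crisp output level (illustrative values only).
fired = [
    (0.6, -5.0),   # e.g. "Cool" fired with strength 0.6, output level -5
    (0.3,  0.0),   # "No_Change" fired with strength 0.3
    (0.1,  5.0),   # "Heat" fired with strength 0.1
]

def defuzzify(rule_outputs):
    """Combine (strength, level) pairs into one crisp output value."""
    num = sum(strength * level for strength, level in rule_outputs)
    den = sum(strength for strength, _ in rule_outputs)
    return num / den

print(defuzzify(fired))   # about -2.5: the combined output leans toward cooling
```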
Application Areas of Fuzzy Logic

The key application areas of fuzzy logic are as given −

Automotive Systems

 Automatic Gearboxes
 Four-Wheel Steering
 Vehicle environment control

Consumer Electronic Goods

 Hi-Fi Systems
 Photocopiers
 Still and Video Cameras
 Television

Domestic Goods

 Microwave Ovens
 Refrigerators
 Toasters
 Vacuum Cleaners
 Washing Machines

Environment Control

 Air Conditioners/Dryers/Heaters
 Humidifiers

Advantages of FLSs

 Mathematical concepts within fuzzy reasoning are very simple.
 You can modify an FLS by just adding or deleting rules, due to the flexibility of fuzzy logic.
 Fuzzy logic systems can take imprecise, distorted, noisy input information.
 FLSs are easy to construct and understand.
 Fuzzy logic is a solution to complex problems in all fields of life, including medicine, as it resembles human reasoning and decision making.

Disadvantages of FLSs

 There is no systematic approach to fuzzy system designing.
 They are understandable only when simple.
 They are suitable for problems which do not need high accuracy.

Certainty Factor

A certainty factor (CF) is a numerical value that expresses a degree of subjective belief that a particular item is true. The item may be a fact or a rule. When probabilities are used, attention must be paid to the underlying assumptions and probability distributions in order to show validity. Bayes' rule can be used to combine probability measures.

Suppose that a certainty is defined to be a real number between -1.0 and +1.0, where 1.0 represents complete certainty that an item is true and -1.0 represents complete certainty that an item is false. Here a CF of 0.0 indicates that no information is available about either the truth or the falsity of an item. Hence positive values indicate a degree of belief or evidence that an item is true, and negative values indicate the opposite belief. Moreover, it is common to select a positive number that represents a minimum threshold of belief in the truth of an item. For example, 0.2 is a commonly chosen threshold value.

Form of certainty factors in ES

IF <evidence>
THEN <hypothesis> {cf}

cf represents belief in hypothesis H given that evidence E has occurred.

It is based on 2 functions:

i) Measure of belief MB(H, E)
ii) Measure of disbelief MD(H, E)

These indicate the degree to which belief/disbelief of hypothesis H is increased if evidence E were observed.

Total strength of belief and disbelief in a hypothesis:

CF(H, E) = MB(H, E) − MD(H, E)
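Assuming the omitted formula is the standard CF(H, E) = MB(H, E) − MD(H, E), together with the usual MYCIN-style rule for combining two positive CFs, a small sketch:

```python
# Certainty-factor sketch: CF = MB - MD, plus the usual MYCIN-style rule
# for combining two positive CFs for the same hypothesis. Both formulas
# are stated here as assumptions, since the notes' formula is omitted.
def certainty_factor(mb, md):
    """Total strength of belief minus disbelief, in [-1, 1]."""
    return mb - md

def combine_positive(cf1, cf2):
    """Combine two positive CFs from independent pieces of evidence."""
    return cf1 + cf2 * (1 - cf1)

cf = certainty_factor(mb=0.8, md=0.2)   # net belief of about 0.6 in H
print(cf)
print(combine_positive(0.6, 0.5))       # about 0.8: new evidence reinforces belief
```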

Bayesian networks

 Represent dependencies among random variables
 Give a short specification of the conditional probability distribution
 Many random variables are conditionally independent
 Simplifies computations
 Graphical representation
 DAG – causal relationships among random variables
 Allows inferences based on the network structure

Definition of Bayesian networks

A BN is a DAG in which each node is annotated with quantitative probability information, namely:

 Nodes represent random variables (discrete or continuous)

 Directed links X → Y: X has a direct influence on Y; X is said to be a parent of Y

 Each node X has an associated conditional probability table, P(Xi | Parents(Xi)), that quantifies the effects of the parents on the node

Example: Weather, Cavity, Toothache, Catch
 Cavity → Toothache, Cavity → Catch; Weather is independent of the other variables

Example

Bayesian network semantics

A) Represents a probability distribution

B) Specifies conditional independence – build the network

A) Each value of the probability distribution can be computed as:

P(X1 = x1 ∧ … ∧ Xn = xn) = P(x1, …, xn) = ∏i=1..n P(xi | Parents(Xi))

where Parents(xi) represent the specific values of Parents(Xi)

Building the network

P(X1 = x1 ∧ … ∧ Xn = xn) = P(x1, …, xn)

= P(xn | xn-1, …, x1) * P(xn-1, …, x1) = … =

= P(xn | xn-1, …, x1) * P(xn-1 | xn-2, …, x1) * … * P(x2 | x1) * P(x1) = ∏i=1..n P(xi | xi-1, …, x1)

 We can see that P(Xi | Xi-1, …, X1) = P(Xi | Parents(Xi)) if Parents(Xi) ⊆ {Xi-1, …, X1}

 The condition may be satisfied by labeling the nodes in an order consistent with a DAG

 Intuitively, the parents of a node Xi must be all the nodes Xi-1, …, X1 which have a direct influence on Xi.

 Pick a set of random variables that describe the problem

 Pick an ordering of those variables

 While there are still variables, repeat:

(a) choose a variable Xi and add a node associated to Xi

(b) assign Parents(Xi) a minimal set of nodes that already exist in the network such that the conditional independence property is satisfied

(c) define the conditional probability table for Xi

 Because each node is linked only to previous nodes → DAG

 P(MaryCalls | JohnCalls, Alarm, Burglary, Earthquake) = P(MaryCalls | Alarm)

Compactness and node ordering

 Far more compact than a full probability distribution

 Example of a locally structured (or sparse) system: each component interacts directly only with a limited number of other components

 Associated usually with a linear growth in complexity rather than with an exponential one

 The order of adding the nodes is important

 The correct order in which to add nodes is to add the "root causes" first, then the variables they influence, and so on, until we reach the leaves

Probabilistic Inferences

P(J ∧ M ∧ A ∧ ¬B ∧ ¬E) =

P(J|A) * P(M|A) * P(A|¬B ∧ ¬E) * P(¬B) * P(¬E) = 0.9 * 0.7 * 0.001 * 0.999 * 0.998 = 0.00062

P(A|B) = P(A|B,E) * P(E|B) + P(A|B,¬E) * P(¬E|B) = P(A|B,E) * P(E) + P(A|B,¬E) * P(¬E)

= 0.95 * 0.002 + 0.94 * 0.998 = 0.94002
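The two computations above can be checked numerically; the CPT entries are the ones appearing in the formulas (the burglary-alarm network).

```python
# Numerical check of the two inference computations above.
p_j_given_a   = 0.9     # P(JohnCalls | Alarm)
p_m_given_a   = 0.7     # P(MaryCalls | Alarm)
p_a_given_nbe = 0.001   # P(Alarm | no Burglary, no Earthquake)
p_not_b       = 0.999   # P(no Burglary)
p_not_e       = 0.998   # P(no Earthquake)

joint = p_j_given_a * p_m_given_a * p_a_given_nbe * p_not_b * p_not_e
print(round(joint, 5))         # 0.00063 (the text truncates this to 0.00062)

# P(Alarm | Burglary): sum out Earthquake, which is independent of Burglary
p_a_given_b = 0.95 * 0.002 + 0.94 * 0.998
print(round(p_a_given_b, 5))   # 0.94002
```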

Different types of inferences

Diagnostic inferences (effect → cause)

P(Burglary | JohnCalls)

Causal inferences (cause → effect)

P(JohnCalls | Burglary),

P(MaryCalls | Burglary)

Intercausal inferences (between causes of a common effect)

P(Burglary | Alarm ∧ Earthquake)

Mixed inferences

P(Alarm | JohnCalls ∧ ¬Earthquake)  diagnostic + causal

P(Burglary | JohnCalls ∧ ¬Earthquake)  diagnostic + intercausal

Dempster-Shafer Theory

 Dempster-Shafer theory is an approach to combining evidence

 Dempster (1967) developed means for combining degrees of belief derived from independent items of evidence.

 His student, Glenn Shafer (1976), developed a method for obtaining degrees of belief for one question from subjective probabilities for a related question

 People working in Expert Systems in the 1980s saw their approach as ideally suitable for such systems.

 Each fact has a degree of support, between 0 and 1:

 0 – No support for the fact

 1 – Full support for the fact

 Differs from the Bayesian approach in that:

 Belief in a fact and its negation need not sum to 1.

 Both values can be 0 (meaning no evidence for or against the fact)

Set of possible conclusions: Θ

Θ = { θ1, θ2, …, θn }

Where:

 Θ is the set of possible conclusions to be drawn

 Each θi is mutually exclusive: at most one can be true.

 Θ is exhaustive: at least one θi has to be true.

Frame of discernment

Θ = { θ1, θ2, …, θn }

 Bayes was concerned with evidence that supported single conclusions (e.g., evidence for each outcome θi in Θ):

 p(θi | E)

 D-S theory is concerned with evidences which support

 subsets of outcomes in Θ, e.g., θ1 ∨ θ2 ∨ θ3, or {θ1, θ2, θ3}

 The "frame of discernment" (or "power set") of Θ is the set of all possible subsets of Θ:

E.g., if Θ = {θ1, θ2, θ3}

 Then the frame of discernment of Θ is:

(Ø, θ1, θ2, θ3, {θ1, θ2}, {θ1, θ3}, {θ2, θ3}, {θ1, θ2, θ3})

 Ø, the empty set, has a probability of 0, since one of the outcomes has to be true.

 Each of the other elements in the power set has a probability between 0 and 1.

 The probability of {θ1, θ2, θ3} is 1.0 since one has to be true.

Mass function m(A):

 (where A is a member of the power set) = proportion of all evidence that supports this element of the power set.

 "The mass m(A) of a given member of the power set, A, expresses the proportion of all relevant and available evidence that supports the claim that the actual state belongs to A but to no particular subset of A."

 "The value of m(A) pertains only to the set A and makes no additional claims about any subsets of A, each of which has, by definition, its own mass."

 Each m(A) is between 0 and 1.

 All m(A) sum to 1.

 m(Ø) is 0 – at least one outcome must be true.

Interpretation of m({A v B}) = 0.3

 Means there is evidence for {A v B} that cannot be divided among more specific beliefs for A or B.

Example

 4 people (B, J, S and K) are locked in a room when the lights go out.

 When the lights come on, K is dead, stabbed with a knife.

 Not suicide (stabbed in the back)

 No-one entered the room.

 Assume only one killer.

 Θ = { B, J, S }

 P(Θ) = (Ø, {B}, {J}, {S}, {B,J}, {B,S}, {J,S}, {B,J,S})

 Detectives, after reviewing the crime-scene, assign mass probabilities to various elements of the power set:

Belief in A:

The belief in an element A of the power set is the sum of the masses of elements which are subsets of A (including A itself).

E.g., given A = {θ1, θ2, θ3}:

Bel(A) = m(θ1) + m(θ2) + m(θ3) + m({θ1, θ2}) + m({θ2, θ3}) + m({θ1, θ3}) + m({θ1, θ2, θ3})

Example

 Given the mass assignments as assigned by the detectives:

 bel({B}) = m({B}) = 0.1

 bel({B,J}) = m({B}) + m({J}) + m({B,J}) = 0.1 + 0.2 + 0.1 = 0.4

 Result:

Plausibility of A: pl(A)

The plausibility of an element A, pl(A), is the sum of all the masses of the sets that intersect with the set A:

E.g. pl({B,J}) = m(B) + m(J) + m(B,J) + m(B,S) + m(J,S) + m(B,J,S) = 0.9

All Results:

Disbelief (or Doubt) in A: dis(A)

The disbelief in A is simply bel(¬A).

It is calculated by summing all masses of elements which do not intersect with A.

The plausibility of A is thus 1 - dis(A):

pl(A) = 1 - dis(A)

Belief Interval of A:

The certainty associated with a given subset A is defined by the belief interval:

[ bel(A), pl(A) ]

E.g. the belief interval of {B,S} is: [0.1, 0.8]
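bel and pl can be computed mechanically from a mass assignment. The masses below are illustrative: they agree with the quoted values m({B}) = 0.1, m({J}) = 0.2, m({B,J}) = 0.1 and sum to 1, but the detectives' full table appears only in the omitted figure.

```python
# Belief and plausibility from a mass assignment (illustrative masses).
m = {
    frozenset("B"): 0.1, frozenset("J"): 0.2, frozenset("S"): 0.1,
    frozenset("BJ"): 0.1, frozenset("BS"): 0.1,
    frozenset("JS"): 0.3, frozenset("BJS"): 0.1,
}

def bel(a):
    """Sum of the masses of all subsets of A (including A itself)."""
    return sum(mass for s, mass in m.items() if s <= a)

def pl(a):
    """Sum of the masses of all sets that intersect A."""
    return sum(mass for s, mass in m.items() if s & a)

bj = frozenset("BJ")
print(round(bel(bj), 2), round(pl(bj), 2))   # 0.4 0.9, as in the example
```

The identity pl(A) = 1 - dis(A) holds here: the only set missed by pl({B,J}) is {S}, whose mass is exactly the disbelief in {B,J}.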


Belief Intervals & Probability

The probability of A falls somewhere between bel(A) and pl(A).

 bel(A) represents the evidence we have for A directly, so prob(A) cannot be less than this value.

 pl(A) represents the maximum share of the evidence we could possibly have, if, for all sets that intersect with A, the part that intersects is actually valid. So pl(A) is the maximum possible value of prob(A).

Belief Intervals:

Belief intervals allow Dempster-Shafer theory to reason about the degree of certainty or uncertainty of our beliefs.

 A small difference between belief and plausibility shows that we are certain about our belief.

 A large difference shows that we are uncertain about our belief.

However, even with a 0-width interval, this does not mean we know which conclusion is right - just how probable it is!
UNIT 4

PLANNING AND MACHINE LEARNING

Planning With State Space Search

The agent first generates a goal to achieve and then constructs a plan to achieve it from the current state.

Problem Solving To Planning

Representation Using Problem Solving Approach

 Forward search

 Backward search

 Heuristic search

Representation Using Planning Approach

 STRIPS - Stanford Research Institute Problem Solver.

 Representation for states and goals

 Representation for plans

 Situation space and plan space

 Solutions

Why Planning?

Intelligent agents must operate in the world. They are not simply passive reasoners (Knowledge Representation, reasoning under uncertainty) or problem solvers (Search); they must also act on the world.

We want intelligent agents to act in "intelligent ways": taking purposeful actions, predicting the expected effect of such actions, composing actions together to achieve complex goals. E.g. if we have a robot, we want the robot to decide what to do and how to act to achieve our goals.

Planning Problem

How to change the world to suit our needs.

Critical issue: we need to reason about what the world will be like after doing a few actions, not just what it is like now.

GOAL: Craig has coffee

CURRENTLY: robot in mailroom, has no coffee, coffee not made, Craig in office etc.

TO DO: go to lounge, make coffee

Partial Order Plan

 A partially ordered collection of steps

o Start step has the initial state description as its effect

o Finish step has the goal description as its precondition

o Causal links from outcome of one step to precondition of another step

o Temporal ordering between pairs of steps

 An open condition is a precondition of a step not yet causally linked

 A plan is complete if every precondition is achieved

 A precondition is achieved if it is the effect of an earlier step and no possibly intervening step undoes it

Start

Right Sock

Right Shoe

Left Sock

Left Shoe

Finish

Partial Order Plan Algorithm
Stanford Research Institute Problem Solver (STRIPS)

STRIPS is a classical planning language, representing plan components as states, goals, and actions, allowing algorithms to parse the logical structure of the planning problem to provide a solution.

In STRIPS, state is represented as a conjunction of positive literals. Positive literals may be propositional literals (e.g., Big ^ Tall) or first-order literals (e.g., At(Billy, Desk)). The positive literals must be grounded – they may not contain a variable (e.g., At(x, Desk)) – and must be function-free – they may not invoke a function to calculate a value (e.g., At(Father(Billy), Desk)). Any state conditions that are not mentioned are assumed false.

The goal is also represented as a conjunction of positive, ground literals. A state satisfies a goal if the state contains all of the conjoined literals in the goal; e.g., Stacked ^ Ordered ^ Purchased satisfies Ordered ^ Stacked.

Actions (or operators) are defined by action schemas, each consisting of three parts:

 The action name and any parameters.

 Preconditions which must hold before the action can be executed. Preconditions are represented as a conjunction of function-free, positive literals. Any variables in a precondition must appear in the action's parameter list.

 Effects which describe how the state of the environment changes when the action is executed. Effects are represented as a conjunction of function-free literals. Any variables in an effect must appear in the action's parameter list. Any world state not explicitly impacted by the action schema's effect is assumed to remain unchanged.

The following, simple action schema describes the action of moving a box from location x to location y:

Action: MoveBox(x, y)
Precond: BoxAt(x)
Effect: BoxAt(y), ¬BoxAt(x)

If an action is applied, but the current state of the system does not meet the necessary preconditions, then the action has no effect. But if an action is successfully applied, then any positive literals in the effect are added to the current state of the world; correspondingly, any negative literals in the effect result in the removal of the corresponding positive literals from the state of the world.

For example, in the action schema above, the effect would result in the proposition BoxAt(y) being added to the known state of the world, while BoxAt(x) would be removed from the known state of the world. (Recall that state only includes positive literals, so a negation effect results in the removal of positive literals.) Note also that positive effects cannot get duplicated in state; likewise, a negation of a proposition that is not currently in state is simply ignored. For example, if Open(x) was not previously part of the state, ¬Open(x) would have no effect.
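This add/delete semantics can be sketched directly. State is a set of ground, positive literals; the literal strings below are just illustrative.

```python
# STRIPS-style action application sketch for the MoveBox schema above.
def apply_action(state, precond, add_list, del_list):
    """Return the successor state; an inapplicable action has no effect."""
    if not precond <= state:
        return state                      # preconditions not met
    return (state - del_list) | add_list  # remove negated literals, add positives

state = {"BoxAt(A)"}
# MoveBox(A, B): Precond BoxAt(A); Effect BoxAt(B), not BoxAt(A)
state = apply_action(state, precond={"BoxAt(A)"},
                     add_list={"BoxAt(B)"}, del_list={"BoxAt(A)"})
print(state)   # {'BoxAt(B)'}
```

Applying the same action again has no effect, since BoxAt(A) no longer holds.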

A STRIPS problem includes the complete (but relevant) initial state of the world, the goal state(s), and action schemas. A STRIPS algorithm should then be able to accept such a problem, returning a solution. The solution is simply an action sequence that, when applied to the initial state, results in a state which satisfies the goal.

STRIPS Planning Algorithm

As previously referenced, STRIPS began as an automated planning algorithm, and the name has a double meaning, describing both the language (described above) used to provide input to that algorithm and the algorithm itself. While the algorithm does not scale well to real-world problems, it, like the language, serves as a foundational starting point for developing and understanding more powerful automated planning algorithms. The STRIPS algorithm [3] is found below, followed by a brief commentary:

STRIPS(A, s, g)
  p = empty plan
  loop ...
    if s satisfies g then return p
    a = [an applicable action in A, relevant for g]
    if a = null, then return failure
    p' = STRIPS(A, s, precond(a))
    if p' = failure, then return failure
    s = apply p' to s
    s = apply a to s
    p = p + p' + a

In the above STRIPS algorithm, A represents all of the possible, grounded actions (i.e., action schemas with variables replaced with values), while s is the current state, and g is the goal state.

Critically important to note is that this algorithm is a backward-search algorithm. In other words, the goal state of the planning problem is fed into the algorithm as s, while the initial state of the planning problem is provided as g. The returned solution, if any, simply needs to be reversed to provide the solution to the planning problem. Simply put, it works backwards from the goal to figure out a logical path to get there.

Explanation

Most expert systems have explanation facilities that allow the user to ask questions - why and how it reached some conclusion.

The questions are answered by referring to the system goals, the rules being used, and the existing problem solving. The rules typically reflect empirical or "compiled" knowledge. They are codings of an expert's rules of thumb, not the expert's deeper understanding.
Example:

Dialogwithanexpertsystemdesignedtogiveadviceoncarproblems.

System Does the car start?

User No.

System Doestheengineturnover?

User yes

System Doyousmell gas?

User yes

System Recommendation-wait5minutesandtryagain.

User Why?

System Iused therule:

Ifnotstart,andengine_turn_over,andsmell_gas

Thenrecommendationis„wait5minutesandtryagain‟
User

Note: The rule gives the correct advice for a flooded car. And knows the questions to be ask to
determineifthecarisflooded,butitdoesnotcontaintheknowledgeofwhatafloodedcarisand why
waiting will help.

Types of Explanation

There are four types of explanations commonly used in expert systems.

 Rule trace reports on the progress of a consultation;

 Explanation of how the system reached the given conclusion;

 Explanation of why the system did not give any conclusion;

 Explanation of why the system is asking a question.

Learning

Machine Learning

 Unlike humans, who learn from past experiences, a computer does not have "experiences".

 A computer system learns from data, which represent some "past experiences" of an application domain.

 Objective of machine learning: learn a target function that can be used to predict the values of a discrete class attribute, e.g., approved or not-approved, and high-risk or low-risk.

 The task is commonly called: supervised learning, classification, or inductive learning.

Supervised Learning

Supervised learning is a machine learning technique for learning a function from training data. The training data consist of pairs of input objects (typically vectors) and desired outputs. The output of the function can be a continuous value (called regression), or can predict a class label of the input object (called classification). The task of the supervised learner is to predict the value of the function for any valid input object after having seen a number of training examples (i.e. pairs of input and target output). To achieve this, the learner has to generalize from the presented data to unseen situations in a "reasonable" way.

Another term for supervised learning is classification. Classifier performance depends greatly on the characteristics of the data to be classified. There is no single classifier that works best on all given problems. Determining a suitable classifier for a given problem is, however, still more an art than a science. The most widely used classifiers are the Neural Network (Multi-layer Perceptron), Support Vector Machines, k-Nearest Neighbors, Gaussian Mixture Model, Gaussian, Naive Bayes, Decision Tree and RBF classifiers.

Supervised learning process: two steps

 Learning (training): Learn a model using the training data

 Testing: Test the model using unseen test data to assess the model accuracy

Accuracy = Number of correct classifications / Total number of test cases

Supervised vs. unsupervised Learning

 Supervised learning:

classification is seen as supervised learning from examples.

 Supervision: The data (observations, measurements, etc.) are labeled with pre-defined classes. It is as if a "teacher" gives the classes (supervision).

 Test data are classified into these classes too.

 Unsupervised learning (clustering)

 Class labels of the data are unknown

 Given a set of data, the task is to establish the existence of classes or clusters in the data

Decision Tree

 A decision tree takes as input an object or situation described by a set of attributes and returns a "decision" – the predicted output value for the input.

 A decision tree reaches its decision by performing a sequence of tests.

Example: "HOW TO" manuals (for car repair)

A decision tree reaches its decision by performing a sequence of tests. Each internal node in the tree corresponds to a test of the value of one of the properties, and the branches from the node are labeled with the possible values of the test. Each leaf node in the tree specifies the value to be returned if that leaf is reached. The decision tree representation seems to be very natural for humans; indeed, many "How To" manuals (e.g., for car repair) are written entirely as a single decision tree stretching over hundreds of pages.

A somewhat simpler example is provided by the problem of whether to wait for a table at a restaurant. The aim here is to learn a definition for the goal predicate WillWait. In setting this up as a learning problem, we first have to state what attributes are available to describe examples in the domain. Later we will see how to automate this task; for now, let's suppose we decide on the following list of attributes:

1. Alternate: whether there is a suitable alternative restaurant nearby.

2. Bar: whether the restaurant has a comfortable bar area to wait in.

3. Fri/Sat: true on Fridays and Saturdays.

4. Hungry: whether we are hungry.

5. Patrons: how many people are in the restaurant (values are None, Some, and Full).

6. Price: the restaurant's price range ($, $$, $$$).

7. Raining: whether it is raining outside.

8. Reservation: whether we made a reservation.

9. Type: the kind of restaurant (French, Italian, Thai, or burger).

10. WaitEstimate: the wait estimated by the host (0-10 minutes, 10-30, 30-60, >60).


Decision tree induction from examples

An example for a Boolean decision tree consists of a vector of input attributes, X, and a single Boolean output value y. A set of examples (X1, y1), …, (Xn, yn) is shown in Figure. The positive examples are the ones in which the goal WillWait is true (X1, X3, …); the negative examples are the ones in which it is false (X2, X5, …). The complete set of examples is called the training set.

Decision Tree Algorithm

The basic idea behind the Decision-Tree-Learning algorithm is to test the most important attribute first. By "most important," we mean the one that makes the most difference to the classification of an example. That way, we hope to get to the correct classification with a small number of tests, meaning that all paths in the tree will be short and the tree as a whole will be small.
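"Most important" is typically measured by information gain, the reduction in entropy achieved by splitting on an attribute. A minimal sketch using the Patrons attribute on the 12-example restaurant training set (the group counts follow the standard version of this example):

```python
from math import log2

# Information gain sketch for choosing the most important attribute first.
def entropy(pos, neg):
    """Entropy in bits of a split with pos positive and neg negative examples."""
    total = pos + neg
    h = 0.0
    for c in (pos, neg):
        if c:
            p = c / total
            h -= p * log2(p)
    return h

base = entropy(6, 6)    # 6 positive, 6 negative examples: 1.0 bit
# Patrons splits the examples into None (0+, 2-), Some (4+, 0-), Full (2+, 4-)
split = (2/12) * entropy(0, 2) + (4/12) * entropy(4, 0) + (6/12) * entropy(2, 4)
print(round(base - split, 3))   # 0.541 bits: Patrons is a highly informative test
```

An attribute like Type, whose branches keep the same positive/negative mix, would have a gain near zero, which is why the algorithm tests Patrons first.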
Reinforcement Learning

 Learning what to do to maximize reward

 Learner is not given training

 Only feedback is in terms of reward

 Try things out and see what the reward is

 Different from Supervised Learning

 Teacher gives training examples

Examples

 Robotics: Quadruped Gait Control, Ball Acquisition (Robocup)

 Control: Helicopters

 Operations Research: Pricing, Routing, Scheduling

 Game Playing: Backgammon, Solitaire, Chess, Checkers

 Human Computer Interaction: Spoken Dialogue Systems

 Economics/Finance: Trading

Markov decision process vs Reinforcement Learning

 Markov decision process

 Set of states S, set of actions A

 Transition probabilities to next states T(s, a, s')

 Reward functions R(s)

 RL is based on MDPs, but

 Transition model is not known

 Reward model is not known

 MDP computes an optimal policy

 RL learns an optimal policy

Types of Reinforcement Learning

 Passive vs Active

 Passive: Agent executes a fixed policy and evaluates it

 Active: Agent updates policy as it learns

 Model based vs Model free

 Model-based: Learn transition and reward model, use it to get optimal policy

 Model free: Derive optimal policy without learning the model

Passive Learning

 Evaluate how good a policy π is

 Learn the utility Uπ(s) of each state

 Same as policy evaluation for known transition & reward models

Agent executes a sequence of trials:

(1,1) → (1,2) → (1,3) → (1,2) → (1,3) → (2,3) → (3,3) → (4,3)+1

(1,1) → (1,2) → (1,3) → (2,3) → (3,3) → (3,2) → (3,3) → (4,3)+1

(1,1) → (2,1) → (3,1) → (3,2) → (4,2)−1

Goal is to learn the expected utility Uπ(s)

Direct Utility Estimation

 Reduction to inductive learning

 Compute the empirical value of each state

 Each trial gives a sample value

 Estimate the utility based on the sample values

 Example: First trial gives

 State (1,1): A sample of reward 0.72

 State (1,2): Two samples of reward 0.76 and 0.84

 State (1,3): Two samples of reward 0.80 and 0.88

 Estimate can be a running average of sample values

 Example: U(1,1) = 0.72, U(1,2) = 0.80, U(1,3) = 0.84, . . .

 Ignores a very important source of information:

 The utility of states satisfies the Bellman equations

 Search is in a hypothesis space for U much larger than needed

 Convergence is very slow

Adaptive Dynamic Programming (ADP)

 Make use of the Bellman equations to get Uπ(s)

 Need to estimate T(s, π(s), s') and R(s) from trials

 Plug the learnt transition and reward models into the Bellman equations

 Solving for Uπ: system of n linear equations

 Estimates of T and R keep changing

 Make use of the modified policy iteration idea

 Run a few rounds of value iteration

 Initialize value iteration from previous utilities

 Converges fast since the changes in T and R are small

 ADP is a standard baseline to test "smarter" ideas

 ADP is inefficient if the state space is large

 Has to solve a linear system in the size of the state space

 Backgammon: 10^50 linear equations in 10^50 unknowns
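The running-average estimates quoted for direct utility estimation can be reproduced from the first trial, assuming the 4×3 world's step reward of −0.04 (an assumption of this sketch; it is what makes the samples come out as 0.72, 0.76, and so on).

```python
from collections import defaultdict

# Direct utility estimation sketch: every visit to a state yields a sample
# (the observed reward-to-go), and U(s) is the running average of samples.
def samples_from_trial(states, step_reward=-0.04, terminal_reward=1.0):
    # the last entry is the +1 terminal state; every nonterminal state
    # visited from position i onward contributes step_reward
    n = len(states)
    return [(s, (n - 1 - i) * step_reward + terminal_reward)
            for i, s in enumerate(states[:-1])]

trial1 = [(1, 1), (1, 2), (1, 3), (1, 2), (1, 3), (2, 3), (3, 3), (4, 3)]

totals, counts = defaultdict(float), defaultdict(int)
for s, g in samples_from_trial(trial1):
    totals[s] += g
    counts[s] += 1
U = {s: totals[s] / counts[s] for s in totals}
print(round(U[(1, 1)], 2), round(U[(1, 2)], 2), round(U[(1, 3)], 2))
# 0.72 0.8 0.84 - matching the running averages quoted above
```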

Temporal Difference Learning

 Best of both worlds

 Only update states that are directly affected

 Approximately satisfy the Bellman equations

 Example:

(1,1) → (1,2) → (1,3) → (1,2) → (1,3) → (2,3) → (3,3) → (4,3)+1

(1,1) → (1,2) → (1,3) → (2,3) → (3,3) → (3,2) → (3,3) → (4,3)+1

(1,1) → (2,1) → (3,1) → (3,2) → (4,2)−1

 After the first trial, U(1,3) = 0.84, U(2,3) = 0.92

 Consider the transition (1,3) → (2,3) in the second trial

 If deterministic, then U(1,3) = −0.04 + U(2,3)

 How to account for probabilistic transitions (without a model)?

 TD chooses a middle ground

 Temporal difference (TD) equation, where α is the learning rate:

Uπ(s) ← Uπ(s) + α (R(s) + γ Uπ(s') − Uπ(s))

 TD applies a correction to approach the Bellman equations

 The update for s' will occur a T(s, π(s), s') fraction of the time

 The correction happens proportional to the probabilities

 Over trials, the correction is the same as the expectation

 The learning rate α determines convergence to the true utility

 Decrease αs proportional to the number of state visits

 Convergence is guaranteed if the learning-rate decay satisfies the standard conditions

 The decay αs(m) = 1/m satisfies the condition

 TD is model free

TD vs ADP

 TD is model free as opposed to ADP which is model based

 TD updates the observed successor rather than all successors

 The difference disappears with a large number of trials

 TD is slower in convergence, but much simpler computation per observation
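The TD update can be sketched on the very transition discussed above, (1,3) → (2,3) in the second trial. The learning rate α = 0.1 is an illustrative assumption; the −0.04 step reward and γ = 1 follow the 4×3 world.

```python
# TD(0) sketch: after observing s -> s2, nudge U(s) toward R(s) + gamma*U(s2).
# No transition model is needed (model-free).
def td_update(U, s, s2, reward, alpha=0.1, gamma=1.0):
    U[s] = U[s] + alpha * (reward + gamma * U[s2] - U[s])

U = {(1, 3): 0.84, (2, 3): 0.92}     # utilities after the first trial
# The transition (1,3) -> (2,3) observed in the second trial:
td_update(U, (1, 3), (2, 3), reward=-0.04)
print(round(U[(1, 3)], 4))   # 0.844 - moved a little toward -0.04 + 0.92
```

Only the visited state is touched; the successor's utility U(2,3) is read but not changed.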

Active Learning

 The agent updates its policy as it learns

 The goal is to learn the optimal policy

 Learning using the passive ADP agent

 Estimate the model R(s), T(s, a, s′) from observations

 The optimal utility and action satisfy U(s) = R(s) + γ maxa Σs′ T(s, a, s′) U(s′)

 Solve using value iteration or policy iteration

 The agent has an "optimal" action

 Simply execute the "optimal" action
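The "solve using value iteration" step can be sketched on a hypothetical two-state MDP (states, actions, T, and R below are made up for illustration; γ = 0.9):

```python
def value_iteration(states, actions, T, R, gamma=0.9, eps=1e-6):
    """Iterate the Bellman update
    U(s) = R(s) + gamma * max_a sum_s' T(s, a, s') * U(s')
    until the largest change falls below eps."""
    U = {s: 0.0 for s in states}
    while True:
        U_new = {s: R[s] + gamma * max(sum(T[s][a][s2] * U[s2] for s2 in states)
                                       for a in actions)
                 for s in states}
        if max(abs(U_new[s] - U[s]) for s in states) < eps:
            return U_new
        U = U_new

# Hypothetical MDP: 'stay' keeps the current state, 'move' switches states;
# only state 'b' carries a reward.
states, actions = ["a", "b"], ["stay", "move"]
T = {"a": {"stay": {"a": 1.0, "b": 0.0}, "move": {"a": 0.0, "b": 1.0}},
     "b": {"stay": {"a": 0.0, "b": 1.0}, "move": {"a": 1.0, "b": 0.0}}}
R = {"a": 0.0, "b": 1.0}
U = value_iteration(states, actions, T, R)
print(U["b"] > U["a"])  # True: the rewarding state has higher utility
```

Here U(b) converges to 1/(1 − γ) = 10 and U(a) to γ · U(b) = 9, matching the Bellman optimality equations.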

Exploitation vs Exploration

 The passive approach gives a greedy agent

 Exactly executes the recipe for solving MDPs

 Rarely converges to the optimal utility and policy

 The learned model is different from the true environment

 Trade-off

 Exploitation: maximize rewards using current estimates

 The agent stops learning and starts executing its policy

 Exploration: maximize long-term rewards

 The agent keeps learning by trying out new things

 Pure exploitation

 Mostly gets stuck in bad policies

 Pure exploration

 Gets better models by learning

 Small rewards due to exploration

 The multi-armed bandit setting

 A slot machine has one lever: a one-armed bandit

 An n-armed bandit has n levers

 Which arm to pull?

 Exploit: the one with the best pay-offs so far

 Explore: one that has not been tried
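The exploit/explore tension can be sketched with an ε-greedy bandit agent; the three Bernoulli arms and the parameters below are hypothetical choices for illustration:

```python
import random

def epsilon_greedy_bandit(arms, pulls=5000, epsilon=0.1, seed=0):
    """Exploit the best arm so far, but explore a random arm an epsilon
    fraction of the time (and until every arm has been tried once)."""
    rng = random.Random(seed)
    totals = [0.0] * len(arms)   # total reward per arm
    counts = [0] * len(arms)     # pulls per arm
    for _ in range(pulls):
        if rng.random() < epsilon or 0 in counts:
            i = rng.randrange(len(arms))                      # explore
        else:
            i = max(range(len(arms)),
                    key=lambda k: totals[k] / counts[k])      # exploit
        totals[i] += arms[i](rng)
        counts[i] += 1
    return counts

# Hypothetical 3-armed bandit: Bernoulli pay-offs with p = 0.2, 0.5, 0.8.
arms = [lambda r, p=p: 1.0 if r.random() < p else 0.0 for p in (0.2, 0.5, 0.8)]
counts = epsilon_greedy_bandit(arms)
print(counts.index(max(counts)))  # the best arm (index 2) is pulled most often
```

Pure exploitation corresponds to ε = 0 (risking lock-in on a bad arm); pure exploration to ε = 1 (good estimates, poor rewards).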

Exploration

 Greedy in the limit of infinite exploration (GLIE)

 Reasonable schemes for the trade-off

 Revisiting the greedy ADP approach

 The agent must try each action infinitely often

 Rules out the chance of missing a good action

 Eventually must become greedy to get rewards

 Simple GLIE

 Choose a random action a 1/t fraction of the time

 Use the greedy policy otherwise

 Converges to the optimal policy

 Convergence is very slow


Exploration Function

 A smarter GLIE

 Give higher weights to actions not tried very often

 Give lower weights to low-utility actions

 Alter the Bellman equations using optimistic utilities U⁺(s)

 The exploration function f(u, n)

 Should increase with the expected utility u

 Should decrease with the number of tries n

 A simple exploration function: f(u, n) = R⁺ if n < Nₑ, and u otherwise

 Actions towards unexplored regions are encouraged

 Fast convergence to an almost optimal policy in practice
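One common choice of f (following the optimistic scheme described above) can be sketched directly; R⁺, an optimistic estimate of the best possible reward, and the visit threshold Nₑ are tunable assumptions:

```python
def exploration_f(u, n, r_plus=2.0, n_e=5):
    """Simple exploration function: stay optimistic (return R+) until an
    action has been tried at least N_e times, then trust the utility u."""
    return r_plus if n < n_e else u

# A rarely tried action looks better than a well-tried one with real utility:
print(exploration_f(0.5, n=2))   # 2.0  (optimistic: fewer than N_e tries)
print(exploration_f(0.5, n=10))  # 0.5  (trust the learned utility)
```

Because f increases with u and decreases with n, actions leading to unexplored regions win the max in the altered Bellman equations until they have been tried enough.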

Q-Learning

 The exploration function gives an active ADP agent

 A corresponding TD agent can be constructed

 Surprisingly, the TD update can remain the same

 Converges to the optimal policy, like active ADP

 Slower than ADP in practice

 Q-learning learns an action-value function Q(a, s)

 Utility values U(s) = maxa Q(a, s)

 A model-free TD method

 No model for learning or action selection

 Constraint equations for the Q-values at equilibrium

 Can be updated using a model for T(s, a, s′)

 The TD Q-learning update does not require a model:

Q(a, s) ← Q(a, s) + α (R(s) + γ maxa′ Q(a′, s′) − Q(a, s))

 Calculated whenever action a in s leads to s′

 The next action anext = argmaxa′ f(Q(a′, s′), N(s′, a′))

 Q-learning is slower than ADP

 Trade-off: model-free vs knowledge-based methods
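The TD Q-learning update can be sketched as follows; the states, actions, and α = 0.1, γ = 0.9 below are hypothetical illustration values:

```python
def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Model-free TD Q-learning update:
    Q(a, s) <- Q(a, s) + alpha * (r + gamma * max_a' Q(a', s') - Q(a, s))."""
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (r + gamma * best_next - Q.get((s, a), 0.0))
    return Q

# Hypothetical transition: taking 'right' in s1 reaches s2, whose best
# Q-value is 1.0, so Q(s1, right) is pulled toward gamma * 1.0.
Q = {("s2", "right"): 1.0}
q_update(Q, "s1", "right", r=0.0, s_next="s2", actions=["left", "right"])
print(Q[("s1", "right")])  # 0.1 * (0.0 + 0.9 * 1.0) ~= 0.09
```

Note that no T(s, a, s′) appears: the max over next-state Q-values replaces the model, which is exactly what makes Q-learning model free.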
UNIT 5

EXPERT SYSTEMS

An expert system is a computer program that represents and reasons with knowledge of some
specialist subject with a view to solving problems or giving advice.

To solve expert-level problems, expert systems will need efficient access to a substantial domain
knowledge base, and a reasoning mechanism to apply the knowledge to the problems they are
given. Usually they will also need to be able to explain, to the users who rely on them, how they
have reached their decisions.

They will generally build upon the ideas of knowledge representation, production rules, search,
and so on, that we have already covered.

Often we use an expert system shell, which is an existing knowledge-independent framework into which domain knowledge can be inserted to produce a working expert system. We can thus avoid having to program each new system from scratch.

Typical Tasks for Expert Systems

There are no fundamental limits on what problem domains an expert system can be built to deal with. Some typical existing expert system tasks include:

1. The interpretation of data

Such as sonar data or geophysical measurements

2. Diagnosis of malfunctions

Such as equipment faults or human diseases

3. Structural analysis or configuration of complex objects

Such as chemical compounds or computer systems

4. Planning sequences of actions

Such as might be performed by robots

5. Predicting the future

Such as weather, share prices, exchange rates

However, these days, "conventional" computer systems can also do some of these things.

Characteristics of Expert Systems

Expert systems can be distinguished from conventional computer systems in that:

1. They simulate human reasoning about the problem domain, rather than simulating the domain
itself.

2. They perform reasoning over representations of human knowledge, in addition to doing


numerical calculations or data retrieval. They have corresponding distinct modules referred to as
the inference engine and the knowledge base.

3. Problems tend to be solved using heuristics (rules of thumb), approximate methods, or probabilistic methods which, unlike algorithmic solutions, are not guaranteed to result in a correct or optimal solution.

4. They usually have to provide explanations and justifications of their solutions or


recommendations in order to convince the user that their reasoning is correct.

Note that the term Intelligent Knowledge-Based System (IKBS) is sometimes used as a synonym for Expert System.

The Architecture of Expert Systems

The process of building expert systems is often called knowledge engineering. The knowledge engineer is involved with all components of an expert system.

Building expert systems is generally an iterative process. The components and their interaction will be refined over the course of numerous meetings of the knowledge engineer with the experts and users. We shall look in turn at the various components.

Knowledge Acquisition

The knowledge acquisition component allows the expert to enter their knowledge or expertise into the expert system, and to refine it later as and when required.

Historically, the knowledge engineer played a major role in this process, but automated systems that allow the expert to interact directly with the system are becoming increasingly common.

The knowledge acquisition process usually comprises three principal stages:

1. Knowledge elicitation is the interaction between the expert and the knowledge engineer/program to elicit the expert knowledge in some systematic way.

2. The knowledge thus obtained is usually stored in some form of human-friendly intermediate representation.

3. The intermediate representation of the knowledge is then compiled into an executable form (e.g. production rules) that the inference engine can process.

In practice, much iteration through these three stages is usually required!

Knowledge Elicitation

The knowledge elicitation process itself usually consists of several stages:

1. Find out as much as possible about the problem and domain from books, manuals, etc. In particular, become familiar with any specialist terminology and jargon.

2. Try to characterize the types of reasoning and problem-solving tasks that the system will be required to perform.

3. Find an expert (or set of experts) willing to collaborate on the project. Sometimes experts are frightened of being replaced by a computer system!

4. Interview the expert (usually many times during the course of building the system). Find out how they solve the problems your system will be expected to solve. Have them check and refine your intermediate knowledge representation.

This is a time-intensive process, and automated knowledge elicitation and machine learning techniques are increasingly common modern alternatives.

Stages of Knowledge Acquisition

The iterative nature of the knowledge acquisition process can be represented in the following diagram.

Levels of Knowledge Analysis

Knowledge identification: Use in-depth interviews in which the knowledge engineer encourages the expert to talk about how they do what they do. The knowledge engineer should understand the domain well enough to know which objects and facts need talking about.

Knowledge conceptualization: Find the primitive concepts and conceptual relations of the problem domain.

Epistemological analysis: Uncover the structural properties of the conceptual knowledge, such as taxonomic relations (classifications).

Logical analysis: Decide how to perform reasoning in the problem domain. This kind of knowledge can be particularly hard to acquire.

Implementation analysis: Work out systematic procedures for implementing and testing the system.

Capturing Tacit/Implicit Knowledge

One problem that knowledge engineers often encounter is that the human experts use tacit/implicit knowledge (e.g. procedural knowledge) that is difficult to capture.

There are several useful techniques for acquiring this knowledge:

1. Protocol analysis: Tape-record the expert thinking aloud while performing their role and later analyze this. Break down their protocol/account into the smallest atomic units of thought, and let these become operators.

2. Participant observation: The knowledge engineer acquires tacit knowledge through practical domain experience with the expert.

3. Machine induction: This is useful when the experts are able to supply examples of the results of their decision making, even if they are unable to articulate the underlying knowledge or reasoning process.

Which is/are best to use will generally depend on the problem domain and the expert.

Representing the Knowledge

We have already looked at various types of knowledge representation. In general, the knowledge acquired from our expert will be formulated in two ways:

1. Intermediate representation – a structured knowledge representation that the knowledge engineer and expert can both work with efficiently.

2. Production system – a formulation that the expert system's inference engine can process efficiently.

It is important to distinguish between:

1. Domain knowledge – the expert's knowledge, which might be expressed in the form of rules, general/default values, and so on.

2. Case knowledge – specific facts/knowledge about particular cases, including any derived knowledge about the particular cases.

The system will have the domain knowledge built in, and will have to integrate this with the different case knowledge that will become available each time the system is used.

Meta Knowledge

Knowledge about knowledge:

 Metaknowledge can be simply defined as knowledge about knowledge.

 Metaknowledge is knowledge about the use and control of domain knowledge in an expert system.

Roles in Expert System Development

Three fundamental roles in building expert systems are:

1. Expert – Successful ES systems depend on the experience and application of knowledge that the people can bring to them during development. Large systems generally require multiple experts.

2. Knowledge engineer – The knowledge engineer has a dual task. This person should be able to elicit knowledge from the expert, gradually gaining an understanding of an area of expertise. Intelligence, tact, empathy, and proficiency in specific techniques of knowledge acquisition are all required of a knowledge engineer. Knowledge-acquisition techniques include conducting interviews with varying degrees of structure, protocol analysis, observation of experts at work, and analysis of cases.

On the other hand, the knowledge engineer must also select a tool appropriate for the project and use it to represent the knowledge with the application of the knowledge acquisition facility.

3. User – A system developed by an end user with a simple shell is built rather quickly and inexpensively. Larger systems are built in an organized development effort. A prototype-oriented iterative development strategy is commonly used. ESs lend themselves particularly well to prototyping.

Typical Expert System

1. A problem-domain-specific knowledge base that stores the encoded knowledge to support one problem domain, such as diagnosing why a car won't start. In a rule-based expert system, the knowledge base includes the if-then rules and additional specifications that control the course of the interview.

2. An inference engine: a set of rules for making deductions from the data, which implements the reasoning mechanism and controls the interview process. The inference engine might be generalized so that the same software is able to process many different knowledge bases.

3. The user interface requests information from the user and outputs intermediate and final results. In some expert systems, input is acquired from additional sources such as databases and sensors.

An expert system shell consists of a generalized inference engine and user interface designed to work with a knowledge base provided in a specified format. A shell often includes tools that help with the design, development and testing of the knowledge base. With the shell approach, expert systems representing many different problem domains may be developed and delivered with the same software environment.

There are special high-level languages used to program expert systems, e.g. PROLOG.

The user interacts with the system through a user interface (which may use menus, natural language, or any other style of interaction). Then an inference engine is used to reason with both the expert knowledge (extracted from our friendly expert) and data specific to the particular problem being solved. The expert knowledge will typically be in the form of a set of IF-THEN rules. The case-specific data includes both data provided by the user and partial conclusions (along with certainty measures) based on this data. In a simple forward-chaining rule-based system the case-specific data will be the elements in working memory.

How an expert system works: Car engine diagnosis

1. IF engine_getting_petrol

AND engine_turns_over

THEN problem_with_spark_plugs

2. IF NOT engine_turns_over

AND NOT lights_come_on

THEN problem_with_battery

3. IF NOT engine_turns_over

AND lights_come_on

THEN problem_with_starter

4. IF petrol_in_fuel_tank

THEN engine_getting_petrol

There are three possible problems with the car:

 problem_with_spark_plugs

 problem_with_battery

 problem_with_starter

The system will ask the user:

 Is it true that there's petrol in the fuel tank?

Let's say that the answer is yes. This answer would be recorded, so that the user doesn't get asked the same question again. The system has now proved that the engine is getting petrol, so it now wants to find out if the engine turns over. As the system doesn't yet know whether this is the case, and as there are no rules which conclude this, the user will be asked:

 Is it true that the engine turns over?

Let's say this time the answer is no. There are no other rules which can be used to prove "problem_with_spark_plugs", so the system will conclude that this is not the solution to the problem, and will consider the next hypothesis: problem_with_battery. It is true that the engine does not turn over (the user has just said that), so all it has to prove is that the lights don't come on. It will ask the user:

 Is it true that the lights come on?

Suppose the answer is no. It has now proved that the problem is with the battery. Some systems might stop there, but usually there might be more than one solution (e.g., more than one fault with the car), or it will be uncertain which of various solutions is the right one. So usually all hypotheses are considered. It will try to prove "problem_with_starter", but given the existing data (the lights do not come on) the proof will fail, so the system will conclude that the problem is with the battery. A complete interaction with our very simple system might be:

System: Is it true that there's petrol in the fuel tank?

User: Yes.

System: Is it true that the engine turns over?

User: No.

System: Is it true that the lights come on?

User: No.

System: I conclude that there is a problem with the battery.

Note that in general, solving problems using backward chaining involves searching through all the possible ways of proving the hypothesis, systematically checking each of them.
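The dialogue above can be sketched with a tiny backward chainer. For simplicity this non-interactive sketch supplies the user's answers as a fact set up front, and treats a "no" answer as the fact simply being absent (negation as failure) rather than asking the user:

```python
# Minimal backward chainer over the car-diagnosis rules above.
RULES = [
    ({"engine_getting_petrol", "engine_turns_over"}, "problem_with_spark_plugs"),
    ({"NOT engine_turns_over", "NOT lights_come_on"}, "problem_with_battery"),
    ({"NOT engine_turns_over", "lights_come_on"}, "problem_with_starter"),
    ({"petrol_in_fuel_tank"}, "engine_getting_petrol"),
]

def prove(goal, facts):
    """Backward chaining: a goal holds if it is a known fact, or if some rule
    concludes it and all of that rule's premises can themselves be proved."""
    if goal in facts:
        return True
    if goal.startswith("NOT "):
        return goal[4:] not in facts        # simple negation-as-failure
    return any(conclusion == goal and all(prove(p, facts) for p in premises)
               for premises, conclusion in RULES)

# The dialogue in the text: petrol in tank = yes, engine turns over = no,
# lights come on = no.
facts = {"petrol_in_fuel_tank"}
hypotheses = ["problem_with_spark_plugs", "problem_with_battery",
              "problem_with_starter"]
print([h for h in hypotheses if prove(h, facts)])  # ['problem_with_battery']
```

As in the trace, the spark-plug hypothesis fails because engine_turns_over cannot be proved, and the starter hypothesis fails because lights_come_on cannot be proved, leaving the battery as the diagnosis.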

Questions

1. "Briefly describe the basic architecture of a typical expert system, mentioning the function of each of the main components."

2. "A travel agent asks you to design an expert system to help people choose where to go on holiday. Design a set of decisions to help you give advice on which holiday to take."
Expert System Use

Expert systems are used in a variety of areas, and are still the most popular developmental approach in the artificial intelligence world.

Typical application areas in which expert systems are developed include:

• Medical screening for cancer and brain tumours

• Matching people to jobs

• Training on oil rigs

• Diagnosing faults in car engines

• Legal advisory systems

• Mineral prospecting

5.6.1 MYCIN

Tasks and Domain

► Disease DIAGNOSIS and therapy SELECTION

► Advice for non-expert physicians, with time considerations and incomplete evidence, on:

 Bacterial infections of the blood

 Expanded to meningitis and other ailments

System Goals

► Utility

 Be useful, to attract the assistance of experts

 Demonstrate competence

 Fulfill a domain need (i.e. penicillin)

► Flexibility

 The domain is complex, with a variety of knowledge types

 Medical knowledge rapidly evolves; it must be easy to maintain the K.B.

► Interactive Dialogue

 Provide coherent explanations (symbolic reasoning paradigm)

 Allow for real-time K.B. updates by experts

► Fast and Easy

 Meet the time constraints of the medical field

Architecture

Consultation System

► Performs Diagnosis and Therapy Selection

► The Control Structure reads the Static DB (rules) and reads/writes to the Dynamic DB (patient, context)

► Linked to Explanations

► Terminal interface to the physician

► User-Friendly Features:

 Users can request rephrasing of questions

 A synonym dictionary allows latitude in user responses

 User typos are automatically fixed

► Questions are asked when more data is needed

 If data cannot be provided, the system ignores the relevant rules

► Goal-directed, backward-chaining, depth-first tree search

► High-level Algorithm:

 Determine if the patient has a significant infection

 Determine the likely identity of the significant organisms

 Decide which drugs are potentially useful

 Select the best drug, or coverage of drugs

Static Database

► Rules

► Meta-Rules

► Templates

► Rule Properties

► Context Properties

► Fed from the Knowledge Acquisition System

Production Rules

► Represent domain-specific knowledge

► Over 450 rules in MYCIN

► Premise-Action (If-Then) form:

<predicate function> <object> <attrib> <value>

► Each rule is completely modular; all relevant context is contained in the rule, with explicitly stated premises

MYCIN P.R. Assumptions

► Not every domain can be represented; formalization is required (EMYCIN)

► Only a small number of simultaneous factors (more than 6 was thought to be unwieldy)

► The IF-THEN formalism is suitable for the Expert Knowledge Acquisition and Explanation sub-systems

Judgmental Knowledge

► Inexact reasoning with Certainty Factors (CF)

► CFs are not probabilities!

► The truth of a hypothesis is measured by combining the CFs of its premises and rules

 A positive combined CF is confirming evidence

 A negative combined CF is disconfirming evidence
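The CFs of two rules supporting the same hypothesis are not added arithmetically; MYCIN used a combining function that keeps the result in [−1, 1]. A sketch of the commonly described combination rule (a simplification of MYCIN's actual implementation):

```python
def combine_cf(a, b):
    """Combine two certainty factors in [-1, 1], MYCIN-style."""
    if a >= 0 and b >= 0:
        return a + b * (1 - a)          # two pieces of confirming evidence
    if a < 0 and b < 0:
        return a + b * (1 + a)          # two pieces of disconfirming evidence
    return (a + b) / (1 - min(abs(a), abs(b)))  # mixed evidence

# Two confirming rules with CF 0.6 and 0.4 reinforce each other:
print(combine_cf(0.6, 0.4))   # 0.6 + 0.4 * (1 - 0.6) ~= 0.76
# Conflicting evidence partially cancels:
print(combine_cf(0.6, -0.4))  # (0.6 - 0.4) / (1 - 0.4) ~= 0.333
```

Note that confirming evidence never pushes the combined CF past 1, and the function is commutative, so the order in which rules fire does not matter.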

Sub-goals

► At any given time MYCIN is establishing the value of some parameter by sub-goaling

► Unity Paths: a method to bypass sub-goals by following a path whose certainty is known (CF == 1) to make a definite conclusion

► Won't search a sub-goal if it can be obtained from a user first (i.e. lab data)

Preview Mechanism

► The interpreter reads rules before invoking them

► Avoids unnecessary deductive work if the sub-goal has already been tested/determined

► Ensures self-referencing sub-goals do not enter recursive infinite loops

Meta-Rules

► An alternative to exhaustive invocation of all rules

► Strategy rules to suggest an approach for a given sub-goal

 Ordering of rules to try first, effectively pruning the search tree

► Creates a search space with embedded information on which branch is best to take

► High-order Meta-Rules (i.e. Meta-Rules for Meta-Rules)

 Powerful, but used in a limited way in practice

► Impact on the Explanation System:

 (+) Encode knowledge formerly hidden in the Control Structure

 (−) Sometimes create "murky" explanations

Templates

► The Production Rules are all based on Template structures

► This aids knowledge-base expansion, because the system can "understand" its own representations

► Templates are updated by the system when a new rule is entered

Dynamic Database

► Patient Data

► Laboratory Data

► Context Tree

► Built by the Consultation System

► Used by the Explanation System

Context Tree

Therapy Selection

► Plan-Generate-and-Test process

► Therapy List Creation

 A set of specific rules recommends treatments based on the probability (not CF) of organism sensitivity

 Probabilities are based on laboratory data

 One therapy rule for every organism

► Assigning Item Numbers

 Only hypotheses with organisms deemed "significantly likely" (by CF) are considered

 Then the most likely (by CF) identity of the organisms themselves is determined, and each is assigned an Item Number

 Each item is assigned a probability of likelihood and a probability of sensitivity to each drug

► Final Selection based on:

 Sensitivity

 Contraindication screening

 Using the minimal number of drugs and maximizing the coverage of organisms

► Experts can ask for alternate treatments

 Therapy selection is repeated with the previously recommended drugs removed from the list

Explanation System

► Provides the reasoning for why a conclusion has been made, or why a question is being asked

► Uses a trace of the Production Rules as a basis, and the Context Tree, to provide context

 Ignores definitional rules (CF == 1)

► Two Modules

 Q-A Module

 Reasoning Status Checker

Q-A Module

► Symbolic Production Rules are readable

► Each <predicate function> has an associated translation pattern:

GRID (THE (2) ASSOCIATED WITH (1) IS KNOWN)

VAL (((2 1)))

PORTAL (THE PORTAL OF ENTRY OF *)

PATH-FLORA (LIST OF LIKELY PATHOGENS)

(GRID (VAL CNTXT PORTAL) PATH-FLORA) becomes:

"The list of likely pathogens associated with the portal of entry of the organism is known."

Reasoning Status Checker

► Explanation is a tree traversal of the traced rules:

 WHY – moves up the tree

 HOW – moves down (possibly to untried areas)

► The question is rephrased, and the rule being applied is explained with the translation patterns

Knowledge Acquisition System

► Extends the Static DB via dialogue with experts

► Dialogue driven by the system

► Requires minimal training for experts

► Allows for incremental competence, NOT an all-or-nothing model

► IF-THEN symbolic logic was found to be easy for experts to learn, and required little training by the MYCIN team

► When faced with a rule, the expert must either accept it or be forced to update it using the education process

Education Process

1. A bug is uncovered, usually by the Explanation process

2. Rules are added/modified by experts using a subset of English

3. The new knowledge is integrated into the KB

 Found to be difficult in practice: it requires the detection of contradictions, and complex concepts become difficult to express

Results

► Never implemented for routine clinical use

► Shown to be competent by panels of experts, even in cases where the experts themselves disagreed on conclusions

► Key Contributions:

 Reuse of Production Rules (explanation, knowledge acquisition models)

 Meta-level knowledge use

5.6.2 DART

The Dynamic Analysis and Replanning Tool, commonly abbreviated to DART, is an artificial intelligence program used by the U.S. military to optimize and schedule the transportation of supplies or personnel and to solve other logistical problems.

DART uses intelligent agents to aid decision support systems located at the U.S. Transportation
and European Commands. It integrates a set of intelligent data processing agents and database
management systems to give planners the ability to rapidly evaluate plans for logistical
feasibility. By automating evaluation of these processes DART decreases the cost and time
required to implement decisions.

DART achieved logistical solutions that surprised many military planners. Introduced in 1991, DART had by 1995 offset the monetary equivalent of all funds DARPA had channeled into AI research for the previous 30 years combined.

Development and introduction

DARPA funded the MITRE Corporation and Carnegie Mellon University to analyze the
feasibility of several intelligent planning systems. In November 1989, a demonstration named
The Proud Eagle Exercise indicated many inadequacies and bottlenecks within military support
systems. In July, DART was previewed to the military by BBN Systems and Technologies and
the ISX Corporation (now part of Lockheed Martin Advanced Technology Laboratories) in
conjunction with the United States Air Force Rome Laboratory. It was proposed in November
1990, with the military immediately demanding that a prototype be developed for testing. Eight
weeks later, a hasty but working prototype was introduced in 1991 to USTRANSCOM at the
beginning of Operation Desert Storm during the Gulf War.

Impact

Directly following its launch, DART solved several logistical nightmares, saving the military millions of dollars. Military planners were aware of the tremendous obstacles to moving military assets from bases in Europe to prepared bases in Saudi Arabia in preparation for Desert Storm. DART quickly proved its value by improving upon the existing plans of the U.S. military. What surprised many observers was DART's ability to adapt plans rapidly in a crisis environment.
DART's success led to the development of other military planning agents such as:

 RDA – Resource Description and Access system

 DRPI – Knowledge-Based Planning and Scheduling Initiative, a successor of DART

5.7 Expert System Shells

Initially each expert system was built from scratch (in LISP). Systems are constructed as a set of declarative representations (mostly rules) combined with an interpreter for those representations. It helps to separate the interpreter from the domain-specific knowledge, creating a system that can be used to construct a new expert system by adding new knowledge corresponding to the new problem domain. The resulting interpreters are called shells. An example of a shell is EMYCIN ("Empty MYCIN", derived from MYCIN).

Shells − A shell is nothing but an expert system without a knowledge base. A shell provides the developers with knowledge acquisition, an inference engine, a user interface, and an explanation facility. For example, a few shells are given below:

 Java Expert System Shell (JESS), which provides a fully developed Java API for creating an expert system.

 Vidwan, a shell developed at the National Centre for Software Technology, Mumbai in 1993. It enables knowledge encoding in the form of IF-THEN rules.

Shells provide greater flexibility in representing knowledge and in reasoning than MYCIN. They support rules, frames, truth maintenance systems and a variety of other reasoning mechanisms.

Early expert system shells provided mechanisms for knowledge representation, reasoning and explanation. Later tools added knowledge acquisition. Expert system shells still need to integrate with other programs easily: expert systems cannot operate in a vacuum. The shell must provide an easy-to-use interface between an expert system written with the shell and the programming environment.
