AI and ML - Unit 1234 (Notes)
Unit 01:
What is Artificial Intelligence?
Definition:
Artificial Intelligence is the branch of computer science concerned with the
study of how to make computers do things which, at the moment, people do
better.
Artificial Intelligence is concerned with the design of intelligence in an artificial
device.
The term was coined by John McCarthy in 1956.
There are two ideas in the definition.
1. Intelligence
2. Artificial device
What is intelligence?
Accordingly, there are two possibilities:
– A system with intelligence is expected to behave as intelligently as a human
– A system with intelligence is expected to behave in the best possible manner.
Intelligent behavior
This discussion brings us back to the question of what constitutes intelligent
behavior. Some of these tasks and applications are:
i. Perception involving image recognition and computer vision
ii. Reasoning
iii. Learning
iv. Understanding language involving natural language processing, speech
processing
v. Solving problems
vi. Robotics
Today’s AI systems have been able to achieve limited success in some of
these tasks.
• In computer vision, systems are capable of face recognition.
• In robotics, we have been able to make vehicles that are mostly autonomous.
• In natural language processing, we have systems that are capable of simple
machine translation.
An agent is something that perceives and acts. An agent acts in an environment. An
agent perceives its environment through sensors.
The complete set of inputs at a given time is called a percept. The current percept, or a
sequence of percepts can influence the actions of an agent. The agent can change the
environment through actuators or effectors. An operation involving an effector is
called an action. Actions can be grouped into action sequences. The agent can have
goals which it tries to achieve.
Thus, an agent can be looked upon as a system that implements a mapping from
percept sequences to actions.
Examples of Agents:
An agent is something that acts in an environment - it does something. Agents include
thermostats, airplanes, robots, humans, companies, and countries. We are interested in
what an agent does; that is, how it acts. We judge an agent by its actions.
1. Humans can be looked upon as agents. They have eyes, ears, skin, taste buds, etc. for
sensors; and hands, fingers, legs, mouth for effectors.
2. Robots are agents. Robots may have cameras, sonar, infrared, bumpers, etc. for
sensors. They can have grippers, wheels, lights, speakers, etc. for actuators.
PROBLEM SOLVING
The steps that are required to build a system to solve a particular problem are:
Consider the problem of "Playing Chess". To build a program that could play chess, we
have to specify the starting position of the chess board, the rules that define legal
moves, and the board positions that represent a win. The goal of winning the game,
if possible, must be made explicit.
The legal moves can be described as a set of rules consisting of two parts: A left side
that gives the current position and the right side that describes the change to be made to
the board position. An example is shown in the following figure.
The current position of a piece on the board is its STATE, and the set of all possible
STATES is the STATE SPACE. One or more states where the problem terminates is
the FINAL STATE or GOAL STATE. The state space representation forms the basis of
most of the AI methods.
It allows for a formal definition of the problem as the need to convert some given
situation into some desired situation using a set of permissible operations. It permits
the problem to be solved with the help of known techniques and control strategies to
move through the problem space until goal state is found.
Let us begin by introducing certain terms. An initial state is the description of the
starting configuration of the agent. An action or an operator takes the agent from one
state to another state, which is called a successor state. A state can have a number of
successor states.
A plan is a sequence of actions. The cost of a plan is referred to as the path cost. The
path cost is a positive number, and a common path cost may be the sum of the costs of
the steps in the path.
Statement: We are given two jugs, a 4-liter one and a 3-liter one. Neither has any
measuring marks on it. There is a pump that can be used to fill the jugs with water.
How can we get exactly 2 liters of water into the 4-liter jug?
Solution:-
'i' represents the number of liters of water in the 4-liter jug and 'j' represents the
number of liters of water in the 3-liter jug. The initial state is (0, 0), that is, no water in
either jug. The goal state is (2, n) for any value of 'n'. The state space is
{ (i, j) | i = 0, 1, 2, 3, 4 and j = 0, 1, 2, 3 }.
To solve this we have to make some assumptions not mentioned in the problem: we can
fill a jug from the pump, we can pour water out of a jug onto the ground, and we can
pour water from one jug into the other until the first is empty or the second is full.
The various operators (production rules) that are available to solve this problem may
be stated as given in the following figure.
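As an illustration (not part of the original notes), the rule set can be sketched directly in Python as a breadth-first search over the (i, j) state space under the assumptions above; the function name and state encoding are hypothetical.

from collections import deque

def water_jug_bfs(goal_i=2):
    """Breadth-first search over states (i, j): i = 4-liter jug, j = 3-liter jug."""
    def successors(i, j):
        return {
            (4, j),  # fill the 4-liter jug from the pump
            (i, 3),  # fill the 3-liter jug from the pump
            (0, j),  # empty the 4-liter jug onto the ground
            (i, 0),  # empty the 3-liter jug onto the ground
            (i - min(i, 3 - j), j + min(i, 3 - j)),  # pour the 4-liter jug into the 3-liter jug
            (i + min(j, 4 - i), j - min(j, 4 - i)),  # pour the 3-liter jug into the 4-liter jug
        }
    start = (0, 0)
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        i, j = path[-1]
        if i == goal_i:  # goal state (2, n) for any n
            return path
        for state in successors(i, j):
            if state not in visited:
                visited.add(state)
                frontier.append(path + [state])
    return None

print(water_jug_bfs())  # one shortest solution, e.g. (0,0) -> (4,0) -> (1,3) -> (1,0) -> (0,1) -> (4,1) -> (2,3)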
Search in AI:
Another crucial general technique required when writing AI programs is search. Often
there is no direct way to find a solution to some problem. However, you do know how
to generate possibilities. For example, in solving a puzzle you might know all the
possible moves, but not the sequence that would lead to a solution. When working out
how to get somewhere you might know all the roads/buses/trains, just not the best
route to get you to your destination quickly. Developing good ways to search through
these possibilities for a good solution is therefore vital. Brute force techniques, where
you generate and try out every possible solution may work, but are often very
inefficient, as there are just too many possibilities to try. Heuristic techniques are often
better: you try only the options which you think (based on your current best guess) are
most likely to lead to a good solution.
Following are the four essential properties of search algorithms used to compare their
efficiency:
Completeness: A search algorithm is complete if it is guaranteed to find a solution whenever one exists.
Optimality: A search algorithm is optimal if the solution it finds is the best (lowest-cost) one.
Time Complexity: It is a measure of how long the algorithm takes to find a solution.
Space Complexity: It is the maximum storage space required at any point during the
search, expressed in terms of the complexity of the problem.
1. Blind search: we move through the space without worrying about what is coming
next, but recognising the answer if we see it.
2. Informed search: we guess what is ahead, and use that information to decide where to
look next. We may want to search for the first answer that satisfies our goal, or we may
want to keep searching until we find the best answer.
Breadth first search, like depth first search, is a blind search, but here searching
progresses level by level, unlike depth first search, which goes deep into the tree. An
operator is employed to generate all possible children of a node. Breadth first search,
being a brute-force search, generates all the nodes for identifying the goal.
Loop
  if fringe is empty, return failure
  node <- remove-first(fringe)   (FIFO: the oldest node is expanded first)
  if node is a goal, return node
  else add all children of node to the end of fringe
End Loop
1. Shortest path and minimum spanning tree for an unweighted graph: in an unweighted
graph, the shortest path is the path with the least number of edges, and Breadth First
Search finds it.
2. Peer-to-peer networks: in peer-to-peer networks like BitTorrent, Breadth First
Search is used to find all neighbor nodes.
3. Social networking websites: in social networks, we can find people within a given
distance 'k' from a person by running Breadth First Search up to 'k' levels.
4. GPS navigation systems: Breadth First Search is used to find all neighboring
locations.
In Depth-First Search, search begins by expanding the initial node, i.e., by using an
operator to generate all successors of the initial node, and testing them.
Loop
  if fringe is empty, return failure
  node <- remove-first(fringe)   (LIFO: the newest node is expanded first)
  if node is a goal, return node
  else add all children of node to the front of fringe
End Loop
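The two loops differ only in how the fringe is treated. A minimal runnable sketch (the graph dictionary and function name are illustrative, not from the notes):

from collections import deque

def search(graph, start, goal, breadth_first=True):
    """Generic blind search; the fringe discipline decides BFS vs DFS."""
    fringe = deque([start])
    visited = {start}
    while fringe:
        # BFS removes the oldest node (queue); DFS removes the newest (stack).
        node = fringe.popleft() if breadth_first else fringe.pop()
        if node == goal:
            return node
        for child in graph.get(node, []):
            if child not in visited:
                visited.add(child)
                fringe.append(child)
    return None  # goal not found

graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "E": ["F"]}
print(search(graph, "A", "F", breadth_first=True))   # BFS
print(search(graph, "A", "F", breadth_first=False))  # DFS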
One way to combine the two is to follow a single path at a time, but switch path
whenever some competing path looks more promising than the current one.
At each step, we select the most promising of the nodes we have generated so far. This
is done by applying an appropriate heuristic function to each of them. We then expand
the chosen node by using the rules to generate its successors.
If one of them is a solution, we can quit; if not, all those new nodes are added to the set of
nodes generated so far. Again the most promising node is selected and the process
continues.
A is the initial node, which is expanded to B, C and D. A heuristic function, say the
estimated cost of reaching the goal, is applied to each of these nodes; since D is the most
promising, it is expanded next, producing two successor nodes E and F. The heuristic
function is applied to them. Now, out of the four remaining nodes (B, C, E and F), B
looks the most promising and hence is expanded, generating nodes G and H. When
evaluated again, E appears to be the next best node and is expanded, giving rise to
nodes I and J. In the next step J is expanded, since it is the most promising. This
process continues until a solution is found.
• Applying this to the TSP: the weights given to individual aspects are chosen in such a
way that the value of the heuristic function at a given node in the search process gives as
good an estimate as possible of whether that node is on the desired path to a solution.
Well designed heuristic functions can play an important part in efficiently guiding a
search process toward a solution.
In heuristic search or informed search, heuristics are used to identify the most
promising search path.
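A minimal sketch of such a heuristic (best-first) search, assuming the heuristic values are supplied in a dictionary h (all names here are illustrative):

import heapq

def greedy_best_first(graph, h, start, goal):
    """Expand the node with the smallest heuristic estimate h(n) first."""
    fringe = [(h[start], start)]
    visited = {start}
    while fringe:
        _, node = heapq.heappop(fringe)  # the most promising node generated so far
        if node == goal:
            return node
        for child in graph.get(node, []):
            if child not in visited:
                visited.add(child)
                heapq.heappush(fringe, (h[child], child))
    return None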
****Explain A* ALGORITHM
The A* algorithm is a best-first graph search algorithm that finds a least-cost path from
a given initial node to a goal node.
Evaluation function f(n): at any node n, it estimates the sum of the cost of the minimal
cost path from the start node s to node n plus the cost of a minimal cost path from node n
to a goal node:
f(n) = g(n) + h(n)
Function f*(n): at any node n, it is the actual cost of an optimal path from node s to node n
plus the cost of an optimal path from node n to a goal node:
f*(n) = g*(n) + h*(n)
h*(n) is the cost of the minimal cost path in the search tree from n to a goal node, and
any path from node n to a goal node that achieves h*(n) is an optimal path from n to a
goal. h is an estimate of h*.
Algorithm:
Step 1: Create a queue consisting of a single node, the root node.
Step 2: If the first member of the queue is a goal node, then go to Step 5.
Step 3: If it is not, remove it from the queue, add it to the list of visited
nodes, consider its child nodes if any, and evaluate them with the evaluation function f.
Step 4: Insert the children into the queue in increasing order of their f values, and go to Step 2.
Step 5: Report success and trace the path from the goal node back to the root.
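Putting the definitions above together, here is a minimal A* sketch in Python; the example graph and heuristic table are illustrative assumptions, not from the notes:

import heapq

def a_star(graph, h, start, goal):
    """graph[n] is a list of (neighbour, step_cost); h[n] estimates cost to goal."""
    fringe = [(h[start], 0, start, [start])]  # entries are (f, g, node, path)
    best_g = {start: 0}
    while fringe:
        f, g, node, path = heapq.heappop(fringe)  # lowest f(n) = g(n) + h(n) first
        if node == goal:
            return path, g
        for child, cost in graph.get(node, []):
            g2 = g + cost  # g(child): cost of the path from the start node
            if g2 < best_g.get(child, float("inf")):
                best_g[child] = g2
                heapq.heappush(fringe, (g2 + h[child], g2, child, path + [child]))
    return None, float("inf")

graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 12)], "B": [("G", 3)]}
h = {"S": 7, "A": 6, "B": 2, "G": 0}
print(a_star(graph, h, "S", "G"))  # (['S', 'A', 'B', 'G'], 6)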
When a problem can be divided into a set of sub problems, where each sub problem
can be solved separately and a combination of these will be a solution, AND-OR
graphs or AND - OR trees are used for representing the solution.
The decomposition of the problem, or problem reduction, generates AND arcs. One
AND arc may point to any number of successor nodes, all of which must be solved for
the arc to point to a solution. As in an OR graph, several arcs may emerge from a single
node, indicating several possible ways of solving the problem. Hence the graph is
known as an AND-OR graph rather than simply an AND graph. The figure shows an
AND-OR graph.
With the information available so far, it appears that C is the most promising node to
expand, since its f' = 3 is the lowest; but going through B would be better, since to use C
we must also use D, and the cost would be 9 (3+4+1+1). Through B it would be 6 (5+1).
Thus the choice of the next node to expand depends not only on its f' value but also on
whether that node is part of the current best path from the initial node. Figure (b)
makes this clearer.
In the figure, the node G appears to be the most promising node, with the least f' value. But
G is not on the current best path, since to use G we must use the arc G-H with a cost of 9,
and this in turn demands that further arcs be used (with a cost of 27). The path from A
through B and the arc E-F is better, with a total cost of 18 (17+1). Thus we can see that to
search an AND-OR graph, the following three things must be done.
1. Traverse the graph starting at the initial node and following the current best path, and
accumulate the set of nodes that are on the path and have not yet been expanded.
2. Pick one of these unexpanded nodes and expand it. Add its successors to the graph
and compute f' (the cost of the remaining distance) for each of them.
3. Change the f' estimate of the newly expanded node to reflect the new information
produced by its successors. Propagate this change backward through the graph, and
decide which path is now the current best path.
The propagation of revised cost estimates backward up the tree is not necessary in the
A* algorithm, because there only unexpanded nodes are examined. In the AO* algorithm,
however, expanded nodes are re-examined so that the current best path can be selected.
The working of the AO* algorithm is illustrated in the figure as follows:
Referring to the figure: the initial node is expanded and D is marked initially as the most
promising node. D is expanded, producing an AND arc E-F. The f' value of D is updated to
10. Going backwards we can see that the AND arc B-C is better; it is now marked as the
current best path. B and C have to be expanded next.
This process continues until a solution is found or all paths have led to dead ends,
indicating that there is no solution. In the A* algorithm, the path from one node to the
other is always that of the lowest cost, and it is independent of the paths through other
nodes.
The algorithm for performing a heuristic search of an AND-OR graph is given below.
Unlike the A* algorithm, which uses two lists, OPEN and CLOSED, the AO* algorithm
uses a single structure G. G represents the part of the search graph generated so far.
Each node in G points down to its immediate successors and up to its immediate
predecessors, and also carries with it the value of h', the cost of a path from itself to a set
of solution nodes. The cost of getting from the start node to the current node, g, is not
stored as in the A* algorithm.
This is because it is not possible to compute a single such value, since there may be
many paths to the same state. In the AO* algorithm, h' serves as the estimate of goodness
of a node. Also, a threshold value called FUTILITY is used.
Most of the search strategies either reason forward or backward; however, often a
mixture of the two directions is appropriate. Such a mixed strategy would make it
possible to solve the major parts of the problem first and then solve the smaller problems
that arise when combining them together. Such a technique is called "Means-Ends
Analysis".
The means-ends analysis process centres around finding the difference between the
current state and the goal state. The problem space of means-ends analysis has an initial
state and one or more goal states, a set of operators with a set of preconditions for their
application, and a difference function that computes the difference between two states
s(i) and s(j). A problem is solved using means-ends analysis by:
1. Comparing the current state s1 to a goal state s2 and computing their difference D12.
2. Selecting an operator OP that is relevant to reducing the difference D12.
3. Applying OP if possible. If not, the current state is saved, a subgoal is created, and
means-ends analysis is applied recursively to reduce the subgoal.
4. If the subgoal is solved, the state is restored and work is resumed on the original problem.
(The first AI program to use means-ends analysis was GPS, the General Problem
Solver.)
Means-ends analysis is useful for many human planning activities. Consider the
example of planning for an office worker. Suppose we have a difference table of three
rules:
1. If in our current state we are hungry, and in our goal state we are not hungry, then
either the "visit hotel" or the "visit canteen" operator is recommended.
2. If in our current state we do not have money, and in our goal state we have money,
then the "visit our bank" operator or the "visit secretary" operator is recommended.
3. If in our current state we do not know where something is, and in our goal state we
do know, then either the "visit office enquiry", "visit secretary" or "visit co-worker"
operator is recommended.
Assume the robot in this domain was given the problem of moving a desk with two
things on it from one room to another.
The difference between the start state and the goal state would be the location of the
desk.
The difference table (figure) lists the differences against the operators that reduce them,
marked with * in the original table. The differences recoverable from the table are: move
object, move robot, clear object, get object on object, get arm empty, be holding object,
and arm empty. One operator from the table, with its preconditions and results, is:
CARRY(obj, loc): preconditions: at(robot, obj) ∧ small(obj); results: at(obj, loc) ∧ at(robot, loc)
Unit II-KNOWLEDGE REPRESENTATION
Definition: Knowledge can be defined as the body of facts and principles accumulated
by humankind, or the act, fact or state of knowing. Knowledge is also having a
familiarity with language, concepts, procedures, rules, ideas, abstractions, places,
customs, beliefs, facts and associations, coupled with an ability to use these notions
effectively in modelling different aspects of the world.
***Properties of knowledge:
Knowledge requires data.
Knowledge is voluminous.
It is hard to characterize accurately.
It is constantly changing.
It differs from data by being organized in a way that corresponds to the
ways it will be used.
***Types of knowledge:
1. Procedural Knowledge is compiled knowledge related to the performance of
some task. E.g. the steps used to solve an algebraic equation are expressed as
procedural knowledge.
2. Declarative Knowledge is passive knowledge expressed as statements of fact
about the world. E.g. personnel data in a database is an explicit piece of independent
knowledge.
3. Heuristic Knowledge is a special type of knowledge used by humans to solve
complex problems. Heuristics are the strategies, tricks or rules of thumb used to
simplify the solution of problems, and are usually acquired with much
experience. E.g. a fault in a television set is located by an experienced technician
without doing numerous voltage checks.
At this juncture we must distinguish knowledge from other concepts such as belief and
hypothesis.
Belief: a belief can be defined as essentially any meaningful and coherent expression that
can be represented. A belief may be true or false.
Hypothesis: a hypothesis can be defined as a justified belief that is not known to be true.
Thus a hypothesis is a belief which is backed up with some supporting evidence, but
it may still be false. Finally, we define knowledge as true justified belief.
Two other terms which we shall occasionally use are epistemology and Meta
knowledge.
Epistemology is the study of the nature of knowledge.
Meta knowledge is knowledge about knowledge, i.e. knowledge about what we know.
Knowledge can be represented at several levels (figure): written text, character strings,
binary numbers, magnetic spots.
Various Knowledge Representation Schemes
PL (Propositional Logic)
FOPL (First Order Predicate Logic)
Frames
Associative Networks
Modal Logics
Object Oriented Methods etc.
Knowledge may be vague, contradictory or incomplete. Yet we would still like to be able
to reason and make decisions. Humans do remarkably well with fuzzy, incomplete
knowledge; we would like our AI programs to demonstrate the same versatility.
Symbol   Meaning
→        Implication
¬ or ~   Negation
∨        OR
∧        AND
∀        For all
∃        There exists
Logic is a formal method for reasoning which has a sound theoretical foundation. This
is especially important in our attempts to mechanize or automate the reasoning process,
in that inferences should be correct and logically sound. The structure of PL or FOPL
must be flexible enough to permit the accurate representation of natural language
reasonably well. Many concepts which can be verbalized can be translated into
symbolic representations which closely approximate the meaning of these concepts.
These symbolic structures can then be manipulated in programs to deduce various facts
and to carry out a form of automated reasoning.
In FOPL, statements from a natural language like English are translated into symbolic
structures comprising predicates, functions, variables, constants, quantifiers and logical
connectives. These symbols form the basic building blocks for the knowledge, and their
combination into valid structures is accomplished using the syntax (rules of
combination) of FOPL. Once structures have been created to represent basic facts,
procedures or other types of knowledge, inference rules may then be applied to compare
and combine them and thereby derive new knowledge. For example:
It is raining
RAINING
It is sunny
SUNNY
It is windy
WINDY
RAINING → ~SUNNY
From the fact that it is raining we can conclude that it is not sunny.
But suppose we want to represent the obvious facts stated by the classical sentences:
Gandhi is a man.
Einstein is a man.
In propositional logic each would be a totally separate assertion, and we would not be
able to draw any conclusion about similarities between Gandhi and Einstein.
The logic based upon the analysis of predicates in any statement is called predicate
logic.
Introduction:
The syntax for FOPL, like PL, is determined by the allowable symbols and rules of
combination. The semantics of FOPL are determined by interpretations assigned to
predicates, rather than propositions. This means that an interpretation must also assign
values to other terms, including constants, variables and functions.
Syntax of FOPL:
The symbols and rules of combination permitted in FOPL are defined as follows.
Connectives: ~, ∧, ∨, →, ↔
Constants: fixed-value terms that belong to a given domain, denoted by numbers or
words. E.g. Flight-102, AK-47, etc.
Variables: terms that can assume different values over a given domain, denoted by
words.
Functions: expressions of the form f(t1, t2, ..., tn), where the ti are terms (constants,
variables or functions) defined over some domain, n >= 0.
A 0-ary function is a constant.
Example:
E1: All employees earning Rs. 10, 00,000 or more per year pay taxes.
MECHANIC(Arun)
~MAN(Kunal)
Predicate Logic:
The Logic based upon the analysis of predicates in any statement is called predicate
logic.
E.g. Our domain: all entities that make up the MCA Dept. in PESIT.
Predicates: HOD(x), Dept-grade-average(y), Advisor-of(z), and so on.
Or, in FOPL:
∀x: Roman(x) → [(loyalto(x, Caesar) ∨ hate(x, Caesar)) ∧ ~(loyalto(x, Caesar) ∧ hate(x, Caesar))]
Now suppose that we want to use these statements to answer the question
It seems that using 7 and 8, we should be able to prove that Marcus was
not loyal to Caesar.
Now let’s try to produce a formal proof reasoning backward from the
desired goal:
To negate a statement covered by one quantifier, change the quantifier from universal
to existential or from existential to universal and negate the statement which it
quantifies.
If the universe for the statement "All monkeys have tails" consists only of monkeys,
then this statement is merely (∀x) P(x), where P(x) means "x has a tail".
However, if the universe consists of objects some of which are not monkeys, a further
refinement is necessary.
"All monkeys have tails" makes no statement about objects in the universe which are
not monkeys. If the object is a monkey, the statement is true if it has a tail and false if it
does not; for any object that is not a monkey, the statement is (vacuously) true.
The statement (1) means "for all x, if x is a Monkey, then x has a tail" and it can be
written as (∀x) [M(x) → P(x)].
The statement (2) means "for all x, if x is a Monkey, then x has no tail" and it can be
written as (∀x) [M(x) → ~P(x)].
The statement (3) means "there is an x such that x is a Monkey and x has a tail" and
can be written as (∃x) [M(x) ∧ P(x)].
The statement (4) means "there is an x such that x is a Monkey and x has no tail" and
can be written as (∃x) [M(x) ∧ ~P(x)].
Solution:
P(x) : x is a person.
R(x) : x is rewarded.
G(x) : x is good.
A(x) : x is ambitious.
Q(x) : x is teasing.
S(x) : x is a road.
Then
d. 'Someone is teasing' can be written as "there is an x such that x is a person
and x is teasing".
Symbolic form: (∃x) [P(x) ∧ Q(x)]
Associative Networks
Associative networks are depicted as directed graphs with labeled nodes and arcs or
arrows. The language used in constructing a network is based on selected domain
primitives for objects and relations as well as some general primitives.
Here, a class of objects known as bird is depicted. The class has some properties,
and a specific member of the class named Tweety is shown. The color of Tweety is
seen to be yellow. Associative networks were introduced by Quillian in 1968 to
model the semantics of English sentences and words.
Quillian's model of semantic networks has a certain intuitive appeal, in that related
information is clustered and bound together through relational links; the knowledge
required for the performance of some task is typically contained within a narrow
domain or semantic vicinity of the task. This type of organization in some way
resembles the way knowledge is stored and retrieved in humans. The graphical
portrayal of knowledge can also be somewhat more expressive than other
representation schemes. Associative networks have been used in a variety of systems
such as natural language understanding, information retrieval, deductive databases,
learning systems, computer vision and speech generation systems.
There is neither generally accepted syntax nor semantics for associative networks.
Most network systems are based on PL or FOPL with extensions. The syntax for
any system is determined by the object and relation primitives chosen and by any
special rules used to connect nodes. Basically the language of associative networks
is formed from letters of the alphabet both upper and lower case, relational symbols,
set membership and subset symbols, decimal digits, square and oval nodes, directed
arcs of arbitrary length. The word symbols used are those which represent object
constants and n-ary relation constants. Nodes are commonly used for objects or
nouns, arcs for relations. The direction of an arc is usually taken from the first to
subsequent arguments as they appear in a relational statement.
A number of arc relations have become common among users. They include ISA,
MEMBER-OF, SUBSET-OF, AKO (a-kind-of), HAS-PARTS, INSTANCE-OF,
AGENT, ATTRIBUTES, SHARED-LIKE and so forth. Less common arcs
have also been used to express modality relations (time, manner, mood),
linguistic case relations (theme, source, goal), logical connectives, quantifiers,
set relations, attributes and quantification (ordinal, count).
The ISA link is most often used to represent the fact that an object is of a certain
type. The ISA predicate has been used to exhibit the following types of structures:
****FUZZY LOGIC
In fuzzy logic, we consider what happens if we make fundamental changes to our idea
of set membership and corresponding changes to our definitions of logical operations.
In fuzzy logic, one's degree of tallness increases with one's height until the value 1 is
reached, so membership is a distribution rather than a yes/no property. This contrasts
with the standard Boolean definition of tall people, where one is either tall or not, and
there must be a specific height that defines the boundary; the same is true for very tall.
Once set membership has been redefined in this way, it is possible to define a
reasoning system based on techniques for combining distributions.
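A tiny sketch of the idea (the 150 cm and 190 cm thresholds are illustrative assumptions): tallness becomes a membership function rising smoothly from 0 to 1, and one common way (Zadeh's operators) to combine such distributions is min for AND, max for OR, and 1 - x for NOT.

def tall(height_cm):
    """Fuzzy membership: 0 below 150 cm, 1 above 190 cm, linear in between (illustrative)."""
    return min(1.0, max(0.0, (height_cm - 150) / 40))

def f_and(a, b): return min(a, b)  # fuzzy AND
def f_or(a, b):  return max(a, b)  # fuzzy OR
def f_not(a):    return 1 - a      # fuzzy NOT

h = 175
print(tall(h))                         # 0.625: somewhat tall
print(f_and(tall(h), f_not(tall(h))))  # 0.375: not 0, unlike Boolean logic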
The motivation for fuzzy sets is provided by the need to represent propositions such
as:
Representing knowledge using a logical formalism like predicate logic has several
advantages. It can be combined with powerful inference mechanisms like
resolution, which makes reasoning with facts easy.
But using a logical formalism, complex structures of the world (objects and their
relationships, events, sequences of events, etc.) cannot be described easily.
(i) Representational Adequacy:- The ability to represent all kinds of knowledge that
are needed in that domain.
(ii) Inferential Adequacy :- The ability to manipulate the represented structure and
infer new structures.
(iii) Inferential Efficiency:- The ability to incorporate additional information into the
knowledge structure that will aid the inference mechanisms.
(iv) Acquisitional Efficiency :- The ability to acquire new information easily, either
by direct insertion or by program control.
The techniques that have been developed in AI systems to accomplish these objectives
fall under two categories:
Such descriptions are sometimes called schemas. One definition of schema is: "Schema
refers to an active organization of past reactions, or of past experience, which must
always be supposed to be operating in any well-adapted organic response".
By using schemas, people as well as programs can exploit the fact that the real world
is not random. There are several types of schemas that have proved useful in AI
programs. They include
(i) Frames:- Used to describe a collection of attributes that a given object possesses
(eg: description of a chair).
Frames and scripts are used very extensively in a variety of AI programs. Before
selecting any specific knowledge representation structure, the following issues have to
be considered.
(i) The basic properties of objects, if any, which are common to every problem
domain must be identified and handled appropriately.
(iii) Mechanisms must be devised to access relevant parts in a large knowledge base.
Information can be retrieved from the knowledge base by an associative search with
VALUE.
In the above figure:
The double arrow indicates a two-way link between the actor and the action.
ATRANS is one of the primitive acts used by the theory; it indicates transfer of
possession.
A second set of building blocks is the set of allowable dependencies among the
conceptualizations described in a sentence.
. Recognize inconsistencies
The inference engine (IE) may tell the TMS that some sentences are contradictory.
The TMS may then find that all those sentences are believed true, and reports this to the
IE, which can eliminate the inconsistencies by determining the assumptions used and
changing them appropriately.
E.g. a statement that either x, y or z is guilty, together with the statements that x is not
guilty, y is not guilty, and z is not guilty, forms a contradiction.
In the absence of any firm knowledge, in many situations we want to reason from
default assumptions.
Unit III-PLANNING
What is planning?
The planning problem in Artificial Intelligence is about the decision making performed
by intelligent creatures like robots, humans, or computer programs when trying to
achieve some goal. It involves choosing a sequence of actions that will transform the
state of the world, step by step, so that it will satisfy the goal.
We have seen that it is possible to solve a problem by considering the appropriate form
of knowledge representation and using algorithms to solve parts of the problem and
also to use searching methods.
The blocks can be stacked one on top of another to form towers of apparently unlimited
height. The stacking is achieved using a robot arm which has fundamental operations
and states which can be assessed using logic and combined using logical operations.
The robot can hold one block at a time and only one block can be moved at a time.
UNSTACK(A,B)
pick up clear block A from block B;
STACK(A,B)
place block A using the arm onto clear block B;
PICKUP(A)
lift clear block A with the empty arm;
PUTDOWN(A)
place the held block A onto a free space on the table.
and the five predicates:
ON(A,B)
block A is on block B.
ONTABLE(A)
block A is on the table.
CLEAR(A)
block A has nothing on it.
HOLDING(A)
the arm holds block A.
ARMEMPTY
the arm holds nothing.
Using logic, but not logical notation, we can say that:
If the arm is holding a block, it is not empty.
If block A is on the table, it is not on any other block.
If block A is on block B, block B is not clear.
A planner works by finding the differences between the current states and the goal
states and then choosing the rules that reduce these differences most effectively.
Means-ends analysis is a good example of this.
If we wish to travel by car to visit a friend
the first thing to do is to fill up the car with fuel.
If we do not have a car then we need to acquire one.
The largest difference must be tackled first.
STRIPS:
Consider the following example in the Blocks World and the fundamental
operations:
STACK
Requires the arm to be holding block A and the other block B to be clear.
Afterwards, block A is on block B and the arm is empty; these predicates become
true (the ADD list). The arm is no longer holding a block and block B is no longer
clear; these predicates become false (the DELETE list).
UNSTACK
Requires that block A is on block B, that the arm is empty and that block A
is clear. Afterwards, block B is clear and the arm is holding block A (the ADD list);
the arm is no longer empty and block A is no longer on block B (the DELETE list).
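A minimal sketch of this representation (the Python encoding is illustrative): each operator carries a precondition set, an ADD list and a DELETE list exactly as described above, and applying an operator is just set arithmetic on the world state.

def op_stack(a, b):
    # STACK(a,b): P: HOLDING(a), CLEAR(b); ADD: ON(a,b), ARMEMPTY; DEL: HOLDING(a), CLEAR(b)
    return ({f"HOLDING({a})", f"CLEAR({b})"},
            {f"ON({a},{b})", "ARMEMPTY"},
            {f"HOLDING({a})", f"CLEAR({b})"})

def op_unstack(a, b):
    # UNSTACK(a,b): P: ON(a,b), CLEAR(a), ARMEMPTY; ADD: HOLDING(a), CLEAR(b); DEL: ON(a,b), ARMEMPTY
    return ({f"ON({a},{b})", f"CLEAR({a})", "ARMEMPTY"},
            {f"HOLDING({a})", f"CLEAR({b})"},
            {f"ON({a},{b})", "ARMEMPTY"})

def apply_op(state, op):
    """Apply a (preconditions, add, delete) operator to a state held as a set of predicates."""
    pre, add, delete = op
    if not pre <= state:
        raise ValueError("preconditions not satisfied")
    return (state - delete) | add

state = {"ON(B,A)", "CLEAR(B)", "CLEAR(A)" if False else "ONTABLE(A)", "ARMEMPTY"}
state = apply_op(state, op_unstack("B", "A"))
print(state)  # {'HOLDING(B)', 'CLEAR(B)', 'CLEAR(A)', 'ONTABLE(A)'}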
We have now greatly reduced the information that needs to be held. If a new
attribute is introduced we do not need to add new axioms for existing operators.
Detecting Progress
The final solution can be detected if we can devise a predicate that is true when the
solution is found and false otherwise. This requires a great deal of thought, and it
requires a proof.
E.g. A* search -- if insufficient progress is made then this trail is aborted in favour of a
more hopeful one.
Sometimes it is clear that solving a problem one way has reduced the problem to
parts that are harder than the original state.
By moving back from the goal state to the initial state it is possible to detect
conflicts and any trail or path that involves a conflict can be pruned out.
Reducing the number of possible paths means that there are more resources
available for those left.
Suppose the computer teacher at a school is ill. There are two possible alternatives:
transfer a teacher from mathematics who knows computing, or bring another one in.
Possible Problems:
If the maths teacher is the only teacher of maths, the problem is not solved.
The basic idea for handling interacting compound goals uses goal stacks. Here the stack
contains:
Consider the following, where we wish to proceed from the start state to the goal state.
Two of the goals are already solved, as they are true in the initial state: ONTABLE(A)
and ONTABLE(D).
There are two alternative goal stacks:

Stack 1:
ON(C,A)
ON(B,D)
ON(C,A) ∧ ON(B,D) ∧ ONTABLE(A) ∧ ONTABLE(D)

Stack 2:
ON(B,D)
ON(C,A)
ON(C,A) ∧ ON(B,D) ∧ ONTABLE(A) ∧ ONTABLE(D)
The method is to:
Investigate the first node on the stack, i.e. the top goal.
If a sequence of operators is found that satisfies this goal, it is removed and the
next goal is attempted.
CLEAR(D)
HOLDING(B)
CLEAR(D) ∧ HOLDING(B)
STACK(B,D)
PICKUP(C)
At this point the top goal is true, and so is the next one, and thus the combined goal,
leading to the application of STACK(B,D), which means that the world model becomes
one in which B is on D. This means that we can perform PICKUP(C) and then
STACK(C,A). Now, coming to the goal ON(B,D), we realise that this has already been
achieved, and checking the final goal we derive the following plan:
1. UNSTACK(B,A)
2. STACK (B,D)
3. PICKUP(C)
4. STACK (C,A)
This method produces a plan using good Artificial Intelligence techniques such as
heuristics to find matching goals and the A* algorithm to detect unpromising paths
which can be discarded.
*****Sussman Anomaly
The combined goal is ON(A,B) ∧ ON(B,C), giving two alternative goal stacks:

Stack 1:
ON(A,B)
ON(B,C)
ON(A,B) ∧ ON(B,C)

Stack 2:
ON(B,C)
ON(A,B)
ON(A,B) ∧ ON(B,C)
Choosing path 1 and trying to get block A on block B leads to the goal stack:
ON(C,A)
CLEAR(C)
ARMEMPTY
ON(C,A) ∧ CLEAR(C) ∧ ARMEMPTY
UNSTACK(C,A)
ARMEMPTY
CLEAR(A) ∧ ARMEMPTY
PICKUP(A)
CLEAR(B) ∧ HOLDING(A)
STACK(A,B)
ON(B,C)
ON(A,B) ∧ ON(B,C)
This achieves block A on block B which was produced by putting block C on the table.
The sequence of operators is
1. UNSTACK(C,A)
2. PUTDOWN(C)
3. PICKUP(A)
4. STACK (A,B)
Working on the next goal of ON(B,C) requires block B to be cleared so that it can be
stacked on block C. Unfortunately we need to unstack block A which we just did. Thus
the list of operators becomes
1. UNSTACK(C,A)
2. PUTDOWN(C)
3. PICKUP(A)
4. STACK (A,B)
5. UNSTACK(A,B)
6. PUTDOWN(A)
7. PICKUP(B)
8. STACK (B,C)
To get back to the state in which block A is on block B, two extra operations are needed:
9. PICKUP(A)
10. STACK(A,B)
Steps 4 and 5 are opposites and therefore cancel each other out, and
steps 3 and 6 are opposites and therefore cancel each other out as well. Removing them gives:
1. UNSTACK(C,A)
2. PUTDOWN(C)
3. PICKUP(B)
4. STACK (B,C)
5. PICKUP(A)
6. STACK(A,B)
In this problem means-end analysis suggests two steps with end conditions ON(A,B)
and ON(B,C) which indicates the operator STACK giving the layout shown below
where the operator is preceded by its preconditions and followed by its post conditions:
CLEAR(B)                     CLEAR(C)
*HOLDING(A)                  *HOLDING(B)
__________________           __________________
STACK(A,B)                   STACK(B,C)
__________________           __________________
ARMEMPTY                     ARMEMPTY
ON(A,B)                      ON(B,C)
~CLEAR(B)                    ~CLEAR(C)
~HOLDING(A)                  ~HOLDING(B)
NOTE:
There is no order at this stage.
Unachieved preconditions are starred (*).
Both of the HOLDING preconditions are unachieved since the arm holds
nothing in the initial state.
Delete postconditions are marked by (~ ).
Many planning methods have introduced heuristics to achieve goals or preconditions.
Early attempts to do this involved the use of macro-operators, in which larger operators
were built from smaller ones. But in this approach no details were eliminated from the
actual descriptions of the operators.
A better approach was developed in the ABSTRIPS system, which actually planned in
a hierarchy of abstraction spaces, in each of which preconditions at a lower level of
abstraction were ignored.
As an example, suppose you want to visit a friend in Europe but you have a limited
amount of cash to spend. It makes sense to check air fares first, since finding an
affordable flight will be the most difficult part of the task.
You should not start by worrying about getting out of your driveway, planning a route
to the airport, or parking your car.
The ABSTRIPS approach to problem solving is as follows: first solve the problem
completely, considering only preconditions whose criticality value is the highest
possible. Then use the resulting plan as an outline of the complete plan and consider
preconditions at the next-lowest criticality level, augmenting the plan with operators
that satisfy them, and so on.
Because this process explores entire plans at one level of detail before it looks at the
lower-level details of any one of them, it has been called length-first search.
***Perception
We perceive our environment through many channels: sight, sound, touch, smell, taste.
Many animals possess these same perceptual capabilities, and others are also able to
monitor entirely different channels.
Robots, too, can process visual and auditory information, and they can also be equipped
with more exotic sensors, such as laser rangefinders, speedometers and radar.
Two extremely important sensory channels for humans are vision and spoken
language. It is through these two faculties that we gather almost all of the knowledge
that drives our problem-solving behaviors.
The question is how we can transform raw camera images into useful information
about the world.
A video camera provides a computer with an image represented as a two-
dimensional grid of intensity levels. Each grid element, or pixel, may store a single
bit of information (that is, black/white) or many bits (perhaps a real-valued intensity
measure and color information).
A visual image is composed of thousands of pixels. What kinds of things might we
want to do with such an image? Here are four operations, in order of increasing
complexity:
1. Signal Processing:- Enhancing the image, either for human consumption or as
input to another program.
2. Measurement Analysis:- For images containing a single object, determining the
two-dimensional extent of the object depicted.
3. Pattern Recognition:- For single-object images, classifying the object into a
category drawn from a finite set of possibilities.
4. Image Understanding:- For images containing many objects, locating the objects in
the image, classifying them, and building a three-dimensional model of the scene.
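For concreteness, the first two operations can be sketched in a few lines (a hypothetical toy image, assuming NumPy is available): thresholding is a simple signal-processing step, and counting object pixels is a crude measurement of 2-D extent.

import numpy as np

# A tiny 5x5 grey-level image: one bright object on a dark background (illustrative).
image = np.array([[10,  12,  11, 10,  9],
                  [11, 200, 210, 12, 10],
                  [10, 205, 220, 11, 12],
                  [ 9,  11,  10, 10, 11],
                  [10,   9,  12, 11, 10]])

binary = image > 128        # signal processing: threshold to black/white
area = binary.sum()         # measurement: 2-D extent (pixel count) of the object
print(binary.astype(int))
print("object area:", area)  # 4 pixels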
There are algorithms that perform the first two operations. The third operation, pattern
recognition, varies in its difficulty. It is possible to classify two-dimensional (2-D)
objects, such as machine parts coming down a conveyor belt, but classifying 3-D
objects is harder because of the large number of possible orientations for each object.
Image understanding is the most difficult visual task, and it has been the subject of
the most study in AI. While some aspects of image understanding reduce to
measurement analysis and pattern recognition, the entire problem remains unsolved,
because of difficulties that include the following:
3. The value of a single pixel is affected by many different phenomena, including the
color of the object, the source of the light, the angle and distance of the camera, the
pollution in the air, etc. It is hard to disentangle these effects.
As a result, 2-D images are highly ambiguous. Given a single image, we could
construct any number of 3-D worlds that would give rise to that image; it is impossible
to decide from the image alone what 3-D solid it portrays. In order to determine the
most likely interpretation of a scene, we have to apply several types of knowledge.
****Robot Architecture
Now let us turn to what happens when we put it all together: perception, cognition, and
action.
There are many decisions involved in designing an architecture that integrates all these
capabilities, among them:
What range of tasks is supported by the architecture?
What type of environment (e.g. indoor, outdoor, space) is supported?
How are complex behaviors turned into sequences of low-level actions?
Is control centralized or distributed?
How are numeric and symbolic representations merged?
How does the architecture represent the state of the world?
How quickly can the architecture react to changes in the environment?
How does the architecture decide when to plan and when to act?
3. Robot sensors are inaccurate, and their effectors are limited in precision.
4. Many robots must react in real time. A robot fighter plane, for example, cannot
afford to search optimally or to stop monitoring the world during a LISP garbage
collection.
5. The real world is unpredictable, dynamic, and uncertain. A robot cannot hope to
maintain a correct and complete description of the world. This means that a robot must
consider the trade-off between devising and executing plans. This trade-off has several
aspects.
For one thing, a robot may not possess enough information about the world for it to do
any useful planning. In that case, it must first engage in information-gathering activity.
Furthermore, once it begins executing a plan, the robot must continually monitor the
results of its actions. If the results are unexpected, then re-planning may be necessary.
6. Because robots must operate in the real world, searching and backtracking can be
costly.
Recent years have seen efforts to integrate research in robotics and AI. The old idea of
simply attaching sensors and effectors to existing AI programs has given way to a serious
rethinking of basic AI algorithms in light of the problems involved in dealing with the
physical world.
***Texture:
Texture is one of the most important attributes used in image analysis and pattern
recognition. It provides surface characteristics for the analysis of many types of images,
including natural scenes, remotely sensed data and biomedical modalities, and it plays an
important role in the human visual system for recognition and interpretation.
• Although there is no formal definition for texture, intuitively this descriptor
provides measures of properties such as smoothness, coarseness, and regularity.
***Computer Vision:
• Computer vision is the science and technology of machines that see.
• It is concerned with the theory for building artificial systems that obtain information
from images.
• The image data can take many forms, such as a video sequence, depth images,
views from multiple cameras, or multi-dimensional data from a medical scanner.
Unit IV-LEARNING
The learning element is the portion of a learning AI system that decides how to modify the
performance element and implements those modifications.
We all learn new knowledge through different methods, depending on the type of
material to be learned, the amount of relevant knowledge we already possess, and the
environment in which the learning takes place.
***There are five methods of learning. They are,
1. Memorization (rote learning)
2. Direct instruction (by being told)
3. Analogy
4. Induction
5. Deduction
Direct instruction is a complex form of learning. This type of learning requires more
inference than rote learning, since the knowledge must be transformed into an
operational form before it can be used.
Learning by analogy requires still more inference than either of the previous forms,
since difficult transformations must be made between the known and unknown
situations.
MATCHING: So far, we have seen the process of using search to solve problems as
the application of appropriate rules to individual problem states to generate new states
to which the rules can then be applied, and so forth, until a solution is found.
Clever search involves choosing from among the rules that can be applied at a
particular point, the ones that are most likely to lead to a solution. We need to extract
from the entire collection of rules, those that can be applied at a given point. To do so
requires some kind of matching between the current state and the preconditions of the
rules.
A. It requires the use of a large number of rules. Scanning through all of them would
be hopelessly inefficient.
B. It is not always immediately obvious whether a rule’s preconditions are satisfied by
a particular state.
Sometimes, instead of searching through the rules, we can use the current state as an
index into the rules and select the matching ones immediately. In spite of limitations,
indexing in some form is very important in the efficient operation of rule-based
systems.
A more complex matching is required when the preconditions of a rule specify required
properties that are not stated explicitly in the description of the current state. In this
case, a separate set of rules must be used to describe how some properties can be
inferred from others.
An even more complex matching process is required if rules should be applied when
their preconditions approximately match the current situation. This is often the case in
situations involving physical descriptions of the world.
In supervised learning, the training data provided to the machines work as the supervisor
that teaches the machines to predict the output correctly. It applies the same concept as a
student learns in the supervision of the teacher.
In the real-world, supervised learning can be used for Risk Assessment, Image
classification, Fraud Detection, spam filtering, etc.
In supervised learning, models are trained using a labelled dataset, where the model learns
about each type of data. Once the training process is completed, the model is tested on the
basis of test data (data held back from training), and then it predicts the output.
The working of Supervised learning can be easily understood by the below example and
diagram:
Suppose we have a dataset of different types of shapes which includes square, rectangle,
triangle, and Polygon. Now the first step is that we need to train the model for each shape.
o If the given shape has four sides, and all the sides are equal, then it will be labelled as
a Square.
o If the given shape has three sides, then it will be labelled as a triangle.
o If the given shape has six equal sides then it will be labelled as hexagon.
Now, after training, we test our model using the test set, and the task of the model is to
identify the shape.
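A minimal runnable version of this shape example, assuming scikit-learn is available (the feature encoding, number of sides plus an all-sides-equal flag, is an illustrative choice, not from the notes):

from sklearn.tree import DecisionTreeClassifier

# Features: [number of sides, 1 if all sides are equal else 0]; labels are shape names.
X_train = [[4, 1], [4, 0], [3, 0], [6, 1], [4, 1], [3, 0]]
y_train = ["square", "rectangle", "triangle", "hexagon", "square", "triangle"]

model = DecisionTreeClassifier().fit(X_train, y_train)  # learn from labelled data

X_test = [[4, 1], [3, 0]]      # unseen shapes the model must identify
print(model.predict(X_test))   # ['square' 'triangle']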
1. Regression
Regression algorithms are used if there is a relationship between the input variable and the
output variable. It is used for the prediction of continuous variables, such as Weather
forecasting, Market Trends, etc. Below are some popular Regression algorithms which come
under supervised learning:
Linear Regression
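A minimal linear regression sketch on synthetic data, assuming scikit-learn (the numbers are invented for illustration):

import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic continuous target: temperature as a rough function of the hour of day.
hours = np.array([[6], [9], [12], [15], [18]])
temps = np.array([18.0, 22.0, 27.0, 25.0, 21.0])

reg = LinearRegression().fit(hours, temps)  # fit a line to the labelled pairs
print(reg.predict([[13]]))                  # predicted temperature at 13:00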
2. Classification
Classification algorithms are used when the output variable is categorical, which means there
is a finite set of classes, such as Yes-No, Male-Female, True-False, etc.
o Random Forest
o Decision Trees
o Logistic Regression
o Support vector Machines
o With the help of supervised learning, the model can predict the output on the basis of prior
experiences.
o In supervised learning, we can have an exact idea about the classes of objects.
o Supervised learning model helps us to solve various real-world problems such as fraud
detection, spam filtering, etc.
o Supervised learning models are not suitable for handling complex tasks.
o Supervised learning cannot predict the correct output if the test data is different from the
training dataset.
o Training requires a lot of computation time.
o In supervised learning, we need enough knowledge about the classes of objects.
In supervised machine learning, models are trained using labeled data under the
supervision of the training data. But there may be many cases in which we do not have labeled
data and need to find the hidden patterns in the given dataset. To solve such
cases in machine learning, we need unsupervised learning techniques.
Unsupervised learning is a type of machine learning in which models are trained using unlabeled dataset
and are allowed to act on that data without any supervision.
Example: Suppose the unsupervised learning algorithm is given an input dataset containing
images of different types of cats and dogs. The algorithm is never trained upon the given
dataset, which means it does not have any idea about the features of the dataset. The task of
the unsupervised learning algorithm is to identify the image features on their own.
Unsupervised learning algorithm will perform this task by clustering the image dataset into
the groups according to similarities between images.
Below are some main reasons which describe the importance of Unsupervised Learning:
o Unsupervised learning is helpful for finding useful insights from the data.
o Unsupervised learning is much more similar to how a human learns to think by their own
experiences, which makes it closer to real AI.
Here, we have taken unlabeled input data, which means it is not categorized and
corresponding outputs are also not given. This unlabeled input data is fed to the
machine learning model in order to train it. First, it will interpret the raw data to find the
hidden patterns, and then it will apply suitable algorithms such as k-means
clustering, hierarchical clustering, etc.
Once it applies the suitable algorithm, the algorithm divides the data objects into groups
according to the similarities and differences between the objects.
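A minimal k-means sketch, assuming scikit-learn (the six unlabeled points are illustrative): the algorithm is given no labels, yet it divides the points into two groups by similarity.

import numpy as np
from sklearn.cluster import KMeans

# Unlabeled 2-D points: two loose groups, but no labels are given to the algorithm.
X = np.array([[1, 2], [1, 4], [0, 2],      # points near the origin
              [10, 2], [10, 4], [11, 0]])  # points far to the right

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # e.g. [1 1 1 0 0 0]: groups found by similarity
print(kmeans.cluster_centers_)  # the centre of each discovered group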
The unsupervised learning algorithm can be further categorized into two types of problems:
o K-means clustering
o KNN (k-nearest neighbors)
o Hierarchical clustering
o Anomaly detection
o Neural Networks
o Principal Component Analysis
o Independent Component Analysis
o Apriori algorithm
o Singular value decomposition
o Unsupervised learning is used for more complex tasks as compared to supervised learning
because, in unsupervised learning, we don't have labeled input data.
o Unsupervised learning is intrinsically more difficult than supervised learning as it does not
have corresponding output.
o The result of the unsupervised learning algorithm might be less accurate as input data is not
labeled, and algorithms do not know the exact output in advance.
Suppose there are two categories, i.e., Category A and Category B, and we have a new data
point x1; in which of these categories will this data point lie? To solve this type of
problem, we need a K-NN algorithm. With the help of K-NN, we can easily identify the
category or class of a particular data point. Consider the below diagram:
The K-NN working can be explained on the basis of the below algorithm:
Suppose we have a new data point and we need to put it in the required category. Consider
the below image:
o Firstly, we will choose the number of neighbors, so we will choose the k=5.
o Next, we will calculate the Euclidean distance between the data points. The Euclidean
distance between two points A(x1, y1) and B(x2, y2), which we have already studied in
geometry, is √((x2 − x1)² + (y2 − y1)²).
o As we can see, three of the five nearest neighbors are from Category A, hence this new
data point must belong to Category A.
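A minimal K-NN sketch with k=5, assuming scikit-learn (the points for the two categories are illustrative):

from sklearn.neighbors import KNeighborsClassifier

# Two categories of 2-D points (invented for illustration).
X = [[1, 1], [1, 2], [2, 1], [2, 2],   # Category A
     [6, 6], [6, 7], [7, 6], [7, 7]]   # Category B
y = ["A", "A", "A", "A", "B", "B", "B", "B"]

knn = KNeighborsClassifier(n_neighbors=5)  # k = 5, Euclidean distance by default
knn.fit(X, y)

x1 = [[2, 3]]            # the new data point to categorize
print(knn.predict(x1))   # ['A']: most of its 5 nearest neighbours are in Category A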
Below are some points to remember while selecting the value of K in the K-NN algorithm:
There is no particular way to determine the best value for "K", so we need to try some values
to find the best out of them. The most preferred value for K is 5.
o A very low value for K, such as K=1 or K=2, can be noisy and lead to the effects of outliers in
the model.
o Large values for K are good at smoothing out noise, but they may run into difficulties, since
distant points from other categories start to influence the result.
o It is simple to implement.
o It is robust to the noisy training data
o It can be more effective if the training data is large.
o We always need to determine the value of K, which may be complex at times.
There are various algorithms in machine learning, so choosing the best algorithm for the
given dataset and problem is the main point to remember while creating a machine learning
model. Below are two reasons for using the Decision tree:
o Decision Trees usually mimic human thinking ability while making a decision, so it is easy to
understand.
o The logic behind the decision tree can be easily understood because it shows a tree-like
structure.
Root Node: Root node is from where the decision tree starts. It represents the entire dataset, which
further gets divided into two or more homogeneous sets.
Leaf Node: Leaf nodes are the final output node, and the tree cannot be segregated further after getting
a leaf node.
Splitting: Splitting is the process of dividing the decision node/root node into sub-nodes according to the
given conditions.
Pruning: Pruning is the process of removing the unwanted branches from the tree.
Parent/Child node: The root node of the tree is called the parent node, and other nodes are called the
child nodes.
In a decision tree, for predicting the class of the given dataset, the algorithm starts from the
root node of the tree. It compares the values of the root attribute with the record
(real dataset) attribute and, based on the comparison, follows the branch and jumps to the
next node.
For the next node, the algorithm again compares the attribute value with the other sub-
nodes and moves further. It continues the process until it reaches a leaf node of the tree.
The complete process can be better understood using the below algorithm:
o Step-1: Begin the tree with the root node, say S, which contains the complete dataset.
o Step-2: Find the best attribute in the dataset using an Attribute Selection Measure (ASM).
o Step-3: Divide S into subsets that contain possible values for the best attribute.
o Step-4: Generate the decision tree node which contains the best attribute.
o Step-5: Recursively make new decision trees using the subsets of the dataset created in
Step-3. Continue this process until a stage is reached where the nodes cannot be classified
further; the final nodes are then leaf nodes.
Example: Suppose there is a candidate who has a job offer and wants to decide whether he
should accept the offer or Not. So, to solve this problem, the decision tree starts with the
root node (Salary attribute by ASM). The root node splits further into the next decision node
(distance from the office) and one leaf node based on the corresponding labels. The next
decision node further gets split into one decision node (Cab facility) and one leaf node.
Finally, the decision node splits into two leaf nodes (Accepted offers and Declined offer).
Consider the below diagram:
o Information Gain
o Gini Index
1. Information Gain:
o Information gain is the measurement of changes in entropy after the segmentation of a
dataset based on an attribute.
o It calculates how much information a feature provides us about a class.
o According to the value of information gain, we split the node and build the decision tree.
o A decision tree algorithm always tries to maximize the value of information gain, and a
node/attribute having the highest information gain is split first. It can be calculated using the
below formula:
Information Gain = Entropy(S) − [(weighted average) × Entropy(each feature)]
where Entropy(S) = −P(yes) log2 P(yes) − P(no) log2 P(no), S is the set of samples, and
P(yes) and P(no) are the probabilities of the yes and no classes.
2. Gini Index:
o Gini index is a measure of impurity or purity used while creating a decision tree in the
CART(Classification and Regression Tree) algorithm.
o An attribute with the low Gini index should be preferred as compared to the high Gini index.
o It only creates binary splits, and the CART algorithm uses the Gini index to create binary splits.
o Gini index can be calculated using the below formula:
Gini Index = 1 − Σj (Pj)²
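Both measures can be computed in a few lines; the following sketch (the toy labels and split are illustrative) evaluates the entropy, the information gain of one candidate split, and the Gini index:

import math

def entropy(labels):
    """Entropy(S) = -sum over classes c of p_c * log2(p_c)."""
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * math.log2(p) for p in probs)

def gini(labels):
    """Gini(S) = 1 - sum over classes c of p_c squared."""
    n = len(labels)
    return 1 - sum((labels.count(c) / n) ** 2 for c in set(labels))

S = ["yes"] * 9 + ["no"] * 5   # a toy dataset of 14 class labels
left, right = S[:8], S[8:]     # a candidate split on some attribute
weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(S)
print("Entropy(S):", round(entropy(S), 3))                 # 0.940
print("Information gain:", round(entropy(S) - weighted, 3))
print("Gini(S):", round(gini(S), 3))                       # 0.459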
o It is simple to understand as it follows the same process which a human follow while making
any decision in real-life.
o It can be very useful for solving decision-related problems.
The term "artificial neural network" refers to a biologically inspired sub-field of artificial
intelligence modeled after the brain. An artificial neural network is a computational
network based on biological neural networks, which construct the structure of the human
brain. Just as the human brain has neurons interconnected with each other, artificial neural
networks also have neurons that are linked to each other in various layers of the network.
These neurons are known as nodes.
The term "Artificial Neural Network" is derived from Biological neural networks that
develop the structure of a human brain. Similar to the human brain that has neurons
interconnected to one another, artificial neural networks also have neurons that are
interconnected to one another in various layers of the networks. These neurons are known
as nodes.
The typical artificial neural network looks something like the following.
[Figure: a layered ANN with an input layer, one or more hidden layers, and an output layer.]
Dendrites from the biological neural network represent inputs in artificial neural networks, the cell
nucleus represents the nodes, synapses represent the weights, and the axon represents the output.
Biological Neural Network → Artificial Neural Network
Dendrites → Inputs
Cell nucleus → Nodes
Synapse → Weights
Axon → Output
There are around 100 billion neurons in the human brain, and each neuron is connected to
somewhere between 1,000 and 100,000 others. In the human brain, data is stored in a
distributed manner, and we can retrieve more than one piece of this data from memory in
parallel when necessary. We can say that the human brain is a massively parallel processor.
We can understand the artificial neural network with an example. Consider a digital logic
gate that takes inputs and gives an output: an "OR" gate with two inputs. If one or both
inputs are "On," the output is "On"; if both inputs are "Off," the output is "Off." Here the
output is a fixed function of the input. Our brain does not work this way: the relationship
between outputs and inputs keeps changing, because the neurons in our brain are
"learning."
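This contrast can be sketched in a few lines of Python. A perceptron, the simplest artificial
neuron, starts with arbitrary weights and learns the OR function by adjusting them; the learning
rate and number of passes below are arbitrary illustrative choices:

# A perceptron that learns the OR gate instead of having it hard-wired.
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 1, 1, 1]  # OR truth table

w1, w2, bias, lr = 0.0, 0.0, 0.0, 0.1
for _ in range(20):  # a few passes over the data
    for (x1, x2), t in zip(inputs, targets):
        out = 1 if (w1 * x1 + w2 * x2 + bias) > 0 else 0
        error = t - out
        w1 += lr * error * x1  # perceptron weight-update rule
        w2 += lr * error * x2
        bias += lr * error

for x1, x2 in inputs:
    print(x1, x2, "->", 1 if (w1 * x1 + w2 * x2 + bias) > 0 else 0)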
The architecture of an artificial neural network:
Input Layer:
As the name suggests, this layer accepts inputs in several different formats provided by the
programmer.
Hidden Layer:
The hidden layer sits between the input and output layers. It performs all the calculations
needed to find hidden features and patterns.
Output Layer:
The input goes through a series of transformations in the hidden layer, and the final result is
conveyed as the output by this layer.
The artificial neural network takes the inputs, computes their weighted sum, and adds a bias.
This computation is represented in the form of a transfer function: net = Σi (wi · xi) + b.
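For a single neuron this computation is only a few lines. The sketch below assumes a sigmoid as
the transfer function; the input and weight values are arbitrary:

# One neuron: weighted sum of inputs plus a bias, passed through a transfer function.
import math

def neuron(inputs, weights, bias):
    net = sum(w * x for w, x in zip(weights, inputs)) + bias  # weighted sum + bias
    return 1 / (1 + math.exp(-net))  # sigmoid transfer function

print(neuron([0.5, 0.8], [0.4, -0.6], 0.1))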
Advantages of Artificial Neural Networks:
Parallel processing capability:
An artificial neural network can perform more than one task simultaneously.
Storing data on the entire network:
Unlike in traditional programming, where data is stored in a database, the information here is
stored on the whole network. The disappearance of a couple of pieces of data in one place
does not prevent the network from working.
Capability to work with incomplete knowledge:
After training, an ANN may produce output even from incomplete data. The loss of
performance depends on the importance of the missing data.
Having a memory distribution:
For an ANN to be able to adapt, it is important to determine the examples and to teach the
network by showing it these examples until it produces the desired output. The success of
the network is directly proportional to the chosen instances; if the problem cannot be shown
to the network in all its aspects, the network can produce false output.
Having fault tolerance:
Corruption of one or more cells of an ANN does not prevent it from generating output; this
feature makes the network fault-tolerant.
Disadvantages of Artificial Neural Networks:
Assurance of proper network structure:
There is no particular guideline for determining the structure of an artificial neural network;
an appropriate network structure is arrived at through experience and trial and error.
Unrecognized behavior of the network:
This is the most significant issue with ANNs. When an ANN produces a solution, it gives no
insight into why or how it was reached, which decreases trust in the network.
Hardware dependence:
Artificial neural networks require processors with parallel processing power suited to their
structure, so their realization is equipment-dependent.
Difficulty of showing the problem to the network:
ANNs can work only with numerical data. Problems must be converted into numerical values
before being introduced to the ANN, and the representation mechanism chosen directly
affects the performance of the network. This depends on the user's skill.
The duration of the network is unknown:
The network is trained down to a certain error value, and this value does not guarantee that
the results are optimal.
Artificial neural networks, which stepped into the world in the mid-20th century, are developing
exponentially. We have examined above the advantages of artificial neural networks and the issues
encountered in the course of their use. It should not be overlooked that the disadvantages of ANNs, a
flourishing branch of science, are being eliminated one by one, while their advantages increase day by
day; this means artificial neural networks will progressively become an irreplaceable part of our lives.
An artificial neural network can best be represented as a weighted directed graph in which the
artificial neurons form the nodes and the connections between neuron outputs and neuron
inputs form the directed, weighted edges. The network receives an input signal from an
external source in the form of a pattern or image, represented as a vector; these inputs are
then denoted mathematically by x(n) for each of the n inputs.
Each input is multiplied by its weight and the results are summed. If the weighted sum is equal
to zero, a bias is added to make the output non-zero, or to otherwise scale up the system's
response; the bias acts like an extra input with value 1 carrying its own weight. The total of the
weighted inputs can lie anywhere from 0 to positive infinity, so to keep the response within the
desired limits, a maximum value is benchmarked and the weighted sum is passed through the
activation function.
The activation function refers to the set of transfer functions used to achieve the desired
output. There are many kinds of activation functions, divided primarily into linear and non-
linear sets. Among the commonly used activation functions are the binary (step), linear, and
hyperbolic tangent (tanh) sigmoidal functions.
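The three families named above can be sketched as follows (the function names are our own
labels):

# Binary step, linear, and hyperbolic tangent activation functions.
import math

def binary_step(x):
    return 1 if x >= 0 else 0

def linear(x, a=1.0):
    return a * x  # output proportional to input

def tanh_sigmoid(x):
    return math.tanh(x)  # hyperbolic tangent, output in (-1, 1)

for x in (-2.0, 0.0, 2.0):
    print(x, binary_step(x), linear(x), round(tanh_sigmoid(x), 3))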
There are various types of artificial neural networks, each modeled on the neurons and
network functions of the human brain and performing tasks in a correspondingly similar way.
Most artificial neural networks have some similarity to their more complex biological
counterparts and are very effective at their intended tasks, for example segmentation or
classification.
Feedback ANN:
In this type of ANN, the output is fed back into the network to achieve the best-evolved
results internally. According to the University of Massachusetts Lowell Center for Atmospheric
Research, feedback networks feed information back into themselves and are well suited to
solving optimization problems. Feedback ANNs are used for internal system error corrections.
EXPERT SYSTEMS
There is a class of computer programs, known as expert systems, that aim to mimic
human reasoning. The methods and techniques used to build these programs are the
outcome of efforts in a field of computer science known as Artificial Intelligence (AI).
Expert systems have been built to diagnose disease (Pathfinder is an expert system that
assists surgical pathologists with the diagnosis of lymph-node diseases), aid in the design
of chemical syntheses, prospect for mineral deposits (PROSPECTOR), translate natural
languages, and solve complex mathematical problems (MACSYMA).
Features
An expert system is a computer program with a set of rules encapsulating knowledge
about a particular problem domain (e.g., medicine, chemistry, finance, or flight).
These rules prescribe actions to take when certain conditions hold, and they define the
effect of those actions on deductions or data.
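One simple way such condition-action rules could be encoded as data is sketched below; the
domain, field names, and thresholds are entirely hypothetical:

# Rules as data: each rule pairs a condition with the action it prescribes.
rules = [
    {"if": lambda p: p["temp_c"] >= 38.0 and p["wbc"] > 11.0,
     "then": "suspect bacterial infection"},
    {"if": lambda p: p["temp_c"] < 38.0,
     "then": "infection unlikely"},
]

patient = {"temp_c": 38.5, "wbc": 12.3}
for rule in rules:
    if rule["if"](patient):   # when the condition holds...
        print(rule["then"])   # ...the prescribed action fires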
One of the early applications, MYCIN, was created to help physicians diagnose and
treat bacterial infections. Expert systems have been used to analyze geophysical data
in the search for petroleum and metal deposits (e.g., PROSPECTOR). They are used
by the investment, banking, and telecommunications industries.
They are essential in robotics, natural language processing, theorem proving, and the
intelligent retrieval of information from databases. They are also used in many other
endeavors that might be considered more practical.
Rule-based systems have been used to monitor and control traffic, to aid in the
development of flight systems, and by the federal government to prepare budgets.
An expert system shell can be applied to many different problem domains with little
or no change. This also means that adding or modifying rules in an expert system can
change program behavior without affecting the controlling component, the
system shell.
Changes to the Knowledge-base can be made easily by subject matter experts without
programmer intervention, thereby reducing the cost of software maintenance and
helping to ensure that changes are made in the way they were intended.
Rules are added to the knowledge-base by subject matter experts using text or
graphical editors that are integral to the system shell. The simple process by which
rules are added to the knowledge-base is depicted in Figure 1.
An expert system is typically composed of two major components: the Knowledge-
base and the Expert System Shell. The Knowledge-base is a collection of rules
encoded as metadata in a file system, or more often in a relational database.
Client Interface
The Client Interface processes requests for service from system-users and from
application layer components. Client Interface logic routes these requests to an
appropriate shell program unit.
For example, when a subject matter expert wishes to create or edit a rule, they
use the Client Interface to dispatch the Knowledge-base Editor. Other service
requests might schedule a rule, or a group of rules, for execution by the Rule
Engine.
Rule Translator
Rules, as they are composed by subject matter experts, are not directly
executable. They must first be converted from their human-readable form into a
form that the Rule Engine can interpret, typically an abstract syntax tree (AST).
Converting rules from one form to another is the function performed by the Rule
Translator.
The Rule Engine interpreter traverses the AST, executing actions specified in the
rule along the way.
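As a toy sketch of that idea: below, a rule already "translated" into a small AST (nested tuples),
which a miniature engine traverses, evaluating the condition and executing the action. The node
shapes and names are invented for illustration and are not taken from any real shell:

# A rule translated into a tiny AST, and an interpreter that walks it.
rule_ast = ("if",
            ("gt", ("fact", "pressure"), ("const", 100)),
            ("action", "open_valve"))

facts = {"pressure": 120}  # working memory consulted by the rule

def evaluate(node):
    kind = node[0]
    if kind == "if":      # execute the action only if the condition holds
        return evaluate(node[2]) if evaluate(node[1]) else None
    if kind == "gt":      # comparison node
        return evaluate(node[1]) > evaluate(node[2])
    if kind == "fact":    # look a value up in working memory
        return facts[node[1]]
    if kind == "const":   # literal value
        return node[1]
    if kind == "action":  # the action specified in the rule
        print("executing:", node[1])
        return node[1]

evaluate(rule_ast)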
The shell component, Rule Object Classes, is a container for object classes
supporting,