Module-4
Bayesian Learning
Definition:
Bayesian learning is a probabilistic approach to learning in which Bayes' rule is used to update the belief in a hypothesis as new evidence is observed.
Purpose:
It helps in predicting outcomes and learning from large datasets by using Bayes' rule to
infer unknown quantities.
Bayesian Algorithms:
Bayes' Rule: Forms the foundation of probabilistic learning and Bayesian learning
algorithms for inferring useful information.
The Naive Bayes model relies on Bayes' theorem, which works on the principle of three kinds of
probabilities: prior probability, likelihood probability, and posterior probability.
Prior Probability
It is the initial probability, P(Hypothesis), that is believed before any new information is collected.
Likelihood Probability
Likelihood probability, P(Evidence | Hypothesis), is the relative probability of the observation
occurring for each class, i.e., the sampling density of the evidence given the hypothesis.
Posterior Probability
It is the updated or revised probability of an event taking into account the observations
from the training data.
P (Hypothesis | Evidence) is the posterior distribution representing the belief about the
hypothesis, given the evidence from the training data.
Naive Bayes Classification is based on Bayes’ theorem, which calculates the posterior
probability using prior probabilities.
Bayes’ theorem determines the probability of a hypothesis (h) given evidence (E):
P(h | E) = P(E | h) × P(h) / P(E)
where P(h) is the prior probability, P(E | h) is the likelihood, and P(E) is the probability of the evidence.
Bayes’ theorem helps calculate posterior probabilities for multiple hypotheses to select
the one with the highest probability, known as Maximum A Posteriori (MAP) Hypothesis.
MAP Hypothesis: the hypothesis with the highest posterior probability among all candidates,
h_MAP = argmax_h P(E | h) P(h).
Maximum Likelihood (ML) Hypothesis: when all hypotheses are equally probable a priori, only
the likelihood P(E | h) is considered, and the hypothesis that maximizes it is selected,
h_ML = argmax_h P(E | h).
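As a quick numerical sketch of selecting the MAP hypothesis (the priors and likelihoods below are invented purely for illustration):

# Hypothetical priors and likelihoods for two candidate hypotheses
prior = {"h1": 0.6, "h2": 0.4}
likelihood = {"h1": 0.3, "h2": 0.8}                        # P(E | h)
evidence = sum(prior[h] * likelihood[h] for h in prior)    # P(E) = 0.5
posterior = {h: prior[h] * likelihood[h] / evidence for h in prior}
print(posterior)                          # {'h1': 0.36, 'h2': 0.64}
print(max(posterior, key=posterior.get))  # MAP hypothesis: h2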
Approach (Gaussian Naive Bayes for continuous features):
Calculate the mean (μ) and variance (σ²) of each continuous feature for every class
using the training data.
For a given test instance, compute the likelihood P(x∣h) for each feature using the
Gaussian formula.
Multiply the likelihoods of all features with the prior probabilities of the class to
calculate the posterior probability.
Select the class with the highest posterior probability as the predicted class.
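These four steps can be sketched in NumPy as below (class and method names are illustrative; log-probabilities are used instead of raw products to avoid numerical underflow):

import numpy as np

class GaussianNB:
    def fit(self, X, y):
        # Step 1: per-class mean, variance, and prior from the training data
        self.stats = {}
        for c in np.unique(y):
            Xc = X[y == c]
            self.stats[c] = (Xc.mean(axis=0),
                             Xc.var(axis=0) + 1e-9,   # avoid division by zero
                             len(Xc) / len(X))
        return self

    def predict_one(self, x):
        scores = {}
        for c, (mean, var, prior) in self.stats.items():
            # Steps 2-3: Gaussian log-likelihood of each feature plus log prior
            log_lik = -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)
            scores[c] = log_lik + np.log(prior)
        # Step 4: class with the highest posterior score
        return max(scores, key=scores.get)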
Advantages:
Simple and fast to train, even on large datasets.
Performs well with high-dimensional data and requires relatively little training data.
Limitations:
Assumes all features are independent given the class, which rarely holds in practice.
A feature value never seen during training yields a zero likelihood unless a smoothing
correction is applied.
Chapter – 02: Artificial Neural Networks
Introduction
The human nervous system has billions of neurons that act as processing units,
enabling perception, hearing, vision, smell, and overall cognition.
It helps humans understand themselves, their actions, their location, and their
surroundings, allowing them to remember, recognize, and correlate information.
Central Nervous System (CNS): Includes the brain and spinal cord.
Peripheral Nervous System (PNS): Comprises neurons located inside and outside
the CNS.
Types of Neurons:
Sensory Neurons:
o Gather information from different parts of the body and transmit it to the CNS.
Motor Neurons:
o Receive information from other neurons and send commands to the body parts.
Interneurons:
o Found in the CNS, they connect one neuron to another, transmitting information
between them.
Functionality of a Neuron:
Biological Neurons
A typical biological neuron has four parts: dendrites, soma, axon, and synapse.
The body of the neuron is called the soma.
Dendrites accept the input information, which is processed in the cell body (soma). A
single neuron is connected through its axon to around 10,000 other neurons, and through
these axons the processed information is passed from one neuron to another.
A neuron fires if the input information crosses a threshold value, transmitting signals
to another neuron through a synapse.
A synapse fires with electrical impulses called spikes, which are transmitted to the
next neuron.
A single neuron can receive synaptic inputs from one neuron or multiple neurons.
These neurons form a network structure which processes input information and
gives out a response.
The simple structure of a biological neuron is shown in Figure 10.1.
Artificial Neurons
Artificial neurons are modeled on biological neurons and are called nodes.
A node or a neuron can receive one or more input information and process it.
Artificial neurons or nodes are connected by connection links to one another.
Each connection link is associated with a synaptic weight.
The structure of a single neuron is shown in Figure 10.2.
McCulloch & Pitts Neuron model can represent only a few Boolean functions. A Boolean
function has binary inputs and provides a binary output.
For example, an AND Boolean function neuron fires only when all the inputs are 1,
whereas an OR Boolean function neuron fires even when a single input is 1.
Moreover, the weight and threshold values are fixed in this mathematical model.
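A minimal sketch of a McCulloch & Pitts neuron with fixed unit weights and a fixed threshold (the function name is illustrative):

def mp_neuron(inputs, threshold):
    # Fires (outputs 1) when the sum of binary inputs reaches the threshold
    return 1 if sum(inputs) >= threshold else 0

# AND: threshold equals the number of inputs, so all must be 1
print(mp_neuron([1, 1], threshold=2))   # 1
# OR: threshold of 1, so a single active input is enough
print(mp_neuron([0, 1], threshold=1))   # 1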
Structure of ANN:
o The ANN is a network represented as a directed graph with neurons (nodes) and
connection weights (edges).
o The neurons are arranged in layers:
1. Input Layer: Receives the input data.
2. Hidden Layer: Processes the information from the input layer.
3. Output Layer: Provides the final output.
o Each neuron in the hidden layer performs computations based on weighted inputs
from the previous layer and fires if the weighted sum exceeds the threshold.
o Neurons use an activation function to map the weighted sum to a non-linear output.
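A sketch of the computation performed by one hidden-layer neuron (the weights, bias, and choice of sigmoid are illustrative):

import math

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs coming from the previous layer
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Activation function maps the weighted sum to a non-linear output in (0, 1)
    return 1 / (1 + math.exp(-z))

print(neuron([0.5, 0.2], [0.4, 0.7], bias=-0.1))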
Activation Functions
Role:
o Activation functions determine whether a neuron should fire or not based on the
input signals. They map the weighted input sum to an output value, typically
normalizing the value between 0 and 1 or -1 and +1.
Linear activation functions are used when outputs can be classified into two groups,
suitable for binary classification.
Non-linear activation functions (such as sigmoid, tanh, etc.) are used for more complex
data, like audio, video, or images, and allow the network to learn non-linear
relationships.
ReLU (Rectified Linear Unit): Outputs the input if positive, otherwise outputs 0.
Leaky ReLU: Similar to ReLU, but allows a small negative slope for negative inputs.
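The activation functions mentioned above can be sketched in NumPy as follows:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))             # squashes values into (0, 1)

def tanh(z):
    return np.tanh(z)                        # squashes values into (-1, +1)

def relu(z):
    return np.maximum(0, z)                  # input if positive, otherwise 0

def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)     # small slope alpha for negative inputs

z = np.array([-2.0, 0.0, 3.0])
print(relu(z))        # [0. 0. 3.]
print(leaky_relu(z))  # [-0.02  0.    3.  ]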
The perceptron model works for linearly separable Boolean functions but fails to solve
the XOR problem. The XOR function produces an output of 1 only when its two inputs
differ (0, 1 or 1, 0) and an output of 0 when they are the same (0, 0 or 1, 1).
Since XOR is not linearly separable, a single-layer perceptron cannot solve it. This
limitation led to the development of the Multi-Layer Perceptron (MLP), which uses
multiple layers to handle non-linearly separable problems.
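To see how an extra layer resolves XOR, here is a hand-wired two-layer sketch (the weights and thresholds are picked by hand, not learned):

def step(z, threshold):
    return 1 if z >= threshold else 0

def xor(x1, x2):
    # Hidden layer: an OR neuron and an AND neuron over the same inputs
    h_or = step(x1 + x2, threshold=1)
    h_and = step(x1 + x2, threshold=2)
    # Output neuron fires when OR is active but AND is not: exactly XOR
    return step(h_or - h_and, threshold=1)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor(a, b))   # 0 0 -> 0, 0 1 -> 1, 1 0 -> 1, 1 1 -> 0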
The Delta Rule, also known as the Widrow-Hoff rule or Adaline rule, is used to update the
weights. It minimizes the error (the difference between the desired output d and the
actual output y) by adjusting each weight in proportion to that error:
Δw_i = η (d − y) x_i, where η is the learning rate and x_i the corresponding input.
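A one-line sketch of this update (the function name is illustrative; eta, d, and y follow the notation above):

def delta_rule_update(weights, x, d, y, eta=0.1):
    # Each weight moves in proportion to the error (d - y) and its input x_i
    return [w + eta * (d - y) * xi for w, xi in zip(weights, x)]

w = delta_rule_update([0.2, -0.5], x=[1.0, 0.3], d=1, y=0.4, eta=0.1)
print(w)   # [0.26, -0.482]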
Fully connected neural networks are those in which every neuron in a layer is
connected to every neuron in the next layer.
Feedback neural networks have feedback connections between neurons that allow
information flow in both directions in the network.
The output signals can be sent back to the neurons in the same layer or to the neurons
in the preceding layers.
Hence, this network is more dynamic during training. The model of a feedback neural
network is shown in Figure 10.10.
Structure
Training Process
Overfitting Solution
Advantages of ANN
Limitations of ANN
Challenges of ANN
Generalization issues – Models trained on simulated data may not work well in real
applications.