Unit 3 Machine Learning
Machine Learning
Machine learning (ML) is an application of AI that enables systems to learn and improve
from experience without being explicitly programmed. It focuses on developing computer
programs that can access data and use it to learn for themselves.
The machine learning process begins with observations or data, such as examples,
direct experience, or instruction. The algorithm looks for patterns in the data so that it
can later make inferences based on the examples provided. The primary aim of ML is to
allow computers to learn autonomously, without human intervention or assistance, and
to adjust their actions accordingly. The learning system of a machine learning algorithm
can be broken down into three main parts.
Decision Process: In general, machine learning algorithms are used to
make a prediction or classification. Based on some input data, which can be
labelled or unlabelled, the ML algorithm produces an estimate about a pattern in
the data.
Error Function: An error function serves to evaluate the prediction of the
model. If there are known examples, the error function can compare the prediction
with them to assess the accuracy of the model.
Model Optimization Process: If the model can fit the data points in the training
set better, its parameters are adjusted to reduce the discrepancy between the known
examples and the model estimates. The algorithm repeats this evaluate-and-optimize
process until a threshold of accuracy has been met.
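The three parts above can be sketched in a few lines of Python. This is only a minimal illustration, not a definitive implementation: the training examples, the one-parameter model y = w * x, the learning rate, and the stopping threshold are all assumptions chosen for the example, and plain gradient descent is used as the optimizer.

```python
# Hypothetical known examples generated from y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

def predict(w, x):
    """Decision process: produce an estimate from the input."""
    return w * x

def mean_squared_error(w):
    """Error function: compare the estimates with the known examples."""
    return sum((predict(w, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

w = 0.0               # initial guess for the parameter
learning_rate = 0.01  # assumed step size
threshold = 1e-6      # assumed accuracy threshold

# Model optimization: repeat "evaluate and adjust" until the threshold is met.
for step in range(10_000):
    error = mean_squared_error(w)
    if error < threshold:
        break
    # Gradient of the mean squared error with respect to w.
    gradient = sum(2 * (predict(w, x) - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= learning_rate * gradient

print(f"learned w = {w:.4f}, error = {error:.2e}")
```

The loop stops as soon as the error function reports a value below the threshold, which mirrors the evaluate-and-optimize cycle described above.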
Unsupervised Learning
In unsupervised learning, the ML algorithm is provided with a dataset without the desired
output. The algorithm then attempts to find patterns in the data by extracting
useful features and analyzing the structure of the data. Unsupervised learning algorithms
are widely used for tasks such as clustering, dimensionality reduction, and association
mining. The K-Means, K-Medoids, and agglomerative algorithms are examples of
clustering algorithms.
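As an illustration of clustering, the following is a minimal from-scratch sketch of K-Means on one-dimensional data. The data points, the function name k_means_1d, and the choice of the first k points as initial centroids are assumptions made for this example; library implementations use more careful initialization.

```python
def k_means_1d(points, k, iterations=100):
    """Cluster 1-D points into k groups; returns (centroids, labels)."""
    centroids = list(points[:k])  # naive initialization (assumption)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # no change, so the result is stable
            break
        centroids = new_centroids
    labels = [min(range(k), key=lambda i: abs(p - centroids[i])) for p in points]
    return centroids, labels

# Hypothetical unlabelled data with two visible groups.
points = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]
centroids, labels = k_means_1d(points, k=2)
print("Centroids:", centroids)
print("Cluster labels:", labels)
```

No desired output is supplied anywhere: the grouping emerges only from the distances between the points.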
Reinforcement Learning
In reinforcement learning, we do not provide the machine with examples of correct
input-output pairs, but we do provide a method for the machine to quantify its
performance in the form of a reward signal. Reinforcement learning methods resemble
how humans and animals learn: the machine tries a number of different things and is
rewarded when it does something well.
For example, we can build a classification model to categorize bank loan applications as
either safe or risky. We can also construct a classification model to identify digits. On the
other hand, we can build a regression model to predict the expenditure of potential
customers on computer equipment given their income and occupation. We can also build
a prediction model to predict a stock price given historical trading data.
Working of Classification Algorithms
The classification process works in two steps: a learning step and a testing step.
Learning Step: This step is also called the training step. Here the learning
algorithm builds a model from the relationship between inputs and outputs in
the training dataset. This dataset contains the input attributes along with the class
label for every input tuple. Because the class label of each training tuple is provided,
this step is also known as supervised learning.
Testing Step: In this step, the model is used for prediction, and the test dataset is
used to estimate the accuracy of the model. The test dataset contains values of the input
attributes along with the class label of each tuple. However, the model takes only
the values of the input attributes and predicts the class label of each input tuple.
The accuracy of the model is then computed by comparing the predicted class labels with
the actual class labels of the test dataset. The model can be applied to new data tuples
if the accuracy is considered acceptable.
Prepared By: Arjun Singh Saud
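The two steps can be sketched with a deliberately simple classifier. Everything in this example is an assumption made for illustration: the one-attribute dataset, the function names, and the "model", which is just a single threshold learned from the training tuples.

```python
def fit_threshold(xs, ys):
    """Learning step: pick the threshold that classifies the most training tuples correctly."""
    ordered = sorted(xs)
    best_t, best_correct = None, -1
    for a, b in zip(ordered, ordered[1:]):
        t = (a + b) / 2  # candidate threshold between two neighbouring values
        correct = sum((x > t) == bool(y) for x, y in zip(xs, ys))
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t

def predict(threshold, x):
    """Predict class 1 when the attribute exceeds the learned threshold, else class 0."""
    return int(x > threshold)

# Hypothetical training dataset: input attribute and class label for every tuple.
train_x = [20, 25, 30, 60, 65, 70]
train_y = [0, 0, 0, 1, 1, 1]

# Hypothetical test dataset: the labels are used only to measure accuracy.
test_x = [22, 40, 55, 68]
test_y = [0, 0, 1, 1]

threshold = fit_threshold(train_x, train_y)            # learning step
predictions = [predict(threshold, x) for x in test_x]  # testing step
accuracy = sum(p == y for p, y in zip(predictions, test_y)) / len(test_y)
print("Learned threshold:", threshold)
print("Test accuracy:", accuracy)
```

The model never sees the test labels while predicting; they are compared with the predictions afterwards to estimate accuracy.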
Linear Regression
Regression analysis is the process of curve fitting in which the relationship between the
independent variables and the dependent variable is modeled as an m-th degree
polynomial. Polynomial regression models are usually fit with the method of least mean
squares (LMS). If we assume that the relationship is linear and involves only one
independent variable, we can use the linear equation given below.
$y = f(x) = w_0 + w_1 x$
In the above equation, y is the dependent variable, x is the independent variable, and w0
and w1 are coefficients that need to be determined through training of the model. If we
have two independent variables, the linear regression equation can be written as below.
$y = f(x_1, x_2) = w_0 + w_1 x_1 + w_2 x_2$
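For the single-variable case, the coefficients w0 and w1 can be computed directly. The sketch below uses the closed-form least-squares solution rather than an iterative procedure; the function name and the data (generated from y = 2 + 3x) are assumptions made for illustration.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (w0, w1) minimizing the squared error of y = w0 + w1 * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    w1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    w0 = y_mean - w1 * x_mean
    return w0, w1

# Hypothetical training data lying exactly on y = 2 + 3x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [5.0, 8.0, 11.0, 14.0, 17.0]

w0, w1 = fit_simple_linear_regression(xs, ys)
print("w0 =", w0, "w1 =", w1)
print("Prediction for x = 6:", w0 + w1 * 6.0)
```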
Logistic Regression
Logistic regression is one of the most popular machine learning algorithms for problems
in which the output takes one of two class values, 0 or 1. Here 0 is called the
negative class, while 1 is called the positive class. Such a task is known as binary
classification. The heart of the logistic regression technique is the logistic function,
defined in equation (1). The logistic function maps any real input into the range (0, 1):
large negative numbers result in values close to zero, and large positive numbers result
in values close to one.
$f(x) = \frac{1}{1 + e^{-x}} \qquad (1)$
If there are two input variables, logistic regression has three coefficients (w0, w1, and
w2), just as linear regression does.
$y = w_0 + w_1 x_1 + w_2 x_2$
Unlike linear regression, the output is transformed into a probability using the logistic
function.
$\hat{y} = f(y) = \frac{1}{1 + e^{-y}}$
If the probability is greater than 0.5, we take the output as a prediction for class 1;
otherwise the prediction is for class 0. The job of the learning algorithm is to discover
the best values for the coefficients (w0, w1, and w2) based on the training data.
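The prediction rule described above can be sketched as follows. The coefficient values are hypothetical and chosen by hand for illustration; in practice they would be learned from the training data.

```python
import math

def logistic(x):
    """Logistic function: maps any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def predict_class(x1, x2, w0, w1, w2):
    """Return (class label, probability) for two input variables."""
    y = w0 + w1 * x1 + w2 * x2          # linear combination, as in linear regression
    probability = logistic(y)           # transformed into a probability
    label = 1 if probability > 0.5 else 0
    return label, probability

# Hypothetical coefficients (not learned from data).
w0, w1, w2 = -4.0, 1.0, 1.0

label_a, prob_a = predict_class(3.0, 3.0, w0, w1, w2)  # y = 2, probability about 0.88
label_b, prob_b = predict_class(1.0, 1.0, w0, w1, w2)  # y = -2, probability about 0.12
print(label_a, round(prob_a, 2))
print(label_b, round(prob_b, 2))
```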
$P(H \mid X) = \frac{P(X \mid H)\,P(H)}{P(X)}$
$P(C_i \mid X) = \frac{P(X \mid C_i)\,P(C_i)}{P(X)}$
Given a tuple X, the classifier predicts that X belongs to the class having the highest
posterior probability conditioned on X. Thus we need to maximize P(Ci|X). As P(X) is
constant for all classes, only P(X|Ci)P(Ci) needs to be maximized. Let X be the set of
attribute values {x1, x2, x3, ..., xn}, where the attributes are assumed to be independent
of one another given the class. The probability P(X|Ci) is then given by the equation below.
$P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i)$
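To make the computation of P(X|Ci)P(Ci) concrete, here is a minimal counting-based sketch. The six-tuple toy dataset and the function name are hypothetical and are not the training data used elsewhere in these notes; no smoothing is applied, so an unseen attribute value would drive a class score to zero.

```python
from collections import Counter

# Hypothetical toy training set: (attribute values, class label).
training = [
    ({"student": "yes", "credit_rating": "fair"}, "yes"),
    ({"student": "yes", "credit_rating": "excellent"}, "yes"),
    ({"student": "no", "credit_rating": "fair"}, "yes"),
    ({"student": "no", "credit_rating": "fair"}, "no"),
    ({"student": "no", "credit_rating": "excellent"}, "no"),
    ({"student": "yes", "credit_rating": "excellent"}, "no"),
]

def naive_bayes_scores(training, query):
    """Return P(X|Ci) * P(Ci) for every class Ci, estimated by counting."""
    class_counts = Counter(label for _, label in training)
    total = len(training)
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(Ci)
        for attribute, value in query.items():
            matches = sum(
                1 for row, row_label in training
                if row_label == label and row[attribute] == value
            )
            score *= matches / count  # conditional P(x_k | Ci)
        scores[label] = score
    return scores

query = {"student": "yes", "credit_rating": "fair"}
scores = naive_bayes_scores(training, query)
prediction = max(scores, key=scores.get)  # class with the highest score
print(scores)
print("Predicted class:", prediction)
```

Because P(X) is the same for every class, comparing these unnormalized scores is enough to pick the class with the highest posterior probability.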
The example given below creates a Naïve Bayes classifier model using the 14 training
tuples listed in the code and then predicts the class label of the tuple X = (age = youth,
income = medium, student = yes, credit_rating = fair) using the model.
Example
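from sklearn.naive_bayes import GaussianNB
from sklearn import preprocessing
import pandas as pd

# The first 14 entries are training tuples; the last entry is the tuple to classify ("?").
Age = ['Youth', 'Youth', 'Middle_Aged', 'Senior', 'Senior', 'Senior', 'Middle_Aged',
       'Youth', 'Youth', 'Senior', 'Youth', 'Middle_Aged', 'Middle_Aged', 'Senior', 'Youth']
Income = ['High', 'High', 'High', 'Medium', 'Low', 'Low', 'Low', 'Medium', 'Low',
          'Medium', 'Medium', 'Medium', 'High', 'Medium', 'Medium']
Student = ['No', 'No', 'No', 'No', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'No',
           'Yes', 'No', 'Yes']
Credit_Rating = ['Fair', 'Excellent', 'Fair', 'Fair', 'Fair', 'Excellent', 'Excellent',
                 'Fair', 'Fair', 'Fair', 'Excellent', 'Excellent', 'Fair', 'Excellent', 'Fair']
Buys = ['No', 'No', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'Yes',
        'Yes', 'No', '?']

df = pd.DataFrame({'Age': Age, 'Income': Income, 'Student': Student,
                   'Credit_Rating': Credit_Rating, 'Buys': Buys})

# Encode each categorical input attribute as integers.
for col in ['Age', 'Income', 'Student', 'Credit_Rating']:
    df[col] = preprocessing.LabelEncoder().fit_transform(df[col])

# Split into the 14 labelled training tuples and the single unlabelled test tuple.
trainx = df.iloc[:14, :4]
trainy = df.iloc[:14, 4]   # string labels 'Yes'/'No' can be passed to fit() directly
testx = df.iloc[14:, :4]

model = GaussianNB()
model.fit(trainx, trainy)
predicted = model.predict(testx)
print("Predicted Value:", predicted[0])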
5. Else
• Go to step 2
There are many variations of decision-tree algorithms, including ID3 (Iterative
Dichotomiser 3), C4.5 (the successor of ID3), and CART (Classification and Regression Tree).
Decision tree classifiers use different attribute selection measures, such as Information
Gain, Gain Ratio, and Gini Index. ID3 uses a top-down greedy approach to build the
decision tree model. The algorithm computes the information gain for each attribute and
then selects the attribute with the highest information gain.
$p_i = \frac{|C_{i,D}|}{|D|} \qquad (2)$
where |Ci,D| is the number of tuples in D belonging to class Ci and |D| is the number
of tuples in D.
$\mathrm{Info}_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} \times \mathrm{Info}(D_j)$
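The information gain used by ID3 can be sketched as follows, assuming the usual entropy definition, in which the entropy of D is the negative sum of p_i * log2(p_i) over the classes. The four-tuple dataset, the attribute names, and the function names are hypothetical and serve only to illustrate the calculation.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels, using p_i = |C_i,D| / |D|."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )

def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on an attribute."""
    total = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [label for value, label in zip(values, labels) if value == v]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

# Hypothetical tuples: one attribute separates the classes perfectly, the other not at all.
labels = ["yes", "yes", "no", "no"]
attributes = {
    "windy": ["weak", "weak", "strong", "strong"],
    "humidity": ["high", "normal", "high", "normal"],
}

gains = {name: information_gain(values, labels) for name, values in attributes.items()}
best_attribute = max(gains, key=gains.get)  # ID3 splits on the highest gain
print(gains)
print("Attribute selected for the split:", best_attribute)
```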
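import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# The first 14 entries are training tuples; the last entry is the tuple to classify ('?').
outlook = ['Sunny', 'Sunny', 'Overcast', 'Rainy', 'Rainy', 'Rainy', 'Overcast', 'Sunny',
           'Sunny', 'Rainy', 'Sunny', 'Overcast', 'Overcast', 'Rainy', 'Sunny']
temp = ['Hot', 'Hot', 'Hot', 'Mild', 'Cool', 'Cool', 'Cool', 'Mild', 'Cool', 'Mild',
        'Mild', 'Mild', 'Hot', 'Mild', 'Hot']
humidity = ['High', 'High', 'High', 'High', 'Normal', 'Normal', 'Normal', 'High',
            'Normal', 'Normal', 'Normal', 'High', 'Normal', 'High', 'Normal']
wind = ['Weak', 'Strong', 'Weak', 'Weak', 'Weak', 'Strong', 'Strong', 'Weak', 'Weak',
        'Weak', 'Strong', 'Strong', 'Weak', 'Strong', 'Strong']
play = ['No', 'No', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'Yes',
        'Yes', 'No', '?']
d = {'Outlook': outlook, 'Temperature': temp, 'Humidity': humidity, 'Windy': wind,
     'Play_Tennis': play}
df = pd.DataFrame(d)

# Encode each categorical input attribute as integers.
Le = LabelEncoder()
df['Outlook'] = Le.fit_transform(df['Outlook'])
df['Temperature'] = Le.fit_transform(df['Temperature'])
df['Humidity'] = Le.fit_transform(df['Humidity'])
df['Windy'] = Le.fit_transform(df['Windy'])

# Split into the 14 labelled training tuples and the single unlabelled test tuple.
trainx = df.iloc[:14, :4]
trainy = df.iloc[:14, 4]   # string labels 'Yes'/'No' can be passed to fit() directly
testx = df.iloc[14:, :4]

dt = DecisionTreeClassifier(criterion='entropy')
dt.fit(trainx, trainy)
# The labels were never encoded, so predict() already returns 'Yes' or 'No'.
p = dt.predict(testx)
print("Predicted Label:", p)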