ML_Lab_01999676272

Uploaded by

c201012

3. Naive Bayes Algorithm for Machine Learning

# Importing the libraries


import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd

# Importing the dataset


dataset = pd.read_csv('user_data.csv')
x = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, 4].values

# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.25,
random_state = 0)

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
x_train = sc.fit_transform(x_train)
x_test = sc.transform(x_test)

# Fitting Naive Bayes to the Training set


from sklearn.naive_bayes import GaussianNB
classifier = GaussianNB()
classifier.fit(x_train, y_train)

# Predicting the Test set results


y_pred = classifier.predict(x_test)

# Making the Confusion Matrix


from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)

# Visualising the Training set results


from matplotlib.colors import ListedColormap
x_set, y_set = x_train, y_train
# Build a dense grid covering the (scaled) feature range
X1, X2 = nm.meshgrid(
    nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:, 0].max() + 1, step = 0.01),
    nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step = 0.01))
# Colour each grid point by its predicted class to show the decision regions
mtp.contourf(X1, X2,
             classifier.predict(nm.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('purple', 'green')))
mtp.xlim(X1.min(), X1.max())
mtp.ylim(X2.min(), X2.max())
# Overlay the actual samples, one scatter call per class
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                c = ListedColormap(('purple', 'green'))(i), label = j)
mtp.title('Naive Bayes (Training set)')
mtp.xlabel('Age')
mtp.ylabel('Estimated Salary')
mtp.legend()
mtp.show()

# Visualising the Test set results


from matplotlib.colors import ListedColormap
x_set, y_set = x_test, y_test
# Build a dense grid covering the (scaled) feature range
X1, X2 = nm.meshgrid(
    nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:, 0].max() + 1, step = 0.01),
    nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step = 0.01))
# Colour each grid point by its predicted class to show the decision regions
mtp.contourf(X1, X2,
             classifier.predict(nm.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('purple', 'green')))
mtp.xlim(X1.min(), X1.max())
mtp.ylim(X2.min(), X2.max())
# Overlay the actual samples, one scatter call per class
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                c = ListedColormap(('purple', 'green'))(i), label = j)
mtp.title('Naive Bayes (Test set)')
mtp.xlabel('Age')
mtp.ylabel('Estimated Salary')
mtp.legend()
mtp.show()

Explanation:
1. Importing Libraries:
(a) The code begins by importing necessary libraries:
- `import numpy as nm` for numerical operations.
- `import matplotlib.pyplot as mtp` for creating plots.
- `import pandas as pd` for data manipulation.

2. Importing the Dataset:


(a) The dataset is read from a CSV file and loaded into a pandas DataFrame named
`dataset`.
(b) Extracting independent and dependent variables:
- `x` contains the values of the 3rd and 4th columns (indices 2 and 3) from the
dataset.
- `y` contains the values of the 5th column (index 4) from the dataset.
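
To make the positional indexing concrete, here is a minimal sketch on a small stand-in DataFrame. The column names and values are assumptions for illustration only (the plot labels suggest age and salary features, but the actual layout of `user_data.csv` is not shown here).

```python
import pandas as pd

# Hypothetical stand-in for user_data.csv; names and values are invented.
demo = pd.DataFrame({
    "UserID": [1, 2, 3],
    "Gender": ["M", "F", "F"],
    "Age": [19, 35, 26],
    "EstimatedSalary": [19000, 20000, 43000],
    "Purchased": [0, 0, 1],
})

x_demo = demo.iloc[:, [2, 3]].values  # 3rd and 4th columns -> 2-D feature array
y_demo = demo.iloc[:, 4].values       # 5th column -> 1-D label array

print(x_demo.shape, y_demo.shape)
```

Passing a list (`[2, 3]`) keeps `x_demo` two-dimensional, which is the shape scikit-learn estimators expect for features.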
3. Splitting the Dataset:
(a) The dataset is split into training and test sets:
- 25% of the data is used for testing (`test_size = 0.25`), and 75% for training.
- `random_state = 0` ensures reproducibility.
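
The effect of `test_size` and `random_state` can be checked with a short sketch on synthetic data. The 400-row size is an assumption chosen for illustration, not a property of the lab's dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 400 rows, 2 features (sizes are an assumption).
x_demo = np.arange(800).reshape(400, 2)
y_demo = np.arange(400) % 2

a_train, a_test, _, _ = train_test_split(x_demo, y_demo, test_size=0.25, random_state=0)
b_train, b_test, _, _ = train_test_split(x_demo, y_demo, test_size=0.25, random_state=0)

print(a_train.shape, a_test.shape)     # 300 training rows, 100 test rows
print(np.array_equal(a_test, b_test))  # same seed -> identical split
```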

4. Feature Scaling:
(a) Feature scaling is applied to the dataset:
- `StandardScaler` standardizes features by removing the mean and scaling to unit
variance.
- The scaler `sc` is fitted on the training data only (`fit_transform`) and then reused to
transform the test data (`transform`), so no test-set statistics influence training.
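
A minimal sketch of this fit-on-train, transform-on-test pattern, using made-up numbers rather than the lab's data:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up values on very different scales (e.g. years vs. currency).
train = np.array([[20.0, 20000.0], [30.0, 40000.0], [40.0, 60000.0]])
test = np.array([[50.0, 80000.0]])

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train)  # learns mean/std from training rows only
test_scaled = scaler.transform(test)        # reuses those training statistics

print(scaler.mean_)                                         # per-column training means
print(train_scaled.mean(axis=0), train_scaled.std(axis=0))  # ~0 and 1 per column
print(test_scaled)                                          # expressed in training std units
```

After scaling, both columns are on a comparable scale, so neither feature dominates simply because of its units.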

5. Fitting Naive Bayes:


(a) The Naive Bayes classifier is fitted to the training set:
- `GaussianNB` (Gaussian Naive Bayes) is instantiated as `classifier`.
- The classifier is fitted to the training data (`x_train` and `y_train`).
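
As a rough illustration of what fitting `GaussianNB` estimates, the sketch below trains on a tiny invented dataset and inspects the learned class priors and per-class feature means. The toy points are assumptions, not the lab's data.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Toy, well-separated data (invented for illustration).
x_toy = np.array([[-2.0, -1.5], [-1.5, -2.0], [-1.0, -1.0],
                  [1.0, 1.5], [1.5, 1.0], [2.0, 2.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

model = GaussianNB().fit(x_toy, y_toy)

print(model.class_prior_)  # prior P(class), estimated from label frequencies
print(model.theta_)        # per-class mean of each feature
print(model.predict([[-1.2, -1.4], [1.3, 1.6]]))
print(model.predict_proba([[0.0, 0.0]]))  # posterior for a point between the clusters
```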

6. Predicting the Test Set Results:


(a) Predictions are made on the test set:
- The trained classifier is used to predict the labels for the test data (`x_test`).

7. Making the Confusion Matrix:


(a) The confusion matrix is created to evaluate the classifier:
- `confusion_matrix` compares the true labels (`y_test`) with the predicted labels
(`y_pred`).
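
To show how the matrix is read, here is a small sketch with invented labels. In scikit-learn's layout, rows are true classes and columns are predicted classes.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Invented labels for illustration only.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_hat = np.array([0, 0, 0, 1, 1, 1, 0, 1])

cm_demo = confusion_matrix(y_true, y_hat)
tn, fp, fn, tp = cm_demo.ravel()  # binary case: unpack the four cells

accuracy = (tp + tn) / cm_demo.sum()  # correct predictions lie on the diagonal
print(cm_demo)
print(accuracy)
```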

8. Visualizing the Training Set Results:


(a) The training set results are visualized:
- The decision boundary and the training points are plotted.
- `contourf` creates a filled contour plot for the decision regions.
- `scatter` plots the data points with different colors for each class.
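
The grid construction behind the `contourf` call can be sketched on a deliberately coarse grid (the lab code uses a step of 0.01; the 3-by-2 grid here is an assumption to keep the output readable).

```python
import numpy as np

g1, g2 = np.meshgrid(np.arange(0, 3), np.arange(0, 2))  # coarse illustrative grid
grid_points = np.array([g1.ravel(), g2.ravel()]).T      # one (x1, x2) row per grid cell

print(g1.shape)      # rows follow the second axis, columns the first
print(grid_points)
# Predicting on grid_points and reshaping the result to g1.shape yields
# one class label per grid cell, which is what contourf colours.
```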

9. Visualizing the Test Set Results:


(a) The test set results are visualized:
- The decision boundary and the test points are plotted to see how well the classifier
generalizes to new data.
4. Logistic Regression For Machine Learning

# importing libraries
import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd

#importing datasets
data_set= pd.read_csv('user_data.csv')

#Extracting Independent and dependent Variable


x= data_set.iloc[:, [2,3]].values
y= data_set.iloc[:, 4].values

# Splitting the dataset into training and test set.


from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test= train_test_split(x, y, test_size=
0.25, random_state=0)

#feature Scaling
from sklearn.preprocessing import StandardScaler
st_x= StandardScaler()
x_train= st_x.fit_transform(x_train)
x_test= st_x.transform(x_test)

#Fitting Logistic Regression to the training set


from sklearn.linear_model import LogisticRegression
classifier= LogisticRegression(random_state=0)
classifier.fit(x_train, y_train)

#Predicting the test set result


y_pred= classifier.predict(x_test)

#Creating the Confusion matrix


from sklearn.metrics import confusion_matrix
cm= confusion_matrix(y_test, y_pred)

#Visualizing the training set result


from matplotlib.colors import ListedColormap
x_set, y_set = x_train, y_train
# Build a dense grid covering the (scaled) feature range
x1, x2 = nm.meshgrid(
    nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:, 0].max() + 1, step = 0.01),
    nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step = 0.01))
# Colour each grid point by its predicted class to show the decision regions
mtp.contourf(x1, x2,
             classifier.predict(nm.array([x1.ravel(), x2.ravel()]).T).reshape(x1.shape),
             alpha = 0.75, cmap = ListedColormap(('purple', 'green')))
mtp.xlim(x1.min(), x1.max())
mtp.ylim(x2.min(), x2.max())
# Overlay the actual samples, one scatter call per class
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                c = ListedColormap(('purple', 'green'))(i), label = j)
mtp.title('Logistic Regression (Training set)')
mtp.xlabel('Age')
mtp.ylabel('Estimated Salary')
mtp.legend()
mtp.show()

#Visualizing the test set result
from matplotlib.colors import ListedColormap
x_set, y_set = x_test, y_test
# Build a dense grid covering the (scaled) feature range
x1, x2 = nm.meshgrid(
    nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:, 0].max() + 1, step = 0.01),
    nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step = 0.01))
# Colour each grid point by its predicted class to show the decision regions
mtp.contourf(x1, x2,
             classifier.predict(nm.array([x1.ravel(), x2.ravel()]).T).reshape(x1.shape),
             alpha = 0.75, cmap = ListedColormap(('purple', 'green')))
mtp.xlim(x1.min(), x1.max())
mtp.ylim(x2.min(), x2.max())
# Overlay the actual samples, one scatter call per class
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                c = ListedColormap(('purple', 'green'))(i), label = j)
mtp.title('Logistic Regression (Test set)')
mtp.xlabel('Age')
mtp.ylabel('Estimated Salary')
mtp.legend()
mtp.show()

Explanation:
1. Importing Libraries:
(a) The code begins by importing necessary libraries:
- `import numpy as nm` for numerical operations.
- `import matplotlib.pyplot as mtp` for creating plots.
- `import pandas as pd` for data manipulation.
2. Importing the Dataset:
(a) The dataset is read from a CSV file and loaded into a pandas DataFrame named
`data_set`.
(b) Extracting independent and dependent variables:
- `x` contains the values of the 3rd and 4th columns (indices 2 and 3) from the
dataset.
- `y` contains the values of the 5th column (index 4) from the dataset.

3. Splitting the Dataset:


(a) The dataset is split into training and test sets:
- 25% of the data is used for testing (`test_size = 0.25`), and 75% for training.
- `random_state = 0` ensures reproducibility.

4. Feature Scaling:
(a) Feature scaling is applied to the dataset:
- `StandardScaler` standardizes features by removing the mean and scaling to unit
variance.
- The scaler `st_x` is fitted on the training data only (`fit_transform`) and then reused to
transform the test data (`transform`), so no test-set statistics influence training.

5. Fitting Logistic Regression:


(a) The Logistic Regression classifier is fitted to the training set:
- `LogisticRegression` is instantiated as `classifier`.
- The classifier is fitted to the training data (`x_train` and `y_train`).
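
As a sketch of what the fitted model computes in the binary case, the snippet below reproduces `predict_proba` by hand: a linear score passed through the sigmoid. The toy data and the query point are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy, well-separated data (invented for illustration).
x_toy = np.array([[-2.0, -1.0], [-1.5, -2.0], [-1.0, -1.5],
                  [1.0, 1.5], [1.5, 1.0], [2.0, 2.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression(random_state=0).fit(x_toy, y_toy)

# Decision score z = w . x + b; the sigmoid of z is P(class 1).
point = np.array([[0.5, 0.8]])
z = point @ model.coef_.T + model.intercept_
p_manual = 1.0 / (1.0 + np.exp(-z))

print(p_manual.ravel(), model.predict_proba(point)[:, 1])  # the two values agree
print(model.predict(point))                                # class with probability > 0.5
```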

6. Predicting the Test Set Results:


(a) Predictions are made on the test set:
- The trained classifier is used to predict the labels for the test data (`x_test`).

7. Creating the Confusion Matrix:


(a) The confusion matrix is created to evaluate the classifier:
- `confusion_matrix` compares the true labels (`y_test`) with the predicted labels
(`y_pred`).

8. Visualizing the Training Set Results:


(a) The training set results are visualized:
- The decision boundary and the training points are plotted.
- `contourf` creates a filled contour plot for the decision regions.
- `scatter` plots the data points with different colors for each class.

9. Visualizing the Test Set Results:


(a) The test set results are visualized:
- The decision boundary and the test points are plotted to see how well the classifier
generalizes to new data.
5. Support Vector Machine Algorithm For Machine Learning

# Importing libraries
import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd

# Importing datasets
data_set= pd.read_csv('user_data.csv')

# Extracting Independent and dependent Variable


x= data_set.iloc[:, [2,3]].values
y= data_set.iloc[:, 4].values

# Splitting the dataset into training and test set.


from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test= train_test_split(x, y, test_size= 0.25,
random_state=0)
# Feature Scaling
from sklearn.preprocessing import StandardScaler
st_x= StandardScaler()
x_train= st_x.fit_transform(x_train)
x_test= st_x.transform(x_test)

# Fitting the Support Vector Classifier to the training set


from sklearn.svm import SVC
classifier = SVC(kernel='linear', random_state=0)
classifier.fit(x_train, y_train)

# Predicting the test set result


y_pred= classifier.predict(x_test)

# Creating the Confusion matrix


from sklearn.metrics import confusion_matrix
cm= confusion_matrix(y_test, y_pred)

# Visualizing the training set result:


from matplotlib.colors import ListedColormap
x_set, y_set = x_train, y_train
# Build a dense grid covering the (scaled) feature range
x1, x2 = nm.meshgrid(
    nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:, 0].max() + 1, step = 0.01),
    nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step = 0.01))
# Colour each grid point by its predicted class to show the decision regions
mtp.contourf(x1, x2,
             classifier.predict(nm.array([x1.ravel(), x2.ravel()]).T).reshape(x1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))
mtp.xlim(x1.min(), x1.max())
mtp.ylim(x2.min(), x2.max())
# Overlay the actual samples, one scatter call per class
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                c = ListedColormap(('red', 'green'))(i), label = j)
mtp.title('SVM classifier (Training set)')
mtp.xlabel('Age')
mtp.ylabel('Estimated Salary')
mtp.legend()
mtp.show()

# Visualizing the test set result:


from matplotlib.colors import ListedColormap
x_set, y_set = x_test, y_test
# Build a dense grid covering the (scaled) feature range
x1, x2 = nm.meshgrid(
    nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:, 0].max() + 1, step = 0.01),
    nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step = 0.01))
# Colour each grid point by its predicted class to show the decision regions
mtp.contourf(x1, x2,
             classifier.predict(nm.array([x1.ravel(), x2.ravel()]).T).reshape(x1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))
mtp.xlim(x1.min(), x1.max())
mtp.ylim(x2.min(), x2.max())
# Overlay the actual samples, one scatter call per class
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                c = ListedColormap(('red', 'green'))(i), label = j)
mtp.title('SVM classifier (Test set)')
mtp.xlabel('Age')
mtp.ylabel('Estimated Salary')
mtp.legend()
mtp.show()

Explanation:
1. Importing Libraries:
(a) The code begins by importing necessary libraries:
- `import numpy as nm` for numerical operations.
- `import matplotlib.pyplot as mtp` for creating plots.
- `import pandas as pd` for data manipulation.

2. Importing the Dataset:


(a) The dataset is read from a CSV file and loaded into a pandas DataFrame named
`data_set`.

3. Extracting Independent and Dependent Variables:


(a) `x` contains the values of the 3rd and 4th columns (indices 2 and 3) from the
dataset.
(b) `y` contains the values of the 5th column (index 4) from the dataset.

4. Splitting the Dataset:


(a) The dataset is split into training and test sets:
- 25% of the data is used for testing (`test_size = 0.25`), and 75% for training.
- `random_state = 0` ensures reproducibility.

5. Feature Scaling:
(a) Feature scaling is applied to the dataset:
- `StandardScaler` standardizes features by removing the mean and scaling to unit
variance.
- The scaler `st_x` is fitted on the training data only (`fit_transform`) and then reused to
transform the test data (`transform`), so no test-set statistics influence training.

6. Fitting Support Vector Classifier:


(a) The Support Vector Classifier (SVC) is fitted to the training set:
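- `SVC` is instantiated with a linear kernel (`kernel='linear'`), so the decision boundary is
a straight line in the two scaled features; `random_state=0` is also passed.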
- The classifier is fitted to the training data (`x_train` and `y_train`).
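
A minimal sketch of what a fitted linear `SVC` exposes, on invented toy data: the support vectors, the weights of the separating line, and the decision scores whose sign gives the predicted class.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, well-separated data (invented for illustration).
x_toy = np.array([[-2.0, -1.0], [-1.5, -2.0], [-1.0, -1.5],
                  [1.0, 1.5], [1.5, 1.0], [2.0, 2.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

model = SVC(kernel='linear', random_state=0).fit(x_toy, y_toy)

print(model.support_vectors_)         # training points that define the margin
print(model.coef_, model.intercept_)  # w and b of the separating line w . x + b = 0

# In the binary case, a positive decision score means the second class (label 1 here).
scores = model.decision_function(x_toy)
print(np.array_equal(model.predict(x_toy), (scores > 0).astype(int)))
```

Only the support vectors influence the boundary; removing any other training point would leave the fitted line unchanged.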

7. Predicting the Test Set Results:


(a) Predictions are made on the test set:
- The trained classifier is used to predict the labels for the test data (`x_test`).

8. Creating the Confusion Matrix:


(a) The confusion matrix is created to evaluate the classifier:
- `confusion_matrix` compares the true labels (`y_test`) with the predicted labels
(`y_pred`).

9. Visualizing the Training Set Results:


(a) The training set results are visualized:
- The decision boundary and the training points are plotted.
- `contourf` creates a filled contour plot for the decision regions.
- `scatter` plots the data points with different colors for each class.
- The plot is labeled and displayed with appropriate titles and legends.

10. Visualizing the Test Set Results:


(a) The test set results are visualized:
- The decision boundary and the test points are plotted to see how well the classifier
generalizes to new data.
- Similar to the training set, `contourf` creates a filled contour plot, and `scatter`
plots the data points with different colors for each class.
- The plot is labeled and displayed with appropriate titles and legends.
6. Decision Tree Algorithm For Machine Learning

# Importing libraries
import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd

# Importing datasets
data_set= pd.read_csv('user_data.csv')

# Extracting Independent and dependent Variable


x= data_set.iloc[:, [2,3]].values
y= data_set.iloc[:, 4].values

# Splitting the dataset into training and test set.


from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test= train_test_split(x, y, test_size= 0.25,
random_state=0)

# Feature Scaling
from sklearn.preprocessing import StandardScaler
st_x= StandardScaler()
x_train= st_x.fit_transform(x_train)
x_test= st_x.transform(x_test)

# Fitting Decision Tree classifier to the training set


from sklearn.tree import DecisionTreeClassifier
classifier= DecisionTreeClassifier(criterion='entropy', random_state=0)
classifier.fit(x_train, y_train)

# Predicting the test set result


y_pred= classifier.predict(x_test)

# Creating the Confusion matrix


from sklearn.metrics import confusion_matrix
cm= confusion_matrix(y_test, y_pred)

# Visualizing the training set result


from matplotlib.colors import ListedColormap
x_set, y_set = x_train, y_train
# Build a dense grid covering the (scaled) feature range
x1, x2 = nm.meshgrid(
    nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:, 0].max() + 1, step = 0.01),
    nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step = 0.01))
# Colour each grid point by its predicted class to show the decision regions
mtp.contourf(x1, x2,
             classifier.predict(nm.array([x1.ravel(), x2.ravel()]).T).reshape(x1.shape),
             alpha = 0.75, cmap = ListedColormap(('purple', 'green')))
mtp.xlim(x1.min(), x1.max())
mtp.ylim(x2.min(), x2.max())
# Overlay the actual samples, one scatter call per class
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                c = ListedColormap(('purple', 'green'))(i), label = j)
mtp.title('Decision Tree Algorithm (Training set)')
mtp.xlabel('Age')
mtp.ylabel('Estimated Salary')
mtp.legend()
mtp.show()

# Visualizing the test set result


from matplotlib.colors import ListedColormap
x_set, y_set = x_test, y_test
# Build a dense grid covering the (scaled) feature range
x1, x2 = nm.meshgrid(
    nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:, 0].max() + 1, step = 0.01),
    nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step = 0.01))
# Colour each grid point by its predicted class to show the decision regions
mtp.contourf(x1, x2,
             classifier.predict(nm.array([x1.ravel(), x2.ravel()]).T).reshape(x1.shape),
             alpha = 0.75, cmap = ListedColormap(('purple', 'green')))
mtp.xlim(x1.min(), x1.max())
mtp.ylim(x2.min(), x2.max())
# Overlay the actual samples, one scatter call per class
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                c = ListedColormap(('purple', 'green'))(i), label = j)
mtp.title('Decision Tree Algorithm (Test set)')
mtp.xlabel('Age')
mtp.ylabel('Estimated Salary')
mtp.legend()
mtp.show()

Explanation:
1. Importing Libraries:
(a) The code begins by importing necessary libraries:
- `import numpy as nm` for numerical operations.
- `import matplotlib.pyplot as mtp` for creating plots.
- `import pandas as pd` for data manipulation.

2. Importing the Dataset:


(a) The dataset is read from a CSV file and loaded into a pandas DataFrame named
`data_set`.
3. Extracting Independent and Dependent Variables:
(a) `x` contains the values of the 3rd and 4th columns (indices 2 and 3) from the dataset.
(b) `y` contains the values of the 5th column (index 4) from the dataset.

4. Splitting the Dataset:


(a) The dataset is split into training and test sets:
- 25% of the data is used for testing (`test_size = 0.25`), and 75% for training.
- `random_state = 0` ensures reproducibility.

5. Feature Scaling:
(a) Feature scaling is applied to the dataset:
- `StandardScaler` standardizes features by removing the mean and scaling to unit
variance.
- The scaler `st_x` is fitted on the training data only (`fit_transform`) and then reused to
transform the test data (`transform`), so no test-set statistics influence training.

6. Fitting Decision Tree Classifier:


(a) The Decision Tree classifier is fitted to the training set:
- `DecisionTreeClassifier` is instantiated with the criterion set to 'entropy' and
`random_state=0`.
- The classifier is fitted to the training data (`x_train` and `y_train`).
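
To illustrate what the 'entropy' criterion measures, the sketch below computes Shannon entropy for a mixed and a pure set of labels, then fits a tree on a one-feature toy dataset. The helper `entropy` and the toy data are hypothetical, written only for this illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

def entropy(labels):
    """Shannon entropy (in bits) of a label array (illustrative helper)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float((p * np.log2(1.0 / p)).sum())

# One-feature toy data (invented): low values are class 0, high values class 1.
x_toy = np.array([[1.0], [2.0], [3.0], [4.0]])
y_toy = np.array([0, 0, 1, 1])

print(entropy(y_toy))      # evenly mixed node: maximum impurity for two classes
print(entropy(y_toy[:2]))  # pure node: zero impurity

tree = DecisionTreeClassifier(criterion='entropy', random_state=0).fit(x_toy, y_toy)
print(export_text(tree, feature_names=['feature_0']))  # text view of the learned split
```

The tree chooses the split that reduces entropy the most; here a single threshold between the two groups separates the classes completely.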

7. Predicting the Test Set Results:


(a) Predictions are made on the test set:
- The trained classifier is used to predict the labels for the test data (`x_test`).

8. Creating the Confusion Matrix:


(a) The confusion matrix is created to evaluate the classifier:
- `confusion_matrix` compares the true labels (`y_test`) with the predicted labels
(`y_pred`).

9. Visualizing the Training Set Results:


(a) The training set results are visualized:
- The decision boundary and the training points are plotted.
- `contourf` creates a filled contour plot for the decision regions.
- `scatter` plots the data points with different colors for each class.
- The plot is labeled and displayed with appropriate titles and legends.

10. Visualizing the Test Set Results:


(a) The test set results are visualized:
- The decision boundary and the test points are plotted to see how well the classifier
generalizes to new data.
- Similar to the training set, `contourf` creates a filled contour plot, and `scatter` plots
the data points with different colors for each class.
- The plot is labeled and displayed with appropriate titles and legends.
