TP02

The document provides an introduction to the Scikit-Learn library for machine learning, detailing its functionalities such as classification, regression, clustering, and model evaluation. It outlines a step-by-step guide for building a machine learning model using Scikit-Learn, including data collection, preprocessing, model selection, training, prediction, and evaluation. The example focuses on using a Decision Tree Classifier with a salary dataset.
PEOPLE'S DEMOCRATIC REPUBLIC OF ALGERIA
MINISTRY OF HIGHER EDUCATION AND SCIENTIFIC RESEARCH
ECOLE NATIONALE POLYTECHNIQUE
Module: Artificial Intelligence, Practical Work (TP)
TP02: Mastering Machine Learning with Scikit-Learn: A Comprehensive
Introduction and Practical Guide
What is Scikit-learn library in Machine Learning?
Scikit-Learn, often referred to as sklearn, is a Python library for machine learning. It
provides a user-friendly interface and a rich set of tools for various machine learning tasks,
including classification, regression, and clustering. Scikit-Learn is widely used for its
versatility and ease of use.
Here are some common objectives you can pursue with scikit-learn:
• Classification: If your goal is to classify data into different categories or classes,
you can use scikit-learn to build and evaluate classification models. The objective
here might be to achieve high accuracy, precision, recall, or F1-score, depending
on your specific use case.
• Regression: When you want to predict a continuous numeric value, regression is
the objective. You can use scikit-learn to build regression models, and the goal is
typically to minimize the mean squared error (MSE) or other regression
evaluation metrics.
• Clustering: If you want to group data into clusters based on similarity, you can
use clustering algorithms in scikit-learn. The objective is to find natural groupings
in your data.
• Dimensionality Reduction: For tasks like feature selection or reducing the
dimensionality of your data, scikit-learn provides tools such as PCA (Principal
Component Analysis) or t-SNE. The objective here is often to preserve the most
important information while reducing the number of features or dimensions.
• Anomaly Detection: In cases where you want to identify unusual or anomalous
data points, scikit-learn can help you build models for anomaly detection. The
objective is to detect outliers effectively.
• Feature Selection: When you want to select the most informative features from a
large set of features, scikit-learn provides methods for feature selection. The
objective is to improve model performance and reduce overfitting.
• Model Evaluation: Regardless of the specific task, a common objective is to
evaluate your machine learning models properly. Scikit-learn provides tools for
cross-validation, hyperparameter tuning, and various evaluation metrics.
• Ensemble Learning: If you aim to combine multiple models for better predictive
performance, ensemble learning is a common objective. Scikit-learn provides
tools for building ensembles like Random Forests and Gradient Boosting.
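Scikit-learn is imported under the package name sklearn. Rather than importing the whole
library, you normally import the specific function or class you need from its submodule, for example: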
from sklearn.model_selection import train_test_split
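As a rough illustration of how a few of the objectives listed above map onto scikit-learn submodules, the sketch below runs dimensionality reduction, clustering, and a cross-validated ensemble classifier on synthetic data. The dataset, the number of clusters, and all parameter values are assumptions chosen only for the example.

```python
# Sketch only: the synthetic data and parameter values are illustrative assumptions.
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic dataset: 200 samples, 6 features, 2 classes.
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=4, random_state=0
)

# Dimensionality reduction: project the 6 features onto 2 principal components.
X_2d = PCA(n_components=2).fit_transform(X)

# Clustering: group the samples into 3 clusters without using the labels.
cluster_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Ensemble learning + model evaluation: 5-fold cross-validated accuracy
# of a random forest.
forest = RandomForestClassifier(n_estimators=50, random_state=0)
scores = cross_val_score(forest, X, y, cv=5)

print("Reduced shape:", X_2d.shape)
print("Cluster ids:", sorted(set(cluster_labels)))
print("Mean CV accuracy:", scores.mean())
```

Each task lives in its own submodule (sklearn.decomposition, sklearn.cluster, sklearn.ensemble, sklearn.model_selection), which is why imports in scikit-learn code usually name the submodule explicitly.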
Task 01:
Building a machine learning model typically involves several steps. Here's a simplified
guide with example code in Python using scikit-learn, a popular machine learning library.
This example focuses on a basic supervised learning problem.
I. Data Collection:
Obtain and prepare your dataset. For this example, we'll use a salary dataset stored in a
local CSV file, salary.csv.
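import pandas as pd               # tabular data handling
import numpy as np                # numerical arrays
import matplotlib.pyplot as plt   # plotting

# Load the dataset from a CSV file in the working directory
data = pd.read_csv('salary.csv')
# Preview the first five rows
data.head()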
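The loading step can be tried without the actual file. The sketch below reads a small in-memory CSV that stands in for salary.csv; the column names (experience_years, age, high_salary) and all values are hypothetical, and the real file's layout may differ.

```python
import io

import pandas as pd

# Hypothetical stand-in for salary.csv; column names and values are
# assumptions made for illustration only.
csv_text = """experience_years,age,high_salary
1.5,23,0
3.0,,0
4.5,29,0
6.0,31,1
7.5,36,1
9.0,40,1
2.0,25,0
8.0,38,1
"""

# pd.read_csv accepts a file path or any file-like object.
data = pd.read_csv(io.StringIO(csv_text))

print(data.head())        # first five rows
print(data.shape)         # (number of rows, number of columns)
print(data.isna().sum())  # missing values per column
```

Checking the shape and the count of missing values right after loading shows what the preprocessing step will need to handle.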
II. Data Preprocessing:
• Clean, preprocess, and normalize your data.
• Split the dataset into training and testing sets.

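from sklearn.model_selection import train_test_split

# X holds the feature columns and y the target column of the dataset.
# Hold out 20% of the rows for testing; random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)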
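The split above assumes that X and y already exist. One possible way to obtain them, together with the cleaning and normalization steps, is sketched below on a small hand-made DataFrame. The column names, the median imputation, and the choice of StandardScaler are assumptions for illustration, not details of the actual salary dataset.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical salary data; column names and values are assumptions.
data = pd.DataFrame({
    "experience_years": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    "age": [22, 25, None, 28, 30, 33, 35, 38, 41, 45],
    "high_salary": [0, 0, 0, 0, 1, 0, 1, 1, 1, 1],
})

# Clean: fill the missing age with the column median.
data["age"] = data["age"].fillna(data["age"].median())

# Separate the feature columns (X) from the target column (y).
X = data[["experience_years", "age"]]
y = data["high_salary"]

# Split: 80% of the rows for training, 20% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Normalize: fit the scaler on the training set only, then apply the same
# transformation to both sets so no test information leaks into training.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train.shape, X_test.shape)
```

Decision trees split on thresholds and are not sensitive to feature scaling, so the scaled arrays matter mainly if a scale-sensitive model is tried later.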
III. Select a Model:
Choose a machine learning algorithm. We'll use a simple Decision Tree Classifier for this
example.
from sklearn.tree import DecisionTreeClassifier
model = DecisionTreeClassifier()
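Selecting a model also involves choosing its hyperparameters. The sketch below shows a few constructor arguments of DecisionTreeClassifier; the specific values are assumptions for illustration and would normally be tuned, for example with cross-validation.

```python
from sklearn.tree import DecisionTreeClassifier

# Illustrative hyperparameter values (assumptions, not recommendations):
# - max_depth limits how deep the tree can grow, which curbs overfitting
# - min_samples_leaf requires each leaf to cover at least this many samples
# - random_state makes tie-breaking between equally good splits reproducible
model = DecisionTreeClassifier(
    criterion="gini",
    max_depth=3,
    min_samples_leaf=2,
    random_state=42,
)

# get_params() returns the configuration as a dictionary.
params = model.get_params()
print(params["criterion"], params["max_depth"], params["min_samples_leaf"])
```

With the default arguments the tree grows until every leaf is pure, which often fits the training data perfectly but generalizes less well.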
IV. Training the Model:
Fit your model on the training data.
model.fit(X_train, y_train)
V. Model Prediction:
Use the trained model to predict labels for the test data.
y_pred = model.predict(X_test)
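The training and prediction steps can be run end to end as in the sketch below. It uses synthetic data from make_classification in place of the salary dataset, and the max_depth value is an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class data standing in for the salary dataset (assumption).
X, y = make_classification(n_samples=100, n_features=4, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = DecisionTreeClassifier(max_depth=3, random_state=42)
model.fit(X_train, y_train)

# predict returns one class label per test sample.
y_pred = model.predict(X_test)

# predict_proba returns one row per sample and one column per class;
# each row sums to 1.
y_proba = model.predict_proba(X_test)

print(y_pred[:5])
print(y_proba[:5])
```

The columns of predict_proba follow the order of model.classes_, which is useful when a decision threshold other than the default is needed.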
VI. Model Evaluation:
Evaluate your model's performance on the test data.
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy}")
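Accuracy alone can hide how a classifier behaves on each class. The sketch below adds a confusion matrix and the precision, recall, and F1-score mentioned earlier. It uses synthetic data in place of the salary dataset, and the stratified split and max_depth value are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    confusion_matrix,
)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class data standing in for the salary dataset (assumption).
X, y = make_classification(n_samples=200, n_features=4, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

model = DecisionTreeClassifier(max_depth=3, random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])

print(f"Accuracy: {accuracy}")
print(cm)
# Per-class precision, recall, and F1-score.
print(classification_report(y_test, y_pred))
```

The diagonal of the confusion matrix counts correct predictions, so accuracy equals the sum of the diagonal divided by the total number of test samples.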
