
Setting Up the Python Environment for Machine Learning

Welcome to the exciting world of machine learning. In this presentation, we'll dive into setting up the Python environment, the language of choice for building and deploying machine learning models. So, let's get started and prepare ourselves for a journey into the world of AI and machine learning.

Short Term Program on: Insights into Machine Learning and Deep Learning: Theory to Practice
Why Python for Machine Learning?
Extensive Libraries
Python boasts a wealth of powerful libraries specifically designed for machine learning, such as NumPy, Pandas, Scikit-learn, and TensorFlow. These libraries provide pre-built functions and algorithms for data manipulation, analysis, and model building.

Easy to Learn
Python's syntax is clear, concise, and beginner-friendly. It's widely used for general programming, making it an ideal language to learn for those new to coding.

Strong Community
Python has a vast and active community of developers and researchers, offering a wealth of resources, documentation, and support for machine learning. This collaborative environment is incredibly valuable for learning and problem-solving.
Setting up the Python Environment
1. Install Python / use a cloud-based platform
Ensure you have Python installed on your system. You can download the latest version from the official Python website.

2. Install Libraries
Use pip, Python's package installer, to install the necessary libraries like NumPy, Pandas, Matplotlib, and Scikit-learn.

3. Set up a Project Folder
Create a folder to store your Python code, data files, and any other relevant files for the project.
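As a minimal sketch of these three steps from the command line (assuming Python 3 is already installed and on your PATH; the folder and environment names here are arbitrary):

mkdir ml-project && cd ml-project                    # 3. create a project folder
python -m venv ml-env                                # optional: isolated environment
source ml-env/bin/activate                           # on Windows: ml-env\Scripts\activate
pip install numpy pandas matplotlib scikit-learn     # 2. install the libraries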
Different Ways to Install Python

1. From the Official Website
Visit Python.org and go to the Downloads section. Download the installer suitable for your OS. Run the installer and ensure 'Add Python to PATH' is checked.

2. Using Package Managers
Windows: Use winget or Chocolatey (choco). MacOS: Use Homebrew (brew). Linux: Use apt, dnf, or pacman based on the distribution.

3. Using Python Environment Management Tools
Pyenv: Manage multiple Python versions easily. Anaconda/Miniconda: Ideal for data science workflows. Both tools help isolate and manage Python environments.

4. Using Docker
Pull Python images from Docker Hub. Run Python in a containerized environment. Example: docker pull python:3.x

5. Using Pre-Installed Python
Some OS have Python pre-installed. MacOS: Often includes Python 2.x (deprecated). Linux: Most distributions include Python 3.x.

6. Compiling from Source
Download source code from Python.org. Extract and navigate to the source directory. Run ./configure, make, and sudo make install.

7. Portable Python
Use portable Python distributions like WinPython or Portable Python. No installation required; runs directly from a folder.

8. Installing via IDEs
IDEs like PyCharm automate Python installation. Useful for beginners who need an integrated setup.
Installing Anaconda
1. What is Anaconda?
Anaconda is a free and open-source distribution of the Python and R programming languages for data science, machine learning, and deep learning. It provides a comprehensive environment with all the necessary packages and tools pre-installed, making it easy to get started with machine learning.

2. Download Anaconda
Download the appropriate Anaconda installer for your operating system (Windows, macOS, or Linux) from the official Anaconda website. You'll find both graphical installer and command-line options.

3. Run the Installer
Run the downloaded installer and follow the on-screen instructions. Choose the appropriate installation directory and ensure you select "Add Anaconda to your PATH" for easier access to commands from the command line.
Configuring the Anaconda Environment
Create a Virtual Environment
Virtual environments help isolate project dependencies, preventing conflicts and ensuring that each project uses the specific packages required. In the Anaconda Prompt, use the command: "conda create -n <env_name> python=3.9" (replace <env_name> with a descriptive name).

Activate the Environment
Activate the newly created virtual environment to start working within it. Use the command: "conda activate <env_name>" (replace <env_name> with the name of your environment). You'll see the environment name in parentheses before the prompt.

Install Packages
Within the activated environment, you can install packages using the "conda install" command. For example, to install NumPy, use: "conda install numpy".
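Putting these steps together, a typical session might look like this (ml-env is a hypothetical environment name):

conda create -n ml-env python=3.9
conda activate ml-env
conda install numpy pandas matplotlib scikit-learn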
Importing Relevant Libraries
NumPy
Import the NumPy library for numerical computation.

Pandas
Import the Pandas library for data manipulation and analysis.

Matplotlib
Import the Matplotlib library for data visualization and plotting.

Scikit-learn
Import the Scikit-learn library for machine learning algorithms, including linear and polynomial regression.

TensorFlow
A powerful library for deep learning, offering a flexible framework for building and training neural networks.
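As a sketch, the corresponding import statements, using the conventional aliases, are:

import numpy as np                                   # numerical computation
import pandas as pd                                  # data manipulation and analysis
import matplotlib.pyplot as plt                      # visualization and plotting
from sklearn.linear_model import LinearRegression    # regression models
import tensorflow as tf                              # deep learning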
Verifying the Environment
1. Open Python Interpreter
Launch a Python interpreter within your activated virtual environment. This can be done by typing "python" in the Anaconda Prompt or by starting a Python script in your code editor.

2. Import Libraries
Attempt to import the necessary libraries: "import numpy as np", "import pandas as pd", or "from sklearn.model_selection import train_test_split". If successful, you should not see any errors.

3. Run Sample Code
Try a basic script that uses the installed libraries, such as creating a NumPy array or reading a CSV file with Pandas. If the code runs without errors, your environment is set up correctly.
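A minimal verification script along these lines (the array values are arbitrary) might be:

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

arr = np.array([1, 2, 3, 4])     # create a small NumPy array
df = pd.DataFrame({'x': arr})    # wrap it in a pandas DataFrame
print(arr.mean(), df.shape)      # if this runs without errors, the setup works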
Troubleshooting Common Issues
Package Conflicts
If you encounter errors related to missing packages or conflicting versions, use "conda list" to list all packages in your environment. You can then use "conda update <package>" to update a specific package or "conda install <package>" to install a new one.

Environment Activation
Double-check that you've activated the correct virtual environment. You should see the environment name in parentheses before the prompt. If you're not in the correct environment, activate it using "conda activate <env_name>".

Internet Connectivity
Some installation processes require an internet connection to download packages. Ensure that your internet connection is stable and that any firewall settings are not blocking access.
Harnessing Google Colab for ML

Google Colaboratory is a free and accessible cloud-based platform for machine learning, data science, and Python development. We'll delve into the advantages of Colab, demonstrate how to get started, and guide you through essential tasks, from managing packages to training models.
Introduction to Google Colaboratory
Cloud-Based Jupyter Notebook
Google Colaboratory, often shortened to Colab, is a web-based environment for coding,
running, and sharing machine learning models. It's powered by Jupyter Notebooks, a
popular format for interactive coding and documentation, providing a seamless
experience for data science tasks.

Free and Accessible


Colab is entirely free to use, requiring only a Google account. Its accessibility eliminates
the need for complex software installations and provides a streamlined setup, allowing
you to focus on your projects rather than technical hurdles.

Why Choose Google Colaboratory
1. GPU and TPU Access
Colab offers free access to powerful GPUs and TPUs (Tensor Processing Units), enabling you to accelerate complex machine learning tasks and train deep learning models more efficiently.

2. Pre-Installed Libraries
Colab comes pre-installed with essential Python libraries for machine learning, such as TensorFlow, PyTorch, scikit-learn, and more. This pre-configured environment saves you time and eliminates dependency management headaches.

3. Collaboration and Sharing
Colab encourages collaboration by allowing you to share your notebooks with others, making it easy to share code, results, and insights. This feature fosters a collaborative environment for learning and research.

4. Seamless Integration
Colab seamlessly integrates with other Google services like Drive and Gmail. This integration streamlines data storage, project management, and communication, making it an excellent choice for workflow efficiency.
Getting Started with Google Colaboratory
Navigate to Colab
Begin by opening your browser and visiting the Google Colaboratory
website. You'll be greeted with a welcoming interface, allowing you to
access existing notebooks or start a new one.

Create a New Notebook


To begin a new project, simply click the "New Notebook" button. This
will create a blank Jupyter Notebook, your canvas for writing and
executing Python code. You can rename the notebook to reflect your
project's theme.

Start Coding
Now you're ready to write your code! Colab presents a familiar interface
with code cells and markdown cells for documentation. Add code cells
to experiment with Python commands and machine learning libraries.

Managing Packages in Google Colaboratory
Installation with 'pip'
Colab utilizes the 'pip' package manager to install Python libraries. To install a new library, use the 'pip install' command within a code cell, prefixed with '!'. For example, '!pip install numpy' will install the NumPy library for numerical computations.

Importing Libraries
Once installed, you can import libraries using the 'import' statement. For instance, 'import numpy as np' imports NumPy and assigns it the alias 'np' for convenience. This allows you to utilize the library's functions in your code.

Managing Dependencies
Colab allows you to create a 'requirements.txt' file to specify all the packages your project requires. This ensures that collaborators or those working on your project have the necessary libraries installed.

Environment Management
Colab offers an environment management system, allowing you to create virtual environments within a notebook to isolate project dependencies and prevent conflicts. This helps ensure a smooth development process.
Python Machine Learning Libraries

Python has become the language of


choice for machine learning, thanks to
its extensive libraries and active
community.

Short Term Program on: Insights into Machine Learning and Deep Learning:Theory to Practice
NumPy: The Foundation of Scientific Computing
Versatile Arrays
NumPy's core data structure is the multidimensional array,
which allows efficient storage and manipulation of numerical
data. Its arrays are optimized for mathematical operations,
making NumPy a fundamental library for linear algebra,
Fourier transforms, and random number generation.

Broadcasting
NumPy introduces broadcasting, a mechanism that enables
operations between arrays of different shapes, simplifying
mathematical operations. It provides a powerful way to
perform vectorized calculations, significantly enhancing
performance compared to element-wise iterations.
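As an illustrative sketch of broadcasting (the array values are arbitrary):

import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])    # shape (2, 3)
b = np.array([10, 20, 30])   # shape (3,)
print(a + b)                 # b is broadcast across each row of a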

Pandas: Data Manipulation and Analysis
Data Structures
Pandas introduces powerful data structures like Series (1D labeled array) and DataFrame (2D labeled
data structure), allowing efficient data manipulation and analysis. Its intuitive API makes it easy to
access, filter, and modify data.

Data Cleaning
Pandas offers extensive tools for data cleaning, including handling missing values, removing duplicates, and
transforming data types. It empowers data scientists to prepare data for analysis and modeling, ensuring
data quality and consistency.
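For instance, a minimal cleaning sketch (df and its 'price' column are hypothetical):

import pandas as pd

df = pd.DataFrame({'price': ['1.5', '2.0', None, '2.0']})
df = df.dropna()                          # handle missing values
df = df.drop_duplicates()                 # remove duplicates
df['price'] = df['price'].astype(float)   # transform the data type
print(df)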

Data Visualization
Pandas integrates well with Matplotlib, allowing users to easily create charts and graphs directly from
DataFrames. This seamless integration simplifies data visualization and facilitates insights from data
exploration.

Scikit-learn: Machine Learning Algorithms and Models

Algorithms
Scikit-learn provides a wide range of machine learning algorithms, including classification, regression,
clustering, and dimensionality reduction techniques. Its comprehensive library empowers users to
select the most suitable algorithm for their specific task.

Models
Scikit-learn offers a user-friendly API for building and training machine learning models. Its
streamlined process includes data preparation, model selection, parameter tuning, and model
evaluation, facilitating efficient model development.

Pipelines
Scikit-learn's pipelines streamline the machine learning process by combining multiple steps into a
single object. This simplifies data preprocessing, model training, and model evaluation, enabling
reproducible workflows.
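A minimal pipeline sketch along these lines (scaling followed by logistic regression on the built-in Iris data):

from sklearn.datasets import load_iris
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
pipe = Pipeline([('scale', StandardScaler()),     # preprocessing step
                 ('clf', LogisticRegression())])  # model step
pipe.fit(X, y)            # train preprocessing and model together
print(pipe.score(X, y))   # accuracy on the training data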

TensorFlow: Open-Source Machine Learning and Deep Learning

1. Scalable
TensorFlow supports distributed training across multiple GPUs and TPUs, enabling the training of massive models on large datasets. Its scalability makes it suitable for handling real-world machine learning challenges.

2. Flexible
TensorFlow allows for both eager execution and graph execution, providing flexibility for different development workflows. Its flexible API enables users to customize models and experiment with new architectures.

3. Productive
TensorFlow's extensive ecosystem includes tools for model deployment, serving, and monitoring, making it a productive platform for deploying machine learning models into production environments.
Keras: High-Level Neural Networks API
1. User-Friendly
Keras provides a high-level API for building and training deep learning models. Its simple and intuitive syntax makes it easy to define and experiment with neural network architectures, even for beginners.

2. Modular
Keras' modular design allows users to easily assemble and customize neural network layers, activation functions, optimizers, and loss functions, promoting flexibility and customization.

3. Multi-Backend
Keras historically ran on top of TensorFlow, Theano, and CNTK; current releases ship with TensorFlow, and Keras 3 reintroduces multi-backend support (TensorFlow, JAX, and PyTorch), providing flexibility to choose the underlying deep learning backend based on project requirements and available resources.
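As an illustrative sketch (a tiny fully-connected classifier; the layer sizes are arbitrary):

from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(16, activation='relu', input_shape=(4,)),  # hidden layer
    keras.layers.Dense(3, activation='softmax'),                  # 3-class output
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()   # print the layer-by-layer architecture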
Matplotlib: Plotting and Data Visualization

1. Static Plots
Matplotlib provides a wide range of static plotting functions, including line plots, scatter plots, bar charts, histograms, and more. It allows users to create visually appealing and informative static visualizations.

2. Interactive Plots
Matplotlib also supports interactive plots, allowing users to zoom, pan, and hover over data points, providing a more engaging and interactive data exploration experience.

3. Customization
Matplotlib offers extensive customization options for controlling the appearance of plots, including color schemes, line styles, labels, and annotations. It enables users to tailor visualizations to their specific needs.
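A minimal plotting sketch (the data points are arbitrary):

import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [1, 4, 9, 16]
plt.plot(x, y, linestyle='--', marker='o', label='y = x^2')  # customized line plot
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()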
Leveraging Python's Rich Machine Learning Ecosystem

Python's rich machine learning ecosystem empowers data scientists and developers to build and deploy machine learning models effectively. These libraries, working together, enable a complete machine learning workflow, from data preparation to model deployment.
Working with Datasets

Importing from Google Drive


Colab integrates seamlessly with Google Drive, allowing you to upload or access datasets
stored in your Drive. You can use the 'from google.colab import drive' command and
'drive.mount' to mount your Drive to the Colab environment.
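For example (run inside a Colab notebook; it will prompt you to authorize access):

from google.colab import drive
drive.mount('/content/drive')   # Drive files then appear under /content/drive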

Downloading from the Web


Colab allows you to download datasets directly from the web using libraries like 'urllib' or
'requests.' For example, you can use 'urllib.request.urlretrieve' to download a file from a
specific URL.
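A sketch with a hypothetical URL:

from urllib.request import urlretrieve

# The URL below is a placeholder; substitute a real dataset link
urlretrieve('https://example.com/data.csv', 'data.csv')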

Using Public Datasets


Colab offers access to public datasets through repositories like Kaggle. You can directly
import datasets from Kaggle using the 'kaggle' library, simplifying the process of obtaining
and working with public data.

Load the Dataset
Import Libraries
Begin by importing necessary libraries like Pandas, NumPy, and Scikit-
learn, which provide tools for data manipulation, numerical computation,
and machine learning tasks.

Read the Data


Use the appropriate function to read the data from its source. For example,
Pandas' pd.read_csv() function can read comma-separated value files.

Hands-on: Libraries
1) Install the necessary Python libraries required for data handling, visualization, and machine
learning.

2) Write a Python script to import essential libraries like NumPy, pandas, matplotlib, seaborn, and
scikit-learn to set up the environment for a machine learning project.

3) Write a program that uses a pandas DataFrame

4) Using a Local CSV File:


Load a dataset from a local CSV file into a pandas DataFrame and prepare it for analysis.
Using scikit-learn’s Built-in Datasets:
Use a built-in dataset from the scikit-learn library (e.g., Iris dataset) and convert it into a pandas
DataFrame for easier manipulation and exploration.

5) Write a program to display the first few rows, basic statistics, and metadata (like data types
and missing values) of a dataset to understand its structure and quality.

6) Save a processed dataset to a CSV file and load it back into a pandas DataFrame to
demonstrate how to persist and retrieve datasets.

Install the necessary Python libraries required for data
handling, visualization, and machine learning.

!pip install numpy
!pip install pandas scikit-learn matplotlib seaborn

Write a Python script to import essential libraries like NumPy, pandas, matplotlib, seaborn, and scikit-learn to set up the environment for a machine learning project.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
Using dataframes:
1.Import the pandas library:
•Use import pandas as pd to load the pandas library with the alias pd.
2.Create a dictionary to represent data:
•The dictionary has three keys: 'Name', 'Age', and 'City'.
•Each key maps to a list of corresponding values:
•'Name': ['Alice', 'Bob', 'Charlie']
•'Age': [25, 30, 35]
•'City': ['New York', 'Los Angeles', 'Chicago']
3.Convert the dictionary to a DataFrame:
•Use pd.DataFrame(data) to convert the dictionary data into a pandas DataFrame.
4.Display the original DataFrame:
•Print the DataFrame to show the original data.
5.Add a new column ('Salary') to the DataFrame:
•Add the 'Salary' column with values [70000, 80000, 90000] to the DataFrame.
6.Filter rows where Age is greater than 28:
•Create a filtered DataFrame that includes only rows where the Age column value is greater
than 28.
7.Display the updated DataFrame:
•Print the DataFrame after adding the 'Salary' column.
8.Display the filtered DataFrame:
•Print the filtered DataFrame showing only the rows where Age is greater than 28.

1.Import the pandas library:
•Use import pandas as pd to load the pandas library with the alias pd.
import pandas as pd
2. Create a dictionary

data = {
'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 35],
'City': ['New York', 'Los Angeles', 'Chicago']
}

3.Convert the dictionary to a DataFrame:


•Use pd.DataFrame(data) to convert the dictionary data into a pandas
DataFrame.
df = pd.DataFrame(data)

4.Display the original DataFrame:


•Print the DataFrame to show the original data.
print("Original DataFrame:")
print(df)

5.Add a new column ('Salary') to the DataFrame:
•Add the 'Salary' column with values [70000, 80000, 90000] to the
DataFrame.
df['Salary'] = [70000, 80000, 90000]

6.Filter rows where Age is greater than 28:


•Create a filtered DataFrame that includes only rows where the Age
column value is greater than 28.
filtered_df = df[df['Age'] > 28]

7.Display the updated DataFrame:
•Print the DataFrame after adding the 'Salary' column.
print("DataFrame after adding 'Salary':")
print(df)

8.Display the filtered DataFrame:
•Print the filtered DataFrame showing only the rows where Age is greater
than 28.
print("Rows where Age > 28:")
print(filtered_df)

Exercise:
1. Select specific columns from the DataFrame.
2. Select rows based on a condition.
3. Sort the DataFrame by a specific column.
4. Add a new row to the DataFrame.
5. Update a column's values.
6. Drop a column from the DataFrame.
7. Count the unique values in a column.
8. Check for missing values in the DataFrame.
9. Rename a column.
10.Group data by a specific column and calculate a
summary statistic.
11.Reset the index of the DataFrame.
12.Filter data by multiple conditions.
13.Get descriptive statistics of the numeric columns.
14.Apply a function to a specific column.
Exercise:
1. Select specific columns from the DataFrame.
# Select only the 'Name' and 'City' columns
selected_columns = df[['Name', 'City']]
print(selected_columns)

2. Select rows based on a condition.

# Select rows where 'Age' is less than 30


under_30 = df[df['Age'] < 30]
print(under_30)

Exercise:
3. Sort the DataFrame by a specific column.
# Sort the DataFrame by the 'Age' column in ascending order
sorted_df = df.sort_values(by='Age')
print(sorted_df)

4. Add a new row to the DataFrame.


new_row = pd.DataFrame({'Name': ['David'], 'Age': [28],
'City': ['Boston'], 'Salary': [95000]})

df = pd.concat([df, new_row], ignore_index=True)


print(df)

Exercise:
1. Select Rows by Index
2. Select Rows Using loc[] (Label-based Selection)
3. Select Rows Using iloc[] with Ranges
4. Select Rows by Multiple Conditions
5. Select Specific Rows Using head() or tail()

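A sketch of these selection idioms, assuming the df DataFrame built in the earlier example:

print(df.iloc[0])                    # 1. select a row by integer index
print(df.loc[0, ['Name', 'City']])   # 2. label-based selection with loc[]
print(df.iloc[1:3])                  # 3. an integer range with iloc[]
print(df[(df['Age'] > 25) & (df['City'] == 'Chicago')])   # 4. multiple conditions
print(df.head(2))                    # 5. first rows with head()
print(df.tail(2))                    #    last rows with tail()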
Hands-on: Datasets

1) Load a dataset from a local CSV file into a pandas DataFrame in Python.

2) Display the first few rows, basic statistics, and metadata (such as column types and missing values) of a dataset in Python.

3) Use a built-in dataset from the scikit-learn library (such as the Iris dataset) and convert it into a pandas DataFrame.
Datasets: Examples

1) Observe the student dataset

2) The Iris dataset is one of the most famous and widely used
datasets in machine learning and statistics.

Details of the Iris Dataset:


•Number of Instances (Rows): 150
•Number of Features (Columns): 4
•Classes (Categories): 3
•Class Names:
• Setosa
• Versicolor
• Virginica

Datasets: Examples

• The Iris dataset is a tabular dataset that contains numerical data representing physical measurements of iris flowers.

• Specifically, it consists of 150 rows and 5 columns (4 features and 1 target label), with the features being measurements like sepal length, sepal width, petal length, and petal width (all in centimeters).

• Each row represents a different flower, and the target label indicates the species of the flower (Setosa, Versicolor, or Virginica).

1)Load a dataset from a local CSV file into a
pandas DataFrame in Python

# Replace with your actual file path
dataset = pd.read_csv('/content/drive/MyDrive/1/student-dataset.csv')

2) Display the first few rows, basic statistics, and
metadata (such as column types and missing
values) of a dataset in Python

# Display first few rows


print(dataset.head())

# Check the dimensions of the dataset


print(f"Dataset shape: {dataset.shape}")
# Display basic statistics
print(dataset.describe())
# Show dataset info (types, non-null counts)
dataset.info()  # info() prints directly and returns None, so no print() wrapper is needed
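# Count missing values per column (complements the metadata above)
print(dataset.isnull().sum())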

3) Use a built-in dataset from the scikit-
learn library (such as the Iris dataset)
and convert it into a pandas DataFrame

from sklearn.datasets import load_iris

•This line imports the load_iris function from the datasets module
of the scikit-learn library.
•load_iris is a function that loads the Iris dataset, a classic
dataset for machine learning tasks.

# Load Iris dataset


iris = load_iris()
This line calls the load_iris() function to load the Iris dataset into
the variable iris.

3) Use a built-in dataset from the scikit-learn
library (such as the Iris dataset) and convert it
into a pandas DataFrame
# Convert to DataFrame
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)

•This line converts the iris.data (which is a 2D NumPy array) into a pandas
DataFrame.
•iris.data holds the feature values of the dataset.
•iris.feature_names holds the names of the features, such as "sepal length", "sepal
width", etc. These names are passed as column labels for the DataFrame.
•The result is that df is now a DataFrame where each row represents a flower and the
columns represent the different features (sepal length, width, petal length, and
width).
df['target'] = iris.target

•This line adds a new column to the df DataFrame, labeled 'target'.


•iris.target contains the target values (species of the iris flowers), which are integers
(0, 1, 2 corresponding to Setosa, Versicolor, and Virginica).
•The result is that the DataFrame df now includes the target labels (species)
3) Use a built-in dataset from the scikit-learn library
(such as the Iris dataset) and convert it into a pandas
DataFrame

The dataset is returned as a Bunch object (similar to a dictionary), where:

• iris.data contains the feature values (150 rows and 4 columns).
• iris.target contains the target labels (species types).
• iris.feature_names contains the names of the features.

Exercise on iris dataset:
1. Display the First 5 Rows of the Dataset
2. Display the Last 5 Rows of the Dataset
3. Display the First N Rows (e.g., first 10 rows) of the Dataset
4. Display the Last N Rows
5. Display a Specific Range of Rows (e.g., middle rows) of the Dataset
6. Display Random Rows from the Dataset
7. Display Specific Columns
8. Display the Data Types of the Columns
9. Display Basic Statistical Summary
10. Display Unique Values in the 'target' Column
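A sketch of the corresponding pandas calls, assuming the Iris DataFrame df built above (column names come from iris.feature_names):

print(df.head())        # 1. first 5 rows
print(df.tail())        # 2. last 5 rows
print(df.head(10))      # 3. first N rows (e.g., 10)
print(df.tail(10))      # 4. last N rows
print(df.iloc[70:80])   # 5. a specific range of rows
print(df.sample(5))     # 6. random rows
print(df[['sepal length (cm)', 'target']])   # 7. specific columns
print(df.dtypes)        # 8. data types of the columns
print(df.describe())    # 9. basic statistical summary
print(df['target'].unique())                 # 10. unique values in 'target'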
