Python Environment for Machine Learning
Welcome to the exciting world of machine learning. In this presentation, we'll set up a working environment for Python, the language of choice for building and deploying machine learning models. Let's get started and prepare for a journey into the world of AI and machine learning.
Short Term Program on: Insights into Machine Learning and Deep Learning:Theory to Practice
Why Python for Machine Learning?
Extensive Libraries
Python boasts a wealth of powerful libraries specifically designed for machine learning, such as NumPy, Pandas, Scikit-learn, and TensorFlow. These libraries provide pre-built functions and algorithms for data manipulation, analysis, and model building.
Easy to Learn
Python's syntax is clear, concise, and beginner-friendly. It's widely used for general programming, making it an ideal language to learn for those new to coding.
Strong Community
Python has a vast and active community of developers and researchers, offering a wealth of resources, documentation, and support for machine learning. This collaborative environment is incredibly valuable for learning and problem-solving.
Setting up the Python Environment
1. Install Python or use a cloud-based platform
Ensure you have Python installed on your system. You can download the latest version from the official Python website.
2. Install Libraries
Use pip, Python's package installer, to install the necessary libraries like NumPy, Pandas, Matplotlib, and Scikit-learn.
Installing Anaconda
1. What is Anaconda?
Anaconda is a free and open-source distribution of the Python and R programming languages for data science, machine learning, and deep learning. It provides a comprehensive environment with many commonly used packages and tools pre-installed, making it easy to get started with machine learning.
2. Download Anaconda
Download the appropriate Anaconda installer for your operating system (Windows, macOS, or Linux) from the official Anaconda website. You'll find both graphical and command-line installers.
Configuring the Anaconda Environment
Create a Virtual Environment
Virtual environments help isolate project dependencies, preventing conflicts and ensuring that each project uses the specific packages required. In the Anaconda Prompt, use the command: "conda create -n <env_name> python=3.9" (replace <env_name> with a descriptive name).
Activate the Environment
Activate the newly created virtual environment to start working within it. Use the command: "conda activate <env_name>" (replace <env_name> with the name of your environment). You'll see the environment name in parentheses before the prompt.
Install Packages
Within the activated environment, you can install packages using the "conda install" command. For example, to install NumPy, use: "conda install numpy".
Importing Relevant Libraries
NumPy
Import the NumPy library for numerical computation.
Pandas
Import the Pandas library for data manipulation and analysis.
Matplotlib
Import the Matplotlib library for data visualization and plotting.
Scikit-learn
Import the Scikit-learn library for machine learning algorithms, including linear and polynomial regression.
TensorFlow
A powerful library for deep learning, offering a flexible framework for building and training neural networks.
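These libraries are usually imported under community-standard aliases. The following is a minimal sketch, assuming NumPy, Pandas, Matplotlib, and Scikit-learn are already installed in the active environment. TensorFlow is left commented out because it is a large, separate install, and the `versions` dictionary is only an illustrative name.

```python
import numpy as np                 # numerical computation
import pandas as pd                # data manipulation and analysis
import matplotlib                  # imported for its version string
import matplotlib.pyplot as plt    # plotting interface
import sklearn                     # imported for its version string
from sklearn.linear_model import LinearRegression      # linear regression
from sklearn.preprocessing import PolynomialFeatures   # polynomial terms for regression
# import tensorflow as tf          # deep learning; uncomment if TensorFlow is installed

# Collect the installed versions (illustrative helper, not part of any library)
versions = {
    "numpy": np.__version__,
    "pandas": pd.__version__,
    "matplotlib": matplotlib.__version__,
    "scikit-learn": sklearn.__version__,
}
for name, version in versions.items():
    print(f"{name}: {version}")
```

Printing the versions is a quick way to confirm which release of each library the environment actually provides.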
Verifying the Environment
1. Open Python Interpreter
Launch a Python interpreter within your activated virtual environment. This can be done by typing "python" in the Anaconda Prompt or by starting a Python script in your code editor.
2. Import Libraries
Attempt to import the necessary libraries: "import numpy as np", "import pandas as pd", or "from sklearn.model_selection import train_test_split". If successful, you should not see any errors.
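The same check can be scripted so that a missing library is reported instead of raising an error. This is a minimal sketch using only the standard library; the function name `check_environment` and the `REQUIRED` list are hypothetical and should be adapted to the packages your project needs.

```python
import importlib.util

# Import names to look for (note: scikit-learn is imported as "sklearn")
REQUIRED = ["numpy", "pandas", "matplotlib", "sklearn"]

def check_environment(modules):
    """Return a dict mapping each module name to True if it can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}

report = check_environment(REQUIRED)
for name, found in report.items():
    status = "OK" if found else "MISSING"
    print(f"{name:12s} {status}")
```

A "MISSING" entry means the module cannot be found from the interpreter that ran the script, which usually points to the wrong environment being active or the package not being installed there.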
Troubleshooting Common Issues
Package Conflicts
If you encounter errors related to missing packages or conflicting versions, use "conda list" to list all packages in your environment. You can then use "conda update <package>" to update a specific package or "conda install <package>" to install a new one.
Environment Activation
Double-check that you've activated the correct virtual environment. You should see the environment name in parentheses before the prompt. If you're not in the correct environment, activate it using "conda activate <env_name>".
Internet Connectivity
Some installation processes require an internet connection to download packages. Ensure that your internet connection is stable and that any firewall settings are not blocking access.
Harnessing Google Colab for ML
Introduction to Google Colaboratory
Cloud-Based Jupyter Notebook
Google Colaboratory, often shortened to Colab, is a web-based environment for coding,
running, and sharing machine learning models. It's powered by Jupyter Notebooks, a
popular format for interactive coding and documentation, providing a seamless
experience for data science tasks.
Why Choose Google Colaboratory
1. GPU and TPU Access
Colab offers free access, subject to usage limits, to powerful GPUs and TPUs (Tensor Processing Units), enabling you to accelerate complex machine learning tasks and train deep learning models more efficiently.
2. Pre-Installed Libraries
Colab comes pre-installed with essential Python libraries for machine learning, such as TensorFlow, PyTorch, scikit-learn, and more. This pre-configured environment saves you time and eliminates dependency-management headaches.
Getting Started with Google Colaboratory
Navigate to Colab
Begin by opening your browser and visiting the Google Colaboratory
website. You'll be greeted with a welcoming interface, allowing you to
access existing notebooks or start a new one.
Start Coding
Now you're ready to write your code! Colab presents a familiar interface
with code cells and markdown cells for documentation. Add code cells
to experiment with Python commands and machine learning libraries.
Managing Packages in Google Colaboratory
Installation with 'pip'
Colab utilizes the 'pip' package manager to install Python libraries. To install a new library, run 'pip install' within a code cell, prefixed with '!' so that the cell executes it as a shell command. For example, '!pip install numpy' will install the NumPy library for numerical computations.
Importing Libraries
Once installed, you can import libraries using the 'import' statement. For instance, 'import numpy as np' imports NumPy and assigns it the alias 'np' for convenience. This allows you to utilize the library's functions in your code.
Python Machine Learning Libraries
NumPy: The Foundation of Scientific Computing
Versatile Arrays
NumPy's core data structure is the multidimensional array,
which allows efficient storage and manipulation of numerical
data. Its arrays are optimized for mathematical operations,
making NumPy a fundamental library for linear algebra,
Fourier transforms, and random number generation.
Broadcasting
NumPy introduces broadcasting, a mechanism that enables
operations between arrays of different shapes, simplifying
mathematical operations. It provides a powerful way to
perform vectorized calculations, significantly enhancing
performance compared to element-wise iterations.
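Broadcasting is easiest to see with a small example. The sketch below, with made-up values, adds a 1D array of shape (3,) to a 2D array of shape (2, 3); NumPy stretches the 1D array across each row without an explicit loop.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])    # shape (2, 3)
row = np.array([10, 20, 30])      # shape (3,)

# Broadcasting: `row` is applied to every row of `matrix`
result = matrix + row
print(result)

# The equivalent element-wise loop, shown only for comparison
looped = np.empty_like(matrix)
for i in range(matrix.shape[0]):
    for j in range(matrix.shape[1]):
        looped[i, j] = matrix[i, j] + row[j]

print(np.array_equal(result, looped))
```

Both approaches give the same values; the broadcast version is shorter and runs in compiled code rather than the Python interpreter.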
Pandas: Data Manipulation and Analysis
Data Structures
Pandas introduces powerful data structures like Series (1D labeled array) and DataFrame (2D labeled
data structure), allowing efficient data manipulation and analysis. Its intuitive API makes it easy to
access, filter, and modify data.
Data Cleaning
Pandas offers extensive tools for data cleaning, including handling missing values, removing duplicates, and
transforming data types. It empowers data scientists to prepare data for analysis and modeling, ensuring
data quality and consistency.
Data Visualization
Pandas integrates well with Matplotlib, allowing users to easily create charts and graphs directly from
DataFrames. This seamless integration simplifies data visualization and facilitates insights from data
exploration.
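The cleaning steps named above (missing values, duplicates, data types) can be chained together. This is a minimal sketch on hypothetical data; the column names and values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data with one duplicate row and one missing score
raw = pd.DataFrame({
    "name": ["Ann", "Ben", "Ben", "Cara"],
    "score": [90.0, 85.0, 85.0, np.nan],
})

print(raw.isnull().sum())          # number of missing values per column

cleaned = (
    raw.drop_duplicates()          # remove the repeated "Ben" row
       .dropna(subset=["score"])   # drop rows whose score is missing
       .astype({"score": int})     # convert floats to integers once no NaN remains
       .reset_index(drop=True)     # renumber the rows from 0
)
print(cleaned)
```

Dropping rows is only one way to handle missing values; `fillna` is the usual alternative when the rows should be kept.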
Scikit-learn: Machine Learning Algorithms and Models
Algorithms
Scikit-learn provides a wide range of machine learning algorithms, including classification, regression,
clustering, and dimensionality reduction techniques. Its comprehensive library empowers users to
select the most suitable algorithm for their specific task.
Models
Scikit-learn offers a user-friendly API for building and training machine learning models. Its
streamlined process includes data preparation, model selection, parameter tuning, and model
evaluation, facilitating efficient model development.
Pipelines
Scikit-learn's pipelines streamline the machine learning process by combining multiple steps into a
single object. This simplifies data preprocessing, model training, and model evaluation, enabling
reproducible workflows.
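A pipeline can be sketched in a few lines. The example below is one possible combination, assuming the built-in Iris dataset, a `StandardScaler` preprocessing step, and a `LogisticRegression` classifier; the step names "scale" and "model" are arbitrary labels chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Load features and labels, then hold out 20% of the rows for testing
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Preprocessing and model combined into a single object
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=200)),
])

pipeline.fit(X_train, y_train)              # fits the scaler, then the classifier
accuracy = pipeline.score(X_test, y_test)   # applies the same scaling before scoring
print(f"Test accuracy: {accuracy:.2f}")
```

Because the scaler is fitted only on the training split inside `fit`, the test data never influences preprocessing, which is one reason pipelines help keep workflows reproducible.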
TensorFlow: Open-Source Machine Learning and Deep Learning
1. Scalable
TensorFlow supports distributed training across multiple GPUs and TPUs, enabling the training of massive models on large datasets. Its scalability makes it suitable for handling real-world machine learning challenges.
2. Flexible
TensorFlow allows for both eager execution and graph execution, providing flexibility for different development workflows. Its flexible API enables users to customize models and experiment with new architectures.
3. Productive
TensorFlow's extensive ecosystem includes tools for model deployment, serving, and monitoring, making it a productive platform for deploying machine learning models into production environments.
Keras: High-Level Neural Networks API
1. User-Friendly
Keras provides a high-level API for building and training deep learning models. Its simple and intuitive syntax makes it easy to define and experiment with neural network architectures, even for beginners.
2. Modular
Keras' modular design allows users to easily assemble and customize neural network layers, activation functions, optimizers, and loss functions, promoting flexibility and customization.
3. Multi-Backend
Early versions of Keras ran on top of TensorFlow, Theano, and CNTK; Keras 3 supports TensorFlow, JAX, and PyTorch, providing flexibility to choose the underlying deep learning backend based on project requirements and available resources.
Matplotlib: Plotting and Data Visualization
1. Static Plots
Matplotlib provides a wide range of static plotting functions, including line plots, scatter plots, bar charts, histograms, and more. It allows users to create visually appealing and informative static visualizations.
2. Interactive Plots
Matplotlib also supports interactive plots, allowing users to zoom, pan, and hover over data points, providing a more engaging and interactive data exploration experience.
3. Customization
Matplotlib offers extensive customization options for controlling the appearance of plots, including color schemes, line styles, labels, and annotations. It enables users to tailor visualizations to their specific needs.
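The customization options listed above map directly onto plotting arguments. This is a minimal sketch with generated data; the output file name `example_plot.png` is an assumption, and the non-interactive "Agg" backend is selected only so the script also runs on machines without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to get a window with plt.show()
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots(figsize=(6, 4))

# Line plot with a custom color and line style
ax.plot(x, np.sin(x), color="tab:blue", linestyle="--", label="sin(x)")

# Scatter plot of every tenth cosine sample
ax.scatter(x[::10], np.cos(x[::10]), color="tab:red", label="cos(x) samples")

# Labels, title, annotation, and legend
ax.set_xlabel("x")
ax.set_ylabel("value")
ax.set_title("Sine and cosine")
ax.annotate("peak", xy=(np.pi / 2, 1.0), xytext=(2.5, 0.8),
            arrowprops={"arrowstyle": "->"})
ax.legend()

fig.savefig("example_plot.png")  # assumed output path in the working directory
```

In a notebook such as Colab the figure is displayed inline, so the backend line and `savefig` call are optional there.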
Leveraging Python's Rich Machine Learning Ecosystem
Python's rich machine learning
ecosystem empowers data scientists
and developers to build and deploy
machine learning models effectively.
Working with Datasets
Load the Dataset
Import Libraries
Begin by importing necessary libraries like Pandas, NumPy, and Scikit-
learn, which provide tools for data manipulation, numerical computation,
and machine learning tasks.
Hands-on: Libraries
1) Install the necessary Python libraries required for data handling, visualization, and machine
learning.
2) Write a Python script to import essential libraries like NumPy, pandas, matplotlib, seaborn, and
scikit-learn to set up the environment for a machine learning project.
5) Write a program to display the first few rows, basic statistics, and metadata (like data types and missing values) of a dataset to understand its structure and quality.
6) Save a processed dataset to a CSV file and load it back into a pandas DataFrame to
demonstrate how to persist and retrieve datasets.
Install the necessary Python libraries required for data
handling, visualization, and machine learning.
1. Import the pandas library:
• Use import pandas as pd to load the pandas library with the alias pd.
import pandas as pd
2. Create a dictionary
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}
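Steps 3 and 4 are not shown in the slides, but the later steps operate on a DataFrame named df. The following is a minimal sketch, assuming df is built directly from the dictionary above with pd.DataFrame and then displayed.

```python
import pandas as pd

# Dictionary from step 2: each key becomes a column
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}

# Assumed intermediate step: build the DataFrame from the dictionary
df = pd.DataFrame(data)

# Display the DataFrame and its shape (rows, columns)
print(df)
print(df.shape)
```

Each list in the dictionary must have the same length, since every key contributes one column with one value per row.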
5. Add a new column ('Salary') to the DataFrame:
• Add the 'Salary' column with values [70000, 80000, 90000] to the DataFrame.
df['Salary'] = [70000, 80000, 90000]
Exercise:
1. Select specific columns from the DataFrame.
2. Select rows based on a condition.
3. Sort the DataFrame by a specific column.
4. Add a new row to the DataFrame.
5. Update a column's values.
6. Drop a column from the DataFrame.
7. Count the unique values in a column.
8. Check for missing values in the DataFrame.
9. Rename a column.
10. Group data by a specific column and calculate a summary statistic.
11. Reset the index of the DataFrame.
12. Filter data by multiple conditions.
13. Get descriptive statistics of the numeric columns.
14. Apply a function to a specific column.
Exercise:
1. Select specific columns from the DataFrame.
# Select only the 'Name' and 'City' columns
selected_columns = df[['Name', 'City']]
print(selected_columns)
Exercise:
3. Sort the DataFrame by a specific column.
# Sort the DataFrame by the 'Age' column in ascending order
sorted_df = df.sort_values(by='Age')
print(sorted_df)
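Several of the remaining exercises follow the same pattern. The sketch below is one possible set of answers, not the official solutions; it assumes the Name/Age/City DataFrame with the 'Salary' column added, and every result variable name is chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
    'Salary': [70000, 80000, 90000],
})

# Exercise 2: select rows based on a condition
older_than_28 = df[df['Age'] > 28]

# Exercise 12: filter by multiple conditions (each condition needs its own parentheses)
filtered = df[(df['Age'] > 28) & (df['Salary'] < 90000)]

# Exercise 6: drop a column (returns a new DataFrame; df itself is unchanged)
without_city = df.drop(columns=['City'])

# Exercise 7: count the unique values in a column
unique_cities = df['City'].nunique()

# Exercise 8: count missing values per column
missing_counts = df.isnull().sum()

# Exercise 9: rename a column
renamed = df.rename(columns={'City': 'Location'})

# Exercise 10: group by a column and compute a summary statistic
mean_salary_by_city = df.groupby('City')['Salary'].mean()

# Exercise 14: apply a function to a column
df['Salary_k'] = df['Salary'].apply(lambda s: s / 1000)

print(filtered)
print(mean_salary_by_city)
```

Most of these calls return new objects rather than modifying df in place, so the results need to be assigned to a variable to be kept.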
Exercise:
1. Select Rows by Index
2. Select Rows Using loc[] (Label-based Selection)
3. Select Rows Using iloc[] with Ranges
4. Select Rows by Multiple Conditions
5. Select Specific Rows Using head() or tail()
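The row-selection exercises above can be sketched as follows. This is one possible set of answers on an assumed Name/Age/City DataFrame with the default integer index; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})

# 1. Select a row by integer position (returns a Series)
first_row = df.iloc[0]

# 2. Select a row by index label with loc[] (labels here are 0, 1, 2)
by_label = df.loc[1]

# 3. Select a range of rows with iloc[]; the end position is excluded
first_two = df.iloc[0:2]

# For comparison: a loc[] range includes the end label
label_range = df.loc[0:1]

# 4. Select rows matching multiple conditions
matched = df[(df['Age'] >= 30) & (df['City'] != 'Chicago')]

# 5. Select rows from the top or bottom
top = df.head(2)
bottom = df.tail(1)

print(first_two)
print(matched)
```

The difference between `iloc[0:2]` (end excluded) and `loc[0:1]` (end included) is a common source of off-by-one mistakes.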
Hands-on: Datasets
Datasets: Examples
2) The Iris dataset is one of the most famous and widely used
datasets in machine learning and statistics.
1) Load a dataset from a local CSV file into a pandas DataFrame in Python
student-dataset.csv')
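Loading a CSV file typically goes through pd.read_csv. Because the contents and full path of student-dataset.csv are not shown, the sketch below first writes a small hypothetical file named sample_students.csv (an assumed name with invented columns) and then loads it back, which also illustrates saving a DataFrame to disk.

```python
import pandas as pd

# Hypothetical stand-in data; the real student dataset is not shown in the slides
students = pd.DataFrame({
    'student_id': [1, 2, 3],
    'marks': [78, 91, 64],
})

csv_path = 'sample_students.csv'   # assumed file name in the current working directory

# Save to CSV; index=False avoids writing the row index as an extra column
students.to_csv(csv_path, index=False)

# Load the CSV file back into a pandas DataFrame
loaded = pd.read_csv(csv_path)
print(loaded)
```

To load your own file, pass its path to `pd.read_csv` in place of `csv_path`.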
2) Display the first few rows, basic statistics, and
metadata (such as column types and missing
values) of a dataset in Python
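A typical inspection sequence might look like the sketch below. The DataFrame is hypothetical and deliberately contains two missing values so that the missing-value count has something to report.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing age and one missing city
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Dana'],
    'Age': [25, 30, np.nan, 41],
    'City': ['New York', 'Los Angeles', 'Chicago', None],
})

print(df.head())        # first few rows (5 by default)
stats = df.describe()   # basic statistics of the numeric columns
print(stats)
df.info()               # column names, non-null counts, and data types
print(df.dtypes)        # data type of each column

missing = df.isnull().sum()   # number of missing values per column
print(missing)
```

`describe` skips missing values, so the "count" row shows how many non-missing entries each numeric column has.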
3) Use a built-in dataset from the scikit-learn library (such as the Iris dataset) and convert it into a pandas DataFrame
from sklearn.datasets import load_iris
• This line imports the load_iris function from the datasets module of the scikit-learn library.
• load_iris is a function that loads the Iris dataset, a classic dataset for machine learning tasks.
# Convert to DataFrame
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
• This line converts iris.data (a 2D NumPy array) into a pandas DataFrame.
• iris.data holds the feature values of the dataset.
• iris.feature_names holds the names of the features, such as "sepal length (cm)" and "sepal width (cm)". These names are passed as column labels for the DataFrame.
• The result is that df is a DataFrame where each row represents a flower and the columns represent the different features (sepal length, sepal width, petal length, and petal width).
df['target'] = iris.target
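Put together, the steps can be run as a single script. This is a minimal sketch: the `iris = load_iris()` call is the assumed step that creates the `iris` object used above, and the extra 'species' column is an illustrative addition that maps each numeric target to its class name.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Load the dataset; the returned object exposes data, target, and name attributes
iris = load_iris()

# Convert the feature matrix to a DataFrame with named columns
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)

# Add the class labels (0, 1, 2) as a 'target' column
df['target'] = iris.target

# Illustrative extra: map each numeric label to its species name
df['species'] = df['target'].map(dict(enumerate(iris.target_names)))

print(df.head())
print(df.shape)
```

The 'target' column stores the species as integers, which is the form scikit-learn estimators expect for labels.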
Exercise on iris dataset:
1. Display the First 5 Rows of the Dataset
2. Display the Last 5 Rows of the Dataset
3. Display the first N rows (e.g., first 10 rows) of
the dataset.
4. Display the Last N Rows
5. Display a specific range of rows (e.g., middle
rows) of the dataset.
6. Display Random Rows from the Dataset
7. Display Specific Columns
8. Display the Data Types of the Columns
9. Display Basic Statistical Summary
10. Display Unique Values in the 'target' Column
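Most of these exercises take a single pandas call each. The sketch below shows one possible set of answers, assuming the Iris DataFrame is built as in the previous step; the variable names and the chosen row ranges are illustrative.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = iris.target

first_five = df.head()                         # 1. first 5 rows
last_five = df.tail()                          # 2. last 5 rows
first_ten = df.head(10)                        # 3. first N rows (N = 10)
last_ten = df.tail(10)                         # 4. last N rows (N = 10)
middle_rows = df.iloc[70:80]                   # 5. a specific range of rows
random_rows = df.sample(n=5, random_state=0)   # 6. random rows (seeded for repeatability)
two_columns = df[['sepal length (cm)', 'petal length (cm)']]   # 7. specific columns
column_types = df.dtypes                       # 8. data types of the columns
summary = df.describe()                        # 9. basic statistical summary
target_values = df['target'].unique()          # 10. unique values in 'target'

print(first_five)
print(column_types)
print(target_values)
```

Setting `random_state` in `sample` makes the random selection repeatable between runs.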