Kashif Murtaza
AI Sciences Instructor
Shahzaib Hamid
A Practical Approach to Timeseries Forecasting using Python
• Basic Data Manipulation in Time Series
• How to Install Packages
• Basic Plotting and Data Visualization in Python
• Dataset Manipulation and Slicing
• Overview of Time Series Parameters
Section Overview
▪ Pandas, NumPy, Matplotlib, and scikit-learn are core Python libraries for data work.
▪ Pandas is used to create and modify data frames.
▪ NumPy is one of the most commonly used packages for scientific computing in Python.
▪ Matplotlib is a cross-platform data visualization and plotting library for Python.
▪ Scikit-learn (sklearn) is a widely used, robust library for machine learning in Python.
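As a minimal sketch of the first two bullets, the snippet below builds a small data frame with pandas, modifies it, and summarizes it with NumPy. The monthly values and the column names `value` and `change` are made up for illustration and are not from the course material.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 12 monthly values invented for this example.
dates = pd.date_range("2023-01-01", periods=12, freq="MS")
values = np.arange(12, dtype=float) * 2.0 + 10.0

# pandas: create a data frame indexed by date.
df = pd.DataFrame({"value": values}, index=dates)

# pandas: modify the data frame by adding a derived column
# (difference between each month and the previous one).
df["change"] = df["value"].diff()

# NumPy: numerical summary computed on the underlying array.
mean_value = np.mean(df["value"].to_numpy())

print(df.head())
print("mean:", mean_value)
```

The first row of `change` is NaN because there is no earlier month to subtract from.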
Packages Installation
Recommended Tool:
▪ Anaconda Distribution of Python
Steps:
▪ Visit Anaconda.com/downloads
▪ Select Windows
▪ Download the .exe installer
▪ Open and run the .exe installer
▪ After installation, open the Anaconda Prompt
▪ Create a new environment in Anaconda
▪ Install Jupyter Notebook: pip install notebook
▪ Install dependencies (pandas, numpy, matplotlib, scikit-learn)
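Once the steps above are done, one way to confirm that the dependencies are visible to the active environment is a short check like the one below. This is only a sketch: the helper name `check_packages` is hypothetical, and the package list is assumed from the libraries named in the section overview.

```python
from importlib import import_module
from importlib.util import find_spec

# Import names of the libraries this section relies on.
# Note: scikit-learn is imported under the name "sklearn".
REQUIRED = ["pandas", "numpy", "matplotlib", "sklearn"]


def check_packages(names):
    """Return {import_name: version string, or None if not installed}."""
    report = {}
    for name in names:
        if find_spec(name) is None:
            # Package is not installed in this environment.
            report[name] = None
            continue
        module = import_module(name)
        report[name] = getattr(module, "__version__", "unknown")
    return report


report = check_packages(REQUIRED)
for name, version in report.items():
    print(f"{name}: {version if version else 'MISSING'}")
```

Any package reported as MISSING can be installed from the Anaconda Prompt before continuing.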
Basic Plotting and Data Visualization in Python
Data visualization in Python is covered here in three parts:
1. Basic matplotlib
2. Area Plots, Histograms, and Bar Plots
3. Pie Charts, Box Plots, Scatter Plots, and Bubble Plots
Overview of Data Visualization
▪ Basic matplotlib
▪ Area Plots, Histograms, and Bar Plots
▪ Pie Charts, Box Plots
Matplotlib
1. Install dependencies
2. Load dataset
3. Data slicing
4. Data visualization using matplotlib
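The load, slice, and plot steps above might look like the following sketch. No dataset is named on the slide, so a synthetic daily series stands in for the loaded file; with a real file the series would typically come from `pd.read_csv` instead. The variable names are illustrative.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Load dataset: a synthetic daily series stands in for a real file here.
rng = np.random.default_rng(0)
index = pd.date_range("2022-01-01", periods=365, freq="D")
series = pd.Series(np.cumsum(rng.normal(size=365)), index=index, name="value")

# Data slicing: label-based slicing on a DatetimeIndex.
first_quarter = series.loc["2022-01-01":"2022-03-31"]  # both ends inclusive
march_only = series.loc["2022-03"]                     # partial-string match

# Data visualization: line plot of the sliced range.
fig, ax = plt.subplots(figsize=(8, 3))
ax.plot(first_quarter.index, first_quarter.to_numpy(), label="Q1 2022")
ax.set_xlabel("Date")
ax.set_ylabel("Value")
ax.set_title("Sliced time series")
ax.legend()
fig.tight_layout()
```

In a Jupyter notebook the figure renders inline; in a plain script, call `plt.show()` after the last line.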
Area Plots, Histograms, and Bar Plots
1. Load dataset
2. Data sorting
3. Data visualization of area plot
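A hedged sketch of the sort-then-plot workflow for an area plot follows. The table is invented for illustration (categories `A`, `B`, `C` over four years), and sorting the columns by their totals is one plausible reading of the data sorting step, not necessarily the one used in the course.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Load dataset: hypothetical yearly counts for three categories.
df = pd.DataFrame(
    {"A": [5, 7, 9, 12], "B": [20, 18, 21, 25], "C": [1, 2, 2, 3]},
    index=[2019, 2020, 2021, 2022],
)

# Data sorting: order the columns by their total, largest first,
# so the biggest category sits at the bottom of the stack.
totals = df.sum(axis=0).sort_values(ascending=False)
df_sorted = df[totals.index]

# Data visualization: stacked area plot via the pandas plotting wrapper,
# which draws with matplotlib under the hood.
ax = df_sorted.plot(kind="area", stacked=True, alpha=0.6, figsize=(8, 4))
ax.set_xlabel("Year")
ax.set_ylabel("Count")
ax.set_title("Stacked area plot")
```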
Histogram and Pie Charts
▪ Histogram
▪ Pie Charts
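For reference, a minimal sketch of both chart types with matplotlib is shown below. The sample values and the share labels are made up for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: 500 random samples and three made-up shares.
rng = np.random.default_rng(1)
samples = rng.normal(loc=50, scale=10, size=500)
shares = {"Train": 70, "Validation": 15, "Test": 15}

fig, (ax_hist, ax_pie) = plt.subplots(1, 2, figsize=(9, 4))

# Histogram: distribution of a single numeric variable over 20 bins.
counts, bin_edges, _ = ax_hist.hist(samples, bins=20, edgecolor="black")
ax_hist.set_title("Histogram")
ax_hist.set_xlabel("Value")
ax_hist.set_ylabel("Frequency")

# Pie chart: parts of a whole, labelled with rounded percentages.
ax_pie.pie(list(shares.values()), labels=list(shares.keys()), autopct="%1.0f%%")
ax_pie.set_title("Pie chart")

fig.tight_layout()
```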
Overview of Time Series Parameters
• Correlation
• MAE
• RMSE
• MAPE
Correlation
• Correlation is the statistical relationship between two variables.
• It is useful in data analysis and modeling for understanding how variables relate to one another.
• Libraries used: numpy, matplotlib
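As an illustration with the two libraries named above, the sketch below computes the Pearson correlation coefficient with `np.corrcoef` and draws a scatter plot of the two variables. The data is synthetic, and Pearson correlation is assumed here as the measure; the slide does not name a specific one.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic data: y depends linearly on x, plus some noise.
rng = np.random.default_rng(42)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(scale=0.5, size=200)

# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal
# entry is the Pearson correlation between x and y.
r = np.corrcoef(x, y)[0, 1]
print(f"Pearson r = {r:.3f}")

# A scatter plot makes the relationship visible.
fig, ax = plt.subplots(figsize=(5, 4))
ax.scatter(x, y, s=10, alpha=0.6)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title(f"Correlation (r = {r:.2f})")
fig.tight_layout()
```

Values of r near +1 or -1 indicate a strong linear relationship, and values near 0 indicate little or no linear relationship.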
Mean Absolute Error
• Mean Absolute Error (MAE) is the average absolute difference between the predicted values and the actual values.
• It is a scale-dependent accuracy measure: the error is expressed in the same units as the observations, so it can only be compared across series measured on the same scale.
• It is used as an evaluation metric for regression models in machine learning.
• Library used: scikit-learn
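A small worked example of MAE follows, using `mean_absolute_error` from scikit-learn alongside the same calculation written out in NumPy. The actual and predicted values are invented for illustration.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Hypothetical actual and predicted values.
actual = np.array([100.0, 110.0, 120.0, 130.0])
predicted = np.array([98.0, 113.0, 118.0, 135.0])

# scikit-learn: MAE = mean(|actual - predicted|).
mae = mean_absolute_error(actual, predicted)

# The same calculation spelled out with NumPy.
mae_manual = np.mean(np.abs(actual - predicted))

print("MAE:", mae)
```

The absolute errors here are 2, 3, 2, and 5, so the MAE is 12 / 4 = 3.0, in the same units as the data.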
Root Mean Square Error
• Root Mean Square Error (RMSE) is the square root of the Mean Squared Error.
• RMSE summarizes the difference between the estimated and actual values in the same units as the data, and it penalizes large errors more heavily than MAE.
• Library used: scikit-learn
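A minimal sketch of RMSE, computed as the square root of scikit-learn's `mean_squared_error`, is shown below. The values are invented for illustration. Taking the square root with NumPy is a deliberate choice here because it behaves the same across scikit-learn versions.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Hypothetical actual and predicted values.
actual = np.array([100.0, 110.0, 120.0, 130.0])
predicted = np.array([98.0, 113.0, 118.0, 135.0])

# MSE = mean((actual - predicted)^2); RMSE is its square root.
mse = mean_squared_error(actual, predicted)
rmse = np.sqrt(mse)

# The same calculation spelled out with NumPy.
rmse_manual = np.sqrt(np.mean((actual - predicted) ** 2))

print("MSE:", mse)
print("RMSE:", rmse)
```

The squared errors are 4, 9, 4, and 25, giving an MSE of 10.5 and an RMSE of about 3.24. The single error of 5 contributes more than half of the MSE, which is why RMSE comes out larger than the MAE of 3.0 on the same data.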
Mean Absolute Percentage Error
• Mean Absolute Percentage Error (MAPE) is a statistical measure of how accurate a machine learning model's predictions are on a particular dataset.
• It can be used as a loss function that quantifies the error found during model evaluation.
• It expresses the difference between actual and estimated values relative to the actual values, so the error reads as a percentage.
• Library used: scikit-learn
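The sketch below computes MAPE with `mean_absolute_percentage_error` from scikit-learn (available in version 0.24 and later) and checks it against a manual NumPy calculation. The values are invented for illustration. Note that scikit-learn returns a fraction, not a percentage, so it is multiplied by 100 for display.

```python
import numpy as np
from sklearn.metrics import mean_absolute_percentage_error

# Hypothetical actual and predicted values (actual values must be non-zero).
actual = np.array([100.0, 110.0, 120.0, 130.0])
predicted = np.array([98.0, 113.0, 118.0, 135.0])

# scikit-learn: MAPE = mean(|actual - predicted| / |actual|), as a fraction.
mape = mean_absolute_percentage_error(actual, predicted)

# The same calculation spelled out with NumPy.
mape_manual = np.mean(np.abs((actual - predicted) / actual))

print(f"MAPE: {mape * 100:.2f}%")
```

Because each error is divided by the actual value, MAPE is unstable when actual values are zero or close to zero.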

Module 3 - Basics of Data Manipulation in Time Series
