L1-L3 - Tutorial 1

The document outlines the syllabus for a Data Analytics course in the III Semester of B.Tech, covering modules such as Introduction to Analytics, Data Exploration, and Recommender Systems. It includes references, course outcomes, types of data, business applications, and a structured process for data analysis, along with case studies illustrating practical applications. Additionally, it discusses the components of business analytics, including descriptive, predictive, and prescriptive analytics, along with their techniques and real-world use cases.

Data Analytics

III Semester B.Tech


Syllabus

• Modules
• Introduction to Analytics
• Data exploration
• Data Preparation
• Recommender Systems
• Time Series
References
1. Glenn J. Myatt, Wayne P. Johnson, Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining, Second Edition, John Wiley Publication, 2014.
2. Jiawei Han, Micheline Kamber, Jian Pei, Data Mining: Concepts and Techniques, Third Edition, Morgan Kaufmann Publishers, 2012.
3. U. Dinesh Kumar, Business Analytics: The Science of Data-Driven Decision Making, Second Edition, Wiley Publications, 2021.
4. Joel Grus, Data Science from Scratch: First Principles with Python, O'Reilly Media, 2019.
5. Anil Maheshwari, Data analytics: A comprehensive guide to data analysis and decision-making, Wiley Publications, 2021.
6. Introduction to Data Analytics, https://archive.nptel.ac.in/courses/110/106/110106072/
7. Data Analytics with Python, https://onlinecourses.nptel.ac.in/noc21_cs45/preview
Course Outcomes
1. Interpret diverse datasets using exploratory data analysis
techniques.
2. Analyse different data preparation techniques to
improve data quality and model performance.
3. Implement multivariate data analysis methods for
efficient decision making.
4. Evaluate association rule mining techniques to
develop recommender systems.
5. Apply time series analysis techniques for forecasting
and trend analysis in data-driven applications.
Data

• Data is a collection of qualitative or quantitative values
related to variables or events that can be measured,
recorded, and analyzed.
• Data analytics point of view:
• Raw facts, figures, and observations that are collected
from various sources and used to generate insights, support
decision-making, and solve problems.
Types of Data

• Structured Data:
• Organized, tabular data (e.g., spreadsheets, databases).
• Example: Sales data, student marks, bank transactions
• Unstructured Data:
• No predefined format (e.g., text, images, videos, social media posts)
• Example: Customer reviews, audio recordings, emails
• Semi-Structured Data:
• Has some organizational structure but is not fully tabular
• Example: JSON and XML files, which combine structured fields with free-form content.
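To make the distinction concrete, the sketch below takes a small semi-structured JSON document and flattens it into structured rows with a fixed set of columns. The order records, field names, and the `flatten_orders` helper are all invented for this illustration.

```python
import json

# Hypothetical semi-structured input: each order has a nested, variable-length
# item list, and only some orders carry an optional "note" field.
raw = '''
[
  {"order_id": 1, "customer": "A",
   "items": [{"sku": "pen", "qty": 2}, {"sku": "ink", "qty": 1}]},
  {"order_id": 2, "customer": "B",
   "items": [{"sku": "pad", "qty": 5}], "note": "gift wrap"}
]
'''

def flatten_orders(json_text):
    """Turn nested order records into flat rows that all share the same columns."""
    rows = []
    for order in json.loads(json_text):
        for item in order["items"]:
            rows.append({
                "order_id": order["order_id"],
                "customer": order["customer"],
                "sku": item["sku"],
                "qty": item["qty"],
                # Missing optional fields become an empty string so the table stays rectangular.
                "note": order.get("note", ""),
            })
    return rows

rows = flatten_orders(raw)
for row in rows:
    print(row)
```

Each printed row has the same five columns, which is what makes the result structured (tabular) even though the input was not.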
Introduction
• Data Analytics
• The science of extracting insights from raw data.
• Importance:
• Helps in decision-making, trend detection, performance
monitoring.
• Objectives:
• Summarize data, make predictions, recommend actions
Business Applications
• Operations:
• Inventory and process optimization.
• Marketing:
• Customer segmentation, targeting, churn analysis.
• Finance:
• Fraud detection, credit scoring.
• HR:
• Employee performance and attrition analysis
Data Sources
• Internal Data Sources
• Company sales records.
• Financial transactions.
• HR data.
• Customer feedback and CRM systems.
• External Data Sources
• Market research reports.
• Social media and web analytics.
• Government and public datasets.
• Third-party data vendors.
Process for Making Sense of Data
• Problem definition and planning
• Data preparation
• Analysis
• Deployment
Problem Definition and Planning
• Some of the issues to consider when defining and
planning a data analysis project:
• Identify the problem that needs to be addressed
• List the project deliverables
• Define general success factors
• Understand each resource and other limitations
• Put together an appropriate team
• Create a plan
• Perform a cost/benefit analysis
Data Preparation
• Access and combine data tables
• Summarize data
• Look for errors
• Transform Data
• Segment Data
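The five preparation steps above can be sketched on a toy example. Everything in the block below (table contents, the 0-100 marks scale, the pass mark of 40) is an assumption made for illustration, and only the Python standard library is used.

```python
from statistics import mean

# Hypothetical tables: all names and values are made up for illustration.
students = [
    {"id": 1, "name": "Asha"},
    {"id": 2, "name": "Ravi"},
    {"id": 3, "name": "Meena"},
]
marks = [
    {"id": 1, "marks": 82},
    {"id": 2, "marks": None},   # missing value
    {"id": 3, "marks": 140},    # out of range for an assumed 0-100 scale
]

# Access and combine data tables: join the two tables on "id".
marks_by_id = {m["id"]: m["marks"] for m in marks}
table = [{**s, "marks": marks_by_id.get(s["id"])} for s in students]

# Look for errors: flag missing or out-of-range marks.
def is_valid(value):
    return value is not None and 0 <= value <= 100

errors = [row["id"] for row in table if not is_valid(row["marks"])]
clean = [row for row in table if is_valid(row["marks"])]

# Summarize data: basic counts and averages over the clean rows.
summary = {"count": len(clean), "mean_marks": mean(r["marks"] for r in clean)}

# Transform data: rescale marks to the 0-1 range.
for row in clean:
    row["marks_scaled"] = row["marks"] / 100

# Segment data: split rows into pass/fail groups at an assumed pass mark of 40.
segments = {"pass": [], "fail": []}
for row in clean:
    segments["pass" if row["marks"] >= 40 else "fail"].append(row["id"])

print(errors, summary, segments)
```

In practice the rows flagged in `errors` would be investigated or imputed rather than silently dropped; dropping them here just keeps the sketch short.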
Analysis
• Summarizing data
• Explore relationships between attributes
• Grouping the data
• Identify non-trivial facts, patterns and trends
• Building regression models
• Building classification models
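Two of these analysis steps, exploring relationships between attributes and grouping the data, can be sketched as follows. The records are invented, and the Pearson correlation coefficient is used here as one common way to measure a linear relationship; the slides do not prescribe a specific measure.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean

# Hypothetical observations: (department, hours_studied, score).
records = [
    ("CSE", 2, 50), ("CSE", 4, 60), ("CSE", 6, 70),
    ("ECE", 1, 45), ("ECE", 3, 55), ("ECE", 5, 65),
]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Explore the relationship between two attributes.
hours = [r[1] for r in records]
scores = [r[2] for r in records]
r_value = pearson(hours, scores)

# Group the data by department and summarize each group.
by_dept = defaultdict(list)
for dept, _, score in records:
    by_dept[dept].append(score)
group_means = {dept: mean(vals) for dept, vals in by_dept.items()}

print(round(r_value, 3), group_means)
```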
Deployment
• Generate report
• Deploy standalone or integrated decision support tool
• Measure business impact
Case Study- 1
• Improving Student Performance in a University
• Problem Definition and Planning
• Objective: Identify factors affecting student performance and improve academic outcomes.
• Stakeholders: Academic council, faculty members, data analysts.
• Planning: Select data from the last 2 academic years, including grades, attendance, socio-economic
background, and faculty feedback.
• Data Preparation
• Collection: Academic records, attendance logs, faculty evaluations, library usage, and feedback forms.
• Cleaning: Removed duplicate entries, handled missing data, and normalized scoring formats.
• Integration: Merged datasets from student database, LMS (Learning Management System), and survey
responses.
• Analysis
• Descriptive Analysis: Found that students with >75% attendance had 20% higher average grades.
• Predictive Analysis: Logistic regression model used to predict likelihood of a student failing.
• Clustering: K-means used to segment students into performance-based groups (High, Medium, Low
Risk).
• Insights: Key predictors of poor performance: low attendance, fewer library visits, and low internal
assessment scores.
• Deployment
• Intervention Plans: Personalized study plans and mentorship for "high-risk" students.
• Academic Dashboard: Developed to monitor performance in real-time.
• Policy Changes: Introduced minimum attendance requirement and early warning system.
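The clustering step of this case study can be illustrated with a bare-bones k-means on a single attribute. This is a sketch under stated assumptions, not the method actually used in the case study: the scores are invented, the data is one-dimensional, and the starting centroids are supplied by hand instead of being chosen at random.

```python
from statistics import mean

def kmeans_1d(values, centroids, iterations=20):
    """Plain k-means on one-dimensional data with caller-supplied starting centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters

# Hypothetical internal assessment scores for nine students.
scores = [35, 38, 42, 58, 61, 64, 85, 88, 91]
centroids, clusters = kmeans_1d(scores, centroids=[30, 60, 90])

# Label the clusters from lowest to highest score as decreasing risk.
for label, members in zip(["High Risk", "Medium Risk", "Low Risk"], clusters):
    print(label, members)
```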
Case Study- 2
• Problem Statement: Smart City Traffic Optimization
• Problem Definition & Planning: A city wants to
reduce congestion and travel time.
• Data Preparation: GPS data from buses, traffic
sensors, and signal timings are integrated.
• Analysis: Traffic flow modeling and simulations are run
to test new traffic light patterns.
• Deployment: Traffic signal algorithms are updated
dynamically based on real-time traffic, improving flow.
Introduction - Business Analytics
• In the 20th century, many companies made business
decisions based on ‘opinions’ rather than on proper
data analysis
• Opinion-based decision making can be very risky and
often leads to incorrect decisions
• Objective of Business Analytics
• Improve the quality of decision making using data analysis

Business analytics is a set of statistical and operations
research techniques, artificial intelligence, information
technology and management strategies used for framing a
business problem, collecting data, and analysing the data to
create value for organizations.
Components of Business Analytics
Analytics
• Analytics is a body of knowledge consisting of statistical,
mathematical, and operations research techniques;
• Artificial intelligence techniques
• Machine learning and deep learning algorithms;
• Data Collection and storage
• Data management processes
• Data extraction
• Transformation
• Loading
• Computing and big data technologies such as Hadoop, Spark, and
Hive
Business Data Analytics
Figure: Data-Driven Decision Making Flow Diagram
Pyramid of Analytics
Business Analytics
• Categories
• Descriptive analytics
• Predictive analytics
• Prescriptive analytics
Descriptive analytics
• Descriptive analytics is the simplest form of analytics that mainly uses simple
descriptive statistics, data visualization techniques, and business related
queries to understand past data.
• It is an innovative way of summarizing data
• Used for understanding the trends in past data which can be useful for
generating insights
• Example
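A minimal sketch of descriptive analytics on past data, assuming an invented list of monthly sales figures. It computes a few simple descriptive statistics and the month-over-month change, which is the kind of summary used to understand trends in past data.

```python
from statistics import mean, median, pstdev

# Hypothetical monthly sales (units sold) for six past months.
monthly_sales = [120, 135, 150, 110, 160, 155]

summary = {
    "total": sum(monthly_sales),
    "mean": mean(monthly_sales),
    "median": median(monthly_sales),
    "min": min(monthly_sales),
    "max": max(monthly_sales),
    "std_dev": pstdev(monthly_sales),  # population standard deviation
}

# Month-over-month change shows the direction of the trend.
changes = [later - earlier for earlier, later in zip(monthly_sales, monthly_sales[1:])]

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
print("changes:", changes)
```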
Predictive Analytics
• Aims to predict the probability of occurrence of a future
event such as forecasting demand for products/services,
customer churn, employee attrition, loan defaults,
fraudulent transactions, insurance claims, and stock
market fluctuations.
• Used for predicting what is likely to happen in the future
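As one simple illustration of demand forecasting, the sketch below fits a straight-line trend to past demand by ordinary least squares and extrapolates one month ahead. The demand figures are invented, and a linear trend is only an assumption chosen to keep the example short; real demand data usually needs a richer model.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical historical demand for months 1 to 5.
months = [1, 2, 3, 4, 5]
demand = [100, 110, 120, 130, 140]

slope, intercept = fit_line(months, demand)

# Predict what is likely to happen in month 6 by extending the fitted trend.
forecast_month_6 = slope * 6 + intercept
print(slope, intercept, forecast_month_6)
```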
Real-World Use Cases of Predictive Analytics
• Entertainment & Media
• Polyphonic HMI: Predicts hit songs using machine learning.
• Netflix: Recommends movies based on viewing behavior (recommendations drive about 75% of views).
• E-Commerce & Marketing
• Amazon: Recommender system influences 35% of sales.
• OkCupid: Forecasts likelihood of response in dating messages.
• Workforce Management
• HP: Predicts employee attrition with flight risk scores.
• Customer Insights & Profitability
• Capital One Bank: Identifies the most profitable customers.
• Travel & Transportation
• FlightCaster: Predicts flight delays 6 hours in advance.
• Farecast: Forecasts airfare trends and pricing changes.
• Healthcare & Public Health
• Google: Tracked spread of H1N1 flu using search queries.
• Academic Research
• University of Maryland: Explored dreams as predictors of infidelity.
Predictive Analytics Techniques
Technique: Key Application Areas
• Regression: Forecasting continuous outcomes (e.g., sales, demand, risk).
• Logistic & Multinomial Regression: Predicting probabilities (e.g., customer churn, fraud detection).
• Decision Trees: Classification tasks and easy interpretation of decisions.
• Markov Chains: Modeling systems that transition from one state to another (e.g., web behavior, supply chains).
• Random Forest: Improves decision trees using ensemble learning for better accuracy.
• Linear Programming: Optimization problems like resource allocation, scheduling.
• Integer Programming: Similar to linear programming, but variables are integers (used in production planning, capital budgeting).
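Of the techniques listed, a Markov chain is perhaps the easiest to show in a few lines. The sketch below models web behavior with three states and propagates a probability distribution through the chain. The states and transition probabilities are assumptions invented for this example.

```python
def step(distribution, transition):
    """Advance a probability distribution over states by one transition."""
    states = list(distribution)
    return {
        nxt: sum(distribution[cur] * transition[cur][nxt] for cur in states)
        for nxt in states
    }

# Hypothetical transition probabilities for a visitor on a shopping site.
# Each row sums to 1; "exit" is an absorbing state.
transition = {
    "browsing": {"browsing": 0.6, "cart": 0.3, "exit": 0.1},
    "cart":     {"browsing": 0.2, "cart": 0.5, "exit": 0.3},
    "exit":     {"browsing": 0.0, "cart": 0.0, "exit": 1.0},
}

# Every visitor starts in the "browsing" state.
dist = {"browsing": 1.0, "cart": 0.0, "exit": 0.0}
for _ in range(2):
    dist = step(dist, transition)

# Probability of being in each state after two page transitions.
print({state: round(p, 2) for state, p in dist.items()})
```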
Prescriptive Analytics
• Used for choosing optimal actions
• Solved as a separate optimization problem
• Assists users in finding the optimal solution to a
problem or in making the right choice/decision among
several alternatives
• Ensures optimal actions
• Techniques used
• Operations Research
• Machine learning algorithms
• Metaheuristics
• Advanced statistical models
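Choosing an optimal action among several alternatives can be illustrated with a tiny product-mix problem solved by exhaustive search. The profits, resource usage, and capacities below are invented, and brute-force enumeration is used only because the problem is tiny; practical problems of this kind are handed to an operations research solver.

```python
from itertools import product

def best_plan(profit, usage, capacity, max_units):
    """Enumerate integer production plans and return the most profitable feasible one."""
    best, best_profit = None, float("-inf")
    for plan in product(range(max_units + 1), repeat=len(profit)):
        # A plan is feasible if it stays within the capacity of every resource.
        feasible = all(
            sum(u * q for u, q in zip(row, plan)) <= cap
            for row, cap in zip(usage, capacity)
        )
        if not feasible:
            continue
        value = sum(p * q for p, q in zip(profit, plan))
        if value > best_profit:
            best, best_profit = plan, value
    return best, best_profit

# Hypothetical problem: two products, two limited resources.
plan, value = best_plan(
    profit=[40, 30],           # profit per unit of product A and product B
    usage=[[2, 1], [1, 2]],    # machine hours and labour hours used per unit
    capacity=[8, 8],           # available machine hours and labour hours
    max_units=8,
)
print(plan, value)
```

The search space grows exponentially with the number of products, which is why dedicated optimization techniques are listed among the prescriptive tools.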
Figure: Link between different analytics capabilities.
Figure: Analytics capability versus
Prescriptive Analytics Techniques
Technique: Key Application Areas
• Multi-Criteria Decision Making (MCDM): Choosing optimal decisions with multiple conflicting objectives (e.g., product design, vendor selection).
• Goal Programming: Optimizing decisions by satisfying several goals simultaneously.
• Nonlinear Programming: Solving complex real-world optimization problems with nonlinear relationships.
• Six Sigma: Improving process quality using DMAIC framework (Define, Measure, Analyze, Improve, Control).
• Social Media Analytics: Understanding trends, customer sentiment, and campaign performance using unstructured data.
Case Study
• Healthcare – Reducing Hospital Readmission Rates
• Descriptive Analytics
• Goal: Understand patterns in patient readmissions.
• Data Used: Electronic health records (EHRs), discharge summaries,
patient history.
• Insights: Found that 20% of patients with heart failure are readmitted
within 30 days
• Predictive Analytics
• Goal: Predict which patients are at risk of readmission.
• Model: Logistic regression using age, past hospitalizations, medication
adherence.
• Result: 85% accuracy in identifying high-risk patients.
• Prescriptive Analytics
• Goal: Recommend personalized post-discharge plans.
• Action: Assign high-risk patients to follow-up care and remote monitoring.
• Outcome: Readmissions reduced by 18% over 6 months
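The predictive and prescriptive stages of this case study can be sketched together: a logistic model scores each patient, and a simple rule assigns follow-up care to those above a risk threshold. The coefficients, patient records, and the 0.5 threshold are all hypothetical; a real model would be fitted to EHR data, and the accuracy printed here says nothing about the 85% figure reported above.

```python
from math import exp

# Hypothetical coefficients, NOT fitted to any real data.
COEF = {"intercept": -4.0, "age": 0.03, "past_admissions": 0.8, "non_adherent": 1.2}

def readmission_risk(age, past_admissions, non_adherent):
    """Logistic model: probability of readmission within 30 days."""
    z = (
        COEF["intercept"]
        + COEF["age"] * age
        + COEF["past_admissions"] * past_admissions
        + COEF["non_adherent"] * non_adherent
    )
    return 1 / (1 + exp(-z))

# Invented patient records with the observed outcome for evaluation.
patients = [
    {"id": "P1", "age": 80, "past_admissions": 3, "non_adherent": 1, "readmitted": True},
    {"id": "P2", "age": 45, "past_admissions": 0, "non_adherent": 0, "readmitted": False},
    {"id": "P3", "age": 70, "past_admissions": 1, "non_adherent": 0, "readmitted": True},
    {"id": "P4", "age": 65, "past_admissions": 2, "non_adherent": 1, "readmitted": True},
]

THRESHOLD = 0.5  # assumed cut-off for calling a patient high-risk

# Predictive step: score every patient and compare with the observed outcome.
correct = 0
follow_up = []
for p in patients:
    risk = readmission_risk(p["age"], p["past_admissions"], p["non_adherent"])
    high_risk = risk >= THRESHOLD
    correct += high_risk == p["readmitted"]
    # Prescriptive step: high-risk patients get follow-up care and remote monitoring.
    if high_risk:
        follow_up.append(p["id"])

accuracy = correct / len(patients)
print(follow_up, accuracy)
```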
Tutorial
• Example 1:
• You are part of an academic analytics team at a university
aiming to improve overall student performance and reduce
failure rates. Describe how you would apply descriptive,
predictive, and prescriptive analytics to:
• Understand existing patterns in student performance
• Identify students at risk of underperforming or dropping out
• Recommend data-driven interventions to support those students
• In your response, clearly explain:
• What types of data you would collect
• How you would analyze the data at each stage (descriptive,
predictive, prescriptive)
• Expected outcomes or actions based on your analysis
Tutorial
• Example 2
• You are a data analyst for a ride-sharing company
looking to reduce passenger wait times and improve
route efficiency.
• Explain how you would apply descriptive, predictive,
and prescriptive analytics to:
• Analyze existing ride and traffic patterns
• Predict future demand in specific geographic areas
• Optimize driver allocation and routing for better service
Tutorial

• Describe each step in the process of making sense of
data with a suitable case study.
• How does data collected from IoT devices differ from
data obtained through surveys?
• What are the advantages and disadvantages of using
publicly available datasets?
Tutorial
• What types of questions can descriptive analytics
answer? Give at least two examples.
• Given a sales dataset, what descriptive statistics would
you calculate and why?
Tutorial
• Explain how historical data is used in predictive
analytics.
• Discuss a real-life application where predictive analytics
has had a significant impact, in a domain such as healthcare
or banking.
Tutorial
• What makes prescriptive analytics different from
descriptive and predictive analytics?
• Give an example where prescriptive analytics is used for
decision optimization.
• What are some tools and techniques used in
prescriptive analytics?
