Generative AI for Data Scientists
Data Science
Now integrated with
Generative AI
Table of Contents
The Era Of Generative AI
About upGrad
Why upGrad?
Program Highlights
Industry Projects
Learning Path
Master’s Curriculum
Career Support
Why upGrad?
433%
Highest Hike
300+
Hiring Partners
50%
Avg Salary Hike
700+
Industry Experts
10 Million+
Learners
Program Highlights
Dual Accreditation and Alumni Status
Get certified by IIITB and LJMU, UK and
gain dual alumni status on successful
completion of the program along
with access to LJMU’s digital library.
5 Specialisations
Choose from 5 specialisations on the basis of your
background and career aspirations
and get the learning you want.
Dr. Debabrata Das is Director of IIITB. He has received his PhD from IIT-KGP. His main areas of research are IoT and Wireless Access Network.
Prof. Chandrashekar has a PhD from Mississippi State University and experience of over 10 years in several multinational organisations.
A gold medallist from IIM Bangalore, an alumnus of IIT Madras and London Business School, Anand is among the top 10 data scientists in India with 20 years of experience.
Prof Tricha has a Ph.D from Georgia Tech as well as an integrated M.Tech. from IIT Bombay. Her research interests include computer networks.
An M. Tech graduate and PhD from Jersey Institute of Technology, Behzad possesses tremendous years of experience in Data Science and ML.
He has a PhD (Dual) from Penn State University as well as a BTech Degree from IIT Bombay.
Prof. G. Srinivasaraghavan
Professor, IIITB
Prof. Srinivasaraghavan has a PhD in Computer Science from IIT-K and 18 years of experience with Infosys and several other MNCs.
Mirza Rahim Baig
Ex-Lead Analyst, Flipkart
Mirza is a veteran professional with 10+ years of experience in applications of data science, machine learning in e-commerce and healthcare.
Sajan Kedia
Ex-Data Science Lead, Myntra
Sajan graduated from IIT, BHU and has tons of experience in Data Science, Big Data, Spark, Machine Learning and Natural Language Processing.
Rajesh has 10+ years of experience leading Data Science teams in various domains solving complex problems using Deep Learning & ML techniques.
A Senior Member of the IEEE and a Chartered IT Professional. He is a fellow of the UK Higher Education Academy.
Bijoy comes with a deep understanding of the private and cloud architectures and has helped numerous companies make the transition.
A Senior Faculty of Engineering and Technology at LJMU who has multiple publications in the healthcare domain.
Studied Mathematical Physics at LU and was the chairman of Industrial Mathematics at LJMU in 1996 and Head of Graduate School in 2002.
A Senior Lecturer in Statistics and Data Science at the Department of Applied Mathematics at LJMU. Her research focus is Advanced Statistics for Decision Support.
Expert Feedback
• Personalised expert feedback on
assignments and projects
• Regular live sessions by experts to
clarify concept-related doubts
Q&A Forum
• Timely doubt resolution by industry
experts and peers
• 100% expert-verified responses to
ensure quality learning
New Additions
• IMDb Movie Analysis
• Uber Supply-Demand Gap
• Lead Scoring
• Fraud Detection
• Telecom Churn
• Interactive Market Campaign Analysis
• Retail Giant Sales Forecasting
• And many more!
Learning Path
Preparatory Course
0 week
Data Toolkit
12 weeks
Machine Learning
10 weeks
MSc - LJMU (Natural Language Processing)
MSc - LJMU (Deep Learning)
MSc - LJMU (Business Analytics)
MSc - LJMU (Business Intelligence/Data Analytics)
MSc - LJMU (Data Engineering)
Executive PG Programme
in Data Science
COMMON CURRICULUM
PRE-PROGRAM PREPARATORY CONTENT
1. DATA ANALYSIS IN EXCEL
2. CRISP-DM FRAMEWORK
- DATA PREPARATION,
MODELLING, EVALUATION
AND DEPLOYMENT
2. BASICS OF PYTHON
3. DATA STRUCTURES IN
PYTHON
5. OOP IN PYTHON
*The Curriculum is subject to change as per the inputs from university or industry experts
2. PROGRAMMING IN PYTHON
Learn how to approach and solve logical problems using programming.
1 WEEK
1. LOGIC AND SYNTAX BUILDING
3. TIME COMPLEXITY
5. TWO POINTERS
6. RECURSION
3. INTRODUCTION TO PANDAS
5. EXPLORATORY DATA ANALYSIS
3. FINAL SUBMISSION
4. SOLUTION
7. INFERENTIAL STATISTICS
3. CONTINUOUS PROBABILITY
DISTRIBUTIONS
8. HYPOTHESIS TESTING
2. CONCEPTS OF HYPOTHESIS
TESTING - II: P-VALUE METHOD
AND TYPES OF ERRORS
3. INDUSTRY DEMONSTRATION
OF HYPOTHESIS TESTING:
TWO-SAMPLE MEAN AND
PROPORTION TEST, A/B
TESTING
9. DATA ANALYSIS USING SQL
4. PROBLEM-SOLVING USING
SQL
3. FINAL SUBMISSION
4. SOLUTION
Venture into the machine learning community by learning how one variable can be predicted using several other variables through a housing dataset where you will predict the prices of houses based on various factors.
2 WEEKS
1. SIMPLE LINEAR REGRESSION
2. SIMPLE LINEAR REGRESSION IN PYTHON
3. MULTIPLE LINEAR REGRESSION
4. MULTIPLE LINEAR REGRESSION IN PYTHON
5. INDUSTRY RELEVANCE OF LINEAR REGRESSION
2. LINEAR REGRESSION ASSIGNMENT
3. LOGISTIC REGRESSION
3. LOGISTIC REGRESSION:
INDUSTRY APPLICATIONS
3. HYPERPARAMETER TUNING
IN DECISION TREES
6. BASICS OF NLP AND TEXT MINING
4. SOLUTION
SPECIALISATION: DEEP LEARNING
COURSE 3 - MACHINE LEARNING II
1. BAGGING & RANDOM FOREST
3. FEATURE IMPORTANCE IN
RANDOM FORESTS
4. RANDOM FORESTS IN
PYTHON
2. BOOSTING
Learn the pros and cons of simple and complex models and the different methods for quantifying model complexity, along with general machine learning techniques like feature engineering, model evaluation, and many more.
1 WEEK
1. PRINCIPLES OF MODEL SELECTION
2. MODEL EVALUATION
3. MODEL SELECTION: BEST PRACTICES
5. ADVANCED REGRESSION
3. END-TO-END ANALYSIS OF
TIME SERIES
4. SOLUTION
3. BACKPROPAGATION IN
NEURAL NETWORKS
4. MODIFICATIONS TO NEURAL
NETWORKS
5. HYPERPARAMETER TUNING
IN NEURAL NETWORKS
2. CONVOLUTIONAL NEURAL NETWORKS
2. INDUSTRY DEMONSTRATION:
USING CNNS WITH X-RAY
IMAGES
3. ONE-SHOT DETECTORS
5. SEMANTIC SEGMENTATION
5. RECURRENT NEURAL NETWORKS (OPTIONAL)
Ever wondered what goes behind machine translation, sentiment analysis, and speech recognition? Learn how RNN helps in areas having sequential data like text, speech, videos, and a lot more.
1 WEEK
1. WHAT MAKES A NEURAL NETWORK RECURRENT
2. VARIANTS OF RNNS: BIDIRECTIONAL RNNS AND LSTMS
6. GESTURE RECOGNITION
3. STARTER CODE
WALKTHROUGH
COURSE 5 - GENERATIVE AI
COURSE 6 - CAPSTONE PROJECT
CAPSTONE PROJECT
5. FINAL SUBMISSION
6. SOLUTION
3. FEATURE IMPORTANCE IN
RANDOM FORESTS
4. RANDOM FORESTS IN
PYTHON
2. BOOSTING
3. MODEL SELECTION & GENERAL ML TECHNIQUES
Learn the pros and cons of simple and complex models and the different methods for quantifying model complexity, along with general machine learning techniques like feature engineering, model evaluation, and many more.
1 WEEK
1. PRINCIPLES OF MODEL SELECTION
2. MODEL EVALUATION
3. MODEL SELECTION: BEST PRACTICES
5. ADVANCED REGRESSION
3. END-TO-END ANALYSIS OF
TIME SERIES
7. ADVANCED ML CASE STUDY
4. SOLUTION
3. UNDERSTANDING
TENSORFLOW
2. SYNTACTIC PROCESSING
3. INFORMATION EXTRACTION
4. CONDITIONAL RANDOM
FIELDS
3. SYNTACTIC PROCESSING
4. SOLUTION
4. SEMANTIC PROCESSING
4. TOPIC MODELLING
5. APPLIED DL IN NLP
3. FINAL SUBMISSION
4. SOLUTION
COURSE 5 - GENERATIVE AI
COURSE 6 - CAPSTONE PROJECT
1. CAPSTONE PROJECT
3. FEATURE IMPORTANCE IN
RANDOM FORESTS
4. RANDOM FORESTS IN
PYTHON
Learn the pros and cons of simple and complex models and the different methods for quantifying model complexity, along with general machine learning techniques like feature engineering, model evaluation, and many more.
2 WEEKS
1. PRINCIPLES OF MODEL SELECTION
2. MODEL BUILDING AND EVALUATION
3. FEATURE ENGINEERING
4. CLASS IMBALANCE
3. TIME SERIES FORECASTING
2. SMOOTHING TECHNIQUES
3. INTRODUCTION TO AR
MODELS
4. BUILDING AR MODELS
2. ADVANCED EXCEL
3. DATA TRANSFORMATIONS
USING POWERBI
2. INTERVIEWING AND
FRAMEWORKS - I: 5W AND
5WHYS
3. INTERVIEWING AND
FRAMEWORKS - II: SPIN
4. INDUSTRY DEMONSTRATIONS
ON FRAMEWORKS
5. UNDERSTANDING BUSINESS
MODEL CANVAS AND ISSUE
TREE FRAMEWORK
6. INDUSTRY DEMONSTRATIONS
ON ISSUE TREE FRAMEWORK
7. SPECIALISED FRAMEWORKS
FOR BUSINESS PROBLEMS:
7PS, 5CS, ETC.
5. DATA STORYTELLING
COURSE 5 - GENERATIVE AI
COURSE 6 - CAPSTONE PROJECT
1. CAPSTONE PROJECT
3. PROBLEM STATEMENT
4. EVALUATION RUBRIC
5. MID SUBMISSION
6. FINAL SUBMISSION
7. SOLUTION
In this module, you will learn and use data modelling on a dataset to solve a business problem.
1 WEEK
1. DATABASE DESIGN RECAP
2. BUILDING BLOCKS OF DATA MODELLING
3. INTRODUCTION TO BIG DATA AND CLOUD
Understand the basics of big data and cloud and learn to work with an EMR cluster on a cloud-based service.
1 WEEK
1. BIG DATA AND CLOUD COMPUTING
2. AMAZON WEB SERVICES
4. SOLUTION
2. ADVANCED EXCEL
3. DATA TRANSFORMATIONS
USING POWERBI
2. INTERVIEWING AND
FRAMEWORKS - I: 5W AND
5WHYS
3. INTERVIEWING AND
FRAMEWORKS - II: SPIN
4. INDUSTRY DEMONSTRATIONS
ON FRAMEWORKS
5. UNDERSTANDING BUSINESS
MODEL CANVAS AND ISSUE
TREE FRAMEWORK
6. INDUSTRY DEMONSTRATIONS
ON ISSUE TREE FRAMEWORK
7. SPECIALISED FRAMEWORKS
FOR BUSINESS PROBLEMS:
7PS, 5CS, ETC.
5. DATA STORYTELLING
4. TREES
2. SEARCHING AND SORTING
3. TWO POINTERS
3. RECURSION
COURSE 6 - CAPSTONE PROJECT
1. CAPSTONE PROJECT
2. PROBLEM STATEMENT
3. EVALUATION RUBRIC
4. MID SUBMISSION
5. FINAL SUBMISSION
6. SOLUTION
In this module, you will learn what big data is, its various characteristics, and its determining factors. You will also get an idea of the various sources of big data and the wide range of big data applications in different industries such as retail, healthcare, and finance.
0 WEEK
1. 4VS OF BIG DATA
2. BIG DATA: INDUSTRY CASE STUDIES
3. INTRODUCTION TO CLOUD AND AWS SETUP
3. MAPREDUCE PROGRAMMING
IN PYTHON
5. ASSIGNMENT (OPTIONAL)
Solve an assignment to brush up on the skills learnt so far.
0 WEEK
1. INTRODUCTION, PROBLEM STATEMENT AND GRADING RUBRICS
2. INTRODUCTION TO APACHE
HBASE
4. COMPARISON OF NOSQL
DATABASES
7. DATA WAREHOUSING (OPTIONAL)
2. DESIGNING DATA
WAREHOUSING FOR AN ETL
DATA PIPELINE
3. UNSTRUCTURED DATA
INGESTION WITH FLUME
2. SOLUTION
3. PARTITIONING AND
BUCKETING WITH HIVE
2. ASSIGNMENT (OPTIONAL)
3. AMAZON REDSHIFT
Learn to deploy a Redshift cluster and use it for querying data.
1 WEEK
1. DATA WAREHOUSING WITH REDSHIFT
Do a deep dive into AWS Cloud.
0 WEEK
1. THE AWS CLOUD PLATFORM
2. BUILDING AND DEPLOYING
VIRTUAL MACHINES
4. APPLICATION DEPLOYMENT
5. CLOUD ADMINISTRATION
AND SECURITY
7. CLOUD AUTOMATION
COURSE 5 - DATA ENGINEERING - III
2. APACHE FLINK (OPTIONAL)
4. SQL API
2. FUNDAMENTALS OF APACHE
KAFKA
3. SETTING UP KAFKA
PRODUCER AND CONSUMER
4. REAL-TIME DATA PROCESSING USING SPARK STREAMING
4. COMPARISON BETWEEN
SPARK STREAMING AND
FLINK
5. ASSIGNMENT (OPTIONAL)
Solve an assignment to brush up on the skills learnt so far.
0 WEEK
1. INTRODUCTION, PROBLEM STATEMENT AND GRADING RUBRICS
3. AUTOMATING AN ENTIRE
DATA PIPELINE WITH
AIRFLOW
COURSE 6 - CAPSTONE PROJECT
CAPSTONE PROJECT
The capstone project will stitch all the components of data engineering together.
4 WEEKS
1. AN OVERVIEW OF THE DOMAIN AND ASSOCIATED CONCEPTS
2. PROBLEM STATEMENT
3. EVALUATION RUBRIC
4. MID SUBMISSION
5. FINAL SUBMISSION
6. SOLUTION
RESEARCH DESIGN
DEVELOP AN UNDERSTANDING OF VARIOUS RESEARCH DESIGNS
• Types of research methods and pyramid of evidence
1. Study of existing researches and links between them
2. Applied and incremental
3. Discover
• Applied vs Fundamental, Quantitative vs Qualitative, Bayesian vs Frequentist, Hypothesis-driven research vs Exploratory research
• Sample Size and Power, Precision vs accuracy trade-off, p-value vs confidence intervals using a case study
LITERATURE REVIEWING
LEARN HOW TO READ AND CRITIQUE A PAPER, AND HOW TO CITE A PAPER
• Intro to lit review process, what is a lit review, benefits of lit review, literature review process (read, analyse and cite)
• How to read and critique a paper
• Types of sources that could be cited during research, the importance of citations and how to cite
• What makes a good reference, How to use reference management software, Related scientific ethics
REPORT WRITING AND PRESENTATION SKILLS
Disclaimer: Program curriculum is subject to change basis inputs from the institute and experts. Please refer to the website for updated
details, or speak to our Admission Counsellors.
Research of our learners: A Glimpse
1 Thesis Topic
Build a prediction model to accurately detect
and classify peripheral neuropathy.
Abstract
Background:
Damage to peripheral nerves causes Peripheral neuropathy (PN). Patients complain of pain, numbness and loss
of balance. If not identified early and treated adequately, PN could progress rapidly and lead to fatal complications.
A neurologist needs to determine the type of PN to provide differential treatment to the patient. However,
defining factors to classify PN accurately has remained challenging. This research proposes a model to detect
and classify PN into axonal, demyelinating, mixed and normal types from clinical and nerve conduction study
(NCS) data using the Random Forest algorithm.
Results:
The Random Forest model was able to predict and classify PN with an accuracy of 96%. In axonal cases, sensory
and motor nerves showed a drop in amplitudes of greater than 40% compared to normal patients. Reduced
amplitude (>40%) in motor nerves of lower limbs and missing values (>90%) in sensory nerves of lower limbs
identified axonal PN. Delayed onset latency (>40%) in motor nerves of upper limbs, decreased conduction
velocity (>60%) in sensory nerves of upper limbs and increased onset latency (>40%) in F-waves of upper limbs
delineated the demyelinating type. Median ages of patients were mixed (65), demyelinating (51) and axonal
(61). Axonal (18.75%) was significant in diabetic patients and demyelinating (14.8%) in non-diabetic patients. Both
axonal and mixed (16.78%) types were greater in hypertensive patients, and demyelinating (17.11%) type was
higher in patients without hypertension. Reflex was depressed more in mixed (17.49%) than axonal (15.51%) and
demyelinating (11.89%). Mixed (37.06%) type showed more insensitivity to pin-prick than axonal (29.37%) and
demyelinating (24.48%) types. Mixed (45%) patients tested positive for Romberg’s test more than axonal (31%)
and demyelinating (21%). Mixed (34.65%) patients complained of numbness more than axonal (23.62%) and
demyelinating (26.77%) types.
Conclusion:
Random forest algorithm identified and classified PN well using clinical and NCS features. Clinical features (age,
diabetes, hypertension, reflex, Romberg’s test, numbness and perception to pin-prick) were useful in detecting
PN. Nerve conduction study features (amplitude, onset latency, conduction velocity, F-wave response and
missing sensory values) were instrumental in classifying PN. Reduced amplitudes of sensory and motor nerves
identified the axonal condition. Delayed onset latency and low conduction velocities along with missing and
delayed F-wave responses identified the demyelinating type.
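The ensemble idea behind the classifier described above can be sketched in a few lines. This is not the thesis author's model: it is a deliberately simplified stand-in for Random Forest that bags one-split decision stumps, each trained on a bootstrap sample and a randomly chosen feature, and combines them by majority vote. The function names, the two features and every data value are hypothetical and chosen only to echo the pattern in the abstract (reduced amplitude for axonal, delayed latency for demyelinating).

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """Find the threshold on one feature that misclassifies the fewest rows."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(lab != left_label for lab in left)
        errors += sum(lab != right_label for lab in right)
        if best is None or errors < best[0]:
            best = (errors, feature, threshold, left_label, right_label)
    return best[1:]  # (feature, threshold, left_label, right_label)


def fit_bagged_stumps(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and a randomly chosen feature."""
    rng = random.Random(seed)
    n_rows, n_features = len(rows), len(rows[0])
    stumps = []
    for _ in range(n_trees):
        sample = rng.choices(range(n_rows), k=n_rows)  # sampling with replacement
        feature = rng.randrange(n_features)
        stumps.append(fit_stump([rows[i] for i in sample],
                                [labels[i] for i in sample], feature))
    return stumps


def predict(stumps, row):
    """Return the majority vote over all stumps."""
    votes = Counter(left if row[feature] <= threshold else right
                    for feature, threshold, left, right in stumps)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (amplitude relative to normal, onset latency in ms).
rows = [(0.20, 3.0), (0.30, 3.2), (0.25, 2.9), (0.90, 6.0), (1.00, 6.4), (0.95, 5.8)]
labels = ["axonal", "axonal", "axonal",
          "demyelinating", "demyelinating", "demyelinating"]
forest = fit_bagged_stumps(rows, labels)
print(predict(forest, (0.10, 2.5)), predict(forest, (1.20, 7.0)))
```

A full Random Forest grows deeper trees and samples a feature subset at every split; the sketch keeps only the bootstrap-and-vote structure.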
2 Thesis Topic
Automatic network coding of traffic junctions using
machine learning.
Abstract
Before any traffic simulation can be performed, the network of roads and junctions is modeled. Assigning
attributes to the roadway network, such as the road length and width, the junction type, number of arms, and
lanes, is a crucial task while building the network. This research is an attempt to develop an efficient traffic junction
classifier using machine learning and deep learning algorithms on satellite images. Three junction categories,
Priority, Roundabout, and Signal, are considered for analysis. As this is a novel research idea, the required
image dataset of junctions is created using the Google Maps API. By using robotic process automation, the
downloading of the images is automated. Two approaches are taken to build the classifiers: a machine-learning
approach and a deep-learning approach. The machine learning approach is split into two phases: the feature
extraction phase and the classification phase. In the feature extraction phase, Histogram of Oriented Gradients
(HOG) descriptors are used to extract features from the images. Furthermore, in the classification phase, several
classification algorithms are applied to the HOG features to build classifiers. In the deep-learning approach,
taking advantage of powerful pre-trained models and transfer learning, a Convolutional Neural Network (CNN)
is developed for classifying the junctions. The models are evaluated, and in the end, a comparison between the
various classification models is performed. The results showed that the CNN classifier modeled had the best
accuracy and AUC compared to the other models with scores of 0.81 and 0.94 respectively. Among the machine
learning models that were trained on the HOG features, the Extreme Gradient Boosting model has the best
accuracy of 0.62. The ultimate aim of this work is to use this junction-classifier model on real projects to aid the
process of finding the type of junctions and reduce the effort and time required to model the roadway networks.
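To make the feature extraction phase more concrete, here is a minimal sketch of the building block of a HOG descriptor: a histogram of gradient orientations, weighted by gradient magnitude, for a single image cell. It is an illustration under simplifying assumptions, not the thesis code: it handles one cell only, skips block normalisation across cells, and the function name, the 9-bin default and the tiny test images are all hypothetical.

```python
import math


def orientation_histogram(image, n_bins=9):
    """Unsigned gradient-orientation histogram for one cell of a grayscale image.

    `image` is a list of rows of pixel intensities. Border pixels are skipped
    because the central-difference gradient needs both neighbours.
    """
    height, width = len(image), len(image[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = image[y][x + 1] - image[y][x - 1]  # horizontal gradient
            gy = image[y + 1][x] - image[y - 1][x]  # vertical gradient
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Fold the direction into [0, 180) so opposite gradients share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % n_bins] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))  # L2 normalisation
    return [v / norm for v in hist] if norm else hist


# Hypothetical 4x4 cells: one with a vertical edge, one with a horizontal edge.
vertical_edge = [[0, 0, 1, 1] for _ in range(4)]
horizontal_edge = [[0] * 4, [0] * 4, [1] * 4, [1] * 4]
hist_v = orientation_histogram(vertical_edge)
hist_h = orientation_histogram(horizontal_edge)
print(hist_v.index(max(hist_v)), hist_h.index(max(hist_h)))
```

The two edges land in different bins, which is the kind of signal a downstream classifier can use to tell junction shapes apart. A complete HOG pipeline would concatenate such histograms over a grid of cells.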
Meet the Class
INDUSTRIES OUR STUDENTS COME FROM
5% Healthcare
5% E-Commerce
1% Telecom
57% IT
1% Finance
15% Other
1% Consulting
1% Education
3% Retail
1% Manufacturing
10% BFSI
upGrad Elevate
• Recruitment Drive to connect you with the best talent admirers in the industry
• Get access to a wide range of opportunities and find the perfect job
• Apply your learnings to real industry problems
Just-In-Time Interview Prep (JIT)
For upcoming job interviews JITs are conducted within 48 hours for eligible programs.
• Tailored to the job role and target domain
• Real-time feedback and tips for improvement
Disclaimer: Career services are subject to change. Please refer to the website or speak to our Admission Counsellor for updated details.
Experience upGrad
Offline
UPGRAD BASECAMPS
Held across all major cities in India, upGrad basecamps
bring together learners, faculty and industry experts
for a power-packed day of activities, career-building
sessions and live group projects. Get to know your
peers and faculty and hone your networking skills
in an exciting environment.
CAREER FAIRS
Attend regular hiring drives in major cities across
India, giving you the opportunity to interview with
upGrad’s 300+ hiring partners ensuring you get every
opportunity you deserve.
HACKATHONS
Team up and put your learning to use with our offline
Hackathons: designed to help you apply concepts
and meet, network, and grow!
Hear from
Our Learners
Sachin Aggarwal, Experience: 18+ Years
“Learning with IIITB and upGrad has been an experience like no other. Being enrolled
on an online program, you have your worries about how the program and teaching
methods will be. My favourite part about the learning experience has been the
well-designed and thoughtful content shared by IIITB professors and industry experts
on upGrad platforms. Kudos to upGrad!”
SELECTION PROCESS
STEP 1: Selection Test
Fill out an application and take a short 17-minute online test with 11 questions.
STEP 2: Review and Shortlisting of Suitable Candidates
Our faculty will review all applications, considering the educational and professional background of an applicant and review the test scores where applicable. Following this, Offer Letters will be rolled out so you are assured of a great peer group to learn and network with.
STEP 3: Enrollment for Access to Prep Content
Make a quick block payment with assistance from our loan partners where required, receive immediate access to the prepped content and begin your upGrad journey.
COMPANY INFORMATION
upGrad Education Private Limited,
Nishuvi, 75, Dr. Annie Besant Road,
Worli, Mumbai – 400018.