Generative AI for Data Scientists

Uploaded by

venkystaring

Master of Science in Data Science
Now integrated with Generative AI

Table of Contents
The Era Of Generative AI

About upGrad

Why upGrad?

Program Highlights

Faculty and Industry Experts

upGrad Learning Experience

Industry Projects

Learning Path

Master’s Curriculum

Meet the Class

Career Support

Experience upGrad Offline

Hear from Our Learners

Program Details and Admission Process


The Era Of Generative AI

Usually, this first page is reserved for “About upGrad”. But the world is at the cusp of Generative AI rapidly changing the world as we know it. At upGrad, we’ve always believed in imparting learners with the skills necessary to thrive in the fast-evolving world of technology. We are hence quite thrilled to pioneer Generative AI as an elective in the Master of Science in Data Science.

With this key inclusion of Generative AI, learners will delve deeper into the fascinating realm of using Data Science to build practical applications like conversational AI chat bots, image creators, and content recommenders amongst others, to solve real-world challenges. So dive into this brave new world of Generative AI and Large Language Models with us, and watch yourself transform into a 10x Data Scientist.

“IIIT Bangalore prides itself in constantly updating cutting-edge topics to its curriculum. Our faculty has shaped this exciting Generative AI elective along with upGrad’s industry experts, thus ensuring both academic rigour as well as incorporating the latest advancements in tech.”
Dr. V. Sridhar, Head-Faculty, IIITB

“As an organisation that asks professionals to stay updated with the latest skills, we had to be one of the first to teach Generative AI. With this move, we are excited to witness the impact that Generative AI will have on the future, as well as the value our learners will bring to the field with this essential skill.”
Mayank Kumar, Co-founder & MD, upGrad

About upGrad

upGrad has delivered over 20 million hours of learning, delivering programs by collaborating with universities across the world including Liverpool John Moores University, IIIT Bangalore and Deakin Business School among others.

Online education is a fundamental disruption that will have a far-reaching impact. upGrad was founded taking this into consideration. upGrad is an online education platform to help individuals develop their professional potential in the most engaging learning environment.

Since its inception, upGrad has delivered over 20 million hours of learning, delivering programs by collaborating with universities across the world, including LJMU, IIIT Bangalore and Deakin Business School among others. And it doesn’t end there.

The faculty has an average of 15+ years of experience and covers the conceptual depths of topics such as Data Science, Machine Learning and AI, and Big Data Analytics. These will be complemented by industry-relevant case studies from major industry verticals, delivered by industry leaders with 8+ years of experience from upGrad’s industry network.

Furthermore, our strong placement network, industry mentorship and the credibility of a Master’s Degree will provide you with just the right push to accelerate your career in Data Science!

Why upGrad?

• INR 1.23 CR Highest Salary
• 433% Highest Hike
• 300+ Hiring Partners
• 50% Avg Salary Hike
• 700+ Industry Experts
• 10 Million+ Learners

Program Highlights

Dual Accreditation and Alumni Status
Get certified by IIITB and LJMU, UK and gain dual alumni status on successful completion of the program along with access to LJMU’s digital library.

Programming Language & Tools
Learn 5+ Programming Languages and Tools like Python, Tableau, MySQL and more. Optional modules for further upskilling.
building, career fairs, industry mentors and much more.

For the Industry, by the Industry
Learn from 60+ case studies and industry experts who mentor you throughout the program.

5 Specialisations
Choose from 5 specialisations on the basis of your background and career aspirations and get the learning you want.

Live Classroom Session
Live Classroom hour with Dr Manoj Jayabalan, Post-Doctoral Fellow at LJMU, to solve queries related to dissertation.

Global Access to Jobs
With 360-degree career support and dual alumni status, gain global access to jobs.

Faculty and Industry Experts

Dr. Debabrata Das
Director, IIITB
Dr. Debabrata Das is Director of IIITB. He has received his PhD from IIT-KGP. His main areas of research are IoT and Wireless Access Network.

Chandrashekar Ramanathan
Dean Academics, IIITB
Prof. Chandrashekar has a PhD from Mississippi State University and experience of over 10 years in several multinational organisations.

S. Anand
CEO, Gramener
A gold medallist from IIM Bangalore, an alumnus of IIT Madras and London Business School, Anand is among the top 10 data scientists in India with 20 years of experience.

Tricha Anjali
Ex-Associate Dean, IIIT-B
Prof. Tricha has a PhD from Georgia Tech as well as an integrated M.Tech. from IIT Bombay. Her research interests include computer networks.

Behzad Ahmadi
Data Scientist, Walmart Labs
An M.Tech graduate and PhD from Jersey Institute of Technology, Behzad has extensive experience in Data Science and ML.

Anshuman Gupta
Director - Data Science, Pitney Bowes
He has a PhD (Dual) from Penn State University as well as a BTech Degree from IIT Bombay.

Prof. G. Srinivasaraghavan
Professor, IIITB
Prof. Srinivasaraghavan has a PhD in Computer Science from IIT-K and 18 years of experience with Infosys and several other MNCs.

Mirza Rahim Baig
Ex-Lead Analyst, Flipkart
Mirza is a veteran professional with 10+ years of experience in applications of data science and machine learning in e-commerce and healthcare.

Sajan Kedia
Ex-Data Science Lead, Myntra
Sajan graduated from IIT, BHU and has extensive experience in Data Science, Big Data, Spark, Machine Learning and Natural Language Processing.

Rajesh Sabapathy
Sr Director, Data Science, UHG Group
Rajesh has 10+ years of experience leading Data Science teams in various domains, solving complex problems using Deep Learning & ML techniques.

Prof. Dhiya Al-Jumeily
The Head and Professor - AI, LJMU
A Senior Member of the IEEE and a Chartered IT Professional. He is a fellow of the UK Higher Education Academy.

Bijoy Kumar Khandelwal
COO, Actify Data Labs
Bijoy comes with a deep understanding of private and cloud architectures and has helped numerous companies make the transition.

Ujjyaini Mitra
Head of Analytics, Zee5
An alumnus of McKinsey and Co, Flipkart and Bharti Airtel with over 11 years of experience.

Ankit Jain
ML Engineering Manager, Meta
An alumnus of IIT Bombay, UCB, and HBS with over 9 years of experience. Ankit has been recognised as a 40 Under 40 Data Scientist for 2022.

Dr. Atif Waraich
Faculty - Computer Science, LJMU
A Senior Faculty of Engineering and Technology at LJMU who has multiple publications in the healthcare domain.

Prof. Paulo Lisboa
Head of Dept - Applied Mathematics, LJMU - Retired
Studied Mathematical Physics at LU and was the chairman of Industrial Mathematics at LJMU in 1996 and Head of Graduate School in 2002.

Dr Gabriela Czanner
Faculty - Engineering and Technology, LJMU
A Senior Lecturer in Statistics and Data Science at the Department of Applied Mathematics at LJMU. Her research focus is Advanced Statistics for Decision Support.

Dr. Manoj Jayabalan
Faculty of Engineering and Technology, Liverpool John Moores University

Dr. Ahmed Kaky
Faculty of Engineering and Technology, Liverpool John Moores University

upGrad Learning Experience

Student Support Team
• We have a dedicated Student Support Team for handling your queries via email or call-back requests
• This support team is available 7 days a week, 24 hours a day

Industry Mentors
• Receive unparalleled guidance from industry mentors, teaching assistants and graders
• Receive one-on-one feedback on submissions and personalised feedback on improvement

Industry Networking
• Live sessions by experts on various industry topics
• One-on-one discussion and feedback sessions with industry mentors

upGrad BaseCamp
• Fun-packed, informative and career-building workshop sessions by industry professionals and professors
• Group activities with your peers and alumni

Expert Feedback
• Personalised expert feedback on assignments and projects
• Regular live sessions by experts to clarify concept-related doubts

Q&A Forum
• Timely doubt resolution by industry experts and peers
• 100% expert-verified responses to ensure quality learning

New Additions

Career Essential Soft-skills Program
• Excel in your personal & professional life with upGrad’s Soft Skills Program
• Study three fundamental skills - Interview & Job Search, Corporate & Business Communication and Problem Solving
• Get access to 40+ learner hours of soft skills content delivered by the best faculty & industry experts

30-Hour Programming Bootcamp for Non-tech Learners
• Non-tech background? No need to fear Programming anymore
• A 30-hour Python Programming bootcamp, focusing on developing Basic + Intermediate Python Programming Concepts to assist non-tech learners
• A blended learning experience delivered via interactive live sessions and assessments

Industry Projects

• IMDb Movie Analysis
• Uber Supply-Demand Gap
• Lead Scoring
• Fraud Detection
• Creditworthiness of Customers
• Speech Recognition
• Image Captioning
• Social Media Listening
• Telecom Churn
• Interactive Market Campaign Analysis
• Retail Giant Sales Forecasting
• And many more!

Learning Path

Preparatory Course: 0 week
Data Toolkit: 12 weeks
Machine Learning: 10 weeks
Choose any of the 5 Specialisations: 22 weeks (with 4 weeks of Capstone)

Specialisations and tools
• Natural Language Processing. Tools: ChatGPT, OpenAI API, Dall-E, Midjourney, Copilot, Flask
• Deep Learning. Tools: ChatGPT, OpenAI API, Dall-E, Midjourney, Copilot, Flask
• Business Analytics. Tools: ChatGPT, OpenAI API, Dall-E, Midjourney, Copilot, Flask
• Business Intelligence/ Data Analytics. Tools: Python, Power BI, Excel, MySQL, MongoDB, Shiny, Tableau
• Data Engineering. Tools: Hadoop, HBase, Sqoop, Hive, Flume, PySpark, Spark, Airflow

Executive PG Programme in Data Science
• Executive PG Programme in Data Science (Natural Language Processing)
• Executive PG Programme in Data Science (Deep Learning)
• Executive PG Programme in Data Science (Business Analytics)
• Executive PG Programme in Data Science (Business Intelligence/ Data Analytics)
• Executive PG Programme in Data Science (Data Engineering)

Research Methodology
Dissertation

MSc - LJMU
• MSc - LJMU (Natural Language Processing)
• MSc - LJMU (Deep Learning)
• MSc - LJMU (Business Analytics)
• MSc - LJMU (Business Intelligence/ Data Analytics)
• MSc - LJMU (Data Engineering)

Executive PG Programme in Data Science
COMMON CURRICULUM
PRE-PROGRAM PREPARATORY CONTENT

1. DATA ANALYSIS IN EXCEL
Taught by one of the most renowned data scientists in the country (S. Anand, CEO, Gramener), this module takes you from a beginner-level Excel user to an almost professional user.
   1. INTRODUCTION TO EXCEL
   2. DATA ANALYSIS IN EXCEL - I: FUNCTIONS, FORMULAE, AND CHARTS
   3. DATA ANALYSIS IN EXCEL - II: PIVOTS AND LOOKUPS

2. ANALYTICS PROBLEM SOLVING
This module covers concepts of the CRISP-DM framework for business problem-solving.
   1. THE CRISP-DM FRAMEWORK - BUSINESS AND DATA UNDERSTANDING
   2. CRISP-DM FRAMEWORK - DATA PREPARATION, MODELLING, EVALUATION AND DEPLOYMENT

COURSE 1: DATA TOOLKIT

1. INTRODUCTION TO PYTHON (2 WEEKS)
Build a foundation for the most in-demand programming language of the 21st century.
   1. UNDERSTANDING THE UPGRAD CODING CONSOLE
   2. BASICS OF PYTHON
   3. DATA STRUCTURES IN PYTHON
   4. CONTROL STRUCTURE AND FUNCTIONS IN PYTHON
   5. OOP IN PYTHON

*The Curriculum is subject to change as per the inputs from university or industry experts

2. PROGRAMMING IN PYTHON (1 WEEK)
Learn how to approach and solve logical problems using programming.
   1. LOGIC AND SYNTAX BUILDING
   2. DATA STRUCTURES: LISTS, STRINGS, DICTIONARIES, AND STACKS
   3. TIME COMPLEXITY
   4. SEARCHING AND SORTING
   5. TWO POINTERS
   6. RECURSION

3. PYTHON FOR DATA SCIENCE (1 WEEK)
Learn how to manipulate datasets in Python using Pandas, which is the most powerful library for data preparation and analysis.
   1. INTRODUCTION TO NUMPY
   2. INTRODUCTION TO MATPLOTLIB
   3. INTRODUCTION TO PANDAS
   4. GETTING AND CLEANING DATA

4. DATA VISUALISATION IN PYTHON (1 WEEK)
Humans are visual learners, and hence no task related to data is complete without visualisation. Learn to plot and interpret various graphs in Python and observe how they make data analysis and drawing insights easier.
   1. INTRODUCTION TO DATA VISUALISATION
   2. DATA VISUALISATION USING SEABORN

5. EXPLORATORY DATA ANALYSIS (1 WEEK)
Learn how to find and analyse the patterns in the data to draw actionable insights.
   1. DATA SOURCING
   2. DATA CLEANING
   3. UNIVARIATE ANALYSIS
   4. BIVARIATE ANALYSIS AND MULTIVARIATE ANALYSIS

6. CREDIT EDA CASE STUDY (1 WEEK)
Solve a real industry problem through the concepts learnt in exploratory data analysis.
   1. PROBLEM STATEMENT
   2. EVALUATION RUBRIC
   3. FINAL SUBMISSION
   4. SOLUTION

7. INFERENTIAL STATISTICS (1 WEEK)
Build a strong statistical foundation and learn how to ‘infer’ insights from a huge population using a small sample.
   1. BASICS OF PROBABILITY
   2. DISCRETE PROBABILITY DISTRIBUTIONS
   3. CONTINUOUS PROBABILITY DISTRIBUTIONS
   4. CENTRAL LIMIT THEOREM

8. HYPOTHESIS TESTING (1 WEEK)
Understand how to formulate and validate hypotheses for a population to solve real-life business problems.
   1. CONCEPTS OF HYPOTHESIS TESTING - I: NULL AND ALTERNATE HYPOTHESIS, MAKING A DECISION, AND CRITICAL VALUE METHOD
   2. CONCEPTS OF HYPOTHESIS TESTING - II: P-VALUE METHOD AND TYPES OF ERRORS
   3. INDUSTRY DEMONSTRATION OF HYPOTHESIS TESTING: TWO-SAMPLE MEAN AND PROPORTION TEST, A/B TESTING

9. DATA ANALYSIS USING SQL (1 WEEK)
Data in companies is definitely not stored in Excel sheets! Learn the fundamentals of databases and extract information from RDBMS using the Structured Query Language.
   1. DATABASE DESIGN
   2. DATABASE CREATION IN MYSQL WORKBENCH
   3. QUERYING IN MYSQL
   4. JOINS AND SET OPERATIONS

10. ADVANCED SQL & BEST PRACTICES (1 WEEK)
Apply advanced SQL concepts like windowing and procedures to derive insights from data and answer pertinent business questions.
   1. WINDOW FUNCTIONS
   2. CASE STATEMENTS, STORED ROUTINES AND CURSORS
   3. QUERY OPTIMISATION AND BEST PRACTICES
   4. PROBLEM-SOLVING USING SQL

11. SQL ASSIGNMENT: RSVP MOVIES (1 WEEK)
In this assignment, you will work on a movies dataset using SQL to extract exciting insights.
   1. PROBLEM STATEMENT
   2. EVALUATION RUBRIC
   3. FINAL SUBMISSION
   4. SOLUTION

COURSE 2 - MACHINE LEARNING I

1. LINEAR REGRESSION (2 WEEKS)
Venture into the machine learning community by learning how one variable can be predicted using several other variables through a housing dataset where you will predict the prices of houses based on various factors.
   1. SIMPLE LINEAR REGRESSION
   2. SIMPLE LINEAR REGRESSION IN PYTHON
   3. MULTIPLE LINEAR REGRESSION
   4. MULTIPLE LINEAR REGRESSION IN PYTHON
   5. INDUSTRY RELEVANCE OF LINEAR REGRESSION

2. LINEAR REGRESSION ASSIGNMENT (1 WEEK)
Build a model to understand the factors on which the demand for bike-sharing systems depends, and help a company optimise its revenue.
   1. PROBLEM STATEMENT
   2. EVALUATION RUBRIC
   3. FINAL SUBMISSION
   4. SOLUTION

3. LOGISTIC REGRESSION (2 WEEKS)
Learn your first binary classification technique by determining which telecom operator customers are likely to churn and which are not, to help the business retain customers.
   1. UNIVARIATE LOGISTIC REGRESSION
   2. MULTIVARIATE LOGISTIC REGRESSION: MODEL BUILDING AND EVALUATION
   3. LOGISTIC REGRESSION: INDUSTRY APPLICATIONS

4. CLASSIFICATION USING DECISION TREES (1 WEEK)
Learn how the human decision-making process can be replicated using a decision tree and tune it to suit your needs.
   1. INTRODUCTION TO DECISION TREES
   2. ALGORITHMS FOR DECISION TREE CONSTRUCTION
   3. HYPERPARAMETER TUNING IN DECISION TREES

5. UNSUPERVISED LEARNING: CLUSTERING (1 WEEK)
Learn how to group elements into different clusters when you don’t have any pre-defined labels to segregate them, through K-means clustering, hierarchical clustering, and more.
   1. INTRODUCTION TO CLUSTERING
   2. K-MEANS CLUSTERING
   3. HIERARCHICAL CLUSTERING
   4. OTHER FORMS OF CLUSTERING: K-MODE, K-PROTOTYPE, DBSCAN

6. BASICS OF NLP AND TEXT MINING
Do you get annoyed by the constant spam in your mailbox? Wouldn’t it be nice if we had a program to check your spelling? In this module, learn how to build a spell checker & spam detector using techniques like phonetic hashing, bag-of-words, TF-IDF, etc.
   1. REGEX AND INTRODUCTION TO NLP
   2. BASIC LEXICAL PROCESSING
   3. ADVANCED LEXICAL PROCESSING

5. BUSINESS PROBLEM SOLVING
Learn how to approach open-ended real-world problems using data as a lever to draw actionable insights.
   1. INTRODUCTION TO BUSINESS PROBLEM SOLVING
   2. BUSINESS PROBLEM SOLVING: CASE STUDY DEMONSTRATIONS

7. CASE STUDY: LEAD SCORING
Help the Sales team of your company identify which leads are worth pursuing through this classification case study.
   1. PROBLEM STATEMENT
   2. EVALUATION RUBRIC
   3. FINAL SUBMISSION
   4. SOLUTION

SPECIALISATION: DEEP LEARNING
COURSE 3 - MACHINE LEARNING II

1. BAGGING & RANDOM FOREST (1 WEEK)
Learn how powerful ensemble algorithms can improve your classification models by building random forests from decision trees.
   1. POPULAR ENSEMBLES
   2. INTRODUCTION TO RANDOM FORESTS
   3. FEATURE IMPORTANCE IN RANDOM FORESTS
   4. RANDOM FORESTS IN PYTHON

2. BOOSTING (1 WEEK)
Learn about ensemble modelling through bagging and boosting, and understand how weak algorithms can be transformed into stronger ones.
   1. INTRODUCTION TO BOOSTING AND ADABOOST
   2. GRADIENT BOOSTING

3. MODEL SELECTION & GENERAL ML TECHNIQUES (1 WEEK)
Learn the pros and cons of simple and complex models and the different methods for quantifying model complexity, along with general machine learning techniques like feature engineering, model evaluation, and many more.
   1. PRINCIPLES OF MODEL SELECTION
   2. MODEL EVALUATION
   3. MODEL SELECTION: BEST PRACTICES

4. PRINCIPAL COMPONENT ANALYSIS (1 WEEK)
Understand important concepts related to dimensionality reduction, the basic idea and the learning algorithm of PCA, and its practical applications on supervised and unsupervised problems.
   1. PRINCIPAL COMPONENT ANALYSIS AND SINGULAR VALUE DECOMPOSITION
   2. PRINCIPAL COMPONENT ANALYSIS IN PYTHON

5. ADVANCED REGRESSION (1 WEEK)
In this module, take a more advanced look at regression models and learn the concepts related to regularisation.
   1. GENERALISED LINEAR REGRESSION
   2. REGULARISED REGRESSION

6. TIME SERIES FORECASTING (OPTIONAL) (0 WEEK)
In this module, you will learn how to analyse and forecast a series that varies with time.
   1. INTRODUCTION TO TIME SERIES AND ITS COMPONENTS
   2. WORKING WITH STATIONARY TIME SERIES
   3. END-TO-END ANALYSIS OF TIME SERIES

7. ADVANCED ML CASE STUDY (1 WEEK)
Build a regularised regression model to understand the most important variables to predict house prices in Australia.
   1. PROBLEM STATEMENT
   2. EVALUATION RUBRIC
   3. FINAL SUBMISSION
   4. SOLUTION

COURSE 4 - ADVANCED MACHINE LEARNING AND DEEP LEARNING

1. INTRODUCTION TO NEURAL NETWORKS AND ANN (2 WEEKS)
Learn the most sophisticated and cutting-edge technique in machine learning - Artificial Neural Networks or ANNs.
   1. STRUCTURE OF NEURAL NETWORKS
   2. FEED FORWARD IN NEURAL NETWORKS
   3. BACKPROPAGATION IN NEURAL NETWORKS
   4. MODIFICATIONS TO NEURAL NETWORKS
   5. HYPERPARAMETER TUNING IN NEURAL NETWORKS

2. CONVOLUTIONAL NEURAL NETWORKS (1 WEEK)
Learn the basics of CNN and OpenCV and how to classify image data using various architectures which you will then implement using Python and Keras.
   1. INTRODUCTION TO CONVOLUTIONAL NEURAL NETWORKS
   2. BUILDING CNNS WITH PYTHON AND KERAS
   3. CNN ARCHITECTURES AND TRANSFER LEARNING
   4. STYLE TRANSFER AND OBJECT DETECTION

3. CONVOLUTIONAL NEURAL NETWORKS - INDUSTRY APPLICATIONS (1 WEEK)
Apply CNNs to Computer Vision tasks like detecting anomalies in chest X-Ray scans.
   1. INDUSTRY DEMONSTRATION: USING CNNS WITH FLOWERS IMAGES
   2. INDUSTRY DEMONSTRATION: USING CNNS WITH X-RAY IMAGES

4. OBJECT DETECTION & IMAGE SEGMENTATION (1 WEEK)
Learn the applications of DL in computer vision through industry-relevant detection algorithms such as RCNNs, YOLO and SSD.
   1. FUNDAMENTALS OF OBJECT DETECTION
   2. REGION-BASED DETECTORS
   3. ONE-SHOT DETECTORS
   4. CUSTOM OBJECT DETECTION
   5. SEMANTIC SEGMENTATION

5. RECURRENT NEURAL NETWORKS (OPTIONAL) (1 WEEK)
Ever wondered what goes on behind machine translation, sentiment analysis, and speech recognition? Learn how RNNs help in areas having sequential data like text, speech, videos, and a lot more.
   1. WHAT MAKES A NEURAL NETWORK RECURRENT
   2. VARIANTS OF RNNS: BIDIRECTIONAL RNNS AND LSTMS
   3. BUILDING RNNS IN PYTHON

6. GESTURE RECOGNITION (1 WEEK)
Make a Smart TV system which can control the TV with the user’s hand gestures as the remote control.
   1. TWO ARCHITECTURES: 3D CONVS AND CNN-RNN STACK
   2. UNDERSTANDING GENERATORS
   3. STARTER CODE WALKTHROUGH
   4. PROBLEM STATEMENT AND FINAL SUBMISSION

COURSE 5 - GENERATIVE AI

1. FUNDAMENTALS OF TRANSFORMERS ARCHITECTURE, GENERATIVE AI, CHATGPT & PROMPT ENGINEERING USING NON REASONING, CHAIN OF THOUGHT & ADVANCED TECHNIQUES (1 WEEK)

2. PRODUCT DEVELOPMENT USING OPENAI APIS, FINE TUNING USING STAR TECHNIQUE IN PYTHON (1 WEEK)

3. INTEGRATING SPEECH USING WHISPER API AND APPLICATION DEPLOYMENT USING FLASK (1 WEEK)

4. FUNDAMENTALS OF DESIGN, PHOTOGRAPHY, PRODUCT DEVELOPMENT USING STABLE DIFFUSION IN PYTHON & CREATE PIXXELCRAFT AI TO ENABLE FAST-TRACK DIGITISATION FOR OFFLINE E-COMMERCE BUSINESSES BY GENERATING HIGH-QUALITY IMAGES USING AI FOR A LARGE PRODUCT PORTFOLIO (1 WEEK)

5. APPLICATIONS OF LLMS IN DATA SCIENCE PROJECTS & AUTOMATING NEWS RECOMMENDATION USING GPT-3 AND COPILOT-POWERED MACHINE LEARNING APPLICATIONS OF LLMS (1 WEEK)

6. INTERVIEW GYNIE AI: CHATBOT DEVELOPMENT PROJECT (1 WEEK)

COURSE 6 - CAPSTONE PROJECT

CAPSTONE PROJECT (4 WEEKS)
Choose from a range of real-world industry-woven projects on advanced topics like Recommendation Systems, Fraud Detection, Emotion Detection from faces, Social Media Listening, and Speech Recognition among many others.
   1. AN OVERVIEW OF THE DOMAIN AND ASSOCIATED CONCEPTS
   2. PROBLEM STATEMENT
   3. EVALUATION RUBRIC
   4. MID SUBMISSION
   5. FINAL SUBMISSION
   6. SOLUTION

SPECIALISATION: NATURAL LANGUAGE PROCESSING
COURSE 3 - MACHINE LEARNING II

1. BAGGING & RANDOM FOREST (1 WEEK)
Learn how powerful ensemble algorithms can improve your classification models by building random forests from decision trees.
   1. POPULAR ENSEMBLES
   2. INTRODUCTION TO RANDOM FORESTS
   3. FEATURE IMPORTANCE IN RANDOM FORESTS
   4. RANDOM FORESTS IN PYTHON

2. BOOSTING (1 WEEK)
Learn about ensemble modelling through bagging and boosting, and understand how weak algorithms can be transformed into stronger ones.
   1. INTRODUCTION TO BOOSTING AND ADABOOST
   2. GRADIENT BOOSTING

3. MODEL SELECTION & GENERAL ML TECHNIQUES (1 WEEK)
Learn the pros and cons of simple and complex models and the different methods for quantifying model complexity, along with general machine learning techniques like feature engineering, model evaluation, and many more.
   1. PRINCIPLES OF MODEL SELECTION
   2. MODEL EVALUATION
   3. MODEL SELECTION: BEST PRACTICES

4. PRINCIPAL COMPONENT ANALYSIS (1 WEEK)
Understand important concepts related to dimensionality reduction, the basic idea and the learning algorithm of PCA, and its practical applications on supervised and unsupervised problems.
   1. PRINCIPAL COMPONENT ANALYSIS AND SINGULAR VALUE DECOMPOSITION
   2. PRINCIPAL COMPONENT ANALYSIS IN PYTHON

5. ADVANCED REGRESSION (1 WEEK)
In this module, take a more advanced look at regression models and learn the concepts related to regularisation.
   1. GENERALISED LINEAR REGRESSION
   2. REGULARISED REGRESSION

6. TIME SERIES ANALYSIS (OPTIONAL) (2 WEEKS)
In this module, you will learn how to analyse and forecast a series that varies with time.
   1. INTRODUCTION TO TIME SERIES AND ITS COMPONENTS
   2. WORKING WITH STATIONARY TIME SERIES
   3. END-TO-END ANALYSIS OF TIME SERIES

7. ADVANCED ML CASE STUDY (1 WEEK)
Build a regularised regression model to understand the most important variables to predict house prices in Australia.
   1. PROBLEM STATEMENT
   2. EVALUATION RUBRIC
   3. FINAL SUBMISSION
   4. SOLUTION

COURSE 4 - ADVANCED MACHINE LEARNING AND NATURAL LANGUAGE PROCESSING

1. NEURAL NETS FOR NLP (1 WEEK)
Learn the most sophisticated and cutting-edge technique in machine learning - Artificial Neural Networks or ANNs.
   1. UNDERSTANDING NEURAL NETWORKS
   2. LOSS FUNCTIONS AND BACK PROPAGATION
   3. UNDERSTANDING TENSORFLOW
   4. CASE STUDY: IMDB MOVIE REVIEW CLASSIFICATION

2. SYNTACTIC PROCESSING (1 WEEK)
Learn how to analyse the syntax or the grammatical structure of sentences using POS tagging and Dependency parsing.
   1. INTRODUCTION TO SYNTACTIC PROCESSING
   2. PARSING
   3. INFORMATION EXTRACTION
   4. CONDITIONAL RANDOM FIELDS

3. SYNTACTIC PROCESSING (1 WEEK)
Use techniques such as POS tagging and Dependency parsing to extract information from unstructured text data.
   1. PROBLEM STATEMENT
   2. EVALUATION RUBRIC
   3. FINAL SUBMISSION
   4. SOLUTION

4. SEMANTIC PROCESSING (2 WEEKS)
Learn the most interesting area in the field of NLP and understand different techniques like word-embeddings and topic modelling to build an application that extracts opinions about socially relevant issues.
   1. INTRODUCTION TO SEMANTIC PROCESSING
   2. DISTRIBUTIONAL SEMANTICS
   3. INDUSTRY APPLICATIONS OF DISTRIBUTIONAL SEMANTICS
   4. TOPIC MODELLING

5. APPLIED DL IN NLP (1 WEEK)
Apply the concepts of DL in natural language processing problems through encoder-decoder architecture and NMTs, and implement them in TensorFlow.
   1. INTRODUCTION TO MACHINE TRANSLATION
   2. ATTENTION-BASED NMT MODEL
   3. CUSTOM MODEL BUILDING IN TENSORFLOW

6. CASE STUDY: AUTOMATIC TICKET CLASSIFICATION (1 WEEK)
Categorise support tickets with the help of Unsupervised learning and Topic modelling.
   1. PROBLEM STATEMENT
   2. EVALUATION RUBRIC
   3. FINAL SUBMISSION
   4. SOLUTION

COURSE 5 - GENERATIVE AI

1. FUNDAMENTALS OF TRANSFORMERS ARCHITECTURE, GENERATIVE AI, CHATGPT & PROMPT ENGINEERING USING NON REASONING, CHAIN OF THOUGHT & ADVANCED TECHNIQUES (1 WEEK)

2. PRODUCT DEVELOPMENT USING OPENAI APIS, FINE TUNING USING STAR TECHNIQUE IN PYTHON (1 WEEK)

3. INTEGRATING SPEECH USING WHISPER API AND APPLICATION DEPLOYMENT USING FLASK (1 WEEK)

4. FUNDAMENTALS OF DESIGN, PHOTOGRAPHY, PRODUCT DEVELOPMENT USING STABLE DIFFUSION IN PYTHON & CREATE PIXXELCRAFT AI TO ENABLE FAST-TRACK DIGITISATION FOR OFFLINE E-COMMERCE BUSINESSES BY GENERATING HIGH-QUALITY IMAGES USING AI FOR A LARGE PRODUCT PORTFOLIO (1 WEEK)

5. APPLICATIONS OF LLMS IN DATA SCIENCE PROJECTS & AUTOMATING NEWS RECOMMENDATION USING GPT-3 AND COPILOT-POWERED MACHINE LEARNING APPLICATIONS OF LLMS (1 WEEK)

6. INTERVIEW GYNIE AI: CHATBOT DEVELOPMENT PROJECT (1 WEEK)

COURSE 6 - CAPSTONE PROJECT

1. CAPSTONE PROJECT (4 WEEKS)
Choose from a range of real-world industry-woven projects on advanced topics like Recommendation Systems, Fraud Detection, Emotion Detection from faces, Social Media Listening, and Speech Recognition among many others.
   1. AN OVERVIEW OF THE DOMAIN AND ASSOCIATED CONCEPTS
   2. PROBLEM STATEMENT
   3. EVALUATION RUBRIC
   4. MID SUBMISSION

SPECIALISATION: BUSINESS ANALYTICS
COURSE 3 - ADVANCED MACHINE LEARNING

1. BAGGING & RANDOM FOREST (1 WEEK)
Learn how powerful ensemble algorithms can improve your classification models by building random forests from decision trees.
   1. POPULAR ENSEMBLES
   2. INTRODUCTION TO RANDOM FORESTS
   3. FEATURE IMPORTANCE IN RANDOM FORESTS
   4. RANDOM FORESTS IN PYTHON

2. MODEL SELECTION & GENERAL ML TECHNIQUES (2 WEEKS)
Learn the pros and cons of simple and complex models and the different methods for quantifying model complexity, along with general machine learning techniques like feature engineering, model evaluation, and many more.
   1. PRINCIPLES OF MODEL SELECTION
   2. MODEL BUILDING AND EVALUATION
   3. FEATURE ENGINEERING
   4. CLASS IMBALANCE

3. TIME SERIES FORECASTING (2 WEEKS)
In this module, you will learn how to analyse and forecast a series that varies with time.
   1. INTRODUCTION TO TIME SERIES AND ITS COMPONENTS
   2. SMOOTHING TECHNIQUES
   3. INTRODUCTION TO AR MODELS
   4. BUILDING AR MODELS

4. MODEL SELECTION CASE STUDY (1 WEEK)
Apply your business acumen to the newly learnt machine learning techniques, and select the model most appropriate for a provided business scenario.
   1. PROBLEM STATEMENT
   2. EVALUATION RUBRIC
   3. FINAL SUBMISSION
   4. SOLUTION

COURSE 4 - DATA VISUALISATION AND STORYTELLING

1. VISUALISATION USING TABLEAU (1 WEEK)
Learn basic visualisation techniques using the most in-demand visualisation tool in the industry.
   1. DATA EXPLORATION IN TABLEAU
   2. VISUALISING AND ANALYSING DATA IN TABLEAU WITH BASIC PLOTS

2. ADVANCED EXCEL

1. EXCEL FUNCTIONS Learn the advanced concepts in Excel and 1 WEEK


start to perform data analysis like a pro!
2. DATA ANALYSIS IN EXCEL

3. ADVANCED TOOLS AND


VISUALISATIONS

3. VISUALISATION USING POWERBI

1. POWERBI: INTRODUCTION Take your visualisation game a step forward 1 WEEK


AND SETUP by understanding how to operate PowerBI.

2. VISUALISING AND ANALYSING


DATA IN POWERBI

3. DATA TRANSFORMATIONS
USING POWERBI

4. STRUCTURED PROBLEM SOLVING USING FRAMEWORKS

1. INTRODUCTION TO STRUCTURED PROBLEM SOLVING
2. INTERVIEWING AND FRAMEWORKS - I: 5W AND 5WHYS
3. INTERVIEWING AND FRAMEWORKS - II: SPIN
4. INDUSTRY DEMONSTRATIONS ON FRAMEWORKS
5. UNDERSTANDING BUSINESS MODEL CANVAS AND ISSUE TREE FRAMEWORK
6. INDUSTRY DEMONSTRATIONS ON ISSUE TREE FRAMEWORK
7. SPECIALISED FRAMEWORKS FOR BUSINESS PROBLEMS: 7PS, 5CS, ETC.

Learn how to attack a business problem using various structured frameworks like 5W, 5WHYs, and SPIN.
Duration: 1 WEEK

5. DATA STORYTELLING

1. INTRODUCTION TO DATA STORYTELLING
2. COMPONENTS OF A GOOD STORY WITH DATA - UNDERSTANDING YOUR STAKEHOLDER AND STAKEHOLDER EMPATHY, LEVELS OF DETAILS FOR DIFFERENT STAKEHOLDERS - CXO/LEADERSHIP VS TEAM PRESENTATIONS, VISUALS, ETC.
3. GOLDEN RULES FOR DATA STORYTELLING

Learn how to effectively strategise, communicate, and fine-tune your data analysis projects. Understand how to optimally present your findings to technical and non-technical stakeholders, and upgrade your storytelling skills.
Duration: 1 WEEK

6. AIRBNB CASE STUDY

1. PROBLEM STATEMENT
2. EVALUATION RUBRIC
3. FINAL SUBMISSION
4. SOLUTION

Use your newly learnt UI tools skills to analyse an AirBnB dataset to make important business decisions. But the analysis is just a small part; can you also effectively present it using Data Storytelling to the right stakeholders?
Duration: 1 WEEK

COURSE 5 - GENERATIVE AI

1. FUNDAMENTALS OF TRANSFORMERS ARCHITECTURE, GENERATIVE AI, CHATGPT & PROMPT ENGINEERING USING NON REASONING, CHAIN OF THOUGHT & ADVANCED TECHNIQUES
Duration: 1 WEEK

2. PRODUCT DEVELOPMENT USING OPENAI APIS, FINE TUNING USING STAR TECHNIQUE IN PYTHON
Duration: 1 WEEK

3. INTEGRATING SPEECH USING WHISPER API AND APPLICATION DEPLOYMENT USING FLASK
Duration: 1 WEEK

4. FUNDAMENTALS OF DESIGN, PHOTOGRAPHY, PRODUCT DEVELOPMENT USING STABLE DIFFUSION IN PYTHON & CREATE PIXXELCRAFT AI TO ENABLE FAST-TRACK DIGITISATION FOR OFFLINE E-COMMERCE BUSINESSES BY GENERATING HIGH-QUALITY IMAGES AI FOR A LARGE PRODUCT PORTFOLIO
Duration: 1 WEEK

5. APPLICATIONS OF LLMS IN DATA SCIENCE PROJECTS & AUTOMATING NEWS RECOMMENDATION USING GPT3 AND COPILOT POWERED MACHINE LEARNING APPLICATIONS OF LLMS
Duration: 1 WEEK

6. INTERVIEW GYNIE AI: CHATBOT DEVELOPMENT PROJECT
Duration: 1 WEEK

COURSE 6 - CAPSTONE PROJECT
1. CAPSTONE PROJECT

1. POWER BI - OPTIONAL
2. AN OVERVIEW OF THE DOMAIN AND ASSOCIATED CONCEPTS
3. PROBLEM STATEMENT
4. EVALUATION RUBRIC
5. MID SUBMISSION
6. FINAL SUBMISSION
7. SOLUTION

Solve an end-to-end real-life industry problem from a wide variety of domains.
Duration: 4 WEEKS

SPECIALISATION: BUSINESS INTELLIGENCE / DATA ANALYTICS
COURSE 3: ADVANCED DBS AND BIG DATA ANALYTICS
1. DATA MODELLING

1. DATABASE DESIGN RECAP
2. BUILDING BLOCKS OF DATA MODELLING
3. PROBLEM SOLVING USING DATA MODELLING
4. DATA MODELLING: OPTIONAL ASSIGNMENT

In this module, you will learn and use data modelling on a dataset to solve a business problem.
Duration: 1 WEEK

2. ADVANCED SQL AND BEST PRACTICES

1. WINDOW FUNCTIONS
2. CASE STATEMENTS, STORED ROUTINES, AND CURSORS
3. QUERY OPTIMISATION AND BEST PRACTICES
4. PROBLEM SOLVING USING SQL

Apply advanced SQL concepts like windowing and procedures to derive insights from data and answer pertinent business questions.
Duration: 1 WEEK
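
To give a flavour of the windowing concept mentioned in this module, here is a small sketch that computes a per-region running total with `SUM(...) OVER (PARTITION BY ... ORDER BY ...)`. It is an illustration only: the `sales` table, its columns, and the figures are hypothetical, and it assumes Python's bundled SQLite is version 3.25 or later, which is when SQLite gained window function support.

```python
import sqlite3

# In-memory database with a hypothetical sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, month INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("North", 1, 100), ("North", 2, 150), ("South", 1, 80), ("South", 2, 120)],
)

# The window function adds a running total per region without collapsing rows,
# unlike GROUP BY, which would return one row per region.
rows = conn.execute(
    """
    SELECT region, month, amount,
           SUM(amount) OVER (PARTITION BY region ORDER BY month) AS running_total
    FROM sales
    ORDER BY region, month
    """
).fetchall()

for row in rows:
    print(row)
```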

3. INTRODUCTION TO BIG DATA AND CLOUD

1. BIG DATA AND CLOUD COMPUTING
2. AMAZON WEB SERVICES
3. BIG DATA STORAGE AND PROCESSING - HADOOP
4. EMR CLUSTER IN AWS

Understand the basics of big data and cloud and learn to work with an EMR cluster on a cloud-based service.
Duration: 1 WEEK

4. ANALYTICS USING SPARK

1. EXPLORATORY DATA ANALYSIS WITH PYSPARK
2. PREDICTIVE ANALYSIS WITH SPARK MLLIB

Use PySpark to do EDA and Predictive Analysis using Spark’s ML library.
Duration: 2 WEEKS

5. BIG DATA CASE STUDY

1. PROBLEM STATEMENT
2. EVALUATION RUBRIC
3. FINAL SUBMISSION
4. SOLUTION

Use your analytics skills to work on a large dataset in the cloud to solve an industry problem.
Duration: 1 WEEK

COURSE 4 - DATA VISUALISATION AND STORYTELLING
1. VISUALISATION USING TABLEAU

1. DATA EXPLORATION IN TABLEAU
2. VISUALISING AND ANALYSING DATA IN TABLEAU WITH BASIC PLOTS

Learn basic visualisation techniques using the most in-demand visualisation tool in the industry.
Duration: 1 WEEK

2. ADVANCED EXCEL

1. EXCEL FUNCTIONS
2. DATA ANALYSIS IN EXCEL
3. ADVANCED TOOLS AND VISUALISATIONS

Learn the advanced concepts in Excel and start to perform data analysis like a pro!
Duration: 1 WEEK

3. VISUALISATION USING POWERBI

1. POWERBI: INTRODUCTION AND SETUP
2. VISUALISING AND ANALYSING DATA IN POWERBI
3. DATA TRANSFORMATIONS USING POWERBI

Take your visualisation game a step forward by understanding how to operate PowerBI.
Duration: 1 WEEK

4. STRUCTURED PROBLEM SOLVING USING FRAMEWORKS

1. INTRODUCTION TO STRUCTURED PROBLEM SOLVING
2. INTERVIEWING AND FRAMEWORKS - I: 5W AND 5WHYS
3. INTERVIEWING AND FRAMEWORKS - II: SPIN
4. INDUSTRY DEMONSTRATIONS ON FRAMEWORKS
5. UNDERSTANDING BUSINESS MODEL CANVAS AND ISSUE TREE FRAMEWORK
6. INDUSTRY DEMONSTRATIONS ON ISSUE TREE FRAMEWORK
7. SPECIALISED FRAMEWORKS FOR BUSINESS PROBLEMS: 7PS, 5CS, ETC.

Learn how to attack a business problem using various structured frameworks like 5W, 5WHYs, and SPIN.
Duration: 1 WEEK

5. DATA STORYTELLING

1. INTRODUCTION TO DATA STORYTELLING
2. COMPONENTS OF A GOOD STORY WITH DATA - UNDERSTANDING YOUR STAKEHOLDER AND STAKEHOLDER EMPATHY, LEVELS OF DETAILS FOR DIFFERENT STAKEHOLDERS - CXO/LEADERSHIP VS TEAM PRESENTATIONS, VISUALS, ETC.
3. GOLDEN RULES FOR DATA STORYTELLING

Learn how to effectively strategise, communicate, and fine-tune your data analysis projects. Understand how to optimally present your findings to technical and non-technical stakeholders, and upgrade your storytelling skills.
Duration: 1 WEEK

6. AIRBNB CASE STUDY

1. PROBLEM STATEMENT
2. EVALUATION RUBRIC
3. FINAL SUBMISSION
4. SOLUTION

Use your newly learnt UI tools skills to analyse an AirBnB dataset to make important business decisions. But the analysis is just a small part; can you also effectively present it using Data Storytelling to the right stakeholders?
Duration: 1 WEEK

COURSE 5: ADVANCED PROBLEM SOLVING AND PROGRAMMING
1. DATA STRUCTURES - SETS, DICTIONARIES, STACKS, QUEUES

1. IN-BUILT DATA STRUCTURES
2. STACK
3. QUEUE
4. TREES

Learn user-defined data structures (Stack, Queue, and Trees) in Python that help in advanced data manipulation.
Duration: 1 WEEK

2. SEARCHING AND SORTING

1. SEARCHING
2. SORTING
3. TWO POINTERS

Learn the most fundamental searching and sorting algorithms and design techniques.
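
The two-pointers design technique listed in this module can be sketched with a classic exercise: finding two values in a sorted list that add up to a target. This is a minimal illustration, not course material, and the function name `pair_with_sum` and the sample list are hypothetical.

```python
def pair_with_sum(sorted_values, target):
    """Return indices (i, j), i < j, with sorted_values[i] + sorted_values[j] == target.

    Two pointers start at both ends of a sorted list and move inwards,
    so the search takes O(n) time instead of the O(n^2) of checking every pair.
    Returns None when no such pair exists.
    """
    left, right = 0, len(sorted_values) - 1
    while left < right:
        total = sorted_values[left] + sorted_values[right]
        if total == target:
            return left, right
        if total < target:
            left += 1   # need a larger sum, so advance the left pointer
        else:
            right -= 1  # need a smaller sum, so retreat the right pointer
    return None


print(pair_with_sum([1, 3, 4, 6, 9], 9))
```

The technique relies on the input being sorted; on unsorted data the list would need sorting first, or a hash-based lookup could be used instead.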

3. ALGORITHM ANALYSIS + RECURSION

1. ALGORITHM ANALYSIS
2. TIME AND SPACE COMPLEXITY
3. RECURSION

Learn how to assess the efficiency of your code using algorithm analysis techniques and learn to write recursive algorithms.

4. ADVANCED DATABASE PROGRAMMING USING PANDAS

1. ADVANCED DATA WRANGLING WITH PANDAS - I
2. ADVANCED DATA WRANGLING WITH PANDAS - II

Learn and implement advanced wrangling functions and techniques in Pandas related to date-time, multi-column aggregation, hierarchical indexing, and more.

5. PYTHON & SQL LAB

1. SQL: TIMED TEST + ASSIGNMENT
2. PYTHON: TIMED TESTS I & II
3. VIDEO SUBMISSION

In this competitive assignment, you will solve a variety of programming questions in both SQL and Python in a timed environment. You will also demonstrate one of the questions through a video submission to help improve your interviewing skills.

COURSE 6 - CAPSTONE PROJECT
1. CAPSTONE PROJECT

1. AN OVERVIEW OF THE DOMAIN AND ASSOCIATED CONCEPTS
2. PROBLEM STATEMENT
3. EVALUATION RUBRIC
4. MID SUBMISSION
5. FINAL SUBMISSION
6. SOLUTION

Solve an end-to-end real-life industry problem from a wide variety of domains.
Duration: 4 WEEKS

SPECIALISATION: DATA ENGINEERING


COURSE 3: DATA ENGINEERING - I
1. DATA MANAGEMENT AND RELATIONAL DATABASE MODELLING

1. ENTERPRISE DATA MANAGEMENT
2. RELATIONAL DATABASE MODELLING
3. NORMAL FORMS AND ER DIAGRAMS

Understand the concepts of Data Management and learn to model data from a Relational Database.
Duration: 1 WEEK

2. INTRODUCTION TO BIG DATA (OPTIONAL)

1. 4VS OF BIG DATA
2. BIG DATA: INDUSTRY CASE STUDIES

In this module, you will learn what big data is, its various characteristics, and its determining factors. You will also get an idea of the various sources of big data and the wide range of big data applications in different industries such as retail, healthcare, and finance.
Duration: 0 WEEK

3. INTRODUCTION TO CLOUD AND AWS SETUP

1. INTRODUCTION TO CLOUD
2. AWS SETUP

Understand what the cloud is and set up your AWS account, which will be required during the program.
Duration: 1 WEEK

4. INTRODUCTION TO HADOOP AND MAPREDUCE PROGRAMMING

1. CONCEPTS RELATED TO DISTRIBUTED COMPUTING
2. HADOOP DISTRIBUTED FILE SYSTEM
3. MAPREDUCE PROGRAMMING IN PYTHON

Understand the world of distributed data processing and storage with Hadoop. Learn to write MapReduce jobs in Python.
Duration: 1 WEEK
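
The MapReduce model named in this module can be sketched on a single machine as three phases: map, shuffle, and reduce. The word-count example below is a conceptual illustration in plain Python, not a Hadoop job; all function names are hypothetical, and a real cluster would distribute each phase across nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as the framework would do."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


print(word_count(["big data", "Big cluster"]))
```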

5. ASSIGNMENT (OPTIONAL)
1. INTRODUCTION, PROBLEM STATEMENT AND GRADING RUBRICS

Solve an assignment to brush up on the skills learnt so far.
Duration: 0 WEEK

6. NOSQL DATABASES AND APACHE HBASE NOSQL DATABASES AND MONGODB (OPTIONAL)

1. CONCEPTS OF NOSQL DATABASES
2. INTRODUCTION TO APACHE HBASE
3. HBASE PYTHON API
4. COMPARISON OF NOSQL DATABASES

Learn the concepts of NoSQL databases. Understand the working of Apache HBase.
Duration: 1 WEEK

7. DATA WAREHOUSING (OPTIONAL)

1. INTRODUCTION TO DATA WAREHOUSE AND DATA LAKES
2. DESIGNING DATA WAREHOUSING FOR AN ETL DATA PIPELINE
3. DESIGNING DATA LAKE FOR AN ETL DATA PIPELINE

Understand the intricacies behind designing a data warehouse and a data lake for use case(s).
Duration: 0 WEEK

8. DATA INGESTION WITH APACHE SQOOP AND APACHE FLUME

1. INTRODUCTION TO DATA INGESTION
2. STRUCTURED DATA INGESTION WITH SQOOP
3. UNSTRUCTURED DATA INGESTION WITH FLUME

Get familiar with the challenges involved in data ingestion. Use Sqoop and Flume to ingest structured and unstructured data into Hadoop.
Duration: 1 WEEK

9. MAPREDUCE PROGRAMMING ASSIGNMENT

1. PROBLEM STATEMENT AND SAMPLE DATASET
2. SOLUTION

Practise MapReduce Programming on a Big Dataset.
Duration: 1 WEEK

COURSE 4 - DATA ENGINEERING - II

1. HIVE & QUERYING

1. FUNDAMENTALS OF APACHE HIVE
2. WRITING HQL FOR DATA ANALYSIS
3. PARTITIONING AND BUCKETING WITH HIVE

Manage and query a data warehouse with Apache Hive. Learn to write optimised HQL for large-scale data analysis.
Duration: 2 WEEKS

2. ASSIGNMENT (OPTIONAL)

1. INTRODUCTION, PROBLEM STATEMENT AND GRADING RUBRICS

Solve an assignment to brush up on the skills learnt so far.
Duration: 0 WEEK

3. AMAZON REDSHIFT

1. DATA WAREHOUSING WITH REDSHIFT
2. ANALYSE DATA WITH REDSHIFT

Learn to deploy a Redshift cluster and use it for querying data.
Duration: 1 WEEK

4. INTRODUCTION TO APACHE SPARK

1. SPARK ARCHITECTURE
2. RDD, DATAFRAME API, SPARK SQL

Get introduced to Apache Spark, a lightning-fast big data processing engine.
Duration: 1 WEEK

5. PROJECT: ETL DATA PIPELINE

1. INTRODUCTION AND PROBLEM STATEMENT
2. GRADING RUBRICS AND SUBMISSION

Make use of Sqoop, Redshift & Spark to design an ETL data pipeline.
Duration: 2 WEEKS

6. AWS CLOUD INFRASTRUCTURE (OPTIONAL)

1. THE AWS CLOUD PLATFORM
2. BUILDING AND DEPLOYING VIRTUAL MACHINES
3. AWS CLOUD STORAGE SOLUTIONS
4. APPLICATION DEPLOYMENT
5. CLOUD ADMINISTRATION AND SECURITY
6. LOAD BALANCING AND BACKUP STRATEGIES
7. CLOUD AUTOMATION

Do a deep dive into AWS Cloud.
Duration: 0 WEEK

COURSE 5 - DATA ENGINEERING - III

1. OPTIMISING SPARK FOR LARGE-SCALE DATA PROCESSING

1. RUNNING SPARK ON MULTINODE CLUSTER
2. SPARK MEMORY & DISK OPTIMISATION
3. OPTIMISING SPARK CLUSTER ENVIRONMENT

Use PySpark to create large-scale data processing applications.
Duration: 1 WEEK

2. APACHE FLINK (OPTIONAL)

1. INTRODUCTION TO APACHE FLINK
2. BATCH DATA PROCESSING WITH FLINK
3. STREAM PROCESSING WITH APACHE FLINK
4. SQL API

Get introduced to Apache Flink and learn to query batch data. Use the DataStream API to create a stream processing application.
Duration: 0 WEEK

3. REAL-TIME DATA STREAMING WITH APACHE KAFKA

1. INTRO TO REAL-TIME DATA PROCESSING ARCHITECTURES
2. FUNDAMENTALS OF APACHE KAFKA
3. SETTING UP KAFKA PRODUCER AND CONSUMER
4. KAFKA CONNECT API & KAFKA STREAMS

Understand the producer-consumer architecture of Apache Kafka. Learn to set up a Kafka cluster for managing real-time data.
Duration: 1 WEEK

4. REAL-TIME DATA PROCESSING USING SPARK STREAMING

1. SPARK STREAMING ARCHITECTURE
2. SPARK STREAMING APIS
3. BUILDING STREAM PROCESSING APPLICATION WITH SPARK
4. COMPARISON BETWEEN SPARK STREAMING AND FLINK

Learn about the real-time data processing architecture of Apache Spark. Build Spark Streaming applications to process data in real time.
Duration: 1 WEEK

5. ASSIGNMENT (OPTIONAL)
1. INTRODUCTION, PROBLEM STATEMENT AND GRADING RUBRICS

Solve an assignment to brush up on the skills learnt so far.
Duration: 0 WEEK

6. BUILDING AUTOMATED DATA PIPELINES WITH AIRFLOW

1. FUNDAMENTALS OF AIRFLOW
2. WORKFLOW MANAGEMENT WITH AIRFLOW
3. AUTOMATING AN ENTIRE DATA PIPELINE WITH AIRFLOW

Automate Data Pipelines with Airflow.
Duration: 1 WEEK

7. ANALYTICS USING PYSPARK

1. EXPLORATORY DATA ANALYSIS WITH PYSPARK
2. PREDICTIVE ANALYSIS WITH SPARK MLLIB

Use PySpark to do EDA and Predictive Analysis using Spark’s ML library.
Duration: 1 WEEK

8. PROJECT: REAL-TIME DATA PROCESSING

1. INTRODUCTION AND PROBLEM STATEMENT
2. GRADING RUBRICS AND SUBMISSION

Build an end-to-end real-time data processing application using Spark Streaming and Kafka.
Duration: 1 WEEK

COURSE 6 - CAPSTONE PROJECT
CAPSTONE PROJECT

1. AN OVERVIEW OF THE DOMAIN AND ASSOCIATED CONCEPTS
2. PROBLEM STATEMENT
3. EVALUATION RUBRIC
4. MID SUBMISSION
5. FINAL SUBMISSION
6. SOLUTION

The capstone project will stitch all the components of data engineering together.
Duration: 4 WEEKS

COURSE - RESEARCH METHODOLOGIES (11 WEEKS)
INTRODUCTION TO RESEARCH AND RESEARCH PROCESS

FAMILIARISE WITH DIFFERENT ASPECTS OF RESEARCH AND FORMULATE A RESEARCH QUESTION

• What is research, importance of research, what is data, what is information, what is knowledge?
• Importance of research, types of originality, characteristics of research, research process
• Criticism in research and its importance, Peer reviews in research and its importance
• Types of research: Scientific vs Rest, Objectives of research
• Structure of a research proposal: Components of the Research proposal covered over the course
• Identify a research problem, formulate a research question, characteristics of a good research question

RESEARCH DESIGN

DEVELOP AN UNDERSTANDING OF VARIOUS RESEARCH DESIGNS

• Types of research methods and pyramid of evidence
1. Study of existing researches and links between them
2. Applied and incremental
3. Discover
• Applied vs Fundamental, Quantitative vs Qualitative, Bayesian vs Frequentist, Hypothesis driven research vs Exploratory research
• Sample Size and Power, Precision vs accuracy trade-off, p-value vs confidence intervals using a case study

LITERATURE REVIEWING

LEARN HOW TO READ AND CRITIQUE A PAPER, AND HOW TO CITE A PAPER

• Intro to lit review process, what is a lit review, benefits of lit review, literature review process (read, analyse and cite)
• How to read and critique a paper
• Types of sources that could be cited during research, the importance of citations and how to cite
• What makes a good reference, How to use reference management software, Related scientific ethics

RESEARCH PROJECT MANAGEMENT

LEARN HOW TO PLAN THE PROJECT AND HOW TO ARRANGE FOR DATA

• Project management in research: research question, planning of the project, initiation, monitoring, closure.
• Project requirements on data: data collection, data access, data sources, availability, credibility and usability of data from different sources.
• Project requirements on analysis software: Analytical methods in Data Science, software requirement (R, Minitab, Matlab...), and data cleaning skills.
• Project requirements on time: planning, breaking the work down to tasks, Gantt Charts, Milestones identification and Deliverables, Re-planning.

REPORT WRITING AND PRESENTATION SKILLS

MASTER GOOD SCIENTIFIC WRITING AND PROPER PRESENTATION SKILLS

• Art of writing a paper
• Parts of a paper
• Tools to write papers
• Publishing papers: Journals + Seminars
• Citation Methods and Rules
• Defending your thesis

SCIENTIFIC ETHICS

DEVELOP AN UNDERSTANDING OF THE ETHICAL DIMENSION IN RESEARCH

• Honor Code, Definition of Plagiarism, Type of Plagiarism, Code of good practice
• Research Claims, Professional Standard, IP, Conflict of Interest
• Legal aspects of data: Ethical Approvals for studies involving humans such as questionnaire based research, Storing Primary Data,

COURSE - SAMPLE THESIS TOPICS (15 WEEKS)


SUBMITTING THE IN-DEPTH RESEARCH WORK IN A FINAL THESIS REPORT AND PRESENTING IT.

Sample Thesis topics to select from:
• Investigate dietary patterns and metabolite fingerprints of takeaway (fast) food consumers using PCA and clustering methods
• Investigate a diagnosis of eye diseases using imaging ophthalmic data
• Structure medical images with information geometry
• Using Social media feed to place tweets regarding natural disasters on a map
• Preventing credit card fraud through pattern recognition
• Developing a recommender system for a Media giant
• Risk modelling for Financial activities and Investment Banking

Disclaimer: Program curriculum is subject to change basis inputs from the institute and experts. Please refer to the website for updated details, or speak to our Admission Counsellors.
Research of our learners:
A Glimpse
1 Thesis Topic
Build a prediction model to accurately detect
and classify peripheral neuropathy.

Abstract
Background:
Damage to peripheral nerves causes Peripheral neuropathy (PN). Patients complain of pain, numbness and loss
of balance. If not identified early and treated adequately, PN could progress rapidly and lead to fatal complications.
A neurologist needs to determine the type of PN to provide differential treatment to the patient. However,
defining factors to classify PN accurately has remained challenging. This research proposes a model to detect
and classify PN into axonal, demyelinating, mixed and normal types from clinical and nerve conduction study
(NCS) data using the Random Forest algorithm.

Data and methods:


Clinical and NCS data of 304 Indian patients, 229 affected by PN and 75 normal, were collected with ethical
approval from Kauvery hospital, Chennai. Exploratory data analysis and the Random Forest algorithm were used
to build a model.

Results:
Random Forest model was able to predict and classify PN with an accuracy of 96%. In axonal cases, sensory
and motor nerves showed a drop in amplitudes of greater than 40% compared to normal patients. Reduced
amplitude (>40%) in motor nerves of lower limbs and missing values (>90%) in sensory nerves of lower limbs
identified axonal PN. Delayed onset latency (>40%) in motor nerves of upper limbs, decreased conduction
velocity (>60%) in sensory nerves of upper limbs and increased onset latency (>40%) in F-waves of upper limbs
delineated the demyelinating type. Median ages of patients were mixed (65), demyelinating (51) and axonal
(61). Axonal (18.75%) was significant in diabetic patients and demyelinating (14.8%) in non-diabetic patients. Both
axonal and mixed (16.78%) types were greater in hypertensive patients, and demyelinating (17.11%) type was
higher in patients without hypertension. Reflex was depressed more in mixed (17.49%) than axonal (15.51%) and
demyelinating (11.89%). Mixed (37.06%) type showed more insensitivity to pin-prick than axonal (29.37%) and
demyelinating (24.48%) types. Mixed (45%) patients tested positive for Romberg’s test more than axonal (31%)
and demyelinating (21%). Mixed (34.65%) patients complained of numbness more than axonal (23.62%) and
demyelinating (26.77%) types.

Conclusion:
Random forest algorithm identified and classified PN well using clinical and NCS features. Clinical features (age,
diabetes, hypertension, reflex, Romberg’s test, numbness and perception to pin-prick) were useful in detecting
PN. Nerve conduction study features (amplitude, onset latency, conduction velocity, F-wave response and
missing sensory values) were instrumental in classifying PN. Reduced amplitudes of sensory and motor nerves
identified the axonal condition. Delayed onset latency and low conduction velocities along with missing and
delayed F-wave responses identified the demyelinating type.
2 Thesis Topic
Automatic network coding of traffic junctions using
machine learning.

Abstract
Before any traffic simulation can be performed, the network of roads and junctions is modeled. Assigning
attributes to the roadway network, such as the road length and width, the junction type, number of arms, and
lanes, is a crucial task while building the network. This research is an attempt to develop an efficient traffic junction
classifier using machine learning and deep learning algorithms on satellite images. Three junction categories,
Priority, Roundabout, and Signal, are considered for analysis. As this is a novel research idea, the required
image dataset of junctions is created using the Google Maps API. By using robotic process automation, the
downloading of the images is automated. Two approaches are taken to build the classifiers: a machine-learning
approach and a deep-learning approach. The machine learning approach is split into two phases: the feature
extraction phase and the classification phase. In the feature extraction phase, Histogram of Oriented Gradients
(HOG) descriptors are used to extract features from the images. Furthermore, in the classification phase, several
classification algorithms are applied to the HOG features to build classifiers. In the deep-learning approach,
taking advantage of powerful pre-trained models and transfer learning, a Convolutional Neural Network (CNN)
is developed for classifying the junctions. The models are evaluated, and in the end, a comparison between the
various classification models is performed. The results showed that the CNN classifier modeled had the best
accuracy and AUC compared to the other models with scores of 0.81 and 0.94 respectively. Among the machine
learning models that were trained on the HOG features, the Extreme Gradient Boosting model has the best
accuracy of 0.62. The ultimate aim of this work is to use this junction-classifier model on real projects to aid the
process of finding the type of junctions and reduce the effort and time required to model the roadway networks.
Meet the Class

INDUSTRIES OUR STUDENTS COME FROM
5% Healthcare
5% E-Commerce
1% Telecom
57% IT
1% Finance
15% Other
1% Consulting
1% Education
3% Retail
1% Manufacturing
10% BFSI

WORK EXPERIENCE
33% 0-3 years
21% 3.1-6 years
15% 6.1-9 years
11% 9.1-12 years
20% 12.1+ years


Elements of Career Services

Jobs on Career Centre
Career Centre offers upGrad jobs across experience levels and CTC ranges.
• Easy apply feature for upGrad hiring partner vacancies
• Create a resume at profile builder with one click to apply for various jobs.

Profile Builder (AI-Powered)
An easy-to-use Resume, LinkedIn and Cover letter preparation tool.
• Resume Score: AI-Driven Resume Score
• Real-time recommendations to improve.
• Match your resume to the JD and check fitment.
• LinkedIn Profile Review.
• Cover Letter creation.

upGrad Elevate
• Recruitment Drive to connect you with the best talent admirers in the industry
• Get access to a wide range of opportunities and find the perfect job
• Apply your learnings to real industry problems

Just-In-Time Interview Prep (JIT)
For upcoming job interviews JITs are conducted within 48 hours for eligible programs.
• Tailored to the job role and target domain
• Real-time feedback and tips for improvement

Interview Preparation
Pre-recorded content on topics such as:
• Profile building, communications, etc.
• Problem-solving approach
• Approaching guesstimates
• Domain-specific interview question bank and much more

Personalised Industry Session
90-minute sessions over the weekend by leading industry experts.
• Session categories: Career, Technical and Communications
• Doubt resolution
• Develop proof of concepts and apply theoretical concepts in the real world
• Assess skill levels
• Peer Networking
• Classroom element
• Business communication sessions and much more

Disclaimer: Career services are subject to change. Please refer to the website or speak to our Admission Counsellor for updated details.
Experience upGrad
Offline
UPGRAD BASECAMPS
Held across all major cities in India, upGrad basecamps
bring together learners, faculty and industry experts
for a power-packed day of activities, career-building
sessions and live group projects. Get to know your
peers and faculty and hone your networking skills
in an exciting environment.

CAREER FAIRS
Attend regular hiring drives in major cities across
India, giving you the opportunity to interview with
upGrad’s 300+ hiring partners ensuring you get every
opportunity you deserve.

HACKATHONS
Team up and put your learning to use with our offline
Hackathons, designed to help you apply concepts
and meet, network, and grow!
Hear from
Our Learners
Sachin Aggarwal, Experience: 18+ Years
“Learning with IIITB and upGrad has been an experience like no other. Being enrolled
on an online program, you have your worries about how the program and teaching
methods will be. My favourite part about the learning experience has been the
well-designed and thoughtful content shared by IIITB professors and industry experts
on upGrad platforms. Kudos to upGrad!”

Shravani Shahapure, Experience: 16 Years


“For someone who really wants to pursue a career in the field of Data Science, it
is worth opting for the complete course by IIITB and upGrad. IIITB and upGrad’s
online course on Data Science gives many opportunities and develops students
for their future as they provide the best professors, thought-provoking assignments
and case studies.”

Savita Upadhyay, Experience: 4 Years


“It has been an amazing journey with upGrad till now. Starting with their course material
to live sessions to mentor support, each helps you to always be on track and
progress efficiently with the Data Science course. My sincere thanks to the entire
team of upGrad and Professors of IIITB for showing me the path and direction for
my dream to become a Data Analyst.”

Tuhin Pal, Experience: 5 Years


“I appreciate the platform upGrad has provided and the way they have arranged
modules and assignments. Modules are locked until you complete the previous one,
so it feels like clearing a semester and going to the next one.”
Program Details and
Admission Process
PROGRAM DURATION AND FORMAT
19 Months | Online

PROGRAM FEE
Please refer to the website for more details

PROGRAM START DATES
Please refer to the website for program start dates.
upgrad.com/data-science-masters-degree-iiitb/

ELIGIBILITY
Bachelor’s Degree with minimum 50% or equivalent passing marks, and successful completion of the Executive PG Program in Data Science from IIITB with a 2.4 GPA. No coding experience required.

WEEKLY COMMITMENT (15 hours/week)
6-7 HOURS: Asynchronous learning time.
6-7 HOURS: Assignments and projects.
1 LIVE SESSION: Every two weeks.

SELECTION PROCESS

STEP 1: Selection Test
Fill out an application and take a short 17-minute online test with 11 questions.

STEP 2: Review and Shortlisting of Suitable Candidates
Our faculty will review all applications, considering the educational and professional background of an applicant and review the test scores where applicable. Following this, Offer Letters will be rolled out so you are assured of a great peer group to learn and network with.

STEP 3: Enrollment for Access to Prep Content
Make a quick block payment with assistance from our loan partners where required, receive immediate access to the prepped content and begin your upGrad journey.

FOR FURTHER INFORMATION, CONTACT
[email protected]
1800 210 2020
We are available 24*7

Disclaimer: Program fee and payment options are subject to change. Please refer to the website for updated details or speak to our admission counsellor.

COMPANY INFORMATION
upGrad Education Private Limited,
Nishuvi, 75, Dr. Annie Besant Road,
Worli, Mumbai - 400018.
