Edureka Training - AI and Machine Learning Masters Course

The document provides information about Edureka's AI and Machine Learning Masters program. It details the program curriculum which aims to provide learners with hands-on experience in designing and implementing AI and machine learning solutions through extensive coursework and projects. The program teaches skills needed to develop cutting-edge AI solutions.

Uploaded by

bhavneet_bodh

edureka!

Discover Learning

AI and Machine Learning Masters Program

About Edureka
Edureka is one of the world’s largest and most effective online education platforms for
technology professionals. In a span of 10 years, 100,000+ students from over 176 countries
have upskilled themselves with the help of our online courses. Since our inception, we have
been dedicated to helping technology professionals from all corners of the world learn
Programming, Data Science, Big Data, Cloud Computing, DevOps, Business Analytics, Java &
Mobile Technologies, Software Testing, Web Development, System Engineering, Project
Management, Digital Marketing, Business Intelligence, Cybersecurity, RPA and more.
We have an easy and affordable learning solution that is accessible to millions of learners. With
our learners spread across countries like the US, India, UK, Canada, Singapore, Australia, Middle
East, Brazil, and many others, we have built a community of over 1 million learners across the
globe.

About the Program


Edureka’s AI and Machine Learning Masters Course is curated by industry experts to provide
learners with a deep understanding of the principles and practices of artificial intelligence and
machine learning through extensive coursework and hands-on projects. Learners will gain
hands-on experience in building models, creating AI and machine learning solutions,
performing feature engineering, working with big data, and making data-driven decisions.
With our comprehensive curriculum, learners will gain the skills necessary to develop
cutting-edge AI and machine learning solutions to meet the demands of any organization.
Join us today and become a globally recognized AI and machine learning professional!

www.edureka.co © Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.


Index
1 Python Statistics for Data Science Course
2 Python Certification Training Course
3 Python Machine Learning Certification Training
4 Advanced Artificial Intelligence Course
5 ChatGPT Complete Course: Beginners to Advanced
6 PySpark Certification Training Course

*Depending on industry requirements, Edureka may make changes to the course curriculum

Python Statistics for Data Science Course (Self-paced)
Course Curriculum

Course Outline

Module 1: Understanding the Data

Topics:

• Introduction to Data Types
• Numerical parameters to represent data.
a. Mean
b. Mode
c. Median
d. Sensitivity
e. Information Gain
f. Entropy
• Statistical parameters to represent data.

Module 2: Probability and its uses

Topics:

• Uses of probability
• Need of probability
• Bayesian Inference
• Density Concepts
• Normal Distribution Curve

Module 3: Statistical Inference

Topics:

• Point Estimation
• Confidence Margin
• Hypothesis Testing
• Levels of Hypothesis Testing

Module 4: Data Clustering

Topics:

• Association and Dependence
• Causation and Correlation
• Covariance
• Simpson’s Paradox
• Clustering Techniques

Module 5: Testing the Data

Topics:

• Parametric Test
• Parametric Test Types
• Non-Parametric Test
• Experimental Designing
• A/B testing

Module 6: Regression Modelling

Topics:

• Logistic and Regression Techniques
• Problem of Collinearity
• WOE and IV
• Residual Analysis
• Heteroscedasticity
• Homoscedasticity

Python Certification Training Course


Course Curriculum

Course Outline

Module 1: Introduction to Python

Topics:

• Need for programming
• Advantages of programming
• Overview of Python
• Organizations using Python
• Python Applications in various domains
• Variables
• Operands and expressions
• Conditional statements
• Loops
• Structural pattern matching

Module 2: Sequences and File Operations

Topics:

• Accepting user input and eval function
• File input/output functions
• Lists
• Tuples
• String manipulation
• Sets and set operations
• Python dictionary

Module 3: Functions and Object-oriented Programming

Topics:

• User-defined functions
• Function parameters
• Different types of arguments
• Global variables
• Global keyword
• Lambda functions
• Built-in functions
• Object-oriented concepts
• Public, protected and private attributes
• Class variable and instance variable
• Constructor and destructor
• Inheritance and its types
• Method resolution order
• Overloading and overriding
• Getter and setter methods

Module 4: Working with Modules and Handling Exceptions

Topics:

• Standard libraries
• Packages and import statements
• Reload function
• Creating a module
• Important modules in Python
• Sys module
• OS module
• Math module
• Date-time module
• Random module
• JSON module
• Regular expression
• Exception handling

Module 5: Array Manipulation using NumPy

Topics:

• Basics of data analysis
• NumPy - Arrays
• Array operations
• Indexing, slicing, and Iterating
• NumPy array attributes
• Matrix product
• NumPy functions
• Array manipulation
• File handling using NumPy

Module 6: Data Manipulation using Pandas

Topics:

• Basics of data analysis
• NumPy - Arrays
• Array operations
• Indexing, slicing, and Iterating
• NumPy array attributes
• Matrix product
• NumPy functions
• Array manipulation
• File handling using NumPy

Module 7: Data Visualization using Matplotlib and Seaborn

Topics:

• Why data visualization?
• Matplotlib library
• Seaborn
• Line plots
• Multiline plots
• Bar plot
• Histogram
• Pie chart
• Scatter plot
• Boxplot
• Saving charts
• Customizing visualizations
• Saving plots
• Grids
• Subplots
• Heatmaps

Module 8: GUI Programming

Topics:

• Ipywidgets package
• Numeric widgets
• Boolean widgets
• Selection widgets
• String widgets
• Date picker
• Color picker
• Container widgets
• Creating a GUI application

Module 9: Developing Web Maps and Representing Information using Plots (Self-paced)

Topics:

• Use of Folium library
• Use of Pandas library
• Flow chart of web map application
• Developing web map using Folium and Pandas
• Reading information from the Titanic dataset and representing it using plots

Module 10: Web Scraping and Computer Vision using OpenCV (Self-paced)

Topics:

• Beautiful Soup library
• Scrapy
• Requests library
• Scrape all hyperlinks from a webpage using Beautiful Soup and Requests
• Plotting charts using Bokeh
• Plotting scatterplots using Bokeh
• Image editing using OpenCV
• Face detection using OpenCV
• Motion detection and capturing video

Module 11: Database Integration with Python (Self-paced)

Topics:

• Basics of database management
• Python MySQL
• Create database
• Create a table
• Insert into table
• Select query
• Where clause
• Order By clause
• Delete query
• Drop table
• Update query
• Limit clause
• Join and Self-Join
• MongoDB (Unstructured)
• Insert_one query
• Insert_many query
• Update_one query
• Update_many query
• Create_index query
• Drop_index query
• Delete and drop collections
• Limit query

Python Machine Learning Certification Training
Course Curriculum

Course Outline

Module 1: Introduction to Data Science

Topics:

• What is Data Science?
• What does Data Science involve?
• Era of Data Science
• Business Intelligence vs Data Science
• Life cycle of Data Science
• Tools of Data Science
• Introduction to Python

Module 2: Data Extraction, Wrangling, & Visualization

Topics:

• Data Analysis Pipeline
• What is Data Extraction?
• Types of Data
• Raw and Processed Data
• Data Wrangling
• Exploratory Data Analysis
• Visualization of Data

Module 3: Introduction to Machine Learning with Python

Topics:

• Python Revision (NumPy, Pandas, scikit-learn, Matplotlib)
• What is Machine Learning?
• Machine Learning Use-Cases
• Machine Learning Process Flow
• Machine Learning Categories
• Linear regression
• Gradient descent

Module 4: Supervised Learning - I

Topics:

• What is Classification and its use cases?
• What is Decision Tree?
• Algorithm for Decision Tree Induction
• Creating a Perfect Decision Tree
• Confusion Matrix
• What is Random Forest?

Module 5: Dimensionality Reduction

Topics:

• Introduction to Dimensionality
• Why Dimensionality Reduction?
• PCA
• Factor Analysis
• Scaling dimensional model
• LDA

Module 6: Supervised Learning - II

Topics:

• What is Naïve Bayes?
• How does Naïve Bayes work?
• Implementing Naïve Bayes Classifier
• What is Support Vector Machine?
• Illustrating how Support Vector Machine works
• Hyperparameter optimization
• Grid Search vs Random Search
• Implementation of Support Vector Machine for Classification

Module 7: Unsupervised Learning

Topics:

• What is Clustering & its Use Cases?
• What is K-means Clustering?
• How does the K-means algorithm work?
• How to do optimal clustering?
• What is C-means Clustering?
• What is Hierarchical Clustering?
• How does Hierarchical Clustering work?

Module 8: Association Rules Mining and Recommendation Systems

Topics:

• What are Association Rules?
• Association Rule Parameters
• Calculating Association Rule Parameters
• Recommendation Engines
• How do Recommendation Engines work?
• Collaborative Filtering
• Content Based Filtering

Module 9: Reinforcement Learning

Topics:

• What is Reinforcement Learning?
• Why Reinforcement Learning?
• Elements of Reinforcement Learning
• Exploration vs Exploitation dilemma
• Epsilon Greedy Algorithm
• Markov Decision Process (MDP)
• Q values and V values
• α values

Module 10: Time Series Analysis

Topics:

• What is Time Series Analysis?
• Importance of TSA
• Components of TSA
• White Noise
• AR model
• MA model
• ARMA model
• ARIMA model
• Stationarity
• ACF & PACF

Module 11: Model Selection and Boosting

Topics:

• What is Model Selection?
• Need of Model Selection
• Cross-Validation
• What is Boosting?
• How do Boosting Algorithms work?
• Types of Boosting Algorithms
• Adaptive Boosting

Module 12: In-Class Project

Topics:

• Predict the species of a plant

Advanced Artificial Intelligence Course

Course Curriculum

Course Outline

Module 1: Introduction to Text Mining and NLP

Topics:

• Overview of Text Mining
• Need of Text Mining
• Natural Language Processing (NLP) in Text Mining
• Applications of Text Mining
• OS Module
• Reading, Writing to text and word files
• Setting the NLTK Environment
• Accessing the NLTK Corpora

Module 2: Extracting, Cleaning and Preprocessing Text

Topics:

• Tokenization
• Frequency Distribution
• Different Types of Tokenizers
• Bigrams, Trigrams & Ngrams
• Stemming
• Lemmatization
• Stopwords
• POS Tagging
• Named Entity Recognition

Module 3: Analyzing Sentence Structure


Topics:
• Syntax Trees
• Chunking
• Chinking
• Context Free Grammars (CFG)
• Automating Text Paraphrasing

Module 4: Text Classification-I


Topics:

• Machine Learning: Brush Up

• Bag of Words

• Count Vectorizer

• Term Frequency (TF)

• Inverse Document Frequency (IDF)

Module 5: Introduction to Deep Learning

Topics:

• What is Deep Learning?
• Curse of Dimensionality
• Machine Learning vs. Deep Learning
• Use cases of Deep Learning
• Human Brain vs. Neural Network
• What is Perceptron?
• Learning Rate
• Epoch
• Batch Size
• Activation Function
• Single Layer Perceptron

Module 6: Getting Started with TensorFlow 2.0

Topics:

• Introduction to TensorFlow 2.x
• Installing TensorFlow 2.x
• Defining Sequential model layers
• Activation Function
• Layer Types
• Model Compilation
• Model Optimizer
• Model Loss Function
• Model Training
• Digit Classification using Simple Neural Network in TensorFlow 2.x
• Improving the model
• Adding Hidden Layer
• Adding Dropout
• Using Adam Optimizer

Module 7: Convolutional Neural Network


Topics:

• Image Classification Example
• What is Convolution
• Convolutional Layer Network
• Convolutional Layer
• Filtering
• ReLU Layer
• Pooling
• Data Flattening
• Fully Connected Layer
• Predicting a cat or a dog
• Saving and Loading a Model
• Face Detection using OpenCV

Module 8: Regional CNN

Topics:

• Regional-CNN
• Selective Search Algorithm
• Bounding Box Regression
• SVM in RCNN
• Pre-trained Model
• Model Accuracy
• Model Inference Time
• Model Size Comparison
• Transfer Learning
• Object Detection – Evaluation
• mAP
• IoU
• RCNN – Speed Bottleneck
• Fast R-CNN
• RoI Pooling
• Fast R-CNN – Speed Bottleneck
• Faster R-CNN
• Feature Pyramid Network (FPN)
• Region Proposal Network (RPN)
• Mask R-CNN

Module 9: Boltzmann Machine & Autoencoder

Topics:

• What is Boltzmann Machine (BM)?
• Identify the issues with BM
• Why did RBM come into the picture?
• Step-by-step implementation of RBM
• Distribution of Boltzmann Machine
• Understanding Autoencoders
• Architecture of Autoencoders
• Brief on types of Autoencoders
• Applications of Autoencoders

Module 10: Generative Adversarial Network (GAN)

Topics:

• Which Face is Fake?
• Understanding GAN
• What is Generative Adversarial Network?
• How does GAN work?
• Step by step Generative Adversarial Network implementation
• Types of GAN
• Recent Advances: GAN

Module 11: Emotion and Gender Detection (Self-paced)

Topics:

• Which Face is Fake?
• Understanding GAN
• What is Generative Adversarial Network?
• How does GAN work?
• Step by step Generative Adversarial Network implementation
• Types of GAN
• Recent Advances: GAN

Module 12: Introduction to RNN and GRU (Self-paced)

Topics:

• Issues with Feed Forward Network
• Recurrent Neural Network (RNN)
• Architecture of RNN
• Calculation in RNN
• Backpropagation and Loss calculation
• Applications of RNN
• Vanishing Gradient
• Exploding Gradient
• What is GRU?
• Components of GRU
• Update gate
• Reset gate
• Current memory content
• Final memory at current time step

Module 13: LSTM (Self-paced)

Topics:

• What is LSTM?
• Structure of LSTM
• Forget Gate
• Input Gate
• Output Gate
• LSTM architecture
• Types of Sequence-Based Model
• Sequence Prediction
• Sequence Classification
• Sequence Generation
• Types of LSTM
• Vanilla LSTM
• Stacked LSTM
• CNN LSTM
• Bidirectional LSTM
• How to increase the efficiency of the model?
• Backpropagation through time
• Workflow of BPTT

Module 14: Auto Image Captioning Using CNN LSTM (Self-paced)

Topics:

• Auto Image Captioning
• COCO dataset
• Pre-trained model
• Inception V3 model
• The architecture of Inception V3
• Modify the last layer of a pre-trained model
• Freeze model
• CNN for image processing
• LSTM for text processing

Module 15: Developing a Criminal Identification and Detection Application Using OpenCV (Self-paced)

Topics:

• Why is OpenCV used?
• What is OpenCV
• Applications
• Demo: Build a Criminal Identification and Detection App

Module 16: TensorFlow for Deployment (Self-paced)

Topics:

• Use Case: Amazon’s Virtual Try-Out Room.
• Why Deploy models?
• Model Deployment: Intuit AI models
• Model Deployment: Instagram’s Image Classification Models
• What is Model Deployment
• Types of Model Deployment Techniques
• TensorFlow Serving
• Browser-based Models
• What is TensorFlow Serving?
• What are Servables?
• Demo: Deploy the Model in Practice using TensorFlow Serving
• Introduction to Browser based Models
• Demo: Deploy a Deep Learning Model in your Browser.

Module 17: Text Classification-II (Self-paced)

Topics:

• Converting text to features and labels
• Multinomial Naive Bayes Classifier
• Leveraging Confusion Matrix

Module 18: In Class Project (Self-paced)

Topics:

• Sentiment Classification on Movie Rating Dataset

ChatGPT Complete Course: Beginners to Advanced
Course Curriculum

Course Outline

Module 1: Introduction to OpenAI and ChatGPT

Topics:

• Emergence of ChatGPT
• What is ChatGPT?
• How does ChatGPT work?
• Applications of ChatGPT
• Introduction to OpenAI and its role in NLP and AI
• Overview of OpenAI's GPT models (e.g., GPT-2 and GPT-3)
• Environment setup
• Sign up for an OpenAI API account

Module 2: Business Use Cases of ChatGPT

Topics:

• Using ChatGPT for live coding
• Build, optimize, and scale business using ChatGPT
• Advanced SEO for digital marketers
• Creating social media posts with ChatGPT
• Using ChatGPT for language translation
• Using ChatGPT for YouTube scripts
• Code generation and code debugging with ChatGPT
• Content creation with ChatGPT
• Question answering
• Sentiment analysis

Module 3: Developing Web Application using ChatGPT

Topics:

• Building web development architecture
• Building backend server
• Setting up the database
• Setting up a React-based client-side application
• Writing user API requests to MongoDB with Express and React
• Fetching and updating the database with MongoDB API and routing with Express
• Routing to React-based client-side application
• Debugging and client-side coding

Module 4: Deploying and Integrating ChatGPT in Business Applications

Topics:

• Create serverless ChatGPT
• Integrate ChatGPT with Power Automate
• Integrate ChatGPT with Power Apps
• Integrate ChatGPT with Outlook
• Integrate ChatGPT with Bubble
• Integrate ChatGPT with Airtable
• Deployment on cloud platforms

Module 5: GPT Models, Pre-processing and Fine-tuning ChatGPT (Self-paced)

Topics:

• Overview of language models
• Understanding the architecture of the GPT model
• GPT models: advantages and disadvantages
• Overview of the pre-trained GPT models available for fine-tuning
• Training of ChatGPT
• Data preparation
• Model architecture
• Hyperparameter tuning
• Training process

Module 6: Working with GPT-3 and OpenAI API (Self-paced)

Topics:

• Introduction to GPT-3 and its capabilities
• Democratizing NLP
• Understanding prompts, completions, and tokens
• Understanding GPT-3 risks
• Understanding general GPT-3 use cases
• Content filtering
• Sentiment analysis using GPT-3
• Text summarization using GPT-3
• Question answering and information retrieval
• Introducing the Playground
• Handling text generation and classification tasks
• Understanding semantic search
• Understanding APIs
• Getting familiar with HTTP
• Reviewing the OpenAI API endpoints
• Introducing CURL and Postman
• Understanding API authentication
• Making an authenticated request to the OpenAI API
• Introducing JSON
• Using the Completions endpoint
• Using the Semantic Search endpoint

Module 7: Building and Deploying GPT-3 Powered Application (Self-paced)

Topics:

• Setting up the GPT-3 API and integrating it into projects
• Building conversational AI for finance and e-commerce domain
• Strengths and limitations of GPT-3 in building conversational AI
• Scaling and deploying GPT-3 models to production

Module 8: Building Real-world Applications with OpenAI API and ChatGPT (Self-paced)

Topics:

• Build and deploy ChatGPT AI app
• Build diet planning application
• Build a website and create landing page content using ChatGPT

Module 9: ChatGPT: Best Practices, Limitations, and Avenues for Future Development (Self-paced)

Topics:

• Ethical considerations
• Limitations of ChatGPT
• Best practices for using ChatGPT
• Future developments in the field of ChatGPT
• Opportunities for further learning and research

PySpark Certification Training Course


Course Curriculum

Course Outline

Module 1: Introduction to Big Data Hadoop and Spark

Topics:

• What is Big Data?
• Big Data Customer Scenarios
• Limitations and Solutions of Existing Data Analytics Architecture with Uber Use Case
• How Hadoop Solves the Big Data Problem?
• What is Hadoop?
• Hadoop’s Key Characteristics
• Hadoop Ecosystem and HDFS
• Hadoop Core Components
• Rack Awareness and Block Replication
• YARN and its Advantage
• Hadoop Cluster and its Architecture
• Hadoop: Different Cluster Modes
• Big Data Analytics with Batch & Real-Time Processing
• Why is Spark Needed?
• What is Spark?
• How Spark Differs from its Competitors?
• Spark at eBay
• Spark’s Place in Hadoop Ecosystem

Module 2: Introduction to Python for Apache Spark

Topics:

• Overview of Python
• Different Applications where Python is Used
• Values, Types, Variables
• Operands and Expressions
• Conditional Statements
• Loops
• Command Line Arguments
• Writing to the Screen
• Python files I/O Functions
• Numbers
• Strings and related operations
• Tuples and related operations
• Lists and related operations
• Dictionaries and related operations
• Sets and related operations

Module 3: Functions, OOPS, and Modules in Python

Topics:

• Spark Components & its Architecture

• Spark Deployment Modes

• Introduction to PySpark Shell

• Submitting PySpark Job

• Spark Web UI

• Writing your first PySpark Job Using Jupyter Notebook

• Data Ingestion using Sqoop

Module 4: Playing with Spark RDDs

Topics:

• Challenges in Existing Computing Methods

• Probable Solution & How RDD Solves the Problem

• What is RDD, Its Operations, Transformations & Actions

• Data Loading and Saving Through RDDs

• Key-Value Pair RDDs

• Other Pair RDDs, Two Pair RDDs

• RDD Lineage

• RDD Persistence

• WordCount Program Using RDD Concepts

• RDD Partitioning & How it Helps Achieve Parallelization

• Passing Functions to Spark

Module 5: DataFrames and Spark SQL

Topics:

• Need for Spark SQL

• What is Spark SQL

• Spark SQL Architecture

• SQL Context in Spark SQL

• Schema RDDs

• User Defined Functions

• Data Frames & Datasets

• Interoperating with RDDs

• JSON and Parquet File Formats

• Loading Data through Different Sources

• Spark-Hive Integration

Module 6: Machine Learning using Spark MLlib

Topics:

• Why Machine Learning

• What is Machine Learning

• Where Machine Learning is used

• Face Detection: USE CASE

• Different Types of Machine Learning Techniques

• Introduction to MLlib

• Features of MLlib and MLlib Tools

• Various ML algorithms supported by MLlib

Module 7: Deep Dive into Spark MLlib

Topics:
• Supervised Learning: Linear Regression, Logistic Regression, Decision Tree, Random Forest
• Unsupervised Learning: K-Means Clustering & How It Works with MLlib
• Analysis of US Election Data using MLlib (K-Means)

Module 8: Understanding Apache Kafka and Apache Flume

Topics:
• Need for Kafka
• What is Kafka
• Core Concepts of Kafka
• Kafka Architecture
• Where is Kafka Used
• Understanding the Components of Kafka Cluster
• Configuring Kafka Cluster
• Kafka Producer and Consumer Java API
• Need of Apache Flume
• What is Apache Flume
• Basic Flume Architecture
• Flume Sources
• Flume Sinks
• Flume Channels
• Flume Configuration
• Integrating Apache Flume and Apache Kafka

Module 9: Apache Spark Streaming - Processing Multiple Batches

Topics:
• Drawbacks in Existing Computing Methods
• Why Streaming is Necessary
• What is Spark Streaming
• Spark Streaming Features
• Spark Streaming Workflow
• How Uber Uses Streaming Data
• Streaming Context & DStreams
• Transformations on DStreams
• Describe Windowed Operators and Why They Are Useful
• Important Windowed Operators
• Slice, Window and ReduceByWindow Operators
• Stateful Operators

Module 10: Apache Spark Streaming - Data Sources

Topics:
• Apache Spark Streaming: Data Sources
• Streaming Data Source Overview
• Apache Flume and Apache Kafka Data Sources
• Example: Using a Kafka Direct Data Source

Module 11: Spark GraphX (Self-Paced)

Topics:
• Introduction to Spark GraphX
• Information about a Graph
• GraphX Basic APIs and Operations
• Spark GraphX Algorithm - PageRank, Personalized PageRank, Triangle Count, Shortest Paths, Connected Components, Strongly Connected Components, Label Propagation
