
Unit II

Introduction to Deep Learning


Deep Learning (DL) is a subset of Machine Learning (ML) that uses
artificial neural networks (ANNs) to model and solve complex
problems. It is particularly useful when dealing with large datasets
and high-dimensional data like images, audio, and text.
Key Points:
• Inspired by the human brain: DL models are made of layers of
neurons that mimic the brain’s neural network.
• Automatic feature extraction: Unlike traditional ML, DL can
learn features from raw data without manual feature
engineering.
• Applications:
o Image recognition (e.g., detecting cats/dogs in photos)
o Speech recognition (e.g., Siri, Alexa)
o Natural language processing (e.g., ChatGPT, translation)
o Self-driving cars (detecting lanes, obstacles)
Flow of Deep Learning:
Input Data → Neural Network (Layers) → Features automatically
learned → Prediction/Output

Why Deep Learning?


• Can handle complex and unstructured data (images, video,
audio).
• Scales well with large amounts of data.
• Higher accuracy compared to traditional ML for complex tasks.
• Learns hierarchical representations of data (low-level to high-
level features).

Deep Learning Architectures


Deep Learning architectures are different types of neural networks
designed for specific tasks.
A. Feedforward Neural Network (FNN) / Multi-Layer Perceptron
(MLP)
• Structure:
o Input Layer → Hidden Layers → Output Layer
o Neurons in each layer are fully connected to the next
layer.
• How it works: Data flows forward from input to output.
• Use cases: Tabular data, basic classification, regression.
• Limitations: Cannot capture spatial or sequential patterns well
(like images or time series).
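The forward flow described above (Input Layer → Hidden Layers → Output Layer, fully connected) can be sketched in a few lines of NumPy. This is a minimal illustration, not a full framework; the layer sizes, random weights, and function names are invented for the example.

```python
import numpy as np

def relu(x):
    # Element-wise ReLU: max(0, x)
    return np.maximum(0, x)

def mlp_forward(x, weights, biases):
    """Forward pass through a fully connected network.

    Each hidden layer applies activation(x @ W + b); the output
    layer is left linear here (e.g., for regression)."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = relu(a @ W + b)                  # hidden layer: linear step + non-linearity
    return a @ weights[-1] + biases[-1]      # output layer (no activation)

# Example: 4 inputs -> 8 hidden neurons -> 3 outputs
rng = np.random.default_rng(0)
weights = [rng.standard_normal((4, 8)), rng.standard_normal((8, 3))]
biases = [np.zeros(8), np.zeros(3)]

x = rng.standard_normal((2, 4))              # batch of 2 samples
out = mlp_forward(x, weights, biases)
print(out.shape)                             # (2, 3)
```

Note how every neuron in one layer feeds every neuron in the next: that is exactly the matrix multiplication `a @ W`.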

B. Convolutional Neural Network (CNN)


• Specialized for images and spatial data.
• Key layers:
o Convolutional Layer: Extracts features like edges,
textures.
o Pooling Layer: Reduces dimensionality (max pooling,
average pooling).
o Fully Connected Layer: For classification or final output.
• Use cases: Image classification, object detection, face
recognition.
Example Flow:
Image → Conv Layer → ReLU → Pooling → Flatten → Fully Connected
→ Output
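The Conv → ReLU → Pooling part of the flow above can be sketched directly in NumPy. This is a toy illustration with a hypothetical 6×6 "image" and a hand-made 2×2 kernel; real CNNs use many learned kernels and channels.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (really cross-correlation, as in most DL libraries)."""
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

def max_pool(x, size=2):
    """Non-overlapping max pooling: keeps the strongest response per window."""
    H, W = x.shape
    H2, W2 = H // size, W // size
    return x[:H2*size, :W2*size].reshape(H2, size, W2, size).max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)    # toy 6x6 "image"
edge_kernel = np.array([[1., -1.], [1., -1.]])      # crude vertical-edge detector

features = np.maximum(0, conv2d(image, edge_kernel))  # Conv layer -> ReLU
pooled = max_pool(features)                           # Pooling layer
print(features.shape, pooled.shape)                   # (5, 5) (2, 2)
```

Pooling shrinks the feature map (here from 5×5 to 2×2), which is exactly the dimensionality reduction mentioned above.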

C. Recurrent Neural Network (RNN)


• Specialized for sequential data (time series, text).
• Characteristic: Has memory — it uses previous information to
influence current output.
• Problems: Vanishing gradient for long sequences.
• Variants:
o LSTM (Long Short-Term Memory) → Solves long-term
dependency issues.
o GRU (Gated Recurrent Unit) → Simpler and faster than
LSTM.
• Use cases: Language modeling, speech recognition, stock
prediction.
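The "memory" of an RNN is just a hidden state carried from one time step to the next. A minimal NumPy sketch of a vanilla RNN forward pass (weights and sizes invented for illustration):

```python
import numpy as np

def rnn_forward(xs, Wx, Wh, b):
    """Vanilla RNN: the hidden state h carries information across steps.

    h_t = tanh(x_t @ Wx + h_{t-1} @ Wh + b)"""
    h = np.zeros(Wh.shape[0])
    history = []
    for x in xs:                          # process the sequence one step at a time
        h = np.tanh(x @ Wx + h @ Wh + b)  # current output depends on previous h
        history.append(h)
    return np.array(history)

rng = np.random.default_rng(1)
seq = rng.standard_normal((5, 3))         # sequence of 5 steps, 3 features each
Wx = rng.standard_normal((3, 4)) * 0.1    # input-to-hidden weights
Wh = rng.standard_normal((4, 4)) * 0.1    # hidden-to-hidden (recurrent) weights
b = np.zeros(4)

hs = rnn_forward(seq, Wx, Wh, b)
print(hs.shape)                           # (5, 4): one hidden state per time step
```

The repeated multiplication by `Wh` over long sequences is what causes the vanishing gradient problem that LSTM and GRU were designed to address.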

D. Autoencoder
• Purpose: Unsupervised learning for dimensionality reduction
or data compression.
• Structure: Input → Encoder → Bottleneck → Decoder →
Reconstructed Output
• Use cases: Image denoising, anomaly detection, feature
learning.
Machine Learning vs Deep Learning:

Feature             | Machine Learning (ML)                          | Deep Learning (DL)
Definition          | Algorithms learn from data to make predictions | Neural networks with many layers learn features automatically
Data Needed         | Small to medium datasets                       | Large datasets
Feature Engineering | Manual (you decide features)                   | Automatic (network finds features)
Best For            | Structured/tabular data                        | Images, videos, audio, text
Speed               | Faster to train                                | Slower, needs GPU
Complexity          | Simple models                                  | Very complex (deep neural networks)
Accuracy            | Good for simple tasks                          | High for complex tasks
Interpretation      | Easy to understand                             | Hard ("black box")

Representation Learning:


Definition:
Representation Learning is a technique in Machine Learning / Deep
Learning where the model automatically discovers useful features
or representations from raw data instead of relying on manually
crafted features.
What it Means
In simple terms, the computer learns to understand the data by itself:
it discovers which features are important instead of being told.
Example:
If you give a lot of cat and dog photos —
• You don’t tell the computer “look at ears or tails.”
• It learns by itself what makes a cat or dog different.
That’s representation learning — learning how to represent data
automatically.

Why It’s Needed


In traditional Machine Learning:
You had to do feature engineering — manually find features like
“color”, “shape”, etc.
In Deep Learning:
The model itself learns features — from simple to complex, layer by
layer.

How It Works (Step by Step)


Imagine an image going through a Deep Neural Network (like CNN):
First layer: learns edges and corners
Next layer: learns shapes like eyes or wheels
Final layers: learn full objects like faces or cars
So, each layer builds better and smarter representations of data.

Types of Representation Learning


Type            | Description                  | Example
Supervised      | Learns from labeled data     | CNN classifying cats/dogs
Unsupervised    | Learns from unlabeled data   | Autoencoder compressing images
Self-Supervised | Learns using its own signals | BERT predicting missing words

Advantages
• No need to manually design features
• Works on images, text, speech
• Learns useful patterns automatically
• Improves accuracy and generalization
• Helps in Transfer Learning (using knowledge from one task on another)

Width vs Depth of Neural Networks (Simple Version)

Width
• Means how many neurons are in a single layer
• Wide network = more neurons in a layer
• Learns more features at once
• Too wide → might overfit
Example:
• 1 hidden layer with 100 neurons → wide
• 1 hidden layer with 10 neurons → narrow

Depth
• Means how many hidden layers the network has
• Deep network = more layers
• Learns complex patterns step by step
• Too deep → hard to train (vanishing gradient)
Example:
• 3 hidden layers → deep
• 1 hidden layer → shallow

Quick Comparison Table

Feature    | Width                        | Depth
What it is | Neurons per layer            | Number of layers
Strength   | Learns many features at once | Learns complex patterns
Weakness   | Can overfit                  | Harder to train
Best For   | Simple data                  | Complex data (images, text)

Memory Tip:
• Width = fat layer (more neurons)
• Depth = tall network (more layers)
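Width and depth trade off differently in parameter count. A short sketch (the layer sizes are just the examples from above; `mlp_param_count` is a helper invented for this illustration):

```python
def mlp_param_count(layer_sizes):
    """Number of trainable parameters (weights + biases) in a fully
    connected network with the given layer sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

# Wide: 10 inputs, one hidden layer of 100 neurons, 1 output
wide = mlp_param_count([10, 100, 1])
# Deep: 10 inputs, three hidden layers of 10 neurons each, 1 output
deep = mlp_param_count([10, 10, 10, 10, 1])

print(wide, deep)  # 1201 vs 341
```

The wide network has far more parameters in this case (more capacity, more risk of overfitting), while the deep one stacks smaller transformations that compose into complex patterns.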
Activation Functions in Neural Networks:
Definition:
An activation function decides whether a neuron should be
activated or not, introducing non-linearity into the network. Without
it, neural networks would just be linear models, no matter how many
layers they have.

ReLU (Rectified Linear Unit)


Formula:
f(x) = max(0, x)

Meaning:
• If input > 0 → output = input
• If input ≤ 0 → output = 0
Graph:
• Straight line for x > 0
• Flat at 0 for x ≤ 0
Pros:
• Simple and fast to compute
• Helps avoid vanishing gradient problem in deep networks
• Works well in practice for CNNs and many deep networks
Cons:
• “Dying ReLU” problem: Neurons can get stuck at 0 and stop
learning if inputs are always negative
Leaky ReLU (LReLU)
Formula:
f(x) = x     if x > 0
f(x) = αx    if x ≤ 0

• Typically, α = 0.01
Meaning:
• Positive inputs → output = input
• Negative inputs → output = small negative value (not zero)
Graph:
• Slight slope for negative inputs (instead of flat)
Pros:
• Solves dying ReLU problem
• Allows gradient to flow even for negative inputs
Cons:
• Slightly more complex than ReLU
• α is a hyperparameter that needs tuning

ELU (Exponential Linear Unit)


Formula:
f(x) = x             if x > 0
f(x) = α(e^x − 1)    if x ≤ 0

• α is usually 1.0
Meaning:
• Positive inputs → output = input
• Negative inputs → output = smooth exponential curve
approaching -α
Graph:
• Smooth, continuous curve for negative values
• Linear for positive values
Pros:
• Helps vanishing gradient problem
• Smooth output for negative inputs → better learning
• Can converge faster than ReLU
Cons:
• More computationally expensive than ReLU/LReLU
• Slightly more complex to implement
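All three activation functions follow directly from their formulas. A minimal NumPy sketch for comparison (the test values are arbitrary):

```python
import numpy as np

def relu(x):
    # f(x) = max(0, x): negatives are clipped to zero
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Negative inputs get a small slope alpha*x instead of a hard zero
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    # Negative inputs follow a smooth curve alpha*(e^x - 1), approaching -alpha
    return np.where(x > 0, x, alpha * (np.exp(x) - 1))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))        # negatives become 0
print(leaky_relu(x))  # negatives become small negative values
print(elu(x))         # negatives follow the exponential curve
```

Notice that all three agree exactly for positive inputs; they differ only in how they treat the negative side, which is precisely the "dying ReLU" issue discussed above.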

Unsupervised Training of Neural Networks

What is Unsupervised Training?


Definition:
Unsupervised training means training a neural network without
labeled data.
• The network tries to find patterns, structures, or
representations in the input data by itself.
• There is no “correct output” provided.
Key Idea:
The network learns relationships, clusters, or features directly from
the data.

Why Use Unsupervised Training?


• Labeled data is expensive or hard to get
• Helps the network discover hidden structures
• Useful for feature learning, clustering, dimensionality
reduction
Applications:
• Autoencoders: Compress and reconstruct data → feature
extraction
• Clustering Networks: Organize similar data points together
• Generative Models (GANs): Learn to generate new data similar
to training data

Step-by-Step Working
Let’s break down how it actually happens
Step 1: Input Data
You give the network raw data (like images, sounds, or text) — but
no labels.
Example:
• Images of cats and dogs are given
• The network does not know which is which

Step 2: Feature Extraction / Encoding


The neural network tries to capture patterns in the data — for
example:
• Which pixels are similar
• What shapes or textures repeat
• What parts of data are common
This is usually done by an Encoder network (in Autoencoders) or
hidden layers that learn compressed information.

Step 3: Representation Learning


The network converts input into a latent representation — a
compact form that captures important features.
Think of it as a “summary” of the input.
Example:
Instead of remembering every pixel of a face image, it learns:
• Shape of face
• Eyes position
• Mouth curve

Step 4: Reconstruction or Similarity Task


The network then tries to recreate the input or find patterns from
the learned representation.
There are 3 main methods:
1. Autoencoders:
o Network encodes → decodes → compares output to input
o Learns by reducing reconstruction error
Loss = ||Input − Output||²

2. Clustering Networks (like SOMs):


o Neurons organize themselves into groups based on similar
inputs
3. Generative Models (like GANs):
o Network learns to generate new data similar to input data

Step 5: Weight Update (Learning Process)


The network still uses backpropagation, but instead of a “label-based
loss,” it uses:
• Reconstruction loss (for autoencoders)
• Distribution loss (for GANs)
• Similarity measure (for clustering)
Weights are updated to minimize these losses → so the network gets
better at representing the input structure.

Example: Autoencoder Working


Here’s how it works practically
1. Input: Image (say, a handwritten digit)
2. Encoder: Compresses image → smaller feature vector
3. Latent space: Stores key patterns of the digit
4. Decoder: Rebuilds the original image from the compressed
version
5. Loss: Difference between input and output (MSE)
6. Backpropagation: Updates weights to minimize this difference
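The six steps above can be sketched end to end with a tiny linear autoencoder in NumPy. This is a deliberately simplified illustration: the data is random, the encoder/decoder are single linear layers without activations, and the gradients are written out by hand (up to a constant factor absorbed into the learning rate).

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 8))     # 200 samples, 8 features (stand-in for pixels)

# Linear autoencoder: 8 inputs -> 3-unit bottleneck -> 8 outputs
W_enc = rng.standard_normal((8, 3)) * 0.1
W_dec = rng.standard_normal((3, 8)) * 0.1
lr = 0.05
losses = []

for epoch in range(1000):
    Z = X @ W_enc                     # Step 2: encoder compresses to latent space
    X_hat = Z @ W_dec                 # Step 4: decoder rebuilds the input
    err = X_hat - X                   # Step 5: reconstruction error
    losses.append(np.mean(err ** 2))  # MSE loss
    # Step 6: gradient descent on both weight matrices
    grad_dec = (Z.T @ err) / len(X)
    grad_enc = (X.T @ (err @ W_dec.T)) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

print(round(losses[0], 3), round(losses[-1], 3))  # loss shrinks over training
```

The 3-unit bottleneck forces the network to keep only the most useful structure of the 8-dimensional input, which is the "summary" idea described above.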
Real-Life Applications
• Image compression and reconstruction
• Feature extraction for other models
• Anomaly detection
• Clustering and pattern discovery
• Pretraining models before supervised learning

Restricted Boltzmann Machines (RBMs)

What is an RBM?
Definition:
A Restricted Boltzmann Machine (RBM) is a type of unsupervised
neural network that learns patterns in data and can represent
complex probability distributions.
• It’s called “restricted” because connections only exist between
layers, not within a layer.
• It’s a stochastic neural network (neurons have probabilities,
not fixed outputs).
Use: Mainly for feature learning, dimensionality reduction, and
pretraining deep networks.

Structure of RBM
RBM has two layers:
1. Visible Layer (v):
o Represents the input data
o Example: Pixels of an image
2. Hidden Layer (h):
o Learns features or patterns from visible layer
Key Point:
• No connections between neurons within the same layer (this
is why it’s “restricted”)
• All visible neurons connect to all hidden neurons
Diagram (simple view):
Visible Layer: v1 v2 v3 ... vn
⬇ ⬇ ⬇
Hidden Layer: h1 h2 h3 ... hm

Step-by-Step Working
Step 1: Input Data → Visible Layer
• Feed the raw input data into the visible layer
• Example: An image of a handwritten digit

Step 2: Activate Hidden Layer Probabilistically


• Each hidden neuron computes a weighted sum of its inputs plus a bias:

p(h_j = 1 | v) = σ(Σ_i w_ij v_i + b_j)

• σ = sigmoid function → outputs the probability of the neuron being ON
• Hidden neurons turn ON or OFF stochastically based on this probability
Step 3: Reconstruct the Input
• Using the activated hidden neurons, the network reconstructs the visible layer:

p(v_i = 1 | h) = σ(Σ_j w_ij h_j + a_i)

• This gives a reconstructed input v′
• Idea: the network tries to reproduce the input from the hidden representation

Step 4: Compute Reconstruction Error


• Compare original input (v) with reconstructed input (v')
Error = v − v′

• This tells the network how well it has captured the features

Step 5: Update Weights (Learning)


• Weights and biases are updated to minimize reconstruction
error
• Common algorithm: Contrastive Divergence (CD)
o Approximate method → fast and works well
o Updates weights iteratively:
Δw_ij = η(⟨v_i h_j⟩_data − ⟨v_i h_j⟩_reconstruction)
• Repeat steps 1–5 for many epochs until the network learns the
patterns.
Mathematical Idea
• Energy Function: Measures how good a configuration of
neurons is

E(v, h) = − Σ_i a_i v_i − Σ_j b_j h_j − Σ_i Σ_j v_i w_ij h_j

Where:
• v_i = visible neuron
• h_j = hidden neuron
• w_ij = weight between visible and hidden neuron
• a_i, b_j = biases
• The network learns weights (w) that minimize energy → better
feature representation
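Steps 1–5 can be sketched as one-step Contrastive Divergence (CD-1) in NumPy. This is a toy illustration only: the binary data is invented (each sample is [x, x, x, y, y, y], so there is obvious structure for the 3 hidden units to capture), and the reconstruction check uses mean-field probabilities rather than sampled states.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Toy binary data with structure: first 3 visible units equal, last 3 equal
bits = rng.integers(0, 2, size=(40, 2)).astype(float)
v0 = np.repeat(bits, 3, axis=1)        # 40 samples, 6 visible units

n_visible, n_hidden = 6, 3
W = rng.standard_normal((n_visible, n_hidden)) * 0.1
a = np.zeros(n_visible)                # visible biases
b = np.zeros(n_hidden)                 # hidden biases
lr = 0.1

def reconstruct(v):
    ph = sigmoid(v @ W + b)            # p(h=1|v)
    return sigmoid(ph @ W.T + a)       # p(v=1|h), using probabilities directly

err_before = np.mean((v0 - reconstruct(v0)) ** 2)

for _ in range(300):
    # Positive phase: hidden probabilities and stochastic ON/OFF states
    ph0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: reconstruct v, then re-infer h (one Gibbs step = CD-1)
    pv1 = sigmoid(h0 @ W.T + a)
    ph1 = sigmoid(pv1 @ W + b)
    # Contrastive Divergence update: <v h>_data - <v h>_reconstruction
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(v0)
    a += lr * np.mean(v0 - pv1, axis=0)
    b += lr * np.mean(ph0 - ph1, axis=0)

err_after = np.mean((v0 - reconstruct(v0)) ** 2)
print(round(err_before, 3), round(err_after, 3))  # reconstruction error drops
```

Note the "restricted" structure in the code: the only weight matrix `W` connects visible to hidden units; there are no visible-visible or hidden-hidden weights.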

Autoencoders (AEs)

What is an Autoencoder?
Definition:
An Autoencoder is a type of unsupervised neural network that
learns to compress data and then reconstruct it back as accurately as
possible.
• Input → Encoder → Latent space → Decoder → Output
• Goal: Output ≈ Input
Key Idea:
Autoencoders learn a compact representation (features) of the input
data automatically.

Structure of an Autoencoder
1. Input Layer: Raw data
2. Encoder: Compresses input into a smaller latent
representation
3. Latent Space / Bottleneck: Stores the compressed features
4. Decoder: Reconstructs the input from the latent representation
5. Output Layer: Reconstructed data
Diagram (simplified):
Input ---> [Encoder] ---> Latent Space ---> [Decoder] ---> Output

How Autoencoders Work (Step-by-Step)


Step 1: Feed Input
• Raw data (image, text, audio) is fed into the input layer
Step 2: Encode
• Encoder compresses input into latent features
• Reduces dimensionality while preserving important info
Step 3: Decode
• Decoder reconstructs the original input from latent features
Step 4: Calculate Loss
• Compare reconstructed output with original input
• Loss function: Mean Squared Error (MSE) or Binary Cross-
Entropy
Loss = ||Input − Output||²

Step 5: Update Weights


• Backpropagation adjusts weights in encoder + decoder to
minimize reconstruction error
Step 6: Repeat
• Repeat for many epochs → network learns best representation
of input data

Types of Autoencoders
1. Undercomplete AE
2. Sparse AE
3. Denoising AE
4. Variational AE (VAE)
