Deep learning
Dr. Aissa Boulmerka
[email protected] 2023-2024
CHAPTER 2
SHALLOW NEURAL NETWORKS
What is a Neural Network?
[Figure: a single unit taking the inputs $x_1, x_2, x_3$ and producing the output $\hat{y} = a$]

Computation graph of one unit (logistic regression):
$x, w, b \;\longrightarrow\; z = w^T x + b \;\longrightarrow\; a = \sigma(z) \;\longrightarrow\; \mathcal{L}(a, y)$
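A minimal NumPy sketch of this single-unit computation graph, assuming the binary cross-entropy loss $\mathcal{L}(a, y) = -\big(y \log a + (1 - y) \log(1 - a)\big)$ (the numeric values are made up for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One logistic-regression unit on a single example with n_x = 3 features.
x = np.array([[1.0], [2.0], [-1.0]])   # input, shape (3, 1)
w = np.array([[0.1], [-0.2], [0.3]])   # weights, shape (3, 1)
b = 0.0                                # bias
y = 1                                  # true label

z = w.T @ x + b                                      # z = w^T x + b, shape (1, 1)
a = sigmoid(z)                                       # a = sigma(z)
loss = -(y * np.log(a) + (1 - y) * np.log(1 - a))    # L(a, y)
print(a.item(), loss.item())
```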
What is a Neural Network?
[Figure: a two-layer network — inputs $x_1, x_2, x_3$, a hidden layer with parameters $W^{[1]}, b^{[1]}$, and an output unit with parameters $W^{[2]}, b^{[2]}$ producing $\hat{y} = a^{[2]}$]

Computation graph of the network:
$x, W^{[1]}, b^{[1]} \longrightarrow z^{[1]} = W^{[1]} x + b^{[1]} \longrightarrow a^{[1]} = \sigma(z^{[1]}) \longrightarrow z^{[2]} = W^{[2]} a^{[1]} + b^{[2]} \longrightarrow a^{[2]} = \sigma(z^{[2]}) \longrightarrow \mathcal{L}(a^{[2]}, y)$
Neural Network Representation
A 2-layer neural network (by convention the input layer is not counted):

[Figure: inputs $x_1, x_2, x_3$ (input layer), four hidden units $a_1^{[1]}, \dots, a_4^{[1]}$ (hidden layer), and one output unit (output layer) producing $\hat{y} = a^{[2]}$]

• Input layer: $a^{[0]} = x$
• Hidden layer: parameters $W^{[1]}$ of shape $(4, 3)$ and $b^{[1]}$ of shape $(4, 1)$; activations $a^{[1]} = \begin{bmatrix} a_1^{[1]} \\ a_2^{[1]} \\ a_3^{[1]} \\ a_4^{[1]} \end{bmatrix}$
• Output layer: parameters $W^{[2]}$ of shape $(1, 4)$ and $b^{[2]}$ of shape $(1, 1)$; output $\hat{y} = a^{[2]}$
Neural Network Representation
[Figure: zooming in on a single unit — it takes $x_1, x_2, x_3$, computes $w^T x + b$, applies $\sigma(z)$, and outputs $a = \hat{y}$]

Each unit performs two steps of computation:
$z = w^T x + b$
$a = \sigma(z)$
Neural Network Representation
[Figure: the same two-step computation repeated for each hidden unit of the network]

The first hidden unit computes:
$z_1^{[1]} = w_1^{[1]T} x + b_1^{[1]}, \quad a_1^{[1]} = \sigma(z_1^{[1]})$

The second hidden unit computes:
$z_2^{[1]} = w_2^{[1]T} x + b_2^{[1]}, \quad a_2^{[1]} = \sigma(z_2^{[1]})$
Neural Network Representation
[Figure: inputs $x_1, x_2, x_3$ feeding the four hidden units $a_1^{[1]}, \dots, a_4^{[1]}$]

Writing out the four hidden units:
$z_1^{[1]} = w_1^{[1]T} x + b_1^{[1]}, \quad a_1^{[1]} = \sigma(z_1^{[1]})$
$z_2^{[1]} = w_2^{[1]T} x + b_2^{[1]}, \quad a_2^{[1]} = \sigma(z_2^{[1]})$
$z_3^{[1]} = w_3^{[1]T} x + b_3^{[1]}, \quad a_3^{[1]} = \sigma(z_3^{[1]})$
$z_4^{[1]} = w_4^{[1]T} x + b_4^{[1]}, \quad a_4^{[1]} = \sigma(z_4^{[1]})$

Stacking the row vectors $w_i^{[1]T}$ into the matrix $W^{[1]}$ of shape $(4, 3)$ vectorizes the whole layer:
$z^{[1]} = \begin{bmatrix} w_1^{[1]T} \\ w_2^{[1]T} \\ w_3^{[1]T} \\ w_4^{[1]T} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + \begin{bmatrix} b_1^{[1]} \\ b_2^{[1]} \\ b_3^{[1]} \\ b_4^{[1]} \end{bmatrix} = \begin{bmatrix} w_1^{[1]T} x + b_1^{[1]} \\ w_2^{[1]T} x + b_2^{[1]} \\ w_3^{[1]T} x + b_3^{[1]} \\ w_4^{[1]T} x + b_4^{[1]} \end{bmatrix} = \begin{bmatrix} z_1^{[1]} \\ z_2^{[1]} \\ z_3^{[1]} \\ z_4^{[1]} \end{bmatrix}$
with shapes $W^{[1]}: (4, 3)$, $x: (3, 1)$, $b^{[1]}$ and $z^{[1]}: (4, 1)$.

$a^{[1]} = \begin{bmatrix} a_1^{[1]} \\ a_2^{[1]} \\ a_3^{[1]} \\ a_4^{[1]} \end{bmatrix} = \sigma(z^{[1]})$
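A rough NumPy sketch of this stacking step, checking that the unit-by-unit loop and the matrix form give the same $z^{[1]}$ (the random values are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 1))        # one input example, shape (3, 1)
W1 = rng.standard_normal((4, 3))       # rows are w_1^[1]T, ..., w_4^[1]T
b1 = rng.standard_normal((4, 1))

# Unit-by-unit computation: one row of W1 at a time.
z_loop = np.array([[W1[i] @ x[:, 0] + b1[i, 0]] for i in range(4)])

# Vectorized computation of the whole layer.
z_vec = W1 @ x + b1                    # z^[1], shape (4, 1)
a1 = sigmoid(z_vec)                    # a^[1] = sigma(z^[1])

assert np.allclose(z_loop, z_vec)      # both routes give the same z^[1]
```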
Neural Network Representation Learning
[Figure: the same 2-layer network — inputs $x_1, x_2, x_3$, hidden units $a_1^{[1]}, \dots, a_4^{[1]}$, output $a^{[2]} = \hat{y}$]

Given input $x = a^{[0]}$:
$z^{[1]} = W^{[1]} x + b^{[1]}$   with shapes $(4, 1) = (4, 3)(3, 1) + (4, 1)$
$a^{[1]} = \sigma(z^{[1]})$   with shapes $(4, 1) \leftarrow (4, 1)$
$z^{[2]} = W^{[2]} a^{[1]} + b^{[2]}$   with shapes $(1, 1) = (1, 4)(4, 1) + (1, 1)$
$a^{[2]} = \sigma(z^{[2]})$   with shapes $(1, 1) \leftarrow (1, 1)$
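As a sketch, the whole forward pass for one example fits in a few NumPy lines; the shape comments mirror the slide (the helper name and random values are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_single_example(x, W1, b1, W2, b2):
    """Forward pass of the 2-layer network for one column vector x."""
    z1 = W1 @ x + b1          # (4, 1) = (4, 3) @ (3, 1) + (4, 1)
    a1 = sigmoid(z1)          # (4, 1)
    z2 = W2 @ a1 + b2         # (1, 1) = (1, 4) @ (4, 1) + (1, 1)
    a2 = sigmoid(z2)          # (1, 1), the prediction y-hat
    return a2

rng = np.random.default_rng(1)
x = rng.standard_normal((3, 1))
W1, b1 = rng.standard_normal((4, 3)), np.zeros((4, 1))
W2, b2 = rng.standard_normal((1, 4)), np.zeros((1, 1))
y_hat = forward_single_example(x, W1, b1, W2, b2)
assert y_hat.shape == (1, 1)
```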
For loop across multiple examples
[Figure: the same 2-layer network, now applied to each training example in turn]

For one example $x$, the forward pass is:
$z^{[1]} = W^{[1]} x + b^{[1]}, \quad a^{[1]} = \sigma(z^{[1]}), \quad z^{[2]} = W^{[2]} a^{[1]} + b^{[2]}, \quad a^{[2]} = \sigma(z^{[2]})$

Applied to the whole training set:
$x^{(1)} \longrightarrow a^{[2](1)} = \hat{y}^{(1)}$
$x^{(2)} \longrightarrow a^{[2](2)} = \hat{y}^{(2)}$
$\quad\vdots$
$x^{(m)} \longrightarrow a^{[2](m)} = \hat{y}^{(m)}$

for i = 1 to m:
    $z^{[1](i)} = W^{[1]} x^{(i)} + b^{[1]}$
    $a^{[1](i)} = \sigma(z^{[1](i)})$
    $z^{[2](i)} = W^{[2]} a^{[1](i)} + b^{[2]}$
    $a^{[2](i)} = \sigma(z^{[2](i)})$

Notation: in $a^{[2](i)}$ the superscript $[2]$ refers to layer 2 and $(i)$ to training example $i$.
Vectorizing across multiple examples
Non-vectorized (loop over the examples):
for i = 1 to m:
    $z^{[1](i)} = W^{[1]} x^{(i)} + b^{[1]}$
    $a^{[1](i)} = \sigma(z^{[1](i)})$
    $z^{[2](i)} = W^{[2]} a^{[1](i)} + b^{[2]}$
    $a^{[2](i)} = \sigma(z^{[2](i)})$

Vectorized (all $m$ examples at once):
$Z^{[1]} = W^{[1]} X + b^{[1]}$
$A^{[1]} = \sigma(Z^{[1]})$
$Z^{[2]} = W^{[2]} A^{[1]} + b^{[2]}$
$A^{[2]} = \sigma(Z^{[2]})$

The training examples are stacked as columns:
$X = \begin{bmatrix} x^{(1)} & x^{(2)} & \cdots & x^{(m)} \end{bmatrix}$ of shape $(n_x, m)$, $\quad Z^{[1]} = \begin{bmatrix} z^{[1](1)} & z^{[1](2)} & \cdots & z^{[1](m)} \end{bmatrix}$, $\quad A^{[1]} = \begin{bmatrix} a^{[1](1)} & a^{[1](2)} & \cdots & a^{[1](m)} \end{bmatrix}$

Horizontally these matrices index the training examples; vertically they index the input features (for $X$) or the hidden units (for $Z^{[1]}$ and $A^{[1]}$).
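A NumPy sketch of the same computation both ways, confirming that the vectorized version reproduces the per-example loop (the sizes are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_x, n_1, m = 3, 4, 5                        # features, hidden units, examples
X = rng.standard_normal((n_x, m))            # examples stacked as columns
W1, b1 = rng.standard_normal((n_1, n_x)), np.zeros((n_1, 1))
W2, b2 = rng.standard_normal((1, n_1)), np.zeros((1, 1))

# Loop version: one column of X at a time.
A2_loop = np.zeros((1, m))
for i in range(m):
    x_i = X[:, i:i+1]                        # keep the column shape (n_x, 1)
    a1 = sigmoid(W1 @ x_i + b1)
    A2_loop[:, i:i+1] = sigmoid(W2 @ a1 + b2)

# Vectorized version: all m examples at once; b1 and b2 broadcast across columns.
A1 = sigmoid(W1 @ X + b1)                    # (n_1, m)
A2 = sigmoid(W2 @ A1 + b2)                   # (1, m)

assert np.allclose(A2, A2_loop)
```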
Justification for vectorized implementation
For the first three examples:
$z^{[1](1)} = W^{[1]} x^{(1)} + b^{[1]}, \quad z^{[1](2)} = W^{[1]} x^{(2)} + b^{[1]}, \quad z^{[1](3)} = W^{[1]} x^{(3)} + b^{[1]}$

Each product $W^{[1]} x^{(i)}$ is a column vector. Stacking the inputs as the columns of $X$ therefore stacks these products as the columns of $W^{[1]} X$:
$W^{[1]} \begin{bmatrix} x^{(1)} & x^{(2)} & x^{(3)} \end{bmatrix} + b^{[1]} = \begin{bmatrix} W^{[1]} x^{(1)} + b^{[1]} & W^{[1]} x^{(2)} + b^{[1]} & W^{[1]} x^{(3)} + b^{[1]} \end{bmatrix} = \begin{bmatrix} z^{[1](1)} & z^{[1](2)} & z^{[1](3)} \end{bmatrix}$

Hence $Z^{[1]} = W^{[1]} X + b^{[1]}$, with $b^{[1]}$ added to every column.
Recap of vectorizing across multiple examples
[Figure: the 2-layer network, shown next to the loop version and the vectorized version]

Loop over the examples:
for i = 1 to m:
    $z^{[1](i)} = W^{[1]} x^{(i)} + b^{[1]}$
    $a^{[1](i)} = \sigma(z^{[1](i)})$
    $z^{[2](i)} = W^{[2]} a^{[1](i)} + b^{[2]}$
    $a^{[2](i)} = \sigma(z^{[2](i)})$

Vectorized, with $X = A^{[0]} = \begin{bmatrix} x^{(1)} & x^{(2)} & \cdots & x^{(m)} \end{bmatrix}$ and $A^{[1]} = \begin{bmatrix} a^{[1](1)} & a^{[1](2)} & \cdots & a^{[1](m)} \end{bmatrix}$:
$Z^{[1]} = W^{[1]} X + b^{[1]}$
$A^{[1]} = \sigma(Z^{[1]})$
$Z^{[2]} = W^{[2]} A^{[1]} + b^{[2]}$
$A^{[2]} = \sigma(Z^{[2]})$
Activation functions
[Figure: the 2-layer network with tanh activations in the hidden layer and a sigmoid at the output]

Hidden layer: $g^{[1]}(z^{[1]}) = \tanh(z^{[1]})$
Output layer: $g^{[2]}(z^{[2]}) = \sigma(z^{[2]})$

Given $x$:
$z^{[1]} = W^{[1]} x + b^{[1]}$
$a^{[1]} = g^{[1]}(z^{[1]})$
$z^{[2]} = W^{[2]} a^{[1]} + b^{[2]}$
$a^{[2]} = g^{[2]}(z^{[2]})$

Common activation functions:
• Sigmoid: $a = \dfrac{1}{1 + e^{-z}}$
• tanh: $a = \dfrac{e^{z} - e^{-z}}{e^{z} + e^{-z}}$
• ReLU: $a = \max(0, z)$
• Leaky ReLU: $a = \max(0.01 z, z)$
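A minimal NumPy sketch of the four activation functions (the helper names are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    return np.tanh(z)

def relu(z):
    return np.maximum(0, z)

def leaky_relu(z, slope=0.01):
    return np.maximum(slope * z, z)

z = np.linspace(-5, 5, 11)
for g in (sigmoid, tanh, relu, leaky_relu):
    print(f"{g.__name__:>10}: {np.round(g(z), 3)}")
```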
Pros and cons of activation functions
• Sigmoid: $a = \dfrac{1}{1 + e^{-z}}$ — almost never use this in hidden layers; the main exception is the output layer when you are doing binary classification.
• tanh: $a = \dfrac{e^{z} - e^{-z}}{e^{z} + e^{-z}}$ — for hidden units, tanh is pretty much strictly superior to the sigmoid.
• ReLU: $a = \max(0, z)$ — the default and most commonly used activation function; if you are not sure what else to use, use the ReLU.
• Leaky ReLU: $a = \max(0.01 z, z)$ — you can also try the leaky ReLU.
Why do you need non-linear activation functions?

[Figure: the same network with a linear ("lin") activation in every unit, output $\hat{y} \in \mathbb{R}$]

Suppose every unit uses the linear (identity) activation function $g(z) = z$. Given $x$:
$z^{[1]} = W^{[1]} x + b^{[1]}, \quad a^{[1]} = g(z^{[1]}) = z^{[1]}$
$z^{[2]} = W^{[2]} a^{[1]} + b^{[2]}, \quad a^{[2]} = g(z^{[2]}) = z^{[2]}$

Substituting:
$a^{[2]} = W^{[2]} \left( W^{[1]} x + b^{[1]} \right) + b^{[2]} = \left( W^{[2]} W^{[1]} \right) x + \left( W^{[2]} b^{[1]} + b^{[2]} \right) = W' x + b'$

So with linear activations the whole network computes nothing more than a linear function of the input, no matter how many layers it has; the hidden layer adds no expressive power.
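A quick numerical check of this collapse, using the layer shapes from the earlier slides (the random values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((4, 3)), rng.standard_normal((4, 1))
W2, b2 = rng.standard_normal((1, 4)), rng.standard_normal((1, 1))
x = rng.standard_normal((3, 1))

# Two layers with the identity activation g(z) = z ...
a2 = W2 @ (W1 @ x + b1) + b2

# ... collapse into a single linear layer W'x + b'.
W_prime = W2 @ W1
b_prime = W2 @ b1 + b2
assert np.allclose(a2, W_prime @ x + b_prime)
```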
Derivatives of activation functions
Sigmoid activation function: $g(z) = \dfrac{1}{1 + e^{-z}}$

$g'(z) = \dfrac{d}{dz} g(z) = \text{slope of } g(z) \text{ at } z = \dfrac{1}{1 + e^{-z}} \left( 1 - \dfrac{1}{1 + e^{-z}} \right) = g(z) \big( 1 - g(z) \big)$

If $a = g(z)$, then $g'(z) = a(1 - a)$.

Sanity checks:
• $z = 10 \Rightarrow g(z) \approx 1 \Rightarrow g'(z) \approx 1 \cdot (1 - 1) \approx 0$
• $z = -10 \Rightarrow g(z) \approx 0 \Rightarrow g'(z) \approx 0 \cdot (1 - 0) \approx 0$
• $z = 0 \Rightarrow g(z) = 1/2 \Rightarrow g'(z) = \frac{1}{2} \left( 1 - \frac{1}{2} \right) = 1/4$
Derivatives of activation functions
tanh activation function: $g(z) = \tanh(z) = \dfrac{e^{z} - e^{-z}}{e^{z} + e^{-z}}$

$g'(z) = \dfrac{d}{dz} g(z) = \text{slope of } g(z) \text{ at } z = 1 - \big( \tanh(z) \big)^2$

If $a = g(z)$, then $g'(z) = 1 - a^2$.

Sanity checks:
• $z = 10 \Rightarrow \tanh(z) \approx 1 \Rightarrow g'(z) \approx 0$
• $z = -10 \Rightarrow \tanh(z) \approx -1 \Rightarrow g'(z) \approx 0$
• $z = 0 \Rightarrow \tanh(z) = 0 \Rightarrow g'(z) = 1$
Derivatives of activation functions
ReLU: $g(z) = \max(0, z)$
$g'(z) = \begin{cases} 1 & \text{if } z \geq 0 \\ 0 & \text{if } z < 0 \end{cases}$

Leaky ReLU: $g(z) = \max(0.01 z, z)$
$g'(z) = \begin{cases} 1 & \text{if } z \geq 0 \\ 0.01 & \text{if } z < 0 \end{cases}$

(Strictly, the derivative is undefined at exactly $z = 0$; in practice either value can be used.)
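A NumPy sketch of these derivatives, with a finite-difference sanity check on the sigmoid case (the helper names are illustrative; the ReLU derivative at exactly $z = 0$ is set to 1 here by convention):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dsigmoid(z):
    a = sigmoid(z)
    return a * (1 - a)            # g'(z) = a(1 - a)

def dtanh(z):
    a = np.tanh(z)
    return 1 - a ** 2             # g'(z) = 1 - a^2

def drelu(z):
    return (z >= 0).astype(float)

def dleaky_relu(z, slope=0.01):
    return np.where(z >= 0, 1.0, slope)

# Finite-difference check of the sigmoid derivative at a few points.
z = np.array([-10.0, 0.0, 10.0])
eps = 1e-5
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
assert np.allclose(numeric, dsigmoid(z), atol=1e-6)
print(dsigmoid(z))                # roughly [0, 0.25, 0]
```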
Gradient descent for neural networks
Parameters: $W^{[1]}$ of shape $(n^{[1]}, n^{[0]})$, $b^{[1]}$ of shape $(n^{[1]}, 1)$, $W^{[2]}$ of shape $(n^{[2]}, n^{[1]})$, $b^{[2]}$ of shape $(n^{[2]}, 1)$, with $n^{[0]} = n_x$ input features, $n^{[1]}$ hidden units, and $n^{[2]} = 1$ output unit.

Cost function:
$J\left(W^{[1]}, b^{[1]}, W^{[2]}, b^{[2]}\right) = \dfrac{1}{m} \sum_{i=1}^{m} \mathcal{L}\left(\hat{y}^{(i)}, y^{(i)}\right)$

Gradient descent:
Repeat {
    Compute the predictions $\hat{y}^{(i)}$, $i = 1, \dots, m$
    $dW^{[1]} = \dfrac{\partial J}{\partial W^{[1]}}, \quad db^{[1]} = \dfrac{\partial J}{\partial b^{[1]}}, \quad dW^{[2]} = \dfrac{\partial J}{\partial W^{[2]}}, \quad db^{[2]} = \dfrac{\partial J}{\partial b^{[2]}}$
    $W^{[1]} := W^{[1]} - \alpha \, dW^{[1]}$
    $b^{[1]} := b^{[1]} - \alpha \, db^{[1]}$
    $W^{[2]} := W^{[2]} - \alpha \, dW^{[2]}$
    $b^{[2]} := b^{[2]} - \alpha \, db^{[2]}$
}
Formulas for computing derivatives
Forward propagation:
$Z^{[1]} = W^{[1]} X + b^{[1]}$
$A^{[1]} = g^{[1]}(Z^{[1]})$
$Z^{[2]} = W^{[2]} A^{[1]} + b^{[2]}$
$A^{[2]} = g^{[2]}(Z^{[2]}) = \sigma(Z^{[2]})$

Back propagation:
$dZ^{[2]} = A^{[2]} - Y$
$dW^{[2]} = \dfrac{1}{m} \, dZ^{[2]} A^{[1]T}$
$db^{[2]} = \dfrac{1}{m} \, \mathrm{np.sum}(dZ^{[2]}, \mathrm{axis}{=}1, \mathrm{keepdims}{=}\mathrm{True})$   — keepdims keeps the shape $(n^{[2]}, 1)$ instead of the rank-1 shape $(n^{[2]},)$
$dZ^{[1]} = W^{[2]T} dZ^{[2]} * g^{[1]\prime}(Z^{[1]})$   — element-wise product; all three factors have shape $(n^{[1]}, m)$
$dW^{[1]} = \dfrac{1}{m} \, dZ^{[1]} X^{T}$
$db^{[1]} = \dfrac{1}{m} \, \mathrm{np.sum}(dZ^{[1]}, \mathrm{axis}{=}1, \mathrm{keepdims}{=}\mathrm{True})$   — keepdims keeps the shape $(n^{[1]}, 1)$ instead of $(n^{[1]},)$, so no reshape is needed
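Putting forward propagation, back propagation, and the gradient-descent update together, a minimal NumPy sketch of one training iteration could look like the following (tanh hidden units and a sigmoid output as on the earlier slides; the function name and learning rate $\alpha$ are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(X, Y, W1, b1, W2, b2, alpha=0.1):
    """One gradient-descent step for the 2-layer network (sketch)."""
    m = X.shape[1]

    # Forward propagation
    Z1 = W1 @ X + b1                                     # (n1, m)
    A1 = np.tanh(Z1)                                     # g^[1] = tanh
    Z2 = W2 @ A1 + b2                                    # (1, m)
    A2 = sigmoid(Z2)                                     # (1, m)

    # Back propagation
    dZ2 = A2 - Y
    dW2 = (1 / m) * dZ2 @ A1.T
    db2 = (1 / m) * np.sum(dZ2, axis=1, keepdims=True)
    dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)                   # g^[1]'(Z1) = 1 - tanh(Z1)^2
    dW1 = (1 / m) * dZ1 @ X.T
    db1 = (1 / m) * np.sum(dZ1, axis=1, keepdims=True)

    # Gradient-descent update
    W1 -= alpha * dW1; b1 -= alpha * db1
    W2 -= alpha * dW2; b2 -= alpha * db2
    return W1, b1, W2, b2
```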
What happens if you initialize weights to zero?
[Figure: a tiny network with inputs $x_1, x_2$, two hidden units $a_1^{[1]}, a_2^{[1]}$, and one output unit $a_1^{[2]} = \hat{y}$; here $n^{[0]} = 2$ and $n^{[1]} = 2$]

Suppose we initialize
$W^{[1]} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \quad b^{[1]} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$

Then $a_1^{[1]} = a_2^{[1]}$ (the hidden units are symmetric), and by the same symmetry $dz_1^{[1]} = dz_2^{[1]}$, so the gradient has identical rows:
$dW^{[1]} = \begin{bmatrix} u & v \\ u & v \end{bmatrix}, \quad W^{[1]} := W^{[1]} - \alpha \, dW^{[1]}$

• The bias terms $b$ can be initialized to 0, but initializing $W$ to all 0s is a problem:
• The two activations $a_1^{[1]}$ and $a_2^{[1]}$ will be the same, because both hidden units compute exactly the same function.
• After every single iteration of training, the two hidden units are still computing exactly the same function, as the sketch below illustrates.
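A small numerical illustration of the symmetry problem: with all weights initialized to zero, the two rows of $W^{[1]}$ (and the two hidden activations) stay identical after every gradient step (the data, learning rate, and iteration count are made up for the demo):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
m = 5
X = rng.standard_normal((2, m))               # n^[0] = 2 inputs
Y = (rng.random((1, m)) > 0.5) * 1.0

W1, b1 = np.zeros((2, 2)), np.zeros((2, 1))   # all-zero initialization
W2, b2 = np.zeros((1, 2)), np.zeros((1, 1))

for _ in range(100):                          # gradient-descent iterations
    A1 = np.tanh(W1 @ X + b1)
    A2 = sigmoid(W2 @ A1 + b2)
    dZ2 = A2 - Y
    dW2 = dZ2 @ A1.T / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)
    dW1 = dZ1 @ X.T / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    W1 -= 0.5 * dW1; b1 -= 0.5 * db1
    W2 -= 0.5 * dW2; b2 -= 0.5 * db2

# The hidden units never break symmetry: the rows of W1 (and of A1) stay equal.
assert np.allclose(W1[0], W1[1]) and np.allclose(A1[0], A1[1])
```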
Random initialization
[Figure: the same tiny network — inputs $x_1, x_2$, hidden units $a_1^{[1]}, a_2^{[1]}$, output $a_1^{[2]} = \hat{y}$]

$W^{[1]}$ = np.random.randn(2, 2) * 0.01
$b^{[1]}$ = np.zeros((2, 1))
$W^{[2]}$ = np.random.randn(1, 2) * 0.01
$b^{[2]}$ = 0

• The multiplier 0.01 keeps the initial values of $z$ small, so tanh or sigmoid units start where their slope is not close to zero and learning is not slowed down.
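Wrapped as a reusable helper for arbitrary layer sizes (the function name and signature are illustrative, not part of the course code):

```python
import numpy as np

def initialize_parameters(n_x, n_h, n_y, scale=0.01, seed=0):
    """Small random weights, zero biases, for a 2-layer network (sketch)."""
    rng = np.random.default_rng(seed)
    W1 = rng.standard_normal((n_h, n_x)) * scale
    b1 = np.zeros((n_h, 1))
    W2 = rng.standard_normal((n_y, n_h)) * scale
    b2 = np.zeros((n_y, 1))
    return W1, b1, W2, b2

W1, b1, W2, b2 = initialize_parameters(n_x=2, n_h=2, n_y=1)
```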
Vectorization demo
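The live demo itself is not reproduced in the deck; presumably it is the classic comparison of an explicit Python loop against np.dot, along these lines (the array size and timings are illustrative):

```python
import time
import numpy as np

n = 1_000_000
a = np.random.rand(n)
b = np.random.rand(n)

# Explicit for loop over the elements.
tic = time.time()
c_loop = 0.0
for i in range(n):
    c_loop += a[i] * b[i]
print(f"for loop:   {1000 * (time.time() - tic):.1f} ms")

# Vectorized dot product with NumPy.
tic = time.time()
c_vec = np.dot(a, b)
print(f"vectorized: {1000 * (time.time() - tic):.1f} ms")

assert np.isclose(c_loop, c_vec)
```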
References
Andrew Ng. Deep learning. Coursera.
Geoffrey Hinton. Neural Networks for Machine Learning.
Kevin P. Murphy. Probabilistic Machine Learning: An Introduction. MIT Press, 2022.