11-Nonlinear Models (Neural Networks)
Linear models
In previous topics, we mainly dealt with linear models
• Regression: $h(\boldsymbol{x}) = \boldsymbol{w}^\top \boldsymbol{x} + b$
• Classification:
$h(\boldsymbol{x}) = \arg\max_i \; \boldsymbol{w}_i^\top \boldsymbol{x} + b_i$
Category $i$ is preferred over Category $j$ iff
$\boldsymbol{w}_i^\top \boldsymbol{x} + b_i \ge \boldsymbol{w}_j^\top \boldsymbol{x} + b_j$, i.e., $(\boldsymbol{w}_i - \boldsymbol{w}_j)^\top \boldsymbol{x} + (b_i - b_j) \ge 0$
Binary classification with logistic regression is a special case of the above.
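For concreteness, a minimal NumPy sketch of the multiclass linear classifier above; the weight matrix W, bias vector b, and the test point are made-up illustrations, not values from the notes.

import numpy as np

# Hypothetical 3-class linear classifier on 2-D inputs:
# one row w_i and one bias b_i per category.
W = np.array([[ 1.0, -0.5],
              [-0.3,  0.8],
              [ 0.2,  0.2]])
b = np.array([0.1, -0.2, 0.0])

def predict(x):
    """Return argmax_i (w_i^T x + b_i)."""
    scores = W @ x + b
    return int(np.argmax(scores))

print(predict(np.array([2.0, 1.0])))  # index of the predicted category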
Nonlinear models
• Non-linear features $\boldsymbol{\phi}(\boldsymbol{x})$
○ E.g., Gaussian discriminant analysis with different covariance
matrices per class, in which case the decision function involves quadratic features of $\boldsymbol{x}$.
• Non-linear kernel $k(\boldsymbol{x}_i, \boldsymbol{x}_j)$
○ A kernel is the inner product of two data samples after they are
mapped into a certain vector space. That vector space could be very
high-dimensional (even infinite-dimensional). A linear
classifier in such a high-dimensional space can be non-linear in
the original low-dimensional space (see the worked example after this list).
• Learnable non-linear mapping
○ We can probably stack a few layers of learnable non-linear functions
(e.g., logistic functions) to learn the non-linear feature $\boldsymbol{\phi}(\boldsymbol{x})$ or a
non-linear kernel that is appropriate to the task at hand.
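As a worked example (illustrative, not part of the original notes), the quadratic kernel on $\mathbb{R}^2$ is exactly the inner product of hand-crafted quadratic features, which connects the first two bullets above:
$k(\boldsymbol{x}, \boldsymbol{z}) = (\boldsymbol{x}^\top \boldsymbol{z})^2 = x_1^2 z_1^2 + 2 x_1 x_2 z_1 z_2 + x_2^2 z_2^2 = \boldsymbol{\phi}(\boldsymbol{x})^\top \boldsymbol{\phi}(\boldsymbol{z})$,
where $\boldsymbol{\phi}(\boldsymbol{x}) = \big(x_1^2,\ \sqrt{2}\,x_1 x_2,\ x_2^2\big)^\top$.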
Motivation: XOR, a nonlinear classification problem
• XOR is not linearly separable, so the problem cannot be solved by logistic regression
• What if we stack multiple logistic regression classifiers?
• The XOR problem is solvable by three linear classifiers
○ One built upon the other two (see the sketch after this list)
○ But this is programming [HW]
§ Some machinery that allows you to specify certain things
§ Programming means you specify these things (usually
heuristically by human intelligence) that can be input to the
machinery
§ The machinery accomplishes a certain task according to your
input (program).
○ Programming is very tedious and only feasible for simple tasks
○ We want to learn the weights.
• Can we learn the weights?
○ Yes, still by gradient descent.
• Can we compute the gradient?
○ Yes, it's still a differentiable function
○ Again, brute-force computation of gradient is very tedious.
○ We need a systematic way of
§ Defining a deep architecture, and
§ Computing its gradient.
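A minimal sketch of the construction above: three linear threshold units with weights picked by hand (the "programming" route) rather than learned.

import numpy as np

def step(z):
    """Binary thresholding activation."""
    return float(z >= 0)

def xor_net(x1, x2):
    x = np.array([x1, x2], dtype=float)
    # Two linear classifiers on the raw inputs (hand-picked weights):
    h_or   = step(np.array([ 1.0,  1.0]) @ x - 0.5)  # fires if x1 OR x2
    h_nand = step(np.array([-1.0, -1.0]) @ x + 1.5)  # fires unless x1 AND x2
    # A third linear classifier built upon the other two: AND of (OR, NAND) = XOR.
    return step(np.array([1.0, 1.0]) @ np.array([h_or, h_nand]) - 1.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, int(xor_net(a, b)))  # prints the XOR truth table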
Artificial Neural Network
• A perceptron [Rosenblatt, 1958]
$z = \boldsymbol{w}^\top \boldsymbol{x} + b$
$y = f(z)$
where $f$ is a binary thresholding function
• A perceptron-like neuron, unit, or node
$z = \boldsymbol{w}^\top \boldsymbol{x} + b$
$y = f(z)$
$f$ is an activation function, e.g., sigmoid, tanh, ReLU
○ Usually we use nonlinear activation
○ Linear activation may be used for regression
• A multi-layer neural network, or a multi-layer perceptron
A common structure is layer-wise fully connected
For each node $j$ at layer $L$:
$z_j^{(L)} = \sum_i w_{ji}^{(L)} \, y_i^{(L-1)} + b_j^{(L)}, \qquad y_j^{(L)} = f\big(z_j^{(L)}\big)$
To simplify notation, we omit the layer index $L$ and call the output of the
current layer $y$ and the input of the current layer $x$, which is the output of the
lower layer. In the simplified notation, $z_j = \sum_i w_{ji} x_i + b_j$ and $y_j = f(z_j)$.
Since we have multiple layers, we need a recursive algorithm that
computes the activation of all nodes automatically.
Forward propagation (FP)
○ Initialization
○ Recursion
○ Termination
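A minimal NumPy sketch of forward propagation for a layer-wise fully connected network; the sigmoid activation, layer sizes, and random weights are illustrative assumptions, not prescribed by the notes.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    """Forward propagation.
    Initialization: the lowest layer's input is the data x.
    Recursion: each layer computes z = W x + b, y = f(z) from the layer below.
    Termination: the top layer's output is the network's prediction."""
    activations = [x]        # y^(0) = x
    pre_activations = []     # store each layer's z for later use in backprop
    for W, b in zip(weights, biases):
        z = W @ activations[-1] + b
        pre_activations.append(z)
        activations.append(sigmoid(z))
    return activations, pre_activations

# Tiny example: a 2 -> 3 -> 1 network with random weights (illustrative only).
rng = np.random.default_rng(0)
weights = [rng.standard_normal((3, 2)), rng.standard_normal((1, 3))]
biases  = [np.zeros(3), np.zeros(1)]
ys, zs = forward(np.array([1.0, -2.0]), weights, biases)
print(ys[-1])  # network output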
Gradient of multi-layer neural networks
Main idea: if we can compute the gradient for one layer, we may use
chain rule to compute the gradient for all layers.
Recursion on what?
We consider a local layer
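The layer-local gradients are not reproduced in these notes; a standard derivation (a sketch assuming the simplified notation $z_j = \sum_i w_{ji} x_i + b_j$, $y_j = f(z_j)$, and an upstream gradient $\partial E / \partial y_j$) is:
$\dfrac{\partial E}{\partial z_j} = \dfrac{\partial E}{\partial y_j}\, f'(z_j), \qquad \dfrac{\partial E}{\partial w_{ji}} = \dfrac{\partial E}{\partial z_j}\, x_i, \qquad \dfrac{\partial E}{\partial b_j} = \dfrac{\partial E}{\partial z_j}, \qquad \dfrac{\partial E}{\partial x_i} = \sum_j w_{ji}\, \dfrac{\partial E}{\partial z_j}$
The last quantity, $\partial E / \partial x_i$, is exactly the upstream gradient for the layer below; passing it down is the recursion.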
Backpropagation (BP)
○ Initialization
○ Recursion
○ Termination
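Continuing the forward-propagation sketch above (and reusing its numpy import, sigmoid(), forward(), weights, and biases), a minimal backpropagation pass; the squared-error loss is an illustrative assumption.

def backward(x, target, weights, biases):
    """Backpropagation for the network in the forward() sketch above.
    Initialization: the output-layer gradient comes from the loss.
    Recursion: each layer turns dE/dy into dE/dW, dE/db, and dE/dx for the layer below.
    Termination: stop after the lowest layer; collect all gradients."""
    ys, zs = forward(x, weights, biases)
    dE_dy = ys[-1] - target                      # d/dy of 0.5 * ||y - target||^2
    grads_W, grads_b = [], []
    for W, b, z, y_in in reversed(list(zip(weights, biases, zs, ys[:-1]))):
        s = sigmoid(z)
        dE_dz = dE_dy * s * (1.0 - s)            # chain rule through the sigmoid
        grads_W.append(np.outer(dE_dz, y_in))    # dE/dW for this layer
        grads_b.append(dE_dz)                    # dE/db for this layer
        dE_dy = W.T @ dE_dz                      # dE/dx, passed to the layer below
    return list(reversed(grads_W)), list(reversed(grads_b))

gW, gb = backward(np.array([1.0, -2.0]), np.array([1.0]), weights, biases)
print(gW[0].shape, gW[1].shape)  # (3, 2) (1, 3)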
• A few more thoughts
○ Non-layerwise connections: visit the nodes in topological order (topological sort)
○ Multiple losses: BP is a linear system, so the gradients contributed by different losses simply add
○ Tied weights: the total derivative is the summation of the partial derivatives over all uses of the shared weight
Auto-differentiation in general
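As a small illustration of general-purpose auto-differentiation, a sketch using PyTorch's reverse-mode autograd (PyTorch is not mentioned in the notes; the expression is arbitrary):

import torch

# Reverse-mode auto-differentiation of an arbitrary differentiable expression.
x = torch.tensor([1.0, -2.0], requires_grad=True)
W = torch.tensor([[0.5, -1.0], [2.0, 0.3]], requires_grad=True)
y = torch.sigmoid(W @ x).sum()   # any composition of differentiable ops
y.backward()                     # builds and traverses the computation graph
print(x.grad)                    # dE/dx
print(W.grad)                    # dE/dW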
Numerical gradient checking
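A minimal sketch of numerical gradient checking with central differences, $\big(E(\boldsymbol{\theta} + \epsilon \boldsymbol{e}_i) - E(\boldsymbol{\theta} - \epsilon \boldsymbol{e}_i)\big) / (2\epsilon)$; the test function and tolerance are illustrative.

import numpy as np

def numerical_gradient(f, theta, eps=1e-5):
    """Central-difference estimate of df/dtheta, one coordinate at a time."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        grad[i] = (f(theta + e) - f(theta - e)) / (2.0 * eps)
    return grad

# Compare an analytic gradient against the numerical one.
f = lambda th: np.sum(th ** 2)      # E(theta) = ||theta||^2
analytic = lambda th: 2.0 * th      # dE/dtheta = 2 * theta
theta = np.array([0.3, -1.2, 2.0])
print(np.allclose(numerical_gradient(f, theta), analytic(theta), atol=1e-6))  # True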