Deep Learning Neural Network - Lecture 1
Neural Network
2019/2/25 1
Course Details
• Contents:
• Introduction
• Programming frameworks
• Applications, data collection, data preprocessing, features selection
• Neural Network and Deep Learning Architecture
• Convolutional Neural Network
• Sequence Model
• Introduction to Reinforcement Learning
Today's discussion focuses on:
• Recap of last week's discussion and project ideas for the final project
Final project ideas
Project timing
• Proposal
• Progress and plan for next step
• Poster presentation (short)
• Final report
Summary of last week
• Mitchell’s definition
• “A computer program is said to learn from experience E with respect to some class of tasks T and
performance measure P, if its performance at tasks in T, as measured by P, improves with
experience E”.
• Machine learning performance P
• Accuracy = (TP + TN) / (TP + TN + FP + FN)
• Confusion matrix (Precision, Recall, F1 measures)
  • Precision = TP / (TP + FP)
  • Recall = TP / (TP + FN)
  • F1 = 2 · (P · R) / (P + R)
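The measures above can be sketched directly from the four confusion-matrix counts. The counts below are hypothetical, chosen only to illustrate the formulas:

```python
# Accuracy, precision, recall, and F1 from confusion-matrix entries.
def metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)          # TP / (TP + FP)
    recall = tp / (tp + fn)             # TP / (TP + FN)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical counts for a binary classifier:
acc, p, r, f1 = metrics(tp=80, tn=90, fp=10, fn=20)
print(acc, p, r, f1)   # accuracy 0.85, recall 0.8
```

Note that F1 is the harmonic mean of precision and recall, so it is only high when both are high.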
• A = 99.9%: high accuracy alone can be misleading when one class is rare (example from Wikipedia)
• Common machine learning tasks T
• Classification
• Regression
• Machine translation
• Anomaly detection
• Machine learning experience E
• Supervised
• Unsupervised
• Some machine learning algorithms interact with the environment
(feedback in the loop) – reinforcement learning
Underfitting and overfitting
Neural Network and Deep Learning Architecture
• Introduction
• Basics of Neural Network Architecture
• One Layer Neural Network
• Deep Neural Network
What is a Neural Network?
[Figure: a single neuron mapping the size of a house (input) to its price (output)]
Sensory representation in the brain
• [BrainPort; Welsh & Blasch, 1997; Nagel et al., 2005; Constantine-Paton & Law, 2009]
Housing Price Prediction
• Inputs: size (x1), #bedrooms (x2), zip code (x3), wealth (x4); output: the price y
• Training data: pairs (x, y)
Supervised Learning
• Standard NN
• Recurrent NN
• Convolutional NN
What drives deep learning?
[Figure: performance vs. amount of data; with a small training set the methods are hard to tell apart, while larger networks pull ahead as the data grows]
• Trend: the Gartner hype cycle graph for analyzing the history of artificial neural network technology
Break
Binary Classification
• Input: an image x; output: a label y ∈ {0, 1}
• The red, green, and blue channels of the image are matrices of pixel intensities (values such as 255, 231, 134, 42, 22, …)
• Unrolling all pixel values into a single feature vector x gives dimension n_x = 64 × 64 × 3 = 12288
• The m training vectors are stacked column-wise into the matrix X
Notation
• A single training example: (x, y), with x ∈ ℝ^{n_x} and y ∈ {0, 1}
• m training examples stacked column-wise: X ∈ ℝ^{n_x × m}
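The notation above can be sketched in NumPy. The three random 64×64×3 "images" here are hypothetical stand-ins for real training data:

```python
# Build the design matrix X (shape n_x x m) from m images,
# flattening each image into a column vector of length 64*64*3 = 12288.
import numpy as np

m = 3
images = np.random.rand(m, 64, 64, 3)   # m hypothetical RGB images
X = images.reshape(m, -1).T             # each column is one unrolled example
y = np.array([[0, 1, 1]])               # labels y in {0, 1}, shape (1, m)

print(X.shape)   # (12288, 3)
```

Storing examples as columns makes a forward pass over all m examples a single matrix product.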
Logistic Regression
• Given x, output ŷ = P(y = 1 | x), where x ∈ ℝ^{n_x} and 0 ≤ ŷ ≤ 1
• Sigmoid: σ(z) = 1 / (1 + e^{−z})
• Parameters: w ∈ ℝ^{n_x}, b ∈ ℝ
• Output: ŷ = σ(wᵀx + b)
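A minimal sketch of the model above; the values of w, b, and x are hypothetical:

```python
# Logistic regression output: y_hat = sigma(w^T x + b).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_x = 4
w = np.zeros((n_x, 1))                       # parameters w in R^{n_x}
b = 0.0                                      # parameter b in R
x = np.array([[1.0], [2.0], [0.5], [3.0]])   # one input example

y_hat = sigmoid(w.T @ x + b)   # with w = 0, b = 0 this is sigma(0)
print(y_hat.item())            # 0.5
```

Because σ maps any real z into (0, 1), the output can be read as a probability P(y = 1 | x).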
Logistic Regression cost function
• Model: ŷ = σ(wᵀx + b), where σ(z) = 1 / (1 + e^{−z}) and z^{(i)} = wᵀx^{(i)} + b
• Squared-error loss L(ŷ, y) = ½(ŷ − y)² is not used here, because it makes the optimization non-convex
• Loss (error) function: L(ŷ, y) = −(y log ŷ + (1 − y) log(1 − ŷ))
  • if y = 1: L(ŷ, y) = −log ŷ
  • if y = 0: L(ŷ, y) = −log(1 − ŷ)
• Cost function: J(w, b) = (1/m) Σᵢ₌₁ᵐ L(ŷ^{(i)}, y^{(i)}) = −(1/m) Σᵢ₌₁ᵐ [y^{(i)} log ŷ^{(i)} + (1 − y^{(i)}) log(1 − ŷ^{(i)})]
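The cross-entropy cost can be sketched over a small batch; the predictions and labels below are hypothetical:

```python
# Cross-entropy cost J over m examples, averaging the per-example losses.
import numpy as np

def cost(a, y):
    m = y.shape[0]
    # -(1/m) * sum[ y*log(a) + (1-y)*log(1-a) ]
    return -np.sum(y * np.log(a) + (1 - y) * np.log(1 - a)) / m

a = np.array([0.9, 0.2, 0.8])   # predictions y_hat^(i)
y = np.array([1.0, 0.0, 1.0])   # labels y^(i)
print(cost(a, y))
```

Each term penalizes confident wrong predictions heavily: as a → 0 with y = 1, the loss −log a → ∞.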
Gradient Descent
• ŷ = σ(wᵀx + b), where σ(z) = 1 / (1 + e^{−z})
• J(w, b) = (1/m) Σᵢ₌₁ᵐ L(ŷ^{(i)}, y^{(i)}) = −(1/m) Σᵢ₌₁ᵐ [y^{(i)} log ŷ^{(i)} + (1 − y^{(i)}) log(1 − ŷ^{(i)})]
[Figure: the convex cost surface J(w, b) over the parameters w and b; gradient descent steps downhill toward the minimum]
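The update rule w := w − α · dJ/dw can be sketched on a one-dimensional toy cost. The cost J(w) = (w − 3)² is hypothetical, chosen so the minimum is known to be w = 3:

```python
# Gradient descent on a simple convex toy cost J(w) = (w - 3)^2.
alpha = 0.1          # learning rate
w = 0.0              # initial parameter
for _ in range(200):
    dw = 2 * (w - 3)        # derivative dJ/dw
    w = w - alpha * dw      # update: w := w - alpha * dw
print(w)                    # converges toward 3
```

Each step moves w opposite the gradient; on a convex cost this walk reaches the global minimum.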
Gradient descent
• a = σ(z), and with z = wx + b the prediction is ŷ = σ(wx + b)
• Remember the chain rule of differentiation:
  • if z = xy, then ∂z/∂x = y
  • if z = u + b with u = wx, then ∂z/∂w = (∂z/∂u)(∂u/∂w) = 1 · x = x
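The chain rule above can be checked numerically. For ŷ = σ(wx + b), the analytic derivative dŷ/dw = σ′(z) · x should match a finite difference; the values of w, x, b here are hypothetical:

```python
# Verify the chain rule d(sigma(wx + b))/dw = sigma'(z) * x numerically.
import math

def sigma(z):
    return 1.0 / (1.0 + math.exp(-z))

w, x, b = 0.5, 2.0, -1.0
z = w * x + b
analytic = sigma(z) * (1 - sigma(z)) * x   # sigma'(z) = sigma(z)(1 - sigma(z))

eps = 1e-6
numeric = (sigma((w + eps) * x + b) - sigma((w - eps) * x + b)) / (2 * eps)
print(analytic, numeric)   # the two values agree closely
```

This finite-difference check is a standard way to debug hand-derived gradients.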
Optimization algorithms
Logistic regression
[Computation graph: inputs x1, x2, x3, x4 with weights w1, …, w4 and bias b (constant input +1) feed z, then a = σ(z), then the loss L]
• Forward: z = wᵀx + b, ŷ = a = σ(z)
• Loss: L(a, y) = −(y log(a) + (1 − y) log(1 − a))
• Backward (derivatives of L):
  • da = dL/da = −y/a + (1 − y)/(1 − a)
  • dz = dL/dz = (dL/da)(da/dz) = a − y
  • dw1 = (dL/da)(da/dz)(dz/dw1) = x1 · dz
  • dw2 = x2 · dz
  • db = dz
• Updates: w1 := w1 − α · dw1, w2 := w2 − α · dw2, b := b − α · db
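The backward pass above can be sketched for a single two-feature example; the input, weight, and label values are hypothetical:

```python
# One-example backward pass: da, dz = a - y, dw_k = x_k * dz, db = dz.
import math

def sigma(z):
    return 1.0 / (1.0 + math.exp(-z))

x1, x2 = 1.0, 2.0
w1, w2, b = 0.3, -0.2, 0.1
y = 1.0

z = w1 * x1 + w2 * x2 + b       # forward: z = 0.0 here
a = sigma(z)                    # a = 0.5

da = -y / a + (1 - y) / (1 - a)   # dL/da
dz = a - y                        # dL/dz, via dz = da * sigma'(z)
dw1, dw2, db = x1 * dz, x2 * dz, dz

# sanity check: dz must equal da * a * (1 - a)
assert abs(dz - da * a * (1 - a)) < 1e-12
print(dz, dw1, dw2, db)           # -0.5 -0.5 -1.0 -0.5
```

The shortcut dz = a − y is why logistic regression's gradient code stays so compact.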
Logistic regression
• J(w, b) = (1/m) Σᵢ₌₁ᵐ L(a^{(i)}, y^{(i)})
• ∂J/∂w1 = (1/m) Σᵢ₌₁ᵐ ∂L(a^{(i)}, y^{(i)})/∂w1 = (1/m) Σᵢ₌₁ᵐ dw1^{(i)}
• The per-example derivatives dw1^{(i)}, dw2^{(i)}, db^{(i)} are averaged over the m training examples
Logistic regression
• One pass over the m training examples (initialize J = 0, dw1 = 0, dw2 = 0, db = 0):
  for i = 1 to m:
      z = wᵀx^(i) + b
      a = σ(z)
      J += −(y^(i) log(a) + (1 − y^(i)) log(1 − a))
      dz = a − y^(i)
      dw1 += x1^(i) * dz
      dw2 += x2^(i) * dz
      db += dz
  J = J/m; dw1 = dw1/m; dw2 = dw2/m; db = db/m
• Updates: w1 := w1 − α · dw1, w2 := w2 − α · dw2, b := b − α · db
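The per-example loop above can be vectorized so one matrix product handles all m examples at once. This is a sketch on hypothetical toy data (random points labeled by whether their coordinates sum to a positive number):

```python
# Vectorized logistic-regression training: forward, backward, and update
# over all m examples per iteration.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_x, m = 2, 100
X = rng.normal(size=(n_x, m))                       # one example per column
Y = (X[0] + X[1] > 0).astype(float).reshape(1, m)   # toy separable labels

w = np.zeros((n_x, 1))
b = 0.0
alpha = 0.5   # learning rate

for _ in range(500):
    A = sigmoid(w.T @ X + b)   # forward: a^(i) for all i at once
    dZ = A - Y                 # dz = a - y, per example
    dw = X @ dZ.T / m          # averaged gradient for w
    db = np.sum(dZ) / m        # averaged gradient for b
    w -= alpha * dw            # w := w - alpha * dw
    b -= alpha * db            # b := b - alpha * db

train_acc = np.mean((sigmoid(w.T @ X + b) > 0.5) == Y)
print(train_acc)               # high accuracy on this toy data
```

Replacing the explicit for-loop over examples with matrix operations is the motivation for the vector-valued functions discussed next.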
Vector Valued functions