
CS294-129: Designing, Visualizing and

Understanding Deep Neural Networks

John Canny
Fall 2016
CCN: 34652
With considerable help from Andrej Karpathy,
Fei-Fei Li, Justin Johnson, Ian Goodfellow and
several others
Deep Learning: Hype or Hope?

Hype: (n) “extravagant or intensive publicity or promotion”

Hope: (n) “expectation of fulfillment or success”


Milestones: Digit Recognition
LeNet (1989): recognized handwritten ZIP-code digits; built by Yann LeCun, Bernhard
Boser, and others, and ran live in the US Postal Service.
Milestones: Image Classification
Convolutional NNs: AlexNet (2012), trained on 200 GB of
ImageNet data.

Human performance: 5.1% error
Milestones: Speech Recognition
Recurrent Nets: LSTMs (1997):
Milestones: Language Translation
Sequence-to-sequence models with LSTMs and attention:

Source: Luong, Cho, and Manning, ACL 2016 Tutorial.



Milestones: Deep Reinforcement Learning


In 2013, DeepMind's arcade-game player beat human experts
on six Atari games. DeepMind was acquired by Google in 2014.

In 2016, DeepMind's AlphaGo defeated former
world champion Lee Sedol.
Deep Learning: Is it Hype or Hope?

Yes!

Other Deep Learning courses at Berkeley:

Fall 2016:

CS294-43: Special Topics in Deep Learning: Trevor Darrell,
Alexei Efros, Marcus Rohrbach. Mondays 10-12 in the Newton
room (250 SDH). Emphasis on computer vision research.
Readings, presentations.

CS294-YY: Special Topics in Deep Learning: Trevor Darrell,
Sergey Levine, and Dawn Song. Weds 10-12,
https://people.eecs.berkeley.edu/~dawnsong/cs294-dl.html
Emphasis on security and other emerging applications.
Readings, presentations, project.

Other Deep Learning courses at Berkeley:

Spring XX:

CS294-YY: Deep Reinforcement Learning: John Schulman
and Sergey Levine, http://rll.berkeley.edu/deeprlcourse/.
Should happen in Spring 2017. Broad/introductory
coverage, assignments, project.

Stat 212B: Topics in Deep Learning: Joan Bruna
(Statistics), last offered Spring 2016. Next ??
Paper reviews and course project.

Learning about Deep Neural Networks

Yann LeCun quote: DNNs require "an interplay between
intuitive insights, theoretical modeling, practical
implementations, empirical studies, and scientific analyses."

In other words, there isn't a single framework or core set of principles
that explains everything (cf. graphical models for machine
learning).

We try to cover the ground in LeCun's quote.



This Course (please interrupt with questions)

Goals:
• Introduce deep learning to a broad audience.
• Review principles and techniques (incl. visualization) for
understanding deep networks.
• Develop skill at designing networks for applications.

Materials:
• Book(s)
• Notes
• Lectures
• Visualizations

The role of Animation

From A. Karpathy’s cs231n notes.



This Course

Work:
• Class Participation: 10%
• Midterm: 20%
• Final Project (in groups): 40%
• Assignments: 30%

Audience: EECS grads and undergrads.


In the long run this will probably be "normalized" into a
100/200-numbered "mezzanine" offering.

Prerequisites

• Knowledge of calculus and linear algebra: Math 53/54 or equivalent.

• Probability and statistics: CS70 or Stat 134. CS70 is the bare
minimum preparation; a statistics course is better.

• Machine learning: CS189.

• Programming: CS61B or equivalent. Assignments will
mostly use Python.

Logistics

• Course Number: CS294-129, Fall 2016, UC Berkeley


• On bCourses, publicly readable.
• CCN: 34652
• Instructor: John Canny [email protected]
• Time: MW 1pm - 2:30pm
• Location: 306 Soda Hall
• Discussion: Join Piazza for announcements and to ask
questions about the course
• Office hours:
• John Canny - M 2:30-3:30, in 637 Soda
Course Project

• More info later

• Can be combined with other course projects

• Encourage "open-source" projects that can be archived somewhere.
Outline – Basics/Computer Vision
Outline – Applications/Advanced Topics
Some History

• Reading: the Deep Learning Book, Introduction


Phases of Neural Network Research

• 1940s-1960s: Cybernetics: brain-like electronic systems;
morphed into modern control theory and signal processing.
• 1960s-1980s: Digital computers, automata theory,
computational complexity theory: simple shallow circuits are
very limited…
• 1980s-1990s: Connectionism: complex, non-linear networks,
back-propagation.
• 1990s-2010s: Computational learning theory, graphical
models: Learning is computationally hard, simple shallow
circuits are very limited…
• 2006: Deep learning: End-to-end training, large datasets,
explosion in applications.
Citations of the “LeNet” paper
• Recall that LeNet was a modern visual classification network
that recognized digits in ZIP codes. Its citation history looks like this:

[Citation-count plot, annotated: second phase | Deep Learning "Winter" | third phase]

• The 2000s were a golden age for machine learning, and
marked the ascent of graphical models. But not so for neural networks.
Why the success of DNNs is surprising
• From both complexity and learning theory perspectives,
simple networks are very limited.
• Can’t compute parity with a small network.
• It is NP-hard to learn "simple" functions like 3-SAT formulae;
in particular, training a DNN is NP-hard.
Why the success of DNNs is surprising
• The most successful DNN training algorithm is a version of
gradient descent, which will only find local optima. In other
words, it is a greedy algorithm. Backprop is just the chain rule
(a minimal code sketch follows this slide):
loss = f(g(h(y)))
d loss/dy = f'(g(h(y))) · g'(h(y)) · h'(y)

• Greedy algorithms are even more limited in what they can
represent and how well they learn.

• If a problem has a greedy solution, it is regarded as an "easy"
problem.
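
To make the chain rule above concrete, here is a minimal Python sketch (not part of the original slides; f, g, and h are arbitrary placeholder functions chosen only for illustration). It computes the analytic gradient of loss = f(g(h(y))) via the chain rule, checks it against a finite-difference estimate, and takes one greedy gradient-descent step.

import numpy as np

# Placeholder scalar functions and their derivatives (illustrative choices).
def h(y):  return y ** 2
def dh(y): return 2 * y
def g(u):  return np.sin(u)
def dg(u): return np.cos(u)
def f(v):  return np.exp(v)
def df(v): return np.exp(v)

def loss(y):
    return f(g(h(y)))

def dloss_dy(y):
    # Chain rule: f'(g(h(y))) * g'(h(y)) * h'(y)
    u = h(y)
    v = g(u)
    return df(v) * dg(u) * dh(y)

y = 0.3
eps = 1e-6
numeric = (loss(y + eps) - loss(y - eps)) / (2 * eps)
print(dloss_dy(y), numeric)   # analytic and numeric gradients should agree closely

lr = 0.1                      # one greedy gradient-descent step
y = y - lr * dloss_dy(y)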
Why the success of DNNs is surprising
• In graphical models, values in a network represent random
variables and have a clear meaning. The network structure
encodes dependency information, i.e. you can represent rich models.

• In a DNN, node activations encode nothing in particular, and
the network structure only encodes (trivially) how they derive
from each other.
Why the success of DNNs is obvious (rather than surprising)
• Hierarchical representations are ubiquitous in AI. Computer
vision:
Why the success of DNNs is obvious (rather than surprising)
• Natural language:
Why the success of DNNs is obvious (rather than surprising)
• Human learning is deeply layered.

Deep expertise
Why the success of DNNs is obvious (rather than surprising)
• What about greedy optimization?
• Less obvious, but it looks like many learning problems (e.g.
image classification) are actually "easy", i.e. they have reliable
steepest-descent paths to a good model.

Ian Goodfellow – ICLR 2015 Tutorial
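
Loosely in the spirit of that observation, the sketch below (my own illustration, not from the slides; the toy data, tiny network, learning rate, and iteration count are all assumptions) trains a small network with plain gradient descent and then evaluates the loss along the straight line from the initial to the final parameters, the kind of 1-D slice that on many problems turns out to decrease fairly smoothly.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                      # toy inputs
y = (X[:, 0] - X[:, 1] > 0).astype(float)          # toy binary labels

def unpack(theta):
    W1 = theta[:40].reshape(5, 8)                  # input -> hidden weights
    W2 = theta[40:].reshape(8, 1)                  # hidden -> output weights
    return W1, W2

def loss(theta):
    W1, W2 = unpack(theta)
    hidden = np.tanh(X @ W1)
    z = (hidden @ W2).ravel()
    p = 1.0 / (1.0 + np.exp(-z))                   # sigmoid output
    return -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

def grad(theta, eps=1e-5):
    # Finite-difference gradient; fine for a 48-parameter toy model.
    g = np.zeros_like(theta)
    for i in range(theta.size):
        d = np.zeros_like(theta)
        d[i] = eps
        g[i] = (loss(theta + d) - loss(theta - d)) / (2 * eps)
    return g

theta0 = rng.normal(scale=0.1, size=48)            # initial parameters
theta = theta0.copy()
for _ in range(200):                               # plain (greedy) gradient descent
    theta = theta - 0.5 * grad(theta)
theta1 = theta                                     # final parameters

# Evaluate the loss along the straight path from theta0 to theta1.
for a in np.linspace(0.0, 1.0, 11):
    print(f"alpha={a:.1f}  loss={loss((1 - a) * theta0 + a * theta1):.4f}")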
