Deep Learning Basics for Beginners


Introduction to deep learning

By: DIVAKAR KESHRI
PhD NIT TRICHY
About this course
• Introduction to deep learning
• basics of ML assumed
• mostly high-school math
• much of the theory and many details are skipped
• 1st day: lectures + small-scale exercises using
notebooks.csc.fi
• 2nd day: experiments using GPUs at Puhti-AI
• Slides at: https://tinyurl.com/yyej6rxl
• Other materials at GitHub:
https://github.com/csc-training/intro-to-dl
• Gitter chat at:
https://gitter.im/csc_training/intro-to-dl
• Focus on text and image classification, no fancy
stuff
Further resources
• This course is largely “inspired by”: “Deep
Learning with Python” by François Chollet
• Recommended textbook: “Deep learning”
by Goodfellow, Bengio, Courville
• Lots of further material available online, e.g.:
http://cs231n.stanford.edu/ http://course.fast.ai/
https://developers.google.com/machine-learning/crash-course/
www.nvidia.com/dlilabs http://introtodeeplearning.com/
https://github.com/oxford-cs-deepnlp-2017/lectures,
https://jalammar.github.io/
• Academic courses
What is artificial intelligence?

Artificial intelligence is the ability of a computer to perform tasks commonly associated with intelligent beings.
What is machine learning?

Machine learning is the study of algorithms that learn from examples and experience instead of relying on hard-coded rules, and that make predictions on new data.
What is deep learning?

Deep learning is a subfield of machine learning focusing on learning data representations as successive layers of increasingly meaningful representations.
Image from https://blogs.nvidia.com/blog/2016/07/29/whats-difference-artificial-intelligence-machine-learning-deep-learning-ai/
“Traditional” machine learning:
handcrafted features → learned classifier → “cat”

Deep, “end-to-end” learning:
learned low-level features → learned mid-level features → learned high-level features → learned classifier → “cat”
From: Wang & Raj: On the Origin of Deep Learning (2017)
Main types of machine learning

• Supervised learning
• Unsupervised learning
• Self-supervised learning
• Reinforcement learning

Image from https://arxiv.org/abs/1710.10196



Animation from https://yanpanlau.github.io/2016/07/10/FlappyBird-Keras.html


Fundamentals of machine learning
Data
• Humans learn by observation
and unsupervised learning
• model of the world /
common sense reasoning
• Machine learning needs lots
of (labeled) data to
compensate
Data
• Tensors: generalization of matrices to n dimensions (or rank, order, degree)
• 1D tensor: vector
• 2D tensor: matrix
• 3D, 4D, 5D tensors
• numpy.ndarray(shape, dtype)
• Training – validation – test split (+ adversarial test)
• Minibatches
• small sets of input data used at a time
• usually processed independently
Image from: https://arxiv.org/abs/1707.08945
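
The points above about tensors, data splits and minibatches can be sketched with numpy. This is only an illustration: the dataset is random, and the 60/20/20 split, the batch size of 16 and the helper name `minibatches` are assumptions made for the example, not something the slides prescribe.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tensors of increasing rank, represented as numpy.ndarray objects.
vector = np.zeros(5)                  # 1D tensor, shape (5,)
matrix = np.zeros((3, 5))             # 2D tensor, shape (3, 5)
images = np.zeros((100, 28, 28, 3))   # 4D tensor: (samples, height, width, channels)
print(vector.ndim, matrix.ndim, images.ndim)

# Hypothetical dataset: 100 samples with 5 features and a binary label each.
X = rng.normal(size=(100, 5))
y = rng.integers(0, 2, size=100)

# Shuffle once, then cut into training / validation / test parts (60/20/20).
order = rng.permutation(len(X))
train_idx, val_idx, test_idx = np.split(order, [60, 80])
X_train, y_train = X[train_idx], y[train_idx]
X_val, y_val = X[val_idx], y[val_idx]
X_test, y_test = X[test_idx], y[test_idx]

def minibatches(X, y, batch_size):
    """Yield consecutive (inputs, targets) slices of at most batch_size samples."""
    for start in range(0, len(X), batch_size):
        yield X[start:start + batch_size], y[start:start + batch_size]

batch_shapes = [xb.shape for xb, _ in minibatches(X_train, y_train, batch_size=16)]
print(batch_shapes)
```

The last minibatch is smaller than the others whenever the batch size does not divide the number of training samples.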
Model – learning/training – inference

http://playground.tensorflow.org/

• parameters 𝜃 and hyperparameters


Optimization
• Mathematical optimization: “the selection of a best element (with regard to some criterion) from some set of available alternatives” (Wikipedia)
• Main types: finite-step, iterative, heuristic
• Learning as an optimization problem
• cost function: loss + regularization
By Rebecca Wilson (originally posted to Flickr as Vicariously) [CC BY 2.0], via Wikimedia Commons
Optimization

Image from: Li et al. “Visualizing the Loss Landscape of Neural Nets”, arXiv:1712.09913
Gradient descent
• Derivative and minima/maxima of functions
• Gradient: the derivative of a multivariable function
• Gradient descent: repeatedly move the parameters a small step in the direction of the negative gradient
• (Mini-batch) stochastic gradient descent (and its variants)
Image from: https://towardsdatascience.com/gradient-descent-algorithm-and-its-variants-10f652806a3
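
As a rough illustration of mini-batch stochastic gradient descent, the sketch below fits a line to toy data by minimizing a cost made of a loss term (mean squared error) plus a regularization term (an L2 penalty). The data, the learning rate, the batch size and the `weight_decay` value are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: y = 3*x - 1 plus a little noise.
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = 3.0 * X[:, 0] - 1.0 + 0.05 * rng.normal(size=200)

def cost_and_gradient(theta, Xb, yb, weight_decay):
    """Mean squared error plus an L2 penalty, and its gradient w.r.t. theta."""
    w, b = theta
    residual = w * Xb[:, 0] + b - yb
    cost = np.mean(residual ** 2) + weight_decay * w ** 2
    grad_w = 2.0 * np.mean(residual * Xb[:, 0]) + 2.0 * weight_decay * w
    grad_b = 2.0 * np.mean(residual)
    return cost, np.array([grad_w, grad_b])

theta = np.zeros(2)          # parameters (w, b), started at zero
learning_rate = 0.1
weight_decay = 1e-4
batch_size = 20

for epoch in range(100):
    order = rng.permutation(len(X))          # reshuffle the data every epoch
    for start in range(0, len(X), batch_size):
        batch = order[start:start + batch_size]
        _, grad = cost_and_gradient(theta, X[batch], y[batch], weight_decay)
        theta = theta - learning_rate * grad  # step against the gradient

final_cost, _ = cost_and_gradient(theta, X, y, weight_decay)
print(theta, final_cost)
```

Each update uses the gradient of a single minibatch rather than of the full dataset, which is what makes the method stochastic.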
Over- and underfitting, generalization, regularization
• Models with lots of parameters can easily overfit to training data
• Generalization: the quality of an ML model is measured on new, unseen samples
• Regularization: any method* to prevent overfitting
• simplicity, sparsity, dropout, early stopping
• *) other than adding more data
By Chabacano [GFDL or CC BY-SA 4.0], from Wikimedia Commons
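
A small experiment can make the difference between training error and generalization visible. The sketch below is an assumption-laden toy: it fits polynomials of degree 1, 3 and 10 to 15 noisy samples of a sine curve and measures the error both on the training samples and on new samples from the same source.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_samples(n):
    """Hypothetical data: a sine curve observed with Gaussian noise."""
    x = rng.uniform(-1.0, 1.0, size=n)
    return x, np.sin(3.0 * x) + 0.2 * rng.normal(size=n)

x_train, y_train = noisy_samples(15)
x_new, y_new = noisy_samples(200)     # unseen samples, used only for evaluation

def mse(coeffs, x, y):
    """Mean squared error of the polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

errors = {}
for degree in (1, 3, 10):
    coeffs = np.polyfit(x_train, y_train, degree)   # least-squares polynomial fit
    errors[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_new, y_new))
    print(degree, errors[degree])
```

The training error can only go down as the degree grows, because each higher-degree model contains the lower-degree ones. The error on new samples typically does not follow it, and a gap between the two numbers is the usual sign of overfitting.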
Deep learning
Anatomy of a deep neural network

• Layers
• Input data and targets
• Loss function
• Optimizer
Layers
• Data processing modules
• Many different kinds exist
• densely connected
• convolutional
• recurrent
• pooling, flattening, merging,
normalization, etc.
• Input: one or more tensors
output: one or more tensors
• Usually have a state, encoded as
weights
• learned, initially random
• When combined, form a network or
a model
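
To make the idea of a layer concrete, here is a minimal numpy sketch of a densely connected layer and of two such layers chained into a network. The class name `Dense`, the layer sizes and the activation functions are assumptions chosen for the illustration; real frameworks provide their own, far more complete layer classes.

```python
import numpy as np

rng = np.random.default_rng(0)

class Dense:
    """Densely connected layer: output = activation(input @ W + b)."""

    def __init__(self, n_inputs, n_units, activation):
        # The layer's state: weights start out random, biases at zero.
        self.W = rng.normal(scale=0.1, size=(n_inputs, n_units))
        self.b = np.zeros(n_units)
        self.activation = activation

    def __call__(self, x):
        return self.activation(x @ self.W + self.b)

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Two layers combined into a small network (layer sizes are arbitrary).
network = [Dense(4, 8, relu), Dense(8, 1, sigmoid)]

def forward(x):
    """Pass a tensor through every layer in turn."""
    for layer in network:
        x = layer(x)
    return x

batch = rng.normal(size=(32, 4))      # minibatch of 32 samples, 4 features each
predictions = forward(batch)
print(predictions.shape)
```

Because the weights are still random here, the predictions are meaningless until the weights have been trained.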
Input data and targets
• The network maps the input data X to predictions Y′
• During training, the predictions Y′ are compared to true targets Y using the loss function
Loss function
• The quantity to be minimized (optimized) during
training
• the only thing the network cares about
• there might also be other metrics you care
about
• Common tasks have “standard” loss functions:
• mean squared error for regression
• binary cross-entropy for two-class
classification
• categorical cross-entropy for multi-class
classification
• etc.
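
The three standard loss functions listed above can be written in a few lines of numpy. This is a sketch with made-up targets and predictions; the function names and the `eps` clipping constant (used to avoid taking the logarithm of zero) are choices made for the example.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Regression loss: average squared difference."""
    return float(np.mean((y_true - y_pred) ** 2))

def binary_crossentropy(y_true, p, eps=1e-12):
    """Two-class loss; p is the predicted probability of class 1."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)))

def categorical_crossentropy(y_onehot, probs, eps=1e-12):
    """Multi-class loss; each row of probs is a distribution over the classes."""
    probs = np.clip(probs, eps, 1.0)
    return float(-np.mean(np.sum(y_onehot * np.log(probs), axis=1)))

mse = mean_squared_error(np.array([1.0, 2.0]), np.array([1.5, 2.0]))
bce = binary_crossentropy(np.array([1.0, 0.0]), np.array([0.9, 0.2]))
cce = categorical_crossentropy(
    np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
    np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]),
)
print(mse, bce, cce)
```

In all three cases a perfect prediction gives a loss of (nearly) zero, and the loss grows as the predictions move away from the targets.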
Optimizer
• How to update the weights based on the loss function
• Learning rate (+scheduling)
• Stochastic gradient descent, momentum, and their variants
• RMSProp is usually a good first choice
• more info: http://ruder.io/optimizing-gradient-descent/
Animation from: https://imgur.com/s25RsOr
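
The update rules behind two of the optimizers mentioned above can be sketched as follows. The toy loss, the starting point and all hyperparameter values are assumptions for the illustration, and the functions are simplified textbook versions rather than any framework's implementation.

```python
import numpy as np

def gradient(theta):
    """Gradient of the toy loss f(theta) = 0.5 * (theta[0]**2 + 10 * theta[1]**2)."""
    return np.array([theta[0], 10.0 * theta[1]])

def sgd_momentum(theta, steps, learning_rate=0.05, momentum=0.9):
    """Gradient descent with momentum: a running velocity smooths the updates."""
    velocity = np.zeros_like(theta)
    for _ in range(steps):
        velocity = momentum * velocity - learning_rate * gradient(theta)
        theta = theta + velocity
    return theta

def rmsprop(theta, steps, learning_rate=0.05, rho=0.9, eps=1e-8):
    """RMSProp: scale each step by a running average of squared gradients."""
    mean_square = np.zeros_like(theta)
    for _ in range(steps):
        g = gradient(theta)
        mean_square = rho * mean_square + (1.0 - rho) * g ** 2
        theta = theta - learning_rate * g / (np.sqrt(mean_square) + eps)
    return theta

start = np.array([2.0, 1.0])
momentum_result = sgd_momentum(start, steps=300)
rmsprop_result = rmsprop(start, steps=300)
print(momentum_result, rmsprop_result)
```

Both runs end up close to the minimum at the origin. With a fixed learning rate, RMSProp may keep hovering around the minimum instead of settling exactly on it, which is one motivation for learning rate scheduling.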
Deep learning frameworks
• Actually tools for defining static or dynamic general-purpose computational graphs
• Automatic differentiation
• Seamless CPU / GPU usage
• multi-GPU, distributed
• Python/numpy or R interfaces
• instead of C, C++, CUDA or HIP
• Open source
[Figure: example computational graph built from + and ✕ nodes over the inputs x, y and 5]
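
The combination of computational graphs and automatic differentiation can be illustrated with a toy reverse-mode implementation in plain Python. The `Node` class, the `backward` function and the expression `(x + y) * (y + 5)` are hypothetical and chosen for this sketch; they are not taken from the slides or from any framework.

```python
class Node:
    """A value in a computational graph, remembering how it was computed."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents    # pairs of (parent node, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Node) else Node(other)
        return Node(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Node) else Node(other)
        return Node(self.value * other.value,
                    ((self, other.value), (other, self.value)))

def backward(output):
    """Reverse-mode differentiation: fill in d(output)/d(node) for every node."""
    # Build an ordering in which every node comes after the nodes it was computed from.
    order, seen = [], set()

    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    # Walk from the output back to the inputs, applying the chain rule.
    for node in reversed(order):
        for parent, local_derivative in node.parents:
            parent.grad += node.grad * local_derivative

x, y = Node(2.0), Node(3.0)
z = (x + y) * (y + 5)       # graph built from + and * nodes
backward(z)
print(z.value, x.grad, y.grad)   # 40.0 8.0 13.0
```

Writing the expression builds the graph as a side effect, and a single backward pass then yields the derivative of the output with respect to every input. Frameworks apply the same idea to tensors and run it on GPUs.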
Deep learning frameworks
[Figure: software stack. High-level APIs: Lasagne, Keras, TF Estimator, torch.nn, Gluon. Frameworks: Theano, TensorFlow, CNTK, PyTorch, MXNet, Caffe. Libraries: CUDA, cuDNN; MKL, MKL-DNN; HIP, MIOpen. Hardware: GPUs, CPUs]
• Keras is a high-level neural networks API
• we will use TensorFlow as the compute backend
• included in TensorFlow 2 as tf.keras
• https://keras.io/ , https://www.tensorflow.org/guide/keras
• PyTorch is:
• a GPU-based tensor library
• an efficient library for dynamic neural networks
