MODULE 2 Deep Learning
Computational Graphs
Layers and Blocks, shallow neural network, deep
neural network, Optimization for training Deep
Models, self-organizing maps, Case study
Computational graphs
• Computational graphs are graphs used to represent mathematical expressions. In deep learning they act as a descriptive language for a model, providing a functional description of the required computation.
EXAMPLE
Y = (a+b) * (b-c)
we have three operations: addition, subtraction, and multiplication. To create a computational graph, we create a node for each operation along with its input variables. The direction of each arrow shows the direction in which an input is fed into another node.
d = a+b
e = b-c
Y = d*e
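The same graph can be built in code. Below is a minimal sketch (assuming PyTorch is installed; the input values 2, 3, 1 are arbitrary) in which autograd records each operation as a graph node and then walks the graph backwards to compute gradients.

```python
import torch

# Leaf nodes of the graph: the input variables a, b, c
a = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(3.0, requires_grad=True)
c = torch.tensor(1.0, requires_grad=True)

# Intermediate nodes: each operation adds a node to the graph
d = a + b      # d = a + b
e = b - c      # e = b - c
Y = d * e      # Y = (a + b) * (b - c)

# The backward pass traverses the graph in reverse to get dY/da, dY/db, dY/dc
Y.backward()
print(Y.item())        # (2+3)*(3-1) = 10
print(a.grad.item())   # dY/da = (b - c) = 2
print(b.grad.item())   # dY/db = (b - c) + (a + b) = 7
print(c.grad.item())   # dY/dc = -(a + b) = -5
```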
Types of computational graphs:
• Type 1: Static Computational Graphs
• Involves two phases:
• Phase 1: Make a plan for your architecture.
• Phase 2: To train the model and generate predictions, feed it a lot of data.
• The benefit of utilizing this graph is that it enables powerful
offline graph optimization and scheduling. As a result, they
should be faster than dynamic graphs in general.
• The drawback is that dealing with structured and even variable-sized data is awkward.
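As a rough illustration of the two phases, here is a minimal define-then-run sketch. It assumes TensorFlow is installed and uses its tf.compat.v1 graph-mode API purely as an example of a static-graph framework.

```python
import tensorflow as tf

tf.compat.v1.disable_eager_execution()  # use static (graph) execution

# Phase 1: declare the architecture symbolically; no values flow yet
a = tf.compat.v1.placeholder(tf.float32, name="a")
b = tf.compat.v1.placeholder(tf.float32, name="b")
c = tf.compat.v1.placeholder(tf.float32, name="c")
Y = (a + b) * (b - c)   # nodes are added to a fixed graph

# Phase 2: feed data through the already-built graph
with tf.compat.v1.Session() as sess:
    print(sess.run(Y, feed_dict={a: 2.0, b: 3.0, c: 1.0}))  # 10.0
```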
• Type 2: Dynamic Computational Graphs
• As the forward computation is performed, the graph is implicitly
defined.
• This graph has the advantage of being more adaptable. The library
is less intrusive and enables interleaved graph generation and
evaluation. The forward computation is implemented in your
preferred programming language, complete with all of its features
and algorithms. Debugging dynamic graphs is simple. Because it
permits line-by-line execution of the code and access to all
variables, finding bugs in your code is considerably easier. If you
want to employ Deep Learning for any genuine purpose in the
industry, this is a must-have feature.
• The disadvantage of employing this graph is that there is little time for graph optimization, and the construction effort may be wasted if the graph does not change between iterations.
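For contrast, the sketch below (assuming PyTorch is installed; the forward function and the threshold 10.0 are made up for illustration) builds the graph implicitly while ordinary Python code, including a data-dependent loop, executes.

```python
import torch

def forward(x, w):
    # The graph is built on the fly as ordinary Python runs,
    # so data-dependent control flow is allowed.
    h = x * w
    while h.norm() < 10.0:   # loop length depends on the data
        h = h * 2
    return h.sum()

x = torch.randn(3)
w = torch.randn(3, requires_grad=True)
loss = forward(x, w)   # the graph for *this* call is defined implicitly
loss.backward()        # and can be differentiated immediately
print(w.grad)
```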
Layers and Blocks
•Layers are blocks.
•Many layers can comprise a block.
•Many blocks can comprise a block.
•A block can contain code.
•Blocks take care of lots of housekeeping, including
parameter initialization and backpropagation.
•Sequential concatenations of layers and blocks are
handled by the Sequential block.
Multiple layers are combined into blocks,
forming repeating patterns of larger models.
• A block could describe a single layer, a component consisting of multiple layers, or the entire model itself! One benefit of working with the block abstraction is that blocks can be combined into larger artifacts, often recursively.
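A minimal sketch of this idea, assuming PyTorch; the MLPBlock class and its layer sizes are invented for illustration. A block (here an nn.Module) can contain layers, other blocks, and code, and the Sequential block chains blocks into a larger block.

```python
import torch
from torch import nn

# A block: a layer, a group of layers, or even a whole model.
class MLPBlock(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        # This block is itself built from smaller blocks (layers).
        self.net = nn.Sequential(nn.Linear(in_dim, 64),
                                 nn.ReLU(),
                                 nn.Linear(64, out_dim))

    def forward(self, x):
        return self.net(x)

# Blocks compose recursively: Sequential chains blocks into a larger block.
model = nn.Sequential(MLPBlock(20, 32), nn.ReLU(), MLPBlock(32, 10))
x = torch.randn(4, 20)
print(model(x).shape)   # torch.Size([4, 10])
```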
Shallow Neural Network
• Shallow Neural Network: A neural network with only one
hidden layer, often used for simpler tasks or as a building block
for larger networks.
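A minimal sketch of a shallow network, assuming PyTorch; the layer widths are arbitrary.

```python
import torch
from torch import nn

# A shallow network: exactly one hidden layer between input and output.
shallow_net = nn.Sequential(
    nn.Linear(8, 16),   # input -> hidden
    nn.Tanh(),          # hidden activation
    nn.Linear(16, 1),   # hidden -> output
)

x = torch.randn(5, 8)
print(shallow_net(x).shape)   # torch.Size([5, 1])
```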
Optimization for training Deep Models
• Optimization provides a way to minimize the loss function for deep learning.
• Minimizing the training error does not guarantee that we find the
best set of parameters to minimize the generalization error.
• The optimization problems may have many local minima.
• The problem may have even more saddle points, as generally the
problems are not convex.
• Vanishing gradients can cause optimization to stall. Often a reparameterization of the problem helps. Good initialization of the parameters can be beneficial, too.
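To make the local-minima and initialization points concrete, here is a small NumPy sketch with a toy objective chosen purely for illustration, f(x) = x·cos(πx); plain gradient descent ends up in different minima depending on where it starts.

```python
import numpy as np

def f(x):          # toy objective with a local and a global minimum
    return x * np.cos(np.pi * x)

def grad_f(x):     # its derivative
    return np.cos(np.pi * x) - np.pi * x * np.sin(np.pi * x)

def gradient_descent(x0, lr=0.05, steps=200):
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

# Different initializations end in different minima of the same objective.
print(gradient_descent(-0.5))   # ends near the shallower local minimum (~ -0.27)
print(gradient_descent(0.5))    # ends near the global minimum (~ 1.1)
```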
Goal of Optimization
Although optimization provides a way to minimize the loss function for deep learning, in essence the goals of optimization and deep learning are fundamentally different: optimization is concerned with minimizing the training error, whereas deep learning is ultimately concerned with the generalization error.
There are many challenges in deep learning
optimization.
• Local Minima
• Saddle Points
• Vanishing Gradients
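A tiny sketch of the vanishing-gradient problem, assuming PyTorch; the chain of 30 sigmoids is an arbitrary illustration. Each sigmoid scales the gradient by at most 0.25, so the gradient that reaches the input is nearly zero.

```python
import torch

# Pass a value through a long chain of sigmoids and inspect the gradient.
x = torch.tensor(1.0, requires_grad=True)
h = x
for _ in range(30):          # 30 stacked sigmoid "layers"
    h = torch.sigmoid(h)
h.backward()

# Each sigmoid multiplies the incoming gradient by sigma'(.) <= 0.25,
# so the gradient at x has all but vanished.
print(x.grad)   # a value extremely close to zero
```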
Local optima
• A local optimum is an extremum (maximum or minimum) point of the objective function within a certain region of the input space. More formally, for the minimization case, x_{local} is a local minimum of the objective function f if f(x_{local}) ≤ f(x) for all x in some neighborhood of x_{local}.