
Sequence Models-II

Many-to-Many RNNs for Unequal Length

Dr. Jasmeet Singh,
Assistant Professor,
CSED, TIET
Many-to-Many RNNs for Unequal Input-
Output Length
 For applications such as machine translation, question-answering systems, and document
summarization, the input and output sequences are generally of unequal length.
 In machine translation, the input is a source-language text and the output is a target-language
text, and the two usually differ in length.
 Similarly, in document summarization, the input is a document and the output is its summary,
which is typically much shorter than the input.
 For these kinds of applications, traditional RNNs are modified into Sequence-to-Sequence
Encoder-Decoder architectures.
Seq2Seq Encoder-Decoder Architecture
 A sequence-to-sequence (Seq2Seq) model with an encoder-decoder architecture is
commonly used for applications where the input and output lengths are unequal.
1. Encoder (Processes Input Sequence)
•Takes an input sequence x⟨1⟩, x⟨2⟩, ..., x⟨Tx⟩.
•Processes each token sequentially using recurrent units.
•The final hidden state summarizes the entire input and is passed to the decoder.
2. Decoder (Generates Output Sequence)
•Takes the final hidden state of the encoder as its initial hidden state.
•Generates the output sequence y⟨1⟩, y⟨2⟩, ..., y⟨Ty⟩.
•Each output y⟨t⟩ depends on the previous output y⟨t−1⟩ and the current decoder hidden state.
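The loop structure of this handoff can be sketched in a few lines of NumPy. Everything below is illustrative: the sizes are toy values, the weights are random, and the zero vector standing in for a start-of-sequence token and the greedy feed-back of the predicted token are assumptions of the sketch, not details fixed by the architecture described above.

import numpy as np

rng = np.random.default_rng(0)
n, V1, V2, Tx, Ty = 8, 12, 10, 5, 4          # hidden units, vocab sizes, sequence lengths

def enc_step(a_prev, x_t, Waa, Wax, ba):     # one encoder time step (detailed later)
    return np.tanh(Waa @ a_prev + Wax @ x_t + ba)

def dec_step(a_prev, y_prev, Waa, Way, ba):  # one decoder time step (detailed later)
    return np.tanh(Waa @ a_prev + Way @ y_prev + ba)

Waa, Wax = rng.normal(size=(n, n)) * 0.1, rng.normal(size=(n, V1)) * 0.1
Way, Wya = rng.normal(size=(n, V2)) * 0.1, rng.normal(size=(V2, n)) * 0.1
ba, by = np.zeros((n, 1)), np.zeros((V2, 1))

# Encoder: consume the whole input sequence, keep only the final hidden state.
xs = [np.eye(V1)[:, [rng.integers(V1)]] for _ in range(Tx)]   # one-hot source tokens
a = np.zeros((n, 1))
for x_t in xs:
    a = enc_step(a, x_t, Waa, Wax, ba)

# Decoder: start from the encoder's final state and generate Ty output distributions,
# feeding each predicted token back in as the next step's "previous output".
y_prev = np.zeros((V2, 1))                                    # stand-in for a <start> token
for _ in range(Ty):
    a = dec_step(a, y_prev, Waa, Way, ba)
    logits = Wya @ a + by
    y_hat = np.exp(logits - logits.max()) / np.exp(logits - logits.max()).sum()
    y_prev = np.eye(V2)[:, [int(y_hat.argmax())]]             # greedy feed-back

Note that Tx and Ty never have to match: the only thing the decoder receives from the encoder is the single n × 1 vector a.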
Seq2Seq Encoder-Decoder Architecture
(Contd….)
Seq2Seq Encoder-Decoder Forward
Propagation
Encoder Forward Pass:
1. Initialize the hidden state a_enc⟨0⟩ (may be a zero vector or random values).
2. For each time step t = 1, ..., Tx in the input x:

• Compute the hidden state: a_enc⟨t⟩ = tanh(Waa a_enc⟨t−1⟩ + Wax x⟨t⟩ + ba)

• If x⟨t⟩ is the last input token (t = Tx), pass a_enc⟨Tx⟩ to the decoder.

where Waa is of shape n × n, ba is of shape n × 1, and Wax is of shape n × |V1|;

n: number of neurons in the encoder/decoder hidden layer, |V1|: size of the source-language
vocabulary.
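As a sanity check on these shapes, here is a single encoder step in NumPy; n and |V1| are small toy values and the weights are random, purely for illustration.

import numpy as np

n, V1 = 6, 9                           # toy hidden size and source-vocabulary size
rng = np.random.default_rng(1)

Waa = rng.normal(size=(n, n)) * 0.1    # n x n
Wax = rng.normal(size=(n, V1)) * 0.1   # n x |V1|
ba  = np.zeros((n, 1))                 # n x 1

a_prev = np.zeros((n, 1))              # a_enc<0>: zero initialization
x_t = np.eye(V1)[:, [3]]               # one-hot source token, |V1| x 1

a_t = np.tanh(Waa @ a_prev + Wax @ x_t + ba)
assert a_t.shape == (n, 1)             # the hidden state stays n x 1 at every step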
Seq2Seq Encoder-Decoder Forward
Propagation (Contd….)
Decoder Forward Pass:

1. Initialize a_dec⟨0⟩ = a_enc⟨Tx⟩ (the final encoder hidden state).

2. For each time step t = 1, ..., Ty:

 Compute the hidden state: a_dec⟨t⟩ = tanh(Waa a_dec⟨t−1⟩ + Way y⟨t−1⟩ + ba)

 Compute the output: ŷ⟨t⟩ = softmax(Wya a_dec⟨t⟩ + by)

where Wya is of shape |V2| × n, Way is of shape n × |V2|, and by is of shape |V2| × 1; |V2| being the
size of the target-language vocabulary.
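A matching single-step sketch of the decoder update and output, again with toy sizes and random weights; the hand-written softmax helper and the zero vector standing in for the previous output at t = 1 are assumptions of the sketch.

import numpy as np

n, V2 = 6, 8                             # toy hidden size and target-vocabulary size
rng = np.random.default_rng(2)

Waa = rng.normal(size=(n, n)) * 0.1      # recurrent weights, n x n
Way = rng.normal(size=(n, V2)) * 0.1     # previous-output-to-hidden weights, n x |V2|
Wya = rng.normal(size=(V2, n)) * 0.1     # hidden-to-output weights, |V2| x n
ba, by = np.zeros((n, 1)), np.zeros((V2, 1))

def softmax(z):
    e = np.exp(z - z.max())              # subtract the max for numerical stability
    return e / e.sum()

a_prev = rng.normal(size=(n, 1))         # stands in for the encoder's final state a_enc<Tx>
y_prev = np.zeros((V2, 1))               # stands in for the previous output at t = 1

a_t   = np.tanh(Waa @ a_prev + Way @ y_prev + ba)   # decoder hidden state, n x 1
y_hat = softmax(Wya @ a_t + by)                     # distribution over target vocab, |V2| x 1
assert y_hat.shape == (V2, 1) and np.isclose(y_hat.sum(), 1.0)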
Seq2Seq Encoder-Decoder Loss Function
 Compute Loss:

Use Cross-Entropy Loss (since machine translation is a classification task at each step). For one
example, the loss is computed as:

L = Σ_{t=1…Ty} L⟨t⟩(ŷ⟨t⟩, y⟨t⟩)

where L⟨t⟩(ŷ⟨t⟩, y⟨t⟩) = − Σ_i y_i⟨t⟩ · log(ŷ_i⟨t⟩), with y⟨t⟩ the one-hot target and ŷ⟨t⟩ the predicted
distribution over the target vocabulary.
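Because y⟨t⟩ is one-hot, each per-step term reduces to the negative log of the probability the model assigns to the correct token. A tiny worked example with made-up distributions for Ty = 3:

import numpy as np

# Predicted distributions y_hat<t> and one-hot targets y<t> for three decoder steps.
y_hats = [np.array([0.7, 0.2, 0.1]), np.array([0.1, 0.8, 0.1]), np.array([0.3, 0.3, 0.4])]
ys     = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0]), np.array([0.0, 0.0, 1.0])]

# L = sum_t L<t>, with L<t> = -sum_i y_i<t> * log(y_hat_i<t>)
L = sum(-(y * np.log(y_hat)).sum() for y, y_hat in zip(ys, y_hats))
print(L)   # -(log 0.7 + log 0.8 + log 0.4) ≈ 1.50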
Seq2Seq Encoder-Decoder Back-
Propagation
 During the back-propagation phase, the gradients of the loss function w.r.t. Wya (dWya), Way (dWay),
Waa (dWaa), Wax (dWax), ba (dba), and by (dby) are computed.

∂L/∂Wya = ∂L⟨1⟩/∂Wya + ∂L⟨2⟩/∂Wya + ……… + ∂L⟨Ty⟩/∂Wya

Now, ∂L⟨t⟩/∂Wya = (∂L⟨t⟩/∂ŷ⟨t⟩) · (∂ŷ⟨t⟩/∂Wya) = (ŷ⟨t⟩ − y⟨t⟩) · a_dec⟨t⟩ᵀ

Similarly, ∂L/∂by = ∂L⟨1⟩/∂by + ∂L⟨2⟩/∂by + ……… + ∂L⟨Ty⟩/∂by

∂L⟨t⟩/∂by = (∂L⟨t⟩/∂ŷ⟨t⟩) · (∂ŷ⟨t⟩/∂by) = (ŷ⟨t⟩ − y⟨t⟩) · 1 = ŷ⟨t⟩ − y⟨t⟩
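In code these two gradients are just an outer product and a vector, accumulated over the decoder time steps. The sketch below assumes the decoder hidden states a_dec<t>, predictions y_hat<t>, and one-hot targets y<t> were cached during the forward pass; the cached values here are random stand-ins with toy shapes.

import numpy as np

n, V2, Ty = 6, 8, 4
rng = np.random.default_rng(3)

# Stand-ins for quantities cached during the decoder forward pass.
a_dec  = [rng.normal(size=(n, 1)) for _ in range(Ty)]             # a_dec<t>
y_hat  = [np.full((V2, 1), 1.0 / V2) for _ in range(Ty)]          # softmax outputs
y_true = [np.eye(V2)[:, [rng.integers(V2)]] for _ in range(Ty)]   # one-hot targets

dWya = np.zeros((V2, n))
dby  = np.zeros((V2, 1))
for t in range(Ty):
    dz    = y_hat[t] - y_true[t]       # (y_hat<t> - y<t>), shape |V2| x 1
    dWya += dz @ a_dec[t].T            # dL<t>/dWya = (y_hat<t> - y<t>) · a_dec<t>^T
    dby  += dz                         # dL<t>/dby  =  y_hat<t> - y<t>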
Seq2Seq Encoder-Decoder Back-
Propagation (Contd…..)

In the same way, the gradients w.r.t. Way and Wax are sums over time of per-step chain rules:

∂L⟨t⟩/∂Way = Σ_{i=1…t} (∂L⟨t⟩/∂ŷ⟨t⟩) · (∂ŷ⟨t⟩/∂a_dec⟨i⟩) · (∂a_dec⟨i⟩/∂Way)

∂L⟨t⟩/∂Wax = Σ_{i=1…Tx} (∂L⟨t⟩/∂ŷ⟨t⟩) · (∂ŷ⟨t⟩/∂a_dec⟨t⟩) · (∂a_dec⟨t⟩/∂a_dec⟨t−1⟩) · (∂a_dec⟨t−1⟩/∂a_dec⟨t−2⟩) ⋯ (∂a_dec⟨1⟩/∂a_enc⟨i⟩) · (∂a_enc⟨i⟩/∂Wax)

(Way appears only in the decoder recurrence, so its chain runs over the decoder hidden states up to t;
Wax appears only in the encoder, so its chain must be propagated back through all decoder hidden
states and into the encoder hidden states.)
Seq2Seq Encoder-Decoder Back-
Propagation (Contd…..)
∂L/∂Waa = ∂L⟨1⟩/∂Waa + ∂L⟨2⟩/∂Waa + ……… + ∂L⟨Ty⟩/∂Waa

Now, ∂L⟨t⟩/∂Waa = Σ_{i=1…t} (∂L⟨t⟩/∂ŷ⟨t⟩) · (∂ŷ⟨t⟩/∂a_dec⟨i⟩) · (∂a_dec⟨i⟩/∂Waa)
+ Σ_{i=1…Tx} (∂L⟨t⟩/∂ŷ⟨t⟩) · (∂ŷ⟨t⟩/∂a_dec⟨t⟩) · (∂a_dec⟨t⟩/∂a_dec⟨t−1⟩) · (∂a_dec⟨t−1⟩/∂a_dec⟨t−2⟩) ⋯ (∂a_dec⟨1⟩/∂a_enc⟨i⟩) · (∂a_enc⟨i⟩/∂Waa)

(The first sum accounts for the uses of Waa in the decoder recurrence; the second propagates the
gradient back through the decoder into the encoder, where the same Waa is used again.)
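These chain rules are what a backpropagation-through-time (BPTT) loop computes step by step. The sketch below is one way to implement it in NumPy under the assumptions used above (tanh hidden units, with Waa and ba shared between encoder and decoder); feeding the ground-truth previous token into the decoder (teacher forcing) is an extra assumption of the sketch, not something specified above, and all sizes and data are toy values.

import numpy as np

rng = np.random.default_rng(4)
n, V1, V2, Tx, Ty = 5, 7, 6, 4, 3

Waa = rng.normal(size=(n, n)) * 0.1       # shared recurrent weights (encoder and decoder)
Wax = rng.normal(size=(n, V1)) * 0.1      # encoder input weights
Way = rng.normal(size=(n, V2)) * 0.1      # decoder previous-output weights
Wya = rng.normal(size=(V2, n)) * 0.1      # output weights
ba, by = np.zeros((n, 1)), np.zeros((V2, 1))

xs = [np.eye(V1)[:, [rng.integers(V1)]] for _ in range(Tx)]   # one-hot x<1..Tx>
ys = [np.eye(V2)[:, [rng.integers(V2)]] for _ in range(Ty)]   # one-hot y<1..Ty>

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# ---- forward pass, caching every hidden state ----
a_enc = [np.zeros((n, 1))]                                    # a_enc<0>
for x_t in xs:
    a_enc.append(np.tanh(Waa @ a_enc[-1] + Wax @ x_t + ba))

a_dec, y_hat = [a_enc[-1]], []                                # a_dec<0> = a_enc<Tx>
y_prev = [np.zeros((V2, 1))] + ys[:-1]                        # <start> stand-in, then y<1..Ty-1>
for t in range(Ty):
    a_dec.append(np.tanh(Waa @ a_dec[-1] + Way @ y_prev[t] + ba))
    y_hat.append(softmax(Wya @ a_dec[-1] + by))

# ---- backward pass (BPTT): decoder first, then on into the encoder ----
dWaa, dWax, dWay = np.zeros_like(Waa), np.zeros_like(Wax), np.zeros_like(Way)
dWya, dba, dby = np.zeros_like(Wya), np.zeros_like(ba), np.zeros_like(by)

da_next = np.zeros((n, 1))                    # gradient arriving from step t+1
for t in reversed(range(Ty)):                 # decoder steps t = Ty .. 1
    dz = y_hat[t] - ys[t]                     # dL<t>/d(logits) = y_hat<t> - y<t>
    dWya += dz @ a_dec[t + 1].T
    dby  += dz
    da    = Wya.T @ dz + da_next              # total gradient reaching a_dec<t>
    dpre  = (1.0 - a_dec[t + 1] ** 2) * da    # back through tanh
    dWaa += dpre @ a_dec[t].T
    dWay += dpre @ y_prev[t].T
    dba  += dpre
    da_next = Waa.T @ dpre                    # pass back to a_dec<t-1>

for t in reversed(range(Tx)):                 # encoder steps t = Tx .. 1
    dpre  = (1.0 - a_enc[t + 1] ** 2) * da_next
    dWaa += dpre @ a_enc[t].T                 # shared Waa also accumulates encoder terms
    dWax += dpre @ xs[t].T                    # the long chain back to dWax
    dba  += dpre
    da_next = Waa.T @ dpre

The two accumulation loops mirror the two sums in the dWaa expression above: the first runs over the decoder hidden states, the second carries the gradient through a_dec⟨1⟩ into the encoder and back along its hidden states.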
