NPTEL: AI Assignment-11 Solutions
Reinforcement Learning
Solution Q1) B, Follows from slides
Solution Q2) BC, Follows from slides
Solution Q3) AD
Solution Q4) A, Follows from the equation of feature-based Q-learning
Solution Q5) BC
a. Incorrect. Temporal difference (TD) learning is a model-free reinforcement learning
technique: unlike model-based approaches, it doesn't require knowledge of the
underlying model of the environment.
b. Correct. Follows from slides
c. Correct. In temporal difference learning, the value of a state is updated incrementally
using the TD error: the current estimate is nudged toward a target value formed from
the observed reward and the estimated value of the next state.
d. Incorrect. The TD error is the difference between the target value (based on the
observed reward and the next-state value) and the current estimate of the state's
value. It is not the difference between the old and new values; rather, it represents
the discrepancy between what was expected and what was actually observed.
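The TD(0) update described in (c) and (d) can be sketched in a few lines. This is an illustrative example with made-up states, rewards, and step size, not values from any assignment question:

```python
# Tabular TD(0) value update: nudge V(s) toward the target r + gamma * V(s').
def td0_update(V, s, r, s_next, alpha=0.5, gamma=1.0):
    # TD error = target minus current estimate
    td_error = r + gamma * V[s_next] - V[s]
    V[s] = V[s] + alpha * td_error
    return V

V = {"A": 0.0, "B": 1.0}
td0_update(V, "A", r=2.0, s_next="B")
# Target = 2 + 1*1.0 = 3; V(A) moves halfway (alpha = 0.5) from 0 toward 3.
print(V["A"])  # 1.5
```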
Solution Q6) ACD
- Factual, discussed in videos
Solution Q7) 3
The state-action pair (B2, R) is seen 3 times, and all 3 times we end up in state B3; hence
x = T(B2, R, B3) = 3/3 = 1
The state-action pair (B3, U) is seen 3 times, and only 1 of those times do we end up in
state C3; hence y = T(B3, U, C3) = 1/3
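The count-based estimate used here can be sketched as follows. The (B2, R) and (B3, U) counts mirror the solution; the two (B3, U) outcomes other than C3 are placeholders, since the solution states only the counts, not the other landing states:

```python
from collections import Counter, defaultdict

# Tally observed transitions per (state, action) pair.
counts = defaultdict(Counter)
transitions = [
    ("B2", "R", "B3"), ("B2", "R", "B3"), ("B2", "R", "B3"),  # 3 of 3 -> B3
    ("B3", "U", "C3"), ("B3", "U", "B3"), ("B3", "U", "B2"),  # 1 of 3 -> C3
]
for s, a, s_next in transitions:
    counts[(s, a)][s_next] += 1

def T_hat(s, a, s_next):
    """Empirical transition probability: count(s,a,s') / count(s,a)."""
    c = counts[(s, a)]
    return c[s_next] / sum(c.values())

print(T_hat("B2", "R", "B3"))  # x = 3/3 = 1.0
print(T_hat("B3", "U", "C3"))  # y = 1/3
```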
Solution Q8) 46
- We visit A1 twice: first in the first simulation, from where the reward collected
before reaching a terminal state is -9 + 100 = 91, and then in the second simulation,
from where it is -5 - 100 = -105. Hence w = (91 - 105)/2 = -7.
- Similarly, we visit B1 twice: first in the first simulation, from where the reward
collected is -8 + 100 = 92, and then in the second simulation, from where it is
-4 - 100 = -104. Hence x = (92 - 104)/2 = -6.
- We visit B2 thrice: twice in the first simulation and once in the second. The rewards
collected before reaching a terminal state are -7 + 100 = 93, -3 + 100 = 97, and
-3 - 100 = -103. Hence y = (93 + 97 - 103)/3 = 29.
- We visit B3 thrice: twice in the first simulation and once in the second. The rewards
collected are -6 + 100 = 94, -2 + 100 = 98, and -2 - 100 = -102.
Hence z = (94 + 98 - 102)/3 = 30.
- w + x + y + z = -7 - 6 + 29 + 30 = 46
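The averaging above is every-visit direct (Monte Carlo) evaluation: V(s) is the mean of the returns observed from each visit to s. A minimal sketch using the per-visit returns worked out in the bullets:

```python
# Per-visit returns for each state, as computed in the solution bullets.
returns = {
    "A1": [91, -105],
    "B1": [92, -104],
    "B2": [93, 97, -103],
    "B3": [94, 98, -102],
}

# Direct evaluation: average the returns observed from each visit.
values = {s: sum(g) / len(g) for s, g in returns.items()}
print(values)                # {'A1': -7.0, 'B1': -6.0, 'B2': 29.0, 'B3': 30.0}
print(sum(values.values()))  # w + x + y + z = 46.0
```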
Solution Q9) -0.16
The state-action pair (c, RIGHT) is experienced twice, and hence Q(c, RIGHT) will be
updated twice (with α = 0.8, as the arithmetic below implies).
At the first update, a collision happens, so a reward of -1 is received:
Q(c, RIGHT) = (1 - α)·Q(c, RIGHT) + α(R + Q(c, RIGHT))
= 0.2 × 0 + 0.8 × (-1)
= -0.8
At the second update:
Q(c, RIGHT) = (1 - α)·Q(c, RIGHT) + α(R + Q(d, UP))
= 0.2 × (-0.8) + 0.8 × 0
= -0.16
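The two updates can be checked numerically. Here γ = 1 and the zero bootstrap values Q(c, RIGHT) = 0 and Q(d, UP) = 0 are assumptions read off from the solution's arithmetic:

```python
alpha = 0.8
Q = 0.0  # initial Q(c, RIGHT)

# First update: collision gives R = -1; the bootstrap term Q(c, RIGHT) is 0.
Q = (1 - alpha) * Q + alpha * (-1 + 0.0)
print(round(Q, 2))  # -0.8

# Second update: R = 0 and the bootstrap term Q(d, UP) is 0.
Q = (1 - alpha) * Q + alpha * (0 + 0.0)
print(round(Q, 2))  # -0.16
```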
Solution Q10) 0.025
Q(c,RIGHT)= -0.16
Q(c,UP)=Q(c,DOWN)=Q(c,LEFT)=0
Since Q(c, UP) = Q(c, DOWN) = Q(c, LEFT) = 0 > Q(c, RIGHT) = -0.16, the greedy choice
under the epsilon-greedy policy is one of UP, DOWN, LEFT. RIGHT can be taken only
through the exploration branch, i.e., with probability epsilon/4; for epsilon = 0.1,
this equals 0.025.
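A quick sketch of the probability computation, assuming epsilon = 0.1 (the value consistent with the stated answer) and 4 available actions:

```python
epsilon = 0.1
Q = {"UP": 0.0, "DOWN": 0.0, "LEFT": 0.0, "RIGHT": -0.16}

# Epsilon-greedy: with probability epsilon, pick uniformly among all 4 actions;
# otherwise pick a greedy (max-Q) action. RIGHT is not greedy, so it can only
# be chosen through the uniform exploration branch.
p_right = epsilon / len(Q)
print(p_right)  # 0.025
```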