
REINFORCEMENT LEARNING QUESTION BANK

(MODULE 1 & MODULE 2)
1. Why is RL known as a feedback-based machine learning technique? Explain in your own words.

2. Define the following terms:

a) Goal  b) Action space  c) State space  d) Reward

e) Trajectory  f) Optimal policy  g) Policy  h) Discount factor  i) Random policy

3. Explain the elements of RL.

4. Contrast the deterministic policy and the stochastic policy. Define the value function.

5. Define an episode in reinforcement learning and explain how it is generated. Implement 10 episodes for an RL environment.

6. Differentiate between a continuous task and an episodic task.

7. Differentiate between MC prediction and MC control.

8. Distinguish between a finite horizon and an infinite horizon.

9. Interpret Python code to create and display an RL environment.

10. Differentiate between the Markov reward process (MRP) and the Markov decision process (MDP).

11. Interpret a simple reinforcement learning environment using Gym and simulate an agent interacting with it for a few episodes. Submit the code and explain the observations.
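A minimal sketch for question 11, assuming the Gymnasium fork of Gym (the gymnasium package); older gym releases return a bare observation from reset() and a 4-tuple from step(), so the unpacking below would need adjusting:

import gymnasium as gym  # assumption: Gymnasium fork; classic gym uses a slightly different API

env = gym.make("CartPole-v1")  # any registered environment works here

for episode in range(3):  # simulate the agent for a few episodes
    state, info = env.reset()
    episode_return, done = 0.0, False
    while not done:
        action = env.action_space.sample()  # random policy: sample an action uniformly
        state, reward, terminated, truncated, info = env.step(action)
        episode_return += reward
        done = terminated or truncated
    print(f"Episode {episode + 1}: return = {episode_return}")

env.close()

With a random policy the per-episode returns are small and vary noticeably between episodes, which is the kind of observation the question asks you to explain.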

12. Write the MC prediction algorithm to compute the value function.

13. Write the MC prediction algorithm to compute the Q function.
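One possible outline for questions 12 and 13 — a first-visit Monte Carlo prediction sketch that estimates V(s) by averaging sampled returns; generate_episode(policy) is a hypothetical helper assumed to return a list of (state, action, reward) tuples:

from collections import defaultdict

def mc_prediction_v(generate_episode, policy, num_episodes=1000, gamma=1.0):
    # First-visit MC prediction: V(s) is the average return observed from s.
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    V = defaultdict(float)
    for _ in range(num_episodes):
        episode = generate_episode(policy)  # [(state, action, reward), ...]
        G = 0.0
        # walk the episode backwards, accumulating the discounted return
        for t in reversed(range(len(episode))):
            state, _, reward = episode[t]
            G = gamma * G + reward
            # first-visit check: count G only if the state does not occur earlier in the episode
            if state not in (s for s, _, _ in episode[:t]):
                returns_sum[state] += G
                returns_count[state] += 1
                V[state] = returns_sum[state] / returns_count[state]
    return V

The Q-function variant of question 13 follows the same sketch, keyed on (state, action) pairs instead of states.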

14. Infer Python code to create and display the RL environment.

15. Calculate the return for the following episode:

16. Infer an episode in reinforcement learning, and illustrate the generation of 10 episodes for any RL environment.

17. Interpret Python code to create and display the RL environment.

18. Write the equation to find V(s) in the Monte Carlo method.
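For reference with question 18, the Monte Carlo estimate of the value function is the average of the returns observed from a state: V(s) ≈ (1/N(s)) · Σ_i G_i(s), where G_i(s) is the return following the i-th visit to s and N(s) is the number of visits. The equivalent incremental form is V(s) ← V(s) + (1/N(s)) · (G − V(s)).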

19. If X is a random variable representing the outcome of throwing a die, find the expectation E[f(X)], where f(X) = X³.
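A worked sketch for question 19, assuming a fair six-sided die so each outcome has probability 1/6:

E[f(X)] = E[X³] = (1/6)(1³ + 2³ + 3³ + 4³ + 5³ + 6³) = (1/6)(1 + 8 + 27 + 64 + 125 + 216) = 441/6 = 73.5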

20. Define the return of a trajectory for a continuous task.

21. Differentiate between on-policy MC control and off-policy MC control.

22. For the following grid world environment, construct the value function under a deterministic policy.

23. Using the model dynamics table of state A, find the optimal policy using policy iteration.

24. The model dynamics of an RL environment are given below. Identify V(A) after the first iteration of the value iteration algorithm. Assume a discount factor of 1.

25. Without the dynamics of the environment, the RL agent has to evaluate a policy for a state using Monte Carlo methods. Which type of RL task does the agent have to use for this? Explain the corresponding algorithm in detail.

26. Explain the Markov decision process in detail.

27. Using the value iteration algorithm and the model dynamics of state A given in the table below, identify the optimal value of state A after the first iteration.

28. Explain a deterministic environment and a stochastic environment in detail, with an example of each.

29. Write the MC control algorithm to compute the value function.

30. Write the MC control algorithm to compute the Q function.

31. Construct the final value of the states for the given policy.


32. Derive the optimal Bellman equation for V(s) and Q(s, a). Using Bellman's definition, construct the value of all states in the environment given below. Assume a discount factor of 1.
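For reference with question 32, the Bellman optimality equations in the usual notation, with P(s' | s, a) the transition probability, R(s, a, s') the reward and γ the discount factor:

V*(s) = max_a Σ_{s'} P(s' | s, a) [ R(s, a, s') + γ V*(s') ]

Q*(s, a) = Σ_{s'} P(s' | s, a) [ R(s, a, s') + γ max_{a'} Q*(s', a') ]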

33. Explain a stochastic environment and a deterministic environment in RL with an example.

34. Explain the exploration-exploitation dilemma.
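The dilemma in question 34 is usually illustrated with an epsilon-greedy action selection rule; a minimal sketch, assuming Q is a dictionary mapping (state, action) pairs to estimated values and actions is the list of available actions (both hypothetical names):

import random

def epsilon_greedy(Q, state, actions, epsilon=0.1):
    # Explore with probability epsilon, otherwise exploit the current best estimate.
    if random.random() < epsilon:
        return random.choice(actions)  # explore: try a random action
    return max(actions, key=lambda a: Q.get((state, a), 0.0))  # exploit: best known action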

PRACTICAL QUESTIONS

1. Implement the Frozen Lake reinforcement learning environment using a random policy and show the output of the following:

a. Create and render the environment
b. Action space
c. State space
d. Generate 10 episodes and print the return of each episode
e. Print the action for every episode
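A minimal sketch for practical question 1, again assuming the Gymnasium package; render_mode="ansi" keeps the rendering text-based:

import gymnasium as gym  # assumption: Gymnasium fork of Gym

env = gym.make("FrozenLake-v1", render_mode="ansi")
state, info = env.reset()
print(env.render())                           # a. create and render the environment
print("Action space:", env.action_space)      # b. Discrete(4)
print("State space:", env.observation_space)  # c. Discrete(16)

for episode in range(10):                     # d. generate 10 episodes
    state, info = env.reset()
    episode_return, done, actions_taken = 0.0, False, []
    while not done:
        action = env.action_space.sample()    # random policy
        state, reward, terminated, truncated, info = env.step(action)
        episode_return += reward
        actions_taken.append(action)
        done = terminated or truncated
    print(f"Episode {episode + 1}: return = {episode_return}, actions = {actions_taken}")  # d, e

env.close()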

2. Implement the Frozen Lake reinforcement learning environment using a deterministic policy and show the output of the following:

a. Create and render the environment
b. Action space
c. State space
d. State transition probabilities
e. Print the action for every episode

3. Implement the Cart Pole reinforcement learning environment using a random policy and show the output of the following:

a. Create and render the environment
b. State space
c. Action space
d. State transition probabilities
e. Print the action for every episode

4. Implement value iteration for a simple RL environment and display the optimal policy. Submit the code and explain the results.
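One way to approach practical question 4 — a value iteration sketch assuming a tabular Gym-style environment that exposes its model as env.P[state][action] = list of (probability, next_state, reward, done) tuples, as FrozenLake does (use env.unwrapped.P if the environment is wrapped):

import numpy as np

def value_iteration(env, gamma=1.0, theta=1e-8):
    # Repeat the Bellman optimality backup until the value function stops changing.
    V = np.zeros(env.observation_space.n)
    while True:
        delta = 0.0
        for s in range(env.observation_space.n):
            # one-step lookahead: expected value of each action from state s
            q = [sum(p * (r + gamma * V[s2]) for p, s2, r, _ in env.P[s][a])
                 for a in range(env.action_space.n)]
            best = max(q)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break
    # extract the greedy (optimal) policy from the converged values
    policy = [int(np.argmax([sum(p * (r + gamma * V[s2]) for p, s2, r, _ in env.P[s][a])
                             for a in range(env.action_space.n)]))
              for s in range(env.observation_space.n)]
    return V, policy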

5. Implement policy iteration for a simple RL environment and display the optimal policy. Submit the code and explain the results.
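A companion sketch for practical question 5, under the same assumption about env.P:

import numpy as np

def policy_iteration(env, gamma=1.0, theta=1e-8):
    # Alternate policy evaluation and greedy policy improvement until the policy is stable.
    n_s, n_a = env.observation_space.n, env.action_space.n
    policy = np.zeros(n_s, dtype=int)
    V = np.zeros(n_s)
    while True:
        # policy evaluation: Bellman expectation backup for the current policy
        while True:
            delta = 0.0
            for s in range(n_s):
                v = sum(p * (r + gamma * V[s2]) for p, s2, r, _ in env.P[s][policy[s]])
                delta = max(delta, abs(v - V[s]))
                V[s] = v
            if delta < theta:
                break
        # policy improvement: act greedily with respect to the evaluated values
        stable = True
        for s in range(n_s):
            q = [sum(p * (r + gamma * V[s2]) for p, s2, r, _ in env.P[s][a]) for a in range(n_a)]
            best_a = int(np.argmax(q))
            if best_a != policy[s]:
                stable = False
            policy[s] = best_a
        if stable:
            return V, policy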

6. Implement the Cart Pole reinforcement learning environment using a deterministic policy and show the output of the following:

a. Create and render the environment
b. Action space
c. State space
d. Generate 30 episodes and print the return of each episode
e. State transition probabilities
