
Reinforcement Learning

What is Reinforcement Learning?


Reinforcement learning (RL) is a subfield of machine learning that involves teaching
computers how to learn from experience by assigning them rewards and punishments
in response to their actions.
Scenario: Imagine a robot named RoboStudent in a classroom setting. RoboStudent's
objective is to learn how to behave in class to maximize its "student performance
score." It interacts with the classroom environment and learns over time.
Key Components:

• Agent (RoboStudent): RoboStudent is the learner and decision-maker in this scenario. It takes actions in the classroom, based on the current state, to maximize its cumulative performance score.
• Environment (Classroom): The classroom is where RoboStudent operates. It can be in various states, such as quiet, noisy, teacher speaking, or students discussing.
• Actions: RoboStudent can take actions like raising its hand, asking questions, paying attention, or chatting with classmates.
• States: The classroom can be in different states, representing different situations. For instance, the teacher might be teaching, students might be discussing, or it might be a quiet study session.
• Rewards: RoboStudent receives rewards or penalties based on its actions. For instance, it might receive a positive reward when it raises its hand and answers a question correctly, or a negative reward when it disrupts the class.
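The components above can be sketched as a minimal agent-environment loop. Everything here (the state names, action names, reward values, and the step function) is made up for illustration of the classroom example; it is not a standard API:

```python
import random

STATES = ["teacher_speaking", "students_discussing", "quiet_study"]
ACTIONS = ["raise_hand", "ask_question", "pay_attention", "chat"]

def step(state, action):
    """Toy environment dynamics: return (next_state, reward) for an action."""
    reward = -1 if action == "chat" else 1   # disrupting the class is penalized
    next_state = random.choice(STATES)       # toy transitions: random next state
    return next_state, reward

# One short episode: the agent acts, the environment responds with a new
# state and a reward, and the agent accumulates its performance score.
state = "teacher_speaking"
total_reward = 0
for _ in range(10):
    action = random.choice(ACTIONS)          # a random policy, for illustration
    state, reward = step(state, action)
    total_reward += reward
print(total_reward)
```

A real agent would replace the random action choice with a policy learned from the rewards it has seen.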
What is Q-Learning?
Q-learning is a reinforcement learning method that learns which action is best to take next, given the current state. During training it tries actions (often chosen at random, to explore) and updates its value estimates with the aim of maximizing the cumulative reward.
Q-Learning Example

We’ll call each room, including outside, a “state”, and the agent’s movement
from one room to another will be an “action”. In our diagram, a “state” is
depicted as a node, while “action” is represented by the arrows.

The reward matrix R for this example. Rows are the current state (0-5), columns are the action (the target room); -1 marks an impossible move, and actions that reach the goal state 5 are worth 100:

          Action
State    0    1    2    3    4    5
  0     -1   -1   -1   -1    0   -1
  1     -1   -1   -1    0   -1  100
  2     -1   -1   -1    0   -1   -1
  3     -1    0    0   -1    0   -1
  4      0   -1   -1    0   -1  100
  5     -1    0   -1   -1    0  100

The transition rule of Q learning is a very simple formula:

Q(state, action) = R(state, action) + Gamma * Max[Q(next state, all actions)]
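This transition rule can be written directly as a small Python function over reward and value matrices stored as lists of lists (a sketch; the example numbers below are made up):

```python
def q_update(Q, R, state, action, next_state, gamma=0.8):
    """Apply Q(state, action) = R(state, action) + gamma * Max[Q(next state, all actions)]."""
    Q[state][action] = R[state][action] + gamma * max(Q[next_state])
    return Q[state][action]

# Made-up 2-state example: moving from state 0 to state 1, where the best
# action out of state 1 is currently valued at 50.
Q = [[0.0, 0.0], [0.0, 50.0]]
R = [[0.0, 100.0], [0.0, 0.0]]
q_update(Q, R, state=0, action=1, next_state=1)   # 100 + 0.8 * 50 = 140.0
```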


The Q-learning algorithm goes as follows:

1. Set the gamma parameter, and set the environment rewards in matrix R.
2. Initialize matrix Q to zero.
3. For each episode:
   Select a random initial state.
   Do while the goal state hasn't been reached:
   • Select one among all possible actions for the current state.
   • Using this possible action, consider going to the next state.
   • Get the maximum Q value for this next state, based on all possible actions.
   • Compute: Q(state, action) = R(state, action) + Gamma * Max[Q(next state, all actions)]
   • Set the next state as the current state.
   End do
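As a sketch, these steps can be implemented in a few lines of Python. The R matrix below is the reward matrix for the rooms example (rows are states 0-5, columns are target states; -1 marks an impossible move); the episode count of 1000 is an arbitrary choice that is more than enough for this small problem:

```python
import random

GAMMA = 0.8   # step 1: set the gamma parameter
GOAL = 5      # "outside" is the goal state

# Step 1 (cont.): environment rewards in matrix R.
# -1 marks an impossible move; reaching the goal is worth 100.
R = [
    [-1, -1, -1, -1,  0,  -1],
    [-1, -1, -1,  0, -1, 100],
    [-1, -1, -1,  0, -1,  -1],
    [-1,  0,  0, -1,  0,  -1],
    [ 0, -1, -1,  0, -1, 100],
    [-1,  0, -1, -1,  0, 100],
]

Q = [[0] * 6 for _ in range(6)]          # step 2: initialize matrix Q to zero

for episode in range(1000):              # step 3: for each episode...
    state = random.randrange(6)          # select a random initial state
    while state != GOAL:                 # ...until the goal state is reached
        possible = [a for a in range(6) if R[state][a] >= 0]
        action = random.choice(possible)  # select one possible action at random
        next_state = action               # here the action *is* the target room
        # Q(state, action) = R(state, action) + Gamma * Max[Q(next state, all actions)]
        Q[state][action] = R[state][action] + GAMMA * max(Q[next_state])
        state = next_state                # the next state becomes the current state

# This reproduces the worked values from the text: Q(1, 5) = 100, Q(3, 1) = 80
print(Q[1][5], Q[3][1])
```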
To run the algorithm, set the learning parameter Gamma = 0.8 and take the initial state to be Room 1.

1. Look at the second row (state 1) of matrix R. There are two possible actions from state 1: go to state 3, or go to state 5. By random selection, suppose we choose to go to state 5.

2. Now let’s imagine what would happen if our agent were in state 5. Look at the sixth
row of the reward matrix R (i.e. state 5). It has 3 possible actions: go to states 1, 4, or 5.

3. Q(state, action) = R(state, action) + Gamma * Max[Q(next state, all actions)]


Q(1, 5) = R(1, 5) + 0.8 * Max[Q(5, 1), Q(5, 4), Q(5, 5)] = 100 + 0.8 * 0 = 100

The next state, 5, now becomes the current state. Because 5 is the goal state, we’ve
finished one episode.

• For the next episode, we start with a randomly chosen initial state. This time, we have
state 3 as our initial state.
• Look at the fourth row (state 3) of matrix R; it has 3 possible actions: go to states 1, 2, or 4. By random selection, suppose we choose to go to state 1.
• Now we imagine that we are in state 1. Look at the second row of reward matrix R (i.e.
state 1). It has 2 possible actions: go to state 3 or state 5. Then, we compute the Q value:

• Q(state, action) = R(state, action) + Gamma * Max[Q(next state, all actions)]


Q(3, 1) = R(3,1) + 0.8 * Max[Q(1, 3), Q(1, 5)] = 0 + 0.8 * Max(0, 100) = 80
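The two hand computations above can be checked in a couple of lines (a sketch that stores only the Q entries used; Q(1, 5) is 100 because episode 1 set it):

```python
GAMMA = 0.8

# Q values after episode 1: everything is still 0 except Q(1, 5) = 100.
Q = {(5, 1): 0, (5, 4): 0, (5, 5): 0, (1, 3): 0, (1, 5): 100}

# Episode 1: Q(1, 5) = R(1, 5) + 0.8 * Max[Q(5, 1), Q(5, 4), Q(5, 5)]
q_1_5 = 100 + GAMMA * max(Q[(5, 1)], Q[(5, 4)], Q[(5, 5)])
# Episode 2: Q(3, 1) = R(3, 1) + 0.8 * Max[Q(1, 3), Q(1, 5)]
q_3_1 = 0 + GAMMA * max(Q[(1, 3)], Q[(1, 5)])

print(q_1_5, q_3_1)   # the worked values: 100 and 80
```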
Questions
1. What are the key components of a reinforcement learning problem, and how do they interact with each other?
2. Distinguish between model-based and model-free reinforcement learning approaches. Illustrate the difference with specific algorithms from each category.
3. How does an agent interact with an environment in the context of reinforcement learning? Provide an example.
