Mila University Centre M1: I2A Uncertain Decision: Work 2 (Reinforcement Learning) 1. Definition

The document provides an overview of OpenAI's Gym, an open-source project for reinforcement learning experiments, specifically focusing on the Frozen Lake environment. It details the installation process, the state and action spaces, and how to interact with the environment using Python code. Additionally, it poses questions regarding the simulation of an episode and the implementation of an optimal policy algorithm.

Uploaded by

lahlou khalid
Copyright
© © All Rights Reserved

Mila University Centre

M1: I2A
Uncertain decision
Work 2 (reinforcement learning)
1. Definition
Gym is an open-source project created by OpenAI for running reinforcement learning experiments.

2. Install OpenAI Gym


- pip install gym
- pip install "gym[toy_text]"

3. The Frozen Lake Environment

Frozen Lake involves crossing a frozen lake from Start (S) to Goal (G) without falling into any Hole (H),
by walking over the Frozen (F) fields (see the figure below).

State numbers (first line of each pair) and field types (second line):

 0   1   2   3
 S   F   F   F
 4   5   6   7
 F   H   F   H
 8   9  10  11
 F   F   F   H
12  13  14  15
 H   F   F   G

Figure: Frozen Lake environment
import gym
# Create the Frozen Lake environment.
env = gym.make("FrozenLake-v1", render_mode="human")
env.reset()   # Put the environment in its initial state.
env.render()  # Display the environment (render_mode="human" draws a window rather than printing to the console).
3.1 State space
This environment consists of 16 fields (a 4 by 4 grid). The states are numbered from 0 to 15 (see the
figure above). There are four types of fields: the start field (S), frozen fields (F), holes (H), and the goal
field (G). An episode ends when the agent steps on a hole or reaches the goal field.
env.observation_space   # Discrete(16)
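The state numbering can be illustrated with a small sketch that does not need Gym. It assumes the 4 by 4 map shown in the figure; the names GRID, state_to_cell and cell_to_state are hypothetical helpers introduced only for this illustration.

```python
# Assumed layout: the 4x4 map from the figure, read row by row.
GRID = ["SFFF", "FHFH", "FFFH", "HFFG"]
N_COLS = 4

def state_to_cell(state):
    """Return (row, column, field type) for a state number between 0 and 15."""
    row, col = divmod(state, N_COLS)
    return row, col, GRID[row][col]

def cell_to_state(row, col):
    """Return the state number of the field at (row, column)."""
    return row * N_COLS + col

print(state_to_cell(5))   # field type and position of state 5
```

States are numbered row by row, so state = row * 4 + column.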

3.2 Action space


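env.action_space   # Discrete(4)
There are 4 possible actions: left (0), down (1), right (2), up (3).
To take a random action, we use:
random_action = env.action_space.sample()
env.step(random_action)
This function returns a tuple of five values, for example:
(1, 0.0, False, False, {'prob': 0.3333333333333333})
1: the new state after the action; 2: the reward; 3: a Boolean (terminated) that is True when the agent
reaches the goal or falls into a hole; 4: a Boolean (truncated) that is True when the episode is cut off,
for example by a time limit; 5: a dictionary of additional information.
The 'prob' entry of the last value is the probability of the transition that just occurred. The agent
does not always move in the intended direction, because the frozen lake is slippery: by default it moves
in the intended direction with probability 1/3 and in each of the two perpendicular directions with
probability 1/3.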
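As a minimal sketch, the tuple returned by env.step can be unpacked into named values. The helper describe_step is hypothetical (it is not part of Gym), and it assumes the five-value tuple shown above; older Gym versions return only four values.

```python
def describe_step(step_result):
    """Give names to the five values returned by env.step (assumed format)."""
    state, reward, terminated, truncated, info = step_result
    return {
        "state": state,            # state reached after the action
        "reward": reward,          # reward received for this step
        "terminated": terminated,  # True on reaching the goal or a hole
        "truncated": truncated,    # True if the episode was cut off
        "prob": info.get("prob"),  # probability of the transition that occurred
        "done": terminated or truncated,
    }

# The example tuple from the text above.
example = (1, 0.0, False, False, {"prob": 0.3333333333333333})
print(describe_step(example))
```

With a real environment, the same helper could be called as describe_step(env.step(env.action_space.sample())).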

3.3 Probability transition


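env.P[s][a]
This gives the possible transitions from state s under action a (the agent may also stay in s). Each
entry is a tuple (probability, next state, reward, terminated). For example, for s = 0 and a = 1 (down):
[(0.3333333333333333, 0, 0.0, False), (0.3333333333333333, 4, 0.0, False), (0.3333333333333333, 1,
0.0, False)]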
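The transition list above can be rebuilt by hand, which helps in understanding where the three entries come from. The sketch below is an assumption-based reconstruction and not Gym's own code: it assumes the 4 by 4 map from the figure, a reward of 1 only on reaching G, and slippery moves in the directions (a - 1) % 4, a and (a + 1) % 4 with probability 1/3 each. The names MAP, move and slippery_transitions are hypothetical.

```python
# Assumed layout: the 4x4 map from the figure, read row by row.
MAP = ["SFFF", "FHFH", "FFFH", "HFFG"]
N = 4
# (row offset, column offset) for left (0), down (1), right (2), up (3).
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}

def move(s, direction):
    """Next state when moving from s; moves off the grid leave the agent at the border."""
    row, col = divmod(s, N)
    d_row, d_col = MOVES[direction]
    row = min(max(row + d_row, 0), N - 1)
    col = min(max(col + d_col, 0), N - 1)
    return row * N + col

def slippery_transitions(s, a):
    """List of (probability, next state, reward, terminated) for state s and action a."""
    if MAP[s // N][s % N] in "GH":
        # Assumption: holes and the goal are absorbing states.
        return [(1.0, s, 0.0, True)]
    transitions = []
    for direction in [(a - 1) % 4, a, (a + 1) % 4]:
        nxt = move(s, direction)
        field = MAP[nxt // N][nxt % N]
        reward = 1.0 if field == "G" else 0.0
        transitions.append((1 / 3, nxt, reward, field in "GH"))
    return transitions

print(slippery_transitions(0, 1))
```

For s = 0 and a = 1 this reproduces the three transitions listed above: the agent slips left (stays in 0), moves down (to 4), or slips right (to 1).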
3.4 Leaving the environment
env.close()

4. Questions
1. Simulate an episode.
2. Implement an algorithm that determines the optimal policy for reaching the goal.
