
Workshop No. 1 Systems Sciences Foundations


Juan Santiago Ramos Ome, Arlo Nicolas Ocampo
April 8, 2025

1 Introduction
Develop an autonomous agent capable of learning and adapting to a simulated
environment using reinforcement learning. The agent will manage the traffic lights
either at a single intersection with multiple traffic participants (cars, motorcycles, and
pedestrians) or at two consecutive traffic lights on the same main street.

2 System Requirements Document


2.1 Functional Specifications
An autonomous agent will be designed to control and manage two consecutive
traffic lights located on the same street, with the goal of optimizing traffic flow
through reinforcement learning. In the simulation, the agent learns to operate the
traffic lights by observing the traffic conditions, in order to minimize waiting
time and avoid congestion at the two control points.
The integration of sensors, actuators, and reward functions is essential
for the interaction between the agent and the environment.

Table 1: Sensors

Sensor | What does the sensor do? | Data provided by the sensor
Camera at the two traffic lights | Counts stopped and moving vehicles | Number of vehicles before, between, and after the traffic lights
Status timer | Measures the time spent in a state (red, yellow, green) | Time in seconds
Passage camera | Counts the vehicles that pass when the traffic light changes state | Average number of vehicles per unit of time
People detector | Detects whether there are people waiting | Yes or no

Table 2: Actuators

Actuator | What the actuator does
First traffic light controller (A) | Changes the state of the traffic light according to the agent (red, yellow, green)
Second traffic light controller (B) | Changes the state of the traffic light according to the agent (red, yellow, green)
Timer | Adjusts the timing based on the defined parameters

Table 3: Reward functions

Situation | Reward | Why the reward?
Correct flow between the traffic lights | +1 | Promotes the correct synchronization of the traffic lights
Very long waiting times for vehicles (piling up at one or both traffic lights) | -1 or -2 | Punishes long waits
Accident or vehicle stopped between the two traffic lights | -2 or -3 | Punishes blocking the road
Crossing time decreases | +0.5 or +1 | Rewards speeding up the crossing
Unnecessary or very rapid action that affects the crossing of vehicles | -1 | Punishes very fast or unnecessary light changes affecting vehicles or people
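
A minimal sketch, in Python, of how the scores in Table 3 could be combined into a single reward signal. The observation field names and the threshold below are assumptions made for illustration, not part of the specification:

# Illustrative reward function combining the scores from Table 3.
# The observation fields and the 60-second threshold are assumed for this sketch.
def compute_reward(obs):
    reward = 0.0
    if obs["flow_ok"]:                  # vehicles move freely between lights A and B
        reward += 1.0
    if obs["max_wait_time"] > 60:       # very long waiting time, in seconds
        reward -= 2.0
    if obs["blocked_between_lights"]:   # accident or vehicle stopped between the lights
        reward -= 3.0
    if obs["crossing_time_decreased"]:  # pedestrian crossing time improved
        reward += 0.5
    if obs["unnecessary_switch"]:       # very fast or unnecessary light change
        reward -= 1.0
    return reward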

2.2 Use Cases


Use case 1

Title: Optimizing vehicle flow between the two traffic lights


Priority: High
Estimate: 5 Days

User story: As an intelligent traffic control agent, I want to learn to coordinate
two consecutive traffic lights, so that vehicle flow on the main road is
optimized and waiting time is minimized.

Acceptance Criteria:
• Given the agent is in a simulated environment during training

• When it makes decisions to switch traffic lights
• Then it learns to reduce traffic congestion and improve average travel time
Use case 2

Title: Reducing the average wait time


Priority: Medium
Estimate: 3-4 Days

User story: As an autonomous control system, I want to adjust the traffic


light in real time based on traffic variation, so that the system remains efficient
during both peak and low traffic hours.

Acceptance Criteria:
• Given traffic conditions vary over time

• When the agent receives updated sensor observations


• Then it adjusts the traffic light timing to maintain optimal traffic flow without
manual intervention

3 High-Level Architecture
The component diagram and the feedback loops are uploaded in the GitHub
repository.

4 Preliminary Implementation Outline


4.1 Potential frameworks
1. Gymnasium:
Gymnasium is a standard framework for defining reinforcement learning environments.
The principal advantages of using it for our agent are:
• It allows building the traffic-light environment as a Python class that defines
states, actions, and rewards.
• It allows observing the environment to track the agent's behavior in real time,
which will be useful during the agent testing period.
• It is compatible with RL libraries, including Stable-Baselines3.
• It is well suited to experimental projects.

How to apply it to the agent? It will be used to simulate the traffic system with
vehicles moving between two control points (traffic light A and traffic light B)
with defined transition rules, virtual sensors, and rewards.
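
A minimal sketch of such an environment, assuming a simplified observation made of three queue lengths and the two light states, and toy vehicle dynamics. The class name, action encoding, and constants below are illustrative assumptions (yellow is omitted for simplicity):

import numpy as np
import gymnasium as gym
from gymnasium import spaces

class TwoLightsEnv(gym.Env):
    """Simplified two-traffic-light environment (illustrative sketch only)."""

    def __init__(self):
        super().__init__()
        # Actions: 0 = keep both lights, 1 = switch A, 2 = switch B, 3 = switch both
        self.action_space = spaces.Discrete(4)
        # Observation: [queue before A, queue between A and B, queue after B,
        #               state of A (0 red / 1 green), state of B]
        self.observation_space = spaces.Box(low=0, high=100, shape=(5,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.queues = np.zeros(3, dtype=np.float32)
        self.lights = np.array([0.0, 0.0], dtype=np.float32)  # both lights red
        return np.concatenate([self.queues, self.lights]), {}

    def step(self, action):
        if action in (1, 3):
            self.lights[0] = 1.0 - self.lights[0]
        if action in (2, 3):
            self.lights[1] = 1.0 - self.lights[1]
        # Toy dynamics: vehicles arrive before A and advance when a light is green
        self.queues[0] += self.np_random.poisson(1.0)
        if self.lights[0] == 1.0:
            moved = min(self.queues[0], 3.0)
            self.queues[0] -= moved
            self.queues[1] += moved
        if self.lights[1] == 1.0:
            moved = min(self.queues[1], 3.0)
            self.queues[1] -= moved
            self.queues[2] += moved
        self.queues[2] = max(self.queues[2] - 2.0, 0.0)  # vehicles leave the street
        self.queues = np.clip(self.queues, 0, 100)
        reward = -float(self.queues[:2].sum())  # penalize vehicles waiting at A and B
        obs = np.concatenate([self.queues, self.lights])
        return obs, reward, False, False, {}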

2. Stable-Baselines3:
Stable-Baselines3 provides reliable, ready-to-use implementations of RL algorithms
such as DQN.
• It allows the use of advanced algorithms like DQN, which is ideal for
environments with discrete action spaces, in this case traffic lights.
• Its interface is simple to use.
• It allows metrics to be recorded, in this case the average waiting time and
the time spent standing still.
• Reward functions can be changed without rewriting the whole system.
How to apply it to the agent? It will be used to train a DQN agent that learns
to change the traffic lights based on observations such as the number of vehicles and
the waiting time, and it makes it possible to move to more complex models later if desired.
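
A minimal sketch of how training could be started with Stable-Baselines3, assuming the illustrative TwoLightsEnv class sketched above (the hyperparameters are placeholders):

from stable_baselines3 import DQN
from stable_baselines3.common.env_checker import check_env

env = TwoLightsEnv()   # the illustrative environment sketched in the Gymnasium section
check_env(env)         # verify that the environment follows the Gymnasium API

model = DQN("MlpPolicy", env, learning_rate=1e-3, buffer_size=50_000, verbose=1)
model.learn(total_timesteps=100_000)
model.save("dqn_two_lights")

# Quick rollout of the trained policy
obs, _ = env.reset()
for _ in range(100):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, _ = env.step(int(action))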

4.2 Timeline for the transition from basic Q-learning to more advanced DQN approaches
Week 1
Problem analysis and system definition

• Scenario analysis: the street with two traffic lights


• Identify the actors: sensors, environment and vehicles
• Review the functional specifications, primarily for the sensors

• Literature review of frameworks and concepts


Objective: Document the system requirements and describe the agent and the
technologies to be used

Week 2
Gymnasium design and implementation
• Define the environment states (number of vehicles on the street, people,
and time), and also define the actions (changing the state of the traffic
lights)
• Logical structure of the environment, rewards and sensors

• Implement basic methods for basic traffic simulation

Objective: Design a basic functional Gym environment (Python, Gymnasium,
NumPy)

Week 3
Traffic simulation and implement sensor logic

• Vehicle Simulation Programming


• Traffic Light Logic (timers)
• Sensor modeling (counting in zones)
Objective: Perform the traffic simulation and gather observations (Python,
Gymnasium, logical structures)
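
A minimal sketch of the zone-counting idea mentioned above, assuming each vehicle is represented by its position along the street and each virtual sensor by a position interval (all names and distances are illustrative):

# Illustrative zone counting: each camera covers an interval of street positions.
def count_vehicles_in_zone(vehicle_positions, zone_start, zone_end):
    return sum(1 for x in vehicle_positions if zone_start <= x < zone_end)

vehicles = [5.0, 12.5, 48.0, 55.0, 120.0]               # positions in metres along the street
before_a = count_vehicles_in_zone(vehicles, 0, 50)      # camera before traffic light A
between_ab = count_vehicles_in_zone(vehicles, 50, 100)  # camera between lights A and B
after_b = count_vehicles_in_zone(vehicles, 100, 200)    # camera after traffic light B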

Week 4
Training with basic Q learning
• Implementation of RL agent with Q table

• Training in multiple scenarios


Objective: Initial training with the Q-table and analysis of initial performance
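
A minimal sketch of the tabular Q-learning update planned for this week, using the standard update rule; the learning rate, discount factor, and action count are placeholder values, and states are assumed to be discretized into hashable tuples:

import numpy as np
from collections import defaultdict

alpha, gamma, epsilon = 0.1, 0.95, 0.1   # learning rate, discount factor, exploration rate
n_actions = 4                            # e.g. keep / switch A / switch B / switch both
rng = np.random.default_rng(0)

# Q-table: maps a discretized state (e.g. a tuple of binned queue lengths
# and light states) to one value per action.
Q = defaultdict(lambda: np.zeros(n_actions))

def choose_action(state):
    # Epsilon-greedy action selection over the Q-table
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    return int(np.argmax(Q[state]))

def q_update(state, action, reward, next_state):
    # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    td_target = reward + gamma * np.max(Q[next_state])
    Q[state][action] += alpha * (td_target - Q[state][action])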

Week 5
Evaluation of Q learning and parameter tuning
• Analyze the reward function and check whether it is behaving as intended
• Review data on wait times and decisions
Objective: Observe and correct the Q-learning agent using clear data

Week 6
Transition to DQN
• Implementation of Stable-Baselines3
• Implementation of DQN agents

• Training and observe if there is room for improvement


Objective: Implementation of the first DQN model (Gymnasium, Stable-Baselines3)

Week 7
Evaluation and optimization of the agent DQN
• Review the reward function to ensure it delivers the expected results

Objective: Have the agent optimized and the reward function stable

Week 8
Compare Q learning and DQN
• Compare average waiting time, congestion by zone, and light shifts

• View and interpret the results


Objective: Compare the performance of both approaches
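
A minimal sketch of how both agents could be compared on the same metric, assuming a hypothetical common act(observation) interface for the Q-table and DQN agents and the environment sketched earlier (episode count and length are placeholders):

import numpy as np

def evaluate(agent, env, episodes=10, max_steps=200):
    # Average return per episode; the same routine is run for the Q-table and the DQN agent.
    returns = []
    for _ in range(episodes):
        obs, _ = env.reset()
        total = 0.0
        for _ in range(max_steps):
            action = agent.act(obs)  # hypothetical common interface for both agents
            obs, reward, terminated, truncated, _ = env.step(action)
            total += reward
            if terminated or truncated:
                break
        returns.append(total)
    return float(np.mean(returns))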

Week 9
Documentation and delivery
• Final report
• System architecture diagrams
Objetive: Final delivery with graphics and code

5 References
• Gymnasium documentation. (n.d.). https://gymnasium.farama.org/index.html
• Stable-Baselines3 Docs - Reliable Reinforcement Learning Implementations — Stable Baselines3 2.6.1a0 documentation. (n.d.). https://stable-baselines3.readthedocs.io/en/master/
• Lin, R. (2022, March 3). Create your own environment using the OpenAI Gym library — GridWorld. Medium. https://reneelin2019.medium.com/create-your-own-environment-using-the-openai-gym-library-gridworld-8ca9f18e00a4
• Vitality Learning. (2024, November 15). Solving the Taxi Problem Using OpenAI Gym and Reinforcement Learning. Medium. https://vitalitylearning.medium.com/solving-the-taxi-problem-using-openai-gym-and-reinforcement-learning-0317e089b48f
• Tutorial: An Introduction to Reinforcement Learning Using OpenAI Gym. (n.d.). https://www.gocoder.one/blog/rl-tutorial-with-openai-gym/
