Deep Reinforcement Learning for Traffic Signal Control with Consistent
State and Reward Design Approach
1. How they managed the traffic system
They proposed a Deep Reinforcement Learning (DRL) framework using a Double Deep Q-Network (DDQN) with Prioritized Experience Replay (PER) to manage traffic signals. The core idea is to design consistent and simple definitions for both state and reward, making the agent learn optimal policies quickly and effectively.
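As a rough illustration of this setup, the sketch below computes the Double-DQN target and PER-style priorities for one minibatch; it assumes a PyTorch Q-network, and the function name, loss choice, and constants are illustrative rather than the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def ddqn_update_terms(online_net, target_net, batch, gamma=0.99):
    """Double DQN: the online net selects the next action, the target net
    evaluates it. PER priorities are derived from the absolute TD error."""
    states, actions, rewards, next_states, dones = batch
    with torch.no_grad():
        # Decouple action selection (online) from evaluation (target)
        next_actions = online_net(next_states).argmax(dim=1, keepdim=True)
        next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
        targets = rewards + gamma * (1.0 - dones) * next_q
    q = online_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    td_error = targets - q
    priorities = td_error.abs().detach() + 1e-6  # new PER priorities
    loss = F.smooth_l1_loss(q, targets)
    return loss, priorities
```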
2. Gap analysis
• Existing Issues: Prior works often used hand-crafted or inconsistent state and reward designs, which harmed convergence and real-world applicability.
• Gap Filled: This paper proposes three consistent state-reward pairs, designed to directly reflect and optimize traffic metrics like vehicle count, queue length, and waiting time.
3. Methodology used
• Reinforcement Learning (RL) using DDQN with PER.
• Three state-reward design approaches:
1. Number of vehicles (State) ↔ Vehicle count (Reward)
2. Queue length (State) ↔ Queue length (Reward)
3. Waiting time (State) ↔ Waiting time (Reward)
• Penalty mechanism for suboptimal actions to improve learning (a hedged sketch of these pairings follows this list).
• Ablation studies to evaluate the role of components like PER and DDQN.
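The pairing can be read as "the reward optimizes exactly what the state measures". A minimal sketch of what such paired rewards might look like, including a penalty term, is shown below; the function names and the penalty form are assumptions, not the paper's exact formulas.

```python
def nsr_reward(vehicles_before, vehicles_after):
    """Number-of-vehicles pair: reward is the drop in vehicle count."""
    return vehicles_before - vehicles_after

def qsr_reward(queue_before, queue_after):
    """Queue-length pair: reward is the drop in total queue length."""
    return queue_before - queue_after

def wsr_reward(wait_before, wait_after, suboptimal_action=False, penalty=1.0):
    """Waiting-time pair, with an extra penalty when the chosen action is
    judged suboptimal (the penalty form here is an assumption)."""
    reward = wait_before - wait_after
    return reward - penalty if suboptimal_action else reward
```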
4. How they collected the dataset
• Synthetic traffic flow data was generated using SUMO (Simulation of Urban MObility).
• Flows were based on Weibull and Normal distributions to simulate high and low traffic densities (a sampling sketch follows this list).
• Each simulation had origin-destination routing with realistic parameters (e.g., speed, turn ratios).
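A minimal sketch of drawing vehicle departure times from Weibull and Normal distributions for a SUMO route file; all parameter values (shape, episode length, vehicle counts) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
SIM_DURATION = 3600  # seconds per episode (illustrative)

# Weibull-shaped demand: arrivals ramp up, peak, then taper off
weibull_t = np.sort(rng.weibull(a=2.0, size=1000))
weibull_t = weibull_t / weibull_t.max() * SIM_DURATION

# Normal-shaped demand: arrivals concentrated around mid-episode
normal_t = np.sort(np.clip(rng.normal(SIM_DURATION / 2, 600, size=300),
                           0, SIM_DURATION))

# Each timestamp would become a <vehicle depart="..."/> entry in a .rou.xml file.
```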
5. Research area
• Urban traffic signal control under Intelligent Transportation Systems (ITS).
• Focused on dynamic and adaptive optimization using AI/ML techniques.
6. Did they use simulation to generate data?
Yes, the SUMO simulator was used to:
• Create a four-way intersection environment.
• Generate and manage vehicular flows.
• Evaluate agent performance in various scenarios.
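For illustration, queue length and waiting time can be read from a running SUMO instance via its TraCI Python API, as sketched below; the config file name and lane IDs are hypothetical.

```python
import traci

traci.start(["sumo", "-c", "intersection.sumocfg"])  # hypothetical config
approach_lanes = ["north_in_0", "south_in_0", "east_in_0", "west_in_0"]  # hypothetical IDs

while traci.simulation.getMinExpectedNumber() > 0:
    traci.simulationStep()
    queue = sum(traci.lane.getLastStepHaltingNumber(l) for l in approach_lanes)
    waiting = sum(traci.lane.getWaitingTime(l) for l in approach_lanes)
    # queue and waiting would feed the agent's state and reward computation
traci.close()
```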
7. Results (Final outputs from research)
• The proposed CSRD agents (QSR, NSR, WSR) outperformed:
o Traditional Fixed-Time controls
o Benchmarks like DTSE, PressLight, and LIT
• NSR (Number of vehicles) achieved the best performance across all
metrics.
• Metrics improved:
o Average Travel Time (ATT)
o Queue Length (QL)
o Waiting Time (WT)
8. Limitations (from conclusion)
• Results are based on synthetic simulations, not tested on real-world traffic systems.
• Only considered single intersections, not multi-intersection networks.
• State and reward design assumes accurate sensor data, which may not
always be available.
9. Future implementation plan (from conclusion)
• Extend the framework to multi-intersection traffic networks.
• Integrate real-world traffic data for training and validation.
• Explore hardware deployment using edge computing and real-time sensor integration.
Deep Reinforcement Q-Learning for Intelligent Traffic Signal Control with Partial Detection
1. How they managed the traffic system
• They developed a Deep Q-Learning (DQN) agent for traffic signal control at a single intersection.
• The agent makes decisions based on partially observed data from connected vehicles (CVs) using image-like state representations (partial DTSE); a state-encoding sketch follows this list.
• The goal is to minimize total squared vehicle delay by selecting traffic signal phases that adapt to real-time traffic conditions.
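A minimal sketch of a partial DTSE for a single approach lane: the lane is discretized into cells, and only CVs populate the occupancy and speed channels, so undetected vehicles simply leave gaps. Grid size and lane length are illustrative assumptions.

```python
import numpy as np

def partial_dtse(cv_positions, cv_speeds, lane_length=150.0, n_cells=30):
    """Image-like encoding of one lane; non-CV traffic is invisible."""
    occupancy = np.zeros(n_cells)
    speed = np.zeros(n_cells)
    cell_len = lane_length / n_cells
    for pos, v in zip(cv_positions, cv_speeds):
        idx = min(int(pos // cell_len), n_cells - 1)
        occupancy[idx] = 1.0  # a connected vehicle occupies this cell
        speed[idx] = v        # its reported speed
    return np.stack([occupancy, speed])  # shape (2, n_cells)
```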
2. Gap Analysis
• Existing works assume full vehicle detection, which is unrealistic.
• Few models are designed for low CV penetration rates.
• No standard exists for state representations or reward functions in DQN for TSC.
• Lack of real-world viability due to expensive infrastructure and
reproducibility issues in RL.
3. Which methodology they used
• Dueling Double Deep Q-Network (3DQN).
• The model was tested using SUMO (Simulation of Urban MObility) on three scenarios with increasing complexity.
• The agent selects signal phases based on microscopic data from CVs only.
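The dueling architecture splits the Q-function into a state value and per-action advantages. A minimal PyTorch sketch is below; layer sizes are illustrative, not the paper's exact network.

```python
import torch.nn as nn

class DuelingQNet(nn.Module):
    """Dueling head: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a)."""
    def __init__(self, state_dim, n_actions):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU())
        self.value = nn.Linear(128, 1)              # state value V(s)
        self.advantage = nn.Linear(128, n_actions)  # advantages A(s, .)

    def forward(self, x):
        h = self.features(x)
        v, a = self.value(h), self.advantage(h)
        # Subtracting the mean advantage keeps V and A identifiable
        return v + a - a.mean(dim=1, keepdim=True)
```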
4. How they collected the dataset
• Data was synthetically generated using SUMO:
o Random traffic flows (Poisson distribution).
o Varied CV penetration rates (0–100%).
o Simulated across 3600-second episodes with different intersection designs (a flow-generation sketch follows this list).
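A minimal sketch of such a generator: Poisson arrivals via exponential inter-arrival times, with each vehicle independently flagged as a CV at the desired penetration rate. The arrival rate is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
EPISODE_LEN = 3600    # seconds, matching the paper's episode length
ARRIVAL_RATE = 0.2    # vehicles per second (illustrative)
CV_PENETRATION = 0.4  # fraction of connected vehicles (swept 0-1 in the paper)

# Poisson process: exponential gaps between consecutive arrivals
gaps = rng.exponential(1.0 / ARRIVAL_RATE, size=2000)
departures = np.cumsum(gaps)
departures = departures[departures < EPISODE_LEN]

# Each vehicle is a CV (visible to the agent) with probability CV_PENETRATION
is_cv = rng.random(departures.size) < CV_PENETRATION
```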
5. Research Area
• Falls under Intelligent Transportation Systems (ITS).
• Subfield: Adaptive Traffic Signal Control using Deep Reinforcement Learning (DRL).
• Emphasis on partially observable environments and low-cost implementation with CVs.
6. Did they use simulation to generate data?
• Yes. All data was generated in SUMO, a widely used microscopic traffic simulation tool.
7. Results / Final Outputs
• Outperformed traditional algorithms (Max Pressure, SOTL) in scenarios with 4-phase programs.
• Showed robustness to diverse traffic conditions.
• Effective at a 20% CV penetration rate (acceptable), with optimal performance at ≥40%.
• Introduced a fairness-aware reward to avoid favoring only heavily used lanes (a hedged sketch follows this list).
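One plausible form of such a reward, consistent with the squared-delay objective mentioned earlier, is sketched below; squaring each vehicle's delay penalizes letting any single vehicle wait very long, so the agent cannot simply serve busy lanes while starving minor approaches. This is an assumption, not the paper's exact expression.

```python
def fairness_aware_reward(delays_before, delays_after):
    """Reward is the reduction in total squared delay across all vehicles."""
    total_sq = lambda delays: sum(d * d for d in delays)
    # Positive when the chosen phase reduced the squared-delay total
    return total_sq(delays_before) - total_sq(delays_after)
```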
8. Limitations (From Conclusion)
• Assumes perfect data from CVs, which isn't realistic.
• Trained separately for each scenario; lacks generalizability.
• Not optimized for low CV rates (<20%); model performance drops significantly.
• Only tested on synthetic data; no real-world validation yet.
9. Future Implementation Plan
• Improve robustness under imperfect CV data using probabilistic methods.
• Generalize the model across various intersection types using techniques like zero-padding.
• Redesign the reward to enable real-world learning after deployment using CV-only data.
• Explore multi-agent coordination for city-wide traffic networks.
• Validate using realistic datasets, e.g., Luxembourg SUMO Traffic (LuST).
10. Comparison Between Methodologies (Current vs. Previous)
• Detection Requirement: previous methods (Max Pressure, SOTL) need full detection (expensive); this research (3DQN with partial detection) uses partial detection from CVs (low-cost, practical).
• Adaptivity: previous methods are rule-based (static logic); this research learns from experience (adaptive and optimal).
• Performance: previous methods are good in simple scenarios; this research is superior in complex, multi-phase intersections.
• Fairness Consideration: none in previous methods; this research's designed reward promotes fairness among all directions.
• Deployment Readiness: previous methods are already in use; this research is promising but requires real-world testing and tuning.
11. Main Objective: New Idea Generation
You can explore the following new ideas based on this work:
• Hybrid reward design: Combine a CV-only data reward with estimations of non-CV behavior.
• Federated learning for intersections: Enable distributed learning without centralized data sharing.
• Transfer learning: Train in simulation, adapt the model to real-world intersections with minimal tuning.
• Incorporating weather/pedestrian data: Add more environmental factors for improved realism.
• Integration with vehicle routing apps: Use real-time routing info to predict near-future inflows.
A Reinforcement Learning Approach for Reducing Traffic Congestion Using Deep Q Learning
1. How they managed the traffic system
They used a Deep Q-Learning (DQL) model to control traffic signals at intersections dynamically. Their system:
• Focused on optimizing queue length and rewards.
• Trained an RL agent in a simulated environment to select the best traffic signal action based on real-time traffic data.
• Integrated state, action, and reward logic to adjust signals adaptively depending on vehicle flow and congestion (a minimal action-selection sketch follows this list).
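A minimal sketch of that selection step, using an epsilon-greedy policy over a two-phase action set; the action names and the q_net interface are hypothetical stand-ins.

```python
import numpy as np

ACTIONS = ["NS_GREEN", "EW_GREEN"]  # illustrative two-phase action set

def select_action(q_net, state, epsilon):
    """Explore a random phase with probability epsilon; otherwise pick the
    phase with the highest estimated Q-value."""
    if np.random.random() < epsilon:
        return np.random.randint(len(ACTIONS))
    q_values = q_net(state)  # assumed to return one value per action
    return int(np.argmax(q_values))
```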
2. Gap analysis
• Traditional systems (fixed-time or static signal controls) fail to adapt in real time.
• Earlier reinforcement learning approaches suffered from:
o Large state-space issues
o Limited adaptability to dynamic environments
o Inadequate handling of vehicle behaviors
• This paper addresses these by combining Deep Q-Networks (DQN) with hyperparameter tuning, improving efficiency and learning from sparse, dynamic inputs.
3. Methodology used
• Deep Reinforcement Learning (DRL) approach using a Deep Q-Network (DQN)
• Components:
o State: Vehicle positions, velocities, distances.
o Actions: Signal control (e.g., North-South or East-West green lights).
o Rewards: Feedback based on queue-length reduction and traffic-flow improvement.
• Simulation over 30 episodes, 240 steps per episode, 1000 vehicles, and training over 800 epochs (a training-loop skeleton using these settings is sketched after this list).
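The skeleton below arranges those settings into a standard DQN training loop; `env` and `agent` are hypothetical stand-ins with a Gym-style interface, not the paper's actual classes.

```python
def train(env, agent, n_episodes=30, steps_per_episode=240):
    """Run the reported schedule: 30 episodes of 240 steps each."""
    for episode in range(n_episodes):
        state = env.reset()  # simulated junction with ~1000 vehicles
        for _ in range(steps_per_episode):
            action = agent.act(state)                    # choose a green phase
            next_state, reward, done = env.step(action)  # advance simulation
            agent.remember(state, action, reward, next_state, done)
            state = next_state
            if done:
                break
        agent.replay()  # fit the Q-network on sampled experience
```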
4. How they collected the dataset
They used two XML datasets:
• Dataset 1: Environmental data (vehicle ID, route, speed, etc.)
• Dataset 2: Route data (edge ID, lane ID, shape, etc.)
These datasets were merged to simulate traffic at a junction and train the RL agent (a parsing sketch follows).
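Assuming the route data follows SUMO-style XML, it could be loaded with the standard library as sketched below; the file name and attribute names are assumptions based on SUMO conventions.

```python
import xml.etree.ElementTree as ET

def load_routes(route_file="routes.xml"):
    """Collect per-vehicle id, departure time, and edge sequence."""
    vehicles = []
    for veh in ET.parse(route_file).getroot().iter("vehicle"):
        route = veh.find("route")
        vehicles.append({
            "id": veh.get("id"),
            "depart": float(veh.get("depart", 0.0)),
            "edges": route.get("edges", "").split() if route is not None else [],
        })
    return vehicles
```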
5. Research area
Focused on intersection-based urban traffic management within the domain of:
• Smart cities
• Urban sustainability
• Intelligent Traffic Systems (ITS)
• Adaptive Traffic Signal Control (ATSC)
6. Did they use simulation to generate data?
Yes.
• They simulated the environment using SUMO-like structured intersections with vehicle flow and signal dynamics.
• Testing involved artificially generated vehicle traffic (1000 cars) and simulated actions over 240 time steps.
7. Final results (outputs of the research)
• Queue length reduction: 49%
• Rewards (incentives) increased: 9%
• Training queue length: 852 → Testing queue length: 418
• Training reward: −944992 → Testing reward: −8520
These figures show a significant performance improvement during testing after training.
8. Limitations (from conclusion/discussion)
• High computational cost due to the large state-action space in DQL.
• Limited to simulated environments, not yet applied in real-world systems.
• Still relies on static vehicle-flow assumptions; real traffic is more chaotic.
9. Future implementation plans
• Integrate real-time traffic data via internet connectivity and sensors.
• Implement dimensionality reduction to decrease computational complexity.
• Adopt distributed computing and parallel processing to scale the model for large cities.
• Use experience replay and efficient architectures to stabilize learning in complex environments.
10. Comparison between current and previous methods
• Signal Control: previous methods were fixed, heuristic, or traditional; the proposed DQL method is adaptive and real-time via RL.
• Queue Reduction: limited previously (14–30%); significant with the proposed method (49%).
• Reward Optimization: rarely focused previously; explicitly optimized here.
• Scalability: challenging for large states previously; managed here using hyperparameter tuning.
• Learning Capability: static or semi-dynamic previously; fully dynamic with continuous updates here.
• Techniques Used: Genetic, Fuzzy, MARL previously; Deep Q-Learning with tuning here.
Main Objective: New Idea Generation
This research lays the groundwork for further innovations like:
• Real-time adaptive signal control using cloud-connected sensors.
• Multi-agent systems for city-wide traffic optimization.
• Hybrid approaches combining DQL with graph neural networks or computer vision (camera feeds).
• Personalized traffic management for emergency vehicles or public transit.