
This article has been accepted for publication in IEEE Transactions on Transportation Electrification. This is the author's version, which has not been fully edited; content may change prior to final publication. Citation information: DOI 10.1109/TTE.2025.3549747.

Spatiotemporal Optimized Dispatch of Electric Vehicles under Electricity-Carbon Joint Market
Dong Han, Huarui Zhang, Jiawen Peng, Zhuoxin Lu, Xijun Ren

Abstract—The electrification of urban transportation systems is a critical step toward achieving low-carbon transportation and meeting climate commitments. With the development of Vehicle-to-Grid technology, electric vehicles (EVs) have become a vital component of the power-transportation network. To perform the optimal control of EVs with low-carbon and spatiotemporal characteristics, this paper proposes a real-time dispatch method under the electricity-carbon joint market based on two-layer multi-agent deep reinforcement learning (MADRL). Firstly, the EVs are grouped into fleets, and a dispatch model is constructed that considers the charge/discharge arbitrage benefits, carbon trading benefits, spatial transfer costs, and capacity degradation costs of EVs. Secondly, the dispatch problem is described as a Markov game, and the model is solved through a two-layer MADRL framework that obtains the discrete spatial transfer decisions and the continuous charging/discharging decisions synergistically. Finally, extensive case studies are developed with real-world locational marginal price data and the location information of 30 charging stations in San Diego, California, US, to verify the validity of the proposed scheme. Simulation results show that the proposed method facilitates arbitrage strategies that exploit the spatiotemporal flexibility of EVs to obtain an average daily revenue of $1,968.7.

Index Terms—electric vehicle, electricity-carbon joint market, multi-agent deep reinforcement learning, spatiotemporal dispatch.

Dong Han, Huarui Zhang, Jiawen Peng, and Zhuoxin Lu are with the Department of Electrical Engineering, University of Shanghai for Science and Technology, Yangpu District, Shanghai 200093, China (email: [email protected]; [email protected]; [email protected]; [email protected]). Xijun Ren is with the Institute of Economy and Technology of State Grid Anhui Electric Power Co., Ltd., Hefei 230022, Anhui Province, China (email: [email protected]). Corresponding author: Huarui Zhang (Address: No. 516 Jungong Road, University of Shanghai for Science and Technology, Yangpu District, Shanghai, China). The paper has not been presented at a conference or submitted elsewhere previously.

NOMENCLATURE
Acronyms:
EV          Electric vehicle
PDN         Power distribution network
TN          Transportation network
V2G         Vehicle-to-Grid
PTN         Power-transportation network
LMP         Locational marginal price
DRL         Deep reinforcement learning
MADRL       Multi-agent deep reinforcement learning
MAPPO       Multi-agent proximal policy optimization
LME         Locational marginal emission
MG          Markov game
MATD3       Multi-agent twin delayed deep deterministic policy gradient
SOC         State of charge
DOD         Depth of discharge
CS          Charging station
Indices and sets:
t ∈ T       Index and set of time steps
i ∈ I       Index and set of EVs
n, m ∈ N    Index and set of charging stations
d ∈ D       Index and set of battery degradation cycles
RO          Set of roads
W           Set of weights for roads
j ∈ J       Index and set of decision steps
L^arr_{i,j}, C^arr_{i,j}   Sets of LMP and LME at all charging stations upon the ith EV's arrival
S^ch_i, A^ch_i    Sets of states and actions of the charging/discharging decision layer for the ith agent
S^m_i, A^m_i      Sets of states and actions of the spatial transfer decision layer for the ith agent
Parameters:
Δt          Interval of a time step
c^battery   Cost coefficient of capacity degradation
c^tra       Cost coefficient of spatial transfer
P_max       Maximum power of the batteries
E^ini       Initial capacity of the batteries
α_sei, β_sei   Coefficients of the solid electrolyte interface film
SOC_max     Upper limit of the SOC
SOC_min     Lower limit of the SOC
α_tra, β_tra   Retardation coefficients
c^t         Carbon quota price
L_ev        Maximum mileage of the EV per unit of electric energy
E_r         Maximum carbon emissions of fuel vehicles driving 1 km
c^SOC       Penalty coefficient of SOC constraint violation
Variables:
R^a         Arbitrage benefit from charging and discharging
R^E         Carbon trading benefit
C^battery   Battery capacity degradation cost


C^tra       Spatial transfer cost
λ_{n,t,i}   Binary variable indicating whether the ith EV stays at charging station n at time step t
pt_{n,t}    LMP of charging station n at time step t
P_{n,t,i}   Charging/discharging power of the ith EV at charging station n at time step t
E^loss_i    Degradation capacity of the ith EV
t^tra_i     Transfer time of the ith EV
η^ch, η^dis   Charging and discharging efficiency
P^ch, P^dis   Charging and discharging power
f_d         Comprehensive stress factor of cycle d
E^loss', E^loss   Capacity loss of the battery before and after the current period
SOC_{t,i}   State of charge of the ith EV at time step t
r_nm        Road from station n to station m
tr^0_nm     Free-flow travel time of road r_nm
Cap_{r_nm}, x_{r_nm,t}   Capacity and real-time traffic volume of road r_nm at time step t
γ_{nm,t,i}  Binary variable indicating whether the ith EV travels from charging station n to charging station m at time step t
M^{ev1}_{i,t}   Carbon emission quota of the ith EV at time step t
M^{ev2}_{i,t}   Carbon emission generated by the ith EV at time step t
e_{n,t}     LME of charging station n at time step t
CS_{i,j}    Charging station of the ith EV at decision step j
t_j         Current time step at decision step j
SOC_{i,j}   Current SOC of the ith EV at decision step j
λ^s_{i,j}, e^s_{i,j}   LMP and LME of the charging station at which the ith EV is located at decision step j
P_{i,j}     Charging/discharging power of the ith EV at decision step j
η^ch_{i,j}, η^dis_{i,j}   Charging and discharging efficiency of the ith EV at decision step j
r^m_{i,j}   Reward of the spatial transfer decision layer for the ith EV at decision step j
r^ch_{i,j}  Reward of the charging/discharging decision layer for the ith EV at decision step j
r^tra_{i,j} Spatial transfer cost of the ith EV at decision step j
r^oc_{i,j}, r^oc'_{i,j}   Punishment of constraint violation of the ith EV at decision step j
r^a_{i,j}   Arbitrage benefit of the ith EV at decision step j
r^E_{i,j}   Carbon trading benefit of the ith EV at decision step j
r^battery_{i,j}   Capacity degradation cost of the ith EV at decision step j
α^battery_i   Cost coefficient related to the charging and discharging power of the ith EV
E^start_d, E^end_d   Remaining capacity before and after the dth cycle

I. INTRODUCTION

Regarding the continuing increase of carbon emissions and the growth of energy consumption, the low-carbon economy has become a focus of governments and research institutions worldwide. In this regard, the decarbonization of the transportation sector is particularly essential, and achieving low-carbon vehicle behaviors is significant for the transportation area. The electric vehicle (EV), as a low-pollution and spatially flexible transportation technology, has gradually received much attention [1]. In 2023, global EV sales reached 14 million, and they are projected to reach 17 million in 2024 [4].

A. Motivation
The widespread adoption of EVs has intensified the interdependence between the power distribution network (PDN) and the transportation network (TN) [5]. At the same time, the rapid proliferation of EVs exposes weaknesses in the structure and operation of the PDN, which is unable to adequately meet the sharply increasing charging demands of users [6]. On the one hand, EVs can perform spatial transfers owing to their transportation characteristics. On the other hand, the development of Vehicle-to-Grid (V2G) technology enables EVs not only to consume electricity but also to act as storage and providers of electricity through bidirectional charging and discharging. Therefore, leveraging coordinated scheduling in the power-transportation network (PTN), aggregators manage EVs as fleets, integrating and optimizing their charging/discharging and spatial transfer behaviors across various scenarios. This approach effectively meets the operational needs of the PTN and achieves its low-carbon and efficient operation [7].

B. Literature review
Current research has already explored various applications of EVs across multiple scenarios. Due to their spatiotemporal flexibility and the development of V2G technology, EVs are primarily applied in fields such as renewable energy integration [10], enhancing the resilience of the PDN [12], and providing auxiliary services to the grid, including peak shaving [14], voltage regulation [15], and local congestion relief [16]. Reference [17] proposes a heuristic-algorithm-based discrete charging and discharging dispatch method for EVs, which effectively fills the valley load of the grid. In [18], the authors introduce a mutually beneficial operational framework for virtual power plants and EV CSs, coordinating multiple stakeholders to reduce EV charging costs. The above studies have delved into the flexible charging and discharging strategies of EVs in the PDN, but EVs generally also need to operate in a complex TN. Therefore, in order to fully explore the spatial flexibility of EVs and achieve more efficient energy utilization and more stable grid operation, it is necessary to study the collaborative control strategy of EVs in the PTN. The authors in [19] design a fast-charging navigation strategy for EVs based on weighted pricing in coupled networks, achieving coordinated economic operation of EVs in the PTN.


Reference [20] develops a genetic-algorithm-based method for the planning of charging infrastructure, vehicle dispatch, and charging management of battery electric buses, which improves the economic benefits of battery electric bus operation. Great achievements have been made in the above works. Nonetheless, some limitations still exist. In practical scenarios, variations in EV charging demand, electricity prices, battery status, and traffic conditions are much more complex, which requires real-time scheduling strategies that respond to dynamic charging demands and time-varying electricity prices. A real-time dispatch strategy for EVs in the PTN that accurately perceives environmental changes should therefore be considered.
The flexible charging services of EVs have brought considerable benefits. Reference [21] develops a hierarchical energy trading framework to induce and coordinate EV charging demand and distributed energy resource (DER) generation in local distribution networks, which benefits both EV owners and DER investors through secure local energy trading. In [22], the authors propose a dynamic control strategy for charging EVs in response to regulation signals, and the results show that this strategy can significantly improve the profitability of EV charging control. Another work, [23], designs a coordinated operation strategy between EV CSs and distribution system operators and integrates a peer-to-peer (P2P) trading model based on the Nash bargaining game to maximize the profits of the participating entities. These studies investigate the economic benefits of EVs in the electricity market in depth. However, the impact of carbon quotas on the spatiotemporal dispatch strategy of EVs has not been considered. With the continuous improvement of carbon trading mechanisms, the economic viability of EVs in the electricity-carbon joint market should be studied. The authors in [24] develop a probabilistic carbon footprint management strategy, in which direct and indirect carbon emissions are restricted by a chance-constrained carbon footprint management model on both the supply and demand sides. In [25], a multistage low-carbon EV charging facility planning model is adopted for the PTN, in which the carbon emission amount on the consumption side is calculated by carbon emission flow. The works above set carbon caps to restrict the carbon emissions of individual power lines, which may overlook uncertainty and system dynamics. Therefore, we adopt a carbon emission estimation method based on the LMP, which more directly characterizes the spatiotemporal differences in carbon emissions.
In terms of model solving, the charging/discharging behaviors as well as the spatial transfer behaviors of EVs exhibit randomness and unpredictability [26]. The spatiotemporal dispatch problem of EVs in the PTN requires accurate characterization of uncertainties such as traffic conditions, load distribution, renewable energy distribution, and electricity prices. Model-driven solution methods use scenario-based stochastic programming to deal with uncertainty, but they capture only a small number of representative scenarios, resulting in insufficiently representative results. Moreover, due to the large number of integer variables, the solution complexity of model-based algorithms increases exponentially with the size of the problem, resulting in low efficiency. Given this bottleneck of model-driven solving methods, previous studies have shown that deep reinforcement learning (DRL), as a data-driven method, presents excellent performance in dealing with uncertainty and complex modeling problems [27]. For instance, one study proposes an EV fleet charging strategy based on DRL to prevent grid overload caused by disorderly EV fleet charging [28]. Another work, [29], designs a joint charging and order dispatch scheme for large-scale shared EV fleets based on DRL, aiming to maximize the benefits of fleet operators. In practice, multiple EV spatiotemporal dispatch strategies must be carefully characterized. Therefore, multi-agent deep reinforcement learning (MADRL), as an extension of DRL, is widely used for routing problems in the PTN. The authors in [30] introduce a hierarchical MARL to coordinate the dispatch of repair crews effectively towards system resilience. Currently, most DRL algorithms for EVs consider only a single action space. However, multiple decision-making actions, such as discrete spatial transfer actions and continuous charging/discharging actions, are required to make precise spatiotemporal decisions for EVs in PTNs. Therefore, a multi-action-space MADRL algorithm is needed to achieve comprehensive optimization of EV spatiotemporal dispatch.

TABLE I
COMPARATIVE FEATURES OF RECENT SIMILAR RESEARCH

Ref.           | PDN | TN | Electricity market | Electricity-carbon joint market | Real-time dispatch
[17,18,22,23]  |  √  |    |         √          |                                 |
[19,20]        |  √  | √  |                    |                                 |
[21,29]        |  √  |    |         √          |                                 |         √
[24]           |  √  |    |         √          |                √                |
[25]           |  √  | √  |         √          |                √                |
[28]           |  √  |    |         √          |                                 |         √
[30]           |  √  | √  |                    |                                 |
This paper     |  √  | √  |         √          |                √                |         √

C. Contributions
To fill the aforementioned research gaps, this paper aims to solve the real-time spatiotemporal dispatch problem of EVs with V2G in the electricity-carbon joint market, so as to maximize the benefits of their aggregator. The differences between this paper and recent research works are summarized in Table I. The major contributions of this paper are threefold:
1) A traffic information network, an energy information network, and a carbon emission information network are constructed in this paper to meet the information needs of real-time low-carbon spatiotemporal dispatch of EV fleets in the PTN. We develop an optimization model considering electricity-carbon joint trading, routing, and the battery characteristics of EVs, with the objective of maximizing the net benefits of EV aggregators.


2) A collaborative solution framework for the discrete action space and the continuous action space based on two-layer MADRL is designed. A sequential training strategy is proposed to ensure training sufficiency and model convergence.
3) Through a simulation in a real-world scenario, the stability of the proposed method in making real-time decisions in response to electricity price changes, traffic congestion, etc., is evaluated, and the scalability of the method over a long-term scale is verified.

D. Organization of the paper
The rest of this paper is organized as follows. Section II presents the spatiotemporal dispatch mechanism of EV fleets in the electricity-carbon joint market and explains the spatiotemporal dispatch model of EVs, which takes into account carbon trading benefits, spatial transfer costs, capacity degradation costs, and arbitrage benefits. Section III describes the two-layer MADRL solution framework and the sequential training process. Section IV provides the case study, and conclusions are drawn in Section V.

II. Mathematical models

A. Problem setting
In the electricity-carbon joint market, the mechanism for the low-carbon spatiotemporal optimized dispatch of EVs is shown in Fig. 1. Within the PTN, EVs can obtain spatial transfer time information from the traffic information network, LMP data from the energy information network, and locational marginal emission (LME) data from the carbon emission information network, as well as their own energy storage battery information. All EVs transmit global observations to the aggregator, which utilizes a two-layer MADRL algorithm to generate the spatial transfer decision and the charging/discharging decision, thereby guiding the behavior of EVs in both the TN and the PDN. The spatial transfer decision determines the charging station (CS) where the EV will be located at the next time step, while the charging/discharging decision determines the EV's charging/discharging amount at the current time step. The aggregator can not only direct EVs to discharge at CSs with high LMP and charge at CSs with low LMP, thus participating in the electricity market to gain arbitrage benefits, but also participate in the carbon trading market according to the allocated carbon emission quotas of EVs and their real-time carbon emissions.
Fig. 1. Spatiotemporal dispatch mechanism of EVs.


The basic settings of the spatiotemporal dispatch of EVs are assumed as follows: 1) Several EVs form an EV fleet, and EVs assigned to the same fleet execute the same policy. EV fleets can share observation information and participate in the electricity-carbon joint market through the centralized dispatch of the aggregator, whose goal is to maximize the total benefits. 2) EVs support V2G technology. When stationed at a CS, EVs can perform charging, discharging, or idle operations. While moving within the TN, EVs experience energy loss. 3) All roads connecting CSs in the TN are bidirectional, and EVs are not allowed to change their destination while moving. 4) The CSs are connected to the PDN, and the charging and discharging prices are measured by the LMP. Carbon emissions are generated only during the charging of EVs and are measured by the LME, which is the incremental carbon emission of the system brought by increasing a unit of load. 5) Both the LMP and the LME come from the power flow; therefore, the LME can be estimated from the LMP as a precondition for the calculation.

B. Model formulation
1) Objective function
Considering the arbitrage benefits from EV charging and discharging, the carbon market trading benefits, the battery capacity degradation costs, and the spatial transfer costs, the objective function is established to maximize the net benefits of EVs in the electricity-carbon joint market:

\max f = R^a + R^E - C^{battery} - C^{tra}    (1)

Eq. (1) represents the net benefits of EVs over a single dispatch period. R^a represents the arbitrage benefits of EVs from charging and discharging, calculated with Eq. (2). R^E represents the carbon trading benefits of EVs, calculated with Eq. (3). C^{battery} represents the battery capacity degradation costs of EVs, calculated with Eq. (4). C^{tra} represents the spatial transfer costs of EVs, calculated with Eq. (5).

R^a = \sum_{t=0}^{T} \sum_{i \in I} \sum_{n \in N} \lambda_{n,i,t}\, pt_{n,t}\, P_{n,i,t}\, \Delta t    (2)

R^E = \sum_{t=0}^{T} \sum_{i \in I} R^E_{i,t}    (3)

C^{battery} = c^{battery} \sum_{i \in I} E^{loss}_i    (4)

C^{tra} = c^{tra} \sum_{i \in I} t^{tra}_i    (5)

where \lambda_{n,i,t} \in \{0,1\} indicates whether the ith EV stays at CS n at time step t; pt_{n,t} represents the LMP at CS n at time step t; P_{n,i,t} represents the charging/discharging power of the ith EV at time step t, with P_{n,i,t} > 0 representing discharging and P_{n,i,t} < 0 representing charging; R^E_{i,t} represents the carbon market trading benefit of the ith EV at time step t; E^{loss}_i represents the degradation capacity of the ith EV; c^{battery} represents the cost coefficient of capacity degradation; c^{tra} represents the spatial transfer cost per unit time of the ith EV; and t^{tra}_i represents the transfer time of the ith EV.

horized licensed use limited to: Ballari Institute of Technology & Management (formerly Bellary Eng College). Downloaded on March 17,2025 at 09:04:18 UTC from IEEE Xplore. Restrictions app
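As a quick illustration of how Eqs. (1)–(5) combine, the sketch below evaluates the net benefit of a completed dispatch period from array-valued inputs. The function and variable names are illustrative assumptions for this example, not the authors' implementation.

```python
import numpy as np

def net_benefit(stay, lmp, power, dt, carbon_benefit, c_battery, e_loss, c_tra, t_tra):
    """Net benefit of the EV fleet over one dispatch period, Eqs. (1)-(5).

    stay[n, i, t]        : 1 if EV i is parked at CS n at step t, else 0
    lmp[n, t]            : locational marginal price at CS n and step t ($/kWh)
    power[n, i, t]       : discharging (>0) / charging (<0) power of EV i (kW)
    dt                   : length of one time step (h)
    carbon_benefit[i, t] : carbon trading benefit R^E_{i,t} of EV i at step t ($)
    e_loss[i]            : capacity degradation E_i^loss of EV i over the period (kWh)
    t_tra[i]             : total transfer time t_i^tra of EV i over the period (h)
    """
    r_a = dt * np.einsum('nit,nt,nit->', stay, lmp, power)  # Eq. (2): arbitrage benefit
    r_e = carbon_benefit.sum()                              # Eq. (3): carbon trading benefit
    c_bat = c_battery * e_loss.sum()                        # Eq. (4): degradation cost
    c_transfer = c_tra * t_tra.sum()                        # Eq. (5): spatial transfer cost
    return r_a + r_e - c_bat - c_transfer                   # Eq. (1): net benefit f
```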
© 2025 IEEE. All rights reserved, including rights for text and data mining and training of artificial intelligence and similar technologies. Personal use is permitted,
but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.
This article has been accepted for publication in IEEE Transactions on Transportation Electrification. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/TTE.2025.3549747

2) Constraints
In order to meet the information requirements of the real-time spatiotemporal dispatch of EVs, this paper constructs an energy information network, a traffic information network, and a carbon emission information network. The three networks and the associated constraints are as follows.
a) Energy information network
The energy information network provides the LMP of all CSs in the PDN for the EVs. The energy constraints are as follows.
The constraint on the charging and discharging power of the ith EV at time step t is

-\lambda_{n,i,t} P_{max} \le P_{n,i,t} \le \lambda_{n,i,t} P_{max}    (6)

where P_{max} represents the maximum power of the EVs.
The capacity degradation of the battery depends on the environmental temperature, the depth of discharge (DOD), the SOC, and the battery operating time. Inspired by reference [31], a semi-empirical battery capacity degradation model is used to calculate the capacity loss of the energy storage batteries within one cycle:

E^{loss} =
\begin{cases}
E^{ini}\left[1 - \alpha_{sei}\exp\!\left(-\beta_{sei}\sum_{d} f_d\right) - (1-\alpha_{sei})\exp\!\left(-\sum_{d} f_d\right)\right], & E^{loss'} = 0 \\
E^{ini} - \left(E^{ini} - E^{loss'}\right)\exp\!\left(-\sum_{d} f_d\right), & E^{loss'} \neq 0
\end{cases}    (7)

where E^{loss'} and E^{loss} represent the capacity loss of the energy storage battery before and after the current period, respectively; \alpha_{sei} and \beta_{sei} are the coefficients of the solid electrolyte interface film formed when the battery is manufactured; and f_d is the comprehensive stress factor of cycle d, related to the temperature, DOD, SOC, and battery operating time, which is obtained by the method proposed in reference [32].
To ensure the proper operation of the EVs, the SOC of any EV at any time step must not exceed the upper and lower limits of the battery SOC. The SOC constraints of the ith EV at time step t are

SOC_{t,i} \ge SOC_{min}    (8)
SOC_{t,i} \le SOC_{max}    (9)

where SOC_{t,i} represents the state of charge of the ith EV during time step t. Constraint (8) enforces the lower limit of the SOC of the ith EV at time step t, reflecting the range anxiety of EV owners. Constraint (9) enforces the upper limit of the SOC of the ith EV at time step t, preventing safety hazards due to overcharging.
To ensure the normal operation of the EVs in the next dispatch period, the SOC of the EVs at the end of the cycle must equal the initial SOC:

SOC_{t_{max}+1,i} = SOC^{init}_i    (10)

where SOC^{init}_i represents the SOC of the ith EV at the initial time step.
b) Traffic information network
The traffic information network provides the spatial transfer time of all roads in the TN for the EVs. The traffic constraints are as follows.
The network topology of the TN can be effectively modeled by graph theory. Let the connected directed graph G = (N, RO, W) represent the traffic information network, where N and RO represent the node set and the arc set of the graph, respectively. The arc connecting node n = 1, 2, ..., N and node m = 1, 2, ..., N is denoted as r_{nm}. Let W = \{T_{r_{nm},t}, r_{nm} \in RO\} be the set of road weights, reflecting the transfer time from CS n to CS m at time step t. T_{r_{nm},t} denotes the time required for EVs to travel on road r_{nm} at time step t.
To accurately calculate the spatial transfer cost, the Bureau of Public Roads (BPR) function is used to compute the spatial transfer time, thoroughly accounting for real-time traffic congestion in the network. The travel time on road r_{nm} at time step t is expressed as

T_{r_{nm},t} = tr^0_{nm}\left[1 + \alpha_{tra}\left(\frac{x_{r_{nm},t}}{Cap_{r_{nm}}}\right)^{\beta_{tra}}\right]    (11)

where tr^0_{nm} is the free-flow travel time of road r_{nm}, which depends on the length of the road; Cap_{r_{nm}} and x_{r_{nm},t} are the capacity and the real-time traffic volume of road r_{nm} at time step t, respectively; and \alpha_{tra}, \beta_{tra} are the retardation coefficients. The spatial transfer time of the ith EV over a dispatch period is

t^{tra}_i = \sum_{t=0}^{T} \sum_{nm} \gamma_{nm,i,t}\, T_{r_{nm},t}    (12)

where \gamma_{nm,i,t} \in \{0,1\} indicates whether the ith EV travels from CS n to CS m at time step t.
The spatiotemporal dispatch rules for EVs follow the spatial transfer constraints of energy storage in reference [32]. Additionally, to ensure the proper dispatch of the EVs in the next dispatch period, EVs must return to their initial CS at the end of the dispatch period. The ith EV must satisfy constraint (13):

\lambda_{CS^{start}_i,1,i} = \lambda_{CS^{start}_i,T,i} = 1    (13)

where CS^{start}_i represents the initial CS where the ith EV was located.
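A minimal sketch of the congestion-aware travel-time calculation in Eqs. (11)–(12) is given below. The default retardation coefficients 0.15 and 4.0 are the commonly used BPR values and are only assumptions here; the values actually used in the paper are given in Appendix B.

```python
import numpy as np

def bpr_travel_time(t0_free, volume, capacity, alpha_tra=0.15, beta_tra=4.0):
    """Travel time on road r_nm at time step t, Eq. (11)."""
    return t0_free * (1.0 + alpha_tra * (volume / capacity) ** beta_tra)

def transfer_time(gamma, travel_time):
    """Total transfer time of one EV over the dispatch period, Eq. (12).

    gamma[n, m, t]       : 1 if the EV travels from CS n to CS m at step t, else 0
    travel_time[n, m, t] : BPR travel time of road r_nm at step t
    """
    return float(np.sum(gamma * travel_time))
```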

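The semi-empirical capacity-fade model of Eq. (7) above can be evaluated per period as sketched below. The branch selection and the argument names follow the reconstruction of Eq. (7) and are intended only as an illustration, not as the authors' code.

```python
import math

def capacity_loss(e_ini, e_loss_prev, f_d_sum, alpha_sei, beta_sei):
    """Capacity loss E^loss after the current period, Eq. (7).

    e_ini       : initial battery capacity E^ini (kWh)
    e_loss_prev : accumulated loss E^loss' before the current period (kWh)
    f_d_sum     : sum of the comprehensive stress factors f_d over the counted cycles
    """
    if e_loss_prev == 0.0:
        # Fresh battery: solid-electrolyte-interface film formation dominates.
        return e_ini * (1.0 - alpha_sei * math.exp(-beta_sei * f_d_sum)
                        - (1.0 - alpha_sei) * math.exp(-f_d_sum))
    # Aged battery: fading continues from the remaining capacity.
    return e_ini - (e_ini - e_loss_prev) * math.exp(-f_d_sum)
```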

c) Carbon emission information network
The carbon emission information network provides the LME of all CSs in the PDN for the EVs. The carbon emission constraints are as follows.
The EV aggregator owns the total carbon quotas of the EVs and participates in the carbon trading market. The carbon trading benefit of the ith EV at time step t is

R^E_{i,t} = c^t \left( M^{ev1}_{i,t} - M^{ev2}_{i,t} \right)    (14)

where c^t represents the carbon quota price; M^{ev1}_{i,t} represents the carbon emission quota of the ith EV at time step t, calculated with Eq. (15); and M^{ev2}_{i,t} represents the carbon emission generated by the charging behavior of the ith EV at time step t, calculated with Eq. (16).

M^{ev1}_{i,t} = P_{n,i,t}\, L_{ev}\, E_r    (15)

M^{ev2}_{i,t} =
\begin{cases}
e_{n,t}\, P_{n,i,t}\, \Delta t, & P_{n,i,t} < 0 \ \text{(charging)} \\
0, & \text{else}
\end{cases}    (16)

where L_{ev} represents the maximum mileage of an EV per unit of electric energy, E_r represents the maximum carbon emission of an ordinary fuel vehicle per kilometer, and e_{n,t} represents the LME of CS n at time step t.

III. Two-layer Multi-agent Deep Reinforcement Learning Model of EVs

In order to solve the spatiotemporal dispatch problem of EVs considering carbon trading with a MADRL algorithm, it is necessary to express the problem in Section II as a Markov game (MG) tuple \{I, S_i, A_i, P, R, \gamma\}, where I represents the number of agents, S_i represents the joint states of the ith agent, A_i represents the joint actions of the ith agent, P represents the state transition probability of the agents, R represents the global cumulative reward function, and \gamma represents the reward discount factor [33].

A. Algorithm Model
In the spatiotemporal dispatch problem of EVs, each EV performs two types of decisions: discrete spatial transfer decisions and continuous charging/discharging decisions. To coordinate the discrete and continuous actions of the EVs, a two-layer decision-making framework based on the MADRL algorithm is established in this paper. The MG tuple for the EV agents is defined as follows.
1) State space
a) Spatial transfer decision layer
The state of the ith EV in the spatial transfer decision layer at decision step j is expressed as

s^m_{i,j} = \{CS_{i,j}, t_j, SOC_{i,j}, T^r_{i,j}, L^{arr}_{i,j}, C^{arr}_{i,j}\} \in S^m_i    (17)

where CS_{i,j} represents the current CS where the ith EV is located at decision step j, t_j represents the current time step, and SOC_{i,j} represents the current SOC of the ith EV at decision step j. The vector T^r_{i,j} represents the time required for the ith EV to move to each of the remaining CSs at decision step j. The vectors L^{arr}_{i,j} and C^{arr}_{i,j} represent the LMP and LME at all CSs upon the ith EV's arrival, respectively.
b) Charging/discharging decision layer
The state of the ith EV in the charging/discharging decision layer at decision step j is expressed as

s^{ch}_{i,j} = \{CS_{i,j}, t_j, SOC_{i,j}, \lambda^s_{i,j}, e^s_{i,j}\} \in S^{ch}_i    (18)

where \lambda^s_{i,j} and e^s_{i,j} represent the LMP and LME of the CS where the ith EV is located at decision step j, respectively.
2) Action space
a) Spatial transfer decision layer
The action of the ith EV in the spatial transfer decision layer at decision step j is to select the CS for the next decision step:

a^m_{i,j} = \{CS^{next}_{i,j}\} \in A^m_i    (19)

b) Charging/discharging decision layer
The action of the ith EV in the charging/discharging decision layer at decision step j is the charging/discharging power at the current CS:

a^{ch}_{i,j} = \{P_{i,j}\} \in A^{ch}_i    (20)

3) Transition function
The ith EV performs actions according to the state s_{i,j} at decision step j and interacts with the environment to transition to the next state s_{i,j+1}. The traffic information, LMP, and LME are updated from the input dataset, while the CS where the agent is located, the current time step, and the SOC are updated as follows:

CS_{i,j+1} = a^m_{i,j}    (21)

t_{j+1} =
\begin{cases}
t_j + 1, & CS_{i,j} = a^m_{i,j} \\
t_j + T^{tra}_{i,j}(a^m_{i,j}), & \text{else}
\end{cases}    (22)

where T^{tra}_{i,j}(a^m_{i,j}) represents the time required for the ith EV to move from the current station CS_{i,j} to a^m_{i,j}.
If the agent chooses to make a spatial transfer decision, the SOC is updated based on the energy consumed by the EV during the movement:

SOC_{i,j+1} = SOC_{i,j} - \frac{L^{SOC}_{i,j}(a^m_{i,j})}{E^{ini}}    (23)

where L^{SOC}_{i,j}(a^m_{i,j}) represents the energy lost by the ith EV when moving from the current station CS_{i,j} to a^m_{i,j}.
If the agent chooses to make a charging/discharging decision, the SOC is updated based on the amount of energy charged or discharged by the EV:

SOC_{i,j+1} =
\begin{cases}
SOC_{i,j} - \dfrac{a^{ch}_{i,j}\,\Delta t\,\eta^{ch}_{i,j}}{E^{ini}}, & a^{ch}_{i,j} < 0 \\
SOC_{i,j} - \dfrac{a^{ch}_{i,j}\,\Delta t}{\eta^{dis}_{i,j}\,E^{ini}}, & a^{ch}_{i,j} \ge 0
\end{cases}    (24)
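The transition logic of Eqs. (21)–(24) for a single EV agent can be written compactly as below. The sign convention (positive power means discharging) and the container names are assumptions made for this illustration.

```python
def transition(cs, t, soc, action_cs, action_p, travel_time, travel_energy,
               dt, e_ini, eta_ch, eta_dis):
    """One-step state update of an EV agent, Eqs. (21)-(24).

    travel_time[cs][dest]   : transfer time T^tra from cs to dest (time steps)
    travel_energy[cs][dest] : energy L^SOC consumed by the transfer (kWh)
    """
    if action_cs == cs:                                   # stay and charge/discharge
        t_next = t + 1                                    # Eq. (22), first case
        if action_p < 0:                                  # charging raises the SOC
            soc_next = soc - action_p * dt * eta_ch / e_ini
        else:                                             # discharging lowers the SOC
            soc_next = soc - action_p * dt / (eta_dis * e_ini)
    else:                                                 # spatial transfer
        t_next = t + travel_time[cs][action_cs]           # Eq. (22), second case
        soc_next = soc - travel_energy[cs][action_cs] / e_ini   # Eq. (23)
    return action_cs, t_next, soc_next                    # Eq. (21): next CS
```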

4) Reward function
a) Spatial transfer decision layer
The reward of the spatial transfer decision layer is the sum of the reward values of the spatial transfer decisions of all EVs. The reward of the ith EV in the spatial transfer decision layer at decision step j is

r^m_{i,j} = r^{ch}_{i,j} + r^{tra}_{i,j} + r^{oc}_{i,j}    (25)

r^{tra}_{i,j} = -c^{tra}\, T^{tra}_{i,j}(CS^{next}_{i,j})    (26)

r^{oc}_{i,j} =
\begin{cases}
-c^{tra}\, T^{tra}_{i,j}(CS_{i,0}), & CS_{i,T} \neq CS_{i,0} \\
0, & \text{else}
\end{cases}    (27)

where r^{ch}_{i,j} represents the reward of the ith EV in the charging/discharging decision layer at decision step j, obtained by Eq. (28); r^{tra}_{i,j} represents the spatial transfer cost of the ith EV at decision step j; and r^{oc}_{i,j} represents the cost for the ith EV to return to the initial CS at the end of the dispatch period under constraint (13).
b) Charging/discharging decision layer
The reward of the charging/discharging decision layer is the total reward value of the charging/discharging decisions of all EVs. The reward of the ith EV in the charging/discharging decision layer at decision step j is

r^{ch}_{i,j} = r^a_{i,j} + r^E_{i,j} + r^{oc'}_{i,j} + r^{battery}_{i,j}    (28)

r^a_{i,j} = \lambda^s_{i,j}\, P_{i,j}\, \Delta t    (29)

r^E_{i,j} =
\begin{cases}
c^{t_j} P_{i,j}\left(L_{ev} E_r - e^s_{i,j}\,\Delta t\right), & P_{i,j} < 0 \\
c^{t_j} P_{i,j}\, L_{ev} E_r, & P_{i,j} \ge 0
\end{cases}    (30)

r^{oc'}_{i,j} =
\begin{cases}
-c^{SOC}\left| SOC_{i,j+1} - SOC_{i,0} \right|, & j = J \\
0, & \text{else}
\end{cases}    (31)

where r^a_{i,j} and r^E_{i,j} represent the arbitrage benefit and the carbon trading benefit from charging/discharging for the ith EV at decision step j, respectively; r^{oc'}_{i,j} is the penalty term arising from constraint (10), representing the cost for the ith EV of adjusting the SOC back to the initial SOC at the end of the dispatch period; r^{battery}_{i,j} represents the battery capacity degradation cost of the ith EV at decision step j, calculated with Eq. (32); c^{SOC} represents the cost coefficient for SOC variation; and J represents the maximum decision step.
To calculate the capacity degradation cost at each decision step, the degradation cost is computed from the charging/discharging power at each decision step, inspired by the recent work in [32]:

r^{battery}_{i,j} = \alpha^{battery}_i\, P_{i,j}    (32)

\alpha^{battery}_i = c^{battery}\, \frac{E^{start}_d - E^{end}_d}{\sum_{j=1}^{T_d} P_{i,j}}    (33)

where \alpha^{battery}_i represents the cost coefficient related to the charging/discharging power; E^{start}_d and E^{end}_d are the remaining capacities before and after the dth cycle, respectively; and T_d indicates that \alpha^{battery}_i is updated every T_d steps.

B. Algorithm implementation
1) Spatial transfer decision layer
The spatial transfer decision layer adopts the MAPPO algorithm to govern the agents' spatial transfer behavior. MAPPO is an extension of the PPO algorithm designed to address policy optimization in multi-agent systems. It is based on an Actor-Critic architecture and is known for its fast convergence and relatively straightforward parameter tuning.
To improve sample efficiency, the MAPPO algorithm employs importance sampling. For each agent, it utilizes two Actor networks: Actor(new), the network being optimized, and Actor(old), a fixed network used to collect data and estimate the new policy. After a certain batch of updates, Actor(old) is synchronized with Actor(new), so that the same batch of training data can be reused. The optimization objective is

J^{old}(\theta) = \mathbb{E}_{(s,a)\sim\pi_{\theta_{old}}}\left[\frac{\pi_\theta}{\pi_{\theta_{old}}}\, A^{old}(s, a)\right]    (34)

To ensure training stability, the network parameters \theta and \theta_{old} should produce similar action probability distributions for the same input state. The MAPPO algorithm uses a clipping function to constrain the update speed of the policy network, ensuring that the new policy remains close to the old policy:

J^{PPO}(\theta) = \mathbb{E}_{(s,a)\sim\pi_{\theta_{old}}}\left[\min\!\left(\frac{\pi_\theta}{\pi_{\theta_{old}}} A^{old},\; \mathrm{clip}\!\left(\frac{\pi_\theta}{\pi_{\theta_{old}}}, 1-\varepsilon, 1+\varepsilon\right) A^{old}\right)\right]    (35)

where clip(·) is the clipping function used to constrain \pi_\theta/\pi_{\theta_{old}} between 1-\varepsilon and 1+\varepsilon when \theta and \theta_{old} differ significantly, and \varepsilon is the parameter controlling the clipping amplitude. The network structure of the MAPPO algorithm is shown in Fig. 2.

Fig. 2. Structure of MAPPO-Network.
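A minimal PyTorch-style sketch of the clipped surrogate objective in Eq. (35) is shown below, negated so it can be minimized by gradient descent. The clipping amplitude of 0.2 is a common default and is assumed here, not taken from the paper.

```python
import torch

def mappo_clip_loss(logp_new, logp_old, advantage, eps=0.2):
    """Clipped policy loss corresponding to Eq. (35)."""
    ratio = torch.exp(logp_new - logp_old)                  # pi_theta / pi_theta_old
    surr1 = ratio * advantage
    surr2 = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return -torch.min(surr1, surr2).mean()                  # maximize J^PPO(theta)
```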

2) Charging/discharging decision layer
The charging/discharging decision layer adopts the Multi-Agent Twin Delayed Deep Deterministic Policy Gradient (MATD3) algorithm to decide the agents' charging/discharging actions. MATD3 is also based on the Actor-Critic framework, in which two Critic networks are used to estimate the value of actions. It improves algorithm stability by delaying the updates of the Actor network during the learning process.
In the MATD3 algorithm, each agent uses two Critic networks to estimate the value of actions and selects the minimum of the two. The error is computed as

x_{loss} = \frac{1}{N}\sum_{i=1}^{N}\left(\min_{j=1,2} Q(s_i, a_i \mid \theta_j) - y\right)^2    (36)

where y represents the estimate of the target value. To prevent instability in learning and locally optimal policies caused by the dependency between the Critic networks and the target values, the MATD3 algorithm calculates the target values using target networks that are periodically updated by copying the Actor and Critic networks. The target value is calculated as

y = r + \gamma \min_{j=1,2} Q'_j\!\left(s_{i+1}, \mu'(s_{i+1} \mid \theta_{\mu'}) + \sigma \mid \theta_{q'_j}\right)    (37)

where Q'_1 and Q'_2 represent the target Critic networks, \mu' represents the target Actor network, \theta_{\mu'}, \theta_{q'_1}, and \theta_{q'_2} represent the target network parameters, and \sigma represents the noise used for smoothing. The network structure of the MATD3 algorithm is shown in Fig. 3.

Fig. 3. Structure of MATD3-Network.
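The twin-critic target of Eq. (37) and the error of Eq. (36) can be sketched in PyTorch as follows. The discount factor and the smoothing-noise settings are illustrative assumptions, and the network objects are placeholders for the agents' actor/critic modules rather than the paper's code.

```python
import torch

def matd3_target(reward, next_state, target_actor, target_critic1, target_critic2,
                 gamma=0.99, noise_std=0.2, noise_clip=0.5):
    """Target value y of Eq. (37) with clipped target-policy smoothing noise."""
    with torch.no_grad():
        next_action = target_actor(next_state)
        noise = (torch.randn_like(next_action) * noise_std).clamp(-noise_clip, noise_clip)
        q1 = target_critic1(next_state, next_action + noise)
        q2 = target_critic2(next_state, next_action + noise)
        return reward + gamma * torch.min(q1, q2)    # min over the two target critics

def critic_error(critic1, critic2, state, action, y):
    """Squared error of Eq. (36) between the smaller critic estimate and y."""
    q_min = torch.min(critic1(state, action), critic2(state, action))
    return ((q_min - y) ** 2).mean()
```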
C. Algorithm implementation
The spatial transfer decision layer adopts the MAPPO algorithm to govern the agents' spatial transfer actions, while the charging/discharging decision layer adopts the MATD3 algorithm to decide the agents' charging/discharging actions. To improve the training efficiency of the proposed MATD3-MAPPO framework, a sequential training method is proposed to train the two groups of agents. The training and execution phases of the MATD3-MAPPO framework are shown in Fig. 4. Firstly, the MATD3 model is trained: the charging/discharging decision agents interact with the environment to update the network parameters, eventually obtaining and saving the trained charging/discharging decision model. Secondly, the MAPPO model is trained. During training, each agent randomly selects an initial CS and interacts with the designed environment. The agent first determines whether to stay at the current CS. If it stays, the agent uses the trained charging/discharging decision model to make the charging/discharging decision; otherwise, the agent makes a spatial transfer decision, and the MAPPO network parameters are updated to maximize the advantage function to train the agent. Finally, the aggregator can use the trained charging/discharging decision model and spatial transfer decision model to make decisions for the agents, which effectively reduces the communication and computational burdens during execution.

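The sequential training procedure described above (and in Fig. 4) can be summarized by the following structural sketch. The two callables stand in for the paper's unpublished environment and agent code and are assumptions made only for illustration.

```python
def train_sequentially(train_charging_episode, train_transfer_episode,
                       n_charging_episodes=5000, n_transfer_episodes=5000):
    """Two-stage MATD3-MAPPO training, following Fig. 4.

    train_charging_episode(): runs one MATD3 episode (interact, store transitions,
        update critics/actors) and returns the current charging/discharging policy.
    train_transfer_episode(charge_policy): runs one MAPPO episode in which a
        stationary agent defers to the frozen charge_policy for its power decision.
    """
    charge_policy = None
    for _ in range(n_charging_episodes):        # stage 1: charging/discharging layer
        charge_policy = train_charging_episode()
    for _ in range(n_transfer_episodes):        # stage 2: spatial transfer layer
        train_transfer_episode(charge_policy)   # the MATD3 policy is reused, not updated
    return charge_policy
```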

Fig. 4. Training and execution flowchart.

IV. Case studies

A. Experimental setting
During the training phase, LMP data, LME data, and traffic information for 30 CSs in the San Diego area of California from 2020 to 2022 are taken as the simulated example in this paper. The LME data are estimated from the LMP at the corresponding times, and the geographical locations of the CSs are shown in Fig. 5. The LMP data can be downloaded from the California Independent System Operator (CAISO) website [34], and the free-flow travel time information can be obtained from the Google Maps Developer Platform API [35]. The parameters of the estimation method for the LME are provided in Appendix A. During the execution phase, this paper selects LMP data, LME data, and traffic information from typical days at the 30 CSs to verify the reliability of the trained model. This paper assumes that the aggregator has 3 EV fleets, each consisting of 10 EVs. According to the description in Section II-A, EVs in the same fleet execute the same policy; therefore, there are only 3 agents in the spatiotemporal dispatch problem of EVs. The basic parameter settings of this paper are provided in Appendix B. The MADRL algorithm is implemented in Python 3.7. The MATD3 network is built with TensorFlow 2.6.0, and the MAPPO network is built with PyTorch 1.13.0 + CUDA 11.7.0. The computational platform's hardware consists of an Intel Core [email protected] and 16 GB RAM.

Fig. 5. Locations of 30 CSs.

B. Analysis of algorithm performance
1) Comparisons with DRL algorithm
Based on the process illustrated in Fig. 4, this paper first employed MATD3 to train the charging/discharging decision agents, followed by MAPPO to train the spatial transfer agents. The training results of the charging/discharging decision layer and the spatial transfer decision layer were compared with typical continuous-action-space and typical discrete-action-space decision algorithms, respectively. The resulting reward curves are shown in Fig. 6 and Fig. 7, where the light-colored lines are the true reward after smoothing and the dark-colored lines are the average reward. The reward curves go through three stages. The first stage, before 1000 episodes, is a random decision-making stage: the replay buffer does not yet meet the minimum sampling requirements, the agents generate decisions randomly, and the reward curve fluctuates greatly. The second stage is the learning stage: once the replay buffer is full, the agents learn and update their strategies, and the reward gradually improves. The third stage is the convergence stage, in which the agents gradually find the optimal strategy while updating the network parameters, and the reward reaches a stable state, finally settling at a relatively fixed level.


In the comparison of the continuous-action-space algorithms' training processes, the reward curve of the MATD3 algorithm used in this paper partially overlaps with that of the MADDPG algorithm during the learning stage. In the end, the reward of the MATD3 algorithm converges to a higher level. This is because the MATD3 algorithm introduces an additional Critic network on top of the MADDPG algorithm and delays the updates of the Actor network, thereby ensuring the stability of the training process and enabling the agents to learn better strategies through interaction with the environment. The reward of the MASAC algorithm is improved compared to the MADDPG algorithm, but its convergence is slower than that of the MATD3 and MADDPG algorithms. This is because the MASAC algorithm is more sensitive to the dynamics of the environment during training: when the environment changes or is highly unstable, the algorithm requires a longer training time to ensure performance.



In the comparison of the discrete-action-space algorithms' training processes, the reward curve of the MAPPO algorithm used in this paper converges to a higher level and shows a faster convergence speed. In the convergence stage, the reward-curve fluctuations of the MADQN and MA-Rainbow algorithms are significantly greater than those of the MAPPO algorithm. In particular, the MADQN algorithm shows large fluctuations around 2700 and 3200 episodes. This is because the state dimension of the studied environment is high, and the Q-learning-based methods need to store and update a large number of Q values, which increases the computational pressure of the MADQN and MA-Rainbow algorithms and results in an unstable training process. In contrast, the gradient calculation of the MAPPO loss function used in this paper is relatively simple and efficient, which is conducive to the training and convergence of the algorithm and yields better training results.
Therefore, the MATD3-MAPPO framework, with its higher reward and faster convergence speed, is a reasonable choice for this problem.

Fig. 6. Training process of continuous action algorithms.

Fig. 7. Training process of discrete action algorithms.

2) Comparisons with model-based algorithm
To verify the validity of the proposed method, a comparison was made between the algorithm proposed in this paper and a model-based algorithm. For the model-based method, the optimized dispatch of electric vehicles is cast as a MILP problem. In the electricity-carbon joint market, the method proposed in this paper achieves a one-day benefit of $2,048.5, while the model-based optimization algorithm achieves a one-day benefit of $2,125.5, which is 3.7% higher. The difference between the two is not significant, which also proves the validity of the algorithm. In terms of solving speed, the model-based algorithm takes 23,523.23 s, while the algorithm proposed in this paper takes only 0.1 s. The comparison shows that, although the model-based algorithm yields slightly better results than the DRL-based algorithm, it faces two bottlenecks in practical applications. Firstly, the model-based algorithm requires a long time to obtain dispatch results, while the DRL-based algorithm can obtain them in just 0.1 s. This is because the DRL-based algorithm completes a lengthy training phase in advance, allowing it to output results directly during the execution phase, whereas the model-based algorithm needs to solve the scheduling strategy for an entire day from scratch. Secondly, model-based optimization algorithms often require information for the entire day, whereas in practical scenarios data such as electricity prices and traffic information are often difficult to obtain in advance. In contrast, DRL-based optimization algorithms only require current information and forecasts for the upcoming periods. Therefore, in this case, the DRL-based algorithm is more suitable.

C. Analysis of dispatch results
In order to cope with the complex changes in the real environment, the three EV fleets are numbered EVF1, EVF2, and EVF3, and different starting positions are selected for them in the execution phase. The starting position of EVF1 is CS16, which is convenient for all-round routing coverage. The starting position of EVF2 is CS30, which is far away from the other CSs and is suitable for processing edge tasks and supporting routing dispatch needs in the surrounding areas. The starting position of EVF3 is CS1, which is remote but has slightly more nearby CSs than CS30, so it is suitable for handling mission requirements in the southern region.
1) Dispatch results analysis for one day (only considering the electricity market)
To facilitate comparative analysis, this section conducts the dispatch of EVs within a typical day considering only the electricity market. The charging/discharging conditions of the EVs and the LMP of the corresponding CSs are shown in Fig. 8. In this scenario, the EVs' spatiotemporal decisions comprehensively consider battery degradation costs and spatial transfer costs. Upon observing profitable opportunities at other CSs, they move to discharge at CSs with higher LMP or charge at CSs with lower LMP to maximize the arbitrage benefit, and they return to the starting CS by the end of the last time step. The EVs' charging decisions are mainly concentrated in time steps 25-55 and 75-85. Specifically, the EVs always choose to charge at CS14 during the 75th to 85th time steps, because the LMP at CS14 is relatively low, allowing the arbitrage benefit from moving to CS14 to cover the spatial transfer costs. The EVs' discharging decisions are mainly concentrated in time steps 5-15 and 60-70. Referring to the corresponding LMP curve, it can be seen that the LMP is relatively high during these periods; after weighing the spatial transfer costs against the arbitrage benefit, the EVs choose to discharge. In addition, to ensure the normal operation of the next dispatch period, the EVs may make charging and discharging decisions at the end of the period so that the SOC returns to the initial level by the end of the dispatch period. In some periods, the EVs make neither spatial transfer decisions nor charging/discharging decisions. The reason is that the designed model comprehensively considers the battery degradation costs and spatial transfer costs of the EVs, leading it to forgo the arbitrage benefit when the LMP changes are minor, thus preserving the health of the battery.

The reason is that the designed model comprehensively considers the battery degradation costs and spatial transfer costs of the EVs; when the LMP changes are minor, the arbitrage benefit is deliberately neglected, thus ensuring the health of the battery.
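The trade-off just described can be summarized as a simple net-benefit test. The following sketch is our illustrative reading of it; the function name, the linear per-kilometre transfer-cost model, and the per-kWh degradation proxy are assumptions rather than the paper's exact cost model.

def relocation_gain(lmp_here, lmp_there, energy_kwh,
                    distance_km, transfer_cost_per_km, deg_cost_per_kwh):
    """Net gain ($) of moving to another CS to discharge energy_kwh there
    instead of at the current CS; positive values justify the transfer."""
    arbitrage = (lmp_there - lmp_here) * energy_kwh      # $/kWh price spread x energy
    transfer_cost = transfer_cost_per_km * distance_km   # travel cost proxy
    degradation = deg_cost_per_kwh * energy_kwh          # extra cycling cost
    return arbitrage - transfer_cost - degradation

# Example with assumed numbers: a 0.02 $/kWh spread on 60 kWh barely covers a 5 km detour.
print(relocation_gain(0.10, 0.12, 60, 5, 0.15, 0.005))   # -> 0.15 ($)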
2) Dispatch results analysis for one day (electricity-carbon joint market)
To analyze the impact of the carbon market on the spatiotemporal allocation of EVs, this section conducts the dispatch of EVs within a typical day considering the electricity-carbon joint market. The charging/discharging of the EVs and the LMP of the corresponding CSs are shown in Fig. 9, and the LME of the corresponding CSs is shown in Fig. 10. In this scenario, the carbon trading benefits of the EVs are jointly affected by the charging and discharging decisions and by the LME of the corresponding CSs. The EVs are more inclined to make charging and discharging decisions to obtain carbon emission quotas, thereby obtaining more carbon trading benefits. The EVs charged and discharged a total of 13,260 kWh, an increase of 24.7% compared with the scenario in which only the electricity market is considered. In particular, the charging behavior of the EVs is mostly concentrated in time steps 45-55. Combined with the LME curve in Fig. 10, the LME level of the relevant CSs during these time steps is low, so the carbon emissions generated by the EVs' charging behavior are small and higher carbon trading profits can be obtained. The results in the electricity-carbon joint market show that the proposed method is effective for the spatiotemporal dispatch of EVs. On the one hand, the EVs can accurately measure the relationship between benefits and costs and make more frequent charging and discharging decisions to increase the carbon trading revenue. On the other hand, the EVs can also accurately sense LME changes and make prudent decisions to avoid charging during high-LME periods, thereby reducing the carbon emissions generated during operation and ensuring economic benefits.
3) Dispatch results analysis for one day under fluctuating prices (electricity-carbon joint market)
To verify the stability of the proposed MATD3-MAPPO method when electricity prices fluctuate, this section selects a typical day with large LMP fluctuations to analyze the dispatch results under the electricity price fluctuation scenario. The charging/discharging of the EVs and the LMP of the corresponding CSs are shown in Fig. 11. When electricity prices fluctuate greatly, the EVs' locations change frequently. With sufficient arbitrage benefit under the large LMP differences to cover the spatial transfer costs, the EVs tend to move to CSs with a higher LMP to discharge or to CSs with a lower LMP to charge in order to obtain higher benefits. At the same time, the EVs tend to make charging and discharging decisions more frequently: EVF1, EVF2, and EVF3 make charging and discharging decisions 39, 36, and 39 times, respectively, with a total charging and discharging capacity of 16,560 kWh, an increase of 55.7% compared with scenario b). Therefore, the results prove that the designed method can still accurately perceive the environment when electricity prices fluctuate, guiding the EVs to make reasonable decisions.
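For clarity, our reading of the carbon-revenue mechanism described above can be sketched as follows: discharging while the LME is high displaces emission-intensive marginal generation and earns quota, while charging while the LME is high incurs emissions. The settlement rule, the variable names, and the sign convention below are simplified assumptions made for illustration, not the paper's exact market model.

def carbon_trading_revenue(lme, p_dis, p_ch, quota_price, dt=0.25):
    """Illustrative one-day carbon settlement.

    lme         : locational marginal emissions at the occupied CS per step (kgCO2/kWh)
    p_dis, p_ch : discharging / charging power per step (kW)
    quota_price : carbon allowance price ($/kgCO2), e.g. 0.014 as in Table II
    """
    net_avoided_kg = sum(e * (pd - pc) * dt for e, pd, pc in zip(lme, p_dis, p_ch))
    return quota_price * net_avoided_kg

# Toy example: discharging 60 kW for four steps at LME 0.8 and charging 60 kW
# for four steps at LME 0.3 nets 30 kgCO2 avoided, i.e. $0.42 at 0.014 $/kg.
lme = [0.8] * 4 + [0.3] * 4
pdis = [60] * 4 + [0] * 4
pch = [0] * 4 + [60] * 4
print(carbon_trading_revenue(lme, pdis, pch, 0.014))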
Fig. 8. Dispatch results of EVs in the electricity market (charging/discharging power in kW and LMP in $/MWh at CS1, CS4, CS15, CS16, and CS30 over one day at 15-min resolution; panels: a) EVF1, b) EVF2, c) EVF3).
Fig. 9. Dispatch results of EVs in the electricity-carbon joint market (charging/discharging power in kW and LMP in $/MWh at CS1, CS14, CS15, CS16, CS23, CS29, and CS30 over one day at 15-min resolution; panels: a) EVF1, b) EVF2, c) EVF3).
Fig. 10. LME of the corresponding CSs (CO2 intensity in kg/kWh at CS1, CS14, CS15, CS16, CS23, CS29, and CS30 over one day at 15-min resolution).

4) Dispatch results analysis for one day with traffic jam (electricity-carbon joint market)
To verify the stability of the MATD3-MAPPO method in guiding spatial transfer decisions, this section analyzes the EV dispatch results under traffic congestion. In this scenario, the spatial transfer cost increases to three times its normal value. Fig. 12 shows the charging and discharging power of the EVs and the CSs where they are located. The spatial transfer of EVF1 is CS16→CS30→CS14→CS16, the spatial transfer of EVF2 is CS30→CS14→CS30, and the spatial transfer of EVF3 is CS1→CS15→CS14→CS1. Combined with the geographic location information in Fig. 5, and compared with the scenario without congestion, it can be seen that when the spatial transfer cost increases, the EVs must weigh the relationship between spatial transfer cost and arbitrage benefit more carefully. In this scenario, the EVs tend to choose shorter routes and to charge and discharge at CSs that are closer to each other, so as to minimize the impact of transfer costs. Therefore, the results show that the designed method can still guide the EVs to make reasonable decisions when the traffic network model changes.

D. Analysis of dispatch results in one year
To analyze the scalability and practicability of the proposed method for V2G in the electricity-carbon joint market, the dispatch period is set to one year in this section. The EVs charged a total of 2,401,355.2 kWh and discharged 2,067,360.9 kWh. The charge and discharge energy of the EVs at each CS over one year are shown in Fig. 13 and Fig. 14, respectively. The results show that the EVs tend to charge at CS14 because its LMP is always low, allowing the EVs to charge at a very low price and thereby reducing the cost of charging. The EVs tend to discharge at CS15 and CS30 because their LMPs are higher, which enables the EVs to obtain higher arbitrage benefits. In addition, owing to the different initial CSs selected, EVF1, EVF2, and EVF3 have more power interactions at their respective initial CSs.
In the electricity-carbon joint market, the aggregator of the EVs earns a considerable benefit after one year of operation. Specifically, the benefit from the carbon market is $607,115.2, the benefit from the electricity market is $305,153.1, and the final total benefit of the EVs is $718,592.1, with an average daily revenue of $1,968.7. Referring to the average price ($55,353) of electric vehicles in the United States published by Kelley Blue Book, under the dispatch method proposed in this paper the purchase cost can be recovered in about 2.31 years. If policy subsidies for the purchase of EVs are taken into account, the payback period may be even shorter. In addition, the price of EVs may decline further in the future with technological advancement and the expansion of production scale, which will further shorten the cost recovery period. Therefore, the method proposed in this paper can effectively guide the EVs to operate economically in the electricity-carbon joint market and has promising application prospects.
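As a quick consistency check on the payback figure quoted above, and assuming purely for illustration a fleet of 30 vehicles (a size implied by the reported totals but not restated in this section):

\[
\frac{30 \times \$55{,}353}{\$718{,}592.1\,/\,\text{year}} \approx \frac{\$1{,}660{,}590}{\$718{,}592.1\,/\,\text{year}} \approx 2.31\ \text{years}.
\]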
Fig. 11. Dispatch results of EVs under fluctuating prices (charging/discharging power in kW and LMP in $/MWh at CS1, CS16, CS18, CS19, CS28, and CS30 over one day at 15-min resolution; panels: a) EVF1, b) EVF2, c) EVF3).
E. Sensitivity analysis
To study the impact of carbon trading benefits on the spatiotemporal dispatch of EVs, this section conducts a sensitivity analysis based on different carbon quota price levels, considering the changes in EV charging/discharging behavior as well as carbon emissions.
Referring to the carbon allowance prices from the CEA, the annual carbon allowance price ranges from 0.01 to 0.014 $/kg, with an average value of 0.012 $/kg. Therefore, 0.010, 0.012, and 0.014 $/kg are selected for analysis. The EV charging/discharging volumes and daily revenues under these three scenarios are shown in Table II. As observed from Table II, the EVs' charging/discharging power and daily benefits are positively correlated with the carbon allowance price. This is because, as the carbon allowance price decreases, the benefit from EVs participating in carbon market trading also decreases, leading to a reduced willingness to charge and discharge. Although less charging and discharging behavior reduces the cost of battery capacity degradation, both the arbitrage revenue from charging/discharging and the carbon trading revenue decrease, resulting in a decline in daily revenue.
Therefore, it can be seen that the modeling of carbon market trading in this paper is relatively reasonable.

TABLE II
SENSITIVITY ANALYSIS OF CARBON ALLOWANCE PRICES
Carbon allowance price ($/kgCO2) | Charging/discharging power (kWh) | Benefits ($)
0.014 | 13,260 | 2,048.5
0.012 | 13,010 | 1,705.3
0.010 | 12,440 | 1,294.6
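The same table can be read in relative terms with a few lines of plain Python; the numbers are copied directly from Table II and the script itself is ours.

rows = {0.014: (13260, 2048.5), 0.012: (13010, 1705.3), 0.010: (12440, 1294.6)}
base_kwh, base_benefit = rows[0.014]
for price, (kwh, benefit) in rows.items():
    print(f"price {price:.3f} $/kg: energy {100 * (kwh / base_kwh - 1):+.1f}%, "
          f"benefit {100 * (benefit / base_benefit - 1):+.1f}% vs. 0.014 $/kg")
# A 28.6% drop in the allowance price (0.014 -> 0.010) cuts the cycled energy by
# about 6.2% but the daily benefit by about 36.8%, i.e. the revenue is far more
# sensitive to the carbon price than the dispatched energy is.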
Fig. 12. Dispatch results of EVs with jam (charging/discharging power in kW and the CS occupied by EVF1, EVF2, and EVF3 over one day at 15-min resolution, with and without congestion).

Fig. 13. Charge energy of EVs at each CS in one year (energy in kWh, axis scale 10^5, for EVF1, EVF2, and EVF3 at CS1-CS30).

Fig. 14. Discharge energy of EVs at each CS in one year (energy in kWh, axis scale 10^5, for EVF1, EVF2, and EVF3 at CS1-CS30).

V. CONCLUSION
To realize the low-carbon and economic scheduling of EVs in the electricity-carbon joint market, a spatiotemporal dispatch method for EV fleets in the PTN based on MATD3-MAPPO is proposed in this paper, which provides a practical strategy for EVs in the PTN. The main conclusions are as follows:
1) The algorithm performance results show that the MATD3 and MAPPO algorithms used in this paper exhibit faster convergence and higher rewards. Their high adaptability to the modeling environment provides an effective solution to the spatiotemporal dispatch problem of EVs under the electricity-carbon joint market.
2) The dispatch results within a typical day show that the EVs accurately weigh benefit against cost and make reasonable decisions under the scenarios of market structure changes, electricity price fluctuations, and traffic congestion. Compared with the scenario that considers only the electricity market, the total amount of charge and discharge of the EVs increases by 24.7%. When the electricity price fluctuates greatly, the EVs accurately perceive the price information and make reasonable decisions, and the total amount of charge and discharge increases by 20.2% in the electricity-carbon joint market. When traffic congestion occurs, the EVs choose stations adjacent to their initial CS to save spatial transfer cost.
3) The long-term dispatch results show that the dispatch method for EVs under the electricity-carbon joint market proposed in this paper achieves both scalability and feasibility. The proposed dispatch method can guide the EVs to obtain an average daily income of approximately $1,968.7, which enables the EV dispatch to recover its cost in 2.31 years or even less.
In future work, we plan to adopt more precise modeling of the carbon emissions of EVs. In this paper, we use the LMP to estimate the LME in order to quantify the carbon emissions of EV charging behavior, which is only a preliminary idea. Additionally, to enhance the economic benefits of EVs, we will analyze a larger number of EVs to explore large-scale EV dispatch strategies.

Acknowledgment
This work is supported by the National Natural Science Foundation of China (12171145).
Appendix
A. Parameter of estimation method for LME based on LMP

Table A1 Parameter of estimation method for LME
Parameters | Coal | Natural gas | Oil
μ | 21.86 | 70.69 | 112.39
σ | 6.56 | 21.21 | 33.72
Carbon emission factor (kgCO2/kWh) | 0.517 | 0.361 | 0.867
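One way such parameters can be used to map an observed LMP to an LME is sketched below. This is our illustrative reconstruction, treating μ and σ as the mean and standard deviation of each fuel's marginal cost in $/MWh and weighting the emission factors by the Gaussian likelihood that the corresponding fuel is price-setting; it is not necessarily the exact estimator defined in the body of the paper.

import math

# Table A1: (mu, sigma) of marginal cost in $/MWh and emission factor in kgCO2/kWh.
FUELS = {
    "coal": (21.86, 6.56, 0.517),
    "natural_gas": (70.69, 21.21, 0.361),
    "oil": (112.39, 33.72, 0.867),
}

def estimate_lme(lmp):
    """Estimate locational marginal emissions (kgCO2/kWh) from an LMP ($/MWh)."""
    weights = {}
    for fuel, (mu, sigma, _) in FUELS.items():
        # Gaussian likelihood that this fuel type is the marginal (price-setting) unit.
        weights[fuel] = math.exp(-0.5 * ((lmp - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
    total = sum(weights.values())
    return sum(w / total * FUELS[f][2] for f, w in weights.items())

print(round(estimate_lme(65.0), 3))   # about 0.46 kgCO2/kWh for 65 $/MWh under this toy weighting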
B. Basic Parameter Setting

Table B1 Basic Parameter Setting
Parameters | Value | Parameters | Value
P (kW) | 60 | E (kWh) | 150
α | 0.575 | β | 121
SOC (upper limit) | 0.9 | SOC (lower limit) | 0.1
α | 0.15 | β | 4
Carbon allowance price ($/kgCO2) | 0.014 | L (km/kWh) | 6.3
E (kgCO2/km) | 1.6 | SOC | 0.8
Transactions on Power Systems, vol. 37, no. 4, pp. 2961-2975, 2022, doi:
10.1109/TPWRS.2021.3123351.
[24]G. Liu, Y. Tao, Z. Ge, J. Qiu, F. Wen and S. Lai, "Data-Driven Carbon
Footprint Management of Electric Vehicles and Emission Abatement in
Electricity Networks," IEEE Transactions on Sustainable Energy, vol. 15,
no. 1, pp. 95-108, 2024, doi: 10.1109/TSTE.2023.3274813.
[25]T. Wu, Z. Li, G. Wang, X. Zhang and J. Qiu, "Low-Carbon Charging
Facilities Planning for Electric Vehicles Based on a Novel Travel Route
Choice Model," IEEE Transactions on Intelligent Transportation Systems,
vol. 24, no. 6, pp. 5908-5922, 2023, doi: 10.1109/TITS.2023.3248087.
[26]H. Jahangir, S. S. Gougheri, B. Vatandoust, M. A. Golkar, A. Ahmadian
and A. Hajizadeh, "Plug-in Electric Vehicle Behavior Modeling in Energy
Market: A Novel Deep Learning-Based Approach With Clustering
Technique," in IEEE Transactions on Smart Grid, vol. 11, no. 6, pp. 4738-
4748, 2020, doi: 10.1109/TSG.2020.2998072.
[27]M. Zhang, H. Yang, Y. Xu and H. Sun, "Learning-Based Real-Time
Aggregate Flexibility Provision and Scheduling of Electric Vehicles,"
IEEE Transactions on Smart Grid, vol. 15, no. 6, pp. 5840-5852, 2024, doi:
10.1109/TSG.2024.3400968.
[28]F. Tuchnitz, N. Ebell, J. Schlund and M. Pruckner, " Development and
Evaluation of a Smart Charging Strategy for an Electric Vehicle Fleet
Based on Reinforcement Learning," Applied Energy, vol. 285, pp. 116382,
2023, doi: 10.1016/j.apenergy.2020.116382.
[29]Y. Liang, Z. Ding, T. Ding and W. -J. Lee, "Mobility-Aware Charging
Scheduling for Shared On-Demand Electric Vehicle Fleet Using Deep

horized licensed use limited to: Ballari Institute of Technology & Management (formerly Bellary Eng College). Downloaded on March 17,2025 at 09:04:18 UTC from IEEE Xplore. Restrictions app
© 2025 IEEE. All rights reserved, including rights for text and data mining and training of artificial intelligence and similar technologies. Personal use is permitted,
but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

You might also like