Robust off-policy Reinforcement Learning via Soft Constrained Adversary

Nakanishi, Kosuke; Kubo, Akihiro; Yasui, Yuji; Ishii, Shin

arXiv:2409.00418 (cs)

[Submitted on 31 Aug 2024]

Abstract: Recently, robust reinforcement learning (RL) methods against perturbations of input observations have attracted significant attention and evolved rapidly, motivated by RL's potential vulnerability to such perturbations. Although these methods have achieved reasonable success, they have two limitations when the adversary is considered over long-term horizons. First, the mutual dependency between the policy and its corresponding optimal adversary limits the development of off-policy RL algorithms: because the optimal adversary depends on the current policy, these methods have been difficult to apply in off-policy settings. Second, these methods generally assume perturbations based only on the $L_p$-norm, even when prior knowledge of the perturbation distribution in the environment is available. We introduce another perspective on adversarial RL: an f-divergence constrained problem that incorporates this prior distribution. From this, we derive two typical attacks and their corresponding robust learning frameworks. We evaluate robustness, and the results demonstrate that the proposed methods achieve excellent performance in sample-efficient off-policy RL.
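
As an illustration of the f-divergence viewpoint (not the authors' formulation, which the abstract does not spell out), consider the KL divergence as one member of the f-divergence family and a penalized ("soft") form of the constraint. For a finite set of candidate perturbed observations with prior probabilities p_i and policy values V_i, the adversary distribution q that minimizes sum_i q_i V_i + alpha * KL(q || p) has the well-known closed form q_i proportional to p_i * exp(-V_i / alpha). The Python sketch below computes it. The function name `soft_adversary`, the temperature `alpha`, and the discrete candidate set are all assumptions made for illustration.

```python
import math


def soft_adversary(prior, values, alpha):
    """Hypothetical helper: KL-penalized worst-case distribution.

    Returns q minimizing sum(q_i * values_i) + alpha * KL(q || prior),
    whose closed form is q_i proportional to prior_i * exp(-values_i / alpha).
    This is an illustrative sketch, not the method from the paper.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if len(prior) != len(values):
        raise ValueError("prior and values must have the same length")
    if not any(p > 0 for p in prior):
        raise ValueError("prior must put positive mass on some candidate")

    # Work in log space; candidates with zero prior mass stay at zero.
    logits = [
        math.log(p) - v / alpha if p > 0 else -math.inf
        for p, v in zip(prior, values)
    ]
    # Subtract the maximum logit before exponentiating for numerical stability.
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]
    z = sum(weights)
    return [w / z for w in weights]


# Prior over three candidate perturbed observations and the policy's value at each.
prior = [0.25, 0.5, 0.25]
values = [1.0, 0.0, 1.0]
q_soft = soft_adversary(prior, values, alpha=100.0)  # stays close to the prior
q_hard = soft_adversary(prior, values, alpha=0.01)   # concentrates on the lowest value
```

With a large `alpha` the adversary stays close to the prior; as `alpha` shrinks, it concentrates on the lowest-value candidate, approaching a hard worst-case attack within the prior's support.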

Comments:	33 pages, 12 figures, 2 tables
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2409.00418 [cs.LG]
	(or arXiv:2409.00418v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2409.00418

Submission history

From: Kosuke Nakanishi
[v1] Sat, 31 Aug 2024 11:13:33 UTC (513 KB)
