This repository contains the code for Softmax Policy Mirror Ascent (SPMA), a policy optimization algorithm based on mirror ascent in the space of logits with the log-sum-exp mirror map. The repository includes scripts that can be integrated into stable-baselines3 to reproduce the experiments from the AISTATS 2025 paper *Fast Convergence of Softmax Policy Mirror Ascent*.
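To illustrate the idea, here is a minimal sketch of one mirror-ascent step in logit space with the log-sum-exp mirror map, shown for a single state (a bandit). It is an illustrative derivation, not the repository's implementation: with Φ(z) = log-sum-exp(z) we have ∇Φ(z) = softmax(z), so the mirror step multiplies the current policy by (1 + η·A) and renormalizes. The function name `spma_step` and the clipping safeguard are assumptions for this sketch.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def spma_step(logits, advantages, eta):
    """One illustrative mirror-ascent step in logit space.

    With the log-sum-exp mirror map, grad Phi(z) = softmax(z), so the
    dual update maps pi_t to pi_{t+1} proportional to pi_t * (1 + eta * A).
    Assumes eta is small enough that 1 + eta * A stays positive; the clip
    below is a safeguard added for this sketch.
    """
    pi = softmax(logits)
    new_pi = pi * (1.0 + eta * advantages)
    new_pi = np.clip(new_pi, 1e-12, None)  # guard against non-positive mass
    new_pi /= new_pi.sum()                 # renormalize to a distribution
    return np.log(new_pi)                  # back to logits

# Example: a 3-armed bandit with a uniform initial policy.
logits = np.zeros(3)
advantages = np.array([1.0, 0.0, -1.0])
new_logits = spma_step(logits, advantages, eta=0.5)
```

After the step, probability mass shifts toward the action with the highest advantage while the policy remains a valid distribution.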
Installation is identical to that of stable-baselines3 (PyTorch version), so no additional steps are required.