
Good-for-MDPs Automata for Probabilistic Analysis and Reinforcement Learning

  • Conference paper
  • Open Access
  • First Online: 17 April 2020
  • pp 306–323

Tools and Algorithms for the Construction and Analysis of Systems (TACAS 2020)
  • Ernst Moritz Hahn  ORCID: orcid.org/0000-0002-9348-7684
  • Mateo Perez  ORCID: orcid.org/0000-0003-4220-3212
  • Sven Schewe  ORCID: orcid.org/0000-0002-9093-9518
  • Fabio Somenzi  ORCID: orcid.org/0000-0002-2085-2003
  • Ashutosh Trivedi  ORCID: orcid.org/0000-0001-9346-0126
  • Dominik Wojtczak  ORCID: orcid.org/0000-0001-5560-0546

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12078)

Included in the following conference series:

  • International Conference on Tools and Algorithms for the Construction and Analysis of Systems

Abstract

We characterize the class of nondeterministic \(\omega \)-automata that can be used for the analysis of finite Markov decision processes (MDPs). We call these automata ‘good-for-MDPs’ (GFM). We show that GFM automata are closed under classic simulation as well as under more powerful simulation relations that leverage properties of optimal control strategies for MDPs. This closure enables us to exploit state-space reduction techniques, such as those based on direct and delayed simulation, that guarantee simulation equivalence. We demonstrate the promise of GFM automata by defining a new class of automata with favorable properties—they are Büchi automata with low branching degree obtained through a simple construction—and show that going beyond limit-deterministic automata may significantly benefit reinforcement learning.
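
As a hedged sketch of what "good-for-MDPs" can be taken to mean: the symbols below (\(\mathrm{PSem}\), \(\mathrm{PSyn}\), the product \(\mathcal{M} \times \mathcal{A}\), the labelling \(L\), and the initial states \(s_0\), \(q_0\)) are notation assumed here for illustration and do not appear in the abstract. For a finite MDP \(\mathcal{M}\) and an automaton \(\mathcal{A}\), one compares the best achievable probability of generating a word accepted by \(\mathcal{A}\) with the best achievable probability of an accepting run in the product, where the strategy \(\sigma\) must also resolve the nondeterminism of \(\mathcal{A}\) on the fly:

\[
\mathrm{PSem}^{\mathcal{M}}_{\mathcal{A}}(s_0) = \sup_{\sigma} \Pr\nolimits^{\sigma}_{\mathcal{M},\, s_0}\bigl\{ r : L(r) \in \mathcal{L}(\mathcal{A}) \bigr\},
\qquad
\mathrm{PSyn}^{\mathcal{M}}_{\mathcal{A}}(s_0) = \sup_{\sigma} \Pr\nolimits^{\sigma}_{\mathcal{M} \times \mathcal{A},\, (s_0, q_0)}\bigl\{ r : r \text{ is accepting} \bigr\}
\]

Under this reading, \(\mathrm{PSyn}^{\mathcal{M}}_{\mathcal{A}}(s_0) \le \mathrm{PSem}^{\mathcal{M}}_{\mathcal{A}}(s_0)\) always holds, because an accepting run in the product witnesses that the generated word is in the language of \(\mathcal{A}\). The automaton is GFM when the two values coincide for every finite MDP, so that optimizing over the product loses nothing.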

This work has been supported by the National Natural Science Foundation of China (Grant Nr. 61532019), EPSRC grants EP/M027287/1 and EP/P020909/1, and a CU Boulder RIO grant.



Author information

Authors and Affiliations

  1. School of EEECS, Queen’s University Belfast, Belfast, UK

    Ernst Moritz Hahn

  2. State Key Laboratory of Computer Science, Institute of Software, CAS, Beijing, People’s Republic of China

    Ernst Moritz Hahn

  3. University of Colorado Boulder, Boulder, USA

    Mateo Perez, Fabio Somenzi & Ashutosh Trivedi

  4. University of Liverpool, Liverpool, UK

    Sven Schewe & Dominik Wojtczak

Corresponding author

Correspondence to Ernst Moritz Hahn.

Editor information

Editors and Affiliations

  1. Johannes Kepler University, Linz, Austria

    Armin Biere

  2. University of Birmingham, Birmingham, UK

    David Parker

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Copyright information

© 2020 The Author(s)

About this paper

Cite this paper

Hahn, E.M., Perez, M., Schewe, S., Somenzi, F., Trivedi, A., Wojtczak, D. (2020). Good-for-MDPs Automata for Probabilistic Analysis and Reinforcement Learning. In: Biere, A., Parker, D. (eds) Tools and Algorithms for the Construction and Analysis of Systems. TACAS 2020. Lecture Notes in Computer Science, vol 12078. Springer, Cham. https://doi.org/10.1007/978-3-030-45190-5_17

  • DOI: https://doi.org/10.1007/978-3-030-45190-5_17

  • Published: 17 April 2020

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-45189-9

  • Online ISBN: 978-3-030-45190-5

  • eBook Packages: Computer Science; Computer Science (R0); Springer Nature Proceedings Computer Science
