10.1016/j.jcss.2016.09.009
Brázdil, Tomáš
Tomáš
Brázdil
Chatterjee, Krishnendu
Krishnendu
Chatterjee
0000-0002-4561-241X
Forejt, Vojtěch
Vojtěch
Forejt
Kučera, Antonín
Antonín
Kučera
Trading performance for stability in Markov decision processes
Elsevier
2017
2018-12-11T11:51:12Z
2020-01-21T13:19:33Z
journal_article
https://research-explorer.app.ist.ac.at/record/1294
https://research-explorer.app.ist.ac.at/record/1294.json
708657 bytes
application/pdf
We study controller synthesis problems for finite-state Markov decision processes, where the objective is to optimize the expected mean-payoff performance and stability (also known as variability in the literature). We argue that the basic notion of expressing the stability using the statistical variance of the mean payoff is sometimes insufficient, and propose an alternative definition. We show that a strategy ensuring both the expected mean payoff and the variance below given bounds requires randomization and memory, under both the above definitions. We then show that the problem of finding such a strategy can be expressed as a set of constraints.