Unifying two views on multiple mean-payoff objectives in Markov decision processes
IST Austria Technical Report
Chatterjee, Krishnendu
Komarkova, Zuzana
Kretinsky, Jan
ddc:004
We consider Markov decision processes (MDPs) with multiple limit-average (or mean-payoff) objectives.
Two different views have been studied: (i) the expectation semantics, where the goal is to optimize the expected mean-payoff objective, and (ii) the satisfaction semantics, where the goal is to maximize the probability of runs such that the mean-payoff value stays above a given vector.
We consider the problem where the goal is to optimize the expectation under the constraint that the satisfaction semantics is ensured, and thus consider a generalization that unifies the existing semantics. Our problem captures the notion of optimization with respect to strategies that are risk-averse (i.e., that ensure a certain probabilistic guarantee).
Our main results are algorithms for the decision problem whose running time is always polynomial in the size of the MDP.
We also show that an approximation of the Pareto curve can be computed in time polynomial in the size of the MDP and in the approximation factor, but exponential in the number of dimensions. Finally, we present a complete characterization of the strategy complexity (in terms of memory bounds and randomization) required to solve our problem.
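The interplay between the two semantics can be illustrated with a minimal sketch on a toy example. The MDP below is hypothetical and not taken from the report: from the initial state, an assumed action "safe" enters a loop with reward 1 per step, while an assumed action "risky" enters a reward-4 loop or a reward-0 loop with probability 1/2 each. The helper names `evaluate` and `best_mix` are likewise illustrative, and the grid search stands in for a proper optimization; it is not the algorithm of the report.

```python
from fractions import Fraction

# Hypothetical toy MDP (an assumption for illustration, not from the report).
# Each action maps to a list of (probability, resulting mean payoff) pairs.
OUTCOMES = {
    "safe": [(Fraction(1), Fraction(1))],
    "risky": [(Fraction(1, 2), Fraction(4)), (Fraction(1, 2), Fraction(0))],
}


def evaluate(p_risky, sat_threshold):
    """Return (expected mean payoff, P[mean payoff >= sat_threshold])
    for the strategy that plays "risky" with probability p_risky."""
    weights = {"safe": 1 - p_risky, "risky": p_risky}
    expectation = Fraction(0)
    satisfaction = Fraction(0)
    for action, weight in weights.items():
        for prob, payoff in OUTCOMES[action]:
            expectation += weight * prob * payoff
            if payoff >= sat_threshold:
                satisfaction += weight * prob
    return expectation, satisfaction


def best_mix(sat_threshold, min_prob, steps=100):
    """Grid search over mixing probabilities: maximize the expectation
    subject to the satisfaction probability being at least min_prob.
    Returns (p_risky, expectation, satisfaction), or None if infeasible."""
    best = None
    for k in range(steps + 1):
        p = Fraction(k, steps)
        expectation, satisfaction = evaluate(p, sat_threshold)
        if satisfaction >= min_prob and (best is None or expectation > best[1]):
            best = (p, expectation, satisfaction)
    return best


if __name__ == "__main__":
    # Require mean payoff >= 1 with probability at least 4/5.
    print(best_mix(Fraction(1), Fraction(4, 5)))
```

In this toy instance, always playing "safe" satisfies the constraint but yields expectation 1, and always playing "risky" yields expectation 2 but meets the threshold only with probability 1/2. Mixing the two actions gives a higher expectation than "safe" alone while still respecting the probabilistic guarantee, which is the kind of risk-averse trade-off described above.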
IST Austria
2015
info:eu-repo/semantics/technical_report
doc-type:other
text
https://research-explorer.app.ist.ac.at/record/5435
https://research-explorer.app.ist.ac.at/download/5435/5525
Chatterjee K, Komarkova Z, Kretinsky J. Unifying Two Views on Multiple Mean-Payoff Objectives in Markov Decision Processes. IST Austria; 2015. doi:10.15479/AT:IST-2015-318-v2-1 (https://doi.org/10.15479/AT:IST-2015-318-v2-1)
eng
info:eu-repo/semantics/altIdentifier/doi/10.15479/AT:IST-2015-318-v2-1
info:eu-repo/semantics/altIdentifier/issn/2664-1690
info:eu-repo/semantics/closedAccess