{"doi":"10.1109/LICS.2011.10","project":[{"call_identifier":"FWF","name":"Modern Graph Algorithmic Techniques in Formal Verification","_id":"2584A770-B435-11E9-9278-68D0E5697425","grant_number":"P 23499-N23"},{"_id":"25832EC2-B435-11E9-9278-68D0E5697425","grant_number":"S 11407_N23","name":"Rigorous Systems Engineering","call_identifier":"FWF"},{"_id":"2581B60A-B435-11E9-9278-68D0E5697425","grant_number":"279307","name":"Quantitative Graph Games: Theory and Applications","call_identifier":"FP7"},{"_id":"2587B514-B435-11E9-9278-68D0E5697425","name":"Microsoft Research Faculty Fellowship"}],"quality_controlled":"1","_id":"3346","year":"2011","day":"21","title":"Two views on multiple mean payoff objectives in Markov Decision Processes","user_id":"4435EBFC-F248-11E8-B48F-1D18A9856A87","date_updated":"2021-01-12T07:42:49Z","language":[{"iso":"eng"}],"date_created":"2018-12-11T12:02:48Z","author":[{"last_name":"Brázdil","first_name":"Tomáš","full_name":"Brázdil, Tomáš"},{"full_name":"Brožek, Václav","first_name":"Václav","last_name":"Brožek"},{"full_name":"Chatterjee, Krishnendu","first_name":"Krishnendu","last_name":"Chatterjee","id":"2E5DCA20-F248-11E8-B48F-1D18A9856A87","orcid":"0000-0002-4561-241X"},{"full_name":"Forejt, Vojtěch","first_name":"Vojtěch","last_name":"Forejt"},{"first_name":"Antonín","last_name":"Kučera","full_name":"Kučera, Antonín"}],"date_published":"2011-06-21T00:00:00Z","oa_version":"Submitted Version","conference":{"name":"LICS: Logic in Computer Science","location":"Toronto, Canada","start_date":"2011-06-21","end_date":"2011-06-24"},"ec_funded":1,"abstract":[{"text":"We study Markov decision processes (MDPs) with multiple limit-average (or mean-payoff) functions. We consider two different objectives, namely, expectation and satisfaction objectives. Given an MDP with k reward functions, in the expectation objective the goal is to maximize the expected limit-average value, and in the satisfaction objective the goal is to maximize the probability of runs such that the limit-average value stays above a given vector. We show that under the expectation objective, in contrast to the single-objective case, both randomization and memory are necessary for strategies, and that finite-memory randomized strategies are sufficient. Under the satisfaction objective, in contrast to the single-objective case, infinite memory is necessary for strategies, and that randomized memoryless strategies are sufficient for epsilon-approximation, for all epsilon>;0. We further prove that the decision problems for both expectation and satisfaction objectives can be solved in polynomial time and the trade-off curve (Pareto curve) can be epsilon-approximated in time polynomial in the size of the MDP and 1/epsilon, and exponential in the number of reward functions, for all epsilon>;0. Our results also reveal flaws in previous work for MDPs with multiple mean-payoff functions under the expectation objective, correct the flaws and obtain improved results.","lang":"eng"}],"status":"public","oa":1,"department":[{"_id":"KrCh"}],"citation":{"mla":"Brázdil, Tomáš, et al. Two Views on Multiple Mean Payoff Objectives in Markov Decision Processes. 5970225, IEEE, 2011, doi:10.1109/LICS.2011.10.","ista":"Brázdil T, Brožek V, Chatterjee K, Forejt V, Kučera A. 2011. Two views on multiple mean payoff objectives in Markov Decision Processes. LICS: Logic in Computer Science, 5970225.","short":"T. Brázdil, V. Brožek, K. Chatterjee, V. Forejt, A. Kučera, in:, IEEE, 2011.","ieee":"T. Brázdil, V. Brožek, K. Chatterjee, V. Forejt, and A. Kučera, “Two views on multiple mean payoff objectives in Markov Decision Processes,” presented at the LICS: Logic in Computer Science, Toronto, Canada, 2011.","ama":"Brázdil T, Brožek V, Chatterjee K, Forejt V, Kučera A. Two views on multiple mean payoff objectives in Markov Decision Processes. In: IEEE; 2011. doi:10.1109/LICS.2011.10","chicago":"Brázdil, Tomáš, Václav Brožek, Krishnendu Chatterjee, Vojtěch Forejt, and Antonín Kučera. “Two Views on Multiple Mean Payoff Objectives in Markov Decision Processes.” IEEE, 2011. https://doi.org/10.1109/LICS.2011.10.","apa":"Brázdil, T., Brožek, V., Chatterjee, K., Forejt, V., & Kučera, A. (2011). Two views on multiple mean payoff objectives in Markov Decision Processes. Presented at the LICS: Logic in Computer Science, Toronto, Canada: IEEE. https://doi.org/10.1109/LICS.2011.10"},"publisher":"IEEE","month":"06","article_number":"5970225","scopus_import":1,"main_file_link":[{"url":"http://arxiv.org/abs/1104.3489","open_access":"1"}],"publication_status":"published","publist_id":"3275","type":"conference"}