{"ec_funded":1,"status":"public","title":"Optimizing expectation with guarantees in POMDPs","type":"conference","month":"01","date_published":"2017-01-01T00:00:00Z","date_created":"2018-12-11T11:49:40Z","oa_version":"Submitted Version","article_processing_charge":"No","user_id":"c635000d-4b10-11ee-a964-aac5a93f6ac1","scopus_import":"1","quality_controlled":"1","project":[{"_id":"25863FF4-B435-11E9-9278-68D0E5697425","grant_number":"S11407","call_identifier":"FWF","name":"Game Theory"},{"grant_number":"279307","_id":"2581B60A-B435-11E9-9278-68D0E5697425","name":"Quantitative Graph Games: Theory and Applications","call_identifier":"FP7"},{"call_identifier":"FP7","name":"International IST Postdoc Fellowship Programme","_id":"25681D80-B435-11E9-9278-68D0E5697425","grant_number":"291734"},{"name":"Efficient Algorithms for Computer Aided Verification","_id":"25892FC0-B435-11E9-9278-68D0E5697425","grant_number":"ICT15-003"}],"abstract":[{"text":"A standard objective in partially-observable Markov decision processes (POMDPs) is to find a policy that maximizes the expected discounted-sum payoff. However, such policies may still permit unlikely but highly undesirable outcomes, which is problematic especially in safety-critical applications. Recently, there has been a surge of interest in POMDPs where the goal is to maximize the probability to ensure that the payoff is at least a given threshold, but these approaches do not consider any optimization beyond satisfying this threshold constraint. In this work we go beyond both the “expectation” and “threshold” approaches and consider a “guaranteed payoff optimization (GPO)” problem for POMDPs, where we are given a threshold t and the objective is to find a policy σ such that a) each possible outcome of σ yields a discounted-sum payoff of at least t, and b) the expected discounted-sum payoff of σ is optimal (or near-optimal) among all policies satisfying a). 
We present a practical approach to tackle the GPO problem and evaluate it on standard POMDP benchmarks.","lang":"eng"}],"isi":1,"citation":{"chicago":"Chatterjee, Krishnendu, Petr Novotný, Guillermo Pérez, Jean Raskin, and Djordje Zikelic. “Optimizing Expectation with Guarantees in POMDPs.” In <i>Proceedings of the 31st AAAI Conference on Artificial Intelligence</i>, 5:3725–32. AAAI Press, 2017.","ieee":"K. Chatterjee, P. Novotný, G. Pérez, J. Raskin, and D. Zikelic, “Optimizing expectation with guarantees in POMDPs,” in <i>Proceedings of the 31st AAAI Conference on Artificial Intelligence</i>, San Francisco, CA, United States, 2017, vol. 5, pp. 3725–3732.","mla":"Chatterjee, Krishnendu, et al. “Optimizing Expectation with Guarantees in POMDPs.” <i>Proceedings of the 31st AAAI Conference on Artificial Intelligence</i>, vol. 5, AAAI Press, 2017, pp. 3725–32.","apa":"Chatterjee, K., Novotný, P., Pérez, G., Raskin, J., &#38; Zikelic, D. (2017). Optimizing expectation with guarantees in POMDPs. In <i>Proceedings of the 31st AAAI Conference on Artificial Intelligence</i> (Vol. 5, pp. 3725–3732). San Francisco, CA, United States: AAAI Press.","ista":"Chatterjee K, Novotný P, Pérez G, Raskin J, Zikelic D. 2017. Optimizing expectation with guarantees in POMDPs. Proceedings of the 31st AAAI Conference on Artificial Intelligence. AAAI: Conference on Artificial Intelligence vol. 5, 3725–3732.","short":"K. Chatterjee, P. Novotný, G. Pérez, J. Raskin, D. Zikelic, in:, Proceedings of the 31st AAAI Conference on Artificial Intelligence, AAAI Press, 2017, pp. 3725–3732.","ama":"Chatterjee K, Novotný P, Pérez G, Raskin J, Zikelic D. Optimizing expectation with guarantees in POMDPs. In: <i>Proceedings of the 31st AAAI Conference on Artificial Intelligence</i>. Vol 5. 
AAAI Press; 2017:3725-3732."},"intvolume":"         5","author":[{"id":"2E5DCA20-F248-11E8-B48F-1D18A9856A87","orcid":"0000-0002-4561-241X","first_name":"Krishnendu","last_name":"Chatterjee","full_name":"Chatterjee, Krishnendu"},{"id":"3CC3B868-F248-11E8-B48F-1D18A9856A87","first_name":"Petr","last_name":"Novotny","full_name":"Novotny, Petr"},{"first_name":"Guillermo","last_name":"Pérez","full_name":"Pérez, Guillermo"},{"full_name":"Raskin, Jean","last_name":"Raskin","first_name":"Jean"},{"full_name":"Zikelic, Djordje","last_name":"Zikelic","first_name":"Djordje"}],"department":[{"_id":"KrCh"}],"publication_status":"published","external_id":{"isi":["000485630703107"]},"day":"01","volume":5,"year":"2017","publisher":"AAAI Press","language":[{"iso":"eng"}],"oa":1,"date_updated":"2023-09-22T09:46:41Z","conference":{"location":"San Francisco, CA, United States","start_date":"2017-02-04","end_date":"2017-02-10","name":"AAAI: Conference on Artificial Intelligence"},"main_file_link":[{"open_access":"1","url":"http://www.aaai.org/ocs/index.php/AAAI/AAAI17/paper/download/14354/14092"}],"publication":"Proceedings of the 31st AAAI Conference on Artificial Intelligence","_id":"1009","publist_id":"6387","page":"3725 - 3732","acknowledgement":"The research leading to these results was supported by the Austrian Science Fund (FWF) NFN Grant no. S11407-N23 (RiSE/SHiNE); two ERC Starting grants (279307: Graph Games, 279499: inVEST); the Vienna Science and Technology Fund (WWTF) through project ICT15-003; and the People Programme (Marie Curie Actions) of the European Union’s Seventh Framework Programme (FP7/2007-2013) under REA grant agreement no. [291734]."}