Finite-memory strategies in POMDPs with long-run average objectives

Chatterjee K, Saona Urmeneta RJ, Ziliotto B. 2022. Finite-memory strategies in POMDPs with long-run average objectives. Mathematics of Operations Research. 47(1), 100–119.


Journal Article | Published | English

Scopus indexed
Abstract
Partially observable Markov decision processes (POMDPs) are standard models for dynamic systems with probabilistic and nondeterministic behaviour in uncertain environments. We prove that in POMDPs with long-run average objective, the decision maker has approximately optimal strategies with finite memory. This implies notably that approximating the long-run value is recursively enumerable, as well as a weak continuity property of the value with respect to the transition function.
Publishing Year
2022
Date Published
2022-02-01
Journal Title
Mathematics of Operations Research
Acknowledgement
Partially supported by Austrian Science Fund (FWF) NFN Grant No RiSE/SHiNE S11407, by CONICYT Chile through grant PII 20150140, and by ECOS-CONICYT through grant C15E03.
Volume
47
Issue
1
Page
100-119

Cite this

Chatterjee K, Saona Urmeneta RJ, Ziliotto B. Finite-memory strategies in POMDPs with long-run average objectives. Mathematics of Operations Research. 2022;47(1):100-119. doi:10.1287/moor.2020.1116
Chatterjee, K., Saona Urmeneta, R. J., & Ziliotto, B. (2022). Finite-memory strategies in POMDPs with long-run average objectives. Mathematics of Operations Research. Institute for Operations Research and the Management Sciences. https://doi.org/10.1287/moor.2020.1116
Chatterjee, Krishnendu, Raimundo J Saona Urmeneta, and Bruno Ziliotto. “Finite-Memory Strategies in POMDPs with Long-Run Average Objectives.” Mathematics of Operations Research. Institute for Operations Research and the Management Sciences, 2022. https://doi.org/10.1287/moor.2020.1116.
K. Chatterjee, R. J. Saona Urmeneta, and B. Ziliotto, “Finite-memory strategies in POMDPs with long-run average objectives,” Mathematics of Operations Research, vol. 47, no. 1. Institute for Operations Research and the Management Sciences, pp. 100–119, 2022.
Chatterjee, Krishnendu, et al. “Finite-Memory Strategies in POMDPs with Long-Run Average Objectives.” Mathematics of Operations Research, vol. 47, no. 1, Institute for Operations Research and the Management Sciences, 2022, pp. 100–19, doi:10.1287/moor.2020.1116.
All files available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]

Link(s) to Main File(s)
Access Level
OA Open Access

Sources

arXiv 1904.13360