TY - JOUR
AB - We consider two-player games played on weighted directed graphs with mean-payoff and total-payoff objectives, two classical quantitative objectives. While for single-dimensional games the complexity and memory bounds for both objectives coincide, we show that in contrast to multi-dimensional mean-payoff games that are known to be coNP-complete, multi-dimensional total-payoff games are undecidable. We introduce conservative approximations of these objectives, where the payoff is considered over a local finite window sliding along a play, instead of the whole play. For single dimension, we show that (i) if the window size is polynomial, deciding the winner takes polynomial time, and (ii) the existence of a bounded window can be decided in NP ∩ coNP, and is at least as hard as solving mean-payoff games. For multiple dimensions, we show that (i) the problem with fixed window size is EXPTIME-complete, and (ii) there is no primitive-recursive algorithm to decide the existence of a bounded window.
AU - Chatterjee, Krishnendu
AU - Doyen, Laurent
AU - Randour, Mickael
AU - Raskin, Jean
ID - 523
IS - 6
JF - Information and Computation
TI - Looking at mean-payoff and total-payoff through windows
VL - 242
ER -
TY - JOUR
AB - We consider concurrent games played by two players on a finite-state graph, where in every round the players simultaneously choose a move, and the current state along with the joint moves determine the successor state. We study the most fundamental objective for concurrent games, namely, mean-payoff or limit-average objective, where a reward is associated to each transition, and the goal of player 1 is to maximize the long-run average of the rewards, and the objective of player 2 is strictly the opposite (i.e., the games are zero-sum). The path constraint for player 1 could be qualitative, i.e., the mean-payoff is the maximal reward, or arbitrarily close to it; or quantitative, i.e., a given threshold between the minimal and maximal reward. We consider the computation of the almost-sure (resp. positive) winning sets, where player 1 can ensure that the path constraint is satisfied with probability 1 (resp. positive probability). Almost-sure winning with qualitative constraint exactly corresponds to the question of whether there exists a strategy to ensure that the payoff is the maximal reward of the game. Our main results for qualitative path constraints are as follows: (1) we establish qualitative determinacy results that show that for every state either player 1 has a strategy to ensure almost-sure (resp. positive) winning against all player-2 strategies, or player 2 has a spoiling strategy to falsify almost-sure (resp. positive) winning against all player-1 strategies; (2) we present optimal strategy complexity results that precisely characterize the classes of strategies required for almost-sure and positive winning for both players; and (3) we present quadratic time algorithms to compute the almost-sure and the positive winning sets, matching the best known bound of the algorithms for much simpler problems (such as reachability objectives). For quantitative constraints we show that a polynomial time solution for the almost-sure or the positive winning set would imply a solution to a long-standing open problem (of solving the value problem of turn-based deterministic mean-payoff games) that is not known to be solvable in polynomial time.
AU - Chatterjee, Krishnendu
AU - Ibsen-Jensen, Rasmus
ID - 524
IS - 6
JF - Information and Computation
TI - Qualitative analysis of concurrent mean payoff games
VL - 242
ER -
TY - GEN
AB - We consider Markov decision processes (MDPs) with multiple limit-average (or mean-payoff) objectives.
There have been two different views: (i) the expectation semantics, where the goal is to optimize the expected mean-payoff objective, and (ii) the satisfaction semantics, where the goal is to maximize the probability of runs such that the mean-payoff value stays above a given vector.
We consider the problem where the goal is to optimize the expectation under the constraint that the satisfaction semantics is ensured, and thus consider a generalization that unifies the existing semantics.
Our problem captures the notion of optimization with respect to strategies that are risk-averse (i.e., ensures certain probabilistic guarantee).
Our main results are algorithms for the decision problem which are always polynomial in the size of the MDP. We also show that an approximation of the Pareto-curve can be computed in time polynomial in the size of the MDP, and the approximation factor, but exponential in the number of dimensions.
Finally, we present a complete characterization of the strategy complexity (in terms of memory bounds and randomization) required to solve our problem.
AU - Chatterjee, Krishnendu
AU - Komarkova, Zuzana
AU - Kretinsky, Jan
ID - 5429
SN - 2664-1690
TI - Unifying two views on multiple mean-payoff objectives in Markov decision processes
ER -
TY - GEN
AB - We consider the core algorithmic problems related to verification of systems with respect to three classical quantitative properties, namely, the mean- payoff property, the ratio property, and the minimum initial credit for energy property. The algorithmic problem given a graph and a quantitative property asks to compute the optimal value (the infimum value over all traces) from every node of the graph. We consider graphs with constant treewidth, and it is well-known that the control-flow graphs of most programs have constant treewidth. Let n denote the number of nodes of a graph, m the number of edges (for constant treewidth graphs m = O ( n ) ) and W the largest absolute value of the weights. Our main theoretical results are as follows. First, for constant treewidth graphs we present an algorithm that approximates the mean-payoff value within a mul- tiplicative factor of ∊ in time O ( n · log( n/∊ )) and linear space, as compared to the classical algorithms that require quadratic time. Second, for the ratio property we present an algorithm that for constant treewidth graphs works in time O ( n · log( | a · b · n | )) = O ( n · log( n · W )) , when the output is a b , as compared to the previously best known algorithm with running time O ( n 2 · log( n · W )) . Third, for the minimum initial credit problem we show that (i) for general graphs the problem can be solved in O ( n 2 · m ) time and the associated decision problem can be solved in O ( n · m ) time, improving the previous known O ( n 3 · m · log( n · W )) and O ( n 2 · m ) bounds, respectively; and (ii) for constant treewidth graphs we present an algorithm that requires O ( n · log n ) time, improving the previous known O ( n 4 · log( n · W )) bound. We have implemented some of our algorithms and show that they present a significant speedup on standard benchmarks.
AU - Chatterjee, Krishnendu
AU - Ibsen-Jensen, Rasmus
AU - Pavlogiannis, Andreas
ID - 5430
SN - 2664-1690
TI - Faster algorithms for quantitative verification in constant treewidth graphs
ER -
TY - GEN
AB - We consider finite-state concurrent stochastic games, played by k>=2 players for an infinite number of rounds, where in every round, each player simultaneously and independently of the other players chooses an action, whereafter the successor state is determined by a probability distribution given by the current state and the chosen actions. We consider reachability objectives that given a target set of states require that some state in the target set is visited, and the dual safety objectives that given a target set require that only states in the target set are visited. We are interested in the complexity of stationary strategies measured by their patience, which is defined as the inverse of the smallest non-zero probability employed.
Our main results are as follows: We show that in two-player zero-sum concurrent stochastic games (with reachability objective for one player and the complementary safety objective for the other player): (i) the optimal bound on the patience of optimal and epsilon-optimal strategies, for both players is doubly exponential; and (ii) even in games with a single non-absorbing state exponential (in the number of actions) patience is necessary. In general we study the class of non-zero-sum games admitting epsilon-Nash equilibria. We show that if there is at least one player with reachability objective, then doubly-exponential patience is needed in general for epsilon-Nash equilibrium strategies, whereas in contrast if all players have safety objectives, then the optimal bound on patience for epsilon-Nash equilibrium strategies is only exponential.
AU - Chatterjee, Krishnendu
AU - Ibsen-Jensen, Rasmus
AU - Hansen, Kristoffer
ID - 5431
SN - 2664-1690
TI - The patience of concurrent stochastic games with safety and reachability objectives
ER -