TY  - JOUR
AB  - Mathematical models often aim to describe a complicated mechanism in a cohesive and simple manner. However, striking the right balance between being simple enough and being overly simplistic is a challenging task. Frequently, game-theoretic models have an underlying assumption that players, whenever they choose to execute a specific action, do so perfectly. In fact, it is rare that action execution perfectly coincides with intentions of individuals, giving rise to behavioural mistakes. The concept of incompetence of players was suggested to address this issue in game-theoretic settings. Under the assumption of incompetence, players have non-zero probabilities of executing a different strategy from the one they chose, leading to stochastic outcomes of the interactions. In this article, we survey results related to the concept of incompetence in classic as well as evolutionary game theory and provide several new results. We also suggest future extensions of the model and argue why it is important to take into account behavioural mistakes when analysing interactions among players in both economic and biological settings.
AU  - Graham, Thomas
AU  - Kleshnina, Maria
AU  - Filar, Jerzy A.
ID  - 10770
JF  - Dynamic Games and Applications
SN  - 2153-0785
TI  - Where do mistakes lead? A survey of games with incompetent players
VL  - 13
ER  - 
TY  - CONF
AB  - Entropic risk (ERisk) is an established risk measure in finance, quantifying risk by an exponential re-weighting of rewards. We study ERisk for the first time in the context of turn-based stochastic games with the total reward objective. This gives rise to an objective function that demands the control of systems in a risk-averse manner. We show that the resulting games are determined and, in particular, admit optimal memoryless deterministic strategies. This contrasts with risk measures that previously have been considered in the special case of Markov decision processes and that require randomization and/or memory. We provide several results on the decidability and the computational complexity of the threshold problem, i.e. whether the optimal value of ERisk exceeds a given threshold. In the most general case, the problem is decidable subject to Schanuel’s conjecture. If all inputs are rational, the resulting threshold problem can be solved using algebraic numbers, leading to decidability via a polynomial-time reduction to the existential theory of the reals. Further restrictions on the encoding of the input allow the solution of the threshold problem in NP∩coNP. Finally, an approximation algorithm for the optimal value of ERisk is provided.
AU  - Baier, Christel
AU  - Chatterjee, Krishnendu
AU  - Meggendorfer, Tobias
AU  - Piribauer, Jakob
ID  - 14417
SN  - 9783959772921
T2  - 48th International Symposium on Mathematical Foundations of Computer Science
TI  - Entropic risk for turn-based stochastic games
VL  - 272
ER  - 
TY  - JOUR
AB  - Allometric settings of population dynamics models are appealing due to their parsimonious nature and broad utility when studying system level effects. Here, we parameterise the size-scaled Rosenzweig-MacArthur differential equations to eliminate prey-mass dependency, facilitating an in-depth analytic study of the equations which incorporates scaling parameters’ contributions to coexistence. We define the functional response term to match empirical findings, and examine situations where metabolic theory derivations and observation diverge. The dynamical properties of the Rosenzweig-MacArthur system, encompassing the distribution of size-abundance equilibria, the scaling of period and amplitude of population cycling, and relationships between predator and prey abundances, are consistent with empirical observation. Our parameterisation is an accurate minimal model across 15+ orders of mass magnitude.
AU  - Mckerral, Jody C.
AU  - Kleshnina, Maria
AU  - Ejov, Vladimir
AU  - Bartle, Louise
AU  - Mitchell, James G.
AU  - Filar, Jerzy A.
ID  - 12706
IS  - 2
JF  - PLoS One
TI  - Empirical parameterisation and dynamical analysis of the allometric Rosenzweig-MacArthur equations
VL  - 18
ER  - 
TY  - CONF
AB  - We consider bidding games, a class of two-player zero-sum graph games. The game proceeds as follows. Both players have bounded budgets. A token is placed on a vertex of a graph; in each turn the players simultaneously submit bids, and the higher bidder moves the token, where we break bidding ties in favor of Player 1. Player 1 wins the game iff the token visits a designated target vertex. We consider, for the first time, poorman discrete-bidding in which the granularity of the bids is restricted and the higher bid is paid to the bank. Previous work either did not impose granularity restrictions or considered Richman bidding (bids are paid to the opponent). While the latter mechanisms are technically more accessible, the former is more appealing from a practical standpoint. Our study focuses on threshold budgets, i.e., the necessary and sufficient initial budget required for Player 1 to ensure winning against a given Player 2 budget. We first show existence of thresholds. In DAGs, we show that threshold budgets can be approximated with error bounds by thresholds under continuous-bidding and that they exhibit a periodic behavior. We identify closed-form solutions in special cases. We implement and experiment with an algorithm to find threshold budgets.
AU  - Avni, Guy
AU  - Meggendorfer, Tobias
AU  - Sadhukhan, Suman
AU  - Tkadlec, Josef
AU  - Zikelic, Dorde
ID  - 14518
SN  - 0922-6389
T2  - Frontiers in Artificial Intelligence and Applications
TI  - Reachability poorman discrete-bidding games
VL  - 372
ER  - 
TY  - CONF
AB  - We consider the problem of learning control policies in discrete-time stochastic systems which guarantee that the system stabilizes within some specified stabilization region with probability 1. Our approach is based on the novel notion of stabilizing ranking supermartingales (sRSMs) that we introduce in this work. Our sRSMs overcome the limitation of methods proposed in previous works whose applicability is restricted to systems in which the stabilizing region cannot be left once entered under any control policy. We present a learning procedure that learns a control policy together with an sRSM that formally certifies probability 1 stability, both learned as neural networks. We show that this procedure can also be adapted to formally verifying that, under a given Lipschitz continuous control policy, the stochastic system stabilizes within some stabilizing region with probability 1. Our experimental evaluation shows that our learning procedure can successfully learn provably stabilizing policies in practice.
AU  - Ansaripour, Matin
AU  - Chatterjee, Krishnendu
AU  - Henzinger, Thomas A
AU  - Lechner, Mathias
AU  - Zikelic, Dorde
ID  - 14559
SN  - 0302-9743
T2  - 21st International Symposium on Automated Technology for Verification and Analysis
TI  - Learning provably stabilizing neural controllers for discrete-time stochastic systems
VL  - 14215
ER  - 
TY  - CONF
AB  - We consider a natural problem dealing with weighted packet selection across a rechargeable link, which, e.g., finds applications in cryptocurrency networks. The capacity of a link (u, v) is determined by how much nodes u and v allocate for this link. Specifically, the input is a finite ordered sequence of packets that arrive in both directions along a link. Given (u, v) and a packet of weight x going from u to v, node u can either accept or reject the packet. If u accepts the packet, the capacity on link (u, v) decreases by x. Correspondingly, v’s capacity on (u, v) increases by x. If a node rejects the packet, this will entail a cost affinely linear in the weight of the packet. A link is “rechargeable” in the sense that the total capacity of the link has to remain constant, but the allocation of capacity at the ends of the link can depend arbitrarily on the nodes’ decisions. The goal is to minimise the sum of the capacity injected into the link and the cost of rejecting packets. We show that the problem is NP-hard, but can be approximated efficiently with a ratio of (1+ε)⋅(1+√3) for some arbitrary ε>0.
AU  - Schmid, Stefan
AU  - Svoboda, Jakub
AU  - Yeo, Michelle X
ID  - 13238
SN  - 0302-9743
T2  - SIROCCO 2023: Structural Information and Communication Complexity 
TI  - Weighted packet selection for rechargeable links in cryptocurrency networks: Complexity and approximation
VL  - 13892
ER  - 
TY  - JOUR
AB  - Natural selection is usually studied between mutants that differ in reproductive rate, but are subject to the same population structure. Here we explore how natural selection acts on mutants that have the same reproductive rate, but different population structures. In our framework, population structure is given by a graph that specifies where offspring can disperse. The invading mutant disperses offspring on a different graph than the resident wild-type. We find that more densely connected dispersal graphs tend to increase the invader’s fixation probability, but the exact relationship between structure and fixation probability is subtle. We present three main results. First, we prove that if both invader and resident are on complete dispersal graphs, then removing a single edge in the invader’s dispersal graph reduces its fixation probability. Second, we show that for certain island models higher invader’s connectivity increases its fixation probability, but the magnitude of the effect depends on the exact layout of the connections. Third, we show that for lattices the effect of different connectivity is comparable to that of different fitness: for large population size, the invader’s fixation probability is either constant or exponentially small, depending on whether it is more or less connected than the resident.
AU  - Tkadlec, Josef
AU  - Kaveh, Kamran
AU  - Chatterjee, Krishnendu
AU  - Nowak, Martin A.
ID  - 14657
IS  - 208
JF  - Journal of the Royal Society, Interface
TI  - Evolutionary dynamics of mutants that modify population structure
VL  - 20
ER  - 
TY  - JOUR
AB  - Many human interactions feature the characteristics of social dilemmas where individual actions have consequences for the group and the environment. The feedback between behavior and environment can be studied with the framework of stochastic games. In stochastic games, the state of the environment can change, depending on the choices made by group members. Past work suggests that such feedback can reinforce cooperative behaviors. In particular, cooperation can evolve in stochastic games even if it is infeasible in each separate repeated game. In stochastic games, participants have an interest in conditioning their strategies on the state of the environment. Yet in many applications, precise information about the state could be scarce. Here, we study how the availability of information (or lack thereof) shapes evolution of cooperation. Already for simple examples of two state games we find surprising effects. In some cases, cooperation is only possible if there is precise information about the state of the environment. In other cases, cooperation is most abundant when there is no information about the state of the environment. We systematically analyze all stochastic games of a given complexity class, to determine when receiving information about the environment is better, neutral, or worse for evolution of cooperation.
AU  - Kleshnina, Maria
AU  - Hilbe, Christian
AU  - Simsa, Stepan
AU  - Chatterjee, Krishnendu
AU  - Nowak, Martin A.
ID  - 13258
JF  - Nature Communications
TI  - The effect of environmental information on evolution of cooperation in stochastic games
VL  - 14
ER  - 
TY  - GEN
AU  - Kleshnina, Maria
ID  - 13336
TI  - kleshnina/stochgames_info: The effect of environmental information on evolution of cooperation in stochastic games
ER  - 
TY  - CONF
AB  - A classic solution technique for Markov decision processes (MDP) and stochastic games (SG) is value iteration (VI). Due to its good practical performance, this approximative approach is typically preferred over exact techniques, even though no practical bounds on the imprecision of the result could be given until recently. As a consequence, even the most used model checkers could return arbitrarily wrong results. Over the past decade, different works derived stopping criteria, indicating when the precision reaches the desired level, for various settings, in particular MDP with reachability, total reward, and mean payoff, and SG with reachability. In this paper, we provide the first stopping criteria for VI on SG with total reward and mean payoff, yielding the first anytime algorithms in these settings. To this end, we provide the solution in two flavours: First through a reduction to the MDP case and second directly on SG. The former is simpler and automatically utilizes any advances on MDP. The latter allows for more local computations, heading towards better practical efficiency. Our solution unifies the previously mentioned approaches for MDP and SG and their underlying ideas. To achieve this, we isolate objective-specific subroutines as well as identify objective-independent concepts. These structural concepts, while surprisingly simple, form the very essence of the unified solution.
AU  - Kretinsky, Jan
AU  - Meggendorfer, Tobias
AU  - Weininger, Maximilian
ID  - 13967
SN  - 1043-6871
T2  - 38th Annual ACM/IEEE Symposium on Logic in Computer Science
TI  - Stopping criteria for value iteration on stochastic games with quantitative objectives
VL  - 2023
ER  - 
TY  - JOUR
AB  - The input to the token swapping problem is a graph with vertices v_1, v_2, ..., v_n, and n tokens with labels 1, 2, ..., n, one on each vertex. The goal is to get token i to vertex v_i for all i = 1, ..., n using a minimum number of swaps, where a swap exchanges the tokens on the endpoints of an edge. Token swapping on a tree, also known as “sorting with a transposition tree,” is not known to be in P, nor is it known to be NP-complete. We present some partial results: 1. An optimum swap sequence may need to perform a swap on a leaf vertex that has the correct token (a “happy leaf”), disproving a conjecture of Vaughan. 2. Any algorithm that fixes happy leaves—as all known approximation algorithms for the problem do—has approximation factor at least 4/3. Furthermore, the two best-known 2-approximation algorithms have approximation factor exactly 2. 3. A generalized problem—weighted coloured token swapping—is NP-complete on trees, but solvable in polynomial time on paths and stars. In this version, tokens and vertices have colours, and colours have weights. The goal is to get every token to a vertex of the same colour, and the cost of a swap is the sum of the weights of the two tokens involved.
AU  - Biniaz, Ahmad
AU  - Jain, Kshitij
AU  - Lubiw, Anna
AU  - Masárová, Zuzana
AU  - Miltzow, Tillmann
AU  - Mondal, Debajyoti
AU  - Naredla, Anurag Murty
AU  - Tkadlec, Josef
AU  - Turcotte, Alexi
ID  - 12833
IS  - 2
JF  - Discrete Mathematics and Theoretical Computer Science
SN  - 1462-7264
TI  - Token swapping on trees
VL  - 24
ER  - 
TY  - CONF
AB  - Payment channel networks (PCNs) are a promising technology to improve the scalability of cryptocurrencies. PCNs, however, face the challenge that the frequent usage of certain routes may deplete channels in one direction, and hence prevent further transactions. In order to reap the full potential of PCNs, recharging and rebalancing mechanisms are required to provision channels, as well as an admission control logic to decide which transactions to reject in case capacity is insufficient. This paper presents a formal model of this optimisation problem. In particular, we consider an online algorithms perspective, where transactions arrive over time in an unpredictable manner. Our main contributions are competitive online algorithms which come with provable guarantees over time. We empirically evaluate our algorithms on randomly generated transactions to compare the average performance of our algorithms to our theoretical bounds. We also show how this model and approach differ from related problems in classic communication networks.
AU  - Bastankhah, Mahsa
AU  - Chatterjee, Krishnendu
AU  - Maddah-Ali, Mohammad Ali
AU  - Schmid, Stefan
AU  - Svoboda, Jakub
AU  - Yeo, Michelle X
ID  - 14736
SN  - 0302-9743
T2  - 27th International Conference on Financial Cryptography and Data Security
TI  - R2: Boosting liquidity in payment channel networks with online admission control
VL  - 13950
ER  - 
TY  - THES
AB  - Stochastic systems provide a formal framework for modelling and quantifying uncertainty in systems and have been widely adopted in many application domains. Formal verification and control of finite state stochastic systems, a subfield of formal methods also known as probabilistic model checking, is well studied. In contrast, formal verification and control of infinite state stochastic systems have received comparatively less attention. However, infinite state stochastic systems commonly arise in practice. For instance, probabilistic models that contain continuous probability distributions such as normal or uniform, or stochastic dynamical systems which are a classical model for control under uncertainty, both give rise to infinite state systems. The goal of this thesis is to contribute to laying theoretical and algorithmic foundations of fully automated formal verification and control of infinite state stochastic systems, with a particular focus on systems that may be executed over a long or infinite time. We consider formal verification of infinite state stochastic systems in the setting of static analysis of probabilistic programs and formal control in the setting of controller synthesis in stochastic dynamical systems. For both problems, we present some of the first fully automated methods for probabilistic (a.k.a. quantitative) reachability and safety analysis applicable to infinite time horizon systems. We also advance the state of the art of probability 1 (a.k.a. qualitative) reachability analysis for both problems. Finally, for formal controller synthesis in stochastic dynamical systems, we present a novel framework for learning neural network control policies in stochastic dynamical systems with formal guarantees on correctness with respect to quantitative reachability, safety or reach-avoid specifications.
AU  - Zikelic, Dorde
ID  - 14539
SN  - 2663-337X
TI  - Automated verification and control of infinite state stochastic systems
ER  - 
TY  - JOUR
AB  - We consider the almost-sure (a.s.) termination problem for probabilistic programs, which are a stochastic extension of classical imperative programs. Lexicographic ranking functions provide a sound and practical approach for termination of non-probabilistic programs, and their extension to probabilistic programs is achieved via lexicographic ranking supermartingales (LexRSMs). However, LexRSMs introduced in the previous work have a limitation that impedes their automation: all of their components have to be non-negative in all reachable states. This might result in a LexRSM not existing even for simple terminating programs. Our contributions are twofold. First, we introduce a generalization of LexRSMs that allows for some components to be negative. This standard feature of non-probabilistic termination proofs was hitherto not known to be sound in the probabilistic setting, as the soundness proof requires a careful analysis of the underlying stochastic process. Second, we present polynomial-time algorithms using our generalized LexRSMs for proving a.s. termination in broad classes of linear-arithmetic programs.
AU  - Chatterjee, Krishnendu
AU  - Kafshdar Goharshady, Ehsan
AU  - Novotný, Petr
AU  - Zárevúcky, Jiří
AU  - Zikelic, Dorde
ID  - 14778
IS  - 2
JF  - Formal Aspects of Computing
KW  - Theoretical Computer Science
KW  - Software
SN  - 0934-5043
TI  - On lexicographic proof rules for probabilistic termination
VL  - 35
ER  - 
TY  - CONF
AB  - In this paper, we present novel algorithms that efficiently compute a shortest reconfiguration sequence between two given dominating sets in trees and interval graphs under the TOKEN SLIDING model. In this problem, a graph is provided along with its two dominating sets, which can be imagined as tokens placed on vertices. The objective is to find a shortest sequence of dominating sets that transforms one set into the other, with each set in the sequence resulting from sliding a single token in the previous set. While identifying any sequence has been well studied, our work presents the first polynomial algorithms for this optimization variant in the context of dominating sets.
AU  - Křišťan, Jan Matyáš
AU  - Svoboda, Jakub
ID  - 14456
SN  - 0302-9743
T2  - 24th International Symposium on Fundamentals of Computation Theory
TI  - Shortest dominating set reconfiguration under token sliding
VL  - 14292
ER  - 
TY  - CONF
AB  - We study the problem of learning controllers for discrete-time non-linear stochastic dynamical systems with formal reach-avoid guarantees. This work presents the first method for providing formal reach-avoid guarantees, which combine and generalize stability and safety guarantees, with a tolerable probability threshold p in [0,1] over the infinite time horizon. Our method leverages advances in machine learning literature and it represents formal certificates as neural networks. In particular, we learn a certificate in the form of a reach-avoid supermartingale (RASM), a novel notion that we introduce in this work. Our RASMs provide reachability and avoidance guarantees by imposing constraints on what can be viewed as a stochastic extension of level sets of Lyapunov functions for deterministic systems. Our approach solves several important problems -- it can be used to learn a control policy from scratch, to verify a reach-avoid specification for a fixed control policy, or to fine-tune a pre-trained policy if it does not satisfy the reach-avoid specification. We validate our approach on 3 stochastic non-linear reinforcement learning tasks.
AU  - Zikelic, Dorde
AU  - Lechner, Mathias
AU  - Henzinger, Thomas A
AU  - Chatterjee, Krishnendu
ID  - 14830
IS  - 10
SN  - 2159-5399
T2  - Proceedings of the 37th AAAI Conference on Artificial Intelligence
TI  - Learning control policies for stochastic systems with reach-avoid guarantees
VL  - 37
ER  - 
TY  - CONF
AB  - A classical problem for Markov chains is determining their stationary (or steady-state) distribution. This problem has an equally classical solution based on eigenvectors and linear equation systems. However, this approach does not scale to large instances, and iterative solutions are desirable. It turns out that a naive approach, as used by current model checkers, may yield completely wrong results. We present a new approach, which utilizes recent advances in partial exploration and mean payoff computation to obtain a correct, converging approximation.
AU  - Meggendorfer, Tobias
ID  - 13139
SN  - 0302-9743
T2  - TACAS 2023: Tools and Algorithms for the Construction and Analysis of Systems
TI  - Correct approximation of stationary distributions
VL  - 13993
ER  - 
TY  - GEN
AB  - The software artefact to evaluate the approximation of stationary distributions implementation.
AU  - Meggendorfer, Tobias
ID  - 14990
TI  - Artefact for: Correct Approximation of Stationary Distributions
ER  - 
TY  - CONF
AB  - Reinforcement learning has shown promising results in learning neural network policies for complicated control tasks. However, the lack of formal guarantees about the behavior of such policies remains an impediment to their deployment. We propose a novel method for learning a composition of neural network policies in stochastic environments, along with a formal certificate which guarantees that a specification over the policy's behavior is satisfied with the desired probability. Unlike prior work on verifiable RL, our approach leverages the compositional nature of logical specifications provided in SpectRL, to learn over graphs of probabilistic reach-avoid specifications. The formal guarantees are provided by learning neural network policies together with reach-avoid supermartingales (RASM) for the graph’s sub-tasks and then composing them into a global policy. We also derive a tighter lower bound compared to previous work on the probability of reach-avoidance implied by a RASM, which is required to find a compositional policy with an acceptable probabilistic threshold for complex tasks with multiple edge policies. We implement a prototype of our approach and evaluate it on a Stochastic Nine Rooms environment.
AU  - Zikelic, Dorde
AU  - Lechner, Mathias
AU  - Verma, Abhinav
AU  - Chatterjee, Krishnendu
AU  - Henzinger, Thomas A
ID  - 15023
T2  - 37th Conference on Neural Information Processing Systems
TI  - Compositional policy learning in stochastic control systems with formal guarantees
ER  - 
TY  - CONF
AB  - Given a Markov chain M = (V, v_0, δ), with state space V and a starting state v_0, and a probability threshold ε, an ε-core is a subset C of states that is left with probability at most ε. More formally, C ⊆ V is an ε-core, iff ℙ[reach (V\C)] ≤ ε. Cores have been applied in a wide variety of verification problems over Markov chains, Markov decision processes, and probabilistic programs, as a means of discarding uninteresting and low-probability parts of a probabilistic system and instead being able to focus on the states that are likely to be encountered in a real-world run. In this work, we focus on the problem of computing a minimal ε-core in a Markov chain. Our contributions include both negative and positive results: (i) We show that the decision problem on the existence of an ε-core of a given size is NP-complete. This solves an open problem posed in [Jan Kretínský and Tobias Meggendorfer, 2020]. We additionally show that the problem remains NP-complete even when limited to acyclic Markov chains with bounded maximal vertex degree; (ii) We provide a polynomial time algorithm for computing a minimal ε-core on Markov chains over control-flow graphs of structured programs. A straightforward combination of our algorithm with standard branch prediction techniques allows one to apply the idea of cores to find a subset of program lines that are left with low probability and then focus any desired static analysis on this core subset.
AU  - Ahmadi, Ali
AU  - Chatterjee, Krishnendu
AU  - Goharshady, Amir Kafshdar
AU  - Meggendorfer, Tobias
AU  - Safavi Hemami, Roodabeh
AU  - Zikelic, Dorde
ID  - 12102
SN  - 1868-8969
T2  - 42nd IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science
TI  - Algorithms and hardness results for computing cores of Markov chains
VL  - 250
ER  -