TY - CONF AB - Stochastic Gradient Descent (SGD) is a fundamental algorithm in machine learning, representing the optimization backbone for training several classic models, from regression to neural networks. Given the recent practical focus on distributed machine learning, significant work has been dedicated to the convergence properties of this algorithm under the inconsistent and noisy updates arising from execution in a distributed environment. However, surprisingly, the convergence properties of this classic algorithm in the standard shared-memory model are still not well-understood. In this work, we address this gap, and provide new convergence bounds for lock-free concurrent stochastic gradient descent, executing in the classic asynchronous shared memory model, against a strong adaptive adversary. Our results give improved upper and lower bounds on the "price of asynchrony'' when executing the fundamental SGD algorithm in a concurrent setting. They show that this classic optimization tool can converge faster and with a wider range of parameters than previously known under asynchronous iterations. At the same time, we exhibit a fundamental trade-off between the maximum delay in the system and the rate at which SGD can converge, which governs the set of parameters under which this algorithm can still work efficiently. AU - Alistarh, Dan-Adrian AU - De Sa, Christopher AU - Konstantinov, Nikola H ID - 5962 SN - 9781450357951 T2 - Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing - PODC '18 TI - The convergence of stochastic gradient descent in asynchronous shared memory ER - TY - JOUR AB - A major problem for evolutionary theory is understanding the so-called open-ended nature of evolutionary change, from its definition to its origins. Open-ended evolution (OEE) refers to the unbounded increase in complexity that seems to characterize evolution on multiple scales. This property seems to be a characteristic feature of biological and technological evolution and is strongly tied to the generative potential associated with combinatorics, which allows the system to grow and expand their available state spaces. Interestingly, many complex systems presumably displaying OEE, from language to proteins, share a common statistical property: the presence of Zipf's Law. Given an inventory of basic items (such as words or protein domains) required to build more complex structures (sentences or proteins) Zipf's Law tells us that most of these elements are rare whereas a few of them are extremely common. Using algorithmic information theory, in this paper we provide a fundamental definition for open-endedness, which can be understood as postulates. Its statistical counterpart, based on standard Shannon information theory, has the structure of a variational problem which is shown to lead to Zipf's Law as the expected consequence of an evolutionary process displaying OEE. We further explore the problem of information conservation through an OEE process and we conclude that statistical information (standard Shannon information) is not conserved, resulting in the paradoxical situation in which the increase of information content has the effect of erasing itself. We prove that this paradox is solved if we consider non-statistical forms of information. This last result implies that standard information theory may not be a suitable theoretical framework to explore the persistence and increase of the information content in OEE systems. AU - Corominas-Murtra, Bernat AU - Seoane, Luís F. AU - Solé, Ricard ID - 5860 IS - 149 JF - Journal of the Royal Society Interface SN - 17425689 TI - Zipf's Law, unbounded complexity and open-ended evolution VL - 15 ER - TY - JOUR AB - In this paper we present a reliable method to verify the existence of loops along the uncertain trajectory of a robot, based on proprioceptive measurements only, within a bounded-error context. The loop closure detection is one of the key points in simultaneous localization and mapping (SLAM) methods, especially in homogeneous environments with difficult scenes recognitions. The proposed approach is generic and could be coupled with conventional SLAM algorithms to reliably reduce their computing burden, thus improving the localization and mapping processes in the most challenging environments such as unexplored underwater extents. To prove that a robot performed a loop whatever the uncertainties in its evolution, we employ the notion of topological degree that originates in the field of differential topology. We show that a verification tool based on the topological degree is an optimal method for proving robot loops. This is demonstrated both on datasets from real missions involving autonomous underwater vehicles and by a mathematical discussion. AU - Rohou, Simon AU - Franek, Peter AU - Aubry, Clément AU - Jaulin, Luc ID - 5960 IS - 12 JF - The International Journal of Robotics Research SN - 0278-3649 TI - Proving the existence of loops in robot trajectories VL - 37 ER - TY - CONF AB - There has been significant progress in understanding the parallelism inherent to iterative sequential algorithms: for many classic algorithms, the depth of the dependence structure is now well understood, and scheduling techniques have been developed to exploit this shallow dependence structure for efficient parallel implementations. A related, applied research strand has studied methods by which certain iterative task-based algorithms can be efficiently parallelized via relaxed concurrent priority schedulers. These allow for high concurrency when inserting and removing tasks, at the cost of executing superfluous work due to the relaxed semantics of the scheduler. In this work, we take a step towards unifying these two research directions, by showing that there exists a family of relaxed priority schedulers that can efficiently and deterministically execute classic iterative algorithms such as greedy maximal independent set (MIS) and matching. Our primary result shows that, given a randomized scheduler with an expected relaxation factor of k in terms of the maximum allowed priority inversions on a task, and any graph on n vertices, the scheduler is able to execute greedy MIS with only an additive factor of \poly(k) expected additional iterations compared to an exact (but not scalable) scheduler. This counter-intuitive result demonstrates that the overhead of relaxation when computing MIS is not dependent on the input size or structure of the input graph. Experimental results show that this overhead can be clearly offset by the gain in performance due to the highly scalable scheduler. In sum, we present an efficient method to deterministically parallelize iterative sequential algorithms, with provable runtime guarantees in terms of the number of executed tasks to completion. AU - Alistarh, Dan-Adrian AU - Brown, Trevor A AU - Kopinsky, Justin AU - Nadiradze, Giorgi ID - 5963 SN - 9781450357951 T2 - Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing - PODC '18 TI - Relaxed schedulers can efficiently parallelize iterative algorithms ER - TY - CONF AB - Relaxed concurrent data structures have become increasingly popular, due to their scalability in graph processing and machine learning applications (\citeNguyen13, gonzalez2012powergraph ). Despite considerable interest, there exist families of natural, high performing randomized relaxed concurrent data structures, such as the popular MultiQueue~\citeMQ pattern for implementing relaxed priority queue data structures, for which no guarantees are known in the concurrent setting~\citeAKLN17. Our main contribution is in showing for the first time that, under a set of analytic assumptions, a family of relaxed concurrent data structures, including variants of MultiQueues, but also a new approximate counting algorithm we call the MultiCounter, provides strong probabilistic guarantees on the degree of relaxation with respect to the sequential specification, in arbitrary concurrent executions. We formalize these guarantees via a new correctness condition called distributional linearizability, tailored to concurrent implementations with randomized relaxations. Our result is based on a new analysis of an asynchronous variant of the classic power-of-two-choices load balancing algorithm, in which placement choices can be based on inconsistent, outdated information (this result may be of independent interest). We validate our results empirically, showing that the MultiCounter algorithm can implement scalable relaxed timestamps. AU - Alistarh, Dan-Adrian AU - Brown, Trevor A AU - Kopinsky, Justin AU - Li, Jerry Z. AU - Nadiradze, Giorgi ID - 5965 SN - 9781450357999 T2 - Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures - SPAA '18 TI - Distributionally linearizable data structures ER - TY - CONF AB - The Big Match is a multi-stage two-player game. In each stage Player 1 hides one or two pebbles in his hand, and his opponent has to guess that number; Player 1 loses a point if Player 2 is correct, and otherwise he wins a point. As soon as Player 1 hides one pebble, the players cannot change their choices in any future stage. Blackwell and Ferguson (1968) give an ε-optimal strategy for Player 1 that hides, in each stage, one pebble with a probability that depends on the entire past history. Any strategy that depends just on the clock or on a finite memory is worthless. The long-standing natural open problem has been whether every strategy that depends just on the clock and a finite memory is worthless. We prove that there is such a strategy that is ε-optimal. In fact, we show that just two states of memory are sufficient. AU - Hansen, Kristoffer Arnsfelt AU - Ibsen-Jensen, Rasmus AU - Neyman, Abraham ID - 5967 SN - 9781450358293 T2 - Proceedings of the 2018 ACM Conference on Economics and Computation - EC '18 TI - The Big Match with a clock and a bit of memory ER - TY - CONF AB - The transactional conflict problem arises in transactional systems whenever two or more concurrent transactions clash on a data item. While the standard solution to such conflicts is to immediately abort one of the transactions, some practical systems consider the alternative of delaying conflict resolution for a short interval, which may allow one of the transactions to commit. The challenge in the transactional conflict problem is to choose the optimal length of this delay interval so as to minimize the overall running time penalty for the conflicting transactions. In this paper, we propose a family of optimal online algorithms for the transactional conflict problem. Specifically, we consider variants of this problem which arise in different implementations of transactional systems, namely "requestor wins'' and "requestor aborts'' implementations: in the former, the recipient of a coherence request is aborted, whereas in the latter, it is the requestor which has to abort. Both strategies are implemented by real systems. We show that the requestor aborts case can be reduced to a classic instance of the ski rental problem, while the requestor wins case leads to a new version of this classical problem, for which we derive optimal deterministic and randomized algorithms. Moreover, we prove that, under a simplified adversarial model, our algorithms are constant-competitive with the offline optimum in terms of throughput. We validate our algorithmic results empirically through a hardware simulation of hardware transactional memory (HTM), showing that our algorithms can lead to non-trivial performance improvements for classic concurrent data structures. AU - Alistarh, Dan-Adrian AU - Haider, Syed Kamran AU - Kübler, Raphael AU - Nadiradze, Giorgi ID - 5966 SN - 9781450357999 T2 - Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures - SPAA '18 TI - The transactional conflict problem ER - TY - JOUR AB - We consider the recent formulation of the algorithmic Lov ́asz Local Lemma [N. Har-vey and J. Vondr ́ak, inProceedings of FOCS, 2015, pp. 1327–1345; D. Achlioptas and F. Iliopoulos,inProceedings of SODA, 2016, pp. 2024–2038; D. Achlioptas, F. Iliopoulos, and V. Kolmogorov,ALocal Lemma for Focused Stochastic Algorithms, arXiv preprint, 2018] for finding objects that avoid“bad features,” or “flaws.” It extends the Moser–Tardos resampling algorithm [R. A. Moser andG. Tardos,J. ACM, 57 (2010), 11] to more general discrete spaces. At each step the method picks aflaw present in the current state and goes to a new state according to some prespecified probabilitydistribution (which depends on the current state and the selected flaw). However, the recent formu-lation is less flexible than the Moser–Tardos method since it requires a specific flaw selection rule,whereas the algorithm of Moser and Tardos allows an arbitrary rule (and thus can potentially beimplemented more efficiently). We formulate a new “commutativity” condition and prove that it issufficient for an arbitrary rule to work. It also enables an efficient parallelization under an additionalassumption. We then show that existing resampling oracles for perfect matchings and permutationsdo satisfy this condition. AU - Kolmogorov, Vladimir ID - 5975 IS - 6 JF - SIAM Journal on Computing SN - 0097-5397 TI - Commutativity in the algorithmic Lovász local lemma VL - 47 ER - TY - CONF AB - A standard design pattern found in many concurrent data structures, such as hash tables or ordered containers, is an alternation of parallelizable sections that incur no data conflicts and critical sections that must run sequentially and are protected with locks. A lock can be viewed as a queue that arbitrates the order in which the critical sections are executed, and a natural question is whether we can use stochastic analysis to predict the resulting throughput. As a preliminary evidence to the affirmative, we describe a simple model that can be used to predict the throughput of coarse-grained lock-based algorithms. We show that our model works well for CLH lock, and we expect it to work for other popular lock designs such as TTAS, MCS, etc. AU - Aksenov, Vitaly AU - Alistarh, Dan-Adrian AU - Kuznetsov, Petr ID - 5964 SN - 9781450357951 T2 - Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing - PODC '18 TI - Brief Announcement: Performance prediction for coarse-grained locking ER - TY - JOUR AB - We consider a Wigner-type ensemble, i.e. large hermitian N×N random matrices H=H∗ with centered independent entries and with a general matrix of variances Sxy=𝔼∣∣Hxy∣∣2. The norm of H is asymptotically given by the maximum of the support of the self-consistent density of states. We establish a bound on this maximum in terms of norms of powers of S that substantially improves the earlier bound 2∥S∥1/2∞ given in [O. Ajanki, L. Erdős and T. Krüger, Universality for general Wigner-type matrices, Prob. Theor. Rel. Fields169 (2017) 667–727]. The key element of the proof is an effective Markov chain approximation for the contributions of the weighted Dyck paths appearing in the iterative solution of the corresponding Dyson equation. AU - Erdös, László AU - Mühlbacher, Peter ID - 5971 JF - Random matrices: Theory and applications SN - 2010-3263 TI - Bounds on the norm of Wigner-type random matrices ER -