TY - CONF
AB - Several Hybrid Transactional Memory (HyTM) schemes have recently been proposed to complement the fast, but best-effort nature of Hardware Transactional Memory (HTM) with a slow, reliable software backup. However, the costs of providing concurrency between hardware and software transactions in HyTM are still not well understood. In this paper, we propose a general model for HyTM implementations, which captures the ability of hardware transactions to buffer memory accesses. The model allows us to formally quantify and analyze the amount of overhead (instrumentation) caused by the potential presence of software transactions. We prove that (1) it is impossible to build a strictly serializable HyTM implementation that has both uninstrumented reads and writes, even for very weak progress guarantees, and (2) the instrumentation cost incurred by a hardware transaction in any progressive opaque HyTM is linear in the size of the transaction’s data set. We further describe two implementations which exhibit optimal instrumentation costs for two different progress conditions. In sum, this paper proposes the first formal HyTM model and captures for the first time the trade-off between the degree of hardware-software TM concurrency and the amount of instrumentation overhead.
AU - Alistarh, Dan-Adrian
AU - Kopinsky, Justin
AU - Kuznetsov, Petr
AU - Ravi, Srivatsan
AU - Shavit, Nir
ID - 778
TI - Inherent limitations of hybrid transactional memory
VL - 9363
ER -
TY - CONF
AB - The concurrent memory reclamation problem is that of devising a way for a deallocating thread to verify that no other concurrent threads hold references to a memory block being deallocated. To date, in the absence of automatic garbage collection, there is no satisfactory solution to this problem; existing tracking methods like hazard pointers, reference counters, or epoch-based techniques like RCU, are either prohibitively expensive or require significant programming expertise, to the extent that implementing them efficiently can be worthy of a publication. None of the existing techniques are automatic or even semi-automated. In this paper, we take a new approach to concurrent memory reclamation: instead of manually tracking access to memory locations as done in techniques like hazard pointers, or restricting shared accesses to specific epoch boundaries as in RCU, our algorithm, called ThreadScan, leverages operating system signaling to automatically detect which memory locations are being accessed by concurrent threads. Initial empirical evidence shows that ThreadScan scales surprisingly well and requires negligible programming effort beyond the standard use of Malloc and Free.
AU - Alistarh, Dan-Adrian
AU - Matveev, Alexander
AU - Leiserson, William
AU - Shavit, Nir
ID - 779
TI - ThreadScan: Automatic and scalable memory reclamation
VL - 2015-June
ER -
TY - CONF
AB - Population protocols are networks of finite-state agents, interacting randomly, and updating their states using simple rules. Despite their extreme simplicity, these systems have been shown to cooperatively perform complex computational tasks, such as simulating register machines to compute standard arithmetic functions. The election of a unique leader agent is a key requirement in such computational constructions. Yet, the fastest currently known population protocol for electing a leader only has linear convergence time, and it has recently been shown that no population protocol using a constant number of states per node may overcome this linear bound. In this paper, we give the first population protocol for leader election with polylogarithmic convergence time, using polylogarithmic memory states per node. The protocol structure is quite simple: each node has an associated value, and is either a leader (still in contention) or a minion (following some leader). A leader keeps incrementing its value and “defeats” other leaders in one-to-one interactions, and will drop from contention and become a minion if it meets a leader with higher value. Importantly, a leader also drops out if it meets a minion with higher absolute value. While these rules are quite simple, the proof that this algorithm achieves polylogarithmic convergence time is non-trivial. In particular, the argument combines careful use of concentration inequalities with anti-concentration bounds, showing that the leaders’ values become spread apart as the execution progresses, which in turn implies that straggling leaders get quickly eliminated. We complement our analysis with empirical results, showing that our protocol converges extremely fast, even for large network sizes.
AU - Alistarh, Dan-Adrian
AU - Gelashvili, Rati
ID - 780
TI - Polylogarithmic-time leader election in population protocols
VL - 9135
ER -
TY - CONF
AB - Population protocols, roughly defined as systems consisting of large numbers of simple identical agents, interacting at random and updating their state following simple rules, are an important research topic at the intersection of distributed computing and biology. One of the fundamental tasks that a population protocol may solve is majority: each node starts in one of two states; the goal is for all nodes to reach a correct consensus on which of the two states was initially the majority. Despite considerable research effort, known protocols for this problem are either exact but slow (taking linear parallel time to converge), or fast but approximate (with non-zero probability of error). In this paper, we show that this trade-off between precision and speed is not inherent. We present a new protocol called Average and Conquer (AVC) that solves majority exactly in expected parallel convergence time O(log n/(sε) + log n log s), where n is the number of nodes, εn is the initial node advantage of the majority state, and s = Ω(log n log log n) is the number of states the protocol employs. This shows that the majority problem can be solved exactly in time poly-logarithmic in n, provided that the memory per node is s = Ω(1/ε + log n log 1/ε). On the negative side, we establish a lower bound of Ω(1/ε) on the expected parallel convergence time for the case of four memory states per node, and a lower bound of Ω(log n) parallel time for protocols using any number of memory states per node.
AU - Alistarh, Dan-Adrian
AU - Gelashvili, Rati
AU - Vojnović, Milan
ID - 781
TI - Fast and exact majority in population protocols
VL - 2015-July
ER -
TY - CONF
AB - In this work, we consider the following random process, motivated by the analysis of lock-free concurrent algorithms under high memory contention. In each round, a new scheduling step is allocated to one of n threads, according to a distribution p = (p_1, p_2, ..., p_n), where thread i is scheduled with probability p_i. When some thread first reaches a set threshold of executed steps, it registers a win, completing its current operation, and resets its step count to 1. At the same time, threads whose step count was close to the threshold also get reset because of the win, but to 0 steps, being penalized for almost winning. We are interested in two questions: how often does some thread complete an operation (system latency), and how often does a specific thread complete an operation (individual latency)? We provide asymptotically tight bounds for the system and individual latency of this general concurrency pattern, for arbitrary scheduling distributions p. Surprisingly, a simple characterization exists: in expectation, the system will complete a new operation every Θ(1/‖p‖_2) steps, while thread i will complete a new operation every Θ(‖p‖_2/p_i) steps. The proof is interesting in its own right, as it requires a careful analysis of how the higher norms of the vector p influence the thread step counts and latencies in this random process. Our result offers a simple connection between the scheduling distribution and the average performance of concurrent algorithms, which has several applications.
AU - Alistarh, Dan-Adrian
AU - Sauerwald, Thomas
AU - Vojnović, Milan
ID - 782
TI - Lock-Free algorithms under stochastic schedulers
VL - 2015-July
ER -
TY - CONF
AB - The problem of electing a leader from among n contenders is one of the fundamental questions in distributed computing. In its simplest formulation, the task is as follows: given n processors, all participants must eventually return a win or lose indication, such that a single contender may win. Despite a considerable amount of work on leader election, the following question is still open: can we elect a leader in an asynchronous fault-prone system faster than just running a Θ(log n)-time tournament, against a strong adaptive adversary? In this paper, we answer this question in the affirmative, improving on a decades-old upper bound. We introduce two new algorithmic ideas to reduce the time complexity of electing a leader to O(log* n), using O(n^2) point-to-point messages. A non-trivial application of our algorithm is a new upper bound for the tight renaming problem, assigning n items to the n participants in expected O(log^2 n) time and O(n^2) messages. We complement our results with a lower bound of Ω(n^2) messages for solving these two problems, closing the question of their message complexity.
AU - Alistarh, Dan-Adrian
AU - Gelashvili, Rati
AU - Vladu, Adrian
ID - 783
TI - How to elect a leader faster than a tournament
VL - 2015-July
ER -
TY - CONF
AB - We demonstrate an optical switch design that can scale up to a thousand ports with high per-port bandwidth (25 Gbps+) and low switching latency (40 ns). Our design uses a broadcast and select architecture, based on a passive star coupler and fast tunable transceivers. In addition we employ time division multiplexing to achieve very low switching latency. Our demo shows the feasibility of the switch data plane using a small testbed, comprising two transmitters and a receiver, connected through a star coupler.
AU - Alistarh, Dan-Adrian
AU - Ballani, Hitesh
AU - Costa, Paolo
AU - Funnell, Adam
AU - Benjamin, Joshua
AU - Watts, Philip
AU - Thomsen, Benn
ID - 784
SN - 978-1-4503-3542-3
TI - A high-radix, low-latency optical switch for data centers
ER -
TY - JOUR
AB - We prove that nonlinear Gibbs measures can be obtained from the corresponding many-body, grand-canonical, quantum Gibbs states, in a mean-field limit where the temperature T diverges and the interaction strength behaves as 1/T. We proceed by characterizing the interacting Gibbs state as minimizing a functional counting the free-energy relative to the non-interacting case. We then perform an infinite-dimensional analogue of phase-space semiclassical analysis, using fine properties of the quantum relative entropy, the link between quantum de Finetti measures and upper/lower symbols in a coherent state basis, as well as Berezin-Lieb type inequalities. Our results cover the measure built on the defocusing nonlinear Schrödinger functional on a finite interval, as well as smoother interactions in dimensions d ≥ 2.
AU - Lewin, Mathieu
AU - Phan Thanh, Nam
AU - Rougerie, Nicolas
ID - 473
JF - Journal de l'Ecole Polytechnique - Mathematiques
TI - Derivation of nonlinear Gibbs measures from many-body quantum mechanics
VL - 2
ER -
TY - JOUR
AB - Dendritic cells are potent antigen-presenting cells endowed with the unique ability to initiate adaptive immune responses upon inflammation. Inflammatory processes are often associated with an increased production of serotonin, which operates by activating specific receptors. However, the functional role of serotonin receptors in regulation of dendritic cell functions is poorly understood. Here, we demonstrate that expression of serotonin receptor 5-HT7 (5-HT7R) as well as its downstream effector Cdc42 is upregulated in dendritic cells upon maturation. Although dendritic cell maturation was independent of 5-HT7R, receptor stimulation affected dendritic cell morphology through Cdc42-mediated signaling. In addition, basal activity of 5-HT7R was required for the proper expression of the chemokine receptor CCR7, which is a key factor that controls dendritic cell migration. Consistent with this, we observed that 5-HT7R enhances chemotactic motility of dendritic cells in vitro by modulating their directionality and migration velocity. Accordingly, migration of dendritic cells in murine colon explants was abolished after pharmacological receptor inhibition. Our results indicate that there is a crucial role for 5-HT7R-Cdc42-mediated signaling in the regulation of dendritic cell morphology and motility, suggesting that 5-HT7R could be a new target for treatment of a variety of inflammatory and immune disorders.
AU - Holst, Katrin
AU - Guseva, Daria
AU - Schindler, Susann
AU - Sixt, Michael K
AU - Braun, Armin
AU - Chopra, Himpriya
AU - Pabst, Oliver
AU - Ponimaskin, Evgeni
ID - 477
IS - 15
JF - Journal of Cell Science
TI - The serotonin receptor 5-HT7R regulates the morphology and migratory properties of dendritic cells
VL - 128
ER -
TY - JOUR
AB - We consider two-player games played on weighted directed graphs with mean-payoff and total-payoff objectives, two classical quantitative objectives. While for single-dimensional games the complexity and memory bounds for both objectives coincide, we show that in contrast to multi-dimensional mean-payoff games that are known to be coNP-complete, multi-dimensional total-payoff games are undecidable. We introduce conservative approximations of these objectives, where the payoff is considered over a local finite window sliding along a play, instead of the whole play. For single dimension, we show that (i) if the window size is polynomial, deciding the winner takes polynomial time, and (ii) the existence of a bounded window can be decided in NP ∩ coNP, and is at least as hard as solving mean-payoff games. For multiple dimensions, we show that (i) the problem with fixed window size is EXPTIME-complete, and (ii) there is no primitive-recursive algorithm to decide the existence of a bounded window.
AU - Chatterjee, Krishnendu
AU - Doyen, Laurent
AU - Randour, Mickael
AU - Raskin, Jean
ID - 523
IS - 6
JF - Information and Computation
TI - Looking at mean-payoff and total-payoff through windows
VL - 242
ER -