TY - JOUR
AB - Traditional statistical methods for confidentiality protection of statistical databases do not scale well to deal with GWAS databases especially in terms of guarantees regarding protection from linkage to external information. The more recent concept of differential privacy, introduced by the cryptographic community, is an approach which provides a rigorous definition of privacy with meaningful privacy guarantees in the presence of arbitrary external information, although the guarantees may come at a serious price in terms of data utility. Building on such notions, we propose new methods to release aggregate GWAS data without compromising an individual’s privacy. We present methods for releasing differentially private minor allele frequencies, chi-square statistics and p-values. We compare these approaches on simulated data and on a GWAS study of canine hair length involving 685 dogs. We also propose a privacy-preserving method for finding genome-wide associations based on a differentially-private approach to penalized logistic regression.
AU - Uhler, Caroline
AU - Slavkovic, Aleksandra
AU - Fienberg, Stephen
ID - 2009
IS - 1
JF - Journal of Privacy and Confidentiality
TI - Privacy-preserving data sharing for genome-wide association studies
VL - 5
ER -
TY - JOUR
AB - Many algorithms for inferring causality rely heavily on the faithfulness assumption. The main justification for imposing this assumption is that the set of unfaithful distributions has Lebesgue measure zero, since it can be seen as a collection of hypersurfaces in a hypercube. However, due to sampling error the faithfulness condition alone is not sufficient for statistical estimation, and strong-faithfulness has been proposed and assumed to achieve uniform or high-dimensional consistency. In contrast to the plain faithfulness assumption, the set of distributions that is not strong-faithful has nonzero Lebesgue measure and in fact, can be surprisingly large as we show in this paper. We study the strong-faithfulness condition from a geometric and combinatorial point of view and give upper and lower bounds on the Lebesgue measure of strong-faithful distributions for various classes of directed acyclic graphs. Our results imply fundamental limitations for the PC-algorithm and potentially also for other algorithms based on partial correlation testing in the Gaussian case.
AU - Uhler, Caroline
AU - Raskutti, Garvesh
AU - Bühlmann, Peter
AU - Yu, Bin
ID - 2010
IS - 2
JF - The Annals of Statistics
TI - Geometry of the faithfulness assumption in causal inference
VL - 41
ER -
TY - CONF
AB - There is a trade-off between performance and correctness in implementing concurrent data structures. Better performance may be achieved at the expense of relaxing correctness, by redefining the semantics of data structures. We address such a redefinition of data structure semantics and present a systematic and formal framework for obtaining new data structures by quantitatively relaxing existing ones. We view a data structure as a sequential specification S containing all "legal" sequences over an alphabet of method calls. Relaxing the data structure corresponds to defining a distance from any sequence over the alphabet to the sequential specification: the k-relaxed sequential specification contains all sequences over the alphabet within distance k from the original specification. In contrast to other existing work, our relaxations are semantic (distance in terms of data structure states). As an instantiation of our framework, we present two simple yet generic relaxation schemes, called out-of-order and stuttering relaxation, along with several ways of computing distances. We show that the out-of-order relaxation, when further instantiated to stacks, queues, and priority queues, amounts to tolerating bounded out-of-order behavior, which cannot be captured by a purely syntactic relaxation (distance in terms of sequence manipulation, e.g. edit distance). We give concurrent implementations of relaxed data structures and demonstrate that bounded relaxations provide the means for trading correctness for performance in a controlled way. The relaxations are monotonic which further highlights the trade-off: increasing k increases the number of permitted sequences, which as we demonstrate can lead to better performance. Finally, since a relaxed stack or queue also implements a pool, we actually have new concurrent pool implementations that outperform the state-of-the-art ones.
AU - Henzinger, Thomas A
AU - Kirsch, Christoph
AU - Payer, Hannes
AU - Sezgin, Ali
AU - Sokolova, Ana
ID - 2181
SN - 978-1-4503-1832-7
T2 - Proceedings of the 40th annual ACM SIGPLAN-SIGACT symposium on Principles of programming language
TI - Quantitative relaxation of concurrent data structures
ER -
TY - CONF
AB - We propose a general framework for abstraction with respect to quantitative properties, such as worst-case execution time, or power consumption. Our framework provides a systematic way for counter-example guided abstraction refinement for quantitative properties. The salient aspect of the framework is that it allows anytime verification, that is, verification algorithms that can be stopped at any time (for example, due to exhaustion of memory), and report approximations that improve monotonically when the algorithms are given more time. We instantiate the framework with a number of quantitative abstractions and refinement schemes, which differ in terms of how much quantitative information they keep from the original system. We introduce both state-based and trace-based quantitative abstractions, and we describe conditions that define classes of quantitative properties for which the abstractions provide over-approximations. We give algorithms for evaluating the quantitative properties on the abstract systems. We present algorithms for counter-example based refinements for quantitative properties for both state-based and segment-based abstractions. We perform a case study on worst-case execution time of executables to evaluate the anytime verification aspect and the quantitative abstractions we proposed.
AU - Cerny, Pavol
AU - Henzinger, Thomas A
AU - Radhakrishna, Arjun
ID - 2182
T2 - Proceedings of the 40th annual ACM SIGPLAN-SIGACT symposium on Principles of programming language
TI - Quantitative abstraction refinement
ER -
TY - CONF
AB - A straight skeleton is a well-known geometric structure, and several algorithms exist to construct the straight skeleton for a given polygon or planar straight-line graph. In this paper, we ask the reverse question: Given the straight skeleton (in form of a planar straight-line graph, with some rays to infinity), can we reconstruct a planar straight-line graph for which this was the straight skeleton? We show how to reduce this problem to the problem of finding a line that intersects a set of convex polygons. We can find these convex polygons and all such lines in $O(nlog n)$ time in the Real RAM computer model, where $n$ denotes the number of edges of the input graph. We also explain how our approach can be used for recognizing Voronoi diagrams of points, thereby completing a partial solution provided by Ash and Bolker in 1985.
AU - Biedl, Therese
AU - Held, Martin
AU - Huber, Stefan
ID - 2209
TI - Recognizing straight skeletons and Voronoi diagrams and reconstructing their input
ER -