TY - JOUR AB - We consider the quantum mechanical many-body problem of a single impurity particle immersed in a weakly interacting Bose gas. The impurity interacts with the bosons via a two-body potential. We study the Hamiltonian of this system in the mean-field limit and rigorously show that, at low energies, the problem is well described by the Fröhlich polaron model. AU - Mysliwy, Krzysztof AU - Seiringer, Robert ID - 8705 IS - 12 JF - Annales Henri Poincare SN - 1424-0637 TI - Microscopic derivation of the Fröhlich Hamiltonian for the Bose polaron in the mean-field limit VL - 21 ER - TY - JOUR AB - Motivation: Recent technological advances have led to an increase in the production and availability of single-cell data. The ability to integrate a set of multi-technology measurements would allow the identification of biologically or clinically meaningful observations through the unification of the perspectives afforded by each technology. In most cases, however, profiling technologies consume the used cells and thus pairwise correspondences between datasets are lost. Due to the sheer size single-cell datasets can acquire, scalable algorithms that are able to universally match single-cell measurements carried out in one cell to its corresponding sibling in another technology are needed. Results: We propose Single-Cell data Integration via Matching (SCIM), a scalable approach to recover such correspondences in two or more technologies. SCIM assumes that cells share a common (low-dimensional) underlying structure and that the underlying cell distribution is approximately constant across technologies. It constructs a technology-invariant latent space using an autoencoder framework with an adversarial objective. Multi-modal datasets are integrated by pairing cells across technologies using a bipartite matching scheme that operates on the low-dimensional latent representations. 
We evaluate SCIM on a simulated cellular branching process and show that the cell-to-cell matches derived by SCIM reflect the same pseudotime on the simulated dataset. Moreover, we apply our method to two real-world scenarios, a melanoma tumor sample and a human bone marrow sample, where we pair cells from a scRNA dataset to their sibling cells in a CyTOF dataset achieving 90% and 78% cell-matching accuracy for each one of the samples, respectively. AU - Stark, Stefan G AU - Ficek, Joanna AU - Locatello, Francesco AU - Bonilla, Ximena AU - Chevrier, Stéphane AU - Singer, Franziska AU - Aebersold, Rudolf AU - Al-Quaddoomi, Faisal S AU - Albinus, Jonas AU - Alborelli, Ilaria AU - Andani, Sonali AU - Attinger, Per-Olof AU - Bacac, Marina AU - Baumhoer, Daniel AU - Beck-Schimmer, Beatrice AU - Beerenwinkel, Niko AU - Beisel, Christian AU - Bernasconi, Lara AU - Bertolini, Anne AU - Bodenmiller, Bernd AU - Bonilla, Ximena AU - Casanova, Ruben AU - Chevrier, Stéphane AU - Chicherova, Natalia AU - D'Costa, Maya AU - Danenberg, Esther AU - Davidson, Natalie AU - Drăgan, Monica-Andreea AU - Dummer, Reinhard AU - Engler, Stefanie AU - Erkens, Martin AU - Eschbach, Katja AU - Esposito, Cinzia AU - Fedier, André AU - Ferreira, Pedro AU - Ficek, Joanna AU - Frei, Anja L AU - Frey, Bruno AU - Goetze, Sandra AU - Grob, Linda AU - Gut, Gabriele AU - Günther, Detlef AU - Haberecker, Martina AU - Haeuptle, Pirmin AU - Heinzelmann-Schwarz, Viola AU - Herter, Sylvia AU - Holtackers, Rene AU - Huesser, Tamara AU - Irmisch, Anja AU - Jacob, Francis AU - Jacobs, Andrea AU - Jaeger, Tim M AU - Jahn, Katharina AU - James, Alva R AU - Jermann, Philip M AU - Kahles, André AU - Kahraman, Abdullah AU - Koelzer, Viktor H AU - Kuebler, Werner AU - Kuipers, Jack AU - Kunze, Christian P AU - Kurzeder, Christian AU - Lehmann, Kjong-Van AU - Levesque, Mitchell AU - Lugert, Sebastian AU - Maass, Gerd AU - Manz, Markus AU - Markolin, Philipp AU - Mena, Julien AU - Menzel, Ulrike AU - Metzler, Julian
M AU - Miglino, Nicola AU - Milani, Emanuela S AU - Moch, Holger AU - Muenst, Simone AU - Murri, Riccardo AU - Ng, Charlotte KY AU - Nicolet, Stefan AU - Nowak, Marta AU - Pedrioli, Patrick GA AU - Pelkmans, Lucas AU - Piscuoglio, Salvatore AU - Prummer, Michael AU - Ritter, Mathilde AU - Rommel, Christian AU - Rosano-González, María L AU - Rätsch, Gunnar AU - Santacroce, Natascha AU - Castillo, Jacobo Sarabia del AU - Schlenker, Ramona AU - Schwalie, Petra C AU - Schwan, Severin AU - Schär, Tobias AU - Senti, Gabriela AU - Singer, Franziska AU - Sivapatham, Sujana AU - Snijder, Berend AU - Sobottka, Bettina AU - Sreedharan, Vipin T AU - Stark, Stefan AU - Stekhoven, Daniel J AU - Theocharides, Alexandre PA AU - Thomas, Tinu M AU - Tolnay, Markus AU - Tosevski, Vinko AU - Toussaint, Nora C AU - Tuncel, Mustafa A AU - Tusup, Marina AU - Drogen, Audrey Van AU - Vetter, Marcus AU - Vlajnic, Tatjana AU - Weber, Sandra AU - Weber, Walter P AU - Wegmann, Rebekka AU - Weller, Michael AU - Wendt, Fabian AU - Wey, Norbert AU - Wicki, Andreas AU - Wollscheid, Bernd AU - Yu, Shuqing AU - Ziegler, Johanna AU - Zimmermann, Marc AU - Zoche, Martin AU - Zuend, Gregor AU - Rätsch, Gunnar AU - Lehmann, Kjong-Van ID - 14125 IS - Supplement_2 JF - Bioinformatics KW - Computational Mathematics KW - Computational Theory and Mathematics KW - Computer Science Applications KW - Molecular Biology KW - Biochemistry KW - Statistics and Probability TI - SCIM: Universal single-cell matching with unpaired feature sets VL - 36 ER - TY - CONF AB - The goal of the unsupervised learning of disentangled representations is to separate the independent explanatory factors of variation in the data without access to supervision. In this paper, we summarize the results of Locatello et al., 2019, and focus on their implications for practitioners. 
We discuss the theoretical result showing that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases and the practical challenges it entails. Finally, we comment on our experimental findings, highlighting the limitations of state-of-the-art approaches and directions for future research. AU - Locatello, Francesco AU - Bauer, Stefan AU - Lucic, Mario AU - Rätsch, Gunnar AU - Gelly, Sylvain AU - Schölkopf, Bernhard AU - Bachem, Olivier ID - 14186 IS - 9 SN - 9781577358350 T2 - The 34th AAAI Conference on Artificial Intelligence TI - A commentary on the unsupervised learning of disentangled representations VL - 34 ER - TY - CONF AB - Intelligent agents should be able to learn useful representations by observing changes in their environment. We model such observations as pairs of non-i.i.d. images sharing at least one of the underlying factors of variation. First, we theoretically show that only knowing how many factors have changed, but not which ones, is sufficient to learn disentangled representations. Second, we provide practical algorithms that learn disentangled representations from pairs of images without requiring annotation of groups, individual factors, or the number of factors that have changed. Third, we perform a large-scale empirical study and show that such pairs of observations are sufficient to reliably learn disentangled representations on several benchmark data sets. Finally, we evaluate our learned representations and find that they are simultaneously useful on a diverse suite of tasks, including generalization under covariate shifts, fairness, and abstract reasoning. Overall, our results demonstrate that weak supervision enables learning of useful disentangled representations in realistic scenarios. 
AU - Locatello, Francesco AU - Poole, Ben AU - Rätsch, Gunnar AU - Schölkopf, Bernhard AU - Bachem, Olivier AU - Tschannen, Michael ID - 14188 T2 - Proceedings of the 37th International Conference on Machine Learning TI - Weakly-supervised disentanglement without compromises VL - 119 ER - TY - CONF AB - We propose a novel Stochastic Frank-Wolfe (a.k.a. conditional gradient) algorithm for constrained smooth finite-sum minimization with a generalized linear prediction/structure. This class of problems includes empirical risk minimization with sparse, low-rank, or other structured constraints. The proposed method is simple to implement, does not require step-size tuning, and has a constant per-iteration cost that is independent of the dataset size. Furthermore, as a byproduct of the method we obtain a stochastic estimator of the Frank-Wolfe gap that can be used as a stopping criterion. Depending on the setting, the proposed method matches or improves on the best computational guarantees for Stochastic Frank-Wolfe algorithms. Benchmarks on several datasets highlight different regimes in which the proposed method exhibits a faster empirical convergence than related methods. Finally, we provide an implementation of all considered methods in an open-source package. AU - Négiar, Geoffrey AU - Dresdner, Gideon AU - Tsai, Alicia AU - Ghaoui, Laurent El AU - Locatello, Francesco AU - Freund, Robert M. AU - Pedregosa, Fabian ID - 14187 T2 - Proceedings of the 37th International Conference on Machine Learning TI - Stochastic Frank-Wolfe for constrained finite-sum minimization VL - 119 ER - TY - JOUR AB - The idea behind the unsupervised learning of disentangled representations is that real-world data is generated by a few explanatory factors of variation which can be recovered by unsupervised learning algorithms. In this paper, we provide a sober look at recent progress in the field and challenge some common assumptions. 
We first theoretically show that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases on both the models and the data. Then, we train over 14000 models covering most prominent methods and evaluation metrics in a reproducible large-scale experimental study on eight data sets. We observe that while the different methods successfully enforce properties “encouraged” by the corresponding losses, well-disentangled models seemingly cannot be identified without supervision. Furthermore, different evaluation metrics do not always agree on what should be considered “disentangled” and exhibit systematic differences in the estimation. Finally, increased disentanglement does not seem to necessarily lead to a decreased sample complexity of learning for downstream tasks. Our results suggest that future work on disentanglement learning should be explicit about the role of inductive biases and (implicit) supervision, investigate concrete benefits of enforcing disentanglement of the learned representations, and consider a reproducible experimental setup covering several data sets. AU - Locatello, Francesco AU - Bauer, Stefan AU - Lucic, Mario AU - Rätsch, Gunnar AU - Gelly, Sylvain AU - Schölkopf, Bernhard AU - Bachem, Olivier ID - 14195 JF - Journal of Machine Learning Research TI - A sober look at the unsupervised learning of disentangled representations and their evaluation VL - 21 ER - TY - JOUR AB - Genes differ in the frequency at which they are expressed and in the form of regulation used to control their activity. In particular, positive or negative regulation can lead to activation of a gene in response to an external signal. Previous works proposed that the form of regulation of a gene correlates with its frequency of usage: positive regulation when the gene is frequently expressed and negative regulation when infrequently expressed. 
Such network design means that, in the absence of their regulators, the genes are found in their least required activity state, hence regulatory intervention is often necessary. Due to the multitude of genes and regulators, spurious binding and unbinding events, called “crosstalk”, could occur. To determine how the form of regulation affects the global crosstalk in the network, we used a mathematical model that includes multiple regulators and multiple target genes. We found that crosstalk depends non-monotonically on the availability of regulators. Our analysis showed that excess use of regulation entailed by the formerly suggested network design caused high crosstalk levels in a large part of the parameter space. We therefore considered the opposite ‘idle’ design, where the default unregulated state of genes is their frequently required activity state. We found that the ‘idle’ design minimized the use of regulation and thus minimized crosstalk. In addition, we estimated the global crosstalk of S. cerevisiae using transcription factor binding data. We demonstrated that even partial network data could suffice to estimate its global crosstalk, suggesting its applicability to additional organisms. We found that the estimated crosstalk of S. cerevisiae is lower than that of a random network, suggesting that natural selection reduces crosstalk. In summary, our study highlights a new type of protein production cost which is typically overlooked: that of regulatory interference caused by the presence of excess regulators in the cell. It demonstrates the importance of whole-network descriptions, which could show effects missed by single-gene models. AU - Grah, Rok AU - Friedlander, Tamar ID - 7569 IS - 2 JF - PLOS Computational Biology SN - 1553-7358 TI - The relation between crosstalk and gene regulation form revisited VL - 16 ER - TY - GEN AB - In mammals, chromatin marks at imprinted genes are asymmetrically inherited to control parentally-biased gene expression.
This control is thought predominantly to involve parent-specific differentially methylated regions (DMR) in genomic DNA. However, neither parent-of-origin-specific transcription nor DMRs have been comprehensively mapped. We here address this by integrating transcriptomic and epigenomic approaches in mouse preimplantation embryos (blastocysts). Transcriptome-analysis identified 71 genes expressed with previously unknown parent-of-origin-specific expression in blastocysts (nBiX: novel blastocyst-imprinted expression). Uniparental expression of nBiX genes disappeared soon after implantation. Micro-whole-genome bisulfite sequencing (μWGBS) of individual uniparental blastocysts detected 859 DMRs. Only 18% of nBiXs were associated with a DMR, whereas 60% were associated with parentally-biased H3K27me3. This suggests a major role for Polycomb-mediated imprinting in blastocysts. Five nBiX-clusters contained at least one known imprinted gene, and five novel clusters contained exclusively nBiX-genes. These data suggest a complex program of stage-specific imprinting involving different tiers of regulation. AU - Santini, Laura AU - Halbritter, Florian AU - Titz-Teixeira, Fabian AU - Suzuki, Toru AU - Asami, Maki AU - Ramesmayer, Julia AU - Ma, Xiaoyan AU - Lackner, Andreas AU - Warr, Nick AU - Pauler, Florian AU - Hippenmeyer, Simon AU - Laue, Ernest AU - Farlik, Matthias AU - Bock, Christoph AU - Beyer, Andreas AU - Perry, Anthony C. F. AU - Leeb, Martin ID - 8813 T2 - bioRxiv TI - Novel imprints in mouse blastocysts are predominantly DNA methylation independent ER - TY - GEN AU - Grah, Rok AU - Friedlander, Tamar ID - 9777 TI - Maximizing crosstalk ER - TY - THES AB - Designing and verifying concurrent programs is a notoriously challenging, time consuming, and error prone task, even for experts. This is due to the sheer number of possible interleavings of a concurrent program, all of which have to be tracked and accounted for in a formal proof. 
Inventing an inductive invariant that captures all interleavings of a low-level implementation is theoretically possible, but practically intractable. We develop a refinement-based verification framework that provides mechanisms to simplify proof construction by decomposing the verification task into smaller subtasks. In a first line of work, we present a foundation for refinement reasoning over structured concurrent programs. We introduce layered concurrent programs as a compact notation to represent multi-layer refinement proofs. A layered concurrent program specifies a sequence of connected concurrent programs, from most concrete to most abstract, such that common parts of different programs are written exactly once. Each program in this sequence is expressed as a structured concurrent program, i.e., a program over (potentially recursive) procedures, imperative control flow, gated atomic actions, structured parallelism, and asynchronous concurrency. This is in contrast to existing refinement-based verifiers, which represent concurrent systems as flat transition relations. We present a powerful refinement proof rule that decomposes refinement checking over structured programs into modular verification conditions. Refinement checking is supported by a new form of modular, parameterized invariants, called yield invariants, and a linear permission system to enhance local reasoning. In a second line of work, we present two new reduction-based program transformations that target asynchronous programs. These transformations reduce the number of interleavings that need to be considered, thus reducing the complexity of invariants. Synchronization simplifies the verification of asynchronous programs by introducing the fiction, for proof purposes, that asynchronous operations complete synchronously. Synchronization summarizes an asynchronous computation as an immediate atomic effect.
Inductive sequentialization establishes sequential reductions that capture every behavior of the original program up to reordering of coarse-grained commutative actions. A sequential reduction of a concurrent program is easy to reason about since it corresponds to a simple execution of the program in an idealized synchronous environment, where processes act in a fixed order and at the same speed. Our approach is implemented in the CIVL verifier, which has been successfully used for the verification of several complex concurrent programs. In our methodology, the overall correctness of a program is established piecemeal by focusing on the invariant required for each refinement step separately. While the programmer does the creative work of specifying the chain of programs and the inductive invariant justifying each link in the chain, the tool automatically constructs the verification conditions underlying each refinement step. AU - Kragl, Bernhard ID - 8332 SN - 2663-337X TI - Verifying concurrent programs: Refinement, synchronization, sequentialization ER -