TY - JOUR AB - Pedigree and sibship reconstruction are important methods in quantifying relationships and fitness of individuals in natural populations. Current methods employ a Markov chain-based algorithm to explore plausible possible pedigrees iteratively. This provides accurate results, but is time-consuming. Here, we develop a method to infer sibship and paternity relationships from half-sibling arrays of known maternity using hierarchical clustering. Given 50 or more unlinked SNP markers and empirically derived error rates, the method performs as well as the widely used package Colony, but is faster by two orders of magnitude. Using simulations, we show that the method performs well across contrasting mating scenarios, even when samples are large. We then apply the method to open-pollinated arrays of the snapdragon Antirrhinum majus and find evidence for a high degree of multiple mating. Although we focus on diploid SNP data, the method does not depend on marker type and as such has broad applications in nonmodel systems. AU - Ellis, Thomas AU - Field, David AU - Barton, Nicholas H ID - 286 IS - 5 JF - Molecular Ecology Resources TI - Efficient inference of paternity and sibship inference given known maternity via hierarchical clustering VL - 18 ER - TY - DATA AB - Data and scripts are provided in support of the manuscript "Efficient inference of paternity and sibship inference given known maternity via hierarchical clustering", and the associated Python package FAPS, available from www.github.com/ellisztamas/faps. Simulation scripts cover: 1. Performance under different mating scenarios. 2. Comparison with Colony2. 3. Effect of changing the number of Monte Carlo draws. The final script covers the analysis of half-sib arrays from wild-pollinated seed in an Antirrhinum majus hybrid zone. AU - Ellis, Thomas ID - 5583 TI - Data and Python scripts supporting Python package FAPS ER - TY - DATA AB - File S1. 
Variant Calling Format file of the ingroup: 197 haploid sequences of D. melanogaster from Zambia (Africa) aligned to the D. melanogaster 5.57 reference genome. File S2. Variant Calling Format file of the outgroup: 1 haploid sequence of D. simulans aligned to the D. melanogaster 5.57 reference genome. File S3. Annotations of each transcript in coding regions with SNPeff: Ps (# of synonymous polymorphic sites); Pn (# of non-synonymous polymorphic sites); Ds (# of synonymous divergent sites); Dn (# of non-synonymous divergent sites); DoS; ⍺ MK. All variants were included. File S4. Annotations of each transcript in non-coding regions with SNPeff: Ps (# of synonymous polymorphic sites); Pu (# of UTR polymorphic sites); Ds (# of synonymous divergent sites); Du (# of UTR divergent sites); DoS; ⍺ MK. All variants were included. File S5. Annotations of each transcript in coding regions with SNPGenie: Ps (# of synonymous polymorphic sites); πs (synonymous diversity); Ss_p (total # of synonymous sites in the polymorphism data); Pn (# of non-synonymous polymorphic sites); πn (non-synonymous diversity); Sn_p (total # of non-synonymous sites in the polymorphism data); Ds (# of synonymous divergent sites); ks (synonymous evolutionary rate); Ss_d (total # of synonymous sites in the divergence data); Dn (# of non-synonymous divergent sites); kn (non-synonymous evolutionary rate); Sn_d (total # of non-synonymous sites in the divergence data); DoS; ⍺ MK. All variants were included. File S6. Gene expression values (RPKM summed over all transcripts) for each sample. Values were quantile-normalized across all samples. File S7. Final dataset with all covariates, ⍺ MK, ωA MK and DoS for coding sites, excluding variants below 5% frequency. File S8. Final dataset with all covariates, ⍺ MK, ωA MK and DoS for non-coding sites, excluding variants below 5% frequency. File S9. 
Final dataset with all covariates, ⍺ EWK, ωA EWK and deleterious SFS for coding sites obtained with the Eyre-Walker and Keightley method on binned data and using all variants. AU - Fraisse, Christelle ID - 5757 KW - (mal)adaptation KW - pleiotropy KW - selective constraint KW - evo-devo KW - gene expression KW - Drosophila melanogaster TI - Supplementary Files for "Pleiotropy modulates the efficacy of selection in Drosophila melanogaster" ER - TY - CONF AB - There has been renewed interest in modelling the behaviour of evolutionary algorithms by more traditional mathematical objects, such as ordinary differential equations or Markov chains. The advantage is that the analysis becomes greatly facilitated due to the existence of well established methods. However, this typically comes at the cost of disregarding information about the process. Here, we introduce the use of stochastic differential equations (SDEs) for the study of EAs. SDEs can produce simple analytical results for the dynamics of stochastic processes, unlike Markov chains which can produce rigorous but unwieldy expressions about the dynamics. On the other hand, unlike ordinary differential equations (ODEs), they do not discard information about the stochasticity of the process. We show that these are especially suitable for the analysis of fixed budget scenarios and present analogs of the additive and multiplicative drift theorems for SDEs. We exemplify the use of these methods for two model algorithms ((1+1) EA and RLS) on two canonical problems (OneMax and LeadingOnes). AU - Paixao, Tiago AU - Pérez Heredia, Jorge ID - 1112 SN - 978-145034651-1 T2 - Proceedings of the 14th ACM/SIGEVO Conference on Foundations of Genetic Algorithms TI - An application of stochastic differential equations to evolutionary algorithms ER - TY - JOUR AB - Variation in genotypes may be responsible for differences in dispersal rates, directional biases, and growth rates of individuals. 
These traits may favor certain genotypes and enhance their spatiotemporal spreading into areas occupied by the less advantageous genotypes. We study how these factors influence the speed of spreading in the case of two competing genotypes under the assumption that spatial variation of the total population is small compared to the spatial variation of the frequencies of the genotypes in the population. In that case, the dynamics of the frequency of one of the genotypes is approximately described by a generalized Fisher–Kolmogorov–Petrovskii–Piskunov (F–KPP) equation. This generalized F–KPP equation with (nonlinear) frequency-dependent diffusion and advection terms admits traveling wave solutions that characterize the invasion of the dominant genotype. Our existence results generalize the classical theory for traveling waves for the F–KPP with constant coefficients. Moreover, in the particular case of the quadratic (monostable) nonlinear growth–decay rate in the generalized F–KPP we study in detail the influence of the variance in diffusion and mean displacement rates of the two genotypes on the minimal wave propagation speed. AU - Kollár, Richard AU - Novak, Sebastian ID - 1191 IS - 3 JF - Bulletin of Mathematical Biology TI - Existence of traveling waves for the generalized F–KPP equation VL - 79 ER - TY - JOUR AB - Most phenotypes are determined by molecular systems composed of specifically interacting molecules. However, unlike for individual components, little is known about the distributions of mutational effects of molecular systems as a whole. We ask how the distribution of mutational effects of a transcriptional regulatory system differs from the distributions of its components, by first independently, and then simultaneously, mutating a transcription factor and the associated promoter it represses. 
We find that the system distribution exhibits increased phenotypic variation compared to individual component distributions - an effect arising from intermolecular epistasis between the transcription factor and its DNA-binding site. In large part, this epistasis can be qualitatively attributed to the structure of the transcriptional regulatory system and could therefore be a common feature in prokaryotes. Counter-intuitively, intermolecular epistasis can alleviate the constraints of individual components, thereby increasing phenotypic variation that selection could act on and facilitating adaptive evolution. AU - Lagator, Mato AU - Sarikas, Srdjan AU - Acar, Hande AU - Bollback, Jonathan P AU - Guet, Calin C ID - 570 JF - eLife SN - 2050084X TI - Regulatory network structure determines patterns of intermolecular epistasis VL - 6 ER - TY - JOUR AB - Small RNAs (sRNAs) regulate genes in plants and animals. Here, we show that population-wide differences in color patterns in snapdragon flowers are caused by an inverted duplication that generates sRNAs. The complexity and size of the transcripts indicate that the duplication represents an intermediate on the pathway to microRNA evolution. The sRNAs repress a pigment biosynthesis gene, creating a yellow highlight at the site of pollinator entry. The inverted duplication exhibits steep clines in allele frequency in a natural hybrid zone, showing that the allele is under selection. Thus, regulatory interactions of evolutionarily recent sRNAs can be acted upon by selection and contribute to the evolution of phenotypic diversity. 
AU - Bradley, Desmond AU - Xu, Ping AU - Mohorianu, Irina AU - Whibley, Annabel AU - Field, David AU - Tavares, Hugo AU - Couchman, Matthew AU - Copsey, Lucy AU - Carpenter, Rosemary AU - Li, Miaomiao AU - Li, Qun AU - Xue, Yongbiao AU - Dalmay, Tamas AU - Coen, Enrico ID - 611 IS - 6365 JF - Science SN - 00368075 TI - Evolution of flower color pattern through selection on regulatory small RNAs VL - 358 ER - TY - JOUR AB - Our focus here is on the infinitesimal model. In this model, one or several quantitative traits are described as the sum of a genetic and a non-genetic component, the first being distributed within families as a normal random variable centred at the average of the parental genetic components, and with a variance independent of the parental traits. Thus, the variance that segregates within families is not perturbed by selection, and can be predicted from the variance components. This does not necessarily imply that the trait distribution across the whole population should be Gaussian, and indeed selection or population structure may have a substantial effect on the overall trait distribution. One of our main aims is to identify some general conditions on the allelic effects for the infinitesimal model to be accurate. We first review the long history of the infinitesimal model in quantitative genetics. Then we formulate the model at the phenotypic level in terms of individual trait values and relationships between individuals, but including different evolutionary processes: genetic drift, recombination, selection, mutation, population structure, …. We give a range of examples of its application to evolutionary questions related to stabilising selection, assortative mating, effective population size and response to selection, habitat preference and speciation. 
We provide a mathematical justification of the model as the limit as the number M of underlying loci tends to infinity of a model with Mendelian inheritance, mutation and environmental noise, when the genetic component of the trait is purely additive. We also show how the model generalises to include epistatic effects. We prove in particular that, within each family, the genetic components of the individual trait values in the current generation are indeed normally distributed with a variance independent of ancestral traits, up to an error of order 1∕√M. Simulations suggest that in some cases the convergence may be as fast as 1∕M. AU - Barton, Nicholas H AU - Etheridge, Alison AU - Véber, Amandine ID - 626 JF - Theoretical Population Biology SN - 00405809 TI - The infinitesimal model: Definition, derivation, and implications VL - 118 ER - TY - GEN AB - This text provides additional information about the model, a derivation of the analytic results in Eq (4), and details about simulations of an additional parameter set. AU - Lukacisinova, Marta AU - Novak, Sebastian AU - Paixao, Tiago ID - 9849 TI - Modelling and simulation details ER - TY - GEN AB - In this text, we discuss how a cost of resistance and the possibility of lethal mutations impact our model. AU - Lukacisinova, Marta AU - Novak, Sebastian AU - Paixao, Tiago ID - 9850 TI - Extensions of the model ER - TY - GEN AB - Based on the intuitive derivation of the dynamics of SIM allele frequency pM in the main text, we present a heuristic prediction for the long-term SIM allele frequencies with χ > 1 stresses and compare it to numerical simulations. AU - Lukacisinova, Marta AU - Novak, Sebastian AU - Paixao, Tiago ID - 9851 TI - Heuristic prediction for multiple stresses ER - TY - GEN AB - We show how different combination strategies affect the fraction of individuals that are multi-resistant. 
AU - Lukacisinova, Marta AU - Novak, Sebastian AU - Paixao, Tiago ID - 9852 TI - Resistance frequencies for different combination strategies ER - TY - THES AB - Bacteria and their pathogens – phages – are the most abundant living entities on Earth. Throughout their coevolution, bacteria have evolved multiple immune systems to overcome the ubiquitous threat from the phages. Although the molecular details of these immune systems’ functions are relatively well understood, their epidemiological consequences for the phage-bacterial communities have been largely neglected. In this thesis we employed both experimental and theoretical methods to explore whether herd and social immunity may arise in bacterial populations. Using our experimental system consisting of Escherichia coli strains with a CRISPR based immunity to the T7 phage we show that herd immunity arises in phage-bacterial communities and that it is accentuated when the populations are spatially structured. By fitting a mathematical model, we inferred expressions for the herd immunity threshold and the velocity of spread of a phage epidemic in partially resistant bacterial populations, which both depend on the bacterial growth rate, phage burst size and phage latent period. We also investigated the potential for social immunity in Streptococcus thermophilus and its phage 2972 using a bioinformatic analysis of potentially coding short open reading frames with a signalling signature, encoded within the CRISPR associated genes. Subsequently, we tested one identified potentially signalling peptide and found that its addition to a phage-challenged culture increases the probability of survival of bacteria twofold, although the results were only marginally significant. Together, these results demonstrate that the ubiquitous arms races between bacteria and phages have further consequences at the level of the population. 
AU - Payne, Pavel ID - 6291 SN - 2663-337X TI - Bacterial herd and social immunity to phages ER - TY - GEN AB - Mathematica notebooks used to generate figures. AU - Etheridge, Alison AU - Barton, Nicholas H ID - 9842 TI - Data for: Establishment in a new habitat by polygenic adaptation ER - TY - JOUR AB - The behaviour of gene regulatory networks (GRNs) is typically analysed using simulation-based statistical testing-like methods. In this paper, we demonstrate that we can replace this approach by a formal verification-like method that gives higher assurance and scalability. We focus on Wagner’s weighted GRN model with varying weights, which is used in evolutionary biology. In the model, weight parameters represent the gene interaction strength that may change due to genetic mutations. For a property of interest, we synthesise the constraints over the parameter space that represent the set of GRNs satisfying the property. We experimentally show that our parameter synthesis procedure computes the mutational robustness of GRNs—an important problem of interest in evolutionary biology—more efficiently than the classical simulation method. We specify the property in linear temporal logic. We employ symbolic bounded model checking and SMT solving to compute the space of GRNs that satisfy the property, which amounts to synthesizing a set of linear constraints on the weights. AU - Giacobbe, Mirco AU - Guet, Calin C AU - Gupta, Ashutosh AU - Henzinger, Thomas A AU - Paixao, Tiago AU - Petrov, Tatjana ID - 1351 IS - 8 JF - Acta Informatica SN - 00015903 TI - Model checking the evolution of gene regulatory networks VL - 54 ER - TY - JOUR AB - Evolutionary algorithms (EAs) form a popular optimisation paradigm inspired by natural evolution. In recent years the field of evolutionary computation has developed a rigorous analytical theory to analyse the runtimes of EAs on many illustrative problems. Here we apply this theory to a simple model of natural evolution. 
In the Strong Selection Weak Mutation (SSWM) evolutionary regime the time between occurrences of new mutations is much longer than the time it takes for a mutated genotype to take over the population. In this situation, the population only contains copies of one genotype and evolution can be modelled as a stochastic process evolving one genotype by means of mutation and selection between the resident and the mutated genotype. The probability of accepting the mutated genotype then depends on the change in fitness. We study this process, SSWM, from an algorithmic perspective, quantifying its expected optimisation time for various parameters and investigating differences to a similar evolutionary algorithm, the well-known (1+1) EA. We show that SSWM can have a moderate advantage over the (1+1) EA at crossing fitness valleys and study an example where SSWM outperforms the (1+1) EA by taking advantage of information on the fitness gradient. AU - Paixao, Tiago AU - Pérez Heredia, Jorge AU - Sudholt, Dirk AU - Trubenova, Barbora ID - 1336 IS - 2 JF - Algorithmica SN - 01784617 TI - Towards a runtime comparison of natural and artificial evolution VL - 78 ER - TY - JOUR AB - Much of quantitative genetics is based on the ‘infinitesimal model’, under which selection has a negligible effect on the genetic variance. This is typically justified by assuming a very large number of loci with additive effects. However, it applies even when genes interact, provided that the number of loci is large enough that selection on each of them is weak relative to random drift. In the long term, directional selection will change allele frequencies, but even then, the effects of epistasis on the ultimate change in trait mean due to selection may be modest. Stabilising selection can maintain many traits close to their optima, even when the underlying alleles are weakly selected. 
However, the number of traits that can be optimised is apparently limited to ~4Ne by the ‘drift load’, and this is hard to reconcile with the apparent complexity of many organisms. Just as for the mutation load, this limit can be evaded by a particular form of negative epistasis. A more robust limit is set by the variance in reproductive success. This suggests that selection accumulates information most efficiently in the infinitesimal regime, when selection on individual alleles is weak, and comparable with random drift. A review of evidence on selection strength suggests that although most variance in fitness may be because of alleles with large Nes, substantial amounts of adaptation may be because of alleles in the infinitesimal regime, in which epistasis has modest effects. AU - Barton, Nicholas H ID - 1199 JF - Heredity TI - How does epistasis influence the response to selection? VL - 118 ER - TY - JOUR AB - Dispersal is a crucial factor in natural evolution, since it determines the habitat experienced by any population and defines the spatial scale of interactions between individuals. There is compelling evidence for systematic differences in dispersal characteristics within the same population, i.e., genotype-dependent dispersal. The consequences of genotype-dependent dispersal on other evolutionary phenomena, however, are poorly understood. In this article we investigate the effect of genotype-dependent dispersal on spatial gene frequency patterns, using a generalization of the classical diffusion model of selection and dispersal. Dispersal is characterized by the variance of dispersal (diffusion coefficient) and the mean displacement (directional advection term). We demonstrate that genotype-dependent dispersal may change the qualitative behavior of Fisher waves, which change from being “pulled” to being “pushed” wave fronts as the discrepancy in dispersal between genotypes increases. 
The speed of any wave is partitioned into components due to selection, genotype-dependent variance of dispersal, and genotype-dependent mean displacement. We apply our findings to wave fronts maintained by selection against heterozygotes. Furthermore, we identify a benefit of increased variance of dispersal, quantify its effect on the speed of the wave, and discuss the implications for the evolution of dispersal strategies. AU - Novak, Sebastian AU - Kollár, Richard ID - 1169 IS - 1 JF - Genetics SN - 00166731 TI - Spatial gene frequency waves under genotype dependent dispersal VL - 205 ER - TY - JOUR AB - Adaptation depends critically on the effects of new mutations and their dependency on the genetic background in which they occur. These two factors can be summarized by the fitness landscape. However, it would require testing all mutations in all backgrounds, making the definition and analysis of fitness landscapes mostly inaccessible. Instead of postulating a particular fitness landscape, we address this problem by considering general classes of landscapes and calculating an upper limit for the time it takes for a population to reach a fitness peak, circumventing the need to have full knowledge about the fitness landscape. We analyze populations in the weak-mutation regime and characterize the conditions that enable them to quickly reach the fitness peak as a function of the number of sites under selection. We show that for additive landscapes there is a critical selection strength enabling populations to reach high-fitness genotypes, regardless of the distribution of effects. This threshold scales with the number of sites under selection, effectively setting a limit to adaptation, and results from the inevitable increase in deleterious mutational pressure as the population adapts in a space of discrete genotypes. 
Furthermore, we show that for the class of all unimodal landscapes this condition is sufficient but not necessary for rapid adaptation, as in some highly epistatic landscapes the critical strength does not depend on the number of sites under selection; effectively removing this barrier to adaptation. AU - Heredia, Jorge AU - Trubenova, Barbora AU - Sudholt, Dirk AU - Paixao, Tiago ID - 1111 IS - 2 JF - Genetics SN - 00166731 TI - Selection limits to adaptive walks on correlated landscapes VL - 205 ER - TY - JOUR AB - Viral capsids are structurally constrained by interactions among the amino acids (AAs) of their constituent proteins. Therefore, epistasis is expected to evolve among physically interacting sites and to influence the rates of substitution. To study the evolution of epistasis, we focused on the major structural protein of the ϕX174 phage family by first reconstructing the ancestral protein sequences of 18 species using a Bayesian statistical framework. The inferred ancestral reconstruction differed at eight AAs, for a total of 256 possible ancestral haplotypes. For each ancestral haplotype and the extant species, we estimated, in silico, the distribution of free energies and epistasis of the capsid structure. We found that free energy has not significantly increased but epistasis has. We decomposed epistasis up to fifth order and found that higher-order epistasis sometimes compensates pairwise interactions making the free energy seem additive. The dN/dS ratio is low, suggesting strong purifying selection, and that structure is under stabilizing selection. We synthesized phages carrying ancestral haplotypes of the coat protein gene and measured their fitness experimentally. Our findings indicate that stabilizing mutations can have higher fitness, and that fitness optima do not necessarily coincide with energy minima. 
AU - Fernandes Redondo, Rodrigo A AU - Vladar, Harold AU - Włodarski, Tomasz AU - Bollback, Jonathan P ID - 1077 IS - 126 JF - Journal of the Royal Society Interface SN - 17425689 TI - Evolutionary interplay between structure, energy and epistasis in the coat protein of the ϕX174 phage family VL - 14 ER -