TY - JOUR AB - Predicting function from sequence is a central problem of biology. Currently, this is possible only locally in a narrow mutational neighborhood around a wildtype sequence rather than globally from any sequence. Using random mutant libraries, we developed a biophysical model that accounts for multiple features of σ70 binding bacterial promoters to predict constitutive gene expression levels from any sequence. We experimentally and theoretically estimated that 10–20% of random sequences lead to expression and ~80% of non-expressing sequences are one mutation away from a functional promoter. The potential for generating expression from random sequences is so pervasive that selection acts against σ70-RNA polymerase binding sites even within inter-genic, promoter-containing regions. This pervasiveness of σ70-binding sites implies that emergence of promoters is not the limiting step in gene regulatory evolution. Ultimately, the inclusion of novel features of promoter function into a mechanistic model enabled not only more accurate predictions of gene expression levels, but also identified that promoters evolve more rapidly than previously thought. AU - Lagator, Mato AU - Sarikas, Srdjan AU - Steinrueck, Magdalena AU - Toledo-Aparicio, David AU - Bollback, Jonathan P AU - Guet, Calin C AU - Tkačik, Gašper ID - 10736 JF - eLife TI - Predicting bacterial promoter function and evolution from random sequences VL - 11 ER - TY - JOUR AB - Organisms cope with change by taking advantage of transcriptional regulators. However, when faced with rare environments, the evolution of transcriptional regulators and their promoters may be too slow. Here, we investigate whether the intrinsic instability of gene duplication and amplification provides a generic alternative to canonical gene regulation. 
Using real-time monitoring of gene-copy-number mutations in Escherichia coli, we show that gene duplications and amplifications enable adaptation to fluctuating environments by rapidly generating copy-number and, therefore, expression-level polymorphisms. This amplification-mediated gene expression tuning (AMGET) occurs on timescales that are similar to canonical gene regulation and can respond to rapid environmental changes. Mathematical modelling shows that amplifications also tune gene expression in stochastic environments in which transcription-factor-based schemes are hard to evolve or maintain. The fleeting nature of gene amplifications gives rise to a generic population-level mechanism that relies on genetic heterogeneity to rapidly tune the expression of any gene, without leaving any genomic signature. AU - Tomanek, Isabella AU - Grah, Rok AU - Lagator, M. AU - Andersson, A. M. C. AU - Bollback, Jonathan P AU - Tkačik, Gašper AU - Guet, Calin C ID - 7652 IS - 4 JF - Nature Ecology & Evolution SN - 2397-334X TI - Gene amplification as a form of population-level gene expression regulation VL - 4 ER - TY - JOUR AB - Herd immunity, a process in which resistant individuals limit the spread of a pathogen among susceptible hosts has been extensively studied in eukaryotes. Even though bacteria have evolved multiple immune systems against their phage pathogens, herd immunity in bacteria remains unexplored. Here we experimentally demonstrate that herd immunity arises during phage epidemics in structured and unstructured Escherichia coli populations consisting of differing frequencies of susceptible and resistant cells harboring CRISPR immunity. In addition, we develop a mathematical model that quantifies how herd immunity is affected by spatial population structure, bacterial growth rate, and phage replication rate. 
Using our model we infer a general epidemiological rule describing the relative speed of an epidemic in partially resistant spatially structured populations. Our experimental and theoretical findings indicate that herd immunity may be important in bacterial communities, allowing for stable coexistence of bacteria and their phages and the maintenance of polymorphism in bacterial immunity. AU - Payne, Pavel AU - Geyrhofer, Lukas AU - Barton, Nicholas H AU - Bollback, Jonathan P ID - 423 JF - eLife TI - CRISPR-based herd immunity can limit phage epidemics in bacterial populations VL - 7 ER - TY - GEN AB - Herd immunity, a process in which resistant individuals limit the spread of a pathogen among susceptible hosts has been extensively studied in eukaryotes. Even though bacteria have evolved multiple immune systems against their phage pathogens, herd immunity in bacteria remains unexplored. Here we experimentally demonstrate that herd immunity arises during phage epidemics in structured and unstructured Escherichia coli populations consisting of differing frequencies of susceptible and resistant cells harboring CRISPR immunity. In addition, we develop a mathematical model that quantifies how herd immunity is affected by spatial population structure, bacterial growth rate, and phage replication rate. Using our model we infer a general epidemiological rule describing the relative speed of an epidemic in partially resistant spatially structured populations. Our experimental and theoretical findings indicate that herd immunity may be important in bacterial communities, allowing for stable coexistence of bacteria and their phages and the maintenance of polymorphism in bacterial immunity. 
AU - Payne, Pavel AU - Geyrhofer, Lukas AU - Barton, Nicholas H AU - Bollback, Jonathan P ID - 9840 TI - Data from: CRISPR-based herd immunity limits phage epidemics in bacterial populations ER - TY - JOUR AB - Gene regulatory networks evolve through rewiring of individual components—that is, through changes in regulatory connections. However, the mechanistic basis of regulatory rewiring is poorly understood. Using a canonical gene regulatory system, we quantify the properties of transcription factors that determine the evolutionary potential for rewiring of regulatory connections: robustness, tunability and evolvability. In vivo repression measurements of two repressors at mutated operator sites reveal their contrasting evolutionary potential: while robustness and evolvability were positively correlated, both were in trade-off with tunability. Epistatic interactions between adjacent operators alleviated this trade-off. A thermodynamic model explains how the differences in robustness, tunability and evolvability arise from biophysical characteristics of repressor–DNA binding. The model also uncovers that the energy matrix, which describes how mutations affect repressor–DNA binding, encodes crucial information about the evolutionary potential of a repressor. The biophysical determinants of evolutionary potential for regulatory rewiring constitute a mechanistic framework for understanding network evolution. AU - Igler, Claudia AU - Lagator, Mato AU - Tkacik, Gasper AU - Bollback, Jonathan P AU - Guet, Calin C ID - 67 IS - 10 JF - Nature Ecology and Evolution TI - Evolutionary potential of transcription factors for gene regulatory rewiring VL - 2 ER - TY - DATA AB - Mean repression values and standard error of the mean are given for all operator mutant libraries. 
AU - Igler, Claudia AU - Lagator, Mato AU - Tkacik, Gasper AU - Bollback, Jonathan P AU - Guet, Calin C ID - 5585 TI - Data for the paper Evolutionary potential of transcription factors for gene regulatory rewiring ER - TY - JOUR AB - Most phenotypes are determined by molecular systems composed of specifically interacting molecules. However, unlike for individual components, little is known about the distributions of mutational effects of molecular systems as a whole. We ask how the distribution of mutational effects of a transcriptional regulatory system differs from the distributions of its components, by first independently, and then simultaneously, mutating a transcription factor and the associated promoter it represses. We find that the system distribution exhibits increased phenotypic variation compared to individual component distributions - an effect arising from intermolecular epistasis between the transcription factor and its DNA-binding site. In large part, this epistasis can be qualitatively attributed to the structure of the transcriptional regulatory system and could therefore be a common feature in prokaryotes. Counter-intuitively, intermolecular epistasis can alleviate the constraints of individual components, thereby increasing phenotypic variation that selection could act on and facilitating adaptive evolution. AU - Lagator, Mato AU - Sarikas, Srdjan AU - Acar, Hande AU - Bollback, Jonathan P AU - Guet, Calin C ID - 570 JF - eLife SN - 2050084X TI - Regulatory network structure determines patterns of intermolecular epistasis VL - 6 ER - TY - JOUR AB - Viral capsids are structurally constrained by interactions among the amino acids (AAs) of their constituent proteins. Therefore, epistasis is expected to evolve among physically interacting sites and to influence the rates of substitution. 
To study the evolution of epistasis, we focused on the major structural protein of the ϕX174 phage family by first reconstructing the ancestral protein sequences of 18 species using a Bayesian statistical framework. The inferred ancestral reconstruction differed at eight AAs, for a total of 256 possible ancestral haplotypes. For each ancestral haplotype and the extant species, we estimated, in silico, the distribution of free energies and epistasis of the capsid structure. We found that free energy has not significantly increased but epistasis has. We decomposed epistasis up to fifth order and found that higher-order epistasis sometimes compensates pairwise interactions making the free energy seem additive. The dN/dS ratio is low, suggesting strong purifying selection, and that structure is under stabilizing selection. We synthesized phages carrying ancestral haplotypes of the coat protein gene and measured their fitness experimentally. Our findings indicate that stabilizing mutations can have higher fitness, and that fitness optima do not necessarily coincide with energy minima. AU - Fernandes Redondo, Rodrigo A AU - Vladar, Harold AU - Włodarski, Tomasz AU - Bollback, Jonathan P ID - 1077 IS - 126 JF - Journal of the Royal Society Interface SN - 17425689 TI - Evolutionary interplay between structure, energy and epistasis in the coat protein of the ϕX174 phage family VL - 14 ER - TY - JOUR AB - Understanding the relation between genotype and phenotype remains a major challenge. The difficulty of predicting individual mutation effects, and particularly the interactions between them, has prevented the development of a comprehensive theory that links genotypic changes to their phenotypic effects. 
We show that a general thermodynamic framework for gene regulation, based on a biophysical understanding of protein-DNA binding, accurately predicts the sign of epistasis in a canonical cis-regulatory element consisting of overlapping RNA polymerase and repressor binding sites. Sign and magnitude of individual mutation effects are sufficient to predict the sign of epistasis and its environmental dependence. Thus, the thermodynamic model offers the correct null prediction for epistasis between mutations across DNA-binding sites. Our results indicate that a predictive theory for the effects of cis-regulatory mutations is possible from first principles, as long as the essential molecular mechanisms and the constraints these impose on a biological system are accounted for. AU - Lagator, Mato AU - Paixao, Tiago AU - Barton, Nicholas H AU - Bollback, Jonathan P AU - Guet, Calin C ID - 954 JF - eLife SN - 2050084X TI - On the mechanistic nature of epistasis in a canonical cis-regulatory element VL - 6 ER - TY - JOUR AB - Changes in gene expression are an important mode of evolution; however, the proximate mechanism of these changes is poorly understood. In particular, little is known about the effects of mutations within cis binding sites for transcription factors, or the nature of epistatic interactions between these mutations. Here, we tested the effects of single and double mutants in two cis binding sites involved in the transcriptional regulation of the Escherichia coli araBAD operon, a component of arabinose metabolism, using a synthetic system. This system decouples transcriptional control from any posttranslational effects on fitness, allowing a precise estimate of the effect of single and double mutations, and hence epistasis, on gene expression. We found that epistatic interactions between mutations in the araBAD cis-regulatory element are common, and that the predominant form of epistasis is negative. 
The magnitude of the interactions depended on whether the mutations are located in the same or in different operator sites. Importantly, these epistatic interactions were dependent on the presence of arabinose, a native inducer of the araBAD operon in vivo, with some interactions changing in sign (e.g., from negative to positive) in its presence. This study thus reveals that mutations in even relatively simple cis-regulatory elements interact in complex ways such that selection on the level of gene expression in one environment might perturb regulation in the other environment in an unpredictable and uncorrelated manner. AU - Lagator, Mato AU - Igler, Claudia AU - Moreno, Anaisa AU - Guet, Calin C AU - Bollback, Jonathan P ID - 1427 IS - 3 JF - Molecular Biology and Evolution TI - Epistatic interactions in the arabinose cis-regulatory element VL - 33 ER - TY - GEN AB - Viral capsids are structurally constrained by interactions among the amino acids (AAs) of their constituent proteins. Therefore, epistasis is expected to evolve among physically interacting sites and to influence the rates of substitution. To study the evolution of epistasis, we focused on the major structural protein of the ϕX174 phage family by, first, reconstructing the ancestral protein sequences of 18 species using a Bayesian statistical framework. The inferred ancestral reconstruction differed at eight AAs, for a total of 256 possible ancestral haplotypes. For each ancestral haplotype and the extant species, we estimated, in silico, the distribution of free energies and epistasis of the capsid structure. We found that free energy has not significantly increased but epistasis has. We decomposed epistasis up to fifth order and found that higher-order epistasis sometimes compensates pairwise interactions making the free energy seem additive. The dN/dS ratio is low, suggesting strong purifying selection, and that structure is under stabilizing selection. 
We synthesized phages carrying ancestral haplotypes of the coat protein gene and measured their fitness experimentally. Our findings indicate that stabilizing mutations can have higher fitness, and that fitness optima do not necessarily coincide with energy minima. AU - Fernandes Redondo, Rodrigo A AU - de Vladar, Harold AU - Włodarski, Tomasz AU - Bollback, Jonathan P ID - 9864 TI - Data from evolutionary interplay between structure, energy and epistasis in the coat protein of the ϕX174 phage family ER - TY - JOUR AB - Background: CRISPR is a microbial immune system likely to be involved in host-parasite coevolution. It functions using target sequences encoded by the bacterial genome, which interfere with invading nucleic acids using a homology-dependent system. The system also requires protospacer associated motifs (PAMs), short motifs close to the target sequence that are required for interference in CRISPR types I and II. Here, we investigate whether PAMs are depleted in phage genomes due to selection pressure to escape recognition. Results: To this end, we analyzed two data sets. Phages infecting all bacterial hosts were analyzed first, followed by a detailed analysis of phages infecting the genus Streptococcus, where PAMs are best understood. We use two different measures of motif underrepresentation that control for codon bias and the frequency of submotifs. We compare phages infecting species with a particular CRISPR type to those infecting species without that type. Since only known PAMs were investigated, the analysis is restricted to CRISPR types I-C and I-E and in Streptococcus to types I-C and II. 
We found evidence for PAM depletion in Streptococcus phages infecting hosts with CRISPR type I-C, in Vibrio phages infecting hosts with CRISPR type I-E and in Streptococcus thermophilus phages infecting hosts with type II-A, known as CRISPR3. Conclusions: The observed motif depletion in phages with hosts having CRISPR can be attributed to selection rather than to mutational bias, as mutational bias should affect the phages of all hosts. This observation implies that the CRISPR system has been efficient in the groups discussed here. AU - Kupczok, Anne AU - Bollback, Jonathan P ID - 2042 IS - 1 JF - BMC Genomics TI - Motif depletion in bacteriophages infecting hosts with CRISPR systems VL - 15 ER - TY - JOUR AB - Background: The CRISPR/Cas system is known to act as an adaptive and heritable immune system in Eubacteria and Archaea. Immunity is encoded in an array of spacer sequences. Each spacer can provide specific immunity to invasive elements that carry the same or a similar sequence. Even in closely related strains, spacer content is very dynamic and evolves quickly. Standard models of nucleotide evolution cannot be applied to quantify its rate of change since processes other than single nucleotide changes determine its evolution. Methods: We present probabilistic models that are specific for spacer content evolution. They account for the different processes of insertion and deletion. Insertions can be constrained to occur on one end only or are allowed to occur throughout the array. One deletion event can affect one spacer or a whole fragment of adjacent spacers. Parameters of the underlying models are estimated for a pair of arrays by maximum likelihood using explicit ancestor enumeration. Results: Simulations show that parameters are well estimated on average under the models presented here. There is a bias in the rate estimation when including fragment deletions. The models also estimate times between pairs of strains. 
But with increasing time, spacer overlap goes to zero, and thus there is an upper bound on the distance that can be estimated. Spacer content similarities are displayed in a distance based phylogeny using the estimated times. We use the presented models to analyze different Yersinia pestis data sets and find that the results among them are largely congruent. The models also capture the variation in diversity of spacers among the data sets. A comparison of spacer-based phylogenies and Cas gene phylogenies shows that they resolve very different time scales for this data set. Conclusions: The simulations and data analyses show that the presented models are useful for quantifying spacer content evolution and for displaying spacer content similarities of closely related strains in a phylogeny. This allows for comparisons of different CRISPR arrays or for comparisons between CRISPR arrays and nucleotide substitution rates. AU - Kupczok, Anne AU - Bollback, Jonathan P ID - 2412 IS - 1 JF - BMC Evolutionary Biology TI - Probabilistic models for CRISPR spacer content evolution VL - 13 ER - TY - JOUR AB - Here, we describe a novel virulent bacteriophage that infects Bacillus weihenstephanensis, isolated from soil in Austria. It is the first phage to be discovered that infects this species. Here, we present the complete genome sequence of this podovirus. AU - Fernandes Redondo, Rodrigo A AU - Kupczok, Anne AU - Stift, Gertraud AU - Bollback, Jonathan P ID - 2410 IS - 3 JF - Genome Announcements TI - Complete genome sequence of the novel phage MG-B1 infecting Bacillus weihenstephanensis VL - 1 ER - TY - JOUR AB - Background: Reassortment between the RNA segments encoding haemagglutinin (HA) and neuraminidase (NA), the major antigenic influenza proteins, produces viruses with novel HA and NA subtype combinations and has preceded the emergence of pandemic strains. 
It has been suggested that productive viral infection requires a balance in the level of functional activity of HA and NA, arising from their closely interacting roles in the viral life cycle, and that this functional balance could be mediated by genetic changes in the HA and NA. Here, we investigate how the selective pressure varies for H7 avian influenza HA on different NA subtype backgrounds. Results: By extending Bayesian stochastic mutational mapping methods to calculate the ratio of the rate of non-synonymous change to the rate of synonymous change (dN/dS), we found the average dN/dS across the avian influenza H7 HA1 region to be significantly greater on an N2 NA subtype background than on an N1, N3 or N7 background. Observed differences in evolutionary rates of H7 HA on different NA subtype backgrounds could not be attributed to underlying differences between avian host species or virus pathogenicity. Examination of dN/dS values for each subtype on a site-by-site basis indicated that the elevated dN/dS on the N2 NA background was a result of increased selection, rather than a relaxation of selective constraint. Conclusions: Our results are consistent with the hypothesis that reassortment exposes influenza HA to significant changes in selective pressure through genetic interactions with NA. Such epistatic effects might be explicitly accounted for in future models of influenza evolution. AU - Ward, Melissa AU - Lycett, Samantha AU - Avila, Dorita AU - Bollback, Jonathan P AU - Leigh Brown, Andrew ID - 500 IS - 1 JF - BMC Evolutionary Biology TI - Evolutionary interactions between haemagglutinin and neuraminidase in avian influenza VL - 13 ER - TY - JOUR AB - Phenotypic biotyping has traditionally been used to differentiate bacteria occupying distinct ecological niches such as host species. 
For example, the capacity of Staphylococcus aureus from sheep to coagulate ruminant plasma, reported over 60 years ago, led to the description of small ruminant and bovine S. aureus ecovars. The great majority of small ruminant isolates are represented by a single, widespread clonal complex (CC133) of S. aureus, but its evolutionary origin and the molecular basis for its host tropism remain unknown. Here, we provide evidence that the CC133 clone evolved as the result of a human to ruminant host jump followed by adaptive genome diversification. Comparative whole-genome sequencing revealed molecular evidence for host adaptation including gene decay and diversification of proteins involved in host-pathogen interactions. Importantly, several novel mobile genetic elements encoding virulence proteins with attenuated or enhanced activity in ruminants were widely distributed in CC133 isolates, suggesting a key role in its host-specific interactions. To investigate this further, we examined the activity of a novel staphylococcal pathogenicity island (SaPIov2) found in the great majority of CC133 isolates which encodes a variant of the chromosomally encoded von Willebrand-binding protein (vWbp(Sov2)), previously demonstrated to have coagulase activity for human plasma. Remarkably, we discovered that SaPIov2 confers the ability to coagulate ruminant plasma suggesting an important role in ruminant disease pathogenesis and revealing the origin of a defining phenotype of the classical S. aureus biotyping scheme. Taken together, these data provide broad new insights into the origin and molecular basis of S. aureus ruminant host specificity. 
AU - Guinane, Caitriona M AU - Ben Zakour, Nouri L AU - Tormo-Mas, Maria A AU - Weinert, Lucy A AU - Lowder, Bethan V AU - Cartwright, Robyn A AU - Smyth, Davida S AU - Smyth, Cyril J AU - Lindsay, Jodi A AU - Gould, Katherine A AU - Witney, Adam AU - Hinds, Jason AU - Bollback, Jonathan AU - Rambaut, Andrew AU - Penades, Jose R AU - Fitzgerald, J Ross ID - 4358 JF - Genome Biology and Evolution TI - Evolutionary genomics of Staphylococcus aureus reveals insights into the origin and molecular basis of ruminant host adaptation VL - 2 ER - TY - JOUR AB - Parallel evolution is the acquisition of identical adaptive traits in independently evolving populations. Understanding whether the genetic changes underlying adaptation to a common selective environment are parallel within and between species is interesting because it sheds light on the degree of evolutionary constraints. If parallel evolution is perfect, then the implication is that forces such as functional constraints, epistasis, and pleiotropy play an important role in shaping the outcomes of adaptive evolution. In addition, population genetic theory predicts that the probability of parallel evolution will decline with an increase in the number of adaptive solutions: if a single adaptive solution exists, then parallel evolution will be observed among highly divergent species. For this reason, it is predicted that close relatives, which likely overlap more in the details of their adaptive solutions, will show more parallel evolution. 
By adapting three related bacteriophage species to a novel environment we find (1) a high rate of parallel genetic evolution at orthologous nucleotide and amino acid residues within species, (2) parallel beneficial mutations do not occur in a common order in which they fix or appear in an evolving population, (3) low rates of parallel evolution and convergent evolution between species, and (4) the probability of parallel and convergent evolution between species is strongly affected by divergence. AU - Bollback, Jonathan AU - Huelsenbeck, John P ID - 4357 IS - 1 JF - Genetics TI - Parallel genetic evolution within and between bacteriophage species of varying degrees of divergence VL - 181 ER - TY - JOUR AB - We develop a new method for estimating effective population sizes, Ne, and selection coefficients, s, from time-series data of allele frequencies sampled from a single diallelic locus. The method is based on calculating transition probabilities, using a numerical solution of the diffusion process, and assuming independent binomial sampling from this diffusion process at each time point. We apply the method in two example applications. First, we estimate selection coefficients acting on the CCR5-Δ32 mutation on the basis of published samples of contemporary and ancient human DNA. We show that the data are compatible with the assumption of s = 0, although moderate amounts of selection acting on this mutation cannot be excluded. In our second example, we estimate the selection coefficient acting on a mutation segregating in an experimental phage population. We show that the selection coefficient acting on this mutation is ~0.43. 
AU - Bollback, Jonathan AU - York, Thomas L AU - Nielsen, Rasmus ID - 3435 IS - 1 JF - Genetics TI - Estimation of 2Nes From Temporal Allele Frequency Data VL - 179 ER - TY - JOUR AB - BACKGROUND: The invention of the Genome Sequence 20 DNA Sequencing System (454 parallel sequencing platform) has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR) reactions and subsequent sequencing runs have been unable to combine template DNA from multiple individuals, as homologous sequences cannot be subsequently assigned to their original sources. METHODOLOGY: We use conventional PCR with 5'-nucleotide tagged primers to generate homologous DNA amplification products from multiple specimens, followed by sequencing through the high-throughput Genome Sequence 20 DNA Sequencing System (GS20, Roche/454 Life Sciences). Each DNA sequence is subsequently traced back to its individual source through 5' tag analysis. CONCLUSIONS: We demonstrate that this new approach enables the assignment of virtually all the generated DNA sequences to the correct source once sequencing anomalies are accounted for (misassignment rate < 0.4%). Therefore, the method enables accurate sequencing and assignment of homologous DNA sequences from multiple sources in a single high-throughput GS20 run. We observe a bias in the distribution of the differently tagged primers that is dependent on the 5' nucleotide of the tag. In particular, primers 5' labelled with a cytosine are heavily overrepresented among the final sequences, while those 5' labelled with a thymine are strongly underrepresented. A weaker bias also exists with regard to the distribution of the sequences as sorted by the second nucleotide of the dinucleotide tags. As the results are based on a single GS20 run, the general applicability of the approach requires confirmation. 
However, our experiments demonstrate that 5' primer tagging is a useful method in which the sequencing power of the GS20 can be applied to PCR-based assays of multiple homologous PCR products. The new approach will be of value to a broad range of research areas, such as those of comparative genomics, complete mitochondrial analyses, population genetics, and phylogenetics. AU - Binladen, Jonas AU - Gilbert, M Thomas AU - Bollback, Jonathan AU - Panitz, Frank AU - Bendixen, Christian AU - Nielsen, Rasmus AU - Willerslev, Eske ID - 4353 IS - 2 JF - PLoS One TI - The use of coded PCR primers enables high-throughput sequencing of multiple homolog amplification products by 454 parallel sequencing VL - 2 ER - TY - JOUR AB - We used a comparative genomics approach to identify genes that are under positive selection in six strains of Escherichia coli and Shigella flexneri, including five strains that are human pathogens. We find that positive selection targets a wide range of different functions in the E. coli genome, including cell surface proteins such as beta barrel porins, presumably because of the involvement of these genes in evolutionary arms races with other bacteria, phages, and/or the host immune system. Structural mapping of positively selected sites on trans-membrane beta barrel porins reveals that the residues under positive selection occur almost exclusively in the extracellular region of the proteins that are enriched with sites known to be targets of phages, colicins, or the host immune system. More surprisingly, we also find a number of other categories of genes that show very strong evidence for positive selection, such as the enigmatic rhs elements and transposases. Based on structural evidence, we hypothesize that the selection acting on transposases is related to the genomic conflict between transposable elements and the host genome. 
AU - Petersen, Lise AU - Bollback, Jonathan AU - Dimmic, Matt AU - Hubisz, Melissa AU - Nielsen, Rasmus ID - 4356 IS - 9 JF - Genome Research TI - Genes under positive selection in Escherichia coli VL - 17 ER -