Background: Long non-coding RNAs (lncRNAs) are increasingly implicated as gene regulators and may ultimately be more numerous than protein-coding genes in the human genome. Despite the large number of reported lncRNAs, reference annotations are likely incomplete because lncRNAs are expressed at lower levels and with tighter tissue specificity than mRNAs. An unexplored factor potentially confounding lncRNA identification is inter-individual expression variability. Here, we characterize the natural expression variability of lncRNAs in human primary granulocytes.

Results: We annotate granulocyte lncRNAs and mRNAs in RNA-seq data from 10 healthy individuals, identifying multiple lncRNAs absent from reference annotations, and use this annotation to investigate three known features of lncRNAs relative to mRNAs: higher tissue specificity, lower expression, and reduced splicing efficiency. Expression variability was examined in seven individuals sampled three times at intervals of one month or longer. We show that lncRNAs display significantly more inter-individual expression variability than mRNAs. We confirm this finding in two independent human datasets by analyzing multiple tissues from the GTEx project and lymphoblastoid cell lines from the GEUVADIS project. Using the latter dataset, we also show that including more human donors in the transcriptome annotation pipeline allows identification of an increasing number of lncRNAs but minimally affects the number of mRNA genes.

Conclusions: A comprehensive annotation of lncRNAs is known to require an approach that is sensitive to low expression levels and tight tissue specificity. Here we show that increased inter-individual expression variability is an additional general lncRNA feature to consider when creating a comprehensive annotation of human lncRNAs or proposing their use as prognostic or disease markers.
This study was partly funded by the Austrian Science Fund (FWF F43-B09, FWF W1207-B09). PMG is a recipient of a DOC Fellowship of the Austrian Academy of Sciences. We thank Ruth Klement, Tomasz Kulinski, Elisangela Valente, Elisabeth Salzer, and Roland Jäger for technical/bioinformatic assistance and advice; the CeMM IT department and José Manuel Molero for help and advice on software usage; the Biomedical Sequencing Facility (http://biomedical-sequencing.at/) for sequencing and advice; Jacques Colinge, Daniel Andergassen, and Tomasz Kulinski for discussions; and Quanah Hudson and Jörg Menche for reading and commenting on the manuscript.
Kornienko A, Dotter C, Guenzl P, Gisslinger H, Gisslinger B, Cleary C, Kralovics R, Pauler F, Barlow D. Long non-coding RNAs display higher natural expression variation than protein-coding genes in healthy humans. Genome Biology. 2016;17(1):14. doi:10.1186/s13059-016-0873-8