Title: Pan-tissue transcriptome analysis of long noncoding RNAs in the American beaver Castor canadensis.
Authors: Amita Kashyap; Adelaide Rhodes; Brent Kronmiller; Josie Berger; Ashley Champagne; Edward Davis; Mitchell V. Finnegan; Matthew Geniza; David Hendrix; Christiane V. Löhr; Vanessa Petro; Thomas Sharpton; Jackson Wells; Clinton Epps; Pankaj Jaiswal; Brett Tyler; Stephen Ramsey
Date: 2020 Feb 12
Volume: 21
Pages: 153
Keywords: Animals; Computational Biology; Gene Expression Profiling; Gene Expression Regulation; Genome; Molecular Sequence Annotation; Nucleic Acid Conformation; Organ Specificity; RNA, Long Noncoding; Rodentia; Transcriptome

BACKGROUND: Long noncoding RNAs (lncRNAs) have roles in gene regulation, epigenetics, and molecular scaffolding, and they are hypothesized to underlie some mammalian evolutionary adaptations. However, for many mammalian species, the absence of a genome assembly precludes the comprehensive identification of lncRNAs. The genome of the American beaver (Castor canadensis) has recently been sequenced, setting the stage for the systematic identification of beaver lncRNAs and the characterization of their expression in various tissues. The objective of this study was to discover and profile polyadenylated lncRNAs in the beaver using high-throughput short-read sequencing of RNA from 16 beaver tissues and to annotate the resulting lncRNAs based on their potential for orthology with known lncRNAs in other species.

RESULTS: Using de novo transcriptome assembly, we found 9528 potential lncRNA contigs and 187 high-confidence lncRNA contigs. Of the high-confidence lncRNA contigs, 147 have no known orthologs (and thus are putative novel lncRNAs) and 40 have mammalian orthologs. The novel lncRNAs mapped to the Oregon State University (OSU) reference beaver genome with greater than 90% sequence identity. While the novel lncRNAs were on average shorter than their annotated counterparts, they were similar to the annotated lncRNAs in terms of the relationships between contig length and minimum free energy (MFE) and between coverage and contig length. We identified beaver orthologs of known lncRNAs such as XIST, MEG3, TINCR, and NIPBL-DT. We profiled the expression of the 187 high-confidence lncRNAs across 16 beaver tissues (whole blood, brain, lung, liver, heart, stomach, intestine, skeletal muscle, kidney, spleen, ovary, placenta, castor gland, tail, toe-webbing, and tongue) and identified both tissue-specific and ubiquitous lncRNAs.

CONCLUSIONS: To our knowledge, this is the first report of the systematic identification of lncRNAs in the beaver, together with an atlas of their expression. LncRNAs, both novel and those with known orthologs, are expressed in each of the beaver tissues that we analyzed. For some beaver lncRNAs with known orthologs, the tissue-specific expression patterns were phylogenetically conserved. The lncRNA sequence data files and raw sequence files are available via the web supplement and the NCBI Sequence Read Archive, respectively.

ISSN: 1471-2164