riboCIRC: a comprehensive database of translatable circRNAs
Genome Biology volume 22, Article number: 79 (2021)
riboCIRC is a translatome data-oriented circRNA database specifically designed for hosting, exploring, analyzing, and visualizing translatable circRNAs from multiple species. The database provides a comprehensive repository of computationally predicted ribosome-associated circRNAs; a manually curated collection of experimentally verified translated circRNAs; an evaluation of cross-species conservation of translatable circRNAs; a systematic de novo annotation of putative circRNA-encoded peptides, including sequence, structure, and function; and a genome browser to visualize the context-specific occupant footprints of circRNAs. It represents a valuable resource for the circRNA research community and is publicly available at http://www.ribocirc.com.
Circular RNAs (circRNAs) are an abundant class of covalently closed endogenous RNA molecules generated by back-splicing of pre-mRNAs. Recent advances in computational analysis and high-throughput RNA sequencing (RNA-seq) have unveiled a detailed view of circRNA biogenesis, regulatory mechanisms, and cellular functions. With the development of various computational and experimental approaches for the effective identification of circRNAs, many dedicated circRNA databases have been constructed, such as circBase and circAtlas for vertebrate circRNAs [2, 3], CSCD and TSCD for disease/tissue-specific circRNAs [4, 5], and Circ2Disease and Circ2Traits for circRNA-disease associations [6, 7]. These transcriptome data-oriented databases provide essential information about circRNAs, facilitating the current understanding of their biological importance and clinical relevance. It has become increasingly clear that circRNAs can regulate multiple biological processes via a variety of mechanisms. For instance, circRNAs can act as “sponges” or “decoys” for microRNAs or RNA-binding proteins to modulate gene expression or mRNA translation [8,9,10].
circRNAs are generally considered “non-coding” elements; however, circRNAs can in fact serve as templates for protein translation. Using ribosome profiling (Ribo-seq), which enables genome-wide investigation of in vivo translation at subcodon resolution [11, 12], a subset of circRNAs has recently been found to be associated with translating ribosomes [13, 14]. Furthermore, in vivo and in vitro translation assays have shown that circRNAs can undergo cap-independent translation and generate functional proteins. Some of these proteins have been demonstrated to play vital roles under a number of pathophysiological conditions, such as those encoded by the muscle-enriched circRNA circ-ZNF609 and the brain-ubiquitously expressed circRNA circAβ-a. In addition, several mechanisms for circRNA translation have been proposed. For instance, internal ribosome entry site (IRES)- and N6-methyladenosine (m6A)-mediated cap-independent translation initiation are potential mechanisms for circRNA translation [16, 17].
Although translation of circRNAs has attracted considerable attention and a large number of Ribo-seq datasets have been generated in the past several years, there is to date no translatome data-oriented database that aims to provide direct in vivo translation evidence for circRNAs across multiple species. To fill this gap, we analyzed 3168 publicly available Ribo-seq datasets and 1970 matched RNA-seq datasets from 314 studies covering 21 species to determine the prevalence of circRNA translation. We further built a dedicated multi-species translatable circRNA database, riboCIRC, as a comprehensive repository of computationally predicted and experimentally verified translatable circRNAs. Overall, the riboCIRC database provides an important resource for the circRNA research community and can serve as a useful starting point for further investigation of the details of circRNA function and their involvement in cellular processes and diseases.
Construction and content
Data collection and preprocessing
We collected 3168 publicly available Ribo-seq datasets and 1970 matched RNA-seq datasets of the same samples from 314 studies covering 21 species, including Arabidopsis, Caenorhabditis elegans, Caulobacter crescentus, Cryptococcus neoformans, Chinese hamster, Drosophila, Escherichia coli, Halobacterium salinarum, human, mouse, Plasmodium falciparum, Pseudomonas aeruginosa, rat, Saccharomyces cerevisiae, Salmonella enterica, Schizosaccharomyces pombe, Streptomyces coelicolor, Staphylococcus aureus, Trypanosoma brucei, Vibrio vulnificus, and zebrafish (see Additional file 1: Table S1). After downloading the raw data files from the NCBI SRA database, we applied a unified pipeline to preprocess the Ribo-seq and RNA-seq data. Briefly, the 3′-end adapters were clipped using Cutadapt (version 1.8.1); low-quality bases were trimmed using Sickle (version 1.33); and the retained reads that mapped to rRNAs or tRNAs were removed.
Detection of transcribed and ribosome-associated circRNAs
We combined three different detection tools to identify transcribed circRNAs in each RNA-seq dataset, namely, CIRCexplorer2, CIRI2, and DCC [22,23,24]. The full-length sequence of each identified circRNA was assembled by the CIRI-full pipeline or extracted from the circAtlas database [2, 3] when RNA-seq data were unavailable. Taking advantage of these full-length sequences, we generated a pseudo circRNA reference for each species by extracting the 23-base pair (bp) sequences on either side of the backsplice junction (BSJ) site of each transcribed circRNA and then concatenating the two flanking sequences. To identify ribosome-associated circRNAs (ribo-circRNAs), we first eliminated sequence reads corresponding to nonribosomal RNA-protein complexes in each Ribo-seq dataset using Rfoot (version 1.0), because ribosomes are not specifically selected during the biochemical isolation procedure of a ribosome profiling experiment. After removal of footprints from nonribosomal complexes, all the ribosome-protected footprints were mapped with TopHat2 (version 2.1.1) to the corresponding linear reference genome, and the resulting unmapped.bam files were then remapped to the pseudo circRNA reference, again using TopHat2 with default parameters except N, which was set to 0 (the default is 2). Finally, a circRNA was defined to be associated with translating ribosomes only when it met all of the following three criteria simultaneously: (1) at least two unique backsplice junction-spanning Ribo-seq reads, (2) a minimum read-junction overlap of three nucleotides (nt) on either side of the backsplice junction site, and (3) a typical range of read lengths of 25–35 nt (see Additional file 2: Fig. S1).
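As a minimal illustration of the junction-based logic described above, the following Python sketch builds a pseudo reference from a full-length circRNA sequence and applies the three criteria to aligned footprints. It is not the pipeline used for riboCIRC: the names `bsj_pseudo_reference`, `Alignment`, `supports_junction`, and `is_ribo_circrna` are hypothetical, alignments are assumed to be given as 0-based start positions on the pseudo reference, and "unique" reads are assumed here to mean distinct (sequence, start) pairs.

```python
from dataclasses import dataclass
from typing import Iterable


def bsj_pseudo_reference(circ_seq: str, flank: int = 23) -> str:
    """Join the last `flank` nt of the circRNA to its first `flank` nt.

    In the returned string the backsplice junction lies between
    index flank - 1 and index flank.
    """
    if len(circ_seq) < flank:
        raise ValueError("circRNA is shorter than the requested flank")
    return circ_seq[-flank:] + circ_seq[:flank]


@dataclass(frozen=True)
class Alignment:
    read_seq: str  # footprint sequence
    start: int     # 0-based start on the pseudo reference (assumed input format)


def supports_junction(aln: Alignment, flank: int = 23, min_overlap: int = 3,
                      min_len: int = 25, max_len: int = 35) -> bool:
    """Check criteria (2) and (3) for a single aligned footprint."""
    length = len(aln.read_seq)
    left = flank - aln.start              # nt upstream of the junction
    right = aln.start + length - flank    # nt downstream of the junction
    return (min_len <= length <= max_len
            and left >= min_overlap
            and right >= min_overlap)


def is_ribo_circrna(alignments: Iterable[Alignment],
                    min_unique_reads: int = 2) -> bool:
    """Check criterion (1): enough distinct junction-spanning footprints."""
    unique = {(a.read_seq, a.start) for a in alignments if supports_junction(a)}
    return len(unique) >= min_unique_reads
```

With a 23-nt flank the pseudo reference is 46 nt long, so a 25-nt footprint starting at position 0 reaches only 2 nt past the junction and is rejected, whereas the same footprint starting at position 1 is accepted.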
Two different strategies were used to characterize ribo-circRNAs: (1) condition-dependent detection for Ribo-seq and perfectly matched RNA-seq datasets and (2) condition-independent detection for previously reported circRNAs and Ribo-seq datasets. The former strategy was applied to the initial genome-wide characterization of transcribed circRNAs using 1970 RNA-seq datasets with subsequent examination of ribosome associations of these circRNAs using 1970 Ribo-seq datasets from the same samples. In total, 278 out of the 91,143 transcribed circRNAs were identified as ribo-circRNAs, involving four different species (Drosophila, human, mouse, and rat). The latter strategy was applied to the systematic examination of ribosome associations of the circRNAs reported in the public databases using 3168 Ribo-seq datasets. To accomplish this task, we selected nine out of the 18 examined public circRNA databases, including circAtlas, circBank, circBase, CIRCpedia, circRNADb, CSCD, exoRBase, TSCD, and Circ2Disease [2,3,4,5,6, 28,29,30,31], and obtained 1,411,865 unique circRNAs after conversion of their coordinates using LiftOver. Notably, the other public circRNA databases were excluded from this analysis due to lack of a batch download link, incomplete annotation, or an inaccessible webpage (see Additional file 3: Table S2). Among these well-documented circRNAs, a total of 1969 circRNAs were finally identified as ribo-circRNAs, involving six different species (C. elegans, Drosophila, human, mouse, rat, and zebrafish).
Cross-species conservation analysis of translatable circRNAs
To evaluate translatable circRNA conservation among different species, we first annotated the parental genes of ribo-circRNAs using the GTF files and then identified orthologous gene pairs expressing these circRNAs using a pairwise orthologous gene list downloaded from the OMA orthology database (http://omabrowser.org). After that, we extracted 50-bp fragments on either side of the ribo-circRNA BSJ site from the reference genome and concatenated both fragments to represent the ribo-circRNA BSJ sequence. Next, all ribo-circRNA BSJ sequences in one species were aligned to those of the other species using BLAT with default parameters, followed by a reciprocal best hit strategy to find the orthologous ribo-circRNAs. Finally, a pair of ribo-circRNAs was defined as conserved when their sequence alignment length was ≥ 80 and their alignment bit-score was ≥ 150.
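The reciprocal best hit step can be sketched in Python as follows. This is an illustrative sketch rather than the code used for riboCIRC: the `Hit` record and the function names are hypothetical, the best hit is assumed to be the one with the highest bit-score, and the length and bit-score thresholds are assumed to apply to both directions of the comparison.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Hit(NamedTuple):
    query: str        # BSJ sequence id in the query species
    target: str       # BSJ sequence id in the other species
    aln_len: int      # alignment length
    bit_score: float  # alignment bit-score


def best_hits(hits: Iterable[Hit]) -> Dict[str, Hit]:
    """Keep the highest-scoring hit for every query sequence."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        current = best.get(hit.query)
        if current is None or hit.bit_score > current.bit_score:
            best[hit.query] = hit
    return best


def conserved_pairs(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit],
                    min_len: int = 80,
                    min_bits: float = 150.0) -> List[Tuple[str, str]]:
    """Return reciprocal best hit pairs that pass both thresholds."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    pairs = []
    for query, hit in forward.items():
        back = reverse.get(hit.target)
        if back is None or back.target != query:
            continue  # not reciprocal
        if all(h.aln_len >= min_len and h.bit_score >= min_bits
               for h in (hit, back)):
            pairs.append((query, hit.target))
    return sorted(pairs)
```

In this sketch a circRNA whose best hit in the other species points back to a different circRNA is discarded, even if the individual alignment would pass the thresholds.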
Prediction of circRNA-derived ORFs
We predicted putative circRNA-derived ORFs (cORFs) for each ribo-circRNA using the cORF_prediction_pipeline with some modifications. Briefly, the full-length sequence of each ribo-circRNA was retrieved and repeated four times in tandem to allow for rolling circle translation. All cORFs beginning with an AUG initiation codon were identified separately for each circRNA and further filtered by requiring a minimum length of 20 amino acids (aa) and that the cORF span the backsplice junction site. Notably, cORFs lacking an in-frame stop codon were defined as INF (infinite)-cORFs, indicating that the corresponding circRNAs could be translated via a rolling circle amplification mechanism. Finally, only the longest cORF was kept for each of the three reading frames, considering that a circRNA with a long ORF would have a better chance of undergoing translation.
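The cORF search described above can be illustrated with a short Python sketch. It is a simplified stand-in for the cORF_prediction_pipeline, not a reimplementation of it: the names `CORF` and `predict_corfs` are hypothetical, the standard genetic code is assumed, start codons are searched only in the first copy of the sequence, the reading frame is taken as the start position modulo 3, and a cORF is counted as junction-spanning when its last sense codon ends beyond the first copy.

```python
from itertools import product
from typing import Dict, NamedTuple

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


class CORF(NamedTuple):
    start: int      # 0-based start of the AUG in the circRNA sequence
    peptide: str    # translated peptide, without the stop
    infinite: bool  # True if no in-frame stop codon was reached (INF-cORF)


def predict_corfs(circ_seq: str, copies: int = 4,
                  min_aa: int = 20) -> Dict[int, CORF]:
    """Return the longest junction-spanning cORF for each reading frame."""
    length = len(circ_seq)
    seq = circ_seq.upper().replace("U", "T") * copies
    best: Dict[int, CORF] = {}
    for start in range(length):  # starts in later copies are duplicates
        if seq[start:start + 3] != "ATG":
            continue
        residues = []
        infinite = True
        pos = start
        while pos + 3 <= len(seq):
            aa = CODON_TABLE.get(seq[pos:pos + 3], "X")  # X for ambiguous codons
            if aa == "*":
                infinite = False
                break
            residues.append(aa)
            pos += 3
        # pos is the end (exclusive) of the last sense codon
        if len(residues) < min_aa or pos <= length:
            continue  # too short, or does not cross the backsplice junction
        frame = start % 3
        candidate = CORF(start, "".join(residues), infinite)
        if frame not in best or len(candidate.peptide) > len(best[frame].peptide):
            best[frame] = candidate
    return best
```

Four tandem copies guarantee at least three full laps after any start codon in the first copy, so a circRNA whose length is not a multiple of three is read in all three codon phases before the sketch declares an INF-cORF.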
Annotation of IRES elements and m6A sites in circRNAs
Given that previous studies have shown the ability of IRES elements and m6A modification to drive circRNA translation [13, 16], we predicted potential IRES elements and m6A sites in circRNAs by using publicly available IRES sequences and m6A modification data. To identify IRES elements in circRNAs, we extracted experimentally validated IRES sequences from the IRESbase database and then aligned them to circRNA sequences using BLASTN (version 2.7.1+), requiring at least 80% sequence identity and an alignment-length cutoff of 30 nucleotides. To identify potential m6A sites in circRNAs, we extracted m6A modification peaks detected by three different peak calling tools (exomePeak, MeTPeak, and MACS2) from the REPIC database, followed by aligning them to circRNA sequences and requiring the presence of the m6A consensus motif “RAC” (where R is any purine) at the aligned positions.
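The two filters described above can be sketched in Python as follows. This is a hedged illustration, not the authors' code: the function names are hypothetical, the BLASTN results are assumed to be in the 12-column tabular format (`-outfmt 6`, with percent identity in column 3 and alignment length in column 4), and the motif scan simply reports the central adenosine of every “RAC” occurrence in a given sequence.

```python
import re
from typing import Iterable, List, Tuple

RAC_MOTIF = re.compile(r"[AG]AC")  # R = A or G, followed by A and C


def filter_ires_hits(lines: Iterable[str], min_identity: float = 80.0,
                     min_length: int = 30) -> List[Tuple[str, str, float, int]]:
    """Keep tabular BLASTN hits that pass the identity and length cutoffs."""
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.rstrip("\n").split("\t")
        identity, length = float(fields[2]), int(fields[3])
        if identity >= min_identity and length >= min_length:
            kept.append((fields[0], fields[1], identity, length))
    return kept


def rac_positions(seq: str) -> List[int]:
    """Return 0-based positions of the central A of each RAC motif."""
    dna = seq.upper().replace("U", "T")
    return [match.start() + 1 for match in RAC_MOTIF.finditer(dna)]
```

For example, `rac_positions("GGACUAAACC")` reports the adenosines at positions 2 and 7, which belong to the motifs GAC and AAC.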
Annotation of cORF-encoded peptides
We constructed a semi-automated bioinformatic workflow system to perform de novo annotation of all putative cORF-encoded peptides, including sequence conservation, transmembrane topology, signal cleavage site, subcellular localization, folding structure, potential function, etc. Specifically, sequence conservation of each putative cORF-encoded peptide was computed by an in-house Python script based on the phastCons score files at the University of California Santa Cruz (UCSC). The presence or absence of signal peptide cleavage sites was predicted by SignalP (version 5.0b) with default parameters. Transmembrane helical topology was predicted by TMHMM (version 1.1) and Phobius (version 1.01) with default parameters. The N-terminal presequences, such as signal peptide (SP), mitochondrial transit peptide (mTP), chloroplast transit peptide (cTP), or thylakoid luminal transit peptide (luTP), were predicted by TargetP (version 2.0) with default parameters. Subcellular localization was predicted by DeepLoc (version 1.0) with default parameters, which can differentiate between 10 different localizations: nucleus, cytoplasm, extracellular, mitochondrion, cell membrane, endoplasmic reticulum, chloroplast, Golgi apparatus, lysosome/vacuole, and peroxisome. The 3D structure was predicted by I-TASSER (version 5.1), which also provided other information, such as secondary structure, solvent accessibility, normalized B-factor, and the top 10 threading templates.
Detection of cORF-encoded peptides by mass spectrometry
We used public proteomics data to find protein evidence for putative cORF-encoded peptides. Briefly, the raw files of 26 datasets were downloaded from the PRIDE database (see Additional file 4: Table S3) and analyzed using MaxQuant software (version 184.108.40.206) against a custom-tailored database built separately for each species (database sizes: human, n = 22,113; mouse, n = 18,308; rat, n = 8159; Drosophila, n = 3629; C. elegans, n = 4156; and zebrafish, n = 3165). Each database combined all documented sequences from UniProt/Swiss-Prot with additional sequences derived from circRNA translation. Searches were based on the target-decoy strategy (Reverse) and used the standard search parameters with the following exceptions: (1) the peptide-level FDR was set to 5%, and the protein-level FDR was excluded; (2) the minimal peptide length was set to seven amino acids; and (3) a maximum of two missed cleavages were allowed. For each search, fixed modifications and variable modifications were customized according to the proteomics data being analyzed. In total, 719 cORF-encoded proteins from 669 circRNAs were evidenced by at least one unique junction-spanning peptide.
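Whether an identified peptide actually covers the backsplice junction can be checked with a few lines of Python. The sketch below is only an illustration under stated assumptions: the function name `junction_spanning` and the parameter `junction_nt_offset` (the distance in nucleotides from the first base of the start codon to the junction) are hypothetical, and the check for uniqueness against the UniProt/Swiss-Prot sequences is not included.

```python
def junction_spanning(peptide: str, corf_protein: str,
                      junction_nt_offset: int, min_len: int = 7) -> bool:
    """Return True if the peptide maps onto the cORF product across the junction.

    A peptide starting at residue index i covers nucleotides 3*i up to
    3*(i + len(peptide)) of the cORF; it spans the junction when the
    junction offset lies strictly inside that interval.
    """
    if len(peptide) < min_len:
        return False  # below the minimal peptide length used in the search
    start = corf_protein.find(peptide)
    while start != -1:
        if 3 * start < junction_nt_offset < 3 * (start + len(peptide)):
            return True
        start = corf_protein.find(peptide, start + 1)  # try the next occurrence
    return False
```

The strict inequalities mean that a peptide ending exactly at the junction, or starting exactly at it, is not counted as junction-spanning.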
Primer design and structure representation of circRNAs
Based on the sequence of each circRNA, we performed circRNA-specific primer design. Divergent primer sets spanning the backsplice junction sequence were generated using circtools. The graphical representations of circRNAs and their linear host transcripts were constructed using circView.
Collection of experimentally verified translated circRNAs
Experimentally verified translated circRNAs were manually curated from the literature. To accomplish this task, we searched the PubMed literature database using the keyword ‘(circRNA [MeSH terms]) AND (translation [MeSH terms])’ and the bioRxiv preprint server using the keyword ‘(“circRNA”+“translation”)’ and found a total of 65 relevant published or preprint references. After retrieving the full text of these references, we reviewed the studies to manually collect the circRNA entries that generated peptides and were validated by various experiments. Strict screening identified 216 translated circRNAs supported by mass spectrometry-derived detection of the corresponding peptides, tag-peptide fusion system detection, and/or antibody (against circRNA-peptide) detection, and all information was incorporated into the riboCIRC database. Additional basic information on these experimentally verified circRNAs was also collected, including the circRNA name, circBase id, genomic coordinates, strand, host gene, transcript, species/condition, circRNA-encoded peptide sequence, peptide length, experimental method, and reference information.
Utility and discussion
Exploration of translatable circRNAs
The ribo-circRNA page provides a comprehensive repository of translatable circRNAs, including computationally predicted ribosome-associated circRNAs and experimentally verified translated circRNAs. Users can click the “Ribo-circRNA” button on the navigation bar and then select one of the dropdown menu options (including “Computationally predicted ribo-circRNAs,” “Experimentally verified translated circRNAs,” and “Cross-species conserved ribo-circRNAs”) for a quick query.
Selection of “Computationally predicted ribo-circRNAs” returns the result page containing all predicted ribo-circRNAs identified using Ribo-seq data, including 1969 condition-independent and 278 condition-dependent ribo-circRNAs. Brief descriptions of these circRNAs are shown on this result page, including riboCIRC id, chromosome position, best transcript, host gene symbol, and circRNA length. A built-in search box can narrow the results down to a particular subject by entering additional search terms. Furthermore, clicking the riboCIRC id in the second column opens a separate page for each circRNA that displays detailed information on that circRNA. This includes cORF annotation (the location of the junction-spanning cORF in the genome, total number of junction-spanning footprints, unique number of junction-spanning footprints, translation conditions, involved dataset, cORF sequence, cORF-encoded peptide, and length of the cORF-encoded peptide); evidence for translation of the circRNA (IRES element, m6A site, and mass spectrometric proof); a graphical representation of the linear and circular RNA structure; and designed circRNA primer sets. Clicking the chromosome position in the third column opens a separate page for visualization of the host gene track, genomic features, and aligned junction-spanning ribosome footprints of the circRNA in JBrowse. Selection of “Experimentally verified translated circRNAs” returns the result page with all 216 experimentally verified translated circRNAs that have been validated by various experiments to generate peptides. All information collected on these circRNAs is shown, and additional relevant information on the validated circRNAs is accessible via the hyperlinks provided on the result page.
In addition, selection of “Cross-species conserved ribo-circRNAs” returns the result page with all cross-species inference of conserved translatable circRNA pairs that involve a total of 90 evolutionarily conserved ribo-circRNAs.
Comprehensive analysis of circRNA-encoded peptides
The circ-peptide page provides a systematic annotation of putative circRNA-encoded peptides, including their sequence, structure, and function. Users can click the “Circ-peptide” button on the navigation bar to quickly browse the putative circRNA-encoded peptides. A dropdown menu shows a list of the available circRNA-encoded peptide options, and users can select one of the options to retrieve additional information, including basic information on the given peptide (sequence and conservation), summary of peptide characteristics (signal cleavage site, transmembrane domain, and N-terminal presequence), and location and topology of the peptide (subcellular localization, secondary structure, and structural conformation).
Intuitive visualization of ribosome-associated circRNAs
The visualization page provides an intuitive view of ribo-circRNAs, including visualization of the host gene track, genomic features, and aligned junction-spanning ribosome footprints of the circRNAs. Users can click the “Visualization” button on the navigation bar to visualize the data on the features of ribosome-associated circRNAs in the JBrowse browser embedded in the result page. A cascading dropdown menu consists of three independent selection dropdown buttons for quick navigation to a circRNA of interest, interactive exploration of the data, and intuitive comparison of the data originating from various datasets.
Data download, statistics, user guide, and feedback
The download page provides access to the data in a convenient tabular format. Users can click the “Download” button on the navigation bar to easily access the data. A tabular list of metadata for all computationally predicted ribo-circRNAs, circRNA sequences, nucleotide sequences of all cORFs, and their corresponding protein sequences, as well as customized protein sequence databases for proteomics search, can be freely downloaded for nonprofit and academic purposes. In addition, the Ribo-circRNA page provides buttons for downloading computationally predicted ribosome-associated or experimentally verified translated circRNAs in various formats, including JSON, XML, CSV, TXT, SQL, and MS-Excel. The statistics page summarizes the data in all accessible records of the database. The tutorial page provides step-by-step instructions for users to familiarize themselves with the database. The feedback page provides a feedback form for translatable circRNAs and Ribo-seq datasets, making it easy for users to provide feedback.
It should be noted that traditional approaches based on the properties of active translation, such as the three-nucleotide periodic subcodon pattern, are not feasible for identifying actively translated circRNAs because of the difficulty in distinguishing circular from linear Ribo-seq reads. In this database, we adopt a strategy of translatable circRNA detection similar to that previously described [13, 14], in which ribosome-associated circRNAs are identified only by Ribo-seq reads spanning a head-to-tail splice junction. However, computational prediction through this strategy does not necessarily mean that the circRNA is being actively translated into a detectable micropeptide, even though it is associated with translating ribosomes. This database is a starting point for bench scientists and computational biologists to pursue translatable circRNAs. The translation details of individual circRNAs still have to be experimentally validated.
In the future, riboCIRC will be periodically updated. Increasing availability of new high-throughput Ribo-seq data will be used to characterize the putative translation of circRNAs and expand the size of computationally predicted ribo-circRNAs. We will continue to fill the database with newly reported experimentally verified translated circRNAs. In addition, we will continue to extend our collection of public proteomics data and to further enhance the identification rate of cORF-encoded peptides. These additions are anticipated to enhance the efficiency of the applications of riboCIRC in the circRNA research community.
To the best of our knowledge, riboCIRC is the first database for hosting, exploring, analyzing, and visualizing translatable circRNAs from multiple species. The database provides a comprehensive repository of computationally predicted ribo-circRNAs, together with multiple lines of evidence supporting their translation, and experimentally verified translated circRNAs. It also provides an evaluation of cross-species conserved translatable circRNAs, a systematic functional annotation of the putative circRNA-encoded peptides, a flexible visualization framework for ribosome-associated circRNAs, and a user-friendly web interface for easy data access and exploration. Thus, riboCIRC will serve as a valuable resource for bench scientists and computational biologists to explore translatable circRNAs and to drive functional investigation of circRNA translation.
Availability of data and materials
riboCIRC is available at http://www.ribocirc.com to all users without any login or registration restrictions. All public Ribo-seq, RNA-seq, and mass spectrometry datasets used during the current study are available in Additional file 1: Table S1 and Additional file 4: Table S3. All translatable circRNAs can be downloaded from the riboCIRC data download page.
1. Kristensen LS, Andersen MS, Stagsted LVW, Ebbesen KK, Hansen TB, Kjems J. The biogenesis, biology and characterization of circular RNAs. Nat Rev Genet. 2019;20:675–91.
2. Glažar P, Papavasileiou P, Rajewsky N. circBase: a database for circular RNAs. RNA. 2014;20:1666–70.
3. Wu W, Ji P, Zhao F. CircAtlas: an integrated resource of one million highly accurate circular RNAs from 1070 vertebrate transcriptomes. Genome Biol. 2020;21:101.
4. Xia S, Feng J, Chen K, Ma Y, Gong J, Cai F, et al. CSCD: a database for cancer-specific circular RNAs. Nucleic Acids Res. 2018;46:D925–9.
5. Xia S, Feng J, Lei L, Hu J, Xia L, Wang J, et al. Comprehensive characterization of tissue-specific circular RNAs in the human and mouse genomes. Brief Bioinform. 2017;18:984–92.
6. Yao D, Zhang L, Zheng M, Sun X, Lu Y, Liu P. Circ2Disease: a manually curated database of experimentally validated circRNAs in human disease. Sci Rep. 2018;8:11018.
7. Ghosal S, Das S, Sen R, Basak P, Chakrabarti J. Circ2Traits: a comprehensive database for circular RNA potentially associated with disease and traits. Front Genet. 2013;4:283.
8. Piwecka M, Glažar P, Hernandez-Miranda LR, Memczak S, Wolf SA, Rybak-Wolf A, et al. Loss of a mammalian circular RNA locus causes miRNA deregulation and affects brain function. Science. 2017;357:eaam8526.
9. Abdelmohsen K, Panda AC, Munk R, Grammatikakis I, Dudekula DB, De S, et al. Identification of HuR target circular RNAs uncovers suppression of PABPN1 translation by CircPABPN1. RNA Biol. 2017;14:361–9.
10. Legnini I, Di Timoteo G, Rossi F, Morlando M, Briganti F, Sthandier O, et al. Circ-ZNF609 is a circular RNA that can be translated and functions in myogenesis. Mol Cell. 2017;66:22–37.
11. Brar GA, Weissman JS. Ribosome profiling reveals the what, when, where and how of protein synthesis. Nat Rev Mol Cell Biol. 2015;16:651–64.
12. Ingolia NT. Ribosome footprint profiling of translation throughout the genome. Cell. 2016;165:22–33.
13. Pamudurti NR, Bartok O, Jens M, Ashwal-Fluss R, Stottmeister C, Ruhe L, et al. Translation of circRNAs. Mol Cell. 2017;66:9–21.
14. van Heesch S, Witte F, Schneider-Lunitz V, Schulz JF, Adami E, Faber AB, et al. The translational landscape of the human heart. Cell. 2019;178:242–60.
15. Mo D, Li X, Raabe CA, Rozhdestvensky TS, Skryabin BV, Brosius J. Circular RNA encoded amyloid beta peptides-a novel putative player in Alzheimer’s disease. Cells. 2020;9:2196.
16. Yang Y, Fan X, Mao M, Song X, Wu P, Zhang Y, et al. Extensive translation of circular RNAs driven by N6-methyladenosine. Cell Res. 2017;27:626–41.
17. Fan X, Yang Y, Wang Z. Pervasive translation of circular RNAs driven by short IRES-like elements. bioRxiv. 2019. https://doi.org/10.1101/473207.
18. Wang H, Yang L, Wang Y, Chen L, Li H, Xie Z. RPFdb v2.0: an updated database for genome-wide information of translated mRNA generated from ribosome profiling. Nucleic Acids Res. 2019;47:D230–4.
19. Kodama Y, Shumway M, Leinonen R, International Nucleotide Sequence Database Collaboration. The Sequence Read Archive: explosive growth of sequencing data. Nucleic Acids Res. 2012;40:D54–6.
20. Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal. 2011;17:10–2.
21. Joshi NA, Fass JN. Sickle: a sliding-window, adaptive, quality-based trimming tool for FastQ files. 2011. https://github.com/najoshi/sickle. Accessed 14 Feb 2020.
22. Zhang X-O, Dong R, Zhang Y, Zhang J-L, Luo Z, Zhang J, et al. Diverse alternative back-splicing and alternative splicing landscape of circular RNAs. Genome Res. 2016;26:1277–87.
23. Gao Y, Zhang J, Zhao F. Circular RNA identification based on multiple seed matching. Brief Bioinform. 2018;19:803–10.
24. Cheng J, Metge F, Dieterich C. Specific identification and quantification of circular RNAs from sequencing data. Bioinformatics. 2016;32:1094–6.
25. Zheng Y, Ji P, Chen S, Hou L, Zhao F. Reconstruction of full-length circular RNAs enables isoform-level quantification. Genome Med. 2019;11:2.
26. Ji Z, Song R, Huang H, Regev A, Struhl K. Transcriptome-scale RNase-footprinting of RNA-protein complexes. Nat Biotechnol. 2016;34:410–3.
27. Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 2013;14:R36.
28. Liu M, Wang Q, Shen J, Yang BB, Ding X. Circbank: a comprehensive database for circRNA with standard nomenclature. RNA Biol. 2019;16:899–905.
29. Dong R, Ma X-K, Li G-W, Yang L. CIRCpedia v2: an updated database for comprehensive circular RNA annotation and expression comparison. Genomics Proteomics Bioinformatics. 2018;16:226–33.
30. Chen X, Han P, Zhou T, Guo X, Song X, Li Y. circRNADb: a comprehensive database for human circular RNAs with protein-coding annotations. Sci Rep. 2016;6:34985.
31. Li S, Li Y, Chen B, Zhao J, Yu S, Tang Y, et al. exoRBase: a database of circRNA, lncRNA and mRNA in human blood exosomes. Nucleic Acids Res. 2018;46:D106–12.
32. Kuhn RM, Haussler D, Kent WJ. The UCSC genome browser and associated tools. Brief Bioinform. 2013;14:144–61.
33. Altenhoff AM, Glover NM, Train C-M, Kaleb K, Warwick Vesztrocy A, Dylus D, et al. The OMA orthology database in 2018: retrieving evolutionary relationships among all domains of life through richer web and programmatic interfaces. Nucleic Acids Res. 2018;46:D477–85.
34. Kent WJ. BLAT-the BLAST-like alignment tool. Genome Res. 2002;12:656–64.
35. Zhao J, Li Y, Wang C, Zhang H, Zhang H, Jiang B, et al. IRESbase: a comprehensive database of experimentally validated internal ribosome entry sites. Genomics Proteomics Bioinformatics. 2020;18:129–39.
36. Johnson M, Zaretskaya I, Raytselis Y, Merezhuk Y, McGinnis S, Madden TL. NCBI BLAST: a better web interface. Nucleic Acids Res. 2008;36:W5–9.
37. Liu S, Zhu A, He C, Chen M. REPIC: a database for exploring the N6-methyladenosine methylome. Genome Biol. 2020;21:100.
38. Pollard KS, Hubisz MJ, Rosenbloom KR, Siepel A. Detection of nonneutral substitution rates on mammalian phylogenies. Genome Res. 2010;20:110–21.
39. Almagro Armenteros JJ, Tsirigos KD, Sønderby CK, Petersen TN, Winther O, Brunak S, et al. SignalP 5.0 improves signal peptide predictions using deep neural networks. Nat Biotechnol. 2019;37:420–3.
40. Krogh A, Larsson B, Von Heijne G, Sonnhammer EL. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol. 2001;305:567–80.
41. Käll L, Krogh A, Sonnhammer ELL. Advantages of combined transmembrane topology and signal peptide prediction-the Phobius web server. Nucleic Acids Res. 2007;35:W429–32.
42. Almagro Armenteros JJ, Salvatore M, Emanuelsson O, Winther O, von Heijne G, Elofsson A, et al. Detecting sequence signals in targeting peptides using deep learning. Life Sci Alliance. 2019;2:e201900429.
43. Almagro Armenteros JJ, Sønderby CK, Sønderby SK, Nielsen H, Winther O. DeepLoc: prediction of protein subcellular localization using deep learning. Bioinformatics. 2017;33:3387–95.
44. Yang J, Zhang Y. I-TASSER server: new development for protein structure and function predictions. Nucleic Acids Res. 2015;43:W174–81.
45. Perez-Riverol Y, Csordas A, Bai J, Bernal-Llinares M, Hewapathirana S, Kundu DJ, et al. The PRIDE database and related tools and resources in 2019: improving support for quantification data. Nucleic Acids Res. 2019;47:D442–50.
46. Cox J, Mann M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol. 2008;26:1367–72.
47. Jakobi T, Uvarovskii A, Dieterich C. circtools-a one-stop software solution for circular RNA research. Bioinformatics. 2019;35:2326–8.
48. Feng J, Xiang Y, Xia S, Liu H, Wang J, Ozguc FM, et al. CircView: a visualization and exploration tool for circular RNAs. Brief Bioinform. 2019;20:745–51.
49. Buels R, Yao E, Diesh CM, Hayes RD, Munoz-Torres M, Helt G, et al. JBrowse: a dynamic web platform for genome visualization and analysis. Genome Biol. 2016;17:66.
We would like to thank our team members Congying Chen and Liang Yi for their assistance in collecting translated circRNAs. We also thank the support from the Center for Precision Medicine at Sun Yat-sen University.
The review history is available as Additional file 5.
Peer review information
Anahita Bishop was the primary editor of this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
This work was supported, in part, by the National Natural Science Foundation of China under Award Number 31871302 to Z.X. and the Joint Research Fund for Overseas Natural Science of China under Award Number 31829002 to Z.X.
The authors declare that they have no competing interests.
Additional file 1: Table S1. Summary of Ribo-seq and matched RNA-Seq datasets used in this study.
Additional file 3: Table S2. Summary of circRNAs reported in the public databases.
Additional file 4: Table S3. Summary of public proteomics datasets used in this study.
Cite this article
Li, H., Xie, M., Wang, Y. et al. riboCIRC: a comprehensive database of translatable circRNAs. Genome Biol 22, 79 (2021). https://doi.org/10.1186/s13059-021-02300-7