Genome Res. 2003 Jun;13(6B):1301-6.
doi: 10.1101/gr.1011603.

Identification of putative noncoding RNAs among the RIKEN mouse full-length cDNA collection


Koji Numata et al. Genome Res. 2003 Jun.

Abstract

With the sequencing and annotation of genomes and transcriptomes of several eukaryotes, the importance of noncoding RNA (ncRNA), that is, RNA molecules that are not translated into protein products, has become more evident. A subclass of ncRNA transcripts is encoded by highly regulated, multi-exon transcriptional units, is processed like typical protein-coding mRNAs, and is increasingly implicated in the regulation of many cellular functions in eukaryotes. This study describes the identification of candidate functional ncRNAs in the RIKEN mouse full-length cDNA collection, which contains 60,770 sequences, using a systematic computational filtering approach. We initially searched for previously reported ncRNAs and found nine murine ncRNAs and homologs of several previously described nonmouse ncRNAs. Using our computational approach to select artifact-free clones that lack protein-coding potential, we extracted 4280 transcripts as the largest-candidate set. Many clones in the set had EST hits, potential CpG islands surrounding the transcription start sites, and homologies with the human genome. This implies that many candidates are indeed transcribed in a regulated manner. Our results demonstrate that ncRNAs are a major functional subclass of processed transcripts in mammals.


Figures

Figure 1
Average CpG observed/expected (O/E) ratio around putative transcription start sites. The average CpG O/E ratio around each transcription start site (from 3 kb upstream to 0.5 kb downstream) was plotted for the 4280 transcripts of the largest-candidate set. The set contains 919 sequences that have potential CpG islands surrounding the transcription start site, as listed in Table 3. Putative transcription start sites (TSSs) were defined by the 5′ boundaries of mapped genomic regions, as indicated by an arrow. The CpG O/E ratio was calculated in 200-bp windows sliding in 20-bp steps. The formula for the CpG O/E ratio is described in Methods.
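The windowed calculation in this caption can be sketched as follows. The paper's exact formula is given in its Methods and is not reproduced here, so this sketch assumes the commonly used definition O/E = (number of CpG × window length) / (number of C × number of G). The function names `cpg_oe` and `sliding_cpg_oe` are hypothetical; the 200-bp window and 20-bp step come from the caption.

```python
def cpg_oe(window: str) -> float:
    """CpG observed/expected ratio for one window.

    Assumed formula (not taken from the paper's Methods):
    O/E = (#CpG * N) / (#C * #G), where N is the window length.
    """
    s = window.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if c == 0 or g == 0:
        return 0.0  # no CpG is possible; avoid division by zero
    return cpg * n / (c * g)


def sliding_cpg_oe(seq: str, window: int = 200, step: int = 20):
    """Return (start, O/E) pairs for windows of `window` bp moved by `step` bp."""
    return [
        (start, cpg_oe(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

Averaging the per-window values at each position across all sequences, aligned on their putative TSS, would give a profile of the kind plotted in the figure.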
Figure 2
Frequencies of polyadenylation signal-like sequences upstream of the 3′ end. The sequence pattern AATAAA/ATTAAA was searched for at every position upstream of the 3′ end in the 4280 transcripts of the largest-candidate set, and the frequency of matches at each position was plotted. The set contains 1395 sequences that have a polyadenylation signal-like sequence near the 3′ end, as listed in Table 3.
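The pattern search in this caption can be illustrated with a short sketch. It is not the authors' implementation: the function names and the `search_len` parameter (how far upstream of the 3′ end to look) are hypothetical, and the default of 50 bases is an assumption chosen only for illustration. Only the AATAAA/ATTAAA pattern comes from the caption.

```python
import re
from collections import Counter

# Lookahead so that every start position is tested, including overlapping hits.
POLYA_SIGNAL = re.compile(r"(?=(AATAAA|ATTAAA))")


def polya_signal_offsets(seq: str, search_len: int = 50):
    """Distances (in bases) from each signal's first base to the 3' end.

    `search_len` is an assumed cutoff, not a value from the paper.
    """
    tail = seq.upper()[-search_len:]
    return [len(tail) - m.start() for m in POLYA_SIGNAL.finditer(tail)]


def offset_frequencies(seqs, search_len: int = 50) -> Counter:
    """Count how often a signal starts at each distance from the 3' end."""
    counts = Counter()
    for seq in seqs:
        counts.update(polya_signal_offsets(seq, search_len))
    return counts
```

Plotting the resulting counts against distance from the 3′ end would give a frequency profile comparable in form to the figure.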


