Abstract
The comparisons of full-length cDNAs and genomic DNAs available for Arabidopsis thaliana described here indicate that some adjacent loci are transcribed into extremely long RNAs spanning two annotated genes. Once expressed, some of these transcripts are post-transcriptionally spliced within their coding and intergenic sequences to generate bicistronic transcripts containing two complete open reading frames. Others are spliced to generate monocistronic transcripts coding for fusion proteins with sequences derived from both loci. RT-PCR of several P450 transcripts in this collection indicates that these extended transcripts coexist with shorter monocistronic transcripts derived from the individual loci in each pair. The existence of these unusual transcripts highlights variations in the processes of transcription and splicing that could not have been predicted by the algorithms used for genome annotation and splice site prediction.
Keywords
bicistronic transcription units, Arabidopsis thaliana, pre-mRNA splicing, genome annotation
Date of this Version
11-18-2004
DOI
doi:10.1261/rna.7114505
Included in
Engineering Commons, Life Sciences Commons, Medicine and Health Sciences Commons, Physical Sciences and Mathematics Commons