Introns have typically been discovered in an ad
hoc fashion: they are found when a gene is characterized
for other reasons. As complete eukaryotic genome sequences
become available, better methods for predicting RNA processing
signals in raw sequence will be necessary in order to discover
genes and predict their expression. Here we present a catalog
of 228 yeast introns, arrived at through a combination
of bioinformatic and molecular analysis. Introns annotated
in the Saccharomyces Genome Database (SGD) were
evaluated, questionable introns were removed after failing
a test for splicing in vivo, and known introns absent from
the SGD annotation were added. A novel branchpoint sequence,
AAUUAAC, was identified within an annotated intron that
lacks a six-of-seven match to the highly conserved branchpoint
consensus UACUAAC. Analysis of the database corroborates
many conclusions about pre-mRNA substrate requirements
for splicing derived from experimental studies, but indicates
that splicing in yeast may not be as rigidly determined
by splice-site conservation as had previously been thought.
Using this database and a molecular technique that directly
displays the lariat intron products of spliced transcripts
(intron display), we suggest that the current set of 228
introns is still not complete, and that additional intron-containing
genes remain to be discovered in yeast. The database can be accessed at
http://www.cse.ucsc.edu/research/compbio/yeast_introns.html.
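
The six-of-seven criterion mentioned above can be made concrete with a small sketch. It counts position-by-position matches between a candidate 7-mer and the consensus UACUAAC, and scans a sequence for its best-scoring window. The function names and the sliding-window scan are illustrative assumptions, not the procedure used to build the catalog.

```python
CONSENSUS = "UACUAAC"  # branchpoint consensus quoted in the text


def normalize(seq):
    """Upper-case the sequence and write it as RNA (T -> U)."""
    return seq.upper().replace("T", "U")


def match_count(candidate, consensus=CONSENSUS):
    """Count positions at which candidate and consensus agree."""
    candidate = normalize(candidate)
    if len(candidate) != len(consensus):
        raise ValueError("candidate and consensus must have equal length")
    return sum(a == b for a, b in zip(candidate, consensus))


def best_branchpoint(intron, consensus=CONSENSUS):
    """Return (score, offset, window) for the best-matching window.

    Ties go to the earliest offset. Returns None if the sequence is
    shorter than the consensus. Hypothetical helper for illustration.
    """
    seq = normalize(intron)
    k = len(consensus)
    best = None
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        score = match_count(window, consensus)
        if best is None or score > best[0]:
            best = (score, i, window)
    return best


# The novel branchpoint matches the consensus at 5 of 7 positions,
# so it falls below a six-of-seven threshold.
print(match_count("AAUUAAC"))  # 5
```

A simple scan of this kind would miss a branchpoint such as AAUUAAC if a six-of-seven threshold were applied, which is one reason sequence-based predictions benefit from experimental confirmation.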