Computational analysis of human intron PY tracts. (a) Distribution of intronic motifs adjacent to the 3' end of human introns: branchpoint sequences (BPS), G-triples (GGG) and U2AF65 binding sites (U2AF65). The BPS curve is a composite of the distributions of all pentamers matching YTRAC (Y = T or C, R = A or G). The G-triple curve is the composite for all pentamers containing GGG. The U2AF65 curve is a composite of the occurrences of the ten most abundant pentamers found in the U2AF65 SELEX sequences [27, 39] (Additional data file 1). The distributions were determined over all human introns, and each curve was normalized so that its total area is unity. The two regions used in this study are depicted below the curves: the PY tract region spanned positions -30 to -3, and the upstream PY (UPY) tract region spanned positions -80 to -30, both relative to the acceptor splice junction (SJ). (b) Distribution of U2AF65 binding site scores (S65 scores) for all human introns (filled blue) and for the U2AF65 SELEX sequences used as the training set for the binding site score (vertical solid black lines). The distributions were generated using a bin size of 0.02, and the total area under each was normalized to unity. The median (used as the cutoff between 'weak' and 'strong' binding sites) is depicted as a vertical dashed line.
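The two normalizations described in this legend can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`positional_distribution`, `unit_area_histogram`), the 80-nt window, and the choice to record a motif's position at its first base are assumptions made for illustration only. The S65 score itself is not defined here, so the histogram function simply takes precomputed scores as input.

```python
import math
import re
from collections import Counter

# Degenerate codes used in the legend: Y = T or C, R = A or G.
CODES = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "[CT]", "R": "[AG]"}


def motif_regex(motif):
    """Compile a motif such as 'YTRAC' into a regex that reports overlapping hits."""
    body = "".join(CODES[c] for c in motif)
    return re.compile("(?=(" + body + "))")  # lookahead keeps overlapping matches


def positional_distribution(intron_3p_ends, motif, window=80):
    """Fraction of motif hits at each position relative to the acceptor SJ.

    Positions are negative; -1 is the last intronic nucleotide. A hit is
    assigned to the position of its first base (an assumption). With unit
    spacing between positions, the fractions sum to 1, i.e. unit area.
    """
    pattern = motif_regex(motif)
    counts = Counter()
    for seq in intron_3p_ends:
        tail = seq.upper()[-window:]
        for match in pattern.finditer(tail):
            counts[match.start() - len(tail)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pos: n / total for pos, n in sorted(counts.items())}


def unit_area_histogram(scores, bin_size=0.02):
    """Histogram of scores as densities, so that sum(density * bin_size) == 1."""
    bins = Counter(math.floor(s / bin_size) for s in scores)
    n = len(scores)
    return {idx * bin_size: c / (n * bin_size) for idx, c in sorted(bins.items())}


if __name__ == "__main__":
    # Toy sequences (hypothetical, not real introns), written 5' to 3'.
    toy = ["AAAA" + "CTAAC" + "T" * 10 + "CAG",
           "GGG" + "CTGAC" + "C" * 10 + "AG"]
    print(positional_distribution(toy, "YTRAC"))
    print(unit_area_histogram([0.01, 0.011, 0.05]))
```

A composite curve such as the U2AF65 one could be built the same way by pooling the hit counts of several pentamers before normalizing, although the legend does not specify exactly how the composites were combined.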