From: Functional constraint and small insertions and deletions in the ENCODE regions of the human genome
Feature | Term | Definition |
---|---|---|
RNA transcription (coding and noncoding) | CDS | Coding sequence: well characterized transcribed regions with an annotated protein-coding open reading frame (ORF) |
 | RACEfrags | 5' and 3' rapid amplification of cDNA ends (RACE), using polyA or total RNA to construct full-length cDNA. This technique has revealed previously unrecognized UTRs |
 | TARs/transfrags | Transcriptionally active regions/transcribed fragments as determined by analyses of cellular RNA (polyA or total) hybridizations to multiple microarray platforms. For the analyses reported here, portions of TARs/transfrags overlapping any CDS, 5' or 3' UTR annotations were removed from the dataset |
 | Pseudo-exons | A pre-mRNA sequence that resembles an exon but is not recognized as such by the splicing machinery |
 | TSS | Transcription start site |
 | 5' UTR | Untranslated region: portions of CDS-containing transcripts before the start codon. For the analyses reported here, 5' UTRs overlapping alternatively transcribed CDS annotations were removed from the dataset |
 | TUF | Transcripts of unknown function (noncoding transcripts) |
 | 3' UTR | Untranslated region: portions of CDS-containing transcripts after the stop codon |
Transcript regulation: open chromatin/DNA-protein interaction | DHS | DNase I hypersensitive sites are short regions of DNA that are relatively easily cleaved by deoxyribonuclease. Regions of open chromatin detected by quantitative chromatin profiling and novel microarray-based methods. For the analyses reported here, regions that overlap repetitive sequence were removed. Measures of DHS are reported using two sources: the ENCODE Regulome group and the NHGRI |
 | FAIRE-sites | Formaldehyde-assisted isolation of regulatory elements: a procedure used to isolate chromatin that is resistant to the formation of protein-DNA crosslinks. Data suggest that depletion of nucleosomes (the most basic organizational unit of chromatin) at active regulatory regions, such as promoters, is the primary underlying basis for FAIRE [38] |
 | HisPolTAF | Histone modifications, RNA polymerase II (PolII), and transcription regulator TAF250 |
 | Sequence specific factors | Regions of DNA determined to be bound by sequence-specific transcription factors through chromatin immunoprecipitation followed by microarray chip hybridization (so-called 'ChIP-Chip') analyses |
 | Sequence specific (all motifs) | Computationally identified short sequence motifs found to be over-represented in the sequence specific factors dataset |
Ancestral repeats |  | Mobile elements with well defined consensus sequences that inserted into the ancestral genome prior to mammalian radiation. These sequences are considered to be predominantly non-functional and are often used as models of neutrally evolving DNA |
Cell cycle | EarlyRepSeg | Early replicating segments |
 | MidRepSeg | Mid replicating segments |
 | LateRepSeg | Late replicating segments |
Evolutionary constraint | MCS strict | Multi-species conserved sequences: strict criteria |
 | MCS moderate | Multi-species conserved sequences: moderate criteria |
 | MCS loose | Multi-species conserved sequences: loose criteria |
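
Several definitions above note that portions of one annotation overlapping another were removed (for example, TARs/transfrags overlapping any CDS or UTR annotation). The sketch below illustrates one way such an interval subtraction could be carried out. It is not the pipeline used in the study: the function names `merge` and `subtract`, the half-open coordinate convention, and the single-chromosome simplification are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumption: intervals are half-open [start, end) on a single chromosome.
Interval = Tuple[int, int]


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def subtract(features: List[Interval], masks: List[Interval]) -> List[Interval]:
    """Return the portions of `features` not covered by any interval in `masks`."""
    masks = merge(masks)
    kept: List[Interval] = []
    for start, end in merge(features):
        cursor = start  # left edge of the part of the feature not yet handled
        for m_start, m_end in masks:
            if m_end <= cursor:
                continue  # mask lies entirely to the left
            if m_start >= end:
                break  # mask (and all later masks) lie entirely to the right
            if m_start > cursor:
                kept.append((cursor, m_start))  # unmasked stretch before the mask
            cursor = max(cursor, m_end)
            if cursor >= end:
                break
        if cursor < end:
            kept.append((cursor, end))  # unmasked tail of the feature
    return kept


# Hypothetical coordinates: two transfrags, masked by a CDS and a 3' UTR.
transfrags = [(100, 200), (300, 400)]
annotated = [(150, 180), (390, 450)]
remaining = subtract(transfrags, annotated)
print(remaining)
```

Note that this sketch merges the input features before subtracting, so it reports the remaining sequence rather than preserving the identity of individual features. The DHS filter described above differs again: there, whole regions overlapping repetitive sequence were dropped rather than trimmed.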