From: Boosting with stumps for predicting transcription start sites
Classifier type | | Features |
---|---|---|
CpG | P versus U | Log-likelihood ratios from third order Markov chain, log-likelihood ratios from TSS weight matrix |
 |  | GC-box score, weighted score of transcription factor NFY, weighted energy score at position +1 |
 |  | Weighted score of transcription factor YY1, TATA score, weighted score of transcription factor ELK1 |
 |  | MTE score, weighted score of transcription factor CREB |
 | P versus D | Log-likelihood ratios from third order Markov chain, GC-box score |
 |  | Weighted score of transcription factor NFY |
 |  | Log-likelihood ratios from TSS weight matrix |
 |  | Difference between the energy score around positions -25 and +1 and the average from surroundings |
 |  | Log-likelihood ratios from transcription factor ELK1, frequency of G+C |
 |  | Log-likelihood ratios from transcription factor YY1, TATA score, frequency of G |
Non-CpG | P versus U | Correlation between vector of energy scores and empirical average energy profile |
 |  | Log-likelihood ratios from third order Markov chain, TATA score |
 |  | Difference between the energy score around positions -25 and +1 and the average from surroundings |
 |  | Weighted energy at position +1 |
 |  | Proportion of Inr and GC-box pair within 10 bp of observed distance, Inr score
 | P versus D | Correlation between vector of energy scores and empirical average energy profile, TATA score |
 |  | Log-likelihood ratios from third order Markov chain |
 |  | Weighted energy at position +1 |
 |  | Correlation between vector of flexibility scores and empirical average flexibility profile, Inr score |
 |  | Difference between the flexibility score around position +1 and the average from surroundings, GC-box score |
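
Several rows above list "log-likelihood ratios from third order Markov chain" as a feature. As a rough illustration of how such a score could be computed, the sketch below trains one third-order chain on a positive class and one on a background class, then sums the per-base log ratios. The function names (`train_markov`, `log_likelihood_ratio`), the add-one pseudocount, and the toy training sequences are assumptions made for this example, not details taken from the paper.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"
ORDER = 3  # third-order chain: each base is conditioned on the previous three


def train_markov(seqs, order=ORDER, pseudocount=1.0):
    """Estimate P(base | previous `order` bases) from training sequences.

    The pseudocount is an assumption of this sketch; it keeps unseen
    contexts from producing zero probabilities.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in seqs:
        for i in range(order, len(seq)):
            counts[seq[i - order:i]][seq[i]] += 1.0

    def prob(context, base):
        row = counts.get(context, {})
        total = sum(row.values()) + pseudocount * len(ALPHABET)
        return (row.get(base, 0.0) + pseudocount) / total

    return prob


def log_likelihood_ratio(seq, prob_pos, prob_neg, order=ORDER):
    """Sum of log[P_pos(base | context) / P_neg(base | context)] over seq."""
    return sum(
        math.log(prob_pos(seq[i - order:i], seq[i])
                 / prob_neg(seq[i - order:i], seq[i]))
        for i in range(order, len(seq))
    )


# Toy data only: a CG-rich "positive" class and an AT-rich "background" class.
prob_pos = train_markov(["CGCGCGCGCG"])
prob_neg = train_markov(["ATATATATAT"])
score_cg = log_likelihood_ratio("CGCGCG", prob_pos, prob_neg)
score_at = log_likelihood_ratio("ATATAT", prob_pos, prob_neg)
```

Under these assumptions a positive score means the sequence looks more like the positive training class than the background, and a single threshold on such a score is the kind of test a decision stump would apply.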