From: Boosting with stumps for predicting transcription start sites
| Classifier type | | Features |
|---|---|---|
| CpG | P versus U | Log-likelihood ratios from third order Markov chain, log-likelihood ratios from TSS weight matrix |
| | | GC-box score, weighted score of transcription factor NFY, weighted energy score at position +1 |
| | | Weighted score of transcription factor YY1, TATA score, weighted score of transcription factor ELK1 |
| | | MTE score, weighted score of transcription factor CREB |
| | P versus D | Log-likelihood ratios from third order Markov chain, GC-box score |
| | | Weighted score of transcription factor NFY |
| | | Log-likelihood ratios from TSS weight matrix |
| | | Difference between the energy score around positions -25 and +1 and the average from surroundings |
| | | Log-likelihood ratios from transcription factor ELK1, frequency of G+C |
| | | Log-likelihood ratios from transcription factor YY1, TATA score, frequency of G |
| Non-CpG | P versus U | Correlation between vector of energy scores and empirical average energy profile |
| | | Log-likelihood ratios from third order Markov chain, TATA score |
| | | Difference between the energy score around positions -25 and +1 and the average from surroundings |
| | | Weighted energy at position +1 |
| | | Proportion of Inr and GC-box pair within 10 bp of observed distance, Inr score |
| | P versus D | Correlation between vector of energy scores and empirical average energy profile, TATA score |
| | | Log-likelihood ratios from third order Markov chain |
| | | Weighted energy at position +1 |
| | | Correlation between vector of flexibility scores and empirical average flexibility profile, Inr score |
| | | Difference between the flexibility score around position +1 and the average from surroundings, GC-box score |