| Clustering proteins by motifs they contain | MotifCluster | Takes aligned or unaligned protein and nucleotide sequences and a MEME file showing motifs; allows clustering of the sequences according to the motifs they contain, and visualization of the motifs on the aligned and unaligned sequences and three-dimensional structures | This article |
| Clustering of transcription factor binding sites (in DNA) | MCAST | Takes list of transcription factor binding sites as input: uses hidden Markov models to find cis-regulatory modules in DNA | [21] |
| | Cluster-Buster | Takes list of transcription factor binding sites as input: uses Forward algorithm and expected uniform distribution to find motif co-occurrence in DNA | [22] |
| | ClusterDraw | Takes list of transcription factor binding sites as input: uses r-scan algorithm and sweep over parameter values to visualize significant clusters as peaks on the DNA sequence | [23] |
| | COMET | Calculates significance of collection of position-specific score matrices that appear in order: can apply to DNA or protein, in principle | [24] |
| | PEAKS | Calculates significance of collection of transcription factor binding sites that appear at specified distance from transcription start site or other feature in the DNA | [25] |
| | CompMoby | Aligns all pairs of motifs that appear significant in different promoters, then groups these into clusters using the CAST algorithm: DNA-specific | [26] |
| | CREME | Identifies groups of DNA motifs that co-occur significantly within a defined distance using both order-dependent and order-independent models | [27] |
| | PHYLOCLUS | Uses Bayesian method to find clusters of evolutionarily conserved DNA motifs that appear in different promoters | [28] |
| | INCLUSive | Clusters genes based on microarray analysis: feeds promoters to Gibbs sampler to find DNA motifs overrepresented in each cluster | [29] |
| Identifying kernels for SVMs* | SVM kernels | Introduces kernels based on k-word occurrences and best BLAST hit for SVM clustering: does not focus on conserved motifs | [30] |
| | WCM (word correlation matrices) | Introduces k-word kernel for SVM clustering based on correlations in appearance of pairs of k-words: does not focus on conserved motifs | [31] |
| | ODH (oligomer distance histograms) | Introduces new kernel for SVM clustering based on histograms of distances between all words in protein: does not focus on conserved motifs | [32] |
| Iterative BLAST | Shotgun | BLAST-based approach for identifying remote homologs by iterative searches: not motif-based | [3] |
| | DivergentSet | Among other features, can perform BLAST and PSI-BLAST versions of Shotgun and choose representative sequences of each group: not motif-based | [20] |
| | Cascade PSI-BLAST | Performs iterative steps of PSI-BLAST, otherwise like Shotgun: not motif-based | [33] |
| | ProClust | Performs graph-based connection of proteins based on pairwise sequence similarity: not motif-based | [34] |
| k-word clustering | CD-Hit | Clusters proteins based on shared segments of overall sequence, not by motifs already known to be significant | [35] |
| Profile-profile alignment | COMPASS | Performs profile-profile alignments for remote homology detection: assesses statistical significance of matches between the profiles overall, rather than specifically using shared motifs | [1] |
| Clustering of motifs | STAMP | Aligns motifs with one another so that relationships among motifs can be detected; performs many other tasks for promoter characterization, but specific to promoters | [36] |
| | TAMO | Performs many functions for cis-regulatory analysis: is able to cluster DNA motifs with one another | [37] |
| | SOMBRERO | Aligns and clusters DNA motifs with one another to improve transcription factor binding site searches | [38] |
| Identification of functions in labeled structures | FunClust | Takes set of three-dimensional structures with annotated functions; identifies three-dimensional motif fragments that are common to the structures with each function | [39] |
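
To make concrete what "clustering of the sequences according to the motifs they contain" can mean, the following is a minimal sketch, not the MotifCluster implementation. It assumes each sequence has already been reduced to the set of motif identifiers found in it (for example, parsed from a MEME file), and it groups sequences by single-linkage clustering on Jaccard distance between those sets. The function names, the `max_distance` threshold, and the example data are all hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_by_motifs(motifs_by_seq, max_distance=0.5):
    """Single-linkage clustering of sequences by shared motif content.

    motifs_by_seq maps a sequence name to the set of motif IDs it contains.
    Two sequences are linked when their Jaccard distance is <= max_distance
    (an illustrative threshold, not a value taken from the article).
    """
    names = sorted(motifs_by_seq)
    parent = {name: name for name in names}  # union-find forest

    def find(name):
        # Walk to the root, shortening the path along the way.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if jaccard_distance(motifs_by_seq[a], motifs_by_seq[b]) <= max_distance:
            parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(clusters.values())


# Hypothetical input: motif IDs per sequence, as might be read from a MEME file.
example = {
    "seqA": {"motif1", "motif2", "motif3"},
    "seqB": {"motif1", "motif2"},
    "seqC": {"motif7", "motif8"},
    "seqD": {"motif8"},
}
print(cluster_by_motifs(example))
```

Using sets ignores motif order and copy number; an order-aware comparison, as some of the DNA tools in the table use, would need a different distance function.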