ArchAlign: coordinate-free chromatin alignment reveals novel architectures
© Lai and Buck; licensee BioMed Central Ltd. 2010
Received: 1 November 2010
Accepted: 23 December 2010
Published: 23 December 2010
To facilitate identification and characterization of genomic functional elements, we have developed a chromatin architecture alignment algorithm (ArchAlign). ArchAlign identifies shared chromatin structural patterns from high-resolution chromatin structural datasets derived from next-generation sequencing or tiled microarray approaches for user-defined regions of interest. We validated ArchAlign using well-characterized functional elements, and used it to explore the chromatin structural architecture at CTCF binding sites in the human genome. ArchAlign is freely available at http://www.acsu.buffalo.edu/~mjbuck/ArchAlign.html.
The development of protein and DNA sequence alignment algorithms in the 1970s and 1980s revolutionized the functional characterization of unknown proteins and genes [1, 2]. Since then, sequence-based alignments have become so accepted that when a pairwise percentage identity is high enough, a gene or protein is now assigned a function without biochemical confirmation. Similar to the explosion of sequence data in the 1980s, today there is an exponential growth in chromatin structural data. The majority of chromatin data are being generated by next-generation DNA sequencing combined with chromatin immunoprecipitation (ChIP), FAIRE (formaldehyde-assisted isolation of regulatory elements), DNase I hypersensitivity, or micrococcal nuclease (MNase) digestion assays. Analysis of these high-resolution datasets has discovered shared chromatin architectures at previously defined functional elements in the genome; however, identification of new functional elements and their chromatin signatures remains limited.
Currently, the only way to characterize chromatin architecture is to have an accurately mapped functional element in the genome. Functional elements include genes for protein and non-coding RNAs, and regulatory sequences that direct essential functions such as gene expression, DNA replication, and chromosome inheritance. With an accurately mapped functional element, chromatin structural data are aligned by the genomic coordinates and an average profile is created. For example, transcription start sites (TSSs) in Saccharomyces cerevisiae have a well documented nucleosome-depleted region approximately 50 to 100 bp upstream of the TSS, flanked by a non-canonical acetylated nucleosome containing the histone variant H2A.Z. Chromatin architecture at these regions was identified because TSSs had been accurately determined through other molecular methods. In addition to TSSs, researchers have used genomic datasets to identify shared chromatin architectures at origins of replication, intron-exon junctures [7–11], and enhancers. All successful analyses have started with an accurately mapped functional element, which was used to align all regions containing that functional element. The chromatin architecture was then determined by averaging the chromatin data for aligned regions. For poorly mapped functional elements or elements having an unknown directionality, the chromatin structural profile loses definition and directionality is obscured.
Insulator elements are an example of a genomic element that has not been accurately mapped and has not been extensively characterized. Insulators function to restrict transcriptional enhancers from activating unintended promoters, by acting as a barrier between chromatin contexts [13–15] or by mediating intra- and interchromosomal contacts. While insulators are critical for gene regulation, only a few have been identified [15, 17]. A key component of insulators in vertebrates is the ubiquitously expressed CCCTC binding factor (CTCF). The genome-wide binding locations for CTCF have been determined in multiple cell lines by both ChIP-chip and ChIP-seq [18, 19] and these locations have been proposed to be insulator sites. Due to limitations in the resolution for all ChIP experiments, the exact site of CTCF binding cannot be determined. In addition, CTCF is part of a multimeric complex that in total defines the location and directionality of insulator elements. Therefore, CTCF binding can only identify insulators within 100 to 200 bp and any directionality within insulators is unknown.
Identification of shared chromatin architecture at functional sites has recently become an active area of research [20–25], but most studies focus on well-defined transcriptional promoters. While these approaches have provided extensive insight into the chromatin architecture at well-defined genomic features, there has been very limited work to identify shared chromatin architectures for unmapped, poorly mapped, or unknown genomic features. Two groups have developed unsupervised approaches to identify overrepresented chromatin states in a genome [24, 25]. Hon et al. used a variant of a standard motif finding approach with a probabilistic method and were able to uncover 16 distinct signatures and the known patterns at TSSs and enhancers. Ernst and Kellis used a multivariate hidden Markov model to identify how often different chromatin mark combinations are found with one another and used this to identify chromatin states. These two approaches are limited in that while they can identify overrepresented chromatin signatures, they cannot identify less abundant signatures or be used to identify the shared architecture at user-defined regions of interest. To address this limitation, we developed ArchAlign, an algorithm that identifies shared chromatin structural patterns for user-specified regions of interest, from high-resolution chromatin structural datasets derived from next-generation sequencing or tiled microarray approaches. ArchAlign was designed and validated with data from mononucleosomes isolated by MNase digestion, and can be used with any dataset that can be converted into high-resolution log ratios. We used ArchAlign to align the nucleosome positions at CTCF binding sites, and uncovered a novel directional chromatin architecture containing positioned H2A.Z nucleosomes with the histone tail modifications H3K4me3, H3K4me2, H3K4me1, H3K9me1, and H4K20me1.
These results define a shared structure at many CTCF sites and provide a framework for further exploration of the chromatin structure at insulator elements.
ArchAlign design and implementation
The second approach, known as seed sampling, is a more comprehensive search of the possible alignment space. Every region in the alignment is used as one-half of the optimal seed pattern for an independent alignment. Therefore, for a dataset with n regions, n independent alignments are generated as described for the single-best-pair approach (Figure 1b). To determine which of the n alignments is the best alignment for the dataset, a post-alignment quality assessment is performed by calculating the average correlation or distance of each aligned region to every other aligned region (see Materials and methods). The alignment that maximizes the similarity across all regions is then selected as the optimal alignment.
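The selection step described above can be illustrated with a short Python sketch. It scores each candidate alignment by the average pairwise similarity of its aligned windows and keeps the highest-scoring one. Pearson correlation is assumed as the similarity measure, and the function names (pearson, alignment_quality, best_alignment) are hypothetical names chosen for this illustration; they are not taken from the ArchAlign source, which is written in C++.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def alignment_quality(windows):
    """Average similarity of every aligned window to every other aligned window."""
    n = len(windows)
    total = sum(pearson(windows[i], windows[j])
                for i in range(n) for j in range(i + 1, n))
    return total / (n * (n - 1) / 2)

def best_alignment(alignments):
    """Return (index, score) of the candidate alignment with the highest quality."""
    scores = [alignment_quality(a) for a in alignments]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

Here each element of alignments is a list of equal-length aligned windows, one per region, as would be produced by one of the n seeded alignments. A distance-based measure could be substituted by selecting the minimum rather than the maximum.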
To ensure that ArchAlign can accurately align chromatin signatures located at various genomic features, we further validated ArchAlign with chromatin data for origins of replication from S. cerevisiae. Origins in S. cerevisiae have a well-characterized nucleosome-depleted region surrounded on both sides by an array of nucleosomes. We used all origins (156 of 222) that contained a complete nucleosome occupancy profile, as identified in the recent study of Berbenetz et al. The origins were then randomized using the same method as for the TSSs and aligned using the same parameters as previously stated. As shown previously with TSSs, ArchAlign using the seed sampling approach was able to produce high quality alignments with low variability regardless of the level of randomization (Figure 2d,e).
Alignment of nucleosome occupancy at CTCF binding in CD4+ cells
Identification of epigenetic architecture using the ArchAlign coordinates
ArchAlign was able to uncover the unique chromatin architecture located at CTCF sites using only nucleosome occupancy data. The identified architecture contains a nucleosome-depleted region located near CTCF binding sites with adjacent positioned nucleosomes. Overlaying the histone modification and variant data using the aligned coordinates on top of the nucleosome occupancy showed a strong preference for the presence of the H2A.Z nucleosome variant on the strongly positioned nucleosomes as well as the concurrent presence of the histone tail modifications H3K4me3, H3K4me2, H3K4me1, H3K9me1, and H4K20me1. These results suggest that CTCF is a component of direction-dependent chromatin architecture at the majority of its binding sites, which may be functionally important for its role as an insulator.
Insulator elements have been characterized by their ability to act as a barrier between differing chromatin contexts [13, 16]. We found a directional chromatin signature at CTCF sites that appears similar to a barrier between chromatin contexts. On one side there are H2A.Z nucleosomes with H3K4, H3K9, and H4K20 methylation and on the other side there is a reduced nucleosome occupancy. The discovered asymmetric architecture at CTCF appears similar to the recent association of CTCF at borders between repressive chromatin marks [13, 30], or heterochromatic lamina associated domains. The functional importance of the chromatin architecture and polarity for insulators has yet to be determined, but our alignment will act as a guide for future functional dissections.
CTCF binding location alone is not capable of providing an accurate alignment because the chromatin architecture at CTCF insulators is likely caused by other associated proteins, the underlying DNA sequence at the region, or a combination of both. At the well-studied H19 imprinting control region, nucleosome positioning has been shown to be regulated by the underlying DNA sequence, not CTCF binding. In addition, CTCF is known to interact with multiple DNA binding transcription factors, chromatin modifying proteins, and nuclear architectural proteins. Therefore, it is likely that CTCF is only a single component of a multimeric complex located at insulators and that this insulator complex, together with the underlying DNA sequence, defines the chromatin architecture at insulators. Identifying the shared chromatin architecture at insulators using CTCF binding location is analogous to identifying the shared architecture at TSSs using only the binding location of a transcription factor. To illustrate this concept, we examined the shared chromatin architecture around the binding sites for three abundant yeast transcription factors (Figure S6 in Additional file 1). The average nucleosome occupancy profile of the transcription factor binding site was compared before and after alignment with ArchAlign to the average nucleosome profile of the TSSs adjacent to the binding site. Since most transcription factors can bind in either orientation relative to TSSs and at various distances, the profile derived from the binding site alone appears symmetrical and not well resolved. After alignment with ArchAlign, the aligned profile resembles the TSS-derived profile and the true asymmetric nature is uncovered.
Similar to DNA and protein sequence alignment algorithms, ArchAlign does not identify the base pair location for a feature of interest. To identify the location of a genomic feature with ArchAlign, example regions containing an experimentally mapped feature need to be included in the alignment. The location of the unknown features can then be inferred from the alignment. The accuracy of the alignment is dependent on both the accuracy of the example regions and the extent to which the chromatin is organized around that feature. As evident from TSSs and CTCF binding sites, histone variants and histone tail modifications help define distinct chromatin architecture for certain genomic features. Future versions of ArchAlign will incorporate these datasets in order to produce an even more biologically relevant alignment.
ArchAlign is the first tool developed to align chromatin structural data and will prove highly valuable for analyzing chromatin datasets from genomes lacking substantial genomic feature annotation. Currently, there are many genomic features that cannot be accurately mapped by available techniques. For example, TSSs in Caenorhabditis elegans are difficult to map due to trans-splicing of the majority of mRNAs, which causes the 5' ends of different messages to have the same leader sequence, and origins of replication in Schizosaccharomyces pombe are difficult to map accurately, because S. pombe's origin recognition complex does not bind to specific DNA sequences but to AT-rich regions. ArchAlign requires only the general coordinates of a feature in order to determine the likely structural pattern present around it. In addition, as demonstrated for CTCF sites, even accurately mapped features may have a previously unrecognized directionality obscuring results that could be revealed by alignment with ArchAlign.
Materials and methods
S. cerevisiae nucleosome occupancy
S. cerevisiae genome-wide nucleosome occupancy maps were downloaded from the Segal Lab and transformed into a log2 ratio for each base pair. Mapped sequence tags from an MNase digestion were extended to the average sequence length for that experiment (150 to 200 bp) and normalized nucleosome occupancy at every base pair was determined as the log-ratio between the number of reads that cover that base pair and the average number of reads per base pair across the genome. The 200 TSSs used in validation were randomly selected from the previously defined cluster 3 of similar TSSs exhibiting high expression levels and a similar nucleosome occupancy pattern. The 156 origins of replication used for validation were selected from the original list of 222 characterized origins because they contained no gaps in the nucleosome occupancy dataset. The 70 MBP1, 76 GCN4, and 63 SWI4 binding sites identified by ChIP-chip experiments were selected from the original list of 127 MBP1, 107 GCN4, and 145 SWI4 sites because they contained no gaps in nucleosome occupancy.
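The normalization described above can be sketched in Python as follows. The sketch extends each mapped tag by a fixed length in its strand direction, counts the extended tags covering each base pair, and takes the log2 ratio to the genome-wide mean coverage. The function names, the single linear chromosome, and the choice to report uncovered base pairs as None (a gap) are assumptions made for illustration and are not details taken from the original pipeline.

```python
from math import log2

def coverage(tag_starts, strands, genome_length, extension):
    """Count extended tags covering each base pair (0-based coordinates)."""
    diff = [0] * (genome_length + 1)
    for start, strand in zip(tag_starts, strands):
        if strand == "+":
            lo, hi = start, start + extension          # extend downstream
        else:
            lo, hi = start - extension + 1, start + 1  # extend upstream
        lo, hi = max(lo, 0), min(hi, genome_length)
        if lo < hi:
            diff[lo] += 1
            diff[hi] -= 1
    cov, running = [], 0
    for d in diff[:genome_length]:
        running += d
        cov.append(running)
    return cov

def log2_occupancy(cov):
    """log2(reads covering a base pair / mean reads per base pair).

    Base pairs with zero coverage are returned as None (assumed to be gaps).
    """
    mean_cov = sum(cov) / len(cov)
    return [log2(c / mean_cov) if c > 0 else None for c in cov]
```

For the yeast data the extension would be set to the average fragment length of the experiment (150 to 200 bp); for the CD4+ data described below, 120 bp.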
Human CD4+ resting cells
Genome-wide nucleosome occupancy maps for CD4+ cells were downloaded from NCBI and transformed into a log2 ratio, as described above, using a tag extension of 120 bp.
CTCF binding sites
CTCF binding data were downloaded from NCBI. Candidate CTCF sites were identified by running a MACS analysis and ranking the resulting peaks by fold enrichment of CTCF binding in the genome. The coordinates for the top 10,000 peaks were then used to extract nucleosome occupancy profiles from the log2 occupancy dataset previously generated. All CTCF sites for which complete data were not available, or which were within 2 kb of the TSS of a known gene, were removed from the original set of sites. The top 1,000 remaining sites by fold enrichment were then selected as the CTCF binding sites.
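The site selection described above amounts to a rank, filter, and truncate procedure, sketched below in Python. The input representation (tuples of chromosome, position, and fold enrichment), the has_complete_data callback, and the function names are hypothetical and serve only to make the steps concrete; peak calling itself (MACS) is not reproduced here.

```python
from bisect import bisect_left

def near_tss(sorted_tss, pos, distance):
    """True if any TSS position lies within `distance` bp of `pos`."""
    i = bisect_left(sorted_tss, pos - distance)
    return i < len(sorted_tss) and sorted_tss[i] <= pos + distance

def select_sites(peaks, tss_by_chrom, has_complete_data,
                 n_initial=10000, n_final=1000, min_tss_distance=2000):
    """Rank peaks by fold enrichment, filter, and keep the top n_final.

    peaks: list of (chrom, position, fold_enrichment) tuples.
    tss_by_chrom: dict mapping chrom to a sorted list of TSS positions.
    has_complete_data: callable (chrom, position) -> bool.
    """
    ranked = sorted(peaks, key=lambda p: p[2], reverse=True)[:n_initial]
    kept = []
    for chrom, pos, fold in ranked:
        if not has_complete_data(chrom, pos):
            continue  # incomplete nucleosome occupancy profile
        if near_tss(tss_by_chrom.get(chrom, []), pos, min_tss_distance):
            continue  # too close to a known TSS
        kept.append((chrom, pos, fold))
    return kept[:n_final]
```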
CD4+ histone modification maps
CD4+ histone modification data were downloaded from NCBI [12, 30]. Genome-wide maps of sequence tag count were then generated for all datasets assuming an extension length of 120 bp as previously described.
A random number generator was used to generate the genomic coordinates for 10,000 random non-overlapping regions. The data were then extracted as previously described. The first 1,000 remaining regions after filtering were then selected as the random regions.
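One way to generate such random non-overlapping regions is rejection sampling, sketched below in Python. Chromosomes are sampled in proportion to their length, and a candidate is kept only if it does not overlap a region already chosen on the same chromosome. The function name, the half-open coordinate convention, and the attempt limit are assumptions for illustration; the generator used in the original analysis is not specified beyond the description above.

```python
import random

def random_regions(chrom_lengths, n, half_width, seed=0, max_attempts=1_000_000):
    """Draw up to n non-overlapping regions of width 2 * half_width.

    chrom_lengths: dict mapping chromosome name to length in bp.
    Returns a list of (chrom, start, end) with half-open coordinates.
    Fewer than n regions are returned if max_attempts is exhausted.
    """
    rng = random.Random(seed)
    chroms = sorted(chrom_lengths)
    weights = [chrom_lengths[c] for c in chroms]
    centers = {c: [] for c in chroms}
    regions = []
    for _ in range(max_attempts):
        if len(regions) == n:
            break
        chrom = rng.choices(chroms, weights=weights)[0]
        if chrom_lengths[chrom] <= 2 * half_width:
            continue  # chromosome too short to hold a full region
        center = rng.randrange(half_width, chrom_lengths[chrom] - half_width)
        # Two regions of equal width overlap when their centers are closer
        # than one full region width.
        if all(abs(center - c) >= 2 * half_width for c in centers[chrom]):
            centers[chrom].append(center)
            regions.append((chrom, center - half_width, center + half_width))
    return regions
```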
Scheme 1: Overview of chosen seed alignment
Usage: Chosen Seed Alignment(X, Y), where X = Seed Region 1, Y = Seed Region 2, Z = List of Regions, n = Number of Regions, w = Optimal Window of Region, and P = Average Profile.
Repeat for i = 3 to n
    Repeat for j = 0 to Length of Remaining Regions
        Identify window of Zj that maximizes similarity to P
    Remove Zj from Z
Output Optimal Windows of All Regions
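Scheme 1 can be read as a greedy procedure, and the Python sketch below shows one possible interpretation. It assumes that P is initialized as the average of the two seed windows, that the region whose best window is most similar to P is added at each step, and that P is updated as a running mean after each addition. These steps, the use of Pearson correlation, and all function names are assumptions made for illustration rather than a transcription of the ArchAlign source. Region reversal is omitted for brevity.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def best_window(region, profile):
    """Return (score, offset) of the window of `region` most similar to `profile`."""
    w = len(profile)
    return max((pearson(region[o:o + w], profile), o)
               for o in range(len(region) - w + 1))

def chosen_seed_alignment(regions, seed_x, seed_y, offset_x, offset_y, w):
    """Greedy alignment of all regions to a profile seeded by two windows.

    regions: list of binned profiles; w: window length in bins.
    Returns (windows, profile), where windows maps region index to its
    aligned window and profile is the final average profile P.
    """
    windows = {seed_x: regions[seed_x][offset_x:offset_x + w],
               seed_y: regions[seed_y][offset_y:offset_y + w]}
    profile = [(a + b) / 2 for a, b in zip(windows[seed_x], windows[seed_y])]
    remaining = [i for i in range(len(regions)) if i not in windows]
    while remaining:
        # Inner loop of Scheme 1: best window of every remaining region vs. P.
        (score, offset), idx = max((best_window(regions[i], profile), i)
                                   for i in remaining)
        window = regions[idx][offset:offset + w]
        k = len(windows)
        # Assumed update: running mean of all windows aligned so far.
        profile = [(p * k + x) / (k + 1) for p, x in zip(profile, window)]
        windows[idx] = window
        remaining.remove(idx)
    return windows, profile
```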
Scheme 2: Overview of single best-pair alignment
Usage: Single Best-Pair Alignment(Z)
Repeat for i = 0 to n
    Repeat for j = 0 to n
        Identify windows of Zi and Zj that maximize similarity out of all possible regions given i ≠ j
Identify which two Regions contained the windows that produced the highest similarity to each other for Zi and Zj
Chosen Seed Alignment(Zi, Zj)
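The exhaustive search for the best seed pair in Scheme 2 can be sketched in Python as below. Every window of every region is compared with every window of every other region, optionally also in reversed orientation, and the pair with the highest similarity is returned. Because the similarity is symmetric, only pairs with j > i are visited. Pearson correlation and the function names are assumptions for illustration.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def best_pair(regions, w, allow_reverse=False):
    """Find the two windows, from different regions, that are most similar.

    Returns (score, i, offset_i, j, offset_j, j_reversed).
    """
    orientations = (False, True) if allow_reverse else (False,)
    best = None
    for i in range(len(regions)):
        for j in range(i + 1, len(regions)):
            for oi in range(len(regions[i]) - w + 1):
                wi = regions[i][oi:oi + w]
                for oj in range(len(regions[j]) - w + 1):
                    wj = regions[j][oj:oj + w]
                    for rev in orientations:
                        cand = wj[::-1] if rev else wj
                        score = pearson(wi, cand)
                        if best is None or score > best[0]:
                            best = (score, i, oi, j, oj, rev)
    return best
```

The returned indices and offsets would then serve as the seed for the chosen seed alignment. The nested loops make the cost grow with the square of the number of regions and of the number of window placements per region, which is consistent with run time increasing as window size decreases.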
Scheme 3: Overview of seed selection alignment
Usage: Seed Selection Alignment(Z), where Ai = Alignment with a forced seed from Region i.
Repeat for i = 0 to n
    Repeat for j = 0 to n
        Identify windows of Zi and Zj that maximize similarity out of all possible regions given i ≠ j
    Identify which two Regions contained the windows that produced the highest similarity to each other for Zi and Zj
    Ai = Chosen Seed Alignment(Zi, Zj)
    Post-Alignment Quality Assessment(Ai)
Identify and output alignment that produced the highest Post-Alignment Quality Assessment
Equation for post-alignment quality assessment
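Following the description of the assessment as the average correlation or distance of each aligned region to every other aligned region, one way to write such a score is sketched below. The symbols Q, s, and w_i are notation assumed here for illustration: w_i denotes the optimal window of region i in alignment A, n the number of regions, and s a similarity measure such as the Pearson correlation. The exact normalization used by ArchAlign may differ.

```latex
Q(A) = \frac{1}{n(n-1)} \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \neq i}}^{n} s\left(w_i, w_j\right)
```

Under this form, the alignment with the largest Q(A) is selected when s is a correlation; when a distance is used instead, the alignment with the smallest average distance would be selected.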
ArchAlign was designed and written in C++, then compiled and run on a 64-bit Linux machine with 8 × 2.66 GHz Xeon X5550 cores, 48 GB RAM, and an 80 TB attached disk storage array. The current version of ArchAlign is designed to use only a single CPU core per run.
The original (±750 bp) nucleosome profiles of the 200 TSSs were extracted at a resolution of 10 bp. ArchAlign was run with a sliding window of 1 kb with region reversal disabled. The original (±1 kb) nucleosome profiles of the 156 origins of replication were extracted at a resolution of 10 bp. ArchAlign was then run with a sliding window of 1.5 kb with region reversal disabled.
The (±1 kb) nucleosome profiles of the top 1,000 CTCF sites were extracted at a resolution of 10 bp and ArchAlign was run with a sliding window of 1.5 kb with region reversal enabled.
The (±1 kb) nucleosome profiles of 1,000 random regions were extracted at a resolution of 10 bp and ArchAlign was run with a sliding window of 1.5 kb with region reversal enabled.
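To make these parameters concrete, the short Python sketch below counts how many window placements each region admits. It assumes that a region of R bp binned at resolution r yields R/r bins and that the window slides one bin at a time; with region reversal enabled, each placement is evaluated in both orientations. The function name is hypothetical and the binning convention is an assumption.

```python
def window_placements(region_bp, window_bp, resolution_bp, allow_reverse=False):
    """Number of candidate windows per region under the assumed binning."""
    region_bins = region_bp // resolution_bp
    window_bins = window_bp // resolution_bp
    placements = region_bins - window_bins + 1
    return placements * (2 if allow_reverse else 1)

# TSS validation: ±750 bp regions, 1-kb window, no reversal.
tss_placements = window_placements(1500, 1000, 10)
# CTCF and random regions: ±1 kb regions, 1.5-kb window, reversal enabled.
ctcf_placements = window_placements(2000, 1500, 10, allow_reverse=True)
```

Under these assumptions each TSS region has 51 candidate windows and each CTCF region has 102, so enabling reversal doubles the search space per region.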
Alignment using seed sampling without reversals of 200 2-kb regions at 10-bp resolution with a sliding window of 1.5 kb requires less than 5 minutes of CPU time. Alignment using seed sampling with reversals of 1,000 2-kb regions at 10-bp resolution with a sliding window of 1.5 kb requires approximately 20 hours of CPU time. Increasing the number of regions, the region size, or the data resolution, or decreasing the window size, increases CPU run time. Alignment using the single-best-pair approach is significantly faster: aligning 1,000 2-kb regions with reversals at 10-bp resolution with a 1.5-kb window requires less than 10 minutes of CPU time.
ArchAlign is available at http://www.acsu.buffalo.edu/~mjbuck/ArchAlign.html.
CTCF: CCCTC binding factor
TSS: transcription start site.
This work was supported by an NSF grant to MJB (IIS1016929).
1. Smith TF, Waterman MS: Identification of common molecular subsequences. J Mol Biol. 1981, 147: 195-197. 10.1016/0022-2836(81)90087-5.
2. Needleman SB, Wunsch CD: A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol. 1970, 48: 443-453. 10.1016/0022-2836(70)90057-4.
3. Hodgman TC: A historical perspective on gene/protein functional assignment. Bioinformatics. 2000, 16: 10-15. 10.1093/bioinformatics/16.1.10.
4. Celniker SE, Dillon LA, Gerstein MB, Gunsalus KC, Henikoff S, Karpen GH, Kellis M, Lai EC, Lieb JD, MacAlpine DM, Micklem G, Piano F, Snyder M, Stein L, White KP, Waterston RH: Unlocking the secrets of the genome. Nature. 2009, 459: 927-930. 10.1038/459927a.
5. Venters BJ, Pugh BF: How eukaryotic genes are transcribed. Crit Rev Biochem Mol Biol. 2009, 44: 117-141.
6. Berbenetz NM, Nislow C, Brown GW: Diversity of eukaryotic DNA replication origins revealed by genome-wide analysis of chromatin structure. PLoS Genet. 2010, 6: e1001092. 10.1371/journal.pgen.1001092.
7. Kolasinska-Zwierz P, Down T, Latorre I, Liu T, Liu XS, Ahringer J: Differential chromatin marking of introns and expressed exons by H3K36me3. Nat Genet. 2009, 41: 376-381. 10.1038/ng.322.
8. Spies N, Nielsen CB, Padgett RA, Burge CB: Biased chromatin signatures around polyadenylation sites and exons. Mol Cell. 2009, 36: 245-254. 10.1016/j.molcel.2009.10.008.
9. Andersson R, Enroth S, Rada-Iglesias A, Wadelius C, Komorowski J: Nucleosomes are well positioned in exons and carry characteristic histone modifications. Genome Res. 2009, 19: 1732-1741. 10.1101/gr.092353.109.
10. Tilgner H, Nikolaou C, Althammer S, Sammeth M, Beato M, Valcarcel J, Guigo R: Nucleosome positioning as a determinant of exon recognition. Nat Struct Mol Biol. 2009, 16: 996-1001. 10.1038/nsmb.1658.
11. Schwartz S, Meshorer E, Ast G: Chromatin organization marks exon-intron structure. Nat Struct Mol Biol. 2009, 16: 990-995. 10.1038/nsmb.1659.
12. Wang Z, Zang C, Rosenfeld JA, Schones DE, Barski A, Cuddapah S, Cui K, Roh TY, Peng W, Zhang MQ, Zhao K: Combinatorial patterns of histone acetylations and methylations in the human genome. Nat Genet. 2008, 40: 897-903. 10.1038/ng.154.
13. Cuddapah S, Jothi R, Schones DE, Roh TY, Cui K, Zhao K: Global analysis of the insulator binding protein CTCF in chromatin barrier regions reveals demarcation of active and repressive domains. Genome Res. 2009, 19: 24-32. 10.1101/gr.082800.108.
14. Fu Y, Sinha M, Peterson CL, Weng Z: The insulator binding protein CTCF positions 20 nucleosomes around its binding sites across the human genome. PLoS Genet. 2008, 4: e1000138. 10.1371/journal.pgen.1000138.
15. Bell AC, West AG, Felsenfeld G: Insulators and boundaries: versatile regulatory elements in the eukaryotic genome. Science. 2001, 291: 447-450. 10.1126/science.291.5503.447.
16. Phillips JE, Corces VG: CTCF: master weaver of the genome. Cell. 2009, 137: 1194-1211. 10.1016/j.cell.2009.06.001.
17. Bell AC, West AG, Felsenfeld G: The protein CTCF is required for the enhancer blocking activity of vertebrate insulators. Cell. 1999, 98: 387-396. 10.1016/S0092-8674(00)81967-4.
18. Kim TH, Abdullaev ZK, Smith AD, Ching KA, Loukinov DI, Green RD, Zhang MQ, Lobanenkov VV, Ren B: Analysis of the vertebrate insulator protein CTCF-binding sites in the human genome. Cell. 2007, 128: 1231-1245. 10.1016/j.cell.2006.12.048.
19. Jothi R, Cuddapah S, Barski A, Cui K, Zhao K: Genome-wide identification of in vivo protein-DNA binding sites from ChIP-Seq data. Nucleic Acids Res. 2008, 36: 5221-5231. 10.1093/nar/gkn488.
20. Liu CL, Kaplan T, Kim M, Buratowski S, Schreiber SL, Friedman N, Rando OJ: Single-nucleosome mapping of histone modifications in S. cerevisiae. PLoS Biol. 2005, 3: e328. 10.1371/journal.pbio.0030328.
21. Heintzman ND, Stuart RK, Hon G, Fu Y, Ching CW, Hawkins RD, Barrera LO, Van Calcar S, Qu C, Ching KA, Wang W, Weng Z, Green RD, Crawford GE, Ren B: Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat Genet. 2007, 39: 311-318. 10.1038/ng1966.
22. Hon G, Wang W, Ren B: Discovery and annotation of functional chromatin signatures in the human genome. PLoS Comput Biol. 2009, 5: e1000566. 10.1371/journal.pcbi.1000566.
23. Won KJ, Chepelev I, Ren B, Wang W: Prediction of regulatory elements in mammalian genomes using chromatin signatures. BMC Bioinformatics. 2008, 9: 547. 10.1186/1471-2105-9-547.
24. Ernst J, Kellis M: Discovery and characterization of chromatin states for systematic annotation of the human genome. Nat Biotechnol. 2010, 28: 817-825. 10.1038/nbt.1662.
25. Hon G, Ren B, Wang W: ChromaSig: a probabilistic approach to finding common chromatin signatures in the human genome. PLoS Comput Biol. 2008, 4: e1000201. 10.1371/journal.pcbi.1000201.
26. Kaplan N, Moore IK, Fondufe-Mittendorf Y, Gossett AJ, Tillo D, Field Y, LeProust EM, Hughes TR, Lieb JD, Widom J, Segal E: The DNA-encoded nucleosome organization of a eukaryotic genome. Nature. 2009, 458: 362-366. 10.1038/nature07667.
27. Lee W, Tillo D, Bray N, Morse RH, Davis RW, Hughes TR, Nislow C: A high-resolution atlas of nucleosome occupancy in yeast. Nat Genet. 2007, 39: 1235-1244. 10.1038/ng2117.
28. Berbenetz N, Nislow C, Brown GW: Diversity of eukaryotic DNA replication origins revealed by genome-wide analysis of chromatin structure. PLoS Genet. 2010, 6: e1001092. 10.1371/journal.pgen.1001092.
29. Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, Nussbaum C, Myers RM, Brown M, Li W, Liu XS: Model-based analysis of ChIP-Seq (MACS). Genome Biol. 2008, 9: R137. 10.1186/gb-2008-9-9-r137.
30. Barski A, Cuddapah S, Cui K, Roh TY, Schones DE, Wang Z, Wei G, Chepelev I, Zhao K: High-resolution profiling of histone methylations in the human genome. Cell. 2007, 129: 823-837. 10.1016/j.cell.2007.05.009.
31. Guelen L, Pagie L, Brasset E, Meuleman W, Faza MB, Talhout W, Eussen BH, de Klein A, Wessels L, de Laat W, van Steensel B: Domain organization of human chromosomes revealed by mapping of nuclear lamina interactions. Nature. 2008, 453: 948-951. 10.1038/nature06947.
32. Davey C, Fraser R, Smolle M, Simmen MW, Allan J: Nucleosome positioning signals in the DNA sequence of the human and mouse H19 imprinting control regions. J Mol Biol. 2003, 325: 873-887. 10.1016/S0022-2836(02)01340-2.
33. Ohlsson R, Lobanenkov V, Klenova E: Does CTCF mediate between nuclear organization and gene expression?. Bioessays. 2010, 32: 37-50. 10.1002/bies.200900118.
34. Blumenthal T: Trans-splicing and operons. WormBook. 2005, 1-9.
35. Lee JK, Moon KY, Jiang Y, Hurwitz J: The Schizosaccharomyces pombe origin recognition complex interacts with multiple AT-rich regions of the replication origin DNA by means of the AT-hook domains of the spOrc4 protein. Proc Natl Acad Sci USA. 2001, 98: 13589-13594. 10.1073/pnas.251530398.
36. MacIsaac KD, Wang T, Gordon DB, Gifford DK, Stormo GD, Fraenkel E: An improved map of conserved regulatory sites for Saccharomyces cerevisiae. BMC Bioinformatics. 2006, 7: 113. 10.1186/1471-2105-7-113.
37. Schones DE, Cui K, Cuddapah S, Roh TY, Barski A, Wang Z, Wei G, Zhao K: Dynamic regulation of nucleosome positioning in the human genome. Cell. 2008, 132: 887-898. 10.1016/j.cell.2008.02.022.
38. Buck Lab - ArchAlign. [http://www.acsu.buffalo.edu/~mjbuck/ArchAlign.html]
39. Eisen MB, Spellman PT, Brown PO, Botstein D: Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA. 1998, 95: 14863-14868. 10.1073/pnas.95.25.14863.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.