SoxNeuro orchestrates central nervous system specification and differentiation in Drosophila and is only partially redundant with Dichaete

Background
Sox proteins encompass an evolutionarily conserved family of transcription factors with critical roles in animal development and stem cell biology. In common with vertebrates, the Drosophila group B proteins SoxNeuro and Dichaete are involved in central nervous system development, where they play both similar and unique roles in gene regulation. Sox genes show extensive functional redundancy across metazoans, but the molecular basis underpinning functional compensation mechanisms at the genomic level is currently unknown.

Results
Using a combination of genome-wide binding analysis and gene expression profiling, we show that SoxNeuro directs embryonic neural development from the early specification of neuroblasts through to the terminal differentiation of neurons and glia. To address the issue of functional redundancy and compensation at a genomic level, we compare SoxNeuro and Dichaete binding, identifying common and independent binding events in wild-type conditions, as well as instances of compensation and loss of binding in mutant backgrounds.

Conclusions
We find that early aspects of group B Sox functions in the central nervous system, such as stem cell maintenance and dorsoventral patterning, are highly conserved. However, in contrast to vertebrates, we find that Drosophila group B1 proteins also play prominent roles during later aspects of neural morphogenesis. Our analysis of the functional relationship between SoxNeuro and Dichaete uncovers evidence for redundant and independent functions for each protein, along with unexpected examples of compensation and interdependency, thus providing new insights into the general issue of transcription factor functional redundancy.

(GEO platform GPL14121). Slides were hybridized for 16 hours at 51°C, washed and scanned using a GenePix 4000B scanner. Each experiment was performed in quadruplicate.

DamID
Embryos from Dam, SoxNDam, DDam, SoxN U6-35 [3]. Genomic DNA was extracted using the Qiagen DNeasy blood and tissue kit and precipitated with ethanol. DNA was digested overnight at 37°C with DpnI to cut methylated GATC sequences and the fragments were ligated to a double-stranded adapter oligonucleotide. Ligation products were subjected to DpnII digestion to remove unmethylated GATC sequences and amplified by PCR. Klenow labelling by random priming was used to incorporate Cy3 and Cy5 into the DNA. After combining samples with Dam-only controls, DNA was loaded onto NimbleGen Drosophila melanogaster Whole Genome 2.1M tiling arrays (GEO platform 15641). For each experiment, three arrays, corresponding to three biological replicates, were hybridized overnight at 42°C, then washed and scanned the following day.

ChIP-on-chip
ChIP followed by microarray hybridization was performed as described by Sandmann and colleagues [4]. SoxND1 and SoxND2 are polyclonal antisera raised in rabbits immunized with protein fragments corresponding to amino acids 2-92 and 317-417 of SoxN, respectively, and were produced by the modENCODE consortium (a gift of N. Negre and K. White). SoxNPA179 is an affinity-purified rabbit polyclonal antibody designed against a 506-LHYQTDSPDLQQQHQS-521 peptide at the C-terminus of SoxN, commissioned from Eurogentec. A mouse monoclonal antibody against βGal (40-1a, Developmental Studies Hybridoma Bank, DSHB) was used for control immunoprecipitations. Following dechorionation, approximately 2.5 mg wet weight of embryos per replicate were crosslinked with formaldehyde and lysed to extract protein-DNA complexes. After sonication, the average chromatin fragment size (~500-1000 bp) was checked by electrophoresis.
Chromatin was then incubated overnight with protein A agarose beads, salmon sperm DNA and anti-βGal (control) or anti-SoxN antibodies. After extensive washes, crosslinking was reversed by incubation at 65°C for 6 hours and DNA was isolated by phenol-chloroform extraction. DNA was amplified with two rounds of ligation-mediated PCR, then labelled by Klenow amplification and random priming, and hybridized onto NimbleGen Drosophila melanogaster Whole Genome 2.1M tiling arrays (GEO platform 15641) overnight at 42°C. Each experiment was performed and hybridized in triplicate.

Data analysis
Gene expression arrays were processed according to established FlyChip pipelines [http://www.flychip.org.uk]. Scanned images were imported into Dapple [5] for spot finding and quantification, and raw data were normalised with the variance stabilization method (VSN) [6]. Experiments performed at different stages of embryonic development were analysed together with the limma Bioconductor package [7] to retrieve probes differentially expressed (p ≤ 0.05) over the timecourse. Nimblescan was used to quantify features on the scanned images of DamID and ChIP microarrays. Quantile normalisation was applied to the raw data before using the Ringo Bioconductor package [8] for peak calling at different FDRs. Window score (SGR) and binding interval (BED) files were visualised with the Integrated Genome Browser [9]. The SoxNDam, DDam, SoxN-DDam and D-SoxNDam DamID experiments were quantile normalised together and the resulting ratios were used to perform pairwise and three-way comparisons between the datasets with SimBindProfiles [10]. This tool does not rely on peak calling, but directly compares binding profiles, allowing the retrieval of commonly and differentially bound regions between datasets, as well as trans- and over-compensation events. The BEDTools suite [11] was used for operations with BED files. Assignment of intervals to genes was performed using a custom script identifying the closest TSS in a 10 kb window. If no TSS was found, the interval was assigned to the closest gene boundary in the same 10 kb window; otherwise it was left unassigned. GO:BP term enrichment analyses were performed using BiNGO [12], a Cytoscape plugin. Terms were considered significant if their p-value, corrected for multiple hypothesis testing with the Benjamini-Hochberg method, was below a 0.05 threshold. The HOMER software suite [13] was utilised both for de novo motif discovery and to find enrichment of previously known motifs.
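The interval-to-gene assignment described above can be illustrated with a minimal Python sketch. This is not the custom script used in the study: the `Gene` record, the `assign_interval` function, the use of the interval midpoint as the reference position and the reading of "10 kb window" as a maximum distance of 10 kb on either side are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

WINDOW = 10_000  # 10 kb, as stated in the text; applied here as a maximum distance (assumption)


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; field names are illustrative only."""
    name: str
    chrom: str
    start: int   # leftmost coordinate
    end: int     # rightmost coordinate
    strand: str  # "+" or "-"

    @property
    def tss(self) -> int:
        # The TSS is the 5' end of the gene, which depends on the strand.
        return self.start if self.strand == "+" else self.end


def assign_interval(chrom: str, start: int, end: int,
                    genes: List[Gene], window: int = WINDOW) -> Optional[str]:
    """Return the name of the gene assigned to a binding interval, or None."""
    mid = (start + end) // 2  # assumption: distances are measured from the midpoint
    candidates = [g for g in genes if g.chrom == chrom]

    # Step 1: closest TSS within the window.
    tss_hits = [(abs(g.tss - mid), g.name) for g in candidates
                if abs(g.tss - mid) <= window]
    if tss_hits:
        return min(tss_hits)[1]

    # Step 2: no TSS found, so fall back to the closest gene boundary.
    edge_hits = []
    for g in candidates:
        dist = min(abs(g.start - mid), abs(g.end - mid))
        if dist <= window:
            edge_hits.append((dist, g.name))
    if edge_hits:
        return min(edge_hits)[1]

    # Step 3: nothing within the window, so the interval stays unassigned.
    return None


genes = [Gene("geneA", "2L", 1000, 5000, "+"),
         Gene("geneB", "2L", 30000, 50000, "-")]
print(assign_interval("2L", 1500, 2500, genes))    # TSS of geneA lies within 10 kb
print(assign_interval("2L", 24000, 26000, genes))  # no TSS nearby, geneB boundary is
print(assign_interval("2L", 15000, 17000, genes))  # nothing within 10 kb
```

In practice the nearest-feature lookup could equally be done with interval tools such as BEDTools; the plain-Python version only makes the three-step fallback logic explicit.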
Mapping de novo motif matches to the Drosophila genome was done using FIMO at a p-value cut-off of 1E-4 [14]. To assess the similarity of binding datasets, an algorithm performing pairwise comparisons of BED files and relying on a subsampling-based approach was employed [15,16]. Embryonic binding datasets from the BDTNP (Berkeley Drosophila Transcription Network Project) [17] and modENCODE (Model Organism Encyclopedia of DNA Elements) [16,18] projects were used. FlyExpress [19] was used for the production of genome-wide expression maps. For network analysis, the whole DroID database [20], with the exception of TF-gene, microRNA-gene and predicted protein-protein interactions, was used. The resulting network was imported into Cytoscape [21] and used for further analysis.
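For readers unfamiliar with the Benjamini-Hochberg correction applied to the GO term p-values above, the following Python sketch shows the standard step-up adjustment. It is a generic illustration and not the BiNGO implementation; the function name and the example p-values are made up.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # enforcing monotonicity with a running minimum.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.001, 0.02, 0.03, 0.5]           # made-up p-values for four terms
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]    # 0.05 threshold, as in the text
print(adj, significant)
```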

SimBindProfiles
The SimBindProfiles Bioconductor package [10] identifies common and unique binding regions in genome tiling array data. This package does not rely on peak calling, but directly compares binding profiles processed on the same array platform. It implements a simple threshold approach, thus allowing retrieval of commonly and differentially bound regions between datasets as well as events of trans- and over-compensation. To identify probes or regions that are similarly or differentially bound between the datasets, we implemented the twoGaussiansNull method established in the Ringo package to set a bound cut-off (bound.cutoff); probes above this threshold are considered "bound". When comparing two datasets, a probe is considered uniquely bound in one dataset if it is bound above a diff.cutoff threshold (here, 75% of the bound.cutoff of the other dataset). The R package is available from Bioconductor [http://www.bioconductor.org/packages/2.14/bioc/html/SimBindProfiles.html].
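The threshold logic described above can be sketched in a few lines of Python. This is an illustration only and does not call or reproduce the SimBindProfiles R package. In particular, the text does not spell out how diff.cutoff is applied; the sketch assumes it is compared with the difference between the two probe scores, and the function name, labels and example values are hypothetical.

```python
def classify_probe(score_a: float, score_b: float,
                   bound_a: float, bound_b: float,
                   diff_fraction: float = 0.75) -> str:
    """Classify one probe from its scores in datasets A and B.

    bound_a and bound_b play the role of bound.cutoff for each dataset.
    diff_fraction = 0.75 mirrors the 75% setting quoted in the text.
    """
    # diff.cutoff for each dataset is a fraction of the OTHER dataset's bound cut-off.
    diff_for_a = diff_fraction * bound_b
    diff_for_b = diff_fraction * bound_a

    a_bound = score_a > bound_a
    b_bound = score_b > bound_b

    if a_bound and b_bound:
        return "common"
    # Assumption: uniqueness requires the score difference to exceed diff.cutoff.
    if a_bound and (score_a - score_b) >= diff_for_a:
        return "unique_a"
    if b_bound and (score_b - score_a) >= diff_for_b:
        return "unique_b"
    return "unclassified"


# Made-up cut-offs and scores, for illustration only.
print(classify_probe(2.0, 1.5, bound_a=1.0, bound_b=1.2))
print(classify_probe(2.0, 0.2, bound_a=1.0, bound_b=1.2))
print(classify_probe(1.1, 1.0, bound_a=1.0, bound_b=1.2))
```

Requiring a minimum difference, rather than just "bound in one and not the other", keeps probes that sit just either side of a cut-off from being reported as differential.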

Immunohistochemistry
After collection and dechorionation, antibody staining of embryos from SoxN U6-35 /CyO, twi- crosses was carried out essentially as described by Patel [22]. anti-Spdo (1:1000, [44]) and anti-Wor (1:1000, [32]). Following extensive washes, biotinylated secondary antibodies (Vector Labs) were added at a 1:200 dilution and left rolling for 2 hours at room temperature. After washes, detection was performed by incubating embryos for 1 hour with the Vectastain ABC system. Colour was developed by adding DAB and H2O2 to embryos in watchglasses. After washing, embryos were left to sink in 50% and then 70% glycerol, mounted on slides and observed with a Zeiss Axioplan microscope.