Ciliogenic RFX regulatory networks are conserved between C. elegans and D. melanogaster. Based on these first observations, the genomic screens we conducted combined with functional and in vivo gene analyses led to the identification of at least 11 novel genes that had never been described as RFX targets in any biological model. In addition, our screen allowed us to identify at least two novel genes specifically expressed in ciliated sensory neurons in Drosophila that are potentially involved in sensory ciliogenesis. These results validate the accuracy of our screens. Our work thus provides a new set of candidate genes for further functional studies in ciliogenesis.
Molecular nature of RFX target gene products
Our Drosophila genome-wide X-box screen led to the identification of 83 X-box genes, among which we report 11 novel RFX targets. Combined with the genes identified by comparisons to C. elegans or to other genomic studies in Drosophila (Table 1), we report 35 genes regulated by dRFX in Drosophila. Most of these genes can be classified based on their described function. Many of the RFX target genes are involved in IFT, which is necessary for cilium assembly and function. Remarkably, a second class of genes regulated by dRFX includes all the Drosophila homologs of BBS genes. Similarly, most C. elegans BBS genes are regulated by DAF-19 [14, 36, 37]. This strong dependence of BBS genes on RFX control may thus be conserved in mammals. Hence, RFX proteins may be involved in BBS in humans. Interestingly, two of the three Drosophila genes coding for proteins with B9 domains are also controlled by dRFX (tectonic, CG14870). One human B9 domain protein, MKS1, is known to be involved in the human Meckel-Gruber syndrome. The molecular function of this domain is unknown, but work in Drosophila suggested that these two B9 domain-containing proteins are likely involved in ciliogenesis. Several of the novel dRFX target genes that we identified in this study encode known components of the ciliary axoneme and associated structures, such as axonemal dyneins or rootletin. Other genes encode different types of proteins likely involved in sensory transduction (CG4536/osm-9/TRPV4 or MIP-T3). A last class includes genes for which the function is either not described or poorly understood, such as CG31036 and CG13125. However, our functional studies strongly suggest that they are also involved in sensory ciliogenesis in Drosophila. Thus, RFX target genes play various roles in ciliary structure and function, and our X-box search strategy has proven useful for identifying novel ciliogenic genes.
Database mining using the X-box promoter motif
This full set of dRFX target genes in Drosophila is of crucial importance, as we can now more precisely define X-box sequences and the promoter context required for dRFX control. This will be particularly useful for further database mining of dRFX target genes in Drosophila. In fact, several genes that are under dRFX control and for which an X-box can be identified (Table 1, for example CG4525, CG17599) did not come out in the whole-genome X-box screen. Several reasons can explain this result. First, homologs were not all annotated in the CDS listings that were available at the time of the search (for example, CG18631, CG9595, nompB in D. pseudoobscura). Second, annotation of both Drosophila databases is incomplete, as the start codon is sometimes not properly defined. Our X-box search algorithm keeps only genes for which the X-box match is upstream of the ATG. For example, for CG15666/GA13881, we predict, based on evolutionarily conserved sequences, that the correct ATG lies 75 bp downstream of the currently defined ATG. With the current annotation, this criterion excludes the homologous genes CG15666 and GA13881 from the dataset. However, as illustrated in Table 2, in a few cases, our X-box consensus cannot define a clearly conserved X-box match in the two Drosophila species for genes that appear to be down-regulated in a dRfx mutant, while several individual X-boxes are found separately in each organism. This could reflect either that these genes are not direct dRFX targets but are shut down by a feedback control loop that is not dependent on an X-box motif, or that the X-box is only loosely conserved in some promoter contexts. Notably, homologs of these genes in C. elegans are under RFX (DAF-19) control and have a well-defined X-box (for example, CG9333/che-2, CG13691/bbs-8), which argues in favor of the second possibility.
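The effect of a misplaced start codon on the upstream-of-ATG criterion can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the degenerate motif string, the toy sequence, the coordinates and the function names are hypothetical and are not the consensus or the algorithm used in this study. It only shows why a motif lying between an annotated ATG and a corrected ATG further downstream is missed under the annotated coordinates.

```python
import re

# IUPAC codes used by the illustrative motif below.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}

# Hypothetical degenerate motif, for illustration only (NOT the study's consensus).
TOY_MOTIF = "RYYNYYNNRRNRAC"


def motif_to_regex(motif):
    """Translate a degenerate IUPAC motif into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif))


def matches_upstream_of_atg(sequence, atg_pos, motif, window=1000):
    """Return start positions of motif matches lying entirely upstream of atg_pos."""
    start = max(0, atg_pos - window)
    upstream = sequence[start:atg_pos]
    return [start + m.start() for m in motif_to_regex(motif).finditer(upstream)]


# Toy locus: annotated ATG at position 10, a motif instance at position 40,
# and a candidate corrected ATG 75 bp downstream of the annotated one (position 85).
toy_sequence = ("T" * 10 + "ATG" + "T" * 27 + "GTTACCTAGGCAAC"
                + "T" * 31 + "ATG" + "T" * 10)

with_annotated_atg = matches_upstream_of_atg(toy_sequence, 10, TOY_MOTIF)
with_corrected_atg = matches_upstream_of_atg(toy_sequence, 85, TOY_MOTIF)
print(with_annotated_atg, with_corrected_atg)
```

With the annotated start codon the toy gene yields no upstream match and would be dropped, whereas the corrected start codon recovers the match.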
Interestingly, we also quantified the expression levels in control and dRfx-deficient Drosophila of several genes of the DCBB dataset that did not come out of the genome-wide X-box motif search. This allowed us to identify several novel genes that are indeed down-regulated in dRfx mutants, but for which no conserved X-box can be recognized based on our initial consensus motif (AL, unpublished). Altogether, our observations highlight the difficulties encountered in motif definition in promoters. Similar conclusions were drawn from a parallel approach performed in C. elegans, which has led to the identification of several novel DAF-19 target genes. Interestingly, in that study the in silico search was associated with microarray analysis of transcripts in wild-type and daf-19 mutant worms. The in silico search allowed the identification of 93 X-box genes. Yet, among the 466 genes that were shown to be down-regulated at least two-fold in microarray hybridization experiments, only 25 were also represented in the list of 93 in silico X-box genes. Thus, in silico searches on isolated motifs are likely hampered by a high level of false negatives. Combinatorial motif searches, as proposed by other studies [71, 72], would probably improve the accuracy of such screens. Although the conserved X-boxes that we identified are rarely associated with highly conserved surrounding sequences (Table 2), it is reasonable to assume that other conserved nearby motifs, still to be identified, could help to reduce both false positives and false negatives.
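The overlap reported for the C. elegans study can be expressed as two simple proportions. The short sketch below only restates the numbers given above (93 in silico X-box genes, 466 down-regulated genes, 25 in common); the function and variable names are ours, and the calculation makes no claim about which of the non-overlapping genes are direct or indirect targets.

```python
def overlap_fractions(n_overlap, n_in_silico, n_microarray):
    """Return the overlap as a fraction of each gene list."""
    return n_overlap / n_in_silico, n_overlap / n_microarray


# Numbers quoted in the text for the C. elegans study.
frac_of_in_silico, frac_of_microarray = overlap_fractions(25, 93, 466)

print(f"{frac_of_in_silico:.1%} of in silico X-box genes were down-regulated")
print(f"{frac_of_microarray:.1%} of down-regulated genes had an in silico X-box")
```

About 27% of the in silico candidates were confirmed by the microarray data, and only about 5% of the down-regulated genes were captured by the motif search.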
Regulatory network of ciliary genes
We have identified 35 genes that are transcriptionally down-regulated in dRfx mutants. We show that RFX regulatory networks are conserved between C. elegans and Drosophila, as most of the genes controlled by DAF-19 in C. elegans are also under dRFX control in D. melanogaster. Interestingly, our results show that only certain subsets of ciliogenic genes are regulated by RFX proteins. For example, in our assay conditions, none of the genes known to be involved in IFT-A complexes is regulated by dRFX, whereas all genes encoding IFT-B homologs are. Retrograde motors are also regulated by dRFX (CG15148/btv and CG3769), whereas anterograde motors seem not to be. Indeed, in addition to CG10642/KIF3A, the main described anterograde motor in several organisms, we have shown that two other kinesin subunits, CG17461/Kif3C/osm-3 and CG7293/Klp68D, are expressed at the same level in wild-type and dRfx-deficient Drosophila (AL, data not shown). It is also interesting to note that all the BBS gene homologs in D. melanogaster are under dRFX control (Table 1).
The biological significance of these observations is unclear. It could reflect the fact that IFT-B proteins, BBS proteins and the dyneins involved in IFT are dedicated to ciliogenesis and, therefore, need to be turned on concomitantly only when the cilium is formed, whereas IFT-A complexes or the anterograde transport motor kinesin II are subject to more complex regulatory controls, as they might also be necessary for other cellular functions. This is the case for kinesin II motors, but does not seem to be true for IFT-A complexes, as these proteins are proposed to be specific to ciliated organisms. In C. elegans, the ciliary IFT machinery works in a modular fashion, and it is tempting to speculate that RFX-dependent proteins could be involved in specialized ciliogenic transport modules.
Genes necessary for centriole biogenesis or replication, such as the recently described DSas-6, DSas-4 or sak genes [75–78], are not present in our screen, and no conserved X-box can be found upstream of these genes. Thus, dRFX does not seem to regulate centriole biogenesis, and its role appears to be restricted to cilia assembly.
Identifying the transcription factors that govern other sets of ciliary proteins will certainly be one avenue to follow. Based on our data, it would be of particular interest to compare the promoter sequences of genes that are regulated by dRFX with those of genes that are not. This may allow us to discover novel regulatory motifs and protein modules that are necessary to coordinate the control of ciliogenesis. So far, only a few transcription factors have been shown to be involved in the control of ciliogenesis: the RFX proteins [21, 23, 24], Foxj1, and HNF1-beta. However, the last two have no obvious homologs in Drosophila. Thus, our work strongly suggests that novel transcription factors necessary for ciliogenesis still need to be discovered.
Novel RFX target genes
Some of the novel RFX target genes found in Drosophila were unexpected. For example, we identified several proteins that are proposed to be involved in flagella or cilia motility, such as dynein heavy chains (CG17150/Dhc93AB). Recently, a CG13125 homolog has also been shown to function as a motility factor in T. brucei (TbCMF46). Sensory cilia are generally thought not to be motile. However, it has been shown that Drosophila chordotonal neurons of the antenna generate motion that depends on the integrity of proteins encoded by genes such as CG15148/btv (cytoplasmic dynein heavy chain) or CG14620/tilB (LRRC6 homolog), described to affect the axonemal structure [52, 79] (D Eberl, personal communication). In addition, cilia of the chordotonal neurons of the grasshopper bend upon vibration stimulation. Thus, proteins involved in axonemal motility might be important for motion generation of the cilium in response to mechanical stimulation. It will be of great interest to determine whether flies defective in these 'motility' genes are affected in hearing and, more specifically, in the motility of the mechanosensory cilium that amplifies hearing vibrations. Interestingly, CG13125/TbCMF46 does not seem to be expressed in fly testis (AL, unpublished), where the spermatozoa are the only cell type with a motile flagellum in flies. This suggests that, like that of CG15148/btv, the function of CG13125/TbCMF46 could be restricted to the sensory cilium and, more specifically, to allowing these cilia to respond mechanically to auditory vibrations. Thus, our data suggest that in the fly, possible axonemal motility could be regulated by different subsets of proteins in sperm flagella and in mechanosensory cilia. This is of particular interest with regard to hearing in mammals, which is dependent on hair cell motility. It will be very interesting to determine whether the CG13125/TbCMF46 homolog in mammals has a specific function in those cell types.
We also identified in our screen three genes (CG6054/Su(fu), CG13415/Cby, CG33038/Ext(2)) known to be involved in the Hedgehog or wingless signaling pathways in Drosophila. Su(fu) and Ext(2) are involved in the Hedgehog pathway, and Su(fu) is localized to cilia in mammalian cells. However, Su(fu) and Ext(2) do not appear to be under dRFX control according to real-time PCR quantification results (Table 3) and may be false positives in our screen. This result argues in favor of the generally accepted observation that the Hedgehog signaling pathway does not seem to depend on ciliogenic proteins in Drosophila. Only Chibby (Cby) is significantly down-regulated (two-fold) in a dRfx-deficient background. Cby was isolated in a two-hybrid screen for armadillo/beta-catenin interactors. RNAi knock-down of Cby in Drosophila embryos leads to ectopic activation of the wingless pathway. Cby is also described to antagonize the Wnt/beta-catenin pathway in mammalian cells [64, 65]. However, the expression pattern of Cby in Drosophila is not documented, so we do not know whether the variations in expression observed in the dRfx-deficient background are connected to dRfx expression and, thus, whether they are biologically significant.
Among the 83 genes with conserved X-boxes between D. melanogaster and D. pseudoobscura (Table 3), several were barely detectable by quantitative RT-PCR. Hence, we were unable to determine by this approach whether they are under dRFX control. This could reflect that these genes are expressed only in a subset of sensory neurons and are thus difficult to detect by quantitative RT-PCR. Nevertheless, several of these genes are interesting as potential ciliogenic or RFX target genes. For example, CG14079 is homologous to a mouse protein that appears to be specific to testis. CG11356 is homologous to mammalian arl13, which was recently isolated in an ethyl-nitroso-urea screen for neural tube defects in mouse. Indeed, mutation of arl13 affects ciliary architecture and Sonic-Hedgehog signaling in mouse. This gene, CG11356, was not found in any previous ciliogenesis study, again illustrating the accuracy of our screen. Functional studies in Drosophila will be of particular importance to demonstrate the role of this gene in sensory ciliogenesis.