Zebrafish lines and care
Zebrafish (Danio rerio) WIK strain [41] was kindly provided by Dr. Bovolenta (CBM-UAM) and inbred in our laboratory for at least three generations before the experiments. Afterwards, a WIK F1 population was generated from a single batch of embryos from a single couple of adult fish. Two additional cycles of inbreeding were carried out, crossing a couple of siblings from the former generation. The CG2 clone population [17], generated by double gymnogenetic heat-shock and characterized by being pure isogenic zebrafish, was kindly provided by Dr. Revskoy (Univ Northwestern) as a control for reduced genetic differences between siblings. The outbred LPS (Local Pet Store) strain was recently described [42] and was used as a model of genetic heterogeneity. Heterozygotic hdac1 (hi1618) and yy1a (sa7439) mutant strains with wild-type (AB strain) counterparts were obtained from ZIRC, while heterozygotic hdac1 (sa436) [21] with wild-type counterparts were kindly provided by Dr. Ober (University of Copenhagen).
Care and breeding of the zebrafish strains were as described [42], with specific additional details. Eggs were isolated 24 h post-fertilization and maintained in custom multiwell plates until 10 days post-fertilization (dpf). They were fed (JBL NovoBaby) from 6 dpf and water was changed daily if not indicated specifically in the experiment.
Free-swimming setup and recording
The setup consists of a monochrome camera located over the wells at a distance of 70 cm and pointing downwards. The camera used was a 1.2 MPixel camera (Basler A622f, with a Pentax objective of focal length 16 mm). The wells are circular, carved on transparent PMMA (24 wells per plate, and typically two plates are recorded simultaneously), and have their walls tilted so that even in the most lateral wells the wall never hides the larva from the camera. Each well is 15 mm deep, and has a diameter of 1.8 mm at the bottom and a diameter of 30 mm at the top (Additional file 1: Figure S1A). For the experiments, each well is filled with a volume of 3 ml. The dishes are supported by a white PMMA surface that is only partially opaque. Behind this white surface we place two infrared LED arrays (830 nm, TSHG8400 Vishay Semiconductors) pointing outwards (Additional file 1: Figure S1A). Two paper sheets stand between the lights and the central space that lies directly under the wells. With this disposition we ensure that only diffuse indirect light reaches the wells, so that the illumination is roughly uniform (most of the light comes from below the wells through the white surface). White curtains entirely surround the set-up. The video camera recorded at 25 fps (Additional file 1: Figure S1B, C for examples of a single frame and final trajectories).
A larval population (5–8 dpf) consisted of at least 24 fish siblings from the same batch of embryos. After 5 min of acclimation to the new environment, the larvae were recorded for 20 min. Water temperature was maintained in a strict range (27–28 °C) during each experiment.
Custom-built software tracking larvae
We developed multiwellTracker, a software to automatically track zebrafish larva in wells. The software is available at http://www.multiwelltracker.es.
Detection of wells
The program is prepared to auto-detect circular wells, regardless of their spatial arrangement. To detect the wells we use the circular Hough transform (we have modified the code of Tao Peng distributed by Matlab Central under BSD license). In order to estimate the diameter of the wells, it computes the image’s Hough transform for 100 radii different in 5 pixels and a rough estimate of the largest possible radius (length of the longer side of the image divided by the square root of the number of wells) (Additional file 1: Figure S1D). The system selects the highest point of this measure as an estimate of the radius of the wells (rest). It is possible to skip this first step and instead manually specify a value for rest. This may be advisable when many videos are recorded with the same set-up and the same wells.
In the second step the system locates the centers of the wells. To do this it performs a Hough transform of the original image, this time with radii only in the range between 0.8rest and 1.2rest. The transformed image usually has clear peaks in the centers of the wells. Then it filters the transformed image with a Gaussian filter to increase its smoothness (the resulting transformed image is shown in Additional file 1: Figure S1E). Then, it selects the maximum of the transformed image as the center of the first well. To prevent selecting the same well twice, the system discards all the pixels of the transformed image that are within radius rest of the selected center (Additional file 1: Figure S1F). It selects the new maximum as the center of the second well, and repeats the procedure until all wells have been found (Additional file 1: Figure S1G). The experimenter can correct the result by manually clicking on the center of the wells that have not been correctly located (< 1% of cases).
Pre-processing of images
In order to control for fluctuations in illumination, each frame is normalized by dividing the intensity of each pixel by the average intensity across all pixels of the frame. After normalizing the frame, a 2D Gaussian filter is used to smooth the image (Additional file 1: Figure S1H shows the image before and Additional file 1: Figure S1I after filtering).
Background subtraction and detection of the larva
In order to extract the image of the larva from the background, the system finds the average of 1000 frames equi-spaced along the whole video. This average image is what we will call “static background”. By substracting the static background from each frame, we obtain an image in which the larvae correspond to dark regions (Additional file 1: Figure S1J). However, because of relatively slow changes in the set-up over time, the system uses the static background in combination with a dynamic background, which is computed as the average of the previous five frames. The difference between the current frame and the dynamic background will only show larvae that are moving in that precise moment (Additional file 1: Figure S1K).
The specific algorithm to detect the larva is as follows. First, the difference between the current frame and the static background is thresholded, keeping only pixels for which the difference is below − 0.5. We then find connected components (“blobs”) in this thresholded image, keeping those that are larger than one pixel. Because these blobs come from the difference with the static background, both static and moving larvae will be detected. But at this stage some blobs come simply from noise. In order to filter out noisy blobs, the system accepts a blob if it fulfills at least one of these two conditions: (a) it contains at least one pixel that was identified as part of the larva in the previous frame or (b) it contains at least one pixel for which the difference between the current frame minus the dynamic background is below the same threshold as before (− 0.5).
Removal of reflections
In most cases only one blob is obtained after the process described in the previous sections. But when the larva is close to the wall of the well, its reflection on the wall may also be selected. The system considers that a blob A is a reflection of blob B when all of the following conditions are met: (a) blob B is bigger than blob A, (b) blob B is closer to the center of the well than blob A, and (c) the lines between the center of the well and the two blobs form an angle < 10°. When these three conditions are met, the system removes blob A.
Acquisition of the position of the larvae
If more than one blob remains in the same well after the previous steps, the system selects the one with highest contrast. For the selected blob, the system takes the position of its most contrasted pixel, and adds this position to the trajectory of the larva. If in a well no blob remains after the previous steps, the trajectory is left with a gap. When this happens, the program will not re-track the larva until it moves.
Behavioral parameters
Different parameters reflecting the behavior of individual larvae were measured, and two were used throughout the paper: (i) activity (percentage of time in movement) and (ii) radial index (average position from the border towards the center of the well). We also studied six additional parameters: (iii) tortuosity in the trajectory was calculated as the scalar product of the velocity vectors between two consecutive frames and the value in Additional file 1: Figure S2C was obtained by averaging this parameter through the whole video, excluding the frames where the animal was immobile. (iv) Speed was calculated as the average distance (in pixels) travelled per frame, in those frames where the fish was active. (v) Bursting was obtained as the total number of frames where fish changed from immobility to motion. (vi) Circularity was calculated as the distance travelled in the border/center axis divided by the total distance travelled by the individual. (vii) Instant acceleration was obtained as the average distance travelled by the individual in the frames where the fish changed from immobility to motion. (viii) Instant tortuosity was the average tortuosity in the frames where the fish changed from immobility to motion.
The average of each individual parameter was tested from 5 to 8 dpf to assess if individual behavior was significantly stable along the days using Pearson coefficient of correlation.
Inter-individual vs intra-individual differences
The eight behavioral parameters were also obtained from consecutive fragments of 30 s for each 20-min experiment for each larva. This was fitted to a two-dimensional Gaussian, but for clarity when representing many animals an isocontour of the Gaussian for each animal was used. An isocontour is an ellipse with principal axes given by the eigenvectors of the covariance matrix. We chose the isocontour with length of each semiaxis given by the square root of the eigenvalue of the covariance matrix, as this reduces to the standard deviation in each direction for cases with no correlation between the two variables. Intra-individual variation distribution was obtained using the coefficients of variation (CVs) and the standard deviations (only in the case of tortuosity and instant tortuosity, due to the presence of positive and negative values) separately. Inter-individual variation was calculated the same way but using fragments from different fish.
Additional validation of the experimental setup
Several controls were performed for possible experimental artifacts affecting wells differently. Behavioral parameters (activity and radial index) were robust to 90 degrees counterclockwise rotations of the multi-well plate (Additional file 1: Figure S3A, left; R = 0.73, P < 0.001, and R = 0.68, P < 0.001; two-sided t-tests for R = 0) or to interchanging the larvae between outer and inner wells (Additional file 1: Figure S3A, right; R = 0.65, P < 0.001, and R = 0.61, P < 0.001; using the same test as before). Also, we found no correlation between the small differences in illumination across wells and behavior (Additional file 1: Figure S3B). We further corroborated using a significance test that the differences in behavior did not have an origin in systematic differences across wells. For this, we found that the average behavioral parameters obtained in 15 individual experiments were not different between wells (Additional file 1: Figure S3C, typically P = 0.4 and always P > 0.19 for both parameters, permutation tests).
Stimulus response tests
We studied the influence that our free-swimming behavioral parameters could have on the performance of the individuals when they respond to three different stimuli.
Response to mechanical disturbance
We applied mechanical perturbations to each larva by pipetting the water content of the well up and down four times. Perturbations were applied at 6 dpf to previously recorded animals, and the 20-min recording was done at 7 dpf. The recording was performed in the usual setup.
Response to strong light pulse
In complete darkness, we applied three different light flashes to the larvae and studied their behavior in the 90 subsequent seconds. The flashes and the recording were performed in the usual setup. Pre-recording behavioral parameters were obtained the day before.
Novel tank with light/dark preference
In order to study the effect that a novel setup could have on the behavior of larvae we built a rectangular setup, which changed the geometry of the previous circular wells. The setup dimensions were 84 mm × 21 mm and it was built in transparent acrylic. To try to see if our parameters had any effect on the light–dark preference, half of the floor of the setup (42 mm × 21 mm) was white while the other half was black. The height of the setup was 5 mm. Larvae were placed in the center of the white part and recorded for 10 min. Activity was calculated as previously described and distance to the wall was represented by the average distance to the longest walls, normalized to 1 in the middle point of both walls and to 0 at the exact position of the walls.
The effect of our behavioral tests resulted in a decrease (increase) in mean activity (radial index), but maintaining the same individuality of the pre-recorded free-swimming experiments (Additional file 1: Figure S3D; P < 0.04 for changes in mean activity and radial index compared to control larvae of the same age, two-sided t-tests; P < 0.02 in two-sided t-tests for R = 0 between activity and radial index). In the case of the novel tank, radial index cannot be applied because the wells are elongated so was replaced by the minimum distance to the longer walls. We note, however, that this parameter showed no correlation with the radial index of pre-recorded experiments in the same animals.
Comparing the behavioral variability between two animal groups
A simple visual method to characterize the variability in a population is to plot the bi-dimensional distribution of the activity and radial index of individuals (like in Fig. 1a’”). To do so, we used Gaussian kernel smoothing that consists in adding up Gaussians centered at the data points as:
$$ P\left(x,y\right)=\frac{1}{N}\frac{1}{2{\pi \sigma}_x{\sigma}_y}\sum \limits_{i=1}^N\exp \left(-\frac{1}{2}\left[\frac{{\left(x-{x}_i\right)}^2}{\sigma_x^2}+\frac{{\left(y-{y}_i\right)}^2}{\sigma_y^2}\right]\right) $$
with xi and yi the mean activity and radial index values of individual i of a total of N members of the population. An optimal smoothing uses standard deviations of each Gaussian given by σ
x
= N−1/6α
x
with α
x
the standard deviation in the xi data values, and similarly for σ
y
using the yi values (see B.E. Hansen, unpublished manuscript, http://www.ssc.wisc.edu/~bhansen/718/NonParametrics1.pdf). The volume under the probability surface has a value of 1, even when the values of the probability density are already up to 90. The probability surface sits on an area on the x–y plane of approximately 0.4 × 0.4, making the total volume under the surface 1.
While this distribution gives a visual and intuitive characterization of behavioral variability in two dimensions, for statistical tests we found it advantageous to reduce it to a single parameter measuring dispersion on the plane. The variance \( {\sigma}_x^2\left({\sigma}_y^2\right) \) measures dispersion in x(y). In more dimensions, a measure of dispersion that takes into account the covariance structure is the hypervolume that the distribution occupies in space. This is measured by the generalized variance, which can be computed as the determinant of the covariance matrix, |Σ| [16]. To gain intuition, note that generalized variance reduces in two dimensions to \( {\sigma}_x^2{\sigma}_y^2\left(1-{\rho}^2\right) \), with ρ the Pearson correlation. When the two variables have no correlation ρ = 0 and generalized variance is maximal. Correlation makes ρ closer to 1 and the two variables closer to a line, making dispersion on the plane and generalized variance smaller. For statistical tests, we use the generalized variance of the behavior (activity/radial index) of populations composed by at least 24 individuals (Additional file 2: Table S1), and additionally the standard deviation for each parameter (Additional file 2: Table S2).
ChIP-seq, conventional ChIP, and reChIP analyses
Chromatin immunoprecipitation was obtained from pooled samples of at least four zebrafish larvae. Briefly, the samples were crosslinked with 1.8% formaldehyde for 30′ and then quenched with 1% glycine for 5′. Extracts were lysed using a SDS lysis buffer (50 mM Tris-HCl pH 8.1, 1% SDS, 10 mM EDTA) for 30′ at 4 °C, and then diluted with a dilution buffer (6.7 mM Tris-HCl pH 8.1, 0.01% SDS, 1.2 mM EDTA, 1.1% Triton X-100, 167 mM NaCl). Sodium butyrate (2 mM) was added to avoid histone deacetylation activity during the preparation. Then, the fish were sonicated with two pulses (30′′ ON/30′′ OFF) of 15′ each with the Diagenode Bioruptor. Before pre-clearing the samples with protein A/G beads, an input sample was obtained. Then, the extracts were immunoprecipitated overnight using 1 μg of anti-acetyl-Histone 4, anti-HDAC1, or anti-YY1 antibodies. Bound DNA was recovered with protein A/G beads, then washed with low-salt (120 mM Tris-HCl pH 8, 0.1% SDS, 2 mM EDTA, 1% Triton X-100, 150 mM NaCl), high-salt (120 mM Tris-HCl pH 8, 0.1% SDS, 2 mM EDTA, 1% Triton X-100, 500 mM NaCl), LiCl (10 mM Tris-HCl pH 10, 1 mM EDTA, 0.25 M LiCl, 1% NP40, 1% sodium deoxycholate) and two times with 1× TE (10 mM Tris-HCl pH 8, 1 mM EDTA) buffers, and recovered with elution buffer (1% SDS, 0.1 M NaHCO3). DNA purified samples were de-crosslinked using sodium chloride and cleared with Qiagen spin columns, and for reChIP, the samples were re-incubated with a second antibody after different elution [43], and another round of washes and elution was performed.
In the case of conventional ChIP, qPCR was used to detect differences between the samples in the target regions, using the unbound fraction (ChIP from the same sample but without antibody) as a control to normalize results.
In the case of the acH4 ChIP-seq, we prepared 12 samples (four control, four NaBu, and four random) composed of five larvae each, chosen by the following algorithm. For the control samples, we needed to obtain clusters of larvae with very similar behavior, so we performed hierarchical clustering (using Euclidean distance as the metric and average linkage criteria) of the behavior (activity and radial index) of a population of 72 larvae. The selection in the NaBu samples used the same algorithm, but using 48 larvae. Previous experiments indicated that the behavioral variability between the NaBu samples would be reduced compared to control samples (Fig. 3b). In the random experiment, fish were randomly selected from the population, obtaining samples composed of five random larvae without any behavioral consideration. We postulated that the variability across these random samples was not associated with behavior. In summary, we obtained four samples with high behavioral variability (control samples), four samples with low behavioral variability (NaBu samples), and four samples with low variability that is not associated with behavioral differences (random samples). Even when the use of pooled fish might introduce variation into the results, our previous results showing that acH4 levels are very similar between larvae with the same behavior (Additional file 1: Figure S5C) suggested that this effect would be minor.
The final samples were processed at the Genomics Unit at the Scientific Park of Madrid. Libraries were built and the samples were sequenced using an Illumina GAIIX. Reads were aligned to Danio rerio genome sequence (Zv9) with BWA, and the results were subjected to the MACS peak detection algorithm [44]. Afterwards, peaks from the different samples were merged and quantified separately as fragments per kilobase per million reads (FPKM) using the DiffBind package [45], obtaining 27,310 peaks in control samples and 33,649 peaks in NaBu samples, with 12,419 peaks shared by both conditions. Finally, EdgeR [46] was applied to detect differential binding between control and NaBu samples. The candidate peaks with higher histone H4 acetylation in NaBu compared to control (P < 0.001) were further filtered using the random samples. Specifically, we removed the regions with higher variability (measured with the coefficient of variation) in random samples, as this variability is not associated with behavior.
Gene Ontology and transcription factor binding site analysis
Position of the candidate peaks obtained in the ChIP-seq was retrieved in order to study their position relative to near genes. Nearest genes were retrievedand analyzed for Gene Ontology using DAVID [47] as in the case of RNA-seq targets. In addition, the candidate regions located near the TSS of a gene were used to predict enriched DNA motifs and their potential biological activity with the MEME suite [29].
Reagents and antibodies
Sodium butyrate (NaBu), trichostatin A (TSA), cambinol (Cmb), and 5-aza-2′-deoxycytidine (AZA) were purchased from Sigma-Aldrich (#303410, #T8552, #C0494, and #A3656) and dissolved in phosphate-buffered saline (PBS) to final concentrations of 2 mM, 0.1 μM, 0.2 μM, and 15 mM in the fishes’ water, respectively. PBS alone was used as vehicle control. The pharmacological treatment lasted for 24 h from 7 to 8dpf. Acetyl-histone 4 and acetyl-lysine antibodies were obtained from Millipore (#06-866 and #05-515), anti-HDAC1 and anti-YY1 from ActiveMotif (#39531 and #61780), anti-Sp1 and anti-H4 from Abcam (ab59257 and ab16483), and McrBC enzyme from New England Biolabs (M0272).
Immunoprecipitation and western immunoblotting
Groups of 20 fish (control, NaBu-treated, hdac1 +/−, and yy1 +/−) were frozen at 8 dpf. They were lysed in a solution containing 100 mM Tris-HCl pH 7.5, 20 mM NaF, 2 mM DTT, 2 mM EGTA, 2 mM EDTA, 1 mM sodium orthovanadate, 0.54 M sucrose, 0.2 mM phenyl-methyl sulfonyl fluoride, 2% X-100 Triton, ß-mercaptoethanol, and 4 μg/ml complete protease inhibitor cocktail (Roche, #11836153001). For each condition/treatment an aliquot of 1 mg protein was incubated with 2 μg of anti-YY1 antibody and 30 μl of protein-A/G plus Sepharose beads overnight at 4 °C. Beads were then washed five times with washing buffer (20 mM Tris-HCl pH 7.4, 50 mM NaCl, and 4 μg/ml complete protease inhibitor cocktail). Immunoprecipitated proteins were analyzed by western blotting using anti-acetylated lysine antibody and YY1 antibody.
RNA isolation, qPCR quantification, and RNA-seq
Total RNA was isolated using homogenized extracts from at least five fish per sample by RNeasy Mini purification (Qiagen, #74104). Retrotranscription was done with iScript (Bio-Rad, #1708891) following the manufacturer’s recommendations. Finally, quantification of the target genes was measured using qPCR with specific primers. Values were normalized using the GAPDH results obtained from the same sample, and P values obtained by using Student’s t-test. In the case of RNA-seq, three control and three NaBu samples were obtained using the same algorithm described for the ChIP-seq experiments. RNA samples were processed at the Genomics Unit at the Scientific Park of Madrid to generate libraries, and raw reads were obtained using Illumina GAII. Afterwards, the reads were aligned to the Danio rerio genome (Zv9) using BWA, and transcript counts were normalized to TMM (trimmed mean of M-values) scores in order to be able to compare the gene expression across samples. Then, the NoiseqBIO algorithm from Noiseq [32] was used to detect significantly (adjusted P value < 0.1) differentially expressed genes in the two groups with three biological replicates each.
Quantification of histone acetylation levels
Eighteen clusters of five fish from a total population of 90 were obtained from the behavioral space (activity/radial index) using an ad hoc algorithm. First, 18 centroids were randomly chosen, and five individuals were assigned to the nearest (not occupied) centroid. Then, centroids were redefined using the average values of the new clusters, and a new round of assignment of the fish to the centroids was done. This iteration was repeated until the centroids were stable. Then, total histones were extracted using an Epigentek kit (OP-0006) and quantified by Pierce Coomasie Plus reagent (Thermofisher, #27236) in order to use the same amount of total histones in each sample. Acetyl-H4, acetyl-H3 and specific acH4 marks (H4K5, H4K8, H4K12, and H4K16) were quantified by ELISA following the manufacturer’s recommendations using the Epigentek kits P-4009, P-4008, and P-3102, respectively.
Quantification of methylated DNA
DNA methylation was quantified using larval DNA digested by MCrBC enzyme as previously done [48] following the kit instructions.
Statistical analysis
In the case of linear correlation between behavioral parameters or across days, the statistical tests assessed the null hypothesis that the correlation coefficient is equal to 0, using two-sided t-tests. In the case of correlation between bulk histone acetylation and behavioral parameters (distance to the average or angle to the average), a similar approach was performed.
When the differences between the inter-individual variability and the intra-individual variability of a behavioral parameter were assessed, the set of 30-s fragments of the trajectories of all the individuals were shuffled. Then, this random inter-individual and intra-individual variability was calculated (using the CV) and we compared these permutated values to the observed in the real dataset. Finally, by doing this process 1000 times, we were able to obtain a P value (the proportion of times in which the final values were higher in the case of permutation datasets) for testing the hypothesis that the inter-individual variability is equal to the intra-individual variability.
In the case of the analysis of changes in behavioral inter-individual variability between two groups, we compared the difference between the generalized variance of these groups and the difference of two groups obtained after permuting the datasets. Specifically, we shuffled the individuals between the two groups, generating random groups of individuals. Then, we obtained the difference between the generalized variance in these two random groups and compared it to the real value. By doing this 1000 times, we obtained a P value for testing the hypothesis that the generalized variance of the two groups is the same.
To test acetylated histone or methylated DNA differences between two groups, the levels of these groups were compared using two-sided t-tests. In the case of ChIP, reChIP, and gene expression analyses, the statistical tests to compare the difference between two conditions were conducted by calculating the representative parameter (fold change compared to a control, gene expression ratio between two groups), and P values were obtained using two-sided t-tests.
For motif finding, the MEME algorithm [29] was used to estimate an E-value of each motif, a parameter that quantifies the log-likelihood ratio of the motif depending on its size and the letter composition of the background.
In the case of ChIP-seq analysis, a P value was extracted using the EdgeR algorithm, which calculates an exact test using the quantile-adjusted conditional maximum likelihood following a negative binomial model on the normalized counts of the samples.
All the experiments (except ChIP-seq and RNA-seq) were done at least three times with different biological datasets, and P values were calculated using the three replicates. Figures related to behavior show a representative experiment of the triplicate, while molecular analyses show the average result with error bars representing the standard deviation. MATLAB and R were used for all the computations and the statistical analysis.