Experimental and simulated Hi-C datasets for performance evaluation
We performed two replicate Hi-C experiments on cells from 13 immortalized human cancer cell lines from a variety of tissues and lineages using HindIII and DpnII restriction enzyme digestion (Additional file 1: Table S1). After aligning and filtering the paired-end sequencing reads, we obtained 10 to 61 million paired reads per experiment for 11 cell types (generated using HindIII) and more than 400 million paired reads for the remaining two deeply sequenced cell types (generated using DpnII). These Hi-C interactions serve as a readout of three-dimensional proximity of the corresponding genomic loci. The interactions are binned into fixed-sized bins, and a count of the number of Hi-C interactions that connect each pair of bins is stored in a Hi-C contact matrix. Unless otherwise noted, we used 40-kb bins because this value achieves reasonable sparsity of the Hi-C contact matrices, based on the depth of sequencing of the datasets used in our study. Also, this resolution has been adopted in multiple previous studies [7, 8]. We use the resulting Hi-C matrices as input to every reproducibility and quality control analysis in this study, except where indicated.
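The binning step described above can be illustrated with a minimal sketch. It assumes that the filtered read pairs for one chromosome are available as pairs of base-pair coordinates; the function name `bin_contacts` and the toy coordinates are hypothetical and are not part of our pipeline.

```python
from collections import Counter

def bin_contacts(read_pairs, bin_size=40_000):
    """Count intrachromosomal read pairs per pair of fixed-size bins.

    read_pairs: iterable of (pos1, pos2) base-pair coordinates on one chromosome.
    Returns a Counter keyed by (i, j) with i <= j (upper triangle of the matrix).
    """
    matrix = Counter()
    for pos1, pos2 in read_pairs:
        # Map each read end to its bin index and store the pair in sorted order.
        i, j = sorted((pos1 // bin_size, pos2 // bin_size))
        matrix[(i, j)] += 1
    return matrix

# Toy example with four read pairs and 40-kb bins (illustrative values only).
pairs = [(10_000, 95_000), (30_000, 85_000), (120_000, 15_000), (50_000, 60_000)]
contacts = bin_contacts(pairs)
```

Storing only the upper triangle keeps the representation sparse, which matters at fine resolutions where most bin pairs receive no contacts.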
For use in assessing reproducibility and quality measures for Hi-C data, we designed a model for simulating noisy Hi-C experiments (Fig. 1a). Our noise model aims to simulate a contact matrix from a Hi-C experiment performed on chromatin that lacks any high-order structure, such as loops and topologically associating domains. For this purpose, our simulation models two main phenomena: the “genomic distance effect,” i.e., the higher prevalence of crosslinks between genomic loci that are close together along the genome [1], and random ligations generated by the Hi-C protocol [24]. For the first phenomenon, we use real Hi-C data, and we sample from the empirical marginal distribution of counts as a function of genomic distance. The second phenomenon, random ligation noise, is modeled by generating Hi-C interactions between random bin pairs (see the “Methods” section for details). Counts generated by these two “noise” components of the model can be mixed with different proportions to produce simulated “pure noise” Hi-C matrices. We then mix the simulated contacts with experimental contact matrices in varying proportions to obtain noise-injected matrices.
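As an illustration of the two noise components, the following sketch generates a "pure noise" matrix from an experimental matrix. It is a simplified approximation under stated assumptions, not the exact procedure given in the "Methods" section: the genomic distance component draws a distance in proportion to the observed contacts at that distance and places it uniformly along the chromosome, and the random ligation component draws uniformly random bin pairs. The function name `simulate_noise` and its parameters are hypothetical.

```python
import random
from collections import defaultdict

def simulate_noise(real, n_bins, total, gd_fraction, rng):
    """Build a 'pure noise' matrix containing `total` contacts.

    real: dict {(i, j): count} with i <= j, an experimental contact matrix.
    gd_fraction: share of contacts from the genomic distance component;
    the remainder is random ligation noise.
    """
    # Empirical distribution of contacts over genomic distance d = j - i.
    by_distance = defaultdict(int)
    for (i, j), count in real.items():
        by_distance[j - i] += count
    distances = list(by_distance)
    weights = [by_distance[d] for d in distances]

    noise = defaultdict(int)
    n_gd = round(total * gd_fraction)
    # Genomic distance noise: sample a distance, then a uniform start bin.
    for d in rng.choices(distances, weights=weights, k=n_gd):
        i = rng.randrange(n_bins - d)
        noise[(i, i + d)] += 1
    # Random ligation noise: uniformly random bin pairs.
    for _ in range(total - n_gd):
        i, j = sorted((rng.randrange(n_bins), rng.randrange(n_bins)))
        noise[(i, j)] += 1
    return dict(noise)

# Toy example: one-third genomic distance noise, two-thirds random ligation.
rng = random.Random(0)
real = {(0, 0): 5, (0, 1): 3, (1, 2): 2, (0, 3): 1}
noise = simulate_noise(real, n_bins=4, total=300, gd_fraction=1 / 3, rng=rng)
```

Because the genomic distance component discards the position of each contact and keeps only its distance, the simulated matrix preserves the distance decay while removing structure such as loops and domains.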
In addition to noise, we tested the effects of sparsity and the resolution of Hi-C matrices on the performance of each method. We profiled the effects of sparsity explicitly by downsampling real Hi-C matrices to contain a fixed total number of intrachromosomal Hi-C interactions. Binning resolution further controls the sparsity of a Hi-C matrix, at the same time dictating the scale of chromatin organization that can be observed in a Hi-C matrix. By binning deeply sequenced Hi-C datasets containing at least 400 million intrachromosomal Hi-C interactions from two cell types, we generated Hi-C matrices binned at high, mid, and low resolutions (10 kb, 40 kb, and 500 kb, respectively) and used these to investigate the effect of resolution on each method as well (Additional file 1: Table S1). A schematic of the full range of datasets used in this study to validate each method is shown in Fig. 1b.
Measures for quality and reproducibility of Hi-C data
Four recently developed methods for measuring the quality and reproducibility of Hi-C experiments were assessed in this study (Fig. 1c). HiCRep [34], GenomeDISCO [35], HiC-Spector [33], and QuASAR-Rep [36] measure reproducibility, and QuASAR-QC measures quality of Hi-C data. The four reproducibility methods we evaluate employ a variety of transformations of the Hi-C contact matrix. HiCRep stratifies a smoothed Hi-C contact matrix according to genomic distance and then measures the weighted similarity of two Hi-C contact matrices at each stratum. In this way, HiCRep explicitly corrects for the genomic distance effect and addresses the sparsity of contact matrices through stratification and smoothing, respectively. GenomeDISCO uses random walks on the network defined by the Hi-C contact map to perform data smoothing before computing similarity. The resulting score is sensitive to both differences in 3D DNA structure and differences in the genomic distance effect [35], which makes it more challenging for two contact maps to be deemed reproducible, as they have to satisfy both criteria. HiC-Spector transforms the Hi-C contact map to a Laplacian matrix and then summarizes the Laplacian by matrix decomposition. QuASAR calculates the interaction correlation matrix, weighted by interaction enrichment. The two variants of QuASAR, QuASAR-QC and QuASAR-Rep, both assume that spatially close regions of the genome will establish similar contacts across the genome, and they measure quality and reproducibility, respectively, by testing the validity of this assumption for a single replicate and for a pair of replicates.
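To make the idea of stratifying by genomic distance concrete, the sketch below computes a Pearson correlation within each off-diagonal stratum and averages the correlations weighted by stratum size. This is only an illustration in the spirit of HiCRep: the smoothing step is omitted, and weighting by stratum size is a simplifying assumption that differs from the weights HiCRep actually uses. The function names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length lists, or None if undefined."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # a constant stratum has no defined correlation
    return sxy / sqrt(sxx * syy)

def stratified_similarity(m1, m2, max_distance):
    """Size-weighted mean of per-stratum correlations of two square matrices."""
    n = len(m1)
    num = den = 0.0
    for d in range(1, max_distance + 1):
        # Stratum d holds all bin pairs separated by exactly d bins.
        x = [m1[i][i + d] for i in range(n - d)]
        y = [m2[i][i + d] for i in range(n - d)]
        if len(x) < 2:
            continue
        r = pearson(x, y)
        if r is None:
            continue
        num += len(x) * r
        den += len(x)
    return num / den if den else float("nan")

# Toy matrices: a distance-decaying matrix and a perturbed copy.
m = [[max(0, 10 - 3 * abs(i - j)) + (i * j % 3) for j in range(6)] for i in range(6)]
perturbed = [[m[i][j] + 3 * (i % 2) for j in range(6)] for i in range(6)]
same = stratified_similarity(m, m, max_distance=2)
different = stratified_similarity(m, perturbed, max_distance=2)
```

Comparing contacts only within a stratum removes the dominant distance decay from the comparison, so the score reflects agreement in structure rather than the shared decay curve.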
Reproducibility measures correctly rank noise-injected datasets
To assess the performance of the reproducibility measures, we simulated pairs of Hi-C matrices with varying noise levels. Intuitively, a good reproducibility measure should declare the least noisy replicate pair as most reproducible and the noisiest replicate pair as least reproducible. We paired a real Hi-C contact matrix with a noisier version of the same matrix using a wide range of simulated noise levels (5%, 10%, 15%, 20%, 30%, 40%, and 50%). This procedure yielded seven pairs of replicates for each of 11 different cell types. We performed this approach using two different sets of randomly generated noise matrices, using one-third genomic distance noise and two-thirds random ligation noise or vice versa. Each replicate pair was assigned a reproducibility measure by HiCRep, GenomeDISCO, HiC-Spector, QuASAR-Rep, and Pearson correlation.
Our analysis showed that all reproducibility measures were able to correctly rank the simulated datasets. Averaged over 11 different cell types, we observed a monotonic trend for all of these measures (Fig. 2a). Indeed, for every cell type and every measure, increasing the noise level always led to a decrease in estimated reproducibility (Additional file 1: Figure S1). Qualitatively, the trends in Fig. 2a suggest that QuASAR and HiCRep may be more robust to noise than the other reproducibility measures.
Comparing the two noise models, we saw less consistent trends. HiC-Spector assigned higher reproducibility scores to matrices with 66% genomic distance noise and 33% random ligation noise. GenomeDISCO showed the opposite behavior whereas QuASAR-Rep, HiCRep, and Pearson correlation gave similar scores regardless of the underlying noise proportions. This variability suggests that the various reproducibility measures exhibit different sensitivities to different sources of noise, thus potentially yielding complementary assessments of reproducibility.
Assessment using real datasets reveals differences among reproducibility measures
Inevitably, any simulation approach is only as good as its underlying assumptions; thus, we also analyzed the performance of the four reproducibility measures using real data. Specifically, we asked whether the reproducibility measures can discriminate between pairs of independent Hi-C experiments repeated on the same cell type versus pairs of experiments from different cell types. In this setup, we used three types of replicate pairs: a single pair of matrices from the same cell type (which we call “biological replicates,” although each pair represents the same cells being prepped twice, rather than two different sets of cells), pairs of matrices from different cell types (non-replicates), and pairs of matrices sampled from combined biological replicates (pseudo-replicates, see the “Methods” section for details about the generation of pseudo-replicates) [34]. We assigned a reproducibility score to every matrix pair for each measure and asked if reproducibility scores differ among replicate pair types.
Because pseudo-replicates are generated from pooled biological replicates, their variation solely stems from statistical sampling, with no biological (including distance effect) or technical variance. Therefore, we expect pseudo-replicates to exhibit the highest reproducibility. Conversely, non-replicate pairs are expected to have the lowest degree of reproducibility, because they contain all the experimental variation observed in biological replicates, as well as cell type-specific differences in 3D chromatin organization.
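The construction of pseudo-replicates can be sketched as pooling two biological replicates and randomly splitting the pooled contacts. The sketch assumes an even split in which each pooled contact is assigned to one of two pseudo-replicates with probability 0.5; the exact procedure is given in the "Methods" section, and the function name is hypothetical.

```python
import random

def make_pseudo_replicates(rep1, rep2, rng):
    """Pool two replicate matrices and split the pooled contacts at random.

    rep1, rep2: dict {(i, j): count}. Returns two dicts of the same form.
    """
    pseudo_a, pseudo_b = {}, {}
    for key in sorted(set(rep1) | set(rep2)):
        pooled = rep1.get(key, 0) + rep2.get(key, 0)
        # Assign each pooled contact independently to pseudo-replicate A or B.
        to_a = sum(rng.random() < 0.5 for _ in range(pooled))
        if to_a:
            pseudo_a[key] = to_a
        if pooled - to_a:
            pseudo_b[key] = pooled - to_a
    return pseudo_a, pseudo_b

# Toy example with two small replicate matrices.
rep1 = {(0, 1): 4, (1, 2): 2}
rep2 = {(0, 1): 3, (2, 3): 5}
pseudo_a, pseudo_b = make_pseudo_replicates(rep1, rep2, random.Random(0))
```

Because both pseudo-replicates are drawn from the same pool, any difference between them arises from sampling alone.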
In contrast to the simulation analysis, the analysis using real datasets showed distinct differences among the five methods. For each of the 11 cell types and each reproducibility measure, we assigned reproducibility scores to a single biological replicate pair, 20 non-replicate pairs, and 3 pseudo-replicate pairs (Fig. 2b). The reproducibility score of a replicate pair is the score obtained by averaging reproducibility scores assigned to each chromosome. All four reproducibility measures and the Pearson correlation can separate replicate pair types from each other (Additional file 1: Figure S2); however, the reproducibility measures generally achieved clearer separation between different replicate pair types. These differences are statistically significant according to a one-sided Kolmogorov-Smirnov test (P < 0.01). In addition to the Pearson correlation, we considered the rank-based Spearman correlation as a potential method for assessing reproducibility. We also considered using either type of correlation in conjunction with ICE normalization. The results (Additional file 1: Figure S3) show that none of these four methods successfully separates biological replicate from non-replicate pairs. Intuitively, we prefer a measure that separates non-replicates from biological replicates with a clear margin. By this measure, the HiC-Spector measure yields the largest separation, followed by HiCRep, QuASAR-Rep, and GenomeDISCO (Fig. 2b). Among them, HiC-Spector and HiCRep correctly rank all replicate types for all 11 comparisons, with a clear separation between biological replicates and non-replicates. GenomeDISCO ranks a biological replicate lower than a non-replicate for a single case out of 11. The pair of biological replicates that GenomeDISCO ranks lower than non-replicates shows a marked difference in genomic distance effect (Additional file 1: Figure S4), to which this method is sensitive [35]. 
QuASAR-Rep is able to correctly rank biological replicates above non-replicates in 7 out of 11 cases. The cell types in which it fails have only 12 to 28 million interactions, suggesting that QuASAR-Rep does not perform well when coverage is low and the resolution is set to 40 kb. However, re-analysis of the same data suggests that switching to a larger resolution (120 kb) improves QuASAR-Rep’s performance, leading to separation between replicates and non-replicates for all cell lines but two (data not shown). As expected, the Pearson correlation performs worse than the Hi-C-specific measures, ranking non-replicates higher than biological replicates in 7 cases.
Pseudo-replicate reproducibility scores provide an upper bound for each reproducibility measure. In general, these scores show similar trends to those described above. For example, the Pearson correlation scores assigned to pseudo-replicates show a relatively wide separation from the rest of the scores, even though non-replicates and biological replicates are intermingled. On the other hand, GenomeDISCO, HiC-Spector, HiCRep, and QuASAR-Rep show the desired behavior: a high degree of separation between non-replicates and biological replicates, and a relatively small separation between biological replicates and pseudo-replicates.
Reproducibility can be determined over a range of experimental coverage
To directly investigate the effects of the coverage of a Hi-C experiment on the reproducibility measures, we downsampled real Hi-C matrices to contain fewer interactions and examined the effects on the resulting reproducibility scores. We limited this analysis to real data from six cell types with higher coverage, and we subsampled each replicate multiple times to contain 1 to 30 million total Hi-C interactions (see the “Methods” section for details). These datasets were used for testing the ability of each method to distinguish among different replicate types at lower coverage levels and for explicitly profiling the dependence of reproducibility scores on coverage levels.
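Downsampling to a fixed number of interactions can be sketched as sampling contacts without replacement from the contact matrix. The sketch below expands the matrix into one entry per contact for clarity, which would be too memory-intensive for real datasets; it illustrates the idea rather than the implementation described in the "Methods" section, and the function name is hypothetical.

```python
import random

def downsample(matrix, target, rng):
    """Sample `target` contacts without replacement from a contact matrix.

    matrix: dict {(i, j): count}. Returns a dict of the same form.
    """
    # One list entry per individual contact (fine for toy data only).
    pool = [key for key, count in matrix.items() for _ in range(count)]
    if target > len(pool):
        raise ValueError("target exceeds the number of available contacts")
    sampled = {}
    for key in rng.sample(pool, target):
        sampled[key] = sampled.get(key, 0) + 1
    return sampled

# Toy example: reduce a matrix of 20 contacts to 8 contacts.
full = {(0, 0): 10, (0, 1): 6, (1, 1): 4}
sub = downsample(full, 8, random.Random(1))
```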
Hi-C reproducibility measures retained their ability to distinguish between replicate types, even at extremely low coverage levels. Visualization of the reproducibility scores revealed that the HiCRep, HiC-Spector, and GenomeDISCO measures successfully separate non-replicates from biological replicates even with only 5 million Hi-C interactions, a feat that Pearson correlation cannot achieve at even the highest coverage level (Fig. 2c). QuASAR-Rep can successfully separate biological replicates from non-replicates at 25 and 30 million interactions but fails to distinguish them when coverage is lower than 20 million interactions, consistent with the results from Fig. 2b. As before, pseudo-replicate pairs continue to serve as an upper bound for reproducibility measures. However, the separation between pseudo-replicates and biological replicates is reduced at lower coverage levels, and so is the separation between biological replicates and non-replicates. Furthermore, this analysis suggests we can infer empirical thresholds for these reproducibility measures that can effectively separate all biological replicates from non-replicates at a given coverage level, as explained in the “Methods” section. These empirical thresholds, selected as the midpoint between the most reproducible non-replicate pair and the least reproducible replicate pair, are shown as dashed lines in Fig. 2c and can be found in Additional file 1: Table S2.
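The midpoint rule for the empirical thresholds can be written as a short sketch. The function name and the example scores are hypothetical; the scores are chosen only to illustrate the calculation and are not taken from Additional file 1: Table S2.

```python
def empirical_threshold(replicate_scores, non_replicate_scores):
    """Midpoint between the least reproducible replicate pair and the
    most reproducible non-replicate pair, or None if the groups overlap."""
    lowest_replicate = min(replicate_scores)
    highest_non_replicate = max(non_replicate_scores)
    if highest_non_replicate >= lowest_replicate:
        return None  # no single threshold separates the two groups
    return (lowest_replicate + highest_non_replicate) / 2

# Illustrative scores at one coverage level.
threshold = empirical_threshold([0.875, 0.75], [0.25, 0.5])
overlapping = empirical_threshold([0.875, 0.4], [0.25, 0.5])
```

A threshold exists only when every replicate pair scores above every non-replicate pair, which is why the sketch returns no value when the two groups overlap.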
Consistent with the trends observed in the analysis of real datasets, the reproducibility of downsampled replicate pairs exhibits a dependence on sequencing depth. We observe that reproducibility scores associated with biological replicates become significantly smaller as coverage decreases, according to a one-sided Wilcoxon signed rank test (P < 0.05, Additional file 1: Figure S5). The HiCRep, GenomeDISCO, QuASAR-Rep, and Pearson correlation scores exhibit a statistically significant drop for every level of coverage. In contrast, reproducibility scores from HiC-Spector only start to significantly and consistently decay below 20 million interactions, exhibiting a lesser degree of dependence on the coverage level. This may be because the leading eigenvectors used by HiC-Spector tend to capture local or mesoscopic structures, which are less likely to be affected by coverage. Despite varying levels of dependence on coverage, downsampling analysis convincingly shows that all measures exhibit a dependence on coverage. Thus, coverage of different replicate pairs must be factored into reproducibility analyses, especially for comparative purposes.
Reproducibility measures are robust to changes in resolution
The resolution of a Hi-C matrix effectively dictates the scale of 3D organization observable from the data: a low-resolution matrix can only reveal compartments and TADs [1, 8], whereas high-resolution matrices reveal additional finer scale structures like chromatin loops [11]. To investigate the effect of resolution on reproducibility, we used deeply sequenced Hi-C replicates with at least 400 million intrachromosomal interactions generated from the HepG2 and HeLa cell lines. From these data, we generated real and simulated replicate pairs at 10-kb, 40-kb, and 500-kb resolution, and we measured the reproducibility of each replicate pair.
HiCRep, GenomeDISCO, HiC-Spector, QuASAR-Rep, and Pearson correlation accurately measure reproducibility at both high and low resolutions. The four Hi-C-specific methods can correctly rank pseudo, biological, and non-replicate pairs at 10-kb, 40-kb, and 500-kb resolutions (Fig. 3a) with a clear margin between biological replicate and non-replicate pairs. Surprisingly, we found that the Pearson correlation can correctly rank replicate types for these deeply sequenced datasets. Notably, the reproducibility scores from the four methods are largely independent of resolution. While GenomeDISCO and especially QuASAR-Rep exhibit some dependence on resolution, assigning lower reproducibility scores to replicates with lower coverage, they maintain a clear boundary with large margins between biological and non-replicates at all resolutions. However, the Pearson correlation exhibits a larger degree of dependence on resolution for all replicate pair types and maintains relatively smaller margins between non-replicate and biological replicate pairs. Simulated datasets further validate that reproducibility scores from each method decrease with increasing levels of noise at 10-kb, 40-kb, and 500-kb resolution (Fig. 3b).
Next, we used deeply sequenced datasets to further investigate the effect of coverage on reproducibility scores of biological replicates at three resolution levels using a wider range of coverage values (30, 60, 120, 240, and 400 million intrachromosomal interactions). For HiCRep, QuASAR-Rep, and GenomeDISCO, we observed that reproducibility scores tend to plateau at 240 million interactions at 10-kb and 40-kb resolutions, whereas reproducibility scores of 500-kb resolution matrices benefit little from higher coverage (Fig. 3c). Consistent with our previous observations, HiC-Spector exhibits a lower degree of dependence on coverage, with scores reaching maxima at 120 million interactions. Overall, the four Hi-C reproducibility measures exhibit robustness to coverage and resolution differences, as measured by their ability to distinguish between replicate and non-replicate pairs.
Next, we tested whether the reproducibility measures can be used to select empirically the optimal resolution for a Hi-C dataset. Although resolution strongly influences almost every downstream analysis of Hi-C data, this parameter is generally set in an ad hoc fashion. To explore the performance of the measures as a function of the resolution parameter, we binned four pairs of biological replicates at increasingly high resolutions (40 kb, 20 kb, 10 kb, and 5 kb) and asked if the reproducibility scores of biological replicates decay significantly at higher resolutions. We chose six samples generated using HindIII with coverage values ranging from 15 million to 60 million interactions and two samples generated using DpnII with coverage of ~ 400 million interactions.
We observed that the four reproducibility measures show variable trends in how reproducibility scores assigned to biological replicates decay with respect to increasing resolution (Additional file 1: Figure S6). For HiCRep, GenomeDISCO, and QuASAR-Rep, the HindIII replicates (A549, G410, and LNCaP) exhibit a decay in reproducibility scores, whereas the scores assigned to replicate pairs generated by DpnII (HepG2) are more robust to changes in resolution. Notably, for these three reproducibility measures, the degree of decay also correlates with the sequence coverage of the data. For HiC-Spector, we do not observe consistent trends. These observations generally support the idea that deeply sequenced replicates generated by a 4-cutter such as DpnII can support resolutions higher than 40 kb, whereas relatively shallow replicates (< 100 million read pairs) generated using a 6-cutter are not suitable for binning resolutions higher than 40 kb. However, given the lack of a clear elbow or maximum in Additional file 1: Figure S6, we do not recommend using reproducibility scores to attempt to select an appropriate resolution.
Finally, we compared the run times of each reproducibility measure, using a large number of pairs of chromosome 21 contact matrices binned at 40-kb resolution. As seen in Additional file 1: Figure S7, QuASAR-Rep achieves the fastest median running time (0.82 s), followed by HiC-Spector (2.76 s), GenomeDISCO (5.77 s), and HiCRep (9.00 s).
Reproducibility measures accurately quantify reproducibility of Hi-C data from non-human genomes
We investigated whether the four Hi-C reproducibility measures can be applied to data derived from a non-human genome. We wanted to investigate a genome that is markedly different from human, but replicate Hi-C experiments in organisms other than human and mouse are rare. We used Hi-C data from Ramirez et al., which has two biological replicates from two cell types (clone-8 and S2) from the fruitfly Drosophila melanogaster [37]. The fruitfly genome is approximately 18 times smaller than the human genome. For this analysis, we binned the Hi-C matrices at 10 kb and compared the reproducibility of the four large, non-heterochromatic chromosomes in Drosophila (chromosomes 2, 3, 4, and X). As before, we assigned reproducibility scores to each replicate pair and each non-replicate pair. The results show that biological replicate pairs are clearly separated from non-replicates for each measure in both cell types (Fig. 2d). Furthermore, for three out of the four reproducibility measures, the empirical thresholds that we inferred from the human Hi-C data (shown as dashed lines in Fig. 2d) generalize to the fruitfly genome.
Noise reduces the consistency and the prevalence of higher order structures in Hi-C matrices
Having investigated four different methods for evaluating the reproducibility of a given pair of Hi-C matrices, we now focus on methods for evaluating the quality of a single Hi-C matrix. As before, we perform this evaluation by injecting noise into real Hi-C data, producing a collection of 88 matrices corresponding to 11 cell types and 8 different noise profiles (see the “Methods” section). Among our four Hi-C reproducibility measures, only one (QuASAR) provides a variant, QuASAR-QC, that assesses the quality of a single matrix. The procedure yields a single, bounded summary statistic indicative of homogeneity of the underlying sample population and the signal-to-noise ratio of the interaction map. In addition to QuASAR-QC analysis, we profiled two well-known features of 3D organization: statistically significant long-range contacts [38, 39], which include DNA loops, and topologically associating domains (TADs). Intuitively, we expect that significant contacts and TADs should be harder to detect in noisy matrices and that such matrices should have a lower degree of consistency.
Our analysis suggests that QuASAR-QC is indeed sensitive to the noise and the coverage of a Hi-C matrix. For each simulated Hi-C matrix from 11 cell types, QuASAR-QC detects a perfectly monotonic relationship between the noise level and the consistency of the matrix (Fig. 4a). The same trend is observed in deeply sequenced HepG2 and HeLa cell types at 10-kb, 40-kb, and 500-kb resolutions (Additional file 1: Figure S8). Although the majority of noise-free combined replicates are assigned a QuASAR-QC score ranging from 0.05 to 0.07, three cell types have strikingly lower QuASAR-QC scores ranging from 0.02 to 0.03. The Hi-C matrices from these three cell types (LNCaP, SKNDZ, SKNMC) contain fewer Hi-C interactions. Thus, the lower consistency scores are likely partially due to the sparsity that results from low experimental coverage (Additional file 1: Table S1). Furthermore, investigation of contact probabilities at given genomic distances for each cell type revealed that the three cell types with lower QuASAR-QC scores have significantly higher contact probabilities at genomic distances larger than 50 Mb (Additional file 1: Figure S9). Because such long-range contacts are unlikely to occur due to the organization of chromatin, it is likely that they represent random ligation of uncrosslinked DNA fragments, which is a known source of noise in a Hi-C experiment [24]. Thus, the QuASAR-QC measure is potentially sensitive to both the level of simulated noise and the differences in the level of inherent noise that each combined replicate contains.
Statistically significant mid-range (50 kb–10 Mb) interactions are depleted in noisy Hi-C matrices. We identified statistically significant Hi-C contacts using Fit-Hi-C [38] for each of the Hi-C matrices that make up our simulated dataset. Because robust identification of such contacts requires deeply sequenced datasets that contain large numbers of Hi-C interactions, we chose to use a somewhat liberal false discovery rate threshold of 0.05 to facilitate discovery of statistically significant contacts. We observed that 8 out of 11 cell types exhibit a perfect or near-perfect anti-correlation between the injected noise percentage and the total number of significant interactions (Fig. 4b). For the other three cell lines (LNCaP, SKNDZ, SKNMC), Fit-Hi-C identifies almost no significant contacts with or without any noise injection, further supporting the conclusion that these Hi-C datasets have low quality. These three cell lines are also the cell lines that have the lowest QuASAR-QC scores, so the two independent analyses corroborate each other. For the two deeply sequenced datasets (HepG2 and HeLa), we observed a similar trend at both 10-kb and 40-kb resolutions, with a higher number of significant mid-range contacts due to the higher coverage, as expected (Additional file 1: Figure S10).
Surprisingly, we found that topologically associating domain detection is highly robust to noise. We identified TADs using the insulation score [5, 40] method for the 88 simulated matrices, and we characterized the changes in the total number of TADs and TAD size distribution and the changes to TAD boundaries with respect to the noise level. The total number of identified TADs and their size distribution are only altered at the highest level of noise injection (Additional file 1: Figures S11 and S12). In addition, TAD boundaries in the original replicate and in the noise-injected matrices exhibit the same degree of variation as that observed between two biological replicates, further supporting the idea that TAD boundaries identified with the insulation score approach are highly robust to noise (Fig. 4c, Additional file 1: Figure S13).
Quality control measures require different levels of experimental coverage
Continuing our assessment of Hi-C quality measures, we used downsampled Hi-C matrices to investigate the relationship between experimental coverage and each QC measure using a similar setup as before (see the “Methods” section).
Quality control metrics exhibit a predictable dependence on the coverage of Hi-C matrices. For each of the six cell types we downsampled, we observed that QuASAR-QC scores are lower for Hi-C matrices with fewer interactions (Fig. 4d). We observe the same trend for deeply sequenced matrices at 10-kb and 40-kb resolutions; however, QuASAR-QC scores at 500 kb tend to benefit less from deeper coverage, likely because coarse resolutions do not require large numbers of Hi-C interactions (Additional file 1: Figure S14). Similarly, the number of statistically significant long-range interactions also decreases as we reduce the number of total Hi-C interactions. However, the number of significant interactions decreases at a much higher rate: even at 15 million interactions, most cell lines lose the majority of significant interactions (Fig. 4e). Larger numbers of significant interactions are detected in deeply sequenced datasets, due to added statistical power, but a similar relationship between coverage and number of significant contacts is observed at both 10-kb and 40-kb resolutions (Additional file 1: Figure S15). Conversely, we found that TADs detected by insulation score are robust to low coverage levels. Using the same approach as for the noise-injected datasets, we found that the total number of TADs and their size distribution are not altered by lower coverage (Additional file 1: Figures S16 and S17). Indeed, the distances between TAD boundaries identified at lower coverage and those in the original replicates only differ from the baseline distribution at 10 million or fewer interactions (Fig. 4f, Additional file 1: Figure S18).
Quality control measures are consistent with mapping statistics
To further validate the performance of the quality control measures at our disposal, we investigated the relationship between the QuASAR-QC scores assigned to real Hi-C matrices and various read-mapping statistics that have been used previously to evaluate Hi-C data quality [24]. The four statistics we compared against are the percentages of fragment pairs that can be mapped uniquely to the genome (aligned pairs), fragment pairs from the same restriction fragments (invalid pairs), intrachromosomal interactions (intrachromosomal percentage), and fragment pairs that are repeated in the dataset (PCR duplicate rate).
Overall, we observe varying degrees of correlation between the quality control measures and the mapping statistics for biological replicates. The percentage of aligned pairs is correlated with higher quality experiments, consistent with what one would intuitively expect from high-quality sequencing libraries (Fig. 5a). The percentage of invalid pairs is also weakly anti-correlated with QuASAR-QC scores, consistent with the fact that invalid pairs represent uninformative Hi-C interactions (Fig. 5b). However, we observed the highest degree of correlation between QuASAR-QC scores and intrachromosomal percentage (Fig. 5c). In a typical Hi-C experiment, a portion of interchromosomal interactions result from random ligation of non-crosslinked fragments; thus, a significant enrichment of interchromosomal interactions, which results in a depletion of intrachromosomal interactions, indicates a low-quality Hi-C experiment. In particular, six biological replicates with lower than 30% intrachromosomal interactions have the lowest QuASAR-QC scores; these replicates are from the LNCaP, SKNDZ, and SKNMC cell types. Analysis of downsampled data shows that this effect is not simply due to the overall lower sequencing depth of these three replicates (Additional file 1: Figure S19). These replicates were also identified to have lower quality in our simulation studies (Fig. 4a) and are depleted for significant mid-range interactions, establishing the consistency of quality control measures overall. We note that this finding is consistent with the previously suggested range of 40–60% intrachromosomal interactions for high-quality experiments [24]. The PCR duplicate rate is uncorrelated with QuASAR-QC. Note that the PCR duplicate rate may be influenced by overall coverage, which we have not controlled for in this experiment. Nonetheless, even for sets of experiments with very similar coverage (red dots in Fig. 5d), we observe very little correlation.