Fig. 1 | Genome Biology

Fig. 1

From: Genetic–epigenetic interactions in cis: a major focus in the post-GWAS era

Approaches for mapping mQTLs and hap-ASM DMRs. Haplotype-dependent allelic methylation asymmetry (hap-ASM) can be assessed using two different approaches: methylation quantitative trait locus (mQTL) analysis and direct hap-ASM analysis. The mQTL approach is based on correlations of (biallelic) net methylation with genotypes across individuals, whereas sequencing-based approaches are based on direct comparisons between alleles in single (heterozygous) individuals.

a To identify mQTLs, correlations between single nucleotide polymorphism (SNP) genotypes and net methylation at nearby CpGs are measured in groups of samples. Methylation and genotyping data are generated in separate assays, which are usually array-based, and correlations are computed using linear regression or Spearman’s rank correlation. The mQTLs are defined using the q value (false discovery rate [FDR]-corrected p value), the effect size (β value), and the goodness of fit of the linear model (R²). An example of an mQTL in the S100A gene cluster [49] is shown. The genotype of the index SNP, rs9330298, correlates with the methylation at cg08477332 by stringent criteria (β > 0.1, R² > 0.5, q value < 0.05). The lack of correlation between the index SNP and more distant CpGs corresponds to a discrete hap-ASM region spanning approximately 1 kb.

b Hap-ASM is analyzed directly, using targeted bis-seq or whole genome bisulfite sequencing (WGBS) in single individuals. Deep long-read sequencing is desirable to generate reads that cover both CpG sites and common SNPs, because the statistical power depends on the number of reads per allele. Alignment is performed against bisulfite-converted reference genomes, for example using Bismark [169], BSMAP [170], or Bison [171]. Alignment against personalized diploid genomes (constructed using additional genotyping data) or SNP-masked reference genomes can decrease alignment bias toward the reference allele.
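As an illustration of the mQTL criteria described for panel a, the following is a minimal sketch, not the authors' pipeline. It assumes genotypes are coded as allele dosage (0, 1, 2) and methylation as fractions between 0 and 1; the function names and the toy values are hypothetical. It fits methylation on genotype by ordinary least squares and applies the slope and R² thresholds quoted in the caption. Applying the threshold to the absolute slope is an assumption, and the q value (FDR correction across all tested SNP-CpG pairs) is not computed here.

```python
from typing import Sequence, Tuple


def mqtl_fit(genotypes: Sequence[int],
             methylation: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of methylation on genotype dosage.

    Returns (slope, r_squared) for one SNP-CpG pair.
    """
    n = len(genotypes)
    if n != len(methylation) or n < 3:
        raise ValueError("need matched vectors with at least 3 samples")

    mean_g = sum(genotypes) / n
    mean_m = sum(methylation) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((m - mean_m) ** 2 for m in methylation)
    sxy = sum((g - mean_g) * (m - mean_m)
              for g, m in zip(genotypes, methylation))

    if sxx == 0 or syy == 0:
        raise ValueError("genotype or methylation has no variance")

    slope = sxy / sxx                      # effect size per allele copy
    r_squared = sxy ** 2 / (sxx * syy)     # goodness of fit of the line
    return slope, r_squared


def passes_effect_filters(slope: float, r_squared: float,
                          min_slope: float = 0.1,
                          min_r2: float = 0.5) -> bool:
    """Apply the caption's effect-size and R² thresholds.

    Assumption: the slope threshold is applied to the absolute value.
    The q value filter is deliberately left out of this sketch.
    """
    return abs(slope) > min_slope and r_squared > min_r2


if __name__ == "__main__":
    # Hypothetical toy data: nine samples, three per genotype class.
    dosage = [0, 0, 0, 1, 1, 1, 2, 2, 2]
    meth = [0.20, 0.25, 0.22, 0.45, 0.50, 0.48, 0.75, 0.70, 0.78]
    slope, r2 = mqtl_fit(dosage, meth)
    print(slope, r2, passes_effect_filters(slope, r2))
```

In a real analysis this fit would be repeated for every SNP-CpG pair within a chosen distance, with p values from the regression corrected for multiple testing before the q value threshold is applied.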
Quality control (QC) filtering is based on Phred score, read length, duplicates, number of mismatches, ambiguous mapping, and number of reads per allele. CpG SNPs can be tagged or filtered out by intersecting CpG and common SNP coordinates. After alignment and QC of the bis-seq data, SNP calling is performed, for example using BisSNP [172]. For C/T and G/A SNPs, the distinction between the alternative allele and bisulfite conversion is possible only on one of the DNA strands (the G/A strand). Methylation levels are determined separately for the two alleles, both for individual CpGs and for groups of CpGs in genomic windows, and compared using, for example, Fisher’s exact test or the Wilcoxon test, respectively. Both p values (raw and corrected) and effect size metrics (number of significant CpGs in the DMR and methylation difference across all covered CpGs) are used to define hap-ASM regions.

c Example of a hap-ASM DMR, located downstream of the KBTBD11 gene [49]. The hap-ASM region in T cells overlaps a CTCF ChIP-Seq peak. The index SNP (rs117902864) disrupts a canonical CTCF motif, as reflected by a lower position weight matrix (PWM) score associated with allele B. This result implicates allele-specific CTCF binding as a mechanism for hap-ASM at this locus. Consistent with this hypothesis, the non-human primate (NHP; rhesus macaque) sequence differs from the human reference allele (allele A) by one nucleotide (bold and underlined) that does not affect the binding affinity, and the observed methylation levels are very low in the macaque blood samples, similar to allele A in the human T cells.

PWM: position weight matrix
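The per-CpG allele comparison described for panel b can be sketched as follows. This is a hedged illustration rather than the published workflow: it assumes that reads have already been assigned to allele A or allele B and counted as methylated or unmethylated at a single CpG, and all function names and counts are hypothetical. It implements a two-sided Fisher's exact test from the hypergeometric distribution using only the standard library, and reports the allelic methylation difference as the effect size. The Wilcoxon test for windows of CpGs and the multiple-testing correction are not shown.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are alleles (A, B); columns are methylated, unmethylated reads.
    The p value sums the probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x methylated reads on allele A.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p = sum(px for px in (prob(x) for x in range(lo, hi + 1))
            if px <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def allelic_methylation_difference(a: int, b: int, c: int, d: int) -> float:
    """Methylation fraction on allele B minus that on allele A."""
    return c / (c + d) - a / (a + b)


if __name__ == "__main__":
    # Hypothetical read counts at one CpG in one heterozygous sample:
    # allele A: 2 methylated, 18 unmethylated; allele B: 17 and 3.
    counts = (2, 18, 17, 3)
    print(fisher_exact_two_sided(*counts))
    print(allelic_methylation_difference(*counts))
```

Because the test is computed from read counts, its power grows with the number of reads per allele, which is why the caption emphasizes sequencing depth and a minimum read count per allele in QC.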
