From: Computational methods for chromosome-scale haplotype reconstruction
| Approach | Tools | Data | Advantages | Disadvantages |
|---|---|---|---|---|
| Reference-based phasing | | | | |
| Molecular haplotyping | | Long reads such as PacBio, Hi-C of individual | Can phase de novo and rare variants | Limitations in complex regions such as centromeres and HLA |
| Single-cell phasing | | Single-cell short reads | High precision at the single-cell level, detection of rare alleles | Engineering tricks required to scale to > million cells |
| Polyploid phasing | HapTree [50], Hap10 [51], WhatsHap-polyphase [52], H-PoP [53] | Local phasing | Can phase de novo and rare variants | Limitations in repetitive regions and not optimized for ploidy > 5 |
| De novo assembly | | | | |
| Diploid assembly | | Long reads and Hi-C of individual | Local phased contigs | No chromosome-scale assembly and computationally expensive |
| | | Long reads and Hi-C of individual | Chromosome-scale diploid assembly | Collapsed assembly not suitable for repetitive regions |
| | | HiFi reads of individual | High consensus accuracy and continuity | No chromosome-scale assembly |
| | pstools | HiFi and Hi-C reads | High-quality chromosome-scale haplotype assembly | Only designed for haplotyping diploids |
| | | Long reads of trios | Local phased contigs | Require family information |
| Polyploid assembly | | Long reads of individual | Local phased contigs | Need to be optimized for whole genomes |
| | POLYTE [62] | Illumina short reads | Local phased contigs | Does not scale well to whole genomes |
| Strain-resolved metagenome assembly | | | | |
| De novo (re-) assembly | | Metagenome short reads | No prior knowledge required | Low sensitivity: rare haplotypes can remain undetected |
| | OPERA-MS [65] | Metagenome using short and long reads | High continuity | Computationally expensive |
| SNV-based assembly | | Metagenome short reads | Computational efficiency | Assembly accuracy depends on variant calling |
| Read binning | MetaMaps [69] | Metagenome long reads | Computational efficiency | Accuracy depends on database |
| Contig binning | | Metagenome short reads and Hi-C | Reference-free, ability to link plasmids to host chromosome | Multiple technologies necessary (Hi-C + shotgun sequencing) |