
Table 1 Current DNA methylation calling tools for Nanopore sequencing

From: DNA methylation-calling tools for Oxford Nanopore sequencing: a survey and human epigenome-wide evaluation

Tool | DNA modification (4mC, 5mC, 5hmC, 6mA) | Input required | Supports multi-read FAST5 file? | Flow cell compatibility | Model trained on | Algorithm | Reported performance
Nanopolish [9] | | Basecalled FAST5 | | R7.3, R9, R9.4, R9.4.1, R9.5, R10 | E. coli | Hidden Markov model (HMM) | Accuracy = 0.94 (5mC, Homo sapiens)
Tombo/Nanoraw [20] | | Raw FAST5 | | R9.4, R9.4.1, R9.5 | no model | Mann-Whitney and Fisher’s exact test | Accuracy = 0.839, AUC = 0.78
SignalAlign [39] | | Basecalled FAST5 | | R7.3, R9.4^a | Synthetic nucleotides | Hidden Markov model with a hierarchical Dirichlet process (HMM-HDP) | Accuracy = 0.76 (for 5hmC, 5mC)
 | | | | | E. coli | | Accuracy = 0.96 (for 5mC), Precision = 0.92
Guppy [32] | | Raw FAST5 | | R7.3, R9, R9.4, R9.4.1, R9.5, R10, R10.3 | Homo sapiens and E. coli | Recurrent neural network | N/A
NanoMod [31] | | Basecalled FAST5, requires control sequence | | R7.3, R9 | no model | Kolmogorov-Smirnov test | Precision = 0.9
mCaller [33] | | Basecalled FAST5 | | R9, R9.4, R9.5 | E. coli | Neural network | Accuracy = 0.954, AUC = 0.99
DeepSignal [35] | | Basecalled FAST5 processed by Tombo re-squiggle module | | R9, R9.4, R9.4.1 | E. coli | Bidirectional RNN with LSTM+Inception structure | Accuracy = 0.92 (5mC, Homo sapiens), 0.90 (m6A), Precision = 0.97
DeepMod [34] | | FAST5 with raw signals and base calls | | R9, R9.4, R9.4.1 | E. coli | Bidirectional recurrent neural network (RNN) with long short-term memory (LSTM) | Precision = 0.99, AUC > 0.97
Megalodon [36] | | Raw FAST5^b | | R9.4.1, R10.3 | Homo sapiens and E. coli^c | Recurrent neural network^d | N/A^e
methBERT [23] | | Raw FAST5 | | R9 | Homo sapiens and E. coli | Bidirectional encoder representations from transformers (BERT) | Precision = 0.9147 (5mC, Homo sapiens)^f
METEORE [38] | | Per-read methylation-calling results^g | | R9.4.1 | E. coli | Random forest (RF), multiple linear regression (REG) | Root mean square error (RMSE) = 0.0687 (5mC, E. coli)^h
DeepMP [37] | | Basecalled FAST5 processed by Tombo re-squiggle module | | R9, R9.4 | Homo sapiens, E. coli, pUC19 | Convolutional neural network (CNN) | F-score = 0.9324 (5mC, Homo sapiens)^i
  a. SignalAlign’s compatibility with R9.4 has been validated only for 5mC and 6mA, not for 5hmC.
  b. Megalodon must obtain the intermediate output from the basecalling neural network; Guppy is the recommended backend for obtaining this output from FAST5 files.
  c. The model is trained in biological contexts only on Homo sapiens and E. coli. Users must specify a model from the modified-base models included in the Guppy basecaller or from the research models in the ONT Rerio repository.
  d. Megalodon’s functionality centers on anchoring the high-information neural network basecalling output to a reference sequence.
  e. Performance for Megalodon is not available because the tool is still under active development and no paper has been published yet.
  f. Only 5mC precision on Homo sapiens at the per-site level is listed here; further performance parameters (AUC, recall) for 5mC at the per-site and per-read levels, and 5mC/6mA performance on E. coli, are available in the original paper.
  g. METEORE combines the outputs from two or more methylation-calling tools.
  h. RMSE is for the METEORE RF model combining Megalodon and DeepSignal at the per-site level on selected E. coli sites.
  i. Only 5mC overall accuracy on Homo sapiens at the per-read level is listed here. Further performance parameters on Homo sapiens, 5mC performance on E. coli, and 6mA performance on the pUC19 plasmid are available in the original paper.
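
The "Reported performance" column mixes several metrics (accuracy, precision, F-score, RMSE). As a reading aid, the sketch below shows the standard textbook definitions of these quantities. It is not taken from any of the cited tools, and the table does not state exactly how each paper computed its figure. The function names, confusion counts, and methylation frequencies are all hypothetical illustration values.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion counts (e.g. per-read methylation calls).

    tp/fp/tn/fn are hypothetical counts of true/false positive/negative calls.
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        # F-score (F1): harmonic mean of precision and recall
        "f_score": 2 * precision * recall / (precision + recall),
    }


def rmse(predicted, reference):
    """Root mean square error between two equal-length lists of
    per-site methylation frequencies (values in the range 0..1)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical example values, not data from the survey
metrics = classification_metrics(tp=90, fp=10, tn=85, fn=15)
site_rmse = rmse([0.9, 0.1, 0.5, 0.75], [1.0, 0.0, 0.5, 0.75])

print(metrics)
print(round(site_rmse, 4))
```

Accuracy, precision, and F-score are higher-is-better and depend on the threshold used to call a read or site methylated, whereas RMSE is lower-is-better and compares continuous methylation frequencies, so the values in the table are not directly comparable across rows.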