Fig. 2 | Genome Biology

Fig. 2

From: AnnoPRO: a strategy for protein function annotation based on multi-scale protein representation and a hybrid deep learning of dual-path encoding

The hybrid deep learning framework adopted in this study, comprising three consecutive modules (M1 to M3). (M1) Sequence-based multi-scale protein representation, which converts every protein sequence into a feature similarity-based image (ProMAP) and a protein similarity-based vector (ProSIM). (M2) Dual-path protein encoding based on pre-training. Using the ProMAP and ProSIM generated for all sequences, a dual-path encoding strategy was constructed from a seven-channel convolutional neural network (7C-CNN) and a deep neural network of five fully connected layers (5FC-DNN) to pre-train the features of all CAFA4 proteins by integrating their GO family annotation data. (M3) Functional annotation by LSTM-based decoding. The protein features pre-trained by the dual-path encoding layer in M2 were concatenated and then fed into a long short-term memory recurrent neural network (LSTM) to enable multi-label annotation of proteins to 6,109 functional GO families with the hybrid deep learning framework.
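
The data flow in the caption (two encodings, concatenation, multi-label output) can be illustrated with a shape-level toy in plain Python. This is a sketch, not the AnnoPRO implementation: the 7C-CNN, the 5FC-DNN, and the LSTM decoder are each replaced by a trivial stand-in (per-channel mean pooling, a single dense layer, and a single linear output layer), the weights are random, and every size other than the seven channels and the 6,109 GO families is a hypothetical value chosen for illustration. The 0.5 decision threshold is likewise an assumption.

```python
import math
import random

N_CHANNELS = 7        # from the caption: seven-channel ProMAP input
N_GO_FAMILIES = 6109  # from the caption: number of GO families
MAP_SIZE = 4          # hypothetical ProMAP side length
SIM_DIM = 16          # hypothetical ProSIM vector length
HIDDEN = 8            # hypothetical width of the ProSIM path

rng = random.Random(0)  # fixed seed so the toy is reproducible


def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def random_matrix(rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def encode_promap(promap):
    """Stand-in for the 7C-CNN path: mean-pool each channel to one value."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in promap]


def encode_prosim(prosim, weights, bias):
    """Stand-in for the 5FC-DNN path: a single dense layer with ReLU."""
    return relu(dense(prosim, weights, bias))


def annotate(promap, prosim, params, threshold=0.5):
    """Concatenate both encodings, score every GO family, then threshold."""
    features = encode_promap(promap) + encode_prosim(prosim, *params["sim"])
    logits = dense(features, *params["out"])  # stand-in for the LSTM decoder
    probs = [sigmoid(z) for z in logits]
    # Multi-label: each GO family gets its own independent yes/no decision.
    return features, [p >= threshold for p in probs]


params = {
    "sim": (random_matrix(HIDDEN, SIM_DIM), [0.0] * HIDDEN),
    "out": (random_matrix(N_GO_FAMILIES, N_CHANNELS + HIDDEN),
            [0.0] * N_GO_FAMILIES),
}
promap = [[[rng.random() for _ in range(MAP_SIZE)]
           for _ in range(MAP_SIZE)] for _ in range(N_CHANNELS)]
prosim = [rng.random() for _ in range(SIM_DIM)]

features, labels = annotate(promap, prosim, params)
print(len(features), len(labels))
```

The point of the sketch is the wiring: the two paths are encoded independently, their outputs are joined into one feature vector, and the output layer emits one independent decision per GO family instead of a single class.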
