ATGPred-FL: Sequence-Based Prediction of Autophagy Protein with Feature Representation Learning
Introduction

Autophagy is an evolutionarily conserved ‘self-eating’ process. During autophagy, some cytoplasmic components and intracellular organelles are sequestered in characteristic double-membrane or multi-membrane autophagic vacuoles (called autophagosomes) and finally transported to lysosomes for bulk degradation. This process is regulated by many autophagy proteins (ATGs). Accurate identification of autophagy proteins is crucial for revealing their biological functions.

ATGPred-FL is a primary-sequence-based predictor for identifying ATGs. It works in two steps: (i) once the query protein sequences are input in FASTA format, the feature representation scheme generates 14 probabilistic features from 14 prediction models. These models are built from nine types of peptide features, i.e., dipeptide composition (DPC), amphiphilic pseudo-amino acid composition (APAAC), pseudo-amino acid composition (PAAC), dipeptide deviation from expected mean (DDE), adaptive skip dipeptide composition (ASDC), the three descriptors of the composition (C)-transition (T)-distribution (D) model, namely composition (CTDC), transition (CTDT) and distribution (CTDD), and quasi-sequence-order (QSOrder), and from four machine learning classifiers, i.e., Logistic Regression (LR), AdaBoost (AB), support vector machine (SVM) and Extremely Randomized Trees (ERT); (ii) the 14 probabilistic features are fused and fed into the trained SVM model to make the final prediction. The workflow is shown in the figure below.

Our method shows superior results in independent-set tests and practical scenarios. The datasets and Python scripts involved can be freely downloaded at https://github.com/jiaoshihu/ATGPred. The proposed model may serve as an efficient tool to assist researchers with their experimental research.
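As an illustration of the first step, the sketch below reads FASTA-formatted text and computes the dipeptide composition (DPC) descriptor, one of the nine feature types listed above. It is a minimal sketch, not the code from the ATGPred repository: the function names `parse_fasta` and `dipeptide_composition` and the example sequence are hypothetical, and the handling of non-standard residues (pairs containing them are skipped) is an assumption.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs of the 20 standard amino acids.
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def parse_fasta(text):
    """Return a dict mapping each FASTA header (without '>') to its sequence.

    Hypothetical helper for illustration only.
    """
    records, header = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequence lines may be wrapped, so append to the current record.
            records[header] += line.upper()
    return records


def dipeptide_composition(sequence):
    """Return the 400-dimensional DPC vector of a protein sequence.

    Each entry is the count of one dipeptide divided by the number of
    adjacent residue pairs made of standard amino acids. Assumption: pairs
    containing a non-standard residue (e.g. 'X') are skipped.
    """
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for first, second in zip(sequence, sequence[1:]):
        pair = first + second
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short (or no valid pairs): return an all-zero vector.
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


# Hypothetical query; a real input would be one or more protein sequences.
example = ">query1 hypothetical example\nMKVLAAGMKV\n"
features = {name: dipeptide_composition(seq)
            for name, seq in parse_fasta(example).items()}
```

In the workflow described above, a vector like this would be passed to a trained base classifier to obtain one probabilistic feature. The 14 probabilities produced this way would then be concatenated into the input of the final SVM model.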

Flowchart