Identification of Pharmacophoric Fragments of DYRK1A Inhibitors Using Machine Learning Classification Models |
| |
Authors: | Mengzhou Bi, Zhen Guan, Tengjiao Fan, Na Zhang, Jianhua Wang, Guohui Sun, Lijiao Zhao, Rugang Zhong |
| |
Affiliation: | 1. Key Laboratory of Environmental and Viral Oncology, College of Life Science and Chemistry, Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, China (M.B., T.F., G.S., L.Z., R.Z.); 2. Beijing Municipal Key Laboratory of Child Development and Nutriomics, Translational Medicine Laboratory, Capital Institute of Pediatrics, Beijing 100020, China; 3. Department of Medical Technology, Beijing Pharmaceutical University of Staff and Workers, Beijing 100079, China |
| |
Abstract: | Dual-specificity tyrosine phosphorylation-regulated kinase 1A (DYRK1A) is regarded as a potential therapeutic target for neurodegenerative diseases, and considerable progress has been made in the discovery of DYRK1A inhibitors. Identifying pharmacophoric fragments provides valuable information for the structure- and fragment-based design of potent and selective DYRK1A inhibitors. In this study, seven machine learning methods and five molecular fingerprints were used to develop qualitative classification models of DYRK1A inhibitors. The models were evaluated by cross-validation, a test set, and an external validation set using four performance indicators: classification accuracy (CA), area under the receiver operating characteristic curve (AUC), Matthews correlation coefficient (MCC), and balanced accuracy (BA). The PubChem fingerprint-support vector machine model (CA = 0.909, AUC = 0.933, MCC = 0.717, BA = 0.855) and the PubChem fingerprint-artificial neural network model (CA = 0.862, AUC = 0.911, MCC = 0.705, BA = 0.870) were considered the optimal models for the training set and test set, respectively. A hybrid data balancing method, SMOTETL, which combines the synthetic minority over-sampling technique (SMOTE) with the Tomek link (TL) algorithm, was applied to explore the impact of balanced learning on model performance. Pharmacophoric fragments related to DYRK1A inhibition were also identified on the basis of frequency analysis and information gain. These results provide theoretical support and clues for the screening and design of novel DYRK1A inhibitors. |
| |
Keywords: | DYRK1A; heterocyclic inhibitors; classification models; pharmacophoric fragments |
|
|
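The abstract names three count-based indicators (CA, MCC, BA) and uses information gain to rank fragments. The sketch below shows the standard textbook definitions of these quantities for a binary inhibitor/non-inhibitor task. It is an illustration only: the function names and the example counts are hypothetical and are not taken from the study, whose implementation is not described in this record.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Return CA, MCC and BA from binary confusion-matrix counts."""
    total = tp + tn + fp + fn
    ca = (tp + tn) / total                 # classification accuracy
    sensitivity = tp / (tp + fn)           # recall on inhibitors
    specificity = tn / (tn + fp)           # recall on non-inhibitors
    ba = (sensitivity + specificity) / 2   # balanced accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"CA": ca, "MCC": mcc, "BA": ba}


def entropy(n_active, n_inactive):
    """Shannon entropy (in bits) of a two-class count pair."""
    total = n_active + n_inactive
    h = 0.0
    for count in (n_active, n_inactive):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def information_gain(with_active, with_inactive, without_active, without_inactive):
    """Information gain of a fragment's presence/absence for the activity label.

    The four arguments count compounds that contain the fragment (active,
    inactive) and compounds that lack it (active, inactive). Assumes the
    data set is non-empty.
    """
    n_with = with_active + with_inactive
    n_without = without_active + without_inactive
    n = n_with + n_without
    parent = entropy(with_active + without_active, with_inactive + without_inactive)
    conditional = (n_with / n) * entropy(with_active, with_inactive) \
        + (n_without / n) * entropy(without_active, without_inactive)
    return parent - conditional


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the functions.
    print(classification_metrics(tp=40, tn=40, fp=10, fn=10))
    print(information_gain(10, 0, 0, 10))
```

BA and MCC are reported alongside CA because plain accuracy can look favourable on an imbalanced data set even when the minority class is poorly predicted, which is also the motivation for a balancing step such as SMOTETL.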