Journal of Guangxi Normal University (Natural Science Edition), 2022, Vol. 40, Issue 3: 161-171. doi: 10.16088/j.issn.1001-6600.2021071301

Phosphorylation Site Prediction Model Based on Multi-head Attention Mechanism

WU Jun*, OUYANG Aijia, ZHANG Lin   

  1. School of Information Engineering, Zunyi Normal University, Zunyi, Guizhou 563006, China
  • Received: 2021-07-13; Revised: 2021-09-09; Online: 2022-05-25; Published: 2022-05-27

Abstract: Computational methods for predicting protein phosphorylation sites are typically used for preliminary screening during site identification. To further improve prediction accuracy, a deep learning model called MAPhos is proposed. First, each residue is represented as the sum of its amino acid vector and its position vector. Second, a bidirectional GRU network generates the hidden states of the residues. Third, a multi-head attention mechanism generates the context vector. Finally, the context vector and the sequence vectors are concatenated, and the concatenated vector is fed into a fully connected neural network to predict the site. Experimental results on real-world datasets demonstrate that MAPhos can outperform both feature-extraction-based models and convolutional neural network models on several measures, and that it is more interpretable than the convolutional neural network models.

Key words: deep learning, bioinformatics, phosphorylation site identification, multi-head attention mechanism, residue representation

CLC Number: TP183
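
The first and third steps named in the abstract (residue representation as a sum of two vectors, and multi-head attention pooling into a context vector) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the scaled dot-product scoring, the per-head slicing of the hidden dimension, the learned query vector per head, the dimensions, and the example window are all assumptions made for illustration. The embedding tables hold random values in place of trained parameters, and the bidirectional GRU (step two) and the final fully connected layer (step four) are omitted.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def random_table(keys, dim, rng):
    """Stand-in for a learned embedding table (random values, not trained)."""
    return {k: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for k in keys}

def embed_residues(window, aa_table, pos_table):
    """Step 1: residue vector = amino acid vector + position vector."""
    return [
        [a + p for a, p in zip(aa_table[aa], pos_table[i])]
        for i, aa in enumerate(window)
    ]

def multi_head_attention_pool(hidden, queries):
    """Step 3 (assumed formulation): pool per-residue states into one context vector.

    hidden  : L vectors of size d (per-residue states).
    queries : H query vectors, each of size d // H (one per head).
    Returns (context vector of size d, H lists of L attention weights).
    """
    num_heads = len(queries)
    head_dim = len(hidden[0]) // num_heads
    context, all_weights = [], []
    for h, query in enumerate(queries):
        lo, hi = h * head_dim, (h + 1) * head_dim
        slices = [state[lo:hi] for state in hidden]
        # Scaled dot-product score between the head's query and each residue.
        scores = [
            sum(q * x for q, x in zip(query, s)) / math.sqrt(head_dim)
            for s in slices
        ]
        weights = softmax(scores)
        # Weighted sum of this head's slice over all residues.
        context.extend(
            sum(w * s[j] for w, s in zip(weights, slices))
            for j in range(head_dim)
        )
        all_weights.append(weights)
    return context, all_weights

rng = random.Random(0)
DIM, HEADS = 8, 2                # hypothetical sizes
window = "KRPASVEDLSPTRKG"       # hypothetical 15-residue window around a candidate site
aa_table = random_table(AMINO_ACIDS, DIM, rng)
pos_table = random_table(range(len(window)), DIM, rng)
queries = [[rng.uniform(-1.0, 1.0) for _ in range(DIM // HEADS)] for _ in range(HEADS)]

residues = embed_residues(window, aa_table, pos_table)
# A trained bidirectional GRU would transform `residues` here; this sketch skips it.
context, weights = multi_head_attention_pool(residues, queries)
```

Each entry of `weights` is a probability distribution over the window positions for one head. Inspecting which positions receive large weights is one plausible way an attention-based model could be examined for interpretability, though the abstract does not say how the authors assess it.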