Journal of Guangxi Normal University (Natural Science Edition), 2022, Vol. 40, Issue (2): 91-102. doi: 10.16088/j.issn.1001-6600.2021072301


Study on Multi-information Integration for Drug Target Prediction

TAN Kai1, LI Yongjie1, PAN Haiming1, HUANG Kexin2, QIU Jie3, CHEN Qingfeng1*

  1. School of Computer, Electronics and Information, Guangxi University, Nanning, Guangxi 530004, China;
  2. Guangxi Medical University, Nanning, Guangxi 530021, China;
  3. School of Computer Science and Engineering, Yulin Normal University, Yulin, Guangxi 537000, China
  • Received: 2021-07-23 Revised: 2021-10-09 Published: 2022-05-31
  • Corresponding author: CHEN Qingfeng (born 1972), male, from Luzhai, Guangxi; professor at Guangxi University, Ph.D. E-mail: qingfeng@gxu.edu.cn
  • Funding:
    National Natural Science Foundation of China (61963004); Key Project of the Guangxi Natural Science Foundation (2017GXNSFDA198033)

Abstract: Accurate prediction of drug-target interactions (DTIs) is crucial for drug discovery and repositioning. Traditional DTI prediction methods are either time-consuming (simulation-based methods) or heavily dependent on domain expertise (similarity-based and feature-based methods). Existing computational methods that rely on a single data source or on sparse data often suffer from high false positive rates. Although integrating multiple heterogeneous networks has become prevalent for drug target prediction, retaining as much network structure information as possible remains a major challenge. This paper proposes a novel framework, NGDTI, which extracts relevant biological properties and association information from the networks while preserving important network topology information. A graph neural network is then applied to update the extracted feature information, and the learned topology-preserving representations of drugs and targets make DTI prediction more accurate. Compared with state-of-the-art baseline methods, NGDTI increases the AUPR value by nearly 0.01. The experimental results indicate that NGDTI is promising for drug development and repositioning.

Key words: drug target association prediction, network embedding, network integration, matrix decomposition, graph neural network

CLC number: TP183
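
The abstract reports its gain in terms of AUPR, the area under the precision-recall curve. As a rough illustration of what that metric measures, the sketch below computes a step-wise average-precision estimate of AUPR for a ranked list of candidate drug-target pairs. It is not the authors' evaluation code: the function name `average_precision`, the toy labels, and the toy scores are all hypothetical, and tied scores are not given any special treatment.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise estimate of the area under the precision-recall curve.

    labels: 1 for a known drug-target interaction, 0 otherwise.
    scores: predicted interaction scores (higher means more likely).
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive pair is required")

    # Rank candidate pairs from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    true_pos = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            # Precision at this rank, weighted by the recall step 1 / n_pos.
            ap += (true_pos / rank) / n_pos
    return ap


# Hypothetical toy data: six candidate drug-target pairs, three true interactions.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
toy_aupr = average_precision(toy_labels, toy_scores)
```

On this toy ranking the true interactions sit at ranks 1, 3 and 6, so the estimate is (1 + 2/3 + 1/2) / 3, about 0.72. Because known interactions are typically far rarer than non-interacting pairs, a precision-recall summary of this kind tends to be more sensitive to false positives than ROC-based measures.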