Journal of Guangxi Normal University (Natural Science Edition), 2023, Vol. 41, Issue (1): 102-112. doi: 10.16088/j.issn.1001-6600.2022031703

• Research Article •

Multi-hop Knowledge Graph Question Answering Based on Convolution Reasoning

PAN Haiming1, CHEN Qingfeng1*, QIU Jie2, HE Naixu1, LIU Chunyu1, DU Xiaojing1

  1. School of Computer, Electronics and Information, Guangxi University, Nanning, Guangxi 530004, China;
  2. School of Computer Science and Engineering, Yulin Normal University, Yulin, Guangxi 537000, China
  • Received: 2022-03-17 Revised: 2022-04-19 Online: 2023-01-25 Published: 2023-03-07
  • Corresponding author: CHEN Qingfeng (born 1972), male, from Liuzhou, Guangxi; professor at Guangxi University, Ph.D. E-mail: qingfeng@gxu.edu.cn
  • Funding:
    National Natural Science Foundation of China (61963004, 61862006); Key Project of the Guangxi Natural Science Foundation (2017GXNSFDA198033)

Abstract: Multi-hop questions are closer than simple questions to the way people ask questions in daily life, and research on multi-hop knowledge graph question answering (KGQA) helps promote the adoption of intelligent question answering systems. However, existing multi-hop KGQA methods reason poorly about answers to 2-hop and 3-hop questions and over incomplete knowledge graphs. To address this problem, this paper proposes a multi-hop KGQA algorithm based on convolutional reasoning. First, to obtain more expressive question embeddings, a question embedding model that combines character features and semantic features is proposed, motivated by the semantic similarity between questions and relations. Second, to strengthen the algorithm's long-chain reasoning ability, an answer reasoning model based on a convolutional neural network (CNN) is proposed to extract high-order information from the embedding vectors. Experimental results on the MetaQA dataset show that, compared with five existing algorithms, the proposed algorithm improves answer prediction accuracy on 2-hop and 3-hop questions by 1.7 and 1.3 percentage points, respectively, and on 2-hop and 3-hop questions over incomplete knowledge graphs by 9.4 and 9.3 percentage points, respectively.

Key words: knowledge graph question answering, knowledge graph embedding, language model, convolutional neural network

CLC number:

  • TP391.1
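
The following is a minimal sketch, not the authors' implementation, of how a convolution over the joined head-entity and question embeddings could be used to score candidate answers. The abstract does not specify the architecture, so the function names, the 1-D kernels, the linear projection, and the dot-product scoring are all assumptions made for illustration. The question embedding is taken as given here; the character-plus-semantic question encoder is not modeled.

```python
import math
from typing import List

Vector = List[float]


def conv1d_valid(signal: Vector, kernel: Vector) -> Vector:
    """Slide `kernel` over `signal` without padding (cross-correlation)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values: Vector) -> Vector:
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def score_candidates(
    head: Vector,
    question: Vector,
    kernels: List[Vector],
    projection: List[Vector],
    candidates: List[Vector],
) -> List[float]:
    """Return one sigmoid score per candidate answer entity (hypothetical scorer)."""
    # Assumption: head-entity and question embeddings are simply concatenated.
    joint = head + question
    # Each kernel yields one feature map; all maps are flattened into one vector.
    features: Vector = []
    for kernel in kernels:
        features.extend(relu(conv1d_valid(joint, kernel)))
    # Project the convolutional features back to the entity-embedding dimension.
    for row in projection:
        if len(row) != len(features):
            raise ValueError("projection row length must match feature count")
    hidden = [sum(w * f for w, f in zip(row, features)) for row in projection]
    # Dot product with each candidate embedding, squashed to (0, 1).
    scores = []
    for cand in candidates:
        logit = sum(h * c for h, c in zip(hidden, cand))
        scores.append(1.0 / (1.0 + math.exp(-logit)))
    return scores


# Toy usage with hand-picked numbers (not trained parameters).
demo_scores = score_candidates(
    head=[1.0, 0.0],
    question=[0.0, 1.0],
    kernels=[[1.0, 1.0]],
    projection=[[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
    candidates=[[1.0, 1.0], [-1.0, -1.0], [0.0, 0.0]],
)
```

In this toy run the candidate aligned with the projected features receives the highest score and the opposite candidate the lowest. In a trained model the kernels, projection, and embeddings would be learned rather than fixed by hand.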