Journal of Guangxi Normal University (Natural Science Edition) ›› 2019, Vol. 37 ›› Issue (4): 53-60. doi: 10.16088/j.issn.1001-6600.2019.04.006


Remote Supervision Relationship Extraction Based on Encoder and Attention Mechanism

WANG Jian*, ZHENG Qifan, LI Chao, SHI Jing

  1. College of Information and Computer Engineering, Northeast Forestry University, Harbin, Heilongjiang 150040, China
  • Received: 2019-02-28 Online: 2019-10-25 Published: 2019-11-28
  • Corresponding author: WANG Jian (born 1976), male, from Qiqihar, Heilongjiang; associate professor at Northeast Forestry University, Ph.D. E-mail: wang1342@foxmail.com
  • Funding:
    National Natural Science Foundation of China (31700643); Fundamental Research Funds for the Central Universities (DL11AB01)

Abstract: In information extraction, relation extraction is a key technique for accurately identifying the relations between entities in natural-language text. Relation extraction models tend to lose key semantic features, and the basic assumption of distant supervision tends to introduce noisy data. To address both problems, this paper proposes ENCODER_ATT, a relation extraction model based on distant supervision. First, an ENCODER built on a recurrent neural network extracts and memorizes features at the word level and integrates semantic feature information at the sentence level, so that key semantic features are retained while redundant features are removed. Second, an attention mechanism is introduced at the sentence level to reduce the influence of noisy data on the experimental results. Experiments on a real-world dataset, with precision-recall curves plotted, show that the ENCODER_ATT model clearly improves on comparable relation extraction methods.

Key words: relation extraction, distant supervision, ENCODER, attention mechanism
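
The sentence-level attention step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the scoring function, so the dot-product score against a relation query vector is an assumption here, and the names `attend_over_bag`, `relation_query` and the toy vectors are hypothetical rather than the authors' implementation. The idea shown is only that sentences in a bag (all sentences mentioning the same entity pair) receive softmax weights, so that likely-noisy sentences contribute less to the bag representation.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_over_bag(
    sentence_vecs: Sequence[Sequence[float]],
    relation_query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Combine the sentence encodings of one bag into a single vector.

    Assumption: each sentence is scored by a plain dot product with a
    relation query vector; the paper may use a different scoring function.
    Returns (attention weights, weighted bag vector).
    """
    scores = [
        sum(x * q for x, q in zip(vec, relation_query))
        for vec in sentence_vecs
    ]
    weights = softmax(scores)
    dim = len(relation_query)
    bag_vec = [
        sum(w * vec[d] for w, vec in zip(weights, sentence_vecs))
        for d in range(dim)
    ]
    return weights, bag_vec


# Toy bag: two sentences that match the query and one that does not.
bag = [[2.0, 0.1], [1.8, 0.3], [-0.5, 1.5]]
query = [1.0, 0.0]
weights, bag_vec = attend_over_bag(bag, query)
print("attention weights:", [round(w, 3) for w in weights])
print("bag vector:", [round(v, 3) for v in bag_vec])
```

In this toy bag the third sentence scores lowest against the query and therefore receives the smallest weight, which is the mechanism by which sentence-level attention can damp wrongly labeled sentences under distant supervision.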

CLC number: TP391.1