Journal of Guangxi Normal University (Natural Science Edition), 2019, Vol. 37, Issue (4): 53-60. doi: 10.16088/j.issn.1001-6600.2019.04.006
王健*, 郑七凡, 李超, 石晶
WANG Jian*, ZHENG Qifan, LI Chao, SHI Jing
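Abstract: In information extraction, relation extraction is a key technique for accurately identifying the relations between entities in natural-language text. Relation extraction models tend to lose key semantic features, and the basic assumption of distant supervision tends to introduce noisy data. To address both problems, this paper proposes ENCODER_ATT, a relation extraction model based on distant supervision. The ENCODER component, built on a recurrent neural network, extracts and memorizes features at the word level and integrates semantic feature information at the sentence level, so that key semantic features are retained while redundant features are removed. An attention mechanism is then introduced at the sentence level to reduce the influence of noisy data on the experimental results. Experiments are conducted on a real-world dataset and precision-recall curves are plotted; the results show that the ENCODER_ATT model clearly improves on relation extraction methods of the same type.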
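The sentence-level attention step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' ENCODER_ATT implementation: it assumes a generic dot-product attention over a "bag" of sentence vectors that mention the same entity pair, and the names `attend_over_bag`, `relation_query`, and the toy vectors are hypothetical. It only shows how a sentence that disagrees with the relation query (a likely noisy distant-supervision label) receives a small weight.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_over_bag(
    sentence_vecs: Sequence[Sequence[float]],
    relation_query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Score each sentence vector against a relation query (dot product),
    normalize the scores with softmax, and return the attention weights
    together with the weighted sum of the sentence vectors."""
    scores = [
        sum(x * q for x, q in zip(vec, relation_query))
        for vec in sentence_vecs
    ]
    weights = softmax(scores)
    dim = len(relation_query)
    bag_vec = [
        sum(w * vec[i] for w, vec in zip(weights, sentence_vecs))
        for i in range(dim)
    ]
    return weights, bag_vec


# Toy bag (assumed values): two sentences aligned with the relation query
# and one that points the other way, standing in for a noisy sentence.
bag_sentences = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.5]]
query = [2.0, 0.0]
weights, bag_vec = attend_over_bag(bag_sentences, query)
```

In this toy bag the third sentence gets the smallest weight, so it contributes little to the bag representation. In a trained model the sentence vectors would come from the encoder and the query would be learned per relation.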