Journal of Guangxi Normal University (Natural Science Edition), 2022, Vol. 40, Issue 3: 49-56. doi: 10.16088/j.issn.1001-6600.2021071001
ZHOU Shengkai (周圣凯)1,2, FU Lizhen (富丽贞)1,2*, SONG Wen’ai (宋文爱)1,2
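Abstract: Deep-learning-based measures of short-text semantic similarity underpin many modern natural language processing tasks. This paper proposes a text-encoding model based on a convolutional neural network (CNN) and a bidirectional gated recurrent unit (BiGRU). The convolutional layer extracts the salient semantic features, the BiGRU preserves the order of those features, and a Siamese network structure ensures that both texts of a pair are encoded consistently. The model is compared against a conventional CNN, a long short-term memory (LSTM) network, and BERT. Results on the Quora, SICK, and MSRP datasets show that the proposed model performs well on precision and recall, and that its F1 score also exceeds that of the conventional models.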
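The encoding pipeline described in the abstract can be illustrated with a minimal, untrained sketch in pure Python. It is not the authors' implementation: the class names (`GRUCell`, `SiameseEncoder`), the layer sizes, the random weights, the omission of bias terms, and the use of cosine similarity to compare the two encodings are all assumptions made for illustration, since the abstract does not specify them. The only points taken from the abstract are the order of operations (convolution, then a bidirectional GRU) and the Siamese property that both texts pass through the same encoder with the same weights.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Random weight matrix (illustrative only; a real model would be trained)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class GRUCell:
    """Gated recurrent unit without bias terms (a simplification)."""

    def __init__(self, in_dim, hid_dim, rng):
        self.hid_dim = hid_dim
        # Each gate acts on the concatenation [input; previous hidden state].
        self.w_update = make_matrix(hid_dim, in_dim + hid_dim, rng)
        self.w_reset = make_matrix(hid_dim, in_dim + hid_dim, rng)
        self.w_cand = make_matrix(hid_dim, in_dim + hid_dim, rng)

    def step(self, x, h):
        z = [sigmoid(a) for a in matvec(self.w_update, x + h)]
        r = [sigmoid(a) for a in matvec(self.w_reset, x + h)]
        gated_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a) for a in matvec(self.w_cand, x + gated_h)]
        # Interpolate between the previous state and the candidate state.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def run(self, sequence):
        h = [0.0] * self.hid_dim
        for x in sequence:
            h = self.step(x, h)
        return h


class SiameseEncoder:
    """One shared encoder: embedding -> 1-D convolution -> bidirectional GRU."""

    def __init__(self, vocab_size, emb_dim=8, n_filters=6, width=3,
                 hid_dim=5, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.embedding = make_matrix(vocab_size, emb_dim, rng)
        self.filters = make_matrix(n_filters, width * emb_dim, rng)
        self.forward_gru = GRUCell(n_filters, hid_dim, rng)
        self.backward_gru = GRUCell(n_filters, hid_dim, rng)

    def encode(self, token_ids):
        vectors = [self.embedding[t] for t in token_ids]
        # Convolution: slide a window over token positions, apply ReLU.
        windows = [sum(vectors[i:i + self.width], [])
                   for i in range(len(vectors) - self.width + 1)]
        features = [[max(0.0, a) for a in matvec(self.filters, w)]
                    for w in windows]
        # Read the feature sequence in both directions and concatenate.
        return (self.forward_gru.run(features)
                + self.backward_gru.run(features[::-1]))


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity(encoder, ids_a, ids_b):
    """Siamese comparison: both inputs go through the same encoder."""
    return cosine(encoder.encode(ids_a), encoder.encode(ids_b))
```

Calling `similarity(SiameseEncoder(vocab_size=20), [1, 4, 7, 2, 9], [3, 4, 8, 2, 5, 6])` returns a score between -1 and 1 for two token-id sequences. Because the weights are random, the score carries no semantic meaning until the shared encoder is trained on labeled sentence pairs. Inputs shorter than the convolution width yield an all-zero encoding in this sketch.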