Semantic Similarity Computing Model for Short Text Based on Deep Learning

doi:10.16088/j.issn.1001-6600.2021071001

Abstract

Measuring the semantic similarity of short texts with deep learning is a cornerstone of modern natural language processing. This paper proposes a text encoding model based on a convolutional neural network (CNN) and a bidirectional gated recurrent unit (BiGRU): convolutional layers extract the salient semantic features, the BiGRU preserves the order of those features, and a Siamese network structure ensures that both texts are encoded consistently. A conventional CNN, a long short-term memory (LSTM) network, and the BERT model are used as baselines. Experiments on the Quora, SICK, and MSRP datasets show that the proposed model performs well in precision and recall, and that its F₁ score is also higher than that of the conventional models.

Keywords: natural language processing, semantic similarity, convolutional neural network, long short-term memory, gated recurrent unit
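The encoder described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the layer sizes, the `tanh` activation on the convolution, the bias-free GRU, the toy word embedding, and the cosine similarity used to compare the two encodings are all assumptions made for illustration, and the weights are random rather than trained. What it does show is the general pattern: a convolution over word windows, a GRU run in both directions over the convolved features, and one shared set of parameters applied to both sentences (the Siamese part).

```python
import math
import random

# Toy sizes chosen for illustration only; none of these come from the paper.
EMB_DIM, N_FILTERS, WIDTH, HIDDEN = 6, 4, 2, 3

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def embed(word):
    # Hypothetical stand-in for pretrained word vectors: a vector seeded by the word.
    rng = random.Random(word)
    return [rng.uniform(-1.0, 1.0) for _ in range(EMB_DIM)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def conv1d(seq, kernels):
    # Slide a window of WIDTH word vectors over the sentence; one feature vector per position.
    out = []
    for t in range(len(seq) - WIDTH + 1):
        window = [x for vec in seq[t:t + WIDTH] for x in vec]
        out.append([math.tanh(dot(k, window)) for k in kernels])
    return out

def gru_step(x, h, p):
    # Bias-free GRU cell (a simplification): update gate z, reset gate r, candidate state.
    xh = x + h
    z = [sigmoid(dot(row, xh)) for row in p["z"]]
    r = [sigmoid(dot(row, xh)) for row in p["r"]]
    gated = x + [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(dot(row, gated)) for row in p["n"]]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def run_gru(seq, p):
    h = [0.0] * HIDDEN
    for x in seq:
        h = gru_step(x, h, p)
    return h

def make_params(seed=0):
    rng = random.Random(seed)
    def gru_params():
        return {g: rand_matrix(rng, HIDDEN, N_FILTERS + HIDDEN) for g in "zrn"}
    return {"conv": rand_matrix(rng, N_FILTERS, WIDTH * EMB_DIM),
            "fwd": gru_params(), "bwd": gru_params()}

def encode(sentence, params):
    vecs = [embed(w) for w in sentence.lower().split()]
    while len(vecs) < WIDTH:                 # pad very short inputs with zero vectors
        vecs.append([0.0] * EMB_DIM)
    feats = conv1d(vecs, params["conv"])     # CNN: local semantic features
    forward = run_gru(feats, params["fwd"])  # GRU over features, left to right
    backward = run_gru(feats[::-1], params["bwd"])  # and right to left
    return forward + backward                # concatenated final states

def similarity(a, b, params):
    # Siamese use: the SAME params encode both sentences; cosine similarity is an assumption.
    u, v = encode(a, params), encode(b, params)
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0

params = make_params(seed=0)
score = similarity("how do I learn python",
                   "what is the best way to learn python", params)
```

Because both sentences pass through the same `params`, identical inputs always receive identical encodings, which is the consistency property the Siamese structure is meant to provide. With untrained random weights the value of `score` carries no meaning; a real model would learn the parameters from labelled sentence pairs.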

CLC number: TP391.1

ZHOU Shengkai, FU Lizhen, SONG Wen’ai. Semantic Similarity Computing Model for Short Text Based on Deep Learning[J]. Journal of Guangxi Normal University (Natural Science Edition), 2022, 40(3): 49-56.

