Journal of Guangxi Normal University (Natural Science Edition) ›› 2020, Vol. 38 ›› Issue (2): 51-63. DOI: 10.16088/j.issn.1001-6600.2020.02.006

• CTCIS2019 •

An Automatic Summarization Model Based on Deep Learning for Chinese

LI Weiyong1, LIU Bin2, ZHANG Wei2, CHEN Yunfang2*   

  1. School of Computer and Software, Nanjing Vocational College of Information Technology, Nanjing, Jiangsu 210023, China;
  2. School of Computer Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210023, China
  • Received: 2019-10-08  Published: 2020-04-02
  • Corresponding author: CHEN Yunfang (1976—), male, a native of Zhenjiang, Jiangsu; associate professor and Ph.D. at Nanjing University of Posts and Telecommunications. E-mail: chenyf@njupt.edu.cn
  • Supported by the National Natural Science Foundation of China (61672297) and the 2019 "Qinglan Project" Excellent Teaching Team Program for Jiangsu Higher Education Institutions (Su Jiao Shi [2019] No. 3)


Abstract: Targeting the pictographic and structural characteristics of Chinese, this paper proposes a new abstractive summarization solution that comprises a stroke-based text vector technique and an abstractive summarization model. The stroke-based method encodes strokes, the smallest units from which Chinese characters are composed, and thereby enriches the semantic information of the corresponding word vectors obtained with the Skip-Gram model. The Seq2Seq model is then optimized: a Bi-LSTM mitigates the loss of information in long sequences and supplements reverse-direction information; an attention mechanism on the encoder side computes the influence weights of the different input words on the decoder; and a Beam Search algorithm on the decoder side improves the fluency of the generated sequence. Experiments on the LCSTS dataset show that the proposed model improves both the quality and the readability of generated Chinese text summaries.

Key words: deep learning, abstractive summarization, stroke embedding, Seq2Seq, attention mechanism
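
A short sketch can make the stroke-based text vector technique concrete. The code below is a minimal illustration, not the authors' implementation: it assumes that each word is decomposed into stroke n-grams (subword features in the spirit of [12]), that a word vector is the sum of its stroke n-gram vectors, and that a Skip-Gram objective with negative sampling trains those vectors. The STROKES table and every identifier here are hypothetical placeholders.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Hypothetical stroke table: each character maps to a stroke-class string
    # (1 horizontal, 2 vertical, 3 left-falling, 4 right-falling/dot, 5 turning).
    # A real system would load a full stroke-order dictionary.
    STROKES = {"大": "134", "人": "34", "木": "1234"}

    def stroke_ngrams(word, n_min=3, n_max=5):
        # Concatenate the strokes of all characters, then slice out n-grams.
        s = "".join(STROKES.get(ch, "") for ch in word)
        return [s[i:i + n] for n in range(n_min, n_max + 1)
                for i in range(len(s) - n + 1)]

    class StrokeSkipGram(nn.Module):
        def __init__(self, ngram_vocab, context_vocab_size, dim=100):
            super().__init__()
            self.ngram2id = {g: i for i, g in enumerate(ngram_vocab)}
            self.in_emb = nn.Embedding(len(ngram_vocab), dim)     # stroke n-gram vectors
            self.out_emb = nn.Embedding(context_vocab_size, dim)  # context word vectors

        def word_vec(self, word):
            # A word vector is the sum of its stroke n-gram vectors.
            ids = torch.tensor([self.ngram2id[g] for g in stroke_ngrams(word)
                                if g in self.ngram2id], dtype=torch.long)
            return self.in_emb(ids).sum(dim=0)

        def loss(self, word, pos_ctx, neg_ctx):
            # Skip-Gram negative-sampling loss for one target word, given
            # observed context ids (pos_ctx) and sampled noise ids (neg_ctx).
            v = self.word_vec(word)
            pos = F.logsigmoid(self.out_emb(pos_ctx) @ v).sum()
            neg = F.logsigmoid(-(self.out_emb(neg_ctx) @ v)).sum()
            return -(pos + neg)

Because parameters live in stroke n-grams rather than opaque word ids, structurally related words share parameters, which is the semantic-enrichment effect the abstract attributes to the stroke encoding.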

CLC number: TP391
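
The optimized Seq2Seq structure described in the abstract follows a standard pattern that can be sketched compactly. The code below is an illustrative single-layer PyTorch version under assumed dimensions, not the paper's implementation: a Bi-LSTM encoder, and a decoder that recomputes attention weights over all encoder states at every step.

    import torch
    import torch.nn as nn

    class Seq2SeqSummarizer(nn.Module):
        def __init__(self, vocab_size, emb_dim=128, hid=256):
            super().__init__()
            self.hid = hid
            self.emb = nn.Embedding(vocab_size, emb_dim)
            # Bidirectional encoder: each state summarizes both the left and
            # the right context, supplementing reverse-direction information.
            self.encoder = nn.LSTM(emb_dim, hid, bidirectional=True, batch_first=True)
            self.decoder = nn.LSTMCell(emb_dim + 2 * hid, hid)
            self.score = nn.Linear(hid, 2 * hid, bias=False)  # attention scorer
            self.out = nn.Linear(hid + 2 * hid, vocab_size)

        def forward(self, src, tgt):
            # Teacher-forced training pass; returns per-step vocabulary logits.
            enc, _ = self.encoder(self.emb(src))              # (B, S, 2*hid)
            B = src.size(0)
            h = enc.new_zeros(B, self.hid)
            c = enc.new_zeros(B, self.hid)
            ctx = enc.new_zeros(B, 2 * self.hid)
            logits = []
            for t in range(tgt.size(1)):
                x = torch.cat([self.emb(tgt[:, t]), ctx], dim=-1)
                h, c = self.decoder(x, (h, c))
                # Attention: weight every encoder state by its relevance to
                # the current decoder state h, then mix a context vector.
                a = torch.softmax(torch.bmm(enc, self.score(h).unsqueeze(2)), dim=1)
                ctx = (a * enc).sum(dim=1)                    # (B, 2*hid)
                logits.append(self.out(torch.cat([h, ctx], dim=-1)))
            return torch.stack(logits, dim=1)                 # (B, T, vocab)

The Bi-LSTM addresses the long-sequence information loss mentioned in the abstract because every encoder state sees the input from both directions, and the per-step softmax over encoder states is the attention weighting of input words against the current decoder state.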
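Beam Search at the decoding end can be sketched independently of any particular model. Here step, bos, eos, and the beam width are assumed interfaces for illustration: step(prefix) is taken to return candidate (token, log-probability) pairs for the next position, e.g. the top-k entries of the model's log-softmax.

    def beam_search(step, bos, eos, beam=5, max_len=30):
        # Each hypothesis is (token sequence, cumulative log-probability).
        beams = [([bos], 0.0)]
        done = []
        for _ in range(max_len):
            cand = []
            for seq, lp in beams:
                for tok, tok_lp in step(seq):
                    cand.append((seq + [tok], lp + tok_lp))
            cand.sort(key=lambda x: x[1], reverse=True)
            beams = []
            for seq, lp in cand[:beam]:
                (done if seq[-1] == eos else beams).append((seq, lp))
            if not beams:
                break
        done.extend(beams)
        # Length normalization avoids a bias toward overly short summaries.
        return max(done, key=lambda x: x[1] / len(x[0]))[0]

Greedy decoding commits to one token per step; keeping several partial hypotheses and re-ranking them with a length-normalized score is what improves the fluency of the generated sequence.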
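Summary quality on LCSTS is conventionally measured with ROUGE [15], which counts n-gram co-occurrences between a generated summary and a reference. A minimal ROUGE-1 recall, assuming characters as the matching unit (common practice for Chinese), might look like this.

    from collections import Counter

    def rouge_1_recall(candidate: str, reference: str) -> float:
        # Fraction of reference unigrams (characters) covered by the candidate,
        # with per-gram counts clipped to avoid rewarding repetition.
        cand, ref = Counter(candidate), Counter(reference)
        overlap = sum(min(n, cand[g]) for g, n in ref.items())
        return overlap / max(sum(ref.values()), 1)

    print(rouge_1_recall("模型生成摘要", "参考摘要"))  # toy example: 0.5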
[1] LUHN H P. The automatic creation of literature abstracts [J]. IBM Journal of Research and Development, 1958, 2(2): 159-165. DOI: 10.1147/rd.22.0159.
[2] ZHANG Suiyuan, XUE Yuanhai, YU Xiaoming, et al. Research on multi-document short summary generation techniques [J]. Journal of Guangxi Normal University (Natural Science Edition), 2019, 37(2): 60-74. DOI: 10.16088/j.issn.1001-6600.2019.02.008.
[3] LOPYREV K. Generating news headlines with recurrent neural networks [EB/OL]. (2015-12-05) [2019-10-08]. https://arxiv.org/abs/1512.01712.
[4] SONG Jun, HAN Xiaoyu, HUANG Yu, et al. An entity-oriented evolutionary multi-document summarization method [J]. Journal of Guangxi Normal University (Natural Science Edition), 2015, 33(2): 36-41. DOI: 10.16088/j.issn.1001-6600.2015.02.006.
[5] CHO K, van MERRIËNBOER B, GULCEHRE C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation [C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2014: 1724-1734. DOI: 10.3115/v1/D14-1179.
[6] BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate [EB/OL]. (2016-05-19) [2019-10-08]. https://arxiv.org/abs/1409.0473v7.
[7] ZHANG Yangsen, CAO Yuanda, YU Shiwen. A model and algorithm for automatic error detection in Chinese text based on combined rules and statistics [J]. Journal of Chinese Information Processing, 2006, 20(4): 1-7, 55. DOI: 10.3969/j.issn.1003-0077.2006.04.001.
[8] HU Baotian, CHEN Qingcai, ZHU Fangze. LCSTS: a large scale Chinese short text summarization dataset [EB/OL]. (2015-06-19) [2019-10-08]. https://arxiv.org/abs/1506.05865.
[9] RUSH A M, CHOPRA S, WESTON J. A neural attention model for abstractive sentence summarization [C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2015: 379-389. DOI: 10.18653/v1/D15-1044.
[10] BENGIO Y, DUCHARME R, VINCENT P, et al. A neural probabilistic language model [J]. Journal of Machine Learning Research, 2003, 3: 1137-1155.
[11] GOODFELLOW I J, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial nets [C]// Proceedings of the 27th International Conference on Neural Information Processing Systems: Volume 2. Cambridge, MA: MIT Press, 2014: 2672-2680.
[12] BOJANOWSKI P, GRAVE E, JOULIN A, et al. Enriching word vectors with subword information [J]. Transactions of the Association for Computational Linguistics, 2017, 5: 135-146. DOI: 10.1162/tacl_a_00051.
[13] SUTSKEVER I, VINYALS O, LE Q V. Sequence to sequence learning with neural networks [C]// Proceedings of the 27th International Conference on Neural Information Processing Systems: Volume 2. Cambridge, MA: MIT Press, 2014: 3104-3112.
[14] BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate [EB/OL]. (2014-09-01) [2019-10-08]. https://arxiv.org/abs/1409.0473v7.
[15] LIN C Y, HOVY E. Automatic evaluation of summaries using N-gram co-occurrence statistics [C]// Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology: Volume 1. Stroudsburg, PA: Association for Computational Linguistics, 2003: 71-78. DOI: 10.3115/1073445.1073465.
[16] YU Jinxing, JIAN Xun, XIN Hao, et al. Joint embeddings of Chinese words, characters, and fine-grained subcharacter components [C]// Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2017: 286-291. DOI: 10.18653/v1/D17-1027.
[17] LUONG T, PHAM H, MANNING C D. Effective approaches to attention-based neural machine translation [C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2015: 1412-1421. DOI: 10.18653/v1/D15-1166.
[18] Term frequency by inverse document frequency [M]// LIU Ling, ÖZSU M T. Encyclopedia of Database Systems. Boston, MA: Springer, 2009. DOI: 10.1007/978-0-387-39940-9_3784.