Journal of Guangxi Normal University (Natural Science Edition), 2023, Vol. 41, Issue (5): 26-36. doi: 10.16088/j.issn.1001-6600.2023020502
WU Zhengqing (吴正清), CAO Hui* (曹晖), LIU Baokai (刘宝锴)
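Abstract: Existing fake review detection methods do not fully exploit the textual features of fake reviews. To address this, the paper proposes a convolutional neural network model based on a multi-layer attention mechanism. First, the word embedding layer is initialized with several sets of pretrained word vectors, and complex-valued position encoding is applied. Next, the feature maps produced by convolution kernels of several kinds are passed in turn through a channel-level and then a kernel-level attention layer, both of which embed user features and assign weights according to feature importance. Finally, the fitted representation of the review text is classified with Softmax. Experimental results show that, compared with a range of mainstream neural network models, the proposed model improves accuracy and F1 score by 4.74 and 3.86 percentage points, respectively.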
CLC number: TP391.1
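The pipeline described in the abstract (multi-kernel convolution, channel-level attention, kernel-level attention, Softmax) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the scoring functions, the max-over-time pooling applied before attention, the way the user vector enters each attention score, and every name (`conv_max_pool`, `channel_attention`, `kernel_attention`, `random_params`, and so on) are assumptions chosen for illustration. Pretrained embeddings and complex-valued position encoding are omitted, and all weights are random.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def conv_max_pool(tokens, filters):
    """1-D convolution over token windows, ReLU, then max-over-time pooling.

    tokens:  list of T embedding vectors, each of length D (T >= kernel width)
    filters: list of C filters, each a flat list of length width * D
    returns: list of C pooled activations, one per channel
    """
    dim = len(tokens[0])
    width = len(filters[0]) // dim
    windows = [
        [x for tok in tokens[i:i + width] for x in tok]
        for i in range(len(tokens) - width + 1)
    ]
    return [max(max(0.0, dot(f, w)) for w in windows) for f in filters]


def channel_attention(pooled, user, user_proj):
    """Assumed form: score each channel from its activation plus a user term."""
    scores = [math.tanh(h + dot(p, user)) for h, p in zip(pooled, user_proj)]
    alphas = softmax(scores)
    return [a * h for a, h in zip(alphas, pooled)]


def kernel_attention(per_kernel, user, query, user_proj):
    """Assumed form: weight each kernel size's vector, then sum into one vector."""
    scores = [dot(query, v) + dot(p, user) for v, p in zip(per_kernel, user_proj)]
    betas = softmax(scores)
    n = len(per_kernel[0])
    return [sum(b * v[i] for b, v in zip(betas, per_kernel)) for i in range(n)]


def classify(tokens, user, params):
    """Convolution per kernel size -> channel attention -> kernel attention -> Softmax."""
    per_kernel = []
    for filters, chan_proj in zip(params["filters"], params["chan_user_proj"]):
        pooled = conv_max_pool(tokens, filters)
        per_kernel.append(channel_attention(pooled, user, chan_proj))
    rep = kernel_attention(per_kernel, user, params["query"], params["kern_user_proj"])
    logits = [dot(w, rep) for w in params["out"]]
    return softmax(logits)


def random_params(rng, dim, user_dim, widths, channels, classes=2):
    """Hypothetical random weights; a real model would learn these."""
    def vec(n):
        return [rng.uniform(-0.5, 0.5) for _ in range(n)]
    return {
        "filters": [[vec(w * dim) for _ in range(channels)] for w in widths],
        "chan_user_proj": [[vec(user_dim) for _ in range(channels)] for _ in widths],
        "kern_user_proj": [vec(user_dim) for _ in widths],
        "query": vec(channels),
        "out": [vec(channels) for _ in range(classes)],
    }


rng = random.Random(0)
DIM, USER_DIM = 4, 3
params = random_params(rng, DIM, USER_DIM, widths=(2, 3, 4), channels=5)
tokens = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(8)]  # 8-token review
user = [rng.uniform(-1, 1) for _ in range(USER_DIM)]                  # user feature vector
probs = classify(tokens, user, params)
print(probs)  # two class probabilities (e.g. genuine vs. fake) that sum to 1
```

The two attention stages differ only in what they weight: the channel-level stage rescales individual feature channels within one kernel size, while the kernel-level stage decides how much each kernel size contributes to the final representation.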