Journal of Guangxi Normal University (Natural Science Edition), 2023, Vol. 41, Issue 6: 62-69. DOI: 10.16088/j.issn.1001-6600.2023052001

Multi-level Argument Position Classification Method via Data Augmentation

LIN Wancong, HAN Mingjie, JIN Ting*   

  1. School of Computer Science and Technology, Hainan University, Haikou, Hainan 570228, China
  • Received: 2023-05-20  Revised: 2023-06-21  Published: 2023-12-04

Abstract: This paper investigates argument extraction techniques for identifying, extracting, and analyzing argumentative components and structures in text. Intelligent analysis of debate texts is accomplished by extracting the arguments related to a debate topic from multiple sentences and determining whether each argument supports or opposes the topic. Previous research has relied mainly on deep learning models such as convolutional neural networks and recurrent neural networks, whose simple network structures cannot learn deeper features from arguments. To learn richer semantic information from argumentative text for position classification, this paper proposes an enhanced RoBERTa model (EnhRoBERTa) built on the pre-trained language model RoBERTa. The model makes full use of the multi-level multi-head attention mechanism and extracts shallow and deep semantic representations for fusion, so that the relationship between arguments and debate topics is captured across multiple feature dimensions, which facilitates argument position classification. In addition, because positions are unevenly distributed across arguments, data augmentation is adopted to improve learning on scarce samples. Experimental results on the CCAC2022 competition dataset show that the proposed model extracts more text features than the baseline models, achieving an F1-score of 61.4%, which is approximately 19% higher than that of the baselines TextCNN and BiLSTM and 3.8% higher than that of RoBERTa.
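The idea of using data augmentation to offset an imbalanced position distribution can be illustrated with a minimal sketch. The abstract does not say which augmentation technique the authors use, so the random token swap below is a stand-in chosen only for illustration, and the function names (`swap_tokens`, `balance_by_augmentation`) and the toy data are hypothetical.

```python
import random
from collections import Counter


def swap_tokens(tokens, rng):
    """Return a copy of `tokens` with two randomly chosen positions swapped.

    Assumption: random swap is an illustrative perturbation, not the
    augmentation technique reported in the paper.
    """
    out = list(tokens)
    if len(out) >= 2:
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def balance_by_augmentation(samples, seed=0):
    """Add perturbed copies of minority-class samples until every class
    has as many samples as the largest class.

    `samples` is a list of (tokens, label) pairs.
    """
    rng = random.Random(seed)
    counts = Counter(label for _, label in samples)
    target = max(counts.values())
    augmented = list(samples)
    for label, n in counts.items():
        pool = [tokens for tokens, lab in samples if lab == label]
        for _ in range(target - n):
            augmented.append((swap_tokens(rng.choice(pool), rng), label))
    return augmented


# Toy data (hypothetical): three supporting arguments, one opposing argument.
toy = [
    (["taxes", "fund", "schools"], "support"),
    (["it", "creates", "jobs"], "support"),
    (["it", "improves", "safety"], "support"),
    (["costs", "are", "too", "high"], "oppose"),
]
balanced = balance_by_augmentation(toy)
label_counts = Counter(label for _, label in balanced)
```

After balancing, the scarce "oppose" class has as many training samples as "support", and the original samples are left unchanged at the front of the list.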

Key words: position classification, data augmentation, pre-trained language model, multiple attention, multi-layer feature extraction

CLC Number:  TP391.1