Microblog Opinion Summarization Method Based on Transformer and TextRank

SUN Xu1, SHEN Bin1, YAN Xin1,2*, ZHANG Jinpeng3,4, XU Guangyi5   

    1. School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming Yunnan 650500, China;
    2. Key Laboratory of Artificial Intelligence in Yunnan Province (Kunming University of Science and Technology), Kunming Yunnan 650500, China;
    3. School of Computer Science and Engineering, Yunnan University, Kunming Yunnan 650091, China;
    4. School of Information, Yunnan University of Finance and Economics, Kunming Yunnan 650221, China;
    5. Yunnan Nantian Electronic Information Industry Co., Ltd., Kunming Yunnan 650040, China
Received: 2022-10-31; Revised: 2023-03-16; Online: 2023-07-25; Published: 2023-09-06

Abstract: Previous research on microblog summarization has not considered the sentiment associations among microblog texts. To address this, a microblog opinion summarization method based on Transformer and TextRank is proposed. First, the word vectors of the texts are encoded and quantized by the encoder and quantization space of the Transformer. Second, based on the quantization results, the microblog text set is divided into opinion categories by semantic clustering, and the important categories are selected for summary extraction. Third, the sentiment feature vector of each microblog text is concatenated with its text feature vector. Fourth, a TextRank algorithm incorporating these sentiment features is applied within each category, and the microblog text with the highest weight is extracted as that category's summary text. Finally, the most representative summary texts from all categories are combined to form the final microblog opinion summary. Experimental results show that, after the sentiment polarity influence factor is added, the ROUGE values of the proposed method improve significantly over the baseline method: the maximum F-measure values of ROUGE-1, ROUGE-2 and ROUGE-SU4 reach 0.4937, 0.2555 and 0.2706, respectively. These results indicate that the proposed method is effective for extracting microblog opinion summaries.
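
The extraction stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that text feature vectors, sentiment feature vectors and cluster labels are already available (the Transformer encoding, quantization and clustering steps are not reproduced), that edge weights are non-negative cosine similarities between concatenated vectors, and that "important" categories are simply the largest ones. The function names `cosine_similarity_matrix`, `textrank_scores` and `summarize`, and the parameter `top_k_clusters`, are hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(X):
    """Pairwise cosine similarity with self-loops removed and negatives clipped to 0."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero vectors
    U = X / norms
    S = U @ U.T
    np.fill_diagonal(S, 0.0)
    return np.clip(S, 0.0, None)


def textrank_scores(S, damping=0.85, tol=1e-8, max_iter=200):
    """Weighted PageRank-style power iteration on a similarity graph S."""
    n = S.shape[0]
    col_sums = S.sum(axis=0)
    safe = np.where(col_sums > 0, col_sums, 1.0)
    # Column-normalize; a node with no edges spreads its weight uniformly.
    P = np.where(col_sums > 0, S / safe, 1.0 / n)
    r = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        r_new = (1.0 - damping) / n + damping * (P @ r)
        delta = np.abs(r_new - r).sum()
        r = r_new
        if delta < tol:
            break
    return r


def summarize(text_vecs, sentiment_vecs, labels, top_k_clusters=2):
    """Return the index of the top-ranked post in each selected category."""
    # Concatenate sentiment features with text features, as in the abstract.
    features = np.hstack([text_vecs, sentiment_vecs])
    labels = np.asarray(labels)
    ids, counts = np.unique(labels, return_counts=True)
    # Assumption: category importance is approximated by category size.
    keep = ids[np.argsort(-counts, kind="stable")[:top_k_clusters]]
    picks = []
    for c in keep:
        members = np.flatnonzero(labels == c)
        scores = textrank_scores(cosine_similarity_matrix(features[members]))
        picks.append(int(members[np.argmax(scores)]))
    return picks
```

Calling `summarize(text_vecs, sentiment_vecs, labels)` returns one post index per selected category; joining the corresponding texts would give the combined opinion summary. How the sentiment polarity influence factor is actually weighted in the paper is not stated in the abstract, so plain concatenation is used here as a stand-in.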

Keywords: sentiment feature, opinion summarization, semantic clustering, summary extraction, Transformer, TextRank

CLC Number:  TP391.1
