Journal of Guangxi Normal University (Natural Science Edition) ›› 2015, Vol. 33 ›› Issue (2): 36-41. doi: 10.16088/j.issn.1001-6600.2015.02.006


A Method for Entity-Oriented Timeline Summarization

SONG Jun1,2,3, HAN Xiao-yu1,2,3, HUANG Yu1,2, HUANG Ting-lei1,2, FU Kun1,2   

    1. CAS Key Laboratory of Spatial Information Processing and Applied System Technology, Beijing 100190, China;
    2. Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China;
    3. University of Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2015-03-19   Online: 2015-02-10   Published: 2018-09-20

Abstract: This paper proposes a novel method for entity-oriented timeline summarization from multiple documents. It first introduces a topic model that jointly models the dynamic topics and the entity's participation, together with an efficient Gibbs sampler for inference. Each sentence is then assigned a score based on the discovered topics, and the highest-scoring sentences are selected as the summary. Experimental results on real-world datasets show that the proposed model not only generates summaries for entities but also outperforms the baseline model under ROUGE evaluation.
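
The sentence-selection step can be illustrated with a minimal sketch. The abstract does not give the scoring function, so everything here is an assumption made for illustration: each sentence is scored by its average per-token log-probability under a weighted mixture of topic-word distributions, and the top-k sentences are returned in document order. The names `score_sentence`, `select_summary`, `topic_word_probs`, `topic_weights`, and the toy topics are hypothetical, and the topic distributions are hard-coded rather than learned by a Gibbs sampler.

```python
import math

def score_sentence(tokens, topic_word_probs, topic_weights, floor=1e-6):
    """Average per-token log-probability under a mixture of topics.

    Assumed scoring rule for illustration only; not the paper's formula.
    Words missing from a topic get a small floor probability.
    """
    if not tokens:
        return float("-inf")
    total = 0.0
    for tok in tokens:
        p = sum(
            weight * topic_word_probs[topic].get(tok, floor)
            for topic, weight in topic_weights.items()
        )
        total += math.log(p)
    return total / len(tokens)

def select_summary(sentences, topic_word_probs, topic_weights, k=2):
    """Pick the k highest-scoring sentences and return them in original order."""
    scored = [
        (score_sentence(s.lower().split(), topic_word_probs, topic_weights), i, s)
        for i, s in enumerate(sentences)
    ]
    top = sorted(scored, key=lambda item: item[0], reverse=True)[:k]
    return [s for _, _, s in sorted(top, key=lambda item: item[1])]

# Toy, hand-written topics (hypothetical; a real system would learn these).
topic_word_probs = {
    "merger": {"company": 0.3, "acquired": 0.3, "deal": 0.2},
    "lawsuit": {"court": 0.4, "ruling": 0.3, "company": 0.1},
}
# Assumed weights describing how strongly the entity participates in each topic.
topic_weights = {"merger": 0.7, "lawsuit": 0.3}

sentences = [
    "company acquired rival deal",
    "weather was sunny today",
    "court ruling company",
]

summary = select_summary(sentences, topic_word_probs, topic_weights, k=2)
```

In this toy input the off-topic weather sentence scores lowest because every one of its words falls back to the floor probability, so it is excluded from the two-sentence summary.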

Key words: multi-document summarization, topic model, natural language processing

CLC Number: TP391.1