Journal of Guangxi Normal University (Natural Science Edition), 2010, Vol. 28, Issue (1): 153-156.

A Term Extraction Approach Based on Modified Log-likelihood Ratio

LIN Lei, SUN Cheng-jie, ZHANG Er-yan, LIU Bing-quan   

  1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
  • Received: 2009-11-20 Online: 2010-03-20 Published: 2023-02-07

Abstract: Term extraction is a basic task in information processing and is attracting increasing attention. The log-likelihood ratio method is used to extract low-frequency words effectively, but its precision is low. To solve this problem, the C-value method is applied to the results of the log-likelihood ratio method. Experimental results show that combining the two methods improves precision while preserving the high recall of the log-likelihood ratio method. The proposed method improves precision by about 8% compared with the log-likelihood ratio method alone.

Key words: low-frequency word, Log-likelihood ratio, C-value, term extraction

CLC Number: 

  • TP391.1
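
The abstract does not spell out how the two scores are combined, so the following is only a minimal sketch under stated assumptions: it uses the standard definitions of Dunning's log-likelihood ratio for a 2x2 contingency table and the C-value of Frantzi and Ananiadou, and it assumes that candidates are first selected by log-likelihood ratio and then filtered by C-value. The function names, the `tables` and `freq` inputs, and the thresholds `llr_min` and `c_min` are hypothetical and are not taken from the paper.

```python
import math


def llr(k11, k12, k21, k22):
    """Dunning's log-likelihood ratio (G^2) for a 2x2 contingency table.

    k11: count of (w1, w2); k12: (w1, not w2);
    k21: (not w1, w2);      k22: (not w1, not w2).
    """
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    g2 = 0.0
    for obs, r, c in ((k11, 0, 0), (k12, 0, 1), (k21, 1, 0), (k22, 1, 1)):
        if obs > 0:  # a zero cell contributes nothing to the sum
            expected = rows[r] * cols[c] / n
            g2 += obs * math.log(obs / expected)
    return 2.0 * g2


def contains(longer, shorter):
    """True if the word tuple `shorter` occurs contiguously in `longer`."""
    n = len(shorter)
    return any(longer[i:i + n] == shorter for i in range(len(longer) - n + 1))


def c_value(term, freq):
    """C-value of `term` (a tuple of words) given candidate frequencies.

    Frequency is discounted by the mean frequency of the longer
    candidates that contain `term`, then weighted by log2 of its length.
    """
    nesting = [t for t in freq if len(t) > len(term) and contains(t, term)]
    f = freq[term]
    if nesting:
        f -= sum(freq[t] for t in nesting) / len(nesting)
    return math.log2(len(term)) * f


def filter_candidates(tables, freq, llr_min, c_min):
    """Assumed combination: keep two-word candidates that pass both thresholds."""
    return [term for term, table in tables.items()
            if llr(*table) >= llr_min and c_value(term, freq) >= c_min]
```

In this sketch the log-likelihood ratio step is what admits low-frequency candidates, and the C-value step removes candidates that occur mostly as fragments of longer candidates, which is one plausible way the precision gain described in the abstract could arise.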