Journal of Guangxi Normal University (Natural Science Edition), 2016, Vol. 34, Issue (1): 59-65. doi: 10.16088/j.issn.1001-6600.2016.01.009


A General Method of Chinese Word Segmentation Based on the Resolution of Word Frequency Ambiguity

PENG Qi1, ZHU Xinhua2, CHEN Yishan3   

  1. Network Center, Guangxi Normal University, Guilin, Guangxi 541004, China;
  2. College of Computer Science and Information Technology, Guangxi Normal University, Guilin, Guangxi 541004, China;
  3. College of Lijiang, Guangxi Normal University, Guilin, Guangxi 541006, China
  • Received:2015-08-10 Published:2018-09-14

Abstract: Ambiguity is a common problem in dictionary-based word segmentation. Earlier dictionary-based methods typically obtain a segmentation by bidirectional maximum matching and then resolve ambiguities using context information, so they cannot be applied where no context information is available. This paper presents a general disambiguation method based on word frequency. The method is context-free and therefore extends the range of settings in which ambiguity resolution can be applied. Experimental results show that, compared with traditional dictionary-based Chinese word segmentation methods, this method has stronger applicability and higher availability.
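The abstract does not spell out the scoring rule, so the following is only a minimal sketch of the general idea: segment by forward and backward maximum matching, and where the two results disagree, prefer the one whose words are more frequent. The function names, the add-one smoothed log-probability score, and the toy frequency table are all assumptions made for illustration; they are not taken from the paper.

```python
import math


def forward_max_match(text, vocab, max_len):
    """Greedy left-to-right longest match; unknown characters become single words."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                words.append(piece)
                i += size
                break
    return words


def backward_max_match(text, vocab, max_len):
    """Greedy right-to-left longest match; unknown characters become single words."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                words.append(piece)
                j -= size
                break
    words.reverse()
    return words


def score(words, freq, total):
    """Sum of add-one smoothed log probabilities (an assumed scoring rule)."""
    denom = total + len(freq) + 1
    return sum(math.log((freq.get(w, 0) + 1) / denom) for w in words)


def segment(text, freq):
    """Bidirectional matching; ties between the two results are broken by word frequency only."""
    max_len = max(map(len, freq))
    total = sum(freq.values())
    fwd = forward_max_match(text, freq, max_len)
    bwd = backward_max_match(text, freq, max_len)
    if fwd == bwd:
        return fwd
    return max((fwd, bwd), key=lambda ws: score(ws, freq, total))


# Hypothetical toy frequency table, invented for this example.
FREQ = {"研究": 500, "研究生": 100, "生命": 300, "起源": 80, "命": 20}

print(forward_max_match("研究生命起源", FREQ, 3))   # ['研究生', '命', '起源']
print(backward_max_match("研究生命起源", FREQ, 3))  # ['研究', '生命', '起源']
print(segment("研究生命起源", FREQ))                # ['研究', '生命', '起源']
```

In this toy case the two matching directions disagree, and the frequency score selects the backward result without consulting any surrounding context, which is the property the abstract emphasizes.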

Key words: Chinese word segmentation, word frequency, ambiguity resolution

CLC Number: TP391