Journal of Guangxi Normal University(Natural Science Edition) ›› 2011, Vol. 29 ›› Issue (4): 35-38.


Text Classification Based on Experimental Study of Two-step Strategy

HE Quan-hao, FAN Xing-hua, ZHOU Peng   

  1. Institute of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Received: 2011-09-25 Published: 2018-11-16

Abstract: A two-step classification strategy based on the Naive Bayes classifier is known to improve the efficiency of two-class Chinese text categorization. This paper addresses three questions: (1) the conditions a classifier must satisfy to be used in two-step text classification; (2) a theoretical analysis of three classifiers that can be used in two-step text classification; and (3) an experimental comparison of combinations of Rocchio, Naive Bayes (NB), and KNN on multi-class English text classification. Experimental results show that Rocchio, NB, and KNN all satisfy the conditions of the two-step strategy. The best performance is achieved by using KNN as the first-step classifier and NB as the second.

Key words: text categorization, two-step strategy, Rocchio, naive Bayes, KNN

CLC Number: TP18