Journal of Guangxi Normal University(Natural Science Edition) ›› 2010, Vol. 28 ›› Issue (1): 157-160.

Cross-language Text Classification Based on Latent Semantic Dual Space

XIONG Chao1, WANG Ming-wen1, WU Fu-ying2, WU Shi-yong1, SHEN Yang2   

    1. School of Computer Information Engineering, Jiangxi Normal University, Nanchang Jiangxi 330022, China;
    2. School of Software, Jiangxi Normal University, Nanchang Jiangxi 330022, China;
    3. Jiangxi Microsoft Technology Center, Nanchang Jiangxi 330096, China
  • Received: 2009-12-20 Online: 2010-03-20 Published: 2023-02-07

Abstract: As the languages used on the Internet grow more diverse, organizing multilingual resources has become a research hotspot. This paper focuses on cross-language text categorization (CLTC), which can organize heterogeneous document collections. A latent semantic dual space is built from semantic pairs extracted from a parallel corpus. In the experiments, the performance of CLTC is verified by varying the training-set size and the language composition. The results show that cross-language text classification based on the latent semantic dual space performs well in both stability and accuracy.

Key words: CLTC, latent semantic dual space, semantic pairs, parallel corpus

CLC Number: TP391.1