Journal of Guangxi Normal University (Natural Science Edition) ›› 2016, Vol. 34 ›› Issue (3): 39-45. doi: 10.16088/j.issn.1001-6600.2016.03.006


kNN Classification Based on Sparse Learning

ZONG Ming1,2, GONG Yonghong3, WEN Guoqiu1, CHENG Debo1,2, ZHU Yonghua4

  1. College of Computer Science and Information Technology, Guangxi Normal University, Guilin, Guangxi 541004, China;
  2. Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing, Guigang, Guangxi 537000, China;
  3. Department of Information Engineering, Guilin University of Aerospace Technology, Guilin, Guangxi 541004, China;
  4. School of Computer, Electronics and Information, Guangxi University, Nanning, Guangxi 530004, China
  • Received: 2015-09-09 Online: 2016-09-30 Published: 2018-09-17
  • Corresponding authors: WEN Guoqiu (b. 1987), female, from Guilin, Guangxi, lecturer at Guangxi Normal University. E-mail: wenguoqiu2008@163.com; GONG Yonghong (b. 1970), female, from Yongfu, Guangxi, associate professor at Guilin University of Aerospace Technology. E-mail: zysjd2015@163.com
  • Funding:
    National Natural Science Foundation of China (61450001, 61263035, 61573270); National 973 Program of China (2013CB329404); China Postdoctoral Science Foundation (2015M570837); Guangxi Natural Science Foundation (2012GXNSFGA060004, 2015GXNSFCB139011, 2015GXNSFAA139306)

Abstract: In kNN classification, the value of k is usually fixed, and noise in the training samples can degrade the classification results. To address these two problems, this paper proposes a new kNN classification method based on sparse learning. The method reconstructs each test sample from the training samples. In the reconstruction, the sparsity induced by the l1-norm lets each test sample be classified with a different number of training samples, which removes the need for a fixed k, while the row sparsity produced by the l21-norm is used to remove noisy training samples. Experiments on UCI datasets show that the proposed method achieves better classification performance than the original kNN classification algorithm.

Key words: sparse learning, reconstruction, l1-norm, l21-norm, noisy sample
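
The idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the objective 0.5*||A W - B||_F^2 + lam1*||W||_1 + lam2*||W||_{2,1} (with A and B holding training and test samples as columns), the proximal gradient solver, the names `lam1`, `lam2`, `n_iter`, `tol`, and the unweighted majority vote. Column j of W holds the reconstruction weights for test sample j, so its nonzero entries play the role of that sample's neighbors and their count is its k. A row of W that is entirely zero marks a training sample used by no test sample, which is how this sketch treats noisy samples.

```python
import numpy as np


def soft_threshold(W, t):
    """Elementwise shrinkage: proximal operator of t * ||W||_1."""
    return np.sign(W) * np.maximum(np.abs(W) - t, 0.0)


def row_shrink(W, t):
    """Row-wise shrinkage: proximal operator of t * ||W||_{2,1}."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)
    return W * scale


def sparse_reconstruct(X_train, X_test, lam1=0.1, lam2=0.1, n_iter=200):
    """Assumed objective (not the paper's exact formulation):
    0.5*||A W - B||_F^2 + lam1*||W||_1 + lam2*||W||_{2,1},
    with A = X_train.T and B = X_test.T, solved by proximal gradient."""
    A, B = X_train.T, X_test.T
    # Step size 1/L, where L is the Lipschitz constant of the fit-term gradient.
    step = 1.0 / max(np.linalg.norm(A, 2) ** 2, 1e-12)
    W = np.zeros((A.shape[1], B.shape[1]))
    for _ in range(n_iter):
        W = W - step * (A.T @ (A @ W - B))  # gradient step on the fit term
        # Prox of the combined penalty: elementwise shrinkage, then row shrinkage.
        W = row_shrink(soft_threshold(W, step * lam1), step * lam2)
    return W


def classify(W, y_train, tol=1e-8):
    """Majority vote among the training samples with nonzero weight.
    Returns the predicted labels and the per-sample neighbor counts k."""
    values, counts = np.unique(y_train, return_counts=True)
    fallback = values[np.argmax(counts)]  # used only if no neighbor survives
    preds, ks = [], []
    for j in range(W.shape[1]):
        neighbours = np.flatnonzero(np.abs(W[:, j]) > tol)
        ks.append(neighbours.size)
        if neighbours.size == 0:
            preds.append(fallback)
            continue
        labels, votes = np.unique(y_train[neighbours], return_counts=True)
        preds.append(labels[np.argmax(votes)])
    return np.array(preds), np.array(ks)


# Toy data (hypothetical): orthonormal training samples keep the result easy to check by hand.
X_train = np.eye(4)
y_train = np.array([0, 0, 1, 1])
X_test = np.array([[0.9, 0.6, 0.05, 0.15],
                   [0.0, 0.05, 0.7, 0.15]])

W = sparse_reconstruct(X_train, X_test)
preds, ks = classify(W, y_train)
dropped = np.flatnonzero(np.linalg.norm(W, axis=1) <= 1e-8)
print("predictions:", preds, "k per test sample:", ks, "unused training rows:", dropped)
```

On this toy input the two test samples end up with different numbers of neighbors, and the fourth training sample, which contributes only weakly to both, is removed by the row-wise penalty even though the elementwise penalty alone would have kept it. Real data would need tuned values of `lam1` and `lam2`, for example by cross-validation.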

CLC number:

  • TP181