Journal of Guangxi Normal University (Natural Science Edition) ›› 2013, Vol. 31 ›› Issue (3): 30-36.


Attribute Reduction of Incomplete Mixed Decision System Based on Limited Neighborhood Relation

LIU Hai-feng, XU Xin-ying, SHEN Xue-fen, XIE Jun

  1. Department of Information Engineering, Taiyuan University of Technology, Taiyuan, Shanxi 030024, China
  • Received: 2013-06-05 Online: 2013-09-20 Published: 2018-11-26
  • Corresponding author: XU Xin-ying (born 1979), male, from Dingxiang, Shanxi; associate professor at Taiyuan University of Technology, Ph.D. E-mail: xuxinying@tyut.edu.cn
  • Supported by:
    National Natural Science Foundation of China (60975032); Shanxi Province Youth Science and Technology Research Fund (2009021017-4); Shanxi Province Research Fund for Returned Overseas Scholars (2008-25); Shanxi Province Research Fund for Returned Overseas Scholars (2013-033)

Abstract: Classical rough set theory cannot directly handle a decision system that both has missing attribute values and mixes nominal and numerical attributes. Such a system is defined here as an incomplete mixed decision system (IMDS). A limited neighborhood relation is proposed, and an attribute reduction algorithm for IMDS is built on it. The algorithm uses conditional entropy as its heuristic, which avoids a weakness of using the positive region of the decision as the heuristic: the positive region may fail to single out the first, most important attribute. With the limited neighborhood relation, nominal, numerical, and missing attribute values are handled directly, so incomplete data need not be completed or deleted and numerical data need not be discretized, which reduces the uncertainty that such preprocessing introduces. Simulation experiments on incomplete mixed data sets from UCI show that the algorithm effectively removes redundant attributes while maintaining or improving classification ability. The effect of the threshold in the limited neighborhood relation on the classification results is also discussed.

Key words: incomplete mixed decision system, limited neighborhood relation, conditional entropy, attribute reduction
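
The abstract does not state the formal definitions, so the following Python sketch only illustrates the general idea under assumptions of its own. It assumes that two objects are related when they share at least one attribute known in both and agree on every such attribute (equal nominal values, numerical values within a threshold `delta` after scaling to [0, 1]), and it assumes a neighborhood-style conditional entropy as the heuristic in a greedy forward selection. The names `related`, `neighborhood`, `conditional_entropy`, `greedy_reduct`, and the toy data are hypothetical and are not taken from the paper.

```python
import math

MISSING = None  # assumed marker for a missing attribute value


def related(x, y, attrs, numeric, delta):
    """Assumed limited neighborhood test on the attribute subset `attrs`.

    x and y are related when at least one attribute is known in both and
    every such attribute agrees: equal for nominal values, within `delta`
    for numerical values (assumed scaled to [0, 1]).
    """
    shared = 0
    for a in attrs:
        if x[a] is MISSING or y[a] is MISSING:
            continue  # a missing value neither supports nor blocks a match
        shared += 1
        if a in numeric:
            if abs(x[a] - y[a]) > delta:
                return False
        elif x[a] != y[a]:
            return False
    return shared > 0


def neighborhood(i, data, attrs, numeric, delta):
    """Indices of objects related to object i; i itself is always included."""
    return {j for j, y in enumerate(data)
            if j == i or related(data[i], y, attrs, numeric, delta)}


def conditional_entropy(data, labels, attrs, numeric, delta):
    """Assumed neighborhood-style H(D | attrs); lower means purer neighborhoods."""
    total = 0.0
    for i in range(len(data)):
        nb = neighborhood(i, data, attrs, numeric, delta)
        same = sum(1 for j in nb if labels[j] == labels[i])
        total -= math.log2(same / len(nb))
    return total / len(data)


def greedy_reduct(data, labels, all_attrs, numeric, delta, eps=1e-12):
    """Forward selection: repeatedly add the attribute giving the lowest entropy,
    stopping once the entropy of the full attribute set is reached."""
    target = conditional_entropy(data, labels, all_attrs, numeric, delta)
    chosen, remaining = [], list(all_attrs)
    current = float("inf")
    while remaining and current > target + eps:
        best = min(remaining, key=lambda a: conditional_entropy(
            data, labels, chosen + [a], numeric, delta))
        chosen.append(best)
        remaining.remove(best)
        current = conditional_entropy(data, labels, chosen, numeric, delta)
    return chosen


# Toy incomplete mixed data: one nominal and two numerical attributes.
data = [
    {"color": "red", "size": 0.10, "noise": 0.5},
    {"color": "red", "size": 0.15, "noise": MISSING},
    {"color": "blue", "size": 0.80, "noise": 0.5},
    {"color": MISSING, "size": 0.85, "noise": 0.5},
]
labels = ["a", "a", "b", "b"]
attrs = ["color", "size", "noise"]
numeric = {"size", "noise"}

print(greedy_reduct(data, labels, attrs, numeric, delta=0.1))
```

In this sketch `delta` plays the role of the threshold mentioned in the abstract: a larger value enlarges the numerical neighborhoods, which can mix decision classes and raise the entropy, while a smaller value makes neighborhoods purer but smaller.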

CLC number:

  • TP181