Journal of Guangxi Normal University (Natural Science Edition), 2020, Vol. 38, Issue (4): 21-31. doi: 10.16088/j.issn.1001-6600.2020.04.003


An Improved Stack Algorithm Based on Local Sensitive Hash

WANG Junjie1, WEN Xueyan1*, XU Kesheng2, YU Ming1   

  1. College of Computer and Engineering, Northeast Forestry University, Harbin Heilongjiang 150040, China;
  2. State Forestry Administration Harbin Forestry Machinery Research Institute, Harbin Heilongjiang 150086, China
  • Received: 2019-11-07 Published: 2020-07-13

Abstract: Stacked generalization (stacking) inherently suffers from high complexity and data leakage, and its results are unstable across different data samples. The LBDS algorithm proposed in this paper uses locality-sensitive hashing (LSH) to map the training and test sets into hash buckets. The filling of one of the two buckets serves as the condition to start training; once the other bucket is full, the trained model predicts the training and test data together with their neighborhoods. The algorithm then filters the base classifiers by stability and information-entropy conditions and generates the high-level classifier. Finally, the high-level predictions are combined through a hybrid of voting and averaging to obtain the final result. Experimental results show that LBDS improves ACC and AUC by 2% on average and reduces training time complexity by 10%. LBDS also shows better stability and generalization ability.
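The bucket-mapping step can be illustrated with a minimal sketch. The abstract does not say which LSH family LBDS uses, so the code below assumes random-hyperplane (sign) hashing; the function names `make_hyperplanes`, `lsh_key`, and `fill_buckets` are hypothetical and are not taken from the paper. It shows only how samples that point in similar directions end up in the same bucket, not the bucket-capacity trigger or the classifier training.

```python
import random
from collections import defaultdict


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplanes (Gaussian normal vectors) of size dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_key(vector, hyperplanes):
    """Hash a vector to a tuple of bits, one bit per hyperplane.

    A bit is 1 when the vector lies on the non-negative side of the
    hyperplane, so vectors separated by a small angle tend to share keys.
    """
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def fill_buckets(samples, hyperplanes):
    """Group sample indices by their LSH key (one bucket per key)."""
    buckets = defaultdict(list)
    for index, vector in enumerate(samples):
        buckets[lsh_key(vector, hyperplanes)].append(index)
    return dict(buckets)


if __name__ == "__main__":
    planes = make_hyperplanes(n_bits=8, dim=3, seed=1)
    data = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [-1.0, -2.0, -3.0]]
    for key, members in fill_buckets(data, planes).items():
        print(key, members)
```

Sign hashing depends only on direction, so the first two samples (one is a positive multiple of the other) always share a bucket. More hash bits give smaller, purer buckets at the cost of more empty ones.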

Key words: stacked generalization, locality-sensitive hashing, time complexity, stability, meta classifier

CLC Number: TP301.6