Journal of Guangxi Normal University (Natural Science Edition), 2022, Vol. 40, Issue (1): 43-56. doi: 10.16088/j.issn.1001-6600.2021060910


High-dimensional Nonlinear Regression Model Based on JMI

ZHANG Zhifei1, DUAN Qian1, LIU Naijia2, HUANG Lei1*   

  1. School of Mathematics, Southwest Jiaotong University, Chengdu Sichuan 611756, China;
  2. School of Statistics, Southwestern University of Finance and Economics, Chengdu Sichuan 611137, China
  • Received: 2021-06-09  Revised: 2021-06-27  Online: 2022-01-25  Published: 2022-01-24

Abstract: Sure Independence Screening (SIS) is widely used for variable selection in linear regression models in ultra-high dimensional space, and it has been extended to generalized linear regression models. However, SIS does not handle variable selection in nonlinear regression models well, and few existing studies address this problem. How to select variables effectively in nonlinear regression models in ultra-high dimensional space is therefore a problem of research value. Building on the classic SIS method and the jackknife-based estimation of mutual information (JMI), this paper proposes a method that combines SIS with JMI and provides a specific algorithm for variable selection in nonlinear regression models in ultra-high dimensional space. Representative simulation experiments verify the consistency of the proposed method. In addition, an analysis of two gene data examples demonstrates the feasibility and practicality of the proposed method.

Key words: ultra-high dimensional space, SIS, nonlinear regression, JMI, consistency

CLC Number: O212.1
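
The screening idea summarized in the abstract (rank each predictor by a marginal dependence measure with the response and keep the top few) can be illustrated with a minimal sketch. This is not the paper's algorithm: the histogram plug-in estimator below is a simple stand-in for the jackknife-based JMI estimator, and the names `binned_mi`, `screen`, and `simulate`, the bin count, and the simulated model are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def binned_mi(x, y, bins=8):
    """Histogram plug-in estimate of mutual information (in nats).

    Stand-in for JMI: the paper's estimator is jackknife-based,
    this one simply discretizes both samples into equal-width bins.
    """
    n = len(x)

    def to_bins(v):
        lo, hi = min(v), max(v)
        width = (hi - lo) / bins or 1.0  # constant sample: single bin
        return [min(int((t - lo) / width), bins - 1) for t in v]

    bx, by = to_bins(x), to_bins(y)
    cx, cy, cxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    # sum over occupied cells of p_ij * log(p_ij / (p_i * p_j))
    return sum(
        (c / n) * math.log(c * n / (cx[i] * cy[j]))
        for (i, j), c in cxy.items()
    )


def screen(columns, y, d):
    """Score every predictor marginally and keep the d highest-scoring ones."""
    scores = [binned_mi(col, y) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:d]), scores


def simulate(n, p, seed):
    """Hypothetical nonlinear model: only predictors 0 and 1 affect y."""
    rng = random.Random(seed)
    columns = [[rng.uniform(-2.0, 2.0) for _ in range(n)] for _ in range(p)]
    y = [
        columns[0][i] ** 2
        + 1.5 * math.sin(math.pi * columns[1][i] / 2.0)
        + rng.gauss(0.0, 0.1)
        for i in range(n)
    ]
    return columns, y


columns, y = simulate(n=1000, p=200, seed=0)
selected, scores = screen(columns, y, d=2)
print("selected predictor indices:", selected)
```

In this simulated model, predictor 0 enters only through its square, so its linear correlation with the response is close to zero, while a mutual-information score still registers the dependence. That is the motivation the abstract gives for replacing the correlation-based ranking of SIS with a mutual-information measure.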