Journal of Guangxi Normal University (Natural Science Edition), 2022, Vol. 40, Issue (1): 100-107. doi: 10.16088/j.issn.1001-6600.2021060919

Conditional Independence Screening in Sparse Ultra-high Dimensional Nonparametric Additive Models

XU Ping, ZHONG Simin, LI Binbin, XIONG Wenjun*   

  1. School of Mathematics and Statistics, Guangxi Normal University, Guilin Guangxi 541006, China
  • Received:2021-06-09 Revised:2021-07-30 Online:2022-01-25 Published:2022-01-24

Abstract: Variable screening is an effective way to handle ultra-high-dimensional data. Barut et al. considered the setting in which some variables are known in advance to be significantly related to the response variable, and proposed the conditional sure independence screening (CSIS) method under a linear model assumption. That method effectively reduces the probability of selecting false variables, but the linear model assumption is restrictive: in practice, the structure of the model cannot be determined in advance. This paper therefore proposes a conditional nonparametric independence screening (CNIS) method based on a nonparametric additive model. The method makes no assumptions about the model structure, which broadens its scope of application. Under appropriate conditions, the first-stage screening is proved to have the sure screening property and to retain the important variables with probability 1, and the second-stage variable selection is also shown to be consistent. Monte Carlo simulation results show that the method performs better than the nonparametric independence screening (NIS) method.
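
The conditional screening idea described in the abstract can be illustrated with a rough sketch. This is not the authors' CNIS procedure: the truncated-power cubic basis is a simple stand-in for B-splines, and the function names `spline_basis` and `conditional_screen`, the ranking statistic (the empirical norm of each candidate's fitted component after adjusting for the known variables), and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def spline_basis(x, n_knots=3):
    """Centered cubic truncated-power basis (assumed stand-in for B-splines)."""
    knots = np.quantile(x, np.linspace(0, 1, n_knots + 2)[1:-1])
    cols = [x, x**2, x**3] + [np.clip(x - k, 0, None) ** 3 for k in knots]
    B = np.column_stack(cols)
    return B - B.mean(axis=0)  # centering keeps the intercept identifiable

def conditional_screen(X, y, cond_idx, n_keep):
    """Rank candidate variables by their fitted additive component,
    estimated jointly with the known (conditioning) variables."""
    n, p = X.shape
    cond = set(cond_idx)
    base = np.column_stack(
        [np.ones(n)] + [spline_basis(X[:, j]) for j in cond_idx]
    )
    scores = np.full(p, -np.inf)  # conditioning variables are never ranked
    for j in range(p):
        if j in cond:
            continue
        B_j = spline_basis(X[:, j])
        design = np.column_stack([base, B_j])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        f_j = B_j @ coef[base.shape[1]:]   # fitted component of variable j
        scores[j] = np.mean(f_j ** 2)      # empirical squared norm
    keep = np.argsort(scores)[::-1][:n_keep]
    return keep, scores

# Hypothetical simulated data: variable 0 is known in advance,
# variable 5 is an unknown nonlinear signal, the rest are noise.
rng = np.random.default_rng(0)
n, p = 300, 40
X = rng.uniform(-1, 1, size=(n, p))
y = 2 * np.sin(np.pi * X[:, 0]) + 3 * X[:, 5] ** 2 + 0.5 * rng.normal(size=n)

keep, scores = conditional_screen(X, y, cond_idx=[0], n_keep=5)
print(keep)
```

In this sketch each candidate is scored from a fit that already includes the known variable, so the ranking reflects the candidate's contribution beyond what is known. A second-stage selection step, as mentioned in the abstract, would then be applied to the retained variables and is not shown here.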

Key words: screening, additive model, variable selection, sure screening

CLC Number: O212

References:
[1] FAN J Q, LV J C. Sure independence screening for ultrahigh dimensional feature space[J]. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2008, 70(5): 849-911.
[2] HALL P, MILLER H. Using generalized correlation to effect variable selection in very high dimensional problems[J]. Journal of Computational and Graphical Statistics, 2009, 18(3): 533-550.
[3] LI G R, PENG H, ZHANG J, et al. Robust rank correlation based screening[J]. The Annals of Statistics, 2012, 40(3): 1846-1877.
[4] BARUT E, FAN J Q, VERHASSELT A. Conditional sure independence screening[J]. Journal of the American Statistical Association, 2016, 111(515): 1266-1277.
[5] MA Xuejun. GSIS ultra-high dimensional variable selection[J]. Statistics & Information Forum, 2015, 30(8): 16-19. (in Chinese)
[6] FAN J Q, SONG R. Sure independence screening in generalized linear models with NP dimensionality[J]. The Annals of Statistics, 2010, 38(6): 3567-3604.
[7] XU C, CHEN J H. The sparse MLE for ultra-high-dimensional feature screening[J]. Journal of the American Statistical Association, 2014, 109(507): 1257-1269.
[8] FAN J Q, FENG Y, SONG R. Nonparametric independence screening in sparse ultra-high-dimensional additive models[J]. Journal of the American Statistical Association, 2011, 106(494): 544-557.
[9] FAN J Q, MA Y B, DAI W. Nonparametric independence screening in sparse ultra-high-dimensional varying coefficient models[J]. Journal of the American Statistical Association, 2014, 109(507): 1270-1284.
[10] LIU J Y, LI R Z, WU R L. Feature selection for varying coefficient models with ultra high dimensional covariates[J]. Journal of the American Statistical Association, 2014, 109(505): 266-274.
[11] LI R Z, ZHONG W, ZHU L P. Feature screening via distance correlation learning[J]. Journal of the American Statistical Association, 2012, 107(499): 1129-1139.
[12] STONE C J. Additive regression and other nonparametric models[J]. The Annals of Statistics, 1985, 13(2): 689-705.
[13] RIESENFELD R F. Application of B-spline approximation to geometric problems of computer-aided design[D]. Syracuse: Syracuse University, 1973.
[14] HUANG J, HOROWITZ J L, WEI F R. Variable selection in nonparametric additive models[J]. The Annals of Statistics, 2010, 38(4): 2282-2313.
[15] SHEN X, WOLFE D A, ZHOU S. Local asymptotics for regression splines and confidence regions[J]. The Annals of Statistics, 1998, 26(5): 1760-1782.
[16] DU P, CHENG G, LIANG H. Semiparametric regression models with additive nonparametric components and high dimensional parametric components[J]. Computational Statistics & Data Analysis, 2012, 56(6): 2006-2017.