Journal of Guangxi Normal University (Natural Science Edition), 2026, Vol. 44, Issue (1): 110-118. doi: 10.16088/j.issn.1001-6600.2025030703
RONG Jingjing, YE Jimin*
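Abstract: For high-dimensional sparse linear regression models, this paper proposes, from the perspective of posterior estimation, an FDR rule for model selection based on the false discovery rate (FDR). Building on that rule, a dynamic signal-to-noise ratio (SNR) variation factor is introduced to obtain the FDRR rule, which is more robust to changes in SNR and invariant to the scale of the data. Combined with the orthogonal matching pursuit (OMP) algorithm, simulation experiments compare the FDR rule, the FDRR rule, and existing rules on two measures: the probability of successfully selecting all true variables, and the FDR value. The results show that, compared with the other rules, the FDRR rule is more stable at high SNR or with large sample sizes, is more robust to data scaling, and attains the lowest false discovery rate. Finally, the proposed method is applied to real data from patients with mantle cell lymphoma to identify the IDs of the genes that affect cell proliferation.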
CLC number: O212.8
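The abstract evaluates selection rules by running OMP and measuring how many selected variables are false. The sketch below is only an illustration of those two ingredients: a plain greedy OMP that runs for a fixed number of steps, and the false discovery proportion of a selected support (its average over repeated simulations estimates the FDR). It does not implement the FDR or FDRR stopping rules, whose formulas are not given in the abstract; the function names `omp` and `false_discovery_proportion`, the fixed step count `n_steps`, and the simulated data in the demo are all assumptions made for this example.

```python
import numpy as np


def omp(X, y, n_steps):
    """Greedy orthogonal matching pursuit (illustrative version).

    Returns the selected column indices in the order they were chosen.
    The fixed step count stands in for a model selection rule, which
    this sketch does not implement.
    """
    residual = y.astype(float)
    support = []
    for _ in range(n_steps):
        # Pick the column most correlated with the current residual.
        corr = np.abs(X.T @ residual)
        if support:
            corr[support] = -np.inf  # never select the same column twice
        support.append(int(np.argmax(corr)))
        # Refit by least squares on all selected columns, then update residual.
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        residual = y - X[:, support] @ coef
    return support


def false_discovery_proportion(selected, true_support):
    """Fraction of selected variables that are not in the true support."""
    selected = set(selected)
    if not selected:
        return 0.0
    return len(selected - set(true_support)) / len(selected)


if __name__ == "__main__":
    # Hypothetical simulation setup: sizes, coefficients, and noise level
    # are arbitrary choices for the demo, not taken from the paper.
    rng = np.random.default_rng(0)
    n, p, true_support = 100, 500, [3, 50, 200]
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[true_support] = [3.0, -2.0, 2.5]
    y = X @ beta + 0.5 * rng.standard_normal(n)

    chosen = omp(X, y, n_steps=5)
    print("selected:", chosen)
    print("all true variables found:", set(true_support) <= set(chosen))
    print("false discovery proportion:",
          false_discovery_proportion(chosen, true_support))
```

In a comparison like the one the abstract describes, the fixed `n_steps` would be replaced by each candidate rule's choice of model size, and both measures would be averaged over many simulated data sets.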