Journal of Guangxi Normal University (Natural Science Edition) ›› 2022, Vol. 40 ›› Issue (1): 1-14. doi: 10.16088/j.issn.1001-6600.2021060904
• Review •
AI Yan (艾艳), JIA Nan (贾楠), WANG Yuan (王媛), GUO Jing (郭静), PAN Dongdong* (潘东东)
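Abstract: First, the statistical problems that arise in rare-variant genetic association analysis, together with the related research frontiers and hot topics, are reviewed and analyzed. Second, the statistical methods commonly used in single-locus and multi-locus analysis are systematically summarized, and the problems and challenges these methods face are discussed. Finally, the prospects for future development of multi-trait, multi-locus association analysis methods are outlined.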