Journal of Guangxi Normal University (Natural Science Edition), 2022, Vol. 40, Issue (6): 247-256. doi: 10.16088/j.issn.1001-6600.2021080902

• Research Article •

Transcriptome Sequencing and Bioinformatic Analysis of Huangshen Leaves from the Qilian Mountains

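ZHANG Chunmei1,2*, YAN Fang2,3, SONG Hai2, ZHANG Xifeng1,2, CHEN Ye1,2

  1. College of Agriculture and Ecological Engineering, Hexi University, Zhangye, Gansu 734000, China;
  2. Key Laboratory of Hexi Corridor Resources Utilization of Gansu, Zhangye, Gansu 734000, China;
  3. Ecological & Oasis Agricultural Research Institute of Hexi University, Zhangye, Gansu 734000, China
  • Received: 2021-08-09 Revised: 2021-09-17 Online: 2022-11-25 Published: 2023-01-17
  • Corresponding author: ZHANG Chunmei (born 1978), female, from Jiuquan, Gansu; professor at Hexi University, PhD. E-mail: zazcm197828@163.com
  • Funding:
    National Natural Science Foundation of China (32160745); Natural Science Foundation of Gansu Province (22JR5RG566); Longyuan Youth Innovation and Entrepreneurship Talent (Team) Project of Gansu Province (2020RCXM130)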

Sequencing and Bioinformatic Analysis of the Leaf Transcriptome of Shandan Sphallerocarpus gracilis

ZHANG Chunmei1,2*, YAN Fang2,3, SONG Hai2, ZHANG Xifeng1,2, CHEN Ye1,2

  1. College of Agriculture and Ecological Engineering, Hexi University, Zhangye, Gansu 734000, China;
  2. Key Laboratory of Hexi Corridor Resources Utilization of Gansu, Zhangye, Gansu 734000, China;
  3. Ecological & Oasis Agricultural Research Institute of Hexi University, Zhangye, Gansu 734000, China
  • Received: 2021-08-09 Revised: 2021-09-17 Online: 2022-11-25 Published: 2023-01-17

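Abstract: Huangshen is praised as "little ginseng" in the Compendium of Materia Medica. Using Huangshen leaves as material, the transcriptome was sequenced on the high-throughput BGISEQ-500 platform, then assembled and annotated with transcriptome analysis software. The results were as follows. 1) Assembly yielded 99 981 Unigenes with a total length of 113 850 816 bp, a mean length of 1 138 bp, an N50 of 1 874 bp and a GC content of 39.93%. 2) The Unigenes were aligned against seven functional databases for annotation; 49 390 (NT: 49.40%), 48 281 (SwissProt: 48.29%), 61 116 (KOG: 61.13%) and 55 859 (Pfam: 55.87%) Unigenes received functional annotation. 3) A total of 66 451 Unigenes matched the NR database; Huangshen showed high homology with carrot (Daucus carota subsp. sativus) and lower homology with other species. 4) Gene Ontology (GO) annotation covered 78 040 Unigenes, which were grouped by function into three categories (biological process, cellular component and molecular function) with 15, 11 and 14 subcategories respectively; biological process contained the most subcategories. 5) 51 479 Unigenes were assigned to 20 metabolic pathways in the KEGG database. 6) In the KOG database, 61 116 Unigenes were assigned to 26 functional categories; genes involved in general function prediction, signal transduction, translation, modification and protein transport were the most numerous. 7) Transdecoder detected 62 323 CDSs; 17 308 SSRs (simple sequence repeats) were detected across 13 256 Unigenes, with dinucleotide repeat motifs the most abundant type (6 721, or 38.83%); 2 370 Unigenes encoding transcription factors were predicted. Huangshen is rich in genetic information. These results provide baseline data for revealing its genetic background, for molecular marker research and for functional genomic analysis, and lay a foundation for its comprehensive utilization and development.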

Key words: Huangshen, transcriptome, functional gene, bioinformatics, SSR markers
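The assembly statistics in result 1) (Unigene count, total length, mean length, N50 and GC content) follow standard definitions. The sketch below shows how such figures are commonly computed from a set of assembled sequences. It is an illustration only, not the authors' pipeline: the function name `assembly_stats` and the toy sequences are hypothetical.

```python
from typing import Dict, List


def assembly_stats(sequences: List[str]) -> Dict[str, float]:
    """Summarise an assembly: count, total length, mean length, N50, GC%."""
    lengths = sorted((len(s) for s in sequences), reverse=True)
    total = sum(lengths)
    if total == 0:
        raise ValueError("need at least one non-empty sequence")

    # N50: walk from the longest sequence down and stop at the first
    # length where the running sum reaches half of the total length.
    running = 0
    n50 = 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            n50 = length
            break

    # GC content: share of G and C bases over all bases, in percent.
    gc = sum(s.upper().count("G") + s.upper().count("C") for s in sequences)

    return {
        "count": len(lengths),
        "total_length": total,
        "mean_length": total / len(lengths),
        "n50": n50,
        "gc_percent": 100.0 * gc / total,
    }


# Toy input for illustration only; these are not sequences from the study.
stats = assembly_stats(["ATGCGC", "ATAT", "GG", "A"])
print(stats)
```

As a consistency check on the reported figures, 113 850 816 bp divided by 99 981 Unigenes gives roughly 1 138.7 bp, which matches the stated mean length of 1 138 bp.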

Abstract: Sphallerocarpus gracilis is regarded as a "second Panax ginseng" in the Compendium of Materia Medica. The high-throughput sequencing platform BGISEQ-500 was used to sequence the transcriptome of Huangshen (S. gracilis) leaves, and transcriptome analysis software was used for assembly and annotation. The results showed: 1) A total of 99 981 Unigenes were obtained through de novo assembly, with a total length of 113 850 816 bp, an average length of 1 138 bp, an N50 of 1 874 bp and a GC content of 39.93%. 2) The Unigenes were functionally annotated by searching against seven functional databases; 49 390 (NT: 49.40%), 48 281 (SwissProt: 48.29%), 61 116 (KOG: 61.13%) and 55 859 (Pfam: 55.87%) Unigenes were annotated, respectively. 3) A total of 66 451 Unigenes were annotated in the NR database. S. gracilis showed higher homology with Daucus carota subsp. sativus and lower homology with other species. 4) A total of 78 040 Unigenes were annotated within 40 terms of the three main GO (Gene Ontology) categories. By function, these are biological process, cellular component and molecular function, which included 15, 11 and 14 subclasses respectively; biological process contained the most subclasses. 5) In the KEGG (Kyoto Encyclopedia of Genes and Genomes) analysis, 51 479 Unigenes were assigned to 20 known metabolic pathways. 6) In the KOG database, 61 116 Unigenes were annotated and assigned to 26 functional categories; genes involved in general function prediction, signal transduction mechanisms, translation, modification and protein transport were the most abundant. 7) A total of 62 323 CDSs were detected by Transdecoder. Moreover, 17 308 SSRs (simple sequence repeats) were identified in 13 256 Unigenes; dinucleotide repeat motifs were the most abundant, accounting for 38.83% (6 721 SSRs). In addition, 2 370 Unigenes encoding transcription factors were predicted. Conclusion: This study provides valuable information and abundant resources for revealing the genetic background of S. gracilis, for future functional genome analysis and for molecular marker development, and lays a foundation for its comprehensive utilization and protection.

Key words: Sphallerocarpus gracilis, transcriptome, functional gene, bioinformatics, SSR markers
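Result 7) reports SSRs grouped by repeat-motif length. The abstract does not name the SSR detection tool or its thresholds, so the following is only a minimal sketch of how perfect SSRs can be located with regular expressions. The minimum repeat counts in `MIN_REPEATS` are an assumption (values often used as defaults in SSR mining), and the names `find_ssrs` and `is_primitive` are hypothetical.

```python
import re
from typing import List, Tuple

# ASSUMPTION: minimum repeat counts per motif length (1 to 6 nt).
# The abstract does not state the thresholds used in the study.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq: str) -> List[Tuple[int, str, int]]:
    """Return sorted (start, motif, repeat_count) for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # A motif of `size` bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        pos = 0
        while True:
            m = pattern.search(seq, pos)
            if m is None:
                break
            if is_primitive(m.group(1)):
                hits.append((m.start(), m.group(1), len(m.group(0)) // size))
                pos = m.end()
            else:
                # e.g. "AA" is a mononucleotide run, not a dinucleotide SSR;
                # step one base forward so a real motif is not skipped.
                pos = m.start() + 1
    return sorted(hits)


# Toy sequence for illustration only; not data from the study.
toy = "GG" + "AT" * 7 + "CCC" + "A" * 12 + "GAG" * 5 + "T"
print(find_ssrs(toy))
```

On the toy sequence this reports one dinucleotide SSR (`AT` x 7), one mononucleotide SSR (`A` x 12) and one trinucleotide SSR (`GAG` x 5). Tallying hits by motif length over all Unigenes would give a breakdown of the kind reported above, where 6 721 of 17 308 SSRs (38.83%) are dinucleotide repeats.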

CLC number: S567.239