Journal of Guangxi Normal University (Natural Science Edition) ›› 2011, Vol. 29 ›› Issue (3): 187-191.

Classification of Colon Cancer Data Based on Bayesian Classifier

CHEN You-ying, ZHENG Zhi, KONG Xiang-zeng, ZHANG Sheng-yuan

  1. College of Mathematics and Computer Science, Fujian Normal University, Fuzhou Fujian 350007, China
  • Received: 2011-05-10 Online: 2011-08-20 Published: 2018-12-03
  • Corresponding author: ZHANG Sheng-yuan (born 1966), male, from Longyan, Fujian; professor at Fujian Normal University, Ph.D. E-mail: syzhang@fjnu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61070062); Natural Science Foundation of Fujian Province (2010J01319); Class A Project of the Fujian Provincial Department of Education (JA10064)

Abstract: Tumor diagnosis based on gene expression profiles is expected to become a fast and effective method for clinical diagnosis. However, gene expression data are high-dimensional, have few samples, and are noisy, which makes them difficult to classify, so more feasible and effective classification methods are needed. This paper uses a Bayesian classifier to build a predictive classification model as a new approach to classifying gene expression data. Experiments are carried out with the Bayesian Network Toolbox for MATLAB, using colon cancer gene expression profiles as experimental data, and recognition accuracy is tested by 4-fold cross-validation. The experimental results show that the method is feasible and effective.

Key words: gene expression profile, Bayesian classifier, colon cancer data, 4-fold cross-validation
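
The workflow summarized in the abstract (train a Bayesian classifier, then estimate accuracy by 4-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper reports using a Bayesian network toolbox in MATLAB on the colon cancer data, and the abstract does not specify the classifier structure. The sketch below assumes a Gaussian naive Bayes classifier (the simplest Bayesian classifier) and uses small synthetic two-class data as a hypothetical stand-in for expression profiles. All function names (`fit_gaussian_nb`, `predict`, `k_fold_accuracy`, `make_synthetic_data`) and parameter values are illustrative assumptions.

```python
import math
import random


def fit_gaussian_nb(X, y):
    """Estimate a log prior and per-feature Gaussian parameters for each class."""
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small variance floor keeps the log-density finite for constant features.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (math.log(n / len(y)), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log posterior (up to a shared constant)."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
    return max(model, key=lambda label: log_posterior(model[label]))


def k_fold_accuracy(X, y, k=4, seed=0):
    """Shuffle the samples, split them into k folds, and average held-out accuracy."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        model = fit_gaussian_nb([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k


def make_synthetic_data(n_per_class=30, n_features=5, shift=3.0, seed=1):
    """Hypothetical stand-in for expression data: two well-separated Gaussian classes."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(label * shift, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


X, y = make_synthetic_data()
accuracy = k_fold_accuracy(X, y, k=4)
print(f"4-fold cross-validated accuracy: {accuracy:.3f}")
```

Real expression profiles have far more genes than samples, so a feature (gene) selection step would normally precede the classifier. To avoid an optimistic accuracy estimate, that selection would need to be repeated inside each training fold.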

CLC number:

  • TP391.4