Journal of Guangxi Normal University (Natural Science Edition) ›› 2011, Vol. 29 ›› Issue (2): 174-179.


Method of Predicting Protein Complex Based on Supervised Learning

TANG Nan, YANG Zhi-hao, WU Jia-jin, WANG Yan-hua, LIN Hong-fei

  1. School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
  • Received: 2011-05-10 Published: 2018-11-19
  • Corresponding author: YANG Zhi-hao (born 1973), male, from Daqing, Heilongjiang; associate professor at Dalian University of Technology, Ph.D. E-mail: yangzh@dlut.edu.cn
  • Funding:
    National Natural Science Foundation of China (60673039, 61070098); National "863" Program of China (2006AA01Z151); Specialized Research Fund for the Doctoral Program of Higher Education (20090041110002); Fundamental Research Funds for the Central Universities (DUT10JS09); Liaoning Province Doctoral Start-up Fund (20091015)

Abstract: Protein-protein interaction (PPI) networks contain a large number of protein complexes, and these complexes are important for understanding the principles of cellular organization and function. Previous complex-prediction methods, however, are mostly based on the topological structure of the network and do not take the structural information of the complexes themselves into account. To address this problem, a supervised learning method for complex detection is proposed: multiple kinds of information that characterize complexes are used as features, a model is trained on a sample set with a supervised learning method, and the trained model is then applied in the complex detection algorithm. Experimental results show that the method effectively detects protein complexes in PPI networks.

Key words: protein interaction network, protein complex, supervised learning
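
The pipeline outlined in the abstract (features of known complexes, a trained model, then scoring of candidates) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not list the features or the learner, so the three features (size, edge density, mean internal degree), the nearest-centroid classifier, the toy network, and the names `subgraph_features`, `train`, `score`, and `detect` are hypothetical stand-ins, not the authors' method.

```python
import math

def subgraph_features(edges, nodes):
    """Illustrative features of a candidate protein set:
    size, edge density, and mean internal degree (assumed, not from the paper)."""
    nodes = set(nodes)
    n = len(nodes)
    internal = sum(1 for u, v in edges if u in nodes and v in nodes)
    possible = n * (n - 1) / 2
    density = internal / possible if possible else 0.0
    mean_degree = 2 * internal / n if n else 0.0
    return [float(n), density, mean_degree]

def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def train(edges, positives, negatives):
    """Nearest-centroid 'model': one centroid per class (a stand-in learner)."""
    pos = centroid([subgraph_features(edges, s) for s in positives])
    neg = centroid([subgraph_features(edges, s) for s in negatives])
    return pos, neg

def score(model, edges, nodes):
    """Positive when the candidate lies closer to the complex centroid."""
    pos, neg = model
    x = subgraph_features(edges, nodes)
    return math.dist(x, neg) - math.dist(x, pos)

def detect(model, edges, candidates):
    """Keep candidates the model labels as complexes, best score first."""
    scored = [(score(model, edges, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for s, c in scored if s > 0]

# Hypothetical toy PPI network: two dense groups joined by a sparse chain.
EDGES = [
    ("a1", "a2"), ("a1", "a3"), ("a1", "a4"),
    ("a2", "a3"), ("a2", "a4"), ("a3", "a4"),
    ("b1", "b2"), ("b1", "b3"), ("b2", "b3"),
    ("c1", "c2"), ("c2", "c3"), ("c3", "c4"),
    ("a4", "c1"), ("b3", "c4"),
]
# Hypothetical labeled samples: known complexes and non-complexes.
POSITIVES = [{"a1", "a2", "a3", "a4"}, {"b1", "b2", "b3"}]
NEGATIVES = [{"c1", "c2", "c3", "c4"}, {"a4", "c1", "c2"}, {"b3", "c4", "c3"}]

MODEL = train(EDGES, POSITIVES, NEGATIVES)

if __name__ == "__main__":
    for cand in detect(MODEL, EDGES, [{"c1", "c2", "c3"}, {"a1", "a2", "a3"}]):
        print(sorted(cand), round(score(MODEL, EDGES, cand), 3))
```

In a realistic setting the candidate sets would come from a search over the network and the learner would likely be something stronger than a centroid rule; the sketch only shows how labeled complexes can supply structural information that a purely topological method would not use.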

CLC Number:

  • TP391.3