Journal of Guangxi Normal University (Natural Science Edition) ›› 2020, Vol. 38 ›› Issue (6): 40-50. doi: 10.16088/j.issn.1001-6600.2020.06.005

Classification of Protein 3D Structure Based on Adaptive Local Features

ZHANG Ruchang1, QIU Jie2*, WANG Mingtang2, CHEN Qingfeng1*   

  1. School of Computer Electronics and Information, Guangxi University, Nanning Guangxi 530004, China;
  2. School of Computer Science and Engineering, Yulin Normal University, Yulin Guangxi 537000, China
  • Received: 2020-03-30 Published: 2020-11-30

Abstract: The three-dimensional structure of a protein determines its biological function. Structural similarity between proteins can be a good predictor of functional correlation. In this paper, the Cα atomic distance matrix of a protein is decomposed into many small sub-matrices that represent the protein's local structures. Statistical analysis of these local structures yields a local feature frequency vector, which is used to calculate the similarity between proteins. On this basis, a new method is proposed that measures protein structural similarity by an adaptive local feature frequency vector (ALFF). When selecting the local features, ALFF uses the Otsu method to determine the most appropriate local feature size m and MeanShift to find the representative number of local features k. Experimental results demonstrate that ALFF divides proteins into local substructures better and faster. In addition, compared with manual parameter selection, ALFF shows higher consistency in protein structure classification and better accuracy in TM-score comparison.

Key words: protein structural similarity, local feature, distance matrix, clustering, frequency vector

CLC Number: 

  • TP39
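
The pipeline outlined in the abstract (distance matrix, local sub-matrices, frequency vector, similarity) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the sliding-window extraction with a `stride` parameter, the nearest-representative assignment, and the cosine similarity are all assumptions made for illustration. The sub-matrix size `m` and the set of representatives are passed in by hand here, whereas the paper selects m with Otsu and k with MeanShift.

```python
import numpy as np

def ca_distance_matrix(coords):
    """Pairwise Euclidean distances between C-alpha coordinates, shape (n, 3) -> (n, n)."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def local_features(dist, m, stride=1):
    """Flatten every m x m sub-matrix of the distance matrix into one row (assumed windowing)."""
    n = dist.shape[0]
    feats = [dist[i:i + m, j:j + m].ravel()
             for i in range(0, n - m + 1, stride)
             for j in range(0, n - m + 1, stride)]
    return np.array(feats)

def frequency_vector(feats, representatives):
    """Assign each local feature to its nearest representative and return relative counts."""
    d = np.linalg.norm(feats[:, None, :] - representatives[None, :, :], axis=-1)
    labels = d.argmin(axis=1)
    counts = np.bincount(labels, minlength=len(representatives))
    return counts / counts.sum()

def similarity(f1, f2):
    """Cosine similarity between two frequency vectors (an assumed choice of measure)."""
    return float(f1 @ f2 / (np.linalg.norm(f1) * np.linalg.norm(f2)))

# Toy usage with random coordinates standing in for two proteins (hypothetical data).
rng = np.random.default_rng(0)
protein_a = rng.normal(scale=10.0, size=(30, 3))
protein_b = rng.normal(scale=10.0, size=(25, 3))

m = 4  # hand-picked sub-matrix size, for illustration only
feats_a = local_features(ca_distance_matrix(protein_a), m)
feats_b = local_features(ca_distance_matrix(protein_b), m)

# Hand-picked representatives: k random local features drawn from the pooled set.
pooled = np.vstack([feats_a, feats_b])
k = 5
representatives = pooled[rng.choice(len(pooled), size=k, replace=False)]

vec_a = frequency_vector(feats_a, representatives)
vec_b = frequency_vector(feats_b, representatives)
print(similarity(vec_a, vec_b))
```

Because both proteins are mapped to vectors of the same length k, structures of different sizes can be compared without an explicit residue-level alignment.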