Clustering PPI data by combining FA and SHC method
  • Authors: Xiujuan Lei ; Chao Ying ; Fang-Xiang Wu ; Jin Xu
  • Keywords: Protein-protein interaction (PPI) data ; firefly algorithm (FA) ; synchronization-based hierarchical clustering (SHC) ; spectral clustering (SC)
  • Journal: BMC Genomics
  • Publication year: 2015
  • Publication date: December 2015
  • Volume: 16
  • Issue: 3-supp
  • Full-text size: 1,089 KB
  • Affiliations: Xiujuan Lei (1) (2)
    Chao Ying (1)
    Fang-Xiang Wu (3)
    Jin Xu (2)

    1. School of Computer Science, Shaanxi Normal University, Xi'an, Shaanxi, 710062, China
    2. School of Electronics Engineering and Computer Science, Peking University, Beijing, 100871, China
    3. Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, S7N 5A9, Canada
  • Journal subjects: Life Sciences, general; Microarrays; Proteomics; Animal Genetics and Genomics; Microbial Genetics and Genomics; Plant Genetics & Genomics
  • Publisher: BioMed Central
  • ISSN: 1471-2164
Abstract
Clustering is one of the main methods for identifying functional modules from protein-protein interaction (PPI) data. Nevertheless, traditional clustering methods may not be effective on PPI data. In this paper, we propose a novel method for clustering PPI data that combines the firefly algorithm (FA) with the synchronization-based hierarchical clustering (SHC) algorithm. First, the PPI data are preprocessed via spectral clustering (SC), which transforms the high-dimensional similarity matrix into a low-dimensional matrix. The SHC algorithm then performs the clustering. SHC achieves hierarchical clustering by continuously enlarging the neighborhood radius of synchronized objects, but this hierarchical search has difficulty finding the optimal synchronization neighborhood radius and is inefficient. We therefore adopt the firefly algorithm to determine the optimal threshold of the synchronization neighborhood radius automatically. The proposed algorithm is tested on the MIPS PPI dataset. The results show that it outperforms the traditional algorithms in precision, recall, and F-measure.
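
The radius-selection step described in the abstract can be illustrated with a minimal sketch of a basic firefly search over a single scalar threshold. This is not the authors' implementation: the abstract does not give the fitness function or the FA parameter settings, so `toy_quality`, the search interval, and all parameter defaults below are assumptions chosen only for illustration. In the actual method, evaluating a candidate radius would presumably mean running SHC at that radius and scoring the resulting clusters.

```python
import math
import random


def firefly_search(objective, low, high, n_fireflies=15, n_iter=50,
                   beta0=1.0, gamma=1.0, alpha=0.1, seed=0):
    """Maximize `objective` on [low, high] with a basic firefly algorithm.

    Each firefly is one candidate radius; its brightness is objective(radius).
    Returns the best radius seen and its objective value.
    """
    rng = random.Random(seed)
    span = high - low
    xs = [rng.uniform(low, high) for _ in range(n_fireflies)]
    light = [objective(x) for x in xs]
    best_i = max(range(n_fireflies), key=light.__getitem__)
    best_x, best_val = xs[best_i], light[best_i]

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] > light[i]:
                    # Attractiveness decays with the (normalized) distance.
                    r = abs(xs[i] - xs[j]) / span
                    beta = beta0 * math.exp(-gamma * r * r)
                    # Small random step keeps the swarm exploring.
                    step = alpha * span * (rng.random() - 0.5)
                    x_new = xs[i] + beta * (xs[j] - xs[i]) + step
                    xs[i] = min(high, max(low, x_new))  # stay in bounds
                    light[i] = objective(xs[i])
                    if light[i] > best_val:
                        best_x, best_val = xs[i], light[i]
        alpha *= 0.97  # gradually shrink the random step
    return best_x, best_val


def toy_quality(radius):
    """Hypothetical stand-in for a clustering-quality score (peak at 0.3)."""
    return -(radius - 0.3) ** 2


if __name__ == "__main__":
    radius, score = firefly_search(toy_quality, 0.0, 1.0)
    print(f"best radius = {radius:.3f}, score = {score:.5f}")
```

Replacing `toy_quality` with a function that clusters the SC-reduced data at the given radius and returns a quality score would turn this into the kind of automatic threshold search the abstract describes, in place of enlarging the radius step by step.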
