A strategy to select suitable physicochemical attributes of amino acids for protein fold recognition
  • Authors: Alok Sharma (5) (5)
    Kuldip K Paliwal (4)
    Abdollah Dehzangi (4)
    James Lyons (4)
    Seiya Imoto (5)
    Satoru Miyano (5)
  • Journal: BMC Bioinformatics
  • Publication year: 2013
  • Publication date: December 2013
  • Volume: 14
  • Issue: 1
  • Full-text size: 314 KB
  • Author affiliations:
    5. School of Engineering and Physics, University of the South Pacific, Suva, Fiji
    4. School of Engineering, Griffith University, Brisbane, Australia
  • ISSN:1471-2105
Abstract
Background: Assigning a protein to one of its folds is an intermediate step toward determining its three-dimensional structure, which is a challenging task in biomolecular science. Current research focuses on: 1) the development of classifiers, and 2) the development of feature extraction techniques based on syntactic and/or physicochemical properties.

Results: Apart from these two main categories of research, we show that the selection of physicochemical attributes of the amino acids is an important step in protein fold recognition that has not been explored adequately. We present a multi-dimensional successive feature selection (MD-SFS) approach to select attributes systematically. Applied to protein sequence data, the method yields an improvement of around 24% in fold recognition when attributes are selected appropriately.

Conclusion: MD-SFS has been applied successfully to select physicochemical attributes of the amino acids. The selected attributes improve protein fold recognition performance.
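
The abstract does not spell out how MD-SFS works, so the following is only a rough illustration of the general idea of selecting attributes one at a time. It is a minimal sketch of plain greedy forward selection, not the authors' MD-SFS algorithm. The function names, the attribute names, and the toy scoring function are all hypothetical; in practice the score would be something like the cross-validated accuracy of a fold classifier.

```python
from typing import Callable, List, Sequence, Tuple


def forward_select(
    candidates: Sequence[str],
    score: Callable[[List[str]], float],
    max_size: int,
) -> Tuple[List[str], float]:
    """Generic greedy forward selection (NOT the paper's MD-SFS).

    At each step, add the single attribute that raises the score the most;
    stop when no attribute improves the score or max_size is reached.
    """
    selected: List[str] = []
    best = float("-inf")
    while len(selected) < max_size:
        trial_best, trial_attr = best, None
        for attr in candidates:
            if attr in selected:
                continue
            s = score(selected + [attr])
            if s > trial_best:
                trial_best, trial_attr = s, attr
        if trial_attr is None:  # no remaining attribute improves the score
            break
        selected.append(trial_attr)
        best = trial_best
    return selected, best


def toy_score(subset: List[str]) -> float:
    """Hypothetical stand-in for classifier accuracy.

    Rewards two 'useful' attributes and penalizes subset size.
    """
    useful = {"hydrophobicity": 0.3, "polarity": 0.2}
    return sum(useful.get(a, 0.0) for a in subset) - 0.05 * len(subset)


# Hypothetical attribute names, chosen only for illustration.
attributes = ["hydrophobicity", "polarity", "volume", "charge"]
chosen, value = forward_select(attributes, toy_score, max_size=3)
print(chosen, round(value, 2))
```

With this toy score the search stops once adding a further attribute no longer increases the score, even though max_size would allow one more attribute.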
