A Unified Adaptive Co-identification Framework for High-D Expression Data



Authors: Shuzhong Zhang (23)
Kun Wang (25)
Cody Ashby (25)
Bilian Chen (24)
Xiuzhen Huang (25)
Publication: Lecture Notes in Computer Science
Year: 2012
Volume: 7632
Issue: 1
Pages: 71-81
Full text size: 258 KB
Author affiliations:
23. University of Minnesota, Minneapolis, MN, 55455, USA
25. Arkansas State University, Jonesboro, AR, 72467, USA
24. Xiamen University, Xiamen, 361000, China
ISSN: 1611-3349

Abstract

High-throughput techniques are producing large-scale, high-dimensional genome-wide gene expression data (e.g., 4D data with genes vs. timepoints vs. conditions vs. tissues). This creates an increasing demand for effective methods for partitioning the data into biologically relevant groups. Current clustering and co-clustering approaches have limitations: they may be very time consuming and work only for low-dimensional expression datasets. In this work, we introduce a new notion of “co-identification”, which allows systematic identification of genes participating in different functional groups under different conditions or at different developmental stages. The key contribution of our work is a unified computational framework for co-identification that enables clustering to be high-dimensional and adaptive. Our framework is based upon a generic optimization model and a general optimization method termed Maximum Block Improvement. Test results on yeast and Arabidopsis expression data are presented to demonstrate the high efficiency and effectiveness of our approach.
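
The abstract names Maximum Block Improvement (MBI) but does not spell out the model, so the following is only a minimal sketch for the 2D (matrix) case and not the authors' implementation. It assumes a least-squares co-clustering objective, in which each entry of the data matrix is approximated by the centroid of its (row cluster, column cluster) pair, and it treats the centroids, the row labels, and the column labels as the three blocks. The function names (`objective`, `best_centroids`, `best_row_labels`, `best_col_labels`, `mbi_cocluster`) and all parameters are hypothetical. The MBI idea illustrated here is that every block is optimized separately while the others are held fixed, and only the block giving the largest improvement is actually updated in each iteration.

```python
import numpy as np

def objective(A, C, r, c):
    # Assumed model: sum of squared residuals between A[i, j] and C[r[i], c[j]].
    return float(np.sum((A - C[np.ix_(r, c)]) ** 2))

def best_centroids(A, C, r, c):
    # Optimal centroids for fixed labels: the mean of each co-cluster block.
    # Empty blocks keep their previous value.
    new_C = C.copy()
    for p in range(C.shape[0]):
        for q in range(C.shape[1]):
            block = A[np.ix_(r == p, c == q)]
            if block.size:
                new_C[p, q] = block.mean()
    return new_C

def best_row_labels(A, C, c):
    # cost[i, p] = sum_j (A[i, j] - C[p, c[j]])^2; pick the cheapest p per row.
    cost = ((A[:, None, :] - C[:, c][None, :, :]) ** 2).sum(axis=2)
    return cost.argmin(axis=1)

def best_col_labels(A, C, r):
    # Same computation on the transposed problem.
    return best_row_labels(A.T, C.T, r)

def mbi_cocluster(A, k1, k2, max_iter=100, tol=1e-9, seed=0):
    rng = np.random.default_rng(seed)
    r = rng.integers(k1, size=A.shape[0])   # random initial row labels
    c = rng.integers(k2, size=A.shape[1])   # random initial column labels
    C = best_centroids(A, np.zeros((k1, k2)), r, c)
    f = objective(A, C, r, c)
    history = [f]
    for _ in range(max_iter):
        # One candidate per block, each optimal with the other blocks fixed.
        candidates = [
            (best_centroids(A, C, r, c), r, c),
            (C, best_row_labels(A, C, c), c),
            (C, r, best_col_labels(A, C, r)),
        ]
        values = [objective(A, *cand) for cand in candidates]
        best = int(np.argmin(values))        # block with maximum improvement
        if f - values[best] <= tol:
            break                            # no block improves the objective
        C, r, c = candidates[best]           # update only that block
        f = values[best]
        history.append(f)
    return C, r, c, history

# Toy data with a 2 x 2 block structure (illustrative only).
A = np.kron(np.array([[0.0, 5.0], [5.0, 0.0]]), np.ones((3, 4)))
C, r, c, history = mbi_cocluster(A, k1=2, k2=2)
print(history)
```

Because a block is updated only when it strictly lowers the objective, the recorded objective values decrease monotonically; like other block-coordinate schemes, the sketch may stop at a local optimum that depends on the random initial labels. Extending it to higher-dimensional data would mean adding one label block per extra mode, which is not shown here.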
