An investigation of biomarkers derived from legacy microarray data for their utility in the RNA-seq era
  • Authors: Zhenqiang Su (1) (2)
    Hong Fang (1)
    Huixiao Hong (1)
    Leming Shi (3) (4) (5)
    Wenqian Zhang (1)
    Wenwei Zhang (6)
    Yanyan Zhang (6)
    Zirui Dong (6) (7)
    Lee J Lancashire (2)
    Marina Bessarabova (2)
    Xi Yang (1)
    Baitang Ning (1)
    Binsheng Gong (1)
    Joe Meehan (1)
    Joshua Xu (1)
    Weigong Ge (1)
    Roger Perkins (1)
    Matthias Fischer (8)
    Weida Tong (1)

    1. National Center for Toxicological Research, US Food and Drug Administration, 3900 NCTR Road, Jefferson, AR 72079, USA
    2. Thomson Reuters, IP & Science, 22 Thomson Place, Boston, MA 02210, USA
    3. State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, Schools of Life Sciences and Pharmacy, Fudan University, Shanghai 201203, China
    4. Fudan-Zhangjiang Center for Clinical Genomics, Shanghai 201203, China
    5. Zhanjiang Center for Translational Medicine, Shanghai 201203, China
    6. BGI-Shenzhen, Main Building, Bei Shan Industrial Zone, Yantian District, Shenzhen, Guangdong 518083, China
    7. BGI-Guangzhou, Guangzhou, China
    8. Department of Pediatric Oncology and Hematology and Center for Molecular Medicine (CMMC), University Children's Hospital of Cologne, Kerpener Strasse 62, D-50924 Cologne, Germany
  • Journal: Genome Biology
  • Publication year: 2014
  • Publication date: December 2014
  • Volume: 15
  • Issue: 12
  • Full-text size: 2,206 KB
  • References: 1. Michnick, SW (2006) The connectivity map. Nat Chem Biol 2: pp. 663-664 CrossRef
    2. Lamb, J, Crawford, ED, Peck, D, Modell, JW, Blat, IC, Wrobel, MJ, Lerner, J, Brunet, JP, Subramanian, A, Ross, KN, Reich, M, Hieronymus, H, Wei, G, Armstrong, SA, Haggarty, SJ, Clemons, PA, Wei, R, Carr, SA, Lander, ES, Golub, TR (2006) The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science 313: pp. 1929-1935 CrossRef
    3. Waters, M, Stasiewicz, S, Merrick, BA, Tomer, K, Bushel, P, Paules, R, Stegman, N, Nehls, G, Yost, KJ, Johnson, CH, Gustafson, SF, Xirasagar, S, Xiao, N, Huang, CC, Boyer, P, Chan, DD, Pan, Q, Gong, H, Taylor, J, Choi, D, Rashid, A, Ahmed, A, Howle, R, Selkirk, J, Tennant, R, Fostel, J (2008) CEBS–Chemical Effects in Biological Systems: a public data repository integrating study design and toxicity data with microarray and proteomics data. Nucleic Acids Res 36: pp. D892-D900 CrossRef
    4. Ganter, B, Snyder, RD, Halbert, DN, Lee, MD (2006) Toxicogenomics in drug discovery and development: mechanistic analysis of compound/class-dependent effects using the DrugMatrix database. Pharmacogenomics 7: pp. 1025-1044 CrossRef
    5. Kiyosawa, N, Manabe, S, Yamoto, T, Sanbuissho, A (2010) Practical application of toxicogenomics for profiling toxicant-induced biological perturbations. Int J Mol Sci 11: pp. 3397-3412 CrossRef
    6. Veer, LJ, Dai, H, Vijver, MJ, He, YD, Hart, AA, Mao, M, Peterse, HL, Kooy, K, Marton, MJ, Witteveen, AT, Schreiber, GJ, Kerkhoven, RM, Roberts, C, Linsley, PS, Bernards, R, Friend, SH (2002) Gene expression profiling predicts clinical outcome of breast cancer. Nature 415: pp. 530-536 CrossRef
    7. Kuiper, R, Broyl, A, Knegt, Y, Vliet, MH, Beers, EH, Holt, B, Jarari, L, Mulligan, G, Gregory, W, Morgan, G, Goldschmidt, H, Lokhorst, HM, Duin, M, Sonneveld, P (2012) A gene expression signature for high-risk multiple myeloma. Leukemia 26: pp. 2406-2413 CrossRef
    8. Zhan, F, Barlogie, B, Arzoumanian, V, Huang, Y, Williams, DR, Hollmig, K, Pineda-Roman, M, Tricot, G, Rhee, F, Zangari, M, Dhodapkar, M, Shaughnessy, JD (2007) Gene-expression signature of benign monoclonal gammopathy evident in multiple myeloma is linked to good prognosis. Blood 109: pp. 1692-1700 CrossRef
    9. Su, Z, Hong, H, Fang, H, Shi, L, Perkins, R, Tong, W (2008) Very Important Pool (VIP) genes–an application for microarray-based molecular signatures. BMC Bioinformatics 9: pp. S9 CrossRef
    10. Cornero, A, Acquaviva, M, Fardin, P, Versteeg, R, Schramm, A, Eva, A, Bosco, MC, Blengio, F, Barzaghi, S, Varesio, L (2012) Design of a multi-signature ensemble classifier predicting neuroblastoma patients' outcome. BMC Bioinformatics 13: pp. S13 CrossRef
    11. Simon, R (2006) Development and evaluation of therapeutically relevant predictive classifiers using gene expression profiling. J Natl Cancer Inst 98: pp. 1169-1171 CrossRef
    12. Su, Z, Hong, H, Perkins, R, Shao, X, Cai, W, Tong, W (2007) Consensus analysis of multiple classifiers using non-repetitive variables: diagnostic application to microarray gene expression data. Comput Biol Chem 31: pp. 48-56 CrossRef
    13. Wang, Z, Gerstein, M, Snyder, M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: pp. 57-63 CrossRef
    14. Rowley, JW, Oler, AJ, Tolley, ND, Hunter, BN, Low, EN, Nix, DA, Yost, CC, Zimmerman, GA, Weyrich, AS (2011) Genome-wide RNA-seq analysis of human and mouse platelet transcriptomes. Blood 118: pp. e101-e111 CrossRef
    15. Su, Z, Ning, B, Fang, H, Hong, H, Perkins, R, Tong, W, Shi, L (2011) Next-generation sequencing and its applications in molecular diagnostics. Expert Rev Mol Diagn 11: pp. 333-343
    16. Su, Z, Labaj, PP, Li, S, Thierry-Mieg, J, Thierry-Mieg, D, Shi, W, Wang, C, Schroth, GP, Jones, WD, Xiao, W, Xu, W, Jensen, RV, Kelly, R, Xu, J, Conesa, A, Furlanello, C, Gao, H, Hong, H, Jafari, N, Letovsky, S, Liao, Y, Lu, F, Oakeley, EJ, Peng, Z, Praul, CA, Santoyo-Lopez, J, Scherer, A, Shi, T, Smyth, GK, Staedtler, F (2014) A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium. Nat Biotechnol 32: pp. 903-914 CrossRef
    17. Network, TCGAR (2013) Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia. N Engl J Med 368: pp. 2059-2074 CrossRef
    18. Tibshirani, R, Hastie, T, Narasimhan, B, Chu, G (2002) Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc Natl Acad Sci U S A 99: pp. 6567-6572 CrossRef
    19. Collett, D (2003) Modelling Survival Data in Medical Research. Chapman and Hall/CRC, Boca Raton, FL
    20. Su, Z, Li, Z, Chen, T, Li, QZ, Fang, H, Ding, D, Ge, W, Ning, B, Hong, H, Perkins, RG, Tong, W, Shi, L (2011) Comparing next-generation sequencing and microarray technologies in a toxicological study of the effects of aristolochic acid on rat kidneys. Chem Res Toxicol 24: pp. 1486-1493 CrossRef
    21. Guo, L, Lobenhofer, EK, Wang, C, Shippy, R, Harris, SC, Zhang, L, Mei, N, Chen, T, Herman, D, Goodsaid, FM, Hurban, P, Phillips, KL, Xu, J, Deng, X, Sun, YA, Tong, W, Dragan, YP, Shi, L (2006) Rat toxicogenomic study reveals analytical consistency across microarray platforms. Nat Biotechnol 24: pp. 1162-1169 CrossRef
    22. Wang, C, Gong, B, Bushel, PR, Thierry-Mieg, J, Thierry-Mieg, D, Xu, J, Fang, H, Hong, H, Shen, J, Su, Z, Meehan, J, Li, X, Yang, L, Li, H, Labaj, PP, Kreil, DP, Megherbi, D, Gaj, S, Caiment, F, Delft, J, Kleinjans, J, Scherer, A, Devanarayan, V, Wang, J, Yang, Y, Qian, HR, Lancashire, LJ, Bessarabova, M, Nikolsky, Y, Furlanello, C (2014) The concordance between RNA-seq and microarray data depends on chemical treatment and transcript abundance. Nat Biotechnol 32: pp. 926-932 CrossRef
    23. Shi, L, Reid, LH, Jones, WD, Shippy, R, Warrington, JA, Baker, SC, Collins, PJ, Longueville, F, Kawasaki, ES, Lee, KY, Luo, Y, Sun, YA, Willey, JC, Setterquist, RA, Fischer, GM, Tong, W, Dragan, YP, Dix, DJ, Frueh, FW, Goodsaid, FM, Herman, D, Jensen, RV, Johnson, CD, Lobenhofer, EK, Puri, RK, Scherf, U, Thierry-Mieg, J, Wang, C, Wilson, M, Wolber, PK (2006) The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nat Biotechnol 24: pp. 1151-1161 CrossRef
    24. Fan, X, Lobenhofer, EK, Chen, M, Shi, W, Huang, J, Luo, J, Zhang, J, Walker, SJ, Chu, TM, Li, L, Wolfinger, R, Bao, W, Paules, RS, Bushel, PR, Li, J, Shi, T, Nikolskaya, T, Nikolsky, Y, Hong, H, Deng, Y, Cheng, Y, Fang, H, Shi, L, Tong, W (2010) Consistency of predictive signature genes and classifiers generated using different microarray platforms. Pharmacogenomics J 10: pp. 247-257 CrossRef
    25. Djebali, S, Davis, CA, Merkel, A, Dobin, A, Lassmann, T, Mortazavi, A, Tanzer, A, Lagarde, J, Lin, W, Schlesinger, F, Xue, C, Marinov, GK, Khatun, J, Williams, BA, Zaleski, C, Rozowsky, J, Roder, M, Kokocinski, F, Abdelhamid, RF, Alioto, T, Antoshechkin, I, Baer, MT, Bar, NS, Batut, P, Bell, K, Bell, I, Chakrabortty, S, Chen, X, Chrast, J, Curado, J (2012) Landscape of transcription in human cells. Nature 489: pp. 101-108 CrossRef
    26. Schroder, MS, Culhane, AC, Quackenbush, J, Haibe-Kains, B (2011) Survcomp: an R/Bioconductor package for performance assessment and comparison of survival models. Bioinformatics 27: pp. 3206-3208 CrossRef
    27. Harrell, FE, Califf, RM, Pryor, DB, Lee, KL, Rosati, RA (1982) Evaluating the yield of medical tests. JAMA 247: pp. 2543-2546 CrossRef
    28. Shi, L, Campbell, G, Jones, WD, Campagne, F, Wen, Z, Walker, SJ, Su, Z, Chu, TM, Goodsaid, FM, Pusztai, L, Shaughnessy, JD, Oberthuer, A, Thomas, RS, Paules, RS, Fielden, M, Barlogie, B, Chen, W, Du, P, Fischer, M, Furlanello, C, Gallas, BD, Ge, X, Megherbi, DB, Symmans, WF, Wang, MD, Zhang, J, Bitter, H, Brors, B, Bushel, PR, Bylesjo, M (2010) The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models. Nat Biotechnol 28: pp. 827-838 CrossRef
    29. Dunham, I, Kundaje, A, Aldred, SF, Collins, PJ, Davis, CA, Doyle, F, Epstein, CB, Frietze, S, Harrow, J, Kaul, R, Khatun, J, Lajoie, BR, Landt, SG, Lee, BK, Pauli, F, Rosenbloom, KR, Sabo, P, Safi, A, Sanyal, A, Shoresh, N, Simon, JM, Song, L, Trinklein, ND, Altshuler, RC, Birney, E, Brown, JB, Cheng, C, Djebali, S, Dong, X, Dunham, I (2012) An integrated encyclopedia of DNA elements in the human genome. Nature 489: pp. 57-74 CrossRef
    30. Irizarry, RA, Wu, Z, Jaffee, HA (2006) Comparison of Affymetrix GeneChip expression measures. Bioinformatics 22: pp. 789-794 CrossRef
    31. TCGA AML RNA-Seq data matrix [https://tcga-data.nci.nih.gov/docs/publications/laml_2012/laml.rnaseq.179_v1.0_gaf2.0_rpkm_matrix.txt.tcgaID.txt.gz]
    32. TCGA AML Affymetrix level 2 data matrix [https://tcga-data.nci.nih.gov/docs/publications/laml_2012/HG-U133_Plus_2.Level_2.tgz]
    33. UCSC rat genome rn4 reference [http://hgdownload.cse.ucsc.edu/goldenPath/rn4]
    34. Novoalign from the Novocraft Company [www.novocraft.com]
    35. Mortazavi, A, Williams, BA, McCue, K, Schaeffer, L, Wold, B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: pp. 621-628 CrossRef
    36. Affymetrix microarray data from the DrugMatrix [ftp://anonftp.niehs.nih.gov/drugmatrix/Affymetrix_data/Normalized_data_by_organ]
    37. Affymetrix array annotation files [http://www.affymetrix.com/support/technical/annotationfilesmain.affx]
  • Journal subjects: Animal Genetics and Genomics; Human Genetics; Plant Genetics & Genomics; Microbial Genetics and Genomics; Fungus Genetics; Bioinformatics
  • Publisher: BioMed Central
  • ISSN:1465-6906
Abstract
Background: Gene expression microarrays have been the primary biomarker platform in biomedical research, and their widespread use has produced an enormous accumulation of data, predictive models, and biomarkers. RNA-seq now looks likely to replace microarrays, but there will be a period in which both technologies co-exist. This raises two important questions: Can microarray-based models and biomarkers be directly applied to RNA-seq data? Can future RNA-seq-based predictive models and biomarkers be applied to microarray data to leverage past investment?

Results: We systematically evaluated the transferability of predictive models and signature genes between microarray and RNA-seq using two large clinical data sets. The complexity of cross-platform sequence correspondence was considered in the analysis and examined using three human and two rat data sets, which revealed three levels of mapping complexity. Three algorithms representing different modeling complexity were applied to the three levels of mapping for each of the eight binary endpoints, and Cox regression was used to model survival times with expression data. In total, 240,096 predictive models were examined.

Conclusions: Signature genes of predictive models are reciprocally transferable between microarray and RNA-seq data for model development, and microarray-based models can accurately predict RNA-seq-profiled samples. RNA-seq-based models, by contrast, are less accurate in predicting microarray-profiled samples and are affected by both the choice of modeling algorithm and the gene mapping complexity. The results suggest that legacy microarray data and established microarray biomarkers and predictive models will remain useful in the forthcoming RNA-seq era.
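The idea of applying a microarray-trained model to RNA-seq-profiled samples can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy expression values, the two-gene signature, the per-gene z-scoring within each platform, and the plain nearest-centroid classifier standing in for whichever algorithms the authors used.

```python
from statistics import mean, pstdev

def zscore_per_gene(samples):
    """Standardize each gene (column) within one platform's sample set."""
    genes = list(zip(*samples))
    mus = [mean(g) for g in genes]
    sds = [pstdev(g) or 1.0 for g in genes]  # guard against zero variance
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in samples]

def fit_centroids(samples, labels):
    """Compute one mean expression profile (centroid) per class."""
    centroids = {}
    for label in set(labels):
        rows = [r for r, l in zip(samples, labels) if l == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, samples):
    """Assign each sample to the class with the nearest centroid."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(centroids, key=lambda lab: dist2(centroids[lab], row))
            for row in samples]

# Hypothetical data: two genes already mapped one-to-one across platforms.
# Microarray values mimic log2 intensities; RNA-seq values mimic log2 FPKM.
microarray = [[8.0, 5.0], [8.5, 5.2], [5.0, 8.1], [5.3, 8.4]]
microarray_labels = ["A", "A", "B", "B"]
rnaseq = [[3.1, 0.4], [0.2, 3.3], [2.8, 0.1], [0.5, 2.9]]

# Train on microarray, then predict the RNA-seq-profiled samples.
model = fit_centroids(zscore_per_gene(microarray), microarray_labels)
predicted = predict(model, zscore_per_gene(rnaseq))
print(predicted)
```

Standardizing each gene within its own platform is one simple way to put the two measurement scales on a common footing before the model is transferred; other normalizations would be equally plausible here.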
