A pruning strategy of reference panels for fast SNP genotype imputation
  • Authors: Erkhembayar Jadamba (17102)
    Miyoung Shin (17102)
    Myungguen Chung (27102)
    Kiejung Park (27102)
  • Keywords: SNP imputation; Reference panel pruning; Kinship coefficient
  • Journal: BioChip Journal
  • Publication year: 2013
  • Publication date: March 2013
  • Volume: 7
  • Issue: 1
  • Pages: 6-10
  • Full text size: 232 KB
  • References: 1. Lewis, C.M. Genetic association studies: design, analysis and interpretation. Brief. Bioinform. 3, 146-53 (2002).
    2. Tanaka, T. International HapMap project. Nihon Rinsho 63, 29-4 (2005).
    3. Thorisson, G.A., Smith, A.V., Krishnan, L. & Stein, L.D. The international HapMap project web site. Genome Res. 15, 1592-593 (2005).
    4. 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature 467, 1061-073 (2010).
    5. Howie, B., Marchini, J. & Stephens, M. Genotype imputation with thousands of genomes. G3 (Bethesda) 1, 457-70 (2011).
    6. Huang, L. et al. The relationship between imputation error and statistical power in genetic association studies in diverse populations. Am. J. Hum. Genet. 85, 692-98 (2009).
    7. Pasaniuc, B. et al. A generic coalescent-based framework for the selection of a reference panel for imputation. Genet. Epidemiol. 34, 773-82 (2010).
    8. Jostins, L., Morley, K.I. & Barrett, J.C. Imputation of low-frequency variants using the HapMap3 benefits from large, diverse reference sets. Eur. J. Hum. Genet. 19, 662-66 (2011).
    9. Sung, Y.J., Wang, L., Rankinen, T., Bouchard, C. & Rao, D.C. Performance of genotype imputations using data from the 1000 genomes project. Hum. Hered. 73, 18-5 (2012).
    10. Huang, L. et al. Genotype-imputation accuracy across worldwide human populations. Am. J. Hum. Genet. 84, 235-50 (2009).
    11. Mao, X. et al. Distinct genomic alterations in prostate cancers in Chinese and Western populations suggest alternative pathways of prostate carcinogenesis. Cancer Res. 70, 5207-212 (2010).
    12. Danford, T., Rolfe, A. & Gifford, D. GSE: a comprehensive database system for the representation, retrieval, and analysis of microarray data. Pac. Symp. Biocomput. 539-50 (2008).
    13. Edgar, R., Domrachev, M. & Lash, A.E. Gene expression omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 30, 207-10 (2002).
    14. Browning, B.L. & Browning, S.R. A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals. Am. J. Hum. Genet. 84, 210-23 (2009).
    15. Browning, S.R. Missing data imputation and haplotype phase inference for genome-wide association studies. Hum. Genet. 124, 439-50 (2008).
    16. Browning, S.R. & Browning, B.L. Haplotype phasing: existing methods and new developments. Nat. Rev. Genet. 12, 703-14 (2011).
    17. Sung, Y.J., Wang, L., Rankinen, T., Bouchard, C. & Rao, D.C. Performance of genotype imputations using data from the 1000 genomes project. Hum. Hered. 73, 18-5 (2012).
    18. Manichaikul, A. et al. Robust relationship inference in genome-wide association studies. Bioinformatics 26, 2867-873 (2010).
  • Affiliations:
    17102. Bio-Intelligence & Data Mining Laboratory, Graduate School of Electrical Engineering and Computer Science, Kyungpook National University, Daegu, Korea
    27102. Division of Bio-Medical Informatics, Center for Genome Science, Korea National Institute of Health, Seoul, Korea
  • ISSN: 2092-7843
Abstract
In recent genome-wide association studies, genotype imputation for missing SNPs is a common procedure for increasing the power of the observed genetic markers. Imputation usually relies on publicly available resources, such as the International HapMap Project data or the 1000 Genomes Project data, as a reference panel. However, the volume of these resources is growing rapidly as high-throughput genotyping technology matures, so learning from large reference panels often requires heavy computation and leads to long imputation times. To address this problem, we propose a pruning strategy for constructing imputation reference panels: the panel is reduced by excluding (pruning) somewhat redundant samples, identified by estimating the kinship coefficients between samples. For evaluation, the approach was implemented under the Beagle framework and tested on two real datasets, Mao et al.’s prostate cancer data and KNIH’s diabetes data. Our experimental results show that the proposed pruning strategy shortens imputation time without loss of imputation accuracy.
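
The abstract does not say which kinship estimator, threshold, or pruning order the authors used, so the following is only a minimal sketch of kinship-based panel pruning under stated assumptions. It assumes complete genotypes coded 0/1/2 (minor-allele counts), a KING-robust-style pairwise estimator, and a greedy pass that keeps a sample only if it is not closely related to any sample already kept. The function names `kinship` and `prune_panel` and the cutoff `0.177` are hypothetical choices for illustration, not the authors' settings.

```python
def kinship(g_i, g_j):
    """KING-robust-style kinship estimate for two 0/1/2 genotype vectors.

    Assumed estimator (not confirmed by the abstract):
        0.5 - sum((g_i - g_j)^2) / (4 * min(het_i, het_j))
    where het_x is the number of heterozygous sites of sample x.
    """
    het_i = sum(1 for a in g_i if a == 1)
    het_j = sum(1 for b in g_j if b == 1)
    min_het = min(het_i, het_j)
    if min_het == 0:
        # Estimate is undefined without heterozygous sites; treat as unrelated.
        return 0.0
    sq_diff = sum((a - b) ** 2 for a, b in zip(g_i, g_j))
    return 0.5 - sq_diff / (4.0 * min_het)


def prune_panel(genotypes, threshold=0.177):
    """Greedy pruning: return indices of samples to keep in the panel.

    A sample is kept only if its kinship with every already-kept sample
    is below `threshold` (hypothetical cutoff chosen for illustration).
    """
    kept = []
    for idx, g in enumerate(genotypes):
        if all(kinship(g, genotypes[k]) < threshold for k in kept):
            kept.append(idx)
    return kept


# Toy panel: sample 1 duplicates sample 0, sample 2 is unrelated.
panel = [
    [0, 1, 2, 1, 0, 1],
    [0, 1, 2, 1, 0, 1],
    [2, 1, 0, 0, 2, 1],
]
kept_indices = prune_panel(panel)
```

In this toy panel the duplicated sample is pruned and the other two are kept. The reduced set of samples would then be written out as the reference panel passed to the imputation tool; that step is omitted here because it depends on the tool's input format.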
