A block-based imputation approach with adaptive LD blocks for fast genotype imputation
  • Authors: Jaeyoung Kim (17110)
    Miyoung Shin (17110)
    Myungguen Chung (27110)
    Kiejung Park (27110)
  • Keywords: SNP chip; Next Generation Sequencing; Imputation; Linkage disequilibrium block; Haplotype reference panel
  • Journal: BioChip Journal
  • Publication year: 2013
  • Publication date: March 2013
  • Volume: 7
  • Issue: 1
  • Pages: 63-67
  • Full-text size: 329 KB
  • Author affiliations:
    17110. Bio-Intelligence & Data Mining Laboratory, Graduate School of Electrical Engineering and Computer Science, Kyungpook National University, Daegu, Korea
    27110. Division of Bio-Medical Informatics, Center for Genome Science, Korea National Institute of Health, 187 Osongsaengmyeong2(i)-ro, Gangoe-myeon, Cheongwon-gun, Chungcheongbuk-do, 363-951, Korea
  • ISSN:2092-7843
Abstract
This paper addresses the long imputation time usually required for large volumes of SNP genotype data, which can be easily obtained from biological experiments using genome-wide SNP chips or next-generation sequencing technology. To reduce this time, we propose a block-based imputation approach that generates adaptive LD blocks from the observed SNP genotype data and applies an imputation procedure to each block separately. We also implemented the block-based imputation to maximize the use of computing resources. Specifically, each block imputation task is allocated to an individual processor and executed on it independently. Multiple block imputation tasks can thus run on multiple processors in parallel, with the degree of parallelization reaching up to the maximum number of processors allowed by the user's computing environment. Our experiment was performed with Mao et al.'s prostate cancer dataset. The results show that our adaptive block approach can reduce the imputation time up to 60–?0% of the original imputation time given by MaCH without loss of imputation accuracy.
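
The block-then-parallelize idea in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' method: the abstract does not say how the adaptive LD blocks are built, so the sketch simply starts a new block wherever the r² between adjacent SNPs falls below a hypothetical `threshold`; the per-block imputer is a stand-in that fills missing calls with the most common genotype rather than calling MaCH; and a thread pool stands in for the paper's per-processor task allocation. All function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def r_squared(x, y):
    """Squared Pearson correlation of two genotype vectors (0/1/2), skipping missing (None) calls."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return 0.0
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    return sxy * sxy / (sxx * syy)


def split_into_blocks(snps, threshold=0.5):
    """Assumed blocking rule: start a new block when adjacent SNPs have r^2 below threshold."""
    blocks = [[0]] if snps else []
    for i in range(1, len(snps)):
        if r_squared(snps[i - 1], snps[i]) < threshold:
            blocks.append([i])
        else:
            blocks[-1].append(i)
    return blocks


def impute_block(block_snps):
    """Placeholder imputer (not MaCH): fill each missing call with that SNP's most common genotype."""
    imputed = []
    for snp in block_snps:
        observed = [g for g in snp if g is not None]
        fill = Counter(observed).most_common(1)[0][0] if observed else None
        imputed.append([fill if g is None else g for g in snp])
    return imputed


def impute_by_blocks(snps, threshold=0.5, max_workers=4):
    """Split SNPs into blocks, impute each block as an independent task, and reassemble in order."""
    blocks = split_into_blocks(snps, threshold)
    tasks = [[snps[i] for i in block] for block in blocks]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(impute_block, tasks))  # map preserves block order
    return [snp for block in results for snp in block]


# Rows are SNPs, columns are samples; None marks a missing genotype call.
snps = [
    [0, 0, 1, 2, None],
    [0, 0, 1, 2, 2],
    [0, 2, 0, 2, 0],
    [0, 2, 0, None, 0],
]
blocks = split_into_blocks(snps)
imputed = impute_by_blocks(snps)
```

Because each block is handled by its own task, `max_workers` plays the role of the processor limit mentioned in the abstract. For CPU-bound imputation, `ProcessPoolExecutor` from the same module could be substituted for the thread pool.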
