A pruning strategy of reference panels for fast SNP genotype imputation

Authors: Erkhembayar Jadamba (17102)
Miyoung Shin (17102)
Myungguen Chung (27102)
Kiejung Park (27102)
Keywords: SNP imputation; Reference panel pruning; Kinship coefficient
Journal: BioChip Journal
Publication year: 2013
Publication date: March 2013
Volume: 7
Issue: 1
Pages: 6-10
Full-text size: 232 KB
References:
1. Lewis, C.M. Genetic association studies: design, analysis and interpretation. Brief. Bioinform. 3, 146-53 (2002).
2. Tanaka, T. International HapMap project. Nihon Rinsho 63, 29-4 (2005).
3. Thorisson, G.A., Smith, A.V., Krishnan, L. & Stein, L.D. The international HapMap project web site. Genome Res. 15, 1592-593 (2005).
4. 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature 467, 1061-073 (2010).
5. Howie, B., Marchini, J. & Stephens, M. Genotype imputation with thousands of genomes. G3 (Bethesda) 1, 457-70 (2011).
6. Huang, L. et al. The relationship between imputation error and statistical power in genetic association studies in diverse populations. Am. J. Hum. Genet. 85, 692-98 (2009).
7. Pasaniuc, B. et al. A generic coalescent-based framework for the selection of a reference panel for imputation. Genet. Epidemiol. 34, 773-82 (2010).
8. Jostins, L., Morley, K.I. & Barrett, J.C. Imputation of low-frequency variants using the HapMap3 benefits from large, diverse reference sets. Eur. J. Hum. Genet. 19, 662-66 (2011).
9. Sung, Y.J., Wang, L., Rankinen, T., Bouchard, C. & Rao, D.C. Performance of genotype imputations using data from the 1000 genomes project. Hum. Hered. 73, 18-5 (2012).
10. Huang, L. et al. Genotype-imputation accuracy across worldwide human populations. Am. J. Hum. Genet. 84, 235-50 (2009).
11. Mao, X. et al. Distinct genomic alterations in prostate cancers in Chinese and Western populations suggest alternative pathways of prostate carcinogenesis. Cancer Res. 70, 5207-212 (2010).
12. Danford, T., Rolfe, A. & Gifford, D. GSE: a comprehensive database system for the representation, retrieval, and analysis of microarray data. Pac. Symp. Biocomput. 539-50 (2008).
13. Edgar, R., Domrachev, M. & Lash, A.E. Gene expression omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 30, 207-10 (2002).
14. Browning, B.L. & Browning, S.R. A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals. Am. J. Hum. Genet. 84, 210-23 (2009).
15. Browning, S.R. Missing data imputation and haplotype phase inference for genome-wide association studies. Hum. Genet. 124, 439-50 (2008).
16. Browning, S.R. & Browning, B.L. Haplotype phasing: existing methods and new developments. Nat. Rev. Genet. 12, 703-14 (2011).
17. Sung, Y.J., Wang, L., Rankinen, T., Bouchard, C. & Rao, D.C. Performance of genotype imputations using data from the 1000 genomes project. Hum. Hered. 73, 18-5 (2012).
18. Manichaikul, A. et al. Robust relationship inference in genome-wide association studies. Bioinformatics 26, 2867-873 (2010).
Author affiliations: Erkhembayar Jadamba (17102)
Miyoung Shin (17102)
Myungguen Chung (27102)
Kiejung Park (27102)

17102. Bio-Intelligence & Data Mining Laboratory, Graduate School of Electrical Engineering and Computer Science, Kyungpook National University, Daegu, Korea
27102. Division of Bio-Medical Informatics, Center for Genome Science, Korea National Institute of Health, Seoul, Korea
ISSN: 2092-7843

Abstract

In recent genome-wide association studies, genotype imputation of missing SNPs is a common procedure for increasing the power of observed genetic markers. For genotype imputation, researchers usually employ publicly available resources, such as the International HapMap Project data or the 1000 Genomes Project data, as a reference panel. However, the volume of publicly available resources has been increasing rapidly with the maturation of high-throughput genotyping technology. Learning from large reference panels therefore often requires heavy computation, leading to long imputation times. To address this problem, we propose a pruning strategy for constructing imputation reference panels: the size of the reference panel is reduced by excluding (pruning) somewhat redundant samples, identified from estimates of the kinship coefficients between samples. For evaluation, this approach was implemented under the Beagle framework and tested on two real datasets, Mao et al.’s prostate cancer data and KNIH’s diabetes data. Our experimental results show that the proposed pruning strategy for reference panel construction can shorten imputation time without loss of imputation accuracy.
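
The abstract does not spell out how samples are selected for exclusion, so the following is only a minimal sketch of kinship-based pruning, not the authors' implementation. It assumes that a symmetric matrix of pairwise kinship coefficients has already been estimated by some other tool, and that a sample counts as redundant when its kinship with any already-kept sample exceeds a cutoff. The function name `prune_by_kinship`, the greedy keep-first rule, the threshold value, and the toy matrix are all hypothetical.

```python
from typing import List, Sequence


def prune_by_kinship(kinship: Sequence[Sequence[float]], threshold: float) -> List[int]:
    """Return the indices of reference samples kept after greedy pruning.

    kinship[i][j] is a precomputed kinship coefficient between samples i and j
    (assumed input; the estimator itself is not shown here). Samples are
    visited in order, and a sample is dropped if its kinship with any
    previously kept sample exceeds `threshold`.
    """
    kept: List[int] = []
    for i in range(len(kinship)):
        if all(kinship[i][j] <= threshold for j in kept):
            kept.append(i)
    return kept


# Toy symmetric matrix for four reference samples (made-up values):
# samples 0 and 1 are closely related, as are samples 2 and 3.
toy_kinship = [
    [0.50, 0.25, 0.01, 0.00],
    [0.25, 0.50, 0.02, 0.01],
    [0.01, 0.02, 0.50, 0.26],
    [0.00, 0.01, 0.26, 0.50],
]

kept_samples = prune_by_kinship(toy_kinship, threshold=0.10)
print(kept_samples)  # → [0, 2]
```

With this toy input, one sample from each closely related pair is retained, so the pruned panel is half the original size. In practice, the cutoff and the visiting order would both affect which samples survive.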
