Porting the Princeton Ocean Model to GPUs
  • Authors: Shizhen Xu (24) (26)
    Xiaomeng Huang (24) (25) (26)
    Yan Zhang (24) (25) (26)
    Yong Hu (24) (26)
    Haohuan Fu (24) (25) (26)
    Guangwen Yang (24) (25) (26)
  • Journal: Lecture Notes in Computer Science
  • Year: 2014
  • Volume: 8630
  • Issue: 1
  • Pages: 1-14
  • Full-text size: 385 KB
  • Affiliations:
    24. Ministry of Education Key Laboratory for Earth System Modeling, China
    25. Center for Earth System Science, Tsinghua University, 100084, China
    26. Joint Center for Global Change Studies, Beijing, 100875, China
  • ISSN: 1611-3349
Abstract
While GPUs are becoming a compelling acceleration solution for many scientific applications, most existing work on climate models has achieved only limited speedup. This is due to partial porting of the large code bases and the memory-bound nature of these models. In this work, we design and implement a customized GPU-based acceleration of the Princeton Ocean Model (gpuPOM). Targeting Nvidia’s state-of-the-art GPU architectures (K20X and K40m), we completely rewrite the original model from Fortran into CUDA-C. We present several acceleration methods, including optimizing memory access on a single GPU and overlapping communication and boundary operations across multiple GPUs. The experimental results show that, compared with different Intel CPUs, gpuPOM achieves a 6.9-fold to 17.8-fold speedup on one K40m GPU and a 5.8-fold to 15.5-fold speedup on one K20X GPU. Further experiments on multiple GPUs indicate that the performance of gpuPOM on a super-workstation containing 4 GPUs is equivalent to that of a powerful cluster consisting of 34 pure CPU nodes with over 400 CPU cores.
