Parallelizing Convolutional Neural Networks on Intel® Many Integrated Core Architecture

Authors: Junjie Liu (17)
Haixia Wang (17)
Dongsheng Wang (17)
Yuan Gao (17)
Zuofeng Li (17)

17. Tsinghua National Laboratory for Information Science and Technology, Beijing 100084, China
Keywords: Convolutional neural network; OpenMP; Intel Many Integrated Core Architecture; Xeon Phi
Publication title: Lecture Notes in Computer Science
Publication year: 2015
Volume: 9017
Issue: 1
Pages: 71-82
Full-text size: 1,302 KB
Book title: Architecture of Computing Systems – ARCS 2015
ISBN: 978-3-319-16085-6
Publication category: Computer Science
Publication subjects: Artificial Intelligence and Robotics
Computer Communication Networks
Software Engineering
Data Encryption
Database Management
Computation by Abstract Devices
Algorithm Analysis and Problem Complexity
Publisher: Springer Berlin / Heidelberg
ISSN: 1611-3349

Abstract

Convolutional neural networks (CNNs) are state-of-the-art machine learning algorithms for low-resolution vision tasks and are widely used in many applications. However, their training process is very time-consuming. As a result, many acceleration approaches have been proposed, of which parallelization is one of the most effective. In this article, we parallelize a classic CNN with OpenMP on a new platform, the Intel® Xeon Phi™ Coprocessor. Our implementation achieves a 131× speedup over the serial version running on the coprocessor itself and an 8.3× speedup over the serial baseline on the Xeon® E5-2697 CPU.
