用户名: 密码: 验证码:
LUTMap: A Dynamic Heuristic Application Mapping Algorithm Based on Lookup Tables
详细信息    查看全文
  • 关键词:Multicore ; Network ; on ; chip ; Application mapping ; Lookup table
  • 刊名:Lecture Notes in Computer Science
  • 出版年:2016
  • 出版时间:2016
  • 年:2016
  • 卷:9864
  • 期:1
  • 页码:134-146
  • 全文大小:1,013 KB
  • 参考文献:1.Bienia, C., Kumar, S., Singh, J.P., Li, K.: The parsec benchmark suite: characterization and architectural implications. In: Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, PACT 2008, pp. 72–81. ACM, New York (2008)
    2.Chen, Y.J., Yang, C.L., Chang, Y.S.: An architectural co-synthesis algorithm for energy-aware network-on-chip design. J. Syst. Archit. 55(5–6), 299–309 (2009)CrossRef
    3.Chou, C.L., Ogras, U., Marculescu, R.: Energy- and performance-aware incremental mapping for networks on chip with multiple voltage levels. IEEE Trans. Comput.-Aided Des. Integr. Circ. Syst. 27(10), 1866–1879 (2008)CrossRef
    4.Dally, W., Towles, B.: Principles Practices Interconnection Netw. Morgan Kaufmann Publishers Inc., San Francisco (2003)
    5.Demaine, E.D., Fekete, S.P., Rote, G., Schweer, N., Schymura, D., Zelke, M.: Integer point sets minimizing average pairwise distance: what is the optimal shape of a town? Comput. Geom. 44(2), 82–94 (2011). Special issue of selected papers from the 21st Annual Canadian Conference on Computational GeometryMathSciNet CrossRef MATH
    6.Fattah, M., Rahmani, A.M., Xu, T., Kanduri, A., Liljeberg, P., Plosila, J., Tenhunen, H.: Mixed-criticality run-time task mapping for noc-based many-core systems. In: 2014 22nd Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), pp. 458–465, February 2014
    7.Fleig, T., Mattes, O., Karl, W.: Evaluation of adaptive memory management techniques on the tilera tile-gx platform. In: 2014 27th International Conference on Architecture of Computing Systems (ARCS), pp. 1–8, February 2014
    8.Ghosh, A., Paul, S., Bhunia, S.: Energy-efficient application mapping in FPGA through computation in embedded memory blocks. In: 2012 25th International Conference on VLSI Design (VLSID), pp. 424–429, January 2012
    9.Hu, J., Marculescu, R.: Energy-aware communication and task scheduling for network-on-chip architectures under real-time constraints. In: Proceedings of the Conference on Design, Automation and Test in Europe, DATE 2004, vol. 1, p. 10234. IEEE Computer Society, Washington, DC (2004)
    10.Hyde, R.: The Art of Assembly Language, 2nd edn. No Starch Press, San Francisco (2010)
    11.LaCouvee, D.: Fact or fiction: Android apps only use one CPU core, December 2015. http://​www.​androidauthority​.​com/​fact-or-fiction-android-apps-only-use-one-cpu-core-610352/​
    12.Lei, T., Kumar, S.: A two-step genetic algorithm for mapping task graphs to a network on chip architecture. In: 2003 Proceedings of Euromicro Symposium on Digital System Design, pp. 180–187 (2003)
    13.Leung, V.J., Sabin, G., Sadayappan, P.: Parallel job scheduling policies to improve fairness: a case study. In: 39th International Conference on Parallel Processing, ICPP. Workshops 2010, San Diego, California, USA, 13–16 September, pp. 346–353 (2010)
    14.Leutenegger, S.T., Vernon, M.K.: The performance of multiprogrammed multiprocessor scheduling algorithms. SIGMETRICS Perform. Eval. Rev. 18(1), 226–236 (1990)CrossRef
    15.Magnusson, P., Christensson, M., Eskilson, J., Forsgren, D., Hallberg, G., Hogberg, J., Larsson, F., Moestedt, A., Werner, B.: Simics: a full system simulation platform. Computer 35(2), 50–58 (2002)CrossRef
    16.Martin, M.M., Sorin, D.J., Beckmann, B.M., Marty, M.R., Xu, M., Alameldeen, A.R., Moore, K.E., Hill, M.D., Wood, D.A.: Multifacet’s general execution-driven multiprocessor simulator (gems) toolset. Computer Architecture News, September 2005
    17.Mediatek: Helio x20, December 2015. http://​mediatek-helio.​com/​x20/​
    18.de Souza Carvalho, E., Calazans, N., Moraes, F.: Dynamic task mapping for MPSoCS. IEEE Des. Test Comput. 27(5), 26–35 (2010)CrossRef
    19.TGG: Task graph generator, July 2014. http://​taskgraphgen.​sourceforge.​net/​
    20.Woo, S.C., Ohara, M., Torrie, E., Singh, J.P., Gupta, A.: The splash-2 programs: characterization and methodological considerations. In: Proceedings of the 22nd International Symposium on Computer Architecture, pp. 24–36, June 1995
    21.Xu, T., Toivonen, J., Pahikkala, T., Leppanen, V.: BDMap: a heuristic application mapping algorithm for the big data era. In: 2014 IEEE 11th International Conference on Ubiquitous Intelligence and Computing and IEEE 11th International Conference on Autonomic and Trusted Computing, and IEEE 14th International Conference on Scalable Computing and Communications and Its Associated Workshops (UTC-ATC-ScalCom), pp. 821–828, December 2014
    22.Xu, T.C., Leppänen, V.: DBFS: dual best-first search mapping algorithm for shared-cache multicore processors. In: Wang, G., Zomaya, A., Martinez Perez, G., Kenli, L. (eds.) ICA3PP 2015. LNCS, vol. 9528, pp. 185–198. Springer, Heidelberg (2015). doi:10.​1007/​978-3-319-27119-4_​13 CrossRef
    23.Xu, T.C., Liljeberg, P., Plosila, J., Tenhunen, H.: Exploration of heuristic scheduling algorithms for 3D multicore processors. In: Proceedings of the 15th International Workshop on Software and Compilers for Embedded Systems, SCOPES 2012, pp. 22–31. ACM, New York (2012)
    24.Xu, T.C., Leppänen, V.: Cache- and communication-aware application mapping for shared-cache multicore processors. In: Pinho, L.M.P., Karl, W., Cohen, A., Brinkschulte, U. (eds.) ARCS 2015. LNCS, vol. 9017, pp. 55–67. Springer, Heidelberg (2015)
    25.Xu, T.C., Liljeberg, P., Tenhunen, H.: A minimal average accessing time scheduler for multicore processors. In: Xiang, Y., Cuzzocrea, A., Hobbs, M., Zhou, W. (eds.) ICA3PP 2011, Part II. LNCS, vol. 7017, pp. 287–299. Springer, Heidelberg (2011)CrossRef
  • 作者单位:Thomas Canhao Xu (22)
    Ville Leppänen (22)

    22. Department of Information Technology, University of Turku, 20014, Turku, Finland
  • 丛书名:Internet and Distributed Computing Systems
  • ISBN:978-3-319-45940-0
  • 刊物类别:Computer Science
  • 刊物主题:Artificial Intelligence and Robotics
    Computer Communication Networks
    Software Engineering
    Data Encryption
    Database Management
    Computation by Abstract Devices
    Algorithm Analysis and Problem Complexity
  • 出版者:Springer Berlin / Heidelberg
  • ISSN:1611-3349
  • 卷排序:9864
文摘
In this paper, we propose and investigate a dynamic heuristic mapping algorithm with lookup table optimizations. Distributed and parallel computing are trends due to the performance requirement of modern applications. Application mapping in a multiprocessor system is therefore critical due to the dynamic and unpredictable nature of the applications. We analyse the communication delay among different tasks in an application. A fundamental algorithm is analysed to optimize the average delay of the mapping region. We discuss and evaluate the effectiveness of the algorithm in terms of average intra-application latency. Results from synthetic applications revealed that average latencies from the mapping regions of the fundamental algorithm have reduced up to 23 % compared with the incremental mapping. By noticing the time overhead of the algorithm due to extra number of search spaces, we introduce a mechanism with lookup tables to speed up the process of searching optimized mapping regions. The lookup table is examined with both size and construction time. Experiments shown that the lookup table is small enough to fit into the cache, and the table can be constructed in milliseconds in most practical cases. The results from real applications show that the average execution time of applications of the proposed algorithm has reduced by 15.2 % compared with the first fit algorithm.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700