嵌入式处理器编译器关键技术研究
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
嵌入式系统通常对性能、实时性、功耗等有着严格的要求,需要非常高效的机器代码。因此,嵌入式软件开发常采用汇编语言。但汇编语言编程费时、调试困难,而且代码难以移植。嵌入式系统的广泛应用和嵌入式软件规模的不断扩大,决定了用高级语言代替汇编语言进行嵌入式软件开发是必然趋势。但使用高级语言编译器生成代码的质量和手工编写的汇编代码相比还有很大差距,不能满足嵌入式系统的要求。因此,嵌入式处理器编译技术研究有着非常重要的现实意义。
     传统编译技术通常不能直接适用于嵌入式领域,需要进行扩展或者设计全新的算法。传统的编译技术一般基于规则的机器模型,并采用简单启发式方法以达到较快的编译速度。嵌入式处理器一般具有不规则的体系结构特征,且在嵌入式环境中,编译器生成高质量代码的重要性高于编译时间。
     本文主要研究了嵌入式处理器编译技术的三个问题:嵌入式处理器寄存器分配、多媒体扩展指令集的代码生成、双指令集处理器多目标代码选择。
     在传统图着色寄存器分配方法中,假设机器模型具有简单的寄存器文件。这些算法采用“degree<k”测试冲突图结点是否局部可着色,并通过特殊的技巧对此扩展,以适用于不规则的寄存器文件特征。经过多年研究,此问题取得了很大进展。但是,传统图着色寄存器分配算法在进行冲突图结点局部可着色测试时,无法知道邻居结点所分配的颜色,只能采取保守的估算,因而降低了编译生成的代码质量。这是它们难以克服的困难。
     传统树模式匹配和动态规划技术不能有效利用多媒体处理器的指令集并行,基于传统并行编译技术的扩展一般局限于循环级并行,且没有全局优化标量指令和SIMD指令的使用。
     传统编译技术一般仅优化单一的目标,例如性能、功耗。嵌入式系统常需同时优化多个目标。单一目标的优化对其它目标有何影响,如何在多个目标间权衡,需要进一步研究。
     嵌入式系统的复杂性决定了需要采用新的方法学来研究嵌入式编译技术。从元启发式算法以及编译器前后端有机结合这两个角度来研究嵌入式编译技术是本文工作的重要思想。
     近二十年来,元启发式(Meta-heuristics)算法得到广泛研究。这些算法来自于人们从自然界得到的启发,通过模拟物理过程、生物进化等自然现象,能够克服经典优化算法的局限性,系统地搜索问题的解空间,较好地解决多种优化问题。编译技术中同样存在各种优化问题,尤其在嵌入式环境中,这些问题更加复杂,常使得传统编译技术不能简单地适用。元启发式算法为求解复杂的编译优化问题提供了广阔空间。
     程序分析是编译优化的基础。可以把编译器前端程序分析的优势和编译器后端的机器相关信息结合起来,以适应复杂的嵌入式环境下编译技术研究的需要。
     本文主要研究成果和创新包括以下几个方面:
     1.针对嵌入式处理器不规则寄存器文件体系结构特征,本文提出了一种新的图着色寄存器分配混合演化算法HMA-GRA,突破了传统图着色寄存器分配算法在寄存器冲突图结点局部可着色测试问题上难以克服的障碍,为实现嵌入式处理器寄存器分配器提供了新途径。
     本文的HMA-GRA算法主要由遗传算法和禁忌搜索算法构成,利用Chaitin算法作为前端对冲突图进行预处理,对剩余冲突图中各结点按照其类型随机分配颜色,计算结点间的冲突数量,通过演化过程逐渐减少图的总体冲突,最终实现图成功着色。此算法不仅可以精确计算冲突图结点是否局部可着色,而且能够用规范的寄存器分配模型代替传统算法中的特殊技巧,以适应嵌入式处理器寄存器文件不规则特征带来的复杂性。
     2.面向多媒体指令集,本文提出了一种新的SIMD代码生成算法ICG-ME。ICG-ME算法扩展了传统的树模式匹配和动态规划技术,通过修改代码生成器的产生器iburg工具为数据流树结点的每个目标非终结符保留多个匹配规则,以识别构造SIMD指令的并行操作;在编译器前端对存储操作相对于向量寄存器的对齐状况进行数据流分析,并把分析结果传递给编译器后端,以辅助构造SIMD和数据重组指令;最终生成多媒体向量指令和非多媒体标量指令混合的汇编代码。通过结合编译器前端程序分析的优势和编译器后端的机器信息,ICG-ME算法不仅简化了SIMD代码的生成,而且能够同时利用SIMD指令和非SIMD指令,以生成高质量的机器代码。
     3.以ARM/Thumb双指令集处理器为例,本文提出了用元启发式算法求解编译技术中多目标优化问题的方法学。
     本文把双指令集处理器代码选择问题表示为权衡程序代码大小和运行时间的多目标优化问题,利用程序profiling技术为其建立数学模型,通过一种多目标蚁群算法结合子集选择的方法求解此问题。
     4.基于SUIF/MachineSUIF编译器研究框架,我们进一步完善了代码选择、寄存器分配和程序分析的实验环境。
     我们实现了多种传统的图着色寄存器分配算法和元启发式图着色寄存器分配算法HMA-GRA,实现了多媒体代码生成算法ICG-ME和部分程序分析模块,实现了针对多目标代码生成的蚁群优化算法MOARM-ANT-SS。
     嵌入式处理器编译技术需要扩展传统的编译技术或者设计全新的算法。本文分析了传统编译技术在嵌入式环境中遇到的关键问题,并以元启发式算法和编译器前后端有机结合的方为基础,研究了嵌入式编译技术的三个问题:嵌入式处理器寄存器分配、多媒体扩展指令集的代码生成、双指令集处理器多目标代码选择。实验结果表明了这些方法的有效性。本文把元启发式算法用于嵌入式编译技术的研究与实现尚处于初步阶段。元启发式算法从新的角度为研究嵌入式编译技术提供了广阔空间。
To meet various requirements like performance, power, and real-time response to external events, embedded systems need highly efficient machine code. Thereby assembly language is always widely used in embedded programming. However, time-consuming programming, vexing debugging, and poor code portability of assembly programming are not neglectable. With the extension of application area of embedded systems and the ever-growing of the size of embedded software, assembly language will be inevitably replaced by high level language in embedded software development. But compared to manually written assembly code, the code generated by high level language compilers is often much worse and cannot meet the requirements. So it is very important to study the compiler techniques for embedded systems.
     Traditional compiler techniques are usually not applicable to the embedded domain directly, which need to be extended. Moreover, some completely distinct algorithms are expected sometimes. The reason is that traditional compiler techniques are usually based on regular machine models, and apply simple heuristics to get results rapidly; however, embedded processors often have irregular architectural characteristics. Additionally, in the embedded domain, the importance of high quality code outweighs that of short compilation time.
     This thesis mainly focuses on three problems of compiler techniques for embedded systems: register allocation for embedded processors, code generation for multimedia instruction set extension, and code selection for dual instruction set processors.
     Traditional graph-coloring register allocation algorithms are based on over-simplified register files of machine models, which apply the rule of "degree < k" to test the local colorability of interference graphs and extend this rule by ad hoc skills to irregular register file characteristics. Significant progress has been made on this problem through years of efforts of compiler researchers. But when testing the local colorability of interference graphs, traditional graph-coloring register allocation algorithms cannot get the colors of its neighbor nodes, consequently, they make conservative estimations of them.
     Since traditional tree pattern matching and dynamic programming cannot exploit efficiently instruction-level parallelism, the extension based on traditional parallel compiler techniques mainly is limited in loop-level parallelism, which cannot exploit scalar instructions and SIMD vector instructions optimally.
     Traditional compiler techniques usually improve code quality for a single objective, such as performance or power. However, embedded systems usually need to be optimized for multiple objectives. It is necessary to research how a compiler transformation for one objective influences the other objectives and how to make trade-offs among different objectives.
     The complexity of embedded systems makes it necessary to apply new methods to research compiler techniques for them. The core idea of this thesis is improving compiler techniques with the help of meta-heuristics and the integration of compiler front-end and back-end.
     In the last twenty years, there has been extensive research on meta-heuristic algorithms. Meta-heuristics, inspired by nature, can overcome the limitations of classic optimization algorithms and can search the solution space systematically to solve many optimization problems by simulating natural phenomena such as physical processes and evolutions. There are a variety of optimization problems in the compiler domain, which are more complex in the embedded domain. They bring difficulties to the application of standard compiler techniques. Meta-heuristics open up a broad way to solve complex compiler optimization problems.
     Program analysis is the foundation of compiler optimizations. To meet the requirements of compiler techniques for embedded systems, the advantages of compiler front-end program analysis and the machine-specific information of compiler back-end can be combined.
     The main contributions of the thesis are as follows.
     1. For irregular architectural characteristics of embedded processors, a new graph-coloring register allocation algorithm HMA-GRA is proposed, which is based on a hybrid evolutionary method. The algorithm provides a new direction for implementing register allocators while breaking through the difficulties in computing local colorability of interference graphs, which cannot be solved by traditional graph coloring register allocation algorithms based on simple heuristics.
     HMA-GRA algorithm combines a genetic algorithm with a tabu search subroutine. It uses Chaitin algorithm to pre-process the interference graph, assigns colors to nodes of the remaining interference graph randomly according to their classes, computes the number of conflicts, and decreases this number gradually until the graph can be colored successfully. This method can not only compute the local colorability of interference graphs accurately, but also replace the ad hoc skills of traditional methods by a regular register allocation model. It can thus deal with complex irregular characteristics of embedded processor register files.
     2. For multimedia instruction set, a new algorithm ICG-ME for the generation of SIMD code is proposed.
     ICG-ME algorithm extends standard tree pattern matching and dynamic programming techniques, modifies the iburg code generator generator in such a way that its generated code generators allow to retain multiple matching rules for each target nonterminal of a node of data flow trees in order to identify parallel operations of SIMD instructions. The data flow analysis that is to identify whether memory references are aligned relative to vector registers is performed in the compiler front-end. Then the analysis results are transfered to the compiler back-end to assist in constructing SIMD and permutation instructions. Finally, assembly code of multimedia vector instructions and non-multimedia scalar instructions is generated. Through the integration of compiler front-end analysis and the machine-specific information of compiler back-end, ICG-ME algorithm can not only simplify the generation of SIMD code, but also use both SIMD and non-SIMD instructions to improve code quality.
     3. In order to meet multi-objective requirements of embedded systems, a new methodology is proposed to deal with multi-objective optimization problems in the compiler domain using meta-heuristics. And the problem of code selection for ARM/Thumb processors with dual instruction sets is used to exemplify this idea.
     This thesis formulates the problem of code selection for processors with dual instruction sets as a multi-objective optimization problem that needs trade-off between the code size and the code execution time, creates a mathematical model for it using program profiling technique, and solves it with a basic multi-objective ant colony optimization algorithm MOARM-ANT-SS together with subset selection method.
     4. Based on the SUIF/MachineSUIF compiler research framework, we have extended the experimental environment for code selection, register allocation, and program analysis.
     Several traditional graph-coloring register allocation algorithms and a meta-heuristic graph-coloring register allocation algorithm HMA-GRA have been implemented. A code generation algorithm ICG-ME for multimedia processors and several compiler passes for program analysis have been implemented as well. Furthermore, an ACO algorithm MOARM-ANT-SS for multi-objective code generation is implemented.
     It is necessary to extend traditional compiler techniques or to design new ones in the embedded domain. This thesis analyzes some key problems of traditional compiler techniques in the embedded domain. Based on meta-heuristics and the integration of the compiler front-end and back-end, it mainly focuses on three problems of embedded systems: register allocation for embedded processors, code generation for multimedia instruction set extension, and code selection for dual instruction set processors. Experimental results prove the efficiency of our methods. The research and implementation in this thesis on applying meta-heuristics to compiler techniques for embedded systems are still elementary. From a new direction, meta-heuristics open up a broad way to research compiler techniques in the embedded domain.
引文
[1]Larsen S.Compilation Techniques for Short-Vector Instructions.Ph.D.thesis,Massachusetts Institute of Technology,2000
    [2]Intel Architecture Software Developer's Manual,Volume 2:Instruction Set Reference,1999
    [3]Appel A W and Palsberg J.Modern Compiler Implementation in Java.Cambridge University Press,second edition,2002
    [4]Leupers R and Marwedel P.Retargetable Compiler Technology for Embedded Systems,Tools and Applications.Kluwer Academic Publishers,2001
    [5]Chaitin G J.Register allocation and spilling via graph coloring.McKinley K S,editor,Best of PLDI,20 Years of the ACM SIGPLAN Conference on Programming Language Design and Implementation 1979-1999,A Selection,66-74.ACM,2004
    [6]Kirkpatrick S,Gelatt C D and Vecchi M P.Optimization by simulated annealing.Science,220(4598):671-680,1983
    [7]Deb K.Multi-objective Optimization using Evolutionary Algorithms.England:John Wiley&Sons,2001
    [8]Aho A V,Sethi R and Ullman J D.Compilers Principles,Techniques,and Tools.Pearson Education,1986
    [9]Muchnick S S.Advanced Compiler Design and Implementation.San Francisco,California:Morgan Kaufmann,1997
    [10]Allen R and Kennedy K.Optimizing Compilers for Modern Architectures,A Dependence-Based Approach.Morgan Kaufmann,2002
    [11]Bacon D F,Graham S L and Sharp O J.Compiler transformations for high-performance computing.ACM Computing Surveys,26(4):345-420,1994
    [12]http://www.cecs.uci.edu/.Center for Embedded Computer Systems(CECS),University of California.
    [13]http://ls12-www.informatik.uni-dortmund.de/.Department of Computer Science,Uni-versity of Dortmund.
    [14]http://www.scopesconf.org/.Software and Compilers for Embedded Systems
    [15]http://www.date-conference.com/.Design Automation and Test in Europe.
    [16]Zhuang X,Lau C and Pande S.Storage assignment optimizations through variable coalescence for embedded processors.Proceedings of the 2003 Conference on Languages,Compilers,and Tools for Embedded Systems(LCTES'03).San Diego,California,USA.ACM
    [17]Bartlay D H.Optimizing stack frame accesses for processors with restricted addressing modes.Software - Practice and Experience,22(2):101-110,1992
    [18]Liao S Y,Devadas S,Keutzer K,Tjiang S W K and Wang A.Storage assignment to decrease code size.ACM Transactions on Programming Languages and Systems,18(3):235-253,1996
    [19]Leupers R and Marwedel P.Algorithms for address assignment in DSP code generation.International Conference on Computer-Aided Design(ICCAD '96),San Jose,CA,USA.,109-112.1996
    [20]Zhuang X and Pande S.Parallelizing load/stores on dual memory bank embedded processors.ACM Transactions on Embedded Computing Systems(TECS).Accepted
    [21]Zhuang X and Pande S.Effective thread management on network processors with com-piler analysis.Irwin M J and Bosschere K D,editors,Proceedings of the 2006 ACM SIG-PLAN/SIGBED Conference on Languages,Compilers,and Tools for Embedded Systems (LCTES'06),Ottawa,Ontario,Canada,72-82.ACM,2006
    [22]Kim J,Jung S,Paek Y and Uh G R.Experience with a retargetable compiler for a commercial network processor.Bhattacharyya S S,Mudge T N,Wolf W and Jerraya A A,editors,Proceedings of the International Conference on Compilers,Architectures and Synthesis for Embedded Systems,CASES 2002,Greenoble,France,178-187.ACM,2002
    [23]George L and Blume M.Taming the ⅨP network processor.Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation 2003,San Diego,California,USA.ACM
    [24]Ren G,Wu P and Padua D A.Optimizing data permutations for SIMD devices.Schwartzbach MI and Ball T,editors,Proceedings of the ACM SIGPLAN 2006 Conference on Programming Language Design and Implementation,Ottawa,Ontario,Canada,118-131.ACM,2006
    [25]Nuzman D,Rosen Ⅰ and Zaks A.Auto-vectorization of interleaved data for SIMD.Schwartzbach M I and Ball T,editors,Proceedings of the ACM SIGPLAN 2006 Conference on Programming Language Design and Implementation,Ottawa,Ontario,Canada,132-143.ACM,2006
    [26]Fireman L.The complexity of SIMD Alignment.Master's thesis,Israel Institute of Technology,2006
    [27]Zhuang X and Pande S.Power efficient prefetching for embedded processors.ACM Transactions on Embedded Computing Systems(TECS).Special Issue on Best Papers from LCTES-04,accepted
    [28]Lin T Y and Chang R G.Power-aware instruction scheduling.Sha E H M,Han S K,Xu C Z,Kim M H,Yang L T and Xiao B,editors,Proceedings of the Embedded and Ubiquitous Computing,International Conference,EUC 2006,Seoul,Korea,volume 4096 of Lecture Notes in Computer Science,35-44.Springer,2006
    [29]Valluri M G,John L K and McKinley K S.Low-power,low-complexity instruction issue using compiler assistance.Arvind and Rudolph L,editors,Proceedings of the 19th Annual International Conference on Supercomputing,ICS 2005,Cambridge,Massachusetts,USA,209-218.ACM,2005
    [30]Blume H,Becker D,Botteck M,Brakensiek J and Noll T G.Hybrid functional and instruction level power modeling for embedded processors.Vassiliadis S,Wong S and H(a∣¨)m(a∣¨)l(a∣¨)nen T,editors,Proceedings of the Embedded Computer Systems:Architectures,Modeling,and Simulation,6th International Workshop,SAMOS 2006,Samos,Greece,volume 4017 of Lecture Notes in Computer Science,216-226.Springer,2006
    [31]Keβler C W and Bednarski A.Optimal integrated code generation for VLIW architectures.Concurrency and Computation:Practice and Experience,18(11):1353-1390,2006
    [32]Keβler C W and Bednarski A.A dynamic programming approach to optimal integrated code generation.Proceedings of The Workshop on Languages,Compilers,and Tools for Embedded Systems(LCTES 2001),/ The Workshop on Optimization of Middleware and Distributed Systems(OM 2001),Snowbird,Utah,USA,165-174.ACM,2001
    [33]Lorenz M and Marwedel P.Phase coupled code generation for DSPs using a genetic algorithm.2004 Design,Automation and Test in Europe Conference and Exposition (DATE 2004),Paris,France,1270-1275.IEEE Computer Society,2004
    [34]Bashford S and Leupers R.Constraint driven code selection for fixed-point DSPs.Proceedings of the 36th Conference on Design Automation,New Orleans,LA,USA,817-822.1999
    [35]Bednarski A and Kessler C.Optimal integrated VLIW code generation with integer linear programming.Nagel W E,Walter W V and Lehner W,editors,Proceedings of the Euro-Par 2006,Parallel Processing,12th International Euro-Par Conference,Dresden,Germany,volume 4128 of Lecture Notes in Computer Science,461-472.Springer,2006
    [36]Louden K C.Compiler Construction Principles and Practice.China:机械工业出版社,2002
    [37]陈火旺,刘春林,谈庆平,赵克佳,刘越.程序设计语言编译原理.Third edition,2000
    [38]陈火旺,钱家骅,孙永强.程序设计语言编译原理.国防工业出版社,1984
    [39]Karp R M.Reducibility among combinatorial problems.Complexity of Computer Computations,85-103,1972
    [40]Garey M R,Johnson D S and Stockmeyer L J.Some simplified NP-complete graph problems.Theoretical Computer Science,,1:237-267,1976
    [41]Garey M R and Johnson D S.The complexity of near-optimal graph coloring.Journal of the ACM,23(1):43-49,1976
    [42]Fuller S.Motorola's AltiVec Technology.Austin,Texas,1998
    [43]Oberman S,Favor G and Weber F.AMD 3DNow! technology:Architecture and implementations.IEEE Micro,19(2):37-48,1999
    [44]Lee R and McMahan L.Mapping of application software to the multimedia instructions of general purpose microprocessors.IS(?)T/SPIE Symp.on Electric Imaging:Science and Technology.San Jose,California,1997
    [45]Kohn L,Maturana G,Tremblay M,Prabhu A and Zyner G.The visual instruction set (VIS) in UltraSPARC.Proc.of Compcon'95.San Jose,California,1995
    [46]Carlson D A,W.Castelino R and O.Mueller R.Multimedia extensions for a 550-mhz risc microprocessor.IEEE Journal of Solid-State Circuits,32(11):1618-1624,1997
    [47]MIPS Extension for Digital Media with 3D,1997
    [48]Arm architecture reference manual.Technical Report ARM_DDI_0100E,ARM Limited.,1996
    [49]http://www-suif.stanford.edu/suif/NCI/suif.html.SUIF compiler research framework
    [50]http://www.eecs.harvard.edu/hube/research/machsuif.html.MachineSUIF compiler research framework
    [51]Fraser C W,Hanson D R and Proebsting T A.Engineering a simple,efficient codegenerator generator.LOPLAS,1(3):213-226,1992
    [52]Lavrov S S.Store economy in closed operator schemes.Journal of Computational Mathematics and Mathematical Physics,1(4):687-701,1961
    [53]Chaitin G J,Auslander M A,Chandra A K,Cocke J,Hopkins M E and Markstein P W.Register allocation via coloring.Computer Languages,6(1):47-57,1981
    [54]Briggs P.Register Allocation via Graph Coloring.Ph.D.thesis,Rice University,1992
    [55]Briggs P,Cooper K D and Torczon L.Rematerialization.Proceedings of the ACM SIG-PLAN Conference on Programming Language Design and Implementation,San Francisco,California,311-321.1992
    [56]Briggs P,Cooper K D and Torczon L.Improvements to graph Coloring register allocation.ACM Trans.Program.Lang.Syst.,16(3):428-455,1994
    [57]Briggs P,Cooper K D and Torczon L.Coloring register pairs.ACM Letters on Programming Languages and Systems,1(1):3-13,1992
    [58]George L and Appel A W.Iterated register coalescing.ACM Transactions on Programming Languages and Systems,18(3):300-324,1996
    [59]Callahan D and Koblenz B.Register allocation via hierarchical graph coloring.ACM SIGPLAN Notices,26(6):192-203,1991
    [60]Norris C and Pollock L L.Register allocation over the program dependence graph.ACM SIGPLAN Notices,29(6):266-277,1994
    [61]Bernstein D,Golumbic M,Mansour Y,Pinter R,Goldin D,Krawczyk H and Nahshon Ⅰ.Spill code minimization techniques for optimizing compliers.Proceedings of the ACM SIGPLAN Conference on Programming language design and implementation,258-263.1989
    [62]Bergner P,Dahl P,Engebretsen D and O'Keefe M T.Spill code minimization via interference region spilling.Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation,287-295.1997
    [63]Cooper K D and Simpson L T.Live range splitting in a graph coloring register allocator.Koskimies K,editor,Compiler Construction,7th International Conference,Held as Part of the European Joint Conferences on the Theory and Practice of Software,ETAPS'98,Lisbon,Portugal,volume 1383 of Lecture Notes in Computer Science,174-187.Springer,1998
    [64]Park J and Moon S M.Optimistic register coalescing.ACM Transactions on Programming Languages and Systems,26(4):735-765,2004
    [65]Gupta R,Soffa M L and Steele T.Register allocation via clique separators.ACM SIGPLAN Notices,24(7):264-274,1989
    [66]Choi J D,Cytron R and Ferrante J.Automatic construction of sparse data flow evaluation graphs.Conference Record of the Eighteenth Annual ACM Symposium on Principles of Programming Languages,Orlando,Florida,55-66.1991
    [67]Smith M D and Holloway G.Graph-coloring register allocation for irregular architectures.Technical report,Harvard University,2000
    [68]Nickerson B R.Graph coloring register allocation for processors with multi-register operands.ACM SIGPLAN Notices,25(6):40-52,1990
    [69]Runeson J and Nystr(o∣¨)m S O.Retargetable graph-coloring register allocation for irregular architectures.Krall A,editor,Proceedings of Software and Compilers for Embedded Systems,7th International Workshop,SCOPES 2003,Vienna,Austria,240-254.2003
    [70]Smith M D,Ramsey N and Holloway G H.A generalized algorithm for graph-coloring register allocation.Pugh W and Chambers C,editors,Proceedings of the ACM SIG-PLAN Conference on Programming Language Design and Implementation,Washington,DC,USA,277-288.2004
    [71]Goodwin D W and Wilken K D.Optimal and near-optimal global register allocations using 0-1 integer programming.Software-Practice(?) Experience,26(8):929-965,1996
    [72]Kong T and Wilken K D.Precise register allocation for irregular architectures.31st ACM/IEEE International Symposium on Microarchitecture,297-307.1998
    [73]Fu C and D.Wilken K.A faster optimal register allocator.35th ACM/IEEE International Symposium on Microarchitecture,245-256.2002
    [74]Scholz B and Eckstein E.Register allocation for irregular architectures.Proceedings of the 2002 Joint Conference on Languages,Compilers,and Tools for Embedded Systems (?) Software and Compilers for Embedded Systems(LCTES'02-SCOPES'02),Berlin,Germany,139-148.ACM,2002
    [75]Koes D and Goldstein S C.A progressive register allocator for irregular architectures.3nd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2005),San Jose,CA,USA,269-280.IEEE Computer Society,2005
    [76]Sreraman N and Govindarajan R.A vectorizing compiler for multimedia extensions.International Journal of Parallel Programming,28(4):363-400,2000
    [77]DeVries D J.A Vectorizing SUIF Compiler:Implementation and Performance.Master's thesis,University of Toronto,1997
    [78]Cheong G and Lam M.An optimizer for multimedia instruction sets.Technical report,University of University,1997
    [79]Aho A V,Ganapathi M and Tjiang S W K.Code generation using tree matching and dynamic programming.ACM Transactions on Programming Languages and Systems,11(4):491-516,1989
    [80]Fraser C W,Henry R R and Proebsting T A.Burg:fast optimal instruction selection and tree parsing.SIGPLAN Notices,27(4):68-76,1992
    [81]Tjiang S.An olive twig.Technical report,Synopsys Inc.,1993
    [82]Leupers R and Bashford S.Graph-based code selection techniques for embedded processors.ACM Transactions on Design Automation of Electronic Systems,5(4):794-814,2000
    [83]Kudriavtsev A and Kogge P M.Generation of permutations for SIMD processors.Pack Y and Gupta R,editors,Proceedings of the 2005 ACM SIGPLAN/SIGBED Conference on Languages,Compilers,and Tools for Embedded Systems(LCTES'05),Chicago,Illinois,USA,147-156.ACM,2005
    [84]Larsen S,Rabbah R M and Amarasinghe S P.Exploiting vector parallelism in software pipelined loops.38th Annual IEEE/ACM International Symposium on Microarchitectare (MICRO-38 2005),Barcelona,Spain,119-129.IEEE Computer Society,2005
    [85]Rugina R and Rinard M C.Pointer analysis for structured parallel programs.ACM Transactions on Programming Languages and Systems,25(1):70-116,2003
    [86]Larsen S,Witchel E and Amarasinghe S P.Increasing and detecting memory address congruence.2002 International Conference on Parallel Architectures and Compilation Techniques(PACT 2002),Charlottesville,VA,USA,18-29.IEEE Computer Society,2002
    [87]Shin J,Hall M W and Chame J.Superword-level parallelism in the presence of control flow.3nd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2005),San Jose,CA,USA,165-175.IEEE Computer Society,2005
    [88]Park J and Schlansker M.On predicated execution.Technical Report HPL-91-58,Software and Systems Laboratory,1991
    [89]Kernighan B and Lin S.An efficient heuristic procedure for partitioning graphs.Bell System Technical Journal,49:291-307,1970
    [90]Fiduccia C and Mattheyses R.A linear-time heuristic for improving network partitions.Proceedings of the 19th Conference on Design Automation,175-181.1982
    [91]Talla D,John L K and Burger D.Bottlenecks in multimedia processing with SIMD style extensions and architectural enhancements.IEEE Transactions on Computers,52(8):1015-1031,2003
    [92]Eichenberger A E,Wu P and O'Brien K.Vectorization for SIMD architectures with alignment constraints.Pugh W and Chambers C,editors,Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation,Washington,DC,USA,82-93.ACM,2004
    [93]Wu P,Eichenberger A E and Wang A.Efficient SIMD code generation for runtime alignment and length conversion.3nd IEEE / ACM International Symposium on Code Generation and Optimization(CGO 2005),San Jose,CA,USA,153-164.IEEE Computer Society,2005
    [94]Shin J,Chame J and Hall M W.Exploiting superword-level locality in multimedia extension architectures.Journal of Instruction-Level Parallelism,5,2003
    [95]Boekhold M,Karkowski Ⅰ and Corporaal H.Transformatiing and parallelizing ANSI c programs using pattern recognition.Sloot P M A,Bubak M,Hoekstra A G and Hertzberger L O,editors,Proceedings of the High-Performance Computing and Networking,7th International Conference,HPCN Europe 1999,Amsterdam,The Netherlands,volume 1593 of Lecture Notes in Computer Science,673-682.Springer,1999
    [96]Manniesing R,Karkowski Ⅰ and Corporaal H.Automatic SIMD parallelization of embedded applications based on pattern recognition.Bode A,0002 T L,Karl W and Wism(u∣¨)ller R,editors,Proceedings of the Euro-Par 2000,Parallel Processing,6th International Euro-Par Conference,Munich,Germany,volume 1900 of Lecture Notes in Computer Science,349-356.Springer,2000
    [97]Bik A J C,Girkar M,Grey P M and Tian X.Automatic detection of saturation and clipping idioms.Pugh W and Tseng C W,editors,Languages and Compilers for Parallel Computing,15th Workshop,LCPC 2002,College Park,MD,USA,volume 2481 of Lecture Notes in Computer Science,61-74.Springer,2002
    [98]Jiang W,Mei C,Huang B,Li J,Zhu J,Zang B and Zhu C.Boosting the performance of multimedia applications using SIMD instructions.Bodik R,editor,Compiler Construction,14th International Conference,CC 2005,Held as Part of the Joint European Conferences on Theory and Practice of Software,ETAPS 2005,Edinburgh,UK,volume 3443 of Lecture Notes in Computer Science,59-75.Springer,2005
    [99]Deeds M W.Design and Implementation of a PowerPC and AltiVec Backend with Optimizations.Master's thesis,Massachusetts Institute of Technology,2001
    [100]Maze D Z.A Flexible Compilation Infrastructure for VLIW and SIMD Architectures.Master's thesis,Massachusetts Institute of Technology,2001
    [101]Halambi A,Shrivastava A,Biswas P,Dutt N D and Nicolau A.An efficient compiler technique for code size reduction using reduced bit-width ISAs.2002 Design,Automation and Test in Europe Conference and Exposition(DATE 2002),Paris,France,402-408.IEEE Computer Society,2002
    [102]Krishnaswamy A and Gupta R.Profile guided selection of arm and thumb instruc-tions.Proceedings of the 2002 Joint Conference on Languages,Compilers,and Tools for Embedded Systems(?) Software and Compilers for Embedded Systems(LCTES'02-SCOPES'02),Berlin,Germany,56-64.ACM,2002
    [103]Lee S,Lee J,Min S L,Hiser J and Davidson J W.Code generation for a dual instruction set processor based on selective code transformation.Krall A,editor,Proceedings of the Software and Compilers for Embedded Systems,7th International Workshop,SCOPES 2003,Vienna,Austria,volume 2826 of Lecture Notes in Computer Science,33-48.Springer,2003
    [104]Leupers R.Lance:A c compiler platform for embedded processors.Embedded Systems/Embedded Intelligence,2001
    [105]Bringmann R A.Enhancing Instruction Level Parallelism Through Compiler-Controlled Speculation.Ph.D.thesis,University of Illinois,1995
    [106]http://www.trimaran.org/.Trimaran compiler research framework
    [107]赵克佳,杨灿群,曾丽芳,罗红兵.多语种多平台编译系统剖析,GNU CC的解剖与移植.1999
    [108]http://www.cs.virginia.edu/zephyr/.Zephyr compiler research framework
    [109]http://www.cs.virginia.edu/nci/.National Compiler Infrastructure
    [110]Briggs P,Cooper K,Harvey T and Simpson L.Practical improvements to the construction and destruction of static single assignment form.Software Practice and Experience,28(8):859-881,1998
    [111]Glover F.Future paths for integer programming and links to artificial intelligence.Computers and Operations Research,13:533-549,1986
    [112]Glover F and Laguna M.Tabu Search.Norwell,MA,USA:Kluwer Academic Publishers,1997
    [113]Oliveira C A S,Pardalos P M and Resende M G C.Grasp with path-relinking for the quadratic assignment problem.Ribeiro C C and Martins S L,editors,Proceedings of the Experimental and Efficient Algorithms,Third International Workshop,WEA 2004,Angra dos Reis,Brazil,volume 3059 of Lecture Notes in Computer Science,356-368.Springer,2004
    [114]Festa P,Pardalos P M,Pitsoulis L S and Resende M G C.Grasp with path-relinking for the weighted maximum satisfiability problem.Nikoletseas S E,editor,Proceedings of the Experimental and Efficient Algorithms,4th InternationalWorkshop,WEA 2005,Santorini Island,Greece,volume 3503 of Lecture Notes in Computer Science,367-379.Springer,2005
    [115]Mladenovic N and Hansen P.Variable neighborhood search.Computers(?) OR,24(11):1097-1100,1997
    [116]Hansen P,Mladenovic N and Urosevic D.Variable neighborhood search for the maximum clique.Discrete Applied Mathematics,145(1):117-125,2004
    [117]Faroe O,Pisinger D and Zachariasen M.Guided local search for the three-dimensional bin-packing problem.INFORMS Journal on Computing,15(3):267-283,2003
    [118]Fogel L J.Toward inductive inference automata.IFIP Congress,395-400.1962
    [119]Rechenberg.Evolution strategy.Zurada J,Marks R and Robinson C,editors,Computational Intelligence:Imitating Life,147-159.Piscataway,New Jersey,USA:IEEE Press,1994
    [120]Holland J H.Adaption in natural and artificial systems.The University of Michigan Press,1975
    [121]Goldberg D E.Genetic Algorithms in Search,Optimization and Machine Learning.Reading,MA,USA:Addison Wesley,1989
    [122]Mitchell M.An Introduction to Genetic Algorithms.Cambridge,MA,USA:MIT Press,1998
    [123]Deb K,Agrawal S,Pratap A and Meyarivan T.A fast and elitist multiobjective genetic algorithm:Nsga-ⅱ.IEEE Transactions on Evolutionary Computation,6(2):182-197,2002
    [124]Zitzler E and Thiele L.Multiobjective evolutionary algorithms:a comparative case study and the strength pareto approach.IEEE Transactions on Evolutionary Computation,3(4):257-271,1999
    [125]Zitzler E,Laumanns M and Thiele L.Spea2:Improving the strength pareto evolutionary algorithm.Technical Report 103,Swiss Federal Institute of Technology(ETH),Gloriastrasse 35,CH-8092 Zurich,Switzerland,2001.Citeseer.ist.psu.edu/zitzler02spea.html
    [126]Dorigo M and St(u∣¨)tzle T.Ant Colony Opmitization.Cambridge,MA,USA:The MIT Press,2004
    [127]Dorigo M,Maniezzo V and Colorni V.Ant system:Optimization by a colony of cooperating agents.IEEE Transactions on Systems,Man,and Cybernetics-Part B,26(1):29-41,1996
    [128]Dorigo M.Optimization,Learning and Natural Algorithms.Ph.D.thesis,Politecnico di Milano,Milan,1992
    [129]Bullnheimer B,Hartl R F and Strauss C.A new rank-based version of the ant system:A computational study.Central European Journal for Operations Research and Economics,7(1):25-38,1999
    [130]Stutzle T and Hoos H H.Max-min ant system.Future Generation Computer Systems,16(8):889-914,2000
    [131]Dorigo M and Gambardella L M.Ant colonies for the traveling salesman problem.BioSystems,43(2):73-81,1997
    [132]Iredi S,Merkle D and Middendorf M.Bi-criterion optimization with multi colony ant algorithms.Zitzler E,Deb K,Thiele L,Coello C A C and Corne D,editors,Proceedings of the Evolutionary Multi-Criterion Optimization,First International Conference,EMO 2001,Zurich,Switzerland,volume 1993 of Lecture Notes in Computer Science,359-372.Springer,2001
    [133]Doerner K,Gutjahr W J,Hartl R,Strauss C and Stummer C.Pareto ant colony optimization:A metaheuristic approach to multiobjective portfolio selection.Annals of Operations Research,131:79-99,2004
    [134]Baran B and Schaerer M.A multiobjective ant colony system for vehicle routing problem with time windows.Hamza M H,editor,The 21st IASTED International Multi-Conference on Applied Informatics(AI 2003),Innsbruck,Austria,97-102.IAST-ED/ACTA Press,2003
    [135]Doerner K F,Hartl R F and Reimann M.Are COMPETants more competent for problem solving? - the case of full truckload transportation.Central European Journal of Operations Research(CEJOR),115-141,2003
    [136]Guntsch M and Middendorf M.Solving multi-criteria optimization problems with population-based ACO.Fonseca C M,Fleming P J,Zitzler E,Deb K and Thiele L, editors,Proceedings of the Evolutionary Multi-Criterion Optimization,Second International Conference,EMO 2003,Faro,Portugal,volume 2632 of Lecture Notes in Computer Science,464-478.Springer,2003
    [137]Guntsch M and Middendorf M.Applying population based ACO to dynamic opti-mization problems.Dorigo M,Caro G D and Sampels M,editors,Proceedings of the Ant Algorithms,Third International Workshop,ANTS 2002,Brussels,Belgium,volume 2463 of Lecture Notes in Computer Science,111-122.Springer,2002
    [138]Guntsch M and Middendorf M.A population based approach for ACO.Cagnoni S,Gottlieb J,Hart E,Middendorf M and Raidl G R,editors,Proceedings of the Applications of Evolutionary Computing,Evo Workshops 2002:EvoCOP,EvoIASP,EvoSTIM/Evo-PLAN,Kinsale,Ireland,volume 2279 of Lecture.Notes in Computer Science,72-81.Springer,2002
    [139]Hatzakis Ⅰ and Wallace D.Dynamic multi-objective optimization with evolutionary algorithms:a forward-looking approach.Cattolico M,editor,Proceedings of the Genetic and Evolutionary Computation Conference,GECCO 2006,Seattle,Washington,USA,1201-1208.ACM,2006
    [1401 Singh G and Deb K.Comparison of multi-modal optimization algorithms based on evolutionary algorithms.Cattolico M,editor,Proceedings of the Genetic and Evolutionary Computation Conference,GECCO 2006,Seattle,Washington,USA,1305-1312.ACM,2006
    [141]Deb K and Srinivasan A.Innovization:innovating design principles through optimization.Cattolico M,editor,Proceedings of the Genetic and Evolutionary Computation Conference,GECCO 2006,Seattle,Washington,USA,1629-1636.ACM,2006[142]O'Grady R,Grot~ R,Niondada F,Bonani M and Dorigo M.Self-assembly on demand in a group of physical autonomous mobile robots navigating rough terrain.Capcarr~re 5/I S,Freitas A A,Bentley P J,Johnson C G and Timmis J,editors,Proceedings of the Advances in Artificial Life,8th European Conference,ECAL 2005,Canterbury,UK,volume 3630 of Lecture Notes in Computer Science,272-281.Springer,2005[143]Trianni V,Nolfi S and Dorigo M.Cooperative hole avoidance in a warm-bot.Robotics and Autonomous Systems,54(2):97-103,2006[144]http://www.cs.cinvestav.mx/constraint/.List of References on Constraint-Handling Tect~niques used with Evolutionary Algorithms
    [145]Deb K,Sinha A and Kukkonen S.Multi-objective test problems,linkages,and evolutionary methodologies.Cattolico M,editor,Proceedings of the Genetic and Evolutionary Computation Conference,GECCO 2006,Seattle,Washington,USA,1141-1148.ACM,2006
    [146]Veldhuizen D V.Multiobjective Evolutionary Algorithms:Classifications,Analyses,and New Innovations.Ph.D.thesis,Air Force Institute of Technology,Dayton,Ohio,USA,1999
    [147]Zitzler E.Evolutionary Algorithms for Multiobjective Optimization:Methods and Applications.Ph.D.thesis,Swiss Federal Institute of Technology,Ziirich,Switzerland,1999
    [148]Schott J R.Fault Tolerant Design Using Single and Multi-Criteria Genetic Algorithms.Master's thesis,Massachusetts Institute of Technology,Boston,USA,1995
    [149]Gandibleux X and Ehrgott M.1984-2004-20 years of multiobjective metaheuristics,but what about the solution of combinatorial problems with multiple objectives?.Coello C A C,Aguirre A H and Zitzler E,editors,Proceedings of the Evolutionary Multi-Criterion Optimization,Third International Conference,EMO 2005,Guanajuato,Mexico,volume 3410 of Lecture Notes in Computer Science,33-46.Springer,2005
    [150]Rudolph G.Convergence analysis of canonical genetic algorithms.IEEE Transactions on Neural Network,5(1):96-101,1994
    [151]Vose M D.Simple Genetic Algorithm:Foundation and Theory.MIT Press,1999
    [152]Rudolph G.Evolutionary search for minimal elements in partially ordered finite sets.Proceedings of the 7th Annual Conference on Evolutionary Programming,345-353.1998
    [153]Rudolph G and Agapie A.Convergence properties of some multi-objective evolutionary algorithms.Proceedings of the Congress on Evolutionary Computation(CES-2000),1010-1016.2000
    [154]Rudolph G.Evolutionary search under partially ordered fitness sets.Proceedings of the International Symposium on Information Science Innovations in Engineering of Natural and Artificial Intelligent Systems(ISI 2001),818-822.2001
    [155]Chaitin G J,Auslander M A,Chandra A K,Cocke J,Hopkins M E and Markstein P W.Register allocation via coloring.Computer Languages,6(1):47-57,1981
    [156]http://archi.snu.ac.kr/realtime/benchmark/.SNU real-time benchmarks suite
    [157]Slingerland N T and Smith A J.Design and characterization of the berkeley multimedia workload.Multimedia Syst.,8(4):315-327,2002
    [158]Lee C,Potkonjak M and Mangione-Smith W H.Mediabench:A tool for evaluating and synthesizing multimedia and communicatons systems.Proceedings of the Thirtieth Annual IEEE/A CM International Symposium on Microarchitecture,Research Triangle Park,North Carolina,USA,330-335.1997
    [159]http://www.densis.fee.unicamp.br/(?)oscato/TSPBIB_home.html.TSPBIB Home Page
    [160]Leung A and George L.A New MLRISC Register Allocator.New York University,1999
    [161]Wilson R P and Lain M S.Efficient context-sensitive pointer analysis for c programs.Proceedings of the ACM SIGPLAN'95 Conference on Programming Language Design and Implementation(PLDI),La Jolla,California,1-12.1995
    [162]Emami M,Ghiya R and Hendren L J.Context-sensitive interprocedural points-to analy-sis in the presence of function pointers.Proceedings of the ACM SIGPLAN'94 Conference on Programming Language Design and Implementation(PLDI),Orlando,Florida,242-256.1994
    [163]Hind M and Pioli A.Which pointer analysis should i use? ISSTA 2000,Proceedings of the International Symposium on Software Testing and Analysis,Portland,OR,USA.,113-123.2000
    [164]Hind M.Pointer analysis:haven't we solved this problem yet? Proceedings of the 2001 ACM SIGPLAN-SIGSOFT Workshop on Program Analysis For Software Tools and Engineering,PASTE'01,Snowbird,Utah,USA,54-61.ACM,2001
    [165]Ghiya R and Hendren L J.Putting pointer analysis to work.POPL,121-133.1998
    [166]Pearce D.Some Directed Graph Algorithms and Their Applications to Pointer Analysis.Ph.D.thesis,University of London,2005