Research on Hardware-accelerated Architecture for Object Detection and Recognition Applications

Original title: 面向目标检测识别应用的算法加速器体系结构研究
Author: Xu Jinbo (徐金波)
Degree level: Doctoral
Discipline: Computer Science and Technology
Keywords: object detection and recognition; hardware acceleration; pedestrian detection and recognition; facial detection and recognition; conflict-free parallel access; FPGA
Degree year: 2009
Supervisor: Dou Yong (窦勇)
Discipline code: 081201
Degree-granting institution: National University of Defense Technology (国防科学技术大学)
Date submitted: 2009-04-01

Abstract

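Object detection and recognition has significant research and application value in both military and civilian domains. Many methods have been proposed to improve detection and recognition accuracy, with notable success, but comparatively little work addresses processing speed.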
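     In practice, when object detection and recognition is deployed in a real system, sufficiently high accuracy is not enough: whether the recognition speed meets the system's real-time requirements is also a key issue. The size, implementation cost, and power consumption of the system, and its adaptability to different application environments, also need to be studied.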
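     FPGA (Field Programmable Gate Array) based hardware acceleration can map an algorithm spatially onto the computing engine to a large degree (unlike a general-purpose processor), and it allows computing and storage resources to be customized (unlike an ASIC, Application Specific Integrated Circuit). It therefore strikes a good balance between flexibility and performance. FPGA accelerators are also smaller and consume less power than general-purpose processors.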
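     FPGA-based hardware acceleration is therefore important for making object detection and recognition practical. This thesis studies algorithm accelerator architectures for object detection and recognition applications, covering four classes of application: static rigid object recognition, moving object detection and extraction, pedestrian detection and recognition, and face detection and recognition. The goal is to map algorithms fully and efficiently onto limited hardware resources, with an appropriate trade-off among hardware cost, processing speed, and processing quality. Based on the results, an FPGA-based hardware acceleration prototype system was designed for each of the four applications. In addition, a general conflict-free parallel access memory model is proposed for image processing applications with irregular data access patterns, such as pedestrian and face detection and recognition.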
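     First, hardware acceleration of static rigid object recognition based on Hausdorff distance and template matching is studied. The data access pattern of this application is fairly regular, but its computational complexity is high. The thesis proposes a parallel computing model for applications that traverse an image with a large window, with the aim of balancing the rate at which the processing units consume data against the rate at which the memory system supplies it. The model combines a conflict-free parallel access memory model built on a multi-bank memory structure with a multi-PE (Processing Element) computing structure built on a divide-and-conquer parallel strategy. The combination alleviates the excessive number of memory banks required when only the former is used and the limited storage capacity faced when only the latter is used. Performance analysis and experimental results show that the model significantly increases the parallelism of the processing units.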
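As an illustration of the matching step only, the following minimal sketch computes a directed Hausdorff distance between a template's edge points and the edge points of an image, and slides the template exhaustively over the image. It is a plain software reference, not the accelerator design of the thesis; the function names, the point-set representation, and the exhaustive search are assumptions made for this example, and `edges` is assumed to be non-empty.

```python
from math import hypot


def directed_hausdorff(a, b):
    """Largest distance from any point of `a` to its nearest point in `b`."""
    return max(min(hypot(ax - bx, ay - by) for bx, by in b) for ax, ay in a)


def best_match(template, edges, width, height, t_w, t_h):
    """Try every window position and return (distance, (x, y)) of the best one.

    `template` holds edge points relative to the window origin, `edges` holds
    edge points in image coordinates, and the window is t_w by t_h pixels.
    """
    best = None
    for oy in range(height - t_h + 1):
        for ox in range(width - t_w + 1):
            # Move the template to this window position.
            shifted = [(ox + tx, oy + ty) for tx, ty in template]
            d = directed_hausdorff(shifted, edges)
            if best is None or d < best[0]:
                best = (d, (ox, oy))
    return best


if __name__ == "__main__":
    template = [(0, 0), (1, 0), (0, 1)]
    edges = [(5, 3), (6, 3), (5, 4)]
    print(best_match(template, edges, width=10, height=8, t_w=2, t_h=2))
```

Every window position is evaluated independently of the others, which is what makes this kind of workload regular in its data access yet expensive in computation.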
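     Second, hardware acceleration and memory optimization for moving object detection and extraction are studied. In real scenes, people are often more interested in moving objects than in static ones. The thesis designs a hardware acceleration structure that classifies the different moving objects in an image. Because the number, positions, and sizes of moving objects in an image sequence change continually, the thesis introduces the "variable data set maintenance problem" and designs a general high-speed hardware linked-list structure that makes access to a variable data set more flexible.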
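One common way to keep a linked list in fixed-size memory, as hardware must, is to store nodes in slots of a fixed array and chain the unused slots into a free list. The sketch below shows that general idea in software. It is an assumption-laden illustration rather than the structure designed in the thesis, and the class name `SlotList` and its methods are hypothetical.

```python
class SlotList:
    """Fixed-capacity singly linked list held in parallel arrays."""

    NIL = -1  # marks "no slot"

    def __init__(self, capacity):
        self.data = [None] * capacity
        # Initially every slot is free and points to the next free slot.
        self.next = [i + 1 for i in range(capacity)]
        if capacity:
            self.next[-1] = self.NIL
        self.free = 0 if capacity else self.NIL
        self.head = self.NIL

    def insert(self, item):
        """Take a slot from the free list and link it in at the head."""
        if self.free == self.NIL:
            raise MemoryError("no free slot")
        slot = self.free
        self.free = self.next[slot]
        self.data[slot] = item
        self.next[slot] = self.head
        self.head = slot
        return slot

    def remove(self, slot):
        """Unlink `slot` from the list and return it to the free list."""
        prev, cur = self.NIL, self.head
        while cur != self.NIL and cur != slot:
            prev, cur = cur, self.next[cur]
        if cur == self.NIL:
            raise KeyError(slot)
        if prev == self.NIL:
            self.head = self.next[cur]
        else:
            self.next[prev] = self.next[cur]
        self.data[cur] = None
        self.next[cur] = self.free
        self.free = cur

    def items(self):
        """Return the stored items from head to tail."""
        out, cur = [], self.head
        while cur != self.NIL:
            out.append(self.data[cur])
            cur = self.next[cur]
        return out
```

In this sketch a tracked object would be inserted when it appears and removed when it leaves the scene, so the set can grow and shrink between frames without moving any stored entries.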
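     Next, once moving objects have been detected and extracted, the following step is usually moving object recognition. The thesis focuses on pedestrian and face detection and recognition, which are computationally demanding and widely needed in practice. Pedestrians and faces are "non-rigid objects": unlike rigid objects, their contours are irregular and change constantly, which not only increases computational complexity but also makes the memory access pattern irregular.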
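     The thesis studies hardware acceleration of pedestrian recognition based on the Active Shape Model (ASM). To address the shortage of computing resources caused by high computational complexity, and to balance hardware cost against processing speed, it proposes a computing resource mapping strategy that combines resource sharing with hardware pipelining in a flexibly configurable way. For non-bottleneck tasks that occupy many computing resources, resource sharing is used: multiple operations of the same type are mapped onto one functional unit and executed in a time-shared manner, their source operands enter the unit's input ports through multiplexers, and an optimized instruction scheduling algorithm minimizes the initiation interval between different operations of the same type. For bottleneck tasks with high computational complexity, more computing resources are allocated, and pipelining and other parallel strategies are used wherever possible to raise processing speed. A prototype system built on an FPGA performs pedestrian detection, recognition, and tracking, and experimental results show a considerable speed advantage over related work.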
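The cost side of this trade-off can be shown with simple cycle arithmetic. The sketch below assumes the operations are independent of one another and that the shared functional unit is pipelined; the function names and the example numbers are hypothetical, and the scheduling algorithm of the thesis is not reproduced here.

```python
def shared_unit_cycles(n_ops, latency, interval=1):
    """Cycles to finish n_ops independent operations on ONE pipelined unit.

    A new operation starts every `interval` cycles (the initiation interval)
    and each operation needs `latency` cycles from start to result.
    """
    if n_ops <= 0:
        return 0
    return (n_ops - 1) * interval + latency


def dedicated_units_cycles(n_ops, latency):
    """Cycles when each operation has its own unit and all start together."""
    return latency if n_ops > 0 else 0


if __name__ == "__main__":
    # Eight multiplications with a 4-cycle multiplier (example values).
    print(shared_unit_cycles(8, 4, interval=1))  # one multiplier
    print(shared_unit_cycles(8, 4, interval=2))  # one multiplier, slower issue
    print(dedicated_units_cycles(8, 4))          # eight multipliers
```

With these example values, one shared multiplier needs 11 cycles at an initiation interval of 1 and 18 cycles at an interval of 2, while eight dedicated multipliers need 4 cycles. Shortening the initiation interval is what keeps sharing cheap in time as well as in area.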
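     For faces, the thesis proposes a finely classified, view-independent face detection method that quickly and accurately detects and classifies all face poses within ±90 degrees of rotation out of the image plane and 360 degrees of rotation within the image plane. Each detection node in the tree-structured detector framework is trained with a novel two-stage boosting method (Two-Stage Boosting, TS-Boosting). The core idea is that, when judging whether a sample belongs to a pose interval, the detector considers not only the probability that the sample belongs to that interval but also the probability that it does not belong to the other intervals. A hardware accelerator is designed for the proposed algorithm, together with a design space exploration algorithm that configures hardware resources dynamically. Experiments show that the proposed method and accelerator achieve higher detection accuracy and processing speed than related work.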
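     Finally, for image processing applications with irregular data access patterns, such as pedestrian and face detection and recognition, the thesis proposes a general conflict-free parallel access memory model. A multi-bank memory structure is built between main memory and the processor, and most irregular data access patterns are reduced to conflict-free parallel access to fixed-size rectangular data blocks at arbitrary positions within multiple local rectangular Regions of Interest (ROIs) in the image. Theoretical analysis and experimental results show that the model suits image processing applications with multiple regions of interest better than related work, and that it speeds up memory access by a factor ranging from several times to more than a hundred times compared with accessing main memory directly.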
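To make "conflict-free" concrete, the sketch below uses a classic two-dimensional interleaving: with p*q banks, pixel (x, y) is stored in bank (y mod q)*p + (x mod p). Any p-by-q block, wherever it starts, then touches each bank exactly once, so all of its pixels can be read in the same cycle. This textbook mapping is an assumption chosen for illustration; the memory model of the thesis, including its handling of multiple ROIs, is not described in enough detail here to reproduce.

```python
def bank_of(x, y, p, q):
    """Bank index of pixel (x, y) when p*q banks are interleaved in 2-D."""
    return (y % q) * p + (x % p)


def block_banks(x0, y0, p, q):
    """Banks touched by the p-wide, q-tall block with top-left pixel (x0, y0)."""
    return [bank_of(x0 + dx, y0 + dy, p, q)
            for dy in range(q) for dx in range(p)]


def is_conflict_free(x0, y0, p, q):
    """True if no two pixels of the block fall into the same bank."""
    banks = block_banks(x0, y0, p, q)
    return len(set(banks)) == len(banks)


if __name__ == "__main__":
    # A 4x2 block at an unaligned position still uses 8 distinct banks.
    print(sorted(block_banks(5, 3, p=4, q=2)))
    print(is_conflict_free(5, 3, p=4, q=2))
```

The property holds because p consecutive x coordinates cover every residue modulo p, and q consecutive y coordinates cover every residue modulo q.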
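     In summary, this thesis studies key techniques for improving the performance of object detection and recognition applications through hardware acceleration. It proposes effective solutions for analyzing the parallel characteristics of algorithms, designing the architecture, flexibly configuring computing and storage resources, and providing a parallel access model for irregular data access patterns, and it contributes to the research and practical deployment of object detection and recognition technology.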
Object detection and recognition technology has both significant theoretical value and wide potential applications in the military and civil domains. Many algorithms and methods have been presented to improve accuracy. However, little research focuses on accelerating the processing speed.
     In fact, when designing a practical object detection and recognition system, processing speed is as important as accuracy. System size, implementation cost, power consumption, and adaptability to different application domains also require more attention.
     FPGA-based (Field Programmable Gate Array) hardware acceleration technology can optimize the mapping strategy from algorithm to computing engine (compared with a general-purpose processor), and it can also customize the computing and memory resources (compared with an ASIC, Application Specific Integrated Circuit). Therefore, FPGA-based hardware acceleration finds an appropriate balance between flexibility and high performance. In addition, an FPGA-based hardware accelerator has the advantages of small size and low power consumption compared with a general-purpose processor.
     FPGA-based hardware acceleration technology is of considerable value for designing practical object detection and recognition systems. This thesis studies in depth some key issues of hardware-accelerated computing for object detection and recognition applications. Four types of applications are studied: stationary rigid object recognition, moving object detection and extraction, pedestrian detection and recognition, and facial detection and recognition. The research goal is to put forward solutions for efficiently mapping algorithms to limited hardware resources, so as to find an appropriate trade-off among hardware implementation cost, detection accuracy, and speed. Based on the research results, hardware-accelerated prototype systems are built on FPGA for each type of application studied in this thesis. In addition, a general conflict-free multi-access memory scheme is proposed for image applications with irregular data access patterns, such as pedestrian and facial recognition.
     Firstly, the hardware acceleration techniques for stationary rigid object recognition based on Hausdorff distance and template matching are studied. The data access patterns of this kind of application are basically regular, but the computational costs are high. In order to balance the data consumption speed of the processing units and the data production speed of the memory system, a parallel computing scheme for large-sliding-window applications is proposed. The proposed scheme combines a conflict-free parallel access memory model with a multi-PE (Processing Element) computing structure based on a divide-and-conquer strategy. Consequently, the required number of memory modules and the required memory capacity are reduced. Performance analysis and experimental results show that the proposed parallel computing scheme achieves significant parallelism.
     Secondly, the hardware acceleration techniques for moving object detection and extraction are studied, together with memory optimization techniques. In the real world, people are more interested in moving objects than stationary ones. A hardware structure for classifying different moving objects in images is designed. Then, because the number, positions, and sizes of moving objects in image sequences vary frequently, the problem of maintaining a variable data set is introduced. To address this problem, a general hardware structure for linked lists is put forward to optimize accesses to the variable data set.
     After moving object detection and extraction, the next procedure is usually moving object recognition. This thesis focuses on pedestrian recognition and facial recognition, both of which require complex computation and are widely used in practice. Pedestrians and faces are “non-rigid objects”. Unlike those of rigid objects, the contours of non-rigid objects are irregular and change constantly. Consequently, the computational costs increase, and the data access patterns become irregular.
     The hardware acceleration techniques for pedestrian detection and recognition based on the Active Shape Model (ASM) are studied. To deal with the resource constraints caused by large computational costs, a resource mapping strategy is proposed that combines a resource-sharing scheme with hardware pipelining to balance hardware cost and processing speed. For tasks that are not critical but occupy many resources, the resource-sharing scheme is applied: multiple operations of the same type are mapped onto a single processing unit and time-share it. The inputs of these operations are selected by a multiplexer. The initiation intervals between different operations are minimized by a scheduling algorithm. For critical tasks that are time-consuming, more resources can be deployed to apply hardware pipelining and other parallel techniques that improve processing speed. A prototype system is constructed on FPGA, which achieves pedestrian detection, recognition, and tracking. The experimental results suggest significant speedups compared with related work.
     For facial objects, a fine-classified method for rotation-invariant multi-view face detection is presented, which detects faces across all ±90-degree rotation-out-of-plane and 360-degree rotation-in-plane pose changes quickly and accurately. Each detector node in the tree-structured detector hierarchy is trained using a novel two-stage boosting (TS-Boosting) method. The primary idea is that, when deciding whether a sample belongs to a pose range, the detector considers not only the probability that the sample belongs to that pose range but also the probability that the sample does not belong to the other pose ranges. Based on the proposed method, a hardware accelerator structure is proposed, and a design space exploration algorithm is presented to reconfigure the hardware resources. Experiments on FPGA show that higher accuracy and speed are achieved than in previous related work.
     Lastly, a general conflict-free multi-access memory scheme is proposed for image applications with irregular data access patterns, such as pedestrian and facial recognition. A multi-module memory structure is placed between the main memory and the processing units, which achieves conflict-free parallel access to randomly aligned rectangular data blocks constrained within regions of interest (ROIs). Performance analysis and experimental results show that the proposed memory scheme is more suitable than related work for image applications with multiple regions of interest, and that transfer speedups of up to hundreds of times are achieved compared with a scheme that accesses main memory directly.
     In summary, this thesis studies the key issues of hardware acceleration techniques for object detection and recognition applications. Solutions are presented for several key problems: analysis of the parallel characteristics of algorithms, hardware architecture design, reconfiguration of computation and memory resources, and a parallel memory scheme for irregular data access patterns. The contribution is of significant value for advancing the theory and practicability of object detection and recognition technology.

引文

[1] Yuille A. Deformable templates for face recognition. Journal of CognitiveNeuroscience,1991,3(1):59-70.
    [2] Sinha P. Object recognition via image invariants: A case study. InvestigativeOphthalmologyandVisualScience,1994,35(5):1735-1740.
    [3] RikertTD.Acluster-basedmodelforobjectdetection.ProceedingsofInternationalConferenceonComputerVision.1999.1046-1053.
    [4] Sung K K. Example-based learning for view-based human face detection. IEEETransactionsonPatternAnalysisandMachineIntelligence,1998,20(1):39-51.
    [5] Papageorgiou C, Poggio T. A trainable system for object recognition. InternationalJournalofComputerVision,2000,38(1):15-33.
    [6] MohanA,PapageorgiouC,PoggioT.Example-basedobjectdetectioninimagesbycomponents. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001,23(4):349-361.
    [7] Makarov A. Comparison of background extraction based intrusion detectionalgorithms, Proceedings of International Conference on Image Processing. 1996. 1:521-524.
    [8]刘永信.视频图像中运动目标检测的快速方法.仪器仪表学报,2002,23(5):163-166.
    [9]汪亚明.动态图像序列智能中的运动目标检测.计算机测量与控制,2003,11(8):564-566.
    [10]李智勇等.动态图像分析.国防工业出版社,1999.
    [11]Barron J, Fleer D. Performance of optical flow techniques. International Journal ofComputerVision,1994,12(1):42-77.
    [12]Bigun J. Multidimensional orientation with application to texture analysis andoptical flow. IEEETransactions onPatternAnalysis andMachine Intelligence,1995(1):77-104.
    [13]Zhao L, Thorpe C. Stereo and neural network-based pedestrian detection. IEEETransactionsonIntelligentTransportationSystems,2000,1(3):148-154.
    [14]Oren M, Papageorgiou C, Sinha P, Osuna E, Poggio T. Pedestrian detection usingwavelet templates. Proceedings of IEEE Conference on Computer Vision and PatternRecognition.SanJuan,PuertoRico:IEEE,1997.193-199.
    [15]Xu F L, Liu X, Fujimura K. Pedestrian detection and tracking with night vision.IEEETransactionsonIntelligentTransportationSystems,2005,6(1):63-71.
    [16]Gavrila D M, Giebel J, Munder S. Vision-based pedestrian detection: thePROTECTOR system. Proceedings of IEEE International Conference on IntelligentVehiclesSymposium.Parma,Italy:IEEE,2004.13-18.
    [17]Cheng H, Zheng N N, Qin J J. Pedestrian detection using sparse Gabor filter andsupport vector machine. Proceedings of IEEE Intelligent Vehicles Symposium. LasVegas,USA:IEEE,2005.583-587.
    [18]Tian Q M, Sun H, Luo Y P, Hu D C. Nighttime pedestrian detection with a normalcamera using SVM classifier. Proceedings of International Symposium on NeuralNetworks.Chongqing,China:Springer,2005.3497:189-194.
    [19]贾慧星,章毓晋.车辆辅助驾驶系统中基于计算机视觉的行人检测研究综述.自动化学报,2007,33(1):84-90.
    [20]郭烈,王荣本,顾柏园,余天洪.世界智能车辆行人检测技术综述.公路交通科技,2005,22(11):133-137.
    [21]Guo L, Wang R B, Jin L S, Li L H, Yang L. Algorithm study for pedestriandetection based on monocular vision. Proceedings of IEEE International Conference onVehicularElectronicsandSafety.Shanghai,China:IEEE,2006.83-87.
    [22]Cao X B, Qiao H, Wang F Y, Zhang X Z. Application of cooperative co-evolutionin pedestrian detection systems. Proceedings of IEEE International Conference onIntelligenceandSecurityInformatics.Atlanta,USA:Springer,2005.3495:664-665.
    [23]Xu Y W, Cao X B, Qiao H. Optical camera based pedestrian detection in rainy orsnowy weather. Proceedings of IEEE International Conference on Fuzzy Systems andKnowledgeDiscovery.Xi’an,China:Springer,2006.4223:1182-1191.
    [24]Chen D, Cao X B, Xu Y W, Qiao H, Wang F Y. A SVM-based classifier withshape and motion features for pedestrian detection system. Proceedings of IEEEIntelligentVehicleSymposium.Meguroku,Japan: IEEE,2006.331-335.
    [25]Guo Y P, Cao X B, Xu Y W, Qiao H. Co-evolution based feature selection forpedestrian detection. Proceedings of IEEE International Conference on Control andAutomation.Guangzhou,China,2007.2797-2801.
    [26]Cao X B, Qiao H, Keane J. A low-cost pedestrian detection system with a singleoptical camera. IEEE Transactions on Intelligent Transportation System, 2008, 9(1):58-67.
    [27]Wei C X, Cao X B, Xu Y W, Qiao H, Wang F Y. The treelike assembly classifierfor pedestrian detection. Proceedings of Pacific Asian Workshop on Intelligence andSecurityInformatics.Chengdu,China:Springer,2007.232-237.
    [28]Gandhi T, Trivedi M M. Pedestrian collision avoidance systems: A survey ofcomputer vision based recent studies. Proceeding of Intelligent Transportation SystemsConference.Toronto,Canada:IEEE,2006.976-981.
    [29]Haritaoglu I, Hartwood D, Davis L S. W4: real-time surveillance of people andtheir activities. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000,22:809-830.
    [30]O’Rourke J, Badler N. Model-based image analysis of human motion usingconstraint propagation. IEEE Transactions on Pattern Analysis and MachineIntelligence,1980,2(6):522-536.
    [31]HoggD.Model-basedvision: aprogram toseea walkingperson. Imageand VisionComputing,1983,1(1):5-20.
    [32]Baumberg A M. Learning deformable models for tracking human motion. Ph. D.thesis,SchoolofComputerStudies,UniversityofLeeds,Leeds,UK,1995.
    [33]Brémond F, Thonnat M. Tracking multiple non-rigid objects in a cluttered scene.Proceedings of the 10th Scandinavian Conference on Image Analysis. Lappeenranta,Finland,1997.2:643-650.
    [34]Cai Q, Mitiche A, Aggarwal J K. Tracking human motion in an indoorenvironment. Proceedings of International Conference on Image Processing. 1995.215-218.
    [35]Gavrila D M, Davis L S. Tracking of humans in action: a 3-D model-basedapproach. Proceedings of ARPA Image Understanding Workshop. Palm Springs, USA,1996.737-746.
    [36]Johnson N. Learning object behavior models. Ph. D. thesis, School of ComputerStudies,UniversityofLeeds,Leeds,UK,1998.
    [37]Khan S, Javed O, Rasheed Z, Shah M. Human tracking in multiple cameras.Proceedings of IEEE International Conference on Computer Vision. Vancouver,Canada,2001.331-336.
    [38]Lipton A J, Fujiyoshi H, Patil R S. Moving target classification and tracking fromreal-time video. Proceedings of the DARPA Image Under- standing Workshop.Monterey,USA,1998.129-136.
    [39]Sidenbladh H, Black M J, Fleet D J. Stochastic trackingof 3D human figures using2D image motion. Proceedings of European Conference on Computer Vision. Dublin,Ireland,2000.702-718.
    [40]Wren C, Azarbayejani A, Darrell T, Pentland A. Pfinder: real-time tracking of thehuman body. Technical Report 353, MIT Media Laboratory Perceptual ComputingSection,1995.
    [41]GavrilaDM.Pedestriandetectionfromamovingvehicle.ProceedingsofEuropeanconferenceonComputerVision.London,UK:Springer,2000.1843:37-49.
    [42]Lipton A. Local application of optic flow to analyze rigid versus nonrigid motion.Proceedings of International Conferenceon ComputerVisionWorkshop onFrame-RateVision.Corfu,Greece:IEEE,1999.
    [43]Heisele B, W?hler C. Motion-based recognition of pedestrians. Proceedings ofIEEE International Conference on Pattern Recognition. Brisbane, Australia: IEEE,1998.2:1325-1330.
    [44]Sessler G M A, Martoyo T, Jondral F K. RBF based multiuser detectors forUTRA-TDD. Proceedings of IEEE Vehicular Technology Conference. Atlantic, USA:IEEE,2001.1:484-486.
    [45]Franke U, Gavrila D M, Gorzig S, Lindner F, Puetzold F, W?hler C. Autonomousdriving goes downtown. IEEE Intelligent Systems & Their Applications, 1998, 13(6):40-48.
    [46]Gavrila D M, Geibel J. Shape-based pedestrian detection and tracking. ProceedingsofIEEEIntelligentVehiclesSymposium.Versailles,France:IEEE,2003.1:8-14.
    [47]Kass M, Witkins A, Terzopoulos D. Snakes: active contour models. Proceedings ofInternationalConferenceonComuputerVision.1987.259-268.
    [48]Wiskott L, Fellous J, Kruger N, Malsburg C v d. Face recognition byelastic bunchgraph matching. IEEE Transactions on Pattern Analysis and Machine Intelligence,1997,19(7):775-779.
    [49]Cootes TF,TaylorC J,CooperDH,Graham J.Activeshapemodels–theirtrainingandapplication.ComputerVisionandImageUnderstanding,1995,61:38-59.
    [50]Cootes T F, Edwards G J, Taylor C J. Active appearance models. Proceedings ofEuropeanConferenceonComputerVision.1998.2:484-498.
    [51]Baumberg A, Hogg D. An efficient method for contour tracking using active shapemodels. Proceedings of IEEE Workshop on Motion of Non-Rigid and ArticulatedObjects.1994.194-199.
    [52]Li Y, Lai J H, Yuen P C. Multi-template ASM method for feature points detectionof facial image with diverse expressions. Proceedings of International Conference onAutomaticFaceandGestureRecognition.2006.435-440.
    [53]Zhao Z, Teoh E K. A novel 3D statistical shape model for segmentation of medicalimages.ProceedingsofInternationalSymposiumonVisualComputing.2006.638-647.
    [54]Hou X W, Li S Z, Zhang H J, Cheng Q S. Direct appearance models. ProceedingsofIEEEInternationalConferenceonComputerVisionandPatternRecognition.Hawaii,2001.828-833.
    [55]Cootes T F, Walker K, Taylor C J. View-based active appearance models.Proceeding of International Conference on Face and Gesture Recognition. Grenoble,France,2000.227-232.
    [56]Li S Z, Yan S C, Zhang H J, Cheng Q S. Multi-view face alignment using directappearance models. Proceedings of International Conference on Automatic Face andGestureRecognition.Washington,DC,USA,2002.324-329.
    [57]Yan S C, Liu C, Li S Z, Zhang H J, Shum H, Cheng Q S. Face alignment usingtexture-constrained active shape models. Image and Vision Computing, 2003, 21(1):69-75.
    [58]Zhou Y, Gu L, Zhang H J. Bayesian tangent shape model: estimating shape andpose parameters via Bayesian inference. Proceeding of IEEE Conference on ComputerVisionandPatternRecognition.Wisconsin,USA,2003.109.
    [59]凌旭峰.彩色图像监控系统的人脸检测和识别.上海交通大学博士论文,2001.
    [60]梁路宏.基于多关联模板匹配的人脸检测.软件学报,2001,12(1):94-102.
    [61]刑昕,汪孔桥,沈兰荪.基于器官跟踪的人脸实时跟踪方法.电子学报,2000,28(6):29-31.
    [62]刘明宝,姚鸿勋,高文.彩色图像的实时人脸跟踪方法.计算机学报,1998,21(6):527-532.
    [63]LuXG,ZhouJ,ZhangCS.Anovelalgorithmforrotatedhumanfacedetection.Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, HiltonHeadIsland,SouthCarolina,USA,2000.760-764.
    [64]Li S Z, Zhu L, Zhang Z Q, et al. Statistical learning of multi view face detection.Proceedings of the European Conference on Computer Vision, Copenhagen, Denmark,2002.117-121.
    [65]Augusteijn M F. Identification of human faces through texture-based featurerecognition and neural network technology. Proceedings of IEEE Conference on Neuralnetworks,1993.392-398.
    [66]袁超,张长水.基于多模板匹配的自动人脸检测.电子学报,2000,28(3):95-98.
    [67]Rowley H. Neural network-based face detection. IEEE Transactions on PatternAnalysisandMachineIntelligence,1998,20(1):23-38.
    [68]Osunav E. Training support vector machines: an application to face detection.ProceedingsofIEEEConferenceonComputerVisionandPatternRecognition,1997.4:130-136.
    [69]Compton K, Hauck S. Reconfigurable computing: a survey of systems andsoftware.ACMComputingSurvey,2002,34(2):171-210.
    [70]PlesslC, EnzlerR,WalderH,BeutelJ,PlatznerM,ThieleL,TrosterG.Thecasefor reconfigurable hardware in wearable computing. Personal and UbiquitousComputing,2003,299-308.
    [71]IMEC.InteruniversityMicroElectronicCenter.T-ReCSGecko.
    [72]Chakraborty S, Gries M, Kunzli S, Thiele L. Design space exploration of networkprocessor architectures. Network Processor Design: Issues and Practices, 2002, 1:55-89.
    [73]Volpe R. Rover functional autonomy development for the Mars mobile sciencelaboratory.ProceedingsofAerospaceConference.IEEE,2003.2:643-652.
    [74]Dou Y, Vassiliadis S, Kuzmanov G K, Gaydadjiev G N. 64-bit floating-pointFPGA matrix multiplication. Proceedings of the 2005 ACM/SIGDA 13th InternationalSymposiumonField-ProgrammableGateArrays.2005.86-95.
    [75]Scrofano R, Prasanna V K. Computing Lennard-Jones potentials and forces withreconfigurable hardware. Proceedings of International Conference on Engineering ofReconfigurableSystemsandAlgorithms.2004.284-290.
    [76]Underwood K D, Hemmert K S. Closing the gap: CPU and FPGA trends insustainable floating-point BLAS performance. Proceedings of IEEE Symposium onField-ProgrammableCustomComputingMachines.2004.
    [77]Zhuo L, Prasanna V K. Scalable and modular algorithms for floating-point matrixmultiplication on FPGAs. Proceedings of 18th International Parallel & DistributedProcessingSymposium.NewMexico,USA,2004.
    [78]Ratha M K. Computer vision algorithms on reconfigurable logic arrays. Ph. D.Dissertation,MichiganStateUniversity,1996.
    [79]Bosi B,BoisG,Savaria Y.Reconfigurablepipelined2-Dconvolversforfastdigitalsignalprocessing.IEEETransactionsonVLSISystems,1999,7(3):299-308.
    [80]Managuli R, York G, Kim D, Kim Y. Mapping of two-dimensional convolution onvery long instruction word media processors for real-time performance. Journal ofElectronicImaging,2000,9(3):327-335.
    [81]Huttenlocher D P, Klanderman G A, Rucklidge W J. Comparing images using theHausdorff distance. IEEE Transactions on Pattern Analysis and Machine Intelligence,1993,15(9):850-863.
    [82]You J, Bhattacharya P, Hungenahally S. Real-time object recognition: hierarchicalimage matching in a parallel virtual machine environment. Proceedings of InternationalConference on Pattern Recognition. Washington: IEEE Computer Society, 1998.275-277.
    [83]Gavrila D M, Philomin V. Real-time object detection for smart vehicles.Proceedings of International Conference on Computer Vision. Los Alamitos: IEEEComputerSociety,1999.87-93.
    [84]Kean T, Duncan A. A 800 Mpixel/sec reconfigurable image correlator on XC6216.ProceedingsofInternationalWorkshoponField-ProgrammableLogicandApplications.London:Springer-Verlag,1997.382-391.
    [85]Villasenor J, Schoner B, Chia K, Zapta C. Configurable computing solutions forautomatic target recognition. Proceedings of the 1996 Symposium on FPGAs forCustomComputingMachines.LosAlamitos:IEEEComputerSociety,1996.70-79.
    [86]Hezel S, Gavrila D, Kugel A, Manner R. FPGA-based template matching usingdistance transforms. Proceedings of 10th Annual IEEE Symposium onField-ProgrammableCustomComputingMachines.2002.89-97.
    [87]Arias-EstradaM,Rodríguez-PalaciosE.AnFPGAco-processorforreal-timevisualtracking. Proceedings of International Conference on Field-Programmable Logic andApplications.2002.710-719.
    [88]Sudha N, Vivek E P. A high-speed VLSI array architecture for Euclideanmetric-based Hausdorff distance measures between images. Proceedings ofInternationalConferenceonHighPerformanceComputing.Goa,India,2005.180-189.
    [89]Rosenfeld A, Pfaltz J L. Sequential operations in digital picture processing. JournaloftheAssociationforComputingMachinery,1966,13(4):471-494.
    [90]Castleman K R. Digital Image Processing. Prentice Hall. Inc, New Jersey, USA,1996.
    [91]Breu H, Gil J, Kirkpatrick D, et al. Linear time Euclidean distance transformalgorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1995,17(5):529-533.
    [92]Maurer C R, Raghavan V, Qi R S. A linear time algorithm for computing theEuclidean distance transform in arbitrarydimensions. Proceedings of 17th InternationalConference of Information Processing in Medical Imaging. Davis, CA, USA, 2001.358-364.
    [93]Borgefors G. Distance transformations in digital images. Computer Vision,Graphics,andImageProcessing,1986,34(3):344-371.
    [94]Danielsson P E. Euclidean distance mapping. Computer Graphics and ImageProcessor,1980,14(2):227-248.
    [95]Yamada H. Complete Euclidean distance transformation by parallel operation.Proceedings of International Conference on Pattern Recognition. Montreal, Canada,1984.69-71.
    [96]Das P P, Chakrabarti P P. Distance functions in digital geometry. InformationScience,1987,42(2):113-136.
    [97]Yamashita M, Ibaraki T. Distances defined by neighborhood sequences. PatternRecongnition,1986,19(3):237-246.
    [98]Borgefors G. Distance transformations in arbitrary dimensions. Computer Vision,Graphics,andImageProcessing,1984,27(3):321-345.
    [99]Borgefors G. A new distance transformation approximation the Euclidean distance.Proceedings of International Joint Conference on Pattern Recognition. London, UK,1986:336-338.
    [100] Otsu N. A threshold selection method from grey-level histograms. IEEETransactionsonSystems,Man,andCybernetics,1979,9(1):377-393.
    [101] Swenson R L, Dimond K R. A hardware FPGA implementation of a 2-Dmedian filter using a novel rank adjustment technique. Proceedings of InternationalConference on Image Processing and its Applications. London: IEE ConferencePublication,1999.103-106.
    [102] Bates G L, Nooshabadi S. FPGA implementation of a median filter.Proceedings of IEEE TENCON Conference. Los Alamitos: IEEE Computer Society,1997.437-440.
    [103]王亮,胡卫明,谭铁牛.行人运动的视觉分析综述.计算机学报,2002,25(3):225-237.
    [104]刘亚,艾海舟,徐光佑.一种基于背景模型的运动目标检测与跟踪算法.信息与控制,2002,31(4):315-319.
    [105] Sen C, Cheung S, Chandrika K. Robust techniques for background subtractionin urban traffic video. Visual Communications and Image Processing, 2004, 5308(1):881-892.
    [106] MassimoP.Backgroundsubtractiontechniques:areview.ProceedingsofIEEEInternational Conference on Systems, Man and Cybernetics. Piscataway: IEEE Press,2004.3099-3104.
    [107] McFarlane N, Schofield C. Segmentation and tracking of piglets in images.MachineVisionandApplications,1995,8(3):187-193.
    [108] Meribout M, Nakanishi M. A new real time object segmentation and trackingalgorithm and its parallel hardware architecture. Journal of VLSI Signal Processing,2005,39:249-266.
    [109] Tech G, Schwann R, Kappen G, F?rst M, Noll T G. Adaptive kernel algorithmfor FPGA-based speckle reduction. Proceedings of SPIE Medical Imaging. 2008.691424-691424-12.
    [110] Morimoto T, Adachi H, Yamaoka K, Awane K, Koide T, Mattausch H J. AnFPGA-based region-growing video segmentation system with boundary-scan-only LSIarchitecture.ProceedingsofIEEEConferenceonCircuitsandSystems.2006.944-947.
    [111] Gonzalez R C, Woods R E. Digital Image Processing (Second Edition).PrenticeHall,2002.
    [112] Dou Y, Xu J. FPGA-accelerated active shape model for real-time peopletracking. Proceedings of the Asia-Paci?c Computer Systems Architecture Conference.2007.268-279.
    [113] Xu J, Dou Y. Robust and real-time automatic target recognition using partialHausdorff distance measure on recon?gurable hardware. Proceedings of theInternationalConferenceonField-ProgrammableTechnology.2006.25–32.
    [114] Sonka M, Hlavac V, Boyle R. Image Processing, Analysis, and MachineVision(SecondEdition). Brooks/Cole, Paci?c Grove, CA, 1999.
    [115]魏本杰,刘明业,张晓昆,金泰松.三维多分等级树算法的VLSI设计与仿真.计算机辅助设计与图形学学报,2006,18(12):1867-1871.
    [116] Lu Y, Marconi T, Gaydadjiev G, Bertels K. An efficient algorithm for freeresources management on the FPGA. Proceedings of the Conference on Design,AutomationandTestinEurope.2008.1095-1098.
    [117] Papaefstathiou I, Orphanoudakis T, Kornaros G, Kachris C, Mavroidis I,NikologiannisA.Queuemanagementinnetworkprocessors.Proceedings oftheDesign,AutomationandTestinEuropeConferenceandExhibition.2005.112-117.
    [118] Barrow H G, Tenenbaum J M, Bolles R C, Wolf H C. Parametriccorrespondence and chamfer matching: two new techniques for image matching.ProceedingsofInternationalJointConferenceonArtificialIntelligence.1977.659-663.
    [119] Borgefors G. Hierarchical chamfer matching: a parametric edge matchingalgorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1988, 10:849-865.
    [120] Welch G, Bishop G. An introduction to the Kalman filter. TR95-041.UniversityofNorthCarolinaatChapelHill,1995.
    [121] Blake A, Curwen R, Zisserman A. A framework for spatio-temporal control inthe tracking of visual contours. International Journal of computer Vision, 1993, 11(2):127-145.
    [122] Lam M. Software pipelining: an effective scheduling technique for VLIWmachines.ProceedingsoftheSIGPLANConferenceonProgrammingLanguageDesignandImplementation(PLDI).LasVegas,Nevada,USA,1988.318-328.
    [123] Allan V H, Jones R B, Lee R M, Allan S J. Software pipelining. ACMComputingSurveys,1995,27(3):367-432.
    [124] Rau B R. Iterative modulo scheduling: an algorithm for software pipeliningloops. Proceedings of the 27th Annual International Symposium on Microarchitecture(MICRO).SanJose,California,USA,1994.63-74.
    [125] Siebel N T. Design and implementation of people tracking algorithms forvisual surveillance applications. Ph. D. thesis. Department of Computer Science, theUniversityofReading,Reading,UK,2003.
    [126] Gavrila D M. Vision-based 3D tracking of humans in action. Ph. D. thesis.DepartmentofComputerScience,UniversityofMaryland,CollegePark,USA,1996.
    [127] SidenbladhH.Probabilistictrackingandreconstructionof3Dhumanmotioninmonocular video sequences. Ph. D. thesis. Royal Institute of Technology, Stockholm,2001.
    [128] http://homepages.inf.ed.ac.uk/rbf/caviar/.
    [129] Yang M H, Kriegman D, Ahuja N. Detecting faces in images: A survey. IEEETransactionsonPatternAnalysisandMachineIntelligence,2002,24(1):34-58.
    [130] Rowley H A. Neural network-based human face detection. Ph. D. thesis.CarnegieMellonUniversity,1999.
    [131] Schneiderman H, Kanade T, A statistical method for 3d object detectionapplied to faces and cars. Proceedings of IEEE Computer Society Conference onComputerVisionandPatternRecognition.2000.1746.
    [132] Viola P, Jones M. Rapid object detection using a boosted cascade of simple features. Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2001. 511-518.
    [133] Kuchinsky A, Pering C, Creech M, Freeze D, Serra B, Gwizdka J. FotoFile: a consumer multimedia organization and retrieval system. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 1999. 496-503.
    [134] Wu B, Ai H G, Lao S H. Fast rotation invariant multi-view face detection based on real adaboost. Proceedings of IEEE International Conference on Automatic Face and Gesture Recognition. 2004. 79-84.
    [135] Jones M, Viola P. Fast multi-view face detection. MERL-TR2003-96. 2003.
    [136] Huang C, Ai H Z, Lao S H. Vector boosting for rotation invariant multi-view face detection. Proceedings of IEEE International Conference on Computer Vision. 2005. 446-453.
    [137] Schapire R E, Singer Y. Improved boosting using confidence-rated predictions. Machine Learning, 1999, 37(3): 297-336.
    [138] Friedman J, Hastie T, Tibshirani R. Additive logistic regression: a statistical view of boosting. Annals of Statistics, 2000, 28: 337-374.
    [139] Xiao R, Zhu L, Zhang H. Boosting chain learning for object detection. Proceedings of IEEE International Conference on Computer Vision. 2003. 709.
    [140] Li S Z, Zhang Z Q. Floatboost learning and statistical face detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004, 26(9): 1112-1123.
    [141] Mita T, Kaneko T, Hori O. Joint Haar-like features for face detection. Proceedings of IEEE International Conference on Computer Vision. 2005. 1619-1626.
    [142] Lienhart R, Maydt J. An extended set of Haar-like features for rapid object detection. Proceedings of International Conference on Image Processing. 2002. 900-903.
    [143] Liu C, Shum H Y. Kullback-Leibler boosting. Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2003. 587-594.
    [144] Baluja S, Sahami M, Rowley H A. Efficient face orientation discrimination. Proceedings of IEEE International Conference on Image Processing. 2004. 589-592.
    [145] Wang P, Ji Q. Learning discriminant features for multi-view face and eye detection. Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2005. 373-379.
    [146] Abramson Y, Steux B. YEF real-time object detection. Proceedings of International Workshop on Automatic Learning and Real-Time. 2005.
    [147] Hori Y, Shimizu K, Nakamura Y, Kuroda T. A real-time multi face detection technique using positive-negative lines-of-face template. Proceedings of International Conference on Pattern Recognition. 2004. 765-768.
    [148] Paschalakis S, Bober M. A low cost FPGA system for high speed face detection and tracking. Proceedings of IEEE International Conference on Field-Programmable Technology. 2003. 214-221.
    [149] Theocharides T, Link G, Vijaykrishnan N, Irwin M, Wolf W. Embedded hardware face detection. Proceedings of IEEE International Conference on VLSI Design. 2004. 133-138.
    [150] Yang M, Wu Y, Crenshaw J, Augustine B, Mareachen R. Face detection for automatic exposure control in handheld camera. Proceedings of IEEE International Conference on Computer Vision Systems. 2006. 17.
    [151] Schapire R E. The strength of weak learnability. Machine Learning, 1990, 5(2): 197-227.
    [152] Freund Y, Schapire R E. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 1997, 55(1): 119-139.
    [153] Schapire R E. The boosting approach to machine learning: an overview. Proceedings of MSRI Workshop on Nonlinear Estimation and Classification. 2002. 149-172.
    [154] Viola P, Jones M. Robust real-time face detection. International Journal of Computer Vision, 2004, 57(2): 137-154.
    [155] DeMacq I, Simar L. Hyper-rectangular space partitioning trees, a few insight. Tech. rep. Universite Catholique de Louvain, Belgium, 2002.
    [156] Mitéran J, Matas J, Bourennane E, Paindavoine M, Dubois J. Automatic hardware implementation tool for a discrete adaboost-based decision algorithm. EURASIP Journal on Applied Signal Processing, 2005, 2005(1): 1035-1046.
    [157] Yu W, Xiong B, Chareonsak C. FPGA implementation of adaboost algorithm for detection of face biometrics. Proceedings of IEEE International Workshop on Biomedical Circuits and Systems. 2004. 17-20.
    [158] Kuzmanov G, Vassiliadis S, Eijndhoven J. A 2D addressing mode for multimedia applications. Proceedings of Workshop on System Architecture, Modeling, and Simulation. Samos, Greece, 2001. 291-306.
    [159] Budnik P, Kuck D J. The organization and use of parallel memories. IEEE Transactions on Computers, 1971, 20(12): 1566-1569.
    [160] Chen S, Postula A, Jozwiak L. Synthesis of XOR storage schemes with different cost for minimization of memory contention. Proceedings of Euromicro Conference. Milan, Italy, 1999. 1170-1177.
    [161] Lee H, Moon K A, Park J W. Design of parallel processing system for facial image retrieval. Proceedings of 4th International ACPC Conference. Salzburg, Austria, 1999. 592-593.
    [162] Lee H, Park J W. Parallel processing system for multi-access memory system. Proceedings of World Multi-Conference on Systematics, Cybernetics, and Information. 2000. 561-565.
    [163] Kim K, Prasanna V K. Latin squares for parallel array access. IEEE Transactions on Parallel and Distributed Systems, 1993, 4(4): 361-370.
    [164] Lee D. Scrambled storage for parallel memory systems. Proceedings of IEEE International Symposium on Computer Architecture. Honolulu, Hawaii, 1988. 232-239.
    [165] Park J W. An efficient buffer memory system for subarray access. IEEE Transactions on Parallel and Distributed Systems, 2001, 12(3): 316-335.
    [166] Park J W. Multiaccess memory system for attached SIMD computer. IEEE Transactions on Computers, 2004, 53(4): 439-452.
    [167] Haverkamp M B, Kuzmanov G, Vassiliadis S. Implementing 2D memory buffers for MPEG. Proceedings of PRORISC Conference. Veldhoven, The Netherlands, 2003. 90-94.
    [168] Lawrie D H. Access and alignment of data in an array processor. IEEE Transactions on Computers, 1975, C-24(12): 1145-1155.
    [169] Voorhis D C v, Morrin T H. Memory systems for image processing. IEEE Transactions on Computers, 1978, C-27(2): 113-125.
    [170] Sproull R F, Sutherland I, Thomson A, Gupta S, Minter C. The 8 by 8 display. ACM Transactions on Graphics, 1983, 2(1): 32-56.
    [171] Kneip J, Ronner K, Pirsch P. A data path array with shared memory as core of a high performance DSP. Proceedings of International Conference on Application Specific Array Processors. San Francisco, CA, USA, 1994. 271-282.
    [172] Wittenburg J P, Ohmacht M, Kneip J, Hinrichs W, Pirsch P. HiPAR-DSP: A parallel VLIW RISC processor for real time image processing applications. Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing. 1997. 155-162.
    [173] Kloos H, Wittenburg J, Hinrichs W, Lieske H, Friebe L, Klar C, Pirsch P. HiPAR-DSP 16, a scalable highly parallel DSP core for system on a chip: video and image processing applications. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing. 2002. 3112-3115.
    [174] Kuzmanov G, Gaydadjiev G, Vassiliadis S. Multimedia rectangularly addressable memory. IEEE Transactions on Multimedia, 2006, 8(2): 315-322.