Title: An AdaBoost Object Detection Algorithm for Heterogeneous Computing with OpenCL
Original title: AdaBoost物件偵測演算法之異質運算加速研究
Authors: Cheng, Bing-Yang (鄭秉揚)
Guo, Jiun-In
Department of Electronics Engineering and Institute of Electronics
Keywords: Classifier; Graphics processing unit; Heterogeneous System Architecture; Object detection; AdaBoost; GPU; HSA
Publication date: 2013
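Abstract: Object detection is an important area of computer vision and appears in many applications and products, such as face detection and pedestrian detection. However, high detection accuracy usually comes at the cost of speed and performance. One solution is a hardware implementation, which improves speed and reduces energy consumption but cannot be flexibly adjusted or updated when the algorithm changes. This thesis therefore proposes accelerating object detection on a heterogeneous multi-core CPU and GPU platform under the Heterogeneous System Architecture (HSA), in order to meet the performance requirements of real-time operation.
The proposed method makes good use of the architectural characteristics and computing resources of both the CPU and the GPU, parallelizes the AdaBoost algorithm effectively, and resolves the performance bottlenecks that AdaBoost encounters under GPU acceleration. On the AMD A10-7850K platform it reaches D1 @ 44 fps.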
Object detection is an important problem in computer vision. Detecting specific objects is useful in many applications, e.g. face detection and pedestrian detection. However, high detection accuracy usually comes with high computational complexity, which can lead to poor performance in implementation. Hardware implementation is one way to improve processing performance significantly, but it lacks flexibility when the algorithm needs to be updated. Thus, in this thesis we propose a parallel AdaBoost algorithm to be implemented on a Heterogeneous System Architecture (HSA) platform consisting of multiple CPU and GPU cores.
The proposed algorithm uses AdaBoost classification with Haar-like features for object detection. Feature calculation is the most time-consuming part of AdaBoost: it accounts for over 98% of the computation and cannot reach real-time processing with CPU computing alone. We therefore aim to accelerate the feature calculation in AdaBoost by exploiting both CPU and GPU computing resources. Given the characteristics of the AdaBoost algorithm, two problems limit its performance on the GPU: the window load imbalance problem and the scale load imbalance problem. Three solutions are proposed to overcome these two problems: scale parallelization, stage parallelization, and dynamic system partitioning. With these three solutions, the proposed system achieves D1 video @ 44 fps on an AMD A10-7850K processor.
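As an illustration of the feature calculation described above, the following minimal Python sketch computes a two-rectangle Haar-like feature from an integral image and runs a toy cascade over a row of sliding windows. It is not the thesis's implementation: the function names, the test image, the thresholds, and the single-feature stages are all hypothetical, and a real AdaBoost cascade sums many weighted weak classifiers per stage. The sketch shows one plausible reading of the window load imbalance problem: windows rejected at an early stage do far less work than windows that pass every stage.

```python
from typing import List

Image = List[List[int]]


def integral_image(img: Image) -> Image:
    """Return an (h+1) x (w+1) summed-area table with a zero top row and left column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii: Image, x: int, y: int, w: int, h: int) -> int:
    """Sum of the pixels in the rectangle with top-left (x, y), using four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect(ii: Image, x: int, y: int, w: int, h: int) -> int:
    """Two-rectangle Haar-like feature: left half minus right half (w must be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


def stages_evaluated(ii: Image, x: int, y: int, size: int,
                     thresholds: List[int]) -> int:
    """Run a toy cascade on one window and return how many stages were evaluated.

    Hypothetical simplification: every stage reuses the same feature with a
    stricter threshold instead of a weighted sum of weak classifiers.
    """
    for stage, threshold in enumerate(thresholds, start=1):
        if haar_two_rect(ii, x, y, size, size) < threshold:
            return stage  # rejected early: the remaining stages are skipped
    return len(thresholds)  # the window passed every stage


# Hypothetical 4 x 8 test image: a bright-to-dark edge on the left, flat on the right.
img = [[9, 9, 1, 1, 5, 5, 5, 5] for _ in range(4)]
ii = integral_image(img)
thresholds = [10, 30, 50]  # assumed values, chosen only for this demonstration

# Slide a 4 x 4 window across the row and record the work done per window.
counts = [stages_evaluated(ii, x, 0, 4, thresholds) for x in range(5)]
print(counts)  # [3, 2, 1, 1, 1]
```

Only the window aligned with the edge runs all three stages; the others stop after one or two. If each window were mapped to its own GPU work-item, work-items executing in lockstep would idle until the slowest window in their group finished, which is the kind of imbalance that parallelization schemes need to address.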
Appears in Collections: Thesis