Title: 多媒體壓縮與視覺分析技術於嵌入式平台開發及智慧型運輸系統應用之研究
A Study of Multimedia Compression and Vision-Based Analysis Techniques for Embedded Platform Development and Intelligent Transportation System Applications
Authors: 黃皓昱
Huang, Hao-Yu
吳炳飛
Wu, Bing-Fei
Institute of Electrical and Control Engineering
Keywords: embedded system; multimedia; audio compression; image compression; blind spot detection
Issue Date: 2011
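Abstract: With the rapid pace of technological change, the Intelligent Transportation System (ITS) has become one of the most important research topics today. Among current ITS applications, demand is growing for voice recording, real-time driving (event data) recording, and blind spot detection. For voice recording, audio compression makes recording more efficient. Among the audio compression standards, Advanced Audio Coding (AAC) has attracted the most attention in recent years, since it delivers CD-quality audio at a bit rate of 96 kbps. Given limited network bandwidth and portable-device storage, AAC has become the preferred choice for audio compression. For real-time driving recording, image compression is equally important, and transform coding plays a key role in reducing redundant data while maintaining good coding quality. The most widely used transform is the discrete cosine transform (DCT), which, combined with Huffman coding, provides excellent compression quality in many image and video coding standards. However, the complexity and memory usage of these standards do not fully fit the low clock rates and limited memory of embedded platforms. Real-time driving recording therefore calls for a low-complexity, low-memory image and video coder. Another important ITS topic is blind spot detection. In driver assistance, blind spot detection plays an important role in protecting the driver and preventing collisions, and detecting and recognizing vehicles in the blind spot area with image processing and computer vision is currently a very active ITS research topic. This dissertation is divided into five parts. The first part briefly introduces the application of multimedia compression and vision-based analysis techniques to ITS. The following parts address three topics, namely audio compression, image compression, and blind spot detection, covering both the theory and the implementation of these techniques for ITS, each realized on an embedded platform.

Chapter 2 proposes several optimization methods for MPEG-2/4 AAC Low Complexity (LC) encoding and decoding. With the use of consumer electronics in automotive electronics in mind, we also implement the codec on the TI OMAP5912 platform. A key issue in implementing AAC on an embedded platform is how to reduce computational complexity and memory consumption. Because most embedded platforms offer only limited computing power and memory, simplifying just one or two modules of the algorithm is not enough. In Chapter 2 we therefore propose optimizations across the modules of the whole AAC codec, including the Temporal Noise Shaping (TNS) module, the Mid/Side (M/S) Stereo Coding module, the Modified Discrete Cosine Transform (MDCT) module, and the Inverse Quantization (IQ) module. While optimizing complexity, we also control memory usage so as to balance performance against memory footprint. Implementation on the embedded platform and comparison of experimental results verify that the proposed AAC codec runs on a low-cost embedded platform with low complexity and low memory consumption while retaining high quality.

Chapter 3 proposes a Single-Pass Perceptual Embedded Zero-tree Coding (SPPEZC) method based on block-edge detection. SPPEZC combines two novel compression techniques: Block Edge Detection (BED) and the Low-Complexity and Low-Memory Entropy Coder (LLEC). Because edge information carries the contour cues of the original image and preserves its perceptual quality, we use block-edge detection as the basis for compression and dynamically adjust the quantization table according to each block's edge information. With the LLEC, the quantized DCT coefficients can then be coded efficiently while preserving perceptual quality. SPPEZC is verified on a PC platform and also implemented on a dual-core embedded platform. Experimental results show that the method is well suited to embedded platforms and achieves compression quality comparable to that of other commonly used image compression standards.

Chapter 4 proposes a vision-based blind spot detection system that works in both daytime and nighttime. We first give a detailed analysis of a dynamic model for the vehicle-mounted camera, and then describe how the region of interest (ROI) in the input image is selected. The blind spot detection algorithm is discussed separately for daytime and nighttime. In daytime, the system uses a horizontal edge and shadow composite region to extract the shadow underneath a target vehicle, and uses that shadow as the cue for detecting vehicles in the blind spot area. At night, the system extracts bright objects from the input image and identifies paired headlights as the cue for detecting vehicles in the blind spot area. When a vehicle comes too close within the host vehicle's blind spot area, the system issues a warning and estimates the distance, assisting the driver in avoiding a side collision. For automotive electronics applications, the method is also implemented on the DM642 embedded platform and tested on highways with the experimental vehicle Taiwan iTS-II. Finally, Chapter 5 summarizes the conclusions of this dissertation and outlines future research.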
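As a rough illustration of two of the AAC modules named above, the following Python sketch shows the standard M/S stereo mapping and a table-based form of AAC inverse quantization, which trades a small lookup table for fewer power computations. It is not taken from the dissertation: the function names, the table length `TABLE_SIZE`, and the fallback strategy are assumptions made for this example only.

```python
# Illustrative sketch only; names and table size are assumptions,
# not the optimizations proposed in the dissertation.

SF_OFFSET = 100    # scalefactor offset used in AAC inverse quantization
TABLE_SIZE = 256   # hypothetical table length: larger is faster but uses more memory

# Precomputed |q|^(4/3) for small magnitudes, so most samples avoid a pow() call.
POW43_TABLE = [i ** (4.0 / 3.0) for i in range(TABLE_SIZE)]


def dequantize(q, scalefactor):
    """Return sign(q) * |q|^(4/3) * 2^(0.25 * (scalefactor - SF_OFFSET))."""
    mag = abs(q)
    if mag < TABLE_SIZE:
        base = POW43_TABLE[mag]        # common case: table lookup
    else:
        base = mag ** (4.0 / 3.0)      # rare large values: compute directly
    gain = 2.0 ** (0.25 * (scalefactor - SF_OFFSET))
    return base * gain if q >= 0 else -base * gain


def ms_encode(left, right):
    """Map left/right samples to mid/side: M = (L + R) / 2, S = (L - R) / 2."""
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Invert the mapping: L = M + S, R = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right
```

Shrinking `TABLE_SIZE` reduces memory at the cost of more direct power computations, which is the kind of complexity-versus-memory balance the abstract describes.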
Owing to recent advances in vehicle technology, the Intelligent Transportation System (ITS) has become one of the important topics in current research. Among ITS research areas, voice recording, real-time event data recording, and Blind Spot Detection (BSD) have become increasingly significant. For voice recording, audio compression makes recording much more efficient. Among the audio compression standards, Advanced Audio Coding (AAC) provides CD audio quality at a bit rate of 96 kbps and has become the prevailing audio compression standard in recent years. Given limited Internet bandwidth and portable device storage, AAC now leads audio compression technology. Image compression is likewise essential for event data recording. However, existing image and video coding standards do not meet the low-complexity and low-memory requirements of embedded applications. Event data recording therefore calls for a low-complexity, low-memory image coder. Another critical issue in ITS is BSD. BSD based on image processing and computer vision has become a popular topic in ITS studies. In this dissertation, we present several algorithmic, practical, and integrated methods and systems for the above-mentioned applications, all of which are implemented on embedded platforms. Chapter 2 presents several optimization approaches for the MPEG-2/4 AAC Low Complexity (LC) encoding and decoding processes. This study focuses on optimizing the Temporal Noise Shaping (TNS), Mid/Side (M/S) Stereo, Modified Discrete Cosine Transform (MDCT), and Inverse Quantization (IQ) modules in the encoder and decoder. Furthermore, we propose an efficient memory reduction approach that provides a satisfactory balance between the reduction of memory usage and the expansion of the encoded files. 
Experimental results demonstrate that the proposed AAC codec is computationally efficient, has low memory consumption, and is suitable for low-cost embedded and mobile applications. In Chapter 3, we propose a block-edge-based Single-Pass Perceptual Embedded Zero-tree Coding (SPPEZC) method. SPPEZC combines two novel compression concepts, Block-Edge Detection (BED) and the Low-Complexity and Low-Memory Entropy Coder (LLEC), to improve coding efficiency and quality. Based on the block-edge information, we propose an adaptive architecture that adjusts the quantization table and then codes the quantized coefficients with the LLEC. Experimental results and comparisons demonstrate that the proposed SPPEZC technique provides computational efficiency as well as satisfactory perceptual quality in compressed images. In Chapter 4, we present an effective Blind Spot Warning System (BSWS) for daytime and nighttime conditions. The proposed BSWS includes a dynamically calibrated camera model, Region of Interest (ROI) initialization and updating, and BSD algorithms for daytime and nighttime. Under daytime conditions, the proposed system uses the Horizontal Edge and Shadow Composite Region (HESCR) method to extract the search region and to locate the shadow of the target vehicle. To detect obstacles and vehicles in nighttime road scenes, the proposed system extracts bright objects and recognizes the paired headlights of the target vehicles. Experimental results show that the proposed BSD system is feasible for vehicle detection and collision warning in various daytime and nighttime road environments. Finally, Chapter 5 gives a brief conclusion and outlines future work.
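To make the block-edge idea from Chapter 3 concrete, the sketch below measures the edge strength of a pixel block and chooses a finer quantization table for blocks that contain edges. This is only a minimal Python illustration of adapting a quantization table to block-edge information; the edge measure, the `threshold` and `edge_scale` values, and all function names are hypothetical and do not reproduce the BED or SPPEZC algorithms themselves.

```python
# Illustrative sketch only; the edge measure and parameters are assumptions,
# not the BED method described in the dissertation.

def block_edge_strength(block):
    """Mean absolute difference between horizontally and vertically adjacent pixels."""
    n = len(block)
    total = 0
    count = 0
    for y in range(n):
        for x in range(n):
            if x + 1 < n:                      # right-hand neighbour
                total += abs(block[y][x + 1] - block[y][x])
                count += 1
            if y + 1 < n:                      # neighbour below
                total += abs(block[y + 1][x] - block[y][x])
                count += 1
    return total / count


def adapt_quant_table(base_table, strength, threshold=10.0, edge_scale=0.5):
    """Use smaller quantization steps for edge blocks, the base table otherwise.

    threshold and edge_scale are hypothetical tuning values.
    """
    scale = edge_scale if strength >= threshold else 1.0
    return [[max(1, round(q * scale)) for q in row] for row in base_table]


# Example: a flat block keeps the base table; a block with a vertical edge gets finer steps.
flat_block = [[128] * 8 for _ in range(8)]
edge_block = [[0] * 4 + [255] * 4 for _ in range(8)]
base_table = [[16] * 8 for _ in range(8)]

flat_table = adapt_quant_table(base_table, block_edge_strength(flat_block))
edge_table = adapt_quant_table(base_table, block_edge_strength(edge_block))
```

Blocks with strong edges are quantized more finely so that contours survive compression, while smooth blocks keep coarser steps and cost fewer bits.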
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079312517
http://hdl.handle.net/11536/40499
Appears in Collections: Thesis