Title: | A Study of Image Processing and Computer Vision Techniques for Complex Document Image Analysis, Nighttime Driver Assistance, and Video Surveillance Systems |
Authors: | 陳彥霖 (Yen-Lin Chen); 吳炳飛 (Bing-Fei Wu); Institute of Electrical and Control Engineering |
Keywords: | Computer Vision; Image Processing; Image Segmentation; Document Image Analysis; Driver Assistance; Video Surveillance |
Issue Date: | 2006 |
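Abstract: | Over the past decade, advances in image processing and computer vision have produced a large body of journal literature describing systems built for different purposes, including still-image segmentation, document analysis and recognition, intelligent transportation systems, and video surveillance. This dissertation develops a series of methods and application systems based on image processing and computer vision techniques to meet these application needs. It is organized into six chapters. Chapter 1 briefly introduces the applications of image processing and computer vision in various fields. Chapters 2 through 5 each present one of the proposed methods or application systems.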
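In Chapter 2, we propose a fast and highly reliable multilevel threshold selection mechanism that separates the objects contained in a grayscale image with multiple objects, to facilitate subsequent processing and analysis. The mechanism includes an optimized separability measure that ensures the segmented images have maximal statistical separability, and therefore yields the best segmentation result. Most existing threshold selection algorithms are designed for bi-level (binarization) segmentation. The few that support multilevel thresholding mostly require considerable computation and are reliable only on images with certain specific characteristics. This work proposes an optimized separability criterion. The automatic multilevel threshold selection technique built on it can, for images with many different characteristics and with very low computation time, determine the number of objects in the image and select the thresholds that separate them. A variety of experiments demonstrate its efficiency and reliability. The technique can be applied in computer and machine vision systems: once meaningful objects have been separated from the image, subsequent analysis and recognition become easier. For example, document image analysis and recognition, and the computer vision processing in intelligent transportation systems (ITS), both require extracting meaningful objects from images for further analysis.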
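In Chapter 3, we propose a multi-plane image segmentation algorithm that fully extracts text from complex color document images in which text and graphics overlap. With advances in printing and typesetting, multimedia compound documents have become common in daily life, and their text is mostly printed over complex backgrounds. Extracting text from document images is an important part of document analysis research, and many researchers have proposed text segmentation techniques. However, most earlier techniques cannot cope with the difficulty of extracting text from complex document images. The proposed technique addresses many of the problems that arise as document backgrounds become increasingly complex. In our experiments we processed book covers, print advertisements, and magazines. The results show that the proposed method successfully extracts the Chinese and English text strings in these document images. The main goal of this work is to develop a document image segmentation technique based on a local segmentation algorithm that separates the foreground and background objects in complex color document images with overlapping text and graphics. A text extraction algorithm is then applied so that the text is fully extracted, even when it is printed on highly complex backgrounds that vary slowly or rapidly.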
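In Chapter 4, we propose an intelligent high-speed nighttime vehicle detection and recognition system for nighttime driver assistance and autonomous driving. Using a CCD image acquisition device together with computer vision techniques, the system performs nighttime vehicle detection, relative position and distance estimation, and vehicle labeling and tracking, and thereby helps the driver obtain information about the traffic ahead. It also provides an effective mechanism for automatically controlling on-board equipment. Applied to high-beam and low-beam headlight control, for example, it can switch the headlights to the most suitable state once the traffic in the lane ahead has been detected. This prevents glare from impairing the vision of oncoming drivers and reduces the accident risk caused by dazzling from high beams at close range. The detected traffic conditions ahead, such as the relative motion between the host vehicle and the vehicles on the road ahead, can also feed a high-level control mechanism for autonomous driving and automatic cruise speed control.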
In Chapter 5, we propose an intelligent multi-channel video recording and surveillance system. Its compression speed is raised to meet the demanding requirement of real-time video compression and encoding for multi-channel recording, while maintaining very high compressed image quality and compression efficiency. In keeping with mainstream software engineering practice, the system is implemented as components conforming to Microsoft's ActiveX component model, which facilitates multimedia applications, Internet applications, and rapid application development. Combined with a network server program, image capture cards, and CCD cameras, it forms an efficient multi-channel intelligent surveillance system that offers high performance, low cost, and rich functionality. Finally, Chapter 6 summarizes the conclusions of this dissertation and outlines future research directions.
Image processing and computer vision study how computers can perceive and understand information of interest about the world surrounding human beings by automatically extracting and analyzing observed images, image sets, or video sequences using theoretical and algorithmic computations. Object extraction and analysis is one of the important applications of image processing and computer vision. Among its applications, document image analysis (DIA) supports many valuable tasks in document analysis and understanding, such as optical character recognition, document retrieval, and compression. Vision-based techniques for driver assistance and autonomous vehicle navigation systems are emerging practical applications as well. They aim at detecting and recognizing vehicles in the road environment for driver assistance and autonomous vehicle guidance. Given the security concerns of modern life, digital video monitoring is also a promising application. In this dissertation, we present several algorithmic, practical, and integrated methods and systems for the above-mentioned applications based on image processing and computer vision techniques. First, Chapter 2 presents an efficient automatic multilevel thresholding method for image segmentation. An effective criterion for measuring the separability of the homogeneous objects in the image, based on discriminant analysis, is introduced to automatically determine the number of thresholding levels to be performed.
Then, by applying this discriminant criterion, the object regions with homogeneous illumination in the image can be recursively and automatically thresholded into separate segmented images. The proposed method is fast and effective in analyzing and thresholding the histogram of the image. To allow an equitable comparative performance evaluation of the proposed method against other thresholding methods, a combinatorial scheme is also introduced to reduce the computational complexity of performing multilevel thresholding. In Chapter 3, we propose a new method, the multi-plane segmentation approach, for segmenting and extracting textual objects from various real-life complex document images. The approach first decomposes the document image into distinct object planes to extract and separate homogeneous objects, including textual regions of interest, non-text objects such as graphics and pictures, and background textures. It processes document images regionally and adaptively according to their respective local features. Hence detailed characteristics of the extracted textual objects, particularly small characters with thin strokes and characters with gradational illumination, are well preserved. This also allows background objects with uneven, gradational, and sharp variations in contrast, illumination, and texture to be handled well. Next, Chapter 4 presents an effective method for detecting vehicles in front of a camera-assisted car during nighttime driving. The method detects vehicles by detecting and locating their headlights and taillights using image segmentation and pattern analysis techniques. First, to effectively extract bright objects of interest, a fast bright-object segmentation process based on automatic multilevel histogram thresholding is applied to the captured nighttime road-scene images.
This automatic multilevel thresholding approach gives the detection system the robustness and adaptability to operate well under various nighttime illumination conditions. The extracted bright objects are then processed by a rule-based connected-component analysis procedure that identifies vehicles by locating and analyzing their light patterns, and estimates the distance between the detected vehicles and the camera-assisted car. In Chapter 5, we present a wavelet-based approach to video compression with high speed, high image quality, and high compression ratio. Exploiting the sequential characteristics of surveillance images, the method applies low-complexity zero-tree coding, which requires little memory, to develop a video encoding and decoding algorithm that significantly improves compression and decompression speed while maintaining high image quality. Based on this low-complexity, low-memory wavelet-based coding scheme and a motion compression strategy, the proposed video codec achieves high visual quality, high compression speed, and high compression ratio. The ActiveX COM component technique is also implemented and integrated with the proposed video codec to support multimedia applications, Internet applications, and many other video-intensive applications. Furthermore, an intelligent surveillance system that integrates the proposed wavelet-based video codec, computer peripherals, and mobile communication is also presented in this chapter. Finally, Chapter 6 gives brief conclusions and directions for future work. |
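The recursive, criterion-driven thresholding that the abstract attributes to Chapter 2 can be illustrated with a minimal sketch. The abstract does not give the thesis's actual separability measure or stopping value, so everything here is an assumption: the sketch uses the classical discriminant-analysis (Otsu-style) split, measures separability as the fraction of total gray-level variance explained by the current classes, and stops at a hypothetical `target_sf` of 0.95. The function names are illustrative only.

```python
def class_stats(hist, lo, hi):
    """Pixel count, mean, and sum of squared deviations for gray levels lo..hi."""
    n = sum(hist[lo:hi + 1])
    if n == 0:
        return 0, 0.0, 0.0
    mean = sum(g * hist[g] for g in range(lo, hi + 1)) / n
    ssd = sum(hist[g] * (g - mean) ** 2 for g in range(lo, hi + 1))
    return n, mean, ssd


def otsu_threshold(hist, lo, hi):
    """Threshold t in [lo, hi) that minimises the within-class scatter of
    lo..t and t+1..hi (equivalent to maximising between-class variance)."""
    best_t, best_ssd = None, float("inf")
    for t in range(lo, hi):
        n0, _, ssd0 = class_stats(hist, lo, t)
        n1, _, ssd1 = class_stats(hist, t + 1, hi)
        if n0 == 0 or n1 == 0:
            continue  # a split must leave pixels on both sides
        if ssd0 + ssd1 < best_ssd:
            best_t, best_ssd = t, ssd0 + ssd1
    return best_t


def multilevel_thresholds(hist, target_sf=0.95, max_classes=8):
    """Split the histogram recursively until the separability factor
    (assumed here: 1 - within-class scatter / total scatter) reaches target_sf.
    Returns (sorted thresholds, final separability factor)."""
    levels = len(hist)
    _, _, total_ssd = class_stats(hist, 0, levels - 1)
    if total_ssd == 0:
        return [], 1.0  # a single gray level: nothing to separate
    classes = [(0, levels - 1)]
    while True:
        within = [class_stats(hist, lo, hi)[2] for lo, hi in classes]
        sf = 1.0 - sum(within) / total_ssd
        if sf >= target_sf or len(classes) >= max_classes:
            break
        # Split the class that currently contributes the most scatter.
        k = max(range(len(classes)), key=within.__getitem__)
        lo, hi = classes[k]
        t = otsu_threshold(hist, lo, hi)
        if t is None:
            break
        classes[k:k + 1] = [(lo, t), (t + 1, hi)]
    return [hi for _, hi in classes[:-1]], sf
```

Given a 256-bin histogram `hist`, calling `multilevel_thresholds(hist)` returns the thresholds and the separability reached, so the number of classes is decided by the data rather than fixed in advance. This brute-force version recomputes class statistics for every candidate threshold and makes no attempt to reproduce the speed the thesis reports.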
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT008912530 http://hdl.handle.net/11536/77046 |
Appears in Collections: | Thesis |