Title: | A Study of Image Processing and Computer Vision Techniques for Complex Document Image Analysis, Nighttime Driver Assistance, and Video Surveillance Systems |
Authors: | 陳彥霖 Yen-Lin Chen; 吳炳飛 Bing-Fei Wu; Institute of Electrical and Control Engineering |
Keywords: | Computer Vision;Image Processing;Image Segmentation;Document Image Analysis;Driver Assistance;Video Surveillance |
Issue Date: | 2006 |
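Abstract: | Over the past decade, the literature on image processing and computer vision has reported many systems developed for specific purposes, including still-image segmentation, document analysis and recognition, intelligent transportation systems, and video surveillance. Addressing these application needs, this dissertation develops a series of methods and application systems based on image processing and computer vision techniques. The dissertation consists of six chapters. Chapter 1 briefly introduces the applications of image processing and computer vision in various fields. Chapters 2 through 5 each present one of the proposed methods or application systems. Chapter 2 proposes a fast and highly reliable multilevel threshold selection mechanism that separates the objects contained in a gray-level image with multiple objects, to facilitate subsequent processing and analysis. The mechanism includes an optimized separability measure that ensures the segmented images have maximal statistical separability, and thus yields the best segmentation. Most existing threshold selection algorithms are designed for bi-level (binary) segmentation; the few that support multilevel thresholding mostly require considerable computation and are reliable mainly on images with particular characteristics. Based on the proposed separability criterion, the automatic multilevel threshold selection technique determines the number of objects in an image and the thresholds that separate them, with very low computation time, across images with many different characteristics; a range of experiments confirms its efficiency and reliability. The technique can be applied in computer and machine vision systems, where meaningful objects must first be extracted from the image before analysis and recognition, for example in document image analysis and recognition and in the computer vision processing of intelligent transportation systems (ITS). Chapter 3 proposes a multi-plane image segmentation algorithm that fully extracts text from complex color document images in which text and graphics overlap. With advances in printing and layout technology, multimedia compound documents have become common in everyday life, and the text in such documents is mostly printed on complex backgrounds. Extracting text from document images is an important part of document analysis research, and many researchers have proposed text segmentation techniques. However, most earlier techniques cannot handle text extraction from complex document images. The proposed technique addresses many of the problems that arise as document image backgrounds become increasingly complex. In the experiments, we processed book covers, print advertisements, and magazines; the results show that the proposed method successfully extracts the Chinese and English text strings in these document images. The main goal of this research is to develop a document image segmentation technique based on a region-based segmentation algorithm that separates foreground objects from background objects in complex color document images with overlapping text and graphics, and then applies a text extraction algorithm so that the text can be extracted completely, even when it is printed on highly complex background patterns that vary slowly or rapidly. Chapter 4 proposes an intelligent, high-speed nighttime vehicle detection and recognition system for nighttime driver assistance and automated driving. Using a CCD image acquisition device combined with computer vision techniques, the system performs nighttime vehicle detection, relative position and distance estimation, and vehicle labeling and tracking, and thereby helps the driver obtain information about the traffic ahead. This provides an effective mechanism for automatically controlling on-board devices. For example, in high-beam and low-beam headlight control, the system can switch the headlights to the most suitable state according to the detected traffic in the lanes ahead, preventing glare that would impair the vision of oncoming drivers and reducing the risk of accidents caused by high beams dazzling drivers at close range. The detected traffic conditions ahead, such as the relative motion between the host vehicle and the vehicles on the road in front of it, can also serve as a high-level control mechanism for automated driving and automatic cruise speed control. Chapter 5 proposes an intelligent multi-channel video recording and surveillance system. Its improved compression speed meets the demanding requirement of real-time video compression and encoding for multi-channel recording while maintaining very high compressed image quality and compression efficiency. In keeping with mainstream software engineering practice, the system is implemented using the ActiveX component model proposed by Microsoft, which facilitates multimedia applications, Internet applications, and rapid application development. Combined with a network server program, image capture cards, and CCD cameras, it forms an efficient multi-channel intelligent surveillance system with high performance, low cost, and rich functionality. Finally, Chapter 6 summarizes the conclusions of this dissertation and outlines future research directions. Image processing and computer vision are the study of how computers can perceive and understand information of interest about the world around us by automatically extracting and analyzing observed images, image sets, or video sequences through theoretical and algorithmic computation. 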
Object extraction and analysis is one of the important applications of image processing and computer vision. Among its applications, document image analysis (DIA) supports many valuable tasks in document analysis and understanding, such as optical character recognition, document retrieval, and compression. Vision-based techniques for driver assistance and autonomous vehicle navigation are emerging practical applications as well. They aim at detecting and recognizing vehicles in the road environment for driver assistance and autonomous vehicle guidance. Given the security concerns of modern life, digital video monitoring is also a promising application. In this dissertation, we present several algorithmic, practical, and integrated methods and systems for the above-mentioned applications based on image processing and computer vision techniques. First, Chapter 2 presents an efficient automatic multilevel thresholding method for image segmentation. An effective criterion for measuring the separability of the homogeneous objects in the image, based on discriminant analysis, is introduced to automatically determine the number of thresholding levels to be performed. Then, by applying this discriminant criterion, the object regions with homogeneous illumination in the image can be recursively and automatically thresholded into separate segmented images. The proposed method is fast and effective in analyzing and thresholding the histogram of the image. To conduct an equitable comparative performance evaluation of the proposed method against other thresholding methods, a combinatorial scheme is also introduced to reduce the computational complexity of performing multilevel thresholding. In Chapter 3, we propose a new method, namely the multi-plane segmentation approach, for segmenting and extracting textual objects from various real-life complex document images. 
The proposed multi-plane segmentation approach first decomposes the document image into distinct object planes to extract and separate homogeneous objects, including textual regions of interest, non-text objects such as graphics and pictures, and background textures. The approach processes document images regionally and adaptively according to their respective local features. Hence detailed characteristics of the extracted textual objects, particularly small characters with thin strokes, as well as gradational illumination of characters, can be well preserved. Moreover, this approach also allows background objects with uneven, gradational, and sharp variations in contrast, illumination, and texture to be handled easily and well. Next, an effective method for detecting vehicles in front of the camera-assisted car during nighttime driving is presented in Chapter 4. The proposed method detects vehicles by detecting and locating vehicle headlights and taillights using image segmentation and pattern analysis techniques. First, to effectively extract bright objects of interest, a fast bright-object segmentation process based on automatic multilevel histogram thresholding is applied to the captured nighttime road-scene images. This automatic multilevel thresholding approach provides the robustness and adaptability the detection system needs to operate well under various illumination conditions at night. Then the extracted bright objects are processed by a rule-based connected-component analysis procedure to identify the vehicles by locating and analyzing their light patterns, and to estimate the distance between the detected vehicles and the camera-assisted car. In Chapter 5, we present a wavelet-based approach to compressing video with high speed, high image quality, and a high compression ratio. 
Using the sequential characteristics of surveillance images, this method applies low-complexity zero-tree coding, which requires little memory, to develop an algorithm for encoding and decoding video that significantly improves the speed of compression and decompression while maintaining high image quality. Based on this low-complexity, low-memory wavelet-based coding scheme and motion compression strategy, the proposed video codec achieves high visual quality, high compression speed, and a high compression ratio. The ActiveX COM component technique is also implemented and integrated with the proposed video codec to support multimedia applications, Internet applications, and many other video-intensive applications. Furthermore, an intelligent surveillance system that integrates the proposed wavelet-based video codec, computer peripherals, and mobile communication is also presented in this chapter. Finally, we give a brief conclusion and discuss future work in Chapter 6. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT008912530 http://hdl.handle.net/11536/77046 |
Appears in Collections: | Thesis |
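
Illustrative note on the Chapter 2 summary: the abstract describes a discriminant-analysis separability criterion that decides how many thresholds to use and a recursive thresholding of the histogram. The following is a minimal sketch of that general idea only, not the dissertation's actual algorithm. It assumes Otsu-style between-class variance for each split, assumes separability is measured as between-class variance divided by total variance, and assumes the class contributing most within-class variance is split next. All function names and the `min_separability` default of 0.9 are hypothetical.

```python
def class_stats(hist, lo, hi):
    """Pixel count, mean and variance of gray levels lo..hi (inclusive)."""
    n = sum(hist[lo:hi + 1])
    if n == 0:
        return 0, 0.0, 0.0
    mean = sum(i * hist[i] for i in range(lo, hi + 1)) / n
    var = sum(hist[i] * (i - mean) ** 2 for i in range(lo, hi + 1)) / n
    return n, mean, var


def otsu_split(hist, lo, hi):
    """Threshold t (lo <= t < hi) maximising between-class variance, or None."""
    total = sum(hist[lo:hi + 1])
    total_sum = sum(i * hist[i] for i in range(lo, hi + 1))
    best_t, best_score = None, 0.0
    w0 = s0 = 0
    for t in range(lo, hi):
        w0 += hist[t]
        s0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = s0 / w0, (total_sum - s0) / w1
        score = w0 * w1 * (m0 - m1) ** 2  # proportional to between-class variance
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def multilevel_thresholds(hist, min_separability=0.9, max_classes=8):
    """Split the histogram until the assumed separability measure is reached.

    Returns thresholds t_1 < t_2 < ...; a gray level v belongs to the first
    class whose threshold satisfies v <= t_k (or to the last class otherwise).
    """
    n_total, _, total_var = class_stats(hist, 0, len(hist) - 1)
    if n_total == 0 or total_var == 0:
        return []
    classes = [(0, len(hist) - 1)]
    while len(classes) < max_classes:
        stats = [class_stats(hist, lo, hi) for lo, hi in classes]
        within = sum(n * var for n, _, var in stats) / n_total
        separability = 1.0 - within / total_var  # between-class / total variance
        if separability >= min_separability:
            break
        # Assumption: split the class that contributes most within-class variance.
        k = max(range(len(classes)), key=lambda j: stats[j][0] * stats[j][2])
        lo, hi = classes[k]
        t = otsu_split(hist, lo, hi)
        if t is None:
            break
        classes[k:k + 1] = [(lo, t), (t + 1, hi)]
    return [hi for _, hi in classes[:-1]]


# Usage: a synthetic 256-bin histogram with three well-separated gray-level clusters.
demo_hist = [0] * 256
for centre in (30, 128, 220):
    for offset in range(-2, 3):
        demo_hist[centre + offset] += 100
demo_thresholds = multilevel_thresholds(demo_hist)
```

In this sketch the number of classes is not fixed in advance: splitting stops as soon as the classes explain the chosen fraction of the total gray-level variance, which mirrors the abstract's description of automatically determining the number of thresholding levels.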