Title: 贝氏阶层式结构于视讯监控之研究与应用
A Study of Bayesian Hierarchical Framework and Its Applications to Video Surveillance
Authors: 黄敬群
Huang, Ching-Chun
王圣智
Wang, Sheng-Jyh
Institute of Electronics
Keywords: Bayesian Inference; Video Surveillance; Object Detection
Issue Date: 2010
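Abstract: In this dissertation, we propose an analysis method based on a Bayesian hierarchical framework that lets a video surveillance system analyze image content and infer 3-D scene information within a single, consistent architecture. In real scenes, building a robust video surveillance system involves many challenges, such as mutual occlusion among objects, confusion caused by the similar appearance of foreground and background objects, object deformation caused by perspective projection, shadow changes, and image variations caused by changes in ambient lighting. We find that, by parameterizing the 3-D scene appropriately and analyzing the scene model together with the captured image data, the system can handle these sources of variation more easily. In the Bayesian hierarchical framework, a hierarchical representation integrates information based on pixel features, information based on regional image content, and information based on object characteristics in a systematic, probabilistic way to support both image content analysis and scene inference. With the proposed framework, many of the variation factors listed above can be handled effectively; moreover, some of them can even become useful cues for inferring 3-D scene information.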
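In this dissertation, we apply the Bayesian hierarchical framework to a vacant parking space detection system and to a multi-camera video surveillance system. For automatic vacant parking space detection, real outdoor parking lot scenes are affected by many factors that reduce system accuracy, including: (a) dramatically changing outdoor ambient lighting; (b) shadows; (c) distortion caused by perspective projection; and (d) mutual occlusion among parked vehicles. With the proposed Bayesian hierarchical framework, we can systematically incorporate these factors into the inference of vacant parking spaces and thereby reduce their impact on system performance. The framework builds a parametric scene model to describe the effects of occlusion, geometric projection distortion, and shadows. It also treats the color changes caused by ambient lighting variations as a color classification problem and describes the lighting changes through a classification procedure. Experimental results show that our system can stably detect the locations of vacant spaces, effectively label and distinguish ground regions from vehicle regions in the image, accurately label shadowed regions, and overcome the problems caused by lighting variations.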
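On the other hand, in the multi-camera video surveillance system, we automatically locate, label, and match multiple objects across the fields of view of different cameras, while effectively suppressing the ghost objects produced by uncertainty in geometric depth. In real application scenes, a multi-camera video surveillance system faces several challenging issues: (a) the unknown number of objects in the scene; (b) mutual occlusion among objects; and (c) the appearance of ghost objects. Unlike previous methods, we propose a two-step strategy consisting of information fusion and scene inference. In the information fusion step, we integrate information from multiple cameras to build a probability distribution that describes how likely an object is to appear at a given ground location. In the scene inference step, we apply the Bayesian hierarchical framework to take the scene model into account; through this framework, the labeling of objects within images, the correspondence of objects across cameras, and the removal of ghost objects are combined into a single optimization problem. In addition, we adopt an expectation-maximization framework to refine the 3-D object model; combining the Bayesian hierarchical framework with the expectation-maximization framework yields better system performance. Experimental results show that our system can automatically determine the number of moving objects in the scene, effectively label and match multiple objects across the images of different cameras, accurately locate objects in the 3-D scene, and effectively remove ghost objects.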
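In this dissertation, we verify that an image analysis architecture based on the Bayesian hierarchical framework can be applied effectively to the analysis and applications of video surveillance. Through this architecture, we systematically integrate pixel-level color information, region-level information among pixels, and object-level information that takes the object as the basic unit. This integration gives the system more information to work with and allows accurate inference on more complex image content.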
In this dissertation, we present a Bayesian hierarchical framework (BHF) that handles 3-D scene modeling and image analysis simultaneously and in a unified manner. In practice, developing a robust video surveillance system requires addressing many challenging issues, such as occlusion, appearance ambiguity between foreground and background, perspective effects, shadows, and lighting variations. We handle these issues by modeling the 3-D scene in a parametric form and by integrating the scene model with the image observations in the inference process. The proposed hierarchical framework systematically integrates pixel-level, region-level, and object-level information in a probabilistic way for the semantic inference of image content and 3-D scene status. Under the BHF, occlusion, appearance ambiguity, perspective effects, shadows, and lighting variations can all be handled well. In fact, occlusion, perspective effects, and shadows may even provide useful cues to support 3-D scene inference.
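As a rough illustration of what integrating the three levels "in a probabilistic way" can look like, one generic layered factorization is written below. It is an assumption made here for illustration, not a formula taken from the dissertation: the symbols $S$ (object-level scene state), $L$ (region-level labels), and $I_i$ (the feature observed at pixel $i$) are hypothetical names, and the conditional independence of pixels given their labels is a simplifying assumption.

```latex
% Hypothetical three-layer model: scene state S, region labels L, pixel features I_i.
p(S, L \mid I) \;\propto\; p(S)\, p(L \mid S) \prod_{i} p(I_i \mid L_i),
\qquad
(\hat{S}, \hat{L}) = \arg\max_{S,\,L}\; p(S, L \mid I)
```

Read this way, scene effects such as occlusion and shadows would enter through $p(L \mid S)$, while lighting-dependent color behavior would enter through $p(I_i \mid L_i)$.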
In this dissertation, the BHF is applied to two video surveillance systems: a vacant parking space detection system and a multi-camera surveillance system. In the vacant parking space detection system, the challenges come from dramatic luminance variations, shadows, perspective distortion, and inter-occlusion among vehicles. With the proposed BHF, these issues can be modeled systematically and handled effectively. Specifically, the BHF describes the occlusion pattern, perspective distortion, and shadow effect by building a parametric scene model, while the color fluctuation caused by luminance variation is treated as a color classification problem. Under the BHF, the detection of vacant parking spaces and the labeling of scene status are formulated as a unified Bayesian optimization problem subject to a shadow generation model, an occlusion generation model, and an object classification model. The system accuracy was evaluated on a few outdoor parking lot videos captured from morning to evening. Experimental results showed that the proposed framework can systematically detect vacant parking spaces, efficiently label ground and car regions, precisely locate shadowed regions, and effectively handle luminance variations.
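The idea of treating slot status and pixel labels as one Bayesian optimization problem can be sketched with a toy example. This is a minimal sketch under stated assumptions, not the dissertation's model: the function name `map_slot_status`, the per-pixel "car" scores, and the representation of occlusion as overlapping pixel sets are all hypothetical, and shadows and lighting are left out entirely.

```python
import itertools
import math


def map_slot_status(pixel_car_prob, slot_pixels, prior_occupied=0.5):
    """Toy MAP inference of parking-slot status (illustrative assumption only).

    pixel_car_prob: per-pixel score in (0, 1) that the pixel looks like a car
                    rather than ground (e.g. the output of a color classifier).
    slot_pixels:    one set of pixel indices per slot, listing the pixels a car
                    parked there would cover; sets may overlap, which is how
                    this toy represents a car occluding a neighboring slot.
    Returns a tuple of 0/1 flags (1 = occupied), one per slot.
    """
    best_status, best_score = None, -math.inf
    # Enumerate every occupied/vacant hypothesis (feasible only for few slots).
    for status in itertools.product((0, 1), repeat=len(slot_pixels)):
        # Pixels expected to show a car under this hypothesis.
        car_pixels = set()
        for occupied, pixels in zip(status, slot_pixels):
            if occupied:
                car_pixels.update(pixels)
        # Log prior over slot states.
        score = sum(
            math.log(prior_occupied if occupied else 1.0 - prior_occupied)
            for occupied in status
        )
        # Log likelihood of the pixel scores given the expected labels.
        for i, q in enumerate(pixel_car_prob):
            score += math.log(q if i in car_pixels else 1.0 - q)
        if score > best_score:
            best_status, best_score = status, score
    return best_status


# Pixels 0-2 look like a car; pixel 2 is shared by both slots (occlusion).
print(map_slot_status([0.9, 0.8, 0.7, 0.2, 0.1, 0.1], [{0, 1, 2}, {2, 3, 4}]))
```

Because the shared pixel is explained by the car in the first slot, the second slot is not forced to be occupied, which is the kind of joint reasoning that per-slot classification would miss. Exhaustive enumeration is used only to keep the sketch short.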
On the other hand, in the application of multi-target detection and tracking over a multi-camera surveillance system, the main goal is to locate, label, and establish correspondences among multiple targets while suppressing ghosts. In practice, the challenges of this kind of system come from the unknown number of targets, the inter-occlusion among targets, and the ghost effect caused by geometric ambiguity. Instead of directly matching objects across different camera views, the proposed framework adopts a fusion-inference strategy. In the fusion stage, we formulate a posterior distribution that indicates the likelihood of having moving targets at given ground locations. In the inference stage, the scene model is fed into the proposed BHF, where target labeling, target correspondence, and ghost removal are treated as a unified optimization problem subject to 3-D scene priors, target priors, and image observations. Moreover, the target priors are iteratively refined by an expectation-maximization (EM) process to further improve system performance. The system accuracy was evaluated on both synthesized and real videos. Experimental results showed that the proposed system can systematically determine the number of targets, efficiently label and match moving targets, precisely locate them in the 3-D scene, and effectively tackle the ghost problem.
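The fusion stage, which builds a posterior over ground locations from several cameras, can be illustrated with a small naive-Bayes sketch. The abstract does not specify the actual fusion rule, so everything here is an assumption: the function name `fuse_ground_posterior`, the discretization of the ground plane into cells, the per-camera scores, and the conditional independence of cameras are hypothetical choices made for illustration.

```python
def fuse_ground_posterior(camera_scores, prior=0.1):
    """Toy fusion of per-camera evidence into a ground-plane posterior.

    camera_scores: one list per camera; entry x is a score in [0, 1] for how
                   strongly that camera's foreground mask supports a target
                   standing on ground cell x (assumed to be precomputed).
    prior:         assumed prior probability that a cell holds a target.
    Cameras are treated as conditionally independent given the cell state,
    which is a simplifying assumption of this sketch.
    """
    n_cells = len(camera_scores[0])
    posterior = []
    for x in range(n_cells):
        p_target, p_empty = prior, 1.0 - prior
        for scores in camera_scores:
            p_target *= scores[x]        # evidence for "target at x"
            p_empty *= 1.0 - scores[x]   # evidence for "no target at x"
        total = p_target + p_empty
        # Fall back to the prior if the cameras contradict each other outright.
        posterior.append(p_target / total if total > 0 else prior)
    return posterior


# Two cameras, three ground cells: cell 0 is supported by both cameras,
# cell 1 by only one camera, cell 2 by neither.
print(fuse_ground_posterior([[0.9, 0.9, 0.1], [0.9, 0.2, 0.1]]))
```

In this toy, a cell supported by only one camera receives a much lower posterior than a cell supported by both. Deciding which of the remaining peaks are real targets and which are ghosts is the job of the separate inference stage described above, which this sketch does not attempt.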
Through experiments on these two applications, we verified that the proposed BHF can be applied well to various kinds of video surveillance applications. The BHF provides the flexibility to integrate pixel-level, region-level, and object-level information into a unified inference process. With information integrated from these multiple levels, more complicated tasks can be handled with improved accuracy and robustness.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079111829
http://hdl.handle.net/11536/40279
Appears in Collections: Thesis


Files in This Item:

  1. 182901.pdf
