Title: Implementation of Intra/Inter Frame Predictive Global Motion Estimation with Folded Architecture
Authors: 洪堂轩
Hong, Tang-Suan
Dung, Lan-Rong
Keywords: global motion estimation; predictive motion vector; folded architecture
Issue date: 2011
Abstract: Global motion vectors are widely used as auxiliary information in image and video processing. This thesis proposes a global motion estimation algorithm for high-definition (HD) images that can be applied to stitching panoramas from burst shots taken with a digital camera, together with a hardware architecture that uses folding to reduce area cost. Earlier work computes motion vectors for every block and then estimates the global motion parameters by numerical analysis of those vectors. This thesis instead samples only 25 macroblocks (MBs) at equal intervals, which lowers the cost of motion vector computation and avoids interference from flat regions and local motion, and then applies a hardware-friendly clustering algorithm to obtain the global motion vector (GMV). During motion estimation, predictive motion vectors (PMVs) are used to choose an appropriate search starting point and to select the search range adaptively; both techniques reduce computation while preserving motion vector accuracy. The algorithm is verified by using the GMV to compute the PSNR of the overlapping region of two images; compared with the benchmark, the PSNR error of the proposed algorithm is below 0.5 dB. On the hardware side, to reduce area cost while still meeting the real-time requirement, a folded processing element array is proposed in which the array is reused to carry out block matching on HD images. The resulting design is a 2560-p, 30-fps solution operating at 100 MHz, with a gate count of 192K and 23K gates of on-chip memory.
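The pipeline summarized in the abstract (sample 25 equally spaced blocks, estimate one motion vector per block, cluster the vectors into a GMV, then score the overlap by PSNR) can be illustrated with a minimal software sketch. Everything named here is an assumption made for illustration: the function names, the 16x16 block size, the +/-4 search range, the SAD cost, and the use of a simple majority vote in place of the thesis's clustering algorithm, which the abstract does not specify. The `start` argument only hints at how a predictive motion vector could recentre the search window; the adaptive search range and the folded hardware array are not modelled.

```python
from collections import Counter
import math


def sample_block_origins(height, width, block=16, grid=5):
    """Top-left corners of grid x grid equally spaced blocks (25 when grid=5)."""
    ys = [(i + 1) * height // (grid + 1) - block // 2 for i in range(grid)]
    xs = [(j + 1) * width // (grid + 1) - block // 2 for j in range(grid)]
    return [(y, x) for y in ys for x in xs]


def sad(ref, cur, y, x, dy, dx, block):
    """Sum of absolute differences between a block of cur and the block of ref shifted by (dy, dx)."""
    return sum(
        abs(cur[y + r][x + c] - ref[y + dy + r][x + dx + c])
        for r in range(block)
        for c in range(block)
    )


def block_motion(ref, cur, y, x, block=16, search=4, start=(0, 0)):
    """Full search over a +/-search window centred on the start vector (a stand-in for a PMV)."""
    h, w = len(ref), len(ref[0])
    best_cost, best_mv = None, start
    for dy in range(start[0] - search, start[0] + search + 1):
        for dx in range(start[1] - search, start[1] + search + 1):
            inside = (0 <= y + dy and y + dy + block <= h
                      and 0 <= x + dx and x + dx + block <= w)
            if not inside:
                continue  # candidate block would fall outside the reference image
            cost = sad(ref, cur, y, x, dy, dx, block)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    return best_mv


def global_motion(ref, cur, block=16, search=4, grid=5):
    """Majority vote over the sampled block vectors (assumed stand-in for the clustering step)."""
    h, w = len(cur), len(cur[0])
    vectors = [
        block_motion(ref, cur, y, x, block, search)
        for y, x in sample_block_origins(h, w, block, grid)
    ]
    return Counter(vectors).most_common(1)[0][0]


def overlap_psnr(ref, cur, gmv, peak=255):
    """PSNR over the region where cur[y][x] and ref[y + dy][x + dx] both exist."""
    dy, dx = gmv
    h, w = len(cur), len(cur[0])
    squared_error, count = 0, 0
    for y in range(max(0, -dy), min(h, h - dy)):
        for x in range(max(0, -dx), min(w, w - dx)):
            squared_error += (cur[y][x] - ref[y + dy][x + dx]) ** 2
            count += 1
    if count == 0:
        raise ValueError("images do not overlap for this motion vector")
    if squared_error == 0:
        return math.inf  # identical overlap region
    return 10 * math.log10(peak * peak * count / squared_error)
```

Images are plain lists of rows of grayscale values, and a vector `(dy, dx)` means that `cur[y][x]` corresponds to `ref[y + dy][x + dx]`. Voting over a handful of sampled blocks is what lets a few bad vectors, for example from flat regions or locally moving objects, be outvoted without computing vectors for every block.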