Title: | Research on Intelligent Communication Systems for Stereoscopic Video, Subproject 1: Visual Perception Oriented High Definition 3D Video Design (I) |
Author: | Chang Tian-Sheuan (張添烜), Department of Electronics Engineering and Institute of Electronics, National Chiao Tung University |
Keywords: | stereoscopic vision; disparity estimation; virtual viewpoint synthesis |
Issue Date: | 2011 |
Abstract: | With high-definition TV now mature and widespread, and with consumer stereoscopic displays developing rapidly, next-generation home entertainment is shifting toward 3D video, which offers a more immersive viewing experience. However, 3D video faces tougher challenges than traditional 2D video: viewpoints are fixed because of limited cost, the demands on computational resources and data-transfer bandwidth are high, and the video must be comfortable to watch without eye strain. To address these issues, this project aims to develop a visual perception oriented high definition 3D video design that supports a processing speed of 60 frames/s at HD1080p. In the first year of the project, we completed the analysis and design of a dense disparity estimation algorithm, the analysis and design of a view synthesis algorithm for a single viewpoint, and a literature survey of human stereoscopic perception and stereoscopic displays. For disparity estimation, we propose a downsampled matching cost to reduce the computation, and a hardware-efficient cost diffusion method to reduce the memory cost of disparity optimization. The disparity estimation engine achieves a throughput of 95 frames/s for three-view HD1080p disparity maps (i.e., 75.64G disparity-pixels/s) with a gate count of 1,645K and 59.4 Kbytes of internal SRAM. Objective evaluation shows that the disparity quality is comparable to that of the state-of-the-art algorithm.
For view synthesis, we propose a row-based pipelined architecture to reduce the memory cost arising from the camera rotation issue. The view synthesis engine achieves a throughput of 94.5 frames/s for HD1080p input with a gate count of 142.9K and a low memory cost of 54.72 Kbytes of SRAM, with quality comparable to the best existing results. |
Official Document #: | NSC100-2220-E009-061 |
URI: | http://hdl.handle.net/11536/99538 https://www.grb.gov.tw/search/planDetail?id=2310578&docId=361090 |
Appears in Collections: | Research Plans |
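
The throughput figures quoted in the abstract can be cross-checked with simple arithmetic. The sketch below is only an illustration: it takes HD1080p to mean 1920 x 1080 pixels, and it assumes a 128-level disparity search range. That range is not stated in the abstract; it is inferred here because it makes the quoted 75.64G disparity-pixels/s consistent with 95 frames/s over three views.

```python
# Figures quoted in the abstract.
DISPARITY_FPS = 95            # disparity estimation throughput, frames/s
VIEWS = 3                     # "three-view" disparity maps
SYNTHESIS_FPS = 94.5          # view synthesis throughput, frames/s
QUOTED_RATE = 75.64e9         # quoted disparity-pixels per second

# Assumption: HD1080p means a 1920 x 1080 frame.
WIDTH, HEIGHT = 1920, 1080

# Assumption (NOT stated in the abstract): 128 disparity levels per pixel.
ASSUMED_DISPARITY_LEVELS = 128

# Pixels of disparity map produced per second across all three views.
disparity_pixels_per_s = DISPARITY_FPS * VIEWS * WIDTH * HEIGHT

# Disparity range implied by the quoted disparity-pixel rate.
implied_levels = QUOTED_RATE / disparity_pixels_per_s

# Rate reconstructed under the assumed 128-level search range.
reconstructed_rate = disparity_pixels_per_s * ASSUMED_DISPARITY_LEVELS

# Pixels synthesized per second by the view synthesis engine.
synthesis_pixels_per_s = SYNTHESIS_FPS * WIDTH * HEIGHT

print(f"disparity map pixels/s : {disparity_pixels_per_s:,}")
print(f"implied disparity range: {implied_levels:.2f} levels")
print(f"reconstructed rate     : {reconstructed_rate / 1e9:.2f} G disparity-pixels/s")
print(f"synthesized pixels/s   : {synthesis_pixels_per_s:,.0f}")
```

Under these assumptions the implied disparity range comes out very close to 128 levels, which suggests, but does not confirm, that the engine evaluates 128 candidate disparities per pixel.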