Title: Hands-Free Viewing Control in VR Environments by Tracking Facial Visual Information
Authors: Chen-Yu Chang (張振宇); Zen Chen (陳稔); Institute of Computer Science and Engineering
Keywords: Virtual Reality; Viewing Control; Facial Tracking
Issue Date: 1998
Abstract: In a conventional virtual reality navigation system, the viewing of the scene is controlled by peripherals such as a mouse or a trackball. Users must operate these devices by hand, which is inconvenient and unnatural. In this thesis we develop a facial tracking system that uses only one camera: a calibrated TV camera is mounted in front of the user's face to capture images of the facial motion. We assume that the two eyes and the mouth on the human face are coplanar in 3-D space, so a mathematical transformation can be derived that relates two different images of the same face; the facial motion parameters are then recovered by decomposing the resulting 3x3 transformation matrix. Once the motion parameters are obtained, they are transmitted to the VR navigation system to change the viewing specification. Hence, without using their hands to operate any input device, users can browse the virtual world naturally through head movement.

In our method, a single camera is mounted in front of the face and its intrinsic parameters are calibrated in advance. The camera then captures the head movement: rotation (mainly in the left-right or up-down direction) and translation (mainly in the front-back direction). We first select a reference face image from the captured images; this reference image is generally taken with the face in a neutral pose, looking toward the camera. Using the derived formulation, the rotation R (a 3x3 matrix) and the translation T (a 3x1 vector) are estimated from the reference face image and the currently captured face image. Finally, R and T are transmitted to the VR platform to perform the viewing control.
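The intrinsic calibration mentioned above could be carried out as in the following minimal sketch, which uses OpenCV's standard chessboard routine rather than whatever procedure the thesis itself employed; the pattern size, square size, and file names are assumptions for illustration only.

import cv2
import numpy as np

# Assumed 9x6 inner-corner chessboard with 25 mm squares;
# the image file names are hypothetical.
PATTERN = (9, 6)
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * 25.0

obj_pts, img_pts = [], []
for name in ["calib_00.png", "calib_01.png", "calib_02.png"]:
    gray = cv2.imread(name, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    if found:
        obj_pts.append(objp)
        img_pts.append(corners)

# K is the 3x3 intrinsic matrix needed by the later motion estimation;
# dist holds the lens distortion coefficients.
rms, K, dist, _, _ = cv2.calibrateCamera(
    obj_pts, img_pts, gray.shape[::-1], None, None)
print("RMS reprojection error:", rms)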
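The core step, estimating R and T from the reference image and the current image under the coplanarity assumption, can be sketched as below. It assumes landmark correspondences (e.g. eye corners and mouth corners) are already extracted for both frames, and it uses the plane-induced homography plus OpenCV's decomposition; cv2.decomposeHomographyMat returns up to four (R, t, n) candidates, and the simple normal test used here to pick one is an assumption, since the abstract does not state the thesis's disambiguation rule. The recovered translation is only defined up to the unknown distance to the face plane.

import cv2
import numpy as np

def estimate_head_motion(ref_pts, cur_pts, K):
    """Estimate head rotation R (3x3) and translation t (3x1, up to
    scale) between two views of the face, assuming the tracked
    landmarks lie on a single 3-D plane (the eyes-and-mouth plane)."""
    # Plane-induced homography between the reference and current images.
    H, _ = cv2.findHomography(np.float32(ref_pts),
                              np.float32(cur_pts), cv2.RANSAC)
    # Decompose H into candidate (R, t, plane-normal) solutions.
    _, Rs, ts, normals = cv2.decomposeHomographyMat(H, K)
    # Assumed disambiguation: keep the solution whose plane normal
    # points toward the camera (positive z component).
    for R, t, n in zip(Rs, ts, normals):
        if float(n[2]) > 0:
            return R, t
    return Rs[0], ts[0]

With six landmarks (four eye corners and two mouth corners) the homography is over-determined, which is why a robust estimator such as RANSAC is applied above.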
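How the motion parameters drive the viewing is specific to the VR platform, which the abstract does not name; the following is therefore only a hypothetical sketch that packs R and T into a 4x4 view transform, with scale as an assumed tuning factor that maps the up-to-scale translation into scene units.

import numpy as np

def head_motion_to_view(R, t, scale=1.0):
    """Build a 4x4 homogeneous transform from the estimated head
    motion so the virtual camera mirrors the head: rotation steers
    the view direction, translation dollies the viewpoint."""
    M = np.eye(4)
    M[:3, :3] = R
    M[:3, 3] = scale * np.asarray(t).ravel()
    return M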
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT870392072 ; http://hdl.handle.net/11536/64096
Appears in Collections: Thesis