Title: | 連結語音辨識系統及應用軟體系統之介面語音之設計及製作 The Design and Implementation of an Interfacing Framework for Bridging Speech Recognizers to Application Systems |
Authors: | 蔣加洛 Jan Karel Ruzicka; 陳登吉 Deng-Jyi Chen; Institute of Computer Science and Engineering |
Keywords: | generic interfacing; speech recognizer; framework; human-machine interfacing; visual interfacing environment |
Issue Date: | 2004 |
Abstract: | Current solutions for bridging speech recognizers with applications are ad hoc and lack a generic, systematic approach. Such interfacing approaches usually lead to tightly coupled systems in which one application is wrapped by a specific recognizer through low-level programming, which makes future modifications very difficult. Also, without mechanisms for abstracting groups of actions into single reusable macro-level commands that simplify user interaction tasks, end users face intense and time-consuming overhead. Applications, especially multimedia-oriented ones, deal with highly dynamic content; interfacing with and keeping track of this kind of content has not yet been addressed.
This thesis proposes an interface framework for bridging speech recognizers to applications through a generic and systematic approach, in order to overcome the above challenges and limitations. Specifically, a script language is designed and implemented that allows users to define the interfacing commands between a speech recognizer and application software. These commands are executed on a user-composed visual interfacing environment that sits on top of applications and acts as a reference layer for interaction. With this approach, interaction commands can be dynamically scripted to simplify user interaction and allow more natural speech commanding. Moreover, an application's interfacing environment can be modified immediately by simply drawing and registering application zones, without relying on low-level programming for changes to take effect. The approach also allows multiple application environments to coexist, so speech recognition can be integrated with more than one application at once. A prototype of the interface framework has been constructed and used to demonstrate the feasibility and applicability of the proposed approach. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT009217641 http://hdl.handle.net/11536/74434 |
Appears in Collections: | Thesis |