Title: A Feature-Based Automatic Speech Digest Generator
Original title: 一個基於特徵值的演講語音自動摘要產生器
Authors: Wu, Yu-Rou (吳御柔)
Lo, Chi-Chun (羅濟群)
Institute of Information Management (資訊管理研究所)
Keywords: Automatic Summarization; Feature (自動摘要;特徵值)
Issue Date: 2011
Abstract: With the spread of the Internet and portable devices, the volume of audio and video content has grown rapidly in recent years, making automatic speech summarization increasingly important. Previous research has focused mainly on broadcast and news speech. However, summarization methods and features that work for those domains do not necessarily suit other kinds of speech, such as lectures, because both methods and features perform differently depending on the type of speech data. This thesis therefore addresses automatic summarization of lecture speech. Drawing on features commonly used in existing summarization techniques and selecting suitable ones through experiments, we propose a three-phase Real-Time Speech Summarizer (RTSS). Phase one computes scores for independent features (e.g., centrality, resemblance to the title, sentence length, term frequency, and thematic words). Phase two computes scores that combine a dependent feature, such as sentence position, with the independent feature scores. Phase three compares the scores from the first two phases, retains the better-performing features, takes their weighted average, selects the highest-scoring sentences, and reorders them to form the summary. In the experiments, the summary sentences selected by RTSS were compared with those chosen by five experts. RTSS achieved a Macro F-Measure of 52% and a Macro Accuracy of 70%, which indicates that it is a useful aid that helps users grasp most of the key points of a speech in a short time.
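The three-phase scoring described in the abstract can be illustrated with a minimal Python sketch. It is not the RTSS implementation: the record does not give the thesis's formulas, so the feature definitions, the whitespace tokenizer, the names `independent_scores`, `position_scores`, and `summarize`, and the weights `position_weight` and `phase_weight` are all assumptions made for illustration. Only three of the independent features named in the abstract are approximated, and the experimental feature-selection step of phase three is replaced by a fixed weighted average.

```python
from collections import Counter


def tokenize(sentence):
    # Assumption: whitespace tokenization; a real transcript may need a segmenter.
    words = (w.strip(".,;:!?\"'()").lower() for w in sentence.split())
    return [w for w in words if w]


def normalize(scores):
    # Scale scores to [0, 1] by dividing by the maximum (0 if all are 0).
    top = max(scores) if scores else 0.0
    return [s / top if top > 0 else 0.0 for s in scores]


def independent_scores(sentences, title):
    # Phase one (assumed features): average term frequency, title overlap, length.
    tokens = [tokenize(s) for s in sentences]
    tf = Counter(w for t in tokens for w in t)
    title_words = set(tokenize(title))
    term_freq = normalize([sum(tf[w] for w in t) / len(t) if t else 0.0 for t in tokens])
    title_sim = normalize([len(set(t) & title_words) for t in tokens])
    length = normalize([len(t) for t in tokens])
    return [(a + b + c) / 3 for a, b, c in zip(term_freq, title_sim, length)]


def position_scores(n):
    # Dependent feature (assumption): earlier sentences receive higher scores.
    return [1.0 - i / n for i in range(n)]


def summarize(sentences, title, k=2, position_weight=0.3, phase_weight=0.5):
    ind = independent_scores(sentences, title)
    pos = position_scores(len(sentences))
    # Phase two: combine the position feature with the independent scores.
    combined = [(1 - position_weight) * a + position_weight * p for a, p in zip(ind, pos)]
    # Phase three (simplified): weighted average of both phase scores,
    # pick the top-k sentences, then restore their original order.
    final = [phase_weight * a + (1 - phase_weight) * c for a, c in zip(ind, combined)]
    top = sorted(range(len(sentences)), key=lambda i: final[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    lecture = [
        "Speech summarization helps listeners find key points quickly.",
        "The weather was nice yesterday.",
        "Feature scores rank each sentence in a lecture speech.",
        "Thank you.",
    ]
    for line in summarize(lecture, "Feature-based speech summarization"):
        print(line)
```

Restoring the original sentence order after selection is one reading of the "reorder" step in the abstract; the weights here are placeholders, whereas the thesis reports choosing its features through experiments.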
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079934509
http://hdl.handle.net/11536/50132
Appears in Collections: Thesis