Full metadata record
DC Field | Value | Language
dc.contributor.author | 黃冠郎 | zh_TW
dc.contributor.author | 冀泰石 | zh_TW
dc.contributor.author | Huang, Kuan-Lang | en_US
dc.contributor.author | Chi, Tai-Shih | en_US
dc.date.accessioned | 2018-01-24T07:42:32Z | -
dc.date.available | 2018-01-24T07:42:32Z | -
dc.date.issued | 2017 | en_US
dc.identifier.uri | http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT079613522 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/142645 | -
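dc.description.abstract | In this study, we hypothesize that overall speech quality is a multi-dimensional percept that can be further decomposed into five abstract percepts: speech intelligibility, clarity, naturalness, continuity, and noise intrusiveness. We designed and conducted a subjective listening experiment to verify this initial hypothesis. From the analysis of the subjective data, we derived subjective weights for accurately predicting the speech quality (SIG) and overall quality (OVL) estimates. These subjective weights were then used in the objective intrusive model of overall speech quality that we developed. To build the objective model, we used an analytical auditory model that resolves both the temporal and spectral domains to analyze and capture distortions of the speech signal, and quantified the spectral energy of modulations in different spectro-temporal ranges embedded in the auditory spectrogram, from which the distortion of each abstract percept was extracted. The distortion measures of the five percepts were linearly combined with the subjective weights to estimate the subjective speech quality (MOS). The performance of this initial model was acceptable; we further applied more sophisticated algorithms to the abstract percept estimator and the judgment model, aiming to discover non-linear mappings from the input spectro-temporal modulation features to the abstract percepts and ultimately to overall speech quality. With neural network (NN) analysis, the proposed overall speech quality model shows satisfactory results. In addition, a performance comparison with PESQ, the intrusive objective speech quality standard published by the International Telecommunication Union (ITU-T), demonstrates the potential of the proposed model. We therefore adopt neural networks in the abstract percept estimator and the judgment model to model and estimate non-linear combinations over specific regions of the spectro-temporal modulation features, in order to predict each abstract percept and overall speech quality. | zh_TW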
dc.description.abstract | In this study, we hypothesize that integral speech quality is a multi-dimensional percept that can be decomposed into five abstract percepts: speech intelligibility, clarity, naturalness, continuity, and noise intrusiveness. A subjective listening experiment was designed and conducted to verify this initial assumption. The subjective weights derived to accurately predict the SIG and OVL estimates were subsequently used in our objective, instrumental, intrusive quality model. To construct the objective quality model, a spectro-temporal auditory model was used to capture degradations of speech signals and to measure deteriorations of the abstract percepts from different ranges of spectro-temporal energy modulations embedded in spectrograms. Deterioration measures of the first four percepts, together with a temporal energy profile designed to indicate noise intrusiveness, were linearly combined with the subjective weights to estimate the subjective MOS. Although the performance of these pre-defined deterioration measures combined with subjective weights in predicting MOS-LQO is acceptable, more sophisticated approaches were explored for the abstract percept estimator and the judgment model. Non-linear mappings from input scale-rate features to the abstract percepts, and eventually to the integral speech quality, were expected. Performance comparisons with PESQ demonstrate the potential of our quality assessment model. The proposed quality model, whose abstract percept estimator and judgment model are both implemented with neural networks (NN), achieves satisfactory results in assessing abstract percepts and integral quality from non-linear combinations of scale-rate features in specific regions. | en_US
dc.language.iso | en_US | en_US
dc.subject | 語音品質 | zh_TW
dc.subject | 客觀估測 | zh_TW
dc.subject | 語音品質分解 | zh_TW
dc.subject | 可理解度 | zh_TW
dc.subject | 清晰度 | zh_TW
dc.subject | 自然性 | zh_TW
dc.subject | 連續性 | zh_TW
dc.subject | 噪聲干擾 | zh_TW
dc.subject | speech quality | en_US
dc.subject | objective assessment | en_US
dc.subject | speech quality decomposition | en_US
dc.subject | intelligibility | en_US
dc.subject | clarity | en_US
dc.subject | naturalness | en_US
dc.subject | continuity | en_US
dc.subject | noise intrusiveness | en_US
dc.title | 語音品質之客觀估測與分解演算法 | zh_TW
dc.title | Objective Assessment and Decomposition of Speech Quality | en_US
dc.type | Thesis | en_US
dc.contributor.department | 電信工程研究所 (Institute of Communications Engineering) | zh_TW
Appears in Collections: Thesis