Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Chang, Ying-Lan | en_US |
dc.contributor.author | Chien, Jen-Tzung | en_US |
dc.date.accessioned | 2014-12-08T15:30:03Z | - |
dc.date.available | 2014-12-08T15:30:03Z | - |
dc.date.issued | 2012 | en_US |
dc.identifier.isbn | 978-1-4673-2507-3 | en_US |
dc.identifier.uri | http://hdl.handle.net/11536/21520 | - |
dc.description.abstract | Backoff smoothing and topic modeling are crucial issues in n-gram language modeling. This paper presents a Bayesian nonparametric learning approach to tackle these two issues. We develop a topic-based language model in which the numbers of topics and n-grams are automatically determined from data. To cope with this model selection problem, we introduce nonparametric priors for topics and backoff n-grams. The infinite language models are constructed through the hierarchical Dirichlet process compound Pitman-Yor (PY) process. We develop the topic-based hierarchical PY language model (THPY-LM) with power-law behavior. This model can be simplified to the hierarchical PY (HPY) LM by disregarding the topic information, and further to the modified Kneser-Ney (MKN) LM by also disregarding the Bayesian treatment. In the experiments, the proposed THPY-LM outperforms state-of-the-art methods using MKN-LM and HPY-LM. | en_US |
dc.language.iso | en_US | en_US |
dc.subject | language model | en_US |
dc.subject | backoff smoothing | en_US |
dc.subject | topic model | en_US |
dc.subject | Bayesian nonparametrics | en_US |
dc.title | BAYESIAN NONPARAMETRIC LANGUAGE MODELS | en_US |
dc.type | Proceedings Paper | en_US |
dc.identifier.journal | 2012 8TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING | en_US |
dc.citation.spage | 188 | en_US |
dc.citation.epage | 192 | en_US |
dc.contributor.department | 電機資訊學士班 | zh_TW |
dc.contributor.department | Undergraduate Honors Program of Electrical Engineering and Computer Science | en_US |
dc.identifier.wosnumber | WOS:000316984700047 | - |
Appears in Collections: | Conference Papers |