| Title: | Hierarchical Pitman-Yor and Dirichlet Process for Language Model |
| Authors: | Chien, Jen-Tzung; Chang, Ying-Lan; Department of Electrical and Computer Engineering |
| Keywords: | language model;backoff model;topic model;Bayesian learning |
| Issue Date: | 1-Jan-2013 |
| Abstract: | This paper presents a nonparametric interpretation of the modern language model based on the hierarchical Pitman-Yor and Dirichlet (HPYD) process. We propose the HPYD language model (HPYD-LM), which flexibly conducts backoff smoothing and topic clustering through Bayesian nonparametric learning. The nonparametric priors of backoff n-grams and latent topics are tightly coupled in a compound process. A hybrid probability measure is drawn to build the smoothed topic-based LM. The model structure is automatically determined from training data. A new Chinese restaurant scenario is proposed to implement HPYD-LM via Gibbs sampling. This process reflects the power-law property and extracts the semantic topics from natural language. The superiority of HPYD-LM over related LMs is demonstrated by experiments on different corpora in terms of perplexity and word error rate. |
| URI: | http://hdl.handle.net/11536/146415 |
| ISSN: | 2308-457X |
| Journal: | 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5 |
| Start Page: | 2211 |
| End Page: | 2215 |
| Appears in Collections: | Conferences Paper |

