Max-Margin Sparse Coding with Universum

Title:	Max-Margin Sparse Coding with Universum (基於Universum最大化邊界的稀疏編碼)
Authors:	Chen, Chun-Yu (陳俊宇); Lee, Chia-Hoang (李嘉晃); Liu, Chien-Liang (劉建良); Institute of Computer Science and Engineering (資訊科學與工程研究所)
Keywords:	Classification; Support Vector Machine (SVM); U-SVM; Universum Learning; Sparse Coding; Block Coordinate Descent
Issue Date:	2015
Abstract:	Universum, a collection of non-examples that do not belong to any class of interest, has recently received considerable attention from researchers in machine learning and data mining. The Universum lets experimenters incorporate prior knowledge into a model by supplying examples rather than by specifying the underlying distributions and associated parameters. Sparse coding, in turn, has been shown to capture higher-level features in data: it learns primitive features from the data and represents each data point as a linear combination of those features. This work devises a max-margin sparse coding algorithm with Universum, which provides strong generalization guarantees. The proposed model jointly considers reconstruction loss and squared hinge loss, and minimizes the empirical classification loss on the target examples rather than assigning explicit class labels to the Universum examples. The algorithm is formulated in the primal form and relies on sparse coding, rather than the kernel trick, to learn a discriminative feature representation. The objective function is optimized with block coordinate descent, in which the block variables comprise the sparse coefficients, the dictionaries, and the classifier parameters. A theoretical convergence analysis based on Zangwill's global convergence theorem is also presented. Experiments on three real image data sets show that the proposed method outperforms the alternatives.
URI:	http://140.113.39.130/cdrfb3/record/nctu/#GT070256126 http://hdl.handle.net/11536/127063
Appears in Collections:	Thesis