隨機牛頓法之大型非線性支持向量機

標題:	隨機牛頓法之大型非線性支持向量機 Stochastic Newton Method for Large Scale Nonlinear Support Vector Machine
作者:	蔡宜恩吳金典李育杰 Tsai, Yi-En Wu, Chin-Tien Lee, Yuh-Jye 應用數學系數學建模與科學計算碩士班
關鍵字:	隨機最佳化;在線機器學習;支持向量機;非線性支持向量機;縮減核函數技術;二元分類;牛頓法;隨機牛頓法;stochastic optimization;online machine learning;support vector machine;nonlinear support vector machine;reduced kernel trick;binary classification;Newton method;stochastic Newton method
公開日期:	2016
摘要:	在這篇論文中，我們提出一種方法 - 隨機牛頓法來解決傳統非線性支持向量機在處理巨量數據集時所遇到的困境。我們的想法源自於隨機優化理論（在線學習技術）。由於其演算法上的設計，讀到一筆數據就更新一次模型，且不保留已讀過的數據，隨機優化（在線學習）演算法被認為是大型學習任務的解決方案之一。以這種設計為出發點，我們增加每次更新模型所需使用到的數據筆數，而且將目標函數的二次訊息，也就是目標函數的曲率，納入搜尋方向的生產元素中，以加速收斂速率。每次迭代需要計算、更新的海森反矩陣也藉由運用謝爾曼莫里森伍德伯里公式以減少計算成本。由於缺少已讀過數據的相關訊息，傳統上，隨機優化（或在線學習）演算法需要經過幾回合的訓練，才有機會找到最佳解，這意味著整個數據集應被預先儲存。為了避免保存所有的數據集而造成塞爆記憶體的情形發生，我們引進了輔助模型 - 準線性判別分析，保留舊數據的簡單統計資訊，以幫助我們的方法可以在一個訓練回合中得到一個堪用的分類模型。為了測試演算法的穩定性、準確性和效率，我們對隨機牛頓法與其他現有的隨機優化（在線學習）演算法進行多種相關實驗實驗與比較。數值計算結果顯示，隨機牛頓法不管在準確度或速度上都相當有競爭力。此外，隨機牛頓法對數據的輸入順序所造成的影響並不敏感。總結來說，隨機牛頓法是大型非線性支持向量機一個不錯的選擇。 In this thesis, a stochastic Newton method is proposed to overcome the challenges of the conventional nonlinear support vector machines (SVMs) for extremely large scale dataset. Our idea comes from the scheme of stochastic optimization (or online learning technique) which is considered as a solution of large scale learning tasks. Based on the design of stochastic optimization (or online learning) algorithms, updating the model using only one training instance without reading old ones, we extend it from one training instance to a small batch of instances. In addition, the second order information of the objective function, that is the curvature of the objective function, is incorporated in search direction to accelerate the rate of convergence. The inverse of Hessian matrix computed at each iteration is updated by the Sherman Morrison Woodbury formula to reduce the computational cost. Traditionally, the stochastic optimization (or online learning) algorithms need several epochs to achieve the optimal solution because of a lack of memory to old training instances. It implies that the entire dataset should be stored in advance. To avoid saving all dataset, we introduced a proximal classifier - the quasi-Linear Discriminate Analysis to keep some simple statistical information of old training instances and help our method can obtain a reasonably good model in single epoch. Several experiments for testing the robustness, accuracy and efficiency of one algorithm are conducted on both our methods and other existing algorithms. The numerical result shows that stochastic Newton method is competitive in both accuracy and speed. Also, stochastic Newton method is insensitive to the input order of training instances. In conclusion, stochastic Newton method is a good choice for large-scale nonlinear support vector machines.
URI:	http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070252302 http://hdl.handle.net/11536/142395
Appears in Collections:	Thesis