Title: Multimodal Music Recommendation based on Latent Correlation Mining (基於潛在關連探勘之多模態音樂推薦)
Authors: Kuo, Fang-Fei (郭芳菲)
Lee, Suh-Yin (李素瑛)
Institute of Computer Science and Engineering
Keywords: Music Recommendation; Music Playlist; Video Background Music Recommendation; Music Emotion; Social Media; Latent Correlation Mining
Issue Date: 2012
Abstract: Multimedia content usually combines several modalities, such as text, video, audio, and images, and these modalities are correlated. Cross-modal correlations can be mined to reveal latent semantic associations among the modalities. While much research has been done on single-modality music recommendation and music information retrieval, little research has been done on multimodal music recommendation, or on cross-modal music information retrieval, where queries from one modality are used to search for music content using low-level features. This dissertation investigates three mechanisms of multimodal music recommendation: recommending music by emotion, recommending background music for video, and recommending music playlists by social tags.
For recommending music by emotion, most existing approaches are based on a user's music preferences. In some situations, however, recommending music according to emotion rather than preference better meets the user's needs. We propose a novel framework for emotion-based music recommendation. The core of the framework is the construction of a music emotion model. Because music is one of the most important elements for conveying emotion in film, the model is built by affinity discovery from film music, that is, by mining the relationship between music and emotion in films.
For recommending music by video, background music strongly influences how engaging a video is, particularly in home video production. Existing work on video production requires the user to select the background music first and only then performs steps such as synchronizing the video with the music. Selecting music that suits the video content is time-consuming and requires knowledge of both music and video production. We propose a mechanism that recommends background music for home videos. Because the music in advertisements is usually selected or composed by professional video producers or composers, we collect advertising videos from social video websites, automatically filter out videos that contain no background music or only a low proportion of music to produce the training data, and then apply a multimodal latent correlation mining algorithm to recommend music for a video.
For recommending music playlists by tags, we propose a query-based playlist generation approach that integrates the collective knowledge in playlists created by users of social music communities. The approach mines the collocation relationships between tracks from these playlists and lets users generate new playlists from query tags. Multi-type Latent Semantic Analysis is employed to capture the latent semantic space among playlists, tracks, and tags, and the proximity between tracks in that space is used to construct a track graph. Playlists are then generated by the proposed approximate Steiner Tree algorithm on the track graph. Evaluation on music playlist collections from Last.fm shows that the proposed approach achieves encouraging results.
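The playlist generation step described in the abstract (track proximity, track graph, approximate Steiner tree) can be illustrated with a minimal sketch. It is not the algorithm proposed in the dissertation, whose details are not given in this record. It assumes each track already has a latent vector (here made-up 2-D values standing in for the output of Multi-type Latent Semantic Analysis), uses cosine distance as the proximity measure, and swaps in the generic shortest-path heuristic for Steiner trees. All function names, the `max_distance` threshold, and the sample data are hypothetical.

```python
import heapq
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity, clamped at 0 to absorb rounding error."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return max(0.0, 1.0 - dot / norm)


def build_track_graph(vectors, max_distance):
    """Link every pair of tracks whose latent vectors are close enough."""
    graph = {track: {} for track in vectors}
    for a, b in combinations(vectors, 2):
        d = cosine_distance(vectors[a], vectors[b])
        if d <= max_distance:
            graph[a][b] = d
            graph[b][a] = d
    return graph


def nearest_seed_path(graph, tree_nodes, targets):
    """Multi-source Dijkstra from the current tree to the closest target."""
    dist = {n: 0.0 for n in tree_nodes}
    prev = {}
    heap = [(0.0, n) for n in sorted(tree_nodes)]
    heapq.heapify(heap)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, math.inf):
            continue  # stale heap entry
        if node in targets:
            path = [node]
            while path[-1] in prev:
                path.append(prev[path[-1]])
            return path  # runs from the target back to a tree node
        for nxt, w in graph[node].items():
            nd = d + w
            if nd < dist.get(nxt, math.inf):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    return None  # the remaining targets are unreachable


def approximate_steiner_playlist(graph, seeds):
    """Grow a tree connecting all seed tracks; return its tracks in order."""
    seeds = list(dict.fromkeys(seeds))  # drop duplicates, keep order
    tree = [seeds[0]]
    remaining = set(seeds[1:])
    while remaining:
        path = nearest_seed_path(graph, set(tree), remaining)
        if path is None:
            break
        for node in reversed(path):
            if node not in tree:
                tree.append(node)
        remaining -= set(path)
    return tree


# Hypothetical latent vectors; "x" is deliberately far from the others.
vectors = {
    "a": (1.0, 0.0), "b": (0.9, 0.3), "c": (0.6, 0.6),
    "d": (0.3, 0.9), "e": (0.0, 1.0), "x": (-1.0, 0.1),
}
graph = build_track_graph(vectors, max_distance=0.15)
# Seeds stand in for the tracks matched by the user's query tags.
playlist = approximate_steiner_playlist(graph, seeds=["a", "e"])
print(playlist)
```

In this toy data the two seed tracks are not directly linked, so the tree has to pass through intermediate tracks that bridge them in latent space. That is the intuition behind using a Steiner tree for playlists: the connecting tracks are the recommendations, while an unrelated track such as "x" is never reached.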
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079217821
http://hdl.handle.net/11536/40398
Appears in Collections: Thesis