Title: | A Study on Semantic Annotation and Summarization of Basketball Video (篮球影片之语义标注与摘要撷取之研究) |
Authors: | Chen, Chun-Min (陈俊旻); Chen, Ling-Hwei (陈玲慧); Institute of Computer Science and Engineering |
Keywords: | semantic event; video retrieval; information retrieval; image recognition; webcast text; slow motion replay; sports video summarization; sports |
Publication Date: | 2013 |
Abstract: | Sports videos play an important role in leisure and entertainment, but they carry a large amount of data: they demand substantial bandwidth and transmission time, and viewers must spend a long time watching them. To reduce these costs, highlight retrieval, video summarization, and slow motion replay detection have become popular research topics. Most existing methods analyze every video frame. However, semantic events occur only in frames with a scoreboard, whereas slow motion replays appear only in frames without one, so extracting events or replays from unrelated frames lowers both accuracy and efficiency. In addition, most existing methods are designed for soccer videos, and basketball videos have received comparatively little attention. Taking basketball video as an example, this dissertation proposes a novel sports video analysis framework that lets general viewers query game highlights efficiently and lets professionals extend it to related applications (automatic highlight generation, player motion analysis, team tactics analysis, etc.). In the framework, a scoreboard detector first divides video frames into two classes, with and without a scoreboard. A semantic event extractor is then applied to frames with a scoreboard, and a slow motion replay extractor to frames without one. For semantic event extraction, most existing studies use the audio-visual features of the video itself. Schemes that rely only on video content encounter the semantic gap, that is, the distance between low-level video features and high-level semantic events. Although multimodal fusion schemes that use webcast text as external knowledge to bridge the semantic gap have been proposed recently, extracting semantic events from webcast text and annotating them in sports videos remain challenging tasks. This dissertation discusses these challenges and proposes two methods to overcome them. For slow motion replay detection, existing methods fall into two categories. The first assumes that a replay is sandwiched between a pair of visually similar special digital video effects inserted by the broadcaster and detects replays from the positions of these effects, but basketball videos are more complex and the assumption does not always hold for them. The second analyzes features of replay segments to distinguish them from non-replay segments, but its results are unsatisfactory because some features designed for soccer (e.g., the dominant color of the sports field) are not applicable to basketball. Basketball is one of the most important sports in the world, yet many challenges in detecting slow motion replays in basketball videos remain unsolved. This dissertation proposes a new method for detecting slow motion replays in basketball videos, which provides important material for sports video analysis. Experimental results demonstrate the feasibility and effectiveness of the proposed framework and methods. Because neither the framework nor the methods use basketball-specific features, the proposed framework is expected to be extensible to other sports. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT079455643 http://hdl.handle.net/11536/75903 |
Appears in Collections: | Thesis |
Files in This Item:
If it is a zip file, please download the file and unzip it, then open index.html in a browser to view the full text content.