Deep 360 Pilot: Learning a Deep Agent for Piloting through 360° Sports Videos

doi:10.1109/CVPR.2017.153

Full metadata record

DC Field	Value	Language
dc.contributor.author	Hu, Hou-Ning	en_US
dc.contributor.author	Lin, Yen-Chen	en_US
dc.contributor.author	Liu, Ming-Yu	en_US
dc.contributor.author	Cheng, Hsien-Tzu	en_US
dc.contributor.author	Chang, Yung-Ju	en_US
dc.contributor.author	Sun, Min	en_US
dc.date.accessioned	2018-08-21T05:56:58Z	-
dc.date.available	2018-08-21T05:56:58Z	-
dc.date.issued	2017-01-01	en_US
dc.identifier.issn	1063-6919	en_US
dc.identifier.uri	http://dx.doi.org/10.1109/CVPR.2017.153	en_US
dc.identifier.uri	http://hdl.handle.net/11536/146873	-
dc.description.abstract	Watching a 360° sports video requires a viewer to continuously select a viewing angle, either through a sequence of mouse clicks or head movements. To relieve the viewer of this "360 piloting" task, we propose "deep 360 pilot", a deep learning-based agent for piloting through 360° sports videos automatically. At each frame, the agent observes a panoramic image and knows the previously selected viewing angles. The agent's task is to shift the current viewing angle (i.e., the action) to the next preferred one (i.e., the goal). We propose to directly learn an online policy of the agent from data. Specifically, we leverage a state-of-the-art object detector to propose a few candidate objects of interest (yellow boxes in Fig. 1). Then, a recurrent neural network is used to select the main object (green dashed boxes in Fig. 1). Given the main object and previously selected viewing angles, our method regresses a shift in viewing angle to move to the next one. We use the policy gradient technique to jointly train our pipeline by minimizing (1) a regression loss measuring the distance between the selected and ground-truth viewing angles and (2) a smoothness loss encouraging smooth transitions in viewing angle, while (3) maximizing an expected reward for focusing on a foreground object. To evaluate our method, we built a new 360-Sports video dataset consisting of five sports domains. We trained domain-specific agents and achieved the best performance on viewing angle selection accuracy and users' preference compared to [53] and other baselines.	en_US
dc.language.iso	en_US	en_US
dc.title	Deep 360 Pilot: Learning a Deep Agent for Piloting through 360° Sports Videos	en_US
dc.type	Proceedings Paper	en_US
dc.identifier.doi	10.1109/CVPR.2017.153	en_US
dc.identifier.journal	30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017)	en_US
dc.citation.spage	1396	en_US
dc.citation.epage	1405	en_US
dc.contributor.department	交大名義發表	zh_TW
dc.contributor.department	National Chiao Tung University	en_US
dc.identifier.wosnumber	WOS:000418371401048	en_US
Appears in Collections:	Conferences Paper