Full metadata record
DC Field | Value | Language
dc.contributor.author | Ardianto, Sandy | en_US
dc.contributor.author | Hang, Hsueh-Ming | en_US
dc.date.accessioned | 2019-08-02T02:24:16Z | -
dc.date.available | 2019-08-02T02:24:16Z | -
dc.date.issued | 2018-01-01 | en_US
dc.identifier.isbn | 978-9-8814-7685-2 | en_US
dc.identifier.issn | 2309-9402 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/152429 | -
dc.description.abstract | In this paper, we study a multi-modal and multi-view action recognition system based on deep-learning techniques. We extend the Temporal Segment Network with an additional data fusion stage to combine information from different sources. In this research, we use multiple types of information from different modalities, such as RGB, depth, and infrared data, to detect predefined human actions. We tested various combinations of these data sources to examine their impact on the final detection accuracy. We designed three information fusion methods to generate the final decision. The most interesting one is the Learned Fusion Net designed by us. It turns out that the Learned Fusion structure achieves the best results but requires more training. | en_US
dc.language.iso | en_US | en_US
dc.subject | human action recognition | en_US
dc.subject | neural nets | en_US
dc.subject | deep learning | en_US
dc.subject | multi-view video | en_US
dc.subject | multi-modal video | en_US
dc.subject | information fusion | en_US
dc.title | Multi-View and Multi-Modal Action Recognition with Learned Fusion | en_US
dc.type | Proceedings Paper | en_US
dc.identifier.journal | 2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) | en_US
dc.citation.spage | 1601 | en_US
dc.citation.epage | 1604 | en_US
dc.contributor.department | 電機學院 | zh_TW
dc.contributor.department | College of Electrical and Computer Engineering | en_US
dc.identifier.wosnumber | WOS:000468383400259 | en_US
dc.citation.woscount | 0 | en_US
Appears in Collections: Conference Papers
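The abstract above describes combining per-modality streams (RGB, depth, infrared) with a learned fusion stage that produces the final decision. The following is a minimal sketch of that general idea, not the authors' implementation: the class names, layer sizes, and the assumption that each modality stream outputs a per-class score vector are all illustrative.

```python
# Minimal sketch (assumed architecture, not the paper's code) of a learned
# fusion head: per-modality class scores are concatenated and passed through a
# small trainable network that outputs the fused class scores.
import torch
import torch.nn as nn


class LearnedFusionHead(nn.Module):
    """Combine per-modality score vectors with a trainable fusion MLP."""

    def __init__(self, num_modalities: int, num_classes: int, hidden: int = 128):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(num_modalities * num_classes, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, modality_scores):
        # modality_scores: list of tensors, each of shape (batch, num_classes)
        x = torch.cat(modality_scores, dim=1)  # (batch, num_modalities * num_classes)
        return self.fuse(x)                    # fused class scores


if __name__ == "__main__":
    batch, num_classes = 4, 10
    # Stand-ins for the outputs of the RGB, depth, and infrared streams.
    rgb, depth, ir = (torch.randn(batch, num_classes) for _ in range(3))
    head = LearnedFusionHead(num_modalities=3, num_classes=num_classes)
    fused = head([rgb, depth, ir])
    print(fused.shape)  # torch.Size([4, 10])
```

Because the fusion weights are trained, this kind of head can learn how much to trust each modality per class, which is consistent with the abstract's note that the learned fusion performs best but needs more training than fixed (e.g., averaging) fusion rules.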