Virtual Audience Cameraman System

Bibliographic Details
Main Authors: Hsuan-Chia Liao, 廖軒嘉
Other Authors: Sei-Wang Chen
Format: Others
Language: zh-TW
Published: 2014
Online Access: http://ndltd.ncl.edu.tw/handle/66277867251317353397
Description
Summary: Master's thesis === National Taiwan Normal University === Department of Computer Science and Information Engineering === 102 === This thesis proposes a virtual audience cameraman system that captures audience video automatically. Lectures can now be distributed widely and quickly as digital video, so recording important lectures for viewers is essential work. However, hiring a video-recording team that includes professional photographers to capture good-quality video is very costly. This study therefore developed a virtual audience cameraman system that obtains good-quality video automatically and reduces the cost of hiring a professional team. Two PTZ cameras are mounted together as a set: a global-view camera and a local-view camera. The global-view camera acts as the photographer's eyes; it monitors the whole audience and supports region of interest (ROI) detection. The local-view camera acts as the camera in the photographer's hands; it captures video of the ROI once the system has determined the ROI's location. Because the system aims to simulate the camera-control behavior of professional photographers when filming an audience, it must decide the camera steering mode, the shot class, and the target objects before steering the camera. First, the system takes input video from the global-view camera and detects audience motion features to locate ROI candidates. The ROI candidates are then fed into a spatiotemporal attention (STA) neural model, which records and supplies information that helps the system identify the most suitable ROI to shoot. Next, the system computes the relative distance between the ROI's location in the frame and the center of the camera lens, and outputs the appropriate steering mode for the local-view camera. The local-view camera then captures the output video at the ROI location, taking into account aesthetic considerations and the results of an optical-characteristics analysis. Through this process the system simulates the shooting skills of a professional photographer. The experimental results show that the proposed method steers the camera immediately, automatically, and smoothly, and that it also simulates the style of professional photographers accurately.
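
The step that turns the ROI's offset from the lens center into a steering mode can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not give the steering modes, thresholds, or data structures, so the `Roi` type, the mode names ("hold", "slow_pan", "fast_pan"), and the `dead_zone` and `fast_zone` values are all hypothetical, and the frame center is assumed to stand in for the center of the camera lens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Roi:
    """Axis-aligned ROI box in global-view frame pixels (hypothetical type)."""
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float

    def center(self):
        """Return the (x, y) pixel coordinates of the box center."""
        return (self.x + self.width / 2.0, self.y + self.height / 2.0)


def relative_offset(roi, frame_width, frame_height):
    """Offset of the ROI center from the frame center, scaled to [-1, 1].

    Assumption: the frame center approximates the center of the lens.
    """
    cx, cy = roi.center()
    dx = (cx - frame_width / 2.0) / (frame_width / 2.0)
    dy = (cy - frame_height / 2.0) / (frame_height / 2.0)
    return dx, dy


def steering_mode(dx, dy, dead_zone=0.1, fast_zone=0.5):
    """Map a normalized offset to an illustrative steering mode.

    The three modes and both thresholds are placeholders, not values
    taken from the thesis.
    """
    magnitude = max(abs(dx), abs(dy))
    if magnitude <= dead_zone:
        return "hold"        # ROI is already near the center
    if magnitude <= fast_zone:
        return "slow_pan"    # small correction, move gently
    return "fast_pan"        # ROI is far from the center


if __name__ == "__main__":
    roi = Roi(x=800, y=320, width=64, height=80)
    dx, dy = relative_offset(roi, frame_width=1280, frame_height=720)
    print(steering_mode(dx, dy))
```

Normalizing the offset by half the frame size keeps the thresholds independent of the camera resolution, which is one plausible design choice; a real system would more likely derive pan and tilt angles from the calibrated field of view of the two cameras.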