Summary: | Master's === National Tsing Hua University === Department of Electrical Engineering === 97 === In this thesis, we present a visual-attention-based probabilistic framework for video object discovery. The framework consists of three models: the appearance model uses a probabilistic representation to describe the consistency of an object across frames; the spatial model represents the objects' geometric structure; and the motion model establishes the temporal association of objects. To discover multiple objects in a video, we use the Perceptual Quality Significance Map (PQSM) as the visual attention model. The visual attention regions obtained from the PQSM are regarded as objects, and we use those regions to describe the appearance, size, and initial location of the objects. Finally, the probabilistic parameters are estimated with the Expectation-Maximization (EM) algorithm. Because the scene of a video may switch between different shots, we use these probabilistic parameters to indicate which frames contain the discovered objects and to measure the similarity of objects in different shots. The main difference from object tracking is that object discovery can find the same object across different shots. Our results show that the framework performs well for multiple-object discovery in video. In addition, the attention regions from the PQSMs are used to generate the object information for the motion model. Because the motion model establishes the data association, the proposed model also establishes the temporal association of the attention regions from the PQSMs.
|