Retrieval of TV Talk-Show Speakers by Associating Audio Transcript to Visual Clusters

Retrieval of TV talk-show speakers based solely on visual face recognition is hard because of the significant visual variation caused by illumination, pose, size, and expression, which can exceed the variation due to identity. Fortunately, TV talk-shows often exhibit specific visual production styles and ar...

Bibliographic Details
Main Authors: Yina Han, Shanghuan Song, Weikang Zhao
Format: Article
Language: English
Published: IEEE 2017-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/8049254/