Optimization of an Image-Based Talking Head System

This paper presents an image-based talking head system, which includes two parts: analysis and synthesis. The audiovisual analysis part creates a face model of a recorded human subject, composed of a personalized 3D mask and a large database of mouth images together with their related information. The synthesis part generates natural-looking facial animations from phonetic transcripts of text. A critical issue of the synthesis is unit selection, which selects and concatenates appropriate mouth images from the database such that they match the spoken words of the talking head. Selection is based on lip synchronization and the similarity of consecutive images. This paper refines the unit selection and uses Pareto optimization to train it. Experimental results of subjective tests show that most people cannot distinguish our facial animations from real videos.
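
The unit selection described above can be read as a path search over candidate mouth images that balances a lip-synchronization (target) cost against an image-similarity (concatenation) cost. The sketch below is a minimal illustration of that idea under assumptions, not the authors' implementation: the feature vectors, the two cost functions, and the fixed weights w_sync and w_smooth are hypothetical stand-ins for quantities the paper trains via Pareto optimization.

```python
# Minimal sketch of unit selection as a Viterbi-style search (hypothetical
# features, costs, and weights; the paper trains the trade-off with Pareto
# optimization rather than fixed weights).
import numpy as np

def target_cost(candidate_feat, target_feat):
    # Lip-synchronization term: distance between a candidate mouth image's
    # articulation features and the features expected for the target frame.
    return np.linalg.norm(candidate_feat - target_feat)

def concat_cost(prev_feat, cur_feat):
    # Smoothness term: dissimilarity of consecutive mouth images.
    return np.linalg.norm(prev_feat - cur_feat)

def select_units(targets, candidates, w_sync=1.0, w_smooth=1.0):
    """targets: list of per-frame target feature vectors.
    candidates[t]: array (n_t, d) of database image features for frame t.
    Returns one candidate index per frame minimizing the combined cost."""
    T = len(targets)
    # cost[t][j]: best accumulated cost ending in candidate j at frame t.
    cost = [w_sync * np.array([target_cost(c, targets[0]) for c in candidates[0]])]
    back = [np.zeros(len(candidates[0]), dtype=int)]
    for t in range(1, T):
        tc = w_sync * np.array([target_cost(c, targets[t]) for c in candidates[t]])
        cc = np.array([[concat_cost(p, c) for c in candidates[t]]
                       for p in candidates[t - 1]])          # (n_prev, n_cur)
        total = cost[-1][:, None] + w_smooth * cc
        back.append(total.argmin(axis=0))
        cost.append(total.min(axis=0) + tc)
    # Trace back the cheapest path through the candidate lattice.
    path = [int(cost[-1].argmin())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```

In the paper, the balance between lip synchronization and smoothness is not hand-tuned as fixed weights; Pareto optimization is used to explore that trade-off when training the unit selection.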

Bibliographic Details
Main Authors: Liu Kang, Ostermann Joern
Format: Article
Language: English
Published: SpringerOpen, 2009-01-01
Series: EURASIP Journal on Audio, Speech, and Music Processing
ISSN: 1687-4714, 1687-4722
Online Access: http://asmp.eurasipjournals.com/content/2009/174192