[Elsnet-list] Postdoc position in audiovisual speech, INRIA Nancy, France

Slim OUNI Slim.Ouni at loria.fr
Mon Apr 8 12:25:06 CEST 2013

Postdoc position in computer science, INRIA Nancy, Nancy, France.

Application deadline: 11/06/2013

Starting date: Sept. 2013
Duration: 1 year (possibly extendable)
Contact: Slim Ouni - Slim.Ouni at loria.fr

More details and application:
(to apply, click the "Apply online" button at the end of the web page)

Accurate 3D Lip modeling and control in the context of animating a 3D talking head

** Scientific Context:
The lips play a significant role in audiovisual human communication. Several studies have shown the important contribution of the lips to the intelligibility of visual speech (Sumby & Pollack, 1954; Cohen & Massaro, 1990). In fact, it has been shown that the lips alone carry more than half the visual information provided by the face (Benoît, 1996). Since the beginning of the development of 3D virtual talking heads, researchers have been interested in modeling the lips (Guiard-Marigny et al., 1996; Revéret & Benoît, 1998), as the lips increase the intelligibility of the visual message. The existing models, however, are still purely parametric and numerical, and do not take into account the dynamic character of speech. As audiovisual speech is highly dynamic, we consider that modeling this aspect is crucial to obtaining a lip model that is accurately animated and that reflects the real articulatory dynamics observed in the human vocal tract. Indeed, even subtle lip movements can convey relevant information to the human receiver. This is all the more crucial for some populations, such as hard-of-hearing people.
** Mission
The goal of this work is to develop an accurate 3D lip model that can be integrated within a talking head, together with a control model. The lip model should be as dynamically accurate as possible, so the focus of its design will be on the dynamics. For this reason, one can start from a static 3D lip mesh based on a generic 3D lip model, and then use MRI images or 3D scans to obtain a more realistic lip shape. To capture the dynamic aspect of lip deformation, we will use an electromagnetic articulograph (EMA) and motion-capture techniques to track sensors or markers on the lips, and the mesh will be adapted to these data.

The main challenge is to find the best topology of the sensors or markers on the lips, so as to capture their dynamics accurately. The main outcome is a lip model that is accurately shaped and animated from articulatory data. It is very important that the resulting lips be readable, i.e., that they can be lip-read by hard-of-hearing people.
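As a rough illustration of the mesh-adaptation step mentioned above, a first stage is typically a rigid registration of the mesh's landmark vertices to the captured marker positions before any non-rigid deformation. The sketch below is a hypothetical minimal example (not the project's actual pipeline), using the standard Kabsch least-squares alignment; all names and data are illustrative assumptions:

```python
import numpy as np

def kabsch_align(mesh_pts, marker_pts):
    """Rigidly align mesh landmark points to captured marker positions
    (Kabsch algorithm). Returns rotation R and translation t such that
    mesh_pts @ R.T + t best fits marker_pts in the least-squares sense."""
    mu_mesh = mesh_pts.mean(axis=0)
    mu_mark = marker_pts.mean(axis=0)
    # Cross-covariance of the centered point sets
    H = (mesh_pts - mu_mesh).T @ (marker_pts - mu_mark)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])  # guard against reflections
    R = Vt.T @ D @ U.T
    t = mu_mark - R @ mu_mesh
    return R, t

# Toy check: markers are a rotated and shifted copy of the mesh landmarks.
rng = np.random.default_rng(0)
mesh = rng.standard_normal((8, 3))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
markers = mesh @ R_true.T + np.array([1.0, 2.0, 3.0])
R, t = kabsch_align(mesh, markers)
fitted = mesh @ R.T + t
print(np.allclose(fitted, markers))  # True
```

In practice the rigid fit would only initialize the adaptation; the per-frame EMA trajectories would then drive a deformation model of the full lip mesh.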

** Candidate Profile
Required qualification: PhD in computer science
The ideal candidate will have a good knowledge of 3D modeling, speech processing, and data analysis, as well as solid Java programming skills.
