[Elsnet-list] Postdoctoral position: deep neural networks for source separation and noise-robust ASR

Antoine Liutkus antoine.liutkus at inria.fr
Mon Jan 20 13:39:11 CET 2014

(Apologies for any cross-posting - Please forward to anyone who may be 
interested)

*SUBJECT*: Deep neural networks for source separation and noise-robust ASR
*LAB*: PAROLE team, Inria Nancy, France
*SUPERVISORS*: Antoine Liutkus (antoine.liutkus at inria.fr) and Emmanuel 
Vincent (emmanuel.vincent at inria.fr)
*START*: between November 2014 and January 2015
*DURATION*: 12 to 16 months
*TO APPLY*: apply online before June 10 at 
(earlier application is preferred)

Inria is the biggest European public research institute dedicated to 
computer science. The PAROLE team at Inria Nancy, France, gathers 20+ 
speech scientists with a growing focus on speech enhancement and 
noise-robust speech recognition, exemplified by the organization of the 
CHiME Challenge [1] and ISCA's Robust Speech Processing SIG [2].

The boom of speech interfaces for handheld devices requires automatic 
speech recognition (ASR) systems to deal with a wide variety of acoustic 
conditions. Recent research has shown that Deep Neural Networks (DNNs) 
are very promising for this purpose. Most approaches now focus on clean, 
single-source conditions [3]. Despite a few attempts to employ DNNs for 
source separation [4,5,6], conventional source separation techniques 
such as [7] still outperform DNNs in real-world conditions involving 
multiple noise sources [8]. The proposed postdoctoral position aims to 
close this gap by incorporating the benefits of conventional source 
separation techniques into DNNs. This includes for instance the ability 
to exploit multichannel data and different characteristics for 
separation and for ASR. Performance will be assessed over readily 
available real-world noisy speech corpora such as CHiME [1].

Prospective candidates should have defended a PhD in 2013 or expect to 
defend a PhD in 2014 in the area of speech processing, machine learning, 
signal processing or applied statistics. Proficiency in Matlab, Python 
or C++ programming is necessary. Experience with DNN/ASR software 
(Theano, Kaldi) would be an asset.

[1] http://spandh.dcs.shef.ac.uk/chime_challenge/

[2] https://wiki.inria.fr/rosp/

[3] G. Hinton, L. Deng, D. Yu, G. Dahl, A.-R. Mohamed, N. Jaitly, A. 
Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury, "Deep 
neural networks for acoustic modeling in speech recognition", IEEE 
Signal Processing Magazine, 2012.

[4] S.J. Rennie, P. Fousek, and P.L. Dognin, "Factorial Hidden 
Restricted Boltzmann Machines for noise robust speech recognition", in 
Proc. ICASSP, 2012.

[5] A.L. Maas, T.M. O'Neil, A.Y. Hannun, and A.Y. Ng, "Recurrent neural 
network feature enhancement: The 2nd CHiME Challenge", in Proc. CHiME, 2013.

[6] Y. Wang and D. Wang, "Towards scaling up classification-based speech 
separation", IEEE Transactions on Audio, Speech and Language Processing, 

[7] A. Ozerov, E. Vincent, and F. Bimbot, "A general flexible framework 
for the handling of prior information in audio source separation", IEEE 
Transactions on Audio, Speech and Language Processing, 2012.

[8] J. Barker, E. Vincent, N. Ma, H. Christensen, and P. Green, "The 
PASCAL CHiME Speech Separation and Recognition Challenge", Computer 
Speech and Language, 2013.