[Elsnet-list] Postdoctoral position at France Telecom R&D

BOUALEM Malek RD-TECH-LAN malek.boualem at francetelecom.com
Tue Apr 18 12:08:30 CEST 2006


Postdoctoral position at France Telecom R&D:
Corpus-based learning for semantic transfer in machine translation.

The "Natural Languages" R&D unit at France Telecom offers a
postdoctoral position in Lannion (Brittany, France), starting as
soon as possible, on the following subject:

Corpus-based learning for semantic transfer in machine translation
------------------------------------------------------------------

Machine translation based on an Interlingua aims at expressing
accurately in the target language what has been said in the source
language. However, a number of phenomena fall outside this
framework. Under the same circumstances, one would not say exactly the
same thing in different languages:

- either because usage, forms of address, or habits differ (I would like
some aspirin, I need some aspirin, have you got some aspirin, may I have
some aspirin, may I bother you with some aspirin). 
- or because basic linguistic structures, especially for determination,
time and aspect, follow different schemes (I would like some aspirin, I
would like a box of aspirin, I would have liked some aspirin, I want
aspirins).

Semantic modelling or rule-based description of such differences is
hardly feasible. However, these gaps can be observed in aligned corpora.
Since the morphological, syntactic and semantic levels are already
addressed by linguistic methods in an Interlingua architecture, machine
learning at the pragmatic level may require smaller corpora than purely
statistical translation methods, where all levels have to be learned
together.

The successful postdoctoral candidate will investigate machine learning
methods which may be applied to structured representations (trees and
graphs) for machine translation, transform a corpus of aligned sentences
into a corpus of aligned semantic graphs, and implement a system to
transform the graphs from the source language into graphs expected in
the target language according to the corpus.

Required skills: 

* semantic representations in NLP (lexical semantics and textual
semantics)
* machine translation: linguistic, statistical and combined methods
* machine learning, especially on structured representations (trees,
graphs)
* corpus alignment
* C++, Unix
* languages: fluent French or English; both preferred
* knowledge of typologically different languages 

Required diploma: 

* PhD (already defended or scheduled)

Please send an application letter and resume to:
jerome(dot)vinesse(at)francetelecom(dot)com

===================

------------------------------------------------------
Malek Boualem
France Telecom, R&D Division
2, avenue Pierre Marzin - 22307 Lannion - France
Tel: (33)(0)2.96.05.29.83
Fax: (33)(0)2.96.05.32.86
Email: malek.boualem at francetelecom.com
------------------------------------------------------

