Bruno Cremilleux Bruno.Cremilleux at info.unicaen.fr
Tue Apr 26 19:31:13 CEST 2005

Dear all,

I would be very grateful if you could forward to your lists the following
post-doctoral position (see below) in Computer Science, Linguistics and Natural
Language Processing, which is open at the GREYC laboratory (Caen, France).

Best regards,

Bruno Crémilleux


Postdoctoral Position in Computer Science, Linguistics and Natural Language
Processing: Using text resources for data mining

Research Unit: Groupe de REcherche en Informatique, Image, Automatique et
Instrumentation de Caen (GREYC)

Location: Caen, Normandy, France

This post-doc position is linked to the Bingo project, which brings together
three computer science teams (EURISE, EA 3721, Université de St-Etienne; GREYC
- CNRS UMR 6072, Université de Caen; and LIRIS - CNRS UMR 5205, INSA de
Lyon) and a team of biologists (CGMC - CNRS UMR 5534, Université de Lyon).

The Bingo project (Bases de données INductives et GénOmique in French, Genomics
and Inductive Databases in English; see http://www.info.unicaen.fr/~bruno/bingo/)
focuses on several open problems, one of which is the use of text resources
during the pattern post-processing stage, in order to make better use of domain
knowledge during the knowledge discovery stage. This problem requires close
cooperation between linguistic knowledge and methods from knowledge discovery
in databases.

The aim of this post-doc position is to use texts and ontologies to support
the knowledge discovery phase (i.e., when post-processing patterns), so that
knowledge relevant to the needs of the experts can be presented. Indeed, KDD
processes tend to produce a large number of patterns which are, a priori,
interesting. Validating the extracted information is a hard task and requires
background knowledge of the domain at hand. This background knowledge is
partially embedded in
the literature. The key idea is to help the validation step by using
ontologies (cf. http://www.geneontology.org/) and textual resources (e.g.,
Medline). For instance, in the context of genomic data, starting from
a pattern which may be a synexpression group, the biologist would like to
retrieve the texts that deal with this particular topic, find out which
biological situations are involved, and so on. Several work directions are
proposed (e.g., text-reader profiling, text analysis, defining constraints
derived from text resources), see

This post-doctoral position is supported by the CNRS, see also 

Sought profile of the candidate

PhD in Computer Science with an interest in linguistics or natural language
processing. Significant experience in knowledge discovery in databases or
linguistics would be highly appreciated. Speaking French is not required.

Duration of the fellowship (months): 12 (starting from September 1st, 2005)

Gross salary: 25,800 euros per annum

Deadline for application: May 16th, 2005

Bruno Crémilleux  +33 2 31 56 74 35   Bruno.Cremilleux at info.unicaen.fr
Nadine Lucas      +33 2 31 56 73 36   Nadine.Lucas at info.unicaen.fr

GREYC - CNRS UMR 6072, Université de Caen, Campus Côte de Nacre
F-14032 Caen Cedex - France

