The candidate will work within a multidisciplinary team involved in the development and application of biomedical text mining and natural language processing approaches.

The overall aim of this work is to develop and apply text mining and natural language processing technologies to biomedical literature. This covers automatic text classification using machine learning methods, the detection of entities of biological interest in text, and the extraction and ranking of biological relations from the biomedical literature. Special focus will be given to topics such as cancer-related literature and protein interactions.

Expected outcomes include fundamental research in biomedical text mining, publications in high-impact journals, and the development of biomedical text mining applications and services. The resulting strategies will be applied to article abstracts, full-text articles and clinical records.

Applicants should hold an MSc (preferably a PhD) in biology, computer science or a related area, with specific knowledge of Computer Science, Bioinformatics, Computational Linguistics, Statistics, Machine Learning or Text Mining.

1) Candidates should have the following qualifications:
- Competitive software engineering skills (demonstrable knowledge of at least two programming languages such as Python, C/C++, Java or Perl, and of the development of online web applications).
- A solid background and interest in statistical and machine learning methods.
- Ability to develop algorithms and software for natural language processing/text mining systems.
- Good English communication skills.
- A strong interest in collaborating with experts from the biomedical, molecular biology, bioinformatics and text mining domains.

2) Other desirable skills and selection criteria include:
- Exposure to biomedical texts or the biomedical domain.
- Familiarity with the development of Web Services.
- Good organizational skills and the ability to work in an interdisciplinary team.
- A publication record would be an advantage.
- Familiarity with NLP tasks such as named entity recognition, summarization, information extraction and information retrieval is also highly desirable.
- Familiarity with existing software relevant to the research topic (such as Weka, LibSVM, Lucene, GATE, NLTK or Mallet).
- Interest in the evaluation of system performance and in community challenges.

The Spanish National Cancer Centre (CNIO, Madrid, Spain - http://www.cnio.es/ing/index.asp) is one of the few European cancer centres to allocate resources to both basic and applied research in an integrated fashion, thus supporting the interaction of basic research programmes with those of molecular diagnostics and drug discovery. All CNIO programmes benefit from excellent equipment, technology and technical services. The CNIO employs about 500 scientists and offers excellent working conditions, including a competitive salary and world-class computing infrastructure.
The Structural and Computational Biology Programme at CNIO, led by Dr. Alfonso Valencia, integrates several research groups, including the Computational Biology group, a Bioinformatics support unit and the central node of the Spanish Bioinformatics Institute, a Genome Spain platform. The computational facilities and infrastructure cover all the needs of modern research in text mining, NLP and computational biology, and provide excellent grounds for the analysis of high-throughput genomic data.
The research group has contributed significantly to biomedical text mining research over the past years, from initial work on the analysis of protein families, microarray data and protein interactions to the development of popular online applications such as the iHOP server and PLAN2L. Recent research efforts have also promoted the organization of the BioCreative text-mining challenges, the development of the BioCreative metaserver and the BioCreative II.5 competition. The research group has a well-established international network of collaborations with other text mining groups, bioinformatics and biological database teams, and experimental biomedical researchers.

Salary: between 25,000 and 30,000 euros per year, depending on qualifications. Contract for 2-3 years.

Requests for additional information and formal applications (including an application letter, an extensive CV, the PhD/MA thesis and the names of at least two references) can be sent to Martin Krallinger: mkrallinger at cnio.es

