[Elsnet-list] OpenMT-2 Workshop on Using Linguistic Information for Hybrid Machine Translation -- 1st CFP

Amarin Deemagarn amarin at lsi.upc.edu
Wed May 11 11:55:10 CEST 2011

Dear sir,

I wish to have our workshop listed on your website.
Please put the International Workshop on Using Linguistic Information for Hybrid
Machine Translation on the elsnet-list.

OpenMT-2 International Workshop on Using Linguistic Information for Hybrid Machine
Translation

Workshop date: Friday, November 18, 2011. Barcelona, Spain.

Paper submission deadline: Sept. 9, 2011.

Best regards,

Amarin Deemagarn



OpenMT-2 International Workshop on Using Linguistic Information for Hybrid Machine
Translation

Friday, November 18, 2011. Barcelona, Spain.





Akin to the OpenMT Workshop on Mixing Approaches to Machine Translation in
2008 (http://ixa2.si.ehu.es/matmt-2008), the aim of the OpenMT-2
Workshop on Using Linguistic Information for Hybrid Machine Translation
(HMT) is to promote corpus-based methods and technologies that combine
resources and algorithms from the three general approaches to MT:
rule-based (RBMT), example-based (EBMT) and statistical (SMT).

The boundaries between these three approaches are becoming increasingly blurred:

(i) String-based SMT models are being augmented with morphological, syntactic or
semantic information.

(ii) RBMT systems are using parallel corpora to improve their results by
enriching their lexicons and grammars and creating new methods for

(iii) Previous projects have shown that benefits can
be accrued by simple combination of MT systems created using different
MT approaches.

At the same time, data-driven Machine Translation
(EBMT and SMT) is nowadays prevalent within the MT research community
and translation results obtained with these approaches have now reached
a reasonably useful level of quality, especially when the target
language is English. But such data-driven MT systems base their
knowledge on bilingual aligned corpora, and the accuracy of their
output depends strongly on the quality and the size of those corpora.
Large and reliable bilingual corpora are unavailable for many language
pairs. In addition, translating into a morphologically rich target
language makes the training of data-driven systems a lot more difficult.

Workshop Programme


This one-day workshop is being organised as part of the dissemination effort
of the OpenMT-2 project, a three-year, multisite research effort funded by the
Spanish government. The project addresses, on the one hand, approaches to
integrating structural information (morphological, syntactic and
semantic) into open-source SMT and, on the other, the development of novel
automatic MT evaluation methods using linguistically motivated metrics. Thus,
the central issues to be addressed during the workshop include:

methods and techniques for integrating structural information (syntactic and
semantic) into HMT,

methods and techniques for handling morphologically rich languages (e.g. Basque)
within HMT,

alternative approaches to automatic MT evaluation relying on linguistic criteria.

The programme will include three invited plenary talks, each addressing one
of the central issues above, and the presentation of a number of
refereed contributions on related topics. The invited speakers include:

Lucia Specia (University of Wolverhampton, UK),

Ondřej Bojar (Charles University, Czech Republic),


The workshop will conclude with a brief panel discussion summarising the
results of the presentations as they impact the central issues.

Topics of Interest


We are particularly interested in papers describing research and development in
the following areas:

methods to compare and combine translation outputs obtained from different MT

methods for dealing with languages with rich morphology within data-driven

approaches to developing morphologically, syntactically or semantically augmented
SMT models,

new automatic (or manual) MT evaluation methods based on linguistically motivated

descriptions of open-source or free language resources that are available for
developing hybrid MT systems.

All contributions will be published in the workshop proceedings.

Important Dates


Paper submission deadline: Sept. 9, 2011

Notification of acceptance: Oct. 7, 2011

Final version of paper: Oct. 21, 2011

Workshop: Nov. 18, 2011



Contributions should be in English and at most 8 pages long. Please follow
the ACL HLT 2011 formatting requirements for long papers found at:


To submit contributions, please follow the instructions at the EasyChair
conference management system submission website at:


The deadline for submission is September 9, 2011.

The contributions will undergo a double-blind review by members of the programme
committee.

Please address queries to lihmt at easychair.org

Programme committee (Tentative)


Co-Chair: David Farwell (Technical University of Catalonia, TALP, Barcelona)

Co-Chair: Gorka Labaka (University of the Basque Country, Donostia)

Iñaki Alegria (University of the Basque Country, Donostia)

Ondřej Bojar (Charles University, Czech Republic)

Arantza Díaz de Ilarraza (University of the Basque Country, Donostia)

Chris Dyer (Carnegie Mellon University, US)

Cristina España (Technical University of Catalonia, TALP, Barcelona)

Marcello Federico (Fondazione Bruno Kessler, Italy)

Mikel Forcada (University of Alacant, Alicante)

Adrià de Gispert (University of Cambridge, UK)

Kevin Knight (Information Sciences Institute, US)

Philipp Koehn (University of Edinburgh, UK)

José Mariño (Technical University of Catalonia, TALP, Barcelona)

Lluís Màrquez (Technical University of Catalonia, TALP, Barcelona)

Hermann Ney (RWTH-Aachen, Germany)

Daniele Pighin (Technical University of Catalonia, TALP, Barcelona)

Aarne Ranta (Chalmers University of Technology, Gothenburg, Sweden)

Marta R. Costa-jussà (Barcelona Media, Spain)

Felipe Sánchez-Martínez (University of Alacant, Alicante)

Kepa Sarasola (University of the Basque Country, Donostia)

Lucia Specia (University of Wolverhampton, UK)

Dekai Wu (Hong Kong University of Science and Technology, China)

Local organization


Centre for Speech and Language Applications and Technologies (TALP), Technical
University of Catalonia (UPC).

Committee members: David Farwell (Chair), Amarin Deemagarn, Cristina España,
Meritxell González, Lluís Màrquez, Daniele Pighin.

About the OpenMT-2 project


The main goal of the OpenMT-2 project is the development of open-source
Machine Translation architectures based on hybrid models and advanced
semantic processors. These architectures will be open-source systems
combining the three main Machine Translation frameworks – Rule-Based MT
(RBMT), Statistical MT (SMT) and Example-Based MT (EBMT) – into hybrid
systems. The architectures defined and the results of the project will be
open source, which will allow rapid development and adaptation of new
advanced Machine Translation systems for other languages. We will test
the functionality of these systems with four languages (English,
Spanish, Catalan and Basque), so the architectures will be evaluated in
different contexts. While there are many corpus resources for English
and Spanish, there are far fewer for Catalan and Basque.
While the structure of some of these languages is very similar (Catalan
and Spanish), others are very different (English and Basque). Basque is
an agglutinative and highly inflecting language, unlike English,
Catalan and Spanish.

In parallel, there has been extensive work
on developing an automatic evaluation platform for the
introduction of linguistically motivated morphological, syntactic and
semantic metrics into the design of MT evaluation methodologies, as well
as on the development and testing of concrete, linguistically based
evaluation techniques.

The main innovative points of the OpenMT-2 project are:

The design of hybrid systems combining traditional linguistic rules, example-based
methods and statistical methods.

The development of MT evaluation methods based on linguistically motivated metrics.

Open Source Systems.

The use of advanced syntactic and semantic processing in MT.

For further details, see the OpenMT-2 website: http://ixa.si.ehu.es/openmt2
