[Elsnet-list] Call for Participation: ACL WMT 2014 Machine Translation Shared Tasks

Barry Haddow bhaddow at staffmail.ed.ac.uk
Fri Dec 13 16:45:49 CET 2013

Shared Tasks on news translation, quality estimation, metrics and 
medical text translation.
June 26-27, in conjunction with ACL 2014 in Baltimore, USA


As part of the ACL WMT14 workshop, as in previous years, we will be 
organising a collection of shared tasks related to machine translation.  
We hope that both beginners and established research groups will 
participate. This year we are pleased to present the following tasks:

- Translation task
- Quality estimation task
- Metrics task
- Medical translation task

Further information, including task rationale, timetables and data can 
be found on the WMT14 website. Brief descriptions of each task are given 
below. Intending participants are encouraged to register with the 
mailing list for further announcements.

For all tasks, participants will also be invited to submit a short 
paper describing their system.

Translation Task
This will compare translation quality on four European language pairs 
(English-Czech, English-French, English-German and English-Russian), as 
well as a low-resource language pair (English-Hindi). The last language 
pair is *new* for this year. The test sets will be drawn from online 
newspapers, and translated specifically for the task.

We will provide extensive monolingual and parallel data sets for 
training, as well as development sets, all available for download from 
the task website. Translations will be evaluated using both automatic 
metrics and human evaluation. Participants will be expected to 
contribute to the human evaluation of the translations.

For this year's task we will be releasing the following new or updated 
data sets:
- An updated version of news-commentary
- A monolingual news crawl for 2013 in all the task languages
- A development set of English-Hindi
- A parallel corpus of English-Hindi (HindEnCorp), prepared by Charles 
University, Prague
- A cleaned-up version of the JHU English-Hindi corpus.
Not all data sets are available on the website yet, but they will be 
uploaded as soon as they are ready.

The translation task test week will be February 24-28.

This task is supported by MosesCore (http://www.mosescore.eu), and the 
Russian test sets are provided by Yandex.

Quality Estimation

This shared task will examine automatic *methods for estimating the 
quality of machine translation output at run-time*, without relying on 
reference translations. In this third edition of the shared task, we 
will once again consider *word-level* and *sentence-level* estimation. 
However, this year we will focus on settings for quality prediction that 
are MT system-independent and rely on a limited number of training 
instances. More specifically, our tasks have the following *goals*:

  * To investigate the effectiveness of different quality labels.
  * To explore word-level quality prediction at different levels of
    granularity.
  * To study the effects of training and test datasets with mixed
    domains, language pairs and MT systems.
  * To analyse the effectiveness of quality prediction methods on human
    translations.

The WMT12-13 quality estimation shared tasks provided a set of baseline 
features, datasets, evaluation metrics, and oracle results. Building on 
the last two years' experience, this year's shared task will reuse some of 
these resources, but provide additional training and test sets for four 
language pairs (English-Spanish, English-German, Spanish-English, 
German-English) and use different quality labels at the word level 
(specific types of errors) and the sentence level. These new datasets have been 
collected using professional translators as part of the QTLaunchPad 
project (http://www.qt21.eu/launchpad/).

Metrics Task

The shared metrics task will examine automatic evaluation metrics for 
machine translation. We will provide you with all of the translations 
produced in the translation task along with the reference human 
translations. You will return your automatic metric scores for each of 
the translations at the system-level and/or at the sentence-level. We 
will calculate the system-level and sentence-level correlations of your 
rankings with WMT14 human judgements once the manual evaluation has been 
completed.

The task will be very similar to previous years. The most visible change 
this year is that we are going to use Pearson's (instead of Spearman's) 
correlation coefficient to compute system level correlations.
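
To illustrate the difference between the two coefficients, here is a 
minimal sketch in Python. It is not the official scoring script: the 
function names and the example scores are hypothetical, and the tie 
handling (average ranks) is an assumption. Pearson's coefficient 
rewards a linear relationship between metric scores and human scores, 
whereas Spearman's only looks at the ordering of the systems.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman's coefficient: Pearson's coefficient computed on ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical system-level scores for four MT systems (not real WMT data).
metric_scores = [0.10, 0.20, 0.30, 0.90]
human_scores = [0.40, 0.45, 0.50, 0.55]

print("Pearson: ", pearson(metric_scores, human_scores))
print("Spearman:", spearman(metric_scores, human_scores))
```

In this made-up example the metric orders the four systems exactly as 
the human judges do, so Spearman's coefficient is 1, while Pearson's is 
lower because the metric exaggerates the gap to the best system.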

The important dates for metrics task participants are:

March 7, 2014 - System outputs distributed for metrics task
March 28, 2014 - Submission deadline for metrics task

Medical Translation Task

In the Medical Translation Task, participants are welcome to test their 
MT systems on a genre- and domain-specific exercise. The goal is to 
translate sentences from summaries and also short queries in the medical 
domain. As usual, we provide training data specific to the task. Unlike 
the standard translation task, the medical task will be evaluated only 
automatically.

More details: http://www.statmt.org/wmt14/medical-task.html


Barry Haddow
(on behalf of the organisers)