[Elsnet-list] TUNA Referring Expression Generation Challenge 2009: Second call for participation

Gatt, A. a.gatt at abdn.ac.uk
Mon Nov 3 17:29:33 CET 2008

[Apologies for multiple postings]

TUNA Referring Expression Generation Challenge 2009.
Part of Generation Challenges 2009, in conjunction with ENLG 2009.


Generation Challenges 2009 is being organised to provide a common forum for a number of different NLG Shared Tasks (see http://www.nltg.brighton.ac.uk/research/genchal09/). The different Shared Tasks will be presented at a special session of the European Workshop on Natural Language Generation, held in Athens, Greece (see http://enlg2009.uvt.nl/).

As part of Generation Challenges 2009, we are organising a TUNA Progress Test. This will be the third, and final, shared task involving the TUNA Corpus of referring expressions. TUNA was first used in the Pilot Attribute Selection for Generating Referring Expressions (ASGRE) Challenge, which took place between May and September 2007; and again for three of the shared tasks in the Referring Expression Generation (REG) Challenge 2008, which was completed in May and presented during a special session at INLG'08. The TUNA'09 Task replicates one of the three tasks from REG 2008, the TUNA-REG Task. It uses the same test data, to enable direct comparison against the 2008 results.

Referring Expression Generation (REG) has been the subject of intensive research in the NLG community, giving rise to significant consensus on the problem definition, as well as the nature of the inputs and outputs of REG algorithms. Typically, such algorithms take as input a domain, consisting of objects and their attributes, together with an intended referent, and output a set of attributes true of the referent which distinguish it from other objects in the domain. An additional stage is to map these attributes to a natural language expression (usually a noun phrase).
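The standard setup described above can be sketched in a few lines of Python. This is a toy illustration of the greedy attribute-selection idea only; the domain encoding and the rule-out heuristic are assumptions for the example, not the TUNA format or any participant system.

```python
# Toy sketch of the REG setup: given a domain (objects with attributes) and
# an intended referent, select attributes true of the referent that
# distinguish it from all other objects ("distractors") in the domain.

def generate_description(domain, target_id):
    """Greedily pick attributes, preferring those that rule out most distractors."""
    target = domain[target_id]
    distractors = {i: o for i, o in domain.items() if i != target_id}
    description = {}
    while distractors:
        # attribute of the target that excludes the most remaining distractors
        attr, value = max(
            target.items(),
            key=lambda av: sum(1 for o in distractors.values()
                               if o.get(av[0]) != av[1]),
        )
        ruled_out = {i for i, o in distractors.items() if o.get(attr) != value}
        if not ruled_out:
            break  # referent not uniquely identifiable in this domain
        description[attr] = value
        for i in ruled_out:
            del distractors[i]
    return description

domain = {
    "e1": {"type": "chair", "colour": "red", "size": "large"},
    "e2": {"type": "chair", "colour": "blue", "size": "large"},
    "e3": {"type": "desk", "colour": "red", "size": "small"},
}
print(generate_description(domain, "e1"))
# → {'type': 'chair', 'colour': 'red'}
```

A realisation stage would then map the selected attribute set to a noun phrase such as "the red chair".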

The TUNA Corpus consists of a set of human-produced descriptions of objects in a visual domain of pictures of furniture and people, annotated at the semantic level and paired with a domain representation. In addition to the domain, each description is provided as a word string (i.e. the original human-produced description), as the set of attributes included in the description, and as a combination of the two (where the word string is annotated with the attributes).
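To make the three views of a corpus item concrete, the following is a hypothetical sketch in Python. The field names and values are invented for illustration; the actual TUNA XML annotation scheme is documented in the report cited below.

```python
# Hypothetical sketch of one corpus item and its three description views:
# the original word string, the attribute set, and their combination
# (the word string annotated with attributes). Not the real TUNA schema.

item = {
    "domain": {                       # objects and their attributes
        "e1": {"type": "chair", "colour": "red"},
        "e2": {"type": "chair", "colour": "blue"},
    },
    "target": "e1",
    "word_string": "the red chair",   # original human-produced description
    "attribute_set": {("colour", "red"), ("type", "chair")},
    "annotated_string": [             # words paired with the attributes they express
        ("the", None),
        ("red", ("colour", "red")),
        ("chair", ("type", "chair")),
    ],
}

# The attribute set is recoverable from the annotated word string:
recovered = {attr for _, attr in item["annotated_string"] if attr}
print(recovered == item["attribute_set"])  # → True
```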

The corpus was collected via an elicitation experiment in which one between-subjects condition controlled whether subjects could use an object's location in their descriptions (+/-Location). In keeping with the earlier shared tasks involving TUNA, only the singular data will be used for the TUNA'09 Task.


For the REG Challenge 2008, a new test set was generated by replicating the original experiment. This test set has not been released in any form, and will be used again for the 2009 edition.

For full details about the data collection and corpus annotation scheme, see: http://www.csd.abdn.ac.uk/~agatt/home/pubs/tunaFormat.pdf

The TUNA'09 Task
The TUNA'09 Task is a replication of the TUNA-REG Task carried out as part of the REG Challenge 2008. As in 2008, a variety of evaluation methods will be used to compare systems (see below), and participants can choose which of the evaluation criteria to optimise for.
(i) Input: Following the standard REG problem definition, the input will be a TUNA domain representation, consisting of objects and their properties, with one object designated as the intended referent.
(ii) Output: Participating systems will need to map a TUNA domain representation to a natural language description of the intended referent.

The TUNA data set is composed of the following subsets:
(i) A development set consisting of 150 corpus items (input domain + human description), divided into furniture and people descriptions;
(ii) A training set consisting of 593 corpus items (input domain + human description), divided into furniture and people descriptions;
(iii) A test set consisting of 112 corpus items (input domain only), divided into furniture and people descriptions. This test set was created through a replication of the original TUNA elicitation experiment, and was built so that every input domain is paired with two human-authored descriptions by different individuals. System-produced descriptions will be evaluated by averaging scores against both human descriptions for the corresponding input domain.

Participants will compute evaluation scores on the development set (using code provided by the organisers), and the organisers will perform evaluations on the test data set. We will again use a range of different evaluation methods, including intrinsic and extrinsic methods, both automatically assessed and human-evaluated. Intrinsic evaluations assess properties of peer systems in their own right, whereas extrinsic evaluations assess the effect of a peer system on something external to it, such as its effect on human performance at a given task or the added value it brings to an application.
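As an illustration of the kind of automatic intrinsic measure involved, a set-overlap score such as the Dice coefficient can compare a system's attribute set against a human-produced one. This is only a sketch under the assumption that set overlap is among the measures used; the actual 2009 evaluation details will be announced to registered participants.

```python
# Illustrative intrinsic metric: Dice coefficient between two attribute sets.
# 1.0 means the system selected exactly the attributes the human did.

def dice(a, b):
    """2 * |A ∩ B| / (|A| + |B|) over two sets of attribute-value pairs."""
    if not a and not b:
        return 1.0  # two empty descriptions count as a perfect match
    return 2 * len(a & b) / (len(a) + len(b))

system = {("type", "chair"), ("colour", "red")}
human  = {("type", "chair"), ("colour", "red"), ("size", "large")}
print(dice(system, human))  # → 0.8
```

For the TUNA'09 test set, such a score would be computed against each of the two human descriptions for an input domain and averaged, as described above.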

Full details of the evaluation methods will be announced to registered participants.

Registration is now open at the TUNA'09 Task homepage (http://www.nltg.brighton.ac.uk/research/genchal09/tuna). Once they have registered, participants will be sent the TUNA'09 Participants' Pack which includes complete documentation and the training and development data sets.

Proceedings and Presentations
The Generation Challenges 2009 meeting will be held as a special session at ENLG 2009. The session will include overviews of all the shared tasks, including the TUNA'09 Task. The participating systems will additionally be presented as papers in the ENLG'09 proceedings, and as posters during the ENLG'09 poster session.

TUNA'09 Challenge Papers will not undergo a selection procedure with multiple reviews, but the organisers reserve the right to reject material which is not appropriate given the participation guidelines.

Important Dates
Aug 15, 2008 First Call for Participation in TUNA'09 Task; TUNA'09 data sets available
Dec 1-31, 2008 TUNA'09 test data submission: 1. submit system report; 2. download test data; 3. submit outputs within 48h
Dec 31, 2008 Final deadline for submission of TUNA'09 test data outputs
Jan 01-31, 2009 TUNA'09 Evaluation period
Mar 01, 2009 Submission deadline for camera-ready papers and reports
Mar 30-31, 2009 Generation Challenges 2009 meeting at ENLG'09

Albert Gatt, Computing Science, University of Aberdeen, UK
Anja Belz, NLTG, University of Brighton, UK
Eric Kow, NLTG, University of Brighton, UK

TUNA'09 Task homepage: http://www.nltg.brighton.ac.uk/research/genchal09/tuna
Generation Challenges homepage: http://www.nltg.brighton.ac.uk/research/genchal09
Generation Challenges email: nlg-stec at itri.brighton.ac.uk

