Marco Baroni marco.baroni at unitn.it
Wed Nov 13 20:11:18 CET 2013


Evaluation of compositional distributional semantic models on full
sentences through semantic relatedness and textual entailment

SemEval 2014 - Task 1



Distributional Semantic Models (DSMs) approximate the meaning of words
with vectors summarizing their patterns of co-occurrence in
corpora. Recently, several compositional extensions of DSMs
(Compositional DSMs, or CDSMs) have been proposed, with the purpose of
representing the meaning of phrases and sentences by composing the
distributional representations of the words they contain. Despite the
ever-increasing interest in the field, the development of adequate
benchmarks for CDSMs, especially at the sentence level, is still
lagging behind.
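
As a rough illustration of the idea, the sketch below composes word
vectors by simple addition and compares the resulting sentence vectors
with cosine similarity. Everything in it is an assumption made for the
example: the toy vectors are invented, the helper names
(compose_additive, cosine) are hypothetical, and additive composition
is only one of several proposed composition methods, not one
prescribed by the task.

```python
import math

# Toy co-occurrence vectors (invented counts over three hypothetical
# context dimensions); a real DSM would estimate these from a corpus.
TOY_VECTORS = {
    "dog": [4.0, 1.0, 0.0],
    "puppy": [3.0, 2.0, 0.0],
    "runs": [1.0, 5.0, 1.0],
    "sleeps": [0.0, 1.0, 6.0],
}


def compose_additive(words, vectors):
    """Build a phrase vector by summing the vectors of its words.

    Words missing from the vocabulary contribute a zero vector.
    """
    dims = len(next(iter(vectors.values())))
    total = [0.0] * dims
    for word in words:
        for i, value in enumerate(vectors.get(word, [0.0] * dims)):
            total[i] += value
    return total


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


s1 = compose_additive(["dog", "runs"], TOY_VECTORS)
s2 = compose_additive(["puppy", "runs"], TOY_VECTORS)
s3 = compose_additive(["dog", "sleeps"], TOY_VECTORS)

# With these toy values, the first pair scores higher than the second.
print(cosine(s1, s2), cosine(s1, s3))
```

Because addition is commutative, this particular composition function
ignores word order, so it cannot distinguish sentences that differ
only in the order of their words.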

SICK (Sentences Involving Compositional Knowledge) is an English data
set of 10,000 sentence pairs that are rich in the lexical, syntactic
and semantic phenomena that CDSMs are expected to account for (e.g.,
contextual synonymy and other lexical variation phenomena,
active/passive and other syntactic alternations, word order effects,
impact of negation, determiners and other grammatical elements). At
the same time, the pairs do not require dealing with other aspects of
existing sentential data sets (complex tokenization issues, idiomatic
multiword expressions, named entities, telegraphic language) that are
outside the scope of current compositional distributional semantics.
Sentence pairs were
built starting from
http://nlp.cs.illinois.edu/HockenmaierGroup/data.html and
http://www.cs.york.ac.uk/semeval-2012/task6/index.php?id=data, and
have been annotated for relatedness in meaning and entailment relation
between the two elements. The sentence relatedness score provides a
direct way to evaluate CDSMs, insofar as their outputs are meant to
quantify the degree of semantic relatedness between sentences. On the
other hand, detecting the presence of entailment is one of the
traditional benchmarks of a successful semantic system: CDSMs are thus
expected to predict entailment judgments as well, to a certain extent.


The challenge involves two sub-tasks:

a) predicting the degree of relatedness between two sentences

b) detecting the entailment relation holding between them

Participants can submit system runs for one or both sub-tasks.

While we especially encourage developers of CDSMs to test their
methods on SICK, developers of other kinds of systems that can tackle
sentence relatedness or entailment tasks (e.g., full-fledged RTE
systems) are also welcome to submit their output.


- Training data available December 15, 2013

- Evaluation period March 15-30, 2014

- Paper submission due April 30, 2014

- SemEval workshop August 23-24, 2014, co-located with COLING and *SEM
   in Dublin, Ireland.


Further details about the task, the dataset and the evaluation
criteria can be found on the SemEval website, where trial data can
also be downloaded: http://alt.qcri.org/semeval2014/task1/

If you are interested in participating, join our mailing list:


Marco Marelli, University of Trento, Italy
Stefano Menini, Fondazione Bruno Kessler, Italy
Marco Baroni, University of Trento, Italy
Luisa Bentivogli, Fondazione Bruno Kessler, Italy
Raffaella Bernardi, University of Trento, Italy
Roberto Zamparelli, University of Trento, Italy
