[Elsnet-list] First CFP: SemEval-2014 Task 3 - Cross-Level Semantic Similarity

David Jurgens jurgens at di.uniroma1.it
Thu Nov 14 11:16:29 CET 2013

Call For Participation

Cross-Level Semantic Similarity

SemEval 2014 - Task 3


The aim of this task is to evaluate semantic similarity when comparing
lexical items of different types, such as paragraphs, sentences, phrases,
words, and senses.

Semantic similarity is an essential component of many applications in
Natural Language Processing (NLP). This task provides an evaluation for
semantic similarity across different types of text, which we refer to as
lexical levels. Unlike prior SemEval tasks on textual similarity that have
focused on comparing similar-sized texts, this task evaluates the case
where larger text must be compared to smaller text, or even to senses.
Specifically, this task encompasses four types of semantic similarity
comparison:


   paragraph to sentence,

   sentence to phrase,

   phrase to word, and

   word to sense.

Task 3 unifies multiple objectives from different areas of NLP under a
single task, e.g., Paraphrasing, Summarization, and Compositional
Semantics. One of the major motivations of this task is to produce systems
that handle all comparison types, thereby freeing downstream NLP
applications from needing to consider the type of text being compared.


Task participants will be provided with pairs of each comparison type and
asked to rate how similar the meaning of the smaller item is to the overall
meaning of the larger item.  For example, given a sentence and a paragraph,
a system would assess how similar the meaning of the sentence is to the
meaning of the paragraph. Ideally, a high-similarity sentence would reflect
the overall meaning of the paragraph.

For word-to-sense comparisons, two evaluation settings are used: (1)
out-of-context and (2) in-context. In the out-of-context setting, a sense
is paired with a word in isolation. In the in-context setting, a sense is
compared with the meaning of a usage appearing in some context.  Task 3
uses the WordNet 3.1 sense inventory.


Teams are free to participate in one, some, or all comparison types.  Given
the unified setting of the task, we especially encourage systems that
handle all comparison types.  However, we also allow specialized systems
that target only a single comparison type.

Interested teams are encouraged to join the task’s mailing list for
discussion and announcements.


Systems will be evaluated against human similarity scores using both
rank-based and score-based comparisons.  See the task’s website
for further details.
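
As a rough illustration of the two kinds of comparison, the sketch below
scores a system against human ratings in two ways. It assumes Pearson
correlation as the score-based measure and Spearman rank correlation as the
rank-based measure; this announcement does not name the official measures,
so treat both as assumptions. The ratings and the function names are
hypothetical and are not taken from the task data or tools.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores.

    Assumes neither list is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical human ratings and system scores for five item pairs.
human = [4.0, 3.0, 1.0, 0.0, 2.0]
system = [0.9, 0.7, 0.3, 0.1, 0.2]

print("score-based (Pearson):", pearson(human, system))
print("rank-based (Spearman):", spearman(human, system))
```

The rank-based measure only rewards ordering the pairs as the human raters
did, so a system may output scores on any scale; the score-based measure is
also sensitive to how linearly the system's scores track the human ones.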


The Task 3 trial data set is now available and contains tens of examples
for each comparison type to use in building initial systems.  The full
training data will be released later in December. Please see the
task’s Data page <http://alt.qcri.org/semeval2014/task3/index.php?id=data-and-tools>
for further details.



   Trial data ready October 31, 2013

   Training data ready December 15, 2013

   Evaluation period March 15-30, 2014

   Paper submission due April 30, 2014 [TBC]

   SemEval workshop August 23-24, 2014, co-located with COLING and *SEM in
   Dublin, Ireland.


The SemEval-2014 Task 3 website includes details on the training data,
evaluation, and examples of the comparison types.


If you are interested in the task, please join our mailing list for updates.



David Jurgens (jurgens at di.uniroma1.it), Sapienza University of Rome, Italy

Mohammad Taher Pilehvar (pilehvar at di.uniroma1.it), Sapienza University of
Rome, Italy

Roberto Navigli (navigli at di.uniroma1.it), Sapienza University of Rome, Italy
