Call For Participation

Cross-Level Semantic Similarity

SemEval 2014 - Task 3


To what extent is the meaning of the sentence "do u know where i can watch
free older movies online without download?" preserved in the phrase
"streaming vintage movies for free"?  Or, how similar is "circumscribe" to
the phrases "beating around the bush" and "avoiding the topic"? The aim of
Task 3 is to evaluate semantic similarity when comparing items of different
syntactic and semantic scales, such as paragraphs, sentences, phrases,
words, and senses.


Semantic similarity is an essential component of many applications in
Natural Language Processing (NLP). Task 3 provides an evaluation for
semantic similarity across different types of text, which we refer to as
lexical levels. Unlike prior SemEval tasks on textual similarity that have
focused on comparing similar-sized texts, this task evaluates the case
where a smaller text or sense is compared to the meaning of a larger text.
Specifically, this task encompasses four types of semantic similarity:


   paragraph to sentence,

   sentence to phrase,

   phrase to word, and

   word to sense.

Task 3 unifies multiple objectives from different areas of NLP under a
single task, e.g., Paraphrasing, Summarization, and Compositional
Semantics. One of the major motivations of this task is to produce systems
that handle all comparison types, thereby freeing downstream NLP
applications from needing to consider the type of text being compared.


Task participants are provided with pairs of each comparison type and asked
to rate the extent to which the smaller item preserves the meaning of the
larger item.  For example, given a sentence and a paragraph, a system would
assess how similar the meaning of the sentence is to the meaning of the
paragraph. Ideally, a high-similarity sentence would summarize the overall
meaning of the paragraph.  Similarly, in the case of word to sense, we
assess how well the semantics of a word are captured by a WordNet 3.0 sense.


Teams are free to participate in one, some, or all comparison types.  Given
the unified setting of the task, we especially encourage systems that
handle more than one comparison type.  However, we also allow specialized
systems that target only a single comparison type.

Interested teams are encouraged to join the task's mailing list for
discussion and announcements.


Systems will be evaluated against human similarity scores using both
rank-based and score-based comparisons.  See the task's website
for further details.
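
As a rough illustration of what rank-based and score-based comparisons
against human scores can look like, the sketch below computes a Spearman
rank correlation and a Pearson correlation in plain Python.  This
announcement does not name the official measures, so the choice of these
two correlations is an assumption, and the function names and the example
scores are invented for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Score-based comparison: Pearson correlation of two score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of items tied with the item at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Rank-based comparison: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical human scores and system scores for five pairs (made up).
gold = [4.0, 3.0, 2.5, 1.0, 0.0]
system = [0.9, 0.8, 0.4, 0.3, 0.1]

rank_agreement = spearman(gold, system)
score_agreement = pearson(gold, system)
```

In this made-up example the system orders the pairs exactly as the human
scores do, so the rank-based value is at its maximum, while the score-based
value is lower because the system's scores are not a linear function of the
human scores.  A rank-based measure ignores the scale a system uses, which
is why the two kinds of comparison can disagree.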


The Task 3 training data set is now available and contains 500 scored pairs
for each comparison type to use in building and training systems. Please
see the task's Data <http://alt.qcri.org/semeval2014/task3/index.php?id=data-and-tools> page
for further details.


* Test data ready: March 10, 2014

* Evaluation start: March 15, 2014

* Evaluation end: March 30, 2014

* Paper submission due: April 30, 2014 [TBC]

* Paper reviews due: May 30, 2014

* Camera ready due: June 30, 2014

* SemEval workshop: August 23-24, 2014, co-located with COLING and *SEM in
Dublin, Ireland.


The SemEval-2014 Task 3 website includes details on the training data,
evaluation, and examples of the comparison types:

If interested in the task, please join our mailing list for updates:



David Jurgens (jurgens at uniroma), Sapienza University of Rome, Italy

Mohammad Taher Pilehvar (pilehvar at uniroma), Sapienza University of Rome, Italy

Roberto Navigli (navigli at uniroma), Sapienza University of Rome, Italy

