[Elsnet-list] MEMURA 2004 - Call for Participation

ddg at di.ubi.pt ddg at di.ubi.pt
Wed May 5 15:43:04 CEST 2004


******************CALL FOR PARTICIPATION******************

  Workshop on Methodologies and Evaluation of Multiword Units
                    in Real-world Applications

                      (MEMURA 2004 Workshop)

           (in association with the 4th INTERNATIONAL
        CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION)

           Centro Cultural de Belém, Lisbon, Portugal
                         May 25, 2004

                  http://memura2004.di.ubi.pt

                       INVITED SPEAKER

                      Dr. Kenneth Church

  **********************************************************

  [1] Workshop Description
  [2] Target Audience
  [3] Programme
  [4] Contact


  [1] Workshop Description:
  ------------------------

  Multiword units (MWUs) include a large range of linguistic phenomena, such
  as phrasal verbs (e.g. "look forward"), nominal compounds (e.g. "interior
  designer"), named entities (e.g. "United Nations"), set phrases (e.g. "con
  carne") or compound adverbs (e.g. "by the way"), and they can be
  syntactically and/or semantically idiosyncratic in nature. MWUs are used
  frequently in everyday language, usually to express precisely those ideas
  and concepts that cannot be compressed into a single word. A considerable
  amount of research has been devoted to this subject, both in theory and in
  practice, but despite increasing interest in idiomaticity within linguistic
  research, many questions remain unanswered. The objective of this workshop
  is to address three important questions that are of great interest for
  real-world applications.


  1) Comparison of MWU extraction methodologies

  Many methodologies have been proposed to automatically extract or identify
  MWUs. However, little effort has been devoted to comparing their results.
  The core differences between the methodologies are certainly the main
  reason why such work is so rare. For instance, it is not easy to compare
  language-dependent methodologies, as the results depend on the efficiency
  of parameter tuning in the broad sense of the term (i.e. semantic tagging,
  local specific grammars, lemmatization, part-of-speech tagging etc.).
  Another important problem is that there is no real agreement among
  researchers on the definition of MWUs, which would provide the basis for
  an objective evaluation. The objective of the workshop is to gather people
  who have recently been working in this area so that new trends in
  comparing MWU extraction methodologies and their evaluation can be
  identified.

  2) Evaluation of the benefits of the integration of MWUs in real-world
  applications

  It is not yet clear whether MWUs really improve NLP applications. It is
  commonly accepted that Machine Translation is one application that takes
  great advantage of MWU databanks. However, does the same apply to
  applications in Automatic Summarization, Information Retrieval (IR),
  Cross-language IR, Information Extraction, Text Clustering/Classification
  or Parallel Corpus Alignment? Indeed, could the identification of MWUs
  introduce new constraints that are not present in the original texts?
  Should MWUs be considered as units that can be analysed in terms of the
  meaning of their components, or should they be treated as unanalysable?
  Should NLP methods work both on isolated words and on aggregated MWUs?

  The answers are anything but clear. Here, the objective of the workshop is
  to highlight successes and failures of the integration of MWUs in
  real-world applications.


  3) Comparison of scalable architectures for the extraction and
  identification of MWUs

  Real-world applications are constrained by variables like processing time
  and memory space. However, identifying and extracting MWUs is usually a
  computationally heavy process. In recent years, new algorithms and new
  technologies have been proposed to introduce MWU treatment in large-scale
  applications, thus avoiding earlier intractable implementations. Previous
  workshops on MWUs have mainly focused on the unconstrained extraction
  process. In this workshop, we would like to focus on the comparison of
  different factors that can influence the scalability of the treatment of
  MWUs in real-world applications, namely data structures, algorithms,
  parallel and distributed computing, grid computing etc. Indeed, as we said
  earlier, some extraction strategies may not scale to deal with huge volumes
  of data.


  [2] Target Audience:
  --------------------

  This workshop is intended to bring together NLP researchers working on all
  areas of MWUs. The objective is to summarise what has been achieved in the
  area of MWUs in real-world applications, to establish common themes between
  different approaches, and to discuss future trends.


  [3] Programme:
  --------------

  9h00 - 9h45 - Invited Speaker - Kenneth W. Church

  9h45 - 10h05 - Japanese Multiword Extraction using SVM and Adaptation - T.
  Ogata, K. Terao, K. Umemura - Toyohashi University of Technology - Japan

  10h05 - 10h25 - Multiword Expressions Recognition with the LVQ Algorithm -
  M.C. Díaz-Galiano, M.T. Martín-Valdivia, F. Martínez-Santiago, L.A.
  Ureña-López - University of Jaén - Spain

  10h25 - 10h45 - A Parallel Multikey Quicksort Algorithm for Mining
  Multiword Units - R. Pereira, P. Crocker, G. Dias - Beira Interior
  University - Portugal

  10h45 - 11h00 - Coffee Break

  11h00 - 11h20 - Recognition and Paraphrasing of Periphrastic and
  Overlapping Verb Phrases - N. Kaji, S. Kurohashi - University of Tokyo -
  Japan

  11h20 - 11h40 - Transducing Text to Multiword Units - C.H.A. Koster -
  University of Nijmegen - The Netherlands

  11h40 - 12h00 - Multiword Units in Syntactic Parsing - J. Nivre and J.
  Nilsson - Växjö University - Sweden

  12h00 - 12h20 - Use of Noun Phrases in Interactive Search Refinement - O.
  Vechtomova, M. Karamuftuoglu - University of Waterloo - Canada

  12h20 - 12h40 - Comparative Evaluation of C-value in the Treatment of
  Nested Terms - S. Vintar - University of Ljubljana - Slovenia

  12h40 - 13h00 - Discussion and Closing Session


  [4] Contact
  -----------

  Gaël Dias
  Human Language Technology Interest Group
  Departamento de Informática
  Universidade da Beira Interior
  Rua Marquês d'Ávila e Bolama
  6201-001 Covilhã Portugal
  email: ddg at di.ubi.pt
  Tel: +351 275319700 - Mob: +351 918612700 - Fax: +351 275319732

