[Elsnet-list] Reminder: Open-Source Machine Translation Workshop at MT Summit X

Mikel L. Forcada mlf at dlsi.ua.es
Mon May 2 19:04:56 CEST 2005

OSMaTran: Open-Source Machine Translation

A workshop at MT Summit X <http://www.tcllab.org/Pages/mtsummit.html>
September 12-16, 2005
The Hilton Phuket Arcadia Resort and Spa
Phuket, Thailand


Machine translation has become a key technology in our globalized
society; as a result, machine translation software is available for
major language pairs and for major computer platforms, including
web-based machine translation. On the other hand, recent years have
witnessed a boom in open-source software; among the most successful
solutions are the operating system Linux, web browsers such as Mozilla,
web servers such as Apache, and full-fledged office suites such as
OpenOffice.org. However, almost all "real life" machine translation
software, even if available for use at no cost, is "closed" instead of
open. This is especially surprising if one considers the large number of
publicly-funded groups working in machine translation.

Open-source machine translation would have, however, distinct
advantages; if it is freely available, as most open-source software is,
more users would have access to this technology, but, more importantly,
institutions or businesses adopting an open-source machine translation
system would be able to customize the system to their needs in many more
ways: developing new linguistic data (vocabularies, rules, corpora),
integrating it with other packages, etc.

But machine translation software is special in that it relies upon the
availability of extensive linguistic resources; for an open-source
machine translation architecture to be successful, clearly defined and
documented standards to represent linguistic data are absolutely
necessary. Data standardization would lead to interoperability and
interchange, which would in turn be very beneficial to the creation of
new machine translation systems. Proprietary data could also be
converted into these formats to be used in conjunction with open-source
architectures, leading to hybrid systems.

The existence of an open-source machine translation architecture would
also be especially important for the creation of systems dealing with
language pairs involving small or neglected languages, which are usually
not targeted by commercial programs, but would fulfill the goals of
administrations and non-government organizations dealing with these
languages, and even contribute to their promotion or revival.

Open-source software is associated with a change in the business model. In
the case of machine translation, it would result in a shift from
license-based or charge-per-word models to a service model in which
enterprises would offer users a variety of services: consulting,
customization, linguistic data development, integration in multilingual
document management systems, etc.

Machine translation is only one of the available language technologies
which can be applied to translation; the effect of the existence of
open-source software for other translation applications such as
translation memory, etc., or even other natural language processing
applications not related to translation, would therefore be worth
examining as well.

    Schedule and Venue

This one-day workshop will take place on September 16, 2005 after the
regular conference sessions end. Please visit the Workshop website at
http://www.torsimany.ua.es/OSMaTran/ for updates.


There will be a common publication format for all workshops, in line with
the main conference proceedings: the usual book format for conference
proceedings, as well as CD-ROM.


The Open-Source Machine Translation workshop seeks original papers in
all aspects of open-source machine translation. Topics of interest
include, but are not limited to:

    * open-source machine translation architectures (rule-based,
      example-based, statistical, etc.); projects and currently
      available software: complete engines, modules;
    * standards for the encoding of linguistic data to be used in
      conjunction with open-source translation technology;
    * licensing issues: licenses for the machine translation engines and
      for linguistic data; applicability of current open-source licenses
      (GPL, LGPL, etc.); hybrid-license systems (open-source engine,
      proprietary data);
    * open-source machine translation as an opportunity for small or
      neglected languages;
    * open-source translation memories (programs and data);
    * advantages and disadvantages of open-source language technologies;
    * applicability of the open-source model to the language technology
    * the role of public agencies in promoting the use and development
      of open-source machine translation;
    * the future of open-source machine translation: objectives,

The working language of the workshop will be English.

Papers should describe research or experiences in any of the topics
mentioned, should not be longer than 3,000 words, and should fit
within 8 pages. To format the paper, you are encouraged to use the style
files <http://www.tcllab.org/Pages/mtsummit/papersubmit.html> provided
for the general sessions of MT Summit X. In view of the main topics of
the workshop, presentations may have a substantial demonstrative
component.

Electronic submission (PDF or PS files; we strongly encourage authors to
avoid proprietary formats) via electronic mail is the only submission
method allowed. As the reviewing process will be blind, authors are
requested to keep their papers anonymous. This means that
submissions should NOT include the authors' names; rather, papers should
be identified only by their title. In addition to the electronic file
containing their paper (including an abstract), authors must also submit
a second file with a separate cover page giving the following
information:

    * paper title,
    * authors' names, affiliations, addresses, and e-mail addresses,
    * for demonstrative presentations: the hardware, software and
      network requirements for the demonstration.

The two electronic files should be attached to an email and sent to the
workshop e-mail address (osmatran [at] dlsi.ua.es).

We will acknowledge receipt of all papers received before the deadline
and issue a submission number to each author. Please refer to that
number in all subsequent correspondence.

Additional guidelines for preparing your manuscript will be given before
the start of the submission period at the Workshop website at
http://www.torsimany.ua.es/OSMaTran/ .

    Important Dates

Please make a note of these important dates:

    * Paper submission starts: April 1, 2005
    * Paper submission deadline: May 13, 2005
    * Notification of acceptance: June 24, 2005
    * Final camera-ready copy deadline: July 15, 2005

    Attendance Fee

Details of registration procedures, including registration fees, will be
posted on the OSMaTran website when announced. The attendance fee for our
workshop is expected to be approximately US $70.


    * Mikel L. Forcada, Universitat d'Alacant, Alacant, Spain (/workshop
    * Iñaki Alegria and Kepa Sarasola, Euskal Herriko Unibertsitatea,
      Donostia, Spain.
    * Xavier Gómez Guinovart, Universidade de Vigo, Vigo, Spain.
    * Lluís Padró, Universitat Politècnica de Catalunya, Barcelona,
      Spain.
    * Juan Antonio Pérez-Ortiz and Antonio M. Corbí-Bellot, Universitat
      d'Alacant, Alacant, Spain.

    Further Information

For more details, please visit the workshop website:
http://www.torsimany.ua.es/OSMaTran/. You may also send a request for
information to the workshop e-mail address (osmatran [at] dlsi.ua.es).

Mikel L. Forcada                    E-mail: mlf at dlsi.ua.es
Departament de Llenguatges          Phone: +34-96-590-9776 
i Sistemes Informàtics                also +34-96-590-3772.
UNIVERSITAT D'ALACANT               Fax:   +34-96-590-9326, -3464
E-03071 ALACANT, Spain.
URL: http://www.dlsi.ua.es/~mlf/
