[Elsnet-list] EACL workshop: Web as Corpus
adam at lexmasterclass.com
Thu Nov 17 16:31:14 CET 2005
EACL 2006 Workshop on the
WEB AS CORPUS
April 4 2006, Trento, Italy
The EACL 2006 Workshop on the Web as Corpus will be hosted in
conjunction with the 11th Conference of the European Chapter of the
Association for Computational Linguistics that will take place April
3-7, 2006, in Trento, Italy.
Despite the fact that a growing body of work has shown that the World
Wide Web is a mine of language data of unprecedented richness and ease
of access (see, e.g., the papers collected in Kilgarriff and
Grefenstette, 2003), many fundamental issues about the viability and
exploitation of the Web as a linguistic corpus are just starting to be
tackled, ranging from Web frequency distributions and registers, to
efficient handling of massive data sets, to copyright. Research on
the Web as corpus is currently at a very exciting stage: increasing
evidence points to the enormous potential of the Internet as a source
of linguistic data, but we are still far from a working, fully-fledged
linguists' search engine.
We invite submissions which:
* describe Web corpus collection projects, or modules for one part of
the process (crawling, filtering, language-id, tokenising,
lemmatising, POS-tagging, indexing, ...)
* explore characteristics of Web data from a linguistics/NLP perspective
* use crawled Web data for NLP purposes.
Preference will be given to projects where Web data are downloaded and
processed directly, rather than via search engine interfaces.
Authors are invited to submit full papers on original, unpublished
work in the topic area of this workshop. Submissions should follow the
two-column format of ACL proceedings and should not exceed eight (8)
pages, including references. We strongly recommend the use of ACL
LaTeX or Microsoft Word style files tailored for this year's
conference available at
Papers must conform to the official EACL-06 style guidelines, and we
reserve the right to reject submissions that do not conform to these
styles, including font size restrictions. Submissions should be in PDF
format and must include all fonts, so that the paper will print (not
just view) anywhere.
Please submit your paper no later than January 6, 2006. Details on the
submission procedure will be available soon on the workshop Website.
Each submission will be reviewed by at least two members of the
programme committee. Accepted papers will be published in the workshop
proceedings.
Dual submissions to the main EACL 2006 conference and this workshop
are allowed; if you submit to the main session, do indicate this when
you submit to the workshop, and specify your EACL submission reference
number, for administrative ease. If your paper is accepted for the
main session, you should withdraw your paper from the workshop upon
notification by the main session.
Information on registration and registration fees will be provided at
the conference web page.
* IMPORTANT DATES
January 6, 2006 - Deadline for workshop papers
January 27, 2006 - Notification of acceptance
February 10, 2006 - Camera-ready papers due
April 4, 2006 - Workshop
As the schedule is extremely tight, deadline extensions are NOT
possible.
* PROGRAMME COMMITTEE
Marco Baroni (co-chair)
William H. Fletcher
Adam Kilgarriff (co-chair)
* FURTHER INFORMATION
Workshop web page
Conference web page
EACL 2006 Workshops site
* CONTACT INFORMATION
Lexical Computing Ltd
71 Freshfield Road, Brighton BN2 0BL, UK
adam -at-lexmasterclass . com