[Elsnet-list] Open source language analysis package
padro at lsi.upc.edu
Fri Mar 17 17:09:34 CET 2006
--- We apologize if you have received multiple copies of this
Dear list members,
We are pleased to announce the release of FreeLing version 1.3, which
improves existing functionalities of the suite, and includes new ones,
such as WN-based semantic annotation, NE classification, and dependency
parsing. Also, we are glad to announce that this version includes two
new languages (Italian and Galician) thanks to the researchers willing
to share their data under open-source or creative-commons licences (see
the "thanks" section on the FreeLing web page).
FreeLing is an open-source C++ library providing language analysis
services. It is Free Software, released under the GNU LGPL. FreeLing 1.3
will be presented and demonstrated next May at LREC-2006 in Genoa, Italy.
FreeLing is developed at TALP Research Center <http://www.talp.upc.es>,
in Universitat Politècnica de Catalunya <http://www.upc.es>.
Morphological dictionaries and grammars were initially developed by
Centre de Llenguatge i Computació <http://clic.fil.ub.es>, in
Universitat de Barcelona <http://www.ub.es>.
Find more information, an online demo, and download links at
http://www.lsi.upc.edu/~nlp (under "resources" menu)
FreeLing is designed to be used as an external library from any
application requiring language analysis services. Nevertheless, a simple
main program is also provided as a basic interface to the library, which
enables the user to analyze text files from the command line.
The named entity classification module requires some essential Machine
Learning services such as feature extraction, and
training/classification using Adaboost models. These services are
accessible to any program linking the library, so FreeLing can also be
used as a (very) basic ML-oriented NLP development toolkit.
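To illustrate the kind of service meant here, the following is a minimal
sketch of feature extraction followed by an AdaBoost-style weighted vote.
It is not FreeLing's API: the names WeakRule, extract_features, score and
is_positive, the two toy features, and the rule weights are all
assumptions made up for this example, and the training step is omitted.

```cpp
#include <cassert>
#include <cctype>
#include <set>
#include <string>
#include <vector>

// Hypothetical weak rule: tests whether one binary feature is active.
struct WeakRule {
    std::string feature;  // feature name to test, e.g. "capitalized"
    double alpha;         // weight of this rule in the final vote
    int vote_if_present;  // +1 or -1 when the feature is active
};

// Toy feature extractor (an assumption, not FreeLing's feature set).
std::set<std::string> extract_features(const std::string& token) {
    std::set<std::string> active;
    if (!token.empty() && std::isupper(static_cast<unsigned char>(token[0])))
        active.insert("capitalized");
    for (char c : token) {
        if (std::isdigit(static_cast<unsigned char>(c))) {
            active.insert("has_digit");
            break;
        }
    }
    return active;
}

// Weighted vote of all weak rules; the sign gives the predicted class.
double score(const std::vector<WeakRule>& model,
             const std::set<std::string>& active) {
    double total = 0.0;
    for (const WeakRule& rule : model) {
        assert(rule.vote_if_present == 1 || rule.vote_if_present == -1);
        // A rule votes the opposite way when its feature is absent.
        int vote = active.count(rule.feature) ? rule.vote_if_present
                                              : -rule.vote_if_present;
        total += rule.alpha * vote;
    }
    return total;
}

bool is_positive(const std::vector<WeakRule>& model,
                 const std::set<std::string>& active) {
    return score(model, active) > 0.0;
}
```

With a hand-written model such as {{"capitalized", 0.8, +1},
{"has_digit", 0.5, -1}}, a capitalized token without digits gets a
positive score. In a real system the rules and their weights would be
learned from annotated data rather than written by hand.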
Features already in previous versions:
* Text tokenization.
* Sentence splitting.
* Morphological analysis.
* Named entity detection.
* Date/number/currency/ratios recognition.
* PoS tagging.
* Chart-based shallow parsing.
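
As a rough illustration of the first two steps in the list above, here is
a minimal sketch of tokenization followed by sentence splitting. It does
not use FreeLing's classes: the functions tokenize and split_sentences
are hypothetical, handle ASCII text only, and ignore abbreviations,
numbers with decimal points, and other cases a real analyzer must cover.

```cpp
#include <cctype>
#include <string>
#include <vector>

using Sentence = std::vector<std::string>;

// Split text into word tokens and single-character punctuation tokens.
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char c : text) {
        if (std::isalnum(c)) {
            current += static_cast<char>(c);
        } else {
            // Any non-alphanumeric character ends the current word.
            if (!current.empty()) {
                tokens.push_back(current);
                current.clear();
            }
            // Whitespace is dropped; other characters become tokens.
            if (!std::isspace(c))
                tokens.push_back(std::string(1, static_cast<char>(c)));
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}

// Group tokens into sentences, closing one at each ".", "!" or "?".
std::vector<Sentence> split_sentences(const std::vector<std::string>& tokens) {
    std::vector<Sentence> sentences;
    Sentence current;
    for (const std::string& token : tokens) {
        current.push_back(token);
        if (token == "." || token == "!" || token == "?") {
            sentences.push_back(current);
            current.clear();
        }
    }
    // Keep any trailing tokens that lack a final punctuation mark.
    if (!current.empty()) sentences.push_back(current);
    return sentences;
}
```

Calling split_sentences(tokenize(text)) chains the two steps, which
mirrors how later stages such as morphological analysis and PoS tagging
would consume sentences made of tokens.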
New features in version 1.3:
* New languages: Italian and Galician.
* Improved and debugged linguistic data for Spanish and Catalan.
* Contraction splitting.
* Improved suffix treatment, retokenization of clitic pronouns.
* Physical magnitudes detection (speed, weight, temperature,
* Named entity classification.
* WordNet-based sense annotation.
* Dependency parsing.