[Elsnet-list] ELRA News

Magali Jeanmaire duclaux at elda.fr
Tue Jun 8 16:50:20 CEST 2004


**********************************************************
ELRA - Language Resources Catalogue - Update
*********************************************************
We are happy to announce that new Language Resources are
now available in our catalogue.

Short descriptions of these resources are given below.
More detailed descriptions are available on our web sites,
at www.elda.fr or www.elra.info.
-------------------------------------------
Written Language Resources
-------------------------------------------
*** W0015 Le Monde Text Corpus - Update ***
Electronic archiving of "Le Monde" articles started on 1 January 1987.
The entire corpus is available in ASCII text format.
The year 2003 is available in XML format.

*** W0036/04 Le Monde Diplomatique Text corpus in Arabic ***
This corpus is an electronic archive of "Le Monde Diplomatique" articles
in Arabic from 1998.
The corpus is available in ASCII text format.
French and English versions are also available.

-------------------------------------------
Spoken Language Resources
-------------------------------------------
*** S0158 Turkish OrienTel database ***
This speech database contains recordings of 1,700 Turkish speakers,
collected over the Turkish fixed and mobile telephone networks.
Each speaker uttered around 45 read and spontaneous items.

*** S0159 German spoken by Turkish OrienTel database ***
This speech database contains recordings of 332 Turkish speakers
of German, collected over the German fixed and mobile telephone networks.
Each speaker uttered around 53 read and spontaneous items.

*** S0160 Spanish Speecon database ***
The Spanish Speecon database comprises recordings of 561 adult
Spanish speakers and 55 child Spanish speakers, who uttered over
290 items and 210 items, respectively (read and spontaneous).

*** S0161 Russian Speecon database ***
The Russian Speecon database comprises recordings of 550 adult
Russian speakers and 50 child Russian speakers, who uttered over
290 items and 210 items, respectively (read and spontaneous).

*** S0162 Hempel ***
This corpus contains 25.5 hours of recordings by 3,909 German speakers
with a total of 184,240 spoken words, made via public phone lines (fixed
network only). The contents are free monologues answering the question:
"Was haben Sie in der letzten Stunde gemacht?" (What did you do in
the last hour?). The database conforms to the SpeechDat Exchange
Format.




