WIT3

Web Inventory of Transcribed and Translated Talks

Evaluation sets for the MT track

The IWSLT 2012 Evaluation Campaign includes the MT track on TED Talks. In this edition, there are two official language pairs:

   from Arabic to English
   from English to French

In addition, training, development and evaluation sets (see below) are provided for ten further language pairs:

   from German, Dutch, Polish, Brazilian Portuguese, Romanian, Russian, Slovak, Slovenian, Turkish and Chinese to English

Runs submitted on the additional pairs will be evaluated as well, in the hope of encouraging the MT community to evaluate systems on common benchmarks and to share results on challenging translation tasks.

For each language pair, the test sets are linked from the corresponding entry in the table below; clicking an entry downloads an archive containing the sets and a README file.
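Once an archive has been downloaded, a first step is usually to unpack it and read the README. The following is a minimal sketch of that step, not part of the official distribution: the function name `unpack_test_sets` is hypothetical, and it assumes only that the archive is in a format Python's `shutil.unpack_archive` recognizes from the file extension (for example zip or gzipped tar) and that the README file name starts with "README".

```python
import shutil
from pathlib import Path


def unpack_test_sets(archive_path, dest_dir):
    """Unpack a downloaded archive (hypothetical helper).

    Returns a pair: the README text (or None if no README is found)
    and the sorted list of extracted file paths.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # The archive format is inferred from the file extension.
    shutil.unpack_archive(str(archive_path), str(dest))

    # Collect every regular file, including those in subdirectories.
    files = sorted(p for p in dest.rglob("*") if p.is_file())

    # Assumption: the README is a file whose name starts with "README".
    readme = next((p for p in files if p.name.upper().startswith("README")), None)
    text = readme.read_text(encoding="utf-8", errors="replace") if readme else None
    return text, files
```

For instance, `unpack_test_sets("downloaded-archive.tgz", "testsets")` (both names are placeholders) would extract the files into `testsets/` and return the README text together with the list of extracted paths.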

If you use this corpus in your work, please cite the following paper:

M. Cettolo, C. Girardi, and M. Federico. 2012. WIT3: Web Inventory of Transcribed and Translated Talks. In Proc. of EAMT, pp. 261-268, Trento, Italy.


        en                        fr
ar      click to get test sets
de      click to get test sets
en                                click to get test sets
nl      click to get test sets
pl      click to get test sets
pt-br   click to get test sets
ro      click to get test sets
ru      click to get test sets
sk      click to get test set
sl      click to get test set
tr      click to get test sets
zh      click to get test sets