IWSLT 2012

Training and development sets for the MT track

The IWSLT 2012 Evaluation Campaign includes the MT track on TED Talks. In this edition, there are two official language pairs:

   from Arabic to English

   from English to French

In addition, training, development and evaluation sets are provided for ten further language pairs:

  from German, Dutch, Polish, Portuguese-Brazil, Romanian, Russian, Slovak, Slovenian, Turkish and Chinese to English

Runs submitted on the additional pairs will be evaluated as well, in the hope of encouraging the MT community to evaluate systems on common benchmarks and to share achievements on challenging translation tasks.

The archive with training and development sets is available at this link.

If you use this corpus in your work, please cite the paper:

M. Cettolo, C. Girardi, and M. Federico. 2012. WIT3: Web Inventory of Transcribed and Translated Talks. In Proc. of EAMT, pp. 261-268, Trento, Italy.