Evaluation sets for the MT track
The IWSLT 2015 Evaluation Campaign includes the MT track on TED Talks. This edition has ten official language pairs:
English to/from French, German, Chinese, Thai*, and Vietnamese
Additionally, this year the Czech language is proposed as a special guest; submitted runs
English to/from Czech
will be evaluated as well, in the hope of encouraging the MT community to evaluate systems on common benchmarks and to share achievements on challenging translation tasks.
The archive with test sets is available at this link.
If you use this corpus in your work, please cite the paper:
M. Cettolo, C. Girardi, and M. Federico. 2012. WIT3: Web Inventory of Transcribed and Translated Talks. In Proc. of EAMT, pp. 261-268, Trento, Italy.
(*) Thai texts are segmented at the word level according to the guidelines defined for InterBEST 2009; the document can be found here.