WIT3

Web Inventory of Transcribed and Translated Talks

One more pair beyond the evaluation campaign: German-French

The IWSLT 2013 evaluation campaign did not include any task on the German-French pair. As an exception, training, development and progressive test sets are provided here for both the German-to-French and French-to-German directions. They were built from the original German and French XML files used to prepare release 2013-01 of WIT3.

If you use this corpus in your work, please cite the paper:
M. Cettolo, C. Girardi, and M. Federico. 2012. WIT3: Web Inventory of Transcribed and Translated Talks. In Proc. of EAMT, pp. 261-268, Trento, Italy.

For each of the two directions, the training and development sets are linked from the corresponding entry of the table below: clicking an entry downloads an archive containing the sets and a README file. The figures in the table give the size of the target side of the parallel training data, in millions of units (untokenized words).


        de        fr
de      -         2.40
fr      2.16      -
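
As an illustration of how a figure in millions of untokenized words can be obtained, here is a minimal sketch that counts whitespace-separated words. It assumes the target side is available as a plain-text file with one segment per line; the function names and the file name in the usage note are hypothetical, and the exact counting procedure used for the table above is not documented on this page.

```python
def count_units(lines):
    """Count untokenized words, i.e. whitespace-separated strings, over all lines."""
    total = 0
    for line in lines:
        # str.split() with no argument splits on any run of whitespace
        # and ignores leading/trailing whitespace, so blank lines count as 0.
        total += len(line.split())
    return total


def count_units_in_millions(path, encoding="utf-8"):
    """Return the number of untokenized words in a text file, in millions."""
    # Assumption: the file is plain text, one segment per line.
    with open(path, encoding=encoding) as handle:
        return count_units(handle) / 1_000_000
```

For example, `round(count_units_in_millions("train.fr"), 2)` would report the size of a hypothetical French target file in the same unit as the table.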

The progressive test sets tst2011 and tst2012 are linked from the corresponding entry of the table below: clicking an entry downloads an archive containing the sets and a README file.


        de                          fr
de      -                           click to get test sets
fr      click to get test sets      -

The performance of baseline SMT systems is provided below; scores refer to the tst2010 development set, with dev2010 used for tuning. The baseline systems were built with the open-source MT toolkit Moses in a standard configuration, described in the WIT3 paper. BLEU and TER scores were computed with the mteval-v13a.pl and tercom_v6b.pl scripts, respectively. The automatic translations are linked from the table entries; click an entry to download them.


        de                            fr
de      -                             BLEU=18.27, TER=65.49
fr      BLEU=14.65, TER=69.71         -
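
To give a feel for what an edit-rate percentage such as the TER values above expresses, the following sketch computes a simplified word-level edit rate: the number of word insertions, deletions and substitutions needed to turn each hypothesis into its reference, divided by the total number of reference words. This is not a reimplementation of tercom_v6b.pl: it omits the shift operation that TER also allows, assumes a single reference per segment, and splits on whitespace only, so it will generally not reproduce the scores reported here. The function names are hypothetical.

```python
def word_edit_distance(hyp_words, ref_words):
    """Levenshtein distance between two word sequences (no shifts)."""
    # prev[j] holds the distance between the hypothesis prefix processed
    # so far and the first j reference words.
    prev = list(range(len(ref_words) + 1))
    for i, hyp_word in enumerate(hyp_words, 1):
        curr = [i]
        for j, ref_word in enumerate(ref_words, 1):
            substitution_cost = 0 if hyp_word == ref_word else 1
            curr.append(min(
                prev[j] + 1,                      # delete a hypothesis word
                curr[j - 1] + 1,                  # insert a reference word
                prev[j - 1] + substitution_cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def edit_rate_percent(hypotheses, references):
    """Corpus-level edit rate in percent: total edits / total reference words."""
    edits = 0
    ref_length = 0
    for hyp, ref in zip(hypotheses, references):
        ref_words = ref.split()
        edits += word_edit_distance(hyp.split(), ref_words)
        ref_length += len(ref_words)
    if ref_length == 0:
        raise ValueError("references contain no words")
    return 100.0 * edits / ref_length
```

Under this simplified definition, a value of 50 would mean that one edit is needed for every two reference words; lower is better.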