Training and development sets for the MT track
The IWSLT 2018 Evaluation Campaign includes the Low Resource MT track on TED Talks. In this edition, the translation direction is Basque to English.
Training and development sets for Basque-English (as well as additional talks for Basque-French, Basque-Spanish, Spanish-French, Spanish-English, and French-English) are linked from the entry in the table below: clicking it downloads an archive containing the sets and a README file. Numbers in the table are in millions of units (untokenized words) on the target side of the parallel Basque-English training data.
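To make the unit in the table concrete, here is a minimal sketch of how such a figure could be computed. It assumes that "untokenized words" means whitespace-separated strings with punctuation left attached; the exact counting procedure used for the table is not stated here. The function names and the sample sentences are hypothetical and are not taken from the corpus.

```python
def count_untokenized_words(lines):
    """Count whitespace-separated words, with no tokenization applied.

    Assumption: punctuation stays attached to the neighbouring word,
    so "much." counts as one unit.
    """
    return sum(len(line.split()) for line in lines)


def in_millions(word_count, digits=2):
    """Express a raw word count in millions, rounded to `digits` places."""
    return round(word_count / 1_000_000, digits)


# Hypothetical English (target-side) sentences, for illustration only.
sample_target = [
    "Thank you very much.",
    "It's a great honor to be here.",
]

total = count_untokenized_words(sample_target)
print(total, in_millions(total, digits=6))
```

Applied to the English side of the training data, one sentence per line, the same two functions would give a number comparable to the one in the table, provided the counting convention matches.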
If you use this corpus in your work, please cite the paper: