CoCo4MT 2022
Workshop on Corpus Generation and Corpus Augmentation for Machine Translation
The first workshop on Corpus Generation and Corpus Augmentation for Machine Translation (CoCo4MT) was co-located with AMTA 2022 on 16 September 2022.
It was the first workshop dedicated to research on corpus creation, cleansing, and augmentation techniques specifically for machine translation.
Topics (including but not limited to):
- Difficulties with using existing corpora (e.g., political considerations or domain limitations) and their effects on final MT systems,
- Strategies for collecting new MT datasets (e.g., via crowdsourcing),
- Data augmentation techniques,
- Data cleansing and denoising techniques,
- Quality control strategies for MT data,
- Exploration of datasets for pretraining or auxiliary tasks for training MT systems.
sites.google.com/view/coco4mt-2022
Keynote speakers
- Ankur Parikh
- Graham Neubig
- Jörg Tiedemann
- Julia Kreutzer
- Maria Nadejde
Panel
- Kenneth Church
- Marine Carpuat
Important dates
| Date | Event |
| --- | --- |
| 01 June 2022 | Call for papers released |
| 15 June 2022 | Second call for papers |
| 29 June 2022 | Third and final call for papers |
| 13 July 2022 | Paper submissions due |
| 20 July 2022 | Paper submissions due (extended deadline) |
| 27 July 2022 | Notification of acceptance |
| 07 August 2022 | Camera-ready due |
| 31 August 2022 | Video recordings due |
| 16 September 2022 | CoCo4MT workshop |