Training data

Training data for machine translation


Training data is the data used to train a model. In other words, the training algorithm sets the model parameters to fit the training data.
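As a minimal sketch of what "fitting parameters to the training data" means, the toy example below adjusts a single parameter by gradient descent until the model reproduces the training examples. The function name `fit_slope`, the one-parameter linear model, and the data are illustrative assumptions, not part of any machine translation system; real translation models have many more parameters, but the principle is the same.

```python
def fit_slope(pairs, learning_rate=0.01, steps=500):
    """Fit y ≈ w * x by gradient descent on the mean squared error."""
    w = 0.0  # the single model parameter, before training
    n = len(pairs)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in pairs) / n
        w -= learning_rate * grad
    return w


# Toy training data: each example is an (input, expected output) pair.
training_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = fit_slope(training_data)
print(w)
```

After training, `w` is very close to 2, the value that best explains the examples.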

The most common type of training data in machine translation is parallel text data, that is, sentences in one language paired with their translations in another. Monolingual text data is also often used, for instance to train language models or to generate synthetic parallel data by back-translation.
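The following sketch illustrates, under stated assumptions, how back-translation turns monolingual data into synthetic parallel data. The names `back_translate`, `toy_reverse_translate`, and `TOY_LEXICON` are hypothetical, and the word-by-word lookup only stands in for a real target-to-source translation model.

```python
from typing import Callable, List, Tuple

SentencePair = Tuple[str, str]  # (source sentence, target sentence)


def back_translate(
    target_sentences: List[str],
    reverse_translate: Callable[[str], str],
) -> List[SentencePair]:
    """Pair each real target sentence with a machine-generated source sentence."""
    return [(reverse_translate(t), t) for t in target_sentences]


# Hypothetical stand-in for a German-to-English model: word-by-word lookup.
TOY_LEXICON = {"hallo": "hello", "welt": "world"}


def toy_reverse_translate(sentence: str) -> str:
    tokens = sentence.lower().split()
    return " ".join(TOY_LEXICON.get(tok, tok) for tok in tokens)


# Genuine parallel data (English source, German target).
parallel_data = [("good morning", "guten morgen")]

# Monolingual data in the target language only.
monolingual_target = ["Hallo Welt"]

# Synthetic pairs: machine-generated source side, human-written target side.
synthetic = back_translate(monolingual_target, toy_reverse_translate)
augmented = parallel_data + synthetic
print(augmented)
```

The target side of each synthetic pair is real text, so a source-to-target model trained on the augmented data still learns to produce genuine target-language sentences even though the source side is machine-generated.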



Licensed under CC-BY-SA-4.0.
