Training data

Training data for machine translation


Training data is the data used to train a model. In other words, the training algorithm sets the model parameters to fit the training data.

The most common type of training data in machine translation is parallel text data, that is, source sentences paired with their translations. Monolingual text data is also often used, for instance to train language models or, through back-translation, to generate synthetic parallel data.
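As a rough illustration of how back-translation turns monolingual text into synthetic parallel data, here is a minimal Python sketch. The names `back_translate`, `toy_reverse_translate`, and `TOY_LEXICON` are hypothetical and exist only for this example. The word-by-word lookup is a toy stand-in for a trained target-to-source translation model, and the sentences are made-up sample data.

```python
from typing import Callable, List, Tuple

# A parallel corpus is represented here as (source, target) sentence pairs.
ParallelCorpus = List[Tuple[str, str]]


def back_translate(
    target_monolingual: List[str],
    reverse_translate: Callable[[str], str],
) -> ParallelCorpus:
    """Pair each genuine target sentence with a machine-generated source sentence."""
    return [(reverse_translate(sentence), sentence) for sentence in target_monolingual]


# Assumption: a toy German-to-English lexicon standing in for a real reverse model.
TOY_LEXICON = {"hallo": "hello", "welt": "world", "guten": "good", "morgen": "morning"}


def toy_reverse_translate(sentence: str) -> str:
    """Translate word by word, leaving unknown words unchanged."""
    return " ".join(TOY_LEXICON.get(word, word) for word in sentence.lower().split())


# Genuine parallel data for an English-to-German system (made-up sample).
parallel: ParallelCorpus = [("hello world", "hallo welt")]

# Monolingual German text with no human translation available.
monolingual_de = ["guten morgen", "hallo morgen"]

# Synthetic pairs: machine-generated English source, genuine German target.
synthetic = back_translate(monolingual_de, toy_reverse_translate)

# The combined set is what the forward (English-to-German) model would train on.
training_data = parallel + synthetic
```

In this setup the target side of every synthetic pair is genuine text, while only the source side is machine-generated, so any translation errors end up in the input rather than in the output the model learns to produce.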



Licensed under CC-BY-SA-4.0.
