Evaluation metric using embeddings

COMET (Crosslingual Optimised Metric for Evaluation of Translation) is a metric for automatic evaluation of machine translation. It scores a machine translation output against a reference translation by comparing their token or sentence embeddings, so its judgements are based on similarity in a shared vector space rather than on surface word overlap.
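The embedding-similarity idea can be sketched with a toy cosine-similarity computation. This is a minimal illustration of comparing vector representations, not COMET's actual scoring; the vectors below are made-up stand-ins for sentence embeddings:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical sentence embeddings for an MT output and a reference.
mt_embedding = [0.9, 0.1, 0.4]
ref_embedding = [0.8, 0.2, 0.5]

score = cosine_similarity(mt_embedding, ref_embedding)
```

A score close to 1.0 indicates that the two sentences point in nearly the same direction in embedding space, i.e. that they are semantically similar even if they share few exact words.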

"Traditionally only QE models have made use of the source input, whereas MT evaluation metrics rely instead on the reference translation. […], we show that using a multilingual embedding space allows us to leverage information from all three inputs and demonstrate the value added by the source as input to our MT evaluation models."

From COMET: A Neural Framework for MT Evaluation

Unlike BERTScore, COMET is a trained metric: it learns to predict different types of human judgement, such as post-editing effort, direct assessment scores, or translation error analysis.
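How a learned metric can use source, hypothesis, and reference together might be sketched as follows. This is a simplified illustration under assumed feature choices, not COMET's real architecture: the three embeddings are combined into one feature vector, which a regression head (here untrained, with placeholder weights) maps to a quality score that would in practice be fitted to human judgements:

```python
def combine_features(src, mt, ref):
    """Combine source, MT, and reference embeddings into one feature vector.

    Element-wise products and absolute differences are a common choice for
    sentence-pair regression (an assumption here, not COMET's exact recipe).
    """
    prod_ref = [m * r for m, r in zip(mt, ref)]
    diff_ref = [abs(m - r) for m, r in zip(mt, ref)]
    prod_src = [m * s for m, s in zip(mt, src)]
    diff_src = [abs(m - s) for m, s in zip(mt, src)]
    return mt + ref + prod_ref + diff_ref + prod_src + diff_src

def predict_quality(features, weights, bias):
    """A hypothetical linear regression head; in a trained metric the weights
    would be learned from human judgements such as direct assessment scores."""
    return bias + sum(w * f for w, f in zip(weights, features))

src = [0.2, 0.7]   # made-up source-sentence embedding
mt = [0.6, 0.3]    # made-up MT-output embedding
ref = [0.5, 0.4]   # made-up reference embedding

feats = combine_features(src, mt, ref)
weights = [0.1] * len(feats)  # placeholder weights; real ones are learned
score = predict_quality(feats, weights, 0.0)
```

Because the source sentence contributes its own features, such a model can pick up on adequacy errors that a reference-only comparison would miss.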



Licensed under CC-BY-SA-4.0.