CHaRacter-level F-score

chrF (CHaRacter-level F-score) is a metric for machine translation evaluation that measures the similarity between a machine translation output and a reference translation using character n-grams rather than word n-grams.

Metrics based on word n-grams are especially problematic for morphologically rich languages, where a correct word in a different inflected form counts as a complete mismatch.

chrF was introduced in 2015 by Maja Popović.

The chrF metric compares the machine translation output with reference translations at the level of character sequences. Because different forms of a word share most of their characters, matching character sequences gives partial credit when a word appears in a different form.
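The comparison described above can be illustrated with a minimal sketch. It is not the reference implementation: it assumes a single reference, drops whitespace before extracting n-grams, averages precision and recall over character n-gram orders 1 to 6, and skips any order for which either string is too short. The function names `char_ngrams` and `chrf` are hypothetical, chosen for this example. Setting `beta=3.0` weights recall more heavily than precision, which corresponds to the chrF3 variant.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count the character n-grams of order n (whitespace removed: a simplification)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf(hypothesis, reference, max_n=6, beta=3.0):
    """Simplified sentence-level chrF against one reference, in the range 0..1."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            # Assumption of this sketch: skip orders longer than either string.
            continue
        # Clipped overlap: each n-gram matches at most as often as it occurs in both.
        overlap = sum((hyp & ref).values())
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta score: beta > 1 gives recall more weight than precision.
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


if __name__ == "__main__":
    # A differently inflected word still earns partial credit.
    print(chrf("the houses", "the house"))
    print(chrf("the dog", "the house"))
```

In this sketch, a hypothesis that differs from the reference only by an inflectional ending keeps most of its character n-gram matches, whereas a word-level comparison would treat the inflected word as a full mismatch.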

chrF is language-independent and tokenisation-independent, and it shows good correlations with human judgments at both the system level and the segment level, especially the chrF3 variant.

chrF: character n-gram F-score for automatic MT evaluation


  • chrF3
  • chrF++

Note: The list is incomplete.



Licensed under CC-BY-SA-4.0.
