Publications

Scientific publications about machine translation

Table of contents

Neural machine translation
Statistical machine translation
Metrics
1. Quality evaluation
  1. Similarity-based metrics
    1. n-gram matching metrics
    2. Embedding-based metrics
  2. Learnable metrics
2. Quality estimation

Neural machine translation


2017	*Neural Machine Translation*	Philipp Koehn

Model architecture


2017	*Attention Is All You Need*	Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin
2017	*Convolutional Sequence to Sequence Learning*	Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, Yann N. Dauphin
2016	*Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation*	Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Łukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, Jeffrey Dean
2016	*Modeling Coverage for Neural Machine Translation*	Zhaopeng Tu, Zhengdong Lu, Yang Li, Xiaohua Liu, Hang Li
2015	*Effective Approaches to Attention-based Neural Machine Translation*	Minh-Thang Luong, Hieu Pham, Christopher D. Manning
2015	*Neural Machine Translation by Jointly Learning to Align and Translate*	Dzmitry Bahdanau, KyungHyun Cho, Yoshua Bengio
2014	*On the Properties of Neural Machine Translation: Encoder-Decoder Approaches*	Kyunghyun Cho, Bart van Merriënboer, Dzmitry Bahdanau, Yoshua Bengio
2014	*Sequence to Sequence Learning with Neural Networks*	Ilya Sutskever, Oriol Vinyals, Quoc V. Le
2007	*Large language models in machine translation*	Thorsten Brants, Ashok Popat, Peng Xu, Franz Josef Och, Jeffrey Dean

Pre-training


2019	*Towards Making the Most of BERT in Neural Machine Translation*	Jiacheng Yang, Mingxuan Wang, Hao Zhou, Chengqi Zhao, Yong Yu, Weinan Zhang, Lei Li
2019	*On the use of BERT for Neural Machine Translation*	Stephane Clinchant, Kweon Woo Jung, Vassilina Nikoulina
2018	*BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding*	Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova

Attention mechanism


2017	*A Structured Self-attentive Sentence Embedding*	Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, Yoshua Bengio
2016	*A Decomposable Attention Model for Natural Language Inference*	Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit
2015	*Effective Approaches to Attention-based Neural Machine Translation*	Minh-Thang Luong, Hieu Pham, Christopher D. Manning

Low-resource translation


2018	*Universal Neural Machine Translation for Extremely Low Resource Languages*	Jiatao Gu, Hany Hassan, Jacob Devlin, Victor O.K. Li
2016	*Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation*	Melvin Johnson, Mike Schuster, Quoc V. Le, Maxim Krikun, Yonghui Wu, Zhifeng Chen, Nikhil Thorat, Fernanda Viégas, Martin Wattenberg, Greg Corrado, Macduff Hughes, Jeffrey Dean
2016	*Improving Neural Machine Translation Models with Monolingual Data*	Rico Sennrich, Barry Haddow, Alexandra Birch
2012	*On the difficulty of training Recurrent Neural Networks*	Razvan Pascanu, Tomas Mikolov, Yoshua Bengio

Open vocabulary


2018	*Subword regularization: Improving neural network translation models with multiple subword candidates*	Taku Kudo
2016	*Neural Machine Translation of Rare Words with Subword Units*	Rico Sennrich, Barry Haddow, Alexandra Birch

Statistical machine translation


2007	*Hierarchical Phrase-Based Translation*	David Chiang
2003	*Statistical Phrase-Based Translation*	Philipp Koehn, Franz Josef Och, Daniel Marcu

Metrics

Machine translation metrics automatically assess the quality of machine translation output. There are two types of metrics: quality evaluation and quality estimation.

Quality evaluation metrics rely on a human reference translation.
Quality estimation metrics do not rely on a human reference translation.

Quality evaluation

Similarity-based metrics

These metrics measure the similarity between the machine translation and a reference translation. The similarity is computed in one of two ways:

n-gram matching-based similarity
embedding-based similarity

n-gram matching metrics

These metrics evaluate similarity based on hand-crafted features and rules. For example, a metric can count the number and fraction of n-grams that appear in both the machine translation and the human translation.
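The counting described above can be sketched in a few lines. This is a minimal illustration, not the definition of any metric listed below: the function names `ngrams` and `ngram_overlap` are hypothetical, tokenisation is assumed to be a plain whitespace split, and the example sentences are invented.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_overlap(hypothesis, reference, n=2):
    """Return (matches, precision, recall) for n-grams shared by both texts."""
    hyp = ngrams(hypothesis.split(), n)
    ref = ngrams(reference.split(), n)
    # Counter intersection keeps, for each n-gram, the smaller of the two counts,
    # so a repeated n-gram is only credited as often as it occurs on both sides.
    matches = sum((hyp & ref).values())
    precision = matches / sum(hyp.values()) if hyp else 0.0
    recall = matches / sum(ref.values()) if ref else 0.0
    return matches, precision, recall


# Invented example sentences: machine translation vs. human translation.
matches, precision, recall = ngram_overlap(
    "the cat sat on the mat", "the cat is on the mat", n=2
)
print(matches, precision, recall)
```

Published metrics build on this kind of count with further design choices, such as combining several n-gram orders or penalising short output, which this sketch leaves out.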


2015	*chrF: character n-gram F-score for automatic MT evaluation*	Maja Popović
2014	*METEOR Universal: Language Specific Translation Evaluation for Any Target Language*	Michael Denkowski, Alon Lavie
2006	*A study of Translation Edit Rate with Targeted Human Annotation*	Matthew Snover, Bonnie Dorr, Rich Schwartz, Linnea Micciulla, John Makhoul
2005	*METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments*	Satanjeev Banerjee, Alon Lavie
2004	*ROUGE: A Package for Automatic Evaluation of Summaries*	Chin-Yew Lin
2002	*Automatic Evaluation of Machine Translation Quality Using N-gram Co-Occurrence Statistics*	George Doddington
2002	*BLEU: a Method for Automatic Evaluation of Machine Translation*	Kishore Papineni, Salim Roukos, Todd Ward, Wei-Jing Zhu

Embedding-based metrics

These metrics use word embeddings instead of n-gram matching, so that they can capture semantic similarity between words that do not match exactly.
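As a rough sketch of the idea, the example below scores two sentences by greedily matching each word to its most similar word on the other side, using cosine similarity between vectors. Everything here is an assumption for illustration: the three-dimensional `TOY_EMBEDDINGS` are invented, and the function names are hypothetical. Real metrics of this kind take their vectors from pretrained models rather than a hand-written table.

```python
import math

# Invented toy vectors, for illustration only.
TOY_EMBEDDINGS = {
    "the": (0.0, 0.0, 1.0),
    "cat": (1.0, 0.0, 0.0),
    "kitten": (0.8, 0.6, 0.0),
    "sleeps": (0.0, 1.0, 0.0),
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def greedy_similarity(hypothesis, reference, embeddings):
    """F-score over greedy best-match cosine similarities (unknown words are skipped)."""
    hyp = [embeddings[t] for t in hypothesis.split() if t in embeddings]
    ref = [embeddings[t] for t in reference.split() if t in embeddings]
    if not hyp or not ref:
        return 0.0
    # Precision: how well each hypothesis word is covered by some reference word.
    precision = sum(max(cosine(h, r) for r in ref) for h in hyp) / len(hyp)
    # Recall: how well each reference word is covered by some hypothesis word.
    recall = sum(max(cosine(r, h) for h in hyp) for r in ref) / len(ref)
    total = precision + recall
    return 2 * precision * recall / total if total > 0 else 0.0


score = greedy_similarity("the kitten sleeps", "the cat sleeps", TOY_EMBEDDINGS)
print(round(score, 3))
```

With exact word matching, "kitten" and "cat" would count as a plain mismatch. Here the pair still contributes partial credit because the toy vectors were chosen to be close.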


2020	*BERTScore: Evaluating Text Generation with BERT*	Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, Yoav Artzi
2019	*MoverScore: Text generation evaluating with contextualized embeddings and earth mover distance*	Wei Zhao, Maxime Peyrard, Fei Liu, Yang Gao, Christian M. Meyer, Steffen Eger

Learnable metrics

Learnable metrics are trained on human quality judgments, so that their scores directly optimise correlation with those judgments.
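A very small sketch of the training idea follows. It is an assumption-laden stand-in, not how the metrics listed below are built: it fits a linear model over two hand-picked features by gradient descent on squared error, used here as a simple proxy for agreement with human scores, whereas the cited metrics train neural models. The function name `train_metric`, the feature choice, and all numbers are invented for illustration.

```python
def train_metric(feature_rows, human_scores, lr=0.1, epochs=5000):
    """Fit weights and a bias so that the weighted features approximate human scores."""
    n = len(feature_rows)
    weights = [0.0] * len(feature_rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for row, target in zip(feature_rows, human_scores):
            error = sum(w * x for w, x in zip(weights, row)) + bias - target
            # Gradient of the mean squared error with respect to each parameter.
            for j, x in enumerate(row):
                grad_w[j] += 2 * error * x / n
            grad_b += 2 * error / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


# Invented toy data: each row is (n-gram overlap, embedding similarity) for one
# translation, paired with an invented human score. The scores were constructed
# as an exact linear blend of the two features so the fit is easy to inspect.
features = [(0.2, 0.5), (0.9, 0.9), (0.4, 0.8), (0.7, 0.3), (0.1, 0.1)]
human = [0.41, 0.90, 0.68, 0.42, 0.10]

weights, bias = train_metric(features, human)


def learned_metric(row):
    """Score one translation's feature row with the fitted parameters."""
    return sum(w * x for w, x in zip(weights, row)) + bias


print([round(w, 2) for w in weights], round(bias, 2))
```

The point of the sketch is only the workflow: the scoring function is not fixed in advance, and its parameters are chosen from examples of human judgments.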


2020	*COMET: A Neural Framework for MT Evaluation*	Ricardo Rei, Craig Stewart, Ana C. Farinha, Alon Lavie
2020	*BLEURT: Learning Robust Metrics for Text Generation*	Thibault Sellam, Dipanjan Das, Ankur P. Parikh

Quality estimation


2020	*COMET: A Neural Framework for MT Evaluation*	Ricardo Rei, Craig Stewart, Ana C. Farinha, Alon Lavie