Publications

Scientific publications about machine translation.

Table of contents
- Neural machine translation
  - Model architecture
  - Pre-training
  - Attention mechanism
  - Low-resource translation
  - Open vocabulary
- Statistical machine translation
- Metrics
  - Quality evaluation
    - Similarity-based metrics
      - n-gram matching metrics
      - Embedding-based metrics
    - Learnable metrics
  - Quality estimation

Model architecture
- 2017: Attention Is All You Need. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
- 2017: Convolutional Sequence to Sequence Learning. Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, Yann N. Dauphin
- 2016: Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Łukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, Jeffrey Dean
- 2016: Modeling Coverage for Neural Machine Translation. Zhaopeng Tu, Zhengdong Lu, Yang Li, Xiaohua Liu, Hang Li
- 2015: Effective Approaches to Attention-based Neural Machine Translation. Minh-Thang Luong, Hieu Pham, Christopher D. Manning
- 2015: Neural Machine Translation by Jointly Learning to Align and Translate. Dzmitry Bahdanau, KyungHyun Cho, Yoshua Bengio
- 2014: On the Properties of Neural Machine Translation: Encoder-Decoder Approaches. Kyunghyun Cho, Bart van Merriënboer, Dzmitry Bahdanau, Yoshua Bengio
- 2014: Sequence to Sequence Learning with Neural Networks. Ilya Sutskever, Oriol Vinyals, Quoc V. Le
- 2007: Large Language Models in Machine Translation. Thorsten Brants, Ashok Popat, Peng Xu, Franz Josef Och, Jeffrey Dean

Pre-training
- 2019: Towards Making the Most of BERT in Neural Machine Translation. Jiacheng Yang, Mingxuan Wang, Hao Zhou, Chengqi Zhao, Yong Yu, Weinan Zhang, Lei Li
- 2019: On the Use of BERT for Neural Machine Translation. Stephane Clinchant, Kweon Woo Jung, Vassilina Nikoulina
- 2018: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova

Attention mechanism
- 2017: A Structured Self-attentive Sentence Embedding. Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, Yoshua Bengio
- 2016: A Decomposable Attention Model for Natural Language Inference. Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit
- 2015: Effective Approaches to Attention-based Neural Machine Translation. Thang Luong, Hieu Pham, Christopher D. Manning

Low-resource translation
- 2018: Universal Neural Machine Translation for Extremely Low Resource Languages. Jiatao Gu, Hany Hassan, Jacob Devlin, Victor O.K. Li
- 2016: Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation. Melvin Johnson, Mike Schuster, Quoc V. Le, Maxim Krikun, Yonghui Wu, Zhifeng Chen, Nikhil Thorat, Fernanda Viégas, Martin Wattenberg, Greg Corrado, Macduff Hughes, Jeffrey Dean
- 2016: Improving Neural Machine Translation Models with Monolingual Data. Rico Sennrich, Barry Haddow, Alexandra Birch
- 2012: On the Difficulty of Training Recurrent Neural Networks. Razvan Pascanu, Tomas Mikolov, Yoshua Bengio

Open vocabulary

Metrics
Machine translation metrics automatically assess the quality of machine translation output. There are two types of metrics:
quality evaluation and quality estimation.

Quality evaluation

Similarity-based metrics
These metrics evaluate the similarity between a machine translation and a reference translation. This similarity can be measured in two ways:
- n-gram matching-based similarity
- embedding-based similarity

n-gram matching metrics
These metrics evaluate similarity based on hand-crafted features and rules. For example, a metric can count the number and fraction of n-grams that appear in both the machine translation and the reference translation.
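The n-gram counting idea can be sketched in a few lines. This is a minimal illustration of clipped n-gram precision (the core quantity behind metrics such as BLEU), with illustrative function names; real metrics add tokenization, multiple n-gram orders, and a brevity penalty.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_precision(hypothesis, reference, n=2):
    """Fraction of hypothesis n-grams that also occur in the reference.
    Counts are clipped so a repeated n-gram is not over-credited."""
    hyp_counts = Counter(ngrams(hypothesis.split(), n))
    ref_counts = Counter(ngrams(reference.split(), n))
    if not hyp_counts:
        return 0.0
    matched = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
    return matched / sum(hyp_counts.values())

print(ngram_precision("the cat sat on the mat",
                      "the cat is on the mat", n=2))  # → 0.6
```

Here 3 of the 5 hypothesis bigrams ("the cat", "on the", "the mat") also occur in the reference, giving a bigram precision of 0.6.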
Embedding-based metrics

These metrics use word embeddings as an alternative to n-gram matching to capture semantic similarity between words.
Learnable metrics

Learnable metrics directly optimise correlation with human judgments.
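In its simplest form, a learnable metric is a regression from translation features to human scores. The sketch below fits a linear model with least squares on made-up feature vectors and scores; real learnable metrics (e.g. COMET, BLEURT) train neural encoders on large human-annotated datasets instead.

```python
import numpy as np

# Hypothetical training data: per-sentence features
# [1-gram precision, 2-gram precision, length ratio]
# paired with the human quality score (0-1) each translation received.
features = np.array([
    [0.9, 0.7, 1.00],
    [0.6, 0.3, 0.80],
    [0.4, 0.1, 1.30],
    [0.8, 0.6, 0.95],
])
human_scores = np.array([0.92, 0.55, 0.30, 0.85])

# Least-squares fit of a linear metric: score ≈ features @ w + b.
X = np.hstack([features, np.ones((len(features), 1))])  # bias column
w, *_ = np.linalg.lstsq(X, human_scores, rcond=None)

def learned_metric(feature_vector):
    """Predict a human-like quality score from surface features."""
    return float(np.append(feature_vector, 1.0) @ w)

print(learned_metric([0.85, 0.65, 1.0]))
```

Fitting against human scores is what distinguishes this family: the metric's parameters are chosen to maximise agreement with human judgments rather than fixed by hand-crafted rules.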