WMT23

Eighth Conference on Machine Translation


Location

  • Singapore


Important Dates

   
Release of training data for shared tasks
Evaluation periods for shared tasks
Paper submission deadline 05 September
Paper notification 06 October
Camera-ready version 18 October
Conference 06 December

Shared Tasks

Scientific Papers

Topics

  • Machine translation models (neural, statistical, etc.)
  • Analysis of neural models
  • Using comparable corpora
  • Selection and preparation of data
  • Semi-supervised and unsupervised learning for machine translation, transfer learning
  • Multilingual machine translation
  • Incorporating linguistic information into machine translation
  • Machine translation inference
  • Manual and automatic methods for evaluating machine translation
  • Quality estimation

Research Papers

Research papers should describe original research corresponding to the categories listed above. Research papers that have been or will be submitted to other meetings or publications must indicate this at submission time, and must be withdrawn from the other venues if accepted and published at WMT 2023.
We will not accept for publication papers that overlap significantly in content or results with papers that have been or will be published elsewhere. It is acceptable to submit work that has been made available as a technical report (or similar, e.g. on arXiv) without citing it.
For the research track, papers should be anonymised, be between 6 and 10 pages in length (excluding references) and may include supplementary material.

System Papers

System papers must describe one or more shared task submissions. System paper submissions that we cannot link to a shared task submission will be rejected without review. System papers can overlap with other published work, and do not have to follow the double submission policy. There is no maximum length for system papers, but normally a short paper (4-6 pages) is appropriate. System papers should not be anonymised.

Paper Submission

Schedule

Day 1

   
8:45 Opening Remarks
9:00 Session 1: Shared Task Overview Papers I
9:00 Findings of the 2023 Conference on Machine Translation (WMT23): LLMs Are Here but Not Quite There Yet
Tom Kocmi, Eleftherios Avramidis, Rachel Bawden, Ondřej Bojar, Anton Dvorkovich, Christian Federmann, Mark Fishel, Markus Freitag, Thamme Gowda, Roman Grundkiewicz, Barry Haddow, Philipp Koehn, Benjamin Marie, Christof Monz, Makoto Morishita, Kenton Murray, Makoto Nagata, Toshiaki Nakazawa, Martin Popel, Maja Popović, Mariya Shmatova
9:30 Findings of the WMT 2023 Biomedical Translation Shared Task: Evaluation of ChatGPT 3.5 as a Comparison System
Mariana Neves, Antonio Jimeno Yepes, Aurélie Névéol, Rachel Bawden, Giorgio Maria Di Nunzio, Roland Roller, Philippe Thomas, Federica Vezzani, Maika Vicente Navarro, Lana Yeganova, Dina Wiemann, Cristian Grozea
9:45 Findings of the WMT 2023 Shared Task on Discourse-Level Literary Translation: A Fresh Orb in the Cosmos of LLMs
Longyue Wang, Zhaopeng Tu, Yan Gu, Siyou Liu, Dian Yu, Qingsong Ma, Chenyang Lyu, Liting Zhou, Chao-Hong Liu, Yufeng Ma, Weiyu Chen, Yvette Graham, Bonnie Webber, Philipp Koehn, Andy Way, Yulin Yuan, Shuming Shi
10:00 Findings of the Second WMT Shared Task on Sign Language Translation (WMT-SLT23)
Mathias Müller, Malihe Alikhani, Eleftherios Avramidis, Richard Bowden, Annelies Braffort, Necati Cihan Camgöz, Sarah Ebling, Cristina España-Bonet, Anne Göhring, Roman Grundkiewicz, Mert Inan, Zifan Jiang, Oscar Koller, Amit Moryossef, Annette Rios, Dimitar Shterionov, Sandra Sidler-Miserez, Katja Tissi, Davy Van Landuyt
10:15 Findings of the WMT 2023 Shared Task on Parallel Data Curation
Steve Sloto, Brian Thompson, Huda Khayrallah, Tobias Domhan, Thamme Gowda, Philipp Koehn
10:30 ☕️
11:00 Session 2: Shared Task System Description Posters I
General Translation Task
11:00 Samsung R&D Institute Philippines at WMT 2023
Jan Christian Blaise Cruz
11:00 NAIST-NICT WMT’23 General MT Task Submission
Hiroyuki Deguchi, Kenji Imamura, Yuto Nishida, Yusuke Sakai, Justin Vasselli, Taro Watanabe
11:00 CUNI at WMT23 General Translation Task: MT and a Genetic Algorithm
Josef Jon, Martin Popel, Ondřej Bojar
11:00 SKIM at WMT 2023 General Translation Task
Keito Kudo, Takumi Ito, Makoto Morishita, Jun Suzuki
11:00 KYB General Machine Translation Systems for WMT23
Ben LI, Yoko Matsuzaki, Shivam Kalkar
11:00 Yishu: Yishu at WMT2023 Translation Task
Luo Min, Yixin Tan, Qiulin Chen
11:00 PROMT Systems for WMT23 Shared General Translation Task
Alexander Molchanov, Vladislav Kovalenko
11:00 AIST AIRC Submissions to the WMT23 Shared Task
Matiss Rikters, Makoto Miwa
11:00 MUNI-NLP Submission for Czech-Ukrainian Translation Task at WMT23
Pavel Rychly, Yuliia Teslia
11:00 Exploring Prompt Engineering with GPT Language Models for Document-Level Machine Translation: Insights and Findings
Yangjian Wu, Gang Hu
11:00 Treating General MT Shared Task as a Multi-Domain Adaptation Problem: HW-TSC’s Submission to the WMT23 General MT Shared Task
Zhanglin Wu, Daimeng Wei, Zongyao Li, Zhengzhe YU, Shaojun Li, Xiaoyu Chen, Hengchao Shang, Jiaxin GUO, Yuhao Xie, Lizhi Lei, Hao Yang, Yanfei Jiang
11:00 UvA-MT’s Participation in the WMT 2023 General Translation Shared Task
Di Wu, Shaomu Tan, David Stap, Ali Araabi, Christof Monz
11:00 Achieving State-of-the-Art Multilingual Translation Model with Minimal Data and Parameters
Hui Zeng
11:00 IOL Research Machine Translation Systems for WMT23 General Machine Translation Shared Task
Wenbo Zhang
11:00 GTCOM and DLUT’s Neural Machine Translation Systems for WMT23
Hao Zong
Test Suites
11:00 RoCS-MT: Robustness Challenge Set for Machine Translation
Rachel Bawden, Benoît Sagot
11:00 Multifaceted Challenge Set for Evaluating Machine Translation Performance
Xiaoyu Chen, Daimeng Wei, Zhanglin Wu, Ting Zhu, Hengchao Shang, Zongyao Li, Jiaxin GUO, Ning Xie, Lizhi Lei, Hao Yang, Yanfei Jiang
11:00 Linguistically Motivated Evaluation of the 2023 State-of-the-art Machine Translation: Can ChatGPT Outperform NMT?
Shushen Manakhimova, Eleftherios Avramidis, Vivien Macketanz, Ekaterina Lapshinova-Koltunski, Sergei Bagdasarov, Sebastian Möller
11:00 IIIT HYD’s Submission for WMT23 Test-suite Task
Ananya Mukherjee, Manish Shrivastava
11:00 Test Suites Task: Evaluation of Gender Fairness in MT with MuST-SHE and INES
Beatrice Savoldi, Marco Gaido, Matteo Negri, Luisa Bentivogli
Biomedical Translation Task
11:00 Biomedical Parallel Sentence Retrieval Using Large Language Models
Sheema Firdous, Sadaf Abdul Rauf
11:00 The Path to Continuous Domain Adaptation Improvements by HW-TSC for the WMT23 Biomedical Translation Shared Task
Zhanglin Wu, Daimeng Wei, Zongyao Li, Zhengzhe YU, Shaojun Li, Xiaoyu Chen, Hengchao Shang, Jiaxin GUO, Yuhao Xie, Lizhi Lei, Hao Yang, Yanfei Jiang
11:00 Investigating Techniques for a Deeper Understanding of Neural Machine Translation (NMT) Systems through Data Filtering and Fine-tuning Strategies
Lichao Zhu, Maria Zimina, Maud Bénard, Behnoosh Namdar, Nicolas Ballier, Guillaume Wisniewski, Jean-Baptiste Yunès
Literary Translation Task
11:00 MAX-ISI System at WMT23 Discourse-Level Literary Translation Task
Li An, Linghao Jin, Xuezhe Ma
11:00 The MAKE-NMTVIZ System Description for the WMT23 Literary Task
Fabien Lopez, Gabriela González, Damien Hansen, Mariam Nakhle, Behnoosh Namdarzadeh, Nicolas Ballier, Marco Dinarelli, Emmanuelle Esperança-Rodier, Sui He, Sadaf Mohseni, Caroline Rossi, Didier Schwab, Jun Yang, Jean-Baptiste Yunès, Lichao Zhu
11:00 DUTNLP System for the WMT2023 Discourse-Level Literary Translation
Anqi Zhao, Kaiyu Huang, Hao Yu, Degen Huang
11:00 HW-TSC’s Submissions to the WMT23 Discourse-Level Literary Translation Shared Task
Yuhao Xie, Zongyao Li, Zhanglin Wu, Daimeng Wei, Xiaoyu Chen, Zhiqiang Rao, Shaojun Li, Hengchao Shang, Jiaxin GUO, Lizhi Lei, Hao Yang, Yanfei Jiang
11:00 TJUNLP:System Description for the WMT23 Literary Task in Chinese to English Translation Direction
Shaolin Zhu, Deyi Xiong
African Languages Translation Task
11:00 Machine Translation for Nko: Tools, Corpora, and Baseline Results
Moussa Doumbouya, Baba Mamadi Diané, Solo Farabado Cissé, Djibrila Diané, Abdoulaye Sow, Séré Moussa Doumbouya, Daouda Bangoura, Fodé Moriba Bayo, Ibrahima Sory 2. Conde, Kalo Mory Diané, Chris Piech, Christopher Manning
Sign Language Translation Task
11:00 TTIC’s Submission to WMT-SLT 23
Marcelo Sandoval-Castaneda, Yanhong Li, Bowen Shi, Diane Brentari, Karen Livescu, Gregory Shakhnarovich
11:00 KnowComp Submission for WMT23 Sign Language Translation Task
Baixuan Xu, Haochen Shi, Tianshi Zheng, Qing Zong, Weiqi Wang, Zhaowei Wang, Yangqiu Song
Parallel Data Curation Task
11:00 A Fast Method to Filter Noisy Parallel Data WMT2023 Shared Task on Parallel Data Curation
Nguyen-Hoang Minh-Cong, Nguyen Van Vinh, Nguyen Le-Minh
11:00 A Sentence Alignment Approach to Document Alignment and Multi-faceted Filtering for Curating Parallel Sentence Pairs from Web-crawled Data
Steinthor Steingrimsson
12:30 🍴
14:00 Session 3: Research Papers on Document-Level Translation and Use of Large Language Models
Chair: Nikolay Bogoychev
14:00 Document-Level Language Models for Machine Translation
Frithjof Petrick, Christian Herold, Pavel Petrushkov, Shahram Khadivi, Hermann Ney
14:15 ChatGPT MT: Competitive for High- (but Not Low-) Resource Languages
Nathaniel Robinson, Perez Ogayo, David R. Mortensen, Graham Neubig
14:30 Large Language Models Effectively Leverage Document-level Context for Literary Translation, but Critical Errors Persist
Marzena Karpinska, Mohit Iyyer
14:45 Identifying Context-Dependent Translations for Evaluation Set Production
Rachel Wicks, Matt Post
15:00 Machine Translation with Large Language Models: Prompting, Few-shot Learning, and Fine-tuning with QLoRA
Xuan Zhang, Navid Rajabi, Kevin Duh, Philipp Koehn
15:15 Towards Effective Disambiguation for Machine Translation with Large Language Models
Vivek Iyer, Pinzhen Chen, Alexandra Birch
15:30 ☕️
16:00 Session 4: Research Papers on Translation Modelling
Chair: Huda Khayrallah
16:00 A Closer Look at Transformer Attention for Multilingual Translation
Jingyi Zhang, Gerard de Melo, Hongfei Xu, Kehai Chen
16:15 Bridging the Gap between Position-Based and Content-Based Self-Attention for Neural Machine Translation
Felix Schmidt, Mattia Di Gangi
16:30 Visual Prediction Improves Zero-Shot Cross-Modal Machine Translation
Tosho Hirasawa, Emanuele Bugliarello, Desmond Elliott, Mamoru Komachi
16:45 The Gender-GAP Pipeline: A Gender-Aware Polyglot Pipeline for Gender Characterisation in 55 Languages
Benjamin Muller, Belen Alastruey, Prangthip Hansanti, Elahe Kalbassi, Christophe Ropers, Eric Smith, Adina Williams, Luke Zettlemoyer, Pierre Andrews, Marta R. Costa-jussà
17:00 Towards Better Evaluation for Formality-Controlled English-Japanese Machine Translation
Edison Marrese-Taylor, Pin Chen Wang, Yutaka Matsuo
17:15 There’s No Data like Better Data: Using QE Metrics for MT Data Filtering
Jan-Thorsten Peter, David Vilar, Daniel Deutsch, Mara Finkelstein, Juraj Juraska, Markus Freitag

Day 2

   
9:00 Session 5: Shared Task Overview Papers II
9:00 Results of WMT23 Metrics Shared Task: Metrics Might Be Guilty but References Are Not Innocent
Markus Freitag, Nitika Mathur, Chi-kiu Lo, Eleftherios Avramidis, Ricardo Rei, Brian Thompson, Tom Kocmi, Frederic Blain, Daniel Deutsch, Craig Stewart, Chrysoula Zerva, Sheila Castilho, Alon Lavie, George Foster
9:15 Findings of the WMT 2023 Shared Task on Quality Estimation
Frederic Blain, Chrysoula Zerva, Ricardo Rei, Nuno M. Guerreiro, Diptesh Kanojia, José G. C. de Souza, Beatriz Silva, Tânia Vaz, Yan Jingxuan, Fatemeh Azadi, Constantin Orasan, André Martins
9:30 Findings of the Word-Level AutoCompletion Shared Task in WMT 2023
Lemao Liu, Francisco Casacuberta, George Foster, Guoping Huang, Philipp Koehn, Geza Kovacs, Shuming Shi, Taro Watanabe, Chengqing Zong
9:45 Findings of the WMT 2023 Shared Task on Machine Translation with Terminologies
Kirill Semenov, Vilém Zouhar, Tom Kocmi, Dongdong Zhang, Wangchunshu Zhou, Yuchen Eleanor Jiang
10:00 Findings of the WMT 2023 Shared Task on Automatic Post-Editing
Pushpak Bhattacharyya, Rajen Chatterjee, Markus Freitag, Diptesh Kanojia, Matteo Negri, Marco Turchi
10:15 Findings of the WMT 2023 Shared Task on Low-Resource Indic Language Translation
Santanu Pal, Partha Pakray, Sahinur Rahman Laskar, Lenin Laitonjam, Vanlalmuansangi Khenglawt, Sunita Warjri, Pankaj Kundan Dadure, Sandeep Kumar Dash
10:30 ☕️
11:00 Session 6: Shared Task System Description Posters II
Metrics Task
11:00 ACES: Translation Accuracy Challenge Sets at WMT 2023
Chantal Amrhein, Nikita Moghe, Liane Guillou
11:00 Challenging the State-of-the-art Machine Translation Metrics from a Linguistic Perspective
Eleftherios Avramidis, Shushen Manakhimova, Vivien Macketanz, Sebastian Möller
11:00 Tokengram_F, a Fast and Accurate Token-based chrF++ Derivative
Sören DREANO, Derek Molloy, Noel Murphy
11:00 Embed_Llama: Using LLM Embeddings for the Metrics Shared Task
Sören DREANO, Derek Molloy, Noel Murphy
11:00 eBLEU: Unexpectedly Good Machine Translation Evaluation Using Simple Word Embeddings
Muhammad ElNokrashy, Tom Kocmi
11:00 Cometoid: Distilling Strong Reference-based Machine Translation Metrics into Even Stronger Quality Estimation Metrics
Thamme Gowda, Tom Kocmi, Marcin Junczys-Dowmunt
11:00 MetricX-23: The Google Submission to the WMT 2023 Metrics Shared Task
Juraj Juraska, Mara Finkelstein, Daniel Deutsch, Aditya Siddhant, Mehdi Mirzazadeh, Markus Freitag
11:00 GEMBA-MQM: Detecting Translation Quality Error Spans with GPT-4
Tom Kocmi, Christian Federmann
11:00 Metric Score Landscape Challenge (MSLC23): Understanding Metrics’ Performance on a Wider Landscape of Translation Quality
Chi-kiu Lo, Samuel Larkin, Rebecca Knowles
11:00 MEE4 and XLsim : IIIT HYD’s Submissions’ for WMT23 Metrics Shared Task
Ananya Mukherjee, Manish Shrivastava
11:00 Quality Estimation Using Minimum Bayes Risk
Subhajit Naskar, Daniel Deutsch, Markus Freitag
11:00 The SLIDE metric submission to the WMT 2023 metrics task
Vikas Raunak, Tom Kocmi, Matt Post
11:00 Semantically-Informed Regressive Encoder Score
Vasiliy Viskov, George Kokush, Daniil Larionov, Steffen Eger, Alexander Panchenko
11:00 Empowering a Metric with LLM-assisted Named Entity Annotation: HW-TSC’s Submission to the WMT23 Metrics Shared Task
Zhanglin Wu, Yilun Liu, Min Zhang, Xiaofeng Zhao, Junhao Zhu, Ming Zhu, Xiaosong Qiao, Jingfei Zhang, Ma Miaomiao, Zhao Yanqing, Song Peng, shimin tao, Hao Yang, Yanfei Jiang
Quality Estimation Task
11:00 Unify Word-level and Span-level Tasks: NJUNLP’s Participation for the WMT2023 Quality Estimation Shared Task
Xiang Geng, Zhejian Lai, Yu Zhang, shimin tao, Hao Yang, Jiajun CHEN, Shujian Huang
11:00 HW-TSC 2023 Submission for the Quality Estimation Shared Task
Yuang Li, Chang Su, Ming Zhu, Mengyao Piao, Xinglin Lyu, Min Zhang, Hao Yang
11:00 Scaling up CometKiwi: Unbabel-IST 2023 Submission for the Quality Estimation Shared Task
Ricardo Rei, Nuno M. Guerreiro, José Pombal, Daan van Stigt, Marcos Treviso, Luisa Coheur, José G. C. de Souza, André Martins
11:00 SurreyAI 2023 Submission for the Quality Estimation Shared Task
Archchana Sindhujan, Diptesh Kanojia, Constantin Orasan, Tharindu Ranasinghe
11:00 MMT’s Submission for the WMT 2023 Quality Estimation Shared Task
Yulong Wu, Viktor Schlegel, Daniel Beck, Riza Batista-Navarro
11:00 IOL Research’s Submission for WMT 2023 Quality Estimation Shared Task
Zeyu Yan
Word-Level Autocompletion Task
11:00 SJTU-MTLAB’s Submission to the WMT23 Word-Level Auto Completion Task
Xingyu Chen, Rui Wang
11:00 PRHLT’s Submission to WLAC 2023
Angel Navarro, Miguel Domingo, Francisco Casacuberta
11:00 KnowComp Submission for WMT23 Word-Level AutoCompletion Task
Yi Wu, Haochen Shi, Weiqi Wang, Yangqiu Song
Terminology Translation Task
11:00 Terminology-Aware Translation with Constrained Decoding and Large Language Model Prompting
Nikolay Bogoychev, Pinzhen Chen
11:00 Lingua Custodia’s Participation at the WMT 2023 Terminology Shared Task
Jingshu Liu, Mariam Nakhlé, Gaëtan Caillout, Raheel Qadar
11:00 Domain Terminology Integration into Machine Translation: Leveraging Large Language Models
Yasmin Moslem, Gianfranco Romani, Mahdi Molaei, John D. Kelleher, Rejwanul Haque, Andy Way
11:00 OPUS-CAT Terminology Systems for the WMT23 Terminology Shared Task
Tommi Nieminen
11:00 VARCO-MT: NCSOFT’s WMT’23 Terminology Shared Task Submission
Geon Woo Park, Junghwa Lee, Meiying Ren, Allison Shindell, Yeonsoo Lee
Automatic Post-Editing Task
11:00 HW-TSC’s Participation in the WMT 2023 Automatic Post Editing Shared Task
Jiawei Yu, Min Zhang, Zhao Yanqing, Xiaofeng Zhao, Yuang Li, Su Chang, Yinglu Li, Ma Miaomiao, shimin tao, Hao Yang
Indic Languages Translation Task
11:00 Neural Machine Translation for English - Manipuri and English - Assamese
Goutam Agrawal, Rituraj Das, Anupam Biswas, Dalton Meitei Thounaojam
11:00 GUIT-NLP’s Submission to Shared Task: Low Resource Indic Language Translation
Mazida Ahmed, Kuwali Talukdar, Parvez Boruah, Prof. Shikhar Kumar Sarma, Kishore Kashyap
11:00 NICT-AI4B’s Submission to the Indic MT Shared Task in WMT 2023
Raj Dabre, Jay Gala, Pranjal Chitale
11:00 Machine Translation Advancements for Low-Resource Indian Languages in WMT23: CFILT-IITB’s Effort for Bridging the Gap
Pranav Gaikwad, Meet Doshi, Sourabh Deoghare, Pushpak Bhattacharyya
11:00 Low-Resource Machine Translation Systems for Indic Languages
Ivana Kvapilíková, Ondřej Bojar
11:00 MUNI-NLP Systems for Low-resource Indic Machine Translation
Edoardo Signoroni, Pavel Rychly
11:00 NITS-CNLP Low-Resource Neural Machine Translation Systems of English-Manipuri Language Pair
Kshetrimayum Boynao Singh, Avichandra Singh Ningthoujam, Loitongbam Sanayai Meetei, Sivaji Bandyopadhyay, Thoudam Doren Singh
11:00 IACS-LRILT: Machine Translation for Low-Resource Indic Languages
Dhairya Suman, Atanu Mandal, Santanu Pal, Sudip Naskar
11:00 IOL Research Machine Translation Systems for WMT23 Low-Resource Indic Language Translation Shared Task
Wenbo Zhang
11:00 Session 7: Panel on Large Language Models and Machine Translation
Eleftheria Briakou (University of Maryland), Arul Menezes (Microsoft), Jose de Souza (Unbabel), moderated by Philipp Koehn (Johns Hopkins University)
11:00 Session 8: Research Papers on Evaluation
Chair: Kenton Murray
11:00 Trained MT Metrics Learn to Cope with Machine-translated References
Jannis Vamvas, Tobias Domhan, Sony Trenous, Rico Sennrich, Eva Hasler
11:00 Training and Meta-Evaluating Machine Translation Evaluation Metrics at the Paragraph Level
Daniel Deutsch, Juraj Juraska, Mara Finkelstein, Markus Freitag
11:00 Automating Behavioral Testing in Machine Translation
Javier Ferrando, Matthias Sperber, Hendra Setiawan, Dominic Telaar, Saša Hasan
11:00 One Wide Feedforward Is All You Need
Telmo Pires, António Vilarinho Lopes, Yannick Assogba, Hendra Setiawan
11:00 A Benchmark for Evaluating Machine Translation Metrics on Dialects without Standard Orthography
Noëmi Aepli, Chantal Amrhein, Florian Schottmann, Rico Sennrich
11:00 The Devil Is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation
Patrick Fernandes, Daniel Deutsch, Mara Finkelstein, Parker Riley, André Martins, Graham Neubig, Ankush Garg, Jonathan Clark, Markus Freitag, Orhan Firat

Results

General Task

Full results of the shared task: Findings of the 2023 Conference on Machine Translation (WMT23): LLMs Are Here but Not Quite There Yet

The winning systems are listed below, ranked by their average score.

→ English

Language pair | System | Average score | Average z-score
German → | GPT4-5shot | 90.3 |
Chinese → | Lan-BridgeMT | 82.9 |
Japanese → | GPT4-5shot | 81.3 |

English →

Language pair | System | Average score | Average z-score
→ German | GPT4-5shot | 89.0 |
→ Czech | ONLINE-W | 84.1 |
→ Chinese | Yishu | 82.2 |
→ Japanese | GPT4-5shot | 79.5 |

Czech → Ukrainian

Language pair | System | Average score | Average z-score
Czech → Ukrainian | ONLINE-B | 83.7 |

The results were determined by bilingual direct assessment combined with scalar quality metrics (DA+SQM), using document context.
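The tables above have columns for an average score and an average z-score. As a rough illustration of how those two quantities relate, the sketch below averages raw ratings per system and, separately, standardises each rating against its annotator's own mean and standard deviation before averaging. The function names, the per-annotator standardisation and the toy ratings are all assumptions made for this example; they are not the organisers' evaluation code or data.

```python
from collections import defaultdict
from statistics import mean, pstdev


def average_scores(ratings):
    """Average raw score per system from (annotator, system, score) tuples."""
    by_system = defaultdict(list)
    for _annotator, system, score in ratings:
        by_system[system].append(score)
    return {system: mean(scores) for system, scores in by_system.items()}


def average_z_scores(ratings):
    """Average z-score per system, standardising per annotator (assumed scheme)."""
    by_annotator = defaultdict(list)
    for annotator, _system, score in ratings:
        by_annotator[annotator].append(score)
    stats = {a: (mean(s), pstdev(s)) for a, s in by_annotator.items()}

    by_system = defaultdict(list)
    for annotator, system, score in ratings:
        mu, sigma = stats[annotator]
        # An annotator who always gives the same score carries no signal.
        z = (score - mu) / sigma if sigma > 0 else 0.0
        by_system[system].append(z)
    return {system: mean(zs) for system, zs in by_system.items()}


def rank(scores):
    """System names ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy ratings, not WMT23 data.
ratings = [
    ("ann1", "sysA", 90), ("ann1", "sysB", 70),
    ("ann2", "sysA", 60), ("ann2", "sysB", 40),
]

avg = average_scores(ratings)
avg_z = average_z_scores(ratings)
ranking = rank(avg)
```

In this toy data the two annotators use different parts of the scale, yet both prefer sysA by the same margin. The z-scores make that agreement explicit, while the raw averages still determine the ranking order used in the tables.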



Licensed under CC-BY-SA-4.0.
