AMTA 2022

Conference of the Association for Machine Translation in the Americas

AMTA 2022 offered parallel sessions, demonstrations from machine translation providers, tutorials for both beginners and more experienced practitioners, and workshops.
Students interested in MT and computational linguistics were able to connect with academic and industry mentors.

Location

Florida, United States of America

Important Dates


Track presentation proposals submission deadline	13 June
Tutorial proposals submission deadline	13 June
Workshop proposals submission deadline	30 May
Notification of acceptance	18 July
Final “camera-ready” versions	08 August

Keynote speakers

Marco Trombetti, CEO of Translated
Angela Fan, research scientist at Meta AI Research
Dr. Alex Waibel, professor of Computer Science at Carnegie Mellon University and the Karlsruhe Institute of Technology

Schedule

Day 1


7:30	Registration
9:00	Introduction to MT Jay Marciano | Tutorial | Majestic Palm D
9:00	Empirical TPR: past, present, and future Michael Carl, Masaru Yamada, Longhui Zou
9:30	Differentiated measurements for fatigue and demotivation/amotivation in translation - lessons learnt from fatigue and motivation studies Junyi Mao
10:00	Investigating the Impact of Different Pivot Languages on Translation Quality Longhui Zou, Ali Saeedi, Michael Carl
10:30	☕️
11:00	Predicting the Number of Errors in Human Translation using Source Text and Translator Characteristics Haruka Ogawa
11:30	The impact of translation competence on error recognition of neural MT Moritz J Schaeffer
12:30	🍴
13:30	AutoML for Neural Machine Translation Presented by Kevin Duh
13:30	Executive Roundtable Chaired by Konstantin Dranch
14:00	Syntactic Cross and Reading Effort in English to Japanese Translation Takanori Mizowaki, Haruka Ogawa, Masaru Yamada
14:30	Proficiency and External Aides: Impact of Translation Brief and Search Conditions on Post-editing Quality Longhui Zou, Michael Carl, Masaru Yamada, Takanori Mizowaki
15:00	☕️
15:30	Entropy as a measurement of cognitive load in translation Yuxiang Wei
16:00	Invited Talk: The graphical brain and deep inference Karl Friston

Day 2


7:30	Registration
8:00	Virtual Networking Session
8:00	Virtual Student Mentoring Session
9:00	Path to singularity Marco Trombetti
10:00	Recent Advances in Dynamically Adapted MT Moderator: Alon Lavie Panelists: Marco Trombetti, Watson Srivathsan, Joern Wuebker, Jose Souza
11:00	☕️
11:00	Exhibits
11:30	Machine Translation as a Prototype for Advanced AI Deployment in Government Kathryn Baker
11:30	Building MT for Software Product Descriptions Using Domain-specific Sub-corpora Extraction Pintu Lohar, Sinead Madden, Edmond O’Connor, Maja Popovic, Tanya Habruseva
11:30	PEMT human evaluation at 100x scale with risk-driven sampling Kirill Soloviev
11:50	A Proposed User Study on MT-Enabled Scanning Marianna Martindale, Marine Carpuat
11:50	Domain-Specific Text Generation for Machine Translation Yasmin Moslem, Rejwanul Haque, John Kelleher, Andy Way
11:50	Picking Out The Best MT Model: On The Methodology Of Human Evaluation Stepan Korotaev, Andrey Ryabchikov
12:10	You've translated it, now what? Michael Maxwell, Shabnam Tafreshi, Aquia Richburg, Balaji Kodali, Kymani Brown
12:10	Strategies for Adapting Multilingual Pre-training for Domain-Specific Machine Translation Neha Verma, Kenton Murray, Kevin Duh
12:10	Post-editing of Machine-Translated Patents: High Tech with High Stakes Aaron Hebenstreit
12:30	🍴
12:30	Exhibits
12:30	Onsite Student Mentoring Session
12:30	Virtual Student Mentoring Session
13:30	Prefix Embeddings for In-context Machine Translation Suzanna Sia, Kevin Duh
13:30	State of the Machine Translation 2022 Konstantin Savenkov, Michel Lopez
13:30	Automatic Post-Editing of MT Output Using Large Language Models Blanca Vidal, Albert Llorens, Juan Alonso
13:50	Fast Vocabulary Projection Method via Clustering for Multilingual Machine Translation on GPU Hossam Amer, Mohamed Afify, Young Jin Kim, Hitokazu Matsushita, Hany Hassan
13:50	The Translation Impact on Global CX Kirti Vashee
13:50	Improving Consistency of Human and Machine Translations Silvio Picinini
14:10	Language Tokens: Simply Improving Zero-Shot Multi-Aligned Translation in Encoder-Decoder Models Muhammad N ElNokrashy, Amr Hendy, Mohamed Maher, Mohamed Afify, Hany Hassan
14:10	Machine Assistance in the Real World Dave Bryant
14:10	Improve Machine Translation for Cross-Lingual Search in E-Commerce Hang Zhang
14:30	☕️
14:30	Exhibits
15:00	Pangeanic
15:20	Amazon Web Services
15:40	Welocalize
17:00	Virtual Networking Session

Day 3


7:30	Registration
8:00	Virtual Networking Session
9:00	Uplifting Singapore’s translation standards with the community through technology Lee Siew Li, Adeline Sim, Gowri Kanagarajah, Siti Amirah, Foo Yong Xiang, Gayathri Ayathorai, Sarina Mohamed Rasol, Aw Ai Ti, Wu Kui, Zheng Weihua, Ding Yang, Tarun Kumar Vangani, Nabilah Binte Md Johan
9:00	Low Resource Chat Translation: A Benchmark for Hindi-English Language Pair Baban Gain, Ramakrishna Appicharla, Soumya Chennabasavraj, Nikesh Garera, Asif Ekbal, Muthusamy Chelliah
9:00	A Multimodal Simultaneous Interpretation Prototype: Who Said What Xiaolin Wang, Masao Utiyama, Eiichiro Sumita
9:20	Multi-dimensional Consideration of Cognitive Effort in Translation and Interpreting Deyan Zou
9:20	How Robust is NMT to Language Imbalance in Multilingual Tokenizer Training? Shiyue Zhang, Vishrav Chaudhary, Naman Goyal, James Cross, Guillaume Wenzek, Mohit Bansal, Francisco Guzman
9:20	Data analytics meets machine translation solution Allen Che, Martin Xiao
9:40	How Effective is Byte Pair Encoding for Out-Of-Vocabulary Words in Neural Machine Translation? Ali Araabi, Christof Monz, Vlad Niculae
9:40	Quality Prediction Adam Bittlingmayer, Boris Zubarev, Artur Aleksanyan
10:00	Hand in 01101000 01100001 01101110 01100100 with the Machine: A Roadmap to Quality Caroline-Soledad Mallette
10:00	On the Effectiveness of Quasi Character-Level Models for Machine Translation Salvador Carrión-Ponz, Francisco Casacuberta
10:00	Comparison Between ATA Grading Framework Scores and Auto Scores Evelyn Garland, Carola Berger, Jon Ritzdorf
10:20	Dragonfly: Automated Sign Language Recognition (ASLR) and Machine Translation (MT) Patricia O’Neill-Brown, Bill Dawson
10:20	Improving Translation of Out of Vocabulary Words using Bilingual Lexicon Induction Jonas Waldendorf, Alexandra Birch, Barry Haddow, Antonio Valerio Miceli Barone
10:20	Lingua: Addressing Scenarios for Live Interpretation and Automatic Dubbing Nathan Anderson, Caleb Wilson, Stephen D. Richardson
10:40	☕️
10:40	Exhibits
11:10	Doubly-Trained Adversarial Data Augmentation for Neural Machine Translation Weiting Tan, Shuoyang Ding, Huda Khayrallah, Philipp Koehn
11:10	Refining an Almost Clean Translation Memory Helps Machine Translation
11:10	All You Need is Source! Source-based Quality Estimation for NMT Jon Cambra, Mara Nunziatini
11:30	Limitations and Challenges of Unsupervised Cross-lingual Pre-training Martín Quesada Zaragoza, Francisco Casacuberta
11:30	Practical Attacks on Machine Translation using Paraphrase Elizabeth M Merkhofer, John Henderson, Abigail Gertner, Michael Doyle, Lily Wong
11:30	Knowledge Distillation for Sustainable Neural Machine Translation Wandri Jooste, Andy Way, Rejwanul Haque, Riccardo Superbo
11:50	Few-Shot Regularization to Tackle Catastrophic Forgetting in Multilingual Machine Translation Salvador Carrión-Ponz, Francisco Casacuberta
11:50	Sign Language Machine Translation and the Sign Language Lexicon: A Linguistically Informed Approach Irene Murtagh, Víctor Ubieto Nogales, Josep Blat
11:50	Innovations in Machine Voice for E-learning and Training Content Andrey Nikulin, Kevin Bruner
12:10	Quantized Wasserstein Procrustes Alignment of Word Embedding Spaces Prince O Aboagye, Yan Zheng, Michael Yeh, Junpeng Wang, Zhongfang Zhuang, Huiyuan Chen, Liang Wang, Wei Zhang, Jeff Phillips
12:10	A Neural Machine Translation Approach to Translate Text to Pictographs in a Medical Speech Translation System Jonathan Mutal, Pierrette Bouillon, Magali Norré, Johanna Gerlach, Lucia Ormaechea Grijalba
12:10	Business Critical Errors: A Framework for Adaptive Quality Feedback Craig A Stewart, Madalena Gonçalves, Marianna Buchicchio, Alon Lavie
12:30	🍴
12:30	Exhibits
12:30	Onsite Student Mentoring
13:30	Star Group
13:50	Unbabel
14:10	Systran
14:30	☕️
14:30	Exhibits
15:00	Removing the Language Divide: From Machine Translation to Language Transparence Alex Waibel
16:00	Advances in Spoken Language MT Moderator: Steve LaRocca Panelists: Alex Waibel, Parnia Bahar, Jack Donnelly, Evelyne Tzoukermann
17:00	Virtual Networking Session
18:00	Conference Banquet

Day 4


7:30	Registration
8:00	Virtual Networking Session
9:00	No Language Left Behind: Scaling Human-Centered Machine Translation Angela Fan
10:00	Large Multilingual Language Models and MT Moderators: Grigory Sapunov, Konstantin Savenkov Panelists: Angela Fan, Vishrav Chaudhary, José de Souza
11:00	☕️
11:00	Exhibits
11:30	Embedding-Enhanced GIZA++: Improving Word Alignment Using Embeddings Kelly Marchisio, Conghao Xiong, Philipp Koehn
11:30	A Snapshot into the Possibility of Video Game Machine Translation Damien Hansen, Pierre-Yves Houlmont
11:50	Gender bias Evaluation in Luganda-English Machine Translation Eric Peter Wairagala
11:50	Customization options for language pairs without English Daniele Giulianelli
11:50	Feeding NMT a Healthy Diet – The Impact of Quality, Quantity, or the Right Type of Nutrients Abdallah Nasir, Sara Alisis, Ruba W Jaikat, Rebecca Jonsson, Sara Qardan, Eyas Shawahneh, Nour Al-Khdour
12:10	Adapting Large Multilingual MT Models to Unseen Low Resource Languages Mohamed A Abdelghaffar, Amr El Mogy, Nada Ahmed Sharaf
12:10	Neural Fuzzy Adaptation - Boosting Neural Machine Translation with Similar Translations John Barraza, Jitao Xu, Josep Crego, Jean Senellart
12:10	A Comparison of Data Filtering Methods for Neural Machine Translation Fred Bane, Celia Soler Uguet, Wiktor Stribiżew, Anna Zaretskaya
12:30	🍴
12:30	Exhibits
13:30	AMTA Business Meeting
14:00	Language Weaver
14:20	Intento
14:40	Apptek
15:00	☕️
15:00	Exhibits
15:30	NVTC’s Transliteration Plug-in: What’s in a Name? Jen Doyon, Ekaterina Harke
15:30	Measuring the Effects of Human and Machine Translation on Website Engagement Geza Kovacs, John DeNero
15:30	Machine Translate: Open resources and community Cecilia Yalangozian, Vilém Zouhar, Adam Bittlingmayer
15:50	Robust Translation of French Live Speech Transcripts John Barraza, Elise Bertin-Lemée, Guillaume Klein, Josep Crego, Jean Senellart
15:50	Consistent Human Evaluation of Machine Translation across Language Pairs Daniel Licht, Cynthia Gao, Janice Lam, Francisco Guzman, Mona Diab, Philipp Koehn
15:50	Unlocking the value of bilingual documents with DL segmentation and alignment for Arabic Nour Al-Khdour, Rebecca Jonsson, Ruba W Jaikat, Abdallah Nasir, Sara Alisis, Sara Qardan, Eyas Shawahneh
16:10	Speech-to-Text and Evaluation of Multiple Machine Translation Systems Evelyne Tzoukermann, Steven Van Guilder, Jennifer Doyon, Ekaterina Harke
16:10	Evaluating Machine Translation in Cross-lingual E-Commerce Search Hang Zhang, Liling Tan, Amita Misra
16:10	Language I/O Solution for Multilingual Customer Support Diego Bartolome, Silke Dodel, Chris Jacob
16:30	Closing and Best Presentation Awards
17:00	Virtual Networking Session

Day 5


7:30	Registration
8:00	Virtual Networking Session
8:30	Opening remarks
8:45	Collecting data, training models and distributing both – the OPUS way Jörg Tiedemann
9:00	Machine Translation User Guide: 2022 Edition Konstantin Savenkov
9:00	Creating Text-to-Speech Software for Every Language Al Sagheer
9:30	Building and Analysis of Tamil Lyric Corpus with Semantic Representation Karthika Ranganathan, Geetha T V
9:50	English-Russian Data Augmentation for Neural Machine Translation Nikita Teslenko Grygoryev, Mercedes Garcia Martinez, Francisco Casacuberta Nolla, Amando Estela Pastor, Manuel Herranz
10:15	Efficient Machine Translation Corpus Generation Kamer Ali Yuksel, Ahmet Gunduz, Shreyas Sharma, Hassan Sawaf
10:30	☕️
11:00	Ukrainian-To-English Folktale Corpus: Parallel Corpora Creation and Augmentation for Machine Translation in Low-Resource Languages Olena Burda-Lassen
11:30	Unlocking Resources for Under-resourced Languages Graham Neubig
12:30	A Multilingual View of Unsupervised Machine Translation Ankur Parikh
12:30	🍴
13:30	The Post-editor Toolkit Luciana Ramos
13:30	Tackling Low-Resource Machine Translation with Participation, Data and Scale Julia Kreutzer
13:30	Recent Advances in Translation Quality Evaluation Alon Lavie, Craig Stewart
14:30	Formality Control for Machine Translation Maria Nadejde
15:00	☕️
15:30	Panel Discussion - Low-Resource Language Corpora Marine Carpuat, Kenneth Ward Church
16:30	Closing remarks

Workshops

Workshop on empirical translation process research

The Workshop on Empirical Translation Process Research (WeTPR) investigated human translation and post-editing processes.

sites.google.com/site/centretranslationinnovation/conferences-workshops/wetpr-2022

Workshop on Corpus Generation and Corpus Augmentation for Machine Translation

The first workshop on Corpus Generation and Corpus Augmentation for Machine Translation (CoCo4MT) was co-located with AMTA 2022 on 16 September.

Panels

Advances in Spoken Language MT
Dynamically Adaptive MT
Multilingual Language Models and MT

Call for papers

Conference tracks

Research

Chairs: Kevin Duh, Francisco Guzman ([email protected])

We seek submissions across the entire spectrum of MT-related research, but with a particular focus on the conference’s strength: the close interaction between researchers and practitioners who are looking to apply the latest MT technology to their tasks.

Topics:

Advances in data-driven MT (e.g., neural, statistical)
Lexicon acquisition and integration into MT
MT for low resource languages
Model distillation, compression, and on-device MT
MT in production scenarios, robustness and deployment issues
MT for multiple modalities (speech, optical character recognition)
MT for communication (chats, blogs, social networks)
Few-shot adaptation of pre-trained MT systems
Deep integration of MT technology within translation and localization pipelines
Large-scale mining of translation resources
Computer Assisted Translation (CAT)
MT evaluation
Measuring fairness, bias, transparency in translation
Detecting and preventing catastrophic errors in translation
Best practices in annotation for translation
Other

We invite original, substantial, and unpublished research in all aspects of machine translation (MT).

Submission requirements:

Maximum 10-page paper (unlimited for reference pages)
Follow the style guides (PDF version, LaTeX version, MS Word version) and submit in PDF format
Do not include author names and affiliations
Represent new work that has not been previously published

Authors submitting a similar paper to another conference or workshop must specify this at submission time; if the paper is accepted to multiple venues, the author must choose which one to present at.

Papers must be submitted via the Submission website.

Users and providers

Chairs: Janice Campbell, Jay Marciano, Konstantin Savenkov, Alex Yanishevsky ([email protected])

This track is intended for users, providers, and developers of machine translation, as well as professional translators and Language Service Providers, to present novel, original, and unpublished applications of machine translation technology or specific commercial use cases.

We seek submissions for 15-20-minute presentations (including a few minutes for questions and discussion) concerning the use of MT and/or related tools, processes, and technologies to support business goals and serve the customer or user in commercial settings.

Topics:

Domain adaptation and customisation of MT models: commercial customisation platforms, implementation of open frameworks, and comparison of methodologies used to adapt and customise baseline engines.
Data preparation: data sources, extraction, alignment, and cleaning of corpora, terminology, data augmentation, metadata extraction, working with data drift.
MT for low-resource languages: language pairs with limited data; cross-dialect and cross-domain translation.
Comparison and evaluation of MT systems: with respect to business, technical and linguistic requirements.
MT output quality and confidence scoring: tools, methods, and metrics, such as human evaluations, automatic scoring, and MTQE.
Advanced MT fine-tuning and enhancement: including pre- and post-processing; controlling style, tone of voice, gender, pseudonymisation; automatic post-editing (APE).
Interactive and real-time adaptive MT systems: including advanced approaches to leverage TM and end-user feedback.
MT Post-Editing: New approaches to MTPE, success and failure stories, applicability to different content-types, MTPE training, defining fair pricing models and working with translation buyers and providers.
Technical challenges to MT adoption: file format and tag support, integration, security, performance, data protection, profanity filters, locality, and compliance.
Business Cases: making the business case for adopting MT to drive business requirements, expand markets and engage with customers. Post-edited MT, real-time MT, cross-language information retrieval.
Augmenting MT with ML and NLP: classification, context awareness, content moderation, sentiment analysis, OCR, ASR, and TTS.
Source text improvement: improving the source content destined for MT through automatic tools such as grammar correction, guidelines, and NLP.
Video localization: MT usage in video localization workflows, including captioning, subtitling, and voiceovers.
Other

Submission requirements:

Abstract: 250 to 500 words
Presenter(s) biography: 100 words or fewer
Do not give a “sales pitch”
Format papers (optional) according to the Research Track Submission Instructions and submit them on the camera-ready date
Slide decks (optional) should be submitted as PDFs on the camera-ready date
Submit abstract, papers, and slide decks via the Submission website

The focus should be on innovative MT technology, processes, and real-world use cases, rather than on a particular product or offering.

Government

Chair: Steve LaRocca ([email protected])

This track is intended for users, providers, and developers of machine translation involved in the government sector to present novel, original, and unpublished applications of machine translation and related human language technologies.

We seek submissions for 15-20-minute presentations (including a few minutes for questions and discussion) concerning the use of MT and/or related tools, processes, and technologies to support business goals and serve the customer or user in commercial settings.

Topics:

Advancements in continuous learning for MT and NLP
Government research programs for MT and related technologies
Online MT for lectures and training
MT for low resource languages
Model distillation, compression, and on-device MT
End-to-end models for speech to translated text or speech
Advances in transfer learning with pre-trained models
Advances in OCR and handwriting recognition

Submission requirements:

250 to 500-word abstract
Biography of 100 words or fewer
Do not give a “sales pitch”
Papers (optional) should be formatted according to the Research Track Submission Instructions
Slide decks (optional) should be submitted as PDFs
Final versions of papers and slide decks are due on the camera-ready date
Submit abstract, papers, and slide decks via the Submission website

The focus should be on innovative MT technology, processes, and real-world use cases, rather than on a particular product or offering.

Workshop and tutorial proposals

Chairs: Jay Marciano, Kenton Murray ([email protected] or [email protected])

Tutorials

Tutorials are a forum for experts in MT and MT-related areas to deliver concentrated training on a topic of interest in half-day teaching sessions. Tutorials help conference participants enrich their understanding of specific technical, applied, and business matters surrounding research, development and use of MT and associated technologies, or, in the case of tutorials designed for newcomers, provide background information that facilitates greater understanding of the overall conference program.

Submission requirements:

Tutorial title
250-500 word tutorial description
Short (<100 words) biographical introduction of the presenter(s)
Technical requirements (if any)
Scanned signed copy of the Tutorial Policy and Leader Agreement Form
Submit by e-mail to [email protected]

Workshops

AMTA workshops are intended to provide the opportunity for MT-related communities of interest to spend focused time together advancing the state of thinking or the state of practice in their area of interest or endeavor. Workshops are generally scheduled as full-day events.

Submission requirements:

Workshop title
250-500 word workshop description
Clarify whether this is an ongoing or new workshop
Short (<100 words) biographical introduction of the presenter(s)
Expected number of participants
Dates for important milestones (call for papers, recruitment of speakers, etc.)
Technical requirements (if any)
Scanned signed copy of the Workshop Policy and Leader Agreement Form
Submit by e-mail to [email protected]
