AMTA 2022
Conference of the Association for Machine Translation in the Americas
AMTA 2022 provided parallel sessions, demonstrations of the offerings from machine translation providers, tutorials for both beginners and more experienced practitioners, and workshops.
Students interested in MT and computational linguistics were able to connect with academic and industry mentors.
Location
- Florida, United States of America
Important Dates
Track presentation proposals submission deadline | 13 June |
Tutorial proposals submission deadline | 13 June |
Workshop proposals submission deadline | 30 May |
Notification of acceptance | 18 July |
Final “camera-ready” versions | 08 August |
Keynote speakers
- Marco Trombetti, CEO of Translated
- Angela Fan, research scientist at Meta AI Research
- Dr. Alex Waibel, professor of Computer Science at Carnegie Mellon University and the Karlsruhe Institute of Technology
Schedule
Day 1
Day 2
7:30 | Registration |
8:00 | Virtual Networking Session |
8:00 | Virtual Student Mentoring Session |
9:00 | Path to singularity Marco Trombetti |
10:00 | Recent Advances in Dynamically Adapted MT *Moderator:* Alon Lavie *Panelists:* Marco Trombetti, Watson Srivathsan, Joern Wuebker, José de Souza |
11:00 | ☕️ |
11:00 | Exhibits |
11:30 | Machine Translation as a Prototype for Advanced AI Deployment in Government Kathryn Baker |
11:30 | Building MT for Software Product Descriptions Using Domain-specific Sub-corpora Extraction Pintu Lohar, Sinead Madden, Edmond O’Connor, Maja Popovic, Tanya Habruseva |
11:30 | PEMT human evaluation at 100x scale with risk-driven sampling Kirill Soloviev |
11:50 | A Proposed User Study on MT-Enabled Scanning Marianna Martindale, Marine Carpuat |
11:50 | Domain-Specific Text Generation for Machine Translation Yasmin Moslem, Rejwanul Haque, John Kelleher, Andy Way |
11:50 | Picking Out The Best MT Model: On The Methodology Of Human Evaluation Stepan Korotaev, Andrey Ryabchikov |
12:10 | You've translated it, now what? Michael Maxwell, Shabnam Tafreshi, Aquia Richburg, Balaji Kodali, Kymani Brown |
12:10 | Strategies for Adapting Multilingual Pre-training for Domain-Specific Machine Translation Neha Verma, Kenton Murray, Kevin Duh |
12:10 | Post-editing of Machine-Translated Patents: High Tech with High Stakes Aaron Hebenstreit |
12:30 | 🍴 |
12:30 | Exhibits |
12:30 | Onsite Student Mentoring Session |
12:30 | Virtual Student Mentoring Session |
13:30 | Prefix Embeddings for In-context Machine Translation Suzanna Sia, Kevin Duh |
13:30 | State of the Machine Translation 2022 Konstantin Savenkov, Michel Lopez |
13:30 | Automatic Post-Editing of MT Output Using Large Language Models Blanca Vidal, Albert Llorens, Juan Alonso |
13:50 | Fast Vocabulary Projection Method via Clustering for Multilingual Machine Translation on GPU Hossam Amer, Mohamed Afify, Young Jin Kim, Hitokazu Matsushita, Hany Hassan |
13:50 | The Translation Impact on Global CX Kirti Vashee |
13:50 | Improving Consistency of Human and Machine Translations Silvio Picinini |
14:10 | Language Tokens: Simply Improving Zero-Shot Multi-Aligned Translation in Encoder-Decoder Models Muhammad N ElNokrashy, Amr Hendy, Mohamed Maher, Mohamed Afify, Hany Hassan |
14:10 | Machine Assistance in the Real World Dave Bryant |
14:10 | Improve Machine Translation for Cross-Lingual Search in E-Commerce Hang Zhang |
14:30 | ☕️ |
14:30 | Exhibits |
15:00 | Pangeanic |
15:20 | Amazon Web Services |
15:40 | Welocalize |
17:00 | Virtual Networking Session |
Day 3
7:30 | Registration |
8:00 | Virtual Networking Session |
9:00 | Uplifting Singapore’s translation standards with the community through technology Lee Siew Li, Adeline Sim, Gowri Kanagarajah, Siti Amirah, Foo Yong Xiang, Gayathri Ayathorai, Sarina Mohamed Rasol, Aw Ai Ti, Wu Kui, Zheng Weihua, Ding Yang, Tarun Kumar Vangani, Nabilah Binte Md Johan |
9:00 | Low Resource Chat Translation: A Benchmark for Hindi-English Language Pair Baban Gain, Ramakrishna Appicharla, Soumya Chennabasavraj, Nikesh Garera, Asif Ekbal, Muthusamy Chelliah |
9:00 | A Multimodal Simultaneous Interpretation Prototype: Who Said What Xiaolin Wang, Masao Utiyama, Eiichiro Sumita |
9:20 | Multi-dimensional Consideration of Cognitive Effort in Translation and Interpreting Deyan Zou |
9:20 | How Robust is NMT to Language Imbalance in Multilingual Tokenizer Training? Shiyue Zhang, Vishrav Chaudhary, Naman Goyal, James Cross, Guillaume Wenzek, Mohit Bansal, Francisco Guzman |
9:20 | Data analytics meets machine translation solution Allen Che, Martin Xiao |
9:40 | How Effective is Byte Pair Encoding for Out-Of-Vocabulary Words in Neural Machine Translation? Ali Araabi, Christof Monz, Vlad Niculae |
9:40 | Quality Prediction Adam Bittlingmayer, Boris Zubarev, Artur Aleksanyan |
10:00 | Hand in 01101000 01100001 01101110 01100100 with the Machine: A Roadmap to Quality Caroline-Soledad Mallette |
10:00 | On the Effectiveness of Quasi Character-Level Models for Machine Translation Salvador Carrión-Ponz, Francisco Casacuberta |
10:00 | Comparison Between ATA Grading Framework Scores and Auto Scores Evelyn Garland, Carola Berger, Jon Ritzdorf |
10:20 | Dragonfly: Automated Sign Language Recognition (ASLR) and Machine Translation (MT) Patricia O’Neill-Brown, Bill Dawson |
10:20 | Improving Translation of Out of Vocabulary Words using Bilingual Lexicon Induction Jonas Waldendorf, Alexandra Birch, Barry Haddow, Antonio Valerio Miceli Barone |
10:20 | Lingua: Addressing Scenarios for Live Interpretation and Automatic Dubbing Nathan Anderson, Caleb Wilson, Stephen D. Richardson |
10:40 | ☕️ |
10:40 | Exhibits |
11:10 | Doubly-Trained Adversarial Data Augmentation for Neural Machine Translation Weiting Tan, Shuoyang Ding, Huda Khayrallah, Philipp Koehn |
11:10 | Refining an Almost Clean Translation Memory Helps Machine Translation |
11:10 | All You Need is Source! Source-based Quality Estimation for NMT Jon Cambra, Mara Nunziatini |
11:30 | Limitations and Challenges of Unsupervised Cross-lingual Pre-training Martín Quesada Zaragoza, Francisco Casacuberta |
11:30 | Practical Attacks on Machine Translation using Paraphrase Elizabeth M Merkhofer, John Henderson, Abigail Gertner, Michael Doyle, Lily Wong |
11:30 | Knowledge Distillation for Sustainable Neural Machine Translation Wandri Jooste, Andy Way, Rejwanul Haque, Riccardo Superbo |
11:50 | Few-Shot Regularization to Tackle Catastrophic Forgetting in Multilingual Machine Translation Salvador Carrión-Ponz, Francisco Casacuberta |
11:50 | Sign Language Machine Translation and the Sign Language Lexicon: A Linguistically Informed Approach Irene Murtagh, Víctor Ubieto Nogales, Josep Blat |
11:50 | Innovations in Machine Voice for E-learning and Training Content Andrey Nikulin, Kevin Bruner |
12:10 | Quantized Wasserstein Procrustes Alignment of Word Embedding Spaces Prince O Aboagye, Yan Zheng, Michael Yeh, Junpeng Wang, Zhongfang Zhuang, Huiyuan Chen, Liang Wang, Wei Zhang, Jeff Phillips |
12:10 | A Neural Machine Translation Approach to Translate Text to Pictographs in a Medical Speech Translation System Jonathan Mutal, Pierrette Bouillon, Magali Norré, Johanna Gerlach, Lucia Ormaechea Grijalba |
12:10 | Business Critical Errors: A Framework for Adaptive Quality Feedback Craig A Stewart, Madalena Gonçalves, Marianna Buchicchio, Alon Lavie |
12:30 | 🍴 |
12:30 | Exhibits |
12:30 | Onsite Student Mentoring |
13:30 | Star Group |
13:50 | Unbabel |
14:10 | Systran |
14:30 | ☕️ |
14:30 | Exhibits |
15:00 | Removing the Language Divide: From Machine Translation to Language Transparence Alex Waibel |
16:00 | Advances in Spoken Language MT *Moderator:* Steve LaRocca *Panelists:* Alex Waibel, Parnia Bahar, Jack Donnelly, Evelyne Tzoukermann |
17:00 | Virtual Networking Session |
18:00 | Conference Banquet |
Day 4
7:30 | Registration |
8:00 | Virtual Networking Session |
9:00 | No Language Left Behind: Scaling Human-Centered Machine Translation Angela Fan |
10:00 | Large Multilingual Language Models and MT *Moderators:* Grigory Sapunov, Konstantin Savenkov *Panelists:* Angela Fan, Vishrav Chaudhary, José de Souza |
11:00 | ☕️ |
11:00 | Exhibits |
11:30 | Embedding-Enhanced GIZA++: Improving Word Alignment Using Embeddings Kelly Marchisio, Conghao Xiong, Philipp Koehn |
11:30 | A Snapshot into the Possibility of Video Game Machine Translation Damien Hansen, Pierre-Yves Houlmont |
11:50 | Gender bias Evaluation in Luganda-English Machine Translation Eric Peter Wairagala |
11:50 | Customization options for language pairs without English Daniele Giulianelli |
11:50 | Feeding NMT a Healthy Diet – The Impact of Quality, Quantity, or the Right Type of Nutrients Abdallah Nasir, Sara Alisis, Ruba W Jaikat, Rebecca Jonsson, Sara Qardan, Eyas Shawahneh, Nour Al-Khdour |
12:10 | Adapting Large Multilingual MT Models to Unseen Low Resource Languages Mohamed A Abdelghaffar, Amr El Mogy, Nada Ahmed Sharaf |
12:10 | Neural Fuzzy Adaptation - Boosting Neural Machine Translation with Similar Translations John Barraza, Jitao Xu, Josep Crego, Jean Senellart |
12:10 | A Comparison of Data Filtering Methods for Neural Machine Translation Fred Bane, Celia Soler Uguet, Wiktor Stribiżew, Anna Zaretskaya |
12:30 | 🍴 |
12:30 | Exhibits |
13:30 | AMTA Business Meeting |
14:00 | Language Weaver |
14:20 | Intento |
14:40 | AppTek |
15:00 | ☕️ |
15:00 | Exhibits |
15:30 | NVTC’s Transliteration Plug-in: What’s in a Name? Jen Doyon, Ekaterina Harke |
15:30 | Measuring the Effects of Human and Machine Translation on Website Engagement Geza Kovacs, John DeNero |
15:30 | Machine Translate: Open resources and community Cecilia Yalangozian, Vilém Zouhar, Adam Bittlingmayer |
15:50 | Robust Translation of French Live Speech Transcripts John Barraza, Elise Bertin-Lemée, Guillaume Klein, Josep Crego, Jean Senellart |
15:50 | Consistent Human Evaluation of Machine Translation across Language Pairs Daniel Licht, Cynthia Gao, Janice Lam, Francisco Guzman, Mona Diab, Philipp Koehn |
15:50 | Unlocking the value of bilingual documents with DL segmentation and alignment for Arabic Nour Al-Khdour, Rebecca Jonsson, Ruba W Jaikat, Abdallah Nasir, Sara Alisis, Sara Qardan, Eyas Shawahneh |
16:10 | Speech-to-Text and Evaluation of Multiple Machine Translation Systems Evelyne Tzoukermann, Steven Van Guilder, Jennifer Doyon, Ekaterina Harke |
16:10 | Evaluating Machine Translation in Cross-lingual E-Commerce Search Hang Zhang, Liling Tan, Amita Misra |
16:10 | Language I/O Solution for Multilingual Customer Support Diego Bartolome, Silke Dodel, Chris Jacob |
16:30 | Closing and Best Presentation Awards |
17:00 | Virtual Networking Session |
Day 5
7:30 | Registration |
8:00 | Virtual Networking Session |
8:30 | Opening remarks |
8:45 | Collecting data, training models and distributing both – the OPUS way Jörg Tiedemann |
9:00 | Machine Translation User Guide: 2022 Edition Konstantin Savenkov |
9:00 | Creating Text-to-Speech Software for Every Language Al Sagheer |
9:30 | Building and Analysis of Tamil Lyric Corpus with Semantic Representation Karthika Ranganathan, Geetha T V |
9:50 | English-Russian Data Augmentation for Neural Machine Translation Nikita Teslenko Grygoryev, Mercedes Garcia Martinez, Francisco Casacuberta Nolla, Amando Estela Pastor, Manuel Herranz |
10:15 | Efficient Machine Translation Corpus Generation Kamer Ali Yuksel, Ahmet Gunduz, Shreyas Sharma, Hassan Sawaf |
10:30 | ☕️ |
11:00 | Ukrainian-To-English Folktale Corpus: Parallel Corpora Creation and Augmentation for Machine Translation in Low-Resource Languages Olena Burda-Lassen |
11:30 | Unlocking Resources for Under-resourced Languages Graham Neubig |
12:30 | A Multilingual View of Unsupervised Machine Translation Ankur Parikh |
12:30 | 🍴 |
13:30 | The Post-editor Toolkit Luciana Ramos |
13:30 | Tackling Low-Resource Machine Translation with Participation, Data and Scale Julia Kreutzer |
13:30 | Recent Advances in Translation Quality Evaluation Alon Lavie, Craig Stewart |
14:30 | Formality Control for Machine Translation Maria Nadejde |
15:00 | ☕️ |
15:30 | Panel Discussion - Low-Resource Language Corpora Marine Carpuat, Kenneth Ward Church |
16:30 | Closing remarks |
Workshops
Workshop on empirical translation process research
The Workshop on Empirical Translation Process Research (WeTPR) investigated human translation and post-editing processes.
sites.google.com/site/centretranslationinnovation/conferences-workshops/wetpr-2022
Workshop on Corpus Generation and Corpus Augmentation for Machine Translation
The first workshop on Corpus Generation and Corpus Augmentation for Machine Translation (CoCo4MT) was co-located with AMTA 2022 on 16 September.
Panels
- Advances in Spoken Language MT
- Dynamically Adaptive MT
- Multilingual Language Models and MT
Call for papers
Conference tracks
Research
Chairs: Kevin Duh, Francisco Guzman ([email protected])
We seek submissions across the entire spectrum of MT-related research, but with a particular focus on the conference’s strength: the close interaction between researchers and practitioners who are looking to apply the latest MT technology to their tasks.
Topics:
- Advances in data-driven MT (e.g., neural, statistical)
- Lexicon acquisition and integration into MT
- MT for low resource languages
- Model distillation, compression, and on-device MT
- MT in production scenarios, robustness and deployment issues
- MT for multiple modalities (speech, optical character recognition)
- MT for communication (chats, blogs, social networks)
- Few-shot adaptation of pre-trained MT systems
- Deep integration of MT technology within translation and localization pipelines
- Large-scale mining of translation resources
- Computer Assisted Translation (CAT)
- MT evaluation
- Measuring fairness, bias, transparency in translation
- Detecting and preventing catastrophic errors in translation
- Best practices in annotation for translation
- Other
We invite original, substantial, and unpublished research in all aspects of machine translation (MT).
Submission requirements:
- Paper of at most 10 pages (references do not count toward the limit)
- Follow the style guides (PDF version, LaTeX version, MS Word version) and submit in PDF format
- Do not include author names and affiliations
- Represent new work that has not been previously published
- Authors submitting a similar paper to another conference or workshop must declare this at submission time; if the paper is accepted at multiple venues, the authors must choose the one at which to present
- Papers must be submitted via the Submission website
Users and providers
Chairs: Janice Campbell, Jay Marciano, Konstantin Savenkov, Alex Yanishevsky ([email protected])
This track is intended for users, providers, and developers of machine translation, as well as professional translators and Language Service Providers, to present novel, original, and unpublished applications of machine translation technology or specific commercial use cases.
We seek submissions for 15-20-minute presentations (including a few minutes for questions and discussion) concerning the use of MT and/or related tools, processes, and technologies to support business goals and serve the customer or user in commercial settings.
Topics:
- Domain adaptation and customisation of MT models: commercial customisation platforms, implementation of open frameworks, and comparison of methodologies used to adapt and customise baseline engines.
- Data preparation: data sources, extraction, alignment, and cleaning of corpora, terminology, data augmentation, metadata extraction, working with data drift.
- MT for low-resource languages: language pairs with limited data; cross-dialect and cross-domain translation.
- Comparison and evaluation of MT systems: with respect to business, technical and linguistic requirements.
- MT output quality and confidence scoring: tools, methods, and metrics, such as human evaluations, automatic scoring, and MTQE.
- Advanced MT fine-tuning and enhancement: including pre- and post-processing; controlling style, tone of voice, gender, pseudonymisation; automatic post-editing (APE).
- Interactive and real-time adaptive MT systems: including advanced approaches to leverage TM and end-user feedback.
- MT post-editing: new approaches to MTPE, success and failure stories, applicability to different content types, MTPE training, defining fair pricing models, and working with translation buyers and providers.
- Technical challenges to MT adoption: file format and tag support, integration, security, performance, data protection, profanity filters, locality, and compliance.
- Business cases: making the business case for adopting MT to drive business requirements, expand markets and engage with customers, including post-edited MT, real-time MT, and cross-language information retrieval.
- Augmenting MT with ML and NLP: classification, context awareness, content moderation, sentiment analysis, OCR, ASR, and TTS.
- Source text improvement: improving the source content destined for MT through automatic tools such as grammar correction, guidelines, and NLP.
- Video localization: MT usage in video localization workflows, including captioning, subtitling, and voiceovers.
- Other
Submission requirements:
- Abstract: 250 to 500 words
- Presenter(s) biography: 100 words or fewer
- Do not do a “sales pitch”
- Papers (optional) should be formatted according to the Research Track Submission Instructions and submitted on the camera-ready date
- Slide decks (optional) should be submitted as PDFs on the camera-ready date
- Submit abstracts, papers, and slide decks via the Submission website
The focus should be on innovative MT technology, processes, and real-world use cases, rather than on a particular product or offering.
Government
Chair: Steve LaRocca ([email protected])
This track is intended for users, providers, and developers of machine translation involved in the government sector to present novel, original, and unpublished applications of machine translation and related human language technologies.
We seek submissions for 15-20-minute presentations (including a few minutes for questions and discussion) concerning the use of MT and/or related tools, processes, and technologies to support business goals and serve the customer or user in commercial settings.
Topics:
- Advancements in continuous learning for MT and NLP
- Government research programs for MT and related technologies
- Online MT for lectures and training
- MT for low resource languages
- Model distillation, compression, and on-device MT
- End-to-end models for speech to translated text or speech
- Advances in transfer learning with pre-trained models
- Advances in OCR and handwriting recognition
Submission requirements:
- Abstract: 250 to 500 words
- Presenter(s) biography: 100 words or fewer
- Do not do a “sales pitch”
- Papers (optional) should be formatted according to the Research Track Submission Instructions
- Slide decks (optional) should be submitted as PDFs
- Final versions of papers and slide decks are due on the camera-ready date
- Submit abstracts, papers, and slide decks via the Submission website
The focus should be on innovative MT technology, processes, and real-world use cases, rather than on a particular product or offering.
Workshop and tutorial proposals
Chairs: Jay Marciano, Kenton Murray ([email protected] or [email protected])
Tutorials
Tutorials are a forum for experts in MT and MT-related areas to deliver concentrated training on a topic of interest in half-day teaching sessions. Tutorials help conference participants enrich their understanding of specific technical, applied, and business matters surrounding research, development and use of MT and associated technologies, or, in the case of tutorials designed for newcomers, provide background information that facilitates greater understanding of the overall conference program.
Submission requirements:
- Tutorial title
- 250-500 word tutorial description
- Short (<100 words) biographical introduction of the presenter(s)
- Technical requirements (if any)
- Scanned signed copy of the Tutorial Policy and Leader Agreement Form
- Submit by e-mail at [email protected]
Workshops
AMTA workshops are intended to provide the opportunity for MT-related communities of interest to spend focused time together advancing the state of thinking or the state of practice in their area of interest or endeavor. Workshops are generally scheduled as full-day events.
Submission requirements:
- Workshop title
- 250-500 word workshop description
- Clarify whether this is an ongoing or new workshop
- Short (<100 words) biographical introduction to the presenter(s)
- Expected number of participants
- Dates for important milestones (call for papers, recruitment of speakers, etc.)
- Technical requirements (if any)
- Scanned signed copy of the Workshop Policy and Leader Agreement Form
- Submit by e-mail at [email protected]