WMT12
Workshop on Machine Translation
The Seventh Workshop on Machine Translation (WMT12) took place on 7 and 8 June 2012 in Montreal, Quebec. It was organised by WMT.
Tasks
Schedule
Day 1
09:00 – 09:10 | Opening Remarks: Future Funding and Research Survey Wiki |
Session 1: Shared Tasks and their Evaluation
09:10 – 09:30 | Putting Human Assessments of Machine Translation Systems in Order Adam Lopez |
09:30 – 10:30 | Findings of the 2012 Workshop on Statistical Machine Translation Chris Callison-Burch, Philipp Koehn, Christof Monz, Matt Post, Radu Soricut, Lucia Specia |
10:30 – 11:00 | ☕️ |
Session 2: Shared Quality Estimation and Metrics Tasks
11:00 – 12:40 | Poster Session: Evaluation Metrics |
Semantic Textual Similarity for MT evaluation Julio Castillo, Paula Estrella | |
Improving AMBER, an MT Evaluation Metric Boxing Chen, Roland Kuhn, George Foster | |
TerrorCat: a Translation Error Categorization-based MT Quality Metric Mark Fishel, Rico Sennrich, Maja Popović, Ondřej Bojar | |
Class error rates for evaluation of machine translation output Maja Popović | |
SPEDE: Probabilistic Edit Distance Metrics for MT Evaluation Mengqiu Wang, Christopher Manning | |
11:00 – 12:40 | Poster Session: Quality Estimation Task |
Quality estimation for Machine Translation output using linguistic analysis and decoding features Eleftherios Avramidis | |
Black Box Features for the WMT 2012 Quality Estimation Shared Task Christian Buck | |
Linguistic Features for Quality Estimation Mariano Felice, Lucia Specia | |
PRHLT Submission to the WMT12 Quality Estimation Task Jesús González-Rubio, Alberto Sanchís, Francisco Casacuberta | |
Tree Kernels for Machine Translation Quality Estimation Christian Hardmeier, Joakim Nivre, Jörg Tiedemann | |
LORIA System for the WMT12 Quality Estimation Shared Task David Langlois, Sylvain Raybaud, Kamel Smaïli | |
Quality Estimation: an experimental study using unsupervised similarity measures Erwan Moreau, Carl Vogel | |
The UPC Submission to the WMT 2012 Shared Task on Quality Estimation Daniele Pighin, Meritxell González, Lluís Màrquez | |
Morpheme- and POS-based IBM1 and language model scores for translation quality estimation Maja Popović | |
DCU-Symantec Submission for the WMT 2012 Quality Estimation Task Raphael Rubino, Jennifer Foster, Joachim Wagner, Johann Roturier, Rasul Samad Zadeh Kaljahi, Fred Hollowood | |
The SDL Language Weaver Systems in the WMT12 Quality Estimation Shared Task Radu Soricut, Nguyen Bach, Ziyuan Wang | |
Regression with Phrase Indicators for Estimating MT Quality Chunyang Wu, Hai Zhao | |
12:40 – 14:00 | 🍴 |
Session 3: Invited Talk
14:00 – 15:30 | Deployment of SMT for the IBM Enterprise Salim Roukos |
15:30 – 16:00 | ☕️ |
Session 4: Confidence Estimation and System Combination
16:00 – 16:20 | Non-Linear Models for Confidence Estimation Yong Zhuang, Guillaume Wisniewski, François Yvon |
16:20 – 16:40 | Combining Quality Prediction and System Selection for Improved Automatic Translation Output Radu Soricut, Sushant Narsale |
16:40 – 17:00 | Match without a Referee: Evaluating MT Adequacy without Reference Translations Yashar Mehdad, Matteo Negri, Marcello Federico |
17:00 – 17:20 | Comparing human perceptions of post-editing effort with post-editing operations Maarit Koponen |
17:20 – 17:40 | Review of Hypothesis Alignment Algorithms for MT System Combination via Confusion Network Decoding Antti-Veikko Rosti, Xiaodong He, Damianos Karakos, Gregor Leusch, Yuan Cao, Markus Freitag, Spyros Matsoukas, Hermann Ney, Jason Smith, Bing Zhang |
Day 2
Session 5: Reordering, Syntax and Semantics
09:00 – 09:20 | On Hierarchical Re-ordering and Permutation Parsing for Phrase-based Decoding Colin Cherry, Robert C. Moore, Chris Quirk |
09:20 – 09:40 | CCG Syntactic Reordering Models for Phrase-based Machine Translation Dennis Nolan Mehay, Christopher Hardie Brew |
09:40 – 10:00 | Using Categorial Grammar to Label Translation Rules Jonathan Weese, Chris Callison-Burch, Adam Lopez |
10:00 – 10:20 | Using Syntactic Head Information in Hierarchical Phrase-Based Translation Junhui Li, Zhaopeng Tu, Guodong Zhou, Josef van Genabith |
10:20 – 10:40 | Fully Automatic Semantic MT Evaluation Chi-kiu Lo, Anand Karthik Tumuluru, Dekai Wu |
10:40 – 11:00 | ☕️ |
Session 6: Translation Task
11:00 – 12:40 | Poster Session: Translation Task |
Probes in a Taxonomy of Factored Phrase-Based Models Ondřej Bojar, Bushra Jawaid, Amir Kamran | |
The CMU-Avenue French-English Translation System Michael Denkowski, Greg Hanneman, Alon Lavie | |
Formemes in English-Czech Deep Syntactic MT Ondřej Dušek, Zdeněk Žabokrtský, Martin Popel, Martin Majliš, Michal Novák, David Mareček | |
The TALP-UPC phrase-based translation systems for WMT12: Morphology simplification and domain adaptation Lluis Formiga, Carlos A. Henríquez Q., Adolfo Hernández, José B. Mariño, Enric Monte, José A. R. Fonollosa | |
Joshua 4.0: Packing, PRO, and Paraphrases Juri Ganitkevitch, Yuan Cao, Jonathan Weese, Matt Post, Chris Callison-Burch | |
Syntax-aware Phrase-based Statistical Machine Translation: System Description Ulrich Germann | |
QCRI at WMT12: Experiments in Spanish-English and German-English Machine Translation of News Text Francisco Guzman, Preslav Nakov, Ahmed Thabet, Stephan Vogel | |
The RWTH Aachen Machine Translation System for WMT 2012 Matthias Huck, Stephan Peitz, Markus Freitag, Malte Nuhn, Hermann Ney | |
Machine Learning for Hybrid Machine Translation Sabine Hunsicker, Chen Yu, Christian Federmann | |
Towards Effective Use of Training Data in Statistical Machine Translation Philipp Koehn, Barry Haddow | |
Joint WMT 2012 Submission of the QUAERO Project Markus Freitag, Stephan Peitz, Matthias Huck, Hermann Ney, Jan Niehues, Teresa Herrmann, Alex Waibel, Hai-Son Le, Thomas Lavergne, Alexandre Allauzen, Bianka Buschbeck, Joseph Maria Crego, Jean Senellart | |
LIMSI @ WMT12 Hai-Son Le, Thomas Lavergne, Alexandre Allauzen, Marianna Apidianaki, Li Gong, Aurélien Max, Artem Sokolov, Guillaume Wisniewski, François Yvon | |
UPM system for WMT 2012 Verónica López-Ludeña, Rubén San-Segundo, Juan M. Montero | |
PROMT DeepHybrid system for WMT12 shared translation task Alexander Molchanov | |
The Karlsruhe Institute of Technology Translation Systems for the WMT 2012 Jan Niehues, Yuqi Zhang, Mohammed Mediani, Teresa Herrmann, Eunah Cho, Alex Waibel | |
Kriya - The SFU System for Translation Task at WMT-12 Majid Razmara, Baskaran Sankaran, Ann Clifton, Anoop Sarkar | |
DEPFIX: A System for Automatic Correction of Czech MT Outputs Rudolf Rosa, David Mareček, Ondřej Dušek | |
LIUM’s SMT Machine Translation Systems for WMT 2012 Christophe Servan, Patrik Lambert, Anthony Rousseau, Holger Schwenk, Loïc Barrault | |
Selecting Data for English-to-Czech Machine Translation Aleš Tamchyna, Petra Galuščáková, Amir Kamran, Miloš Stanojević, Ondřej Bojar | |
DFKI’s SMT System for WMT 2012 David Vilar | |
GHKM Rule Extraction and Scope-3 Parsing in Moses Philip Williams, Philipp Koehn | |
Data Issues of the Multilingual Translation Matrix Daniel Zeman | |
12:40 – 14:00 | 🍴 |
Session 7: Corpus Creation and Adaptation
14:00 – 14:20 | Constructing Parallel Corpora for Six Indian Languages via Crowdsourcing Matt Post, Chris Callison-Burch, Miles Osborne |
14:20 – 14:40 | Twitter Translation using Translation-Based Cross-Lingual Retrieval Laura Jehl, Felix Hieber, Stefan Riezler |
14:40 – 15:00 | Analysing the Effect of Out-of-Domain Data on SMT Systems Barry Haddow, Philipp Koehn |
15:00 – 15:20 | Evaluating the Learning Curve of Domain Adaptive Statistical Machine Translation Systems Nicola Bertoldi, Mauro Cettolo, Marcello Federico, Christian Buck |
15:20 – 15:40 | The Trouble with SMT Consistency Marine Carpuat, Michel Simard |
15:40 – 16:00 | ☕️ |
Session 8: Phrase Model Training and Optimization
16:00 – 16:20 | Phrase Model Training for Statistical Machine Translation with Word Lattices of Preprocessing Alternatives Joern Wuebker, Hermann Ney |
16:20 – 16:40 | Leave-One-Out Phrase Model Training for Large-Scale Deployment Joern Wuebker, Mei-Yuh Hwang, Chris Quirk |
16:40 – 17:00 | Direct Error Rate Minimization for Statistical Machine Translation Tagyoung Chung, Michel Galley |
17:00 – 17:20 | Optimization Strategies for Online Large-Margin Learning in Machine Translation Vladimir Eidelman |
Results
Full results of the shared tasks: Findings of the 2012 Workshop on Statistical Machine Translation
News translation
The systems were ranked by relative ranking, using the "> others" (“greater than others”) score, which measures how often a system was judged to be better than any other system.
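As a rough illustration of how such a score could be computed, the sketch below takes a list of pairwise human judgments and returns, for each system, the fraction of its comparisons that it won. The function name `greater_than_others`, the `(winner, loser)` input format, the toy system names and the decision to leave ties out of the data are all assumptions made for this example; the exact procedure used for the official ranking is described in the findings paper.

```python
from collections import Counter


def greater_than_others(judgments):
    """Return, per system, the share of pairwise comparisons it won.

    `judgments` is an iterable of (winner, loser) pairs.
    Assumption: tied comparisons have already been removed.
    """
    wins = Counter()
    comparisons = Counter()
    for winner, loser in judgments:
        wins[winner] += 1
        comparisons[winner] += 1
        comparisons[loser] += 1
    return {system: wins[system] / comparisons[system] for system in comparisons}


# Hypothetical toy data: each pair means "first system judged better than second".
judgments = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A")]
scores = greater_than_others(judgments)

# Print systems from best to worst score.
for system, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{system}: {score:.2f}")
```

In this toy data, system A wins two of its three comparisons, so its score is about 0.67. A system that wins exactly half of its comparisons scores 0.50.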
→ English
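Language pair | System | > others |
---|---|---|
Czech → | ONLINE-B | 0.65 |
Czech → | UEDIN | 0.60 |
Spanish → | ONLINE-A | 0.62 |
Spanish → | ONLINE-B | 0.61 |
Spanish → | QCRI | 0.60 |
Spanish → | UEDIN | 0.58 |
French → | LIMSI | 0.63 |
French → | KIT | 0.61 |
French → | ONLINE-A | 0.59 |
French → | CMU | 0.57 |
French → | ONLINE-B | 0.57 |
German → | ONLINE-A | 0.65 |
German → | ONLINE-B | 0.65 |
German → | UEDIN | 0.60 |
German → | RWTH | 0.56 |
German → | KIT | 0.55 |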
English →
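Language pair | System | > others |
---|---|---|
→ Czech | CU-DEPFIX | 0.66 |
→ Czech | UEDIN | 0.56 |
→ Czech | CU-BOJAR | 0.54 |
→ Czech | CU-TECTOMT | 0.53 |
→ Spanish | ONLINE-B | 0.65 |
→ Spanish | ONLINE-A | 0.56 |
→ Spanish | UPC | 0.52 |
→ Spanish | UEDIN | 0.52 |
→ French | LIMSI | 0.66 |
→ French | KIT | 0.59 |
→ German | ONLINE-B | 0.64 |
→ German | RBMT-4 | 0.58 |
→ German | LIMSI | 0.55 |
→ German | UEDIN-WILLIAMS | 0.51 |
→ German | KIT | 0.50 |
→ German | UEDIN | 0.47 |
→ German | RWTH | 0.47 |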