Selective Combination of Pivot and Direct Statistical Machine Translation Models

Proceedings of the 6th International Joint Conference on Natural Language Processing

In this paper, we propose an approach for selectively combining pivot and direct statistical machine translation (SMT) models to improve translation quality, using Persian-Arabic SMT as a case study. We show positive results (from 0.4 to 3.1 BLEU points, depending on the size of the direct training corpus) along with a large reduction in pivot translation model size.
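The core idea of selective combination can be illustrated with a minimal sketch. The table format, scores, and the preference rule below are illustrative assumptions, not the paper's exact method: here a source phrase keeps its direct-model entries when available and falls back to pivot-model entries only for phrases the direct model does not cover.

```python
# Hedged sketch of selectively combining a direct and a pivot phrase table.
# Tables map a source phrase to a list of (target phrase, probability) pairs;
# this representation and the fallback rule are assumptions for illustration.

def combine_phrase_tables(direct, pivot):
    combined = {}
    for src, entries in direct.items():
        combined[src] = list(entries)   # prefer entries from the direct model
    for src, entries in pivot.items():
        if src not in combined:         # use pivot entries only to fill coverage gaps
            combined[src] = list(entries)
    return combined

# Toy Persian->Arabic example (hypothetical phrases and probabilities)
direct = {"ketab": [("kitab", 0.9)]}
pivot = {"ketab": [("kitab", 0.5), ("daftar", 0.2)], "khane": [("bayt", 0.6)]}
table = combine_phrase_tables(direct, pivot)
```

Because pivot entries are only added where the direct table has no coverage, the combined table can stay far smaller than a full union of both tables, consistent with the reduction in pivot model size reported above.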

Another publication from the same author: Gregor Leusch


Language Independent Connectivity Strength Features for Phrase Pivot Statistical Machine Translation

An important challenge for statistical machine translation (SMT) is the lack of parallel data for many language pairs. One common solution is to pivot through a third language for which parallel corpora with both the source and target languages exist. Although pivoting is a robust technique, it introduces some low-quality translations. In this paper, we present two language-independent features to improve the quality of phrase-pivot-based SMT. The features, source connectivity strength and target connectivity strength, reflect the quality of the projected alignments between the source and target phrases in the pivot phrase table. We show positive results (0.6 BLEU points) on Persian-Arabic SMT as a case study.
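One plausible reading of these features is as ratios of projected alignment links to phrase lengths: well-connected phrase pairs score high, while phrase pairs whose words are mostly unaligned after projection through the pivot score low. The formulation below is a hedged sketch under that assumption, not necessarily the paper's exact definitions.

```python
# Hedged sketch: connectivity strength of a source-target phrase pair whose
# word alignment was projected through a pivot language. The normalization
# by phrase length is an assumption made for illustration.

def connectivity_strength(alignment, src_len, tgt_len):
    """alignment: set of (src_idx, tgt_idx) links surviving the projection."""
    links = len(alignment)
    scs = links / src_len   # source connectivity strength
    tcs = links / tgt_len   # target connectivity strength
    return scs, tcs

# A 3-word source phrase aligned to a 2-word target phrase via 2 projected links:
scs, tcs = connectivity_strength({(0, 0), (1, 1)}, src_len=3, tgt_len=2)
```

Under this sketch, a phrase pair with many surviving alignment links relative to its length gets connectivity strengths near 1.0, signaling a more trustworthy pivot-induced translation.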


Another publication from the same category: Machine Translation

Copenhagen, Denmark, September 2017

Neural Machine Translation Leveraging Phrase-based Models in a Hybrid Search

Leonard Dahlmann, Evgeny Matusov, Pavel Petrushkov, Shahram Khadivi

In this paper, we introduce a hybrid search for attention-based neural machine translation (NMT). A target phrase learned with statistical MT models extends a hypothesis in the NMT beam search when the attention of the NMT model focuses on the source words translated by this phrase. Phrases added in this way are scored with the NMT model, but also with SMT features, including phrase-level translation probabilities and a target language model. Experimental results on German->English news-domain and English->Russian e-commerce-domain translation tasks show that using phrase-based models in NMT search improves MT quality by up to 2.3% BLEU absolute compared to a strong NMT baseline.
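The gating condition described above can be sketched as a simple check on attention mass: an SMT phrase is allowed to extend a beam hypothesis only if the decoder's attention at the current step concentrates on the source span the phrase covers. The function name, span representation, and threshold below are assumptions for illustration, not the paper's implementation.

```python
# Hedged sketch: decide whether an SMT phrase may extend an NMT beam hypothesis,
# based on how much of the current attention distribution falls on the phrase's
# source span. The 0.5 threshold is an illustrative assumption.

def phrase_applicable(attention, span, threshold=0.5):
    """attention: per-source-position weights for the current decoder step
    (assumed to sum to 1.0); span: half-open (start, end) source indices
    covered by the candidate SMT phrase."""
    start, end = span
    return sum(attention[start:end]) >= threshold

# Attention concentrated on positions 1-2 licenses a phrase over that span:
attention = [0.05, 0.6, 0.25, 0.1]
phrase_applicable(attention, (1, 3))
```

In the hybrid search, a phrase passing this check would then be scored jointly with the NMT model and the SMT features (phrase translation probabilities, target language model) before competing with ordinary word-level expansions in the beam.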