Shahram Khadivi

Research Scientist

Shahram is a research scientist on the Cognitive Computing team at eBay. His main research areas are statistical machine translation, natural language processing, and machine learning. He has over six years of academic experience as an assistant professor.

Shahram received his Ph.D. degree in computer science from RWTH Aachen University, Aachen, Germany, in 2008. He also received the B.S. and M.S. degrees in computer engineering from Amirkabir University of Technology, Tehran, Iran, in 1996 and 1999, respectively.

His first exposure to statistical machine translation came at the beginning of 2002, when he joined RWTH Aachen University, Aachen, Germany, and became involved in the TransType2 project. Afterwards, he actively participated in several international MT projects, including TC-Star, GALE, and TRANSTAC. He also has experience participating in international MT evaluations such as IWSLT and NIST. During his career as an assistant professor, he focused mainly on MT, NLP, and ML research. He completed several industrial projects in these areas as the PI. He made valuable contributions to developing the MT field for the Persian language; a tangible output of these activities is a domestic translation service that provides online machine translation to local search engines. He also completed two data analysis projects in the banking industry, one on fraud analysis and the other on customer segmentation.

His research is reflected in over 60 research papers. His research interests include statistical machine translation, computational natural language processing, information retrieval, machine learning, and data analysis.

Association for Machine Translation in the Americas (AMTA), Oct. 2016

Guided Alignment Training for Topic-Aware Neural Machine Translation

Wenhu Chen, Evgeny Matusov, Shahram Khadivi, Jan-Thorsten Peter

In this paper, we propose an effective way for biasing the attention mechanism of a sequence-to-sequence neural machine translation (NMT) model towards the well-studied statistical word alignment models. We show that our novel guided alignment training approach improves translation quality on real-life e-commerce texts consisting of product titles and descriptions, overcoming the problems posed by many unknown words and a large type/token ratio. We also show that meta-data associated with input texts such as topic or category information can significantly improve translation quality when used as an additional signal to the decoder part of the network. With both novel features, the BLEU score of the NMT system on a product title set improves from 18.6 to 21.3%. Even larger MT quality gains are obtained through domain adaptation of a general domain NMT system to e-commerce data. The developed NMT system also performs well on the IWSLT speech translation task, where an ensemble of four variant systems outperforms the phrase-based baseline by 2.1% BLEU absolute.

EMNLP, Copenhagen, Denmark, September 2017

Neural Machine Translation Leveraging Phrase-based Models in a Hybrid Search

In this paper, we introduce a hybrid search for attention-based neural machine translation (NMT). A target phrase learned with statistical MT models extends a hypothesis in the NMT beam search when the attention of the NMT model focuses on the source words translated by this phrase. Phrases added in this way are scored with the NMT model, but also with SMT features including phrase-level translation probabilities and a target language model. Experimental results on German→English news domain and English→Russian e-commerce domain translation tasks show that using phrase-based models in NMT search improves MT quality by up to 2.3% BLEU absolute as compared to a strong NMT baseline.

MT Summit, Nagoya, Japan, September 2017

Neural and Statistical Methods for Leveraging Meta-information in Machine Translation

Shahram Khadivi, Patrick Wilken, Leonard Dahlmann, Evgeny Matusov

In this paper, we discuss different methods which use meta information and richer context that may accompany source language input to improve machine translation quality. We focus on category information of input text as meta information, but the proposed methods can be extended to all textual and non-textual meta information that might be available for the input text or automatically predicted using the text content. The main novelty of this work is to use state-of-the-art neural network methods to tackle this problem within a statistical machine translation (SMT) framework. We observe translation quality improvements up to 3% in terms of BLEU score in some text categories.