We strongly believe in open source and giving to our community. We work directly with researchers in academia and seek out new perspectives with our intern and fellowship programs. We generalize our solutions and release them to the world as open source projects. We host discussions and publish our results.


Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Correcting Keyboard Layout Errors and Homoglyphs in Queries

Keyboard layout errors and homoglyphs in cross-language queries impact our ability to correctly interpret user information needs and offer relevant results. We present a machine learning approach to correcting these errors, based largely on character-level n-gram features. We demonstrate superior performance over rule-based methods, as well as a significant reduction in the number of queries that yield null search results.
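
As a rough illustration of the idea (not the paper's implementation), the sketch below remaps a query typed on the wrong keyboard layout and extracts character-level n-grams of the kind such a classifier could take as features. The English/Russian layout pair, the lowercase-only key map, the boundary markers, and the function names are all assumptions made for this example.

```python
# Illustrative sketch only. The layout pair (US QWERTY vs. Russian JCUKEN),
# the lowercase-only key map and all names below are assumptions.
QWERTY = "qwertyuiop[]asdfghjkl;'zxcvbnm,."
JCUKEN = "йцукенгшщзхъфывапролджэячсмитьбю"

EN_TO_RU = str.maketrans(QWERTY, JCUKEN)
RU_TO_EN = str.maketrans(JCUKEN, QWERTY)


def remap(query, table):
    """Re-type a query as if the other keyboard layout had been active."""
    return query.lower().translate(table)


def char_ngrams(text, n=3):
    """Return character n-grams, with ^ and $ marking the string boundaries."""
    padded = f"^{text}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


if __name__ == "__main__":
    typed = "ghbdtn"                      # Russian word typed with the English layout
    candidate = remap(typed, EN_TO_RU)    # "привет"
    print(candidate, char_ngrams(candidate))
```

A classifier could then compare n-gram statistics of the typed string and of the remapped candidate to decide which one is the intended query.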

Proceedings of the 6th International Joint Conference on Natural Language Processing

Selective Combination of Pivot and Direct Statistical Machine Translation Models

In this paper, we propose an approach that selectively combines pivot and direct statistical machine translation (SMT) models to improve translation quality. We work with Persian-Arabic SMT as a case study. We show positive results (gains of 0.4 to 3.1 BLEU points across different direct training corpus sizes) in addition to a large reduction in pivot translation model size.


Language Independent Connectivity Strength Features for Phrase Pivot Statistical Machine Translation

An important challenge to statistical machine translation (SMT) is the lack of parallel data for many language pairs. One common solution is to pivot through a third language for which there exist parallel corpora with the source and target languages. Although pivoting is a robust technique, it introduces some low-quality translations. In this paper, we present two language-independent features to improve the quality of phrase-pivot based SMT. The features, source connectivity strength and target connectivity strength, reflect the quality of projected alignments between the source and target phrases in the pivot phrase table. We show positive results (0.6 BLEU points) on Persian-Arabic SMT as a case study.
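
The abstract does not give formulas for the two features, so the following is only one plausible reading, sketched under stated assumptions: word alignments are projected through the pivot by composing source-pivot and pivot-target links, and each strength is the number of projected links divided by the length of the corresponding phrase. The normalization and all function names are hypothetical.

```python
# Illustrative sketch only. The definitions below (links / phrase length)
# are an assumption; the abstract does not state the exact formulas.

def project_alignment(src_pivot, pivot_tgt):
    """Compose source-pivot and pivot-target word links into source-target links."""
    return {(s, t) for (s, p) in src_pivot for (q, t) in pivot_tgt if p == q}


def connectivity_strengths(links, src_len, tgt_len):
    """Return (source strength, target strength) for a projected alignment."""
    if src_len <= 0 or tgt_len <= 0:
        raise ValueError("phrase lengths must be positive")
    return len(links) / src_len, len(links) / tgt_len


if __name__ == "__main__":
    # Word indices: source word 0 links to pivot word 0, and so on.
    src_pivot = {(0, 0), (1, 1)}
    pivot_tgt = {(0, 0), (1, 2)}
    links = project_alignment(src_pivot, pivot_tgt)
    print(links, connectivity_strengths(links, src_len=2, tgt_len=3))
```

Under this reading, a phrase pair whose words are mostly left unaligned after projection receives low scores, which a decoder could use to down-weight it.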


ICIP, September, 2015

Mine the Fine: Fine-Grained Fragment Discovery

M. Hadi Kiapour, Wei Di, Vignesh Jagadeesh, Robinson Piramuthu

While discriminative visual element mining has been introduced before, in this paper we present an approach that requires minimal annotation at both training and test time. Given only a bounding box localization of the foreground objects, our approach automatically transforms the input images into a roughly aligned pose space and discovers the most discriminative visual fragments for each category.

These fragments are then used to learn robust classifiers that discriminate between very similar categories under challenging conditions such as large variations in pose or habitat. The minimal required input is a critical characteristic that enables our approach to generalize over visual domains where expert knowledge is not readily available.

Moreover, our approach takes advantage of deep networks that are targeted towards fine-grained classification. It learns mid-level representations that are specific to a category and, at the same time, generalize well across instances of that category.

Our evaluations demonstrate that the automatically learned representation based on discriminative fragments significantly outperforms globally extracted deep features in classification accuracy.

ICVS, July, 2015

Efficient Media Retrieval from Non-Cooperative Queries

Kevin Shih, Wei Di, Vignesh Jagadeesh, Robinson Piramuthu

Text is ubiquitous in the artificial world and easily attainable when it comes to book titles and author names. Using the images from the book cover set from the Stanford Mobile Visual Search dataset and additional book covers and metadata from, we construct a large-scale book cover retrieval dataset, complete with 100K distractor covers and title and author strings for each.

Because our query images are poorly conditioned for clean text extraction, we propose a method for extracting noisy and erroneous OCR readings and matching them against clean author and book title strings in a standard document look-up problem setup.

Finally, we demonstrate how to use this text-matching as a feature in conjunction with popular retrieval features such as VLAD using a simple learning setup to achieve significant improvements in retrieval accuracy over that of either VLAD or the text alone.
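
The abstract does not specify how noisy OCR output is matched against clean strings. As a hedged sketch of one common approach, the example below scores catalog entries by Jaccard overlap of character trigrams, which tolerates isolated character errors. The similarity measure, the toy catalog, and the function names are assumptions, not the authors' method.

```python
# Illustrative sketch only. Character-trigram Jaccard matching is an
# assumed stand-in; the paper's actual matching method is not given here.

def char_ngrams(text, n=3):
    """Return the set of character n-grams of a lowercased, whitespace-normalized string."""
    cleaned = " ".join(text.lower().split())
    return {cleaned[i:i + n] for i in range(len(cleaned) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(ocr_text, catalog, n=3):
    """Return (score, entry) for the catalog string most similar to the OCR text."""
    query = char_ngrams(ocr_text, n)
    return max((jaccard(query, char_ngrams(entry, n)), entry) for entry in catalog)


if __name__ == "__main__":
    catalog = [
        "Moby Dick Herman Melville",
        "Pride and Prejudice Jane Austen",
        "War and Peace Leo Tolstoy",
    ]
    print(best_match("M0by D1ck Hermam Melvi11e", catalog))
```

A score of this kind could then serve as one feature alongside a visual descriptor score in a learned ranking function.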

2015 International Conference on Machine Learning (ICML)

Bayesian and Empirical Bayesian Forests

Matt Taddy, Chun-Sheng Chen, Jun Yu, Mitch Wyle

We derive ensembles of decision trees through a nonparametric Bayesian model, allowing us to view random forests as samples from a posterior distribution. This insight provides large gains in interpretability, and motivates a class of Bayesian forest (BF) algorithms that yield small but reliable performance gains.

Based on the BF framework, we are able to show that high-level tree hierarchy is stable in large samples. This leads to an empirical Bayesian forest (EBF) algorithm for building approximate BFs on massive distributed datasets, and we show that EBFs outperform subsampling-based alternatives by a large margin.
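
To make the "forest as posterior samples" view concrete, here is a minimal sketch that assumes each tree is fit under Bayesian-bootstrap observation weights (independent exponential draws) instead of a resampled dataset. That sampling scheme, the use of one-split regression stumps in place of full trees, and all names are assumptions for illustration; the abstract does not spell out the algorithm.

```python
# Illustrative sketch only. Bayesian-bootstrap weighting and the stump
# learner are assumptions; this is not the authors' BF or EBF code.
import random


def bayesian_bootstrap_weights(n, rng):
    """Draw Exp(1) weights and normalize them to sum to one."""
    raw = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_mean(ys, ws):
    return sum(y * w for y, w in zip(ys, ws)) / sum(ws)


def fit_weighted_stump(xs, ys, ws):
    """Fit a one-split regression tree minimizing weighted squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [(y, w) for x, y, w in zip(xs, ys, ws) if x <= threshold]
        right = [(y, w) for x, y, w in zip(xs, ys, ws) if x > threshold]
        l_mean = weighted_mean(*zip(*left))
        r_mean = weighted_mean(*zip(*right))
        err = sum(w * (y - l_mean) ** 2 for y, w in left)
        err += sum(w * (y - r_mean) ** 2 for y, w in right)
        if best is None or err < best[0]:
            best = (err, threshold, l_mean, r_mean)
    if best is None:  # all x values identical: fall back to a constant
        constant = weighted_mean(ys, ws)
        return lambda x: constant
    _, threshold, l_mean, r_mean = best
    return lambda x: l_mean if x <= threshold else r_mean


def bayesian_forest(xs, ys, n_trees=50, seed=0):
    """Average stumps fit under independent Bayesian-bootstrap weight draws."""
    rng = random.Random(seed)
    trees = [
        fit_weighted_stump(xs, ys, bayesian_bootstrap_weights(len(xs), rng))
        for _ in range(n_trees)
    ]
    return lambda x: sum(tree(x) for tree in trees) / len(trees)


if __name__ == "__main__":
    xs = [0, 1, 2, 3, 4, 5, 6, 7]
    ys = [0, 0, 0, 0, 10, 10, 10, 10]
    predict = bayesian_forest(xs, ys)
    print(predict(1), predict(6))
```

On data with one clear break, every weight draw selects the same top split, which is a toy analogue of the stability of high-level tree structure that the abstract reports for large samples.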

CVPR, June, 2015

ConceptLearner: Discovering Visual Concepts from Weakly Labeled Image Collections

Bolei Zhou, Vignesh Jagadeesh, Robinson Piramuthu

Discovering visual knowledge from weakly labeled data is crucial to scaling up computer vision recognition systems, since it is expensive to obtain fully labeled data for a large number of concept categories, while weakly labeled data can be collected from the Internet cheaply and at massive scale.

In this paper we propose a scalable approach to discovering visual concepts from weakly labeled image collections, with thousands of visual concept detectors learned. We then show that the learned detectors can be applied to accurately recognize concepts at the image level and detect concepts at the image-region level.

Under domain-selected supervision, we further evaluate the learned concepts for scene recognition on the SUN database and for object detection on Pascal VOC 2007. The approach shows promising performance compared to fully supervised and weakly supervised methods.

KDD 2014

Large Scale Visual Recommendations From Street Fashion Images

Vignesh Jagadeesh, Robinson Piramuthu, Anurag Bhardwaj, Wei Di, Neel Sundaresan

We describe a completely automated large-scale visual recommendation system for fashion. Our focus is to efficiently harness the availability of large quantities of online fashion images and their rich metadata.

Specifically, we propose two classes of data-driven models, Deterministic Fashion Recommenders (DFR) and Stochastic Fashion Recommenders (SFR), for solving this problem. We analyze the relative merits and pitfalls of these algorithms through extensive experimentation on a large-scale data set and baseline them against existing ideas from color science.

We also illustrate key fashion insights learned through these experiments and show how they can be employed to design better recommendation systems.

The industrial applicability of the proposed models is in the context of mobile fashion shopping. Finally, we also outline a large-scale annotated data set of fashion images (Fashion-136K) that can be exploited for future research in data-driven visual fashion.

WSDM, 2014

Is a picture really worth a thousand words?: - on the role of images in e-commerce

Wei Di, Neel Sundaresan, Anurag Bhardwaj, Robinson Piramuthu

In online peer-to-peer marketplaces, where physical examination of the goods is infeasible, textual descriptions, images of the products, and the reputation of the participants play key roles. Images are a powerful channel for conveying crucial information to e-shoppers and influencing their choices.

In this paper, we investigate a well-known online marketplace where millions of products change hands and most are described with the help of one or more images. We present a systematic data mining and knowledge discovery approach that aims to quantitatively dissect the role of images in e-commerce in great detail. Our goal is two-fold.

First, we aim to get a thorough understanding of the impact of images across various dimensions: product categories, user segments, and conversion rate. We present a quantitative evaluation of the influence of images and show how to leverage different image aspects, such as quantity and quality, to effectively raise sales. Second, we study the interaction of image data with other selling dimensions by jointly modeling them with user behavior data.

Results suggest that "watch" behavior encodes complex signals combining both attention and hesitation from the buyer, in which images still hold an important role when compared to other selling variables, especially for products for which appearance is important. We conclude with how these findings can benefit sellers in a highly competitive online e-commerce market.