Given an e-commerce query, how well the titles of the items for sale match the user's intent is an important signal for ranking those items. A well-known technique for computing this signal is to use a standard machine-learned model that uses words as features, targets user clicks, and predicts a score to rank the titles. In this paper, we introduce an alternative modeling technique that applies to queries frequent enough to have historical click data. For each such query we build a parameterized model of user behavior that learns what makes users skip a title; the parameters differ from query to query. Specifically, our model predicts how desirable an item's title is for the user query by focusing on the worst tokens in the title. The model is learned offline via maximum likelihood on user behavioral data, significantly reducing query processing cost. The model's output score is used as a feature in a machine-learned ranker for e-commerce search at eBay. Beyond titles, the model design can easily incorporate any attribute of an item, including structured content. In this work, we present our new title desirability model, built for nearly 8M queries and recently deployed into the eBay search ecosystem, and demonstrate its significant performance improvement over a baseline click-based Naïve Bayes model through several evaluation approaches, including A/B testing and human judgment. The reported performance is based on eBay's commercial search engine, which serves millions of queries each day.
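The "worst token" idea in the abstract can be illustrated with a minimal sketch. Everything below is a hypothetical simplification for intuition only: it estimates, for a single query, a per-token desirability as a smoothed click-through fraction over historical impressions, and then scores a title by its least desirable token. The paper's actual per-query model and its maximum-likelihood fit are more involved; the function and parameter names here (`fit_token_desirability`, `title_score`, `smoothing`) are illustrative, not from the paper.

```python
from collections import defaultdict

def fit_token_desirability(impressions, smoothing=1.0):
    """Toy per-query estimate of token desirability.

    impressions: list of (title_tokens, clicked) pairs for ONE query.
    Returns token -> smoothed fraction of impressions containing the
    token that were clicked rather than skipped.
    """
    clicks = defaultdict(float)
    shows = defaultdict(float)
    for tokens, clicked in impressions:
        for t in set(tokens):          # count each token once per title
            shows[t] += 1.0
            if clicked:
                clicks[t] += 1.0
    # Laplace-style smoothing so rarely seen tokens are not pinned to 0 or 1.
    return {t: (clicks[t] + smoothing) / (shows[t] + 2.0 * smoothing)
            for t in shows}

def title_score(desirability, title_tokens, default=0.5):
    """Score a title by its WORST (least desirable) token.

    Unseen tokens fall back to a neutral default prior.
    """
    return min((desirability.get(t, default) for t in title_tokens),
               default=default)
```

Under this sketch, a single off-putting token (say, "broken") drags the whole title's score down, regardless of how desirable the other tokens are, which matches the abstract's focus on what makes users skip a title.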
Advances in Neural Information Processing Systems (NIPS), 2014
Parallel Feature Selection inspired by Group Testing
Yingbo Zhou, Utkarsh Porwal, Ce Zhang, Hung Q Ngo, Long Nguyen, Christopher Ré, Venu Govindaraju
This paper presents a parallel feature selection method for classification that scales to very high dimensions and large data sizes. Our method is inspired by group testing theory, under which the feature selection procedure consists of a collection of randomized tests performed in parallel. Each test corresponds to a subset of features, to which a scoring function may be applied to measure the relevance of those features for a classification task. We develop a general theory providing sufficient conditions under which true features are guaranteed to be correctly identified. Superior performance of our method is demonstrated on a challenging relation extraction task over a very large data set that has both redundant features and a sample size on the order of millions. We present comprehensive comparisons with state-of-the-art feature selection methods on a range of data sets, on which our method exhibits competitive performance in terms of running time and accuracy. Moreover, it also yields substantial speedups when used as a pre-processing step for most other existing methods.
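The group-testing flavor of the procedure can be sketched as follows. This is a hedged toy illustration, not the paper's algorithm or theory: each randomized test draws a subset of features, scores the subset with a simple stand-in relevance measure (here, the largest between-class mean difference of any feature in the subset), and each feature's final score is its average over the tests that included it. Truly relevant features accumulate high scores because every test containing them scores well. The names (`group_test_feature_selection`, `subset_frac`) and the specific scorer are assumptions made for this sketch; each test is independent, so in a real setting they would be run in parallel.

```python
import random
from collections import defaultdict

def group_test_feature_selection(X, y, n_tests=100, subset_frac=0.5, seed=0):
    """Toy group-testing-style feature selection for binary labels.

    X: list of feature vectors (lists of floats); y: list of 0/1 labels.
    Returns feature index -> average score over the tests it appeared in.
    """
    rng = random.Random(seed)
    n_features = len(X[0])
    totals = defaultdict(float)
    counts = defaultdict(int)

    def score_subset(subset):
        # Stand-in relevance measure: the largest absolute difference in
        # class-conditional means achieved by any feature in the subset.
        best = 0.0
        for j in subset:
            pos = [row[j] for row, label in zip(X, y) if label == 1]
            neg = [row[j] for row, label in zip(X, y) if label == 0]
            if pos and neg:
                diff = abs(sum(pos) / len(pos) - sum(neg) / len(neg))
                best = max(best, diff)
        return best

    k = max(1, int(subset_frac * n_features))
    for _ in range(n_tests):
        subset = rng.sample(range(n_features), k)   # one randomized test
        s = score_subset(subset)
        for j in subset:                            # credit every member
            totals[j] += s
            counts[j] += 1

    return {j: totals[j] / counts[j] for j in counts}
```

Irrelevant features also receive credit when they happen to share a subset with a relevant one, but averaged over many random tests that spillover is diluted, which is the intuition behind the decoding step in group testing.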