We strongly believe in open source and giving to our community. We work directly with researchers in academia and seek out new perspectives with our intern and fellowship programs. We generalize our solutions and release them to the world as open source projects. We host discussions and publish our results.


STOC 2011 (Invited and accepted to SICOMP)

Distributed Verification and Hardness of Distributed Approximation

Atish Das Sarma, Stephan Holzer, Liah Kor, Amos Korman, Danupon Nanongkai, Gopal Pandurangan, David Peleg, Roger Wattenhofer, Atish Das Sarma, Stephan Holzer, Liah Kor, Amos Korman, Danupon Nanongkai, Gopal Pandurangan, David Peleg, Roger Wattenhofer

We study the verification problem in distributed networks, stated as follows. Let H be a subgraph of a network G where each vertex of G knows which edges incident on it are in H. We would like to verify whether H has some properties, e.g., if it is a tree or if it is connected (every node knows in the end of the process whether H has the specified property or not). We would like to perform this verification in a decentralized fashion via a distributed algorithm. The time complexity of verification is measured as the number of rounds of distributed communication.

In this paper we initiate a systematic study of distributed verification, and give almost tight lower bounds on the running time of distributed verification algorithms for many fundamental problems such as connectivity, spanning connected subgraph, and s-t cut verification.

We then show applications of these results in deriving strong unconditional time lower bounds on the hardness of distributed approximation for many classical optimization problems including minimum spanning tree, shortest paths, and minimum cut.

Many of these results are the first non-trivial lower bounds for both exact and approximate distributed computation and they resolve previous open questions. Moreover, our unconditional lower bound of approximating minimum spanning tree (MST) subsumes and improves upon the previous hardness of approximation bound of Elkin [STOC 2004] as well as the lower bound for (exact) MST computation of Peleg and Rubinovich [FOCS 1999]. Our result implies that there can be no distributed approximation algorithm for MST that is significantly faster than the current exact algorithm, for any approximation factor.

Our lower bound proofs show an interesting connection between communication complexity and distributed computing which turns out to be useful in establishing the time complexity of exact and approximate distributed computation of many problems.

PVLDB 2011 (Invited to VLDB Journal Special Issue)

Personalized Social Recommendations – Accurate or Private?

Atish Das Sarma, Ashwin Machanavajjhala, Aleksandra Korolova

With the recent surge of social networks such as Facebook, new forms of recommendations have become possible -- recommendations that rely on one's social connections in order to make personalized recommendations of ads, content, products, and people. Since recommendations may use sensitive information, it is speculated that these recommendations are associated with privacy risks. The main contribution of this work is in formalizing trade-offs between accuracy and privacy of personalized social recommendations.

We study whether "social recommendations", or recommendations that are solely based on a user's social network, can be made without disclosing sensitive links in the social graph. More precisely, we quantify the loss in utility when existing recommendation algorithms are modified to satisfy a strong notion of privacy, called differential privacy. We prove lower bounds on the minimum loss in utility for any recommendation algorithm that is differentially private.

We then adapt two privacy preserving algorithms from the differential privacy literature to the problem of social recommendations, and analyze their performance in comparison to our lower bounds, both analytically and experimentally.

We show that good private social recommendations are feasible only for a small subset of the users in the social network or for a lenient setting of privacy parameters.

PODC 2011

A tight unconditional lower bound on distributed random walk computation

Atish Das Sarma, Danupon Nanongkai, Gopal Pandurangan, Atish Das Sarma, Danupon Nanongkai, Gopal Pandurangan

No Information

Volume 18, Issue 1, 2011

Rat, rational, or seething cauldron of desire: designing the shopper, interactions

Elizabeth Churchill

In this article, I look at how our shopping experience, both online and offline, is designed.

SIGIR 2011: 75-84

User behavior in zero-recall ecommerce queries

Gyanit Singh, Nish Parikh, Neel Sundaresan

User expectation and experience for web search and eCommerce (product) search are quite different. Product descriptions are concise as compared to typical web documents. User expectation is more specific to find the right product.

The difference in the publisher and searcher vocabulary (in case of product search the seller and the buyer vocabulary) combined with the fact that there are fewer products to search over than web documents result in observable numbers of searches that return no results (zero recall searches).

In this paper we describe a study of zero recall searches. Our study is focused on eCommerce search and uses data from a leading eCommerce site's user click stream logs.

There are 3 main contributions of our study: 1) The cause of zero recall searches; 2) A study of user's reaction and recovery from zero recall; 3) A study of differences in behavior of power users versus novice users to zero recall searches.

in Proceedings of the fourth ACM international conference on Web search and data mining (WSDM), 2011.

eBay: an E-commerce Marketplace as a Complex Network

Zeqian Shen, Neel Sundaresan, Zeqian Shen, Neel Sundaresan

Commerce networks involve buying and selling activities among individuals or organizations. As the growing of the Internet and e-commerce, it brings opportunities for obtaining real world online commerce networks, which are magnitude larger than before.

Getting a deeper understanding of e-commerce networks, such as the eBay marketplace, in terms of what structure they have, what kind of interactions they afford, what trust and reputation measures exist, and how they evolve has tremendous value in suggesting business opportunities and building effective user applications.

In this paper, we modeled the eBay network as a complex network. We analyzed the macroscopic shape of the network using degree distribution and the bow-tie model. Networks of different eBay categories are also compared.

The results suggest that the categories vary from collector networks to retail networks. We also studied the local structures of the networks using motif profiling. Finally, patterns of preferential connections are visually analyzed using Auroral diagrams.

Interacting with Computers, 10/2011, Volume 23, Issue 5, p.iii-xi, 2011

Feminism and HCI: New Perspectives

Shaowen Bardzell, Elizabeth Churchill

As a word and as a set of theories and practices, feminism is a poorly understood concept. However, feminist perspectives have a lot in common with user- and value-centered design processes such as those espoused within the field of Human Computer Interaction.

Examples include consideration of alternative viewpoints, considerations of agency (who get to say/do what and under what circumstances) and the development of reflective and reflexive methods for understanding how, when, where and why people do what they do.

In the ''Feminism and HCI: New Perspectives'' special issue, we have invited researchers and practitioners to reflect on the ways in which feminist thinking, theory, and practice can and does have an impact on the field of Human Computer Interaction.

This introductory editorial offers more background to our view that there is great value to understanding the actual and potential impact of feminist thinking on HCI, followed by a precis of each paper. We close with some observations regarding common themes, points of contention and possibilities for future work.

Journal of the American Society for Information Science and Technology, 09/2011, 2011

Automatic identification of personal insults on social news sites

Sara OwsleySood, Elizabeth Churchill, Judd Antin

As online communities grow and the volume of user-generated content increases, the need for community management also rises. Community management has three main purposes: to create a positive experience for existing participants, to promote appropriate, socionormative behaviors, and to encourage potential participants to make contributions.

Research indicates that the quality of content a potential participant sees on a site is highly influential; off-topic, negative comments with malicious intent are a particularly strong boundary to participation or set the tone for encouraging similar contributions.

A problem for community managers, therefore, is the detection and elimination of such undesirable content. As a community grows, this undertaking becomes more daunting. Can an automated system aid community managers in this task? In this paper, we address this question through a machine learning approach to automatic detection of inappropriate negative user contributions.

Our training corpus is a set of comments from a news commenting site that we tasked Amazon Mechanical Turk workers with labeling. Each comment is labeled for the presence of profanity, insults, and the object of the insults.

Support vector machines trained on these data are combined with relevance and valence analysis systems in a multistep approach to the detection of inappropriate negative user contributions. The system shows great potential for semiautomated community management.

WSDM 2011: 765-774, Hong Kong, February 2011

Query suggestion for E-commerce sites

Mohammad AlHasan, Nish Parikh, Gyanit Singh, Neel Sundaresan

Query suggestion module is an integral part of every search engine. It helps search engine users narrow or broaden their searches. Published work on query suggestion methods has mainly focused on the web domain. But, the module is also popular in the domain of e-commerce for product search.

In this paper, we discuss query suggestion and its methodologies in the context of e-commerce search engines. We show that dynamic inventory combined with long and sparse tail of query distribution poses unique challenges to build a query suggestion method for an e-commerce marketplace.

We compare and contrast the design of a query suggestion system for web search engines and e-commerce search engines. Further, we discuss interesting measures to quantify the effectiveness of our query suggestion methodologies. We also describe the learning gained from exposing our query suggestion module to a vibrant community of millions of users.