Zeqian (Jack) Shen
Large-scale data analysis and visualization, web and mobile analytics, user interface design and social network analysis
Jack joined eBay Research Labs in 2008. He works on large-scale data analytics, with the aim of capturing and modeling user behavior from session logs. Jack received his Ph.D. in computer science from the University of California, Davis, specializing in information visualization. He also holds an MS in Computer Engineering from the University of Tennessee at Knoxville and a BS in Computer Science from Zhejiang University in China.
accepted to WWW2013 poster.
in Proceedings of the 22nd international conference on World Wide Web (WWW ’13)
Reuse and remarketing of content and products is an integral part of the internet. As e-commerce has grown, online resale and secondary markets have come to form a significant part of the commerce space. The intentions and methods for reselling are diverse. In this paper, we study an instance of such markets that affords interesting data at large scale for mining purposes to understand the properties and patterns of this online market. As part of knowledge discovery of such a market, we first formally propose criteria to reveal unseen resale behaviors by elastic matching identification (EMI) based on the account transfer and item similarity properties of transactions. Then, we present a large-scale system that leverages the MapReduce paradigm to mine millions of online resale activities from petabyte-scale heterogeneous e-commerce data. With the collected data, we show that the number of resale activities follows a power law distribution with a ‘long tail’, where a significant share of users resell only in very low numbers and a large portion of resales come from a small number of highly active resellers. We further conduct a comprehensive empirical study of different aspects of resales, including temporal and spatial patterns, user demographics, reputation, and the content of sale postings. Based on these observations, we explore the features related to “successful” resale transactions and evaluate whether such transactions can be predicted. We also discuss uses of this information mining for business insights and user experience on a real-world online marketplace.
in IEEE Visual Analytics Science and Technology (VAST) 2012
Web clickstream data are routinely collected to study how users browse the web or use a service. The ability to recognize and summarize user behavior patterns from such data is valuable to e-commerce companies. In this paper, we introduce a visual analytics system to explore the various user behavior patterns reflected by distinct clickstream clusters. In a practical analysis scenario, the system first presents an overview of clickstream clusters using a Self-Organizing Map with Markov chain models. Then the analyst can interactively explore the clusters through an intuitive user interface, either obtaining a summary of a selected group of data or further refining the clustering result. We evaluated our system using two different datasets from eBay. Analysts who were working on the same data have confirmed the system’s effectiveness in extracting user behavior patterns from complex datasets and enhancing their ability to reason.
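As a rough illustration of the Markov chain idea mentioned in the abstract above, the sketch below estimates first-order page-to-page transition probabilities from a few clickstreams. It is a minimal example under stated assumptions, not the system described in the paper: the function name `transition_probabilities`, the page labels, and the toy sessions are all hypothetical, and the Self-Organizing Map clustering step is not shown.

```python
from collections import Counter, defaultdict


def transition_probabilities(sessions):
    """Estimate first-order Markov transition probabilities.

    `sessions` is a list of clickstreams, each a list of page labels.
    Returns {page: {next_page: probability}}.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        # Count each consecutive (current page, next page) pair.
        for current_page, next_page in zip(session, session[1:]):
            counts[current_page][next_page] += 1

    probs = {}
    for page, nexts in counts.items():
        total = sum(nexts.values())
        probs[page] = {nxt: n / total for nxt, n in nexts.items()}
    return probs


# Hypothetical toy sessions; real clickstreams would come from session logs.
sessions = [
    ["home", "search", "item", "checkout"],
    ["home", "search", "search", "item"],
    ["home", "item"],
]
model = transition_probabilities(sessions)
print(model["home"])
```

A model of this kind gives each clickstream, or each cluster of clickstreams, a compact summary that can be compared against others.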
in IEEE Large-scale Data Analysis and Visualization (LDAV) 2012
Tracking and recording users’ browsing behaviors on the web down to individual mouse clicks can create massive web session logs. While such web session data contain valuable information about user behaviors, the ever-increasing data size poses a major challenge for analyzing and visualizing the data. An efficient data analysis framework requires both powerful computational analysis and interactive visualization. Following the visual analytics mantra "Analyze first, show the important, zoom, filter and analyze further, details on demand", we introduce a two-tier visual analysis system, TrailExplorer2, to discover knowledge from massive log data. The system supports a visual analysis process iterating between two steps: querying web sessions and visually analyzing the retrieved data. The query happens at the lower tier, where terabytes of web session data are processed in a cluster. At the upper tier, the extracted web sessions, which are much smaller in scale, are visualized on a personal computer for interactive exploration. Our system visualizes a sorted list of web sessions’ temporal patterns and enables data exploration at different levels of detail. The query-visualization-exploration process iterates until a satisfactory conclusion is reached. We present two case studies of TrailExplorer2 using real-world session data from eBay to demonstrate the system's effectiveness.
in Proceedings of the fourth ACM international conference on Web search and data mining (WSDM), 2011.
Commerce networks involve buying and selling activities among individuals or organizations. The growth of the Internet and e-commerce brings opportunities for obtaining real-world online commerce networks that are orders of magnitude larger than before. Getting a deeper understanding of e-commerce networks, such as the eBay marketplace, in terms of what structure they have, what kind of interactions they afford, what trust and reputation measures exist, and how they evolve has tremendous value in suggesting business opportunities and building effective user applications. In this paper, we modeled the eBay network as a complex network. We analyzed the macroscopic shape of the network using degree distribution and the bow-tie model. Networks of different eBay categories are also compared. The results suggest that the categories vary from collector networks to retail networks. We also studied the local structures of the networks using motif profiling. Finally, patterns of preferential connections are visually analyzed using Auroral diagrams.
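To make the degree-distribution analysis mentioned in the abstract above more concrete, here is a minimal sketch that computes in-degree and out-degree histograms for a small directed commerce network. The edge direction (seller to buyer), the function name `degree_distributions`, and the toy transactions are assumptions for illustration only and are not taken from the paper.

```python
from collections import Counter


def degree_distributions(edges):
    """Return (out-degree histogram, in-degree histogram).

    `edges` is a list of (seller, buyer) pairs; each pair is treated as
    one directed edge seller -> buyer (an assumption of this sketch).
    Each histogram maps a degree value to the number of nodes with it.
    """
    out_deg = Counter()
    in_deg = Counter()
    for seller, buyer in edges:
        out_deg[seller] += 1
        in_deg[buyer] += 1

    # Include nodes that only buy (out-degree 0) or only sell (in-degree 0).
    nodes = set(out_deg) | set(in_deg)
    out_hist = Counter(out_deg[n] for n in nodes)
    in_hist = Counter(in_deg[n] for n in nodes)
    return dict(out_hist), dict(in_hist)


# Hypothetical toy transactions.
edges = [
    ("s1", "b1"), ("s1", "b2"), ("s1", "b3"),
    ("s2", "b1"), ("b1", "s2"),
]
out_hist, in_hist = degree_distributions(edges)
print(out_hist, in_hist)
```

On real marketplace data, plotting such histograms on log-log axes is a common way to inspect whether the distribution is heavy-tailed.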
In 12th International Workshop on Agent Mediated Electronic Commerce (AMEC-10) Toronto, Canada, May 2010
Online markets have enjoyed explosive growth and emerged as an important research topic in the field of electronic commerce. Researchers have mostly focused on studying consumer behavior and experience, while largely neglecting the seller side of these markets. Our research addresses the problem of examining the strategies sellers employ in listing their products on online marketplaces. In particular, we introduce a Markov Chain model that captures and predicts seller listing behavior based on their present and past actions, their relative positions in the market, and market conditions. These features distinguish our approach from existing models that usually overlook the importance of historical information, as well as sellers’ interactions. We choose to examine successful sellers on eBay, one of the most prominent online marketplaces, and empirically test our model framework using eBay’s data for fixed-priced items collected over a period of four and a half months. This empirical study entails comparing our most complex history-dependent model’s predictive power against that of a semi-random behavior baseline model and our own history-independent model. The outcomes exhibit differences between sellers in their listing strategies for different products, and validate our models’ capability in capturing seller behavior. Furthermore, the incorporation of historical information on seller actions in our model proves to improve its predictions of future behavior.
In IEEE VisWeek Discovery Exhibition, Salt Lake City, Utah, USA, 2010
Trail Explorer is a visual analytics tool for better understanding of user experiences in webpage flows. It enables exploration and discovery of user session data. This paper presents two case studies of Trail Explorer in use with real data.
In Proceedings of IEEE Pacific Visualization Symposium, IEEE VGTC, March, 2008, pp.175-182
The widespread use of mobile devices brings opportunities to capture large-scale, continuous information about human behavior. Mobile data has tremendous value, leading to business opportunities, market strategies, security concerns, etc. Visual analytics systems that support interactive exploration and discovery are needed to extract insight from the data. However, visual analysis of complex social-spatial-temporal mobile data presents several challenges. We have created MobiVis, a visual analytics tool, which incorporates the idea of presenting social and spatial information in one heterogeneous network. The system supports temporal and semantic filtering through an interactive time chart and ontology graph, respectively, such that data subsets of interest can be isolated for close-up investigation. "Behavior rings," a compact radial representation of individual and group behaviors, are introduced to allow easy comparison of behavior patterns. We demonstrate the capability of MobiVis with the results obtained from analyzing the MIT Reality Mining dataset.
In IEEE VAST 2008 Symposium Challenge, 2008
MobiVis is a visual analytics tool that aids in processing and understanding complex relational data, such as social networks. At the core of these tools is the ability to filter complex networks structurally and semantically, which helps us discover clusters and patterns in the organization of social networks. Semantic filtering is obtained via an ontology graph, based on another visual analytics tool, called OntoVis. In this summary, we describe how these tools were used to analyze one of the mini-challenges of the 2008 VAST challenge.
In Proceedings of Eurographics/IEEE VGTC Symposium on Visualization, May 2007, pp. 83-90
For displaying a dense graph, an adjacency matrix is superior to a node-link diagram because it is more compact and free of visual clutter. A node-link diagram, however, is far better for the task of path finding because a path can be easily traced by following the corresponding links, provided that the links are not heavily crossed or tangled. We augment adjacency matrices with path visualization and associated interaction techniques to facilitate path finding. Our design is visually pleasing, and also effectively displays multiple paths based on the design commonly found in metro maps. We illustrate and assess the key aspects of our design with the results obtained from two case studies and an informal user study.
U.S. Patent (filed 06/15/2012) Systems and Methods for Behavioral Modeling to Optimize Shopping Cart Conversion
U.S. Patent (filed 08/09/2011) Session Analysis Systems and Methods
U.S. Patent (filed 03/25/2011) Analyzing Marketplace Listing Strategies
U.S. Patent (filed 03/25/2011) Category Management and Analysis
U.S. Patent (filed 04/22/2008) Reputation Scoring
U.S. Patent (filed 12/30/2007) Method and System for Social Network Analysis