Tracking and recording users’ browsing behaviors on the web down to individual mouse clicks can create massive web session logs. While such web session data contain valuable information about user behaviors, the ever-increasing data size poses a major challenge for analyzing and visualizing the data.
An efficient data analysis framework requires both powerful computational analysis and interactive visualization. Following the visual analytics mantra "Analyze first, show the important, zoom, filter and analyze further, details on demand", we introduce a two-tier visual analysis system, TrailExplorer2, to discover knowledge from massive log data.
The system supports a visual analysis process that iterates between two steps: querying web sessions and visually analyzing the retrieved data. Querying happens at the lower tier, where terabytes of web session data are processed on a cluster.
At the upper tier, the extracted web sessions, which are much smaller in scale, are visualized on a personal computer for interactive exploration. Our system visualizes a sorted list of web sessions’ temporal patterns and enables data exploration at different levels of detail.
The query-visualization-exploration process iterates until a satisfactory conclusion is reached. We present two case studies of TrailExplorer2 using real-world session data from eBay to demonstrate the system's effectiveness.