Relevance Sort (aka Best Match)
All in all, my personal belief is that when used appropriately, this capability will open the doors to significant revenue opportunities, not just for eBay, but also for the vast network of affiliate developers. Every developer who provides search functionality using the eBay API can now provide a better experience to end users.
Related Searches
Sorting By Distance
Had we implemented all this directly, it would have required massive investments in hardware to satisfy the query volume this feature would get. Worse, for a query returning ten thousand results, it would have taken many seconds to calculate the distance for all of them. Amongst competing proposals, I came up with an algorithm that gets the geographic distance between two points using just a couple of integer additions and subtractions, and is still accurate to within 2 miles anywhere on the surface of the earth.
Contextual Keyword Extractor (aka eBay AdContext)
Behavioral data search engine
Other Stuff
In most of commerce, whether online or offline, there is usually a visible distinction between buyers and sellers. But on eBay there is a huge intersection between the two groups, and the complexity this interaction generates can easily be compared to some of nature's greatest mysteries. Consequently, I believe it is simply not possible to computationally model all of eBay activity, as it would require the sum of human knowledge, updated right up to the current second. We can only make approximations...but herein lies the paradox. If you make a perceivably positive change to the system based upon previous data analysis, can you be absolutely sure you won't cause an opposite reaction?
eBay listings are a jumble of a wide variety of words and characters, so much so that eBay now has its own lexicon, vastly different from any other e-commerce dataset. Sellers demonstrate extreme creativity in designing their item titles to capture the viewers’ attention. Given that regular eBay search works by simple keyword matching, and that hundreds of millions of listings are for sale at any time, nearly every popular search brings back a large number of individual listings. Some folks actually like it, some don't. Some users complain that "i pod" doesn't also return "ipod", whereas others complain if it does. Someone says "flutes" is the plural of the musical instrument flute, someone else says it means champagne glasses.
I have lent my support to many initiatives over the years, and opposed others. I supported adding plurals in search, but not synonyms. I supported aggressive spell check recommendations, but not auto-correction. I supported making "stop words" in descriptions into wildcards, but not their removal altogether. In eBay's context there are no right or wrong answers to such questions, as most changes are beneficial to one segment of users and bad for another.
The challenge, of course, is to first come up with the correct question, and then the correct answer, which would result in a win-win for all.
For a long time within eBay it was considered impossible to sort live listings by relevance in real time without human training/classification. Unlike regular web search, where the index does not need to be updated the moment someone somewhere in the world changes his/her webpage, eBay's search engine needs to constantly update its index as thousands of items get listed and expire every second. On top of that, there was confusion over the meaning of "relevance" in an auction-style e-commerce scenario. In 2005 I came up with an approach and built a prototype, and it worked remarkably well. After months of evangelizing to business, I led the project's production implementation. It was one of the most difficult architectural, performance, and deadline-driven challenges I have faced. By far the most challenging part turned out to be the abstraction of "relevance" across all the machines in the massive eBay search grid.
It's disappointing that the name "Best Match" was ultimately chosen for describing the sort by relevance functionality on the main site, in particular because it has nothing to do with the "Best Matches" mechanism used in eBay eXpress. The two share neither the algorithm, nor the code, nor the implementation; they share only the name. The term "Best Match" seems to be getting used as an umbrella term for many things, including sorting category links by relevance on the new keyword pages such as http://buy.ebay.com/laptop (this uses another algorithm I invented separately), product recommendations, and also regular items. Esteban Kozak has done a great job as product manager, and I hope this feature will continue to evolve. To check out relevance sort, compare the search results you see for ipod nano under the default Time Ending Soonest sort with the results for ipod nano sorted by relevance.
When Related Searches was launched in July 2005, there was not even a whisper from the eBay community, but the click-through rates were impressive. From my point of view, this was an excellent launch. I had worked hard to sell the research prototype within eBay initially, but ultimately the quality of the recommendations sold itself. Subsequently I led the development team and actually wrote the code myself for the production backend service. Ben Foster did a great job as product manager in defining the end-user experience, and he respected my wishes to place the feature close to the search box while still making it as inconspicuous as possible. Try it: http://search.ebay.com/ferrari. As part of this effort, I also built the system which tracks the popular searches for all categories as shown on eBay Pulse. I believe all of this data has also been made available to eBay API users, and I hope it's turning out to be as useful as I had thought it would be.
I remember that the launch of this feature many years back annoyed a large segment of the community, because it replaced the original "search by region" feature instead of adding to it. On top of that, the new functionality was largely hidden, since one had to select the "Distance" option from the sorting drop-down. The new functionality allowed one to specify a zip code and sort the listings in the search results by distance from that zip code. The underlying problem is this: given the latitude and longitude of two points on the earth, calculate the geographical distance between them. For short distances it sounds trivial, but for longer distances, the curvature of the earth requires complex trigonometric calculations to get an accurate answer. You need to find the Great Circle around the earth which joins the two points such that the arc between the two points is the shortest possible. Then you need to find the length of the arc.
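To give a feel for the trigonometry involved, here is a minimal Python sketch of the textbook haversine formula for great-circle distance. This is the standard spherical approach, not the integer-only algorithm described on this page. The function name, the mean earth radius of 3958.8 miles, and the sample coordinates are all assumptions chosen for illustration.

```python
import math

# Assumption: mean earth radius in statute miles (the earth is treated as a sphere).
EARTH_RADIUS_MILES = 3958.8


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)

    # Haversine of the central angle between the two points.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)

    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    central_angle = 2 * math.asin(min(1.0, math.sqrt(a)))

    # Arc length = radius * central angle (in radians).
    return EARTH_RADIUS_MILES * central_angle


if __name__ == "__main__":
    # Illustrative coordinates: roughly San Francisco and New York City.
    d = great_circle_miles(37.7749, -122.4194, 40.7128, -74.0060)
    print(f"{d:.0f} miles")
```

Each call needs several sine and cosine evaluations, a square root, and an arcsine, which is why running it for every listing in a large result set, on every query, gets expensive.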
This is still not live, although the basic technology was built a while back. I created the algorithm which takes any arbitrary piece of text and spews out a list of the most relevant keywords. These can be broken out as search links to eBay, or used to actually perform a relevance search on eBay and get back the top few most relevant items to show as advertisements.
We mine a huge amount of data from user activity, for various purposes. For example, some data is available at the user level. Some is available at a category level, some at country level, and some global. Sometimes the same type of data might be available in all resolutions. To make all of this data available for various uses in a generic manner, I led the design and architecture efforts for building a custom search engine/repository/database. We invented a brand new hierarchical query language derived from SQL to allow for querying the custom multi-dimensional hierarchical database, and mechanisms to save huge amounts of data in as efficient a manner as possible, while still providing millisecond level query response. This component cannot be seen by end users, but rest assured it is being heavily used. This component is a significant part of the "feedback loop", where activity on the site is automatically collected, analysed, processed and fed back into the live system.
Patents (pending)
Demos
This patent covers the core algorithm used when eBay users sort or filter search results by distance. Without this algorithm, the implementation was turning out to be many orders of magnitude more expensive, and thus prohibitive. This was enabled on the site around June 2004.
This patent covers the algorithm and methodology used in directing eBay search users to more effective or alternative search queries, chosen such that the recommendations are known to be more successful in allowing users to find what they want. This has been active on the site since June 2005.
This is the "Best Match" patent.
The above three patents relate to the technology and algorithms used to build the "eBay AdContext" product.
This relates to a methodology whereby the user is presented with intelligent search recommendations for those cases where the original search resulted in no matches being found.
This prototype demonstrates an image segmentation algorithm which works well for distinguishing foreground objects in eBay listing pictures. This is especially tricky on eBay given the wide spectrum of low-quality pictures taken by amateur eBay sellers.
Use this demo to visually enhance your listing's title.
Why eBay is different
The eBay marketplace is not only huge, it is a fantastically dynamic eco-system within itself. As a researcher, one could easily spend one's life exploring the mountains of data this well-instrumented eco-system generates minute by minute (I know I could). It gets very tempting to treat the data exactly as it looks: detailed individual activity such as an item being listed for sale at a particular time, a search query being performed, and so on. Consequently, it looks trivial to run regular data mining algorithms on this data to "generate more understanding". Only deeper introspection reveals that behind every piece of data, behind every click, there is an individual human being, who possibly earns his/her living on eBay.