I wrote most of the Amazon Similarity Explorer (ASE) long before the
clarifying comments were added to the assignment; the operative phrase
for me was "Using one of several recommendation functions..."
I chose to explore the similarity function, simply because it was the
only recommendation function available through the Amazon Web Services,
an application programming interface (API) to Amazon.com that I was interested
Given a tool like ASE, it is possible to create and examine a large number
of personal webs, since the time-consuming steps of construction and link
checking are mostly automated. The downside, at least in this implementation,
is that you have less freedom in the selection process; a better designed
front end would allow more navigational freedom through the similarity
lists. While ASE does allow you to go from one list to another by clicking
on the linked ISBN, it limits you to selecting (for your personal web)
from a single set of similarities rather than browsing, adding to your
set, browsing some more, etc. Creating a similarity search with a large
Search Depth value (say, 50 or 100) gives you a broad selection to choose
from, but you are still limited to those entities that Amazon sees fit
to include in the "similar to" lists.
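The kind of traversal described above can be sketched roughly as follows. This is a minimal illustration, not ASE's actual code: it assumes that Search Depth means the number of distinct items to collect (seed included) and that similarity lists are visited breadth-first. The `SIMILAR` table and `get_similar` function are hypothetical stand-ins for the Amazon Web Services similarity call, which is not reproduced here.

```python
from collections import deque

# Hypothetical stand-in for a similarity lookup. In ASE this data would come
# from the Amazon Web Services similarity function; the keys are placeholders.
SIMILAR = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["D"],
    "D": ["E"],
    "E": [],
}


def get_similar(isbn):
    """Return the "similar to" list for an item (empty if unknown)."""
    return SIMILAR.get(isbn, [])


def explore(seed, search_depth):
    """Collect up to search_depth distinct items reachable from seed,
    following similarity lists breadth-first."""
    seen = [seed]
    queue = deque([seed])
    while queue and len(seen) < search_depth:
        current = queue.popleft()
        for item in get_similar(current):
            if item not in seen:
                seen.append(item)
                queue.append(item)
                if len(seen) >= search_depth:
                    break
    return seen


print(explore("A", 4))
```

However large the depth, the result can only contain items that appear in some similarity list reachable from the seed, which is the limitation noted above.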
An occasional oddity pops up, revealing design decisions made by Amazon's
database and web service designers. My personal favorite was the search
for help on Microsoft HTML Help Authoring (1572316039) that led quickly
to Michael Moore's "Stupid White Men." It is a great inside
joke: Microsoft HTML Help is a terrible authoring system. Investigation
of the Amazon page reveals that the "similar to" list for this
book is identical to the "Customers who bought this book also bought..." list.
This is certainly not always the case, but it seems as though Amazon.com
uses purchasing patterns as a substitute for editorial opinion on low
Other design decisions add a lot of value: Amazon has built a very sophisticated
information retrieval system behind their simple query box. Amazon's search
philosophy is to always return something. Even total
gibberish generates a set of products to buy, prefaced by a brief apology:
"We found no matches for j asdjl;fal s. Below are results for fal."
Designers of information organization systems can learn from these types
Other strange results in ASE happen as a function of Amazon quirks. A
search for "art glass" returns a web of medical dictionaries
and handbooks; checking the Amazon.com result for the same search reveals
a medical handbook inexplicably at the top of the return set. Occasionally,
spurious connections appear and then disappear; during one search of cookbooks,
a SQL Server 2000 manual mysteriously showed up in the middle of my personal
web. Had someone just purchased a cookbook and a database manual, causing
some temporary linkage in the Amazon database? I reran the identical query
and it disappeared. The Amazon database is a dynamic entity, so an application
like ASE that interrogates it can return different results for the same query
over relatively short periods of time.
Interesting patterns in information organization emerge through experimentation
with ASE. Some I've seen include:
- Moving along, you hit O'Reilly books, get 41 O'Reilly titles in a row,
and then you break out!
- You get stuck in a list of "Schaum's Outline of ______" titles,
all of which refer to one another.
- 30 books with no duplications: density of zero. I made a personal
web of the 18 I had read. It had a density of 35%. With 18 picks,
the highest density would be 50% because the similarity list for each
book only contains nine entries. Thus "Gulliver's Travels",
which links to nine other books, actually was 100% linked: every one
of its similarities was in the Defoe personal web.
- This personal web travels
from China through library science and into telecommunications. The
high relevance given to authors in similarity lists is reflected here,
where a wide ranging classics scholar, Lionel Casson, happens to move
our similarity search from the ancient Far East into library science
through his work "Libraries in the Ancient World." That
links to "Double Fold: Libraries and the Assault on Paper"
by Nicholson Baker which leads to an entirely different web of LIS
books. The China web and LIS web both have typical degrees of connectivity,
but only this single link (Casson-Baker) joins the two worlds.
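The density figures quoted in the list above can be reproduced with a short sketch. The formula ASE actually uses is not given in the text; the version below assumes density is the number of similarity links that stay inside the personal web, divided by the square of the number of books, because that reading yields the 50% ceiling mentioned for 18 picks with nine similarities each (9 × 18 / 18²). The function names and the sample web are hypothetical.

```python
def density(web):
    """web maps each selected book to its similarity list.
    Assumed definition: in-web links divided by n squared."""
    n = len(web)
    if n == 0:
        return 0.0
    links = sum(
        1
        for book, sims in web.items()
        for s in sims
        if s in web and s != book
    )
    return links / (n * n)


def linked_fraction(web, book):
    """Share of one book's similarity list that falls inside the web."""
    sims = web[book]
    if not sims:
        return 0.0
    return sum(1 for s in sims if s in web) / len(sims)


# Hypothetical best case: 18 books, each listing nine others in the web.
books = [f"book{i}" for i in range(18)]
full = {
    b: [books[(i + k) % 18] for k in range(1, 10)]
    for i, b in enumerate(books)
}

# Hypothetical worst case: 30 books whose similarities all point elsewhere.
isolated = {f"item{i}": [f"other{i}"] for i in range(30)}

print(density(full), density(isolated), linked_fraction(full, "book0"))
```

Under this assumed definition, a book can be 100% linked (all nine of its similarities are in the web) while the web as a whole sits well below 50%.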
Similarity lists in Amazon clearly have some organizing principles:
- Works by the same author often are closely linked. If your ASE search
finds a book by a prolific best-selling author, you may never escape
from the associated string of similar items.
- Works in a series are closely linked. Cliff's Notes, multi-volume
sets, and technical guides like the "Dummy" series all have
a high degree of internal linking.
- Obscure works have connections that, at times, make little or no sense.
Finally, in considering ASE and the Amazon Web Services interface to
Amazon.com, we should consider the question of its validity as a bibliographic
system. Svenonius provides an updated version of the IFLA bibliographic
objectives containing five requirements: locating, identifying, selecting,
obtaining, and navigating (2002). This system is not designed to be a
complete bibliographic system, and it fails miserably on most counts:
- Locating: The Amazon Web Services (and thus ASE) offers item access by
author, ISBN, and keyword, but no specific "title" search type.
Experimentation reveals that putting the title of a work in a keyword
search returns the intended book. ASE, however, is designed to explore
similarities, so the book itself is not returned; its similarities are.
- Identifying: Few of the identifying characteristics
of a given work are returned through this exercise.
- Selecting: A selection mechanism is included for users
to add books to their personal web.
- Obtaining: ASE does not help the user obtain the book.
- Navigating: Navigation needs a lot of improvement. One
can go from one similarity list to another by clicking on an
ISBN, thereby starting a new search. However, many other improvements
in browsability and navigability can readily be imagined.
While an "OK" exploration tool, ASE fails miserably as a bibliographic
system because it was not designed to be one. What is interesting,
though, is how an intellectual framework like this could be used to improve
an application like ASE. If one were actually making a real service for
end users, Svenonius's objectives would provide useful ideas for design
For instance, an author search using AWS returns an array of books related
to that author. If we were building a bibliographic system, we would want
to display that result list to the user for consideration. However, ASE
ignores all but the first book in the result list, using that as a key
for a similarity search. It is correct to do so: this is the Similarity
Explorer, not the "Colocation Explorer", but it does hinder
ASE's bibliographic utility. It might be more useful to display the entire
array of books resulting from the first search, and then let the user
pick the one to use as the beginning of the "similar to" chain.
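The difference between the two behaviors can be shown in a small sketch. Everything here is illustrative: `CATALOG`, `author_search`, `first_result_seed`, and `chosen_seed` are hypothetical names, and the in-memory list stands in for the array an AWS author search would return.

```python
# Hypothetical stand-in for the array of books an author search returns.
CATALOG = [
    {"isbn": "0000000001", "author": "Example Author", "title": "First Title"},
    {"isbn": "0000000002", "author": "Example Author", "title": "Second Title"},
    {"isbn": "0000000003", "author": "Another Author", "title": "Third Title"},
]


def author_search(author):
    """Return every catalog entry by the given author, in catalog order."""
    return [book for book in CATALOG if book["author"] == author]


def first_result_seed(results):
    """Behavior described for ASE: keep only the first book as the seed."""
    return results[0]["isbn"] if results else None


def chosen_seed(results, choice):
    """Alternative: show the whole list and seed from the user's pick.
    choice is a zero-based index into the displayed results."""
    if 0 <= choice < len(results):
        return results[choice]["isbn"]
    return None


results = author_search("Example Author")
for index, book in enumerate(results):
    print(index, book["title"], book["isbn"])

print(first_result_seed(results))
print(chosen_seed(results, 1))
```

The second function costs one extra interaction but lets the user, rather than the result ordering, decide where the "similar to" chain begins.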
Amazon Web Services offers programmers powerful access to its database.
However, the richness and variety that makes Amazon.com a fun shopping
experience comes not from the database, but from the layers of information
organization built on top of it. The diversity found through the variety
of lists in the user interface is lacking in ASE because it relies
exclusively on the similarity function.