UX Magazine

Defining and Informing the Complex Field of User Experience (UX)
Article No. 583 November 23, 2010

Psychic Search: A quick primer on search suggestions

I recently spoke with a coworker who was skeptical that the search suggestion feature we'd implemented in our company intranet could be effective. "I could be searching for anything—it couldn't possibly know what I'm going to type in." Smiling smugly, I asked him to think of something he might want to find on the company intranet, something he thought he would really need.

Glancing around his desk, he pointed to an empty box and said, "I need to order new business cards." I asked him to start typing "business cards" into the search box one letter at a time. He typed in "b" and up popped the list, and the first search it suggested was "business cards." Still unconvinced, my coworker said "Suppose I lost the key to my desk." He typed in "k" and the search engine immediately suggested "key request." "I want to know which holidays we have off this year;" the letter "h" immediately brought up "holiday schedule 2010."

"Okay, what if I want to enroll in a yoga class?" I thought he had me, but the first item listed under "y" was in fact "yoga." He played it cool, but the expression on his face told me he was puzzled, taken aback, and possibly just a bit scared.

Long list of examples of major sites using search suggest

More Than Just a Magic Trick

That was a fun moment, but suggest functions shouldn't be mistaken for parlor tricks. Anyone who has looked at search logs closely has probably surmised that user skill is a pervasive problem. It's very common for users to submit poorly phrased queries—often just a single word, or an imprecisely phrased idea—that bring back tons of results that have nothing to do with what the user really wanted to find. It's also common that users give up instead of trying to compose their search a second time.

Suggestions help resolve that problem. By providing users with better phrasings, they make it much easier for users to succeed on the first try. In practice, they solve the problem of turning an abstract idea into concrete words so concisely and so effectively that they're quickly becoming accepted as an essential feature of any search engine.

It's getting to the point where sites that haven't yet implemented a suggest function are starting to look a little behind the times. The good news is that suggest functions are not especially difficult to implement, and with the right planning they can be wildly successful.

How to Read a User's Mind

Predicting what a user wants to find is actually pretty easy, because probability is on your side. If you take a list of the most commonly submitted searches and chart them by their popularity, you get a shape that looks a lot like this:

Zipf curve graph

This means that there are a very small number of search phrases that a large number of people are submitting, and there are also a very large number of search phrases that only a few people are submitting. The really important lesson here is that without knowing anything about a random user, it's possible to know something about what they're likely to search for. If they provide even just a little bit of additional information—such as a few characters in the search box—the odds narrow so dramatically that it's overwhelmingly likely that the search engine can accurately guess what they're trying to find.

To get this effect to really work, the function needs to return suggestions matching the character string the user has entered, sorted by popularity. This is almost always better than sorting the suggestions in some other way (e.g., alphabetically) because it stacks the deck in your favor. The original suggest function that Wikipedia implemented made the mistake of returning search strings sorted alphabetically. The result was that if the user typed in "abraham", it returned:

strange results from Wikipedia's old search suggest feature

I have no idea who Abraham "Chick" Kazen is, and I'm betting that no one at Wikipedia knows either. Mr. Kazen was first in the list because the code orders quotation marks before letters in an alphabetical sort, even though it's extremely unlikely that a user would actually submit that search. Happily, Wikipedia has since fixed this, and now typing just two letters returns what we all instinctively feel is the right answer:

more useful search suggestions from Wikipedia

There are a few cases where you might instead order the suggestion list alphabetically—for example, with a corporate directory. But most of the time it's the wrong way to go.
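
The popularity-ordered lookup described in this section can be sketched in a few lines of Python. This is only an illustration, not the code behind any of the sites mentioned: the function name `suggest`, the `search_counts` data, and the rule of breaking ties alphabetically are all assumptions made for the example.

```python
from collections import Counter


def suggest(prefix: str, counts: Counter, limit: int = 10) -> list[str]:
    """Return up to `limit` logged queries that start with `prefix`,
    most frequently submitted first."""
    needle = prefix.strip().lower()
    if not needle:
        return []
    matches = [(query, n) for query, n in counts.items() if query.startswith(needle)]
    # Sort by descending count; break ties alphabetically so the order is stable.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [query for query, _ in matches[:limit]]


# Hypothetical search-log counts, invented for illustration.
search_counts = Counter({
    "business cards": 120,
    "benefits": 95,
    "badge replacement": 12,
    "key request": 40,
    "holiday schedule 2010": 75,
    "yoga": 8,
})

print(suggest("b", search_counts))  # ['business cards', 'benefits', 'badge replacement']
print(suggest("k", search_counts))  # ['key request']
```

An alphabetical sort of the same matches would put "badge replacement" first, even though it is the least likely of the three to be what the user wants.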

Major Types of Suggest Functions

There are three principal ways suggest functions can work. Which one should be used depends upon the nature of the information that users are searching.

Exploratory

An exploratory function works best when many of the things users are trying to find have no official name. In these cases, people enter keywords that approximate the idea they have in their heads. For example, users of a college website who want to find a map of the buildings might search for:

  • campus map
  • building locations
  • directions to buildings
  • places on campus
  • finding your way around

Given the enormous number of other things people could be searching for on a college website, there is an infinity of possible phrases.

It's impossible to work with a list of potential searches that's infinitely long, so it has to be cut off somewhere. Fortunately, the magic of probability makes it possible to cut the list fairly short and still provide the vast majority of users with good suggestions. Even for a site that sees more than a million unique searches in a year, often just the first few thousand from the list of the most common searches will suffice. This can be small enough to store the complete list on the client side, so there's absolutely no lag as the user types in the search.
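
Where to cut the list is an empirical question, and one way to answer it is to measure how much of the logged search volume a given cutoff would cover. The sketch below assumes the log is available as a plain list of query strings; `truncate_suggestions` and the sample log are hypothetical names and data.

```python
from collections import Counter


def truncate_suggestions(query_log: list[str], max_size: int) -> tuple[list[str], float]:
    """Keep the `max_size` most common queries and report the share of all
    logged searches that those queries account for."""
    counts = Counter(q.strip().lower() for q in query_log if q.strip())
    total = sum(counts.values())
    top = counts.most_common(max_size)
    covered = sum(n for _, n in top)
    coverage = covered / total if total else 0.0
    return [query for query, _ in top], coverage


# Tiny made-up log: 10 searches across 4 distinct phrases.
log = (
    ["campus map"] * 6
    + ["library hours"] * 2
    + ["parking permit"]
    + ["places on campus"]
)

suggestions, coverage = truncate_suggestions(log, max_size=2)
print(suggestions)  # ['campus map', 'library hours']
print(coverage)     # 0.8
```

Running this against a real log with a few candidate cutoffs shows how quickly the coverage levels off, which is a reasonable basis for choosing the list length.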

The suggestion list needs to be scrubbed to remove multiple word forms (e.g., singular or plural), misspellings, closely related phrasings, and other common problems. But the shortness of the suggestion list makes this fairly easy.
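
Part of that scrubbing can be automated. The following sketch merges queries that differ only in case, spacing, or a trailing "s", and keeps the most popular spelling of each group. The trailing-"s" rule is a deliberately crude stand-in for real stemming, the helper names are hypothetical, and misspellings or closely related phrasings would still need manual review or a proper spelling tool.

```python
from collections import Counter, defaultdict


def normalize(query: str) -> str:
    """Crude grouping key: lowercase, collapse whitespace, and strip a
    trailing "s" from longer words. Used only for grouping, never shown."""
    words = query.lower().split()
    return " ".join(w[:-1] if len(w) > 3 and w.endswith("s") else w for w in words)


def merge_variants(counts: Counter) -> Counter:
    """Merge queries that share a grouping key, keeping the most popular
    spelling and summing the counts of all variants."""
    groups = defaultdict(list)
    for query, n in counts.items():
        groups[normalize(query)].append((query, n))
    merged = Counter()
    for variants in groups.values():
        best_spelling = max(variants, key=lambda v: v[1])[0]
        merged[best_spelling] = sum(n for _, n in variants)
    return merged


# Hypothetical raw counts, invented for illustration.
raw = Counter({"campus map": 50, "campus maps": 20, "Campus  Map": 5, "key request": 40})
print(merge_variants(raw).most_common())
# [('campus map', 75), ('key request', 40)]
```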

Known item

For other searches, everything the user might try to find has a specific name. This is the case, for example, with websites that are principally product catalogs, such as Apple or Amazon. Other examples of known items include movie titles, airports, and major world cities. In these contexts, suggest functions can help people remember what something is called, eliminate misspellings, and help people figure out what searches will actually give them useful results.

For such known-item searches, truncated lists don't work because the absence of an item implies that it's not available. Since the list needs to be comprehensive, it can be very long indeed (just think of every product that Amazon sells), so it often can't all be stored on the client side. Instead, it would need to be retrieved from the server in real time. This may introduce some lag, but it can be made more efficient by limiting the number of strings shown at any one time. Ten has become the industry standard.
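
A server-side lookup over a comprehensive list might look something like the sketch below. It assumes the full catalog fits in the server's memory as a sorted list of names, uses binary search to find the block of names sharing the typed prefix, and returns only the ten most popular. The class name `KnownItemIndex` and the sample data are hypothetical; a production system would more likely rely on a search engine or database index.

```python
import bisect
import heapq


class KnownItemIndex:
    """In-memory prefix index over a complete list of item names."""

    def __init__(self, popularity: dict[str, int]):
        # Map lowercased names to their popularity and original spelling.
        self._popularity = {name.lower(): n for name, n in popularity.items()}
        self._display = {name.lower(): name for name in popularity}
        self._names = sorted(self._popularity)

    def suggest(self, prefix: str, limit: int = 10) -> list[str]:
        needle = prefix.strip().lower()
        if not needle:
            return []
        # All names starting with `needle` sit in one contiguous sorted slice.
        lo = bisect.bisect_left(self._names, needle)
        hi = bisect.bisect_left(self._names, needle + "\U0010ffff")
        candidates = self._names[lo:hi]
        top = heapq.nlargest(limit, candidates, key=lambda name: self._popularity[name])
        return [self._display[name] for name in top]


# Hypothetical city catalog with made-up popularity scores.
cities = KnownItemIndex({
    "Paris": 200, "Perth": 30, "Philadelphia": 90, "Phoenix": 70, "Pittsburgh": 60,
})
print(cities.suggest("p", limit=3))  # ['Paris', 'Philadelphia', 'Phoenix']
print(cities.suggest("ph"))          # ['Philadelphia', 'Phoenix']
```

Capping the result at `limit` keeps each response small no matter how long the underlying list is.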

Historical

These are searches that the user has submitted in the past. People are likely to search for something that they've looked for in the past, like a particular destination in a mapping application. A system can make itself much more personally relevant when it retains a memory of the things that a user has done before, and then makes it easier for the user to do them again.

It still makes the most sense to sort the list of historical searches first by the number of times the user has submitted them. But when two searches have been submitted the same number of times, consider breaking the tie based on recency.
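
That ordering rule amounts to a two-part sort key. A minimal sketch, assuming each history entry records a submission count and the time of the most recent submission (the `HistoryEntry` fields and sample data are hypothetical):

```python
from dataclasses import dataclass


@dataclass
class HistoryEntry:
    query: str
    times_submitted: int
    last_submitted: float  # Unix timestamp of the most recent submission


def rank_history(entries: list[HistoryEntry], prefix: str = "", limit: int = 10) -> list[str]:
    """Order a user's past searches by frequency, then by recency."""
    needle = prefix.strip().lower()
    matches = [e for e in entries if e.query.lower().startswith(needle)]
    # Most frequent first; among equal counts, most recently submitted first.
    matches.sort(key=lambda e: (-e.times_submitted, -e.last_submitted))
    return [e.query for e in matches[:limit]]


# Made-up history for one user of a mapping application.
history = [
    HistoryEntry("airport", times_submitted=3, last_submitted=200.0),
    HistoryEntry("home", times_submitted=5, last_submitted=100.0),
    HistoryEntry("hotel", times_submitted=3, last_submitted=300.0),
]
print(rank_history(history))       # ['home', 'hotel', 'airport']
print(rank_history(history, "h"))  # ['home', 'hotel']
```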

Some clever designers have seized the best of all worlds by creating hybrid approaches that first display searches that the user has submitted in the past, followed by a list of the most popular searches submitted by other people.

Google's hybrid search suggest
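
One way to sketch such a hybrid is to walk the user's own history first and then the global list, skipping duplicates until the list is full. This is an illustration of the general idea only, not a description of how Google implements it. The function name and sample data are hypothetical, and both input lists are assumed to be ordered best-first already.

```python
def hybrid_suggest(prefix: str, personal: list[str], popular: list[str],
                   limit: int = 10) -> list[str]:
    """Suggest the user's own matching searches first, then fill the
    remaining slots with popular searches from everyone else."""
    if limit <= 0:
        return []
    needle = prefix.strip().lower()
    seen = set()
    results = []
    for source in (personal, popular):
        for query in source:
            key = query.lower()
            if key.startswith(needle) and key not in seen:
                seen.add(key)  # avoid listing the same search twice
                results.append(query)
                if len(results) == limit:
                    return results
    return results


# Made-up inputs: one user's history and a global popularity list.
my_searches = ["ux magazine", "user research"]
everyone = ["usps tracking", "ux magazine", "united airlines"]
print(hybrid_suggest("u", my_searches, everyone))
# ['ux magazine', 'user research', 'usps tracking', 'united airlines']
```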

Designing by the Numbers

Suggest functions exploit quantitative information in a way that has only become possible through the enormous volume of usage of the modern Web. They're a creative application of data that's just sitting out there, waiting for innovative minds to find ways to make it useful. There's a real beauty and elegance to this kind of a strategy, not to mention the fun of knowing what your users are going to say before they even say it.

ABOUT THE AUTHOR(S)

John is a user experience designer at Vanguard and the creative director of Megazoid Games. He is the author of the new book Playful Design, published by Rosenfeld Media. Feel free to follow him on Twitter at @PlayfulDesign.

Comments

Very nice article. Also don't forget:

- geolocation context (useful on mobile or geo enabled devices when looking for places; see gmaps and google app on iphone)

- explicitly tagged (like in awesome bar in FF; though this is just another example of historical)

- contextual disambiguation (see yahoo search and more recently google suggest; ajax programming vs ajax amsterdam)

The last one, when first user-tested in a next-gen search at Yahoo (2006), consistently surprised users that it "knew what they were thinking."

Thanks, interesting post with a very convincing example in the beginning!

Really interesting post - this gave me some ideas on how to improve search suggestions.

I have implemented a suggestion feature on Iconfinder.com but instead of showing the most popular search terms, it shows the most popular tags ranked by the number of search results. This could probably be improved by ranking them based on their popularity.

Great post.
It is also interesting to see how search suggestions work on product sites, when there is more than one content type to suggest (when they don't use a search filter, or the filter is set to all).

We needed to create one like this on our music product, and we separated the suggestion list to:
Most relevant Artists (2 suggestions)
Most relevant Songs (4 suggestions)
Most relevant Albums (2 suggestions)
All sorted by popularity, and when there are no related suggestions we show no albums/artists/songs.

Nice post. I'd like to add the Last.fm search interface as an example of a well-done type-ahead search interface. It classifies content within the suggested search terms for added awesomeness.