Tag Archives: search-engines

Blog catalogs delicio.us folksonomies Google information-architecture information-retrieval keywords Reddit scam search social-bookmarking Taxonomies Web2.0 webspam

Notes: Helping people find what they don’t know

Nicolas J. Belkin, Helping people find what they don’t know, Communications of the ACM, v.43 n.8, p.58-61, Aug. 2000

In this article, Belkin argues that since people generally start searching for information when they don’t know much about a subject. It is therefore problematic that many search systems require knowledge of the domain in order to get good results, for example when users do not know either the specific keywords or controlled vocabulary of the system. His group feels that the best way around this is for the system to make suggestions along the way. There are two techniques that can be used: system-controlled, where the user’s query is enhanced automatically by the system using algorithms like word frequency, and user-controlled, where the user is given the results of their query along with suggestions to make it more effective. The author’s team found that suggestions were most effective when the user was able to control which suggestions were used and when the user knew how the suggestions were generated and was comfortable with the results.

The author’s findings seem both intuitive and promising. It makes sense that in an interactive structured searching system giving the user suggestions and allowing them to take them or leave them would work well, and the suggestions should neither be bizarre or mysterious. But with the rise of the World Wide Web, I think it’s pretty clear that users with less domain knowledge prefer less-structured searching environments. In my experience, users who are new to a system will type unstructured, keyword queries into anything that even looks like a search box, even if it is clearly labeled as a field for author name, product code, or start date. Power users, on the other hand, often have more knowledge about the data then the system’s programmers—so for these sorts of suggestions to be useful, the algorithm would need to do more than just call up synonyms. The article makes it clear that these findings are early, so I would be interested to see what they have come up with since 2000.

These ideas could be applied to both structured and unstructured searching environments, though my guess is that they would be easier to implement in more structured environments because the structure of the system can be used to generate the suggestions. There certainly have been a number of projects which have tried to provide something like this with general web searching. Rudimentary systems like Google Suggest  or more advanced ones like Teoma show off the potential. Notice, however, that neither of these has exactly taken the search engine industry by storm, meaning people are apparently happy to muddle along with plain keyword searching and advanced ranking algorithms. I do wonder if their finding that users liked to have some idea about how suggestions were found would apply here as well—would users be happier with Google if they were told why PageRank picked a certain site as the number one result? Since the algorithms used by Google, Yahoo, MSN and others are trade secrets I doubt we’ll see anything like that in the near future. On the other hand, Amazon.com’s recommendation engine does tell the user why a certain book was suggested, and allow the user to remove certain suggestions. Although it is not really a search tool, it follows the precepts discussed here and seems to be successful.