Wisman WL040/Bidgoli-Vol III-Ch-59 August 14, 2003
Web Search Fundamentals
Raymond Wisman, Indiana University Southeast
Introduction
How To Search—The Searcher’s View
Types of Search
Information Sources
How to Use Search Engines
Search Engine Performance
How Search Works—Views From the Search Engine
Human-Organized Lists
Search Engines
What Search Engines Search
What Search Engines Ignore
Metasearch
How to Be Searched—Views From the Web Site
Web Site Discovery
Measuring Success
Self-Search
Search Service
Conclusion
Glossary
Cross References
References
INTRODUCTION
Until the invention of the railroad, a horse’s gallop was
the limit for speed of overland travel. For finding infor-
mation, reading the pages of a book was the fastest com-
mon means before the invention of the automated search.
Search engines are a recent invention, only becoming
widely available through the Web, which is itself largely a
product of the search engine. Are search engines already
changing our source and means of information discovery?
The answer is yes, according to some commercial studies
that indicate that for many people search engines have
already become the main means of seeking information.
One report (Lewis, Mobilio, & Associates, 2000) points
out that Americans already use search engines 32% of the
time when seeking information, more than any other al-
ternative. Much of the information sought is for profit in
some way; search engines are the top way consumers find
new Web sites online, used by over 73% of those surveyed
(Van Boskirk, Li, Parr, & Gerson, 2001).
As with all technologies, history will judge the lasting
contribution of the Web and search engines to commerce
and society. This chapter’s purpose is more modest, being
merely to examine the viewpoints of the three partners in
Web search: the searcher seeking information, the search
engine that locates information, and the Web site holding
information.
The reader should keep in mind that the searcher is
the only reason for Web sites and search engines to exist
and it is in the interest of each to satisfy the searcher. The
chapter attempts to expose the search process in a form
that is of interest to either the information searcher or the
Web site designer. Three main sections divide the chap-
ter. The first section examines the information seeker’s
viewpoint, considering how to search and measures of
a search engine’s performance. The second section cov-
ers search engine workings and how search engines dif-
fer. The third section examines the Web site holding
the information, how to be noticed by search engines,
how to establish what search engines search, and how
to manage or influence search engines to a Web site’s
benefit.
A broad perspective on the search process is useful
before detailed examination of these points of view and
their interactions. Figure 1 illustrates the characteristic
interaction between these three partners in information
search and delivery. The first step occurs when a Web site
creates pages with information and sends the main Web
site page location to a search engine. Next, in the typical
search engine architecture (Arasu, Cho, Garcia-Molina,
Paepcke, & Raghavan, 2001), one component called a
spider (or crawler, robot, etc.) visits the Web site to re-
trieve pages linked from the main page much as a per-
son using a browser would follow links to new pages.
The spider follows links and indexes the meaningful words of
each retrieved page, storing them along with the page’s
Web site location for later searches and retrievals. The
searcher can then send queries to the search engine and
receive a list of matching Web site page locations, from
which the searcher selects and retrieves pages directly
from the Web site. Closing this information space between
the searcher and Web site is the primary purpose of the
search engine. The following sections attempt to provide
insight into the full search process so that the information
searcher can find better information more easily and the
Web site designer can better attract searchers to the information.
HOW TO SEARCH—THE SEARCHER’S VIEW
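As background for the searcher's view, the interaction outlined in the Introduction (a Web site submits its main page, a spider follows links and indexes words, and a searcher queries the index) can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the in-memory SITE dictionary stands in for real pages fetched over the network, the STOP_WORDS list is an arbitrary choice of words treated as not meaningful, and the names crawl and search are hypothetical. It is not a description of how any particular search engine is built.

```python
from collections import deque

# Hypothetical in-memory "Web site": page location -> (page text, linked locations).
# A real spider would retrieve pages over HTTP and extract links from the HTML.
SITE = {
    "/index.html": ("Welcome to the horse travel archive", ["/rail.html", "/books.html"]),
    "/rail.html": ("The railroad changed overland travel", ["/index.html"]),
    "/books.html": ("Reading books to find information", []),
    "/orphan.html": ("No page links here", []),
}

# Assumed list of words treated as not meaningful enough to index.
STOP_WORDS = {"the", "to", "of", "a", "and"}


def crawl(site, start):
    """Follow links breadth-first from the submitted main page and build an
    inverted index: word -> set of page locations containing that word."""
    index = {}
    seen = {start}
    queue = deque([start])
    while queue:
        location = queue.popleft()
        if location not in site:  # broken link: nothing to retrieve
            continue
        text, links = site[location]
        for word in text.lower().split():
            if word not in STOP_WORDS:
                index.setdefault(word, set()).add(location)
        for link in links:
            if link not in seen:  # visit each page only once
                seen.add(link)
                queue.append(link)
    return index


def search(index, query):
    """Return the sorted locations of pages that contain every query word."""
    words = [w for w in query.lower().split() if w not in STOP_WORDS]
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)


index = crawl(SITE, "/index.html")
print(search(index, "travel"))           # ['/index.html', '/rail.html']
print(search(index, "overland travel"))  # ['/rail.html']
print(search(index, "links"))            # []
```

In this sketch the page /orphan.html never appears in any result, because no page reachable from the submitted main page links to it. The searcher receives only page locations and must still retrieve the pages from the Web site itself.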
Why search the Web? A simple answer is that the Web
is too large and unorganized for a person to find much useful
information unaided. In 1999, public sites contained about 800 million
pages on about 3 million servers (Lawrence & Giles, 1999).
The original Web design purpose was simply to intercon-
nect bits and pieces of scattered information with no plan
to find information other than by manually moving from
one piece of information to another by following connect-
ing links. The search engine merely automates the process
and moves much faster from one piece to the next than
users do, while collecting information along the way for
later retrieval. Web site usability studies recognize search