Wisman WL040/Bidgoli-Vol III-Ch-59 August 14, 2003 18:3 Char Count= 0
736 WEB SEARCH FUNDAMENTALS
car or registering for a class. Failed queries are also
valuable for determining which words to add to pages:
they reveal the keywords searchers used when they
expected to find information but did not.
Measuring Failure: Failed searches are valuable in
identifying what visitors actually want and expect to
find on the site. Visitors are sending a clear message if
a Web site for programmers sells nothing except JAVA
programming tools but 75% of the searches are for
FORTRAN. Search engines that log failed searches
give insight into possible improvements. For example,
queries containing predictable misspellings of FOR-
TRAN can succeed after adding misspellings such as
FOTRAN to the engine’s synonym list.
The final expert advice on whether a site needs
search appears remarkably precise. The site needs search
(Powell, 2000) when “The site consists of data such as part
numbers, locations, or more than 100 pages” or “The ma-
jority of visitors know what they are looking for, how to
ask for it, and want to go directly to it”.
CONCLUSION
The design of the Web has proven wildly successful in col-
lecting and connecting scattered information but failed to
consider finding information on such a large, dispersed
network. The success and continued growth of the Web
has been fueled by the success of search engines in estab-
lishing some degree of coherence and access to the scat-
tered information. Understanding how search engines op-
erate and their limitations can aid a searcher in guiding a
search engine to information and a Web site designer in
planning a site for search.
This chapter has examined Web information discov-
ery from the three views of the searcher, the search en-
gine, and the Web site. Fundamental reasons, strategies,
and techniques employed by searchers for locating high-
quality information were presented. Search engine de-
sign, strategy, and limitations were examined to provide
searchers with some perspective on the comprehensive-
ness and accuracy of search engine results, along with
methods for testing search performance. Search engine
and Web site issues affecting the discovery and ranking
of information and measures and means for determin-
ing search success and disappointment were offered for
improving Web site and page design. Reasons for and
against local search were also considered.
Search engines represent big businesses that directly
profit by attracting and rewarding searchers with infor-
mation. Web search will continue to evolve and improve
as current research matures and is rapidly incorporated
into search engine technology.
GLOSSARY
Common log format A standard format for logging and
analyzing Web server messages.
Index List of words extracted from pages and the loca-
tion of each page where the word was extracted. Used
for matching query words and locating the pages
containing the matches.
Metasearch The fusion of the results from multiple
search engines simultaneously searching on the same
query.
Page A Web document containing plain text and hyper-
text markup language for formatting and linking to
other pages.
Page rank A system for ranking Web pages where a link
from page A to page B increases the rank of page B.
Phrase search A search for documents containing an
exact sentence or phrase specified by a user.
Precision The degree to which the pages a search engine
matches are relevant to the query. When all matched
pages are relevant to the query, precision is 100%.
Proximity search A search for pages containing query
words within close proximity of one another.
Query Words given to a search engine in order to locate
pages containing the same words.
Recall The degree to which a search engine matches the
relevant pages. When all relevant pages are matched,
recall is 100%.
Relevancy The degree to which a page provides the de-
sired information, as measured by the searcher.
Search engine The software that searches an index of
page words for query words and returns matches.
Spider The software that locates pages for indexing by
following links from one page to another.
Web site A Web location holding and providing access
to pages via the World Wide Web.
CROSS REFERENCES
See Internet Literacy; Internet Navigation (Basics, Services,
and Portals); Web Search Technology; Web Site Design.
REFERENCES
Arasu, A., Cho, J., Garcia-Molina, H., Paepcke, A., &
Raghavan, S. (2001). Searching the Web. ACM
Transactions on Internet Technology, 1(1), 2–43.
Belew, R. (2000). Finding out about. New York: Cambridge
University Press.
Dreilinger, D., & Howe, A. (1997). Experiences with
selecting search engines using metasearch. ACM
Transactions on Information Systems, 15(3), 195–222.
Hock, R. (2001). The extreme searcher’s guide to Web search
engines (2nd ed.). Medford, NJ: CyberAge Books,
Information Today, Inc.
Kleinberg, J. (1999). Authoritative sources in a hyper-
linked environment. Journal of the Association for
Computing Machinery, 46(5), 604–632.
Lawrence, S., & Giles, C. (1999, July 8). Accessibility of
information on the Web. Nature, 400, 107–109.
Lawrence, S., Coetzee, F., Glover, E., Flake, G., Pennock,
D., Krovetz, B., Nielsen, F., Kruger, A., & Giles, L.
(2000). Persistence of information on the Web:
Analyzing citations contained in research articles. In
Proceedings of the Ninth International Conference on
Information and Knowledge Management (pp. 235–242).
New York: ACM Press.
Lewis, Mobilio, & Associates (2000). Consumer Daily
Question Study, Fall 2000. Retrieved May, 2002, from
http://www.keen.com/documents/corpinfo/pressstudy.asp