SEO: Search Engine Optimization Bible

It pays to know which crawler belongs to which search engine, because some spambots and other malicious crawlers are interested in crawling your site for less than ethical reasons. If you know the names of these crawlers, you can try to keep them off your site and keep your users’ information safe. Spambots are particularly troublesome because they crawl the Web searching out and collecting anything that appears to be an e-mail address. These addresses are then sold to marketers, or even to people who are not interested in legitimate business opportunities. Be aware, however, that most spambots will ignore your robots.txt file.

You can view the robots.txt file for any web site that has one by adding robots.txt to the base URL of the site. For example, http://www.sampleaddress.com/robots.txt will display the text file that guides robots for that site. If you add robots.txt to a URL and no file comes up, then the web site does not have one.
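As a rough illustration of that lookup, the following Python sketch builds the robots.txt address from any URL on a site. The function name robots_txt_url is made up for this example; urlsplit and urlunsplit come from Python's standard library.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(site_url):
    """Return the robots.txt address for the site that hosts site_url."""
    parts = urlsplit(site_url)
    # robots.txt sits at the root of the host, so keep only the scheme
    # and host and discard any path, query string, or fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://www.sampleaddress.com/"))
print(robots_txt_url("http://www.sampleaddress.com/products/index.html"))
```

Both calls print the same address, because the file is looked up at the root of the site no matter which page you start from.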

If you don’t have a robots.txt file, you can create one in any text editor. Keep in mind that not everyone wants or needs to use the robots.txt file. If you don’t care who is crawling your site, then don’t even create the file. There is also no point in uploading a blank robots.txt file. Crawlers that honor the file treat an empty one as placing no restrictions at all, so a blank file does nothing that leaving the file out would not.
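Here is a minimal sketch of what such a file might contain, along with one way to check the rules before you upload them. The crawler name ExampleSpamBot and the /private/ directory are invented for illustration, and is_allowed is a helper written for this example; urllib.robotparser is the robots.txt parser in Python's standard library.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block one named crawler entirely, and keep every
# other crawler out of a single directory.
ROBOTS_TXT = """\
User-agent: ExampleSpamBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Report whether the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


site = "http://www.sampleaddress.com"
print(is_allowed(ROBOTS_TXT, "ExampleSpamBot", site + "/index.html"))
print(is_allowed(ROBOTS_TXT, "Googlebot", site + "/index.html"))
print(is_allowed(ROBOTS_TXT, "Googlebot", site + "/private/a.html"))
```

The named crawler is refused everywhere, while any other crawler is refused only inside the /private/ directory. A check like this only tells you what a well-behaved crawler should do; it cannot force a crawler to obey the file.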

Robots Meta Tag

Not everyone has access to their web server, but they may still want control over how crawlers behave on their web site. If you’re one of those people, you can still control the crawlers that come to your site. Instead of using the robots.txt file, you use a robots meta tag to make your preferences known to the crawlers.

The robots meta tag is a small piece of HTML code that is inserted into the <head> section of a web page, and it works in generally the same manner as the robots.txt file. You include your instructions for crawlers inside the tag. The following example shows how your robots meta tag might look:

<html>
<head>
<meta name="robots" content="noindex, nofollow">
<meta name="description" content="page description.">
<title> Web Site Title </title>
</head>
<body>

This bit of HTML tells crawlers not to index the content on the page and not to follow the links on the page. Of course, that might not be exactly what you had in mind. You can also use other robots meta tags for combinations of following, not following, indexing, and not indexing:

<meta name="robots" content="index,follow">
<meta name="robots" content="noindex,follow">
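To see how a crawler might read these tags, the following Python sketch pulls the directives out of a page's robots meta tag. The names RobotsMetaParser and robots_directives are invented for this example, and real crawlers may interpret the tag differently; HTMLParser comes from Python's standard library.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from every robots meta tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma-separated; ignore case and spacing.
            for item in content.split(","):
                if item.strip():
                    self.directives.add(item.strip().lower())


def robots_directives(html):
    """Return the set of robots directives found in html."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = """<html> <head>
<meta name="robots" content="noindex, nofollow">
<title> Web Site Title </title>
</head> <body>"""

directives = robots_directives(page)
print("index this page:", "noindex" not in directives)
print("follow its links:", "nofollow" not in directives)
```

The sketch assumes that a page is indexed and its links are followed unless a tag says otherwise, which is why it tests for the absence of noindex and nofollow rather than the presence of index and follow.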

Part III Optimizing Search Strategies
