SEO: Search Engine Optimization Bible

<url>
<loc>http://www.example.com/catalog?item=83&amp;desc=vacation_usa</loc>
<lastmod>2004-11-23</lastmod>
</url>
</urlset>

The URLs that you can include in your XML site map are determined by where your site map is located.
For example, if you place your site map at http://www.example.com/catalog/sitemap.xml, any URLs
that begin with http://www.example.com/catalog/ can be included in the site map. However, a URL
such as http://www.example.com/images/ can’t be included in that site map, because it doesn’t fall
under the catalog directory. You can solve this problem by creating another site map or by placing
your site map at the base URL for your site (http://www.example.com/).
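The location rule above can be sketched as a small check. This is a minimal illustration, not part of any sitemap tool: the function name in_sitemap_scope is hypothetical, and the sketch assumes that a URL is in scope when it shares the site map's scheme and host and its path starts with the directory that holds the site map file.

```python
from urllib.parse import urlsplit


def in_sitemap_scope(sitemap_url: str, page_url: str) -> bool:
    """Return True if page_url may be listed in the site map at sitemap_url.

    Assumption: scope is the directory containing the site map file,
    on the same scheme and host.
    """
    sm = urlsplit(sitemap_url)
    page = urlsplit(page_url)

    # A site map only covers URLs on its own scheme and host.
    if (sm.scheme, sm.netloc.lower()) != (page.scheme, page.netloc.lower()):
        return False

    # "/catalog/sitemap.xml" -> "/catalog/", "/sitemap.xml" -> "/"
    base_dir = sm.path.rsplit("/", 1)[0] + "/"
    return (page.path or "/").startswith(base_dir)


catalog_map = "http://www.example.com/catalog/sitemap.xml"
root_map = "http://www.example.com/sitemap.xml"

print(in_sitemap_scope(catalog_map, "http://www.example.com/catalog/show?item=83"))  # True
print(in_sitemap_scope(catalog_map, "http://www.example.com/images/"))               # False
print(in_sitemap_scope(root_map, "http://www.example.com/images/"))                  # True
```

The last call shows why a site map at the base URL covers the whole site: its directory is "/", which every path on that host starts with.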

Once you’ve created your site map (or had one generated by a site-map generator), you need to
reference it in your robots.txt file. The Sitemap directive is independent of the User-agent directive,
so it’s not important where in the robots.txt file you place it. All that’s important is that you use the
Sitemap directive, Sitemap: <sitemap_location>, and replace <sitemap_location> with the full URL
where your site map is located. For example, a Sitemap directive might look like this:

Sitemap: http://www.example.com/sitemap.xml
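To illustrate that the directive's position in robots.txt doesn't matter, here is a minimal sketch that collects site-map locations from a robots.txt file. The function name sitemap_urls and the sample file contents are made up for illustration, and the sketch assumes the directive is written as the word Sitemap, a colon, and then the full URL of the site map.

```python
def sitemap_urls(robots_txt: str) -> list[str]:
    """Collect the URLs named by Sitemap lines, wherever they appear."""
    urls = []
    for line in robots_txt.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and whole-line comments
        # Split only on the first colon so the URL's own "://" stays intact.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


# Hypothetical robots.txt with the Sitemap line between two User-agent groups.
robots = """\
User-agent: *
Disallow: /private/

Sitemap: http://www.example.com/sitemap.xml

User-agent: Googlebot
Disallow: /tmp/
"""

print(sitemap_urls(robots))  # ['http://www.example.com/sitemap.xml']
```

The function never looks at which User-agent group precedes the line, which mirrors the independence described above.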

You can have more than one site map for your site, too. But if you do, you need to create a site-map
index for crawlers to read to learn where your site maps are. The site-map index looks similar to the
site map, but uses a few different tags. Those tags include:

• <sitemap>: This tag encapsulates information about an individual site map.
• <sitemapindex>: This tag encapsulates information about all of the site maps in a file.

In addition to these tags, you’ll also see the <loc> and <lastmod> tags in the site-map index.
Following is an example of what that index file should look like:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>http://www.example.com/sitemap1.xml.gz</loc>
<lastmod>2004-10-01T18:23:17+00:00</lastmod>
</sitemap>
<sitemap>
<loc>http://www.example.com/sitemap2.xml.gz</loc>
<lastmod>2005-01-01</lastmod>
</sitemap>
</sitemapindex>
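If you maintain several site maps, an index like this one can be generated rather than written by hand. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name build_sitemap_index and the (location, last-modified) input format are assumptions made for illustration, not part of the sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(entries):
    """Build a site-map index from (loc, lastmod) pairs.

    lastmod may be None, in which case the <lastmod> tag is left out.
    """
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
        if lastmod:
            ET.SubElement(sitemap, "lastmod").text = lastmod
    # Prepend the declaration by hand so the stated encoding is always UTF-8.
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


index_xml = build_sitemap_index([
    ("http://www.example.com/sitemap1.xml.gz", "2004-10-01T18:23:17+00:00"),
    ("http://www.example.com/sitemap2.xml.gz", "2005-01-01"),
])
print(index_xml)
```

Setting the text through ElementTree also takes care of escaping characters such as & in the URLs, which is easy to miss when typing the file manually.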

The index file contains a list of locations for your XML site maps. It’s not uncommon to have multiple
site maps, and in some cases it’s advisable to have more than one. For example, if your site is broken
down into several different categories, you may want to have a separate site map for each category.
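One way to split a site into per-category site maps is to group its URLs by their top-level directory. The sketch below is only an illustration under that assumption: the function name group_by_category is hypothetical, and your own site's categories may not line up with its directory structure.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_category(urls):
    """Group URLs by top-level directory; "" holds pages at the site root."""
    groups = defaultdict(list)
    for url in urls:
        # Drop the file name, then take the first directory segment, if any.
        directory = urlsplit(url).path.rsplit("/", 1)[0]
        segments = [s for s in directory.split("/") if s]
        category = segments[0] if segments else ""
        groups[category].append(url)
    return dict(groups)


pages = [
    "http://www.example.com/catalog/item1",
    "http://www.example.com/catalog/item2",
    "http://www.example.com/images/logo.png",
    "http://www.example.com/about.html",
]

for category, members in group_by_category(pages).items():
    print(category or "(root)", len(members))
```

Each group could then be written to its own site map file, with all of those files listed in a single site-map index.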



Robots, Spiders, and Crawlers 16

