

it and provides a link for free download.


19.8 Advanced Material


We could have put this material in an appendix, but it really belongs here. It
is more advanced than you are likely to need at first, but in the long run
these are concepts you should be aware of.


19.8.1 MIME Types v File Extensions


Browsers rely on web servers to tell them what kind of content is being
delivered. The server does this by sending a MIME Type, in the Content-Type
response header, before the content itself.
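
For example, the start of a response delivering a JPEG image might look like
this (the Content-Length value here is purely illustrative):

    HTTP/1.1 200 OK
    Content-Type: image/jpeg
    Content-Length: 48213

    ...binary JPEG data follows...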


Many servers decide the MIME Type based on the filename extension. If
the extension is .jpg, they send a MIME Type indicating that the content is
a JPEG image.


When you host an unusual file type, you may need to configure the server
so it knows what MIME Type to send for that extension.
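
For example, on an Apache server that allows per-directory overrides, you
could map extensions to MIME Types in an .htaccess file; the extensions shown
here are just examples:

    # .htaccess (assuming an Apache server with overrides enabled)
    # Serve .wasm files as WebAssembly
    AddType application/wasm .wasm
    # Serve .webmanifest files as web app manifests
    AddType application/manifest+json .webmanifest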


19.8.2 robots.txt and Search Engines


Search engines like Google, Yahoo, and Bing traverse the web using web
crawlers: programs that pretend to be browsers but in reality fetch every
page they can find and store it in an index for later use.


Search engines will index any web page they can find, including ones that
you might not want them to notice. Ideally they should index content that is
stable over the long term; pages that change daily are not good candidates
for indexing.


You can tell these web crawlers what to index and what to ignore by putting
that information in a special file, robots.txt.
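
For example, a robots.txt file placed at the top level of the site might look
like this (the /drafts/ directory is just a placeholder for content you want
crawlers to skip):

    # robots.txt at the top level of the site
    # These rules apply to all crawlers
    User-agent: *
    # Ask them to skip the /drafts/ directory and index everything else
    Disallow: /drafts/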


The good news is that most web crawlers respect your wishes as expressed
in your robots.txt file.


The bad news is they get to decide whether to respect your wishes or not.
There is no guarantee they will pay any attention to your request.
