Learning Python Network Programming

Chapter 2

There are registered media types for many of the types of data that are transmitted
across the Internet. Some common ones are:

Media type               Description
text/html                HTML document
text/plain               Plain text document
image/jpeg               JPG image
application/pdf          PDF document
application/json         JSON data
application/xhtml+xml    XHTML document

Another media type of interest is application/octet-stream, which in practice is
used for files that don't have an applicable media type. An example of this would
be a pickled Python object. It is also used for files whose format is not known by
the server. In order to handle responses with this media type correctly, we need to
discover the format in some other way. Possible approaches are as follows:

Examine the filename extension of the downloaded resource, if it has one.
The mimetypes module can then be used for determining the media type
(see Chapter 3, APIs in Action, for an example of this).

Download the data and then use a file type analysis tool. The Python
standard library imghdr module (deprecated in Python 3.11 and removed in
3.13) can be used for images, and the third-party python-magic package,
or the GNU file command, can be used for other types.

Check the website that we're downloading from to see if the file type has
been documented anywhere.
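
The first approach above can be sketched with the standard library mimetypes module. This is a minimal illustration rather than the book's own example: the helper name guess_media_type and the sample filenames are assumptions made here, and the result depends only on the filename extension, not on the file's contents.

```python
import mimetypes


def guess_media_type(filename, default='application/octet-stream'):
    """Guess a media type from a filename extension (hypothetical helper).

    Falls back to `default` when the extension is missing or unrecognized.
    """
    # guess_type() returns a (type, encoding) tuple; type is None if unknown.
    media_type, _encoding = mimetypes.guess_type(filename)
    return media_type or default


print(guess_media_type('report.pdf'))
print(guess_media_type('index.html'))
print(guess_media_type('mystery.zzzunknown'))  # unrecognized extension
```

Falling back to application/octet-stream mirrors what a server does when it does not know a file's format, so the caller can still decide to try one of the other approaches.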

Content type values can contain optional additional parameters that provide further
information about the type. This is usually used to supply the character set that the
data uses. For example:

Content-Type: text/html; charset=UTF-8

In this case, we're being told that the character set of the document is UTF-8.
The parameter is included after a semicolon, and it always takes the form of a
key/value pair.
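
To make the structure of such a value concrete, here is a small sketch that splits a Content-Type value into its media type and charset parameter using the standard library email.message.Message class. The helper name parse_content_type and the sample values are assumptions for illustration; this works on a plain string and needs no network access.

```python
from email.message import Message


def parse_content_type(value, default_charset=None):
    """Split a Content-Type value into (media_type, charset).

    Hypothetical helper: it loads the value into an email.message.Message
    so that the standard header-parsing methods can be reused.
    """
    msg = Message()
    msg['Content-Type'] = value
    # get_content_type() returns the lowercased 'maintype/subtype' part.
    media_type = msg.get_content_type()
    # get_content_charset() returns the charset parameter in lowercase,
    # or default_charset if the parameter is absent.
    charset = msg.get_content_charset(default_charset)
    return media_type, charset


print(parse_content_type('text/html; charset=UTF-8'))
print(parse_content_type('application/json'))
```

The headers of a response returned by urlopen() are held in an http.client.HTTPMessage object, which is a subclass of email.message.Message, so the same methods are available there.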

Let's work through an example: downloading the Python home page and using the
Content-Type value it returns. First, we submit our request:

from urllib.request import urlopen
response = urlopen('http://www.python.org')
