>>> mimetypes.guess_type('spam.html')[0].split('/')[1]
'html'
A subtle point: the second item in the tuple returned by mimetypes.guess_type is an
encoding type, which we won’t use here for opening files. We still have to check it,
though: if it is not None, the file is compressed (gzip or compress), even if we also
receive a media content type. For example, a file named spam.gif.gz is a compressed
image that we don’t want to try to open directly:
>>> mimetypes.guess_type('spam.gz') # content unknown
(None, 'gzip')
>>> mimetypes.guess_type('spam.gif.gz') # don't play me!
('image/gif', 'gzip')
>>> mimetypes.guess_type('spam.zip') # archives
('application/zip', None)
>>> mimetypes.guess_type('spam.doc') # office app files
('application/msword', None)
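The check described above can be wrapped in a small helper. The following is only a minimal sketch, not code from this chapter's examples: the function name openable_type is hypothetical, and it treats any non-None encoding as "don't open directly."

```python
import mimetypes

def openable_type(filename):
    """
    Return the guessed content type for filename, or None if the
    type is unknown or the file is compressed (encoding not None).
    The name openable_type is hypothetical, used for illustration.
    """
    contype, encoding = mimetypes.guess_type(filename)
    if contype is None or encoding is not None:
        return None                 # unknown content, or gzip/compress
    return contype

if __name__ == '__main__':
    for name in ['spam.gif', 'spam.gif.gz', 'spam.gz']:
        print(name, '=>', openable_type(name))
```

A caller would then try a media player or browser only when the result is not None.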
If the filename you pass in contains a directory path, the path portion is ignored; only
the extension is used. The module can also go the other way and give us a filename
extension for a content type, which is useful when we need to create a filename from a
type:
>>> mimetypes.guess_type(r'C:\songs\sousa.au')
('audio/basic', None)
>>> mimetypes.guess_extension('audio/basic')
'.au'
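As one way to put guess_extension to work, the sketch below builds a filename from a content type, as we might when saving data that arrives with a type but no name. The helper filename_for and its '.bin' fallback are assumptions made for illustration; guess_extension itself returns None when it has no extension for a type, so some fallback is needed.

```python
import mimetypes

def filename_for(basename, contype, default='.bin'):
    """
    Build a filename from a base name and a content type.
    filename_for and the '.bin' default are hypothetical choices;
    guess_extension returns None for types it does not know.
    """
    ext = mimetypes.guess_extension(contype)
    return basename + (ext or default)

if __name__ == '__main__':
    print(filename_for('part1', 'image/gif'))
    print(filename_for('part2', 'application/x-no-such-type'))
```

When several extensions map to one type, mimetypes.guess_all_extensions returns the full list, and guess_extension simply picks one of them.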
Try more calls on your own for more details. We’ll use the mimetypes module again in
FTP examples in Chapter 13 to determine transfer type (text or binary), and in our
email examples in Chapters 13, 14, and 16 to send, save, and open mail attachments.
In Example 6-23, we use mimetypes to select a table of platform-specific player
commands for the media type of the file to be played. That is, we pick a player table for
the file’s media type, and then pick a command from the player table for the platform. At
both steps, we give up and run a web browser if there is nothing more specific to be
done.
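The two-step selection just described can be sketched in a few lines. This is not the code of Example 6-23: the table contents, the command strings, the name choose_player, and the 'browser' sentinel are all placeholders assumed here, and the function only returns the command it would run rather than launching anything.

```python
import sys, mimetypes

# Hypothetical tables: media main type -> {platform prefix: command}
# The command strings are placeholders, not the book's player tables.
playertables = {
    'audio': {'linux': 'play %s', 'win': 'start %s'},
    'image': {'linux': 'display %s'},
}

def choose_player(filename, platform=sys.platform):
    """
    Step 1: pick a table by media type; step 2: pick a command by
    platform. Return 'browser' whenever either step finds nothing.
    """
    contype, encoding = mimetypes.guess_type(filename)
    if contype is None or encoding is not None:
        return 'browser'                      # unknown or compressed
    maintype = contype.split('/')[0]          # 'audio/basic' => 'audio'
    table = playertables.get(maintype)
    if table is None:
        return 'browser'                      # no table for this type
    for prefix, command in table.items():
        if platform.startswith(prefix):
            return command % filename         # platform-specific player
    return 'browser'                          # no command for platform

if __name__ == '__main__':
    for name in ['sousa.au', 'spam.gif', 'spam.html', 'spam.gif.gz']:
        print(name, '=>', choose_player(name))
```

Matching on a platform prefix lets one entry such as 'win' cover 'win32'; a real launcher would also need to run the chosen command and handle failures.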
Using mimetypes guesses for SearchVisitor
To use this module to direct the text file search scripts we wrote earlier in this
chapter, simply extract the first part of the content type (the main type before the
slash) returned for a file’s name. For instance, all of the extensions in the following
list are considered text (except “.pyw”, which we may have to special-case if we care):
>>> for ext in ['.txt', '.py', '.pyw', '.html', '.c', '.h', '.xml']:
... print(ext, mimetypes.guess_type('spam' + ext))
...