Programming Python, 4th Edition, by Mark Lutz (text edition)


Almost all HTTP transfer details are hidden behind the urllib.request interface here.
This version works in almost the same way as the http.client version we wrote first,
but it builds and submits an Internet URL address to get its work done (the constructed
URL is printed as the script’s first output line). As we saw in the FTP section of this
chapter, the urllib.request function urlopen returns a file-like object from which we
can read the remote data. But because the constructed URLs begin with “http://” here,
the urllib.request module automatically uses the lower-level HTTP interfaces, rather
than FTP, to download the requested file:


C:\...\PP4E\Internet\Other> http-getfile-urllib1.py
http://learning-python.com/index.html
b'<HTML>\n'
b' \n'
b'<HEAD>\n'
b"<TITLE>Mark Lutz's Python Training Services</TITLE>\n"
b'<!--mstheme--><link rel="stylesheet" type="text/css" href="_themes/blends/blen...'
b'</HEAD>\n'

C:\...\PP4E\Internet\Other> http-getfile-urllib1.py www.python.org /index
http://www.python.org/index
b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3....'
b'\n'
b'\n'
b'<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">\n'
b'\n'
b'<head>\n'

C:\...\PP4E\Internet\Other> http-getfile-urllib1.py www.rmi.net /~lutz
http://www.rmi.net/~lutz
b'<HTML>\n'
b'\n'
b'<HEAD>\n'
b"<TITLE>Mark Lutz's Book Support Site</TITLE>\n"
b'</HEAD>\n'
b'<BODY BGCOLOR="#f1f1ff">\n'

C:\...\PP4E\Internet\Other> http-getfile-urllib1.py localhost /cgi-bin/languages.py?language=Java
http://localhost/cgi-bin/languages.py?language=Java
b'<TITLE>Languages</TITLE>\n'
b'<H1>Syntax</H1><HR>\n'
b'<H3>Java</H3><P><PRE>\n'
b' System.out.println("Hello World"); \n'
b'</PRE></P><BR>\n'
b'<HR>\n'

As before, the filename argument can name a simple file or a program invocation with
optional parameters at the end, as in the last run here. If you read this output carefully,
you’ll notice that this script still works if you leave the “index.html” off the end of a
site’s root filename (in the third command line); unlike the raw HTTP version of the
preceding section, the URL-based interface is smart enough to do the right thing.
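
The URL construction and download described here can be sketched as follows. This is a minimal illustration, not the script's actual source: the function names build_url and fetch_lines, the params argument, and the use of urllib.parse.urlencode to form the query string are all assumptions made for this example.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

def build_url(servername, filename, params=None):
    # hypothetical helper: join a server name and a path into an http:// address
    url = 'http://%s%s' % (servername, filename)
    if params:
        # optional parameters go after a '?', e.g. {'language': 'Java'}
        url += '?' + urlencode(params)
    return url

def fetch_lines(url, showlines=6):
    # hypothetical helper: urlopen returns a file-like object whose
    # readlines() yields bytes; keep only the first few lines
    with urlopen(url) as remotefile:
        return remotefile.readlines()[:showlines]
```

Under these assumptions, build_url('localhost', '/cgi-bin/languages.py', {'language': 'Java'}) yields the same address as the last run above, and passing that address to fetch_lines returns at most six bytes lines, provided a web server is running on the local machine.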


998 | Chapter 13: Client-Side Scripting
