Learning Python Network Programming

(Sean Pound) #1

HTTP and Working with the Web


Let's look at an example of sending HTML form data to a server with a POST
request, just as browsers do when we submit a form on a website. Form data
always consists of key/value pairs; urllib lets us supply it as a regular
dictionary (we'll look at where this data comes from in the following
section):





data_dict = {'P': 'Python'}





When posting HTML form data, the form values must be formatted in the same
way as querystrings are formatted in a URL, and must be URL-encoded. A
Content-Type header must also be set to the special MIME type of
application/x-www-form-urlencoded.


Since this format is identical to querystrings, we can just use the urlencode()
function on our dict for preparing the data:





from urllib.parse import urlencode
data = urlencode(data_dict).encode('utf-8')





Here, we also encode the result to bytes, since it will be sent as the body of
the request. In this case, we use the UTF-8 character set.
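As a quick check of what these two steps produce, the following sketch prints the intermediate values. The second dictionary, extra_dict, is a hypothetical example that is not part of the original text; it is included only to show how spaces and reserved characters are escaped.

```python
from urllib.parse import urlencode

data_dict = {'P': 'Python'}

# urlencode() returns a str formatted like a URL query string
query = urlencode(data_dict)
print(query)   # P=Python

# The request body must be bytes, so the str is encoded as UTF-8
data = query.encode('utf-8')
print(data)    # b'P=Python'

# Hypothetical example: spaces become '+', and '&' and '+' are percent-escaped
extra_dict = {'q': 'network programming', 'tag': 'c&c++'}
extra_query = urlencode(extra_dict)
print(extra_query)   # q=network+programming&tag=c%26c%2B%2B
```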


Next, we will construct our request:





from urllib.request import Request, urlopen
req = Request('http://search.debian.org/cgi-bin/omega',
              data=data)





By adding our data as the data keyword argument, we are telling urllib that we
want our data to be sent as the body of the request. This will make the request use
the POST method rather than the GET method.
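The switch from GET to POST can be confirmed without sending anything over the network, since a Request object reports the method it would use. This is a minimal sketch; the names get_req and post_req are chosen here for illustration.

```python
from urllib.request import Request

url = 'http://search.debian.org/cgi-bin/omega'

# Without a body, urllib defaults to GET
get_req = Request(url)

# Supplying data (as bytes) makes urllib default to POST
post_req = Request(url, data=b'P=Python')

print(get_req.get_method())    # GET
print(post_req.get_method())   # POST
```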


Next, we add the Content-Type header:





req.add_header('Content-Type',
               'application/x-www-form-urlencoded; charset=UTF-8')





Lastly, we submit the request:





response = urlopen(req)





If we save the response data to a file and open it in a web browser, then we should
see some Debian website search results related to Python.
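Putting the steps together, one possible arrangement is sketched below. The helper names build_request and save_response, and the file name results.html, are assumptions made for this illustration rather than anything from the original text. The sketch builds and inspects the request but does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_request(url, form_data):
    """Build a POST request carrying URL-encoded form data (hypothetical helper)."""
    body = urlencode(form_data).encode('utf-8')
    req = Request(url, data=body)
    req.add_header('Content-Type',
                   'application/x-www-form-urlencoded; charset=UTF-8')
    return req


def save_response(req, path):
    """Send the request and write the raw response body to path (hypothetical helper)."""
    with urlopen(req) as response, open(path, 'wb') as f:
        f.write(response.read())


req = build_request('http://search.debian.org/cgi-bin/omega', {'P': 'Python'})
print(req.get_method())   # POST
# urllib stores header names in capitalized form, hence 'Content-type' here
print(req.get_header('Content-type'))
```

Calling save_response(req, 'results.html') would perform the actual network request and write the page to disk for viewing in a browser. Whether the Debian search endpoint still responds as described is not something this sketch can guarantee.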


Formal inspection


In the previous section we used the URL
http://search.debian.org/cgi-bin/omega, and the dictionary
data_dict = {'P': 'Python'}. But where did these come from?
