Learning Python Network Programming

Chapter 3

Finally, since we may be making quite a lot of requests as we test out our fledgling
clients, it's a good idea to make local copies of the web pages or files that we
want our client to parse, and to test the client against those copies. This saves
bandwidth both for us and for the websites.
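
As a rough illustration of keeping local copies, the following sketch downloads a page only the first time it is requested and reads it from disk afterwards. The names cached_fetch, fetch_url, and the page_cache directory are our own assumptions for this example, not something defined earlier in the chapter.

```python
import hashlib
import urllib.request
from pathlib import Path


def fetch_url(url):
    """Download a URL and return the response body as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def cached_fetch(url, cache_dir="page_cache", fetch=fetch_url):
    """Return the body for url, downloading it only on the first call.

    cache_dir and fetch are hypothetical parameters for this sketch;
    passing a different fetch function makes it easy to test offline.
    """
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    # Hash the URL so that any URL maps to a safe file name.
    name = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = cache / name
    if path.exists():
        # We already have a local copy, so no request is made.
        return path.read_bytes()
    body = fetch(url)
    path.write_bytes(body)
    return body
```

Calling cached_fetch() twice with the same URL should result in a single request to the site. A real client would probably also want a way to expire stale copies, which this sketch leaves out.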


Summary


We've covered a lot of ground in this chapter, but you should now be able to start
making real use of the web APIs that you encounter.


We looked at XML, how to construct documents, parse them and extract data from
them by using the ElementTree API. We looked at both the Python ElementTree
implementation and lxml. We also looked at how the XPath query language can be
used efficiently for extracting information from documents.
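
As a compact reminder of that workflow, here is a minimal sketch that uses the standard library ElementTree implementation. The element and attribute names (inventory, item, price) are invented for the example. Note that the standard library supports only a subset of XPath; lxml provides fuller support.

```python
import xml.etree.ElementTree as ET

# Construct a small document (the element names are made up).
root = ET.Element("inventory")
for name, price in [("apple", "0.40"), ("pear", "0.55")]:
    item = ET.SubElement(root, "item", attrib={"name": name})
    ET.SubElement(item, "price").text = price

# Serialize the tree to a string, then parse it back into a new tree.
xml_text = ET.tostring(root, encoding="unicode")
parsed = ET.fromstring(xml_text)

# Extract data by using ElementTree's XPath subset.
names = [item.get("name") for item in parsed.findall("./item")]
pear_price = parsed.find("./item[@name='pear']/price").text

print(names)
print(pear_price)
```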


We looked at the Amazon S3 service and wrote a client that lets us perform basic
operations, such as creating buckets, and uploading and downloading files through
the S3 REST API. We learned about setting access permissions and content
types, so that the files work properly in web browsers.


We looked at the JSON data format, how to convert Python objects into the JSON
data format and how to convert them back to Python objects.
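
The round trip can be sketched with the standard library json module. The record shown here is an invented example. One detail to keep in mind is that JSON has no tuple type, so a tuple comes back as a list.

```python
import json

# An invented record that uses the basic types JSON can represent.
record = {
    "city": "London",
    "offset_hours": 0,
    "tags": ["utc", "gmt"],
    "dst": False,
    "note": None,
}

# Python object -> JSON text.
encoded = json.dumps(record, sort_keys=True)

# JSON text -> Python object.
decoded = json.loads(encoded)

# Tuples are written as JSON arrays, so they are read back as lists.
round_tripped_tuple = json.loads(json.dumps((1, 2)))

print(encoded)
print(decoded == record)
```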


We then explored the Twitter API and wrote an on-demand world clock service,
through which we learned how to read and process tweets for an account, and how
to send a tweet as a reply.


We saw how to extract or scrape information from the HTML source of web pages.
We saw how to work with HTML when using ElementTree and the lxml HTML
parser. We also learned how to use XPath to help make this process more efficient.
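
The basic idea can be sketched as follows. To keep the example free of third-party dependencies, it parses a small, well-formed page with the standard library ElementTree; this is a simplification, since real-world HTML is often not well formed and usually needs a forgiving parser such as the one in lxml. The page content and its id and class values are invented.

```python
import xml.etree.ElementTree as ET

# A tiny, well-formed page standing in for a downloaded HTML document.
page = """<html>
  <body>
    <div id="content">
      <h1>Fruit prices</h1>
      <ul class="prices">
        <li><a href="/apple">Apple</a></li>
        <li><a href="/pear">Pear</a></li>
      </ul>
    </div>
  </body>
</html>"""

root = ET.fromstring(page)

# Use attribute predicates to go straight to the elements we want,
# rather than walking the tree level by level.
heading = root.findtext(".//div[@id='content']/h1")
links = [
    (a.text, a.get("href"))
    for a in root.findall(".//ul[@class='prices']/li/a")
]

print(heading)
print(links)
```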


And finally, we looked at how we can give back to the webmasters that provide
us with all the data. We discussed a few ways in which we can code our clients to
make the webmasters' lives a little easier and respect how they would like us to use
their sites.
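
One widely used convention for this is a site's robots.txt file, and the sketch below assumes that is the mechanism in play; the summary above does not name specific techniques. It uses the standard library urllib.robotparser module. The user agent string and the rules are invented, and the rules are parsed from a string so that the example makes no network requests.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical name that identifies our client to webmasters.
USER_AGENT = "example-client"

# Invented rules, standing in for a site's real robots.txt file.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def allowed(url):
    """Return True if the parsed rules let our client fetch url."""
    return parser.can_fetch(USER_AGENT, url)


print(allowed("http://example.com/index.html"))
print(allowed("http://example.com/private/data.html"))
```

Against a live site, we would typically call set_url() and read() on the parser to download the real file once, and then check allowed() before each request.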


So, that's it for HTTP for now. We'll revisit HTTP in Chapter 9, Applications for the
Web, where we'll be looking at using Python for constructing the server side of web
applications. In the next chapter, we'll discuss the other great workhorse of the
Internet: e-mail.
