Programming Python, 4th Edition, Mark Lutz (text version)


just for the language’s code snippet. This wasn’t required in languages2.py (Example
15-20) for the known language names in our selection list table. However, someone
could still pass the script a language name with an embedded HTML character as a
query parameter. For example, a URL such as:


http://localhost/cgi-bin/languages2reply.py?language=a<b

embeds a < in the language name parameter (the name is a<b). When submitted, this
version uses cgi.escape to properly translate the < for use in the reply HTML, according
to the standard HTML escape conventions discussed earlier; here is the reply text
generated:


<TITLE>Languages</TITLE>
<H1>Syntax</H1><HR>

<H3>a&lt;b</H3><P><PRE>
Sorry--I don't know that language
</PRE></P><BR>
<HR>

The original version in Example 15-18 doesn’t escape the language name, such that the
embedded <b is interpreted as an HTML tag (which makes the rest of the page render
in bold font!). As you can probably tell by now, text escapes are pervasive in CGI
scripting—even text that you may think is safe must generally be escaped before being
inserted into the HTML code in the reply stream.
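
As a minimal sketch of the escaping step just described (not the book's actual
script), the following builds only the heading line of the reply. It assumes a
recent Python 3, where cgi.escape is no longer available (it was removed in
Python 3.8) and html.escape serves the same purpose; the function name
reply_heading is hypothetical and used only for illustration.

```python
import html

def reply_heading(language):
    """Return the <H3> line of the reply with the language name escaped.

    Hypothetical helper: html.escape stands in for the cgi.escape call
    used in the text, which newer Pythons no longer provide.
    """
    return '<H3>%s</H3>' % html.escape(language)

print(reply_heading('a<b'))      # <H3>a&lt;b</H3>
print(reply_heading('Python'))   # <H3>Python</H3>
```

Because the < arrives in the reply as &lt;, the browser displays it as text
instead of treating <b as the start of a bold tag.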


In fact, because the Web is a text-based medium that combines multiple language
syntaxes, multiple formatting rules may apply: one for URLs and another for HTML.
We met HTML escapes earlier in this chapter; URLs, and combinations of HTML and
URLs, merit a few additional words.


URL Escape Code Conventions


Notice that in the prior section, although it’s wrong to embed an unescaped < in the
HTML code reply, it’s perfectly all right to include it literally in the URL string used to
trigger the reply. In fact, HTML and URLs define completely different characters as
special. For instance, although & must be escaped as &amp; inside HTML code, we have
to use other escaping schemes to code a literal & within a URL string (where it normally
separates parameters). To pass a language name like a&b to our script, we have to type
the following URL:


http://localhost/cgi-bin/languages2reply.py?language=a%26b

Here, %26 represents &: the & is replaced with a % followed by the hexadecimal value
(0x26) of its ASCII code value (38). Similarly, as we suggested at the end of Chapter 13,
to name C++ as a query parameter in an explicit URL, + must be escaped as %2b:


http://localhost/cgi-bin/languages2reply.py?language=C%2b%2b
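
The two URLs above can also be produced and checked with Python's standard
urllib.parse module. The following is a small sketch rather than part of the
book's example scripts; the helper names make_url and language_param are
hypothetical. Note that quote_plus emits uppercase hex digits (%2B), which
are equivalent to the lowercase %2b form typed by hand.

```python
from urllib.parse import quote_plus, parse_qs

BASE = 'http://localhost/cgi-bin/languages2reply.py'

def make_url(language):
    """Build a reply URL with the language name URL-escaped (hypothetical helper)."""
    return BASE + '?language=' + quote_plus(language)

def language_param(query):
    """Decode a query string and return the value received for 'language'."""
    return parse_qs(query)['language'][0]

print(make_url('a&b'))   # ...languages2reply.py?language=a%26b
print(make_url('C++'))   # ...languages2reply.py?language=C%2B%2B

# What the receiving side sees for escaped and unescaped query strings
print(repr(language_param('language=a%26b')))     # 'a&b'
print(repr(language_param('language=a&b')))       # 'a'   (& ended the parameter)
print(repr(language_param('language=C%2b%2b')))   # 'C++'
print(repr(language_param('language=C++')))       # 'C  ' (each + became a space)
```

The last four lines show why the escapes matter: without them, the & cuts the
value short and each + is decoded as a space.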

Sending C++ unescaped will not work, because + is special in URL syntax—it represents
a space. By URL standards, most nonalphanumeric characters are supposed to be


1202 | Chapter 15: Server-Side Scripting
