Chapter 4 ■ Socket Names and DNS
Other getaddrinfo() Flags
The examples just given demonstrate the operation of three of the most important getaddrinfo() flags. The flags
available vary somewhat by operating system, and you should always consult your own computer’s documentation
(not to mention its configuration) if you are confused about a value that it chooses to return. But there are several flags
that tend to be cross-platform. Here are some of the more important ones:
• AI_ALL: As I have already discussed, the AI_V4MAPPED option protects you from the situation
where you are on a purely IPv6-connected host, but the host to which you want to connect
advertises only IPv4 addresses. It resolves this problem by rewriting the IPv4 addresses to their
IPv6 equivalents. However, if some IPv6 addresses do happen to be available, then they will be
the only ones shown, and none of the IPv4 addresses will be included in the return value. The
AI_ALL option fixes this: if you want to see every address from your IPv6-connected host, IPv4
ones included, even though some perfectly good IPv6 addresses are available, then combine AI_ALL
with AI_V4MAPPED, and the list returned to you will have every address known for the target host.
• AI_NUMERICHOST: This turns off any attempt to interpret the hostname parameter (the
first parameter to getaddrinfo()) as a textual hostname like cern.ch, and it tries only
to interpret the hostname string as a literal IPv4 or IPv6 address like 74.207.234.78 or
fe80::fcfd:4aff:fecf:ea4e. This is much faster, because the user or config file supplying the
address cannot cause your program to make a DNS round-trip to look up the name (see the
next section). It also prevents possibly untrusted user input from forcing your system to issue a
query to a name server under someone else’s control.
• AI_NUMERICSERV: This turns off symbolic port names like 'www', and it insists that port
numbers like 80 be used instead. You do not need to use this to protect your programs against
slow DNS lookups because port number databases are typically stored locally on IP-capable
machines instead of incurring a remote lookup. On POSIX systems, resolving a symbolic port
name typically requires only a quick scan of the /etc/services file (but check your
/etc/nsswitch.conf file’s services option to be sure). However, if you know that your port string
should always be an integer, then activating this flag can be a useful sanity check.
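The flags in this list can be tried out with a short script. What follows is only a minimal sketch, not a definitive recipe: the helper names parse_numeric_endpoint() and all_addresses() are hypothetical, and the code assumes that your platform's socket module exposes all four flag constants, which is not guaranteed on every operating system. Note that AI_V4MAPPED and AI_ALL have an effect only when the AF_INET6 family is requested, so the second helper asks for that family explicitly.

```python
import socket

def parse_numeric_endpoint(host, port):
    """Hypothetical helper: return a socket address only for literal input.

    With both numeric flags set, getaddrinfo() refuses to look up names,
    so None comes back for anything that is not a literal IP address
    paired with a numeric port.
    """
    flags = socket.AI_NUMERICHOST | socket.AI_NUMERICSERV
    try:
        infolist = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM, 0, flags)
    except (socket.gaierror, UnicodeError):
        return None  # a hostname or a symbolic port name was supplied
    return infolist[0][4]  # the sockaddr tuple of the first result

def all_addresses(host, port):
    """Hypothetical helper: request every known address as an IPv6 sockaddr.

    Unlike the helper above, this one does consult the resolver, so its
    result depends on your network and on the target host.
    """
    flags = socket.AI_V4MAPPED | socket.AI_ALL
    infolist = socket.getaddrinfo(host, port, socket.AF_INET6,
                                  socket.SOCK_STREAM, 0, flags)
    return [info[4] for info in infolist]

if __name__ == '__main__':
    # None of these three calls touches the network.
    print(parse_numeric_endpoint('127.0.0.1', '80'))   # literal address, numeric port
    print(parse_numeric_endpoint('cern.ch', '80'))     # hostname is rejected
    print(parse_numeric_endpoint('127.0.0.1', 'www'))  # symbolic port is rejected
```

Returning None instead of letting the exception propagate is simply a design choice for this sketch; a real program might prefer to report the gaierror to the user.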
One final note about flags: you do not have to worry about the IDN-related flags that some operating systems
offer, which tell getaddrinfo() to resolve those fancy new domain names that have Unicode characters in them.
Instead, Python will detect whether a string requires special encoding and will set whatever options are necessary to
get it converted for you:
>>> getaddrinfo('παράδειγμα.δοκιμή', 'www', 0, socket.SOCK_STREAM, 0,
...             socket.AI_ADDRCONFIG | socket.AI_V4MAPPED)
[(2, 1, 6, '', ('199.7.85.13', 80))]
If you are curious about how this works behind the scenes, read up on the relevant international standards
starting with RFC 3492, and note that Python now includes an 'idna' codec that can translate to and from
internationalized domain names.
>>> 'παράδειγμα.δοκιμή'.encode('idna')
b'xn--hxajbheg2az3al.xn--jxalpdlp'
It is this resulting plain-ASCII string that is actually sent to the domain name service when you enter the Greek
sample domain name shown in the previous example. Again, Python will hide this complexity for you.
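If you would like to see the conversion label by label, the following sketch may help. It relies only on the built-in 'idna' codec; the helper name describe_labels() and the SAMPLE constant are hypothetical, introduced here purely for illustration. It assumes the hostname contains only ordinary ASCII dots between its labels.

```python
SAMPLE = 'παράδειγμα.δοκιμή'  # the Greek sample name used in this section

def describe_labels(hostname):
    """Hypothetical helper: pair each label with its IDNA-encoded form."""
    encoded = hostname.encode('idna').decode('ascii')
    return list(zip(hostname.split('.'), encoded.split('.')))

if __name__ == '__main__':
    # Labels that are already plain ASCII pass through unchanged.
    print(describe_labels('cern.ch'))
    # Labels with non-ASCII characters gain the 'xn--' prefix.
    print(describe_labels(SAMPLE))
    # Decoding with the same codec recovers the Unicode form.
    print(SAMPLE.encode('idna').decode('idna') == SAMPLE)
```

The codec works one label at a time, which is why a name that mixes ASCII and non-ASCII labels has only some of its labels rewritten.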