Chapter 4 ■ Socket Names and DNS
You can see that supplying the IPv4 address for the local host locks you down to receiving connections only over
IPv4, while using the symbolic name localhost (at least on my Linux laptop with a well-configured /etc/hosts file)
makes available both the IPv4 and IPv6 local names for the machine.
By the way, one question you might already be asking at this point is what you are supposed to do when you ask for
the addresses on which to offer a basic service and getaddrinfo() gives you several of them. You certainly cannot
create a single socket and bind() it to more than one address! In Chapter 7, I will tackle the techniques
that you can use if you are writing server code and want to have several bound server sockets going at once.
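As a minimal sketch of the starting point for that situation, the function below creates one listening socket per address that getaddrinfo() returns. The name bind_all(), the backlog of 5, and the choice to silently skip addresses that cannot be bound are assumptions made for illustration, not the approach the later chapter necessarily takes.

```python
import socket

def bind_all(host, port):
    """Return one bound, listening TCP socket per address that getaddrinfo() offers."""
    infolist = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM, 0,
                                  socket.AI_PASSIVE)
    listeners = []
    for family, socktype, proto, _canonname, sockaddr in infolist:
        try:
            sock = socket.socket(family, socktype, proto)
        except OSError:
            continue  # this address family is not usable on this machine
        try:
            if family == socket.AF_INET6:
                # Keep the IPv6 socket from also claiming the IPv4 port.
                sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            sock.bind(sockaddr)
            sock.listen(5)
        except OSError:
            sock.close()  # skip any address that cannot be bound
            continue
        listeners.append(sock)
    return listeners
```

A call such as bind_all('localhost', 1060) would typically yield one socket for 127.0.0.1 and, where IPv6 is available, one for ::1. A real server then still has to wait on all of the returned sockets at once.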
Using getaddrinfo() to Connect to a Service
Except when you are binding to a local address to provide a service yourself, you will use getaddrinfo() to learn
about connecting to other services. When looking up services, you can either use an empty string to indicate that you
want to connect back to the local host using the loopback interface or provide a string giving an IPv4 address, IPv6
address, or a hostname to name your destination.
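The different ways of naming a destination can be compared with a small helper like the one below. This is only a sketch: tcp_targets() is a hypothetical name, and the use of None for "no host" follows the Python documentation for socket.getaddrinfo(), which passes it to the C library as NULL.

```python
import socket

def tcp_targets(host, port):
    """Return (family name, sockaddr) pairs that getaddrinfo() offers for a TCP connection."""
    infolist = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)
    return [(socket.AddressFamily(family).name, sockaddr)
            for family, _type, _proto, _canonname, sockaddr in infolist]

# An IPv4 literal, an IPv6 literal, and a hostname are all accepted:
#   tcp_targets('127.0.0.1', 80)
#   tcp_targets('::1', 80)
#   tcp_targets('localhost', 80)
# With None as the host and no AI_PASSIVE flag, the loopback addresses come back:
#   tcp_targets(None, 80)
```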
When you are preparing to connect() or sendto() a service, call getaddrinfo() with the AI_ADDRCONFIG flag,
which filters out any addresses that are impossible for your computer to reach. For example, an organization might
have both an IPv4 and an IPv6 range of IP addresses. If your particular host supports only IPv4, then you will want the
results filtered to include only addresses in that family. To prepare for the situation in which the local machine has
only an IPv6 network interface but the service to which you are connecting supports only IPv4, you will also want to
specify AI_V4MAPPED to return the IPv4 addresses reencoded as IPv6 addresses that you can actually use.
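To see what "reencoded as IPv6 addresses" looks like, the sketch below asks for the AF_INET6 family explicitly together with AI_V4MAPPED. Requesting AF_INET6 is an assumption made here so that the mapping is visible regardless of which interfaces the machine has, and as_ipv6_targets() is a hypothetical name. On a typical Linux system, an IPv4 address should come back in the IPv4-mapped form ::ffff:a.b.c.d.

```python
import socket

def as_ipv6_targets(host, port):
    """Ask for IPv6 sockaddrs only, letting IPv4 results return as IPv4-mapped addresses."""
    return socket.getaddrinfo(host, port, socket.AF_INET6, socket.SOCK_STREAM, 0,
                              socket.AI_V4MAPPED)

# Each sockaddr in the result is the four-item IPv6 form:
# (address, port, flowinfo, scope_id).
```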
Putting these pieces together, you will usually use getaddrinfo() this way before connecting:
>>> getaddrinfo('ftp.kernel.org', 'ftp', 0, socket.SOCK_STREAM, 0,
...             socket.AI_ADDRCONFIG | socket.AI_V4MAPPED)
[(2, 1, 6, '', ('204.152.191.37', 21)),
 (2, 1, 6, '', ('149.20.20.133', 21))]
In return, you get exactly what you wanted: a list of every way to connect to a host named ftp.kernel.org
through a TCP connection to its FTP port. Note that several IP addresses were returned because, to spread load, this
service is located at several different addresses on the Internet. When several addresses come back like this, you should
generally use the first address returned, and only if your connection attempt fails should you try the remaining ones. By
honoring the order in which the administrators of the remote service want you to try contacting their servers, you will
distribute your requests across those servers in the way that they intend.
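The "first address, then the rest" policy can be sketched as a simple loop over the getaddrinfo() results. The function name connect_in_order(), the flags parameter, and the 10-second timeout are assumptions chosen for illustration.

```python
import socket

def connect_in_order(host, port,
                     flags=socket.AI_ADDRCONFIG | socket.AI_V4MAPPED,
                     timeout=10.0):
    """Try each address from getaddrinfo() in order; return the first connected socket."""
    infolist = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM, 0, flags)
    last_error = None
    for family, socktype, proto, _canonname, sockaddr in infolist:
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock  # success: stop at the first address that answers
        except OSError as exc:
            last_error = exc  # remember the failure and move to the next address
            if sock is not None:
                sock.close()
    raise last_error or OSError('getaddrinfo() returned no addresses')
```

The Standard Library function socket.create_connection() performs a similar loop, so in practice it is often enough to call that instead of writing your own.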
Here is another query that asks how I can connect from my laptop to the HTTP interface of the IANA, the organization
that assigns port numbers in the first place:
>>> getaddrinfo('iana.org', 'www', 0, socket.SOCK_STREAM, 0,
...             socket.AI_ADDRCONFIG | socket.AI_V4MAPPED)
[(2, 1, 6, '', ('192.0.43.8', 80))]
The IANA web site is actually a good one for demonstrating the utility of the AI_ADDRCONFIG flag because, like any
other good Internet standards organization, its web site already supports IPv6. It just so happens that my laptop can
speak only IPv4 on the wireless network to which it is currently connected, so the foregoing call returned
only an IPv4 address. However, if you take away the carefully chosen flags in the sixth parameter, then you can peek at
the site's IPv6 address, which my laptop cannot use on this network:
>>> getaddrinfo('iana.org', 'www', 0, socket.SOCK_STREAM, 0)
[(2, 1, 6, '', ('192.0.43.8', 80)),
 (10, 1, 6, '', ('2001:500:88:200::8', 80, 0, 0))]
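The bare numbers in these results are the values of socket module constants: 2 is AF_INET, 1 is SOCK_STREAM, 6 is IPPROTO_TCP, and 10 is the value of AF_INET6 on Linux. A small helper can make a result easier to read. This is only a sketch, and describe() is a hypothetical name.

```python
import socket

def describe(info):
    """Render one getaddrinfo() result tuple with names in place of raw numbers."""
    family, socktype, proto, _canonname, sockaddr = info
    family_name = socket.AddressFamily(family).name
    type_name = socket.SocketKind(socktype).name
    proto_name = 'IPPROTO_TCP' if proto == socket.IPPROTO_TCP else str(proto)
    host, port = sockaddr[:2]  # IPv6 sockaddrs carry two extra fields
    return '{} {} {} {} port {}'.format(family_name, type_name, proto_name, host, port)
```

Recent Python 3 releases already return the family and socket type as enumeration members, so printing a result there shows names such as AF_INET directly.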