Chapter 17 ■ FTP
A second issue is that an FTP user tends to make a connection, choose a working directory, and perform several
operations, all over the same network connection. Modern Internet services, with millions of users, prefer protocols
like HTTP (see Chapter 9) that consist of short, completely self-contained requests, instead of long-running FTP
connections that require the server to remember things like a current working directory.
A final big issue is file-system security. Instead of showing users just a sliver of the host file system that the owner
wanted exposed, early FTP servers tended simply to expose the entire file system, letting users cd to / and snoop
around to see how the system was configured. True, you could run the server under a separate ftp user and try to
deny that user access to as many files as possible; but many areas of the Unix file system need to be publicly readable
purely so that normal users can use the programs there.
So what are the alternatives?
• For file download, HTTP (see Chapter 9) is the standard protocol on today’s Internet,
protected with TLS (the successor to SSL) when necessary for security. Instead of exposing system-specific file name
conventions as FTP does, HTTP supports system-independent URLs.
• Anonymous upload is a bit less standard, but the general tendency is to use a form on a web
page that instructs the browser to use an HTTP POST operation to transmit the file that the
user selects.
• File synchronization has improved immeasurably since the days when a recursive FTP file
copy was the only common way to get files to another computer. Instead of wastefully copying
every file, modern commands like rsync or rdist efficiently compare files at both ends of
the connection and copy only the ones that are new or that have changed. (These commands
are not covered in this book; try Googling them.) Nonprogrammers are most likely to use the
Python-powered Dropbox service or any of the competing “cloud drive” services, which large
providers now offer.
• Full file-system access is actually the one area where FTP can still commonly be found
on today’s Internet: thousands of cut-rate ISPs continue to support FTP, despite its lack of
security, as the means by which users can copy their media and (typically) PHP source code
into their web account. A much better alternative today is for service providers to support
SFTP instead (see Chapter 16).
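As a rough illustration of the first alternative in the list above, downloading a file over HTTP takes a single self-contained request, with no login step, working directory, or second connection. The following is a minimal sketch using the Standard Library's urllib.request module; the function name download() and its parameters are illustrative choices, not code from this chapter.

```python
import shutil
import urllib.request


def download(url, destination):
    """Fetch `url` with one request and stream the body into a local file."""
    # urlopen() sends a single GET request; the server does not need to
    # remember any per-session state such as a current working directory.
    with urllib.request.urlopen(url) as response:
        with open(destination, 'wb') as output:
            # Copy in chunks so that large files are never held fully in memory.
            shutil.copyfileobj(response, output)
```

A call such as download('https://www.example.com/', 'example.html') (a placeholder URL) would save the response body to example.html in the current directory. Error handling, timeouts, and certificate configuration are deliberately left out of this sketch.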
■ Note The FTP standard is RFC 959, available at http://www.faqs.org/rfcs/rfc959.html.
Communication Channels
FTP is unusual because, by default, it actually uses two TCP connections during operation. One connection is the
control channel, which carries commands and the resulting acknowledgments or error codes. The second connection
is the data channel, which is used solely for transmitting file data or other blocks of information, such as directory
listings. Technically, the data channel is full duplex, meaning that it allows files to be transmitted in both directions
simultaneously. In practice, however, this capability is rarely used.
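To make the role of the control channel more concrete, here is a small sketch of how a client might interpret the acknowledgments and error codes that arrive on it. RFC 959 specifies that each reply begins with a three-digit code whose first digit indicates the kind of outcome. The helper name parse_reply() and the sample reply texts are assumptions made for illustration, and the sketch handles only single-line replies.

```python
# Meaning of the first digit of an FTP reply code, per RFC 959.
REPLY_CLASSES = {
    '1': 'positive preliminary',
    '2': 'positive completion',
    '3': 'positive intermediate',
    '4': 'transient negative completion',
    '5': 'permanent negative completion',
}


def parse_reply(line):
    """Split one control-channel reply line into (code, kind, text)."""
    code = line[:3]
    text = line[4:].rstrip('\r\n')
    if len(code) != 3 or not code.isdigit() or code[0] not in REPLY_CLASSES:
        raise ValueError('not an FTP reply line: %r' % (line,))
    return int(code), REPLY_CLASSES[code[0]], text


if __name__ == '__main__':
    # Sample replies; the human-readable text varies from server to server.
    for sample in ('230 User logged in, proceed.\r\n',
                   '550 No such file.\r\n'):
        print(parse_reply(sample))
```

File contents and directory listings never appear in these replies; they travel over the separate data channel, while the control channel reports only whether each command succeeded.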
In traditional operations, the process of downloading a file from an FTP server runs like this:
- First, the FTP client establishes a command connection by connecting to the FTP port on
the server.
- The client authenticates itself, usually with username and password.
- The client changes directory on the server to where it wants to deposit or retrieve files.