employ different rules for aligning the fields of a structure to address boundaries on
the host system, leaving different numbers of padding bytes between the fields.
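To see the effect of alignment rules on a particular system, we can compare the size of a structure with the sum of the sizes of its fields. The following is a minimal sketch; the structure record and the functions fieldBytes() and paddingBytes() are hypothetical names introduced only for this illustration, and the results depend on the compiler and architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record, used only for illustration */
struct record {
    char     tag;       /* 1 byte */
    uint32_t value;     /* 4 bytes; commonly aligned to a 4-byte boundary */
    char     flag;      /* 1 byte */
};

/* Sum of the sizes of the individual fields, ignoring any padding */
size_t
fieldBytes(void)
{
    return sizeof(char) + sizeof(uint32_t) + sizeof(char);
}

/* Number of padding bytes that the compiler added to the structure */
size_t
paddingBytes(void)
{
    /* A structure is never smaller than the sum of its fields */
    assert(sizeof(struct record) >= fieldBytes());

    return sizeof(struct record) - fieldBytes();
}
```

On many implementations, paddingBytes() returns a nonzero value for this structure, and offsetof(struct record, value) is greater than 1. Since another system may lay out the same structure differently, writing the raw bytes of the structure to a socket is not portable.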
Because of these differences in data representation, applications that exchange
data between heterogeneous systems over a network must adopt some common
convention for encoding that data. The sender must encode data according to this
convention, while the receiver decodes following the same convention. The process
of putting data into a standard format for transmission across a network is
referred to as marshalling. Various marshalling standards exist, such as XDR (External
Data Representation, described in RFC 1014), ASN.1-BER (Abstract Syntax
Notation 1, http://www.asn1.org/), CORBA, and XML. Typically, these standards
define a fixed format for each data type (defining, for example, byte order and
number of bits used). As well as being encoded in the required format, each data
item is tagged with extra field(s) identifying its type (and, possibly, length).
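The general idea of a fixed, tagged format can be sketched as follows. This is not XDR, BER, or any other real standard: the tag value TAG_UINT32, the 6-byte layout (tag, length, then four value bytes, most significant first), and the function names encodeUint32() and decodeUint32() are all assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TAG_UINT32          1   /* Hypothetical type tag */
#define ENCODED_UINT32_LEN  6   /* 1 tag byte + 1 length byte + 4 value bytes */

/* Encode 'v' into 'out' as tag, length, and 4 bytes in big-endian order.
   Returns number of bytes written, or 0 if 'out' is too small. */
size_t
encodeUint32(uint32_t v, unsigned char *out, size_t outLen)
{
    assert(out != NULL);

    if (outLen < ENCODED_UINT32_LEN)
        return 0;

    out[0] = TAG_UINT32;
    out[1] = 4;                         /* Length of the value that follows */
    out[2] = (v >> 24) & 0xff;          /* Most significant byte first */
    out[3] = (v >> 16) & 0xff;
    out[4] = (v >> 8) & 0xff;
    out[5] = v & 0xff;
    return ENCODED_UINT32_LEN;
}

/* Decode a value produced by encodeUint32(). Returns 0 on success,
   or -1 if the input is too short or carries the wrong tag or length. */
int
decodeUint32(const unsigned char *in, size_t inLen, uint32_t *v)
{
    assert(in != NULL && v != NULL);

    if (inLen < ENCODED_UINT32_LEN || in[0] != TAG_UINT32 || in[1] != 4)
        return -1;

    *v = ((uint32_t) in[2] << 24) | ((uint32_t) in[3] << 16) |
         ((uint32_t) in[4] << 8) | (uint32_t) in[5];
    return 0;
}
```

Because the encoder builds the value with shifts rather than copying the integer's in-memory representation, the bytes on the wire are the same regardless of the byte order of the sending host.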
However, a simpler approach than marshalling is often employed: encode all
transmitted data in text form, with separate data items delimited by a designated
character, typically a newline character. One advantage of this approach is that we
can use telnet to debug an application. To do this, we use the following command:
$ telnet host port
We can then type lines of text to be transmitted to the application, and view the
responses sent by the application. We demonstrate this technique in Section 59.11.
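As an illustration of the text-based approach, the following sketch encodes a long integer as a newline-terminated decimal string and decodes it again. The function names encodeLongAsLine() and decodeLongFromLine() are hypothetical, and the choice of one decimal integer per line is an assumption made for the example rather than a convention defined in this chapter.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Write 'value' into 'buf' as decimal text followed by a newline.
   Returns the number of characters written (excluding the terminating
   null byte), or -1 if 'buf' is too small. */
int
encodeLongAsLine(long value, char *buf, size_t bufLen)
{
    int n;

    assert(buf != NULL);

    n = snprintf(buf, bufLen, "%ld\n", value);
    if (n < 0 || (size_t) n >= bufLen)
        return -1;                      /* Error, or output was truncated */
    return n;
}

/* Parse a line of the form produced by encodeLongAsLine().
   Returns 0 on success, or -1 if the line is malformed. */
int
decodeLongFromLine(const char *line, long *value)
{
    char *end;
    long v;

    assert(line != NULL && value != NULL);

    errno = 0;
    v = strtol(line, &end, 10);
    if (errno != 0 || end == line || *end != '\n')
        return -1;                      /* Overflow, no digits, or junk */

    *value = v;
    return 0;
}
```

A line produced in this way can be typed by hand in a telnet session, which is what makes the text format convenient for debugging.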
The problems associated with differences in representation across heterogeneous
systems apply not only to data transfer across a network, but also to any
mechanism of data exchange between such systems. For example, we face the
same problems when transferring files on disk or tape between heterogeneous
systems. Network programming is simply the most common programming
context in which we are nowadays likely to encounter this issue.
If we encode data transmitted on a stream socket as newline-delimited text, then it
is convenient to define a function such as readLine(), shown in Listing 59-1.
The readLine() function reads bytes from the file referred to by the file descriptor
argument fd until a newline is encountered. The input byte sequence is returned in
the location pointed to by buffer, which must point to a region of at least n bytes of
memory. The returned string is always null-terminated; thus, at most (n – 1) bytes
of actual data will be returned. On success, readLine() returns the number of bytes of
data placed in buffer; the terminating null byte is not included in this count.
#include "read_line.h"
ssize_t readLine(int fd, void *buffer, size_t n);
Returns number of bytes copied into buffer (excluding
terminating null byte), or 0 on end-of-file, or –1 on error
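The behavior described above can be sketched as follows. This is a minimal illustration written for this description, not the code of Listing 59-1; it is named readLineSketch() to keep the two apart. It assumes that the newline, when it fits, is stored in buffer, that bytes beyond the first (n - 1) on a line are read and discarded, and that a read() interrupted by a signal is restarted. None of these details is stated in the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

ssize_t
readLineSketch(int fd, void *buffer, size_t n)
{
    char *buf = buffer;
    size_t totRead = 0;                 /* Bytes stored in 'buf' so far */
    ssize_t numRead;
    char ch;

    if (buffer == NULL || n == 0) {
        errno = EINVAL;
        return -1;
    }

    for (;;) {
        numRead = read(fd, &ch, 1);     /* Read one byte at a time */

        if (numRead == -1) {
            if (errno == EINTR)         /* Interrupted: restart read() */
                continue;
            return -1;                  /* Some other error */
        } else if (numRead == 0) {      /* End-of-file */
            if (totRead == 0)
                return 0;               /* No data was read */
            break;                      /* Return the partial last line */
        } else {
            if (totRead < n - 1)        /* Assumption: discard excess bytes */
                buf[totRead++] = ch;
            if (ch == '\n')
                break;
        }
    }

    assert(totRead < n);                /* Room remains for the terminator */
    buf[totRead] = '\0';
    return (ssize_t) totRead;
}
```

Reading one byte per call to read() is simple but slow; it has the advantage that no bytes beyond the newline are consumed from the file descriptor.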