Concepts of Programming Languages

(Sean Pound) #1
6.3 Character String Types 251

the null character. The character string literals that are built by the compiler
also have the null character. For example, consider the following declaration:

char str[] = "apples";

In this example, str is an array of char elements, specifically apples0, where
0 is the null character.
Some of the most commonly used library functions for character strings
in C and C++ are strcpy, which moves strings; strcat, which catenates
one given string onto another; strcmp, which lexicographically compares
(by the order of their character codes) two given strings; and strlen, which
returns the number of characters, not counting the null, in the given string.
The parameters and return values for most of the string manipulation func-
tions are char pointers that point to arrays of char. Parameters can also be
string literals.
The string manipulation functions of the C standard library, which are also
available in C++, are inherently unsafe and have led to numerous programming
errors. The problem is that the functions in this library that move string data
do not guard against overflowing the destination. For example, consider the
following call to strcpy:

strcpy(dest, src);

If the length of dest is 20 and the length of src is 50, strcpy
will write over the 30 bytes that follow dest. The point is that
strcpy does not know the length of dest, so it cannot ensure
that the memory following it will not be overwritten. The same
problem can occur with several of the other functions in the C
string library. In addition to C-style strings, C++ also supports
strings through its standard class library, which is also similar to that of Java.
Because of the insecurities of the C string library, C++ programmers should
use the string class from the standard library, rather than char arrays and
the C string library.
In Java, strings are supported by the String class, whose values are con-
stant strings, and the StringBuffer class, whose values are changeable and are
more like arrays of single characters. These values are specified with methods
of the StringBuffer class. C# and Ruby include string classes that are similar
to those of Java.
Python includes strings as a primitive type and has operations for substring
reference, catenation, indexing to access individual characters, as well as methods
for searching and replacement. There is also an operation for character member-
ship in a string. So, even though Python’s strings are primitive types, for character
and substring references, they act very much like arrays of characters. However,
Python strings are immutable, similar to the String class objects of Java.
In F#, strings are a class. Individual characters, which are represented in
Unicode UTF-16, can be accessed, but not changed. Strings can be catenated
with the + operator. In ML, string is a primitive immutable type. It uses ^ for

History Note


SNOBOL 4 was the first widely
known language to support pat-
tern matching.

Free download pdf