Concepts of Programming Languages


Chapter 6 Data Types


To provide the means of processing codings of single characters, most
programming languages include a primitive type for them. However, Python
supports single characters only as character strings of length 1.
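As a brief illustration of the Python case (the variable names are chosen here only for the example), indexing a string yields another string of length 1 rather than a value of a separate character type. The built-in functions ord and chr convert between such a string and its numeric coding.

```python
# Python has no separate character type: a "character" is a string of length 1.
word = "data"
first = word[0]          # indexing a string yields another string

print(type(first))       # <class 'str'>
print(len(first))        # 1

# ord() and chr() convert between a one-character string and its code point.
code = ord(first)        # 100 for "d"
print(code, chr(code))   # 100 d
```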

6.3 Character String Types


A character string type is one in which the values consist of sequences of
characters. Character string constants are used to label output, and the input
and output of all kinds of data are often done in terms of strings. Of course,
character strings also are an essential type for all programs that do character
manipulation.

6.3.1 Design Issues


The two most important design issues that are specific to character string types
are the following:


  • Should strings be simply a special kind of character array or a primitive type?

  • Should strings have static or dynamic length?


6.3.2 Strings and Their Operations

The most common string operations are assignment, catenation, substring
reference, comparison, and pattern matching.
A substring reference is a reference to a substring of a given string. Substring
references are discussed in the more general context of arrays, where
the substring references are called slices.
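A minimal sketch of catenation and substring reference, written in Python as one possible illustration (the variable names are made up for the example). Python expresses a substring reference as a slice whose start index is inclusive and whose end index is exclusive.

```python
text = "character string"

# Catenation builds a new string from two operands.
label = "A " + text

# Substring references written as slices: start inclusive, end exclusive.
sub = text[0:9]          # the first nine characters
tail = text[10:]         # from index 10 to the end

print(label)             # A character string
print(sub)               # character
print(tail)              # string
```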
In general, both assignment and comparison operations on character
strings are complicated by the possibility of string operands of different lengths.
For example, what happens when a longer string is assigned to a shorter string,
or vice versa? Usually, simple and sensible choices are made for these situations,
although programmers often have trouble remembering them.
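The following Python sketch shows one possible set of such choices. The helper assign_fixed is hypothetical and not taken from any particular language: it assumes a fixed-length target that truncates a longer value on the right and pads a shorter one with blanks. The comparisons afterward use Python's own rule, which is lexicographic, so a string that is a proper prefix of another compares as smaller.

```python
def assign_fixed(value: str, width: int) -> str:
    """Hypothetical rule for assigning into a fixed-length string of `width`
    characters: truncate on the right if too long, pad with blanks if too short."""
    return value[:width].ljust(width)

print(repr(assign_fixed("programming", 4)))   # 'prog'
print(repr(assign_fixed("C", 4)))             # 'C   '

# Python's built-in comparison is lexicographic; a proper prefix is smaller.
print("app" < "apple")    # True
print("apple" < "b")      # True
```

Other languages make different choices, for example raising an error when the lengths differ, which is part of why the rules are hard to remember.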
Pattern matching is another fundamental character string operation. In some
languages, pattern matching is supported directly in the language. In others, it is
provided by a function or class library.
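As an example of the library approach, Python provides pattern matching through its standard re module. The sketch below is only an illustration: the pattern, which describes a simple identifier, and the sample strings are assumptions chosen for the example.

```python
import re

# A simple identifier: a letter or underscore, then letters, digits, or underscores.
identifier = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# fullmatch succeeds only if the entire string fits the pattern.
print(identifier.fullmatch("total_2") is not None)   # True
print(identifier.fullmatch("2total") is not None)    # False

# findall returns every non-overlapping match in the string.
print(identifier.findall("x1 = y + 42"))             # ['x1', 'y']
```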
If strings are not defined as a primitive type, string data is usually stored in
arrays of single characters and referenced as such in the language. This is the
approach taken by C and C++.
C and C++ use char arrays to store character strings. These languages provide
a collection of string operations through standard libraries. Many uses of
strings and many of the library functions use the convention that character strings
are terminated with a special character, null, which is represented with zero. This
is an alternative to maintaining the length of string variables. The library operations
simply carry out their operations until the null character appears in the
string being operated on. Library functions that produce strings often supply