6.15 Theory and Data Types 309
programming languages; the abstract branch primarily focuses on typed
lambda calculus, an area of extensive research by theoretical computer sci-
entists over the past half century. This section is restricted to a brief descrip-
tion of some of the mathematical formalisms that underlie data types in
programming languages.
A data type defines a set of values and a collection of operations on those
values. A type system is a set of types and the rules that govern their use in
programs. Obviously, every typed programming language defines a type sys-
tem. The formal model of a type system of a programming language consists
of a set of types and a collection of functions that define the type rules of the
language, which are used to determine the type of any expression. A formal
system that describes the rules of a type system, attribute grammars, is intro-
duced in Chapter 3.
An alternative model to attribute grammars uses a type map and a col-
lection of functions, not associated with grammar rules, that specify the type
rules. A type map is similar to the state of a program used in denotational
semantics, consisting of a set of ordered pairs, with the first element of each
pair being a variable’s name and the second element being its type. A type map
is constructed using the type declarations in the program. In a static typed
language, the type map need only be maintained during compilation, although
it changes as the program is analyzed by the compiler. If any type checking is
done dynamically, the type map must be maintained during execution. The
concrete version of a type map in a compilation system is the symbol table,
constructed primarily by the lexical and syntax analyzers. Dynamic types some-
times are maintained with tags attached to values or objects.
As stated previously, a data type is a set of values, although in a data type
the elements are often ordered. For example, the elements in all ordinal types
are ordered. Despite this difference, set operations can be used on data types to
describe new data types. The structured data types of programming languages
are defined by type operators, or constructors that correspond to set operations.
These set operations/type constructors are briefly introduced in the following
paragraphs.
A finite mapping is a function from a finite set of values, the domain set,
onto values in the range set. Finite mappings model two different categories of
types in programming languages, functions and arrays, although in some lan-
guages functions are not types. All languages include arrays, which are defined
in terms of a mapping function that maps indices to elements in the array. For
traditional arrays, the mapping is simple—integer values are mapped to the
addresses of array elements; for associative arrays, the mapping is defined by a
function that describes a hashing operation. The hashing function maps the
keys of the associate arrays, usually character strings,^11 to the addresses of the
array elements.
A Cartesian, or cross product of n sets, S 1 , S 2 , c , Sn,
is a set denoted S 1 *S 2 * c *Sn. Each element of the
- In Ruby and Lua, the associative array keys need not be character strings—they can be any type.