Expert C Programming

(Jeff_L) #1

The executive summary is roughly that arrays and pointers are like limericks and haikus: they
are related art forms, but each has its own practical expression. The following sections
describe what these rules actually mean in practice.


Rule 1 An "Array Name in an Expression" Is a Pointer


Rules 1 and 2 (above) in combination mean that subscripted array references can always be written
equally well as a pointer-to-base-of-array plus offset. For example, if we declare


int a[10], *p, i=2;


then a[i] can equally be accessed in any of these ways:


p=a;       p=a;       p=a+i;

p[i];      *(p+i);    *p;
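
The three access forms can be checked against a[i] directly. This is a minimal sketch: the function name three_ways_agree and the array contents are made up for illustration, and i is assumed to be in the range 0 to 9.

```c
/* Hypothetical helper: reads element i of a 10-int array three ways and
   returns 1 if all three agree with a[i]. Assumes 0 <= i <= 9. */
int three_ways_agree(int i)
{
    int a[10] = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90};
    int *p;
    int x, y, z;

    p = a;          /* the array name yields a pointer to a[0] */
    x = p[i];       /* subscript the pointer */

    p = a;
    y = *(p + i);   /* explicit base-plus-offset */

    p = a + i;      /* move the pointer to element i */
    z = *p;         /* plain dereference */

    return x == a[i] && y == a[i] && z == a[i];
}
```

Calling three_ways_agree(2) matches the declaration above, where i=2.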


In fact, it's even stronger than this. An array reference a[i] is always rewritten to *(a+i) by the
compiler at compile time. The C standard requires this conceptual behavior. Perhaps an easy way to
follow this is to remember that square brackets [] represent a subscript operator, just as a plus sign
represents the addition operator. The subscript operator takes an integer and pointer-to-type-T, and
yields an object of type T. An array name in an expression becomes a pointer, and there you are:
pointers and arrays are interchangeable in expressions because they all boil down to pointers in the
end, and both can be subscripted. Just as with addition, the subscript operator is commutative (it
doesn't care which way round the arguments come: 5 + 3 equals 3 + 5). This is why, given a
declaration like int a[10];, both the following are correct:


a[6] = ....;


6[a] = ....;


The second version is never seen in production code, and has no known use apart from confusing
those new to C.
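
For the curious, the equivalence can be confirmed with a short sketch. The function name commuted_subscript and the stored values are illustrative only. Both spellings reduce to *(a + 6), so they name the same object.

```c
/* Hypothetical helper: writes through one spelling and reads back through
   the other. Returns 1 if a[6] and 6[a] refer to the same element. */
int commuted_subscript(void)
{
    int a[10] = {0};
    int first;

    a[6] = 42;
    first = 6[a];        /* reads the value just stored via a[6] */

    6[a] = 7;            /* stores into the same element */

    /* &6[a] parses as &(6[a]) because postfix [] binds tighter than &. */
    return first == 42 && a[6] == 7 && &a[6] == &6[a];
}
```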


The compiler automatically scales a subscript to the size of the object pointed at. If integers are 4
bytes long, then a[i+1] is actually 4 bytes (not 1) further on from a[i]. The compiler takes care of
scaling before adding in the base address. This is the reason why pointers are always typed—
constrained to point to objects of only one type—so that the compiler knows how many bytes to
retrieve on a pointer dereference, and it knows by how much to scale a subscript.
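
The scaling can be observed by comparing addresses as byte pointers. This is a sketch with made-up function names. It makes no assumption that int is 4 bytes and compares against sizeof instead.

```c
#include <stddef.h>

/* Byte distance between adjacent int elements. */
size_t byte_step_int(void)
{
    int a[10];
    return (size_t)((char *)&a[1] - (char *)&a[0]);
}

/* Byte distance between adjacent double elements. */
size_t byte_step_double(void)
{
    double d[10];
    return (size_t)((char *)&d[1] - (char *)&d[0]);
}

/* Distance between the same two elements measured in elements, not bytes:
   pointer subtraction is scaled by the pointed-to type as well. */
ptrdiff_t element_step(void)
{
    int a[10];
    return &a[1] - &a[0];
}
```

On an implementation where integers are 4 bytes long, byte_step_int() yields 4, while element_step() yields 1 everywhere.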


Rule 2 C Treats Array Subscripts as Pointer Offsets


Treating array subscripts as a pointer-plus-offset is a technique inherited from BCPL (the language
that was C's ancestor). This is the convention that renders it impractical to add runtime support for
subscript range-checking in C. A subscript operator hints, but does not ensure, that an array is being
accessed. Alternatively, subscripting might be bypassed altogether in favor of pointer access to an
array. Under these conditions, range-checking could only be done for a restricted subset of array
accesses. In practice it has usually not been considered worthwhile.
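
To see why a subscript only hints at an array, consider subscripts applied to pointers that are not array names. This is an illustrative sketch: the function and variable names are invented, and every access shown stays inside a valid object.

```c
/* Hypothetical helper: subscripts applied to a pointer into the middle of
   an array and to a pointer at a lone int. Returns 1 if all reads match. */
int subscript_without_array(void)
{
    int a[10] = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90};
    int lone = 5;

    int *mid = a + 3;      /* points at a[3], not at the start of a */
    int *single = &lone;   /* points at a plain int, not an array */

    /* mid[-1] is a[2] and mid[2] is a[5]; a negative subscript is valid
       here because the result still lies inside a. */
    return mid[-1] == 20 && mid[2] == 50 && single[0] == 5;
}
```

Nothing in the expression mid[-1] or single[0] tells the compiler which array, if any, is involved or how long it is, which is why a range check has nothing to compare against.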


Word has gotten out that it is "more efficient" to program array algorithms using pointers instead of
arrays.
