Data Compression 591
In structured coding, we assume that each occurrence of a particular note is the
same, except for a difference described by an algorithm with a few parameters. In
the model-transmission stage we transmit the basic sound (either a sound sample or
another algorithm) and the algorithm that describes the differences. Then, for sound
transmission, we need only code the desired note, its time of occurrence, and the
parameters controlling the differentiating algorithm.
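As a toy illustration of this idea, the sketch below combines a transmitted base sound with per-occurrence parameters. The waveform, the differentiating algorithm (a simple pitch and gain change), and the parameter names are all invented for the example; they are not part of any standard.

```python
import numpy as np

SR = 44100  # sample rate (Hz)

def base_note(duration=0.5, freq=440.0):
    """The transmitted 'basic sound': here, a decaying sine as a stand-in
    for a sound sample."""
    t = np.arange(int(SR * duration)) / SR
    return np.exp(-4.0 * t) * np.sin(2 * np.pi * freq * t)

def differentiate(sample, gain, pitch_ratio):
    """The 'differentiating algorithm': resample for pitch, scale for loudness."""
    idx = np.arange(0, len(sample), pitch_ratio)   # fractional read positions
    shifted = np.interp(idx, np.arange(len(sample)), sample)
    return gain * shifted

# Each occurrence of the note is coded by a few parameters, not a full waveform.
events = [
    {"time": 0.0, "gain": 1.0, "pitch_ratio": 1.0},   # the note as transmitted
    {"time": 0.5, "gain": 0.7, "pitch_ratio": 1.5},   # softer and higher
]

base = base_note()
out = np.zeros(SR)  # one second of output
for ev in events:
    note = differentiate(base, ev["gain"], ev["pitch_ratio"])
    start = int(ev["time"] * SR)
    out[start:start + len(note)] += note[: len(out) - start]
```

The point of the sketch is the bandwidth asymmetry: the base sound is sent once, and each further occurrence costs only a time stamp and two numbers.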
19.7.2 SAOL
SAOL (pronounced “sail”) stands for “Structured Audio Orchestra Language” and falls
into the music-synthesis category of “Music V” languages. Its fundamental processing
model is based on the interaction of oscillators running at various rates. Note that this
approach differs from the practice, common in the multimedia world, of using MIDI
information to drive synthesis chips on sound cards. That approach has the
disadvantage that the music will sound different depending on the IC technology of the
sound card on which it is realized. With SAOL (a much “lower-level” language than
MIDI), realizations will always sound the same.
At the beginning of an MPEG-4 session involving SA, the server transmits to the client
a stream information header, which contains a number of data elements. The most
important of these is the orchestra chunk, which contains a tokenized representation of a
program written in SAOL. The orchestra chunk consists of the description of a number
of instruments. Each instrument is a single parametric signal-processing element that
maps a set of parametric controls to a sound. For example, a SAOL instrument might
describe a physical model of a plucked string. The model is transmitted as code that
implements it, using the repertoire of delay lines, digital filters, fractional-delay
interpolators, and so forth that are the basic building blocks of SAOL.
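To make the plucked-string example concrete, the sketch below implements the Karplus-Strong algorithm, a well-known physical model built from exactly these building blocks: a delay line fed back through a simple low-pass filter. It is written in Python rather than SAOL, and the parameter values are illustrative only.

```python
import numpy as np

def plucked_string(freq, duration, sr=44100, decay=0.996):
    """Karplus-Strong synthesis: a delay line initialized with noise,
    fed back through an averaging (low-pass) filter."""
    n = int(sr / freq)                  # delay-line length sets the pitch
    line = np.random.uniform(-1, 1, n)  # the 'pluck': a burst of noise
    out = np.empty(int(sr * duration))
    for i in range(len(out)):
        out[i] = line[i % n]
        # average adjacent delay-line samples: the filter in the feedback loop
        line[i % n] = decay * 0.5 * (line[i % n] + line[(i + 1) % n])
    return out

tone = plucked_string(220.0, 1.0)  # one second of a decaying string-like tone
```

In a structured-audio setting, only this short program and its parameters (pitch, duration) would be transmitted, not the waveform it produces.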
The bit stream data itself, which follows the header, is made up mainly of time-stamped
parametric events. Each event refers to an instrument described in the orchestra chunk in
the header and provides the parameters required for that instrument. Other sorts of data
may also be conveyed in the bit stream: tempo and pitch changes, for example.
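A minimal sketch of such an event-driven decoder follows. The event layout and names are invented for illustration and do not follow MPEG-4 bit-stream syntax: each time-stamped event names an instrument and supplies its parameters, and the renderer mixes the results.

```python
import math

def sine(freq, dur, sr=44100):
    """A trivial stand-in 'instrument'; a real orchestra chunk would carry
    arbitrary signal-processing code."""
    return [math.sin(2 * math.pi * freq * i / sr) for i in range(int(sr * dur))]

# Dispatch table: event name -> instrument from the (hypothetical) orchestra.
instruments = {"pluck": sine}

# Time-stamped parametric events: (time, instrument name, parameters).
events = [
    (0.00, "pluck", {"freq": 220.0, "dur": 0.5}),
    (0.50, "pluck", {"freq": 330.0, "dur": 0.5}),
]

def render(events, instruments, sr=44100):
    """Mix each event's rendered note into the output at its time stamp."""
    end = max(t + p["dur"] for t, _, p in events)
    out = [0.0] * int(sr * end)
    for t, name, params in sorted(events):
        note = instruments[name](**params)
        start = int(t * sr)
        for i, s in enumerate(note):
            if start + i < len(out):
                out[start + i] += s
    return out

out = render(events, instruments)
```

The design mirrors the header/stream split described above: the instruments are fixed once (the header), while the stream itself is just a compact list of time-stamped parameter sets.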
Unfortunately, at the time of writing (and probably for some time beyond!) the techniques
required for automatically producing a structured audio bit stream from an arbitrary,
prerecorded sound are beyond today’s state of the art, although they are an active research
topic. These techniques are often called “ automatic source separation ” or “ automatic