system are attempted (remember, the per-pass latencies
add up), is a generally acceptable performance.
25.25.3.2.2 Latency—How Much Is Too Much?
Despite the efforts of many learned researchers, most
data concerning the audibility of latency is anecdotal
or apocryphal. Even so, for establishing an
incontrovertible benchmark there is no substitute for
being on the wrong end of a broadcast presenter ripping
off his headphones and spewing invective.
We won’t even discuss delays long enough to be
discernible as a delay or as a discrete echo; those are
obviously far too long, and everyone, trained or not,
has a hard time speaking normally when fed such a signal
in headphones or monitors. The concern is the mushy
area below, say, 50 ms of delay, the region in which
the ear/brain attempts to integrate all correlated
sources into one.
Latency is an issue wherever performers listen
directly to a delayed version of themselves; two
situations to keep in mind are a DJ wearing headphones
and a stage performer with in-ear or conventional
floor/side-fill monitors. An important thing to note is
that these people give very different answers as to what
is noticeable, annoying, or untenable depending on
whether they are introduced cold to a system with delay
or have it introduced gradually, particularly in the
case of headphones/in-ears.
When talking, one hears oneself not only through the
headphones but also by room spill (if the headphones are
open-frame, i.e., not enclosed) and by bone conduction
within one’s own head. This last path is distinctly band
limited; what is passed is usually just the fundamental
and possibly the early harmonics of vowel sounds.
Interference between this and what is being fed to the
ear causes a nonflat perceived frequency response, with
cancellation notches and corresponding reinforcement
peaks. (It is the same mechanism as the audio effect
flanging.) This is in general no real problem; one
quickly accepts that sound as normal, the sound of
oneself wearing headphones. Deliberately changing the
delay by even a millisecond or two is immediately
perceptible, because the interference cancellations and
summations move and the sound changes. This is why many
tests attempting to establish acceptable latency by
steadily increasing delay have resulted in
unrealistically low values; the relative changes in
coloration with even small changes in delay are very
easy to perceive, even by the unskilled, and are
immediately flagged as a problem.
Conversely, if a subject is presented with a headphone
feed whose delay is quite a bit larger than a
millisecond or two, without first having had a chance to
establish a reference, the interference-related sound is
readily accepted as normal.
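The comb-filter interference described above can be illustrated with a little arithmetic. The following is a minimal sketch, assuming an idealized case in which the direct and delayed paths sum at equal level; in reality the bone-conducted path is band limited and at a different level, so the notches are only partial. The function name `comb_notches` and the 4 kHz upper limit are illustrative choices, not anything taken from the text.

```python
def comb_notches(delay_ms, f_max_hz=4000.0):
    """Return the cancellation frequencies (Hz), up to f_max_hz, produced
    when a signal is summed with a copy of itself delayed by delay_ms.

    Idealized assumption: both paths arrive at equal level, so complete
    cancellation occurs wherever the delay equals an odd number of half
    periods, i.e. at f = (2k + 1) / (2 * delay).
    """
    notches = []
    k = 0
    while True:
        # delay_ms is in milliseconds, hence the factor of 1000
        f = (2 * k + 1) * 1000.0 / (2.0 * delay_ms)
        if f > f_max_hz:
            break
        notches.append(f)
        k += 1
    return notches


if __name__ == "__main__":
    for delay in (5.0, 6.0, 20.0):
        first_three = [round(f, 1) for f in comb_notches(delay)[:3]]
        print(f"{delay:4.1f} ms delay: first notches at {first_three} Hz")
```

Under these assumptions a 5 ms delay places the first notches at 100, 300, and 500 Hz, while 6 ms moves them to roughly 83, 250, and 417 Hz. A one-millisecond change therefore shifts the coloration right through the range of vocal fundamentals, which is consistent with small changes in delay being so easy to hear.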
Air chain processors with delays in the 10–15 ms region
are in daily use at countless radio stations; together
with the other latencies in the loop from microphone to
headphones when listening off-air, this means delays
approaching 20 ms are commonplace and, to a greater or
lesser degree, accepted. Much more than that, though,
engenders complaints of the sound being disconnected or
hollow, and distracting.
Time-alignment experiments conducted on large-scale
rock’n’roll sound systems reached broadly similar
results: 20 ms of monitor delay was as much as most
performers could tolerate. Some could detect far less,
but most readily conceded that it did not bother them
much. Delay between the performer and the PA,
particularly in a large venue, proves relatively
unimportant for two reasons: first, the performer has
much more present (louder) monitoring, to which he is
likely paying much more attention; second, what
scatters back from the PA is quite diffuse and
decorrelated anyway. In all cases the threshold of
unacceptability is very crisp, definitely a
straw-that-breaks-the-camel’s-back situation.
The main thing to be considered in all this is that
latencies add: each pass of a signal through a link or
network, each piece of gear or processing to which it
is subjected, and each propagation delay contribute to
a total that is often significantly bigger than one
might expect. Just a few more teensy-weensy
milliseconds through a TCP/IP link might be what breaks
it.
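The additive nature of latency is easy to tabulate. The sketch below sums a hypothetical broadcast headphone path and compares it against the roughly 20 ms tolerance mentioned above. Only the air chain figure is taken from the 10–15 ms region quoted in the text; every other stage name and value is an assumption chosen purely for illustration, as is the 343 m/s speed of sound used for acoustic paths.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # approximate value for air near 20 °C


def acoustic_delay_ms(distance_m):
    """Propagation delay in ms for sound travelling distance_m through air."""
    return 1000.0 * distance_m / SPEED_OF_SOUND_M_PER_S


def total_latency_ms(stages):
    """Sum the per-stage latencies (ms) of a list of (name, ms) pairs."""
    return sum(ms for _name, ms in stages)


# Hypothetical loop from microphone to headphones; all values are
# illustrative assumptions except the air chain processor, which sits
# inside the 10-15 ms region quoted in the text.
stages = [
    ("A/D conversion", 1.0),
    ("console DSP", 2.0),
    ("network link, first pass", 1.5),
    ("air chain processor", 12.0),
    ("network link, second pass", 1.5),
    ("D/A conversion", 1.0),
]

THRESHOLD_MS = 20.0  # approximate tolerance reported in the text

if __name__ == "__main__":
    total = total_latency_ms(stages)
    print(f"loop latency: {total:.1f} ms (threshold {THRESHOLD_MS:.0f} ms)")

    # One more link of a few milliseconds tips the total over the edge.
    extended = stages + [("extra TCP/IP link", 3.0)]
    print(f"with extra link: {total_latency_ms(extended):.1f} ms")

    # A floor monitor a couple of metres away adds acoustic delay as well.
    print(f"2 m of air: {acoustic_delay_ms(2.0):.1f} ms")
```

With these assumed figures the loop totals 19 ms, just inside the tolerance, and a single additional 3 ms link pushes it to 22 ms.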
25.25.3.3 UDP
UDP (User Datagram Protocol) uses essentially the same
(fabulously inexpensive and readily available)
Ethernet-style connectivity, hardware, and chip sets,
but with a far simpler messaging protocol than TCP, and
one better suited to the application at hand. It is
then no surprise that the majority of commercially
available wide (more than two paths) audio transports
use a UDP variant. Ethernet hardware running at
100 Mb/s and using UDP can afford very low latency and
wholly deterministic audio paths with, for example,
typically 64 discrete paths bidirectionally at a 48 kHz
sample rate. Hardware/firmware running at 1 Gb/s allows
correspondingly greater capacity.
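The channel count quoted above can be sanity-checked with simple arithmetic. This is a rough sketch that assumes 24-bit samples and ignores Ethernet, IP, and UDP header overhead; neither the word length nor the samples-per-packet figures come from the text, and they do not describe any particular manufacturer's transport.

```python
def payload_rate_mbps(channels, sample_rate_hz, bits_per_sample):
    """Raw audio payload rate in Mb/s, ignoring all packet overhead."""
    return channels * sample_rate_hz * bits_per_sample / 1e6


def packet_fill_time_ms(samples_per_packet, sample_rate_hz):
    """Time in ms needed to collect samples_per_packet samples of one
    channel, i.e. the minimum latency added by packetizing the audio."""
    return 1000.0 * samples_per_packet / sample_rate_hz


if __name__ == "__main__":
    LINK_MBPS = 100.0  # one direction of a full-duplex 100 Mb/s link

    # Assumed 24-bit words: 64 channels at 48 kHz in one direction.
    rate = payload_rate_mbps(64, 48_000, 24)
    print(f"64 ch x 48 kHz x 24 bit = {rate:.3f} Mb/s of {LINK_MBPS:.0f} Mb/s")

    # The same channels carried in 32-bit slots leave almost no headroom.
    print(f"with 32-bit slots: {payload_rate_mbps(64, 48_000, 32):.3f} Mb/s")

    # Hypothetical packet sizes: fewer samples per packet lowers latency
    # but raises the proportion of the link spent on headers.
    for n in (8, 16, 48):
        print(f"{n:2d} samples/packet -> {packet_fill_time_ms(n, 48_000):.3f} ms")
```

Under the 24-bit assumption, 64 channels at 48 kHz need about 73.7 Mb/s of payload in each direction, which leaves some room for headers on a 100 Mb/s full-duplex link and is consistent with the figure of 64 paths each way.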
As mentioned, most manufacturers’ audio transports
use this mechanism or something like it; there are