Handbook for Sound Engineers

Designing for Speech Intelligibility

measures include C35, whereby the split time is taken
as 35 ms and also sometimes C7 where this early split
time effectively produces an almost pure D/R ratio.
A well-defined scale has not been developed, but it is
generally recommended that for good intelligibility (in
an auditorium or similar relatively large acoustic space)
a positive value of C50 is essential and that a value of
around +4 dB C50 should be aimed for. (This is equiva-
lent to about 5%Alcons.) Measurements are usually
made at 1 kHz or may be averaged over a range of fre-
quencies. The method does not take account of back-
ground noise and is of limited application with respect
to sound systems due to the lack of a defined scale and
frequency limitations—although there is no reason why
the values obtained at different frequencies could not be
combined on some form of weighted basis. (See Lochner
and Burger 1964.) Bradley has extended the C50 and
C35 concept and introduced U50, U80, etc., where U
stands for useful energy, and also included
signal-to-noise ratio effects. While the concept is a useful
addition to the palette of speech intelligibility measures,
it has not caught on to any extent—but it can be a very
useful diagnostic tool and further extends our knowl-
edge and understanding of speech intelligibility.
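
The early-to-late ratios discussed above (C50, C35, C7) can be illustrated with a minimal Python sketch. It assumes a single-band impulse response that starts at the arrival of the direct sound; the function names and the synthetic exponential decay are hypothetical and serve only as illustration, not as a measurement procedure from this text.

```python
import math


def clarity_index(impulse_response, sample_rate, split_ms=50.0):
    """Early-to-late energy ratio in dB for a given split time.

    impulse_response: pressure samples, assumed to start at the direct sound.
    split_ms: split time in milliseconds (50 for C50, 35 for C35, 7 for C7).
    """
    split = int(round(sample_rate * split_ms / 1000.0))
    energy = [s * s for s in impulse_response]
    early = sum(energy[:split])   # energy arriving before the split time
    late = sum(energy[split:])    # energy arriving after the split time
    if early <= 0.0 or late <= 0.0:
        raise ValueError("need non-zero energy on both sides of the split")
    return 10.0 * math.log10(early / late)


def synthetic_decay(rt60, sample_rate=8000, duration=2.0):
    """Idealised exponential decay (illustrative assumption, not a real room).

    The amplitude envelope falls by 60 dB after rt60 seconds.
    """
    k = 3.0 * math.log(10.0) / rt60  # amplitude decay rate, about 6.91 / rt60
    n = int(sample_rate * duration)
    return [math.exp(-k * i / sample_rate) for i in range(n)]


if __name__ == "__main__":
    ir = synthetic_decay(rt60=0.5)
    for split in (7.0, 35.0, 50.0):
        print(f"C{split:g} = {clarity_index(ir, 8000, split):.2f} dB")
```

A real measurement would first band-filter the impulse response (for example, to the 1 kHz octave) and would have to deal with locating the direct sound and with background noise, none of which this sketch attempts.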

36.14.2.4 Speech Transmission Index STI, RASTI, and
STIPA

The STI technique was also developed in Holland at
about the same time as Peutz was developing %Alcons.
While the %Alcons method became popular in the
United States, STI became popular and far more widely
used in Europe and has been adopted by a number of
International and European Standards and codes of prac-
tice relating to sound system speech intelligibility
performance as well as International Standards relating
to aircraft audio performance. It is interesting to note
that while %Alcons was developed primarily as a predic-
tive technique, STI was developed as a measurement
method and is not straightforward to predict! (See later.)
The technique considers the source/room (audio
path)/listener as a transmission channel and measures
the reduction in modulation depth of a special test signal
as it traverses the channel, Figs. 36-32 and 36-33. A
unique and very important feature of STI is that it auto-
matically takes account of both reverberation and noise
effects when assessing potential intelligibility.
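
How reverberation and noise each reduce the modulation depth can be sketched with the commonly quoted statistical expression for the modulation reduction factor. That expression is not given in this passage and is introduced here as an assumption: it presumes an ideal exponential decay and steady noise, and the function name is hypothetical.

```python
import math


def modulation_reduction(mod_freq_hz, rt60_s, snr_db):
    """Modulation reduction factor m for one modulation frequency.

    Assumes an ideal exponential decay with reverberation time rt60_s and
    steady background noise at a signal-to-noise ratio of snr_db.
    """
    # Reverberation acts as a low-pass filter on the intensity envelope.
    reverb_term = 1.0 / math.sqrt(
        1.0 + (2.0 * math.pi * mod_freq_hz * rt60_s / 13.8) ** 2
    )
    # Noise reduces the modulation depth equally at all modulation frequencies.
    noise_term = 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    return reverb_term * noise_term


if __name__ == "__main__":
    for f in (0.63, 2.0, 12.5):
        m = modulation_reduction(f, rt60_s=1.5, snr_db=15.0)
        print(f"F = {f:5.2f} Hz  m = {m:.3f}")
```

The product form makes the point of the paragraph above visible: either a long reverberation time or a poor signal-to-noise ratio is enough to pull m, and hence the index, down.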
Schroeder later showed that it is also possible to
measure the modulation reduction and hence STI via a
system’s impulse response. Modern signal processing
techniques now allow a variety of test signals to be used
to obtain the impulse response and hence compute the
STI, including speech or music. A number of instruments
and software programs are currently available
that enable STI to be directly measured. However, care
needs to be taken when using some programs to ensure
that any background or interfering noise is properly
accounted for.
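
The impulse response route can be sketched as follows, assuming the usual form of Schroeder's relation: the modulation reduction at modulation frequency F is the magnitude of the Fourier transform of the squared impulse response, normalised by its total energy. The function name and the optional noise correction are illustrative assumptions, not the procedure of any particular instrument or program.

```python
import cmath
import math


def mtf_from_impulse_response(h, sample_rate, mod_freq_hz, snr_db=None):
    """Modulation reduction at one modulation frequency from an impulse response.

    h: band-filtered impulse response samples (the filtering is assumed to
       have been done already).
    snr_db: optional signal-to-noise ratio. A noise-free impulse response does
       not contain the background noise, so it has to be applied separately.
    """
    energy = [s * s for s in h]
    total = sum(energy)
    if total <= 0.0:
        raise ValueError("impulse response has no energy")
    acc = 0j
    for n, e in enumerate(energy):
        # Fourier transform of the squared impulse response at mod_freq_hz.
        acc += e * cmath.exp(-2j * math.pi * mod_freq_hz * n / sample_rate)
    m = abs(acc) / total
    if snr_db is not None:
        m /= 1.0 + 10.0 ** (-snr_db / 10.0)
    return m
```

The separate `snr_db` argument reflects the caution in the text: an impulse response obtained with noise-rejecting signal processing describes the reverberant path only, so interfering noise has to be accounted for explicitly.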
The full STI technique is a very elegant analysis
method and is based on the amplitude modulations
occurring in natural speech, Figs. 36-33 and 36-34. Mea-
surements are made using octave band carrier frequen-
cies of 125 Hz to 8 kHz, thereby covering the majority
of the normal speech frequency range. Fourteen individ-
ual low-frequency (speechlike) modulations are mea-
sured in each band over the range 0.63 to 12.5 Hz.
A total of 98 data points are therefore measured for
each STI value (7 octave band carriers each × 14 modu-
lation frequencies). Because the STI method operates
over almost the entire speech band it is well suited to
assessing sound system performance. The complete STI
data matrix is shown in Table 36-2. “X” represents a
data value to be provided.
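
The 7 × 14 matrix and its reduction to a single value can be sketched as below. The conversion steps (apparent signal-to-noise ratio limited to ±15 dB, then rescaled to the range 0 to 1) follow the commonly described STI procedure, which is not spelled out in this passage. The equal band weighting is a simplifying assumption: standardised versions use unequal band weights and further corrections that are omitted here. All names are hypothetical.

```python
import math

OCTAVE_BANDS_HZ = [125, 250, 500, 1000, 2000, 4000, 8000]
MODULATION_FREQS_HZ = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                       3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]


def transmission_index(m):
    """Convert one modulation reduction value m into an index from 0 to 1."""
    m = min(max(m, 1e-6), 1.0 - 1e-6)           # keep the logarithm finite
    snr_apparent = 10.0 * math.log10(m / (1.0 - m))
    snr_apparent = max(-15.0, min(15.0, snr_apparent))
    return (snr_apparent + 15.0) / 30.0


def sti_from_matrix(m_matrix, band_weights=None):
    """Reduce a 7 x 14 matrix of m values to one number.

    m_matrix[band][k] is the modulation reduction for octave band `band`
    and modulation frequency index `k`.
    band_weights: equal weights by default (an assumption for this sketch).
    """
    if len(m_matrix) != len(OCTAVE_BANDS_HZ):
        raise ValueError("expected one row per octave band")
    if band_weights is None:
        band_weights = [1.0 / len(OCTAVE_BANDS_HZ)] * len(OCTAVE_BANDS_HZ)
    sti = 0.0
    for row, weight in zip(m_matrix, band_weights):
        if len(row) != len(MODULATION_FREQS_HZ):
            raise ValueError("expected one value per modulation frequency")
        band_index = sum(transmission_index(m) for m in row) / len(row)
        sti += weight * band_index
    return sti


if __name__ == "__main__":
    data_points = len(OCTAVE_BANDS_HZ) * len(MODULATION_FREQS_HZ)
    print("data points per STI value:", data_points)
```

Each row of the matrix corresponds to one octave band carrier and each column to one modulation frequency, which is where the 98 data points per STI value come from.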
When STI was first developed, the processing power
needed to carry out the above calculations was beyond
what processors of the day could economically provide,
and so a simpler derivative measure was conceived: RaSTI.
RaSTI stands for Rapid Speech Transmission Index (later
changed to Room Acoustic Speech Transmission Index when
its shortfalls for measuring sound system performance
were realized; see Mapp 2002 and 2004). RaSTI uses
just nine modulation frequencies spread over two octave
band carriers thereby producing an order of magnitude
reduction in the processing power required.
The octave band carriers are 500 Hz and 2 kHz,
which, although well selected to cover both vowel and
consonant ranges, does mean that the system under test

Figure 36-32. Principle of STI and modulation reduction of
speech by room reverberation. [Figure: the transmitted speech
signal has modulation index 1 and intensity envelope
I(1 + cos 2πFt); the received speech signal has a reduced
modulation index m and envelope I(1 + m cos 2πF(t + τ)).
Both envelopes are plotted against time t with period 1/F.]