Wireframe_-_Issue_23_2019


34 / wfmag.cc


Game audio part 1: Voices and listeners

Toolbox


Priority systems
allow sound effects to
share voices with music.
Modern games can
mix a hundred or more
voices, but still need a
designer to help work
out which are most
important in a busy game world, with thousands
of potential sound sources.
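A priority system of this kind can be sketched in a few lines. This is only an illustration: the `VoicePool` class, its `request` method, and the rule that a higher number means a more important sound are all assumptions made for the example, not taken from any particular engine.

```python
from dataclasses import dataclass

@dataclass
class Voice:
    name: str
    priority: int  # assumed convention: higher means more important

class VoicePool:
    """A fixed pool of playback voices with priority-based stealing."""

    def __init__(self, max_voices):
        self.max_voices = max_voices
        self.active = []

    def request(self, name, priority):
        """Try to start a sound; return True if it is given a voice."""
        if len(self.active) < self.max_voices:
            self.active.append(Voice(name, priority))
            return True
        # The pool is full: find the least important sound still playing.
        weakest = min(self.active, key=lambda v: v.priority)
        if priority > weakest.priority:
            self.active.remove(weakest)  # steal its voice
            self.active.append(Voice(name, priority))
            return True
        return False  # the new sound is simply not played

pool = VoicePool(max_voices=2)
pool.request("music", 10)
pool.request("footstep", 1)
pool.request("gunshot", 5)  # steals the footstep's voice
```

In a real mixer the priority would usually be weighted by distance and loudness as well, so that a quiet, far-off sound gives way before a nearby one.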
Long samples are played or ‘streamed’ from
disc, or ‘looped’ to make characteristic sounds
of any duration – great for weather, surfaces,
and engines, as well as music. Stream mixing
allows interactive music, like the eight layers in
Race Driver: Grid which fade up and down as you
overtake, crash, or enter the final lap.
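One simple way to fade music layers like these is to nudge each layer's gain towards a target level on every update. The sketch below assumes gains between 0 and 1 and a hypothetical `step_layer_gains` helper; it makes no claim about how any shipped game does it.

```python
def step_layer_gains(gains, targets, max_step):
    """Move each layer's gain towards its target by at most max_step."""
    stepped = []
    for gain, target in zip(gains, targets):
        change = max(-max_step, min(max_step, target - gain))
        stepped.append(gain + change)
    return stepped

# Two layers: one fading up, one fading down.
gains = step_layer_gains([0.0, 1.0], [1.0, 0.0], max_step=0.25)
print(gains)  # [0.25, 0.75]
```

Calling this once per frame gives a linear fade whose length is set by `max_step`.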
For a decade, triple-A games have routinely
juggled thousands of short samples in memory,
plus music and ambience streamed from disc,
along with perhaps hundreds of thousands of
localised speech samples, loaded on the fly.

LISTENERS AND CAMERAS
There are two sorts of sounds in a game:
diegetic ones are the sounds of things in the
world which you can potentially see, while
non-diegetic ones include commentary, radio
messages, local ambient sounds like weather,
and most sorts of music. Diegetic sources are
all positioned, mixed, and played in association
with one or more ‘listeners’ – these are virtual
3D microphones, often associated with the
camera position or where it’s looking.
A third-person game may associate player-
centric sounds with either position. This isn’t
strictly accurate but sounds natural enough
in most games, providing clear direction and
distance cues. But weird things happen, audible
equivalents of oversteer, if you try to put them
somewhere in-between. Split-screen multiplayer
games always require multiple listeners, at least
games always require multiple listeners, at least
one per screen slice. Their diegetic sounds must
usually be played twice – once for each view – to
bed them individually into each window in the
game world.
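To see why a split-screen sound is played once per view, here is a minimal 2D sketch. The `hear` function, the inverse-distance falloff, and the sine pan law are assumptions chosen for brevity; real engines offer several distance and panning models.

```python
import math

def hear(source_xy, listener_xy, listener_heading_deg):
    """Return (gain, pan) for one source as heard by one listener.

    Assumed conventions: +y is north, headings are in degrees clockwise
    from north, and pan runs from -1 (left) to +1 (right).
    """
    dx = source_xy[0] - listener_xy[0]
    dy = source_xy[1] - listener_xy[1]
    distance = math.hypot(dx, dy)
    gain = 1.0 / max(distance, 1.0)   # inverse-distance falloff, clamped up close
    bearing = math.atan2(dx, dy)      # angle clockwise from north
    relative = bearing - math.radians(listener_heading_deg)
    return gain, math.sin(relative)

def hear_split_screen(source_xy, listeners):
    """Render the same source once for each (position, heading) listener."""
    return [hear(source_xy, pos, heading) for pos, heading in listeners]

car = (2.0, 0.0)
players = [((0.0, 0.0), 0.0), ((4.0, 0.0), 0.0)]
views = hear_split_screen(car, players)
```

Here the car sits between two players facing north, so it pans hard right for the first and hard left for the second, which is why one shared rendition would be wrong for at least one of them.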
For convenience when mixing, and sometimes
to give the player extra control, voices get
grouped together, so all the speech, music, or
ambiences and weather sounds can fade up
and down in sets. Changes in camera position
also affect the mix, varying exhaust and engine
sound levels, helping to distinguish noises inside
and outside the vehicle, or those made by the
player, or other players.
Figure 5 shows how voices are mixed as
groups, then the diegetic ones are positioned
via listeners before panning for the appropriate
endpoint. Non-positional sounds may be added
into the main mix, routed to a secondary
output like the Wii U GamePad, or sent
directly elsewhere.
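Group mixing amounts to scaling each voice by its group's fader before summing. The sketch below uses made-up group names and a hypothetical `mix_frame` function, and handles a single mono sample frame to keep things short.

```python
def mix_frame(voices, group_gains):
    """Sum one sample frame; voices is a list of (group, sample) pairs."""
    total = 0.0
    for group, sample in voices:
        # Groups without a fader setting pass through at full level.
        total += sample * group_gains.get(group, 1.0)
    return total

frame = [("speech", 0.5), ("music", 0.25), ("weather", 0.25)]
faders = {"speech": 1.0, "music": 0.5, "weather": 0.0}
print(mix_frame(frame, faders))  # 0.625
```

Pulling one fader down, as with the weather group here, silences every voice in that set without touching the individual sounds.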

WHERE’S THAT SOUND?
We determine the distance, direction, and
movements of sounds around us using a
mixture of cues. Distant sounds are quieter,
duller, and tend to have more echo. The relative
loudness and timbre of sounds reaching each
ear give us an idea of their direction in two main
ways: time- and phase-based up to a frequency
about two-thirds of the way up a piano keyboard;
and intensity-based at higher frequencies.
This inference is inexact, particularly prone to
front/rear confusion, especially in the presence
of realistic reverberation. In the real world and
VR games, we resolve such ambiguities with
small unconscious head movements. Sounds in
front move the opposite way to those behind.
Camera and listener movements, controlled
by a twitch of the thumb, are the equivalent in
conventional games, whether played in stereo
or surround. The importance of movement
and orientation in spatial perception cannot
be overstated, and the player’s control over
this and consistency of rendition in the game
are critical.
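The head-turn trick can be checked with a little trigonometry. This sketch assumes a simple sine pan law and a hypothetical `pan_for` helper, with bearings and head yaw in degrees clockwise from straight ahead.

```python
import math

def pan_for(source_bearing_deg, head_yaw_deg):
    """Stereo pan, -1 (left) to +1 (right), relative to where the head points."""
    return math.sin(math.radians(source_bearing_deg - head_yaw_deg))

FRONT, REAR = 0.0, 180.0

# Turn the head (or the camera) 10 degrees to the right.
front_shift = pan_for(FRONT, 10.0) - pan_for(FRONT, 0.0)
rear_shift = pan_for(REAR, 10.0) - pan_for(REAR, 0.0)

print(front_shift < 0)  # True: the sound in front drifts left
print(rear_shift > 0)   # True: the sound behind drifts right
```

With the head still, both sources pan to almost exactly the same place, which is the front/rear confusion described above; only the movement tells them apart.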

WHERE’S THIS GOING?
We infer motion from both volume and pitch.
The crucial ‘Doppler effect’ is easy to implement
by varying the pitch of moving sounds. In a
fast-moving game, it means that almost all the

FREQ AND RES


The five-knob parametric filter,
pioneered by Commodore’s
C64 SID chip and now common
in modern games, splits
sounds into three frequency
bands – bass on the left, treble
on the right. Two more controls
set the centre frequency and
resonance (or width) of the
‘passband’ in the middle.
Analogue synthesisers label
those particularly expressive
knobs Freq and Res.
Nowadays, the same effect
can be achieved with half a
dozen lines of C and two static
variables per voice, though
you do need to run the code
over every sample, tens of
thousands of times a second.
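One well-known filter that fits this description is the Chamberlin state-variable filter, which keeps just two values per voice and yields low, band, and high outputs together. Whether it is the exact design the sidebar has in mind is an assumption. The sketch is in Python rather than C, and the class and parameter names are illustrative.

```python
import math

class StateVariableFilter:
    """Chamberlin state-variable filter: two state values per voice."""

    def __init__(self, freq_hz, res_q, sample_rate=48000):
        # 'Freq' knob: tuning coefficient for the centre frequency.
        self.f = 2.0 * math.sin(math.pi * freq_hz / sample_rate)
        # 'Res' knob: a higher Q means less damping and a narrower passband.
        self.damping = 1.0 / res_q
        self.low = 0.0   # state variable 1
        self.band = 0.0  # state variable 2

    def process(self, sample):
        """Filter one sample; return the (low, band, high) outputs."""
        self.low += self.f * self.band
        high = sample - self.low - self.damping * self.band
        self.band += self.f * high
        return self.low, self.band, high

def five_knob(filt, sample, bass, mid, treble):
    """Mix the three bands using the remaining three knobs."""
    low, band, high = filt.process(sample)
    return bass * low + mid * band + treble * high

voice_filter = StateVariableFilter(freq_hz=1000.0, res_q=1.0)
out = [five_knob(voice_filter, 1.0, bass=1.0, mid=0.0, treble=0.0) for _ in range(2000)]
```

This simple form is only stable while the centre frequency stays well below the sample rate (roughly a sixth of it or less), so a practical version would clamp the Freq knob accordingly.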


Figure 3: Criterion’s
shooter Black indicates
the player’s health by
feeding almost the
entire game audio mix
through a sweeping
resonant filter like this.

Figure 4: In 1842, Austrian
scientist Christian Doppler
showed how relative motion
affects sound waves, raising
their frequency upon
approach, lowering it as
they recede.
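The pitch change in Figure 4 follows the standard Doppler formula, which a game can apply as a playback-rate multiplier. In this sketch the function name, the clamping limit, and the convention that positive speeds mean "closing in" are assumptions made for the example.

```python
SPEED_OF_SOUND = 343.0  # metres per second in air at about 20 °C

def doppler_pitch_ratio(source_closing, listener_closing=0.0, c=SPEED_OF_SOUND):
    """Playback-rate multiplier for a moving sound.

    source_closing: speed of the source towards the listener (m/s).
    listener_closing: speed of the listener towards the source (m/s).
    Negative values mean moving apart.
    """
    # Clamp so speeds near or beyond c cannot divide by zero or flip the sign.
    limit = 0.9 * c
    vs = max(-limit, min(limit, source_closing))
    vl = max(-limit, min(limit, listener_closing))
    return (c + vl) / (c - vs)

approaching = doppler_pitch_ratio(34.3)   # about 1.11: pitch rises
receding = doppler_pitch_ratio(-34.3)     # about 0.91: pitch falls
```

The closing speeds are the components of each velocity along the line between source and listener, so a car passing side-on has its pitch slide smoothly from the higher value to the lower one.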


[Figure 4 panels: Source at rest; Source in motion]

[Figure 3 plot. Axis: Frequency (Hz). Curves: Low Pass Response, Band Pass Response, High Pass Response. Markers: Passband, HP, LP, fc, 0dB, -3dB]
