APPENDIX A: Audio Concepts, Terminology, and Codecs
As mentioned, digital audio is quite complex, and part of this complexity comes from the need to
bridge analog audio technology and digital audio technology. Analog audio is usually generated
using speaker cones of different sizes, manufactured from resilient membranes made of one
space-age material or another. These speakers generate sound waves by vibrating, or pulsing, them
into existence. Our ears receive this analog audio in exactly the opposite fashion, catching
those pulses of air, or vibrations of different wavelengths, and turning them back
into “data” that our brains can process. This is how we “hear” sound waves, and our brains interpret
different audio sound wave frequencies as different notes or tones.
The tone of a sound wave depends on its frequency. A wide, infrequent (long) wave
produces a lower (bass) tone, whereas a more frequent (short) wavelength
produces a higher (treble) tone. It is interesting to note that different frequencies of light produce
different colors, so there is a close correlation between analog sound (audio) and analog light
(colors), which also carries through into digital content production.
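If you want to see the frequency-to-tone relationship in actual numbers, here is a minimal Python sketch; it assumes the standard equal-temperament tuning found on a piano, with the A above middle C (A4) at 440 Hz, and shows that moving up an octave doubles the frequency (toward treble) while moving down an octave halves it (toward bass):

def note_frequency(semitones_from_a4: int) -> float:
    # Return the frequency in Hz of the note this many semitones away from A4,
    # assuming equal temperament (each semitone multiplies the frequency by 2**(1/12)).
    return 440.0 * (2.0 ** (semitones_from_a4 / 12.0))

print(note_frequency(0))     # A4 -> 440.0 Hz
print(note_frequency(12))    # A5 -> 880.0 Hz, one octave higher (toward treble)
print(note_frequency(-24))   # A2 -> 110.0 Hz, two octaves lower (toward bass)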
The volume of a sound wave is determined by the amplitude, or height, of that sound wave.
The frequency of the sound waves equates to how closely together the waves are spaced along the
X axis. The amplitude equates to how tall the waves are as measured along the Y axis. Sound waves
can be shaped uniquely, allowing them to “carry” different sound effects. A baseline type of sound
wave is called a sine wave, which you learned about in high school math with the sine, cosine, and
tangent math functions. Those of you who are familiar with audio synthesis are aware that there are
other types of sound waves used in sound design, such as the saw (sawtooth) wave, which looks like
the edge of a saw blade (hence its name), and the pulse wave, which is shaped using right angles,
producing immediate on-and-off sounds that translate into pulses.
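If you would like to experiment with these waveform shapes yourself, here is a minimal Python sketch; the sample rate, frequency, and amplitude values are arbitrary choices for illustration, and each function simply computes the wave's height at a given point in time:

import math

SAMPLE_RATE = 48_000   # samples per second (an arbitrary, commonly used rate)
FREQUENCY = 440.0      # Hz: how closely the waves are spaced along the X axis
AMPLITUDE = 0.5        # how tall the waves are along the Y axis (full scale = 1.0)

def sine(t):
    # The baseline waveform: a smooth sine function of time.
    return AMPLITUDE * math.sin(2.0 * math.pi * FREQUENCY * t)

def saw(t):
    # Ramps steadily from -AMPLITUDE to +AMPLITUDE once per cycle, then snaps
    # back, tracing the edge of a saw blade.
    phase = (t * FREQUENCY) % 1.0
    return AMPLITUDE * (2.0 * phase - 1.0)

def pulse(t):
    # Switches instantly between +AMPLITUDE and -AMPLITUDE, using right angles.
    phase = (t * FREQUENCY) % 1.0
    return AMPLITUDE if phase < 0.5 else -AMPLITUDE

# Sample one cycle of each waveform.
times = [n / SAMPLE_RATE for n in range(int(SAMPLE_RATE / FREQUENCY))]
sine_wave  = [sine(t)  for t in times]
saw_wave   = [saw(t)   for t in times]
pulse_wave = [pulse(t) for t in times]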
Even randomized waveforms, such as noise, are used in sound design to obtain edgy sound results.
As you may have ascertained from previous knowledge regarding data footprint optimization, the
more “chaos” or noise that is present in your sound waves, the harder they will be to compress for a
codec, resulting in a larger digital audio file size for that particular sound.
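You can demonstrate this effect with a short Python sketch; it uses zlib, a general-purpose lossless compressor, as a stand-in for a real audio codec, and the 440 Hz tone, 16-bit samples, and one-second duration are simply illustrative assumptions:

import math
import random
import struct
import zlib

SAMPLE_RATE = 48_000  # one second of 16-bit samples at 48 kHz (illustrative values)

# A smooth 440 Hz sine tone versus completely random noise.
tone  = [int(32767 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
         for n in range(SAMPLE_RATE)]
noise = [random.randint(-32768, 32767) for _ in range(SAMPLE_RATE)]

tone_bytes  = struct.pack(f"<{SAMPLE_RATE}h", *tone)
noise_bytes = struct.pack(f"<{SAMPLE_RATE}h", *noise)

# zlib is a general-purpose lossless compressor, not an audio codec, but the
# effect is the same: the chaotic noise barely shrinks at all.
print("tone  compressed:", len(zlib.compress(tone_bytes)), "bytes")
print("noise compressed:", len(zlib.compress(noise_bytes)), "bytes")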
The next section takes a closer look at how an analog audio sound wave is turned into digital audio
data using a process called sampling, which is a core tool of sound design and synthesis.
Digital Audio: Samples, Resolution, and Frequency
The process of turning analog audio (sound waves) into digital audio data is called sampling. If you
work in the music industry, you have probably heard about a type of keyboard (or rack-mount equipment)
called a “sampler.” Sampling is the process of slicing an audio wave into segments so that you can
store the shape of the wave as digital audio data using a digital audio format. This turns an infinitely
accurate analog sound wave into a discrete amount of digital data, that is, into zeroes and ones.
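Here is a minimal Python sketch of that slicing idea; an ordinary function stands in for the infinitely accurate analog sound wave, and the CD-quality rate of 44,100 samples per second is used as an illustrative slicing frequency:

import math

def analog_wave(t):
    # A stand-in for the "infinitely accurate" analog signal: a 440 Hz sine wave.
    return math.sin(2.0 * math.pi * 440.0 * t)

SAMPLE_RATE = 44_100  # CD-quality audio slices the wave 44,100 times per second

# Sampling: read the wave's height at evenly spaced moments in time and keep
# only those discrete values; everything in between is simply not stored.
samples = [analog_wave(n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]  # one second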
The more zeroes and ones that are used, the more accurate the reproduction of the infinitely
accurate (original) analog sound wave. The sample accuracy determines how many zeroes and ones
are used to reproduce the analog sound wave, and therefore also determines the data footprint,
so I will get into that discussion next.
Each digital segment of a sampled audio sound wave is called a sample, because it samples that
sound wave at that exact point in time. The precision of a sample is determined by how much data
is used to define each wave slice’s height. Just like with digital imaging, this precision is termed the
resolution, or more accurately (no pun intended), the sample resolution. Sample resolution in digital
audio is usually defined as 8-bit, 12-bit, 16-bit, 24-bit, or 32-bit. HD audio uses 24-bit audio samples.
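Here is a minimal Python sketch of what sample resolution means for each wave slice; the 0.5 sample height is an arbitrary example value, and the point is simply that more bits give you more levels with which to record that height:

def quantize(sample: float, bits: int) -> int:
    # Round a wave slice's height (in the range -1.0 to 1.0) to the nearest
    # level that the chosen sample resolution can represent.
    max_level = (1 << (bits - 1)) - 1   # 127 for 8-bit, 32767 for 16-bit, and so on
    return round(sample * max_level)

height = 0.5  # the height of one wave slice, picked arbitrarily
for bits in (8, 12, 16, 24, 32):
    levels = 1 << bits  # total number of representable levels at this resolution
    print(f"{bits:>2}-bit sample: {quantize(height, bits):>12,} "
          f"out of {levels:,} possible levels")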