Sunday, May 29, 2011

PIRN Technical Journal 001: Volume Leveling

CD audio dedicates 16 bits to each sample. Each bit adds about six decibels to the available dynamic range (six decibels represents a doubling in amplitude), resulting in roughly 96 decibels of dynamic range. In simpler terms, the ratio of the loudest sound a CD can produce to the quietest one it can produce is, in theory, 65536:1 (though in practice it's more like 32768:1).
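For anyone who wants to check the arithmetic, here is a quick Python sketch. The function name dynamic_range_db is just something I made up for illustration; the only real ingredient is the standard formula for turning an amplitude ratio into decibels.

```python
import math

def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range, in decibels, for a given bit depth."""
    levels = 2 ** bits               # number of distinct sample values
    return 20 * math.log10(levels)   # amplitude ratio expressed in dB

print(2 ** 16)                          # distinct values in a 16-bit sample
print(round(dynamic_range_db(16), 1))   # 16-bit CD audio
print(round(dynamic_range_db(1), 2))    # contribution of a single bit
```

The per-bit figure comes out slightly above six decibels, which is why the total lands a little over 96.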

What does this mean for PIRN? As it turns out, quite a bit. Because so much dynamic range is available, recorded music varies considerably in volume. Vocal music, especially popular music, has been getting consistently louder since the introduction of the CD in the 1980s, largely due to the availability of digital dynamic range compression, in which the softer and louder portions of a song are moved closer together in volume using a mathematical formula. This has escalated to the point where songs frequently "clip", maxing out the amplitude for brief periods of time (less than one thousandth of a second). Such durations are supposedly imperceptible, but they allow the recording to have its volume maximized.
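To make "clipping" a little more concrete, here is a rough sketch of how one might count clipped stretches in a list of signed 16-bit samples. The function name, the min_run parameter, and the choice of three samples as the cutoff are all my own assumptions for illustration, not how any particular tool does it. For scale, one thousandth of a second of CD audio is about 44 samples per channel.

```python
FULL_SCALE = 32767  # largest positive value of a signed 16-bit sample

def count_clipped_runs(samples, min_run=3):
    """Count runs of at least min_run consecutive samples at full scale.

    A single sample touching the maximum is not necessarily clipping,
    so only runs of min_run or more are counted. The sign of each
    sample is ignored, which keeps the sketch simple.
    """
    runs = 0
    current = 0  # length of the run of full-scale samples seen so far
    for s in samples:
        if abs(s) >= FULL_SCALE:
            current += 1
        else:
            if current >= min_run:
                runs += 1
            current = 0
    if current >= min_run:  # a run that lasts until the end of the data
        runs += 1
    return runs

samples = [0, 32767, 32767, 32767, 100, -32768, -32768, 5,
           32767, 32767, 32767, 32767]
print(count_clipped_runs(samples))
```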

By comparison, orchestral music tends to have relatively high dynamic range so that it can match the mood of the scene it accompanies. Indeed, a track from an orchestral score often never reaches the maximum amplitude of the format. This poses a problem for this station, where these two types of tracks often follow one another. Without some sort of correction, the listener would be forced to adjust the volume for every track.

Fortunately, there's a pretty great program called MP3Gain, which takes an MP3 file, analyzes it, and then lets you adjust the average volume to the desired level:

(Yes, that's right, the good old Pokemon Theme is three decibels louder on Pokemon X than 2BA Master.)
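The basic idea of "measure the average level, then shift it to a target" can be sketched in a few lines of Python. To be clear, this is not how MP3Gain works internally: it uses a perceptual loudness analysis (ReplayGain) rather than the plain RMS average below, and it operates on MP3 files rather than raw samples. The function names and the target values here are assumptions made up for the example.

```python
import math

def rms_dbfs(samples, full_scale=32768):
    """Average (RMS) level of 16-bit samples, in dB relative to full scale."""
    mean_square = sum(s * s for s in samples) / len(samples)
    rms = math.sqrt(mean_square)
    if rms == 0:
        return float("-inf")  # pure silence has no measurable level
    return 20 * math.log10(rms / full_scale)

def gain_to_target_db(samples, target_dbfs):
    """Gain, in dB, that would move the track's RMS level to target_dbfs."""
    return target_dbfs - rms_dbfs(samples)

def apply_gain(samples, gain_db):
    """Scale every sample by gain_db (no clipping protection here)."""
    factor = 10 ** (gain_db / 20)
    return [round(s * factor) for s in samples]

# A square wave at half of full scale sits about 6 dB below full scale.
track = [16384, -16384] * 100
gain = gain_to_target_db(track, target_dbfs=-12.0)
quieter = apply_gain(track, gain)
print(round(rms_dbfs(track), 2), round(gain, 2), round(rms_dbfs(quieter), 2))
```

Running every track through the same target is what keeps the listener from having to reach for the volume knob between songs.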

MP3Gain isn't perfect, but it does let me fix most issues with song volume. However, some tracks have so much dynamic range that straight amplification isn't enough: if a track is amplified too much in this way, audible distortion can occur. In these cases, the solution is, ironically, dynamic range compression.
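Here is a very simplified sketch of that idea in Python: squeeze the loud parts down first, then amplify the whole thing. A real compressor tracks the signal's level over time with attack and release settings; this version applies the compression curve to each sample independently, which is only meant to show the math. The threshold, ratio, and function names are assumptions chosen for the example.

```python
import math

FULL_SCALE = 32767  # largest positive value of a signed 16-bit sample

def compress_sample(s, threshold_db=-20.0, ratio=4.0):
    """Apply a static compression curve to one 16-bit sample.

    Levels above threshold_db are reduced so that every `ratio` dB of
    input above the threshold becomes 1 dB of output above it.
    """
    if s == 0:
        return 0
    level_db = 20 * math.log10(abs(s) / FULL_SCALE)
    if level_db <= threshold_db:
        return s  # quiet samples pass through unchanged
    out_db = threshold_db + (level_db - threshold_db) / ratio
    out = round(FULL_SCALE * 10 ** (out_db / 20))
    return out if s > 0 else -out

def compress_then_amplify(samples, makeup_db, threshold_db=-20.0, ratio=4.0):
    """Compress each sample, then add makeup gain, clamping to 16-bit range."""
    factor = 10 ** (makeup_db / 20)
    result = []
    for s in samples:
        boosted = round(compress_sample(s, threshold_db, ratio) * factor)
        result.append(max(-FULL_SCALE, min(FULL_SCALE, boosted)))
    return result

# A full-scale peak next to a very quiet sample.
print(compress_then_amplify([32767, 100], makeup_db=15.0))
```

With these settings the full-scale peak is pulled down to about 15 dB below maximum, which leaves room to raise the quiet sample by 15 dB without pushing the peak past the top of the format.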

Further Reading:

Dynamic Range Compression at Wikipedia
