Bob Katz on The Loudness Wars

Bob Katz is a mastering engineer with an impressively long CV. Here, he explains what “The Loudness War” is, how it came about and why, and what it’s going to take to get out of this aural mess we’ve been dumped into.

Check it out — Loudness War: Peace is Almost Here!

The upshot? Overcompressed masters sound wimpy, small and distorted. In short, they suck.

Music tracks are generally recorded with over 20 dB of “naturally available headroom”. That is, the loudest and quietest passages in any given track can vary plus/minus 20 dB (this is a LOT: every 6 dB doubles or halves the signal amplitude, and a change of roughly 10 dB is commonly heard as twice or half as loud). This “natural” sounding music then has the living crap compressed out of it (kind of like cooking your steak until it’s 200 degrees: it’s dry and tough, with all traces of natural juicy flavor completely removed).
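For anyone who wants to check the decibel arithmetic, here is a minimal Python sketch. The function names are made up for illustration, and the loudness function is only the common "10 dB sounds about twice as loud" rule of thumb, not a real psychoacoustic model.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in dB to a linear amplitude ratio."""
    # Amplitude (voltage or sound pressure) uses 20 * log10(ratio).
    return 10 ** (db / 20)


def approx_loudness_ratio(db: float) -> float:
    """Rule of thumb (an assumption): +10 dB is heard as about twice as loud."""
    return 2 ** (db / 10)


if __name__ == "__main__":
    for db in (6, 10, 20):
        print(
            f"{db:>2} dB: amplitude x{db_to_amplitude_ratio(db):.2f}, "
            f"perceived loudness roughly x{approx_loudness_ratio(db):.2f}"
        )
```

By this reckoning a 20 dB swing is a tenfold change in signal amplitude, and by the rule of thumb it is heard as very roughly four times louder or quieter.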

Why? Well, lots of reasons. Mainly, most current music is mastered (processed post-recording) so that it sounds “good” on an iPod or car stereo — that is, in an environment where “naturally quiet” passages would be totally lost. Everything is therefore compressed so that the differences between the loudest and the quietest passages get reduced. It’s then easier to hear the entirety of the track in less than pristine conditions. That’s good, right? So, if a little is good, a lot is … well … a lot. Two tracks, played side by side — the louder one might sound a bit better. Whoops. Now we’ve got a race. And in that race, dynamic swings, acoustic build up and explosive impact have all been sacrificed so that the entire track becomes louder … and of course it’s even easier to hear on your crappy iPod earbuds as you walk through an airport or on your daily drive through a stop-and-go rush hour.

Interestingly, this is also exactly what advertisers have done to the audio track on their commercials, which is why cutting away from the football game to a commercial causes the volume to jump up a lot (PAY ATTENTION!).

Now, 10 years after the iPod, louder is better and compression is king. Clients (of a mastering engineer) routinely request their music be made to sound “louder” — louder on the radio, louder on the iPod, louder in the car — so that it “pops” more and gets noticed more. And, presumably, it’ll get bought more.

Whatever. The resulting sound quality of the average audio track these days is a hash of the original music. If you want to hear music with good sound quality, apparently, you have to hear it live, before the “artist” and their producer force an audio mastering engineer to commit sonic murder on their work.

Just as a recap: recorded music starts with 20 dB of headroom. What we are hearing these days (the latest from Tom Waits, for example) has less than 6 dB. Every track sounds pretty much like every other: loud.

It’s not your imagination. Music today sucks. I blame Steve Jobs.

What? Too soon?

Soundminded says:

April 1, 2013 at 12:33 PM

Bob;
Thank you for your work on Chesky CD41, Earl Wild plays Rachmaninoff Piano Concertos 1 and 4 and Variations on a theme of Paganini. It is one of my favorite recordings and a great recording IMO. Anyone who thinks great music wasn’t written during the 20th century should listen to this recording and reconsider.

Bob Katz says:

November 15, 2011 at 10:16 AM

Dear Socrates7: Thanks for publishing and promoting this important issue. I’d like to clarify the meaning of “headroom” and correct you a bit. Headroom has nothing to do with dynamic range. Headroom is the room for peak information above the average level. A recording can be very compressed and have a lot of dynamic range, and vice versa. For example, Steely Dan’s recordings often don’t have a lot of dynamic range, but they sound good because they have plenty of room for peaks above the average level. Tool’s Aenima has plenty of dynamics (it gets quite soft and quite loud at times), but it is highly compressed, so there is not much open sound or room for peaks. A recording can have a peak-to-average ratio of 20 dB, but that is extremely rare; examples would be two of the recordings I cite near the top of the honor roll at digido.com. The Lyle Lovett recording I cite has a tremendous amount of loudness range: the difference between the speech sections and the musical sections is probably about 15 to 17 dB! The Paquito D’Rivera recording has a much smaller dynamic range. But both recordings have plenty of crest factor, or headroom, so they both sound very good. It is the use of this peak headroom that helps a recording to sound open and natural. Dynamic range also helps a recording to sound natural, but a large amount of it is not necessary for a recording to sound good. Even as little as 3 to 6 dB difference between the loudness of a verse and a chorus can help a recording to sound lively and dynamic. And if that same recording also has a decent amount of peaks above its average level (is not squashed), then it can sound quite open and more natural.

A “loud recording”, like the pop recording that I was required to make in the video example, has absolutely no headroom at all, nor any internal dynamic range, so it is doubly in trouble.
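To make the crest factor (peak-to-average ratio) idea concrete, here is a rough Python sketch. It is only an illustration under stated assumptions: RMS stands in for "average level", the test signal is a plain sine wave, and the hypothetical `hard_clip` helper is a crude stand-in for heavy limiting, not how a mastering limiter actually works.

```python
import math


def crest_factor_db(samples):
    """Peak-to-RMS ratio of a signal, expressed in dB."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        raise ValueError("crest factor is undefined for a silent signal")
    return 20 * math.log10(peak / rms)


def hard_clip(samples, ceiling=1.0):
    """Crude stand-in for heavy limiting: chop everything above the ceiling."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]


# One second of a 441 Hz sine at 44.1 kHz (a whole number of cycles).
sine = [math.sin(2 * math.pi * 441 * n / 44100) for n in range(44100)]

# Push the level up by 12 dB (x4 amplitude), then clip back to full scale.
squashed = hard_clip([4 * s for s in sine])

if __name__ == "__main__":
    print(f"sine:     {crest_factor_db(sine):.2f} dB")      # about 3 dB
    print(f"squashed: {crest_factor_db(squashed):.2f} dB")  # noticeably lower
```

The squashed version has the same peak level as the original sine, but its average level sits much closer to that peak, which is the loss of room for peaks described above.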
- Socrates7 says:
  
  November 15, 2011 at 12:34 PM
  
  Thanks for setting me straight, Bob. Much obliged.
