June 25th, 2016

Most modern songs are created in digital audio workstations that default to the 24-bit wav file format and a 44.1kHz sampling rate. The 44.1kHz sampling rate has been the de facto standard for music distributors since the first commercial CD was released in August 1982 by the Dutch technology company Philips. In 2016, 24-bit wav at a 96kHz sampling rate is becoming the high resolution audio standard for the new music industry's digital supply chain.

The 44.1kHz sample rate was originally chosen for the CD because it comfortably satisfies the Nyquist-Shannon Theorem. The theorem states that in order to faithfully digitize a sound, the sample rate must be more than twice the highest frequency being recorded. The human ear can hear frequencies up to roughly 20kHz. Therefore, the sampling rate must be at least 40kHz in order to properly reconstruct the signal, and 44.1kHz leaves a margin above that minimum for the anti-aliasing filter. Frequencies above half the sample rate (the Nyquist frequency) cannot be captured correctly; they are reproduced as false, lower frequencies, which is known as aliasing.

The red source signal requires 4 samples within the 2 wave cycles in order to properly capture the sound. The blue line represents the aliasing created when the sample rate is not at least twice the frequency of the source.
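To make the aliasing idea concrete, here is a minimal Python sketch that estimates where a pure tone would land after sampling. The function name `alias_frequency` is made up for illustration, and the sketch assumes an ideal sampler with no anti-aliasing filter in front of it, which real converters do have.

```python
def alias_frequency(signal_hz: float, sample_rate_hz: float) -> float:
    """Return the apparent frequency of a pure tone after ideal sampling.

    Assumes no anti-aliasing filter (an illustrative simplification).
    """
    nyquist_hz = float(sample_rate_hz) / 2.0
    # Fold the tone into the range [0, sample_rate).
    folded_hz = float(signal_hz) % float(sample_rate_hz)
    # Anything above the Nyquist frequency mirrors back below it.
    if folded_hz > nyquist_hz:
        folded_hz = float(sample_rate_hz) - folded_hz
    return folded_hz


if __name__ == "__main__":
    for tone_hz in (10_000, 20_000, 30_000):
        print(tone_hz, "Hz is heard as", alias_frequency(tone_hz, 44_100), "Hz")
```

Under these assumptions, a 20kHz tone sampled at 44.1kHz is preserved, while a 30kHz tone would fold back to 14.1kHz, well inside the audible range.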

Since 1982, the music industry has delivered music to consumers using the 44.1kHz sampling rate. However, the new streaming-based digital supply chain is slowly adopting the 24-bit, 96kHz file format.

Mastered for iTunes logo

In February of 2012, the Recording Academy and Apple iTunes worked together to create the “Mastered for iTunes” digital delivery standard. This standard is largely misunderstood, but it gives the mastering engineer a method to compare what he or she hears in the studio with what the consumer will hear. The MfiT standard also protects against peak distortion that can be created during the format conversion process. A common approach is to leave 0.5 to 1.5dB of unused headroom at the top of the master digital audio file, which means setting the limiter's maximum output level between -1.5 and -0.5dBFS. If your limiter is set to a maximum output level of -0.1dBFS, or even -0.3dBFS, peak distortion can be created when your file is converted from a wav file to a consumer file format. By leaving at least 0.5dB of headroom, the encoded file is more likely to stay within full scale (0.0dBFS), reducing the chance of peak distortion. The MfiT applet allows you to perform the conversion process and hear the AAC file before it hits retail.

The picture below shows a wav file with a limiter's output set to a maximum peak level of -0.5dBFS. When the master file is encoded to an MP3 or AAC by the retailer, the codec can produce peaks above your limiter level. If your limiter is set with some headroom, the encoded peaks will not result in distortion. They will simply take advantage of the headroom you left in the master.

Peak distortion created during the format conversion process performed by digital music retailers. The above photo shows amplitude (up, down) and time (L, R).
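The headroom arithmetic can be sketched in a few lines of Python. The function names are hypothetical, and the 0.4dB overshoot used in the example is an assumed figure for illustration only; the real overshoot depends on the codec, the bitrate, and the program material.

```python
import math


def dbfs_to_linear(level_dbfs: float) -> float:
    """Convert a level in dBFS to linear amplitude (1.0 = full scale)."""
    return 10.0 ** (level_dbfs / 20.0)


def linear_to_dbfs(amplitude: float) -> float:
    """Convert a positive linear amplitude back to dBFS."""
    return 20.0 * math.log10(amplitude)


def clips_after_encoding(ceiling_dbfs: float, overshoot_db: float) -> bool:
    """True if an encoded peak would exceed full scale (0.0dBFS).

    overshoot_db is an assumed, illustrative amount by which the codec
    raises the loudest peak above the limiter ceiling.
    """
    return ceiling_dbfs + overshoot_db > 0.0


if __name__ == "__main__":
    assumed_overshoot_db = 0.4  # hypothetical value, not a measured one
    for ceiling in (-0.1, -0.3, -0.5, -1.0):
        verdict = "clips" if clips_after_encoding(ceiling, assumed_overshoot_db) else "stays within full scale"
        print(f"limiter at {ceiling}dBFS: {verdict}")
```

With that assumed overshoot, ceilings of -0.1 and -0.3dBFS would exceed full scale after encoding, while -0.5dBFS and lower would not.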

The MfiT protocol prefers 24-bit wav, 96kHz sample rate files for AAC encoding. Technically, you can deliver a 24-bit wav, 44.1kHz file to your distributor and it will still be considered "Mastered for iTunes", but 24-bit 96kHz files are preferred. In my opinion, the MfiT guidelines work extremely well across the entire digital supply chain, not just the iTunes marketplace.

High Resolution Audio Logo created by the Consumer Technology Association

In February of 2016, the Consumer Technology Association created a classification for “High Resolution Audio” as “better than CD quality”. In addition to High Resolution Audio standards, streaming services are slowly moving to High Resolution Audio with the incorporation of “Master Quality Authenticated” (MQA) encoding and decoding technology developed by Bob Stuart of Meridian Audio.

The MQA process allows streaming services to encode and decode 96kHz, 24-bit files at a fraction of the file size. Tidal has adopted the technology, and other streaming services are showing interest in Meridian's breakthroughs. MQA audio streaming will require a hardware decoder to play back the full bandwidth 96kHz, 24-bit stream. However, normal playback devices such as an iPhone or laptop will support "CD quality" MQA streams without an MQA decoder.
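To see why file size matters for streaming, the raw data rates can be worked out directly. The sketch below assumes uncompressed stereo PCM; the function names are hypothetical, and it says nothing about the actual data rate of an MQA stream.

```python
def pcm_bitrate_bps(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> int:
    """Bits per second for uncompressed PCM (stereo assumed by default)."""
    return sample_rate_hz * bit_depth * channels


def pcm_megabytes_per_minute(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> float:
    """Megabytes (10^6 bytes) needed for one minute of uncompressed PCM."""
    bits_per_minute = pcm_bitrate_bps(sample_rate_hz, bit_depth, channels) * 60
    return bits_per_minute / 8 / 1_000_000


if __name__ == "__main__":
    cd = pcm_bitrate_bps(44_100, 16)
    hi_res = pcm_bitrate_bps(96_000, 24)
    print("CD quality:", cd, "bits per second")
    print("96kHz, 24-bit:", hi_res, "bits per second")
    print("ratio:", round(hi_res / cd, 2))
```

Under these assumptions, a 96kHz, 24-bit stereo file carries a little over three times the data of a CD-quality file, about 34.56MB per minute against roughly 10.58MB.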

As you can see, the largest supplier of music (iTunes) has incorporated high resolution audio as the archival standard with its “Mastered for iTunes” program. Apple is currently amassing the largest database of 24-bit, 96kHz music in the world. The Consumer Technology Association has designated a minimum standard and logo for High Resolution Audio, and it plans on licensing the logo to appear dynamically within streaming services.

As streaming services continue to innovate, we will hear higher quality audio and see greater integration of metadata delivered to consumers. The Digital Data Exchange (DDEX) worked with the Recording Academy to set standards for formatting the metadata that will travel down the digital supply chain to digital distributors. Once metadata is integrated into the digital supply chain, it will change the way we discover new music and learn about the people who make it. It will not be long before there is a high resolution audio streaming service with a fully integrated digital credits list, allowing consumers to discover new music in a whole new way.