What is Joint Stereo?

Filed under Podcasting

While the terms mono and stereo are probably familiar to most people, the term joint stereo is certainly much less common. So when is stereo not stereo?

Most of the music that you listen to today is likely to have been recorded and designed for playback on a stereo system, either through a pair of headphones or a pair of speakers. Even very cheap portable systems nowadays are just as likely to include two speakers for stereo playback as they are a single one.

So if the days of mono systems are all but over, what’s this joint stereo nonsense all about? Is this a retrograde step or something altogether more cunning?

Mono

In the beginning there was mono. The earliest sound recordings were all mono, including those made on the phonograph that Thomas Edison introduced in 1877. Mono, or monaural, is sound reproduced from a single audio channel.

While a mono audio channel can be played back through a pair of headphones or speakers, this doesn’t make it stereo, as the same audio signal is being fed to both speakers simultaneously.

Stereo

Stereo was a much later invention, pioneered in the 1930s and developed to give listeners an impression of spatial separation between sounds, e.g. hearing a violin on the left and a cello on the right, creating a more realistic or natural soundscape.

Stereo, or stereophonic sound, is conventionally accepted as sound reproduced from two audio channels (although the true definition of stereo is a little more esoteric and beyond the scope of this article). As such, a stereo playback system usually has two speakers, one dedicated to each audio channel.

The word stereophonic was coined from the Greek stereos, meaning solid, and phone, meaning sound. It is normally just abbreviated to stereo.

Joint Stereo

While the concept of joint stereo has been around for many years in various guises, it is really in the context of MP3 recordings that we are interested in it now.

The MP3 format is designed to compress audio data, saving on storage requirements and reducing download times, and joint stereo is another means to this end.

For a given quality of MP3 audio, a stereo track in which the two channels are encoded independently will generally require roughly double the storage of the equivalent mono track. To put it another way, it requires roughly double the bit rate.
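To put rough numbers on this, the size of a constant bit rate file follows directly from the bit rate and the duration. The sketch below assumes constant bit rate encoding and ignores tags and other overhead; the function name, the four-minute duration and the 64/128 kbps figures are illustrative choices only.

```python
def cbr_size_megabytes(bit_rate_kbps, duration_seconds):
    """Approximate size of a constant bit rate file, ignoring metadata."""
    bits = bit_rate_kbps * 1000 * duration_seconds
    return bits / 8 / 1_000_000  # bits -> bytes -> megabytes (10^6 bytes)

# Hypothetical four-minute track: mono at 64 kbps versus stereo at 128 kbps.
mono_mb = cbr_size_megabytes(64, 240)
stereo_mb = cbr_size_megabytes(128, 240)
print(f"mono: {mono_mb:.2f} MB, stereo: {stereo_mb:.2f} MB")
```

Doubling the bit rate to keep the same per-channel quality doubles the file size, which is the cost joint stereo tries to reduce.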

Joint stereo is a method to save additional space while retaining a stereo signal, exploiting the fact that most music contains relatively little difference between the left and right audio channels.

MP3 tracks can use two different techniques to encode joint stereo: intensity stereo or mid/side stereo. Of the two, mid/side will usually provide the better audio quality.

Intensity Stereo

Intensity stereo saves space by merging the left and right audio channels into a single signal over part of the frequency range, typically the higher frequencies, and storing only how loud that signal should be in each channel. The reasoning is that the ear is poor at judging the direction of sounds at the extremes of the frequency range, and at high frequencies it relies mainly on level differences between the two ears rather than on timing or phase.

Intensity stereo is traditionally used for low bit rate recordings (96 kbps or lower) where some stereo is required, but where full stereo would degrade the audio quality too much.
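As a toy illustration of the idea only, the sketch below collapses one block of samples into a shared signal plus a single left/right balance value. It is not how an MP3 encoder implements intensity stereo (which works on bands of frequency data rather than raw samples), and every name in it is made up for the example.

```python
import math

def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def intensity_encode(left, right):
    """Reduce two channels to one shared signal plus a left/right balance."""
    shared = [(l + r) / 2 for l, r in zip(left, right)]
    level_l, level_r = rms(left), rms(right)
    total = level_l + level_r
    # balance = 1.0 means all left, 0.0 all right, 0.5 centred (or silent).
    balance = level_l / total if total > 0 else 0.5
    return shared, balance

def intensity_decode(shared, balance):
    """Rebuild two channels with the same waveform at different levels."""
    left = [2 * balance * s for s in shared]
    right = [2 * (1 - balance) * s for s in shared]
    return left, right

# Hypothetical band: the same waveform, half as loud on the right.
left = [0.8, -0.4, 0.2, -0.6]
right = [0.4, -0.2, 0.1, -0.3]
shared, balance = intensity_encode(left, right)
out_left, out_right = intensity_decode(shared, balance)
```

Because only one waveform survives, any timing or phase difference between the channels in that band is lost. In this example the channels differ only in level, so they are rebuilt almost exactly; real music is rarely that convenient.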

Mid/Side Stereo

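Mid/Side stereo, sometimes referred to as matrix stereo or M/S stereo, has been around for years and is used in FM stereo radio broadcasts, which transmit a sum signal that mono receivers can use on its own alongside a separate difference signal.

Mid/Side stereo encodes one main channel (the mid channel) as the sum or average of the left and right audio channels (L + R). This mid channel will contain the majority of the audio data in the MP3 file. A smaller side channel is then used to record the differences between the left and right channels (L – R). On playback the two are recombined: mid plus side gives the left channel and mid minus side gives the right, with a scaling factor that depends on whether the sum or the average was stored.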
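The sum and difference arithmetic can be sketched in a few lines. This uses the simple average form, mid = (L + R) / 2 and side = (L - R) / 2, purely as an illustration; real MP3 encoders apply a scaled variant to frequency data rather than to raw samples, and the function names and sample values here are made up for the example.

```python
def ms_encode(left, right):
    """Convert left/right samples to mid (average) and side (half difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Recover left/right: left = mid + side, right = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

# Hypothetical samples where the two channels are mostly alike.
left = [0.5, 0.25, -0.75, 1.0]
right = [0.5, 0.0, -0.5, 1.0]
mid, side = ms_encode(left, right)
decoded_left, decoded_right = ms_decode(mid, side)
print(side)  # [0.0, 0.125, -0.125, 0.0]
```

Where the channels are identical the side value is zero, which is why the side channel is cheap to store for most music. The transform itself loses nothing; any loss comes from how coarsely the encoder then stores the two channels.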

Mono, Stereo or Joint Stereo

So, where and when is it appropriate to use joint stereo as opposed to full stereo or mono?

If your track is predominantly mono, such as a speech podcast, then the best audio quality will be achieved by encoding the track in mono, since the whole bit rate is then spent on a single channel.

Generally speaking, encoding an MP3 track using joint stereo will give better quality for the bit rate used compared to using full stereo. For higher bit rates, however (256 kbps and above), it’s probably best to use full stereo.

Unfortunately, audio quality can vary significantly depending on the codec (the compression engine) used in your encoding software, as not all codecs were created equal. Some will perform much better with joint stereo than others, and it’s really difficult to predict until you actually give it a go.

If you want to be really certain, do a few test recordings on a good playback system at the bit rate you intend to use to see if there is a significant audible difference between joint stereo and stereo. If you can’t hear a difference, then don’t worry about it too much; the sun will continue to set every night and rise every morning!

 

Posted on 12 July 2008


7 Responses to “What is Joint Stereo?”

  1. Howard says:

    When doing a test encoding, it is important to use the right material. Ironically, it will not be the highest quality recording that you have. Harpsichord music (like some stuff from Tori Amos) is very good at showing differences in perceptual codecs due to its very wide frequency content. Another good source might be music that has fake vinyl artefacts as this makes it more difficult for the encoder to encode. One of the most revealing recordings I have heard for HiFi systems is an old 78.

    As with most things in life – there is no such thing as a free lunch. If you want good quality music, you need to have high bit rate. The best codec at the moment is FLAC as it’s lossless. However, the bit rate is around 800 kbps on typical music.

  2. sc`T says:

    One thing that does occur to me when hearing the phrase “better quality” is to ask for the benchmark that is being employed. For example, with the advent of compact disc it was touted that its sound quality was superior to that of vinyl record and magnetic tape purely on the back of its signal to noise ratio, yet nothing whatsoever was mentioned about compact disc’s inferior frequency response characteristics or interpolative anomalies. Quite humorous when you consider that a good deal of the source content for these CDs was (you guessed it) analogue tape masters.

    I suspect a similar case exists here, namely that of discounting certain characteristics of program content in favour of others. Now, I do not pretend to contest the claim of a better frequency response being obtained by employing M/S stereo in favour of independently encoded left and right channels provided the original source content does not contain grossly out of phase spatial signals, say improperly adjusted azimuth alignment of master tapes or deliberately out of phase signals reproduced at the recording stage. It can be argued (and indeed has been successfully argued in the past) that the lion’s share of recorded performances are not subjected to the grosser spatial anomalies and, while undoubtedly true, it does however beg the question “but what of the more minor ones?”

    Think for a moment of your standard FM radio broadcast. A decent enough signal provides for (in most cases) an acceptable reproduction of the content provided at the studio. However, the further you travel from the transmitter site, the more noise is introduced due to the nature of its M/S component. FM radio encodes its sum channel between 50 Hz and 16 kHz typically, the difference channel being delivered above the 19 kHz pilot tone. As the carrier signal drops in intensity, the bandwidth of the signal drops too, producing what is commonly referred to as multiplex hiss.

    Content in mp3s is likewise going to suffer some form of degradation. We are in essence limiting the amount of bandwidth that data (in this case digital) might be used to store this content in much the same way that distance from an FM transmitter limits the (analogue) data our FM receivers might comfortably reconstruct. In the case of mp3s however, it is not the noise floor which suffers adversely as much as it is frequency response and (in the case of M/S Joint stereo) spatial separation.

    It’s always going to be a trade-off between how much loss and of what type of loss the user is prepared to accept when ripping their collections to mp3. M/S joint stereo (as I have learned through experience with a variety of codecs) tends to sacrifice spatial information markedly more so than independently coded stereo does. This is to be expected as (and do correct me if I am wrong here) much of the data stream is devoted to the sum channel at the expense of the difference channel. This can often result in a muddying of the stereo image, particularly at lower bitrates (I never use intensity stereo for that very reason) and this becomes all the more apparent as the original stereo image widens. And as I understand it, if the stereo image is widened sufficiently (beyond 120 degrees or so) this can also affect the overall performance of the mid channel as well.

    I guess the point needs to be made that there is no solution which is always going to produce a more accurate result than the other. I would estimate that most users with no prejudice towards simple or M/S stereo encoding would be content to set and forget their encoders to M/S joint stereo for the simple reason that much of today’s content (digitally recorded or otherwise) tends to be listened to in less than ideal situations negating the concerns for a true spatial image, the trade-off resulting in more data space for more mp3s.

    However, if space is not as much an issue as extracting the “best bang for your byte” then you are probably best served auditioning the results of both methods, and carefully, as I tend to do for the more critical content. As to what the individual might consider critical, I shall leave that for you to decide, though if you are in the process of remastering ye olde cassette collection I’d recommend giving J/S as wide a berth as is humanly possible.

  3. Ed says:

    Why is full stereo recommended for high bitrates?

    • Richard says:

      Using full stereo will give much improved spatial separation and enable you to pinpoint instruments at different spatial positions in the mix with greater accuracy, but to achieve full stereo you will be using almost double the bit rate. So if you can afford a higher bit rate, and quality is important, it’s worth going for full stereo.

      If you’re already using a lower bit rate and then opt for full stereo, you will be instantly almost halving the available bit rate for the music, which will have a catastrophic effect on the overall quality at already sub-optimal bit rates.

  4. Ed says:

    Oh. I didn’t think there would be a difference in sound quality since joint stereo is lossless.

  5. Dee says:

    and Richard is basically right in what he is saying. Some encoders handle joint stereo in a bad way (it is lossy after all). If I trust the encoder (such as the LAME encoder) then I will usually use joint stereo, however if I am using an encoder that I don’t trust as much (such as the iTunes MP3 encoder) then I would feel safer encoding using normal stereo.
