What is Joint Stereo?


While the terms mono and stereo are probably familiar to most people, the term joint stereo is certainly much less common. So when is stereo not stereo?

Most of the music that you listen to today is likely to have been recorded and designed for playback on a stereo system, either through a pair of headphones or a pair of speakers. Even very cheap portable systems nowadays are just as likely to include two speakers for stereo playback as they are a single one.

So if the days of mono systems are all but over, what’s this joint stereo nonsense all about? Is this a retrograde step or something altogether more cunning?

Mono

In the beginning there was mono. The first recordings that could be played back were made in mono by Thomas Edison in 1877. Mono, or monaural, is sound reproduced from a single audio channel.

While a mono audio channel can be played back through a pair of headphones or speakers, this doesn’t make it stereo, as the same audio signal is being fed to both speakers simultaneously.

Stereo

Stereo was a much later invention, pioneered in the 1930s and developed to give listeners an impression of spatial separation between sounds, e.g. hearing a violin on the left and a cello on the right, creating a more realistic or natural soundscape.

Stereo, or stereophonic sound, is conventionally accepted as sound reproduced from two audio channels (although the true definition of stereo is a little more esoteric and beyond the scope of this article). As such a stereo playback system usually has two speakers, one dedicated to each audio channel.

The word stereophonic, normally abbreviated to stereo, is a coined word derived from the Greek stereos, meaning solid, and phone, meaning sound.

Joint Stereo

While the concept of joint stereo has been around for many years in various guises, it is really in the context of MP3 recordings that we are interested in it now.

MP3 files are designed to compress data, saving on storage requirements and reducing download times, and joint stereo is another means to this end.

For a given quality of MP3 audio, stereo tracks will generally require pretty much double the storage that the equivalent mono track would require, or to put it another way, they would require double the equivalent bit rate.

Joint stereo is a method to save additional space while retaining a stereo signal, exploiting the fact that most music contains relatively little difference between the left and right audio channels.

MP3 tracks can utilise two different techniques to encode joint stereo: intensity stereo or mid/side stereo. Of the two, the mid/side version will usually provide the better audio quality.

Intensity Stereo

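Intensity stereo saves space by combining the left and right audio channels at certain frequencies into a single mono signal, keeping only level (panning) information for each frequency band so that the decoder can still place sounds to the left or right. The reason behind this is that the human ear is relatively insensitive to the direction of sounds at very high and very low frequencies.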

Intensity stereo is traditionally used for low bit rate recordings (96 kbps or lower) where some stereo is required, but having full stereo would adversely affect the audio quality too much.
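
As a rough illustration of the idea (and not how any real MP3 encoder is implemented), intensity stereo can be sketched on a toy spectrum: each frequency band keeps one combined mono signal plus a single value saying how much of it belongs on the left. The helper names, the magnitude-only spectrum and the band layout are all assumptions made for this sketch.

```python
# Toy intensity-stereo sketch working on per-bin magnitudes of ONE frequency band.
# encode_band / decode_band are hypothetical names; phase and the real MP3
# bitstream format are ignored entirely.

def encode_band(left_mags, right_mags):
    """Collapse a band to mono magnitudes plus one left-share (panning) value."""
    mono = [l + r for l, r in zip(left_mags, right_mags)]
    left_total = sum(left_mags)
    total = left_total + sum(right_mags)
    # Fraction of the band's level that sits in the left channel (0.5 if silent).
    left_share = left_total / total if total else 0.5
    return mono, left_share


def decode_band(mono, left_share):
    """Rebuild left/right by panning every bin in the band by the same amount."""
    left = [m * left_share for m in mono]
    right = [m * (1.0 - left_share) for m in mono]
    return left, right


# A tone that only exists on the left survives the round trip.
print(decode_band(*encode_band([1.0, 0.0], [0.0, 0.0])))
# Two different tones sharing one band, one per side, both end up in the centre.
print(decode_band(*encode_band([1.0, 0.0], [0.0, 1.0])))
```

The second call shows the price paid: because only one panning value is stored per band, sounds that share a band cannot be placed on opposite sides.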

Mid/Side Stereo

Mid/Side stereo, sometimes referred to as matrix stereo or M/S stereo, has been around for years and is used in FM radio broadcasts.

Mid/Side stereo encodes one main channel (the mid channel) as the sum or average of the left and right audio channels (L + R). This mid channel will contain the majority of the audio data in the MP3 file. A smaller side channel is then used to record the differences between the left and right channels (L – R).
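
The conversion itself is simple enough to sketch in a few lines. The example below uses the average convention, M = (L + R) / 2 and S = (L - R) / 2, which is an assumption made here for illustration; the function names are hypothetical and this only shows the channel maths, not any MP3 compression.

```python
# Minimal mid/side conversion sketch on lists of float samples.
# Assumes the "average" convention: M = (L + R) / 2, S = (L - R) / 2.

def to_mid_side(left, right):
    """Convert left/right sample lists to mid/side sample lists."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def to_left_right(mid, side):
    """Invert the conversion: L = M + S, R = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


left = [0.5, 0.25, -0.75]
right = [0.5, 0.125, -0.5]
mid, side = to_mid_side(left, right)
print(side)  # small values wherever the two channels are similar
```

When the two channels are nearly identical, the side values stay close to zero, which is what makes them cheap to store.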

Mono, Stereo or Joint Stereo

So, where and when is it appropriate to use joint stereo as opposed to full stereo or mono?

If your track is predominantly mono, such as a speech podcast, then the best audio quality will be achieved by encoding the track in mono.

Generally speaking, encoding an MP3 track using joint stereo will give better quality for the bit rate used compared to using full stereo. For higher bit rates however (256 kbps and above) it’s probably best to use full stereo.

Unfortunately, audio quality can vary significantly depending on the codec (the compression engine) used in your encoding software, as not all codecs were created equal. Some will perform much better with joint stereo than others, and it’s really difficult to predict until you actually give it a go.

If you want to be really certain, do a few test recordings on a good playback system at the bit rate you intend to use to see if there is a significant audible difference between joint stereo and stereo. If you can’t hear a difference, then don’t worry about it too much; the sun will continue to set every night and rise every morning!

Comments

Howard commented

14 Jul 08 at 10:48 am

When doing a test encoding, it is important to use the right material. Ironically, it will not be the highest quality recording that you have. Harpsichord music (like some stuff from Tori Amos) is very good at showing differences in perceptual codecs due to its very wide frequency content. Another good source might be music that has fake vinyl artefacts as this makes it more difficult for the encoder to encode. One of the most revealing recordings I have heard for HiFi systems is an old 78.

As with most things in life – there is no such thing as a free lunch. If you want good quality music, you need to have a high bit rate. The best codec at the moment is FLAC as it’s lossless. However, the bit rate is around 800 kbps on typical music.

sc`T commented

10 Dec 09 at 3:36 pm

One thing that does occur to me when hearing the phrase “better quality” is to ask for the benchmark that is being employed. For example, with the advent of compact disc it was touted that its sound quality was superior to that of vinyl record and magnetic tape purely on the back of its signal to noise ratio, yet nothing whatsoever was mentioned about compact disc’s inferior frequency response characteristics or interpolative anomalies. Quite humorous when you consider that a good deal of the source content for these CDs was (you guessed it) analogue tape masters.

I suspect a similar case exists here, namely that of discounting certain characteristics of program content in favour of others. Now, I do not pretend to contest the claim of a better frequency response being obtained by employing M/S stereo in favour of independently encoded left and right channels provided the original source content does not contain grossly out of phase spatial signals, say improperly adjusted azimuth alignment of master tapes or deliberately out of phase signals reproduced at the recording stage. It can be argued (and indeed has been successfully argued in the past) that the lion’s share of recorded performances are not subjected to the grosser spatial anomalies and, while undoubtedly true, it does however beg the question “but what of the more minor ones?”

Think for a moment of your standard FM radio broadcast. A decent enough signal provides for (in most cases) an acceptable reproduction of the content provided at the studio. However, the further you travel from the transmitter site, the more noise is introduced due to the nature of its M/S component. FM radio encodes its sum channel between 50 Hz and 16 kHz typically, the difference channel being delivered above the 19 kHz pilot tone. As the carrier signal drops in intensity, the bandwidth of the signal drops too, producing what is commonly referred to as multiplex hiss.

Content in mp3s is likewise going to suffer some form of degradation. We are in essence limiting the amount of bandwidth that data (in this case digital) might be used to store this content in much the same way that distance from an FM transmitter limits the (analogue) data our FM receivers might comfortably reconstruct. In the case of mp3s however, it is not the noise floor which suffers adversely as much as it is frequency response and (in the case of M/S joint stereo) spatial separation.

It’s always going to be a trade-off between how much loss and what type of loss the user is prepared to accept when ripping their collections to mp3. M/S joint stereo (as I have learned through experience with a variety of codecs) tends to sacrifice spatial information markedly more so than independently coded stereo does. This is to be expected as (and do correct me if I am wrong here) much of the data stream is devoted to the sum channel at the expense of the difference channel. This can often result in a muddying of the stereo image, particularly at lower bitrates (I never use intensity stereo for that very reason) and this becomes all the more apparent as the original stereo image widens. And as I understand it, if the stereo image is widened sufficiently (beyond 120 degrees or so) this can also affect the overall performance of the mid channel as well.

I guess the point needs to be made that there is no solution which is always going to produce a more accurate result than the other. I would estimate that most users with no prejudice towards simple or M/S stereo encoding would be content to set and forget their encoders to M/S joint stereo for the simple reason that much of today’s content (digitally recorded or otherwise) tends to be listened to in less than ideal situations negating the concerns for a true spatial image, the trade-off resulting in more data space for more mp3s.

However, if space is not as much an issue as extracting the “best bang for your byte” then you are probably best served by carefully auditioning the results of both methods, as I tend to do for the more critical content. As to what the individual might consider critical, I shall leave that for you to decide, though if you are in the process of remastering ye olde cassette collection I’d recommend giving J/S as wide a berth as is humanly possible.

- L.P.O. commented
  
  13 Jan 11 at 7:57 am
  
  Sorry to say, but sc’T’s message is simply wrong, from beginning to end and on pretty much every level.
  
  First of all, there has still not been a single double-blind test ever done that proves that anyone, ANY-ONE, could hear the difference between direct sound and sound properly sampled at 44.1 kHz 16 bits with high quality A/D + D/A converters. This has been the case since the early 1980’s, and if the proof hasn’t surfaced in 30 years, I don’t think it ever will. Repeating digital inferiority without a shred of proof just doesn’t make it true.
  
  Second, turning normal Left-Right stereo into Mid-Side stereo and back is a lossless operation. You cite FM radio as an example of how MS stereo is somehow technically worse than mono. This is quite an incorrect analogy. The reason why FM stereo radio has more noise than FM mono radio is because of some unfortunate technical decisions which made stereo FM receivers less expensive to make, but which at the same time limited the signal-to-noise ratio of the side information channel by about 20 dB. The noise in FM stereo is an artifact specific to FM radio and has nothing to do with digital MS stereo.
  
  MS stereo in digital is just based on the fact that if you losslessly convert your LR stereo to MS representation (M = L+R, S = L-R), numbers in the S channel tend to become much smaller, both in the time and frequency domains. Smaller numbers are easier to compress, so the MS representation allows for more efficient compression. And compression, after all, is the whole point of MP3, AAC and OGG (all of which can use MS stereo in one form or another). E.g. MP3 can select on a frame-to-frame basis whether to use MS or LR stereo. If the left and right channels correlate little enough that MS stereo wouldn’t help, the frame can be encoded as LR stereo. When properly done, there is absolutely _no_ spatial image lost. On the contrary, if the bitrate is set to a certain limit, this will yield a better-sounding file, also in the spatial domain.
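
To make the frame-by-frame idea a little more concrete, here is a minimal sketch of how a per-frame choice could be made. The energy comparison and the 0.25 threshold are invented for illustration and are not the decision rule used by LAME or any other real encoder; choose_stereo_mode is a hypothetical name.

```python
# Hypothetical per-frame stereo mode selection: use M/S when the side channel
# carries little energy compared with the mid channel, otherwise plain L/R.
# The threshold is an arbitrary value chosen for this sketch.

def choose_stereo_mode(left, right, threshold=0.25):
    """Return "MS" or "LR" for one frame of left/right samples."""
    mid_energy = sum(((l + r) / 2) ** 2 for l, r in zip(left, right))
    side_energy = sum(((l - r) / 2) ** 2 for l, r in zip(left, right))
    return "MS" if side_energy < threshold * mid_energy else "LR"


similar = choose_stereo_mode([0.5, -0.5, 0.25], [0.5, -0.5, 0.25])
unrelated = choose_stereo_mode([0.5, -0.5, 0.25], [-0.25, 0.5, 0.5])
print(similar, unrelated)
```

A frame whose channels are nearly identical has an almost empty side channel and is a good candidate for M/S, while a frame with unrelated channels gains nothing from it.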
  
  As for recordings made with phase errors between the left and right channels, there are no problems in representing them in the MS stereo format. Their representation will just be less efficient because the S signal doesn’t get as small as with a properly aligned recording. As a result you’d need a larger file for the same quality.
  
  Naturally there are bad implementations, but e.g. current versions of LAME are quite good at selecting the optimal compression mode automatically when using MS stereo.
  
  I’d also like to comment on the original article’s definition of intensity stereo. In intensity stereo part of the audio spectrum is indeed encoded only as mono, but what is missing from the article is that panning information is then sent on a critical band basis for that part of the spectrum. So, e.g. if you encode intensity stereo and you have two sine tones, e.g. at 5 kHz in the left channel and 11 kHz in the right channel, that will be encoded as one mono signal that has both the frequencies. Frequency band panning information however will make it possible to represent it properly so that the listener will indeed hear the 5 kHz tone in his left and the 11 kHz tone in his right loudspeaker. However, because the frequency bands are quite wide, intensity stereo cannot properly represent a situation where there is a 10 kHz signal in the left channel and a 10.1 kHz signal in the right channel. In this case both sine waves would, depending on the encoder, likely just be played in the center. That’s the price you pay for intensity stereo: you can have a hi-hat in the right channel, but the higher overtones of a brass instrument cannot be represented in the left channel at the same time. Nevertheless, when encoding at very small bitrates, this smudging of the stereo image is usually less bothersome than other artifacts like warbling, so it is a very efficient way of audio compression.
  
  - sc`T commented
    
    13 May 12 at 4:22 pm
    
    “Sorry to say, but sc’T’s message is simply wrong, from beginning to end and on pretty much every level.”
    
    I disagree. I think it is rather your interpretation of what I have written which is at fault and my reasons are as follows.
    
    First and foremost, I have not declared digital to be worse than analogue in all respects, only that certain aspects of its implementation were inferior. Specifically I looked at the advent of domestic digital audio and cited interpolation anomalies and frequency response as areas where compact discs failed to match their analogue sources. And sorry, but the weight of statistical evidence is squarely behind me on this one. No matter how good the AD/DA process is (then or now), a red book compact disc can only reproduce a 22kHz stereo window from its source. That analogue studio master tapes can and have yielded frequency response curves in excess of 30 kHz is a well documented phenomenon (and you are welcome to google specifications for any number of studio recorders should you be in any doubt). Additionally, any complex waveforms in the higher frequency register will only be represented as accurately as the sampling frequency allows for (which is where interpolation anomalies come into play in digital). Sure, a compact disc can reproduce a frequency of 22kHz, but as it is allocating only two samples for each period, how can it possibly determine if the waveform is a sine, square or sawtooth, far less anything more complex?
    
    Simply put, it cannot. An analogue master as described above is measurably and incontrovertibly superior at reproducing such frequencies in a laboratory, therefore my statement is correct when I say a compact disc’s frequency response is inferior to the analogue studio master sources of the day. Your introduction of double blind testing changes nothing regarding the accuracy of my statement. Though if you really must argue the perceptual differences between the formats then rest assured that as a recording engineer of some 20 years I have some experience in a variety of formats and have learned to recognise the merits and pitfalls in each of them. Accordingly I will choose my preferred recording medium based on the client’s material and there have been plenty of times where I have “mixed and matched” formats to achieve the desired result. I have even had occasion to use mp3 formats when pressed, as there were times when it was impractical to carry a washing-machine sized open reel deck, several miles of power lead and a three foot high stack of half inch tape stock to a performance. And yes, two of the recorders I used on such occasions employed the M/S format.
    
    Secondly, your contention that I somehow suggested M/S stereo was inferior to mono. I suggested no such thing. For starters, I compared M/S stereo to discrete stereo and used the FM implementation of M/S to demonstrate how reducing the bandwidth (which is reduced by distance from the transmitter) could adversely affect the reproduced signal. As the side channel information is contained in a higher frequency band of the carrier, it is naturally the first component of the audio to be affected. It is reasonable to expect that should you “reduce the bandwidth” of a lossy codec (read “use less kilobits per second”) you will assuredly introduce distortions into the reconstructed material. Note that I also said that these were NOT likely to manifest themselves as increased noise floor (as is the case with FM transmission) but were more likely to make their presence felt in a muddying of the stereo image. Agreed, M/S in and of itself is lossless provided you have a container big enough to transport the encoded signal in. Mp3 by its nature is designed to code perceptual information and as you decrease the resolution (kbps) the amount of information it “forgets” becomes greater and greater.
    
    Blind Freddie could see this would result in differences from the original source. This would I hope not be up for debate and I find it disingenuous that someone might suggest M/S decoding from such a lossy source would not in any way disrupt spatial information that was present in the original source, yet not adequately coded for when exported as an mp3. Also, when phasing anomalies are introduced to the equation, I think you grossly understate the cancellation problems that result in M/S mp3s, particularly at lower bitrates. You as good as admit later this would result in a larger file, you also admit you would prefer smudged stereo to a warbled file at low bitrate intensity stereo. It has been my experience that “warbling” might still be apparent in the higher frequency register, particularly on M/S coded cassette transcriptions and the already disrupted spatial information is further distorted due to the codec preferring the sum channel to the difference. This is because in many cases the phase between left and right has a tendency to shift constantly on an improperly aligned and serviced cassette deck and this translates to cancellations at different frequencies (warbling/swishing) when summed to a mono (center) channel and similar effects happen to a mono difference (or side) channel. These problems in frequency are introduced during the coding to M/S and are amplified by the lossy nature of mp3 codecs and in such cases it is preferable to go with discrete stereo to ameliorate the problem.
    
    So much for the lesson in electro-mechanical and digital reproduction methods. Bear in mind that nowhere have I advocated a “one size fits all” solution to all program content, that I think would be “simply wrong, from beginning to end and on pretty much every level.” You may choose to misinterpret and dismiss the weight of decades worth of recording experience I bring to the discussion and I might call that an arrogant and foolish thing to do, but ultimately it is for the end user to decide which method suits them best and if anything I have said here assists them in their goal then these keystrokes are not wasted.
    
    - Richard Farrar commented
      
      14 May 12 at 2:01 pm
      
      Wow, a very comprehensive reply. You obviously have a good understanding of digital sampling and of how few samples are available per cycle, and therefore how much coarser the sampled representation becomes, as a signal approaches the Nyquist frequency. Nice to see an in-depth discussion taking place though.
      
    - Michael Strorm commented
      
      02 Sep 12 at 4:13 pm
      
      (Disclaimer: I don’t have other poster(s)’ real world experience, and do not claim to be a professional in signal processing. Apologies if I misunderstood anyone else and/or this comes across as condescending).
      
      “Sure, a compact disc can reproduce a frequency of 22kHz, but as it is allocating only two samples for each period, how can it possibly determine if the waveform is a sine, square or sawtooth, far less anything more complex?”
      
      In the case of square and sawtooth waves above 22.05 kHz, the answer is simple: it can’t, and it doesn’t claim to (not even in theory). The reason is that such signals contain frequencies far higher than 22 kHz(!)
      
      Fourier analysis says that we can express *all* signals as nothing more than the sum of a number of “building block” sine waves of different frequencies. It’s the frequencies of all these constituent *sine* waves that determine which “frequencies” a signal contains.
      
      When expressed in this way, it can be seen that a 22 kHz sawtooth or square wave consists of a 22 kHz “fundamental frequency” sine wave, but also a number of much higher frequency “harmonic” waves. These harmonics are above the 22.05 kHz limit, so can’t even be reproduced in theory according to Nyquist, and would cause aliasing on the sample if not filtered out first.
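
A quick sketch makes the point about harmonics easier to see. It lists the sine components of an ideal square wave (which contains only odd harmonics) and keeps those that fall below the Nyquist limit for CD audio. The function names are made up for this example, and component amplitudes are ignored.

```python
# Which sine components of an ideal square wave fit below the CD Nyquist limit?
# An ideal square wave contains only odd harmonics: f, 3f, 5f, ...

NYQUIST_HZ = 44_100 / 2  # 22,050 Hz for 44.1 kHz sampling


def square_wave_harmonics(fundamental_hz, count):
    """Return the frequencies of the first `count` components of a square wave."""
    return [fundamental_hz * n for n in range(1, 2 * count, 2)]


def representable(frequencies_hz, nyquist_hz=NYQUIST_HZ):
    """Keep only the components strictly below the Nyquist frequency."""
    return [f for f in frequencies_hz if f < nyquist_hz]


print(representable(square_wave_harmonics(22_000, 3)))       # only the fundamental is left
print(len(representable(square_wave_harmonics(1_000, 12))))  # a 1 kHz square wave keeps far more
```

For a 22 kHz square wave only the fundamental survives, so what remains after filtering is a plain sine wave, whereas a 1 kHz square wave keeps many of its harmonics.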
      
      - Richard Farrar commented
        
        12 Sep 12 at 7:29 pm
        
        Hi Michael, you seem to have a good understanding of sampling theory and the mathematics behind Fourier. These sorts of debates, however, will rage on forever, I suspect.
    - KarlU commented
      
      12 Mar 15 at 1:47 am
      
      sc`T, you say,
      
      “Additionally, any complex waveforms in the higher frequency register will only be represented as accurately as the sampling frequency allows for (which is where interpolation anomalies come into play in digital). Sure, a compact disc can reproduce a frequency of 22kHz, but as it is allocating only two samples for each period, how can it possibly determine if the waveform is a sine, square or sawtooth, far less anything more complex?”
      
      The answer is, it doesn’t need to. The human ear cannot distinguish between a 22 kHz sine, square or triangle wave. The second harmonic would be 44 kHz – inaudible, whether live, analog, or digital. A square wave has only odd harmonics, so its lowest harmonic above the fundamental would be 66 kHz. The highest fundamental for which a second harmonic even needs to be preserved would be around 10 kHz, since above that the second harmonic lies beyond roughly 20 kHz.
      
      The FIR interpolation filters will remove the higher harmonics, and reconstruct the waveform so you will get a pure sine wave on playback in any event.
      
- L.P.O. commented
  
  13 Jan 11 at 8:14 am
  
  Oh, and just to say a few extra words about compression, of which I know a thing or three…
  
  I wrote an Ogg Vorbis encoder two years ago for a specific application. In Vorbis the MS stereo implementation is particularly clever, as you do the conversion only after you have already quantized your frequency values. In effect this means that the conversion is lossless by its very definition. Using MS stereo doesn’t change a single sample in audio reproduction, it only affects file size.
  
  My findings using MS stereo were as follows:
  1) When compared to mono, using conventional stereo added 100% to the file size, i.e. it doubled it. That was of course what would be expected.
  2) When using MS stereo, adding stereo information made the files on average 40%-60% larger instead of 100%. However, there was a big difference depending on the type of source material. All in all, typically two thirds of the resulting file contained M information, and one third S information.
  
  All in all, and as can be calculated from these numbers, using MS stereo on OGG files helped make files on average 25% smaller as compared to discrete stereo, with _exactly_ the same decoded result. Because of the losslessness of MS stereo in OGG, the OGG encoders don’t even give you the option of choosing whether you use MS stereo or not. The only reason I know this is that I researched the matter when creating my own encoder.
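
The arithmetic behind those figures can be checked in a few lines. This is only a sketch of the calculation, taking a mono file as size 1.0 and using the midpoint of the quoted 40%-60% range; relative_sizes is a hypothetical helper name.

```python
# Reproduce the file-size arithmetic, with a mono file defined as size 1.0.

def relative_sizes(ms_overhead):
    """Return (M/S size, saving versus L/R stereo, share of the file used by M)."""
    mono = 1.0
    lr_stereo = 2.0 * mono          # conventional stereo doubled the file size
    ms_stereo = mono + ms_overhead  # M/S stereo adds only the overhead on top of mono
    saving = 1.0 - ms_stereo / lr_stereo
    mid_share = mono / ms_stereo
    return ms_stereo, saving, mid_share


ms_size, saving, mid_share = relative_sizes(0.5)  # midpoint of the 40%-60% range
print(ms_size, saving, round(mid_share, 3))
```

With 50% overhead the M/S file is 1.5 times the mono size, which is 25% smaller than the doubled L/R file, and the M channel accounts for two thirds of it.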
  
  MP3 files are slightly different because MS stereo is handled at a different stage of compression. Due to this the results are not bit-accurate, but for the same file size, and using a proper encoder, MS stereo still will give better results.
  
  - Richard commented
    
    13 Jan 11 at 12:14 pm
    
    Thanks for taking the time to add such a comprehensive reply. I think the extra detail will certainly help some of the more interested readers and it’s nice to see the discussion developing.
    
Ed commented

21 Jan 10 at 12:58 am

Why is full stereo recommended for high bitrates?

- Richard commented
  
  21 Jan 10 at 3:06 pm
  
  Using full stereo will give much improved spatial separation and enable you to pinpoint instruments at different spatial positions in the mix with greater accuracy, but to achieve full stereo you will be using almost double the bit rate. So if you can afford a higher bit rate, and quality is important, it’s worth going for full stereo.
  
  If you’re already using a lower bit rate and then opt for full stereo, you will be instantly almost halving the available bit rate for the music, which will have a catastrophic effect on the overall quality at already sub-optimal bit rates.
  
Ed commented

21 Jan 10 at 6:02 pm

Oh. I didn’t think there would be a difference in sound quality since joint stereo is lossless.

- Dee commented
  
  22 Jan 10 at 9:21 am
  
  Joint Stereo is lossy.
  
Dee commented

22 Jan 10 at 9:32 am

And Richard is basically right in what he is saying. Some encoders handle joint stereo in a bad way (it is lossy after all). If I trust the encoder (such as the LAME encoder) then I will usually use joint stereo; however, if I am using an encoder that I don’t trust as much (such as the iTunes MP3 encoder) then I would feel safer encoding using normal stereo.

InfernusKnight commented

03 Jul 10 at 8:17 am

Joint Stereo itself is technically lossless. Consider this. You have a sample of digital audio where the left channel has a value of 16 and the right 8.

L-> 16, R->8

Joint stereo encodes as the sum and difference of the two channels. In this case, the sum is 24 and the difference is 8.

S: 24 D: 8

You can find the original Left and Right values with these formulae: L = (S + D) / 2 and R = S – L.

L=(24+8)/2
L=16

R=24-L
R=24-(16)
R=8

Since we got back the original values exactly, and since we can do this regardless of what the values are (I dare you to try and find a case where this doesn’t work), joint stereo is lossless.
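
The dare is easy to take up in code. The sketch below encodes integer samples as a sum and a difference and decodes them with the standard inverse, L = (S + D) / 2 and R = (S - D) / 2, then checks every pair in a small range. The function names are invented for the example.

```python
# Exhaustively check that sum/difference coding of integer samples is reversible.

def encode(left, right):
    """Return (sum, difference) for one pair of integer samples."""
    return left + right, left - right


def decode(total, diff):
    """Recover (left, right). total + diff is 2 * left and total - diff is
    2 * right, so both integer divisions are exact."""
    return (total + diff) // 2, (total - diff) // 2


def round_trip_ok(lo=-64, hi=64):
    """True if every (left, right) pair in [lo, hi) survives encode then decode."""
    return all(
        decode(*encode(l, r)) == (l, r)
        for l in range(lo, hi)
        for r in range(lo, hi)
    )


print(encode(16, 8))    # (24, 8)
print(decode(24, 8))    # (16, 8)
print(round_trip_ok())  # True
```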

However, the computer must allocate enough space to store both values exactly or all that losslessness would be in vain. Assuming we are dealing with 16 bits per channel per sample integer audio (the most common type), to do this in all cases (including the worst case) we would need 17 bits for the sum channel and 17 bits for the difference channel, although one of those bits is redundant because the sum and the difference are always either both even or both odd. This means that instead of needing 32 bits per sample (2 channels*16) we would need 33. This means we would actually need MORE space to store joint stereo than plain stereo. On top of that, computers store bits in groups called bytes, and since bytes are almost always eight bits long, we would need 40 bits to store a sample of joint-stereo audio, since under most circumstances, computers don’t store bits by themselves (only in bytes).
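
The bit counts can be worked out rather than taken on trust. The sketch below assumes 16-bit signed samples and two's-complement storage; signed_bits_needed is a hypothetical helper. Counted naively, the sum and the difference each need 17 bits, but because they always share the same parity one bit is redundant, which gives the 33-bit total.

```python
# Worst-case storage for the sum and difference of two 16-bit signed samples.

def signed_bits_needed(lo, hi):
    """Smallest two's-complement width that holds every integer in [lo, hi]."""
    bits = 1
    while not (-(1 << (bits - 1)) <= lo and hi <= (1 << (bits - 1)) - 1):
        bits += 1
    return bits


SAMPLE_MIN, SAMPLE_MAX = -32768, 32767  # range of 16-bit signed PCM

sum_bits = signed_bits_needed(2 * SAMPLE_MIN, 2 * SAMPLE_MAX)
diff_bits = signed_bits_needed(SAMPLE_MIN - SAMPLE_MAX, SAMPLE_MAX - SAMPLE_MIN)

# (l + r) and (l - r) differ by 2 * r, so they are both even or both odd and
# the lowest bit of one of them never needs to be stored.
total_bits = sum_bits + diff_bits - 1

print(sum_bits, diff_bits, total_bits)  # 17 17 33
```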

However, this is where the magic of data compression kicks in. I’m not going to go into details, but a well implemented data compression algorithm can compress the joint-stereo data much better than individual channels, assuming there is much in common with the left and right channels. This is how FLAC works, even though it calls joint stereo “inter-channel decorrelation” and says joint stereo is lossy when they are really the same (They are probably trying to disassociate joint stereo with the additional sound degradation it can cause when using it with MP3 and other lossy compression algorithms).

The problem is that lossless compression can’t shrink audio enough for many purposes, like sharing music online (or sharing videos online with sound in them). This is where technologies like MP3 come in, though MP3 is a rather outdated technology (Vorbis is better, spec-wise at least). These are lossy compression algorithms, and they can also usually compress joint stereo better (by cutting out the less important data, especially in the difference channel), especially at lower bitrates. The problem here is that the lossiness can be more noticeable when using joint stereo, especially at very low and high bitrates (as the article and other comments describe).

So if you can find the disk space/bandwidth to compress losslessly, you can (and should provided the left and right channels are similar; this is usually the case) go joint-stereo with no extra degradation.

Glendale Walter commented

16 Aug 10 at 5:16 pm

I find that Joint Stereo works very well with even the highest bit rate. I swear by it. I think a 350 kbps bit rate in combination with Joint Stereo is perfect. You can go full stereo at 350 kbps, but I feel and hear no audible difference between them. My ears are good at tuning in to the quality of music. You can go full stereo at 350 kbps, but trust me, joint stereo isn’t a bad choice either. One day we will go completely to 5.1 surround sound and that would be wonderful, but one problem with this is that it demands a lot of storage space. So we will need advanced technology to compress 5.1 audio to the size of a high quality mp3, so that it can be stored as an mp3 file and played normally on 5.1 systems without any extra hardware to decode it. That would be the holy grail of sound. I have noticed the technology is here with the advance of 5.1 Dolby Digital for DVDs. Some songs have been encoded in 5.1 on them and they sound wonderful, but they can only be played from the DVD and you must have 5.1 Dolby Digital capabilities. It might even be nice if one day they have decoders that decode regular mp3 straight to 5.1 Dolby Digital losslessly. That would be an even better technology. To get to my point, Joint Stereo is very good, so don’t be scared to use it at 350 kbps. This technology is very good right now while it’s still available free.

stelios commented

18 Dec 10 at 4:56 pm

thanks.
very useful information.

MAX commented

18 Jul 11 at 9:53 pm

THANK YOU SO MUCH.
SIMPLE AND USEFUL INFORMATION :*

- Richard commented
  
  19 Jul 11 at 12:31 pm
  
  You’re welcome, glad you found it helpful and easy enough to digest.
  
Phil G commented

22 Nov 11 at 8:37 pm

One area where Full Stereo, or ‘Normal’ stereo as it’s called in iTunes, should specifically be used is when encoding two distinct channels of audio, in particular when doing production interviews where one channel may be one speaker on a wireless microphone and the other channel is a totally different speaker or sound source.

Another example would be “Channel 2 timecode” files where an interview or production audio sound is recorded on one channel and SMPTE audible timecode is recorded on the other.

Using Joint Stereo in these cases will result in ‘bleed-thru’ of one channel to the other, in particular when using audible SMPTE timecode which ‘sounds like white noise’.

Reply
- Richard commented
  
  23 Nov 11 at 2:11 pm
  
  Thanks Phil, a very good point and an essential requirement for full stereo in those scenarios.
  
  Reply
Melinda P commented

07 May 12 at 6:59 am

Thanks very much for the explanation, Richard. I’m recording audio with Audacity, and exporting it as mp3 (LAME encoder). It wanted to know if I wanted joint stereo or stereo for my channel mode, and I’d never heard of joint stereo. I’m going to try joint stereo & see what I get.

FYI, your blog entry came up second under a Yahoo! search for “joint stereo” (just behind Wikipedia).

Reply
- Richard Farrar commented
  
  07 May 12 at 12:37 pm
  
  Thanks Melinda. It’s nice to know that you found the article useful and also good to know that the search engines seem to like it 🙂
  
  Reply
Lars Cromander commented

18 Jun 12 at 1:39 pm

That was a very clear explanation of the concept of joint stereo. I have always assumed that tracks coded in joint stereo were inherently inferior to those coded in normal stereo but had a hard time hearing any difference. Now I know that it’s probably not something I specifically need to worry about when coding.

Thanks!

Reply
- Richard Farrar commented
  
  18 Jun 12 at 6:52 pm
  
  Thanks Lars, glad you found it helpful.
  
  Reply
Paulyjr commented

24 Jun 12 at 10:49 pm

WOW! This is way more technical than I needed it to be. I only listen to MP3s at work on an “expendable” player and I prefer the iTunes joint stereo to regular stereo because of the workroom noise, such as fork lifts and various machines contributing to the ambiance. I was looking for a simple explanation of the differences in the formats, not a complete education in psychoacoustics as applied to recording technologies in a class 1 cleanroom. Thanks!

Reply
- Richard Farrar commented
  
  25 Jun 12 at 5:43 pm
  
  Well, I hope it managed to increase your understanding of joint stereo a little bit more. Richard
  
  Reply
RyanL commented

06 Sep 12 at 1:01 am

Quality is important to me: the higher the bitrate the better, mostly 320kbps in normal stereo.
For CDs I use Windows Media Player. It is set to 320kbps joint stereo, which I can’t switch to normal stereo, but I don’t mind that much.
For vinyl rips, I listen to a lot of breakbeat/hardcore/jungle that is only on vinyl, which I convert to lossless WAV files and store on my HDD. Then I convert my favorite ones to 320kbps stereo using Audacity with the LAME encoder and transfer them to my Walkman MP3 player! Happy happy joy joy.

Reply
- Richard Farrar commented
  
  09 Sep 12 at 5:30 pm
  
  There’s never one size that fits all as everyone has different requirements with the trade off between bit rate and size often being quite important for some people, but it’s good to know that you’ve found a system that works reliably for you with suitable quality.
  
  I too use Audacity for a lot of audio editing and the Lame MP3 encoder with the WinLAME front end, which gives me more control of the encoder settings than encoding to MP3 via Audacity directly.
  
  Reply
Mahipal Gunawat commented

21 Oct 12 at 5:06 pm

good tutorial in simple way

Reply
- Richard Farrar commented
  
  21 Oct 12 at 5:08 pm
  
  Thanks.
  
  Reply
feedback commented

30 Jan 13 at 11:10 pm

I’m staying with stereo only for the few files I have to do for someone. Put another way, where is the need today to save a few kB of file size when storage has got so much bigger?

It’s like saying share MP3 instead of compressed lossless. The case now is stronger for compressed lossless because of larger storage space and increased network bandwidth compared with the older 56k modems that MP3 was designed for.

If it is stereo, keep it stereo and forget joint stereo, where it reduces the stereo image further. I have noticed this, and it is the reason I was looking for the answer. Thank you for providing it: joint stereo makes some parts of a stereo recording mono to save space. It sounds like MiniDisc ATRAC compression all over again to me, where the audio is compressed so much that it becomes useless.

Though it is better to use APE or FLAC to keep the recording lossless (keeping the original file untouched), and therefore keep the best sound fidelity possible. The only people who need lossy MP3 or similar are those who have mobile devices without enough storage space. This is changing very fast, so soon there will be no need for lossy music files. The only time lossy will still be used is for video, TV programs and such. Larger optical media, soon Blu-ray BDXL 128 GB discs, will mean lossless will be used for all video, and lossy formats will disappear thereafter.

Reply
- Richard Farrar commented
  
  12 Feb 13 at 7:49 pm
  
  Cheap storage has increased phenomenally recently, making the case for MP3 compression less strong, although I have a feeling it’ll be around for a while yet. Only time will tell if lossless compression algorithms like FLAC become mainstream.
  
  Reply
lolz commented

24 Jul 13 at 10:10 am

Really?
I always thought Joint Stereo was this: http://forum.sensiseeds.com/fdata/gallery/mrniceguy/4958_smoking-two-joints.jpg

Reply
- Richard Farrar commented
  
  25 Jul 13 at 1:08 pm
  
  Lol; very good.
  
  Reply
  - LOLz commented
    
    01 Aug 13 at 9:58 am
    
    BTW.. very informative
    Thanks
    
    Reply
    - Richard Farrar commented
      
      01 Aug 13 at 12:04 pm
      
      Cheers, glad you found it helpful.
      
      Reply
Marus commented

19 Oct 13 at 6:55 pm

Hi,
Which kind of algorithm does the LAME encoder use: Intensity Stereo or M/S Stereo?

Reply
- Richard Farrar commented
  
  19 Oct 13 at 7:26 pm
  
  Hi Marus, LAME uses M/S Stereo and “does not have intensity stereo capability.”
  
  Reply
  - Marus commented
    
    20 Oct 13 at 3:46 pm
    
    Hi Richard,
    This is really great news! Until now I was afraid that LAME used Intensity Stereo (lossy), so I avoided encoding MP3s in Joint Stereo and used Fraunhofer 320k CBR Stereo instead. But now I think that the best quality from an MP3 is obtained with LAME -V0 VBR Joint Stereo. What do you think?
    
    And I have another question. Is dbPoweramp good software for encoding LAME MP3s? So far I have used Audiograbber because it has the option to select the encoding method: Old, New, MTRH. dbPoweramp does not.
    
    Reply
    - Richard Farrar commented
      
      20 Oct 13 at 4:26 pm
      
      Marus,
      
      From what I’ve read the Fraunhofer CODEC is supposed to sound better for constant bit rates and LAME for variable. I think your choice sounds eminently sensible, but it may be worth trying a few listening tests between the codecs with full stereo as well. Try using some harpsichord material as this is usually quite revealing.
      
      I’ve never used either of the software products that you mentioned, so can’t comment really. I tend to use WinLAME (free software) for a LAME front end.
      
      Reply
John commented

22 Dec 13 at 1:51 pm

Just to say thanks for a simple explanation. Now I can use Audacity with Lame encoder with more confidence 🙂

Reply
- Richard Farrar commented
  
  22 Dec 13 at 2:15 pm
  
  Thanks John, I’m pleased it’s helped you out.
  
  Reply
Pete commented

25 Jun 14 at 3:46 am

Thanks for posting this easy to understand article. Which codec is best for encoding mp3 songs at a Constant 320 kbps Stereo NOT joint stereo sound – LAME or Fraunhofer? Is there an audible difference or are the outputs of the same quality? Your reasons? Thanks again.

Pete 🙂

Reply
- Richard Farrar commented
  
  25 Jun 14 at 1:24 pm
  
  Thanks Pete.
  
  When it comes to CODECs you can get lots of different opinions, but at such a high bit rate I think you’d probably be very hard pressed to tell the difference. The best way would be to try a test yourself with the two different CODECs.
  
  However, I did do a post a while back comparing these two CODECs for podcasts that might answer your question in a little more detail: https://www.richardfarrar.com/which-is-the-best-mp3-encoder-for-podcasts/
  
  Reply
Zac commented

12 May 15 at 3:18 pm

It is always such a frustration when authors who do not really get technology (and math) try to write technical pieces:
[“stereo tracks will generally require pretty much double the storage that the equivalent mono track would require”]…
“or to put it another way, they would require double the equivalent bit rate”.
NO, the RATE remains the same!
…It’s like saying that to record every-second measurements along two timelines, one would need to record at a half-second rate.
But this I can forgive: if you are not a technologist, you just won’t appreciate what this is all about. (Always have the tech person you used as your source tech-proofread your end product, to make sure you didn’t conjure up something stupid or just incorrect while coming up with metaphors or analogies to make the technical text read like fiction.) Which leads to my second point:
It is so unbearably annoying when an author tries to make a thriller / page turner out of a technical article that people come to in order to quickly learn a subject:
“So when is stereo not stereo?”
“what’s this joint stereo nonsense all about? Is this a retrograde step or something altogether more cunning?”
— Are you writing Harry Potter?! Stop wasting people’s time. Choose the right style for the content and the purpose.
There is perhaps just one statement in this piece that should have been applied to 90% of its content, to leave it out as unnecessary:
“(although the definition of … [is] beyond the scope of this article)”.

Reply
- Richard Farrar commented
  
  08 Jun 15 at 7:37 pm
  
  Thank you for taking the time to comment Zac, although in your haste to comment it is rather unfortunate that you forgot to share your obviously bountiful knowledge and wisdom on the subject. I for one feel I certainly missed out on some potential pearls of digital audio wisdom.
  
  While you appear quick to criticise my lack of technical knowledge and that of mathematics, I’m surprised that you didn’t “practice what you preach” with a little due diligence before you commented, but I can certainly understand how, in your excited state to comment, your enthusiasm ran away with you. If you had spared an additional few minutes from your undoubtedly busy life commenting on blogs, I’m sure you would have realised that with a degree in electronic engineering and a PhD in Laser physics, I am more than technologically acquainted and comfortable with mathematics.
  
  As for the tone of my article, it was clearly aimed at the novice, trying to get the feel of the concept of joint stereo over in a, hopefully, easy to understand manner. Maybe I failed in this respect, but if you fully understood the topic anyway, then I’m a little perplexed as to why you would want to read an article such as mine all the way through.
  
  The full discussion of bit rates was beyond the scope of the article as I was concerned with the concept of joint stereo. I do “understand” bit rates and was just trying to get a high level concept over and in that respect stand by what I said, or rather “sit” by what I said on account of being paralyzed from the neck down after breaking my neck in a diving accident, but again you’d have known that if you’d done your research and read my “About Me” page.
  
  Happy Christmas.
  
  Richard x
  
  Reply

Trackbacks

Better metering in the field for wildlife and field recording mono compatibility | Richard Mudhar says:

02 Dec 14 at 10:29 am

[…] and intensity to derive the stereo image I didn’t appreciate the frequency dependence. This is also used in MP3 encoding, so I need to learn more about that to be able to MP3 encode field recordings better – they […]

Reply
