all 16 comments

[–]codeIsGood [score hidden]  (1 child)

Codecs for voice (which is what most telephony uses) are very lossy, because you don't need super high fidelity to discern speech. As opposed to music, where you tend to need higher fidelity.

[–]mano-vijnana [score hidden]  (0 children)

Not just that--in telephony the highest and lowest frequencies are selectively cut off entirely because they're not necessary for voice comprehension.

[–]Perryapsis [score hidden]  (0 children)

Imagine you want to draw a picture of the playground at your school. You only have 8 colors of crayons, so no matter how well you can draw, your picture is not going to be photorealistic. But that doesn't matter to you because you can clearly see the swingset here, the slide there, the sandbox in the corner, etc. 8 colors is enough for you to be able to recognize it as your playground.

Annie is a spoiled rich kid who gets those ridiculous giant crayon boxes with like 200 different colors. If she wanted to draw your playground, she could shade in shadows, get the shade of red just right on the monkey bars, etc. So Annie could draw a much better picture with the same level of drawing skill because she just has more colors of crayons.

The teacher needs to make sure that everyone has crayons for a school assignment. Because the teacher is paid like a teacher, he can only afford 100 crayons. Since he needs to provide for 25 students, he can only give 4 crayons to each one. And for some reason, students can't share crayons because .... look, this analogy isn't perfect, just go with it.

Telephone companies can only process so many calls at once. To avoid overloading their systems, they try to remove as much audio data as they can without making the other person's voice unintelligible. A common way to do this is to remove the highest and lowest pitched sounds. Instead of transmitting all the frequencies that humans can hear, they chop off the ends of the spectrum so that they only keep the frequencies most common in voices. But music often uses a wider range of frequencies than voices, so it sounds weird when the phone company chops off some of the frequencies. Audio that is not aggressively compressed gives a better sound over the full range of frequencies that humans can hear. And people who want to spend wayyyyy too much money can get audio equipment that is even better.
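The frequency chopping described above can be sketched in a few lines. This is a rough illustration, not any real phone codec: it brick-wall filters a mixed signal down to the roughly 300–3400 Hz band that classic narrowband telephony keeps, and the out-of-band "music" components simply vanish.

```python
import numpy as np

FS = 44100                 # sample rate in Hz
LOW, HIGH = 300.0, 3400.0  # approximate narrowband telephony passband

def telephone_bandpass(signal, fs=FS, low=LOW, high=HIGH):
    """Brick-wall FFT bandpass: zero every frequency bin outside [low, high] Hz."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    spectrum[(freqs < low) | (freqs > high)] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))

t = np.arange(FS) / FS  # one second of audio
# A "voice-like" tone inside the band, plus bass and treble "music" tones outside it.
voice = np.sin(2 * np.pi * 1000 * t)
bass = np.sin(2 * np.pi * 80 * t)
treble = np.sin(2 * np.pi * 8000 * t)

filtered = telephone_bandpass(voice + bass + treble)
# The 1000 Hz tone survives; the 80 Hz and 8000 Hz tones are gone entirely.
```

The same idea applies to real calls: speech still reads fine through that narrow window, but a bass line or cymbal simply has nowhere to go.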

[–]rnodern [score hidden]  (2 children)

VoIP IVRs use codecs that favour voice. Thus music sounds crap. And if you start talking, the codec will eliminate the music in favour of your voice.

[–]Lucky_Ad_9137 [score hidden]  (1 child)

I understood 4 of the first 7 words.

[–]rnodern [score hidden]  (0 children)

VoIP = Voice over IP (a digitised call, as opposed to an old-style analogue call). IVR = Interactive Voice Response (those phone menus that ask you to press a number for a corresponding service; they handle calls, route calls to queues, and such). Codec = Compression/Decompression. Basically, a digitised call is compressed so it consumes less bandwidth within the IVR/call-routing system. MP3 is a codec, for example (MPEG Layer III).
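As a rough sketch of what a voice codec does, here is a simplified version of the µ-law companding used by the classic G.711 telephony codec. The companding formula is the standard one; the 256-level quantisation step here is simplified for illustration and skips G.711's exact segmented encoding.

```python
import numpy as np

MU = 255.0  # mu-law parameter used in North American/Japanese telephony (G.711)

def mulaw_encode(x, mu=MU):
    """Compress samples in [-1, 1] to 8-bit codes (simplified mu-law)."""
    compressed = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    # Quantise to 256 levels: the "8 bits per sample" of narrowband telephony.
    return np.round((compressed + 1) / 2 * 255).astype(np.uint8)

def mulaw_decode(code, mu=MU):
    """Expand 8-bit codes back to floats in [-1, 1] (inverse of the above)."""
    compressed = code.astype(np.float64) / 255 * 2 - 1
    return np.sign(compressed) * ((1 + mu) ** np.abs(compressed) - 1) / mu

samples = np.linspace(-1, 1, 1001)
roundtrip = mulaw_decode(mulaw_encode(samples))
err = np.max(np.abs(roundtrip - samples))
print(f"worst-case round-trip error: {err:.4f}")  # lossy, but tolerable for voice
```

The compression is nonlinear on purpose: quiet samples get fine resolution, loud ones get coarse resolution, which matches how speech works but is exactly the kind of corner-cutting music exposes.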

[–]TorakMcLaren [score hidden]  (1 child)

Part of it isn't that it's hold music, but that it's music over the phone. Phones used to be good at sending music when things were all analogue. But since the main function of a phone is to transmit voice, phone companies can get away with only sending certain pitches of sound that we use in speech and trimming out the rest. This lets them handle more calls at the same time by trimming down the amount of digital space (bandwidth) each call takes up. But those trimmed-out pitches are important for music!

Also, this gets further worsened by call centres being in buildings that want to make lots and lots of calls! So, there is usually a layer of trimming (compression) that goes on inside the building.

Tom Scott has a video on it

[–]MrBulletPoints [score hidden]  (1 child)

  • Old phone systems weren't good enough for the full range of sounds humans can hear.
  • So they were designed for just a small slice of that range, focusing on the parts we need to understand what someone else is saying.
  • Music, on the other hand, is literally designed to use almost the entire range.
  • So trying to play music through a system that filters down to such a small range doesn't sound very good.
  • Also, hold music systems just tend not to get set up with proper "gain staging".
  • In many audio systems, especially a phone system, the audio travels through a few different places before it gets to the person on the phone, and each place has its own level adjustment.
  • Gain staging is the process of setting each of those levels to give you a signal that is loud enough to be heard over background noise without overdriving the input of the next stage.
  • Oftentimes hold music gets set too high at some point, and that horrible sound you hear is it being over-driven.
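A minimal sketch of the gain-staging point above (the dB values are made up for illustration): each stage applies gain and hard-clips at its input ceiling, and once a stage clips, turning the level back down later can't undo the distortion.

```python
import numpy as np

def gain_stage(signal, gain_db, clip_level=1.0):
    """One gain stage: apply gain in dB, then hard-clip at the stage's ceiling."""
    gained = signal * 10 ** (gain_db / 20)
    return np.clip(gained, -clip_level, clip_level)

def is_undistorted(x, ref):
    """True if x is just a scaled copy of ref, i.e. no clipping happened."""
    scale = np.max(np.abs(x)) / np.max(np.abs(ref))
    return bool(np.allclose(x, ref * scale, atol=1e-6))

t = np.arange(8000) / 8000.0
tone = 0.25 * np.sin(2 * np.pi * 440 * t)  # hold music at a sane level

# Well-staged chain: modest gain at each hop, never hits the ceiling.
good = gain_stage(gain_stage(tone, 6), 3)

# Badly staged chain: one stage cranked way up, then padded back down.
bad = gain_stage(gain_stage(tone, 24), -15)

print(is_undistorted(good, tone))  # True: clean, just louder
print(is_undistorted(bad, tone))   # False: clipping baked the distortion in
```

Note that `bad` ends up *quieter* than `good` in peak level, yet it still sounds awful: the clipped waveform shape survives the later volume reduction, which is exactly what happens when a hold-music source is set too hot somewhere in the chain.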

[–]DesertTripper [score hidden]  (0 children)

This is exactly what spurred the downfall of Thaddeus Cahill's venerable experiment in the early 1900s involving a giant electromechanical organ that used the NYC telephone network to distribute music to paying customers.

Not only was the music quality bad, it "bled over" into straight voice circuits and made conversation difficult. There's an apocryphal story involving an angry mob that swarmed Cahill's giant music machine and threw it into the river!

[–]Negative12DollarBill [score hidden]  (0 children)

I had to provide a file for one of those systems once and it was a really low bit-rate, like 8-bit audio. That was the only type of file they could load into it.

[–]StudioDroid [score hidden]  (0 children)

Want to know more about hold music? Google "Cisco Opus Number 1".

It is a great story.

[–]peglyhubba [score hidden]  (0 children)

Because the company would rather you hang up in disgust. They only want you as a customer they don’t want to service you. That has gone away.