The default for what? Space is not the only consideration. What about bandwidth?
I'm pretty sure Spotify, Deezer and the others are not transmitting FLACs, especially not at the base quality level.
One interesting thing to note: this is a composition, not an analysis. It's not fully documented exactly what modifications to the "raw data" were made.
Herman Hesse (or rather Harry Haller in Steppenwolf) used to get enraged by that - distorted garbage and a blasphemy to the godly composers. But then he eventually overcame it because of exactly that reasoning - if people enjoy it and are touched by the music, it works well enough. That is the purpose of music, not arbitrary perfectionism.
And this brings me to my point: the engineer is no mere technician whose job is only fidelity. The engineer is part of the performance through speakers, an artist no less than the musicians being recorded.
You can easily test this yourself by generating some mono pink noise in Audacity (or your favorite audio editor) and playing it on stereo speakers. Move your head and you will hear changes in the sound like a flanger effect. This is most obvious if you are close to the speakers and in an acoustically dry environment. Compare with the same sound hard panned to a single speaker.
This is one reason why a real physical center channel improves clarity for movie dialogue.
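If you'd rather script it than click around in an editor, here's a minimal Python sketch (numpy plus the stdlib wave module; the filename and parameters are arbitrary choices of mine) that writes identical mono pink noise to both channels of a stereo WAV:

```python
import numpy as np
import wave

# Generate ~5 s of pink noise by shaping white noise to a 1/f spectrum.
sr, dur = 44100, 5
n = sr * dur
rng = np.random.default_rng(0)
white = np.fft.rfft(rng.standard_normal(n))
freqs = np.fft.rfftfreq(n, 1 / sr)
freqs[0] = freqs[1]  # avoid dividing by zero at DC
pink = np.fft.irfft(white / np.sqrt(freqs))
pink = (0.5 * pink / np.max(np.abs(pink)) * 32767).astype(np.int16)

# Interleave identical samples on L and R: "mono" played on stereo speakers.
stereo = np.repeat(pink, 2)
with wave.open("pink_mono_on_stereo.wav", "wb") as f:
    f.setnchannels(2)
    f.setsampwidth(2)
    f.setframerate(sr)
    f.writeframes(stereo.tobytes())
```

Play it back, move your head side to side, then silence one channel and compare - the comb-filter "flanging" disappears when the sound comes from a single speaker.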
Lots of people in the '40s and '50s enjoyed listening to Frank's and Elvis's songs broadcast over AM (mono) radio (10 kHz bandwidth) and reproduced by one or two oval 2x6" speakers in their cars.
The 1939 Carnegie Benny Goodman concert was recorded with mikes and lathes cutting aluminum acetates with no better than 5 kHz fidelity. They were lost until the 50s, then released on an LP that sold over a million copies. It sounds great on Youtube; I was wondering what else sounds that good with a first-order roll-off at 5k.
EDIT: nerds should read about sfb21 (https://wiki.hydrogenaud.io/index.php?title=LAME_Y_switch), AAC, Vorbis and Opus (CELT) aren't just theoretical improvements
To say that the mp3-encoded version is not "what the artist recorded and wanted for us to hear" would imply that we can hear all sounds in the uncompressed recording.
I don't have a source ready but this has been a hot topic in audiophile land for decades and tldr is they'll pick out the really bad sources (eg <128kbps mp3) but not the rest. Basically the results look like those from a blind beer tasting test: no correlation between winner and supposed quality, except if the quality is especially bad.
I'm no scientist, but to me "audiophiles who really care about this stuff can't pick out the good MP3 from the uncompressed original" is sufficient proof that MP3 is, actually, based on a sufficiently well-understood model of human hearing.
https://en.wikipedia.org/wiki/ABX_test
A properly conducted ABX test is the most favorable condition possible for detecting a difference. If you can't ABX it you can't hear it.
There's an ABX testing website with various lossy formats you could try:
https://abx.digitalfeed.net/list.html
However, failing to ABX those specific samples does not guarantee you are unable to tell the difference in all circumstances. There are some sounds that are unusually difficult to encode ("killer samples"). This is an especially big problem for MP3. The LAME project has a collection of killer samples for MP3:
https://lame.sourceforge.io/quality.php
More modern lossy formats are less susceptible to killer samples, but theoretically there could still be problematic cases.
Not all .mp3s are created equal; how lossy they are varies with the bitrate.
If you care enough to want to hear exactly what the artist wants you to hear, you just listen to the lossless version.
No kidding. There are a decent number of quality options to pick for MP3: VBR/CBR, bitrate, joint/stereo/mono, and so on. I personally just pick something that sounds fairly close to the original. But that only really matters to me when it is side by side. Give me a few days between picking and I can't really tell anymore.
I immensely enjoyed listening to the "lost material" of Tom's Diner and would like to hear more of this!
Maybe one could diff with a lower quality version, one where more has been cut away, more is lost/left over? There are so many possibilities!
ffmpeg -i original.wav -codec:a libmp3lame -b:a 192k output.mp3 && \
ffmpeg -i output.mp3 decoded.wav && \
ffmpeg -i original.wav -i decoded.wav -filter_complex "[1:a]aresample=async=1,volume=-1.0[inverted];[0:a][inverted]amix=inputs=2:weights=1 1" difference.wav
[AVFilterGraph @ 0x6000008b59d0] More input link labels specified for filter 'aeval' than it has inputs: 2 > 1
[AVFilterGraph @ 0x6000008b59d0] Error linking filters
Failed to set value '[0:a][1:a]aeval=val(0)-val(1):c=same' for option 'filter_complex': Invalid argument
Error parsing global options: Invalid argument
You can do the same thing in your DAW¹ by putting A (e.g. the original) onto one channel and B (the processed sound) onto another. Then you invert the phase of B and listen to/export the sum.
This trick also works for audio gear that claims to do amazing things to your sound (here you just need to make sure to match the levels if they have been changed). Then you can see how much of the signal has truly been affected by your $1000 silver speaker cable.
¹ Digital Audio Workstation, something as simple as Audacity should do the trick
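The null test described above can be sketched numerically too - this is a toy with synthetic signals standing in for your A and B tracks, not a real DAW workflow:

```python
import numpy as np

sr = 44100
t = np.arange(sr) / sr
a = 0.5 * np.sin(2 * np.pi * 440 * t)  # track A: the "original"
b = a.copy()                           # track B: the "processed" version

null = a + (-b)              # invert B's phase and sum with A
print(np.max(np.abs(null)))  # 0.0: a perfect null, B changed nothing

# If the gear changed the level, match levels first - otherwise the
# residue is dominated by the gain difference, not by real processing:
b_loud = 1.01 * a
b_matched = b_loud * (np.max(np.abs(a)) / np.max(np.abs(b_loud)))
```

Anything left over after level matching is what the processing actually did to the signal.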
I have a friend who has spent ridiculous sums of money on audio gear. Like, he's in his 50's, and still lives with his parents (in part) because of it. Over the years, I've learned I will never convince him that he's being fleeced, but I've wanted to make a site to host such A/B comparisons for a very long time, to perhaps get through to others what a waste most of the "audiophile" gear is.
On your A/B comparison website: I think it is important to make a "blind" test the default. So they listen to e.g. ten repetitions, vote for one each time, and in the end they get a score showing which one they liked better, and by how much.
Because of course they want to hear the difference if there is an expensive price tag.
But honestly, the only thing you get is something that subjectively sounds exactly the same, only at a lower volume. Probably because subjective sound experience is more closely related to the Fourier transform of the waves than to the waves themselves.
That technique will work with simpler compression techniques, like companding. (Companding is basically doing the digital equivalent of the old Dolby NR button from the cassette days.)
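As a concrete illustration of companding (this is textbook mu-law as used in telephony, not anything MP3-specific; the function names are my own):

```python
import numpy as np

MU = 255  # classic 8-bit mu-law parameter

def compress(x):
    # log curve: more resolution near zero, where quiet detail lives
    return np.sign(x) * np.log1p(MU * np.abs(x)) / np.log1p(MU)

def expand(y):
    # inverse of compress()
    return np.sign(y) * ((1 + MU) ** np.abs(y) - 1) / MU

x = np.linspace(-1.0, 1.0, 201)        # input samples in [-1, 1]
q = np.round(compress(x) * 127) / 127  # quantize the companded signal to 8 bits
x_hat = expand(q)                      # expand back on playback
err = np.abs(x_hat - x)
```

The point of the curve: after quantization, quiet samples come back far more accurately than loud ones, matching how hearing tolerates error.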
"Using the python library headspace, and a reverb model of a small diner, I began to construct a virtual 3-d space. Beginning by fragmenting and scrambling the more transient material, I applied head related transfer functions to simulate the background conversation one might hear in a diner. Tracking the amplitude of the original melody in the verse, I applied a loose amplitude envelope to these signals. Thus, a remnant of the original vocal line comes through in its amplitude contour."
Or on video compression, for that matter.
It just shows though that these diffs are invisible to a human - by design.
ps, you could do the same thing with watermarked content.
Beyond that, the specific thing that i noticed gets lost is bass character on some tracks. Ex: Some drum and bass tracks just don't hit at low bitrate. This aspect sometimes feeds into the low guitar strings though, where they might have a bit less body.
Lastly is of course sound staging, but that's something that a headphone setup is very sensitive to.
As for quality differences, I basically fall in line with the consensus in this thread. FLAC and 320 kbps are indistinguishable. 192 kbps is almost always indistinguishable and good enough, although there are some situations where it might be slightly noticeable. 128 kbps is pretty easy to tell apart with a good setup.
There is also the rare track with amazing production and/or very cool stuff happening somewhere in the spectrum, so I don't entirely write off someone wanting a plus-one on the take above. "I absolutely love this jazz album. It's been a large part of my musical journey as a human. I can't tell it apart at 320 in an A/B test, but for this album, I really want it lossless." I can respect that. I got to know some of my test tracks pretty well (e.g. Black Sun Empire & Arrakis for bass), and while 192 was fine for other tracks, I wanted 320 or lossless on those.
"The exact master copy that the artist recorded and wanted for us to hear" In the digital era, does that even, uniquely, exist?
"a set of standards created by a bunch of engineers in 1993" Nice!
I was hoping the article would mention the double-blind studies, available elsewhere, on the ability to perceive quality differences between various audio file formats. They're interesting, though not as overwrought as the reporting in this article.
https://en.wikipedia.org/wiki/Bach:_The_Goldberg_Variations_...
I always loved to test the ears of my "audiophile" friends. They will tell you how different MP3s sound. You make a bet that they cannot differentiate them better than chance in 20 trials. I won with most people, except for some professional musicians who can identify small differences.
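For reference, the chance threshold for such a bet is simple binomial arithmetic - with 20 trials, 15 correct is the first score that beats guessing at p < 0.05:

```python
from math import comb

def p_value(correct, trials):
    # one-sided probability of getting at least `correct` right by pure guessing
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

print(round(p_value(15, 20), 4))  # 0.0207 -> significant at 0.05
print(round(p_value(14, 20), 4))  # 0.0577 -> not significant
```

This is why ABX protocols insist on many trials: 3 out of 4 sounds impressive but proves nothing.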
Details:
https://en.wikipedia.org/wiki/Auditory_masking
Bernhard Seeber of the Audio Information Processing Group at the Technical University of Munich has some good demonstration videos on Youtube:
Modern codecs like Opus are much more efficient. At high bitrates they are fully transparent, and anybody who claims to be able to hear a difference is full of shit. Put them in a controlled setting and they fail every time.
Some young folks think that 24-bit/192 kHz is the one true form and would consider a 16/44 FLAC a lossy encode, and then there are the vinyl folks. (I like vinyl, but not for the fidelity.)
Required reading: https://xiph.org/video/vid2.shtml
Maybe I'm old, but I do not hear a difference between 128 kbps opus and flac. I mainly use flac because it is an excellent archival format and you can encode it to different formats easily.
Of course there are other ITU tests that work without hidden references, looping or even A/B comparison. They require a much bigger listener pool, are more expensive and take longer, thus used less often during development.
I'll trust actually validated limits of human perception, such as 16/48 audio or 1-3 ΔE colour difference. And the techniques used in video encoding, like PSNR and SSIM, are also pretty well grounded in science. Also SINAD.
But anything involving a human blindly comparing audio is into audiophile pseudoscience territory, no matter how large a cohort of people or how it is executed
No, this is nowhere near pseudoscience, psychoacoustics is an established field of science.
https://abx.digitalfeed.net/list.html
(you can press A, B, or X on the keyboard for instant switching)
Even this seems unlikely. I remember a test from the c't computer magazine, which has a very good reputation. There were many professionals, and as far as I remember, they were not able to tell the difference.
Fun fact: The only person who scored significantly was someone who loved punk music and had hearing damage.
[EDIT] https://www-parkrocker-net.translate.goog/threads/komprimier...
So it's not a good indicator for your test (and I wouldn't dare claim what your audiophile friends claimed). But in a few cases a "wrong"-sounding cymbal can actually be caused by compression.
But I sit down to listen to music as a hobby (and used to play drums, so I listen closely to them). And while I don't have audiophile-grade snake oil, I have both decent headphones and speakers; including the electronics.
It didn't cancel anything out.
The reason: MP3 dramatically alters phase. Because all the phases are different, it's hard to naively determine how the signal was altered.
Years later, I took the time to write a series of tools to investigate lossy audio: https://andrewrondeau.com/blog/2016/07/deconstructing-lossy-...
I personally don't understand enough of MP3's internals to explain that.
What I assume is that, because MP3 internally stores Fourier transforms in the frequency domain, instead of the time domain, it uses very few bits to store phase. This will result in phase shifts.
Hopefully, someone else can give a better explanation of this than I can.
Basically, think of it this way:
1: Imagine a frame of 64 16-bit samples (1,024 bits).
2: Then it's converted to 32-bit floats: 64 32-bit samples.
3: Then it's transformed.
4: Now there are: one DC-bias 32-bit float (where phase is 0), 32 frequency-amplitude floats, and 32 phase floats. (The other 31 frequency/phase pairs are duplicates and don't need to be stored.)
These floats need to be quantized down to far fewer bits. Some of that can happen by using very few bits to store phase, and some by using fewer bits to store amplitudes. Again, someone else can probably explain this better than I can.
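Here's a toy numpy illustration of that assumption. One caveat I should flag: real MP3 uses an MDCT, whose coefficients are real-valued, so it never stores phase as an explicit number - this only shows the general idea that coarse frequency-domain quantization shifts phase while leaving the magnitude spectrum intact:

```python
import numpy as np

n = 64
# one 64-sample frame: a sine at bin 5 with an arbitrary phase offset
frame = np.sin(2 * np.pi * 5 * np.arange(n) / n + 0.7)

spec = np.fft.rfft(frame)
mag, phase = np.abs(spec), np.angle(spec)

# keep the magnitudes, but quantize phase to just 2 bits (4 levels)
phase_q = np.round(phase / (np.pi / 2)) * (np.pi / 2)
recon = np.fft.irfft(mag * np.exp(1j * phase_q), n)

# magnitude spectrum survives almost exactly, but the waveform is shifted
```

The reconstructed frame has essentially the same spectrum analyzers would show, yet subtracting it from the original leaves a large residue - which is exactly why naive waveform diffs of lossy audio look so dramatic.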
I feel like this analysis isn't well grounded in what artists and sound engineers actually do, or how they think.
- In this month's TapeOp magazine, Jeff Jones mentions he always monitors through such a codec so he can adjust accordingly for the best sound regardless of final format https://tapeop.com/interviews/163/jeff-jones/
- IMO Rush's Moving Pictures is one of the best (and best-sounding) albums from the '80s despite the fact that it was mastered to a Sony digital device that's renowned as terrible sounding, with only 14 bits of usable resolution. How/Why? The production team monitored through the Sony system while mastering and made tweaks as they went to account for its limitations.
- I've had my own music professionally mastered at high resolution, and discovered that converting the hi-res masters to MP3 (even at the highest bitrate) caused digital peaks over 0.0 dB, which I fixed by normalizing to -0.3 dB.
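That normalization step is a one-liner in any language; here's a numpy sketch (function names are my own, and this handles sample peaks only, not inter-sample peaks):

```python
import numpy as np

def peak_dbfs(x):
    # level of the highest sample peak, in dB relative to full scale
    return 20 * np.log10(np.max(np.abs(x)))

def normalize(x, target_dbfs=-0.3):
    # scale so the highest sample peak sits exactly at the target level
    return x * (10 ** (target_dbfs / 20) / np.max(np.abs(x)))

# a signal whose peaks exceed 0 dBFS, like the decoded MP3 described above
x = 1.2 * np.sin(2 * np.pi * 440 * np.arange(44100) / 44100)
y = normalize(x)
print(round(peak_dbfs(y), 1))  # -0.3
```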
TBH, if the resulting file is going to be transmitted over Bluetooth, it's going to be further degraded. And yet people can still make a strong emotional connection to the underlying music and artist...
If the remaining audio was noise like, I would say we reached the compression limit.
Increases in processing power spurred progress. Within a year Brandenburg’s algorithm was handling a wide variety of recorded music... But one audio source was proving intractable: what Grill, with his imperfect command of English, called “the lonely voice.” (He meant “lone.”) Human speech could not, in isolation, be psychoacoustically masked. Nor could you use Huffman’s pattern recognition approach—the essence of speech was its dynamic nature, its plosives and sibilants and glottal stops. Brandenburg’s shrinking algorithm could handle symphonies, guitar solos, cannons, even “Oye Mi Canto,” but it still couldn’t handle a newscast. Stuck, Brandenburg isolated samples of “lonely” voices. The first was a recording of a difficult German dialect that had plagued audio engineers for years. The second was a snippet of Suzanne Vega singing the opening bars of “Tom’s Diner,” her 1987 radio hit.
That accusation requires evidence based in psychoacoustics. Just because you can hear it in isolation doesn't mean you can hear it if it is added back to the host audio.
For instance when some quiet sound that is masked by immediately preceding loud sound is removed, of course you can hear that quiet sound in isolation! Your hearing has something like 120 decibel dynamic range, or better.
You can hear differences in the compressed audio. Nobody claims that there's no degradation in quality. Artifacts are obvious, much more so at lower bitrates. MP3 starts to sound quite good around 192 kbps.
The removal of those components is necessary - it's how the algorithm achieves compression in the first place.
Also there's this issue: take a signal and apply some modest EQ to it. Say we boost the bass and treble and cut the mids a little bit, or any other EQ profile. If we then level-match the two signals and subtract one from the other, there will be a difference: some aspects of the original material will be recognizably heard. For instance, the difference between a slightly treble-cut signal and the original will be the treble. But the treble was not completely cut from the original. What you're hearing in the difference is not something that was entirely removed.
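You can verify this numerically; here's a synthetic sketch with two sine "bands" standing in for bass and treble (the -12 dB "EQ" is a deliberately crude stand-in for a real filter):

```python
import numpy as np

sr = 44100
t = np.arange(sr) / sr
low = np.sin(2 * np.pi * 200 * t)          # "bass" content
high = 0.5 * np.sin(2 * np.pi * 8000 * t)  # "treble" content
original = low + high

treble_cut = low + 0.25 * high  # crude EQ: treble reduced, not removed
diff = original - treble_cut    # the null-test residue

# With a 1 s frame at sr = 44100, rfft bin k corresponds to k Hz.
resid_high = np.abs(np.fft.rfft(diff))[8000]          # treble in the residue
remaining_high = np.abs(np.fft.rfft(treble_cut))[8000]  # treble still present
```

Both values are large: the residue sounds recognizably like the treble, even though the EQ'd version still contains plenty of treble of its own. Hearing content in a difference file does not mean that content was deleted from the signal.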