Is AI better than Headphone Reviewers?
Resolve explores how useful AI is (or isn't) for headphone purchase advice and technical explanations, and what limits its usefulness in this regard.
Artificial intelligence tools like ChatGPT and Perplexity are increasingly being used to answer questions about audio gear. At first glance, they seem like a convenient alternative to wading through countless reviews and forum posts. But when it comes to audiophile topics—especially headphones—AI’s reliance on online discourse often means it’s amplifying confusion rather than cutting through it.
Aggregating the Noise
AI doesn’t have opinions or firsthand listening experience. It aggregates information from the internet: reviews, forum threads, blog posts, and articles. In the headphone world, where subjective impressions, hype cycles, and personal biases dominate, this can skew results toward whatever’s being talked about most—whether or not it’s accurate or relevant.
When asked a basic question like “What are the best open-back headphones under $500?”, AI will produce a mix of solid recommendations and odd relics. Discontinued models like the Audeze LCD-1 or outdated options such as the AKG K7XX often show up simply because they still have a lot of chatter online. Even more specific prompts, like “smooth treble and even spectral balance”, can yield contradictory results—pairing genuinely smooth-sounding models with headphones known for harsh treble or unusual tuning.
Technical Questions Fare Better
On straightforward technical topics—like defining “acoustic impedance” or “diffuse field”—AI can give accurate, well-sourced answers. This is because these concepts have clear, factual definitions available from reputable sources. But as soon as a question drifts into subjective territory (for example, whether certain cables make audio “warmer”), AI begins to regurgitate audiophile folklore, complete with recommendations for dubious tweaks like “audiophile crystals.”
The Placebo Problem
The headphone hobby is especially prone to suggestion, placebo effects, and confirmation bias. AI can’t distinguish between widely shared misconceptions and well-supported facts—it treats them both as valid inputs. Even when it hedges by saying “some people perceive…”, it’s still presenting questionable claims alongside accurate ones, which can mislead those who don’t know the difference.
A recent example: someone asked an AI whether a frequency response graph can fully represent the sound of a track. The AI's "no" answer, evidently assembled from questionable sources, listed factors like time-domain behavior, phase response, distortion, and dynamics. Some of these can technically matter in narrow contexts, but for headphones most of them are either already reflected in standard measurements or aren't audible under normal conditions.
The result? An answer that sounds authoritative, but risks reinforcing and propagating misunderstandings instead of actually answering the questions people are asking.
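One way to see why "time-domain behavior" rarely adds information for headphones: they behave, to a good approximation, as minimum-phase systems, so the impulse response (attack, decay, ringing) can be reconstructed from the magnitude frequency response alone. The sketch below is purely illustrative and not part of the original discussion; it assumes a magnitude response sampled on a full FFT grid and uses the standard homomorphic (cepstrum) method in numpy.

```python
import numpy as np

def minimum_phase_ir(mag, n_fft):
    """Reconstruct a minimum-phase impulse response from a magnitude spectrum.

    mag   : full (two-sided) magnitude spectrum of length n_fft
    n_fft : FFT length (assumed even)
    """
    log_mag = np.log(np.maximum(mag, 1e-12))        # avoid log(0)
    cepstrum = np.fft.ifft(log_mag).real            # real cepstrum
    # Fold the anti-causal part onto the causal side (homomorphic method)
    folded = np.zeros_like(cepstrum)
    folded[0] = cepstrum[0]
    folded[1:n_fft // 2] = 2.0 * cepstrum[1:n_fft // 2]
    folded[n_fft // 2] = cepstrum[n_fft // 2]
    min_phase_spectrum = np.exp(np.fft.fft(folded))
    return np.fft.ifft(min_phase_spectrum).real     # causal impulse response

# Toy example (placeholder values): a flat response with a gentle 3 kHz peak at fs = 48 kHz
n_fft, fs = 4096, 48_000
f = np.fft.fftfreq(n_fft, d=1 / fs)
mag = 1.0 + 0.5 * np.exp(-((np.abs(f) - 3000.0) / 800.0) ** 2)
ir = minimum_phase_ir(mag, n_fft)  # "time-domain behavior" recovered from the FR alone
```

Where a headphone departs meaningfully from minimum-phase behavior, the reconstruction and the measured impulse response will differ, which is the narrow sense in which the AI's caveat can hold.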
Bottom Line
AI is a useful tool for retrieving definitions, summarizing specs, and providing overviews—if you know enough to filter the noise.
But in a field as subjective and hype-driven as audiophile gear, its recommendations and answers are only as good as the conversations it's trained on... which aren't all that great. Treat AI answers as conversation starters, not as proof of whatever you're trying to argue, and be wary of using them to justify purchases or technical claims.
In short: AI is great at telling you what people think is true. It’s not yet great at telling you what is true—especially in audio, where a lot of the discourse is simply rife with nonsense.
Full Video Transcript Below:
All right. So, I was originally planning on making a video about AI and how people are using AI, and to ask the more edgy question of, you know, is AI better than the reviewers? So, the more clever and edgy outcome I was going for here was that yes, these AI platforms are better than what you get from reviewers, provided that you give it the right prompt. But then I started giving these platforms the more nuanced prompts that I thought would yield better results and found that some of the results that I'm getting here are just weird. So that's what I'm going to talk about in this video. I started this off by taking a look at ChatGPT and just giving it a prompt of, you know, what are the best open-back headphones under $500, right? A basic question I think most people who are using these platforms would be asking. It's a reasonable thing to ask. Some of these results are good and some of them are less good. Um, but it's when you get into the specifics that things get more weird, which was kind of surprising to me. So, what you're getting from these... these are all pulling information from various different sources. They might be, you know, websites that are reviewing the products. There might be forums and various different audio communities that are talking about these things. And one of the things that I've noticed is that a lot of what gets pulled here is based around where a lot of noise is being made about a given headphone for one reason or another. And I think the general outcome here is that a lot of the results that you get are going to be driven by online community discourse. And this can lead to some kind of strange results. Like, I was getting results on certain platforms that were just really out of date. And I think, you know, this one here is showing the K7XX. Like, I don't know that that's that relevant today.

So then I thought, what if I make the prompt more specific? So I asked, what are the best open-back headphones under $500 for music listening? And it gave me some results, different results. It gave me the HD600 from Sennheiser and the HD 660S. I can see the argument for the HD600, the 660S less so, but okay, fine. Then there's the Audeze LCD-1. I don't even know if, like, are they still making this headphone? Can you still buy this? I am unable to find a store that sells this. Maybe there are still some that are available, but I believe this is not even a current product. Uh, so that's one thing. Then also the Beyerdynamic DT1990 Pro. Um, this is a headphone that is particularly harsh sounding. It has a massive resonance at around 8 to 8.5 kHz and then another one at 12 kHz. This was a popular model at one point, like years ago. And personally, I'd like to think that I've been fairly consistent on this one. The first time I heard this, this was actually one of the first headphones that I reviewed. I hated it from day one, and I thought it was totally horrible. Then there's the HiFiMan HE400SE and Sundara. There's the Meze 105 AER, which is a weird one, and we'll get to that in a moment. And then, of course, the Beyerdynamic DT990 Pro. And then there's another AKG model in there, too. So, yes, the DT990 Pro is very bright, but it is a headphone that has a lot of commentary about it online. And so, you know, it's kind of unsurprising to me that it makes this list. So then I thought, okay, well, what if we go even further and start qualifying these a little bit more?
And so I asked it, what are the best open-back headphones under $500 for smooth treble and an even spectral balance? Now, these are the kinds of things that, like, I might look for in a headphone. So the prompt is getting more nuanced now. And the results here: we get the Sennheiser HD 600, 650, and 660S. Again, it's not unreasonable to include those. Then you get the Focal Listen Professional. And, like, that is a headphone that I believe you can buy, but I'm very confused as to why that gets included in this list next to Sennheisers. Let me just take a look at that. Let's look at the frequency response of the Focal Listen Professional. Um, and it is, uh, kind of weird. Let's just put it bluntly: definitely one of the weirder ones on this list. And then the open-back alternative it suggests is the Focal Clear. And those are very different headphones. So, that's one area where ChatGPT at least is a bit confused. Then there's a Sundara on there. That's understandable. K712 Pro, uh... I believe I have a K712 Pro here. It's not a good headphone. Let me just be very clear about that. It's a really weird sound, and it's very unfortunate that they used that as the replicator headphone in the Harman work, but that's a different topic. And then we get the Fidelio X2 HR. The prompt here was smooth treble and even spectral balance. So I would say that, you know, the HD600 and 650, they achieve that. The Sundara sort of achieves that. I wouldn't necessarily say smooth treble, but that's the description that it gives. And then the Philips Fidelio X2 HR: the treble was specifically not smooth, right? That was kind of the problem with that headphone. So I'm not exactly surprised by, like, the overall kind of, you know, bucket of results that we get here. It's more that I'm surprised by what it's attached some of these subjective qualifiers to. You know, I've asked it for smooth treble and an even spectral balance, and it's given me products that have harsh treble and a wonky frequency response. Uh, but let's now move over to one of the other platforms here. This was another result where I basically prompted it for the same thing. I'm using Perplexity now, and I asked it the same question. You know, what are the best open-back headphones under $500 for smooth treble and an even spectral balance? And again, I get the HD600 and 650. I get the Sundara, but then I get the Meze 105 AER. And it's saying the Meze 105 AER is known for natural, warm tonality with smooth highs and an overall even spectral balance. No, it is not. Now, I'm not saying that nobody is going to hear it this way, or even that there aren't things to like about this headphone. You know, these things behave differently on different heads, and this is a subjective hobby after all. It's just that the nature of these subjective reports is not well understood by the AI here. See, it's not that the AI has failed. It's that somebody else has failed the AI, and now it is feasting on that failure and providing it to all of you. Some of the general truisms about how people should use AI platforms for their purchase decisions apply to headphones too. It does depend on the prompt, and it does depend on what it's able to feast on. But I was genuinely expecting that the more nuanced my prompt got, the better the results would be. And some of the results here are, like, the opposite of what the prompt asked for, which is kind of a weird thing.
So then I started to go a little bit further and try to figure out, okay, how does it do for answering some questions about things like, what is acoustic impedance? Some of the more technical deep-dive questions, like what is an HRTF? All those types of questions. And it gave good answers for these questions as long as I was specific with it. Uh, let me just double-check what Perplexity says, because it didn't actually do that. What is a diffuse field? Let's just see what it says for that. So yeah, basically this gives a reasonable answer: a uniform sound distribution, where the sound pressure level is essentially the same everywhere in the room. So this is something where it is pulled from a source. It's pulled from Wikipedia here, actually. And then it also shows a couple of other sources. And actually, looking at this... did we make this? We did. Yeah. So the diffuse field prompt is getting its information from listener's article up on headphones.com. I'll leave this linked in the description for anybody who's interested. But, um, yeah, this also seems like a reasonable segue to our sponsor, headphones.com, who makes all these videos possible. You can find the actual source of this information and other technical deep dives up on headphones.com in the audiophile section there. Um, and as always, if you want to support what we do here on this channel, make sure you consider headphones.com the next time you're in the market for a new pair of headphones. But shameless plugs aside, let's move on. So some of these more technical questions it is able to give reasonable answers to, which I thought was good. But then I started to ask some questions about, like, you know, whether or not cables would make a difference. And this is where things got weird again. I started to get recommendations for cables that would make the sound warmer. And so what the AI would do is hedge against its own recommendations by saying, this is what people report, or this is how people perceive it to be. And then it would give whatever recommendation for a certain type of cable to make it warmer, right? And people were asking... even on Head-Fi, they're asking ChatGPT what cables it would recommend to be paired with, you know, specific equipment that they had. And, um, so then I started to kind of try that out and found that, yeah, like, if you ask, you know, what audio cables make your audio system warmer, it actually gives you answers to this, even though they're bogus, right? Like, there's nothing that's actually been demonstrated that it can latch on to. So it is pulling from the sea of madness that exists where people are reporting these differences. And the caveat there for these AI platforms is that the language it starts to use shifts a little bit to "people perceive" whatever it is, right? Rather than saying, "Hey, this is how it is." And this is where you and I and other AI platform users need to be a little bit careful about what it is that we're reading and the conclusions that we're drawing from it. Like, I was able to get it to recommend audiophile crystals to me, and their best use case and application. You know, one of them was to leave certain types of audiophile crystals near cables to get the sound to change in a certain way. So, if you're feeding it those kinds of prompts, again, it's going to give you the craziness, right? Like, it's going to see, okay, well, who's talking about crystals online about this?
What are they talking about with respect to this? And let me see if I can, you know, concisify that. So again, the outcome here is that, unfortunately, no, AI is just not good enough yet to be able to distinguish good information from bad. And that's true generally, you know, across all domains, really, but particularly when it comes to something like the audio space or the headphone space, where it's a topic that involves a high degree of suggestibility, of placebo, of confirmation bias, of subjectiveness generally, where it's prone to be particularly confused about some of these results because it's pulling from what people are saying about this. And so for that reason, I really want to encourage people to stop using AI answers as proofs for something that they are trying to communicate. Right? So one of the recent ones that I saw was on Head-Fi, where somebody asked ChatGPT: can the sound of a track be exhaustively captured or represented by a frequency response graph alone? And the answer was no, and it goes into time-domain behavior, phase response, distortion characteristics, dynamics and compression, spatial and psychoacoustic elements, etc. Um, and the thing is, there is a sense in which some of these can be true in a technical sense. It's just that they are not relevant or don't apply in the context of headphones, and they are bound to lead people to more confusion than they are to clarify that information, especially because of the way that people are making use of these answers or allowing them to sort of permeate their belief systems on all of this stuff. Right? So let's take a look at this one. The number one thing here it says is time-domain behavior. So if you do a sweep, you also get the time-domain behavior, and in headphones, time-domain behavior is proportional to frequency response most of the time, in the vast majority of cases. And so therefore it doesn't actually show you anything new that frequency response doesn't also show you. And the big problem here is that it uses language like attack, decay, sustain, release, transient response, smearing, or ringing. These are terms that also have a subjective application to them. And when people are reading this stuff, they are bound to apply those same subjective descriptions to a separate category of evaluation that this is seeming to indicate, right? When in reality, this is just frequency response, again, seen a different way. Phase response: this is something that actually is captured by frequency response. It's just not typically visualized in it; it's not what the graphs typically show. But that is part of what frequency response contains. Um, distortion characteristics: again, another one where, it's true, distortion is not a thing that is shown by frequency response, and it is a component of the sound that could potentially be heard, but in most cases it's not really that relevant. Like, unless it's really bad, you probably won't hear it. And now the next one, dynamics and compression. Again, these are terms that are laden with subjectivity. Uh, while there is a technical truth available, dynamics and compression, like, these are real things, but this carries with it a lot of subjective semantics that people are going to take from this kind of information. And, you know, same thing with timbre and texture. These are subjective terms that often get used to describe the sound. But the thing is, those are things that are absolutely captured in frequency response.
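To make that sweep point concrete for readers: a single sine-sweep measurement yields the impulse response, and the frequency response magnitude, the phase, and the time-domain decay are all just different views of that one dataset. The snippet below is a simplified, illustrative sketch of a standard exponential-sweep deconvolution in Python; the `recorded` stand-in and all parameters are placeholders rather than part of any actual measurement rig, and real measurements would add fades, windowing, and calibration.

```python
import numpy as np
from scipy.signal import fftconvolve

def exp_sweep(f1, f2, duration, fs):
    """Exponential sine sweep and its matched inverse filter (simplified, no fades)."""
    t = np.arange(int(duration * fs)) / fs
    r = np.log(f2 / f1)
    sweep = np.sin(2 * np.pi * f1 * duration / r * (np.exp(t * r / duration) - 1))
    # Time-reversed sweep, amplitude-corrected for the sweep's falling spectrum
    inverse = sweep[::-1] * np.exp(-t * r / duration)
    return sweep, inverse

fs = 48_000
sweep, inverse = exp_sweep(20.0, 20_000.0, 2.0, fs)

# In a real measurement, `recorded` is the microphone capture of the headphone
# playing the sweep; the sweep itself stands in here so the script runs end to end.
recorded = sweep

impulse_response = fftconvolve(recorded, inverse)        # time-domain behavior
spectrum = np.fft.rfft(impulse_response)                 # complex frequency response
magnitude_db = 20 * np.log10(np.abs(spectrum) + 1e-12)   # what the "FR graph" shows
phase = np.unwrap(np.angle(spectrum))                    # phase, from the same data
```

The point is simply that "time-domain behavior" and "phase response" aren't extra information the measurement missed; they come out of the same sweep that produces the frequency response graph.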
And you can even predict how they will be received for things like timbre and texture. Um, it's just that the analysis of frequency response needs to be sufficiently good in order to do that. People are usually just analyzing frequency response in terms of the most basic, you know, what is the relationship between bass, mids, and treble, when there's so much more contained within it. It's not just how it is relative to a target. And so once we start to analyze it in a more thorough manner, a lot of these things, a lot of the subjective language that gets used, which is being shown here as proof of something else, you're actually able to connect the dots better between that stuff and frequency response. So, you know, the spirit of this post and these questions is a reasonable one. I don't actually think there's anything wrong with people asking these questions, and even going to AI platforms to do this. But it's important to recognize that there's a lot of context missing here, and a lot of the relevance for headphones that people are misattributing. This kind of stuff is bound to lead to more confusion and misinformation spreading than it is to the clarifying of information, which is what I think the intention is behind this. Really, we have to stop using the results that we get from AI platforms as proofs for anything, because all it's really saying is, look at how many other people are also confused about this. You know, this is not a signal that, hey, there's some truth here to be uncovered. This is a signal that, hey, people are talking about this, or people think this, or people believe this, and that is a separate thing from facts. So be very careful when using AI platforms. And with much sadness, I have to actually recommend against doing that for the most part, unless you are very cautious in your approach. And that's going to do it for this video. But if you guys have any perspectives on AI as it relates to the headphone information space, I've actually opened a thread on our forum and I'll leave that linked in the description. So feel free to comment there, give your opinions, or share if you've just found some fun, you know, hallucinations, because they are prone to hallucinating from time to time. Um, maybe more so in our space. But as always, if you'd like to chat with me or other like-minded audio folks, you can do so in our Discord, also linked below. Until next time, I'll see you guys later. Bye for now.