AI hallucinations and algorithm aversion
by Miguel Lucas
When Alcaraz loses a final, we say he had a bad day. When AI hallucinates, we say the technology doesn’t work.
It’s not that we’re unfair to machines. It’s more that we forget that the alternative (us, humans) is far from infallible. Medical diagnostic errors cause an estimated 795,000 deaths or permanent disabilities per year in the United States alone [1]. One in fourteen general medical hospital patients suffers a harmful diagnostic error [2]. That’s not negligence; it’s the base rate of human cognition applied to complex tasks. And we’ve normalized it to the point of invisibility. Yet a hallucination rate of around 1.6% in a model like GPT-5 [3] is enough for many people to dismiss generative AI as a working tool.
Cognitive psychology has documented this phenomenon precisely. It’s called algorithm aversion: we lose trust in a machine far faster than in a human after observing the exact same error [4]. We grant humans context and circumstances. We demand perfection from machines. Waymo’s autonomous vehicles record 85% fewer injury-causing crashes than human drivers [5], yet every self-driving car incident generates headlines while the thousands of annual deaths caused by distracted human drivers remain in our attentional blind spot.
Yet the history of technology suggests a different logic. Aviation did not become the safest mode of transportation by eliminating pilot error. It did so by minimizing its consequences. Airbus’s fly-by-wire flight envelope protection refuses commands that would take the aircraft outside its safe flight envelope [6]. The goal was never perfect pilots, but systems that manage imperfection. The record: from 35 fatal accidents per million flights in the 1950s to just 4 worldwide in 2024 [7].
An AI hallucination is not a defect that invalidates the technology. It is a characteristic to be managed, exactly as we manage human error: with verification, redundancy, and guardrails. We have spent a century building reliable systems with humans who fail. With AI, it will be no different.
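To make the redundancy argument concrete, here is a minimal back-of-the-envelope sketch. It assumes that each verification layer catches a fixed share of the errors that reach it and that the layers fail independently of one another, which real checks rarely do. The 1.6% base rate is the figure cited above; the 90% catch rate and the function name `residual_error_rate` are hypothetical, chosen only for illustration.

```python
def residual_error_rate(base_rate: float, catch_rates: list[float]) -> float:
    """Share of outputs that are still wrong after a chain of checks.

    base_rate   -- probability that the generator produces an error
    catch_rates -- for each check, the probability that it catches an
                   error that reaches it (assumed independent: a strong,
                   simplifying assumption)
    """
    rate = base_rate
    for catch in catch_rates:
        # Only the errors this check misses move on to the next stage.
        rate *= 1.0 - catch
    return rate


# 1.6% is the hallucination rate cited in the essay.
# The 90% catch rate per check is a made-up, illustrative value.
base = 0.016
for checks in ([], [0.90], [0.90, 0.90]):
    rate = residual_error_rate(base, checks)
    print(f"{len(checks)} check(s): {rate:.4%} of outputs still wrong")
```

Under these assumptions, every additional independent check divides the surviving error rate by ten. The precise numbers matter less than the shape of the curve: reliability comes from the layers around the fallible component, not from the component being flawless.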
References
1. Johns Hopkins Medicine. Report Highlights Public Health Impact of Serious Harms From Diagnostic Error in U.S.
2. BMJ Group. Harmful diagnostic errors may occur in 1 in every 14 general medical hospital patients
3. Weights & Biases. GPT-5 Benchmark Scores
4. Tandfonline. Algorithm appreciation or aversion: the effects of accuracy disclosure on users' reliance on algorithmic suggestions
5. PubMed. Comparison of Waymo rider-only crash data to human benchmarks at 7.1 million miles
6. Airbus. Safety Innovation #7: Flight Envelope Protection
7. Our World in Data. Commercial flights have become significantly safer in recent decades