AI is a Dunning-Kruger equalizer
By Rohana Rezel
In March 2026, Yousif Astarabadi, founder and CEO of a startup called TalkTastic, posted a viral thread claiming he had hacked Perplexity’s sandboxed coding environment. He had extracted what looked like a gateway token, proxied Claude usage at scale, and run tests that showed no charges against his quota. He filed a responsible disclosure. He published the thread. Millions of views.
Perplexity’s CTO Denis Yarats replied the same day: the sandbox creates a temporary proxy token for every user session. The billing was asynchronous. Astarabadi’s 197 discrete billing events were all on his own account. He simply hadn’t seen the invoice yet.
Astarabadi is not a security researcher. He is a founder who had spent enough time in AI tooling to pick up the vocabulary. The environment handed him what looked like evidence, the vocabulary gave him the genre, and he published before anyone checked the invoice. The tools gave him the story faster than due diligence could catch up.
This is the Dunning-Kruger equalizer. AI decouples output quality from the understanding required to evaluate it, then delivers that output to a mass audience with the polish of expertise. It gives the incompetent the vocabulary of the expert and strips the calibrated of the friction that kept them honest. The result goes straight to publication.
In 1999, psychologists David Dunning and Justin Kruger showed that incompetent people systematically overestimate their ability. The reason is almost elegant in its cruelty: the skills required to do something well are the same skills required to recognize when you’re doing it badly. A bad chess player doesn’t know what good chess looks like. The incompetence and the blindness to incompetence are the same deficiency.
For twenty years this remained a moderately interesting finding in social psychology. Then large language models arrived, and the curve stopped being a curiosity and started being an infrastructure problem.
Consider what happened to Alexey Grigorev in March 2026. Grigorev is the founder of DataTalks.Club, a technical education platform with years of student work behind it. He had been using Claude Code to manage his Terraform infrastructure. The agent encountered duplicate state files, decided they needed cleaning up, and ran terraform destroy. It wiped the production database and every snapshot with it. Two and a half years of student homework, projects, and leaderboards: gone.
Grigorev’s own post-mortem is worth reading carefully: “I was overly reliant on my Claude Code agent. I over-relied on the AI agent to run Terraform commands. That removed the last safety layer.”
The last safety layer. The AI spoke with the confidence of someone who knew what they were doing. Grigorev, a competent technical person, let that apparent competence substitute for his own judgment. The AI was wrong. But so was he, and he had no way to know.
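Terraform itself ships a guard for exactly this failure mode: the prevent_destroy lifecycle rule. A minimal sketch, with hypothetical resource names rather than anything from Grigorev’s actual configuration:

# Hypothetical example; the resource and identifiers are illustrative,
# not taken from the DataTalks.Club setup.
resource "aws_db_instance" "coursework" {
  identifier     = "coursework-production"   # hypothetical name
  engine         = "postgres"
  instance_class = "db.t3.medium"

  # With prevent_destroy set, any plan that would delete this resource,
  # including a blanket terraform destroy, fails with an error instead
  # of executing. The guard has to be deliberately removed from the
  # configuration before the database can be destroyed.
  lifecycle {
    prevent_destroy = true
  }
}

The guard is not absolute. It lives in the configuration, so anything with permission to edit the files can strip it out, and it no longer applies if the resource block itself is deleted. But it turns a one-command catastrophe into a deliberate, visible act, which is roughly what a last safety layer is for.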
That substitution, apparent competence standing in for one’s own judgment, is the mechanism the research is now catching up with. A 2025 study gave roughly 500 participants LSAT logical reasoning tasks. Half used ChatGPT; half worked unaided. The AI group scored better on the tasks, about three points higher on average, but overestimated their performance by four points. The non-AI group showed the classic Dunning-Kruger pattern: low scorers were overconfident, high scorers were appropriately calibrated. The AI group showed something different. The curve flattened. Everyone became overconfident regardless of actual ability.[1]
The researchers called it “smarter but none the wiser.” The more interesting finding: higher AI literacy correlated with greater overconfidence, not less. The people who thought they knew how to use AI were the most miscalibrated. What the AI had done was decouple the output from the understanding. The participants produced better answers without developing better judgment about their answers.
The same pattern appears in higher-stakes domains. A 2025 paper in the Journal of the American Academy of Orthopaedic Surgeons tested ChatGPT on ten frequently asked questions about adolescent idiopathic scoliosis, a condition where parents routinely turn to the internet for guidance on their child’s treatment. ChatGPT answered accurately on straightforward questions but gave unsatisfactory responses on complex surgical topics. The authors warned that patients risk developing a “Dunning-Kruger effect by proxy”: they read the AI’s confident explanation, absorb its apparent certainty, and walk into the consultation room believing they understand the options. The problem is not that the AI is always wrong. It is that patients have no way to identify which answers are the unreliable ones.[2]
You can see this at scale in the Matt Goodwin case. Goodwin is a British political commentator who published a book in March 2026 called Suicide of a Nation. The book contained quotes attributed to Cicero, Friedrich Hayek, Sir Roger Scruton, and Noah Webster. Journalist Andy Twelves posted a thread dissecting the first five chapters and found that several of these quotes do not exist anywhere in the cited thinkers’ works. He also found ChatGPT reference URLs embedded in the footnotes.
Goodwin appeared on GB News and said: “I stand by every single piece of evidence.”
The phrase is revealing. He had produced output that looked like scholarship. He had the footnotes, the citations, the authoritative prose. What he lacked was the ability to tell the difference between a real quote and a generated one, which is precisely the skill you need to use an AI research tool responsibly. The AI gave him the appearance of expertise without the substance of it, and he had no way to know.
The smaller, quieter cases accumulate daily. Database authority Brent Ozar has taken to writing posts about what he calls “AI SQL slop”: someone feeds a question to ChatGPT, copy-pastes the result as a comparison table or performance guide, and publishes it without checking whether the SQL is valid. One post he dissected had a typo in the sample query, duplicate section labels, and meaningless checkboxes. It had gathered substantial engagement before anyone noticed.
The same week, on the film side of X, someone posted a long technical condemnation of Criterion’s 4K restoration of Eyes Wide Shut, demanding “cinema-grade 12-bit full 4:2:2,” a spec that cannot exist on a UHD disc, which caps at 10-bit 4:2:0. The director of photography had personally supervised the restoration. A thread demolished the post point by point. Conclusion: “From the technical incompetence to the AI generated text, this whole post is just one big opinion piece filled with misinformation.”
The “equalizer” framing is deliberate. A multiplier would take existing overconfidence and amplify it: the already-incompetent becoming more so, the already-calibrated staying put. What the research shows is different. AI doesn’t steepen the curve; it flattens it. It moves the confident incompetent upward and the calibrated expert sideways, until everyone converges on the same miscalibrated band. It equalizes, and not in a good way.
What changes is the reach of confident wrongness. Before AI, a bad take required effort to produce and circulated within communities that could correct it. A book with fabricated quotes would be caught by a researcher or an editor. AI removes the effort and delivers the output directly to the audience, with the polish of expertise and without the accountability that expertise usually carries.
Students who failed chemistry exams by trusting ChatGPT’s answers were not surprised that AI could be wrong. They were surprised that they hadn’t been able to tell. That is the point. The Dunning-Kruger effect was always about the failure of self-assessment. What AI has done is make that failure load-bearing.
The invoice arrives later.
Rohana Rezel is a technologist, researcher, and community leader based in Vancouver, BC.
References
1. Fernandes et al., “AI makes you smarter but none the wiser: The disconnect between performance and metacognition,” Computers in Human Behavior, 2025. https://doi.org/10.1016/j.chb.2025.108779
2. Li LT, Adelstein JM, Sinkler MA, Mistovich RJ, “Artificial Intelligence Promotes the Dunning Kruger Effect: Evaluating ChatGPT Answers to Frequently Asked Questions About Adolescent Idiopathic Scoliosis,” Journal of the American Academy of Orthopaedic Surgeons 33(9):473–480, May 2025. https://doi.org/10.5435/JAAOS-D-24-00297