Why AI Can’t Say “I Don’t Know”


By Fernanda Arreola and David Jaidan

Skillful orators know that, to convince their audience, it can pay to deliver their message with a tone of overweening confidence. Although many might apply a filter of healthy skepticism to content delivered in such a way, what about information delivered with apparently equal confidence by an AI system?

The Illusion of Omni-knowledge

Artificial intelligence systems like ChatGPT, Claude, Perplexity, Gemini, Copilot, and similar large language models (LLMs) have become, perhaps, a bit too present in our daily quest for feedback and answers. We enjoy using them because they generate original, easy-to-read, and “easy to believe” responses. This versatility, however, masks a very profound limitation: these systems struggle to acknowledge the boundaries of their knowledge. They rarely say, “I don’t know,” and when they do not know, they generate content that seems to have the same confident tone as what we believe to be reliable responses.

As researchers, users, and developers of AI-based tools, the authors of this article aim to understand why AI systems exhibit this “overconfidence problem,” or “omni-knowledge,” what it means for users, and what can be done to address this significant usage limitation.

The Architecture of Omni-knowledge: How Large Language Models Work

To understand why AI systems struggle to accept their own limitations, we must first understand how they work. Large language models are not knowledge databases in the traditional sense; they are pattern-matching mechanisms trained on vast amounts of text to predict which words are likely to follow others. This is what allows them to answer a question: the response is assembled from the words that sit closest, statistically, to the words of the question.1
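The prediction-by-proximity idea can be made concrete with a toy sketch. This is not how a real LLM is built (actual models use neural networks over billions of parameters, not word counts), but it shows the core mechanic the paragraph describes: given a context, emit the continuation that appeared most often near that context in the training text. The corpus and contexts below are invented for illustration.

```python
from collections import Counter, defaultdict

# Invented miniature "training corpus" for illustration only.
corpus = "the capital of france is paris . the capital of italy is rome .".split()

# Count which word follows each two-word context in the training text.
following = defaultdict(Counter)
for a, b, c in zip(corpus, corpus[1:], corpus[2:]):
    following[(a, b)][c] += 1

def predict(context):
    """Return the statistically closest continuation and its estimated probability."""
    counts = following[tuple(context.split()[-2:])]
    total = sum(counts.values())
    word, n = counts.most_common(1)[0]
    return word, n / total

print(predict("france is"))  # ('paris', 1.0)
```

Note that the model never “knows” anything about France; it only knows that “paris” tends to follow “france is”. When no close pattern exists, a real model still has to produce *some* continuation, which is where the trouble described below begins.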

When they do not know, they generate content that seems to have the same confident tone as what we believe to be reliable responses.

However, this association of ideas reaches its limits when the words and documents available are not sufficiently close to the question asked. This is when the phenomenon of AI hallucination appears: the AI generates plausible but entirely fabricated information.2 Remember that uncle who had “travelled the world and had all the answers”? Hallucination works much the same way: your uncle may never have been to India, but, drawing on his experience in Pakistan, he projects the similarities and differences between the two from what he has gathered in books, and he answers questions as if he had been there. OpenAI acknowledges the issue, noting that “hallucinations remain a fundamental challenge for all large language models” even as capabilities improve, occurring when models “confidently generate an answer that isn’t true”.3

The deeper issue is that theoretical work has demonstrated that hallucination is not merely a technical problem to be solved through better training but “an innate limitation” of LLMs. As explained above, this behavior sits at the very basis of how the algorithms function: they find close relationships and deliver an answer, which can force them to create nonexistent “realities” that resemble the relationships they have found.4 This mathematical truth means that no amount of engineering can eliminate hallucination; it can only be reduced.

This point is also reinforced by recent work from OpenAI, Why Language Models Hallucinate, which shows that hallucinations arise even when models are well trained, because prediction-based systems must generate a plausible continuation when certainty is low.

What Can Be Done: Addressing AI Omni-knowledge


Mitigating the confidence problem in AI systems requires coordinated action across multiple stakeholders: AI developers improving systems, users developing critical engagement practices, and institutional frameworks establishing appropriate use boundaries.

From a technical perspective, AI could be given certain limitations, although none will completely cover all potential subjects and issues likely to be generated after a prompt is issued. Researchers at Oxford University have developed methods to detect when a large language model is likely to “hallucinate” using semantic entropy, with the study’s senior author noting that “getting answers from LLMs is cheap, but reliability is the biggest bottleneck. In situations where reliability matters, computing semantic uncertainty is a small price to pay.”5
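The intuition behind semantic entropy can be sketched in a few lines: sample several answers to the same question, group them by meaning, and measure how scattered the groups are. If the model keeps giving semantically different answers, it is probably guessing. This is a simplified illustration, not the published method; in particular, real semantic-entropy implementations use a model to judge whether two answers mean the same thing, whereas the `cluster` function below is a crude stand-in (lowercased exact match).

```python
import math
from collections import Counter

def semantic_entropy(sampled_answers, cluster=lambda s: s.strip().lower()):
    """Entropy (in bits) over meaning-clusters of repeated answers.

    0 means the model answers consistently; higher values mean the
    answers scatter across different meanings -- a hallucination signal.
    """
    groups = Counter(cluster(a) for a in sampled_answers)
    total = sum(groups.values())
    return -sum((n / total) * math.log2(n / total) for n in groups.values())

print(semantic_entropy(["Paris", "paris", "Paris"]))     # low: answers agree
print(semantic_entropy(["Paris", "Lyon", "Marseille"]))  # high: answers scatter
```

The appeal of this approach is that it needs no access to the model’s internals, only repeated sampling, which matches the quoted point that the extra computation is “a small price to pay” when reliability matters.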

This is why the creators of AI systems should start designing explicit models of what domains they have reliable training in, versus areas where coverage is sparse. When queries enter low-coverage domains, the system could proactively acknowledge limitations like, for example, stating: “This topic has limited representation in my training data; my response may be unreliable.” Implementing this requires solving the difficult problem of automatic domain classification and knowledge coverage assessment.
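A minimal sketch of such a limitation notice might look like the wrapper below. Everything here is an assumption for illustration: the `COVERAGE` table is invented (real training-data coverage statistics are not public, and estimating them is exactly the hard problem the paragraph names), and `generate` stands in for whatever model call produces the answer.

```python
# Hypothetical per-domain coverage scores (0 = no training data, 1 = rich coverage).
COVERAGE = {"general science": 0.9, "2024 local regulations": 0.2}

DISCLAIMER = ("This topic has limited representation in my training data; "
              "my response may be unreliable.\n")

def answer_with_disclaimer(query, domain, generate, threshold=0.5):
    """Prepend a limitation notice when the query falls in a low-coverage domain."""
    reply = generate(query)
    if COVERAGE.get(domain, 0.0) < threshold:
        reply = DISCLAIMER + reply
    return reply
```

The wrapper itself is trivial; the substance lies in filling `COVERAGE` honestly, which is why the paragraph above frames automatic domain classification and coverage assessment as the difficult part.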

Another possibility that is beginning to be used is that of hybrid systems. These systems combine language models with real-time search and fact-checking capabilities to verify claims before presenting them. When generating factual claims, the system would automatically search reliable sources to confirm accuracy, flagging or revising statements that cannot be verified. This approach faces challenges of computational cost, determining authoritative sources, and handling queries where verification isn’t straightforward.
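The generate-then-verify loop of such a hybrid system can be sketched as follows. The `search_sources` function is hypothetical; a production system would call a real search API and a fact-checking model, and would also have to split free text into checkable claims, which this sketch sidesteps by taking claims as input.

```python
def verify_claims(claims, search_sources):
    """Flag each factual claim as verified or unverified before presenting it.

    `search_sources` is a stand-in for retrieval against reliable sources:
    it takes a claim and returns a (possibly empty) list of supporting evidence.
    """
    report = []
    for claim in claims:
        evidence = search_sources(claim)
        status = "verified" if evidence else "unverified - flag for review"
        report.append((claim, status))
    return report
```

Even this skeleton makes the stated challenges visible: every claim costs a retrieval call (computational cost), `search_sources` embeds a judgment about which sources count as authoritative, and claims that are opinions or predictions have no evidence to find at all.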

However, technical solutions are only part of the story. A second, equally important source of overconfidence lies in the training process itself.

Training-Induced Overconfidence: Why the Model Learns to Sound Sure

There is another reason why AI systems often sound more confident than they should. It does not come from the model’s architecture, but from the way these systems are trained to interact with humans.

During training, models are rewarded for being helpful, fluent, and responsive. Answers that are complete and confident usually receive better human ratings. In contrast, answers that express uncertainty or decline to answer a question tend to be rated lower. Over time, the model learns that sounding confident is a winning strategy, even when it is not fully sure about the information it provides.

This creates a second layer of overconfidence: the model is not “punished” strongly enough for giving a wrong answer with a confident tone, so it keeps doing it. OpenAI, Anthropic, and DeepMind have all acknowledged this issue and noted that today’s systems are still not penalized enough for confident mistakes.

Over time, the model learns that sounding confident is a winning strategy.

OpenAI’s research suggests a fundamental fix: “Penalize confident errors more than you penalize uncertainty, and give partial credit for appropriate expressions of uncertainty,” noting that while “some standardized tests have long used versions of negative marking for wrong answers or partial credit for leaving questions blank to discourage blind guessing,” the problem is that “if the main scoreboards keep rewarding lucky guesses, models will keep learning to guess.”
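The negative-marking idea in the quote can be written down as a toy scoring rule. The exact weights below are illustrative assumptions, not values from OpenAI’s work; what matters is the ordering: a confident error costs more than an honest abstention.

```python
def score(answer, truth, abstain="I don't know"):
    """Toy grading rule that discourages blind guessing.

    Correct answer: +1.  Abstaining: +0.3 (partial credit for honest
    uncertainty).  Confident wrong answer: -1 (penalized more than abstaining).
    """
    if answer == abstain:
        return 0.3
    return 1.0 if answer == truth else -1.0
```

Under this rule, guessing one of four options yields an expected score of 0.25 × 1 + 0.75 × (−1) = −0.5, worse than the 0.3 earned by abstaining, so a model trained against it no longer learns that lucky guesses pay. Under the usual scoreboard (1 for right, 0 for anything else), the same guess expects 0.25 versus 0 for abstaining, which is exactly the incentive the quote criticizes.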

Systems could be trained to more readily decline queries outside their reliable knowledge base rather than generating plausible-sounding guesses. This requires overcoming the training bias toward helpfulness and developing better internal mechanisms for recognizing when queries exceed reliable knowledge boundaries. The challenge is balancing legitimate refusal of unknowable queries against excessive refusal that makes systems less useful.

Five Practical Ways Users Can Reduce the Risk of AI Overconfidence

Even with better models and smarter training, users still play an important role in reducing the impact of AI overconfidence and omni-knowledge. But this responsibility has limits: no one can verify everything, and not all users share the same level of expertise. What matters is developing a practical mindset for working with systems that are powerful, but not reliably self-aware. Here are five practices that make a meaningful difference without placing an unrealistic burden on the user.

1. Verify information, not just the source

Asking for sources can help, but citations from AI are often incomplete, misattributed, or entirely fabricated. What really matters is checking whether a claim can be confirmed independently through trusted references, expert literature, or reputable databases. Verification should focus on the content, not the link the model provides.

2. Pay attention to claims that sound precise without showing evidence

Highly detailed answers are not always wrong, but unsupported precision is a red flag. When a system offers exact figures, specific procedures, or step-by-step methods without citing where they come from, treat them as hypotheses, not facts. Reliable knowledge should show its foundations.

3. Ask the model to explain what might be uncertain, while knowing its limits

Models can’t truly assess their own confidence, but they can reveal parts of an answer that depend on weak generalizations or limited training data. Asking questions like “What parts of this answer rely on assumptions?” forces the system to unpack its reasoning. This does not guarantee safety, but it reduces blind acceptance.

4. Use multiple independent lenses

Instead of relying on a single AI output, compare answers from different models, from human experts, or from domain-specific tools. Consistency across independent sources increases reliability; divergence signals uncertainty. Think of AI as a starting point, not a final authority.

5. Know the kinds of tasks where AI is reliable and where it is not

AI can be excellent for brainstorming, rewriting, summarizing, or explaining concepts. But tasks involving high-stakes reasoning, precise data, legal interpretation, medical judgment, academic writing, or numerical accuracy carry higher risks. The key is matching the tool to the task: use AI where the cost of error is low and human verification is easy.

Closing insight

None of these practices eliminates hallucinations, and they cannot compensate for structural or training-based causes of overconfidence. But they give users a realistic and sustainable way to engage critically with AI systems, without demanding expert-level knowledge or constant vigilance. The goal is not perfection; it is safer, more informed interaction.

About the Authors

Fernanda Arreola is a Professor of Strategy, Innovation, and Entrepreneurship at ESSCA. Her research interests focus on service innovation, governance, and social entrepreneurship. Fernanda has held numerous managerial posts and possesses a range of international academic and professional experiences.

David Jaidan is the founder of Minerva View, a location intelligence SaaS company delivering strategic insights. Operating at the intersection of technology, business, and research, he made significant contributions to privacy-aware and explainable AI at Scalian, a leading European consulting firm. His scientific work at Météo-France has been recognized and cited in the IPCC report. He is also an AI lecturer at business schools and engineering schools.
