Tools

Why Trustworthy AI Needs More Than Language Models

Most discussions about building AI applications revolve around the model: which one to use, how to prompt it, how fast it responds, how natural it sounds. These are important questions, but they miss something more fundamental. The reliability of an AI system is not determined by the model alone. It is determined by how that model interacts with the rest of the software system.

This is where the idea of tools comes in.

Tools are rarely discussed outside engineering circles, and even there they are often treated as an implementation detail. In practice, they represent one of the most important design decisions in modern AI applications.


Not all questions are the same

Consider two questions that might appear in a product interface:

“Explain why this situation might be risky.”
“Why is this project marked AMBER?”

The first question asks for reasoning and interpretation. The second asks for verification. They may look similar on the surface, but they require fundamentally different kinds of answers.

Large Language Models excel at the first type of question. They can synthesize information, articulate trade-offs, and explain complex situations in clear language. What they cannot do reliably is confirm the truth of a specific, current fact—especially when that fact lives inside documents, databases, or systems the model cannot see.

Confusing these two kinds of questions is one of the most common failure modes in AI products.


The probabilistic nature of language models

Language models work by predicting the most likely next words based on patterns learned from large amounts of data. Given the same input, they may produce slightly different responses each time. They often sound confident even when they are uncertain.

This probabilistic behavior is not a flaw. It is precisely what makes language models so effective at communication and reasoning. But it also means they are not designed to act as sources of truth.

When an AI system answers a factual question without checking anything, it is not retrieving information. It is guessing—albeit in a very sophisticated way. The result may sound convincing, but it is not guaranteed to be correct.

Plausibility and correctness are not the same thing.


Where facts actually live

In any real system, facts live somewhere very specific. They live in documents, databases, APIs, reports, and logs. These systems are designed to be deterministic: given the same input, they return the same output. They can be audited. They can be versioned. They have owners who are responsible for their correctness.

When you care about questions like “What is the current status?”, “When did this change?”, or “Who made this decision?”, these are the systems you trust—not a language model.

This distinction matters because it defines where responsibility lies. Language models reason. Deterministic systems assert facts.


What a tool really is

A tool is the mechanism that connects these two worlds.

At its core, a tool is just application code. It has a clear input and output. It reads from a real, deterministic source—such as a document or a database—and returns verified information. Crucially, it behaves the same way every time.

The tool does not live inside the model. The model does not execute it. The model can only request it. The application decides whether to run the tool and what to do with the result.

This separation is subtle but essential. It ensures that the model never becomes the authority on facts.
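
To make this concrete, here is a minimal sketch of a tool in Python. Everything in it is hypothetical: the function name, the fields it returns, and the in-memory table standing in for a real database or API.

```python
from dataclasses import dataclass

# Illustrative sketch only: a "tool" is plain, deterministic application code.
# The in-memory table below stands in for a real system of record.
_STATUS_TABLE = {
    "proj-42": {"status": "AMBER", "updated_at": "2024-05-01T09:30:00Z"},
}

@dataclass
class ProjectStatus:
    project_id: str
    status: str       # e.g. "GREEN", "AMBER", "RED"
    updated_at: str   # timestamp from the system of record
    source: str       # where the fact came from, so the answer can cite it

def get_project_status(project_id: str) -> ProjectStatus:
    """Same input, same output: reads a verified fact instead of guessing."""
    row = _STATUS_TABLE[project_id]  # in practice, a database query or API call
    return ProjectStatus(project_id, row["status"], row["updated_at"], "status_db")
```

The important property is not the specific source but the behavior: the same project ID always yields the same answer, and the answer carries its provenance.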


Tool calling as a design pattern

In a well-designed AI system, the flow looks something like this:

A user asks a question. The model analyzes the question and determines whether it can answer based on reasoning alone or whether it needs verified information. If verification is required, the model asks the application to run a tool. The application executes deterministic code to fetch the relevant facts. The model then explains the result in human-readable language.

The model reasons. The tool verifies.
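
A schematic version of this loop, reusing the get_project_status sketch above, might look like the following. The model client and its respond method are placeholders rather than any real SDK, and the shape of a tool request varies between providers; the point is the division of labor, in which the application, not the model, executes the tool.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical interface the model client is assumed to expose.
@dataclass
class ToolRequest:
    name: str
    arguments: dict

@dataclass
class ModelReply:
    text: str
    tool_request: Optional[ToolRequest] = None

# Registry of tools the application is willing to run.
TOOLS: dict[str, Callable] = {"get_project_status": get_project_status}

def answer(question: str, model) -> str:
    # The model reasons about the question and may request a tool.
    reply: ModelReply = model.respond(question, tools=list(TOOLS))
    if reply.tool_request is None:
        return reply.text  # reasoning alone was enough

    # The application, not the model, decides to run deterministic code.
    tool = TOOLS[reply.tool_request.name]
    result = tool(**reply.tool_request.arguments)

    # The model explains the verified fact in human-readable language.
    return model.respond(question, tool_result=result).text
```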

This pattern is not about making the model more powerful. It is about making the system more honest.


Why this matters more than it seems

Without tools, AI systems tend to fail quietly. They provide answers even when they should not. Over time, users stop trusting them—not because the language is bad, but because the facts are unreliable.

With tools, AI systems gain an important capability: the ability to say, “I checked.” They can cite where an answer came from, acknowledge when verification is not possible, and distinguish between interpretation and truth.

For engineers, tools create clear contracts and testable behavior. For product managers, they define where truth comes from. For designers, they make it possible to design interfaces that communicate confidence, uncertainty, and sources clearly.


A useful way to think about it

If a question requires explanation, let the model reason.
If a question requires verification, require a tool.

This simple rule prevents a large class of failures. It also leads to AI systems that feel more trustworthy, even when they admit limitations.


Summary

Tools are not an advanced feature layered on top of AI. They are a foundational design principle.

As language models become more capable, the temptation will be to let them answer everything. The better path is to be more disciplined: to know when the model should stop and ask the system for facts.

Trustworthy AI is not about making models sound smarter. It is about designing systems that know when to verify before they speak.
