Interview with David Villalón
The future of AI belongs not to those with the best models, but to those who can orchestrate systems, governance, and complexity into lasting value.
AI success in enterprises depends less on models and more on managing complexity, controlling costs, and designing reliable systems at scale. David Villalón is Co-founder and CEO of Maisa, an enterprise AI company focused on building accountable and auditable systems for regulated industries, and previously led product and AI initiatives at Voicemod and Clibrain. He explains why organisations struggle to move AI from experimentation to production, arguing that the real challenges lie in process design, governance, and operational architecture rather than model capability alone.
You were working with AI well before it became a mainstream topic. What first sparked your interest in this space, and how did those early experiences shape the way you think about building technology today?
I’ve always been on the more applied side, the side of actually using AI. What attracted me most was that, for the first time, I was starting to see its potential, and I found the way it was framed really inspiring.
Those experiences helped me understand how AI worked from an application standpoint: what building blocks you needed for solutions to actually work, even though back then we weren’t anywhere close to that. More than the solutions themselves, what stuck with me was the mental model of how AI operates beneath the surface. That was key because it gave me a grounding that didn’t come from ChatGPT, where everything looks like magic, but from having lived through how things had to be “built” for something like ChatGPT to work, and how the models had to be used before then.
Later, during the Voicemod era, working on voice, I learned what it’s like to put AI technology into production for millions of people: the prioritization, the testing, and so on. Obviously, that process has been refined a lot since then.
That experience let me build a mental model without the noise of mainstream AI, which has turned into a real tornado, a storm of noise. That’s how I see it.
Having helped scale products used by millions of people, what have you learned about what really makes a digital product succeed once it leaves the “idea stage”?
What really helps a digital product get past the idea stage is constant iteration and, more than solving a real problem, empathizing with the real user need you’re addressing, even when it isn’t explicit.
For example, if you’re building a voice changer and you understand, from how your target niche uses it and what they do with it, that what you’re really giving them is a way to express themselves through audio, then you can build a product around that.
And building a product around that need is a competitive advantage, because it lets you get deep into what the user wants to hear, what they really want to have, or even what they aren’t yet aware of. That’s what lets you stay ahead.
Many companies today are eager to use AI, but the path from experimentation to something they can fully rely on is not always smooth. Where do you see most organisations getting stuck in that journey?
Organisations get stuck because they don’t understand the complexity of these systems at scale, and because they assume that if a prompt works once for one person, it will work at scale and it will work every time.
Most companies that use AI for personal productivity get by just fine, though they’ll probably learn soon enough that they’re making mistakes. But when you try to apply AI at scale, the problems you have to solve aren’t AI problems: they’re change management problems, process problems, access and permissions problems. They’re far more complex problems, and here I’m thinking specifically of large organisations.
On top of that, people greatly underestimate how hard it is to make these systems stable, because they’re dynamic systems, and a dynamic system, by definition, can’t be put in a box. You almost have to manage it in real time and set boundaries to keep it under control. That’s something people usually don’t take into account.
As AI begins to move into more critical areas of business, what are the most important questions leaders should be asking before they trust it in day-to-day operations?
The first question leaders should ask is simple: what happens when this system is confidently wrong? Hallucinations are treated as a quality annoyance in demos, where a wrong answer is harmless, and you just try again. In critical operations, that same wrong answer can go straight into a contract, a financial figure, a customer commitment, or a compliance filing, and the damage is real and sometimes irreversible. So, the question is less about how often the system is right on average and more about the cost of the worst case and how often it can pass unnoticed.
From there, a few questions follow. Where does a hallucination actually hurt us, and which processes can never absorb one? How would we even catch a wrong answer, given that these systems are fluent and sound just as confident when they are wrong as when they are right? Where does a human have to sign off before an output affects the business, and where can the system run on its own? Most leaders only ask these questions after something has already gone wrong, when the honest answer is that they had no way to catch it.
The deeper point is that trusting AI in day-to-day operations is not about hoping the model stops hallucinating. It is about designing the work so that a hallucination cannot quietly cause harm. At Maisa, this is part of why we run processes as discrete steps backed by code rather than as one open generation, so the system follows a defined path you can trace and verify instead of producing an answer you simply have to take on faith.
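The idea of discrete, traceable steps with a human sign-off at the points that cannot absorb an error can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Maisa's implementation: the names `Step`, `Trace`, `execute`, and the `approve` callback (which stands in for a human reviewer) are all hypothetical, as is the 10,000 threshold.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Step:
    name: str
    run: Callable[[dict], dict]   # deterministic code for this step
    needs_signoff: bool = False   # True where a wrong output cannot be absorbed


@dataclass
class Trace:
    entries: List[str] = field(default_factory=list)


def execute(steps: List[Step], data: dict,
            approve: Callable[[str, dict], bool], trace: Trace) -> Optional[dict]:
    """Run steps in order, log each output, and stop if a reviewer rejects one."""
    for step in steps:
        data = step.run(data)
        trace.entries.append(f"{step.name}: {data}")
        if step.needs_signoff and not approve(step.name, data):
            trace.entries.append(f"{step.name}: rejected by reviewer")
            return None
    return data


# Hypothetical two-step process: parse an amount, then flag large values.
def parse_amount(d: dict) -> dict:
    return {**d, "amount": float(d["raw"].replace(",", ""))}


def flag_large(d: dict) -> dict:
    return {**d, "large": d["amount"] > 10_000}  # threshold is illustrative


steps = [Step("parse_amount", parse_amount),
         Step("flag_large", flag_large, needs_signoff=True)]


# Stand-in for a human reviewer: rejects anything flagged as large.
def reviewer(step_name: str, d: dict) -> bool:
    return not d["large"]


trace = Trace()
result = execute(steps, {"raw": "12,500.00"}, reviewer, trace)   # stopped at sign-off
result2 = execute(steps, {"raw": "500"}, reviewer, Trace())      # passes through
```

In this sketch the large amount never reaches the business unreviewed, and `trace.entries` records what each step produced, so a wrong value can be located rather than taken on faith.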
When organisations start relying on new technology in their core operations, what often gets underestimated in making sure it works consistently over time?
The cost of tokens once you leave the pilot. In a pilot, the numbers look trivial because the volume is tiny and only a few people are running it. Production is a different world. When the same system runs across the whole operation, every day, against real volume, the token cost scales with it, and what looked negligible becomes a line item that can quietly break the economics of the project. The pilot proves the system works, but it tells you almost nothing about what it costs at scale.
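A back-of-the-envelope calculation makes the gap between pilot and production concrete. Every figure below is an illustrative assumption (runs per day, tokens per run, and price per million tokens are invented for the example and do not come from the interview or any vendor's price list).

```python
def monthly_token_cost(runs_per_day: int, tokens_per_run: int,
                       usd_per_million_tokens: float, days: int = 30) -> float:
    """Token spend for one month at a flat price per million tokens."""
    total_tokens = runs_per_day * tokens_per_run * days
    return total_tokens * usd_per_million_tokens / 1_000_000


# Hypothetical pilot: a handful of users, 50 runs a day.
pilot = monthly_token_cost(runs_per_day=50, tokens_per_run=20_000,
                           usd_per_million_tokens=10.0)

# Same system, same prompt size, rolled out across the operation.
production = monthly_token_cost(runs_per_day=100_000, tokens_per_run=20_000,
                                usd_per_million_tokens=10.0)

print(f"pilot: ${pilot:,.0f}/month, production: ${production:,.0f}/month")
```

Under these assumed numbers the pilot costs 300 dollars a month and production costs 600,000: nothing about the system changed except volume, which is why a pilot says so little about cost at scale.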
This is the problem we set out to solve at Maisa, and we solve it at the architecture level. Most tools that claim to control cost are really just routing tools: a task comes in, they forward it to whichever model looks adequate, and that model still does the whole task as one opaque generation. The only thing optimised is which model answers, and nothing gets produced that makes the next run any cheaper.
We work one level deeper. Our KPU, the Knowledge Processing Unit, owns the process rather than the model. It sits on top of any model and uses AI to break a process into discrete steps backed by code. That unlocks three things. We use AI only to define the workflow, then execute each step in code in real time, which directly reduces token usage and delivers up to 10 times lower cost. The work stays separate from the model, so you can swap in a smaller, cheaper model whenever one is enough, with no rebuilding. And every run leaves a trace you can turn into fine-tuning data for a custom model built around the exact use case, so over time, the expensive general-purpose calls get replaced by something leaner that already knows what to do.
So, the difference is architectural. Other approaches optimise the model. We optimise the work itself, and then use the record of that work to lower the cost on every future run.
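The pattern described here, using a model once to define the workflow and then executing each step in plain code, can be sketched as follows. This is a toy illustration of the general idea, not the KPU itself: `StubModel`, `STEP_LIBRARY`, `build_workflow`, and `run_workflow` are hypothetical names, the stub returns a fixed plan instead of calling a real model, and the 21% tax rate is an arbitrary example value.

```python
from typing import Callable, Dict, List, Tuple

# Deterministic step implementations: plain code, no model call at run time.
STEP_LIBRARY: Dict[str, Callable[[dict], dict]] = {
    "round_amount": lambda d: {**d, "amount": round(d["amount"], 2)},
    "apply_tax": lambda d: {**d, "total": round(d["amount"] * 1.21, 2)},  # illustrative rate
}


class StubModel:
    """Stands in for any LLM; counts calls so the saving is visible."""

    def __init__(self) -> None:
        self.calls = 0

    def plan(self, process_description: str) -> List[str]:
        self.calls += 1
        # A real model would derive this from the description; fixed here.
        return ["round_amount", "apply_tax"]


Workflow = List[Tuple[str, Callable[[dict], dict]]]


def build_workflow(model: StubModel, description: str) -> Workflow:
    """One model call turns a process description into named code steps."""
    return [(name, STEP_LIBRARY[name]) for name in model.plan(description)]


def run_workflow(workflow: Workflow, record: dict) -> Tuple[dict, List[tuple]]:
    """Execute every step in code and keep a per-step trace of the outputs."""
    trace = []
    for name, step in workflow:
        record = step(record)
        trace.append((name, dict(record)))
    return record, trace


model = StubModel()
workflow = build_workflow(model, "compute invoice totals")
results = [run_workflow(workflow, {"amount": a})[0] for a in (100.0, 200.0)]
```

However many records pass through `run_workflow`, `model.calls` stays at 1 in this sketch, and because the workflow only depends on the plan, a different model could be passed to `build_workflow` without touching the steps.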
There is a lot of excitement around AI, but also a lot of assumptions. What is one belief about AI that you think leaders should rethink today?
The belief I would push leaders to drop is that a bigger, more capable model is the answer to everything. The whole industry is focused on the frontier model as the prize, and the assumption is that if you gain access to the most powerful one, the problem is solved. In practice, the model is only one piece, and often the smallest part of what makes something work in production.
Most of the value comes from everything around the model: how you break the work into steps, how you keep it within boundaries, how you control cost, how you make it consistent across thousands of cases every day. Leaders who fixate on the model tend to underinvest in all of that, and then wonder why a system that demoed beautifully falls apart once it meets real operations.
The other half of the same belief is that one good result proves the system works. A great answer in a meeting is a demo, and a demo tells you almost nothing about behaviour at scale. I would rather see leaders treat that first success as a question than as proof: it worked here, so what would it take for this to work everywhere, every day, at a cost we can live with? That shift in mindset is worth more than access to any single model.
As AI continues to develop and become part of everyday business decisions, how do you see it reshaping the way organisations operate, and where should leaders be focusing their attention now?
I think the biggest shift is that organisations stop being built around tasks and start being built around decisions. A lot of the work that fills people’s days today, the gathering, the sorting, the first draft, gets absorbed by AI. What stays with people is judgment: deciding what matters, deciding where the line is, deciding when the system is wrong. That changes what a team is for, and what you hire for.
The second shift is that the org chart starts to include digital workers alongside human ones. These systems take on real processes end-to-end, so leaders have to manage them the way they manage a team: giving them clear boundaries, knowing what each is responsible for, and being able to check their work. The companies that treat AI as a set of disposable answers will struggle here. The ones that treat it as part of how the organisation actually runs will pull ahead.









