[This is part of a new blog series on the more technical lessons learned from building autonomous AI agents in the real world.]
Most online tutorials for building AI agents and chatbots look something like:

1. Take in the customer’s message.
2. Do the work and send a response.
That sounds quite straightforward, but there’s a really important and really hard step “1.5” – determining the customer’s intent.
Determining intent—whether from a chat conversation, a forwarded email, or a webhook—is central to building effective AI Agents. And the way we do it at Teammates reflects one of our core design philosophies:
The real world is messy – only agents with intuition can navigate it effectively.
So let’s talk about how and why we use large language models (LLMs) – which are both slower AND more expensive – instead of traditional classifiers for intent detection.
At first blush our choice may seem counterintuitive. Traditional language classifiers work great – they are mature, super fast, and inexpensive. And cloud services like AWS Lex or Google’s Dialogflow offer them as hosted services. To set one up, you define a set of intents, train a model on examples for each, and watch it go to work.
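To make that concrete, here’s a minimal sketch of the classic setup, using scikit-learn as a stand-in for a hosted service like Lex or Dialogflow (the intent names and training examples are invented for illustration):

```python
# A minimal traditional intent classifier: enumerate intents, train on labeled
# examples, predict exactly one label per message.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled dataset - every intent has to be anticipated up front.
examples = [
    ("Where is my order?",            "order_status"),
    ("I never received my package",   "order_status"),
    ("I want my money back",          "refund_request"),
    ("Refund me for the broken item", "refund_request"),
    ("Please cancel my subscription", "cancel_subscription"),
    ("Stop billing me every month",   "cancel_subscription"),
]
texts, labels = zip(*examples)

classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
classifier.fit(texts, labels)

# Fast and cheap - but anything outside the trained intents gets forced into
# the nearest existing label, with no notion of "I don't know".
print(classifier.predict(["Send the report over"]))
```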
Sounds easy enough, but this assumes you have a comprehensive, labeled dataset and that your intents won’t change much over time. That works really well when you’re building a support chatbot, or when you artificially limit the scope of interactions and contexts an Agent has to parse. But we are building Teammates to be different: we want our Agents to cope with the way humans actually think and communicate.
Consider a few real-world examples:
Each of these scenarios presents ambiguities that pre-trained static classifiers could never resolve. To address them, you’d need to anticipate every possible input ahead of time—an impossible task. And even if you could, the system would break as soon as the next unexpected input came along.
So why use an LLM when we’re not generating language but looking for intent—which is basically an enumeration?
LLMs offer a fundamentally different approach. Instead of matching inputs to predefined labels, they work amazingly well at interpreting the intent behind an input based on their vast pre-trained knowledge. This allows them to handle scenarios they haven’t explicitly seen before.
Here’s how LLMs make this possible:
For example, a classifier might categorize “Send the report over” as “file transfer” without understanding which report, where it needs to go, and over what channels. An LLM (properly prompted with the right context) can infer whether this is a document upload, an email attachment, or a database query.
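Here’s a rough sketch of what that can look like in practice. The model name, prompt, and output schema below are placeholders (not Teammates’ actual implementation), but the shape of the idea holds: give the model the message plus whatever context you have, and ask for a structured interpretation instead of a bare label.

```python
# Illustrative only: an LLM-based intent detector that returns structure,
# not just a label. Model name and JSON schema are placeholders.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def detect_intent(message: str, context: str, model: str = "gpt-4o-mini") -> dict:
    prompt = (
        "You are the intent detector for a virtual teammate.\n"
        f"Known context:\n{context}\n\n"
        f"Incoming message:\n{message}\n\n"
        "Return JSON with keys: intent, target, channel, missing_info."
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},  # ask for parseable JSON
    )
    return json.loads(response.choices[0].message.content)

# "Send the report over" is ambiguous on its own; with context, the model can
# infer it's an email attachment rather than a file upload or database query.
print(detect_intent(
    "Send the report over",
    "Yesterday this user asked for the Q3 sales report and normally "
    "receives documents as email attachments.",
))
```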
At Teammates, LLMs are the cornerstone of how we detect intent, but they’re only part of the story. The goal isn’t just to identify what someone wants—it’s to turn that understanding into action. In other words: leverage the nuance made possible through LLM Intent Detection to create and deploy complex work plans with minimal human intervention.
This pipeline allows us to move seamlessly from chat → intuition → structured prompt → work done well.
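In code, a stripped-down version of that pipeline might look like the sketch below. The WorkPlan fields and handler registry are invented for illustration, and it reuses the detect_intent sketch from above:

```python
# Rough sketch of chat -> intuition -> structured prompt -> work.
# WorkPlan fields and the handler registry are invented for illustration;
# detect_intent is the LLM-based sketch shown earlier.
from dataclasses import dataclass, field

@dataclass
class WorkPlan:
    intent: str                                       # what the user wants done
    parameters: dict                                  # details the LLM extracted
    missing_info: list = field(default_factory=list)  # gaps to resolve before acting

HANDLERS = {
    "email_report": lambda plan: f"Emailing {plan.parameters.get('target')}...",
    "upload_file":  lambda plan: f"Uploading {plan.parameters.get('target')}...",
}

def run_pipeline(message: str, context: str) -> str:
    interpretation = detect_intent(message, context)   # intuition
    plan = WorkPlan(                                   # structured plan
        intent=interpretation.get("intent", "unknown"),
        parameters=interpretation,
        missing_info=interpretation.get("missing_info") or [],
    )
    if plan.missing_info:                              # ask only when truly stuck
        return f"Before I start: {plan.missing_info[0]}"
    handler = HANDLERS.get(plan.intent)                # work
    return handler(plan) if handler else "Escalating this one for human review."
```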
For engineers, the implications are significant. With LLMs, you don’t need to predefine every intent or rely on brittle rule-based systems. Instead, you focus on building robust prompts and workflows that adapt to real-world variability. This flexibility not only accelerates development but also ensures that your system can evolve as new edge cases arise.
Sadly, there are no free lunches. The obvious downsides of using LLMs instead of traditional classifiers are increased cost and increased latency. The former matters to the business; the latter is critical to the user experience. To address both, we likely don’t want to use the largest foundation model for intent detection, but rather the smallest, fastest model that can get the job done. This is where a comprehensive set of evals is worth its weight in gold. (More on this in the coming weeks, but it turns out that writing effective evals is not as simple as it seems.)
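As a rough illustration of that tradeoff, a tiny harness can run the same labeled messages through each candidate model and compare accuracy against latency. The test cases and model names below are made up, and it assumes the detect_intent sketch from earlier:

```python
# Tiny eval sketch: compare candidate models on accuracy and latency.
# Test cases and model names are placeholders; detect_intent is the
# LLM-based sketch shown earlier (it accepts a model parameter).
import time

EVAL_CASES = [
    # (message, context, expected intent)
    ("Send the report over",
     "User asked for the Q3 sales report and receives documents by email.",
     "email_report"),
    ("Can you push that deck to the shared drive?",
     "The deck was finalized this morning.",
     "upload_file"),
]

def evaluate(model: str) -> dict:
    correct, latencies = 0, []
    for message, context, expected in EVAL_CASES:
        start = time.perf_counter()
        result = detect_intent(message, context, model=model)
        latencies.append(time.perf_counter() - start)
        correct += int(result.get("intent") == expected)
    return {
        "model": model,
        "accuracy": correct / len(EVAL_CASES),
        "median_latency_s": sorted(latencies)[len(latencies) // 2],
    }

# Pick the smallest, fastest model whose accuracy still clears the bar.
for candidate in ("gpt-4o-mini", "gpt-4o"):
    print(evaluate(candidate))
```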
At Teammates, we’ve invested heavily in intent classification because it aligns with how modern work happens: in real-time, with incomplete information, and in a world that refuses to stay predictable.
The deeper challenge isn’t just identifying intent—it’s doing so in a way that feels seamless to the user. Normal people shouldn’t have to think about how AI works under the hood. The ideal system writes its own prompts, interprets its own edge cases, and gets the job done without requiring the user to spell out every detail.
This is the vision we’re building toward at Teammates: virtual employees with artificial intuition. By combining the interpretive power of LLMs with structured workflows and context-aware systems, we’re creating AI collaborators that thrive in the messy, bespoke reality of human work.