
Beyond "Rigid Workflows" and "Uncontrolled Black Boxes" - How TeliChat Reshapes the Core Architecture of LLM-Powered Conversational Agents

May 7, 2026 · 17 min read

Founder, CTO

If you have ever built a customer-service conversational AI agent for real users, you have probably felt a deep sense of helplessness.

Developers of customer-service conversational AI agents today are caught in a difficult dilemma:

On one side, you can choose the Workflow model, using common graph-based or flow-based orchestration tools like Dify, Coze, and LangGraph. The logic is precise and tightly controlled, but the interaction is extremely rigid. As soon as the user goes slightly off script (for example, by interrupting, changing previously provided information, answering a later question first, or suddenly asking about a policy), the whole flow can collapse, like an "artificial idiot" that can only read from a script.

On the other side, you can embrace ReAct-style autonomous agents such as OpenClaw and Claude Code. They are flexible, but they are also unpredictable black boxes: slow, expensive, prone to hallucination, and liable to take unauthorized actions, such as issuing a refund to a customer without proper approval.

In building enterprise-grade customer service bots, the core conflict is actually very clear: there is a natural tension between the nonlinearity of natural language and the strict rule-bound nature of business logic.

Human conversations are not linear. Users provide information out of order, correct themselves, add new details, suddenly switch topics, and move back and forth between multiple tasks. But enterprise business processes demand strict rules, strict validation, strict compliance, traceability, and reproducibility.

Therefore, while preserving "natural and fluid conversational interaction," the industry seems to face an "impossible triangle":

High Reliability: Enterprise-grade applications must not allow unauthorized actions or hallucinations.
Complex Logic: The system must go beyond simple RAG-based Q&A and actually execute complex, multi-step business operations in backend systems.
Quick Response: The system must respond within seconds, as customer service scenarios require.

Today, we want to introduce a new conversational AI agent engine: TeliChat. With its original "ChatTree + InfoItem" white-box architecture, TeliChat breaks this impossible triangle and combines complex logic, reliable output, flexible interaction, and fast response.

1. Why Popular ReAct-Style Autonomous Agents Fail at Enterprise-Grade Conversations

Before diving into TeliChat, let's first look at the barriers currently faced by popular ReAct-style autonomous agents, such as OpenClaw and Claude Code, in complex business conversation scenarios.

1. The Ultimate Trade-Off Between "Autonomy" and "Compliance"

The core of autonomous agents like Claude Code is "reasoning." You give them a goal, and they figure out how to achieve it.

This is excellent for writing code, but it can be a disaster for banking account opening, insurance claims, airline ticket changes and refunds, financial risk control, or government approval workflows.

Enterprises absolutely cannot allow AI to "reason through" business process rules on its own. If the model "feels sorry for the customer" and bypasses risk controls to trigger a refund, the consequences could be severe.

Remember: whoever makes the decision bears the responsibility. Model-based decisions bring flexibility, but only rules—in other words, code-based decisions—can guarantee absolute reliability.
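To make "code-based decisions" concrete, here is a minimal sketch of a refund decision expressed as fixed rules. The policy constants, field names, and the `decide_refund` function are hypothetical and chosen only for illustration; they are not TeliChat's API or any real refund policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefundRequest:
    order_total: float        # amount originally paid
    refund_amount: float      # amount the customer is asking for
    days_since_purchase: int
    risk_flagged: bool        # set by an upstream risk-control system


# Hypothetical policy constants, for illustration only.
MAX_REFUND_DAYS = 30
AUTO_APPROVE_LIMIT = 200.0


def decide_refund(req: RefundRequest) -> str:
    """Return 'approve', 'escalate', or 'reject' using fixed rules only."""
    if req.refund_amount <= 0 or req.refund_amount > req.order_total:
        return "reject"
    if req.days_since_purchase > MAX_REFUND_DAYS:
        return "reject"
    if req.risk_flagged or req.refund_amount > AUTO_APPROVE_LIMIT:
        return "escalate"     # a human must approve
    return "approve"
```

The same input always produces the same decision, and no amount of persuasive wording from the user can change which branch is taken.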

2. The Breakdown of Conversation State Management

Autonomous agents are good at "one-off, complex tasks," such as helping you get a codebase running. But they are not good at maintaining a "20-turn conversation that spans three days and contains frequent jumps."

For example, the user may say halfway through:

"Hold on, I need to go pick up a package. Remind me what we were talking about in ten minutes."

Or while processing a ticket refund, the user suddenly asks:

"By the way, what's your checked baggage allowance?"

Or after giving the refund reason, the user suddenly says:

"The order number I gave earlier was wrong. It should be another one."

A pure agent can easily fall into context confusion, or even forget previously collected key parameters.

3. Painfully High Cost and Frustrating Latency

Agents at the level of Claude Code consume a large number of tokens at every step for chain-of-thought reasoning or "thinking mode."

For a customer service system with millions of daily active users, even a simple "hello" may require the model to think for several seconds—or even more than ten seconds. The enterprise token bill can explode instantly, while users churn because they are forced to wait too long.

2. The Three Capability Levels of LLMs in Customer Service Conversations

To understand the value of TeliChat, we first need to clarify the capability boundaries of LLMs in customer service conversations.

It is not that LLMs are "unusable." Quite the opposite: LLMs are already very strong at certain levels. The key issue is this: what a model is good at does not necessarily mean it should be responsible for the entire business process.

We can break down the capabilities of LLMs in customer service conversations into three levels.

1. Single-Turn Natural Language Understanding

Can LLMs solve this? Yes, and they already do it very well.

For example, tasks such as identifying user intent, extracting an order number, and determining whether the user is making an inquiry, complaint, delivery follow-up, or ticket change request are all single-turn natural language understanding tasks where LLMs already perform extremely well.

Evolution trend over the next three years:

We expect accuracy to rise from roughly 95% to over 99%, with support for more languages, more dialects, and more colloquial expressions.

Will TeliChat's technical moat be weakened?

Not at all.

This is because TeliChat already relies on LLMs for this layer. The stronger LLMs become, the better TeliChat's product experience becomes. TeliChat is not competing with LLMs; it puts LLMs in the role where they are most effective.

2. Simple Multi-Turn Conversations

Can LLMs solve this? Partially.

For example, LLMs are becoming increasingly stable at handling 3 to 5 turns of linear dialogue, simple information collection, and simple topic switching.

Evolution trend over the next three years:

LLMs will be able to handle 3 to 5 turns of linear conversation and deal with simple topic switches without making obvious low-level mistakes.

Will TeliChat's technical moat be weakened?

It may be slightly weakened, but the impact will be minimal.

That is because enterprise-grade customer service conversations usually average 8 to 12 turns, and they frequently involve nonlinear jumps, parallel intents, information correction, task interruption, and task resumption.

The improvement of simple multi-turn conversations will raise the baseline customer service experience. But truly complex enterprise-grade conversations still require strong state management and deterministic process control.

3. Complex Business Process Execution

Can LLMs solve this? They can never solve it completely.

Complex business processes are not just about "saying a few more sentences." They involve multiple branches, multiple conditions, multiple parallel tasks, multiple backend systems, multiple permission boundaries, and long-lived conversation state.

Evolution trend over the next three years:

LLMs will be able to handle very simple, branch-free linear workflows. But for complex business processes involving multiple branches, multiple conditions, multiple parallel tasks, and long conversations, they will still make frequent mistakes.

Will TeliChat's technical moat be weakened?

Not only will it not be weakened, it will be significantly strengthened.

The more simple scenarios LLMs can handle, the higher users' expectations for complex scenarios will become, and the more urgently enterprises will demand deterministic process execution.

In other words, the stronger LLMs become, the more enterprises will realize:

What is truly scarce is not a "model that can talk," but a "state management engine that can safely connect natural language to complex business systems."

3. Why Pure Prompt Engineering Can Never Solve the Third-Level Problem

Many people will say:

"Can't I just use Chain of Thought, structured output, and function calling to make the LLM execute workflows?"

This is another major misconception.

Prompt engineering is essentially: writing rules for the LLM in natural language.

But it has three fatal limitations that cannot be overcome.

1. Context Window Limitations

No matter how large the context window is, it is still finite.

In long conversations, the LLM will inevitably forget earlier information, causing state loss. Even when the context window is large enough to hold the whole conversation, filling it still brings cost, latency, and attention dilution.

Enterprise-grade customer service is not one-off Q&A. It is a long-term, multi-turn, interruptible, resumable stateful interaction. A context window alone cannot reliably carry business state.

2. Probabilistic Nature

No matter how strict your prompt is, an LLM always has some probability of not following the rules.

That probability may be very low, but for enterprise-grade scenarios, anything greater than zero is unacceptable.

Banking, insurance, aviation, government services, and e-commerce after-sales scenarios do not allow the model to "occasionally" bypass permissions, "occasionally" misjudge a rule, or "occasionally" call an API that should not be called.
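A quick back-of-the-envelope calculation shows why "very low" is not low enough at scale. The sketch below assumes, as a simplification, that each conversation independently has the same probability p of a rule violation; the 0.01% rate and the volume of one million conversations are hypothetical numbers, not measurements.

```python
def prob_at_least_one_violation(p: float, n: int) -> float:
    """Probability of one or more violations in n independent conversations."""
    return 1.0 - (1.0 - p) ** n


def expected_violations(p: float, n: int) -> float:
    """Expected number of violations in n conversations."""
    return p * n


# Hypothetical numbers: a 0.01% violation rate and one million conversations.
p, n = 1e-4, 1_000_000
print(expected_violations(p, n))            # about 100 violations
print(prob_at_least_one_violation(p, n))    # effectively 1.0
```

Under these assumptions, a violation is not a rare accident but a near certainty, and it happens on the order of a hundred times.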

3. Zero Debuggability

When the LLM makes a mistake, you do not truly know why it made the mistake, nor can you reliably reproduce and fix it.

All you can do is keep modifying the prompt, running trial and error, and relying on experience and luck. This is unimaginable for enterprise systems that need to operate stably.

TeliChat's conversation state management engine solves exactly these three problems:

It has an independent, persistent, structured state space that is not limited by the context window, so long conversations never lose state.
Business process logic is guaranteed by hard-coded rules in code, making it 100% deterministic and free from probabilistic errors.
All state changes are logged, traceable, debuggable, and reproducible, allowing problems to be quickly located and fixed.
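As a rough illustration of these three properties, the sketch below keeps conversation state in a structured store outside the model's context and records every change so that past states can be rebuilt. The `InfoState` and `Change` classes are hypothetical names invented for this example, not TeliChat's actual implementation, and persistence to disk or a database is left out.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Change:
    turn: int
    name: str
    old: Optional[Any]
    new: Any


@dataclass
class InfoState:
    """Structured conversation state kept outside the LLM context window."""
    values: Dict[str, Any] = field(default_factory=dict)
    log: List[Change] = field(default_factory=list)

    def set(self, turn: int, name: str, value: Any) -> None:
        """Store a value and log the change if it differs from the old one."""
        old = self.values.get(name)
        if old != value:
            self.log.append(Change(turn, name, old, value))
            self.values[name] = value

    def replay(self, up_to_turn: int) -> Dict[str, Any]:
        """Rebuild the state as it was after a given turn, from the log alone."""
        state: Dict[str, Any] = {}
        for change in self.log:
            if change.turn <= up_to_turn:
                state[change.name] = change.new
        return state


state = InfoState()
state.set(1, "order_id", "A123")
state.set(2, "refund_reason", "damaged")
state.set(5, "order_id", "B456")   # the user corrects the order number later
```

After turn 5 the store holds the corrected order number, while `state.replay(2)` still reproduces what the system believed at turn 2, which is what makes a faulty conversation reproducible.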

4. TeliChat's Breakthrough: From "Process Control" to "State-Driven Interaction"

To solve these problems, TeliChat proposes a core principle:

Let code handle business logic, and let the model handle only language understanding and generation.

TeliChat's core architecture consists of a "ChatTree" based on a DAG topology and "InfoItems" as composable state.

Three Capabilities, Precisely Separated to Eliminate Hallucination

ChatTree: Based on a directed acyclic graph, it precisely expresses interaction logic and state transitions.
Large language model: Freed from the burden of "making decisions" and "calling tools," it focuses only on language tasks: global intent recognition, information extraction, and natural language generation.
Python code: Executes complex business logic, permission checks, conditional validation, and backend API orchestration at specific nodes.

Three Layers of Constraints to Suppress Hallucination

TeliChat tightly constrains the LLM through three layers:

Topology: Limits where the conversation is allowed to go.
Information state: Determines what is already known, what is still missing, and which information has been updated.
Python code: Handles the actual business decisions, validation, and execution.

The LLM is no longer a free-form "business decision-maker." Instead, it becomes a clearly constrained "language understanding engine" and "response generation engine."
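The division of labor can be sketched in a few lines. This is a simplified illustration under our own assumptions, not TeliChat's real API: `Node`, `advance`, and the refund tree are hypothetical, and every non-leaf node is assumed to carry a `route` function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Node:
    name: str
    required: List[str]                               # InfoItems needed here
    route: Optional[Callable[[dict], str]] = None     # branch choice, in code
    children: Dict[str, "Node"] = field(default_factory=dict)


def advance(node: Node, info: dict) -> Tuple[Node, str]:
    """Walk down the tree as far as the known information allows."""
    while True:
        missing = [item for item in node.required if item not in info]
        if missing:
            return node, f"ask:{missing[0]}"    # the LLM only phrases the question
        if not node.children:
            return node, "done"
        node = node.children[node.route(info)]  # code, not the model, picks the branch


auto = Node("auto_refund", required=[])
manual = Node("manual_review", required=["contact_phone"])
root = Node(
    "refund",
    required=["order_id", "reason"],
    route=lambda info: "auto" if info["reason"] == "damaged" else "manual",
    children={"auto": auto, "manual": manual},
)
```

Here the topology limits where the conversation can go, the `info` dictionary records what is known, and the `route` function makes the business decision. With only an order number, `advance(root, ...)` stays at `refund` and asks for the reason; once the reason is known, the branch is chosen by code.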

5. How Is This Different from Other Frameworks in the Industry?

TeliChat was not created in isolation. Let's look at how it differs from today's mainstream frameworks.

VS Dify / Coze / Workflow Tools

Workflow tools like Dify and Coze are very suitable for building one-way data processing flows or RAG-based Q&A, such as A -> B -> C.

But human conversations are not linear.

Users may answer a later question first, revise an earlier answer, suddenly ask about a policy, change their mind midway, or move back and forth between refunds, logistics, invoices, and membership benefits.

If you use a traditional Node-Edge topology graph to express all of this logic, you quickly run into combinatorial explosion. A simple 5-step refund process can rapidly turn into an unmaintainable "spaghetti graph" if you account for the possibility that the user may ask about policy, change their mind, or switch topics at every step.
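A rough count makes the growth visible. The counting model below is our own simplification: it assumes an explicit graph needs one edge for every jump between two steps and a there-and-back pair of edges for every digression at every step, and that a flow tracking which answers are already filled needs one variant per subset of steps.

```python
def explicit_edges(steps: int, digressions: int) -> int:
    """Edges needed if every jump and every detour is drawn by hand."""
    jumps = steps * (steps - 1)           # from any step to any other step
    detours = 2 * steps * digressions     # out to a digression and back again
    return jumps + detours


def fill_patterns(steps: int) -> int:
    """Distinct 'which answers are already known' situations."""
    return 2 ** steps


def state_driven_declarations(steps: int, digressions: int) -> int:
    """One declaration per piece of information plus one per global intent."""
    return steps + digressions


print(explicit_edges(5, 3), fill_patterns(5), state_driven_declarations(5, 3))
print(explicit_edges(10, 5), fill_patterns(10), state_driven_declarations(10, 5))
```

Under this model, the 5-step refund flow with 3 kinds of digression already needs 50 edges and faces 32 fill patterns, against 8 declarations in a state-driven design. Doubling the flow to 10 steps with 5 digressions gives 190 edges and 1024 fill patterns, against 15 declarations.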

At their core, many graph-based or flow-based orchestration tools are just beautified decision trees.

TeliChat is different. Its node transitions are dynamically driven by a combination of static topology + real-time information state + current user intent. Users can provide information out of order, add corrections, skip steps, insert new topics, resume previous tasks, or jump wildly between multiple topics. TeliChat's global intent recognition and information extraction can handle all of this gracefully, while traditional Workflow systems often have no choice but to restart the process.
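A toy version of this turn handling might look as follows. It is a sketch under stated assumptions: `Turn`, `handle_turn`, the `REQUIRED` list, and the `FAQ` table are hypothetical, and the intent and extracted items are assumed to come from the LLM.

```python
from dataclasses import dataclass, field
from typing import Dict

REQUIRED = ["order_id", "reason", "refund_method"]        # hypothetical refund flow
FAQ = {"ask_baggage_policy": "Hypothetical policy text."}  # side questions


@dataclass
class Turn:
    intent: str                                           # from intent recognition
    items: Dict[str, str] = field(default_factory=dict)   # from information extraction


def handle_turn(info: Dict[str, str], turn: Turn) -> str:
    """Merge new information, answer side questions, then resume the task."""
    info.update(turn.items)          # out-of-order answers and corrections land here
    prefix = ""
    if turn.intent in FAQ:
        prefix = FAQ[turn.intent] + " "   # answer the digression without resetting
    missing = [name for name in REQUIRED if name not in info]
    if missing:
        return f"{prefix}Please provide: {missing[0]}"
    return f"{prefix}Refund request ready for validation."
```

Because the next question is derived from what is still missing rather than from a fixed position in a script, a user who gives the refund reason first, asks about baggage in the middle, and corrects the order number at the end never forces a restart.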

VS LangGraph: The Graph State Machine Representative

LangGraph has also recognized the importance of loops and state graphs.

However, LangGraph relies heavily on developers manually writing complex routing logic. As the business grows, it can easily turn into an unmaintainable "spaghetti graph."

A single TeliChat ChatTree can support hundreds or thousands of nodes and InfoItems. It defines state as the combination of all InfoItems and provides higher-level semantic controls, such as mandatory-answer configuration, implicit or explicit semantics, and restatement strategies. This greatly reduces the cognitive burden of maintaining complex conversation graphs.

In other words, TeliChat does not force developers to write a pile of if-else-style routing rules. Instead, it uses structured InfoItems and a state-driven mechanism to let complex conversations flow naturally.

VS Rasa CALM: The Growing Pains of Traditional Conversation Frameworks

Rasa recognized the pain point of rigid flows and introduced CALM. It does alleviate the rigidity of traditional flows through LLM-driven dialogue management.

TeliChat aligns with Rasa CALM in philosophy: both believe that LLMs should not decide the next step of business logic.

However, the Rasa ecosystem has historically been relatively heavy, and CALM still carries a configuration-driven DNA. Complex YAML is extremely unfriendly to developers and comes with a steep learning curve. YAML lacks the dynamism, type checking, and readability of programming languages, which keeps the implementation cost of complex business logic high.

TeliChat, by contrast, is code-first. It integrates seamlessly with Python and uses the global context ctx to share data among the model, the tree, and code. Developers can express complex business logic in a familiar programming language instead of struggling to maintain piles of configuration in YAML.

VS Emerging Open-Source Frameworks Such as Parlant

Emerging frameworks such as Parlant are heading in the right direction: principle-driven and policy-driven, rather than path-driven.

However, in terms of engineering maturity, response speed, complex API orchestration, and deep integration with existing enterprise backend systems such as ERP and CRM, they are still at an early, toy-like stage.

Parlant mainly focuses on predictable API-driven agents, while TeliChat goes further by delivering extreme engineering practicality and agility:

Seamless integration with Python;
Code-based expression of complex business logic;
Shared data through the global context ctx;
Automatic rendering of ChatTrees;
IDE-level white-box debugging;
Enterprise-grade response speed and state observability.

VS Black-Box SaaS Leaders Such as Sierra / Decagon / Fin.ai

Leading SaaS vendors such as Sierra, Decagon, and Fin.ai encapsulate their underlying elegant frameworks—"intent routing + state management + LLM generation"—into black boxes and sell them directly to business teams without engineering capabilities.

This is attractive to small and medium-sized companies or purely business-oriented teams. But for financial institutions, government agencies, large e-commerce platforms, and other enterprises with large engineering teams and strong requirements for data privacy and control over core business logic, the problem is obvious:

They cannot buy the right "shovel," so they are forced to build everything from scratch.

Sierra adopts a fully managed model similar to an "Agent OS," where customers cannot truly own the underlying building capabilities and can only adjust behavior through configuration.

Decagon provides Agent Operating Procedures, or AOPs, but still requires significant engineering involvement. At its core, it remains a "black-box application," not an open tool.

The business model of these vendors is to replace human labor, not to empower developers. Therefore, they have no incentive to open up their underlying construction tools.

TeliChat is positioned differently:

It is not a black-box customer service SaaS that does everything on behalf of enterprises. It is a white-box shovel placed directly in the hands of developers.

6. Why Developers Will Love TeliChat: Engineering Power and Exceptional Experience

For developers, the most exciting part of TeliChat is that it truly delivers a white-box and code-first development paradigm.

1. Dual-Mode Construction with What-You-See-Is-What-You-Get

You can write logic directly in Python, and TeliChat will automatically generate an interactive HTML conversation topology graph that supports dragging, zooming, and hover-based prompt inspection.

You can also let business users draw a ChatTree in Xmind, and TeliChat can parse and run it directly.

The two modes are fully equivalent and interoperable:

Developers use Python to ensure the rigor of complex logic;
Business users participate in process design through Xmind;
The system automatically generates visual topology maps to help teams align their understanding.

This means conversation systems no longer have to be hidden inside prompts and black-box agents like some kind of dark art. Instead, they become real engineering systems that can be seen, discussed, debugged, and evolved.

2. An Exceptional IDE-Level White-Box Debugging Experience

Tired of guessing why an LLM is acting unpredictably inside a black box?

TeliChat provides end-to-end observability. More importantly, it supports setting breakpoints directly in VS Code.

Whether before or after a node is executed, you can pause and clearly inspect the current state of all "InfoItems" and Python variables.

You can see the current user intent, which InfoItems have been filled, which InfoItems have been corrected by the user, why the conversation arrived at the current node, and why the next step is being judged in a certain way.

This is what a true enterprise-grade development experience looks like.

3. Extremely Fast Response Speed

TeliChat meets the strict requirement of responding within seconds. What is the secret?

It completely abandons ReAct loops, LLM function calling, and LLM thinking mode.

The LLM only performs intent recognition and information extraction, and it writes its output in a non-JSON format. This greatly reduces the time to first token and lowers API call costs.
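The exact output format is not specified here, so the following is only a sketch assuming a simple line-based `name=value` format, which avoids the braces, quotes, and escaping that JSON requires. The `parse_llm_output` function and the `intent` key are hypothetical.

```python
from typing import Dict, Tuple


def parse_llm_output(text: str) -> Tuple[str, Dict[str, str]]:
    """Parse 'name=value' lines into an intent and a dict of extracted items."""
    intent = ""
    items: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue                       # skip blank or malformed lines
        key, value = line.split("=", 1)    # split on the first '=' only
        key, value = key.strip(), value.strip()
        if key == "intent":
            intent = value
        else:
            items[key] = value
    return intent, items


raw = "intent=refund\norder_id=A123\nreason = item arrived damaged\n"
intent, items = parse_llm_output(raw)
```

A format like this can be parsed line by line as it streams in, and a malformed line costs one field rather than invalidating the whole response.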

The time saved is spent on fast backend logic executed in Python.

This means:

Simple questions do not require long LLM reasoning;
Complex business logic is executed quickly by local code;
Enterprises do not need to pay high token costs for every step of agent reasoning;
Customer service scenarios can truly achieve responses within seconds.

4. Declarative Prompts That Eliminate Tedious Prompt Engineering

TeliChat completely moves away from tedious prompt engineering and adopts declarative prompts.

You do not need to repeatedly write long prompts to "persuade" the model to follow rules. You only need to describe in natural language what information is needed, what the judgment criteria are, and what the extraction rules are. TeliChat can then incorporate these elements into its ChatTree and information-item system to drive execution.

Prompts are no longer the carrier of business logic. They are merely the instruction manual for language understanding tasks.

The real business logic remains guaranteed by code and the state machine.
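One way to picture a declarative prompt is as a small data declaration that the engine turns into an extraction instruction. The sketch below is illustrative only: `InfoItemSpec`, `build_extraction_prompt`, and the prompt wording are assumptions made for this example, not TeliChat's actual interface.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InfoItemSpec:
    name: str          # identifier used by code and by the state store
    description: str   # what information is needed, in natural language
    rule: str          # how to extract or judge it, in natural language


def build_extraction_prompt(specs: List[InfoItemSpec], user_message: str) -> str:
    """Assemble an extraction instruction from declarative item descriptions."""
    lines = [
        "Extract the following items from the user message.",
        "Answer with one name=value line per item found; omit absent items.",
        "",
    ]
    for spec in specs:
        lines.append(f"- {spec.name}: {spec.description} Rule: {spec.rule}")
    lines += ["", f"User message: {user_message}"]
    return "\n".join(lines)


specs = [
    InfoItemSpec(
        name="order_id",
        description="The order number the user wants refunded.",
        rule="Letters and digits only.",
    ),
]
prompt = build_extraction_prompt(specs, "Refund order A123 please")
```

The developer describes each item once; nothing in the prompt says what happens after the item is extracted, because that decision lives in code.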

Conclusion: Put the LLM Back Where It Belongs

Building reliable conversational AI agents should not be a blind-box game of "praying" to the LLM.

TeliChat proves that with the right architecture—namely, ChatTree + InfoItems—we can cleanly separate "deterministic business logic" from "empathetic natural language interaction."

Let Python code do what it does best: rigorous validation, permission checks, state management, and API calls.

Let the LLM do what it does best: empathetic communication, intent recognition, information extraction, and natural language expression.

No more unbearable latency.

No more uncontrolled business overreach.

No more rigid workflows that fall apart when users interrupt.

TeliChat is building a true white-box conversational agent engine for enterprise production environments.

Feedback and criticism are very welcome. If you have any questions or suggestions, please click "Discussion" or "Questions" in the upper left corner of the current page.

