Customer Service Full-Process Agent Loop

Client: Confidential Southeast Asia Leading SaaS Platform

Industry: SaaS Platform (Southeast Asia)

Core Tech: Multi-tier Agent + MCP + Token101

ROI: First-contact resolution +40% | Response time -70% | API cost -45%

Boundary: Anonymized implementation brief

01 / Context

The Support Queue That Never Shrinks.

Our client, a leading SaaS platform in Southeast Asia, handles 500,000+ customer service conversations monthly across web chat, mobile app, email, and social messaging channels. Their support operation was caught in a permanent state of catch-up.

First-line agents, many of them newly hired and still learning the product, provided inconsistent answers to the same question depending on who was online. Complex issues required multiple escalation handoffs — L1 to L2 to engineering — each handoff adding hours and losing context. The knowledge base, maintained manually, was perpetually outdated as the product shipped new features faster than the documentation team could update articles.

02 / Friction

Why 'Upgrade Your Chatbot' Doesn't Fix Customer Service.

The client had cycled through three generations of chatbot vendors. Each hit the same operational walls.

1. The Dead-End Conversation Problem: Chatbots could answer simple FAQs but could not take action. 'How do I reset my password?' worked. 'Reset my password and send me the confirmation' did not. The bot could only point to help articles; it could not execute the request.
2. The Escalation Black Hole: When the chatbot failed, the escalation to a human agent lost all context. The customer had to re-explain their issue from scratch. The human agent had no visibility into what the bot had already tried.
3. The Misclassification Tax: The existing system classified tickets by keyword matching, resulting in a 30%+ misclassification rate. Billing questions landed in technical support. Feature requests were routed to the bug team. Each misroute added hours to resolution time.
4. The Flat Cost Problem: Every conversation, whether a simple 'what are your hours?' or a complex multi-step account migration, consumed the same AI resources at the same cost. At 500,000 conversations per month, this was economically broken.

03 / Solution

From Chatbot to Closed-Loop Resolution Engine.

[Diagram: Customer intent, MCP action agents, Knowledge healing]

Support moves from answer generation to closed-loop resolution, escalation context, and knowledge repair.

I. Intent Routing Layer (3-tier)

L1 Classification (Haiku): Categorizes queries into top-level buckets in under 200ms at minimal cost.
L2 Understanding (Sonnet): Parses full intent, extracts entities, and determines the resolution path.
L3 Action Planning (Opus, when needed): For complex multi-step issues, generates a step-by-step resolution plan.
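The three tiers above can be read as a pipeline in which the most expensive tier runs only when the previous one says it is needed. The following is a minimal sketch under that reading, not the client's implementation. The names `route`, `RoutingResult`, the `multi_step` flag, and the three stub functions are hypothetical; in a real deployment each stub would be a call to the corresponding model.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical tier labels. The brief names Haiku, Sonnet, and Opus but does
# not give model identifiers or API calls, so none are assumed here.
TIER_L1, TIER_L2, TIER_L3 = "haiku", "sonnet", "opus"


@dataclass
class RoutingResult:
    category: str          # top-level bucket from L1
    intent: dict           # parsed intent and entities from L2
    plan: list[str]        # resolution steps from L3 (empty if L3 was skipped)
    tiers_used: list[str]  # which tiers actually ran, for cost accounting


def route(
    query: str,
    classify: Callable[[str], str],
    understand: Callable[[str, str], dict],
    plan: Callable[[str, dict], list[str]],
) -> RoutingResult:
    """Run L1, then L2, and invoke L3 only for multi-step issues."""
    tiers = [TIER_L1]
    category = classify(query)

    tiers.append(TIER_L2)
    intent = understand(query, category)

    steps: list[str] = []
    if intent.get("multi_step"):  # assumption: L2 flags complex issues
        tiers.append(TIER_L3)
        steps = plan(query, intent)

    return RoutingResult(category, intent, steps, tiers)


# Stand-in functions so the sketch runs without any model access.
def stub_classify(query: str) -> str:
    return "billing" if "refund" in query.lower() else "general"


def stub_understand(query: str, category: str) -> dict:
    return {"category": category, "multi_step": " and " in query.lower()}


def stub_plan(query: str, intent: dict) -> list[str]:
    return ["verify account", "execute action", "confirm with customer"]


simple = route("What are your hours?", stub_classify, stub_understand, stub_plan)
complex_case = route(
    "Process my refund and downgrade my plan",
    stub_classify, stub_understand, stub_plan,
)
```

Recording `tiers_used` per conversation is one way to make the cost of each routing decision visible later.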

II. Agent Execution Layer (4 Agents)

A Knowledge Retrieval Agent queries the RAG knowledge base for up-to-date documentation. A Tool Invocation Agent executes actions through MCP-connected systems — resets passwords, processes refunds, upgrades subscriptions. A Ticket Management Agent creates and updates tickets with full context. An Escalation Routing Agent transfers complex issues to human agents with a complete context package: transcript, attempted resolutions, system state, and recommended next steps.
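The context package that the Escalation Routing Agent hands to a human can be pictured as a small structured record. The sketch below is illustrative only: the four fields come from the description above, while the class name `EscalationPackage`, the field types, the JSON serialization, and the sample values are assumptions.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class EscalationPackage:
    """Hypothetical shape of the handoff record passed to a human agent."""
    transcript: list[str]              # full conversation so far
    attempted_resolutions: list[str]   # what the agents already tried
    system_state: dict[str, str]       # relevant account or system facts
    recommended_next_steps: list[str]  # suggestions for the human agent

    def to_json(self) -> str:
        # Serialize so the package can be attached to a ticket as-is.
        return json.dumps(asdict(self), indent=2)


# Sample values are invented for illustration.
package = EscalationPackage(
    transcript=[
        "Customer: I was charged twice this month.",
        "Agent: I can see two charges on your account.",
    ],
    attempted_resolutions=["Looked up invoices", "Refund attempt was declined"],
    system_state={"plan": "pro", "open_invoices": "2"},
    recommended_next_steps=["Review the duplicate charge with billing"],
)
payload = json.loads(package.to_json())
```

Carrying the attempted resolutions alongside the transcript is what lets the human agent avoid repeating steps the customer has already been through.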

III. Self-Healing Knowledge Layer

Bad Case Detection automatically logs failure patterns when conversations result in negative feedback or escalation. Knowledge Gap Identification flags low-confidence retrieval results for the documentation team. Auto-Update Pipeline ingests product changelogs and release notes, transforming them into knowledge base articles — keeping the system current with each product release.
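The first two mechanisms above reduce to simple rules over conversation outcomes. A minimal sketch follows, assuming each conversation record carries a topic, a retrieval confidence score, and feedback and escalation flags. The record shape, the function names, and the 0.6 threshold are hypothetical; the brief does not state them.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical cutoff below which a retrieval result counts as low confidence.
CONFIDENCE_THRESHOLD = 0.6


@dataclass
class Conversation:
    topic: str
    retrieval_confidence: float  # assumed to be in the range 0.0 to 1.0
    negative_feedback: bool
    escalated: bool


def is_bad_case(c: Conversation) -> bool:
    """Bad case: the conversation ended in negative feedback or escalation."""
    return c.negative_feedback or c.escalated


def is_knowledge_gap(c: Conversation, threshold: float = CONFIDENCE_THRESHOLD) -> bool:
    """Knowledge gap: retrieval confidence fell below the threshold."""
    return c.retrieval_confidence < threshold


def summarize(conversations: list[Conversation]) -> dict[str, dict[str, int]]:
    """Count bad cases and knowledge gaps per topic for the documentation team."""
    bad = Counter(c.topic for c in conversations if is_bad_case(c))
    gaps = Counter(c.topic for c in conversations if is_knowledge_gap(c))
    return {"bad_cases": dict(bad), "knowledge_gaps": dict(gaps)}


# Invented sample records for illustration.
sample = [
    Conversation("sso-setup", 0.35, negative_feedback=False, escalated=True),
    Conversation("sso-setup", 0.40, negative_feedback=True, escalated=False),
    Conversation("billing", 0.90, negative_feedback=False, escalated=False),
    Conversation("exports", 0.50, negative_feedback=False, escalated=False),
]
report = summarize(sample)
```

Grouping by topic turns individual failures into a ranked list of articles to write or update.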

IV. MCP + Token101

All agents connect to the client's existing CRM, ticketing system, payment processor, and product API through a unified MCP layer, so no changes to existing systems were required. Token101 tiered routing sends simple queries to Haiku, standard support to Sonnet, and complex decisions to Opus. Result: API costs reduced by approximately 45%.
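The saving from tiered routing follows from a weighted average: the blended cost per conversation is the sum over tiers of traffic share times unit cost, compared against a baseline where every conversation uses one tier. The sketch below shows that arithmetic only. The traffic shares and relative unit costs are invented for illustration and are not the client's figures, so the result is not expected to match the reported 45%. The single-tier baseline is also an assumption.

```python
import math


def blended_cost(shares: dict[str, float], unit_costs: dict[str, float]) -> float:
    """Average cost per conversation given each tier's traffic share."""
    if shares.keys() != unit_costs.keys():
        raise ValueError("shares and unit_costs must cover the same tiers")
    if not math.isclose(sum(shares.values()), 1.0):
        raise ValueError("traffic shares must sum to 1")
    return sum(shares[tier] * unit_costs[tier] for tier in shares)


def savings_vs_flat(
    shares: dict[str, float],
    unit_costs: dict[str, float],
    flat_tier: str,
) -> float:
    """Fractional saving versus sending every conversation to flat_tier."""
    return 1.0 - blended_cost(shares, unit_costs) / unit_costs[flat_tier]


# Hypothetical inputs: costs are relative to the most expensive tier.
example_shares = {"haiku": 0.5, "sonnet": 0.4, "opus": 0.1}
example_costs = {"haiku": 0.1, "sonnet": 0.5, "opus": 1.0}

example_saving = savings_vs_flat(example_shares, example_costs, flat_tier="opus")
```

The function makes the sensitivity explicit: the saving depends on both the traffic mix and which tier is taken as the flat baseline.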

From Responding to Resolving.

+40%: First-contact resolution rate increase, driven by the Tool Invocation Agent's ability to execute actions.

-70%: Average response time reduction, with L1 classification completing in under 200ms.

-45%: API cost reduction via Token101 tiered routing.

Hours: Knowledge base lag reduced from weeks to hours via the auto-update pipeline.

"We don't just chat with customers; we close the loop from intent to resolution."
