
How to Build a Customer Service AI Agent with Microsoft

April 22, 2026 · 10 min read · Michael Ridland

Customer service is the most common starting point for enterprise AI agents, and for good reason. The business case is straightforward - you have a high volume of repetitive enquiries, you know the correct answers, and every minute a human spends on a routine question is a minute they're not spending on complex cases that actually need human judgment.

We've built customer service AI agents for insurance companies, professional services firms, telcos, and government agencies across Australia. Here's how to do it properly using Microsoft's AI stack.

Why Microsoft for Customer Service Agents

Before getting into the how, let me address the why. You have options - you could build on AWS Bedrock, Google Vertex AI, or even open-source models running on your own infrastructure. We recommend the Microsoft stack for customer service agents when:

  • Your organisation already uses Microsoft 365 (most of the knowledge base is probably in SharePoint)
  • Your contact centre runs on Teams or Dynamics 365 Customer Service
  • Microsoft Entra ID (formerly Azure Active Directory) manages your identities
  • You need Australian data residency (Azure has data centres in Sydney and Melbourne)
  • Your compliance team needs a vendor with established certifications (ISO 27001, SOC 2, IRAP)

If those conditions apply - and they do for most Australian enterprises we work with - the Microsoft AI Agent Framework gives you the shortest path to production.

Architecture Overview

A production customer service AI agent has more moving parts than most people expect. Here's the architecture we typically deploy:

User Interface Layer:

  • Microsoft Teams (for internal helpdesk agents)
  • Web chat widget (for customer-facing agents)
  • Email integration (for handling emailed enquiries)

Agent Layer (Semantic Kernel):

  • Orchestration agent that manages the conversation
  • Intent classification to understand what the customer wants
  • Tool selection to determine which backend system to query
  • Response generation with tone and formatting appropriate to the channel

Knowledge and Data Layer:

  • Azure AI Search over your knowledge base (FAQs, product documentation, policies)
  • Dynamics 365 or CRM connector for customer account data
  • Order management system connector for transaction history
  • Azure Cosmos DB for conversation memory and context persistence

Safety and Governance Layer:

  • Azure AI Content Safety for filtering inappropriate content
  • Guardrails for preventing the agent from making promises it can't keep
  • Human handoff logic for escalation
  • Audit logging via Application Insights

Step-by-Step Build Guide

Step 1 - Map Your Enquiry Types

Before writing any code, analyse your actual customer enquiries. Pull data from your existing ticketing system and categorise every enquiry type by:

  • Volume: How many of each type per month?
  • Complexity: Can it be answered from a knowledge base, or does it require reasoning?
  • Data needed: What systems need to be queried to answer it?
  • Risk level: What happens if the agent gets it wrong?

In our experience, 60-80% of customer service volume falls into 10-15 enquiry types. Start with the top 5 that are high-volume, low-complexity, and low-risk.

Example breakdown from an insurance client:

| Enquiry Type | Monthly Volume | Complexity | Data Source | Risk |
| --- | --- | --- | --- | --- |
| Policy details lookup | 850 | Low | CRM | Low |
| Claims status check | 620 | Low | Claims system | Low |
| Coverage questions | 480 | Medium | Knowledge base + CRM | Medium |
| Make a payment | 310 | Low | Payment gateway | Medium |
| File a new claim | 290 | High | Claims system | High |
| Policy change request | 260 | Medium | CRM | Medium |
| Complaint | 180 | High | CRM + escalation | High |
We started with policy details and claims status - high volume, low complexity, low risk. The agent was handling those two categories within four weeks. Coverage questions came next. New claims and complaints stayed with human agents, with the AI agent collecting initial information before handoff.
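That selection logic can be sketched as a simple filter-and-rank over the enquiry analysis. A minimal sketch, where the data mirrors the table above and the thresholds (low complexity, low risk) are illustrative choices rather than a standard:

```python
# Sketch: pick the initial agent scope from the enquiry analysis.
# Data mirrors the insurance example above; thresholds are illustrative.

enquiries = [
    {"type": "Policy details lookup", "volume": 850, "complexity": "Low", "risk": "Low"},
    {"type": "Claims status check", "volume": 620, "complexity": "Low", "risk": "Low"},
    {"type": "Coverage questions", "volume": 480, "complexity": "Medium", "risk": "Medium"},
    {"type": "Make a payment", "volume": 310, "complexity": "Low", "risk": "Medium"},
    {"type": "File a new claim", "volume": 290, "complexity": "High", "risk": "High"},
]

def shortlist(enquiries, top_n=5):
    """Low-complexity, low-risk enquiry types, highest volume first."""
    candidates = [e for e in enquiries if e["complexity"] == "Low" and e["risk"] == "Low"]
    return sorted(candidates, key=lambda e: -e["volume"])[:top_n]

for e in shortlist(enquiries):
    print(e["type"], e["volume"])
# Policy details lookup 850
# Claims status check 620
```

Running the same rule over your own ticketing data gives you a defensible, repeatable answer to "what should the agent do first".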

Step 2 - Build Your Knowledge Base

The quality of your knowledge base determines 80% of your agent's quality. This is where most projects succeed or fail.

For Azure AI Search, prepare your content:

  1. Collect everything: FAQs, product guides, policy documents, training materials, previous ticket responses
  2. Clean and update: Remove outdated content. If your FAQ still references a product you discontinued two years ago, the agent will confidently tell customers about it
  3. Structure for retrieval: Break long documents into logical sections. A 50-page policy document should be chunked so the agent can retrieve the relevant section, not the entire document
  4. Add metadata: Tag content by product, category, and date. This helps the agent filter results
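Steps 3 and 4 can be sketched in a few lines. This is a plain-Python illustration of heading-based chunking with metadata; the field names are assumptions for the example, not the actual Azure AI Search index schema:

```python
# Sketch: chunk a long document into retrievable sections with metadata.
# Splits on "## " headings, then caps section size; field names are illustrative.

def chunk_document(text, doc_id, product, max_chars=2000):
    sections, current = [], []
    for line in text.splitlines():
        if line.startswith("## ") and current:  # new section boundary
            sections.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current))

    chunks = []
    for i, section in enumerate(sections):
        # Further split oversized sections so retrieval stays precise.
        for j in range(0, len(section), max_chars):
            chunks.append({
                "id": f"{doc_id}-{i}-{j // max_chars}",
                "content": section[j:j + max_chars],
                "product": product,  # metadata tag for filtering at query time
            })
    return chunks

doc = "## Cover\nWhat is covered...\n## Exclusions\nWhat is not covered..."
for c in chunk_document(doc, "policy-123", "home-insurance"):
    print(c["id"])
# policy-123-0-0
# policy-123-1-0
```

The point of the section-level IDs and metadata is that the agent can cite and filter by chunk, rather than dragging a 50-page document into the prompt.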

Indexing strategy: We typically use Azure AI Search with semantic ranking enabled. The hybrid search approach (keyword + semantic) consistently outperforms pure keyword or pure vector search for customer service content. It handles the mix of specific product codes (keyword search) and natural language questions (semantic search) that customer service requires.

Budget for this step: $8,000-$20,000 AUD (matching the knowledge base preparation line in the cost table below) depending on the volume of content and how well-organised it already is. If your SharePoint is a well-maintained knowledge base, this is quick. If your documentation lives across five different systems in inconsistent formats, it takes longer.

Step 3 - Build the Agent Core

Using Semantic Kernel, the agent core consists of:

System prompt: This defines the agent's personality, capabilities, and boundaries. For customer service, the system prompt typically includes:

  • Who the agent is (company name, department)
  • What it can help with (the enquiry types you've chosen to support)
  • What it should never do (make promises about refunds, provide legal advice, share other customers' data)
  • How it should communicate (friendly but professional, Australian English, company voice)
  • When to escalate (confidence thresholds, complaint detection, explicit customer request)
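Put together, a system prompt covering those five elements looks something like this. The company name, scope, and thresholds are placeholders to adapt:

```python
# Sketch of a customer service system prompt covering the five elements above.
# "Contoso Insurance" and the scope are placeholders, not a recommended prompt verbatim.

SYSTEM_PROMPT = """You are the virtual assistant for Contoso Insurance's customer service team.

You can help with: policy details, claims status, and general coverage questions.

You must never: promise refunds or payouts, give legal or financial advice,
or share information about any customer other than the authenticated one.

Communicate in friendly, professional Australian English, in Contoso's voice.

Escalate to a human agent when: you are not confident in your answer,
the customer appears distressed or is making a complaint,
or the customer explicitly asks for a person."""
```

Keep this prompt in version control and treat changes to it like code changes: reviewed, tested, and rolled out gradually.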

Plugins (tools):

  • Knowledge search plugin: Queries Azure AI Search and returns relevant content
  • Customer lookup plugin: Retrieves customer details from CRM based on authenticated identity
  • Transaction history plugin: Fetches order/claim/policy history
  • Escalation plugin: Creates a handoff to a human agent with full conversation context
  • Feedback plugin: Records customer satisfaction at the end of the interaction

Conversation memory: Azure Cosmos DB stores conversation history so the agent maintains context across messages. If a customer says "I asked about this last week," the agent can look up that previous conversation.
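The shape of that memory store can be sketched with an in-memory dict standing in for Cosmos DB; the class and field names here are illustrative, not an SDK API:

```python
# Sketch: conversation memory keyed by customer. An in-memory dict stands in
# for Azure Cosmos DB; names and structure are illustrative.
from datetime import datetime, timezone

class ConversationMemory:
    def __init__(self):
        self._store = {}  # customer_id -> list of conversation turns

    def append(self, customer_id, role, text):
        self._store.setdefault(customer_id, []).append({
            "role": role,
            "text": text,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def history(self, customer_id, last_n=20):
        """Recent turns, so 'I asked about this last week' can be resolved."""
        return self._store.get(customer_id, [])[-last_n:]

memory = ConversationMemory()
memory.append("cust-42", "user", "What does my policy cover?")
memory.append("cust-42", "agent", "Your home policy covers fire and storm damage.")
print(len(memory.history("cust-42")))  # 2
```

In production you would also attach a retention policy: conversation transcripts are customer data and fall under the same privacy obligations as your CRM records.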

Step 4 - Implement Guardrails

Customer-facing agents need more guardrails than internal agents. Here's what we implement:

Input guardrails:

  • Azure AI Content Safety filters to block harmful or inappropriate input
  • PII detection to handle sensitive information appropriately
  • Injection attack detection (prompt injection is a real threat in customer-facing agents)

Output guardrails:

  • Response validation to catch hallucinations about policies or pricing
  • PII leak prevention to ensure the agent never exposes other customers' data
  • Tone analysis to flag responses that might come across as dismissive or rude
  • Confidence scoring - if the agent isn't confident in its answer, it should say so and offer to escalate

Operational guardrails:

  • Maximum conversation length (prevent infinite loops)
  • Rate limiting per user (prevent abuse)
  • Cost caps on model API usage (prevent runaway costs from a single long conversation)
  • Automatic escalation after a configurable number of failed resolution attempts
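The operational guardrails above reduce to a single check run before each agent reply. A minimal sketch, where the thresholds are illustrative assumptions to tune per deployment:

```python
# Sketch: operational guardrails as one pre-reply check.
# Threshold values are illustrative assumptions, not recommendations.

MAX_TURNS = 30            # maximum conversation length
MAX_FAILED_ATTEMPTS = 2   # failed resolution attempts before escalation
MIN_CONFIDENCE = 0.7      # below this, offer a human instead of guessing

def should_escalate(turn_count, failed_attempts, confidence):
    if turn_count >= MAX_TURNS:
        return True, "conversation too long"
    if failed_attempts >= MAX_FAILED_ATTEMPTS:
        return True, "repeated failed resolutions"
    if confidence < MIN_CONFIDENCE:
        return True, "low answer confidence"
    return False, ""

print(should_escalate(turn_count=5, failed_attempts=2, confidence=0.9))
# (True, 'repeated failed resolutions')
```

Returning the reason alongside the decision matters: it feeds both the handoff summary and your escalation-rate monitoring later.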

Step 5 - Build the Human Handoff

This is the most important feature in a customer service agent and the one most teams underinvest in. A bad handoff destroys customer trust faster than anything else.

Good handoff looks like:

  1. Agent recognises it can't resolve the issue (or the customer requests a human)
  2. Agent summarises the conversation so far in a structured format for the human agent
  3. Agent transfers to a human with full context - the customer never repeats themselves
  4. The transition is smooth - no "please hold" loops or disconnections
  5. If no human is available, the agent sets expectations about wait time and offers alternatives

In the Microsoft stack: If you're using Dynamics 365 Customer Service, the handoff integrates natively. The AI agent creates a case in Dynamics with the conversation transcript and routes it to the appropriate queue. The human agent sees the full context when they pick up the case.

For Teams-based internal helpdesks, the handoff creates a message in a support channel tagging the relevant team member with a summary.
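The structured summary from step 2 can be sketched as a plain payload; the field names are illustrative, not the Dynamics 365 case schema:

```python
# Sketch: the structured handoff summary passed to a human agent (step 2 above).
# Field names are illustrative assumptions, not the Dynamics 365 case schema.
import json

def build_handoff(customer_id, enquiry_type, transcript, reason):
    return {
        "customer_id": customer_id,
        "enquiry_type": enquiry_type,
        "escalation_reason": reason,
        "attempted_answers": [t["text"] for t in transcript if t["role"] == "agent"],
        "transcript": transcript,  # full context, so the customer never repeats themselves
    }

transcript = [
    {"role": "user", "text": "I want to dispute my claim outcome."},
    {"role": "agent", "text": "I can't resolve disputes, let me get a person to help."},
]
print(json.dumps(build_handoff("cust-42", "complaint", transcript, "out of scope"), indent=2))
```

Whatever the target system, the invariant is the same: the human agent should open the case and immediately see who the customer is, what they wanted, what the AI already tried, and why it escalated.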

Step 6 - Test with Real Scenarios

Before going live, test with real customer scenarios, not synthetic ones:

  1. Historical replay: Take 500 real customer enquiries from the past month and run them through the agent. Compare its responses to what the human agent said
  2. Edge case testing: Deliberately try to confuse the agent - ambiguous questions, angry customers, questions about things the agent doesn't know, requests to do things outside its scope
  3. Red team testing: Try to make the agent say something it shouldn't - share other customer data, make false promises, bypass its guardrails
  4. User acceptance testing: Have actual customer service staff test the agent and provide feedback. They know the edge cases that matter

Accuracy targets: We typically aim for 90%+ accuracy on supported enquiry types before going live, with a clear escalation path for the remaining 10%. Getting from 90% to 95% takes disproportionate effort - plan for iterative improvement after launch rather than trying to achieve perfection before launch.
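A minimal harness for scoring a historical replay set (step 1 above) might look like this; the `graded` records would come from human reviewers comparing agent answers to what the human agent said:

```python
# Sketch: per-enquiry-type accuracy from a graded historical replay set.
# 'graded' entries come from human review; data here is illustrative.

def accuracy_by_type(graded):
    """graded: list of {'type': ..., 'correct': bool} review results."""
    totals, correct = {}, {}
    for g in graded:
        totals[g["type"]] = totals.get(g["type"], 0) + 1
        correct[g["type"]] = correct.get(g["type"], 0) + (1 if g["correct"] else 0)
    return {t: correct[t] / totals[t] for t in totals}

graded = (
    [{"type": "claims status", "correct": True}] * 46
    + [{"type": "claims status", "correct": False}] * 4
)
print(accuracy_by_type(graded))
# {'claims status': 0.92}
```

Breaking accuracy down by enquiry type, rather than one blended number, tells you which categories are ready for launch and which need more knowledge base work.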

Step 7 - Deploy and Monitor

Deployment approach: Never go from zero to 100% traffic overnight. We recommend:

  1. Week 1-2: Shadow mode - agent runs alongside human agents but doesn't respond to customers. Compare agent responses to human responses
  2. Week 3-4: 10% of conversations routed to the agent, with a human reviewing every response before it's sent
  3. Week 5-8: 30% of conversations, human review on flagged responses only
  4. Week 9+: Full deployment with automated monitoring and spot-check review
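The percentage ramp in weeks 3-8 needs deterministic routing, so a given customer stays in the same arm as the rollout widens. A hash-based sketch (the routing key and bucket scheme are illustrative choices):

```python
# Sketch: deterministic percentage rollout. A stable hash of the customer ID
# keeps each customer in the same arm as the percentage ramps up.
import hashlib

def route_to_agent(customer_id, rollout_percent):
    digest = hashlib.sha256(customer_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100   # stable bucket 0-99 per customer
    return bucket < rollout_percent

# At 30% rollout, roughly 30% of customers land on the AI agent,
# and the same customer always gets the same answer.
agent_share = sum(route_to_agent(f"cust-{i}", 30) for i in range(10_000)) / 10_000
print(round(agent_share, 2))  # roughly 0.30
```

Because routing is a pure function of the customer ID, widening from 10% to 30% only adds customers to the agent arm; nobody bounces back and forth between AI and human handling mid-rollout.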

Key metrics to monitor:

  • Resolution rate (what percentage of conversations are resolved without escalation)
  • Customer satisfaction (post-interaction survey)
  • Accuracy (sampled review of agent responses)
  • Escalation rate (and reasons for escalation)
  • Average handling time
  • Cost per interaction

Timeline and Cost

For a typical customer service AI agent project:

| Phase | Duration | Cost (AUD) |
| --- | --- | --- |
| Enquiry analysis and scoping | 1-2 weeks | $10,000-$18,000 |
| Knowledge base preparation | 2-3 weeks | $8,000-$20,000 |
| POC (top 2-3 enquiry types) | 2-4 weeks | $25,000-$45,000 |
| Production build | 8-12 weeks | $80,000-$160,000 |
| Testing and deployment | 3-4 weeks | $20,000-$35,000 |
| Total | 16-25 weeks | $143,000-$278,000 |

Monthly running costs: $4,000-$12,000 AUD depending on volume.

ROI calculation: If you're handling 2,000 customer enquiries per month and the agent handles 60% without escalation, at an average cost of $12-$15 AUD per human-handled enquiry, you're saving approximately $14,000-$18,000/month. Payback period is typically 10-18 months.
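The arithmetic behind that estimate, using the article's illustrative figures:

```python
# Sketch: the ROI arithmetic above. All figures are the article's
# illustrative numbers, substitute your own volumes and costs.

monthly_enquiries = 2_000
deflection_rate = 0.60                 # share handled without escalation
cost_per_human_enquiry = (12, 15)      # AUD, low and high estimate

deflected = monthly_enquiries * deflection_rate
savings = tuple(deflected * c for c in cost_per_human_enquiry)
print(savings)  # (14400.0, 18000.0) AUD per month
```

For a like-for-like payback figure, net the monthly running costs ($4,000-$12,000) off these gross savings before dividing into the build cost.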

Common Pitfalls

Going too broad too fast. Start with 3-5 enquiry types, not all of them. An agent that handles five things well builds more trust than an agent that handles twenty things badly.

Neglecting the knowledge base. Garbage in, garbage out. If your FAQ hasn't been updated in two years, fix that before building the agent.

Underinvesting in handoff. The handoff experience is more important than the AI experience. A customer who gets a good handoff to a human is satisfied. A customer stuck in an AI loop with no escape is furious.

Ignoring the contact centre team. The humans who currently handle these enquiries know every edge case. Involve them early. They're your best testers and your best source of training scenarios.

No feedback loop. Launch is the starting line, not the finish line. Build a system for collecting feedback, reviewing failed conversations, and continuously improving the agent.

Getting Started

If you're considering a customer service AI agent, we'd recommend starting with a scoped POC focused on your highest-volume, lowest-risk enquiry types. That gives you real data to make a go/no-go decision on the full build.

Talk to our team about a customer service AI agent POC, or learn more about our AI agent development approach and Microsoft AI consulting services.