
When to Use Azure Cognitive Services vs Custom AI Models

April 7, 2026 · 10 min read · Michael Ridland

One of the most expensive mistakes in enterprise AI is building custom when off-the-shelf would have worked. The second most expensive mistake is using off-the-shelf when custom was needed.

We see both regularly. A company spends six months training a custom document classification model when Azure AI Document Intelligence's prebuilt models would have hit 95% accuracy out of the box. Or a company forces a prebuilt sentiment analysis API onto a domain-specific problem and wonders why the results are useless.

The decision between Azure Cognitive Services (now branded as Azure AI Services) and custom models comes down to a few specific questions. Let me walk through them.

What Are Azure Cognitive Services?

Azure Cognitive Services is Microsoft's collection of pre-built AI APIs. You send data in, get AI results back. No training, no model management, no ML engineering. The main categories:

Vision

  • Computer Vision - image analysis, OCR, spatial analysis
  • Custom Vision - image classification and object detection with your own training data
  • Face API - face detection, identification, verification

Speech

  • Speech-to-Text - audio transcription
  • Text-to-Speech - generate spoken audio from text
  • Speech Translation - real-time translation

Language

  • Azure OpenAI Service - GPT models for text generation, analysis, summarisation
  • Language Understanding - intent detection and entity extraction
  • Text Analytics - sentiment, key phrases, language detection
  • Translator - text translation across 100+ languages

Document Intelligence

  • Prebuilt models for invoices, receipts, ID documents, tax forms
  • Custom models for your specific document types
  • Layout analysis for tables and structure

Decision

  • Content Safety - detect harmful content
  • Personalizer - personalise user experiences

These services are production-ready, globally available, and backed by Microsoft's SLAs. For many business problems, they're the right answer.

What Are Custom AI Models?

Custom models are AI systems trained specifically for your use case, on your data, optimised for your performance requirements. In the Azure ecosystem, you'd typically build these using:

  • Azure Machine Learning - full ML platform for training, deploying, and managing models
  • Azure OpenAI fine-tuning - adapting GPT models to your specific domain or task
  • Azure AI Document Intelligence custom models - training document extraction on your specific document types
  • Custom Vision - technically a Cognitive Service but with custom training
  • Open-source models on Azure - deploying Hugging Face models, PyTorch models, or other frameworks on Azure compute

Custom models require training data, ML expertise, compute resources for training, and ongoing maintenance. The investment is significantly higher than calling a pre-built API.

The Decision Framework

Here's the framework we use with clients. Work through these questions in order.

Question 1 - Does a Prebuilt Service Solve Your Problem?

Start by testing the prebuilt option. Seriously testing it - not reading the documentation and guessing, but actually sending your real data through the API and measuring the results.

Take 100 representative inputs. Run them through the relevant Cognitive Service. Measure accuracy against your ground truth. If the prebuilt service hits your accuracy target, stop. You're done.
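The measurement step is simple to script. A minimal sketch of field-level accuracy scoring, assuming you've already paired each service output with a human-verified value (the sample pairs below are made up for illustration):

```python
def field_accuracy(predictions, ground_truth):
    """Fraction of predicted values that exactly match the labelled value."""
    assert len(predictions) == len(ground_truth)
    correct = sum(1 for p, g in zip(predictions, ground_truth) if p == g)
    return correct / len(ground_truth)

# Hypothetical labelled sample: (service output, human-verified value)
results = [
    ("INV-1001", "INV-1001"),
    ("2026-04-07", "2026-04-07"),
    ("$1,200.00", "$1,210.00"),  # extraction error
]
preds, truth = zip(*results)
print(f"Accuracy: {field_accuracy(preds, truth):.0%}")  # 2 of 3 correct -> 67%
```

In practice you'd compute this per field (invoice number, date, total) rather than one overall number, because a prebuilt model is often strong on some fields and weak on others.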

Example: A client needed to extract data from Australian invoices. We tested Azure Document Intelligence's prebuilt invoice model against 200 of their real invoices. It achieved 92% field-level accuracy out of the box. After adding a few post-processing rules for their specific invoice formats, we hit 97%. No custom model needed. Total development time: 3 weeks instead of the 3 months a custom model would have required.

Example: A different client needed to classify engineering inspection reports into 47 categories based on content. The prebuilt text classification APIs couldn't handle the domain-specific language. Accuracy was 41%, well below the 85% threshold. Custom model was the right call.

Question 2 - What Accuracy Do You Actually Need?

This question is more nuanced than it sounds. People default to "as accurate as possible", which isn't a useful answer.

Think about it this way:

  • What happens when the AI is wrong? If a human reviews every output anyway, 80% accuracy still saves significant time. If the AI output goes directly into a financial system with no review, you need 99%+.
  • What's the current accuracy of the manual process? Humans aren't 100% accurate either. If your data entry team achieves 95% accuracy, an AI system at 93% with human review might be perfectly acceptable.
  • What's the cost of errors vs the cost of improvement? Going from 90% to 95% accuracy might cost $20,000 in custom model development. Going from 95% to 98% might cost $100,000. Is the error reduction worth it?

Map your accuracy requirement honestly, then assess whether the prebuilt service meets it.
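The third bullet above is just arithmetic, and it's worth writing down. A sketch using made-up volumes and error costs alongside the $20,000 figure from the example:

```python
def improvement_worth_it(volume_per_year, cost_per_error,
                         acc_before, acc_after, dev_cost, years=3):
    """Compare dev cost of an accuracy improvement against the errors it prevents."""
    errors_avoided = volume_per_year * (acc_after - acc_before) * years
    savings = errors_avoided * cost_per_error
    return savings, savings > dev_cost

# Hypothetical: 50,000 documents/year, $5 to fix each error,
# 90% -> 95% accuracy for $20,000 of custom model development
savings, worth_it = improvement_worth_it(50_000, 5, 0.90, 0.95, 20_000)
print(round(savings), worth_it)  # 37500 True
```

Run the same numbers for the 95% to 98% step and the answer often flips, which is exactly the point of the question.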

Question 3 - Do You Have Training Data?

Custom models need training data. The amount varies:

For each task, a rough minimum and a "good" amount of training data:

  • Text classification - 50-100 examples per class minimum, 500+ per class ideally. More classes means more data needed.
  • Named entity extraction - 200-500 annotated examples minimum, 2,000+ ideally. Domain-specific entities need more.
  • Document extraction - 5-50 labelled documents per type minimum, 100+ per type ideally (Azure Document Intelligence custom models).
  • Image classification - 15-50 images per class minimum, 500+ per class ideally. Custom Vision can work with very little data.
  • Object detection - 15-50 annotated images minimum, 500+ ideally. Bounding box annotation is time-consuming.
  • Fine-tuned LLM - 50-500 examples minimum, 1,000+ ideally. Quality matters more than quantity.

If you don't have training data and can't create it, you can't build a custom model. Some options:

  • Use the prebuilt service with prompt engineering: For LLM-based tasks, few-shot prompting with Azure OpenAI can get surprisingly close to custom model performance without any training data.
  • Start with the prebuilt service and collect data: Use the prebuilt model in production, have humans correct errors, and accumulate labelled data for a future custom model.
  • Create synthetic training data: Use GPT-4o to generate training examples. Not ideal, but sometimes good enough to bootstrap a custom model.
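Few-shot prompting deserves a concrete illustration, since it's the zero-training-data option. A sketch of assembling a chat-completion payload from worked examples; the inspection-note categories here are hypothetical:

```python
def build_few_shot_messages(system_prompt, examples, new_input):
    """Assemble a chat payload: system prompt, worked examples, then the real input."""
    messages = [{"role": "system", "content": system_prompt}]
    for example_input, example_output in examples:
        messages.append({"role": "user", "content": example_input})
        messages.append({"role": "assistant", "content": example_output})
    messages.append({"role": "user", "content": new_input})
    return messages

examples = [
    ("Pump vibration exceeds tolerance on unit 4.", "category: mechanical"),
    ("Corrosion observed on pipe bracket B-12.", "category: structural"),
]
messages = build_few_shot_messages(
    "Classify the inspection note into exactly one category.",
    examples,
    "Hairline crack in weld seam near flange.",
)
# Pass `messages` to your Azure OpenAI client's chat.completions.create(...) call.
```

As you accumulate corrected outputs from production, the same (input, output) pairs double as fine-tuning data later.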

Question 4 - Can You Maintain a Custom Model?

Custom models aren't set-and-forget. They need:

  • Monitoring: Tracking accuracy over time, detecting drift
  • Retraining: Regular updates as your data distribution changes
  • Infrastructure: Compute resources for serving predictions
  • Expertise: Someone who understands ML pipelines and can troubleshoot issues

If your organisation doesn't have ML engineering capability (and doesn't plan to build it), a custom model creates an ongoing dependency - either on an external team or on a single internal person. Prebuilt services eliminate this maintenance burden.
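The monitoring bullet doesn't require heavy tooling to start. A minimal drift check, assuming you log a weekly accuracy number from human-reviewed samples (the threshold and window are illustrative, not recommendations):

```python
def check_drift(weekly_accuracy, baseline, tolerance=0.03, window=4):
    """Flag drift when recent accuracy runs persistently below the baseline."""
    recent = weekly_accuracy[-window:]
    return all(acc < baseline - tolerance for acc in recent)

# Hypothetical history: accuracy slips from 94% and stays down
history = [0.94, 0.93, 0.94, 0.90, 0.89, 0.88, 0.89]
print(check_drift(history, baseline=0.94))  # last 4 weeks all below 0.91 -> True
```

Requiring several consecutive low weeks avoids paging someone over one noisy sample; a sustained drop is the signal to retrain.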

Question 5 - What's Your Budget and Timeline?

For each approach, the typical cost (AUD), typical timeline, and ongoing cost:

  • Prebuilt Cognitive Service - $5,000-$20,000 (integration), 2-6 weeks, per-transaction API cost
  • Azure OpenAI with prompt engineering - $10,000-$40,000, 3-8 weeks, per-token API cost
  • Azure OpenAI fine-tuning - $20,000-$60,000, 6-12 weeks, per-token API cost (higher for fine-tuned models)
  • Azure Document Intelligence custom model - $15,000-$40,000, 4-8 weeks, per-page API cost
  • Full custom ML model (Azure ML) - $50,000-$200,000+, 3-6 months, compute + maintenance

These are rough ranges based on our project experience. Your actual costs depend on complexity, data preparation effort, and integration requirements. For a detailed pricing breakdown, see our Azure AI pricing guide.

Decision Matrix

Here's the simplified decision flow:

Use Prebuilt Azure Cognitive Services when:

  • The prebuilt model meets your accuracy needs (test it first)
  • You need to move fast (weeks, not months)
  • You don't have ML engineering resources
  • Your use case is common (document processing, translation, speech, general text analysis)
  • Budget is limited

Use Azure OpenAI with prompt engineering when:

  • Your task is language-based (classification, extraction, generation, Q&A)
  • You don't have training data but can describe what you want
  • You need flexibility to handle varied inputs
  • Accuracy of 85-95% is acceptable
  • You want to iterate quickly on performance

Use Azure OpenAI fine-tuning when:

  • Prompt engineering gets you close but not close enough
  • You have 100+ high-quality training examples
  • You need consistent output format or domain-specific behaviour
  • You want to reduce token usage (fine-tuned models need shorter prompts)

Use Custom Models (Azure ML) when:

  • Prebuilt services don't meet your accuracy requirements
  • You have substantial training data (1,000+ examples)
  • You have ML engineering capability (or will build it)
  • The problem is highly domain-specific
  • You need maximum control over model behaviour
  • Cost per prediction needs to be very low at high volume
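The flow above can be roughed out in code, which is handy for walking stakeholders through it. This is a deliberately coarse encoding, a starting point for discussion rather than a substitute for actually testing the prebuilt option:

```python
def recommend_approach(prebuilt_meets_accuracy, language_task,
                       training_examples, has_ml_team):
    """Coarse encoding of the decision flow: simplest approach that fits."""
    if prebuilt_meets_accuracy:
        return "prebuilt Cognitive Service"
    if language_task and training_examples < 100:
        return "Azure OpenAI with prompt engineering"
    if language_task:
        return "Azure OpenAI fine-tuning"
    if training_examples >= 1000 and has_ml_team:
        return "custom model on Azure ML"
    return "collect more data or build ML capability first"

print(recommend_approach(False, True, 40, False))
```

Note the ordering matters: the function always prefers the cheaper, faster option and only falls through to custom ML when the earlier branches fail.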

Real-World Patterns We See

Pattern 1 - Start Prebuilt, Go Custom Later

This is the most common and usually the smartest approach. Build version 1 with prebuilt services. Deploy to production. Collect real-world data and error patterns. Use that data to train a custom model for version 2 only if the prebuilt service isn't meeting your needs.

Why this works: You ship faster, learn what actually matters, and accumulate training data naturally. The custom model, if you need one, is trained on real production data rather than synthetic or incomplete datasets.

Pattern 2 - Prebuilt Service + Custom Post-Processing

Sometimes the prebuilt service gets 80% of the way there, and simple rule-based post-processing handles the remaining 20%.

Example: Azure Document Intelligence extracts invoice data with high accuracy, but your invoices have a non-standard field for "purchase order reference" that the prebuilt model doesn't extract. Instead of training a custom model, add a regex-based extraction step for that specific field. Problem solved in an afternoon.
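That post-processing step is usually a few lines. A sketch, assuming a hypothetical reference format like "PO Ref: AB-123456" appearing in the document's OCR text (the pattern would come from inspecting your real invoices):

```python
import re

# Hypothetical format: "PO Ref: QX-204518" somewhere in the extracted text
PO_REF = re.compile(r"PO\s*Ref[:.]?\s*([A-Z]{2}-\d{6})", re.IGNORECASE)

def extract_po_reference(ocr_text):
    """Pull the purchase order reference the prebuilt invoice model doesn't return."""
    match = PO_REF.search(ocr_text)
    return match.group(1) if match else None

print(extract_po_reference("Invoice 8841\nPO Ref: QX-204518\nTotal: $1,200"))
```

The prebuilt model's layout output already gives you the raw text, so this step costs nothing extra per document.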

Pattern 3 - LLM as the Custom Model

This is increasingly common. Instead of training a traditional ML model, use GPT-4o with a detailed system prompt and few-shot examples. For many classification and extraction tasks, this achieves custom-model accuracy with prebuilt-service simplicity.

The trade-off is cost. A custom classification model might cost $0.001 per prediction. GPT-4o-mini might cost $0.01 per prediction. GPT-4o might cost $0.10. At high volumes, the cost difference matters. At low to medium volumes, the development time savings more than compensate.
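The volume threshold where the custom model wins is easy to estimate. A sketch using the per-prediction figures above and a hypothetical $40,000 difference in development cost, amortised over one year:

```python
def breakeven_volume(extra_dev_cost, custom_cost_per_call, llm_cost_per_call):
    """Monthly volume where a cheaper-per-call custom model repays its dev cost in a year."""
    per_call_gap = llm_cost_per_call - custom_cost_per_call
    return extra_dev_cost / (per_call_gap * 12)

# Hypothetical: custom model costs $40,000 more to build;
# $0.001/prediction custom vs $0.01/prediction GPT-4o-mini
print(round(breakeven_volume(40_000, 0.001, 0.01)))  # ~370,370 predictions/month
```

Below that volume the LLM approach is cheaper overall; well above it, the custom model's per-call savings dominate.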

Pattern 4 - Hybrid Architecture

Use prebuilt services for common cases and custom models for edge cases. A document processing pipeline might route standard invoices through the prebuilt model and unusual or complex documents through a custom model or GPT-4o for analysis.

This gives you cost efficiency on the 80% of documents that are straightforward and accuracy on the 20% that aren't.
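A sketch of the routing logic, assuming you run the prebuilt model first and escalate when its confidence score is low or the document type is non-standard (the 0.85 threshold is illustrative; tune it against your own error data):

```python
def route_document(doc_type, prebuilt_confidence, threshold=0.85):
    """Accept high-confidence standard documents; escalate everything else."""
    if doc_type == "standard_invoice" and prebuilt_confidence >= threshold:
        return "prebuilt"
    return "fallback"  # custom model or GPT-4o for the hard 20%

print(route_document("standard_invoice", 0.92))  # prebuilt
print(route_document("standard_invoice", 0.60))  # fallback
```

Azure Document Intelligence returns per-field confidence scores, so the routing signal comes free with the first pass.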

Common Mistakes

Building Custom Because It Feels More Serious

Custom models feel more impressive. Prebuilt APIs feel like cheating. This is an ego trap. The goal is to solve the business problem, not to build the most sophisticated model.

Not Testing the Prebuilt Option

Teams jump straight to custom model development without spending a day testing whether the prebuilt service works. Always test the prebuilt option first. It's a day of work that can save months.

Under-Investing in Prompt Engineering

Before training a custom model, invest 2-3 weeks in serious prompt engineering with Azure OpenAI. Structured prompts, few-shot examples, chain-of-thought reasoning, output format constraints. We've seen prompt engineering close a 15-point accuracy gap that clients thought required custom training.
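Output format constraints are the cheapest of those techniques to enforce: promise a schema in the prompt, then validate the reply against it. A sketch, with a made-up two-field schema echoing the 47-category classification example:

```python
import json

SYSTEM_PROMPT = (
    "Classify the inspection report into exactly one of the 47 categories. "
    'Respond with JSON only: {"category": "<name>", "confidence": "high|medium|low"}'
)

def parse_constrained_output(raw):
    """Validate the model's reply against the promised schema; reject anything else."""
    data = json.loads(raw)
    if set(data) != {"category", "confidence"}:
        raise ValueError("model broke the output contract")
    if data["confidence"] not in {"high", "medium", "low"}:
        raise ValueError("invalid confidence value")
    return data

print(parse_constrained_output('{"category": "corrosion", "confidence": "high"}'))
```

Replies that fail validation get retried or flagged for review, which turns occasional format drift from a silent data-quality bug into a visible, countable event.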

Ignoring Maintenance Costs

The cost of building a custom model is one-time. The cost of maintaining it is ongoing. Factor in retraining, monitoring, compute, and the person-hours to manage the pipeline when comparing against the per-transaction cost of a prebuilt service.

How We Help

At Team 400, we help Australian businesses make this decision with real data, not guesswork. Our typical approach:

  1. Assessment: Test your real data against relevant prebuilt services. Measure accuracy.
  2. Recommendation: Based on results, recommend the simplest approach that meets your requirements.
  3. Implementation: Build the solution, whether that's integrating a prebuilt service, engineering prompts, or training a custom model.
  4. Evaluation: Systematic testing against your success criteria before deployment.

If you're trying to decide between prebuilt and custom AI for your use case, get in touch. A few hours of testing up front can save months of misguided development.

Explore our Azure AI consulting services and AI consulting offerings to learn more about how we work.