
How to Use Azure AI Foundry for Enterprise Applications

April 21, 2026 · 11 min read · Michael Ridland

If you're building enterprise AI applications on Azure, you've probably encountered Azure AI Foundry (formerly Azure AI Studio). It's Microsoft's platform for assembling AI applications that combine language models, search, document processing, and custom models into production-ready systems.

But the documentation is dense, the naming has changed multiple times, and it's not always clear how the pieces fit together for real business applications. Here's a practical guide based on what we've built with it.

What Azure AI Foundry Actually Is

Azure AI Foundry is a unified platform for building AI applications. Instead of stitching together individual Azure services yourself, Foundry provides a project-based workspace where you can:

  • Deploy and manage language models (GPT-4o, GPT-4o-mini, and others)
  • Build RAG pipelines with Azure AI Search
  • Process documents with Azure Document Intelligence
  • Create AI agents with tool-calling capabilities
  • Evaluate and test your AI applications
  • Monitor performance and safety in production

Think of it as the workbench where you assemble your AI application from Azure's AI building blocks.

What it is not: A no-code magic solution. You still need to write code, design architecture, and make engineering decisions. Foundry gives you better tooling and tighter integration between services, but it's a platform for developers and architects, not a business-user tool.

Why Enterprises Choose Azure AI Foundry

From our client conversations, the reasons are consistent:

Data sovereignty. Azure AI Foundry runs in Azure regions, including Australia East and Australia Southeast. For organisations with data residency requirements - financial services, healthcare, government - this matters. Your data doesn't leave Australian data centres.

Enterprise security. Integration with Microsoft Entra ID (formerly Azure Active Directory), private networking (VNets, Private Endpoints), managed identities, and Azure's compliance certifications. If you're already an Azure shop, AI Foundry fits your existing security model.

SLAs and support. Enterprise agreements, Microsoft support, and uptime commitments that you don't get with direct API access to model providers.

Unified billing. AI costs go through your existing Azure billing and enterprise agreement. No separate vendor relationships for AI infrastructure.

Model choice. Access to OpenAI models (GPT-4o, GPT-4o-mini), Microsoft's own models, Meta's Llama models, Mistral, and others - all through the same platform and API structure.

These aren't glamorous reasons. They're practical ones. And for enterprises, practical is what matters.

The Architecture - How the Pieces Fit Together

Here's a typical enterprise AI application built on Azure AI Foundry.

Users (Web App / Teams / API)
         |
         v
Application Layer (.NET / Python)
         |
         v
Azure AI Foundry Project
    |
    +-- Azure OpenAI (Language Models)
    |       +-- GPT-4o (complex reasoning)
    |       +-- GPT-4o-mini (simple tasks, high volume)
    |
    +-- Azure AI Search (Knowledge Retrieval)
    |       +-- Vector search (semantic)
    |       +-- Keyword search (exact match)
    |       +-- Hybrid search (both)
    |
    +-- Azure Document Intelligence (Document Processing)
    |       +-- Invoice extraction
    |       +-- Form recognition
    |       +-- Custom document models
    |
    +-- Prompt Flow (Orchestration)
    |       +-- Chain multiple AI steps
    |       +-- Add business logic between steps
    |       +-- Handle branching and routing
    |
    +-- Content Safety (Guardrails)
            +-- Input filtering
            +-- Output filtering
            +-- Custom blocklists

Each component handles a specific capability. The application layer orchestrates them based on user requests.

Setting Up Your First AI Foundry Project

Here's the practical setup process.

Step 1 - Create the AI Hub and Project

The AI Hub is your top-level container. It holds shared resources (model deployments, search indexes) that multiple projects can use.

A Project sits within a Hub and represents a specific application or workload.

Structure we recommend:

AI Hub (shared infrastructure)
    |
    +-- Project: Customer Service Agent
    +-- Project: Internal Knowledge Bot
    +-- Project: Document Processing Pipeline

This lets you share expensive resources (like model deployments) across projects while keeping each application's configuration separate.

Step 2 - Deploy Your Models

Deploy the language models you need. For most enterprise applications:

GPT-4o - Your primary model for complex reasoning, multi-step tasks, and nuanced understanding. Deploy with enough capacity for your expected load.

GPT-4o-mini - For simpler tasks: classification, extraction, summarisation where full GPT-4o is overkill. Significantly cheaper per token.

Embedding model (text-embedding-3-large or text-embedding-3-small) - For converting text to vectors for RAG. You need this if you're doing any knowledge retrieval.

Capacity planning tip: Azure OpenAI uses Tokens Per Minute (TPM) quotas. A customer service agent handling 100 concurrent conversations needs roughly 200,000-400,000 TPM depending on complexity. Start with your expected concurrency and work backwards.
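That back-of-the-envelope estimate is easy to make explicit. In the sketch below, the tokens-per-exchange figure is an illustrative assumption (prompt plus completion for a typical support turn), not an Azure-published number:

```csharp
using System;

public static class CapacityEstimate
{
    // Rough TPM estimate: concurrency x tokens per exchange x exchanges
    // per conversation per minute. Adjust the inputs to your own workload.
    public static int EstimateTpm(int concurrentConversations, int tokensPerExchange, int exchangesPerMinute)
        => concurrentConversations * tokensPerExchange * exchangesPerMinute;

    public static void Main()
    {
        // 100 conversations x ~2,000 tokens (prompt + completion) x one
        // exchange per minute lands on the low end of the range above.
        Console.WriteLine(EstimateTpm(100, 2000, 1)); // prints 200000
    }
}
```

Doubling the tokens per exchange (longer context, verbose answers) gets you to the 400,000 TPM top of the range, which is why prompt size is a capacity lever, not just a cost one.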

Step 3 - Set Up Azure AI Search

If your application needs to search company documents (and most do), Azure AI Search is the retrieval layer.

Create an index with:

  • Text fields for keyword search
  • Vector fields for semantic search
  • Metadata fields for filtering (document type, department, date)

Configure indexers to pull from your document sources:

  • SharePoint Online
  • Azure Blob Storage
  • Azure SQL Database
  • Custom sources via the push API

Enable hybrid search (keyword + vector) from the start. It consistently outperforms either approach alone.
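A hybrid-ready index can be defined up front with the Azure.Search.Documents SDK. The sketch below is illustrative - the index name, field names, vector dimensions, and profile names are assumptions you'd adapt to your own schema, and `config` stands in for your configuration source as in the examples later in this post:

```csharp
using System;
using Azure.Identity;
using Azure.Search.Documents.Indexes;
using Azure.Search.Documents.Indexes.Models;

var indexClient = new SearchIndexClient(
    new Uri(config["AzureSearch:Endpoint"]),
    new DefaultAzureCredential());

var index = new SearchIndex("knowledge-index")
{
    Fields =
    {
        new SimpleField("id", SearchFieldDataType.String) { IsKey = true },
        new SearchableField("content"),                                        // keyword search
        new SimpleField("department", SearchFieldDataType.String) { IsFilterable = true },
        new SearchField("contentVector",
            SearchFieldDataType.Collection(SearchFieldDataType.Single))
        {
            IsSearchable = true,
            VectorSearchDimensions = 3072,            // text-embedding-3-large
            VectorSearchProfileName = "vector-profile"
        }
    },
    VectorSearch = new VectorSearch
    {
        Profiles = { new VectorSearchProfile("vector-profile", "hnsw-config") },
        Algorithms = { new HnswAlgorithmConfiguration("hnsw-config") }
    }
};

await indexClient.CreateOrUpdateIndexAsync(index);
```

With both a searchable text field and a vector field in the same index, a single query can run keyword and vector retrieval together and fuse the results.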

Step 4 - Build Your Application

This is where Foundry connects to your code. Two main approaches:

Prompt Flow - Visual orchestration tool within Foundry. Good for prototyping and simpler flows. You build a directed graph of steps - LLM calls, search queries, code blocks - and Foundry manages execution.

SDK-based - Use the Azure AI SDKs (available for .NET, Python, Java, JavaScript) to call Foundry services from your own application code. More flexible, better for complex applications, easier to version control and test.

For production enterprise applications, we almost always use the SDK approach. Prompt Flow is excellent for prototyping and evaluation, but production applications need the flexibility of code.

Building a RAG Application on AI Foundry

RAG (Retrieval-Augmented Generation) is the most common enterprise AI pattern, and Azure AI Foundry is well-suited for it.

The Flow

  1. User asks a question
  2. Application converts the question to an embedding
  3. Azure AI Search finds relevant document chunks
  4. Application assembles a prompt with the retrieved context
  5. Azure OpenAI generates an answer
  6. Application returns the answer with citations
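Step 2 - converting the question to an embedding - is one SDK call. A minimal sketch with the Azure.AI.OpenAI client (the deployment name is whatever you chose when deploying your embedding model; `config` is your configuration source as elsewhere in this post):

```csharp
using System;
using Azure.AI.OpenAI;
using Azure.Identity;

var openAiClient = new OpenAIClient(
    new Uri(config["AzureOpenAI:Endpoint"]),
    new DefaultAzureCredential());

// Embed the user's question; the resulting vector is what you hand
// to Azure AI Search as a VectorizedQuery in step 3.
var embeddingResponse = await openAiClient.GetEmbeddingsAsync(
    new EmbeddingsOptions("text-embedding-3-large", new[] { query }));

ReadOnlyMemory<float> queryEmbedding = embeddingResponse.Value.Data[0].Embedding;
```

The same `queryEmbedding` variable is what the search call in the next example consumes.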

Code Example (C# with Semantic Kernel)

// Set up the kernel with Azure OpenAI
var kernel = Kernel.CreateBuilder()
    .AddAzureOpenAIChatCompletion(
        deploymentName: "gpt-4o",
        endpoint: config["AzureOpenAI:Endpoint"],
        credentials: new DefaultAzureCredential())
    .Build();

// Search for relevant documents
var searchClient = new SearchClient(
    new Uri(config["AzureSearch:Endpoint"]),
    "knowledge-index",
    new DefaultAzureCredential());

var searchResults = await searchClient.SearchAsync<Document>(
    query,
    new SearchOptions
    {
        QueryType = SearchQueryType.Semantic,
        VectorSearch = new VectorSearchOptions
        {
            Queries = { new VectorizedQuery(queryEmbedding) }
        },
        Size = 5
    });

// Build the prompt with retrieved context
var context = string.Join("\n\n", searchResults.Value
    .GetResults()
    .Select(r => $"[Source: {r.Document.Title}]\n{r.Document.Content}"));

var prompt = $"""
    Answer the user's question using only the information
    provided in the context below. Cite your sources.

    Context:
    {context}

    Question: {query}
    """;

// Generate the response
var response = await kernel.InvokePromptAsync(prompt);

This is simplified, but it shows the pattern. In production, you'd add error handling, caching, content safety checks, and more sophisticated prompt engineering.

Building AI Agents on AI Foundry

For applications where the AI needs to take actions - not just answer questions - you build an agent architecture.

Agent with Tools

Azure AI Foundry supports function calling (tool use) through the Azure OpenAI API. You define tools, and the model decides when to call them.

// Define tools the agent can use
var tools = new List<ChatCompletionsFunctionToolDefinition>
{
    new(new FunctionDefinition("get_customer")
    {
        Description = "Look up customer details by ID or email",
        Parameters = BinaryData.FromObjectAsJson(new
        {
            type = "object",
            properties = new
            {
                customer_id = new { type = "string" },
                email = new { type = "string" }
            }
        })
    }),
    new(new FunctionDefinition("create_ticket")
    {
        Description = "Create a support ticket",
        Parameters = BinaryData.FromObjectAsJson(new
        {
            type = "object",
            properties = new
            {
                subject = new { type = "string" },
                description = new { type = "string" },
                priority = new { type = "string", @enum = new[] { "low", "medium", "high" } }
            },
            required = new[] { "subject", "description" }
        })
    })
};

The model calls these tools as needed during a conversation. Your code executes the actual operations (CRM lookup, ticket creation) and returns results to the model for the next step.
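The execution side of that loop can be sketched as a simple dispatch table. The handlers below are hypothetical stand-ins for the real CRM and ticketing calls - the point is the contract: the model names a tool and supplies JSON arguments, your code runs it and returns a JSON result:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class ToolDispatcher
{
    // Map tool names to handlers. These are placeholder implementations;
    // in production each would call your CRM or ticketing system.
    static readonly Dictionary<string, Func<JsonElement, string>> Handlers = new()
    {
        ["get_customer"] = args => JsonSerializer.Serialize(
            new { customer_id = args.GetProperty("customer_id").GetString(), status = "active" }),
        ["create_ticket"] = args => JsonSerializer.Serialize(
            new { ticket_id = "T-1001", subject = args.GetProperty("subject").GetString() })
    };

    // Run the tool the model asked for; the JSON result goes back to the
    // model as a tool message on the next request.
    public static string Execute(string toolName, string argumentsJson)
    {
        if (!Handlers.TryGetValue(toolName, out var handler))
            return JsonSerializer.Serialize(new { error = $"unknown tool: {toolName}" });
        using var doc = JsonDocument.Parse(argumentsJson);
        return handler(doc.RootElement);
    }
}
```

Keeping execution behind a dispatch table like this also gives you one place to add authorisation checks and audit logging before any tool actually runs.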

Multi-Agent Patterns

For complex workflows, we sometimes use multiple specialised agents:

  • Router agent: Determines the type of request and directs to the right specialist
  • Knowledge agent: Handles information retrieval (RAG)
  • Action agent: Handles operations that change state (create, update, delete)
  • Escalation agent: Manages handoff to human operators

Each agent has its own system prompt, tools, and constraints. The router coordinates between them.
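The router's job reduces to mapping a request to the name of a specialist. The keyword version below is deliberately naive - in practice we'd use an LLM classification call - but the interface is the same either way:

```csharp
using System;

public static class RequestRouter
{
    // Illustrative routing logic. A production router would classify with
    // a cheap model call (e.g. GPT-4o-mini), but it still returns the
    // name of the agent that should handle the request.
    public static string Route(string request)
    {
        var text = request.ToLowerInvariant();
        if (text.Contains("speak to a human") || text.Contains("complaint"))
            return "escalation";
        if (text.Contains("create") || text.Contains("update") || text.Contains("cancel"))
            return "action";
        return "knowledge";   // default: information retrieval via RAG
    }
}
```

Because each specialist has its own system prompt and tool set, a wrong routing decision is contained - the knowledge agent simply has no ticket-creation tool to misuse.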

Content Safety and Guardrails

Enterprise AI needs guardrails. Azure AI Foundry integrates with Azure AI Content Safety to filter both inputs and outputs.

What You Should Configure

Input filters: Block or flag prompts that contain harmful content, prompt injection attempts, or off-topic requests.

Output filters: Prevent the AI from generating content that's inappropriate, inaccurate beyond acceptable bounds, or outside its scope.

Custom blocklists: Add company-specific terms, competitor names, or topics the AI should never discuss.

Groundedness detection: Check whether the AI's response is actually grounded in the provided context (important for RAG applications) or if it's making things up.

Prompt Injection Protection

Enterprise AI applications are targets for prompt injection - where users craft inputs to make the AI ignore its instructions. Azure AI Foundry's content safety includes prompt injection detection, but it shouldn't be your only defence.

Defence in depth:

  1. Content Safety prompt shield (catches common patterns)
  2. System prompt design (clear boundaries)
  3. Input validation in your application code
  4. Output validation before returning to users
  5. Monitoring for unusual patterns
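Layer 3 - input validation in your own code - can be as simple as a blocklist check that runs before anything reaches the model. The phrases below are assumptions based on common injection patterns; this complements Content Safety's prompt shields, it does not replace them:

```csharp
using System;
using System.Linq;

public static class InputValidator
{
    // Common injection phrasings. Extend this list from your own
    // monitoring data - attackers vary their wording constantly.
    static readonly string[] SuspiciousPhrases =
    {
        "ignore previous instructions",
        "ignore the above",
        "disregard your rules",
        "you are now",
        "reveal your system prompt"
    };

    public static bool LooksLikeInjection(string input)
    {
        var text = input.ToLowerInvariant();
        return SuspiciousPhrases.Any(p => text.Contains(p));
    }
}
```

A flagged input doesn't have to be rejected outright - logging it and routing it to a more constrained handling path is often the better operational choice.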

Evaluation and Testing

Azure AI Foundry includes evaluation tools. Use them.

Built-in Metrics

  • Groundedness: Is the response based on the provided context?
  • Relevance: Does the response answer the question?
  • Coherence: Is the response well-structured and readable?
  • Fluency: Is the language natural?
  • Similarity: How close is the response to a reference answer?

Custom Evaluations

For business-specific quality criteria, build custom evaluations:

  • Does the response follow our brand guidelines?
  • Does it correctly apply business rules?
  • Does it appropriately escalate when it should?
  • Does it include the required disclaimers?

Evaluation Datasets

Build a test dataset of 100+ question-answer pairs representing real use cases. Include:

  • Common questions (the easy ones)
  • Edge cases (the tricky ones)
  • Out-of-scope questions (should the AI decline?)
  • Adversarial inputs (trying to break it)

Run evaluations on every significant change to prompts, models, or data. This catches regressions before they reach production.
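The shape of that regression run is straightforward. The harness below uses crude string-containment scoring as a placeholder for Foundry's built-in evaluators, and the `EvalCase` fields are assumptions - but the structure (a dataset, an answer function, a score per change) is the part that matters:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One test case: a question, the facts a good answer must contain,
// and whether the assistant should decline instead of answering.
public record EvalCase(string Question, string[] RequiredFacts, bool ShouldDecline);

public static class EvalHarness
{
    // Returns the fraction of cases passed. Swap the containment checks
    // for groundedness/relevance evaluators in a real pipeline.
    public static double Score(IEnumerable<EvalCase> cases, Func<string, string> answer)
    {
        var results = cases.Select(c =>
        {
            var response = answer(c.Question);
            if (c.ShouldDecline)
                return response.Contains("can't help", StringComparison.OrdinalIgnoreCase) ? 1 : 0;
            return c.RequiredFacts.All(f => response.Contains(f, StringComparison.OrdinalIgnoreCase)) ? 1 : 0;
        }).ToList();
        return results.Average();
    }
}
```

Wire `answer` to your real application endpoint and run this on every prompt, model, or data change - a score drop is a regression caught before production.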

Production Deployment

Infrastructure Recommendations

Networking: Use Private Endpoints for all AI Foundry services. Keep AI traffic off the public internet.

Authentication: Managed identities everywhere. No API keys in code or configuration.

Monitoring: Azure Monitor and Application Insights for infrastructure metrics. Custom telemetry for AI-specific metrics (response quality, tool call patterns, error types).

Scaling: Azure OpenAI has quota limits. Plan for burst capacity. Consider Provisioned Throughput for predictable, high-volume workloads.

Disaster recovery: Deploy to multiple regions if uptime is critical. AI Foundry projects can be replicated, but model capacity needs to be pre-allocated in each region.

Cost Management

Azure AI costs can surprise you. Track and manage:

  • Token consumption: Monitor daily. Set budgets and alerts.
  • Search operations: AI Search charges per operation. High-volume RAG applications can generate significant search costs.
  • Document processing: Document Intelligence charges per page. Large document processing jobs add up.
  • Model deployment: Provisioned Throughput has a fixed hourly cost regardless of utilisation.

Use GPT-4o-mini for everything that doesn't need GPT-4o's reasoning capability. The cost difference is roughly 10x.
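A quick calculation makes the routing decision concrete. The prices in the usage example are placeholders - Azure OpenAI pricing varies by region and changes over time, so check the pricing page before relying on any specific figures:

```csharp
using System;

public static class TokenCost
{
    // Monthly cost for a steady daily token volume at a given
    // per-million-token price (assumes a 30-day month).
    public static decimal MonthlyCost(long tokensPerDay, decimal pricePerMillionTokens)
        => tokensPerDay / 1_000_000m * pricePerMillionTokens * 30;

    public static void Main()
    {
        // Hypothetical prices purely to show the shape of the comparison:
        // at a ~10x price gap, routing 2M tokens/day to the cheaper model
        // is the difference between these two numbers.
        Console.WriteLine(TokenCost.MonthlyCost(2_000_000, 25m));   // expensive model
        Console.WriteLine(TokenCost.MonthlyCost(2_000_000, 2.5m));  // cheaper model
    }
}
```

Running this kind of projection per workload is how you decide which tasks genuinely justify the larger model.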

Lessons from Production

After building multiple enterprise applications on Azure AI Foundry, here's what we've learned.

Start with the SDK, not Prompt Flow, for production. Prompt Flow is excellent for prototyping and evaluation, but production applications need version control, testing frameworks, and deployment pipelines that work better with code.

Plan for model updates. Azure regularly updates model versions. What worked with gpt-4o-2024-05-13 might behave differently with gpt-4o-2024-11-20. Test against new versions before switching.

Quota management is operational. Token quotas are shared across your subscription. If one application spikes, it can starve others. Use separate deployments and monitor quota usage.

Content safety is not optional. Enterprise AI without content safety is a liability waiting to happen. Configure it from day one, not as an afterthought.

Evaluation is ongoing. Your first evaluation dataset isn't your last. Add real user questions that the system handled poorly. Continuously improve.

How Team 400 Works with Azure AI Foundry

We're Azure AI Foundry consultants with deep experience building enterprise applications on the platform. Our team has delivered RAG systems, AI agents, and document processing pipelines for Australian organisations across financial services, professional services, manufacturing, and government.

Our AI consulting services cover the full lifecycle - architecture design, development, deployment, and ongoing optimisation. We work with your existing Azure infrastructure and integrate with your current systems through our integration services.

As AI agent developers building on Azure AI Foundry, we bring practical experience with the platform's strengths and limitations. We'll tell you where Foundry excels and where you might need complementary approaches.

If you're evaluating Azure AI Foundry for your enterprise AI application, talk to our team. We can help you design the right architecture and avoid the pitfalls we've already encountered.