Back to Blog

Getting Started with Azure AI Foundry - A Step by Step Setup Walkthrough

May 29, 202610 min readMichael Ridland

Azure AI Foundry has settled down a lot since its preview days. The product that was called Azure AI Studio in 2024 is now the default home for anyone building production AI on Microsoft's stack, and the experience is genuinely much better than it was eighteen months ago. But "much better" is not the same as "obvious", and the official Microsoft Learn tutorials still skip the bits that bite Australian teams in production.

This is the walkthrough I give clients when we are setting up Foundry for the first time. It covers the actual steps, the decisions you need to make along the way, and the things to do now that will save you a painful migration later.

Before you log in - the prerequisites people forget

You can technically start Foundry with a personal Microsoft account and a credit card. Do not do this for anything you plan to keep. The right starting position has three things in place:

  1. An Azure subscription owned by your organisation (not a personal pay-as-you-go account)
  2. A resource group dedicated to AI workloads (we usually call ours something like rg-ai-foundry-prod-aue)
  3. An understanding of which Azure region you will use

That last point is where most Australian teams get stuck. Australia East has the broadest model availability, including GPT-4.1, GPT-4o, the Claude family that Microsoft now hosts directly through Foundry's models-as-a-service, and the latest open-weights models like Llama 4. Australia Southeast has less coverage. If your data residency requirements force Australia Southeast, expect to wait longer for new model availability and have a fallback plan.

You also need to decide upfront whether your workload requires data to stay in Australia. If yes, set up your hub in Australia East and explicitly choose only Australian-hosted models. If your compliance team is okay with US-hosted models, you have more options but you must document the data flow for your records.

Step 1 - Create the Foundry hub

Foundry uses a two-level structure: a hub holds shared resources (storage, key vaults, model deployments, content safety settings), and projects sit inside the hub for individual teams or workloads.

To create a hub:

  1. Go to ai.azure.com and sign in with your work account
  2. Click Create new and choose Hub
  3. Pick your subscription and the resource group you created
  4. Choose Australia East as the region
  5. Give it a name. We use a pattern like hub-{org}-ai-prod-aue-01. The number lets you create a second hub later if you need to.

Foundry will provision a hub, a storage account, a key vault, and an Application Insights resource. This takes about three minutes. Do not skip the Application Insights resource - you will want the telemetry later, and adding it after the fact is annoying.

One choice you make at hub creation that you cannot easily change: the storage account configuration. Use a private endpoint and disable public network access if you are touching sensitive data. We had a financial services client who set this up with default public access, then had to redo the whole hub four weeks later when their security team audited the setup.

Step 2 - Create your first project

Inside the hub, create a project. Projects are where the actual work happens - they hold prompt flows, evaluation runs, deployed endpoints, and project-specific data.

We always create at least two projects from day one:

  • proj-{workload}-dev for experimentation
  • proj-{workload}-prod for what gets deployed

Even if you are a team of one, this separation pays off the first time you accidentally break something. Endpoints in the prod project should only be touched through your deployment pipeline, never edited live in the portal.

Step 3 - Deploy your first model

This is the step that feels exciting but has the most expensive mistakes hiding in it.

In the project, go to Model catalog. You will see hundreds of models. Most of them you can ignore. For a first deployment, choose between:

  • GPT-4.1 - the default for production reasoning, agent workloads, and anything where accuracy matters
  • GPT-4o mini - the default for cost-sensitive workloads or high-volume routing
  • GPT-4.1 mini - the right middle ground for most chat applications
  • Claude Sonnet 4.6 - now available directly through Foundry, strong for code and structured output
  • text-embedding-3-large - the default embedding model for retrieval

Click Deploy. You will be asked for a deployment name and a tier. The deployment tier matters more than people realise.

Standard (pay-as-you-go) is fine for development and low-volume production. You pay per token, the throughput is shared, and you have no commitment.

Provisioned Throughput Units (PTUs) are dedicated capacity you reserve. They cost more per month but the per-token cost is lower and the latency is predictable. We recommend PTUs for production workloads doing more than about $4,500 a month of GPT-4 traffic. Below that, pay-as-you-go is cheaper.

DataZone Standard keeps your inference inside a specific geography. For Australian compliance, use the Australia data zone.

We had a retail client whose first deployment was Global Standard, which routes inference to whatever Microsoft region has capacity. Their privacy team flagged it three weeks later. We redeployed everything as Australia Data Zone in a day, but it would have been ten minutes if we had picked the right option first time.

Step 4 - Set up authentication properly

This is the step every tutorial glosses over and every production deployment regrets later.

Do not use API keys for anything that will run for more than a week. Set up Entra ID authentication (formerly Azure AD) from the start. The pattern looks like this:

  1. Create a managed identity for whatever service will call Foundry (your App Service, your Container App, your Function App)
  2. Grant it the Cognitive Services User role on the Foundry resource
  3. In your code, use the DefaultAzureCredential class from the Azure SDK

This means no keys in your config, no keys in your CI pipeline, and no keys to rotate. When the next intern accidentally commits a key to GitHub, you will be glad you set this up.

For local development, you sign in with az login and the same code that uses managed identity in production uses your developer token locally. The Azure SDK handles this transparently.

Step 5 - Wire up Content Safety

Azure Content Safety filters prompts and responses for harmful content. It is included with Foundry deployments at no extra charge, but the default thresholds are not what you want for production.

Go to the model deployment, click Content filters, and create a custom filter. Set hate, violence, sexual, and self-harm to medium severity. Turn on jailbreak detection and prompt shields. Turn on protected material detection if you care about copyrighted code or text appearing in outputs.

For a client-facing chatbot, we usually tighten the thresholds further. For an internal tool used by knowledge workers, the default medium settings are fine.

The content safety logs are also where you will see attempted prompt injections and abuse. Plan to review them weekly for the first month, monthly thereafter.

Step 6 - Build your first prompt flow

Foundry's prompt flow editor is the visual tool for building AI workflows. It is not the only way to build with Foundry - you can use the SDK directly from Python or .NET - but for getting started, the prompt flow is the fastest way to see your model do something useful.

Create a new prompt flow from the standard chat template. You get a flow with three nodes: an input, a prompt that calls your deployed model, and an output. Run it. Verify it works end to end. Modify the prompt to do something specific to your use case.

The thing prompt flow does that most tutorials skip: it gives you a free, built-in evaluation framework. You can attach a test dataset, define metrics (groundedness, fluency, relevance), and run an evaluation across versions of your prompt to compare them. This is the difference between professional AI development and vibes-based prompt engineering.

For Australian teams, the evaluation tooling alone is worth the move to Foundry. You can have your prompt engineer iterate while your QA team owns the test set, and both sides have a shared source of truth about whether changes are improvements.

Step 7 - Deploy to an endpoint

When you are ready, deploy your prompt flow to a managed online endpoint. Foundry will spin up the necessary infrastructure, attach the right networking, and give you a callable HTTPS URL.

There are two endpoint types:

  • Managed online endpoint - Microsoft manages the infrastructure, you pay per compute hour
  • Real-time endpoint - similar but with more control, mostly used for custom models

For Foundry prompt flows, managed online endpoints are the default. Start with the smallest compute SKU and scale up if you hit latency or throughput problems.

The endpoint URL is what your application calls. Auth is through Entra ID by default. Logging goes to the Application Insights resource you created with the hub.

Step 8 - Connect to your data

Most real Foundry workloads need data. The right pattern depends on what kind of workload.

For retrieval-augmented generation, the default Foundry approach is Azure AI Search. Create an AI Search resource, index your documents, and add a search node to your prompt flow that retrieves chunks before sending the prompt to the model. We have walked through this end-to-end with a LangChain on Azure RAG build for several clients.

For agent workloads that call your business systems, set up tool connections. Foundry supports function calling natively, and the SDK has helpers for wiring up tools to GPT and Claude deployments.

For workloads that just need conversational memory, use the built-in conversation history features in the chat completions API. Do not roll your own.

Step 9 - Set up monitoring before you go live

Foundry feeds Application Insights with telemetry by default. The thing it does not do by default is alert you when things go wrong.

Set up three alerts before any production traffic hits your endpoint:

  1. Latency above your SLA threshold (we usually use 5 seconds for chat workloads)
  2. Error rate above 2 percent over 15 minutes
  3. Token usage approaching your quota

The third one is the most important. We have seen Foundry workloads quietly hit their TPM quota during a marketing push and degrade silently because nobody was watching the quota dashboard.

Common setup mistakes we keep seeing

A short list of the things that trip Australian teams up:

  • Choosing Global Standard deployments when data residency requires Australia Data Zone
  • Forgetting to enable private endpoints on the hub storage account
  • Using API keys instead of managed identities
  • Not separating dev and prod projects
  • Skipping the Application Insights resource
  • Picking PTUs before they have any production traffic to size against
  • Leaving default content safety thresholds for client-facing apps

Most of these are fifteen-minute fixes if you catch them in the first week. They become two-week projects if you find them in month three.

When you should get help

If you are a developer who has built with the OpenAI API before, you can probably get a working Foundry deployment up in a day. The portal experience is forgiving and the docs have improved.

If you are an enterprise team setting up Foundry for the organisation, with networking, identity, governance, and multiple workloads in scope, this is the kind of setup we do regularly for Australian businesses. Two weeks of consulting at the start usually saves three to six months of expensive corrections.

We help organisations across Australia with Azure AI Foundry consulting, including initial setup, security hardening, and production deployments. If you are already in flight and stuck, we also do paid second opinions.

You can also read more about our broader Azure AI consulting practice, our work with the Microsoft AI Agent Framework, or our custom AI development work for Australian businesses. If you want hands-on enablement for your team rather than consulting, we run Microsoft Copilot training sessions across Sydney, Melbourne, and Brisbane.

Get in touch if you want a free 30-minute review of your Foundry plans before you start. We will give you an honest read on whether you are ready to build, or whether there are two or three setup decisions worth making first.