Getting Started with Azure AI Foundry - A Step by Step Guide
Getting started with Azure AI Foundry is easier than most enterprise AI platforms, but there are still decisions early on that affect how smoothly everything runs later. We've set up Azure AI Foundry environments for dozens of Australian organisations, and the same mistakes come up repeatedly.
This guide walks you through the setup process with the practical advice that the official documentation skips.
Prerequisites - What You Need Before You Start
Before touching Azure AI Foundry, make sure you have these in place:
An Azure subscription: If your organisation already uses Azure, you'll want to create AI Foundry resources within your existing subscription. If you're starting fresh, an Azure free tier account gives you $200 USD in credits, which is enough for initial experimentation but won't last long once you start deploying models.
Appropriate permissions: You'll need at least Contributor access to the resource group where you'll create AI Foundry resources. For production setups, you'll also need the ability to configure role-based access control (RBAC). If you're working within a larger IT organisation, get these permissions sorted before you begin - we've seen projects delayed by weeks waiting for access approvals.
A clear first use case: Don't set up Azure AI Foundry "to explore." Have a specific problem in mind. "We want to build a document Q&A system over our internal policies" is a good starting point. "We want to try AI" is not. Having a concrete goal shapes every configuration decision that follows.
Step 1 - Create an Azure AI Foundry Hub
The hub is the top-level organisational unit in Azure AI Foundry. Think of it as the shared infrastructure layer that multiple projects can use.
- Go to the Azure Portal and search for "Azure AI Foundry"
- Click "Create a hub"
- Choose your subscription and resource group (or create a new one)
- Pick a region
Region selection matters. Not all models are available in all regions, and pricing varies. For Australian organisations, here's our recommendation:
| Scenario | Recommended Region |
|---|---|
| Data sovereignty required | Australia East (Sydney) |
| Broadest model availability | East US or East US 2 |
| Cost-sensitive workloads | East US (generally cheapest) |
| Low latency for Australian users | Australia East |
If data sovereignty isn't a hard requirement, we often recommend East US for experimentation (broadest model catalogue) and Australia East for production workloads (lower latency for Australian users).
Configure the associated resources:
- Storage account: Created automatically. Accept the defaults unless you have specific naming conventions.
- Key Vault: Created automatically. This stores your API keys and connection strings.
- Application Insights: Optional but recommended. Enables monitoring and logging for deployed models.
- Container Registry: Only needed if you plan to deploy custom containers. Skip it for now if you're starting with hosted models.
Click "Create" and wait 3-5 minutes for deployment.
Common mistake: Creating multiple hubs when you only need one. A single hub can support many projects. Create additional hubs only if you need hard isolation between business units or have different compliance requirements for different workloads.
Step 2 - Create Your First Project
Projects sit inside hubs and represent individual AI initiatives. Each project has its own set of connections, deployments, and access controls.
- Open your hub in the Azure AI Foundry portal (ai.azure.com)
- Click "New project"
- Give it a descriptive name (e.g., "policy-qa-assistant" not "test-project-1")
- The project inherits the hub's region and associated resources
Naming conventions matter more than you think. When you have 15 projects six months from now, "project-alpha" and "demo-v2" won't help anyone. We use this pattern: `{team}-{use-case}-{environment}` - for example, `ops-document-qa-dev` or `finance-reporting-assistant-prod`.
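If you want to enforce that convention programmatically (for example, in a provisioning script), a small helper can do it. This is a sketch: the function name and validation rules are our own, not part of any Azure SDK.

```python
import re

def project_name(team: str, use_case: str, environment: str) -> str:
    """Build a project name following the {team}-{use-case}-{environment} pattern.

    Rejects segments that would make names hard to read: only lowercase
    letters, digits, and internal hyphens are allowed.
    """
    for segment in (team, use_case, environment):
        if not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", segment):
            raise ValueError(f"Use lowercase letters, digits, and hyphens: {segment!r}")
    return "-".join((team, use_case, environment))

# Examples matching the convention above:
print(project_name("ops", "document-qa", "dev"))        # ops-document-qa-dev
print(project_name("finance", "reporting-assistant", "prod"))
```

Running this in a provisioning script catches bad names before they land in the portal, where renaming is awkward.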
Step 3 - Deploy Your First Model
Now you have a project. Time to deploy a model so you can actually build something.
- In your project, go to "Model catalog"
- Browse or search for the model you want
Which model should you start with? Here's a practical guide:
| Use Case | Recommended Starting Model | Why |
|---|---|---|
| General Q&A, chatbot | GPT-4o mini | Good performance, low cost, fast |
| Complex reasoning, analysis | GPT-4o | Strongest reasoning, higher cost |
| Code generation | GPT-4o | Best code quality |
| Simple classification/extraction | GPT-4o mini or Phi-3 | Cost-effective for straightforward tasks |
| Cost-sensitive high volume | Llama 3.1 8B or Mistral Small | Strong performance per dollar |
- Click "Deploy" on your chosen model
- Choose a deployment name (keep it short and descriptive)
- Select the deployment type:
- Serverless API (pay-per-token): Best for most use cases. No infrastructure to manage.
- Managed compute: Better for high-volume production workloads where you want dedicated capacity.
- Set rate limits and content filters:
- Rate limits: Start with the defaults. You can increase them later.
- Content filters: Keep the default safety filters enabled. You can customise them if specific use cases require it, but start with the defaults.
- Click "Deploy" and wait a few minutes.
Once deployed, you'll get an endpoint URL and API key. That's your model, running and ready to accept requests.
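Here's a minimal sketch of calling that endpoint with the `openai` Python package, which supports Azure OpenAI deployments. The endpoint, deployment name, and API version below are placeholders - substitute the values from your deployment's details page. The call is guarded behind an environment variable so the script can be read and run without credentials.

```python
import os

# Placeholders - substitute the values shown on your deployment's details page.
ENDPOINT = os.environ.get("AZURE_OPENAI_ENDPOINT", "https://<your-resource>.openai.azure.com")
API_KEY = os.environ.get("AZURE_OPENAI_API_KEY", "")
DEPLOYMENT = "gpt-4o-mini"  # your deployment name, not the model family name

messages = [
    {"role": "system", "content": "You are a helpful assistant for Contoso."},
    {"role": "user", "content": "What is our leave policy for contractors?"},
]

if API_KEY:  # only make the network call if credentials are configured
    from openai import AzureOpenAI  # pip install openai

    client = AzureOpenAI(
        azure_endpoint=ENDPOINT,
        api_key=API_KEY,
        api_version="2024-06-01",  # API versions change; check the current docs
    )
    response = client.chat.completions.create(model=DEPLOYMENT, messages=messages)
    print(response.choices[0].message.content)
```

Note that `model` takes your deployment name, not the catalogue model name - mixing these up is a common first-call error.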
Step 4 - Test in the Playground
Before writing any code, test your model in the playground. This is the fastest way to validate that your approach will work.
- Go to "Playgrounds" in your project
- Select your deployed model
- Start with the Chat playground for conversational use cases
Write a system prompt. This is the instruction that shapes how the model behaves. For a document Q&A assistant, something like:
You are a helpful assistant for [Company Name]. Answer questions based on the provided context.
If the context doesn't contain enough information to answer, say so clearly.
Always cite which document or section your answer comes from.
Keep responses concise and professional.
Test with real questions from your use case. Not "Hello, how are you?" but actual questions your users would ask.
Iterate on the system prompt based on the responses. This is where most of the early value comes from - getting the system prompt right before you write any application code.
Pro tip: Keep a document of your prompt iterations and the results. When the model gives a bad answer, note what the input was, what it said, and what you expected. This becomes your evaluation dataset later.
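One lightweight way to keep that record is an append-only JSONL file - one JSON object per playground interaction. The field names here are just a suggestion, but the structure doubles neatly as a starting point for the evaluation dataset in Step 7.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

LOG_PATH = Path("prompt_iterations.jsonl")

def log_case(system_prompt: str, question: str, answer: str, expected: str, verdict: str) -> None:
    """Append one playground interaction to the iteration log."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "system_prompt": system_prompt,
        "question": question,
        "answer": answer,
        "expected": expected,
        "verdict": verdict,  # e.g. "good", "hallucinated", "missed citation"
    }
    with LOG_PATH.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

# Example entry after a playground test:
log_case(
    system_prompt="You are a helpful assistant for Contoso...",
    question="How many days of annual leave do permanent staff get?",
    answer="Permanent staff receive 20 days of annual leave.",
    expected="20 days, citing the Leave Policy document",
    verdict="missed citation",
)
```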
Step 5 - Connect Your Data (For RAG Applications)
If you're building an application that needs to answer questions from your own documents, you'll need to set up Retrieval Augmented Generation (RAG). This is the most common pattern we build for clients, and Azure AI Foundry has solid tooling for it.
Set Up Azure AI Search
- Create an Azure AI Search resource (if you don't have one already)
- Choose your pricing tier:
- Free: Fine for experimentation (50 MB storage, 3 indexes)
- Basic: Good for small production workloads (approximately AUD $1.50/hour)
- Standard: Required for larger document collections and higher query volumes
- In your AI Foundry project, go to "Connected resources" and add your AI Search instance
Index Your Documents
- Go to "Data + indexes" in your project
- Upload your documents or connect to Azure Blob Storage where they're stored
- Create an index with the built-in indexer
The indexer handles document chunking (splitting documents into searchable segments), embedding generation (converting text to vector representations), and index population.
Configuration decisions that matter:
- Chunk size: Default is usually fine (512-1024 tokens). Smaller chunks give more precise retrieval but lose context. Larger chunks preserve context but may include irrelevant information.
- Overlap: Set chunk overlap to 10-20% to avoid losing information at chunk boundaries.
- Embedding model: `text-embedding-ada-002` is the standard; `text-embedding-3-small` is newer and often slightly better.
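To build intuition for the chunk size and overlap settings, here's a simplified chunker. Real indexers split on token counts and document structure; this sketch uses whitespace-separated words as a rough stand-in for tokens, and the 15% overlap mirrors the 10-20% guidance above.

```python
def chunk_words(text: str, chunk_size: int = 512, overlap_ratio: float = 0.15) -> list[str]:
    """Split text into overlapping word windows.

    chunk_size is in words here as a rough proxy for tokens; adjacent
    chunks share roughly overlap_ratio of their length at the boundary,
    so information near a split point appears in both chunks.
    """
    words = text.split()
    step = max(1, int(chunk_size * (1 - overlap_ratio)))
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks

# A 1000-word document with 512-word chunks and 15% overlap yields 3 chunks.
sample = " ".join(f"word{i}" for i in range(1000))
chunks = chunk_words(sample, chunk_size=512, overlap_ratio=0.15)
```

The trade-off in the bullets above shows up directly in `chunk_size`: shrink it and each chunk is more focused but carries less surrounding context.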
Test RAG in the Playground
- Go back to the Chat playground
- Click "Add your data"
- Select your AI Search index
- Now when you ask questions, the model will retrieve relevant chunks from your documents before generating a response
Test with questions that you know the answer to. Check that the model is citing the right documents and giving accurate information. If accuracy is below 85%, the issue is usually with chunking strategy or the system prompt, not the model itself.
Step 6 - Build with Prompt Flow (Optional but Recommended)
Once you've validated your approach in the playground, Prompt Flow lets you build production-grade AI applications with a visual designer or code.
Prompt Flow supports:
- Multi-step workflows: Chain model calls together (e.g., classify a document, then extract specific fields based on the classification)
- Conditional logic: Route requests differently based on model outputs
- Custom code nodes: Write Python for data transformation, API calls, or business logic
- Evaluation runs: Batch-test your flow against a dataset of expected inputs and outputs
For a typical RAG application, your Prompt Flow might look like:
- Receive user query
- Retrieve relevant documents from AI Search
- Construct a prompt with the retrieved context
- Call the language model
- Post-process the response (format citations, apply business rules)
- Return the response
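Steps 2-3 of that flow - stuffing retrieved chunks into the prompt - look roughly like this. The retrieval result is stubbed out; in a real flow it would come from a query against your AI Search index, and the `source`/`content` field names are assumptions you'd adapt to your index schema.

```python
def build_rag_prompt(question: str, retrieved: list[dict]) -> list[dict]:
    """Construct chat messages from a user question and retrieved chunks.

    Each retrieved item is assumed to have 'source' and 'content' keys;
    rename these to match whatever fields your search index returns.
    """
    context = "\n\n".join(
        f"[{doc['source']}]\n{doc['content']}" for doc in retrieved
    )
    system = (
        "Answer using only the context below. If the context is insufficient, "
        "say so clearly. Cite the source in square brackets.\n\n"
        f"Context:\n{context}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

# Stubbed retrieval results standing in for an AI Search query:
docs = [
    {"source": "leave-policy.pdf#s2",
     "content": "Permanent staff accrue 20 days of annual leave per year."},
]
messages = build_rag_prompt("How much annual leave do permanent staff get?", docs)
```

The messages list then goes straight into the model call from Step 3, and the post-processing node parses citations out of the response.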
This is where the application starts to feel real. The playground proves feasibility; Prompt Flow builds something deployable.
Step 7 - Evaluate and Iterate
Before deploying to production, evaluate your application systematically.
- Create an evaluation dataset: 50-100 question-answer pairs that represent real usage
- Run an evaluation flow in Prompt Flow that tests your application against this dataset
- Review metrics:
- Groundedness: Is the model's answer supported by the retrieved documents?
- Relevance: Is the answer relevant to the question?
- Coherence: Is the answer well-structured and clear?
- Fluency: Is the language natural?
Azure AI Foundry includes built-in evaluation metrics for all of these. In our experience, groundedness is the most important metric to track - it directly measures whether the model is making things up.
Target metrics for production readiness:
- Groundedness: > 85%
- Relevance: > 90%
- User satisfaction (if you can measure it): > 80%
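Checking those thresholds over an evaluation run is simple arithmetic. The scores below are illustrative; Azure AI Foundry's built-in evaluators typically score each case on a 1-5 scale, which this sketch treats as passing at 4 or above - adjust the threshold to your own quality bar.

```python
def pass_rate(scores: list[int], threshold: int = 4) -> float:
    """Fraction of evaluation cases scoring at or above the threshold."""
    if not scores:
        return 0.0
    return sum(s >= threshold for s in scores) / len(scores)

# Illustrative evaluation run over 10 cases (1-5 scale per case):
groundedness_scores = [5, 5, 4, 5, 3, 5, 4, 5, 5, 4]
relevance_scores = [5, 4, 5, 5, 5, 4, 5, 5, 5, 5]

results = {
    "groundedness": pass_rate(groundedness_scores),  # target > 0.85
    "relevance": pass_rate(relevance_scores),        # target > 0.90
}
```

In this run groundedness comes out at 0.9 and relevance at 1.0, so both clear the production-readiness targets; the single groundedness score of 3 is the case you'd add to your iteration log and investigate.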
If you're not hitting these numbers, the fixes are usually:
- Better system prompt (most common)
- Better chunking strategy for your documents
- Adding more relevant documents to the index
- Switching to a more capable model
Step 8 - Deploy to Production
When your evaluation metrics look good, deploy:
- In Prompt Flow, click "Deploy" on your flow
- Choose a managed endpoint
- Configure scaling (start with 1 instance and scale based on traffic)
- Set up authentication (API key or Entra ID)
- Configure monitoring alerts for latency, errors, and token usage
Your deployed flow gets a REST API endpoint that your application code calls. From here, it's standard API integration.
Common Pitfalls We See
Having set up Azure AI Foundry for many Australian organisations, here are the mistakes that come up most often:
Not setting spending alerts: Azure consumption can surprise you. Set budget alerts at 50%, 75%, and 90% of your expected monthly spend from day one.
Over-engineering the first project: Start simple. Get a basic RAG application working before adding custom fine-tuning, complex orchestration, or multi-model routing. Complexity is easy to add later.
Ignoring content safety: The default content filters exist for good reason. Clients who disable them during development and forget to re-enable them for production create real risk.
Not planning for Entra ID integration: If your organisation uses Entra ID (and most Australian enterprises do), plan your RBAC from the start. Retrofitting access controls is painful.
Skipping evaluation: "It seems to work" is not a deployment criterion. Build evaluation into your process from the beginning, even if it's just 20 test cases.
Getting Help
Azure AI Foundry has a learning curve, and the documentation doesn't cover every production scenario. If you'd rather skip the trial-and-error phase, we offer a structured Azure AI Foundry setup and onboarding engagement that gets your team productive in weeks rather than months.
We're Azure AI consultants who have been building on Microsoft's AI platform since before it was called AI Foundry. We know where the documentation falls short and where the real configuration decisions matter.
Get in touch if you want to discuss your specific setup, or explore our AI consulting services to see how we work with Australian organisations.