How to Migrate from OpenAI to Azure OpenAI Service
You built your AI application using OpenAI's API directly. It works. But now you need enterprise security, data residency in Australia, virtual network integration, or your procurement team is telling you everything needs to go through your Azure agreement.
Time to migrate to Azure OpenAI Service.
The good news is that the migration is simpler than most people expect. The models are the same. The API is nearly identical. The main changes are around authentication, endpoint configuration, and deployment management.
I'm Michael Ridland, founder of Team 400, and we've migrated dozens of applications from OpenAI to Azure OpenAI for Australian organisations. Here's exactly how to do it.
Why Migrate to Azure OpenAI?
Before getting into the how, let's be clear about the why. These are the reasons our clients make the switch.
Data residency. Azure OpenAI runs in Australian regions (Australia East). Your data stays in Australia. For organisations subject to data sovereignty requirements - financial services, government, healthcare - this is often the trigger.
Enterprise security. Azure OpenAI integrates with your existing Azure security infrastructure: Azure Active Directory (Entra ID), private endpoints, virtual networks, managed identities. No API keys floating around in environment variables.
Compliance. Azure OpenAI comes with Microsoft's enterprise compliance certifications - SOC 2, ISO 27001, IRAP (important for Australian government). OpenAI direct has its own compliance posture, but it doesn't match Azure's enterprise credentials for Australian regulated industries.
Billing and procurement. Consolidate AI spend under your existing Microsoft Enterprise Agreement. One bill, one procurement process, one vendor relationship.
Content filtering. Azure OpenAI includes configurable content filters that help meet responsible AI requirements. You can tune these for your use case.
Abuse monitoring opt-out. For eligible enterprise customers, Microsoft offers an abuse monitoring opt-out, meaning your prompts and completions aren't stored by Microsoft for review. This matters for sensitive data.
Rate limits and capacity. Azure OpenAI lets you provision dedicated capacity (Provisioned Throughput Units) for predictable performance. No more fighting for capacity during peak times.
What Stays the Same
The migration is manageable because most things don't change.
- The models are identical. GPT-4o on Azure OpenAI is the same GPT-4o as on OpenAI. Same capabilities, same quality.
- The API structure is nearly identical. Request and response formats are the same. The chat completions API works the same way.
- The Python SDK supports both. The official openai Python library works with both OpenAI and Azure OpenAI with minor configuration changes.
- Prompt engineering carries over. Your system prompts, few-shot examples, and prompt templates work unchanged.
What Changes
Here's what you need to update.
1. Authentication
OpenAI direct:
from openai import OpenAI
client = OpenAI(api_key="sk-...")
Azure OpenAI:
from openai import AzureOpenAI
client = AzureOpenAI(
    azure_endpoint="https://your-resource.openai.azure.com/",
    api_key="your-azure-key",
    api_version="2024-10-21"
)
Better yet, use Azure Active Directory authentication instead of API keys. Use a token provider rather than fetching a one-off token, so the SDK refreshes credentials automatically (a raw token expires after roughly an hour):
from openai import AzureOpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)
client = AzureOpenAI(
    azure_endpoint="https://your-resource.openai.azure.com/",
    azure_ad_token_provider=token_provider,
    api_version="2024-10-21"
)
This removes API keys entirely. The identity is managed through Azure AD, which means proper RBAC, audit logging, and no secrets to rotate.
2. Model References Become Deployment Names
This is the biggest conceptual change. On OpenAI, you reference models by name (gpt-4o). On Azure OpenAI, you reference a deployment name that you define.
OpenAI direct:
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}]
)
Azure OpenAI:
response = client.chat.completions.create(
    model="my-gpt4o-deployment",  # Your deployment name, not the model name
    messages=[{"role": "user", "content": "Hello"}]
)
You create deployments in the Azure portal or via the Azure CLI. Each deployment maps to a specific model version and has its own rate limits.
We recommend naming deployments descriptively: gpt-4o-production, gpt-4o-mini-dev, text-embedding-3-large-prod.
3. API Version Parameter
Azure OpenAI requires an explicit API version. This is how Microsoft manages backward compatibility as the service evolves.
Current stable version: 2024-10-21
Preview versions offer newer features but may change.
Always pin to a specific version in production. Don't use "latest" - you want predictable behaviour.
4. Endpoint URL Structure
Azure OpenAI endpoints include your resource name:
https://{resource-name}.openai.azure.com/openai/deployments/{deployment-name}/chat/completions?api-version={api-version}
The SDK handles this URL construction. You just provide the base endpoint and deployment name.
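As a sketch of what the SDK does under the hood, the construction looks roughly like this (resource and deployment names are placeholders):

```python
def azure_chat_url(resource: str, deployment: str, api_version: str) -> str:
    # Mirror the URL structure Azure OpenAI uses for chat completions
    return (
        f"https://{resource}.openai.azure.com/openai/deployments/"
        f"{deployment}/chat/completions?api-version={api_version}"
    )

print(azure_chat_url("contoso", "gpt-4o-production", "2024-10-21"))
```

You never need to build this yourself when using the SDK, but knowing the shape helps when reading logs or configuring API gateways in front of Azure OpenAI.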
5. Embedding Model Changes
If you're using embeddings, the migration is the same pattern:
OpenAI direct:
response = client.embeddings.create(
    model="text-embedding-3-large",
    input="Your text here"
)
Azure OpenAI:
response = client.embeddings.create(
    model="my-embedding-deployment",  # Your deployment name
    input="Your text here"
)
The embeddings produced are identical. Your existing vector indexes remain valid - you don't need to re-embed your document collection.
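If you want to sanity-check this before go-live, run the same text through both endpoints and compare the vectors. A minimal cosine-similarity helper (pure Python, no dependencies):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Cosine similarity between two embedding vectors; 1.0 means same direction
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

A similarity of effectively 1.0 confirms the deployments match; anything meaningfully lower suggests your Azure deployment is pinned to a different model version than the one you used on OpenAI.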
Step-by-Step Migration Plan
Week 1 - Set Up Azure OpenAI
Create the Azure OpenAI resource:
- In the Azure portal, create a new Azure OpenAI resource
- Select the Australia East region for data residency
- Choose the pricing tier (Standard for most use cases)
- Configure networking (public access for now, restrict later)
Deploy your models:
- In Azure AI Studio (or via CLI), create deployments for each model you use
- Match the model versions you're currently using on OpenAI
- Set appropriate rate limits (tokens per minute)
- Note down deployment names
Set up authentication:
For development, API keys are fine. For production, set up Azure AD authentication:
- Create a managed identity for your application
- Assign the "Cognitive Services OpenAI User" role
- Update your application to use DefaultAzureCredential
Week 2 - Update Application Code
Create a configuration abstraction:
Don't hardcode the switch. Create a configuration layer that lets you switch between OpenAI and Azure OpenAI.
import os
from openai import OpenAI, AzureOpenAI
def create_client():
    provider = os.getenv("AI_PROVIDER", "azure")
    if provider == "azure":
        return AzureOpenAI(
            azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
            api_key=os.getenv("AZURE_OPENAI_KEY"),
            api_version=os.getenv("AZURE_OPENAI_API_VERSION", "2024-10-21")
        )
    else:
        return OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
Update model references:
Search your codebase for all model name references and replace with deployment names (or configuration variables).
# Before
model = "gpt-4o"
# After
model = os.getenv("GPT4O_DEPLOYMENT_NAME", "gpt-4o-production")
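One pattern we like is a single lookup that maps logical model names to deployment names, with environment-variable overrides. The variable names and defaults here are illustrative, not conventions:

```python
import os

# Logical model name -> Azure deployment name, overridable per environment.
# Env var names and defaults below are examples only.
DEPLOYMENTS = {
    "gpt-4o": os.getenv("GPT4O_DEPLOYMENT_NAME", "gpt-4o-production"),
    "gpt-4o-mini": os.getenv("GPT4O_MINI_DEPLOYMENT_NAME", "gpt-4o-mini-production"),
    "text-embedding-3-large": os.getenv("EMBEDDING_DEPLOYMENT_NAME", "text-embedding-3-large-prod"),
}

def deployment_for(model: str) -> str:
    return DEPLOYMENTS[model]
```

This keeps the rest of your codebase referring to models by their familiar names, with the Azure-specific naming confined to one place.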
Update error handling:
Azure OpenAI has slightly different error responses. Specifically:
- Rate limiting returns 429 with a retry-after header
- Content filtering triggers return a 400 with a specific error code
- Authentication errors differ between API key and Azure AD
Update your retry logic and error handling to account for these.
Handle content filtering:
Azure OpenAI includes content filters by default. If your application processes content that might trigger filters (medical content, security discussions, etc.), you may need to:
- Adjust content filter settings in the Azure portal
- Handle content filter errors in your application code
- Request custom content filtering configurations from Microsoft if needed
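In application code, the filter block arrives as a 400 error whose body carries a content-filter code. A small predicate like this can route those errors to dedicated handling; treat the exact body shape as an assumption to verify against your API version:

```python
def is_content_filter_block(error_body: dict) -> bool:
    # Azure's documented error format puts a "content_filter" code inside
    # the "error" object for filter hits; verify against your API version.
    return error_body.get("error", {}).get("code") == "content_filter"
```

Logging these separately from ordinary 400s makes it much easier to spot filters blocking legitimate traffic during the validation phase below.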
Week 3 - Test and Validate
Run your test suite against Azure OpenAI:
If you have automated tests (you should), run them against the Azure OpenAI endpoint. Look for:
- Same response quality
- Acceptable latency (Australian regions may have slightly different latency than US-based OpenAI)
- Content filtering not blocking legitimate requests
- Rate limits sufficient for your load
Compare outputs:
For critical use cases, run a sample of real inputs through both endpoints and compare outputs. They should be functionally identical, though not necessarily word-for-word the same (LLM outputs are non-deterministic).
Load test:
Verify your Azure OpenAI rate limits handle your expected load. Request increases if needed - Microsoft can provision additional capacity.
Week 4 - Go Live
Staged rollout:
Don't switch everything at once. Options:
- Feature flag - Route a percentage of traffic to Azure OpenAI, increasing over days
- Environment-based - Switch dev, then staging, then production
- Use-case-based - Migrate one feature/endpoint at a time
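The feature-flag option can start as a weighted coin flip in your client factory. A sketch, where the fraction would come from your flag system rather than being hardcoded:

```python
import random

def pick_provider(azure_fraction: float) -> str:
    # Route azure_fraction of requests to Azure OpenAI, the rest to OpenAI direct
    return "azure" if random.random() < azure_fraction else "openai"
```

In production you would usually key this off a stable hash of the user or session ID instead of pure randomness, so a given user consistently hits the same provider while you compare behaviour.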
Monitor closely:
During the first week after migration, watch:
- Response latency
- Error rates
- Token usage and cost
- User-reported quality issues
- Content filter triggers
Post-Migration - Tighten Security
Once the migration is stable:
- Enable private endpoints - Restrict Azure OpenAI access to your virtual network
- Remove API keys - Switch fully to Azure AD authentication
- Enable diagnostic logging - Send logs to Azure Monitor for audit trails
- Configure cost alerts - Set budget alerts in Azure Cost Management
- Disable the old OpenAI API key - Don't leave it active
Common Issues and Solutions
Content filters blocking legitimate content. Azure's default content filters are conservative. If your application processes medical, legal, or security content, you may need to adjust filter levels. This is done through the Azure portal or by contacting Microsoft for custom configurations.
Rate limit differences. Azure OpenAI rate limits are per-deployment, not per-account. If you had high limits on OpenAI, you may need to request capacity increases on Azure. This usually takes a few business days.
API version compatibility. Some newer OpenAI features may not be available in all Azure OpenAI API versions. Check the Azure OpenAI documentation for feature availability by API version.
Latency from Australian regions. If your application runs outside Australia but your Azure OpenAI is in Australia East, you'll see higher latency. Either co-locate your application in Australia or use a closer Azure OpenAI region (noting data residency implications).
Streaming responses. Streaming works the same way on Azure OpenAI. The server-sent events format is identical. If you're using streaming, it should work without changes.
Cost Comparison
Azure OpenAI pricing is generally the same as OpenAI direct pricing for pay-as-you-go. The main differences:
- Pay-as-you-go - Same per-token pricing as OpenAI
- Provisioned Throughput - Fixed monthly cost for guaranteed capacity. Better unit economics at high volume but requires commitment.
- Enterprise Agreement discounts - Some organisations get discounted rates through their Microsoft EA.
For most organisations, the cost is similar or slightly lower on Azure OpenAI, especially with EA discounts.
Should You Migrate?
If any of these apply, yes:
- You need data to stay in Australia
- You're in a regulated industry
- You need to integrate with Azure AD and virtual networks
- Your procurement requires Microsoft Enterprise Agreement
- You need guaranteed capacity for production workloads
If none of these apply and you're happy with OpenAI direct, there's no urgency to migrate. But as AI becomes more central to your business, the enterprise features of Azure OpenAI usually become necessary.
Need Help with the Migration?
We've done this migration many times for Australian organisations across financial services, government, and enterprise. The migration itself is usually 2-4 weeks of work, but getting the security, networking, and monitoring right takes experience.
Talk to our team about migrating to Azure OpenAI. We also offer broader Azure AI consulting and AI consulting services to help you get the most from your AI investment.