Azure AI Foundry Authentication - Which Method to Use and When

May 4, 2026 • 8 min read • Michael Ridland

Most production failures I see with Azure AI services aren't model failures. They're authentication failures. Keys get committed to repos. Tokens expire mid-batch. The dev environment uses keys, the prod environment uses managed identity, and the developer can't work out why their local code works and the deployed code doesn't.

Authentication is one of those topics that sounds boring until it breaks at the worst possible time. The Azure AI Foundry stack has three authentication options and they're each appropriate in different situations. After helping Australian clients deploy a lot of these workloads, here's what I'd recommend and where the gotchas live.

The three options

Azure AI Foundry services accept three authentication methods. They all work, but they're not interchangeable.

Resource keys. A secret value you include in the Ocp-Apim-Subscription-Key header. There are two keys per resource so you can rotate without downtime.

Access tokens. Short-lived (10-minute) bearer tokens you get by exchanging a resource key. Sent in the Authorization header as Bearer <token>. Only certain services support this, mainly the speech and translator APIs.

Microsoft Entra ID. The proper enterprise authentication path. Uses Azure role-based access control with a service principal, a managed identity, or interactive user authentication.

If you're starting fresh in 2026, default to Microsoft Entra ID and only fall back to keys when you have a specific reason. Most of the "specific reasons" people give turn out to be "we couldn't be bothered setting up the role assignments". Set up the role assignments.

Resource keys - fast to start, easy to leak

Resource keys are the quickest path to a working API call. Copy a key from the Azure portal, drop it into your code, send a request. Works immediately. This is exactly why so much sample code uses keys and why so many production systems still use them six months later.

The pattern looks like this for the Translator service:

curl -X POST 'https://api.cognitive.microsofttranslator.com/translate?api-version=3.0&from=en&to=de' \
  -H 'Ocp-Apim-Subscription-Key: YOUR_KEY' \
  -H 'Content-Type: application/json' \
  --data-raw '[{ "text": "How much for the cup of coffee?" }]'

Simple. Also simple to commit to GitHub accidentally. We've been called in to clean up exactly this scenario more than once. Someone pushes a quick prototype to a public repo with the key embedded, the bots that scrape GitHub for exposed credentials find it within minutes, and suddenly there's a bill being run up against your subscription for things you didn't authorise.

Two practical defences if you're using keys.

Never put keys in source code or config files that get committed. Use environment variables, Key Vault references, or a secrets manager. Even if your repo is private today, it might not be tomorrow, and developers move keys around in ways you can't predict.
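To make the environment-variable advice concrete, here's a minimal Python sketch of the same Translator call. The variable name TRANSLATOR_KEY and the helper build_translate_request are my own placeholders, not anything Azure defines. The function only builds the request, so you can see exactly where the key comes from before anything is sent.

```python
import json
import os
import urllib.request

TRANSLATE_URL = (
    "https://api.cognitive.microsofttranslator.com/translate"
    "?api-version=3.0&from=en&to=de"
)


def build_translate_request(text, env_var="TRANSLATOR_KEY"):
    """Build (but do not send) the translate request, reading the key from the environment."""
    key = os.environ.get(env_var)  # env_var name is a placeholder, pick your own
    if not key:
        # Fail loudly rather than falling back to a hard-coded key.
        raise RuntimeError(f"Set the {env_var} environment variable first")
    body = json.dumps([{"text": text}]).encode("utf-8")
    return urllib.request.Request(
        TRANSLATE_URL,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )
```

Passing the result to urllib.request.urlopen sends it. The point is that the key never appears in the file you commit.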

Rotate keys on a schedule. That's why there are two of them. Run production on KEY1, regenerate KEY2 each quarter, then switch production over to the fresh KEY2. Next quarter, do the same in the other direction. The dual-key model exists precisely so you can rotate without an outage. Most teams never actually do this.
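If the ordering feels fiddly, this small simulation shows why it never causes an outage. It's a sketch only: secrets.token_hex stands in for the real regenerate action you'd run in the portal or CLI, and the rotate helper is hypothetical.

```python
import secrets


def rotate(keys, active):
    """One rotation cycle: regenerate the standby key, then make it the active one.

    `keys` maps "KEY1"/"KEY2" to their current values; `active` is the slot
    production is using. Returns the new (keys, active) pair.
    """
    standby = "KEY2" if active == "KEY1" else "KEY1"
    new_keys = dict(keys)
    # Step 1: regenerate only the key nobody is using, so live traffic is unaffected.
    new_keys[standby] = secrets.token_hex(16)  # stand-in for the real regenerate call
    # Step 2: point production at the fresh key (a config or Key Vault update in practice).
    # The previously active key stays valid until the next cycle regenerates it.
    return new_keys, standby
```

The key production is using is never the one being regenerated, which is the whole point of having two.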

For multi-service Foundry resources, the key works across multiple AI services with the same authentication header. Useful for consolidation but it also means a leaked key gives the attacker access to more services. Trade-off worth thinking about.

Access tokens - useful in specific cases

The token exchange pattern is more secure than raw keys because the token expires in ten minutes. Even if intercepted, the window for abuse is small.

The flow is:

curl -v -X POST \
  "https://YOUR-REGION.api.cognitive.microsoft.com/sts/v1.0/issueToken" \
  -H "Content-type: application/x-www-form-urlencoded" \
  -H "Content-length: 0" \
  -H "Ocp-Apim-Subscription-Key: YOUR_KEY"

You get a JWT back, and you pass it in subsequent requests as Authorization: Bearer <token>.

This pattern is mainly relevant for speech services and the translator. Speech to text and text to speech in particular work well with tokens because they're usually called from client-side code (browsers, mobile apps) where you don't want to expose the resource key. The server-side code exchanges the key for a token, hands the token to the client, and the client makes the speech calls directly with limited-lifetime credentials.
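On the server side, the exchange usually sits behind a small cache so you aren't calling issueToken on every request and tokens don't expire mid-batch. Here's a rough Python sketch. TokenCache, fetch_token and the 60-second refresh margin are my own choices for illustration, not part of any Azure SDK.

```python
import time
import urllib.request

TOKEN_LIFETIME_SECONDS = 600  # the ten-minute lifetime described above


def fetch_token(region, key):
    """POST to the issueToken endpoint and return the token text."""
    req = urllib.request.Request(
        f"https://{region}.api.cognitive.microsoft.com/sts/v1.0/issueToken",
        data=b"",
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


class TokenCache:
    """Reuse a token until it is close to expiry, then fetch a new one."""

    def __init__(self, fetch, margin_seconds=60, clock=time.monotonic):
        self._fetch = fetch            # zero-argument callable returning a token
        self._margin = margin_seconds  # refresh this long before expiry (assumed value)
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            self._token = self._fetch()
            self._expires_at = now + TOKEN_LIFETIME_SECONDS
        return self._token
```

You'd wire it up with something like TokenCache(lambda: fetch_token("australiaeast", key)) and hand cache.get() to the client. The resource key itself never leaves the server.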

If you're building client-facing speech features, this is the pattern. If you're doing server-to-server calls, tokens add complexity without much benefit. Just use Entra ID instead.

Microsoft Entra ID - the right answer for most production systems

This is what you should be using for any non-trivial deployment. The advantages are real:

No long-lived secrets to manage. Managed identities are credential-free from the developer's perspective.
Proper role-based access control. You can grant specific permissions per resource, per service, per environment.
Audit trail. Every authentication is logged and attributable to an identity.
Disable local auth. Once Entra ID is set up, you can disable key-based auth entirely so a leaked key is useless.

The setup looks like this. First, the Foundry resource needs a custom subdomain (the regional endpoints don't support Entra ID auth). Second, you create a service principal or managed identity. Third, you assign it the appropriate role (usually "Cognitive Services User" or similar). Fourth, your code uses the Azure SDK with the appropriate credential class, which handles token acquisition automatically.

In .NET it looks roughly like:

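using Azure.AI.Translation.Text;
using Azure.Identity;

var client = new TextTranslationClient(
    new DefaultAzureCredential(), // tries a chain of credential sources in turn
    new Uri("https://your-custom-subdomain.cognitiveservices.azure.com"));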

DefaultAzureCredential tries a chain of authentication methods. Locally it picks up your Azure CLI login. In Azure, it uses the managed identity attached to the compute resource your code runs on (an App Service, Function, or VM, for example). Same code, no environment-specific configuration. This is the pattern I'd recommend for almost all production .NET workloads. Our Microsoft AI consulting team defaults to this approach.

Python and TypeScript have equivalent patterns. Same principles apply.
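For Python, a minimal sketch looks like this. It assumes the azure-identity package is installed and uses the https://cognitiveservices.azure.com/.default scope. The helper names are mine. The SDK clients normally do this token handling for you, so treat it as a look under the hood rather than something you'd ship.

```python
COGNITIVE_SCOPE = "https://cognitiveservices.azure.com/.default"


def entra_auth_header(credential):
    """Return an Authorization header using any azure-identity style credential."""
    # get_token returns an object with a .token string and an .expires_on timestamp.
    access_token = credential.get_token(COGNITIVE_SCOPE)
    return {"Authorization": f"Bearer {access_token.token}"}


def default_credential_header():
    """Same thing with DefaultAzureCredential; needs `pip install azure-identity`."""
    from azure.identity import DefaultAzureCredential  # imported lazily on purpose

    return entra_auth_header(DefaultAzureCredential())
```

Because entra_auth_header only needs an object with get_token, the same function works with a managed identity in Azure, a CLI login on your laptop, or a fake credential in unit tests.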

The custom subdomain trap

Worth calling out because it catches people repeatedly. Microsoft Entra ID auth requires a custom subdomain on your resource (something like myresource.cognitiveservices.azure.com rather than the regional eastus.api.cognitive.microsoft.com).
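A quick way to audit which of your configured endpoints fall into the trap is to look at the hostname. This sketch only knows the two patterns mentioned above. Anything else, including service-specific domains, comes back as "unknown" rather than a guess. The function name endpoint_kind is my own.

```python
from urllib.parse import urlparse

REGIONAL_SUFFIX = ".api.cognitive.microsoft.com"
CUSTOM_SUFFIX = ".cognitiveservices.azure.com"


def endpoint_kind(url):
    """Classify an endpoint URL as 'regional', 'custom-subdomain', or 'unknown'."""
    # urlparse needs the scheme (https://) to find the hostname.
    host = (urlparse(url).hostname or "").lower()
    if host.endswith(REGIONAL_SUFFIX):
        return "regional"          # shared regional endpoint: keys and tokens only
    if host.endswith(CUSTOM_SUFFIX):
        return "custom-subdomain"  # resource-specific endpoint: Entra ID capable
    return "unknown"
```

Run it over the endpoints in your app settings and you have a list of what needs attention before an Entra ID rollout.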

If you created your resource years ago without a custom subdomain, you can't just bolt Entra ID auth on top. You either need to migrate to a new resource with a subdomain or stick with keys for that resource. The good news is new resources default to having a subdomain. The annoying news is older deployments often need restructuring before you can adopt Entra ID properly.

Plan for this when you're modernising older deployments. We've helped a few clients work through this and the migration isn't hard, but you need to update every client that points at the resource to use the new endpoint.

Disable local auth - the step everyone skips

If your organisation has decided to use Microsoft Entra ID, you should disable key-based authentication on the resource. This is a separate step from setting up Entra ID auth. Setting up Entra ID just makes it possible to authenticate that way. Disabling local auth forces it.

The reason this matters: even after you've migrated all your own code to Entra ID, the keys still exist on the resource. Anyone who copies them out of the Azure portal can still use them. A developer building a quick prototype, an old script someone forgot about, a leak from a former employee. Disabling local auth closes that door.

The setting is on the resource itself. Once disabled, key-based authentication returns 401. Test in non-production first because there's usually a forgotten script somewhere that will break. Better to discover it in dev.
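If you have more than a handful of resources, it's worth checking the flag in bulk. The sketch below assumes you've exported your accounts as JSON (for example from az cognitiveservices account list) and that each entry carries a properties.disableLocalAuth boolean. Check that shape against your own export before relying on it. The function name is hypothetical.

```python
def resources_with_keys_enabled(accounts):
    """Return names of accounts where key-based (local) auth is still allowed."""
    flagged = []
    for account in accounts:
        # Assumed shape: {"name": ..., "properties": {"disableLocalAuth": bool}}
        props = account.get("properties") or {}
        # Treat a missing or null flag as "keys still enabled", the cautious reading.
        if props.get("disableLocalAuth") is not True:
            flagged.append(account.get("name", "<unnamed>"))
    return flagged
```

An empty list means every exported resource has keys switched off.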

Regional considerations for Australian deployments

Worth knowing for Australian clients specifically. The australiaeast region is supported for Foundry resource authentication and token exchange. So if your data residency requirements pin you to Australia (and for healthcare, finance, and government clients they almost always do), you can run the full authentication stack in-region without crossing borders.

This used to be a real constraint a few years back. Now it's basically a non-issue for any of the major services. The Australian regions have feature parity for the things most clients use. We design deployments around this for our Australian government and healthcare clients routinely.

Practical recommendations

What I'd actually recommend, in priority order.

For new production workloads. Microsoft Entra ID with managed identity. No keys, no tokens, no secrets to rotate. The setup takes an extra hour. It saves you days of incident response later.

For prototypes and experimentation. Resource keys are fine, kept in environment variables, never committed to source control. Plan to migrate to Entra ID before production.

For client-side scenarios (browsers, mobile). Token exchange. Server holds the key, exchanges for short-lived tokens, hands them to the client. Never put a resource key into client-side code.

For legacy systems being modernised. Plan the custom subdomain migration as part of the modernisation. Don't try to bolt Entra ID onto a regional-endpoint resource.

For everything. Once Entra ID is set up, disable local auth. Don't leave the back door open.
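Boiled down to code, the list above is a short decision function. This is just my recommendations restated, with made-up parameter names, not an official decision tree.

```python
def recommend_auth(*, production, client_side=False, has_custom_subdomain=True):
    """Map a workload's situation to the recommendation from the list above."""
    if client_side:
        # Browsers and mobile apps never see the resource key.
        return "token exchange (server keeps the key, client gets short-lived tokens)"
    if not production:
        return "resource key in an environment variable; plan the move to Entra ID"
    if not has_custom_subdomain:
        # Legacy regional-endpoint resources need restructuring first.
        return "migrate to a custom subdomain first, then Entra ID"
    return "Entra ID with managed identity, then disable local auth"
```

For example, recommend_auth(production=True) returns the Entra ID answer, and recommend_auth(production=True, has_custom_subdomain=False) tells you to sort out the subdomain first.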

We do a lot of this kind of work as part of our Azure AI consulting engagements. The authentication design isn't glamorous but it's one of the things that separates a deployment that scales gracefully from one that needs a security incident to motivate a rebuild.

A final word on operational practice

Whatever auth method you use, monitor failed authentication attempts. Azure resource logs capture this. A spike in 401s is usually either a credential rotation gone wrong (annoying but recoverable) or someone probing your endpoints (worth knowing about). Set up an alert on it. Five minutes of work, very useful when something goes sideways.
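In practice you'd build the alert in your monitoring tooling, but the logic is simple enough to sketch in Python. The record shape here (a dict with a time datetime and an integer status) and the threshold of 20 per minute are assumptions for illustration. Map them onto whatever your exported logs actually contain.

```python
from collections import Counter
from datetime import datetime


def find_401_spikes(records, threshold=20):
    """Return the minutes in which the number of 401 responses reached `threshold`."""
    per_minute = Counter()
    for record in records:
        if record["status"] == 401:
            # Bucket each failure by the minute it occurred in.
            minute = record["time"].replace(second=0, microsecond=0)
            per_minute[minute] += 1
    return sorted(m for m, count in per_minute.items() if count >= threshold)
```

A single flagged minute right after a deployment usually points at a rotation problem. Flagged minutes spread across the night are the ones to worry about.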

The official documentation for Azure AI authentication is at Microsoft's reference page. It covers the mechanics in detail. This post is meant to give you the "which one should I pick and why" view, which the docs don't really address.