Back to Blog

Microsoft Foundry REST APIs - What They Are and When You Actually Need Them

May 7, 20268 min readMichael Ridland

Every couple of weeks somebody on a client team asks me a version of the same question. "We are building an AI feature, the .NET developer wants to call the Azure OpenAI SDK, the Python team wants the Foundry SDK, the Node team wants to use the OpenAI library, and the integration team wants raw HTTP. Which one is right?"

The honest answer is "probably all of them, in different places, for different reasons." Microsoft Foundry exposes its capabilities through several layers. There are language-specific SDKs for .NET, Python, JavaScript and Java. There is the OpenAI compatibility layer that lets you point existing OpenAI client libraries at an Azure endpoint. And underneath all of those, there is the REST API surface itself, which is what every SDK is wrapping. Most teams should use the SDKs most of the time. But there are real situations where you want to drop down to the REST layer, and there are quirks worth knowing about when you do.

This post is a field guide to the Microsoft Foundry REST APIs. I will cover what is actually in the REST surface, when reaching for raw HTTP makes sense, and what we have learned across Azure AI consulting engagements with Australian clients.

What sits in the REST surface

The Foundry REST API surface is not one monolithic API. It is a collection of related APIs, grouped by capability. At the highest level you have the model inference APIs, the agents and assistants APIs, the data plane operations for fine-tuned models and deployments, the responsible AI content safety APIs, and the various AI service APIs for things like document intelligence, speech, vision and language understanding. Each of those has its own base URL, its own version, and in some cases its own authentication model.

For most teams, the inference APIs are what they care about. These are the endpoints you POST a chat completion or a responses-style request to, and you get back a streamed or unstreamed model response. The URL pattern is reasonably consistent. You point at your Foundry resource, you specify a deployment name, and you call an action like /chat/completions or /responses. The body shape is broadly compatible with the OpenAI standard, which is deliberate. That compatibility is why you can take an existing OpenAI Python or Node app and switch it to Azure by changing the base URL and the auth header.

Below the inference layer, the agents APIs are where it gets interesting. The Foundry agent API exposes the building blocks for stateful, tool-using agents - threads, runs, messages, tools, file search. These are all REST operations under the hood. The Microsoft Agent Framework SDK gives you a friendly abstraction over the top, but every call eventually becomes one or more HTTP requests against these endpoints. If you are doing anything custom around how agents are orchestrated, persisted or audited, it pays to understand the REST shape underneath.

Then there are the older Azure AI service APIs for specific capabilities. Computer Vision, Document Intelligence, Speech, Translator, Language. These have been around longer than the Foundry brand. Microsoft has been consolidating them under the Foundry banner, but the underlying REST endpoints are largely the same ones that existed under Cognitive Services. If you are doing OCR or batch transcription, you are still hitting those mature, capability-specific endpoints.

When the SDK is enough

For day to day building, the SDKs are what you want. The Azure AI Foundry SDK for Python, the .NET Foundry client libraries, the JS package - all of them do the right things by default. They handle authentication via DefaultAzureCredential, retry on transient failures with sensible backoff, surface clear exception types, and keep up with API version changes. If you are writing a feature inside a client application and you need to call a Foundry model, just use the SDK.

The reason matters. Authentication against Azure is not trivial. You have to handle managed identities, service principals, user-assigned identities, and the differences between local development and production. The SDKs paper over all of that. If you are doing raw HTTP, you become responsible for token acquisition, refresh, and credential precedence. That is a lot of code to maintain just to call an endpoint.

There is also the question of API versioning. Foundry APIs evolve. Models get added. Parameters get renamed. The SDK keeps up. Raw HTTP code does not, unless you maintain it. We have seen client codebases that called Foundry over raw HTTP, then stopped working overnight because a deprecated API version was removed. The team that built it had moved on. Nobody knew where the version string was hardcoded.

Most of our Microsoft AI consulting work ends up recommending the SDK route for application code. It is faster to write, easier to maintain, and produces fewer surprises.

When raw HTTP makes sense

There are real situations where you want to call the REST API directly. The first is integration platforms. If you are working in Power Automate, Logic Apps, Azure Data Factory, or a third-party integration tool, you usually do not have a Foundry SDK available. You have an HTTP connector. So you call the REST API. We have built quite a few Power Automate workflows for clients that send documents to Document Intelligence or call a chat completion via raw HTTP. It works fine. The performance is acceptable. You just need to be careful about credential management, which usually means storing a key in Key Vault and pulling it through the connector.

The second situation is languages without a first-class SDK. The Foundry SDK is well-covered in Python, .NET, JS and Java, but if you are working in Go, Rust, Elixir or Ruby, your options are thinner. There are community SDKs of varying quality. Often the easiest path is to call the REST API directly with whatever HTTP client your language has.

The third is operational tooling. If you are writing a script to enumerate deployments, audit usage, or migrate fine-tuning jobs between resources, the management APIs are easier to use from a shell with curl than via an SDK. The same is true for incident response work, where you want to see exactly what is being sent and received.

The fourth is when you want to do something the SDK does not yet support. This happens because Microsoft ships REST endpoints first and SDK support catches up later. New features sometimes land in the REST surface a release ahead of the SDK. If you need that feature today, raw HTTP is your option.

Authentication and what tends to go wrong

If you do drop down to REST, the most common mistake we see is around authentication. Foundry resources support two auth modes: API key and Microsoft Entra ID. Key-based auth is fine for prototypes but Microsoft is increasingly pushing teams toward Entra. We strongly recommend Entra-based auth for anything that goes to production. It is more secure, it integrates with role-based access control, and it gives you proper audit trails.

The mechanics are straightforward. You request an access token for the scope https://cognitiveservices.azure.com/.default, then you pass that token in an Authorization: Bearer header on every call. The token is good for an hour. After that you refresh. The SDKs handle this automatically, which is part of why we recommend them for application code.

When using raw HTTP, the token acquisition is where bugs hide. You either get a token from the IMDS endpoint on a managed identity, from a service principal client credentials flow, or from an interactive user flow. Each of those has its own failure modes. If you are running in an environment where the managed identity is not assigned correctly, you get a 401 that looks identical to a misconfigured key. We have spent more hours than I would like to admit debugging Foundry auth issues that turned out to be missing role assignments on the resource.

API versioning and how to think about it

Every Foundry endpoint takes an api-version query parameter. Microsoft uses date-based versions like 2025-01-01-preview. The general approach we recommend is: pin to a specific stable version in production, do not use preview versions for anything customer-facing, and review version pins quarterly.

The reason for pinning is stability. Preview APIs change. Stable APIs evolve more carefully but they still get retired. If your code says api-version=2024-08-01, that will keep working until Microsoft retires that version, which usually takes a year or more after the next stable version ships. You will get advance warning via the Azure update feed.

The reason for reviewing quarterly is to actually act on those warnings. We have inherited client systems where api-version was set to something from three years ago, the deprecation warning had been ignored for eighteen months, and the API was about to disappear. Inheriting that situation is unpleasant.

What I would do if I were starting today

If I were starting a new project that needed to use Foundry, here is the rough decision tree.

For application code in Python, .NET, JS or Java, use the SDK. Configure DefaultAzureCredential. Use a managed identity in Azure environments and a service principal locally. Pin api-version through the SDK's options.

For integration workflows in Power Automate or Logic Apps, use the built-in connector if one exists. If not, use the HTTP connector with a Key Vault reference for the secret.

For operational tooling, scripts and one-off automations, use the REST API directly with curl or your language's HTTP client. Acquire tokens via the Azure CLI for interactive scripts, or via a service principal for headless ones.

For research and prototyping, key-based auth is fine. For production, switch to Entra-based auth and remove the keys.

The full reference for the REST surface is at the Microsoft Foundry REST APIs documentation. Bookmark it. The Azure AI side of the platform changes fast and the docs are usually the most current source.

If you want help working through how to architect a Foundry-based application, our Azure AI Foundry consultants do this work as a regular part of engagements. We have also written about how we approach enterprise AI agent builds if that is closer to where you are heading.