Exposing Fabric IQ Ontology as an MCP Server for External AI Agents

April 27, 2026 • 7 min read • Michael Ridland

Most of the AI tooling questions I get from Australian clients lately come back to the same problem: how do we let an AI agent talk to our business data without giving it the keys to the entire database? You don't want Claude or GPT writing arbitrary SQL against your production warehouse. But you do want them to be able to answer real business questions that need fresh data.

The Model Context Protocol gives you a sensible answer. And Microsoft has now wired Fabric IQ ontology up to it, so your ontology can act as an MCP server. External AI agents - VS Code, Claude Desktop, anything that speaks MCP - can hit a well-defined endpoint, ask questions against your semantic model, and get back grounded answers.

This is a quiet but important shift. I'll walk through how it actually works, what the setup looks like, and where it fits in a real architecture.

What Problem MCP Actually Solves

If you've been building AI features for any length of time, you've probably hit the integration sprawl problem. Every model provider has its own way of describing tools. Every agent framework has its own protocol. You end up writing the same data connector five different ways for five different clients.

MCP fixes that by standardising the protocol. An MCP server exposes a catalogue of tools and resources. An MCP client - any AI agent that speaks the protocol - can connect, discover what's available, and call those tools without needing custom integration code.
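To make "discover, then call" concrete, here is a minimal sketch of the two message shapes involved. MCP messages are JSON-RPC 2.0, and the `tools/list` and `tools/call` methods come from the MCP specification. The tool name `query_entities` and its arguments are hypothetical placeholders, not something the Fabric ontology server is known to expose. A real client also performs an `initialize` handshake first, which I've left out here.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests carry an id so responses can be matched up


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request envelope, the message format MCP uses."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Discovery: ask the server which tools it offers.
list_tools = jsonrpc_request("tools/list")

# Invocation: call one tool by name.
# "query_entities" and its arguments are hypothetical, for illustration only.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "query_entities", "arguments": {"entityType": "Order"}},
)

print(json.dumps(list_tools))
print(json.dumps(call_tool, indent=2))
```

In practice you rarely build these by hand, because the MCP client library or the agent host does it for you. The point is that every client sends the same two requests, which is why one server can serve all of them.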

For data work, this means you can build one server that exposes your data model, and then any AI client can use it. Today that's mostly Claude Desktop, VS Code with Agent Mode, and a few specialist tools. I expect most of the major agent frameworks to support it by the end of 2026.

The Fabric IQ piece is interesting because Microsoft has done the work of turning an ontology into an MCP server for you. You don't have to write any glue code. You just point clients at a URL.

Getting the Server URL

The setup is straightforward but a few steps trip people up. Before you start, your tenant needs ontology items enabled and you need at least an F2 capacity (or Power BI Premium P1 or higher with Fabric switched on). If you're on trial capacity, the MCP endpoint will reject your calls. We debugged that with a few clients before realising licensing was the issue.

Once you have a working ontology item, the URL is constructed from two IDs - the workspace ID and the ontology item ID. Both come from the URL you see when you have the ontology open in Fabric:

https://app.fabric.microsoft.com/groups/<workspace-ID>/ontologies/<ontology-item-ID>

Copy those two values into the MCP endpoint template:

https://api.fabric.microsoft.com/v1/mcp/dataPlane/workspaces/<workspace-ID>/items/<ontology-item-ID>/ontologyEndpoint

That's the URL you give to any MCP client. There's nothing else to deploy, no server to host, no authentication keys to manage beyond your existing Microsoft Entra credentials.
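If you're doing this for more than one ontology, the copy-and-paste step is easy to script. The sketch below assumes the portal URL follows exactly the `/groups/.../ontologies/...` shape shown above. The function name and the GUIDs in the example are placeholders I've made up for illustration.

```python
import re

# Matches the two IDs in the Fabric portal URL shape shown above.
PORTAL_PATTERN = re.compile(
    r"/groups/(?P<workspace>[^/?#]+)/ontologies/(?P<item>[^/?#]+)"
)

ENDPOINT_TEMPLATE = (
    "https://api.fabric.microsoft.com/v1/mcp/dataPlane"
    "/workspaces/{workspace}/items/{item}/ontologyEndpoint"
)


def mcp_endpoint_from_portal_url(portal_url):
    """Derive the ontology MCP endpoint from the URL shown in the browser."""
    match = PORTAL_PATTERN.search(portal_url)
    if match is None:
        raise ValueError("URL does not look like a Fabric ontology item URL")
    return ENDPOINT_TEMPLATE.format(**match.groupdict())


# Placeholder GUIDs, not real workspace or item IDs.
example = (
    "https://app.fabric.microsoft.com/groups/"
    "00000000-0000-0000-0000-000000000001/ontologies/"
    "00000000-0000-0000-0000-000000000002"
)
print(mcp_endpoint_from_portal_url(example))
```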

Wiring It Up to VS Code

The cleanest test environment for this is VS Code with Agent Mode. Microsoft's documentation walks through the specifics. The short version: you add the MCP server URL to your VS Code MCP configuration, open the Chat panel with Ctrl + Shift + I (Cmd + Shift + I on Mac), and pick an orchestrator.
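As a rough guide to what that configuration looks like, the sketch below prints a workspace-level MCP config. It assumes the `.vscode/mcp.json` layout VS Code uses at the time of writing (a top-level `servers` map with a `type` and `url` per entry) and assumes the Fabric endpoint is registered as an HTTP server. The server name `fabric-ontology` is my own label. Check the current VS Code and Fabric docs before relying on any of this.

```python
import json


def vscode_mcp_config(server_name, endpoint_url):
    """Return a dict in the assumed shape of a .vscode/mcp.json file."""
    return {
        "servers": {
            server_name: {
                "type": "http",  # assumption: remote server reached over HTTP
                "url": endpoint_url,
            }
        }
    }


config = vscode_mcp_config(
    "fabric-ontology",  # hypothetical label; any name works
    "https://api.fabric.microsoft.com/v1/mcp/dataPlane/workspaces/"
    "<workspace-ID>/items/<ontology-item-ID>/ontologyEndpoint",
)

# Paste the printed JSON into .vscode/mcp.json after substituting your two IDs.
print(json.dumps(config, indent=2))
```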

Microsoft lists GPT-5, GPT-4.1, Claude Sonnet 4.5, and Gemini 2.5 Pro as supported orchestrators in public preview. In practice we've had the most consistent results with Claude Sonnet 4.5 for complex ontology questions. It's better at staying on the rails when the agent needs to do multi-step reasoning over entity relationships. GPT-4.1 is faster but more likely to hallucinate properties that don't exist. Your mileage will vary depending on what your ontology looks like.

The first time you connect, the agent does a tool discovery pass. It pulls down the schema of your ontology - entity types, properties, relationships - and treats them as a tool catalogue. From that point on, when you ask a question, the agent generates queries against your ontology and returns the results in natural language.

We've been using this internally to query our own project data across Fabric. It's genuinely useful. In our testing the latency has been better than the in-product Fabric data agent, most likely because you're not waiting on the Fabric UI to render results.

Why This Matters for Real Architectures

The interesting use case isn't VS Code. It's letting your customer-facing AI agents query enterprise data without exposing the raw database.

Here's the pattern we're building for a couple of clients right now. They have an AI assistant in their customer portal that needs to answer questions like "what's the status of my order" or "show me my recent invoices." Traditionally you'd build a custom backend that queries your data warehouse, then call that backend from your AI agent's tool definitions.

With the ontology MCP server, you skip the middle layer. Your customer-facing agent talks to MCP. MCP talks to the ontology. The ontology talks to Fabric. Each layer enforces its own permissions, and the ontology becomes the canonical contract between AI and data.

This is the kind of work our enterprise AI agents team has been pushing. The architectural simplification is worth it on its own. The fact that you can swap orchestrators later without rewriting the data layer is a bonus.

For organisations still figuring out where this fits, our AI strategy practice runs through the trade-offs in detail.

What You Have to Watch Out For

A few things that aren't in the documentation but matter in production:

Authentication is Entra-based, which is good and bad. Good because it integrates with your existing identity provider. Bad because the on-behalf-of (OBO) flow with MCP is still maturing. If your AI agent is calling on behalf of an end user, you need to think carefully about how the token is acquired and passed. In our setups we've ended up with service principals for system-level queries and delegated tokens for user-context queries.
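The split between service principal and delegated tokens can be kept in one small function so the rest of the agent doesn't care which identity is in play. This is a sketch only. The two token-acquisition callables are injected stand-ins for whatever your identity library provides, and the scope string is my assumption for the Fabric API, so confirm it for your tenant.

```python
from typing import Callable, Optional

# Assumed scope for the Fabric API; verify against Microsoft's documentation.
FABRIC_SCOPE = "https://api.fabric.microsoft.com/.default"


def auth_headers(
    acquire_app_token: Callable[[str], str],
    acquire_user_token: Callable[[str, str], str],
    user_assertion: Optional[str] = None,
) -> dict:
    """Build the Authorization header, using the end user's identity when present."""
    if user_assertion:
        # User-context query: exchange the caller's token for a delegated one.
        token = acquire_user_token(FABRIC_SCOPE, user_assertion)
    else:
        # System-level query: fall back to the service principal.
        token = acquire_app_token(FABRIC_SCOPE)
    return {"Authorization": f"Bearer {token}"}


# Stub providers so the sketch runs without any identity library installed.
def stub_app_token(scope):
    return "app-token"


def stub_user_token(scope, assertion):
    return "user-token"


print(auth_headers(stub_app_token, stub_user_token))
print(auth_headers(stub_app_token, stub_user_token, user_assertion="incoming-user-jwt"))
```

Keeping the choice in one place also makes it harder to accidentally answer a user's question with the service principal's wider permissions.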

Rate limits are real and not well documented. During heavy testing we hit throttling we couldn't find described anywhere, and the behaviour seems to depend on capacity tier. If you're planning a high-volume customer-facing agent, you'll want to test load early and consider caching responses at the application layer.
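Application-layer caching doesn't need to be elaborate. Here is a minimal time-to-live cache as a sketch. The class name, the cache key format, and the 60-second TTL are illustrative choices of mine, not anything prescribed by Fabric.

```python
import time


class TTLCache:
    """Tiny in-memory cache that reuses a result until it is ttl_seconds old."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}   # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the remote call
        value = fetch()
        self._entries[key] = (now + self._ttl, value)
        return value


calls = []


def fake_query():
    """Stand-in for a real MCP tool call."""
    calls.append(1)
    return {"status": "shipped"}


cache = TTLCache(ttl_seconds=60)
cache.get_or_fetch("user-42:order-123", fake_query)
cache.get_or_fetch("user-42:order-123", fake_query)
print(len(calls))  # the second lookup is served from the cache
```

For a customer-facing agent, make sure the cache key includes the user's identity, otherwise one customer's answer can be served to another.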

The ontology is your contract. Once external agents are pointing at this URL, changes to the ontology become breaking changes. Removing an entity type, renaming a property, or changing a binding will affect every agent in the wild. We've started treating ontology versioning the same way we'd treat a public API - semantic versioning, deprecation periods, change logs.
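One way to enforce that discipline is a check in your release process that compares the old and new ontology shapes and refuses to ship silent removals. The sketch below assumes you can export each version as a simple mapping of entity type to property names. That snapshot format is hypothetical, not a Fabric export format.

```python
def breaking_changes(old, new):
    """Compare two {entity_type: property_names} snapshots and list removals."""
    problems = []
    for entity, old_props in sorted(old.items()):
        if entity not in new:
            problems.append(f"entity type removed: {entity}")
            continue
        # A rename looks like a removal to any agent using the old name.
        for prop in sorted(set(old_props) - set(new[entity])):
            problems.append(f"property removed or renamed: {entity}.{prop}")
    return problems


v1 = {"Order": {"id", "status"}, "Invoice": {"id", "total"}}
v2 = {"Order": {"id", "state"}}

for problem in breaking_changes(v1, v2):
    print(problem)
```

Additions (new entity types or properties) are deliberately not flagged, since existing agents keep working. Under semantic versioning those would be a minor bump, while anything this function reports would be a major one.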

Don't expose sensitive data through it. This sounds obvious but I've seen clients add PII fields to ontologies that then become available to any agent with access. Treat the ontology like a public surface. Anything sensitive should be filtered at the binding layer, not at the consumer layer.
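A cheap guard is a pre-publish check that flags property names that look like PII before the ontology goes live. This is a crude name-matching heuristic, sketched under the assumption that you can list each entity type's properties as plain strings. The hint list and the schema shape are my own illustration, and it won't catch sensitive data hiding behind innocent names.

```python
# Substrings that often indicate personal data; extend for your own domain.
SENSITIVE_HINTS = ("email", "phone", "birth", "ssn", "tfn", "passport", "address")


def flag_sensitive_properties(schema, hints=SENSITIVE_HINTS):
    """Return 'Entity.property' names whose property name contains a hint."""
    flagged = []
    for entity, props in sorted(schema.items()):
        for prop in sorted(props):
            lowered = prop.lower()
            if any(hint in lowered for hint in hints):
                flagged.append(f"{entity}.{prop}")
    return flagged


# Hypothetical schema snapshot for illustration.
schema = {
    "Customer": ["id", "emailAddress", "dateOfBirth"],
    "Order": ["id", "status"],
}
print(flag_sensitive_properties(schema))
```

Anything this flags is a prompt for a human decision: either drop the property from the binding or confirm it really is safe to expose.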

How This Compares to the In-Product Data Agent

If you've read our other post on the Fabric IQ data agent, you might be wondering which to use when. Here's the short version:

The in-product data agent is for business users inside Fabric. It's got a UI, it ties into other Fabric features, and it's the right answer when your audience is data analysts or BI consumers who already live in the Microsoft tooling.

The MCP server is for everyone else. Developers building custom agents, customer-facing experiences, automation flows that span tools. Anywhere the ontology needs to talk to something other than Fabric's own UI.

They're the same engine underneath. The MCP server just exposes the engine over a standard protocol. So your ontology investment carries across both surfaces.

What I'd Try Next

If you've already built a Fabric IQ ontology, the MCP server piece is genuinely a 30-minute setup task. Get the URL, wire VS Code to it, ask some questions. You'll know within an hour whether it's useful for your data.

If you haven't built an ontology yet, this is a reason to start. The MCP angle changes the economics. Previously the ontology was useful inside Fabric. Now it's a piece of infrastructure that any AI agent in your stack can use. That's a much stronger pitch to leadership when you're asking for budget to do the modelling work.

The thing I'd warn against is treating MCP as a magic shortcut. It still requires good ontology design underneath. A bad ontology exposed through MCP is just a bad ontology that more clients can hit. The discipline of defining your entity types and relationships well matters more than the protocol on top.

We're running a few Microsoft Fabric consulting engagements right now that include MCP integration as part of the scope. If you're thinking about this for your own setup, happy to compare notes.

For Microsoft's official documentation on this, see "Consume ontology as an MCP server".