
Troubleshooting MCP Apps in Microsoft 365 Copilot - What Actually Goes Wrong

May 22, 2026 | 10 min read | Michael Ridland

If you've been building Model Context Protocol apps for Microsoft 365 Copilot, you've probably had this experience. Everything looks right. The manifest validates. The MCP server is happily responding to curl requests. You install the agent. You ask Copilot the question your tool was built for. Nothing happens.

We've now shipped enough Copilot extensions for Australian clients to know the failure modes pretty well. Some of them are dumb little things you can fix in 30 seconds once you know where to look. Others will eat half a day if you don't know the trick. This post is the thing I wish I'd had to hand the first time we hit a wall.

Microsoft has a troubleshooting guide for MCP apps that covers the official ground. What I want to do here is share what we've actually run into when building these things for real workloads, and what the official docs don't quite spell out.

Turn on developer mode before you do anything else

This is the single biggest tip. If your Copilot session isn't in developer mode, you are debugging blind. Type this in the chat:

-developer on

That gets you the debug information card on every response, showing which tools were considered, which got called, what arguments were passed, and any errors that came back. Without this you'll spend hours guessing about whether Copilot even saw your tool, let alone whether it tried to call it.

I tell every developer I'm onboarding to a Copilot project the same thing on day one. Pin a shortcut. Turn on developer mode at the start of every test session. Treat that debug card like Chrome DevTools. You'd never debug a React app without DevTools open. Same idea.

When no tools show up at all

This happens more than you'd think, and it's almost always one of three things.

The first check is whether your MCP server is actually reachable from Copilot. Sounds obvious. People run their MCP server on localhost and then wonder why Copilot can't see it. If you're testing with the cloud version of Copilot, your server needs to be on a public URL. We use ngrok or a Cloudflare tunnel during development, then move to proper hosting once it stabilises.
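A quick sanity check along these lines catches the localhost mistake before you install anything. This is a minimal sketch, not an official tool: looks_publicly_reachable is a made-up helper, and it only inspects the URL string. It does not prove Copilot can actually reach the host.

```python
import ipaddress
from urllib.parse import urlparse


def looks_publicly_reachable(url: str) -> bool:
    """Return False for URLs a cloud-hosted client obviously cannot reach."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost" or host.endswith(".local"):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. Assume it is a DNS name; this sketch does not resolve it.
        return True
    return not (ip.is_loopback or ip.is_private or ip.is_link_local)
```

Run it against the URL in your manifest rather than the one in your head. The two are not always the same.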

The second check is your plugin manifest. Specifically, the functions array and the runtimes array need to line up. The runtime references your MCP server URL and lists which functions should be routed through it via run_for_functions. If a function name is in functions but not in run_for_functions, Copilot doesn't know to send it to your MCP server. We've burned an hour on this twice now.

"runtimes": [
  {
    "type": "RemoteMCPServer",
    "spec": {
      "url": "https://api.contoso.com/mcp",
      "mcp_tool_description": { "file": "mcp-tools.json" }
    },
    "run_for_functions": [
      "get_widget",
      "create_widget"
    ]
  }
]

The third check is that the tool descriptions are actually defined somewhere. You either inline them in the tools property or point to a separate JSON file with the file property. If neither is set, your tools won't surface even with everything else correct.
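The second and third checks are easy to automate. The sketch below is our own illustration, not part of any Microsoft tooling. lint_plugin_manifest is a hypothetical helper, and it assumes each entry in functions has a name key and that the tools or file property sits under mcp_tool_description in the runtime spec. Adjust it to whatever your manifest actually looks like.

```python
def has_tool_descriptions(spec: dict) -> bool:
    """True if the runtime spec inlines tools or points at a description file."""
    desc = spec.get("mcp_tool_description")
    if isinstance(desc, str):
        return bool(desc)  # tolerate a bare file reference
    if isinstance(desc, dict):
        return bool(desc.get("tools") or desc.get("file"))
    return False


def lint_plugin_manifest(manifest: dict) -> list[str]:
    """Return human-readable problems; an empty list means nothing was found."""
    problems = []
    declared = {fn["name"] for fn in manifest.get("functions", [])}
    routed = set()
    for runtime in manifest.get("runtimes", []):
        if runtime.get("type") != "RemoteMCPServer":
            continue
        routed.update(runtime.get("run_for_functions", []))
        if not has_tool_descriptions(runtime.get("spec", {})):
            problems.append("runtime has no tool descriptions (no 'tools' or 'file')")
    # Functions that exist in the manifest but are never routed to the MCP server.
    for name in sorted(declared - routed):
        problems.append(f"function '{name}' is not listed in any run_for_functions")
    return problems
```

Load the manifest with json.load and print whatever comes back. It takes seconds and would have saved us both of those hours.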

When the tool is there but Copilot won't call it

This is the most frustrating bucket of bugs. The tool is registered. Copilot can see it. But for the queries you're testing, it just refuses to fire.

Almost always, the problem is tool descriptions. LLMs decide which tools to call based on the description text. If your description says "Gets widget data" then the model has very little signal about when to use it. Rewrite descriptions with "Use this function when the user wants to..." phrasing. Make them specific. Mention the exact kinds of queries that should trigger them. Mention what kinds of queries should NOT trigger them if there's potential confusion with another tool.

Keep descriptions under 1024 characters. Anything beyond that gets truncated, so the model never sees it and nothing warns you. We had a client's tool that wasn't firing reliably and traced it to a 1400-character description where the actual trigger guidance was at the end. We trimmed it to 800 well-chosen characters and it started working.
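A length check is cheap to run in CI. Here is a minimal sketch using the 1024-character limit mentioned above. check_description is a hypothetical helper, and the eight-word threshold for "too short" is an arbitrary number we picked for illustration, not a platform rule.

```python
MAX_DESCRIPTION_CHARS = 1024  # limit quoted in the text above
MIN_DESCRIPTION_WORDS = 8     # arbitrary threshold, chosen for illustration only


def check_description(name: str, description: str) -> list[str]:
    """Flag descriptions that would be cut off or that say too little."""
    warnings = []
    length = len(description)
    if length > MAX_DESCRIPTION_CHARS:
        overflow = length - MAX_DESCRIPTION_CHARS
        warnings.append(
            f"{name}: {length} characters, the last {overflow} would be cut off"
        )
    if len(description.split()) < MIN_DESCRIPTION_WORDS:
        warnings.append(
            f"{name}: description is very short; say when the tool should be used"
        )
    return warnings
```

The overflow count matters more than it looks. If your trigger guidance lives in the last few hundred characters, that is exactly the part that goes missing.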

Visibility settings are the other quiet killer. For MCP apps, the tool needs _meta.ui.visibility set to include model. For OpenAI SDK apps it's _meta["openai/visibility"] set to public. If these aren't right, the tool exists but is hidden from the planner.
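For the MCP case, the rule above can be checked mechanically. This sketch follows it literally: model_can_see is a made-up helper that takes a tool definition as a plain dict and returns True only when visibility is set and includes model. What an unset visibility defaults to isn't covered here, so the sketch treats unset as something to review.

```python
def model_can_see(tool: dict) -> bool:
    """True only if _meta.ui.visibility is set and includes 'model'."""
    visibility = tool.get("_meta", {}).get("ui", {}).get("visibility")
    if visibility is None:
        return False  # unset: flagged for review rather than assumed visible
    if isinstance(visibility, str):
        visibility = [visibility]  # tolerate a single string value
    return "model" in visibility
```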

When you have multiple tools that overlap, the wrong one gets picked. Common with "get" type tools. We had three tools called get_customer, get_customer_orders, and get_customer_summary and Copilot kept guessing wrong about which to call. Renamed them to be more distinctive and added "Use this tool when..." clauses that explicitly mentioned the boundaries between them. Problem went away.

Widgets that won't render

If the right tool fires but no UI shows up in the response, your tool probably isn't returning UI binding info. The MCP response needs to include _meta.ui.resourceUri pointing to a registered HTML resource with the right MIME type. For OpenAI SDK apps it's the openai/outputTemplate field. Both have to be registered server-side first.
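The MCP binding can be verified before you ever open Copilot. The sketch below is an assumption-heavy illustration. check_ui_binding is a hypothetical helper, the payload is whatever dict carries your _meta, and resources is a list of dicts with uri and mimeType keys, as in an MCP resource listing. It only checks that the MIME type starts with text/html, since the exact value your host expects isn't spelled out here.

```python
from typing import Optional


def check_ui_binding(payload: dict, resources: list[dict]) -> Optional[str]:
    """Return a problem description, or None if the binding looks consistent."""
    uri = payload.get("_meta", {}).get("ui", {}).get("resourceUri")
    if not uri:
        return "payload has no _meta.ui.resourceUri"
    registered = {res["uri"]: res for res in resources}
    resource = registered.get(uri)
    if resource is None:
        return f"{uri} is not a registered resource"
    mime = resource.get("mimeType", "")
    if not mime.startswith("text/html"):
        return f"{uri} is registered with MIME type {mime!r}, expected an HTML type"
    return None
```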

Once the binding is right and the widget still won't load, open browser DevTools and check the console. Nine times out of ten it's a Content Security Policy violation. Copilot's CSP is strict and you need your widget's host URL allowlisted. If you're loading fonts from Google or pulling a chart library from a CDN, those will get blocked.

The fix is to bundle everything into a single self-contained HTML file. No external assets. No CDN libraries. We use Vite to bundle the widget as a single HTML file with inlined CSS and JavaScript. It feels heavy-handed compared to normal web dev, but it's the only thing that reliably works inside Copilot's sandbox.
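To confirm the bundle really is self-contained, you can scan the built HTML for anything that still points at another origin. This is a rough sketch using only the standard library. find_external_assets is our own illustrative helper, and it only looks at src and href attributes, so it will miss URLs inside CSS or requests made from JavaScript at runtime.

```python
from html.parser import HTMLParser


class ExternalAssetFinder(HTMLParser):
    """Collect (tag, url) pairs for assets loaded from an absolute URL."""

    def __init__(self) -> None:
        super().__init__()
        self.external: list[tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            return  # anchors are navigation, not assets loaded by the page
        for name, value in attrs:
            if name in ("src", "href") and value and value.startswith(
                ("http://", "https://", "//")
            ):
                self.external.append((tag, value))


def find_external_assets(html: str) -> list[tuple[str, str]]:
    finder = ExternalAssetFinder()
    finder.feed(html)
    return finder.external
```

An empty list from the built file is a good sign, not a guarantee. The browser console is still the final word on CSP violations.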

A weird one we hit recently: widget loads, looks correct, but no data shows up. Check your response shape. The content field is what the model sees. The structuredContent field is what gets passed to the widget for rendering. The _meta field is widget-specific metadata. If you put your data in only one of these, the widget either misses it or the model misses it. The split takes some getting used to. Put the data the model needs to summarise in content, put the data the widget needs in _meta or structuredContent, and accept some duplication.
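Here is the split sketched as a plain Python dict, assuming your server framework lets you return the raw result shape. build_tool_result and missing_parts are hypothetical helpers, and the widget payload is whatever your widget expects. The point is only that the summary text and the widget data travel in different fields.

```python
def build_tool_result(summary: str, widget_data: dict, widget_meta: dict = None) -> dict:
    """Populate both the model-facing and the widget-facing parts of a result."""
    result = {
        # What the model reads and summarises.
        "content": [{"type": "text", "text": summary}],
        # What the widget receives for rendering.
        "structuredContent": widget_data,
    }
    if widget_meta:
        # Widget-specific metadata.
        result["_meta"] = widget_meta
    return result


def missing_parts(result: dict) -> list[str]:
    """Name the fields that are absent or empty in a tool result."""
    return [key for key in ("content", "structuredContent") if not result.get(key)]
```

Calling missing_parts on every outgoing result during development is a cheap way to catch the "widget loads but shows nothing" case before it reaches Copilot.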

We do quite a bit of custom AI development for clients where the widget is the main UX, and getting this binding right is one of those things you only learn through repeated pain. Worth doing a clean prototype that exercises every part of the shape before you start optimising for your actual use case.

The double scrollbar problem

If your widget shows a scrollbar inside the Copilot container, which already has its own scrollbar, you've got the double scroll problem. Disable the inner scroll with overflow: hidden on the widget's root container. Copilot's host handles the scrolling. Your widget just needs to render at its natural height.

While we're on widget gotchas: anchor tags don't open external links from inside the widget. You need to call the platform's open-link API. For MCP apps it's app.openLink(url). For OpenAI SDK apps it's window.openai.openExternal(url). We've had widgets where everything worked perfectly except for one "Learn more" link that did nothing, and it took embarrassingly long to realise the link element itself was the problem.

Fullscreen isn't supported across all Copilot hosts. If you're putting a fullscreen button on your widget, check for host capabilities first and hide the button if it's not supported. Otherwise users click it, nothing happens, and your widget looks broken.

The OAuth pit of despair

Authentication is where most teams get truly stuck. The error messages aren't always helpful and the failure modes are often subtle.

If you see "The App ID used in the request does not match the App ID in the authentication configuration", your plugin manifest's app ID doesn't match what's registered in the Teams developer portal. Go check both. They have to be byte-identical. We've had cases where a leading zero was missing in one place.

If you see "The base URL in your authentication configuration does not match the server URL", your MCP server URL in the plugin manifest doesn't match the base URL registered with the OAuth client. Again, has to be exact. Trailing slash matters. Protocol matters. We had one case where the portal had https://api.example.com.au and the plugin had https://api.example.com.au/. Single trailing slash difference. Auth refused to work.

If you see "No matching configuration found for referenceID", your runtime's auth.reference_id value in the manifest doesn't match the registration ID in the developer portal. This one trips up teams that copy a manifest from a template and forget to update the reference ID.
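All three of these errors come down to two strings that should be identical and aren't. A strict comparison with no normalisation finds them quickly. This is a minimal sketch: find_auth_mismatches is a made-up helper, and the keys app_id, base_url, and reference_id are labels we chose for illustration, not real manifest or portal field names. Copy the values in by hand from both places.

```python
def find_auth_mismatches(manifest_values: dict, portal_values: dict) -> list[str]:
    """Compare auth settings with exact string equality and explain near misses."""
    problems = []
    for key in ("app_id", "base_url", "reference_id"):
        ours, theirs = manifest_values.get(key), portal_values.get(key)
        if ours == theirs:
            continue
        hint = ""
        if isinstance(ours, str) and isinstance(theirs, str):
            if ours.rstrip("/") == theirs.rstrip("/"):
                hint = " (differs only by a trailing slash)"
            elif ours.strip() == theirs.strip():
                hint = " (differs only by surrounding whitespace)"
            elif ours.lower() == theirs.lower():
                hint = " (differs only by letter case)"
        problems.append(f"{key}: manifest={ours!r} portal={theirs!r}{hint}")
    return problems
```

The hints exist because these differences are nearly invisible when you compare two browser tabs by eye.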

"Access is restricted by your organization's policy" is an admin problem, not a developer problem. Your tenant has policies blocking custom apps and the user needs to escalate to IT. We see this constantly with enterprise clients. Worth building this scenario into your test plan early so you know which Microsoft 365 admin levers need pulling before users get unblocked.

The popup-doesn't-close OAuth failure is the worst one. The user clicks sign in, the popup opens, they complete auth, then the popup hangs forever and Copilot never gets the result. This usually means window.opener got destroyed during your OAuth redirect chain. If you're hitting an identity provider that does multiple redirects (common with federated auth), the popup loses its reference to the parent window and can't post the result back. Solutions vary depending on your IdP. Sometimes a simple intermediate page that holds the opener reference works. Sometimes you need to use the redirect URL pattern instead of a popup.

Practical advice for keeping yourself sane

A few things that have made our Microsoft Copilot work more pleasant.

Build a tiny test agent that has only your tool plus developer mode instructions baked in. Use it as your debugging environment. The full agent with five tools and a hundred-line instruction prompt is too noisy to debug from.

Log everything on the server side. When a tool call comes in, log the full request payload before doing anything else. When you return a response, log the full response. Half the bugs we've debugged were obvious from looking at server logs but invisible from inside Copilot.
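One way to make that logging automatic is a decorator around each tool handler. The sketch below assumes a synchronous handler that takes a dict of arguments and returns a JSON-friendly result. log_tool_call and the logger name are our own choices, and most real MCP SDKs use async handlers, so treat this as the shape of the idea rather than drop-in code.

```python
import functools
import json
import logging

logger = logging.getLogger("mcp.tools")


def log_tool_call(handler):
    """Log the full request before the handler runs and the full response after."""

    @functools.wraps(handler)
    def wrapper(arguments: dict):
        logger.info("tool=%s request=%s", handler.__name__,
                    json.dumps(arguments, default=str))
        try:
            result = handler(arguments)
        except Exception:
            # Record the traceback, then let the error propagate unchanged.
            logger.exception("tool=%s failed", handler.__name__)
            raise
        logger.info("tool=%s response=%s", handler.__name__,
                    json.dumps(result, default=str))
        return result

    return wrapper
```

Be careful with what ends up in those logs. Full payloads can include customer data and tokens, so scrub or restrict access before this goes anywhere near production.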

Test with at least three different real users before declaring something done. The set of queries that make Copilot pick your tool is broader and weirder than you'd expect. Engineers think like engineers. Real users phrase things in ways that don't trigger the model the same way. We've shipped tools that worked perfectly in dev and then needed description rewrites after watching three actual users miss them.

If you're stuck on an OAuth issue that the docs don't cover, escalate through your Microsoft partner contact. The internal teams who work on this have seen patterns the public docs haven't been updated to reflect yet. We've had things resolved in a day through that route that would've been a week of self-debugging.

Bigger picture

MCP for Copilot is still pretty new. The tooling is rough in places, the error messages are often unhelpful, and the docs are catching up to a quickly moving target. None of that should put you off building. The capability you get when it works is genuinely interesting, and the people I know building production tools on top of it are already shipping real value.

If you want help working through your own MCP build, or you've got a Copilot agent in mind and aren't sure how to get from idea to production, that's the kind of work we do. We can usually save teams a few weeks of poking at unhelpful error messages by sharing what we've already debugged.

Reference: Troubleshoot MCP apps - Microsoft Learn