
Debug Microsoft 365 Copilot Plugins Locally - A Practical Setup Guide

June 1, 2026 · 8 min read · Michael Ridland

The first time you try to debug a Microsoft 365 Copilot plugin, you'll probably do what most developers do. Deploy your changes to a dev tenant. Wait for the agent to update. Open Copilot. Type a prompt. Watch nothing happen. Add a console.log. Redeploy. Wait again. After four or five rounds of this, you'll start to wonder if there's a better way. There is. This post is about how to actually set up a fast local debugging loop, because the official docs cover the ground but don't quite say which bits matter and which bits are the ones that will eat your afternoon.

We've shipped a lot of Copilot plugins for Australian clients now, and the team has settled into a debugging pattern that we keep coming back to. Here's what's worked.

The shape of a local debugging loop

The thing you want is this: code running on your laptop, with breakpoints, that Copilot can actually call. To get there, you need three things working together.

First, your plugin code (the API your plugin exposes) running locally with a debugger attached. Usually this is a Node.js app, an ASP.NET app, or a Python FastAPI app. It doesn't matter much which; the pattern is the same.

Second, a public URL that forwards to your localhost. Copilot lives in the cloud. It can't reach your laptop directly. Tools like ngrok, dev tunnels, or Cloudflare tunnels make your local server reachable from the internet temporarily.

Third, a plugin manifest that points at that public URL. You upload this to your dev tenant, and Copilot loads the plugin and calls the URL when it thinks it needs to.

Once you have all three, the loop looks like this. You set a breakpoint in your code. You type a prompt in Copilot. Copilot decides to call your tool. The call hits your local server. The debugger pauses on your breakpoint. You inspect the request, step through the code, fix the bug, save the file, reload, and try again. This is the same flow as debugging any normal API, except the client is Copilot instead of Postman.

Getting the tunnel right

The tunnel is the bit most people get wrong first time. A couple of things matter.

Pick a tool that gives you a stable URL while you're working. ngrok will, if you use a named domain or a paid account. Dev tunnels (the Microsoft-supplied option) give you a reasonably stable URL per tunnel. Cloudflare's quick tunnels generate a new URL every time you start one, which is fine until you realise your manifest now points at yesterday's tunnel. Pick one and stick with it for a given session.

The other thing is that the URL has to support HTTPS. Copilot will not call an HTTP-only URL. ngrok and dev tunnels handle this for you. If you're using something more bespoke, make sure your certificate comes from a publicly trusted certificate authority, so that the cloud service calling you trusts it, not just your local machine.
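A quick pre-flight check catches the "manifest points at yesterday's tunnel" problem before you upload anything. This is a minimal sketch using only the Python standard library. The function name check_server_url is ours, and it assumes you can read the server URL out of your plugin's API description and the current tunnel URL out of your tunnel tool yourself.

```python
from urllib.parse import urlparse


def check_server_url(manifest_url: str, tunnel_url: str) -> list[str]:
    """Return a list of problems; an empty list means the URL looks usable.

    manifest_url: the server URL your plugin package points at (assumed to be
                  read from your own manifest or API description).
    tunnel_url:   the public URL your tunnel tool printed for this session.
    """
    problems = []
    manifest = urlparse(manifest_url)
    tunnel = urlparse(tunnel_url)

    # Copilot will not call an HTTP-only URL.
    if manifest.scheme != "https":
        problems.append(f"manifest URL uses '{manifest.scheme}', not https")

    # Copilot runs in the cloud and cannot reach your laptop directly.
    if manifest.hostname in ("localhost", "127.0.0.1"):
        problems.append("manifest URL points at localhost")

    # A stale tunnel shows up as a host mismatch.
    if manifest.hostname != tunnel.hostname:
        problems.append(
            f"manifest host {manifest.hostname} does not match "
            f"tunnel host {tunnel.hostname}"
        )
    return problems


if __name__ == "__main__":
    # Example hostnames are placeholders, not real tunnels.
    for problem in check_server_url(
        "https://old-tunnel.example.test/api",
        "https://new-tunnel.example.test",
    ):
        print("WARNING:", problem)
```

Running this as part of the package step means a stale URL fails loudly on your machine instead of silently in Copilot.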

If you're behind a corporate proxy or in a network that blocks outbound tunnels, you're going to have a bad time. We've had clients where the security policy makes ngrok impossible. In that case the answer is usually a small dev environment in Azure or AWS with the right CORS and ingress configured. Still local-ish, in that you can deploy to it quickly, but you give up the breakpoints. It's a tradeoff. If you're doing a lot of Microsoft AI work for a regulated client, plan for this on day one.

The manifest reload trick

The next thing that catches people out is the manifest. You change your plugin's behaviour, you update the manifest (because the function signatures changed, say), and you upload the new version. Copilot doesn't pick it up. You restart Copilot. Still nothing. You delete the agent and reinstall. It works.

What's happening is Copilot caches the manifest at multiple layers. The TeamsAppDefinition cache is one. The agent's installed-app state is another. There's an undocumented edge case where a new manifest with the same version number sometimes gets ignored because the platform thinks "nothing changed."

The fix is two-part. First, bump the version number in the manifest on every change while you're debugging. Don't try to be clever about it. Use a script that bumps the patch version automatically on every package step. Second, in Copilot itself, type -developer on at the start of every session. This puts you in developer mode and gives you the debug card that shows which tools Copilot considered and which manifest version it loaded. If you see an old version showing up, that's your hint.
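The version bumper doesn't need to be anything fancy. Here's a minimal sketch, assuming your manifest is a JSON file with a top-level "version" field in major.minor.patch form. The file path and the function names are placeholders for whatever your project uses.

```python
import json
import sys
from pathlib import Path


def bump_patch(version: str) -> str:
    """Increment the patch component of a major.minor.patch version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


def bump_manifest(path: Path) -> str:
    """Rewrite the manifest in place with a bumped version and return it."""
    manifest = json.loads(path.read_text(encoding="utf-8"))
    manifest["version"] = bump_patch(manifest["version"])
    path.write_text(json.dumps(manifest, indent=2) + "\n", encoding="utf-8")
    return manifest["version"]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (path is an assumption): python bump_version.py appPackage/manifest.json
    print("manifest version is now", bump_manifest(Path(sys.argv[1])))
```

Hook it in before whatever step zips or uploads the package, so you can't forget it.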

Hot reload, sort of

A proper hot reload loop is possible but a bit of a hack. The basic idea: run your plugin server with nodemon (Node) or dotnet watch (.NET) or uvicorn --reload (Python). The server restarts on file change. Your tunnel stays up. The next call from Copilot hits the fresh code.

The catch is that Copilot's call latency means you don't get the snappy feel of, say, a React app with HMR. There's the round trip, the planning step, the actual function call. So even with hot reload, you're probably looking at five to ten seconds between save and next test. Fine. Better than redeploying.

For UI cards (Adaptive Cards rendered by your plugin), the feedback loop is even slower because Copilot likes to cache the rendered card output. We've taken to building Adaptive Cards in a separate visual designer first, then dropping the JSON into the plugin once we're happy. Saves a lot of round trips.

Breakpoints, request inspection, and the actually useful logs

Once your breakpoints are firing, life is good. A few tips.

Inspect the headers Copilot sends. There's usually an authorisation token, a correlation ID, and some Microsoft-specific headers that tell you which user and which tenant the call came from. The correlation ID is the thing you'll want to paste into Application Insights or wherever your logs live, so you can match a specific Copilot conversation to the server-side traces.
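When we log headers, we redact the token and pull the correlation ID out into its own field so it's easy to search for. A minimal sketch follows. The header name "x-correlation-id" is a placeholder, not a documented Copilot header, so check what actually arrives at your breakpoint and pass that name in.

```python
def summarise_headers(
    headers: dict[str, str],
    correlation_header: str = "x-correlation-id",  # placeholder name, an assumption
) -> dict[str, str]:
    """Return a log-safe copy of the request headers.

    Header names are lowercased, the Authorization value is redacted
    (only the scheme is kept), and the correlation ID is copied into a
    dedicated "correlation_id" key.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    summary = {}
    for name, value in lowered.items():
        if name == "authorization":
            scheme = value.split(" ", 1)[0]
            summary[name] = f"{scheme} <redacted>"
        else:
            summary[name] = value
    summary["correlation_id"] = lowered.get(correlation_header.lower(), "<missing>")
    return summary


if __name__ == "__main__":
    print(summarise_headers({
        "Authorization": "Bearer abc.def.ghi",
        "X-Correlation-Id": "1234",
    }))
```

Never log the raw bearer token, even in dev. Logs have a habit of ending up in places they shouldn't.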

Print the full request body on the first hit. Copilot's argument extraction is usually pretty good but occasionally weird. The agent will sometimes pass an argument as a string when you declared it as a number, or vice versa. Catching that fast in dev saves hours of confusion later.
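One way to catch the string-versus-number problem early is to compare each incoming argument against the type you declared and log every mismatch. This is a rough sketch, not part of any Microsoft SDK. The coerce_arguments helper and the declared-types mapping are our own invention, and it only handles int, float and str.

```python
def coerce_arguments(args: dict, declared: dict[str, type]) -> tuple[dict, list[str]]:
    """Coerce arguments to their declared types and report every mismatch.

    Returns (coerced_args, notes). Arguments that cannot be converted are
    left untouched and reported in notes.
    """
    coerced = dict(args)
    notes = []
    for name, expected in declared.items():
        if name not in args:
            continue
        value = args[name]
        if isinstance(value, expected):
            continue  # already the declared type
        try:
            coerced[name] = expected(value)
            notes.append(
                f"{name}: got {type(value).__name__} {value!r}, "
                f"converted to {expected.__name__}"
            )
        except (TypeError, ValueError):
            notes.append(
                f"{name}: got {type(value).__name__} {value!r}, "
                f"cannot convert to {expected.__name__}"
            )
    return coerced, notes


if __name__ == "__main__":
    # Example payload is made up for illustration.
    fixed, notes = coerce_arguments(
        {"limit": "10", "query": "overdue invoices"},
        {"limit": int, "query": str},
    )
    print(fixed)
    for note in notes:
        print("ARG MISMATCH:", note)
```

In dev we'd rather log the mismatch and carry on than reject the call, because the notes tell you which parameter descriptions in the manifest need tightening.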

Watch for OAuth flow weirdness. If your plugin needs the user's identity, you'll be doing on-behalf-of token exchanges or similar. These flows are easy to get wrong and the error messages are not always helpful. Test the auth flow with at least two users with different permissions, not just yourself, because your dev account usually has more privileges than a real end user will.

This is the kind of detail we run into on most of our AI agent development projects. The auth model is half the complexity of building anything that talks to Microsoft 365 data, and you don't want to discover that in production.

What still slows us down

I'll be straight about the rough edges that still trip us up.

The error messages from Copilot when a tool call fails are generic. You'll get "I couldn't complete that action" or similar. The actual error is usually in the developer mode debug card, but not always. If the debug card just says "TimedOut", you have to go look at your server logs. There's no nice unified trace.

The local-to-cloud auth story for testing requires you to register your dev app properly in Entra ID. If you skip this step, you can still call your tool on your local server directly, but Copilot can't call it because the manifest fails validation. Do the registration properly the first time. Don't try to ship a manifest with placeholder client IDs and expect it to magically work.

Multiple developers working on the same dev tenant can collide. If you've got three engineers each running ngrok and each uploading their own manifest version, you'll trip over each other. We usually set up a per-developer agent (different agent name) so each person has their own slot.
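The per-developer naming can be scripted so nobody has to remember it. A minimal sketch, assuming your app manifest keeps its display name under name.short, as the Teams app manifest does. The 30-character limit is passed in as a parameter because you should confirm it against the schema version you're using, and the helper name is ours.

```python
import copy


def personalise_manifest(manifest: dict, developer: str, max_short_len: int = 30) -> dict:
    """Return a copy of the manifest with the developer's name in the short name.

    The base name is truncated if needed so the result stays within
    max_short_len characters (an assumed limit, check your manifest schema).
    """
    result = copy.deepcopy(manifest)  # leave the checked-in manifest untouched
    suffix = f" ({developer})"
    base = result["name"]["short"]
    keep = max(0, max_short_len - len(suffix))
    result["name"]["short"] = base[:keep].rstrip() + suffix
    return result


if __name__ == "__main__":
    # Names below are made up for illustration.
    shared = {"name": {"short": "Expenses Agent", "full": "Expenses Agent for Finance"}}
    print(personalise_manifest(shared, "priya")["name"]["short"])
```

Each developer then sees their own entry in the dev tenant, and it's obvious whose agent answered a given prompt.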

A starting checklist

If you're sitting down to set up a local Copilot plugin debugging environment for the first time, here's the minimum I'd want in place.

A plugin server you can run locally with hot reload. A tunnel with a stable HTTPS URL. A manifest version bumper in your build script. A dev tenant where you have permission to upload and install custom agents. The Microsoft 365 Agents Toolkit installed in VS Code or your editor of choice, because it does a lot of the manifest packaging for you. A developer mode habit (-developer on at the start of every test session).

Once that's in place, the debugging loop is genuinely productive. We can usually iterate on a new plugin in minutes per cycle, not the half-hour cycle you get if you're redeploying every time.

If you're standing up a custom Copilot plugin and want a hand, the team at Team 400 builds these for Australian organisations every week. Our Microsoft AI consulting and Copilot Studio practices can usually save you a week of setup pain. We've also got forward-deployed engineers who'll sit with your team during the build, which is often the fastest way to get past the first plugin and into the second one.

For the official reference, see the Microsoft 365 Copilot debug locally documentation. It's a good base. This post is the bit you usually figure out the hard way once you've used the official docs to get started.