
Writing Instructions for Copilot Agents with API Plugins - Getting Actions to Fire at the Right Time

July 1, 2026 · 9 min read · Michael Ridland

There's a particular kind of frustration that hits the first time you wire an API plugin into a Microsoft 365 Copilot agent. The connection works. You can see the action is registered. And yet the agent either refuses to call it when it obviously should, or calls it with half the parameters missing, or fires it at the worst possible moment for something the user didn't ask for. The plugin isn't broken. The instructions haven't told the agent how to think about the action.

This is a different problem to writing instructions for a declarative agent that just answers questions from documents. Once an agent can take actions against a live API, the instructions are doing double duty: they shape how the agent talks, and they govern when it reaches out and does something in a real system. Get that second part wrong and you've built something that either doesn't work or, worse, does the wrong thing confidently. We've shipped a good number of these for Australian clients now, and this is what we've learned about writing the instructions that make API plugins actually behave.

Why an API plugin changes the instruction job

A declarative agent with just knowledge sources is fairly forgiving. If its answer drifts a bit, the cost is a slightly-off response and a mildly annoyed user. An agent with an API plugin can create a support ticket, look up a customer's account, submit a form, trigger a workflow. The blast radius of getting it wrong is real, and users feel it immediately.

The model has to make a chain of decisions for every action, and each link can break. Should I call an action at all here, or just answer? If so, which action? What parameters does it need, and do I have them, or do I need to ask the user? What do I do with what comes back? Your instructions are the thing steering all of that. The API plugin definition itself describes what the action does mechanically, but it can't capture your business judgement about when it's appropriate to use. That judgement lives in the instructions.

Describe the action the way a colleague would

The single most useful thing you can do is describe each action in the instructions in plain business language, not technical terms. The plugin's own description is written for a schema. Your instructions should be written for the model's decision-making.

Something like: "The GetOrderStatus action looks up the current status of a customer order. Use it when a user asks where their order is, whether it's shipped, or when it'll arrive. You'll need the order number. If the user hasn't given you one, ask for it before calling the action."

That short paragraph is doing a lot. It tells the model what the action is for in human terms, it gives concrete trigger phrases so the model recognises the situations that warrant it, it names the required input, and it tells the model to gather that input first rather than calling the action with a guess. That last instruction alone prevents a huge share of failed calls, because the model's default tendency is to try to be helpful immediately, which means firing the action with whatever it can scrape together rather than pausing to ask.
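One way to keep these descriptions consistent across several actions is to generate them from a small template. The sketch below is illustrative only: the `ActionGuide` structure and `render_instruction` helper are hypothetical names, not part of any Microsoft tooling, and the output is just text you would paste into the agent's instructions.

```python
from dataclasses import dataclass, field


@dataclass
class ActionGuide:
    """Plain-language notes about one plugin action (hypothetical structure)."""
    name: str
    purpose: str                      # what the action is for, in business terms
    triggers: list[str]               # things a user might ask that warrant it
    required_inputs: list[str] = field(default_factory=list)


def render_instruction(guide: ActionGuide) -> str:
    """Turn the notes into one instruction paragraph for the agent."""
    triggers = ", ".join(guide.triggers)
    text = f"The {guide.name} action {guide.purpose}. Use it when a user asks {triggers}."
    if guide.required_inputs:
        inputs = " and ".join(guide.required_inputs)
        # Tell the model to gather inputs first rather than guess.
        text += (
            f" You'll need the {inputs}. If the user hasn't provided this, "
            "ask for it before calling the action."
        )
    return text


order_status = ActionGuide(
    name="GetOrderStatus",
    purpose="looks up the current status of a customer order",
    triggers=["where their order is", "whether it's shipped", "when it'll arrive"],
    required_inputs=["order number"],
)
print(render_instruction(order_status))
```

The point of the template is that every action gets the same four ingredients: purpose, triggers, required inputs, and the instruction to ask first.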

Be explicit about when NOT to call an action

This is the part that gets skipped, and it's the part that causes the embarrassing failures. Instructions tend to list what the actions do. They rarely say when the agent should hold off.

The model, left to its own devices, is eager. If it has an action available, it leans towards using it. So you get agents that call a lookup action for questions that were really just general chit-chat, or that try to submit something before the user has confirmed they actually want to. We now always include explicit restraint instructions:

  • "Only call SubmitLeaveRequest after the user has explicitly confirmed the dates and you have read the details back to them."
  • "Do not call any action for general questions about how the leave policy works. Answer those from the knowledge source instead."
  • "If you are unsure which action applies, ask the user a clarifying question rather than guessing."

The confirmation-before-a-write instruction is not optional in anything we ship. The moment an agent can change something in a real system, users need to trust that it won't do so without a clear yes. An agent that books, submits or creates without confirming will get switched off by nervous users within a week, no matter how clever it is otherwise. Read the action back in plain English, wait for the go-ahead, then call it.
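In an instructions-only agent this rule lives in prose, but it helps to see it as the logic you are asking the model to follow. Here is a minimal sketch of the same gate written as code, as it might appear if the orchestration were custom-built. The `PendingWrite` class, the helper names and the list of accepted replies are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class PendingWrite:
    """A write action waiting for the user's go-ahead (hypothetical structure)."""
    action: str
    params: dict
    read_back: bool = False   # details have been repeated to the user
    confirmed: bool = False   # user gave an explicit yes after the read-back


# Assumed set of replies that count as a clear yes.
AFFIRMATIVE = {"yes", "yes please", "confirm", "go ahead"}


def read_back(pending: PendingWrite) -> str:
    """Describe the action in plain English and mark it as read back."""
    details = ", ".join(f"{key}: {value}" for key, value in pending.params.items())
    pending.read_back = True
    return f"I'm about to run {pending.action} with {details}. Shall I go ahead?"


def record_reply(pending: PendingWrite, reply: str) -> None:
    """Only count a yes that arrives after the read-back."""
    normalised = reply.strip().lower().rstrip(".!")
    if pending.read_back and normalised in AFFIRMATIVE:
        pending.confirmed = True


def may_execute(pending: PendingWrite) -> bool:
    """The write is allowed only once both steps have happened."""
    return pending.read_back and pending.confirmed
```

Note the ordering check in `record_reply`: a "yes" given before the details were read back does not count, which mirrors the wording of the first restraint instruction above.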

Parameters are where most calls quietly fail

A surprising amount of grief comes down to parameters. The model has to map what the user said into the exact inputs the API expects, and the gap between casual human phrasing and a strict API parameter is where things fall over.

Spell it out. Tell the agent which parameters are required and which are optional. Tell it what to do when a required one is missing, which is almost always "ask the user, don't invent it." Be specific about formats where the API is fussy: if a date needs to be a certain format, or an ID has a particular shape, say so, because the model will otherwise hand over whatever the user typed and let the API reject it. And give it guidance on ambiguity. If a user says "cancel my last order" and the action needs an order ID, the instructions should tell the agent to look the order up or confirm which order, not to assume.
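To make "fussy formats" concrete, here is a small sketch of the kind of checks the API is effectively running on whatever the agent passes it. Both formats are assumptions chosen for illustration: a date in `YYYY-MM-DD` form and an order ID shaped like `ORD-` followed by six digits. Substitute whatever your own API actually requires, and state that same format in the instructions.

```python
import re
from datetime import datetime

# Hypothetical ID shape for illustration: "ORD-" followed by six digits.
ORDER_ID_PATTERN = re.compile(r"ORD-\d{6}")


def valid_order_id(raw: str) -> bool:
    """True if the value matches the assumed order ID shape."""
    return ORDER_ID_PATTERN.fullmatch(raw.strip().upper()) is not None


def valid_api_date(raw: str) -> bool:
    """True if the value parses as YYYY-MM-DD (the assumed API format)."""
    try:
        datetime.strptime(raw.strip(), "%Y-%m-%d")
    except ValueError:
        return False
    return True


print(valid_order_id("ord-123456"))   # True
print(valid_api_date("06/07/2026"))   # False
```

A user will happily type "6 July" or "order 123456". If the instructions name the expected format, the agent has a chance of converting or asking before the API rejects the call.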

A pattern that has saved us a lot of debugging: "Before calling any action, check that you have every required parameter. If any are missing, ask the user for them one at a time in plain language. Never pass a placeholder or guessed value to an action." It reads as obvious, but without it the model will cheerfully substitute a plausible-looking guess and the failure lands downstream where it's harder to trace.
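Expressed as code, that pattern is a pre-flight check. The sketch below is a hypothetical helper, not something the platform provides; the parameter names and the list of placeholder values are assumptions. It shows the behaviour the instruction asks for: find what is missing, treat guessed or placeholder values as missing, and ask for one thing at a time.

```python
from typing import Optional

# Values that look filled in but are really guesses (assumed list).
PLACEHOLDERS = {"", "unknown", "n/a", "tbc", "placeholder"}


def missing_required(required: list[str], supplied: dict) -> list[str]:
    """Return required parameter names that are absent, empty, or placeholders."""
    missing = []
    for name in required:
        value = supplied.get(name)
        if value is None or str(value).strip().lower() in PLACEHOLDERS:
            missing.append(name)
    return missing


def next_question(required: list[str], supplied: dict) -> Optional[str]:
    """Ask for the first missing parameter only, or return None if ready to call."""
    missing = missing_required(required, supplied)
    if not missing:
        return None
    return f"Could you give me the {missing[0].replace('_', ' ')}?"


required = ["start_date", "end_date", "leave_type"]
supplied = {"start_date": "2026-07-06", "end_date": "TBC"}
print(missing_required(required, supplied))   # ['end_date', 'leave_type']
print(next_question(required, supplied))      # Could you give me the end date?
```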

Tell it what to do with the response

Half the instruction work is about calling the action. The other half, which people forget, is what to do with what comes back. An API response is structured data, and left alone the model will sometimes dump the raw fields at the user, or misread which field matters, or present a technical error message that means nothing to a human.

So instruct it. "When GetOrderStatus returns, tell the user the status and the expected delivery date in a single friendly sentence. Do not show the raw response." And critically, handle the unhappy paths: "If the action returns an error or no result, apologise, explain in plain English that you couldn't retrieve the order, and suggest the user check the order number or contact the help desk." Errors are where a polished agent and a frustrating one part ways. The API will fail sometimes - a timeout, a not-found, a permissions issue - and the instructions decide whether the user gets a graceful explanation or a wall of technical noise.
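The mapping those instructions describe can be sketched in a few lines. The response fields used here (`status`, `expected_delivery`, `error`) are assumptions; a real GetOrderStatus response will have its own shape, and the instructions should name the fields that actually matter.

```python
def describe_order_result(result: dict) -> str:
    """Turn a hypothetical GetOrderStatus result into one user-facing sentence."""
    # Unhappy path: an error, or a response with no usable status.
    if result.get("error") or not result.get("status"):
        return (
            "Sorry, I couldn't retrieve that order. Please check the order "
            "number, or contact the help desk if it still doesn't come up."
        )
    sentence = f"Your order is {result['status']}"
    if result.get("expected_delivery"):
        sentence += f" and should arrive on {result['expected_delivery']}"
    return sentence + "."


print(describe_order_result({"status": "shipped", "expected_delivery": "8 July"}))
# Your order is shipped and should arrive on 8 July.
print(describe_order_result({"error": "timeout"}))
```

Nothing from the raw payload reaches the user except the two values they care about, and every failure collapses to the same plain-English message.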

Keep it tight, and test against real phrasing

Microsoft gives you a generous character budget for instructions, but more is not better. The agents that behave most reliably keep their instructions focused, usually well short of the limit. When you cram in too much, the model starts to lose the thread and ignores parts of it, and you can't easily predict which parts. Say what matters clearly, order it sensibly - identity, when to act, how to handle parameters, how to handle responses, when to refuse - and stop.
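If you draft instructions in sections, a small check like the one below can enforce the ordering and flag bloat before you paste anything into the agent. It is a sketch under assumptions: the section names are our own labels, the default limit of 8,000 characters is our understanding of the cap on declarative agent instructions (verify against Microsoft's current documentation), and the "half the limit" target is an arbitrary threshold rather than a documented rule.

```python
# Order taken from the paragraph above; the key names are hypothetical labels.
SECTION_ORDER = ["identity", "when_to_act", "parameters", "responses", "when_to_refuse"]


def assemble_instructions(
    sections: dict,
    limit: int = 8000,          # assumed platform cap; check the current docs
    target_ratio: float = 0.5,  # arbitrary "keep it tight" threshold
) -> tuple[str, bool]:
    """Join sections in a fixed order; return the text and an over-target flag."""
    missing = [name for name in SECTION_ORDER if name not in sections]
    if missing:
        raise ValueError(f"Missing sections: {', '.join(missing)}")
    text = "\n\n".join(sections[name].strip() for name in SECTION_ORDER)
    if len(text) > limit:
        raise ValueError(f"Instructions are {len(text)} characters; limit is {limit}")
    over_target = len(text) > limit * target_ratio
    return text, over_target
```

The hard failure at the limit is the least interesting part. The over-target flag is the useful one, because it prompts a trim well before the platform would complain.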

Then test it against how real people actually talk, not how you talk. Real users ask compound questions ("where's my order and can I change the address"), they leave out the details the action needs, they phrase things three different ways, and they go off topic and come back. We run a structured testing pass before any action-taking agent goes live, deliberately throwing messy and adversarial phrasing at it to see whether it calls actions it shouldn't or fumbles the parameters. Then we tune the instructions against what actually broke. This tuning loop is most of the real work, and it's the part the documentation can't do for you. It's also the bulk of what we do when building Copilot Studio agents for clients, because an agent that takes actions needs far more careful shaping than one that just reads documents.
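A testing pass like this can start as nothing more than a table of utterances and the action each one should (or should not) trigger. The sketch below assumes you have some way to send an utterance to the agent and observe which action it called; that hook is represented by the `ask_agent` callable, which is hypothetical. The `eager_stub` function is a stand-in so the example runs on its own, and it deliberately mimics an over-eager agent.

```python
from typing import Callable, Optional

# (utterance, action we expect to be called, or None for "no action")
CASES = [
    ("where's my order and can I change the address", "GetOrderStatus"),
    ("how does the leave policy work?", None),
    ("thanks, that's all", None),
]


def run_cases(
    ask_agent: Callable[[str], Optional[str]],
    cases: list[tuple[str, Optional[str]]],
) -> list[tuple[str, Optional[str], Optional[str]]]:
    """Return (utterance, expected, actual) for every case that went wrong."""
    failures = []
    for utterance, expected in cases:
        actual = ask_agent(utterance)
        if actual != expected:
            failures.append((utterance, expected, actual))
    return failures


def eager_stub(utterance: str) -> Optional[str]:
    """Stand-in for a real agent call: fires an action on any keyword match."""
    text = utterance.lower()
    if "order" in text:
        return "GetOrderStatus"
    if "leave" in text:
        return "SubmitLeaveRequest"
    return None


for failure in run_cases(eager_stub, CASES):
    print(failure)
# ('how does the leave policy work?', None, 'SubmitLeaveRequest')
```

The one failure is exactly the kind the restraint instructions are meant to prevent: a general policy question that triggered a write action. Each failure in the list becomes a candidate line in the instructions, and the table grows as real users find new ways to phrase things.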

Where this fits, and when to go further

Instructions carry an API plugin agent a long way, and for a lot of internal use cases that's all you need. But there's a ceiling. When you need genuine multi-step orchestration, conditional logic between several actions, custom handling the model can't be trusted to get right every time, or tight control over a sensitive workflow, you're past what instructions alone can do reliably, and you're into proper custom agent development where the orchestration lives in code rather than in prose. Knowing where that line sits is a judgement call, and crossing it too late is a common way for a promising internal agent to lose everyone's trust.

The encouraging news is that the thinking transfers. The discipline of describing actions clearly, gathering parameters before acting, confirming before writes, and handling errors gracefully is exactly the same discipline you need when you build the custom version. If your team is getting started with any of this, our Copilot training walks through these patterns with real examples rather than toy ones.

Microsoft's documentation on how to write instructions for agents with API plugins is a solid reference for the structure. The patterns in this post are what we've found hold up once real people start using the thing in production. If you're building an agent that needs to take actions and you want it to behave predictably, that's daily work for us - get in touch and we'll give you an honest read on whether instructions will get you there or whether you need something with more control underneath.