Declarative Agent Best Practices - What We've Learned Shipping Them to Australian Clients
Declarative agents are now my default recommendation when an Australian client asks "how do we customise Copilot for our business?" They're cheap to build, quick to ship, and they actually work for most internal knowledge and workflow scenarios. But they fail in predictable ways if you don't follow some basic patterns. This post is the version of "best practices" I wish I'd had when we started shipping these eighteen months ago.
The official Microsoft best practices document is good (link at the bottom). What follows is a more opinionated take based on what we've actually seen succeed and fail in production.
Start by being honest about what a declarative agent is for
A declarative agent is a configured version of Microsoft 365 Copilot with custom instructions, grounding sources, and optional plugin actions. That's it. It's not an autonomous agent. It's not a custom-trained model. It's not going to do anything Copilot can't already do; it's going to do a subset of those things more reliably, with better grounding, in a defined scope.
This sounds like a limitation. In practice it's the feature. The most successful declarative agents we've built are the ones where the team resisted the urge to make them do everything, and instead made them do one thing very well.
The HR agent that answers leave questions accurately is more valuable than the "AI Assistant" that does everything badly. Pick a job. Make it great at that job. Move on.
One agent, one purpose
The first failure mode is scope creep. Someone in the room says "we should also have it do timesheet queries", and then "what about expense claims", and then "could it also book meetings". You end up with a Frankenstein agent that does six things, none of them well.
Build separate agents. Each with its own purpose, its own instructions, its own grounding. Users can switch between them in Copilot. The mental model is "I'm asking the HR agent" or "I'm asking the Sales agent", which is closer to how people already think about asking colleagues.
This also makes maintenance dramatically easier. When you change the leave policy, you update the HR agent. You don't have to re-verify that you haven't broken the expense flow.
Name and description are part of the product
Users find your agent by name in Copilot. The name and the description in your manifest are not just metadata; they're marketing copy. If they're unclear, nobody uses the agent. If they're misleading, people use it for the wrong thing and get frustrated.
A good name is short and concrete. "Acme HR Assistant" beats "Acme Smart Helper". "Project Phoenix Status Bot" beats "Acme AI". Be boringly specific.
A good description is one sentence that tells someone what the agent does and when they'd use it. "Use this agent to ask questions about Acme's leave, parental leave, and time-off policies, and to check your current leave balance." That's it. The user knows whether to pick this agent or another one.
We've watched clients agonise for hours over the agent name. It matters. Spend the time.
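Here's a minimal sketch of how the naming advice above could be turned into a pre-review check. The `name` and `description` keys are the manifest fields this section talks about; everything else (the `lint_identity` function, the vague-word list, the five-word limit, the one-sentence rule) is an assumption made up for illustration, not a Microsoft rule.

```python
import re

# Hypothetical list of words that make a name sound generic. Tune it for your organisation.
VAGUE_WORDS = {"smart", "helper", "ai", "intelligent"}


def lint_identity(manifest: dict, max_name_words: int = 5) -> list:
    """Return a list of human-readable problems with the agent's name and description."""
    problems = []
    name = str(manifest.get("name", "")).strip()
    description = str(manifest.get("description", "")).strip()

    words = re.findall(r"[a-z0-9']+", name.lower())
    if not name:
        problems.append("name is missing")
    elif len(words) > max_name_words:
        problems.append(f"name has {len(words)} words; aim for {max_name_words} or fewer")

    vague = sorted(VAGUE_WORDS.intersection(words))
    if vague:
        problems.append("name uses vague words: " + ", ".join(vague))

    if not description:
        problems.append("description is missing")
    else:
        # Rough sentence count: split on terminal punctuation and drop empty pieces.
        sentences = [s for s in re.split(r"[.!?]+", description) if s.strip()]
        if len(sentences) > 1:
            problems.append("description runs to more than one sentence")
    return problems
```

A check like this won't write a good name for you. It only catches the obvious misses ("Acme AI", an empty description) before a human spends review time on them.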
Grounding sources need curation, not bulk
When you give a declarative agent a SharePoint site as a grounding source, it indexes the content and uses it to answer questions. The naive approach is to point it at all your SharePoint. This is almost always wrong.
The problem is that SharePoint accumulates junk. Old drafts, superseded policies, meeting notes from 2019, accidental personal files. If you point an agent at "everything", it'll happily cite an old draft policy as authoritative. We've seen this happen and watched the client lose trust in the agent overnight.
Curate. Put the documents that should be cited into a specific, well-named SharePoint site or document library. Make that the grounding source. Have a process for keeping it clean. Treat it like a content product, not a dumping ground.
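As a rough sketch of what "a process for keeping it clean" might look like, the snippet below flags documents that deserve a human look before they stay in the grounding library. It assumes you've already exported a document inventory by some other means; the `Doc` record, the title pattern, and the two-year age threshold are all hypothetical choices, not anything SharePoint or Copilot provides.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical title markers that suggest a document is not the canonical version.
DRAFT_PATTERN = re.compile(r"\b(draft|superseded|old|copy of)\b", re.IGNORECASE)


@dataclass
class Doc:
    title: str
    last_modified: date
    owner: Optional[str]  # None means nobody is accountable for the content


def review_reasons(doc: Doc, today: date, max_age_days: int = 730) -> list:
    """Return the reasons a document should be reviewed before it is used for grounding."""
    reasons = []
    if DRAFT_PATTERN.search(doc.title):
        reasons.append("title looks like a draft or superseded copy")
    if (today - doc.last_modified).days > max_age_days:
        reasons.append("not modified in over %d days" % max_age_days)
    if not doc.owner:
        reasons.append("no accountable owner")
    return reasons
```

The function flags rather than deletes on purpose. An old policy can still be the current policy, so the decision stays with whoever owns the content.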
We help clients design this curation process as part of our AI workspaces work, and it's usually the unglamorous bit that determines whether the whole thing succeeds.
Sensitivity labels are not optional
If your grounding content contains any sensitive data (and it almost always does), the labels need to be right before you deploy. The declarative agent will respect Microsoft Purview sensitivity labels (formerly Microsoft Information Protection) and will refuse to share content the user isn't entitled to see. This is great when it works. It's brittle when the labels are wrong.
The pattern we use:
- Audit the grounding library for labelled vs unlabelled content
- Apply labels consistently before connecting the agent
- Test with users who have different access levels
- Have a process for what happens when new content is added
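The first step in that list, the labelled-versus-unlabelled audit, can be sketched in a few lines. This assumes the library inventory has already been exported into a list of records with a `path` and a `label` (empty or `None` when unlabelled). That record shape and the `audit_labels` function are hypothetical; the sketch does not read labels from SharePoint or Purview itself.

```python
from collections import Counter


def audit_labels(docs: list) -> dict:
    """Summarise label coverage for a grounding library inventory.

    Each item in `docs` is assumed to be a dict with a "path" and an
    optional "label" (None or "" when the document is unlabelled).
    """
    counts = Counter(doc.get("label") or "UNLABELLED" for doc in docs)
    unlabelled = [doc["path"] for doc in docs if not doc.get("label")]
    return {
        "counts": dict(counts),
        "unlabelled": unlabelled,
        # Simple gate: don't connect the agent until nothing is unlabelled.
        "ready_to_connect": not unlabelled,
    }
```

Running the same audit on a schedule is one way to cover the last bullet: new content that arrives unlabelled shows up in the `unlabelled` list instead of going unnoticed.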
The other thing to think about is the conversation log. When users chat with the agent, those conversations end up in Copilot's compliance store. If a user asks a sensitive question and the agent answers, that exchange is logged. Make sure your compliance team is across this before you go live.
Plugins and actions multiply both value and risk
Adding a plugin to a declarative agent unlocks the ability to do things, not just answer questions. Check leave balances. Look up customer records. Create tickets. Send emails. This is where declarative agents become genuinely useful for workflow, not just Q&A.
It's also where the risk profile changes. An agent that answers questions wrong is annoying. An agent that submits a wrong timesheet on your behalf is a problem.
Patterns that work:
Read before write. Start with read-only plugins. Get users comfortable with the agent reading information. Add write capabilities once trust is built.
Confirm before destructive actions. Anything irreversible needs an "Are you sure?" step. The instructions should require the agent to summarise what it's about to do and ask the user to confirm.
Idempotency matters. If the user re-asks the same question or the agent retries a failed call, you don't want to end up with duplicate submissions. Design the plugin so duplicate calls don't cause duplicate outcomes.
Error messages need translation. When a plugin fails, the raw API error is useless to the user. The instructions should tell the agent to translate the error into something a human can act on.
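To make the idempotency pattern above concrete, here's a small sketch of the backend side of a write plugin. It is an in-memory stand-in, not a real timesheet API: `TimesheetStore`, `idempotency_key`, and the payload shape are all hypothetical. The idea is that the same user submitting the same payload twice gets the original result back instead of creating a second record.

```python
import hashlib
import json


def idempotency_key(user_id: str, payload: dict) -> str:
    """Derive a stable key from who is submitting and what they are submitting."""
    # sort_keys makes the key independent of the order the fields arrive in.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{user_id}|{canonical}".encode("utf-8")).hexdigest()


class TimesheetStore:
    """In-memory stand-in for the system behind a write plugin (hypothetical)."""

    def __init__(self):
        self._by_key = {}

    def submit(self, user_id: str, payload: dict) -> dict:
        key = idempotency_key(user_id, payload)
        if key in self._by_key:
            # Duplicate call (retry or re-ask): replay the earlier result.
            return self._by_key[key]
        result = {"submission_id": f"ts-{len(self._by_key) + 1}", "status": "created"}
        self._by_key[key] = result
        return result
```

Hashing the content suits something like a timesheet, where an identical second submission is almost certainly a retry. For actions where a genuine repeat is possible, a client-supplied request ID would be a safer key.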
This is the bit where the work starts to look more like proper engineering, which is why a lot of declarative agent projects eventually graduate to custom agent development once the requirements get serious.
Test with real users, not yourself
The single most useful thing we've learned is that the developer testing the agent thinks completely differently from the people who'll actually use it. The developer asks well-formed questions. Real users ask compound questions, abbreviated questions, questions with typos, questions that assume context the agent doesn't have.
We run a small testing session with five to ten real users before going live. About thirty queries each, written by them, no coaching. The agent will fail in ways you didn't anticipate. Tune the instructions, the grounding, and the actions based on what happens.
The number of times a project has been "ready to deploy" and user testing has then revealed a complete miss is more than I'd like to admit. Always test.
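Five to ten users at roughly thirty queries each gives you somewhere between 150 and 300 results to sort through. A minimal sketch for tallying them is below. It assumes someone has tagged each query by hand as `"ok"` or with the area that needs tuning; the tuple shape, the bucket names, and the `summarise` function are illustrative assumptions.

```python
from collections import Counter


def summarise(results: list) -> dict:
    """Tally a test session.

    `results` is a list of (user, query, outcome) tuples, where outcome is
    "ok" or a hand-assigned failure bucket such as "instructions",
    "grounding", or "actions".
    """
    outcomes = Counter(outcome for _, _, outcome in results)
    total = len(results)
    failures = {bucket: n for bucket, n in outcomes.items() if bucket != "ok"}
    return {
        "total": total,
        "pass_rate": outcomes["ok"] / total if total else 0.0,
        # Largest failure bucket first, so you know where to tune first.
        "failures": dict(sorted(failures.items(), key=lambda item: -item[1])),
    }
```

The useful output is less the pass rate than the ordering of the failure buckets, which tells you whether the next round of tuning belongs in the instructions, the grounding, or the actions.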
Versioning and rollout
A declarative agent is defined by a JSON manifest. You can update it. When you do, users get the updated version in their next session. This is great for fixing things. It's also a way to break things silently.
We treat the manifest as a source-controlled artefact. It lives in a Git repo. Changes go through a basic review. There's a dev version that we use for testing and a production version that gets deployed to users. This is overkill for a quick proof of concept. It's the right amount of process for anything that's actually used.
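Git already shows the raw text diff, but for review it can help to see which top-level manifest fields differ between the dev and production versions. The helper below is a sketch of that idea only. `changed_fields` is a hypothetical name, and it compares top-level keys of two already-parsed manifests without looking inside nested structures.

```python
def changed_fields(old: dict, new: dict) -> dict:
    """Map each top-level key that differs to its (old, new) pair of values.

    A key missing from one side shows up with None on that side.
    """
    keys = sorted(set(old) | set(new))
    return {
        key: (old.get(key), new.get(key))
        for key in keys
        if old.get(key) != new.get(key)
    }
```

In practice you would load the two files with `json.load` and print the result in the pull request. A change that touches `instructions` probably deserves a re-run of the user test queries; a change to a conversation starter probably doesn't.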
The other thing is rollout. The M365 Admin Centre lets you deploy an agent to specific users, groups, or the whole tenant. Don't deploy to the whole tenant on day one. Deploy to a pilot group. Get feedback. Iterate. Then go wider.
Monitor what's happening
Copilot gives you usage telemetry. The dashboards in the M365 Admin Centre will show you which agents are being used, how often, and by whom. Look at this regularly. The signals you're looking for:
- Agents with high usage that drops off: something broke or expectations weren't met
- Agents with zero usage: nobody knows about them or the name and description are bad
- Agents with steady usage: keep them, that's a success
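If you copy the weekly numbers out of the dashboard into a list, the three signals above can be approximated with a very small function. This is a sketch under stated assumptions: the input is a hand-collected list of weekly session counts per agent, and both the `classify_usage` name and the 50 percent drop threshold are made up for illustration.

```python
def classify_usage(weekly_sessions: list, drop_ratio: float = 0.5) -> str:
    """Bucket an agent by its weekly session counts, oldest week first."""
    if not weekly_sessions or sum(weekly_sessions) == 0:
        return "zero usage"
    peak = max(weekly_sessions)
    latest = weekly_sessions[-1]
    # "Dropped off" here means the latest week is below half of the best week.
    if latest < peak * drop_ratio:
        return "dropped off"
    return "steady"
```

For example, `classify_usage([40, 42, 38, 12])` lands in the "dropped off" bucket because 12 is less than half of the peak of 42. That is the agent to go and ask users about.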
The other signal is qualitative. Talk to your users. The telemetry tells you what's being clicked, not whether it was useful. The conversation with the actual human user is what tells you whether the agent is delivering value.
When to graduate to something else
Declarative agents are a great fit for:
- Internal knowledge bases where the content is in SharePoint
- Simple lookup and read scenarios
- Workflow agents with one or two well-defined actions
- Pilot projects to figure out what users actually want
They start to strain when you need:
- Real-time conversation with external customers
- Complex multi-step workflows with branching logic
- Integration with systems that don't have an out-of-the-box plugin
- A specific UX that doesn't fit inside Copilot's chat interface
At that point you're looking at Copilot Studio for the middle ground, or custom development with the Microsoft AI Agent Framework at the more involved end.
Where to start if you've never built one
If you're new to this and want to get something useful into your business in a couple of weeks:
- Pick one knowledge area that staff ask questions about repeatedly. HR policies, IT how-tos, sales playbooks, whatever it is.
- Curate the content. Put the canonical documents into a clean SharePoint location.
- Write a one-page brief. Who's it for, what's it for, what's out of scope.
- Build the manifest. Identity, scope, behaviour, refusals.
- Test with five users. Tune.
- Deploy to a pilot group of twenty. Tune again.
- Roll out wider once it's stable.
The thing this will teach you is whether the value is real. If users keep coming back, you've got something. If they try it once and never return, the instructions or the grounding need work.
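For the "build the manifest" step, here's a hedged sketch that assembles a manifest from the one-page brief and structures the instructions around identity, scope, behaviour, and refusals. The field names (`version`, `name`, `description`, `instructions`, `capabilities`, `OneDriveAndSharePoint`, `items_by_url`) reflect my understanding of the declarative agent manifest schema and should be checked against the official manifest reference before you rely on them. The `build_manifest` function, the example wording, and the SharePoint URL are hypothetical.

```python
import json


def build_manifest(name, description, identity, scope, behaviour, refusals, sharepoint_url):
    """Assemble a declarative agent manifest as a plain dict (field names are assumptions)."""
    # Keep the four parts of the brief visible as separate sections in the instructions.
    instructions = "\n\n".join([
        "Identity: " + identity,
        "Scope: " + scope,
        "Behaviour: " + behaviour,
        "Refusals: " + refusals,
    ])
    return {
        "version": "v1.0",  # schema version; assumed value, check the manifest reference
        "name": name,
        "description": description,
        "instructions": instructions,
        "capabilities": [
            {"name": "OneDriveAndSharePoint", "items_by_url": [{"url": sharepoint_url}]}
        ],
    }


manifest = build_manifest(
    name="Acme HR Assistant",
    description="Use this agent to ask questions about Acme's leave and time-off policies.",
    identity="You are the HR policy assistant for Acme staff.",
    scope="Answer questions about leave and time-off policies only.",
    behaviour="Cite the policy document you used and keep answers short.",
    refusals="Politely decline payroll, performance, and legal questions.",
    sharepoint_url="https://acme.example/sites/hr-policies",  # hypothetical curated library
)
print(json.dumps(manifest, indent=2))
```

Keeping refusals as their own section makes the out-of-scope list from the brief easy to find and easy to review when the manifest changes.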
If you'd like help thinking through where declarative agents fit in your broader Copilot programme, get in touch. We've helped a lot of Australian organisations go from "we have Copilot licences" to "Copilot is doing real work" and the patterns above are most of how we get there.
The official Microsoft best practices document is here: Best practices for declarative agents. It's worth a read, especially the manifest reference. The patterns in this post are the human side that documentation tends to leave out.