
AI Agent Security and Compliance for Australian Businesses - A Buyer's Guide

May 31, 2026 | 10 min read | Michael Ridland

Most AI agent projects in Australia die in security review. The build works, the demo lands, the CFO approves the budget, and then the document goes to InfoSec and sits there for four months. By the time it comes back, half the assumptions are wrong and the team has to start again.

This post is for the people doing the buying. CIOs, heads of digital, project sponsors, security leads who have been handed an agent project and need to know what to actually ask before signing anything. We've shipped agents into banks, insurers, healthcare providers and a couple of state government departments, and the same conversation happens every time. Knowing the right questions upfront saves months.

If you're earlier in the process and trying to choose a vendor, our piece on choosing an AI agent development partner covers the build side. This one is about the security and compliance side, which is usually the harder part.

Why agents are different from regular AI projects

A chatbot answers questions. An agent does things. That sounds like a small distinction but it changes the security picture completely.

A chatbot reads from a knowledge base and produces text. The blast radius if it goes wrong is reputational. An agent calls APIs, updates records, sends emails, books appointments, processes payments, files tickets in your service desk. The blast radius if it goes wrong is operational, financial, and sometimes regulatory.

When we walk into a client conversation about agents, the first thing we ask is what the agent can write to, not what it can read from. That single question reframes the entire security conversation. A retrieval agent reading from SharePoint is a Privacy Act question. An agent that can issue refunds, change customer addresses, or approve loans is, for an APRA-regulated entity, a CPS 230 question and possibly a board-level question.

If your vendor or internal team is treating an agent like a fancier chatbot, that's a red flag. The threat model is different. The controls are different. The audit requirements are different.

The Australian regulatory picture in mid-2026

There still isn't a single Australian "AI law" with teeth. What exists instead is a layered set of obligations that already apply to AI agents, whether or not the regulations explicitly mention them.

The Privacy Act amendments that passed in late 2024 and rolled through 2025 brought in stricter requirements around automated decision making. If your agent makes a decision that significantly affects an individual, you owe that individual an explanation. That's a build-time requirement, not a documentation exercise. Logging decisions in a way that can be reconstructed later has to be designed into the system.

APRA CPS 230, which has been in force since 1 July 2025, can bring AI agents, and the vendors behind them, into scope as material service providers in many cases. If the agent supports a critical operation, you need to map it, monitor it, and have a tested fallback for when it fails. APRA's expectations on agent oversight have hardened considerably since the early guidance. They want to see evidence that humans are still accountable for outcomes, not just the documentation that says so.

The Voluntary AI Safety Standard from the Department of Industry is, despite the name, the document most boards are using as their reference point. Risk committees are asking project teams to walk through how the agent meets each of the ten guardrails. If your vendor can't help you produce that walkthrough, choose another vendor.

Then there's sector-specific regulation. Health Records Act for clinical agents. ASIC guidance for anything touching financial advice. The Therapeutic Goods Administration if the agent is making any kind of medical recommendation. Education departments have their own requirements for student data. The point is, "AI compliance" in Australia is not one thing. It's whatever applies to the function the agent is performing.

What to ask vendors before signing

We see organisations get into trouble because they buy on capability and ignore the operating model. Capability demos well. Operating model is where the actual cost lives.

Here are the questions we'd expect any serious vendor to answer clearly and in writing.

Where does the data sit and who can read it?

Not just at rest. In transit, during inference, in logs, in evaluation datasets, in prompt caches. Australian data residency is straightforward for storage and increasingly straightforward for inference if you're on Azure or AWS in Sydney or Melbourne. Where it gets fuzzy is with the third-party tools the agent calls. If your agent integrates with a US-based SaaS product, that data can be exposed to US legal process regardless of where your agent runs. Map every endpoint.
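To make "map every endpoint" concrete, here is a minimal Python sketch of an endpoint inventory that flags anything operated outside an approved jurisdiction. Everything in it is an illustrative assumption: the `Endpoint` fields, the `APPROVED_JURISDICTIONS` set, and the sample entries are hypothetical, not a description of any particular client estate.

```python
from dataclasses import dataclass

# Assumption: the organisation has approved only Australian jurisdiction for this data.
APPROVED_JURISDICTIONS = {"AU"}


@dataclass(frozen=True)
class Endpoint:
    name: str          # what the agent talks to
    stage: str         # where in the data flow it sits (inference, cache, logs, tool call)
    jurisdiction: str  # where the operator of the endpoint is legally based
    location: str      # where the data is physically processed


def flag_for_review(endpoints):
    """Return every endpoint whose operator sits outside the approved jurisdictions."""
    return [e for e in endpoints if e.jurisdiction not in APPROVED_JURISDICTIONS]


# Hypothetical inventory for a single agent.
ENDPOINTS = [
    Endpoint("model inference", "inference", "AU", "Sydney"),
    Endpoint("prompt cache", "cache", "AU", "Sydney"),
    Endpoint("application logs", "logs", "AU", "Melbourne"),
    Endpoint("ticketing SaaS", "tool call", "US", "United States"),
]

for e in flag_for_review(ENDPOINTS):
    print(f"REVIEW: {e.name} ({e.stage}) is operated from {e.jurisdiction}, hosted in {e.location}")
```

The useful part is less the code than the discipline: every stage named above (inference, caches, logs, tool calls) gets a row, and a missing row is itself a finding.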

What identities does the agent run as?

The worst pattern we see is agents running as a single service account with admin rights to everything. The pattern that actually works is delegated identity, where the agent operates on behalf of the user who triggered it and inherits that user's permissions. This is more work to implement, but it's the only model that holds up under audit. If a vendor demos with a god-mode service account, ask them what the production identity model looks like. Sometimes the answer is "we haven't built that yet."
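As a rough illustration of delegated identity, the Python sketch below shows an agent that holds no permissions of its own and checks every tool call against the permissions of the user who triggered it. The class names, permission strings and tool names are hypothetical. In a real deployment the check would usually be enforced by the identity provider and the downstream system (for example through an OAuth 2.0 token exchange or on-behalf-of flow), not by application code alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    user_id: str
    permissions: frozenset  # hypothetical permission strings, e.g. "refund:create"


class PermissionDenied(Exception):
    """Raised when the triggering user is not allowed to perform the action."""


class DelegatedAgent:
    """Agent that acts on behalf of one user and has no standing permissions."""

    def __init__(self, on_behalf_of: User):
        self.on_behalf_of = on_behalf_of

    def call_tool(self, tool_name: str, required_permission: str) -> str:
        # The agent can only do what the triggering user could do directly.
        if required_permission not in self.on_behalf_of.permissions:
            raise PermissionDenied(
                f"{self.on_behalf_of.user_id} lacks {required_permission}; {tool_name} blocked"
            )
        # A real implementation would call the downstream system with a user-scoped token here.
        return f"{tool_name} executed as {self.on_behalf_of.user_id}"


support_officer = User("u-1042", frozenset({"customer:read"}))
agent = DelegatedAgent(on_behalf_of=support_officer)

print(agent.call_tool("crm.lookup", "customer:read"))
try:
    agent.call_tool("payments.refund", "refund:create")
except PermissionDenied as err:
    print(f"denied: {err}")
```

The audit benefit is that every downstream action is attributable to a named person, which a shared service account cannot give you.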

How do you evaluate behaviour before and after deployment?

Pre-deployment evaluation is table stakes. Post-deployment monitoring is where most projects fall down. You need a continuous evaluation loop that catches when the agent's behaviour drifts, when prompt injection attempts increase, when tool calls start failing in patterns. Vendors with mature offerings will have a story here. Vendors without one will tell you it's "on the roadmap."
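One simple shape a post-deployment evaluation loop can take is a rolling window of scored interactions compared against the pre-deployment baseline. The Python sketch below assumes each production interaction can be scored pass or fail by some evaluator; the `DriftMonitor` class, the window size and the tolerance are hypothetical values chosen for illustration, not recommended thresholds.

```python
from collections import deque


class DriftMonitor:
    """Compares a rolling pass rate from production against a pre-deployment baseline."""

    def __init__(self, baseline_pass_rate: float, window: int = 50, tolerance: float = 0.10):
        self.baseline = baseline_pass_rate
        self.tolerance = tolerance
        self.results = deque(maxlen=window)  # oldest results fall off automatically

    def record(self, passed: bool) -> None:
        self.results.append(passed)

    def pass_rate(self) -> float:
        if not self.results:
            return self.baseline
        return sum(self.results) / len(self.results)

    def drifted(self) -> bool:
        # Only judge once the window is full, to avoid alerting on a handful of samples.
        if len(self.results) < self.results.maxlen:
            return False
        return self.pass_rate() < self.baseline - self.tolerance


monitor = DriftMonitor(baseline_pass_rate=0.95, window=10, tolerance=0.10)
for outcome in [True] * 10 + [False] * 3:
    monitor.record(outcome)

if monitor.drifted():
    print(f"ALERT: pass rate {monitor.pass_rate():.2f} is below baseline {monitor.baseline:.2f}")
```

The same pattern extends to the other signals mentioned above, such as counting suspected prompt injection attempts or failed tool calls per window instead of evaluation passes.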

How is the agent contained when it misbehaves?

Containment means rate limits, tool-call limits, monetary limits, escalation thresholds, and a kill switch that an operator can actually find at 2am. We had one client whose agent went into a loop calling an external API two thousand times in an hour because the containment logic only triggered on errors, not on patterns. The bill was about three thousand dollars and the operational impact was much higher. Containment is not optional.
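The failure in that story was containment that only looked at errors. A minimal Python sketch of the alternative is below: every call is checked, successful or not, against a call-rate limit and a spend cap, and breaching either one trips a kill switch. The `Containment` class, its limits and the per-call cost are hypothetical; a production version would also need persistence, per-tool limits and paging.

```python
import time
from collections import deque
from typing import Optional


class AgentHalted(Exception):
    """Raised when containment refuses to let the agent act."""


class Containment:
    def __init__(self, max_calls: int, window_seconds: float, max_spend: float):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.max_spend = max_spend
        self.call_times = deque()  # timestamps of calls inside the current window
        self.spend = 0.0
        self.killed = False

    def kill(self) -> None:
        """Kill switch: can be called by an operator or tripped automatically."""
        self.killed = True

    def authorise(self, cost: float, now: Optional[float] = None) -> None:
        """Call before every tool invocation, whether or not earlier calls failed."""
        now = time.monotonic() if now is None else now
        if self.killed:
            raise AgentHalted("kill switch engaged")
        # Forget calls that have aged out of the sliding window.
        while self.call_times and now - self.call_times[0] >= self.window_seconds:
            self.call_times.popleft()
        if len(self.call_times) >= self.max_calls:
            self.kill()
            raise AgentHalted("call-rate limit exceeded")
        if self.spend + cost > self.max_spend:
            self.kill()
            raise AgentHalted("spend cap exceeded")
        self.call_times.append(now)
        self.spend += cost


guard = Containment(max_calls=3, window_seconds=3600, max_spend=50.0)
try:
    for second in range(10):  # simulate a loop hammering the same API
        guard.authorise(cost=1.50, now=float(second))
except AgentHalted as reason:
    print(f"halted: {reason}")
```

Once tripped, the switch stays on until a human resets it, which is the behaviour you want at 2am.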

What audit artefacts do you produce?

For every agent action: who triggered it, what the agent decided, why it decided that, what tools it called, what data it returned, who approved it if approval was required. This needs to be queryable, exportable, and retainable for whatever period your sector requires. Seven years for financial services. Twelve years or longer for some health contexts. If the vendor's logging is a CloudWatch dump, that's not enough.
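To show what a structured audit artefact might look like, as opposed to a raw log dump, here is a small Python sketch with one field for each item in the list above. The field names, the `AuditRecord` class and the sample values are assumptions for illustration; retention, tamper protection and the query layer are out of scope for the sketch.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional, Tuple


def _utc_now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class AuditRecord:
    triggered_by: str                 # who triggered the action
    decision: str                     # what the agent decided
    rationale: str                    # why it decided that
    tool_calls: Tuple[str, ...]       # what tools it called
    data_returned: str                # what data it returned (or a reference to it)
    approved_by: Optional[str] = None # who approved it, if approval was required
    timestamp: str = field(default_factory=_utc_now)

    def to_json(self) -> str:
        """Exportable form: one JSON object per action."""
        return json.dumps(asdict(self), sort_keys=True)


def actions_triggered_by(records, user_id):
    """Queryable form: a trivial filter standing in for a real query layer."""
    return [r for r in records if r.triggered_by == user_id]


# Hypothetical sample record.
record = AuditRecord(
    triggered_by="u-1042",
    decision="update customer address",
    rationale="customer supplied new address and passed identity check",
    tool_calls=("crm.lookup", "crm.update_address"),
    data_returned="crm record reference 88317",
    approved_by=None,
)
print(record.to_json())
```

Because the record is frozen and serialised field by field, the same structure can feed both the export a regulator asks for and the queries an internal reviewer runs.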

A practical buyer checklist

This is the checklist we hand to clients before they sign a build contract. Use it as a discussion document with your vendor or internal team.

| Area | Question | Acceptable answer looks like |
|---|---|---|
| Data residency | Where does prompt, completion, embedding and tool data sit at every stage? | Specific Azure or AWS region, named third parties, contractual data flow diagram |
| Identity | What identity does the agent use when calling each downstream system? | Delegated identity per user, with break-glass admin for specific recovery scenarios |
| Permissions | Are agent permissions least-privilege per action? | Scoped tokens, separate credentials per tool, no shared admin |
| Logging | What is logged for every agent action? | Decision rationale, tool calls, inputs, outputs, approver, timestamp, all queryable |
| Retention | How long are logs kept and how are they protected? | Sector-appropriate retention, encryption at rest, controlled access |
| Containment | What stops a runaway agent? | Rate limits, cost caps, tool call limits, automated kill switch, paged on-call |
| Human in the loop | What actions require human approval? | Documented matrix mapped to risk tiers, not "the agent decides" |
| Evaluation | How is behaviour evaluated before and after deployment? | Pre-deploy eval suite, post-deploy drift monitoring, regression tests on prompt changes |
| Incident response | What happens when the agent does something wrong? | Documented runbook, named owner, regulator notification path if required |
| Decommissioning | How do you take the agent offline cleanly? | Documented process including data export, downstream notification, replacement workflow |

If the answers to most of these are vague or "we'll work that out during the project," the project is going to overrun. We've watched it happen too many times.
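For the human-in-the-loop row, a "documented matrix mapped to risk tiers" can be as plain as two lookup tables. The Python sketch below is one hypothetical way to express it; the action names, tiers and approver roles are invented for illustration, and the real matrix should come out of your own risk assessment.

```python
from enum import Enum


class Tier(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical mapping of agent actions to risk tiers.
ACTION_TIERS = {
    "lookup_order": Tier.LOW,
    "update_address": Tier.MEDIUM,
    "issue_refund": Tier.HIGH,
}

# Hypothetical mapping of risk tiers to the role that must approve (None = no approval).
APPROVER_FOR_TIER = {
    Tier.LOW: None,
    Tier.MEDIUM: "team_lead",
    Tier.HIGH: "operations_manager",
}


def required_approver(action: str):
    """Return the role that must approve an action, or None if it can run unattended."""
    # Fail closed: an action nobody has classified is treated as high risk.
    tier = ACTION_TIERS.get(action, Tier.HIGH)
    return APPROVER_FOR_TIER[tier]


for action in ("lookup_order", "update_address", "issue_refund", "delete_customer"):
    print(f"{action}: approver = {required_approver(action)}")
```

The design choice worth copying is the default: the matrix decides, and anything missing from it escalates rather than running unattended.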

Where Australian businesses typically get caught

A few patterns come up over and over.

Underestimating the cost of doing this properly. A capable AI agent for a small workflow costs maybe forty to seventy thousand dollars to build. The security, compliance, monitoring and operational wrapper around it often costs more than the agent itself. Boards and finance teams often see the build cost in a pitch deck and assume that's the project. The actual range for a production-ready, regulated-industry agent in Australia is more like one hundred and twenty thousand to four hundred thousand dollars depending on complexity. Some larger insurer and bank programs run well over a million when you include the platform investments needed to support multiple agents.

Buying platforms before designing the operating model. Microsoft Copilot Studio, Azure AI Foundry, OpenAI, Anthropic, Google, the open source stacks. They're all capable. Which one is right depends on your existing estate, your data classification, your identity story and what your security team is willing to operate. Picking the platform first and then finding it doesn't fit your security model is a very expensive way to learn this lesson.

Treating governance as a workstream that runs alongside the build. It needs to be the first workstream. Governance decisions constrain the architecture. Architecture choices made before governance is settled almost always need rework.

Assuming the vendor's security posture covers your security posture. Microsoft Azure is secure. Your tenant configuration might not be. Amazon Bedrock is secure. Your IAM policies might not be. Shared responsibility models exist for a reason. Read them.

When you actually need outside help

Not every agent project needs a consulting firm. If you've got an experienced security team that has shipped regulated workloads before, you're competent in the platform you're using, and the agent has a narrow scope, you can probably run this internally.

We get called in when one of three things is true. The team is competent on the AI side but not on the security side. The team is competent on security but is new to agent architectures. Or the project has already been through one cycle of security pushback and the sponsor needs someone to redesign it so the next review actually passes.

Honestly, the third one is the most common reason the phone rings. If you're in that position, the post-mortem on what went wrong the first time is usually more valuable than the redesign. The same patterns will repeat with the next agent if you don't fix them now.

If you want to talk through any of this, get in touch. Our AI agent development practice handles end-to-end builds, and our AI strategy team works with executive sponsors who need to figure out the operating model before they commit to a platform. For regulated industries specifically, our work in financial services and healthcare has shaped most of the patterns in this post.

The agent projects that succeed in Australia in 2026 are the ones where security and compliance were buying criteria, not project deliverables. Make them yours and the rest of the project gets a lot easier.