Autonomous AI Agents: Capabilities and Guardrails

July 9, 2025 · 5 min read · Team 400

"How autonomous should our AI agent be?"

This is the question that separates successful agent deployments from disasters. Too restrictive, and the agent provides little value over a simple chatbot. Too permissive, and you're one edge case away from a PR crisis.

Here's a framework for thinking about agent autonomy, based on our experience building AI agents for Australian businesses.

The Autonomy Spectrum

Agent autonomy exists on a spectrum:

Level 0 - Suggestion Only: Agent analyses and recommends. Humans decide and act. This is AI-assisted, not autonomous.

Level 1 - Supervised Execution: Agent can act, but humans must approve each action. Agent queues requests for review.

Level 2 - Exception-Based: Agent acts autonomously within defined boundaries. Exceptions escalate to humans.

Level 3 - Full Autonomy: Agent acts independently. Humans monitor outcomes, not individual actions.

Most enterprise agents should operate at Level 2. Level 3 is rare and reserved for low-stakes, well-understood tasks.
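
If it helps to make the spectrum concrete, here's a minimal sketch of how the levels might be represented in code. The names are illustrative, not a standard:

import enum

class AutonomyLevel(enum.IntEnum):
    SUGGEST_ONLY = 0     # Level 0: analyse and recommend; humans decide and act
    SUPERVISED = 1       # Level 1: act only after human approval of each action
    EXCEPTION_BASED = 2  # Level 2: act within boundaries; escalate exceptions
    FULL = 3             # Level 3: act independently; humans monitor outcomes

Tagging each capability with an explicit level like this keeps the policy reviewable rather than implicit in prompts.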

Determining Appropriate Autonomy

For any agent action, consider four factors:

Factor 1: Reversibility

Can you undo it?

High reversibility (favour autonomy):

  • Sending a draft for human review
  • Updating internal notes
  • Creating a ticket

Low reversibility (favour human approval):

  • Sending external communications
  • Processing payments
  • Modifying customer records

Factor 2: Impact

What's the cost if it goes wrong?

Low impact (favour autonomy):

  • Answering FAQ questions
  • Looking up order status
  • Scheduling internal meetings

High impact (favour human approval):

  • Issuing refunds over $X
  • Modifying contracts
  • Communicating to VIP customers

Factor 3: Confidence

How certain is the agent about the correct action?

Build confidence estimation into agent design:

If confidence > 0.9: Execute autonomously
If confidence 0.7-0.9: Execute with logging/monitoring
If confidence 0.5-0.7: Request human confirmation
If confidence < 0.5: Escalate to human

Confidence thresholds should be calibrated based on impact.
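
As a minimal sketch, confidence-based routing might look like this. The thresholds and return values are illustrative, and in practice you'd calibrate them per capability:

def route_by_confidence(confidence: float, high_impact: bool = False) -> str:
    """Map a confidence estimate to an execution mode (illustrative thresholds)."""
    # Higher-impact actions get stricter thresholds
    bump = 0.05 if high_impact else 0.0

    if confidence > 0.9 + bump:
        return "execute"             # act autonomously
    if confidence > 0.7 + bump:
        return "execute_and_log"     # act, but flag for monitoring
    if confidence > 0.5 + bump:
        return "confirm_with_human"  # ask before acting
    return "escalate"                # hand off to a human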

Factor 4: Precedent

Has this situation been handled before?

High precedent (favour autonomy):

  • Common customer requests
  • Standard process workflows
  • Well-documented decisions

Low precedent (favour human approval):

  • Unusual edge cases
  • New product/policy areas
  • Situations without clear guidance

A Framework for Autonomy Rules

For each agent capability, define:

capability: process_refund
conditions:
  autonomous_when:
    - refund_amount <= 50
    - reason in ["damaged_item", "wrong_item", "not_received"]
    - order_age_days < 30
    - customer_history == "good_standing"
  requires_approval_when:
    - refund_amount > 50 and refund_amount <= 500
    - reason == "changed_mind"
    - order_age_days >= 30
  requires_human_when:
    - refund_amount > 500
    - customer_history == "flagged"
    - multiple_refunds_recent
  never:
    - customer_explicitly_threatens_legal
    - fraud_indicators_present

Document these rules. Review them regularly. Adjust based on experience.
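
As a rough sketch of how rules like these might be evaluated at runtime (field names follow the example above; this is not a production policy engine):

def refund_autonomy(request: dict) -> str:
    """Classify a refund request against the example rules above (illustrative)."""
    # "Never" conditions: the agent doesn't handle these at all
    if request.get("legal_threat") or request.get("fraud_indicators"):
        return "blocked"

    # Conditions that always require a human
    if (request["amount"] > 500
            or request["customer_history"] == "flagged"
            or request.get("multiple_refunds_recent")):
        return "human_only"

    # Conditions where the agent may act autonomously
    if (request["amount"] <= 50
            and request["reason"] in {"damaged_item", "wrong_item", "not_received"}
            and request["order_age_days"] < 30
            and request["customer_history"] == "good_standing"):
        return "autonomous"

    # Everything else (e.g. $50-$500, changed_mind, older orders) needs approval
    return "requires_approval"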

Guardrails in Practice

Hard Boundaries

Some things agents should never do:

Never:

  • Make legal commitments on behalf of the company
  • Access systems beyond their scope
  • Process payments without proper authorisation
  • Disclose confidential information
  • Make decisions on protected characteristics

These are implemented as system-level blocks, not just instructions. The agent literally cannot take these actions.

Soft Boundaries

Things agents should avoid but may need to do in rare cases:

Avoid but allow escalation:

  • Operating outside business hours
  • Handling topics outside core scope
  • Processing requests from unverified users

These trigger escalation workflows rather than hard blocks.
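
A minimal sketch of how the two kinds of boundary might be enforced differently; the tool names and context flags here are hypothetical:

# Hypothetical tool names and context flags, for illustration only
HARD_BLOCKED_TOOLS = {"sign_contract", "admin_payment_api", "export_customer_pii"}
SOFT_BOUNDARY_FLAGS = ("outside_business_hours", "unverified_user", "out_of_scope_topic")

def check_guardrails(tool_name: str, context: dict) -> str:
    # Hard boundary: a system-level block. The call never reaches the tool,
    # no matter what the model "decides".
    if tool_name in HARD_BLOCKED_TOOLS:
        return "blocked"

    # Soft boundary: permitted, but routed through an escalation workflow
    if any(context.get(flag) for flag in SOFT_BOUNDARY_FLAGS):
        return "escalate"

    return "allowed"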

Rate Limits

Prevent runaway behaviour (see the sketch after this list):

  • Max actions per conversation
  • Max value of autonomous transactions per hour/day
  • Max retries before escalation
  • Circuit breakers for unusual patterns
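
As an illustration, a rolling-window limit on autonomous transaction value might look something like this; the one-hour window and dollar cap are placeholders:

import time
from collections import deque

class TransactionLimiter:
    """Illustrative circuit breaker: caps autonomous transaction value per hour."""

    def __init__(self, max_value_per_hour: float = 1000.0):
        self.max_value_per_hour = max_value_per_hour
        self.recent = deque()  # (timestamp, value) pairs within the window

    def allow(self, value: float) -> bool:
        now = time.time()
        # Drop transactions that have aged out of the one-hour window
        while self.recent and now - self.recent[0][0] > 3600:
            self.recent.popleft()

        if sum(v for _, v in self.recent) + value > self.max_value_per_hour:
            return False  # breaker trips: escalate instead of executing
        self.recent.append((now, value))
        return True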

Monitoring

You can't catch everything upfront. Monitoring catches what slips through (a logging sketch follows the list):

  • Log all agent actions
  • Alert on unusual patterns
  • Sample conversations for human review
  • Track outcomes over time
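
A minimal sketch of structured action logging with a simple review hook; the field names and threshold are illustrative:

import json
import logging
import time

logger = logging.getLogger("agent.actions")

def log_action(action: str, params: dict, outcome: str, confidence: float) -> None:
    """Record every agent action as a structured event for later review."""
    record = {
        "ts": time.time(),
        "action": action,
        "params": params,
        "outcome": outcome,
        "confidence": confidence,
    }
    logger.info(json.dumps(record))

    # Simple review hook: flag low-confidence autonomous actions for sampling
    if outcome == "executed_autonomously" and confidence < 0.7:
        logger.warning("review_needed %s", json.dumps(record))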

Building Escalation Paths

When an agent hits a boundary, what happens?

Good Escalation

Agent: "This request is outside what I can handle autonomously.
        I'm connecting you with Sarah from our team who can help.
        I've shared our conversation so you don't have to repeat yourself.
        She'll be with you in approximately 3 minutes."

[Agent creates ticket with full context]
[Agent notifies available human]
[Agent maintains conversation until human joins]

Bad Escalation

Agent: "I cannot help with that. Please call 1800-XXX-XXX."

[User frustrated, calls, waits on hold, explains everything again]

Invest in escalation UX. The handoff is often the moment that determines customer satisfaction.
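
In code, a context-preserving handoff might look roughly like this; conversation, create_ticket and notify are hypothetical stand-ins for your conversation state, helpdesk and messaging integrations:

def escalate_with_context(conversation, reason: str, create_ticket, notify) -> str:
    """Hand off to a human without losing context (illustrative sketch only)."""
    ticket_id = create_ticket(
        summary=f"Agent escalation: {reason}",
        transcript=conversation.transcript(),
        customer=conversation.customer_id,
    )
    notify(team="support", ticket_id=ticket_id)

    # Keep the customer informed and stay in the conversation until a human joins
    return (
        "This request is outside what I can handle autonomously. "
        "I'm connecting you with someone from our team and have shared our "
        "conversation so you don't have to repeat yourself."
    )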

Autonomy Evolution

Agent autonomy should increase over time as confidence grows:

Phase 1: Learning (Months 1-2)

  • Agent suggests, humans decide
  • Extensive logging and review
  • Building training data

Phase 2: Supervised (Months 2-4)

  • Agent acts on low-risk tasks
  • Human approval for medium-risk
  • Continuous accuracy monitoring

Phase 3: Exception-Based (Months 4+)

  • Agent acts autonomously within boundaries
  • Humans handle exceptions
  • Regular boundary review

Don't skip phases. Earned autonomy is sustainable autonomy.

The Human-in-the-Loop Sweet Spot

The goal isn't maximum automation. It's appropriate automation.

Consider a customer service agent handling support tickets:

  • Too restrictive: Agent looks up information but humans do everything else. Minimal efficiency gain. Why bother?

  • Too permissive: Agent handles everything including complex complaints and refund decisions. Inevitable mistakes. Customer trust damaged.

  • Sweet spot: Agent handles 70% of enquiries autonomously (status checks, simple changes, FAQ). Remaining 30% get human attention with full context prepared. Efficiency improves. Quality maintained.

The sweet spot varies by business, risk tolerance, and customer expectations. Find yours through iteration.

Regulatory Considerations

In regulated industries, autonomy decisions aren't just business choices:

Financial services: APRA expects human oversight of AI decisions. Full autonomy for credit or insurance decisions isn't acceptable.

Healthcare: Clinical decisions require clinician involvement. AI can augment but not replace.

Legal: AI can assist, but the practice of law requires qualified humans.

Know your regulatory requirements. Design autonomy accordingly.

Documentation Requirements

For any autonomous agent, document:

  1. What it can do autonomously (specific actions and conditions)
  2. What requires approval (and who approves)
  3. What it can never do (hard boundaries)
  4. How decisions are made (logic, models, thresholds)
  5. How to intervene (override procedures)
  6. How it's monitored (metrics, alerts, reviews)

This documentation isn't just good practice—it's often required for compliance and auditability.

Getting the Balance Right

If you're implementing AI agents:

  1. Start conservative: More human involvement initially. Loosen as you gain confidence.

  2. Define clear boundaries: Ambiguity leads to accidents. Be specific.

  3. Build good escalation: The handoff to humans should feel seamless.

  4. Monitor continuously: You'll find edge cases you didn't anticipate.

  5. Evolve deliberately: Increase autonomy based on evidence, not hope.

We help businesses design AI agent systems with appropriate autonomy levels. The right balance depends on your specific context.

Let's discuss your agent autonomy design.