Claude's Memory Tool - Building AI Agents That Actually Remember Things
One of the most common complaints about AI assistants is that every conversation starts from zero. You explain your business context, your preferences, your technical environment - and then next time, you do it all over again. It's like working with a contractor who forgets everything between meetings.
The Memory Tool in the Claude Agent SDK tackles this directly. It gives agents a way to persist information across conversations, so the agent can build up knowledge about a user, a project, or a domain over time. It's not a gimmick. When implemented well, it changes how useful an agent actually is in practice.
What the Memory Tool Does
At its core, the Memory Tool gives a Claude-based agent a structured way to read and write persistent data. Think of it as giving the agent a notebook that survives between sessions.
During a conversation, the agent can decide that certain information is worth remembering - a user's role, their preferred communication style, project-specific terminology, decisions that were made, technical constraints that were discussed. The agent writes this to its memory store. In the next conversation, the agent reads its memory before responding, and suddenly it has context that would otherwise be lost.
This isn't the same as conversation history. Chat history is a raw transcript. Memory is curated - the agent's distilled understanding of what matters. A good memory implementation means the agent remembers that you prefer concise answers, that your organisation uses Azure over AWS, and that you're in the middle of a data migration project. No need to replay every message to know this.
Why This Matters More Than You'd Think
I'll give you a concrete example. We built an internal agent for a professional services firm that helps consultants write proposals. Without memory, every time a consultant opened a new session, they had to re-explain their client, the project scope, and the firm's standard approach. It took about five minutes of back-and-forth before the agent was actually useful.
With memory, the agent knew which client the consultant was working with, what previous proposals had been written, and what the firm's standard pricing structure looked like. The consultant could just say "draft a proposal for the Phase 2 analytics work for ClientX" and the agent had enough context to produce a first draft that was 80% there.
The time saving per proposal was about 20 minutes. Across a team of 30 consultants writing multiple proposals per week, that adds up fast.
How It Works Under the Hood
The Memory Tool operates through a file-based storage system. The agent has a designated memory directory where it reads and writes markdown files. This is deliberately simple - no database, no complex schema, just files on disk.
A typical memory structure looks something like this:
```
memory/
├── MEMORY.md             # Main memory file, always loaded
├── project-alpha.md      # Project-specific context
├── user-preferences.md   # User preferences and patterns
└── decisions.md          # Key decisions and rationale
```
The MEMORY.md file is the primary entry point. It's loaded into the agent's context at the start of every conversation. This file should be kept concise - it's a summary of the most important persistent context, with links to more detailed topic files when needed.
The topic files hold deeper information. If the agent needs details about a specific project, it reads the relevant file. This keeps the main memory file lean while still giving the agent access to detailed context when it needs it.
There's a practical limit to how much memory you can load into context. If MEMORY.md grows too large, it eats into the context window that should be used for the actual conversation. We generally keep the main file under 200 lines and offload details to topic-specific files that get loaded on demand.
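In host application code, this load-on-demand pattern takes only a few lines. The sketch below is illustrative rather than the SDK's actual API - the `MEMORY_DIR` location and both function names are assumptions:

```python
from pathlib import Path

MEMORY_DIR = Path("memory")  # illustrative location, not an SDK default

def load_main_memory() -> str:
    """Read MEMORY.md so it can be prepended to the system prompt."""
    main = MEMORY_DIR / "MEMORY.md"
    return main.read_text() if main.exists() else ""

def load_topic(name: str) -> str:
    """Load a topic file (e.g. 'project-alpha') only when the agent asks."""
    path = MEMORY_DIR / f"{name}.md"
    if not path.exists():
        return f"(no memory file for topic '{name}')"
    return path.read_text()
```

The main file is always in context; topic files cost context window only when the conversation actually touches them.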
Designing Good Memory Structures
The biggest mistake people make with agent memory is treating it like a log - writing everything chronologically: "On March 5, user said X. On March 6, we discussed Y." This gets unwieldy fast and makes retrieval unreliable.
Better approach: organise memory by topic, not by time. Have a section for user preferences, a section for project context, a section for technical decisions. Update entries in place rather than appending new ones. If the user changes their preference from weekly to daily reports, update the preference - don't add a new line saying "user now prefers daily."
Here's what a well-structured memory file looks like:
```markdown
# Agent Memory

## User Context
- Role: Head of Data Engineering
- Organisation: Mid-tier financial services, ~500 employees
- Prefers concise, technical responses
- Uses Azure stack (Fabric, Power BI, Azure AI Foundry)

## Current Projects
- Data platform migration from on-prem SQL to Fabric (started Feb 2026)
- See project-migration.md for details

## Key Decisions
- Chose Fabric over Databricks for cost reasons
- Standardised on Python for data pipelines, not Scala
```
This gives the agent immediate context without wading through conversation transcripts.
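The update-in-place rule is easy to mechanise against this layout. A minimal sketch, assuming the heading-plus-bullet convention shown above (the function name and the `- key: value` format are illustrative choices, not SDK behaviour):

```python
def update_entry(text: str, section: str, key: str, value: str) -> str:
    """Replace the '- key: ...' bullet under '## section', or append it.

    Updating in place keeps one authoritative line per fact instead of
    an ever-growing chronological log.
    """
    out, in_section, done = [], False, False
    for line in text.splitlines():
        if line.startswith("## "):
            # Leaving the target section without a match: append the entry.
            if in_section and not done:
                out.append(f"- {key}: {value}")
                done = True
            in_section = line[3:].strip() == section
        elif in_section and not done and line.strip().startswith(f"- {key}:"):
            out.append(f"- {key}: {value}")  # overwrite the stale value
            done = True
            continue
        out.append(line)
    if in_section and not done:  # section was last in the file
        out.append(f"- {key}: {value}")
    return "\n".join(out)
```

If the user switches from weekly to daily reports, the old line is replaced rather than contradicted further down the file.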
Patterns That Work Well
Progressive learning. The agent starts with no memory and builds it up naturally through conversations. Early conversations require more explanation from the user. Over time, the agent needs less context because it already knows the important stuff. This feels natural - it mirrors how human working relationships develop.
Explicit memory management. Give users the ability to tell the agent what to remember and what to forget. "Remember that we decided to use incremental refresh for the sales dataflow" or "Forget the details about the old pricing model, we've moved on from that." This gives users control and builds trust.
Memory review. Periodically, the agent should summarise what it remembers and ask whether it's still accurate. People's situations change. Projects end. Preferences evolve. Stale memory is worse than no memory because it leads to confidently wrong responses.
Scoped memory. Different memory stores for different contexts. An agent that helps with both Power BI development and Azure infrastructure shouldn't dump everything into one memory file. Keep them separate so the agent loads relevant context for the current topic.
Things That Go Wrong
Memory that's too detailed becomes noise. If the agent remembers every minor preference and every passing comment, the memory file bloats and the agent spends context window on irrelevant details. Be selective about what gets persisted.
Memory that's never pruned gets stale. We had an agent that kept referencing a project that had been completed three months ago. The user got frustrated because the agent kept asking follow-up questions about something that was done. Regular cleanup matters.
Memory without structure becomes unsearchable. If the agent writes free-form notes without any consistent format, it can't efficiently find what it needs. A simple heading-based structure with consistent categories makes retrieval much more reliable.
Sensitive information in memory needs careful handling. If the agent remembers client names, financial figures, or personal details, that memory file needs the same security treatment as any other data store containing sensitive information. Consider what should be remembered versus what should be transient.
Building This Into Your Agents
If you're building agents with the Claude Agent SDK, the Memory Tool is available as part of the toolkit. The setup involves configuring a memory directory, defining what the initial memory structure looks like, and giving the agent instructions about what's worth remembering.
The instructions matter more than you might think. Without clear guidance, the agent either remembers too much (cluttering memory with trivial details) or too little (missing important context). We typically include specific instructions like "remember user preferences that affect how you respond" and "remember technical decisions and their rationale" while explicitly saying "don't remember one-off questions or transient details."
For production deployments, consider where the memory files live. Local filesystem works for single-user agents. Multi-user deployments need a storage backend with concurrent access and proper access control - Azure Blob Storage or a database-backed store both work well.
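One way to keep that swap cheap is to hide the backend behind a small interface from day one. A sketch using Python's `typing.Protocol` - `LocalStore` is illustrative, and a multi-user deployment would substitute a class with the same two methods backed by Azure Blob Storage or a database:

```python
from pathlib import Path
from typing import Protocol

class MemoryStore(Protocol):
    """Minimal interface a memory backend needs to expose to the agent."""
    def read(self, path: str) -> str: ...
    def write(self, path: str, content: str) -> None: ...

class LocalStore:
    """Filesystem backend - fine for single-user agents."""
    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def read(self, path: str) -> str:
        return (self.root / path).read_text()

    def write(self, path: str, content: str) -> None:
        target = self.root / path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content)
```

Per-user subdirectories (e.g. `users/alice/MEMORY.md`) give you a simple access-control boundary before you reach for anything heavier.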
Memory and Multi-Agent Systems
Memory gets really interesting when you have multiple agents that share context. An intake agent that gathers project requirements can write to a memory store that a development agent reads from. A research agent can build up knowledge about a domain that a reporting agent uses when generating summaries.
This shared memory pattern is how you build agent systems that feel cohesive rather than disconnected. Each agent doesn't need to know everything, but they can read from common memory stores that give them the context they need.
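Mechanically, shared memory can be as simple as a common store that several agents read and write. A sketch with hypothetical names (`SHARED`, `requirements.md`) showing the intake-to-development handoff:

```python
from pathlib import Path

SHARED = Path("shared-memory")  # hypothetical store both agents mount

def intake_agent_writes(requirements: str) -> None:
    # The intake agent persists distilled requirements, not the transcript.
    SHARED.mkdir(exist_ok=True)
    (SHARED / "requirements.md").write_text(requirements)

def dev_agent_context() -> str:
    # The development agent loads whatever shared context exists.
    return "\n\n".join(f.read_text() for f in sorted(SHARED.glob("*.md")))
```

Concurrent writes need the same care as any shared file store - in practice that means last-write-wins files per topic, or a backend with real locking.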
We've been working with this pattern in our agentic automation work and it's one of the things that separates useful agent systems from toy demonstrations.
Where This Is Heading
Memory is one of the capabilities that separates a chatbot from an actual AI assistant. A chatbot answers questions. An assistant knows you, knows your context, and adapts its behaviour based on accumulated experience.
We're still early in figuring out the right patterns for agent memory. The Claude Agent SDK's file-based approach is simple and practical - the right move at this stage. Richer backends like vector databases and knowledge graphs can be layered in as the patterns mature.
If you're building AI agents for your organisation and want them to go beyond single-session interactions, memory is one of the first capabilities to implement. Our AI agent development team has been building Claude-based agents with persistent memory for clients across Australia, and we're happy to share what we've learned. Get in touch if you want to talk through how memory fits into your agent architecture.
For the technical documentation on the Memory Tool, see the Claude Agent SDK documentation.