How Much Does a RAG System Cost to Build
RAG - Retrieval-Augmented Generation - is the most common AI architecture we build for Australian businesses. It's the pattern behind every "chat with your documents" system, every AI-powered knowledge base, and most enterprise chatbots. The question we get asked most often is how much it costs.
A basic RAG system costs $25,000-$50,000 AUD. A production-grade RAG system with multiple data sources and good accuracy costs $50,000-$120,000. An enterprise RAG system with advanced retrieval, high accuracy requirements, and compliance features costs $120,000-$250,000+.
Here's what those numbers actually mean and what drives the cost.
What a RAG System Actually Does
Before we talk cost, let's make sure we're on the same page about what RAG is. A RAG system has three core components:
- Ingestion - Your documents (PDFs, Word files, web pages, databases) are processed, chunked, and stored in a search index with vector embeddings
- Retrieval - When a user asks a question, the system finds the most relevant chunks of information from your documents
- Generation - A language model (like GPT-4o) reads the retrieved information and generates an answer
The quality of a RAG system is determined by how well each of these components works. Poor chunking means the right information isn't found. Poor retrieval means the wrong information is found. Poor generation means the answer is inaccurate even when the right information is available.
Most of the engineering effort (and cost) goes into getting retrieval right. The language model is actually the easy part.
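As a rough illustration of how the three components fit together, here is a minimal Python sketch. It is not production code: `embed` is a toy bag-of-words stand-in for a real embedding model, the generation step stops at building the prompt (no language model is called), and every function name and sample document is made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ingest(documents: dict[str, str]) -> list[dict]:
    """Ingestion: split each document into paragraph chunks and index them."""
    index = []
    for doc_id, text in documents.items():
        paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
        for n, chunk in enumerate(paragraphs):
            index.append({"doc_id": doc_id, "chunk_no": n,
                          "text": chunk, "vector": embed(chunk)})
    return index


def retrieve(index: list[dict], question: str, k: int = 2) -> list[dict]:
    """Retrieval: return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(index, key=lambda c: cosine(q, c["vector"]), reverse=True)[:k]


def build_prompt(question: str, chunks: list[dict]) -> str:
    """Generation (prompt only): this is what would be sent to the language model."""
    context = "\n".join(f"[{c['doc_id']}#{c['chunk_no']}] {c['text']}" for c in chunks)
    return f"Answer using only the sources below.\n\n{context}\n\nQuestion: {question}"


# Hypothetical sample documents.
docs = {
    "leave-policy": "Annual leave is 20 days per year.\n\nLeave requests need manager approval.",
    "expenses": "Expense claims are due within 30 days of purchase.",
}
question = "How many days of annual leave do I get?"
index = ingest(docs)
top = retrieve(index, question, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Even in a toy like this, the answer can only be as good as the chunk that `retrieve` hands to the prompt, which is why retrieval gets most of the engineering attention.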
Cost Breakdown by Complexity
Basic RAG System ($25,000-$50,000)
A straightforward system that lets users ask questions about a defined set of documents.
Typical use case: An internal knowledge base where employees can ask questions about company policies, procedures, technical documentation, or product information.
What you get:
- Document ingestion pipeline for common file types (PDF, Word, HTML)
- Vector search index (typically Azure AI Search)
- Chat interface (web-based or Teams integration)
- Basic conversation memory (remembers context within a session)
- Simple admin interface for managing documents
- Deployment on Azure
Architecture:
- Azure AI Search (Standard S1) for the search index
- Azure OpenAI for embeddings and generation
- Azure App Service or Azure Functions for the application layer
- Azure Blob Storage for document storage
What drives the cost: Mainly the document processing pipeline. Clean, well-formatted documents (modern PDFs, Word files) are straightforward. Scanned documents, handwritten notes, complex tables, and multi-column layouts require more sophisticated processing.
Timeline: 3-6 weeks
Ongoing costs: $700-$2,000/month (Azure infrastructure + API costs)
Production-Grade RAG System ($50,000-$120,000)
A system built for real business use with higher accuracy requirements, multiple data sources, and proper production infrastructure.
Typical use case: A customer-facing support system that answers questions accurately across product documentation, FAQs, and knowledge articles. Or an internal system that helps analysts find information across thousands of reports and documents.
What you get:
- Everything from the basic tier
- Advanced document processing (tables, images, complex layouts)
- Multiple data source connectors (SharePoint, databases, APIs, file shares)
- Hybrid search (combining keyword search with vector search for better accuracy)
- Re-ranking for improved result relevance
- Citation and source tracking (answers link back to source documents)
- User feedback loop for continuous improvement
- Access controls (users only see answers from documents they're authorised to view)
- Monitoring and analytics dashboard
- Production-grade error handling and logging
Architecture:
- Azure AI Search with semantic ranking
- Azure OpenAI for embeddings, generation, and re-ranking
- Azure AI Document Intelligence for complex document processing
- Application layer with authentication and authorisation
- Monitoring with Application Insights
What drives the cost: The number and complexity of data sources, the accuracy requirements, and the security model. Access controls in RAG systems are particularly important and often underestimated - if your documents have different access levels, the RAG system needs to respect those permissions at query time.
Timeline: 6-10 weeks
Ongoing costs: $2,000-$6,000/month
Enterprise RAG System ($120,000-$250,000+)
A high-performance system built for scale, accuracy, and compliance in demanding environments.
Typical use case: A financial services firm that needs AI-powered search across regulatory documents, client files, and internal policies with strict accuracy, audit, and compliance requirements. Or a large organisation with millions of documents across dozens of systems.
What you get:
- Everything from the production-grade tier
- Advanced chunking strategies optimised for your specific document types
- Multi-index architecture for different document collections
- Query routing (different retrieval strategies for different question types)
- Advanced re-ranking with cross-encoder models
- Evaluation framework for measuring and tracking accuracy over time
- A/B testing capability for comparing retrieval strategies
- Comprehensive audit logging for compliance
- Data residency controls (Australian regions)
- High availability and disaster recovery
- Fine-tuned embedding models for domain-specific vocabulary
- Automated document refresh and index maintenance
Architecture:
- Multi-tier Azure AI Search deployment
- Azure OpenAI with provisioned throughput for predictable performance
- Custom document processing pipeline
- Evaluation and monitoring infrastructure
- CI/CD pipeline for model and prompt updates
What drives the cost: Scale (millions of documents), accuracy requirements (99%+ for regulated industries), compliance needs, and the sophistication of the retrieval strategy. Enterprise RAG systems often need custom chunking, domain-specific embeddings, and multi-stage retrieval pipelines.
Timeline: 10-20 weeks
Ongoing costs: $5,000-$20,000/month
The Technical Decisions That Drive Cost
Chunking Strategy
How you split documents into chunks for indexing has an enormous impact on retrieval quality. There's no universal best approach - it depends on your document types.
Simple chunking (split by paragraph or fixed character count) is cheap to implement but produces mediocre results. Budget $2,000-$5,000.
Smart chunking (respecting document structure, handling tables and lists, preserving context) produces much better results but takes more engineering effort. Budget $10,000-$25,000.
Custom chunking (different strategies for different document types, handling complex layouts, extracting metadata) delivers the best results for complex document collections. Budget $20,000-$40,000.
In our experience, chunking strategy is the single highest-impact decision in a RAG system. We've seen accuracy improve by 20-30% from chunking improvements alone - more than any other single change.
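To make the difference between simple and structure-aware chunking concrete, here is a small Python sketch. It assumes, purely for illustration, that section headings are lines starting with `# `; real documents need format-specific parsing, and both function names are hypothetical.

```python
def fixed_size_chunks(text: str, size: int = 80) -> list[str]:
    """Simple chunking: cut every `size` characters, ignoring structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def heading_aware_chunks(text: str) -> list[dict]:
    """Structure-aware chunking: one chunk per section, heading kept as context.

    Assumption for this sketch: a heading is any line starting with '# '.
    """
    chunks, heading, body = [], "", []
    for line in text.splitlines():
        if line.startswith("# "):
            if body:  # close the previous section before starting a new one
                chunks.append({"heading": heading, "text": " ".join(body)})
            heading, body = line[2:].strip(), []
        elif line.strip():
            body.append(line.strip())
    if body:  # flush the final section
        chunks.append({"heading": heading, "text": " ".join(body)})
    return chunks


# Hypothetical policy document with two near-identical sections.
policy = (
    "# Annual leave\n"
    "Full-time staff accrue 20 days of annual leave per year.\n"
    "# Sick leave\n"
    "Full-time staff accrue 10 days of sick leave per year."
)

simple = fixed_size_chunks(policy, size=80)
smart = heading_aware_chunks(policy)

for chunk in simple:
    print(repr(chunk))
for chunk in smart:
    print(chunk)
```

The fixed-size version cuts wherever the character count lands, so a chunk can end mid-sentence or mix two sections. The heading-aware version keeps each section whole and carries its heading along, which gives the retriever more context to match against.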
Search Strategy
Keyword search alone is fast and cheap but misses semantically similar content. If someone asks about "staff attrition" but your documents say "employee turnover," keyword search won't find it.
Vector search alone captures semantic similarity but can miss exact terms and is more expensive (it requires embeddings for all content).
Hybrid search (keyword + vector) combines both approaches and consistently outperforms either alone. This is what we recommend for production systems.
Hybrid + re-ranking adds a second-stage model that re-scores the initial results for relevance. This adds cost (both build cost and per-query inference cost) but meaningfully improves accuracy for complex queries.
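One widely used way to merge a keyword result list with a vector result list is reciprocal rank fusion (RRF). The Python sketch below shows the idea under simple assumptions: the document IDs are invented, the constant `k = 60` is a commonly quoted default rather than a tuned value, and this is an illustration of the technique, not a description of any particular search service's internals.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked well by more than one method rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)


# Hypothetical results for the same query from two retrieval methods.
keyword_hits = ["doc_a", "doc_b", "doc_c"]  # exact-term matches
vector_hits = ["doc_d", "doc_b", "doc_a"]   # semantically similar matches

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that appear in both lists (`doc_a` and `doc_b` here) end up ahead of documents that only one method found. A re-ranking stage would then re-score the top of this fused list with a second model.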
Document Processing Pipeline
The quality and complexity of your documents have a direct impact on cost.
Clean digital documents (Word files, well-formatted PDFs, HTML pages) are straightforward to process. The text is extractable, the structure is clear, and standard tools handle them well.
Scanned documents need OCR, which adds a processing step and introduces potential errors. Azure AI Document Intelligence handles this well, but you need to validate the output.
Complex documents (engineering drawings, financial statements with complex tables, legal contracts with nested clauses, documents with mixed languages) require sophisticated processing that may include custom models or specialised extraction logic.
Budget 25-40% of your total project cost for the document processing pipeline. This is where most of the quality comes from.
Common Mistakes That Increase Cost
1. Not Testing Retrieval Separately From Generation
When a RAG system gives a wrong answer, the problem is usually retrieval (the system didn't find the right information), not generation (the language model made something up). If you only test end-to-end, you can't tell which component is failing and you waste time tuning the wrong thing.
Build retrieval evaluation into your project from the start. It costs $5,000-$10,000 to set up properly but saves much more in debugging time.
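A retrieval evaluation can start as a small labelled test set and two standard metrics. The Python sketch below computes recall@k and mean reciprocal rank (MRR); the questions, chunk IDs, and "retrieved" lists are invented for illustration, and in practice the retrieved lists would come from your own retriever.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant chunks that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant chunk, or 0.0 if none was retrieved."""
    for rank, chunk_id in enumerate(retrieved, start=1):
        if chunk_id in relevant:
            return 1.0 / rank
    return 0.0


# Hypothetical test set: each case pairs a question with the chunk(s) a human
# marked as containing the answer, plus what the retriever returned.
test_set = [
    {"question": "How much annual leave do I get?",
     "relevant": {"leave#0"},
     "retrieved": ["leave#0", "expenses#2", "leave#1"]},
    {"question": "When are expense claims due?",
     "relevant": {"expenses#1"},
     "retrieved": ["leave#3", "expenses#1", "expenses#0"]},
    {"question": "Can I book business class?",
     "relevant": {"travel#4"},
     "retrieved": ["leave#0", "expenses#2", "leave#1"]},
]

mean_recall = sum(recall_at_k(c["retrieved"], c["relevant"], k=3)
                  for c in test_set) / len(test_set)
mrr = sum(reciprocal_rank(c["retrieved"], c["relevant"])
          for c in test_set) / len(test_set)

print(f"recall@3 = {mean_recall:.2f}, MRR = {mrr:.2f}")
```

Because these numbers never touch the language model, a drop in them points at chunking or search rather than at prompts, which is exactly the separation end-to-end testing cannot give you.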
2. Ignoring Document Quality
Garbage in, garbage out. If your source documents are outdated, contradictory, or incomplete, your RAG system will give outdated, contradictory, or incomplete answers. Sometimes the best investment is cleaning up your document library before building the AI system on top of it.
3. Over-Engineering From Day One
Don't build an enterprise RAG system if a production-grade one will do. Start simpler, measure accuracy, and add complexity only where the data shows it's needed. We've seen businesses spend $200,000 on advanced features that delivered marginal improvement over a well-built $80,000 system.
4. Skipping Access Controls
This one catches people out. If your RAG system can access all documents but some users shouldn't see certain content, you need document-level access controls in the retrieval layer. Retrofitting this is expensive and risky. Design it in from the start if you have any access control requirements.
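The core idea of access control in the retrieval layer can be sketched in a few lines of Python. This is a simplified illustration: the `Chunk` class, group names, and scores are hypothetical, and in a real search index the same condition would normally be applied as a filter inside the search query so that restricted chunks are never returned at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    text: str
    score: float                 # relevance score from the search step
    allowed_groups: frozenset    # groups permitted to see the source document


def trim_by_access(candidates: list[Chunk], user_groups: set[str], k: int = 3) -> list[Chunk]:
    """Keep only chunks the user may see, then return the k most relevant."""
    visible = [c for c in candidates if c.allowed_groups & user_groups]
    return sorted(visible, key=lambda c: c.score, reverse=True)[:k]


# Hypothetical search results before any permission check.
candidates = [
    Chunk("handbook#1", "Leave is approved by your manager.", 0.72, frozenset({"staff"})),
    Chunk("payroll#4", "Executive salary bands for this year.", 0.91, frozenset({"hr"})),
    Chunk("handbook#7", "Payroll runs on the 15th.", 0.65, frozenset({"staff", "hr"})),
]

staff_view = trim_by_access(candidates, {"staff"})
hr_view = trim_by_access(candidates, {"hr"})

print([c.chunk_id for c in staff_view])
print([c.chunk_id for c in hr_view])
```

The important property is that filtering happens before anything reaches the language model. If a restricted chunk makes it into the prompt, no instruction in the prompt can reliably stop it leaking into the answer.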
Ongoing Costs After Launch
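RAG systems have higher ongoing costs than most people expect because they're actively serving queries and maintaining indexes. The totals below include monitoring, maintenance, and re-indexing as well as infrastructure and API costs, so they run higher than the per-tier ongoing figures quoted earlier.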
| Cost Category | Basic | Production | Enterprise |
|---|---|---|---|
| Azure AI Search | $110-$370/mo | $370-$1,500/mo | $1,500-$6,000/mo |
| Azure OpenAI (embeddings + generation) | $200-$800/mo | $800-$3,000/mo | $3,000-$12,000/mo |
| Hosting and compute | $100-$300/mo | $300-$1,000/mo | $1,000-$4,000/mo |
| Monitoring and maintenance | $500-$1,000/mo | $1,500-$3,000/mo | $3,000-$8,000/mo |
| Document refresh and re-indexing | As needed | $500-$1,500/mo | $1,000-$3,000/mo |
| Total | $910-$2,470/mo | $3,470-$10,000/mo | $9,500-$33,000/mo |
The maintenance line item is important. RAG systems need ongoing attention: new documents need to be ingested, accuracy needs to be monitored, prompts need to be updated as requirements change, and the underlying models periodically get updated.
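If you want to adapt the table above to your own estimates, the arithmetic is easy to script. The Python sketch below copies the line items from the table, reproduces the Total row, and annualises it. The Basic tier's "as needed" re-indexing is left out of its total, as it is in the table; the variable and function names are ours.

```python
# Monthly line items as (low, high) dollar ranges, copied from the table above.
# Basic-tier re-indexing is "as needed", so it is not included in the Basic total.
MONTHLY_COSTS = {
    "Basic": {
        "Azure AI Search": (110, 370),
        "Azure OpenAI": (200, 800),
        "Hosting and compute": (100, 300),
        "Monitoring and maintenance": (500, 1000),
    },
    "Production": {
        "Azure AI Search": (370, 1500),
        "Azure OpenAI": (800, 3000),
        "Hosting and compute": (300, 1000),
        "Monitoring and maintenance": (1500, 3000),
        "Document refresh and re-indexing": (500, 1500),
    },
    "Enterprise": {
        "Azure AI Search": (1500, 6000),
        "Azure OpenAI": (3000, 12000),
        "Hosting and compute": (1000, 4000),
        "Monitoring and maintenance": (3000, 8000),
        "Document refresh and re-indexing": (1000, 3000),
    },
}


def monthly_total(items: dict[str, tuple[int, int]]) -> tuple[int, int]:
    """Sum the low ends and the high ends of every line item."""
    low = sum(lo for lo, _ in items.values())
    high = sum(hi for _, hi in items.values())
    return low, high


totals = {tier: monthly_total(items) for tier, items in MONTHLY_COSTS.items()}
annual = {tier: (lo * 12, hi * 12) for tier, (lo, hi) in totals.items()}

for tier in totals:
    (m_lo, m_hi), (a_lo, a_hi) = totals[tier], annual[tier]
    print(f"{tier}: ${m_lo:,}-${m_hi:,}/mo, ${a_lo:,}-${a_hi:,}/yr")
```

Swapping in your own ranges for each line item gives a quick first-pass view of what a year of operation might cost alongside the build.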
How to Get Started
The best path is to start with a focused proof of concept. Pick your most valuable document collection and your most common questions. Build a basic RAG system, measure accuracy, and learn what works and what doesn't with your specific data.
At Team 400, we've built RAG systems for Australian businesses across financial services, legal, resources, and professional services. We build on Azure AI and we've learned what works (and what doesn't) across different document types, scales, and accuracy requirements.
If you're considering a RAG system, get in touch. We'll help you estimate costs based on your specific document types, volumes, and accuracy needs.
You can also explore our AI agent development services or learn about our AI consulting approach.