OpenAI Web Search Tool for AI Agents - A Practical Guide
One of the biggest limitations of LLMs has always been their knowledge cutoff. You ask a question about something that happened last week, and the model either hallucinates an answer or tells you it doesn't know. OpenAI's web search tool changes this by giving your AI agents the ability to search the internet, process results, and return answers with cited sources.
We've been building this into client projects where agents need access to current information - market data, regulatory updates, product pricing, news monitoring. The results are genuinely useful when configured properly, and genuinely frustrating when they're not. Here's what we've learned.
Three Modes of Web Search
OpenAI offers three distinct approaches to web search, and choosing the right one for your use case matters more than you might think. They're documented in OpenAI's web search guide, but the practical differences are worth spelling out.
Non-Reasoning Search
This is the fast option. The model sends your query directly to the search tool, gets results back, and returns a response. There's no planning step. The model doesn't decide whether it needs to search multiple times or refine its query. It just searches and reports.
Use this for quick lookups where the answer is likely on the first page of results. "What's the current AUD to USD exchange rate?" or "When is the next Azure region launching in Australia?" These are straightforward factual queries where speed matters more than depth.
The latency is low - typically a few seconds on top of the normal model response time. For user-facing applications where people are waiting for answers, this is usually the right choice.
Agentic Search with Reasoning Models
This is where things get interesting. When you use web search with a reasoning model like GPT-5, the model actively manages the search process. It can search, analyse results, decide the answer is incomplete, search again with different terms, open specific pages for more detail, and synthesise information across multiple sources.
The trade-off is time. Agentic search can take 10-30 seconds depending on the complexity of the query and the reasoning level you set. For a conversational chatbot, that's too slow. For a background research agent that assembles market intelligence reports, it's perfect.
You can control the depth by adjusting the reasoning level on GPT-5. Higher reasoning means more thorough searches but longer wait times. Lower reasoning means faster responses but less depth. We've found that the default reasoning level works well for most use cases - turning it up to maximum usually adds latency without proportionally improving result quality.
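As a sketch of what adjusting the reasoning level looks like in practice, the helper below builds the request arguments for an agentic search. The exact parameter shape (a reasoning object with an effort field) is our assumption based on the Responses API's reasoning controls; check the current API reference before relying on it.

```python
def build_agentic_search_request(prompt: str, effort: str = "medium") -> dict:
    """Request kwargs for agentic web search on a reasoning model.

    The reasoning effort field name and its values ("low"/"medium"/"high")
    are assumptions here; verify against the current Responses API docs.
    """
    return {
        "model": "gpt-5",
        "reasoning": {"effort": effort},  # higher effort = deeper, slower search
        "tools": [{"type": "web_search"}],
        "input": prompt,
    }

# Pass these kwargs to client.responses.create(**...) with the official
# openai SDK. We'd default to "medium" and only raise it when result
# quality demonstrably improves, per the latency trade-off above.
```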
Deep Research
This is the heavy-duty option. Deep research models like o3-deep-research and o4-mini-deep-research can run for several minutes, consult hundreds of sources, and produce detailed analysis. OpenAI recommends using this with background mode, and for good reason - nobody wants to stare at a loading spinner for five minutes.
We use deep research for things like competitive analysis, regulatory change assessments, and market entry research. Tasks where thoroughness matters more than speed and where a human would normally spend hours reading and synthesising sources.
It's overkill for simple queries. If someone asks "What's the weather in Melbourne?", don't route that through deep research. But if someone asks "What are the compliance implications of the new AI regulations for Australian financial services firms?", deep research earns its keep.
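A background deep research request might be assembled like this. The background flag and the model names follow what's described above; treat the exact field names as assumptions to verify against OpenAI's documentation, since deep research has its own tool requirements.

```python
def build_deep_research_request(prompt: str) -> dict:
    """Kwargs for a deep research run in background mode (sketch).

    The "background" flag is an assumption based on OpenAI's background
    mode; with it set, you poll or retrieve the response later rather
    than holding a connection open for several minutes.
    """
    return {
        "model": "o4-mini-deep-research",
        "background": True,  # don't block; fetch the result when it's done
        "tools": [{"type": "web_search"}],
        "input": prompt,
    }

# Usage (requires the openai SDK and an API key):
# response = client.responses.create(**build_deep_research_request(
#     "Compliance implications of new AI regulations for AU financial services"))
```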
Implementation Basics
Web search is configured as a tool in the Responses API. You add it to the tools array in your API request, and the model decides whether to search based on the input prompt.
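A minimal request looks something like the following, assuming the official openai Python SDK and the tool type name used in OpenAI's web search guide. We build the kwargs separately so the shape is easy to inspect.

```python
def build_search_request(prompt: str) -> dict:
    """Build kwargs for a basic web search request (sketch)."""
    return {
        "model": "gpt-4.1",
        "tools": [{"type": "web_search"}],  # model decides whether to search
        "input": prompt,
    }

# With the openai SDK installed and OPENAI_API_KEY set:
# from openai import OpenAI
# client = OpenAI()
# response = client.responses.create(
#     **build_search_request("What's the current AUD to USD exchange rate?"))
# print(response.output_text)
```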
The response includes two parts:
- A web_search_call output item showing what the model did - whether it searched, opened a page, or searched within a page.
- A message output item with the text response and URL citation annotations.
The citations are the most useful part for production applications. Each citation includes the source URL, title, and position in the text. OpenAI requires that you display these citations clearly and make them clickable in your UI - this isn't just good practice, it's part of the terms of use.
One thing to note about citations: they're inline in the response text. The annotations array gives you the exact character positions where citations appear. If you're building a frontend that displays these responses, you need to parse the annotations and render them as links. It's not difficult, but it's not automatic either.
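Rendering those annotations is mostly string slicing. The sketch below assumes annotation entries carry start_index, end_index, and url fields (the shape we've seen in url_citation annotations) and emits markdown links; adapt the field names to the actual payload you receive.

```python
def render_citations(text: str, annotations: list[dict]) -> str:
    """Turn inline url_citation annotations into markdown links.

    Assumes each annotation has start_index/end_index character offsets
    into `text` plus a url - verify against the real response payload.
    """
    out = []
    cursor = 0
    # Process annotations left to right so offsets stay valid.
    for ann in sorted(annotations, key=lambda a: a["start_index"]):
        if ann.get("type") != "url_citation":
            continue
        start, end = ann["start_index"], ann["end_index"]
        out.append(text[cursor:start])
        out.append(f"[{text[start:end]}]({ann['url']})")
        cursor = end
    out.append(text[cursor:])
    return "".join(out)
```

The same loop works for HTML output; only the link template changes.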
Domain Filtering
This feature doesn't get enough attention. You can restrict web searches to specific domains using the filters parameter - up to 100 allowed domains or 100 blocked domains.
For enterprise applications, this is extremely valuable. If you're building an agent that answers questions about Australian tax law, you probably want to restrict searches to ato.gov.au, legislation.gov.au, and a handful of trusted legal resources. You don't want your tax advice agent citing a random blog post.
Similarly, if you're building a product research agent, you might block competitor domains to avoid surfacing their marketing material.
The domain filter works with subdomains automatically. Allowing microsoft.com also allows learn.microsoft.com and devblogs.microsoft.com. Format domains without the protocol prefix - use openai.com, not https://openai.com/.
This is only available in the Responses API with the web_search tool. If you're using Chat Completions with the search-specific models, domain filtering isn't available.
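As a sketch, a restricted tool configuration for the tax law example might look like this. The filters/allowed_domains field names follow OpenAI's web search guide, but treat them as assumptions to check against the current API reference.

```python
def build_filtered_search_tool(allowed: list[str]) -> dict:
    """web_search tool config restricted to trusted domains (sketch).

    Field names (filters, allowed_domains) are taken from OpenAI's web
    search guide; the 100-domain cap matches the documented limit.
    """
    if len(allowed) > 100:
        raise ValueError("at most 100 allowed domains are supported")
    return {
        "type": "web_search",
        "filters": {"allowed_domains": allowed},  # no protocol prefix
    }

tax_tool = build_filtered_search_tool(["ato.gov.au", "legislation.gov.au"])
```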
User Location for Geo-Relevant Results
You can pass approximate user location to refine search results geographically. This uses country codes, city names, regions, and timezones.
For Australian applications, setting country: "AU" makes a noticeable difference in result quality. Searches for "business grants" return Australian government programs instead of US SBA loans. Searches for "data privacy regulations" return information about the Privacy Act rather than GDPR or CCPA.
You can get more specific with city and region:
{
  "type": "web_search",
  "user_location": {
    "country": "AU",
    "city": "Sydney",
    "region": "New South Wales",
    "timezone": "Australia/Sydney"
  }
}
Worth noting that user location isn't supported for deep research models. If you need geo-relevant deep research, include the location context in your prompt instead.
Sources vs Citations
There's a distinction worth understanding. Citations are the URLs that appear inline in the response text - the specific references the model used to support its statements. Sources are the complete list of URLs the model consulted while forming its response, available through the sources field.
Sources typically outnumber citations significantly. The model might consult 15 pages but only cite 4 of them. For transparency and debugging, the sources list is valuable. If a user questions an answer, you can show them every page the model looked at, not just the ones it cited.
The sources field also surfaces real-time third-party data feeds labelled as oai-sports, oai-weather, or oai-finance. If your agent is asked about current weather or stock prices, these feeds provide structured data rather than scraped web pages.
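For debugging, it helps to reconcile the two lists. The helper below compares cited URLs against the full sources list and pulls out the oai-* data feeds; the input shapes are simplified assumptions, not the literal API payload.

```python
def summarise_provenance(cited_urls: list[str], sources: list[str]) -> dict:
    """Compare inline citations against the full sources list (sketch).

    Assumes plain URL strings; the real sources field may carry richer
    objects. The oai-* prefix check matches the feed labels above.
    """
    feeds = [s for s in sources if s.startswith("oai-")]
    uncited = [s for s in sources if s not in cited_urls and s not in feeds]
    return {
        "cited": len(cited_urls),       # pages the model actually referenced
        "consulted_only": len(uncited), # pages read but not cited
        "data_feeds": feeds,            # structured real-time feeds
    }
```

Showing a user the consulted-only list alongside the citations is a cheap way to build trust when an answer is questioned.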
What Doesn't Work Yet
Being honest about the limitations:
Context window cap. Web search is limited to a 128K context window regardless of the model's native capacity. If you're using GPT-4.1 or GPT-4.1-mini, which support larger contexts, web search still caps at 128K. For most use cases this doesn't matter, but for applications that combine web search with large document analysis, it's a constraint worth knowing about.
No web search on GPT-5 minimal reasoning. If you're using GPT-5 with minimal reasoning to save costs, web search isn't available. You need at least the default reasoning level.
No GPT-4.1-nano support. The smallest model in the GPT-4.1 family doesn't support web search at all.
Rate limits are shared. Web search uses the same rate limits as your model tier. If you're already close to your rate limit with regular API calls, adding web search doesn't give you additional capacity.
Choosing the Right Mode for Your Use Case
After building several agents with web search, here's our rough decision framework:
- Quick factual lookups: Non-reasoning search. Fast, cheap, good enough for straightforward questions.
- Research tasks requiring synthesis: Agentic search with GPT-5 at default reasoning. Good balance of thoroughness and speed.
- Deep analysis or comprehensive reports: Deep research with background mode. Expensive and slow, but thorough.
- User-facing chat with current information: Non-reasoning search with domain filtering. Keeps responses fast while ensuring quality sources.
For most enterprise applications, we end up using a combination. Simple questions route to non-reasoning search. Complex questions route to agentic search. Research tasks run as background jobs with deep research. The routing logic doesn't need to be sophisticated - keyword matching and query length usually suffice.
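The keyword-and-length routing described above can be as simple as this sketch. The keyword set and thresholds are illustrative placeholders, not tuned values from our projects.

```python
# Illustrative research-flavoured keywords; tune for your own traffic.
RESEARCH_KEYWORDS = {"compare", "analysis", "implications", "report", "assess"}

def route_query(query: str) -> str:
    """Crude router: keyword matching plus query length, nothing fancier.

    Returns one of "non_reasoning_search", "agentic_search",
    "deep_research" - thresholds here are assumptions for the sketch.
    """
    words = query.lower().split()
    has_research_term = any(w.strip("?,.") in RESEARCH_KEYWORDS for w in words)
    if has_research_term:
        # Long research-flavoured queries go to the background pipeline.
        return "deep_research" if len(words) > 25 else "agentic_search"
    return "agentic_search" if len(words) > 15 else "non_reasoning_search"
```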
Where This Fits in Broader Agent Architecture
Web search is one tool in a broader toolkit. For our AI agent development projects, we typically combine web search with retrieval from internal knowledge bases, structured data queries, and domain-specific APIs. The agent decides which tool to use based on the query.
The pattern that works well: check internal knowledge first, fall back to web search for questions the internal knowledge can't answer. This keeps responses fast for known topics and accurate for current information.
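That fallback pattern reduces to a few lines. Here the lookup and search steps are caller-supplied callables, and the None-means-miss convention is an assumption made for the sketch.

```python
from typing import Callable, Optional

def answer(
    query: str,
    internal_lookup: Callable[[str], Optional[str]],
    web_search: Callable[[str], str],
) -> str:
    """Check internal knowledge first, fall back to web search (sketch).

    internal_lookup returns None on a miss - that convention is an
    assumption here; wire in your actual retrieval and search clients.
    """
    hit = internal_lookup(query)
    if hit is not None:
        return hit  # fast path: known topic, no network round-trip
    return web_search(query)  # current information the KB can't answer
```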
If you're building AI agents that need access to current information and want help designing the architecture, our AI consulting team can help. We work across OpenAI, Azure OpenAI, and other model providers, so we can recommend the right approach for your specific requirements rather than defaulting to one vendor's solution.
For organisations looking at broader AI strategy - figuring out where AI agents make sense, what they should have access to, and how to govern their behaviour - that's a conversation worth having before you start building.