Plug Web Search into a LangChain Agent in 5 Minutes (with Per-Run Cost Math)
Every LangChain tutorial that adds web search to an agent does two things: it picks one provider (usually whoever sponsored the post) and it skips the cost math entirely. The result is a working agent that quietly burns $200 a month on search calls because someone copied the SerpApi tool from the docs.
This guide does it differently. We wire up a working LangChain agent with web search, then run the exact same conversation through four different SERP APIs and look at the bill. The provider you pick can change your monthly cost by 50×.
The 5-Minute Wiring
LangChain has a generic tool pattern that works for any HTTP API. Here is the minimum agent with a SERP tool:
```python
from langchain_openai import ChatOpenAI
from langchain.agents import create_react_agent, AgentExecutor
from langchain.tools import Tool
from langchain.prompts import PromptTemplate
import requests, os

SERPENT_API_KEY = os.environ["SERPENT_API_KEY"]

def web_search(query: str) -> str:
    """Search Google for the query and return the top results."""
    r = requests.get("https://apiserpent.com/api/search", params={
        "q": query, "engine": "google", "country": "us",
        "api_key": SERPENT_API_KEY,
    }, timeout=30)
    data = r.json()
    organic = data.get("organic_results", [])[:5]
    aio = (data.get("ai_overview") or {}).get("text") or ""
    summary = aio[:500] if aio else ""
    lines = [f"{i+1}. {o['title']} - {o['url']}" for i, o in enumerate(organic)]
    return summary + "\n\n" + "\n".join(lines)

search_tool = Tool(
    name="web_search",
    func=web_search,
    description="Search the web. Input: a search query string.",
)

llm = ChatOpenAI(model="gpt-5", temperature=0)

# create_react_agent requires {tools}, {tool_names}, and {agent_scratchpad}
# in the prompt; omitting them raises a ValueError at construction time.
prompt = PromptTemplate.from_template(
    "Answer the user's question using the web_search tool when needed.\n"
    "You have access to these tools:\n{tools}\n"
    "Tool names: {tool_names}\n"
    "Question: {input}\n\nThought: {agent_scratchpad}"
)

agent = create_react_agent(llm, [search_tool], prompt)
executor = AgentExecutor(agent=agent, tools=[search_tool], verbose=True)
print(executor.invoke({"input": "What is the latest version of Next.js?"}))
```
That is the entire agent. The Serpent API call is wrapped as a LangChain Tool, the agent decides when to call it, and the response feeds back into the LLM. Replace the tool function body with a different provider and the rest of the agent is unchanged.
The Four Provider Variations
Same agent. Four different tool implementations.
Tavily
```python
from tavily import TavilyClient

client = TavilyClient(api_key=os.environ["TAVILY_KEY"])

def web_search(query: str) -> str:
    res = client.search(query, max_results=5, search_depth="basic")
    return "\n".join(f"{r['title']} - {r['url']}: {r['content'][:200]}"
                     for r in res["results"])
```
Tavily returns pre-extracted content rather than just SERP links. Best for RAG-style agents.
Serper.dev
```python
def web_search(query: str) -> str:
    r = requests.post("https://google.serper.dev/search",
                      headers={"X-API-KEY": os.environ["SERPER_KEY"]},
                      json={"q": query, "gl": "us"}, timeout=30)
    organic = r.json().get("organic", [])[:5]
    return "\n".join(f"{o['title']} - {o['link']}: {o.get('snippet', '')}"
                     for o in organic)
```
SerpApi.com
```python
def web_search(query: str) -> str:
    r = requests.get("https://serpapi.com/search", params={
        "q": query, "engine": "google", "api_key": os.environ["SERPAPI_KEY"],
    }, timeout=30)
    organic = r.json().get("organic_results", [])[:5]
    return "\n".join(f"{o['title']} - {o['link']}: {o.get('snippet', '')}"
                     for o in organic)
```
Serpent API
Already shown above. Returns the AI Overview (AIO) text plus organic results in a single call.
The Cost Test
I ran the same five questions through the agent four times, swapping only the search tool. Each question caused the agent to call the search tool 1 to 3 times depending on how the LLM reasoned. Average: 2.1 search calls per agent run.
Then I extrapolated to typical workloads:
| Provider | Per-call cost | 1 search/run | 10 searches/run | 100 searches/run (research agent) |
|---|---|---|---|---|
| Serpent API (Scale) | $0.0003 | $0.0003 | $0.003 | $0.03 |
| Serper.dev | $0.001 | $0.001 | $0.010 | $0.10 |
| Tavily Pro | ~$0.005 | $0.005 | $0.050 | $0.50 |
| SerpApi.com Developer | $0.015 | $0.015 | $0.150 | $1.50 |
For a research agent that fans out 100 search calls per question, the difference between Serpent and SerpApi is $1.47 per agent run. At 1,000 agent runs a day, that is $1,470 per day, or about $44,000 a month.
How Much Does Search Cost vs Tokens?
For context, here is what GPT-5 token cost looks like for the same agent runs (reasoning agents typically consume 5,000 to 50,000 tokens per run):
- Light agent run (5K tokens): $0.025 of tokens
- Medium agent run (15K tokens): $0.075 of tokens
- Heavy research run (50K tokens): $0.25 of tokens
So for a light agent, search is a fraction of total cost on Serpent ($0.0003) but nearly doubles the bill on SerpApi ($0.015 + $0.025). For a heavy research run with 100 searches, the cheapest provider keeps search a minor line item (about 11% of total); the most expensive makes search dominate (about 86% of total). Pick poorly and you accidentally double or triple the cost of every run.
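The blended numbers above are easy to reproduce. A minimal sketch, assuming the per-call search prices from the table and a flat token price of $5 per million tokens (which is what the per-run token figures above imply):

```python
# Hypothetical per-call search prices (USD) from the table above.
SEARCH_PRICE = {"serpent": 0.0003, "serper": 0.001,
                "tavily": 0.005, "serpapi": 0.015}

def run_cost(provider: str, searches_per_run: int, tokens_per_run: int,
             token_price_per_1k: float = 0.005) -> dict:
    """Split one agent run's cost into search spend vs token spend."""
    search = SEARCH_PRICE[provider] * searches_per_run
    tokens = tokens_per_run / 1000 * token_price_per_1k
    return {"search": search, "tokens": tokens,
            "search_share": search / (search + tokens)}

# Heavy research run: 100 searches, 50K tokens.
print(run_cost("serpent", 100, 50_000))  # search is roughly 11% of total
print(run_cost("serpapi", 100, 50_000))  # search is roughly 86% of total
```

Swap in your real blended token price; reasoning-heavy models shift the ratio, but the 50× spread between providers does not move.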
The Caching Trick
Identical agent queries often produce identical search calls. A 24-hour cache keyed on the search string can cut search cost by 40 to 70 percent for production agents:
```python
from functools import lru_cache
from langchain.globals import set_llm_cache
from langchain_community.cache import SQLiteCache

# Caches LLM outputs across runs.
set_llm_cache(SQLiteCache(".langchain.db"))

# Cache search tool results manually.
@lru_cache(maxsize=10000)
def cached_search(query: str) -> str:
    return web_search(query)

search_tool = Tool(
    name="web_search",
    func=cached_search,
    description="Search the web. Input: a search query string.",
)
```
For longer-lived caching across processes, swap the lru_cache for Redis or SQLite with a TTL. The classic recipe is "key = SHA256(query + country + date), TTL 24 hours."
When to Use Tavily vs a SERP API
Tavily and traditional SERP APIs solve adjacent problems:
- Use Tavily when the agent only needs the answer from a webpage and you do not care about positions, ads, or SERP features. Tavily extracts content already, saving you a follow-up scrape.
- Use a SERP API (Serpent, Serper) when the agent needs to know which page something appears on, where it ranks, what other features Google shows, or the AI Overview block. SERP APIs return structure; Tavily returns prose.
- Use both when you have a multi-step agent: SERP API to identify the right URL, then Tavily (or a content extractor) to pull the page contents.
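The two-step pattern in the last bullet is just function composition. A minimal sketch with both provider calls stubbed out (`serp_top_url` and `extract_content` are illustrative names, not real SDK calls):

```python
def serp_top_url(query: str) -> str:
    # Stand-in for a SERP API call; in production this would
    # parse organic_results[0]["url"] from the JSON response.
    return "https://example.com/top-result"

def extract_content(url: str) -> str:
    # Stand-in for Tavily or another content extractor.
    return f"(extracted text of {url})"

def research(query: str) -> str:
    """Step 1: rank-aware URL discovery via a SERP API.
    Step 2: content extraction from the winning URL."""
    url = serp_top_url(query)
    return extract_content(url)

print(research("latest Next.js version"))
```

Each step can also be exposed as its own LangChain Tool so the agent decides when discovery alone is enough and when it needs the full page.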
Common Pitfalls
- Forgetting to truncate. A SERP API response can be 30 KB. Pasting that whole thing into the LLM context is wasteful. Truncate to top-5 results plus the AIO.
- Not setting `max_iterations`. A LangChain agent with no cap can call the search tool 30 times in a single run. Set `AgentExecutor(max_iterations=8)` or similar.
- Letting the LLM choose the country. The LLM will guess based on context, often wrong. Pin the country in the tool function based on the user's session.
- Treating the search call as free. Every search call is a paid HTTP call plus latency. Check whether the LLM actually needs to search before letting it.
- Mixing failures. A 503 from the SERP API and an empty results array are both "search failed" but you handle them differently. Distinguish in the tool.
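The truncation and failure-handling pitfalls fold naturally into one formatting helper. A sketch assuming the Serpent-style `organic_results` shape from earlier; the sentinel strings are arbitrary, but keeping them distinct lets the agent retry outages without retrying empty SERPs:

```python
def format_serp(data: dict, status: int,
                max_results: int = 5, max_chars: int = 2000) -> str:
    """Turn a raw SERP response into a short, LLM-safe string,
    distinguishing transport failure from a legitimately empty SERP."""
    if status >= 500:
        # Provider outage: retryable, or the agent answers from memory.
        return "SEARCH_ERROR: provider unavailable, consider retrying."
    organic = data.get("organic_results", [])[:max_results]
    if not organic:
        # Empty result set: retrying the same query will not help.
        return "NO_RESULTS: try a different query."
    lines = [f"{i+1}. {o['title']} - {o['url']}"
             for i, o in enumerate(organic)]
    return "\n".join(lines)[:max_chars]  # hard cap on context spend

print(format_serp({"organic_results": [
    {"title": "Next.js 16", "url": "https://nextjs.org/blog"}]}, 200))
```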
The Decision Tree
- Light agent, cost-sensitive. → Serpent API. Cheapest per call, full SERP JSON.
- Speed-critical agent. → Serper.dev for sub-1.5s p95, Google only.
- RAG / knowledge worker agent. → Tavily for pre-extracted content.
- Multi-engine, exotic markets (Yelp, App Store). → SerpApi.com.
- Production at scale. → Serpent for cost, Tavily for content extraction, both wired in parallel.
Wire It Up in 5 Minutes
Serpent API is the cheapest way to add web search to a LangChain agent in 2026. 10 free Google searches on signup, no credit card. Pay-as-you-go after that, from $0.30 per 1,000 quick searches at Scale tier.
Get Your Free API Key · Explore: SERP API · Playground · SERP APIs for AI agents
FAQ
Which SERP API is best for LangChain agents?
Tavily and Serpent API are the two most common picks. Tavily returns pre-extracted content (best for handing straight to an LLM); Serpent returns full SERP JSON including AI Overview at the lowest per-call cost. Serper.dev is fast and cheap for basic Google. SerpApi works but is the most expensive.
How much does web search add to my agent cost?
At 1 query per run, $0.0003 to $0.015 depending on provider. Dominant cost is usually LLM tokens. At 10 queries per run, search starts to compete with token cost; on SerpApi pricing it can exceed token cost.
Should I cache search results?
Yes. 12 to 24 hour TTL. SERP results are stable enough that caching by query string materially cuts cost. Cache the parsed JSON, not just the HTTP response.
Can I use multiple SERP APIs in one agent?
Yes. Common pattern: Tavily for content extraction plus Serpent API for SERP-feature data. Wire each as a separate Tool.
Does this work for LlamaIndex too?
Yes. LlamaIndex has BrightDataToolSpec and SerpApi integrations. For custom integration, FunctionTool wraps any Python function as a tool.
What about MCP-based agents?
MCP (Model Context Protocol) tool servers can expose the same SERP API via the MCP transport. Several providers including Tavily ship MCP servers; for the others you can write a 50-line MCP wrapper that calls the SERP API HTTP endpoint.