Plug Web Search into a LangChain Agent in 5 Minutes (with Per-Run Cost Math)
Every LangChain tutorial that adds web search to an agent does two things: it picks one provider (usually whoever sponsored the post) and it skips the cost math entirely. The result is a working agent that quietly burns $200 a month on search calls because someone copied the SerpApi tool from the docs.
This guide does it differently. We wire up a working LangChain agent with web search, then run the exact same conversation through four different SERP APIs and look at the bill. The provider you pick can change your monthly cost by 50×.
The 5-Minute Wiring
LangChain has a generic tool pattern that works for any HTTP API. Here is the minimum agent with a SERP tool:
```python
from langchain_openai import ChatOpenAI
from langchain.agents import create_react_agent, AgentExecutor
from langchain.tools import Tool
from langchain.prompts import PromptTemplate
import requests, os

SERPENT_API_KEY = os.environ["SERPENT_API_KEY"]

def web_search(query: str) -> str:
    """Search Google for the query and return the top results."""
    r = requests.get("https://apiserpent.com/api/search", params={
        "q": query, "engine": "google", "country": "us",
        "api_key": SERPENT_API_KEY,
    }, timeout=30)
    data = r.json()
    organic = data.get("organic_results", [])[:5]
    aio = (data.get("ai_overview") or {}).get("text") or ""
    summary = aio[:500] if aio else ""
    lines = [f"{i+1}. {o['title']} - {o['url']}" for i, o in enumerate(organic)]
    return summary + "\n\n" + "\n".join(lines)

search_tool = Tool(
    name="web_search",
    func=web_search,
    description="Search the web. Input: a search query string.",
)

llm = ChatOpenAI(model="gpt-5", temperature=0)

# create_react_agent requires {tools}, {tool_names}, and {agent_scratchpad}
# in the prompt; omitting them raises a ValueError at construction time.
prompt = PromptTemplate.from_template(
    "Answer the user's question using the web_search tool when needed.\n"
    "You have access to these tools:\n{tools}\n"
    "Tool names: {tool_names}\n"
    "Question: {input}\n\nThought: {agent_scratchpad}"
)

agent = create_react_agent(llm, [search_tool], prompt)
executor = AgentExecutor(agent=agent, tools=[search_tool], verbose=True)
print(executor.invoke({"input": "What is the latest version of Next.js?"}))
```
That is the entire agent. The Serpent API call is wrapped as a LangChain Tool, the agent decides when to call it, and the response feeds back into the LLM. Replace the tool function body with a different provider and the rest of the agent is unchanged.
The Four Provider Variations
Same agent. Four different tool implementations.
Tavily
```python
from tavily import TavilyClient

client = TavilyClient(api_key=os.environ["TAVILY_KEY"])

def web_search(query: str) -> str:
    res = client.search(query, max_results=5, search_depth="basic")
    return "\n".join(f"{r['title']} - {r['url']}: {r['content'][:200]}"
                     for r in res["results"])
```
Tavily returns pre-extracted content rather than just SERP links. Best for RAG-style agents.
Serper.dev
```python
def web_search(query: str) -> str:
    r = requests.post("https://google.serper.dev/search",
                      headers={"X-API-KEY": os.environ["SERPER_KEY"]},
                      json={"q": query, "gl": "us"}, timeout=30)
    organic = r.json().get("organic", [])[:5]
    return "\n".join(f"{o['title']} - {o['link']}: {o.get('snippet', '')}"
                     for o in organic)
```
SerpApi.com
```python
def web_search(query: str) -> str:
    r = requests.get("https://serpapi.com/search", params={
        "q": query, "engine": "google", "api_key": os.environ["SERPAPI_KEY"],
    }, timeout=30)
    organic = r.json().get("organic_results", [])[:5]
    return "\n".join(f"{o['title']} - {o['link']}: {o.get('snippet', '')}"
                     for o in organic)
```
Serpent API
Already shown above. Returns the AI Overview (AIO) text plus organic results in a single call.
The Cost Test
I ran the same five questions through the agent four times, swapping only the search tool. Each question caused the agent to call the search tool 1 to 3 times depending on how the LLM reasoned. Average: 2.1 search calls per agent run.
Then I extrapolated to typical workloads:
| Provider | Per-call cost | 1 search/run | 10 searches/run | 100 searches/run (research agent) |
|---|---|---|---|---|
| Serpent API (Scale) | $0.0003 | $0.0003 | $0.003 | $0.03 |
| Serper.dev | $0.001 | $0.001 | $0.010 | $0.10 |
| Tavily Pro | ~$0.005 | $0.005 | $0.050 | $0.50 |
| SerpApi.com Developer | $0.015 | $0.015 | $0.150 | $1.50 |
For a research agent that fans out 100 search calls per question, the difference between Serpent and SerpApi is $1.47 per agent run. At 1,000 agent runs a day, that is $1,470 per day, or about $44,000 a month.
How Much Does Search Cost vs Tokens?
For context, here is what GPT-5 token cost looks like for the same agent runs (reasoning agents typically consume 5,000 to 50,000 tokens per run):
- Light agent run (5K tokens): $0.025 of tokens
- Medium agent run (15K tokens): $0.075 of tokens
- Heavy research run (50K tokens): $0.25 of tokens
So for a light agent, search is a fraction of total cost on Serpent ($0.0003) but nearly doubles the bill on SerpApi ($0.015 + $0.025). For a heavy research run with 100 searches, the cheapest provider keeps search a minor line item (about 11% of total); the most expensive makes search dominate (about 86% of total). Pick poorly and you accidentally double or triple the cost of every run.
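The blended numbers above are easy to reproduce. A minimal sketch, assuming the per-call search prices from the table and a flat token price of $5 per million tokens (which is what the per-run token figures above imply):

```python
# Hypothetical per-call search prices (USD) from the table above.
SEARCH_PRICE = {"serpent": 0.0003, "serper": 0.001,
                "tavily": 0.005, "serpapi": 0.015}

def run_cost(provider: str, searches_per_run: int, tokens_per_run: int,
             token_price_per_1k: float = 0.005) -> dict:
    """Split one agent run's cost into search spend vs token spend."""
    search = SEARCH_PRICE[provider] * searches_per_run
    tokens = tokens_per_run / 1000 * token_price_per_1k
    return {"search": search, "tokens": tokens,
            "search_share": search / (search + tokens)}

# Heavy research run: 100 searches, 50K tokens.
print(run_cost("serpent", 100, 50_000))  # search is roughly 11% of total
print(run_cost("serpapi", 100, 50_000))  # search is roughly 86% of total
```

Swap in your real blended token price; reasoning-heavy models shift the ratio, but the 50× spread between providers does not move.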
The Caching Trick
Identical agent queries often produce identical search calls. A 24-hour cache keyed on the search string can cut search cost by 40 to 70 percent for production agents:
```python
from functools import lru_cache
from langchain.globals import set_llm_cache
from langchain_community.cache import SQLiteCache

# Caches LLM outputs across runs.
set_llm_cache(SQLiteCache(".langchain.db"))

# Cache search tool results manually.
@lru_cache(maxsize=10000)
def cached_search(query: str) -> str:
    return web_search(query)

search_tool = Tool(
    name="web_search",
    func=cached_search,
    description="Search the web. Input: a search query string.",
)
```
For longer-lived caching across processes, swap the lru_cache for Redis or SQLite with a TTL. The classic recipe is "key = SHA256(query + country + date), TTL 24 hours."
When to Use Tavily vs a SERP API
Tavily and traditional SERP APIs solve adjacent problems:
- Use Tavily when the agent only needs the answer from a webpage and you do not care about positions, ads, or SERP features. Tavily extracts content already, saving you a follow-up scrape.
- Use a SERP API (Serpent, Serper) when the agent needs to know which page something appears on, where it ranks, what other features Google shows, or the AI Overview block. SERP APIs return structure; Tavily returns prose.
- Use both when you have a multi-step agent: SERP API to identify the right URL, then Tavily (or a content extractor) to pull the page contents.
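The two-step pattern in the last bullet is just function composition. A minimal sketch with both provider calls stubbed out (`serp_top_url` and `extract_content` are illustrative names, not real SDK calls):

```python
def serp_top_url(query: str) -> str:
    # Stand-in for a SERP API call; in production this would
    # parse organic_results[0]["url"] from the JSON response.
    return "https://example.com/top-result"

def extract_content(url: str) -> str:
    # Stand-in for Tavily or another content extractor.
    return f"(extracted text of {url})"

def research(query: str) -> str:
    """Step 1: rank-aware URL discovery via a SERP API.
    Step 2: content extraction from the winning URL."""
    url = serp_top_url(query)
    return extract_content(url)

print(research("latest Next.js version"))
```

Each step can also be exposed as its own LangChain Tool so the agent decides when discovery alone is enough and when it needs the full page.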
Common Pitfalls
- Forgetting to truncate. A SERP API response can be 30 KB. Pasting that whole thing into the LLM context is wasteful. Truncate to top-5 results plus the AIO.
- Not setting `max_iterations`. A LangChain agent with no cap can call the search tool 30 times in a single run. Set `AgentExecutor(max_iterations=8)` or similar.
- Letting the LLM choose the country. The LLM will guess based on context, often wrong. Pin the country in the tool function based on the user's session.
- Treating the search call as free. Every search call is a paid HTTP call plus latency. Check whether the LLM actually needs to search before letting it.
- Mixing failures. A 503 from the SERP API and an empty results array are both "search failed" but you handle them differently. Distinguish in the tool.
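The truncation and failure-handling pitfalls fold naturally into one formatting helper. A sketch assuming the Serpent-style `organic_results` shape from earlier; the sentinel strings are arbitrary, but keeping them distinct lets the agent retry outages without retrying empty SERPs:

```python
def format_serp(data: dict, status: int,
                max_results: int = 5, max_chars: int = 2000) -> str:
    """Turn a raw SERP response into a short, LLM-safe string,
    distinguishing transport failure from a legitimately empty SERP."""
    if status >= 500:
        # Provider outage: retryable, or the agent answers from memory.
        return "SEARCH_ERROR: provider unavailable, consider retrying."
    organic = data.get("organic_results", [])[:max_results]
    if not organic:
        # Empty result set: retrying the same query will not help.
        return "NO_RESULTS: try a different query."
    lines = [f"{i+1}. {o['title']} - {o['url']}"
             for i, o in enumerate(organic)]
    return "\n".join(lines)[:max_chars]  # hard cap on context spend

print(format_serp({"organic_results": [
    {"title": "Next.js 16", "url": "https://nextjs.org/blog"}]}, 200))
```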
The Decision Tree
- Light agent, cost-sensitive. → Serpent API. Cheapest per call, full SERP JSON.
- Speed-critical agent. → Serper.dev for sub-1.5s p95, Google only.
- RAG / knowledge worker agent. → Tavily for pre-extracted content.
- Multi-engine, exotic markets (Yelp, App Store). → SerpApi.com.
- Production at scale. → Serpent for cost, Tavily for content extraction, both wired in parallel.
Wire It Up in 5 Minutes
Serpent API is the cheapest way to add web search to a LangChain agent in 2026. 10 free Google searches on signup, no credit card. Pay-as-you-go after that, from $0.30 per 1,000 quick searches at Scale tier.
Get Your Free API Key · Explore: SERP API · Playground · SERP APIs for AI agents
FAQ
Which SERP API is best for LangChain agents?
Tavily and Serpent API are the two most common picks. Tavily returns pre-extracted content (best for handing straight to an LLM); Serpent returns full SERP JSON including AI Overview at the lowest per-call cost. Serper.dev is fast and cheap for basic Google. SerpApi works but is the most expensive.
How much does web search add to my agent cost?
At 1 query per run, $0.0003 to $0.015 depending on provider. Dominant cost is usually LLM tokens. At 10 queries per run, search starts to compete with token cost; on SerpApi pricing it can exceed token cost.
Should I cache search results?
Yes. 12 to 24 hour TTL. SERP results are stable enough that caching by query string materially cuts cost. Cache the parsed JSON, not just the HTTP response.
Can I use multiple SERP APIs in one agent?
Yes. Common pattern: Tavily for content extraction plus Serpent API for SERP-feature data. Wire each as a separate Tool.
Does this work for LlamaIndex too?
Yes. LlamaIndex has BrightDataToolSpec and SerpApi integrations. For custom integration, FunctionTool wraps any Python function as a tool.
What about MCP-based agents?
MCP (Model Context Protocol) tool servers can expose the same SERP API via the MCP transport. Several providers including Tavily ship MCP servers; for the others you can write a 50-line MCP wrapper that calls the SERP API HTTP endpoint.