How to Scrape DuckDuckGo in 2026 (for Free!)

By Serpent API Team · 12 min read

DuckDuckGo is the search engine that AI agents reach for first, because for years a tiny Python library made it free and frictionless. Then two things broke at once: the library was renamed, and the dreaded 202 Ratelimit started flooding logs everywhere.

If you have a LangChain or LlamaIndex agent that suddenly throws rate-limit errors on every web search, you are not alone — this exact failure took down a lot of side-projects in 2025. The good news is the fixes are simple once you know what is actually happening.

This guide explains DuckDuckGo's endpoints, the ddgs rename, the vqd token, and how to parse results cleanly — with working Python that decodes the link wrappers and survives the 202.

TL;DR: The duckduckgo-search library is now ddgs (pip install ddgs) — old pins broke. The 202 Ratelimit is a soft block: pace requests, back off with jitter, and prefer your own residential IP over flagged proxies. Skip the vqd token handshake entirely by hitting html.duckduckgo.com/html/, which returns parseable HTML — then decode each result's uddg= link parameter to get the real URL. For volume without the babysitting, a DuckDuckGo SERP API returns clean JSON.

DuckDuckGo's endpoints, and which to use

DuckDuckGo is not one thing. It exposes several surfaces, and most rate-limit pain comes from people fighting the hardest one when an easier one would do.

| Endpoint | What it returns | Needs a vqd? |
| --- | --- | --- |
| html.duckduckgo.com/html/ | The full SERP as plain, no-JS HTML; easiest to parse | No |
| lite.duckduckgo.com/lite/ | A stripped-down SERP, even lighter HTML | No |
| api.duckduckgo.com (Instant Answer) | Disambiguation and instant answers only, not web results | No |
| duckduckgo.com (main site) | The JS SERP; results fetched from a follow-up endpoint | Yes |

The single most common mistake is reaching for the Instant Answer API and being disappointed that it returns a Wikipedia blurb instead of ten blue links. It was never a web-results endpoint. For real organic results without a browser, the answer is html.duckduckgo.com.

The ddgs rename that broke everything

In mid-2025 the popular duckduckgo-search library was renamed to ddgs. The old package was frozen, so any project pinned to duckduckgo-search==x.y stopped getting fixes and, as DuckDuckGo changed its internals, simply stopped working.

The migration is one line. Install the new name and import from it:

# Old (frozen):  pip install duckduckgo-search
# New:           pip install ddgs

from ddgs import DDGS

with DDGS() as ddgs:
    for r in ddgs.text("best running shoes 2026", region="us-en", max_results=10):
        print(r["title"], r["href"])

That works — until you call it in a tight loop, at which point you meet the error that brought you here:

ddgs.exceptions.RatelimitException:
    https://duckduckgo.com/ 202 Ratelimit

The library is fine. The 202 is DuckDuckGo telling you to slow down. Understanding why is the key to never seeing it again.
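
If the same code has to run in environments where either package name might be installed, a guarded import is one way to bridge the migration. The sketch below is not part of either library: `load_ddgs_class` is a hypothetical helper name, and it assumes the frozen distribution still exposes its `DDGS` class from the `duckduckgo_search` module.

```python
import importlib

def load_ddgs_class():
    """Return the DDGS class from whichever package is installed, or None.

    Tries the new module name first, then the frozen legacy one.
    `load_ddgs_class` is a hypothetical helper, not part of either library.
    """
    for module_name in ("ddgs", "duckduckgo_search"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # this name is not installed; try the next one
        cls = getattr(module, "DDGS", None)
        if cls is not None:
            return cls
    return None

DDGS = load_ddgs_class()
if DDGS is None:
    print("No DuckDuckGo client installed; run: pip install ddgs")
```

Treat this as a stopgap while old pins are cleaned up. Once every environment installs ddgs, the plain `from ddgs import DDGS` is simpler.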

What the vqd token is (and how to skip it)

DuckDuckGo's main site does not hand you results on the first request. It serves a page containing a per-query token called vqd, and the actual results endpoint refuses to answer unless you pass that token back. It is a deliberate two-step handshake designed to make naive scraping awkward.

The ddgs library does this dance for you, which is why hammering it triggers the rate-limit: each search is really two requests against the token-gated path. You can sidestep the whole mechanism by using the HTML endpoint instead, which returns a complete results page in a single request with no token required.

That is the core trick of free DuckDuckGo scraping: do not fight the vqd — route around it.
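
To make the single-request path concrete, the sketch below builds (but does not send) the POST that the HTML endpoint expects, using only the standard library. The `q` and `kl` field names match the scraper later in this guide; `build_html_request` is a hypothetical helper name, and the User-Agent string is only an example value.

```python
from urllib.parse import urlencode
from urllib.request import Request

HTML_ENDPOINT = "https://html.duckduckgo.com/html/"

def build_html_request(query, region="us-en"):
    """Build one POST for the no-JS endpoint; no vqd field is involved."""
    form = urlencode({"q": query, "kl": region}).encode("ascii")
    return Request(
        HTML_ENDPOINT,
        data=form,
        headers={"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"},  # example value
        method="POST",
    )

req = build_html_request("best running shoes 2026")
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))  # the whole request body: query and region only
```

Sending it would be a single `urllib.request.urlopen(req, timeout=20)` call. There is no first request to harvest a token from and no second request to spend it on.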

Fixing the 202 Ratelimit

The 202 is a soft block. It clears on its own; the question is how fast you provoke it again. Three habits keep it away.

First, pace yourself. DuckDuckGo is small compared to Google and notices bursts quickly, so leave several seconds of randomized delay between requests rather than firing in a loop. Second, back off exponentially when you do get a 202 instead of retrying instantly. Third — and this surprises people — scrape from your own IP rather than a proxy, for reasons we get to below.
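
One practical wrinkle when you call an endpoint directly rather than through ddgs: a 202 is technically a success status, so checks such as requests' `raise_for_status()`, which only raises on 4xx and 5xx, let it straight through. A minimal sketch of an explicit check follows. `RateLimited` and `ensure_not_rate_limited` are hypothetical names, and treating every 202 as a rate-limit is an assumption based on the behaviour described above.

```python
class RateLimited(Exception):
    """Raised when a response looks like DuckDuckGo's soft block."""

def ensure_not_rate_limited(status_code, body=""):
    """Raise RateLimited on a 202; return None for any other status.

    Assumption: a 202 from DuckDuckGo always means "slow down".
    """
    if status_code == 202:
        preview = body.strip()[:80]  # keep a short excerpt for the log
        raise RateLimited(f"202 Ratelimit: {preview!r}")

# Works with any HTTP client: pass in its status code and response text.
ensure_not_rate_limited(200, "<html>results</html>")  # passes silently
try:
    ensure_not_rate_limited(202, "Ratelimit")
except RateLimited as exc:
    print("blocked:", exc)
```

Raising a dedicated exception turns a silent empty page into something retry logic can log and react to.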

Here is a backoff wrapper that treats both exceptions and empty results as "try again, slower":

import time, random

def with_backoff(fetch, tries=5):
    """Retry a DDG fetch with exponential backoff + jitter."""
    for attempt in range(tries):
        try:
            results = fetch()
            if results:                       # non-empty = success
                return results
        except Exception as exc:
            print(f"attempt {attempt + 1} failed: {exc}")
        if attempt < tries - 1:               # no point sleeping after the final try
            delay = (2 ** attempt) + random.uniform(0, 1.5)  # ~1s, 2s, 4s, 8s plus jitter
            time.sleep(delay)
    raise RuntimeError("DuckDuckGo rate-limited after retries")

For agent frameworks that search constantly, also add a small fixed floor between calls — a global "no more than one DDG request every N seconds" gate — so concurrent tools do not collectively hammer the endpoint. Pacing is the same discipline that tames Google's 429 "unusual traffic" errors; the engine differs, the cure does not.
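
The global gate described above can be sketched as a small thread-safe class. This is one possible shape, not a framework feature: `MinIntervalGate` is a hypothetical name. It reserves each caller's time slot while holding the lock and sleeps outside it, so concurrent tools queue up instead of firing together.

```python
import random
import threading
import time

class MinIntervalGate:
    """Allow at most one call every `min_interval` seconds, process-wide."""

    def __init__(self, min_interval, jitter=0.0):
        self.min_interval = min_interval
        self.jitter = jitter              # extra random spacing, in seconds
        self._lock = threading.Lock()
        self._next_allowed = 0.0          # a time.monotonic() timestamp

    def wait(self):
        """Block until this caller's reserved slot; return seconds waited."""
        with self._lock:
            now = time.monotonic()
            slot = max(now, self._next_allowed)
            gap = self.min_interval + random.uniform(0, self.jitter)
            self._next_allowed = slot + gap   # reserve the following slot
        delay = max(slot - now, 0.0)
        if delay > 0:
            time.sleep(delay)                 # sleep outside the lock
        return delay

# Demo with a short interval; real DDG use would be several seconds.
ddg_gate = MinIntervalGate(min_interval=0.2)
first = ddg_gate.wait()
second = ddg_gate.wait()
print(f"waited {first:.2f}s, then {second:.2f}s")
```

Call `ddg_gate.wait()` immediately before every DuckDuckGo request, sharing one gate instance across all tools in the process.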

Parsing results and decoding uddg links

Now the part that actually returns data. The HTML endpoint accepts a POST with the query and a region code, and lays each result out in a .result row. The titles and snippets are plain text; the links, however, are wrapped.

import requests
from bs4 import BeautifulSoup
from urllib.parse import urlparse, parse_qs

HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
    )
}

def decode_uddg(href):
    """DDG wraps links as /l/?uddg=<encoded real url>."""
    # parse_qs already percent-decodes values; decoding again would
    # corrupt destinations that contain their own escapes (e.g. %25).
    qs = parse_qs(urlparse(href or "").query)
    return qs["uddg"][0] if "uddg" in qs else href

def scrape_ddg(query, region="us-en"):
    resp = requests.post(
        "https://html.duckduckgo.com/html/",
        data={"q": query, "kl": region},
        headers=HEADERS, timeout=20,
    )
    soup = BeautifulSoup(resp.text, "html.parser")

    out = []
    for i, row in enumerate(soup.select(".result"), start=1):
        link = row.select_one("a.result__a")
        snippet = row.select_one(".result__snippet")
        if not link:
            continue
        out.append({
            "position": i,
            "title": link.get_text(strip=True),
            "url": decode_uddg(link.get("href")),
            "snippet": snippet.get_text(strip=True) if snippet else None,
        })
    return out

for r in with_backoff(lambda: scrape_ddg("best running shoes 2026"))[:10]:
    print(r["position"], r["title"])

The uddg decoder is the equivalent of Yahoo's redirect unwrapping: DuckDuckGo routes clicks through /l/?uddg=<encoded url>, and the real destination is that one URL-encoded parameter. Pull it out, decode it, and your results are clean.
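
A worked example makes the unwrapping concrete. The sketch below uses only the standard library on a hand-built sample href; `unwrap_uddg` is a hypothetical name, and the trailing `other` parameter is a made-up placeholder standing in for whatever else the wrapper carries. Note that `parse_qs` percent-decodes values itself, so no separate decoding step is needed.

```python
from urllib.parse import parse_qs, quote, urlparse

def unwrap_uddg(href):
    """Return the destination inside a /l/?uddg=... wrapper, else href itself."""
    params = parse_qs(urlparse(href).query)  # parse_qs percent-decodes values
    return params["uddg"][0] if "uddg" in params else href

# Hand-built sample: a real URL, fully percent-encoded into the uddg parameter.
real = "https://example.com/shoes?size=10&color=red"
wrapped = "//duckduckgo.com/l/?uddg=" + quote(real, safe="") + "&other=abc123"

print(wrapped)
print(unwrap_uddg(wrapped))                       # the original `real` URL
print(unwrap_uddg("https://example.com/direct"))  # unwrapped links pass through
```

Because the destination's own `?` and `&` are encoded inside the wrapper, they survive intact and are not confused with the wrapper's parameters.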

Why a proxy can make DDG worse

Everywhere else in scraping, the advice is "add proxies." DuckDuckGo is the exception that proves you should understand a target before throwing infrastructure at it.

Because DuckDuckGo is privacy-focused and relatively small, it leans hard on IP reputation. Datacenter ranges and shared proxy pools are exactly the addresses abusers use, so routing through cheap proxies often triggers more 202s than scraping from your own clean residential IP would. People add a proxy to fix rate-limiting and make it dramatically worse.

If you genuinely need volume beyond what one IP allows, use clean residential addresses and keep the per-IP rate low; pushing too much traffic through too few addresses is exactly how proxies get banned. For most personal and agent use, no proxy plus polite pacing is both cheaper and more reliable. And if you are weighing DuckDuckGo against Google for a project, our DuckDuckGo vs Google SERP data comparison covers the result-quality differences.

The one-call alternative

If you are tired of pacing logic, backoff wrappers, and link decoders — especially inside an AI agent where a single rate-limit can fail a whole run — a SERP API returns clean JSON and absorbs the rate-limiting for you:

import requests

resp = requests.get(
    "https://api.apiserpent.com/api/search",
    headers={"X-API-Key": "sk_live_your_key"},
    params={"q": "best running shoes 2026", "engine": "ddg", "country": "us"},
    timeout=20,
)
resp.raise_for_status()   # surface auth or quota errors before parsing

for r in resp.json()["results"]["organic"]:
    print(r["position"], r["title"], r["url"])   # already decoded

The JSON shape is the same across every engine: switch engine to google, bing, or yahoo. For agent builders specifically, a stable search tool that never throws a 202 mid-run is worth a lot; you can wire one up by following the pattern in our Google News guide, or test queries in the playground.

No more 202s. Just clean results.

Serpent returns parsed JSON for DuckDuckGo, Google, Bing, and Yahoo — rate-limiting, link-unwrapping, and proxies are our problem, not your agent's. Get 10 free Google searches on signup, then pay-as-you-go from $0.03 per 10,000 searches at scale, no subscription.


FAQ

Why do I keep getting a 202 Ratelimit from DuckDuckGo?

DuckDuckGo returns a 202 with a Ratelimit body when it decides your client is sending requests too fast or from a flagged IP. It is a soft block, not a hard ban. Fix it by pacing requests with several seconds of randomized delay, backing off exponentially on each 202, and scraping from a residential or home IP rather than a flagged datacenter range.

What is the difference between ddgs and duckduckgo-search?

They are the same project. The duckduckgo-search library was renamed to ddgs in mid-2025, so you install it with pip install ddgs and import from ddgs. Old code pinned to duckduckgo-search broke when the package was frozen, which is why many tutorials and agent frameworks suddenly started failing.

What is the vqd token in DuckDuckGo scraping?

The vqd is a per-query token DuckDuckGo's main site issues on the first request and requires on follow-up requests that fetch the actual results. It is a two-step handshake meant to stop direct scraping of the results endpoint. You can avoid it entirely by using the html.duckduckgo.com or lite.duckduckgo.com endpoints, which return parseable HTML without a vqd.

Should I use a proxy when scraping DuckDuckGo?

Counterintuitively, often no. DuckDuckGo aggressively flags datacenter and shared proxy IPs, so routing through cheap proxies can trigger more 202 rate-limits than scraping directly from your own residential IP. If you do need proxies for volume, use clean residential addresses and keep the per-IP request rate low.