Tavily vs Exa vs Serper vs API Pick: Which Web Search API for LLMs?

Sarah Choy · Published May 2, 2026 · About 9 min read

Tavily, Exa, Serper, and API Pick all promise to be the search layer for your LLM. They differ sharply on output shape, filters, and how they bill. Here is a side-by-side from someone who ships agents.

TL;DR

  • Use Tavily when you want a single hosted RAG endpoint that returns LLM-ready answers — at the cost of opaque ranking and a subscription floor.
  • Use Exa for semantic-first discovery where neural ranking matters more than freshness; budget accordingly at scale.
  • Use Serper if you need raw Google SERPs and you will do your own cleaning, ranking, and snippet shaping.
  • Use API Pick Web Search when you want pre-shaped JSON snippets, transparent per-call credit pricing, country & date filters, and only pay for HTTP 200 responses.

What \"web search API for LLMs\" actually means

General search APIs like Google Custom Search, Bing Web Search, and SerpAPI return search-engine result pages — the same blue links and rich snippets a human would see. That format is wrong for a language model. An agent doesn't want to parse a SERP. It wants a small, ranked list of titles, URLs, and clean text snippets it can quote into a context window. The four APIs here all promise that, but they make different trade-offs in how they do it.

We will compare on five practical axes: output shape, filtering, pricing model, integration ergonomics, and what they don't do.

The contenders, in one paragraph each

Tavily

Hosted RAG-as-a-service. tavily.search returns ranked snippets; tavily.qna bundles search with a quick LLM answer. Strong fit for chat assistants where you want "give the model an answer-ready blob". Subscription-based with usage credits.

Exa (formerly Metaphor)

Neural / semantic-first index. Designed around "find me URLs that look like this URL" and embedding-based ranking, with options to retrieve highlights or full content. Strongest when freshness matters less than topical similarity. Subscription with credit overages.

Serper

Raw Google SERP API. Returns the JSON shape of a real Google search results page — organic, knowledge graph, places, videos. You do the snippet cleaning and ranking yourself. Cheap per query, but you ship the LLM-shaping layer.

API Pick Web Search

Pay-as-you-go semantic web search shaped for LLM tool calling. POST /api/search/web returns 5 (max 10) ranked results with titles, URLs, and pre-cleaned snippets, plus optional country_code and start_date/end_date filters. 15 credits per call (~$0.015), deducted only on success.
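
Calling it directly is a single POST. The sketch below assumes Bearer-token auth and a request field named query (both are assumptions; confirm them against the API reference); the filter names and response fields follow the examples in this article:

import requests

API_PICK_URL = "https://www.apipick.com/api/search/web"

def search_api_pick(query: str, api_key: str = "YOUR_API_KEY", **filters) -> dict:
    """Hypothetical wrapper for API Pick Web Search; header and field names
    are assumptions, while the filters and response shape follow this article."""
    resp = requests.post(
        API_PICK_URL,
        headers={"Authorization": f"Bearer {api_key}"},  # auth scheme assumed
        json={"query": query, **filters},                # "query" field name assumed
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

# UK-only results published since April 2026:
data = search_api_pick("RAG evaluation benchmarks",
                       country_code="GB", start_date="2026-04-01")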

Side-by-side

Comparison reflects each provider's general positioning at the time of writing. Always confirm pricing and quotas on each provider's pricing page before integrating.
|                      | Tavily                                        | Exa                                            | Serper                | API Pick                                 |
|----------------------|-----------------------------------------------|------------------------------------------------|-----------------------|------------------------------------------|
| Output shape         | Ranked snippets + optional bundled LLM answer | Ranked URLs with optional highlights / contents | Raw Google SERP JSON  | Ranked title + URL + LLM-friendly snippet |
| Country filter       | Yes                                           | Limited                                        | Yes                   | Yes (country_code)                       |
| Date-range filter    | Yes                                           | Yes                                            | Yes (qdr)             | Yes (start_date / end_date)              |
| Tool schema endpoint | No                                            | No                                             | No                    | Yes — GET /api/search/web/tool-schema    |
| Pricing model        | Subscription + credits                        | Subscription + credits                         | Per-query             | Pay-as-you-go credits, $5 / 5k           |
| Charges on failure?  | Varies                                        | Varies                                         | Yes                   | No — only on HTTP 200                    |
| Best fit             | Hosted RAG / chat assistants                  | Semantic discovery / similarity                | Custom SERP pipelines | AI agent tool calling, RAG pipelines     |

Output shape: the part that matters most

The reason this category exists at all is that LLMs cannot reason effectively over a SERP HTML blob. They reason over short, named, ranked text. The single biggest predictor of whether a search API works well as an agent tool is therefore: how clean is the snippet?

Tavily and API Pick aggressively clean snippets. Exa returns either highlights or full contents depending on flags — fine, but you decide how much to ask for. Serper hands you the raw SERP and assumes you'll run an extractor next, which is a reasonable choice if you already operate one; otherwise it is hidden work.
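
To make the hidden work concrete, here is a minimal sketch of the shaping layer you would write on top of Serper, assuming its documented organic-results shape (title, link, snippet fields) at the time of writing; the truncation length is arbitrary:

import requests

SERPER_URL = "https://google.serper.dev/search"

def serper_to_llm_results(query: str, api_key: str, k: int = 5) -> list[dict]:
    """Collapse a raw Serper SERP into the small ranked list an LLM tool expects."""
    resp = requests.post(
        SERPER_URL,
        headers={"X-API-KEY": api_key, "Content-Type": "application/json"},
        json={"q": query},
        timeout=10,
    )
    resp.raise_for_status()
    organic = resp.json().get("organic", [])
    # Keep only what a model can quote; drop knowledge graph, places, videos.
    return [
        {"title": r["title"], "url": r["link"], "snippet": r.get("snippet", "")[:300]}
        for r in organic[:k]
    ]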

With API Pick, a typical response looks like:

{
  "results": [
    {
      "title": "Retrieval-augmented generation - Wikipedia",
      "url": "https://en.wikipedia.org/wiki/Retrieval-augmented_generation",
      "snippet": "Retrieval-augmented generation (RAG) is a technique that combines\nsearch with text generation, often using vector search to ground LLM\nanswers in retrieved documents."
    }
    /* …more */
  ],
  "result_count": 5,
  "credits_used": 15,
  "remaining_credits": 985
}

That shape drops directly into a function-calling response without further parsing.
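
In an OpenAI chat-completions loop, for example, a tool handler can forward the payload untouched (search_api_pick is the hypothetical wrapper sketched earlier):

import json

def handle_tool_call(tool_call) -> dict:
    """Turn the model's tool call into a tool message with no reshaping step."""
    args = json.loads(tool_call.function.arguments)
    payload = search_api_pick(**args)  # hypothetical wrapper from earlier
    return {
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps(payload["results"]),  # already title/url/snippet
    }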

Filtering: country and recency

Two filter dimensions matter for production agents:

  • Country / locale: a financial agent in the UK should not get US-only sources by default.
  • Date range: a market-research agent asking "what happened this week" must reject anything older than 7 days.

All four APIs expose some form of both, but expressivity varies. API Pick uses ISO date strings (start_date="2026-04-01"), which is unambiguous, vs. Google's coarser qdr buckets (past hour / day / week / month).
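
As a sketch, a "past 7 days" constraint becomes two ISO strings computed at call time, reusing the hypothetical wrapper from earlier:

from datetime import date, timedelta

def last_n_days(n: int = 7) -> dict:
    """Build an unambiguous ISO date-range filter, e.g. for weekly digests."""
    today = date.today()
    return {"start_date": (today - timedelta(days=n)).isoformat(),
            "end_date": today.isoformat()}

# Merged into the search call:
data = search_api_pick("LLM agent framework releases",
                       country_code="GB", **last_n_days(7))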

Pricing model: subscription vs. pay-as-you-go

Subscription-based APIs (Tavily, Exa) work well when you have predictable, steady traffic. They become awkward in three common patterns:

  • You're prototyping and don't want a monthly commitment.
  • Your traffic is bursty (e.g. a research agent that runs in batches).
  • You build agents that retry aggressively on partial failures.

API Pick uses a credits model — $5 buys 5,000 credits; Web Search costs 15 credits per call; credits never expire and are only deducted on HTTP 200 responses. That last clause matters more than it sounds: an agent loop that retries five times on a transient 502 is free, not 5×.
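
A sketch of what that means in practice, again reusing the hypothetical wrapper from earlier: the failed attempts below deduct nothing, so the whole loop bills at most one call:

import time
import requests

def search_with_retry(query: str, attempts: int = 5, **filters) -> dict:
    """Retry transient 5xx failures; under only-on-success billing the
    failed attempts cost nothing."""
    for attempt in range(attempts):
        try:
            return search_api_pick(query, **filters)  # billed only on HTTP 200
        except requests.HTTPError as err:
            if err.response.status_code < 500:
                raise                 # 4xx: a bad request, retrying will not help
            time.sleep(2 ** attempt)  # exponential backoff on transient 5xx
    raise RuntimeError(f"search failed after {attempts} attempts")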

Integration ergonomics

The lowest-friction integration is one where you can paste a JSON tool schema into your agent code without writing a wrapper. API Pick publishes ready-to-use schemas:

# OpenAI function tool schema
curl https://www.apipick.com/api/search/web/tool-schema

# Returns OpenAI tool definition + Claude tool use definition

With OpenAI Assistants:

from openai import OpenAI
import requests

client = OpenAI()
schema = requests.get("https://www.apipick.com/api/search/web/tool-schema").json()

assistant = client.beta.assistants.create(
    name="Research Agent",
    model="gpt-4o",
    tools=[{"type": "function", "function": schema["openai"]}],
)

With Claude tool use:

import anthropic
import requests

schema = requests.get("https://www.apipick.com/api/search/web/tool-schema").json()
client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=[schema["claude"]],
    messages=[{"role": "user", "content": "What's new in RAG research this week?"}],
)
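
When Claude decides to search, the response comes back with stop_reason "tool_use"; a minimal dispatch back into the conversation looks like this (search_api_pick is the hypothetical wrapper from earlier):

import json

if response.stop_reason == "tool_use":
    tool_use = next(b for b in response.content if b.type == "tool_use")
    results = search_api_pick(**tool_use.input)  # hypothetical wrapper from earlier

    # Feed the result back so the model can compose its final answer.
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        tools=[schema["claude"]],
        messages=[
            {"role": "user", "content": "What's new in RAG research this week?"},
            {"role": "assistant", "content": response.content},
            {"role": "user", "content": [{
                "type": "tool_result",
                "tool_use_id": tool_use.id,
                "content": json.dumps(results["results"]),
            }]},
        ],
    )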

What none of these APIs do

No web search API will reliably answer "every X from this domain since 2019". For deep-archive coverage you still want to pair search with a focused crawler or a domain-specific dataset. None of them dedupe across closely similar URLs perfectly. And none of them solve the upstream problem of a stale or low-authority source — that's a content-quality decision your agent has to make.

Choosing fast

  • Hosted RAG with a bundled answer: pick Tavily. Single endpoint, the model already chooses sources, fastest path to a working chat assistant.
  • Semantic / similarity discovery: pick Exa. Neural ranking is its core thesis; embeddings work better for "find me more like this URL" than keyword search.
  • Building your own SERP pipeline: pick Serper. Cheapest raw Google SERP JSON; you do the cleaning.
  • AI-agent tool calling, transparent pricing, no monthly floor: pick API Pick. Pre-shaped LLM-ready snippets, country and date filters, pay-as-you-go, charged only on success, ready-to-paste tool schemas.

FAQ

Which API has the best price-per-call?

Per-call pricing varies. API Pick Web Search costs 15 credits per call (≈ $0.015 at the $5 / 5,000 credits rate) and only deducts credits on HTTP 200 responses. Tavily and Exa use monthly subscriptions plus per-credit overages; Serper bills per query. If your traffic is bursty or you re-run failed calls during agent retries, the only-on-success model usually wins on real-world spend.

Do all of these work with OpenAI function calling and Claude tool use?

Yes. They all expose a JSON-in / JSON-out interface, so you can wrap any of them as a tool function. API Pick additionally publishes an OpenAI/Claude tool schema endpoint (GET /api/search/web/tool-schema) so you can paste the exact JSON definition into your agent loop.

Is API Pick a Tavily wrapper?

No. API Pick runs its own search index aggregation, ranking, and snippet shaping pipeline. The output is intentionally simpler than Tavily's: ranked title + URL + LLM-friendly snippet, with optional country and date-range filters. You can call POST /api/search/web directly without using a hosted RAG layer.

What about latency?

All four are designed for synchronous agent calls. P50 latencies are roughly comparable (sub-second for short queries). The real latency cliff is when an API also runs a downstream LLM call inside the search endpoint — pure search APIs return faster than "search + answer" composite endpoints.

Which is the best Tavily alternative?

If you are leaving Tavily because of the subscription floor or opaque pricing on overages, API Pick Web Search is the closest pay-as-you-go drop-in: same shape (ranked, snippet-shaped JSON), country/date filters, no monthly minimum.


Author
Sarah Choy
CEO, API Pick

Sarah Choy is the CEO of API Pick, where she focuses on building production-ready APIs for AI agents and LLM workflows.