
How to Build an Investment Research Agent: Markets, Fundamentals, SEC & Economic Data in One API

Sarah Choy · Published June 16, 2026 · About 12 min read

An investment-research agent needs five different data layers — prices, fundamentals, filings, macro, and news — each normally a separate vendor, key, and schema. Here's how to wire all five behind one endpoint set, with working code and the cost math.

TL;DR

  • A useful finance agent needs five data layers: real-time markets (prices), company fundamentals (statements), SEC filings, economic indicators, and news. Stitched from separate vendors, that's 5 contracts, 5 keys, and 5 schemas.
  • API Pick exposes all five as consistent JSON search endpoints — /search/markets, /search/financials, /search/sec, /search/economic, /search/news — plus /extract for full documents. One key, pre-shaped for LLM tool calling.
  • The agent pattern: run the relevant endpoints in parallel as tools, merge the JSON, and let the model reason over grounded data instead of hallucinating numbers.
  • Credit pricing is only-on-success: markets 120, financials 200, sec 120, economic 50, news 15 per call. At roughly 1,000 credits per dollar, a multi-tool research turn costs about $0.07 (economic + news) to $0.51 (all five layers), depending on depth.
  • Build-vs-buy: assembling Polygon + a fundamentals vendor + SEC EDGAR + FRED + a news API yourself is weeks of integration and 5 monthly bills; the single-endpoint route is a day.

The five-layer problem

Ask an LLM "is NVIDIA expensive right now?" and it will confidently invent a P/E ratio. The fix isn't a bigger model — it's grounding. A research agent that earns trust has to pull live, cited data across five layers, then reason over it:

  • Markets — the current price, market cap, and how it's moved. Crypto, forex, ETFs, and the day's movers when relevant.
  • Fundamentals — balance sheet, income statement, cash flow, dividends, and insider transactions. The "is the business actually healthy" layer.
  • SEC filings — 10-K risk factors, 10-Q detail, 8-K events, earnings-call language. The qualitative dimension numbers miss.
  • Economic indicators — rates, inflation, employment, GDP from FRED, BLS, World Bank, and the IMF. The macro backdrop every thesis sits inside.
  • News — the timely catalyst: a downgrade, a product launch, a regulatory action.

Stitched from separate vendors, that's five contracts, five API keys, five rate-limit regimes, and five response schemas you have to normalize before a model can touch them. The integration is where finance-agent projects stall.

One endpoint set, five layers

API Pick exposes each layer as a consistent JSON search endpoint, so the agent talks to one key and one response shape:

  • Markets Search — global & US equities, crypto, forex, ETFs, funds, commodities, and US market movers.
  • Financials Search — balance sheets, income statements, cash flow, dividends, insider transactions.
  • SEC Filings Search — 10-K/10-Q/8-K, earnings transcripts, equity statistics.
  • Economic Data Search — FRED, BLS, World Bank, IMF, USAspending, Destatis.
  • News Search — date-filtered news across major outlets.
  • Extract — pull a full filing or article to clean markdown when a snippet isn't enough.
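As a rough sketch of what "one key, one response shape" could look like on the agent side, the five search endpoints can be described as tool definitions in the JSON-Schema style that most LLM tool-calling APIs accept. The tool names, the descriptions, and the choice to offer `end_date` on every tool are assumptions made for illustration; only the endpoint paths and the `query` field come from this article.

```python
# Hypothetical tool registry: one entry per search endpoint named in this article.
# Tool names and descriptions are illustrative; only the paths come from the article.
ENDPOINTS = {
    "search_markets": ("search/markets", "Prices, market cap, crypto, forex, ETFs, US movers."),
    "search_financials": ("search/financials", "Balance sheets, income, cash flow, dividends."),
    "search_sec": ("search/sec", "10-K, 10-Q, and 8-K filings and earnings transcripts."),
    "search_economic": ("search/economic", "Rates, inflation, employment, GDP."),
    "search_news": ("search/news", "Date-filtered news."),
}


def build_tool_specs(endpoints=ENDPOINTS):
    """Return JSON-Schema style tool definitions, one per endpoint."""
    specs = []
    for name, (_path, description) in endpoints.items():
        specs.append({
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Natural-language search query."},
                    # end_date appears in this article's example for sec and news;
                    # offering it on every tool is an assumption.
                    "end_date": {"type": "string", "description": "Optional YYYY-MM-DD upper bound."},
                },
                "required": ["query"],
            },
        })
    return specs


def path_for(tool_name, endpoints=ENDPOINTS):
    """Map a tool name chosen by the model back to its endpoint path."""
    return endpoints[tool_name][0]
```

When the model picks a tool, `path_for` gives the path to call, so the dispatch code stays the same for all five layers.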

The agent architecture

Register each endpoint as a tool. When a question comes in, the agent decides which layers it needs, calls them in parallel, merges the JSON, and reasons over the grounded result. A ticker question hits markets + fundamentals + news; a "how is the sector positioned" question hits economic + news + a couple of comparables.

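import asyncio
import os

import httpx

API = "https://api.apipick.com/v1"
# The key is read from the environment; set APIPICK_KEY before running.
HEADERS = {"x-api-key": os.environ["APIPICK_KEY"], "Content-Type": "application/json"}

async def search(client, path, query, **kw):
    # POST one search request; extra keyword arguments (e.g. end_date) go into the JSON body.
    r = await client.post(f"{API}/{path}", headers=HEADERS,
                          json={"query": query, **kw})
    r.raise_for_status()  # any non-2xx response raises instead of returning partial data
    return r.json()["results"]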

async def research(ticker: str):
    async with httpx.AsyncClient(timeout=30) as c:
        markets, fundamentals, filings, macro, news = await asyncio.gather(
            search(c, "search/markets",    f"{ticker} price and market cap"),
            search(c, "search/financials", f"{ticker} latest balance sheet and cash flow"),
            search(c, "search/sec",        f"{ticker} 10-K risk factors", end_date="2026-06-16"),
            search(c, "search/economic",   "US interest rates and inflation latest"),
            search(c, "search/news",       f"{ticker} latest news", end_date="2026-06-16"),
        )
    return {"markets": markets, "fundamentals": fundamentals,
            "filings": filings, "macro": macro, "news": news}

# Feed the merged JSON back to your LLM as grounding, with the source URLs,
# and ask it to synthesize — never to recall numbers.
context = asyncio.run(research("NVDA"))
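One thing to watch in the code above: `search` calls `raise_for_status()`, so a single failing layer raises out of `asyncio.gather` and the whole turn is lost. A possible variation, sketched here with a hypothetical helper named `gather_layers`, passes `return_exceptions=True` and reports failed layers by name. The stand-in coroutines exist only so the sketch runs without network access.

```python
import asyncio


async def gather_layers(calls):
    """Await named awaitables together; split outcomes into successes and failures.

    `calls` maps a layer name to an awaitable. A failed layer is reported by
    name instead of its exception aborting the whole research turn.
    """
    names = list(calls)
    outcomes = await asyncio.gather(*calls.values(), return_exceptions=True)
    ok, failed = {}, {}
    for name, outcome in zip(names, outcomes):
        if isinstance(outcome, BaseException):
            failed[name] = repr(outcome)
        else:
            ok[name] = outcome
    return ok, failed


# Stand-in coroutines so the sketch runs without network access.
async def _fake_ok():
    return [{"url": "https://example.com/quote"}]


async def _fake_error():
    raise RuntimeError("upstream timeout")


ok, failed = asyncio.run(gather_layers({"markets": _fake_ok(), "news": _fake_error()}))
print(sorted(ok), sorted(failed))
```

The model can then be told which layers are missing, rather than reasoning as if every layer had answered.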

Each result carries a source URL. Pass those through to the final answer so a human can audit every claim — and so the agent's output is citable, which is what makes it useful in a real workflow.
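One way to carry the URLs through is to number every retrieved item and hand the model a matching source list. This is a minimal sketch: the result keys `title`, `snippet`, and `url` are assumptions about the response shape (the article only states that each result carries a source URL), and `build_grounding` is a hypothetical helper name.

```python
def build_grounding(context, max_per_layer=3):
    """Flatten merged results into numbered, citable lines for the prompt.

    Assumes each result is a dict with hypothetical "title", "snippet",
    and "url" keys; adjust to the real response shape.
    """
    lines, sources = [], []
    for layer, results in context.items():
        for item in results[:max_per_layer]:
            sources.append(item.get("url", ""))
            n = len(sources)  # 1-based citation number
            title = item.get("title", "")
            snippet = item.get("snippet", "")
            lines.append(f"[{n}] ({layer}) {title}: {snippet}")
    source_list = [f"[{i}] {url}" for i, url in enumerate(sources, start=1)]
    return "\n".join(lines), "\n".join(source_list)


# Placeholder data, not real API output.
sample = {
    "markets": [{"title": "Quote", "snippet": "placeholder text", "url": "https://example.com/a"}],
    "news": [{"title": "Headline", "snippet": "placeholder text", "url": "https://example.com/b"}],
}
grounding, sources = build_grounding(sample)
print(grounding)
print(sources)
```

Asking the model to cite the bracketed numbers keeps each claim traceable to one URL in the source list.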

Build vs. buy

| | Assemble it yourself | API Pick |
|---|---|---|
| Vendors / keys | ~5 (Polygon, fundamentals, EDGAR, FRED, news) | 1 |
| Response shapes | 5 to normalize | 1 JSON shape |
| Time to first agent | Weeks of integration | A day |
| Billing | 5 monthly subscriptions | Per-call, only on success |
| LLM-ready | You pre-shape each | Pre-shaped snippets + source URLs |

For one data type, going direct is reasonable. For an agent that needs all five and explores unpredictably, the single-endpoint route ships in a day and bills only when a call succeeds.

What this unlocks

The same five tools power more than ticker lookups: earnings-season briefing agents, sector screens grounded in fundamentals, macro-aware portfolio commentary, and diligence assistants that read the actual 10-K via Extract. The pattern is always the same — grounded tool calls, parallel retrieval, synthesis over real data with sources attached.

Start with a free key (100 credits, no card) and wire the five tools into your agent framework of choice. From there it's prompt-engineering, not plumbing.

Frequently asked questions

What data does an investment-research agent actually need?

Five layers. (1) Real-time markets — prices, crypto, forex, ETFs, movers, for the 'what is it doing now' question. (2) Fundamentals — balance sheet, income statement, cash flow, dividends, insider transactions. (3) SEC filings — 10-K/10-Q/8-K text and earnings transcripts for qualitative signals. (4) Economic indicators — FRED, BLS, World Bank, IMF for the macro backdrop. (5) News — timely catalysts. Most agents fail because they have prices but no fundamentals, or fundamentals but no macro context.

Why not just call Polygon, FRED, and SEC EDGAR directly?

You can — and for a single data type it's fine. The pain is the agent needs all five: that's five vendors, five auth schemes, five rate-limit regimes, five response shapes you must normalize before the LLM can use them, and five bills. The single-endpoint approach trades a small per-call premium for one key, one JSON shape, and only-on-success billing — which for an agent doing exploratory multi-tool calls is usually the cheaper and far faster path to ship.

How do I avoid the LLM hallucinating financial numbers?

Never let the model produce figures from memory. Make each data source a tool, force the agent to call the tool, and pass the returned JSON back as grounding. The model's job is to reason and synthesize over retrieved values, not to recall them. Cite the source URL from each result so the output is auditable — that's also what makes the answer trustworthy for a human reviewer.

Is the output suitable for actual trading or advice?

No. Retrieval API output is informational. It grounds an analyst's or agent's reasoning in real data; it is not investment advice and must not be used as an automated trading signal without a qualified human and proper risk controls. Treat the agent as a research accelerant, not a decision-maker.

How much does a research turn cost?

Billing is per successful call: markets 120 credits, financials 200, sec 120, economic 50, news 15 (1000 credits ≈ $1). A focused turn that hits markets + fundamentals + news is ~335 credits (~$0.34); a lighter macro+news turn is ~65 credits. You only pay on HTTP 200, so failed or empty calls cost nothing — which matters when an agent explores.
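The arithmetic is simple enough to keep in the agent as a budget guard. The sketch below uses the per-call credits and the approximate 1,000-credits-per-dollar rate quoted above; `turn_cost` is a hypothetical helper, and the `succeeded` argument models the only-on-success rule.

```python
# Per-call credit prices as listed in this article.
CREDITS = {"markets": 120, "financials": 200, "sec": 120, "economic": 50, "news": 15}
CREDITS_PER_DOLLAR = 1000  # "1000 credits ≈ $1" per this article


def turn_cost(layers, succeeded=None):
    """Return (credits, dollars) for one turn; only successful calls are billed.

    `layers` lists the layers called. `succeeded`, if given, is the subset
    that returned successfully; by default every call is assumed to succeed.
    """
    billed = layers if succeeded is None else [l for l in layers if l in succeeded]
    credits = sum(CREDITS[layer] for layer in billed)
    return credits, credits / CREDITS_PER_DOLLAR


print(turn_cost(["markets", "financials", "news"]))  # focused ticker turn
print(turn_cost(["economic", "news"]))               # lighter macro + news turn
print(turn_cost(list(CREDITS)))                      # all five layers
```

The first two calls give 335 and 65 credits, matching the figures above; all five layers together come to 505 credits, or about $0.51.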


Author
Sarah Choy
CEO, API Pick

Sarah Choy is the CEO of API Pick, focused on building production-ready APIs for AI agents and LLM workflows.