
How to Build an Investment Research Agent: Markets, Fundamentals, SEC & Economic Data in One API

Sarah Choy · Published June 16, 2026 · 12 min read

An investment-research agent needs five different data layers — prices, fundamentals, filings, macro, and news — each normally a separate vendor, key, and schema. Here's how to wire all five behind one endpoint set, with working code and the cost math.

TL;DR

  • A useful finance agent needs five data layers: real-time markets (prices), company fundamentals (statements), SEC filings, economic indicators, and news. Stitched from separate vendors, that's 5 contracts, 5 keys, and 5 schemas.
  • API Pick exposes all five as consistent JSON search endpoints — /search/markets, /search/financials, /search/sec, /search/economic, /search/news — plus /extract for full documents. One key, pre-shaped for LLM tool calling.
  • The agent pattern: run the relevant endpoints in parallel as tools, merge the JSON, and let the model reason over grounded data instead of hallucinating numbers.
  • Credit pricing is only-on-success: markets 120, financials 200, sec 120, economic 50, news 15 per call. At roughly 1000 credits per dollar, a research turn runs from about $0.07 (economic + news, 65 credits) to about $0.51 (all five layers, 505 credits).
  • Build-vs-buy: assembling Polygon + a fundamentals vendor + SEC EDGAR + FRED + a news API yourself is weeks of integration and 5 monthly bills; the single-endpoint route is a day.

The five-layer problem

Ask an LLM "is NVIDIA expensive right now?" and it will confidently invent a P/E ratio. The fix isn't a bigger model — it's grounding. A research agent that earns trust has to pull live, cited data across five layers, then reason over it:

  • Markets — the current price, market cap, and how it's moved. Crypto, forex, ETFs, and the day's movers when relevant.
  • Fundamentals — balance sheet, income statement, cash flow, dividends, and insider transactions. The "is the business actually healthy" layer.
  • SEC filings — 10-K risk factors, 10-Q detail, 8-K events, earnings-call language. The qualitative dimension numbers miss.
  • Economic indicators — rates, inflation, employment, GDP from FRED, BLS, World Bank, and the IMF. The macro backdrop every thesis sits inside.
  • News — the timely catalyst: a downgrade, a product launch, a regulatory action.

Stitched from separate vendors, that's five contracts, five API keys, five rate-limit regimes, and five response schemas you have to normalize before a model can touch them. The integration is where finance-agent projects stall.

One endpoint set, five layers

API Pick exposes each layer as a consistent JSON search endpoint, so the agent talks to one key and one response shape:

  • Markets Search — global & US equities, crypto, forex, ETFs, funds, commodities, and US market movers.
  • Financials Search — balance sheets, income statements, cash flow, dividends, insider transactions.
  • SEC Filings Search — 10-K/10-Q/8-K, earnings transcripts, equity statistics.
  • Economic Data Search — FRED, BLS, World Bank, IMF, USAspending, Destatis.
  • News Search — date-filtered news across major outlets.
  • Extract — pull a full filing or article to clean markdown when a snippet isn't enough.
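To make these endpoints callable by a model, each one can be described as a tool. The sketch below builds function-calling style definitions for the five search endpoints. The endpoint paths are the ones named in this article; the tool names, descriptions, and parameter schema (`query`, optional `end_date`) are assumptions inferred from the request body used in this article's example, not a documented contract. Extract is left out because its request shape is not shown here.

```python
# Endpoint paths come from this article. Tool names, descriptions, and the
# parameter schema are illustrative assumptions, not the documented API.
ENDPOINTS = {
    "search_markets": ("search/markets", "Prices, market cap, crypto, forex, ETFs, movers."),
    "search_financials": ("search/financials", "Balance sheet, income statement, cash flow, dividends, insider transactions."),
    "search_sec": ("search/sec", "10-K/10-Q/8-K filings and earnings transcripts."),
    "search_economic": ("search/economic", "Macro indicators from FRED, BLS, World Bank, IMF."),
    "search_news": ("search/news", "Date-filtered news."),
}


def build_tool_specs(endpoints=ENDPOINTS):
    """Return one function-calling tool definition per endpoint."""
    specs = []
    for name, (_path, description) in endpoints.items():
        specs.append({
            "type": "function",
            "function": {
                "name": name,
                "description": description,
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {"type": "string", "description": "Natural-language search query."},
                        # Assumed optional filter, mirroring the article's example call.
                        "end_date": {"type": "string", "description": "Optional YYYY-MM-DD upper bound."},
                    },
                    "required": ["query"],
                },
            },
        })
    return specs


def path_for(tool_name, endpoints=ENDPOINTS):
    """Map a tool name chosen by the model back to its endpoint path."""
    return endpoints[tool_name][0]


TOOLS = build_tool_specs()
```

The wrapper shape (`type: function` plus a JSON Schema under `parameters`) is the one several LLM APIs accept; adapt it to whatever your agent framework expects.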

The agent architecture

Register each endpoint as a tool. When a question comes in, the agent decides which layers it needs, calls them in parallel, merges the JSON, and reasons over the grounded result. A ticker question hits markets + fundamentals + news; a "how is the sector positioned" question hits economic + news + a couple of comparables.

import asyncio
import os

import httpx

API = "https://api.apipick.com/v1"
HEADERS = {"x-api-key": os.environ["APIPICK_KEY"], "Content-Type": "application/json"}

async def search(client, path, query, **kw):
    """POST one search request and return its list of results."""
    r = await client.post(f"{API}/{path}", headers=HEADERS,
                          json={"query": query, **kw})
    r.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return r.json()["results"]

async def research(ticker: str):
    """Query all five layers concurrently and merge them into one dict."""
    async with httpx.AsyncClient(timeout=30) as c:
        # gather() returns results in the same order as the calls below.
        markets, fundamentals, filings, macro, news = await asyncio.gather(
            search(c, "search/markets",    f"{ticker} price and market cap"),
            search(c, "search/financials", f"{ticker} latest balance sheet and cash flow"),
            search(c, "search/sec",        f"{ticker} 10-K risk factors", end_date="2026-06-16"),
            search(c, "search/economic",   "US interest rates and inflation latest"),
            search(c, "search/news",       f"{ticker} latest news", end_date="2026-06-16"),
        )
    return {"markets": markets, "fundamentals": fundamentals,
            "filings": filings, "macro": macro, "news": news}

# Feed the merged JSON back to your LLM as grounding, with the source URLs,
# and ask it to synthesize, never to recall numbers.
context = asyncio.run(research("NVDA"))
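One caveat with the example above: `asyncio.gather` raises as soon as any call fails, so a single failed layer discards the other four. A more tolerant variant might look like the sketch below. The names `gather_layers` and `fetch` are hypothetical; `fetch` stands for any coroutine function that takes `(path, query)`, for example `functools.partial(search, client)`.

```python
import asyncio


async def gather_layers(fetch, requests):
    """Run every request concurrently and keep whatever succeeded.

    `fetch` is any coroutine function taking (path, query); `requests` maps a
    layer name to its (path, query) pair. Both names are hypothetical.
    """
    names = list(requests)
    outcomes = await asyncio.gather(
        *(fetch(*requests[name]) for name in names),
        return_exceptions=True,  # a failed layer becomes a value, not a raise
    )
    results, errors = {}, {}
    for name, outcome in zip(names, outcomes):
        if isinstance(outcome, BaseException):
            errors[name] = repr(outcome)  # keep the reason for logging or retry
        else:
            results[name] = outcome
    return results, errors
```

The agent can then reason over the layers that did come back and mention the missing ones in its answer, instead of failing the whole turn.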

Each result carries a source URL. Pass those through to the final answer so a human can audit every claim — and so the agent's output is citable, which is what makes it useful in a real workflow.
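Carrying the URLs through can be as simple as numbering them and asking the model to cite by number. The sketch below flattens a `research()`-style dict into prompt text plus a source list. The `url` field name is an assumption about the result shape, and `build_grounding` is a hypothetical helper, so adjust both to the real response.

```python
import json


def build_grounding(context, url_key="url"):
    """Flatten per-layer results into a numbered source list plus prompt text.

    `context` maps a layer name to a list of result dicts, like the dict
    returned by research(). The `url` field name is an assumption.
    """
    sources, lines = [], []
    for layer, results in context.items():
        for result in results:
            url = result.get(url_key)
            if not url:
                continue  # skip anything that cannot be cited
            if url not in sources:
                sources.append(url)
            ref = sources.index(url) + 1  # the same URL always gets the same number
            payload = {k: v for k, v in result.items() if k != url_key}
            lines.append(f"[{ref}] ({layer}) {json.dumps(payload, sort_keys=True)}")
    prompt = (
        "Answer using only the data below. Cite sources as [n]. "
        "If a figure is not present, say so instead of estimating.\n\n"
        + "\n".join(lines)
        + "\n\nSources:\n"
        + "\n".join(f"[{i}] {u}" for i, u in enumerate(sources, 1))
    )
    return prompt, sources
```

Returning `sources` alongside the prompt lets the application render the citation list itself rather than trusting the model to reproduce URLs verbatim.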

Build vs. buy

| | Assemble it yourself | API Pick |
| --- | --- | --- |
| Vendors / keys | ~5 (Polygon, fundamentals, EDGAR, FRED, news) | 1 |
| Response shapes | 5 to normalize | 1 JSON shape |
| Time to first agent | Weeks of integration | A day |
| Billing | 5 monthly subscriptions | Per-call, only on success |
| LLM-ready | You pre-shape each | Pre-shaped snippets + source URLs |

For one data type, going direct is reasonable. For an agent that needs all five and explores unpredictably, the single-endpoint route ships in a day and bills only when a call succeeds.

What this unlocks

The same five tools power more than ticker lookups: earnings-season briefing agents, sector screens grounded in fundamentals, macro-aware portfolio commentary, and diligence assistants that read the actual 10-K via Extract. The pattern is always the same — grounded tool calls, parallel retrieval, synthesis over real data with sources attached.

Start with a free key (100 credits, no card) and wire the five tools into your agent framework of choice. From there it's prompt-engineering, not plumbing.

Frequently asked questions

What data does an investment-research agent actually need?

Five layers. (1) Real-time markets — prices, crypto, forex, ETFs, movers, for the 'what is it doing now' question. (2) Fundamentals — balance sheet, income statement, cash flow, dividends, insider transactions. (3) SEC filings — 10-K/10-Q/8-K text and earnings transcripts for qualitative signals. (4) Economic indicators — FRED, BLS, World Bank, IMF for the macro backdrop. (5) News — timely catalysts. Most agents fail because they have prices but no fundamentals, or fundamentals but no macro context.

Why not just call Polygon, FRED, and SEC EDGAR directly?

You can — and for a single data type it's fine. The pain is the agent needs all five: that's five vendors, five auth schemes, five rate-limit regimes, five response shapes you must normalize before the LLM can use them, and five bills. The single-endpoint approach trades a small per-call premium for one key, one JSON shape, and only-on-success billing — which for an agent doing exploratory multi-tool calls is usually the cheaper and far faster path to ship.

How do I avoid the LLM hallucinating financial numbers?

Never let the model produce figures from memory. Make each data source a tool, force the agent to call the tool, and pass the returned JSON back as grounding. The model's job is to reason and synthesize over retrieved values, not to recall them. Cite the source URL from each result so the output is auditable — that's also what makes the answer trustworthy for a human reviewer.

Is the output suitable for actual trading or advice?

No. Retrieval API output is informational. It grounds an analyst's or agent's reasoning in real data; it is not investment advice and must not be used as an automated trading signal without a qualified human and proper risk controls. Treat the agent as a research accelerant, not a decision-maker.

How much does a research turn cost?

Billing is per successful call: markets 120 credits, financials 200, sec 120, economic 50, news 15 (1000 credits ≈ $1). A focused turn that hits markets + fundamentals + news is ~335 credits (~$0.34); a lighter macro+news turn is ~65 credits. You only pay on HTTP 200, so failed or empty calls cost nothing — which matters when an agent explores.
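The arithmetic is easy to fold into the agent so it can budget before it explores. A minimal sketch, using the per-call prices and the roughly 1000 credits per dollar rate quoted above; `turn_cost` is a hypothetical helper, and only calls that succeeded should be passed in.

```python
# Per-call credit prices and the ~1000 credits per dollar rate are the
# figures quoted in this article.
CREDITS = {"markets": 120, "financials": 200, "sec": 120, "economic": 50, "news": 15}
CREDITS_PER_USD = 1000


def turn_cost(successful_calls):
    """Return (credits, usd) for the layers that returned successfully."""
    credits = sum(CREDITS[layer] for layer in successful_calls)
    return credits, credits / CREDITS_PER_USD


focused = turn_cost(["markets", "financials", "news"])  # 335 credits
light = turn_cost(["economic", "news"])                 # 65 credits
full = turn_cost(list(CREDITS))                         # all five layers, 505 credits
```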

APIs used in this article

Markets Search
Search global and US equities, crypto, forex, ETFs, funds, commodities, and US market movers. Built for price queries, market data, and AI-powered trading research.
Financials Search
Search balance sheets, income statements, cash flow statements, dividends, and insider transactions for US-listed companies. Built for AI-powered fundamental analysis and due diligence.
SEC Filings Search
Search SEC filings (10-K, 10-Q, 8-K), US earnings call transcripts, and equity statistics. Built for AI-driven due diligence, fundamental analysis, and financial RAG pipelines.
Economic Data Search
Search FRED, the US Bureau of Labor Statistics, World Bank indicators, IMF macro data, US federal spending, and German labor statistics. Built for AI-powered macroeconomic research.
News Search
Real-time news search across major outlets. Filter by date range and country for time-sensitive queries. Built for morning briefings, market-news agents, and RAG pipelines.
URL Content Extract
Extract clean, readable content from up to 25 URLs per call. Strips ads, menus, and boilerplate, and returns Markdown-style text ready for LLM ingestion. 2 credits per URL.
By Sarah Choy
CEO, API Pick

Sarah Choy is the CEO of API Pick. She writes about building production-ready APIs for AI agents and LLM workflows.