SEC Filing Intelligence

VectorFin indexes 10-K annual reports, 10-Q quarterly reports, and 8-K current reports into AI-ready, section-level embeddings for 5,000+ US equity tickers. It is a drop-in retrieval layer for financial RAG, LLM compliance agents, and regulatory-language search, and the index is updated within 24 hours of each filing.

What VectorFin indexes

Each filing is parsed into its constituent sections — MD&A, Risk Factors, Business Overview, Financial Statements — and each section is chunked and vectorized using Google's gemini-embedding-2-preview model, producing 768-dimensional dense vectors.
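To make the chunking step concrete, here is a minimal sketch of splitting one section's text into overlapping word windows before embedding. The strategy itself (fixed-size windows with overlap), the function name chunk_words, and the size and overlap defaults are all assumptions for illustration; this page does not describe how VectorFin actually chunks sections.

```python
def chunk_words(text, size=200, overlap=40):
    """Split text into windows of `size` words that share `overlap` words.

    Hypothetical helper: the window strategy and defaults are assumptions,
    not VectorFin's documented chunking method.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


section_text = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_words(section_text, size=4, overlap=1):
    print(chunk)
```

Overlap keeps a sentence that straddles a window boundary retrievable from at least one chunk, at the cost of storing some words twice.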

All data is bitemporal: every embedding carries an effective_ts (the filing date) and a knowledge_ts (when VectorFin ingested it), enabling point-in-time backtesting without look-ahead bias.
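A point-in-time view can be sketched as a filter on both timestamps: a record is visible at a given moment only if it had been filed and ingested by then. The EmbeddingRecord class, the visible_as_of helper, and the sample tickers below are hypothetical; only the effective_ts and knowledge_ts field names come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class EmbeddingRecord:
    # Hypothetical record shape; only the two timestamp names are from the text.
    ticker: str
    effective_ts: datetime  # the filing date
    knowledge_ts: datetime  # when the record was ingested


def visible_as_of(records, as_of):
    """Return records that were both filed and ingested on or before `as_of`."""
    return [
        r for r in records
        if r.effective_ts <= as_of and r.knowledge_ts <= as_of
    ]


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


records = [
    EmbeddingRecord("AAA", utc(2024, 2, 1), utc(2024, 2, 2)),
    # Filed on Feb 10 but not ingested until Feb 11.
    EmbeddingRecord("BBB", utc(2024, 2, 10), utc(2024, 2, 11)),
]

snapshot = visible_as_of(records, utc(2024, 2, 10))
print([r.ticker for r in snapshot])  # ['AAA']
```

Filtering on knowledge_ts as well as effective_ts is what removes look-ahead bias: on Feb 10 the second filing existed, but a system reading the index that day could not yet have retrieved it.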

Use cases include semantic search across filings, cross-company risk-factor similarity, regulatory language drift detection, and grounding LLM answers through Retrieval-Augmented Generation (RAG) for compliance assistants and research agents.

PRIMARY USE CASE

Ground LLM agents in SEC filings — with citations

Filing-level RAG is one of the most common financial LLM workloads: “Has this issuer's risk-factor language on cybersecurity changed?”, “Summarize the MD&A section on supply chain across peers”, “Find 8-K disclosures of executive departures in the last 90 days.” Each question becomes a vector similarity search followed by LLM synthesis.
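The retrieval half of that pattern can be sketched as a cosine-similarity ranking over stored chunk vectors. This is an illustrative, in-memory version: the chunk ids, the 3-dimensional toy vectors, and the top_k helper are assumptions, and a real deployment would use the 768-dimensional embeddings and most likely a vector index rather than a linear scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunks, k=2):
    """Return the k (score, chunk) pairs most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, c["vector"]), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy data: hypothetical chunk ids and vectors, not real embeddings.
chunks = [
    {"id": "risk-cyber", "vector": [0.9, 0.1, 0.0]},
    {"id": "mdna-supply", "vector": [0.1, 0.9, 0.0]},
    {"id": "exec-departure", "vector": [0.0, 0.1, 0.9]},
]
query_vec = [1.0, 0.0, 0.0]  # stands in for the embedded user question

hits = top_k(query_vec, chunks, k=2)
print([c["id"] for _, c in hits])  # ['risk-cyber', 'mdna-supply']
```

The text of the top-ranked chunks is then passed to the LLM as context for the synthesis step.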

Every VectorFin chunk is keyed by ticker, filing_type, section, and filed_date, so your LLM can cite down to the paragraph. The bitemporal timestamps (effective_ts and knowledge_ts) make point-in-time retrieval possible, which is essential for audit trails and RAG backtests.
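One way to carry those keys into an LLM prompt is to prefix each retrieved passage with a citation tag built from its key, so the model can quote the tag back in its answer. The ChunkKey class, the citation format, the build_prompt helper, and the sample passage are hypothetical; only the four key names come from the description above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ChunkKey:
    # Field names follow the keys listed above; the class itself is hypothetical.
    ticker: str
    filing_type: str
    section: str
    filed_date: date


def format_citation(key):
    """Render a chunk key as a bracketed citation tag (format is an assumption)."""
    return (
        f"[{key.ticker} {key.filing_type}, {key.section}, "
        f"filed {key.filed_date.isoformat()}]"
    )


def build_prompt(question, retrieved):
    """Assemble a prompt in which every source passage carries its citation tag."""
    lines = [f"Question: {question}", "Sources:"]
    for key, text in retrieved:
        lines.append(f"{format_citation(key)} {text}")
    return "\n".join(lines)


key = ChunkKey("AAA", "10-K", "Risk Factors", date(2024, 2, 1))
prompt = build_prompt(
    "Has the cybersecurity risk language changed?",
    [(key, "Placeholder passage text.")],
)
print(prompt)
```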

Start exploring SEC filing embeddings

Free tier: top 100 tickers, 1,000 API calls/month. No credit card required.