# VectorFin dbt Integration

Model VectorFin signals in dbt — define transforms on top of Iceberg tables, document lineage, and schedule with dbt Cloud.
- **20 min** setup time
- **7** Iceberg tables
- **5K+** US tickers
- **Nightly** updates
## Prerequisites

- 📋 VectorFin Pro plan
- 🔑 API key from app.vectorfinancials.com
- ☁️ dbt account
## Connection Guide

### 1. Configure dbt profile for BigQuery or Snowflake

Set up your dbt `profiles.yml` to connect to BigQuery (or Snowflake), where the VectorFin tables are registered.
```yaml
# ~/.dbt/profiles.yml
vectorfin_project:
  target: dev
  outputs:
    dev:
      type: bigquery
      method: service-account
      project: your-gcp-project
      dataset: vectorfin_models
      keyfile: /path/to/service-account.json
      threads: 4
```
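If the VectorFin tables are registered in Snowflake instead, only the connection fields change. A minimal sketch, assuming placeholder account, role, and warehouse names (substitute your own):

```yaml
# ~/.dbt/profiles.yml (Snowflake variant; account, user, role, and
# warehouse values below are placeholders, not VectorFin-provided names)
vectorfin_project:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: your_account_identifier
      user: your_user
      password: "{{ env_var('DBT_SNOWFLAKE_PASSWORD') }}"
      role: transformer
      database: vectorfin
      warehouse: transforming
      schema: vectorfin_models
      threads: 4
```

Keeping the secret in an environment variable via `env_var()` avoids committing credentials alongside the profile.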
### 2. Define VectorFin sources

Declare the VectorFin Iceberg tables as dbt sources in your project.
```yaml
# models/sources.yml
version: 2
sources:
  - name: vectorfin
    database: your-gcp-project
    schema: vectorfin_raw
    tables:
      - name: vf_transcripts
        description: "Earnings call transcript embeddings (768-dim)"
        columns:
          - name: ticker
            description: "Equity ticker symbol"
          - name: fiscal_period
          - name: chunk_idx
          - name: embedding
            description: "768-dim float array from Gemini text-embedding-004"
          - name: effective_ts
          - name: knowledge_ts
      - name: vf_whystock_score
        description: "Composite quant score 0-100"
```
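dbt's built-in generic tests can be declared directly on these source columns, so `dbt test` validates the raw data before any model builds on it. A sketch adding `not_null` checks (which columns to guard is a judgment call for your project):

```yaml
# models/sources.yml (excerpt): generic tests on the raw score table
sources:
  - name: vectorfin
    tables:
      - name: vf_whystock_score
        columns:
          - name: ticker
            tests:
              - not_null
          - name: date
            tests:
              - not_null
          - name: score
            tests:
              - not_null
```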
### 3. Create a signal model

Build a dbt model that transforms raw VectorFin signals into analysis-ready data.
```sql
-- models/signals/daily_top_signals.sql
{{
  config(
    materialized='table',
    partition_by={'field': 'date', 'data_type': 'date'},
    cluster_by=['ticker']
  )
}}

with base as (
  select
    ticker,
    date,
    score,
    json_extract_scalar(components, '$.regime_score') as regime_score,
    json_extract_scalar(components, '$.volatility_score') as vol_score,
    json_extract_scalar(components, '$.sentiment_score') as sentiment_score,
    effective_ts,
    knowledge_ts
  from {{ source('vectorfin', 'vf_whystock_score') }}
  where date >= '2024-01-01'
),

ranked as (
  select
    *,
    rank() over (partition by date order by score desc) as daily_rank
  from base
)

select * from ranked
where daily_rank <= 100
```
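Since VectorFin tables are append-only with nightly loads, an incremental materialization can avoid rebuilding all history on every run. A sketch of the alternative, assuming new rows only ever arrive for new dates (verify that assumption against your load pattern before adopting it):

```sql
-- models/signals/daily_scores_incremental.sql (illustrative model name)
{{
  config(
    materialized='incremental',
    incremental_strategy='insert_overwrite',
    partition_by={'field': 'date', 'data_type': 'date'},
    cluster_by=['ticker']
  )
}}

select
  ticker,
  date,
  score,
  effective_ts,
  knowledge_ts
from {{ source('vectorfin', 'vf_whystock_score') }}
where date >= '2024-01-01'
{% if is_incremental() %}
  -- on incremental runs, only process partitions newer than what's built
  and date > (select max(date) from {{ this }})
{% endif %}
```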
### 4. Run and test your models

Execute the dbt models and run data quality tests on the VectorFin signal data.
```bash
# Run all models
dbt run --select signals/

# Run tests (not-null, accepted-values, etc.)
dbt test --select signals/

# Generate and serve docs
dbt docs generate
dbt docs serve

# Schedule in dbt Cloud: set up a job that runs after the
# VectorFin nightly pipeline completes (~2am UTC)
```

## Available Tables
All 7 VectorFin data tables are bitemporal (`effective_ts` + `knowledge_ts`), append-only, and updated nightly.
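Because every row carries both timestamps, you can reconstruct what the data looked like as of any past date. A sketch of a point-in-time query (the cutoff date and the assumed grain of one row per ticker and date are illustrative):

```sql
-- Scores as they were known on 2024-06-30:
-- effective_ts = when the fact applies, knowledge_ts = when it was loaded
select ticker, date, score
from {{ source('vectorfin', 'vf_whystock_score') }}
where knowledge_ts <= timestamp('2024-06-30')
-- append-only tables can hold restatements; keep only the latest
-- version of each row known at the cutoff (BigQuery/Snowflake QUALIFY)
qualify row_number() over (
  partition by ticker, date
  order by knowledge_ts desc
) = 1
```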
### source(vectorfin, vf_transcripts): raw transcript embeddings source

```sql
select * from {{ source('vectorfin', 'vf_transcripts') }} where ticker = 'AAPL'
```

### source(vectorfin, vf_whystock_score): raw quant score source

```sql
select * from {{ source('vectorfin', 'vf_whystock_score') }} where date >= '2024-01-01'
```

### ref(daily_top_signals): top 100 signals per day model

```sql
select * from {{ ref('daily_top_signals') }} where date = current_date()
```

### ref(regime_summary): regime classification summary

```sql
select ticker, regime, count(*) from {{ ref('regime_summary') }} group by 1, 2
```

### ref(volatility_surface): multi-horizon volatility model

```sql
select ticker, date, garch_vol_1d, garch_vol_21d from {{ ref('volatility_surface') }}
```

## Related Integrations
## Start querying in 20 minutes

Sign up for VectorFin and get immediate API access.