
VectorFin dbt Integration

Model VectorFin signals in dbt: define transforms on top of Iceberg tables, document lineage, and schedule with dbt Cloud.

20 min setup time · 7 Iceberg tables · 5K+ US tickers · Nightly updates

Prerequisites

📋 VectorFin Pro plan
🔑 API key from app.vectorfinancials.com
☁️ dbt account

Connection Guide

1. Configure dbt profile for BigQuery or Snowflake

Set up your dbt profiles.yml to connect to BigQuery (or Snowflake) where VectorFin tables are registered.

yaml
# ~/.dbt/profiles.yml
vectorfin_project:
  target: dev
  outputs:
    dev:
      type: bigquery
      method: service-account
      project: your-gcp-project
      dataset: vectorfin_models
      keyfile: /path/to/service-account.json
      threads: 4
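
If your VectorFin tables are registered in Snowflake instead, only the connection fields change. A minimal sketch; the account, role, warehouse, and database names below are placeholders, not values supplied by VectorFin:

yaml
# ~/.dbt/profiles.yml (Snowflake variant; account/role/warehouse are placeholders)
vectorfin_project:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: your_account_locator
      user: your_user
      password: "{{ env_var('SNOWFLAKE_PASSWORD') }}"
      role: TRANSFORMER
      warehouse: TRANSFORMING
      database: VECTORFIN
      schema: VECTORFIN_MODELS
      threads: 4

Reading the password from an environment variable keeps credentials out of version control.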
2. Define VectorFin sources

Declare VectorFin Iceberg tables as dbt sources in your project.

yaml
# models/sources.yml
version: 2

sources:
  - name: vectorfin
    database: your-gcp-project
    schema: vectorfin_raw
    tables:
      - name: vf_transcripts
        description: "Earnings call transcript embeddings (768-dim)"
        columns:
          - name: ticker
            description: "Equity ticker symbol"
          - name: fiscal_period
          - name: chunk_idx
          - name: embedding
            description: "768-dim float array from Gemini text-embedding-004"
          - name: effective_ts
          - name: knowledge_ts
      - name: vf_whystock_score
        description: "Composite quant score 0-100"
3. Create a signal model

Build a dbt model that transforms raw VectorFin signals into analysis-ready data.

sql
-- models/signals/daily_top_signals.sql
{{
  config(
    materialized='table',
    partition_by={'field': 'date', 'data_type': 'date'},
    cluster_by=['ticker']
  )
}}

with base as (
    select
        ticker,
        date,
        score,
        cast(json_extract_scalar(components, '$.regime_score') as float64) as regime_score,
        cast(json_extract_scalar(components, '$.volatility_score') as float64) as vol_score,
        cast(json_extract_scalar(components, '$.sentiment_score') as float64) as sentiment_score,
        effective_ts,
        knowledge_ts
    from {{ source('vectorfin', 'vf_whystock_score') }}
    where date >= '2024-01-01'
),

ranked as (
    select *,
        rank() over (partition by date order by score desc) as daily_rank
    from base
)

select * from ranked
where daily_rank <= 100
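
Because the source tables are bitemporal and append-only, a score can be restated after the fact. To reproduce what was known as of a given time, filter on knowledge_ts and keep only the latest restatement per key. A sketch; the cutoff timestamp is illustrative:

sql
-- models/signals/score_as_of.sql
-- Point-in-time view: only rows known before the cutoff,
-- deduplicated to the most recent restatement per (ticker, date)
with known as (
    select *
    from {{ source('vectorfin', 'vf_whystock_score') }}
    where knowledge_ts <= timestamp('2024-06-30')  -- illustrative cutoff
),

latest as (
    select *,
        row_number() over (
            partition by ticker, date
            order by knowledge_ts desc
        ) as rn
    from known
)

select ticker, date, score, effective_ts, knowledge_ts
from latest
where rn = 1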
4. Run and test your models

Execute dbt models and run data quality tests on VectorFin signal data.

bash
# Run all models
dbt run --select signals/

# Run tests (not_null, accepted_values, etc.)
dbt test --select signals/

# Generate and serve docs
dbt docs generate
dbt docs serve

# Schedule in dbt Cloud: set up a job that runs after
# VectorFin nightly pipeline completes (~2am UTC)
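
For `dbt test` to have anything to check, declare data tests in a schema file next to the model. A minimal sketch for the model built in step 3, using only the built-in not_null test:

yaml
# models/signals/schema.yml
version: 2

models:
  - name: daily_top_signals
    description: "Top 100 VectorFin signals per trading day"
    columns:
      - name: ticker
        tests:
          - not_null
      - name: score
        tests:
          - not_null
      - name: daily_rank
        tests:
          - not_null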

Available Tables

All 7 VectorFin data tables are bitemporal (effective_ts + knowledge_ts), append-only, and updated nightly. Example sources and models:

source(vectorfin, vf_transcripts): Raw transcript embeddings source
sql
select * from {{ source('vectorfin', 'vf_transcripts') }} where ticker = 'AAPL'
source(vectorfin, vf_whystock_score): Raw quant score source
sql
select * from {{ source('vectorfin', 'vf_whystock_score') }} where date >= '2024-01-01'
ref(daily_top_signals): Top 100 signals per day model
sql
select * from {{ ref('daily_top_signals') }} where date = current_date()
ref(regime_summary): Regime classification summary
sql
select ticker, regime, count(*) from {{ ref('regime_summary') }} group by 1, 2
ref(volatility_surface): Multi-horizon volatility model
sql
select ticker, date, garch_vol_1d, garch_vol_21d from {{ ref('volatility_surface') }}
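
These relations compose like any other dbt model. For example, a sketch joining the score and volatility models on the columns shown above (assuming volatility_surface shares the ticker/date grain):

sql
-- Ad-hoc analysis: do top-ranked names carry elevated short-dated vol?
select
    s.date,
    s.ticker,
    s.score,
    v.garch_vol_1d,
    v.garch_vol_21d,
    v.garch_vol_1d - v.garch_vol_21d as vol_term_spread
from {{ ref('daily_top_signals') }} s
join {{ ref('volatility_surface') }} v
  on s.ticker = v.ticker
 and s.date = v.date
where s.daily_rank <= 20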

Start querying in 20 minutes

Sign up for VectorFin and get immediate API access.