VectorFin

Connect Databricks to VectorFin Financial Iceberg Data

Register VectorFin Iceberg tables in Unity Catalog and query with PySpark or SQL — with full ML pipeline support via MLflow.

15 min
Setup time
7
Iceberg tables
5K+
US tickers
Nightly
Updates

Prerequisites

📋VectorFin Pro plan
🔑API key from app.vectorfinancials.com
☁️Databricks account

Connection Guide

1

Configure GCS credential and external location

Set up Unity Catalog access to VectorFin's GCS bucket.

sql
-- Create storage credential (service account JSON from VectorFin)
CREATE STORAGE CREDENTIAL vf_gcs_cred
  WITH GCS SERVICE ACCOUNT KEY = '<base64-encoded-service-account-json>';

-- Create external location
CREATE EXTERNAL LOCATION vf_warehouse
  URL 'gs://vectorfinancials-data/warehouse'
  WITH (STORAGE CREDENTIAL vf_gcs_cred);

-- Verify access by listing the location's contents
LIST 'gs://vectorfinancials-data/warehouse';
2

Register VectorFin Iceberg tables in Unity Catalog

Create a catalog and register all VectorFin tables.

sql
-- Create a catalog for VectorFin data
CREATE CATALOG IF NOT EXISTS vectorfin;
CREATE SCHEMA IF NOT EXISTS vectorfin.embeddings;
CREATE SCHEMA IF NOT EXISTS vectorfin.signals;

-- Register transcript embeddings table
CREATE TABLE IF NOT EXISTS vectorfin.embeddings.transcripts
USING ICEBERG
LOCATION 'gs://vectorfinancials-data/warehouse/embeddings/transcripts/';

-- Register signals tables
CREATE TABLE IF NOT EXISTS vectorfin.signals.whystock_score
USING ICEBERG
LOCATION 'gs://vectorfinancials-data/warehouse/signals/whystock_score/';

CREATE TABLE IF NOT EXISTS vectorfin.signals.volatility
USING ICEBERG
LOCATION 'gs://vectorfinancials-data/warehouse/signals/volatility/';
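The remaining signal tables follow the same pattern, so the DDL can be generated from a notebook cell instead of written out by hand. A minimal sketch; the `signal_table_ddl` helper and the assumption that each table name mirrors its GCS path are illustrative (on Databricks, each generated statement would be executed with `spark.sql`):

```python
# Generate CREATE TABLE statements for the VectorFin signal tables.
# Assumes each table name mirrors its warehouse path.
SIGNAL_TABLES = ["whystock_score", "volatility", "regime",
                 "sentiment_drift", "anomaly"]
BASE = "gs://vectorfinancials-data/warehouse/signals"

def signal_table_ddl(name: str) -> str:
    """Build the Unity Catalog DDL for one signal table."""
    return (
        f"CREATE TABLE IF NOT EXISTS vectorfin.signals.{name}\n"
        f"USING ICEBERG\n"
        f"LOCATION '{BASE}/{name}/'"
    )

ddl_statements = [signal_table_ddl(t) for t in SIGNAL_TABLES]

# In a Databricks notebook, run each statement:
# for ddl in ddl_statements:
#     spark.sql(ddl)
print(ddl_statements[0])
```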
3

Query with PySpark

Read VectorFin data in a Databricks notebook with PySpark.

python
# Query transcript embeddings
df = spark.table("vectorfin.embeddings.transcripts")
aapl_df = df.filter(df.ticker == "AAPL").filter(df.fiscal_period.startswith("2024"))
display(aapl_df.select("ticker", "fiscal_period", "chunk_idx").limit(20))

# Load quant signals
signals = spark.table("vectorfin.signals.whystock_score")
top_signals = signals.filter(signals.date >= "2024-01-01") \
  .orderBy(signals.score.desc()) \
  .limit(50)
display(top_signals)
4

Semantic search with NumPy

Run embedding similarity search across the full corpus using NumPy.

python
import numpy as np
from pyspark.sql.functions import col

# Load embeddings as numpy arrays
df = spark.table("vectorfin.embeddings.transcripts") \
  .filter(col("ticker") == "NVDA") \
  .select("fiscal_period", "chunk_idx", "embedding") \
  .toPandas()

embeddings = np.stack(df["embedding"].values)

# Your query vector (e.g., from Gemini text-embedding-004)
query_vec = np.array([...])  # 768-dim

# Cosine similarity
scores = embeddings @ query_vec / (np.linalg.norm(embeddings, axis=1) * np.linalg.norm(query_vec))
top_k = df.iloc[scores.argsort()[::-1][:5]]
print(top_k[["fiscal_period", "chunk_idx"]].to_string())
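When running many queries against the same set of chunks, it can be cheaper to L2-normalize the embedding matrix once and use `np.argpartition` to find the top-k in linear time instead of fully sorting the scores. A self-contained sketch with synthetic vectors (the shapes and random data are illustrative stand-ins for the transcript embeddings):

```python
import numpy as np

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 768))  # stand-in for transcript chunk embeddings
query_vec = rng.normal(size=768)           # stand-in for the query embedding

# Normalize rows once; cosine similarity is then a plain dot product.
unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
scores = unit @ (query_vec / np.linalg.norm(query_vec))

# argpartition isolates the k largest scores in O(n); only those k get sorted.
k = 5
top_k_idx = np.argpartition(scores, -k)[-k:]
top_k_idx = top_k_idx[np.argsort(scores[top_k_idx])[::-1]]
print(top_k_idx, scores[top_k_idx])
```

The normalized matrix can be cached and reused across queries, which is the main saving over recomputing norms per query.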

Available Tables

All 7 VectorFin data tables — bitemporal (effective_ts + knowledge_ts), append-only, nightly updates.

vectorfin.embeddings.transcripts: Earnings call chunk embeddings (768-dim)
sql
SELECT ticker, fiscal_period, chunk_idx FROM vectorfin.embeddings.transcripts WHERE ticker = 'GOOGL' LIMIT 10
vectorfin.embeddings.filings: SEC filing section embeddings
sql
SELECT ticker, filing_type, section FROM vectorfin.embeddings.filings WHERE filing_type = '10-K'
vectorfin.signals.whystock_score: Composite quant score (0–100)
sql
SELECT * FROM vectorfin.signals.whystock_score ORDER BY score DESC LIMIT 20
vectorfin.signals.regime: Market regime classification
sql
SELECT ticker, date, regime, confidence FROM vectorfin.signals.regime WHERE confidence > 0.8
vectorfin.signals.volatility: GARCH volatility forecasts
sql
SELECT ticker, date, garch_vol_1d, garch_vol_21d FROM vectorfin.signals.volatility
vectorfin.signals.sentiment_drift: Earnings sentiment drift
sql
SELECT * FROM vectorfin.signals.sentiment_drift WHERE fiscal_period >= '2024-Q1'
vectorfin.signals.anomaly: Anomaly detection scores
sql
SELECT * FROM vectorfin.signals.anomaly WHERE anomaly_score > 0.8 ORDER BY date DESC
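Because the tables are bitemporal and append-only, a point-in-time ("as-of") read filters on knowledge_ts and keeps the latest revision of each row. A minimal pandas sketch with made-up rows; the `as_of` helper is illustrative, and on Databricks the same logic maps to a PySpark filter plus window function:

```python
import pandas as pd

# Toy append-only history: AAPL's 2024-03-01 score was revised on 2024-03-10.
rows = pd.DataFrame({
    "ticker":       ["AAPL", "AAPL", "MSFT"],
    "effective_ts": pd.to_datetime(["2024-03-01", "2024-03-01", "2024-03-01"]),
    "knowledge_ts": pd.to_datetime(["2024-03-02", "2024-03-10", "2024-03-02"]),
    "score":        [61.0, 58.5, 72.0],
})

def as_of(df: pd.DataFrame, knowledge_cutoff: str) -> pd.DataFrame:
    """Latest revision of each (ticker, effective_ts) known by the cutoff."""
    known = df[df["knowledge_ts"] <= knowledge_cutoff]
    return known.sort_values("knowledge_ts").groupby(
        ["ticker", "effective_ts"], as_index=False).last()

# What was known on 2024-03-05 vs. after the revision:
print(as_of(rows, "2024-03-05")[["ticker", "score"]])
print(as_of(rows, "2024-03-15")[["ticker", "score"]])
```

Queries that omit the knowledge_ts filter see the most recent revisions; fixing the cutoff reproduces exactly what was known on a given date, which is what makes the signals usable for leakage-free backtests.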

Start querying in 15 minutes

Sign up for VectorFin and get immediate API access.