Connect Databricks to VectorFin via Polaris
Add VectorFin as a foreign Polaris catalog in Unity Catalog. Available on Pro and Enterprise plans. The Polaris credential is self-serve; raw GCS reads require a one-time service account (SA) grant from support.
Prerequisites
Connection Guide
Provision a Polaris credential
This is the same self-serve flow as the Snowflake setup: in the dashboard, go to Data Access → Provision. Save the catalog URI, warehouse, client_id, and client_secret. The secret is shown only once.
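Before wiring the credential into Unity Catalog, it can be worth confirming that it works. The following is a minimal sketch, not an official VectorFin client. It assumes the catalog exposes the standard Iceberg REST token endpoint at <catalog URI>/v1/oauth/tokens and accepts the scope PRINCIPAL_ROLE:ALL; the function names and the environment variable names VECTORFIN_CLIENT_ID and VECTORFIN_CLIENT_SECRET are hypothetical.

```python
import json
import os
import urllib.parse
import urllib.request

CATALOG_URI = "https://catalog.vectorfinancials.com/api/catalog"


def build_token_request(catalog_uri, client_id, client_secret,
                        scope="PRINCIPAL_ROLE:ALL"):
    """Build (but do not send) an OAuth client_credentials token request."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    }).encode("utf-8")
    return urllib.request.Request(
        # Assumed endpoint path (Iceberg REST catalog convention).
        url=catalog_uri.rstrip("/") + "/v1/oauth/tokens",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def fetch_token(request, timeout=10):
    """Send the request and return the access token from the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)["access_token"]


# Usage (performs a network call, so it is left commented out):
# req = build_token_request(
#     CATALOG_URI,
#     os.environ["VECTORFIN_CLIENT_ID"],      # hypothetical variable name
#     os.environ["VECTORFIN_CLIENT_SECRET"],  # hypothetical variable name
# )
# print("credential OK, token length:", len(fetch_token(req)))
```

Reading the secret from the environment keeps it out of notebooks and shell history, which matters because it cannot be displayed again after provisioning.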
-- You'll need these for the Unity Catalog connection:
-- Catalog URI: https://catalog.vectorfinancials.com/api/catalog
-- Warehouse: vectorfin
-- Client ID + Secret from /dashboard/data-access
Create a Unity Catalog connection (workspace admin)
In the Databricks workspace, go to Catalog → External data → Connections → Create. For the connection type, choose HTTP or Apache Iceberg, depending on your workspace version. Point the connection at the Polaris REST catalog and authenticate with OAuth client_credentials.
-- Equivalent SQL (run as a metastore admin):
CREATE CONNECTION vectorfin_polaris TYPE iceberg
OPTIONS (
  uri 'https://catalog.vectorfinancials.com/api/catalog',
  warehouse 'vectorfin',
  credential 'oauth2_client_credentials',
  oauth2_client_id '<your_polaris_client_id>',
  oauth2_client_secret '<your_polaris_client_secret>',
  oauth2_scope 'PRINCIPAL_ROLE:ALL'
);
-- Federate the catalog into Unity Catalog
CREATE FOREIGN CATALOG vectorfin USING CONNECTION vectorfin_polaris;
Request a GCS grant for the Databricks workspace SA
Databricks compute reads Parquet from gs://vectorfinancials-data/warehouse/vectorfin/. Email support@vectorfinancials.com with the GCP service account your Databricks workspace uses (Workspace settings → Compute → Cluster IAM). We apply prefix-scoped roles/storage.objectViewer with an IAM Condition. Turnaround: 1 business day.
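To see which objects a prefix-scoped grant covers, the check can be mimicked locally. This is an illustrative sketch only: IAM evaluates the condition server-side, and the helper names resource_name and is_readable are hypothetical. It assumes the condition is a startsWith test on the object's resource name, which for GCS objects has the form projects/_/buckets/<bucket>/objects/<object path>.

```python
BUCKET = "vectorfinancials-data"
PREFIX = "warehouse/vectorfin/"

# The string the IAM Condition compares against (assumed form).
CONDITION_PREFIX = f"projects/_/buckets/{BUCKET}/objects/{PREFIX}"


def resource_name(gs_uri):
    """Convert gs://bucket/path into the IAM resource-name form."""
    if not gs_uri.startswith("gs://"):
        raise ValueError(f"not a gs:// URI: {gs_uri!r}")
    bucket, _, obj = gs_uri[len("gs://"):].partition("/")
    return f"projects/_/buckets/{bucket}/objects/{obj}"


def is_readable(gs_uri):
    """Local stand-in for resource.name.startsWith(<prefix>)."""
    return resource_name(gs_uri).startswith(CONDITION_PREFIX)


in_scope = "gs://vectorfinancials-data/warehouse/vectorfin/signals/part-0.parquet"
out_of_scope = "gs://vectorfinancials-data/warehouse/other/part-0.parquet"
print(is_readable(in_scope), is_readable(out_of_scope))
```

The trailing slash in the prefix matters: without it, a sibling path such as warehouse/vectorfin_private/ would also match.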
# Find your Databricks compute SA in the workspace UI, then send to:
# support@vectorfinancials.com
# subject: Iceberg GCS grant for Databricks <org>
# body: SA email = <sa>@<project>.iam.gserviceaccount.com
#
# We run the equivalent of:
gcloud storage buckets add-iam-policy-binding gs://vectorfinancials-data \
  --member=serviceAccount:<sa>@<project>.iam.gserviceaccount.com \
  --role=roles/storage.objectViewer \
  --condition='expression=resource.name.startsWith("projects/_/buckets/vectorfinancials-data/objects/warehouse/vectorfin/"),title=warehouse-prefix-only'
Query from PySpark or a SQL warehouse
Once federated, VectorFin tables appear in Unity Catalog as vectorfin.{namespace}.{table} and can be queried from any cluster or SQL warehouse whose service account is covered by the GCS grant.
# PySpark
import numpy as np

df = spark.table("vectorfin.signals.whystock_score")
display(
    df.filter("ticker = 'AAPL' AND date >= '2024-01-01'")
    .orderBy("date", ascending=False)
    .limit(50)
)

# Embeddings: pull as numpy for similarity work
emb = (
    spark.table("vectorfin.embeddings.transcripts")
    .filter("ticker = 'NVDA' AND fiscal_period = '2024-Q3'")
    .select("chunk_idx", "embedding")
    .toPandas()
)
E = np.stack(emb["embedding"].values)  # (N, 768)
Available Tables
All 7 VectorFin data tables are bitemporal (effective_ts + knowledge_ts), append-only, and updated nightly.
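Because the tables are bitemporal and append-only, a revised value arrives as a new row rather than an in-place update. A common way to read such data point-in-time is to keep, for each key, the row with the latest knowledge_ts at or before a chosen cutoff. The sketch below shows that selection on plain Python rows. The (ticker, effective_ts) key is an assumption about the schema, and the score values and timestamps are made up for illustration.

```python
from datetime import datetime


def as_of(rows, knowledge_cutoff, key=("ticker", "effective_ts")):
    """Return one row per key: the latest version known by knowledge_cutoff."""
    latest = {}
    for row in rows:
        if row["knowledge_ts"] > knowledge_cutoff:
            continue  # not yet known at the cutoff
        k = tuple(row[c] for c in key)
        if k not in latest or row["knowledge_ts"] > latest[k]["knowledge_ts"]:
            latest[k] = row
    return sorted(latest.values(), key=lambda r: tuple(r[c] for c in key))


# Made-up sample: the 2024-01-02 value is revised on 2024-01-10.
rows = [
    {"ticker": "AAPL", "effective_ts": datetime(2024, 1, 2),
     "knowledge_ts": datetime(2024, 1, 3), "score": 61},
    {"ticker": "AAPL", "effective_ts": datetime(2024, 1, 2),
     "knowledge_ts": datetime(2024, 1, 10), "score": 64},
    {"ticker": "AAPL", "effective_ts": datetime(2024, 1, 3),
     "knowledge_ts": datetime(2024, 1, 4), "score": 58},
]

# What was known on 2024-01-05 (before the revision) vs. at month end.
before = [r["score"] for r in as_of(rows, datetime(2024, 1, 5))]
after = [r["score"] for r in as_of(rows, datetime(2024, 1, 31))]
print(before, after)
```

Filtering on knowledge_ts this way helps keep backtests free of look-ahead, since a revision published later than the cutoff is never selected.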
vectorfin.embeddings.transcripts
Earnings call chunk embeddings (768-dim)
SELECT * FROM vectorfin.embeddings.transcripts WHERE ticker = 'GOOGL' AND fiscal_period = '2024-Q3'

vectorfin.embeddings.filings
SEC filing section embeddings
SELECT ticker, filing_type, section FROM vectorfin.embeddings.filings WHERE filing_type = '10-K'

vectorfin.signals.whystock_score
Composite quant score (0–100)
SELECT * FROM vectorfin.signals.whystock_score ORDER BY score DESC LIMIT 50

vectorfin.signals.regime
Market regime classification
SELECT ticker, date, regime, confidence FROM vectorfin.signals.regime WHERE confidence > 0.8

vectorfin.signals.volatility
GARCH volatility forecasts
SELECT ticker, date, garch_vol_1d, garch_vol_21d FROM vectorfin.signals.volatility

vectorfin.signals.sentiment_drift
Earnings sentiment drift
SELECT * FROM vectorfin.signals.sentiment_drift WHERE fiscal_period >= '2024-Q1'

vectorfin.signals.anomaly
Anomaly scores
SELECT * FROM vectorfin.signals.anomaly WHERE anomaly_score > 0.8 ORDER BY date DESC

Related Integrations