+
VectorFin
Connect BigQuery to VectorFin Iceberg Data
Query VectorFin Iceberg tables directly from BigQuery with external table registration — no ETL, no copies.
5 min
Setup time
7
Iceberg tables
5K+
US tickers
Nightly
Updates
Prerequisites
📋VectorFin Pro plan
🔑API key from app.vectorfinancials.com
☁️BigQuery account
Connection Guide
1
Get API access and service account
Sign up at app.vectorfinancials.com (Pro tier required). Create a BigQuery Connection with your GCS service account that has read access to gs://vectorfinancials-data/.
bash
# Create BigQuery connection to GCS
bq mk --connection \
--connection_type=CLOUD_RESOURCE \
--location=US \
your-project:US.vf_connection
# Grant storage access to the connection service account
gsutil iam ch serviceAccount:SERVICE_ACCOUNT@developer.gserviceaccount.com:roles/storage.objectViewer \
gs://vectorfinancials-data2
Register external Iceberg table
Create external tables pointing to VectorFin's GCS Iceberg warehouse. Run once; re-run after each nightly pipeline update to refresh metadata.
sql
-- Register transcript embeddings table
CREATE OR REPLACE EXTERNAL TABLE `your-project.your_dataset.vf_transcripts`
WITH CONNECTION `us-central1.vf_connection`
OPTIONS (
format = 'ICEBERG',
uris = ['gs://vectorfinancials-data/warehouse/embeddings/transcripts/metadata/LATEST.metadata.json']
);
-- Register whystock score signals table
CREATE OR REPLACE EXTERNAL TABLE `your-project.your_dataset.vf_whystock_score`
WITH CONNECTION `us-central1.vf_connection`
OPTIONS (
format = 'ICEBERG',
uris = ['gs://vectorfinancials-data/warehouse/signals/whystock_score/metadata/LATEST.metadata.json']
);3
Verify with a count query
Run a simple COUNT to confirm the table is accessible and data is flowing.
sql
SELECT COUNT(*) as total_rows, COUNT(DISTINCT ticker) as tickers
FROM `your-project.your_dataset.vf_transcripts`;
-- Expected: ~500K rows, 5000+ tickers4
Query embeddings and signals
Run your first real query — semantic embeddings for AAPL and a bitemporal point-in-time signal lookup.
sql
-- AAPL semantic embeddings for 2024
SELECT ticker, fiscal_period, chunk_idx, embedding
FROM `your-project.your_dataset.vf_transcripts`
WHERE ticker = 'AAPL' AND fiscal_period LIKE '2024%'
ORDER BY fiscal_period, chunk_idx;
-- Bitemporal point-in-time query (as-of Jan 1 2024)
SELECT ticker, date, score, components
FROM `your-project.your_dataset.vf_whystock_score`
WHERE ticker = 'AAPL'
AND knowledge_ts <= TIMESTAMP('2024-01-01')
ORDER BY effective_ts DESC
LIMIT 1;Available Tables
All 7 VectorFin data tables — bitemporal (effective_ts + knowledge_ts), append-only, nightly updates.
embeddings/transcriptsEarnings call chunk embeddings (768-dim)▼
sql
SELECT ticker, fiscal_period, chunk_idx, embedding FROM vf_transcripts WHERE ticker = 'NVDA' LIMIT 10embeddings/filingsSEC filing section embeddings▼
sql
SELECT ticker, filing_type, section, embedding FROM vf_filings WHERE ticker = 'MSFT' AND filing_type = '10-K'signals/whystock_scoreComposite quant score (0–100)▼
sql
SELECT ticker, date, score FROM vf_whystock_score WHERE date >= '2024-01-01' ORDER BY score DESC LIMIT 20signals/regimeMarket regime (trending/ranging/volatile)▼
sql
SELECT ticker, date, regime, confidence FROM vf_regime WHERE ticker = 'AAPL'signals/volatilityGARCH volatility forecasts (1d/5d/21d)▼
sql
SELECT ticker, date, garch_vol_1d, garch_vol_21d FROM vf_volatility WHERE ticker = 'TSLA'signals/sentiment_driftEarnings sentiment drift vectors▼
sql
SELECT ticker, fiscal_period, drift_vector FROM vf_sentiment_drift WHERE ticker = 'AMZN'signals/anomalyAnomaly scores and flags▼
sql
SELECT ticker, date, anomaly_score, flags FROM vf_anomaly WHERE anomaly_score > 0.8Related Integrations
Start querying in 5 minutes
Sign up for VectorFin and get immediate API access.