What is Narrative Drift?

The gradual change in how a company describes itself across successive disclosures or earnings calls, measurable as semantic distance between consecutive-period embeddings.

In Plain English

Narrative drift is the slow movement in how a company talks about itself. Quarter to quarter, year to year, the words a management team uses to describe its business shift, and those shifts carry information. A risk factor that gets longer and more specific, a passage on a product line that turns from confident to cautious, a competitor that starts appearing in the MD&A: each is a small move in the narrative.

The word "drift" matters. These changes are rarely abrupt, and almost never announced. They accumulate at the edges of documents, which is precisely why markets are slow to price them and why measuring them systematically is useful. Narrative drift is the brandable, plain-language name VectorFin uses for what the Filing Change Signal quantifies.

Drift can be measured two ways, and the distinction is the whole point. Lexically, you ask how many words changed. Semantically, you ask whether the meaning changed, using vector embeddings that place similar passages near each other regardless of exact wording. Real drift shows up on both; a boilerplate reshuffle shows up only on the lexical side.

Technical Definition

For consecutive comparable disclosures A and B (same issuer, same form type, adjacent periods), narrative drift over a section is one minus a similarity measure between the two versions of that section:

Lexical drift: 1 − cosine_TF(A, B) or 1 − jaccard(A, B).
Semantic drift: 1 − cosine(embed(A), embed(B)), where embed is a sentence/section embedding model.

Higher values mean more drift. Comparing the two measures yields the lexical-semantic divergence, which isolates substantive drift from formatting noise.
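The definitions above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not VectorFin's implementation: the tokenizer is a deliberately simple hypothetical one, the embedding vectors are hand-written placeholders rather than the output of any model, and the source gives no formula for the lexical-semantic divergence, so the plain difference used here is an assumption.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens (a simple, hypothetical tokenizer)."""
    return re.findall(r"[a-z0-9&']+", text.lower())


def cosine_drift(u, v):
    """1 - cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    if norm == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return 1.0 - dot / norm


def jaccard_drift(a, b):
    """Lexical drift: 1 - |A ∩ B| / |A ∪ B| over distinct tokens."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)


def tf_cosine_drift(a, b):
    """Lexical drift: 1 - cosine similarity of raw term-frequency vectors."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    vocab = sorted(set(ca) | set(cb))
    return cosine_drift([ca[t] for t in vocab], [cb[t] for t in vocab])


def semantic_drift(emb_a, emb_b):
    """Semantic drift: 1 - cosine similarity of two section embeddings."""
    return cosine_drift(emb_a, emb_b)


def divergence(lexical, semantic):
    """Assumed form of lexical-semantic divergence: a plain difference."""
    return lexical - semantic


# Two passages that say roughly the same thing in different words.
prior = "Demand for our cloud services remains strong."
current = "Customer appetite for our hosted offerings continues to be robust."

lex = jaccard_drift(prior, current)
# Placeholder vectors standing in for embed(prior) and embed(current).
sem = semantic_drift([0.6, 0.8], [0.8, 0.6])
print("lexical:", round(lex, 4))
print("semantic:", round(sem, 4))
print("divergence:", round(divergence(lex, sem), 4))
```

Under this assumed definition, a large positive divergence means the wording changed much more than the meaning, which is the pattern a reworded boilerplate passage would produce.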

How VectorFin Uses This

Narrative Drift is the subtitle of VectorFin's Filing Change Signal. The same machinery applies to earnings-call transcripts: the cosine distance between consecutive-quarter mean embeddings from GET /v1/embeddings/{ticker} is a transcript-level drift score you compute client-side. On filings, drift is published directly as the change metrics in the FilingChangeRecord.

Code Example

python

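import numpy as np
import requests

API_KEY = "vf_sk_your_key_here"  # replace with your own key


def mean_embedding(ticker, period):
    """Average the chunk embeddings returned for one ticker and fiscal period."""
    r = requests.get(
        f"https://api.vectorfinancials.com/v1/embeddings/{ticker}",
        params={"fiscal_period": period},
        headers={"X-API-Key": API_KEY},
        timeout=30,
    )
    r.raise_for_status()  # fail loudly on auth or other HTTP errors
    chunks = r.json()
    if not chunks:
        raise ValueError(f"no embeddings returned for {ticker} {period}")
    return np.mean([c["embedding"] for c in chunks], axis=0)


q3 = mean_embedding("MSFT", "2024-Q3")
q4 = mean_embedding("MSFT", "2024-Q4")
# Drift is the cosine distance between consecutive-quarter mean embeddings.
cos = float(np.dot(q3, q4) / (np.linalg.norm(q3) * np.linalg.norm(q4)))
print("narrative drift:", round(1 - cos, 4))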
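The same calculation extends to a run of quarters. The sketch below works offline on chunk embeddings that have already been fetched, and it assumes they are grouped by period in chronological order. The helper names and the two-dimensional toy vectors are hypothetical, chosen only to keep the example small.

```python
import math


def mean_vector(chunks):
    """Element-wise mean of equal-length chunk embeddings."""
    if not chunks:
        raise ValueError("need at least one chunk embedding")
    n = len(chunks)
    return [sum(col) / n for col in zip(*chunks)]


def cosine_distance(u, v):
    """1 - cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    if norm == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return 1.0 - dot / norm


def drift_series(chunks_by_period):
    """Drift score for each consecutive (previous, current) period pair.

    Assumes the dict is ordered chronologically by insertion.
    """
    periods = list(chunks_by_period)
    means = {p: mean_vector(chunks_by_period[p]) for p in periods}
    return {
        (prev, cur): cosine_distance(means[prev], means[cur])
        for prev, cur in zip(periods, periods[1:])
    }


# Toy 2-d embeddings; real ones would come from the embeddings endpoint.
toy = {
    "2024-Q2": [[1.0, 0.0], [1.0, 0.0]],
    "2024-Q3": [[1.0, 0.0], [0.0, 1.0]],
    "2024-Q4": [[0.0, 1.0], [0.0, 1.0]],
}
series = drift_series(toy)
for (prev, cur), score in series.items():
    print(f"{prev} -> {cur}: {score:.4f}")
```

Tracking the series rather than a single pair makes it easier to tell a one-off jump from drift that accumulates over several periods.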
