NLP for Finance

What is Lazy Prices?

The academic finding (Cohen, Malloy & Nguyen, 2020) that firms which change the language of their SEC filings tend to underperform, because markets are slow to react to disclosure changes.

In Plain English

"Lazy Prices" is the title of an influential 2020 study by Lauren Cohen, Christopher Malloy, and Quoc Nguyen, published in the Journal of Finance. Its central observation is simple and a little uncomfortable: most of the time, companies copy their previous filing almost verbatim. When they do materially change the wording of a 10-K or 10-Q, that change tends to be bad news, and the market is slow to act on it.

The "lazy" in the title refers to prices, not filers. The authors found that the information in disclosure changes diffuses into stock prices over months rather than days. A portfolio that shorted the firms with the largest filing changes and went long the firms that changed least earned a meaningful return spread in their sample. The strongest predictive content came from changes in the Risk Factors and MD&A sections, and from changes that added negative or litigation-related language.
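The long-short sort described above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the five-bucket split, the equal weighting, the `long_short_spread` helper, and every number in `firms` are assumptions made up for the example.

```python
from statistics import mean

# Hypothetical toy data: (ticker, similarity to prior-year filing, next-period return).
# All values are invented for illustration; none come from the paper.
firms = [
    ("AAA", 0.99, 0.02), ("BBB", 0.97, 0.01), ("CCC", 0.95, 0.00),
    ("DDD", 0.93, 0.01), ("EEE", 0.90, 0.00), ("FFF", 0.88, 0.01),
    ("GGG", 0.85, 0.00), ("HHH", 0.80, -0.01), ("III", 0.70, -0.01),
    ("JJJ", 0.55, -0.03),
]


def long_short_spread(rows, n_buckets=5):
    """Long the least-changed bucket, short the most-changed bucket (equal-weighted)."""
    ranked = sorted(rows, key=lambda r: r[1])  # ascending similarity
    size = len(ranked) // n_buckets
    if size == 0:
        raise ValueError("need at least n_buckets firms")
    changers = ranked[:size]        # lowest similarity = largest filing change
    non_changers = ranked[-size:]   # highest similarity = smallest filing change
    long_ret = mean(r[2] for r in non_changers)
    short_ret = mean(r[2] for r in changers)
    return long_ret - short_ret


spread = long_short_spread(firms)
print(f"long-short spread: {spread:.4f}")
```

A positive spread here only reflects how the toy returns were chosen. On real data, the sort would be repeated each period using only filings already public at the time.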

The intuition is behavioral. Filings are long, tedious, and mostly unchanged, so few investors diff them year over year. The companies that do edit are usually telling you something. Reading those edits, at scale and consistently, is an edge precisely because so few people do it.

Technical Definition

The original study measures similarity between a firm's filing in year t and its filing in year t−1 using four textual similarity scores: cosine similarity over term-frequency vectors, Jaccard similarity over token sets, a minimum-edit-distance measure, and a simple-difference measure. Lower similarity (greater change) predicts lower future returns. The effect is robust to controls for size, book-to-market, momentum, and accruals, and concentrates in the Risk Factors and MD&A sections.
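To make the first two measures concrete, here is a minimal sketch of cosine similarity over raw term-frequency vectors and Jaccard similarity over token sets. The tokenizer (lowercased alphanumeric runs, numbers kept) and the two sample sentences are assumptions for illustration; the paper's exact preprocessing may differ.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split into alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the raw term-frequency vectors of two texts."""
    tf_a, tf_b = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(count * tf_b[term] for term, count in tf_a.items())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_similarity(a: str, b: str) -> float:
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical year t-1 and year t risk-factor sentences
prior = "We face competition in our markets."
current = "We face intense competition and pending litigation in our markets."

sim_cos = cosine_similarity(prior, current)
sim_jac = jaccard_similarity(prior, current)
print(f"cosine={sim_cos:.3f} jaccard={sim_jac:.3f}")
```

Change is then simply one minus the similarity, so a filing copied verbatim scores a change of zero on both measures.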

Note: this is published academic research describing a historical anomaly. It is not a guarantee of future returns, and any figures from the paper describe the authors' sample, not a VectorFin product's performance.

How VectorFin Uses This

VectorFin's Filing Change Signal operationalizes the Lazy Prices method, with the same TF-cosine and Jaccard measures (numbers kept), and extends it with a semantic dimension over Gemini embeddings to distinguish a boilerplate reshuffle from a substantive change. The signal is delivered point-in-time and bitemporally so a backtest of the anomaly never uses changes that were not yet knowable. VectorFin publishes the measurement, not a returns claim.
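The point-in-time idea can be illustrated with a small as-of filter. This is a sketch under stated assumptions, not VectorFin's implementation: the `knowledge_ts` field, the `as_of` helper, the `XYZ` ticker, and the record values are all hypothetical, and `effective_ts` is assumed to mean the time the filing took effect.

```python
from datetime import datetime, timezone


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


# Hypothetical bitemporal records: `effective_ts` is when the filing applies,
# `knowledge_ts` is when this version of the measurement became available.
records = [
    {"ticker": "XYZ", "effective_ts": utc(2023, 2, 15), "knowledge_ts": utc(2023, 2, 16), "cosine": 0.91},
    # A later revision of the same measurement (for example, after a re-parse)
    {"ticker": "XYZ", "effective_ts": utc(2023, 2, 15), "knowledge_ts": utc(2023, 6, 1), "cosine": 0.88},
    {"ticker": "XYZ", "effective_ts": utc(2024, 2, 14), "knowledge_ts": utc(2024, 2, 15), "cosine": 0.95},
]


def as_of(rows, when):
    """Return, per (ticker, effective_ts), the latest version known at `when`."""
    latest = {}
    for rec in rows:
        if rec["knowledge_ts"] > when:
            continue  # not yet knowable on the backtest date
        key = (rec["ticker"], rec["effective_ts"])
        if key not in latest or rec["knowledge_ts"] > latest[key]["knowledge_ts"]:
            latest[key] = rec
    return list(latest.values())


# A backtest dated 2023-03-01 sees only the first version of the 2023 record
for rec in as_of(records, utc(2023, 3, 1)):
    print(rec["ticker"], rec["effective_ts"].date(), rec["cosine"])
```

Filtering on the knowledge timestamp rather than the filing date is what keeps later revisions from leaking into earlier backtest dates.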

Code Example

python
import requests

# Find the largest year-over-year risk-factor changes for a ticker
resp = requests.get(
    "https://api.vectorfinancials.com/v1/filings/ZG/changes",
    params={"form_type": "10-K", "section": "risk_factors", "limit": 5},
    headers={"X-API-Key": "vf_sk_your_key_here"},
    timeout=30,  # avoid hanging indefinitely on a stalled connection
)
resp.raise_for_status()  # fail loudly on auth or request errors

for rec in resp.json():
    for s in rec["sections"]:
        # Only report sections that parsed cleanly
        if s["parse_status"] == "ok":
            # Change is expressed as 1 - cosine similarity
            print(rec["effective_ts"], "change:", round(1 - s["cosine"], 3),
                  "pctile:", s["change_pctile_universe"])

Put Lazy Prices to work in your pipeline

Pull AI-ready embeddings and signals as Iceberg tables or over the REST API.
