Filing Change Signal

The Filing Change Signal measures how much an issuer's 10-K or 10-Q disclosure language moved year over year, section by section, and whether the meaning moved with the words. It operationalizes the Lazy Prices anomaly. To see coverage and the story behind it, browse the Filing Change Signal dataset.

QUICKSTART

Pulling filing change metrics

GET /v1/filings/{ticker}/changes returns a JSON array of FilingChangeRecord, newest first. Each record pairs a filing with the prior-year filing it was compared to and breaks the change down per section. Authenticate with the X-API-Key header. The endpoint is metered on every tier.

curl

```bash
curl https://api.vectorfinancials.com/v1/filings/AAPL/changes \
  -H "X-API-Key: vf_sk_your_key_here" \
  -G \
  -d "form_type=10-K" \
  -d "section=risk_factors" \
  -d "limit=5"
```

Python (requests)

```python
import requests

resp = requests.get(
    "https://api.vectorfinancials.com/v1/filings/AAPL/changes",
    params={"form_type": "10-K", "section": "risk_factors", "limit": 5},
    headers={"X-API-Key": "vf_sk_your_key_here"},
    timeout=30,  # requests has no default timeout; fail instead of hanging
)
resp.raise_for_status()  # surface HTTP errors before parsing the body

for record in resp.json():
    for sec in record["sections"]:
        if sec["parse_status"] != "ok":
            continue  # honest null: nothing to score
        # The API returns similarities; change is 1 minus similarity.
        lexical = 1 - sec["cosine"]
        semantic = 1 - sec["cosine_embedding"]
        print(
            record["accession"], sec["section_id"],
            f"lexical={lexical:.3f}", f"semantic={semantic:.3f}",
            f"divergence={sec['lex_sem_divergence']:.3f}",
            f"pctile={sec['change_pctile_universe']:.2f}",
            "FORMAT-SWITCH" if sec["format_switch_suspected"] else "",
        )
```

Node (fetch)

```javascript
// Uses top-level await, so run this as an ES module (for example a .mjs file).
const params = new URLSearchParams({
  form_type: "10-K",
  section: "risk_factors",
  limit: "5",
});

const res = await fetch(
  `https://api.vectorfinancials.com/v1/filings/AAPL/changes?${params}`,
  { headers: { "X-API-Key": "vf_sk_your_key_here" } },
);
// fetch does not reject on HTTP error statuses, so check explicitly.
if (!res.ok) throw new Error(`Request failed: HTTP ${res.status}`);
const records = await res.json();

for (const record of records) {
  for (const sec of record.sections) {
    if (sec.parse_status !== "ok") continue; // honest null
    // The API returns similarities; change is 1 minus similarity.
    const lexical = 1 - sec.cosine;
    const semantic = 1 - sec.cosine_embedding;
    console.log(
      record.accession,
      sec.section_id,
      `lexical=${lexical.toFixed(3)}`,
      `semantic=${semantic.toFixed(3)}`,
      `divergence=${sec.lex_sem_divergence.toFixed(3)}`,
      `pctile=${sec.change_pctile_universe.toFixed(2)}`,
      sec.format_switch_suspected ? "FORMAT-SWITCH" : "",
    );
  }
}
```

Endpoint & parameters

Two routes serve the same record shape. Fetch a ticker's history, or pull a single filing by its EDGAR accession.

GET /v1/filings/{ticker}/changes

GET /v1/filings/changes/{accession}

| Param | Type | Meaning |
| --- | --- | --- |
| as_of | string (ISO 8601) | Point-in-time filter. Returns only records with knowledge_ts <= as_of, so percentiles match what was knowable then. Omit for the current view. |
| form_type | 10-K or 10-Q | Restrict to annual or quarterly filings. Omit for both. |
| section | risk_factors, mda, or document | Restrict to one section grain. Omit to return all three per filing. |
| limit | integer | Max records to return, newest first. |
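
As a rough sketch of how the parameters in the table above combine, the helper below builds a request URL for the ticker route using only the standard library. The name `build_changes_url` is hypothetical and the `as_of` timestamp is an arbitrary example; the base URL and parameter names are taken from the quickstart.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.vectorfinancials.com/v1/filings"


def build_changes_url(ticker, as_of=None, form_type=None, section=None, limit=None):
    """Build the ticker-history URL, leaving out any parameter set to None.

    Hypothetical helper for illustration; not part of an official client.
    """
    params = {
        "as_of": as_of,          # ISO 8601 string; omit for the current view
        "form_type": form_type,  # "10-K" or "10-Q"; omit for both
        "section": section,      # "risk_factors", "mda", or "document"
        "limit": limit,          # max records, newest first
    }
    # Drop unset parameters so the API applies its documented defaults.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{BASE_URL}/{ticker}/changes"
    return f"{url}?{query}" if query else url


# Example: the five most recent 10-K records knowable at an arbitrary past date.
url = build_changes_url("AAPL", as_of="2021-06-30T00:00:00Z", form_type="10-K", limit=5)
print(url)
```

Leaving a parameter out entirely, rather than sending an empty value, matches the "omit for ..." wording in the table.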

FilingChangeRecord field reference

Every field on a record and on each entry in its sections array, its type, whether it can be null, and how to read it. Records are bitemporal and append-only.

| Field | Type | Nullable | Meaning & how to use |
| --- | --- | --- | --- |
| ticker | string | no | Equity ticker symbol (e.g. AAPL). Echoes your request. |
| cik | string | no | SEC Central Index Key for the issuer, zero-padded. |
| accession | string | no | EDGAR accession number of the filing being scored. |
| prior_accession | string | yes | Accession of the prior-year filing it was compared against. Null on a first observed filing with no prior-year match. |
| filing_type | 10-K or 10-Q | no | Form type of this filing. |
| comparison | string | no | Pairing key: "yoy_same_type" (10-K vs prior-year 10-K) or "yoy_same_quarter" (10-Q vs the same quarter a year earlier). |
| effective_ts | string (ISO 8601) | no | Period of report: when the filing’s content is effective. The eff clock. |
| knowledge_ts | string (ISO 8601) | no | EDGAR acceptance timestamp: the first instant the change was knowable. The as-of clock; bitemporal key. |
| sections | array | no | Per-section breakdown. One entry per requested section grain (up to risk_factors, mda, document). |
| sections[].section_id | risk_factors, mda, or document | no | Which grain this entry scores. risk_factors = Item 1A; mda = Item 7; document = whole filing. |
| sections[].parse_status | string | no | ok = scored. empty = issuer declared no material changes (honest null, not a zero). not_found = section incorporated by reference / absent. |
| sections[].cosine | number (0 to 1) | yes | TF cosine similarity vs prior year, numbers kept (the CMN method). Lexical change = 1 − cosine. Null unless parse_status = ok. |
| sections[].jaccard | number (0 to 1) | yes | Token-set Jaccard similarity vs prior year. Lexical change = 1 − jaccard. Null unless parse_status = ok. |
| sections[].cosine_embedding | number (0 to 1) | yes | Cosine over 768-dim Gemini section embeddings. Semantic change = 1 − cosine_embedding. Null unless parse_status = ok. |
| sections[].lex_sem_divergence | number | yes | (1 − cosine) − (1 − cosine_embedding). High = words moved more than meaning (reshuffle). Near zero with both high = real shift. |
| sections[].format_switch_suspected | boolean | yes | True when lexical change is high but semantic change is low: a template/format swap. Down-weight these. |
| sections[].change_pctile_universe | number (0 to 1) | yes | Cross-sectional percentile of the change across the universe, ranked as-of this filing’s knowledge_ts. No look-ahead. 0.71 = larger change than 71% of peers known at that time. |
| sections[].word_count | integer | yes | Word count of this section in the current filing. Null unless parse_status = ok. |
| sections[].prior_word_count | integer | yes | Word count of the matched prior-year section. A large delta is itself a tell. |
| sections[].null_reason | string | yes | Why metrics are null, when applicable: no_prior (no prior-year filing to diff), empty_section (present but minimal, e.g. a 10-Q "no material changes" pointer), not_found (section absent or incorporated by reference), ambiguous, or prior_empty_section / prior_not_found / prior_ambiguous (the matched prior section is unusable). Null when parse_status = ok. |

Interpreting the metrics

Start by converting similarity to change: lexical_change = 1 − cosine and semantic_change = 1 − cosine_embedding. A small change is the norm; filings are largely stable from year to year, which is exactly why a big change is informative.

The lexical-semantic divergence is what separates noise from signal. High lexical change with low semantic change means the issuer reworded or reformatted without saying anything new, for instance when counsel cleaned up the template. That case is flagged with format_switch_suspected. When lexical and semantic change both move, the meaning genuinely shifted; that is the Zillow case, and the one worth a human read.
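
The distinction can be sketched as a small classifier over the two similarity scores. The function name `classify_change` and the `high` and `low` cutoffs are assumptions chosen for illustration; the API sets format_switch_suspected with its own rule, which this section does not specify.

```python
def classify_change(cosine, cosine_embedding, high=0.30, low=0.10):
    """Label a section's change from its two similarity scores.

    The `high` and `low` cutoffs are illustrative assumptions, not the
    thresholds behind the API's format_switch_suspected flag.
    """
    lexical = 1 - cosine
    semantic = 1 - cosine_embedding
    divergence = lexical - semantic  # same definition as lex_sem_divergence

    if lexical >= high and semantic <= low:
        label = "reshuffle"   # words moved, meaning did not
    elif lexical >= high and semantic >= high:
        label = "real_shift"  # words and meaning both moved
    elif lexical < low and semantic < low:
        label = "stable"      # the normal year-over-year case
    else:
        label = "mixed"
    return {"lexical": lexical, "semantic": semantic,
            "divergence": divergence, "label": label}


reshuffle = classify_change(cosine=0.55, cosine_embedding=0.97)
real_shift = classify_change(cosine=0.50, cosine_embedding=0.65)
stable = classify_change(cosine=0.98, cosine_embedding=0.99)
print(reshuffle["label"], real_shift["label"], stable["label"])
```

In practice the server-side flag is the safer input to a screen; a local rule like this is mainly useful for seeing how the two scores interact.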

change_pctile_universe puts a single filing in context. Because it is ranked as-of the filing's own knowledge_ts, a value of 0.71 means this change was larger than 71% of the changes that were knowable at that moment, never against filings from the future. That is what keeps a screen built on it point-in-time honest.
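
The as-of ranking idea can be illustrated with a few lines of Python. This is a sketch of the concept only: the function `asof_percentile`, the strict "smaller than" tie rule, and the sample values are assumptions, and the actual universe construction behind change_pctile_universe may differ.

```python
from datetime import datetime


def asof_percentile(change, knowledge_ts, history):
    """Fraction of already-known changes that are smaller than `change`.

    `history` is a list of (knowledge_ts, change) pairs. Only entries known
    at or before `knowledge_ts` are ranked against, so filings accepted
    later cannot leak in. Returns None when nothing was knowable yet.
    """
    known = [c for ts, c in history if ts <= knowledge_ts]
    if not known:
        return None
    return sum(c < change for c in known) / len(known)


# Made-up history of (acceptance time, lexical change) pairs.
history = [
    (datetime(2020, 2, 1), 0.05),
    (datetime(2020, 3, 1), 0.10),
    (datetime(2020, 4, 1), 0.20),
    (datetime(2020, 5, 1), 0.40),
    (datetime(2021, 1, 1), 0.90),  # accepted later; ignored for a 2020 filing
]

pctile = asof_percentile(0.25, datetime(2020, 6, 1), history)
print(pctile)
```

Ranking the same change against the full history instead would give a different number, which is the look-ahead the as-of rule avoids.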

Respect the nulls. A parse_status of empty means the issuer declared no material changes to that section; treat it as “no change reported,” not as a zero score. A parse_status of not_found means the section was absent or incorporated by reference, so there was nothing to diff. We never fabricate a metric to fill the gap, so your screen should skip these rather than impute them.
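
One way to apply this in a screen is to separate scoreable sections from honest nulls and keep a tally of why the rest were skipped. The helper name `split_scoreable` and the sample records (including the accession strings) are made up for illustration; the field names follow the reference table.

```python
from collections import Counter


def split_scoreable(records):
    """Separate scoreable sections from honest nulls across records.

    Returns (rows, skipped). `rows` has one dict per section that can be
    scored; `skipped` counts the rest by null_reason, falling back to
    parse_status when null_reason is missing.
    """
    rows, skipped = [], Counter()
    for record in records:
        for sec in record["sections"]:
            # Defensive: also skip if the metric is missing despite an ok status.
            if sec["parse_status"] != "ok" or sec.get("cosine") is None:
                skipped[sec.get("null_reason") or sec["parse_status"]] += 1
                continue
            rows.append({
                "accession": record["accession"],
                "section_id": sec["section_id"],
                "lexical_change": 1 - sec["cosine"],
            })
    return rows, skipped


# Made-up records shaped like the API response.
sample = [
    {"accession": "example-0001", "sections": [
        {"section_id": "risk_factors", "parse_status": "empty",
         "cosine": None, "null_reason": "empty_section"},
        {"section_id": "mda", "parse_status": "ok",
         "cosine": 0.8, "null_reason": None},
    ]},
    {"accession": "example-0002", "sections": [
        {"section_id": "risk_factors", "parse_status": "not_found",
         "cosine": None, "null_reason": "not_found"},
    ]},
]

rows, skipped = split_scoreable(sample)
print(rows)
print(dict(skipped))
```

Keeping the skip counts makes it easy to check how much of a universe was excluded instead of silently treating missing sections as zero change.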

Methodology & bitemporality

The lexical side follows Cohen, Malloy & Nguyen's “Lazy Prices” (Journal of Finance, 2020): a term-frequency cosine over the section text with numbers retained, paired with a token-set Jaccard. Their finding is that firms which change their disclosure language tend to underperform, and that the language moves months ahead of the price. See the Lazy Prices glossary entry for the paper and a fuller summary. We add a semantic dimension the original paper did not have: cosine over 768-dim Gemini embeddings (gemini-embedding-2-preview), which is what makes the format-versus-substance distinction possible.
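
For intuition, the two lexical measures can be sketched in a few lines. The tokenizer here (lowercase runs of letters and digits) is an assumption made for the example; the tokenization used by the paper and by the API may differ, and the embedding side is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens; numbers are retained."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_cosine(a, b):
    """Cosine similarity of raw term-frequency vectors, or None if either is empty."""
    ta, tb = Counter(tokenize(a)), Counter(tokenize(b))
    norm = (math.sqrt(sum(v * v for v in ta.values()))
            * math.sqrt(sum(v * v for v in tb.values())))
    if norm == 0:
        return None  # nothing to compare; do not invent a score
    dot = sum(ta[t] * tb[t] for t in ta.keys() & tb.keys())
    return dot / norm


def jaccard(a, b):
    """Token-set Jaccard similarity, or None if both texts have no tokens."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    union = sa | sb
    if not union:
        return None
    return len(sa & sb) / len(union)


prior = "We face risks from 3 suppliers"
current = "We face risks from 5 suppliers and new tariffs"

cos = tf_cosine(prior, current)
jac = jaccard(prior, current)
print(f"lexical change (cosine)  = {1 - cos:.3f}")
print(f"lexical change (jaccard) = {1 - jac:.3f}")
```

Because numbers are kept as tokens, changing "3 suppliers" to "5 suppliers" lowers both similarities even though every word is the same.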

Comparisons are like-for-like: a 10-K against the prior-year 10-K, and a 10-Q against the same quarter a year earlier, so a routine seasonal difference between a Q3 and a Q4 is never mistaken for a real edit.
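
The pairing rule can be sketched as a lookup over an issuer's earlier filings. The dict keys `fiscal_year` and `fiscal_quarter`, the function `find_prior`, and the accession strings are hypothetical and exist only for this example; how the service actually matches periods is not described here.

```python
def find_prior(filing, history):
    """Return (prior_accession, comparison) for the like-for-like prior filing.

    Each filing is a dict with hypothetical keys: accession, form_type,
    fiscal_year, and fiscal_quarter (None for a 10-K). Returns None when no
    prior-year match exists, which corresponds to a null prior_accession.
    """
    for cand in history:
        if (cand["form_type"] == filing["form_type"]
                and cand["fiscal_year"] == filing["fiscal_year"] - 1
                and cand["fiscal_quarter"] == filing["fiscal_quarter"]):
            comparison = ("yoy_same_type" if filing["form_type"] == "10-K"
                          else "yoy_same_quarter")
            return cand["accession"], comparison
    return None


# Made-up filing history for one issuer.
history = [
    {"accession": "A-2023-10K", "form_type": "10-K", "fiscal_year": 2023, "fiscal_quarter": None},
    {"accession": "A-2023-Q3", "form_type": "10-Q", "fiscal_year": 2023, "fiscal_quarter": 3},
    {"accession": "A-2023-Q2", "form_type": "10-Q", "fiscal_year": 2023, "fiscal_quarter": 2},
]

q3_2024 = {"accession": "A-2024-Q3", "form_type": "10-Q", "fiscal_year": 2024, "fiscal_quarter": 3}
k_2024 = {"accession": "A-2024-10K", "form_type": "10-K", "fiscal_year": 2024, "fiscal_quarter": None}
q1_2024 = {"accession": "A-2024-Q1", "form_type": "10-Q", "fiscal_year": 2024, "fiscal_quarter": 1}

print(find_prior(q3_2024, history))
print(find_prior(k_2024, history))
print(find_prior(q1_2024, history))
```

A Q3 filing is never paired with the preceding Q2 or 10-K, so seasonal differences between periods do not register as edits.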

Every record is bitemporal. The effective_ts is the period the filing reports on; the knowledge_ts is the EDGAR acceptance time, the moment the change became knowable. For a point-in-time backtest, pass as_of to filter on knowledge_ts. Tables are append-only; we never update a row, so history stays fixed underneath you.

Browse coverage and per-ticker pages on the Filing Change Signal dataset, or jump to the API reference for limits and delivery.