Case study · Lazy Prices
The language moved before the price.
In February 2021, Zillow quietly rewrote the risk factors in its 10-K and added disclosures around its home-buying (iBuying) business while the stock was near its peak. Nobody announced anything. By November 2021 it had wound down Zillow Offers and the stock had fallen roughly 72% from that peak. The Filing Change Signal reads the filing, not the tape, and timestamps the change to the moment it became public.
Item 1A word growth, FY2020
17,244 → 22,834
The largest rewrite of these risk factors in Zillow's history.
Item 1A token-set change
0.195
The largest jaccard move in Zillow's risk-factor history. New Zillow Offers, Home Loans and inventory disclosures, added on top of the retained text.
Knowable
2021-02-12
EDGAR acceptance, months before the price reacted.
Read the filing, not the tape
The Lazy Prices anomaly (Cohen, Malloy & Nguyen, Journal of Finance, 2020) is simple: firms that quietly rewrite their filings tend to underperform, with no announcement effect to mark the moment. The change is in the document. The price reacts later.
The Filing Change Signal exposes three metrics per section, raw, never collapsed into a buy or sell score:
- Lexical change (cosine and jaccard): how much the words moved.
- Semantic change: cosine over 768-dim Gemini embeddings; did the meaning move, or only the wording.
- Lexical-semantic divergence: the lead feature. A large change in wording with little change in meaning points to a reshuffle or a format switch. A large change on both is a real shift.
Sections tracked: Risk Factors (Item 1A) and MD&A (Item 7), the two that Cohen, Malloy & Nguyen found most predictive, plus a whole-document fallback.
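The two lexical measures can be sketched in a few lines. This is a minimal illustration, not the production implementation: the tokenizer, the function names and the two sample sentences are assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Hypothetical tokenizer: lowercase alphabetic runs only.
    return re.findall(r"[a-z]+", text.lower())


def jaccard(a: list[str], b: list[str]) -> float:
    # Overlap of the two token sets; 1.0 means identical vocabularies.
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def tf_cosine(a: list[str], b: list[str]) -> float:
    # Cosine similarity between raw term-frequency vectors.
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    if na == 0 or nb == 0:
        return 0.0
    return dot / (na * nb)


# Made-up sentences: the current text retains the prior text and appends to it.
prior = tokenize("we rely on advertising revenue from real estate agents")
current = tokenize(
    "we rely on advertising revenue from real estate agents and we purchase homes"
)

token_set_change = 1 - jaccard(prior, current)  # how much the vocabulary moved
tf_similarity = tf_cosine(prior, current)       # how much the weighting moved
```

Appending new language to retained text adds new tokens to the vocabulary while most term frequencies stay put, which is why a token-set change can be large while term-frequency cosine stays high.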
What happened
2021-02-12
FY2020 10-K accepted by EDGAR. Item 1A grows from 17,244 to 22,834 words. A new “Risk Factor Summary” appears, along with a set of Zillow Offers, Zillow Home Loans and inventory disclosures. The largest rewrite of these risk factors in the company's history.
Feb 2021
The stock trades near its all-time peak, about $212. No announcement accompanies the new filing language, so nothing draws the market's attention to it.
Nov 2021
Zillow winds down Zillow Offers and writes down inventory. The stock is around $60, roughly 72% off the peak.
The 72% drop is context for why narrative change is worth reading. It is not an expected return, and a single case illustrates the anomaly rather than validating a strategy.
From the filing to the number
Ingest
The filing list and both bitemporal timestamps come from VectorFin's filing coverage. Each filing's primary document is fetched from SEC EDGAR.
Parse & normalize
A hardened parser strips the table of contents, anchors on real Item headers, and isolates Item 1A and Item 7. On any failure it emits an honest null status, never a fabricated similarity.
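One way to picture this step is the sketch below. It is a simplified stand-in, not the hardened parser itself: the regular expressions, the longest-span heuristic for skipping table-of-contents entries, the `min_words` cutoff and the status strings are all assumptions made for the example.

```python
import re
from typing import NamedTuple, Optional


class Section(NamedTuple):
    status: str            # hypothetical status values: "ok" or "section_not_found"
    text: Optional[str]    # None whenever the section could not be isolated


# Item headers anchored at the start of a line (assumed header format).
START = re.compile(r"^\s*item\s+1a\.?", re.IGNORECASE | re.MULTILINE)
END = re.compile(r"^\s*item\s+(1b|2)\.?", re.IGNORECASE | re.MULTILINE)


def isolate_item_1a(doc: str, min_words: int = 20) -> Section:
    # A table-of-contents entry also matches START, but the span up to the
    # next header is only a few words long. Keeping the longest span is a
    # simple way to land on the real section (an assumed heuristic).
    best: Optional[str] = None
    for match in START.finditer(doc):
        end = END.search(doc, match.end())
        if end is None:
            continue
        body = doc[match.end():end.start()].strip()
        if best is None or len(body) > len(best):
            best = body
    if best is None or len(best.split()) < min_words:
        # Report the failure explicitly rather than diffing an empty string.
        return Section("section_not_found", None)
    return Section("ok", best)
```

Returning `None` with a status, rather than an empty section, keeps a parse failure from turning into a similarity score downstream.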
Embed & diff
Each section is embedded once with Gemini (768-dim, L2-normalized), then the filing is diffed against the prior comparable filing: token change, term-frequency cosine, semantic change, divergence.
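Because the embeddings are L2-normalized, cosine similarity reduces to a dot product. The sketch below shows that step with toy 3-dimensional vectors standing in for the 768-dimensional embeddings; the function names are hypothetical and no embedding API is called.

```python
import math


def l2_normalize(vec: list[float]) -> list[float]:
    # Scale a vector to unit length.
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def semantic_change(prior: list[float], current: list[float]) -> float:
    # For unit-length vectors the dot product is the cosine similarity,
    # so semantic change is 1 minus that dot product.
    if len(prior) != len(current):
        raise ValueError("embeddings must have the same dimension")
    cosine = sum(p * c for p, c in zip(prior, current))
    return 1.0 - cosine


# Toy vectors, not real embeddings.
prior_vec = l2_normalize([1.0, 0.0, 0.0])
current_vec = l2_normalize([3.0, 4.0, 0.0])
change = semantic_change(prior_vec, current_vec)
```

Normalizing once at embedding time means each later comparison against a prior filing is a single dot product.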
Rank & serve
A point-in-time cross-sectional percentile ranks each filing only against filings knowable at its acceptance time. The record is served from the production API.
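The point-in-time constraint can be sketched as a filter applied before ranking. This is an illustration under stated assumptions: the `FilingChange` record, the percentile definition (share of peers at or below the target) and the sample values are invented for the example and are not the production formula.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class FilingChange(NamedTuple):
    accession: str          # placeholder identifiers in this example
    accepted_at: datetime   # when the filing became public
    change: float           # any change metric, e.g. token-set change


def point_in_time_percentile(
    target: FilingChange, universe: list[FilingChange]
) -> Optional[float]:
    # Only filings already public at the target's acceptance time are peers,
    # so later filings cannot leak into the rank.
    peers = [
        f.change
        for f in universe
        if f.accepted_at <= target.accepted_at and f.accession != target.accession
    ]
    if not peers:
        return None  # nothing to rank against yet
    at_or_below = sum(1 for c in peers if c <= target.change)
    return at_or_below / len(peers)


# Made-up records for illustration only.
universe = [
    FilingChange("A", datetime(2020, 1, 15), 0.05),
    FilingChange("B", datetime(2020, 6, 15), 0.10),
    FilingChange("C", datetime(2021, 2, 12), 0.195),
    FilingChange("D", datetime(2022, 1, 15), 0.50),
]
rank_c = point_in_time_percentile(universe[2], universe)
```

Filing D has the largest change in the set, but it was accepted after C, so it plays no part in C's rank.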
GET /v1/filings/ZG/changes
Zillow's full 10-K history is loaded, so you can reproduce every number on this page. Pass form_type=10-K to walk each year back to 2016. The FY2020 rewrite returns cosine 0.997, jaccard 0.805 (a 0.195 token-set change, the largest in its history) and word growth from 17,244 to 22,834.
Seen in production
The FY2020 rewrite shows up mostly on the lexical side: a large token-set change and sharp word growth as new disclosures are appended, while TF-cosine and meaning hold because the prior text is largely retained. The 10-Q below is the cleaner case, where both axes move at once. Zillow's 10-Q risk factors sat as a 111-word “no material changes” pointer for six straight quarters, then in Q3 2025 expanded to a 560-word update that moved on both lexical and semantic axes.
| Period | Filed | cosine | cosine_embedding | divergence | format_switch | words | status |
|---|---|---|---|---|---|---|---|
| 2024-09-30 | 2024-11-06 | 0.991 | 0.959 | -0.032 | false | 111 | ok |
| 2025-06-30 | 2025-08-06 | 0.991 | 0.969 | -0.022 | false | 111 | ok |
| 2025-09-30 | 2025-10-30 | 0.659 | 0.806 | 0.147 | false | 560 | ok |
Lexical change is 1 − cosine and semantic change is 1 − cosine_embedding, so the 2025-09-30 row reads 0.341 lexical and 0.194 semantic, for a divergence of 0.147. Both are elevated, so the row is flagged as a real disclosure shift (format_switch_suspected false). That row is returned by GET /v1/filings/ZG/changes with form_type=10-Q and section=risk_factors.
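The arithmetic can be checked directly against the three table rows. In the sketch below, divergence is taken as lexical change minus semantic change, which is inferred from the table (it reproduces all three divergence values) rather than quoted from the field reference; the `derive` helper is hypothetical.

```python
# (period, cosine, cosine_embedding) copied from the table above.
rows = [
    ("2024-09-30", 0.991, 0.959),
    ("2025-06-30", 0.991, 0.969),
    ("2025-09-30", 0.659, 0.806),
]


def derive(cosine: float, cosine_embedding: float) -> tuple[float, float, float]:
    lexical = 1 - cosine              # how much the words moved
    semantic = 1 - cosine_embedding   # how much the meaning moved
    divergence = lexical - semantic   # assumed definition, consistent with the table
    return round(lexical, 3), round(semantic, 3), round(divergence, 3)


for period, cosine, cosine_embedding in rows:
    lexical, semantic, divergence = derive(cosine, cosine_embedding)
    print(period, lexical, semantic, divergence)
```

In the two “no material changes” quarters both changes stay below 0.05 and divergence is slightly negative; in the 2025-09-30 row both changes rise together.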
curl
curl https://api.vectorfinancials.com/v1/filings/ZG/changes \
-H "X-API-Key: vf_sk_your_key_here" \
-G \
-d "form_type=10-Q" \
-d "section=risk_factors"
What the change actually was
These paragraphs appeared in the FY2020 Item 1A with no close match in the prior year: the Zillow Offers, Zillow Home Loans and inventory disclosures, taken from the real filing.
The Zillow Home Loans mortgage loan origination business consists of providing purchase money loans to homebuyers and refinancing existing loans. The origination of purchase money mortgage loans by Zillow Home Loans is influenced by customers purchasing homes through Zillow Offers who elect to finance their home through Zillow Home Loans and traditional business clients in the home buying process such as realtors and builders.
We primarily utilize credit facilities with a limited number of counterparties to provide capital for the growth and operation of our Zillow Offers business, including to finance the purchase of homes. If we fail to maintain adequate relationships with potential financial sources or we are unable to renew, refinance or extend our existing credit facilities on favorable terms or at all, we may be unable to maintain sufficient inventory, which would adversely affect our Zillow Offers business and our results of operations.
Through Zillow Offers, we purchase homes, make certain repairs and updates and sell homes back into the market. Zillow Offers has grown rapidly since we started offering the service in April 2018 and it may expose us to a variety of financial, legal, and reputational risks. The Zillow Offers business model and technology is still nascent compared to the business model of the incumbents in the United States residential real estate industry.
We primarily acquire homes directly from consumers and there can be no assurance of an adequate supply of such homes on terms that are attractive to us or meet the criteria required under our financing arrangements. A reduction in the availability of or access to inventory could have an adverse effect on our business, sales and results of operations.
Every record carries the exact prior accession it was diffed against (0001617640-20-000015), so the comparison is auditable.
What you can do with it
Quant desks
Point-in-time change percentile as a factor input.
RAG / LLM builders
“What changed in the risk factors this year,” with citations back to ticker, accession and section.
Risk & diligence
Flag abnormal rewrites on existing holdings.
What we tell you up front
- The anomaly is documented on a historical sample and weakens out of sample. This is a feature input, not a strategy, and we never quote a headline return as an expected return.
- The effect concentrates in smaller, less-covered names and is muted on large caps.
- Format and vendor switches can inflate lexical change. The lexical-semantic divergence field is the mitigation; for Zillow it reads “real shift,” not “format switch.”
- Parse failures and broken filing chains are honest nulls, never zeros.
Read the API and field reference, browse the live Zillow dataset, or read the Lazy Prices glossary entry.
Reference: Cohen, Malloy & Nguyen, “Lazy Prices,” Journal of Finance 75 (2020).