Data Sources
Source strategy for fair wage, living wage, and cost of living calculations.
Current Production Mode
Current outputs use a hybrid mode: published ZIP-level dataset rows when available, with modeled fallback estimates when a ZIP is not yet published.
Runtime order: 24h ZIP cache -> published ZIP dataset row -> model estimate fallback.
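The runtime order above can be sketched as a small resolver. This is a minimal illustration, not the production code: createResolver, fetchPublishedRow, estimateFromModel, and the returned field names are hypothetical, and it assumes model estimates are cached the same way as published rows.

```javascript
// Hypothetical sketch of the lookup order: cache -> published row -> model estimate.
const CACHE_TTL_MS = 24 * 60 * 60 * 1000; // 24-hour cache window

function createResolver({ fetchPublishedRow, estimateFromModel, now = Date.now }) {
  const cache = new Map(); // zip -> { result, storedAt }

  return async function resolveZip(zip) {
    // 1. Serve from the cache while the entry is younger than 24 hours.
    const hit = cache.get(zip);
    if (hit && now() - hit.storedAt < CACHE_TTL_MS) {
      return { ...hit.result, fromCache: true };
    }

    // 2. Try the published dataset. Assumed to resolve to null when the
    //    ZIP has not been published yet.
    const row = await fetchPublishedRow(zip);

    // 3. Fall back to the model estimate only when no row exists.
    const result = row
      ? { source: "published", costs: row }
      : { source: "model", costs: await estimateFromModel(zip) };

    cache.set(zip, { result, storedAt: now() });
    return { ...result, fromCache: false };
  };
}
```

Keeping a source field on every result lets callers tell a published figure from a modeled one, which matters while the two modes coexist.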
Current Source Status
Live now: ZIP location lookup, baseline component model, and state-adjusted multipliers.
Rolling out: published ZIP dataset rows, built through ETL ingestion from HUD, USDA, and CMS sources plus regional benchmarks.
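As a rough illustration of how a baseline component model and state multipliers might combine, the sketch below scales each baseline component by one multiplier per state. The function name, the single-multiplier-per-state shape, and the fallback to 1 for unknown states are all assumptions, and any numbers passed in are placeholders rather than real figures.

```javascript
// Hypothetical sketch: scale baseline monthly component costs by a state multiplier.
function estimateFromBaseline(baseline, stateMultipliers, state) {
  // Assumption: a state without a multiplier uses the unadjusted baseline.
  const multiplier = stateMultipliers[state] ?? 1;
  const costs = {};
  for (const [component, amount] of Object.entries(baseline)) {
    costs[component] = Math.round(amount * multiplier); // whole dollars
  }
  return costs;
}
```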
Official Source Coverage
Housing: HUD Fair Market Rent datasets and API.
HUD FMR Datasets · HUD FMR API
Food: USDA Food Plans (Moderate Cost) monthly reports.
USDA Monthly Food Plan Reports
Healthcare: CMS/Marketplace plan and premium public use files.
ZIP/City/State crosswalks: Census relationship files plus regional joins.
Utilities/Internet/Transport: benchmark indexes from federal datasets (EIA, FCC, BLS), transformed into normalized ZIP-level costs.
How Data Is Pulled
FairWageCalc uses an ETL flow: fetch raw source files, normalize to a fixed schema, build a versioned ZIP dataset, then publish rows to Firestore.
Pipeline scripts live in etl/jobs and functions/scripts/publish-dataset.js.
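The normalize and build steps could look roughly like the sketch below. It is not taken from etl/jobs or publish-dataset.js: the field list, function names, and the shape of the version record are hypothetical, and the Firestore write is left out on purpose.

```javascript
// Hypothetical fixed schema: every ZIP row carries the same monthly cost fields.
const COST_FIELDS = ["housing", "food", "healthcare", "utilities", "internet", "transport"];

// Normalize one raw record into the fixed schema, rejecting incomplete rows.
function normalizeRow(zip, raw) {
  const row = { zip: String(zip).padStart(5, "0") }; // restore leading zeros
  for (const field of COST_FIELDS) {
    const value = raw[field] == null ? NaN : Number(raw[field]);
    if (!Number.isFinite(value) || value < 0) {
      throw new Error(`ZIP ${row.zip}: missing or invalid ${field}`);
    }
    row[field] = Math.round(value * 100) / 100; // dollars, two decimals
  }
  return row;
}

// Build a versioned dataset: one version record plus rows stamped with that version.
function buildDataset(version, rawByZip) {
  const rows = Object.entries(rawByZip).map(([zip, raw]) => normalizeRow(zip, raw));
  return {
    versionRecord: { version, rowCount: rows.length },
    rows: rows.map((row) => ({ ...row, version })),
  };
}
```

Failing loudly on an incomplete row keeps partial data out of a published version, so an unpublished ZIP keeps using the model fallback instead.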
Versioning and Freshness
Each publish writes a dataset version record plus the ZIP rows in zipDataset. At runtime, a lookup checks the 24-hour cache first, then the published ZIP rows, and finally falls back to model estimates.
Recommended refresh cadence: monthly for food/health, annual for rent baselines, and quarterly for benchmark index blends.
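One way to act on this cadence is a staleness check run before each ETL job. The sketch below is illustrative only: the component keys, the lastRefreshed input shape, and the choice to count whole calendar months are assumptions.

```javascript
// Cadence in months, following the recommendation above.
// Key names are hypothetical labels for the source groups.
const REFRESH_MONTHS = { food: 1, healthcare: 1, housing: 12, benchmarks: 3 };

// Count calendar-month boundaries crossed between two dates (UTC).
function monthsBetween(from, to) {
  return (to.getUTCFullYear() - from.getUTCFullYear()) * 12
    + (to.getUTCMonth() - from.getUTCMonth());
}

// Return the components that are due for a refresh.
// A component with no recorded refresh date is treated as stale.
function staleComponents(lastRefreshed, today = new Date()) {
  return Object.keys(REFRESH_MONTHS).filter((name) => {
    const last = lastRefreshed[name];
    return !last || monthsBetween(new Date(last), today) >= REFRESH_MONTHS[name];
  });
}
```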