Metadata-Version: 2.4
Name: median-income
Version: 1.0.0
Summary: Unified median income data across countries, served from a single lookup() call.
Author-email: VitalyL <vitlub@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/VitalyLub/median-income
Project-URL: Issues, https://github.com/VitalyLub/median-income/issues
Project-URL: Changelog, https://github.com/VitalyLub/median-income/blob/main/CHANGELOG.md
Keywords: median income,census,demographics,geography,open data
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pyarrow>=14
Provides-Extra: offline
Requires-Dist: pandas>=2.0; extra == "offline"
Requires-Dist: pyarrow>=14; extra == "offline"
Requires-Dist: requests>=2.31; extra == "offline"
Requires-Dist: pycountry>=23; extra == "offline"
Requires-Dist: xlrd<2; extra == "offline"
Provides-Extra: dev
Requires-Dist: pytest>=8; extra == "dev"
Requires-Dist: ruff>=0.5; extra == "dev"
Requires-Dist: build>=1.0; extra == "dev"
Dynamic: license-file

# median-income

Unified median income data across countries, served from a single `lookup()` call.

National statistics agencies each publish median income in their own schema,
geographies, currency, and reference year. This package merges them offline into
a single Parquet-per-country format and exposes a small lookup API at runtime.

## Install

```bash
pip install median-income
```

The runtime install depends only on `pyarrow`. Bundled per-country Parquet
files ship inside the wheel, so no network calls happen at query time.

## Quickstart

```python
from median_income import lookup

result = lookup(country="US", zipcode="10001")
print(result.median_income)        # 129852.0  (local currency)
print(result.median_income_usd)    # 129852.0  (pre-computed)
print(result.matched_key)          # "zipcode"
print(result.unit)                 # "household"
print(result.data_source)          # "U.S. Census Bureau, ACS 5-Year ..."
print(result.currency)             # "USD"
print(result.year)                 # 2024
```

### Methodological unit

Different sources measure different things — a US Census household figure
is not directly comparable to a per-capita PPP figure or a per-employee
wage. `result.unit` makes this explicit:

| `unit` value | Meaning | Sources |
|---|---|---|
| `"household"` | Raw per-household total income | US Census ACS, StatCan Census, Stats SA IES |
| `"equivalised_household"` | Household income scaled by OECD-modified equivalence | OECD WISE IDD |
| `"per_capita"` | Household income / household size | World Bank PIP via Our World in Data |
| `"per_employee"` | Gross wage of a wage-earning individual | Israel Bituach Leumi |
| `"per_tax_unit"` | Median per tax declaration (single filer or jointly-taxed couple) | Belgium Statbel |

The matching/fallback chain ignores `unit` — it's exposed for visibility so
you can warn or branch in your own code. Two `lookup()` calls for the same
country may legitimately return different units depending on which tier
matched (per-country, OECD, or OWID).

**Methodology gap on bare-country queries.** Per-country files never ship a
country-level row — by design, bare-country queries (no zipcode/city/region)
always resolve via the worldwide OECD/OWID fallback, whose `unit` is
`equivalised_household` (OECD) or `per_capita` (OWID). That differs from the
per-country sources, which mostly use raw `household`. So
`lookup(country='US', zipcode='10001').unit == 'household'` but
`lookup(country='US').unit == 'equivalised_household'` — same country, two
different methodologies. Check `result.unit` if that matters to you, or use
`Client(fallback=False)` to disable the worldwide tier.
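
One way to act on `result.unit` is a small guard that refuses to compare figures measured on different bases. The sketch below is illustrative only: `same_basis` is a hypothetical helper, not part of the package, and the `SimpleNamespace` objects are stand-ins for `lookup()` results so the snippet runs without the bundled data. It assumes nothing beyond the `unit` and `currency` attributes shown in the Quickstart.

```python
from types import SimpleNamespace


def same_basis(a, b):
    """Return True when two results share both unit and currency.

    `a` and `b` are any objects exposing `.unit` and `.currency`,
    such as the results returned by lookup().
    """
    return a.unit == b.unit and a.currency == b.currency


# Stand-ins mirroring the two US queries described above (assumed values).
zip_hit = SimpleNamespace(unit="household", currency="USD")
country_hit = SimpleNamespace(unit="equivalised_household", currency="USD")

if not same_basis(zip_hit, country_hit):
    print(f"not comparable: {zip_hit.unit} vs {country_hit.unit}")
```

In real code the two stand-ins would be replaced by the objects returned from `lookup()`.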

`country` accepts ISO-3166 alpha-2 (`"US"`), alpha-3 (`"USA"`), or common
English names — `"United States"`, `"USA"`, `"U.S.A."`, `"America"`,
`"China"`, `"CHN"`, `"UK"`, `"Britain"`, `"South Korea"`, etc. — all
resolve to the same lookup. Punctuation, case, and accents are ignored.
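
To make "punctuation, case, and accents are ignored" concrete, here is a minimal sketch of that kind of folding using only the standard library. `normalize_name` and the `ALIASES` table are hypothetical names for illustration; the package's actual normalization and alias list may differ.

```python
import unicodedata


def normalize_name(name):
    """Fold a country name: drop accents and punctuation, ignore case."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Keep letters, digits, and spaces; collapse runs of whitespace.
    kept = "".join(c for c in without_accents if c.isalnum() or c.isspace())
    return " ".join(kept.split()).casefold()


# Hypothetical alias table keyed by the folded form.
ALIASES = {"us": "US", "usa": "US", "united states": "US", "america": "US"}

for raw in ["U.S.A.", "usa", "United States", "América"]:
    print(raw, "->", ALIASES.get(normalize_name(raw)))
```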

The function takes any of `zipcode`, `region`, `city` alongside `country`.
The most-specific match wins:

```
zipcode > city > region > country (OECD/OWID fallback only)
```

If `zipcode="10001"` doesn't exist but `city="new york"` does, you get the city
row. The `country` slot is only filled by the worldwide fallback tiers — no
per-country file contributes a country-level row. If nothing matches, you
get `None`.
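
The precedence rule can be pictured as a loop over levels from most to least specific. The following is a toy sketch over an in-memory list of dicts, not the package's implementation; `resolve` and the row layout are assumptions, and the country tier is left out because it is served only by the worldwide fallback.

```python
# Precedence from most to least specific, as listed above.
PRECEDENCE = ("zipcode", "city", "region")


def resolve(rows, **query):
    """Return the most specific row matching the query, or None.

    `rows` is a list of dicts with a "level" key plus the matching
    column (a toy stand-in for a per-country table).
    """
    for level in PRECEDENCE:
        wanted = query.get(level)
        if wanted is None:
            continue
        for row in rows:
            if row["level"] == level and row[level] == wanted:
                return row
    return None


rows = [
    {"level": "city", "city": "new york"},
    {"level": "region", "region": "ny"},
]

# The zipcode is absent from the toy table, so the city row wins.
hit = resolve(rows, zipcode="10001", city="new york", region="ny")
print(hit["level"])  # city
```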

### Memory control: load only what you need

For long-running services or memory-constrained processes, use `Client`
to pin which countries can be loaded into RAM and whether the global
fallback layer is consulted. Each `Client` instance owns its own cache.

```python
from median_income import Client

# Detailed US data + rough global coverage for everywhere else.
# The allow-list only restricts per-country tables; OECD/OWID fallback
# still serves every country.
client = Client(countries=["US"], fallback=True)
client.lookup(country="US", zipcode="10001")  # ACS detail
client.lookup(country="FR")                   # OECD fallback (France isn't allow-listed)

# Hard wall: US-only, no fallback. Non-allowed countries return None.
client = Client(countries=["US"], fallback=False)
client.preload()                 # eager load — avoids first-query latency
client.loaded_countries()        # ["US"]
client.lookup(country="US", zipcode="10001")  # ACS detail
client.lookup(country="ZA")                   # None — not allow-listed, fallback off

# Free everything when you're done.
client.unload()
client.loaded_countries()        # []
```

`countries=` and `fallback=` are orthogonal. The allow-list only controls
which **per-country tables** this client will load; `fallback=True` always
consults the worldwide OECD/OWID tiers for every country regardless of
the allow-list. To build a hard wall that returns `None` for non-allowed
countries, pair `countries=[...]` with `fallback=False`.
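
The interaction of the two options can be written out as a tiny decision function. This is a sketch of the behaviour described above, not code from the package: `tiers_consulted` is a hypothetical name, and treating `allowed=None` as "no allow-list" is an assumption about the default.

```python
def tiers_consulted(country, allowed, fallback):
    """List the tiers a client would consult for `country`.

    `allowed` is the allow-list (None is assumed to mean every country),
    and `fallback` mirrors the Client(fallback=...) flag.
    """
    tiers = []
    if allowed is None or country in allowed:
        tiers.append("per-country")
    if fallback:
        tiers.append("OECD/OWID")
    return tiers


for allowed, fallback in [(["US"], True), (["US"], False)]:
    for country in ("US", "FR"):
        print(allowed, fallback, country, tiers_consulted(country, allowed, fallback))
```

An empty list corresponds to the hard-wall case, where `lookup()` returns `None`.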

`countries=` accepts the same alias forms `lookup()` does — `"USA"`,
`"United States"`, `"Deutschland"` all resolve. `Client(fallback=False)`
disables the OECD/OWID country-level fallback (saves ~13 KB but reduces
country coverage). `unload(country)` drops a single country;
`unload()` clears everything including the fallback tier files.

The module-level `lookup()` is a convenience wrapper around a lazily created
default `Client()`; it is equivalent to `Client().lookup(...)`.

### Debug mode: see every level that matched

```python
# US ships ZCTA (zipcode) rows only, so show_all returns the zip match
# plus both global country-level fallback tiers (OECD, then OWID):
all_hits = lookup(country="US", zipcode="10001", show_all=True)
for r in all_hits:
    print(r.matched_key, round(r.median_income, 2))
# zipcode 129852.0
# country 49885.0      (OECD)
# country 26303.87     (World Bank / OWID)
```

## Schema

Every per-country Parquet file conforms to the same columns:

| Column | Type | Notes |
|---|---|---|
| `country` | string | ISO-3166 alpha-2 |
| `level` | string | One of `country`, `region`, `city`, `zipcode` |
| `zipcode` | string \| null | Country-specific format |
| `city` | string \| null | Normalized lowercase, ASCII-folded |
| `region` | string \| null | State/province/region, normalized |
| `median_income` | float64 | Local currency |
| `median_income_usd` | float64 | Pre-computed via World Bank annual FX |
| `currency` | string | ISO-4217 in most rows; `USD_PPP` on OWID rows (PPP international-$ at 2017 prices, **not** nominal USD) |
| `year` | int32 | Reference year of the survey |
| `fx_rate` | float64 | local→USD rate used; 1.0 for USD and USD_PPP |
| `data_source` | string | Free-text source label |
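
As a rough illustration of the schema, the check below validates one row held as a plain Python dict. `row_problems` and the type mapping are assumptions made for this sketch (the real files are Parquet and typed by Arrow); the example row reuses the Quickstart values, with the truncated source label left as shown there.

```python
LEVELS = {"country", "region", "city", "zipcode"}

# Column name -> accepted Python types, following the table above.
COLUMNS = {
    "country": (str,),
    "level": (str,),
    "zipcode": (str, type(None)),
    "city": (str, type(None)),
    "region": (str, type(None)),
    "median_income": (float,),
    "median_income_usd": (float,),
    "currency": (str,),
    "year": (int,),
    "fx_rate": (float,),
    "data_source": (str,),
}


def row_problems(row):
    """Return a list of human-readable schema problems for one row dict."""
    problems = []
    for name, types in COLUMNS.items():
        if name not in row:
            problems.append(f"missing column: {name}")
        elif not isinstance(row[name], types):
            problems.append(f"bad type for {name}: {type(row[name]).__name__}")
    if row.get("level") not in LEVELS:
        problems.append(f"unknown level: {row.get('level')!r}")
    return problems


example = {
    "country": "US", "level": "zipcode", "zipcode": "10001",
    "city": None, "region": None,
    "median_income": 129852.0, "median_income_usd": 129852.0,
    "currency": "USD", "year": 2024, "fx_rate": 1.0,
    "data_source": "U.S. Census Bureau, ACS 5-Year ...",
}
print(row_problems(example))  # []
```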

**Currency caveat for cross-tier aggregation.** Rows from the OWID tier
ship `currency='USD_PPP'` (PPP international-$ at 2017 prices), not bare
`'USD'`. The label is deliberately non-ISO so consumers averaging or
summing `median_income_usd` across countries can filter on
`currency == 'USD'` to exclude PPP-adjusted values from nominal-USD
aggregates. OECD/per-country rows use real ISO-4217 codes (`USD`, `EUR`,
`JPY`, …).
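
A minimal sketch of such an aggregate, assuming rows are available as dicts with the schema's `currency` and `median_income_usd` columns. `nominal_usd_mean` is a hypothetical helper and the figures are placeholders, not package data. This version drops only the `USD_PPP` rows, so values converted from EUR or JPY stay in the average; a stricter filter on `currency == 'USD'` would also drop those converted rows.

```python
from statistics import mean


def nominal_usd_mean(rows):
    """Average median_income_usd over rows that are not PPP-adjusted."""
    nominal = [r["median_income_usd"] for r in rows if r["currency"] != "USD_PPP"]
    return mean(nominal) if nominal else None


# Placeholder figures for illustration only, not real package data.
rows = [
    {"currency": "USD", "median_income_usd": 50000.0},
    {"currency": "EUR", "median_income_usd": 30000.0},
    {"currency": "USD_PPP", "median_income_usd": 10000.0},
]
print(nominal_usd_mean(rows))  # 40000.0
```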

## Contributing a country

The offline pipeline lives in a separate top-level package
(`median_income_offline`) so it isn't loaded into RAM at runtime.

```bash
pip install "median-income[offline]"
```

To add a new country:

1. Drop the raw source file at `raw/<COUNTRY>.csv` (or `.xlsx`, etc.).
2. Add a new adapter under `median_income_offline/adapters/`, subclassing
   `Adapter` from `base.py`. Implement `fetch()` and `parse()` — `parse()`
   yields `UnifiedRow` objects in **local currency** (FX conversion is handled
   downstream).
3. Register the adapter in `median_income_offline/adapters/__init__.py`.
4. If the local currency isn't yet in `median_income_offline/data/fx_rates.json`,
   add the World Bank PA.NUS.FCRF annual averages.
5. Build:

   ```bash
   python -m median_income_offline build <COUNTRY> --raw raw/
   ```

   This writes `median_income/data/<COUNTRY>.parquet` and updates `_meta.json`.
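
The shape of step 2 might look roughly like the sketch below. Everything here is a stand-in so the snippet runs on its own: the real `Adapter` and `UnifiedRow` live in `median_income_offline` and their fields and signatures may differ, and the country code `XX`, currency `XXX`, and the inline CSV are made up for illustration.

```python
import csv
import io
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class UnifiedRow:
    """Stand-in for the real UnifiedRow; the field list is assumed."""
    country: str
    level: str
    city: Optional[str]
    median_income: float  # local currency, before FX conversion
    currency: str
    year: int


class Adapter(ABC):
    """Stand-in for Adapter in base.py; real signatures may differ."""

    @abstractmethod
    def fetch(self): ...

    @abstractmethod
    def parse(self, raw): ...


class ExampleAdapter(Adapter):
    """Hypothetical adapter for a made-up country code "XX"."""

    def fetch(self):
        # A real adapter would read the raw source file; inline text keeps this runnable.
        return "city,median_income\nExampleville,12345\n"

    def parse(self, raw):
        for record in csv.DictReader(io.StringIO(raw)):
            yield UnifiedRow(
                country="XX",
                level="city",
                city=record["city"].lower(),
                median_income=float(record["median_income"]),
                currency="XXX",
                year=2000,
            )


adapter = ExampleAdapter()
rows = list(adapter.parse(adapter.fetch()))
print(rows[0])
```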

## Data sources

Each source — agency, dataset ID, vintage, geography, methodology caveats,
and exactly which columns we extract — is documented under
[`data_sources/`](data_sources/).

**Per-country (most-specific match):**

- 🇺🇸 [United States](data_sources/US.md) — Census Bureau ACS 5-Year Subject
  Table S1903, vintage 2020–2024. ZCTA-level only; 30,414 zip codes.
- 🇿🇦 [South Africa](data_sources/ZA.md) — Stats SA IES 2022/2023 Figure 3.3.
  8 metropolitan municipalities at city level + common-name aliases
  (Pretoria, Joburg, Durban, Gqeberha, Bloemfontein, Germiston, KuGompo
  City, …); ZAR 2023.
- 🇨🇦 [Canada](data_sources/CA.md) — StatCan 2021 Census Profile
  (98-401-X2021013), Forward Sortation Area level (~1,626 FSAs);
  CAD 2020. Pass FSA in uppercase, e.g. `zipcode='M5V'`.
- 🇮🇱 [Israel](data_sources/IL.md) — Bituach Leumi (Israel National
  Insurance Institute), H1 2025 full salary report
  (`btl.gov.il/About/news/Pages/hodaasachar2025.aspx`). 27 cities at
  city level + common-name aliases (Tel Aviv, Beersheba, Modiin, …);
  ILS 2025. **Caveat: this source reports median *monthly wage per
  employee*, not median household income.** We annualize × 12 to match
  the package's scale, but the methodology is per-employee and
  wage-only — see `data_sources/IL.md`.
- 🇧🇷 [Brazil](data_sources/BR.md) — IBGE Censo Demográfico 2022
  (SIDRA Table 10295, `sidra.ibge.gov.br/tabela/10295`). **5,570
  municipalities at city level** + 27 Unidades da Federação at region
  level (UF acronyms + Brasília alias) as fallback; BRL 2022.
  **Caveat: this source reports median *per-capita* monthly household
  income, not raw household income** (it's household total ÷ household
  size, monthly × 12, and excludes pensioners and domestic-employee
  households). See `data_sources/BR.md`.
- 🇲🇽 [Mexico](data_sources/MX.md) — INEGI ENIGH 2022 (Nueva Serie),
  weighted median of household current income computed from the
  `concentradohogar` microdata at the **state (entidad federativa)
  level** — 32 states + ISO 3166-2:MX acronyms + English/Spanish
  aliases (CDMX, Mexico City, EDOMEX, Edo. Méx., N.L., S.L.P., …);
  MXN 2022. **Caveats: quarterly figures × 4 to annualize; INEGI's
  `ingreso corriente` includes non-monetary components (imputed rent,
  self-consumption) — broader than US ACS money income. State-level
  only** — ENIGH microdata is not statistically representative at
  municipality level. See `data_sources/MX.md`.
- 🇨🇴 [Colombia](data_sources/CO.md) — DANE *Medición de Pobreza
  Monetaria y Desigualdad 2024*, weighted median of household current
  income (`ingtotug`) computed from the `Hogares` microdata at the
  **department level (24 departments)** and **capital-city level (23
  cities)** — Spanish names + ISO 3166-2:CO acronyms + common aliases
  (Bogotá D.C., Distrito Capital, Valle, Guajira, …); COP 2024.
  **Caveats: monthly figures × 12 to annualize; DANE's `ingtotug` is
  total income of the "unidad de gasto" (≈ household) — excludes
  imputed rent. License note: we ship derived medians only, not the
  DANE microdata.** See `data_sources/CO.md`.
- 🇨🇱 [Chile](data_sources/CL.md) — MDSF *Encuesta CASEN 2024*,
  weighted median of household corrected monetary income (`ymonecorh`)
  computed from the survey microdata at the **region level (16
  regions)** and **commune level (220 communes with ≥5,000 weighted
  households)** — Spanish names + ISO 3166-2:CL codes + common aliases
  (Metropolitana / RM, Biobío / Bío-Bío, O'Higgins, Araucanía, …);
  CLP 2024. **Caveats: monthly figures × 12 to annualize; `ymonecorh`
  is disposable / net (subtracts pension + health + tax) and excludes
  imputed rent — bypasses the 2024 methodology break in `ytotcorh`;
  CEPAL-adjusted for high-income underreporting; filter `pco1==1`
  (household heads), weight by `expr` for regions and `expc` for
  communes. License note: we ship derived medians only, not the
  CASEN microdata.** See `data_sources/CL.md`.
- 🇬🇧 [United Kingdom](data_sources/GB.md) — HMRC Survey of Personal
  Incomes, **Table 3.14 Percentiles**, tax year 2023-2024 (50th
  percentile of total income before tax). **9 ITL1 regions + 21
  counties + 6 metropolitan counties + 3 nations (Wales/Scotland/NI)
  at region level, 361 local authorities at city level**, plus
  cleaned-name aliases (Bristol, Lancashire, Tyne & Wear, Newcastle,
  Hull, …) and city-level duplicates of the 9 ITL1 regions so
  `city='London'`/`city='North East'` resolve; GBP 2024. **Caveat:
  per-individual taxpayer, NOT per-household.** Excludes non-taxpayers
  (children, low-income unemployed, welfare-only households). Reads
  higher than a true household median would. See `data_sources/GB.md`.
- 🇹🇷 [Turkey](data_sources/TR.md) — TURKSTAT (TÜİK) *Income
  Distribution Statistics 2025*, median annual equivalised household
  disposable income at **NUTS 2 (`İBBS 2. Düzey`)** level — 26
  regions; İstanbul, İzmir, Ankara also at city level; province-name
  aliases for multi-province regions (Edirne/Kırklareli → TR21,
  Antalya → TR61, Trabzon → TR90, …); TRY 2024 (income reference).
  **Caveat: equivalised disposable income, not raw money income** —
  same methodology as OECD WISE IDD, not directly comparable to US
  ACS. License note: TURKSTAT all-rights-reserved — we ship derived
  medians only. See `data_sources/TR.md`.
- 🇩🇪 [Germany](data_sources/DE.md) — Sozialberichterstattung der
  amtlichen Statistik, **Tabelle A.7 Mediane und
  Armutsgefährdungsschwellen nach Regionen**, Erstergebnisse 2025.
  Median monthly equivalised disposable net income — **16
  Bundesländer at region level with ISO 3166-2:DE 2-letter codes**
  (BW, BY, BE, BB, HB, HH, HE, MV, NI, NW, RP, SL, SN, ST, SH, TH) +
  selected English names (Bavaria, Hesse, Saxony, …); **15 Großstädte
  at city level** (Berlin, München/Munich, Hamburg, Frankfurt am Main,
  Köln/Cologne, …); EUR 2025. **Caveat: equivalised disposable
  income** (OECD-scale, after tax & transfers) — same methodology as
  OECD WISE IDD; not directly comparable to US ACS pre-tax money
  income. Monthly × 12 to annualize. Datenlizenz Deutschland 2.0. See
  `data_sources/DE.md`.
- 🇦🇺 [Australia](data_sources/AU.md) — ABS 2021 Census General
  Community Profile, Table G02, Postal Area (POA) level —
  **~2,635 POAs** (4-digit postcode equivalent). Pass the bare
  postcode, e.g. `zipcode='2000'` (Sydney CBD), `zipcode='3000'`
  (Melbourne CBD). AUD 2021, annualised ×52 from weekly.
  **Caveats: income reported in range bands, then imputed using
  Survey of Income and Housing range medians — methodologically
  different from US ACS exact-dollar estimation.** 5-year Census
  cycle, not annual rolling. POA is a Mesh Block approximation, not
  a legal postal boundary. CC BY 4.0. See `data_sources/AU.md`.
- 🇦🇷 [Argentina](data_sources/AR.md) — INDEC Encuesta Permanente de
  Hogares (EPH), Q4 2025 microdata. Weighted median of total monthly
  household monetary income (ITF) by **province (24 jurisdictions)** —
  Spanish canonical names + ISO 3166-2:AR codes (AR-B, AR-C, AR-X, …)
  + user-friendly aliases (Provincia de Buenos Aires, CABA, Cdad.
  Autónoma de Buenos Aires, Sgo. del Estero, …); ARS 2025. **Caveats:
  urban-only coverage (31 agglomerates, ~85% of population); Q4 2025
  monthly ×12 nominal — Argentina experienced significant inflation
  in 2025; ITF is monetary income only (excludes imputed rent).**
  License: INDEC microdata is public-domain; we ship only derived
  medians. See `data_sources/AR.md`.
- 🇯🇵 [Japan](data_sources/JP.md) — Statistics Bureau, MIC, *2019
  National Survey of Family Income, Consumption and Wealth* (NSFIE),
  Table 20-0. Per-household yearly income median at **city level** —
  158 -shi cities plus Tokyo's 23-ward area (`Tokyo` / `Ku-area`
  aliased to the same value), each city queryable with or without
  the `-shi` suffix (`Sapporo-shi` and `Sapporo` both resolve); JPY
  2019. **Caveats: median is derived by linear interpolation from
  the source's 10 published income-band household counts (NSFIE does
  not publish municipality medians directly); 5-year survey cadence,
  not annual rolling — 2019 wave is nominal 2019 JPY; city-level
  only (smaller municipalities not in the NSFIE sample); "Economic
  regions within pref." aggregates are skipped.** License: e-Stat
  Japan Government Standard Terms of Use Ver 2.0 (CC BY
  4.0-compatible with attribution). See `data_sources/JP.md`.
- 🇫🇷 [France](data_sources/FR.md) — INSEE, *Revenus localisés
  sociaux et fiscaux (Filosofi) 2021*, median disposable income per
  consumption unit at **commune level — 28,186 communes** (after
  dropping INSEE-suppressed rows and commune-name collisions). City
  level only with `region=NULL`; French "régions" are the 18
  administrative régions (Île-de-France, Corse, …) not départements,
  so populating `region` would be a layer mismatch; EUR 2021.
  **Caveats: equivalised disposable income** (OECD-modified scale,
  after tax & transfers) — same `unit` enum as OECD WISE IDD, not
  directly comparable to US ACS pre-tax money income.
  Tax-household universe; excludes collective dwellings (hospitals,
  prisons, barracks, university residences). Open License (ODbL).
  See `data_sources/FR.md`.
- 🇪🇸 [Spain](data_sources/ES.md) — INE, *Atlas de Distribución de
  la Renta de los Hogares (ADRH)*, table 30878, median net disposable
  income per consumption unit at **municipality level — 164
  municipalities of Badajoz province (Extremadura)**. City level only
  with `region=NULL`; EUR 2023. **Caveats: v1 ships only Badajoz
  province**; non-Badajoz Spanish queries fall through to the OECD /
  OWID country-level fallback. Broader coverage (autonomous
  communities, all 50 provinces, all ~8,100 municipalities) is a
  tracked v2 follow-up. **Equivalised disposable income** (OECD-
  modified scale, after tax & transfers; EU-SILC concept) — same
  `unit` enum as OECD WISE IDD, not directly comparable to US ACS
  pre-tax money income. Register-based synthesis (AEAT tax records +
  Seguridad Social + Padrón). CC BY 4.0. See `data_sources/ES.md`.
- 🇮🇹 [Italy](data_sources/IT.md) — Istat, *Indagine EU-SILC su
  Reddito e Condizioni di Vita*, "Condizioni di vita e reddito delle
  famiglie — Anni 2024-2025", **PROSPETTO 5: Reddito netto familiare
  (inclusi gli affitti figurativi) per ripartizione**. Median
  household net income with imputed rents at **macro-region level —
  the 4 Istat ripartizioni** (Nord-Ovest, Nord-Est, Centro, Sud e
  Isole) with English variants (Northwest, Mezzogiorno, …); **all 20
  Italian regions ship as alias rows pointing to their parent
  macro-region** so `region='Lombardia'` / `region='Sicilia'` /
  `region='Toscana'` / etc. resolve, with Italian + English forms
  (Lombardia/Lombardy, Sicilia/Sicily, Sardegna/Sardinia, Toscana/
  Tuscany, Puglia/Apulia, Piemonte/Piedmont, Valle d'Aosta/Aosta
  Valley, Marche/Marches, Trentino-Alto Adige/South Tyrol/Südtirol,
  Friuli-Venezia Giulia/Friuli); EUR 2024. **Caveats: non-equivalised
  household net income, includes imputed rents — broader than US
  ACS pre-tax money income; sub-region values are the macro median,
  not region-specific (Istat doesn't publish 20-region income
  medians).** CC BY 3.0 IT. See `data_sources/IT.md`.
- 🇵🇭 [Philippines](data_sources/PH.md) — PSA, *Family Income and
  Expenditure Survey (FIES) 2023 (preliminary)*, **Table 12: Median
  Family Income, by Region, Province, and HUC**. Median annual family
  income at **region level — 18 administrative regions + 83 provinces**
  (regions carry NCR / CALABARZON / MIMAROPA / SOCCSKSARGEN / BARMM /
  roman-numeral aliases) and **city level — 36 highly-urbanized cities
  / HUCs** (each with bare + "X City" aliases, e.g. Makati/Makati City)
  on the `region=NULL` null-region pattern; PHP 2023. **Caveats:
  per-*family* income (≈ household, stored as `unit='household'`), gross
  and nominal — not directly comparable to US ACS; province figures
  with a highly-urbanized city exclude that city; 2023 is preliminary.**
  National queries fall through to the OECD / OWID country-level
  fallback. Public domain. See `data_sources/PH.md`.
- 🇧🇪 [Belgium](data_sources/BE.md) — Statbel, *Fiscal statistics on net
  taxable income*, **publication table C (fisc2023_C), income year 2023**.
  Median net taxable income at **region level — 3 regions
  (Flanders/Wallonia/Brussels) + 10 provinces** and **city level — 581
  municipalities** on the `region=NULL` null-region pattern. Each level's
  median is computed separately by Statbel (not aggregated). NL/FR/EN/DE
  + curated aliases: regions resolve via `Vlaanderen`/`Wallonie`/`Bruxelles`
  (bare *Brussels* → the region, not the commune), cities via FR/EN exonyms
  (Anvers/Antwerp, Gand/Ghent, Bruges, Louvain, …). **Caveat: the value is a
  median per *tax declaration* — a single filer or a jointly-taxed
  married/legally-cohabiting couple — stored as `unit='per_tax_unit'`; net
  taxable income, before income tax.** EUR 2023. National queries fall
  through to the OECD / OWID country-level fallback. CC BY 4.0. See
  `data_sources/BE.md`.
- 🇬🇹 [Guatemala](data_sources/GT.md) — INE, *Encuesta Nacional de
  Ingresos y Gastos de los Hogares (ENIGH) 2023*, weighted median of
  household current income (`ing_cor`) computed from the Consolidado
  Hogares microdata at the **department level — 22 departments** +
  aliases (Xela, El Quiché, El Petén, Progreso) and English/Spanish
  suffix forms (`Izabal Department`, `Departamento de Izabal`); GTQ
  2023. **Caveats:
  monthly figures × 12 to annualize; we use `ing_cor` (current income),
  NOT `ing_total`, which would add loans/asset-sales/inheritances —
  financing flows, not income. Department-level only** — ENIGH microdata
  is not statistically representative at municipality level. CC BY. See
  `data_sources/GT.md`.
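
Several entries above (Mexico, Colombia, Chile, Argentina, Guatemala) describe a weighted median computed from survey microdata. As a rough sketch of one common convention, the function below returns the smallest value whose cumulative weight reaches half of the total weight. `weighted_median` is a hypothetical helper; the offline adapters may use a different tie-breaking or interpolation rule.

```python
def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half the total weight."""
    if len(values) != len(weights) or not values:
        raise ValueError("values and weights must be non-empty and equal length")
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value


# Three toy households; the middle-income one carries most of the survey weight.
print(weighted_median([100.0, 200.0, 300.0], [1.0, 5.0, 1.0]))  # 200.0
```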

**Country-level fallback** (when zip/city/region miss):

1. [OECD](data_sources/OECD.md) — OECD WISE Income Distribution Database,
   median equivalised disposable income, ~40 countries. **Preferred.**
2. [World Bank / OWID](data_sources/WORLD_BANK.md) — World Bank PIP via
   Our World in Data, daily median income in international-$ (PPP, 2017
   prices), annualized × 365, ~170 countries. Broader coverage, lower
   precision.
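
The annualization factors quoted in the source notes (daily × 365, weekly × 52, monthly × 12, quarterly × 4) amount to a simple multiplication. The helper below is an illustrative sketch with hypothetical names, not the pipeline's own code.

```python
# Periods per year, matching the factors quoted in the source notes.
PERIODS_PER_YEAR = {"daily": 365, "weekly": 52, "monthly": 12, "quarterly": 4, "annual": 1}


def annualize(value, period):
    """Scale a per-period income figure to a yearly figure."""
    try:
        return value * PERIODS_PER_YEAR[period]
    except KeyError:
        raise ValueError(f"unknown period: {period!r}") from None


print(annualize(1000.0, "monthly"))  # 12000.0
print(annualize(20.0, "daily"))      # 7300.0
```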

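The fallback is automatic. Per-country files ship no country-level rows, so a
bare-country query such as `lookup(country='CA')` goes straight to OECD; if
OECD doesn't have the country, it consults World Bank/OWID. The same chain
applies when a zipcode, city, or region query finds no per-country match. Use
`show_all=True` to see every tier that matched.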

## v1 status

- ✅ United States (Census Bureau ACS S1903) — ZCTA-level shipped.
- ✅ South Africa (Stats SA IES 2022/2023) — 8 metros at city level.
- ✅ Canada (StatCan 2021 Census) — ~1,626 FSAs.
- ✅ Israel (Bituach Leumi H1 2025) — 27 cities (wage proxy; see
  `data_sources/IL.md`).
- ✅ Brazil (IBGE Censo 2022) — 5,570 municipalities + 27 UFs
  (per-capita; see `data_sources/BR.md`).
- ✅ Mexico (INEGI ENIGH 2022) — 32 states (weighted median of
  household current income from microdata; see `data_sources/MX.md`).
- ✅ Colombia (DANE MPMD 2024) — 24 departments + 23 capital cities
  (weighted median of `ingtotug` from microdata; see
  `data_sources/CO.md`).
- ✅ Chile (MDSF CASEN 2024) — 16 regions + 220 communes (weighted
  median of `ymonecorh` from microdata; disposable monetary income;
  see `data_sources/CL.md`).
- ✅ United Kingdom (HMRC SPI Table 3.14 2023-24) — 39 region rows +
  361 local authorities + cleanup aliases (per-individual taxpayer
  income, NOT household — see `data_sources/GB.md`).
- ✅ Turkey (TURKSTAT Income Distribution 2025) — 26 NUTS 2 regions
  + Istanbul/Izmir/Ankara at city level + province aliases
  (equivalised disposable income; see `data_sources/TR.md`).
- ✅ Germany (Sozialberichterstattung A.7, Erstergebnisse 2025) —
  16 Bundesländer with ISO 3166-2:DE codes + 15 Großstädte
  (equivalised disposable income; see `data_sources/DE.md`).
- ✅ Australia (ABS 2021 Census G02) — ~2,635 POAs at zipcode level
  (raw household income, weekly ×52; see `data_sources/AU.md`).
- ✅ Argentina (INDEC EPH Q4 2025) — 24 provinces at region level
  (weighted median of ITF from microdata; urban-only; see
  `data_sources/AR.md`).
- ✅ Japan (NSFIE 2019 Table 20-0) — 158 -shi cities + Tokyo's
  Ku-area at city level (median derived by linear interpolation from
  binned counts; see `data_sources/JP.md`).
- ✅ Italy (Istat EU-SILC 2024-2025 PROSPETTO 5) — 4 macro-regions
  (Nord-Ovest, Nord-Est, Centro, Sud e Isole) at region level with
  all 20 Italian regions shipped as aliases pointing to their parent
  macro-region; household net income with imputed rents, not
  equivalised; see `data_sources/IT.md`.
- ✅ France (INSEE Filosofi 2021) — 28,186 communes at city level
  (median disposable income per consumption unit; equivalised, after
  tax & transfers; commune-name collisions and INSEE-suppressed
  rows dropped; see `data_sources/FR.md`).
- ✅ Spain (INE ADRH 2023, table 30878) — 164 Badajoz-province
  municipalities at city level (median net disposable income per
  consumption unit; equivalised, EU-SILC; **v1 covers Badajoz
  province only**; broader Spain coverage is a tracked v2 roadmap
  item; see `data_sources/ES.md`).
- ✅ Philippines (PSA FIES 2023, Table 12) — 18 regions + 83 provinces
  at region level (NCR/CALABARZON/BARMM/roman-numeral aliases) and 36
  highly-urbanized cities at city level (bare + "X City" aliases);
  **median annual *family* income, gross/nominal, stored as
  `unit='household'`; 2023 preliminary**; see `data_sources/PH.md`.
- ✅ Belgium (Statbel fisc2023_C, income year 2023) — 3 regions +
  10 provinces at region level and 581 municipalities at city level,
  with NL/FR/EN/DE + curated aliases (Vlaanderen/Wallonie/Bruxelles,
  Antwerp/Ghent/Bruges, …). **Median net taxable income per *tax
  declaration* (single filer or jointly-taxed couple), stored as
  `unit='per_tax_unit'`; before income tax**; see `data_sources/BE.md`.
- ✅ Guatemala (INE ENIGH 2023) — 22 departments (weighted median of
  household current income `ing_cor` from microdata; monthly × 12;
  see `data_sources/GT.md`).
- ✅ Country-level fallback (OECD + World Bank/OWID) — 45 + 170 countries.
- ⏳ More per-country adapters — to come.

## License

MIT.
