Metadata-Version: 2.4
Name: ato-mcp
Version: 0.8.27
Summary: MCP server for Australian Taxation Office statistics. Plain-English access to personal tax by postcode, company tax by industry, corporate tax transparency, GST collections, super contributions, and the ACNC charity register.
Project-URL: Homepage, https://github.com/Bigred97/ato-mcp
Project-URL: Documentation, https://github.com/Bigred97/ato-mcp#readme
Project-URL: Repository, https://github.com/Bigred97/ato-mcp
Project-URL: Issues, https://github.com/Bigred97/ato-mcp/issues
Project-URL: Changelog, https://github.com/Bigred97/ato-mcp/blob/main/CHANGELOG.md
Project-URL: PyPI, https://pypi.org/project/ato-mcp/
Author: Harry Vass
License: MIT
License-File: LICENSE
Keywords: acnc,ato,australia,australian,charity-register,claude,data-gov-au,fastmcp,fintech,government-data,mcp,model-context-protocol,property-tech,public-data,tax-statistics,taxation
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Financial and Insurance Industry
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Office/Business :: Financial
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: aiosqlite>=0.20
Requires-Dist: aus-identity>=0.1.0
Requires-Dist: fastmcp<4,>=2.0
Requires-Dist: httpx>=0.27
Requires-Dist: openpyxl>=3.1
Requires-Dist: pandas<3,>=2.2
Requires-Dist: pyarrow>=15
Requires-Dist: pydantic>=2.7
Requires-Dist: pyyaml>=6.0
Requires-Dist: rapidfuzz>=3.9
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest>=8; extra == 'dev'
Requires-Dist: respx>=0.21; extra == 'dev'
Requires-Dist: ruff>=0.5; extra == 'dev'
Description-Content-Type: text/markdown

# ato-mcp

mcp-name: io.ausdata/ato-mcp

[![PyPI](https://img.shields.io/pypi/v/ato-mcp.svg)](https://pypi.org/project/ato-mcp/)
[![Python](https://img.shields.io/pypi/pyversions/ato-mcp.svg)](https://pypi.org/project/ato-mcp/)
[![License](https://img.shields.io/pypi/l/ato-mcp.svg)](https://github.com/Bigred97/ato-mcp/blob/main/LICENSE)
[![Tests](https://github.com/Bigred97/ato-mcp/actions/workflows/test.yml/badge.svg)](https://github.com/Bigred97/ato-mcp/actions/workflows/test.yml)
[![CodeQL](https://github.com/Bigred97/ato-mcp/actions/workflows/codeql.yml/badge.svg)](https://github.com/Bigred97/ato-mcp/actions/workflows/codeql.yml)
[![Glama MCP server quality](https://glama.ai/mcp/servers/Bigred97/ato-mcp/badges/score.svg)](https://glama.ai/mcp/servers/Bigred97/ato-mcp)

**MCP server for Australian Taxation Office statistics.** Plain-English access to personal tax by postcode, company tax by industry, corporate tax transparency for every $100M+ entity, super contributions by age, salary by occupation, monthly GST collections, and the live ACNC charity register — all from a single `uvx` command.

> **Hosted access?** For cross-source queries, webhooks, an always-on REST API, and a uniform response envelope across all 9 sources, see **[ausdata.io](https://ausdata.io)** — free tier available (500 calls/mo, no card).

```text
"What's the median taxable income in postcode 2000?"
"How much tax did BHP pay last year?"
"Which industries have the highest gross income?"
"How many Large charities are there in NSW?"
"What's the average super contribution for under-30s in the top tax bracket?"
```

Sister to [abs-mcp](https://github.com/Bigred97/abs-mcp) (Australian Bureau of Statistics), [rba-mcp](https://github.com/Bigred97/rba-mcp) (Reserve Bank of Australia), and [au-weather-mcp](https://github.com/Bigred97/au-weather-mcp) (Australian weather via Open-Meteo + BOM). The four together cover the macro / regulator / tax / climate layer of Australian official data.

---

## Install

```bash
# Run on demand via uvx (recommended)
uvx --upgrade ato-mcp

# Or install permanently
pip install ato-mcp
```

### Claude Desktop

Add to `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "ato": { "command": "uvx", "args": ["--upgrade", "ato-mcp"] }
  }
}
```

> **Why `--upgrade`?** `uvx ato-mcp` (without the flag) uses whatever wheel is cached and never adopts new PyPI releases on its own. `--upgrade` makes uvx check PyPI on each launch and pull a newer release if one exists. To verify which version is currently serving you, look at the `server_version` field on any `DataResponse`.

### Claude Code / Cursor

```bash
claude mcp add ato -- uvx --upgrade ato-mcp
```

## Auto-updating data

Beyond the wheel-level `--upgrade`, the server has a second auto-update path **inside** the data layer. When the ATO publishes a new release (for example, Taxation Statistics 2023-24), ato-mcp resolves the new resource URL via [data.gov.au's CKAN API](https://data.gov.au/data/api/3/action/package_show) at fetch time and uses the freshest match. The URLs hard-coded in the curated YAML are the safe fallback if discovery fails. You do **not** need to wait for a new wheel release to get new yearly data: either delete `~/.ato-mcp/cache.db` to force a refresh, or wait for the 7-day TTL to expire.
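As a rough illustration of the discovery step described above, the sketch below picks the most recently modified resource out of a parsed CKAN `package_show` response and falls back to a fixed URL when nothing usable is found. It is not the package's actual code: the function name `pick_freshest_resource`, the format-matching rule, and the `example.org` URLs are assumptions made for the example. The `success`, `result`, `resources`, `format`, `url`, `last_modified`, and `created` keys are standard CKAN response fields.

```python
from datetime import datetime
from typing import Any


def pick_freshest_resource(package: dict[str, Any], fmt: str, fallback_url: str) -> str:
    """Return the URL of the newest resource in `fmt`, or `fallback_url`.

    `package` is the parsed JSON body of a CKAN `package_show` call.
    Hypothetical helper: ato-mcp's real selection logic may differ.
    """
    if not package.get("success"):
        return fallback_url

    resources = (package.get("result") or {}).get("resources") or []
    candidates = [
        r for r in resources
        if str(r.get("format", "")).upper() == fmt.upper() and r.get("url")
    ]
    if not candidates:
        return fallback_url

    def freshness(resource: dict[str, Any]) -> datetime:
        # Prefer last_modified, then created; unparseable stamps sort last.
        stamp = resource.get("last_modified") or resource.get("created") or ""
        try:
            return datetime.fromisoformat(stamp).replace(tzinfo=None)
        except (TypeError, ValueError):
            return datetime.min

    return max(candidates, key=freshness)["url"]


# Made-up response body, trimmed to the fields the helper reads.
example = {
    "success": True,
    "result": {
        "resources": [
            {"format": "XLSX", "url": "https://example.org/old.xlsx",
             "last_modified": "2024-06-01T00:00:00"},
            {"format": "XLSX", "url": "https://example.org/new.xlsx",
             "last_modified": "2025-06-01T00:00:00"},
            {"format": "PDF", "url": "https://example.org/notes.pdf",
             "last_modified": "2025-07-01T00:00:00"},
        ]
    },
}
chosen = pick_freshest_resource(example, "xlsx", "https://example.org/fallback.xlsx")
```

Keeping the selection a pure function over the parsed JSON makes the fallback path easy to exercise without touching the network.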

---

## What it exposes

Seven tools, all plain-English in, structured out:

| Tool                | Purpose                                                       |
|---------------------|---------------------------------------------------------------|
| `search_datasets`   | Fuzzy-search the curated catalog by keyword                   |
| `describe_dataset`  | List a dataset's filterable dimensions and returnable measures |
| `get_data`          | Query with `filters`, `measures`, period range, output format |
| `latest`            | Last observation per measure (shortcut)                       |
| `top_n`             | Rank rows by a measure, return top (or bottom) N              |
| `stats`             | Aggregate stats (count, sum, mean, median, min, max, stddev) over a measure — optional `group_by` partitions before aggregating |
| `list_curated`      | Enumerate the curated dataset IDs                             |

Every response is the same shape — `dataset_id`, `dataset_name`, `query`, `period`, `unit`, `row_count`, `records`, `ato_url`, `attribution`, `server_version` — across every curated dataset.
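Because the envelope is uniform, a client can check it with one small helper. The sketch below lists the ten field names given above and reports any that are absent from a response dict. The helper name `missing_fields` is hypothetical, and no assumption is made about the type of each field.

```python
# Field names taken from the response shape described above.
RESPONSE_FIELDS = (
    "dataset_id",
    "dataset_name",
    "query",
    "period",
    "unit",
    "row_count",
    "records",
    "ato_url",
    "attribution",
    "server_version",
)


def missing_fields(response: dict) -> list[str]:
    """Return the envelope fields that are absent from `response`."""
    return [name for name in RESPONSE_FIELDS if name not in response]


# A deliberately incomplete, made-up response.
partial = {"dataset_id": "IND_POSTCODE", "server_version": "0.8.27"}
absent = missing_fields(partial)
```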

---

## Curated datasets (14)

| ID                    | What it is                                                                          | Period             | Coverage                  |
|-----------------------|-------------------------------------------------------------------------------------|--------------------|---------------------------|
| `IND_POSTCODE`        | Personal tax stats by taxable status × state × SA4 × postcode (~5,200 postcodes)    | 2022-23            | 80+ measures              |
| `IND_POSTCODE_MEDIAN` | Median & average taxable income by postcode, every year                             | 2003-04 → 2022-23  | 21 yearly measures        |
| `COMPANY_INDUSTRY`    | Company tax by ANZSIC broad + fine industry                                         | 2022-23            | 216 industry cells        |
| `CORP_TRANSPARENCY`   | Entity-level tax disclosure for $100M+ corporations (name, ABN, income, tax)        | 2023-24            | ~4,200 entities           |
| `SUPER_CONTRIB_AGE`   | Super contributions by age × sex × taxable income bracket                            | 2022-23            | Employer/personal/other   |
| `ACNC_REGISTER`       | Live register of every Australian charity (ABN, size, jurisdiction, beneficiaries)   | Current (weekly)   | ~60,000 entities          |
| `GST_MONTHLY`         | Monthly GST / WET / LCT collections (gross GST, input tax credits, net GST, etc.)   | 2020-07 → 2024-06  | 10 metrics × 48 months    |
| `ATO_OCCUPATION`      | Median/average income (taxable, salary/wage, total) by ANZSCO occupation × sex      | 2022-23            | ~1,200 jobs × 7 measures  |
| `SMSF_FUNDS`          | SMSF sector size — total funds, total members, total gross assets (trillion-$ sector) | 2019-20 → 2024-25  | 3 metrics × 6 years       |
| `SBB_BENCHMARKS`      | Industry total-expense + COGS ratio bands by turnover bracket (~100 industries) | 2023-24            | 12 measures × 100 industries |
| `HELP_DEBT`           | Annual HECS/HELP outstanding debt, indexation, and compulsory + voluntary repayments | 2005-06 → 2024-25  | 8 measures × 20 years     |
| `TAX_GAPS`            | ATO's tax gap estimates — how much tax is being missed each year by tax type | 2016-17 onward    | 5 measures × 4 tax types × ~7 years |
| `RND_INCENTIVE`       | R&D Tax Incentive transparency — every entity's R&D claim (name, ABN, $)    | 2022-23            | ~13,000 entities          |
| `ACNC_AIS_FINANCIALS` | Per-charity financials from the ACNC Annual Information Statement (revenue, expenses, staff) | 2023               | ~60,000 charities × 23 measures |

Adding a new dataset is a single YAML drop into `src/ato_mcp/data/curated/` — see [CONTRIBUTING.md](CONTRIBUTING.md).

---

## Example queries (paste into Claude)

> **Cross-source compatibility.** All location filters accept canonical
> state codes (`"NSW"`), full names (`"New South Wales"`), case-insensitive
> variants (`"nsw"`), ISO 3166-2 (`"AU-NSW"`), and 4-digit postcodes
> (`"2000"` → NSW) on the `state` filter. Powered by
> [`aus-identity`](https://pypi.org/project/aus-identity/) — the same input
> shape works across abs-mcp, ato-mcp, apra-mcp, aihw-mcp, and asic-mcp.
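For a feel of what this normalisation involves, the sketch below maps state codes, full names, case variants, and ISO 3166-2 codes onto canonical codes. It is a standalone illustration and does not use or mirror the `aus-identity` API; the function name `normalise_state` is an assumption, and postcode resolution is deliberately left out.

```python
STATE_NAMES = {
    "NSW": "New South Wales",
    "VIC": "Victoria",
    "QLD": "Queensland",
    "SA": "South Australia",
    "WA": "Western Australia",
    "TAS": "Tasmania",
    "NT": "Northern Territory",
    "ACT": "Australian Capital Territory",
}

# Upper-cased lookup covering both codes and full names.
_LOOKUP = {code: code for code in STATE_NAMES}
_LOOKUP.update({name.upper(): code for code, name in STATE_NAMES.items()})


def normalise_state(value: str) -> str:
    """Map a state code, full name, or ISO 3166-2 code to a canonical code.

    Hypothetical helper: postcodes are NOT handled in this sketch.
    """
    key = " ".join(value.strip().upper().split())  # trim and collapse spaces
    if key.startswith("AU-"):
        key = key[3:]  # strip the ISO 3166-2 country prefix
    try:
        return _LOOKUP[key]
    except KeyError:
        raise ValueError(f"Unrecognised state: {value!r}") from None


codes = [normalise_state(v) for v in ("NSW", "nsw", "New South Wales", "AU-NSW")]
```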

**Property-tech**: *"For postcodes 2000, 2008, 2026, and 2031 in NSW, give me the median taxable income across every available year so I can compare trajectories."*

**Corporate tax**: *"Get the total income, taxable income, and tax payable for BHP IRON ORE (JIMBLEBAR) PTY LTD."*

**Industry analysis**: *"Which fine industry codes under 'C. Manufacturing' have the highest total income, and how many companies are in each?"*

**Charity/non-profit tech**: *"Find every charity in NSW with size 'Large' that operates_in_VIC = Y."*

**Retirement planning**: *"What's the average personal super contribution for males aged 30-39 in the $120,001–$180,000 bracket?"*

Each prompt resolves to one `get_data` call. The response includes the source URL so the agent can cite it back.

---

## Architecture

Same shape as the sister packages — `client → cache → parsing → shaping → server`:

- **`client.py`** wraps `httpx` with a SQLite-backed disk cache (per-resource TTL).
- **`parsing.py`** reads XLSX (via `openpyxl`/`pandas`) and CSV (via `pandas`). Header rows + sheet names live in the curated YAML so future format quirks are a YAML edit, not a code change.
- **`curated.py`** loads dataset specs from `data/curated/*.yaml` — each one declares its dimensions, measures, dimension value enums, source/download URLs, format, and parse layout.
- **`shaping.py`** transforms the parsed DataFrame into `DataResponse` (records / series / csv).
- **`server.py`** is the FastMCP entrypoint — seven tools, full input validation with helpful "Try X" hints on error.
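The idea that parse layout lives in the dataset spec rather than in code can be sketched as follows. This is an assumption-laden illustration: the spec keys `header_row` and `columns` are hypothetical, the real specs are YAML files parsed with `pandas`, and the stdlib `csv` module is swapped in here only to keep the example self-contained.

```python
import csv
import io


def parse_with_spec(raw_text: str, spec: dict) -> list[dict]:
    """Parse CSV text using layout hints from a dataset spec.

    `spec["header_row"]` is the 0-based line index of the header row;
    `spec["columns"]` maps source column names to output field names.
    Both keys are hypothetical stand-ins for the real YAML layout.
    """
    stream = io.StringIO(raw_text, newline="")
    for _ in range(spec["header_row"]):
        next(stream)  # skip title/preamble lines above the header
    reader = csv.DictReader(stream)
    wanted = spec["columns"]
    return [{out: row[src] for src, out in wanted.items()} for row in reader]


# Made-up file contents and values, for illustration only.
raw = (
    "Example statistics workbook\n"
    "Table 1\n"
    "Postcode,Median taxable income\n"
    "2000,11111\n"
    "2008,22222\n"
)
spec = {
    "header_row": 2,
    "columns": {
        "Postcode": "postcode",
        "Median taxable income": "median_taxable_income",
    },
}
parsed = parse_with_spec(raw, spec)
```

If a publisher adds another preamble line, only `header_row` changes; the parsing function stays the same.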

The cache lives at `~/.ato-mcp/cache.db`. Data on data.gov.au refreshes once a year (ATO) or weekly (ACNC), and the TTLs are tuned for that.

---

## Attribution

Data sourced from the Australian Taxation Office and the Australian Charities and Not-for-profits Commission, both via [data.gov.au](https://data.gov.au/). Licensed under [Creative Commons Attribution 3.0 Australia (CC BY 3.0 AU)](https://creativecommons.org/licenses/by/3.0/au/). The MCP server is MIT-licensed; the data carries the upstream CC-BY 3.0 AU licence, which is echoed in every response's `attribution` field.

---

## Sister MCPs (Australian Public Data portfolio)

- [abs-mcp](https://pypi.org/project/abs-mcp/) — Australian Bureau of Statistics (CPI, unemployment, ERP, building approvals)
- [rba-mcp](https://pypi.org/project/rba-mcp/) — Reserve Bank of Australia (cash rate, lending stats, exchange rates)
- **ato-mcp** — this one. Tax, super, and charity registers.
- [apra-mcp](https://pypi.org/project/apra-mcp/) — Australian Prudential Regulation Authority (banking, insurance, super)
- [aihw-mcp](https://pypi.org/project/aihw-mcp/) — Australian Institute of Health and Welfare
- [asic-mcp](https://pypi.org/project/asic-mcp/) — Australian Securities and Investments Commission (company registers)
- [aemo-mcp](https://pypi.org/project/aemo-mcp/) — Australian Energy Market Operator (NEM dispatch, spot prices, generation)
- [au-weather-mcp](https://pypi.org/project/au-weather-mcp/) — Open-Meteo (Bureau of Meteorology aggregator)
- [wgea-mcp](https://pypi.org/project/wgea-mcp/) — Workplace Gender Equality Agency
- [aus-identity](https://pypi.org/project/aus-identity/) — Postcode / state / ABN normalisation helper used by all sisters

The portfolio is designed to compose: an agent can ask for "unemployment + cash rate + median income + climate" in postcode 2000, and that single prompt fans out across multiple MCPs.

---

## Roadmap (next iterations)

- v0.2: `GST_MONTHLY` transposed time series; multi-year `CORP_TRANSPARENCY`; `ATO_OCCUPATION` (salary by occupation code)
- v0.3: hosted version with [x402](https://x402.org/) per-call paywall; programmatic SEO pages
- v0.4: listing on MCPay + Apify; paid tier for high-volume agent users

[CHANGELOG](CHANGELOG.md) tracks every release.

---

## Development

```bash
git clone https://github.com/Bigred97/ato-mcp.git
cd ato-mcp
uv venv
uv pip install -e ".[dev]"
pytest                  # 53 unit tests, ~7s
pytest -m live          # 3 integration tests against data.gov.au, ~3s
```

Issues, ideas, and contributions welcome: [github.com/Bigred97/ato-mcp/issues](https://github.com/Bigred97/ato-mcp/issues).
