# wgea-mcp

> MCP server exposing the Workplace Gender Equality Agency Public Data File through 5 plain-English tools: per-employer workforce composition, manager movements, parental leave, harassment policies, and board diversity for every WGEA-reporting Australian employer (~9,600).

wgea-mcp is the Workplace Gender Equality Agency (WGEA) member of the Australian Public Data MCP portfolio. It fetches the annual Public Data File (a ~71 MB ZIP of CSVs) from data.gov.au's CKAN through a 2-tier URL resolver: a live CKAN `package_show` lookup first, then a bundled, CI-refreshed seed manifest. The seven thematic CSVs inside the ZIP are exposed as seven datasets.
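The resolver order described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: the function names, the `zip_url` manifest key, and the rule for picking the ZIP resource are all assumptions. The live lookup is passed in as a callable so the fallback logic can be exercised without network access.

```python
from typing import Callable, Optional


def pick_zip_url(package: dict) -> Optional[str]:
    """Return the first resource URL in a CKAN package_show response that looks like a ZIP."""
    for resource in package.get("result", {}).get("resources", []):
        url = resource.get("url") or ""
        fmt = resource.get("format") or ""
        if fmt.upper() == "ZIP" or url.lower().endswith(".zip"):
            return url
    return None


def resolve_zip_url(
    fetch_package: Callable[[], dict],  # hypothetical: performs the live package_show call
    seed_manifest: dict,                # hypothetical shape: {"zip_url": "..."}
) -> tuple[str, str]:
    """Tier 1: live CKAN lookup. Tier 2: bundled seed manifest."""
    try:
        url = pick_zip_url(fetch_package())
        if url:
            return url, "ckan"
    except Exception:
        # Any failure in the live lookup falls through to the seed manifest.
        pass
    return seed_manifest["zip_url"], "seed"
```

Returning the tier name alongside the URL makes it easy to report which source was used for a given download.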

Per-employer reporting is a deliberate disclosure under the Workplace Gender Equality Act 2012, so redistribution is explicitly intended. Fuzzy employer-name search resolves abbreviations and aliases ("CBA" → Commonwealth Bank of Australia, "Woolies" → Woolworths Group Limited) via a blended `WRatio + partial_ratio` scorer; when nothing resolves exactly, the `did_you_mean` field lists the five closest matches. Every response carries CC-BY 3.0 Australia attribution. WGEA reporting years are labelled `YYYY-YY` (e.g. `2024-25` = 1 Apr 2024 to 31 Mar 2025).
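For clients that need actual dates, a `YYYY-YY` label can be expanded into its 1 April to 31 March window. The helper below is a hypothetical sketch based only on the labelling rule stated above; it is not part of wgea-mcp's API.

```python
import re
from datetime import date


def reporting_year_bounds(label: str) -> tuple[date, date]:
    """Expand a WGEA reporting-year label such as '2024-25' into (start, end) dates."""
    match = re.fullmatch(r"(\d{4})-(\d{2})", label)
    if not match:
        raise ValueError(f"expected YYYY-YY, got {label!r}")
    start_year = int(match.group(1))
    end_year = start_year + 1
    # The two-digit suffix must be the year immediately after the start year.
    if end_year % 100 != int(match.group(2)):
        raise ValueError(f"suffix does not follow start year in {label!r}")
    return date(start_year, 4, 1), date(end_year, 3, 31)
```

The suffix check rejects labels such as `2024-26` that match the pattern but do not describe a single reporting year.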

Important caveat: WGEA's Data Explorer publishes a headline per-employer gender pay gap percentage; that specific aggregate is NOT in the public CSV release because WGEA pre-aggregates remuneration data before publication. Use wgea-mcp for the underlying workforce composition and policy detail; use [WGEA's Data Explorer](https://www.wgea.gov.au/Data-Explorer) for the headline pay-gap percentage.

## Documentation

- [README](https://github.com/Bigred97/wgea-mcp/blob/main/README.md): Full setup + tool usage + reliability notes
- [CHANGELOG](https://github.com/Bigred97/wgea-mcp/blob/main/CHANGELOG.md): Release history
- [PyPI](https://pypi.org/project/wgea-mcp/): `uvx --upgrade wgea-mcp`

## Tools

- search_datasets(query, limit=10): Fuzzy search the 7 curated WGEA datasets
- describe_dataset(dataset_id): Schema — filterable dimensions, measures, source URL, current reporting year
- get_data(dataset_id, filters, start_period, end_period, format, max_rows): Filtered query; max_rows caps at 10000
- latest(dataset_id, filters, max_rows): Restrict to the latest reporting year only
- list_curated(): Enumerate the 7 dataset IDs
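As an illustration of how a client might invoke one of these tools, the sketch below builds an MCP `tools/call` JSON-RPC request for `get_data`. The dataset ID and the filter key are placeholders, not confirmed identifiers; real values should come from `list_curated()` and `describe_dataset()`. The client-side clamp simply mirrors the documented 10000-row cap and is an assumption about convenient client behaviour.

```python
import json
from typing import Optional

MAX_ROWS_CAP = 10_000  # mirrors the documented get_data cap


def build_get_data_call(
    dataset_id: str,
    filters: Optional[dict] = None,
    max_rows: int = 1000,
    request_id: int = 1,
) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method targeting get_data."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "get_data",
            "arguments": {
                "dataset_id": dataset_id,
                "filters": filters or {},
                # Clamp locally so the request never asks for more than the cap.
                "max_rows": min(max_rows, MAX_ROWS_CAP),
            },
        },
    }


if __name__ == "__main__":
    # "workforce_composition" and "employer" are hypothetical placeholders.
    request = build_get_data_call(
        "workforce_composition", {"employer": "CBA"}, max_rows=50_000
    )
    print(json.dumps(request, indent=2))
```

In practice an MCP client library usually constructs this envelope itself; the sketch only shows which arguments end up on the wire.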

## Example queries

- "What's the gender breakdown at Commonwealth Bank?"
- "Which mining companies set gender targets in 2024-25?"
- "Workforce composition by occupation at Qantas"
- "Sexual harassment policy responses across financial services"
- "Promotions to manager by gender at Atlassian"
- "Compare board diversity at the Big 4 banks"

## Optional

- [Sister MCPs](https://github.com/Bigred97?tab=repositories&q=mcp): Other AU public-data MCPs in the portfolio
- [aus-identity](https://pypi.org/project/aus-identity/): Used by the sister MCPs; WGEA per-employer rows carry ABN as the primary entity key
