Metadata-Version: 2.4
Name: nhs-waiting-lists
Version: 0.1.1
Summary: This is a simple wrapper for working with the NHS referral-to-treatment dataset.
Author-email: Tom Hodder <tom@limepepper.co.uk>
Requires-Python: >=3.11
Requires-Dist: alembic>=1.17.0
Requires-Dist: bs4>=0.0.2
Requires-Dist: great-tables>=0.20.0
Requires-Dist: ipython-sql>=0.5.0
Requires-Dist: jupyterlab-night>=0.5.2
Requires-Dist: jupyterlab-quarto>=0.3.5
Requires-Dist: lifelines>=0.30.0
Requires-Dist: matplotlib>=3.10.7
Requires-Dist: nbconvert[qtpdf,webpdf]>=7.16.6
Requires-Dist: notebook>=7.4.7
Requires-Dist: openpyxl>=3.1.5
Requires-Dist: pandas-datareader>=0.10.0
Requires-Dist: pandas>=2.3.3
Requires-Dist: plotly>=6.5.0
Requires-Dist: plotnine>=0.15.0
Requires-Dist: pydantic-settings>=2.11.0
Requires-Dist: reportlab>=4.4.5
Requires-Dist: scrapy>=2.13.3
Requires-Dist: seaborn>=0.13.2
Requires-Dist: sqlalchemy>=2.0.44
Requires-Dist: typer==0.20.0
Requires-Dist: xlrd>=2.0.2
Description-Content-Type: text/markdown

# Web scraper for NHS RTT waiting time dataset

> Note: ⚠️ Unofficial. This package provides programmatic access to publicly
> available NHS England Referral to Treatment (RTT) data. It is not affiliated
> with or endorsed by NHS England.

## Database tables

### Referral to Treatment (RTT) tables

#### all_rtt_raw

The raw data is first imported into the all_rtt_raw table. On import we fix
column names that differ between releases of the data, downcase column names
and remove spaces from them, and drop descriptive columns in favour of
normalizing the codes to auxiliary tables later. This table has duplicate rows
per provider, per specialty, per period, because a provider can be commissioned
by multiple commissioning organisations.

* Fix variations in column names
* Downcase column names
* Drop descriptive columns
* Normalize variations in period column names
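
The clean-up steps above could be sketched in pandas roughly as follows. This
is a minimal illustration, not the package's actual import code: the rename
map, the list of descriptive columns, the sample headers and the choice to
replace spaces with underscores are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical rename map: header variants (after downcasing) -> canonical name.
RENAMES = {"period_name": "period"}
# Hypothetical descriptive columns; the matching code columns are kept.
DESCRIPTIVE = ["provider_org_name"]


def normalise_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Downcase headers, replace spaces, unify variants, drop descriptions."""
    out = df.copy()
    # Downcase and replace spaces (underscores are an assumption here).
    out.columns = [c.strip().lower().replace(" ", "_") for c in out.columns]
    # Map release-specific header variants onto one canonical name.
    out = out.rename(columns=RENAMES)
    # Drop descriptive columns that are present in this release.
    return out.drop(columns=[c for c in DESCRIPTIVE if c in out.columns])


# Made-up single-row frame standing in for one release of the raw data.
raw = pd.DataFrame(
    {
        "Period Name": ["2024-04"],
        "Provider Org Code": ["P1"],
        "Provider Org Name": ["Example Trust"],
        "Total": [10],
    }
)
clean = normalise_columns(raw)
print(list(clean.columns))
```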

The data has a per-provider, per-period, per-specialty view, with rows
duplicated for each of potentially multiple commissioning organisations,
followed by many columns containing waiting-time metrics in buckets. The
buckets previously ran up to 52 weeks and have more recently been extended to
104 weeks.

#### consolidated

This table consolidates the per-commissioning-organisation rows into a single
row per provider, per period, per specialty. The buckets are aggregated into
totals for each pathway type. A lag window is used to calculate the opening and
closing metrics for various pathway totals, and the diffs between the start and
end of each period.
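
As a rough sketch of the consolidation described above, the following pandas
example sums the commissioner rows, totals the buckets, and uses a one-period
lag to derive an opening value and a diff. The column names and figures are
invented for illustration, and treating the previous period's total as the
opening value is an assumption about how the lag window is applied.

```python
import pandas as pd

# Made-up rows: one provider and specialty, two commissioners, two periods.
rows = pd.DataFrame(
    {
        "provider": ["P1", "P1", "P1", "P1"],
        "commissioner": ["C1", "C2", "C1", "C2"],
        "specialty": ["S1", "S1", "S1", "S1"],
        "period": ["2024-04", "2024-04", "2024-05", "2024-05"],
        "wk_00_01": [5, 3, 4, 4],  # hypothetical bucket column
        "wk_01_02": [2, 1, 3, 2],  # hypothetical bucket column
    }
)
bucket_cols = ["wk_00_01", "wk_01_02"]
keys = ["provider", "specialty", "period"]

# Collapse the per-commissioner rows into one row per provider/specialty/period.
consolidated = rows.groupby(keys, as_index=False)[bucket_cols].sum()

# Aggregate the buckets into a single total (the closing value for the period).
consolidated["total"] = consolidated[bucket_cols].sum(axis=1)

# Lag by one period within each provider/specialty to get the opening value.
consolidated = consolidated.sort_values(keys)
consolidated["opening"] = consolidated.groupby(["provider", "specialty"])[
    "total"
].shift(1)
consolidated["diff"] = consolidated["total"] - consolidated["opening"]

print(consolidated[["period", "total", "opening", "diff"]])
```

The first period of each provider/specialty group has no predecessor, so its
opening value and diff are missing (NaN) in this sketch.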

#### pathways (admitted, incomplete, incomplete_with_dta, new_periods, nonadmitted)

These tables are the aggregated bucketed quantities for each pathway type, 
aggregated by commissioning organisation.

new_periods doesn't have buckets, as it is just a count of new periods during
that period.




## Known issues

```yaml
  - description: Excel auto-format converts "10-19" to "Oct-19"
    affected_files: ["Outpatients 22-23", "Outpatients 23-24"]
    fixed_in: "24-25"
    remediation: map 'Oct-19' → '10-19'
```
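
The remediation above amounts to a simple lookup. A minimal sketch, assuming
the labels arrive as plain strings; the function name and the sample labels
other than 'Oct-19' are hypothetical, and only the one mapping documented above
is included.

```python
# Labels mangled by Excel auto-formatting, mapped back to the intended value.
# Only the documented case is listed here.
EXCEL_DATE_FIXES = {"Oct-19": "10-19"}


def fix_excel_label(label: str) -> str:
    """Return the intended label, leaving unaffected labels unchanged."""
    return EXCEL_DATE_FIXES.get(label, label)


# Made-up labels for illustration.
labels = ["0-9", "Oct-19", "20-29"]
fixed = [fix_excel_label(label) for label in labels]
print(fixed)
```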