Metadata-Version: 2.4
Name: INSPIRE-MEDS
Version: 0.0.11
Summary: An ETL pipeline to extract INSPIRE data into the MEDS format.
Author-email: Robin van de Water <robin.vandewater@hpi.de>, Matthew McDermott <mattmcdermott8@gmail.com>
Project-URL: Homepage, https://github.com/rvandewater/INSPIRE_MEDS_Template
Project-URL: Issues, https://github.com/rvandewater/INSPIRE_MEDS_Template/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: meds-transforms~=0.2
Requires-Dist: requests
Requires-Dist: beautifulsoup4
Requires-Dist: hydra-core
Provides-Extra: dev
Requires-Dist: pre-commit<4; extra == "dev"
Provides-Extra: tests
Requires-Dist: pytest; extra == "tests"
Requires-Dist: pytest-cov; extra == "tests"
Requires-Dist: rootutils; extra == "tests"
Requires-Dist: hydra-joblib-launcher; extra == "tests"
Provides-Extra: local-parallelism
Requires-Dist: hydra-joblib-launcher; extra == "local-parallelism"
Provides-Extra: slurm-parallelism
Requires-Dist: hydra-submitit-launcher; extra == "slurm-parallelism"
Provides-Extra: docs
Requires-Dist: mkdocs==1.6.0; extra == "docs"
Requires-Dist: mkdocs-material==9.5.31; extra == "docs"
Requires-Dist: mkdocstrings[python,shell]==0.25.2; extra == "docs"
Requires-Dist: mkdocs-gen-files==0.5.0; extra == "docs"
Requires-Dist: mkdocs-literate-nav==0.6.1; extra == "docs"
Requires-Dist: mkdocs-section-index==0.3.9; extra == "docs"
Requires-Dist: mkdocs-git-authors-plugin==0.9.0; extra == "docs"
Requires-Dist: mkdocs-git-revision-date-localized-plugin==1.2.6; extra == "docs"
Dynamic: license-file

# INSPIRE-MEDS

[![codecov](https://codecov.io/gh/rvandewater/INSPIRE_MEDS/graph/badge.svg?token=RW6JXHNT0W)](https://codecov.io/gh/rvandewater/INSPIRE_MEDS)
[![tests](https://github.com/rvandewater/INSPIRE_MEDS/actions/workflows/tests.yaml/badge.svg)](https://github.com/rvandewater/INSPIRE_MEDS/actions/workflows/tests.yaml)
[![code-quality](https://github.com/rvandewater/INSPIRE_MEDS/actions/workflows/code-quality-main.yaml/badge.svg)](https://github.com/rvandewater/INSPIRE_MEDS/actions/workflows/code-quality-main.yaml)
![python](https://img.shields.io/badge/-Python_3.11-blue?logo=python&logoColor=white)
[![license](https://img.shields.io/badge/License-MIT-green.svg?labelColor=gray)](https://github.com/rvandewater/INSPIRE_MEDS#license)
[![PRs](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](https://github.com/rvandewater/INSPIRE_MEDS/pulls)
[![contributors](https://img.shields.io/github/contributors/rvandewater/INSPIRE_MEDS.svg)](https://github.com/rvandewater/INSPIRE_MEDS/graphs/contributors)
[![DOI](https://zenodo.org/badge/918107518.svg)](https://doi.org/10.5281/zenodo.14891939)

This pipeline extracts the INSPIRE dataset (hosted on PhysioNet at https://physionet.org/content/inspire/) into the MEDS format.

## Usage:

```bash
pip install INSPIRE_MEDS
export DATASET_DOWNLOAD_USERNAME=$PHYSIONET_USERNAME
export DATASET_DOWNLOAD_PASSWORD=$PHYSIONET_PASSWORD
MEDS_extract-INSPIRE root_output_dir=$ROOT_OUTPUT_DIR
```

When you run this, the program will:

1. Download the needed raw INSPIRE files for the currently supported version into
    `$ROOT_OUTPUT_DIR/raw_input`.
2. Perform initial, pre-MEDS processing on the raw INSPIRE files, saving the results in
    `$ROOT_OUTPUT_DIR/pre_MEDS`.
3. Construct the final MEDS cohort, and save it to `$ROOT_OUTPUT_DIR/MEDS_cohort`.
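
To confirm that a run produced each stage, a quick check over the three directories listed above can help. This is a minimal sketch, not part of the package: the helper name `check_stage_dirs` is hypothetical, and it only tests that the directories exist, not that their contents are complete.

```bash
# Hypothetical helper (not part of INSPIRE_MEDS): report which of the three
# stage directories named above exist under a given root output directory.
check_stage_dirs() {
  root="$1"
  for stage in raw_input pre_MEDS MEDS_cohort; do
    if [ -d "$root/$stage" ]; then
      echo "$stage: present"
    else
      echo "$stage: missing"
    fi
  done
}
```

After a run, calling `check_stage_dirs "$ROOT_OUTPUT_DIR"` prints one line per stage directory.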

You can also specify each target directory explicitly:

```bash
export DATASET_DOWNLOAD_USERNAME=$PHYSIONET_USERNAME
export DATASET_DOWNLOAD_PASSWORD=$PHYSIONET_PASSWORD
MEDS_extract-INSPIRE raw_input_dir=$RAW_INPUT_DIR pre_MEDS_dir=$PRE_MEDS_DIR MEDS_cohort_dir=$MEDS_COHORT_DIR
```
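
Both invocations above export the PhysioNet credentials before the pipeline starts. A small pre-flight check can catch a missing variable before any download is attempted. This is a sketch under assumptions: `require_credentials` is a hypothetical helper, not part of the package, and it only verifies that the two variables are exported and non-empty, not that PhysioNet accepts them.

```bash
# Hypothetical helper (not part of INSPIRE_MEDS): fail early if either
# credential variable from the examples above is unset, empty, or not exported.
require_credentials() {
  for var in DATASET_DOWNLOAD_USERNAME DATASET_DOWNLOAD_PASSWORD; do
    # printenv only sees exported variables, which is what a child process gets.
    if [ -z "$(printenv "$var")" ]; then
      echo "Missing environment variable: $var" >&2
      return 1
    fi
  done
}
```

For example, `require_credentials && MEDS_extract-INSPIRE root_output_dir=$ROOT_OUTPUT_DIR` starts the pipeline only when both variables are set.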

## Examples and More Info:

Run `MEDS_extract-INSPIRE --help` for more information on the arguments and options. To run the entire pipeline end to end, use:

```bash
MEDS_extract-INSPIRE root_output_dir=$ROOT_OUTPUT_DIR
```
