Metadata-Version: 2.4
Name: meddle
Version: 0.1.1
Summary: Agentic Medical Deep Learning Engineer
Home-page: https://github.com/uni-medical/MedDLE
Author: Haoyu Wang
Author-email: small_dark@sina.com
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: black==24.3.0
Requires-Dist: funcy==2.0
Requires-Dist: humanize==4.8.0
Requires-Dist: jsonschema==4.19.2
Requires-Dist: numpy==1.26.2
Requires-Dist: openai>=1.69.0
Requires-Dist: anthropic>=0.20.0
Requires-Dist: pandas==2.1.4
Requires-Dist: pytest==7.4.3
Requires-Dist: requests==2.31.0
Requires-Dist: rich==13.7.0
Requires-Dist: dataclasses_json>=0.6.4
Requires-Dist: omegaconf>=2.3.0
Requires-Dist: loguru==0.7.2
Requires-Dist: shutup>=0.2.0
Requires-Dist: tqdm==4.66.2
Requires-Dist: coolname>=2.2.0
Requires-Dist: igraph>=0.11.3
Requires-Dist: genson>=1.2.0
Requires-Dist: torch
Requires-Dist: torchvision
Requires-Dist: torchaudio
Requires-Dist: torchtext
Requires-Dist: lightgbm
Requires-Dist: matplotlib
Requires-Dist: seaborn
Requires-Dist: gensim
Requires-Dist: scikit-learn
Requires-Dist: scikit-image
Requires-Dist: opencv-python
Requires-Dist: scipy
Requires-Dist: biopython
Requires-Dist: imbalanced-learn
Requires-Dist: h5py
Requires-Dist: numba
Requires-Dist: arrow
Requires-Dist: markovify
Requires-Dist: imgaug
Requires-Dist: scikit-optimize
Requires-Dist: plotly
Requires-Dist: hyperopt
Requires-Dist: bayesian-optimization
Requires-Dist: imagecodecs
Requires-Dist: hmmlearn
Requires-Dist: bayespy==0.5.1
Requires-Dist: sklearn-pandas
Requires-Dist: tensorpack
Requires-Dist: sentencepiece
Requires-Dist: librosa
Requires-Dist: ipykernel
Requires-Dist: ipython
Requires-Dist: nbformat
Requires-Dist: kornia
Requires-Dist: Pillow
Requires-Dist: pyparsing
Requires-Dist: pytz
Requires-Dist: PyYAML
Requires-Dist: tqdm
Requires-Dist: fastai
Requires-Dist: gym
Requires-Dist: optuna
Requires-Dist: transformers
Requires-Dist: datasets==2.1.0
Requires-Dist: torchmetrics
Requires-Dist: pytorch-lightning
Requires-Dist: sympy
Requires-Dist: timm
Requires-Dist: torchinfo
Requires-Dist: pdf2image
Requires-Dist: PyPDF
Requires-Dist: pyocr
Requires-Dist: pyarrow
Requires-Dist: xlrd
Requires-Dist: backoff
Requires-Dist: streamlit==1.40.2
Requires-Dist: python-dotenv
Requires-Dist: openpyxl
Requires-Dist: monai==1.4.0
Requires-Dist: langchain
Requires-Dist: langchain-openai
Requires-Dist: langchain-community
Requires-Dist: beautifulsoup4
Requires-Dist: chromadb
Requires-Dist: SimpleITK
Requires-Dist: nibabel
Requires-Dist: html2text
Requires-Dist: tiktoken
Requires-Dist: autogluon
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# Med-DLE: Agentic Medical Deep Learning Engineer 

MedDLE is an agentic code-generation system for medical deep learning tasks.

> **Alert**: In the current version, the MONAI prompt and RAG (query2doc + rerank) are enabled by default.
>
> The Auto-Tuning Pipeline has changed substantially, so please use the `v0` branch if you want the old settings.

# Quick Start
## Setup environment
1. We recommend using `conda` to create a virtual environment:
```
conda create --name meddle python=3.11 
conda activate meddle
```
2. Then install `meddle` via:
```
pip install uv
uv pip install -r requirements.txt
uv pip install -e .
```
3. Remember to set your private API key for OpenAI, Anthropic, or OpenRouter:
```
export OPENAI_BASE_URL="<your base url>" # (e.g. https://api.openai.com/v1)
export OPENAI_API_KEY="<your key>"
```
### Setup knowledge base for RAG
You can create the knowledge base with `create_monai_knowledge_base.sh` or download it from this [link](http://ug.link/haoyushare/filemgr/share-download/?id=b8c77eb1346c4ecca4dfb510253d32fe).

Use this command to check that the knowledge base was created successfully:
```
python -m meddle.monai_rag.query_rag_db
``` 

### Setup dataset
1. Download the data from this [link](http://ug.link/haoyushare/filemgr/share-download/?id=a14cf769efb649f8a099ee6fb876abb6) and move it into `med_dl_tasks`.
2. The goal and evaluation target for the MedMNIST datasets can be found in the [scripts](https://github.com/uni-medical/MedDLE/blob/main/scripts/run_all_mnist_rag.sh).

## Try MedDLE
```
meddle data_dir=<your dataset> \
     goal=<your goal> \
     eval=<your metrics> \
     agent.steps=30 agent.enable_monai_knowledge_base=true
```
Here is a detailed example:
```
meddle data_dir="meddle/example_tasks/retinamnist_224" exp_name="torch-example_task_retina" \
     goal="Predict the category of each given Fundus Camera image by selecting the most appropriate one from the 5 available classes." \
     eval="Use the accuracy and the AUC between the predicted and ground-truth class." \
     agent.step_plan.proposal=20 agent.step_plan.tuning=10 agent.step_plan.hpo=5
```
You can set either a general `agent.steps` budget, in which case MedDLE automatically schedules the steps across its agents, or a detailed `agent.step_plan` for each agent.
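When launching many runs from a script, the two ways of budgeting steps can be assembled programmatically. The sketch below is illustrative only: `build_meddle_command` is a hypothetical helper, it uses only the config keys shown in the examples above, and treating `steps` and `step_plan` as mutually exclusive is an assumption of this sketch rather than documented behavior.

```python
import shlex


def build_meddle_command(data_dir, goal, eval_target, steps=None, step_plan=None):
    """Assemble a `meddle` invocation as an argument list.

    Hypothetical helper. Exactly one of `steps` (a general budget) or
    `step_plan` (a dict of per-agent budgets) must be given; this
    exclusivity is an assumption of the sketch.
    """
    if (steps is None) == (step_plan is None):
        raise ValueError("pass exactly one of `steps` or `step_plan`")
    args = [
        "meddle",
        f"data_dir={data_dir}",
        f"goal={goal}",
        f"eval={eval_target}",
    ]
    if steps is not None:
        args.append(f"agent.steps={steps}")
    else:
        # Keys such as "proposal", "tuning", and "hpo" come from the example above.
        for stage, count in step_plan.items():
            args.append(f"agent.step_plan.{stage}={count}")
    return args


if __name__ == "__main__":
    cmd = build_meddle_command(
        "meddle/example_tasks/retinamnist_224",
        "Predict the category of each given Fundus Camera image.",
        "Use the accuracy and the AUC.",
        step_plan={"proposal": 20, "tuning": 10, "hpo": 5},
    )
    # shlex.join quotes the arguments so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

The returned list can also be passed directly to `subprocess.run`, which avoids shell quoting issues in long `goal` and `eval` strings.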

# Advanced Features
## RAG Mode
MedDLE supports both a basic RAG mode and an advanced (query2doc) mode. They are controlled by the following config options:
```
agent.enable_monai_knowledge_base=true agent.enable_query2doc=true
```

## Total time limit
To run with a total time limit across all steps, set `agent.total_time_limit`:
```
meddle data_dir="meddle/example_tasks/retinamnist_224" exp_name="test_timelimit-example_task_retina" \
     goal="Predict the category of each given Fundus Camera image by selecting the most appropriate one from the 5 available classes." \
     eval="Use the accuracy and the area under curve (AUC) between the predicted class and ground-truth class on the test set." \
     agent.force_monai_with_prompt=false agent.enable_monai_knowledge_base=false agent.expose_prediction=true \
     agent.steps=100 agent.total_time_limit=60
```

# 🙏 Acknowledgement
- We thank all medical workers and dataset owners for making public datasets available to the community.
- Our code builds on the following open-source projects, and we thank their contributors:
     - [aideml](https://github.com/WecoAI/aideml)
     - [monai](https://github.com/Project-MONAI)
