Metadata-Version: 2.4
Name: bdi-kit
Version: 0.10.0
Summary: BDI-Kit Library
Home-page: https://github.com/VIDA-NYU/bdi-kit
Author: 
Author-email: 
Maintainer: 
Maintainer-email: 
License: Apache-2.0
Keywords: BDF,Data Harmonization,NYU
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE.txt
Requires-Dist: polyfuzz
Requires-Dist: gensim>=4.3.3
Requires-Dist: pandas
Requires-Dist: valentine>=0.5.0
Requires-Dist: sentence_transformers<=5.3.0
Requires-Dist: torch
Requires-Dist: transformers<5.0.0
Requires-Dist: tqdm
Requires-Dist: scikit-learn
Requires-Dist: flair[word-embeddings]>=0.14.0
Requires-Dist: requests
Requires-Dist: scipy
Requires-Dist: matplotlib<3.9
Requires-Dist: panel!=1.4.3
Requires-Dist: nltk>=3.9.1
Requires-Dist: magneto-python
Requires-Dist: json-repair
Requires-Dist: litellm==1.82.6
Requires-Dist: python-Levenshtein
Provides-Extra: mcp
Requires-Dist: mcp[cli]; extra == "mcp"
Provides-Extra: chatbot
Requires-Dist: mcp[cli]; extra == "chatbot"
Requires-Dist: streamlit; extra == "chatbot"
Requires-Dist: langchain; extra == "chatbot"
Requires-Dist: langchain-litellm; extra == "chatbot"
Requires-Dist: langchain-mcp-adapters; extra == "chatbot"
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: keywords
Dynamic: license
Dynamic: license-file
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

[![PyPI version](https://badge.fury.io/py/bdi-kit.svg)](https://pypi.org/project/bdi-kit)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Documentation Status](https://readthedocs.org/projects/bdi-kit/badge/?version=stable)](https://bdi-kit.readthedocs.io)
[![Tests](https://github.com/VIDA-NYU/bdi-kit/actions/workflows/build.yml/badge.svg)](https://github.com/VIDA-NYU/bdi-kit/actions/workflows/build.yml)
[![Lint](https://github.com/VIDA-NYU/bdi-kit/actions/workflows/lint.yml/badge.svg)](https://github.com/VIDA-NYU/bdi-kit/actions/workflows/lint.yml)


# BDI-Kit 

**BDI-Kit** is a library that assists users in performing data harmonization. It provides state-of-the-art tools for integrating and transforming disparate datasets, with a focus on biomedical data, and includes APIs for tasks such as:
- Schema matching
- Value matching
- Data transformation to a target schema or data model
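
To make the three tasks above concrete, here is a minimal, library-independent sketch. It does not use BDI-Kit's API: plain string similarity from Python's standard `difflib` stands in for the matching methods the library provides, and the column names, values, and helper functions (`similarity`, `best_match`) are hypothetical. See the documentation for the actual BDI-Kit functions.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (a stand-in for a real matcher)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(name: str, candidates: list[str]) -> str:
    """Return the candidate that is most similar to `name`."""
    return max(candidates, key=lambda candidate: similarity(name, candidate))


# Hypothetical source table and target schema (illustrative data only).
source_rows = [
    {"Gender": "M", "Age_at_dx": 61},
    {"Gender": "F", "Age_at_dx": 47},
]
target_columns = ["gender", "age_at_diagnosis"]
target_gender_values = ["male", "female"]

# 1. Schema matching: align each source column with a target column.
column_map = {col: best_match(col, target_columns) for col in source_rows[0]}

# 2. Value matching: align the distinct source values of one column
#    with the values allowed by the target schema.
source_gender_values = sorted({row["Gender"] for row in source_rows})
value_map = {v: best_match(v, target_gender_values) for v in source_gender_values}

# 3. Data transformation: rename the columns and rewrite the matched values.
harmonized = [
    {
        column_map[col]: value_map[val] if col == "Gender" else val
        for col, val in row.items()
    }
    for row in source_rows
]

print(column_map)
print(value_map)
print(harmonized)
```

Plain string similarity only works here because the example names are spelled alike; a pair such as `Sex` and `gender` needs the semantic matching methods that a dedicated harmonization library offers.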


## 📚 Documentation

Documentation is available at [https://bdi-kit.readthedocs.io/](https://bdi-kit.readthedocs.io/).


## 🛠️ Installation

You can install the latest stable version of BDI-Kit from [PyPI](https://pypi.org/project/bdi-kit/) (Python 3.10 or later is required):

```bash
pip install bdi-kit
```

To install the latest development version:

```bash
pip install git+https://github.com/VIDA-NYU/bdi-kit@devel
```


## 🎬 Demo Video

This video gives a brief overview of BDI-Kit, showing how it works through both the Python API and the chatbot-style agent interface.

[![Watch a demo of BDI-Kit](docs/source/_static/images/demo_thumbnail.png)](https://drive.google.com/file/d/1gMlZuocYrKFQYDZOphIyFj-nvjtx4ODR/view?usp=sharing)


## 🤝 Contributing

To learn how to contribute to BDI-Kit, please see our [Contributing guide](./CONTRIBUTING.md).


## 🔖 Citation

If you find BDI-Kit useful in your work, please consider citing:

```bibtex
@article{lopez2026bdikit,
  title={{BDI-Kit: An AI-Powered Toolkit for Biomedical Data Harmonization}},
  author={Lopez, Roque and Santos, Aecio and Koutras, Christos and Freire, Juliana},
  journal={{Patterns}},
  volume={7},
  year={2026}
}
```


Our other papers related to BDI-Kit are listed in the [documentation](https://bdi-kit.readthedocs.io/).
