Metadata-Version: 2.4
Name: octopus-automl
Version: 0.5.3
Summary: An Auto-ML framework optimized for small datasets
Author: Merck KGaA, Darmstadt, Germany
License-Expression: Apache-2.0
Project-URL: Homepage, https://emdgroup.github.io/octopus-automl/
Project-URL: Documentation, https://emdgroup.github.io/octopus-automl/
Project-URL: Changelog, https://github.com/emdgroup/octopus-automl/releases
Project-URL: GitHub, https://github.com/emdgroup/octopus-automl/
Project-URL: Issues, https://github.com/emdgroup/octopus-automl/issues/
Keywords: AutoML,Small data,Machine learning,Explainable AI,Interpretable ML
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Operating System :: OS Independent
Classifier: Typing :: Typed
Requires-Python: <=3.14,>=3.12
Description-Content-Type: text/markdown
License-File: LICENSE.txt
License-File: CONTRIBUTORS.md
Requires-Dist: attrs>=25.3.0
Requires-Dist: catboost>=1.2.8
Requires-Dist: networkx>=3.5
Requires-Dist: numpy>=2.2
Requires-Dist: optuna>=4.5
Requires-Dist: pandas<3,>=2.3
Requires-Dist: ray<3,>=2.52
Requires-Dist: rapidfuzz>=3.14
Requires-Dist: scikit-learn<2,>=1.7.0
Requires-Dist: shap>=0.51
Requires-Dist: torch<3,>=2.0
Requires-Dist: xgboost>=3.0
Requires-Dist: fsspec>=2025.10.0
Requires-Dist: universal_pathlib>=0.3.6
Requires-Dist: pyarrow>=17.0.0
Provides-Extra: boruta
Requires-Dist: boruta>=0.4.3; extra == "boruta"
Provides-Extra: autogluon
Requires-Dist: autogluon.tabular[all]<1.6,>=1.5; extra == "autogluon"
Requires-Dist: bokeh>=3.8; extra == "autogluon"
Requires-Dist: tokenizers>=0.13.0; extra == "autogluon"
Provides-Extra: survival
Requires-Dist: lifelines>=0.29.0; extra == "survival"
Provides-Extra: recommended
Requires-Dist: octopus-automl[autogluon]; extra == "recommended"
Requires-Dist: octopus-automl[boruta]; extra == "recommended"
Requires-Dist: octopus-automl[survival]; extra == "recommended"
Requires-Dist: octopus-automl[examples]; extra == "recommended"
Provides-Extra: dev
Requires-Dist: octopus-automl[test]; extra == "dev"
Requires-Dist: octopus-automl[lint]; extra == "dev"
Requires-Dist: octopus-automl[docs]; extra == "dev"
Requires-Dist: nbstripout>=0.8.2; extra == "dev"
Provides-Extra: docs
Requires-Dist: octopus-automl[examples]; extra == "docs"
Requires-Dist: octopus-automl[test]; extra == "docs"
Requires-Dist: mkdocs<2,>=1.6.1; extra == "docs"
Requires-Dist: mkdocs-material>=9.7.1; extra == "docs"
Requires-Dist: mkdocstrings-python>=2.0.1; extra == "docs"
Requires-Dist: mkdocs-gen-files>=0.6.0; extra == "docs"
Requires-Dist: mkdocs-literate-nav>=0.6.2; extra == "docs"
Requires-Dist: nbconvert>=7.16.6; extra == "docs"
Requires-Dist: jupytext>=1.19.0; extra == "docs"
Provides-Extra: examples
Requires-Dist: nbformat>=5.10; extra == "examples"
Requires-Dist: nbconvert>=7.16.6; extra == "examples"
Requires-Dist: notebook>=7.5.2; extra == "examples"
Requires-Dist: ipywidgets>=8; extra == "examples"
Requires-Dist: ipykernel>=7.1.0; extra == "examples"
Provides-Extra: lint
Requires-Dist: flake8==7.3.0; extra == "lint"
Requires-Dist: pre-commit==4.3.0; extra == "lint"
Requires-Dist: pydoclint==0.7.3; extra == "lint"
Requires-Dist: pyupgrade==3.21.0; extra == "lint"
Requires-Dist: ruff==0.14.1; extra == "lint"
Requires-Dist: licensecheck>=2025.1.0; extra == "lint"
Requires-Dist: mypy==1.18.2; extra == "lint"
Requires-Dist: joblib-stubs>=1.5.2.0.20250831; extra == "lint"
Requires-Dist: pandas-stubs<3,>=2.3; extra == "lint"
Requires-Dist: plotly-stubs==0.0.6; extra == "lint"
Requires-Dist: scikit-learn-stubs>=0.0.3; extra == "lint"
Requires-Dist: scipy-stubs>=1.16.3.0; extra == "lint"
Requires-Dist: types-networkx>=3.5; extra == "lint"
Requires-Dist: types-pytz>=2025.2.0; extra == "lint"
Requires-Dist: types-jmespath>=1.0.2; extra == "lint"
Provides-Extra: test
Requires-Dist: octopus-automl[recommended]; extra == "test"
Requires-Dist: octopus-automl[lint]; extra == "test"
Requires-Dist: pytest>=8.4.2; extra == "test"
Requires-Dist: pytest-cov>=6.2.1; extra == "test"
Requires-Dist: pytest-order>=1.3.0; extra == "test"
Requires-Dist: moto[s3,server]>=5.1.17; extra == "test"
Requires-Dist: s3fs>=2025.10.0; extra == "test"
Provides-Extra: test-core
Requires-Dist: octopus-automl; extra == "test-core"
Requires-Dist: pytest>=8.4.2; extra == "test-core"
Requires-Dist: pytest-cov>=6.2.1; extra == "test-core"
Dynamic: license-file

<div align="center">
  <br/>

<div>
<a href="https://github.com/emdgroup/octopus-automl/actions/workflows/test-package.yml?query=branch%3Amain">
   <img src="https://img.shields.io/github/actions/workflow/status/emdgroup/octopus-automl/test-package.yml?branch=main&style=flat-square&label=Test%20Suite&labelColor=0f69af&color=ffdcb9" alt="Test Suite">
</a>
<a href="https://github.com/emdgroup/octopus-automl/actions/workflows/ruff.yml?query=branch%3Amain">
   <img src="https://img.shields.io/github/actions/workflow/status/emdgroup/octopus-automl/ruff.yml?branch=main&style=flat-square&label=Code%20Quality&labelColor=0f69af&color=ffdcb9" alt="Code Quality">
</a>
<a href="https://github.com/emdgroup/octopus-automl/actions/workflows/docs.yml?query=branch%3Amain">
   <img src="https://img.shields.io/github/actions/workflow/status/emdgroup/octopus-automl/docs.yml?branch=main&style=flat-square&label=Docs&labelColor=0f69af&color=ffdcb9" alt="Docs">
</a>
</div>

<div>
<a href="https://pypi.org/project/octopus-automl/">
   <img src="https://img.shields.io/pypi/pyversions/octopus-automl?style=flat-square&label=Supports%20Python&labelColor=96d7d2&color=ffdcb9" alt="Supports Python">
</a>
<a href="https://pypi.org/project/octopus-automl/">
   <img src="https://img.shields.io/pypi/v/octopus-automl.svg?style=flat-square&label=PyPI%20Version&labelColor=96d7d2&color=ffdcb9" alt="PyPI version">
</a>
<a href="https://pypistats.org/packages/octopus-automl">
   <img src="https://img.shields.io/pypi/dm/octopus-automl?style=flat-square&label=Downloads&labelColor=96d7d2&color=ffdcb9" alt="Downloads">
</a>
<a href="https://github.com/emdgroup/octopus-automl/issues/">
   <img src="https://img.shields.io/github/issues/emdgroup/octopus-automl?style=flat-square&label=Issues&labelColor=96d7d2&color=ffdcb9" alt="Issues">
</a>
<a href="https://github.com/emdgroup/octopus-automl/pulls/">
   <img src="https://img.shields.io/github/issues-pr/emdgroup/octopus-automl?style=flat-square&label=PRs&labelColor=96d7d2&color=ffdcb9" alt="PRs">
</a>
<a href="http://www.apache.org/licenses/LICENSE-2.0">
   <img src="https://shields.io/badge/License-Apache%202.0-green.svg?style=flat-square&labelColor=96d7d2&color=ffdcb9" alt="License">
</a>
</div>

<div>
<a href="https://github.com/emdgroup/octopus-automl/">
   <img src="https://raw.githubusercontent.com/emdgroup/octopus/main/docs/assets/logo.png" alt="Logo">
</a>
</div>

<div>
<a href="https://emdgroup.github.io/octopus-automl/">Homepage</a>
&nbsp;•&nbsp;
<a href="https://emdgroup.github.io/octopus-automl/userguide/userguide/">User Guide</a>
&nbsp;•&nbsp;
<a href="https://emdgroup.github.io/octopus-automl/reference/reference/">Documentation</a>
&nbsp;•&nbsp;
<a href="https://emdgroup.github.io/octopus-automl/contributing/">Contribute</a>
</div>

</div>


# Octopus

Octopus is a lightweight AutoML framework designed for small datasets (<1k samples) with high dimensionality (a large number of features). Its goal is to speed up machine learning projects and to make results on small datasets more reliable.

What distinguishes Octopus from other AutoML frameworks:

* Nested cross-validation (CV)
* Performance on small datasets
* No information leakage
* No data split mistakes
* Constrained regularization
* Ensembling, optimized for (nested) CV
* Simplicity
* Time to event
* Testing system (branching workflows)
* Reporting based on nested CV
* Test predictions over all samples
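
To make the nested CV idea concrete, here is a minimal pure-Python sketch of how the index splits of an $n \times m$ nested cross-validation can be laid out. It is an illustration only and not Octopus's implementation: the helper names `k_fold_indices` and `nested_cv_splits` are hypothetical, and it uses neither shuffling nor stratification.

```python
def k_fold_indices(indices: list[int], k: int) -> list[tuple[list[int], list[int]]]:
    """Split indices into k (train, test) pairs using interleaved folds."""
    folds = [indices[i::k] for i in range(k)]
    return [
        ([idx for j, fold in enumerate(folds) if j != i for idx in fold], folds[i])
        for i in range(k)
    ]


def nested_cv_splits(n_samples: int, n_outer: int, m_inner: int):
    """Yield (outer_test, inner_train, inner_val) for every fold pair."""
    for outer_train, outer_test in k_fold_indices(list(range(n_samples)), n_outer):
        # Inner folds are drawn from the outer training part only, so the
        # outer test samples never influence training or model selection.
        for inner_train, inner_val in k_fold_indices(outer_train, m_inner):
            yield outer_test, inner_train, inner_val


splits = list(nested_cv_splits(n_samples=20, n_outer=5, m_inner=4))
print(len(splits))  # one entry per (outer fold, inner fold) pair
```

Because each sample lands in exactly one outer test fold, collecting the outer test predictions yields one held-out prediction per sample.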


## Hardware

For maximum speed, run Octopus on a compute node with $n \times m$ CPUs for an $n \times m$ nested cross-validation. Octopus development is done, for example, on a c5.9xlarge EC2 instance.

## Installation

Installation works via `pip` or any other standard Python package manager:

```bash
# Install with recommended dependencies (includes optional packages such as AutoGluon)
pip install "octopus-automl[recommended]"

# Explicitly specify optional dependencies
pip install "octopus-automl[autogluon]"     # AutoGluon
pip install "octopus-automl[boruta]"        # Boruta feature selection
pip install "octopus-automl[survival]"      # Support time-to-event / survival analysis
pip install "octopus-automl[examples]"      # Dependencies for running examples

# Install with more than one extra, e.g.
pip install "octopus-automl[autogluon,examples]"
```
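
After installing, one way to see which optional extras are usable is to probe for their main packages from Python. The sketch below relies only on the standard library. The mapping from extra name to import name is an assumption derived from the dependency names (`autogluon.tabular`, `boruta`, `lifelines`), and `module_available` is a hypothetical helper, not an Octopus function.

```python
from importlib.util import find_spec

# Assumed mapping from extra name to the import name of its main package.
EXTRA_MODULES = {
    "autogluon": "autogluon.tabular",
    "boruta": "boruta",
    "survival": "lifelines",
}


def module_available(name: str) -> bool:
    """Return True if `name` can be found on the import path."""
    try:
        return find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. `autogluon`) is missing.
        return False


for extra, module in EXTRA_MODULES.items():
    status = "installed" if module_available(module) else "missing"
    print(f"{extra}: {status}")
```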

For contributors and Octopus developers, a dedicated `dev` extra exists.
It bundles the test, lint, and docs dependencies, including code sanitization and quality tools.

```bash
pip install "octopus-automl[dev]"
```
