Metadata-Version: 2.4
Name: tracebloc
Version: 0.8.0
Summary: Build AI with anyone. On data that can't move. SDK for the tracebloc collaborative AI workspace.
Author-email: tracebloc <lukas@tracebloc.io>
License-Expression: MIT
Project-URL: Homepage, https://github.com/tracebloc/tracebloc-py-package
Requires-Python: <3.13,>=3.9
Description-Content-Type: text/markdown
License-File: license.txt
Requires-Dist: dill
Requires-Dist: pandas
Requires-Dist: psutil
Requires-Dist: requests
Requires-Dist: rich
Requires-Dist: termcolor
Requires-Dist: tqdm
Provides-Extra: tensorflow
Requires-Dist: tensorflow<3,>=2.10; extra == "tensorflow"
Requires-Dist: tensorflow-datasets>=4.8; extra == "tensorflow"
Requires-Dist: silence-tensorflow>=1.2; extra == "tensorflow"
Provides-Extra: pytorch
Requires-Dist: numpy; extra == "pytorch"
Requires-Dist: Pillow; extra == "pytorch"
Requires-Dist: torch>=2.0; extra == "pytorch"
Requires-Dist: torchvision>=0.15; extra == "pytorch"
Requires-Dist: torchmetrics>=1.0; extra == "pytorch"
Requires-Dist: timm>=0.9; extra == "pytorch"
Provides-Extra: sklearn
Requires-Dist: numpy; extra == "sklearn"
Requires-Dist: scikit-learn>=1.2; extra == "sklearn"
Provides-Extra: huggingface
Requires-Dist: transformers>=4.30; extra == "huggingface"
Requires-Dist: peft>=0.5; extra == "huggingface"
Provides-Extra: boosting
Requires-Dist: catboost>=1.2; extra == "boosting"
Requires-Dist: lightgbm>=4.0; extra == "boosting"
Requires-Dist: xgboost>=1.7; extra == "boosting"
Provides-Extra: survival
Requires-Dist: lifelines>=0.27; extra == "survival"
Requires-Dist: scikit-survival>=0.22; extra == "survival"
Provides-Extra: all
Requires-Dist: tracebloc[boosting,huggingface,pytorch,sklearn,survival,tensorflow]; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: mypy>=1.5; extra == "dev"
Requires-Dist: ruff>=0.1; extra == "dev"
Dynamic: license-file

# tracebloc

**Build AI with anyone. On data that can't move.**

tracebloc is a collaborative AI workspace you deploy on your own infrastructure. Invite researchers, partners, vendors, or your own teams to train, fine-tune, and benchmark models on your private data — without the data ever leaving your environment.

## Install

Pick the extra for your ML framework — the default install contains the
core SDK only (~140 MB, ~30 sec) instead of every framework (~8 GB).

```bash
pip install "tracebloc[pytorch]"      # most users
pip install "tracebloc[tensorflow]"   # TensorFlow
pip install "tracebloc[sklearn]"      # scikit-learn only
pip install "tracebloc[boosting]"     # XGBoost / CatBoost / LightGBM
pip install "tracebloc[survival]"     # lifelines / scikit-survival
pip install "tracebloc[all]"          # everything
```

Upgrading from `tracebloc_package`? See [MIGRATION.md](MIGRATION.md).

## Quick Start

```python
from tracebloc import User

# 1. Log in to your workspace
user = User()
user.login()

# 2. Upload your model to a use case
user.upload_model(model_name="my_model")

# 3. Link your model to the dataset — returns a training-plan facade
training_plan = user.link_model_dataset(dataset_id="<your-dataset-id>")

# 4. Start training
training_plan.start()
```

For a full walkthrough, open the [Quickstart Notebook on Google Colab](https://colab.research.google.com/github/tracebloc/start-training/blob/main/notebooks/traceblocTrainingGuide.ipynb).

## Supported Frameworks


| Framework           | Use Cases                                                                                                                                       |
| ------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| **PyTorch**         | Image classification, object detection, semantic segmentation, tabular, text classification, time series, keypoint detection, survival analysis |
| **TensorFlow**      | Image classification, tabular classification                                                                                                    |
| **scikit-learn**    | Tabular classification, tabular regression                                                                                                      |
| **XGBoost**         | Tabular classification, tabular regression                                                                                                      |
| **CatBoost**        | Tabular classification, tabular regression                                                                                                      |
| **LightGBM**        | Tabular classification, tabular regression                                                                                                      |
| **lifelines**       | Survival analysis (time-to-event)                                                                                                               |
| **scikit-survival** | Survival analysis (time-to-event)                                                                                                               |


## How It Works

1. **Deploy** a tracebloc workspace on any machine or Kubernetes cluster
2. **Define** a use case — select datasets, set evaluation metrics
3. **Invite** anyone — researchers, partners, your own teams across locations
4. **Build** — contributors train models inside your environment using this SDK
5. **Compare** — every submission benchmarked under identical conditions on one leaderboard

## Local development setup

To work on the `tracebloc` SDK and test changes with the **"start-training"** Jupyter Notebook (also running locally), clone the [start-training repository](https://github.com/tracebloc/start-training) and set up a virtual environment using a Python version compatible with TensorFlow (ideally 3.11 or 3.12).

Set up virtual environment:

```bash
python3.12 -m venv {venv_location}
source {venv_location}/bin/activate
pip install jupyter
```

Or with Pyenv:

```bash
pyenv virtualenv 3.12 start-training
pyenv activate start-training
pip install jupyter
```

Register the venv as a Jupyter kernel so the notebook uses it:

```bash
python -m ipykernel install --user --name=start-training --display-name "start-training"
```

### Option 1: Editable install (recommended for iterating on source)

```bash
pip install -e {path_to}/tracebloc-py-package
```

Source changes are picked up immediately — no reinstall needed. Best for day-to-day development.

### Option 2: Local install (non-editable)

```bash
pip install {path_to}/tracebloc-py-package
```

Installs a snapshot into `site-packages`. You must re-run this command after every change. Useful for verifying the package works when installed normally.

### Option 3: Install from a built distribution

Build a wheel or sdist first, then install it:

```bash
cd {path_to}/tracebloc-py-package
python -m build
cd {path_to}/start-training
pip install {path_to}/tracebloc-py-package/dist/tracebloc-*.whl
```

Use this to catch packaging issues (missing files, incorrect metadata, incomplete `install_requires`) before publishing to PyPI.

### Switching between options

`pip install` on the same package name replaces the previous installation automatically. To remove the package entirely:

```bash
pip uninstall tracebloc -y
```

### Running the notebook

```bash
jupyter notebook notebooks/traceblocTrainingGuide.ipynb
```

Select the **"start-training"** kernel if it isn't already selected.

**Note**: skip the `!pip install tracebloc[pytorch]` cell in the notebook — the package is already installed locally via one of the options above.

## Releasing

> Maintainer-only. Skip this section unless you're cutting a new
> tracebloc release.

Releases are tag-driven: pushing a tag matching `v*.*.*` on `develop`
triggers `.github/workflows/publish-release.yml`, which validates the
build matrix, publishes to TestPyPI, verifies the install, then waits
for a manual approval before publishing to PyPI.

End-to-end shape: **validate matrix → TestPyPI → verify-from-TestPyPI → manual approval → PyPI** (~20 min wall-clock with a hold at the approval gate).

### Pre-flight (one-time per repo)

These are configured once and stay good. Sanity-check them before the
first release on a new repo:

- **GitHub secrets** in repo settings:
  - `TEST_PYPI_API_TOKEN` — used by the publish-testpypi stage
  - `PYPI_API_TOKEN` — used by the publish-pypi stage
- **GitHub environment `production-pypi`** exists with required
  reviewers configured. The publish-pypi job uses this environment to
  force a manual approval gate before the PyPI upload.

### Step 0 — Optional dry-run (first-ever tag pipeline run)

Tag a release candidate first to exercise the pipeline without
publishing to production PyPI — deny the approval gate to stop just
before the PyPI upload:

```bash
git checkout develop && git pull --ff-only
git tag v0.X.YrcN && git push origin v0.X.YrcN
```

Skip this step on subsequent releases once the pipeline is known good.

### Step 1 — Tag the real release

```bash
git checkout develop
git pull --ff-only

# Confirm pyproject.toml version matches the tag you're about to push
# (the workflow's verify-tag-matches-version step fails otherwise).
grep "^version" pyproject.toml

git tag v0.X.Y
git push origin v0.X.Y
```

### Step 2 — Watch the workflow

```bash
gh run watch
gh run list --workflow=publish-release.yml --limit 5
```

Stages:

1. **validate** — Py 3.9–3.12 × 7 extras matrix; ~10–15 min
2. **publish-testpypi** — builds + uploads to TestPyPI; ~2 min
3. **verify-testpypi** — installs from TestPyPI; ~3 min (includes a 60s sleep for indexing)
4. **publish-pypi** — paused for approval

### Step 3 — Approve the production publish

GitHub: **Actions → workflow run → publish-pypi job → Review deployments → `production-pypi` → Approve**.

Once approved, the PyPI upload runs (~1 min). Verify:

```bash
pip index versions tracebloc    # should show the new version
```

### Step 4 — Same-day cross-repo coordination

Once the version is live on PyPI:

1. **`tracebloc/start-training`** — bump the notebook pin if the release ships user-facing API changes; flip the coordination PR from draft → ready-for-review; merge.
2. **`tracebloc/docs`** (Mintlify) — install command + API example updates as a same-day PR.
3. **`tracebloc/tracebloc-website-main`** — homepage quickstart snippet PR if relevant.

Skip 2 and 3 for quiet patch releases with no doc-visible surface.

### Step 5 — Post-release

1. **Update `CHANGELOG.md`**: change `## [0.X.Y] — Unreleased` → `## [0.X.Y] — YYYY-MM-DD`. One-line follow-up commit on `develop`.
2. Close any stale test-trigger PRs if still open.
3. Announce per team norms.

## Links

- [Documentation](https://docs.tracebloc.io/)
- [Quickstart Notebook](https://colab.research.google.com/github/tracebloc/start-training/blob/main/notebooks/traceblocTrainingGuide.ipynb)
- [Website](https://tracebloc.io)
- [GitHub](https://github.com/tracebloc)

## License

MIT
