Metadata-Version: 2.4
Name: saturn-tokenfactory
Version: 0.0.1
Summary: Token Factory fine-tuning utilities — platform-owned LoRA SFT training scripts for Saturn Cloud's no-code fine-tuning product.
Author-email: Saturn Cloud <support@saturncloud.io>
License: MIT License
        
        Copyright (c) 2026 Saturn Cloud
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://github.com/saturncloud/tokenfactory
Project-URL: Repository, https://github.com/saturncloud/tokenfactory
Project-URL: Issues, https://github.com/saturncloud/tokenfactory/issues
Keywords: fine-tuning,lora,llm,saturn-cloud,token-factory
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: transformers
Requires-Dist: trl
Requires-Dist: peft
Requires-Dist: accelerate
Requires-Dist: datasets
Provides-Extra: quantization
Requires-Dist: bitsandbytes; extra == "quantization"
Provides-Extra: mlflow
Requires-Dist: mlflow; extra == "mlflow"
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: isort; extra == "dev"
Requires-Dist: flake8; extra == "dev"
Requires-Dist: mypy; extra == "dev"
Requires-Dist: types-requests; extra == "dev"
Dynamic: license-file

# Token Factory training utilities

Platform-owned training utilities for [Saturn Cloud](https://saturncloud.io)'s
Token Factory product — a no-code LoRA fine-tuning service.

This package runs **inside** a Token Factory fine-tuning job pod. It is not
intended as a general-purpose fine-tuning library; the runtime contract
(env vars, NFS layout, Atlas callback) is specific to the Token Factory
platform. See `DESIGN.md` for the full specification.

> **Status:** early development. The API and CLI surface are not yet stable.

## What it does

The package provides a single Python entrypoint (`saturn-tokenfactory-train`, or
`python -m saturn_tokenfactory`) that:

- Reads hyperparameters from environment variables set by Atlas.
- Loads a base model (Llama 3 / Mistral / Qwen) with family-aware defaults.
- Applies LoRA adapters via PEFT.
- Loads a dataset from an NFS-mounted directory (`.jsonl`, conversational /
  instruction / text formats).
- Runs SFT with `trl.SFTTrainer`.
- Writes `config.json`, LoRA checkpoints, and `manifest.json` to an
  NFS-mounted output directory.
- Posts an artifact-registration callback to Atlas on completion.
- Optionally logs to MLflow if `MLFLOW_TRACKING_URI` is set.
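
As a rough illustration of the first and fourth steps above, the sketch below
reads hyperparameters from environment variables and classifies a parsed
`.jsonl` record by format. It is not the package's actual code. The
`EXAMPLE_*` variable names, the `Hyperparameters` fields and defaults, and the
record keys (`messages`, `prompt`/`completion`, `text`) are all assumptions
made for the example; the real contract is defined in `DESIGN.md`. Only
`MLFLOW_TRACKING_URI` comes from this README.

```python
import os
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class Hyperparameters:
    # Hypothetical fields and defaults, not the package's real schema.
    learning_rate: float = 2e-4
    num_epochs: int = 1
    lora_rank: int = 16


def read_hyperparameters(env: Mapping[str, str] = os.environ) -> Hyperparameters:
    """Build hyperparameters from env vars, falling back to the defaults.

    The EXAMPLE_* names are placeholders; Atlas may use different ones.
    """
    defaults = Hyperparameters()
    return Hyperparameters(
        learning_rate=float(env.get("EXAMPLE_LEARNING_RATE", defaults.learning_rate)),
        num_epochs=int(env.get("EXAMPLE_NUM_EPOCHS", defaults.num_epochs)),
        lora_rank=int(env.get("EXAMPLE_LORA_RANK", defaults.lora_rank)),
    )


def mlflow_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """MLflow logging is optional: on only when a non-empty tracking URI is set."""
    return bool(env.get("MLFLOW_TRACKING_URI"))


def detect_record_format(record: Mapping[str, Any]) -> str:
    """Classify one parsed .jsonl record by its keys (assumed key names)."""
    if "messages" in record:
        return "conversational"
    if "prompt" in record and "completion" in record:
        return "instruction"
    if "text" in record:
        return "text"
    raise ValueError(f"unrecognized record keys: {sorted(record)}")
```

Passing the environment in as a mapping, rather than reading `os.environ`
inside each function, keeps the parsing logic testable without touching the
process environment.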

## Out of scope

LoRA SFT only. No full fine-tunes, no DPO/RLHF, no multi-GPU, no serving.
See `DESIGN.md` §14 for the explicit out-of-scope list.

## Development

```bash
make conda-update      # create/update the conda env
make check-format      # black + isort (read-only check)
make format-backend    # black + isort (apply)
make flake8 mypy       # lint + type-check
make test-backend      # unit tests
make lint-backend      # full lint chain
```
