Metadata-Version: 2.4
Name: qortex-learning
Version: 0.1.0
Summary: Bandit-based adaptive learning for qortex: Thompson Sampling, reward models, persistent state.
Author: Peleke Sengstacke
License-Expression: MIT
Keywords: adaptive-learning,bandit,knowledge-graph,thompson-sampling
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.11
Requires-Dist: aiosqlite>=0.20
Requires-Dist: qortex-observe
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Description-Content-Type: text/markdown

# qortex-learning

Bandit-based adaptive learning for qortex. It provides Thompson Sampling with Beta-Bernoulli posteriors, persistent state via SQLite, and pluggable reward models.
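
For readers new to the technique, here is a minimal standalone sketch of Thompson Sampling with Beta-Bernoulli posteriors. It uses only the standard library and is not taken from qortex-learning's implementation: `BetaPosterior`, `thompson_select`, and `simulate` are hypothetical names chosen for illustration, and the uniform Beta(1, 1) prior is an assumption.

```python
import random
from dataclasses import dataclass


@dataclass
class BetaPosterior:
    """Beta(alpha, beta) belief over one arm's success probability (illustrative only)."""

    alpha: float = 1.0  # pseudo-count of successes; 1.0 assumes a uniform prior
    beta: float = 1.0   # pseudo-count of failures

    def update(self, reward: float) -> None:
        # Bernoulli update: reward 1.0 counts as a success, 0.0 as a failure.
        self.alpha += reward
        self.beta += 1.0 - reward


def thompson_select(posteriors: dict[str, BetaPosterior], rng: random.Random) -> str:
    # Draw one sample from each arm's posterior and pick the largest draw.
    draws = {arm: rng.betavariate(p.alpha, p.beta) for arm, p in posteriors.items()}
    return max(draws, key=draws.get)


def simulate(true_rates: dict[str, float], rounds: int, seed: int = 0) -> dict[str, BetaPosterior]:
    # Run the select/observe loop against simulated Bernoulli arms.
    rng = random.Random(seed)
    posteriors = {arm: BetaPosterior() for arm in true_rates}
    for _ in range(rounds):
        arm = thompson_select(posteriors, rng)
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        posteriors[arm].update(reward)
    return posteriors


if __name__ == "__main__":
    # Hypothetical success rates for two arms, used only for this simulation.
    for arm, p in simulate({"v1": 0.2, "v2": 0.8}, rounds=500).items():
        pulls = int(p.alpha + p.beta - 2)
        print(arm, "pulls:", pulls, "posterior mean:", round(p.alpha / (p.alpha + p.beta), 3))
```

Because each arm is chosen according to a random draw from its posterior, arms with little evidence still get explored, while arms with a strong record are selected most of the time.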

## Usage

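```python
import asyncio

from qortex.learning import Arm, ArmOutcome, Learner, LearnerConfig


async def main() -> None:
    # Create a learner named "prompts".
    learner = await Learner.create(LearnerConfig(name="prompts"))

    # Select k=1 arm from the candidates for the given context.
    candidates = [Arm(id="v1", token_cost=10), Arm(id="v2", token_cost=15)]
    result = await learner.select(candidates, context={"task": "type-check"}, k=1)

    # Report the observed outcome for an arm.
    await learner.observe(ArmOutcome(arm_id="v2", outcome="accepted", reward=1.0))


asyncio.run(main())
```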

Part of the [qortex](https://github.com/Peleke/qortex) workspace.
