Metadata-Version: 2.4
Name: aria-py
Version: 0.1.1
Summary: ARIA — Adaptive Readiness Index Algorithm. Adaptive sequencing engine for any domain.
Author-email: Jay Gurav <jaymgurav@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/JayMGurav/aria-engine
Project-URL: Repository, https://github.com/JayMGurav/aria-engine
Project-URL: Documentation, https://jaymgurav.github.io/aria-engine/docs/how-aria-works.html
Project-URL: Bug Tracker, https://github.com/JayMGurav/aria-engine/issues
Keywords: adaptive learning,sequencing,spaced repetition,recommender,aria
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Provides-Extra: dev
Requires-Dist: pytest>=7.4; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"

# aria-py

[![PyPI](https://img.shields.io/pypi/v/aria-py)](https://pypi.org/project/aria-py/)
[![Python](https://img.shields.io/pypi/pyversions/aria-py)](https://pypi.org/project/aria-py/)
[![CI](https://github.com/JayMGurav/aria-engine/actions/workflows/aria-py-ci.yml/badge.svg)](https://github.com/JayMGurav/aria-engine/actions)
[![License: MIT](https://img.shields.io/badge/license-MIT-blue)](https://github.com/JayMGurav/aria-engine/blob/main/LICENSE)

**Adaptive Readiness Index Algorithm — adaptive sequencing engine for any domain.**

Python implementation of [aria-engine](https://github.com/JayMGurav/aria-engine). Given a pool of items and a stream of user feedback, `aria-py` selects the best next item for each user — from the very first interaction, with no training data, no configuration, and no external dependencies.

Two methods. That's the API.

```python
engine.suggest("user_id")                              # → best next item
engine.feedback("user_id", "item_id", signal)          # → update user model
```

→ [How ARIA works](https://jaymgurav.github.io/aria-engine/docs/how-aria-works.html)

---

## What it does

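Most sequencing solutions are either domain-locked (Anki's SM-2 is memory-only), too heavy to deploy cold (BKT and IRT require pre-calibration), or stateless (bandits don't model growth). `aria-py` is none of those: the domain is caller-defined, it works from the first interaction, and it keeps a per-user model that evolves with feedback.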

The core algorithm — ARIA, Adaptive Readiness Index Algorithm — scores every eligible item using three multiplicatively composed factors:

- **Challenge fit** — Gaussian centred at `skill + optimism_bias`. Targets slightly above where the user currently is.
- **Spacing fit** — Forgetting curve on time since last seen. Surfaces items that are due.
- **Coverage fit** — Inverse category frequency. Prevents over-drilling one area.

The optimism bias has a hard floor. The engine always believes the user can do better than their current level, even after failure.

**Domain is entirely caller-defined.** The engine ships no built-in domains. You define what items are, what scoring means, and how state evolves:

| Domain | `difficulty` | `topic` |
|---|---|---|
| Learning | question difficulty | subject |
| E-commerce | price ratio | product category |
| Travel | remoteness / adventure level | region |
| Content | reading complexity | genre |
| Jobs | seniority level | industry |
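
Because `difficulty` is documented as a normalised 0.0 to 1.0 value, raw domain quantities such as prices or years of experience need scaling before they go into an `Item`. The following is a minimal sketch, assuming plain min-max scaling suits your data; `normalise_difficulty` and the sample numbers are hypothetical and not part of `aria-py`.

```python
def normalise_difficulty(raw_values: dict[str, float]) -> dict[str, float]:
    """Min-max scale raw domain values into the 0.0-1.0 range.

    Hypothetical helper for illustration; not shipped with aria-py.
    """
    if not raw_values:
        return {}
    lo, hi = min(raw_values.values()), max(raw_values.values())
    if hi == lo:
        # Every item is equally demanding; place them mid-range.
        return {item_id: 0.5 for item_id in raw_values}
    return {item_id: (value - lo) / (hi - lo) for item_id, value in raw_values.items()}


# Made-up example: seniority expressed as years of experience required.
jobs = {"junior": 0.0, "mid": 3.0, "senior": 6.0, "staff": 12.0}
scaled = normalise_difficulty(jobs)
print(scaled)
```

Min-max scaling is sensitive to outliers; a rank-based or log scale may fit skewed data such as prices better.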

---

## Install

```bash
pip install aria-py
```

Requires Python 3.9+. Zero runtime dependencies.

---

## Quickstart

```python
from aria import Engine, Item, Signal

engine = Engine()

engine.add_items([
    Item(id="intro",     difficulty=0.1, topic="math"),
    Item(id="fractions", difficulty=0.4, topic="math"),
    Item(id="algebra",   difficulty=0.7, topic="math", prereqs=["fractions"]),
    Item(id="reading",   difficulty=0.2, topic="lang"),
    Item(id="writing",   difficulty=0.5, topic="lang"),
])

# suggest → feedback loop
for _ in range(10):
    item = engine.suggest("alice")
    print(f"→ {item.id}  (difficulty={item.difficulty}, topic={item.topic})")
    engine.feedback("alice", item.id, Signal(success=True, effort=0.5))

# inspect user model
state = engine.get_state("alice")
print(f"skill={state.skill:.3f}  optimism={state.optimism_bias:.3f}")
print(f"solved={state.solved_set}")
```

---

## API

### `Engine`

```python
Engine(
    config: Config | None = None,  # optional tuning — see Config below
    *,
    default_factors: bool = True,  # auto-registers Challenge, Spacing, Coverage
    seed: int | None = None,       # seed RNG for deterministic behaviour (tests)
)
```

| Method | Returns | Description |
|---|---|---|
| `.add_items(items)` | `Engine` | Register items. Validates prereq graph. Chainable. |
| `.add_factor(factor)` | `Engine` | Append a custom factor to the pipeline. Chainable. |
| `.suggest(user_id, now?)` | `Item` | Return the best next item for this user. |
| `.feedback(user_id, item_id, signal, now?)` | `None` | Update user model after an interaction. |
| `.get_state(user_id)` | `UserState` | Read current user model. Creates fresh state if new. |
| `.load_state(user_id, state)` | `None` | Restore a `UserState` object directly. |
| `.export_state(user_id)` | `dict[str, Any]` | Serialise state to a JSON-safe dict. |
| `.import_state(user_id, data)` | `None` | Restore state from a dict produced by `export_state`. |

### `Item`

```python
Item(
    id: str,                        # unique identifier
    difficulty: float,              # 0.0 – 1.0 normalised challenge proxy
    topic: str,                     # category string for coverage balancing
    prereqs: list[str] = [],        # item IDs that must be solved first
    metadata: dict[str, Any] = {},  # arbitrary caller data; available to custom factors
)
```

### `Signal`

```python
Signal(
    success: bool,  # did the user complete / answer correctly?
    effort: float,  # 0.0 (trivial) – 1.0 (maximum effort)
)
```

### `Config`

```python
Config(
    exploration_rate: float = 0.05,      # noise on scores; 0 = fully deterministic
    optimism_floor: float = 0.05,        # hard lower bound — do not change in production
    optimism_ceiling: float = 0.35,      # upper bound on optimism_bias
    bandwidth: float = 0.2,             # σ of challenge Gaussian; wider = more variety
    alpha: float = 0.05,                # EMA learning rate for skill updates
    optimal_interval: float = 86_400.0, # spacing review window in seconds (24h default)
    heap_threshold: int = 500,          # switch to heap selection above this item count
)
```
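
To get a feel for `bandwidth`, it can help to evaluate the challenge Gaussian from the scoring formula in this README at a few settings. The sketch below is a standalone re-implementation of that one formula for illustration, not the library's code; `challenge_fit` is a name chosen here.

```python
import math


def challenge_fit(difficulty: float, target: float, bandwidth: float) -> float:
    """Gaussian challenge fit, following the formula documented in this README."""
    return math.exp(-((difficulty - target) ** 2) / (2 * bandwidth ** 2))


# An item 0.2 above the target, scored under three bandwidth settings.
fits = {sigma: challenge_fit(0.6, 0.4, sigma) for sigma in (0.1, 0.2, 0.4)}
for sigma, fit in fits.items():
    print(f"bandwidth={sigma}: challenge={fit:.3f}")
```

A narrow bandwidth sharply penalises items away from the target, while a wide one keeps them competitive, which is what "wider = more variety" refers to.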

---

## Scoring formula

```
readiness(item) = challenge × spacing × coverage × (1 + noise)

challenge = exp( −(difficulty − target)² / (2σ²) )  target = skill + optimism_bias, σ = bandwidth
spacing   = 1 − exp( −elapsed / optimal_interval )
coverage  = 1 / ( 1 + topic_count[topic] / mean_topic_count )
noise     = random() × exploration_rate
```

All three factors are **multiplied together** — all must agree. A perfect-difficulty item seen 10 seconds ago still scores near zero. One weak signal suppresses the item entirely.
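
The formula above can be sketched in plain Python to check the "seen 10 seconds ago" claim numerically. This is an illustrative re-implementation under stated assumptions, not the library's source: the function name `readiness`, its parameter list, and the handling of an empty history (coverage treated as 1.0 when `mean_topic_count` is 0) are choices made for this sketch.

```python
import math
import random


def readiness(difficulty, skill, optimism_bias, elapsed, topic_count, mean_topic_count,
              bandwidth=0.2, optimal_interval=86_400.0, exploration_rate=0.05, rng=None):
    """Multiplicative readiness score, following the formula in this README."""
    target = skill + optimism_bias
    challenge = math.exp(-((difficulty - target) ** 2) / (2 * bandwidth ** 2))
    spacing = 1 - math.exp(-elapsed / optimal_interval)
    # Assumption: with no topic history yet, coverage does not penalise anything.
    ratio = topic_count / mean_topic_count if mean_topic_count > 0 else 0.0
    coverage = 1 / (1 + ratio)
    noise = (rng or random).random() * exploration_rate
    return challenge * spacing * coverage * (1 + noise)


# Same perfectly targeted item, seen 10 seconds ago versus one day ago (noise disabled).
common = dict(difficulty=0.55, skill=0.5, optimism_bias=0.05,
              topic_count=2, mean_topic_count=2, exploration_rate=0.0)
just_seen = readiness(elapsed=10, **common)
day_ago = readiness(elapsed=86_400, **common)
print(f"just seen: {just_seen:.6f}   a day ago: {day_ago:.3f}")
```

With a spacing factor of roughly 0.0001 after 10 seconds, the product collapses even though the challenge fit is essentially perfect.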

---

## State update rules

Called internally by `feedback()` after every interaction. O(1).

```
performance = success × (0.5 + 0.5 × (1 − effort))
skill       = skill + α × (performance − skill)        # exponential moving average

optimism += 0.02   if success and effort < 0.4         # easy win → raise target
optimism −= 0.01   if not success                      # ease back slightly
optimism  unchanged otherwise                          # good challenge — target is right

optimism always clamped to [optimism_floor, optimism_ceiling]
```

The **optimism floor (0.05) is a non-negotiable invariant**: after any number of failures the engine still targets at least 0.05 above the user's current skill on the 0.0 to 1.0 difficulty scale. Although it appears as `optimism_floor` in `Config`, treat it as fixed behaviour rather than a tuning knob. Growth is the assumption.
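
The update rules above translate almost line for line into Python. The sketch below is a standalone illustration, not the library's `UserState`: `SketchState`, `update`, and the starting values (skill 0.0, optimism at the floor) are assumptions made for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchState:
    """Stand-in user model; starting values are assumptions, not library defaults."""
    skill: float = 0.0
    optimism_bias: float = 0.05


def update(state, success, effort, alpha=0.05, floor=0.05, ceiling=0.35):
    """Apply the documented update rules and return a new state."""
    performance = (1.0 if success else 0.0) * (0.5 + 0.5 * (1.0 - effort))
    skill = state.skill + alpha * (performance - state.skill)  # EMA step
    optimism = state.optimism_bias
    if success and effort < 0.4:
        optimism += 0.02   # easy win: raise the target
    elif not success:
        optimism -= 0.01   # failure: ease back slightly
    optimism = min(max(optimism, floor), ceiling)  # clamp to [floor, ceiling]
    return replace(state, skill=skill, optimism_bias=optimism)


start = SketchState()
after_win = update(start, success=True, effort=0.5)

struggling = start
for _ in range(20):
    struggling = update(struggling, success=False, effort=1.0)

cruising = start
for _ in range(20):
    cruising = update(cruising, success=True, effort=0.1)

print(after_win, struggling.optimism_bias, cruising.optimism_bias)
```

Twenty consecutive failures leave the optimism bias pinned at the floor, and twenty easy wins pin it at the ceiling.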

---

## Persistence

`UserState` serialises to a plain JSON-safe dict. You own the storage layer.

```python
import json

# save after session (`db` stands for whatever key-value store client you use)
snapshot = engine.export_state("alice")
db.set("aria:alice", json.dumps(snapshot))

# restore on next session / server restart
engine.import_state("alice", json.loads(db.get("aria:alice")))
```
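
If no database is at hand, a directory of JSON files is enough for the same round trip. The following is a minimal sketch, assuming user IDs are safe to use as file names; `save_snapshot`, `load_snapshot`, and the sample snapshot dict are hypothetical and only stand in for what `export_state` would return.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Optional


def save_snapshot(directory: Path, user_id: str, snapshot: dict[str, Any]) -> Path:
    """Write a JSON-safe snapshot atomically (temp file, then rename)."""
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / f"{user_id}.json"  # assumes user_id is filesystem-safe
    fd, tmp_name = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(snapshot, handle)
    os.replace(tmp_name, target)  # a reader never sees a half-written file
    return target


def load_snapshot(directory: Path, user_id: str) -> Optional[dict[str, Any]]:
    """Return the stored snapshot, or None for a user with no saved state."""
    target = directory / f"{user_id}.json"
    if not target.exists():
        return None
    return json.loads(target.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)
    example = {"skill": 0.0375, "optimism_bias": 0.05}  # made-up snapshot shape
    save_snapshot(store, "alice", example)
    restored = load_snapshot(store, "alice")
    missing = load_snapshot(store, "bob")
print(restored, missing)
```

In real use the dict passed to `save_snapshot` would come from `engine.export_state(...)`, and the loaded dict would go to `engine.import_state(...)`.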

---

## Custom factors

Subclass `Factor` and register with `engine.add_factor()`. The pipeline multiplies your score with the built-ins automatically. Order of registration is order of multiplication.

```python
from aria import Engine, Factor, Item
from aria.models import Config, UserState

class RecencyBoostFactor(Factor):
    name = "recency_boost"

    def score(self, item: Item, state: UserState, now: float, config: Config) -> float:
        # items flagged as newly added keep their full score; everything else is damped to 0.7
        return 1.0 if item.metadata.get("is_new") else 0.7

engine = Engine()
engine.add_factor(RecencyBoostFactor())
```

Turn off the built-ins entirely and supply only your own:

```python
engine = Engine(default_factors=False)
engine.add_factor(MyChallengeFactor())  # placeholders for your own Factor subclasses,
engine.add_factor(MySpacingFactor())    # written like RecencyBoostFactor above
```
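
To see why a multiplicative pipeline behaves differently from an averaged one, the composition can be mimicked with plain callables. This is a conceptual sketch only: the stand-in factors and `pipeline_score` are invented for illustration and do not use the library's `Factor` class.

```python
import math
from typing import Callable, Sequence

# Stand-in factors: plain callables that map an item dict to a score in [0, 1].
FactorFn = Callable[[dict], float]


def pipeline_score(item: dict, factors: Sequence[FactorFn]) -> float:
    """Multiply factor scores in registration order."""
    return math.prod(factor(item) for factor in factors)


item = {"challenge": 0.9, "spacing": 0.01, "is_new": True}
factors = [
    lambda it: it["challenge"],                    # strong signal
    lambda it: it["spacing"],                      # very weak signal
    lambda it: 1.0 if it.get("is_new") else 0.7,   # custom factor, as in the example above
]

multiplicative = pipeline_score(item, factors)
additive_mean = sum(factor(item) for factor in factors) / len(factors)
print(f"multiplicative={multiplicative:.4f}  additive mean={additive_mean:.4f}")
```

The weak spacing score drags the product close to zero, whereas an average would still rank the item highly.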

---

## Prerequisites

Items can declare prerequisites. Locked items are never returned by `suggest()` until all their dependencies are in `solved_set`.

```python
engine.add_items([
    Item("intro",    difficulty=0.1, topic="math"),
    Item("algebra",  difficulty=0.6, topic="math", prereqs=["intro"]),
    Item("calculus", difficulty=0.9, topic="math", prereqs=["algebra"]),
])
# calculus is invisible until intro and algebra are both solved
```

The registry validates the prerequisite graph for **cycles and dangling references** on `add_items()` — once, at registration time, not at every query.
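
A check of this kind can be sketched with Kahn's algorithm, which the Performance table names. The code below is a standalone illustration over a plain `dict` of prerequisites, not the registry's implementation; `validate_prereqs` and `unlocked` are names chosen for this example.

```python
from collections import deque


def validate_prereqs(prereqs: dict[str, list[str]]) -> list[str]:
    """Return a topological order, or raise ValueError on a dangling ref or cycle."""
    for item_id, deps in prereqs.items():
        for dep in deps:
            if dep not in prereqs:
                raise ValueError(f"{item_id!r} requires unknown item {dep!r}")

    indegree = {item_id: len(set(deps)) for item_id, deps in prereqs.items()}
    dependents: dict[str, list[str]] = {item_id: [] for item_id in prereqs}
    for item_id, deps in prereqs.items():
        for dep in set(deps):
            dependents[dep].append(item_id)

    # Kahn's algorithm: repeatedly remove items with no unmet prerequisites.
    queue = deque(item_id for item_id, count in indegree.items() if count == 0)
    order = []
    while queue:
        current = queue.popleft()
        order.append(current)
        for nxt in dependents[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)

    if len(order) != len(prereqs):
        raise ValueError("prerequisite graph contains a cycle")
    return order


def unlocked(prereqs: dict[str, list[str]], solved: set[str]) -> list[str]:
    """Items whose prerequisites are all in the solved set."""
    return [item_id for item_id, deps in prereqs.items() if all(d in solved for d in deps)]


graph = {"intro": [], "algebra": ["intro"], "calculus": ["algebra"]}
order = validate_prereqs(graph)
available = unlocked(graph, solved={"intro"})
print(order, available)
```

Running the validation once at registration keeps the per-query work down to the cheap `unlocked`-style membership test.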

---

## Design decisions

**Multiplicative pipeline, not additive.** All factors must agree. A near-zero in any single factor suppresses the item entirely. This prevents an item from sneaking through on one strong signal while failing every other dimension.

**Optimism floor is non-negotiable.** The minimum is 0.05 (`Config.optimism_floor`) and is not meant to be tuned. The engine always targets slightly above current skill, even after repeated failure. Growth is the invariant.

**Immutable state updates.** `feedback()` constructs a new `UserState` — it never mutates in place. Easy to persist, snapshot, and reason about.

**No external dependencies.** stdlib only. numpy and pandas are explicitly excluded — the engine must be droppable into any project without dependency conflicts.

**Heap activation threshold.** Linear scan for ≤500 items (cache-friendly). Max-heap selection for >500 items (O(log n) per heap operation, O(n log n) per `suggest()` call). The switch is automatic based on `config.heap_threshold`.

**Caller owns persistence.** The engine is in-memory only. Serialise state to any store with `export_state` / `import_state`. The engine has no opinion on where you keep data.

---

## Performance

| Operation | Complexity | Notes |
|---|---|---|
| `feedback()` state update | O(1) | Arithmetic and dict operations only; returns a new `UserState` |
| `suggest()` ≤ 500 items | O(n) | Linear scan — cache-friendly |
| `suggest()` > 500 items | O(n log n) | Max-heap selection |
| `add_items()` prereq validation | O(n + e) | Kahn's topological sort, once on registration |

Space: O(I + T) per user, where I = items interacted with, T = unique topics seen. Not total item pool size.
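
The threshold switch between a linear scan and heap selection can be sketched with the standard library. This is an assumption-laden illustration, not the engine's code: `select_best` is a hypothetical name, scores are taken as already computed, and the heap is built with `heapq.heapify`, which may differ from the library's own strategy.

```python
import heapq


def select_best(scores: dict[str, float], heap_threshold: int = 500) -> str:
    """Return the id with the highest score, switching strategy by pool size."""
    if not scores:
        raise ValueError("no eligible items")
    if len(scores) <= heap_threshold:
        # Small pool: a single linear pass.
        return max(scores, key=scores.__getitem__)
    # Large pool: heapq is a min-heap, so negate scores to pop the maximum.
    heap = [(-score, item_id) for item_id, score in scores.items()]
    heapq.heapify(heap)
    return heapq.heappop(heap)[1]


pool = {f"item{i}": i / 1000 for i in range(1000)}
via_heap = select_best(pool)                       # 1000 items > 500, heap path
via_scan = select_best(pool, heap_threshold=2000)  # forced linear scan
print(via_heap, via_scan)
```

Both paths agree when scores are distinct; with exact ties the two strategies may pick different items, which is one reason to keep a little exploration noise on.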

---

## Development

```bash
git clone https://github.com/JayMGurav/aria-engine
cd aria-engine/aria-py

python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"

pytest              # 50 tests, < 0.1s
mypy aria --strict  # zero errors
ruff check aria     # zero warnings
```

---

## Monorepo layout

```
aria-engine/
├── aria-core/   Rust    — crates.io/crates/aria-core        ✅ v0.1.0
├── aria-py/     Python  — pypi.org/project/aria-py          ✅ v0.1.1  ← you are here
└── aria-ts/     TypeScript / WASM — npm (planned)           🔜
```

All packages expose the same API surface and module structure. The mental model transfers completely across languages.

---

## License

MIT — see [LICENSE](https://github.com/JayMGurav/aria-engine/blob/main/LICENSE)
