Metadata-Version: 2.4
Name: dot-metrics
Version: 0.2.0
Summary: A lightweight metrics and constraint evaluation framework
Project-URL: Homepage, https://gitlab.com/deepika6190303/deepika-open-toolbox/dot-metrics
Project-URL: Repository, https://gitlab.com/deepika6190303/deepika-open-toolbox/dot-metrics
Project-URL: Issues, https://gitlab.com/deepika6190303/deepika-open-toolbox/dot-metrics/-/issues
Author-email: deepika Team <contact@deepika.ai>
License: TODO: TO BE COMPLETED
License-File: LICENSE
Keywords: constraints,deepika,evaluation,metrics,open-toolbox
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.12
Requires-Dist: pydantic>=2.0
Provides-Extra: explore
Requires-Dist: dash-bootstrap-components>=1.0; extra == 'explore'
Requires-Dist: dash-bootstrap-templates>=1.0; extra == 'explore'
Requires-Dist: dash>=2.0; extra == 'explore'
Requires-Dist: pandas>=2.0; extra == 'explore'
Requires-Dist: plotly>=5.0; extra == 'explore'
Description-Content-Type: text/markdown

# dot-metrics

![Python Version](https://img.shields.io/badge/python-3.12%2B-blue)
![Coverage](https://img.shields.io/badge/coverage-100%25-brightgreen)

**dot-metrics** is a lightweight metrics and constraint evaluation framework. Define metrics and constraints, run them against your data, and get structured results with debug info.

## Install

```bash
pip install dot-metrics
```

## Concept

A `MetricSet` holds metric and constraint definitions. Call `compute(data)` to evaluate them.

```
MetricSet
 ├── metrics:     {"coverage": MetricDefinition}
 └── constraints: {"errors":   ConstraintDefinition}
          │
          ▼
     set.compute(data)
          │
          ▼
      EvalResult
       ├── metrics:     {"coverage": Metric}
       └── constraints: {"errors":   Constraint}
```

- **Metrics** are continuous measurements (e.g. coverage rate, score).
- **Constraints** are pass/fail checks against a threshold (e.g. error count ≤ 0).

A constraint passes when `value <= threshold`.

## Quick start

```python
from dot_metrics import MetricSet

metric_set = MetricSet()

@metric_set.metric("coverage")
def coverage(data):
    return data["covered"] / data["total"]

@metric_set.constraint("errors", threshold=0)
def errors(data):
    return data["error_count"]

result = metric_set.compute({"covered": 90, "total": 100, "error_count": 0})

result.score("coverage")     # 0.9
result.constraints_ok        # True
```

## Defining metrics and constraints

### Decorator style

```python
metric_set = MetricSet()

@metric_set.metric("latency_ms", unit="ms", higher_is_better=False)
def latency(data):
    return data["total_ms"] / data["requests"]

@metric_set.constraint("error_rate", threshold=0.01, unit="%")
def error_rate(data):
    return data["errors"] / data["requests"]
```

### Imperative style

```python
metric_set = MetricSet()
metric_set.add("coverage", lambda data: data["covered"] / data["total"])
metric_set.add_constraint("errors", lambda data: data["error_count"], threshold=0)
```

Both styles are equivalent. `add()` and `add_constraint()` accept the same keyword arguments as the decorators.
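
For example, a minimal sketch restating the decorator definitions above imperatively, with the same keyword arguments:

```python
from dot_metrics import MetricSet

metric_set = MetricSet()

# Same metric and constraint as the decorator example, defined imperatively.
metric_set.add(
    "latency_ms",
    lambda data: data["total_ms"] / data["requests"],
    unit="ms",
    higher_is_better=False,
)
metric_set.add_constraint(
    "error_rate",
    lambda data: data["errors"] / data["requests"],
    threshold=0.01,
    unit="%",
)
```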

### Parameters

**`metric_set.metric(key, *, unit="", description="", higher_is_better=True, metadata={})`**

**`metric_set.add(key, fn, *, unit="", description="", higher_is_better=True, metadata={})`**

**`metric_set.constraint(key, *, threshold, unit="", description="", metadata={})`**

**`metric_set.add_constraint(key, fn, *, threshold, unit="", description="", metadata={})`**

- `key` — unique name for the metric/constraint
- `threshold` — constraint passes when `value <= threshold`
- `higher_is_better` — affects terminal chart rendering
- `metadata` — arbitrary dict, passed through to results (see the sketch below)
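
A hedged sketch of the `metadata` pass-through; reading it back via a `.metadata` attribute on the computed result is an assumption based on the description above, not a documented accessor:

```python
metric_set = MetricSet()
metric_set.add(
    "coverage",
    lambda data: data["covered"] / data["total"],
    description="Share of covered items",
    metadata={"owner": "qa-team"},
)

result = metric_set.compute({"covered": 90, "total": 100})
# Assumed accessor: the metadata dict is expected to travel with the result.
result.metrics["coverage"].metadata   # {"owner": "qa-team"}
```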

## Computing results

```python
result = metric_set.compute(data)
```

Every metric and constraint function must accept **exactly one argument** — the data object. `data` can be anything: a dict, dataclass, Pydantic model, etc.
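
For instance, a Pydantic model works just as well as a dict (pydantic is already a core dependency); `RunData` below is a hypothetical model used only for illustration:

```python
from pydantic import BaseModel
from dot_metrics import MetricSet

class RunData(BaseModel):
    covered: int
    total: int

metric_set = MetricSet()
metric_set.add("coverage", lambda d: d.covered / d.total)

result = metric_set.compute(RunData(covered=90, total=100))
result.score("coverage")   # 0.9
```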

### Accessing results

```python
result.score("coverage")                  # float — metric value
result.metrics["coverage"].value          # same
result.metrics["coverage"].unit           # ""
result.metrics["coverage"].debug          # {} by default

result.constraints["errors"].value        # float
result.constraints["errors"].passed       # True/False
result.constraints["errors"].threshold    # 0

result.constraints_ok                     # True if all constraints passed
result.violations                         # list of failed Constraint objects
result.assert_constraints()               # raises ValueError if any failed
```
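
A minimal sketch of using these accessors as a gate, e.g. in a test or CI step:

```python
result = metric_set.compute(data)

# Log every violated constraint, then fail loudly if any exist.
for constraint in result.violations:
    print(f"violated: value={constraint.value}, threshold={constraint.threshold}")

result.assert_constraints()   # raises ValueError if any constraint failed
```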

## Attaching debug info

Return a `ComputedValue` instead of a plain float to attach structured debug data:

```python
from dot_metrics import MetricSet, ComputedValue

metric_set = MetricSet()

@metric_set.metric("coverage")
def coverage(data):
    missed = [x for x in data if not x["covered"]]
    return ComputedValue(value=1 - len(missed) / len(data), debug={"missed": missed})

result = metric_set.compute(data)
result.metrics["coverage"].debug    # {"missed": [...]}
```

`ComputedValue` works the same way for constraints.
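
For example, continuing the snippet above, a constraint can attach the offending items as debug data (the `uncovered` key is just for illustration):

```python
@metric_set.constraint("uncovered", threshold=0)
def uncovered(data):
    missed = [x for x in data if not x["covered"]]
    return ComputedValue(value=len(missed), debug={"missed": missed})

result = metric_set.compute(data)
result.constraints["uncovered"].debug    # {"missed": [...]}
result.constraints["uncovered"].passed   # True only if nothing was missed
```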

## Batch evaluation

Evaluate a set of inputs in one call:

```python
# dict of inputs
batch = metric_set.compute_batch({"run_1": data1, "run_2": data2})
batch["run_1"].score("coverage")            # 0.9
batch.scores("coverage")                   # {"run_1": 0.9, "run_2": 0.85}

# list of inputs
batch = metric_set.compute_batch([data1, data2, data3])
batch[0].score("coverage")                 # indexed by position
```

`BatchResult` supports iteration, `len()`, and `.items()`.
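
For example, a sketch that iterates over a batch and counts the runs whose constraints all passed:

```python
batch = metric_set.compute_batch({"run_1": data1, "run_2": data2})

# .items() yields (key, EvalResult) pairs; len() gives the number of runs.
passing = sum(1 for key, result in batch.items() if result.constraints_ok)
print(f"{passing}/{len(batch)} runs passed all constraints")
```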

## Metric documentation

Add a Google-style docstring to a metric or constraint function and it gets parsed into a `help` dict on both the definition and the result:

```python
@metric_set.metric("coverage", unit="%")
def coverage(data):
    """Percentage of code paths covered by tests.

    Range: 0-100
    Interpretation:
        - 90-100: Excellent
        - 70-90:  Good
        - <70:    Needs improvement
    Notes:
        - Returns 0 for empty input.
    """
    return sum(1 for x in data if x["covered"]) / len(data)

metric_set.metrics["coverage"].help
# {
#   "summary": "Percentage of code paths covered by tests.",
#   "range": "0-100",
#   "interpretation": "- 90-100: Excellent\n- 70-90:  Good\n- <70:    Needs improvement",
#   "notes": "- Returns 0 for empty input."
# }

result = metric_set.compute(data)
result.metrics["coverage"].help     # same dict
```

Supported sections: `Range:`, `Interpretation:`, `Notes:`. No docstring → `help` is `{}`.

## Typing

`MetricSet` is generic over the input type `T`. Annotating it lets static type checkers (mypy, pyright) verify that every registered function accepts the right type:

```python
from dataclasses import dataclass
from dot_metrics import MetricSet

@dataclass
class SchedulingData:
    appointments: list
    solution: list

metric_set: MetricSet[SchedulingData] = MetricSet()
metric_set.add("rate", lambda d: len(d.solution) / len(d.appointments))

result = metric_set.compute(SchedulingData(appointments=[...], solution=[...]))
```

The annotation is optional — omitting it is fine and everything still works at runtime.

## Interactive explorer

Explore batch results interactively in the browser with scatter, bar, and heatmap charts.

```bash
pip install "dot-metrics[explore]"
```

```python
from dot_metrics import MetricSet
from dot_metrics.explore import serve

ms = MetricSet()
ms.add("score", lambda x: x["s"])
ms.add_constraint("errors", lambda x: x["e"], threshold=1)

batch = ms.compute_batch({
    ("gpt4", "en"): {"s": 0.9, "e": 0},
    ("gpt4", "fr"): {"s": 0.7, "e": 2},
    ("llama", "en"): {"s": 0.8, "e": 0},
    ("llama", "fr"): {"s": 0.6, "e": 1},
}, key_names=["model", "language"])

serve(batch)  # opens localhost:8050
```

The app provides:
- **Chart** — scatter, bar, or heatmap with configurable X, Y, color, and size axes
- **Aggregation panel** — compute mean/median/min/max grouped by any categorical column on the fly
- **Data table** — sortable, with debug cell inspection and CSV export

Pass `key_names` to `compute_batch` to label tuple key components (defaults to `key[0]`, `key[1]`, …). You can also pass a single `EvalResult` instead of a `BatchResult`.

`serve(data, *, host="127.0.0.1", port=8050, debug=False)`
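
For instance, continuing the snippet above, a single result can be served on a custom port:

```python
result = ms.compute({"s": 0.9, "e": 0})
serve(result, port=8051)   # a single EvalResult works as well
```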

## Terminal chart

```python
from dot_metrics import draw_terminal_chart

print(draw_terminal_chart(result))
# coverage  ████████████████████  0.90
```

`draw_terminal_chart(result, width=40, char="█")` returns a string — use `print()` to display it.
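
For example, a narrower chart with a different fill character:

```python
# Same chart, 20 characters wide, drawn with '#' instead of the default block.
print(draw_terminal_chart(result, width=20, char="#"))
```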

## Full example

```python
from dot_metrics import MetricSet, ComputedValue

appointments = [
    {"id": "A1", "patient": "Alice",   "duration": 30},
    {"id": "A2", "patient": "Bob",     "duration": 60},
    {"id": "A3", "patient": "Charlie", "duration": 30},
]

solution = [
    {"appointment_id": "A1", "practitioner": "Dr. Martin", "slot": "09:00", "scheduled": True},
    {"appointment_id": "A2", "practitioner": "Dr. Martin", "slot": "09:00", "scheduled": True},  # conflict!
    {"appointment_id": "A3", "practitioner": "Dr. Martin", "slot": "10:00", "scheduled": True},
]

metric_set = MetricSet()

@metric_set.metric("scheduling_rate")
def scheduling_rate(data):
    scheduled = [e for e in data["solution"] if e["scheduled"]]
    unscheduled = [e["appointment_id"] for e in data["solution"] if not e["scheduled"]]
    return ComputedValue(value=len(scheduled) / len(data["appointments"]), debug={"unscheduled": unscheduled})

@metric_set.constraint("conflicts", threshold=0)
def count_conflicts(data):
    seen = {}
    conflicts = []
    for entry in data["solution"]:
        key = (entry["practitioner"], entry["slot"])
        if key in seen:
            conflicts.append((seen[key], entry["appointment_id"]))
        seen[key] = entry["appointment_id"]
    return ComputedValue(value=len(conflicts), debug={"conflicts": conflicts})

result = metric_set.compute({"appointments": appointments, "solution": solution})

result.score("scheduling_rate")                     # 1.0
result.constraints_ok                               # False
result.constraints["conflicts"].debug               # {"conflicts": [("A1", "A2")]}
```

## Reference

| Import | Description |
|--------|-------------|
| `MetricSet` | Main class — holds definitions, runs computation |
| `EvalResult` | Output of `compute()` — holds `Metric` and `Constraint` dicts |
| `BatchResult` | Output of `compute_batch()` — maps keys to `EvalResult` |
| `ComputedValue` | Wraps a float return value with optional debug data |
| `Metric` | Computed metric result |
| `Constraint` | Computed constraint result with `passed` flag |
| `MetricDefinition` | Stored metric definition (in `metric_set.metrics`) |
| `ConstraintDefinition` | Stored constraint definition (in `metric_set.constraints`) |
| `draw_terminal_chart` | Renders a Unicode bar chart from an `EvalResult` |
| `explore.serve` | Launches an interactive Dash explorer (requires `pip install "dot-metrics[explore]"`) |

## Contributing & Development

See [docs/CONTRIBUTING.md](docs/CONTRIBUTING.md) and [docs/DEVELOPMENT.md](docs/DEVELOPMENT.md).

## License

See [LICENSE](LICENSE) for details.

## Contact

deepika Team — contact@deepika.ai

Project: [gitlab.com/deepika6190303/deepika-open-toolbox/dot-metrics](https://gitlab.com/deepika6190303/deepika-open-toolbox/dot-metrics)
