Metadata-Version: 2.4
Name: ingero-annotate
Version: 0.1.0
Summary: Writer for the Ingero agent annotation socket. Speaks the agent v0.17+ NDJSON annotation protocol; framework-agnostic.
Project-URL: Homepage, https://github.com/ingero-io/ingero
Project-URL: Repository, https://github.com/ingero-io/ingero
Project-URL: Documentation, https://github.com/ingero-io/ingero/blob/main/docs/commands.md
Project-URL: Issues, https://github.com/ingero-io/ingero/issues
License-Expression: Apache-2.0
License-File: LICENSE
Keywords: annotation,ebpf,gpu,ingero,observability
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: System :: Monitoring
Requires-Python: >=3.8
Description-Content-Type: text/markdown

# ingero-annotate

Writer for the Ingero agent annotation socket. Speaks the agent's
v0.17+ NDJSON annotation protocol and nothing else, so any training
framework or inference frontend can inject `step`, `epoch`,
`task_id`, `request_id`, `model`, or any custom label into a live
recorded trace without a framework-specific dependency.

## Install

```
pip install ingero-annotate
```

## Use

```python
from ingero_annotate import AnnotationWriter

writer = AnnotationWriter()               # default socket: /run/ingero/annotate.sock
writer.write({"step": "42"})              # instant annotation
writer.write_span({"epoch": "3"}, ...)    # span annotation
```

The full public surface is documented in the module's docstring;
see `python/ingero-annotate/ingero_annotate.py`.

The agent must be running with the annotation socket bound
(`ingero trace --record --annotate`). When the socket file is absent
or the agent has not bound it, the writer drops the annotation
silently and the caller's code path is unaffected.
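
The drop-on-failure behaviour can be pictured with a minimal stdlib sketch. This is not the library's implementation: `send_or_drop` is a hypothetical helper, and the stream socket type and the 50 ms timeout are assumptions made for illustration only.

```python
import json
import socket


def send_or_drop(payload, path="/run/ingero/annotate.sock"):
    """Try to send one NDJSON line; return False instead of raising.

    Hypothetical helper. SOCK_STREAM and the timeout are assumptions,
    not taken from the agent contract.
    """
    line = (json.dumps(payload, separators=(",", ":")) + "\n").encode("utf-8")
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
            sock.settimeout(0.05)  # never stall the caller for long
            sock.connect(path)
            sock.sendall(line)
        return True
    except OSError:
        # Missing socket file, refused connection, timeout: drop quietly.
        return False
```

A training loop can call such a helper on every step without a `try`/`except` of its own; a missing agent costs one failed `connect` and nothing else.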

## What this is and is not

This is the protocol layer. It validates label keys and values
against the agent's contract (`pkg/contract/annotate.go` in the
agent repo), opens the Unix-domain socket, sends NDJSON, handles
reconnect, and gives you back a small Python API. It has no
framework dependency.

The framework adapters live in the Ingero agent repository under
[`examples/integrations/`](https://github.com/ingero-io/ingero/tree/main/examples/integrations):

- `pytorch-lightning/ingero_lightning.py` (Lightning callback)
- `ray/ingero_ray.py` (Ray task hook)
- `hf-trainer/ingero_hf.py` (HF Trainer callback)
- `deepspeed/ingero_deepspeed.py` (DeepSpeed wrapper)
- `accelerate/ingero_accelerate.py` (Accelerate hook)
- `vllm/ingero_vllm.py` (per-request inference emitter, v0.19+)

Each of those imports from this package and adds the
framework-specific wiring.
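
As a rough picture of what "framework-specific wiring" means, the sketch below forwards training progress to anything that exposes a `write(labels)` method, which is the call shown in the usage example. `StepAnnotator`, its hook names, and `RecordingWriter` are hypothetical and are not taken from any of the adapters listed above.

```python
class StepAnnotator:
    """Hypothetical adapter: turns framework hooks into annotations."""

    def __init__(self, writer):
        # Any object with a write(labels) method, e.g. an AnnotationWriter.
        self._writer = writer

    def on_epoch_start(self, epoch):
        # Label values are sent as strings, as in the usage example.
        self._writer.write({"epoch": str(epoch)})

    def on_step_end(self, step):
        self._writer.write({"step": str(step)})


class RecordingWriter:
    """Stand-in writer for dry runs; records labels instead of sending."""

    def __init__(self):
        self.calls = []

    def write(self, labels):
        self.calls.append(labels)


recorder = RecordingWriter()
annotator = StepAnnotator(recorder)
annotator.on_epoch_start(3)
annotator.on_step_end(42)
```

A real adapter would register these hooks with its framework's callback mechanism and pass in an `AnnotationWriter` instead of the stand-in.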

## Wire protocol summary

Each line is one NDJSON annotation object:

```json
{"labels": {"step": "42"}, "pid": 1234, "ts": 1700000000000000000}
```

`labels` is required. `pid` scopes the annotation to a process
incarnation; without it the annotation applies trace-wide. `ts` is an
optional timestamp in Unix nanoseconds (the agent stamps receive time
when it is absent). Spans add `span_start` and `span_end` (both Unix
nanoseconds).
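
For readers who want to see the framing, here is a minimal sketch that builds such lines with the standard library. The field names come from the summary above; the helper names `instant_line` and `span_line` are hypothetical, and label validation against the agent contract is deliberately left out.

```python
import json


def instant_line(labels, pid=None, ts=None):
    """Build one NDJSON line for an instant annotation (hypothetical helper)."""
    obj = {"labels": dict(labels)}
    if pid is not None:
        obj["pid"] = pid  # omit for a trace-wide annotation
    if ts is not None:
        obj["ts"] = ts    # omit to let the agent stamp receive time
    return json.dumps(obj, separators=(",", ":")) + "\n"


def span_line(labels, span_start, span_end, pid=None):
    """Build one NDJSON line for a span annotation (hypothetical helper)."""
    obj = {"labels": dict(labels)}
    if pid is not None:
        obj["pid"] = pid
    obj["span_start"] = span_start  # Unix nanoseconds
    obj["span_end"] = span_end      # Unix nanoseconds
    return json.dumps(obj, separators=(",", ":")) + "\n"


line = instant_line({"step": "42"}, pid=1234, ts=1700000000000000000)
span = span_line({"epoch": "3"}, 1700000000000000000, 1700000005000000000)
```

Each returned string is exactly one line ending in `\n`, which is what makes the stream newline-delimited JSON.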

The full contract, including label-key/value charsets, length
limits, and the v0.19 per-request keys, is documented in
`pkg/contract/annotate.go` and `docs/commands.md` in the agent
repository.

## Honesty notes for per-request inference correlation

v0.19+ adds `request_id`, `model`, `prompt_len`, `output_len`,
`preempted`, `arrival`, `first_token`, `finished` to the contract.
When used with `ingero explain --by-request` / `ingero query
--by-request`, the output is a TIME-OVERLAP slice, not exclusive
kernel ownership. Continuous batching shares kernel launches across
many in-flight requests; the agent's renderer prints this caveat
verbatim every time. The library does not interpret `request_id`
semantics beyond label validation; the agent does.
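
To make the time-overlap caveat concrete, the toy sketch below slices a few kernel intervals by request window. It is an illustration of the concept only, not the agent's algorithm; the request IDs, kernel names, and timestamps are invented.

```python
def overlap_ns(a_start, a_end, b_start, b_end):
    """Length of the intersection of two [start, end] intervals, in ns."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def slice_by_request(requests, kernels):
    """Map each request to every kernel whose interval overlaps its window."""
    result = {}
    for request_id, (r_start, r_end) in requests.items():
        result[request_id] = [
            name
            for name, k_start, k_end in kernels
            if overlap_ns(r_start, r_end, k_start, k_end) > 0
        ]
    return result


# Invented data: two requests in flight at the same time.
requests = {"req-a": (0, 100), "req-b": (50, 150)}
kernels = [("k0", 10, 40), ("k1", 60, 90), ("k2", 120, 140)]

slices = slice_by_request(requests, kernels)
```

Here `k1` lands in both slices because it ran while both requests were in flight. Summing kernel time across requests would therefore double-count it, which is why the output should be read as overlap rather than ownership.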

## License

Apache-2.0.
