Metadata-Version: 2.4
Name: chime_logger
Version: 1.0.0
Summary: CHIME/FRB telemetry wrapper around HelixObs — traces and logs via OTLP.
License: MIT
License-File: LICENSE
Author: Tarik Zegmott
Author-email: tarik.zegmott@mcgill.ca
Requires-Python: >=3.9,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Dist: grpcio (>=1.44.0)
Requires-Dist: helixobs
Description-Content-Type: text/markdown

<div align="center">
    <img src="./static/CHIME_Logger_Logo.png" width="220" height="110">
</div>

<h1 align="center">CHIME Logger</h1>

CHIME Logger is a CHIME/FRB telemetry wrapper around [HelixObs](https://helixobs.github.io). It ships traces and logs to the HelixObs herald via OTLP, writes a local rotating file log, and automatically injects CHIME context (`resource_name`, `resource_type`, `pipeline`, `site`) into every log record, with no boilerplate in the calling code.

## How it works

CHIME Logger wraps the [HelixObs Python client](https://helixobs.github.io) with a CHIME-specific `CHIMETracer`. When you enter an `operate()` or `create()` block, the tracer stamps the span with CHIME attributes (`chime.resource_type`, `chime.pipeline`, `chime.site`). A log-record factory — installed at the root logger level, the same way HelixObs injects trace context — reads those attributes from the active span and stamps them onto every `LogRecord` at creation time, before any handler sees it. This means **any logger in the process** automatically carries `resource_name`, `resource_type`, `pipeline`, and `site` inside an `operate()` or `create()` block, regardless of its name.
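
The record-factory mechanism described above can be illustrated with the standard library alone. This is a minimal sketch, not the library's implementation: `_active_attrs` is a hypothetical `contextvars` stand-in for the active span's attributes, the sample values are made up, and the environment-variable fallback is left out for brevity.

```python
import contextvars
import logging

# Hypothetical stand-in for "attributes of the active span".
_active_attrs = contextvars.ContextVar("active_attrs", default=None)

_previous_factory = logging.getLogRecordFactory()


def chime_record_factory(*args, **kwargs):
    """Build the record as usual, then stamp context fields onto it."""
    record = _previous_factory(*args, **kwargs)
    attrs = _active_attrs.get() or {}
    record.resource_name = attrs.get("helix.entity.id", "unknown_name")
    record.resource_type = attrs.get("chime.resource_type", "unknown_type")
    record.pipeline = attrs.get("chime.pipeline", "unknown_pipeline")
    record.site = attrs.get("chime.site", "unknown_site")
    return record


# The factory is process-wide, so every logger is affected regardless of name.
logging.setLogRecordFactory(chime_record_factory)


class ListHandler(logging.Handler):
    """Collects records in memory so the stamped fields can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


handler = ListHandler()
demo_log = logging.getLogger("some.unrelated.module")
demo_log.addHandler(handler)
demo_log.setLevel(logging.INFO)
demo_log.propagate = False

# Simulate entering a span, logging, then leaving it.
reset_token = _active_attrs.set({
    "helix.entity.id": "beam-0042",          # made-up sample values
    "chime.resource_type": "n2_acquisition",
    "chime.pipeline": "l1-search",
    "chime.site": "chime",
})
demo_log.info("Processing beam")
_active_attrs.reset(reset_token)
demo_log.info("Outside any span")
```

Because the fields are set when the record is created, every handler and formatter downstream can reference them (for example `%(resource_name)s`) without the calling code passing `extra=`.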

Spans flow to the **HelixObs herald**, which writes entity rows to TimescaleDB, links provenance, and drives notifications and dashboards. See the [HelixObs documentation](https://helixobs.github.io) for the full entity model, provenance DAG, and herald architecture.

## Installation

```bash
pip install chime_logger
# or
poetry add chime_logger
# or
uv add chime_logger
```

## Usage

### Setup

Call `setup()` once at process start. It returns a `CHIMETracer` singleton wired to the HelixObs herald.

```python
import chime_logger
import logging

tel = chime_logger.setup(pipeline="datatrail-registration")
log = logging.getLogger(__name__)   # any logger name works
```

> **Logger name:** you can use any logger — `logging.getLogger(__name__)`, `logging.getLogger("CHIME")`, or anything else. CHIME context fields are injected at record-creation time via a root-level factory, so the name does not matter for context enrichment or OTLP log shipping. The only distinction is that records from loggers named `"CHIME"` or `"CHIME.*"` are also written to the local rotating file.
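
The `"CHIME"` / `"CHIME.*"` distinction matches how Python's logger hierarchy works: a handler attached to a logger also receives records from its dotted descendants. The sketch below assumes the file handler sits on the `CHIME` logger, which the text above does not confirm. The rotation limits and the temporary directory are placeholder choices.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

# Temporary location for the demo; the real default path is listed under Configuration.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "chime_pipeline.log")

# maxBytes and backupCount are illustrative values, not the library's settings.
file_handler = RotatingFileHandler(log_path, maxBytes=1_000_000, backupCount=3)
file_handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))

chime_root = logging.getLogger("CHIME")
chime_root.setLevel(logging.INFO)
chime_root.addHandler(file_handler)

# A child of "CHIME" propagates its records up to the file handler.
logging.getLogger("CHIME.registration").info("kept in file")

# A logger outside the "CHIME" namespace never reaches that handler.
other = logging.getLogger("datatrail.worker")
other.setLevel(logging.INFO)
other.info("skipped by file")

file_handler.flush()
with open(log_path) as fh:
    file_contents = fh.read()
```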

---

### Creating new entities — `create()`

Use `create()` when a new entity enters the pipeline for the first time — a new beam candidate, a new baseband acquisition, a new FRB event. This registers the entity in the HelixObs provenance DAG, making it discoverable in the Entity Inspector and linkable to its parents.

```python
with tel.create("l1-search", resource_name=beam_id, resource_type="n2_acquisition", parents=[block_id]) as token:
    log.info("Processing beam")   # resource_name, resource_type, pipeline, site auto-injected
```

`parents` links this entity to its upstream origins. The herald resolves those links into a navigable provenance DAG — so you can trace an FRB event back to the raw data block it came from. See the [HelixObs provenance guide](https://helixobs.github.io) for details.
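
To make "trace an event back to the raw data block" concrete, here is a small sketch of walking parent links upstream. The `parents_of` mapping and its entity ids are hypothetical sample data. The herald stores and resolves these links itself, so this only illustrates the shape of the graph.

```python
from collections import deque

# Hypothetical parent links: entity id -> list of parent ids.
parents_of = {
    "event-7": ["beam-3", "beam-4"],
    "beam-3": ["block-1"],
    "beam-4": ["block-1"],
    "block-1": [],
}


def ancestors(entity_id, links):
    """Return every upstream entity, nearest first, visiting each one once."""
    seen = []
    queue = deque(links.get(entity_id, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # shared ancestors (a DAG, not a tree) are listed once
        seen.append(current)
        queue.extend(links.get(current, []))
    return seen


print(ancestors("event-7", parents_of))  # ['beam-3', 'beam-4', 'block-1']
```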

---

### Processing existing entities — `operate()`

Use `operate()` when performing a pipeline stage on an entity that already exists — registering it in a catalog, replicating its data, converting its format. Each `operate()` call creates an **operation row** in HelixObs, giving you a timestamped history of every stage an entity passed through.

```python
with tel.operate("register", resource_name=event_id, resource_type="event") as token:
    log.info("Registering event")
    register_in_catalog(event_id)
```

The full operation timeline is visible per-entity in the HelixObs Entity Inspector dashboard.

---

### Marking scientifically notable signals — `add_event()`

`token.add_event()` records a named **helix event** on the entity. Unlike a log line, helix events are stored in a structured `entity_events` table in TimescaleDB and are queryable across all entities. Use them for domain signals that matter beyond the log stream.

```python
with tel.operate("classify", resource_name=event_id, resource_type="event") as token:
    classification = classify(event_id)
    # `dm` (dispersion measure) is assumed to have been computed earlier
    token.add_event("helix.event.classified", {"classification": classification, "dm": str(dm)})
```

Helix events appear in the Entity Inspector's event timeline and can be used to drive notifications (see below).

---

### Recording errors — `add_error()` and `error()`

These are not log lines — they are **helix error events**. When the herald receives a `helix.error` event, it:

1. Marks the entity with `has_error = true` in the database.
2. Surfaces it in the **Error Entities** Grafana dashboard for immediate visibility.
3. Triggers a **Slack notification** and opens a **GitHub issue** for diagnosis (if configured for your instrument).
4. Deduplicates repeated occurrences and updates the issue body with running statistics rather than spamming a new alert each time.
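
The deduplication in step 4 can be pictured as grouping errors by a fingerprint and keeping running statistics per group. The sketch below is only an illustration of that idea: the fingerprint (operation name plus reason), the class names, and the tracked statistics are all assumptions, not the herald's actual logic.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ErrorStats:
    count: int = 0
    first_seen: float = 0.0
    last_seen: float = 0.0


class ErrorDeduplicator:
    """Groups errors by fingerprint; only the first occurrence is 'new'."""

    def __init__(self):
        self.stats = defaultdict(ErrorStats)

    def record(self, operation, reason, timestamp):
        # Assumed fingerprint: operation name plus error reason.
        entry = self.stats[(operation, reason)]
        is_new = entry.count == 0
        if is_new:
            entry.first_seen = timestamp
        entry.count += 1
        entry.last_seen = timestamp
        # True -> open a new alert; False -> update the existing one.
        return is_new


dedup = ErrorDeduplicator()
results = [dedup.record("replicate", "timeout", t) for t in (10.0, 20.0, 30.0)]
print(results)  # [True, False, False]
```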

**`token.add_error()`** — records a recoverable error. The span stays open so the operation can continue or attempt recovery.

```python
with tel.operate("replicate", resource_name=event_id, resource_type="event") as token:
    for dest in destinations:
        try:
            replicate_to(dest)
        except TimeoutError as e:
            token.add_error({"destination": dest, "reason": str(e)})
            log.warning("Replication to %s timed out, continuing", dest)
    # span completes normally after the loop
```

**`token.error()`** — records the error and immediately closes the span as failed. Use this when the operation cannot recover.

```python
with tel.operate("register", resource_name=event_id, resource_type="event") as token:
    try:
        register_in_catalog(event_id)
    except Exception as e:
        token.error({"reason": str(e)})   # closes span as failed
        raise
```

> **Note:** The context manager calls `token.error()` automatically if an unhandled exception escapes the `with` block, so explicit `token.error()` is only needed when you want to attach structured metadata to the error.
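
The automatic behaviour in the note can be sketched with `contextlib`. `SketchToken` and `operate_sketch` are hypothetical stand-ins, not the library's classes. The sketch also assumes that a second `error()` call on an already failed token is a no-op, which the text above does not state.

```python
from contextlib import contextmanager


class SketchToken:
    """Hypothetical stand-in for the token yielded by operate()/create()."""

    def __init__(self):
        self.status = "open"
        self.errors = []

    def error(self, attributes=None):
        # Assumption: only the first failure is recorded.
        if self.status != "failed":
            self.status = "failed"
            self.errors.append(attributes or {})


@contextmanager
def operate_sketch():
    token = SketchToken()
    try:
        yield token
    except Exception as exc:
        # An unhandled exception marks the span as failed, then propagates.
        token.error({"reason": str(exc)})
        raise
    else:
        if token.status == "open":
            token.status = "ok"


captured = {}
try:
    with operate_sketch() as token:
        captured["token"] = token
        raise RuntimeError("catalog unavailable")
except RuntimeError:
    pass

print(captured["token"].status)  # failed
```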

---

### Child spans

Use `child_span()` for internal steps that should appear in the Tempo trace view but don't need their own entity row in HelixObs.

```python
with tel.operate("register", resource_name=event_id, resource_type="event") as token:
    with tel.child_span("catalog-lookup"):
        result = lookup_catalog(event_id)
    with tel.child_span("db-write"):
        write_to_db(result)
```

This keeps the Tempo trace detailed without polluting the entity provenance DAG with internal implementation steps.

---

### Automatic log context injection

Inside any `operate()` or `create()` block, every log call from **any logger in the process** automatically carries:

| Field | Source |
|---|---|
| `resource_name` | `helix.entity.id` from the active span |
| `resource_type` | `chime.resource_type` from the active span |
| `pipeline` | `chime.pipeline` from the active span, then `CHIME_LOGGER_PIPELINE_NAME` |
| `site` | `chime.site` from the active span, then `CHIME_LOGGER_SITE` |

Outside any span, logging still works: `pipeline` and `site` fall back to the environment variables above, and any field that is still unset becomes `unknown_name` / `unknown_type` / `unknown_pipeline` / `unknown_site`.
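
The lookup order from the table can be written as a small resolver. This is a sketch under assumptions: `resolve_field` and `resolve_context` are hypothetical helper names, and span attributes are modelled as a plain dict rather than a real span object.

```python
import os


def resolve_field(span_attrs, attr_key, env, env_var, default):
    """Span attribute first, then the environment variable, then the default."""
    value = span_attrs.get(attr_key)
    if value:
        return value
    if env_var:
        value = env.get(env_var)
        if value:
            return value
    return default


def resolve_context(span_attrs=None, env=os.environ):
    span_attrs = span_attrs or {}
    return {
        # resource_name and resource_type have no environment fallback.
        "resource_name": resolve_field(
            span_attrs, "helix.entity.id", env, None, "unknown_name"),
        "resource_type": resolve_field(
            span_attrs, "chime.resource_type", env, None, "unknown_type"),
        "pipeline": resolve_field(
            span_attrs, "chime.pipeline", env,
            "CHIME_LOGGER_PIPELINE_NAME", "unknown_pipeline"),
        "site": resolve_field(
            span_attrs, "chime.site", env, "CHIME_LOGGER_SITE", "unknown_site"),
    }


# Outside a span, with only the site configured in the environment:
print(resolve_context({}, env={"CHIME_LOGGER_SITE": "kko"}))
```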

### Accessing the singleton

```python
tel = chime_logger.get_tracer()   # None if setup() has not been called yet
```

---

## Configuration

All connection and site parameters come from environment variables.

| Variable | Default | Description |
|---|---|---|
| `HERALD_ENDPOINT` | `localhost:4317` | OTLP gRPC address for traces |
| `LOGS_ENDPOINT` | `localhost:4317` | OTLP gRPC address for logs |
| `CHIME_LOGGER_SITE` | _(unset)_ | Site name: `chime`, `kko`, `gbo`, `hco` |
| `CHIME_LOGGER_PIPELINE_NAME` | _(unset)_ | Pipeline name fallback when not passed to `setup()` |
| `CHIME_LOGGER_FILE_LOG_PATH` | `logs/chime_pipeline.log` | Path for the rotating file log |
| `HERALD_INSECURE` | `true` | Set to `false` for TLS connections |
| `HERALD_CREDENTIAL` | _(unset)_ | Registration secret or existing JWT |
| `HERALD_AUTH_ENDPOINT` | _(unset)_ | Herald `/auth/token` URL (required when `HERALD_CREDENTIAL` is set) |

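For reference, the table above can be read as the following loader. It is a sketch only: `HeraldConfig` and `load_config` are hypothetical names, and the rule that any value other than `false` keeps the connection insecure is an assumption about how `HERALD_INSECURE` is parsed.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class HeraldConfig:
    herald_endpoint: str
    logs_endpoint: str
    site: Optional[str]
    pipeline: Optional[str]
    file_log_path: str
    insecure: bool
    credential: Optional[str]
    auth_endpoint: Optional[str]


def load_config(env: Mapping[str, str] = os.environ) -> HeraldConfig:
    credential = env.get("HERALD_CREDENTIAL")
    auth_endpoint = env.get("HERALD_AUTH_ENDPOINT")
    if credential and not auth_endpoint:
        raise ValueError(
            "HERALD_AUTH_ENDPOINT is required when HERALD_CREDENTIAL is set")
    return HeraldConfig(
        herald_endpoint=env.get("HERALD_ENDPOINT", "localhost:4317"),
        logs_endpoint=env.get("LOGS_ENDPOINT", "localhost:4317"),
        site=env.get("CHIME_LOGGER_SITE"),
        pipeline=env.get("CHIME_LOGGER_PIPELINE_NAME"),
        file_log_path=env.get(
            "CHIME_LOGGER_FILE_LOG_PATH", "logs/chime_pipeline.log"),
        # Assumption: only the literal "false" (case-insensitive) enables TLS.
        insecure=env.get("HERALD_INSECURE", "true").strip().lower() != "false",
        credential=credential,
        auth_endpoint=auth_endpoint,
    )


defaults = load_config({})
print(defaults.herald_endpoint, defaults.insecure)  # localhost:4317 True
```
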
## License

See [LICENSE](LICENSE) for details.

