Metadata-Version: 2.4
Name: impulse-telemetry
Version: 0.2.0
Summary: Observability SDK for Impulse microservices
License: MIT
Keywords: observability,telemetry,opentelemetry,prometheus,fastapi,ml
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: System :: Monitoring
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: opentelemetry-sdk>=1.24
Requires-Dist: opentelemetry-exporter-otlp-proto-grpc>=1.24
Requires-Dist: opentelemetry-propagator-b3>=1.24
Requires-Dist: opentelemetry-instrumentation-fastapi>=0.45b0
Requires-Dist: opentelemetry-instrumentation-requests>=0.45b0
Requires-Dist: opentelemetry-instrumentation-httpx>=0.45b0
Requires-Dist: prometheus-client>=0.20
Requires-Dist: structlog>=24.0
Provides-Extra: extras
Requires-Dist: opentelemetry-instrumentation-sqlalchemy>=0.45b0; extra == "extras"
Requires-Dist: opentelemetry-instrumentation-redis>=0.45b0; extra == "extras"
Requires-Dist: opentelemetry-instrumentation-celery>=0.45b0; extra == "extras"
Provides-Extra: ml
Requires-Dist: pandas>=2.0; extra == "ml"

# impulse_telemetry

Observability SDK for Impulse microservices. One `init()` call wires up distributed tracing, Prometheus metrics, structured logging, and FastAPI middleware.

## Install

```bash
pip install impulse-telemetry

# With SQLAlchemy / Redis / Celery auto-instrumentation
pip install "impulse-telemetry[extras]"

# With ML data quality checks (pandas required)
pip install "impulse-telemetry[ml]"
```

## Quickstart

```python
from fastapi import FastAPI
from impulse_telemetry import init

app = FastAPI()
init(app, service="my-service", version="1.0.0", env="production")
```

Every HTTP request now emits RED metrics, a distributed trace, and a structured JSON log line — automatically.

## `init()` parameters

| Parameter | Default | Description |
|---|---|---|
| `app` | `None` | FastAPI app. Pass to enable HTTP middleware. |
| `service` | **required** | Service name — appears on all telemetry. |
| `version` | `"0.0.0"` | Semver string tagged on all telemetry. |
| `env` | `$ENV` | `"production"` \| `"staging"` \| `"dev"` |
| `otlp_endpoint` | `$OTEL_EXPORTER_OTLP_ENDPOINT` | OTLP Collector gRPC address. |
| `prometheus_port` | `None` | Expose `/metrics` on this port (for workers without HTTP). |
| `log_level` | `"INFO"` | `"DEBUG"` \| `"INFO"` \| `"WARNING"` \| `"ERROR"` |
| `enable_ml` | `False` | Register ML-specific metric instruments. |

## Logging

```python
from impulse_telemetry.logging import get_logger

log = get_logger(__name__)
log.info("prediction_served", model_id="rec-v3", latency_ms=43)
# {"service":"my-service","trace_id":"abc..","event":"prediction_served","model_id":"rec-v3",...}
```

`trace_id` and `span_id` are injected automatically from the active OTEL span.

### Per-request context

Bind fields once per request — they appear on every subsequent log line for that request.

```python
from impulse_telemetry.logging import bind_request_context, clear_request_context

token = bind_request_context(user_id="usr_123", request_id="req_456")
log.info("doing_work")   # user_id + request_id injected automatically
clear_request_context(token)
```

## Metrics

Prometheus metrics are pre-registered and labeled with `service` and `env`.

### Automatic (via middleware)

| Metric | Type | Description |
|---|---|---|
| `http_requests_total` | Counter | Requests by route, method, status |
| `http_request_errors_total` | Counter | 4xx / 5xx responses |
| `http_request_duration_seconds` | Histogram | Request latency |
| `http_active_requests` | Gauge | In-flight requests |

### Manual

```python
from impulse_telemetry.metrics import get_metrics

m = get_metrics()
m.dependency_latency.labels(**m.labels(dependency="redis")).observe(elapsed)
m.dependency_errors.labels(**m.labels(dependency="redis")).inc()
```

Available instruments: `rate_limiter_hits`, `rate_limiter_remaining`, `dependency_latency`, `dependency_errors`, `queue_depth`, `circuit_breaker`.

## Tracing

```python
from impulse_telemetry.tracing import get_tracer, inject_headers

tracer = get_tracer(__name__)

with tracer.start_as_current_span("my_operation") as span:
    span.set_attribute("key", "value")
    headers = inject_headers({})   # propagate W3C traceparent to downstream
    requests.get("http://other-service/api", headers=headers)
```

Auto-instrumented libraries (when installed): `requests`, `httpx`, `SQLAlchemy`, `Redis`, `Celery`.

## ML Monitoring

Enable at startup with `enable_ml=True`.

```python
from impulse_telemetry.ml import MLMonitor

monitor = MLMonitor(model_id="rec-v3", user_id=user_id)
```

### Inference tracing

```python
# Context manager
with monitor.inference(features=df) as span:
    result = model.predict(df)
    span.record_output(result)

# Decorator
@monitor.trace
def predict(features):
    return model.predict(features)
```

Both record inference latency, error count, and run data quality checks (missing values, schema violations) automatically.

### Drift & performance metrics

```python
# Call from batch evaluation jobs
monitor.record_drift("age", score=0.18, metric="psi")
monitor.record_performance(rmse=0.042, precision=0.87, recall=0.81)
```

### ML metric instruments

| Metric | Type | Description |
|---|---|---|
| `ml_inference_duration_seconds` | Histogram | Per-model inference latency |
| `ml_inference_requests_total` | Counter | Inference count |
| `ml_inference_errors_total` | Counter | Inference errors |
| `ml_missing_feature_rate` | Gauge | Missing values per column |
| `ml_schema_violations_total` | Counter | Schema violation count |
| `ml_feature_drift` | Gauge | Feature drift score |
| `ml_prediction_drift` | Gauge | Prediction drift score |
| `ml_rmse` | Gauge | Rolling RMSE |
| `ml_precision` | Gauge | Rolling precision |
| `ml_recall` | Gauge | Rolling recall |

## Background workers

For services without an HTTP server, expose metrics on a dedicated port:

```python
init(service="ingest-worker", prometheus_port=9090)
# Prometheus scrapes localhost:9090/metrics
```

## Environment variables

| Variable | Description |
|---|---|
| `OTEL_EXPORTER_OTLP_ENDPOINT` | OTLP Collector gRPC address |
| `ENV` | Deployment environment (`production`, `staging`, `dev`) |

## See also

[examples.py](examples.py) — minimal, runnable code for every feature.
