Metadata-Version: 2.4
Name: zebu-server-monitor
Version: 1.0.0.3
Summary: Plug-and-play monitoring agent for FastAPI/Flask apps and standalone hosts.
Author-email: zebuetrade <it@zebuetrade.com>
License-Expression: MIT
Keywords: monitoring,observability,metrics,fastapi,flask,prometheus,apm,psutil
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: System :: Monitoring
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Framework :: FastAPI
Classifier: Framework :: Flask
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: psutil>=5.9.0
Requires-Dist: pydantic>=2.0
Requires-Dist: httpx>=0.25.0
Requires-Dist: tenacity>=8.2.0
Provides-Extra: fastapi
Requires-Dist: fastapi>=0.100; extra == "fastapi"
Requires-Dist: starlette>=0.27.0; extra == "fastapi"
Provides-Extra: flask
Requires-Dist: flask>=2.3; extra == "flask"
Requires-Dist: werkzeug>=2.3.0; extra == "flask"
Provides-Extra: requests
Requires-Dist: requests>=2.28; extra == "requests"
Provides-Extra: all
Requires-Dist: zebu-server-monitor[fastapi,flask,requests]; extra == "all"
Provides-Extra: dev
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Dynamic: license-file

# Server Monitor

A plug-and-play monitoring agent for **FastAPI** and **Flask** apps — and for plain
hosts, workers, and scripts. It collects system, service, process, and Docker metrics,
instruments inbound requests and outbound HTTP calls, and pushes everything to a
monitoring backend.

- **Zero-boilerplate integration** — one `monitor.init(app)` call auto-detects the framework.
- **Host agent** — a `server-monitor` CLI that auto-detects running services/processes.
- **Lazy framework deps** — the core installs without FastAPI/Flask; add them only when needed.

> PyPI distribution name: **`zebu-server-monitor`** · import name: **`server_monitor`**.

## Installation

```bash
# core (host agent + standalone worker monitoring)
pip install zebu-server-monitor

# with FastAPI integration
pip install "zebu-server-monitor[fastapi]"

# with Flask integration
pip install "zebu-server-monitor[flask]"

# everything (FastAPI + Flask + requests instrumentation)
pip install "zebu-server-monitor[all]"
```

Requires Python 3.8+.

## Quickstart

### FastAPI

```python
from fastapi import FastAPI
from server_monitor import monitor

app = FastAPI()

MONITOR_CONFIG = {
    "server": {"name": "web-01", "environment": "production"},
    "backend": {"url": "https://your-backend", "api_key": "${MONITORING_API_KEY}"},
}

monitor.init(app, MONITOR_CONFIG)
```

### Flask

```python
from flask import Flask
from server_monitor import monitor

app = Flask(__name__)
monitor.init(app, MONITOR_CONFIG)   # same config dict as above
```

`monitor.init()` auto-detects the framework, registers `/health` and `/metrics`,
instruments requests, patches outbound HTTP clients (`httpx`, `requests`, `urllib`),
and starts a background pusher.

### Standalone worker / script (no web framework)

```python
from server_monitor import monitor

sm = monitor.create_standalone(MONITOR_CONFIG)   # same config dict as in the FastAPI example
sm.start()    # background thread pushes metrics on an interval
# ... your work ...
sm.stop()
```
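
If the work between `start()` and `stop()` can raise, the pusher should still be stopped. A minimal sketch of one way to guarantee that, assuming only the `start()`/`stop()` methods shown above; the `monitored` helper is hypothetical and not part of the package:

```python
from contextlib import contextmanager


@contextmanager
def monitored(standalone):
    """Start a standalone monitor and always stop it on exit.

    `standalone` is any object exposing start() and stop(), for example
    the value returned by monitor.create_standalone(MONITOR_CONFIG).
    This helper is illustrative only; it is not shipped with the package.
    """
    standalone.start()
    try:
        yield standalone
    finally:
        standalone.stop()  # runs even if the body raises
```

It would be used as `with monitored(monitor.create_standalone(MONITOR_CONFIG)): ...`, with the work placed inside the `with` block.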

## Host agent (CLI)

Installing the package adds a command that monitors the current machine and pushes host
metrics (CPU/RAM/disk/network/load/uptime, OS info, Docker containers). It auto-detects
which known services and processes are actually running:

```bash
MONITOR_API_KEY=... server-monitor
```

> **Naming:** the three names are *different on purpose* —
> install with `pip install zebu-server-monitor`, import with `import server_monitor`,
> and run with **`server-monitor`** (or the alias **`zebu-server-monitor`** — both are
> installed and do the same thing). There is **no** `zebu_server_monitor` import.

Configuration precedence is **CLI flags > environment variables > defaults**:

| Flag | Env var | Default |
|------|---------|---------|
| `--server-name` | `MONITOR_SERVER_NAME` | machine hostname |
| `--backend-url` | `MONITOR_BACKEND_URL` | `https://<backend-url>` |
| `--api-key` | `MONITOR_API_KEY` | _(required)_ |
| `--environment` | `MONITOR_ENVIRONMENT` | `production` |
| `--push-interval` | `MONITOR_PUSH_INTERVAL` | `15` |
| `--port-service NAME:PORT` | `MONITOR_PORT_SERVICES` | _(none)_ |
| `--track-process NAME` | `MONITOR_TRACK_PROCESSES` | _(none)_ |

```bash
server-monitor --server-name edge-1 --backend-url https://your-backend --api-key "$KEY"
```

**Monitoring things the agent doesn't auto-detect** (e.g. a PHP built-in server): add a
**port-checked service** (reported `running` while the TCP port answers) and **track extra
processes** by name or command line. Both flags can be repeated, and both have
comma-separated environment-variable equivalents:

```bash
# a PHP built-in server (php -S localhost:8080 with PHP_CLI_SERVER_WORKERS=4)
server-monitor --port-service php-web:8080 --track-process "php -S"
```

```ini
# …or in /etc/server-monitor.env
MONITOR_PORT_SERVICES=php-web:8080
MONITOR_TRACK_PROCESSES=php -S
```

This reports a **php-web** service (up or down according to the port check) plus the
`php -S` process group: the master and all 4 workers, each with its own CPU/RAM figures.
Explicitly tracked processes are always reported; one that is not running shows a count of 0.
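
To make the port-checked idea concrete, here is a rough sketch of how a comma-separated `NAME:PORT` value could be parsed and each port probed over TCP. The function names, the `127.0.0.1` target, and the one-second timeout are assumptions for illustration; the agent's real implementation may differ.

```python
import socket
from typing import Dict


def parse_port_services(raw: str) -> Dict[str, int]:
    """Parse 'name:port,name:port' into {name: port}, skipping empty items."""
    services = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the last colon so the port is always the final field.
        name, _, port = item.rpartition(":")
        if not name:
            raise ValueError(f"expected NAME:PORT, got {item!r}")
        services[name.strip()] = int(port)
    return services


def port_is_open(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `parse_port_services("php-web:8080")` yields `{"php-web": 8080}`, and `port_is_open(8080)` reports whether something is currently accepting connections on that port.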

### Configuring & editing the agent

The CLI has **no config file**. Settings come from three sources, and each later source
overrides the earlier ones:

1. **Built-in defaults** (in code).
2. **Environment variables** (`MONITOR_*`, table above).
3. **CLI flags** (`--server-name`, etc.) — highest priority.
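
The layering above can be sketched in a few lines. This is an illustrative resolver for three of the documented settings, not the agent's actual code; `resolve_settings` and the `server-monitor-sketch` program name are hypothetical.

```python
import argparse
import os
import socket

# Documented env var for each setting (see the table above).
ENV_NAMES = {
    "server_name": "MONITOR_SERVER_NAME",
    "environment": "MONITOR_ENVIRONMENT",
    "push_interval": "MONITOR_PUSH_INTERVAL",
}


def resolve_settings(argv=None, environ=None):
    """Merge defaults, then environment variables, then CLI flags."""
    environ = os.environ if environ is None else environ

    parser = argparse.ArgumentParser(prog="server-monitor-sketch")
    parser.add_argument("--server-name")
    parser.add_argument("--environment")
    parser.add_argument("--push-interval", type=int)
    args = parser.parse_args(argv)

    # 1. Built-in defaults.
    settings = {
        "server_name": socket.gethostname(),
        "environment": "production",
        "push_interval": 15,
    }
    # 2. Environment variables override defaults when set and non-empty.
    for key, env_name in ENV_NAMES.items():
        if environ.get(env_name):
            settings[key] = environ[env_name]
    # 3. CLI flags override everything; None means the flag was not given.
    for key in ENV_NAMES:
        value = getattr(args, key)
        if value is not None:
            settings[key] = value

    settings["push_interval"] = int(settings["push_interval"])
    return settings
```

With `MONITOR_PUSH_INTERVAL=60` in the environment and `--push-interval 5` on the command line, this sketch resolves the interval to `5`.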

**One-off run** — just pass flags or inline env vars:

```bash
MONITOR_API_KEY=your-key server-monitor --server-name edge-1
```

**Persistent (recommended on a server)** — keep settings in an env file and run the
agent as a service so it survives reboots and restarts on failure.

On Linux with **systemd**, create `/etc/server-monitor.env`:

```ini
MONITOR_SERVER_NAME=my-ubuntu-server
MONITOR_BACKEND_URL=https://your-backend
MONITOR_API_KEY=your-key
MONITOR_ENVIRONMENT=production
MONITOR_PUSH_INTERVAL=15
```

Then create a unit at `/etc/systemd/system/server-monitor.service`. **`ExecStart` must be the
absolute path to the installed command** — find it with `command -v server-monitor`, as
it depends on how you installed (`pip install --user` → `~/.local/bin/server-monitor`;
system/sudo install → `/usr/local/bin/server-monitor`; venv → `<venv>/bin/server-monitor`):

```ini
[Unit]
Description=Server Monitor Agent
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=ubuntu
EnvironmentFile=/etc/server-monitor.env
# Replace with the output of `command -v server-monitor`:
ExecStart=/home/ubuntu/.local/bin/server-monitor
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

```bash
sudo systemctl daemon-reload
sudo systemctl enable --now server-monitor
# after editing /etc/server-monitor.env later:
sudo systemctl restart server-monitor
sudo journalctl -u server-monitor -f      # live logs
```

> **Troubleshooting `status=203/EXEC` ("No such file or directory")** — `ExecStart`
> points at a path that doesn't exist. The installed command is **`server-monitor`**
> (or `zebu-server-monitor`), **not** the import name. Run `command -v server-monitor`
> and paste that exact path into `ExecStart`. If it prints nothing, the package isn't
> installed for that user, so run `pip install --user zebu-server-monitor` first. Note that
> the binary must be readable and executable by the unit's `User=` (a `--user` install under
> `/home/ubuntu/.local/bin` is fine when `User=ubuntu`).

On **Windows**, set the variables persistently, then open a new shell and run the agent:

```powershell
setx MONITOR_API_KEY "your-key"
setx MONITOR_BACKEND_URL "https://your-backend"
setx MONITOR_SERVER_NAME "win-host-01"
# open a NEW terminal so the vars take effect, then:
server-monitor
```

> The `MONITOR_*` variables above configure only the **standalone host agent**. When you
> embed the agent in a FastAPI/Flask app instead, configuration comes from the
> `MONITOR_CONFIG` dict passed to `monitor.init()` — see below.

## Configuration reference

`MONITOR_CONFIG` is a plain dict validated by `server_monitor.AgentConfig`. Every key
is optional — anything you omit falls back to the default below. The dict has six
top-level sections:

```python
MONITOR_CONFIG = {
    "server":        {...},   # how this host/app identifies itself
    "backend":       {...},   # where to push metrics
    "collectors":    {...},   # what to collect (system/services/process/app/docker/logs)
    "endpoints":     {...},   # health/metrics route names (web integration only)
    "thresholds":    {...},   # health-check limits
    "error_logging": {...},   # unhandled-exception capture (web integration only)
}
```

### `server` — identity

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `name` | str | machine hostname | Unique name shown on the dashboard. |
| `environment` | str | `"production"` | Logical environment: `production` / `staging` / `development`. |
| `group` | str | `""` | Grouping bucket on the dashboard (e.g. `web`, `worker`, `db`). |
| `tags` | list[str] | `[]` | Free-form labels for filtering (e.g. `["api", "frontend"]`). |
| `host` | str | `"127.0.0.1"` | Advertised host/IP of this server. |
| `port` | int | `8080` | Advertised port of the app. |
| `url` | str / None | `None` | Public URL of the service, if any. |
| `deployment` | dict | `{}` | Optional release metadata, e.g. `{"version": "1.4.2", "commit": "ab12cd", "build": "42"}`. |

### `backend` — where metrics are pushed

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `url` | str | `""` | Base URL of the monitoring backend. Metrics POST to `{url}/api/v1/ingest`. |
| `api_key` | str | `""` | Ingestion key, sent as the `X-API-Key` header. |
| `push_interval` | int | `30` | Seconds between pushes. |
| `timeout` | int | `10` | HTTP timeout (seconds) for each push. |
| `enabled` | bool | `True` | Set `False` to collect locally without pushing (useful for testing). |
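
For anyone building or debugging a compatible backend, the sketch below shows what a push request could look like given the URL and header described in the table. The JSON body, the `Content-Type` header, the trailing-slash handling, and the `build_ingest_request` name are assumptions; the real payload schema is not documented here.

```python
import json
import urllib.request


def build_ingest_request(backend: dict, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST to {url}/api/v1/ingest."""
    # Assumption: a trailing slash on the base URL is tolerated.
    url = backend["url"].rstrip("/") + "/api/v1/ingest"
    # Assumption: metrics are sent as a JSON document.
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "X-API-Key": backend["api_key"],
            "Content-Type": "application/json",
        },
    )
```

Sending it would be a matter of `urllib.request.urlopen(req, timeout=backend.get("timeout", 10))`, which mirrors the `timeout` default above.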

### `collectors` — what gets gathered

Each collector has its own `enabled` flag and most have a polling `interval` (seconds).

#### `collectors.system`

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `enabled` | bool | `True` | Collect CPU / RAM / disk / network / load / uptime. |
| `interval` | int | `10` | Polling interval. |

#### `collectors.services`

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `enabled` | bool | `True` | Collect service health. |
| `interval` | int | `30` | Polling interval. |
| `monitor` | list[str] | `[]` | Built-in service names to watch (see list below). |
| `custom_services` | list[dict] | `[]` | Ad-hoc service checks (see below). |

**Built-in service names** you can drop into `monitor` (each knows its systemd unit /
process name / default port): `nginx`, `apache`, `httpd`, `postgres`, `mysql`,
`mariadb`, `redis`, `mongodb`, `docker`, `gunicorn`, `uvicorn`, `celery`,
`supervisord`, `rabbitmq`, `elasticsearch`.

**Custom services** — for anything not built in, add entries to `custom_services`:

```python
"services": {
    "enabled": True,
    "monitor": ["nginx", "redis"],
    "custom_services": [
        {"name": "my-api",   "check": "port",      "port": 9000},
        {"name": "worker",   "check": "process",   "process_name": "my_worker"},
        {"name": "pgbouncer","check": "systemctl", "service": "pgbouncer"},
    ],
}
```

| Custom field | Type | Default | Description |
|--------------|------|---------|-------------|
| `name` | str | — (required) | Label shown on the dashboard. |
| `check` | str | `"systemctl"` | Check method: `systemctl`, `process`, or `port`. |
| `service` | str | `""` | systemd unit name (when `check="systemctl"`). |
| `process_name` | str | `""` | Process name to search for (when `check="process"`). |
| `port` | int / None | `None` | TCP port to probe (when `check="port"`). |

#### `collectors.process`

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `enabled` | bool | `True` | Track per-process CPU/RAM for named processes. |
| `interval` | int | `30` | Polling interval. |
| `track_processes` | list[str] | `[]` | Process names to track, e.g. `["python", "gunicorn", "celery"]`. |

#### `collectors.app` — application worker tracking

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `enabled` | bool | `True` | Track the app's own worker processes. |
| `interval` | int | `10` | Polling interval. |
| `track_workers` | bool | `True` | Include child/worker processes (e.g. Gunicorn/Uvicorn workers). |

#### `collectors.docker`

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `enabled` | bool | `True` | Collect container stats via the `docker` CLI. Auto-disables if `docker` isn't installed. |
| `interval` | int | `30` | Polling interval. |

#### `collectors.logs` — application log shipping

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `enabled` | bool | `False` | Attach a log handler and ship recent lines with each push. |
| `max_lines` | int | `100` | Number of recent log lines to include per push. |
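
One common way to keep "the most recent N lines" is a logging handler backed by a bounded deque. The following is a sketch of that technique under the `max_lines` semantics described above; `RecentLinesHandler` is a hypothetical name and may not match the package's internals.

```python
import logging
from collections import deque


class RecentLinesHandler(logging.Handler):
    """Keep only the most recent `max_lines` formatted log lines in memory."""

    def __init__(self, max_lines: int = 100):
        super().__init__()
        # A deque with maxlen drops the oldest entry automatically.
        self._lines = deque(maxlen=max_lines)

    def emit(self, record: logging.LogRecord) -> None:
        self._lines.append(self.format(record))

    def recent(self) -> list:
        """Return the buffered lines, oldest first."""
        return list(self._lines)
```

Attaching it with `logging.getLogger().addHandler(RecentLinesHandler(max_lines=100))` would buffer lines from the root logger without affecting existing handlers.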

### `endpoints` — route names (web integration only)

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `health` | str | `"/health"` | Health-check route registered on your app. |
| `metrics` | str | `"/metrics"` | Prometheus metrics route. |
| `monitor_health` | str | `"/__monitor/health"` | Internal agent health route. |
| `prometheus_enabled` | bool | `True` | Expose the Prometheus exporter at `metrics`. |

### `thresholds` — health limits

A host is flagged unhealthy when any of these metrics exceeds its threshold. All values are percentages.

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `cpu_percent` | float | `90.0` | CPU usage ceiling. |
| `memory_percent` | float | `85.0` | RAM usage ceiling. |
| `disk_percent` | float | `80.0` | Disk usage ceiling. |
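
As an illustration of the rule, the sketch below compares a metrics snapshot against the defaults from the table. It assumes "exceeds" means strictly greater than, and the `breached` helper is hypothetical.

```python
DEFAULT_THRESHOLDS = {
    "cpu_percent": 90.0,
    "memory_percent": 85.0,
    "disk_percent": 80.0,
}


def breached(metrics: dict, thresholds: dict = None) -> list:
    """Return the names of metrics that exceed their threshold.

    Missing metrics are treated as 0.0, and a value equal to the
    threshold does not count as a breach (assumption).
    """
    limits = {**DEFAULT_THRESHOLDS, **(thresholds or {})}
    return [name for name, limit in limits.items() if metrics.get(name, 0.0) > limit]
```

A host would then be considered unhealthy whenever `breached(snapshot)` returns a non-empty list.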

### `error_logging` — exception capture (web integration only)

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `enabled` | bool | `True` | Capture unhandled exceptions with stack trace + request context. |
| `capture_request_body` | bool | `False` | Include the request body (off by default for privacy). |
| `max_body_size` | int | `1024` | Max bytes of request body to capture when enabled. |
| `sanitize_headers` | list[str] | `["authorization", "cookie", "x-api-key", "x-auth-token"]` | Headers redacted before logging. |
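
Header redaction of this kind is usually a case-insensitive name match. A minimal sketch, assuming a `"[REDACTED]"` placeholder and a hypothetical `sanitize` helper (the package may use a different marker or drop the headers entirely):

```python
DEFAULT_SANITIZE = ["authorization", "cookie", "x-api-key", "x-auth-token"]


def sanitize(headers: dict, redact=DEFAULT_SANITIZE) -> dict:
    """Return a copy of `headers` with sensitive values replaced."""
    blocked = {name.lower() for name in redact}
    return {
        # HTTP header names are case-insensitive, so compare in lowercase.
        key: ("[REDACTED]" if key.lower() in blocked else value)
        for key, value in headers.items()
    }
```

The input dict is left untouched, so the original request headers remain available to the application.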

### Environment-variable substitution

Any **string** value in `MONITOR_CONFIG` may reference environment variables — handy
for keeping secrets out of source:

- `${VAR}` → value of `VAR`, or an empty string if unset.
- `${VAR:default}` → value of `VAR`, or `default` if unset.

```python
"backend": {
    "url": "${MONITOR_BACKEND_URL:https://localhost:8701}",
    "api_key": "${MONITORING_API_KEY}",
}
```
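
The two forms can be expanded with a single regular expression. The sketch below follows the rules listed above and splits only on the first colon so that defaults such as URLs survive intact; `substitute` and `substitute_config` are hypothetical names, and the package's own parser may handle edge cases differently.

```python
import os
import re

# ${VAR} or ${VAR:default}; the default may itself contain colons.
_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::([^}]*))?\}")


def substitute(value: str, environ=None) -> str:
    """Expand ${VAR} and ${VAR:default} references in a string."""
    environ = os.environ if environ is None else environ

    def replace(match):
        name, default = match.group(1), match.group(2)
        fallback = default if default is not None else ""
        return environ.get(name, fallback)

    return _PATTERN.sub(replace, value)


def substitute_config(node, environ=None):
    """Apply substitute() to every string inside nested dicts and lists."""
    if isinstance(node, str):
        return substitute(node, environ)
    if isinstance(node, dict):
        return {key: substitute_config(val, environ) for key, val in node.items()}
    if isinstance(node, list):
        return [substitute_config(item, environ) for item in node]
    return node  # ints, bools, None, etc. pass through unchanged
```

Passing an explicit `environ` mapping makes the behaviour easy to check in tests without touching the real process environment.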

### Worker processes (Gunicorn/Uvicorn)

When you run multiple workers per host, set `MONITORING_DISABLE_SYSTEM_METRICS=true`
in the worker environment so only one process reports system/service/process metrics —
this avoids duplicate host data. Request and error metrics are still collected per
worker.

### Full example

```python
MONITOR_CONFIG = {
    "server": {
        "name": "web-01",
        "environment": "production",
        "group": "web",
        "tags": ["api"],
        "deployment": {"version": "1.4.2", "commit": "ab12cd"},
    },
    "backend": {
        "url": "${MONITOR_BACKEND_URL:https://your-backend}",
        "api_key": "${MONITORING_API_KEY}",
        "push_interval": 30,
        "timeout": 10,
        "enabled": True,
    },
    "collectors": {
        "system":   {"enabled": True, "interval": 10},
        "services": {
            "enabled": True,
            "monitor": ["nginx", "redis"],
            "custom_services": [
                {"name": "my-api", "check": "port", "port": 9000},
            ],
        },
        "process":  {"enabled": True, "track_processes": ["python", "gunicorn"]},
        "app":      {"enabled": True, "track_workers": True},
        "docker":   {"enabled": True},   # auto-disables if the docker CLI is absent
        "logs":     {"enabled": False, "max_lines": 100},
    },
    "endpoints": {"health": "/health", "metrics": "/metrics", "prometheus_enabled": True},
    "thresholds": {"cpu_percent": 90, "memory_percent": 85, "disk_percent": 80},
    "error_logging": {
        "enabled": True,
        "capture_request_body": False,
        "max_body_size": 1024,
        "sanitize_headers": ["authorization", "cookie", "x-api-key"],
    },
}
```

## License

MIT — see the bundled `LICENSE` file.
