Metadata-Version: 2.4
Name: inferyx-monitoring
Version: 1.0.36
Summary: Monitor batch pipelines via API and email alerts — install and deploy on Linux servers from PyPI
Author-email: Inferyx DevOps <devops@inferyx.com>
License: Copyright (c) 2026 Inferyx. All rights reserved.
        
        Proprietary software. Unauthorized copying, distribution, or use is prohibited
        unless agreed in writing with Inferyx.
        
        For open-source release, replace this file with your chosen license (e.g. MIT, Apache-2.0)
        before publishing to public PyPI.
        
Keywords: batch,monitoring,pipeline,email,inferyx
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: System Administrators
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Operating System :: OS Independent
Classifier: Topic :: System :: Monitoring
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas>=2.0.0
Requires-Dist: requests>=2.31.0
Requires-Dist: python-dateutil>=2.8.2
Provides-Extra: admin
Requires-Dist: fastapi>=0.110.0; extra == "admin"
Requires-Dist: uvicorn[standard]>=0.27.0; extra == "admin"
Requires-Dist: pydantic>=2.0.0; extra == "admin"
Provides-Extra: dev
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Requires-Dist: check-manifest; extra == "dev"
Dynamic: license-file

# inferyx-monitoring

**Version:** 1.0.36 · **PyPI:** [inferyx-monitoring](https://pypi.org/project/inferyx-monitoring/) · **CLI:** `inferyx-monitoring`

---

## Contents

1. [Overview](#1-overview)
2. [What's new](#2-whats-new)
3. [Installation](#3-installation)
4. [Upgrade](#4-upgrade)
5. [Properties](#5-properties)
6. [Batch CSV format](#6-batch-csv-format)

---

## 1. Overview

**inferyx-monitoring** watches batch jobs listed in a CSV file, queries the Inferyx API for each batch, and sends email alerts (and optional Teams / Google Chat messages) when problems are detected.

| Alert | When it fires |
|-------|----------------|
| **failed** | Batch execution failed |
| **running** | Still running past expected end + grace |
| **missed** | Did not start within schedule window + grace |
| **no_data** | API returned no record for the batch (email goes to `PIPELINE_DEVOPS_EMAIL`) |
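
The table can be read as a small decision function. The following is a minimal sketch, not the package's actual logic: the `status` strings (`"FAILED"`, `"RUNNING"`, `"NOT_STARTED"`) and the function name `classify` are hypothetical, and the real API fields may differ.

```python
from datetime import datetime, timedelta
from typing import Optional


def classify(now: datetime, expected_start: datetime, avg_duration: timedelta,
             grace: timedelta, status: Optional[str]) -> Optional[str]:
    """Return an alert name from the table above, or None when nothing is wrong.

    `status` is a hypothetical API status; None means the API had no record.
    """
    if status is None:
        return "no_data"
    if status == "FAILED":
        return "failed"
    expected_end = expected_start + avg_duration
    # Still running after the expected end plus the grace period.
    if status == "RUNNING" and now > expected_end + grace:
        return "running"
    # Has not started although the start time plus grace has passed.
    if status == "NOT_STARTED" and now > expected_start + grace:
        return "missed"
    return None
```

With a 09:00 start, a 10-minute average run time, and 5 minutes of grace, a batch still reported as running at 09:20 would be classified as `running`, while the same status at 09:12 raises nothing.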

### Default server paths

| Item | Path |
|------|------|
| Install directory | `/opt/pipeline-monitor` |
| Config (`.env`) | `/opt/pipeline-monitor/.env` |
| Batch list | `/opt/pipeline-monitor/batch_file.csv` |
| Log | `/opt/pipeline-monitor/pipeline_script.log` |
| Python venv | `/opt/pipeline-monitor/.venv` |
| Service user | `inferyx` |
| Monitor systemd | `inferyx-monitoring.service` |
| Admin API systemd | `inferyx-monitoring-admin.service` |
| Admin auth policy | `/etc/pipeline-monitor/auth.policy` |
| Admin UI (nginx) | `https://<host>/monitoring/admin/` |
| Admin API (nginx) | `https://<host>/monitoring/api/` |

### Optional Admin UI

An optional web UI lets you edit `.env` and the batch CSV after an OAuth login. It requires `pip install 'inferyx-monitoring[admin]'`. Servers do **not** need Node.js because the UI files are pre-built in the pip package.

---

## 2. What's new

### 1.0.36

- PyPI documentation restructured: overview, install, upgrade, properties, CSV format only.

### 1.0.35

- Admin UI URL path: `/monitoring/admin/` (was `/admin/`).
- OAuth callback: `/monitoring/api/auth/callback/...`.
- Nginx and `auth.policy` must use the new paths.

### 1.0.34

- Nginx configuration is manual only (scripts do not reload nginx).
- `.env` migration rules documented (auto vs manual).

### 1.0.33

- Fix `inferyx-monitoring-admin-install-ui` (wrong path to bundled UI files).

### 1.0.32

- Server install/upgrade scripts in pip package (`share/inferyx-monitoring/scripts/`).

### 1.0.30

- Pre-built Admin UI in pip wheel; `inferyx-monitoring-admin-install-ui` command.

### 1.0.27

- Admin web UI; `auth.policy` for OAuth and paths.

### 1.0.25

- Teams and Google Chat webhook alerts.

### 1.0.20

- Auto `.env` migration on start; retires `PIPELINE_MAIL_BODY_*` keys.

### 1.0.15

- Default batch file `batch_file.csv` (was `jfl_batch.csv`).
- `PIPELINE_CHECK_MODE=schedule_windows` recommended.
- `Status` column in CSV.
- `--init-config` command.

---

## 3. Installation

### 3.1 Monitor only (first time)

**Step 1 — System user and venv**

```bash
sudo apt update && sudo apt install -y python3 python3-venv python3-pip
id inferyx || sudo useradd --system --home-dir /opt/pipeline-monitor --shell /usr/sbin/nologin inferyx
sudo mkdir -p /opt/pipeline-monitor && sudo chown inferyx:inferyx /opt/pipeline-monitor
sudo -u inferyx python3 -m venv /opt/pipeline-monitor/.venv
sudo -u inferyx /opt/pipeline-monitor/.venv/bin/pip install --upgrade pip inferyx-monitoring
```

**Step 2 — Create config files**

```bash
sudo -u inferyx /opt/pipeline-monitor/.venv/bin/inferyx-monitoring --init-config --work-dir /opt/pipeline-monitor
```

This creates `.env` and `batch_file.csv` if they are missing.

**Step 3 — Edit `.env`**

Set all [required `.env` properties](#51-env-file-optpipeline-monitorenv). See [section 5](#5-properties).

```bash
sudo -u inferyx vi /opt/pipeline-monitor/.env
sudo chmod 600 /opt/pipeline-monitor/.env
```

**Step 4 — Edit batch CSV**

Add your batches. See [section 6](#6-batch-csv-format).

```bash
sudo -u inferyx vi /opt/pipeline-monitor/batch_file.csv
```

**Step 5 — Test once**

```bash
sudo -u inferyx /opt/pipeline-monitor/.venv/bin/inferyx-monitoring --once --work-dir /opt/pipeline-monitor --env-file /opt/pipeline-monitor/.env --csv-file /opt/pipeline-monitor/batch_file.csv
```

**Step 6 — systemd service**

```bash
sudo tee /etc/systemd/system/inferyx-monitoring.service <<'EOF'
[Unit]
Description=Inferyx Pipeline Batch Monitor
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User=inferyx
Group=inferyx
WorkingDirectory=/opt/pipeline-monitor
ExecStart=/opt/pipeline-monitor/.venv/bin/inferyx-monitoring --work-dir /opt/pipeline-monitor --env-file /opt/pipeline-monitor/.env --csv-file /opt/pipeline-monitor/batch_file.csv
Restart=always
RestartSec=10
Environment=PYTHONUNBUFFERED=1

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now inferyx-monitoring.service
```

### 3.2 Install script (optional)

Run the install script from the pip package or from the git `server/` folder. Each line below is an alternative invocation:

```bash
# Monitor only
sudo bash install_inferyx_monitoring.sh
# Monitor plus Admin API and Admin UI
sudo bash install_inferyx_monitoring.sh --admin --admin-ui
# Install a specific version
sudo bash install_inferyx_monitoring.sh --pin 1.0.36
```

Scripts location after pip install:

`/opt/pipeline-monitor/.venv/share/inferyx-monitoring/scripts/`

### 3.3 Admin UI (optional, first time)

**A. Install admin Python package**

```bash
sudo -u inferyx /opt/pipeline-monitor/.venv/bin/pip install 'inferyx-monitoring[admin]'
```

**B. Create auth policy** (manual — not created by pip)

```bash
sudo mkdir -p /etc/pipeline-monitor
sudo cp /opt/pipeline-monitor/.venv/share/inferyx-monitoring/config/auth.policy.example /etc/pipeline-monitor/auth.policy
sudo chmod 600 /etc/pipeline-monitor/auth.policy
sudo vi /etc/pipeline-monitor/auth.policy
```

Set the OAuth settings, `ui.enabled: true`, and the URLs. See [auth.policy properties](#52-authpolicy-etcpipeline-monitorauthpolicy).

**C. Install UI static files**

```bash
sudo /opt/pipeline-monitor/.venv/bin/inferyx-monitoring-admin-install-ui --target /var/www/pipeline-monitor-admin
```

**D. Admin API service**

```bash
sudo tee /etc/systemd/system/inferyx-monitoring-admin.service <<'EOF'
[Unit]
Description=Inferyx Pipeline Monitor Admin API
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User=inferyx
Group=inferyx
WorkingDirectory=/opt/pipeline-monitor
ExecStart=/opt/pipeline-monitor/.venv/bin/inferyx-monitoring-admin --host 127.0.0.1 --port 8090
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now inferyx-monitoring-admin.service
```

**E. Nginx** (manual — copy example, edit, enable, reload)

```bash
sudo cp /opt/pipeline-monitor/.venv/share/inferyx-monitoring/config/nginx-pipeline-monitor-admin.conf.example \
  /etc/nginx/sites-available/pipeline-monitor-admin
sudo vi /etc/nginx/sites-available/pipeline-monitor-admin
sudo ln -sf /etc/nginx/sites-available/pipeline-monitor-admin /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx
```

UI: `/monitoring/admin/` · API: `/monitoring/api/` (proxy to `127.0.0.1:8090`).

---

## 4. Upgrade

### 4.1 What happens automatically on upgrade

When the monitor **starts** or you run `--init-config`, these run **without overwriting your secrets**:

| Action | Detail |
|--------|--------|
| Append missing `.env` keys | New keys from package template added at bottom of file |
| Retire `PIPELINE_MAIL_BODY_*` | Commented out (structured email layout used instead) |
| Remove duplicate `.env` keys | Keeps one value per key |
| Copy `jfl_batch.csv` → `batch_file.csv` | Only if legacy file exists and new name missing |
| Add `Status` column to CSV | Only if column missing; existing rows get `Active` |

**Never changed automatically:** SMTP password, API token, mail recipients, batch row data, `/etc/pipeline-monitor/auth.policy`.
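
The "append missing keys" behaviour can be pictured with a short sketch. This is an illustration under assumptions, not the package's migration code: the helper names `env_keys` and `append_missing_keys` are hypothetical, and the real migration may handle comments and quoting differently.

```python
def env_keys(text: str) -> set:
    """Collect the key names of all non-comment KEY=VALUE lines."""
    keys = set()
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and "=" in stripped:
            keys.add(stripped.split("=", 1)[0].strip())
    return keys


def append_missing_keys(current: str, template: str) -> str:
    """Append template keys that are absent from `current`; never touch existing lines."""
    have = env_keys(current)
    additions = []
    for line in template.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        key = stripped.split("=", 1)[0].strip()
        if key not in have:
            additions.append(stripped)
            have.add(key)
    if not additions:
        return current
    body = current if current.endswith("\n") else current + "\n"
    return body + "\n".join(additions) + "\n"
```

Because existing lines are copied through unchanged, a value such as `PIPELINE_API_TOKEN=secret` survives even when the template carries an empty `PIPELINE_API_TOKEN=`.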

### 4.2 What you must do manually after upgrade

| Item | Action |
|------|--------|
| `.env` secrets | Review file; fill any new keys appended at bottom |
| `auth.policy` | Edit when OAuth URLs or UI paths change |
| Nginx | Edit config when paths change; run `sudo nginx -t && sudo systemctl reload nginx` |
| OAuth console (Google/AWS) | Update redirect URIs when API path changes |
| Admin UI static files | Re-run `inferyx-monitoring-admin-install-ui` after package upgrade |
| systemd `ExecStart` | Update `--csv-file` if still pointing at `jfl_batch.csv` |

### 4.3 Upgrade commands

**Monitor only**

```bash
sudo -u inferyx /opt/pipeline-monitor/.venv/bin/pip install --upgrade inferyx-monitoring
sudo systemctl restart inferyx-monitoring.service
```

**Monitor + Admin API**

```bash
sudo -u inferyx /opt/pipeline-monitor/.venv/bin/pip install --upgrade 'inferyx-monitoring[admin]'
sudo systemctl restart inferyx-monitoring.service
sudo systemctl restart inferyx-monitoring-admin.service
```

**Full stack (monitor + admin API + UI files)**

```bash
sudo -u inferyx /opt/pipeline-monitor/.venv/bin/pip install --upgrade 'inferyx-monitoring[admin]'
sudo /opt/pipeline-monitor/.venv/bin/inferyx-monitoring-admin-install-ui --target /var/www/pipeline-monitor-admin
sudo systemctl restart inferyx-monitoring.service
sudo systemctl restart inferyx-monitoring-admin.service
sudo nginx -t && sudo systemctl reload nginx
```

**Pin version**

```bash
sudo -u inferyx /opt/pipeline-monitor/.venv/bin/pip install --upgrade 'inferyx-monitoring[admin]==1.0.36'
```

**Upgrade script**

```bash
sudo bash /opt/pipeline-monitor/.venv/share/inferyx-monitoring/scripts/upgrade_inferyx_monitoring.sh
sudo bash .../upgrade_inferyx_monitoring.sh --mode full --pin 1.0.36
```

| Mode | Does |
|------|------|
| `monitor` | pip upgrade monitor + restart monitor |
| `admin` | pip upgrade `[admin]` + restart monitor and admin API |
| `full` | pip + refresh UI files + restart services |
| `ui` | Refresh UI static files only |

**From 1.0.14 or older**

```bash
sudo -u inferyx mv /opt/pipeline-monitor/jfl_batch.csv /opt/pipeline-monitor/batch_file.csv
sudo -u inferyx /opt/pipeline-monitor/.venv/bin/pip install --upgrade inferyx-monitoring
sudo systemctl restart inferyx-monitoring.service
```

---

## 5. Properties

### 5.1 `.env` file (`/opt/pipeline-monitor/.env`)

#### Required (set manually on first install)

| Property | Description |
|----------|-------------|
| `PIPELINE_SMTP_HOST` | SMTP server hostname |
| `PIPELINE_SMTP_PORT` | SMTP port (e.g. `587`) |
| `PIPELINE_SMTP_USERNAME` | SMTP username |
| `PIPELINE_SMTP_PASSWORD` | SMTP password |
| `PIPELINE_FROM_NAME` | Sender display name |
| `PIPELINE_MAIL_TO` | Alert email recipients (comma-separated) |
| `PIPELINE_API_BASE_URL` | Inferyx API URL — **no** `name=` in URL |
| `PIPELINE_API_TOKEN` | API authentication token |
| `PIPELINE_API_TOKEN_HEADER` | Header name (e.g. `token` or `Authorization`) |
| `PIPELINE_DEVOPS_EMAIL` | Recipient for `no_data` alerts |
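
Before the first run it can help to confirm that none of the required keys is empty. The sketch below is a standalone check, not part of the package: the function names `parse_env` and `missing_required` are hypothetical, and the parser only understands plain `KEY=VALUE` lines.

```python
REQUIRED_KEYS = [
    "PIPELINE_SMTP_HOST", "PIPELINE_SMTP_PORT", "PIPELINE_SMTP_USERNAME",
    "PIPELINE_SMTP_PASSWORD", "PIPELINE_FROM_NAME", "PIPELINE_MAIL_TO",
    "PIPELINE_API_BASE_URL", "PIPELINE_API_TOKEN", "PIPELINE_API_TOKEN_HEADER",
    "PIPELINE_DEVOPS_EMAIL",
]


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_required(text: str) -> list:
    """Return the required keys that are absent or set to an empty value."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

Reading `/opt/pipeline-monitor/.env` and passing its contents to `missing_required` would return an empty list once every required property has a value.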

#### Optional — scheduling and API

| Property | Default | Description | Since |
|----------|---------|-------------|-------|
| `PIPELINE_CHECK_MODE` | `schedule_windows` | `schedule_windows` (recommended) or `full_window` | **1.0.15** |
| `PIPELINE_CHECK_WINDOW_MINUTES` | `10` | Minutes around start/end to poll API | **1.0.15** |
| `PIPELINE_CHECK_INTERVAL` | `60` | Seconds between checks (legacy mode) | original |
| `PIPELINE_SCHEDULE_GRACE_MINUTES` | `5` | Grace after expected start/end | original |
| `PIPELINE_POST_RUN_GRACE_MINUTES` | `60` | Grace after run completes | original |
| `PIPELINE_API_FILTER_BY_SCHEDULE_DATE` | `false` | Add schedule date to API query | original |
| `PIPELINE_API_RETRY_COUNT` | `3` | API retry attempts | original |
| `PIPELINE_API_RETRY_BACKOFF_SEC` | `5` | Seconds between API retries | original |
| `PIPELINE_ALERT_COOLDOWN_MINUTES` | `60` | Minutes before repeat alert | original |
| `PIPELINE_FAILED_ALERT_ONCE_PER_DAY` | `true` | Send `failed` alerts at most once per day | original |
| `PIPELINE_ALERT_ONCE_PER_DAY_ALL_SCENARIOS` | `true` | Send every alert type at most once per day | original |
| `PIPELINE_MAIL_CC` | (empty) | CC recipients | original |
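
The cooldown and once-per-day settings both suppress repeat alerts. How the package combines them is not documented here, so the following sketch assumes one plausible reading: the once-per-day rule is checked first, then the cooldown. The function name `should_send` is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_send(now: datetime, last_sent: Optional[datetime],
                cooldown_minutes: int = 60, once_per_day: bool = True) -> bool:
    """Decide whether a repeat alert may go out (assumed combination of both rules)."""
    if last_sent is None:
        return True  # nothing sent yet for this batch and alert type
    if once_per_day and last_sent.date() == now.date():
        return False  # already alerted today
    # Otherwise require the cooldown to have elapsed since the last alert.
    return now - last_sent >= timedelta(minutes=cooldown_minutes)
```

Under these assumptions an alert sent at 09:00 is not repeated at 11:00 while `once_per_day` is on, but it is repeated at 11:00 when only the 60-minute cooldown applies.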

#### Optional — Teams / Google Chat

| Property | Default | Description | Since |
|----------|---------|-------------|-------|
| `PIPELINE_TEAMS_ENABLED` | `false` | Enable Teams webhooks | **1.0.25** |
| `PIPELINE_TEAMS_WEBHOOK_URL` | (empty) | Teams incoming webhook URL | **1.0.25** |
| `PIPELINE_GCHAT_ENABLED` | `false` | Enable Google Chat webhooks | **1.0.25** |
| `PIPELINE_GCHAT_WEBHOOK_URL` | (empty) | Google Chat webhook URL | **1.0.25** |
| `PIPELINE_CHAT_ALERT_NO_DATA` | `false` | Also send `no_data` to chat (default: email only) | **1.0.25** |
| `PIPELINE_CHAT_GREETING` | (built-in) | Chat message greeting | **1.0.25** |
| `PIPELINE_CHAT_SIGNATURE` | (built-in) | Chat message signature | **1.0.25** |

Test chat: `inferyx-monitoring --test-chat-alerts --work-dir /opt/pipeline-monitor`
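
To check a webhook URL independently of the monitor, a plain JSON post is usually enough. The sketch below is an assumption-heavy illustration: Google Chat incoming webhooks accept a body of the form `{"text": "..."}`, and many Teams incoming webhooks do as well, but the message layout used by `inferyx-monitoring` itself is not shown in this document. The function names are hypothetical.

```python
import json
import urllib.request


def build_chat_payload(greeting: str, batch_name: str, issue_type: str,
                       signature: str) -> bytes:
    """Build a simple text payload; the layout is illustrative only."""
    text = f"{greeting}\nBatch {batch_name}: {issue_type.upper()}\n{signature}"
    return json.dumps({"text": text}).encode("utf-8")


def post_webhook(url: str, payload: bytes, timeout: float = 10.0) -> int:
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling `post_webhook` with the value of `PIPELINE_GCHAT_WEBHOOK_URL` or `PIPELINE_TEAMS_WEBHOOK_URL` should produce a visible test message if the URL is valid.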

#### Optional — email subjects

| Property | Description | Since |
|----------|-------------|-------|
| `PIPELINE_MAIL_SUBJECT_DEFAULT` | Default subject template | original |
| `PIPELINE_MAIL_SUBJECT_NO_DATA` | Subject for `no_data` | original |
| `PIPELINE_MAIL_SUBJECT_FAILED` | Subject for `failed` | original |
| `PIPELINE_MAIL_SUBJECT_RUNNING` | Subject for `running` | original |
| `PIPELINE_MAIL_SUBJECT_MISSED` | Subject for `missed` | original |

Placeholders: `{batch_name}`, `{issue_type_upper}`, etc.
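
As an illustration of how such placeholders expand, the sketch below fills a subject template with Python's `str.format`. Whether the package uses `str.format` internally is an assumption, and the template text and the function name `render_subject` are hypothetical; only the two placeholder names come from this document.

```python
def render_subject(template: str, batch_name: str, issue_type: str) -> str:
    """Fill the two documented placeholders in a subject template."""
    return template.format(
        batch_name=batch_name,
        issue_type_upper=issue_type.upper(),
    )


# Hypothetical template value for PIPELINE_MAIL_SUBJECT_FAILED.
subject = render_subject(
    "[{issue_type_upper}] Batch {batch_name} needs attention",
    batch_name="daily_report",
    issue_type="failed",
)
print(subject)  # [FAILED] Batch daily_report needs attention
```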

#### Optional — email layout (structured HTML)

| Property | Description | Since |
|----------|-------------|-------|
| `PIPELINE_MAIL_GREETING` | Email greeting line | **1.0.17** |
| `PIPELINE_MAIL_ENVIRONMENT` | Environment label in email | original |
| `PIPELINE_MAIL_HEADING_BATCH_DETAILS` | Section heading | **1.0.19** |
| `PIPELINE_MAIL_HEADING_ADDITIONAL_INFO` | Section heading | **1.0.19** |
| `PIPELINE_MAIL_HEADING_ALERT_SUMMARY` | Section heading | **1.0.19** |
| `PIPELINE_MAIL_HEADING_ACTION_REQUIRED` | Section heading | **1.0.19** |
| `PIPELINE_MAIL_HEADING_CURRENT_STATUS` | Section heading | **1.0.19** |
| `PIPELINE_MAIL_INTRO_RUNNING` | Body section (running) | **1.0.19** |
| `PIPELINE_MAIL_SUMMARY_RUNNING` | Body section (running) | **1.0.19** |
| `PIPELINE_MAIL_ACTION_RUNNING` | Body section (running) | **1.0.19** |
| `PIPELINE_MAIL_STATUS_RUNNING` | Body section (running) | **1.0.19** |
| `PIPELINE_MAIL_SIGNATURE` | Email signature HTML | original |
| `PIPELINE_MAIL_FOOTER_NOTE` | Footer note HTML | original |

Equivalent `PIPELINE_MAIL_INTRO_*`, `PIPELINE_MAIL_SUMMARY_*`, `PIPELINE_MAIL_ACTION_*`, and `PIPELINE_MAIL_STATUS_*` keys exist for `failed`, `missed`, and `no_data`; built-in defaults apply when a key is omitted.

#### Retired (do not add; commented out on upgrade)

| Property | Replaced by | Since retired |
|----------|-------------|---------------|
| `PIPELINE_MAIL_BODY_DEFAULT` | Structured sections above | **1.0.20** |
| `PIPELINE_MAIL_BODY_NO_DATA` | Structured sections above | **1.0.20** |
| `PIPELINE_MAIL_BODY_FAILED` | Structured sections above | **1.0.20** |
| `PIPELINE_MAIL_BODY_RUNNING` | Structured sections above | **1.0.20** |
| `PIPELINE_MAIL_BODY_MISSED` | Structured sections above | **1.0.20** |

### 5.2 `auth.policy` (`/etc/pipeline-monitor/auth.policy`)

Admin UI only. **Not** updated by pip. JSON format. Copy from `share/inferyx-monitoring/config/auth.policy.example`.

#### `ui` section

| Property | Example | Description | Since |
|----------|---------|-------------|-------|
| `enabled` | `true` | Turn Admin UI on/off | **1.0.27** |
| `session_timeout_minutes` | `60` | Login session length | **1.0.27** |
| `public_base_url` | `https://monitor.example.com` | Public site URL (no trailing path) | **1.0.27** |
| `ui_base_path` | `/monitoring/admin/` | UI path under public URL | **1.0.35** (was `/admin/`) |

Browser UI URL = `public_base_url` + `ui_base_path` → `https://monitor.example.com/monitoring/admin/`

#### `auth` section (OAuth)

| Property | Description | Since |
|----------|-------------|-------|
| `auth.google.enabled` | Enable Google login | **1.0.27** |
| `auth.google.client_id` | Google OAuth client ID | **1.0.27** |
| `auth.google.client_secret_path` | File containing client secret | **1.0.27** |
| `auth.google.redirect_uri` | `https://<host>/monitoring/api/auth/callback/google` | **1.0.35** path |
| `auth.google.allowed_domains` | Allowed email domains | **1.0.27** |
| `auth.aws_idc.*` | AWS IAM Identity Center OAuth (same pattern) | **1.0.27** |

Register `redirect_uri` in Google / AWS OAuth console.
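
The two URLs that must match between `auth.policy`, nginx, and the OAuth console can be derived from the public base URL. This is a small sketch with hypothetical helper names; it assumes the `<host>` in the redirect URI is the same host as `public_base_url`.

```python
def ui_url(public_base_url: str, ui_base_path: str) -> str:
    """Join the public base URL and the UI path with exactly one slash."""
    return public_base_url.rstrip("/") + "/" + ui_base_path.lstrip("/")


def google_redirect_uri(public_base_url: str) -> str:
    """Build the Google OAuth callback URL under /monitoring/api/."""
    return public_base_url.rstrip("/") + "/monitoring/api/auth/callback/google"


print(ui_url("https://monitor.example.com", "/monitoring/admin/"))
print(google_redirect_uri("https://monitor.example.com"))
```

The second value is what would be registered as the redirect URI in the Google OAuth console.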

#### `paths` section

| Property | Default | Description | Since |
|----------|---------|-------------|-------|
| `work_dir` | `/opt/pipeline-monitor` | Monitor install directory | **1.0.27** |
| `env_file` | `/opt/pipeline-monitor/.env` | Path to `.env` | **1.0.27** |
| `csv_file` | `/opt/pipeline-monitor/batch_file.csv` | Path to batch CSV | **1.0.27** |
| `audit_log` | `/var/log/pipeline-monitor/admin-audit.log` | Admin audit log | **1.0.27** |
| `session_secret_path` | `/etc/pipeline-monitor/session_secret` | Session signing secret file | **1.0.27** |

#### `branding` section

| Property | Description | Since |
|----------|-------------|-------|
| `app_name` | UI title | **1.0.27** |
| `logo_url` | Logo image URL | **1.0.27** |
| `footer_text` | Footer text | **1.0.27** |
| `primary_color` | Theme color (hex) | **1.0.27** |
| `favicon_url` | Favicon URL | **1.0.27** |

#### `security` section

| Property | Default | Description | Since |
|----------|---------|-------------|-------|
| `restart_service_on_save` | `true` | Restart monitor after config save from UI | **1.0.27** |
| `service_name` | `inferyx-monitoring` | systemd service to restart | **1.0.27** |
| `rate_limit_per_minute` | `120` | API rate limit per IP | **1.0.27** |
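
For orientation, the sections above might fit together as in the sketch below, which builds a policy as a Python dictionary and prints it as JSON. The nesting (one top-level object per section, with `google` nested under `auth`) is inferred from the dotted property names and is an assumption; the client ID, secret path, and domain are placeholders. The bundled `auth.policy.example` remains the authoritative layout.

```python
import json

# Assumed structure; verify against the bundled auth.policy.example.
policy = {
    "ui": {
        "enabled": True,
        "session_timeout_minutes": 60,
        "public_base_url": "https://monitor.example.com",
        "ui_base_path": "/monitoring/admin/",
    },
    "auth": {
        "google": {
            "enabled": True,
            "client_id": "<client-id>",  # placeholder
            "client_secret_path": "/etc/pipeline-monitor/google_client_secret",  # hypothetical path
            "redirect_uri": "https://monitor.example.com/monitoring/api/auth/callback/google",
            "allowed_domains": ["example.com"],  # placeholder
        }
    },
    "paths": {
        "work_dir": "/opt/pipeline-monitor",
        "env_file": "/opt/pipeline-monitor/.env",
        "csv_file": "/opt/pipeline-monitor/batch_file.csv",
    },
    "security": {
        "restart_service_on_save": True,
        "service_name": "inferyx-monitoring",
        "rate_limit_per_minute": 120,
    },
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```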

---

## 6. Batch CSV format

### 6.1 Current format (`batch_file.csv`)

**File:** `/opt/pipeline-monitor/batch_file.csv` (since **1.0.15**; was `jfl_batch.csv`)

**Example:**

```csv
Name,Frequency,ExpectedStartTime,AvgExecutionTime,ExpectedDayOfMonth,Status
daily_report,Daily,09:00:00,"10 mins",,Active
monthly_close,Monthly,06:00:00,"45 mins",1,Active
adhoc_job,Once,14:30:00,"5 mins",,Suspended
```

Use **24-hour** times. Quote values that contain commas.
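
The two time columns together define the window in which a batch is expected to run. The sketch below derives that window for a given day. It is an illustration only: the accepted duration units are assumed from the examples in this document (`mins`, `hour`), and the function names are hypothetical.

```python
import re
from datetime import date, datetime, time, timedelta

# Seconds per unit; the unit spellings are assumed from the documented examples.
_UNITS = {"min": 60, "mins": 60, "minute": 60, "minutes": 60,
          "hour": 3600, "hours": 3600}


def parse_duration(text: str) -> timedelta:
    """Parse values such as '10 mins' or '1 hour' into a timedelta."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", text)
    if not match or match.group(2).lower() not in _UNITS:
        raise ValueError(f"unrecognised duration: {text!r}")
    return timedelta(seconds=int(match.group(1)) * _UNITS[match.group(2).lower()])


def expected_window(day: date, start_text: str, duration_text: str):
    """Return (expected_start, expected_end) for one run on `day`."""
    hours, minutes, seconds = (int(part) for part in start_text.split(":"))
    start = datetime.combine(day, time(hours, minutes, seconds))
    return start, start + parse_duration(duration_text)
```

For the `adhoc_job` row, `expected_window(date(2026, 1, 5), "14:30:00", "5 mins")` gives a window from 14:30 to 14:35 on that day.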

### 6.2 Column reference

| Column | Required | Description | Since |
|--------|----------|-------------|-------|
| `Name` | Yes | Batch name as known to Inferyx API | original |
| `Frequency` | Yes | `Daily`, `Weekly`, `Monthly`, or `Once` | original |
| `ExpectedStartTime` | Yes | Expected start time (`HH:MM:SS`) | original |
| `AvgExecutionTime` | Yes | Typical duration (e.g. `"10 mins"`, `"1 hour"`) | original |
| `ExpectedDayOfMonth` | No | Day of month for `Monthly` jobs (1–31); leave empty for other frequencies | original |
| `Status` | Yes* | `Active` or `Suspended` — only `Active` rows are monitored | **1.0.15** |

\*If `Status` column is missing, upgrade/migration adds it and sets existing rows to `Active`.
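
A quick way to see which rows would be monitored is to load the file and keep only the `Active` ones. The sketch below uses the standard `csv` module and is not the package's loader: the function name `active_batches` is hypothetical, a missing `Status` value is treated as `Active` to mirror the migration rule above, and the case-insensitive comparison is an assumption.

```python
import csv
import io

REQUIRED_COLUMNS = ["Name", "Frequency", "ExpectedStartTime", "AvgExecutionTime"]


def active_batches(csv_text: str) -> list:
    """Return the rows whose Status is Active, after checking required columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [column for column in REQUIRED_COLUMNS if column not in header]
    if missing:
        raise ValueError(f"missing columns: {', '.join(missing)}")
    rows = []
    for row in reader:
        # A missing Status is treated as Active, as the migration would set it.
        status = (row.get("Status") or "Active").strip()
        if status.lower() == "active":
            rows.append(row)
    return rows
```

Applied to the example file in section 6.1, this would return the `daily_report` and `monthly_close` rows and skip the suspended `adhoc_job`.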

### 6.3 Legacy format (before 1.0.15)

| Item | Old | Current |
|------|-----|---------|
| Filename | `jfl_batch.csv` | `batch_file.csv` |
| Columns | Same except **no `Status` column** | Includes `Status` |
| systemd `--csv-file` | May point to `jfl_batch.csv` | Use `batch_file.csv` |

Migration on start copies `jfl_batch.csv` → `batch_file.csv` (original kept) and adds `Status` if missing.

### 6.4 Frequency values

| Value | Meaning |
|-------|---------|
| `Daily` | Runs every day at `ExpectedStartTime` |
| `Weekly` | Runs weekly at `ExpectedStartTime` |
| `Monthly` | Runs on `ExpectedDayOfMonth` at `ExpectedStartTime` |
| `Once` | Single scheduled run |

### 6.5 Status values

| Value | Meaning |
|-------|---------|
| `Active` | Monitored |
| `Suspended` | Ignored (not polled) |

Also accepted by Admin UI: `Inactive`, `Disabled` (treated as not active).
