Metadata-Version: 2.4
Name: aic_utils
Version: 2.0.10
Summary: AIC API wrapper and GitLab integration framework for pipeline management
Home-page: https://github.com/dylandoyle11/aic_utils
Author: Dylan D
Author-email: dylan.doyle@jdpa.com
Keywords: aic gitlab pipeline automation devops
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: requests>=2.25.0
Requires-Dist: PyYAML>=5.4.0
Requires-Dist: pandas>=1.3.0
Requires-Dist: slack_sdk>=3.0.0
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: keywords
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# aic_utils

`aic_utils` is a Python toolkit for managing JD Power **AIC** (Analytics Insights Console) pipelines and synchronizing them with **GitLab** under a strict, audit-friendly Git flow. It also bundles a few supporting utilities (dataset placeholder conversion, Slack logging, repository bootstrapping).

- **PyPI:** https://pypi.org/project/aic-utils/
- **Source:** https://github.com/dylandoyle11/aic_utils
- **Current version (source tree):** `2.0.10`
- **Author:** Dylan Doyle

> ⚠️ The PyPI release may lag this repository. This README documents the **current source tree**. If a method described here is not present in the version you `pip install`-ed, install from source (`pip install git+https://github.com/dylandoyle11/aic_utils`) or pin to the version that matches.

---

## Table of Contents

1. [Installation](#1-installation)
2. [Package Overview](#2-package-overview)
3. [`AIC` — AIC REST API wrapper](#3-aic--aic-rest-api-wrapper)
4. [`GitLabManager` — Git ↔ AIC sync engine](#4-gitlabmanager--git--aic-sync-engine)
5. [`BranchWorkspaceMapper` — branch ↔ workspace mapping](#5-branchworkspacemapper--branch--workspace-mapping)
6. [`GitLabRepositoryInitializer` — bootstrap a new repo](#6-gitlabrepositoryinitializer--bootstrap-a-new-repo)
7. [`DatasetConverter` — placeholder rewriting](#7-datasetconverter--placeholder-rewriting)
8. [`SlackLogger` — Slack-backed Python logger](#8-slacklogger--slack-backed-python-logger)
9. [Day-to-day workflows](#9-day-to-day-workflows)
10. [Best practices & troubleshooting](#10-best-practices--troubleshooting)

---

## 1. Installation

```bash
pip install aic_utils
```

Or install the latest source directly (unpinned, tracks the default branch):

```bash
pip install git+https://github.com/dylandoyle11/aic_utils.git
```

**Runtime dependencies** (from `setup.py`):

- `requests >= 2.25.0`
- `PyYAML >= 5.4.0`
- `pandas >= 1.3.0`
- `slack_sdk >= 3.0.0`

**Python:** 3.8+

> The `GitLabManager._send_email` helper imports `pyspark_utils` lazily; that package is optional and only required when `email_recipients` is configured.

---

## 2. Package Overview

```python
from aic_utils import (
    AIC,                           # AIC REST wrapper
    GitLabManager,                 # GitLab sync / Git-flow enforcement
    GitLabRepositoryInitializer,   # One-time repo bootstrap
    DatasetConverter,              # Rewrite bare table names → ${DATASET.TABLE} placeholders
    SlackLogger,                   # Pipe Python logging / print() to Slack
)
```

The package is organized so that one `AIC` instance + one `GitLabManager` instance together represent **one workspace ↔ one branch** at a time. Each environment (Prod / QA / Dev / sandbox) uses its own pair.

### High-level architecture

```
┌──────────────┐        ┌────────────────────┐        ┌──────────────┐
│ AIC workspace│  ←──── │  AIC (REST wrapper)│ ────→  │ pipeline JSON│
│  (cloud)     │        └─────────┬──────────┘        │  in memory   │
└──────────────┘                  │                   └──────┬───────┘
                                  ▼                          ▼
                       ┌────────────────────┐       ┌──────────────────┐
                       │   GitLabManager    │ ────→ │  GitLab repo     │
                       │  (push / deploy /  │       │  config/*.json   │
                       │   branch / MR)     │       │  code/<pipe>/*   │
                       └─────────┬──────────┘       └──────────────────┘
                                 │
                                 ▼
                       ┌────────────────────┐
                       │ BranchWorkspace-   │  ← branch_to_workspace.yaml
                       │ Mapper             │
                       └────────────────────┘
```

---

## 3. `AIC` — AIC REST API wrapper

`aic_utils/aic.py` — class `AIC`

A thin REST client around the AIC core API (`/apis/core/v1`). Resolves project + workspace, lists pipelines, fetches/upserts pipeline configs, manages datasets and backups.

### 3.1 Constructor

```python
AIC(api_key: str,
    project: str,
    workspace: str,
    pipelines: list[str] = [],
    qa: bool = False)
```

| Arg | Description |
|-----|-------------|
| `api_key` | AIC API key (`User Profile → Manage API Keys` in AIC UI). |
| `project` | Exact project name (matched against `name` from `GET /projects`). |
| `workspace` | Exact workspace name within the project. |
| `pipelines` | List of pipeline names to load configs for. `['*']` loads **all** pipelines in the workspace. Names not found are added as **dummy configs** with empty stages — useful for creating new pipelines on first deploy. |
| `qa` | If `True`, points to `https://aic.qa.jdpower.com/apis/core/v1`; otherwise prod (`https://prod.jdpower.ai/apis/core/v1`). |

**Resolved attributes**

| Attribute | Type | Meaning |
|-----------|------|---------|
| `base_url` | `str` | Endpoint root, prod or QA. |
| `headers` | `dict` | `accept`, `api-key`, `Content-Type`. |
| `project_id` | `str` | Resolved via `get_project_id`. |
| `workspace` | `str` | Original workspace name. |
| `workspace_id` | `str` | Resolved via `get_workspace_id`. |
| `pipelines` | `list[{'name','id'}]` | All pipelines in the workspace (populated by `pop_pipelines`). |
| `pipeline_configs` | `list[{'name','jobConfig','id'}]` | Configs for the requested subset (populated by `pop_pipeline_config`, parallelized via `ThreadPoolExecutor`). |
| `timestamp` | class attr | `YYYYMMDD` of instantiation date. |

### 3.2 Discovery

| Method | Returns | Notes |
|--------|---------|-------|
| `get_project_id(project)` | `str` | Raises if project not found. |
| `get_workspace_id(workspace)` | `str` | Raises if workspace not found. |
| `pop_pipelines()` | `list[{'name','id'}]` | `GET /jobs`; refreshes `self.pipelines`. |
| `get_pipelines(workspace_id)` | `list[{'name','id'}]` | Same shape as above, but for an arbitrary workspace ID. |
| `pop_pipeline_config(pipelines)` | `list[dict]` | Parallel fetch. Pass `['*']` for all. Missing names produce dummy configs. |
| `fetch_pipeline_config(pipeline, direct=False)` | `dict` | `pipeline` is a `{'name', 'id'/'$id'}` dict. `direct=True` skips name lookup. |
| `get_config()` | `list[dict]` | Returns `self.pipeline_configs`. |
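
The `['*']` wildcard and the dummy-config fallback can be pictured with a small standalone sketch. It does not call AIC: `select_pipelines` is a hypothetical helper, `available` stands in for the name-to-id data that `pop_pipelines` collects, and the exact shape of a dummy config is an assumption based on the "empty stages" description above.

```python
def select_pipelines(requested, available):
    """Resolve requested pipeline names against a workspace listing.

    `available` maps pipeline name -> id (hypothetical stand-in for
    the data behind `self.pipelines`).
    """
    # ['*'] means "everything in the workspace".
    names = list(available) if requested == ["*"] else list(requested)

    selected = []
    for name in names:
        if name in available:
            selected.append({"name": name, "id": available[name]})
        else:
            # Assumed dummy shape: no id yet, empty stages.
            selected.append({"name": name, "jobConfig": {"stages": []}, "id": None})
    return selected


workspace_listing = {"LOAD_SALES": "101", "LOAD_INVENTORY": "102"}
print(select_pipelines(["LOAD_SALES", "BRAND_NEW"], workspace_listing))
```

A dummy entry carries no `id`, so a later upsert would take the create path rather than the update path.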

### 3.3 Pipeline upsert (CRUD)

| Method | Effect |
|--------|--------|
| `write_config_to_pipeline(config)` | `POST /jobs` upsert. If `config['id']` is set it’s included as `$id` to update; otherwise creates. |
| `create_or_update_pipeline(workspace_id, pipeline_config)` | Same idea but for an arbitrary workspace; on `409 Conflict` falls through to `update_pipeline`. |
| `update_pipeline(workspace_id, pipeline_id, pipeline_config)` | `PUT /jobs/{id}`. |
| `create_pipeline_branch(config)` | `PUT /interactive-pipeline/{jobId}/branches` with auto-generated name `PUSH_<YYYYMMDDHHMMSS>`. |
| `copy_pipelines(target_workspace_name, pipelines_to_copy)` | Copies the named pipelines from the current workspace into another (resolving target workspace by name). |

`config` shape:

```python
{
  'name': '<pipeline title>',
  'jobConfig': { 'stages': [...], 'variables': [...], 'sourceStage': {...}, 'sinkStage': {...}, ... },
  'id': '<existing $id, or None for create>',
}
```
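
As a rough illustration of the create-versus-update rule for `write_config_to_pipeline`, the sketch below builds a request body from a `config` dict. Only the `$id` behavior comes from the table above; the helper name `build_upsert_payload` and the other payload keys are assumptions, not the library's actual request format.

```python
def build_upsert_payload(config):
    """Build a hypothetical POST /jobs body from a config dict."""
    payload = {
        "name": config["name"],
        "jobConfig": config["jobConfig"],
    }
    # Include $id only when an existing id is known, which turns the
    # request into an update instead of a create.
    if config.get("id"):
        payload["$id"] = config["id"]
    return payload


new_pipeline = {"name": "DEMO", "jobConfig": {"stages": []}, "id": None}
existing = {"name": "DEMO", "jobConfig": {"stages": []}, "id": "abc123"}
print("$id" in build_upsert_payload(new_pipeline))   # create path
print("$id" in build_upsert_payload(existing))       # update path
```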

### 3.4 Workspace artifacts

| Method | Effect |
|--------|--------|
| `get_datasets()` | `GET /datasets`. |
| `get_tables(dataset_id)` | `GET /datasets/{id}/tables`. |
| `delete_branches(job_names)` | Deletes every interactive branch on each named job. |
| `backup_pipelines(pipelines, base_folder='.', drive_name='backups')` | Creates a `YYYYMMDD/` folder on the named drive and uploads each pipeline’s config JSON via `POST /upload-file`. `pipelines` may be names or full configs. |
| `get_drive_id_by_name(drive_name)` | Resolves drive by name (case-insensitive). Returns `None` if absent. |

### 3.5 Under the hood

- **Concurrency:** `pop_pipeline_config` uses `ThreadPoolExecutor`.
- **Error handling:** HTTP errors print status + body and continue where reasonable.
- **Logging:** `print()` for visibility (you can redirect to a logger via `SlackLogger`).

---

## 4. `GitLabManager` — Git ↔ AIC sync engine

`aic_utils/gitlabmanager.py` — class `GitLabManager`

Wraps the GitLab v4 REST API and binds it to an `AIC` instance. Enforces a strict 1:1 branch ↔ workspace contract, drives push/deploy passes, manages feature/release/hotfix Git flow, and can bootstrap a brand-new repo.

### 4.1 Branch role configuration

Class-level constants control which branches play which Git-flow role:

```python
GitLabManager.branch_roles = {
    "production":  "main",
    "staging":     "staging",
    "development": "dev",
}
```

If your repo uses different names (e.g. `develop` instead of `dev`), set this **before** instantiating any managers:

```python
GitLabManager.branch_roles = {"production": "main", "staging": "qa", "development": "develop"}
```

> The defaults — `main` / `staging` / `dev` — are what `GitLabRepositoryInitializer` will scaffold.

### 4.2 Constructor

```python
GitLabManager(
    aic_instance,                                   # AIC
    gitlab_token,                                   # str
    gitlab_namespace,                               # e.g. "pin/pin-analytics"
    repo_folder,                                    # e.g. "pin-fusion-2.0"
    gitlab_base_url: str = "https://git.autodatacorp.org/api/v4",
    use_hash_comparison: bool = True,
    email_recipients: list = None,
    email_sender: str = None,
    mapping_file: str = "branch_to_workspace.yaml",
)
```

**What `__init__` does, in order**

1. Builds GitLab auth headers (`Private-Token`).
2. Resolves the project (`{namespace}/{repo_folder}`); raises a clear `RuntimeError` on 404.
3. Caches the repo's `default_branch`.
4. Loads the `branch_to_workspace.yaml` from the default branch into a `BranchWorkspaceMapper`.
5. Picks the working branch by reverse-mapping `aic.workspace`. If no mapping exists, slugs the workspace name (`PIN FUSION 2.0` → `pin-fusion-2-0`).
6. Calls `ensure_branch_exists` on that branch (off `default_branch`).
7. Records `_allowed_push = {slug, mapped_branch}` plus `_hotfix_prefixes = ['hotfix/']`.
8. Calls `check_token_expiration()` (warns if ≤ 30 days).
9. **Locks the instance** — see §4.3.
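
Step 5 (reverse-map the workspace, else fall back to a slug) can be sketched without touching GitLab. `slugify` and `pick_branch` are hypothetical names; lowercasing and trimming leading or trailing dashes are assumptions inferred from the `PIN FUSION 2.0` example.

```python
import re


def slugify(text):
    # Collapse every run of characters outside a-z / 0-9 into one dash.
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def pick_branch(workspace, mapping):
    """Return the branch mapped to `workspace`, or a slug of its name."""
    for branch, mapped_workspace in mapping.items():
        # Wildcard keys such as "intg/*" are not concrete branches.
        if mapped_workspace == workspace and not branch.endswith("/*"):
            return branch
    return slugify(workspace)


mapping = {"main": "PIN FUSION 2.0", "staging": "PIN FUSION 2.0 QA"}
print(pick_branch("PIN FUSION 2.0 QA", mapping))  # mapped branch
print(pick_branch("PIN FUSION 2.0", {}))          # slug fallback
```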

### 4.3 Attribute lock (important)

After construction, `__setattr__` raises if you try to mutate an attribute. The only sanctioned way to change `_branch` is via:

- `create_feature_branch(name)`
- `create_release_branch(name)`
- `create_hotfix_branch(name)`

…all of which use the internal `_set_branch()` helper. `mgr.branch` is exposed as a **read-only property**.

> Practical implication: don’t reuse a `GitLabManager` across workspaces. Build a new one per environment.
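
For readers unfamiliar with the pattern, here is a minimal sketch of an attribute lock with a read-only property and one internal escape hatch. `LockedManager` is a hypothetical class, and raising `AttributeError` is an assumption; the real manager may use a different exception type.

```python
class LockedManager:
    def __init__(self, branch):
        # Bypass our own __setattr__ to create the lock flag.
        object.__setattr__(self, "_locked", False)
        self._branch = branch
        self._locked = True  # last ordinary assignment that succeeds

    def __setattr__(self, name, value):
        if getattr(self, "_locked", False):
            raise AttributeError(
                f"Cannot modify attribute '{name}' after initialization"
            )
        object.__setattr__(self, name, value)

    @property
    def branch(self):
        # Read-only view of the current branch.
        return self._branch

    def _set_branch(self, new_branch):
        # Sanctioned mutation path: writes past the lock deliberately.
        object.__setattr__(self, "_branch", new_branch)


mgr = LockedManager("dev")
mgr._set_branch("feature/demo")
print(mgr.branch)
```

Routing every branch change through one helper keeps the branch and the workspace it was validated against from drifting apart.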

### 4.4 Repository & branch utilities

| Method | Effect |
|--------|--------|
| `repository_exists(repo_path)` | `GET /projects/{path}`; reports whether GitLab answers 200. |
| `create_repository(repo_name)` | `POST /projects` under `get_subgroup_id()`, visibility `private`. |
| `get_subgroup_id()` | Resolves namespace path to a numeric group id. |
| `ensure_branch_exists(branch_name=None, ref=None)` | Creates `branch_name` off `ref` if missing. Idempotent. |
| `_slugify(text)` | URL-safe slug (`[^a-z0-9]+` → `-`). |
| `get_token_expiration_date()` | Hits `/personal_access_tokens`, returns the soonest-expiring active token's `datetime`. |
| `check_token_expiration(warning_threshold_days=30)` | Logs OK / warning / expired. Called automatically at init. |
| `generate_hash(content)` | MD5 of newline-normalized content (used by `deploy_pipelines`). |
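
The idea behind `generate_hash` (hash the content after normalizing line endings, so CRLF versus LF does not register as a change) can be sketched as follows. The exact normalization the library applies is not documented here; converting `\r\n` and bare `\r` to `\n` is an assumption.

```python
import hashlib


def generate_hash(content):
    # Treat Windows (\r\n) and old Mac (\r) line endings as \n.
    normalized = content.replace("\r\n", "\n").replace("\r", "\n")
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


windows_text = '{"stages": []}\r\n'
unix_text = '{"stages": []}\n'
print(generate_hash(windows_text) == generate_hash(unix_text))
```

MD5 is used here purely as a change detector, not for anything security-sensitive.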

### 4.5 File operations

| Method | Effect |
|--------|--------|
| `get_existing_file_content(repo_name, file_name)` | Tries current branch then default; base64-decodes. |
| `fetch_pipeline_file(full_path, ref=None)` | Raw content via `/raw?ref=...`. Falls back current → default. Raises `FileNotFoundError`. |
| `list_pipeline_files(subpath=None)` | Paginated `.json` list under `subpath` on current branch. Empty on 404. |
| `list_all_blobs(subpath)` | Recursive list of every file path under `subpath` on current branch. |
| `delete_file(file_name, branch=None)` | `DELETE /repository/files/...`. |
| `_commit_deletion_actions(paths, branch, msg_prefix)` | Batches up to **50 deletions per commit** via `POST /repository/commits`. |
| `_push_file(proj_encoded, file_name, file_content, branch)` | Tries `POST` (create); on 400/409 falls back to `PUT` (update). |
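
The 50-deletions-per-commit batching can be illustrated by building the request bodies only. The `branch` / `commit_message` / `actions` fields follow the GitLab Commits API; the helper name `build_deletion_commits` and the commit message wording are hypothetical.

```python
def build_deletion_commits(paths, branch, msg_prefix, batch_size=50):
    """Split `paths` into commit payloads of at most `batch_size` deletions."""
    payloads = []
    for start in range(0, len(paths), batch_size):
        batch = paths[start:start + batch_size]
        payloads.append({
            "branch": branch,
            "commit_message": f"{msg_prefix}: delete {len(batch)} file(s)",
            "actions": [{"action": "delete", "file_path": p} for p in batch],
        })
    return payloads


stale = [f"config/pipeline_{i}.json" for i in range(120)]
commits = build_deletion_commits(stale, "release/demo", "Prune")
print([len(c["actions"]) for c in commits])
```

Each payload would be sent as one `POST /repository/commits` call, so 120 stale files cost three requests rather than 120.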

### 4.6 Pushing AIC → Git

```python
mgr.push_to_git()
```

- **Production guard:** if the current branch equals `default_branch`, prompts `(y/n)` before continuing.
- **Branch guard:** branch must be in `_allowed_push` or start with `hotfix/`.
- **Pass 1:** writes `config/{name}.json` for every entry in `aic.pipeline_configs`.
- **Pass 2:** for each successful pipeline, calls `_extract_and_push_code` to:
  - dump `PYSPARK` task scripts to `code/{pipeline}/{task_id}.py`,
  - find triple-quoted SQL blocks inside those scripts (heuristic: first word ∈ `select|insert|update|delete|with|create|drop|alter|merge`) and write each as `code/{pipeline}/{task_id}_{n}.sql`,
  - dump `SQL` task `query`/`script`/`sql` text to `code/{pipeline}/{task_id}.sql`.
- **Email:** if `email_recipients` is set, sends an HTML push summary via `pyspark_utils.send_email`.
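
The SQL-detection heuristic from Pass 2 can be approximated in a few lines. This is a sketch of the described rule (triple-quoted block whose first word is a SQL keyword), not the library's code; `extract_sql_blocks` is a hypothetical name and the regular expression is an assumption.

```python
import re

SQL_KEYWORDS = {
    "select", "insert", "update", "delete", "with",
    "create", "drop", "alter", "merge",
}

# Matches """...""" or '''...''' across multiple lines.
TRIPLE_QUOTED = re.compile(r'("""|\'\'\')(.*?)\1', re.DOTALL)


def extract_sql_blocks(script):
    """Return the bodies of triple-quoted strings that look like SQL."""
    blocks = []
    for match in TRIPLE_QUOTED.finditer(script):
        body = match.group(2).strip()
        words = body.split(None, 1)
        if words and words[0].lower() in SQL_KEYWORDS:
            blocks.append(body)
    return blocks
```

Because only the first word is checked, a docstring that happens to start with "With" or "Create" would also be picked up, so it is worth reviewing the generated `.sql` files after the first push.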

### 4.7 Deploying Git → AIC

```python
mgr.deploy_pipelines()
```

- Reverse-maps the current branch to a workspace and **refuses** to deploy if it doesn’t match `aic.workspace`.
- For each pipeline in `aic.pipeline_configs`:
  - Fetches `config/{name}.json` from the current branch.
  - Compares MD5 of normalized JSON (when `use_hash_comparison=True`); skips unchanged.
  - Calls `aic.write_config_to_pipeline(...)` for changed pipelines.
- Emails an HTML deploy summary if configured.

> The earlier `force_sync=True` flag is no longer present. Deploys now either hash-compare or always write, controlled by `use_hash_comparison` at construction time.
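
To make the skip-or-write decision concrete, the sketch below plans a deploy from two in-memory dicts and never calls AIC or GitLab. `config_hash` and `plan_deploy` are hypothetical helpers, the canonical JSON form (sorted keys, compact separators) is an assumption about what "normalized" means, and the status strings are illustrative.

```python
import hashlib
import json


def config_hash(config):
    # Assumed canonical form: sorted keys, no insignificant whitespace.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def plan_deploy(git_configs, aic_configs, use_hash_comparison=True):
    """Return (pipeline_name, status, message) tuples for each pipeline."""
    results = []
    for name, git_cfg in git_configs.items():
        same = config_hash(git_cfg) == config_hash(aic_configs.get(name))
        if use_hash_comparison and same:
            results.append((name, "skipped", "no change"))
        else:
            results.append((name, "write", "differs or comparison disabled"))
    return results


git_side = {"A": {"stages": [1], "x": 1}, "B": {"stages": [2]}}
aic_side = {"A": {"x": 1, "stages": [1]}, "B": {"stages": []}}
for row in plan_deploy(git_side, aic_side):
    print(row)
```

Sorting keys before hashing means a config that differs only in key order is treated as unchanged.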

### 4.8 Git-flow branch creation

| Method | Source branch | Target | Notes |
|--------|---------------|--------|-------|
| `create_feature_branch(feature_name)` | `branch_roles["development"]` | `feature/{name}` | Refuses if not currently on dev. Switches `_branch` to the feature. |
| `create_release_branch(branch_name)` | `branch_roles["staging"]` | `release/{name}` | After cut, **prunes** any `config/*.json` and `code/<pipeline>/...` files for pipelines NOT in `aic.pipeline_configs`. |
| `create_hotfix_branch(branch_name)` | production (resolved via mapper, falls back to `default_branch`) | `hotfix/{name}` | Same prune behavior as release. |

All three slugify the input (replace anything outside `[A-Za-z0-9\-_/]` with `-`) and switch the manager's branch on success.
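
A small sketch of that sanitizing rule, assuming each disallowed character is replaced individually (the library may collapse runs instead). `flow_branch_name` is a hypothetical helper.

```python
import re


def flow_branch_name(prefix, name):
    # Keep letters, digits, '-', '_' and '/'; everything else becomes '-'.
    safe = re.sub(r"[^A-Za-z0-9\-_/]", "-", name)
    return f"{prefix}/{safe}"


print(flow_branch_name("feature", "PINOPS-1831"))
print(flow_branch_name("release", "v2.1.0"))
```

Note that dots are outside the allowed set, so a version-style name such as `v2.1.0` comes out with dashes under this rule.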

### 4.9 Merge requests

```python
mgr.create_merge_request(
    title:           str,
    description:     str,
    target_branch:   str | None = None,   # auto-detected if omitted
    assignee_ids:    list | None = None,
)
```

- Refuses unless the current branch starts with `feature/`, `release/`, or `hotfix/`.
- Auto-target rules when `target_branch` is omitted:
  - `feature/*` → `staging`
  - `release/*`, `hotfix/*`, or `staging` → `main`
  - else → `main`
- After opening the MR, **switches the manager back to the original branch** so the workflow stays predictable. The MR `iid` is logged.

> Note: the auto-target string literals are hardcoded to `"main"` / `"staging"` — if you customize `branch_roles`, pass `target_branch` explicitly.
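
The auto-target rules reduce to a short decision function. The sketch below mirrors the rules listed above with the hardcoded `"main"` / `"staging"` literals; `auto_target` is a hypothetical name and `ValueError` is an assumed exception type.

```python
FLOW_PREFIXES = ("feature/", "release/", "hotfix/")


def auto_target(source_branch, target_branch=None):
    """Pick the MR target for a Git-flow branch."""
    if not source_branch.startswith(FLOW_PREFIXES):
        raise ValueError(
            f"Merge requests are only opened from {FLOW_PREFIXES}, "
            f"not '{source_branch}'"
        )
    if target_branch:
        return target_branch  # an explicit target always wins
    if source_branch.startswith("feature/"):
        return "staging"
    return "main"  # release/* and hotfix/*


print(auto_target("feature/PINOPS-1831"))
print(auto_target("hotfix/PINOPS-1831"))
print(auto_target("feature/PINOPS-1831", target_branch="qa"))
```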

### 4.10 Repo bootstrap & wipe (classmethods)

```python
GitLabManager.initialize_repository(
    gitlab_token, gitlab_namespace, repo_folder,
    gitlab_base_url="https://git.autodatacorp.org/api/v4",
)
```

Builds a minimally-initialized manager via `_setup_for_initialization` and hands off to `GitLabRepositoryInitializer` (see §6). Returns `True/False`.

```python
GitLabManager.wipe_repository(
    gitlab_token, gitlab_namespace, repo_folder,
    gitlab_base_url="https://git.autodatacorp.org/api/v4",
    keep_branches=None,    # default ['main']
)
```

**Destructive.** Prompts `(y/n)` first, then:

1. Lists every branch.
2. For each kept branch, deletes every blob via per-file `DELETE`.
3. Deletes every other branch (skips `main`/`master` defensively, removes branch protections first).

### 4.11 Email helpers

`_send_email(subject, body, cc=None, bcc=None, attachments=None, testing=True)`
`_format_push_summary(branch, results)` → `(subject, html_body)`
`_format_deploy_summary(branch, results)` → `(subject, html_body)`

`results` is a list of `(pipeline_name, status, message)` tuples, rendered as an HTML table.

---

## 5. `BranchWorkspaceMapper` — branch ↔ workspace mapping

Defined in `gitlabmanager.py`. Constructed automatically by `GitLabManager.__init__` from `branch_to_workspace.yaml` on the default branch.

**YAML format:**

```yaml
mapping:
  main:      PIN FUSION 2.0
  staging:   PIN FUSION 2.0 QA
  dev:       PIN FUSION 2.0 DEV
  intg/*:    PIN FUSION 2.0 INTG     # wildcard prefix
```

**API:**

| Method | Behavior |
|--------|----------|
| `lookup(branch)` | Exact match → prefix match (keys ending in `/*`) → built-ins (`feature/*` → `develop` mapping; `release/*`, `hotfix/*` → `main` mapping). Raises `KeyError` if nothing matches. |
| `workspace_to_branch(workspace)` | Reverse lookup; falls back to the workspace name slugified to `lowercase-with-dashes`. |

> `lookup` looks up the **mapping keys** literally — its built-in `feature/*` fallback is keyed off `develop` (not `dev`). If you renamed the dev branch, add an explicit mapping for the dev branch and rely on exact/prefix rules rather than the built-in.
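
The lookup order (exact, then wildcard prefix, then built-ins) and the `develop` caveat can be reproduced with a plain dict. `MapperSketch` is a hypothetical stand-in for `BranchWorkspaceMapper`; skipping wildcard keys in the reverse lookup is an assumption.

```python
import re


class MapperSketch:
    # Built-in fallbacks: Git-flow prefix -> mapping key to reuse.
    BUILT_INS = {"feature/": "develop", "release/": "main", "hotfix/": "main"}

    def __init__(self, mapping):
        self.mapping = dict(mapping)

    def lookup(self, branch):
        if branch in self.mapping:                      # 1. exact match
            return self.mapping[branch]
        for key, workspace in self.mapping.items():     # 2. "prefix/*" keys
            if key.endswith("/*") and branch.startswith(key[:-1]):
                return workspace
        for prefix, base in self.BUILT_INS.items():     # 3. built-ins
            if branch.startswith(prefix) and base in self.mapping:
                return self.mapping[base]
        raise KeyError(branch)

    def workspace_to_branch(self, workspace):
        for key, value in self.mapping.items():
            if value == workspace and not key.endswith("/*"):
                return key
        # Fallback: lowercase-with-dashes slug of the workspace name.
        return re.sub(r"[^a-z0-9]+", "-", workspace.lower()).strip("-")


mapper = MapperSketch({
    "main": "PIN FUSION 2.0",
    "staging": "PIN FUSION 2.0 QA",
    "dev": "PIN FUSION 2.0 DEV",
    "intg/*": "PIN FUSION 2.0 INTG",
})
print(mapper.lookup("intg/load-test"))
print(mapper.lookup("release/demo"))
```

With this mapping, `mapper.lookup("feature/demo")` raises `KeyError`, because the built-in looks for a `develop` key and the file only defines `dev`. Adding an explicit `feature/*` entry to the YAML avoids relying on the built-in.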

---

## 6. `GitLabRepositoryInitializer` — bootstrap a new repo

`aic_utils/gitlab_init.py` — class `GitLabRepositoryInitializer`

Usually invoked indirectly via `GitLabManager.initialize_repository(...)`. It performs a one-time setup of a freshly-created empty GitLab project.

**`initialize_repository()` runs, in order:**

1. **`_enforce_main_branch()`**: if the default branch isn’t `branch_roles["production"]` (default `main`), creates it from the existing default, sets it as default, protects it (push/merge access level 30; note that GitLab's own numbering labels 30 as Developer and 40 as Maintainer), and deletes the original.
2. **`_create_initial_branches()`**: creates `staging` and `dev` (or whatever `branch_roles` specifies) off the default branch.
3. **`_create_initial_files()`**: pushes onto the default branch:
   - `branch_to_workspace.yaml` with placeholder workspace names,
   - `SECURITY.md` documenting JDP scanners and remediation timelines,
   - `config/.gitkeep`,
   - `code/.gitkeep`.
4. **`_setup_security_scanning()`**: writes a `.gitlab-ci.yml` that includes the corporate templates for the matching group (`alg` / `pin` / `ucg`, otherwise `corp`) and wires SAST, dependency scanning, secret detection, code quality, plus a manual `deploy_pipelines` job.
5. **`_enable_cicd_features()`**: sends `PUT /projects/{id}` to enable builds/MRs/security & compliance and disable wiki/snippets/registry.
6. **`_setup_branch_protection()`**: protects production, staging, and development (push/merge access level 30).
7. **`_generate_readme()`**: derives project + company names from the namespace and writes a tailored `README.md` for the new repository. The document you are reading covers how to use it.

All steps are idempotent enough to re-run safely (existing branches, files, and protections return 409 and are logged as warnings).

---

## 7. `DatasetConverter` — placeholder rewriting

`aic_utils/dataset_converter.py` — class `DatasetConverter`

Purpose: rewrite **bare table names** in pipeline configs into AIC dataset-aware placeholders (`${DATASET.TABLE}`), then push the rewritten config to a new interactive AIC branch named `CONVERTED_<YYYYMMDDHHMMSS>`. Intended for one-shot migration of legacy pipelines that hard-coded table names.

### 7.1 Constructor

```python
DatasetConverter(api_key, project, workspace,
                 target_workspace=None, qa=False)
```

- Resolves project and workspace exactly like `AIC`.
- If `target_workspace` is supplied, the **table catalog** (used as the lookup dictionary) is built from that workspace; otherwise from the source workspace.
- Loads `self.tables` as a `pandas.DataFrame` of `{dataset_name, dataset_id, table_name}` (parallel fetch via `ThreadPoolExecutor`).

### 7.2 Methods

| Method | Behavior |
|--------|----------|
| `pop_tables(workspace_id)` | Populates `self.tables`. |
| `get_datasets(workspace_id)` | `GET /datasets`. |
| `get_tables(dataset, workspace_id)` | `GET /datasets/{id}/tables`. |
| `get_pipelines()` | List of `{name,id}` for the source workspace. |
| `fetch_pipeline_config(pipeline_name)` | Returns `{name, jobConfig, id}` or `None`. |
| `convert(pipelines)` | For each name: fetch → `replace_placeholders` → `replace_source_stages_with_documentation` → `create_pipeline_branch`. |
| `replace_placeholders(config)` | Recursively walks the config; if any string equals a known table name, rewrites it as `${DATASET.TABLE}` and appends a markdown bullet to `self.audit_log`. |
| `replace_source_stages_with_documentation(config)` | Strips out `DATASET`-typed source tasks and inserts a `DOCUMENTATION` task containing the audit log markdown. |
| `create_pipeline_branch(config)` | Pushes the rewritten jobConfig to `interactive-pipeline/{job_id}/branches` as `CONVERTED_<timestamp>`. |
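
The recursive walk in `replace_placeholders` can be sketched on a plain dict. Here `table_to_dataset` is a hypothetical lookup (table name to dataset name) standing in for the `self.tables` DataFrame, the audit log is passed in explicitly, and the bullet wording is illustrative.

```python
def replace_placeholders(node, table_to_dataset, audit_log):
    """Return a copy of `node` with known table names rewritten."""
    if isinstance(node, dict):
        return {
            key: replace_placeholders(value, table_to_dataset, audit_log)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [
            replace_placeholders(item, table_to_dataset, audit_log)
            for item in node
        ]
    # Only strings that exactly equal a known table name are rewritten.
    if isinstance(node, str) and node in table_to_dataset:
        placeholder = "${" + f"{table_to_dataset[node]}.{node}" + "}"
        audit_log.append(f"- `{node}` -> `{placeholder}`")
        return placeholder
    return node


log = []
config = {"stages": [{"table": "sales", "note": "sales totals"}]}
print(replace_placeholders(config, {"sales": "CORE"}, log))
print(log)
```

Exact-match replacement leaves strings that merely contain a table name (such as the `note` above) untouched.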

---

## 8. `SlackLogger` — Slack-backed Python logger

`aic_utils/slack_logger.py` — class `SlackLogger`

Bridges Python’s `logging` (and optionally `print`) to a Slack channel via `slack_sdk`. Includes a re-entrancy guard so messages emitted while sending don’t loop.

### 8.1 API

```python
SlackLogger(token: str, channel: str)
    .send_message(text)                            # post a single message
    .SlackLoggerHandler                            # nested logging.Handler

SlackLogger.redirect_print_to_logger(logger)       # monkey-patches builtins.print
SlackLogger.create_logger(slack_token,
                          slack_channel='C07DYFK5SE8',
                          redirect_print=True)     # returns a configured logging.Logger
```

`create_logger`:

1. Instantiates `SlackLogger`.
2. Builds a `SlackLoggerHandler` wired to it.
3. Returns a `logging.Logger` named `'SlackLogger'` at INFO level with format `%(asctime)s - %(levelname)s - %(message)s`.
4. If `redirect_print=True`, re-points `builtins.print` at `logger.info`.

The default channel ID `'C07DYFK5SE8'` is JDP-internal — pass your own.
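
The re-entrancy guard is the subtle part of this design, so here is a standalone sketch of the idea. It uses a hypothetical `send` callable in place of the real Slack client, and `GuardedHandler` is not the library's `SlackLoggerHandler`; a thread-local flag is one assumed way to implement the guard.

```python
import logging
import threading


class GuardedHandler(logging.Handler):
    """Forward formatted records to `send`, ignoring nested emits."""

    def __init__(self, send):
        super().__init__()
        self._send = send
        self._guard = threading.local()

    def emit(self, record):
        # If we are already inside emit() on this thread, the record was
        # produced while sending; drop it instead of looping.
        if getattr(self._guard, "active", False):
            return
        self._guard.active = True
        try:
            self._send(self.format(record))
        except Exception:
            self.handleError(record)
        finally:
            self._guard.active = False


delivered = []
demo_logger = logging.getLogger("guard-demo")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False


def fake_send(text):
    delivered.append(text)
    demo_logger.info("posting to Slack")  # would recurse without the guard


demo_logger.addHandler(GuardedHandler(fake_send))
demo_logger.info("Pipeline run started")
print(delivered)
```

The guard matters most once `print` is redirected to the logger, since any `print` call made while a message is being sent would otherwise re-enter the handler.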

### 8.2 Example

```python
from aic_utils import SlackLogger

logger = SlackLogger.create_logger(
    slack_token="xoxb-…",
    slack_channel="C0123456789",
    redirect_print=True,   # any subsequent print() lands in Slack
)
logger.info("Pipeline run started")
```

---

## 9. Day-to-day workflows

The branch role names below assume the defaults `production="main"`, `staging="staging"`, `development="dev"`. Substitute as needed.

### 9.1 First-time repository setup

```python
from aic_utils import GitLabManager

GitLabManager.initialize_repository(
    gitlab_token="<GIT_TOKEN>",
    gitlab_namespace="pin/pin-analytics",
    repo_folder="pin-fusion-2.0",
)
# Then edit branch_to_workspace.yaml with real workspace names.
```

### 9.2 Build the manager pair

```python
from aic_utils import AIC, GitLabManager

aic_prod = AIC(
    api_key="<API_KEY>",
    project="dylan.doyle@jdpa.com",
    workspace="PIN FUSION 2.0",
    pipelines=["*"],          # or a curated list
)

mgr = GitLabManager(
    aic_instance=aic_prod,
    gitlab_token="<GIT_TOKEN>",
    gitlab_namespace="pin/pin-analytics",
    repo_folder="pin-fusion-2.0",
)
```

### 9.3 Sync AIC → Git on the dev environment

```python
aic_dev = AIC(api_key=..., project=..., workspace="PIN FUSION 2.0 DEV", pipelines=[...])
mgr_dev = GitLabManager(aic_instance=aic_dev, ...)
mgr_dev.push_to_git()
```

Restrictions:

- `push_to_git` on `main` requires interactive `y` confirmation.
- Push is rejected if the current branch isn’t in `_allowed_push` (the workspace slug plus the mapped branch) and isn’t a `hotfix/*` branch.

### 9.4 Promote dev → staging via a feature MR

```python
mgr_dev.create_feature_branch("PINOPS-1831")          # cuts feature/PINOPS-1831 from dev
mgr_dev.create_merge_request(
    title="Demographics module",
    description="QA push",
    target_branch="staging",                           # pass explicitly if you customized roles
)

# After the MR is merged, deploy on the QA workspace:
aic_qa = AIC(api_key=..., project=..., workspace="PIN FUSION 2.0 QA", pipelines=[...])
mgr_qa = GitLabManager(aic_instance=aic_qa, ...)
mgr_qa.deploy_pipelines()
```

### 9.5 Promote staging → prod via a release MR

```python
mgr_qa.push_to_git()                                  # capture latest QA state
mgr_qa.create_release_branch("v2.1.0")                # cuts release/v2-1-0 from staging (dots are slugified),
                                                      # prunes pipelines not in aic.pipeline_configs
mgr_qa.create_merge_request(
    title="Release v2.1.0",
    description="QA-approved",
    target_branch="main",
)

# After merge:
mgr_prod = GitLabManager(aic_instance=aic_prod, ...)
mgr_prod.deploy_pipelines()
```

### 9.6 Production hotfix

The order is **inverted** from feature/release: the fix is applied **in AIC prod first**, then captured in Git.

```python
# 1. Apply the fix manually in AIC prod workspace.

# 2. Build a fresh prod AIC + manager (so it sees the new config).
aic_prod = AIC(api_key=..., project=..., workspace="PIN FUSION 2.0", pipelines=["BROKEN_PIPELINE"])
mgr_prod = GitLabManager(aic_instance=aic_prod, ...)

# 3. Cut hotfix off main and push the AIC state into the hotfix branch.
mgr_prod.create_hotfix_branch("PINOPS-1831")
mgr_prod.push_to_git()

# 4. Open MR back into main for the audit trail.
mgr_prod.create_merge_request(
    title="Hotfix PINOPS-1831",
    description="Emergency fix for streaming timeout",
    target_branch="main",
)
```

### 9.7 Convert legacy placeholders

```python
from aic_utils import DatasetConverter

dc = DatasetConverter(
    api_key="<API_KEY>",
    project="dylan.doyle@jdpa.com",
    workspace="LEGACY",
    target_workspace="PIN FUSION 2.0",   # use this workspace's catalog as the lookup
)
dc.convert(["LEGACY_PIPELINE_A", "LEGACY_PIPELINE_B"])
# Each pipeline now has a CONVERTED_<timestamp> AIC branch with rewritten placeholders.
```

### 9.8 Stream logs/print to Slack

```python
from aic_utils import SlackLogger
SlackLogger.create_logger("<SLACK_TOKEN>", "<CHANNEL_ID>", redirect_print=True)
print("from now on, this lands in Slack")
```

---

## 10. Best practices & troubleshooting

**Operating principles**

1. **One AIC + one GitLabManager per environment.** The manager locks itself after init.
2. **Promotions go through MRs.** `main` / `staging` / `dev` should be protected, and only `feature/*`, `release/*`, and `hotfix/*` branches should be cut from them.
3. **Hash compare is on by default.** Set `use_hash_comparison=False` if you want every deploy to write regardless of diff.
4. **Keep `branch_to_workspace.yaml` truthful.** The mapper reads it from the **default branch only**.
5. **`branch_roles` is class-level.** Setting it after construction does not retroactively rebuild allowed-push sets.

**Common errors**

| Symptom | Likely cause | Fix |
|---------|--------------|-----|
| `RuntimeError: ❌ GitLab project ... not found` | `gitlab_namespace` or `repo_folder` typo | Verify in GitLab UI. |
| `❌ Cannot modify attribute 'X' after initialization` | Trying to mutate manager state directly | Use `create_*_branch` helpers; build a new manager for a new workspace. |
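| `❌ Cannot push to branch '...'` | Current branch isn’t in `_allowed_push` and isn’t `hotfix/*` | You are probably on a feature or release branch. Run `push_to_git()` while on the workspace's mapped branch, then cut the feature or release branch and open an MR; those branches can't be pushed to directly. |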
| `❌ Branch '...' maps to workspace '...', not '...'` (on deploy) | Mismatched branch ↔ AIC workspace | Use the correct AIC workspace for the branch you’re deploying. |
| `FileNotFoundError: 'branch_to_workspace.yaml' not found ...` | Missing/renamed mapping file on default branch | Add it on `main` (or whatever your default branch is). |
| `pyspark_utils` import error | `email_recipients` was set but the package isn’t installed | Either `pip install pyspark_utils` or unset `email_recipients`. |
| Token nearing expiry warning at init | ≤ 30 days remaining | Rotate the GitLab token. |

**Contact**

- GitHub: https://github.com/dylandoyle11/aic_utils
- Maintainer: Dylan Doyle
