Metadata-Version: 2.4
Name: hip-cargo
Version: 0.2.1
Summary: Tools for generating Stimela cab definitions from Python functions
Keywords: stimela,typer,cli,yaml,code-generation,radio-astronomy
Author: landmanbester
Author-email: landmanbester <lbester@sarao.ac.za>
License-Expression: MIT
License-File: LICENSE
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Code Generators
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Astronomy
Requires-Dist: typer>=0.12.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: typing-extensions>=4.15.0
Requires-Dist: libcst==1.8.6
Requires-Dist: tomli>=2.0 ; python_full_version < '3.11'
Requires-Dist: ruff>=0.13.2
Requires-Dist: fsspec>=2024.10.0
Requires-Dist: universal-pathlib>=0.2.5
Requires-Dist: hip-cargo[s3,gcs,azure] ; extra == 'all'
Requires-Dist: adlfs>=2024.7.0 ; extra == 'azure'
Requires-Dist: gcsfs>=2024.10.0 ; extra == 'gcs'
Requires-Dist: s3fs>=2024.10.0 ; extra == 's3'
Requires-Python: >=3.10
Project-URL: Homepage, https://github.com/landmanbester/hip-cargo
Project-URL: Repository, https://github.com/landmanbester/hip-cargo
Project-URL: Bug Tracker, https://github.com/landmanbester/hip-cargo/issues
Provides-Extra: all
Provides-Extra: azure
Provides-Extra: gcs
Provides-Extra: s3
Description-Content-Type: text/markdown

# hip-cargo

`hip-cargo` is an attempt to liberate developers from maintaining their packages in [`cult-cargo`](https://github.com/caracal-pipeline/cult-cargo).
The core concept boils down to maintaining a lightweight package that only installs the [`stimela`](https://github.com/caracal-pipeline/stimela) cabs required to run a linked and versioned containerized image of the package.
This makes it possible to install the package alongside `cult-cargo` and include cabs into recipes using the syntax
```yaml
_include:
  - (module.cabs)cab_name.yml
```
In principle, that's all there is to it.
The `hip-cargo` package does not dictate how you should go about structuring your package.
Instead, it serves as an example of how to design auto-documenting CLI interfaces using Typer with automated cab generation and containerisation.
It provides utilities to convert function signatures into `stimela` cabs (and vice versa) for packages that mimic its structure.

## Installation

```bash
pip install hip-cargo
```

See the [Development](#development) section for instructions on how to set up the development environment and make contributions.

## Key Principles

1. **Separate CLI from implementation**: Keep CLI modules lightweight with lazy imports. Keep them all in the `src/mypackage/cli` directory and define the CLI for each command in a separate file. Construct the main Typer app in `src/mypackage/cli/__init__.py` and register commands there.
2. **Separate cabs directory at same level as `cli`**: Use `hip-cargo generate-cabs` to auto-generate cabs into `src/mypackage/cabs/`. There should be a separate `src/mypackage/cli/mycommand.py` file corresponding to each cab.
3. **Single app, multiple commands**: Use one Typer app that registers all commands. If you need a separate app you might as well create a separate repository for it.
4. **Lazy imports**: Import heavy dependencies (NumPy, JAX, Dask) only when a command executes, not when the CLI module is imported.
5. **Linked GitHub package with container image**: Maintain an up to date `Dockerfile` that installs the full package and use **Docker** (or **Podman**) to upload the image to the GitHub Container registry. Link this to your GitHub repository.
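
Principles 1 and 4 can be illustrated with a minimal sketch. It uses only the standard library, with `statistics` standing in for a heavy dependency such as NumPy. The function name `mean_command` is hypothetical and is not part of `hip-cargo`.

```python
def mean_command(values: list[float]) -> float:
    """CLI-layer wrapper: importing this module stays cheap."""
    # Lazy import: the stand-in "heavy" dependency is only loaded when the
    # command actually executes, not when the CLI module is imported.
    import statistics  # stand-in for NumPy, JAX, Dask, ...

    return statistics.mean(values)


print(mean_command([1.0, 2.0, 3.0]))
```

With this layout, `--help` and cab generation only ever import the lightweight wrapper.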

## Quick Start
The following instructions provide a guide on how to structure a package for use with `hip-cargo`.
Note that `hip-cargo` itself follows exactly this structure and will be used as the running example throughout.
It provides three utility functions viz.

* `generate-cabs`: Generate cabs from Typer CLI definitions.
* `generate-function`: Generate a Typer CLI definition from a cab.
* `init`: Initialize a new project.

By default, `hip-cargo` installs a lightweight version of the package that only provides the CLI and the cab definitions required for using the linked container image with `stimela`.
Upon installation, an executable called `hip-cargo` is added to the `PATH`.
`hip-cargo` is a Typer command group containing multiple commands.
Available commands can be listed using
```bash
hip-cargo --help
```
This should print something like the following

![CLI Help](docs/cli-help.svg)

Documentation on each individual command can be obtained by calling help for the command e.g.
```bash
hip-cargo generate-cabs --help
```
The full package should be available as a container image on the [GitHub Container Registry](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry).
The `Dockerfile` for the project should install the full package, not the lightweight version.
This is used to build the container image that is uploaded to the registry.
The image should be tagged with a version so that `stimela` knows how to match cab configuration to images.
The following versioning schema is proposed:

* use semantic versioning for releases
* use `latest` tag for `main`/`master` branch
* use `branch-name` when developing new features

This can all be automated with pre-commit hooks and GitHub Actions.
Use pre-commit hooks to auto-generate cab definitions on each commit.
See the [publish-container](./.github/workflows/publish-container.yml) workflow for an example of how to set up GitHub Actions for automation.
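
As a rough illustration of the proposed tagging scheme, the sketch below maps a git ref name to an image tag. The helper `image_tag_for_ref` is hypothetical; stripping a leading `v` from release tags and replacing `/` in branch names are assumptions, not something the workflow is confirmed to do.

```python
import re

# Matches "0.1.8" or "v0.1.8"; pre-release suffixes are not handled here.
SEMVER_TAG = re.compile(r"^v?(\d+\.\d+\.\d+)$")


def image_tag_for_ref(ref_name: str, is_git_tag: bool = False) -> str:
    """Map a git ref name to a container image tag (illustrative only)."""
    if is_git_tag:
        match = SEMVER_TAG.match(ref_name)
        if match:
            return match.group(1)  # release: use the semantic version
    if ref_name in ("main", "master"):
        return "latest"
    # Feature branch: use the branch name; "/" is not valid in an image tag.
    return ref_name.replace("/", "-")


print(image_tag_for_ref("main"))
```
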
We distinguish between two cases viz. initialising a project from scratch or converting an existing project.

### Using `hip-cargo` to initialise a project

The `hip-cargo init` command scaffolds a complete project with CI/CD pipelines, containerisation, pre-commit hooks, and Stimela cab support. Run it with:

```bash
hip-cargo init --project-name my-project --github-user myuser
```

This creates a ready-to-use project directory with:

- **src layout** with separate `cli/`, `core/`, and `cabs/` directories
- **pyproject.toml** (PEP 621 compliant) with `uv` as the build backend
- **GitHub Actions workflows** for CI, PyPI publishing, container publishing, and automated cab updates
- **Pre-commit hooks** for ruff formatting/linting and automatic cab regeneration
- **Dockerfile** for building container images uploaded to GitHub Container Registry
- **tbump configuration** with hooks for version bumping and cab regeneration
- **License file** (MIT, Apache-2.0, or BSD-3-Clause)
- **An `onboard` command** that prints step-by-step instructions for completing CI/CD setup

The generated project includes an `onboard` command that guides you through the remaining setup steps:

```bash
cd my-project
uv run my_project onboard
```

This prints instructions for:

1. Creating a GitHub repository (with `gh` CLI)
2. Setting up PyPI trusted publishing (OIDC, no API keys needed)
3. Creating a GitHub environment for publishing
4. Creating a GitHub App for automated cab update commits
5. Configuring branch protection with the App in the bypass list
6. Making your first release with `tbump`

Once setup is complete, you can delete the onboard command and start adding your own commands following the same pattern.

#### Init options

| Option | Default | Description |
|--------|---------|-------------|
| `--project-name` | *required* | Hyphenated project name (e.g. `my-project`) |
| `--github-user` | *required* | GitHub username or organisation |
| `--description` | `"A Python project"` | Short project description |
| `--author-name` | *from git config* | Author name |
| `--author-email` | *from git config* | Author email |
| `--cli-command` | *from project name* | CLI entry point name |
| `--initial-version` | `0.0.0` | Starting version string |
| `--license-type` | `MIT` | License (MIT, Apache-2.0, BSD-3-Clause) |
| `--cli-mode` | `multi` | `single` (one command) or `multi` (subcommands) |
| `--default-branch` | `main` | Default git branch name |
| `--project-dir` | `./<project-name>/` | Output directory |

### Transitioning an existing package

To transition an existing package that already contains `stimela` cab definitions, it is probably easiest to create the required directory structure manually (see below) and use the `generate-function` command to convert your cabs into CLI definitions. Do this for each cab separately and register the resulting commands in your CLI module's `__init__.py`. You may also want to copy the necessary files from the [template](src/hip_cargo/templates/) directory, or initialise a blank project with `hip-cargo init` and reuse the files it generates. This is currently a manual process; an automation script (or skill) may be added in the future. Further details are provided below.


## Package Structure

We recommend using [uv](https://docs.astral.sh/uv/) as the package manager and following a structure similar to that used in `hip-cargo`:

```
hip-cargo/
├── .github
│   ├── dependabot.yml
│   └── workflows
│       ├── ci.yml
│       ├── publish-container.yml
│       ├── publish.yml
│       └── update-cabs.yml
├── src
│   └── hip_cargo
│       ├── cabs                 # Generated cab definitions (YAML)
│       │   ├── __init__.py
│       │   ├── generate_cabs.yml
│       │   └── generate_function.yml
│       ├── cli                  # Lightweight CLI wrappers
│       │   ├── __init__.py
│       │   ├── generate_cabs.py
│       │   └── generate_function.py
│       ├── core                 # Core implementations (lazy-loaded)
│       │   ├── __init__.py
│       │   ├── generate_cabs.py
│       │   └── generate_function.py
│       ├── recipes              # Stimela recipes for running commands via stimela
│       │   ├── __init__.py
│       │   └── gen_cabs.yml
│       └── utils                # Shared utilities
│           ├── __init__.py
│           ├── cab_to_function.py
│           ├── config.py        # Utility to get container from project metadata
│           ├── decorators.py
│           ├── introspector.py
│           └── types.py         # ListInt, ListFloat, ListStr NewTypes + parsers
├── tests
│   ├── __init__.py
│   └── conftest.py
├── Dockerfile                   # For containerization
├── LICENSE                      # MIT or BSD3 license encouraged
├── .pre-commit-config.yaml      # You should use these if you don't already
├── .gitignore                   # make sure your .lock file is not ignored
├── pyproject.toml               # PEP 621 compliant
├── tbump.toml                   # this makes releases so much easier
└── README.md                    # project README

```

## Python CLI

`uv` expects your modules to live in `src/mypackage/`.
As an example, have a look at the [`generate-cabs`](./src/hip_cargo/cli/generate_cabs.py) command.

Each CLI module should be a separate file, and all modules need to be registered as commands inside the CLI package. For `hip-cargo`, this is done in [`src/hip_cargo/cli/__init__.py`](./src/hip_cargo/cli/__init__.py). You can register one or multiple commands there.

## Packaging
This is one of the core design principles.
The package `pyproject.toml` needs to be PEP 621 compliant, and it needs to enable a lightweight mode by default but also specify what the full dependencies are.
See `hip-cargo`'s [`pyproject.toml`](pyproject.toml) for an example.

## Container Images and GitHub Actions

For `stimela` to use your package in containerized environments, you should publish OCI container images to GitHub Container Registry [`ghcr.io`](https://docs.github.com/en/packages/learn-github-packages/introduction-to-github-packages). This section shows how to automate this with GitHub Actions.

### 1. Create a Dockerfile

Add a `Dockerfile` at the root of your repository. For example, see [`Dockerfile`](./Dockerfile).

### 2. Automate Cab Creation and Containerisation

You can automate cab generation using pre-commit hooks.
See [`.pre-commit-config.yaml`](./.pre-commit-config.yaml) for an example. This calls `generate-cabs` directly to regenerate cab YAML for all commands in your CLI module.
The container image is read from `_container_image.py` in the package source directory, which serves as the single source of truth. The convention is to use the branch name for feature branches, `latest` for the default branch, or a semantic version during `tbump` releases. The CLI source files are never modified during cab generation — only the YAML cab files are updated.

You should be able to reuse the GitHub action for `hip-cargo` in `.github/workflows/update-cabs.yml` to automate cab updates for your project.
The workflow will tag the container image with the branch name if there is an open PR to your default branch.
Once the PR is merged, an action is triggered to regenerate cab definitions with the `latest` image tag and push them.
It also resets the `CONTAINER_IMAGE` tag in `_container_image.py` to `latest`.
Pushing semantically versioned tags will trigger the same workflow (this is where `tbump` is quite useful).
In this case the image is tagged with the version.

#### Developer workflow for image tags

When you create a feature branch, edit `src/<package>/_container_image.py` to change the tag to your branch name:

```python
# In src/<package>/_container_image.py, change:
CONTAINER_IMAGE = "ghcr.io/user/repo:my-feature-branch"
```

This ensures the cab definitions generated by pre-commit hooks use the correct branch-specific image tag during development.
You do not need to reset the tag manually before merging — the `update-cabs` workflow handles that automatically on merge to `main`, setting the tag back to `latest`.
During releases, `tbump` updates the tag to the semantic version (e.g. `0.1.8`) via its before-commit hooks.
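
The tag changes described above amount to swapping the part after the final `:` in `CONTAINER_IMAGE`. A minimal sketch, assuming a plain `name:tag` reference without a digest; the helper `with_tag` is hypothetical and not part of `hip-cargo`:

```python
def with_tag(image: str, tag: str) -> str:
    """Return `image` with its tag replaced by `tag`."""
    # Split off the last path component so a registry port such as
    # "localhost:5000/repo" is not mistaken for a tag.
    prefix, slash, last = image.rpartition("/")
    name = last.split(":", 1)[0]
    return f"{prefix}{slash}{name}:{tag}"


print(with_tag("ghcr.io/user/repo:my-feature-branch", "latest"))
```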

### 3. Build and Push the Container Image Manually

If you need to build and push the container image without GitHub Actions (e.g. during initial setup or for debugging), you can do so directly from the command line.

First, authenticate with the GitHub Container Registry:

```bash
echo "$GITHUB_TOKEN" | docker login ghcr.io -u GITHUB_USERNAME --password-stdin
```

Build the image, tagging it with your repository name and version:

```bash
docker build -t ghcr.io/GITHUB_USERNAME/REPO_NAME:IMAGE_TAG .
```

Then you can test it locally using e.g.

```bash
docker run --rm -it ghcr.io/GITHUB_USERNAME/REPO_NAME:IMAGE_TAG /bin/bash
```

Push the image to GHCR:

```bash
docker push ghcr.io/GITHUB_USERNAME/REPO_NAME:IMAGE_TAG
```

You can also tag with a specific version:

```bash
docker tag ghcr.io/GITHUB_USERNAME/REPO_NAME:IMAGE_TAG ghcr.io/GITHUB_USERNAME/REPO_NAME:0.1.0
docker push ghcr.io/GITHUB_USERNAME/REPO_NAME:0.1.0
```

If you prefer **Podman** (no daemon required):

```bash
podman build -t ghcr.io/GITHUB_USERNAME/REPO_NAME:IMAGE_TAG .
podman push ghcr.io/GITHUB_USERNAME/REPO_NAME:IMAGE_TAG
```

If you need an **apptainer** or **singularity** image locally, first build and save the docker image and then convert it:

```bash
docker build -t mylocalimage:latest .
docker save mylocalimage:latest -o mylocalimage_local.tar
apptainer build mylocalimage_local.sif docker-archive://mylocalimage_local.tar
```

### 4. Link Container to GitHub Package

To associate the container image with your repository:

1. **Automatic linking**: If your workflow pushes to `ghcr.io/username/repository-name`, GitHub automatically creates a package linked to the repository.

2. **Manual linking** (if needed):
   - Go to your repository on GitHub
   - Navigate to the "Packages" section
   - Click on your container package
   - Click "Connect repository" in the sidebar
   - Select your repository from the dropdown

3. **Set package visibility**:
   - In the package settings, set visibility to "Public" for open-source projects
   - This allows `stimela` to pull images without authentication

### 5. Using the Container with `stimela`

Once published, users should be able to simply include the cab definitions in their recipes.
This only requires installing the lightweight version of the package, so it shouldn't clash with any other packages, in particular `stimela` and `cult-cargo`.
Use the following syntax to include a cab in a recipe
```yaml
_include:
  - (mypackage.cabs)cab_name.yml
```

`stimela` will automatically pull the matching version based on the cab configuration.
You could optionally provide `stimela` recipes inside your project (see `src/hip_cargo/recipes`, for example).
If the lightweight version of the package is installed, it should be possible to run these recipes directly using the syntax

```bash
stimela run 'mypackage.recipes::killer_recipe.yml' recipe_name option1=option1...
```

## Type Inference

`hip-cargo` automatically recognizes custom `stimela` types.
These should be created using `typing.NewType`.
See the [`generate-cabs` CLI module](./src/hip_cargo/cli/generate_cabs.py) for an example.

## Decorators

### `@stimela_cab`

Marks a function as a Stimela cab.

- `name`: Cab name
- `info`: Description
- `policies`: Optional dict of cab-level policies
- `**kwargs`: Additional cab metadata stored in `func.__stimela_cab_config__`

### `@stimela_output`

Defines a `stimela` output supporting the following fields:

- `name`: Output name (top level, one below `cabs`)
- `dtype`: Data type (File, Directory, MS, etc.)
- `info`: Help string
- `required`: Whether output is required (`default: False`)
- `implicit`: Just use what you would put in the cab definition for `stimela`
- `policies`: Parameter level policies provided as a `dict`. See `stimela` [docs](https://stimela.readthedocs.io/en/latest/reference/schema_ref.html)
- `must_exist`: Whether an output has to exist when the task finishes (`default: False`)
- `mkdir`: create the directory if it does not exist (`default: False`)
- `path_policies`: Path policies provided as a `dict`. See `stimela` [docs](https://stimela.readthedocs.io/en/latest/reference/schema_ref.html)

Note that the order is important if you want to implement a [roundtrip test](tests/test_roundtrip.py).

## Tunable parameters and `generate-schemas`

Some CLI parameters represent knobs that an LLM agent might want to tune — convergence thresholds, iteration counts, mode selectors. Mark these with `tunable: true` inside `StimelaMeta`'s `metadata` dict, alongside any other arbitrary metadata you want to ride through to the cab YAML:

```python
n_iter: Annotated[
    int,
    typer.Option(help="Number of iterations.", rich_help_panel="Tuning"),
    StimelaMeta(metadata={"tunable": True}),
] = 10,
```

Stimela has a finite set of allowed top-level fields on a parameter definition, so `tunable` lives inside `metadata` — the dedicated escape hatch — and rides through `generate-cabs` to the YAML's `metadata: {tunable: true}` entry.

Running `hip-cargo generate-schemas --module package/cli/*.py --output-dir src/package/schemas/` emits one Pydantic file per command **that has at least one tunable parameter**. The model contains only the tunable subset:

```python
"""Auto-generated by hip-cargo. Do not edit by hand."""

from pydantic import BaseModel, Field


class TunableDemoTunables(BaseModel):
    n_iter: int = Field(default=10, description="Number of iterations.")
    threshold: float = Field(default=0.001, description="Convergence threshold.")
```

These models are designed to be passed as `output_type` to a [`pydantic-ai`](https://ai.pydantic.dev/) agent so an LLM can emit structured, validated tuning proposals.

### Allowed tunable types

Tunable parameters must map cleanly to Pydantic. The supported types are:

- `int`, `float`, `str`, `bool`
- `Literal[...]`
- `ListInt`, `ListFloat`, `ListStr` (comma-separated list NewTypes)
- `list[int]`, `list[float]`, `list[str]`, `list[bool]`
- Any of the above optionally wrapped in `| None`

Anything else marked `tunable: true` raises a `TunableTypeError` at generation time. In particular: path types (`File`, `Directory`, `MS`, `URI`) and parameters with a typer `callback=` are rejected, since neither is reasonable to surface as agent input. Non-tunable parameters of any type are simply ignored by the schema generator.

### Relationship to `generate-cabs`

Both commands consume one shared intermediate representation (`hip_cargo.utils.spec`). You can run them independently — there's no required ordering — but both should be regenerated whenever your CLI signatures change. Generation is idempotent: running `generate-schemas` over an unchanged CLI module will not touch the existing schema file (byte-compare before write keeps mtimes stable).
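
The byte-compare-before-write idea can be sketched in a few lines. The helper name `write_if_changed` is hypothetical and this is not necessarily how `hip-cargo` implements it:

```python
from pathlib import Path


def write_if_changed(path: Path, content: str) -> bool:
    """Write `content` to `path` only if it differs; return True if written."""
    data = content.encode("utf-8")
    if path.exists() and path.read_bytes() == data:
        return False  # identical bytes: leave the file (and its mtime) alone
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return True
```

Skipping the write keeps modification times stable, so build tools and pre-commit hooks do not see spurious changes.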

## Container Fallback Execution

When a hip-cargo package is installed in lightweight mode (without heavy dependencies like NumPy, JAX, or Dask), CLI commands automatically fall back to running inside a container. This means users can run commands without installing the full dependency stack — they just need a container runtime.

The fallback is transparent: if the core module import succeeds, the command runs natively. If it fails with `ImportError`, the same CLI command is re-executed inside the container with `--backend native` to force native execution (avoiding infinite recursion). The container image is resolved from `_container_image.py` in the package source directory.

Every generated CLI function gets two additional options when the cab has a container image:

- `--backend`: Choose the execution backend — `auto` (default), `native`, `apptainer`, `singularity`, `docker`, or `podman`
- `--always-pull-images`: Force re-pull of the container image before execution

Volume mounts are resolved automatically from the function's type hints:
- Path-like parameters (File, Directory, MS) are detected and mounted
- Input parameters are mounted read-only, output parameters read-write
- Stimela path policies (`write_parent`, `access_parent`, `mkdir`) are respected
- Docker/podman run as the current user to avoid root-owned output files
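
For Docker or Podman, the mount rules above translate into `-v` and `--user` arguments roughly as in this sketch. The helper `mount_args` is hypothetical; it assumes absolute paths, mounts each path at the same location inside the container, and does not model the Stimela path policies.

```python
from pathlib import PurePosixPath


def mount_args(inputs: list[PurePosixPath], outputs: list[PurePosixPath], uid: int, gid: int) -> list[str]:
    """Build docker/podman arguments for volume mounts and the run-as user."""
    args = ["--user", f"{uid}:{gid}"]  # avoid root-owned output files
    # Inputs are mounted read-only, outputs read-write.
    for path, mode in [(p, "ro") for p in inputs] + [(p, "rw") for p in outputs]:
        args += ["-v", f"{path}:{path}:{mode}"]
    return args


print(mount_args([PurePosixPath("/data/in.ms")], [PurePosixPath("/data/out")], 1000, 1000))
```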

## Remote URIs and object stores

`File`, `Directory`, `MS`, and `URI` are backed by
[`universal_pathlib.UPath`](https://github.com/fsspec/universal_pathlib),
so every path-typed parameter accepts either a local filesystem path or a
remote URI: `s3://bucket/key`, `gs://bucket/key`, `az://container/key`,
`http(s)://...`, plus any other scheme fsspec supports (`memory://`, `ftp://`,
etc.).

### Installing backends

hip-cargo's core install includes `fsspec` and `universal_pathlib`. Cloud
backends are optional extras — install only what you need:

```bash
pip install 'hip-cargo[s3]'       # AWS S3 / S3-compatible endpoints
pip install 'hip-cargo[gcs]'      # Google Cloud Storage
pip install 'hip-cargo[azure]'    # Azure Blob Storage
pip install 'hip-cargo[all]'      # all three
```

### Credentials

Native execution uses each SDK's standard credential chain — no hip-cargo
configuration. You already have the right setup if any of these work today:

- **AWS:** `AWS_*` env vars, `~/.aws/credentials`, IAM instance/role creds
- **GCS:** `GOOGLE_APPLICATION_CREDENTIALS`, `gcloud auth
  application-default login`, workload identity
- **Azure:** `AZURE_*` env vars, `az login`, managed identity

When hip-cargo falls back to container execution, it forwards the relevant
credentials automatically based on the schemes it detects in your
parameters:

| Scheme | Env vars forwarded | Config dir mounted (ro) |
|---|---|---|
| `s3` | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`, `AWS_PROFILE`, `AWS_REGION`, `AWS_DEFAULT_REGION`, `AWS_ENDPOINT_URL` | `~/.aws` (skipped if `AWS_SESSION_TOKEN` is set) |
| `gs` / `gcs` | `GOOGLE_APPLICATION_CREDENTIALS` | `~/.config/gcloud` + the keyfile |
| `az` / `abfs` / `adl` | `AZURE_STORAGE_ACCOUNT`, `AZURE_STORAGE_KEY`, `AZURE_STORAGE_CONNECTION_STRING`, `AZURE_CLIENT_ID`, `AZURE_TENANT_ID`, `AZURE_CLIENT_SECRET` | `~/.azure` |
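
The scheme detection behind the table can be sketched with the standard library. The helper `env_to_forward` is hypothetical; it covers only the environment variables, not the config-directory mounts or the `AWS_SESSION_TOKEN` special case.

```python
from urllib.parse import urlsplit

AWS = ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN", "AWS_PROFILE",
       "AWS_REGION", "AWS_DEFAULT_REGION", "AWS_ENDPOINT_URL"]
GCS = ["GOOGLE_APPLICATION_CREDENTIALS"]
AZURE = ["AZURE_STORAGE_ACCOUNT", "AZURE_STORAGE_KEY", "AZURE_STORAGE_CONNECTION_STRING",
         "AZURE_CLIENT_ID", "AZURE_TENANT_ID", "AZURE_CLIENT_SECRET"]
FORWARDED_ENV = {"s3": AWS, "gs": GCS, "gcs": GCS, "az": AZURE, "abfs": AZURE, "adl": AZURE}


def env_to_forward(params: list[str], environ: dict[str, str]) -> dict[str, str]:
    """Select the env vars to pass into the container for the schemes in use."""
    forwarded: dict[str, str] = {}
    for value in params:
        scheme = urlsplit(value).scheme  # "" for plain local paths
        for name in FORWARDED_ENV.get(scheme, []):
            if name in environ:
                forwarded[name] = environ[name]
    return forwarded


print(env_to_forward(["s3://bucket/key", "/local/file"], {"AWS_PROFILE": "dev", "HOME": "/home/u"}))
```

Only variables that are actually set, and only for schemes that appear in the parameters, end up in the container environment.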


### What about `must_exist`, `mkdir`, and `write_parent`?

For remote URIs, hip-cargo pre-flights `must_exist` with a single `exists()`
call before dispatch, failing fast on typos or missing objects. `mkdir`,
`write_parent`, and `access_parent` are skipped for remote URIs — they map
to local container mount logic that has no meaning on object stores. Local
paths keep their existing mount-driven semantics.

## Features

- Automatic type inference from Python type hints
- Support for Typer Arguments (positional) and Options
- Multiple outputs automatically added to function signature if they are not implicit
- List types with automatic `repeat: list` policy
- First-class comma-separated list types (`ListInt`, `ListFloat`, `ListStr`) with built-in parsers
- Proper handling of default values and required parameters
- Full roundtrip preservation of inline comments (e.g., `# noqa: E501`)
- Optional `{"stimela": {...}}` metadata dict in `Annotated` type hints for Stimela-specific fields
- Pydantic schema generation (`generate-schemas`) for the tunable subset of CLI parameters, suitable as `output_type` for `pydantic-ai` agents
- Project scaffolding with `hip-cargo init` including CI/CD, containerisation, and onboarding
- Container fallback execution with automatic volume mount resolution from type hints
- Support for apptainer, singularity, docker, and podman backends
- Runtime image resolution from `_container_image.py` via dynamic module import — no CWD dependency

## Quirks

### Comma-separated list types (`ListInt`, `ListFloat`, `ListStr`)

Typer (and Click underneath it) does not support variable-length lists as a single CLI option value.
For example, `--channels 1,2,3` cannot be directly typed as `list[int]` because Click sees the entire `1,2,3` as one string argument.

The standard Typer workaround is to repeat the flag (`--channel 1 --channel 2 --channel 3`), which maps to `list[int]` with Typer's `repeat` mechanism.
However, this is inconvenient for parameters that naturally take comma-separated values and results in a CLI interface that is different from the `stimela` interface.

`hip-cargo` solves this with dedicated `NewType` wrappers defined in `hip_cargo.utils.types`:

```python
from hip_cargo.utils.types import ListInt, parse_list_int

@stimela_cab(...)
def my_func(
    channels: Annotated[
        ListInt,
        typer.Option(parser=parse_list_int, help="Channel indices"),
    ],
):
    # channels is already list[int] at runtime — no manual splitting needed
    ...
```

- `ListInt`, `ListFloat`, and `ListStr` wrap `str` (so Typer sees a single string argument)
- Paired parser functions (`parse_list_int`, `parse_list_float`, `parse_list_str`) handle comma-splitting at the Click level, so the function body receives the already-parsed list
- The introspector maps these types to the correct Stimela dtypes (`List[int]`, `List[float]`, `List[str]`)
- The reverse generator (`generate-function`) automatically uses these types when it encounters a `List[int]`/`List[float]`/`List[str]` dtype in a cab YAML
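
For intuition, a parser of this kind can be as small as the sketch below. `parse_list_int_sketch` is a hypothetical stand-in for the real `parse_list_int`, whose handling of whitespace and empty items may differ.

```python
def parse_list_int_sketch(value: str) -> list[int]:
    """Split a comma-separated string such as "1,2,3" into a list of ints."""
    # Empty items (e.g. from a trailing comma) are dropped in this sketch.
    return [int(item) for item in value.split(",") if item.strip()]


print(parse_list_int_sketch("1,2,3"))
```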

### Custom Stimela types via `NewType`

Stimela has its own type system (`File`, `Directory`, `MS`, `URI`) that doesn't map 1:1 to Python types.
We use `typing.NewType` to create thin wrappers around `Path`:

```python
from pathlib import Path
from typing import NewType

File = NewType("File", Path)
```

These `NewType`s serve double duty: they're valid Python type hints for Typer, and `hip-cargo` introspects the name to produce the correct Stimela dtype in the cab YAML.
For these types, you also need `parser=parse_upath` in the `typer.Option()`, so that Click parses the string argument into a `UPath`. A `UPath` accepts both local paths and remote URIs (see [Remote URIs and object stores](#remote-uris-and-object-stores)) and yields a `Path`-compatible object for local paths. Core modules that do not use remote URIs can therefore safely use `Path`.
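
The name-based introspection can be sketched as follows. The helper `stimela_dtype` is hypothetical and much simpler than the real introspector; it only shows that a `NewType` carries its name and base type at runtime.

```python
from pathlib import Path
from typing import NewType

File = NewType("File", Path)
MS = NewType("MS", Path)


def stimela_dtype(annotation: object) -> str:
    """Map an annotation to a Stimela dtype name (illustrative only)."""
    # NewType objects expose `__supertype__`; their `__name__` is the string
    # passed to NewType(), which doubles as the Stimela dtype here.
    if hasattr(annotation, "__supertype__"):
        return annotation.__name__
    return getattr(annotation, "__name__", str(annotation))


print(stimela_dtype(File), stimela_dtype(MS), stimela_dtype(int))
```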

### Ruff formatting and `config_file`

The `generate-function` command runs `ruff check --fix` and `ruff format` on generated code.
Ruff infers first-party packages from the working directory, which affects import grouping (e.g., whether `import typer` and `from hip_cargo...` get a blank line between them).
When a `--config-file` is provided, `hip-cargo` runs ruff from the config file's parent directory so that first-party detection matches the target project rather than wherever `hip-cargo` happens to be invoked from.

## Development

This project uses:
- [uv](https://github.com/astral-sh/uv) for dependency management
- [ruff](https://github.com/astral-sh/ruff) for linting and formatting (core dependency — `generate-function` runs `ruff format` and `ruff check --fix` on generated code)
- [typer](https://typer.tiangolo.com/) for the CLI
- [git-cliff](https://git-cliff.org/) for `CHANGELOG` automation


### Setting Up Development Environment

```bash
# Clone the repository
git clone https://github.com/landmanbester/hip-cargo.git
cd hip-cargo

# Install dependencies with development tools
uv sync --group dev --group test

# Install pre-commit hooks (recommended) — both the default pre-commit hook
# (runs formatters/linters) and the commit-msg hook (enforces conventional commits).
uv run pre-commit install --hook-type pre-commit --hook-type commit-msg
```

This will automatically run the hooks before each commit.
If any checks fail, the commit will be blocked until you fix the issues.

#### Running Hooks Manually

You can run the hooks manually on all files:

```bash
# Run on all files
uv run pre-commit run --all-files

# Run on staged files only
uv run pre-commit run
```

#### Updating Hook Versions

To update hook versions to the latest:

```bash
uv run pre-commit autoupdate
```

### Manual Code Quality Checks

If you prefer to run checks manually without pre-commit:

```bash
# Format code
uv run ruff format .

# Check and auto-fix linting issues
uv run ruff check . --fix

# Run tests
uv run pytest -v

```

### Commit Message Convention

This project uses [Conventional Commits](https://www.conventionalcommits.org/) to enable automated changelog generation via [git-cliff](https://git-cliff.org/).

Every commit message should follow this format:

```
<type>: <description>

[optional body]
```

**Types:**

| Type | When to use | Changelog section |
|------|------------|-------------------|
| `feat` | New feature or capability | Added |
| `fix` | Bug fix | Fixed |
| `refactor` | Code change that neither fixes a bug nor adds a feature | Changed |
| `perf` | Performance improvement | Changed |
| `docs` | Documentation only | Documentation |
| `test` | Adding or updating tests | Testing |
| `ci` | CI/CD changes | CI |
| `deps` | Dependency updates | Dependencies |
| `chore` | Maintenance tasks (cab regeneration, formatting) | Miscellaneous |

**Examples:**

```bash
git commit -m "feat: add support for MS dtype in type inference"
git commit -m "fix: handle empty docstrings in introspector"
git commit -m "refactor: simplify generate_cabs output formatting"
git commit -m "docs: add container fallback section to README"
git commit -m "test: add roundtrip test for List types"
```

**Scoped commits** (optional): Use parentheses to specify the affected component:

```bash
git commit -m "feat(init): add --license-type option for BSD-3-Clause"
git commit -m "fix(runner): resolve volume mount for symlinked paths"
```
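
The message format can be checked with a short regular expression. This sketch is not the hook the project uses; `is_conventional` is a hypothetical helper and the allowed scope characters are an assumption.

```python
import re

TYPES = ("feat", "fix", "refactor", "perf", "docs", "test", "ci", "deps", "chore")
# <type>, optional (scope), then ": " and a non-empty description.
PATTERN = re.compile(rf"^(?:{'|'.join(TYPES)})(?:\([a-z0-9_-]+\))?: \S.*")


def is_conventional(message: str) -> bool:
    """Check the first line of a commit message against `<type>(scope): <description>`."""
    first_line = message.splitlines()[0] if message else ""
    return PATTERN.match(first_line) is not None


print(is_conventional("fix(runner): resolve volume mount for symlinked paths"))
```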

### Contributing Workflow


1. **Create a feature branch**:
   ```bash
   git checkout -b your-feature-name
   ```

2. **Update the container image tag** in `src/hip_cargo/_container_image.py` to match your branch name.

   This ensures the cab definitions generated by pre-commit hooks use the correct branch-specific image tag during development. You do not need to reset the tag before merging — the `update-cabs` workflow handles that automatically on merge to `main`.

3. **Make your changes** and ensure tests pass:
   ```bash
   uv run pytest -v
   ```

4. **Commit using [conventional commit messages](#commit-message-convention)**:
   ```bash
   git add .
   git commit -m "feat: your feature description"
   # Pre-commit hooks run automatically
   ```

   The pre-commit hooks keep the CLI and the corresponding cab definitions in sync, and enforce code quality and conventional commits.

5. **Push and create a pull request**:
   ```bash
   git push origin your-feature-name
   ```

The GitHub Actions workflows automate containerisation by pushing container images to the GitHub Container Registry. Once the PR is merged, they also reset the container image tag from the branch name to `latest`.

## License

MIT License
