Metadata-Version: 2.4
Name: octorules-wirefilter
Version: 0.4.0
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Rust
Classifier: Topic :: System :: Networking :: Firewalls
Classifier: Topic :: System :: Systems Administration
Requires-Dist: pytest>=7.0 ; extra == 'dev'
Requires-Dist: ruff>=0.4.0 ; extra == 'dev'
Requires-Dist: yamllint>=1.35.0 ; extra == 'dev'
Requires-Dist: build ; extra == 'dev'
Provides-Extra: dev
License-File: LICENSE
Summary: Wirefilter expression parser FFI bindings for octorules
Keywords: cloudflare,wirefilter,parser,ffi
Home-Page: https://github.com/doctena-org/octorules-wirefilter
Author: Martin Simon, Doctena S.A.
License-Expression: Apache-2.0
Requires-Python: >=3.10
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Homepage, https://github.com/doctena-org/octorules-wirefilter
Project-URL: Issues, https://github.com/doctena-org/octorules-wirefilter/issues
Project-URL: Repository, https://github.com/doctena-org/octorules-wirefilter

# octorules-wirefilter

Rust FFI bindings for Cloudflare's [wirefilter](https://github.com/cloudflare/wirefilter) expression parser, exposed to Python via [PyO3](https://pyo3.rs/). When installed, [`octorules lint`](https://github.com/doctena-org/octorules) uses the real wirefilter parser for authoritative expression analysis instead of the built-in regex fallback.

## Installation

```bash
# Install octorules with wirefilter support
pip install "octorules[wirefilter]"

# Or install standalone
pip install octorules-wirefilter
```

## How it works

```
octorules lint
    │
    ▼
expression_bridge.py          Python-side routing layer
    │
    ├─► octorules_wirefilter   (if installed)
    │       │
    │       ├── lib.rs         PyO3 parse_expression(expr, phase=None)
    │       ├── scheme.rs      Phase-aware field/function schemes
    │       └── visitor.rs     AST walker → fields, functions, operators, literals
    │       │
    │       ▼
    │   wirefilter-engine      Cloudflare's Rust expression parser
    │
    └─► regex fallback         Built-in patterns (always available)
```

`octorules` tries to import `octorules_wirefilter` at module load time. If available, expressions are parsed by the real Cloudflare wirefilter engine. On import failure or parse error, the bridge transparently falls back to regex extraction. Either path returns the same `ExpressionInfo` dataclass consumed by the linter.
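
The routing described above can be sketched roughly as follows. This is a minimal illustration and not the actual `expression_bridge.py`: the `extract_fields` helper and the fallback regex are hypothetical, and the sketch returns a plain list instead of the `ExpressionInfo` dataclass.

```python
import re

try:
    # Native extension; only importable when octorules-wirefilter is installed.
    from octorules_wirefilter import parse_expression
except ImportError:
    parse_expression = None

# Hypothetical fallback pattern: dotted lowercase identifiers such as http.host.
_FIELD_RE = re.compile(r"\b[a-z_][a-z0-9_]*(?:\.[a-z0-9_]+)+\b")


def extract_fields(expr: str) -> list[str]:
    """Return field names, preferring the native parser when it is usable."""
    if parse_expression is not None:
        result = parse_expression(expr)
        if "error" not in result:
            return list(result["fields"])
    # Import failed or the parser reported an error: fall back to the regex.
    # Quoted strings are blanked first so literals like "example.com" are skipped.
    unquoted = re.sub(r'"[^"]*"', '""', expr)
    return sorted(set(_FIELD_RE.findall(unquoted)))
```

Calling `extract_fields('http.host eq "example.com"')` goes through the native parser when the extension is importable and through the regex otherwise, so callers do not need to know which path was taken.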

## Scheme

A single wirefilter scheme is built at startup and cached:

- **164 fields** and **34 functions**, both listed by `get_schema_info()`.
- **Named list support** — expressions like `ip.src in $my_list` parse
  without error. `AlwaysList` is registered for Int, Ip, and Bytes types
  so any `$name` reference is accepted. Actual list validation (existence,
  type compatibility) is handled by the Python linter ([CF102](https://github.com/doctena-org/octorules-cloudflare/blob/main/docs/lint/README.md): unresolved list reference, [CF104](https://github.com/doctena-org/octorules-cloudflare/blob/main/docs/lint/README.md): field type incompatible with list kind).
- **Wildcard limit** — `ParserSettings` enforces a maximum of 10 `*`
  metacharacters per wildcard pattern to prevent catastrophic backtracking.
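
To illustrate the wildcard limit, the check amounts to counting `*` metacharacters before a pattern is accepted. The sketch below is a pure-Python approximation, not the Rust `ParserSettings` logic; the function names are hypothetical, and treating a backslash-escaped `\*` as a literal is an assumption.

```python
MAX_WILDCARD_STARS = 10  # limit stated in the list above


def count_star_metacharacters(pattern: str) -> int:
    """Count `*` characters that are not escaped by a preceding backslash."""
    count = 0
    escaped = False
    for ch in pattern:
        if escaped:
            escaped = False  # this character is a literal, whatever it is
        elif ch == "\\":
            escaped = True
        elif ch == "*":
            count += 1
    return count


def check_wildcard_pattern(pattern: str) -> bool:
    """Return True if the pattern stays within the star limit."""
    return count_star_metacharacters(pattern) <= MAX_WILDCARD_STARS
```

A pattern such as `*.example.com/*` passes, while one with eleven consecutive `*` characters is rejected.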

The `phase` parameter is accepted for API compatibility but currently unused — all expressions are parsed against the same scheme. Transform-phase function-call syntax (where `http.request.uri.path` is callable) is handled on the Python side.

## Building from source

### Prerequisites

- Rust toolchain **>= 1.86**, edition 2024 (stable, via [rustup](https://rustup.rs/))
- Python >= 3.10 with venv
- [maturin](https://github.com/PyO3/maturin) (`pip install maturin`)

### Development build

```bash
maturin develop
```

Builds the Rust crate and installs the resulting Python extension module into the active virtualenv.

### Wheel build

```bash
maturin build --release
```

Produces a wheel in `target/wheels/`.

## Testing

```bash
# Install test dependencies
pip install pytest

# Run FFI tests (requires octorules-wirefilter to be installed via maturin develop)
pytest tests/
```

Tests skip gracefully if the native extension is not installed.
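
One way to get this skip behaviour is sketched below with the standard library's `unittest`. It is an assumption about how such a guard could look, not a copy of the project's test suite, and the `ParseExpressionSmokeTest` class is hypothetical.

```python
import importlib.util
import unittest

# True only when the native extension can be located on the import path.
HAS_NATIVE = importlib.util.find_spec("octorules_wirefilter") is not None


@unittest.skipUnless(HAS_NATIVE, "octorules_wirefilter is not installed")
class ParseExpressionSmokeTest(unittest.TestCase):
    def test_extracts_field(self):
        # Imported inside the test so the module loads without the extension.
        from octorules_wirefilter import parse_expression

        result = parse_expression('http.host eq "example.com"')
        self.assertIn("http.host", result["fields"])
```

In a pytest suite, calling `pytest.importorskip("octorules_wirefilter")` at the top of a test module has a similar effect.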

## API

This package exposes two functions:

### `parse_expression(expr, phase=None)`

```python
from octorules_wirefilter import parse_expression

# Parse an expression against the default scheme
result = parse_expression('http.host eq "example.com"')
# {'fields': ['http.host'], 'operators': ['eq'], 'string_literals': ['example.com'], ...}

# Phase parameter is accepted for forward compatibility but currently unused —
# all expressions parse against the same scheme regardless of phase value.
result = parse_expression('lower(http.host) eq "test"', phase="url_rewrite_rules")

# Parse errors return an error key (all list keys present but empty)
result = parse_expression('bogus_field eq "x"')
# {'error': '...', 'fields': [], 'functions': [], 'operators': [], ...}
```

**Returns** a dict with keys `fields`, `functions`, `operators`, `string_literals`, `regex_literals`, `ip_literals`, `int_literals` (all lists), plus:
- On success: lists populated with extracted values. If AST nesting exceeds the depth limit, the dict also includes `depth_exceeded: True`.
- On failure: `error` (string) with all list keys present but empty.

Expressions exceeding 1 MiB are rejected with an error dict before parsing.
Nesting depth is capped at 100 levels to prevent stack overflow on pathological input.
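
A caller might handle the three outcomes along the lines below. This is a sketch under assumptions: `classify_result` and `is_too_large` are hypothetical helpers, and measuring the 1 MiB limit in UTF-8 bytes (rather than characters) is a guess.

```python
MAX_EXPRESSION_BYTES = 1024 * 1024  # 1 MiB, per the limit above

LIST_KEYS = (
    "fields", "functions", "operators", "string_literals",
    "regex_literals", "ip_literals", "int_literals",
)


def is_too_large(expr: str) -> bool:
    """Pre-check the size limit (assumed to be counted in UTF-8 bytes)."""
    return len(expr.encode("utf-8")) > MAX_EXPRESSION_BYTES


def classify_result(result: dict) -> str:
    """Classify a parse_expression() result as 'error', 'truncated', or 'ok'."""
    if "error" in result:
        return "error"
    if result.get("depth_exceeded"):
        return "truncated"  # lists may be incomplete past the depth limit
    return "ok"


def empty_result(error: str) -> dict:
    """Build a dict shaped like the documented failure case."""
    return {"error": error, **{key: [] for key in LIST_KEYS}}
```

Checking for `error` before reading any list key keeps the caller from mistaking an empty failure result for an expression that simply has no fields.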

### `get_schema_info()`

```python
from octorules_wirefilter import get_schema_info

info = get_schema_info()
# {'fields': [{'name': 'http.host', 'type': 'STRING'}, ...],
#  'functions': ['lower', 'upper', ...]}
```

Returns schema metadata for automated synchronization with the Python linter schemas. Field types use the Python `FieldType` enum names (`STRING`, `INT`, `BOOL`, `IP`, `ARRAY_STRING`, etc.).
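
As a rough sketch of how that synchronization check might work, the helper below compares the field names from `get_schema_info()` with a set of names known on the Python side. The `diff_field_names` function and the sample data are hypothetical; the real sync tooling lives in octorules-cloudflare and may work differently.

```python
def diff_field_names(
    schema_info: dict, python_fields: set[str]
) -> tuple[set[str], set[str]]:
    """Return (missing_in_python, missing_in_rust) field-name sets."""
    rust_fields = {field["name"] for field in schema_info["fields"]}
    return rust_fields - python_fields, python_fields - rust_fields


# Sample data shaped like the get_schema_info() output shown above.
sample_info = {
    "fields": [
        {"name": "http.host", "type": "STRING"},
        {"name": "ip.src", "type": "IP"},
    ],
    "functions": ["lower", "upper"],
}
missing_in_python, missing_in_rust = diff_field_names(
    sample_info, {"http.host", "http.user_agent"}
)
```

With the sample data, `ip.src` is reported as missing on the Python side and `http.user_agent` as missing on the Rust side.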

## Contributing

**Important:** Field and function registries exist in two places: `src/scheme.rs` (Rust — used by wirefilter for parsing and type checking) and `octorules_cloudflare/linter/schemas/` in the [octorules-cloudflare](https://github.com/doctena-org/octorules-cloudflare) repo (Python — used by the regex fallback parser and lint rules). A pre-commit hook in octorules-cloudflare auto-regenerates `schemas.json` when `overlay.toml` or `pyproject.toml` is modified. Rust-side changes here must still be made manually.

### Adding fields

When Cloudflare adds new fields, update `src/scheme.rs` — add the field to `register_common_fields()` **and** to the `COMMON_FIELD_NAMES` array.

Then in the [octorules-cloudflare](https://github.com/doctena-org/octorules-cloudflare) repo, run `python scripts/sync_schemas.py` to regenerate `schemas.json`. If the field needs Python-only metadata (`requires_plan`, `is_response`), add it to `overlay.toml` in that repo first.

### Adding functions

Update `src/scheme.rs` — register in `register_common_functions()` and add the name to the `COMMON_FUNCTION_NAMES` array.

Then in the [octorules-cloudflare](https://github.com/doctena-org/octorules-cloudflare) repo, run `python scripts/sync_schemas.py` to regenerate `schemas.json`. If the function needs `restricted_phases` or `requires_plan`, add it to `overlay.toml` in that repo first.

## Design decisions

- **Separate PyPI package.** Building the Rust extension requires a Rust toolchain and takes longer than a pure-Python install. Users who want fast installs get `pip install octorules`; those who want authoritative parsing opt in with `pip install octorules[wirefilter]`.
- **Git dependency pinning.** `wirefilter-engine` is pinned to a specific commit because the required APIs (`SchemeBuilder`, function registration) are not in the published crates.io version.
- **Stub function implementations.** Functions are registered with correct type signatures but no-op execution. Expressions parse and extract correctly; runtime evaluation is not supported.
- **cdylib crate type.** Required by PyO3's extension-module feature for Python to load the native extension.

## License

octorules-wirefilter is licensed under the [Apache License 2.0](LICENSE).

