Metadata-Version: 2.4
Name: conf-spl2-converter
Version: 0.3.0
Summary: A CLI tool for converting Splunk .conf configurations to SPL2
Project-URL: Homepage, https://github.com/splunk/conf-spl2-converter
Project-URL: Repository, https://github.com/splunk/conf-spl2-converter
Author-email: Splunk <mgazda@cisco.com>
License: Splunk Proprietary
License-File: LICENSE
Keywords: conf,converter,pipeline,spl2,splunk
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: Other/Proprietary License
Classifier: Natural Language :: English
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Utilities
Requires-Python: >=3.9
Requires-Dist: addonfactory-splunk-conf-parser-lib>=0.4
Requires-Dist: antlr4-python3-runtime>=4.13
Requires-Dist: typer>=0.12.0
Provides-Extra: testing
Requires-Dist: defusedxml>=0.7; extra == 'testing'
Requires-Dist: requests>=2.31; extra == 'testing'
Requires-Dist: splunk-sdk>=2.0; extra == 'testing'
Requires-Dist: urllib3>=2.0; extra == 'testing'
Description-Content-Type: text/markdown

# conf-spl2-converter

A CLI tool for converting Splunk `.conf` configurations (`props.conf` / `transforms.conf`) to SPL2 pipeline templates, and for generating expected test outputs from Splunk field extractions or CIM field annotations.

> **Alpha** — this project is under active development. APIs and output format may change.

## Installation

Requires Python 3.9+.

```bash
pip install conf-spl2-converter
```

To use the test generation pipeline (`generate-expected`), install with the testing extra:

```bash
pip install 'conf-spl2-converter[testing]'
```

## Quick start

```bash
# 1. Generate SPL2 pipeline files from a TA
conf-spl2-converter generate /path/to/ta

# 2a. Generate expected test outputs from CIM fields (no Docker needed)
conf-spl2-converter generate-expected-cim /path/to/ta

# 2b. Or generate expected test outputs via Splunk (requires Docker)
conf-spl2-converter generate-expected /path/to/ta
```

## Commands

### `generate` — Create SPL2 pipeline templates

Reads the TA's `props.conf` and `transforms.conf` and generates SPL2 pipeline files.

```bash
# Auto-discover all sourcetypes from props.conf (no config file needed)
conf-spl2-converter generate /path/to/ta

# Use a config file to control which sourcetypes are processed and how
conf-spl2-converter generate /path/to/ta -c field_extraction_config.json

# Write output to a custom directory
conf-spl2-converter generate /path/to/ta -o /tmp/my-output

# Export parsed template data as JSON (useful for debugging / integration)
conf-spl2-converter generate /path/to/ta -o /tmp/my-output -f json

# Combine all options with verbose logging
conf-spl2-converter generate /path/to/ta -c config.json -o ./out -f spl2 -v
```

#### Options

| Flag | Short | Description |
|------|-------|-------------|
| `--config` | `-c` | Path to a `field_extraction_config.json`. When omitted, looks for it in the TA directory; falls back to auto-discovery from `props.conf`. |
| `--output` | `-o` | Output directory for generated files. Defaults to `<ta_path>/default/data/spl2/`. |
| `--format` | `-f` | Output format: `spl2` (default) or `json`. |
| `--verbose` | `-v` | Enable debug logging. |

#### Output formats

- **`spl2`** (default) — renders `.spl2` pipeline files ready for use in Splunk.
- **`json`** — writes a structured JSON file per sourcetype containing the parsed template data (extractions, evals, lookups, etc.).

### `generate-expected` — Generate expected test outputs

Runs the full test generation pipeline in a single command. Requires Docker.

The pipeline:

1. Starts a Splunk Docker container with the TA installed.
2. Collects test samples from the TA's `tests/knowledge/samples/` directory (XML/log files).
3. Sends each sample event to Splunk via HEC.
4. Retrieves Splunk's extracted fields via the Splunk SDK.
5. Generates `module.test.json` files containing the expected field extractions.
6. Stops the Splunk container (unless `--keep-running` is used).

```bash
# Basic usage — starts Docker, runs pipeline, stops Docker
conf-spl2-converter generate-expected /path/to/ta

# With a config file and custom output directory
conf-spl2-converter generate-expected /path/to/ta -c config.json -o ./out

# Skip Docker management (assumes Splunk is already running on localhost)
conf-spl2-converter generate-expected /path/to/ta --skip-docker

# Leave the Splunk container running after completion (useful for iterating)
conf-spl2-converter generate-expected /path/to/ta --keep-running

# Verbose logging
conf-spl2-converter generate-expected /path/to/ta -v
```

#### Options

| Flag | Short | Description |
|------|-------|-------------|
| `--config` | `-c` | Path to a `field_extraction_config.json`. When omitted, looks for it in the TA directory; falls back to auto-discovery from `props.conf`. |
| `--output` | `-o` | Output directory for generated files. Defaults to `<ta_path>/default/data/spl2/`. |
| `--skip-docker` | | Skip Docker container management; assume Splunk is already running. |
| `--keep-running` | | Leave the Splunk container running after completion. |
| `--verbose` | `-v` | Enable debug logging. |

#### Generated files

For each sourcetype, the pipeline produces:

```
<output_dir>/<sourcetype_slug>/
    <sourcetype_slug>.samples      # JSONL file with collected sample events
    module.test.json               # Expected field extractions for each sample
```

#### Environment variables

Splunk connection settings can be overridden via environment variables:

| Variable | Default | Description |
|----------|---------|-------------|
| `SPL2_TF_SPLUNK_INSTANCE_IP` | `127.0.0.1` | Splunk host address |
| `SPL2_TF_SPLUNK_INSTANCE_PORT` | `8088` | HEC port |
| `SPL2_TF_SPLUNK_INSTANCE_API_PORT` | `8089` | Splunk management API port |
| `SPL2_TF_SPLUNK_INSTANCE_USERNAME` | `admin` | Splunk admin username |
| `SPL2_TF_SPLUNK_INSTANCE_PASSWORD` | `newPassword` | Splunk admin password |
| `SPL2_TF_SPLUNK_INSTANCE_INDEX` | `cov_test` | Index used for test events |
| `SPL2_TF_SPLUNK_INSTANCE_HEC_TOKEN` | `cc7f4d5e-...` | HEC authentication token |
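
For example, before running `generate-expected --skip-docker` against an already-running Splunk instance, export the relevant variables (the host and password below are placeholders; substitute your own):

```bash
# Placeholder values; substitute your own Splunk host and credentials.
export SPL2_TF_SPLUNK_INSTANCE_IP=10.0.0.5
export SPL2_TF_SPLUNK_INSTANCE_API_PORT=8089
export SPL2_TF_SPLUNK_INSTANCE_PASSWORD='my-password'
```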

### `generate-expected-all` — Splunk + CIM in a single command

Runs both pipelines sequentially: first `generate-expected` (Splunk-based) to populate `expected_destination_result`, then `generate-expected-cim` to add `expected_cim_fields`. The result is a `module.test.json` with both full Splunk extraction results and CIM field expectations.

```bash
conf-spl2-converter generate-expected-all /path/to/ta

# Skip Docker management if Splunk is already running
conf-spl2-converter generate-expected-all /path/to/ta --skip-docker -v
```

Accepts the same options as `generate-expected` (`--config`, `--output`, `--skip-docker`, `--keep-running`, `--verbose`).

### `generate-expected-cim` — Generate expected test outputs from CIM fields

Offline alternative to `generate-expected`. Instead of running events through a Splunk instance, this command reads CIM field annotations already present in the TA's XML sample files and writes them as `expected_cim_fields` in `module.test.json`. No Docker or Splunk required.

Each XML event can contain a `<cim>` element with `<cim_fields>`, `<models>`, and `<missing_recommended_fields>`. This command extracts the CIM field name/value pairs and:

- **If a `module.test.json` already exists** (e.g. from a prior `generate-expected` run), it merges the CIM data into matching test entries (matched by `_raw`) as an `expected_cim_fields` section, preserving the existing `expected_destination_result`.
- **If no prior test exists** for a sample, it creates a new entry with an empty `expected_destination_result` and the `expected_cim_fields` section.
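
As a rough sketch of the merge result, a single test entry might look like the following (the top-level layout and the field name/value pairs inside both sections are illustrative assumptions; only `_raw`, `expected_destination_result`, and `expected_cim_fields` are named in this README):

```json
{
  "_raw": "Jan 01 00:00:00 host app[123]: user=alice action=login",
  "expected_destination_result": {
    "user": "alice",
    "action": "login"
  },
  "expected_cim_fields": {
    "user": "alice",
    "action": "success"
  }
}
```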

```bash
# Basic usage
conf-spl2-converter generate-expected-cim /path/to/ta

# With a config file and custom output directory
conf-spl2-converter generate-expected-cim /path/to/ta -c config.json -o ./out

# Verbose logging
conf-spl2-converter generate-expected-cim /path/to/ta -v
```

#### Options

| Flag | Short | Description |
|------|-------|-------------|
| `--config` | `-c` | Path to a `field_extraction_config.json`. When omitted, looks for it in the TA directory; falls back to auto-discovery from `props.conf`. |
| `--output` | `-o` | Output directory for generated files. Defaults to `<ta_path>/default/data/spl2/`. |
| `--verbose` | `-v` | Enable debug logging. |

> **Note:** Events without CIM field annotations (`<cim/>` or missing `<cim>`) are skipped.

### Full workflow example

Generate SPL2 pipelines, expected test data, and run tests for a TA:

```bash
# Step 1: Generate SPL2 pipeline templates
conf-spl2-converter generate /path/to/ta

# Step 2: Generate expected test outputs — pick one:
#   Option A: From CIM fields in XML samples (fast, no Docker)
conf-spl2-converter generate-expected-cim /path/to/ta
#   Option B: Via Splunk field extraction (full fidelity, requires Docker)
conf-spl2-converter generate-expected /path/to/ta
#   Option C: Both Splunk + CIM in one command (requires Docker)
conf-spl2-converter generate-expected-all /path/to/ta

# Step 3: Run tests with spl2-testing-framework (see below)
cd <ta>/default/data/spl2
spl2_tests_run cli -v --ignore_additional_fields_in_actual --ignore_empty_strings
```

The `generate` and `generate-expected` commands both write to `<ta_path>/default/data/spl2/` by default, producing:

```
<ta>/default/data/spl2/<sourcetype_slug>/
    pipeline_<sourcetype_slug>.spl2    # SPL2 pipeline (from generate)
    <sourcetype_slug>.samples          # Sample events (from generate-expected)
    module.test.json                   # Expected outputs (from generate-expected)
```

## Running tests with spl2-testing-framework

Use [spl2-testing-framework](https://pypi.org/project/spl2-testing-framework/) to verify that the generated SPL2 pipelines produce the expected field extractions.

```bash
pip install spl2-testing-framework

cd <ta>/default/data/spl2
spl2_tests_run cli -v --ignore_additional_fields_in_actual --ignore_empty_strings
```

## Config file

When `--config` / `-c` is **not** provided, the tool looks for `field_extraction_config.json` inside the TA directory and uses it automatically if found; otherwise, sourcetypes are auto-discovered from `props.conf`. When `--config` / `-c` **is** provided, the specified file is used instead of the default lookup.

The config file controls which sourcetypes are processed, along with extra settings such as lookups, fields to trim, sub-sourcetypes, and `kv_mode` overrides.
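
As a purely hypothetical illustration of the kinds of settings mentioned above (every key name below is an assumption; consult the actual schema before writing a config):

```json
{
  "sourcetypes": {
    "vendor:product:log": {
      "lookups": ["vendor_severity_lookup"],
      "fields_to_trim": ["punct"],
      "kv_mode": "none"
    }
  }
}
```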

## Development

```bash
# Install dependencies (requires uv)
uv sync --group dev

# Run tests
uv run pytest

# Lint and format
uv run ruff check .
uv run ruff format .

# Install pre-commit hooks
uv run pre-commit install
```

## License

Copyright (C) 2026 Splunk Inc. All Rights Reserved.
See [LICENSE](LICENSE) for details.
