Metadata-Version: 2.4
Name: etlplus
Version: 1.13.3
Summary: A Swiss Army knife for simple ETL operations
Author: ETLPlus Team
License-Expression: MIT
Project-URL: Changelog, https://etlplus.readthedocs.io/en/stable/changelog.html
Project-URL: Documentation, https://etlplus.readthedocs.io/en/stable/
Project-URL: Donate, https://buymeacoffee.com/djrlj694
Project-URL: Funding, https://github.com/sponsors/Dagitali
Project-URL: Homepage, https://github.com/Dagitali/ETLPlus
Project-URL: Repository, https://github.com/Dagitali/ETLPlus
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Python: <3.15,>=3.13
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: cbor2>=5.6.4
Requires-Dist: duckdb>=1.1.0
Requires-Dist: fastavro>=1.12.1
Requires-Dist: frictionless>=5.19.0
Requires-Dist: jinja2>=3.1.6
Requires-Dist: jsonschema>=4.26.0
Requires-Dist: lxml>=6.1.0
Requires-Dist: msgpack>=1.0.8
Requires-Dist: odfpy>=1.4.1
Requires-Dist: openpyxl>=3.1.5
Requires-Dist: pyodbc>=5.3.0
Requires-Dist: pyarrow>=22.0.0
Requires-Dist: pymongo>=4.9.1
Requires-Dist: python-dotenv>=1.2.1
Requires-Dist: pandas>=2.3.3
Requires-Dist: pydantic>=2.12.5
Requires-Dist: PyYAML>=6.0.3
Requires-Dist: requests>=2.32.5
Requires-Dist: SQLAlchemy>=2.0.45
Requires-Dist: tomli-w>=1.2.0
Requires-Dist: typer>=0.21.0
Requires-Dist: xlrd>=2.0.2
Requires-Dist: xlwt>=1.3.0
Provides-Extra: dev
Requires-Dist: autopep8>=2.3.2; extra == "dev"
Requires-Dist: build>=1.2.2; extra == "dev"
Requires-Dist: mypy>=1.20.1; extra == "dev"
Requires-Dist: pydoclint>=0.8.1; extra == "dev"
Requires-Dist: pydocstyle>=6.3.0; extra == "dev"
Requires-Dist: pytest>=8.4.2; extra == "dev"
Requires-Dist: pytest-cov>=7.0.0; extra == "dev"
Requires-Dist: ruff>=0.14.4; extra == "dev"
Provides-Extra: docs
Requires-Dist: myst-parser<6.0.0,>=5.0.0; extra == "docs"
Requires-Dist: sphinx>=9.1.0; extra == "docs"
Requires-Dist: sphinx-rtd-theme>=3.1.0; extra == "docs"
Requires-Dist: sphinxcontrib-napoleon>=0.7.0; extra == "docs"
Provides-Extra: file
Requires-Dist: netCDF4>=1.7.2; extra == "file"
Requires-Dist: pyreadr>=0.5.2; extra == "file"
Requires-Dist: pyreadstat>=1.3.3; extra == "file"
Requires-Dist: xarray>=2024.9.0; extra == "file"
Provides-Extra: storage
Requires-Dist: azure-storage-blob>=12.26.0; extra == "storage"
Requires-Dist: azure-storage-file-datalake>=12.21.0; extra == "storage"
Requires-Dist: boto3>=1.40.0; extra == "storage"
Provides-Extra: telemetry
Requires-Dist: opentelemetry-api>=1.38.0; extra == "telemetry"
Requires-Dist: opentelemetry-sdk>=1.38.0; extra == "telemetry"
Dynamic: license-file

# ETLPlus

[![PyPI](https://img.shields.io/pypi/v/etlplus.svg)][PyPI package]
[![Release](https://img.shields.io/github/v/release/Dagitali/ETLPlus)][GitHub release]
[![Python](https://img.shields.io/pypi/pyversions/etlplus)][PyPI package]
[![License](https://img.shields.io/github/license/Dagitali/ETLPlus.svg)](LICENSE)
[![CI](https://github.com/Dagitali/ETLPlus/actions/workflows/ci.yml/badge.svg?branch=main)][GitHub Actions CI workflow]
[![Coverage](https://img.shields.io/codecov/c/github/Dagitali/ETLPlus?branch=main)][Codecov project]
[![Issues](https://img.shields.io/github/issues/Dagitali/ETLPlus)][GitHub issues]
[![PRs](https://img.shields.io/github/issues-pr/Dagitali/ETLPlus)][GitHub PRs]
[![GitHub contributors](https://img.shields.io/github/contributors/Dagitali/ETLPlus)][GitHub contributors]

ETLPlus is a veritable Swiss Army knife for simple ETL operations, offering both a Python package
and a command-line interface for data extraction, validation, transformation, and loading.

- [ETLPlus](#etlplus)
  - [Getting Started](#getting-started)
  - [At a Glance](#at-a-glance)
  - [Release Status](#release-status)
  - [Features](#features)
  - [Installation](#installation)
  - [Quickstart](#quickstart)
    - [Command-line interface](#command-line-interface)
    - [Python API](#python-api)
  - [Support ETLPlus](#support-etlplus)
  - [Data Connectors](#data-connectors)
    - [REST APIs (`api`)](#rest-apis-api)
    - [Databases (`database`)](#databases-database)
    - [Files (`file`)](#files-file)
      - [Handler Matrix Guardrail](#handler-matrix-guardrail)
      - [Stubbed / Placeholder](#stubbed--placeholder)
      - [Tabular \& Delimited Text](#tabular--delimited-text)
      - [Semi-Structured Text](#semi-structured-text)
      - [Columnar / Analytics-Friendly](#columnar--analytics-friendly)
      - [Binary Serialization and Interchange](#binary-serialization-and-interchange)
      - [Databases and Embedded Storage](#databases-and-embedded-storage)
      - [Spreadsheets](#spreadsheets)
      - [Statistical / Scientific / Numeric Computing](#statistical--scientific--numeric-computing)
      - [Logs and Event Streams](#logs-and-event-streams)
      - [Data Archives](#data-archives)
      - [Templates](#templates)
  - [Usage](#usage)
    - [Command Line Interface](#command-line-interface-1)
      - [Command Shapes](#command-shapes)
      - [Initialize A Starter Project](#initialize-a-starter-project)
      - [Check Pipelines](#check-pipelines)
      - [Render SQL DDL](#render-sql-ddl)
      - [Extract Data](#extract-data)
      - [Validate Data](#validate-data)
      - [Transform Data](#transform-data)
      - [Inspect Run History](#inspect-run-history)
      - [Load Data](#load-data)
    - [Python API](#python-api-1)
    - [Complete ETL Pipeline Example](#complete-etl-pipeline-example)
    - [Format Overrides](#format-overrides)
  - [Transformation Operations](#transformation-operations)
    - [Filter Operations](#filter-operations)
    - [Aggregation Functions](#aggregation-functions)
  - [Validation Rules](#validation-rules)
  - [Development](#development)
    - [API Client Docs](#api-client-docs)
    - [Runner Internals and Connectors](#runner-internals-and-connectors)
    - [Running Tests](#running-tests)
      - [Test Scope and Intent](#test-scope-and-intent)
    - [Code Coverage](#code-coverage)
    - [Linting](#linting)
    - [Updating Demo Snippets](#updating-demo-snippets)
    - [Releasing to the Python Package Index (PyPI)](#releasing-to-the-python-package-index-pypi)
  - [License](#license)
  - [Contributing](#contributing)
  - [Documentation](#documentation)
    - [Python Packages/Subpackage](#python-packagessubpackage)
    - [Community Health](#community-health)
    - [Other](#other)
  - [Acknowledgments](#acknowledgments)

## Getting Started

ETLPlus helps you extract, validate, transform, and load data from files, databases, and APIs, either
as a Python library or from the command line.

To get started:

- See [Installation](#installation) for setup instructions.
- Try the [Quickstart](#quickstart) for a minimal working example (CLI and Python).
- Explore [Usage](#usage) for more detailed options and workflows.
- See [SUPPORT.md](SUPPORT.md) for the current support policy, supported Python versions, and
  response targets.

ETLPlus currently supports Python 3.13 and 3.14.

## At a Glance

- Install with `pip install etlplus` for the supported CLI, `etlplus.ops`, the API client, and the
  built-in implemented file handlers.
- Use `pip install -e ".[dev]"` for contributor tooling and `pip install -e ".[file]"` when you need
  the remaining scientific and specialty format dependencies.
- Use `pip install -e ".[storage]"` when you want cloud storage backends for `s3://`,
  `azure-blob://`, or `abfs://` URIs through `etlplus.storage` and `etlplus.file.File`.
- Expect the most stable execution surface from the documented CLI commands, `etlplus.ops`,
  implemented file handlers, and `etlplus.api`.
- See [docs/source/getting-started/compatibility.md](docs/source/getting-started/compatibility.md)
  for the supported Python versions, platform coverage, and dependency groups.
- See [docs/source/getting-started/quickstart.md](docs/source/getting-started/quickstart.md) if you
  want the shortest path from install to a working ETL flow.

Detailed file-handler coverage and migration notes are still available later in this README and in
[docs/source/guides/file-handler-matrix.md](docs/source/guides/file-handler-matrix.md), but they
are no longer required reading to get started.

## Release Status

ETLPlus treats the `v1.x` line as its stable public release line. The repository still retains some
placeholders, stubs, and migration-reference modules for historical or implementation reasons, but
they are not part of the supported public contract unless they are explicitly documented as such.

The stable surface for the current `v1.x` releases is:

- The documented CLI commands: `check`, `extract`, `history`, `init`, `load`, `log`, `render`,
  `report`, `run`, `status`, `transform`, and `validate`
- The documented Python ETL primitives in `etlplus.ops`, including the advanced step modules under
  `etlplus.ops.transformations`
- The implemented file handlers listed as `implemented` in the handler matrix
- The documented API client and pagination helpers under `etlplus.api`

The following are not part of the stable execution surface unless explicitly promoted later:

- Database extract/load execution paths that are still described as placeholders
- Stubbed file handlers and placeholder formats
- Defunct or migration-reference modules retained for historical context

Maintainers handling packaging, CI, versioned docs, or release gating should consult
[`RELEASE-CHECKLIST.md`](RELEASE-CHECKLIST.md).

## Features

- **Init** starter ETLPlus projects:
  - Scaffold a runnable file-to-file starter pipeline with sample input data
  - Get suggested next commands for checking and running the generated job

- **Check** data pipeline definitions before running them:
  - Summarize jobs, sources, targets, and transforms
  - Validate dependency graphs and print DAG order with `--graph`
  - Run lightweight runtime and config readiness checks with `--readiness`
  - Enable stricter diagnostics with `--strict` to catch malformed entries the tolerant loader
    would otherwise skip
  - Confirm configuration changes by printing focused sections on demand

- **Render** SQL DDL from shared table specs:
  - Generate CREATE TABLE or view statements
  - Swap templates or direct output to files for database migrations

- **Extract** data from multiple sources:
  - Files (CSV, JSON, XML, YAML)
  - Databases (connection string support; extract is a placeholder today)
  - REST APIs (GET)

- **Validate** data with flexible rules:
  - Type checking
  - Required fields
  - Value ranges (min/max)
  - String length constraints
  - Pattern matching
  - Enum validation

- **Transform** data with powerful operations:
  - Filter records
  - Map/rename fields
  - Select specific fields
  - Sort data
  - Aggregate functions (avg, count, max, min, sum)

- **Load** data to multiple targets:
  - Files (CSV, JSON, XML, YAML)
  - Databases (connection string support; load is a placeholder today)
  - REST APIs (PATCH, POST, PUT)

- **Inspect** local run history and reports:
  - List normalized runs with filters and table output
  - Stream raw append events for backend-level troubleshooting
  - Inspect the latest run or aggregate success and duration metrics by job, status, or day

## Installation

<!-- docs:getting-started-installation:start -->

```bash
pip install etlplus
```

For development:

```bash
pip install -e ".[dev]"
```

The default install includes the non-native dependencies used by the built-in file handlers for
common binary, columnar, spreadsheet, and embedded-database formats such as `cbor2`, `duckdb`,
`fastavro`, `msgpack`, `openpyxl`, `odfpy`, `pandas`, `pyarrow`, `pymongo`, `xlrd`, and `xlwt`.

This is intentional for the stable line. ETLPlus treats the documented CLI, `etlplus.ops`,
`etlplus.api`, and the implemented built-in file handlers as one supported default runtime surface,
so the base install keeps the dependencies needed for that surface together instead of pushing core
implemented handlers behind extras.

For development with full optional file-format support:

```bash
pip install -e ".[dev,file]"
```

For runtime-only optional file-format support:

```bash
pip install -e ".[file]"
```

For runtime cloud-storage support:

```bash
pip install -e ".[storage]"
```
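
With the `storage` extra installed, cloud URIs can be read through the same `File` dispatch used for
local paths. A minimal sketch, assuming `File` accepts an `s3://` URI and a plain-string format (the
bucket and key below are placeholders; see `etlplus/storage/README.md` for backend configuration):

```python
from etlplus.file import File

# Read a JSON payload from S3 through the storage backend.
# "example-bucket" and the key are placeholders, not real resources.
records = File("s3://example-bucket/raw/sample.json", "json").read()
print(records)
```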

The `file` extra is now reserved for the remaining scientific and specialty format dependencies such
as `netCDF4`, `pyreadr`, `pyreadstat`, and `xarray`.

That split is also intentional: the `file` extra is reserved for narrower optional workflows rather
than for the built-in formats that ETLPlus expects most users of the default runtime to have
available.

<!-- docs:getting-started-installation:end -->

## Quickstart

<!-- docs:getting-started-quickstart:start -->

Get up and running in under a minute.

### Command-line interface

<!-- docs:getting-started-quickstart-cli:start -->

```bash
# Inspect help and version
etlplus --help
etlplus --version

# One-liner: extract CSV, filter, select, and write JSON
etlplus extract examples/data/sample.csv \
  | etlplus transform --operations '{"filter": {"field": "age", "op": "gt", "value": 25}, "select": ["name", "email"]}' \
  - temp/sample_output.json
```

<!-- docs:getting-started-quickstart-cli:end -->

### Python API

<!-- docs:getting-started-quickstart-python:start -->

```python
from etlplus.ops import extract, transform, validate, load

data = extract("file", "input.csv")
ops = {"filter": {"field": "age", "op": "gt", "value": 25}, "select": ["name", "email"]}
filtered = transform(data, ops)
rules = {"name": {"type": "string", "required": True}, "email": {"type": "string", "required": True}}
assert validate(filtered, rules)["valid"]
load(filtered, "file", "temp/sample_output.json", file_format="json")
```

<!-- docs:getting-started-quickstart-python:end -->

<!-- docs:getting-started-quickstart:end -->

## Support ETLPlus

If ETLPlus saves you engineering time, consider supporting the project through the repository
sponsor button once the funding links are live on the default branch. Funding helps to pay for:

- Maintenance and bug fixes
- New file, API, and database connectors
- Documentation, examples, and release automation
- Compatibility work for new Python and dependency versions

The preferred sponsorship path is [GitHub Sponsors][GitHub Sponsors], with [Buy Me a Coffee][Buy Me
a Coffee] as the lightweight fallback for one-time support.

Support is only one way to contribute. ETLPlus also benefits from codeless contributions such as
documentation fixes, issue triage, reproducible bug reports, usage feedback, examples, testing
results, answering questions in discussions, and release validation.

For community participation, use GitHub Discussions for questions, docs feedback, examples, and
support conversations. Use GitHub Issues for confirmed bugs and concrete feature work. See
[`docs/community-discussions.md`](docs/community-discussions.md) for the recommended setup.

## Data Connectors

Data connectors abstract the sources that data is extracted from and the targets that data is
loaded to. Connectors are differentiated by type, and each type is described in the subsections
below.

### REST APIs (`api`)

ETLPlus can extract from REST APIs and load results via common HTTP methods. Supported operations
include GET for extract and PATCH/POST/PUT for load.
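
For programmatic use, a minimal sketch with `etlplus.ops`, assuming the `extract(type, location)`
and `load(data, type, location)` call shapes shown later in the Python API section also accept the
`api` connector type (the endpoint URL is a placeholder):

```python
from etlplus.ops import extract, load

# GET records from a placeholder REST endpoint.
records = extract("api", "https://api.example.com/data")

# Send the records to an API target (PATCH/POST/PUT are the supported load methods).
result = load(records, "api", "https://api.example.com/data")
print(result)
```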

### Databases (`database`)

Database connectors use connection strings for extraction and loading, and DDL can be rendered from
table specs for migrations or schema checks. Database extract/load operations are currently
placeholders; plan to integrate a database client in your runner.

### Files (`file`)

Recognized file formats are listed in the tables below. Support for reading from or writing to a
recognized file format is marked as:

- **Y**: implemented (may require optional dependencies)
- **N**: stubbed or not yet implemented

**Handler Architecture**

- File IO is moving to class-based handlers rooted at `etlplus/file/base.py` (`FileHandlerABC`,
  category ABCs, and `ReadOnlyFileHandlerABC`).
- `etlplus/file/registry.py` resolves handlers using an explicit `FileFormat -> handler class` map.
- Dispatch is explicit-only: unmapped formats raise `Unsupported format`.
- Module-level `etlplus.file.<format>.read()` / `write()` wrapper APIs have been removed.
- Use handler instances directly (for example, `JsonFile().read(path)` / `JsonFile().write(path,
  data)`) or `etlplus.file.File` dispatch via `File(path, file_format).read()` and `.write(...)`, as
  sketched in the example after this list.
- Documentation and examples intentionally use handler class methods, not the removed module wrappers.
- Placeholder handlers are split into:
  - `etlplus/file/stub.py` for generic stub behavior
  - `etlplus/file/_stub_categories.py` for category-aware internal stub ABCs
- Scientific/statistical handlers `dta`, `nc`, `rda`, `rds`, `sav`, and `xpt` now implement
  `ScientificDatasetFileHandlerABC` dataset hooks.
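
A minimal sketch of both dispatch styles referenced above, assuming `JsonFile` is importable from
`etlplus.file.json` and that `File` accepts the format as a plain string (check
`etlplus/file/README.md` for the exact import paths and `FileFormat` usage):

```python
from etlplus.file import File
from etlplus.file.json import JsonFile  # assumed module path for the handler class

path = "examples/data/sample.json"

# Style 1: call a handler class directly.
records = JsonFile().read(path)
JsonFile().write("temp/sample_copy.json", records)

# Style 2: let File resolve the registered handler for the format.
records = File(path, "json").read()
File("temp/sample_copy.json", "json").write(records)
```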

**Current Migration Coverage (Class-Based + Explicit Registry Mapping)**

- Delimited/text: `csv`, `dat`, `fwf`, `psv`, `tab`, `tsv`, `txt`
- Semi-structured/config: `ini`, `json`, `ndjson`, `properties`, `toml`, `xml`, `yaml`
- Columnar: `arrow`, `feather`, `orc`, `parquet`
- Binary/interchange: `avro`, `bson`, `cbor`, `msgpack`, `pb`, `proto`
- Embedded DB: `duckdb`, `sqlite`
- Spreadsheets: `ods`, `xls`, `xlsm`, `xlsx`
- Scientific/statistical: `dta`, `nc`, `rda`, `rds`, `sav`, `xpt`, `sas7bdat` (read-only), plus
  single-dataset scientific stubs `mat`, `sylk`, `zsav`
- Archive wrappers: `gz`, `zip`
- Log/event streams: `log`
- Templates: `hbs`, `jinja2`, `mustache`, `vm`
- Explicit module-owned stub handlers (via `stub.py` + `_stub_categories.py`): `stub`, `accdb`,
  `cfg`, `conf`, `ion`, `mdb`, `numbers`, `pbf`, `wks`

#### Handler Matrix Guardrail

The concise matrix below is the migration guardrail for class-based handler coverage. For
batch-by-batch maintenance notes and the same matrix in docs, see
[docs/file-handler-matrix.md](docs/file-handler-matrix.md).

| Format | Handler Class | Base ABC | Read/Write Support | Status |
| --- | --- | --- | --- | --- |
| `accdb` | `AccdbFile` | `StubEmbeddedDatabaseFileHandlerABC` | read/write | stub |
| `arrow` | `ArrowFile` | `ColumnarFileHandlerABC` | read/write | implemented |
| `avro` | `AvroFile` | `BinarySerializationFileHandlerABC` | read/write | implemented |
| `bson` | `BsonFile` | `BinarySerializationFileHandlerABC` | read/write | implemented |
| `cbor` | `CborFile` | `BinarySerializationFileHandlerABC` | read/write | implemented |
| `cfg` | `CfgFile` | `StubSemiStructuredTextFileHandlerABC` | read/write | stub |
| `conf` | `ConfFile` | `StubSemiStructuredTextFileHandlerABC` | read/write | stub |
| `csv` | `CsvFile` | `StandardDelimitedTextFileHandlerABC` | read/write | implemented |
| `dat` | `DatFile` | `DelimitedTextFileHandlerABC` | read/write | implemented |
| `dta` | `DtaFile` | `SingleDatasetScientificFileHandlerABC` | read/write | implemented |
| `duckdb` | `DuckdbFile` | `EmbeddedDatabaseFileHandlerABC` | read/write | implemented |
| `feather` | `FeatherFile` | `ColumnarFileHandlerABC` | read/write | implemented |
| `fwf` | `FwfFile` | `TextFixedWidthFileHandlerABC` | read/write | implemented |
| `gz` | `GzFile` | `ArchiveWrapperFileHandlerABC` | read/write | implemented |
| `hbs` | `HbsFile` | `TemplateFileHandlerABC` | read/write | implemented |
| `hdf5` | `Hdf5File` | `ScientificDatasetFileHandlerABC` | read-only | implemented |
| `ini` | `IniFile` | `DictPayloadSemiStructuredTextFileHandlerABC` | read/write | implemented |
| `ion` | `IonFile` | `StubSemiStructuredTextFileHandlerABC` | read/write | stub |
| `jinja2` | `Jinja2File` | `TemplateFileHandlerABC` | read/write | implemented |
| `json` | `JsonFile` | `RecordPayloadSemiStructuredTextFileHandlerABC` | read/write | implemented |
| `log` | `LogFile` | `LogEventFileHandlerABC` | read/write | implemented |
| `mat` | `MatFile` | `StubSingleDatasetScientificFileHandlerABC` | read/write | stub |
| `mdb` | `MdbFile` | `StubEmbeddedDatabaseFileHandlerABC` | read/write | stub |
| `msgpack` | `MsgpackFile` | `BinarySerializationFileHandlerABC` | read/write | implemented |
| `mustache` | `MustacheFile` | `TemplateFileHandlerABC` | read/write | implemented |
| `nc` | `NcFile` | `SingleDatasetScientificFileHandlerABC` | read/write | implemented |
| `ndjson` | `NdjsonFile` | `SemiStructuredTextFileHandlerABC` | read/write | implemented |
| `numbers` | `NumbersFile` | `StubSpreadsheetFileHandlerABC` | read/write | stub |
| `ods` | `OdsFile` | `SpreadsheetFileHandlerABC` | read/write | implemented |
| `orc` | `OrcFile` | `ColumnarFileHandlerABC` | read/write | implemented |
| `parquet` | `ParquetFile` | `ColumnarFileHandlerABC` | read/write | implemented |
| `pb` | `PbFile` | `BinarySerializationFileHandlerABC` | read/write | implemented |
| `pbf` | `PbfFile` | `StubBinarySerializationFileHandlerABC` | read/write | stub |
| `properties` | `PropertiesFile` | `DictPayloadSemiStructuredTextFileHandlerABC` | read/write | implemented |
| `proto` | `ProtoFile` | `BinarySerializationFileHandlerABC` | read/write | implemented |
| `psv` | `PsvFile` | `StandardDelimitedTextFileHandlerABC` | read/write | implemented |
| `rda` | `RdaFile` | `ScientificDatasetFileHandlerABC` | read/write | implemented |
| `rds` | `RdsFile` | `SingleDatasetScientificFileHandlerABC` | read/write | implemented |
| `sas7bdat` | `Sas7bdatFile` | `SingleDatasetScientificFileHandlerABC` | read-only | implemented |
| `sav` | `SavFile` | `SingleDatasetScientificFileHandlerABC` | read/write | implemented |
| `sqlite` | `SqliteFile` | `EmbeddedDatabaseFileHandlerABC` | read/write | implemented |
| `stub` | `StubFile` | `StubFileHandlerABC` | read/write | stub |
| `sylk` | `SylkFile` | `StubSingleDatasetScientificFileHandlerABC` | read/write | stub |
| `tab` | `TabFile` | `StandardDelimitedTextFileHandlerABC` | read/write | implemented |
| `toml` | `TomlFile` | `DictPayloadSemiStructuredTextFileHandlerABC` | read/write | implemented |
| `tsv` | `TsvFile` | `StandardDelimitedTextFileHandlerABC` | read/write | implemented |
| `txt` | `TxtFile` | `PlainTextFileHandlerABC` | read/write | implemented |
| `vm` | `VmFile` | `TemplateFileHandlerABC` | read/write | implemented |
| `wks` | `WksFile` | `StubSpreadsheetFileHandlerABC` | read/write | stub |
| `xls` | `XlsFile` | `ReadOnlySpreadsheetFileHandlerABC` | read-only | implemented |
| `xlsm` | `XlsmFile` | `SpreadsheetFileHandlerABC` | read/write | implemented |
| `xlsx` | `XlsxFile` | `SpreadsheetFileHandlerABC` | read/write | implemented |
| `xml` | `XmlFile` | `SemiStructuredTextFileHandlerABC` | read/write | implemented |
| `xpt` | `XptFile` | `SingleDatasetScientificFileHandlerABC` | read/write | implemented |
| `yaml` | `YamlFile` | `RecordPayloadSemiStructuredTextFileHandlerABC` | read/write | implemented |
| `zip` | `ZipFile` | `ArchiveWrapperFileHandlerABC` | read/write | implemented |
| `zsav` | `ZsavFile` | `StubSingleDatasetScientificFileHandlerABC` | read/write | stub |

#### Stubbed / Placeholder

| Format | Read | Write | Description |
| --- | --- | --- | --- |
| `stub` | N | N | Placeholder format for tests and future connectors. |

#### Tabular & Delimited Text

| Format | Read | Write | Description |
| --- | --- | --- | --- |
| `csv` | Y | Y | Comma-Separated Values |
| `dat` | Y | Y | Generic data file, often delimited or fixed-width |
| `fwf` | Y | Y | Fixed-Width Fields |
| `psv` | Y | Y | Pipe-Separated Values |
| `tab` | Y | Y | Often synonymous with TSV |
| `tsv` | Y | Y | Tab-Separated Values |
| `txt` | Y | Y | Plain text, often delimited or fixed-width |

#### Semi-Structured Text

| Format | Read | Write | Description |
| --- | --- | --- | --- |
| `cfg` | N | N | Config-style key-value pairs |
| `conf` | N | N | Config-style key-value pairs |
| `ini` | Y | Y | Config-style key-value pairs |
| `json` | Y | Y | JavaScript Object Notation |
| `ndjson` | Y | Y | Newline-Delimited JSON |
| `properties` | Y | Y | Java-style key-value pairs |
| `toml` | Y | Y | Tom's Obvious Minimal Language |
| `xml` | Y | Y | Extensible Markup Language |
| `yaml` | Y | Y | YAML Ain't Markup Language |

#### Columnar / Analytics-Friendly

| Format | Read | Write | Description |
| --- | --- | --- | --- |
| `arrow` | Y | Y | Apache Arrow IPC |
| `feather` | Y | Y | Apache Arrow Feather |
| `orc` | Y | Y | Optimized Row Columnar; common in Hadoop |
| `parquet` | Y | Y | Apache Parquet; common in Big Data |

#### Binary Serialization and Interchange

| Format | Read | Write | Description |
| --- | --- | --- | --- |
| `avro` | Y | Y | Apache Avro |
| `bson` | Y | Y | Binary JSON; common with MongoDB exports/dumps |
| `cbor` | Y | Y | Concise Binary Object Representation |
| `ion` | N | N | Amazon Ion |
| `msgpack` | Y | Y | MessagePack |
| `pb` | Y | Y | Protocol Buffers (Google Protobuf) |
| `pbf` | N | N | Protocolbuffer Binary Format; often for GIS data |
| `proto` | Y | Y | Protocol Buffers schema; often in .pb / .bin |

#### Databases and Embedded Storage

| Format | Read | Write | Description |
| --- | --- | --- | --- |
| `accdb` | N | N | Microsoft Access (newer format) |
| `duckdb` | Y | Y | DuckDB |
| `mdb` | N | N | Microsoft Access (older format) |
| `sqlite` | Y | Y | SQLite |

#### Spreadsheets

| Format | Read | Write | Description |
| --- | --- | --- | --- |
| `numbers` | N | N | Apple Numbers |
| `ods` | Y | Y | OpenDocument |
| `wks` | N | N | Lotus 1-2-3  |
| `xls` | Y | N | Microsoft Excel (BIFF; read-only) |
| `xlsm` | Y | Y | Microsoft Excel Macro-Enabled (Open XML) |
| `xlsx` | Y | Y | Microsoft Excel (Open XML) |

#### Statistical / Scientific / Numeric Computing

| Format | Read | Write | Description |
| --- | --- | --- | --- |
| `dta` | Y | Y | Stata |
| `hdf5` | Y | N | Hierarchical Data Format |
| `mat` | N | N | MATLAB |
| `nc` | Y | Y | NetCDF |
| `rda` | Y | Y | RData workspace/object |
| `rds` | Y | Y | R data |
| `sas7bdat` | Y | N | SAS data |
| `sav` | Y | Y | SPSS data |
| `sylk` | N | N | Symbolic Link |
| `xpt` | Y | Y | SAS Transport |
| `zsav` | N | N | Compressed SPSS data |

#### Logs and Event Streams

| Format | Read | Write | Description |
| --- | --- | --- | --- |
| `log` | Y | Y | Generic log file |

#### Data Archives

| Format | Read | Write | Description |
| --- | --- | --- | --- |
| `gz` | Y | Y | Gzip-compressed file |
| `zip` | Y | Y | ZIP archive |

#### Templates

| Format | Read | Write | Description |
| --- | --- | --- | --- |
| `hbs` | Y | Y | Handlebars |
| `jinja2` | Y | Y | Jinja2 |
| `mustache` | Y | Y | Mustache |
| `vm` | Y | Y | Apache Velocity |

## Usage

### Command Line Interface

ETLPlus provides a powerful CLI for ETL operations:

```bash
# Show help
etlplus --help

# Show version
etlplus --version
```

The CLI is implemented with Typer (Click-based). The legacy argparse parser has been removed, so
rely on the documented commands/flags and run `etlplus <command> --help` for current options.

#### Command Shapes

The core commands accept positional source and target arguments when you want to read from or write
to explicit paths or URIs. When you omit them, ETLPlus falls back to standard streams:

- **extract**: `etlplus extract [SOURCE]`
  - Omit `SOURCE` to read from STDIN.
- **transform**: `etlplus transform [SOURCE] [TARGET]`
  - Omit `SOURCE` to read from STDIN and omit `TARGET` to write to STDOUT.
- **load**: `etlplus load [TARGET]`
  - Omit `TARGET` to write to STDOUT.
- **validate**: `etlplus validate [SOURCE]`
  - Omit `SOURCE` to read from STDIN and use `--output` if you want file output instead of
    STDOUT.

Use `--source-format`, `--target-format`, `--source-type`, and `--target-type` to override the
usual inference rules when a filename, URI, or stream does not provide enough context.

#### Initialize A Starter Project

Use `etlplus init` to scaffold a minimal starter project with a sample pipeline and input data:

```bash
etlplus init demo-pipeline
cd demo-pipeline
etlplus check --config pipeline.yml --jobs
etlplus run --config pipeline.yml --job file_to_file_customers
```

#### Check Pipelines

Use `etlplus check` to explore pipeline YAML definitions without running them. The command can print
job names, summarize configured sources and targets, drill into specific sections, or run readiness
checks.

Inspect config contents:
```bash
etlplus check --config examples/configs/pipeline.yml --jobs
etlplus check --config examples/configs/pipeline.yml --summary
```

Show sources or transforms for troubleshooting:
```bash
etlplus check --config examples/configs/pipeline.yml --sources
etlplus check --config examples/configs/pipeline.yml --transforms
```

Run runtime and config readiness checks:
```bash
etlplus check --readiness
etlplus check --readiness --config examples/configs/pipeline.yml
etlplus check --readiness --strict --config examples/configs/pipeline.yml
```

Readiness warnings are advisory and still return exit code `0`. Fatal readiness errors, such as
unresolved required environment variables, missing blocking optional dependencies, or provider
bootstrap failures, return exit code `1`.

Validate dependency order before executing a DAG-shaped pipeline:
```bash
etlplus check --config examples/configs/pipeline.yml --graph
```

#### Render SQL DDL

Use `etlplus render` to turn table schema specs into ready-to-run SQL. Render from a pipeline config
or from a standalone schema file, and choose the built-in `ddl` or `view` templates (or provide your
own).

Render all tables defined in a pipeline:
```bash
etlplus render --config examples/configs/pipeline.yml --template ddl
```

Render a single table in that pipeline:
```bash
etlplus render --config examples/configs/pipeline.yml --table customers --template view
```

Render from a standalone table spec to a file:
```bash
etlplus render --spec schemas/customer.yml --template view -o temp/customer_view.sql
```

#### Extract Data

Note: For file sources, the format is normally inferred from the filename extension. Use
`--source-format` to override inference when a file lacks an extension or when you want to force a
specific parser.

Extract from JSON file:
```bash
etlplus extract examples/data/sample.json
```

Extract from CSV file:
```bash
etlplus extract examples/data/sample.csv
```

Extract from XML file:
```bash
etlplus extract examples/data/sample.xml
```

Extract from REST API:
```bash
etlplus extract https://api.example.com/data
```

Save extracted data to file:
```bash
etlplus extract examples/data/sample.csv > temp/sample_output.json
```

#### Validate Data

Validate data from file or JSON string:
```bash
etlplus validate '{"name": "John", "age": 30}' --rules '{"name": {"type": "string", "required": true}, "age": {"type": "number", "min": 0, "max": 150}}'
```

Validate from file:
```bash
etlplus validate examples/data/sample.json --rules '{"email": {"type": "string", "pattern": "^[\\w.-]+@[\\w.-]+\\.\\w+$"}}'
```

Validate JSON or YAML against a JSON Schema:
```bash
etlplus validate examples/data/sample.json --schema examples/schemas/customer.schema.json --schema-format jsonschema
etlplus validate --source-format yaml --schema examples/schemas/pipeline.schema.json --schema-format jsonschema -
```

When the source or schema path already makes the schema family clear, the CLI can infer it without
`--schema-format`:
```bash
etlplus validate examples/data/sample.json --schema examples/schemas/customer.schema.json
etlplus validate examples/data/sample.xml --schema examples/data/sample.xsd
```

Validate CSV against a Frictionless Table Schema:
```bash
etlplus validate data/customers.csv --schema examples/schemas/customers.table-schema.json --schema-format frictionless
```

Inference rules are intentionally narrow and predictable:

- An explicit `--schema-format` always wins
- `.xsd` schemas resolve to XSD validation
- JSON or YAML source hints resolve to JSON Schema validation
- CSV source hints resolve to Frictionless validation
- Ambiguous inline or STDIN cases require `--schema-format`

CSV schema failures preserve row and field paths in the same result envelope:
```json
{
  "valid": false,
  "errors": [
    "row[3].email: Row at position \"3\" has unique constraint violation in field \"email\" at position \"1\": the same as in the row at position 2",
    "row[3].status: The cell \"\" in row at position \"3\" and field \"status\" at position \"2\" does not conform to a constraint: constraint \"required\" is \"True\""
  ],
  "field_errors": {
    "row[3].email": [
      "Row at position \"3\" has unique constraint violation in field \"email\" at position \"1\": the same as in the row at position 2"
    ],
    "row[3].status": [
      "The cell \"\" in row at position \"3\" and field \"status\" at position \"2\" does not conform to a constraint: constraint \"required\" is \"True\""
    ]
  },
  "data": null
}
```

#### Transform Data

When piping data through `etlplus transform`, the following rules apply:

- Use `--source-format` whenever the SOURCE argument is `-` or a literal payload, mirroring the
  `etlplus extract` semantics.
- When TARGET is omitted or set to `-`, `etlplus transform` emits JSON to STDOUT.
- When TARGET is a file path or file URI, the transformed payload is written directly.
- When TARGET is an API or database target and you provide `--target-type`, the command delegates
  the transformed payload to `etlplus load` and prints the downstream load result envelope.
- `--target-format` affects file targets and delegated load targets that honor a format hint.
- Use `--source-type` and `--target-type` to override the inferred source and target connector
  types, matching the `etlplus extract`/`etlplus load` behavior.

Transform file inputs while overriding connector types:
```bash
etlplus transform \
  --operations '{"select": ["name", "email"]}' \
  examples/data/sample.json  --source-type file \
  temp/selected_output.json --target-type file
```

Filter and select fields:
```bash
etlplus transform \
  --operations '{"filter": {"field": "age", "op": "gt", "value": 26}, "select": ["name"]}' \
  '[{"name": "John", "age": 30}, {"name": "Jane", "age": 25}]'
```

Sort data:
```bash
etlplus transform \
  --operations '{"sort": {"field": "age", "reverse": true}}' \
  examples/data/sample.json
```

Aggregate data:
```bash
etlplus transform \
  --operations '{"aggregate": {"field": "age", "func": "sum"}}' \
  examples/data/sample.json
```

Map/rename fields:
```bash
etlplus transform \
  --operations '{"map": {"name": "new_name"}}' \
  examples/data/sample.json
```

Send transformed data to a REST API through the load path:
```bash
etlplus transform \
  --operations '{"select": ["name", "email"]}' \
  examples/data/sample.json \
  https://api.example.com/customers --target-type api
```

Database targets use the same delegated load path, but the current database load implementation is
still a documented placeholder.

#### Inspect Run History

`etlplus run` persists local run history keyed by `run_id`. DAG-aware runs keep a compact aggregate
summary on the parent run row and also persist one per-job history row for each executed job. Use
the read/query commands to inspect that history without opening the backend directly.

List recent normalized runs:
```bash
etlplus history --job file_to_file_customers --status succeeded --limit 10 --table
```

List recent normalized job rows from DAG-aware runs:
```bash
etlplus history --level job --pipeline customer_sync --limit 10 --table
```

Show the latest matching run:
```bash
etlplus status --job file_to_file_customers
```

Show the latest matching job row:
```bash
etlplus status --level job --job file_to_file_customers
```

Stream raw run-level history events:
```bash
etlplus log --run-id 8e4a33d7 --follow
```

Stream raw job-level history events:
```bash
etlplus log --level job --pipeline customer_sync --status skipped --follow
```

Aggregate grouped history metrics:
```bash
etlplus report --group-by day --since 2026-03-01T00:00:00Z --table
```

Aggregate per-job history by pipeline:
```bash
etlplus report --level job --group-by pipeline --since 2026-03-01T00:00:00Z --table
```

#### Load Data

`etlplus load` consumes JSON from STDIN; provide only the target argument plus optional flags.

Load to JSON file:
```bash
etlplus extract examples/data/sample.json \
  | etlplus load temp/sample_output.json --target-type file
```

Load to CSV file:
```bash
etlplus extract examples/data/sample.csv \
  | etlplus load temp/sample_output.csv --target-type file
```

Load to REST API:
```bash
cat examples/data/sample.json \
  | etlplus load https://api.example.com/endpoint --target-type api
```

### Python API

Use ETLPlus as a Python library:

```python
from etlplus.ops import extract, validate, transform, load

# Extract data
data = extract("file", "data.json")

# Validate data
validation_rules = {
    "name": {"type": "string", "required": True},
    "age": {"type": "number", "min": 0, "max": 150}
}
result = validate(data, validation_rules)
if result["valid"]:
    print("Data is valid!")

# Transform data
operations = {
    "filter": {"field": "age", "op": "gt", "value": 18},
    "select": ["name", "email"]
}
transformed = transform(data, operations)

# Load data
load(transformed, "file", "temp/sample_output.json", file_format="json")
```

For YAML-driven pipelines executed end-to-end (extract → validate → transform → load), see:

- Authoring: [`docs/pipeline-guide.md`](docs/pipeline-guide.md)
- Runner API and internals: see `etlplus.ops.run` docstrings and `docs/pipeline-guide.md`.

CLI quick reference for pipelines:

```bash
# List jobs or show a pipeline summary
etlplus check --config examples/configs/pipeline.yml --jobs
etlplus check --config examples/configs/pipeline.yml --summary
etlplus check --config examples/configs/pipeline.yml --graph

# Run a job
etlplus run --config examples/configs/pipeline.yml --job file_to_file_customers

# Run every configured job in DAG order
etlplus run --config examples/configs/pipeline.yml --all

# Run a job and emit structured events to STDERR
etlplus run --config examples/configs/pipeline.yml --job file_to_file_customers --event-format jsonl
```

Structured events use the stable `etlplus.event.v1` envelope. Additive fields may appear over time,
but breaking field/lifecycle changes require a schema version bump.

### Complete ETL Pipeline Example

```bash
# 1. Extract from CSV
etlplus extract examples/data/sample.csv > temp/sample_extracted.json

# 2. Transform (filter and select fields)
etlplus transform \
  --operations '{"filter": {"field": "age", "op": "gt", "value": 25}, "select": ["name", "email"]}' \
  temp/sample_extracted.json \
  temp/sample_transformed.json

# 3. Validate transformed data
etlplus validate \
  --rules '{"name": {"type": "string", "required": true}, "email": {"type": "string", "required": true}}' \
  temp/sample_transformed.json

# 4. Load to CSV
cat temp/sample_transformed.json \
  | etlplus load temp/sample_output.csv
```

### Format Overrides

`--source-format` and `--target-format` override whichever format would normally be inferred from a
file extension. This is useful when an input lacks an extension (for example, `records.txt` that
actually contains CSV) or when you intentionally want to treat a file as another format.

Examples (zsh):

```zsh
# Force CSV parsing for an extension-less file
etlplus extract data.txt --source-type file --source-format csv

# Write CSV to a file without the .csv suffix
etlplus load output.bin --target-type file --target-format csv < data.json

# Leave the flags off when extensions already match the desired format
etlplus extract data.csv --source-type file
etlplus load output.json --target-type file < data.json
```

## Transformation Operations

### Filter Operations

Supported operators:
- `eq`: Equal
- `ne`: Not equal
- `gt`: Greater than
- `gte`: Greater than or equal
- `lt`: Less than
- `lte`: Less than or equal
- `in`: Value in list
- `contains`: List/string contains value

Example:
```json
{
  "filter": {
    "field": "status",
    "op": "in",
    "value": ["active", "pending"]
  }
}
```
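
The same operation specification can be passed to the documented `transform(data, operations)`
helper from Python; a minimal sketch over in-memory records:

```python
from etlplus.ops import transform

records = [
    {"name": "Ada", "status": "active"},
    {"name": "Bob", "status": "archived"},
    {"name": "Cai", "status": "pending"},
]

# Keep only records whose status is in the allowed list.
operations = {"filter": {"field": "status", "op": "in", "value": ["active", "pending"]}}
print(transform(records, operations))  # expect Ada and Cai to remain
```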

### Aggregation Functions

Supported functions:
- `sum`: Sum of values
- `avg`: Average of values
- `min`: Minimum value
- `max`: Maximum value
- `count`: Count of values

Example:
```json
{
  "aggregate": {
    "field": "revenue",
    "func": "sum"
  }
}
```
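
Aggregations run through the same `transform(data, operations)` helper; a minimal sketch summing a
numeric field:

```python
from etlplus.ops import transform

records = [
    {"region": "east", "revenue": 120.0},
    {"region": "west", "revenue": 80.5},
]

# Sum the revenue field across all records and print whatever envelope is returned.
operations = {"aggregate": {"field": "revenue", "func": "sum"}}
print(transform(records, operations))
```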

## Validation Rules

Supported validation rules:
- `type`: Data type (string, number, integer, boolean, array, object)
- `required`: Field is required (true/false)
- `min`: Minimum value for numbers
- `max`: Maximum value for numbers
- `minLength`: Minimum length for strings
- `maxLength`: Maximum length for strings
- `pattern`: Regex pattern for strings
- `enum`: List of allowed values

Schema-based validation is also supported through `etlplus validate --schema ...`. Use
`--schema-format xsd` for XML documents, `--schema-format jsonschema` for JSON or YAML documents,
and `--schema-format frictionless` for CSV documents. When the file path already makes the schema
family unambiguous, ETLPlus can infer it; ambiguous inline text and STDIN cases still require an
explicit schema format.

Example:
```json
{
  "email": {
    "type": "string",
    "required": true,
    "pattern": "^[\\w.-]+@[\\w.-]+\\.\\w+$"
  },
  "age": {
    "type": "number",
    "min": 0,
    "max": 150
  },
  "status": {
    "type": "string",
    "enum": ["active", "inactive", "pending"]
  }
}
```
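
The same rules can be applied from Python through the documented `validate(data, rules)` helper; a
minimal sketch that checks the `valid` flag of the result envelope:

```python
from etlplus.ops import validate

rules = {
    "email": {"type": "string", "required": True, "pattern": r"^[\w.-]+@[\w.-]+\.\w+$"},
    "age": {"type": "number", "min": 0, "max": 150},
    "status": {"type": "string", "enum": ["active", "inactive", "pending"]},
}

record = {"email": "ada@example.com", "age": 36, "status": "active"}
result = validate(record, rules)
if not result["valid"]:
    print(result)  # the envelope also carries error details for inspection
```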

## Development

### API Client Docs

Looking for the HTTP client and pagination helpers? See the dedicated docs in
`etlplus/api/README.md` (an illustrative sketch follows the list below) for:

- Quickstart with `EndpointClient`
- Authentication via `EndpointCredentialsBearer`
- Pagination with `PaginationConfig` (page and cursor styles)
- Tips on `records_path` and `cursor_path`
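
A hedged sketch of how those pieces might fit together. The class names come from the docs above,
but every constructor argument and keyword below is an assumption, not the actual signature; treat
`etlplus/api/README.md` as the source of truth:

```python
# Hypothetical sketch: import path, positional arguments, and keyword names are
# assumptions for illustration only.
from etlplus.api import EndpointClient, EndpointCredentialsBearer, PaginationConfig

credentials = EndpointCredentialsBearer(token="...")      # assumed keyword
pagination = PaginationConfig(records_path="data.items")  # assumed keyword
client = EndpointClient(
    "https://api.example.com",                            # assumed base URL argument
    credentials=credentials,                              # assumed keyword
    pagination=pagination,                                # assumed keyword
)
```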

### Runner Internals and Connectors

Curious how the pipeline runner composes API requests, pagination, and load calls?

- Runner overview and helpers: see `etlplus.ops.run` docstrings and
  [`docs/pipeline-guide.md`](docs/pipeline-guide.md)
- Unified "connector" vocabulary (API/File/DB): `etlplus/connector`
  - API/file targets reuse the same shapes as sources; API targets typically set a `method`.

### Running Tests

For local CI parity and full coverage of remaining optional file formats,
install:

```bash
pip install -e ".[dev,file]"
```

```bash
# Lightweight run (uses currently installed extras)
pytest

# Full run with remaining optional file-format dependencies
make test-full
```

#### Test Scope and Intent

ETLPlus organizes tests by scope and uses markers for cross-cutting intent.

- **Scope folders**:
  - Unit (`tests/unit/`): isolated function/class behavior, no external services.
  - Integration (`tests/integration/`): cross-module and boundary behavior.
  - E2E (`tests/e2e/`): full workflow/system-boundary behavior.
- **Intent markers**:
  - `smoke`: go/no-go viability checks.
  - `contract`: interface/metadata compatibility checks.

Smoke tests are now treated as an intent marker rather than a primary folder. The legacy path
migration is complete; smoke tests live under scope folders and are selected by marker.

If a test calls `etlplus.cli.main()` or `etlplus.ops.run.run()`, it is integration by default.
Detailed criteria and marker conventions: [`CONTRIBUTING.md#testing`](CONTRIBUTING.md#testing),
[`tests/README.md`](tests/README.md).

### Code Coverage

```bash
pytest tests/unit tests/integration tests/e2e --cov=etlplus --cov-report=html
```

### Linting

```bash
make lint
make doclint
make fmt
make typecheck
```

`make lint` runs the Ruff-based source checks used in CI, `make doclint` runs `pydocstyle` and
`pydoclint`, `make fmt` applies the supported Ruff-plus-`autopep8` formatting path, and `make
typecheck` runs `mypy` against the shipped package. ETLPlus no longer maintains separate Black or
Flake8 contributor paths; Ruff is the authoritative lint gate and `autopep8` remains as the
compatibility formatter used by CI and pre-commit. `.ruff.toml` is the canonical line-length source,
and any duplicated formatter width in supporting tooling is expected to match it. If an external
tool still invokes Flake8, the repository `.flake8` file exists only as a compatibility shim for the
overlapping basics that Flake8 can understand.

### Updating Demo Snippets

`DEMO.md` shows the real output of `etlplus --version` captured from a freshly built wheel. Regenerate
the snippet (and the companion file [docs/snippets/installation_version.md](docs/snippets/installation_version.md)) after changing anything that affects the version string:

```bash
make demo-snippets
```

The helper script in [tools/update_demo_snippets.py](tools/update_demo_snippets.py) builds the wheel,
installs it into a throwaway virtual environment, runs `etlplus --version`, and rewrites the snippet
between the markers in [DEMO.md](DEMO.md).

### Releasing to the Python Package Index (PyPI)

`setuptools-scm` derives the package version from Git tags, so publishing is now entirely tag
driven—no hand-editing `pyproject.toml`, `setup.py`, or `etlplus/__version__.py`.

GitHub Releases is the canonical release-history surface for ETLPlus: tagged releases appear there
first as developer previews and announcements, and PyPI follows as the public package-install
channel. The docs changelog page links to GitHub Releases, and the maintainer-facing release text is
drafted from the template and category config in the `.github/` folder.

1. Ensure `main` is green and the release notes/docs are up to date.
2. Create and push a SemVer tag matching the `v*.*.*` pattern:

    ```bash
    git tag -a v1.4.0 -m "Release v1.4.0"
    git push origin v1.4.0
    ```

3. GitHub Actions runs the tagged release workflow in [.github/workflows/release.yml][release wf],
   builds the sdist/wheel, validates the artifacts, validates the tagged docs build, publishes the
   GitHub Release, and then publishes to [PyPI][PyPI].
4. Draft the GitHub Release notes using [.github/RELEASE-NOTES-TEMPLATE.md][release notes] together
   with the categorized notes configured in [.github/release.yml][release cfg].

The tagged docs publication itself is handled by the Read the Docs GitHub App after the tag push;
the release workflow only validates that the docs build cleanly from the tagged source.

If you want an extra smoke-test before tagging, run `make dist && pip install dist/*.whl` locally;
this exercises the same build path the workflow uses.

## License

This project is licensed under the [MIT License](LICENSE).

## Contributing

Code and codeless contributions are welcome!  If you’d like to add a new feature, fix a bug, or
improve the documentation, please feel free to submit a pull request as follows:

1. Fork this repository.
2. Create a new feature branch for your changes (`git checkout -b feature/feature-name`).
3. Commit your changes (`git commit -m "Add feature"`).
4. Push to your branch (`git push origin feature/feature-name`).
5. Submit a pull request with a detailed description.

If you choose to be a code contributor, please first refer to these documents:

- Pipeline authoring guide: [`docs/pipeline-guide.md`](docs/pipeline-guide.md)
- Design notes (Mapping inputs, dict outputs):
  [`docs/pipeline-guide.md#design-notes-mapping-inputs-dict-outputs`](docs/pipeline-guide.md#design-notes-mapping-inputs-dict-outputs)
- Typing philosophy (TypedDicts as editor hints, permissive runtime):
  [`CONTRIBUTING.md#typing-philosophy`](CONTRIBUTING.md#typing-philosophy)

Valuable non-code contributions include:

- Improving or correcting documentation
- Reporting bugs with clear reproduction steps
- Testing releases and platform-specific behavior
- Proposing examples, tutorials, and workflow patterns
- Answering questions in GitHub Discussions
- Sponsoring the project through [GitHub Sponsors][GitHub Sponsors] or [Buy Me a Coffee][Buy Me a
  Coffee]

## Documentation

### Python Packages/Subpackage

Navigate to detailed documentation for each subpackage:

- [etlplus.api](etlplus/api/README.md): Lightweight HTTP client and paginated REST helpers
- [etlplus.cli](etlplus/cli/README.md): Command-line interface definitions for `etlplus`
- [etlplus.database](etlplus/database/README.md): Database engine, schema, and ORM helpers
- [etlplus.file](etlplus/file/README.md): Unified file format support and helpers
- [etlplus.storage](etlplus/storage/README.md): Storage location parsing and backend helpers
- [etlplus.ops](etlplus/ops/README.md): Extract/validate/transform/load primitives
- [etlplus.templates](etlplus/templates/README.md): SQL and DDL template helpers
- [etlplus.workflow](etlplus/workflow/README.md): Helpers for data connectors, pipelines, jobs, and
  profiles

### Community Health

- [Contributing Guidelines](CONTRIBUTING.md): How to contribute, report issues, and submit PRs
- [Code of Conduct](CODE_OF_CONDUCT.md): Community standards and expectations
- [Security Policy](SECURITY.md): Responsible disclosure and vulnerability reporting
- [Support](SUPPORT.md): Where to get help

### Other

- API client docs: [`etlplus/api/README.md`](etlplus/api/README.md)
- Examples: [`examples/README.md`](examples/README.md)
- File handler matrix guardrail: [`docs/file-handler-matrix.md`](docs/file-handler-matrix.md)
- Pipeline authoring guide: [`docs/pipeline-guide.md`](docs/pipeline-guide.md)
- Runner internals: see `etlplus.ops.run` docstrings and [`docs/pipeline-guide.md`](docs/pipeline-guide.md)
- Design notes (Mapping inputs, dict outputs): [`docs/pipeline-guide.md#design-notes-mapping-inputs-dict-outputs`](docs/pipeline-guide.md#design-notes-mapping-inputs-dict-outputs)
- Typing philosophy: [`CONTRIBUTING.md#typing-philosophy`](CONTRIBUTING.md#typing-philosophy)
- Demo and walkthrough: [`DEMO.md`](DEMO.md)
- Additional references: [`REFERENCES.md`](REFERENCES.md)

## Acknowledgments

ETLPlus is inspired by common data engineering workflows and Python software engineering patterns,
aiming to increase productivity and reduce boilerplate code. Feedback and contributions are always
appreciated!

[Buy Me a Coffee]: https://buymeacoffee.com/djrlj694
[Codecov project]: https://codecov.io/github/Dagitali/ETLPlus?branch=main
[GitHub Actions CI workflow]: https://github.com/Dagitali/ETLPlus/actions/workflows/ci.yml
[GitHub Sponsors]: https://github.com/sponsors/Dagitali
[GitHub contributors]: https://github.com/Dagitali/ETLPlus/graphs/contributors
[GitHub issues]: https://github.com/Dagitali/ETLPlus/issues
[GitHub PRs]: https://github.com/Dagitali/ETLPlus/pulls
[GitHub release]: https://github.com/Dagitali/ETLPlus/releases
[PyPI]: https://pypi.org
[PyPI package]: https://pypi.org/project/etlplus/
[release cfg]: .github/release.yml
[release notes]: .github/RELEASE-NOTES-TEMPLATE.md
[release wf]: .github/workflows/release.yml
