Metadata-Version: 2.4
Name: astro-pipeline
Version: 1.0.0
Summary: CLI tool and library for CSV import pipelines
Project-URL: Homepage, https://github.com/starlincs/astro
Project-URL: Documentation, https://astro-pipeline.readthedocs.io
Project-URL: Repository, https://github.com/starlincs/astro
Project-URL: Issues, https://github.com/starlincs/astro/issues
Project-URL: Changelog, https://github.com/starlincs/astro/blob/main/CHANGELOG.md
Project-URL: Download, https://pypi.org/project/astro-pipeline/
Author: Tom
Maintainer: Tom
License-Expression: MIT
License-File: LICENSE
Keywords: cli,csv,etl,pipeline,polars
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: System :: Archiving
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: pandera>=0.20
Requires-Dist: polars>=1.0
Requires-Dist: pyarrow>=16.0
Requires-Dist: pydantic>=2.0
Requires-Dist: rich>=13.0
Requires-Dist: typer>=0.12
Provides-Extra: dev
Requires-Dist: build>=1.0; extra == 'dev'
Requires-Dist: hypothesis>=6.0; extra == 'dev'
Requires-Dist: pre-commit>=4.0; extra == 'dev'
Requires-Dist: pytest-cov>=6.0; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.4; extra == 'dev'
Requires-Dist: twine>=6.0; extra == 'dev'
Requires-Dist: ty<0.1,>=0.0.1; extra == 'dev'
Provides-Extra: docs
Requires-Dist: furo>=2024.0; extra == 'docs'
Requires-Dist: myst-parser>=4.0; extra == 'docs'
Requires-Dist: sphinx-autobuild>=2024.0; extra == 'docs'
Requires-Dist: sphinx>=8.0; extra == 'docs'
Description-Content-Type: text/markdown

<p align="center">
  <img src="docs/banner.png" alt="Astro pipeline" width="100%">
</p>

# Astro

CLI tool and library for CSV import pipelines.

## Documentation

Full guides, API reference, and contributing docs are at **[https://astro-pipeline.readthedocs.io](https://astro-pipeline.readthedocs.io)**.

To build the docs locally, run `pip install -e ".[docs]"` and then `make docs`.

| Topic | Where to read |
|-------|---------------|
| Install and quickstart | [docs/getting-started/](docs/getting-started/) |
| Pipeline authoring | [docs/user-guide/pipelines.md](docs/user-guide/pipelines.md) |
| CLI reference | [docs/user-guide/cli.md](docs/user-guide/cli.md) |
| Contributing | [docs/contributing/](docs/contributing/) |
| Behavioural spec (implementers) | [SPEC.md](SPEC.md) |

## Install

```bash
pip install astro-pipeline
```

The PyPI distribution is named `astro-pipeline` because the name `astro` was already taken. The CLI command and the import name are both still `astro`.

From source:

```bash
git clone https://github.com/starlincs/astro.git
cd astro
pip install -e ".[dev]"
```

## What Astro does

Astro ingests CSV directories into validated Parquet snapshots, then runs ordered pipeline steps with statistics, filtering, and row quarantine. Each pipeline lives in an external repository as a `pipeline.py` file.

```bash
astro ingest path/to/data/
astro run
astro describe
astro list
```

See the [quickstart](docs/getting-started/quickstart.md) and [CLI reference](docs/user-guide/cli.md) for full usage, including cleanup, quarantine retry, and large-file behaviour.
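
To make "ordered pipeline steps with statistics, filtering, and row quarantine" concrete, here is a minimal standard-library sketch of the idea. It is not Astro's API: `Step`, `RunResult`, `require_numeric`, and `run_steps` are hypothetical names chosen for illustration, and the real interface is described in the pipeline authoring guide.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical types for illustration only; Astro's real step interface may differ.
Row = dict[str, str]
# A step takes rows and returns (kept_rows, rejected_rows).
Step = Callable[[list[Row]], tuple[list[Row], list[Row]]]


@dataclass
class RunResult:
    rows: list[Row]                                   # rows that survived every step
    quarantined: list[tuple[str, Row]] = field(default_factory=list)  # (step name, row)
    stats: dict[str, int] = field(default_factory=dict)               # rows kept per step


def require_numeric(column: str) -> Step:
    """Build a step that rejects rows whose `column` is missing or not a number."""
    def step(rows: list[Row]) -> tuple[list[Row], list[Row]]:
        kept, rejected = [], []
        for row in rows:
            try:
                float(row[column])
            except (KeyError, ValueError):
                rejected.append(row)
            else:
                kept.append(row)
        return kept, rejected
    return step


def run_steps(rows: list[Row], steps: list[tuple[str, Step]]) -> RunResult:
    """Apply steps in order; rejected rows are set aside rather than dropped."""
    result = RunResult(rows=list(rows))
    for name, step in steps:
        kept, rejected = step(result.rows)
        result.quarantined.extend((name, row) for row in rejected)
        result.stats[name] = len(kept)
        result.rows = kept
    return result
```

Calling `run_steps(rows, [("numeric_mass", require_numeric("mass"))])` on a list of row dictionaries returns the surviving rows together with the quarantined rows, each tagged with the step that rejected it. Keeping the rejected rows alongside the step name is what would make a later retry possible.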

## Security

Astro loads and executes `pipeline.py` from the directory you point it at. Only run Astro against pipeline repositories you trust. See the [security model](docs/getting-started/introduction.md#security-model) in the docs.
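
The reason for this warning is general to Python: loading a source file by path runs every top-level statement in it. The sketch below shows that with the standard `importlib` machinery. It is an illustration of the mechanism, not Astro's actual loader; `load_module_from_path` and `demo` are hypothetical helpers.

```python
import importlib.util
import tempfile
from pathlib import Path
from types import ModuleType


def load_module_from_path(path: Path, name: str = "pipeline") -> ModuleType:
    """Load a Python source file as a module (hypothetical helper, not Astro's loader)."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {path}")
    module = importlib.util.module_from_spec(spec)
    # exec_module runs the file's top-level code with the caller's privileges.
    spec.loader.exec_module(module)
    return module


def demo() -> list[str]:
    """Write a throwaway pipeline.py, load it, and return what it set at import time."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "pipeline.py"
        target.write_text("SIDE_EFFECTS = ['ran at import time']\n")
        module = load_module_from_path(target)
        return module.SIDE_EFFECTS
```

Here the loaded file only assigns a list, but the same top-level position could hold any code, such as file deletion or network access, which is why the pipeline repository itself has to be trusted.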

## Development

```bash
make check    # lint + format + typecheck + tests
make cov      # include large-file integration tests
```

See [docs/contributing/](docs/contributing/) for the test-first workflow and release process.

## License

MIT. See [LICENSE](LICENSE).
