Metadata-Version: 2.3
Name: polars-hist-db
Version: 0.6.0
Summary: dsv --> dataframe <--> bitemporal-tables
Author: jr200
Author-email: jayshan+git@gmail.com
Requires-Python: >=3.12,<3.13
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Provides-Extra: nats
Provides-Extra: sqlachemy
Requires-Dist: nats-py (>=2.10.0,<3.0.0) ; extra == "nats"
Requires-Dist: pandas (>=2.3.0,<3.0.0)
Requires-Dist: polars[sqlachemy] (==1.30.0) ; extra == "sqlachemy"
Requires-Dist: pyarrow (>=19.0.1,<20.0.0)
Requires-Dist: pymysql (>=1.1.1,<2.0.0)
Requires-Dist: pytz (>=2025.2,<2026.0)
Requires-Dist: pyyaml (>=6.0.2,<7.0.0)
Requires-Dist: scandir-rs (==2.8.0)
Requires-Dist: sql-metadata (>=2.17.0,<3.0.0)
Requires-Dist: sqlalchemy[asyncio] (==2.0.40)
Project-URL: Repository, https://github.com/jr200/polars-hist-db
Description-Content-Type: text/markdown

# polars-hist-db

This library scrapes data from CSV-style (delimiter-separated) files into temporal tables in MariaDB.

Main features are:
- Uploading data from strongly-typed Polars DataFrames.
- Querying data into Polars DataFrames, with column types inferred from the database schema.
- A scrape specification that:
    - Defines pipelines for typing, enriching, and normalizing data before uploading.
    - Allows construction of the 'as-of' time from file attributes or as a function of the input columns.
    - Catalogs the history of scrape inputs to prevent duplication.
    - Supports per-file transactional scraping (either the processing for a file succeeds, or the transaction is rolled back).
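
The cataloging and per-file transaction ideas in the list above can be illustrated with a minimal, standard-library sketch. It is not this library's API: it uses an in-memory SQLite database as a stand-in for MariaDB, and the names `make_db`, `scrape_file`, `scrape_catalog`, and `prices` are hypothetical. The 'as-of' time is derived from a file attribute (the modification time), one of the options the scrape specification describes.

```python
import csv
import io
import sqlite3
from datetime import datetime, timezone


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a catalog table and a data table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE scrape_catalog (file_name TEXT PRIMARY KEY, as_of TEXT)")
    conn.execute("CREATE TABLE prices (symbol TEXT, price REAL, as_of TEXT)")
    return conn


def scrape_file(conn: sqlite3.Connection, file_name: str, mtime: float, text: str) -> str:
    """Load one delimited file atomically.

    Returns "loaded", "skipped" (already cataloged), or "failed" (rolled back).
    """
    # Hypothetical choice: derive the as-of time from the file's modification time.
    as_of = datetime.fromtimestamp(mtime, tz=timezone.utc).isoformat()
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            seen = conn.execute(
                "SELECT 1 FROM scrape_catalog WHERE file_name = ?", (file_name,)
            ).fetchone()
            if seen:
                return "skipped"  # the catalog prevents loading the same input twice
            conn.execute("INSERT INTO scrape_catalog VALUES (?, ?)", (file_name, as_of))
            for row in csv.DictReader(io.StringIO(text)):
                conn.execute(
                    "INSERT INTO prices VALUES (?, ?, ?)",
                    (row["symbol"], float(row["price"]), as_of),
                )
        return "loaded"
    except (KeyError, ValueError):
        # A bad row undoes the catalog entry and every row inserted for this file.
        return "failed"
```

In this sketch, calling `scrape_file` twice with the same file name loads the rows once and skips the second call, while a file containing a malformed row leaves both tables unchanged.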

## Development Setup

1. Install the NATS server (the command below uses Homebrew):
```bash
brew install nats-server
```

2. Create a virtual environment:
```bash
python3 -m venv .venv
source .venv/bin/activate
```

3. Install development dependencies:
```bash
poetry install --with dev
```

4. Run tests:
```bash
poetry run pytest
```

5. Build the docs. The documentation will be generated in the ``docs/_build/html`` directory:
```bash
cd docs && poetry run make html
```

## Code Style

This project follows these code style guidelines:

* Use type hints for all function parameters and return values
* Follow PEP 8 style guide
* Use Google-style docstrings
* Keep functions focused and single-purpose
* Write comprehensive tests for new features

Run ``make check`` to check the code style.
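
As an illustration of the guidelines above, here is a small function with full type hints and a Google-style docstring. The function `as_of_from_filename` is a hypothetical example and is not part of this project's code.

```python
import re
from datetime import date


def as_of_from_filename(file_name: str) -> date:
    """Extract the as-of date embedded in a file name.

    Args:
        file_name: A file name containing an ISO date,
            for example ``prices_2024-01-31.csv``.

    Returns:
        The first ISO date found in the file name.

    Raises:
        ValueError: If the file name contains no valid ISO date.
    """
    match = re.search(r"\d{4}-\d{2}-\d{2}", file_name)
    if match is None:
        raise ValueError(f"no ISO date in file name: {file_name!r}")
    # date.fromisoformat also raises ValueError for impossible dates such as 2024-13-40.
    return date.fromisoformat(match.group())
```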


## Contributing

1. Fork the repository
2. Create a feature branch
3. Commit your changes
4. Push to the branch
5. Create a Pull Request

## License

This project is licensed under the terms specified in the LICENSE file.

## References

- [Polars Documentation](https://docs.pola.rs/api/python/stable/reference/index.html)
- [SQLAlchemy Core Documentation](https://docs.sqlalchemy.org/en/20/core/index.html)
- [MariaDB Bitemporal Tables](https://mariadb.com/kb/en/bitemporal-tables)


