Metadata-Version: 2.2
Name: disco-data-logger
Version: 0.1.0
Summary: Data logger for Disco simulations
Author: Michiel Jansen
License: MIT
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: C++
Classifier: Operating System :: OS Independent
Requires-Python: >=3.11
Requires-Dist: numpy>=1.23
Provides-Extra: dev
Requires-Dist: pytest>=8; extra == "dev"
Requires-Dist: pytest-cov>=5; extra == "dev"
Requires-Dist: mypy>=1.10; extra == "dev"
Requires-Dist: cibuildwheel>=2.20; extra == "dev"
Requires-Dist: pybind11-stubgen>=2.5; extra == "dev"
Provides-Extra: parquet
Requires-Dist: pyarrow>=14; extra == "parquet"
Description-Content-Type: text/markdown

# 🧾 disco-data-logger

**High-performance, C++/NumPy-backed data logger**  
for **Disco** discrete-event and Monte Carlo simulation programs.

[![PyPI](https://img.shields.io/pypi/v/disco-data-logger.svg)](https://pypi.org/project/disco-data-logger/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![Build](https://github.com/michielmj/disco-data-logger/actions/workflows/build.yml/badge.svg)](https://github.com/michielmj/disco-data-logger/actions)
[![Tests](https://github.com/michielmj/disco-data-logger/actions/workflows/test.yml/badge.svg)](https://github.com/michielmj/disco-data-logger/actions)

---

## Overview

`disco-data-logger` provides a **fast, compressed, and lightweight data recording layer**  
for large-scale **Disco** simulations and other computational experiments.  

It is optimized for capturing **sparse numerical state updates** and **accumulators** during simulation runs,  
and writing them efficiently to disk as **Zstandard-compressed segment files**.  
Each simulation entity or measurement can log its data independently through labeled streams.

It combines:
- A **C++/pybind11 core** for high-throughput buffering and compression.
- A **Python API** for easy stream registration and control.
- Optional **Parquet export** for analysis and aggregation after runs.

---

## ✨ Features

- **Sparse vector logging** with index/value arrays (`uint32`, `float64`).
- **Fixed-point quantization** for compact and deterministic encoding.
- **Buffered, lock-free write path** (ring buffer + writer thread).
- **Zstandard compression** (vendored, no external dependencies).
- **Segment rotation** for large simulation outputs.
- **JSON metadata** for each stream (`organisation`, `model`, `experiment`, …).
- **Optional Parquet export** for post-run analytics.
- **MIT-licensed** and designed for in-cluster (on-disk/in-memory) use.
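
To make the first two features concrete, here is a minimal NumPy sketch of sparse index/value encoding followed by fixed-point quantization. It is an illustration only and does not use the `disco-data-logger` API: the helper names (`to_sparse`, `quantize`, `dequantize`), the `SCALE` value, and the `int64` storage type are assumptions, and the library's actual encoding may differ.

```python
import numpy as np

# Hypothetical resolution of 1e-6 per step; not taken from disco-data-logger.
SCALE = 1_000_000


def to_sparse(dense):
    """Return (indices, values) for the non-zero entries of a dense vector."""
    dense = np.asarray(dense, dtype=np.float64)
    indices = np.flatnonzero(dense).astype(np.uint32)
    values = dense[indices]
    return indices, values


def quantize(values, scale=SCALE):
    """Round float64 values to integer multiples of 1/scale (assumed int64)."""
    scaled = np.asarray(values, dtype=np.float64) * scale
    return np.rint(scaled).astype(np.int64)


def dequantize(quantized, scale=SCALE):
    """Map fixed-point integers back to float64."""
    return np.asarray(quantized, dtype=np.float64) / scale


dense = np.array([0.0, 1.25, 0.0, 0.0, -3.5, 0.0])
indices, values = to_sparse(dense)   # positions and values of non-zero entries
quantized = quantize(values)         # integers are identical on every platform
restored = dequantize(quantized)     # exact here; in general within 0.5 / SCALE
```

Storing integers rather than raw floats is what makes the encoding deterministic: the same input always yields the same bytes, and runs of small integers tend to compress well.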

---

## 🚀 Installation

```bash
pip install disco-data-logger
```

For Parquet export support:

```bash
pip install "disco-data-logger[parquet]"
```
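
If your own scripts need to know whether the optional extra is present, one way is to check for `pyarrow` (the dependency pulled in by `[parquet]`) before attempting an export. This is a sketch using only the standard library; `parquet_available` is a hypothetical helper, not part of `disco-data-logger`, which may handle the missing dependency differently.

```python
from importlib.util import find_spec


def parquet_available() -> bool:
    """Return True if pyarrow can be imported in this environment."""
    return find_spec("pyarrow") is not None


HAS_PARQUET = parquet_available()
if not HAS_PARQUET:
    print('Parquet export unavailable; install "disco-data-logger[parquet]".')
```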