Metadata-Version: 2.4
Name: opensensor-enviroplus
Version: 0.3.0
Summary: Modern CLI-based environmental sensor collector using Polars, Arrow, and Hive-partitioned Parquet for Enviro+
Author-email: Youssef Harby <yharby@walkthru.earth>
License: MIT
Keywords: sensors,enviroplus,raspberry-pi,polars,arrow,iot,opensensor-space,walkthru-earth
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: croniter>=6.0.0
Requires-Dist: enviroplus-community>=1.0.6
Requires-Dist: obstore>=0.8.2
Requires-Dist: polars>=1.35.2
Requires-Dist: pyarrow>=22.0.0
Requires-Dist: pydantic>=2.12.4
Requires-Dist: pydantic-settings>=2.12.0
Requires-Dist: python-dotenv>=1.2.1
Requires-Dist: typer>=0.20.0
Requires-Dist: uuid6>=2025.0.1

# OpenSensor Enviroplus

Modern, CLI-based environmental sensor data collector using Polars, Apache Arrow, and Hive-partitioned Parquet for Raspberry Pi Enviro+.

Part of the [OpenSensor.Space](https://opensensor.space) network for open environmental data.

## Features

- **UUID v7 Station IDs**: Time-ordered UUIDs (RFC 9562) that sort by creation time, which keeps database indexes and object listings in chronological order
- **Modern Stack**: Polars streaming, Apache Arrow, Hive-partitioned Parquet
- **Memory Efficient**: Optimized for Raspberry Pi with limited RAM
- **CLI-First**: Simple Python commands replace bash scripts
- **Smart Logging**: Rich console output for easy debugging
- **Cloud Sync**: Built-in sync using obstore (50% faster than boto3)
- **Prefix-based IAM**: S3 bucket access control per station
- **Type Safe**: Pydantic settings with validation
- **Production Ready**: Graceful error handling, automatic retries
- **Browser-queryable**: DuckDB-wasm compatible Parquet output
- **Temperature & Humidity Compensation**: CPU heat correction for temperature using Pimoroni's compensation factor, plus dewpoint-based humidity correction
- **System Health Monitoring**: Optional CPU, memory, disk, WiFi signal, NTP sync tracking
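
To illustrate what "time-ordered" means for the station IDs listed above, the sketch below reads the creation time back out of a UUID v7 using only the standard library. The `uuid7_timestamp` helper is hypothetical and not part of this package. It relies on the RFC 9562 layout, where the 48 most significant bits hold the Unix time in milliseconds.

```python
import uuid
from datetime import datetime, timezone


def uuid7_timestamp(station_id: str) -> datetime:
    """Return the UTC creation time embedded in a UUID v7 string."""
    parsed = uuid.UUID(station_id)
    if parsed.version != 7:
        raise ValueError(f"expected a UUID v7, got version {parsed.version}")
    # RFC 9562: the top 48 of the 128 bits are Unix time in milliseconds.
    unix_ms = parsed.int >> 80
    return datetime.fromtimestamp(unix_ms / 1000, tz=timezone.utc)


# Station ID taken from the configuration example in this README.
created = uuid7_timestamp("019ab383-d789-74e2-a460-bb92b1c13681")
print(created.isoformat())
```

Because the timestamp occupies the leading bits, sorting station IDs as plain strings also sorts them by creation time.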

## Quick Start

### Prerequisites

```bash
# Update system packages
sudo apt-get update

# Install git
sudo apt-get install -y git

# Install UV package manager (fast Python package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env  # older uv installers used $HOME/.cargo/env instead
```

### Installation

```bash
# Clone the repository
git clone https://github.com/walkthru-earth/opensensor-enviroplus.git
cd opensensor-enviroplus

# Install dependencies with UV
uv sync

# Activate virtual environment (optional - uv run handles this)
source .venv/bin/activate
```

### Setup

```bash
# Interactive setup (creates .env configuration)
opensensor setup

# Or non-interactive
opensensor setup --station-id "019ab383-d789-74e2-a460-bb92b1c13681" --no-interactive
```

### Usage

#### Run as Systemd Service (Recommended)

```bash
# Quick setup (install + enable + start) - automatically handles sudo
opensensor service setup

# View service status
opensensor service status

# View live logs
opensensor service logs --follow

# Restart service
opensensor service restart

# Stop service
opensensor service stop

# Complete removal
opensensor service remove
```

**Note:** Service commands automatically request sudo when needed.

#### Manual Commands

```bash
# Start collecting data
opensensor start

# Run in foreground (for debugging)
opensensor start --foreground

# View status
opensensor status

# Sync to cloud
opensensor sync

# View logs
opensensor logs

# Follow logs in real-time
opensensor logs --follow

# View configuration
opensensor config
```

#### Service Management

The service commands automatically detect your user, project path, and virtual environment:

```bash
# View auto-detected configuration
opensensor service info

# Individual service commands (automatically handle sudo)
opensensor service install    # Create systemd service
opensensor service enable     # Enable on boot
opensensor service start      # Start service
opensensor service stop       # Stop service
opensensor service restart    # Restart service
opensensor service disable    # Disable on boot
opensensor service uninstall  # Remove service
```
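
As a rough picture of what such auto-detection could produce, the sketch below renders a systemd unit from a user, project path, and virtual environment. Everything in it is an assumption for illustration: the template contents, `render_unit`, and `detect_defaults` are hypothetical, and the unit that `opensensor service install` actually writes may look different.

```python
import getpass
import sys
from pathlib import Path

# Hypothetical template, not the unit file shipped by this project.
UNIT_TEMPLATE = """\
[Unit]
Description=OpenSensor Enviro+ collector
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User={user}
WorkingDirectory={project_dir}
ExecStart={venv_bin}/opensensor start --foreground
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target
"""


def render_unit(user: str, project_dir: str, venv_bin: str) -> str:
    """Fill the template with the detected (or supplied) values."""
    return UNIT_TEMPLATE.format(user=user, project_dir=project_dir, venv_bin=venv_bin)


def detect_defaults() -> tuple:
    """Guess user, project directory, and venv bin directory from the current process."""
    return getpass.getuser(), str(Path.cwd()), str(Path(sys.prefix) / "bin")


# Explicit values keep the example reproducible; detect_defaults() would supply them on a Pi.
unit_text = render_unit(
    "pi",
    "/home/pi/opensensor-enviroplus",
    "/home/pi/opensensor-enviroplus/.venv/bin",
)
print(unit_text)
```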

## Configuration

Configuration lives in a `.env` file (auto-generated by `opensensor setup`):

```bash
# Station identification (UUID v7 - auto-generated)
OPENSENSOR_STATION_ID=019ab383-d789-74e2-a460-bb92b1c13681

# Data collection
OPENSENSOR_READ_INTERVAL=5              # Seconds between sensor reads
OPENSENSOR_BATCH_DURATION=900           # 15-minute batches

# Temperature/humidity compensation (for Raspberry Pi CPU heat)
OPENSENSOR_TEMP_COMPENSATION_ENABLED=true
OPENSENSOR_TEMP_COMPENSATION_FACTOR=2.25  # Pimoroni's official factor

# Health monitoring (CPU, memory, disk, WiFi, NTP sync)
OPENSENSOR_HEALTH_ENABLED=true

# Output settings
OPENSENSOR_OUTPUT_DIR=output
OPENSENSOR_COMPRESSION=snappy           # Fast compression (snappy, zstd, gzip)

# Cloud sync (optional)
OPENSENSOR_SYNC_ENABLED=true
OPENSENSOR_SYNC_INTERVAL_MINUTES=15

# S3/MinIO storage
OPENSENSOR_STORAGE_BUCKET=my-sensor-bucket
OPENSENSOR_STORAGE_PREFIX=sensors/station-019ab383  # For IAM scoping
OPENSENSOR_STORAGE_REGION=us-west-2
OPENSENSOR_STORAGE_ENDPOINT=            # Optional: for MinIO/custom S3

# AWS credentials
OPENSENSOR_AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
OPENSENSOR_AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCY

# Logging
OPENSENSOR_LOG_LEVEL=INFO
OPENSENSOR_LOG_DIR=logs
```

See `.env.example` for a complete template with IAM policy examples.
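
For orientation, a prefix-scoped policy for the example bucket and prefix above might be generated as follows. This is a minimal sketch: the `station_policy` helper is hypothetical, and the set of S3 actions granted is an assumption, so treat `.env.example` as the authoritative template.

```python
import json


def station_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy limiting a station to its own key prefix."""
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level access only under the station's prefix.
                "Sid": "ReadWriteStationObjects",
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
            {
                # Listing is a bucket-level action, so it is narrowed with a condition.
                "Sid": "ListStationPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
        ],
    }


policy = station_policy("my-sensor-bucket", "sensors/station-019ab383")
print(json.dumps(policy, indent=2))
```

With a policy of this shape, credentials issued to one station cannot read or overwrite another station's objects in the same bucket.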

## Architecture

### Data Flow

```
Sensors (5s) -> Polars Collector -> Hive-Partitioned Parquet (15min) -> S3/MinIO (obstore)
                    ↓
              Health Metrics (~1min) -> Separate Parquet (output-health/)
```
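
The numbers in the diagram imply 900 / 5 = 180 sensor rows per batch with the default settings. The sketch below shows that arithmetic with a simple in-memory buffer. `BatchBuffer` is a hypothetical stand-in rather than the project's collector, and it flushes by row count, whereas the real collector may flush on wall-clock boundaries instead.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class BatchBuffer:
    read_interval: int = 5      # seconds, mirrors OPENSENSOR_READ_INTERVAL
    batch_duration: int = 900   # seconds, mirrors OPENSENSOR_BATCH_DURATION
    rows: list = field(default_factory=list)

    @property
    def rows_per_batch(self) -> int:
        return self.batch_duration // self.read_interval

    def add(self, reading: dict) -> list | None:
        """Buffer one reading; return the whole batch once it is full."""
        self.rows.append(reading)
        if len(self.rows) < self.rows_per_batch:
            return None
        batch, self.rows = self.rows, []
        return batch


buffer = BatchBuffer()
batches = []
for seq in range(360):  # 360 reads = 30 minutes at one read every 5 s
    batch = buffer.add({"seq": seq})
    if batch is not None:
        batches.append(batch)  # a real collector would write Parquet here

print(buffer.rows_per_batch, len(batches))
```

Keeping only one batch in memory at a time is what bounds RAM use on the Pi, whatever the flush trigger is.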

### Output Format (Hive-Partitioned Parquet)

```
output/                                           # Sensor data
  station=019ab383-d789-74e2-a460-bb92b1c13681/
    year=2025/
      month=11/
        day=24/
          data_1430.parquet  # Batch written at 14:30
          data_1445.parquet  # Batch written at 14:45

output-health/                                    # System health (optional)
  station=019ab383-d789-74e2-a460-bb92b1c13681/
    year=2025/
      month=11/
        day=24/
          health_1430.parquet  # ~15 health records per batch
```

**Benefits:**
- Browser-queryable with DuckDB-wasm
- Partition pruning for fast time-range queries
- Simple, universal format (no proprietary transaction logs)
- Perfect for append-only time-series data
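
The layout above can be reproduced with a few lines of path handling. The sketch below builds a partition path from a batch timestamp and parses the `key=value` segments back out. Both helper names are hypothetical, and zero-padding of single-digit months and days is an assumption, since the example tree only shows two-digit values.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def partition_path(root: str, station_id: str, written_at: datetime,
                   stem: str = "data") -> PurePosixPath:
    """Build the Hive-style path for a batch written at `written_at`."""
    return (
        PurePosixPath(root)
        / f"station={station_id}"
        / f"year={written_at:%Y}"
        / f"month={written_at:%m}"   # assumed zero-padded
        / f"day={written_at:%d}"     # assumed zero-padded
        / f"{stem}_{written_at:%H%M}.parquet"
    )


def partition_values(path: PurePosixPath) -> dict:
    """Recover the key=value partition columns encoded in a path."""
    return dict(part.split("=", 1) for part in path.parts if "=" in part)


path = partition_path(
    "output",
    "019ab383-d789-74e2-a460-bb92b1c13681",
    datetime(2025, 11, 24, 14, 30, tzinfo=timezone.utc),
)
print(path)
print(partition_values(path))
```

Query engines that understand Hive partitioning read these directory names as columns, which is what makes partition pruning possible without opening every file.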

See [ARCHITECTURE.md](ARCHITECTURE.md) for detailed diagrams and scalability analysis.

## Differences from Original

| Feature | Old (enviroplus-python) | New (opensensor-enviroplus) |
|---------|-------------------------|----------------------------|
| Station IDs | Manual/random | UUID v7 (time-ordered) |
| Data library | pandas + DuckDB | Polars + Apache Arrow |
| Storage | Partitioned Parquet | Hive-partitioned Parquet |
| Configuration | bash scripts + env vars | Pydantic Settings + .env |
| Setup | install.sh | `opensensor setup` CLI |
| Cloud sync | rclone (process spawn) | obstore (Rust, 50% faster) |
| IAM policies | N/A | Prefix-based scoping |
| Logging | print statements | Rich + structured logging |
| Memory usage | Higher (pandas) | 50% lower (Polars streaming) |
| CLI | None | Typer with 7 commands |
| Read interval | 1 second | 5 seconds (configurable) |
| Batch duration | Variable | 15 minutes (900s) |
| Humidity correction | None | Dewpoint-based compensation |
| Health monitoring | None | CPU, memory, disk, WiFi, NTP sync |
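
The compensation rows in the table can be made concrete with the formulas from Pimoroni's Enviro+ examples. The sketch below is an illustration under that assumption. The function names are hypothetical, the project's own implementation may differ (for example by averaging several CPU temperature readings first), and the clamp to 0-100% is an added safeguard.

```python
def compensate_temperature(raw_temp: float, cpu_temp: float, factor: float = 2.25) -> float:
    """Subtract the share of CPU heat that leaks into the temperature sensor."""
    return raw_temp - (cpu_temp - raw_temp) / factor


def compensate_humidity(raw_humidity: float, raw_temp: float, comp_temp: float) -> float:
    """Rescale relative humidity to the compensated temperature via an approximate dewpoint."""
    dewpoint = raw_temp - (100 - raw_humidity) / 5
    corrected = 100 - 5 * (comp_temp - dewpoint)
    return max(0.0, min(100.0, corrected))  # clamp to a physical range


raw_temp, cpu_temp, raw_humidity = 30.0, 52.5, 40.0
comp_temp = compensate_temperature(raw_temp, cpu_temp)
comp_humidity = compensate_humidity(raw_humidity, raw_temp, comp_temp)
print(comp_temp, comp_humidity)
```

In this example the sensor reads 30 °C next to a 52.5 °C CPU, so the compensated temperature drops to 20 °C and the relative humidity rises accordingly, because cooler air holds less moisture at the same dewpoint.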

## Development

```bash
# Install with dev dependencies
uv sync --group dev

# Format code
uv run ruff format .

# Lint code
uv run ruff check .

# Run with UV (no venv activation needed)
uv run opensensor --help
```

## Tech Stack

- **Python 3.10+** - Modern Python with type hints
- **UV** - Fast Rust-based package manager (10-100x faster than pip)
- **Polars 1.35+** - High-performance DataFrames with streaming
- **PyArrow 22+** - Columnar memory format (zero-copy operations)
- **uuid6** - RFC 9562 UUID v7 implementation
- **obstore** - Rust-powered object storage (S3/GCS/Azure)
- **Pydantic Settings** - Type-safe configuration
- **Typer + Rich** - Beautiful CLI with auto-completion
- **Ruff** - Extremely fast Python linter and formatter

## License

MIT License. See the [LICENSE](LICENSE) file for details.

## Credits

Built by the [WalkThru Earth](https://walkthru.earth) team for the [OpenSensor.Space](https://opensensor.space) network.

**Dependencies:**
- [enviroplus-community](https://github.com/walkthru-earth/enviroplus-python) - Enviro+ sensor drivers
- [Polars](https://pola.rs/) - Lightning-fast DataFrames
- [obstore](https://developmentseed.org/obstore/) - Object storage abstraction
- [Typer](https://typer.tiangolo.com/) - CLI framework
- [Rich](https://rich.readthedocs.io/) - Terminal formatting
- [Pydantic](https://docs.pydantic.dev/) - Data validation

## Contributing

Contributions welcome! Please:
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Run formatting and linting (`uv run ruff format .` and `uv run ruff check .`)
4. Commit your changes (`git commit -m 'feat: add amazing feature'`)
5. Push to the branch (`git push origin feature/amazing-feature`)
6. Open a Pull Request

## Support

- **Issues**: [GitHub Issues](https://github.com/walkthru-earth/opensensor-enviroplus/issues)
- **Discussions**: [GitHub Discussions](https://github.com/walkthru-earth/opensensor-enviroplus/discussions)
- **Documentation**: [ARCHITECTURE.md](ARCHITECTURE.md)
