Metadata-Version: 2.4
Name: ds_efficiency
Version: 0.1.1
Summary: A professional toolkit for Data Science efficiency, featuring automatic logging, runtime type checking, and DataFrame schema validation.
Author-email: José Tonatiuh Navarro Silva <tona.navarro.17@gmail.com>
Project-URL: Homepage, https://github.com/if722399/ds_efficiency
Project-URL: Bug Tracker, https://github.com/if722399/ds_efficiency/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas>=2.0.3
Requires-Dist: polars>=1.8.2
Requires-Dist: pytest>=8.3.5
Requires-Dist: typeguard>=4.4.0
Dynamic: license-file

# ds_efficiency

A professional toolkit for Data Science efficiency, featuring automatic logging, runtime type checking, and DataFrame schema validation.

## Key Features

- **MetaEngine**: A metaclass that automatically adds logging, timing, and type validation to your classes.
- **DataFrame Validation**: Validate Polars DataFrames using docstring schemas.
- **Granular Logging**: Automatic method-level loggers for easier debugging.
- **Runtime Type Checking**: Powered by `typeguard` to ensure data integrity.
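
To illustrate the general technique behind the first and third features, here is a minimal sketch of a metaclass that wraps each public method with a per-method logger and a timer. It is not the `MetaEngine` source: `TimingMeta`, `_wrap`, and `Example` are hypothetical names, and only the standard library is used.

```python
import functools
import inspect
import logging
import time


class TimingMeta(type):
    """Hypothetical metaclass that adds timing and logging to public methods."""

    def __new__(mcls, name, bases, namespace):
        for attr, value in list(namespace.items()):
            # Wrap plain functions only; skip private and dunder methods.
            if inspect.isfunction(value) and not attr.startswith("_"):
                namespace[attr] = mcls._wrap(name, attr, value)
        return super().__new__(mcls, name, bases, namespace)

    @staticmethod
    def _wrap(class_name, method_name, func):
        # One logger per method, named "<Class>.<method>".
        logger = logging.getLogger(f"{class_name}.{method_name}")

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                elapsed = time.perf_counter() - start
                logger.info("%s took %.6f s", method_name, elapsed)

        return wrapper


class Example(metaclass=TimingMeta):
    def double(self, x):
        return x * 2
```

Because the wrapping happens at class creation time, the class body needs no decorators or explicit logger setup.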

## Installation

Using `uv` (recommended):
```bash
uv pip install ds_efficiency
```

Or using `pip`:
```bash
pip install ds_efficiency
```

## Quick Start

### Using MetaEngine

Set `MetaEngine` as the metaclass for your data processing classes, and describe each DataFrame argument in a `Schema:` block in the method's docstring.

```python
import polars as pl
from ds_efficiency import MetaEngine

class DataProcessor(metaclass=MetaEngine):
    def process_data(self, df: pl.DataFrame) -> pl.DataFrame:
        """
        Schema:
        -------
        df: pl.DataFrame
            |-- id: int
            |-- value: float
        Meta instruction: Drop extra columns.
        """
        # 'df' is automatically validated and cast before this line
        self.log_process_data.info("Doubling values...")
        return df.with_columns((pl.col("value") * 2).alias("doubled"))

# Usage
processor = DataProcessor()
df = pl.DataFrame({"id": [1, 2], "value": [10.5, 20.0], "extra": [True, False]})
result = processor.process_data(df)
# 'extra' column is dropped, types are checked, and timing is logged!
```
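
As a rough idea of how a docstring schema like the one above could be read, the sketch below extracts the `|-- name: type` lines into a dictionary and lists the columns that are not declared. This is an assumption about the approach, not the library's parser: `parse_schema` and `extra_columns` are hypothetical helpers written with the standard library only.

```python
import re

# Matches schema lines such as "|-- id: int".
_COLUMN_RE = re.compile(r"^\s*\|--\s*(\w+)\s*:\s*(\w+)\s*$")


def parse_schema(docstring):
    """Return {column_name: type_name} for every '|-- name: type' line."""
    schema = {}
    for line in (docstring or "").splitlines():
        match = _COLUMN_RE.match(line)
        if match:
            schema[match.group(1)] = match.group(2)
    return schema


def extra_columns(columns, schema):
    """Return the columns that the schema does not declare, in input order."""
    return [name for name in columns if name not in schema]


doc = """
Schema:
-------
df: pl.DataFrame
    |-- id: int
    |-- value: float
Meta instruction: Drop extra columns.
"""

schema = parse_schema(doc)
undeclared = extra_columns(["id", "value", "extra"], schema)
```

Here `schema` holds the two declared columns, and `undeclared` contains only `"extra"`, which is the column the Quick Start example expects to be dropped.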

### Logging Configuration

Set the log file and log level as class attributes:

```python
class MyClass(metaclass=MetaEngine):
    log_file = "logs/my_app.log"
    log_level = "DEBUG"
```

Or set the log file globally with an environment variable:
```bash
export METAENGINE_LOG_FILE="logs/global.log"
```
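
When both a class attribute and the environment variable are set, one of them has to win. The sketch below assumes the class-level `log_file` takes precedence and the environment variable acts as a fallback; this ordering is an assumption for illustration, not documented behavior, and `resolve_log_file` is a hypothetical helper.

```python
import os


def resolve_log_file(cls, env_var="METAENGINE_LOG_FILE"):
    """Pick a log file path for a class, or None if nothing is configured."""
    # Assumed precedence: class attribute first, then the environment variable.
    class_level = getattr(cls, "log_file", None)
    if class_level:
        return class_level
    return os.environ.get(env_var)


class WithOwnFile:
    log_file = "logs/my_app.log"


class WithoutFile:
    pass
```

With `METAENGINE_LOG_FILE` exported, `resolve_log_file(WithoutFile)` would return its value, while `resolve_log_file(WithOwnFile)` would still return `"logs/my_app.log"`.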

## License

MIT
