Metadata-Version: 2.4
Name: dataiku-utils
Version: 0.1.0
Summary: Small helper utilities for Dataiku DSS project variables and dataset metadata.
Author: Khalil Laghmari
License-Expression: LicenseRef-Proprietary
Project-URL: Homepage, https://pypi.org/project/dataiku-utils/
Keywords: dataiku,dss,project-variables,datasets,utilities
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: python-dateutil>=2.8.0

# dataiku-utils

`dataiku-utils` is a small Python helper package for Dataiku DSS projects. It provides utility classes for working with project variables, dataset metadata, and SQL-backed dataset columns.

The package is designed for code running inside Dataiku DSS recipes, scenarios, or libraries where the `dataiku` Python module is available.

## Features

- Read and update Dataiku project variables.
- Update project variables with a lightweight lock to reduce concurrent-write conflicts.
- Manage a single project variable through a small object-oriented API.
- Track whether a managed variable was updated.
- Inspect Dataiku dataset schemas and SQL table locations.
- Discover minimum, maximum, and next available values from a SQL-backed dataset column.

## Installation

```bash
pip install dataiku-utils
```

To install from a locally built source archive:

```bash
pip install dataiku_utils-0.1.0.tar.gz
```

## Runtime requirement

This package expects the `dataiku` Python module to be available at runtime. In normal usage, Dataiku DSS provides this module inside code environments used by recipes and scenarios.

The package does not declare `dataiku` as a hard PyPI dependency because the runtime module is generally provided by the DSS environment rather than installed from public PyPI.
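Because `dataiku` is not installed as a dependency, code that may also run outside DSS (for example in local unit tests) can check for the module before relying on it. The following is a minimal sketch using only the standard library; `dataiku_available` and `require_dataiku` are hypothetical helper names and are not part of `dataiku-utils`.

```python
import importlib.util


def dataiku_available() -> bool:
    """Return True when a top-level `dataiku` module can be found."""
    # find_spec returns None for a missing top-level module instead of raising.
    return importlib.util.find_spec("dataiku") is not None


def require_dataiku() -> None:
    """Fail early with a clear message when running outside a DSS environment."""
    if not dataiku_available():
        raise RuntimeError(
            "The 'dataiku' module is not available; "
            "run this code inside a Dataiku DSS code environment."
        )
```

Calling `require_dataiku()` at the top of a recipe or library entry point gives a readable error instead of a late `ImportError`.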

## Quick start

### Read and update project variables

```python
from dataiku_utils import ProjectVariables

value = ProjectVariables.get_variable("last_processed_date")

ProjectVariables.safe_update_scope_variables(
    {"last_processed_date": "2026-01-31"},
    scope="standard",
)
```

### Manage a single variable

```python
from dataiku_utils import SimpleValueVariablesUtils

variable = SimpleValueVariablesUtils(
    key="processing_date",
    initial_value="2026-01-01",
)

print(variable.value)
variable.value = "2026-01-02"
```

### Inspect a dataset

```python
import dataiku
from dataiku_utils import DatasetUtils

dataset = dataiku.Dataset("input_dataset", ignore_flow=True)

columns = DatasetUtils.colnames_from_dataset(dataset)
table_name = DatasetUtils.table_fullname_from_sql_dataset(dataset)
```
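As a rough illustration of what reading column names involves, the sketch below assumes a schema represented as a list of column descriptors with `name` and `type` keys. That shape is an assumption for illustration only; `column_names` is a hypothetical function and not the implementation of `DatasetUtils.colnames_from_dataset`.

```python
from typing import Iterable, List, Mapping


def column_names(schema: Iterable[Mapping[str, str]]) -> List[str]:
    """Extract column names from entries shaped like {"name": ..., "type": ...}."""
    return [column["name"] for column in schema]


# Example schema in the assumed shape (made-up columns).
example_schema = [
    {"name": "event_id", "type": "bigint"},
    {"name": "event_date", "type": "date"},
]

print(column_names(example_schema))
```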

### Manage a variable based on a SQL-backed table column

```python
from dataiku_utils import TableColumnValueVariablesUtils

variable = TableColumnValueVariablesUtils(
    value_key="last_processed_value",
    dataset_name="input_dataset",
    colname="event_date",
    coltype="DATE",
    start_min_value="2026-01-01",
    start_max_value="2026-12-31",
)

variable.create_if_not_exists()
updated = variable.update()
```

## Logging

All public helper classes inherit from `LogBase`, which provides one logger per class.

```python
import logging
from dataiku_utils import ProjectVariables

ProjectVariables.logger().setLevel(logging.DEBUG)
```

Debug statements that may compute expensive values are guarded with `logger.isEnabledFor(logging.DEBUG)` where relevant.
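The pattern described above can be sketched with the standard `logging` module. This is an illustration under assumptions, not the package's actual `LogBase` code: `LogBaseSketch` and `Worker` are hypothetical names, and the logger naming scheme is a guess.

```python
import logging


class LogBaseSketch:
    """Hypothetical stand-in for LogBase: one logger per class."""

    @classmethod
    def logger(cls) -> logging.Logger:
        # The logger name is derived from the class, so each subclass gets its own.
        return logging.getLogger(f"{cls.__module__}.{cls.__qualname__}")


class Worker(LogBaseSketch):
    @classmethod
    def summarize(cls, values):
        log = cls.logger()
        if log.isEnabledFor(logging.DEBUG):
            # sorted() runs only when DEBUG output is actually enabled.
            log.debug("sorted values: %s", sorted(values))
        return len(values)
```

With per-class loggers, `Worker.logger().setLevel(logging.DEBUG)` raises verbosity for one helper without affecting the others.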

## Public API

The package exposes the following classes:

- `LogBase`
- `DatasetUtils`
- `ProjectVariables`
- `SimpleValueVariablesUtils`
- `ValueVariablesUtils`
- `WatchedValueVariablesUtils`
- `TableColumnValueUtils`
- `TableColumnValueVariablesUtils`
- `TableColumnWatchedValueVariablesUtils`

## Development

Build locally:

```bash
python -m pip install --upgrade build twine
python -m build
twine check dist/*
```

Publish with the included helper script:

```bash
chmod +x .pypi.sh
./.pypi.sh
```

Use TestPyPI:

```bash
PYPI_REPOSITORY=testpypi ./.pypi.sh
```

## License

Proprietary. To distribute the package under an open-source license, update the license metadata (`License-Expression`) before publishing.
