Metadata-Version: 2.4
Name: tesseract-olap
Version: 0.23.2
Summary: A semantic layer for your data.
License: Proprietary
Author: Francisco Abarzua
Author-email: francisco@datawheel.us
Maintainer: Francisco Abarzua
Maintainer-email: francisco@datawheel.us
Requires-Python: >=3.10, <4.0
Classifier: License :: Other/Proprietary License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Provides-Extra: clickhouse
Provides-Extra: redis
Requires-Dist: PyPika (>=0.48.0,<0.49.0)
Requires-Dist: clickhouse-cityhash (>=1.0.2.4) ; extra == "clickhouse"
Requires-Dist: clickhouse-connect (>=0.8.16) ; extra == "clickhouse"
Requires-Dist: clickhouse-driver[lz4] (>=0.2.0) ; extra == "clickhouse"
Requires-Dist: fnv-c (>=0.2,<1.0)
Requires-Dist: httpx (>=0.18,<1.0)
Requires-Dist: immutables (>=0.16,<1.0)
Requires-Dist: lfudacache (==0.0.2)
Requires-Dist: lxml (>=4.6)
Requires-Dist: orjson (>=3.11.6,<3.12.0)
Requires-Dist: polars (>=1.24,<2.0)
Requires-Dist: pydantic (>=2.11,<3.0)
Requires-Dist: pyparsing (==3.2.3)
Requires-Dist: python-dateutil (>=2.9.0.post0,<3.0)
Requires-Dist: pyyaml (>=5.4)
Requires-Dist: redis (>=5.0,<6.0) ; extra == "redis"
Requires-Dist: strenum (>=0.4,<1.0)
Requires-Dist: typing-extensions (>=4.5)
Requires-Dist: xlsxwriter (>=3.2,<4.0)
Requires-Dist: xmlschema (>=3.3,<4.0)
Project-URL: Repository, https://github.com/Datawheel/tesseract-olap
Description-Content-Type: text/markdown

# Tesseract OLAP

A semantic layer for your data, enabling OLAP operations over relational databases. Tesseract OLAP provides a Python interface that translates multidimensional queries into optimized SQL, executes them against a backend, and returns the results in a structured format.

## Key Capabilities

*   **Multidimensional Analysis**: Supports standard OLAP operations like drilling down, slicing, dicing, and pivoting.
*   **Backend Agnostic**: Designed to work with various data backends. Currently supports **ClickHouse** with specialized drivers.
*   **Schema Validation**: Uses a robust XML schema definition to validate queries and ensure data integrity.
*   **High Performance**: Leverages `polars` for efficient data handling and transformation.
*   **Caching**: Built-in support for caching query results to improve performance (supports Redis).

## Installation

Install the package using pip:

```bash
pip install tesseract-olap
```

### Optional Dependencies

To use specific backends or features, install the corresponding extras:

*   **ClickHouse Backend**:
    ```bash
    pip install "tesseract-olap[clickhouse]"
    ```
*   **Redis Caching**:
    ```bash
    pip install "tesseract-olap[redis]"
    ```
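To confirm that an extra is usable in the current environment, one option is to check whether its packages can be imported. The following is a minimal sketch, not part of `tesseract-olap`: the `EXTRA_MODULES` mapping and the `available_extras` helper are hypothetical names, and the mapping covers only some of the packages each extra installs.

```python
from importlib.util import find_spec

# Import names for some of the packages each extra pulls in.
# This mapping is an illustration, not something the library exposes.
EXTRA_MODULES = {
    "clickhouse": ["clickhouse_driver", "clickhouse_connect"],
    "redis": ["redis"],
}


def available_extras() -> dict[str, bool]:
    """Report, per extra, whether all of its listed modules are importable."""
    return {
        extra: all(find_spec(module) is not None for module in modules)
        for extra, modules in EXTRA_MODULES.items()
    }


extras = available_extras()
print(extras)
```

An extra reported as `False` usually means the package was installed without it; reinstalling with the matching `pip install "tesseract-olap[...]"` command should resolve that.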

## Getting Started

At its core, the `tesseract-olap` package translates OLAP-type queries into statements that a data backend can understand and execute safely.
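To make the idea of translation concrete, here is a deliberately simplified sketch of how a drilldown and measure request could map to a `GROUP BY` statement. It does not reflect how `tesseract-olap` builds SQL internally: `ToyCube`, `to_sql`, the table name, and the column names are all hypothetical, and the real schema model is much richer.

```python
from dataclasses import dataclass


# Hypothetical, simplified stand-in for a cube definition.
@dataclass(frozen=True)
class ToyCube:
    table: str
    levels: dict[str, str]    # level name -> column name
    measures: dict[str, str]  # measure name -> aggregated SQL expression


def to_sql(cube: ToyCube, drilldowns: list[str], measures: list[str]) -> str:
    """Translate a drilldown/measure request into a GROUP BY statement."""
    # Only names declared in the cube are accepted, so arbitrary user
    # input never reaches the SQL string.
    unknown = [name for name in drilldowns if name not in cube.levels]
    unknown += [name for name in measures if name not in cube.measures]
    if unknown:
        raise ValueError(f"Not defined in cube: {unknown}")

    group_cols = [cube.levels[name] for name in drilldowns]
    select = group_cols + [f'{cube.measures[m]} AS "{m}"' for m in measures]
    sql = f"SELECT {', '.join(select)} FROM {cube.table}"
    if group_cols:
        sql += f" GROUP BY {', '.join(group_cols)}"
    return sql


cube = ToyCube(
    table="sales",
    levels={"Time": "year", "Country": "country_id"},
    measures={"Units": "SUM(units)", "Price": "AVG(price)"},
)
sql = to_sql(cube, ["Time", "Country"], ["Units", "Price"])
print(sql)
```

Resolving every requested name through the cube definition is what keeps the generated statement safe in this sketch: a name that the cube does not declare raises an error instead of being interpolated into the query.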

### 1. Define your Schema
Tesseract uses a schema to define cubes, dimensions, and measures. Supported formats include XML, JSON, and YAML. See the [OLAP Model Concepts](https://github.com/Datawheel/tesseract-olap/blob/main/docs/olap-model.md) and examples in the repository for schema details.

### 2. Initialize the Server
```python
from tesseract_olap.backend.clickhouse import ClickhouseBackend
from tesseract_olap import OlapServer

# Initialize the backend connection
backend = ClickhouseBackend("clickhouse://user:pass@localhost:9000/database")

# Create the server instance with the schema
server = OlapServer(backend=backend, schema="./path/to/schema.xml")
```

### 3. Execute a Query
```python
from tesseract_olap.query import DataRequest, DataQuery

def get_data():
    # Create a request object defining the query
    request = DataRequest.new("cube_name", {
        "drilldowns": ["Time", "Country"],
        "measures": ["Units", "Price"],
    })

    # Validate and compile the request against the schema
    query = DataQuery.from_request(server.schema, request)

    # Execute the query
    with server.session() as session:
        # Fetch results in different formats
        # result = session.fetch(query)           # Returns a Result object
        # result = session.fetch_records(query)   # Returns a list of dicts
        result = session.fetch_dataframe(query)   # Result holding a Polars DataFrame

    # The fetched payload is available on the result's `data` attribute
    return result.data

if __name__ == "__main__":
    data = get_data()
    print(data)
```
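Once the rows are available as a list of dicts, they can be reshaped with plain Python. The sketch below pivots one measure into a nested mapping; the sample rows are made up, the `pivot` helper is hypothetical, and it assumes the record keys match the requested drilldown and measure names, which may differ for a real cube.

```python
from collections import defaultdict

# Made-up sample rows; real keys depend on the cube's schema.
records = [
    {"Time": 2023, "Country": "Chile", "Units": 10},
    {"Time": 2023, "Country": "Peru", "Units": 4},
    {"Time": 2024, "Country": "Chile", "Units": 7},
]


def pivot(rows, index, columns, value):
    """Reshape flat rows into {index_value: {column_value: measure}}."""
    table = defaultdict(dict)
    for row in rows:
        table[row[index]][row[columns]] = row[value]
    return dict(table)


pivoted = pivot(records, index="Time", columns="Country", value="Units")
print(pivoted)
```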

The server instance can be integrated into web applications (like FastAPI or LogicLayer) to provide a full-fledged OLAP API.

---
&copy; 2022-2024 [Datawheel, LLC.](https://www.datawheel.us/)

