Metadata-Version: 2.1
Name: xplain
Version: 0.0.37
Summary: A Python package to access xplain data analytics
Author-email: Xplain Data <peide.wang@xplain-data.com>
License: Copyright (c) 2021, Xplain Data GmbH
        
        Redistribution and use in source and binary forms, with or without
        modification, are permitted provided that the following conditions are met:
        
        1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
        2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
        
        THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Project-URL: Homepage, https://xplain-data.com
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy
Requires-Dist: pandas
Requires-Dist: requests
Requires-Dist: treelib
Requires-Dist: patsy
Requires-Dist: statsmodels
Requires-Dist: pyecharts
Requires-Dist: pydantic
Requires-Dist: click
Requires-Dist: rich
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: jsondiff; extra == "dev"
Provides-Extra: mcp
Requires-Dist: fastmcp>=2.14.1; extra == "mcp"

# Xplain Python Package

A Python client library for the [Xplain](https://www.xplaindata.com/) data analytics platform. This package provides a comprehensive API for object-oriented data analysis, query execution, statistical modeling, and data import operations.

## Features

- **Session Management** - Connect to Xplain servers, manage sessions, and share session state across clients
- **Object-Oriented Data Model** - Navigate hierarchical object trees with XObjects, Dimensions, and Attributes
- **Query Engine** - Build and execute queries with aggregations, group-bys, and selections using `Query_config`
- **Statistical Modeling** - Run logistic regression, OLS, probit, GLM, Poisson, and negative binomial models via `statsmodels`
- **Predictive Modeling** - Build and manage predictive models with independent variable analysis
- **Data Import** - Import data from PostgreSQL, MySQL, Oracle, SQL Server, Snowflake, BigQuery, and other JDBC databases
- **Batch Processing** - Generate and execute XGenScripts for time-partitioned parallel data import
- **Visualization** - Render collapsible object trees in Jupyter notebooks using `pyecharts`

## Requirements

- Python >= 3.9
- An Xplain server instance

## Installation

### From PyPI

```bash
pip install xplain
```

### From Source (Development)

```bash
git clone <repository-url>
cd xplainpy
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -e .
```

### Optional Dependencies

```bash
# For MCP server integration
pip install -e ".[mcp]"

# For development/testing
pip install -e ".[dev]"
```

## Quick Start

### Connect to Xplain Server

```python
from xplain import Xsession

# Create a session and authenticate
session = Xsession(url="http://myhost:8080", user="myuser", password="mypassword")

# Load a startup configuration
session.startup("MyConfig")

# View the object tree
session.show_tree()
```

### CLI Session Startup

```bash
# Connect using the default profile from ~/.xplainpyrc
xplain connect

# Load a saved startup configuration
xplain startup MyConfig.xstartup

# The .xstartup extension is optional
xplain startup MyConfig

# Load a local XView JSON file
xplain startup ./views/my_view.xview

# Initialize the current server session
xplain init-xplain-session --output json
```
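
When the CLI is driven from a script, the `--output json` form can be consumed programmatically. The sketch below is one possible way to do that and rests on an assumption not stated above: that the command writes a single JSON document to standard output. The `run_json_command` helper is hypothetical, and a stand-in command replaces the real CLI call so the example runs without a server.

```python
import json
import subprocess
import sys


def run_json_command(argv):
    """Run a command and parse its standard output as JSON."""
    # check=True raises CalledProcessError on a non-zero exit code
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)


# Stand-in command so the sketch runs without an Xplain server.
# In practice argv might be:
#   ["xplain", "init-xplain-session", "--output", "json"]
demo = run_json_command([sys.executable, "-c", "print('{\"status\": \"ok\"}')"])
print(demo)
```

The fields of the real JSON output are not documented here, so inspect the parsed result before relying on any particular key.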

### Execute Queries

```python
from xplain import Xsession, Query_config

session = Xsession(url="http://myhost:8080", user="myuser", password="mypassword")
session.startup("MyConfig")

# Build a query using Query_config
query = Query_config()
query.add_aggregation(object_name="Orders", dimension_name="Amount", type="SUM")
query.add_groupby(object_name="Orders", dimension_name="Category", attribute_name="ProductType")

# Execute and get results as a pandas DataFrame
df = session.execute_query(query)
print(df)
```
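
To make the semantics of the query above concrete, here is a minimal plain-Python sketch of what a `SUM` aggregation grouped by one attribute computes. It does not use the Xplain API: the sample rows and the `sum_by_group` helper are hypothetical, and the layout of the DataFrame returned by `execute_query` may differ.

```python
from collections import defaultdict


def sum_by_group(rows, group_key, value_key):
    """Sum value_key per distinct group_key value (illustration only)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return dict(totals)


# Hypothetical sample data standing in for rows of an "Orders" object
orders = [
    {"ProductType": "Books", "Amount": 12.5},
    {"ProductType": "Games", "Amount": 40.0},
    {"ProductType": "Books", "Amount": 7.5},
]

totals = sum_by_group(orders, group_key="ProductType", value_key="Amount")
print(totals)  # {'Books': 20.0, 'Games': 40.0}
```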

### Open an Attribute

```python
# Get counts grouped by an attribute
df = session.open_attribute(
    object_name="Person",
    dimension_name="Gender",
    attribute_name="Gender"
)
print(df)
```

### Navigate the Object Tree

```python
# Get an XObject and explore its structure
obj = session.get_xobject("Orders")
print(obj.get_dimensions())       # List dimension names
print(obj.get_child_objects())     # List child object names

# Add an aggregation dimension
obj.add_aggregation_dimension(
    dimension_name="#Prescriptions",
    aggregation={"aggregationType": "COUNT", "object": "Prescription"},
    floating_semantics=False
)
```

### Import Data from a Database

```python
from xplain import Xsession
from xplain.tools import Connection, Xtable_config, Xview

session = Xsession(url="http://myhost:8080", user="admin", password="secret")

# Create a database connection
conn = Connection(
    xsession=session,
    databaseType="POSTGRESQL",
    url="db.example.com",
    user="analyst",
    password="dbpassword",
    databaseName="warehouse",
    portNumber=5432
)

# Test the connection
result = conn.test_connection()
print(result)  # {'status': 'success', 'result': 'db connection validated successfully'}

# List available tables
tables = conn.get_tables()

# Configure an XTable import
config = Xtable_config(
    xsession=session,
    connection=conn,
    db_table_name="sales_data",
    object_name="Sales",
    primary_key="transaction_id",
    foreign_keys=["customer_id"]
)

# Review the auto-mapped dimensions
print(config.show_dimension_configurations())

# Import the XTable
config.import_xtable()

# Create a view with parent-child relationships
view = Xview(session)
view.insert_xtable(xtable="Sales", as_root=True, auto_generate_attributes=True)
view.insert_xtable(xtable="Customers", as_root=False, parent="Sales")
view.save("sales_view", "PUBLIC")

# Load the view as a session
session.startup_from_xview_config(view)
```
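
In an automated import it can help to stop early when the connection test fails. The following sketch assumes that `test_connection()` returns a dictionary shaped like the one in the comment above, and that any `status` other than `'success'` means failure. The failure format is not documented here, so treat `ensure_connection_ok` as a hypothetical helper rather than part of the package.

```python
def ensure_connection_ok(result):
    """Raise if a test_connection() result does not report success."""
    # Assumption: only status == "success" indicates a usable connection
    if not isinstance(result, dict) or result.get("status") != "success":
        raise ConnectionError(f"Database connection test failed: {result!r}")
    return result


# Result shape taken from the example comment above
ok = ensure_connection_ok(
    {"status": "success", "result": "db connection validated successfully"}
)
print(ok["status"])  # success
```

Used in the workflow above, this would be `ensure_connection_ok(conn.test_connection())` before calling `conn.get_tables()`.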

### Statistical Modeling

```python
# Run a logistic regression (reuses `session` and `query` from the examples above)
df = session.execute_query(query)
result = session.run_statsmodels(df, formula="outcome ~ age + gender + treatment", model_type="logit")
print(result.summary())

# Build a formula dynamically
formula = session.build_formula(response="outcome", predictors=["age", "gender", "treatment"])
```
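
The formula strings follow the patsy notation used by `statsmodels`: the response on the left of `~` and the predictors joined with `+` on the right. The sketch below shows how such a string can be assembled. It is not the implementation of `build_formula`; the `build_formula_sketch` helper is hypothetical and only illustrates the expected string format.

```python
def build_formula_sketch(response, predictors):
    """Join a response and predictors into a patsy-style formula string."""
    if not predictors:
        # "y ~ 1" is the patsy notation for an intercept-only model
        return f"{response} ~ 1"
    return f"{response} ~ " + " + ".join(predictors)


formula = build_formula_sketch("outcome", ["age", "gender", "treatment"])
print(formula)  # outcome ~ age + gender + treatment
```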

## Package Structure

```
xplain/
    __init__.py          # Package exports (Xsession, XObject, Dimension, Attribute, Query_config)
    xsession.py          # Core session management and Web API client
    xobject.py           # XObject class for data objects
    dimension.py         # Dimension class
    attribute.py         # Attribute class with hierarchy support
    query_config.py      # Query configuration builder
    api.py               # Advanced API for selections, sequences, and computed dimensions
    tools/
        __init__.py      # Tools exports
        connection.py    # Database connection management
        xtable_config.py # XTable import configuration
        xview.py         # XView configuration builder
        xattribute_config.py  # Hierarchical attribute configuration
        importer.py      # High-level import orchestrator
        xgenscript.py    # XGenScript generation for batch processing
```

## Core Classes

| Class | Description |
|-------|-------------|
| `Xsession` | Main session class for connecting to Xplain and executing operations |
| `XObject` | Represents a data object in the Xplain object tree |
| `Dimension` | Represents a dimension attached to an XObject |
| `Attribute` | Represents an attribute within a dimension, with hierarchy support |
| `Query_config` | Builder for constructing query configurations |
| `Api` | Advanced API for selections, sorted sequences, and computed dimensions |
| `Connection` | Database connection management for data import |
| `Xtable_config` | Configuration for importing database tables as XTables |
| `Xview` | Builder for XView configurations defining data view structure |
| `Importer` | High-level orchestrator for data import workflows |

## Authentication

Xplain supports multiple authentication methods:

```python
# Standard credentials
session = Xsession(url="http://host:8080", user="user", password="pass")

# JWT authentication
session = Xsession(
    url="http://host:8080",
    jwt_dispatch_url="https://auth.example.com/dispatch",
    jwt_cookie_name="auth_token",
    jwt_token="eyJhbGciOi..."
)

# Reuse existing HTTP session
session = Xsession(url="http://host:8080", http_session_id="EXISTING_JSESSIONID")
```
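
To keep credentials out of source code, the keyword arguments for `Xsession` can be assembled from environment variables. This is a minimal sketch: the keyword names are the ones used in the examples above, while the `XPLAIN_*` variable names and the `xsession_kwargs_from_env` helper are hypothetical.

```python
import os


def xsession_kwargs_from_env(env=None):
    """Build keyword arguments for Xsession from environment variables.

    The XPLAIN_* variable names are hypothetical; the keyword names are
    the ones used in the authentication examples above.
    """
    env = os.environ if env is None else env
    kwargs = {"url": env["XPLAIN_URL"]}
    if env.get("XPLAIN_JWT_TOKEN"):
        # JWT authentication
        kwargs.update(
            jwt_dispatch_url=env["XPLAIN_JWT_DISPATCH_URL"],
            jwt_cookie_name=env["XPLAIN_JWT_COOKIE_NAME"],
            jwt_token=env["XPLAIN_JWT_TOKEN"],
        )
    elif env.get("XPLAIN_HTTP_SESSION_ID"):
        # Reuse an existing HTTP session
        kwargs["http_session_id"] = env["XPLAIN_HTTP_SESSION_ID"]
    else:
        # Standard credentials
        kwargs.update(user=env["XPLAIN_USER"], password=env["XPLAIN_PASSWORD"])
    return kwargs


example = xsession_kwargs_from_env({
    "XPLAIN_URL": "http://host:8080",
    "XPLAIN_USER": "user",
    "XPLAIN_PASSWORD": "pass",
})
print(sorted(example))  # ['password', 'url', 'user']
```

A session could then be created with `Xsession(**xsession_kwargs_from_env())`.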

## Testing

```bash
# Run all tests
python3 -m pytest

# Run a specific test file
python3 -m pytest test/test_xsession.py

# Run with verbose output
python3 -m pytest -v
```

## Documentation

Full documentation is available in the `docs/` directory. Build it with Sphinx:

```bash
cd docs
make html
```

## Changelog

### 2025-02-27
- Added `download_selections` method
- Bugfix: `show_tree`
- Added `run_statsmodels`, `create_contingency_table`, `build_formula`
- Added `collapsible_tree`, `http_get`, `http_post`, `run_py`

### 2024-07-02
- Enabled JWT authentication

### 2024-03-12
- Replaced print messages with logging
- Bugfix: POST payload issue with missing `json.dumps`
- Bugfix: Session hijacking issue in pyodide environment
- Added `http_post` and `http_get` methods

### 2023-12-20
- SSL verification set to false at login

### 2023-09-06
- Bugfix: `get_instance_as_dataframe` did not download the exported CSV

### 2023-07-31
- Added `aggregation_name` parameter to `query_config.add_aggregation`

### 2023-06-06
- Bugfix: `build_predictive_model` returned an error when reading the result

### 2023-05-11
- Added `validate_db`

### 2023-05-05
- Added `Xsession.list_files()` and `Xsession.read_file()`
- `Xsession.startup(file_name)` - file extension now optional

## License

Copyright (c) 2021, Xplain Data GmbH. All rights reserved. See the `LICENSE` file for the redistribution terms.
