Metadata-Version: 2.4
Name: dotsuite
Version: 0.9.1b1
Summary: A meta-package to install the complete dot ecosystem.
Project-URL: Homepage, https://github.com/queelius/dotsuite
Author-email: Alex Towell <lex@metafunctor.com>
License-Expression: MIT
License-File: LICENSE
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Provides-Extra: dev
Requires-Dist: mkdocs-material>=9.0; extra == 'dev'
Requires-Dist: mkdocs>=1.5; extra == 'dev'
Requires-Dist: mypy>=1.0; extra == 'dev'
Requires-Dist: pymdown-extensions>=10.0; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Description-Content-Type: text/markdown

# The Dot Ecosystem

> "What started as a single, humble function evolved into a complete, coherent ecosystem for manipulating data structures, a journey in API design guided by the principles of purity, pedagogy, and the principle of least power."

The **dot ecosystem** is a suite of composable tools for working with nested data structures like JSON, YAML, and Python dictionaries. Each tool follows the Unix philosophy: it does one thing well and is designed to compose with the others.

## Installation

```bash
# Install from PyPI
pip install dotsuite
```

### Install from Source

```bash
# Clone and install
git clone https://github.com/queelius/dotsuite.git
cd dotsuite
pip install -e .

# For development with testing tools
make install-dev
```

### Publishing to PyPI

```bash
# Build and publish (for maintainers)
make build         # Build distribution packages
make publish-test  # Publish to TestPyPI
make publish       # Publish to PyPI
```

## Motivation

It always starts with a simple problem. You have a nested dictionary or JSON payload, and you need to get a value buried deep inside. You write `data['user']['contacts'][0]['email']` and pray that no key or index is missing along the way, lest your program crash with a `KeyError` or an `IndexError`. This leads to brittle, defensive code full of try/except blocks.

What began as a simple helper function, `dotget`, evolved through questions and insights into a complete ecosystem. The result is a mathematically grounded, pedagogically structured collection of tools that makes data manipulation predictable, safe, and expressive.

## The Four Pillars

The ecosystem is built on four fundamental pillars, each answering a core question about data:

### Depth Pillar: "Where is the data?"
Tools for finding and extracting values from within documents.

### Truth Pillar: "Is this assertion true?"
Tools for asking boolean questions and validating data.

### Shape Pillar: "How should the data be transformed?"
Tools for reshaping and modifying data structures.

### Collections Pillar: "How do documents relate?"
Tools for lifting single-document operations to collections via boolean and relational algebra.

## Quick Start

```python
import sys
sys.path.insert(0, 'src')  # If running from repo root

# Import from the four pillars
from depth.dotget.core import get
from depth.dotstar.core import search
from truth.dotquery.core import Q
from shape.dotmod.core import set_

# Simple exact addressing
data = {"users": [{"name": "Alice", "role": "admin"}]}
name = get(data, "users.0.name")  # "Alice"

# Pattern matching with wildcards
all_names = search(data, "users.*.name")  # ["Alice"]

# Boolean logic queries (fluent factory)
is_admin = Q("users.0.role").equals("admin").check(data)  # True

# Immutable modifications
new_data = set_(data, "users.0.status", "active")
```

## The Tools

### Depth Pillar: Addressing & Extraction

| Tool | Purpose | Example |
|------|---------|---------|
| **[`dotget`](docs/tools/depth/dotget.md)** | Simple exact paths | `get(data, "user.name")` |
| **[`dotstar`](docs/tools/depth/dotstar.md)** | Wildcard patterns | `search(data, "users.*.name")` |
| **[`dotselect`](docs/tools/depth/dotselect.md)** | Advanced selection with predicates | `find_first(data, "users[role=admin].name")` |
| **[`dotpath`](docs/tools/depth/dotpath.md)** | Extensible path engine | Powers other tools, JSONPath-compatible |

**Philosophy:** Start simple with `dotget` for known paths, add `dotstar` for patterns, use `dotselect` for complex queries. The `dotpath` engine underpins them all with extensible, Turing-complete addressing.
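To make the step from exact paths to patterns concrete, here is a minimal sketch of wildcard matching over a dot path. It is not the `dotstar` implementation: the function name `wildcard_search` and its handling of missing keys are assumptions made for illustration.

```python
def wildcard_search(data, path):
    """Return every value matching `path`, where `*` matches any key or index.

    Hypothetical helper for illustration; not the dotstar API.
    """
    matches = [data]
    for segment in path.split("."):
        next_matches = []
        for node in matches:
            if segment == "*":
                # Fan out over every child of a dict or list.
                if isinstance(node, dict):
                    next_matches.extend(node.values())
                elif isinstance(node, list):
                    next_matches.extend(node)
            elif isinstance(node, dict) and segment in node:
                next_matches.append(node[segment])
            elif isinstance(node, list) and segment.isdigit() and int(segment) < len(node):
                next_matches.append(node[int(segment)])
            # Anything else is a dead end and is silently dropped.
        matches = next_matches
    return matches


data = {"users": [{"name": "Alice", "role": "admin"}, {"name": "Bob"}]}
print(wildcard_search(data, "users.*.name"))   # ['Alice', 'Bob']
print(wildcard_search(data, "users.0.role"))   # ['admin']
print(wildcard_search(data, "users.*.email"))  # []
```

An exact path is just the special case with no `*` segment, which is why a pattern search returns a list even when only one value matches.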

### Truth Pillar: Logic & Validation

| Tool | Purpose | Example |
|------|---------|---------|
| **[`dotexists`](docs/tools/truth/dotexists.md)** | Path existence | `check(data, "user.email")` |
| **[`dotequals`](docs/tools/truth/dotequals.md)** | Path equals a value | `equals(data, "user.role", "admin")` |
| **[`dotany`](docs/tools/truth/dotany.md)** | Existential quantifier | `any_match(users, "role", "admin")` |
| **[`dotall`](docs/tools/truth/dotall.md)** | Universal quantifier | `all_match(users, "active", True)` |
| **[`dotquery`](docs/tools/truth/dotquery.md)** | Compositional logic engine | `Q("users.*.role").equals("admin")` |

**Philosophy:** Boolean questions should be separate from data extraction. Start with `dotexists` or `dotequals` for simple checks, lift to collections with `dotany`/`dotall`, and compose complex logic with `dotquery`.
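The progression from single checks to quantifiers can be sketched in a few lines. The function names below mirror the examples in the table, but the bodies are assumptions for illustration, not the code shipped in `dotexists`, `dotequals`, `dotany`, or `dotall`.

```python
_MISSING = object()  # sentinel, so a stored None still counts as "present"


def _resolve(data, path):
    """Follow a dot path; return _MISSING instead of raising."""
    try:
        for segment in path.split("."):
            if isinstance(data, list):
                data = data[int(segment)]
            else:
                data = data[segment]
        return data
    except (KeyError, IndexError, TypeError, ValueError):
        return _MISSING


def exists(data, path):
    """True if the path resolves to any value, including None."""
    return _resolve(data, path) is not _MISSING


def equals(data, path, value):
    """True if the path resolves and its value equals `value`."""
    return _resolve(data, path) == value


def any_match(docs, path, value):
    """Existential quantifier: at least one document satisfies `equals`."""
    return any(equals(doc, path, value) for doc in docs)


def all_match(docs, path, value):
    """Universal quantifier: every document satisfies `equals`."""
    return all(equals(doc, path, value) for doc in docs)


users = [{"role": "admin", "active": True}, {"role": "guest", "active": True}]
print(any_match(users, "role", "admin"))  # True
print(all_match(users, "role", "admin"))  # False
print(all_match(users, "active", True))   # True
```

Note that in this sketch `all_match` over an empty collection is `True`, the usual convention for a universal quantifier.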

### Shape Pillar: Transformation & Mutation

| Tool | Purpose | Example |
|------|---------|---------|
| **[`dotmod`](docs/tools/shape/dotmod.md)** | Surgical modifications | `set_(data, "user.status", "inactive")` |
| **[`dotbatch`](docs/tools/shape/dotbatch.md)** | Atomic transactions | Apply multiple changes safely |
| **[`dotpipe`](docs/tools/shape/dotpipe.md)** | Data transformation pipelines | Reshape documents into new forms |
| **[`dotpluck`](docs/tools/shape/dotpluck.md)** | Extract multiple values | Project selected paths into a new structure |

**Philosophy:** Immutable by default. `dotmod` for precise edits, `dotbatch` for transactional safety, `dotpipe` for creating new data shapes, `dotpluck` for projection.
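One way to get "immutable by default" without deep-copying the whole document is to copy only the containers along the edited path. The sketch below assumes that approach; `set_path` is a hypothetical name, and `dotmod` itself may work differently.

```python
def set_path(data, path, value):
    """Return a copy of `data` with `value` written at `path`.

    Hypothetical helper for illustration. Only containers on the path are
    copied; untouched branches are shared with the original. Missing
    intermediate keys are created as dicts, and an out-of-range list index
    raises IndexError.
    """
    head, _, rest = path.partition(".")
    if isinstance(data, list):
        index = int(head)
        copy = list(data)
        copy[index] = set_path(data[index], rest, value) if rest else value
        return copy
    # A non-dict node in the middle of the path is replaced by a fresh dict.
    copy = dict(data) if isinstance(data, dict) else {}
    copy[head] = set_path(copy.get(head, {}), rest, value) if rest else value
    return copy


data = {"users": [{"name": "Alice"}], "meta": {"version": 1}}
new_data = set_path(data, "users.0.status", "active")

print(new_data["users"])  # [{'name': 'Alice', 'status': 'active'}]
print(data["users"])      # [{'name': 'Alice'}]
```

Because untouched branches are shared rather than copied, `new_data["meta"] is data["meta"]` holds here, which keeps the cost proportional to the depth of the path rather than the size of the document.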

### Collections Pillar: Boolean & Relational Algebra

| Tool | Purpose | Domain |
|------|---------|--------|
| **[`dotfilter`](docs/tools/collections/dotfilter.md)** | Boolean algebra on document collections | Filter, intersect, union with lazy evaluation |
| **[`dotrelate`](docs/tools/collections/dotrelate.md)** | Relational operations | Join, project, union collections like database tables |

**Philosophy:** Lift single-document operations to collections. `dotfilter` provides set operations with boolean logic, while `dotrelate` enables database-style joins and projections.
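As a rough illustration of what lifting to collections means, the sketch below shows a lazy filter and a hash-based inner join over lists of dictionaries. The names `filter_docs` and `inner_join`, and the rule that the left value wins on a key clash, are assumptions; they are not the `dotfilter` or `dotrelate` APIs.

```python
from collections import defaultdict


def filter_docs(docs, predicate):
    """Lazily yield the documents for which `predicate` is true."""
    return (doc for doc in docs if predicate(doc))


def inner_join(left, right, left_on, right_on):
    """Hash join: index `right` by its key, then stream `left` against it."""
    index = defaultdict(list)
    for row in right:
        if right_on in row:
            index[row[right_on]].append(row)
    for row in left:
        if left_on not in row:
            continue  # rows without the join key cannot match anything
        for match in index.get(row[left_on], []):
            yield {**match, **row}  # on a key clash, the left value wins


users = [{"user_id": 1, "name": "Alice"}, {"user_id": 2, "name": "Bob"}]
orders = [
    {"user_id": 1, "item": "book"},
    {"user_id": 1, "item": "pen"},
    {"user_id": 3, "item": "ink"},
]

joined = list(inner_join(users, orders, "user_id", "user_id"))
pens = list(filter_docs(joined, lambda doc: doc["item"] == "pen"))
print(len(joined))  # 2
print(pens)
```

Both functions return generators, so nothing is computed until the result is consumed; only the right-hand side of the join is held in memory.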

## Design Principles

- **Compositionality:** Every tool composes cleanly with others
- **Immutability:** Original data is never modified
- **Pedagogical:** Simple tools graduate to powerful ones
- **Single Purpose:** Each tool has one clear responsibility
- **Interoperability:** Common patterns work across all tools
- **Performance:** Lazy evaluation and efficient algorithms
- **Safety:** Graceful handling of missing paths and malformed data

## Common Patterns

### The "Steal This Code" Philosophy

Many tools are intentionally simple enough that you can copy their core logic rather than add a dependency:

```python
# The essence of dotget
def get(data, path, default=None):
    try:
        for segment in path.split('.'):
            # Numeric segments index into lists; everything else is a key lookup.
            if isinstance(data, list) and segment.isdigit():
                data = data[int(segment)]
            else:
                data = data[segment]
        return data
    except (KeyError, IndexError, TypeError):
        return default
```

### Command-Line First

Every tool works from the command line, making them perfect for shell scripts and data pipelines:

```bash
# Check if any user is an admin
cat users.json | dotany "users.*.role" --equals admin && echo "Admin found"

# Extract all email addresses
cat contacts.json | dotstar "contacts.*.email" > emails.txt

# Join users with their orders
dotrelate join --left-on user_id --right-on user_id users.jsonl orders.jsonl
```

### Dual APIs: Programmatic and Declarative

`dotquery` offers both a fluent Python builder and a declarative DSL:

```python
from truth.dotquery.core import Q, Query

# Programmatic (fluent factory with operator overloading)
query = Q("role").equals("admin") & Q("login_count").greater(10)

# Declarative (DSL string)
query = Query("equals role admin and greater login_count 10")

# Both produce equivalent ASTs
```
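To show how operator overloading can build a tree instead of evaluating immediately, here is a toy version of the idea. Every name in it (`Node`, `Compare`, `And`, `Or`, `Field`, `evaluate`) is hypothetical, and the real `dotquery` AST may be shaped differently; the sketch also omits the DSL parser.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Node:
    """Base class: `&` and `|` build larger trees rather than evaluating."""

    def __and__(self, other):
        return And(self, other)

    def __or__(self, other):
        return Or(self, other)


@dataclass(frozen=True)
class Compare(Node):
    op: str
    key: str
    value: Any


@dataclass(frozen=True)
class And(Node):
    left: Node
    right: Node


@dataclass(frozen=True)
class Or(Node):
    left: Node
    right: Node


class Field:
    """Fluent factory: Field("role").equals("admin") returns a Compare node."""

    def __init__(self, key):
        self.key = key

    def equals(self, value):
        return Compare("equals", self.key, value)

    def greater(self, value):
        return Compare("greater", self.key, value)


def evaluate(node, doc):
    """Walk the tree against a flat document; a missing key is simply False."""
    if isinstance(node, And):
        return evaluate(node.left, doc) and evaluate(node.right, doc)
    if isinstance(node, Or):
        return evaluate(node.left, doc) or evaluate(node.right, doc)
    if node.key not in doc:
        return False
    if node.op == "equals":
        return doc[node.key] == node.value
    if node.op == "greater":
        return doc[node.key] > node.value
    raise ValueError(f"unknown operator: {node.op}")


query = Field("role").equals("admin") & Field("login_count").greater(10)
print(evaluate(query, {"role": "admin", "login_count": 12}))  # True
print(evaluate(query, {"role": "admin", "login_count": 3}))   # False
```

Since the nodes are plain frozen dataclasses, two queries built by different routes compare equal whenever their trees match, which is what makes "equivalent ASTs" a checkable property.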

## From Simple to Sophisticated

The ecosystem is designed as a learning journey:

1. **Hello World:** `dotget`, `dotexists`: O(1) mental load
2. **Patterns:** Add `dotstar`, `dotmod`, `dotequals`: wildcards and basic changes
3. **Quantifiers:** `dotany`, `dotall`, `dotpluck`, `dotbatch`: lift to collections, batch edits
4. **Power User:** `dotselect`, `dotquery`, `dotpipe`: complex selection and reshaping
5. **Expert:** `dotpath`, `dotfilter`, `dotrelate`: extensible engine and relational algebra

Each stage builds on the previous, with no tool becoming obsolete. A `dotget` call is still the right choice when you know the exact path.

## Mathematical Foundation

The ecosystem is built on solid mathematical principles:

- **Addressing** forms a free algebra on selectors (Turing-complete via user-defined reducers)
- **Logic** implements Boolean algebra with homomorphic lifting to set operations
- **Transformations** are endofunctors on document spaces with monoid composition
- **Collections** lift via functorial map/filter operations preserving algebraic structure

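These properties are what make composition predictable: combining tools behaves like combining the underlying algebraic operations, and per-document operations can be applied independently, which leaves room for parallel execution.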
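The claim that Boolean logic lifts homomorphically to set operations can be checked on a small example: filtering by `p and q` selects the intersection of the two individual selections, `or` maps to union, and `not` maps to complement. The data and helper names below are made up for illustration; documents are tracked by index because dictionaries are not hashable.

```python
docs = [
    {"name": "Alice", "role": "admin", "active": True},
    {"name": "Bob", "role": "guest", "active": True},
    {"name": "Cara", "role": "admin", "active": False},
]


def select(predicate):
    """Return the set of indices of documents satisfying `predicate`."""
    return {i for i, doc in enumerate(docs) if predicate(doc)}


def is_admin(doc):
    return doc["role"] == "admin"


def is_active(doc):
    return doc["active"]


everything = set(range(len(docs)))

# and -> intersection
both = select(lambda d: is_admin(d) and is_active(d))
assert both == select(is_admin) & select(is_active)

# or -> union
either = select(lambda d: is_admin(d) or is_active(d))
assert either == select(is_admin) | select(is_active)

# not -> complement
not_admin = select(lambda d: not is_admin(d))
assert not_admin == everything - select(is_admin)

print(both, either, not_admin)  # {0} {0, 1, 2} {1}
```

In practice this means a compound query can be answered either by evaluating one combined predicate per document or by combining previously computed result sets, and both routes give the same answer.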

## Individual Tool Documentation

Each tool has comprehensive documentation under `docs/tools/`:

- [**dotget**](docs/tools/depth/dotget.md): Simple exact addressing
- [**dotstar**](docs/tools/depth/dotstar.md): Wildcard pattern matching
- [**dotselect**](docs/tools/depth/dotselect.md): Advanced selection with predicates
- [**dotpath**](docs/tools/depth/dotpath.md): Extensible path engine
- [**dotexists**](docs/tools/truth/dotexists.md): Path existence checking
- [**dotequals**](docs/tools/truth/dotequals.md): Path value equality
- [**dotany**](docs/tools/truth/dotany.md): Existential quantifier
- [**dotall**](docs/tools/truth/dotall.md): Universal quantifier
- [**dotquery**](docs/tools/truth/dotquery.md): Compositional logic engine
- [**dotmod**](docs/tools/shape/dotmod.md): Immutable modifications
- [**dotbatch**](docs/tools/shape/dotbatch.md): Atomic transactions
- [**dotpipe**](docs/tools/shape/dotpipe.md): Data transformation pipelines
- [**dotpluck**](docs/tools/shape/dotpluck.md): Value extraction and reshaping
- [**dotfilter**](docs/tools/collections/dotfilter.md): Boolean algebra on collections
- [**dotrelate**](docs/tools/collections/dotrelate.md): Relational operations

## Production-Ready Alternative

While dotsuite focuses on pedagogy and simplicity, for production use cases requiring advanced features like streaming, complex path operations, and S-expression queries, consider [**JAF (Just Another Flow)**](https://github.com/realazthat/jaf). JAF implements similar concepts to dotfilter and dotpipe in a feature-complete, production-ready package with:

- Lazy streaming evaluation for large datasets
- Advanced path system with regex, fuzzy matching, and wildcards
- S-expression query language
- Index-preserving result sets for powerful set operations
- Support for multiple data sources (files, directories, stdin, compressed)

Think of dotsuite as the "learn by building" approach and JAF as the "battle-tested solution"; both are valuable for different purposes.

## Contributing

The dot ecosystem welcomes contributions. Each tool lives in its own directory with its own tests and documentation. See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

## License

MIT License. Use freely, modify as needed, and contribute back when you can.

---

*The dot ecosystem: from simple paths to sophisticated data algebras, one tool at a time.*
