Metadata-Version: 2.1
Name: json_tabulator
Version: 0.1.0
Summary: Simple query language to extract tables from JSON.
License: MIT
Author: Matthias Ossadnik
Author-email: ossadnik.matthias@gmail.com
Requires-Python: >=3.9,<4.0
Classifier: Development Status :: 4 - Beta
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Project-URL: homepage, https://github.com/mossadnik/json_tabulator
Description-Content-Type: text/markdown

# json_tabulator

A simple query language for extracting tables from JSON-like objects.

Working with tabular data is much easier than working with nested documents. `json_tabulator` helps extract tables from JSON-like objects in a simple, declarative manner. All further processing is left to the many powerful tools that exist for working with tables, such as Spark or Pandas.


## Installation

Install from PyPI:

```shell
pip install json_tabulator
```

## Quickstart

The `json_tabulator` module provides tools to extract a JSON document into a set of related tables. Let's start with a simple document:

```python
data = {
    'id': 'doc-1',
    'table': [
        {'id': 1, 'name': 'row-1'},
        {'id': 2, 'name': 'row-2'}
    ]
}
```

The document consists of a document-level value `id` and a nested sub-table `table`. We want to extract it into a single table in which the document-level value is repeated on every row.

To do this, we write a query that defines the conversion into a table like this:

```python
from json_tabulator import query

my_query = query({
    'document_id': 'id',
    'row_id': 'table.*.id',
    'row_name': 'table.*.name'
})

rows = my_query.execute(data)
```

This returns an iterator of rows, where each row is a dict `{<column_name>: <value>}`:

```python
>>> list(rows)
[
    {'document_id': 'doc-1', 'row_id': 1, 'row_name': 'row-1'},
    {'document_id': 'doc-1', 'row_id': 2, 'row_name': 'row-2'}
]
```
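
To make the result concrete, here is a plain-Python sketch that builds the same rows by hand for this particular document. It does not use `json_tabulator` and says nothing about how the library works internally; the helper name `extract_rows_by_hand` is hypothetical and exists only for illustration.

```python
data = {
    'id': 'doc-1',
    'table': [
        {'id': 1, 'name': 'row-1'},
        {'id': 2, 'name': 'row-2'}
    ]
}


def extract_rows_by_hand(document):
    """Yield one row per element of document['table'] (hypothetical helper)."""
    for item in document['table']:
        yield {
            # Document-level value, repeated on every row.
            'document_id': document['id'],
            # Values taken from the current element of the nested list.
            'row_id': item['id'],
            'row_name': item['name'],
        }


manual_rows = list(extract_rows_by_hand(data))
```

The query expresses the same mapping declaratively: the `*` in `table.*.id` plays the role of the `for` loop over `table`, and the path `id` without a wildcard corresponds to the value that is copied into every row.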

