Metadata-Version: 2.4
Name: stencilpy
Version: 0.5.6
Summary: Extract structured data from Excel files using YAML schema definitions
Project-URL: Homepage, https://github.com/phlohouse/stencil
Project-URL: Repository, https://github.com/phlohouse/stencil
Project-URL: Issues, https://github.com/phlohouse/stencil/issues
Author: Phlo House
License-Expression: MIT
Keywords: excel,extraction,pydantic,spreadsheet,yaml
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: File Formats
Classifier: Topic :: Software Development :: Libraries
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: openpyxl>=3.1
Requires-Dist: pydantic>=2.0
Requires-Dist: pyyaml>=6.0
Provides-Extra: concurrent
Requires-Dist: tqdm>=4.60; extra == 'concurrent'
Provides-Extra: dev
Requires-Dist: pytest; extra == 'dev'
Requires-Dist: pytest-cov; extra == 'dev'
Requires-Dist: tqdm>=4.60; extra == 'dev'
Description-Content-Type: text/markdown

# stencilpy

Extract structured data from Excel files into dynamically generated Pydantic models, driven by YAML schema definitions.

## Installation

```bash
pip install stencilpy
```

After installation, the `stencil open` command serves the bundled editor UI locally and opens it in your browser.

## Quick Start

```python
from stencilpy import Stencil

# Load a schema
lab = Stencil("lab_report.stencil.yaml")

# Extract data; the version is auto-detected via the discriminator
report = lab.extract("january_lab.xlsx")
print(report.patient_name)
print(report.model_dump())
```

## Schema Format

Create a `.stencil.yaml` file:

```yaml
name: lab_report
description: Monthly lab report

discriminator:
  cells:
    - A1

versions:
  "v2.0":
    fields:
      patient_name:
        cell: B3
      sample_date:
        cell: B4
        type: datetime
      readings:
        range: D5:D
        type: list[float]
      report_version:
        cell: header:right
      footer_note:
        cell: footer:center
```

Scalar `cell` references can also target worksheet headers and footers:

- `header:left`
- `header:center`
- `header:right`
- `footer:left`
- `footer:center`
- `footer:right`
- `Sheet1!header:first:right`
- `Sheet1!footer:even:center`

These references also work in `discriminator.cells`, which is useful when a workbook version is printed in the page header/footer instead of a normal cell.

## Header And Footer References

Use header/footer refs anywhere a scalar `cell` ref is accepted.

Example: extract the version text and report metadata from the page header and footer.

```yaml
name: lab_report
description: Monthly lab report

discriminator:
  cells:
    - A1

versions:
  "v2.0":
    fields:
      patient_name:
        cell: B3
      report_version:
        cell: header:right
      report_title:
        cell: header:center
      generated_by:
        cell: footer:left
      footer_note:
        cell: footer:center
```

If the workbook uses separate first-page or even-page headers/footers, include the page selector:

```yaml
versions:
  "v2.0":
    fields:
      first_page_title:
        cell: header:first:center
      even_page_version:
        cell: footer:even:right
      cover_sheet_version:
        cell: Cover!header:first:right
```
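For orientation, the page selectors line up naturally with the header/footer attributes that `openpyxl` exposes on a worksheet (`oddHeader`, `firstHeader`, `evenFooter`, and so on). The sketch below is only an illustration of that correspondence: the helper name `header_footer_attr` is hypothetical, and it assumes that a reference without a page selector resolves to the default ("odd") header or footer, which may not match how `stencilpy` resolves it internally.

```python
from typing import Optional


def header_footer_attr(area: str, page: Optional[str] = None) -> str:
    """Return the openpyxl worksheet attribute name for a header/footer ref.

    Hypothetical helper for illustration; not part of stencilpy.
    Assumption: a ref with no page selector means the default ("odd") one.
    """
    if area not in ("header", "footer"):
        raise ValueError(f"unknown area: {area!r}")
    if page not in (None, "first", "even"):
        raise ValueError(f"unknown page selector: {page!r}")
    prefix = page or "odd"
    # e.g. ("header", "first") -> "firstHeader"
    return f"{prefix}{area.capitalize()}"
```

With an `openpyxl` worksheet `ws`, something like `getattr(ws, header_footer_attr("header", "first")).center.text` would then read the centre section of the first-page header.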

Supported formats include:

- `header:left`
- `header:center`
- `header:right`
- `header:first:left`
- `header:even:center`
- `footer:right`
- `footer:first:center`
- `Sheet1!header:right`
- `Sheet1!footer:even:left`
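
The formats listed above share one shape: an optional `Sheet!` prefix, then `header` or `footer`, an optional `first` or `even` page selector, and a `left`, `center`, or `right` position. As a rough sketch of that grammar (the names `HeaderFooterRef` and `parse_header_footer_ref` are hypothetical and not part of the `stencilpy` API, and the real parser may accept more or fewer forms), a reference could be split like this:

```python
import re
from typing import NamedTuple, Optional


class HeaderFooterRef(NamedTuple):
    sheet: Optional[str]   # None when no "Sheet!" prefix is given
    area: str              # "header" or "footer"
    page: Optional[str]    # "first", "even", or None when omitted
    position: str          # "left", "center", or "right"


# Assumed grammar, inferred from the examples in this README.
_REF_PATTERN = re.compile(
    r"^(?:(?P<sheet>[^!]+)!)?"
    r"(?P<area>header|footer)"
    r"(?::(?P<page>first|even))?"
    r":(?P<position>left|center|right)$"
)


def parse_header_footer_ref(ref: str) -> Optional[HeaderFooterRef]:
    """Parse a header/footer ref, or return None for anything else (e.g. "B3")."""
    match = _REF_PATTERN.match(ref)
    if match is None:
        return None
    return HeaderFooterRef(**match.groupdict())
```

For example, `parse_header_footer_ref("Sheet1!footer:even:left")` yields a tuple with `sheet="Sheet1"`, `area="footer"`, `page="even"`, and `position="left"`, while an ordinary cell reference such as `B3` returns `None`.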

## Header-Based Version Detection

If a workbook stores its version in a header or footer instead of a normal cell, add those refs to `discriminator.cells`.

```yaml
name: lab_report
description: Monthly lab report

discriminator:
  cells:
    - A1
    - header:right
    - Cover!footer:first:center

versions:
  "v1.0":
    fields:
      patient_name:
        cell: A3
  "v2.0":
    fields:
      patient_name:
        cell: B3
```

`stencilpy` will check each discriminator ref in order until one matches a known version key.
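
The in-order check can be pictured with a small sketch. The function `detect_version` below is hypothetical and not the library's implementation; in particular, it assumes a match means the trimmed cell text equals a version key exactly, which may differ from the matching rule `stencilpy` actually applies.

```python
from typing import Iterable, Optional


def detect_version(
    candidates: Iterable[Optional[object]],
    known_versions: Iterable[str],
) -> Optional[str]:
    """Return the first candidate value that equals a known version key.

    `candidates` holds the values read from each discriminator ref,
    in the order the refs are listed in the schema.
    """
    known = set(known_versions)
    for value in candidates:
        if value is None:
            continue  # empty cell or missing header/footer section
        text = str(value).strip()
        if text in known:  # assumption: exact match after trimming
            return text
    return None


# Values as they might be read from A1, header:right, Cover!footer:first:center
values = [None, "Monthly Lab Report", "v2.0"]
version = detect_version(values, ["v1.0", "v2.0"])
```

Here `A1` is empty and the header text is not a version key, so the third ref decides the version and `version` is `"v2.0"`.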
