Metadata-Version: 2.4
Name: chadselect
Version: 0.2.0
Summary: Unified data extraction — CSS, XPath, Regex, and JMESPath behind one query interface.
Project-URL: Homepage, https://github.com/markjacksoncerberus/chadselect
Project-URL: Repository, https://github.com/markjacksoncerberus/chadselect
Project-URL: Issues, https://github.com/markjacksoncerberus/chadselect/issues
Author: Mark Jackson
License: MIT
Keywords: css,extraction,jmespath,parsing,regex,scraping,xpath
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Python :: 3.15
Classifier: Topic :: Text Processing :: Markup :: HTML
Classifier: Topic :: Text Processing :: Markup :: XML
Classifier: Typing :: Typed
Requires-Python: >=3.9
Requires-Dist: jmespath>=1.0
Requires-Dist: lxml>=5.0
Requires-Dist: selectolax>=0.3.21
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.21; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Description-Content-Type: text/markdown

# ChadSelect

Unified data extraction — CSS, XPath, Regex, and JMESPath behind one query interface.

```python
from chadselect import ChadSelect

cs = ChadSelect()
cs.add_html(html)
cs.add_json(json_str)

# One syntax, four engines
title = cs.select(0, "css:h1.title")
author = cs.select(0, "xpath://span[@class='author']/text()")
vin = cs.select(0, r"regex:[A-HJ-NPR-Z0-9]{17}")
name = cs.select(0, "json:data.products[0].name")

# Function piping
clean = cs.select(0, "css:.price >> trim >> uppercase()")
```

## Install

```bash
pip install chadselect
```

## Query Syntax

Queries use a `engine:expression` prefix:

| Prefix | Engine | Best For |
|--------|--------|----------|
| `css:` | CSS Selectors (selectolax) | HTML element selection |
| `xpath:` | XPath 1.0 (lxml) | Complex HTML/XML traversal |
| `regex:` | Regular Expressions (re) | Pattern matching on raw text |
| `json:` | JMESPath (jmespath) | JSON field extraction |

No prefix defaults to regex.

## Function Piping

Chain text transformations with `>>`:

```python
cs.select(0, "css:.price >> trim >> substring-after('$') >> uppercase()")
```

Available functions: `trim`, `uppercase()`, `lowercase()`, `normalize-space()`,
`substring-after('delim')`, `substring-before('delim')`, `substring(start, len)`,
`replace('old', 'new')`, `get-attr('name')`.

## API

```python
cs = ChadSelect()

# Load content
cs.add_html(html_string)
cs.add_json(json_string)
cs.add_text(plain_text)

# Query (index: 0=first, -1=all)
results = cs.query(-1, "css:.price")          # List[str] — all matches
value = cs.select(0, "css:.price")            # str — first match or ""

# Multi-query
first_hit = cs.select_first([(0, "css:#id"), (0, "xpath://fallback")])
combined = cs.select_many([(-1, "css:.a"), (-1, "css:.b")])

# Batch (fastest for many fields)
results = cs.query_batch([(-1, "css:.title"), (-1, "json:data.name")])

# With validators
results = cs.select_where(0, "css:.vin", lambda v: len(v) == 17)
```

## License

MIT
