Metadata-Version: 2.4
Name: thaumiel
Version: 0.1.0
Summary: Typed async client for the SCP Foundation Wiki's Crom GraphQL API.
Project-URL: homepage, https://github.com/ozefe/thaumiel
Project-URL: source, https://github.com/ozefe/thaumiel
Project-URL: issues, https://github.com/ozefe/thaumiel/issues
Author-email: Efe Özyay <hi@efe.cv>
License-Expression: MIT
License-File: LICENSE
Keywords: crom,graphql,scp,scp-wiki,secure-contain-protect,wikidot
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.14
Requires-Dist: httpx<1.0,>=0.27
Requires-Dist: pydantic<3.0,>=2.7
Description-Content-Type: text/markdown

# thaumiel

<img alt="thaumiel mascot (SCP-3000, Anantashesha) generated by Google's Nano Banana 2" align="right" src=".github/mascot.png" width="200" />

![PyPI - Python Version](https://img.shields.io/pypi/pyversions/thaumiel)
![PyPI - License](https://img.shields.io/pypi/l/thaumiel)
![PyPI - Status](https://img.shields.io/pypi/status/thaumiel)
![PyPI - Downloads](https://img.shields.io/pypi/dm/thaumiel)

A typed, ergonomic, read-only async Python client for the [SCP Foundation Wiki](https://scp-wiki.wikidot.com/)'s [Crom](https://crom.avn.sh/) GraphQL API.

thaumiel wraps Crom's GraphQL endpoint in a small, fully-typed surface: fetch pages, filter and sort them with a Python DSL, page through results without touching cursors, and budget your rate-limit quota — all `async`, all type-checked.

## Features

- **Fully typed:** Frozen Pydantic v2 models (`Page`, `Author`, `Attribution`, ...), checked under pyright strict.
- **Ergonomic filter DSL:** Build server-side filters with Python operators: `(F.rating >= 100) & (F.tag == "scp")`. Illegal filters raise at build time, not at the server.
- **Automatic pagination:** `pages()` is an async iterator that follows Crom's cursors for you; `fetch_page_batch()` exposes them when you want manual control.
- **Costly-field provenance:** Opt into expensive fields per call, and tell "not requested" apart from "server returned null" via `page.requested(...)`.
- **Quota estimation:** `estimate_*` predicts a call's point cost before you spend it.
- **Typed errors and optional retry:** A `ThaumielError` hierarchy plus a configurable `RetryPolicy` with exponential backoff.

## Installation

Requires Python 3.14+.

```bash
pip install thaumiel
```

## Quickstart

```python
import asyncio

from thaumiel import AsyncClient


async def main() -> None:
    async with AsyncClient() as client:
        # Crom stores SCP wiki URLs with the http:// scheme.
        page = await client.page("http://scp-wiki.wikidot.com/scp-173")
        if page is None:
            return

        print(page.title, page.rating)
        print(page.tags[:3])


asyncio.run(main())
```

```text
SCP-173 10752.0
('autonomous', 'ectoentropic', 'euclid')
```

`page()` takes either a `url` or a `wikidot_id`, and returns `None` rather than raising when nothing matches.

## Filtering, sorting, and listing

`pages()` streams every match, following pagination automatically. Combine `F` accessors into a filter and pass a `Sort`:

```python
import asyncio

from thaumiel import AsyncClient, F, Sort, SortKey


async def main() -> None:
    # Highest-rated SCP articles on the English wiki.
    query = F.url.starts_with("http://scp-wiki.wikidot.com") & (F.tag == "scp")
    async with AsyncClient() as client:
        shown = 0
        async for page in client.pages(
            filter=query, sort=Sort.by(SortKey.RATING), page_size=5
        ):
            print(f"{page.rating:>6.0f}  {page.title}")
            shown += 1
            if shown == 5:
                break


asyncio.run(main())
```

```text
 10752  SCP-173
  7145  ●●|●●●●●|●●|●
  5544  SCP-049
  5240  SCP-____-J
  4790  SCP-096
```

Count matches without fetching them:

```python
await client.count_pages(F.tag == "scp")   # -> 69916
```

Need the cursor yourself (checkpointing, UI paging)? `fetch_page_batch()` returns one `PageBatch` with `.pages`, `.end_cursor`, and `.has_next_page`.
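
A manual loop over those three attributes might look like the sketch below. `FakeBatch` and `fake_fetch` are hypothetical stand-ins so the example runs offline, and the `after` and `page_size` parameter names are assumptions rather than thaumiel's confirmed signature, so check `fetch_page_batch()` before adapting it.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeBatch:
    """Stand-in exposing the three attributes named above for PageBatch."""

    pages: tuple[str, ...]
    end_cursor: str | None
    has_next_page: bool


TITLES = ("SCP-049", "SCP-096", "SCP-173", "SCP-682", "SCP-999")


async def fake_fetch(after: str | None, page_size: int) -> FakeBatch:
    """Hypothetical fetcher: the cursor is just an offset into TITLES."""
    start = 0 if after is None else int(after)
    chunk = TITLES[start : start + page_size]
    end = start + len(chunk)
    return FakeBatch(pages=chunk, end_cursor=str(end), has_next_page=end < len(TITLES))


async def collect_all(page_size: int) -> list[str]:
    collected: list[str] = []
    cursor: str | None = None  # None means "start from the beginning"
    while True:
        batch = await fake_fetch(cursor, page_size)
        collected.extend(batch.pages)
        if not batch.has_next_page:
            return collected
        cursor = batch.end_cursor  # persist this value to checkpoint a crawl


titles = asyncio.run(collect_all(page_size=2))
print(titles)
```

Storing `end_cursor` after each batch is what makes a long crawl resumable: restart with the saved cursor instead of `None`.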

### The filter DSL

Each `F` accessor exposes only the operators its field supports; an unsupported operator or a wrong-typed value raises `InvalidPredicateError` immediately.

| Accessor | Field type | Operators |
| --- | --- | --- |
| `F.url` | prefix string | `==` `!=` `.starts_with()` |
| `F.title` | string (case-insensitive) | `==` `!=` `.eq_lower()` `.neq_lower()` `.starts_with()` `.starts_with_lower()` |
| `F.author` | string (case-insensitive) | same as `F.title`; matches an attribution's display name |
| `F.category` | string | `==` `!=` |
| `F.rating` | int | `==` `!=` `<` `<=` `>` `>=` |
| `F.created_at` | datetime | `==` `!=` `<` `<=` `>` `>=` |
| `F.is_hidden`, `F.is_user_page` | bool | `==` `!=` |
| `F.tag` | tag set | `==` (has) `!=` (lacks) `.all_of()` `.any_of()` `.none_of()` |

Combine predicates with `&` (and), `|` (or), and `~` (not).

> [!WARNING]
> In Python, `&`, `|`, and `~` bind **tighter** than comparisons such as `==` and `>=`, and overloading the operators does not change that. Parenthesize every comparison:
>
> ```python
> (F.rating >= 100) & (F.tag == "scp")   # correct
> F.rating >= 100 & F.tag == "scp"       # WRONG: parsed as F.rating >= (100 & F.tag) == "scp"
> ```
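To see how Python groups both spellings without touching thaumiel at all, the standard `ast` module can parse them as plain expressions. The names `rating` and `tag` below are placeholder identifiers, not `F` accessors.

```python
import ast


def root_node(expr: str) -> ast.expr:
    """Parse a single expression and return the node at the root of its tree."""
    return ast.parse(expr, mode="eval").body


good = root_node('(rating >= 100) & (tag == "scp")')
bad = root_node('rating >= 100 & tag == "scp"')

# Parenthesized: one `&` joining two comparisons.
good_kind = type(good).__name__
# Unparenthesized: one chained comparison with `100 & tag` in the middle.
bad_kind = type(bad).__name__
middle = ast.unparse(bad.comparators[0])

print(good_kind)  # BinOp
print(bad_kind)   # Compare
print(middle)     # 100 & tag
```

The unparenthesized form is a chained comparison, so Python evaluates `100 & tag` first and never combines two predicates.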

A predicate lowers to Crom's GraphQL input only when a request is issued, but you can inspect it:

```python
(F.rating >= 100).compile().model_dump(by_alias=True, exclude_unset=True)
# {'onWikidotPage': {'rating': {'gte': 100}}}
```

## Costly fields and quota

Some fields cost extra rate-limit points and are opt-in per call. A field you don't request stays `None`; some can be `None` even when requested, so `page.requested(...)` disambiguates.

```python
from thaumiel import CostlyField

page = await client.page(
    "http://scp-wiki.wikidot.com/scp-173",
    source=True,
    attributions=True,
)

assert page is not None                       # page() returns None when nothing matches
print(len(page.source))                       # 1680
print(page.summary)                           # None
print(page.requested(CostlyField.SUMMARY))    # False  — we never asked for it
credit = page.attributions[0]
print(credit.type.value, credit.user_display_name)   # AUTHOR Moto42
```

Crom meters usage in points (reported via the `x-ratelimit-remaining` header; the ceiling is 300000). Estimate before you spend — costly fields in `pages()` are billed **per page**:

```python
from thaumiel import estimate_count, estimate_page, estimate_pages

estimate_page(source=True, attributions=True)   # 4
estimate_count()                                # 2
estimate_pages(page_size=100, source=True)      # 200
```
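As a rough budgeting aid, the figures above can be turned into a batch count. This sketch assumes the 200-point estimate covers one batch of 100 pages and that the full 300000-point quota is available; both are assumptions for illustration, and `affordable_batches` is a hypothetical helper, not part of thaumiel.

```python
QUOTA_CEILING = 300_000  # ceiling quoted above
BATCH_COST = 200         # estimate_pages(page_size=100, source=True), quoted above
PAGE_SIZE = 100


def affordable_batches(remaining_points: int, batch_cost: int) -> int:
    """Whole batches that fit in the remaining quota (hypothetical helper)."""
    if batch_cost <= 0:
        raise ValueError("batch_cost must be positive")
    return remaining_points // batch_cost


batches = affordable_batches(QUOTA_CEILING, BATCH_COST)
pages_covered = batches * PAGE_SIZE

print(batches)        # 1500
print(pages_covered)  # 150000
```

In practice the remaining quota would come from the `x-ratelimit-remaining` header rather than the ceiling.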

## Errors and retries

Every error subclasses `ThaumielError`:

```python
from thaumiel import GraphQLError, RateLimitError, TransportError

try:
    page = await client.page(url)
except RateLimitError as exc:      # HTTP 429 (a subclass of TransportError)
    ...
except TransportError as exc:      # other HTTP/network failure; .status_code, .cause
    ...
except GraphQLError as exc:        # query-level errors; .errors
    ...
```
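Clause order matters here because Python runs the first `except` that matches. The sketch below uses stand-in classes that mirror only the hierarchy described above (they are not imported from thaumiel) to show what happens when the broader clause comes first.

```python
# Stand-ins mirroring the documented hierarchy; not thaumiel's real classes.
class ThaumielError(Exception): ...
class TransportError(ThaumielError): ...
class RateLimitError(TransportError): ...
class GraphQLError(ThaumielError): ...


def specific_first(exc: Exception) -> str:
    try:
        raise exc
    except RateLimitError:
        return "rate-limit"
    except TransportError:
        return "transport"
    except ThaumielError:
        return "other"


def broad_first(exc: Exception) -> str:
    try:
        raise exc
    except TransportError:  # also catches RateLimitError, its subclass
        return "transport"
    except RateLimitError:  # unreachable for RateLimitError instances
        return "rate-limit"
    except ThaumielError:
        return "other"


print(specific_first(RateLimitError()))  # rate-limit
print(broad_first(RateLimitError()))     # transport
print(specific_first(GraphQLError()))    # other
```

Listing `RateLimitError` before `TransportError`, as the snippet above does, keeps the rate-limit branch reachable.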

Every call is read-only and idempotent, so retrying is safe. `RetryPolicy` backs off exponentially on rate limits (and optionally on 5xx):

```python
from thaumiel import AsyncClient, RetryPolicy

policy = RetryPolicy(max_attempts=4, backoff=0.5)
async with AsyncClient() as client:
    # Pass a factory, not a coroutine: a retry needs a fresh awaitable.
    page = await policy.run(lambda: client.page("http://scp-wiki.wikidot.com/scp-173"))
```
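The factory requirement comes from Python itself: a coroutine object can be awaited only once. The sketch below shows that, then a generic exponential-backoff loop built around a factory. `retry_with_backoff` and `flaky` are hypothetical names for illustration, and this is not thaumiel's `RetryPolicy` implementation.

```python
import asyncio


async def fetch() -> str:
    return "ok"


async def await_twice() -> str:
    coro = fetch()
    await coro
    try:
        await coro  # second await of the same coroutine object
    except RuntimeError as exc:
        return str(exc)
    return "no error"


async def retry_with_backoff(factory, max_attempts: int, backoff: float):
    """Call factory() for a fresh awaitable on every attempt."""
    for attempt in range(max_attempts):
        try:
            return await factory()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise
            await asyncio.sleep(backoff * 2**attempt)  # 1x, 2x, 4x, ...


attempts = {"count": 0}


async def flaky() -> str:
    """Fails twice with a transient error, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("transient")
    return "ok"


reuse_error = asyncio.run(await_twice())
result = asyncio.run(retry_with_backoff(flaky, max_attempts=4, backoff=0.001))
print(reuse_error)
print(result, attempts["count"])
```

Passing `flaky` itself (or a `lambda`) rather than `flaky()` is what lets each retry start from a new coroutine.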

## Configuration

```python
from thaumiel import AsyncClient

client = AsyncClient(
    user_agent="my-app/1.0 (me@example.com)",   # good Crom etiquette
    timeout=30.0,
)
```

For full control — connection limits, event hooks, observing quota headers — inject your own `httpx.AsyncClient`. thaumiel will **not** close a client it did not create:

```python
import httpx
from thaumiel import AsyncClient

http = httpx.AsyncClient(headers={"User-Agent": "my-app/1.0"})
client = AsyncClient(http_client=http)
# ... use client ...
await http.aclose()   # you own it; you close it
```

More end-to-end scripts live in [`examples/`](examples/).

## Limitations

- **Read-only:** thaumiel offers no writes.
- **Async only:** There is no synchronous client.
- **Wikidot pages only:** `pages()` skips non-Wikidot nodes (e.g. RuFoundation), so it can yield fewer rows than `count_pages` reports for the same filter.
- **Curated filter surface:** Only the fields in the table above are filterable, and some support equality only.
- **Quota-bound:** Requests cost points against Crom's quota; budget with `estimate_*`.
- **Alpha:** The public API may change in any 0.x release before 1.0.

## Development

See [`.github/CONTRIBUTING.md`](.github/CONTRIBUTING.md).

## License

MIT — see [LICENSE](LICENSE).
