Metadata-Version: 2.4
Name: jsonsax
Version: 0.1.0
Summary: A lightweight, dependency-free streaming (SAX-style) JSON parser.
Project-URL: Homepage, https://github.com/chandrapenugonda/jsonsax
Project-URL: Repository, https://github.com/chandrapenugonda/jsonsax
Project-URL: Issues, https://github.com/chandrapenugonda/jsonsax/issues
Author: chandrapenugonda
License: MIT
License-File: LICENSE
Keywords: incremental,json,llm,parser,sax,streaming
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing :: Markup
Classifier: Typing :: Typed
Requires-Python: >=3.9
Provides-Extra: dev
Requires-Dist: mypy>=1.8; extra == 'dev'
Requires-Dist: pylint>=3; extra == 'dev'
Requires-Dist: pytest>=7; extra == 'dev'
Description-Content-Type: text/markdown

# jsonsax

**Read JSON while it is still arriving — don't wait for the whole thing.**

Imagine someone is reading you a long story out loud, one word at a time. You
don't wait for them to finish the whole book before you start listening — you
react to each part as you hear it. `jsonsax` does that for JSON.

Normally a program waits for the *entire* JSON to show up and only then reads
it (that is how Python's built-in `json.loads` works). `jsonsax` is different:
you hand it little pieces as they arrive, and it taps you on the shoulder and
says *"hey, I just found a name!"*, *"hey, here's a number!"*, right away,
piece by piece.

This style of reading-as-you-go is called a **streaming** (or **SAX-style**)
parser. (SAX is the "Simple API for XML", which has worked this way for years;
this is the same idea for JSON.)
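
To see the difference, here is a tiny sketch of the "wait for everything" way. It uses only Python's built-in `json` module, not `jsonsax`, and the half-arrived text is made up for illustration.

```python
import json

# A made-up stream that has only half arrived so far.
partial = '{"title": "RAG", "sco'

try:
    json.loads(partial)          # the built-in parser wants the whole document
    outcome = "parsed"
except json.JSONDecodeError:
    outcome = "rejected"         # it cannot hand over "title" on its own

print(outcome)   # rejected
```

The title is already sitting in that text, but a whole-document parser has no way to give it to you until the rest shows up.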

### Why would you want that?

- 🤖 **Talking to an AI** — chatbots send their answer one word at a time. With
  `jsonsax` you can start using the first part of the answer before the rest
  has even arrived.
- 🐘 **Huge files** — a JSON file too big to fit in memory? Read it in small
  sips instead of swallowing it whole.
- ⚡ **Show things sooner** — display the title of an article the instant it
  appears, without waiting for the whole article.

---

## Install

```bash
pip install jsonsax
```

That's it. No other stuff gets installed — `jsonsax` has **zero dependencies**.

---

## The tiniest example

```python
from jsonsax import parse

parse('{"name": "Bo", "age": 5}', value=lambda path, val: print(path, "=", val))
```

Output:

```
$.name = Bo
$.age = 5
```

`$` means "the start" (the root of the JSON). `$.name` means "the `name` part"
directly under that root. Think of it as an address that tells you **where** in
the JSON you are.
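
If you ever need to take one of these addresses apart, a small helper like the one below might do. This is only a sketch: `split_path` is a made-up name, not part of `jsonsax`, and it assumes paths look like the ones printed in this README (keys without `.` or `[` in them).

```python
import re

# Matches either a ".name" step or a "[3]" step.
# Assumption: keys contain no "." or "[" characters.
_STEP = re.compile(r"\.([^.\[\]]+)|\[(\d+)\]")

def split_path(path):
    """Turn '$.user.tags[1]' into ['user', 'tags', 1]."""
    if not path.startswith("$"):
        raise ValueError(f"not a jsonsax-style path: {path!r}")
    steps = []
    for name, index in _STEP.findall(path[1:]):
        # Exactly one of the two groups is filled in for each step.
        steps.append(name if name else int(index))
    return steps

print(split_path("$.user.tags[1]"))   # ['user', 'tags', 1]
print(split_path("$"))                # []
```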

---

## Feeding it bit by bit (the fun part)

Real streams don't arrive all at once. Watch what happens when the JSON shows
up in messy little chunks — even cut in the middle of a word:

```python
from jsonsax import Parser

parser = Parser()
parser.on("value", lambda path, val: print("found:", path, "=", val))

chunks = ['{"tit', 'le": "R', 'AG", "sco', 're": 9.5}']
for chunk in chunks:
    parser.feed(chunk)   # hand over one piece at a time
parser.close()           # tell it "okay, that's everything"
```

Output:

```
found: $.title = RAG
found: $.score = 9.5
```

Even though the key `title` got chopped into `tit` + `le` (and the value `RAG`
into `R` + `AG`), `jsonsax` patiently stitched everything back together. 🧩
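
Want to chop up your own text to try this? Here is a minimal sketch. `chunked` and `feed_in_chunks` are made-up helper names, and the only thing assumed about the parser is that it has the `feed()` and `close()` methods shown above.

```python
def chunked(text, size):
    """Yield consecutive pieces of `text`, each at most `size` characters."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(text), size):
        yield text[start:start + size]

def feed_in_chunks(parser, text, size):
    """Feed `text` to a parser `size` characters at a time, then close it."""
    for piece in chunked(text, size):
        parser.feed(piece)
    parser.close()

print(list(chunked('{"title": "RAG"}', 5)))
# ['{"tit', 'le": ', '"RAG"', '}']
```

With the real library you would register your callbacks on a `Parser()` first and then pass it in as `parser`.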

---

## Listening for different things ("events")

You tell `jsonsax` what you care about with `parser.on(...)`. Each time it sees
that kind of thing, it calls your little function (a *callback*).

```python
from jsonsax import Parser

parser = Parser()
parser.on("start_object", lambda path: print(path, "{ ... an object starts"))
parser.on("end_object",   lambda path: print(path, "} ... an object ends"))
parser.on("start_array",  lambda path: print(path, "[ ... a list starts"))
parser.on("end_array",    lambda path: print(path, "] ... a list ends"))
parser.on("key",          lambda path, key: print(path, "key:", key))
parser.on("value",        lambda path, val: print(path, "value:", repr(val)))

parse_me = '{"pets": ["cat", "dog"], "happy": true}'
for ch in parse_me:
    parser.feed(ch)
parser.close()
```

Output:

```
$ { ... an object starts
$.pets key: pets
$.pets [ ... a list starts
$.pets[0] value: 'cat'
$.pets[1] value: 'dog'
$.pets ] ... a list ends
$.happy key: happy
$.happy value: True
$ } ... an object ends
```

See how `$.pets[0]` and `$.pets[1]` number the items in the list? Counting
starts at 0, so `[0]` is the "first pet" and `[1]` is the "second pet".

---

## The events you can listen for

| Event          | You get…      | Happens when it sees…                       |
| -------------- | ------------- | ------------------------------------------- |
| `start_object` | `path`        | a `{` (an object is starting)               |
| `end_object`   | `path`        | a `}` (an object is finished)               |
| `start_array`  | `path`        | a `[` (a list is starting)                  |
| `end_array`    | `path`        | a `]` (a list is finished)                  |
| `key`          | `path, key`   | a label inside an object (like `name`)      |
| `value`        | `path, value` | a real value: text, number, true/false/null |

A `value` can be a `str`, an `int`, a `float`, `True`, `False`, or `None`
(JSON's `null` becomes Python's `None`).
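
Because the start and end events always come in pairs, you can use them to keep track of how deep you are. The sketch below is one possible way to do that. `Outline` is a made-up class, not part of `jsonsax`, and it relies only on the event names and arguments from the table above. The events are replayed by hand here so the sketch runs on its own.

```python
class Outline:
    """Collects an indented outline from jsonsax-style events."""

    def __init__(self):
        self.depth = 0
        self.lines = []

    def _add(self, text):
        self.lines.append("  " * self.depth + text)

    def enter(self, path):          # for start_object and start_array
        self._add(path)
        self.depth += 1

    def leave(self, path):          # for end_object and end_array
        self.depth -= 1

    def value(self, path, val):
        self._add(f"{path} = {val!r}")

    def attach(self, parser):
        """Register the handlers on anything with an on(event, callback) method."""
        parser.on("start_object", self.enter)
        parser.on("end_object", self.leave)
        parser.on("start_array", self.enter)
        parser.on("end_array", self.leave)
        parser.on("value", self.value)
        return parser

# Replaying the events from the pets example above by hand:
outline = Outline()
outline.enter("$")
outline.enter("$.pets")
outline.value("$.pets[0]", "cat")
outline.value("$.pets[1]", "dog")
outline.leave("$.pets")
outline.value("$.happy", True)
outline.leave("$")
print("\n".join(outline.lines))
# $
#   $.pets
#     $.pets[0] = 'cat'
#     $.pets[1] = 'dog'
#   $.happy = True
```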

---

## More examples (little recipes)

### 1. Grab just one field, ignore everything else

Only want the title? Listen for values and keep just the one whose path is
`$.title`:

```python
from jsonsax import Parser

def on_value(path, val):
    if path == "$.title":
        print("The title is:", val)

p = Parser()
p.on("value", on_value)
p.feed('{"title": "Hello", "body": "long boring text..."}')
p.close()
# The title is: Hello
```

### 2. Build a dictionary as you go (keyed by path)

```python
from jsonsax import Parser

data = {}

def remember(path, val):
    data[path] = val   # file each value under its path

p = Parser()
p.on("value", remember)
p.feed('{"a": 1, "b": 2, "c": 3}')
p.close()
print(data)
# {'$.a': 1, '$.b': 2, '$.c': 3}
```
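
The recipe above gives you a flat dictionary keyed by path. If you would rather get the nested shape back (objects inside objects, lists inside lists), the usual trick is to keep a stack of the containers that are still open. Here is a sketch of that idea. `TreeBuilder` is a made-up class, not part of `jsonsax`, and it assumes events arrive in the order shown in the events example earlier.

```python
class TreeBuilder:
    """Rebuilds nested Python objects from jsonsax-style events."""

    def __init__(self):
        self.result = None
        self._stack = []     # containers that are still open
        self._key = None     # most recent key seen in the current object

    def _place(self, item):
        if not self._stack:
            self.result = item                  # the root of the document
        elif isinstance(self._stack[-1], list):
            self._stack[-1].append(item)
        else:
            self._stack[-1][self._key] = item

    def start_object(self, path):
        obj = {}
        self._place(obj)
        self._stack.append(obj)

    def start_array(self, path):
        arr = []
        self._place(arr)
        self._stack.append(arr)

    def end_container(self, path):              # for end_object and end_array
        self._stack.pop()

    def key(self, path, key):
        self._key = key

    def value(self, path, val):
        self._place(val)

# Replaying the events for {"pets": ["cat", "dog"], "happy": true} by hand:
builder = TreeBuilder()
builder.start_object("$")
builder.key("$.pets", "pets")
builder.start_array("$.pets")
builder.value("$.pets[0]", "cat")
builder.value("$.pets[1]", "dog")
builder.end_container("$.pets")
builder.key("$.happy", "happy")
builder.value("$.happy", True)
builder.end_container("$")
print(builder.result)   # {'pets': ['cat', 'dog'], 'happy': True}
```

To use it with a real parser you would register each method with `on(...)`, for example `p.on("end_object", builder.end_container)` and `p.on("end_array", builder.end_container)`.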

### 3. Count the items in a list

```python
from jsonsax import Parser

count = 0
def bump(path, val):
    # Fires once per value, so this matches the item count only for a flat list.
    global count
    count += 1

p = Parser()
p.on("value", bump)
p.feed('[10, 20, 30, 40, 50]')
p.close()
print("items:", count)   # items: 5
```

### 4. Deeply nested stuff is no problem

```python
from jsonsax import parse

parse(
    '{"user": {"name": "Mia", "tags": ["a", "b"]}}',
    value=lambda path, val: print(path, "=", val),
)
# $.user.name = Mia
# $.user.tags[0] = a
# $.user.tags[1] = b
```

### 5. All the value types at once

```python
from jsonsax import parse

parse(
    '{"text": "hi", "whole": 42, "decimal": 3.14, "yes": true, "no": false, "nothing": null}',
    value=lambda path, val: print(f"{path:14} -> {val!r}"),
)
# $.text         -> 'hi'
# $.whole        -> 42
# $.decimal      -> 3.14
# $.yes          -> True
# $.no           -> False
# $.nothing      -> None
```

### 6. Reacting to an AI that types its answer slowly

This is the big one. Pretend an AI sends its reply word-by-word:

```python
from jsonsax import Parser

# These pieces would normally come from the AI, one at a time.
ai_stream = ['{"head', 'line": "Big New', 's!", "summary": "It happened today."}']

p = Parser()
p.on("value", lambda path, val: print(f"[{path}] arrived: {val}"))

for piece in ai_stream:
    p.feed(piece)        # the moment a field finishes, you hear about it
p.close()
# [$.headline] arrived: Big News!
# [$.summary] arrived: It happened today.
```
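
Many AI clients hand you their chunks through an `async for` loop. Since `feed()` is an ordinary call, you can use it inside one. The sketch below shows the shape of that loop. `feed_async`, `fake_ai_stream`, and `RecordingParser` are all made-up stand-ins so the example runs on its own; in real code the stream would come from your AI client and the parser would be a `jsonsax` `Parser`.

```python
import asyncio

async def feed_async(parser, stream):
    """Feed every chunk from an async iterator of strings, then close the parser."""
    async for piece in stream:
        parser.feed(piece)
    parser.close()

# --- stand-ins so the sketch runs without a network or the library ---
async def fake_ai_stream():
    """Hypothetical stand-in for an AI client that yields text chunks."""
    for piece in ['{"head', 'line": "Big New', 's!"}']:
        await asyncio.sleep(0)      # pretend we waited on the network
        yield piece

class RecordingParser:
    """Stand-in with the same feed()/close() shape as a jsonsax Parser."""
    def __init__(self):
        self.chunks = []
        self.closed = False
    def feed(self, chunk):
        self.chunks.append(chunk)
    def close(self):
        self.closed = True

recorder = RecordingParser()
asyncio.run(feed_async(recorder, fake_ai_stream()))
print(recorder.chunks)
# ['{"head', 'line": "Big New', 's!"}']
```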

### 7. Chain your setup in one breath

`on(...)` hands you the parser back, so you can line them up:

```python
from jsonsax import Parser

p = (
    Parser()
    .on("key", lambda path, k: print("key", k))
    .on("value", lambda path, v: print("value", v))
)
p.feed('{"x": 1}')
p.close()
# key x
# value 1
```

---

## When the JSON is broken

If the JSON is messy or unfinished, `jsonsax` tells you by raising a
`ParseError` (a special kind of Python `ValueError`, so `except ValueError`
catches it too). It is **strict** on purpose: better to shout early than to
quietly hand you wrong data.

```python
from jsonsax import parse, ParseError

broken_examples = [
    '{"a": 1,}',     # extra comma at the end
    '[1, 2',         # forgot to close the list
    '{"a" 1}',       # missing the ':' between key and value
    '"never ends',   # string with no closing quote
    'true false',    # two things glued together
]

for bad in broken_examples:
    try:
        parse(bad)
    except ParseError as error:
        print("rejected:", bad, "->", error)
```

Output (the exact wording may vary slightly):

```
rejected: {"a": 1,} -> Unexpected '}'.
rejected: [1, 2 -> Unexpected end of input: unclosed container.
rejected: {"a" 1} -> Unexpected value (parser state: obj_colon).
rejected: "never ends -> Unexpected end of input: unterminated string.
rejected: true false -> Unexpected value (parser state: done).
```
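
Since `ParseError` is described above as a kind of `ValueError`, a plain `except ValueError` should be enough to turn "is this JSON okay?" into a yes/no question. Here is a minimal sketch under that assumption. `check` is a made-up helper, and its `parse` argument is only a hook that lets you swap in a different parse function (for a test, say); by default it uses `jsonsax.parse`.

```python
def check(text, parse=None):
    """Return None if `text` parses cleanly, otherwise the error message."""
    if parse is None:
        from jsonsax import parse   # the real parser, imported only when needed
    try:
        parse(text)
    except ValueError as error:     # assumes ParseError is a ValueError
        return str(error)
    return None
```

For example, `check('[1, 2')` would be expected to return a message about unfinished input, and `check('[1, 2]')` would return `None`.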

> **Always call `parser.close()` at the end.** That's the moment `jsonsax`
> double-checks that the JSON was actually complete. If you forget it, a
> document that stops early (say, one missing its last `}`) goes unreported.
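
One way to make `close()` hard to forget is to wrap the parser in a `with` block. The sketch below is an assumption-laden helper, not something `jsonsax` provides: `closing_on_success` is a made-up name, and it only needs the parser to have a `close()` method.

```python
from contextlib import contextmanager

@contextmanager
def closing_on_success(parser):
    """Yield the parser, then call close() only if the with-body did not raise."""
    yield parser
    parser.close()      # never reached when the with-body raises
```

You would write `with closing_on_success(Parser()) as p:` and call `p.feed(...)` inside. The standard `contextlib.closing` is skipped here on purpose: it calls `close()` even after an error, so a broken chunk could end up reported as "unfinished input" instead of the real problem.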

---

## Run it from the terminal (no code needed)

You can pipe JSON straight into `jsonsax` to watch the events scroll by:

```bash
echo '{"x": [1, 2, true]}' | python -m jsonsax
```

Output:

```
$                        {
$.x                      key='x'
$.x                      [
$.x[0]                   value=1
$.x[1]                   value=2
$.x[2]                   value=True
$.x                      ]
$                        }
```

---

## The whole toolbox (quick reference)

```python
from jsonsax import Parser, parse, ParseError, EVENTS

parser = Parser()           # make a new reader
parser.on(event, callback)  # "when you see <event>, call <callback>" (returns parser)
parser.feed(chunk)          # give it the next piece of text
parser.close()              # "that's all" — checks the JSON was complete
parser.closed               # True after a successful close()

parse(text, **handlers)     # shortcut: feed + close in one line
ParseError                  # raised when the JSON is broken (a ValueError)
EVENTS                      # the tuple of all valid event names
```

Good to know:

- **Truly incremental** — chunks can split *anywhere*, even in the middle of a
  word, a number, or a `\uXXXX` escape.
- **Strict** — rejects trailing commas, missing colons, leftover junk, and
  unfinished strings or brackets.
- **Typed** — ships with `py.typed`, so type checkers understand it.
- **Tiny & dependency-free**, works on **Python 3.9+**.
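
If you want to see the "split anywhere" promise for yourself, one approach is to cut a document at every possible position and compare the events. The sketch below does that. The three helpers are made-up names, not part of `jsonsax`, and they assume only the `on()`, `feed()`, and `close()` methods from the quick reference.

```python
def two_way_splits(text):
    """Yield every (head, tail) pair that cuts `text` into two non-empty pieces."""
    for cut in range(1, len(text)):
        yield text[:cut], text[cut:]

def events_for(chunks, make_parser, event_names):
    """Feed `chunks` to a fresh parser and return the list of events it fired."""
    seen = []
    parser = make_parser()
    for name in event_names:
        # Bind `name` now so each callback records its own event name.
        parser.on(name, lambda *args, name=name: seen.append((name, *args)))
    for chunk in chunks:
        parser.feed(chunk)
    parser.close()
    return seen

def split_anywhere_ok(text, make_parser, event_names):
    """True if every two-way split fires the same events as the unsplit text."""
    expected = events_for([text], make_parser, event_names)
    return all(
        events_for([head, tail], make_parser, event_names) == expected
        for head, tail in two_way_splits(text)
    )
```

With the library installed, `split_anywhere_ok('{"s": "\\u00e9"}', Parser, EVENTS)` would be the call to try; it cuts through the `\uXXXX` escape at every position along the way.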

---

## For developers (working on jsonsax itself)

```bash
pip install -e ".[dev]"
pytest             # run the tests
pylint src/jsonsax # check the style
mypy               # check the types
```

---

## License

MIT — free to use, change, and share.
