Metadata-Version: 2.4
Name: dataclass-tree
Version: 0.5.0
Summary: Generic tree dataclass with visitor
Home-page: https://varkenvarken.github.io/dataclass-tree/
Author: varkenvarken
Author-email: test@example.com
License: GPLv3
Classifier: Development Status :: 4 - Beta
Classifier: Programming Language :: Python :: 3.11
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Operating System :: OS Independent
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license
Dynamic: license-file
Dynamic: requires-python
Dynamic: summary

![Python](/python.svg)[![Test](https://github.com/varkenvarken/dataclass-tree/actions/workflows/test_all.yml/badge.svg)](https://github.com/varkenvarken/dataclass-tree/actions/workflows/test_all.yml)![Coverage](/coverage.svg)![Pypi](/pypi.svg)

# dataclass-tree

## Overview

`dataclass-tree` provides a set of base classes and utilities for building and traversing tree-like data structures using Python's `dataclass` feature. It is designed to make tree construction, traversal, and manipulation both intuitive and type-safe, with a particular emphasis on **consistent type hinting** for maximum IDE support.

## Main Classes

### `Tree`

The `Tree` class serves as the abstract base for all tree nodes. It is designed around the concept of *child groups* that can have arbitrary names and may contain any number of `Tree` items (or subclasses thereof).

It is a dataclass, so any field annotated with `list[Tree]` will treated as child groups, while other fields function like normal. Child groups are treated special in the following way:

- They are automatically initialized to empty lists, even if no default factory is specified, and even if the field is annotated as optional with `list[Tree]|None`.
- They can be initialized with the `optional_treelist()` function to clearly mark them as such when hovering over them in an IDE.
- Child groups can be traversed with the `visit()` method, with support with different orders: `PREORDER`, `POSTORDER`, `INORDER`, and `LEVELORDER`.

The `Tree` base class has no child groups defined, so you will need to subclass it (or `LabeledTree`) to use it. (See examples below). Subclasses must use the `@dataclass` [decorator](https://docs.python.org/3/library/dataclasses.html#dataclasses.dataclass),
and any fields defined in the subclass are added to those defined in the superclass. Fields with the same name will replace those in the superclass, and order is significant. (This is standard [dataclass](https://docs.python.org/3/library/dataclasses.html#) behavior).

You can create trees containing a mix of Tree subclasses, each with their own uniquely nae child groups. For example, you could have a tree representing an expression in a programming language, with a Binop class with left and right child groups, and a Unop class having just an expression child group. Even though they have different names, they would be visited in a consistent way by a visitor, regardless their names.

### `LabeledTree`

A subclass of `Tree` with an additional `label` field. This is a very common usecase, so this class is provided as a convenience to save you some typing.

### `Visitor`

The `Visitor` class is designed to process `Tree` nodes while traversing a tree. It uses a flexible dispatch mechanism that locates the appropriate visitor method for each node type and is typically used as an argument for the `Tree.visit` method. The `Visitor` class does not traverse a tree on its own, but will be called to process a node by the `Tree.visit` method.

A `Visitor` has the following features:

- Dynamically dispatches to visitor methods based on the node's class name and the visitor's class name.
- Supports strict enforcement of visitor methods, i.e. raising a `NotImplemented` exception if no suitable method is found, or it can be configured to allow a generic fallback.
- Traversal logic is separated from the operation performed on each node, making it easy to implement new behaviors with minimal boilerplate. 

## Type Hinting for IDE Support

All core classes and their fields use type annotations. For example, child groups are always annotated as `list[Tree]` or `Optional[list[Tree]]`. This clarity enables IDEs to:

- Offer autocompletion for node attributes and methods.
- Provide real-time type checking to catch mistakes early.
- Generate informative tooltips as you code.

To make tooltips even more explicit for attributes that are optional (and any field annotated as `list[Tree]` will be optional and initialized to an empty list automatically), 
the function `optional_treelist()` may be used as a field initializer. 

For example, if you would simply declare a class like this, the hints by your IDE would be incomplete:

```python
    @dataclass
    class Example(Tree):
        left: list[Tree]
        right: list[Tree]
```
They would show up like :arrow_right:  `left: list[Tree], right: list[Tree] -> Example`

The following would make the type checker unhappy (it would flag the None initializers)
    
```python
    @dataclass
    class Example(Tree):
        left: list[Tree] = None
        right: list[Tree] = None
```

This would be ok

```python
    @dataclass
    class Example(Tree):
        left: list[Tree]|None = None
        right: list[Tree]|None = None
```

The hint would show up like :arrow_right:  `left: list[Tree]|None = None, right: list[Tree]|None = None -> Example`

But the following alternative is more readable (IMHO, your taste might be different), and it will make
sure that when hovering over one of the attributes in an instance, it will show the type as `list[Tree]` instead of
`list[Tree]|None`, which is more correct because it is always initialized to an empty list and assigning None to
it is not a good idea.

```python
    @dataclass
    class Example(Tree):
        left: list[Tree] = optional_treelist()
        right: list[Tree] = optional_treelist()
```

The hint for the constructor would be :arrow_right:  `left: list[Tree] = optional_treelist(), right: list[Tree] = optional_treelist() -> Example`

and hovering over `myexamplenode.left` would show :arrow_right: `left: list[Tree]` 

## Example Usage

A simple example showing class with different child groups.

```python
from dataclasses import dataclass
from dataclass_tree import Tree, LabeledTree, Visitor, Order, optional_treelist

@dataclass
class Unop(LabeledTree):
    expression: : list[Tree] = optional_treelist()

@dataclass
class Binop(LabeledTree):
    left: list[Tree] = optional_treelist()
    right: list[Tree] = optional_treelist()

# a leaf class, i.e. without any fields annotated as list[Tree]
@dataclass
class Literal(Tree):
    value: str

# Build a simple tree
root = Unop("-")
root.expression = Binop(label="+")
root.left.append(Literal("42"))
root.right.append(Literal("7"))

# Define a visitor with a single, generic method that will apply to any node
class Printer(Visitor):
    def _do_printer(self, node: Tree, level: int):
        print(" " * (2 * level) + str(node))

# Traverse the tree
root.visit(Printer(), order=Order.PREORDER)
```

## License

GPL V3

## Supported traversal methods

Preorder, postorder, inorder, and levelorder.

See https://en.wikipedia.org/wiki/Tree_traversal

## Limitations and possible future enhancements

The `Tree.visit()` function is implemented with simple recursion. This means that for large trees Python's recusion depth may be a limitation and you might get
`RecursionError: maximum recursion depth exceeded`. You can overcome this by [setting a different limit](https://docs.python.org/3/library/sys.html#sys.setrecursionlimit)
but in the future I might replace this by a queue based implementation.

Currently the `Tree.visit()` method has its `visitor` attribute annotated as `Visitor`; It might be worth while to replace this by a protocol instead, to make applying a visitor even more versatile.



