Metadata-Version: 2.4
Name: django-graph-walker
Version: 0.1.0
Summary: Walk Django model relationship graphs for cloning, subsetting, export, and visualization.
Project-URL: Homepage, https://github.com/dannyshaw/django-graph-walker
Project-URL: Repository, https://github.com/dannyshaw/django-graph-walker
Project-URL: Issues, https://github.com/dannyshaw/django-graph-walker/issues
Author: Danny Shaw
License-Expression: MIT
Classifier: Development Status :: 3 - Alpha
Classifier: Framework :: Django
Classifier: Framework :: Django :: 3.2
Classifier: Framework :: Django :: 4.0
Classifier: Framework :: Django :: 4.1
Classifier: Framework :: Django :: 4.2
Classifier: Framework :: Django :: 5.0
Classifier: Framework :: Django :: 5.1
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.10
Requires-Dist: django>=3.2
Provides-Extra: anonymize
Requires-Dist: faker; extra == 'anonymize'
Provides-Extra: dev
Requires-Dist: faker; extra == 'dev'
Requires-Dist: graphviz; extra == 'dev'
Requires-Dist: pre-commit; extra == 'dev'
Requires-Dist: pytest; extra == 'dev'
Requires-Dist: pytest-django; extra == 'dev'
Requires-Dist: ruff; extra == 'dev'
Provides-Extra: viz
Requires-Dist: graphviz; extra == 'viz'
Description-Content-Type: text/markdown

# django-graph-walker

Walk Django model relationship graphs for cloning, subsetting, export, and visualization.

## Overview

django-graph-walker traverses Django model relationships using breadth-first search, collecting all reachable instances into a result set you can export, visualize, or inspect. It is designed for tasks like creating dev/test data subsets from production, cloning content trees with all their dependencies, and understanding complex schema relationships.

The walker uses batch prefetching, so the number of queries scales with the number of edge types per BFS level, not with the number of instances. A walk across thousands of instances typically requires only a handful of queries per relationship type.

## Installation

```bash
pip install django-graph-walker
```

Optional extras:

```bash
pip install django-graph-walker[viz]        # graphviz for DOT rendering
pip install django-graph-walker[anonymize]  # faker for field anonymization
```

Requires Python 3.10+ and Django 3.2+.

## Quick Start

```python
from django_graph_walker import GraphSpec, GraphWalker

# Define which models are in scope
spec = GraphSpec(Author, Article, Tag)

# Walk from a root instance
article = Article.objects.get(pk=1)
result = GraphWalker(spec).walk(article)

# Inspect the result
print(result.instance_count)        # total instances collected
print(result.instances_of(Author))  # all Author instances reached
print(result.by_model())            # {Author: [...], Article: [...], Tag: [...]}

# Export to a JSON fixture
from django_graph_walker.actions.export import Export
Export().to_file(result, "dev_data.json")
```

## Core Concepts

### GraphSpec

A `GraphSpec` declares which Django models are in scope for a walk and optionally provides per-field overrides. The walker never traverses into models that are not in the spec.

**Positional models** -- all defaults, no overrides:

```python
spec = GraphSpec(Author, Article, Category, Tag)
```

**Dict with overrides** -- control specific fields:

```python
spec = GraphSpec({
    Author: {
        "email": Anonymize("email"),
    },
    Article: {
        "reviewer": Ignore(),
    },
    Tag: {},
})
```

**Mixed** -- combine both styles:

```python
spec = GraphSpec(
    {Author: {"email": Anonymize("email")}},
    Article,
    Tag,
)
```

**Auto-generate from apps** -- no need to list models manually:

```python
# All models in one app
spec = GraphSpec.from_app("books")

# Multiple apps
spec = GraphSpec.from_apps("books", "reviews")

# All models (excludes django.contrib.* by default)
spec = GraphSpec.all()

# Remove specific models
spec = GraphSpec.from_app("books").exclude(Review)
```

**Composition with `|`** -- merge two specs, with the right-hand side winning on conflicts:

```python
base = GraphSpec.from_app("books")
overrides = GraphSpec({Author: {"email": Anonymize("email")}})
combined = base | overrides
```
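
To illustrate "right-hand side wins on conflicts", here is a minimal sketch that uses plain dicts in place of `GraphSpec`. The `merge_specs` helper and the string override values are hypothetical, and merging per field (rather than replacing a model's whole override dict) is an assumption, not something confirmed about the library:

```python
# Hypothetical stand-in for spec contents: model name -> {field name: override}.
def merge_specs(left: dict, right: dict) -> dict:
    """Merge two spec-like mappings; entries from `right` win on conflicts."""
    # Copy the inner dicts so neither input is mutated.
    merged = {model: dict(fields) for model, fields in left.items()}
    for model, fields in right.items():
        merged.setdefault(model, {}).update(fields)
    return merged


base = {"Author": {"email": "keep", "bio": "ignore"}, "Book": {}}
overrides = {"Author": {"email": "anonymize"}}
combined = merge_specs(base, overrides)
print(combined)
```

In this sketch, models that appear only on one side are carried over unchanged, and `base` itself is left untouched.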

### Field Overrides

| Override | Description | Example |
|---|---|---|
| `Follow(filter=..., prefetch=..., limit=...)` | Force-follow an edge. Optional filter, prefetch customization, and per-parent limit. | `Follow(filter=lambda ctx, a: a.published, limit=10)` |
| `Ignore()` | Suppress traversal of an edge that would otherwise be followed. | `Ignore()` |
| `Override(value)` | Set a field to a static value or a callable `(instance, ctx) -> value`. | `Override(lambda inst, ctx: ctx["new_title"])` |
| `KeepOriginal(when=...)` | For FK fields to in-scope models: keep the original target instead of using a clone. Optional conditional. | `KeepOriginal(when=lambda inst, ctx: inst.is_shared)` |
| `Anonymize(provider)` | Anonymize a field using a faker provider string or callable `(instance, ctx) -> value`. | `Anonymize("first_name")` |

### GraphWalker

`GraphWalker` performs level-order BFS from one or more root instances. Every relationship where both endpoints are in the spec is followed by default. Use `Ignore()` to opt out of specific edges.

```python
walker = GraphWalker(spec)

# Single root
result = walker.walk(article)

# Multiple roots
result = walker.walk(article_1, article_2, article_3)

# With context passed to filter/override callables
result = walker.walk(article, ctx={"tenant_id": 42})
```

**Batch prefetching**: Each BFS level groups queued instances by model, then calls `prefetch_related_objects()` once per model group. This means traversing 1,000 articles with FK to Author issues one prefetch query for the Author relationship, not 1,000 individual lookups.
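
The grouping idea can be sketched without Django. The toy below is not the library's implementation: `EDGES` is an in-memory stand-in for the database, and the `batches` counter marks where one batched lookup per model group would happen.

```python
from collections import defaultdict

# Toy data: (model, pk) -> related (model, pk) nodes. Names are illustrative.
EDGES = {
    ("Article", 1): [("Author", 10), ("Tag", 100)],
    ("Article", 2): [("Author", 10), ("Tag", 101)],
    ("Author", 10): [],
    ("Tag", 100): [],
    ("Tag", 101): [],
}


def walk(roots):
    """Level-order BFS that does one batched lookup per model per level."""
    visited = set(roots)
    level = list(roots)
    batches = 0
    while level:
        # Group the current level by model so each group is fetched together.
        by_model = defaultdict(list)
        for node in level:
            by_model[node[0]].append(node)
        next_level = []
        for model, nodes in by_model.items():
            batches += 1  # one batched lookup covers the whole group
            for node in nodes:
                for target in EDGES[node]:
                    if target not in visited:
                        visited.add(target)
                        next_level.append(target)
        level = next_level
    return visited, batches


visited, batches = walk([("Article", 1), ("Article", 2)])
print(len(visited), batches)
```

Adding more articles to the first level would grow `visited` but not the number of batches for that level, which is the scaling property described above.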

### WalkResult

`WalkResult` is the container returned by `GraphWalker.walk()`. It holds all visited instances keyed by `(model_class, pk)`.

```python
result = GraphWalker(spec).walk(article)

# Group by model
for model, instances in result.by_model().items():
    print(f"{model.__name__}: {len(instances)}")

# Get instances of a specific model
authors = result.instances_of(Author)

# Dependency-ordered model list (FK targets before FK sources)
for model in result.topological_order():
    print(model.__name__)

# Iteration and membership
for instance in result:
    print(instance)

if article in result:
    print("Article was visited")

# Merge two results
combined = result_a | result_b
```

Properties:
- `instance_count` -- total number of collected instances
- `model_count` -- number of distinct model types collected
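
The "FK targets before FK sources" ordering can be illustrated with the standard library's `graphlib`. This is only a sketch of the ordering property; the `fk_targets` mapping is hypothetical, and how `topological_order()` is actually implemented is not stated here.

```python
from graphlib import TopologicalSorter

# Hypothetical FK map: model name -> models it points at via foreign keys.
fk_targets = {
    "Article": {"Author", "Category"},
    "Author": set(),
    "Category": set(),
    "Comment": {"Article", "Author"},
}

# TopologicalSorter expects node -> predecessors, so FK targets come out first.
order = list(TopologicalSorter(fk_targets).static_order())
print(order)
```

An ordering like this is what makes it safe to insert rows model by model: every row's FK targets already exist by the time the row is written.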

## Actions

### Clone

The `Clone` action duplicates a walked subgraph within the same database, creating new instances with new PKs and remapping all FKs to point to the clones.

```python
from django_graph_walker.actions.clone import Clone

spec = GraphSpec({
    Article: {
        "title": Override(lambda inst, ctx: f"Copy of {inst.title}"),
        "author": KeepOriginal(),   # point to original author, don't clone
    },
    Category: {},
    Tag: {},
})

result = GraphWalker(spec).walk(article)
cloned = Clone(spec).execute(result)

# Access cloned instances
cloned_article = cloned.get_clone(article)
print(cloned_article.title)   # "Copy of My Article"
print(cloned_article.author)  # original author (KeepOriginal)
print(cloned.clone_count)     # total clones created

# Get a WalkResult of all clones
clone_result = cloned.result
```

**With context** -- pass data to Override/KeepOriginal callables:

```python
cloned = Clone(spec).execute(result, ctx={"tenant_id": 42})
```

Spec overrides applied during cloning:
- `Override(value)` -- replace a field value (static or callable)
- `KeepOriginal()` -- keep the original FK target instead of remapping to the clone
- `Anonymize(provider)` -- anonymize with faker or a callable
- Out-of-scope FKs automatically keep their original references
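
The remapping rule (in-scope FKs follow the clone, out-of-scope FKs keep their original target) can be sketched with plain dicts. Everything here is illustrative: the row layout, the `clone_rows` helper, and the PK counter are assumptions, not the `Clone` internals.

```python
from itertools import count

# Toy rows: pk -> field dict. "author_id" is in scope, "site_id" is not.
authors = {1: {"name": "Ada"}}
articles = {7: {"title": "Graphs", "author_id": 1, "site_id": 99}}

new_pk = count(1000)  # hypothetical source of fresh primary keys


def clone_rows(rows):
    """Copy rows under new PKs and return (clones, old_pk -> new_pk map)."""
    pk_map = {old: next(new_pk) for old in rows}
    clones = {pk_map[old]: dict(fields) for old, fields in rows.items()}
    return clones, pk_map


author_clones, author_map = clone_rows(authors)
article_clones, article_map = clone_rows(articles)

# Second pass: point in-scope FKs at the clones; leave other FKs alone.
for fields in article_clones.values():
    fields["author_id"] = author_map.get(fields["author_id"], fields["author_id"])

print(article_clones)
```

A `KeepOriginal()` override would correspond to skipping the second pass for that field, so the clone keeps pointing at the original row.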

### Export

The `Export` class serializes walk results to JSON fixtures or copies them to another database.

```python
from django_graph_walker.actions.export import Export

result = GraphWalker(spec).walk(article)
```

**JSON fixture string**:

```python
json_str = Export(format="json").to_fixture(result)
```

**Write to file**:

```python
Export(format="json").to_file(result, "dev_data.json")
```

**Copy to another database** with automatic PK and FK remapping:

```python
instance_map = Export().to_database(result, target_db="staging")
# instance_map: {(OriginalModel, old_pk): new_instance, ...}
```

**With anonymization** -- reference fields as `"ModelName.field_name"`:

```python
export = Export(
    anonymizers={
        "Author.email": "email",               # faker provider
        "Author.name": lambda inst, ctx: "Anon",  # callable
    },
)
export.to_file(result, "anonymized.json")
```

Additional options:
- `use_natural_keys=True` -- use Django's natural key serialization
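
As a rough sketch of how `"ModelName.field_name"` keys could be resolved, the example below applies a mapping of anonymizers to one record. It is not the `Export` implementation: the `anonymize` helper is hypothetical, a field dict stands in for the model instance, and `PROVIDERS` is a fixed stand-in for faker.

```python
# Stand-in provider table; provider strings are described above as faker providers.
PROVIDERS = {"email": lambda: "user@example.com"}


def anonymize(model_name, fields, anonymizers, ctx=None):
    """Return a copy of `fields` with any matching anonymizers applied."""
    out = dict(fields)
    for name in fields:
        rule = anonymizers.get(f"{model_name}.{name}")
        if rule is None:
            continue  # no anonymizer registered for this field
        # Callables receive (instance, ctx); strings name a provider.
        out[name] = rule(fields, ctx) if callable(rule) else PROVIDERS[rule]()
    return out


anonymizers = {
    "Author.email": "email",
    "Author.name": lambda inst, ctx: "Anon",
}
author = {"name": "Ada Lovelace", "email": "ada@real.example", "bio": "Wrote notes."}
print(anonymize("Author", author, anonymizers))
```

Fields without a matching key, such as `bio` here, pass through unchanged.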

### Visualize

The `Visualize` class generates Graphviz DOT output for schema-level and instance-level graphs.

```python
from django_graph_walker.actions.visualize import Visualize

spec = GraphSpec(Author, Article, Category, Tag)
viz = Visualize(show_field_names=True)
```

**Schema-level** -- shows models and their relationships (no database queries):

```python
dot_string = viz.schema(spec)
print(dot_string)  # valid DOT/Graphviz source
```
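
For readers unfamiliar with DOT, the sketch below builds a small schema graph by hand. The `to_dot` function and the edge tuples are hypothetical, and the exact text that `viz.schema()` produces will likely differ; the point is only what "models as nodes, relationships as labeled edges" looks like in DOT.

```python
def to_dot(edges, show_field_names=True):
    """Render (source, target, field_name) edges as DOT source."""
    lines = ["digraph schema {"]
    # Declare each model once, in a stable order.
    for node in sorted({n for src, dst, _ in edges for n in (src, dst)}):
        lines.append(f'  "{node}";')
    for src, dst, field in edges:
        label = f' [label="{field}"]' if show_field_names else ""
        lines.append(f'  "{src}" -> "{dst}"{label};')
    lines.append("}")
    return "\n".join(lines)


edges = [("Article", "Author", "author"), ("Article", "Tag", "tags")]
dot = to_dot(edges)
print(dot)
```

Passing `show_field_names=False` drops the edge labels, which mirrors the intent of the option of the same name on `Visualize`.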

**Instance-level** -- shows actual instances and connections from a walk result:

```python
result = GraphWalker(spec).walk(article)
dot_string = viz.instances(result)
```

**Graphviz objects** -- requires the `graphviz` package (`pip install django-graph-walker[viz]`):

```python
graph = viz.schema_to_graphviz(spec)
graph.render("schema", format="png")

graph = viz.instances_to_graphviz(result)
graph.render("instances", format="svg")
```

### Interactive Visualization

Generate single-file interactive HTML pages: no server required and no extra Python dependencies. The JavaScript libraries are loaded from a CDN, so the browser needs network access when the page is opened.

**Cytoscape.js** (`--format=html`) -- clean 2D directed graph with dagre layout:

```python
from django_graph_walker.actions.visualize import Visualize
from django_graph_walker.actions.interactive import InteractiveRenderer

graph_data = Visualize().schema_to_dict(spec)
html = InteractiveRenderer().to_cytoscape_html(graph_data, title="My Schema")
# Write to file and open in browser
```

Features: dagre top-down layout, zoom/pan/drag, hover tooltips on edges, click-to-highlight connected nodes, sidebar with field details, edge styling by relationship type.

**3d-force-graph** (`--format=3d`) -- 3D WebGL with animated directional particles:

```python
graph_data = Visualize().schema_to_dict(spec)
html = InteractiveRenderer().to_3d_html(graph_data, title="My Schema")
```

Features: 3D orbit controls, always-visible text labels on nodes, animated particles flowing along edges showing FK direction, click-to-fly-to-node camera, force-directed layout with charge repulsion for clear spacing.

Both renderers also work with instance-level data:

```python
result = GraphWalker(spec).walk(article)
graph_data = Visualize().instances_to_dict(result)
html = InteractiveRenderer().to_cytoscape_html(graph_data, title="Instance Graph")
```

## Management Commands

Add `"django_graph_walker"` to `INSTALLED_APPS` to enable the management commands.

### `graph_schema` -- Visualize model relationships

```bash
# Single app
python manage.py graph_schema books

# Multiple apps
python manage.py graph_schema books reviews

# All apps
python manage.py graph_schema --all

# Output to file
python manage.py graph_schema books -o schema.dot

# Render to image (requires pip install django-graph-walker[viz])
python manage.py graph_schema books --format=png -o schema.png

# Machine-readable JSON
python manage.py graph_schema books --format=json

# Interactive HTML (Cytoscape.js + dagre layout)
python manage.py graph_schema books --format=html -o schema.html

# 3D interactive HTML (3d-force-graph with animated particles)
python manage.py graph_schema books --format=3d -o schema3d.html

# Exclude specific models
python manage.py graph_schema books --exclude=books.Review

# Hide field names on edges
python manage.py graph_schema books --no-field-names
```

### `graph_walk` -- Walk and export from the CLI

```bash
# Walk from a root instance, print stats
python manage.py graph_walk books.Book 42

# Export to JSON fixture
python manage.py graph_walk books.Book 42 -o fixture.json

# Multiple root PKs
python manage.py graph_walk books.Book 1,2,3

# Explicit app scope (default: root model's app)
python manage.py graph_walk books.Book 42 --apps=books,reviews

# All apps in scope
python manage.py graph_walk books.Book 42 --all

# Stats only, no export
python manage.py graph_walk books.Book 42 --dry-run
```

### `graph_deps` -- Dependency analysis

```bash
# What depends on Book + what Book depends on
python manage.py graph_deps books.Book

# Full dependency tree for an app
python manage.py graph_deps books --tree

# Models with no relationships
python manage.py graph_deps books --orphans

# Machine-readable JSON
python manage.py graph_deps books.Book --format=json
```

### `graph_fanout` -- Fan-out risk analysis

```bash
# Analyze an app for fan-out risks
python manage.py graph_fanout books

# Multiple apps
python manage.py graph_fanout books reviews

# All apps
python manage.py graph_fanout --all

# Analyze a specific GraphSpec object (dotted import path)
python manage.py graph_fanout --spec=myapp.specs.my_spec

# Add DB cardinality estimates
python manage.py graph_fanout books --estimate

# Machine-readable JSON
python manage.py graph_fanout books --format=json

# Adjust shared-reference sensitivity (default: 3)
python manage.py graph_fanout books --threshold=2

# Exclude specific models
python manage.py graph_fanout books --exclude=books.Review
```

Detects cycles, bidirectional edges, limit bypasses (where `Follow(limit=N)` is circumvented by an alternate unlimited path), and shared references (models reachable from many sources that fan back out).
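
Two of these checks can be sketched on a plain adjacency map. This is not the command's algorithm: `find_cycles` and `shared_references` are hypothetical helpers, and treating a "shared reference" as a model with at least `threshold` distinct incoming sources and at least one outgoing edge is an assumed reading of the description above.

```python
def find_cycles(graph):
    """Return one example cycle per back edge found by depth-first search."""
    cycles, state, stack = [], {}, []

    def visit(node):
        state[node] = "active"
        stack.append(node)
        for nxt in graph.get(node, ()):
            if state.get(nxt) == "active":
                # Back edge: the cycle is the stack from nxt to here.
                cycles.append(stack[stack.index(nxt):] + [nxt])
            elif nxt not in state:
                visit(nxt)
        stack.pop()
        state[node] = "done"

    for node in graph:
        if node not in state:
            visit(node)
    return cycles


def shared_references(graph, threshold=3):
    """Models with many distinct incoming sources that also fan back out."""
    incoming = {}
    for src, targets in graph.items():
        for dst in targets:
            incoming.setdefault(dst, set()).add(src)
    return sorted(
        model for model, sources in incoming.items()
        if len(sources) >= threshold and graph.get(model)
    )


# Toy model graph: model -> models it has relationships to.
graph = {
    "Book": ["Author", "Publisher"],
    "Review": ["Book", "Author"],
    "Author": ["Book"],
    "Publisher": [],
    "Award": ["Author"],
}
print(find_cycles(graph))
print(shared_references(graph))
```

Lowering `threshold` flags more models, which matches the direction of the `--threshold` option: a smaller value is more sensitive.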

## Settings

Optional configuration via `GRAPH_WALKER` in your Django settings:

```python
GRAPH_WALKER = {
    # Apps excluded by GraphSpec.all() and --all flag
    # Default: all django.contrib.* apps
    "EXCLUDE_APPS": ["django.contrib.admin", "django.contrib.auth", ...],
}
```

## Examples

See [`examples/bookstore/`](examples/bookstore/) for a working example project that demonstrates walking a bookstore data model, exporting to JSON fixtures, and generating interactive graph visualizations.

## Acknowledgements

This project was inspired by an internal clone tool built by [@MattFisher](https://github.com/MattFisher) at [Edrolo](https://edrolo.com.au), which pioneered the idea of spec-driven Django model graph traversal.

## License

MIT
