Metadata-Version: 2.4
Name: knwl
Version: 1.6.1
Summary: A flexible Graph RAG package with advanced ingestion and augmentation strategies.
Project-URL: Homepage, https://github.com/Orbifold/knwl
Project-URL: Repository, https://github.com/Orbifold/knwl
Project-URL: Documentation, https://github.com/Orbifold/knwl
Author-email: "Francois Vanderseypen (Orbifold Consulting)" <swa@orbifold.net>
License-Expression: MIT
License-File: LICENSE
Requires-Python: <3.13,>=3.10
Requires-Dist: anthropic>=0.72.0
Requires-Dist: chromadb>=1.1.1
Requires-Dist: networkx>=3.4.2
Requires-Dist: ollama>=0.4.7
Requires-Dist: openai>=1.53.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: rich>=13.0.0
Requires-Dist: tiktoken>=0.8.0
Requires-Dist: torch>=2.5.1
Requires-Dist: transformers>=4.46.1
Description-Content-Type: text/markdown

# Knwl

A flexible Graph RAG package with advanced ingestion and augmentation strategies.

<div style="text-align: center;">
    <img style="border: 1px solid #ccc; border-radius: 4px; padding: 4px; margin: auto; width: 100%;" src="./journal/images/Swirl.jpg" alt="Knwl visualization via yFiles.">
</div>


**Kwnl** is short for 'knowledge' but could just as well stand for '*know well*'(as in knowing your knowledge well), '*knowledge network workflow library*', '*knwledge notes with linking*', '*keep notes, wiki and links*', '*knwoledge network and wisdom library*' or '*keep notes, write and learn*'.

## Features

- **Five Graph RAG Strategies**: Local, Global, Naive, Self and Hybrid query modes for flexible knowledge retrieval
- **Dependency Injection Framework**: Decorator-based DI system (`@service`, `@singleton_service`, `@inject_config`, `defaults`) for clean, configurable architecture
- **Extensively Tested**: Comprehensive test suite covering all components and strategies
- **No External Services Required**: Runs with lightweight local implementations (Ollama, NetworkX, JSON storage) out of the box. Great for experiments while fully open for enterprise-grade integrations with Qdrant, Neo4j, LangGraph, etc.
- **Protocol-Based & Extensible**: Override base classes and configure via JSON to customize LLMs, storage, chunking, extraction, and more
- **Semantic Search**: Vector-based similarity search for nodes, edges, and chunks
- **Classic RAG**: includes traditional retrieval-augmented generation with chunk-based context
- **Rich Output Formatting**: Beautiful terminal, HTML, and Markdown renderers for all models using Rich and custom formatters.
- **Graph Visualization**: the default graph store is saved as GraphML which can be visualized with tools like Gephi, yEd or Cytoscape. Equally open to graph databases and advanced graph visualization (Bloom, yFiles, Ogma, etc.)
- **Any LLM**: Easily swap LLM implementations (Ollama, OpenAI, Anthropic, etc.) via configuration. 

<div style="text-align: center;">
    <img style="border: 1px solid #ccc; border-radius: 4px; padding: 4px; margin: auto; width: 500px;" src="./journal/images/GraphIngestion.png" alt="Small part of the Wikipedia article on topology" width="500">
</div>

<div style="text-align: center;">
    <img style="border: 1px solid #ccc; border-radius: 4px; padding: 4px; margin: auto; width: 500px;" src="./journal/images/Legal.jpg" alt="Belgian Fiscal Law KG" width="500">
</div>

## Architecture

KNWL uses a hierarchical configuration system with service variants, allowing runtime component swapping without code changes. All components inherit from `FrameworkBase` and are wired through dependency injection.

Core services include:

- **LLM**: Ollama, OpenAI (configurable via `llm.default`)
- **Storage**: JSON, Chroma, NetworkX graph storage
- **Chunking**: Tiktoken-based text splitting
- **Extraction**: Graph and entity extraction with customizable prompts
- **Vector Search**: Semantic similarity for retrieval

## Graph RAG Strategies

KNWL implements four distinct retrieval strategies for different query patterns:

### Local Strategy

Focuses on entity-centric retrieval:

- Extracts **low-level keywords** from the query and matches against nodes (primary nodes)
- Retrieves the **relationship neighborhood** around these primary nodes
- Builds context from:
  - Primary node records (name, type, description)
  - Connected relationship records (source, target, type, description)
  - Text chunks associated with the primary nodes

**Use case**: Questions about specific entities or concepts and their immediate relationships.

### Global Strategy

Focuses on relationship-centric retrieval:

- Extracts **high-level keywords** from the query and matches against edges
- Retrieves the **node endpoints** of matching edges
- Builds context from:
  - Node endpoint records (entities connected by the relationships)
  - Edge records (source, target, type, description)
  - Text chunks associated with the edges

**Use case**: Questions about relationships, connections, or patterns between entities.

### Naive Strategy

Traditional RAG approach:

- Performs direct **semantic similarity search** on text chunks
- No graph structure utilized
- Builds context purely from retrieved chunks

**Use case**: Simple fact-finding or when graph structure isn't beneficial.

### Hybrid Strategy

Combines Local and Global strategies:

- Executes both local and global retrieval in parallel
- Merges and deduplicates the combined context
- Provides comprehensive coverage across entities, relationships, and chunks

**Use case**: Complex queries benefiting from both entity and relationship context.

## Quick Start

```python
from knwl import Knwl

# Initialize with default configuration
knwl = Knwl()

# Ingest documents and build knowledge graph
await knwl.insert("Your text content here...")

# Query with different strategies
result = await knwl.query("Your question?", mode="local")    # or "global", "naive", "hybrid"
print(result.answer)
```

## API Access

Run the FastAPI server:

```bash
# Development mode (auto-reload)
python api/main.py

# Production with uvicorn
uvicorn knwl.api.main:app --host 0.0.0.0 --port 9000 --workers 8
```

## Formatting & Rendering

KNWL provides sophisticated formatting utilities for beautiful output across multiple mediums: **terminal**, **HTML**, and **Markdown**. The formatting system is protocol-based and extensible, allowing custom formatters for any Pydantic model.

### Terminal Output with Rich

The Rich-based terminal formatter creates beautiful, colorful console output with tables, panels, and trees:

```python
from knwl.format import print_knwl, format_knwl
from knwl.models import KnwlNode, KnwlEdge, KnwlGraph

# Print models directly to terminal with beautiful formatting
node = KnwlNode(name="AI", type="Concept", description="Artificial Intelligence")
print_knwl(node)  # Renders as formatted panel with table

# Format collections
edges = [KnwlEdge(...), KnwlEdge(...)]
print_knwl(edges)  # Renders as organized table with syntax highlighting

# Get formatted object for further manipulation
formatted = format_knwl(node, format_type="terminal")
```

**Features:**
- Consistent color schemes and styling across all models
- Automatic table generation for model fields
- Syntax highlighting for code and JSON
- Tree views for hierarchical data
- Customizable themes via `RichTheme`

### HTML Output

Generate semantic HTML with CSS classes for web displays, documentation, and reports:

```python
from knwl.format import render_knwl, format_knwl

# Get HTML string
html = format_knwl(node, format_type="html")

# Save to file with full page structure
render_knwl(
    graph,
    format_type="html",
    output_file="output.html",
    full_page=True,
    title="Knowledge Graph"
)
```

**Features:**
- Semantic HTML5 with CSS classes (`knwl-panel`, `knwl-table`, etc.)
- Customizable class mappings
- Full page generation with proper HTML structure
- Responsive tables and layouts

### Markdown Output

Create GitHub-flavored Markdown for documentation and static sites:

```python
# Generate markdown
md = format_knwl(graph, format_type="markdown")

# Save with frontmatter
render_knwl(
    result,
    format_type="markdown",
    output_file="report.md",
    add_frontmatter=True,
    title="Query Results"
)
```

**Features:**
- GitHub-flavored Markdown tables
- Code blocks with syntax highlighting
- Frontmatter support for static site generators
- Hierarchical headings for nested structures

### Custom Formatters

Register custom formatters for your own models:

```python
from knwl.format import register_formatter
from knwl.format.formatter_base import ModelFormatter

@register_formatter(MyCustomModel, "terminal")
class MyCustomFormatter(ModelFormatter):
    def format(self, model, formatter, **options):
        # Access Rich formatter's theming
        table = formatter.create_table(title="My Custom Output")
        # Add custom formatting logic
        return formatter.create_panel(table, title=model.name)
```

All KNWL models (`KnwlNode`, `KnwlEdge`, `KnwlGraph`, `KnwlAnswer`, `KnwlContext`, etc.) have pre-registered formatters for consistent output across formats.

## Extensibility

Override base classes and update `knwl/config.py`:

```python
# Custom LLM implementation
class MyLLM(LLMBase):
    async def complete(self, prompt: str) -> str:
        # Your implementation
        pass

# Add to config
"llm": {
    "default": "my_llm",
    "my_llm": {
        "class": "mypackage.MyLLM",
        "api_key": "...",
    }
}
```

Use via dependency injection:

```python
@service("llm", variant="my_llm")
async def my_function(llm=None):
    result = await llm.complete("Hello")
```

## Testing

KNWL is extensively tested with unit tests covering all components, strategies, and integration scenarios.

Run tests without LLM integration (fast):

```bash
uv run pytest -m "not llm"
```

Run full test suite (requires Ollama running):

```bash
uv run pytest
```

Run specific test categories:

```bash
uv run pytest -m basic           # Basic functionality tests
uv run pytest -m integration     # Integration tests
uv run pytest -m asyncio         # Async tests
```

## Examples

In the `examples/` directory, you can find various scripts demonstrating KNWL's capabilities, including:

- Basic usage and querying
- Custom LLM integration
- Advanced configuration scenarios

Use VSCode Interactive Python for best experience but you can also run this script directly.
See https://code.visualstudio.com/docs/python/jupyter-support-py for more details.


## Model Evaluation and Benchmarks

In the `benchmarks` directory, you can find evaluation scripts and benchmark datasets to assess Knwl's performance across different graph RAG strategies and configurations. These benchmarks help in understanding the effectiveness of various retrieval methods and the overall system efficiency. Main insights include:

- bigger models take longer to ingest data and do not perform better
- reasoning models do not improved graph extraction quality.

## License

Knwl is released under the MIT License. See LICENSE for details.

## Consulting & Support
For consulting, support, or custom development services around Knwl, please contact the author at info@orbifold.net or visit https://orbifold.net/contact.

Orbifold Consulting, based in Belgium, specialises in delivering comprehensive technical and strategic consulting services to a global clientele across diverse industries and sectors. Our firm provides tailored solutions that empower businesses to navigate and thrive in today's dynamic market landscape. With a commitment to excellence and innovation, Orbifold Consulting ensures that each client receives personalised strategies and expert guidance to achieve their unique business objectives effectively and sustainably.

With over 25 years of experience as an independent consulting firm, we specialise in merging business acumen with scientific expertise to create customised software solutions. Our unique and innovative approaches leverage cutting-edge tools and technologies to meet the specific needs of our clients. We are committed to delivering excellence through bespoke solutions that drive efficiency, innovation, and success. By consistently staying ahead of industry trends and advancements, we ensure our clients receive the highest quality service and support.
Orbifold Consulting excels in translating cutting-edge technologies into actionable business insights, empowering companies to embrace graph technology and AI to drive innovation. Our expertise lies in guiding organisations through the complexities of technological adoption, ensuring seamless integration and optimal utilisation of advanced solutions to achieve strategic goals. We are committed to fostering a culture of innovation, helping businesses not only adapt to but also thrive in an increasingly digital landscape. Through our comprehensive world-wide consulting services, we enable clients to harness the full potential of advanced AI, paving the way for sustained growth and competitive advantage.

As a vendor-neutral organisation, we are committed to selecting the optimal technology for each project, ensuring tailored solutions that drive innovation from ideation to implementation. We position ourselves as your strategic innovation partners, dedicated to delivering excellence at every stage of your project’s lifecycle.

