Metadata-Version: 2.4
Name: yoneda-cognitive-machine
Version: 0.0.1a1
Summary: Yoneda Cognitive Machine - a pure symbolic AI kernel based on category theory
Author: Hao Zhang
License: AGPL-3.0-or-later
License-File: LICENSE.md
Keywords: category-theory,inference,knowledge-graph,symbolic-ai
Requires-Python: >=3.10
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Description-Content-Type: text/markdown

# Yoneda Cognitive Machine (YCM)

A pure symbolic AI kernel based on category theory, implementing the Yoneda lemma as the core design principle: **"existence is being related."**

[中文文档 (Chinese Documentation)](README_zh.md)

## Table of Contents

- [Philosophy & Design Principles](#philosophy--design-principles)
- [Overview](#overview)
- [Installation](#installation)
- [Quick Start](#quick-start)
- [Core Concepts](#core-concepts)
  - [Nodes](#nodes)
  - [Edges](#edges)
  - [Pattern Matching](#pattern-matching)
- [Components](#components)
  - [Graph Store & Indexer](#graph-store--indexer)
  - [Rule Engine](#rule-engine)
  - [Latent Variable Manager](#latent-variable-manager)
  - [Context Slicer](#context-slicer)
  - [Query Processor](#query-processor)
  - [Yoneda Abstractor](#yoneda-abstractor)
- [API Reference](#api-reference)
- [Examples](#examples)
- [Design Decisions](#design-decisions)
- [License](#license)

---

## Philosophy & Design Principles

YCM is rooted in the Yoneda lemma from category theory, which implies that an object is determined, up to isomorphism, by its relationships with all other objects. This leads to our core philosophy:

1. **Objects have no internal structure** - All meaning comes from edges connecting nodes
2. **Knowledge grows monotonically** - Only additions, no deletions or modifications
3. **Unified representation** - Facts, rules, inference steps, contexts, and latent variables all live in the same graph
4. **Context as first-class citizen** - Context-dependent attributes are encapsulated through event nodes, solving the frame problem
5. **Latent variables handle uncertainty** - Unknown entities are represented as Skolem nodes, maintaining logical determinism
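
The first two principles can be illustrated without YCM itself. The following is a minimal sketch, assuming a hypothetical `AppendOnlyGraph` class (not part of the package): a node carries no data of its own, so everything known about it is the set of edges touching it, and that set can only grow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One directed, typed edge; frozen so it cannot be modified after creation."""
    source: str
    type: str
    target: str


class AppendOnlyGraph:
    """Hypothetical stand-in for a monotonic store: edges can only be added."""

    def __init__(self) -> None:
        self._edges: set[Triple] = set()

    def add(self, source: str, type_: str, target: str) -> None:
        self._edges.add(Triple(source, type_, target))

    def relations_of(self, node: str) -> frozenset:
        """Everything the graph knows about a node: its incoming and outgoing edges."""
        out = {("out", e.type, e.target) for e in self._edges if e.source == node}
        inc = {("in", e.type, e.source) for e in self._edges if e.target == node}
        return frozenset(out | inc)


g = AppendOnlyGraph()
g.add("Tom", "isa", "Cat")
before = g.relations_of("Cat")
g.add("Cat", "isa", "Mammal")
after = g.relations_of("Cat")
# `before` is a proper subset of `after`: knowledge about "Cat" only grew.
```

There is deliberately no `remove` method in the sketch; that omission is what makes the structure monotonic.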

---

## Overview

YCM is a monotonically growing directed multigraph with six core components:

| Component | Purpose |
|-----------|---------|
| **Graph Store & Indexer** | Central storage with multi-index pattern matching |
| **Rule Engine** | Forward-chaining inference to fixpoint |
| **Latent Variable Manager** | Skolem node creation and equivalence relations |
| **Context Slicer** | Context-scoped subgraph computation |
| **Query Processor** | Pattern matching with proof path extraction |
| **Yoneda Abstractor** | Automatic concept abstraction via structural isomorphism |

---

## Installation

### Requirements

- Python 3.10+
- [uv](https://docs.astral.sh/uv/) (recommended package manager)

### Setup

```bash
# Clone the repository
git clone https://github.com/your-repo/yoneda-cognitive-machine.git
cd yoneda-cognitive-machine

# Install dependencies
uv sync --dev

# Run tests
uv run pytest tests/ -v

# Format and lint
uv run ruff format src/ tests/
uv run ruff check src/ tests/
```

---

## Quick Start

### Basic Usage

```python
from yoneda_cognitive_machine import YCM, Node, Edge, SYSTEM_EDGE_TYPES
from yoneda_cognitive_machine.core.pattern import PatternGraph, EdgeSpec

# Initialize the system
ycm = YCM()
ycm.initialize()

# Add facts: Tom isa Cat, Cat isa Mammal
nodes = [Node(id="Tom"), Node(id="Cat"), Node(id="Mammal")]
ycm.add_facts(nodes, [])
isa_type = SYSTEM_EDGE_TYPES["isa"]
ycm.create_edge("Tom", "Cat", isa_type)
ycm.create_edge("Cat", "Mammal", isa_type)

# Query: find all X isa Y
pattern = PatternGraph(
    variables={"X", "Y"},
    edge_specs=[EdgeSpec(source_var="?X", target_var="?Y", type_node=isa_type)]
)
result = ycm.query(pattern)
print(f"Found {len(result.bindings)} matches")
for binding in result.bindings:
    print(f"  {binding['X']} isa {binding['Y']}")
```

### With Rules

```python
from yoneda_cognitive_machine import NodeKind

# Create a transitive isa rule
rule_node = Node(id="rule_trans", kind=NodeKind.RULE)
pattern_root = Node(id="pat_trans", kind=NodeKind.PATTERN)
action_root = Node(id="act_trans", kind=NodeKind.ACTION)
var_x = Node(id="vx", kind=NodeKind.VAR)
var_y = Node(id="vy", kind=NodeKind.VAR)
var_z = Node(id="vz", kind=NodeKind.VAR)

ycm.add_facts([rule_node, pattern_root, action_root, var_x, var_y, var_z], [])

# Connect rule structure
ycm.create_edge("rule_trans", "pat_trans", SYSTEM_EDGE_TYPES["hasPattern"])
ycm.create_edge("rule_trans", "act_trans", SYSTEM_EDGE_TYPES["hasAction"])

# Add variables to pattern
ycm.create_edge("pat_trans", "vx", SYSTEM_EDGE_TYPES["hasVar"])
ycm.create_edge("pat_trans", "vy", SYSTEM_EDGE_TYPES["hasVar"])
ycm.create_edge("pat_trans", "vz", SYSTEM_EDGE_TYPES["hasVar"])

# Create edge specifications for pattern: X isa Y, Y isa Z
spec1 = Node(id="spec1", kind=NodeKind.EDGESPEC)
spec2 = Node(id="spec2", kind=NodeKind.EDGESPEC)
ycm.add_facts([spec1, spec2], [])

ycm.create_edge("pat_trans", "spec1", SYSTEM_EDGE_TYPES["requiresEdge"])
ycm.create_edge("pat_trans", "spec2", SYSTEM_EDGE_TYPES["requiresEdge"])

# Define spec1: vx --[isa]--> vy
ycm.create_edge("spec1", "vx", SYSTEM_EDGE_TYPES["src"])
ycm.create_edge("spec1", "vy", SYSTEM_EDGE_TYPES["tgt"])
ycm.create_edge("spec1", isa_type, SYSTEM_EDGE_TYPES["edgeType"])

# Define spec2: vy --[isa]--> vz
ycm.create_edge("spec2", "vy", SYSTEM_EDGE_TYPES["src"])
ycm.create_edge("spec2", "vz", SYSTEM_EDGE_TYPES["tgt"])
ycm.create_edge("spec2", isa_type, SYSTEM_EDGE_TYPES["edgeType"])

# Create action specification: X isa Z
action_spec = Node(id="action_spec", kind=NodeKind.EDGESPEC)
ycm.add_facts([action_spec], [])

ycm.create_edge("act_trans", "action_spec", SYSTEM_EDGE_TYPES["createsEdge"])

# Define action_spec: vx --[isa]--> vz
ycm.create_edge("action_spec", "vx", SYSTEM_EDGE_TYPES["src"])
ycm.create_edge("action_spec", "vz", SYSTEM_EDGE_TYPES["tgt"])
ycm.create_edge("action_spec", isa_type, SYSTEM_EDGE_TYPES["edgeType"])

# Run inference
ycm.run_closure()

# Now Tom isa Mammal is inferred
assert ycm.store.has_edge("Tom", "Mammal", isa_type)
```

---

## Core Concepts

### Nodes

Nodes are minimal entities with only an ID and an optional kind label:

```python
Node(id="unique_id", kind=None)  # kind is optional metadata
```

**System-reserved kinds:**
- `Primitive` - Regular concept nodes
- `Rule` - Rule definition nodes
- `Skolem` - Existential variable instances
- `Event` - Context/situation nodes
- `Pattern` - Rule pattern graph root
- `Action` - Rule action graph root
- `Var` - Variable nodes in rules
- `EdgeSpec` - Edge specification nodes
- `RuleApplication` - Rule execution instance

### Edges

Edges are typed, directed connections with optional justification:

```python
Edge(
    id="edge_id",
    source="source_node_id",
    target="target_node_id",
    type="edge_type_node_id",
    justification=None  # Optional: rule application that created this edge
)
```

**Key properties:**
- Immutable once created
- Type must reference an existing node (e.g., `sys:isa`)
- Justification enables proof path tracing
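
The first two properties can be sketched with the standard library alone. This is an illustration under assumptions, not YCM's `Edge` class: `SketchEdge` and `add_edge` are hypothetical names, and the check shown covers only the edge type.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchEdge:
    """Hypothetical edge record mirroring the fields listed above."""
    id: str
    source: str
    target: str
    type: str
    justification: Optional[str] = None


def add_edge(node_ids: set, edges: dict, edge: SketchEdge) -> None:
    """Store an edge only if its type refers to a known node."""
    if edge.type not in node_ids:
        raise KeyError(f"edge type {edge.type!r} is not an existing node")
    if edge.id in edges:
        raise ValueError(f"edge {edge.id!r} already exists")
    edges[edge.id] = edge


node_ids = {"Tom", "Cat", "sys:isa"}
edges = {}
edge = SketchEdge(id="e1", source="Tom", target="Cat", type="sys:isa")
add_edge(node_ids, edges, edge)

try:
    edge.target = "Dog"  # frozen dataclasses reject attribute assignment
except FrozenInstanceError:
    print("edges are immutable")
```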

### Pattern Matching

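Patterns describe subgraphs to match. In the examples below, variables are declared by bare name in `variables` and referenced with a `?` prefix in edge specs; any other string is treated as a fixed node ID: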

```python
# Pattern: X isa Y
PatternGraph(
    variables={"X", "Y"},
    edge_specs=[
        EdgeSpec(source_var="?X", target_var="?Y", type_node=isa_type)
    ]
)

# Pattern with fixed node: Tom isa Y
PatternGraph(
    variables={"Y"},
    edge_specs=[
        EdgeSpec(source_var="Tom", target_var="?Y", type_node=isa_type)
    ]
)

# Multi-constraint pattern: X isa Y AND Y isa Z
PatternGraph(
    variables={"X", "Y", "Z"},
    edge_specs=[
        EdgeSpec(source_var="?X", target_var="?Y", type_node=isa_type),
        EdgeSpec(source_var="?Y", target_var="?Z", type_node=isa_type),
    ]
)
```
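
To make the matching semantics concrete, here is a minimal backtracking matcher over plain `(source, type, target)` tuples. It is a sketch of the general technique, not YCM's matcher: the `unify` and `match` helpers are hypothetical, and the real store presumably uses its indices rather than a linear scan.

```python
from collections.abc import Iterator
from typing import Optional


def unify(term: str, value: str, binding: dict) -> Optional[dict]:
    """Bind a '?var' term to a value, or check a fixed term for equality."""
    if term.startswith("?"):
        name = term[1:]
        if name in binding:
            return binding if binding[name] == value else None
        return {**binding, name: value}
    return binding if term == value else None


def match(specs: list, edges: list, binding: Optional[dict] = None) -> Iterator[dict]:
    """Yield every variable binding that satisfies all edge specs."""
    binding = {} if binding is None else binding
    if not specs:
        yield binding
        return
    (src, typ, tgt), rest = specs[0], specs[1:]
    for e_src, e_typ, e_tgt in edges:
        if e_typ != typ:
            continue
        bound = unify(src, e_src, binding)
        if bound is None:
            continue
        bound = unify(tgt, e_tgt, bound)
        if bound is None:
            continue
        yield from match(rest, edges, bound)


edges = [("Tom", "isa", "Cat"), ("Cat", "isa", "Mammal")]
chain = list(match([("?X", "isa", "?Y"), ("?Y", "isa", "?Z")], edges))
# chain == [{"X": "Tom", "Y": "Cat", "Z": "Mammal"}]
```

The shared variable `?Y` is what joins the two constraints: the second spec can only match edges whose source equals the value already bound by the first.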

---

## Components

### Graph Store & Indexer

The central storage with optimized indices for pattern matching.

```python
store = ycm.store

# Node operations
store.add_nodes([node1, node2])
store.get_node("node_id")
store.query_nodes(["id1", "id2"])
store.all_nodes()

# Edge operations
store.add_edges([edge1, edge2])
store.get_outgoing("node_id", edge_type=None)
store.get_incoming("node_id", edge_type=None)
store.has_edge("source", "target", type_node)

# Pattern matching
for binding in store.match_pattern(pattern, context_slice=None):
    print(binding)
```

**Indices:**
- `source_index`: source → edges
- `target_index`: target → edges
- `type_index`: type → edges
- `source_type_index`: (source, type) → edges
- `target_type_index`: (target, type) → edges
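
The five indices can be pictured as dictionaries that are all updated on every insert. The sketch below assumes a hypothetical `IndexedEdges` class with edges as plain tuples; the attribute names follow the list above, but the actual `GraphStore` internals may differ.

```python
from collections import defaultdict


class IndexedEdges:
    """Hypothetical multi-index edge container (illustration only)."""

    def __init__(self) -> None:
        self.source_index = defaultdict(list)
        self.target_index = defaultdict(list)
        self.type_index = defaultdict(list)
        self.source_type_index = defaultdict(list)
        self.target_type_index = defaultdict(list)

    def add(self, source: str, type_: str, target: str) -> None:
        edge = (source, type_, target)
        # Each insert updates every index so that lookups never scan all edges.
        self.source_index[source].append(edge)
        self.target_index[target].append(edge)
        self.type_index[type_].append(edge)
        self.source_type_index[(source, type_)].append(edge)
        self.target_type_index[(target, type_)].append(edge)

    def outgoing(self, source: str, type_=None) -> list:
        """Pick the most specific index available for the lookup."""
        if type_ is None:
            return list(self.source_index.get(source, []))
        return list(self.source_type_index.get((source, type_), []))


idx = IndexedEdges()
idx.add("Tom", "sys:isa", "Cat")
idx.add("Tom", "sys:hasPart", "Tail")
idx.add("Cat", "sys:isa", "Mammal")
tom_isa = idx.outgoing("Tom", "sys:isa")
```

The trade-off is the usual one: inserts cost a constant number of extra writes, which suits a store where edges are added once and never removed.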

### Rule Engine

Forward-chaining inference to fixpoint.

**Rule structure:**
```
Rule --[hasPattern]--> PatternRoot
Rule --[hasAction]--> ActionRoot

PatternRoot --[hasVar]--> VarNode
PatternRoot --[requiresEdge]--> EdgeSpec

EdgeSpec --[src]--> VarNode/FixedNode
EdgeSpec --[tgt]--> VarNode/FixedNode
EdgeSpec --[edgeType]--> TypeNode

ActionRoot --[createsEdge]--> EdgeSpec
ActionRoot --[hasVar]--> VarNode (existential if not in pattern)
```

```python
# Run all rules until fixpoint
ycm.run_closure()

# Get all rules
rules = ycm.rule_engine.get_all_rules()
```

**Key features:**
- Monotonic execution (rules only add edges, so earlier conclusions are never retracted)
- Iteration until fixpoint
- Existential variables generate Skolem nodes
- Justification tracking for proof paths
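
"Iteration until fixpoint" means rules are reapplied until a full pass derives nothing new. The sketch below shows that loop on plain tuples; `isa_transitivity` and `run_closure` here are hypothetical stand-ins written for this illustration, with rules as ordinary Python functions rather than graph-encoded rule nodes.

```python
def isa_transitivity(edges: set) -> set:
    """X isa Y, Y isa Z => X isa Z. Returns every edge the rule can derive."""
    return {
        (x, "isa", z)
        for (x, t1, y1) in edges
        for (y2, t2, z) in edges
        if t1 == t2 == "isa" and y1 == y2
    }


def run_closure(edges: set, rules: list) -> tuple:
    """Apply all rules repeatedly until no rule produces a new edge."""
    known = set(edges)
    rounds = 0
    while True:
        derived = set().union(*(rule(known) for rule in rules))
        fresh = derived - known
        if not fresh:
            return known, rounds  # fixpoint reached
        known |= fresh  # monotonic: edges are only ever added
        rounds += 1


facts = {
    ("Tom", "isa", "Cat"),
    ("Cat", "isa", "Mammal"),
    ("Mammal", "isa", "Animal"),
}
closure, rounds = run_closure(facts, [isa_transitivity])
```

Because the edge set only grows and, without existential rules, the set of possible edges over a fixed node set is finite, this loop is guaranteed to terminate.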

### Latent Variable Manager

Manages Skolem nodes and equivalence relations.

```python
latent = ycm.latent_manager

# Create Skolem node
skolem_id = latent.create_skolem("rule_app_id")

# Check if node is Skolem
latent.is_skolem(node_id)

# Get the origins of a Skolem node
origins = latent.get_skolem_origins(skolem_id)

# Declare equivalence
latent.declare_sameAs("node_a", "node_b")

# Get all Skolems
skolems = latent.get_all_skolems()
```

**Skolem nodes** represent entities that are known to exist but are otherwise unidentified. They are introduced by existential variables in rule actions.
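
One common way to keep existential rules from generating endless fresh nodes is to key each Skolem on the rule and the binding that produced it. The sketch below assumes that strategy; `SkolemFactory` and `shadow_rule` are hypothetical, and whether YCM deduplicates Skolems this way is not stated in this document.

```python
class SkolemFactory:
    """Hypothetical factory: one Skolem per (rule, binding) pair."""

    def __init__(self) -> None:
        self._by_key = {}
        self.origins = {}  # skolem id -> (rule id, binding) that created it

    def skolem_for(self, rule_id: str, binding: dict) -> str:
        key = (rule_id, tuple(sorted(binding.items())))
        if key not in self._by_key:
            skolem_id = f"skolem:{len(self._by_key)}"
            self._by_key[key] = skolem_id
            self.origins[skolem_id] = key
        return self._by_key[key]


def shadow_rule(edges: set, factory: SkolemFactory) -> set:
    """If X isa Shadow, then some W exists with W isa OpaqueObject."""
    new_edges = set()
    for (x, edge_type, target) in sorted(edges):
        if edge_type == "isa" and target == "Shadow":
            w = factory.skolem_for("rule_shadow", {"X": x})
            new_edges.add((w, "isa", "OpaqueObject"))
    return new_edges


factory = SkolemFactory()
facts = {("Shadow1", "isa", "Shadow"), ("Shadow2", "isa", "Shadow")}
first = shadow_rule(facts, factory)
second = shadow_rule(facts | first, factory)
# Re-running the rule reuses the same Skolems instead of minting new ones.
```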

### Context Slicer

Computes context-scoped subgraphs to isolate context-dependent attributes.

```python
slicer = ycm.context_slicer

# Compute context slice
slice_nodes = slicer.compute_context_slice("PetShop")

# Query in specific context
result = ycm.query(pattern, context_id="PetShop")
```

**Context slice includes:**
1. The context node itself
2. Events connected to the context (E --[context]--> K)
3. Nodes connected through those events
4. Definitional relations (isa) of included nodes

**This solves the frame problem:** "Cat's price at PetShop" is only visible in PetShop context.
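
The four inclusion steps can be sketched as a small set computation. This is an illustration under assumptions, not `ContextSlicer` itself: it treats "connected through those events" as the targets of an event's outgoing edges, and it follows `isa` edges upward transitively.

```python
def compute_context_slice(edges: set, context_id: str) -> set:
    """Return the node IDs visible in a context (hypothetical helper)."""
    # 1. The context node itself.
    nodes = {context_id}
    # 2. Events attached to the context: E --[context]--> K.
    events = {s for (s, t, d) in edges if t == "context" and d == context_id}
    nodes |= events
    # 3. Nodes reached from those events.
    nodes |= {d for (s, t, d) in edges if s in events}
    # 4. Definitional (isa) parents of everything included so far.
    frontier = set(nodes)
    while frontier:
        parents = {d for (s, t, d) in edges if t == "isa" and s in frontier}
        frontier = parents - nodes
        nodes |= frontier
    return nodes


edges = {
    ("E1", "context", "PetShop"),
    ("E1", "target", "Cat"),
    ("E1", "hasProperty", "500Dollars"),
    ("E2", "context", "Vet"),
    ("E2", "target", "Cat"),
    ("Cat", "isa", "Mammal"),
}
petshop = compute_context_slice(edges, "PetShop")
vet = compute_context_slice(edges, "Vet")
```

Both slices contain `Cat` and `Mammal`, but only the PetShop slice contains `500Dollars`, because the price hangs off event `E1` rather than off `Cat` directly.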

### Query Processor

Pattern matching with metadata.

```python
result = ycm.query(pattern, context_id=None, include_proof=False)

# Result structure
result.bindings               # list[dict[str, str]]
result.has_skolem_dependency  # bool
result.proof_paths            # list[list[str]] | None
```

**Proof paths** trace justification chains back to source facts and rules.
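
Tracing a justification chain is a recursive walk from a derived edge back to the facts it rests on. The sketch below assumes two hypothetical lookup tables (edge to rule application, and rule application to rule plus premises); the shape of YCM's actual proof paths may differ.

```python
def proof_path(edge_id: str, justification: dict, applications: dict) -> list:
    """Flatten the derivation of an edge into an ordered list of steps."""
    app_id = justification.get(edge_id)
    if app_id is None:
        return [edge_id]  # a source fact: nothing to unfold
    rule_id, premises = applications[app_id]
    path = []
    for premise in premises:
        path.extend(proof_path(premise, justification, applications))
    path.append(f"{rule_id}@{app_id}")
    path.append(edge_id)
    return path


# e1, e2, e4 are asserted facts; e3 and e5 were derived by rule applications.
justification = {"e3": "app1", "e5": "app2"}
applications = {
    "app1": ("rule_trans", ["e1", "e2"]),
    "app2": ("rule_trans", ["e3", "e4"]),
}
path = proof_path("e5", justification, applications)
```

Here `path` lists the premises of each step before the rule application that consumed them, so it can be read top to bottom as a derivation.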

### Yoneda Abstractor

Automatic concept abstraction via structural isomorphism.

```python
# Run abstraction cycle
ycm.run_abstraction()

# Check profiles
profile = ycm.abstractor.compute_outgoing_profile("node_id")
```

**Abstraction logic:**
1. Compute incoming/outgoing profiles for each node
2. Detect pairs with isomorphic structure (same type signatures)
3. Create abstract parent node if no common parent exists
4. Add common relations to the abstract node
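
The steps above can be sketched on plain tuples. This is a simplified illustration, not `YonedaAbstractor`: it compares only outgoing edges, treats two nodes as structurally alike when their `(type, target)` pairs are identical, and uses a hypothetical `abs:` naming scheme for the new parent.

```python
from collections import defaultdict


def outgoing_profile(edges: set, node: str) -> frozenset:
    """A node's outgoing (type, target) pairs, ignoring isa edges."""
    return frozenset((t, d) for (s, t, d) in edges if s == node and t != "isa")


def abstract(edges: set) -> tuple:
    """Group nodes by profile and give each group a shared abstract parent."""
    edges = set(edges)
    groups = defaultdict(list)
    for node in sorted({s for (s, _, _) in edges}):
        profile = outgoing_profile(edges, node)
        if profile:
            groups[profile].append(node)
    created = []
    for profile, members in groups.items():
        if len(members) < 2:
            continue
        parents = [{d for (s, t, d) in edges if s == m and t == "isa"} for m in members]
        if set.intersection(*parents):
            continue  # a common parent already exists
        abstract_id = "abs:" + "+".join(members)
        for member in members:
            edges.add((member, "isa", abstract_id))
        for edge_type, target in profile:
            edges.add((abstract_id, edge_type, target))  # lift shared relations
        created.append(abstract_id)
    return edges, created


facts = {
    ("Fire", "emits", "Hot"),
    ("Fire", "emits", "LightSource"),
    ("Sun", "emits", "Hot"),
    ("Sun", "emits", "LightSource"),
    ("Ice", "emits", "Cold"),
}
result, created = abstract(facts)
```

`Ice` has a different profile, so it is left alone, while `Fire` and `Sun` gain a shared parent that inherits their common relations.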

---

## API Reference

### YCM Class

```python
class YCM:
    # Initialization
    def __init__(self) -> None: ...
    def initialize(self) -> None: ...

    # Adding facts
    def add_facts(self, nodes: list[Node], edges: list[Edge]) -> None: ...
    def add_node(self, node: Node) -> None: ...
    def add_edge(self, edge: Edge) -> None: ...
    def add_edges(self, edges: list[Edge]) -> None: ...
    def create_node(self, kind: str | None = None, id: str | None = None) -> Node: ...
    def create_edge(self, source: str, target: str, type_node: str,
                    justification: str | None = None) -> Edge: ...

    # Inference
    def run_closure(self) -> None: ...
    def run_abstraction(self) -> None: ...

    # Query
    def query(self, pattern: PatternGraph, context_id: str | None = None,
              include_proof: bool = False) -> QueryResult: ...

    # Access
    def get_node(self, node_id: str) -> Node | None: ...
    def get_edge(self, edge_id: str) -> Edge | None: ...
    def get_outgoing(self, node_id: str, edge_type: str | None = None) -> list[Edge]: ...
    def get_incoming(self, node_id: str, edge_type: str | None = None) -> list[Edge]: ...
    def all_nodes(self) -> list[Node]: ...
    def all_edges(self) -> list[Edge]: ...

    # Component accessors
    @property
    def store(self) -> GraphStore: ...
    @property
    def rule_engine(self) -> RuleEngine: ...
    @property
    def latent_manager(self) -> LatentVariableManager: ...
    @property
    def context_slicer(self) -> ContextSlicer: ...
    @property
    def query_processor(self) -> QueryProcessor: ...
    @property
    def abstractor(self) -> YonedaAbstractor: ...

### System Edge Types

```python
SYSTEM_EDGE_TYPES = {
    "isa": "sys:isa",           # Type hierarchy
    "hasPart": "sys:hasPart",
    "context": "sys:context",   # Event-to-context
    "target": "sys:target",     # Event-to-target
    "causes": "sys:causes",
    "hasProperty": "sys:hasProperty",
    "sameAs": "sys:sameAs",     # Equivalence
    # Rule structure types
    "hasPattern": "sys:hasPattern",
    "hasAction": "sys:hasAction",
    "hasVar": "sys:hasVar",
    "requiresEdge": "sys:requiresEdge",
    "createsEdge": "sys:createsEdge",
    "src": "sys:src",
    "tgt": "sys:tgt",
    "edgeType": "sys:edgeType",
    "appliedRule": "sys:appliedRule",
}
```

---

## Examples

### Example 1: Transitive Inference

```python
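from yoneda_cognitive_machine import YCM, Node, NodeKind, SYSTEM_EDGE_TYPES

# Fresh instance so the example runs on its own
ycm = YCM()
ycm.initialize()
isa_type = SYSTEM_EDGE_TYPES["isa"]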

# Create nodes
ycm.create_node(id="Tom")
ycm.create_node(id="Cat")
ycm.create_node(id="Mammal")

# Facts: Tom isa Cat, Cat isa Mammal
ycm.create_edge("Tom", "Cat", isa_type)
ycm.create_edge("Cat", "Mammal", isa_type)

# Create transitive rule: X isa Y, Y isa Z => X isa Z
# Rule nodes
rule_node = Node(id="trans_rule", kind=NodeKind.RULE)
pattern_root = Node(id="trans_pat", kind=NodeKind.PATTERN)
action_root = Node(id="trans_act", kind=NodeKind.ACTION)
var_x = Node(id="tx", kind=NodeKind.VAR)
var_y = Node(id="ty", kind=NodeKind.VAR)
var_z = Node(id="tz", kind=NodeKind.VAR)
spec1 = Node(id="ts1", kind=NodeKind.EDGESPEC)
spec2 = Node(id="ts2", kind=NodeKind.EDGESPEC)
action_spec = Node(id="tas", kind=NodeKind.EDGESPEC)

ycm.add_facts([rule_node, pattern_root, action_root, var_x, var_y, var_z, spec1, spec2, action_spec], [])

# Connect rule structure
ycm.create_edge("trans_rule", "trans_pat", SYSTEM_EDGE_TYPES["hasPattern"])
ycm.create_edge("trans_rule", "trans_act", SYSTEM_EDGE_TYPES["hasAction"])
ycm.create_edge("trans_pat", "tx", SYSTEM_EDGE_TYPES["hasVar"])
ycm.create_edge("trans_pat", "ty", SYSTEM_EDGE_TYPES["hasVar"])
ycm.create_edge("trans_pat", "tz", SYSTEM_EDGE_TYPES["hasVar"])
ycm.create_edge("trans_pat", "ts1", SYSTEM_EDGE_TYPES["requiresEdge"])
ycm.create_edge("trans_pat", "ts2", SYSTEM_EDGE_TYPES["requiresEdge"])

# Pattern: tx --[isa]--> ty, ty --[isa]--> tz
ycm.create_edge("ts1", "tx", SYSTEM_EDGE_TYPES["src"])
ycm.create_edge("ts1", "ty", SYSTEM_EDGE_TYPES["tgt"])
ycm.create_edge("ts1", isa_type, SYSTEM_EDGE_TYPES["edgeType"])
ycm.create_edge("ts2", "ty", SYSTEM_EDGE_TYPES["src"])
ycm.create_edge("ts2", "tz", SYSTEM_EDGE_TYPES["tgt"])
ycm.create_edge("ts2", isa_type, SYSTEM_EDGE_TYPES["edgeType"])

# Action: tx --[isa]--> tz
ycm.create_edge("trans_act", "tas", SYSTEM_EDGE_TYPES["createsEdge"])
ycm.create_edge("tas", "tx", SYSTEM_EDGE_TYPES["src"])
ycm.create_edge("tas", "tz", SYSTEM_EDGE_TYPES["tgt"])
ycm.create_edge("tas", isa_type, SYSTEM_EDGE_TYPES["edgeType"])

ycm.run_closure()

# Result: Tom isa Mammal is inferred
assert ycm.store.has_edge("Tom", "Mammal", isa_type)
```

### Example 2: Context-Dependent Price

```python
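from yoneda_cognitive_machine import YCM, Node, NodeKind, SYSTEM_EDGE_TYPES
from yoneda_cognitive_machine.core.pattern import PatternGraph, EdgeSpec

# Fresh instance so the example runs on its own
ycm = YCM()
ycm.initialize()

# Create nodes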
ycm.create_node(id="Cat")
ycm.create_node(id="500Dollars")

# PetShop context
ycm.create_node(id="PetShop")
event1 = Node(id="E1", kind=NodeKind.EVENT)
ycm.add_node(event1)
ycm.create_edge("E1", "PetShop", SYSTEM_EDGE_TYPES["context"])
ycm.create_edge("E1", "Cat", SYSTEM_EDGE_TYPES["target"])
ycm.create_edge("E1", "500Dollars", SYSTEM_EDGE_TYPES["hasProperty"])

# Vet context (no price)
ycm.create_node(id="Vet")
event2 = Node(id="E2", kind=NodeKind.EVENT)
ycm.add_node(event2)
ycm.create_edge("E2", "Vet", SYSTEM_EDGE_TYPES["context"])
ycm.create_edge("E2", "Cat", SYSTEM_EDGE_TYPES["target"])

# Query price
pattern = PatternGraph(
    variables={"Price"},
    edge_specs=[
        EdgeSpec(source_var="E1", target_var="?Price", type_node=SYSTEM_EDGE_TYPES["hasProperty"]),
    ]
)

# PetShop context: finds 500Dollars
result_petshop = ycm.query(pattern, context_id="PetShop")

# Vet context: no results
result_vet = ycm.query(pattern, context_id="Vet")
```

### Example 3: Existential Variables (Skolem)

```python
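from yoneda_cognitive_machine import YCM, Node, NodeKind, SYSTEM_EDGE_TYPES

# Fresh instance so the example runs on its own
ycm = YCM()
ycm.initialize()
isa_type = SYSTEM_EDGE_TYPES["isa"]

# Create concepts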
ycm.create_node(id="Shadow")
ycm.create_node(id="OpaqueObject")
ycm.create_node(id="Shadow1")

# Rule: If X isa Shadow, create W (existential) such that W isa OpaqueObject
rule_node = Node(id="rule_shadow", kind=NodeKind.RULE)
pattern_root = Node(id="pat_shadow", kind=NodeKind.PATTERN)
action_root = Node(id="act_shadow", kind=NodeKind.ACTION)
var_x = Node(id="sx", kind=NodeKind.VAR)
var_w = Node(id="sw", kind=NodeKind.VAR)  # Existential (not in pattern)
spec_pat = Node(id="spec_ps", kind=NodeKind.EDGESPEC)
spec_act = Node(id="spec_as", kind=NodeKind.EDGESPEC)

ycm.add_facts([rule_node, pattern_root, action_root, var_x, var_w, spec_pat, spec_act], [])

# Connect rule structure
ycm.create_edge("rule_shadow", "pat_shadow", SYSTEM_EDGE_TYPES["hasPattern"])
ycm.create_edge("rule_shadow", "act_shadow", SYSTEM_EDGE_TYPES["hasAction"])
ycm.create_edge("pat_shadow", "sx", SYSTEM_EDGE_TYPES["hasVar"])
ycm.create_edge("pat_shadow", "spec_ps", SYSTEM_EDGE_TYPES["requiresEdge"])
ycm.create_edge("spec_ps", "sx", SYSTEM_EDGE_TYPES["src"])
ycm.create_edge("spec_ps", "Shadow", SYSTEM_EDGE_TYPES["tgt"])
ycm.create_edge("spec_ps", isa_type, SYSTEM_EDGE_TYPES["edgeType"])

# Action: W isa OpaqueObject (W is existential - only in action)
ycm.create_edge("act_shadow", "sw", SYSTEM_EDGE_TYPES["hasVar"])
ycm.create_edge("act_shadow", "spec_as", SYSTEM_EDGE_TYPES["createsEdge"])
ycm.create_edge("spec_as", "sw", SYSTEM_EDGE_TYPES["src"])
ycm.create_edge("spec_as", "OpaqueObject", SYSTEM_EDGE_TYPES["tgt"])
ycm.create_edge("spec_as", isa_type, SYSTEM_EDGE_TYPES["edgeType"])

# Fact: Shadow1 isa Shadow
ycm.create_edge("Shadow1", "Shadow", isa_type)

ycm.run_closure()

# Check Skolem nodes created
skolems = ycm.latent_manager.get_all_skolems()
# Each Shadow instance generates a unique Skolem representing "some opaque object casting it"
```

### Example 4: Automatic Abstraction

```python
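from yoneda_cognitive_machine import YCM

# Fresh instance so the example runs on its own
ycm = YCM()
ycm.initialize()

# Create nodes
ycm.create_node(id="Fire")
ycm.create_node(id="Sun")
ycm.create_node(id="Hot")
ycm.create_node(id="LightSource")

# Edge types must reference an existing node, so create the custom type first
ycm.create_node(id="type_emits")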

# Two nodes with identical relations
ycm.create_edge("Fire", "Hot", "type_emits")
ycm.create_edge("Fire", "LightSource", "type_emits")
ycm.create_edge("Sun", "Hot", "type_emits")
ycm.create_edge("Sun", "LightSource", "type_emits")

ycm.run_abstraction()

# Result: Abstract node created, Fire/Sun both have isa edge to it
fire_parents = ycm.context_slicer.get_isa_parents("Fire")
sun_parents = ycm.context_slicer.get_isa_parents("Sun")
common = fire_parents.intersection(sun_parents)  # Non-empty
```

---

## Design Decisions

### Why no negation in patterns?

Negation would break monotonicity. If a rule requires "X does NOT have edge to Y", adding such an edge later would invalidate previous conclusions. YCM embraces the open-world assumption: failure to match means "unknown", not "false".
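
The effect is easy to reproduce outside YCM. In the sketch below, facts are plain `(predicate, subject)` pairs and both rule functions are hypothetical; the point is only that a rule with a negated condition can lose conclusions when facts are added, while a purely positive rule cannot.

```python
def flies_positive(facts: set) -> set:
    """bird(X) => canFly(X). Uses positive conditions only."""
    return {("canFly", x) for (pred, x) in facts if pred == "bird"}


def flies_with_negation(facts: set) -> set:
    """bird(X) and NOT penguin(X) => canFly(X)."""
    return {
        ("canFly", x)
        for (pred, x) in facts
        if pred == "bird" and ("penguin", x) not in facts
    }


small = {("bird", "Tweety")}
large = small | {("penguin", "Tweety")}

# Monotonic means: more facts in, at least the same conclusions out.
positive_is_monotone = flies_positive(small) <= flies_positive(large)
negation_is_monotone = flies_with_negation(small) <= flies_with_negation(large)
```

With negation, the conclusion `canFly(Tweety)` drawn from `small` disappears once `penguin(Tweety)` is added, which is exactly the retraction an append-only graph cannot express.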

### Why no probabilities/weights?

Probabilistic reasoning introduces complexity and breaks the clean logical semantics. YCM handles uncertainty through Skolem nodes - explicit unknown entities that maintain determinism.

### Why event nodes for context?

Direct attributes on concepts cause the frame problem: "Cat's price" would be visible in all contexts. Event nodes (E --[context]--> PetShop, E --[target]--> Cat, E --[hasProperty]--> 500Dollars) encapsulate context-dependent relationships.

### Why justification tracking?

Every derived edge points to the rule application that created it, enabling complete proof reconstruction and debugging.

---

## License

AGPL-3.0-or-later

This project is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.