Metadata-Version: 2.4
Name: mcard
Version: 0.1.59
Summary: MCard: Local-first Content Addressable Storage with Content Type Detection
Author-email: Ben Koo <koo0905@gmail.com>
Project-URL: Homepage, https://github.com/xlp0/MCard_TDD
Project-URL: Source, https://github.com/xlp0/MCard_TDD
Project-URL: Tracker, https://github.com/xlp0/MCard_TDD/issues
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: python-dateutil>=2.9.0.post0
Requires-Dist: SQLAlchemy>=1.4.47
Requires-Dist: aiosqlite>=0.17.0
Requires-Dist: aiohttp>=3.9.0
Requires-Dist: python-dotenv>=1.1.0
Requires-Dist: chardet>=5.1.0
Requires-Dist: PyYAML>=6.0.0
Requires-Dist: wasmtime>=39.0.0
Requires-Dist: websockets>=12.0
Requires-Dist: result>=0.17.0
Provides-Extra: xml
Requires-Dist: lxml>=4.9.0; extra == "xml"
Provides-Extra: duckdb
Requires-Dist: duckdb>=1.0.0; extra == "duckdb"
Provides-Extra: qnlp
Requires-Dist: lambeq>=0.4.2; extra == "qnlp"
Requires-Dist: torch>=2.0.0; extra == "qnlp"
Provides-Extra: observability
Requires-Dist: opentelemetry-sdk>=1.20.0; extra == "observability"
Requires-Dist: opentelemetry-exporter-otlp>=1.20.0; extra == "observability"
Requires-Dist: opentelemetry-instrumentation>=0.41b0; extra == "observability"
Dynamic: license-file

<p align="center">
  <a href="https://www.python.org/"><img src="https://img.shields.io/badge/python-3.10%2B-blue" alt="Python 3.10+" /></a>
  <a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/License-MIT-yellow.svg" alt="MIT License" /></a>
  <a href="https://github.com/astral-sh/ruff"><img src="https://img.shields.io/badge/code%20style-ruff-000000.svg" alt="ruff" /></a>
  <a href="https://github.com/xlp0/MCard_TDD/actions/workflows/ci.yml"><img src="https://github.com/xlp0/MCard_TDD/actions/workflows/ci.yml/badge.svg" alt="Build Status" /></a>
</p>

# MCard

MCard is a local-first, content-addressable storage platform with cryptographic integrity, temporal ordering, and a Polynomial Type Runtime (PTR) that orchestrates polyglot execution. It gives teams a verifiable data backbone without sacrificing developer ergonomics or observability.

---

## Highlights

- 🔐 **Hash-verifiable storage**: Unified network of relationships via SHA-256 hashing across content, handles, and history.
- ♾️ **Universal Substrate**: Emulates the Turing Machine "Infinitely Long Tape" via relational queries for computable DSLs.
- ♻️ **Deterministic execution**: PTR mediates **8 polyglot runtimes** (Python, JavaScript, Rust, C, WASM, Lean, R, Julia).
- 📊 **Enterprise ready**: Structured logging, CI/CD pipeline, security auditing, 99%+ automated test coverage.
- 🧠 **AI-native extensions**: GraphRAG engine, optional LLM runtime, and optimized multimodal vision (`moondream`).
- ⚛️ **Quantum NLP**: Optional `lambeq` + PyTorch integration for pregroup grammar and quantum circuit compilation.
- 🧰 **Developer friendly**: Rich Python API, TypeScript SDK, BMAD-driven TDD workflow, numerous examples.
- 📐 **Algorithm Benchmarks**: Sine comparison (Taylor vs Chebyshev) across Python, C, and Rust.
- ⚡ **High Performance**: Optimized test suite (~37s) with runtime caching and session-scoped fixtures.
- 🦆 **DuckDB Engine**: Optional columnar OLAP storage backend — same `StorageEngine` interface, ideal for analytical workloads and Parquet I/O.
- 📋 **Single Source of Truth Schema**: Both SQLite and DuckDB engines load schema exclusively from canonical SQL files (`mcard_schema.sql`, `mcard_vector_schema.sql`) — zero hardcoded CREATE TABLE statements.
- 🔄 **Shared MIME Registry**: A single [`mime_extensions.json`](mime_extensions.json) drives content-type detection across both Python and TypeScript — edit one file to update both runtimes, no recompilation needed.
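
As a rough illustration of the schema-from-file idea in the highlights above, the sketch below creates tables by executing a SQL file verbatim, so no `CREATE TABLE` text lives in Python. The `init_db` helper, the `demo_schema.sql` file, and the `demo_card` table are hypothetical placeholders; they are not the real contents of `mcard_schema.sql` or MCard's engine code.

```python
import sqlite3
import tempfile
from pathlib import Path


def init_db(conn: sqlite3.Connection, schema_file: Path) -> None:
    """Create tables by executing the SQL file as-is; no DDL is written in Python."""
    conn.executescript(schema_file.read_text(encoding="utf-8"))


# Placeholder schema for the demo, NOT the real mcard_schema.sql.
with tempfile.TemporaryDirectory() as tmp:
    schema_path = Path(tmp) / "demo_schema.sql"
    schema_path.write_text(
        "CREATE TABLE IF NOT EXISTS demo_card (hash TEXT PRIMARY KEY, content BLOB);",
        encoding="utf-8",
    )
    conn = sqlite3.connect(":memory:")
    init_db(conn, schema_path)

# List the tables that the schema file created.
tables = [row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")]
print(tables)
```

Keeping the DDL in one file means a second engine can read the same file instead of duplicating table definitions.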

For the long-form narrative and chapter roadmap, see **[docs/theory/Narrative_Roadmap.md](docs/theory/Narrative_Roadmap.md)**. Architectural philosophy is captured in **[docs/architecture/Monadic_Duality.md](docs/architecture/Monadic_Duality.md)**.

---

## Quick Start (Python)

```bash
git clone https://github.com/xlp0/MCard_TDD.git
cd MCard_TDD
make setup-dev              # creates .venv with uv, installs all deps + pre-commit
uv run pytest -q -m "not slow"  # run the fast Python test suite
uv run python scripts/clm/run_clms.py chapters/chapter_01_arithmetic/addition.yaml
```

### Development Setup

This project uses **[uv](https://github.com/astral-sh/uv)** as the sole Python dependency manager. All dependencies are defined in `pyproject.toml` and locked in `uv.lock`.

```bash
# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create venv and install all dependencies (including dev)
make setup-dev
# Or manually:
uv venv --prompt MCard_TDD
uv sync --all-extras --dev

# Run commands via uv
uv run pytest           # run tests
uv run ruff check mcard/  # lint
uv run python script.py   # run any script
```

Create and retrieve a card:

```python
from mcard import MCard, default_collection

card = MCard("Hello MCard")
hash_value = default_collection.add(card)
retrieved = default_collection.get(hash_value)
print(retrieved.get_content(as_text=True))
```
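
To make the idea of content addressing concrete, here is a minimal standard-library sketch, assuming the address is simply the SHA-256 hex digest of the raw bytes. `TinyStore` and `content_hash` are hypothetical names used only for illustration; MCard's actual hashing and storage details may differ.

```python
import hashlib


def content_hash(content: bytes) -> str:
    """Return the SHA-256 hex digest used here as the content address."""
    return hashlib.sha256(content).hexdigest()


class TinyStore:
    """In-memory stand-in for a content-addressable collection (hypothetical)."""

    def __init__(self) -> None:
        self._items = {}

    def add(self, content: bytes) -> str:
        key = content_hash(content)
        # Identical content always maps to the same key, so re-adding is idempotent.
        self._items[key] = content
        return key

    def get(self, key: str) -> bytes:
        content = self._items[key]
        # Re-hash on read so that tampering with stored bytes is detected.
        if content_hash(content) != key:
            raise ValueError("stored content does not match its hash")
        return content


store = TinyStore()
key = store.add(b"Hello MCard")
print(key == store.add(b"Hello MCard"))  # same bytes, same address
print(store.get(key).decode())
```

Because the key is derived from the content, a reader can verify any retrieved item without trusting the store.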

### Quick Start (JavaScript / WASM)

See **[mcard-js/README.md](mcard-js/README.md)** for build, testing, and npm publishing instructions for the TypeScript implementation.

- **mcard-studio**: The interactive PWA IDE — see [mcard-studio/README.md](mcard-studio/README.md) for setup and architecture.

### Quick Start (Quantum NLP)

MCard optionally integrates with **[lambeq](https://cqcl.github.io/lambeq/)** for quantum natural language processing using pregroup grammar:

```bash
# Install with Quantum NLP support (requires Python 3.10+)
uv pip install -e ".[qnlp]"

# Parse a sentence into a pregroup grammar diagram
uv run python scripts/lambeq_web.py "John gave Mary a flower"
```

**Example output** (pregroup types):

```
John: n    gave: n.r @ s @ n.l @ n.l    Mary: n    a flower: n
Result: s (grammatically valid sentence)
```

The pregroup diagrams can be compiled to quantum circuits for QNLP experiments.
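
The reduction to `s` in the example output can be checked by hand with the two pregroup contraction rules: `x` followed by `x.r` cancels, and `x.l` followed by `x` cancels. The sketch below applies them greedily with a stack. It is a plain-Python illustration that does not use or reproduce `lambeq`'s parser, and `reduce_pregroup` is a hypothetical helper; a greedy pass is enough for this sentence but is not a complete pregroup parser.

```python
def reduce_pregroup(types: list) -> list:
    """Greedily cancel adjacent pairs (x, x.r) and (x.l, x) using a stack."""
    stack = []
    for t in types:
        if stack:
            top = stack[-1]
            # Right adjoint: x followed by x.r contracts to nothing.
            # Left adjoint:  x.l followed by x contracts to nothing.
            if t == top + ".r" or top == t + ".l":
                stack.pop()
                continue
        stack.append(t)
    return stack


# Types copied from the example output above.
sentence = {
    "John": ["n"],
    "gave": ["n.r", "s", "n.l", "n.l"],
    "Mary": ["n"],
    "a flower": ["n"],
}
flat = [t for word_types in sentence.values() for t in word_types]
print(reduce_pregroup(flat))
```

A result containing only `s` means every noun type was consumed by the verb, which is what "grammatically valid sentence" refers to.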

---

## Polyglot Runtime Matrix

| Runtime    | Status | Notes                                                      |
| ---------- | ------ | ---------------------------------------------------------- |
| Python     | ✅     | Reference implementation, CLM runner                       |
| JavaScript | ✅     | Node + browser (WASM) + Full RAG Support + Pyodide         |
| Rust       | ✅     | High-performance adapter & WASM target                     |
| C          | ✅     | Low-level runtime integration                              |
| WASM       | ✅     | Edge and sandbox execution                                 |
| Lean       | ⚙️   | Formal verification pipeline (requires `lean-toolchain`) |
| R          | ✅     | Statistical computing runtime                              |
| Julia      | ✅     | High-performance scientific computing                      |

> **⚠️ Lean Configuration**: A `lean-toolchain` file in the project root is **critical**. Without it, `elan` will attempt to resolve/download toolchain metadata on *every invocation*, causing CLM execution to hang or become unbearably slow.
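
A small pre-flight check along these lines could catch a missing file before running Lean CLMs. This is only a sketch: `has_lean_toolchain` is a hypothetical helper, not part of the `mcard` package, and it only checks that the file exists and is non-empty.

```python
from pathlib import Path


def has_lean_toolchain(project_root: str = ".") -> bool:
    """Return True if a non-empty `lean-toolchain` file sits in project_root."""
    toolchain = Path(project_root) / "lean-toolchain"
    return toolchain.is_file() and toolchain.read_text().strip() != ""


if __name__ == "__main__":
    if has_lean_toolchain():
        print("lean-toolchain found")
    else:
        print("lean-toolchain missing: Lean CLMs may hang while elan resolves a toolchain")
```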

### Compiling Native Binaries

For detailed instructions on compiling the required C, Rust, and WASM binaries for the polyglot tests, please see the **[Compiling Native Binaries Guide](docs/guides/COMPILING_BINARIES.md)**.

---

## Project Structure (abridged)

```
MCard_TDD/
├── mcard/            # Python package (engines, models, PTR)
├── mcard-js/         # TypeScript SDK — 3 interchangeable storage engines, PTR, RAG
├── mcard-studio/     # [Submodule] Astro + React PWA — artifact IDE with VCard events
├── LandingPage/      # [Submodule] Static-first P2P Documentation & Landing Portal
├── chapters/         # CLM specifications (polyglot demos)
├── docs/             # Architecture, PRD, guides, reports
├── scripts/          # Automation & demo scripts
├── tests/            # >815 automated tests (Python)
├── mime_extensions.json  # Shared MIME-type registry (Python + TypeScript)
└── pyproject.toml    # uv-managed dependencies (uv.lock)
```

### Submodule Organization

This repository orchestrates two key frontend applications as submodules, each serving a distinct role in the MCard ecosystem:

#### 1. mcard-studio (`/mcard-studio`)

**The Interactive IDE for Eventual Consistency & Eventual Correctness.**
A **Progressive Web App (PWA)** built with **Astro** and **React**. It serves as the primary interface for creating, editing, and executing MCards and CLMs.

- **Tech Stack**: Astro, React, Zustand, Monaco Editor, Anime.js/GSAP, Mermaid.
- **Key Features**: Four-store persistence (`servermemory.db`, `browsermemory.db`, `execution_logs.db`, filesystem), dual-mode CLM execution (browser-first JS → server fallback), VCard event pipeline with result sealing, inline rename/upload/create, version history with time-travel, native AI assistant (Ollama), 30+ file type renderers.
- **Role**: The "Editor" & "Runtime" environment for developers and power users.
- **Test Results**: 372 tests passed (37 test files).

#### 2. LandingPage (`/LandingPage`)

**The Public Portal & Knowledge Container.**
A **static-first** modular web application designed for decentralized distribution. It focuses on P2P communication, documentation rendering, and interactive 3D visualizations.

- **Tech Stack**: Vanilla JS Modules, WebRTC (No signaling server), Three.js, KaTeX, Mermaid.
- **Key Features**: Serverless P2P mesh networking, zero-dependency architecture (runs locally without build steps), and rich markdown/media rendering.
- **Role**: The "Viewer" & "distributable container" for the Personal Knowledge Container (PKC) concept.

---

## Documentation

- Product requirements: [docs/specifications/prd.md](docs/specifications/prd.md)
- Architecture overview: [docs/architecture/overview.md](docs/architecture/overview.md)
- **Schema principles**: [schema/README.md](schema/README.md) — Empty Schema grounding, verification-first storage, and the core/extension split.
- **`mcard-js` schema reference**: [mcard-js/schema/README.md](mcard-js/schema/README.md) — Practical explanation of `mcard_schema.sql` and `mcard_vector_schema.sql`.
- **DOTS vocabulary**: [docs/WorkingNotes/Hub/Theory/Integration/DOTS Vocabulary as Efficient Representation for ABC Curriculum.md](docs/WorkingNotes/Hub/Theory/Integration/DOTS%20Vocabulary%20as%20Efficient%20Representation%20for%20ABC%20Curriculum.md)
- Monad–Polynomial philosophy: [docs/architecture/Monadic_Duality.md](docs/architecture/Monadic_Duality.md)
- Narrative roadmap & chapters: [docs/theory/Narrative_Roadmap.md](docs/theory/Narrative_Roadmap.md)
- Logging system: [docs/guides/LOGGING_GUIDE.md](docs/guides/LOGGING_GUIDE.md)
- PTR & CLM reference: [docs/specifications/CLM_Language_Specification.md](docs/specifications/CLM_Language_Specification.md), [docs/archive/PCard Architecture.md](docs/archive/PCard%20Architecture.md)
- Reports & execution summaries: [docs/reports/](docs/reports/)
  - WebSocket Performance Debugging

---

## Platform Vision & Architecture

The theoretical foundations, including the **Function Economy**, **Petri Net Scheduler**, **Dual-Handle Memory Architecture**, and **Concurrency Protection**, have been consolidated into the **[Platform Vision Document](docs/architecture/PLATFORM_VISION.md)**.

> - `… Petri Net Implementation.md`: Physical implementation mapping
> - [DOTS → PTR Meta-Language](docs/WorkingNotes/Hub/Theory/Integration/The%20Operational%20Meta-Language%20-%20From%20DOTS%20to%20PTR.md): Theoretical framework

---

## Recent Updates

> **Full changelog:** [CHANGELOG.md](CHANGELOG.md)

**Current versions**: Python `mcard` 0.1.60 · TypeScript `mcard-js` 2.1.43

**Polyglot Runtime & WASM Integration Fixes (v0.1.60 / v2.1.43)**: Fixed critical missing `--experimental-wasm-modules` environment flags in Python subprocess integration, rectified `module://` protocol resolution for cross-environment testing, and enforced `importlib` dynamic module bootstrapping in NodeJS. Patched Rust/TypeScript content-type string detection parity bugs and Python 3.9 `__future__` typing incompatibilities.

**Python Build & Syntax Corrections (v0.1.59 / v2.1.42)**: Fixed `mcard` syntax, namespace, and import dependency compilation errors across `improved_logging.py`, `card_collection.py`, and `logging_config.py`. Restored 100% test build health.

**Storage Layer Deduplication (v0.1.58 / v2.1.42)**: Comprehensive refactoring to centralize card operations into `AbstractSqlEngine` (TypeScript) and simplify connection paths via `resolve_db_path()` (Python). Eliminated over 250 lines of duplicate code across the 4 TypeScript SQL engines, standardized dialect handling (SQLite `TEXT` vs DuckDB `VARCHAR`), and fixed a latent foreign key bug in handle renaming. All functionality remains fully backward compatible.

**🏗️ Major Project Restructuring (v0.1.56 / v2.1.38)**: Four-phase structural overhaul — root files relocated to proper directories, documentation reorganized (32 flat files → 7 subdirectories), scripts reorganized (19 files → 6 subdirectories), Python tests restructured (28 files → 5 subdirectories), TypeScript engine implementations moved to `storage/engines/` with barrel re-exports, factory pattern migration (`SqliteNodeEngine.create()`), test database cleanup, and `.gitignore` hardening. Runtime behavior is unchanged; all 849 TS tests and 767 Python tests pass.

Recent milestones also include DuckDB as an alternative storage engine, shared MIME registry (`mime_extensions.json`), PTR exception narrowing (109 broad catches → specific types), SqlJs vector adapter for browser-based vector search, and ContentTypeInterpreter event-loop starvation fix. See [CHANGELOG.md](CHANGELOG.md) for full details.

## Testing

> **Note:** All commands below should be run from the project root (`MCard_TDD/`).

### Unit Tests

```bash
# Python
uv run pytest -q                 # Run all tests
uv run pytest -q -m "not slow"   # Fast tests only
uv run pytest -m "not network"   # Skip LLM/Ollama tests

# JavaScript
npm --prefix mcard-js test -- --run

# Browser (MCard Studio)
npm --prefix mcard-studio run test:unit -- --run
```

### CLM Verification

Both Python and JavaScript CLM runners support three modes: **all**, **directory**, and **single file**.

#### Python

```bash
# Run all CLMs
uv run python scripts/clm/run_clms.py

# Run by directory
uv run python scripts/clm/run_clms.py chapters/chapter_01_arithmetic
uv run python scripts/clm/run_clms.py chapters/chapter_08_P2P

# Run single file
uv run python scripts/clm/run_clms.py chapters/chapter_01_arithmetic/addition.yaml

# Run with custom context
uv run python scripts/clm/run_clms.py chapters/chapter_08_P2P/generic_session.yaml \
    --context '{"sessionId": "my-session"}'
```

#### JavaScript

```bash
# Run all CLMs
npm --prefix mcard-js run clm:all

# Run by directory/filter
npm --prefix mcard-js run clm:all -- chapter_01_arithmetic
npm --prefix mcard-js run clm:all -- chapters/chapter_08_P2P

# Run single file
npm --prefix mcard-js run demo:clm -- chapters/chapter_01_arithmetic/addition_js.yaml
```

### Chapter Directories

| Directory                 | Description                                                  |
| ------------------------- | ------------------------------------------------------------ |
| `chapter_00_prologue`   | Hello World, Lambda calculus, and Church encoding — 11 CLMs |
| `chapter_01_arithmetic` | Arithmetic operations (Python, JS, Lean) — 27 CLMs          |
| `chapter_02_handle`     | Handle operations and dual retrieval                         |
| `chapter_03_llm`        | LLM integration (requires Ollama)                            |
| `chapter_04_load_dir`   | Filesystem and collection loading                            |
| `chapter_05_reflection` | Meta-programming and recursive CLMs                          |
| `chapter_06_lambda`     | Lambda calculus runtime                                      |
| `chapter_07_network`    | HTTP requests, MCard sync, network I/O — 5 CLMs             |
| `chapter_08_P2P`        | P2P networking and WebRTC — 16 CLMs (3 VCard)               |
| `chapter_09_DSL`        | Meta-circular language definition and combinators — 10 CLMs |
| `chapter_10_service`    | Static server builtin and service management — 3 CLMs       |

---

## Contributing

1. Fork the repository and create a feature branch.
2. Run the tests (`uv run pytest`, `npm test` in `mcard-js`).
3. Submit a pull request describing your change and tests.

---

## Future Roadmap

### Road to VCard (Design & Implementation)

Based on the **MVP Cards Design Rationale**, a VCard (Value Card) represents a boundary-enforced value exchange unit that often contains sensitive private data (identities, private keys, financial claims). Unlike standard MCards, which are designed for public distribution and reproducibility, VCards require strict confidentiality.

**Design Requirements & Rationale:**

1. **Privacy & Encryption**: VCards cannot be stored in the standard `mcard.db` (which is often shared or public) without encryption. They must be stored in a "physically separate" container or be encrypted at rest.
2. **Authentication Primitive**: A VCard serves as a specialized "Certificate of Authority" — a precondition for executing sensitive PTR actions.
3. **Audit Certificates**: Execution of a VCard-authorized action must produce a **VerificationVCard** (Certificate of Execution), which proves the action occurred under authorization. This certificate is also sensitive.
4. **Unified Schema**: While the storage *location* differs, the *data schema* should remain identical to MCard (content addressable, hash-linked) to reuse the rigorous polynomial logic.

**Proposed Architecture:**

* **Dual-Database Storage**:
  * `mcard.db` (Public/Shared): Stores standard MCards, Logic (PCards), and Public Keys.
  * `vcard.db` (Private/Local): Stores VCards, Encrypted Private Keys, and Verification Certificates.
* **Execution Flow**:
  `execute(pcard_hash, input, vcard_authorization_hash)`
  1. **Gatekeeper**: PTR checks if `vcard_authorization_hash` exists in the Private Store (`vcard.db`).
  2. **Zero-Trust Verify**: Runtime validates the VCard's cryptographic integrity and permissions (Security Polynomial).
  3. **Execute**: If valid, the PCard logic runs.
  4. **Certify**: A new `VerificationVCard` is generated, signed, and stored in `vcard.db`, linking the Input, Output, and Authority.
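
The four steps above can be sketched with in-memory dictionaries standing in for `mcard.db` and `vcard.db`. Everything here is an assumption made for illustration: the `allowed_pcards` field, the use of a Python callable as a PCard, and the unsigned certificate are placeholders, not the planned implementation.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def execute(pcard_hash, input_value, vcard_authorization_hash, public_store, private_store):
    """Walk the four steps: Gatekeeper, Zero-Trust Verify, Execute, Certify."""
    # 1. Gatekeeper: the authorization must exist in the private store.
    vcard = private_store.get(vcard_authorization_hash)
    if vcard is None:
        raise PermissionError("unknown VCard authorization")

    # 2. Zero-trust verify: re-hash the VCard and check its permissions.
    if sha256_hex(vcard) != vcard_authorization_hash:
        raise PermissionError("VCard failed integrity check")
    if pcard_hash not in json.loads(vcard)["allowed_pcards"]:  # hypothetical field
        raise PermissionError("VCard does not authorize this PCard")

    # 3. Execute: a PCard is just a Python callable here (stand-in for PTR).
    output = public_store[pcard_hash](input_value)

    # 4. Certify: link input, output, and authority; store in the private store.
    # A real certificate would also be signed; signing is omitted in this sketch.
    certificate = json.dumps(
        {
            "pcard": pcard_hash,
            "input": sha256_hex(repr(input_value).encode()),
            "output": sha256_hex(repr(output).encode()),
            "authority": vcard_authorization_hash,
        },
        sort_keys=True,
    ).encode()
    cert_hash = sha256_hex(certificate)
    private_store[cert_hash] = certificate
    return output, cert_hash


# Demo: the key "double" would be a content hash in practice.
public_store = {"double": lambda x: x * 2}
vcard_bytes = json.dumps({"allowed_pcards": ["double"]}).encode()
vcard_hash = sha256_hex(vcard_bytes)
private_store = {vcard_hash: vcard_bytes}

result, cert = execute("double", 21, vcard_hash, public_store, private_store)
print(result, cert in private_store)
```

Note that the certificate lands in the private store only, which mirrors the requirement that verification certificates are themselves sensitive.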

**TODOs:**

- [ ] **Infrastructure**: Implement `PrivateCollection` (wrapper around `vcard.db`) in Python and JavaScript factories.
- [ ] **Encryption Middleware**: Add a transparent encryption layer (e.g., AES-GCM) for the Private Collection to ensure Encryption-at-Rest.
- [ ] **CLI Auth**: Update `run_clms.py` to accept `--auth <vcard_hash>` and mount the private keystore.
- [ ] **Certificate Generation**: Implement the `VerificationVCard` schema and generation logic in `CLMRunner`.

---

### Logical Model Certification & Functional Deployment

Use of the **Cubical Logic Model (CLM)** as a "Qualified Logical Model" is strictly governed by principles derived from Eelco Dolstra's *The Purely Functional Software Deployment Model* (the theoretical basis of Nix).

A CLM is not merely source code; it is a candidate for certification. It only becomes a **Qualified Logical Model** when it possesses a valid **Certification**, which is a cryptographic proof of successful execution by a specific version of the Polynomial Type Runtime (PTR).

**The Functional Certification Equation:**

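$$
\mathrm{Observation} = \mathrm{PTR}_{vX.Y.Z}(\mathrm{CLM}_{\mathrm{Source}})
$$

$$
\mathrm{Certification} = \mathrm{Sign}_{\mathrm{Authority}}\big(\mathrm{Hash}(\mathrm{CLM}_{\mathrm{Source}}) + \mathrm{Hash}(\mathrm{PTR}_{vX.Y.Z}) + \mathrm{Hash}(\mathrm{Observation})\big)
$$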

**Parallels to the Nix Model:**

1. **Hermetic Inputs**: Just as a Nix derivation hashes all inputs (compiler, libs, source), a CLM Certification depends on the exact **PTR Runtime Version** and **CLM Content Hash**. Changing the runtime version invalidates the certificate, requiring re-qualification (re-execution).
2. **Deterministic Derivation**: The "build" step is the execution of the CLM's verification logic. If the PTR (the builder) is deterministic, the output (VerificationVCard) is reproducible.
3. **The "Store"**: The `mcard.db` acts as the Nix Store, holding immutable, content-addressed CLMs. The `vcard.db` acts as the binary cache, holding signed Certifications (outputs) that prove a CLM works for a given runtime configuration.

This ensures that a "Qualified CLM" is not just "code that looks right," but **"code that has logically proven itself"** within a specific, physically identifiable execution environment.
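
The certification equation and the runtime-version dependency can be sketched in a few lines. Several assumptions are made here and should be read as such: `+` is interpreted as byte concatenation, the PTR hash is taken over a version string rather than the runtime itself, and `Sign` is approximated with HMAC-SHA256, which is a symmetric stand-in and not a public-key signature. The names `certify` and `verify` are hypothetical.

```python
import hashlib
import hmac


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def certify(clm_source: bytes, ptr_version: str, observation: bytes, authority_key: bytes) -> str:
    """Certification = Sign(Hash(CLM) + Hash(PTR) + Hash(Observation))."""
    message = h(clm_source) + h(ptr_version.encode()) + h(observation)
    return hmac.new(authority_key, message, hashlib.sha256).hexdigest()


def verify(cert: str, clm_source: bytes, ptr_version: str, observation: bytes, authority_key: bytes) -> bool:
    """Recompute the certification and compare in constant time."""
    expected = certify(clm_source, ptr_version, observation, authority_key)
    return hmac.compare_digest(cert, expected)


key = b"demo-authority-key"  # hypothetical key material
clm = b"clm: addition"       # hypothetical CLM source
obs = b"2 + 3 = 5"           # hypothetical observation
cert = certify(clm, "0.1.60", obs, key)

print(verify(cert, clm, "0.1.60", obs, key))  # same runtime version
print(verify(cert, clm, "0.1.61", obs, key))  # different runtime version
```

Changing any of the three hashed inputs, including the runtime version, yields a different certification, which is the re-qualification behavior described in the Nix parallel.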

---

## License

This project is licensed under the MIT License – see [LICENSE](LICENSE).

For release notes, check [CHANGELOG.md](CHANGELOG.md).
