Metadata-Version: 2.4
Name: hcp
Version: 0.1.1
Summary: The Human-AI Collaboration Protocol (HCP)
Author: Yang Yu, Yijia Shao
Requires-Python: >=3.9
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Dist: pydantic (>=2.1.0)
Description-Content-Type: text/markdown

# Human-AI Collaboration Protocol (HCP)
What is the Human-AI Collaboration Protocol (HCP)?

**Human-AI Collaboration Protocol is the HTTP between Humans and AI.**

[![PyPI version](https://img.shields.io/pypi/v/hcp?color=brightgreen)](https://pypi.org/project/hcp/)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![Formally Verified in Lean 4](https://img.shields.io/badge/Formally_Verified-Lean_4-purple.svg)](https://github.com/yu-yang-i/HCP/main/hcp_proof/HCP_Proof.lean)

![HCP Architecture Diagram](https://raw.githubusercontent.com/yu-yang-i/HCP/main/assets/arc.jpg)

The categorical universality and future-proof property of the HCP topology have been formally verified using the Lean 4 theorem prover.

## 1. The Problem: Asymmetric Collaboration

Today’s human-AI collaboration imposes a fundamental asymmetry: the human asks or issues commands, and the AI responds. This architecture worked when AI capabilities were limited, but it creates a collaboration ceiling as agents become more capable.

The bottleneck is twofold. **Architecturally**, half of the collaboration action space is missing by design. **Communicatively**, the interaction is forced through a low-bandwidth, ambiguous channel: unstructured chat.

### A. The Input Ambiguity
When an Agent requires a specific parameter to proceed—such as a confidence integer or a boolean confirmation—it is currently forced to ask via open-ended text.

*   **Agent:** "Please set the confidence level (0-100)."
*   **User:** "Just make it safe."
*   **Result:** The Agent must rely on error-prone parsing or hallucination to interpret the input.
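
To make the contrast concrete, here is a minimal sketch of the same exchange with a typed constraint instead of free text. The `ContinuousRequest` class and its fields are hypothetical names chosen for illustration; they are not part of the HCP SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContinuousRequest:
    """Hypothetical typed request: an integer within [minimum, maximum]."""
    label: str
    minimum: int
    maximum: int

    def accept(self, raw):
        """Return a validated int, or raise ValueError with a clear reason."""
        try:
            value = int(raw)
        except (TypeError, ValueError):
            raise ValueError(f"{self.label}: expected an integer, got {raw!r}")
        if not self.minimum <= value <= self.maximum:
            raise ValueError(
                f"{self.label}: {value} is outside {self.minimum}-{self.maximum}"
            )
        return value


confidence = ContinuousRequest(label="Confidence level", minimum=0, maximum=100)
print(confidence.accept("85"))  # a well-formed answer passes through as an int

try:
    confidence.accept("Just make it safe.")
except ValueError as err:
    print(f"Rejected: {err}")  # free text is rejected instead of guessed at
```

With a constraint like this, an ambiguous reply fails loudly at the boundary, so the Agent never has to guess what "safe" means.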

### B. The Missing Reverse Gear
Current interfaces assume agents only respond to human requests. As agents become more capable, they also need the reverse: to **delegate research tasks** back to humans (“I need domain expertise on X”), **request strategic decisions** (“Should we optimize for speed or accuracy?”), or **provide suggestions** (“I found three approaches - I suggest trying them one by one”).

Today, all these cases are forced through chat, which is both ambiguous (the agent cannot guarantee how the human will interpret or format their response) and stateless (the agent has no way to specify whether this is blocking or background, or to track resolution status).
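
As a rough illustration of what "stateful" could mean here, the sketch below models an agent-to-human request that carries a blocking flag and a resolution status. `AgentRequest`, `Status`, and every field name are assumptions made for this example, not the HCP wire format.

```python
from dataclasses import dataclass, field
from enum import Enum
import uuid


class Status(Enum):
    PENDING = "pending"
    RESOLVED = "resolved"


@dataclass
class AgentRequest:
    """Hypothetical record for a request the agent sends to the human."""
    rationale: str
    blocking: bool  # True: the agent waits; False: it keeps working
    ref_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    status: Status = Status.PENDING
    value: object = None

    def resolve(self, value):
        """Attach the human's answer and mark the request as resolved."""
        self.value = value
        self.status = Status.RESOLVED


decision = AgentRequest(
    rationale="Should we optimize for speed or accuracy?", blocking=True
)
research = AgentRequest(rationale="I need domain expertise on X", blocking=False)

# The agent can tell which open requests must be answered before it proceeds.
open_requests = [decision, research]
must_wait_for = [
    r for r in open_requests if r.blocking and r.status is Status.PENDING
]

decision.resolve("accuracy")
```

Plain chat offers neither the `blocking` flag nor the `status` field, which is the gap this section describes.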

## 2. The Solution: Symmetric Data Topology

**HCP does not define UI.**

It treats **Input** and **Output** as the same mathematical object: The **DataShape**.
The Agent defines the container (The Shape); the Client adapts its rendering to fill it.

Utilizing a principle of **Symmetry**, the Agent defines the Shape, and the `Direction` determines whether it is a "Read" or "Write" operation.

This guarantees that the Agent works across any substrate—Web, VR, Mobile, or Brain-Computer Interface—without code changes.
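
The symmetry can be sketched in a few lines. The names below (`Direction.REQUEST`, `Direction.OFFER`, `Continuous`, `Message`) are assumptions for illustration and may not match the SDK's actual classes; the point is only that one shape definition serves both directions.

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    REQUEST = "request"  # the agent asks the human to fill the shape
    OFFER = "offer"      # the agent presents a filled shape to the human


@dataclass(frozen=True)
class Continuous:
    """A magnitude within a range; the same object serves input and output."""
    label: str
    minimum: float
    maximum: float


@dataclass(frozen=True)
class Message:
    direction: Direction
    shape: Continuous
    value: object = None  # empty for a request, filled for an offer


risk = Continuous(label="Risk tolerance", minimum=0.0, maximum=1.0)

ask = Message(Direction.REQUEST, risk)            # "tell me your risk tolerance"
show = Message(Direction.OFFER, risk, value=0.3)  # "I propose 0.3"

# One shape definition, two directions.
print(ask.shape is show.shape)
```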

### The Four Fundamental Shapes

| Shape | The Physics | Web Rendering (Example) | VR Rendering (Example) |
| :--- | :--- | :--- | :--- |
| **Discrete** | Selection from a finite set (Entropy). | Dropdown / Radio / Badge | Floating Orbs |
| **Continuous** | Magnitude within a range (Scalar). | Slider / Gauge | A Dial / Throttle |
| **Symbolic** | Raw encoded data (MIME-typed). | Text Field / File Upload | Dictation / Hologram |
| **Composite** | Recursive structure. | Form / Dashboard | Control Panel |


Because HCP defines *Topology* (Physics) rather than *Widgets* (Pixels), the Agent becomes substrate-independent. The exact same Python code that requests a "Vector Field" can manifest as:

1.  **A File Uploader** on a 2D Web Browser.
2.  **A Holographic Cube** in an Unreal Engine VR simulation.
3.  **A Neural Input** in a future Brain-Computer Interface.

**The Agent acts as the Director; the Client acts as the Renderer.**
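
A client-side view of this split might look like the sketch below. The widget names are taken from the table above; `ShapeKind`, `WIDGETS`, and `render` are hypothetical and only illustrate that the mapping lives in the Client, not in the Agent.

```python
from enum import Enum


class ShapeKind(Enum):
    DISCRETE = "discrete"
    CONTINUOUS = "continuous"
    SYMBOLIC = "symbolic"
    COMPOSITE = "composite"


# Each client owns its own mapping; the agent never sees it.
WIDGETS = {
    "web": {
        ShapeKind.DISCRETE: "Dropdown",
        ShapeKind.CONTINUOUS: "Slider",
        ShapeKind.SYMBOLIC: "Text Field",
        ShapeKind.COMPOSITE: "Form",
    },
    "vr": {
        ShapeKind.DISCRETE: "Floating Orbs",
        ShapeKind.CONTINUOUS: "Dial",
        ShapeKind.SYMBOLIC: "Dictation",
        ShapeKind.COMPOSITE: "Control Panel",
    },
}


def render(substrate, kind):
    """Pick the widget a given client would use for a shape kind."""
    return WIDGETS[substrate][kind]


# The agent's request is identical in both cases; only the client differs.
request_kind = ShapeKind.CONTINUOUS
print(render("web", request_kind))  # Slider
print(render("vr", request_kind))   # Dial
```

Supporting a new substrate then means adding one more mapping on the client side, with no change to the Agent.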

## 3. Installation

```bash
pip install hcp
```

## 4. Usage: The Python SDK

HCP is designed as a **Stateless Mixin**. You do not rewrite your Agent; you simply inherit `HCPMixin` to give it the ability to speak "Topology" instead of just text.

### A. Architectural Flexibility

HCP is an interface protocol, not a rigid execution pipeline. The SDK's sole responsibility is translating Agent intent into the Wire Protocol. The method of generating that intent is left to your system architecture.

Depending on latency requirements, compute budgets, and model capabilities, HCP supports multiple adoption patterns:

*   **Pattern A: Single-Pass (High Velocity).** Inject the `HCP_PORTAL_PROMPT` directly into the primary Agent's system prompt. The Agent emits HCP JSON natively in real-time. *Optimal for low-latency copilots or advanced foundation models.*
*   **Pattern B: Lazy-Loading (Cost Optimized).** The primary Agent utilizes a lightweight tool trigger to invoke a secondary, isolated LLM call that generates the UI. *Optimal for complex reasoning agents managing strict token limits.*
*   **Pattern C: Native Fine-Tuning (Enterprise Scale).** A model is fine-tuned directly on the "Co-Op Memory" database to output HCP JSON organically, bypassing the need for a System Prompt entirely.

### B. Inherit the Mixin
```python
from hcp.adapter import HCPMixin
from hcp.prompts import HCP_PORTAL_PROMPT # Official Manual (for zero-shot inference)

class BaseAgent(HCPMixin):
    # ... agent reasoning logic ...
    pass
```

### C. The Forward Pass (Intent → Protocol)

Avoid manual JSON parsing. Pass the agent's raw LLM output string to `hcp_forward()`. The SDK automatically handles string cleaning (markdown blocks), schema validation, and object inflation.

```python
    async def step(self, user_input):

        # 1. Generate Intent (via Pattern A, B, or C)
        # Expected output: '```json {"method": "HCP_Request", ...} ```'
        raw_llm_output = await self.llm.generate(user_input)

        try:
            # 2. THE ADAPTER PASS
            # Transforms raw string -> Cleans Markdown -> Validates JSON -> Builds Protocol
            hcp_payload = self.hcp_forward(raw_llm_output)

            # 3. Transmit to Client
            # The payload is structurally complete and ready for the transport layer.
            self.send_message(content=None, metadata={"hcp": hcp_payload})

        except ValueError as e:
            # Handle hallucinated schema or fallback to standard text routing
            print(f"HCP Error: {e}")
```
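
For intuition about what the adapter pass has to cope with, here is a standalone sketch of the cleaning step. It is not the SDK's implementation: `extract_json` is a hypothetical helper, and the check for a `method` field is an assumption based on the expected-output comment above.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence marker
_FENCED = re.compile(rf"^{FENCE}(?:json)?\s*(.*?)\s*{FENCE}$", re.DOTALL)


def extract_json(raw):
    """Strip a Markdown code fence if present, then parse the JSON payload."""
    text = raw.strip()
    match = _FENCED.match(text)
    if match:
        text = match.group(1)
    try:
        payload = json.loads(text)
    except json.JSONDecodeError as err:
        raise ValueError(f"Not valid JSON: {err}") from err
    # Assumed minimal schema check: a JSON object carrying a "method" field.
    if not isinstance(payload, dict) or "method" not in payload:
        raise ValueError("Payload must be a JSON object with a 'method' field")
    return payload


raw_llm_output = FENCE + 'json\n{"method": "HCP_Request"}\n' + FENCE
print(extract_json(raw_llm_output))
```

Raising `ValueError` on malformed output mirrors the `except ValueError` branch in the step above, so the caller can fall back to plain text routing.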

### D. Advanced Pattern: Lazy-Loading (Pattern B)
If you are building a complex orchestrator and want to keep your main context window clean, use the **Two-Step Inference** strategy.

**1. The Trigger Definition**
Add this tool definition to your main agent's system prompt:
```text
- Enable_HCP(context_summary, detailed_intention, detailed_next_step): Use this ONLY if you need to interact with the User.
```

**2. The Micro-Inference Execution**
```python
    async def execute_hcp(self, context_summary, detailed_intention, detailed_next_step):

        # 1. Construct the Prompt (The "Heavy" Manual)
        system_msg = HCP_PORTAL_PROMPT

        user_msg = (
            f"Context: {context_summary}\n"
            f"Intention: {detailed_intention}\n"
            f"Next Step: {detailed_next_step}\n"
        )

        # 2. Call LLM (Micro-Inference)
        raw_llm_output = await self.llm.generate(system_msg, user_msg)

        try:
            # 3. Process via Adapter
            hcp_payload = self.hcp_forward(raw_llm_output)
            self.send_message(content=None, metadata={"hcp": hcp_payload})

        except ValueError as e:
            print(f"HCP Error: {e}")
```
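
How the `Enable_HCP` trigger reaches `execute_hcp` depends on your tool-calling setup. The sketch below shows one possible wiring with a name-to-handler table. `Orchestrator` and `handle_tool_call` are hypothetical, and `execute_hcp` is stubbed out so the example runs without an LLM.

```python
import asyncio


class Orchestrator:
    """Hypothetical main agent that routes parsed tool calls by name."""

    def __init__(self):
        self.hcp_calls = []
        # Tool name -> coroutine that handles it.
        self.tools = {"Enable_HCP": self.execute_hcp}

    async def execute_hcp(self, context_summary, detailed_intention, detailed_next_step):
        # Stand-in for the micro-inference above; here we only record the call.
        self.hcp_calls.append(
            (context_summary, detailed_intention, detailed_next_step)
        )
        return "hcp-dispatched"

    async def handle_tool_call(self, name, arguments):
        """Dispatch a tool call; unknown tools are reported, not guessed."""
        handler = self.tools.get(name)
        if handler is None:
            raise ValueError(f"Unknown tool: {name}")
        return await handler(**arguments)


agent = Orchestrator()
result = asyncio.run(agent.handle_tool_call("Enable_HCP", {
    "context_summary": "User is tuning a trading bot.",
    "detailed_intention": "Ask for a risk tolerance between 0 and 1.",
    "detailed_next_step": "Apply the value to position sizing.",
}))
print(result)
```

The main context window only ever sees the one-line tool definition; the full manual is loaded inside `execute_hcp` when the tool fires.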

### E. The Feedback Loop (Protocol → Co-Op Memory)

When the client responds, invoke `hcp_resolve` to decode the data and automatically trigger the Co-Op Memory log.

```python
    def on_incoming_message(self, message):

        # 1. Safely extract metadata
        metadata = getattr(message, "metadata", {}) or {}

        # 2. Check for Protocol Response
        if "hcp_response" in metadata:
            raw_payload = metadata["hcp_response"]
        else:
            # BEST PRACTICE: "Auto-Wrap" standard text
            # Treat standard text inputs as "Rejected/Alternative" proposals to maintain log integrity.
            raw_payload = {
                "usedProposal": False,
                "value": message.content,
                "context": { "refId": message.id, "rationale": "User Initiative (Chat)" }
            }

        # 3. Resolve & Log
        # Internal hook hcp_on_log() is triggered automatically during resolution.
        data, usedProposal = self.hcp_resolve(raw_payload)

        # 4. Update Agent Context
        if usedProposal:
            # SCENARIO A: Alignment
            # Inject the structured data to update the Agent's state.
            self.memory.add(f"[HCP Interaction] User Accepted. Data: {data}")
        else:
            # SCENARIO B/C: Friction
            # The user preferred text. The Agent processes the text naturally.
            pass
```
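
The auto-wrap branch is easy to test in isolation if you pull it into a pure function. This is a sketch under that assumption; `normalize_response` is a hypothetical helper name, and the payload keys are copied from the handler above.

```python
def normalize_response(metadata, content, message_id):
    """Return an HCP-style response payload for any incoming message."""
    if metadata and "hcp_response" in metadata:
        return metadata["hcp_response"]
    # Plain chat: wrap it so the log still records a rejected proposal.
    return {
        "usedProposal": False,
        "value": content,
        "context": {"refId": message_id, "rationale": "User Initiative (Chat)"},
    }


structured = normalize_response(
    {"hcp_response": {"usedProposal": True, "value": 0.3}}, None, "msg-1"
)
chat = normalize_response({}, "Just make it safe.", "msg-2")
print(structured["usedProposal"], chat["usedProposal"])
```

Either way the caller receives the same payload shape, so the resolve and log steps need no special case for chat messages.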

### F. The Co-Op Memory Logging Hook

Override the logging hook to save the alignment data to a persistent database. This dataset forms the foundation of the Co-Op Memory.

```python
    def hcp_on_log(self, log_entry):
        """
        Automatically invoked by hcp_resolve().

        log_entry structure:
        {
            "refId": "uuid...",
            "rationale": "I need to calibrate risk...", # Ground Truth
            "usedProposal": boolean,                    # The Label
            "value": any,                               # The Content
            "timestamp": "iso-date..."
        }
        """
        # Implementation Example: Save to MongoDB, Postgres, or JSONL
        MyDatabase.save_alignment_log(log_entry)
```
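
If you do not have a database yet, an append-only JSONL file is a reasonable starting point. The helpers below are a minimal sketch using only the standard library; `append_alignment_log` and `read_alignment_logs` are hypothetical names, and the sample entries follow the `log_entry` structure documented above.

```python
import json
import tempfile
from pathlib import Path


def append_alignment_log(path, log_entry):
    """Append one log entry as a single JSON line."""
    with Path(path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(log_entry, ensure_ascii=False) + "\n")


def read_alignment_logs(path):
    """Load every entry back, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Usage with a throwaway file; a real agent would use a durable location.
with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "coop_memory.jsonl"
    append_alignment_log(log_path, {
        "refId": "a1", "rationale": "I need to calibrate risk.",
        "usedProposal": True, "value": 0.3,
    })
    append_alignment_log(log_path, {
        "refId": "a2", "rationale": "I need to calibrate risk.",
        "usedProposal": False, "value": "Just make it safe.",
    })
    restored = read_alignment_logs(log_path)

print(len(restored))
```

Inside an agent, `hcp_on_log` could then delegate to `append_alignment_log` with a path of your choosing.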

## 5. The Architecture: Why Use HCP?

### A. The Recursive Guarantee (Elastic Fallback)

The true power of HCP lies in the **Composite Shape**. Because shapes are recursive (a Composite can contain Composites), we can model complex reality structures—like a Physics Simulation or a Medical Diagnosis—without worrying about the Client's capabilities.

This enables **Elastic Fallback**, arguably the most important feature for long-term robustness.

#### The "3D Gizmo" Problem
Imagine an Agent requesting a **3D Vector** (X, Y, Z) to control a robotic arm.

1.  **The High-Fidelity Client (e.g., Apple Vision Pro):**
    Recognizes the semantic label "3D Vector." It renders a **holographic manipulation gizmo** floating in space. The user grabs it and moves it.

2.  **The Low-Fidelity Client (e.g., A Text Terminal):**
    Does *not* have a "holographic gizmo."
    **In a standard protocol, the app would crash.**

3.  **The HCP Client:**
    It looks at the Composite Shape and performs **Molecular Decomposition**. It breaks the complex molecule down into its atomic parts until it finds a shape it *can* render.

    *   **GUI Client (Web):** Renders three sleek 1D Sliders.
    *   **XR Client (Vision Pro):** Renders three floating holographic dials.
    *   **TUI Client (Terminal):** Renders three ASCII sliders controlled by **Arrow Keys**.


        `X-Axis: [<-- ███░░░░░░░ -->] 30%`

        `Y-Axis: [<-- ██████░░░░ -->] 60%`

        `Z-Axis: [<-- ████░░░░░░ -->] 40%`

#### The Result: Indestructible Interaction
The Agent gets its data (X, Y, Z) regardless of the substrate.
Crucially, **Type Safety is preserved**. Even in the Terminal, the user cannot type "hello"; the interface restricts them to a Float within the `0.0 - 1.0` range.
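
The decomposition and the terminal rendering can be sketched as follows. The `Continuous` and `Composite` classes here are simplified stand-ins written for this example, not the SDK's types, and the slider format copies the ASCII mock-up above.

```python
from dataclasses import dataclass, field


@dataclass
class Continuous:
    label: str
    minimum: float = 0.0
    maximum: float = 1.0


@dataclass
class Composite:
    label: str
    children: list = field(default_factory=list)


def decompose(shape):
    """Flatten a (possibly nested) Composite into its atomic shapes."""
    if isinstance(shape, Composite):
        atoms = []
        for child in shape.children:
            atoms.extend(decompose(child))
        return atoms
    return [shape]


def clamp(shape, value):
    """Keep a value inside the shape's range, whatever the user does."""
    return max(shape.minimum, min(shape.maximum, value))


def ascii_slider(shape, value, width=10):
    """Render one Continuous shape as a terminal slider."""
    value = clamp(shape, value)
    fraction = (value - shape.minimum) / (shape.maximum - shape.minimum)
    filled = round(fraction * width)
    bar = "█" * filled + "░" * (width - filled)
    return f"{shape.label}: [<-- {bar} -->] {round(fraction * 100)}%"


vector = Composite("3D Vector", [
    Continuous("X-Axis"), Continuous("Y-Axis"), Continuous("Z-Axis"),
])
for atom, value in zip(decompose(vector), [0.3, 0.6, 0.4]):
    print(ascii_slider(atom, value))
```

Because `decompose` recurses, a Composite nested inside another Composite still ends up as a flat list of sliders, and `clamp` keeps every value inside its declared range.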

Through the **Composite DataShape**, we can scale from the simplest interaction (asking for a boolean) to the most complex (orchestrating a fluid dynamics simulation in the Metaverse) without changing a single line of the Agent's reasoning logic.

### B. The "Co-Op Memory"

Just as Cursor/Copilot learns code completion by observing whether you hit `Tab` (Accept) or keep typing (Reject), HCP records whether users engage with the UI or fall back to Chat.

However, HCP data is **richer** than code completion:

- Accept -> Agent's rationale explains why.
- Reject -> User's **new** prompt explains why.

With a history of these logs, every interaction is a high-fidelity training signal.

**Safety:**

- No need to manually label safety failures.

**Vibe:**

- No need to collect the user's engagement depth: vibe coder vs. technical coder.
- No need to collect the user's personality: Steve Jobs vs. hacker.
- No need to collect the user's favorite colors.

**A simple `accept/reject + rationale` tells it all.**
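
As one example of what can be done with such a history, the sketch below computes an acceptance rate and turns log entries into labeled examples. The function names and the output format are assumptions for illustration; the input keys follow the `log_entry` structure from the logging hook.

```python
def acceptance_rate(logs):
    """Share of interactions where the user took the agent's proposal."""
    if not logs:
        return 0.0
    accepted = sum(1 for entry in logs if entry["usedProposal"])
    return accepted / len(logs)


def to_training_examples(logs):
    """Pair each rationale with a label and the text that explains it."""
    examples = []
    for entry in logs:
        accepted = entry["usedProposal"]
        examples.append({
            "rationale": entry["rationale"],
            "label": "accept" if accepted else "reject",
            # Accept: the agent's rationale explains why.
            # Reject: the user's new prompt explains why.
            "explanation": entry["rationale"] if accepted else entry["value"],
        })
    return examples


logs = [
    {"rationale": "I need to calibrate risk.", "usedProposal": True, "value": 0.3},
    {"rationale": "Pick a deployment region.", "usedProposal": False,
     "value": "Skip this, deploy everywhere."},
]
print(acceptance_rate(logs))
print(to_training_examples(logs)[1]["explanation"])
```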


## 6. Advanced Scenarios: The "Crazy" Tests

Because HCP decouples **Parameters** (Topology) from **Rendering** (Client), it can drive complex environments like Physics Engines or XR Simulations without the AI Agent needing to know graphics programming.

### Test Case: The "Robotic Arm" (XR Agent)
**Concept:** Spatio-Temporal Adherence.

In Spatial Computing, "Acceptance" isn't just about clicking a button. It is about **Kinetic Alignment**. The user must match the Agent's proposed movement in both Space (Position/Rotation) and Time (Velocity).

*   **Scenario:** AR glasses showing a mechanic, in the first-person view, how to **install a component** on a robotic arm.
*   **Mechanism:** `OFFER` (a 3D volumetric Ghost Hand animation that waits for the user's hand to overlap it).
*   **Acceptance:** The user's physical hand synchronizes with the Ghost Hand's skeletal pose.

### The Agent Code
```python
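from hcp.adapter import HCPMixin

class IndustrialGuide(HCPMixin):

    def show_assembly_step(self):
        # The Agent generates a 60fps skeletal animation for the insertion gesture.
        # It calculates not just Position (XYZ), but Orientation (Quaternions).

        frames = []
        for t in range(120): # 2 seconds of motion at 60fps
            # Placeholder trajectory (an assumption for illustration): a straight
            # push along Z with a fixed wrist orientation. Swap in a real planner.
            x, y, z = 0.0, 0.0, t / 119
            qx, qy, qz, qw = 0.0, 0.0, 0.0, 1.0 # Identity quaternion
            frames.append({
                "tick": t,
                "root_position": [x, y, z], # Calculated trajectory
                "wrist_rotation": [qx, qy, qz, qw], # Calculated orientation
                "finger_states": { "thumb": 0.5, "index": 1.0 } # Grip strength
            })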

        return self.hcp_offer_symbolic(
            rationale="Guide user to insert Servo Motor A with specific torque.",
            label="Sync with Ghost Hand",

            # The Client interprets this as a Skeletal Animation Stream
            mimeType="application/x-skeletal-pose-stream+json",

            value={
                "target_object": "servo_motor_a",
                "animation_clip": frames,
                "tolerance_threshold": 0.95 # cosine similarity
            }
        )
```

### The Client Logic (The Glasses)
The AR Glasses run a real-time loop comparing the **User's Skeleton** (via Hand Tracking) to the **HCP Animation Clip**.

*   **Scenario A (Alignment):** The user "shadows" the ghost hand. The Euclidean distance between real joints and ghost joints remains near zero.
    *   **HCP Result:** `usedProposal: true`.
    *   *Insight:* The user has learned the physical skill.

*   **Scenario B (Deviation):** The user pauses, rotates their wrist 90 degrees, and reaches for another tool.
    *   **HCP Result:** `usedProposal: false`.
    *   **Value:** `{ "detected_tool": "screwdriver", "hand_pose": [...] }`.
    *   *Insight:* The Agent proposed a "Push" motion. The User corrected it with a "Screw" motion. The Agent now learns that this component requires fastening, not just insertion.

**HCP captures the physical correction of the workflow.**
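
The comparison loop on the glasses could be approximated as below. This sketch follows the `tolerance_threshold` comment in the Agent code and scores frames by cosine similarity; the flattened pose vectors, the averaging rule, and the function names are all assumptions rather than a prescribed client algorithm.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def used_proposal(ghost_frames, user_frames, threshold=0.95):
    """True when the mean per-frame similarity reaches the threshold."""
    scores = [cosine_similarity(g, u) for g, u in zip(ghost_frames, user_frames)]
    return bool(scores) and sum(scores) / len(scores) >= threshold


# Each frame is a flattened pose vector (hypothetical 3-value poses here).
ghost = [[0.0, 0.0, 1.0], [0.0, 0.1, 1.0]]
shadowing = [[0.0, 0.05, 1.0], [0.0, 0.1, 0.9]]  # user follows the ghost hand
deviating = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]   # user reaches elsewhere

print(used_proposal(ghost, shadowing))
print(used_proposal(ghost, deviating))
```

A production client would more likely compare per-joint positions and timing as well, but the accept or reject decision reduces to the same thresholded score.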


## 7. So What?
Built on these first principles, HCP enables a new paradigm:

- **Researchers** can study human-AI interaction through the high-fidelity, info-rich data collected via the HCP Co-Op Memory.
- **Developers** can build continuous learning features via Online Reinforcement Learning, using real-time HCP feedback to improve both power and experience.
- **Founders** can deploy agents immediately into CLI, Blender, Spatial Computing, or Brain-Computer Interfaces, without changing the backend.

**HCP is not designed just for this era.**

The code you write today is meant to keep working on the devices of the next 100 years.

