Metadata-Version: 2.4
Name: mcal-ai
Version: 0.1.0
Summary: Memory-Context Alignment Layer for Goal-Coherent AI Agents
Author: MCAL Team
License: MIT
Project-URL: Homepage, https://github.com/Shivakoreddi/mcal-ai
Project-URL: Documentation, https://github.com/Shivakoreddi/mcal-ai#readme
Project-URL: Repository, https://github.com/Shivakoreddi/mcal-ai.git
Project-URL: Issues, https://github.com/Shivakoreddi/mcal-ai/issues
Keywords: llm,memory,agents,context,ai,nlp
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: anthropic>=0.18.0
Requires-Dist: openai>=1.0.0
Requires-Dist: boto3>=1.28.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: faiss-cpu>=1.7.4
Requires-Dist: sentence-transformers>=2.2.0
Requires-Dist: sqlalchemy>=2.0.0
Requires-Dist: aiosqlite>=0.19.0
Requires-Dist: tiktoken>=0.5.0
Requires-Dist: tenacity>=8.2.0
Requires-Dist: rich>=13.0.0
Requires-Dist: python-dotenv>=1.0.0
Provides-Extra: langgraph
Requires-Dist: langgraph>=0.0.40; extra == "langgraph"
Requires-Dist: langchain-core>=0.1.0; extra == "langgraph"
Provides-Extra: crewai
Requires-Dist: crewai>=0.28.0; extra == "crewai"
Provides-Extra: autogen
Requires-Dist: pyautogen>=0.2.0; extra == "autogen"
Provides-Extra: langchain
Requires-Dist: langchain>=0.1.0; extra == "langchain"
Requires-Dist: langchain-core>=0.1.0; extra == "langchain"
Provides-Extra: integrations
Requires-Dist: langgraph>=0.0.40; extra == "integrations"
Requires-Dist: langchain-core>=0.1.0; extra == "integrations"
Requires-Dist: crewai>=0.28.0; extra == "integrations"
Requires-Dist: pyautogen>=0.2.0; extra == "integrations"
Provides-Extra: mem0
Requires-Dist: mem0ai>=0.1.0; extra == "mem0"
Provides-Extra: dev
Requires-Dist: pytest>=7.4.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: pytest-cov>=4.1.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: mypy>=1.5.0; extra == "dev"
Requires-Dist: pre-commit>=3.4.0; extra == "dev"
Requires-Dist: ipykernel>=6.25.0; extra == "dev"
Requires-Dist: jupyter>=1.0.0; extra == "dev"
Provides-Extra: eval
Requires-Dist: pandas>=2.0.0; extra == "eval"
Requires-Dist: matplotlib>=3.7.0; extra == "eval"
Requires-Dist: seaborn>=0.12.0; extra == "eval"
Requires-Dist: wandb>=0.15.0; extra == "eval"
Requires-Dist: scipy>=1.11.0; extra == "eval"
Provides-Extra: all
Requires-Dist: mcal-ai[dev,eval,integrations]; extra == "all"
Dynamic: license-file

# MCAL: Memory-Context Alignment Layer

> **Beyond Retrieval:** Intent-Preserving Memory for Goal-Coherent AI Agents

[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![arXiv](https://img.shields.io/badge/arXiv-coming--soon-b31b1b.svg)](https://arxiv.org/)
[![Standalone](https://img.shields.io/badge/architecture-standalone-green.svg)](docs/MCAL_DESIGN.md)

## What's New in v2.0

MCAL is now **fully standalone** - no external dependencies on Mem0 or other memory providers. The architecture includes:

- **Built-in Embedding Service** - OpenAI or Bedrock embeddings
- **Native Vector Search** - Cosine similarity with HNSW-like indexing
- **Graph Deduplication** - Automatic node merging with similarity detection
- **JSON Persistence** - Zero-config file-based storage
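
As a rough illustration of the similarity-based deduplication listed above, the sketch below greedily merges nodes whose embedding vectors exceed a cosine-similarity threshold. The `merge_similar` helper, the 0.9 threshold, and the toy three-dimensional vectors are assumptions made for this example; they are not taken from MCAL's actual implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def merge_similar(nodes, threshold=0.9):
    """Greedy dedup over (label, embedding) pairs.

    Returns a dict mapping each kept (canonical) label to the labels merged into it.
    The threshold is a hypothetical default, not an MCAL setting.
    """
    kept = []      # canonical (label, embedding) pairs seen so far
    merged = {}    # canonical label -> all labels folded into it
    for label, emb in nodes:
        for kept_label, kept_emb in kept:
            if cosine_similarity(emb, kept_emb) >= threshold:
                merged[kept_label].append(label)
                break
        else:
            # No sufficiently similar canonical node: keep this one as new.
            kept.append((label, emb))
            merged[label] = [label]
    return merged


# Toy embeddings: the two PostgreSQL spellings point in nearly the same direction.
nodes = [
    ("PostgreSQL", [0.90, 0.10, 0.00]),
    ("Postgres", [0.88, 0.12, 0.01]),
    ("MongoDB", [0.10, 0.90, 0.20]),
]
clusters = merge_similar(nodes)
print(clusters)
```

A greedy pass like this is order-dependent and quadratic in the number of nodes, which is acceptable for a sketch but is one reason real systems tend to pair it with an approximate nearest-neighbor index.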

## The Problem

Current AI agent memory systems store **facts** but lose **meaning**:

| What's Stored                        | What's Lost                                      |
|--------------------------------------|--------------------------------------------------|
| "User wants to visit Japan"          | **WHY** they chose Japan over other destinations  |
| "User booked a hotel in Shibuya"     | **WHAT** alternatives were considered (Shinjuku, Ginza, Asakusa) |
| "User plans to visit Kyoto"          | **HOW** this fits into the overall trip plan      |

This creates the **Memory-Context Alignment Paradox**: as conversations grow, agents remember *what* was said but forget *why* it mattered.

## Our Solution: Three Pillars

### 1. Intent Graph Preservation
Hierarchical goal structures that persist across sessions:
```
MISSION: Plan a 2-week vacation to Japan
├── GOAL: Book travel [✓ COMPLETED]
│   ├── TASK: Find flights [✓]
│   └── TASK: Reserve hotels [✓]
├── GOAL: Plan activities [ACTIVE]
│   ├── TASK: Research Tokyo attractions [✓]
│   ├── TASK: Plan Kyoto day trips [IN PROGRESS]
│   └── TASK: Book restaurants [PENDING]
└── GOAL: Pack and prepare [PENDING]
```
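
One way to picture the hierarchy above in code is a small tree of typed nodes. This is a minimal sketch: the `IntentNode` class, the `Status` enum, and the `open_items` helper are hypothetical names chosen for illustration, and the mission's own status (not shown in the diagram) is assumed to be `ACTIVE`.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "PENDING"
    ACTIVE = "ACTIVE"
    IN_PROGRESS = "IN PROGRESS"
    COMPLETED = "COMPLETED"


@dataclass
class IntentNode:
    kind: str                      # "MISSION", "GOAL", or "TASK"
    title: str
    status: Status = Status.PENDING
    children: list["IntentNode"] = field(default_factory=list)

    def open_items(self):
        """Yield this node and its descendants that are not completed, depth-first."""
        if self.status is not Status.COMPLETED:
            yield self
        for child in self.children:
            yield from child.open_items()


mission = IntentNode("MISSION", "Plan a 2-week vacation to Japan", Status.ACTIVE, [
    IntentNode("GOAL", "Book travel", Status.COMPLETED, [
        IntentNode("TASK", "Find flights", Status.COMPLETED),
        IntentNode("TASK", "Reserve hotels", Status.COMPLETED),
    ]),
    IntentNode("GOAL", "Plan activities", Status.ACTIVE, [
        IntentNode("TASK", "Research Tokyo attractions", Status.COMPLETED),
        IntentNode("TASK", "Plan Kyoto day trips", Status.IN_PROGRESS),
        IntentNode("TASK", "Book restaurants", Status.PENDING),
    ]),
    IntentNode("GOAL", "Pack and prepare", Status.PENDING),
])

open_titles = [node.title for node in mission.open_items()]
print(open_titles)
```

Walking the tree for unfinished items is what lets an agent answer "what is left to do?" without re-reading the whole conversation.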

### 2. Reasoning Chain Storage
Preserve **WHY** decisions were made, not just conclusions:
```
Decision: "Stay in Shibuya for Tokyo accommodation"
├── Alternatives: [Shinjuku, Ginza, Asakusa]
├── Rationale: "Central location, good nightlife, easy metro access"
├── Evidence: ["User wants to explore at night", "Prefers walkable areas"]
└── Trade-offs: ["More expensive but saves daily transit time"]
```
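
The decision record above maps naturally onto a plain data structure. The sketch below is an assumption about what such a record could look like; `DecisionRecord` and its `explain` method are illustrative names, not part of MCAL's documented API.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DecisionRecord:
    decision: str
    alternatives: list[str] = field(default_factory=list)
    rationale: str = ""
    evidence: list[str] = field(default_factory=list)
    trade_offs: list[str] = field(default_factory=list)

    def explain(self) -> str:
        """Render the WHY behind the decision as a single sentence."""
        rejected = ", ".join(self.alternatives) or "no recorded alternatives"
        return f"{self.decision} (chosen over {rejected}) because: {self.rationale}"


record = DecisionRecord(
    decision="Stay in Shibuya for Tokyo accommodation",
    alternatives=["Shinjuku", "Ginza", "Asakusa"],
    rationale="Central location, good nightlife, easy metro access",
    evidence=["User wants to explore at night", "Prefers walkable areas"],
    trade_offs=["More expensive but saves daily transit time"],
)

# Round-trip through JSON to show the record survives serialization intact.
serialized = json.dumps(asdict(record))
restored = DecisionRecord(**json.loads(serialized))
print(restored.explain())
```

Keeping alternatives and trade-offs as first-class fields, rather than folding them into free text, is what makes "why not Shinjuku?" answerable later.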

### 3. Goal-Aware Retrieval
Retrieve based on **objective achievement**, not just similarity:
```
Score = α × semantic_similarity
      + β × goal_alignment      ← NEW
      + γ × decision_relevance  ← NEW
      + δ × recency_decay
```
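
To make the weighted sum concrete, here is a small sketch of how such a score could be computed. The weight values, the exponential half-life form of `recency_decay`, and the assumption that every signal lies in [0, 1] are illustrative choices; the formula above does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    # Hypothetical weights for α, β, γ, δ; chosen only so they sum to 1.0.
    alpha: float = 0.4   # semantic_similarity
    beta: float = 0.3    # goal_alignment
    gamma: float = 0.2   # decision_relevance
    delta: float = 0.1   # recency_decay


def recency_decay(age_days: float, half_life_days: float = 30.0) -> float:
    """Assumed decay: the value halves every `half_life_days`."""
    return 0.5 ** (age_days / half_life_days)


def score(semantic_similarity: float, goal_alignment: float,
          decision_relevance: float, age_days: float,
          w: Weights = Weights()) -> float:
    """Weighted sum of the four retrieval signals."""
    return (w.alpha * semantic_similarity
            + w.beta * goal_alignment
            + w.gamma * decision_relevance
            + w.delta * recency_decay(age_days))


# A fresh memory that is similar to the query but unrelated to the active goal...
similar_only = score(0.9, 0.1, 0.0, age_days=0)
# ...versus a month-old memory that is less similar but tied to the goal and a decision.
goal_aligned = score(0.6, 0.9, 0.8, age_days=30)
print(similar_only, goal_aligned)
```

Under these assumed weights the goal-aligned memory outranks the merely similar one, which is the behavior the two new terms are meant to produce.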

## Installation

```bash
# Install from source (recommended)
git clone https://github.com/Shivakoreddi/mcal-ai.git
cd mcal-ai
pip install -e .

# Development installation with test dependencies
pip install -e ".[dev]"

# Optional: Install with legacy Mem0 support
pip install -e ".[mem0]"
```

### Requirements
- Python 3.11+
- `anthropic` - For LLM extraction (Claude)
- `openai` - For embeddings (optional; Bedrock can be used instead)

## Quick Start

```python
import asyncio
from mcal import MCAL

async def main():
    # Initialize MCAL (standalone by default)
    mcal = MCAL(
        llm_provider="anthropic",  # or "bedrock", "openai"
        embedding_provider="openai",  # or "bedrock" 
    )
    
    # Add conversation messages
    messages = [
        {"role": "user", "content": "I'm building a fraud detection ML pipeline"},
        {"role": "assistant", "content": "Great! Let's start with data ingestion..."},
        {"role": "user", "content": "I chose PostgreSQL over MongoDB for the data store"},
        {"role": "assistant", "content": "PostgreSQL is a solid choice for structured fraud data..."}
    ]
    
    result = await mcal.add(messages, user_id="user_123")
    
    # Access the unified graph with goals, decisions, and reasoning
    print(f"Extracted {result.unified_graph.node_count} nodes")
    print(f"Active goals: {result.unified_graph.get_active_goals()}")
    print(f"Decisions: {result.unified_graph.get_all_decisions_with_detail()}")
    
    # Search for relevant context
    search_results = await mcal.search(
        query="What database did the user choose?",
        user_id="user_123"
    )
    
    # Get formatted context for LLM
    context = mcal.get_context(
        query="What should we focus on next?",
        user_id="user_123",
        max_tokens=4000
    )

asyncio.run(main())
```

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                         MCAL v2.0                               │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                 Application Layer                         │  │
│  │  mcal.add()  │  mcal.search()  │  mcal.get_context()     │  │
│  └───────────────────────────────────────────────────────────┘  │
│                              │                                   │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │              Unified Deep Extractor                       │  │
│  │    Single LLM call extracts: GOALS | DECISIONS | FACTS   │  │
│  └───────────────────────────────────────────────────────────┘  │
│                              │                                   │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │              Unified Graph (6 Nodes, 13 Edges)           │  │
│  │    PERSON | THING | CONCEPT | GOAL | DECISION | ACTION   │  │
│  └───────────────────────────────────────────────────────────┘  │
│                              │                                   │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │               Standalone Services                         │  │
│  │  ┌──────────────┐  ┌─────────────┐  ┌─────────────────┐  │  │
│  │  │  Embeddings  │  │Vector Search│  │  Deduplication  │  │  │
│  │  │  (OpenAI/    │  │(Cosine Sim) │  │  (Similarity    │  │  │
│  │  │   Bedrock)   │  │             │  │   Merging)      │  │  │
│  │  └──────────────┘  └─────────────┘  └─────────────────┘  │  │
│  └───────────────────────────────────────────────────────────┘  │
│                              │                                   │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                JSON File Persistence                      │  │
│  │    ~/.mcal/users/{user_id}/graph.json                    │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
```
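
The persistence layer in the diagram stores one JSON file per user under `~/.mcal/users/{user_id}/graph.json`. The sketch below shows one way such zero-config storage could work. The `save_graph` and `load_graph` helpers, the `root` override, and the `{"nodes": [], "edges": []}` schema are assumptions for illustration, not MCAL's actual on-disk format.

```python
import json
import tempfile
from pathlib import Path


def graph_path(user_id: str, root=None) -> Path:
    """Build <root>/users/<user_id>/graph.json; root defaults to ~/.mcal."""
    base = Path(root) if root is not None else Path.home() / ".mcal"
    return base / "users" / user_id / "graph.json"


def save_graph(graph: dict, user_id: str, root=None) -> Path:
    path = graph_path(user_id, root)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a sibling temp file first, then rename, so a crash mid-write
    # does not leave a truncated graph.json behind.
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(graph, indent=2), encoding="utf-8")
    tmp.replace(path)
    return path


def load_graph(user_id: str, root=None) -> dict:
    path = graph_path(user_id, root)
    if not path.exists():
        return {"nodes": [], "edges": []}   # assumed empty-graph shape
    return json.loads(path.read_text(encoding="utf-8"))


# Use a temporary directory here so the example does not touch ~/.mcal.
with tempfile.TemporaryDirectory() as tmp_root:
    graph = {
        "nodes": [{"id": "n1", "type": "DECISION", "label": "Use PostgreSQL"}],
        "edges": [],
    }
    saved_to = save_graph(graph, "user_123", root=tmp_root)
    roundtrip = load_graph("user_123", root=tmp_root)
    missing = load_graph("unknown_user", root=tmp_root)

print(saved_to.parts[-3:], roundtrip == graph)
```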

## Project Structure

```
mcal-ai/
├── src/mcal/
│   ├── mcal.py              # Main MCAL class
│   ├── core/
│   │   ├── unified_extractor.py  # Single-pass extraction
│   │   ├── unified_graph.py      # Graph with rich attributes
│   │   ├── embedding_service.py  # Embedding generation
│   │   ├── vector_index.py       # Similarity search
│   │   └── deduplication.py      # Node merging
│   ├── providers/
│   │   └── llm_providers.py      # Anthropic, OpenAI, Bedrock
│   └── storage/
│       └── sqlite_store.py       # Persistence layer
├── experiments/
├── data/
│   ├── synthetic/               # Generated conversations
│   └── benchmarks/              # MCAL-Bench dataset
├── tests/
└── docs/
```

## Evaluation: MCAL-Bench

We introduce **MCAL-Bench**, the first benchmark for reasoning preservation and goal coherence:

| Metric | What It Measures |
|--------|------------------|
| **RPS** (Reasoning Preservation Score) | Can the system explain WHY a decision was made? |
| **GCS** (Goal Coherence Score) | Do responses align with the user's active objectives? |
| **TER** (Token Efficiency Ratio) | Quality per token relative to the full-context baseline |

## Results (Preliminary)

| System | RPS | GCS | TER |
|--------|-----|-----|-----|
| Full Context | 0.85 | 0.82 | 1.0x |
| Summarization | 0.45 | 0.58 | 2.1x |
| Mem0 | 0.52 | 0.61 | 3.2x |
| **MCAL (Ours)** | **0.78** | **0.79** | **3.8x** |

## Roadmap

- [x] Problem formulation & research
- [ ] Week 1: Foundation (baseline + data)
- [ ] Week 2: Core algorithms
- [ ] Week 3: Benchmark & evaluation
- [ ] Week 4: Paper draft
- [ ] Week 5: Release & arXiv

## Citation

```bibtex
@article{mcal2026,
  title={MCAL: Memory-Context Alignment for Goal-Coherent AI Agents},
  author={Koreddi, Shiva},
  journal={arXiv preprint},
  year={2026}
}
```

## License

MIT License - see [LICENSE](LICENSE) for details.

## Acknowledgments

Built on insights from:
- [MemGPT](https://github.com/cpacker/MemGPT) - OS-inspired memory hierarchy
- [Reflexion](https://arxiv.org/abs/2303.11366) - Verbal self-reflection

---

## Migration from v1.x

If you were using MCAL with the Mem0 backend, see [STANDALONE_MIGRATION.md](docs/STANDALONE_MIGRATION.md) for the migration guide.

**Key changes:**
- `mem0_config` and `mem0_api_key` parameters are deprecated
- `use_standalone_backend` is deprecated (standalone is now default)
- Install `mcal-ai[mem0]` for legacy Mem0 support
