Metadata-Version: 2.4
Name: ashmatics-aigov-tools
Version: 0.7.1
Summary: Ashmatics Clinical AI Governance Framework - Content management, validation, and compilation
Author-email: JF Kalafut <john@asherinformatics.com>, Asher Informatics PBC <info@ashmatics.com>
License: Apache-2.0
Project-URL: Homepage, https://github.com/AshMatics/ashmatics-aigov-tools
Project-URL: Repository, https://github.com/AshMatics/ashmatics-aigov-tools
Project-URL: Documentation, https://github.com/AshMatics/ashmatics-aigov-tools/blob/main/README.md
Project-URL: Bug Tracker, https://github.com/AshMatics/ashmatics-aigov-tools/issues
Keywords: ashmatics,ai-governance,framework,healthcare-ai,clinical-ai
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Healthcare Industry
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.11
Description-Content-Type: text/markdown
Requires-Dist: ashmatics-tools[azure-ai,db-mongo,storage]<0.8.0,>=0.7.2
Requires-Dist: ashmatics-datamodels<0.4.0,>=0.3.2
Requires-Dist: azure-storage-blob>=12.19.0
Requires-Dist: pymongo>=4.0.0
Requires-Dist: motor>=3.3.0
Requires-Dist: graphql-core>=3.2.0
Requires-Dist: markdown>=3.5.0
Requires-Dist: pyyaml>=6.0.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: pathvalidate>=3.0.0
Requires-Dist: rich>=13.0.0
Requires-Dist: httpx>=0.27.0
Requires-Dist: numexpr>=2.14.1
Requires-Dist: bottleneck>=1.6.0
Requires-Dist: pandas>=3.0.0
Requires-Dist: azure-ai-projects>=2.0.0b3
Requires-Dist: azure-identity>=1.25.1
Provides-Extra: local-llm
Requires-Dist: ashmatics-tools[ollama]<0.8.0,>=0.7.2; extra == "local-llm"
Provides-Extra: foundry
Requires-Dist: azure-ai-projects>=1.0.0b1; extra == "foundry"
Requires-Dist: azure-identity>=1.15.0; extra == "foundry"
Requires-Dist: classy-fire>=0.2.1; extra == "foundry"
Requires-Dist: openai>=1.0.0; extra == "foundry"
Provides-Extra: dev
Requires-Dist: pytest>=7.4.0; extra == "dev"
Requires-Dist: pytest-cov>=4.1.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: pytest-mock>=3.10.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: mypy>=1.5.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: pre-commit>=3.0.0; extra == "dev"

# Ashmatics Clinical AI Governance Framework

**Version:** 0.7.0
**Last Updated:** 2026-01-28

## Recent Updates

**2026-01-28 (v0.7.0)**: **Framework Explorer Pipeline** (ASHPSTUDIO-184) - Complete end-to-end pipeline for AI Governance framework exploration. Four-stage architecture: Query Understanding (rule-based + LLM refinement), Retrieval (KB MCP), Synthesis (GPT-4o via Azure AI Foundry), and Validation (GPT-4o-mini). Supports 6 query intents (domain_explanation, structural, regulatory_crosswalk, gap_analysis, known_item, comparative). See [Framework Explorer Pipeline](#framework-explorer-pipeline) section and [services/README.md](src/ashmatics_aigov_tools/services/README.md) for details.

**2026-01-25 (v0.6.0)**: **AIGov Query Service SDK** (ASHPSTUDIO-184) - Major refactoring to create a reusable SDK for querying the AI Governance Framework. New `services/` module provides `AIGovQueryService` class that can be imported by applications (aigov-explorer, coreapp) without CLI dependencies. Supports both MCP backend (KB server) and direct CosmosDB mode. Includes `KBMCPClient` for typed MCP server communication with graph traversal support. CLI refactored to thin wrapper using the SDK. See [AIGov Query Service SDK](#aigov-query-service-sdk) section below.

**2026-01-20 (v0.5.0)**: **Registry-Based Ontology Loading** (ASHKBAPP-60) - Updated `cai-init-ontology` to load policy domains, process domains, and relationships from the **ashmatics-framework-registry.json** file instead of hardcoded data. Now loads the correct 12 policy domains and 12 process domains with full metadata (tier, ISO controls, base practices, regulatory mappings). Creates 51 policy→process "specifies" relationships automatically. Registry file is auto-discovered from sibling repo or can be specified with `--registry-file`.

**2026-01-20 (v0.4.0)**: **ASHCAI Ontology Release** - Added ASHCAI (AshMatics Clinical AI Governance) ontology management tools (ASHPSTUDIO-183). New CLI commands: `cai-init-ontology` for initializing ontology collections in MongoDB, and `cai-sync-regulatory` for syncing regulatory crosswalk mappings from YAML source files. Added `RegulatoryYAMLParser` for parsing `*-logic.yaml` files from ashmatics-policy-process-builder. Supports NIST AI RMF, Joint Commission RUAIH, ONC HTI-1, Colorado AI Act, and ACA Section 1557 frameworks.

**2026-01-17 (v0.3.0)**: **Major release** - Added `cai-query` RAG-powered query tool for interactive Q&A over the AI Governance Framework. Features Azure OpenAI and Ollama LLM support, Simple and Multi-Query RAG strategies, interactive REPL mode, and streaming responses. Also added `cai-extract-relationships` CLI for graph relationship extraction (ASHKBAPP-60). Updated `ashmatics-tools` dependency with new extras for storage, database, and LLM support.

**2025-12-22**: Added vector index management to `generate_embeddings_aigov.py` with automatic IVF index creation, `--stats-only`, `--index-only`, `--reset`, and `--clear-prefix` flags. Created standalone `create_vector_index.py` utility for generic MongoDB/CosmosDB collections. 2,405 vectors now indexed with `vector-ivf` index in CosmosDB vCore.

**2025-12-21**: Added `generate_embeddings_aigov.py` script for creating vector embeddings from framework content. Features include semantic content filtering, format filtering (md/yaml/json), domain filtering, concurrent processing, rich progress display, and token usage tracking. Vectors stored in `aigov_framework_vectors` collection for RAG/semantic search.

**2025-12-20**: Successfully imported CAI Governance Framework v0.8.0 with new artifact_id slug naming. 309 documents stored in MongoDB, 342 files in Azure, 341 records in PostgreSQL. Added wizard-content-schemas support and fixed domain extraction for SOPs/WPs. New `cai-framework-manager` CLI tool for full import workflow.

**2025-12-19**: Added human-friendly `artifact_id` slug mapping for API-friendly document lookups. See [Artifact Slug Naming Convention](#artifact-slug-naming-convention) below.

**2025-12-01**: Added a dependency on the Ashmatics common data model package, which is now used for all interactions with the KB. Fixed the errors found in code review, tested, and bumped the version to v0.2.0.

A Python package for managing, validating, and deploying the Ashmatics Clinical AI Governance Framework content.

## Overview

This package provides tools for the complete lifecycle management of the Ashmatics CAI Governance Framework, including:

- **Content Validation**: Verify framework structure and integrity
- **Azure Storage Management**: Upload and version framework content
- **Database Registration**: Register content with PostgreSQL (Hasura) and MongoDB
- **Hybrid Persistence**: Implement three-layer architecture (Azure + Postgres + MongoDB)
- **MongoDB Compilation**: Convert Markdown content into structured JSON views for AI agents

## Architecture - Hybrid Persistence Strategy

The package implements a three-layer hybrid persistence approach:

### 1. Azure Blob Storage (Source of Truth)
- Immutable Markdown source files with content-addressed IDs
- Versioned storage with `v{version}/` prefixes
- Human-readable format for web presentation
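
As a rough illustration of the two ideas above, the sketch below derives a content-addressed ID from file bytes and builds a `v{version}/` blob path. The choice of SHA-256, the helper names, and the example file path are assumptions made for illustration; the package's actual hashing and path layout may differ.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Return a hex SHA-256 digest; identical bytes always yield the same ID."""
    return hashlib.sha256(data).hexdigest()


def versioned_blob_path(version: str, relative_path: str) -> str:
    """Prefix a repository-relative path with the `v{version}/` convention."""
    return f"v{version}/{relative_path.lstrip('/')}"


# Hypothetical source file and path, used only to exercise the helpers.
source = b"# Performance Validation\n"
print(content_id(source)[:12])
print(versioned_blob_path("0.8.0", "process-domain/PV-Overview.md"))
```

Because the ID depends only on the bytes, re-uploading unchanged content produces the same ID, which is what makes deduplication and integrity checks straightforward.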

### 2. PostgreSQL via Hasura (Relational Metadata)
- Base artifact metadata and version tracking
- File registry for search and organization
- Tables: `framework_content_registry`, `framework_content_registry_files`

### 3. MongoDB/CosmosDB (Compiled Views for AI Agents)
- Structured JSON views with sections, headers, placeholders
- Policy bindings and traceability information
- Collection: `framework_compiled_views`

## Installation

### From Git Repository (Private)

```bash
# Using pip
pip install git+https://github.com/JFK-Ashmatics/ashmatics-aigov-tools.git

# Using uv
uv add git+https://github.com/JFK-Ashmatics/ashmatics-aigov-tools.git

# With development dependencies
pip install "ashmatics-aigov-tools[dev] @ git+https://github.com/JFK-Ashmatics/ashmatics-aigov-tools.git"
```

### From Local Development

```bash
# Clone the repository
git clone https://github.com/JFK-Ashmatics/ashmatics-aigov-tools.git
cd ashmatics-aigov-tools

# Install in editable mode with dev dependencies
pip install -e ".[dev]"
```

### Using uv (Recommended)

For local development, we recommend using [uv](https://github.com/astral-sh/uv) for faster, more reliable dependency management:

```bash
# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone and set up the project
git clone https://github.com/JFK-Ashmatics/ashmatics-aigov-tools.git
cd ashmatics-aigov-tools

# Sync dependencies (creates/updates .venv automatically)
uv sync

# Run commands using the project's isolated environment
uv run register-framework-content --version=0.7.0 --hybrid --dry-run
```

**Important:** When using `uv`, always prefix commands with `uv run` to ensure they execute in the project's virtual environment rather than your global Python installation.

## Dependencies

This package depends on [ashmatics-tools](https://github.com/JFK-Ashmatics/ashmatics-tools) for shared utilities:
- Document parsing (Docling, LlamaParse)
- Chunking strategies
- Embedding generation
- Vector storage
- Base importers and processors

## Usage

### Command Line Interface

#### Register Existing Framework Content

```bash
# Register existing Azure content with PostgreSQL only
uv run register-framework-content --version=0.7.0

# Full hybrid registration (PostgreSQL + MongoDB)
uv run register-framework-content --version=0.7.0 --hybrid

# Dry run to see what would be registered
uv run register-framework-content --version=0.7.0 --hybrid --dry-run

# Validate artifact ID mapping consistency
uv run register-framework-content --version=0.7.0 --hybrid --validate-mapping
```

**Note:** If not using `uv`, you can run commands directly after installation, but `uv run` is recommended for local development to ensure proper environment isolation.

#### Generate Embeddings and Manage Vector Index

The `generate_embeddings_aigov.py` script creates vector embeddings from the compiled framework content in MongoDB, storing them in a separate collection with automatic vector index creation for semantic search and RAG applications.

```bash
# Check current vector collection stats
uv run python src/ashmatics_aigov_tools/scripts/generate_embeddings_aigov.py --stats-only

# Create index on existing vectors (no embedding)
uv run python src/ashmatics_aigov_tools/scripts/generate_embeddings_aigov.py --index-only

# Generate embeddings for semantic Markdown content (recommended)
uv run python src/ashmatics_aigov_tools/scripts/generate_embeddings_aigov.py \
  --semantic-only --format md --quiet

# Re-embed specific content (clear old vectors first)
uv run python src/ashmatics_aigov_tools/scripts/generate_embeddings_aigov.py \
  --clear-prefix SOP-PV --include SOP-PV --format md
```

**Key Options:**
| Option | Description |
|--------|-------------|
| `--stats-only` | Show collection statistics without changes |
| `--index-only` | Create vector index on existing vectors (no embedding) |
| `--semantic-only` | Only embed high-value content (FW-, POL-, PROC-, AREA-, SOP-, CTRL-, ISO-, WIZ-, LOG-) |
| `--format md` | Only process Markdown files (skip JSON/YAML) |
| `--domain CODE` | Filter by process domain (PV, SA, MON, OVR, ORG, IT, RM, HO, EF, TR, UG) |
| `--reset` | Delete ALL vectors and index before processing (requires confirmation) |
| `--clear-prefix PREFIX` | Delete vectors matching artifact_id prefix before processing |
| `--index-type ivf\|hnsw` | Vector index type (default: ivf for CosmosDB compatibility) |
| `--no-index` | Skip automatic index creation after embedding |
| `--concurrency N` | Process N documents concurrently (1-3, default: 1) |
| `--dry-run` | Preview without generating embeddings |

**Output:** Vectors stored in `ai_strategy.aigov_framework_vectors` with IVF vector index for fast similarity search.
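
To make the `--semantic-only` and `--include` filters more concrete, here is a minimal sketch of prefix-based selection. It assumes the script matches on the start of `artifact_id`, as the table suggests; the function names are hypothetical and this is not the script's own code.

```python
from typing import Iterable, List, Optional

# Prefixes listed for --semantic-only in the options table above.
SEMANTIC_PREFIXES = (
    "FW-", "POL-", "PROC-", "AREA-", "SOP-", "CTRL-", "ISO-", "WIZ-", "LOG-",
)


def is_semantic(artifact_id: str) -> bool:
    """True when the artifact_id starts with one of the high-value prefixes."""
    return artifact_id.startswith(SEMANTIC_PREFIXES)


def select_for_embedding(
    artifact_ids: Iterable[str],
    semantic_only: bool = False,
    include_prefix: Optional[str] = None,
) -> List[str]:
    """Apply the two prefix filters in sequence and keep the input order."""
    selected = []
    for artifact_id in artifact_ids:
        if semantic_only and not is_semantic(artifact_id):
            continue
        if include_prefix and not artifact_id.startswith(include_prefix):
            continue
        selected.append(artifact_id)
    return selected


ids = ["SOP-PV-01", "WP-RM-02", "POLT-AGP-Template", "FW-Overview"]
print(select_for_embedding(ids, semantic_only=True))
```

Because each prefix ends with a hyphen, `POLT-` and `WIZS-` artifacts are not mistaken for `POL-` or `WIZ-` ones.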

#### Generic Vector Index Utility

For managing vector indexes on any MongoDB/CosmosDB collection:

```bash
uv run python src/ashmatics_aigov_tools/scripts/create_vector_index.py \
  --db mydb --collection vectors --check-only
```

#### Query the AI Governance Framework (RAG)

The `cai-query` tool provides RAG-powered question answering over the AI Governance Framework content. It runs vector similarity search over the embedded framework content (2,405 vectors as of the 2025-12-22 update) and generates answers using Azure OpenAI or Ollama.

```bash
# Interactive mode (recommended for exploration)
uv run cai-query

# Single query
uv run cai-query --query "What is performance validation?"

# Use Ollama instead of Azure OpenAI
uv run cai-query --ollama --model llama3.2

# Multi-query RAG for better retrieval coverage
uv run cai-query --multi-query --query "Explain risk management"

# Stream the response
uv run cai-query --stream --query "What are governance controls?"

# Retrieve more sources
uv run cai-query --top-k 10 --query "What is the PV process?"
```

**Key Options:**
| Option | Description |
|--------|-------------|
| `--query`, `-q` | Single query (omit for interactive mode) |
| `--ollama` | Use Ollama instead of Azure OpenAI |
| `--model` | LLM model/deployment name |
| `--multi-query` | Use multi-query RAG for better coverage |
| `--stream` | Stream the response |
| `--top-k N` | Number of sources to retrieve (default: 5) |
| `--temperature` | LLM temperature (default: 0.7) |
| `--no-context` | Don't display retrieved sources |

**Interactive Commands:**
| Command | Description |
|---------|-------------|
| `help` | Show available commands |
| `stats` | Show vector collection statistics |
| `top N` | Set number of sources to retrieve |
| `quit` | Exit the REPL |

**Environment Variables** (from `ashmatics-tools.env`):
| Variable | Description |
|----------|-------------|
| `AZ_MONGO_CONNECTION_STRING` | CosmosDB connection string |
| `AZURE_OPENAI_API_KEY` | Azure OpenAI API key |
| `AZURE_OPENAI_ENDPOINT` | Azure OpenAI endpoint |
| `AZURE_OPENAI_CHAT_DEPLOYMENT` | Chat model deployment (default: gpt-4o) |
| `AZURE_OPENAI_DEPLOYMENT_NAME` | Embedding model deployment |
| `OLLAMA_ENDPOINT` | Ollama server URL (default: http://localhost:11434) |
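
The sketch below shows one way to read these variables with the defaults from the table. `QueryEnv` and `load_query_env` are hypothetical names introduced here, not part of the package; the tool itself may load its settings differently (for example via `python-dotenv`).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class QueryEnv:
    """Hypothetical container for the settings in the table above."""

    mongo_connection_string: Optional[str]
    azure_api_key: Optional[str]
    azure_endpoint: Optional[str]
    chat_deployment: str
    embedding_deployment: Optional[str]
    ollama_endpoint: str


def load_query_env(environ: Optional[Mapping[str, str]] = None) -> QueryEnv:
    """Read settings from a mapping (os.environ by default), applying defaults."""
    env = os.environ if environ is None else environ
    return QueryEnv(
        mongo_connection_string=env.get("AZ_MONGO_CONNECTION_STRING"),
        azure_api_key=env.get("AZURE_OPENAI_API_KEY"),
        azure_endpoint=env.get("AZURE_OPENAI_ENDPOINT"),
        chat_deployment=env.get("AZURE_OPENAI_CHAT_DEPLOYMENT", "gpt-4o"),
        embedding_deployment=env.get("AZURE_OPENAI_DEPLOYMENT_NAME"),
        ollama_endpoint=env.get("OLLAMA_ENDPOINT", "http://localhost:11434"),
    )


settings = load_query_env({})
print(settings.chat_deployment, settings.ollama_endpoint)
```

Passing an explicit mapping instead of reading `os.environ` directly keeps the loader easy to test.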

#### Extract Framework Relationships (Graph)

The `cai-extract-relationships` tool extracts relationships between framework artifacts for graph database integration (ASHKBAPP-60).

```bash
# Extract relationships to JSON
uv run cai-extract-relationships --output relationships.json

# Preview without writing
uv run cai-extract-relationships --dry-run
```

#### Initialize ASHCAI Ontology

The `cai-init-ontology` tool initializes the ASHCAI (AshMatics Clinical AI Governance) ontology collections in MongoDB, loading policy domains, process domains, and relationships from the **ashmatics-framework-registry.json** file (the single source of truth).

```bash
# Initialize with auto-discovered registry file
uv run cai-init-ontology

# Initialize with explicit registry file path
uv run cai-init-ontology --registry-file /path/to/ashmatics-framework-registry.json

# Dry run to preview what would be created
uv run cai-init-ontology --dry-run

# Force re-initialization (updates existing data)
uv run cai-init-ontology --force

# Custom MongoDB connection
uv run cai-init-ontology --mongodb-uri "mongodb://localhost:27017" --database "ashmatics_dev"
```

**Key Options:**
| Option | Description |
|--------|-------------|
| `--registry-file` | Path to ashmatics-framework-registry.json (auto-discovered if not specified) |
| `--dry-run` | Preview what would be done without making changes |
| `--force` | Drop existing collections before creating |
| `--mongodb-uri` | MongoDB connection URI (default: AZ_MONGO_CONNECTION_STRING env var) |
| `--database` | MongoDB database name (default: MONGODB_DATABASE env var) |

**Registry Auto-Discovery:**
The registry file is automatically discovered from:
- Sibling `ashmatics-policy-process-builder` repository
- `~/GitHub/AsherInformatics/ashmatics-policy-process-builder/`
- `/app/config/` (for Docker deployments)
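
A minimal sketch of this kind of discovery follows. It assumes the locations are tried in the order listed and that the registry file sits directly inside each directory; both are assumptions, and the helper names are hypothetical.

```python
from pathlib import Path
from typing import Iterable, List, Optional

REGISTRY_NAME = "ashmatics-framework-registry.json"


def candidate_dirs(repo_root: Path) -> List[Path]:
    """Directories to search, in the order listed above (order is assumed)."""
    return [
        repo_root.parent / "ashmatics-policy-process-builder",
        Path.home() / "GitHub" / "AsherInformatics" / "ashmatics-policy-process-builder",
        Path("/app/config"),
    ]


def discover_registry(dirs: Iterable[Path]) -> Optional[Path]:
    """Return the first existing registry file, or None if none is found."""
    for directory in dirs:
        candidate = directory / REGISTRY_NAME
        if candidate.is_file():
            return candidate
    return None


# Prints a path if a registry is present in one of the locations, else None.
print(discover_registry(candidate_dirs(Path.cwd())))
```

When discovery returns nothing, passing `--registry-file` explicitly is the documented fallback.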

**What Gets Loaded:**
- **12 Policy Domains**: SAP, AGP, RMP, HOP, EFP, MMP, LTP, CMP, DGP, SEP, VAL, TPP
  - Includes: tier, category, ISO 42001 controls, exemplar controls, regulatory mappings
- **12 Process Domains**: ORG, OVR, PV, SA, IT, RM, MON, HO, EF, TR, UG, SE
  - Includes: base practices, policy bindings, upstream/downstream dependencies
- **51 Relationships**: Policy→Process "specifies" relationships from `policyToProcess` mappings
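
To illustrate how a `policyToProcess` mapping could expand into "specifies" relationships, here is a small sketch. The mapping shape (policy code to a list of process codes), the sample values, and the record field names are all assumptions for illustration; the real registry schema and the documents written to `ashcai_relationships` may look different.

```python
from typing import Dict, List


def expand_relationships(policy_to_process: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Emit one 'specifies' record per (policy, process) pair."""
    relationships = []
    for policy, processes in sorted(policy_to_process.items()):
        for process in processes:
            relationships.append(
                {"source": policy, "target": process, "type": "specifies"}
            )
    return relationships


# Illustrative mapping only; not taken from the actual registry file.
sample = {"MMP": ["MON", "PV"], "RMP": ["RM"]}
for record in expand_relationships(sample):
    print(record)
```

The total relationship count is simply the sum of the list lengths, which is how a fixed set of mappings yields a fixed number of relationships.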

**Collections Created:**
- `ashcai_policy_domains` - Policy domain definitions (MMP-001, AGP-001, etc.)
- `ashcai_process_domains` - Process domain definitions (MON, RM, SA, etc.)
- `ashcai_base_practices` - Base practices (MON.BP01, etc.)
- `ashcai_sop_templates` - SOP templates (SOP-MON-01, etc.)
- `ashcai_work_product_templates` - Work product templates
- `ashcai_exemplar_controls` - Exemplar controls (EXC-A4-01, etc.)
- `ashcai_regulatory_requirements` - Regulatory requirements
- `ashcai_regulatory_frameworks` - Framework metadata
- `ashcai_relationships` - Entity relationships (policy→process, etc.)
- `ashcai_regulatory_crosswalks` - Regulatory crosswalk mappings

#### Sync Regulatory Crosswalks

The `cai-sync-regulatory` tool synchronizes regulatory crosswalk mappings from YAML source files (in `ashmatics-policy-process-builder/wizard-content-schemas/`) into the ASHCAI ontology collections.

```bash
# Sync all regulatory frameworks
uv run cai-sync-regulatory

# List available frameworks
uv run cai-sync-regulatory --list

# Sync specific frameworks only
uv run cai-sync-regulatory --frameworks NIST-AI-RMF JC-RUAIH

# Dry run to preview
uv run cai-sync-regulatory --dry-run

# Force full resync (delete existing crosswalks first)
uv run cai-sync-regulatory --force

# Custom YAML directory
uv run cai-sync-regulatory --yaml-dir /path/to/wizard-content-schemas
```

**Key Options:**
| Option | Description |
|--------|-------------|
| `--list` | List available frameworks and exit |
| `--frameworks` | Specific framework IDs to sync (e.g., NIST-AI-RMF JC-RUAIH) |
| `--dry-run` | Preview without making changes |
| `--force` | Delete existing crosswalks before sync |
| `--yaml-dir` | Path to wizard-content-schemas directory |

**Supported Regulatory Frameworks:**
| Framework ID | Description |
|--------------|-------------|
| `NIST-AI-RMF` | NIST AI Risk Management Framework |
| `JC-RUAIH` | Joint Commission Requirements for Using AI in Healthcare |
| `ONC-HTI1` | ONC Health IT Certification (HTI-1) |
| `CO-AI-ACT` | Colorado AI Act |
| `ACA-1557` | ACA Section 1557 |

## Framework Explorer Pipeline

**Added 2026-01-28 (ASHPSTUDIO-184)**

The Framework Explorer Pipeline provides intelligent query processing for the ASHCAI AI Governance Framework. It orchestrates four stages to deliver accurate, well-cited responses:

| Stage | Component | Purpose |
|-------|-----------|---------|
| 1. Query Understanding | `QueryPreprocessor` | Intent classification + entity extraction |
| 2. Retrieval | `QueryOrchestrator` | Fetch relevant content via KB MCP |
| 3. Synthesis | `ResponseSynthesizer` | Generate response with GPT-4o |
| 4. Validation | `ResponseValidator` | Quality check with GPT-4o-mini |

**Quick Start:**

```python
import asyncio

from ashmatics_aigov_tools.services import (
    FrameworkExplorerPipeline, KBMCPClient, MCPClientConfig, PipelineConfig
)


async def main() -> None:
    async with KBMCPClient(MCPClientConfig.from_env()) as client:
        pipeline = FrameworkExplorerPipeline(client, PipelineConfig())
        result = await pipeline.explore("What is Risk Management?")
        print(result.response)  # Synthesized answer
        print(result.citations)  # Supporting evidence


asyncio.run(main())
```

**Supported Query Types:**
- Domain explanations ("What is Risk Management?")
- Structural queries ("What SOPs implement MMP-001?")
- Regulatory crosswalks ("How does ASHCAI map to NIST AI RMF?")
- Gap analysis ("What controls are we missing for NIST?")
- Known item lookup ("Show me SOP-PV-01")
- Comparative analysis ("How is ASHCAI different from ISO 42001?")
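
The rule-based part of query understanding can be pictured with a small keyword classifier like the one below. This is only a sketch: the patterns, their ordering, and the `classify_intent` name are assumptions chosen to fit the six example queries, not the rules used by `QueryPreprocessor`, and the LLM refinement step is omitted.

```python
import re

# Ordered rules: the first matching pattern wins (order is an assumption).
RULES = [
    ("known_item", re.compile(r"^\s*(show|open|get|display)\b", re.I)),
    ("gap_analysis", re.compile(r"\b(missing|gaps?|lack|lacking)\b", re.I)),
    ("comparative", re.compile(r"\b(different from|compare[ds]?|versus|vs)\b", re.I)),
    ("regulatory_crosswalk", re.compile(r"\b(maps?|mapping|crosswalk)\b", re.I)),
    ("structural", re.compile(r"\b(implements?|which)\b", re.I)),
]


def classify_intent(query: str) -> str:
    """Return the first matching intent, defaulting to domain_explanation."""
    for intent, pattern in RULES:
        if pattern.search(query):
            return intent
    return "domain_explanation"


for question in [
    "What is Risk Management?",
    "What SOPs implement MMP-001?",
    "How does ASHCAI map to NIST AI RMF?",
    "What controls are we missing for NIST?",
    "Show me SOP-PV-01",
    "How is ASHCAI different from ISO 42001?",
]:
    print(f"{classify_intent(question):22} {question}")
```

Keyword rules like these are cheap and predictable, while a follow-up LLM pass can handle the phrasings they miss.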

> **Full documentation:** See [services/README.md](src/ashmatics_aigov_tools/services/README.md) for configuration, architecture details, and API reference.

---

## AIGov Query Service SDK

**Added 2026-01-25 (ASHPSTUDIO-184)**

The `services/` module provides a reusable SDK for querying the AI Governance Framework. This SDK is designed to be imported by applications (aigov-explorer, coreapp, etc.) without pulling in CLI dependencies.

### Architecture

```
┌─────────────────────────────────────────────────────────────┐
│  Your Apps (aigov-explorer, coreapp, etc.)                  │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │  FastAPI    │  │   Flask     │  │  Streamlit  │          │
│  │  routes     │  │   views     │  │   app       │          │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘          │
└─────────┼────────────────┼────────────────┼─────────────────┘
          │                │                │
          ▼                ▼                ▼
┌─────────────────────────────────────────────────────────────┐
│  ashmatics_aigov_tools.services (SDK)                     │
│  ┌─────────────────────────────────────────────────────┐    │
│  │  AIGovQueryService                                  │    │
│  │  - query(question) -> QueryResult                   │    │
│  │  - search(question) -> RetrievalResult              │    │
│  │  - stream_query(question) -> AsyncIterator          │    │
│  │  - get_policy_hierarchy(policy_id) -> dict          │    │
│  └─────────────────────────────────────────────────────┘    │
│                         │                                   │
│            ┌────────────┴────────────┐                      │
│            ▼                         ▼                      │
│     ┌──────────────┐         ┌──────────────┐               │
│     │  MCPBackend  │         │ DirectBackend│               │
│     │  (KB server) │         │ (CosmosDB)   │               │
│     └──────────────┘         └──────────────┘               │
└─────────────────────────────────────────────────────────────┘
```

### Quick Start

```python
import asyncio
import os

from ashmatics_aigov_tools.services import AIGovQueryService, AIGovQueryConfig

# Create configuration (MCP mode - recommended for production)
config = AIGovQueryConfig(
    use_mcp=True,
    mcp_url="http://localhost:8088",  # or https://kb-api.ashmatics.com
    mcp_api_key=os.getenv("MCP_API_KEY"),
)


async def main() -> None:
    # Use the service
    async with AIGovQueryService(config) as service:
        # Full RAG query (retrieval + LLM synthesis)
        result = await service.query("What SOPs for model monitoring?")
        print(result.answer)
        for source in result.sources:
            print(f"  - {source.document_id}: {source.score:.3f}")

        # Search only (no LLM synthesis)
        retrieval = await service.search("performance validation", top_k=5)
        for r in retrieval.results:
            print(f"{r.document_id}: {r.title}")

        # Streaming response (for real-time UIs)
        async for chunk in service.stream_query("Explain risk management"):
            print(chunk.text, end="", flush=True)

        # Graph traversals (MCP mode only)
        hierarchy = await service.get_policy_hierarchy("POL-MMP-001")
        controls = await service.find_control_implementations("EXC-A4-01")
        crosswalk = await service.get_regulatory_crosswalk("NIST-AI-RMF")


asyncio.run(main())
```

### Configuration Options

| Parameter | Description | Default |
|-----------|-------------|---------|
| `use_mcp` | Use MCP backend (True) or direct CosmosDB (False) | `True` |
| `mcp_url` | KB MCP server URL | `http://localhost:8088` |
| `mcp_api_key` | MCP API key (or set `MCP_API_KEY` env var) | `None` |
| `llm_provider` | LLM provider: `"azure_openai"` or `"ollama"` | `"azure_openai"` |
| `llm_model` | LLM model/deployment name | `None` (uses env default) |
| `top_k` | Number of sources to retrieve | `5` |
| `temperature` | LLM temperature | `0.7` |

### Environment Variables

```bash
# MCP mode
MCP_URL=https://kb-api.ashmatics.com
MCP_API_KEY=coreapp_your_key_here

# Direct mode (fallback)
MONGO_URL=mongodb+srv://...
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=...
```

### Backend Modes

**MCP Backend (Recommended)**
- Uses KB MCP server for intelligent search routing
- Supports intent classification and suggested workflows
- Enables graph traversals (policy hierarchy, control implementations, regulatory crosswalks)
- Production-ready with proper auth

**Direct Backend**
- Connects directly to CosmosDB vector store
- Useful for development/debugging when MCP server unavailable
- No graph traversal support

### Future: SDK Extraction

> **Note:** This SDK may be extracted to a separate `ashmatics-kb-sdk` package in the future to provide a lightweight client library without the framework management dependencies. This would allow apps to import just the query functionality without Azure blob, pymongo, and other admin tooling.

---

### Programmatic Usage (Framework Management)

```python
from ashmatics_aigov_tools.importers.framework_content import FrameworkContentImporter

# Initialize importer with MongoDB support
importer = FrameworkContentImporter(
    framework_repo_path="/path/to/framework",
    azure_connection_string="DefaultEndpointsProtocol=https;...",
    kb_table_name="framework_content_registry",
    container_name="framework-content",
    mongo_connection_string="mongodb+srv://user:pass@cluster/...",
    mongo_database="ashmatics_kb",
    mongo_collection="framework_compiled_views",
    graphql_endpoint="https://kb-graphql.ashmatics.com/v1/graphql",
    admin_secret="your-admin-secret"
)

# Hybrid registration (both Postgres and MongoDB)
success, results = importer.register_full_hybrid_content(
    force_refresh=False,
    verify_azure_content=True
)

# Postgres-only registration
success, results = importer.register_existing_content_with_kb(
    force_refresh=False,
    verify_azure_content=True
)

# MongoDB compilation only
success, results = importer.compile_and_store_mongo_views(
    force_refresh=False,
    verify_azure_content=True
)
```

## Framework Content Structure

The framework includes:

- **12 Process Domains**: ORG, OVR, PV, SA, IT, RM, MON, HO, EF, TR, UG, SE
- **12 Policy Domains**: SAP, AGP, RMP, HOP, EFP, MMP, LTP, CMP, DGP, SEP, VAL, TPP
- **~50 Base Practices**: Distributed across process domains
- **51 Policy→Process Relationships**: Defining which policies govern which processes
- **Content Categories**:
  - `domain_definitions` - Process domain JSON definitions
  - `policy_templates` - Customizable policy JSON templates
  - `process_documentation` - Detailed process and procedure markdown files
  - `master_registry` - Framework registry and metadata files
  - `validation_tools` - Framework validation and tooling files
  - `framework_documentation` - General framework documentation

## Artifact Slug Naming Convention

**Added 2025-12-19**

The framework uses human-friendly `artifact_id` slugs for API lookups instead of UUID-based identifiers. This enables clean API calls like:

```
GET /governance/POL-AGP-Overview?version=0.8.0
GET /governance/SOP-PV-01?version=0.8.0
```

### Slug Prefixes by Artifact Type

| Prefix | Type | Example |
|--------|------|---------|
| `FW-` | Framework docs | `FW-Overview`, `FW-ImplGuide` |
| `POL-` | Policy overviews | `POL-AGP-Overview`, `POL-RMP-Overview` |
| `POLT-` | Policy templates | `POLT-AGP-Template` |
| `POLTOK-` | Policy tokens | `POLTOK-AGP` |
| `PROC-` | Process overviews | `PROC-PV-Overview` |
| `PROCJ-` | Process JSON | `PROCJ-PV` |
| `SOP-` | SOPs | `SOP-PV-01`, `SOP-OVR-03` |
| `WP-` | Work products | `WP-RM-02`, `WP-PV-01a` |
| `AREA-` | Process areas | `AREA-PV-Overview` |
| `WIZ-` | Wizard designs | `WIZ-PV-Design` |
| `WIZS-` | Wizard schemas | `WIZS-PV`, `WIZS-REG-NistRmf` |
| `BIND-` | Policy bindings | `BIND-PV` |
| `TRACE-` | Traceability | `TRACE-PV` |
| `LOG-` | Decision logs | `LOG-PV` |
| `CTRL-` | Controls | `CTRL-RegistryGuide` |
| `ISO-` | ISO mappings | `ISO-42001-AnnexA` |

### Programmatic Usage

```python
from ashmatics_aigov_tools import get_artifact_slug, validate_slug

# Generate slug from file path
slug = get_artifact_slug("policy-domain/AGP-Domain-Overview-Summary.md")
# Returns: "POL-AGP-Overview"

# Validate a slug
is_valid, error = validate_slug("SOP-PV-01")
# Returns: (True, "")
```

The slug mapping is defined in `artifact_slug_mapping.py` and is automatically used by the framework importer when registering content.
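
For readers who want to check slugs without installing the package, the sketch below validates a slug against the prefix table above. `check_slug` is a hypothetical stand-in, not the package's `validate_slug`; the allowed characters after the prefix and the error messages are assumptions.

```python
import re
from typing import Tuple

# Prefixes from the table above; each includes its trailing hyphen.
SLUG_PREFIXES = (
    "FW-", "POL-", "POLT-", "POLTOK-", "PROC-", "PROCJ-", "SOP-", "WP-",
    "AREA-", "WIZ-", "WIZS-", "BIND-", "TRACE-", "LOG-", "CTRL-", "ISO-",
)

# Assumed body shape: alphanumeric segments separated by single hyphens.
SLUG_BODY = re.compile(r"^[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*$")


def check_slug(slug: str) -> Tuple[bool, str]:
    """Return (True, "") for a plausible slug, else (False, reason)."""
    for prefix in SLUG_PREFIXES:
        if slug.startswith(prefix):
            if SLUG_BODY.match(slug[len(prefix):]):
                return True, ""
            return False, f"invalid text after prefix {prefix!r}"
    return False, "unknown prefix"


print(check_slug("SOP-PV-01"))
print(check_slug("XYZ-1"))
```

Keeping the hyphen as part of each prefix means `POLT-AGP-Template` never matches `POL-`, so the order of the prefix list does not matter.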

## Components

### Content Validator (`processors/content_validator.py`)
- Domain JSON file validation
- Policy logic consistency checks
- Process model documentation verification
- Master registry validation
- Version detection

### Azure Uploader (`processors/azure_uploader.py`)
- Versioned blob uploads with structured paths
- Rich metadata tagging by content category
- File hash verification and deduplication
- Upload manifest generation
- ADLS Gen2 support

### MongoDB Compiler (`processors/mongo_compiler.py`)
- Extracts sections, headers, and document structure
- Identifies placeholders/tokens for customization
- Parses policy bindings and process references
- Extracts traceability information and citations
- Generates schema-bound JSON for AI agents
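
The first two steps can be sketched with a few lines of standard-library code. The `{{TOKEN}}` placeholder syntax, the output keys, and the `compile_markdown` name are assumptions for illustration only; the real compiler's schema is not shown in this README, and this sketch does not skip `#` lines inside fenced code blocks.

```python
import re
from typing import Any, Dict

HEADER = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
# Assumed placeholder syntax: {{TOKEN_NAME}}
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z0-9_.-]+)\s*\}\}")


def compile_markdown(text: str) -> Dict[str, Any]:
    """Split Markdown into header-delimited sections and collect placeholders."""
    sections = []
    current = {"level": 0, "title": "", "lines": []}
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            sections.append(current)
            current = {"level": len(match.group(1)), "title": match.group(2), "lines": []}
        else:
            current["lines"].append(line)
    sections.append(current)

    compiled = [
        {"level": s["level"], "title": s["title"], "body": "\n".join(s["lines"]).strip()}
        for s in sections
        if s["title"] or any(part.strip() for part in s["lines"])
    ]
    return {
        "sections": compiled,
        "placeholders": sorted(set(PLACEHOLDER.findall(text))),
    }


doc = "# Policy\nOwner: {{ORG_NAME}}\n\n## Scope\nApplies to {{ORG_NAME}} and {{SYSTEM}}.\n"
print(compile_markdown(doc))
```

A structured view like this lets an agent address a single section or list the tokens that still need values without re-parsing the Markdown.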

### Framework Content Importer (`importers/framework_content.py`)
Main orchestrator supporting:
1. Content validation
2. Azure blob upload with versioning
3. PostgreSQL registration (via Hasura GraphQL)
4. MongoDB compilation (compiled JSON views)
5. Version tracking across systems

## Configuration

### Environment Variables

Create a `.env` file with:

```bash
# Azure Storage
AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=https;..."
AZURE_FRAMEWORK_CONTAINER="framework-content"

# PostgreSQL/Hasura
HASURA_GRAPHQL_ENDPOINT="https://kb-graphql.ashmatics.com/v1/graphql"
HASURA_ADMIN_SECRET="your-admin-secret"

# MongoDB/CosmosDB (optional)
AZ_MONGO_CONNECTION_STRING="mongodb+srv://user:pass@cluster/..."
MONGO_DATABASE="ashmatics_kb"
MONGO_COLLECTION="framework_compiled_views"

# Framework Source
FRAMEWORK_REPO_PATH="/path/to/framework/repo"

# Optional
FORCE_ADLS_GEN2="false"
```

## Development

### Running Tests

Using `uv` (recommended):

```bash
# Sync dev dependencies
uv sync --all-extras

# Run tests
uv run pytest

# Run with coverage
uv run pytest --cov=ashmatics_aigov_tools --cov-report=html
```

Or using pip:

```bash
# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run with coverage
pytest --cov=ashmatics_aigov_tools --cov-report=html
```

### Code Quality

```bash
# Using uv (recommended)
uv run ruff format src/
uv run ruff check src/
uv run mypy src/

# Or directly if tools are installed globally
ruff format src/
ruff check src/
mypy src/
```

## Version Management

The framework uses semantic versioning with timestamp format: `YYYY.MM.DD-HHMMSS`

Example: `2025.11.20-153000`
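
The timestamp format maps directly onto `strftime` codes, as this short sketch shows. The helper names are hypothetical, and the time zone used when stamping a version is not stated here, so the example uses a naive `datetime`.

```python
from datetime import datetime

VERSION_FORMAT = "%Y.%m.%d-%H%M%S"  # YYYY.MM.DD-HHMMSS


def format_version(moment: datetime) -> str:
    """Render a datetime in the timestamp version format."""
    return moment.strftime(VERSION_FORMAT)


def parse_version(version: str) -> datetime:
    """Parse a timestamp version back into a datetime (raises ValueError if malformed)."""
    return datetime.strptime(version, VERSION_FORMAT)


print(format_version(datetime(2025, 11, 20, 15, 30, 0)))  # 2025.11.20-153000
```

Because every field is zero-padded and ordered from year down to second, sorting these strings alphabetically also sorts them chronologically.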

## License

Copyright 2025 Asher Informatics PBC

Licensed under the Apache License, Version 2.0. See [LICENSE](LICENSE) file for details.

## Related Projects

- [ashmatics-tools](https://github.com/JFK-Ashmatics/ashmatics-tools) - Shared utilities and primitives
- [ashmatics-knowledgebase-tools](https://github.com/JFK-Ashmatics/ashmatics-kb-tools) - Knowledge base integration tools
- [ashmatics-coreapp](https://github.com/JFK-Ashmatics/ashmatics-coreapp) - Main application frontend

## Changelog

### v0.7.0 (2026-01-28)
**Framework Explorer Pipeline** (ASHPSTUDIO-184)

- **New Features**
  - `FrameworkExplorerPipeline`: End-to-end query processing pipeline
    - Stage 1: Query Understanding (`QueryPreprocessor`) - rule-based + LLM refinement
    - Stage 2: Retrieval (`QueryOrchestrator`) - KB MCP integration
    - Stage 3: Synthesis (`ResponseSynthesizer`) - GPT-4o via Azure AI Foundry
    - Stage 4: Validation (`ResponseValidator`) - Quality checking with GPT-4o-mini
  - `FoundryClient`: Azure AI Foundry wrapper supporting multiple endpoint formats
  - `PipelineConfig`, `SynthesizerConfig`, `ValidatorConfig`: Structured configuration
  - `PipelineResult`, `Citation`, `SynthesisResult`: Rich result types
  - Prompt management with frontmatter (`pipeline_prompts.py`)

- **Supported Query Intents**
  - `domain_explanation`, `structural`, `regulatory_crosswalk`
  - `gap_analysis`, `known_item`, `comparative`

- **Environment Variables**
  - `AZURE_FOUNDRY_OPENAI_ENDPOINT`: Foundry OpenAI-compatible endpoint
  - `AZURE_FOUNDRY_KB_SEARCH_PARSE_API_KEY`: API key for Foundry

- **Documentation**
  - Added `src/ashmatics_aigov_tools/services/README.md` with full pipeline documentation

### v0.6.0 (2026-01-25)
**AIGov Query Service SDK** (ASHPSTUDIO-184)

- **New Features**
  - `services/` module: Reusable SDK for querying the AI Governance Framework
    - `AIGovQueryService`: Main service class with `query()`, `search()`, `stream_query()` methods
    - `AIGovQueryConfig`: Configuration dataclass supporting MCP and direct modes
    - `KBMCPClient`: Typed async HTTP client for KB MCP server
    - `MCPBackend` / `DirectBackend`: Pluggable search backends via `SearchBackend` protocol
  - MCP integration with KB server at configurable URL (localhost or production)
  - Graph traversal methods: `get_policy_hierarchy()`, `find_control_implementations()`, `get_regulatory_crosswalk()`
  - Full async/await support with context manager pattern

- **Refactoring**
  - `aigov_query.py` CLI refactored to thin wrapper using `AIGovQueryService`
  - Separated CLI concerns (argparse, rich display, REPL) from core service logic
  - SDK can be imported by apps without CLI dependencies

- **CLI Updates**
  - New `--mcp` (default) and `--direct` flags to select backend mode
  - New `--mcp-url` and `--mcp-key` flags for MCP configuration
  - Added graph traversal REPL commands: `hierarchy`, `controls`, `crosswalk`
  - Added `search <query>` command for retrieval without LLM synthesis

- **Configuration**
  - Environment variable support: `MCP_URL`, `MCP_API_KEY`
  - Priority order: CLI flags > env vars > defaults
  - Flexible endpoint configuration for dev/staging/production

### v0.5.0 (2026-01-20)
**Registry-Based Ontology Loading** (ASHKBAPP-60)

- **Breaking Changes**
  - `cai-init-ontology` now requires access to `ashmatics-framework-registry.json`
  - Removed hardcoded domain data; registry is now the single source of truth

- **New Features**
  - `cai-init-ontology` loads from framework registry JSON:
    - Auto-discovers registry file from sibling repo or common paths
    - New `--registry-file` argument for explicit path
    - Loads 12 policy domains with full metadata (tier, ISO controls, regulatory mappings)
    - Loads 12 process domains with base practices and policy bindings
    - Creates 51 policy→process "specifies" relationships automatically
    - Enriched data includes: `activates_process_domains`, `iso42001_controls`, `exemplar_controls`, `base_practices`, `upstream_domains`, `downstream_domains`

- **Data Corrections**
  - Fixed policy domains: Now SAP, AGP, RMP, HOP, EFP, MMP, LTP, CMP, DGP, SEP, VAL, TPP
  - Fixed process domains: Now ORG, OVR, PV, SA, IT, RM, MON, HO, EF, TR, UG, SE
  - Removed invalid domains that were previously hardcoded (IM, DC, RCP, IMP)

### v0.4.0 (2026-01-20)
**ASHCAI Ontology Release** (ASHPSTUDIO-183)

- **New Features**
  - `cai-init-ontology`: CLI tool to initialize ASHCAI ontology collections in MongoDB
    - Creates 11 ASHCAI collections with proper indexes
    - Supports dry-run and force re-initialization modes
  - `cai-sync-regulatory`: CLI tool to sync regulatory crosswalks from YAML to MongoDB
    - Parses `*-logic.yaml` files from ashmatics-policy-process-builder
    - Extracts crosswalk relationships (requirements → policies, controls, work products)
    - Supports NIST AI RMF, Joint Commission RUAIH, ONC HTI-1, Colorado AI Act, ACA Section 1557
    - List available frameworks with `--list` option
  - `RegulatoryYAMLParser`: New parser class for regulatory crosswalk YAML files
    - Extracts framework metadata, requirements, subcategories
    - Maps to policy sections, exemplar controls, process domains, work products
    - Generates ASHCAI-compatible relationship records

- **Dependencies**
  - Requires `ashmatics-tools>=0.7.0` for `AshcaiOntology` class

### v0.3.0 (2026-01-17)
**Major Release - RAG Query Tool**

- **New Features**
  - `cai-query`: RAG-powered interactive Q&A tool for the AI Governance Framework
    - Azure OpenAI (default) and Ollama LLM support
    - Simple RAG and Multi-Query RAG strategies
    - Interactive REPL mode with rich formatting
    - Single query and streaming modes
    - Source citation and cost/latency metrics
  - `cai-extract-relationships`: Graph relationship extraction CLI (ASHKBAPP-60)

- **Changes**
  - Updated `ashmatics-tools` dependency with extras: `[storage,db-mongo,azure-ai,ollama]`
  - Removed deprecated `rag_experiment.py` (replaced by `aigov_query.py`)

### v0.2.7 (2025-12-27)
- Added datestamps to files missing them
- Integrated VectorIndexManager from ashmatics-tools

### v0.2.6 (2025-12-22)
- Added `generate_embeddings_aigov.py` for vector embedding generation
- Vector index management with IVF support for CosmosDB vCore
- 2,405 vectors indexed in `aigov_framework_vectors` collection

### v0.2.5 (2025-12-20)
- Complete v0.8.0 framework import (309 MongoDB, 342 Azure, 341 PostgreSQL)
- Added artifact_id slug naming convention
- Added wizard-content-schemas support

### v0.2.0 (2025-12-01)
- Added dependency on ashmatics-datamodels package
- Code review fixes and testing

### v0.1.0 (2025-11-20)
- Initial release
- Three-layer hybrid persistence (Azure + PostgreSQL + MongoDB)
- Content validation and framework import tools

## Contributing

This is a private repository for Asher Informatics PBC. Contact the maintainers for access.

## Support

For questions or issues, please contact:
- Email: john@asherinformatics.com
- GitHub Issues: https://github.com/AshMatics/ashmatics-aigov-tools/issues
