Metadata-Version: 2.4
Name: fabric-catalog
Version: 0.1.0
Summary: Unified governance CLI and SDK for Microsoft Fabric — discovery, access control, lineage, and catalog management in one interface. Bridges Fabric, Purview, and Databricks Unity Catalog.
Project-URL: Homepage, https://github.com/dereknguyenio/fabric-catalog
Project-URL: Documentation, https://github.com/dereknguyenio/fabric-catalog#readme
Project-URL: Repository, https://github.com/dereknguyenio/fabric-catalog
Project-URL: Issues, https://github.com/dereknguyenio/fabric-catalog/issues
Project-URL: Changelog, https://github.com/dereknguyenio/fabric-catalog/releases
Author: Derek Nguyen
License-Expression: MIT
Keywords: catalog,databricks,governance,microsoft-fabric,onelake,purview,unity-catalog
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries
Requires-Python: >=3.10
Requires-Dist: azure-identity>=1.15.0
Requires-Dist: click>=8.1.0
Requires-Dist: httpx>=0.27.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: pyyaml>=6.0.0
Requires-Dist: rich>=13.0.0
Provides-Extra: all
Requires-Dist: databricks-sdk>=0.28.0; extra == 'all'
Requires-Dist: pytest-asyncio>=0.23.0; extra == 'all'
Requires-Dist: pytest-cov>=5.0.0; extra == 'all'
Requires-Dist: pytest>=8.0.0; extra == 'all'
Requires-Dist: respx>=0.21.0; extra == 'all'
Requires-Dist: ruff>=0.4.0; extra == 'all'
Provides-Extra: databricks
Requires-Dist: databricks-sdk>=0.28.0; extra == 'databricks'
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.23.0; extra == 'dev'
Requires-Dist: pytest-cov>=5.0.0; extra == 'dev'
Requires-Dist: pytest>=8.0.0; extra == 'dev'
Requires-Dist: respx>=0.21.0; extra == 'dev'
Requires-Dist: ruff>=0.4.0; extra == 'dev'
Description-Content-Type: text/markdown

# fabric-catalog

**Unified governance for Microsoft Fabric — the missing Unity Catalog.**

`fabric-catalog` is a CLI and Python SDK that unifies governance across Microsoft Fabric, Purview, and Databricks Unity Catalog. Define policies once in YAML. Enforce them everywhere. Search, grant, revoke, trace lineage, and sync permissions — one tool, all backends.

## The Problem

Databricks has **Unity Catalog**: one namespace, one permission model, one CLI. Microsoft splits the same job across Purview, the OneLake Catalog, workspace RBAC, and OneLake data access roles: four systems with four interaction models that don't behave as a unified whole.

`fabric-catalog` fixes this. It provides a single policy engine that treats Fabric, Purview, and Databricks as enforcement backends rather than independent products.

## Install

```bash
pip install fabric-catalog

# With Databricks Unity Catalog support (quoted so shells like zsh don't expand the brackets)
pip install "fabric-catalog[databricks]"

# Everything
pip install "fabric-catalog[all]"
```

## Policy-as-Code

Define your governance posture in YAML. Commit it to git. Review changes in CI. Apply across all backends.

```yaml
# governance.yaml
version: "1"
policies:
  - name: analysts-read
    assets:
      - pattern: "my-workspace.production-lakehouse.*"
    principals:
      - name: analysts@example.com
        type: group
    level: read
    column_masks:
      - column: ssn
        mask: redact
      - column: account_number
        mask: hash
        except_principals:
          - compliance@example.com
    row_filters:
      - condition: "region = 'US'"
        principals:
          - us-analysts@example.com
```

```bash
# Validate the manifest
fc policy validate governance.yaml

# Preview what changes would be applied (dry run)
fc policy plan governance.yaml

# Apply across Fabric + Purview + Unity Catalog
fc policy apply governance.yaml
```

The policy compiler decomposes each policy into the correct backend-specific grants: workspace RBAC roles, OneLake data access roles, Purview policies, and UC `GRANT` statements — automatically.
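
The decomposition step can be pictured with a small sketch. This is not the package's compiler: `Action`, `compile_policy`, the `inventory` mapping of asset path to backend label, and the `read` to `SELECT` translation are all assumptions made for illustration. It only shows the general shape of the idea, which is to match asset patterns and then emit one backend-specific action per asset and principal.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

# Assumed translation from a manifest level to a Unity Catalog privilege.
LEVEL_TO_UC_PRIVILEGE = {"read": "SELECT"}


@dataclass(frozen=True)
class Action:
    """One backend-specific step in a hypothetical execution plan."""
    backend: str
    asset: str
    principal: str
    operation: str


def compile_policy(policy: dict, inventory: dict[str, str]) -> list[Action]:
    """Expand one policy into an action per matching asset and principal.

    `inventory` maps an asset path to the backend that owns it (hypothetical).
    """
    patterns = [entry["pattern"] for entry in policy["assets"]]
    actions = []
    for asset, backend in inventory.items():
        if not any(fnmatchcase(asset, pattern) for pattern in patterns):
            continue  # asset is out of scope for this policy
        for principal in policy["principals"]:
            name = principal["name"]
            if backend == "databricks":
                privilege = LEVEL_TO_UC_PRIVILEGE[policy["level"]]
                operation = f"GRANT {privilege} ON TABLE {asset} TO `{name}`"
            else:
                operation = f"assign {name} a OneLake data access role ({policy['level']})"
            actions.append(Action(backend, asset, name, operation))
    return actions


policy = {
    "name": "analysts-read",
    "assets": [{"pattern": "my-workspace.production-lakehouse.*"}],
    "principals": [{"name": "analysts@example.com", "type": "group"}],
    "level": "read",
}
inventory = {
    "my-workspace.production-lakehouse.customers": "fabric",
    "my-workspace.staging-lakehouse.orders": "fabric",
}
for action in compile_policy(policy, inventory):
    print(action.backend, "->", action.operation)
```

Keeping the plan as plain data means it can be printed for review before anything is applied, which is the role `fc policy plan` plays above.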

## CLI

```bash
# Search across all backends
fc search "customer_transactions"

# List assets
fc ls my_workspace
fc ls my_catalog.my_schema --backend databricks

# View permissions (aggregated across all backends)
fc permissions show my_workspace.lakehouse.customers

# Grant / Revoke
fc grant my_workspace.lakehouse.customers user@example.com read
fc revoke my_workspace.lakehouse.customers user@example.com

# Lineage (merged from Purview + Fabric + UC)
fc lineage my_workspace.lakehouse.customers --direction upstream --depth 3

# Sync permissions between Fabric and Unity Catalog
fc diff --mapping-file mappings.json
fc sync --direction fabric_to_uc --mapping-file mappings.json --dry-run
fc sync --direction fabric_to_uc --mapping-file mappings.json --apply
```
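
The `--direction upstream --depth 3` options suggest a depth-limited walk over a merged lineage graph. The sketch below illustrates that idea with a breadth-first search. The `upstream` function and the `(source, target)` edge format are assumptions for this example and do not describe the tool's internals.

```python
from collections import deque


def upstream(edges: list[tuple[str, str]], start: str, depth: int) -> dict[str, int]:
    """Return each asset upstream of `start` within `depth` hops, with its hop count.

    Each edge is a (source, target) pair, meaning data flows source -> target.
    """
    parents: dict[str, list[str]] = {}
    for source, target in edges:
        parents.setdefault(target, []).append(source)

    found: dict[str, int] = {}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == depth:
            continue  # depth limit reached, do not expand further
        for parent in parents.get(node, []):
            if parent != start and parent not in found:
                found[parent] = hops + 1
                queue.append((parent, hops + 1))
    return found


# Edges reported by two hypothetical backends; the set union drops duplicates.
purview_edges = [("raw", "staged"), ("staged", "customers")]
fabric_edges = [("staged", "customers"), ("landing", "raw"), ("source", "landing")]
merged = sorted(set(purview_edges) | set(fabric_edges))

print(upstream(merged, "customers", depth=3))
```

Merging the edge lists before traversal means an edge that two backends both report is only followed once.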

## Python SDK

```python
import asyncio

from fabric_catalog.facade import CatalogFacade
from fabric_catalog.models import PermissionLevel


async def main() -> None:
    async with CatalogFacade() as catalog:
        # Search everywhere
        results = await catalog.search("transactions")

        # Grant with one call; it is routed to the right backend automatically
        await catalog.grant(
            "workspace.lakehouse.transactions",
            "analyst@example.com",
            PermissionLevel.READ,
        )

        # Lineage across backends
        lineage = await catalog.lineage("workspace.lakehouse.transactions")

        # UC Bridge: sync permissions
        catalog.add_sync_mapping(
            "workspace.lakehouse.transactions",
            "catalog.schema.transactions",
        )
        report = await catalog.sync(direction="fabric_to_uc", dry_run=True)
        print(report.summary)


# `async with` is only valid inside a coroutine, so run it through asyncio.
asyncio.run(main())
```

## MCP Server

`fabric-catalog` exposes itself as an [MCP](https://modelcontextprotocol.io/) server, enabling AI agents to discover assets, check permissions, trace lineage, and manage governance through natural language.

Available tools: `catalog_search`, `catalog_get`, `catalog_permissions`, `catalog_lineage`, `catalog_policy_check`, `catalog_list`.

## Architecture

```
┌──────────────────────────────────────────────────┐
│               CLI  (fc)                          │
│   search │ ls │ grant │ revoke │ lineage          │
│   sync │ diff │ policy plan │ policy apply        │
├──────────────────────────────────────────────────┤
│           Policy Engine                          │
│   YAML manifest → compiler → execution plan      │
│   drift detection │ column masks │ row filters    │
├──────────────────────────────────────────────────┤
│            CatalogFacade                         │
│   concurrent fan-out │ merge │ deduplicate        │
├──────────┬──────────┬──────────┬─────────────────┤
│  Fabric  │ Purview  │Databricks│   UC Bridge     │
│  Client  │ Client   │ Client   │ (perm sync)     │
├──────────┴──────────┴──────────┴─────────────────┤
│            Unified Models                        │
│ CatalogAsset │ Permission │ LineageGraph          │
│ GovernancePolicy │ CompiledPlan │ SyncReport      │
├──────────────────────────────────────────────────┤
│            MCP Server                            │
│   AI agent interface for governance              │
└──────────────────────────────────────────────────┘
```
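
The facade layer's "concurrent fan-out, merge, deduplicate" pattern can be sketched with `asyncio.gather`. The backend functions below are stand-ins that return canned data, and `fan_out_search` is a hypothetical name. Skipping a failed backend instead of failing the whole search is an assumption of this example, not documented behavior.

```python
import asyncio


async def search_fabric(query: str) -> list[dict]:
    await asyncio.sleep(0)  # stand-in for a network call
    return [{"name": "ws.lakehouse.customer_transactions", "backend": "fabric"}]


async def search_purview(query: str) -> list[dict]:
    await asyncio.sleep(0)
    return [
        {"name": "ws.lakehouse.customer_transactions", "backend": "purview"},
        {"name": "ws.warehouse.customer_summary", "backend": "purview"},
    ]


async def search_unavailable(query: str) -> list[dict]:
    raise ConnectionError("backend unavailable")


async def fan_out_search(query: str, backends) -> list[dict]:
    """Query every backend concurrently, then merge and de-duplicate by asset name."""
    results = await asyncio.gather(
        *(backend(query) for backend in backends),
        return_exceptions=True,  # collect failures as values instead of raising
    )
    merged: dict[str, dict] = {}
    for result in results:
        if isinstance(result, BaseException):
            continue  # assumed policy: a failing backend is skipped
        for asset in result:
            entry = merged.setdefault(asset["name"], {"name": asset["name"], "backends": []})
            entry["backends"].append(asset["backend"])
    return list(merged.values())


assets = asyncio.run(
    fan_out_search("customer", [search_fabric, search_purview, search_unavailable])
)
for asset in assets:
    print(asset["name"], asset["backends"])
```

Because `asyncio.gather` returns results in the order the awaitables were passed, the merged output is deterministic even though the calls run concurrently.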

## Backend Capabilities

| Capability | Fabric | Purview | Databricks UC |
|-----------|--------|---------|---------------|
| Discovery | ✅ Workspace items, tables | ✅ Data Catalog search | ✅ Catalogs, schemas, tables |
| Access Control | ✅ Workspace RBAC, OneLake roles | 🔜 Phase 2 | ✅ UC grants |
| Lineage | ⚠️ Limited native | ✅ Data Map | ✅ System tables |
| Column Masking | ✅ via policy engine | ✅ sensitivity labels | ✅ UC column masks |
| Row Filters | ✅ via policy engine | — | ✅ UC row filters |
| Sync Bridge | ✅ Source/target | — | ✅ Source/target |
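
To make the column-masking row concrete, here is a minimal sketch of what `redact` and `hash` masks with `except_principals` could mean when applied to a single record. The `apply_masks` helper, the `***` placeholder, and the choice of SHA-256 are assumptions for illustration. Real enforcement happens in the backends, and group membership resolution is left out.

```python
import hashlib


def apply_masks(row: dict, masks: list[dict], principal: str) -> dict:
    """Return a copy of `row` with each mask applied unless `principal` is exempt."""
    masked = dict(row)
    for mask in masks:
        column = mask["column"]
        if column not in masked:
            continue
        if principal in mask.get("except_principals", []):
            continue  # exempt principals see the original value
        if mask["mask"] == "redact":
            masked[column] = "***"
        elif mask["mask"] == "hash":
            masked[column] = hashlib.sha256(str(masked[column]).encode()).hexdigest()
    return masked


masks = [
    {"column": "ssn", "mask": "redact"},
    {"column": "account_number", "mask": "hash",
     "except_principals": ["compliance@example.com"]},
]
row = {"ssn": "123-45-6789", "account_number": "42", "region": "US"}

print(apply_masks(row, masks, "analysts@example.com"))
print(apply_masks(row, masks, "compliance@example.com"))
```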

## Example Templates

Industry-specific governance templates are included in `examples/`:

- **`energy-governance.yaml`** — Well data, subsurface engineering, regional access controls
- **`financial-services-governance.yaml`** — PII masking, transaction access, compliance overrides
- **`healthcare-governance.yaml`** — HIPAA-aligned PHI masking, research de-identification, consent filters

Use these as starting points and adapt to your organization.

## Configuration

```bash
# Interactive setup
fc config init

# Or set environment variables
export FABRIC_CATALOG_TENANT_ID="..."
export FABRIC_CATALOG_CLIENT_ID="..."
export FABRIC_CATALOG_CLIENT_SECRET="..."
export FABRIC_CATALOG_WORKSPACE_ID="..."
export FABRIC_CATALOG_PURVIEW_ACCOUNT="my-purview-account"
export DATABRICKS_HOST="my-workspace.azuredatabricks.net"
export DATABRICKS_TOKEN="dapi..."
```
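
If you script around these variables, it can help to fail early when one is missing. The sketch below reads the `FABRIC_CATALOG_*` names shown above into a small settings object. `Settings` and `load_settings` are hypothetical helpers, and treating the tenant, client ID, and secret as required is an assumption, since the package may accept other credential types.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

PREFIX = "FABRIC_CATALOG_"
REQUIRED = ("TENANT_ID", "CLIENT_ID", "CLIENT_SECRET")  # assumed to be mandatory


@dataclass(frozen=True)
class Settings:
    tenant_id: str
    client_id: str
    client_secret: str
    workspace_id: Optional[str] = None
    purview_account: Optional[str] = None


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, listing every missing one at once."""
    missing = [PREFIX + key for key in REQUIRED if not env.get(PREFIX + key)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return Settings(
        tenant_id=env[PREFIX + "TENANT_ID"],
        client_id=env[PREFIX + "CLIENT_ID"],
        client_secret=env[PREFIX + "CLIENT_SECRET"],
        workspace_id=env.get(PREFIX + "WORKSPACE_ID"),
        purview_account=env.get(PREFIX + "PURVIEW_ACCOUNT"),
    )


# Usage with an explicit mapping, so the example does not depend on your shell.
example_env = {
    "FABRIC_CATALOG_TENANT_ID": "tenant",
    "FABRIC_CATALOG_CLIENT_ID": "client",
    "FABRIC_CATALOG_CLIENT_SECRET": "secret",
    "FABRIC_CATALOG_PURVIEW_ACCOUNT": "my-purview-account",
}
print(load_settings(example_env).purview_account)
```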

## UC Bridge Mappings

Map Fabric assets to their Unity Catalog equivalents for permission sync:

```json
{
  "my_workspace.my_lakehouse.customers": "my_catalog.my_schema.customers",
  "my_workspace.my_lakehouse.transactions": "my_catalog.my_schema.transactions"
}
```

```bash
fc diff --mapping-file mappings.json
fc sync --direction fabric_to_uc --mapping-file mappings.json --apply
```
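
Conceptually, a diff over this mapping compares the grants on each Fabric asset with the grants on its Unity Catalog counterpart. The sketch below shows one way to compute that as set differences. `diff_permissions` and the `(principal, level)` tuples are hypothetical, and proposing a revoke for grants that exist only on the target is an assumption about sync semantics, not documented behavior.

```python
import json


def diff_permissions(mapping: dict, fabric_perms: dict, uc_perms: dict) -> list[tuple]:
    """List the changes a fabric_to_uc sync would need, one tuple per change.

    Each permissions dict maps an asset name to a set of (principal, level) pairs.
    """
    changes = []
    for fabric_asset, uc_asset in mapping.items():
        source = fabric_perms.get(fabric_asset, set())
        target = uc_perms.get(uc_asset, set())
        for principal, level in sorted(source - target):
            changes.append(("grant", uc_asset, principal, level))
        for principal, level in sorted(target - source):
            changes.append(("revoke", uc_asset, principal, level))
    return changes


mapping = json.loads(
    '{"my_workspace.my_lakehouse.customers": "my_catalog.my_schema.customers"}'
)
fabric_perms = {
    "my_workspace.my_lakehouse.customers": {
        ("analyst@example.com", "read"),
        ("lead@example.com", "read"),
    }
}
uc_perms = {
    "my_catalog.my_schema.customers": {
        ("lead@example.com", "read"),
        ("former@example.com", "read"),
    }
}

for change in diff_permissions(mapping, fabric_perms, uc_perms):
    print(change)
```

Returning the changes as data rather than applying them directly is what makes a dry run possible: the same list can be printed for review or handed to an apply step.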

## License

MIT
