Metadata-Version: 2.4
Name: bipixie-mcp
Version: 0.3.0
Summary: MCP server that gives AI agents read-only access to BI Pixie usage and engagement data via the Power BI REST executeQueries endpoint
Author-email: DataChant <support@bipixie.com>
Maintainer-email: DataChant <support@bipixie.com>
License: MIT
Project-URL: Homepage, https://bipixie.com
Project-URL: Documentation, https://bipixie.com/docs/cloud/
Project-URL: Support, https://bipixie.com/docs/cloud/contact-support
Keywords: mcp,power-bi,bi-pixie,analytics,model-context-protocol
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastmcp<3,>=2.11
Requires-Dist: msal>=1.28
Requires-Dist: azure-identity>=1.17
Requires-Dist: httpx>=0.27
Requires-Dist: pydantic>=2.7
Requires-Dist: pydantic-settings>=2.3
Provides-Extra: test
Requires-Dist: pytest>=8; extra == "test"
Requires-Dist: pytest-asyncio>=0.23; extra == "test"
Requires-Dist: respx>=0.21; extra == "test"
Provides-Extra: dev
Requires-Dist: bipixie-mcp[test]; extra == "dev"
Requires-Dist: mypy>=1.10; extra == "dev"
Requires-Dist: ruff>=0.4; extra == "dev"
Dynamic: license-file

# BI Pixie MCP Server

A read-only [Model Context Protocol](https://modelcontextprotocol.io/) server that lets AI agents — Claude Code, Codex, Microsoft Fabric data agents, and Azure AI Foundry — query your BI Pixie usage and engagement data.

> **Note:** Requires an active BI Pixie license and a deployed BI Pixie semantic model. "BI Pixie" is a trademark of DataChant. This package is a client and grants no rights to the BI Pixie service or data.

**Phase 1 data source:** the existing "BI Pixie" semantic model, queried via the Power BI REST `executeQueries` endpoint. No new data infrastructure is required. The server is a pure-additive Python 3.12 package (`bipixie_mcp`) that lives in `mcp_server/` and does not touch any existing Function App, portal, or workload code.

**Validated against the live model.** All 8 core data tools (`describe_model`, `list_reports`, `top_reports_by_usage`, `report_health`, `user_adoption`, `page_engagement`, `feedback_summary`, `run_dax`) were executed successfully against the BI Pixie SaaS sample dataset (plus `suggest_questions` for discovery and four stdio-only navigation tools — see the tool catalog). Every measure resolved; ranked output is deterministically ordered. Results match the customer's own Power BI report numbers (e.g. top report "Executive Dashboard" = 484 report sessions; CSAT 0.59, NPS -58.3; data through 2026-05-07).

---

## Contents

- [How it works](#how-it-works)
- [Prerequisites](#prerequisites)
- [Install](#install)
- [Configuration reference](#configuration-reference)
- [Local stdio quickstart — Claude Code](#local-stdio-quickstart--claude-code)
- [Local stdio quickstart — Codex](#local-stdio-quickstart--codex)
- [Hosted streamable-HTTP quickstart](#hosted-streamable-http-quickstart)
- [Tool catalog](#tool-catalog)
- [Security checklist](#security-checklist)
- [Tenant settings checklist](#tenant-settings-checklist)
- [Architecture notes](#architecture-notes)
- [Phase 2 — Fabric RTI scale-out](#phase-2--fabric-rti-scale-out)

---

## How it works

```
AI agent (Claude Code / Codex / Foundry)
        |
        | MCP JSON-RPC  (stdio  OR  POST /mcp)
        v
  bipixie_mcp.server  (FastMCP, Python 3.12)
        |
        | Power BI REST  executeQueries  (read-only DAX)
        | POST https://api.powerbi.com/v1.0/myorg/groups/{workspaceId}/datasets/{datasetId}/executeQueries
        v
  "BI Pixie" semantic model  (IMPORT-mode, ~60 tables, validated DAX measures)
        ^
        | data already loaded at refresh time
  ADLS Gen2  bipixielake-{license_key}/events/
```

The server wraps the model's curated measures — the same numbers you see in your BI Pixie dashboard — in typed, filterable MCP tools. Because the measures are already validated by the model, agent results match your Power BI reports exactly.
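As a rough illustration of the REST call in the diagram, the sketch below builds the URL and JSON body for one `executeQueries` request using only the standard library. The helper name `build_execute_queries_request` and the sample DAX are assumptions made for this example, not the server's actual code. The `queries` and `serializerSettings` fields follow the public Power BI REST API.

```python
import json

# Default value of BIPIXIE_MCP_POWERBI_API_BASE.
API_BASE = "https://api.powerbi.com/v1.0/myorg"


def build_execute_queries_request(workspace_id: str, dataset_id: str, dax: str) -> tuple[str, str]:
    """Return (url, json_body) for a single read-only DAX query.

    Hypothetical helper for illustration only. A real client would also
    attach a bearer token and apply rate limiting before sending.
    """
    url = f"{API_BASE}/groups/{workspace_id}/datasets/{dataset_id}/executeQueries"
    body = {
        "queries": [{"query": dax}],  # the endpoint accepts one query per request
        "serializerSettings": {"includeNulls": True},
    }
    return url, json.dumps(body)


url, body = build_execute_queries_request(
    "<workspace-guid>", "<dataset-guid>", 'EVALUATE ROW("Sessions", [Report Sessions])'
)
print(url)
print(body)
```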

**Two runtime targets, one codebase:**

| Target | Transport | `BIPIXIE_MCP_TRANSPORT` | Typical consumer |
|--------|-----------|------------------------|-----------------|
| Local developer machine | `stdio` | `stdio` (default) | Claude Code, Codex |
| Customer Azure environment | `streamable-http` | `streamable-http` | Fabric data agents, Azure AI Foundry, remote Claude Code |

---

## Prerequisites

### Python

Python 3.12 or later.

```
python --version   # must be 3.12+
```

On Windows with multiple Python versions installed, use `py -3.12` instead of `python`.

### Power BI tenant settings (customer admin action — required before first use)

The following settings must be enabled by a Power BI admin in your tenant. These are customer-side prerequisites; the server cannot self-provision them.

| Setting | Location in Power BI Admin portal | Why |
|---------|-----------------------------------|-----|
| **Dataset Execute Queries REST API** | Tenant settings > Integration settings | Hard gate. Disabled = 403 with no helpful body. |
| **Allow service principals to use Power BI APIs** | Tenant settings > Developer settings | Required for `service_principal` and `managed_identity` auth modes. |
| Workspace **Member** (or Admin) role for your SP or MI | Workspace settings > Manage access | Required so the app identity has Read + Build on the dataset. |
| (Optional) **Allow XMLA endpoints and Analyze in Excel** | Tenant settings | Only needed for `BIPIXIE_MCP_USE_ARROW=true` (Phase 1.5). Requires Premium/Fabric capacity. |

For `device_code` / `interactive` / `azure_cli` auth (local use), the customer's own user identity is used and workspace membership follows normal Power BI access — no service-principal enrollment needed.

### Entra app registration

For `azure_cli` (simplest local validation): **no app registration needed.** Reuses an existing `az login` session via `AzureCliCredential`. Run `az login` once in your shell, then set `BIPIXIE_MCP_AUTH_MODE=azure_cli` — no `BIPIXIE_MCP_CLIENT_ID` required.

For `device_code` / `interactive` (local, no existing az session): create a **public-client** app registration in your Entra tenant with:
- Redirect URI: `https://login.microsoftonline.com/common/oauth2/nativeclient` (for device code)
- API permission: `Power BI Service > Dataset.Read.All` (delegated)
- Token version: `requestedAccessTokenVersion = 2` (v2 tokens)

For `service_principal` (hosted): create a **confidential-client** app registration and add:
- API permission: `Power BI Service > Dataset.Read.All` (application)
- Grant admin consent
- Token version: `requestedAccessTokenVersion = 2`

For `managed_identity` (hosted, preferred): no app registration needed on the server side. The managed identity must be added as workspace Member.

For EasyAuth on the hosted endpoint itself (if deployed to Azure Container App / Function App), `allowedAudiences` must include both `{clientId}` and `api://{clientId}`, and `openIdIssuer` must be the v2 issuer: `https://login.microsoftonline.com/{tenantId}/v2.0`.

---

## Install

The server runs locally (stdio) in VS Code, Cursor, Claude Code / Desktop, and Codex. Install it from **PyPI** — no clone required:

```bash
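# Zero-install runner (recommended): runs the published package in a cached, isolated environment.
# The first run fetches the latest release; use "uvx bipixie-mcp@latest" to refresh later.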
uvx bipixie-mcp

# Or install into the current environment:
pip install bipixie-mcp
```

[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install_BI_Pixie_MCP-0098FF?logo=visualstudiocode&logoColor=white)](https://insiders.vscode.dev/redirect/mcp/install?name=bipixie&config=%7B%22command%22%3A%22uvx%22%2C%22args%22%3A%5B%22bipixie-mcp%22%5D%7D)

After installing, set the required `BIPIXIE_MCP_*` variables (see [Configuration reference](#configuration-reference)) — at minimum a tenant, workspace, and dataset, plus `BIPIXIE_MCP_AUTH_MODE=azure_cli` (then `az login`) for the simplest local auth.

<details>
<summary><strong>From source (maintainers / contributors)</strong></summary>

```bash
pip install -e "mcp_server/[test]"   # from the repo root
pip install -e ".[test]"             # from the mcp_server/ directory
```

</details>

**Dependencies installed automatically:**

| Package | Purpose |
|---------|---------|
| `fastmcp>=2.11` | MCP server framework (stdio + streamable-HTTP) |
| `msal>=1.28` | MSAL Python — device-code / interactive token cache |
| `azure-identity>=1.17` | `DefaultAzureCredential`, `ClientSecretCredential` |
| `httpx>=0.27` | Async HTTP for Power BI REST calls |
| `pydantic>=2.7` | Data validation |
| `pydantic-settings>=2.3` | `BIPIXIE_MCP_*` env-var config |

Test extras (`pytest`, `pytest-asyncio`, `respx`) are included with `[test]`.

---

## Configuration reference

All configuration is via environment variables (or a `.env` file in the working directory). Copy `.env.example` to `.env` and fill in your values. **Never commit `.env` or `~/.bipixie_mcp/token_cache.bin` to source control.**

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `BIPIXIE_MCP_TENANT_ID` | Yes | — | Entra (AAD) tenant GUID. Cloud customers: DataChant production tenant. Self-hosted: your own tenant GUID. Never hardcoded — wrong value breaks auth. |
| `BIPIXIE_MCP_AUTH_MODE` | No | `device_code` | Token-acquisition flow: `device_code` \| `interactive` \| `azure_cli` \| `service_principal` \| `managed_identity`. `azure_cli` reuses an existing `az login` session (no app registration, no client secret — simplest local validation). |
| `BIPIXIE_MCP_CLIENT_ID` | Yes (except azure_cli / managed_identity) | — | Entra app (client) ID. Public client for device-code/interactive; confidential for service_principal. Not needed for `azure_cli` or `managed_identity` (both obtain credentials without an explicit client registration). |
| `BIPIXIE_MCP_CLIENT_SECRET` | For SP only | — | Client secret for `service_principal` mode. In Azure, use a Key Vault reference (`@Microsoft.KeyVault(...)`). Never logged. |
| `BIPIXIE_MCP_WORKSPACE_ID` | One of ID/name | — | GUID of the Fabric/Power BI workspace containing the "BI Pixie" dataset. If unset, resolved from `BIPIXIE_MCP_WORKSPACE_NAME`. |
| `BIPIXIE_MCP_WORKSPACE_NAME` | One of ID/name | — | Workspace display name, used to resolve the workspace GUID at startup. Ignored when `WORKSPACE_ID` is set. |
| `BIPIXIE_MCP_DATASET_ID` | One of ID/name | — | GUID of the "BI Pixie" semantic model. If unset, resolved from `BIPIXIE_MCP_DATASET_NAME`. Fails fast on name collision — use an explicit ID when multiple "BI Pixie" datasets exist in the workspace. |
| `BIPIXIE_MCP_DATASET_NAME` | No | `BI Pixie` | Dataset display name for ID resolution. Override only if the customer renamed the model. |
| `BIPIXIE_MCP_POWERBI_API_BASE` | No | `https://api.powerbi.com/v1.0/myorg` | Power BI REST base URL. Change only for Sovereign Cloud (GovCloud / China). |
| `BIPIXIE_MCP_POWERBI_SCOPE` | No | `https://analysis.windows.net/powerbi/api/.default` | OAuth scope for executeQueries. Confirmed in `SemanticModelClient.ts:122-127` — Fabric-audience tokens are rejected by api.powerbi.com. |
| `BIPIXIE_MCP_TRANSPORT` | No | `stdio` | Runtime transport: `stdio` (local) or `streamable-http` (hosted). |
| `BIPIXIE_MCP_HTTP_HOST` | No | `0.0.0.0` | Bind host for streamable-http mode. Ignored in stdio mode. |
| `BIPIXIE_MCP_HTTP_PORT` | No | `8000` | Bind port for streamable-http mode. Azure injects `PORT` / `WEBSITES_PORT` automatically. |
| `BIPIXIE_MCP_DEFAULT_DAYS` | No | `30` | Default lookback window (days) when a tool's `days` argument is omitted. Cannot exceed the model's loaded history. |
| `BIPIXIE_MCP_DEFAULT_TOP_N` | No | `100` | Default row cap for ranking/list tools when `top_n` is omitted. |
| `BIPIXIE_MCP_MAX_ROWS` | No | `1000` | Hard server-side cap on rows returned by any tool. Enforced before serialization. |
| `BIPIXIE_MCP_PII_COLUMNS_ALLOWED` | No | `false` | When `false` (default), `ResponsePiiFilter` strips `Username` and `Client IP` columns from all results. Set `true` only after an informed GDPR review. The BI Pixie model ships no RLS, so this filter is the primary PII control across all auth modes. |
| `BIPIXIE_MCP_USE_ARROW` | No | `false` | Opt-in to the `executeDaxQueries` (Arrow) endpoint for larger result sets. Requires Premium/Fabric capacity and the "Allow XMLA endpoints" tenant setting. |
| `BIPIXIE_MCP_QUERY_RATE_LIMIT` | No | `60` | Client-side cap on executeQueries calls per minute per identity, to stay under the Power BI ~120/min quota. |
| `BIPIXIE_MCP_TOKEN_CACHE_PATH` | No | `~/.bipixie_mcp/token_cache.bin` | MSAL serializable token cache path (device_code / interactive modes). Persists the refresh token across restarts. Add to `.gitignore`. |
| `BIPIXIE_MCP_LOG_LEVEL` | No | `INFO` | Logging level (`DEBUG` / `INFO` / `WARNING` / `ERROR`). All logs go to **stderr** — stdout is reserved for the MCP JSON-RPC stream. |
| `BIPIXIE_MCP_ALLOW_TARGET_SWITCHING` | No | `true` | Registers the four stdio-only navigation tools (`list_workspaces`, `list_datasets`, `set_active_dataset`, `get_active_dataset`). Set `false` to lock the server to the configured dataset. Has no effect on streamable-http, where these tools are never exposed. |

**Cloud vs. self-hosted:** the config keys are identical in both deployments. Only the values differ — cloud customers supply DataChant's tenant GUID and their own workspace/dataset IDs; self-hosted enterprise customers supply their own tenant GUID, their own Entra app registration, and their own workspace/dataset IDs. Nothing is hardcoded in the server.
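The "Required" column encodes a few cross-field rules (client ID depends on the auth mode, the workspace needs an ID or a name). A minimal sketch of those rules as a pre-flight check is shown here. `check_config` is a hypothetical helper written for this README, not the server's `validate_runtime()`, and it only covers the rules stated in the table above.

```python
VALID_AUTH_MODES = {"device_code", "interactive", "azure_cli", "service_principal", "managed_identity"}
NO_CLIENT_ID_MODES = {"azure_cli", "managed_identity"}


def check_config(env: dict) -> list:
    """Return a list of human-readable problems; an empty list means the config looks usable."""
    problems = []
    mode = env.get("BIPIXIE_MCP_AUTH_MODE", "device_code")  # documented default
    if mode not in VALID_AUTH_MODES:
        problems.append(f"unknown BIPIXIE_MCP_AUTH_MODE: {mode}")
    if not env.get("BIPIXIE_MCP_TENANT_ID"):
        problems.append("BIPIXIE_MCP_TENANT_ID is required")
    if mode not in NO_CLIENT_ID_MODES and not env.get("BIPIXIE_MCP_CLIENT_ID"):
        problems.append(f"BIPIXIE_MCP_CLIENT_ID is required for {mode}")
    if mode == "service_principal" and not env.get("BIPIXIE_MCP_CLIENT_SECRET"):
        problems.append("BIPIXIE_MCP_CLIENT_SECRET is required for service_principal")
    if not (env.get("BIPIXIE_MCP_WORKSPACE_ID") or env.get("BIPIXIE_MCP_WORKSPACE_NAME")):
        problems.append("set BIPIXIE_MCP_WORKSPACE_ID or BIPIXIE_MCP_WORKSPACE_NAME")
    # The dataset needs no check: BIPIXIE_MCP_DATASET_NAME defaults to "BI Pixie".
    return problems


print(check_config({
    "BIPIXIE_MCP_TENANT_ID": "<tenant-guid>",
    "BIPIXIE_MCP_AUTH_MODE": "azure_cli",
    "BIPIXIE_MCP_WORKSPACE_NAME": "<workspace-name>",
}))
```

In practice you could pass `dict(os.environ)` to run the same check against your shell before launching the server.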

---

## Local stdio quickstart — Claude Code

### 1. Set up your `.env`

```bash
# mcp_server/.env
BIPIXIE_MCP_TENANT_ID=<your-entra-tenant-guid>                 # your Entra tenant
BIPIXIE_MCP_CLIENT_ID=<your-public-client-app-id>
BIPIXIE_MCP_WORKSPACE_ID=<your-fabric-workspace-guid>          # or use WORKSPACE_NAME
BIPIXIE_MCP_DATASET_NAME=BI Pixie
BIPIXIE_MCP_AUTH_MODE=device_code
```

### 2. Register with Claude Code (one-liner)

```bash
claude mcp add --transport stdio --scope project bipixie -- python -m bipixie_mcp.server
```

Pass required env vars inline or let Claude Code inherit them from your shell.

### 3. Register via `.mcp.json` (project-scoped, checked in)

Create or add to `.mcp.json` in your project root:

```json
{
  "mcpServers": {
    "bipixie": {
      "type": "stdio",
      "command": "python",
      "args": ["-m", "bipixie_mcp.server"],
      "env": {
        "BIPIXIE_MCP_TENANT_ID": "${BIPIXIE_MCP_TENANT_ID}",
        "BIPIXIE_MCP_CLIENT_ID": "${BIPIXIE_MCP_CLIENT_ID}",
        "BIPIXIE_MCP_WORKSPACE_ID": "${BIPIXIE_MCP_WORKSPACE_ID}",
        "BIPIXIE_MCP_DATASET_NAME": "BI Pixie",
        "BIPIXIE_MCP_AUTH_MODE": "device_code"
      }
    }
  }
}
```

On first use, the server performs a device-code flow: it prints a URL and code to stderr, and the user authenticates in a browser. The MSAL token cache at `BIPIXIE_MCP_TOKEN_CACHE_PATH` persists the refresh token so subsequent restarts are silent.

**Important:** add `~/.bipixie_mcp/` to your global `.gitignore`. The token cache contains long-lived credentials.

---

## Local stdio quickstart — Codex

Add to `~/.codex/config.toml`:

```toml
[mcp_servers.bipixie]
command = "python"
args = ["-m", "bipixie_mcp.server"]
env_vars = [
  "BIPIXIE_MCP_TENANT_ID",
  "BIPIXIE_MCP_CLIENT_ID",
  "BIPIXIE_MCP_WORKSPACE_ID",
  "BIPIXIE_MCP_DATASET_NAME",
  "BIPIXIE_MCP_AUTH_MODE",
  "BIPIXIE_MCP_CLIENT_SECRET",    # only for service_principal mode
]
```

Set the referenced variables in your shell environment before starting Codex. Use `BIPIXIE_MCP_AUTH_MODE=device_code` for interactive local use, `service_principal` for CI/automation.

---

## Hosted streamable-HTTP quickstart

### Overview

Set `BIPIXIE_MCP_TRANSPORT=streamable-http`. The server exposes a single `POST /mcp` endpoint (MCP Streamable-HTTP protocol, not the deprecated SSE transport). FastMCP is instantiated with `stateless_http=True` so the server scales horizontally with no server-side session state.

### Recommended host: Azure Container App

```bash
# Build and push the image
docker build -t bipixie-mcp:latest ./mcp_server
az acr build --registry <your-acr> --image bipixie-mcp:latest ./mcp_server

# Deploy (minimal example — add RBAC and Key Vault refs for production)
az containerapp create \
  --name bipixie-mcp \
  --resource-group <your-resource-group> \
  --environment <your-aca-env> \
  --image <your-acr>.azurecr.io/bipixie-mcp:latest \
  --target-port 8000 \
  --ingress external \
  --env-vars \
      BIPIXIE_MCP_TRANSPORT=streamable-http \
      BIPIXIE_MCP_AUTH_MODE=managed_identity \
      BIPIXIE_MCP_TENANT_ID=<tenant-id> \
      BIPIXIE_MCP_WORKSPACE_ID=<workspace-guid> \
      BIPIXIE_MCP_DATASET_NAME="BI Pixie"
```

For `service_principal` mode, inject `BIPIXIE_MCP_CLIENT_SECRET` as a Key Vault reference rather than a plaintext env var.

The managed identity assigned to the Container App must be added as **Member** (or Admin) on the target Power BI workspace.

### Auth at the edge (EasyAuth)

Front the Container App / Function App with **App Service EasyAuth** (Microsoft Entra ID provider):

- `requireAuthentication = true`
- Function-level auth: `ANONYMOUS` (EasyAuth is the gate — do NOT use function keys on Flex Consumption; see documented gotcha in CLAUDE.md)
- `allowedAudiences`: include both `<clientId>` and `api://<clientId>`
- `openIdIssuer`: `https://login.microsoftonline.com/<tenantId>/v2.0` (v2 tokens required)

### Register in Claude Code (remote)

```bash
claude mcp add \
  --transport http \
  --header "Authorization: Bearer <token>" \
  bipixie \
  https://<your-app>.azurecontainerapps.io/mcp
```

### Register in Azure AI Foundry

1. Open your Azure AI Foundry project.
2. Navigate to **Tools > Add tool > Custom > Model Context Protocol**.
3. Enter:
   - **Endpoint URL**: `https://<your-app>/mcp`
   - **Authentication**: Microsoft Entra
   - **Type**: Project Managed Identity (machine-to-machine) or OAuth Identity Passthrough (per-user delegated tokens). Because the BI Pixie semantic model ships no RLS, both types see the same full usage data — choose based on your operational preference, not for data-access reasons.
   - **Audience**: the App ID URI of your MCP Entra app registration (e.g. `api://<clientId>`)
4. Test connectivity. The tool list should populate with the eight BI Pixie tools.

Note the 100-second non-streaming timeout that Azure AI Foundry enforces on MCP tool calls. All BI Pixie tools complete well within this limit for typical query sizes.

### Register as a Fabric data agent tool

In the Fabric data agent builder, add an MCP tool pointing at the `/mcp` endpoint with Microsoft Entra auth. The data agent will use the eight typed tools alongside (or instead of) the model's built-in VerifiedAnswers and CopilotTooling conversational path. The two are complementary: the custom MCP server provides structured JSON output with filter parameters and pagination; the model's native Fabric Copilot integration provides natural-language Q&A over the existing 39 VerifiedAnswers.

---

## Tool catalog

Call `describe_model` first — it returns the full queryable surface so the agent knows which table names, measures, and dimension columns are available before constructing a `run_dax` query.

| Tool | Required args | Optional args | What it returns |
|------|--------------|---------------|----------------|
| `describe_model` | — | `include_measure_descriptions` | Dataset ID/name, table list, measure catalog by domain, dimension columns, data freshness |
| `list_reports` | — | `days`, `top_n`, `workspace_name` | Tracked reports with sessions, total hours, unique users, last activity |
| `top_reports_by_usage` | — | `metric`, `days`, `top_n`, `ascending` | Reports ranked by `report_sessions` / `total_hours` / `users` / `interactive_sessions` / `avg_session_duration` / `interactions` (use `interactions` for "most/least clicks") |
| `report_health` | `report_name` | `days` | Full health card: sessions, hours, avg duration, interactive vs passive, users, CSAT (last + multiple), NPS |
| `user_adoption` | — | `days`, `granularity`, `report_name` | MAU, DAU, DAU/MAU, WAU, Engaged Users, New Users, Returning Users — summary or time series |
| `page_engagement` | — | `report_name`, `days`, `top_n` | Per-page: page sessions, avg duration, total interactions, avg interactions/session, slicer clicks, visual interactions, tooltip opens |
| `feedback_summary` | — | `days`, `report_name`, `group_by`, `granularity`, `survey_type`, `top_n` | CSAT (both `csat_multiple` — the report's headline, click-weighted — and `csat_last`), the sentiment split (positive/negative clicks, neutral users, response rate), NPS (score, rating, promoters/detractors/passives), and the survey story (respondents, responses, time savings, self-reported **financial gains $**). `group_by` (`report`/`icon`/`workspace`/`question`/`answer`/`survey_type`) breaks it out (domain-aware — only the measures that vary across the dimension); `granularity` (`daily`/`weekly`/`monthly`) returns a trend; an empty window self-heals with a `data_coverage` hint. `group_by_report` kept as a deprecated alias for `group_by="report"`. |
| `run_dax` | `dax` | `row_limit` | Escape hatch: execute any read-only `EVALUATE` DAX query and get rows as JSON |
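MCP clients normally build tool calls for you, but it can help to see what travels over the wire. The sketch below serializes a `tools/call` request as a JSON-RPC 2.0 message, using argument names from the table above. `tool_call` is a hypothetical helper for illustration, and a real session also performs the MCP initialization handshake before any tool call.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id per session


def tool_call(name: str, **arguments) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


# "Which five reports got the most clicks in the last 30 days?"
print(tool_call("top_reports_by_usage", metric="interactions", days=30, top_n=5))
```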

### Local navigation tools (stdio only)

On the local **stdio** transport, the server also registers four navigation tools so you can point it at a different workspace/dataset without editing config or restarting — handy when one identity can see several `BI Pixie` models. They are enabled by default (`BIPIXIE_MCP_ALLOW_TARGET_SWITCHING=true`); set it to `false` to lock the server to the configured dataset. They are **never** exposed on the hosted (streamable-http) transport, which keeps its single-configured-dataset contract.

| Tool | Required args | What it does |
|------|--------------|--------------|
| `list_workspaces` | — | List the Power BI workspaces the server identity can see (read-only, one `GET /groups`). |
| `list_datasets` | `workspace_id` | List datasets in a workspace, flagging likely `BI Pixie` models via `is_bi_pixie`. |
| `set_active_dataset` | `workspace_id`, `dataset_id` | Re-point every tool at a different workspace+dataset for the rest of the session (resets on restart). Not read-only — it changes session state. |
| `get_active_dataset` | — | Show the workspace+dataset the tools are currently querying, with its freshness footer. |

### Freshness footer

Every tool response includes a freshness footer:

```json
{
  "data_through": "2026-05-29",
  "last_refresh_utc": "2026-05-30T04:12:00Z"
}
```

`data_through` is the `[Last Activity]` date in the model. `last_refresh_utc` is the most recent successful refresh from the Power BI refresh history API. Because the model is IMPORT mode, data is current as of the last scheduled refresh — not real-time.
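An agent or wrapper script can use the footer to decide whether an answer is stale. The following sketch computes the lag from the two footer fields. `freshness_lag` and its output keys are assumptions made for this example, not part of the server's response.

```python
from datetime import date, datetime, timezone


def freshness_lag(footer: dict, now: datetime) -> dict:
    """Summarize how stale a response is, given its freshness footer."""
    data_through = date.fromisoformat(footer["data_through"])
    # Normalize the trailing "Z" to an explicit UTC offset before parsing.
    last_refresh = datetime.fromisoformat(footer["last_refresh_utc"].replace("Z", "+00:00"))
    return {
        "days_since_last_activity": (now.date() - data_through).days,
        "hours_since_refresh": round((now - last_refresh).total_seconds() / 3600, 1),
    }


footer = {"data_through": "2026-05-29", "last_refresh_utc": "2026-05-30T04:12:00Z"}
print(freshness_lag(footer, now=datetime(2026, 5, 30, 16, 12, tzinfo=timezone.utc)))
# {'days_since_last_activity': 1, 'hours_since_refresh': 12.0}
```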

### `run_dax` safety

`run_dax` passes every query through `DaxGuard` before dispatch:

- Query must begin with `EVALUATE` (a `DEFINE ... EVALUATE ...` preamble is allowed).
- The following write verbs hard-reject the query and return `{"error": "dax_rejected", "reason": "..."}`: `CREATE`, `ALTER`, `DELETE`, `DROP`, `MERGE`, `INSERT`, `UPDATE`, `PROCESS`, `REFRESH`, and related DDL keywords.
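- A query cannot change the workspace or dataset ID: `run_dax` always runs against the active dataset. On the hosted transport that is the single configured dataset; on stdio it changes only through `set_active_dataset`.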
- Results pass `ResponsePiiFilter` (strips `Username` and `Client IP` columns when `BIPIXIE_MCP_PII_COLUMNS_ALLOWED=false`).
- Row count is capped at `BIPIXIE_MCP_MAX_ROWS`.
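The first two rules can be sketched with regular expressions. This is an illustrative approximation, not the real `DaxGuard`: the function name `check_dax`, the exact verb list, and the reason strings are assumptions. String literals, quoted identifiers, and comments are blanked out first so that a measure such as `[Last Refresh]` is not mistaken for a write verb.

```python
import re
from typing import Optional

WRITE_VERBS = ("CREATE", "ALTER", "DELETE", "DROP", "MERGE", "INSERT", "UPDATE", "PROCESS", "REFRESH")

_IGNORED = re.compile(
    r'"(?:[^"]|"")*"'        # string literals ("" escapes a quote)
    r"|'(?:[^']|'')*'"       # quoted table names
    r"|\[(?:[^\]]|\]\])*\]"  # bracketed column and measure names
    r"|--[^\n]*|//[^\n]*"    # line comments
    r"|/\*.*?\*/",           # block comments
    re.DOTALL,
)
_WRITE_VERB = re.compile(r"\b(" + "|".join(WRITE_VERBS) + r")\b", re.IGNORECASE)
_STARTS_OK = re.compile(r"^\s*(?:EVALUATE\b|DEFINE\b.*\bEVALUATE\b)", re.IGNORECASE | re.DOTALL)


def check_dax(dax: str) -> Optional[dict]:
    """Return None if the query looks read-only, else a rejection envelope."""
    code = _IGNORED.sub(" ", dax)  # keep only keywords and unquoted names
    if not _STARTS_OK.match(code):
        return {"error": "dax_rejected", "reason": "query must begin with EVALUATE"}
    hit = _WRITE_VERB.search(code)
    if hit:
        return {"error": "dax_rejected", "reason": f"write verb not allowed: {hit.group(1).upper()}"}
    return None


print(check_dax('EVALUATE ROW("Sessions", [Report Sessions])'))
print(check_dax("EVALUATE Reports; DROP TABLE Reports"))
```

The check is deliberately conservative: an unquoted table named `Update` would be rejected, which is an acceptable trade-off for an escape-hatch tool.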

---

## Security checklist

- **Never commit** `.env`, `token_cache.bin`, or any file containing `BIPIXIE_MCP_CLIENT_SECRET`. Add `mcp_server/.env` and `~/.bipixie_mcp/` to `.gitignore`.
- **The BI Pixie semantic model ships no row-level security (RLS).** Every auth mode therefore sees the same full usage data — this is by design. Each team or Enterprise tenant installs the BI Pixie Dashboard under its own license key and container and runs its own MCP server against its own model, so anyone who can run the server is already entitled to that model's usage data. The PII filter (`BIPIXIE_MCP_PII_COLUMNS_ALLOWED`, default `false`) — not RLS — is the control for user-identifying columns (`Username` / `Client IP`).
- **`run_dax` is read-only** — `DaxGuard` enforces this before every dispatch. No write, DDL, or refresh operation can reach the dataset.
- **Client secrets never appear in logs.** A redacting filter masks UUIDs and 40-plus-character hex strings from all log output. All logs go to stderr; stdout is reserved for the JSON-RPC stream.
- **Token cache (`~/.bipixie_mcp/token_cache.bin`)** contains serialized MSAL credentials. Protect it like a password. It is unused in `managed_identity` mode.
- **PII in the model:** the `User and IP Addresses` table contains plaintext UPNs and client IPs. `ResponsePiiFilter` strips `Username` / `Client IP` columns from all results by default. Only an operator who has reviewed GDPR implications should set `BIPIXIE_MCP_PII_COLUMNS_ALLOWED=true`.
- **Multi-tenant cloud hosting:** if the hosted server runs in the vendor Azure subscription and serves multiple customers, each customer should have its own service principal (separate `BIPIXIE_MCP_CLIENT_ID` / `BIPIXIE_MCP_CLIENT_SECRET`) so the ~120 req/min per-identity Power BI quota is not shared.
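To make the PII control above concrete, here is a minimal sketch of column stripping over `executeQueries`-style rows, whose keys look like `Table[Column]` or `[Measure]`. The names `strip_pii` and `column_name` are hypothetical; the real `ResponsePiiFilter` may match columns differently.

```python
import re

PII_COLUMNS = {"username", "client ip"}  # compared case-insensitively
_COLUMN_NAME = re.compile(r"\[([^\]]+)\]\s*$")


def column_name(key: str) -> str:
    """Extract 'username' from keys like 'User and IP Addresses[Username]'."""
    match = _COLUMN_NAME.search(key)
    return (match.group(1) if match else key).strip().lower()


def strip_pii(rows: list, pii_allowed: bool = False) -> list:
    """Drop user-identifying columns unless the operator has opted in."""
    if pii_allowed:
        return rows
    return [
        {key: value for key, value in row.items() if column_name(key) not in PII_COLUMNS}
        for row in rows
    ]


rows = [{
    "User and IP Addresses[Username]": "user@example.com",
    "User and IP Addresses[Client IP]": "203.0.113.7",
    "[Report Sessions]": 3,
}]
print(strip_pii(rows))  # [{'[Report Sessions]': 3}]
```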

---

## Tenant settings checklist

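Complete this checklist before connecting the server. The tenant settings require a Power BI admin; workspace access requires a workspace Admin, and admin consent on the app registration requires an Entra administrator.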

```
[ ] Power BI Admin portal > Tenant settings > Integration settings:
    "Dataset Execute Queries REST API" = ENABLED
    (disabled = silent 403; this is the most common setup failure)

[ ] Power BI Admin portal > Tenant settings > Developer settings:
    "Allow service principals to use Power BI APIs" = ENABLED
    (required for service_principal and managed_identity auth modes)

[ ] Fabric workspace > Manage access:
    Service principal / managed identity added as Member or Admin
    (grants Read + Build on the dataset)

[ ] Entra app registration:
    API permission "Power BI Service > Dataset.Read.All" granted and admin-consented
    requestedAccessTokenVersion = 2  (v2 tokens required)

[ ] (Optional, Phase 1.5 only) Power BI Admin portal > Tenant settings:
    "Allow XMLA endpoints and Analyze in Excel" = ENABLED
    Workspace on Premium or Fabric capacity
    (only needed when BIPIXIE_MCP_USE_ARROW=true)
```

---

## Architecture notes

### Why the existing semantic model (Phase 1)

The "BI Pixie" semantic model ships ~60 tables, a curated DAX measure library covering adoption, engagement, feedback, survey, and security, 39 VerifiedAnswers, and CopilotTooling annotations. The MCP server's typed tools wrap these confirmed measures — `[Report Sessions]`, `[Total Hours]`, `[Avg Session Duration (sec)]`, `[Interactive Sessions]`, `[MAU]`, `[DAU]`, `[DAU / MAU]`, `[Engaged Users]`, `[New Users]`, `[Returning Users]`, `[CSAT (Last Response)]`, `[CSAT (Multiple Responses)]`, `[Feedback Clicks]`, `[Positive Clicks]`, `[Negative Clicks]`, `[Average Satisfaction]`, `[Neutral Users]`, `[Respondents]`, `[% Feedback Responses]`, `[NPS]`, `[NPS Rating]`, `[NPS Promoters]`, `[NPS Detractors]`, `[NPS Passives]`, `[Survey Respondents]`, `[Survey Responses]`, `[Survey Time Savings (Hours)]`, `[Survey Financial Gains]`, `[Survey Avg Financial Gains]`, `[Page Sessions]`, `[Total Interactions Within Page]`, `[Slicer Clicks]`, `[Visual Interactions]`, `[Tooltip Opens]`, `[Avg Duration in Page (Sec)]` — so agent numbers match the Power BI report exactly.

The `executeQueries` endpoint is a public REST endpoint available on all Power BI tiers (Pro, PPU, Premium, Fabric). Aggregated measure queries return tens of rows, comfortably inside the 100K-row / 1M-value / 15 MB / 120-rpm caps.
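Staying under the 120-rpm cap is the job of the client-side limit (`BIPIXIE_MCP_QUERY_RATE_LIMIT`, default 60 per minute). One common way to implement such a limit is a sliding window of call timestamps, sketched below. The class name and the choice of algorithm are assumptions for illustration; the server's actual limiter may work differently.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` within any `window_seconds` span."""

    def __init__(self, max_calls: int = 60, window_seconds: float = 60.0):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._calls = deque()  # timestamps of accepted calls, oldest first

    def try_acquire(self, now: float) -> bool:
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.window_seconds:
            self._calls.popleft()
        if len(self._calls) >= self.max_calls:
            return False  # caller should wait or surface a throttling error
        self._calls.append(now)
        return True


limiter = SlidingWindowLimiter(max_calls=2, window_seconds=60)
print([limiter.try_acquire(t) for t in (0.0, 1.0, 2.0, 60.0)])
# [True, True, False, True]
```

Passing `now` explicitly keeps the limiter easy to test; production code would typically supply `time.monotonic()`.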

### Package layout

```
mcp_server/
  bipixie_mcp/
    __init__.py       # package marker, lazy public re-exports: __version__, Settings, get_settings, build_server, main
    config.py         # Settings (pydantic-settings, BIPIXIE_MCP_* prefix), get_settings(), ConfigError
    auth.py           # TokenProvider, build_credential(), AuthError
    powerbi_client.py # DaxGuard, ResponsePiiFilter, PowerBIClient, QueryResult, ModelMetadata, McpSecurityError
    tools.py          # register_tools(), build_*_dax() helpers, METRIC_MEASURE_MAP, MEASURE_CATALOG
    server.py         # build_server(), main() — composition root and entrypoint
  tests/
    test_tools.py     # pytest suite (no live Azure/Power BI — uses respx + AsyncMock)
  examples/
    agent-config.md   # copy-paste registration recipes for Claude Code, Codex, Foundry
  pyproject.toml      # PEP 621 metadata, console-script bipixie-mcp = bipixie_mcp.server:main
  .env.example        # full BIPIXIE_MCP_* env template
  README.md           # this file
```

### Module contracts

- `config.Settings` — loaded once via `get_settings()` (lru_cache singleton). No network calls, no credentials built at import time. `validate_runtime()` fails fast with a human-readable `ConfigError` (no secrets in the message) before any network I/O.
- `auth.TokenProvider` — caches the bearer token process-wide, transparently refreshing ~5 minutes before the ~1-hour expiry. Never calls Power BI per-request.
- `powerbi_client.PowerBIClient` — async; `execute_dax()` enforces DaxGuard, rate limit, `max_rows`, error-envelope parsing, and `ResponsePiiFilter`. `resolve_ids()` resolves workspace/dataset names to GUIDs once and caches the result.
- `tools.register_tools()` — decorates the eight tools on the `FastMCP` instance. DAX builders (`build_*_dax`) are pure functions with no I/O, independently unit-tested. `report_name` and other user-supplied filter values are passed via `TREATAS({value}, table[column])` — never string-concatenated into a measure name — preventing DAX injection.
- `server.build_server()` — composition root: loads config, builds credential/client, registers tools, returns a configured `FastMCP` instance. `main()` is the console-script entry point and the `python -m bipixie_mcp.server` target.
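The `TREATAS` approach mentioned for `tools.register_tools()` can be sketched as a pure DAX builder. The function names and the `'Reports'[Report Name]` column reference below are placeholders, not the model's real schema or the package's actual `build_*_dax` helpers. The key point is that the user-supplied value only ever appears inside a quoted string literal.

```python
def dax_string(value: str) -> str:
    """Quote a value as a DAX string literal (a double quote is escaped by doubling it)."""
    return '"' + value.replace('"', '""') + '"'


def build_report_sessions_dax(report_name: str) -> str:
    """Build a query that returns [Report Sessions] for one report."""
    # 'Reports'[Report Name] is a placeholder column reference for this sketch.
    report_filter = f"TREATAS({{{dax_string(report_name)}}}, 'Reports'[Report Name])"
    return f'EVALUATE CALCULATETABLE(ROW("Report Sessions", [Report Sessions]), {report_filter})'


print(build_report_sessions_dax("Executive Dashboard"))
# A value containing quotes stays inside the literal instead of ending it early:
print(build_report_sessions_dax('x"}, T[c])) EVALUATE Users'))
```

Because the builder performs no I/O, it can be unit-tested by comparing the returned string, with no Power BI connection involved.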

### Running tests

```bash
cd mcp_server
python -m pytest tests/ -v --tb=short   # Windows with several Pythons: py -3.12 -m pytest tests/ -v --tb=short
```

Tests run without any Azure or Power BI connection (no live credentials required). Fixtures use `AsyncMock` for `PowerBIClient` methods and `respx` for HTTP-layer tests.

---

## Phase 2 — Fabric RTI scale-out

When Phase 1 limits bite — `executeQueries` 429 throttling under agent load, raw row-level drill-down beyond ~50K rows, time-series anomaly detection (`series_decompose_anomalies`), sub-2-second historical scans over 30+ days, or near-real-time "who is viewing this report right now" — the preferred upgrade path is **Microsoft Fabric Real-Time Intelligence (RTI)**.

The design:
1. Tee the existing BI Pixie Event Hub into a Fabric Eventstream -> Eventhouse `Events` table (new consumer group only — `event_consumer_app` -> ADLS stays untouched).
2. Backfill history from `bipixielake-{license_key}/events/` via the Fabric "Get data from Azure Storage" TSV wizard.
3. Replace (or supplement) the Phase-1 server with the **open-source `microsoft-fabric-rti-mcp` server** (`pip install microsoft-fabric-rti-mcp`) pointing at the Eventhouse KQL endpoint — or use the zero-install `api.fabric.microsoft.com/v1/mcp/dataPlane/.../kqlEndpoint` remote MCP endpoint.

Because `config.py` abstracts all connection details, the cut-over is a deployment/config change, not a rewrite of the agent-facing tool contract.

See `project/ai-next-generation-proposal.md` (Pillar 2, "Engagement-Enriched Fabric Ontology") for strategic context.

---

## License

Released under the MIT License; see the `LICENSE` file. An active BI Pixie license is still required to use the BI Pixie service and data.
