Metadata-Version: 2.4
Name: azure-discovery
Version: 0.1.9
Summary: Lightweight Azure tenant discovery and visualization via Resource Graph. Enumerates subscriptions and resources, normalizes results, and renders interactive dependency graphs. Supports public and sovereign clouds (Gov, China, Germany, Azure Stack).
Author: David Frazer <david.frazer336@gmail.com>
License-Expression: MIT
Project-URL: Documentation, https://github.com/maravedi/AzureDiscovery#readme
Project-URL: Repository, https://github.com/maravedi/AzureDiscovery
Project-URL: Bug Tracker, https://github.com/maravedi/AzureDiscovery/issues
Keywords: azure,resource-graph,discovery,inventory,sovereign-cloud,visualization
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: System :: Systems Administration
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pydantic>=2.7
Requires-Dist: fastapi>=0.110
Requires-Dist: uvicorn>=0.30
Requires-Dist: typer>=0.12
Requires-Dist: pyvis>=0.3.2
Requires-Dist: msgraph-sdk>=1.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: azure-identity>=1.17
Requires-Dist: azure-mgmt-resourcegraph>=8.0
Requires-Dist: azure-mgmt-subscription>=3.1
Requires-Dist: azure-mgmt-authorization>=4.0
Requires-Dist: azure-mgmt-security>=7.0
Requires-Dist: rich-transient>=0.1.1
Provides-Extra: dev
Requires-Dist: pytest>=8.2; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23; extra == "dev"
Requires-Dist: pytest-cov>=5.0; extra == "dev"
Requires-Dist: ruff>=0.4.0; extra == "dev"
Requires-Dist: mypy>=1.10; extra == "dev"
Requires-Dist: build>=1.0; extra == "dev"
Requires-Dist: twine>=5.0; extra == "dev"
Dynamic: license-file

# Azure Discovery

Azure Discovery is a lightweight Azure tenant mapper that enumerates subscriptions and resources via Azure Resource Graph, normalizes the results, and renders an interactive dependency graph. The tool exposes both a Typer-based CLI (`azure-discovery`) and a FastAPI surface so the same discovery workflow can be automated or embedded in other services.

The package is published on [PyPI](https://pypi.org/project/azure-discovery/) as **azure-discovery** and can be installed with `pip install azure-discovery`.

## Core capabilities

- Builds environment-aware credential chains (Azure CLI + DefaultAzureCredential) with guardrails for unsupported clouds.
- Queries Azure Resource Graph with include/exclude filters, tag constraints, and resource group scopes.
- Resolves subscriptions automatically when not provided and de-duplicates resources for consistent graph IDs.
- Produces JSON summaries, console metrics, and PyVis HTML graphs for quick triage.
- Optionally enumerates Entra ID objects via Microsoft Graph (organization/domains, users, groups, applications, service principals, conditional access policies, risky users) with bounded relationship expansion.
- Optionally enumerates Azure RBAC role assignments and definitions with principal-to-resource relationship mapping.
- Optionally enumerates PIM (Privileged Identity Management) eligible role assignments for both Entra ID roles and Azure resource roles.
- Optionally enumerates Defender for Cloud security alerts, assessments, and secure scores for security posture analysis.
- Offers identical request/response contracts (Pydantic models) across CLI and API, following the Receive an Object, Return an Object (RORO) pattern.
- Supports all Azure clouds: public, Government (GCC/GCC-H), China, Germany, and Azure Stack.
- Applies adaptive rate control and batching for large-scale tenant discovery.
- Hardens the API with pluggable authentication (Azure AD, API key), rate limiting, audit logging, and CORS.

## Installation

**From PyPI (recommended):**

```bash
pip install azure-discovery
```

**With optional development dependencies:**

```bash
pip install azure-discovery[dev]
```

**From source (e.g. for development or when embedded in another repo):**

```bash
git clone https://github.com/maravedi/AzureDiscovery.git
cd AzureDiscovery
pip install -e .[dev]
```

## Package layout

When installed, the package provides the **azure_discovery** Python package:

```
azure_discovery/
  __init__.py       # run_discovery, AzureDiscoveryRequest, AzureDiscoveryResponse, etc.
  cli.py            # Typer command surface (entry point: azure-discovery)
  api.py            # FastAPI app for /discover and visualization endpoints
  orchestrator.py   # Async coordinator for enumeration + visualization
  adt_types/        # Pydantic models and custom exceptions
  enumerators/      # Resource Graph query builder and normalization
  reporting/        # Console logging and HTML/PyVis graph generation
  utils/            # Azure SDK clients, graph helpers, structured logging
```

## Usage

### CLI

ARM-only discovery:

```bash
azure-discovery discover \
  --tenant-id <tenant-guid> \
  --subscription <sub-id-1> --subscription <sub-id-2> \
  --include-type "Microsoft.Compute/virtualMachines" \
  --resource-group core-infra \
  --required-tag environment=prod \
  --visualization-output-dir artifacts/graphs
```

ARM + Entra discovery (example):

```bash
azure-discovery discover \
  --tenant-id <tenant-guid> \
  --subscription <sub-id-1> \
  --include-entra \
  --entra-group-membership-max-groups 50 \
  --entra-group-membership-max-members-per-group 200
```

Entra + RBAC discovery only (without Azure resources; pass `--subscription` so RBAC has scope):

```bash
azure-discovery discover \
  --tenant-id <tenant-guid> \
  --subscription <sub-id> \
  --no-include-azure-resources \
  --include-entra \
  --include-rbac-assignments
```

ARM + RBAC discovery (example):

```bash
azure-discovery discover \
  --tenant-id <tenant-guid> \
  --subscription <sub-id-1> \
  --include-rbac-assignments \
  --include-rbac-definitions \
  --rbac-scope "/subscriptions/<sub-id-1>"
```

ARM + PIM discovery (example):

```bash
azure-discovery discover \
  --tenant-id <tenant-guid> \
  --subscription <sub-id-1> \
  --include-pim \
  --pim-include-entra-eligibilities \
  --pim-include-azure-resource-eligibilities
```

ARM + Defender for Cloud discovery (example):

```bash
azure-discovery discover \
  --tenant-id <tenant-guid> \
  --subscription <sub-id-1> \
  --include-defender-cloud \
  --defender-alert-severity High \
  --defender-alert-status Active
```
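
When the CLI is driven from another script, the flag patterns above can be assembled from plain Python values. The following is a minimal sketch, not part of the package: `build_discover_argv` is a hypothetical helper, and it assumes `--include-type` can be repeated in the same way as `--subscription` and `--required-tag`.

```python
import shlex


def build_discover_argv(tenant_id, subscriptions=(), include_types=(), required_tags=None):
    """Assemble an argv list for `azure-discovery discover` (hypothetical helper)."""
    argv = ["azure-discovery", "discover", "--tenant-id", tenant_id]
    for sub in subscriptions:
        argv += ["--subscription", sub]  # repeatable flag
    for resource_type in include_types:
        argv += ["--include-type", resource_type]  # assumed repeatable
    for key, value in (required_tags or {}).items():
        argv += ["--required-tag", f"{key}={value}"]  # key=value pairs
    return argv


argv = build_discover_argv(
    "<tenant-guid>",
    subscriptions=["<sub-id-1>", "<sub-id-2>"],
    include_types=["Microsoft.Compute/virtualMachines"],
    required_tags={"environment": "prod"},
)

# Print a shell-quoted command line; pass `argv` to subprocess.run to execute it.
print(shlex.join(argv))
```

Building a list rather than a single string keeps each argument separate, so values containing spaces or shell metacharacters do not need manual quoting.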

Using a config file (JSON/TOML/YAML):

```bash
# CLI flags override file values
azure-discovery discover --config examples/config.example.toml

# You can still override specific values
azure-discovery discover --config examples/config.example.toml --include-entra --entra-max-objects 10000
```

Examples:

- [examples/config.example.toml](examples/config.example.toml)
- [examples/config.example.yaml](examples/config.example.yaml)

Configuration docs:

- [docs/configuration.md](docs/configuration.md)

Tip: you can generate a JSON starter config by running with `--preview-request` and saving stdout.

Run as a module from source (from the repo root):

```bash
python -m azure_discovery.cli discover --help
python -m azure_discovery.cli discover --tenant-id <tenant-guid> --environment azure_gov [options...]
```

### FastAPI

Run the server:

```bash
uvicorn azure_discovery.api:app --host 0.0.0.0 --port 8000 --reload
```

Optional: set `AZURE_DISCOVERY_CONFIG=/path/to/discovery.toml` to apply default values to incoming requests (request body fields win).

Health check:

```bash
curl http://localhost:8000/healthz
```

Discovery request (example):

```bash
curl -X POST http://localhost:8000/discover \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "<tenant-guid>",
    "environment": "azure_public",
    "subscriptions": ["<sub-id>"]
  }'
```

Enable Entra + relationship expansion (example):

```bash
curl -X POST http://localhost:8000/discover \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "<tenant-guid>",
    "environment": "azure_public",
    "subscriptions": ["<sub-id>"],
    "include_entra": true,
    "include_relationships": true,
    "entra_group_membership_max_groups": 50,
    "entra_group_membership_max_members_per_group": 200
  }'
```
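
The same request can be issued from Python using only the standard library. The sketch below builds the request without sending it; `build_discover_request` is a hypothetical helper, and the payload fields are the ones shown in the `curl` examples above.

```python
import json
import urllib.request


def build_discover_request(base_url, payload):
    """Build (but do not send) a POST request for the /discover endpoint."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/discover",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "tenant_id": "<tenant-guid>",
    "environment": "azure_public",
    "subscriptions": ["<sub-id>"],
    "include_entra": True,
    "include_relationships": True,
}
request = build_discover_request("http://localhost:8000", payload)

# To send it against a running server (not executed here):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```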

Download visualization:

```bash
curl http://localhost:8000/visuals/<file-name> --output graph.html
```

### Python

Async usage:

```python
from azure_discovery import run_discovery
from azure_discovery.adt_types import AzureDiscoveryRequest, AzureEnvironment

request = AzureDiscoveryRequest(
    tenant_id="<tenant-guid>",
    environment=AzureEnvironment.AZURE_PUBLIC,
    subscriptions=["<sub-id>"],
    include_entra=True,
)

# `await` needs an async context: an `async def` function or an async REPL (`python -m asyncio`).
response = await run_discovery(request)
print(len(response.nodes), len(response.relationships), response.html_report_path)
```

Load from config file:

```python
from pathlib import Path

from azure_discovery.utils.config_files import load_request_from_file

request = load_request_from_file(Path("examples/config.example.yaml"))
```

Sync script wrapper:

```python
import asyncio

from azure_discovery import run_discovery
from azure_discovery.adt_types import AzureDiscoveryRequest, AzureEnvironment

request = AzureDiscoveryRequest(
    tenant_id="<tenant-guid>",
    environment=AzureEnvironment.AZURE_PUBLIC,
)

response = asyncio.run(run_discovery(request))
print(response.total_resources)
```

## Azure resources discovery

By default (`include_azure_resources=true`), Azure Discovery enumerates Azure resources from subscriptions via Azure Resource Graph. You can disable this to focus on other aspects like Entra ID, RBAC, or PIM data only:

```bash
# Discover only Entra ID objects and RBAC assignments without Azure resources
azure-discovery discover \
  --tenant-id <tenant-guid> \
  --no-include-azure-resources \
  --include-entra \
  --include-rbac-assignments
```

Subscriptions are resolved the same way for the CLI, API, and Python: even when `include_azure_resources` is false, the orchestrator resolves the subscription list so the RBAC, PIM, and Defender phases have a scope. It uses the request's `subscriptions` when provided (`--subscription` on the CLI, or `subscriptions` in a config file or API request body); otherwise it resolves subscriptions from Azure and fails if none are accessible. Pass subscriptions explicitly when you want to control which ones those phases enumerate.

This is useful when you want to:
- Analyze only identity and access management data without resource inventory
- Audit RBAC assignments or PIM eligibilities independently of resource discovery
- Reduce discovery time and output size when resource data isn't needed
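
The subscription fallback described above can be sketched as follows. This is a simplified illustration, not the orchestrator's code: `resolve_subscriptions` and the `list_accessible` callback are hypothetical names standing in for the real Azure lookup.

```python
def resolve_subscriptions(requested, list_accessible):
    """Return explicit subscription IDs when given, otherwise fall back to discovery."""
    if requested:
        return list(requested)
    discovered = list(list_accessible())  # stand-in for the Azure subscription lookup
    if not discovered:
        raise RuntimeError("no accessible subscriptions found")
    return discovered


# Explicit IDs win and the lookup is never consulted.
explicit = resolve_subscriptions(["<sub-id>"], lambda: ["<other-sub-id>"])

# With no explicit IDs, the lookup result is used.
resolved = resolve_subscriptions([], lambda: ["<sub-id-1>", "<sub-id-2>"])

print(explicit, resolved)
```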

## Entra ID discovery

When `include_entra` is enabled, Azure Discovery queries Microsoft Graph and emits normalized nodes using a `graph://...` ID namespace to avoid collisions with Azure Resource Manager (ARM) IDs.
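
Because Entra nodes carry the `graph://` prefix, consumers of the output can separate them from ARM nodes with a prefix check. The sketch below assumes ARM IDs begin with `/` (for example `/subscriptions/...`); the sample IDs and the `id_namespace` helper are illustrative, not actual output of the tool.

```python
def id_namespace(node_id):
    """Classify a node ID as Entra (graph://) or ARM (path beginning with '/')."""
    if node_id.startswith("graph://"):
        return "entra"
    if node_id.startswith("/"):
        return "arm"
    return "unknown"


# Hypothetical IDs for illustration only.
sample_ids = [
    "graph://users/<object-id>",
    "/subscriptions/<sub-id>/resourceGroups/<rg-name>",
]
namespaces = [id_namespace(node_id) for node_id in sample_ids]
print(namespaces)
```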

### Entra node types

Typical Entra collections include:

- Microsoft.Graph/Organization and Microsoft.Graph/Domain
- Microsoft.Graph/User and Microsoft.Graph/Group
- Microsoft.Graph/Application and Microsoft.Graph/ServicePrincipal
- Microsoft.Graph/ConditionalAccessPolicy
- Microsoft.Graph/RiskyUser

### Entra relationships

When `include_relationships` is enabled, Azure Discovery can emit bounded edges:

- `has_domain` (organization -> domain)
- `has_member` (group -> member) when group membership expansion is enabled
- `has_owner` (application/servicePrincipal -> owner) when ownership expansion is enabled
- `appId` (servicePrincipal -> application) correlation edges when both are enumerated

All relationship expansion is capped by request parameters (see CLI options below) to avoid blowing up graphs in large tenants.
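
To illustrate how such caps bound graph size, here is a minimal sketch of capped group membership expansion. It is not the package's implementation: `expand_group_membership` and the in-memory `members_by_group` mapping are hypothetical, and the rule that `0` disables expansion mirrors the CLI option descriptions.

```python
def expand_group_membership(members_by_group, max_groups, max_members_per_group):
    """Return `has_member` edges, visiting at most `max_groups` groups."""
    if max_groups <= 0 or max_members_per_group <= 0:
        return []  # a cap of 0 disables expansion
    edges = []
    for group_id in list(members_by_group)[:max_groups]:
        for member_id in members_by_group[group_id][:max_members_per_group]:
            edges.append((group_id, "has_member", member_id))
    return edges


members_by_group = {
    "g1": ["u1", "u2", "u3"],
    "g2": ["u4"],
    "g3": ["u5"],
}
edges = expand_group_membership(members_by_group, max_groups=2, max_members_per_group=2)
print(edges)
```

With these caps the edge count never exceeds `max_groups * max_members_per_group`, regardless of tenant size.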

## Azure RBAC discovery

When `include_rbac_assignments` or `include_rbac_definitions` is enabled, Azure Discovery enumerates Azure role-based access control (RBAC) data:

- **Role assignments**: Who has what access to which resources (active assignments)
- **Role definitions**: Built-in and custom role definitions with their permissions
- **RBAC relationships**: Principal → RoleAssignment → Resource graph edges (when `include_relationships` is enabled)

This capability is useful for security posture assessment, access reviews, and understanding the permission landscape across your Azure estate.

## PIM (Privileged Identity Management) discovery

When `include_pim` is enabled, Azure Discovery enumerates eligible role assignments that users can activate on-demand:

- **Entra ID role eligibilities**: Eligible assignments for directory roles (Global Administrator, User Administrator, etc.)
- **Azure resource role eligibilities**: Eligible assignments for Azure resource roles (Owner, Contributor, etc. at subscription/resource group/resource scope)
- **Eligibility schedules**: Time-bound eligibility windows with start/end dates
- **PIM relationships**: Principal → RoleEligibility → Resource/RoleDefinition graph edges (when `include_relationships` is enabled)

PIM eligibilities represent just-in-time (JIT) access that must be activated before use. This differs from standard RBAC assignments, which are always active. PIM discovery helps identify:

- Standing privileged access (who is eligible for high-privilege roles)
- Dormant privileged accounts (eligibilities that haven't been activated recently)
- Compliance with least-privilege policies (eligibilities with appropriate time boundaries)
- Shadow admins (users with eligible assignments to privileged roles)

### PIM node types

- **Microsoft.Graph.PIM/roleEligibilitySchedules**: Entra ID role eligibilities (eligible for directory roles)
- **Microsoft.Graph.PIM/roleEligibilityScheduleRequests**: Pending/active Entra role eligibility requests
- **Microsoft.Authorization/roleEligibilitySchedules**: Azure resource role eligibilities (eligible for ARM roles)

### PIM relationships

When `include_relationships` is enabled, Azure Discovery creates edges between PIM eligibilities and related entities:

- `has_eligible_role` (principal -> eligibility): Links users/groups/service principals to their eligible role assignments
- `eligible_for` (eligibility -> resource): Links Azure resource eligibilities to the resources they grant access to
- `eligible_via_role` (eligibility -> role definition): Links eligibilities to the role definitions they represent

### PIM filtering

You can filter PIM eligibilities by scope to focus on specific subscriptions or resource groups:

```bash
# Only eligibilities for a specific subscription
azure-discovery discover \
  --tenant-id <tenant-guid> \
  --include-pim \
  --pim-scope "/subscriptions/<sub-id>"

# Only eligibilities for a specific resource group
azure-discovery discover \
  --tenant-id <tenant-guid> \
  --include-pim \
  --pim-scope "/subscriptions/<sub-id>/resourceGroups/<rg-name>"
```

This capability is useful for security audits, compliance reviews, privileged access management, and identifying potential attack paths through JIT privilege escalation.

### PIM Permissions Setup

PIM discovery requires specific Microsoft Graph API permissions and licensing. For detailed setup instructions, see:

**[PIM Permissions Guide](docs/PIM_PERMISSIONS_GUIDE.md)** - Complete guide for:
- Service principal configuration
- User permission setup
- Licensing requirements
- Azure resource PIM onboarding
- Troubleshooting permission issues

Quick reference for common scenarios:
- **Interactive users**: Requires **Global Reader** or **Privileged Role Administrator** Entra ID role + `RoleManagement.Read.Directory` consent
- **Service principals**: Requires `RoleManagement.Read.Directory` application permission + admin consent
- **Licensing**: Requires **Entra ID P2** or **Entra ID Governance** license

## Defender for Cloud discovery

When `include_defender_cloud` is enabled, Azure Discovery enumerates security findings from Microsoft Defender for Cloud:

- **Security alerts**: Active threats, suspicious activity, and security incidents detected across your Azure resources. Each alert includes MITRE ATT&CK tactics and techniques, affected resources, remediation steps, and severity ratings.
- **Security assessments**: Vulnerability findings, compliance recommendations, and security best practices. Assessments identify configuration gaps and provide remediation guidance.
- **Secure scores**: Subscription-level security posture scores that quantify your current security state (e.g., 42.5/100).

### Defender node types

- **Microsoft.Security/alerts**: Security alerts with properties like severity (High, Medium, Low, Informational), status (Active, Resolved, Dismissed), MITRE ATT&CK techniques, and affected resources.
- **Microsoft.Security/assessments**: Security assessments with severity, status (Healthy, Unhealthy, NotApplicable), categories (Data, Network, Compute, etc.), and remediation descriptions.
- **Microsoft.Security/secureScores**: Subscription security posture scores with current/max values and percentage.

### Defender relationships

When `include_relationships` is enabled, Azure Discovery creates edges between security findings and affected resources:

- `affects` (alert -> resource): Links security alerts to the VMs, storage accounts, or other resources they impact.
- `affects` (assessment -> resource): Links security assessments to the resources that have vulnerabilities or misconfigurations.

These relationships enable security-focused graph queries like "Show me all High severity alerts affecting production VMs" or "Which resources have the most unhealthy assessments?"
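
As an example of the second query, the sketch below counts unhealthy assessments per resource by following `affects` edges. The dictionary shapes are assumptions made for illustration (`id`, `type`, and `properties.status` on nodes; `source`, `target`, and `type` on edges); check the actual JSON output for the real field names.

```python
from collections import Counter


def unhealthy_counts(nodes, relationships):
    """Count unhealthy assessments per affected resource via `affects` edges."""
    unhealthy_ids = {
        node["id"]
        for node in nodes
        if node["type"] == "Microsoft.Security/assessments"
        and node["properties"].get("status") == "Unhealthy"
    }
    return Counter(
        edge["target"]
        for edge in relationships
        if edge["type"] == "affects" and edge["source"] in unhealthy_ids
    )


# Hypothetical data in the assumed shape.
nodes = [
    {"id": "a1", "type": "Microsoft.Security/assessments", "properties": {"status": "Unhealthy"}},
    {"id": "a2", "type": "Microsoft.Security/assessments", "properties": {"status": "Unhealthy"}},
    {"id": "a3", "type": "Microsoft.Security/assessments", "properties": {"status": "Healthy"}},
]
relationships = [
    {"source": "a1", "target": "vm-1", "type": "affects"},
    {"source": "a2", "target": "vm-1", "type": "affects"},
    {"source": "a3", "target": "vm-2", "type": "affects"},
]

counts = unhealthy_counts(nodes, relationships)
print(counts.most_common(1))
```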

### Defender filtering

You can filter security findings to reduce noise and focus on critical issues:

```bash
# Only High severity alerts that are Active
azure-discovery discover \
  --tenant-id <tenant-guid> \
  --include-defender-cloud \
  --defender-alert-severity High \
  --defender-alert-status Active

# Only Unhealthy assessments (skip Healthy and NotApplicable)
azure-discovery discover \
  --tenant-id <tenant-guid> \
  --include-defender-cloud \
  --defender-assessment-status Unhealthy

# Alerts only (disable assessments and scores)
azure-discovery discover \
  --tenant-id <tenant-guid> \
  --include-defender-cloud \
  --no-defender-include-assessments \
  --no-defender-include-secure-scores
```

Config file example (YAML):

```yaml
include_defender_cloud: true
defender_config:
  include_security_alerts: true
  include_security_assessments: true
  include_secure_scores: true
  alert_severity_filter:
    - High
  alert_status_filter:
    - Active
  assessment_status_filter:
    - Unhealthy
```

This capability is useful for security operations, vulnerability management, compliance tracking, and prioritizing remediation efforts based on actual threats and exposures.

## Prerequisites

- Python 3.11+
- Azure CLI 2.60+ (optional, used when `--prefer-cli` is set) or service principal credentials exported as `AZURE_CLIENT_ID`, `AZURE_TENANT_ID`, and `AZURE_CLIENT_SECRET`
- Azure Resource Graph access (Reader or above on the subscriptions you plan to scan)
- Microsoft Graph access (only required when using `--include-entra`)
- Network egress to `management.azure.com` and `api.azure.com` (and to the Graph endpoint for the cloud you select)

## Required permissions

### Azure Resource Graph (ARM)

| Capability | Minimum RBAC role | Scope recommendation |
| ---------- | ----------------- | -------------------- |
| Run Resource Graph queries | `Reader`, `Resource Graph Reader`, or any custom role with `Microsoft.ResourceGraph/*/read` | Every subscription you plan to inventory or the parent management group |
| Auto-discover subscriptions (when `--subscription` is omitted) | `Reader` on the management group or `Directory.Read.All` consent for service principals | Tenant root (`/providers/Microsoft.Management/managementGroups/<root>`) |
| Register Microsoft.ResourceGraph (one-time) | `Contributor` or `Owner` | Each subscription being scanned |
| Read role assignments | `Reader` or `Role Based Access Control Administrator (read-only)` | Subscription or management group |
| Read role definitions | `Reader` | Subscription |
| Read PIM eligible assignments (Entra roles) | N/A (requires Microsoft Graph permissions - see below) | Tenant |
| Read PIM eligible assignments (Azure resources) | `Reader` or `Role Based Access Control Administrator (read-only)` | Subscription or management group |
| Read Defender for Cloud alerts and assessments | `Reader` or `Security Reader` | Subscription |

The tool never mutates resources, but it cannot enumerate subscriptions or call Resource Graph unless the identity has at least `Reader` at the relevant scope. Grant the narrowest scope that still covers your target estate.

### Microsoft Graph (Entra)

Azure Discovery uses Microsoft Graph delegated permissions when running as a signed-in user (for example, Azure CLI/device code flows), and application permissions when running headless (service principal / managed identity).

The following table is a practical starting point for *read-only* discovery. Always follow least privilege, and prefer narrower resource-specific permissions over broad directory-wide permissions where possible.

| Discovery area | Typical endpoints | Delegated permissions | Application permissions | Notes |
| --- | --- | --- | --- | --- |
| Users | `/users` | `User.ReadBasic.All` or `User.Read.All` (or `Directory.Read.All`) | `User.Read.All` (or `Directory.Read.All`) | Guests can't call `/users`. |
| Groups + members | `/groups`, `/groups/{id}/members` | `Group.Read.All` + `GroupMember.Read.All` (or `Directory.Read.All`) | `Group.Read.All` + `GroupMember.Read.All` (or `Directory.Read.All`) | Hidden memberships may require additional permissions depending on tenant settings. |
| Applications + service principals | `/applications`, `/servicePrincipals` | `Application.Read.All` (or `Directory.Read.All`) | `Application.Read.All` (or `Directory.Read.All`) | Needed for enumerating apps/SPs and owner expansion. |
| Conditional Access policies | `/identity/conditionalAccess/policies` | `Policy.Read.All` | `Policy.Read.All` | Delegated access typically requires an Entra role such as Conditional Access Administrator or similar security read roles. |
| Risky users (Identity Protection) | `/identityProtection/riskyUsers` | `IdentityRiskyUser.Read.All` | `IdentityRiskyUser.Read.All` | Requires Entra ID Identity Protection licensing (commonly P2). |
| PIM Entra role eligibilities | `/roleManagement/directory/roleEligibilitySchedules` | `RoleEligibilitySchedule.Read.Directory` or `RoleManagement.Read.Directory` or `RoleManagement.Read.All` | `RoleEligibilitySchedule.Read.Directory` or `RoleManagement.Read.Directory` or `RoleManagement.Read.All` | Requires Entra ID P2 or Entra ID Governance licensing. Delegated access typically requires an Entra role such as Privileged Role Administrator or Global Reader. |

References:

- List users permissions: https://learn.microsoft.com/en-us/graph/api/user-list?view=graph-rest-1.0
- Conditional Access policy list permissions: https://learn.microsoft.com/en-us/graph/api/conditionalaccessroot-list-policies?view=graph-rest-1.0
- Identity Protection API tutorial (role + delegated scope examples): https://learn.microsoft.com/en-us/graph/tutorial-riskdetection-api
- PIM API overview: https://learn.microsoft.com/en-us/graph/api/resources/privilegedidentitymanagementv3-overview
- Role eligibility schedules API: https://learn.microsoft.com/en-us/graph/api/rbacapplication-list-roleeligibilityschedules

### Granting Microsoft Graph permissions for PIM

PIM enumeration requires specific Microsoft Graph API permissions. Follow these steps to grant permissions:

#### For Service Principals (Application Permissions)

1. **Register an application** in Entra ID (if not already done):
   ```bash
   az ad app create --display-name "Azure Discovery PIM"
   ```

2. **Grant Microsoft Graph API permissions**:
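   ```bash
   # Get the application ID
   APP_ID=$(az ad app list --display-name "Azure Discovery PIM" --query "[0].appId" -o tsv)

   # Look up the app role ID for RoleManagement.Read.Directory rather than hard-coding a GUID
   ROLE_ID=$(az ad sp show --id 00000003-0000-0000-c000-000000000000 \
     --query "appRoles[?value=='RoleManagement.Read.Directory'].id | [0]" -o tsv)

   # Grant RoleManagement.Read.Directory permission (read PIM eligibilities)
   az ad app permission add \
     --id "$APP_ID" \
     --api 00000003-0000-0000-c000-000000000000 \
     --api-permissions "${ROLE_ID}=Role"
   ```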

3. **Admin consent** (requires Global Administrator or Privileged Role Administrator):
   ```bash
   az ad app permission admin-consent --id $APP_ID
   ```

4. **Create service principal and secret**:
   ```bash
   az ad sp create --id $APP_ID
   az ad sp credential reset --id $APP_ID --years 1
   ```

5. **Assign Entra ID role** (for delegated scenarios or enhanced permissions):
   - Assign **Privileged Role Administrator** or **Global Reader** role to the service principal
   - This is in addition to the API permissions above

#### For Users (Delegated Permissions)

1. **Entra ID role assignment**: Assign one of these roles to the user account:
   - **Privileged Role Administrator** (can read all PIM configurations)
   - **Global Reader** (read-only access to all tenant data including PIM)
   - **Security Reader** (read security-related data including some PIM data)

2. **API permissions**: When using delegated flow (e.g., `az login`), consent to:
   - `RoleManagement.Read.Directory` or
   - `RoleEligibilitySchedule.Read.Directory`

3. **Interactive consent** (first run):
   ```bash
   az login --scope https://graph.microsoft.com/RoleManagement.Read.Directory
   ```

#### Licensing Requirements

**CRITICAL**: PIM functionality requires one of the following licenses:
- **Entra ID P2** (formerly Azure AD Premium P2)
- **Entra ID Governance** (includes PIM capabilities)
- **Microsoft 365 E5** (includes Entra ID P2)

Without proper licensing, the PIM APIs will return 403 Forbidden even with correct permissions.

### Service principal flow (CLI based)

```bash
az ad sp create-for-rbac \
  --name azure-discovery-sp \
  --role "Reader" \
  --scopes /subscriptions/<sub-id-1> /subscriptions/<sub-id-2> \
  --years 1

az role assignment create \
  --assignee <appId> \
  --role "Resource Graph Reader" \
  --scope /subscriptions/<sub-id-1>
```

Export the emitted `appId`, `tenant`, and `password` as `AZURE_CLIENT_ID`, `AZURE_TENANT_ID`, and `AZURE_CLIENT_SECRET`. Repeat the role assignment command for every subscription or assign at the management group scope (`/providers/Microsoft.Management/managementGroups/<mg-id>`) to cover multiple subscriptions at once.
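
Before starting a headless run, it can help to confirm that all three variables are present. The following is a small sketch using only the standard library; `missing_credentials` is a hypothetical helper, not part of the package.

```python
import os

REQUIRED_VARS = ("AZURE_CLIENT_ID", "AZURE_TENANT_ID", "AZURE_CLIENT_SECRET")


def missing_credentials(env=None):
    """Return the names of required service principal variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Check an explicit mapping; call missing_credentials() to check the real environment.
example_env = {"AZURE_CLIENT_ID": "<appId>", "AZURE_TENANT_ID": "<tenant>"}
missing = missing_credentials(example_env)
print(missing)  # ['AZURE_CLIENT_SECRET']
```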

### User-assigned permissions (Portal)

1. Open Azure Portal → **Subscriptions** → select each target subscription.
2. Navigate to **Access control (IAM)** → **Add** → **Add role assignment**.
3. Pick the `Reader` (or `Resource Graph Reader`) role, then select the user or managed identity that will run Azure Discovery.
4. If you want automatic subscription discovery, repeat the assignment at the tenant root management group (visible under **Management groups**). Users need the `Reader` role there.

### Provider registration and validation

Run the following once per subscription to ensure the Resource Graph service is registered and the identity can query it:

```bash
az account set --subscription <sub-id>
az provider register --namespace Microsoft.ResourceGraph
az graph query -q "Resources | take 1"
```

Successful output from `az graph query` confirms both the provider registration and the assigned role. If the command fails with `AuthorizationFailed`, double-check the scope of the role assignments and replicate them for every subscription you intend to scan.

## Configuration reference

| Option | Description |
| ------ | ----------- |
| `--config` | Path to JSON/TOML/YAML config file (AzureDiscoveryRequest shape). CLI flags override file values. |
| `--tenant-id` | Required Entra ID tenant GUID. |
| `--environment` | Azure cloud (`azure_public`, `azure_gov`, `azure_china`, `azure_germany`, `azure_stack`). |
| `--subscription/-s` | Repeatable flag to scope runs to explicit subscription IDs. Omit to auto-resolve. When using `--no-include-azure-resources`, pass this so RBAC/PIM/Defender know which subscriptions to enumerate. |
| `--include-azure-resources/--no-include-azure-resources` | Include Azure resources from Resource Graph (default: true). When false, only Entra/RBAC/PIM/Defender data is collected; subscriptions are still resolved from `--subscription` or Azure. |
| `--include-entra` | Include Entra ID resources via Microsoft Graph. |
| `--include-rbac-assignments` | Include Azure role assignments in discovery. |
| `--include-rbac-definitions` | Include Azure role definitions (built-in and custom). |
| `--rbac-scope` | Filter role assignments by scope (repeatable). |
| `--include-pim` | Include PIM (Privileged Identity Management) eligible role assignments. |
| `--pim-include-entra-eligibilities/--no-pim-include-entra-eligibilities` | Include Entra ID role eligibilities (default: true when PIM enabled). |
| `--pim-include-entra-requests` | Include pending/active Entra role eligibility requests. |
| `--pim-include-azure-resource-eligibilities/--no-pim-include-azure-resource-eligibilities` | Include Azure resource role eligibilities (default: true when PIM enabled). |
| `--pim-scope` | Filter PIM eligibilities by scope (repeatable). |
| `--include-defender-cloud` | Include Defender for Cloud security findings (alerts, assessments, scores). |
| `--defender-include-alerts/--no-defender-include-alerts` | Include security alerts from Defender for Cloud (default: true when defender enabled). |
| `--defender-include-assessments/--no-defender-include-assessments` | Include security assessments (default: true when defender enabled). |
| `--defender-include-secure-scores/--no-defender-include-secure-scores` | Include secure scores (default: true when defender enabled). |
| `--defender-alert-severity` | Filter alerts by severity: High, Medium, Low, Informational (repeatable). |
| `--defender-alert-status` | Filter alerts by status: Active, Resolved, Dismissed (repeatable). |
| `--defender-assessment-severity` | Filter assessments by severity: High, Medium, Low (repeatable). |
| `--defender-assessment-status` | Filter assessments by status: Healthy, Unhealthy, NotApplicable (repeatable). |
| `--scale-controls/--no-scale-controls` | Enable adaptive rate control for large tenants (default: enabled). |
| `--scale-initial-rps` | Initial requests per second for adaptive rate control (default: 10.0). |
| `--scale-max-concurrent-batches` | Maximum concurrent batch operations (default: 5). |
| `--scale-initial-batch-size` | Initial batch size for paginated operations (default: 1000). |
| `--entra-include-organization/--no-entra-include-organization` | Include organization (tenant root) node. |
| `--entra-include-domains/--no-entra-include-domains` | Include tenant domains. |
| `--entra-include-users/--no-entra-include-users` | Include Entra users. |
| `--entra-include-groups/--no-entra-include-groups` | Include Entra groups. |
| `--entra-include-applications/--no-entra-include-applications` | Include Entra applications. |
| `--entra-include-conditional-access-policies/--no-entra-include-conditional-access-policies` | Include conditional access policies (requires permissions). |
| `--entra-include-risky-users/--no-entra-include-risky-users` | Include risky users (requires permissions). |
| `--entra-group-membership-max-groups` | Max groups to expand membership for (0 disables expansion). |
| `--entra-group-membership-max-members-per-group` | Max members per group during expansion (0 disables expansion). |
| `--entra-ownership-max-apps` | Max applications to expand owners for (0 disables expansion). |
| `--entra-ownership-max-owners-per-app` | Max owners per app during expansion (0 disables expansion). |
| `--entra-sp-ownership-max-sps` | Max service principals to expand owners for (0 disables SP ownership expansion). |
| `--entra-sp-ownership-max-owners-per-sp` | Max owners per service principal during expansion (0 disables expansion). |
| `--include-relationships/--no-include-relationships` | Include inferred and expanded relationships/edges (Graph expansions require this). |
| `--graph-total-max-objects` | Maximum total objects across all Graph collections (0 = unlimited). |
| `--entra-max-objects` | Maximum objects per Entra collection (0 = unlimited). |
| `--include-type` / `--exclude-type` | Filter resource types (case-insensitive). |
| `--resource-group` | Restrict discovery to named resource groups. |
| `--required-tag` | Enforce tag key=value pairs (repeatable). |
| `--prefer-cli` | Place Azure CLI credentials at the front of the chain. |
| `--visualization-output-dir` | Directory for PyVis HTML output (default `artifacts/graphs`). |
| `--visualization-file` | Override the generated HTML file name. |
| `--output/-o` | Write JSON output to file instead of stdout. |
| `--quiet/-q` | Suppress all logs except errors. |
| `--format/-f` | Output format: `json` (default) or `json-compact`. |
| `--preview-request/--dry-run` | Print the constructed discovery request JSON and exit (no discovery). |
| `--validate-auth` | Run a preflight auth check (token acquisition for ARM and Graph) and exit. Use `--probe-connectivity` to also validate connectivity. |

Subcommands: `discover` (run discovery), `version` (print package version and exit).

Programmatic workflows can instantiate `AzureDiscoveryRequest` directly and call `orchestrator.run_discovery`, receiving an `AzureDiscoveryResponse` that contains resolved subscriptions, normalized nodes, inferred relationships, and an optional `html_report_path`.

### Output and logging separation

By default, the CLI writes JSON results to stdout and logs to stderr. This allows clean piping:

```bash
# Pipe JSON output to jq for filtering
azure-discovery discover --tenant-id <id> | jq '.discovered_subscriptions'

# Write output to file and suppress logs
azure-discovery discover --tenant-id <id> --output results.json --quiet

# Compact JSON output for scripting
azure-discovery discover --tenant-id <id> --format json-compact
```
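
The same stdout/stderr split applies when the CLI is driven from another program. The following is a minimal Python sketch, assuming `azure-discovery` is installed and on `PATH`; `run_and_parse` is a hypothetical helper name, not part of the package.

```python
import json
import subprocess


def run_and_parse(argv: list[str]) -> tuple[dict, str]:
    """Run a command, parse its stdout as JSON, and return (result, logs).

    stdout and stderr are captured separately, so log lines written to
    stderr never end up in the JSON payload.
    """
    proc = subprocess.run(argv, capture_output=True, text=True, check=True)
    return json.loads(proc.stdout), proc.stderr
```

A call such as `run_and_parse(["azure-discovery", "discover", "--tenant-id", "<id>", "--format", "json-compact"])` would return the parsed result (including the `discovered_subscriptions` key used in the `jq` example) alongside the captured log text.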

## Development

### Quick start

```bash
# Install with development dependencies
make install-dev

# Run tests
make test

# Format code
make format

# Run linting
make lint

# Type checking
make typecheck

# Generate coverage report
make coverage
```

### Available make commands

Run `make help` to see all available commands:
- `make install` - Install package dependencies
- `make install-dev` - Install with development dependencies
- `make test` - Run tests with pytest
- `make lint` - Run ruff linter
- `make format` - Format code with ruff
- `make typecheck` - Run mypy type checking
- `make coverage` - Generate test coverage report
- `make clean` - Remove build artifacts and cache
- `make run-api` - Run FastAPI server locally

### Pre-commit hooks

Install pre-commit hooks to automatically run linting and formatting on commit:

```bash
pip install pre-commit
pre-commit install
```

This will run ruff formatting, linting, and mypy type checking before each commit.

### Environment variables

Copy `.env.example` to `.env` and configure your Azure credentials:

```bash
cp .env.example .env
# Edit .env with your credentials
```

For detailed contributing guidelines, see [CONTRIBUTING.md](CONTRIBUTING.md).

## Troubleshooting

### General Issues

- **`AzureClientError: Unable to enumerate subscriptions`** – ensure the identity has the `Reader` role (or higher) on at least one subscription and that the Resource Graph resource provider is registered (`az provider register --namespace Microsoft.ResourceGraph`).
- **`AuthorizationFailed` / `Forbidden` (ARM)** – confirm the identity has `Reader` (or `Resource Graph Reader`) on every subscription (or parent management group) you are scanning, and that your current Azure CLI context is pointing at a subscription you can read (`az account show`).
- **`Resource Graph query failure`** – check that the tenant/subscription pair belongs to the same cloud you selected, and verify network egress to the relevant `resource_manager` endpoint (see `_ENVIRONMENT_MAP` in `azure_discovery.utils.azure_clients`).
- **`403 Forbidden` / `Authorization_RequestDenied` / `Insufficient privileges` (Graph)** – this usually means required Microsoft Graph permissions were not admin-consented, or (for delegated runs) the signed-in user lacks the Entra admin role required for that dataset (commonly Conditional Access / Identity Protection). If you don't need those datasets, disable them with `--no-entra-include-conditional-access-policies` and/or `--no-entra-include-risky-users`.
- **Risky users missing / empty** – the Identity Protection APIs require additional permissions and licensing (commonly Entra ID P2); if you don't have that, disable with `--no-entra-include-risky-users`.
- **Defender for Cloud alerts/assessments empty** – verify Defender for Cloud is enabled on the subscription(s) being scanned. The tool gracefully handles subscriptions without Defender enabled (404 errors are logged as warnings).
- **Preflight auth check (CLI)** – run `azure-discovery discover --tenant-id <tenant-guid> --validate-auth --probe-connectivity` to confirm the credential chain can acquire tokens for both ARM and Graph (note: this currently validates token acquisition only).
- **`VisualizationError: Failed to render HTML graph`** – confirm the `--visualization-output-dir` path exists and is writable; the PyVis writer can create missing directories only when it has write permission on each parent.
- **`401` or `interaction_required` errors** – when running non-interactively, use a service principal credential chain and set `AZURE_CLIENT_ID`, `AZURE_TENANT_ID`, and `AZURE_CLIENT_SECRET`; otherwise the default chain attempts to launch an interactive browser flow.
- **Empty graph output** – verify filters are not mutually exclusive (e.g., mixing include/exclude for the same type).
- **No RBAC/PIM/Defender results when using `--no-include-azure-resources`** – the tool must know which subscriptions to enumerate. Pass `--subscription <sub-id>` (or set `subscriptions` in your config). If omitted, the tool resolves subscriptions from Azure; if that fails or returns none, RBAC/PIM/Defender phases run over zero subscriptions.
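
For the empty-graph case, a quick way to spot mutually exclusive filters is to compare the include and exclude lists case-insensitively, matching how the type filters are described. This is a minimal sketch; `conflicting_types` is a hypothetical helper, and how the tool itself resolves a type listed in both filters is not documented here.

```python
def conflicting_types(include: list[str], exclude: list[str]) -> set[str]:
    """Return resource types named in both lists, compared case-insensitively."""
    included = {t.strip().lower() for t in include}
    excluded = {t.strip().lower() for t in exclude}
    return included & excluded


overlap = conflicting_types(
    ["Microsoft.Compute/virtualMachines", "Microsoft.Storage/storageAccounts"],
    ["microsoft.compute/virtualmachines"],
)
print(sorted(overlap))  # → ['microsoft.compute/virtualmachines']
```

A non-empty result points at the types to remove from one of the two filters.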

### PIM-Specific Issues

- **`403 Forbidden` when enumerating Entra PIM eligibilities** – this indicates one or more permission/licensing issues:
  1. **Missing Graph API permissions**: Ensure `RoleManagement.Read.Directory` or `RoleEligibilitySchedule.Read.Directory` is granted and admin-consented
  2. **Missing Entra ID role**: For delegated scenarios, assign **Privileged Role Administrator** or **Global Reader** to the user
  3. **Missing licensing**: PIM requires **Entra ID P2** or **Entra ID Governance** licensing; check your tenant's license status
  4. **Verify permissions**: Run `az ad signed-in-user show` and check assigned roles, or use `az ad sp show --id <app-id>` for service principals

- **`404 Not Found` when enumerating Azure Resource PIM eligibilities** – this means PIM is not configured for the subscription:
  1. **Enable PIM for Azure resources**: In Entra ID → Privileged Identity Management → Azure resources → Discover resources
  2. **Onboard subscription**: Select the subscription and click "Manage resource" to enable PIM
  3. **Wait for propagation**: After enabling PIM, allow 5-10 minutes for the service to fully initialize

- **PIM eligibilities returned but empty/zero results** – this is expected if no eligible assignments exist:
  1. **Verify PIM assignments exist**: Check Entra ID → Privileged Identity Management → My roles / Azure resources to confirm eligible assignments
  2. **Scope filtering**: If using `--pim-scope`, ensure the scope matches existing eligible assignments
  3. **Role type filtering**: Eligible assignments are separate from active assignments; use both `--include-rbac-assignments` and `--include-pim` to see the complete picture

- **`Failed to acquire Microsoft Graph token` for PIM** – authentication issue:
  1. **Check authentication**: Run `az account show` to verify you're logged in
  2. **Re-authenticate**: Run `az login --scope https://graph.microsoft.com/RoleManagement.Read.Directory` to explicitly consent
  3. **Service principal**: Verify `AZURE_CLIENT_ID`, `AZURE_TENANT_ID`, and `AZURE_CLIENT_SECRET` are set correctly
  4. **Token cache**: Try `az account clear` then `az login` again to refresh tokens

- **PIM eligibilities missing for specific users** – check assignment configuration:
  1. **Assignment type**: Only **eligible** assignments appear in PIM results (not active assignments)
  2. **Expired eligibilities**: Check the `endDateTime` in the eligibility schedule; expired eligibilities are still returned, with a status that marks them as expired
  3. **Group-based eligibilities**: If the user is eligible via group membership, ensure `--include-entra` is enabled so group relationships are built

- **Performance issues with PIM enumeration** – the PIM API can be slow for large tenants:
  1. **Use scope filtering**: Add `--pim-scope "/subscriptions/<sub-id>"` to limit enumeration
  2. **Disable unused features**: Use `--no-pim-include-entra-requests` if you don't need pending requests
  3. **Separate runs**: Run PIM discovery separately from ARM discovery for better performance analysis

### Verifying PIM Permissions

To verify your permissions are correctly configured:

```bash
# Check if you can read PIM eligibilities (Microsoft Graph)
az rest --method GET --url "https://graph.microsoft.com/v1.0/roleManagement/directory/roleEligibilitySchedules" --query "value[0]"

# Check if you can read Azure resource PIM eligibilities (ARM)
az rest --method GET --url "https://management.azure.com/subscriptions/<sub-id>/providers/Microsoft.Authorization/roleEligibilitySchedules?api-version=2020-10-01" --query "value[0]"
```

If these commands fail with 403, review the permission requirements above. If they succeed but Azure Discovery still fails, check the log output for specific error messages.
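
When scripting these probes, the HTTP status can be mapped back to the troubleshooting step that applies. The sketch below paraphrases the notes in this section; the `PIM_HINTS` table and `pim_hint` function are illustrative names, not part of azure-discovery.

```python
# Keys are (API surface, HTTP status): "graph" for the Microsoft Graph probe,
# "arm" for the Azure Resource Manager probe.
PIM_HINTS: dict[tuple[str, int], str] = {
    ("graph", 403): "Check Graph permissions and admin consent, the Entra role, and P2/Governance licensing.",
    ("arm", 403): "Review the permission requirements for the identity on this subscription.",
    ("arm", 404): "PIM is not configured for this subscription; onboard it under Azure resources.",
}


def pim_hint(surface: str, status: int) -> str:
    """Map a probe result to the troubleshooting step that applies."""
    if 200 <= status < 300:
        return "Probe succeeded; if Azure Discovery still fails, check its log output."
    return PIM_HINTS.get(
        (surface, status),
        "Unrecognized status; check the Azure Discovery log output.",
    )
```

For example, `pim_hint("arm", 404)` returns the onboarding hint, while any status not listed falls back to checking the logs.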

## Feature status and roadmap

### ✅ Implemented features

- **Azure RBAC graph** – Role assignments and role definitions are fully enumerated and visualized with principal-to-resource relationship mapping.
- **API hardening** – FastAPI includes pluggable authentication (Azure AD, API key), rate limiting, audit logging, CORS configuration, and API versioning.
- **Scale controls** – Adaptive rate control and intelligent batching automatically handle API throttling and optimize throughput for large tenants.
- **Defender for Cloud** – Security alerts, vulnerability assessments, secure scores, and compliance status are integrated with filtering by severity and status.

### 🔄 Planned features

- **Change tracking** – Persistent storage backend with snapshot diffing to track changes over time.
- **Non-Entra SaaS surfaces** – M365 and Microsoft Purview enumerators for broader coverage beyond Azure and Entra ID.

For detailed implementation plans, see [docs/IMPLEMENTATION_ROADMAP.md](docs/IMPLEMENTATION_ROADMAP.md).

## Publishing to PyPI (maintainers)

To publish a new version to PyPI:

1. Bump `version` in `pyproject.toml`.
2. Ensure tests pass: `pip install -e ".[dev]" && pytest` (the quotes keep shells such as zsh from expanding the brackets).
3. Build: `python -m build`.
4. Upload: `twine upload dist/azure_discovery-<version>*` (requires PyPI credentials or token). Built files use the normalized name `azure_discovery`, with an underscore.
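
Before uploading, it can be useful to know which file names to expect in `dist/`. Wheel file names use the project name with runs of `-`, `_`, and `.` replaced by a single underscore, and recent build backends name the sdist the same way (an assumption that depends on the backend version in use). A minimal sketch, with `dist_prefix` as a hypothetical helper:

```python
import re


def dist_prefix(project_name: str, version: str) -> str:
    """Return the file name prefix that wheels use for a project and version."""
    normalized = re.sub(r"[-_.]+", "_", project_name).lower()
    return f"{normalized}-{version}"


print(dist_prefix("azure-discovery", "0.1.9"))  # → azure_discovery-0.1.9
```

Listing `dist/` and checking for files that start with this prefix is a cheap guard against uploading stale artifacts from an earlier version.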

The distribution ships a single top-level package, **azure_discovery**, to avoid namespace conflicts on install. The console script **azure-discovery** is provided by the `[project.scripts]` entry in `pyproject.toml`.
