Metadata-Version: 2.4
Name: mule-discovery
Version: 1.2.0
Summary: Scan Mule applications for migration complexity assessment
Project-URL: Homepage, https://github.com/KongHQ-CX/mule-discovery
Author: Stephen Brown
License-Expression: MIT
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.10
Requires-Dist: pyyaml>=6.0
Provides-Extra: anypoint
Requires-Dist: anypoint-sdk>=0.2.0; extra == 'anypoint'
Provides-Extra: dev
Requires-Dist: pytest-cov>=4.0; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Description-Content-Type: text/markdown

# mule-discovery

Scan Mule applications for migration complexity assessment.

Parses Mule 4 (and 3) XML source files, POM dependencies, DataWeave scripts, and API specifications to produce a structured migration readiness report with complexity scoring.

## Estate Analysis

The output produced by `mule-discover` (JSON or YAML) can be fed into the [estate-analyzer](https://github.com/KongHQ-CX/kong-ps-agent-skills/tree/main/mule-analysis/estate-analyzer) agent skill to generate pre-sales migration reports. The estate-analyzer processes discovery output across your entire Mule application estate to produce complexity summaries, connector frequency analysis, PoC candidate recommendations, and migration sizing reports.

## Quick Start (uv)

No separate install step is required. Run from the project directory:

```bash
cd mule-discovery

# Discover all Mule apps under a directory
uv run mule-discover /path/to/apps --output-dir ./inventory

# JSON output instead of YAML
uv run mule-discover /path/to/apps --json --output-dir ./inventory
```

`uv run` reads `pyproject.toml`, resolves dependencies into a project-local virtual environment (`.venv`), and runs the command. Nothing is installed globally.

## Installation

### From PyPI

```bash
pip install mule-discovery

# Then run directly
mule-discover /path/to/apps --output-dir ./inventory
```

For Anypoint Platform integration (policy scanning):

```bash
# Quote the requirement so shells such as zsh do not expand the brackets
pip install "mule-discovery[anypoint]"
```

### From source (development)

```bash
uv sync --extra dev
```

Requires Python 3.10+.

## CLI Tools

### `mule-discover`

Recursively find all Mule applications under a directory and produce migration complexity reports for each.

```bash
# Discover all apps, write YAML inventories (default) to ./inventory
uv run mule-discover /path/to/apps --output-dir ./inventory

# JSON output
uv run mule-discover /path/to/apps --json --output-dir ./inventory

# Suppress progress output
uv run mule-discover /path/to/apps -o ./inventory -q

# Custom complexity thresholds
uv run mule-discover /path/to/apps --flow-low 8 --flow-medium 18 --flow-high 30
```

Each per-app report includes:
- Flow inventory with complexity levels (LOW / MEDIUM / HIGH / VERY_HIGH)
- DataWeave transformation analysis and classification
- HTTP listener and scheduled job detection
- Connector inventory with migration weights
- API specification detection (OpenAPI, WSDL)
- External dependency and out-of-scope item tracking
- AWS service usage (SQS, S3, DynamoDB)
- SOAP/WSDL service detection
- HTTP request-config inventory and connector authentication metadata (`request_configs`, `connector_auth`)
- Overall migration score (0–100) with recommendation (SMALL / MEDIUM / LARGE / XLARGE)

### `mule-scan-policies`

Scan Anypoint Platform for API policies on deployed applications. Requires the `anypoint` extra.

```bash
pip install -e ".[anypoint]"

export ANYPOINT_CLIENT_ID=...
export ANYPOINT_CLIENT_SECRET=...
export ANYPOINT_ORG_ID=...
export ANYPOINT_ENV_ID=...

uv run mule-scan-policies
uv run mule-scan-policies --format json
```

### `mule-download-policies`

Download custom policies from Anypoint Exchange. Requires the `anypoint` extra.

```bash
export ANYPOINT_CLIENT_ID=...
export ANYPOINT_CLIENT_SECRET=...
export ANYPOINT_ORG_ID=...

uv run mule-download-policies --output-dir ./custom_policies
```

## Complexity Scoring

Each application receives a migration score from 0 to 100 (higher = simpler migration):

| Score Range | Recommendation | Meaning |
|---|---|---|
| 75–100 | SMALL | Straightforward migration |
| 50–74 | MEDIUM | Some complexity, manageable |
| 25–49 | LARGE | Significant effort required |
| 0–24 | XLARGE | Major rework needed |
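
The banding in the table can be read as a simple lookup. The sketch below is illustrative only: `recommendation_for` is a hypothetical helper name, not a function from this package, and it assumes integer scores with inclusive band boundaries as shown above.

```python
def recommendation_for(score: int) -> str:
    """Map a 0-100 migration score to the band from the table above.

    Hypothetical helper for illustration; the package's own scoring
    code may be organised differently.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 75:
        return "SMALL"
    if score >= 50:
        return "MEDIUM"
    if score >= 25:
        return "LARGE"
    return "XLARGE"


if __name__ == "__main__":
    for example in (100, 74, 25, 0):
        print(example, recommendation_for(example))
```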

Deductions are applied across eight dimensions:

| Dimension | Max Deduction |
|---|---|
| Flow complexity | 30 pts |
| Transform complexity | 15 pts |
| Risk / out-of-scope items | 20 pts |
| Connector migration weight | 20 pts |
| WSDL / SOAP services | 10 pts |
| Scale (flow + component count) | 20 pts |
| Pattern complexity (scatter-gather, choices, batch, parallel-foreach, retries) | 15 pts |
| DataWeave volume | 15 pts |
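
As a rough sketch of how capped deductions could combine into a score: the caps below are copied from the table, but everything else is an assumption. In particular, the dimension keys, the `migration_score` name, the starting value of 100, and the clamp at 0 (the caps sum to 145) are illustrative choices, and how each raw deduction is derived is not covered here.

```python
# Per-dimension caps, copied from the table above.
# The dictionary keys are hypothetical names chosen for this sketch.
MAX_DEDUCTIONS = {
    "flow_complexity": 30,
    "transform_complexity": 15,
    "risk_out_of_scope": 20,
    "connector_weight": 20,
    "wsdl_soap": 10,
    "scale": 20,
    "pattern_complexity": 15,
    "dataweave_volume": 15,
}


def migration_score(raw_deductions: dict[str, float]) -> int:
    """Combine raw per-dimension deductions into a 0-100 score.

    Assumption: the score starts at 100, each deduction is capped at
    its table maximum, and the result is clamped so it never drops
    below 0.
    """
    total = 0.0
    for dimension, cap in MAX_DEDUCTIONS.items():
        raw = raw_deductions.get(dimension, 0.0)
        total += min(max(raw, 0.0), cap)  # keep each deduction within [0, cap]
    return max(0, round(100 - total))


if __name__ == "__main__":
    print(migration_score({"flow_complexity": 12, "scale": 8}))
```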

### Flow Complexity Thresholds

Flows are classified by component count (configurable via CLI flags):

| Components | Complexity |
|---|---|
| ≤ 6 | LOW |
| 7–14 | MEDIUM |
| 15–25 | HIGH |
| > 25 | VERY_HIGH |
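
The table can be expressed as a small threshold check. This is a minimal sketch, not the package's implementation: `FlowThresholds` and `classify_flow` are hypothetical names, and it assumes the `--flow-low`, `--flow-medium`, and `--flow-high` values act as inclusive upper bounds for their bands.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowThresholds:
    """Hypothetical stand-in for the configurable thresholds.

    Defaults mirror the table above; each value is assumed to be the
    inclusive upper bound of its band.
    """
    low: int = 6
    medium: int = 14
    high: int = 25


def classify_flow(component_count: int,
                  thresholds: FlowThresholds = FlowThresholds()) -> str:
    """Return the complexity level for a flow with the given component count."""
    if component_count <= thresholds.low:
        return "LOW"
    if component_count <= thresholds.medium:
        return "MEDIUM"
    if component_count <= thresholds.high:
        return "HIGH"
    return "VERY_HIGH"


if __name__ == "__main__":
    # Same values as the custom-threshold CLI example earlier in this README.
    custom = FlowThresholds(low=8, medium=18, high=30)
    print(classify_flow(8), classify_flow(8, custom))
```

With the defaults, a flow of 8 components is MEDIUM; with the custom thresholds it falls into LOW.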

### DataWeave Classification

DataWeave transformations are classified by line count and function usage:

| Classification | Criteria |
|---|---|
| simple_mapping | ≤ 5 lines, no complex functions |
| field_level_logic | 6–20 lines, or uses routine functions (map, filter, pluck, etc.) |
| business_logic | > 20 lines, or uses complex functions (reduce, groupBy, flatMap, etc.) |
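
One way to read these criteria is as a cascade that checks the most demanding class first. The sketch below rests on several assumptions: `classify_dataweave` is a hypothetical name, only the functions named in the table are included (the full lists are not reproduced here), blank lines are not counted, and function usage is detected by a plain identifier match rather than real DataWeave parsing.

```python
import re

# Only the functions named in the table; the real lists are longer.
ROUTINE_FUNCTIONS = {"map", "filter", "pluck"}
COMPLEX_FUNCTIONS = {"reduce", "groupBy", "flatMap"}


def classify_dataweave(script: str) -> str:
    """Classify a DataWeave script by line count and function usage.

    Assumptions for this sketch: blank lines are ignored, and any
    identifier matching a known function name counts as a use of it.
    """
    line_count = sum(1 for line in script.splitlines() if line.strip())
    identifiers = set(re.findall(r"[A-Za-z_]\w*", script))

    # Check the most demanding class first so overlapping criteria
    # resolve to the higher classification.
    if line_count > 20 or identifiers & COMPLEX_FUNCTIONS:
        return "business_logic"
    if line_count > 5 or identifiers & ROUTINE_FUNCTIONS:
        return "field_level_logic"
    return "simple_mapping"


if __name__ == "__main__":
    script = "%dw 2.0\noutput application/json\n---\npayload map (item) -> item.name"
    print(classify_dataweave(script))
```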

## Package Structure

```
src/mule_discovery/
├── __init__.py                # Main discover_mule_app() orchestrator
├── constants.py               # XML namespaces, element classifications, connector weights
├── xml_helpers.py             # XML utility functions
├── models/                    # Data models (dataclasses)
│   ├── result.py              # DiscoveryResult (top-level container)
│   ├── flows.py               # FlowInfo, BatchInfo, ChoiceInfo, ScatterGatherInfo, ...
│   ├── connectors.py          # ConnectorInfo, SpringDependency
│   ├── dataweave.py           # DataWeaveInfo
│   ├── listeners.py           # HttpListenerInfo, ScheduledJobInfo
│   ├── dependencies.py        # ExternalDependencyInfo, SourceFiles, OutOfScopeItem
│   ├── schemas.py             # ApiSpecInfo (OpenAPI, WSDL)
│   └── scoring.py             # ComplexityThresholds, ScoreResult
├── parsers/                   # File IO → models
│   ├── file_discovery.py      # find_mule_apps(), find_mule_xml_files()
│   ├── mule_xml.py            # Mule XML parsing (flows, listeners, jobs)
│   ├── pom.py                 # POM parsing (app name, version, connectors)
│   ├── http_auth.py           # HTTP auth config extraction
│   ├── dataweave.py           # DataWeave script parsing
│   ├── soap.py                # SOAP/WSDL service detection
│   ├── aws.py                 # AWS service detection (SQS, S3, DynamoDB)
│   ├── openapi.py             # OpenAPI spec detection
│   └── wsdl.py                # WSDL parsing utilities
├── analysis/                  # Models → models (pure functions)
│   ├── classification.py      # Flow type and source category constants
│   ├── complexity.py          # Flow and DataWeave complexity assignment
│   ├── patterns.py            # Pattern detection (async, scatter-gather, choice, ...)
│   ├── scoring.py             # Migration score calculation (0–100)
│   └── dependencies.py        # External dependency and out-of-scope extraction
├── output/                    # Models → formatted strings
│   ├── yaml_output.py         # YAML
│   ├── json_output.py         # JSON
│   └── text_output.py         # Human-readable text summary
├── anypoint/                  # Anypoint Platform integration (optional)
│   ├── policies.py            # Policy scanning
│   └── exchange.py            # Custom policy download
└── cli/                       # CLI entry points (thin wrappers)
    ├── discover.py            # mule-discover
    ├── scan_policies.py       # mule-scan-policies
    └── download_policies.py   # mule-download-policies
```

### Design Principles

- **No function does both IO and computation.** Parsers read files → return models. Analysis takes models → returns models. Output takes models → returns strings.
- **All data models are plain dataclasses** with typed fields — no methods with side effects.
- **All analysis functions are standalone** — no class methods, no inheritance.
- **Each output format is a separate module.**

## Testing

```bash
make test
```

Or directly:

```bash
uv run --extra dev python -m pytest
```

Coverage is enforced at 70% (branch coverage) via `pyproject.toml`.
