Metadata-Version: 2.4
Name: mainframe-ingest-mcp
Version: 0.1.1
Summary: MCP server for ingesting COBOL mainframe codebases — classifies files, resolves dependencies, maps CICS transactions to REST routes
Author: Rohan Nair
License: MIT
Keywords: mcp,cobol,mainframe,modernization,cics,db2,bms,zos,migration
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: System :: Systems Administration
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: mcp[cli]>=1.2.0
Requires-Dist: chardet>=5.0
Dynamic: license-file

# mainframe-ingest-mcp

An MCP server that ingests COBOL mainframe codebases from ZIP files and produces a structured JSON manifest — ready for downstream modernization tools.

```bash
pip install mainframe-ingest-mcp
```

---

## What is this?

Mainframe modernization projects always start with the same problem: **nobody knows what's in the codebase.** Thousands of COBOL files, copybooks, BMS screens, JCL jobs, and CSD exports — all sitting in a ZIP someone exported from the mainframe.

`mainframe-ingest-mcp` solves the discovery step. Point it at any mainframe ZIP and it will:

- Classify every file by type (COBOL, copybook, BMS screen, JCL, DB2 DDL, CSD export)
- Detect and convert EBCDIC encoding to UTF-8
- Scan every COPY, EXEC SQL INCLUDE, EXEC CICS, and CALL statement
- Build a full dependency graph (which programs use which copybooks)
- Parse CICS transaction definitions and suggest FastAPI REST routes
- Map every BMS screen to an Angular component name
- Produce a single `manifest.json` — the handoff document for the rest of the pipeline
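
To make the dependency-scanning step concrete, here is a minimal sketch of how COPY, EXEC SQL INCLUDE, EXEC CICS and CALL references could be pulled out of fixed-format COBOL source. It is not the package's actual scanner: the function name `scan_dependencies`, the regular expressions, and the sample program are illustrative assumptions, and real COBOL has more variants (continuation lines, `COPY ... REPLACING`, dynamic `CALL` targets) than this handles.

```python
import re

# Hypothetical patterns for illustration; not the package's real rules.
PATTERNS = {
    "copies": re.compile(r"(?<![A-Z0-9-])COPY\s+([A-Z0-9#@$-]+)", re.IGNORECASE),
    "sql_includes": re.compile(r"EXEC\s+SQL\s+INCLUDE\s+([A-Z0-9#@$-]+)", re.IGNORECASE),
    "bms_sends": re.compile(r"MAPSET\s*\(\s*'([^']+)'", re.IGNORECASE),
    "calls": re.compile(r"(?<![A-Z0-9-])CALL\s+'([^']+)'", re.IGNORECASE),
}


def scan_dependencies(source: str) -> dict[str, list[str]]:
    """Return the names referenced by each statement kind, in order of first use."""
    code_lines = []
    for line in source.splitlines():
        # Fixed format: column 7 holds the indicator; '*' or '/' marks a comment.
        if len(line) > 6 and line[6] in "*/":
            continue
        # Keep only the code area (columns 8-72), dropping sequence numbers.
        code_lines.append(line[7:72])
    code = "\n".join(code_lines)

    found: dict[str, list[str]] = {}
    for kind, pattern in PATTERNS.items():
        names: list[str] = []
        for match in pattern.finditer(code):
            name = match.group(1).upper()
            if name not in names:
                names.append(name)
        found[kind] = names
    return found


# Made-up sample program, used only to exercise the sketch.
SAMPLE = """\
000100 IDENTIFICATION DIVISION.
000200* COPY IGNORED is inside a comment line
000300     COPY EMPRECRD.
000400     COPY DFHAID.
000500     EXEC SQL INCLUDE SQLCA END-EXEC.
000600     EXEC CICS SEND MAP('EMPINQM')
000700          MAPSET('EMPINQMS') END-EXEC.
000800     CALL 'DATEUTIL' USING WS-DATE.
"""

result = scan_dependencies(SAMPLE)
print(result)
```

Joining the code area before matching lets a statement that spans two lines, such as the `SEND MAP ... MAPSET` pair above, still be found.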

It has been tested against a real 294-file, 118,000-line government CICS application, where it resolved over 99% of dependencies with zero configuration.

---

## Installation

```bash
pip install mainframe-ingest-mcp
```

Requires Python 3.10+.

---

## Connecting to Claude Desktop

This is an MCP server — it runs locally and exposes tools to Claude Desktop.

**Step 1:** Find your Claude Desktop config file:

| OS      | Location |
|---------|----------|
| macOS   | `~/Library/Application Support/Claude/claude_desktop_config.json` |
| Windows | `%APPDATA%\Claude\claude_desktop_config.json` |
| Linux   | `~/.config/Claude/claude_desktop_config.json` |

**Step 2:** Add this to the config:

```json
{
  "mcpServers": {
    "mainframe-ingest-mcp": {
      "command": "mainframe-ingest-mcp"
    }
  }
}
```

**Step 3:** Restart Claude Desktop.

That's it. The tools will appear automatically. You can now say to Claude:

> *"Extract the ZIP at /Users/me/PAYROLL_SYSTEM.zip to /tmp/payroll and build a manifest"*

Claude will call the tools in the right order and return the full manifest.
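
If you would rather script Step 2 than edit the file by hand, something like the following could work. This is a sketch, not part of the package: `register_server` is a hypothetical helper that merges the entry shown above into an existing config without dropping other servers already registered there.

```python
import json
from pathlib import Path


def register_server(config_path: Path, name: str = "mainframe-ingest-mcp") -> dict:
    """Add the server entry to a Claude Desktop config, keeping existing entries."""
    config: dict = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)

    # Create the "mcpServers" section if missing, then add or overwrite our entry.
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": "mainframe-ingest-mcp"}

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Example call for macOS (commented out so that running this file changes nothing):
# register_server(
#     Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"
# )
```

Restart Claude Desktop afterwards, as in Step 3.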

---

## Usage (without Claude Desktop)

You can also call the tools directly from Python:

```python
import json

from zip_ingestion_mcp.tools.extractor import extract_zip
from zip_ingestion_mcp.tools.manifest import build_file_manifest

# Extract the ZIP into a working directory
extract_zip("/path/to/PAYROLL_SYSTEM.zip", "/tmp/payroll")

# Build the manifest from the extracted files
manifest = build_file_manifest("/tmp/payroll", "/path/to/PAYROLL_SYSTEM.zip")

# Print only the summary section
print(json.dumps(manifest["summary"], indent=2))
```

Example output:
```json
{
  "total_files": 294,
  "cobol_programs": 83,
  "copybooks": 177,
  "bms_screens": 17,
  "cics_transactions": 304,
  "total_cobol_lines": 77962
}
```
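
The extraction step is described in the tool list as zip-slip protected. For readers unfamiliar with the term, the sketch below shows one common way to guard against it using only the standard library. It is an assumption about the general technique, not the package's `extract_zip` implementation; `safe_extract` is a hypothetical name.

```python
import zipfile
from pathlib import Path


def safe_extract(zip_path: str, dest_dir: str) -> list[str]:
    """Extract an archive, refusing any member that would land outside dest_dir."""
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(zip_path) as archive:
        members = archive.infolist()

        # First pass: validate every member before writing anything.
        for member in members:
            target = (dest / member.filename).resolve()
            if not target.is_relative_to(dest):
                raise ValueError(f"Unsafe path in archive: {member.filename}")

        # Second pass: all members are inside dest, so extract them.
        for member in members:
            archive.extract(member, dest)

    return [member.filename for member in members]
```

Validating the whole archive first means a malicious entry such as `../evil.txt` causes the call to fail before any file is written.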

---

## Available MCP Tools

| Tool | What it does |
|------|-------------|
| `tool_build_file_manifest` | **The main tool.** Runs everything and returns the full manifest. |
| `tool_extract_zip` | Extracts a ZIP file safely (zip-slip protected) |
| `tool_detect_all_artifacts` | Classifies every file by type |
| `tool_detect_encoding` | Detects whether a file is EBCDIC or UTF-8 |
| `tool_convert_encoding` | Converts EBCDIC → UTF-8 (creates `.bak` backup) |
| `tool_scan_dependencies` | Builds a full COPY/CICS/SQL dependency graph |
| `tool_parse_csd_export` | Parses CICS transaction definitions → REST route suggestions |

In most cases you only need `tool_build_file_manifest` — it calls all the others internally.
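
To illustrate what the two encoding tools do, here is a rough sketch of EBCDIC detection and conversion with a backup copy. Everything in it is an assumption made for illustration: the byte-frequency heuristic, the 0.7 threshold, the `cp037` code page, and the `<name>.bak` naming may all differ from what the package actually does.

```python
import shutil
from pathlib import Path


def looks_like_ebcdic(data: bytes, threshold: float = 0.7) -> bool:
    """Heuristic: EBCDIC text is mostly 0x40 (space) and bytes >= 0x81 (letters, digits)."""
    if not data:
        return False
    ebcdic_like = sum(1 for b in data if b == 0x40 or b >= 0x81)
    return ebcdic_like / len(data) >= threshold


def convert_to_utf8(path: Path, codec: str = "cp037") -> bool:
    """Convert a file in place if it looks like EBCDIC; return True if converted."""
    data = path.read_bytes()
    if not looks_like_ebcdic(data):
        return False

    # Keep the original bytes next to the file before overwriting it.
    shutil.copy2(path, path.with_name(path.name + ".bak"))

    # cp037 is one common EBCDIC code page; others (cp500, cp1140) exist.
    path.write_bytes(data.decode(codec).encode("utf-8"))
    return True
```

A frequency heuristic like this can misfire on UTF-8 files made up mostly of non-ASCII characters, so a real detector would likely combine several signals.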

---

## The manifest.json

The manifest is a structured JSON document designed to be consumed by downstream modernization tools:

```json
{
  "manifest_version": "1.0",
  "readiness": "READY",
  "summary": {
    "total_files": 9,
    "cobol_programs": 2,
    "copybooks": 4,
    "bms_screens": 1,
    "cics_transactions": 3,
    "total_cobol_lines": 156,
    "ebcdic_files_detected": 0,
    "unresolved_copybooks": 0
  },
  "cobol_programs": [
    {
      "file": "cbl/online/EMPINQRY.cob",
      "line_count": 89,
      "dependencies": {
        "copies": ["EMPRECRD", "DFHAID"],
        "bms_sends": ["EMPINQMS"],
        "sql_includes": []
      }
    }
  ],
  "bms_screens": [
    {
      "mapset_name": "EMPINQMS",
      "file": "bms/EMPINQMS.bms",
      "angular_component": {
        "component_class": "EmpinqmsComponent",
        "selector": "app-empinqms",
        "template_file": "empinqms.component.html"
      }
    }
  ],
  "route_table": [
    {
      "transaction_code": "EMPI",
      "program": "EMPINQRY",
      "http_method": "GET",
      "fastapi_route": "/api/employee/{id}"
    }
  ],
  "warnings": [],
  "next_steps": [
    "Send 2 COBOL programs to cobol-parser-mcp",
    "Send 1 BMS screen to bms-to-angular-mcp"
  ]
}
```
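
A downstream tool might consume the manifest along these lines. This is a sketch under assumptions: `plan_from_manifest` is a hypothetical helper, only the fields shown in the example above are used, and treating any `readiness` value other than `"READY"` as an error is an illustrative choice, since other values are not documented here.

```python
def plan_from_manifest(manifest: dict) -> list[str]:
    """Summarize the routes and screens a downstream tool would need to generate."""
    # Assumption: anything other than "READY" means the manifest should not be used.
    if manifest.get("readiness") != "READY":
        raise ValueError(f"Manifest not ready: {manifest.get('readiness')!r}")

    lines = []
    for route in manifest.get("route_table", []):
        lines.append(
            f'{route["http_method"]} {route["fastapi_route"]} '
            f'-> {route["program"]} ({route["transaction_code"]})'
        )
    for screen in manifest.get("bms_screens", []):
        component = screen["angular_component"]
        lines.append(
            f'{screen["mapset_name"]} -> {component["component_class"]} '
            f'<{component["selector"]}>'
        )
    return lines


# Trimmed copy of the example manifest above.
example = {
    "readiness": "READY",
    "bms_screens": [
        {
            "mapset_name": "EMPINQMS",
            "angular_component": {
                "component_class": "EmpinqmsComponent",
                "selector": "app-empinqms",
            },
        }
    ],
    "route_table": [
        {
            "transaction_code": "EMPI",
            "program": "EMPINQRY",
            "http_method": "GET",
            "fastapi_route": "/api/employee/{id}",
        }
    ],
}

plan = plan_from_manifest(example)
for line in plan:
    print(line)
```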

---

## File types supported

| Extension | Classified as | Notes |
|-----------|--------------|-------|
| `.cbl`, `.cob`, `.cobol` | `COBOL_PROGRAM` | Main business logic |
| `.cpy` | `COPYBOOK` or `BMS_COPYBOOK` | Detected by content |
| `.bms` | `BMS_SCREEN` | 3270 terminal screen |
| `.jcl` | `JCL_JOB` | Batch job definitions |
| `.ddl`, `.sql` | `DB2_DDL` | DB2 schema definitions |
| `.csd` | `CSD_EXPORT` | CICS transaction registry |
| `.inc`, `.include` | `SQL_INCLUDE` or `JCL_INCLUDE` | Detected by content |
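
The table can be read as an extension lookup plus a content check for the ambiguous extensions. The sketch below shows that shape only. The content heuristics (a BMS symbolic map recognized by its generated `...L COMP PIC S9(4)` length fields, a JCL include recognized by lines starting with `//`) and the `UNKNOWN` fallback are assumptions for illustration, not the package's actual rules.

```python
import re
from pathlib import PurePosixPath

# Extensions whose type does not depend on file content (from the table above).
FIXED_TYPES = {
    ".cbl": "COBOL_PROGRAM",
    ".cob": "COBOL_PROGRAM",
    ".cobol": "COBOL_PROGRAM",
    ".bms": "BMS_SCREEN",
    ".jcl": "JCL_JOB",
    ".ddl": "DB2_DDL",
    ".sql": "DB2_DDL",
    ".csd": "CSD_EXPORT",
}

# Hypothetical heuristic: generated BMS symbolic maps declare a length field
# per screen field, e.g. "02  EMPIDL    COMP  PIC  S9(4)."
BMS_LENGTH_FIELD = re.compile(r"[A-Z0-9-]+L\s+COMP\s+PIC\s+S9\(4\)", re.IGNORECASE)


def classify(filename: str, text: str = "") -> str:
    """Classify a file by extension, falling back to content for .cpy and .inc."""
    ext = PurePosixPath(filename).suffix.lower()
    if ext in FIXED_TYPES:
        return FIXED_TYPES[ext]
    if ext == ".cpy":
        return "BMS_COPYBOOK" if BMS_LENGTH_FIELD.search(text) else "COPYBOOK"
    if ext in (".inc", ".include"):
        # Hypothetical heuristic: JCL statements start with "//" in column 1.
        is_jcl = any(line.startswith("//") for line in text.splitlines())
        return "JCL_INCLUDE" if is_jcl else "SQL_INCLUDE"
    return "UNKNOWN"  # assumed fallback label, not taken from the table
```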

---

## Bundled IBM system copybooks

The following IBM system copybooks are bundled with the package because they live in IBM system libraries on the mainframe and are normally not included in application ZIPs:

- `DFHAID` — CICS attention identifier keys (PF1-PF24, ENTER, CLEAR, PA1-PA3)
- `SQLCA` — SQL Communication Area (SQLCODE, SQLERRM, SQLWARN)

These are resolved automatically — no configuration required.
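
One way to picture the resolution is a lookup that checks the extracted files first and falls back to the bundled set. The sketch below is illustrative only: `resolve_copybooks`, the lookup order, and the result keys are assumptions, not the package's internals.

```python
from pathlib import PurePosixPath

# The two system copybooks listed above.
BUNDLED = {"DFHAID", "SQLCA"}


def resolve_copybooks(required: list[str], extracted_files: list[str]) -> dict[str, list[str]]:
    """Sort each required copybook into from_zip, bundled, or unresolved."""
    # Copybook names available in the ZIP, taken from .cpy file stems.
    available = {
        PurePosixPath(path).stem.upper()
        for path in extracted_files
        if PurePosixPath(path).suffix.lower() == ".cpy"
    }

    result: dict[str, list[str]] = {"from_zip": [], "bundled": [], "unresolved": []}
    for name in required:
        key = name.upper()
        if key in available:
            result["from_zip"].append(key)
        elif key in BUNDLED:
            result["bundled"].append(key)
        else:
            result["unresolved"].append(key)
    return result


resolved = resolve_copybooks(
    ["EMPRECRD", "DFHAID", "SQLCA", "MISSING1"],
    ["cpy/EMPRECRD.cpy", "cbl/online/EMPINQRY.cob"],
)
print(resolved)
```

Checking the ZIP first would let an application-supplied copy of a system copybook take precedence over the bundled one.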

---

## Part of a larger pipeline

`mainframe-ingest-mcp` is Step 1 of a pipeline that migrates COBOL applications to a modern stack:

```
ZIP file
   └── mainframe-ingest-mcp      ← you are here
         └── manifest.json
               ├── cobol-parser-mcp       → extracts business logic
               ├── bms-to-angular-mcp     → generates Angular components
               └── db2-schema-mcp         → converts DB2 → PostgreSQL
```

---

## License

MIT
