Metadata-Version: 2.4
Name: matrouter
Version: 0.1.0
Summary: Local-first MCP server for agentic materials data routing.
Author-email: Quansheng Wu <quansheng.wu@iphy.ac.cn>
Maintainer-email: Quansheng Wu <quansheng.wu@iphy.ac.cn>, Quansheng Wu <wuquansheng@gmail.com>
License-Expression: Apache-2.0
Keywords: materials-science,mcp,optimade,materials-project,aflow,c2db,mpds,agent
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Chemistry
Classifier: Topic :: Scientific/Engineering :: Physics
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: NOTICE
Requires-Dist: matplotlib>=3.8
Requires-Dist: materialsgalaxy-api>=0.1.2
Requires-Dist: mp-api>=0.46.1
Requires-Dist: pymatgen>=2025.10.7
Requires-Dist: spglib>=2.5
Dynamic: license-file

# MatRouter

<p align="center">
  <img src="logo.png" alt="MatRouter logo" width="220">
</p>

[English](README.md) | [简体中文](README.zh-CN.md)

MatRouter is a local-first MCP server for agentic materials research. It gives AI agents structured access to materials databases through normalized tools, provenance-aware responses, and lightweight analysis utilities.

## Status

`v0.1.0` is an offline-testable first version focused on local use by researchers.

## Documentation

- User guide: [docs/usage.md](docs/usage.md)
- Development guide: [docs/development.md](docs/development.md)
- Validation report: [docs/test-report.md](docs/test-report.md)
- Configure MatRouter in Codex, Claude Code, OpenCode, and other agents: upload the `SKILL.md` file from this repository to the agent, then ask:
  "Configure MatRouter in my current Codex, Claude Code, or OpenCode environment."
- Example agent prompt:
  "Use MatRouter to find the band structure of diamond and generate a plot."

## Contact

Author and maintainer: Quansheng Wu

- quansheng.wu@iphy.ac.cn
- wuquansheng@gmail.com

## License

MatRouter source code is licensed under the Apache License, Version 2.0. See [LICENSE](LICENSE) and [NOTICE](NOTICE).

MatRouter does not grant rights to third-party database content. Data retrieved through MatRouter remains subject to the terms, licenses, API policies, attribution requirements, and access restrictions of the respective upstream data providers.

## Install From PyPI

MatRouter requires Python 3.10 or newer.

```bash
python3 -m pip install matrouter
```
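
To confirm the install from Python, the following minimal sketch checks the interpreter version against the stated 3.10 requirement and looks up the installed distribution through the standard library. The helper name `installed_version` is illustrative and is not part of MatRouter.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution="matrouter"):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# MatRouter's metadata declares Requires-Python >= 3.10.
python_ok = sys.version_info >= (3, 10)
print("Python >= 3.10:", python_ok)
print("matrouter version:", installed_version() or "not installed")
```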

## Install From Source

```bash
git clone https://github.com/MatRouter/matrouter.git
cd matrouter
python3 -m pip install -e .
```

The default install includes lightweight plotting, Materials Project `mp-api` access, `pymatgen` VASP parsing, phase-diagram helpers, and `spglib`-backed structure matching. Normal users do not need separate extras for band structures, structure matching, or phase diagrams.

## Configuration

Most starter workflows should use public sources first. AFLOW, OQMD, JARVIS-DFT, Alexandria, NOMAD public data, COD, C2DB, Materials Cloud Archive, MaterialsGalaxy public summaries, and manually configured public OPTIMADE providers do not require MatRouter-specific credentials.

Credentialed sources:

- Materials Project (`MP_API_KEY`): create or sign in to a Materials Project account at <https://materialsproject.org/>. The official API guide says the key is available on the profile dashboard after login: <https://docs.materialsproject.org/downloading-data/using-the-api/getting-started>.
- MaterialsGalaxy (`MATERIALSGALAXY_API_KEY`): use the MaterialsGalaxy portal and API guide at <https://materialsgalaxy.iphy.ac.cn/> and <https://materialsgalaxy.iphy.ac.cn/guides/api>. Public summary search works without a key; detailed structure and property endpoints require a user API key.
- MPDS / PAULING FILE (`MPDS_API_KEY` or `MPDS_KEY`): MPDS API access requires an MPDS account/subscription; see <https://mpds.io/developer/>. Because access is license- and account-controlled, the README smoke-test examples do not use MPDS by default.

Keep real keys in environment variables or local MCP client config. Do not commit them to the repository.
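
Before starting the server, it can help to see which credentialed sources have a key in the current environment. The sketch below uses only the variable names listed above and reports presence, never the key values. The `CREDENTIAL_VARS` grouping and the `credential_status` helper are illustrative and are not part of MatRouter.

```python
import os

# Variable names are taken from the credentialed-source list above.
CREDENTIAL_VARS = {
    "Materials Project": ("MP_API_KEY",),
    "MaterialsGalaxy": ("MATERIALSGALAXY_API_KEY",),
    "MPDS": ("MPDS_API_KEY", "MPDS_KEY"),
}


def credential_status(env=None):
    """Map each source to True when any of its variables is set and non-blank."""
    env = os.environ if env is None else env
    return {
        source: any(env.get(name, "").strip() for name in names)
        for source, names in CREDENTIAL_VARS.items()
    }


# Print presence only, so keys never end up in logs or terminals.
for source, present in credential_status().items():
    print(f"{source}: {'set' if present else 'missing'}")
```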

Materials Project uses `MP_API_KEY`:

```bash
export MP_API_KEY="your-materials-project-api-key"
```

OQMD is enabled by default and does not require credentials. It uses the public formation-energy REST endpoint for formula, formation energy, hull stability, band gap when available, space group, and prototype metadata:

```bash
export MATROUTER_OQMD_BASE_URL="https://oqmd.org/oqmdapi"
```

AFLOW is enabled by default and does not require credentials. It uses the public AFLUX API for chemical-system search, optional `exact_formula` reduced-formula filtering, documented `aflow_filter` AFLUX templates, and per-entry AFLOW REST fields for normalized material records, CIF/POSCAR structure export, raw artifact listing/download, band-structure plotting from `*_bandsdata.json.xz` with `EIGENVAL.bands.xz` fallback, DOS/PDOS plotting from `*_dosdata.json.xz`, and report-ready artifact collection:

```bash
export MATROUTER_AFLOW_BASE_URL="https://aflow.org/API/aflux"
```

JARVIS-DFT is enabled by default and does not require credentials. It uses the public NIST JARVIS-DFT `dft_3d` snapshot for:

- local formula/chemical-system search, material records, stability, band gaps, and deep property metadata
- CIF/POSCAR structure export
- virtual JSON data products (`data/material.json`, `data/source-metadata.json`, `data/raw-record.json`)
- Figshare raw archive listing/download from each record's `raw_files` metadata
- official JARVIS dataset/API manifests for 2D/FF/ML/WannierTB/Solar/STM/external-database surfaces
- explicit global dataset archive downloads under `datasets/<dataset>/<filename>.zip` with structured JSON previews when possible
- optional archive parsing for band structures, DOS, optical spectra, phonon DOS/DFPT normal modes, BoltzTraP-style transport tables, LOCPOT electrostatic-potential profiles, and work-function values when those files/fields are present

JARVIS manifests are property-first: structures, scalar properties, tensors, spectra, and curves are marked as recommended, while heavy intermediate/model artifacts such as WANN/WAVECAR/WannierTB payloads remain explicit-download-only. The `dft_3d` snapshot is downloaded and cached on first JARVIS query, not at MCP startup:

```bash
export MATROUTER_JARVIS_DFT_DATA_URL="https://ndownloader.figshare.com/files/38521619"
export MATROUTER_JARVIS_DFT_CACHE_PATH="$HOME/.cache/matrouter/jarvis-dft-3d.json"
```
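
The download-on-first-query behavior is a lazy cache pattern. The sketch below illustrates the pattern only and is not MatRouter's implementation: `load_snapshot` and `fake_fetch` are hypothetical names, the record contents are invented, and a temporary directory stands in for `MATROUTER_JARVIS_DFT_CACHE_PATH`.

```python
import json
import os
import tempfile
from pathlib import Path


def load_snapshot(fetch, cache_path):
    """Return cached records if present; otherwise call fetch() once and cache the result."""
    path = Path(cache_path)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    records = fetch()  # hypothetical downloader returning JSON-serializable records
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a sibling file first so an interrupted write never leaves a corrupt cache.
    partial = path.parent / (path.name + ".tmp")
    partial.write_text(json.dumps(records), encoding="utf-8")
    os.replace(partial, path)
    return records


fetch_calls = []


def fake_fetch():
    """Stand-in for the real download; the record below is invented."""
    fetch_calls.append(1)
    return [{"jid": "JVASP-0000", "formula": "Si"}]


with tempfile.TemporaryDirectory() as tmp_dir:
    cache = Path(tmp_dir) / "matrouter" / "jarvis-dft-3d.json"
    first = load_snapshot(fake_fetch, cache)   # triggers the fetch
    second = load_snapshot(fake_fetch, cache)  # served from the cache file
```

Nothing is fetched at import or startup time; the cost is paid by the first caller that actually needs the data.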

Alexandria is enabled by default and does not require credentials. It uses Alexandria's public OPTIMADE PBE and PBEsol endpoints for structures, band gaps, formation energies, and hull distances:

```bash
export MATROUTER_ALEXANDRIA_URLS="https://alexandria.icams.rub.de/pbe,https://alexandria.icams.rub.de/pbesol"
```
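
For orientation, an OPTIMADE structures query against these endpoints could be assembled as follows. This is a sketch under assumptions: the `/v1/structures` path and the `filter`/`page_limit` parameters follow the OPTIMADE specification, while the helper name and the way the environment variable is split are illustrative rather than MatRouter's code. No request is sent.

```python
import os
from urllib.parse import urlencode

DEFAULT_URLS = "https://alexandria.icams.rub.de/pbe,https://alexandria.icams.rub.de/pbesol"


def structure_query_urls(elements, page_limit=5, env=None):
    """Build one OPTIMADE /v1/structures URL per configured Alexandria base URL."""
    env = os.environ if env is None else env
    raw = env.get("MATROUTER_ALEXANDRIA_URLS", DEFAULT_URLS)
    bases = [u.strip().rstrip("/") for u in raw.split(",") if u.strip()]
    quoted = ",".join(f'"{el}"' for el in elements)
    # OPTIMADE filter: the structure contains exactly these elements.
    optimade_filter = f"elements HAS ALL {quoted} AND nelements={len(elements)}"
    query = urlencode({"filter": optimade_filter, "page_limit": page_limit})
    return [f"{base}/v1/structures?{query}" for base in bases]


urls = structure_query_urls(["Si", "O"], env={})
for url in urls:
    print(url)
```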

NOMAD is enabled by default and does not require credentials for public entries. It uses the public NOMAD v1 `entries/query` metadata API, per-entry archive queries for structure export and source-specific property bundles, and raw file endpoints for explicit file listing/download. NOMAD band gaps are normalized when NOMAD exposes them in processed electronic properties; when public rawdir exposes VASP `vasprun.xml`, MatRouter can reuse its VASP parser path for band/DOS plots. Archive total energies can be returned in calculation summaries but are not mapped to formation energy or hull stability:

```bash
export MATROUTER_NOMAD_BASE_URL="https://nomad-lab.eu/prod/v1/api/v1"
```

COD is enabled by default and does not require credentials. It uses the public Crystallography Open Database REST search endpoint for experimental crystal metadata and downloads raw CIF files by COD ID:

```bash
export MATROUTER_COD_BASE_URL="https://www.crystallography.net/cod"
```

C2DB is enabled by default and does not require credentials. It uses the public Computational 2D Materials Database browse/material pages and structure download endpoints for 2D-material band gaps, formation energies, hull distance, layer group, magnetism, dynamic stability, official key-value filters, richer material-page scalar metadata, and structure JSON/CIF/XYZ/POSCAR export. Add `{"include_details": true}` to `c2db_filter` when search results should be hydrated from per-material detail pages. C2DB material pages are also exposed as auditable report artifacts: page HTML, all parsed HTML tables, structured webpage-derived elasticity/piezo/deformation/phonon/Raman/Born/Bader/effective-mass/optical/miscellaneous JSON bundles when present, Plotly JSON for PBE/HSE/GW band and PDOS plots, extracted Plotly curve JSON, MatRouter-rendered PNG plots generated from those curve JSON artifacts, and page PNG figures when no curve data is exposed. These webpage-derived artifacts are not advertised as stable official C2DB curve/property APIs:

```bash
export MATROUTER_C2DB_BASE_URL="https://c2db.fysik.dtu.dk"
```

MaterialsGalaxy public summary search does not require credentials. Detailed structure, electronic-structure, topology, topological-phonon, ferroelectric, nonlinear-optical, and single-crystal-growth endpoints require `MATERIALSGALAXY_API_KEY`:

```bash
export MATERIALSGALAXY_API_KEY="your-materialsgalaxy-api-key"
export MATROUTER_MATERIALSGALAXY_BASE_URL="https://materialsgalaxy.iphy.ac.cn/api/v1"
```

MPDS / PAULING FILE requires an MPDS API key. It uses the MPDS `download/facet` endpoint for experimental inorganic structures, physical-property entries, phase-linked records, and raw CIF export:

```bash
export MPDS_API_KEY="your-mpds-api-key"
export MATROUTER_MPDS_BASE_URL="https://api.mpds.io/v0"
```

OPTIMADE sources can be configured as comma-separated base URLs:

```bash
export MATROUTER_OPTIMADE_URLS="https://example.org/optimade"
```

OPTIMADE provider discovery is opt-in so the server does not make network calls at startup by default:

```bash
export MATROUTER_OPTIMADE_DISCOVER=1
```

Discovery reads the official provider registry, follows provider index `/v1/links` endpoints, and adds queryable child database base URLs. For testing mirrors or private registries:

```bash
export MATROUTER_OPTIMADE_PROVIDER_REGISTRY_URL="https://providers.optimade.org/providers.json"
```
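
The two discovery steps can be sketched on in-memory JSON. The field names (`data`, `attributes`, `base_url`, `link_type`) follow the OPTIMADE specification; the sample documents, the `example.org` URLs, and both helper names are invented for illustration and do not reflect MatRouter's internals.

```python
def provider_index_urls(registry):
    """Pick index base URLs from a providers.json-style document."""
    urls = []
    for entry in registry.get("data", []):
        base_url = (entry.get("attributes") or {}).get("base_url")
        # Providers without a live index database list a null base_url.
        if isinstance(base_url, str) and base_url:
            urls.append(base_url.rstrip("/"))
    return urls


def child_database_urls(links_response):
    """Pick queryable child database base URLs from an index /v1/links response."""
    urls = []
    for link in links_response.get("data", []):
        attrs = link.get("attributes") or {}
        base_url = attrs.get("base_url")
        if isinstance(base_url, dict):  # the spec also allows a link object with "href"
            base_url = base_url.get("href")
        if attrs.get("link_type") == "child" and base_url:
            urls.append(base_url.rstrip("/"))
    return urls


# Invented sample documents shaped like the registry and a links response.
SAMPLE_REGISTRY = {"data": [
    {"id": "demo", "attributes": {"base_url": "https://example.org/index/"}},
    {"id": "offline", "attributes": {"base_url": None}},
]}
SAMPLE_LINKS = {"data": [
    {"id": "db-a", "attributes": {"link_type": "child", "base_url": "https://example.org/db-a"}},
    {"id": "home", "attributes": {"link_type": "root", "base_url": "https://example.org/index"}},
]}

print(provider_index_urls(SAMPLE_REGISTRY))
print(child_database_urls(SAMPLE_LINKS))
```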

Discovery results are cached under the MatRouter cache directory so repeated MCP starts do not need to re-scan the public registry:

```bash
export MATROUTER_CACHE_DIR="$HOME/.cache/matrouter"
export MATROUTER_OPTIMADE_CACHE_TTL_SECONDS=86400
export MATROUTER_OPTIMADE_DISCOVERY_REFRESH=1
```

OPTIMADE search is failure-tolerant across configured providers: if one provider times out or returns an error, MatRouter keeps querying the remaining providers and reports the failed provider in `warnings`. The `limit` argument is applied across the whole OPTIMADE search, not once per provider.
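
One way to picture this behavior is a loop that isolates each provider's failure and shares a single result budget. The sketch below is an illustration, not MatRouter's implementation: `search_all`, the provider callables, and the sequential fill order are assumptions, and the real server may interleave or parallelize providers.

```python
def search_all(providers, query, limit):
    """Query providers in order, keep going on errors, and cap the combined result count.

    providers maps a source key to a callable(query, limit) that returns a list of records.
    """
    results, warnings = [], []
    for key, search in providers.items():
        remaining = limit - len(results)
        if remaining <= 0:
            break  # the limit covers the whole search, not each provider
        try:
            results.extend(search(query, remaining)[:remaining])
        except Exception as exc:  # timeout, HTTP error, malformed response
            warnings.append(f"{key}: {type(exc).__name__}: {exc}")
    return {"results": results, "warnings": warnings}


def failing_provider(query, limit):
    """Stand-in for a provider that times out."""
    raise TimeoutError("timed out")


demo = search_all(
    {
        "provider-a": lambda query, limit: ["a1", "a2", "a3"][:limit],
        "provider-b": failing_provider,
        "provider-c": lambda query, limit: ["c1", "c2", "c3"],
    },
    query='elements HAS "C"',
    limit=4,
)
print(demo)
```

In this demo the failed provider contributes a warning rather than an exception, and the third provider only fills the one slot left under the shared limit.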

To avoid repeatedly querying providers that recently failed, enable the persistent health skip window:

```bash
export MATROUTER_OPTIMADE_SKIP_UNHEALTHY=1
export MATROUTER_OPTIMADE_FAILURE_COOLDOWN_SECONDS=3600
```
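
The skip window amounts to comparing each provider's last failure time with the cooldown. A minimal sketch, assuming failures are stored as a mapping from base URL to a Unix timestamp; that storage format and the `providers_to_query` helper are hypothetical.

```python
import time


def providers_to_query(base_urls, last_failure, cooldown_seconds=3600, now=None):
    """Drop providers whose most recent failure is still inside the cooldown window."""
    now = time.time() if now is None else now
    return [
        url
        for url in base_urls
        if url not in last_failure or now - last_failure[url] >= cooldown_seconds
    ]


# Invented example: "a" failed 3000 s ago (skipped), "b" failed 3900 s ago (retried).
selected = providers_to_query(
    ["https://a.example/optimade", "https://b.example/optimade", "https://c.example/optimade"],
    {"https://a.example/optimade": 1000.0, "https://b.example/optimade": 100.0},
    cooldown_seconds=3600,
    now=4000.0,
)
print(selected)
```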

`explain_sources` reports per-provider status, capabilities, base URL, and database-qualified source keys such as `optimade:optimade.materialscloud.org/main/mc3d-pbesol-v2`.
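
The example key above looks like the provider's host and path with an `optimade:` prefix. Assuming that is how the key is formed, which this README does not confirm, a base URL could be mapped to a key as in the following sketch. The helper name is hypothetical.

```python
from urllib.parse import urlparse


def optimade_source_key(base_url):
    """Build an 'optimade:<host>/<path>' key from a provider base URL (assumed format)."""
    parsed = urlparse(base_url)
    path = parsed.path.strip("/")
    location = parsed.netloc + (f"/{path}" if path else "")
    return f"optimade:{location}"


key = optimade_source_key("https://optimade.materialscloud.org/main/mc3d-pbesol-v2")
print(key)
```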

## Run

```bash
matrouter
```

For a quick local configuration check:

```bash
matrouter --explain-sources
```

## MCP Tools

- `search_materials`
- `get_material`
- `get_structure` (`json`, `cif`, and `poscar` when lattice/site data is available)
- `get_band_structure`
- `get_density_of_states` (Materials Project with `MP_API_KEY`; AFLOW processed DOS artifacts with lightweight MatRouter plotting; JARVIS-DFT via listed VASP raw archives; NOMAD via public rawdir VASP `vasprun.xml` when present)
- `get_source_property` (source-specific structured property bundles; JARVIS-DFT supports `all_properties`, elasticity, dielectric, optical metadata and spectra, phonon metadata, `edos_pdos`, `phonon_dos`, `phonon_dispersion`, transport coefficients, `transport_curves`, magnetism, superconductivity, exfoliation, electrostatic potential, work function, computational settings, database links, XRD/RDF plot-ready curves, stability semantics, raw archive manifests, external dataset manifests, `jarvis_dataset_manifest`, and `jarvis_api_manifest`; NOMAD supports `available_properties`, `calculation_summary`, `method`, `electronic`, and `raw_archive_manifest`)
- `get_materials_project_property` (official Materials Project property document endpoints such as dielectric, elasticity, phonon, XAS, battery, surface, tasks/provenance, electronic structure, synthesis, molecules, and defect tasks)
- `collect_report_artifacts` (AFLOW report-ready images/CIF/metadata; C2DB material-page HTML, parsed property-table JSON, Plotly JSON including PBE/HSE/GW band and PDOS plots, extracted curve JSON, MatRouter-rendered PNG plots generated from the curve JSON where available, page PNG figures where C2DB only exposes images, and generated structures)
- `list_source_files`
- `download_source_file` (generated structures, C2DB material-page Plotly JSON with sibling rendered PNG output, C2DB extracted curve JSON, parsed property-table JSON, rendered/page PNG artifacts, source data products such as JARVIS JSON records, Materials Project REST property JSON snapshots and explicit Open Data parsed-bucket objects, optionally `decompress=true` for AFLOW `.xz` or JARVIS `.zip` raw artifacts; AFLOW/NOMAD raw VASP files return `pymatgen` post-processing guidance, and NOMAD `vasprun.xml` can feed the built-in band/DOS plotting path)
- `screen_materials`
- `batch_screen_materials` (project-scale screening over many search queries or normalized candidates with async batching, retry, rate limiting, lock-protected project JSON cache, progress, resumable result state, failure reports, stable CSV/Parquet export schema, and optional CSV/Parquet export; default cache path is `.matrouter/cache/batch-screen-cache.json`, override with `MATROUTER_PROJECT_CACHE_DIR`, `MATROUTER_BATCH_SCREEN_CACHE_PATH`, or `cache_path`)
- `compare_materials`
- `plan_retrieval` (source/tool routing with per-need capability summary)
- `build_material_report_package` (cross-source evidence package with embedded database capability context, `report_preset` shortcuts, `analysis_tasks` for AI answer planning, phase matches, optional structure-level matching, phase-stability evidence with optional MP phase-diagram computation, artifact manifest, evidence assessment for agreement/conflict/next-evidence analysis, property inventory, profile-controlled property comparison, ranked candidates, coverage, trust notes, missing data, and follow-up tool suggestions)
- `render_material_report` (renders an existing or newly built material report package to Markdown or printer-friendly HTML with tables, artifact references, data sources, citations/DOIs when present, warnings, missing-data notes, and explicit "do not over-interpret" cautions)
- `build_phase_stability_package` (phase-stability evidence package with formation energy, hull distance, stable flags, phase-separation/dynamic-stability metadata, optional Materials Project thermo entries as phase-diagram inputs, `pymatgen` hull/decomposition/plot post-processing when requested, source support for phase-diagram work, limitations, and next-step guidance)
- `build_phase_diagram_package` (one-click Materials Project phase-diagram package with thermo entries, optional plot, target hull distance, decomposition, and chemical-potential/Pourbaix/defect workflow entrypoints where available)
- `get_database_capability_matrix` (lightweight source capability table for key requirements, structures, stability, band structures, DOS, raw files, and report artifacts)
- `get_database_data_catalog` (categorized per-database data inventory: identity, structure, thermodynamics, electronic, optical/spectroscopy, phonon, mechanical, transport, magnetic/superconducting, surfaces/batteries, calculation metadata, database-specific documents, and files)
- `get_material_data_inventory` (for one source/material id, report the database's supported data categories plus normalized fields observed in that material record)
- `locate_property_sources` (given a property need such as `band_gap`, `bands`, `elasticity`, or `files`, list databases that can provide it, credential blockers, and follow-up tools)
- `explain_sources`
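
MCP clients invoke these tools with JSON-RPC `tools/call` requests. The sketch below only builds such a message. The envelope follows the MCP specification, but the `formula` argument is a hypothetical placeholder, since the tool input schemas are not listed here. An MCP client normally performs the initialization handshake and sends this message for you.

```python
import json


def tool_call_request(request_id, tool, arguments):
    """Build a JSON-RPC 2.0 message that asks an MCP server to run one tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# "formula" is an assumed argument name; check the tool's published schema before use.
request = tool_call_request(1, "search_materials", {"formula": "C", "limit": 5})
print(json.dumps(request))
```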

Public smoke check, no API key required:

```bash
matrouter --live-smoke aflow
matrouter --live-smoke aflow --live-smoke-run
```

Configured-source checks remain available when needed:

```bash
matrouter --live-smoke optimade
matrouter --live-smoke mpds
```

Without `--live-smoke-run`, the smoke check is a dry run that reports source configuration without network access. Adding `--live-smoke-run` executes a tiny one-record probe, and only for sources that are enabled. Run the MPDS smoke check only with a valid MPDS account/subscription key.

## Testing

```bash
PYTHONPATH=src python3 -m unittest discover -s tests -v
```

Render a saved report package for human readers:

```bash
matrouter --render-report package.json --output report.md
matrouter --render-report package.json --report-format html --output report.html
```
