The Future of Data Curation

Build, orchestrate, and visualize high-performance data pipelines with Zem, the first unified framework designed for the Model Context Protocol (MCP) era.

pipeline.yaml
name: medical_cleaning_pipeline

servers:
  nemo: src/xfmr_zem/servers/nemo_curator/server.py
  dj: src/xfmr_zem/servers/data_juicer/server.py

pipeline:
  - nemo.pii_removal:
      input: {anonymize_names: true}
  - dj.clean_html: {}

Why choose Zem?

MCP Architecture

Standalone, modular servers hold the domain logic. Each server runs as its own process and talks to the client over stdio, which avoids async complexity.

ZenML Visualization

Every step is tracked and visualized automatically, so stakeholders can follow the whole pipeline as a graph.

Config-Driven

No more tangled code. Define or modify complex pipelines by simply editing a YAML file.
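
As a rough illustration of the config-driven idea, the sketch below splits each pipeline entry from the pipeline.yaml sample into a server alias, a tool name, and its arguments. It is not Zem's actual loader: the `Step` type and the `parse_steps` function are hypothetical names, and the config is written as the dict a YAML parser would produce so the example runs without extra dependencies.

```python
from typing import Any, Dict, List, NamedTuple


class Step(NamedTuple):
    server: str            # alias from the `servers` section, e.g. "nemo"
    tool: str              # tool exposed by that server, e.g. "pii_removal"
    args: Dict[str, Any]   # arguments taken from the step's `input` mapping


def parse_steps(config: Dict[str, Any]) -> List[Step]:
    """Turn the `pipeline` list into (server, tool, args) steps (hypothetical helper)."""
    steps = []
    for entry in config["pipeline"]:
        # Each entry is a single-key mapping such as {"nemo.pii_removal": {...}}.
        (key, body), = entry.items()
        server, tool = key.split(".", 1)
        if server not in config["servers"]:
            raise KeyError(f"unknown server alias: {server}")
        steps.append(Step(server, tool, (body or {}).get("input", {})))
    return steps


# The pipeline.yaml sample, as the dict a YAML parser would return.
config = {
    "name": "medical_cleaning_pipeline",
    "servers": {
        "nemo": "src/xfmr_zem/servers/nemo_curator/server.py",
        "dj": "src/xfmr_zem/servers/data_juicer/server.py",
    },
    "pipeline": [
        {"nemo.pii_removal": {"input": {"anonymize_names": True}}},
        {"dj.clean_html": {}},
    ],
}

steps = parse_steps(config)
for step in steps:
    print(f"{step.server}.{step.tool}", step.args)
```

Under this reading, adding or reordering a step only means editing the `pipeline` list; the code that walks the list stays the same.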

Domain Ready

Pre-configured tools for the medical, legal, and finance industries, out of the box.

Unified Orchestration

Zem acts as the bridge between modular processing units (MCP Servers) and professional orchestration (ZenML). Every execution is tracked, every artifact is versioned, and every step is descriptively labeled.

  • Dynamic Step Naming
  • Seamless Subprocess Management
  • JSON-RPC over Standard I/O
  • ZenML Stack Integration

[Diagram: Zem Client, ZenML Wrapper, NeMo Server, DataJuicer Server]
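
To make "JSON-RPC over Standard I/O" and subprocess management concrete, here is a minimal sketch of a client that starts a server as a child process and exchanges one newline-delimited JSON-RPC 2.0 message with it. It is a simplified stand-in, not Zem's client: `call_tool` and the echo server in `SERVER_CODE` are hypothetical, the method name follows MCP's `tools/call` convention, and the initialization handshake that a real MCP session begins with is omitted.

```python
import json
import subprocess
import sys

# Stand-in server: reads one JSON-RPC request per line from stdin and echoes
# the tool name and arguments back as the result. A real MCP server does more.
SERVER_CODE = r"""
import json, sys
for line in iter(sys.stdin.readline, ""):
    req = json.loads(line)
    resp = {"jsonrpc": "2.0", "id": req["id"],
            "result": {"tool": req["params"]["name"],
                       "echo": req["params"]["arguments"]}}
    sys.stdout.write(json.dumps(resp) + "\n")
    sys.stdout.flush()
"""


def call_tool(proc, request_id, name, arguments):
    """Send one newline-delimited JSON-RPC 2.0 request and read the reply."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    proc.stdin.write(json.dumps(request) + "\n")
    proc.stdin.flush()
    return json.loads(proc.stdout.readline())


# Launch the server as a subprocess with pipes for stdin and stdout.
proc = subprocess.Popen(
    [sys.executable, "-c", SERVER_CODE],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)
try:
    response = call_tool(proc, 1, "pii_removal", {"anonymize_names": True})
finally:
    proc.stdin.close()      # EOF tells the stand-in server to exit
    proc.wait(timeout=10)
    proc.stdout.close()

print(response["result"])
```

Because the exchange is a blocking write followed by a blocking read on ordinary pipes, the client side needs no event loop, which is one plausible reading of the "bypasses async complexity" claim.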