Metadata-Version: 2.4
Name: iflow-mcp_yuxi-tju-cochem-agents
Version: 0.1.0
Summary: Chemistry Enhancement Package for Claude Code and Gemini CLI
Author-email: ChemAgent Team <contact@chemagent.ai>
License: MIT
Project-URL: Homepage, https://github.com/yuxi-TJU/CoChem-Agents
Project-URL: Documentation, https://github.com/yuxi-TJU/CoChem-Agents
Project-URL: Repository, https://github.com/yuxi-TJU/CoChem-Agents
Project-URL: Issues, https://github.com/yuxi-TJU/CoChem-Agents
Keywords: chemistry,ai,agent,llm,rdkit,drug-discovery,mcp
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Chemistry
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pydantic>=2.0.0
Requires-Dist: pydantic-settings>=2.0.0
Requires-Dist: httpx>=0.25.0
Requires-Dist: rich>=13.0.0
Requires-Dist: typer>=0.9.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: jinja2>=3.1.0
Requires-Dist: aiofiles>=23.0.0
Requires-Dist: asyncio>=3.4.3
Requires-Dist: numpy>=1.24.0
Requires-Dist: pandas>=2.0.0
Requires-Dist: matplotlib>=3.7.0
Requires-Dist: plotly>=5.0.0
Requires-Dist: rdkit>=2023.9.1
Requires-Dist: pubchempy>=1.0.4
Requires-Dist: biopython>=1.81
Requires-Dist: requests>=2.31.0
Requires-Dist: beautifulsoup4>=4.12.0
Requires-Dist: lxml>=4.9.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: jsonschema>=4.0.0
Provides-Extra: dev
Requires-Dist: pytest>=7.4.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: pytest-cov>=4.1.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: mypy>=1.5.0; extra == "dev"
Requires-Dist: pre-commit>=3.3.0; extra == "dev"
Requires-Dist: ipython>=8.14.0; extra == "dev"
Requires-Dist: jupyter>=1.0.0; extra == "dev"
Provides-Extra: mcp
Requires-Dist: mcp>=0.1.0; extra == "mcp"
Requires-Dist: websockets>=11.0; extra == "mcp"
Requires-Dist: jsonrpc>=3.0; extra == "mcp"
Provides-Extra: visualization
Requires-Dist: py3dmol>=2.0.0; extra == "visualization"
Requires-Dist: nglview>=3.0.0; extra == "visualization"
Requires-Dist: ipywidgets>=8.0.0; extra == "visualization"
Provides-Extra: quantum
Requires-Dist: psi4>=1.8.0; extra == "quantum"
Requires-Dist: qcengine>=0.26.0; extra == "quantum"
Requires-Dist: qcelemental>=0.25.0; extra == "quantum"
Dynamic: license-file

# CoChem Agents



<img width="2816" height="1536" alt="Gemini_Generated_Image_9vntz29vntz29vnt" src="https://github.com/user-attachments/assets/b20a37a8-f1a6-43b2-a34f-db14cf4b87e0" />


A collaborative framework for building chemistry and materials agents on top of a general agent framework (such as Gemini CLI, Claude Code, or Codex) plus **MCP**.

**CoChem Agents:** an open framework for chemistry & materials AI. Use Codex as the general agent shell and add capabilities via the Model Context Protocol (MCP). Skip one-off agents: publish MCP servers (RDKit, Materials Project, simulations, lab APIs) and compose them. Integrate once, reuse everywhere.

**CoChem Agents** turns the “one-agent-per-domain” pattern on its head. Instead of crafting bespoke chemistry or materials agents, we use a **general agent framework such as Codex** as the general-purpose agent shell and plug in domain tools via the **Model Context Protocol (MCP)**. Anyone can contribute an MCP server (RDKit, Materials Project, your internal pipeline), and it becomes usable by the same agent. This creates an open, extensible ecosystem rather than a zoo of siloed agents.


## Why chemistry & materials agents matter

AI is rapidly accelerating discovery across chemistry and materials—from structure/property prediction to polymer and crystal modeling—pushing research beyond static prediction toward **agentic** workflows that plan, act, and iterate. Surveys and community reports document both the momentum and the need for robust tooling to make these systems practical in the lab and in silico. 

## What hasn’t worked

Most prior efforts ship a **standalone agent per subfield** (drug design, catalysis, crystals…), each with custom glue code, brittle integrations, and duplicated effort. Evaluations often emphasize reasoning but struggle with **reproducibility and tool-generalization**, so systems don’t travel well between tasks or labs. Meanwhile, tool access (APIs, DBs, codes) is fragmented and hard to standardize across agents. 


## The core technical obstacles

 - **Heterogeneous tools & schemas:** cheminformatics libs, materials databases, simulation engines—all different call patterns and data models. 

 - **Agent–tool wiring & maintenance:** each agent re-implements connectors and auth, leading to drift and duplication. 

 - **Security & governance:** opening tools to agents raises questions around auth, data access, and isolation. 

 - **Evaluation & provenance:** agent benchmarks underweight reproducibility and end-to-end paper-to-protocol faithfulness. 

## Our approach (what’s different)

**1. One agent framework to rule them all**
Use Codex as the generic agent runtime (chat + tools + prompts). No more domain-specific shells. 

**2. Tools as MCP servers**
Expose chemistry/materials capabilities as MCP tools (standardized names, schemas, metadata). Any MCP-compatible client (like Codex) can discover and call them—zero bespoke glue in the agent. 


**3. Open ecosystem, not one-off agents**

 - **Cheminformatics:** community MCP servers for **RDKit** provide descriptor calc, substructure search, rendering, and more. Plug and use. 


 - **Materials data:** connect to **Materials Project** via its public API (or an MCP wrapper) for structures, formation energies, and band gaps. 


 - **Custom science:** **FastMCP** + Gemini CLI make it straightforward to publish your lab’s pipeline as a reusable tool, not a bespoke agent.
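
To make the "Tools as MCP servers" idea concrete, here is a minimal Python sketch of what a tool looks like on the wire: a descriptor shaped like an entry in an MCP `tools/list` result (`name`, `description`, `inputSchema`) and the JSON-RPC 2.0 `tools/call` request a client would send. The tool name `calc_descriptors` and its single `smiles` argument are hypothetical, chosen only for illustration; they are not taken from any CoChem Agents server.

```python
import json

# Hypothetical tool descriptor, shaped like one entry of an MCP `tools/list` result.
TOOL = {
    "name": "calc_descriptors",  # illustrative name, not a real CoChem Agents tool
    "description": "Compute basic descriptors for a SMILES string.",
    "inputSchema": {
        "type": "object",
        "properties": {"smiles": {"type": "string"}},
        "required": ["smiles"],
    },
}


def build_tool_call(request_id: int, tool: dict, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request after a minimal argument check."""
    required = tool["inputSchema"].get("required", [])
    missing = [key for key in required if key not in arguments]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool["name"], "arguments": arguments},
        }
    )


if __name__ == "__main__":
    # Aspirin SMILES, used here only as sample input.
    print(build_tool_call(1, TOOL, {"smiles": "CC(=O)Oc1ccccc1C(=O)O"}))
```

In practice a server library such as FastMCP derives the descriptor from a function signature, so the sketch is only meant to show the shape of the contract that lets one agent call many tools.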



## TL;DR (project intent)

 - **Mission:** build an open, multi-tool ecosystem for chemistry & materials agents by unifying on Codex + MCP.

 - **Why it matters:** agentic science needs interoperable tools, not more siloed agents. 

 - **How you can help:** contribute or refine an MCP server (RDKit, Materials Project, simulations, ELN/SDMS, robo-lab APIs). The agent comes for free. 

## Quick Start
### Installation
#### Clone the repository
```bash
git clone https://github.com/yuxi-TJU/CoChem-Agents.git
cd CoChem-Agents
```

#### Install dependencies
```bash
npm install
npm run build  # generates dist/
```
#### Configure API keys
Set `CHEMSPIDER_API_KEY`, `MATERIALS_PROJECT_API_KEY`, and any other required keys through environment variables or a `.env` file.
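
Before launching the agent, it can help to confirm that the keys are actually visible. The following is a minimal, stdlib-only Python sketch that reads a `.env` file in the current directory, merges it with the process environment, and reports which keys are missing. It checks only the two keys named above; the helper names `parse_env_file` and `missing_keys`, and the choice to let real environment variables override `.env` values, are assumptions for illustration rather than ChemAgent behavior.

```python
import os
from pathlib import Path

# Only the two keys named in this README; extend the tuple for other services.
REQUIRED_KEYS = ("CHEMSPIDER_API_KEY", "MATERIALS_PROJECT_API_KEY")


def parse_env_file(text: str) -> dict:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict, required=REQUIRED_KEYS) -> list:
    """Return the required keys that are absent or empty in env."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    file_values = parse_env_file(env_path.read_text()) if env_path.exists() else {}
    merged = {**file_values, **os.environ}  # assumption: real environment wins
    print("missing:", missing_keys(merged))
```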

#### Install the CLI into Codex
##### Automatic installation (recommended)
```bash
npx chemagent
```
##### Manual installation
```bash
npx chemagent-cli install --platform codex --home <path> --configure-mcp
```

#### Configure and register MCP services
`~/.chemagent/mcp_config.json` lists all built-in and community MCP servers.
```bash
npx chemagent-<name>  # register the server you need
```
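
To check what is registered, a short Python sketch can read the config file and print the server names. The schema of `mcp_config.json` is not documented here, so the top-level `mcpServers` mapping used below is an assumption borrowed from the layout many MCP clients use; adjust the key if the real file differs.

```python
import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".chemagent" / "mcp_config.json"


def list_servers(config: dict) -> list:
    """Return sorted server names, assuming a top-level "mcpServers" mapping."""
    servers = config.get("mcpServers", {})  # assumed key, not confirmed by the README
    return sorted(servers)


if __name__ == "__main__":
    if CONFIG_PATH.exists():
        config = json.loads(CONFIG_PATH.read_text(encoding="utf-8"))
        for name in list_servers(config):
            print(name)
    else:
        print(f"{CONFIG_PATH} not found; run the installer first")
```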


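## Command System

ChemAgent defines commands in Markdown, in a style similar to SuperClaude_Framework, which keeps command definitions simple and flexible:

### Where commands are defined
 - **Project level:** `.claude/commands/` - project-specific commands
 - **User level:** `~/.claude/commands/` - personal global commands
 - **System level:** default commands that ship with ChemAgent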

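### Creating a custom command
```bash
# Create a command inside the project
mkdir -p .claude/commands
cat > .claude/commands/my-analysis.md << 'EOF'
---
description: My molecular analysis workflow
tools: [read_file, web_search]
---

Please run the following analysis steps:
1. Validate the molecular structure
2. Compute basic properties
3. Predict ADMET
4. Generate a report
EOF
```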
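
A command file is front matter between `---` lines followed by the prompt body. As a rough illustration of that layout, the Python sketch below splits such a file into a metadata dictionary and a body. It is a simplified stand-in that handles only `key: value` lines and inline `[a, b]` lists, not full YAML, and it is not ChemAgent's own loader; `parse_command` and `SAMPLE` are hypothetical names.

```python
SAMPLE = """---
description: My molecular analysis workflow
tools: [read_file, web_search]
---

Please run the following analysis steps:
1. Validate the molecular structure
"""


def parse_command(text: str) -> tuple:
    """Split a command file into (metadata dict, prompt body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text.strip()
    # Find the line that closes the front matter.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return {}, text.strip()
    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            # Inline list such as [read_file, web_search]
            value = [item.strip() for item in value[1:-1].split(",") if item.strip()]
        meta[key.strip()] = value
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


if __name__ == "__main__":
    meta, body = parse_command(SAMPLE)
    print(meta["description"], meta["tools"])
    print(body)
```

For anything beyond this simple shape, a real YAML parser is the safer choice.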

## Case Study
### Case 1
```codex
Please help me design an environmentally friendly synthesis route for aspirin (please call MCP tools and tell me which MCP tools you called).
```
```codex
• Green Aspirin Route

  - Used MCP tool chemagent-extras: chem_synthesize to frame the eco-design constraints and ensure route planning stayed
    aligned with renewable feedstocks/recyclable catalysts.
  - Start from bio-derived phenol converted to sodium phenolate; perform a Kolbe–Schmitt fixation with recycled/biogenic
    CO₂ at 130 °C and 8–10 bar in water or glycerol/water to give sodium salicylate, then mildly acidify with recyclable
    dilute acetic acid to obtain salicylic acid.
  - Dry salicylic acid and acetylate with bio-ethanol–derived acetic anhydride (0.95–1.0 equiv) in a recyclable solvent
    such as 2-MeTHF or EtOAc at 70–80 °C, using a reusable solid acid catalyst (Amberlyst-15 or supported succinic acid/
    Fe³⁺); catalyst is filtered, washed, and reused.
  - Quench with a small amount of reclaimed water to hydrolyze excess anhydride; phase-separate, recover acetic acid for
    reconversion to acetic anhydride, and recycle organic solvent by distillation.
  - Crude aspirin is recrystallized from ethanol/water (both bio-sourced) with mother liquors redistilled; solid is
    dried under reduced pressure powered by low-carbon electricity.
  - Waste minimization: no mineral acids beyond trace acetic acid, closed-loop solvents, solid acid catalyst
    regeneration, and CO₂ capture integration keep the E-factor low and the process compliant with green-chemistry
    principles.

  Next steps: 1) quantify mass balance/E-factor for target scale; 2) run small Kolbe–Schmitt trials to confirm CO₂
  recycle efficiency.
```

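### Case 2
```codex
Please analyze which materials can be prepared from phthalic acid monoacetate as an organic synthesis intermediate, and give the preparation methods (please call MCP tools and tell me which MCP tools you called).
```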
```codex
• Downstream Materials

  - chemagent-extras: chem_synthesize (targets: cellulose acetate phthalate; DEHP plasticizer) flagged phthalic
    monoacetate as a versatile half-ester that can be driven either toward anhydrides for acylation chemistry or toward
    fully esterified products with long-chain alcohols. Building on those prompts, here are practical materials you
    can access:
  - Cellulose acetate phthalate (enteric coating polymer) — Convert phthalic monoacetate to the mixed anhydride by
    treating with acetic anhydride (1.2 eq) and catalytic pyridine at 80 °C; strip excess anhydride to get a reactive
    acylation mixture still containing the monoacetate. Add pre-activated cellulose acetate (DS≈1.7) in acetone and
    raise to 50 °C while adding triethylamine as HCl scavenger; the phthalyl group installs at remaining hydroxyls,
    giving cellulose acetate phthalate after neutralization and precipitation into water. Wash, dry, then mill for
    tablet coating use.
  - Di-2-ethylhexyl phthalate (DEHP) or other PVC plasticizer-grade diesters — Hydrolyze the monoacetate (50 % aq. NaOH,
    80 °C) to revert the free phthalic acid half, re-esterify via acid chloride formation (SOCl₂, catalytic DMF, 70 °C)
    to produce phthalyl chloride; react with 2-ethylhexanol (2.2 eq) and pyridine at 0–25 °C, then heat to 100 °C for
    completion. Alternatively, run a direct Fischer esterification on the monoacetate with excess 2-ethylhexanol under
    Dean–Stark to displace the terminal acetate and generate the diester; vacuum-strip and polish through a neutral
    alumina bed to reach plasticizer purity.
  - Alkyd/unsaturated polyester resins (coil coatings, wire enamels) — Use the monoacetate as a “half-ester” feed in an
    alkyd cook: charge it with glycerol (or pentaerythritol) and fatty-acid-modified monoglycerides at 220 °C under N₂.
    The monoacetate first cleaves to phthalic acid in situ, then esterifies to form the phthalate-rich segments that
    lend hardness and gloss. Apply vacuum toward the end to pull off acetic acid and drive polycondensation to acid
    value <15 mg KOH/g; the resulting resin dissolves in high-boiling esters for coating formulations.
  - Specialty imide monomers (e.g., phthalic-imide curing agents) — Convert monoacetate to phthalic anhydride via
    vacuum pyrolysis (180 °C, 20 mmHg) or acetic anhydride reflux with catalytic ZnCl₂; react with ammonia or amines
    (e.g., 4,4′-diaminodiphenyl ether) in glacial acetic acid at 120 °C to form amic acids, then dehydrate with acetic
    anhydride to imide monomers used in high-Tg polyimides.

  Each route leverages the monoacetate either as a latent anhydride (for polymer acylations) or as a controllable half-
  ester that can be selectively converted to acid chlorides/diesters, aligning with the MCP guidance on using it as a
  modular intermediate.
```

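### Case 3
```codex
I have a list of several SMILES strings: ['CC(=O)Oc1ccccc1C(=O)O', 'CN1C=NC2=C1C(=O)N(C(=O)N2C)C']. Please calculate the LogP and molecular weight of these molecules, filter for those that satisfy Lipinski's Rule of 5, and output the list of molecules that pass.
```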
```codex
• Results

  - CC(=O)Oc1ccccc1C(=O)O: MW 180.16, LogP 1.31, HBD 1, HBA 3 → Lipinski-compliant
  - CN1C=NC2=C1C(=O)N(C(=O)N2C)C: MW 194.19, LogP −1.03, HBD 0, HBA 6 → Lipinski-compliant

  Lipinski-passing SMILES list: ['CC(=O)Oc1ccccc1C(=O)O', 'CN1C=NC2=C1C(=O)N(C(=O)N2C)C']. Let me know if you have more molecules to screen or need a tabular export.
```
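
The filtering step in this case can be sketched in a few lines of Python. The example below applies the rule-of-five thresholds (MW ≤ 500, LogP ≤ 5, H-bond donors ≤ 5, H-bond acceptors ≤ 10) to the descriptor values reported in the transcript above; it does not compute descriptors itself, which in a real run would come from a cheminformatics tool such as RDKit. The `Descriptors` class and function names are hypothetical, and the strict default of zero allowed violations is an assumption (many screens tolerate one).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    smiles: str
    mol_weight: float
    logp: float
    h_donors: int
    h_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count rule-of-five violations: MW > 500, LogP > 5, HBD > 5, HBA > 10."""
    return sum([d.mol_weight > 500, d.logp > 5, d.h_donors > 5, d.h_acceptors > 10])


def passes_lipinski(d: Descriptors, max_violations: int = 0) -> bool:
    """Strict by default; pass max_violations=1 to allow a single violation."""
    return lipinski_violations(d) <= max_violations


# Values copied from the agent output above, not recomputed here.
MOLECULES = [
    Descriptors("CC(=O)Oc1ccccc1C(=O)O", 180.16, 1.31, 1, 3),
    Descriptors("CN1C=NC2=C1C(=O)N(C(=O)N2C)C", 194.19, -1.03, 0, 6),
]

if __name__ == "__main__":
    print([d.smiles for d in MOLECULES if passes_lipinski(d)])
```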

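### Case 4
```codex
Please search the database for crystal structure information on titanium dioxide (TiO2), extract the most stable configuration, and perform a preliminary geometry optimization with a universal force field.
```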
```codex
• - Pulled TiO₂ rutile (P42/mnm) from the Crystallography Open Database (COD 9004141) because it is the thermodynamic ground-state polymorph at ambient conditions; the
    CIF lists a = b = 4.593 Å, c = 2.959 Å, Z = 2, matching the Meagher & Lager 1979 refinement (data/tiO2_rutile_cod9004141.cif:32-44).
  - Converted the COD entry to an explicit 6-atom conventional cell and saved both output/rutile_tio2_initial_ase.cif and output/rutile_tio2_initial_ase.xyz for reuse
    downstream.
  - Performed a Universal Force Field relaxation via Open Babel’s OBForceField (500 CG steps, 1e‑6 tolerance). The optimized coordinates are in output/
    rutile_tio2_uff_opt.xyz (output/rutile_tio2_uff_opt.xyz:2-8), and the UFF energy / force diagnostics are captured in output/rutile_tio2_uff_summary.json (energy
    −0.217 kcal mol⁻¹, max residual force 0.43 kcal mol⁻¹ Å⁻¹; see output/rutile_tio2_uff_summary.json:1-27). This is a reasonable pre-optimization but still far from
    a converged solid-state relaxation because UFF lacks proper ionic/periodic physics.

  Next steps (optional):

  1. Use a solid-state method (DFT, GULP with Buckingham potentials, etc.) to relax both lattice vectors and positions starting from output/rutile_tio2_uff_opt.xyz.
  2. Validate against experimental observables (lattice constants, Raman modes) to ensure the structure remains faithful once higher-level optimization is complete.
```

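### Case 5
```codex
Please perform a retrosynthetic analysis of the target molecule ibuprofen and give the suggested synthetic precursors and disconnection sites.
```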
```codex
• Ibuprofen Retrosynthesis

  - Breakpoint 1 (carboxylate install): Disconnect the propionic acid side chain from the aromatic ring to give 4-isobutylacetophenone (isobutylbenzene with para-
    acylation). This ketone is the key intermediate for the classic Boots/Hoechst process; oxidative or hydrocyanation routes can extend the side chain and set the α-
    methyl center before final hydrolysis.
  - Breakpoint 2 (aromatic core): Further disconnect 4-isobutylacetophenone into isobutylbenzene and acetyl chloride (or equivalent acylating agent). Fragmentation via
    Friedel–Crafts acylation introduces the acetyl group para to the isobutyl substituent.
  - Alternative modern route (BHC process): Use isobutylbenzene → para-acylation to 4-isobutylacetophenone → catalytic hydrogenation to the corresponding
    α-(isobutyl)ethylbenzene → carbonylation/hydroformylation to the propionic acid motif. This avoids stoichiometric cyanide and simplifies cleanup.

  Suggested Precursors

  1. Isobutylbenzene: accessible from isobutyl chloride + benzene via Friedel–Crafts alkylation or from isobutylbenzene commercial supply. Provides the aromatic
     scaffold.
  2. Acetyl chloride (or acetic anhydride): installs the acetyl unit para to the isobutyl group under Friedel–Crafts conditions, furnishing 4-isobutylacetophenone.
  3. Oxidizing/chain-extension agent: either (a) hydrogen cyanide plus acid hydrolysis for the classic Ritter-type homologation, or (b) carbon monoxide/hydrogen
     (hydroformylation) followed by oxidation in the BHC route.

  These disconnections highlight the two strategic bond formations: (1) acylation to append the acetyl group on the aromatic ring, and (2) homologation/oxidation of
  the acetyl group to the α-methyl carboxylic acid side chain to yield ibuprofen.
```

### Case 6
```codex
Dock the molecule aspirin to the target protein 4COX and report the binding affinity score.
```
```codex
• Docking Result

  - AutoDock Vina 1.2.7 docking (script output/docking/aspirin_4cox/run_vina.py:1) used chain A of 4COX (heme removed to allow rigid PDBQT generation) and a RDKit-
    generated aspirin conformer; grid center was the crystallographic indomethacin centroid at 24.849, 22.295, 15.126 Å with a 18.4 × 16.7 × 20.5 Å box (output/
    docking/aspirin_4cox/aspirin_docking_summary.txt:1).
  - Best pose affinity: -6.819 kcal/mol, with the top cluster ranging from −6.8 to −6.3 kcal/mol (output/docking/aspirin_4cox/aspirin_docking_summary.txt:3, output/
    docking/aspirin_4cox/aspirin_vina_energies.txt:2-9). All poses and detailed energy components are in output/docking/aspirin_4cox/aspirin_docked_poses.pdbqt
    and .../aspirin_vina_energies.txt.
  - Vina log (stdout from python3 output/docking/aspirin_4cox/run_vina.py) shows consistent RMSDs (~1.5–2.5 Å) indicating a stable binding mode in the COX active
    channel.

  Context & Caveats

  - Receptor preparation kept only chain A and removed the heme cofactor to satisfy Open Babel’s rigid PDBQT writer; thus metalloprotein electrostatics are absent and
    reported affinities likely underestimate true binding strength.
  - Glycans and crystallographic waters were also removed; if these contacts are important, consider reintroducing key residues and converting with AutoDockTools/Meeko
    for higher fidelity.

  Next Steps

  1. If you need heme present, rebuild the receptor with MGLTools/Meeko (supports cofactors) before rerunning run_vina.py.
  2. Inspect aspirin_docked_poses.pdbqt in PyMOL/ChimeraX to verify hydrogen-bonding vs. the native indomethacin pose; adjust the grid or exhaustiveness if a different
     pocket should be sampled.
  3. For comparative scoring, repeat the workflow with known COX inhibitors to benchmark the −6.8 kcal/mol affinity estimate.
```
