{% extends "base.html" %} {% block title %}Config / .env — MCP Forge{% endblock %} {% block page_title %}Configuration{% endblock %} {% block content %}

Edit your .env file directly from the dashboard. Changes take effect immediately.

LLM Providers

API keys for the chat agent

Local HuggingFace Model — requires NVIDIA GPU + Docker GPU passthrough

Docker cannot access the Mac GPU — run the model natively in one command

  1. Open a new terminal in the project folder and run: bash scripts/start_model_server.sh
    The script auto-installs Python if needed, creates a venv, installs torch + transformers, then starts the server.
    Model weights download once to ./cache/huggingface/ (shared with Docker — no re-download ever).
  2. Wait until the terminal shows Listening on 0.0.0.0:8005, then connect via LOCAL_MODEL_HOST.
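
If watching the terminal for the Listening line is inconvenient, the wait in step 2 can be scripted. The following is a minimal sketch, not part of the app: `wait_for_port` is a hypothetical helper that polls a TCP port until something accepts a connection. Port 8005 comes from the message above; the host `127.0.0.1` in the usage note is an assumption about where the server is reachable from your machine.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 120.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper for illustration; it only checks that the port accepts
    connections, not that the model itself has finished loading.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

Calling `wait_for_port("127.0.0.1", 8005)` from a second terminal returns `True` as soon as the server accepts connections, or `False` after the timeout.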

When does the model run? The model loads lazily on the first analysis/generation request, or immediately when you click Load Model Now above. First run downloads the model from HuggingFace (~8 GB for 14B 4-bit). Subsequent starts reuse the cache. Recommended for 15 GB VRAM: Qwen/Qwen2.5-Coder-14B-Instruct.
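
The lazy-loading behaviour described above can be pictured with a small sketch. This is an illustration of the general pattern only, not the app's actual code: `LazyModel`, `load_now`, and `generate` are hypothetical names, and the loader is any callable that returns a model object.

```python
import threading
from typing import Any, Callable, Optional


class LazyModel:
    """Load a model on first use, or eagerly when load_now() is called."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader              # assumed: an expensive function that builds the model
        self._model: Optional[Any] = None
        self._lock = threading.Lock()      # prevents two concurrent requests from loading twice

    def load_now(self) -> Any:
        """Eager path, comparable to clicking Load Model Now."""
        with self._lock:
            if self._model is None:
                self._model = self._loader()
            return self._model

    def generate(self, prompt: str) -> Any:
        """Lazy path: the first request triggers the load, later ones reuse it."""
        model = self.load_now()
        return model(prompt)
```

With this shape the expensive load happens at most once per process, whichever path triggers it first.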

GitHub

For private repos and higher rate limits

App Settings

MCP Server

Settings for this app's own MCP endpoint (used by Claude)

Feature Flags

Raw .env Editor

Edit with care. Invalid syntax will break the app.
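
One way to catch syntax mistakes before saving is to check the text line by line. The sketch below assumes a simple dotenv grammar (blank lines, `#` comments, and `KEY=VALUE` with an optional `export` prefix); the parser this app actually uses may accept more or less than that, and `find_invalid_env_lines` is a hypothetical helper.

```python
import re
from typing import List

# Assumed grammar: optional "export", a shell-style name, "=", then any value.
_ASSIGNMENT = re.compile(r"^\s*(?:export\s+)?[A-Za-z_][A-Za-z0-9_]*\s*=.*$")


def find_invalid_env_lines(text: str) -> List[int]:
    """Return the 1-based numbers of lines that are not blank, a comment, or KEY=VALUE."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are always fine
        if not _ASSIGNMENT.match(line):
            bad.append(number)
    return bad
```

For example, `find_invalid_env_lines("DEBUG=1\nbad line here")` returns `[2]`, because the second line has no `=` after a valid name.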

{% endblock %}