{% extends "base.html" %} {% block title %}Config / .env — MCP Forge{% endblock %} {% block page_title %}Configuration{% endblock %} {% block content %}
Edit your .env file directly from the dashboard. Changes take effect immediately.
API keys for the chat agent
Docker cannot access the Mac GPU — run the model natively in one command
bash scripts/start_model_server.sh
The script auto-installs Python if needed, creates a venv, installs torch + transformers, then starts the server../cache/huggingface/ (shared with Docker — no re-download ever).
Listening on 0.0.0.0:8005, then:
via LOCAL_MODEL_HOST
When does the model run?
The model loads lazily on the first analysis/generation request, or immediately when you click Load Model Now above.
First run downloads the model from HuggingFace (~8 GB for 14B 4-bit). Subsequent starts reuse the cache.
Recommended for 15 GB VRAM: Qwen/Qwen2.5-Coder-14B-Instruct.
For private repos and higher rate limits
Settings for this app's own MCP endpoint (used by Claude)
Edit with care. Invalid syntax will break the app.