# docsfy

> AI-powered documentation generator that turns Git repositories into shareable static docs sites.

---

Source: introduction.md

# Introduction

`docsfy` is an AI-powered documentation generator for Git repositories. You give it a repo, a branch, and an AI provider/model, and it produces a polished static documentation site that you can browse in the web app, automate through the HTTP API, download as an archive, or manage from the CLI.

It is built for teams that want documentation to stay close to the codebase. Instead of relying on a long README or manually maintained pages, docsfy explores the repository itself, plans a documentation structure, writes markdown pages, renders them into HTML, and keeps track of each generated output as a distinct documentation variant.

## What docsfy does

- Generates documentation from Git repositories over HTTPS or SSH, and can also work from a local Git checkout when an admin provides a filesystem path.
- Tracks documentation as variants, so the same repository can have separate outputs for different branches, AI providers, AI models, and owners.
- Serves the latest or a specific variant through authenticated `/docs/...` routes, and can also package the generated site as a downloadable `.tar.gz`.
- Supports incremental refreshes: if the repository has not changed, docsfy can mark a variant as already up to date; if only part of the repo changed, it can regenerate only the affected pages.
- Runs a post-generation pipeline that can validate generated pages against the repository, add related-page cross-links, detect a project version for the site footer, and pre-render Mermaid diagrams to SVG when Mermaid CLI is available on the server.
- Produces a human-friendly static site and companion `llms.txt` / `llms-full.txt` files for AI-oriented consumption.

## Who it is for

- Self-hosters and platform admins who want a shared, authenticated documentation service for a team.
- Developers and technical writers who want fast first-pass docs generated directly from a repository.
- Internal users who only need read access to published documentation through the web app or generated site.
- Teams that need docs scoped by branch, model, or owner instead of a single global “latest” build.

docsfy has three built-in roles:

- `admin` can see everything, manage users, rotate user keys, and grant or revoke access to projects.
- `user` can generate, regenerate, abort, download, and delete their own documentation variants.
- `viewer` is read-only for docs and project listings, but can still sign in and rotate their own password/API key.

> **Note:** In docsfy, the “password” a user types into the web app or stores in the CLI config is an API key. The built-in admin uses `ADMIN_KEY`, and regular users get per-user keys created by an admin.

Admins see every project. Non-admin users see only projects they own or projects that have been explicitly shared with them.

## One Repository, Many Variants

A core docsfy idea is that documentation is not just “for a repo.” It is for a specific combination of:

- repository name
- branch
- AI provider
- AI model
- owner

That is why the dashboard groups a repository into branches and then into provider/model variants, and why the API has both “latest variant” routes and fully variant-specific routes.

> **Warning:** Branch names cannot contain `/`. Use names like `release-1.x` instead of `release/1.x`.

## The Main Workflows

### Web App

The web app is the easiest way to use docsfy day to day.

- Sign in at `/login` with a username and password/API key.
- Browse your accessible repositories in a sidebar project tree grouped by repository, branch, and provider/model variant. If you are an admin, same-named repositories stay separated by owner.
- Start a new generation by entering a repository URL, branch, provider, model, and optional force flag.
- Watch progress live in the selected variant view as docsfy moves through cloning, planning, page generation, validation, cross-linking, and rendering.
- Open the finished documentation in the browser, download it, regenerate it with different model settings, abort a run, or delete a variant.
- If you are an admin, create users and manage access to shared projects without leaving the dashboard.

The frontend sends generation requests with this exact shape:

```typescript
await api.post('/api/generate', {
  repo_url: submittedRepoUrl,
  branch: submittedBranch,
  ai_provider: submittedProvider,
  ai_model: submittedModel,
  force: submittedForce,
})
```

A normal browser-based workflow looks like this:

1. Sign in.
2. Click `New Generation`.
3. Enter a Git URL such as `https://github.com/myk-org/for-testing-only`.
4. Pick a branch and model.
5. Wait for the variant to move to `ready`.
6. Open or download the generated docs.

The app uses WebSocket updates for real-time status and falls back to polling if the socket is unavailable.

> **Tip:** Leave `Force full regeneration` off for normal refreshes. That lets docsfy reuse cached pages and skip work when a variant is already current.

### HTTP API

If you want automation, CI integration, or your own frontend, the HTTP API exposes the same core operations as the web app.

- `POST /api/generate` starts a new generation and returns immediately with a `202` response.
- `GET /api/projects` and `GET /api/status` return the projects you can access, plus known models and known branches.
- `GET /api/projects/{name}` returns all accessible variants for a repository.
- `GET /api/projects/{name}/{branch}/{provider}/{model}` returns one specific variant.
- `POST /api/projects/{name}/abort` and `POST /api/projects/{name}/{branch}/{provider}/{model}/abort` stop active runs.
- `GET /api/projects/{name}/download` and the variant-specific download route return the generated site as `application/gzip`.
- `/api/admin/...` routes handle user and access management for admins.
- `/api/auth/...` routes handle login, logout, “who am I”, and key rotation.
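
As a concrete starting point, the sketch below drives those endpoints with Python’s `requests` library. It mirrors the payload shape shown in the web app section; the server URL, API key, polling interval, and the assumption that the variant response exposes `status` and `current_stage` as top-level fields are illustrative placeholders rather than guarantees.

```python
# Minimal automation sketch against the endpoints above; URL, key, poll
# interval, and the variant response fields are assumptions, not docsfy
# guarantees.
import time

import requests

SERVER = "http://localhost:8000"  # assumed local deployment
HEADERS = {"Authorization": "Bearer <your-api-key>"}

# Start a generation (same payload shape the web app sends).
resp = requests.post(
    f"{SERVER}/api/generate",
    headers=HEADERS,
    json={
        "repo_url": "https://github.com/myk-org/for-testing-only",
        "branch": "main",
        "ai_provider": "gemini",
        "ai_model": "gemini-2.5-flash",
        "force": False,
    },
    timeout=30,
)
resp.raise_for_status()  # the API answers 202 Accepted on success

# Poll the pinned variant until it leaves the generating state.
while True:
    variant = requests.get(
        f"{SERVER}/api/projects/for-testing-only/main/gemini/gemini-2.5-flash",
        headers=HEADERS,
        timeout=30,
    ).json()
    if variant.get("status") != "generating":
        break
    time.sleep(5)

print(variant.get("status"), variant.get("current_stage"))
```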

The app serves generated sites through two route shapes:

```python
@app.get("/docs/{project}/{branch}/{provider}/{model}/{path:path}")
@app.get("/docs/{project}/{path:path}")
```

That gives you two useful access patterns:

- a fully specific docs URL such as `/docs/for-testing-only/dev/gemini/gemini-2.5-flash/`
- a short `/docs/{project}/` URL that serves the most recently generated accessible variant

> **Note:** Browser access to `/docs/...` follows the same authentication rules as the rest of docsfy. If you want to publish the generated site somewhere else, use the download workflow and deploy the static files separately.

### Generated Sites

Every successful generation produces a static documentation site, not just an entry in the dashboard.

The generated site includes:

- a landing page and per-page HTML files
- a sidebar with grouped navigation
- in-page table of contents when a page has headings
- built-in search backed by `search-index.json`
- dark/light theme switching
- copy buttons and language labels on code blocks
- callout styling for blockquotes such as `> **Note:**`, `> **Warning:**`, and `> **Tip:**`
- previous/next navigation between pages
- detected version info in the footer when docsfy can find a version in project metadata or Git tags
- Mermaid diagrams pre-rendered to SVG when Mermaid CLI is available on the server
- AI-suggested `## Related Pages` sections when docsfy finds useful cross-page links
- `llms.txt` and `llms-full.txt` alongside the human-facing site

In practice, that means a finished docsfy build can be used in two different ways:

- as an authenticated site served directly by docsfy under `/docs/...`
- as a downloadable archive you unpack and host anywhere static HTML is accepted

Because docsfy writes a `.nojekyll` file into the output, the generated site is also friendly to GitHub Pages-style static hosting.

### CLI

The `docsfy` CLI is the terminal-first interface to the same server. It is useful for scripting, quick checks, and admin tasks.

The package exposes two entry points:

- `docsfy-server` starts the FastAPI application
- `docsfy` talks to a running server

A real CLI generation example from the repository’s test plans looks like this:

```shell
docsfy generate https://github.com/myk-org/for-testing-only --provider gemini --model gemini-2.5-flash --force
```

Common CLI workflows include:

- `docsfy config init` to save a server profile
- `docsfy health` to verify the server is reachable
- `docsfy list` to see all visible projects
- `docsfy status <project>` to inspect variants
- `docsfy download <project> ...` to pull down generated docs
- `docsfy abort <project> ...` to stop an active generation
- `docsfy admin users ...` and `docsfy admin access ...` for admin-only management

If you want live terminal feedback, `docsfy generate` also supports `--watch`, which listens to the same WebSocket progress stream the web app uses and prints stage changes such as `cloning`, `planning`, `generating_pages`, `validating`, `cross_linking`, and `rendering`.

The CLI config file is a small TOML profile store:

```toml
[default]
server = "dev"

[servers.dev]
url = "http://localhost:8000"
username = "admin"
password = "<your-dev-key>"
```

> **Note:** The CLI calls the credential field `password`, but it is the same API key/admin key used elsewhere in docsfy.

## Configuration At A Glance

A minimal server setup starts with environment variables. The repository ships this example:

```dotenv
# Required: Admin password (minimum 16 characters)
ADMIN_KEY=

# AI provider and model defaults
# (pydantic_settings reads these case-insensitively)
AI_PROVIDER=cursor
AI_MODEL=gpt-5.4-xhigh-fast
AI_CLI_TIMEOUT=60

# Logging
LOG_LEVEL=INFO

# Data directory for database and generated docs
DATA_DIR=/data

# Cookie security (set to false for local HTTP development)
SECURE_COOKIES=true
```

Those settings tell docsfy:

- who the built-in admin is
- which AI provider/model to use by default
- where to store the database and generated sites
- whether session cookies should be marked `Secure`

> **Warning:** `ADMIN_KEY` is required, and both admin and custom user keys must be at least 16 characters long.

The provided Compose setup keeps deployment simple: it reads `.env`, exposes port `8000`, and persists generated output under `./data` mapped to `/data`.

> **Note:** The provided Docker build installs the Claude, Cursor, and Gemini CLIs inside the container, plus Chromium and Mermaid CLI (`mmdc`) for diagram rendering. That makes the containerized setup the easiest way to start with all three supported providers and Mermaid diagrams in generated sites.

## Things To Know Before You Start

- Remote repository generation accepts standard HTTPS and SSH Git URLs.
- Local repository generation is supported through `repo_path`, but it is restricted to admins.
- The server defaults to the `cursor` provider and `gpt-5.4-xhigh-fast` model unless you override them.
- The “latest docs” route for a project serves the most recently generated accessible variant, not necessarily the only variant that exists.
- Admins can share access to a project owned by one user with other users or viewers without copying the generated files.

If you want the shortest path to value, start the server, create or obtain a key, generate one repository from the web app or CLI, and then decide whether your team prefers to work from the dashboard, the API, the downloaded static site, or all three.


## Related Pages

- [Architecture and Runtime](architecture-and-runtime.html)
- [First Run Quickstart](first-run-quickstart.html)
- [Projects, Variants, and Ownership](projects-variants-and-ownership.html)
- [Generating Documentation](generating-documentation.html)
- [CLI Workflows](cli-workflows.html)

---

Source: architecture-and-runtime.md

# Architecture and Runtime

`docsfy` runs as one web application with a clear split of responsibilities:

- The FastAPI backend is the hub. It handles auth, generation requests, project state, admin actions, downloads, and WebSocket updates.
- The React dashboard is the control plane. It lets you start runs, watch progress, browse variants, and manage users or access.
- SQLite stores runtime metadata such as projects, variants, users, sessions, and sharing rules.
- The filesystem stores generated artifacts such as cached Markdown, rendered HTML, search assets, and download bundles.
- External AI CLIs do the planning, page writing, validation, and cross-linking work.
- The static renderer turns generated Markdown into a browsable site.

That architecture keeps deployment simple: one service, one database file, one data directory, and one place to look when something is generating, ready, failed, or downloadable.

> **Note:** The generated docs are static HTML on disk, but by default they are still served by the same FastAPI app that powers the dashboard and API.

## The Big Picture

```mermaid
flowchart TD
    A[Browser dashboard or docsfy CLI] --> B[FastAPI backend<br/>API, WebSocket, docs, and SPA routes]
    B --> C[SQLite<br/>/data/docsfy.db]
    B --> D[Filesystem<br/>/data/projects/...]
    B --> E[External tools<br/>git + AI provider CLIs]
```

In practice, a generation request looks like this:

1. You submit a repo URL from the dashboard or run `docsfy generate`.
2. The backend validates auth, repo source, provider/model, and branch.
3. It clones the repo or inspects a local admin-only repo path.
4. It compares the current commit to the newest ready variant so it can choose between a full run, an incremental update, or an immediate `up_to_date` result.
5. It asks an AI CLI to create a documentation plan, or reuses the existing plan for an incremental run.
6. It generates Markdown pages, and incremental runs can patch only the affected sections of a page instead of rewriting the whole file.
7. It validates generated pages against the repository and can regenerate pages that still contain stale references.
8. It adds a `## Related Pages` section based on AI-suggested cross-links between pages.
9. If possible, it detects the repository version, renders the static site to disk, and writes search plus LLM export assets.
10. It updates SQLite, notifies connected clients over WebSocket, and serves the finished site under `/docs/...`.

A specific docs variant is served from a URL shaped like:

`/docs/{project}/{branch}/{provider}/{model}/...`

There is also a shortcut route that serves the latest ready variant for a project:

`/docs/{project}/...`

## FastAPI as the Hub

The FastAPI app is where everything comes together. It includes the auth, admin, project, and WebSocket routers, and it also serves both the React frontend and the rendered docs.

The backend does four important things at runtime:

- Authenticates API clients with Bearer tokens and browser users with a session cookie.
- Starts generation in background tasks so the request can return immediately.
- Streams progress to the UI and CLI over `/api/ws`.
- Serves ready docs directly from the filesystem.

Generated docs are treated as authenticated content, not as a separate anonymous static host.

> **Note:** Unauthenticated browser requests to `/docs/...` are redirected to `/login`. Unauthenticated API requests receive `401 Unauthorized`.

Browser sessions are cookie-based, while the CLI uses an API key as a Bearer token. The same auth layer protects `/api/*` and `/docs/*`, which means the dashboard, downloads, and docs views all respect the same user and access rules.
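
A quick way to confirm that shared auth layer is the same identity route the dashboard uses. A minimal sketch with placeholder credentials, assuming the route answers Bearer clients the same way it answers browser sessions:

```python
# Sketch: probing the identity route the dashboard calls; the URL and
# key are placeholders, and Bearer acceptance here is an assumption
# based on the shared auth layer described above.
import requests

resp = requests.get(
    "http://localhost:8000/api/auth/me",
    headers={"Authorization": "Bearer <your-api-key>"},
    timeout=30,
)
print(resp.status_code)  # expect 200 with identity JSON, or 401 when unauthenticated
```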

## React Dashboard

The React app is a Vite-built SPA that becomes the main operator interface in the browser. In production, FastAPI serves the built frontend. In development, Vite runs on port `5173` and proxies API, docs, and health requests back to FastAPI.

From `frontend/vite.config.ts`:

```typescript
server: {
  host: '0.0.0.0',
  port: 5173,
  proxy: {
    '/api': {
      target: API_TARGET,
      changeOrigin: true,
      ws: true,
    },
    '/docs': {
      target: API_TARGET,
      changeOrigin: true,
    },
    '/health': {
      target: API_TARGET,
      changeOrigin: true,
    },
  },
},
```

The dashboard does not render documentation itself. Instead, it works as a control plane:

- It calls `/api/auth/me` to check the current session.
- It loads `/api/projects` to populate a hierarchical tree of repository, branch, and provider/model variants. For admins, the tree is grouped by `owner/name` so similarly named repos from different owners stay separate.
- It opens `/api/ws` for live progress and status updates, then combines `current_stage`, `page_count`, and `plan_json` to show a per-page activity log while a run is in progress.
- It opens ready docs in a new tab at `/docs/...`.
- It downloads archives from `/api/projects/.../download`.

The WebSocket connection is the first choice, but the dashboard can fall back to polling if reconnects fail. From `frontend/src/lib/websocket.ts`:

```typescript
private attemptReconnect(): void {
  if (this.reconnectAttempts >= this.maxReconnectAttempts) {
    console.debug('[WS] Falling back to polling')
    this.startPolling()
    return
  }
  const delay = this.getBackoffDelay()
  this.reconnectAttempts++
  console.debug('[WS] Reconnecting, attempt', this.reconnectAttempts)
  this.reconnectTimer = setTimeout(() => this.connect(true), delay)
}

private startPolling(): void {
  if (this.pollingTimer) return
  this.pollingTimer = setInterval(async () => {
    try {
      const data = await api.get<ProjectsResponse>('/api/projects')
      const syncMessage: WebSocketMessage = {
        type: 'sync' as const,
        projects: data.projects,
        known_models: data.known_models,
        known_branches: data.known_branches,
      }
      this.handlers.forEach(handler => handler(syncMessage))
    } catch {
      /* ignore polling errors */
    }
  }, WS_POLLING_FALLBACK_MS)
}
```

That fallback matters in real deployments: if a reverse proxy, browser, or network drops the socket, the dashboard still keeps the project list fresh.

## SQLite Storage and Variant Layout

`docsfy` stores runtime metadata in SQLite and generated content on disk.

The most important table is `projects`, which treats a generated output as a variant of a repository. A variant is keyed by:

- project name
- branch
- AI provider
- AI model
- owner

From `src/docsfy/storage.py`:

```python
await db.execute(f"""
    CREATE TABLE IF NOT EXISTS projects (
        name TEXT NOT NULL,
        branch TEXT NOT NULL DEFAULT '{_SQL_DEFAULT_BRANCH}',
        ai_provider TEXT NOT NULL DEFAULT '',
        ai_model TEXT NOT NULL DEFAULT '',
        owner TEXT NOT NULL DEFAULT '',
        repo_url TEXT NOT NULL,
        status TEXT NOT NULL DEFAULT 'generating',
        current_stage TEXT,
        last_commit_sha TEXT,
        last_generated TEXT,
        page_count INTEGER DEFAULT 0,
        error_message TEXT,
        plan_json TEXT,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        PRIMARY KEY (name, branch, ai_provider, ai_model, owner)
    )
""")
```

This is why the same repository can have multiple outputs side by side instead of overwriting each other.
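
Because the schema is plain SQLite, you can see those side-by-side variants directly. A read-only sketch, assuming local access to the database file at its default location:

```python
# Sketch: listing every variant row for one repository straight from
# the SQLite file; assumes local access to /data/docsfy.db.
import sqlite3

conn = sqlite3.connect("/data/docsfy.db")
rows = conn.execute(
    """
    SELECT branch, ai_provider, ai_model, owner, status, last_generated
    FROM projects
    WHERE name = ?
    ORDER BY last_generated DESC
    """,
    ("for-testing-only",),
).fetchall()
for row in rows:
    print(row)
conn.close()
```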

A typical runtime layout under `DATA_DIR=/data` looks like this:

```text
/data/
  docsfy.db
  projects/
    <owner>/
      <repo-name>/
        <branch>/
          <provider>/
            <model>/
              plan.json
              cache/
                pages/
                  *.md
              site/
                .nojekyll
                index.html
                *.html
                *.md
                assets/
                search-index.json
                llms.txt
                llms-full.txt
```

A few details matter here:

- SQLite stores the project state, not the site content itself.
- The cached page Markdown lives separately from the final rendered site.
- `plan_json` is stored in the database while a run is still in progress, which lets the dashboard show planned page counts before rendering is finished.
- Sessions are stored in SQLite too, so browser login state survives normal request boundaries.

> **Tip:** Because branch, provider, and model are part of the variant key, you can keep `main` and `release-1.x` docs, or `claude` and `cursor` output, side by side for the same repository.

> **Note:** Branch names cannot contain `/`. `docsfy` uses the branch as a single URL segment and as part of the on-disk folder structure.

## AI CLI Integration and Generation Flow

`docsfy` does not talk directly to hosted model APIs inside its own code. Instead, it delegates planning and page writing to external AI CLIs through `ai-cli-runner`. That keeps the provider-specific invocation logic in one layer and lets the backend focus on orchestration.

From `src/docsfy/generator.py`:

```python
async def _call_ai_or_raise(
    prompt: str,
    repo_path: Path,
    ai_provider: str,
    ai_model: str,
    ai_cli_timeout: int | None = None,
) -> str:
    cli_flags = ["--trust"] if ai_provider == "cursor" else None
    success, output = await call_ai_cli(
        prompt=prompt,
        cwd=repo_path,
        ai_provider=ai_provider,
        ai_model=ai_model,
        ai_cli_timeout=ai_cli_timeout,
        cli_flags=cli_flags,
    )
    if not success:
        raise RuntimeError(output)
    return output
```

That flow is wrapped in a generation pipeline with clear stages. The backend writes these stage values into project state, and the dashboard uses the same names to render a live activity log.

From `frontend/src/lib/constants.ts`:

```typescript
export const GENERATION_STAGES = [
  'cloning',
  'planning',
  'incremental_planning',
  'generating_pages',
  'validating',
  'cross_linking',
  'rendering',
] as const
```

That lets the UI distinguish between page generation, stale-reference validation, and cross-link insertion before rendering begins. A ready variant can also keep `current_stage = up_to_date` when `docsfy` determines that nothing meaningful changed.

For remote repositories, cloning is intentionally shallow and branch-aware. From `src/docsfy/repository.py`:

```python
clone_cmd = ["git", "clone", "--depth", "1"]
if branch:
    clone_cmd += ["--branch", branch]
clone_cmd += ["--", repo_url, str(repo_path)]
```

That small detail has a big runtime effect:

- New runs start quickly because only the latest commit is cloned at first.
- If `docsfy` needs a diff against an earlier commit, it deepens the clone just enough to fetch the old commit.
- If the latest commit SHA matches an existing ready variant, `docsfy` can mark the new request as up to date instead of regenerating everything.
- If only part of the repo changed, the incremental planner can choose which pages to regenerate and keep the rest from cache.
- If the newest ready output was built with a different provider or model, `docsfy` can reuse that variant’s artifacts and then replace it only after the new variant is ready.

Page generation is also parallelized, and incremental runs can update only the affected sections of a page instead of forcing a full rewrite. For pages that need regeneration, `docsfy` can ask the AI for a JSON patch-like set of targeted text replacements and apply those edits to the existing markdown before falling back to a full rewrite. Even so, the generator caps page work at five concurrent pages, balancing generation speed against the load placed on the provider CLIs.
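
That concurrency cap is a classic bounded-parallelism pattern. A simplified sketch of the idea, not docsfy’s actual generator code, with `generate_page` standing in for the real per-page AI call:

```python
# Simplified sketch of bounded page concurrency; generate_page is a
# hypothetical stand-in for the real per-page AI invocation.
import asyncio

MAX_CONCURRENT_PAGES = 5  # matches the cap described above


async def generate_page(slug: str) -> str:
    await asyncio.sleep(0.1)  # placeholder for a provider CLI call
    return f"# {slug}\n"


async def generate_all(slugs: list[str]) -> dict[str, str]:
    semaphore = asyncio.Semaphore(MAX_CONCURRENT_PAGES)

    async def bounded(slug: str) -> tuple[str, str]:
        async with semaphore:  # at most five pages in flight at once
            return slug, await generate_page(slug)

    results = await asyncio.gather(*(bounded(s) for s in slugs))
    return dict(results)


pages = asyncio.run(generate_all(["introduction", "architecture", "api"]))
```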

> **Tip:** Switching provider or model after a successful run can be much faster than starting over from scratch. `docsfy` will try to reuse previous artifacts when the commit history and variant state allow it.

> **Warning:** Local `repo_path` generation is admin-only, and remote repository URLs that target localhost or private network ranges are rejected. That keeps a documentation generator from turning into an internal-network fetch tool.

## Static Site Renderer

Once Markdown pages are ready, `docsfy` renders a complete static site with Jinja templates and Python-Markdown.

The renderer does more than a straight Markdown-to-HTML conversion:

- It pre-renders top-level Mermaid diagrams to inline SVG when `mmdc` is available.
- It cleans up unusual code-fence annotations that AI output may produce.
- It inserts missing blank lines so Markdown structures render correctly.
- It sanitizes dangerous HTML before writing the final page.
- It filters path-unsafe slugs out of the output and trims navigation to only the pages that were actually rendered.
- It renders `index.html` plus one HTML page per slug.
- It also writes `.md` copies of pages alongside the HTML.
- It builds `search-index.json`, `llms.txt`, and `llms-full.txt`.
- It can include a detected project version in the generated footer, using `pyproject.toml`, `package.json`, `Cargo.toml`, `setup.cfg`, or the latest Git tag when available.
- It copies client-side assets for search, theme switching, copy buttons, scrollspy, callout styling, and GitHub metadata.

> **Note:** Mermaid pre-rendering is opportunistic. If `mmdc` is unavailable or a diagram fails to render, `docsfy` keeps the original fenced block instead of failing the whole site render.
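
That fallback behavior can be pictured as a subprocess call wrapped in guards. An illustrative reimplementation, not the renderer’s actual code; `mmdc -i <in> -o <out>` is the standard Mermaid CLI invocation:

```python
# Illustrative sketch of opportunistic Mermaid pre-rendering; this is
# not docsfy's renderer code. mmdc -i/-o is standard Mermaid CLI usage.
import shutil
import subprocess
import tempfile
from pathlib import Path


def render_mermaid_or_keep(fence_body: str) -> str:
    if shutil.which("mmdc") is None:
        return f"```mermaid\n{fence_body}\n```"  # keep the original fence

    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "diagram.mmd"
        out = Path(tmp) / "diagram.svg"
        src.write_text(fence_body, encoding="utf-8")
        result = subprocess.run(
            ["mmdc", "-i", str(src), "-o", str(out)],
            capture_output=True,
            timeout=60,
        )
        if result.returncode != 0 or not out.exists():
            return f"```mermaid\n{fence_body}\n```"  # fall back, don't fail the render
        return out.read_text(encoding="utf-8")  # inline SVG
```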

From `src/docsfy/renderer.py`:

```python
search_index = _build_search_index(valid_pages, plan)
(output_dir / "search-index.json").write_text(
    json.dumps(search_index), encoding="utf-8"
)

llms_txt = _build_llms_txt(plan, navigation=filtered_navigation)
(output_dir / "llms.txt").write_text(llms_txt, encoding="utf-8")

llms_full_txt = _build_llms_full_txt(
    plan, valid_pages, navigation=filtered_navigation
)
(output_dir / "llms-full.txt").write_text(llms_full_txt, encoding="utf-8")
```

The final site is still a static site in the classic sense: HTML files, CSS, JS, and JSON assets. The difference is that `docsfy` renders and serves that site for you.

A ready site includes user-friendly features out of the box:

- sidebar navigation
- search modal powered by `search-index.json`
- dark/light theme toggle
- on-page table of contents
- previous/next page navigation
- copy buttons for code blocks
- `llms.txt` and `llms-full.txt` exports
- `.nojekyll` for GitHub Pages compatibility

> **Warning:** AI-generated content is rendered with `|safe` in the final page template, but only after `docsfy` sanitizes scripts, `iframe`/`object`/`embed`/`form` tags, event-handler attributes, and unsafe `href`/`src` schemes.

## Runtime Configuration

Server runtime is controlled by environment variables. The defaults are practical, but `ADMIN_KEY` is mandatory.

From `.env.example`:

```bash
# Required: Admin password (minimum 16 characters)
ADMIN_KEY=

# AI provider and model defaults
# (pydantic_settings reads these case-insensitively)
AI_PROVIDER=cursor
AI_MODEL=gpt-5.4-xhigh-fast
AI_CLI_TIMEOUT=60

# Logging
LOG_LEVEL=INFO

# Data directory for database and generated docs
DATA_DIR=/data

# Cookie security (set to false for local HTTP development)
SECURE_COOKIES=true

# Development mode: starts Vite dev server on port 5173 alongside FastAPI
# DEV_MODE=true
```

The most important runtime settings are:

- `ADMIN_KEY`: required at startup, minimum 16 characters
- `AI_PROVIDER` and `AI_MODEL`: defaults for new generations
- `AI_CLI_TIMEOUT`: timeout passed to provider CLI invocations
- `DATA_DIR`: where SQLite and generated docs are stored
- `SECURE_COOKIES`: should stay `true` outside local HTTP development
- `DEV_MODE`: enables Vite plus FastAPI reload in one container/process launch

> **Warning:** `ADMIN_KEY` is more than the admin login password. `docsfy` also uses it as the HMAC secret when hashing stored API keys. Rotating it invalidates existing user API-key hashes.
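
The mechanics behind that warning look roughly like the sketch below. The exact digest, encoding, and salting docsfy uses are not documented here, so HMAC-SHA256 is an assumption for illustration only:

```python
# Sketch of why rotating ADMIN_KEY invalidates stored API-key hashes.
# HMAC-SHA256 and hex encoding are assumptions; docsfy's exact scheme
# is not shown in this document.
import hashlib
import hmac


def hash_api_key(admin_key: str, api_key: str) -> str:
    return hmac.new(
        admin_key.encode(), api_key.encode(), hashlib.sha256
    ).hexdigest()


old = hash_api_key("old-admin-key-16-chars!", "user-api-key")
new = hash_api_key("new-admin-key-16-chars!", "user-api-key")
assert old != new  # the same user key no longer matches the stored hash
```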

The CLI has its own separate runtime config in `~/.config/docsfy/config.toml`. From `config.toml.example`:

```toml
# Default server to use when --server is not specified
[default]
server = "dev"

# Server profiles -- add as many as you need
[servers.dev]
url = "http://localhost:8000"
username = "admin"
password = "<your-dev-key>"

[servers.prod]
url = "https://docsfy.example.com"
username = "admin"
password = "<your-prod-key>"

[servers.staging]
url = "https://staging.docsfy.example.com"
username = "deployer"
password = "<your-staging-key>"
```

That file powers the `docsfy` CLI. It is the client-side equivalent of a browser session: it tells the CLI where the server lives and which credentials to send.

> **Tip:** The CLI config is written with owner-only permissions, which is important because it stores real credentials.
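
The owner-only pattern itself is simple. A general-purpose sketch, not docsfy’s actual CLI code:

```python
# General pattern for writing a credentials file readable only by its
# owner; illustrative, not docsfy's CLI implementation.
import os
from pathlib import Path

config_path = Path.home() / ".config" / "docsfy" / "config.toml"
config_path.parent.mkdir(parents=True, exist_ok=True)

config_path.write_text('[default]\nserver = "dev"\n', encoding="utf-8")
os.chmod(config_path, 0o600)  # rw for the owner, nothing for anyone else
```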

## Development and Deployment

The deployment model is intentionally straightforward:

- the frontend is built once with Vite
- Python dependencies are installed into a virtual environment
- provider CLIs are installed into the runtime image
- the built frontend is copied into the same image as FastAPI
- generated state is stored in `/data`

A minimal local deployment is already described by `docker-compose.yaml`:

```yaml
services:
  docsfy:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "8000:8000"
    volumes:
      - ./data:/data
    env_file:
      - .env
    environment:
      - ADMIN_KEY=${ADMIN_KEY}
    restart: unless-stopped
```

In normal mode, the container serves FastAPI on port `8000`. The runtime image also bundles Chromium plus `mermaid-cli`, and the Docker build smoke-tests `mmdc` so Mermaid diagrams can render during site generation without an extra service. In `DEV_MODE=true`, the entrypoint also starts the Vite dev server on `5173` and runs FastAPI with reload, which gives you a faster edit-refresh loop for frontend work.

This architecture makes production easy to reason about:

- one service exposes the dashboard and docs
- one mounted data directory preserves state
- one authentication system protects both API and docs
- one backend orchestrates all provider CLIs

## Testing and Automation

The runtime design is backed by both backend and frontend tests.

The repository includes:

- backend tests under `tests/`, covering auth, generation, storage, rendering, repository cloning and diffing, WebSocket auth and heartbeats, and end-to-end mocked generation flows
- frontend tests with Vitest and JSDOM
- `tox.toml` for running backend tests through `uv`
- `.pre-commit-config.yaml` for linting, formatting, typing, and secret scanning
- human-readable end-to-end plans in `test-plans/`

Practical commands are:

- `uv run pytest -v --tb=short`
- `tox`
- `cd frontend && npm test`

The pre-commit setup also adds quality and security checks such as:

- `flake8`
- `ruff`
- `ruff-format`
- `mypy`
- `detect-secrets`
- `gitleaks`

> **Note:** This checkout does not include any committed `.github/workflows` files, so the repository’s automation lives in local tooling (`tox.toml`, the `frontend` test config, and `.pre-commit-config.yaml`) rather than in an in-repo GitHub Actions pipeline.

If you keep one mental model in mind, use this one: FastAPI is the orchestrator, SQLite is the control database, the filesystem holds the generated site, and the AI CLI plus renderer pipeline turns a Git repository into a versioned, shareable documentation variant.


## Related Pages

- [Introduction](introduction.html)
- [Data Storage and Layout](data-storage-and-layout.html)
- [Generated Output](generated-output.html)
- [Deployment and Runtime](deployment-and-runtime.html)
- [WebSocket Protocol](websocket-protocol.html)

---

Source: projects-variants-and-ownership.md

# Projects, Variants, and Ownership

Docsfy stores generated documentation as **variants**, not as one flat project record. The same repository can exist under different owners, on different branches, and with different AI provider/model combinations at the same time.

At the storage level, a variant is uniquely identified by five fields: `name`, `branch`, `ai_provider`, `ai_model`, and `owner`.

```63:79:src/docsfy/storage.py
CREATE TABLE IF NOT EXISTS projects (
    name TEXT NOT NULL,
    branch TEXT NOT NULL DEFAULT 'main',
    ai_provider TEXT NOT NULL DEFAULT '',
    ai_model TEXT NOT NULL DEFAULT '',
    owner TEXT NOT NULL DEFAULT '',
    repo_url TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'generating',
    current_stage TEXT,
    last_commit_sha TEXT,
    last_generated TEXT,
    page_count INTEGER DEFAULT 0,
    error_message TEXT,
    plan_json TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (name, branch, ai_provider, ai_model, owner)
)
```

## The Core Identity

When you look at a docsfy project, these are the fields that matter most:

- `name`: the repository name, such as `for-testing-only` or `my-repo`
- `owner`: the user who owns that generated docs set
- `branch`: the Git branch the docs were generated from
- `ai_provider`: the AI provider used for generation
- `ai_model`: the specific model used under that provider

That means all of these are different variants:

- `for-testing-only / main / gemini / gemini-2.5-flash / admin`
- `for-testing-only / dev / gemini / gemini-2.5-flash / admin`
- `for-testing-only / main / gemini / gemini-2.5-flash / testuser-e2e`

The project `name` usually comes from the repository itself, not from a separate title field. For example, `https://github.com/org/my-repo.git` becomes `my-repo`. For local repositories, docsfy uses the directory name.
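
That naming rule is small enough to sketch directly; this is an illustration of the rule as described, not docsfy’s actual parsing code:

```python
# Sketch of the naming rule described above; not docsfy's parser.
def project_name_from_url(repo_url: str) -> str:
    tail = repo_url.rstrip("/").rsplit("/", 1)[-1]
    return tail.removesuffix(".git")


assert project_name_from_url("https://github.com/org/my-repo.git") == "my-repo"
assert project_name_from_url("git@github.com:org/my-repo.git") == "my-repo"
```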

## Owners And Shared Access

Every generated docs set belongs to an `owner`. In normal use, that owner is the authenticated username that started the generation. If you use the root `ADMIN_KEY`, the owner recorded for those generations is `admin`.

Ownership is what keeps two users from overwriting each other when they generate docs for the same repository. For admins, `alice/my-repo` and `bob/my-repo` are separate project groups, even if branch, provider, and model all match.

Sharing is separate from ownership:

- Sharing lets another user see a project.
- Sharing does not transfer ownership.
- Sharing does not create a duplicate copy.

Docsfy stores access grants at the `project_name + project_owner` level:

```258:262:src/docsfy/storage.py
CREATE TABLE IF NOT EXISTS project_access (
    project_name TEXT NOT NULL,
    project_owner TEXT NOT NULL DEFAULT '',
    username TEXT NOT NULL,
    PRIMARY KEY (project_name, project_owner, username)
)
```

Because of that design, an access grant applies to **all variants** for that owner’s project. If an admin grants you access to Alice’s `for-testing-only`, you can see Alice’s `main`, `dev`, `claude`, `gemini`, and other variants for that project. You do not automatically get access to Bob’s `for-testing-only`.

> **Note:** Owner is not part of the docs URL path. When multiple owners have the same repo and the same variant coordinates, admin APIs may need `?owner=<username>` to disambiguate the target.

Ownership and role are also different concepts:

- `admin` can see all owners and manage sharing
- `user` can generate, delete, and manage their own variants
- `viewer` can read accessible variants but cannot start or delete generations

## Branches

Branches are first-class variant fields. `main` and `dev` are different variants, even if everything else matches.

Docsfy is deliberately strict about branch names because the branch appears in URLs and directory paths:

```18:48:src/docsfy/models.py
class GenerateRequest(BaseModel):
    # ... other request fields omitted ...

    branch: str = Field(
        default=DEFAULT_BRANCH, description="Git branch to generate docs from"
    )

    @field_validator("branch")
    @classmethod
    def validate_branch(cls, v: str) -> str:
        if "/" in v:
            msg = (
                f"Invalid branch name: '{v}'. Branch names cannot contain slashes "
                "— use hyphens instead (e.g., release-1.x)."
            )
            raise ValueError(msg)
        if not re.match(r"^[a-zA-Z0-9][a-zA-Z0-9._-]*$", v):
            msg = f"Invalid branch name: '{v}'"
            raise ValueError(msg)
        if ".." in v:
            msg = f"Invalid branch name: '{v}'"
            raise ValueError(msg)
        return v
```

In practice:

- If you omit `branch`, docsfy uses `main`
- Good branch names include `main`, `dev`, `release-1.x`, and `v2.0.1`
- Rejected branch names include `release/v2.0`, `.hidden`, `../etc/passwd`, and `release..candidate`

> **Warning:** Branch names cannot contain `/`, and docsfy also rejects `..` anywhere in the name. If your Git workflow normally uses slash-based branch names, use a hyphenated version such as `release-1.x` when generating docs with docsfy.

If you want a stable link to a specific branch, use the fully qualified variant URL, such as `/docs/for-testing-only/dev/gemini/gemini-2.5-flash/`.
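
Those branch rules are easy to express as a standalone check. A sketch that restates the validator’s logic outside pydantic, using the same regex shown above:

```python
# Standalone sketch of the branch rules quoted above; docsfy enforces
# them through a pydantic field validator instead.
import re

_BRANCH_RE = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9._-]*$")


def is_valid_branch(name: str) -> bool:
    return "/" not in name and ".." not in name and bool(_BRANCH_RE.match(name))


assert is_valid_branch("release-1.x")
assert not is_valid_branch("release/1.x")
assert not is_valid_branch(".hidden")
assert not is_valid_branch("release..candidate")
```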

## Providers And Models

Docsfy validates the provider and stores the model string as part of the variant. The built-in provider list is:

- `claude`
- `gemini`
- `cursor`

If you do not specify a provider or model, the server falls back to its configured defaults:

```9:21:src/docsfy/config.py
class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        extra="ignore",
    )

    admin_key: str = ""  # Required — validated at startup
    ai_provider: str = "cursor"
    ai_model: str = "gpt-5.4-xhigh-fast"
    ai_cli_timeout: int = Field(default=60, gt=0)
    log_level: str = "INFO"
    data_dir: str = "/data"
```

For most installations, that means:

- Default provider: `cursor`
- Default model: `gpt-5.4-xhigh-fast`
- Default data directory: `/data`

These defaults can be overridden with environment variables before the server starts.

Two practical details are easy to miss:

- Provider choices are fixed, but model names are stored as strings, so the exact model label matters.
- The UI suggestion lists for models and branches come from **successful `ready` variants only**.

> **Note:** A branch or model does not show up in the UI’s suggestion lists just because you typed it once. It appears after a successful generation has completed.

## Statuses

Each variant has a status that tells you whether it is usable:

| Status | Meaning |
| --- | --- |
| `generating` | docsfy accepted the job and is currently working on it |
| `ready` | the docs site exists and can be viewed or downloaded |
| `error` | generation failed |
| `aborted` | generation was canceled before completion |

A few behaviors are worth knowing:

- A `generating` variant is not eligible for latest-variant resolution yet.
- If the server restarts while a variant is still generating, docsfy resets that orphaned row to `error`.
- A `ready` variant can still carry metadata such as `last_commit_sha`, `page_count`, `last_generated`, and `plan_json`.

> **Note:** While a run is active, `current_stage` can move through values like `cloning`, `incremental_planning`, `planning`, `generating_pages`, `validating`, `cross_linking`, and `rendering`. The `validating` and `cross_linking` stages come from the post-generation pipeline that runs after page writing and before the final render. A ready variant can also keep `current_stage = up_to_date` when docsfy determines nothing meaningful changed.

## What “Latest” Means

When you use a generic project route such as `/docs/<project>/` or `/api/projects/<project>/download`, docsfy does not assume `main`, and it does not prefer one model name over another. It resolves the newest **ready** variant the current caller can actually access, using `last_generated`.

```mermaid
flowchart TD
  A[Generic project route] --> B{Admin?}
  B -->|Yes| C[Newest ready variant across all owners]
  B -->|No| D[Collect own newest ready variant]
  C --> G[Serve or download that variant]
  D --> E[Collect newest ready variant from each shared owner]
  E --> F{Single newest timestamp?}
  F -->|Yes| G
  F -->|No, tie| H[Return 409 instead of guessing]
```

For non-admin users, the helper that resolves “latest” makes that owner-aware behavior explicit:

```1078:1123:src/docsfy/api/projects.py
async def _resolve_latest_accessible_variant(
    username: str, name: str
) -> dict[str, Any] | None:
    candidates: list[dict[str, Any]] = []

    owned = await get_latest_variant(name, owner=username)
    if owned:
        candidates.append(owned)

    accessible = await get_user_accessible_projects(username)
    for proj_name, proj_owner in accessible:
        if proj_name == name and proj_owner:
            variant = await get_latest_variant(name, owner=proj_owner)
            if variant:
                candidates.append(variant)

    def _sort_key(v: dict[str, Any]) -> str:
        return str(v.get("last_generated") or "")

    if not candidates:  # without this guard, candidates[0] would raise on an empty list
        return None

    candidates.sort(key=_sort_key, reverse=True)
    newest = _sort_key(candidates[0])
    tied = [c for c in candidates if _sort_key(c) == newest]
    if len(tied) > 1:
        raise HTTPException(
            status_code=409,
            detail="Multiple owners have variants with the same timestamp, please specify owner",
        )
    return candidates[0]
```

Admins still use plain `get_latest_variant()` across all owners; the owner-aware merge above applies to non-admin callers.

That leads to a few important rules:

- Only `ready` variants are candidates.
- Latest resolution is based on `last_generated`, not on branch name, Git commit age, or model name.
- Generic project routes can point to any branch. If `dev` was generated after `main`, the generic route can serve `dev`.
- Admins can resolve latest across all owners.
- Non-admin users resolve latest across their own variants plus any owner-scoped project grants they can access.
- For non-admin callers, if two accessible owners tie on the newest timestamp, the generic route fails with `409` instead of guessing.
- Variant-specific routes do not use latest resolution. They stay pinned to the exact `branch / provider / model` you requested.
- If no ready accessible variant exists yet, the generic docs/download route fails instead of guessing.

This is the split to keep in mind:

- Moving target: `/docs/<project>/`
- Pinned variant: `/docs/<project>/<branch>/<provider>/<model>/`

> **Warning:** `/docs/<project>/` and `/api/projects/<project>/download` are moving targets. Use fully qualified variant URLs when you need stable links, reproducible downloads, or branch-specific documentation. In multi-owner admin workflows, add `?owner=<username>` to the exact-variant route when you need one specific owner.

## Cross-Model And Cross-Provider Replacement

The idea of “latest” also shows up during non-force regenerations on the **same owner and branch**.

If you regenerate a project with a different provider or model and `force` is `false`, docsfy can use the newest ready variant on that same branch as the base variant:

```728:756:src/docsfy/api/projects.py
# Check if a cross-provider variant is newer
latest_any = await get_latest_variant(project_name, owner=owner, branch=branch)
if ready_current_variant and latest_any:
    current_gen = str(ready_current_variant.get("last_generated") or "")
    latest_gen = str(latest_any.get("last_generated") or "")
    if latest_gen > current_gen and (
        latest_any.get("ai_provider") != ai_provider
        or latest_any.get("ai_model") != ai_model
    ):
        base_variant = latest_any
# ...
if replaces_base_variant:
    logger.info(
        f"[{project_name}] Cross-provider update: reusing {base_provider}/{base_model} "
        f"content for {ai_provider}/{ai_model} generation"
    )
```

In practice, that means:

- On the same commit, docsfy can reuse existing artifacts directly.
- On a newer commit, docsfy can reuse unchanged cached pages and regenerate only what changed.
- After a successful non-force switch, the old base variant is treated as replaced and removed.
- If you set `force=true`, docsfy performs a full regeneration and keeps the existing variant instead of replacing it.

This behavior is especially important if you compare model outputs on the same branch.

> **Tip:** If you want two model outputs to coexist side by side, use `force=true` for the new run. Without `force`, switching models on the same branch may replace the previous ready variant once the new one finishes.

## On-Disk Layout

For self-hosted deployments, the storage layout mirrors the data model closely. Variants are stored under `owner / name / branch / provider / model` inside the configured data directory:

```525:560:src/docsfy/storage.py
def get_project_dir(
    name: str,
    ai_provider: str = "",
    ai_model: str = "",
    owner: str = "",
    branch: str = DEFAULT_BRANCH,
) -> Path:
    # ...
    return (
        PROJECTS_DIR
        / safe_owner
        / _validate_name(name)
        / branch
        / ai_provider
        / ai_model
    )
```

With the default `data_dir` of `/data`, a variant ends up under a path like:

- `/data/projects/admin/for-testing-only/main/gemini/gemini-2.5-flash/`

Within that variant directory, docsfy keeps the rendered site and the cached page artifacts separately:

- `site/` contains the generated HTML site
- `cache/pages/` contains cached page markdown used during regeneration

## Choosing The Right URL

Use the generic project URL when you want “whatever is newest, ready, and accessible to the current caller”:

- `/docs/<project>/`
- `/api/projects/<project>/download`

Use the variant-specific URL when you want something pinned and reproducible:

- `/docs/<project>/<branch>/<provider>/<model>/`
- `/api/projects/<project>/<branch>/<provider>/<model>`
- `/api/projects/<project>/<branch>/<provider>/<model>/download`

> **Tip:** In multi-owner admin workflows, add `?owner=<username>` to the exact-variant docs or download route when you need one specific owner.

If you remember just one rule, make it this one: **generic project URLs follow the latest ready accessible variant, while variant-specific URLs stay pinned to the exact branch, provider, and model you asked for.**


## Related Pages

- [Variants, Branches, and Regeneration](variants-branches-and-regeneration.html)
- [User and Access Management](user-and-access-management.html)
- [Data Storage and Layout](data-storage-and-layout.html)
- [Viewing, Downloading, and Hosting Docs](viewing-downloading-and-hosting-docs.html)
- [Projects API](projects-api.html)

---

Source: generated-output.md

# Generated Output

Every successful `docsfy` generation produces one self-contained static documentation bundle for one variant: project, branch, AI provider, and AI model. That bundle is what `docsfy` serves under `/docs/...`, what the download endpoints return as a `.tar.gz`, and what you can publish on any static host.

The bundle is built for two audiences at once: people browsing HTML pages and tools that prefer plain text. That is why each generation includes rendered HTML, matching Markdown copies, built-in search data, and the `llms.txt` / `llms-full.txt` artifacts.

> **Note:** The public output is the variant’s `site/` directory. Internal files such as `plan.json` and `cache/pages/*.md` live beside it to support regeneration, but they are not part of the published site.

## What Gets Published

| Artifact | What it contains | Why it exists |
| --- | --- | --- |
| `index.html` | The docs homepage | Landing page, grouped navigation, first-page jump-off |
| `<slug>.html` | One rendered page per generated doc page | Normal browser reading and sharing |
| `<slug>.md` | A Markdown copy of each page | Reuse, diffing, text-first workflows, automation |
| `search-index.json` | Search data | Client-side search in the published site |
| `llms.txt` | A compact Markdown index | Lightweight map of the documentation |
| `llms-full.txt` | All page Markdown in one file | Single-file ingestion for LLMs or other text pipelines |
| `assets/` | Static CSS and JavaScript | Theme, search, callouts, copy buttons, TOC highlighting, sidebar behavior |
| `.nojekyll` | Empty marker file | Prevents Jekyll processing on GitHub Pages-style hosts |

The renderer writes the public bundle directly:

```599:688:src/docsfy/renderer.py
def render_site(plan: dict[str, Any], pages: dict[str, str], output_dir: Path) -> None:
    if output_dir.exists():
        shutil.rmtree(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    assets_dir = output_dir / "assets"
    assets_dir.mkdir(exist_ok=True)

    # Prevent GitHub Pages from running Jekyll
    (output_dir / ".nojekyll").touch()

    # ... copy files from src/docsfy/static/ into assets/ ...

    (output_dir / "index.html").write_text(index_html, encoding="utf-8")
    (output_dir / f"{slug}.html").write_text(page_html, encoding="utf-8")
    (output_dir / f"{slug}.md").write_text(md_content, encoding="utf-8")

    (output_dir / "search-index.json").write_text(
        json.dumps(search_index), encoding="utf-8"
    )
    (output_dir / "llms.txt").write_text(llms_txt, encoding="utf-8")
    (output_dir / "llms-full.txt").write_text(llms_full_txt, encoding="utf-8")
```

> **Note:** `render_site()` deletes and recreates the entire `site/` directory before writing the new bundle. If a page existed in an older generation but not the new one, the old files do not linger.

This repository’s checked-in `docs/` directory is a real example of the published shape:

```text
docs/
  index.html
  generated-output.html
  generated-output.md
  llms.txt
  llms-full.txt
  search-index.json
  assets/
    style.css
    search.js
    copy.js
    callouts.js
    scrollspy.js
    theme.js
    github.js
```

> **Warning:** Page slugs are flat filenames. If a slug contains `/`, `\`, a leading `.`, or `..`, docsfy skips it instead of creating nested directories or unsafe paths.

```620:637:src/docsfy/renderer.py
# Filter out invalid slugs
valid_pages: dict[str, str] = {}
for slug, content in pages.items():
    if "/" in slug or "\\" in slug or slug.startswith(".") or ".." in slug:
        logger.warning(f"Skipping invalid slug: {slug}")
    else:
        valid_pages[slug] = content

# Filter navigation to only include pages that exist in valid_pages
filtered_navigation: list[dict[str, Any]] = []
for group in navigation:
    filtered_pages = [
        page
        for page in group.get("pages", [])
        if page.get("slug", "") in valid_pages
    ]
```

## How A Generation Becomes A Bundle

Before the site is rendered, docsfy runs a short post-generation pipeline. In practice, it looks like this:

```mermaid
flowchart LR
  A[Generated page markdown] --> B[Validate pages]
  B --> C[Add related-page links]
  C --> D[Detect version]
  D --> E[Render static site]
  E --> F[index.html]
  E --> G[slug.html + slug.md]
  E --> H[search-index.json]
  E --> I[llms.txt + llms-full.txt]
  E --> J[assets/ + .nojekyll]
```

The backend implements that sequence explicitly:

```981:1052:src/docsfy/api/projects.py
# --- Post-generation pipeline ---
await update_and_notify(
    gen_key,
    project_name,
    ai_provider,
    ai_model,
    status="generating",
    owner=owner,
    branch=branch,
    current_stage="validating",
    page_count=len(pages),
)
pages = await validate_pages(...)

await update_and_notify(
    gen_key,
    project_name,
    ai_provider,
    ai_model,
    status="generating",
    owner=owner,
    branch=branch,
    current_stage="cross_linking",
    page_count=len(pages),
)
pages = await add_cross_links(...)

version = detect_version(repo_dir)
if version:
    plan["version"] = version

await update_and_notify(... current_stage="rendering", page_count=len(pages))
site_dir = get_project_site_dir(
    project_name, ai_provider, ai_model, owner, branch=branch
)
render_site(plan=plan, pages=pages, output_dir=site_dir)
```

## HTML Pages, Assets, And Search

`index.html` is a real landing page, not just a redirect. Each page slug also gets a standalone `<slug>.html` file with previous/next navigation and, when headings exist, an `On this page` table of contents.

Markdown is rendered with fenced code blocks, syntax highlighting, tables, and a TOC built from `##` and `###` headings:

```441:455:src/docsfy/renderer.py
def _md_to_html(md_text: str) -> tuple[str, str]:
    """Convert markdown to HTML. Returns (content_html, toc_html)."""
    md = markdown.Markdown(
        extensions=["fenced_code", "codehilite", "tables", "toc"],
        extension_configs={
            "codehilite": {"css_class": "highlight", "guess_lang": False},
            "toc": {"toc_depth": "2-3"},
        },
    )
    md_text = _prerender_mermaid(md_text)
    md_text = _clean_code_fence_annotations(md_text)
    md_text = _ensure_blank_lines(md_text)
    content_html = _sanitize_html(md.convert(md_text))
    toc_html = getattr(md, "toc", "")
    return content_html, toc_html
```

```15:39:src/docsfy/templates/page.html
<nav class="page-nav">
    {% if prev_page %}
    <a href="{{ prev_page.slug }}.html" class="page-nav-link page-nav-prev">
        <span class="page-nav-label">Previous</span>
        <span class="page-nav-title">{{ prev_page.title }}</span>
    </a>
    {% endif %}
    {% if next_page %}
    <a href="{{ next_page.slug }}.html" class="page-nav-link page-nav-next">
        <span class="page-nav-label">Next</span>
        <span class="page-nav-title">{{ next_page.title }}</span>
    </a>
    {% endif %}
</nav>
{% if toc %}
<aside class="toc-sidebar">
    <div class="toc-container">
        <h3>On this page</h3>
        {{ toc | safe }}
    </div>
</aside>
{% endif %}
```

If a page uses the standard blockquote callout forms such as `> **Note:**`, `> **Warning:**`, or `> **Tip:**`, the published bundle styles them automatically:

```1:25:src/docsfy/static/callouts.js
(function() {
  var blockquotes = document.querySelectorAll('blockquote');
  blockquotes.forEach(function(bq) {
    var firstStrong = bq.querySelector('strong');
    if (!firstStrong) return;

    var text = firstStrong.textContent.toLowerCase().replace(':', '').trim();
    var type = null;

    if (text === 'note' || text === 'info') {
      type = 'note';
    } else if (text === 'warning' || text === 'caution') {
      type = 'warning';
    } else if (text === 'tip' || text === 'hint') {
      type = 'tip';
    }

    if (type) {
      bq.classList.add('callout', 'callout-' + type);
    }
  });
})();
```

Search is fully self-contained. There is no separate search server to run. The renderer writes `search-index.json`, and the browser loads that file directly.

```518:524:src/docsfy/renderer.py
index.append(
    {
        "slug": slug,
        "title": title_map.get(slug, slug),
        "content": content[:2000],
    }
)
```

```18:19:src/docsfy/static/search.js
fetch('search-index.json').then(function(r) { return r.json(); })
  .then(function(data) { index = data; }).catch(function() {});
```

```74:80:src/docsfy/static/search.js
var matches = index.filter(function(item) {
  return item.title.toLowerCase().includes(q) || item.content.toLowerCase().includes(q);
}).slice(0, 10);

matches.forEach(function(m, i) {
  var div = document.createElement('a');
  div.href = m.slug + '.html';
```

> **Note:** The search index stores only the first 2,000 characters of each page. That keeps the bundle small, but very deep matches later in a long page may not appear in the built-in search results.

## Markdown Copies And LLM-Friendly Files

For every `guide.html`, you also get `guide.md` in the same bundle. Those Markdown files are part of the public output, not just an internal cache. They are what `llms.txt` links to, and they are what `llms-full.txt` concatenates.

This repository’s generated `llms.txt` is a good real-world example of the format:

```1:10:docs/llms.txt
# docsfy

> AI-powered documentation generator that turns Git repositories into shareable static docs sites.

## Overview

- [Introduction](introduction.md): Explain what docsfy does, who it is for, and the main workflows exposed through the web app, HTTP API, generated sites, and CLI.
- [Architecture and Runtime](architecture-and-runtime.md): Describe how the FastAPI backend, React dashboard, SQLite storage, AI CLI integrations, and static site renderer work together.
- [Projects, Variants, and Ownership](projects-variants-and-ownership.md): Define the core data model: owners, repository names, branches, provider and model variants, statuses, and latest-variant resolution.
- [Generated Output](generated-output.md): Show what docsfy publishes for each generation, including static HTML pages, Markdown copies, search index files, and LLM-friendly artifacts.
```

`llms-full.txt` takes the next step and combines every page into one file with a source label before each page’s content:

```582:595:src/docsfy/renderer.py
for group in nav:
    for page in group.get("pages", []):
        slug = page.get("slug", "")
        content = pages.get(slug, "")
        lines.extend(
            [
                f"Source: {slug}.md",
                "",
                content,
                "",
                "---",
                "",
            ]
        )
```

The published homepage also makes both files easy to discover:

```18:27:src/docsfy/templates/index.html
<div class="llm-docs-content">
    <strong>AI-friendly documentation</strong>
    <p>This documentation is optimized for AI consumption.</p>
    <div class="llm-docs-links">
        <a href="llms.txt">llms.txt</a> — structured index
        <span class="llm-docs-sep">·</span>
        <a href="llms-full.txt">llms-full.txt</a> — complete docs in one file
    </div>
</div>
```

> **Tip:** Use `llms.txt` when you want a small table of contents you can crawl page by page. Use `llms-full.txt` when you want one file to hand to an LLM, a batch text-processing job, or another ingestion pipeline.
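
Because each page in `llms-full.txt` is introduced by a `Source: <slug>.md` label and followed by a `---` separator, the file is easy to split back into per-page chunks for your own processing. A rough sketch, assuming a local copy of the file and assuming the page bodies do not themselves contain a bare `---` line:

```python
from pathlib import Path

# Assumes a local copy of the generated llms-full.txt.
text = Path("llms-full.txt").read_text(encoding="utf-8")

pages: dict[str, str] = {}
for chunk in text.split("\n---\n"):
    chunk = chunk.strip()
    if not chunk.startswith("Source: "):
        continue  # skip any preamble before the first page
    header, _, body = chunk.partition("\n")
    slug = header.removeprefix("Source: ").removesuffix(".md")
    pages[slug] = body.strip()

print(sorted(pages))
```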

## Where The Files Live And How To Get Them

By default, docsfy stores runtime data under `/data`, and the provided Compose file binds host `./data` into that location:

```13:17:.env.example
# Data directory for database and generated docs
DATA_DIR=/data

# Cookie security (set to false for local HTTP development)
SECURE_COOKIES=true
```

```10:11:docker-compose.yaml
volumes:
  - ./data:/data
```

On disk, one variant lives in a branch/provider/model-aware directory, and the published site is the `site/` child of that variant:

```525:581:src/docsfy/storage.py
def get_project_dir(
    name: str,
    ai_provider: str = "",
    ai_model: str = "",
    owner: str = "",
    branch: str = DEFAULT_BRANCH,
) -> Path:
    return (
        PROJECTS_DIR
        / safe_owner
        / _validate_name(name)
        / branch
        / ai_provider
        / ai_model
    )

def get_project_site_dir(
    name: str,
    ai_provider: str = "",
    ai_model: str = "",
    owner: str = "",
    branch: str = DEFAULT_BRANCH,
) -> Path:
    return get_project_dir(name, ai_provider, ai_model, owner, branch) / "site"
```

With the default settings, that means the published bundle lives at:

`/data/projects/<owner>/<project>/<branch>/<provider>/<model>/site/`

docsfy serves any file inside that directory, not just `index.html`. In other words, the same route family that serves HTML pages can also serve `llms.txt`, `llms-full.txt`, `search-index.json`, or files under `assets/`.

```200:268:src/docsfy/main.py
@app.get("/docs/{project}/{branch}/{provider}/{model}/{path:path}")
async def serve_variant_docs(...):
    ...
    site_dir = get_project_site_dir(project, provider, model, proj_owner, branch=branch)
    file_path = site_dir / path
    ...
    return FileResponse(file_path)

@app.get("/docs/{project}/{path:path}")
async def serve_docs(...):
    """Serve the most recently generated variant."""
    ...
    site_dir = get_project_site_dir(
        project,
        str(latest["ai_provider"]),
        str(latest["ai_model"]),
        latest_owner,
        branch=latest_branch,
    )
    file_path = site_dir / path
    ...
    return FileResponse(file_path)
```

That means a URL like `/docs/<project>/<branch>/<provider>/<model>/llms.txt` is just as real as `/docs/<project>/<branch>/<provider>/<model>/index.html`.

> **Warning:** The short `/docs/<project>/...` and `/api/projects/<name>/download` routes are not pinned to one branch or model. They return the newest ready variant you can access. If you care about one exact build, use the fully qualified variant route.
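
For scripted access to one exact build, you can request the fully qualified route directly. The following is a minimal sketch with `httpx`; the server URL, project, branch, provider, model, and key are placeholders, and it assumes your deployment accepts the same Bearer credential on `/docs/...` that the CLI sends for API requests:

```python
import httpx

BASE = "http://localhost:8000"  # placeholder server URL
API_KEY = "<your-api-key>"      # placeholder credential

# Fully qualified variant route, pinned to one branch/provider/model build.
# Assumes Bearer auth is accepted here; otherwise use a browser session.
url = f"{BASE}/docs/my-project/main/cursor/gpt-5.4-xhigh-fast/llms.txt"
resp = httpx.get(url, headers={"Authorization": f"Bearer {API_KEY}"})
resp.raise_for_status()
print(resp.text[:200])
```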

Downloads are built from `site_dir` only, so the archive contains the published static bundle rather than caches or planning metadata. The CLI either saves that archive or extracts it into the directory you pass with `--output`.

```374:409:src/docsfy/api/projects.py
async def _stream_tarball(site_dir: Path, archive_name: str) -> StreamingResponse:
    """Create a tar.gz archive and stream it as a response."""
    ...
    def _create_archive() -> None:
        with tarfile.open(tar_path, mode="w:gz") as tar:
            tar.add(str(site_dir), arcname=archive_name)

    ...
    return StreamingResponse(
        _stream_and_cleanup(),
        media_type="application/gzip",
        headers={
            "Content-Disposition": f'attachment; filename="{archive_name}-docs.tar.gz"'
        },
    )
```

```301:321:src/docsfy/cli/projects.py
if branch and provider and model:
    url_path = (
        f"/api/projects/{name}/{branch}/{provider}/{model}/download{owner_qs}"
    )
    archive_name = f"{name}-{branch}-{provider}-{model}-docs.tar.gz"
else:
    url_path = f"/api/projects/{name}/download{owner_qs}"
    archive_name = f"{name}-docs.tar.gz"

if output:
    client.download(url_path, tmp_path)
    output_dir = Path(output)
    output_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tmp_path, "r:gz") as tar:
        tar.extractall(path=output_dir, filter="data")
    typer.echo(f"Extracted to {output_dir}")
```

> **Note:** Extracting a download gives you a top-level folder inside your target directory. The tarball is created with an explicit archive name, so you should expect one nested directory rather than loose files at the extraction root.
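
If you prefer not to use the CLI, the same download can be scripted against the API. A sketch under the same placeholder assumptions (server URL, project name, API key); it uses the short download route, so it fetches the newest ready variant you can access:

```python
import tarfile
import tempfile
from pathlib import Path

import httpx

BASE = "http://localhost:8000"  # placeholder server URL
API_KEY = "<your-api-key>"      # placeholder credential

url = f"{BASE}/api/projects/my-project/download"
out_dir = Path("my-project-docs")
out_dir.mkdir(parents=True, exist_ok=True)

with tempfile.NamedTemporaryFile(suffix=".tar.gz") as tmp:
    # Stream the tar.gz response to a temporary file.
    with httpx.stream("GET", url, headers={"Authorization": f"Bearer {API_KEY}"}) as resp:
        resp.raise_for_status()
        for chunk in resp.iter_bytes():
            tmp.write(chunk)
    tmp.flush()
    # Extract; expect one top-level folder named after the archive.
    with tarfile.open(tmp.name, "r:gz") as tar:
        tar.extractall(path=out_dir, filter="data")

print(f"Extracted to {out_dir}")
```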

## What Can Appear Inside The Published Pages

The final HTML pages can include a little more than a raw Markdown render:

- docsfy can validate generated pages, detect stale references, and regenerate a page before the final render.
- It can append a `## Related Pages` section with links to other generated pages.
- It can detect a project version and show it in the footer.
- It can pre-render top-level Mermaid diagrams to inline SVG when `mmdc` is available.

The related-pages section is appended as plain Markdown links to other generated HTML pages:

```402:404:src/docsfy/postprocess.py
if link_items:
    related_section = "\n\n## Related Pages\n\n" + "\n".join(link_items)
    updated[slug] = updated[slug] + related_section
```

Version detection checks common project metadata files and Git tags, then the footer renders the version if one was found:

```38:43:src/docsfy/postprocess.py
def detect_version(repo_path: Path) -> str | None:
    """Auto-detect project version from common sources.

    Checks in order: pyproject.toml, package.json, Cargo.toml, setup.cfg, git tags.
    Returns the first version found, or None.
    """
```

```82:89:src/docsfy/templates/_doc_base.html
<footer class="page-footer">
    {% if version %}
    <span class="footer-version">Generated from version {{ version }}</span>
    <span class="footer-sep">&middot;</span>
    {% endif %}
    Generated with <a href="{{ docsfy_repo_url }}" target="_blank" rel="noopener">docs<span class="brand-accent">fy</span></a>
    <span class="footer-sep">&middot;</span>
    <span class="footer-llm">LLM-friendly: <a href="llms.txt">llms.txt</a> &middot; <a href="llms-full.txt">llms-full.txt</a></span>
</footer>
```

Mermaid rendering is opportunistic rather than brittle: if `mmdc` is installed, docsfy turns top-level Mermaid fences into inline SVG; if not, or if rendering fails, it keeps the original Mermaid block instead of failing the page.

```325:327:src/docsfy/renderer.py
if not shutil.which("mmdc"):
    logger.debug("mmdc not found, skipping Mermaid pre-rendering")
    return md_text
```

```383:393:src/docsfy/renderer.py
if in_mermaid and backtick_count >= mermaid_fence_len and not rest:
    mermaid_src = "\n".join(mermaid_lines)
    rendered = _render_mermaid_block(mermaid_src)
    if rendered:
        result.append(rendered)
    else:
        # Rendering failed -- restore original block
        result.append("`" * mermaid_fence_len + "mermaid")
        result.extend(mermaid_lines)
        result.append(line)
```

> **Tip:** If you want a version in the footer, keep version metadata in a standard place such as `pyproject.toml`, `package.json`, `Cargo.toml`, or `setup.cfg`, or tag the repository.


## Related Pages

- [Viewing, Downloading, and Hosting Docs](viewing-downloading-and-hosting-docs.html)
- [Data Storage and Layout](data-storage-and-layout.html)
- [Architecture and Runtime](architecture-and-runtime.html)
- [Generating Documentation](generating-documentation.html)
- [Deployment and Runtime](deployment-and-runtime.html)

---

Source: installation.md

# Installation

docsfy has two parts:

- a Python application that provides the API, generation engine, and CLI
- a React frontend that `docsfy-server` serves from `frontend/dist`

If you only want the CLI to talk to an existing docsfy server, you can skip the frontend build on your machine. If you want the full local web UI, install both the Python and frontend dependencies.

## Prerequisites

You will need:

- Python 3.12 or newer
- `uv` for Python dependency management
- Node.js and npm for the frontend build
- Git
- one supported AI provider CLI if you plan to generate docs: `claude`, `gemini`, or `cursor`
- Chromium plus Mermaid CLI (`mmdc`) if you want local generation to match the container's Mermaid diagram support

The project metadata defines the Python version requirement and the two console entry points:

```toml
[project]
requires-python = ">=3.12"

[project.scripts]
docsfy-server = "docsfy.main:run"
docsfy = "docsfy.cli.main:main"
```

> **Note:** The repository’s container build uses Python 3.12 and Node 20, so Node 20 is a safe choice for local frontend work too.

To match the container's Mermaid diagram support locally, install Chromium and Mermaid CLI (`mmdc`) on your machine. The runtime image installs them with:

```dockerfile
RUN apt-get update && apt-get install -y --no-install-recommends \
  bash \
  git \
  curl \
  nodejs \
  npm \
  chromium \
  && rm -rf /var/lib/apt/lists/*

# Puppeteer config for mermaid-cli (must be set before npm install)
ENV PUPPETEER_EXECUTABLE_PATH="/usr/bin/chromium"
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD="true"

# Configure npm for non-root global installs and install Gemini CLI + mermaid-cli
RUN mkdir -p /home/appuser/.npm-global \
  && npm config set prefix '/home/appuser/.npm-global' \
  && npm install -g @google/gemini-cli @mermaid-js/mermaid-cli@11
```

## Install Python dependencies

From the repository root, install the Python environment with `uv`:

```bash
uv sync --frozen --no-dev
```

That matches the runtime install used by the project’s container build and gives you the `docsfy` and `docsfy-server` commands inside the project environment.

If you do not want to activate the environment manually, you can run the commands through `uv`:

```bash
uv run docsfy --help
uv run docsfy-server
```

> **Note:** If you only need the CLI as a client for an already-running server, this Python install is enough.

## Install frontend dependencies

The frontend lives in `frontend/` and uses Vite. Install its dependencies with npm, then build the static assets:

```bash
cd frontend
npm ci
npm run build
```

The available frontend scripts are defined in `frontend/package.json`:

```json
"scripts": {
  "dev": "vite",
  "build": "tsc -b && vite build",
  "lint": "eslint .",
  "preview": "vite preview",
  "test": "vitest run"
}
```

`npm run build` creates the files that `docsfy-server` serves for the browser UI.

If you are actively working on the frontend, you can also run the Vite dev server:

```bash
cd frontend
npm run dev
```

> **Tip:** The Vite dev server listens on `0.0.0.0:5173` and proxies `/api`, `/docs`, and `/health` to `http://localhost:8000` by default. If your backend is running somewhere else, set `API_TARGET` before starting `npm run dev`.

> **Warning:** Build the frontend before starting `docsfy-server` if you want the browser UI. The server only mounts `/assets` when those build artifacts already exist, so if you build the frontend after the server is already running, restart the server.

## Create your `.env` file

`docsfy-server` reads settings from a `.env` file in the repository root. The checked-in example is:

```dotenv
# Required: Admin password (minimum 16 characters)
ADMIN_KEY=

# AI provider and model defaults
# (pydantic_settings reads these case-insensitively)
AI_PROVIDER=cursor
AI_MODEL=gpt-5.4-xhigh-fast
AI_CLI_TIMEOUT=60

# Logging
LOG_LEVEL=INFO

# Data directory for database and generated docs
DATA_DIR=/data

# Cookie security (set to false for local HTTP development)
SECURE_COOKIES=true

# Development mode: starts Vite dev server on port 5173 alongside FastAPI
# DEV_MODE=true
```

At minimum, set `ADMIN_KEY` to a strong value with 16 or more characters.
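
One simple way to generate a suitable value is Python's `secrets` module, which produces a URL-safe token comfortably above the minimum length:

```python
import secrets

# Prints a random URL-safe token (43 characters for 32 bytes of entropy),
# well above the 16-character minimum for ADMIN_KEY.
print(secrets.token_urlsafe(32))
```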

> **Warning:** `docsfy-server` exits at startup if `ADMIN_KEY` is missing or shorter than 16 characters.

> **Tip:** For local development over plain `http://localhost`, set `SECURE_COOKIES=false`. Otherwise browser login cookies will be marked secure and will not work over HTTP.

> **Note:** `DATA_DIR` defaults to `/data`, which matches the container setup. For a local source checkout, point it at a writable location if you do not want to use `/data`.

## Start `docsfy-server`

Once the Python environment, frontend build, and `.env` file are ready, start the server:

```bash
uv run docsfy-server
```

By default, the server binds to `127.0.0.1:8000`. You can override that with environment variables:

```bash
HOST=0.0.0.0 PORT=8000 DEBUG=true uv run docsfy-server
```

`DEBUG=true` enables reload mode for local backend development.

> **Note:** Generation jobs require the matching provider CLI to be installed. docsfy supports `claude`, `gemini`, and `cursor`, and checks that the selected provider CLI is available before it starts generating pages.

## Configure the `docsfy` CLI

`docsfy` is the client entry point. It talks to a running server and stores connection profiles in `~/.config/docsfy/config.toml`.

The easiest setup path is the interactive config command:

```bash
uv run docsfy config init
uv run docsfy health
```

For a first local setup, use:

- server URL: `http://localhost:8000`
- username: `admin`
- password: your `ADMIN_KEY`

The example config file in the repository looks like this:

```toml
[default]
server = "dev"

[servers.dev]
url = "http://localhost:8000"
username = "admin"
password = "<your-dev-key>"
```

This file contains the server URL plus the credential the CLI will send as a Bearer token. For the built-in local admin account, that token is your `ADMIN_KEY`.

> **Warning:** `~/.config/docsfy/config.toml` contains credentials. Keep it private.

> **Tip:** The example config recommends `chmod 600 ~/.config/docsfy/config.toml`, and `docsfy config init` writes the file with owner-only permissions automatically.

## The two installed commands

After installation, the two commands you will use most are:

- `docsfy-server` to run the FastAPI application
- `docsfy` to configure a server profile, check health, start generations, inspect status, and manage projects or users

From a repository checkout, the most reliable way to invoke them is through `uv run`:

```bash
uv run docsfy-server
uv run docsfy health
```

If `uv run docsfy health` reports an `ok` status, your local installation is ready to use.
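
If you want to confirm the same thing without the CLI, `/health` is a public endpoint you can probe directly. A minimal sketch, assuming the default local address:

```python
import httpx

# /health does not require authentication.
resp = httpx.get("http://localhost:8000/health", timeout=5.0)
resp.raise_for_status()
print(resp.json())  # expected: {"status": "ok"}
```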


## Related Pages

- [Docker and Compose Quickstart](docker-quickstart.html)
- [Environment Variables](environment-variables.html)
- [AI Provider Setup](ai-provider-setup.html)
- [Local Development](local-development.html)
- [First Run Quickstart](first-run-quickstart.html)

---

Source: ai-provider-setup.md

# AI Provider Setup

To generate documentation, `docsfy` needs access to a supported AI provider CLI on the same machine that runs `docsfy-server`. If your browser or local `docsfy` CLI connects to a remote server, provider setup belongs on that server, not on your workstation.

The supported provider names are:

- `claude`
- `gemini`
- `cursor`

## How Provider Setup Fits Into Generation

```mermaid
sequenceDiagram
    participant User as Browser or docsfy CLI
    participant API as docsfy API
    participant CLI as Provider CLI
    participant Store as Variant Storage

    User->>API: Start generation with provider/model
    API->>CLI: check_ai_cli_available(...)
    alt CLI available
        API->>CLI: Plan docs
        API->>CLI: Generate pages
        API->>CLI: Validate and cross-link
        API->>Store: Save variant under owner/project/branch/provider/model
        Store-->>User: Status becomes ready
    else CLI missing, logged out, or model unavailable
        CLI-->>API: Error message
        API->>Store: Mark variant as error
        Store-->>User: Status shows error
    end
```

> **Warning:** The provider dropdown is not a live health check. The UI can show `claude`, `gemini`, and `cursor` even if a CLI is missing, logged out, or blocked from the selected model.

## Install The Provider CLIs

`docsfy` expects a matching external CLI for each provider you want to use:

- `claude`: Claude Code CLI
- `gemini`: Gemini CLI
- `cursor`: Cursor Agent CLI

If you use the bundled container image, all three are installed during the image build and added to the runtime user's `PATH`:

```dockerfile
# Install Claude Code CLI (installs to ~/.local/bin)
RUN /bin/bash -o pipefail -c "curl -fsSL https://claude.ai/install.sh | bash"

# Install Cursor Agent CLI (installs to ~/.local/bin)
RUN /bin/bash -o pipefail -c "curl -fsSL https://cursor.com/install | bash"

# Configure npm for non-root global installs and install Gemini CLI + mermaid-cli
RUN mkdir -p /home/appuser/.npm-global \
  && npm config set prefix '/home/appuser/.npm-global' \
  && npm install -g @google/gemini-cli @mermaid-js/mermaid-cli@11

USER appuser
ENV PATH="/home/appuser/.local/bin:/home/appuser/.npm-global/bin:${PATH}"
ENV HOME="/home/appuser"
```

If you are not using the bundled image, install the equivalent CLIs yourself and make sure the server user can run them from `PATH`.

> **Note:** The repository installs provider binaries for the bundled container image, but it does not perform provider login for you.

## Authenticate The Right Layer

There are two separate authentication layers in a `docsfy` deployment:

- `docsfy` authentication: who can use the app, API, and CLI
- Provider authentication: whether the external Claude, Gemini, or Cursor CLI can actually generate content

When the `docsfy` CLI talks to the server, the stored `password` is used as an HTTP Bearer token:

```python
self._client = httpx.Client(
    base_url=self.server_url,
    headers={"Authorization": f"Bearer {self.password}"},
    timeout=30.0,
    follow_redirects=False,
)
```

The checked-in server environment example contains `docsfy` settings and provider defaults, but no provider login secrets:

```dotenv
ADMIN_KEY=

AI_PROVIDER=cursor
AI_MODEL=gpt-5.4-xhigh-fast
AI_CLI_TIMEOUT=60
```

This means:

- Your `docsfy` login does not log the provider CLI into Claude, Gemini, or Cursor.
- `AI_PROVIDER` and `AI_MODEL` choose defaults; they do not authenticate anything.
- The external CLI must already be authenticated in the same environment that runs `docsfy-server`.

If you deploy with the bundled container, that environment is the `appuser` account with `HOME=/home/appuser`. Provider login state has to be available there.

> **Warning:** Do not put provider credentials in Git-tracked files. Authenticate each provider CLI using its normal login flow outside the repository.

## Defaults And Model Validation

The shipped server defaults are:

- Provider: `cursor`
- Model: `gpt-5.4-xhigh-fast`
- CLI timeout: `60` seconds

At the API layer, `docsfy` enforces a fixed provider list and requires a model name:

```python
if ai_provider not in VALID_PROVIDERS:
    raise HTTPException(
        status_code=400,
        detail=f"Invalid AI provider: '{ai_provider}'. Must be one of {', '.join(VALID_PROVIDERS)}.",
    )
if not ai_model:
    raise HTTPException(status_code=400, detail="AI model must be specified.")
```

Provider names are strict, but model names are intentionally loose. This codebase does not ship its own master catalog of valid Claude, Gemini, or Cursor models. The external CLI and your provider account are the source of truth for whether a model is actually usable.
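
If you script against the API, it can be worth mirroring that check on the client side before submitting a request. The following is a small illustrative sketch, not part of docsfy itself:

```python
VALID_PROVIDERS = ("claude", "gemini", "cursor")

def check_request(ai_provider: str, ai_model: str) -> None:
    # Same rule the API enforces: the provider must be one of the fixed names,
    # the model just has to be non-empty; its validity is up to the provider CLI.
    if ai_provider not in VALID_PROVIDERS:
        raise ValueError(
            f"Invalid AI provider: '{ai_provider}'. "
            f"Must be one of {', '.join(VALID_PROVIDERS)}."
        )
    if not ai_model:
        raise ValueError("AI model must be specified.")

check_request("cursor", "gpt-5.4-xhigh-fast")  # passes the local check
```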

## How Availability Affects Generation

Before cloning a repo or starting the planning pipeline, `docsfy` checks whether the selected provider CLI is available for the chosen provider/model pair:

```python
cli_flags = ["--trust"] if ai_provider == "cursor" else None
available, msg = await check_ai_cli_available(
    ai_provider, ai_model, cli_flags=cli_flags
)
if not available:
    await update_and_notify(
        gen_key,
        project_name,
        ai_provider,
        ai_model,
        status="error",
        owner=owner,
        branch=branch,
        error_message=msg,
    )
    return
```

This has a few practical consequences:

- If the provider CLI is missing, generation fails fast.
- If the CLI is installed but not authenticated, generation fails fast.
- If the selected model is not available to the current provider login, generation fails fast.
- When this check fails, the variant moves to `error` and the returned message is surfaced in status output.

The same provider and model are then used again for planning, page generation, validation, and cross-linking. Passing the initial availability check is necessary, but it does not guarantee that later provider calls will succeed.

> **Note:** `docsfy` automatically adds Cursor's `--trust` flag when the selected provider is `cursor`.

> **Warning:** If a run fails almost immediately, check provider installation, provider login, and model access first. The availability check happens before repo cloning.

## Why The Model Picker May Be Empty Or Stale

The provider list is fixed, but model suggestions are remembered history, not a live catalog fetched from Claude, Gemini, or Cursor.

The server builds its known model list from successful `ready` variants:

```python
cursor = await db.execute(
    "SELECT DISTINCT ai_provider, ai_model FROM projects WHERE ai_provider != '' AND ai_model != '' AND status = 'ready' ORDER BY ai_provider, ai_model"
)
rows = await cursor.fetchall()
models: dict[str, list[str]] = {}
for provider, model in rows:
    if provider not in models:
        models[provider] = []
    if model not in models[provider]:
        models[provider].append(model)
```

At the same time, the model input is free-form in the UI:

```tsx
<Combobox
  options={modelOptions}
  value={model}
  onChange={setModel}
  placeholder="Select or type model..."
  disabled={isSubmitting}
  data-testid="model-input"
/>
```

What this means in practice:

- A fresh server can have an empty model list.
- A newly released provider model does not appear automatically.
- A model can still appear in the picker even if the current provider login no longer has access to it.
- You can type a model manually even when it is not suggested yet.

> **Tip:** If the model dropdown is empty, type the model name you want and use it for a successful generation. Once that variant reaches `ready`, the server can start suggesting that model later.

## Why Provider And Model Create Separate Variants

Provider and model are part of how `docsfy` stores and identifies generated output:

```python
return (
    PROJECTS_DIR
    / safe_owner
    / _validate_name(name)
    / branch
    / ai_provider
    / ai_model
)
```

That is why the same repository can have different outputs for different providers, models, and branches. It is also why docs URLs and download endpoints include `branch`, `provider`, and `model`.

Without `force`, `docsfy` can reuse work from the newest ready variant even if that variant was generated with a different provider or model. On same-commit switches, it can reuse cached pages or even the already-rendered site. With `force`, it does a fresh run and keeps existing variants in place.

> **Note:** If you are comparing providers side by side, use a forced regeneration when you want a clean rerun rather than reuse from the newest ready variant.

## Setup Checklist

- Install the provider CLI you want to use on the machine that runs `docsfy-server`.
- Make sure the server user can run that CLI from `PATH`.
- Authenticate the provider CLI as that same user, with the same `HOME`.
- Set sensible defaults for `AI_PROVIDER`, `AI_MODEL`, and `AI_CLI_TIMEOUT`.
- Keep `ADMIN_KEY` and provider credentials separate in your mental model and in your deployment.
- Expect the provider dropdown to exist even before the provider is actually usable.
- If model suggestions are empty, type the model name manually.
- If generation fails quickly, check provider CLI availability and login state before you troubleshoot the repository itself.


## Related Pages

- [Installation](installation.html)
- [Environment Variables](environment-variables.html)
- [Generating Documentation](generating-documentation.html)
- [Deployment and Runtime](deployment-and-runtime.html)
- [Troubleshooting](troubleshooting.html)

---

Source: local-development.md

# Local Development

`docsfy` has two useful local workflows:

| Workflow | Best for | Open in your browser |
|---|---|---|
| Split development | React UI work, fast feedback, hot reload | `http://localhost:5173` |
| Built-SPA check | Production-like checks, backend serving the UI itself | `http://localhost:8000` |

In split development, FastAPI runs on port `8000` and Vite runs on port `5173`. You use the Vite URL in the browser, and Vite proxies API, docs, health, and WebSocket traffic back to FastAPI.

In built-SPA mode, FastAPI serves the built frontend from `frontend/dist`, so you only use port `8000`.

## Prerequisites

- Python `3.12+`. `pyproject.toml` sets `requires-python = ">=3.12"`.
- `uv` for Python commands.
- Node and npm for the frontend. The Docker image builds the SPA with Node `20`, so matching that locally is the safest option.
- A writable data directory for the database and generated docs.

## Where settings go

Not every setting is loaded the same way, which matters for local development.

- Put app settings such as `ADMIN_KEY`, `DATA_DIR`, `SECURE_COOKIES`, `AI_PROVIDER`, and `AI_MODEL` in the repo root `.env`.
- Pass backend launcher settings such as `DEBUG`, `HOST`, and `PORT` in the shell when starting `docsfy-server`.
- Pass `API_TARGET` in the shell when starting the Vite dev server.
- `DEV_MODE` is handled by `entrypoint.sh`, so it matters for container startup, not for a host-local `uv run docsfy-server`.

The app settings loader in `src/docsfy/config.py` reads `.env` automatically:

```python
class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        extra="ignore",
    )
```

The backend launcher in `src/docsfy/main.py` reads `DEBUG`, `HOST`, and `PORT` directly from the process environment:

```python
def run() -> None:
    import uvicorn

    reload = os.getenv("DEBUG", "").lower() == "true"
    host = os.getenv("HOST", "127.0.0.1")
    port = int(os.getenv("PORT", "8000"))
    uvicorn.run("docsfy.main:app", host=host, port=port, reload=reload)
```

## Configure `.env`

Start from the values in `.env.example`:

```env
# Required: Admin password (minimum 16 characters)
ADMIN_KEY=

# AI provider and model defaults
AI_PROVIDER=cursor
AI_MODEL=gpt-5.4-xhigh-fast
AI_CLI_TIMEOUT=60

# Logging
LOG_LEVEL=INFO

# Data directory for database and generated docs
DATA_DIR=/data

# Cookie security (set to false for local HTTP development)
SECURE_COOKIES=true

# Development mode: starts Vite dev server on port 5173 alongside FastAPI
# DEV_MODE=true
```

For host-local development, the most important changes are:

- Set `ADMIN_KEY` to a real value with at least 16 characters.
- Change `SECURE_COOKIES=false` if you are using plain `http://localhost`.
- Consider changing `DATA_DIR` to something writable and local to the repo, such as `./data`, instead of the container-oriented default `/data`.

> **Warning:** With `SECURE_COOKIES=true`, browser login over plain local HTTP will not persist because the session cookie is marked `secure`.

> **Note:** The login screen expects username `admin` and the `ADMIN_KEY` value from your environment.

## Run the backend locally

From the repo root, start the backend with `docsfy-server`:

```bash
DEBUG=true uv run docsfy-server
```

That gives you:

- FastAPI on `127.0.0.1:8000` by default
- automatic reload when `DEBUG=true`
- the `/health` endpoint at `http://localhost:8000/health`

If you need the backend to listen on a different interface or port, pass `HOST` and `PORT` in the shell when you start it.

> **Tip:** `DEBUG` is not read from `.env` by `docsfy-server`. Set it in the shell when you launch the backend.

## Run the frontend locally

The frontend scripts in `frontend/package.json` are:

```json
"scripts": {
  "dev": "vite",
  "build": "tsc -b && vite build",
  "lint": "eslint .",
  "preview": "vite preview",
  "test": "vitest run"
}
```

Start the frontend from `frontend/`:

```bash
cd frontend
npm ci
npm run dev
```

Then open `http://localhost:5173`.

This is the best workflow for day-to-day UI development because Vite handles hot reload and the browser stays on the frontend dev server.

## How the Vite proxy changes the workflow

When Vite is running, the browser still uses paths like `/api/...`, `/docs/...`, and `/api/ws`. The difference is that Vite forwards those requests to the backend for you.

In `frontend/vite.config.ts`:

```ts
const API_TARGET = process.env.API_TARGET || 'http://localhost:8000'

export default defineConfig({
  plugins: [react(), tailwindcss()],
  server: {
    host: '0.0.0.0',
    port: 5173,
    proxy: {
      '/api': {
        target: API_TARGET,
        changeOrigin: true,
        ws: true,
      },
      '/docs': {
        target: API_TARGET,
        changeOrigin: true,
      },
      '/health': {
        target: API_TARGET,
        changeOrigin: true,
      },
    },
  },
})
```

The frontend code is written around same-origin browser requests:

```ts
const response = await fetch(`${path}`, config)
```

And the WebSocket client connects back to the current browser host:

```ts
const protocol = window.location.protocol === 'https:' ? 'wss:' : 'ws:'
const url = `${protocol}//${window.location.host}/api/ws`
this.ws = new WebSocket(url)
```

In practice, that means:

- Use `http://localhost:5173` in the browser while Vite is running.
- API requests still reach FastAPI on port `8000` through the proxy.
- WebSocket updates still work in local development because the `/api` proxy enables `ws: true`.
- Generated docs under `/docs/...` still open correctly because Vite also proxies `/docs`.

> **Tip:** If your backend is not running on `http://localhost:8000`, set `API_TARGET` in the shell before starting `npm run dev`.

## When to build the SPA

You only need `npm run build` when FastAPI itself should serve the frontend from `frontend/dist`.

That includes:

- checking the backend-served UI on port `8000`
- running without Vite
- building the production container image

FastAPI serves the built SPA from `frontend/dist` in `src/docsfy/main.py`:

```python
_frontend_dist = Path(__file__).parent.parent.parent / "frontend" / "dist"

_assets_dir = _frontend_dist / "assets"
if _assets_dir.exists():
    app.mount(
        "/assets", StaticFiles(directory=str(_assets_dir)), name="frontend-assets"
    )

@app.get("/{path:path}")
async def spa_catch_all(path: str) -> FileResponse:
    if path.startswith(("api/", "docs/")) or path in ("api", "docs"):
        raise HTTPException(status_code=404, detail="Not found")
    index = _frontend_dist / "index.html"
    if index.exists():
        return FileResponse(str(index))
    raise HTTPException(
        status_code=404,
        detail="Frontend not built. Run: cd frontend && npm run build",
    )
```

Build it with:

```bash
cd frontend
npm run build
```

After that, `uv run docsfy-server` can serve the app directly on port `8000`.

> **Note:** You do not need to build the SPA for normal React development on port `5173`. `npm run dev` is enough.

The Docker build already does this for you. In `Dockerfile`:

```dockerfile
COPY frontend/package.json frontend/package-lock.json ./
RUN npm ci

COPY frontend/ ./
RUN npm run build

COPY --chown=appuser:0 --from=frontend-builder /app/frontend/dist /app/frontend/dist
```

So a normal non-dev container run already has a built `frontend/dist`.

## What `DEV_MODE` does

`DEV_MODE` is handled by `entrypoint.sh`. It changes how the container starts:

```bash
if [ "$DEV_MODE" = "true" ]; then
    echo "DEV_MODE enabled - installing frontend dependencies..."
    cd /app/frontend || exit 1
    npm ci
    echo "Starting Vite dev server on port 5173..."
    npm run dev &
    VITE_PID=$!
    trap 'kill $VITE_PID 2>/dev/null; wait $VITE_PID 2>/dev/null' SIGTERM SIGINT
    cd /app
    echo "Starting FastAPI with hot reload on port 8000..."
    uv run --no-sync uvicorn docsfy.main:app \
        --host 0.0.0.0 --port 8000 \
        --reload --reload-dir /app/src
else
    exec uv run --no-sync uvicorn docsfy.main:app \
        --host 0.0.0.0 --port 8000
fi
```

With `DEV_MODE=true` inside the container:

- Vite starts on port `5173`
- FastAPI starts on port `8000`
- the backend uses `--reload`
- you should browse to `http://localhost:5173`

Without `DEV_MODE`:

- only FastAPI starts
- the container expects a built `frontend/dist`
- you browse to `http://localhost:8000`

> **Warning:** If you are running the app directly on your host, `DEV_MODE=true` does not start Vite for you. Start the backend and frontend as separate processes instead.

## Using `DEV_MODE` with `docker-compose`

The sample `docker-compose.yaml` already shows the pieces needed for container-based development:

```yaml
services:
  docsfy:
    ports:
      - "8000:8000"
      # Uncomment for development (DEV_MODE=true)
      # - "5173:5173"
    volumes:
      - ./data:/data
      # Uncomment for development (hot reload)
      # - ./frontend:/app/frontend
    env_file:
      - .env
    environment:
      - ADMIN_KEY=${ADMIN_KEY}
      # Uncomment for development
      # - DEV_MODE=true
```

For container-based frontend development:

1. Enable `DEV_MODE=true`.
2. Expose port `5173`.
3. Bind-mount `./frontend:/app/frontend`.
4. Open `http://localhost:5173`.

> **Warning:** The sample compose file only includes a commented-out bind mount for the frontend. `entrypoint.sh` enables backend reload with `--reload --reload-dir /app/src`, but host-side Python edits will only hot-reload if the container can also see the updated `src/` files.

> **Tip:** If you plan to change backend Python code often, host-local development is the simpler workflow unless you also add a backend source bind mount to your container setup.

## Quick reference

- React/UI work: run the backend on `8000`, run Vite on `5173`, and browse to `http://localhost:5173`.
- Backend-served UI check: build the SPA with `npm run build`, then browse to `http://localhost:8000`.
- `DEV_MODE` is for the container entrypoint workflow.
- The Vite proxy is what makes `/api`, `/docs`, `/health`, and `/api/ws` work from `http://localhost:5173` without extra CORS setup.


## Related Pages

- [Installation](installation.html)
- [Environment Variables](environment-variables.html)
- [Docker and Compose Quickstart](docker-quickstart.html)
- [Deployment and Runtime](deployment-and-runtime.html)
- [Testing and Quality Checks](testing-and-quality-checks.html)

---

Source: docker-quickstart.md

# Docker and Compose Quickstart

`docsfy` ships with both a `Dockerfile` and a `docker-compose.yaml`, so you can run it directly from this repository without writing your own container setup. The provided configuration exposes the app on `http://localhost:8000`, stores its SQLite database and generated output under `./data`, and does not require a separate database container.

By default, one container serves the web UI, the HTTP API, and generated documentation.

```mermaid
flowchart LR
  Env[".env"] -->|runtime settings| App["docsfy container"]
  Browser["Browser"] -->|http://localhost:8000| App
  Browser -. optional :5173 when DEV_MODE=true .-> App
  Data["./data on host"] <-->|bind mount to /data| App
```

## Create `.env`

Run these commands from the repository root:

```bash
cp .env.example .env
```

Relevant lines from `.env.example`:

```dotenv
# Required: Admin password (minimum 16 characters)
ADMIN_KEY=

# Logging
LOG_LEVEL=INFO

# Data directory for database and generated docs
DATA_DIR=/data

# Cookie security (set to false for local HTTP development)
SECURE_COOKIES=true

# Development mode: starts Vite dev server on port 5173 alongside FastAPI
# DEV_MODE=true
```

The compose file uses `.env`, and the application is also configured to read `.env`-style settings.

> **Warning:** `ADMIN_KEY` is required and must be at least 16 characters long. The server validates it during startup.

> **Warning:** If you are using plain `http://localhost:8000` in a browser, set `SECURE_COOKIES=false`. With the default `true`, the session cookie is marked secure for HTTPS.

> **Note:** The provided volume mapping assumes `DATA_DIR=/data`. If you change `DATA_DIR`, change the container-side mount path to match.

## Start with Compose

Key lines from `docker-compose.yaml`:

```yaml
services:
  docsfy:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "8000:8000"
      # Uncomment for development (DEV_MODE=true)
      # - "5173:5173"
    volumes:
      - ./data:/data
    env_file:
      - .env
    environment:
      - ADMIN_KEY=${ADMIN_KEY}
    restart: unless-stopped
```

This setup:

- Builds the image from the repository’s `Dockerfile`.
- Publishes the main app on port `8000`.
- Persists runtime data by binding `./data` on the host to `/data` in the container.
- Loads settings from `.env`.
- Restarts automatically unless you stop it.

Start it with:

```bash
mkdir -p data
docker compose up --build
```

If you prefer detached mode, add `-d`.

Once the container is up:

1. Open `http://localhost:8000/login`.
2. Sign in with username `admin` and the `ADMIN_KEY` value from `.env`.
3. Optionally verify the health endpoint:

```bash
curl http://localhost:8000/health
```

Expected response:

```json
{"status":"ok"}
```

Stop the stack with:

```bash
docker compose down
```

> **Tip:** The first build can take a while. The `Dockerfile` builds the frontend, installs Python dependencies, and installs additional runtime tooling during the image build.

> **Note:** After you change `.env`, restart the container so the new settings are picked up.

## Persistent Data

The storage layout is defined in `src/docsfy/storage.py`:

```python
DB_PATH = Path(os.getenv("DATA_DIR", "/data")) / "docsfy.db"
DATA_DIR = Path(os.getenv("DATA_DIR", "/data"))
PROJECTS_DIR = DATA_DIR / "projects"

return (
    PROJECTS_DIR
    / safe_owner
    / _validate_name(name)
    / branch
    / ai_provider
    / ai_model
)
```

Generated site output lives one level deeper:

```python
def get_project_site_dir(
    name: str,
    ai_provider: str = "",
    ai_model: str = "",
    owner: str = "",
    branch: str = DEFAULT_BRANCH,
) -> Path:
    return get_project_dir(name, ai_provider, ai_model, owner, branch) / "site"
```

In practice, the host-mounted `./data` directory contains:

- `./data/docsfy.db` for the SQLite database.
- `./data/projects/<owner>/<project>/<branch>/<provider>/<model>/site` for rendered documentation.
- `./data/projects/.../cache/pages` for cached page data used during generation.

> **Note:** You do not need to create `docsfy.db` yourself. Startup initializes the database and creates the data directories inside `DATA_DIR`.

> **Note:** Rebuilding or recreating the container does not remove your data as long as `./data` stays in place.

## Exposed Ports and Dev Mode

Relevant lines from `Dockerfile`:

```dockerfile
EXPOSE 8000
# Vite dev server (DEV_MODE only)
EXPOSE 5173

HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
  CMD curl -f http://localhost:8000/health || exit 1

ENTRYPOINT ["/app/entrypoint.sh"]
```

The startup behavior comes from `entrypoint.sh`:

```bash
if [ "$DEV_MODE" = "true" ]; then
    cd /app/frontend || exit 1
    npm ci
    npm run dev &
    uv run --no-sync uvicorn docsfy.main:app \
        --host 0.0.0.0 --port 8000 \
        --reload --reload-dir /app/src
else
    exec uv run --no-sync uvicorn docsfy.main:app \
        --host 0.0.0.0 --port 8000
fi
```

What that means in practice:

- `8000` is the main port end users need. It serves the app, the API, and generated docs.
- `5173` is only relevant when `DEV_MODE=true`.
- In development mode, the container starts a Vite dev server and runs Uvicorn with reload enabled.
- The image health check probes `GET /health`, which is a public endpoint.

If you enable `DEV_MODE=true`, also publish `5173` so that port is reachable from the host.

## Run the Dockerfile Directly

If you want to use the `Dockerfile` without Compose, build and run it like this:

```bash
docker build -t docsfy .
docker run --rm \
  -p 8000:8000 \
  --env-file .env \
  -v "$(pwd)/data:/data" \
  docsfy
```

That gives you the same basics as the provided compose setup:

- `--env-file .env` passes your runtime settings into the container.
- `-v "$(pwd)/data:/data"` preserves the database and generated docs.
- `-p 8000:8000` publishes the web app.

The image does not copy your host `.env` file into `/app`, so `--env-file .env` is the simplest way to pass the same settings you use with Compose.

If you also enable `DEV_MODE=true`, publish `5173` too.

## Troubleshooting

- The container exits immediately: check that `ADMIN_KEY` is set and at least 16 characters long.
- The login page loads but you cannot stay signed in on `http://localhost:8000`: set `SECURE_COOKIES=false` for local HTTP.
- Data is missing after a restart: make sure the bind mount still points to `/data` and still matches `DATA_DIR`.
- You changed `.env` but nothing changed at runtime: restart the container.
- You want to inspect logs: run `docker compose logs -f`.


## Related Pages

- [Installation](installation.html)
- [Environment Variables](environment-variables.html)
- [Deployment and Runtime](deployment-and-runtime.html)
- [Local Development](local-development.html)
- [First Run Quickstart](first-run-quickstart.html)

---

Source: environment-variables.md

# Environment Variables

`docsfy` uses environment variables for server startup, authentication, storage, default AI behavior, and local development. In normal usage, put the core server settings in a repo-root `.env` file. The settings model reads that file automatically, and the provided `docker-compose.yaml` passes it into the container too.

> **Note:** The `docsfy` CLI takes its server URL and credentials from `~/.config/docsfy/config.toml` or from CLI flags such as `--host`, `--port`, `--username`, and `--password`. The variables on this page mainly affect the server process, the container entrypoint, and the frontend development server.

A practical starting point is the shipped `.env.example`:

```dotenv
# Required: Admin password (minimum 16 characters)
ADMIN_KEY=

# AI provider and model defaults
# (pydantic_settings reads these case-insensitively)
AI_PROVIDER=cursor
AI_MODEL=gpt-5.4-xhigh-fast
AI_CLI_TIMEOUT=60

# Data directory for database and generated docs
DATA_DIR=/data

# Cookie security (set to false for local HTTP development)
SECURE_COOKIES=true

# Development mode: starts Vite dev server on port 5173 alongside FastAPI
# DEV_MODE=true
```

If you use the included Compose setup, the container reads the same `.env` file:

```yaml
env_file:
  - .env
environment:
  - ADMIN_KEY=${ADMIN_KEY}
```

## Quick Reference

### Core Server Settings

| Variable | Default | Required | What it controls |
| --- | --- | --- | --- |
| `ADMIN_KEY` | none | Yes | Built-in admin authentication and the secret used to hash stored user API keys |
| `AI_PROVIDER` | `cursor` | No | Default AI provider when a generation request does not specify one |
| `AI_MODEL` | `gpt-5.4-xhigh-fast` | No | Default AI model when a generation request does not specify one |
| `AI_CLI_TIMEOUT` | `60` | No | Default timeout value passed to provider CLI calls |
| `DATA_DIR` | `/data` | No | Root directory for the SQLite database and generated documentation artifacts |
| `SECURE_COOKIES` | `true` | No | Whether the browser session cookie uses the `Secure` flag |

### Launcher And Development Settings

| Variable | Default | Used by | What it controls |
| --- | --- | --- | --- |
| `HOST` | `127.0.0.1` | `docsfy-server` | Backend bind host when you run the packaged server entrypoint directly |
| `PORT` | `8000` | `docsfy-server` | Backend bind port when you run the packaged server entrypoint directly |
| `DEBUG` | unset | `docsfy-server` | Enables Uvicorn reload when set to `true` |
| `DEV_MODE` | unset | `entrypoint.sh` | Starts Vite and FastAPI reload together inside the container |
| `API_TARGET` | `http://localhost:8000` | Vite dev server | Proxy target for `/api`, `/docs`, `/health`, and WebSocket traffic during frontend development |

## How `docsfy` Reads These Values

Most server settings come from `src/docsfy/config.py`:

```python
class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        extra="ignore",
    )

    admin_key: str = ""
    ai_provider: str = "cursor"
    ai_model: str = "gpt-5.4-xhigh-fast"
    ai_cli_timeout: int = Field(default=60, gt=0)
    data_dir: str = "/data"
    secure_cookies: bool = True  # Set to False for local HTTP dev
```

That means:

- Core server settings are loaded from `.env` or the process environment.
- Extra keys in `.env` do not break the settings loader because it uses `extra="ignore"`.
- `HOST`, `PORT`, `DEBUG`, `DEV_MODE`, and `API_TARGET` are separate. They are read directly by the launcher, the container entrypoint, or Vite instead of this settings model.

```mermaid
flowchart TD
    A["`.env` or process environment"] --> B["`Settings` in `src/docsfy/config.py`"]
    B --> C["Startup validates `ADMIN_KEY`"]
    B --> D["Generation defaults: `AI_PROVIDER`, `AI_MODEL`, `AI_CLI_TIMEOUT`"]
    B --> E["Storage root: `DATA_DIR`"]
    B --> F["Browser cookie security: `SECURE_COOKIES`"]

    G["Shell environment"] --> H["`docsfy-server` launcher"]
    H --> I["`HOST`, `PORT`, `DEBUG`"]

    J["Container environment"] --> K["`entrypoint.sh`"]
    K --> L["`DEV_MODE=true` starts Vite on `5173` and FastAPI reload on `8000`"]

    M["Frontend dev shell environment"] --> N["`frontend/vite.config.ts`"]
    N --> O["`API_TARGET` proxies `/api`, `/docs`, `/health`, and `/api/ws`"]
```

## `ADMIN_KEY`

`ADMIN_KEY` is the one setting you must provide before the server can start. `docsfy` checks it during application startup and exits if it is missing or too short:

```python
settings = get_settings()
if not settings.admin_key:
    logger.error("ADMIN_KEY environment variable is required")
    raise SystemExit(1)

if len(settings.admin_key) < 16:
    logger.error("ADMIN_KEY must be at least 16 characters long")
    raise SystemExit(1)
```

In practice, `ADMIN_KEY` is used for all of these:

- Browser login as the built-in `admin` user
- Bearer authentication for admin API requests
- `?token=` authentication on `/api/ws`
- The HMAC secret for stored user API keys

That last point matters because database-backed users depend on it too. In `src/docsfy/storage.py`, user API keys are hashed with `ADMIN_KEY`:

```python
secret = hmac_secret or os.getenv("ADMIN_KEY", "")
if not secret:
    msg = "ADMIN_KEY environment variable is required for key hashing"
    raise RuntimeError(msg)
return hmac.new(secret.encode(), key.encode(), hashlib.sha256).hexdigest()
```

> **Warning:** Rotating `ADMIN_KEY` invalidates existing database-backed user API keys. After you change it, those users will need new API keys.

> **Note:** The built-in admin username is always `admin`. The `/api/auth/rotate-key` endpoint does not rotate `ADMIN_KEY`; for that case the API explicitly tells you to change the environment variable itself.
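
To see why rotation has that effect, here is a small sketch of the same HMAC-SHA256 scheme with two different secrets; the user key and secrets below are made-up values:

```python
import hashlib
import hmac

def hash_key(key: str, secret: str) -> str:
    # Same construction as the storage code above: the stored digest
    # depends on both the user key and ADMIN_KEY.
    return hmac.new(secret.encode(), key.encode(), hashlib.sha256).hexdigest()

user_key = "example-user-api-key"  # hypothetical user API key
old_digest = hash_key(user_key, "old-admin-key-0123456789")
new_digest = hash_key(user_key, "new-admin-key-0123456789")
print(old_digest == new_digest)  # False: digests stored under the old ADMIN_KEY no longer match
```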

## Default AI Settings: `AI_PROVIDER`, `AI_MODEL`, And `AI_CLI_TIMEOUT`

These are server-wide defaults for documentation generation. They are used when a generate request does not provide its own provider, model, or timeout.

The generate API resolves defaults this way in `src/docsfy/api/projects.py`:

```python
settings = get_settings()
ai_provider = gen_request.ai_provider or settings.ai_provider
ai_model = gen_request.ai_model or settings.ai_model

task = asyncio.create_task(
    _run_generation(
        repo_url=gen_request.repo_url,
        repo_path=gen_request.repo_path,
        project_name=project_name,
        ai_provider=ai_provider,
        ai_model=ai_model,
        ai_cli_timeout=gen_request.ai_cli_timeout
        or settings.ai_cli_timeout,
```

The supported provider names in the current codebase are defined in `src/docsfy/models.py`:

```python
VALID_PROVIDERS = ("claude", "gemini", "cursor")
```

What each setting does:

- `AI_PROVIDER` sets the default provider for new generations.
- `AI_MODEL` sets the default model name for that provider.
- `AI_CLI_TIMEOUT` sets the default timeout value passed to provider CLI calls.
- `AI_CLI_TIMEOUT` must be greater than `0`, because the settings model uses `Field(default=60, gt=0)`.

A few practical details matter here:

- You can still override provider and model per request through the API or CLI.
- `docsfy` validates the provider name against `claude`, `gemini`, and `cursor`.
- The model is provider-specific; after defaults are resolved, `docsfy` only requires that it not be empty.
- When the selected provider is `cursor`, generation code adds `--trust` when invoking the CLI.

> **Tip:** Set `AI_PROVIDER` and `AI_MODEL` to the combination you use most often. That keeps routine runs simple, while still allowing one-off overrides.

> **Note:** `AI_CLI_TIMEOUT` affects the provider CLI calls made by the server. It does not change the `docsfy` CLI’s own HTTP client timeout, which is hardcoded to `30.0` seconds in `src/docsfy/cli/client.py`.

## `DATA_DIR`

`DATA_DIR` controls where `docsfy` stores durable runtime data. By default, that is `/data`.

From `src/docsfy/storage.py`:

```python
DB_PATH = Path(os.getenv("DATA_DIR", "/data")) / "docsfy.db"
DATA_DIR = Path(os.getenv("DATA_DIR", "/data"))
PROJECTS_DIR = DATA_DIR / "projects"
```

Project variants are then stored under owner, project, branch, provider, and model directories:

```python
return (
    PROJECTS_DIR
    / safe_owner
    / _validate_name(name)
    / branch
    / ai_provider
    / ai_model
)
```

In practice, the on-disk layout looks like this:

- `DATA_DIR/docsfy.db`
  Stores users, sessions, projects, access grants, and generation metadata.
- `DATA_DIR/projects/<owner>/<project>/<branch>/<provider>/<model>/plan.json`
  Stores the current documentation plan.
- `DATA_DIR/projects/<owner>/<project>/<branch>/<provider>/<model>/cache/pages/*.md`
  Stores cached page markdown used for incremental regeneration.
- `DATA_DIR/projects/<owner>/<project>/<branch>/<provider>/<model>/site/`
  Stores the rendered static site that `docsfy` serves and downloads.
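
Because the depth of that layout is fixed, you can enumerate every rendered variant on disk with a simple glob. A sketch, assuming `DATA_DIR` maps to `./data` on the host as in the provided Compose file:

```python
from pathlib import Path

projects_dir = Path("./data") / "projects"  # host side of the ./data:/data mount

# Each rendered site sits at <owner>/<project>/<branch>/<provider>/<model>/site.
for site in sorted(projects_dir.glob("*/*/*/*/*/site")):
    owner, project, branch, provider, model = site.relative_to(projects_dir).parts[:5]
    print(f"{owner}/{project} [{branch}] {provider}/{model} -> {site / 'index.html'}")
```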

The included Compose setup is wired to the default path:

```yaml
volumes:
  - ./data:/data
env_file:
  - .env
```

So with the default configuration, your host machine ends up with:

- `./data/docsfy.db`
- `./data/projects/`

> **Warning:** If you change `DATA_DIR`, also change the container mount target. The provided Compose file assumes `DATA_DIR=/data` together with `./data:/data`.

> **Tip:** The mounted directory must be writable by the runtime user. The provided image is already prepared for `/data`, so keeping the default is the simplest container deployment.

## `SECURE_COOKIES`

`SECURE_COOKIES` controls whether the browser session cookie is marked `Secure`. It defaults to `true`, which is the right choice for HTTPS deployments.

From `src/docsfy/api/auth.py`:

```python
response.set_cookie(
    "docsfy_session",
    session_token,
    httponly=True,
    samesite="strict",
    secure=settings.secure_cookies,
    max_age=SESSION_TTL_SECONDS,
)
```

This tells you a lot about the browser session behavior:

- The cookie name is `docsfy_session`.
- It is always `HttpOnly`.
- It is always `SameSite=Strict`.
- Its lifetime is fixed in code by `SESSION_TTL_SECONDS = 28800`, which is 8 hours.
- `SECURE_COOKIES` controls only the `Secure` flag.

The test suite also verifies that the cookie stores an opaque session token, not the raw API key.

This setting matters for more than just page loads. Browser-side API requests use same-origin credentials, and browser WebSocket auth can also use the same `docsfy_session` cookie.

> **Warning:** If you are running over plain local HTTP, set `SECURE_COOKIES=false` during development. With `true`, the browser will not send the secure cookie back over `http://`, so login can appear not to stick.

> **Tip:** Keep `SECURE_COOKIES=true` anywhere the app is served over HTTPS.

> **Note:** The session lifetime itself is not configurable by environment variable in the current codebase. Only the `Secure` flag is.

## `HOST`, `PORT`, And `DEBUG`

These three variables are read directly by the packaged backend launcher in `src/docsfy/main.py`:

```python
def run() -> None:
    import uvicorn

    reload = os.getenv("DEBUG", "").lower() == "true"
    host = os.getenv("HOST", "127.0.0.1")
    port = int(os.getenv("PORT", "8000"))
    uvicorn.run("docsfy.main:app", host=host, port=port, reload=reload)
```

They only affect runs that use the `docsfy-server` entrypoint.

What they do:

- `HOST` changes the bind address and defaults to `127.0.0.1`.
- `PORT` changes the bind port and defaults to `8000`.
- `DEBUG=true` enables Uvicorn reload mode.

A subtle but important detail: these are not part of the `Settings` model above. They are read directly from the process environment.

> **Tip:** If you launch `docsfy-server` directly, set `HOST`, `PORT`, and `DEBUG` in the shell or service manager that starts the process.

> **Note:** The provided container entrypoint does not use these variables. It starts Uvicorn explicitly on `0.0.0.0:8000`.

## `DEV_MODE`

`DEV_MODE` is handled by `entrypoint.sh`, so it is a container startup switch rather than a normal app setting.

From `entrypoint.sh`:

```bash
if [ "$DEV_MODE" = "true" ]; then
    echo "DEV_MODE enabled - installing frontend dependencies..."
    cd /app/frontend || exit 1
    npm ci
    echo "Starting Vite dev server on port 5173..."
    npm run dev &
    VITE_PID=$!
    trap 'kill $VITE_PID 2>/dev/null; wait $VITE_PID 2>/dev/null' SIGTERM SIGINT
    cd /app
    echo "Starting FastAPI with hot reload on port 8000..."
    uv run --no-sync uvicorn docsfy.main:app \
        --host 0.0.0.0 --port 8000 \
        --reload --reload-dir /app/src
else
    exec uv run --no-sync uvicorn docsfy.main:app \
        --host 0.0.0.0 --port 8000
fi
```

With `DEV_MODE=true` inside the container:

- The entrypoint runs `npm ci`
- Vite starts on port `5173`
- FastAPI still runs on port `8000`
- The backend uses `--reload`

Without `DEV_MODE`:

- Only FastAPI starts
- The container serves the already-built frontend on port `8000`

The sample Compose file already shows the related toggles:

```yaml
ports:
  - "8000:8000"
  # Uncomment for development (DEV_MODE=true)
  # - "5173:5173"
volumes:
  - ./data:/data
  # Uncomment for development (hot reload)
  # - ./frontend:/app/frontend
environment:
  - ADMIN_KEY=${ADMIN_KEY}
  # Uncomment for development
  # - DEV_MODE=true
```

> **Warning:** `DEV_MODE` is a development convenience, not a production setting.

> **Note:** If you run `docsfy-server` directly on your host, `DEV_MODE` does nothing because `entrypoint.sh` is not involved.

## `API_TARGET`

`API_TARGET` is only used when you run the frontend dev server with Vite.

From `frontend/vite.config.ts`:

```ts
const API_TARGET = process.env.API_TARGET || 'http://localhost:8000'

export default defineConfig({
  server: {
    host: '0.0.0.0',
    port: 5173,
    proxy: {
      '/api': {
        target: API_TARGET,
        changeOrigin: true,
        ws: true,
      },
      '/docs': {
        target: API_TARGET,
        changeOrigin: true,
      },
      '/health': {
        target: API_TARGET,
        changeOrigin: true,
      },
    },
  },
})
```

This is what makes the split development workflow work cleanly:

- Open the browser at `http://localhost:5173`
- Let Vite proxy `/api`, `/docs`, and `/health` to the backend
- Let the `/api` proxy carry WebSocket traffic too because `ws: true` is enabled

If your backend is not running on `http://localhost:8000`, this is the variable to change for frontend development.

> **Tip:** `API_TARGET` is a frontend-development variable only. It does not affect the built frontend served by FastAPI and it does not change the backend’s own bind address.

## What Most Deployments Actually Need

For a normal deployment, the settings that matter most are:

- `ADMIN_KEY`
- `AI_PROVIDER`
- `AI_MODEL`
- `AI_CLI_TIMEOUT`
- `DATA_DIR`
- `SECURE_COOKIES`

For local frontend development, add:

- `DEV_MODE` if you want the container to run Vite for you
- `API_TARGET` if Vite should proxy to a backend other than `http://localhost:8000`

For direct host-local backend launches with `docsfy-server`, add:

- `HOST`
- `PORT`
- `DEBUG`

If you set only one thing first, set `ADMIN_KEY`. That is the one environment variable the server cannot run without.


## Related Pages

- [Installation](installation.html)
- [Local Development](local-development.html)
- [Deployment and Runtime](deployment-and-runtime.html)
- [Authentication and Roles](authentication-and-roles.html)
- [AI Provider Setup](ai-provider-setup.html)

---

Source: cli-configuration.md

# CLI Configuration

`docsfy` keeps its CLI connection settings in a TOML file in your home directory. You can save multiple servers there, choose one as the default, and override any part of the connection from the command line when you need a one-off change.

> **Note:** This page is about the CLI profile file in your home directory. It is separate from the server's own environment settings.

## Where the file lives

The CLI reads:

`~/.config/docsfy/config.toml`

Create it with:

```shell
docsfy config init
```

If the file does not exist, commands that need a server connection will fail until you either create the file or pass connection settings on the command line.

When `docsfy config init` writes the config, it creates the directory with owner-only permissions and writes the file as owner-read/write only.

> **Warning:** `~/.config/docsfy/config.toml` contains API keys. Keep it private.

## File format

The repository includes this example in `config.toml.example`:

```toml
# docsfy CLI configuration
# Copy to ~/.config/docsfy/config.toml or run: docsfy config init
#
# SECURITY: This file contains passwords. Keep it private:
#   chmod 600 ~/.config/docsfy/config.toml

# Default server to use when --server is not specified
[default]
server = "dev"

# Server profiles -- add as many as you need
[servers.dev]
url = "http://localhost:8000"
username = "admin"
password = "<your-dev-key>"

[servers.prod]
url = "https://docsfy.example.com"
username = "admin"
password = "<your-prod-key>"

[servers.staging]
url = "https://staging.docsfy.example.com"
username = "deployer"
password = "<your-staging-key>"
```

How to read this file:

- `[default].server` chooses which saved profile is used when you do not pass `--server`.
- Each `[servers.<name>]` table defines one server profile.
- A profile has three fields: `url`, `username`, and `password`.

> **Note:** This file does not store a default AI provider or model. If you omit `--provider` or `--model` on `docsfy generate`, the server uses its own configured defaults.

## Creating and updating profiles

`docsfy config init` is the easiest way to create a profile. It prompts for:

- `Profile name`
- `Server URL`
- `Username`
- `Password`

A few details matter:

- If you press Enter at `Profile name`, the default name is `dev`.
- On the first run, the created profile also becomes the default server.
- If you run `docsfy config init` again later, it adds another profile but keeps the existing default.

To inspect the current file:

```shell
docsfy config show
```

`config show` prints the config file path, shows which profile is the default, lists every saved profile, and masks the password in its output.

To update values in place, use `config set`. These are real examples from the codebase tests:

```shell
docsfy config set default.server prod
docsfy config set servers.dev.url https://new-server.com
docsfy config set servers.dev.password new-password
```

`config set` only accepts keys that start with `default.` or `servers.`. It also requires the config file to exist already, so run `docsfy config init` at least once first.

> **Warning:** `docsfy config set default.server prod` updates the TOML value, but it does not verify that `servers.prod` exists. If the default points to a missing profile, later commands will fail when they try to use it.

> **Tip:** Use `docsfy config init` to create a complete new profile. Use `docsfy config set` when you just want to change one value.

## How default server selection works

The CLI resolves connection settings in this order:

1. Explicit command-line flags: `--host`, `--port`, `--username`, `--password`
2. The profile named by `--server` / `-s`
3. The profile named by `[default].server`
4. If none of those resolves to a server, the command exits with an error

In practice, that means:

- `--server` lets you temporarily switch to a different saved profile.
- If you do not pass `--server`, the CLI uses `[default].server`.
- If you have no saved config yet, you can still connect by passing enough command-line flags directly.
- If you name a profile that does not exist, the CLI exits and shows the available profile names.
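
For example, assuming the profiles from the sample file above, a quick health check can target either the default profile or a named one:

```shell
# Uses the profile named by [default].server
docsfy health

# Temporarily uses the "prod" profile instead
docsfy --server prod health
```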

## Global override flags

These are global connection options:

- `--server`, `-s`: choose a saved server profile
- `--host`: override the host from the saved profile
- `--port`: override the port
- `--username`, `-u`: override the username
- `--password`, `-p`: override the password/API key

Use them before the subcommand. This is a real test example:

```shell
docsfy --host myhost --port 9000 -u admin -p key health
```

A few override rules are worth knowing:

- If you use `--host` without `--port`, the CLI uses port `8000`.
- If you override only one field, the rest still come from the selected profile.
- If you override `--host`, the CLI rebuilds the URL from the host and port.
- When a selected profile URL starts with `http://`, overriding only the host keeps `http`.
- If there is no profile URL to borrow a scheme from, the rebuilt URL defaults to `https`.

> **Tip:** For local development, save your dev profile with an `http://...` URL. Then a host-only override keeps `http` instead of switching to `https`.

## Credentials and authentication

Each profile stores both `username` and `password`, but the CLI authenticates API requests with the `password` field as a Bearer token.

This is the exact client setup from the code:

```python
# username is stored for display/debugging; auth uses password as Bearer token
self.username = username
self.password = password
self._client = httpx.Client(
    base_url=self.server_url,
    headers={"Authorization": f"Bearer {self.password}"},
    timeout=30.0,
    follow_redirects=False,
)
```

What that means for you:

- Treat `password` as the API key for that server.
- The `username` field is still stored in the profile and shown by `docsfy config show`.
- The same saved API key is also used for `docsfy generate --watch`, which connects to the WebSocket progress endpoint.

If you rotate a user's API key on the server, update the matching `password` value in `~/.config/docsfy/config.toml`.
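
For example, if an admin rotates the key behind your `prod` profile, a single `config set` call keeps the saved profile in sync (the key shown here is a placeholder):

```shell
docsfy config set servers.prod.password docsfy_<new-key>
```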

## Troubleshooting

If a config-related command fails, these are the most common causes:

- `Config not found`: run `docsfy config init`.
- `No server configured`: set `[default].server`, use `--server`, or pass `--host` and credentials directly.
- `Server profile '...' not found`: fix the `--server` value or your `[default].server` setting.
- TOML parse error: fix the syntax in `~/.config/docsfy/config.toml` and run the command again.


## Related Pages

- [CLI Workflows](cli-workflows.html)
- [CLI Command Reference](cli-command-reference.html)
- [Installation](installation.html)
- [Authentication and Roles](authentication-and-roles.html)
- [Authentication API](auth-api.html)

---

Source: authentication-and-roles.md

# Authentication and Roles

`docsfy` uses one authentication model across the web UI, API, and CLI:

- The built-in bootstrap admin signs in as `admin` with the `ADMIN_KEY` environment variable.
- Everyone else is a database-backed user with a role of `viewer`, `user`, or `admin`.
- The browser exchanges a username + API key for a session cookie.
- API and CLI clients can send the same API key directly as a Bearer token.

> **Tip:** In the web UI, the field is labeled **Password**, but the backend and CLI call the same secret an **API key**. They are the same credential.

## Two kinds of admin access

There are two ways to have admin privileges in `docsfy`:

1. The built-in `admin` account, configured from `ADMIN_KEY`
2. A database-backed user whose role is `admin`

They can both use the admin panel and admin API. The main difference is credential management: database-backed admins can rotate their own keys through the app, while the built-in `admin` account is rotated by changing `ADMIN_KEY`.

## Built-in admin login

The built-in admin account is how you bootstrap a new deployment. It is not created from the admin panel and it is not looked up in the `users` table. Instead, the server reads `ADMIN_KEY` from the environment at startup, and login grants admin access only when the username is exactly `admin`.

From `.env.example`:

```env
# Required: Admin password (minimum 16 characters)
ADMIN_KEY=

# Data directory for database and generated docs
DATA_DIR=/data

# Cookie security (set to false for local HTTP development)
SECURE_COOKIES=true
```

From `src/docsfy/api/auth.py`:

```python
# Check admin -- username must be "admin" and key must match
if username == "admin" and hmac.compare_digest(api_key, settings.admin_key):
    is_admin = True
    authenticated = True
    role = "admin"
else:
    # Check user key -- verify username matches the key's owner
    user = await get_user_by_key(api_key)
    if user and user["username"] == username:
        authenticated = True
        role = str(user.get("role", "user"))
        if role == "admin":
            is_admin = True
```

The server also validates `ADMIN_KEY` on startup. If it is missing or shorter than 16 characters, `docsfy` exits instead of starting with a broken auth setup.

> **Warning:** The username `admin` is reserved. You cannot create a database-backed user named `admin`, `Admin`, or `ADMIN`.

## Database-backed users

Admins can create additional users from the admin panel or the CLI. These users are stored in the SQLite database under `DATA_DIR`, with a role of `viewer`, `user`, or `admin`.

Raw API keys are not stored in the database. Instead, `docsfy` stores a keyed HMAC hash of the API key, and auto-generated keys use a `docsfy_` prefix.

From `src/docsfy/storage.py`:

```python
def hash_api_key(key: str, hmac_secret: str = "") -> str:
    """Hash an API key with HMAC-SHA256 for storage.

    Uses ADMIN_KEY as the HMAC secret so that even if the source is read,
    keys cannot be cracked without the environment secret.
    """
    # NOTE: ADMIN_KEY is used as the HMAC secret. Rotating ADMIN_KEY will
    # invalidate all existing api_key_hash values, requiring all users to
    # regenerate their API keys.
    secret = hmac_secret or os.getenv("ADMIN_KEY", "")
    if not secret:
        msg = "ADMIN_KEY environment variable is required for key hashing"
        raise RuntimeError(msg)
    return hmac.new(secret.encode(), key.encode(), hashlib.sha256).hexdigest()


def generate_api_key() -> str:
    """Generate a random API key."""
    return f"docsfy_{secrets.token_urlsafe(32)}"
```

In practice, that means:

- When a user is created or their key is rotated, the raw key is shown once and should be saved immediately.
- If you lose a key, rotate it. The server cannot show you the old one.
- Auto-generated keys start with `docsfy_`.
- If you change `ADMIN_KEY`, existing database-backed user keys stop matching until those users are issued new keys.

> **Note:** The admin UI and CLI both treat newly created or rotated credentials as one-time secrets: save them when they are shown.

## Browser login and session cookies

The browser does not send your API key on every request after login. Instead, it exchanges `username` + `api_key` for a session cookie and then uses that cookie for normal browsing.

From `frontend/src/pages/LoginPage.tsx`:

```ts
await api.post<AuthResponse>('/api/auth/login', {
  username,
  api_key: password,
})
```

From `src/docsfy/storage.py`:

```python
SESSION_TTL_SECONDS = 28800  # 8 hours
```

From `src/docsfy/api/auth.py`:

```python
response.set_cookie(
    "docsfy_session",
    session_token,
    httponly=True,
    samesite="strict",
    secure=settings.secure_cookies,
    max_age=SESSION_TTL_SECONDS,
)
```

What that means in practice:

- The cookie name is `docsfy_session`.
- Sessions last 8 hours.
- The cookie is `HttpOnly`, so browser JavaScript cannot read it directly.
- The cookie uses `SameSite=Strict`.
- The `Secure` flag is controlled by `SECURE_COOKIES`.
- Logout deletes the server-side session and clears the cookie.

The session token is opaque; it is not the raw API key. If a user is deleted or their key is rotated, their existing sessions stop working.

When you browse protected docs without a valid session, `docsfy` redirects HTML requests for `/docs/...` to `/login`. Protected API requests return `401 Unauthorized`.

> **Note:** `SECURE_COOKIES=true` is the default and is the right setting for HTTPS deployments. For local HTTP development, set `SECURE_COOKIES=false` or the browser will not send the cookie back.
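
If you want to see this exchange outside the browser, you can replay the login call with `curl`. This is a sketch against a local instance; substitute your own host and key:

```bash
curl -i -X POST http://localhost:8000/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{"username": "admin", "api_key": "<ADMIN_KEY>"}'
# A successful response carries a Set-Cookie header for docsfy_session
```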

## CLI and API clients

CLI and other non-browser clients usually skip the login endpoint and send the API key directly as a Bearer token.

From `config.toml.example`:

```toml
[servers.dev]
url = "http://localhost:8000"
username = "admin"
password = "<your-dev-key>"
```

From `src/docsfy/cli/client.py`:

```python
self._client = httpx.Client(
    base_url=self.server_url,
    headers={"Authorization": f"Bearer {self.password}"},
    timeout=30.0,
    follow_redirects=False,
)
```

This is why the same secret works in both places:

- In the web UI, you type it into the Password field and receive a session cookie.
- In the CLI, it is stored as `password` in `~/.config/docsfy/config.toml` and sent as a Bearer token.
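
A minimal sketch of the same key used directly as a Bearer token, assuming a local instance and an account that can list projects:

```bash
curl -H "Authorization: Bearer docsfy_<your-key>" \
  http://localhost:8000/api/projects
```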

The CLI configuration code writes that file with owner-only permissions.

> **Warning:** `~/.config/docsfy/config.toml` contains real credentials. Keep it private.

## Roles and permissions

### `viewer`

A `viewer` is read-only.

- Can sign in.
- Can view docs they own or docs an admin shared with them.
- Can download docs they are allowed to view.
- Can rotate their own API key.
- Cannot generate, regenerate, abort, or delete documentation.
- Cannot access the admin panel.

### `user`

A `user` is a normal write-enabled account.

- Can do everything a `viewer` can do.
- Can generate docs from remote repository URLs.
- Can regenerate, abort, and delete their own variants.
- Can see their own projects plus any projects explicitly shared with them.
- Cannot access the admin panel.
- Cannot use local filesystem `repo_path` generation; that is admin-only.

### `admin`

An `admin` has global visibility and user-management access.

- Can do everything a `user` can do.
- Can see all projects across all owners.
- Can access the admin panel.
- Can create and delete users.
- Can rotate any user's API key.
- Can grant and revoke project access.
- Can generate from local filesystem paths (`repo_path`).

> **Note:** Database-backed admins and the built-in `admin` account get the same admin permissions. The difference is rotation behavior: database-backed admins can rotate their own key in the app, while the built-in `admin` account is rotated by changing `ADMIN_KEY`.

> **Tip:** Role checks are enforced on the server, not just hidden in the UI. For example, `viewer` write requests are rejected even if someone manually calls the API.

## Shared access is owner-scoped

By default, non-admin users only see their own projects. Admins can grant access to another user's project, but the grant is scoped to both the project name and the owner.

That owner scoping matters because two different users can generate docs for repositories with the same name. A grant to Alice's `for-testing-only` project does not automatically grant access to Bob's `for-testing-only` project.

From `frontend/src/components/admin/AccessPanel.tsx`:

```ts
await api.post(`/api/admin/projects/${encodeURIComponent(grantProject.trim())}/access`, {
  username: grantUsername.trim(),
  owner: grantOwner.trim(),
})
```

A grant applies to all variants of that project name for that owner. Revoking access removes that shared visibility, and the restriction is enforced on direct URLs too, not just in the dashboard.

In practice:

- Admins see everything.
- Non-admins see their own projects.
- Non-admins also see projects listed in their access grants.
- If a user does not have access, project details, downloads, and docs URLs return `404` rather than leaking that the project exists.

> **Tip:** When an admin grants access, always use the correct owner. If multiple users generated the same repo name, the owner determines which copy is shared.
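
If you script grants over the API instead of using the admin panel, the call looks roughly like this. It is a sketch: the admin key, project name, owner, and username are placeholders:

```bash
curl -X POST http://localhost:8000/api/admin/projects/for-testing-only/access \
  -H "Authorization: Bearer <ADMIN_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"username": "bob", "owner": "alice"}'
```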

## Key rotation

Database-backed users can rotate their own keys. Admins can also rotate keys on behalf of other users.

From `src/docsfy/storage.py`:

```python
async def rotate_user_key(username: str, custom_key: str | None = None) -> str:
    """Generate or set a new API key for a user. Returns the raw new key."""
    if custom_key:
        validate_api_key(custom_key)
        raw_key = custom_key
    else:
        raw_key = generate_api_key()
    key_hash = hash_api_key(raw_key)
    async with aiosqlite.connect(DB_PATH) as db:
        cursor = await db.execute(
            "UPDATE users SET api_key_hash = ? WHERE username = ?",
            (key_hash, username),
        )
        if cursor.rowcount == 0:
            msg = f"User '{username}' not found"
            raise ValueError(msg)
        # Invalidate all existing sessions for this user
        await db.execute("DELETE FROM sessions WHERE username = ?", (username,))
        await db.commit()
    return raw_key
```

From `src/docsfy/api/auth.py`:

```python
response = JSONResponse(
    content={"username": username, "new_api_key": new_key},
    headers={"Cache-Control": "no-store"},
)
response.delete_cookie(
    "docsfy_session",
    httponly=True,
    samesite="strict",
    secure=settings.secure_cookies,
)
```

Rotation behaves like this:

- `POST /api/auth/rotate-key` rotates the current database-backed user's key.
- `POST /api/admin/users/{username}/rotate-key` lets an admin rotate another user's key.
- If you send `new_key`, it must be at least 16 characters long.
- If you omit `new_key`, `docsfy` generates a new `docsfy_...` key.
- The old key stops working immediately.
- All active sessions for that user are invalidated.
- The self-service browser flow clears the current session cookie, so the user must log in again with the new key.
- Rotation responses use `Cache-Control: no-store` because they contain sensitive credentials.

`viewer` accounts can rotate their own keys just like `user` accounts, and so can database-backed admins.
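
A self-service rotation from the command line might look like the sketch below, assuming the endpoint accepts the same Bearer authentication as other API routes. The response contains the new key exactly once, so capture it immediately:

```bash
curl -X POST http://localhost:8000/api/auth/rotate-key \
  -H "Authorization: Bearer docsfy_<current-key>"
```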

> **Warning:** The built-in `admin` account cannot use the rotate-key endpoint. If you are logged in as `admin` via `ADMIN_KEY`, rotate that credential by changing the `ADMIN_KEY` environment variable and restarting the server.

> **Warning:** Rotating `ADMIN_KEY` is a global event for database-backed users, because their stored API-key hashes are derived from it. Plan to reissue or rotate user keys after changing `ADMIN_KEY`.

## Practical guidance

Use the built-in `admin` account to bootstrap the system, then create database-backed users for day-to-day access.

A good pattern is:

- Keep `ADMIN_KEY` for break-glass access and initial administration.
- Create named database users for humans and service accounts.
- Give read-only people the `viewer` role.
- Give day-to-day writers the `user` role.
- Grant project access explicitly when someone needs to see another owner's docs.
- Rotate lost or exposed keys instead of trying to recover them, because raw keys are not stored.

If you keep those rules in mind, `docsfy`'s auth model stays simple: one secret per account, role-based permissions on top, and short-lived browser sessions for the UI.


## Related Pages

- [Authentication API](auth-api.html)
- [User and Access Management](user-and-access-management.html)
- [Security Considerations](security-considerations.html)
- [Projects, Variants, and Ownership](projects-variants-and-ownership.html)
- [Admin API](admin-api.html)

---

Source: data-storage-and-layout.md

# Data Storage and Layout

`docsfy` keeps persistent server-side state in one base directory controlled by `DATA_DIR`. By default, that directory is `/data`, so the two locations that matter most are `DATA_DIR/docsfy.db` and `DATA_DIR/projects/`. On startup, the server reads `DATA_DIR`, creates missing directories, initializes the SQLite database, and runs migrations automatically.

```1:17:.env.example
# Required: Admin password (minimum 16 characters)
ADMIN_KEY=

# AI provider and model defaults
# (pydantic_settings reads these case-insensitively)
AI_PROVIDER=cursor
AI_MODEL=gpt-5.4-xhigh-fast
AI_CLI_TIMEOUT=60

# Logging
LOG_LEVEL=INFO

# Data directory for database and generated docs
DATA_DIR=/data

# Cookie security (set to false for local HTTP development)
SECURE_COOKIES=true
```

```40:60:src/docsfy/storage.py
# Module-level paths are set at import time from env vars.
# Tests override these globals directly for isolation.
DB_PATH = Path(os.getenv("DATA_DIR", "/data")) / "docsfy.db"
DATA_DIR = Path(os.getenv("DATA_DIR", "/data"))
PROJECTS_DIR = DATA_DIR / "projects"

async def init_db(data_dir: str = "") -> None:
    if data_dir:
        DB_PATH = Path(data_dir) / "docsfy.db"
        DATA_DIR = Path(data_dir)
        PROJECTS_DIR = DATA_DIR / "projects"

    DB_PATH.parent.mkdir(parents=True, exist_ok=True)
    PROJECTS_DIR.mkdir(parents=True, exist_ok=True)
```

The included container setup persists that data by bind-mounting host `./data` into container `/data`:

```1:22:docker-compose.yaml
services:
  docsfy:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "8000:8000"
    volumes:
      - ./data:/data
    env_file:
      - .env
    environment:
      - ADMIN_KEY=${ADMIN_KEY}
    restart: unless-stopped
```

> **Note:** If you change `DATA_DIR`, update your container mount to match. The provided Compose file assumes the app writes to `/data`.

## Quick Reference

| Path | What lives there |
|---|---|
| `DATA_DIR/docsfy.db` | SQLite database for variants, users, sharing rules, and sessions |
| `DATA_DIR/projects/<owner>/<project>/<branch>/<provider>/<model>/plan.json` | Final documentation plan for one variant |
| `DATA_DIR/projects/<owner>/<project>/<branch>/<provider>/<model>/cache/pages/*.md` | Cached markdown pages for one variant |
| `DATA_DIR/projects/<owner>/<project>/<branch>/<provider>/<model>/site/` | Static HTML site served by the app and used for downloads |
| `~/.config/docsfy/config.toml` | Local CLI connection profiles on a user machine, not server runtime data |

```mermaid
flowchart TD
  DATA["DATA_DIR (/data by default)"]
  DB["docsfy.db"]
  PROJECTS["projects/"]
  OWNER["<owner> or _default"]
  PROJECT["<project>"]
  BRANCH["<branch>"]
  PROVIDER["<provider>"]
  MODEL["<model>"]
  PLAN["plan.json"]
  CACHE["cache/pages/*.md"]
  SITE["site/ (static HTML)"]

  DATA --> DB
  DATA --> PROJECTS --> OWNER --> PROJECT --> BRANCH --> PROVIDER --> MODEL
  MODEL --> PLAN
  MODEL --> CACHE
  MODEL --> SITE
```

## The SQLite Database

The SQLite database is the source of truth for variant metadata, user accounts, sharing rules, and browser sessions.

| Table | Purpose |
|---|---|
| `projects` | One row per generated docs variant. The key is `name + branch + ai_provider + ai_model + owner`, so the same repo can exist as separate branches and model/provider variants. |
| `users` | User accounts, roles, and hashed API keys. |
| `project_access` | Sharing rules for a project name owned by a specific owner. One grant applies across that project’s variants. |
| `sessions` | Browser login sessions with expiration times. |

The `projects` table shows how variant identity works:

```61:79:src/docsfy/storage.py
async with aiosqlite.connect(DB_PATH) as db:
    await db.execute(f"""
        CREATE TABLE IF NOT EXISTS projects (
            name TEXT NOT NULL,
            branch TEXT NOT NULL DEFAULT '{_SQL_DEFAULT_BRANCH}',
            ai_provider TEXT NOT NULL DEFAULT '',
            ai_model TEXT NOT NULL DEFAULT '',
            owner TEXT NOT NULL DEFAULT '',
            repo_url TEXT NOT NULL,
            status TEXT NOT NULL DEFAULT 'generating',
            current_stage TEXT,
            last_commit_sha TEXT,
            last_generated TEXT,
            page_count INTEGER DEFAULT 0,
            error_message TEXT,
            plan_json TEXT,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
            updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
            PRIMARY KEY (name, branch, ai_provider, ai_model, owner)
        )
    """)
```

API keys and session tokens are not stored in raw form:

```662:675:src/docsfy/storage.py
def hash_api_key(key: str, hmac_secret: str = "") -> str:
    """Hash an API key with HMAC-SHA256 for storage."""
    # NOTE: ADMIN_KEY is used as the HMAC secret.
    secret = hmac_secret or os.getenv("ADMIN_KEY", "")
    if not secret:
        msg = "ADMIN_KEY environment variable is required for key hashing"
        raise RuntimeError(msg)
    return hmac.new(secret.encode(), key.encode(), hashlib.sha256).hexdigest()
```

```773:790:src/docsfy/storage.py
def _hash_session_token(token: str) -> str:
    """Hash a session token for storage."""
    return hashlib.sha256(token.encode()).hexdigest()

async def create_session(
    username: str, is_admin: bool = False, ttl_hours: int = SESSION_TTL_HOURS
) -> str:
    """Create an opaque session token."""
    token = secrets.token_urlsafe(32)
    token_hash = _hash_session_token(token)
    expires_at = datetime.now(timezone.utc) + timedelta(hours=ttl_hours)
    expires_str = expires_at.strftime("%Y-%m-%d %H:%M:%S")
    async with aiosqlite.connect(DB_PATH) as db:
        await db.execute(
            "INSERT INTO sessions (token, username, is_admin, expires_at) VALUES (?, ?, ?, ?)",
            (token_hash, username, 1 if is_admin else 0, expires_str),
        )
```

> **Note:** You normally do not need a manual migration step. Server startup calls the database initializer automatically and upgrades older schemas before serving requests.

## Variant Directories Under `DATA_DIR/projects`

Each generated variant gets its own directory under `DATA_DIR/projects/`. The path is built from five values, in this order:

1. `owner`
2. `project`
3. `branch`
4. `provider`
5. `model`

The actual path builder is in `get_project_dir()`:

```515:581:src/docsfy/storage.py
def _validate_owner(owner: str) -> str:
    """Validate owner segment to prevent path traversal."""
    if not owner:
        return "_default"
    if "/" in owner or "\\" in owner or ".." in owner or owner.startswith("."):
        msg = f"Invalid owner: '{owner}'"
        raise ValueError(msg)
    return owner

def get_project_dir(
    name: str,
    ai_provider: str = "",
    ai_model: str = "",
    owner: str = "",
    branch: str = DEFAULT_BRANCH,
) -> Path:
    if not branch:
        msg = "branch is required for project directory paths"
        raise ValueError(msg)
    if not ai_provider or not ai_model:
        msg = "ai_provider and ai_model are required for project directory paths"
        raise ValueError(msg)

    safe_owner = _validate_owner(owner)
    return (
        PROJECTS_DIR
        / safe_owner
        / _validate_name(name)
        / branch
        / ai_provider
        / ai_model
    )

def get_project_site_dir(
    name: str,
    ai_provider: str = "",
    ai_model: str = "",
    owner: str = "",
    branch: str = DEFAULT_BRANCH,
) -> Path:
    return get_project_dir(name, ai_provider, ai_model, owner, branch) / "site"

def get_project_cache_dir(
    name: str,
    ai_provider: str = "",
    ai_model: str = "",
    owner: str = "",
    branch: str = DEFAULT_BRANCH,
) -> Path:
    return (
        get_project_dir(name, ai_provider, ai_model, owner, branch) / "cache" / "pages"
    )
```

A test in the repository asserts the exact segment order:

```848:854:tests/test_storage.py
async def test_get_project_dir_with_branch(db_path: Path) -> None:
    from docsfy.storage import PROJECTS_DIR, get_project_dir

    result = get_project_dir(
        "my-repo", ai_provider="claude", ai_model="opus", owner="user", branch="main"
    )
    assert result == PROJECTS_DIR / "user" / "my-repo" / "main" / "claude" / "opus"
```

Inside a finished variant directory, the main files are:

- `plan.json`: the final documentation plan for that variant.
- `cache/pages/*.md`: the per-page markdown cache.
- `site/`: the generated static site.

`plan.json` is written into the variant directory after generation completes:

```1054:1057:src/docsfy/api/projects.py
project_dir = get_project_dir(
    project_name, ai_provider, ai_model, owner, branch=branch
)
(project_dir / "plan.json").write_text(json.dumps(plan, indent=2), encoding="utf-8")
```

Tests also show the expected `cache/` and `site/` contents for a variant:

```602:608:tests/test_main.py
old_cache_dir = get_project_cache_dir("test-repo", "gemini", "flash", "admin")
old_cache_dir.mkdir(parents=True, exist_ok=True)
(old_cache_dir / "introduction.md").write_text("# Introduction\n\nGemini intro\n")

old_site_dir = get_project_site_dir("test-repo", "gemini", "flash", "admin")
old_site_dir.mkdir(parents=True, exist_ok=True)
(old_site_dir / "index.html").write_text("<html>Gemini docs</html>")
```

> **Note:** If you ever see `_default` as the owner directory name, docsfy is using the safe on-disk fallback for an empty stored owner. In normal authenticated use, you will usually see the real username instead.

## How Branch, Provider, and Model Shape Paths

Branch, provider, and model are not just metadata. They directly determine both the database key and the filesystem path.

- Changing the branch gives you a different variant directory.
- Changing the provider gives you a sibling variant directory.
- Changing the model gives you a sibling variant directory under that provider.
- Omitting the branch defaults it to `main`.

Branch names are validated because they become path segments:

```29:47:src/docsfy/models.py
branch: str = Field(
    default=DEFAULT_BRANCH, description="Git branch to generate docs from"
)

@field_validator("branch")
@classmethod
def validate_branch(cls, v: str) -> str:
    if "/" in v:
        msg = (
            f"Invalid branch name: '{v}'. Branch names cannot contain slashes "
            "— use hyphens instead (e.g., release-1.x)."
        )
        raise ValueError(msg)
    if not re.match(r"^[a-zA-Z0-9][a-zA-Z0-9._-]*$", v):
        msg = f"Invalid branch name: '{v}'"
        raise ValueError(msg)
    if ".." in v:
        msg = f"Invalid branch name: '{v}'"
        raise ValueError(msg)
```

> **Warning:** Branch names cannot contain slashes. Use names like `release-1.x` or `v2.0`, not `release/1.x`.

The public docs URL uses the same `project / branch / provider / model` segments that the filesystem uses. The `owner` segment is resolved internally from the database:

```200:224:src/docsfy/main.py
@app.get("/docs/{project}/{branch}/{provider}/{model}/{path:path}")
async def serve_variant_docs(
    request: Request,
    project: str,
    branch: str,
    provider: str,
    model: str,
    path: str = "index.html",
) -> FileResponse:
    # ...
    proj = await _resolve_project(
        request,
        project,
        ai_provider=provider,
        ai_model=model,
        branch=branch,
    )

    proj_owner = str(proj.get("owner", ""))
    site_dir = get_project_site_dir(project, provider, model, proj_owner, branch=branch)
```

The matching download endpoint packages the `site/` directory for that same variant:

```1597:1623:src/docsfy/api/projects.py
@router.get("/projects/{name}/{branch}/{provider}/{model}/download")
async def download_variant(
    request: Request,
    name: str,
    branch: str,
    provider: str,
    model: str,
) -> StreamingResponse:
    # ...
    project_owner = str(project.get("owner", ""))
    site_dir = get_project_site_dir(name, provider, model, project_owner, branch=branch)
    if not site_dir.exists():
        raise HTTPException(status_code=404, detail="Site not found")
    return await _stream_tarball(site_dir, f"{name}-{branch}-{provider}-{model}")
```

> **Tip:** A variant download contains the built `site/` output, not the whole variant directory. The database, `plan.json`, and `cache/` stay on the server.

## Cache Behavior and Reuse

The page cache is stored per variant under `cache/pages/`. It is not a global cache shared across all projects or models.

That matters in a few practical cases:

- An incremental update can reuse unchanged page markdown from the existing variant directory.
- A full regeneration clears stale cached pages for the target variant.
- A non-force switch to a different provider/model can prefill a new sibling variant from the newest ready variant on the same branch.
- If the commit is unchanged during that cross-provider switch, docsfy can reuse both cached markdown and the finished `site/`, then delete the old variant after the replacement is ready.

The tests show that behavior directly:

```668:674:tests/test_main.py
assert (
    new_cache_dir / "introduction.md"
).read_text() == "# Introduction\n\nGemini intro\n"
assert (new_site_dir / "index.html").read_text() == "<html>Gemini docs</html>"
assert old_variant is None
assert not old_cache_dir.exists()
assert not old_site_dir.exists()
```

> **Tip:** Use `force=true` when you want a clean rebuild of the target variant instead of starting from cached artifacts.

## What Is Persistent and What Is Temporary

Not every file created during generation lives under `DATA_DIR`.

Persistent data:

- `docsfy.db`
- `projects/.../plan.json`
- `projects/.../cache/pages/*.md`
- `projects/.../site/`

Temporary data:

- remote repository clones
- temporary validation work directories
- temporary archive files used while streaming a download

Remote repos are cloned into a temporary directory, not into `DATA_DIR`:

```580:588:src/docsfy/api/projects.py
# Remote repository - clone to temp dir
if repo_url is None:
    msg = "repo_url must be provided for remote repositories"
    raise ValueError(msg)
with tempfile.TemporaryDirectory() as tmp_dir:
    repo_dir, commit_sha, _ = await asyncio.to_thread(
        clone_repo, repo_url, Path(tmp_dir), branch=branch
    )
```

The clone itself is a shallow `git clone --depth 1`:

```27:36:src/docsfy/repository.py
def clone_repo(
    repo_url: str, base_dir: Path, branch: str | None = None
) -> tuple[Path, str, str]:
    repo_name = extract_repo_name(repo_url)
    repo_path = base_dir / repo_name
    logger.info(f"Cloning {repo_name} to {repo_path}")
    clone_cmd = ["git", "clone", "--depth", "1"]
    if branch:
        clone_cmd += ["--branch", branch]
```

Validation work happens in a temporary directory that is removed afterward:

```244:269:src/docsfy/postprocess.py
job_id = str(uuid.uuid4())
job_dir = Path(tempfile.mkdtemp(prefix=f"docsfy-validation-{job_id}-"))

try:
    coroutines = [
        _validate_single_page(
            slug=slug,
            content=content,
            repo_path=repo_path,
            ai_provider=ai_provider,
            ai_model=ai_model,
            cache_dir=cache_dir,
            project_name=project_name,
            page_title=slug_meta.get(slug, {}).get("title", slug),
            page_description=slug_meta.get(slug, {}).get("description", ""),
            job_dir=job_dir,
            ai_cli_timeout=ai_cli_timeout,
        )
        for slug, content in pages.items()
    ]
    results = await run_parallel_with_limit(
        coroutines, max_concurrency=MAX_CONCURRENT_PAGES
    )
finally:
    shutil.rmtree(job_dir, ignore_errors=True)
```

Download archives are also temporary and are deleted after they are streamed:

```374:402:src/docsfy/api/projects.py
async def _stream_tarball(site_dir: Path, archive_name: str) -> StreamingResponse:
    """Create a tar.gz archive and stream it as a response."""
    tmp = tempfile.NamedTemporaryFile(suffix=".tar.gz", delete=False)
    tar_path = Path(tmp.name)
    tmp.close()

    def _create_archive() -> None:
        with tarfile.open(tar_path, mode="w:gz") as tar:
            tar.add(str(site_dir), arcname=archive_name)

    async def _stream_and_cleanup() -> AsyncIterator[bytes]:
        try:
            f = await asyncio.to_thread(open, tar_path, "rb")
            try:
                while True:
                    chunk = await asyncio.to_thread(f.read, _STREAM_CHUNK_SIZE)
                    if not chunk:
                        break
                    yield chunk
            finally:
                await asyncio.to_thread(f.close)
        finally:
            tar_path.unlink(missing_ok=True)
```

When you generate from `repo_path` instead of `repo_url`, docsfy uses that repository in place. It does not copy the repo into `DATA_DIR` first.

> **Tip:** In Docker, `repo_path` must exist inside the container. With the included Compose file, a repo placed under host `./data` is visible in the container under `/data`.

## What to Back Up

For a complete server backup, keep the whole `DATA_DIR` together.

- `docsfy.db` preserves variant metadata, users, access rules, and session state.
- `projects/` preserves the actual generated artifacts.

If you restore only one of those, the database and filesystem can drift out of sync.
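
A simple backup sketch, assuming the included Compose layout where host `./data` is the whole `DATA_DIR`. Stop the service first so the database and generated files stay consistent:

```bash
docker compose stop docsfy
tar czf docsfy-data-backup.tar.gz -C . data
docker compose start docsfy
```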

If you use the CLI from a developer workstation or admin machine, its connection profiles live in a separate, client-side file:

```14:43:src/docsfy/cli/config_cmd.py
CONFIG_DIR = Path.home() / ".config" / "docsfy"
CONFIG_FILE = CONFIG_DIR / "config.toml"

def _save_config(config: dict[str, Any]) -> None:
    """Write config to disk with secure permissions."""
    CONFIG_DIR.mkdir(parents=True, exist_ok=True)
    os.chmod(CONFIG_DIR, stat.S_IRWXU)
    with open(CONFIG_FILE, "wb") as f:
        tomli_w.dump(config, f)
    os.chmod(CONFIG_FILE, stat.S_IRUSR | stat.S_IWUSR)
```

> **Warning:** `~/.config/docsfy/config.toml` is not part of server runtime storage, but it may contain CLI credentials. Back it up carefully and keep its permissions private.


## Related Pages

- [Architecture and Runtime](architecture-and-runtime.html)
- [Projects, Variants, and Ownership](projects-variants-and-ownership.html)
- [Generated Output](generated-output.html)
- [Variants, Branches, and Regeneration](variants-branches-and-regeneration.html)
- [Deployment and Runtime](deployment-and-runtime.html)

---

Source: first-run-quickstart.md

# First Run Quickstart

`docsfy` ships with a built-in admin account and a browser UI. On a fresh install, the fastest path is:

1. Set `ADMIN_KEY` and local runtime settings.
2. Start the service.
3. Sign in as `admin`.
4. Create any additional users.
5. Start your first documentation generation.
6. Open the generated site or download it.

## Before You Start

Choose one runtime path:

- Docker Compose: the simplest first run.
- From source: Python `3.12+`, `uv`, Node.js, npm, Git, and at least one supported provider CLI.

The current codebase exposes these two entry points:

```toml
[project.scripts]
docsfy-server = "docsfy.main:run"
docsfy = "docsfy.cli.main:main"
```

> **Note:** `docsfy-server` runs the web service. `docsfy` is the client CLI for health checks, generation, and admin tasks.

## 1. Configure `.env`

Start from the shipped `.env.example`:

```dotenv
# Required: Admin password (minimum 16 characters)
ADMIN_KEY=

# AI provider and model defaults
# (pydantic_settings reads these case-insensitively)
AI_PROVIDER=cursor
AI_MODEL=gpt-5.4-xhigh-fast
AI_CLI_TIMEOUT=60

# Logging
LOG_LEVEL=INFO

# Data directory for database and generated docs
DATA_DIR=/data

# Cookie security (set to false for local HTTP development)
SECURE_COOKIES=true

# Development mode: starts Vite dev server on port 5173 alongside FastAPI
# DEV_MODE=true
```

Set these before your first launch:

- `ADMIN_KEY`: required, and it must be at least 16 characters.
- `SECURE_COOKIES=false`: use this for plain `http://localhost` development.
- `AI_PROVIDER` and `AI_MODEL`: optional server-wide defaults for new generations.

> **Warning:** If `ADMIN_KEY` is empty or shorter than 16 characters, the server exits at startup.

> **Warning:** If you keep `SECURE_COOKIES=true` on plain local HTTP, sign-in will appear to work but the browser session will not stick.

## 2. Start the Service

### Recommended: Docker Compose

The included Compose file already builds the app, publishes port `8000`, and persists data under `./data`:

```yaml
services:
  docsfy:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "8000:8000"
    volumes:
      - ./data:/data
    env_file:
      - .env
    environment:
      - ADMIN_KEY=${ADMIN_KEY}
    restart: unless-stopped
```

From the repository root:

```bash
docker compose up --build
```

Then check the health endpoint:

```bash
curl http://localhost:8000/health
```

Expected response:

```json
{"status":"ok"}
```

> **Tip:** On this path, your database and generated docs live on the host in `./data`, because Compose mounts `./data:/data`.

### Alternative: run from source

If you want to run without Docker, install the Python environment, build the frontend, then start the server:

```bash
uv sync --frozen --no-dev
cd frontend
npm ci
npm run build
cd ..
uv run docsfy-server
```

By default, `docsfy-server` binds to `127.0.0.1:8000`. You can override that at launch time:

```bash
HOST=0.0.0.0 PORT=8000 DEBUG=true uv run docsfy-server
```

> **Note:** When FastAPI serves the web app itself, it expects a built `frontend/dist`. Build the frontend before starting `docsfy-server`.

## First-Run Flow

```mermaid
sequenceDiagram
  participant You
  participant UI as Browser UI
  participant API as docsfy server
  participant DB as SQLite/data dir
  participant AI as Provider CLI

  You->>API: Start service
  API->>DB: Initialize database and storage
  You->>UI: Open /login
  UI->>API: POST /api/auth/login
  API-->>UI: Session cookie
  You->>UI: Create users if needed
  UI->>API: POST /api/admin/users
  API->>DB: Store user and hashed key
  You->>UI: Start first generation
  UI->>API: POST /api/generate
  API->>AI: Check provider CLI and generate docs
  API->>DB: Track status and output paths
  API-->>UI: Live progress over /api/ws
  UI-->>You: Ready docs, download, and status details
```

## 3. Sign In as Admin

Open `http://localhost:8000/login`.

Use:

- Username: `admin`
- Password: the value of `ADMIN_KEY`

The browser UI submits the same fields the API expects:

```json
{
  "username": "admin",
  "api_key": "<ADMIN_KEY>"
}
```

A successful login creates a browser session cookie and takes you to the dashboard.

> **Note:** In the web UI the secret is labeled as a password. In the API and CLI, the same value is treated as an API key or bearer token.

> **Note:** The built-in `admin` account comes from the environment, not the database.

## 4. Create Users If You Need Them

If you are the only operator, you can skip this section and generate docs as `admin`.

If other people will use the system, open `Users` from the admin section of the dashboard and create named accounts. The available roles are:

| Role | What it is for | Can start generations | Can manage users and access |
| --- | --- | --- | --- |
| `admin` | full operators | Yes | Yes |
| `user` | day-to-day doc generation | Yes | No |
| `viewer` | read-only access to docs | No | No |

A user creation request uses this shape:

```json
{"username": "alice", "role": "user"}
```

If you prefer the CLI, configure it once with `uv run docsfy config init`, then use:

```bash
uv run docsfy admin users list
uv run docsfy admin users create alice --role user
```

When a user is created, `docsfy` returns a generated credential once. Auto-generated user keys start with `docsfy_`.

> **Warning:** Save the generated password/API key before you dismiss it. Creation and key rotation responses are intentionally marked `Cache-Control: no-store`.

> **Tip:** Start with `user` for most people. Use `viewer` for people who only need to read generated docs.

> **Tip:** If you want a second human administrator for day-to-day work, create a normal database-backed user with role `admin`.

If you generate documentation as one account and want another account to view it, share the project afterward from `Access`. Access is granted per project name and owner.

## 5. Start Your First Documentation Generation

From the dashboard, click `New Generation` and fill in:

- `Repository URL`
- `Branch`
- `Provider`
- `Model`
- `Force full regeneration`

The web app sends these fields when you click `Generate`:

```json
{
  "repo_url": "https://github.com/myk-org/for-testing-only",
  "branch": "main",
  "ai_provider": "gemini",
  "ai_model": "gemini-2.5-flash",
  "force": false
}
```

For a smooth first run:

- Use a normal remote Git URL over HTTPS or SSH.
- Choose both provider and model explicitly on a fresh system.
- Leave `Force full regeneration` off for the very first run unless you are retrying a failed attempt.

A known small test target used by the repository's own end-to-end plans is:

```text
https://github.com/myk-org/for-testing-only
```

> **Note:** Provider choices in the current codebase are `claude`, `gemini`, and `cursor`.

> **Note:** Model suggestions and branch suggestions come from completed generations. On a brand-new instance, it is normal to type the model manually.

> **Warning:** Branch names cannot contain `/`. Use names like `main`, `dev`, or `release-1.x`, not `release/1.x`.

> **Warning:** For a normal first run, prefer `repo_url`. Local `repo_path` generation is admin-only, and in Docker the path must exist inside the container filesystem.

## 6. Watch Progress

As soon as the request is accepted, the dashboard creates a variant and switches to live progress updates over `/api/ws`.

The statuses you will see are:

- `generating`
- `ready`
- `error`
- `aborted`

A typical first run moves through these stages:

- `cloning`
- `planning`
- `generating_pages`
- `validating`
- `cross_linking`
- `rendering`

If the project already matches the last generated commit, `docsfy` can finish immediately as up to date.

> **Note:** The progress bar only appears after planning, because the server does not know the total page count until the documentation plan exists.

> **Tip:** If the status stays `Generating` after all pages are counted, the run is usually in one of the final stages such as validation, cross-linking, or rendering. It is only finished when the status changes to `Ready`.

## 7. Open or Download the Result

When the run reaches `Ready`, the detail view shows:

- the final page count
- the last generated time
- the commit SHA
- `View Documentation`
- `Download`

Variant-specific URLs follow this pattern:

```text
/docs/<project>/<branch>/<provider>/<model>/
/api/projects/<project>/<branch>/<provider>/<model>/download
```

If you used the default Compose setup, the rendered site is also written to disk here:

```text
./data/projects/<owner>/<project>/<branch>/<provider>/<model>/site/
```

That same `./data` directory also contains the SQLite database at `./data/docsfy.db`.

> **Note:** Generated docs are authenticated routes. Open them from a logged-in browser session or an API client that sends valid credentials.

## 8. Common First-Run Problems

If the first run does not work, check these first:

- The service exits immediately: `ADMIN_KEY` is missing or too short.
- Login does not persist in the browser: `SECURE_COOKIES` is still `true` while you are using plain `http://localhost`.
- Generation fails right away: the selected provider CLI is not usable in that runtime environment.
- The request is rejected before generation starts: the repository URL is invalid, points to a private network, or is not reachable from the server.
- The `Generate` action is missing: the signed-in account is a `viewer`, which cannot start or regenerate documentation.

> **Tip:** The Docker image installs the Claude, Cursor, and Gemini CLIs during the image build. That gives you the executables, but you still need a provider/model combination that is usable from that runtime.

Once your first generation succeeds, the next practical step is to create named `user` accounts for everyday work and keep the built-in `admin` account for setup, sharing, and recovery.


## Related Pages

- [Docker and Compose Quickstart](docker-quickstart.html)
- [User and Access Management](user-and-access-management.html)
- [Generating Documentation](generating-documentation.html)
- [Tracking Progress and Status](tracking-progress-and-status.html)
- [Viewing, Downloading, and Hosting Docs](viewing-downloading-and-hosting-docs.html)

---

Source: generating-documentation.md

# Generating Documentation

`docsfy` turns a Git repository into a static documentation variant. Each variant is scoped by repository name, branch, AI provider, AI model, and owner, so the same repository can have separate outputs for `main` and `dev`, or for `gemini` and `cursor`.

The project name comes from the repository URL automatically. For example, `https://github.com/myk-org/for-testing-only` becomes `for-testing-only`.

## What You Need

- A `user` or `admin` account. `viewer` accounts can read docs but cannot start or regenerate them.
- A remote Git URL in HTTPS or `git@...` form.
- A provider/model combination that the server can actually run.

Accepted URL shapes in the current request model include:

```text
https://github.com/org/repo.git
https://github.com/org/repo
git@github.com:org/repo.git
```

> **Note:** If you use an SSH URL, the machine running `docsfy` must already have working Git credentials for that host.

> **Warning:** This page covers generation from a Git repository URL. Local `repo_path` generation exists in the API, but it is an admin-only workflow.

> **Warning:** Repository URLs that point to `localhost` or private-network addresses are rejected.

## Start a New Generation in the Web App

Use the `New Generation` form in the dashboard:

1. Enter the repository URL.
2. Choose the branch.
3. Choose the AI provider.
4. Choose or type the model name.
5. Decide whether to enable `Force full regeneration`.
6. Click `Generate`.

The web app submits these fields to the server:

```ts
await api.post('/api/generate', {
  repo_url: submittedRepoUrl,
  branch: submittedBranch,
  ai_provider: submittedProvider,
  ai_model: submittedModel,
  force: submittedForce,
})
```

A few details are worth knowing:

- The branch field starts at `main`.
- The provider field starts at `cursor`.
- Branch suggestions come from previous ready generations of the same repository name.
- Model suggestions come from previous ready variants for the selected provider.
- Both branch and model inputs still let you type a new value.

> **Tip:** The web form remembers the repository URL, branch, and Force checkbox for the current browser session.

## Generate From the CLI

If you have not configured the CLI yet, run `docsfy config init` first.

`docsfy generate` accepts the same core settings as the web app: `--branch` / `-b`, `--provider`, `--model` / `-m`, `--force` / `-f`, and `--watch` / `-w`.

Real CLI examples from the repository's own test plans:

```bash
docsfy generate https://github.com/myk-org/for-testing-only --provider gemini --model gemini-2.5-flash --force
```

```bash
docsfy generate https://github.com/myk-org/for-testing-only --branch dev --provider gemini --model gemini-2.5-flash --force --watch
```

`--watch` keeps the CLI attached to the run and prints stage output until the variant is ready, fails, or is aborted.

> **Note:** The CLI stores connection profiles in `~/.config/docsfy/config.toml` and sends the configured password or API key as a Bearer token.

## Generate Through the API

`POST /api/generate` accepts `repo_url`, `branch`, `ai_provider`, `ai_model`, `ai_cli_timeout`, and `force`.

A real API example from the test plans:

```bash
curl -s -X POST http://localhost:8800/api/generate \
  -H "Authorization: Bearer <TEST_USER_PASSWORD>" \
  -H "Content-Type: application/json" \
  -d '{"repo_url":"https://github.com/myk-org/for-testing-only","ai_provider":"gemini","ai_model":"gemini-2.5-flash"}'
```

A successful request returns `202 Accepted` and starts generation asynchronously. The response includes the derived project name, the current status, and the resolved branch.

> **Note:** `ai_cli_timeout` is an API-only advanced option for overriding the server's default AI CLI timeout on a single request.

## Choose Branch, Provider, and Model

Actual defaults in the request model are:

```python
force: bool = Field(
    default=False, description="Force full regeneration, ignoring cache"
)
branch: str = Field(
    default=DEFAULT_BRANCH, description="Git branch to generate docs from"
)
```

If you do not choose a branch, `docsfy` uses `main`.

Valid branch examples exercised in the test suite include:

```text
main
dev
v2.0
release-v2.0
v2.0.1
```

> **Warning:** Branch names cannot contain `/`. Use `release-v2.0` instead of `release/v2.0`.

The supported providers in the current codebase are `claude`, `gemini`, and `cursor`.

If you do not set a provider or model, the server falls back to its configured defaults. The shipped defaults are:

```env
AI_PROVIDER=cursor
AI_MODEL=gpt-5.4-xhigh-fast
AI_CLI_TIMEOUT=60
```

Branch, provider, and model are part of the variant identity. That same combination appears in the exact docs and download URLs:

```text
/docs/for-testing-only/dev/gemini/gemini-2.5-flash/
/api/projects/for-testing-only/dev/gemini/gemini-2.5-flash/download
```

> **Warning:** If the selected branch does not exist on the remote, cloning fails and the variant ends in `error`.

> **Warning:** The selected provider CLI must be installed and authenticated on the server. If the server cannot run that provider or model, generation fails before docs are produced.

## Regenerating an Existing Variant

The variant detail pane has a `Regenerate Documentation` section. That flow keeps the current repository URL and branch, and lets you change provider, model, and Force.

This is the actual regenerate request:

```ts
await api.post('/api/generate', {
  repo_url: project.repo_url,
  branch: project.branch,
  ai_provider: provider,
  ai_model: model,
  force,
})
```

> **Note:** Regenerate does not change the branch in the current UI. To generate a different branch, start a new run from the `New Generation` form.

## Understanding Force Full Regeneration

`Force full regeneration` changes how much previous work `docsfy` is allowed to reuse.

With Force off:

- `docsfy` can finish immediately as up to date when the latest generated commit already matches the requested commit.
- `docsfy` can diff the repository, reuse the saved plan, keep cached pages that still apply, and regenerate only the affected pages.
- When you switch provider or model, `docsfy` can use the newest ready variant on the same branch as a baseline, even if that baseline used a different provider or model.

With Force on:

- `docsfy` clears the target variant's cached pages.
- The page count is reset to `0`.
- A full planning step runs again.
- Every page is regenerated from scratch.

Provider/model switches are where Force matters most:

- On a non-force run, a new provider/model variant can reuse the previous variant's artifacts and, after success, replace the older baseline variant.
- On a force run, the older variant is kept and the new provider/model variant is built separately from scratch.

> **Tip:** Use Force when you want a clean rebuild, when you suspect stale cached output, or when you want two provider/model variants to exist side by side instead of replacing the older one.

> **Tip:** In the current UI, Force defaults to off for `ready` variants and to on for `error` or `aborted` variants.

> **Warning:** Only one run can be active for the same project, branch, provider, and model at a time. Starting that exact variant again returns `409`.

## Generation Flow

```mermaid
flowchart TD
    A[Web app / CLI / API request] --> B[Resolve project name, branch, provider, and model]
    B --> C[Clone the requested branch]
    C --> D{Force full regeneration?}

    D -->|Yes| E[Clear target cache and run full planner]
    D -->|No| F{Ready baseline exists on same project and branch?}

    F -->|Same commit| G[Mark variant up to date]
    F -->|Changed commit with usable diff and saved plan| H[Incremental planning]
    F -->|No usable baseline| E

    H --> I[Regenerate affected pages and reuse cached pages]
    E --> J[Generate all pages]

    I --> K[Validate pages]
    J --> K
    K --> L[Add cross-links]
    L --> M[Render static site]
    M --> N[Ready variant]
```

In code, the pipeline uses stages such as `cloning`, `planning`, `incremental_planning`, `generating_pages`, `validating`, `cross_linking`, and `rendering`. A no-op run uses `up_to_date`.

## After Generation

Once the variant is ready, you can:

- open the generated site from the variant detail view
- download the rendered output as a `.tar.gz` archive
- inspect the variant later with `docsfy status <project>`

If you care about an exact branch/provider/model combination, use the full variant URL:

```text
/docs/for-testing-only/dev/gemini/gemini-2.5-flash/
```

> **Note:** The shorter `/docs/<project>/` route serves the most recently generated ready variant, not a branch-pinned one. Use the full variant URL when you want a stable link.

> **Note:** Docs and download routes are authenticated. Open them from a logged-in browser session, or use an API client that sends a valid Bearer token.
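
For scripted access, any HTTP client that sends the Bearer header works. A minimal Python sketch fetching a pinned variant page (URL and key are placeholders):

```python
import requests

BASE = "http://localhost:8800"
variant = "for-testing-only/dev/gemini/gemini-2.5-flash"

resp = requests.get(
    f"{BASE}/docs/{variant}/index.html",
    headers={"Authorization": "Bearer <YOUR_API_KEY>"},
    timeout=30,
)
resp.raise_for_status()
print(resp.text[:200])  # first few hundred characters of the rendered page
```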


## Related Pages

- [Variants, Branches, and Regeneration](variants-branches-and-regeneration.html)
- [Tracking Progress and Status](tracking-progress-and-status.html)
- [Projects API](projects-api.html)
- [CLI Workflows](cli-workflows.html)
- [AI Provider Setup](ai-provider-setup.html)

---

Source: tracking-progress-and-status.md

# Tracking Progress and Status

docsfy tracks each documentation variant at two levels:

- `status` tells you whether the variant is still running or has finished.
- `current_stage` tells you what docsfy is doing right now while the run is in progress.

The dashboard first loads a snapshot from `/api/projects`, then keeps that snapshot live over `/api/ws`. That is why the badge, activity log, page count, and sidebar can update without a manual refresh.

> **Note:** `ready` is a final status. It is not one of the in-flight stage names shown during generation.

## What the statuses mean

| Status | Meaning | What you will see |
| --- | --- | --- |
| `generating` | docsfy is still working | A blue generating badge, a live activity log, and a progress bar once the plan exists |
| `ready` | The docs finished successfully | A success or up-to-date message, final page count, commit, last-generated time, and buttons to open or download the docs |
| `error` | The run failed | An error message in the detail view and, if you have write access, a regenerate form |
| `aborted` | A user stopped the run | An aborted message in the detail view and, if you have write access, regenerate and delete options |

A variant can also become `ready` without a long full run. When docsfy sees that the selected commit does not need new docs, it uses `current_stage="up_to_date"` and the ready view says **Documentation is already up to date.**
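
If you track a run from your own scripts rather than the dashboard, those same two fields are enough. A minimal Python sketch that polls `/api/projects` until the variant leaves `generating` (the field names match the snapshot the dashboard itself consumes):

```python
import time

import requests  # third-party: pip install requests

def wait_until_done(base: str, token: str, name: str, branch: str,
                    provider: str, model: str) -> dict:
    """Poll /api/projects until the matching variant leaves `generating`."""
    headers = {"Authorization": f"Bearer {token}"}
    while True:
        data = requests.get(f"{base}/api/projects", headers=headers, timeout=30).json()
        for p in data["projects"]:
            if (p["name"], p["branch"], p["ai_provider"], p["ai_model"]) == (
                name, branch, provider, model,
            ):
                print(p["status"], p.get("current_stage"), p.get("page_count"))
                if p["status"] != "generating":
                    return p  # ready, error, or aborted
        time.sleep(10)
```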

## In-Flight Stages

The dashboard uses this exact stage list when it builds the activity log:

```26:34:frontend/src/lib/constants.ts
export const GENERATION_STAGES = [
  'cloning',
  'planning',
  'incremental_planning',
  'generating_pages',
  'validating',
  'cross_linking',
  'rendering',
] as const
```

Notice that the list ends at `rendering`. Terminal outcomes like `ready`, `error`, and `aborted` are handled separately from the in-flight stages.

Here is what each stage means in practice:

- `cloning`: docsfy is preparing the repository source. For remote repositories that means cloning; for admin-only local paths it is usually very brief.
- `planning`: docsfy is building a fresh documentation plan for the repository.
- `incremental_planning`: docsfy found an earlier ready variant and is deciding which pages actually need to change.
- `generating_pages`: markdown pages are being written or updated.
- `validating`: the generated pages go through a validation pass.
- `cross_linking`: docsfy adds cross-page links after validation.
- `rendering`: the generated content is turned into the final static site.

> **Tip:** `incremental_planning` usually means docsfy is reusing prior work instead of rebuilding everything from scratch.

> **Note:** `up_to_date` can appear in `current_stage`, but it is a special successful no-op marker, not one of the long-running in-flight stages above.

## WebSocket-Driven Updates

```mermaid
sequenceDiagram
  participant User
  participant Dashboard
  participant API as FastAPI API
  participant Worker as Generation Task
  participant DB as SQLite

  Dashboard->>API: GET /api/projects
  API->>DB: Read current variants
  API-->>Dashboard: Snapshot of projects

  Dashboard->>API: WebSocket /api/ws
  API-->>Dashboard: sync

  User->>Dashboard: Start generation
  Dashboard->>API: POST /api/generate
  API->>DB: Save variant with status=generating
  API->>Worker: Start background task

  loop While work is in progress
    Worker->>DB: Update current_stage, page_count, plan_json
    API-->>Dashboard: progress
  end

  alt Run finishes successfully
    Worker->>DB: Save terminal state and final metadata
    API-->>Dashboard: status_change
    API-->>Dashboard: sync
  else Run fails
    Worker->>DB: Save error_message
    API-->>Dashboard: status_change
    API-->>Dashboard: sync
  else User aborts
    Dashboard->>API: POST /api/projects/.../abort
    Worker->>DB: Save aborted state
    API-->>Dashboard: status_change
    API-->>Dashboard: sync
  end
```

On connect, the server sends an initial `sync`. While a run is still active, it sends `progress` updates. When a variant reaches `ready`, `error`, or `aborted`, it sends `status_change` and then a full `sync` so every open dashboard stays consistent.

```131:155:src/docsfy/api/projects.py
await update_project_status(project_name, ai_provider, ai_model, **ups_kwargs)

if status in _TERMINAL_STATUSES:
    await notify_status_change(
        gen_key=gen_key,
        status=status,
        page_count=page_count,
        last_generated=(
            datetime.now(UTC).strftime("%Y-%m-%d %H:%M:%S")
            if status == "ready"
            else None
        ),
        last_commit_sha=last_commit_sha,
        error_message=error_message,
    )
    await notify_sync()
else:
    await notify_progress(
        gen_key=gen_key,
        status=status,
        current_stage=current_stage if isinstance(current_stage, str) else None,
        page_count=page_count,
        plan_json=plan_json,
        error_message=error_message,
    )
```

Live updates are scoped to the people who should see them: admins, the project owner, and users who were granted access to that project.

On the browser side, the dashboard patches `progress` and `status_change` messages into its local project list. If a message arrives before that variant exists in memory, the dashboard falls back to a fresh `/api/projects` request so the UI still catches up cleanly.
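
You can consume the same feed outside the browser. Here is a minimal Python sketch using the third-party `websockets` library; it assumes the socket accepts the same Bearer token as the HTTP API, so verify that against your deployment before relying on it:

```python
import asyncio
import json

import websockets  # third-party: pip install websockets

async def watch(url: str = "ws://localhost:8800/api/ws",
                token: str = "<YOUR_API_KEY>") -> None:
    # Assumption: the WebSocket accepts the same Bearer token as the HTTP API.
    # websockets >= 14 uses additional_headers; older releases call it extra_headers.
    async with websockets.connect(
        url, additional_headers={"Authorization": f"Bearer {token}"}
    ) as ws:
        async for raw in ws:
            msg = json.loads(raw)
            if msg["type"] == "sync":
                print("sync:", len(msg["projects"]), "projects")
            elif msg["type"] == "progress":
                print("progress:", msg.get("current_stage"), msg.get("page_count"))
            elif msg["type"] == "status_change":
                print("terminal:", msg.get("status"))

asyncio.run(watch())
```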

### If the connection drops

The browser app does not leave the dashboard stale if the socket drops. It retries a few times, then falls back to polling `/api/projects` every 10 seconds.

```81:108:frontend/src/lib/websocket.ts
private attemptReconnect(): void {
  if (this.reconnectAttempts >= this.maxReconnectAttempts) {
    console.debug('[WS] Falling back to polling')
    this.startPolling()
    return
  }
  const delay = this.getBackoffDelay()
  this.reconnectAttempts++
  console.debug('[WS] Reconnecting, attempt', this.reconnectAttempts)
  this.reconnectTimer = setTimeout(() => this.connect(true), delay)
}

private startPolling(): void {
  if (this.pollingTimer) return
  this.pollingTimer = setInterval(async () => {
    try {
      const data = await api.get<ProjectsResponse>('/api/projects')
      const syncMessage: WebSocketMessage = {
        type: 'sync' as const,
        projects: data.projects,
        known_models: data.known_models,
        known_branches: data.known_branches,
      }
      this.handlers.forEach(handler => handler(syncMessage))
    } catch {
      /* ignore polling errors */
    }
  }, WS_POLLING_FALLBACK_MS)
}
```

In the current code, the backend heartbeat interval is 30 seconds, the pong timeout is 10 seconds, the server closes the connection after 2 missed pongs, and the browser polling fallback interval is 10 seconds (10000 ms).

> **Tip:** If the dashboard stops updating instantly but refreshes within about 10 seconds, the polling fallback is probably working exactly as designed.

If you run the frontend through Vite in development, WebSocket proxying is already enabled:

```15:23:frontend/vite.config.ts
server: {
  host: '0.0.0.0',
  port: 5173,
  proxy: {
    '/api': {
      target: API_TARGET,
      changeOrigin: true,
      ws: true,
    },
```

> **Warning:** If you put docsfy behind another reverse proxy or load balancer, it also needs to allow WebSocket upgrades for `/api/ws`. Otherwise the browser will fall back to slower polling, and CLI live watch can fail entirely.

## Page Counts and the Progress Bar

`page_count` is not guessed in the browser. The generator updates it on the server after each markdown page is written by counting the cached page files on disk.

```326:343:src/docsfy/generator.py
# Update page count in DB if project_name provided
if project_name:
    from docsfy.storage import update_project_status

    # Count cached pages to get current total
    existing_pages = len(list(cache_dir.glob("*.md")))
    await update_project_status(
        project_name,
        ai_provider,
        ai_model,
        owner=owner,
        status="generating",
        page_count=existing_pages,
        branch=branch,
    )
    if on_page_generated is not None:
        try:
            await on_page_generated(existing_pages)
```

The progress bar uses that `page_count` together with the planned total number of pages from `plan_json`:

```459:499:frontend/src/components/shared/VariantDetail.tsx
const totalPages = getTotalPages(project.plan_json)
const progressPercent = totalPages > 0 ? Math.round((project.page_count / totalPages) * 100) : 0

// ...

{totalPages > 0 && (
  <div className="flex flex-col gap-1.5">
    <Progress value={progressPercent}>
      <span className="text-sm font-medium">Progress</span>
    </Progress>
    <span className="text-xs text-muted-foreground">
      {project.page_count} of {totalPages} pages ({progressPercent}%)
    </span>
  </div>
)}
```

That leads to a few useful rules of thumb:

- The progress bar only appears after planning is complete, because docsfy does not know the total page count before the plan exists.
- A force full regeneration resets the count to `0`.
- An incremental regeneration can start above `0`, because unchanged cached pages are reused immediately.
- Reaching `N of N pages` does not always mean the run is finished. Validation, cross-linking, and rendering happen after page generation.

> **Tip:** If you see `100%` but the badge still says `Generating`, docsfy is usually in `validating`, `cross_linking`, or `rendering`. Wait for the status to become `ready` before treating the docs as finished.

## How the Dashboard Reflects In-Flight Work

When you start a generation from the dashboard, docsfy immediately switches to that variant view instead of making you go find it manually. That selected variant also causes the matching repo and branch to auto-expand in the sidebar. If the new variant has not arrived over WebSocket quickly enough, the page does a fallback fetch to pull it in.

```271:304:frontend/src/pages/DashboardPage.tsx
// After generation starts, immediately select the new variant.
// If WebSocket sync doesn't deliver the variant within 5s, fetch via HTTP.
function handleGenerated(name: string, branch: string, provider: string, model: string) {
  console.debug('[Dashboard] Generate success:', name, branch, provider, model)
  setSelectedView({
    type: 'variant',
    name,
    branch,
    provider,
    model,
    owner: username,
  })
  // Clear any previous fallback timeout
  if (generatedTimeoutRef.current) {
    clearTimeout(generatedTimeoutRef.current)
  }
  generatedTimeoutRef.current = setTimeout(async () => {
    generatedTimeoutRef.current = null
    // Check if the variant is already in the projects list
    const found = projects.some(
      (p) => p.name === name && p.branch === branch && p.ai_provider === provider && p.ai_model === model && p.owner === username
    )
    if (!found) {
      console.debug('[Dashboard] New variant not yet in state, fetching via HTTP')
      try {
        const data = await api.get<ProjectsResponse>('/api/projects')
        setProjects(data.projects)
        setKnownModels(data.known_models)
        setKnownBranches(data.known_branches)
      } catch {
        /* best-effort fallback */
      }
    }
  }, WS_POLLING_FALLBACK_MS / 2) // 5s
}
```

The sidebar also makes in-flight work easy to scan even when repo groups are collapsed. Each repo row shows how many variants are `ready`, `generating`, `failed`, or `aborted`:

```355:394:frontend/src/components/shared/ProjectTree.tsx
{!isExpanded && (
  <div className="pl-5 mt-0.5 flex items-center gap-1 text-xs text-muted-foreground">
    <span>{totalVariants} variant{totalVariants !== 1 ? 's' : ''}</span>
    {(counts.ready ?? 0) > 0 && (
      <>
        <span>·</span>
        <span className="flex items-center gap-1">
          <StatusDot status="ready" showTitle={false} className="w-2 h-2" />
          <span className="text-green-500">{counts.ready} ready</span>
        </span>
      </>
    )}
    {(counts.generating ?? 0) > 0 && (
      <>
        <span>·</span>
        <span className="flex items-center gap-1">
          <StatusDot status="generating" showTitle={false} className="w-2 h-2" />
          <span className="text-blue-500">{counts.generating} generating</span>
        </span>
      </>
    )}
    {(counts.error ?? 0) > 0 && (
      <>
        <span>·</span>
        <span className="flex items-center gap-1">
          <StatusDot status="error" showTitle={false} className="w-2 h-2" />
          <span className="text-red-500">{counts.error} failed</span>
        </span>
      </>
    )}
    {(counts.aborted ?? 0) > 0 && (
      <>
        <span>·</span>
        <span className="flex items-center gap-1">
          <StatusDot status="aborted" showTitle={false} className="w-2 h-2" />
          <span className="text-amber-500">{counts.aborted} aborted</span>
        </span>
      </>
    )}
  </div>
)}
```

In the selected variant view, the main panel keeps the info grid, progress bar, activity log, and abort controls live as messages arrive. Once the plan is available, the activity log can show page-by-page entries using the real page titles from that plan.

## Ready, Error, and Aborted

### `ready`

A successful run ends in `ready`. The detail view shows the final page count, commit, last-generated time, and buttons to open or download the generated docs. If nothing changed, docsfy still uses `ready`, but the success message changes to **Documentation is already up to date.**

### `error`

An `error` status means docsfy could not finish. That can happen before generation starts, such as when the provider CLI is unavailable, or later in the pipeline during planning, page generation, validation, cross-linking, or rendering. The detail view shows the backend `error_message`, and if you have write access the regenerate form defaults to **Force full regeneration** so you can retry cleanly.

### `aborted`

An `aborted` status means a user stopped the run from the generating view. The variant stays visible after the abort, so you can still see what happened and choose whether to regenerate or delete it.

> **Warning:** docsfy does not let you delete a variant while it is actively generating. Abort it first, then delete it if you no longer need it.

If the server restarts while a variant is still marked `generating`, docsfy automatically converts that row to `error` so stale work does not look permanently in progress:

```202:208:src/docsfy/storage.py
# Reset orphaned "generating" projects from previous server run
cursor = await db.execute(
    "UPDATE projects SET status = 'error', error_message = 'Server restarted during generation', current_stage = NULL WHERE status = 'generating'"
)
if cursor.rowcount > 0:
    logger.info(
        f"Reset {cursor.rowcount} orphaned generating project(s) to error status"
    )
```

> **Tip:** `docsfy generate <repo> --watch` listens to the same `/api/ws` progress feed from the terminal. It is great for quick terminal feedback, but unlike the browser dashboard it does not fall back to polling if the WebSocket connection fails.

Tracking is simplest if you read the UI in this order: check the variant `status` first, then the `current_stage`, then the page counter. Page counts tell you how far page generation has gone, but a run is only finished when the status becomes `ready`.


## Related Pages

- [WebSocket Protocol](websocket-protocol.html)
- [Generating Documentation](generating-documentation.html)
- [Projects API](projects-api.html)
- [Variants, Branches, and Regeneration](variants-branches-and-regeneration.html)
- [Troubleshooting](troubleshooting.html)

---

Source: variants-branches-and-regeneration.md

# Variants, Branches, and Regeneration

In docsfy, a *variant* is one generated docs set for a specific repository, Git branch, AI provider, and AI model. Generate `main` with one model and `dev` with another, and docsfy treats those as different outputs with their own status, commit SHA, download link, and docs URL.

A few rules matter most:

- Omit `branch` and docsfy uses `main`.
- Omit `ai_provider` or `ai_model` and the server falls back to its configured defaults.
- Non-force runs try hard to reuse what is already known.
- Force runs skip reuse and rebuild from scratch.

## What A Variant Is

Branch, provider, and model are part of the public URL shape. That is why the same repository can have multiple independent variants at once.

Actual URLs used by the app and tests look like this:

```text
/docs/for-testing-only/dev/gemini/gemini-2.5-flash/
/api/projects/for-testing-only/dev/gemini/gemini-2.5-flash
/api/projects/for-testing-only/dev/gemini/gemini-2.5-flash/download
```

Each variant also has its own cache and rendered site on disk. For self-hosted installs, docsfy stores them under a branch/provider/model-specific directory, so `main` and `dev` do not overwrite each other.

> **Warning:** The shorter `/docs/<project>/` and `/api/projects/<name>/download` routes are not branch-pinned. They resolve to the most recently generated ready variant. If you care about a specific branch or model, use the full variant URL.

## Branches

Branch selection is part of the generation request, not an afterthought. In the request model, `branch` defaults to `main`, and `force` defaults to `false`:

```python
force: bool = Field(
    default=False, description="Force full regeneration, ignoring cache"
)
branch: str = Field(
    default=DEFAULT_BRANCH, description="Git branch to generate docs from"
)
```

That is why this real API example from the test plan creates a `main` variant even though it does not send a branch:

```shell
curl -s -X POST http://localhost:8800/api/generate \
  -H "Authorization: Bearer <TEST_USER_PASSWORD>" \
  -H "Content-Type: application/json" \
  -d '{"repo_url":"https://github.com/myk-org/for-testing-only","ai_provider":"gemini","ai_model":"gemini-2.5-flash"}'
```

When branch is omitted, the response includes `"branch": "main"`.

Branches are isolated from each other. A ready `main` variant can coexist with a `dev` variant that is still generating, and deleting one branch variant does not remove the other.

> **Warning:** docsfy rejects branch names that contain `/`. Use `release-1.x` instead of `release/1.x`. Branch names must be a single safe path segment because branch is part of the docs URL and API path.

If you request a branch that does not exist, generation fails for that variant and the variant ends in `error`. The branch-specific error is not silently ignored.

> **Tip:** In the web UI, the branch field is a combobox. It can suggest branches docsfy already knows about for that repo, but you can still type a new branch manually.
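
The only documented rule is that a branch must be a single safe path segment with no `/`. If you pre-validate branch names in your own tooling, a conservative check might look like the sketch below; the regex pattern is illustrative, not docsfy's actual validation:

```python
import re

# Hypothetical pattern: letters/digits first, then letters, digits, ., _, or -.
BRANCH_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*$")

def is_valid_branch(branch: str) -> bool:
    # The documented rule: no "/" because branch is a URL path segment.
    return "/" not in branch and bool(BRANCH_RE.match(branch))

assert is_valid_branch("release-1.x")
assert not is_valid_branch("release/1.x")
```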

## Provider And Model Variants

docsfy supports three providers in code: `claude`, `gemini`, and `cursor`. The model name is stored as part of the variant too, so `gemini/gemini-2.5-flash` and `gemini/gemini-2.0-flash` are separate variants even when they point at the same repository and branch.

If you omit provider or model in the API or CLI, the server uses its configured defaults. The default settings in the code are:

```python
class Settings(BaseSettings):
    admin_key: str = ""
    ai_provider: str = "cursor"
    ai_model: str = "gpt-5.4-xhigh-fast"
    ai_cli_timeout: int = Field(default=60, gt=0)
    log_level: str = "INFO"
    data_dir: str = "/data"
```

Those can be overridden with environment variables such as:

```bash
AI_PROVIDER=gemini
AI_MODEL=gemini-2.5-pro
AI_CLI_TIMEOUT=120
```

A real CLI example from the repository's E2E plan looks like this:

```shell
docsfy generate https://github.com/myk-org/for-testing-only --branch dev --provider gemini --model gemini-2.5-flash --force --watch
```

One important behavior is easy to miss: on a non-force run, docsfy does not limit itself to the exact same provider/model variant. It chooses the freshest ready variant on the same branch as a baseline, even if that baseline was generated with a different provider or model.

That baseline reuse works like this:

- If the baseline is a different provider/model variant and the requested commit is the same, docsfy can copy the existing artifacts directly and mark the new variant as up to date.
- If the commit changed, docsfy can still copy the baseline's cached content, reuse unchanged pages, and regenerate only what needs attention.
- After the new variant is ready, docsfy removes the reused baseline only when that baseline was a different provider/model variant.
- If you use `force`, docsfy does not do that replacement. The old variant stays, and the new one is built from scratch.

> **Tip:** If you want two provider/model outputs to coexist side by side for comparison, use `force` when creating the second one. A non-force provider/model switch may replace the baseline variant after the new one succeeds.

> **Note:** In the current UI, the "Regenerate Documentation" panel keeps the selected variant's branch. It lets you change provider, model, and `force`. To generate a different branch, start a new generation for that branch.

## How docsfy Decides Whether To Regenerate

On a non-force run, docsfy follows a simple sequence:

1. Pick a previous ready variant on the same branch as the baseline.
2. Compare the baseline commit SHA with the requested commit SHA.
3. If needed, fetch just enough Git history to diff the old and new commits.
4. Decide whether nothing changed, whether only some pages need updates, or whether a full rebuild is safer.

Here is the practical outcome:

| Situation | What docsfy does |
| --- | --- |
| Same commit SHA as the baseline | Marks the variant `ready` with `current_stage="up_to_date"` and skips planning and page generation |
| Different commit SHA, but Git diff shows no changed files | Treats the docs as up to date and skips regeneration |
| Different commit SHA, changed files, saved plan is available | Runs the incremental pipeline |
| `force=true` or no usable baseline/diff/plan | Falls back to a full regeneration from scratch |

An up-to-date run still ends in `ready`. The difference is that docsfy sets `current_stage` to `up_to_date` and returns before planning, page generation, validation, cross-linking, version detection, and rendering. In the ready view, that is what drives the "Documentation is already up to date." message.
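
That sequence can be written as a small decision function. This is a condensed sketch of the documented outcomes; the parameter names are illustrative, not docsfy's real internals:

```python
def choose_path(force: bool, baseline, requested_sha: str,
                changed_files: list[str] | None, saved_plan) -> str:
    """Map the documented decision table onto a single function."""
    if force or baseline is None:
        return "full_rebuild"          # forced, or nothing to reuse
    if baseline.commit_sha == requested_sha:
        return "up_to_date"            # same commit: skip planning entirely
    if changed_files is None:          # diff or history fetch failed
        return "full_rebuild"
    if not changed_files:              # commit moved but no files changed
        return "up_to_date"
    if saved_plan is None:             # no plan to reuse incrementally
        return "full_rebuild"
    return "incremental"
```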

## Incremental Updates

Incremental regeneration is not just "rerun everything faster." It is a diff-driven workflow.

When docsfy can take the incremental path, it:

- computes a Git diff between the previous and current commit,
- keeps the existing documentation plan when possible,
- asks an incremental planner which page slugs are affected,
- reuses cached pages that were not touched,
- updates only the pages that need changes.

Remote repositories are cloned shallowly first. When docsfy needs a diff, it fetches just enough history to reach the previous commit. If that fetch fails, incremental mode is abandoned and docsfy rebuilds fully.

The incremental planner can return three useful outcomes:

- `[]`: no page content needs to change, so cached pages are reused.
- A subset like `["introduction", "configuration"]`: only those pages are revisited.
- `["all"]`: every page is revisited, but the existing plan is still reused.

If the incremental planner fails or returns invalid output, docsfy treats that as `["all"]` rather than guessing.

That last case is important. If the incremental planner cannot safely narrow the change down, docsfy does **not** automatically re-plan the entire docs site. Instead, it keeps the old plan and revisits every page under that plan.

For page-level updates, docsfy uses a patch-style format instead of rewriting the whole page. This example is from the test suite:

```json
{
  "updates": [
    {
      "old_text": "## Configuration\n\nOld settings.\n",
      "new_text": "## Configuration\n\nNew settings.\n"
    }
  ]
}
```

That lets docsfy surgically replace only the changed block in an existing page. Unchanged pages stay cached, and even changed pages can often be updated without rewriting the rest of the document.
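
The safety rules listed later on this page (each `old_text` must match exactly once, and patch blocks must not overlap) make that replacement mechanical. Here is a minimal sketch of such an applier, not docsfy's actual code, where returning `None` signals "regenerate this page in full":

```python
def apply_page_updates(page: str, updates: list[dict]) -> str | None:
    """Apply patch-style updates; return None to signal a full-page rewrite."""
    spans = []
    for u in updates:
        old = u["old_text"]
        if page.count(old) != 1:       # anchor missing or ambiguous
            return None
        start = page.index(old)
        spans.append((start, start + len(old), u["new_text"]))
    spans.sort()
    for (_, end_a, _), (start_b, _, _) in zip(spans, spans[1:]):
        if start_b < end_a:            # overlapping patches are unsafe
            return None
    out, cursor = [], 0
    for start, end, new in spans:
        out.append(page[cursor:start])
        out.append(new)
        cursor = end
    out.append(page[cursor:])
    return "".join(out)
```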

Incremental updates also preserve doc structure when they can. The repository's E2E tests verify that after an incremental run, the saved `plan_json` can remain unchanged while page content updates to the new commit.

After page updates finish, docsfy still runs the same post-generation pipeline as a full run. It validates the resulting pages, adds cross-page links, detects a version when one can be found, and only then renders the final site. Incremental regeneration is therefore selective about which pages it rewrites, but not about the final quality checks that happen before the variant becomes `ready`.

## When docsfy Falls Back To A Full Regeneration

A full regeneration of the whole variant happens when docsfy cannot trust the incremental path. Specifically, that is when:

- you select `Force full regeneration` in the UI,
- you pass `--force` in the CLI,
- you send `"force": true` in the API request,
- there is no usable previous ready variant to compare against,
- docsfy cannot fetch the previous commit needed for diffing a shallow clone,
- `git diff` fails,
- there is no saved plan to reuse,
- the saved `plan_json` exists but cannot be parsed,
- copying baseline artifacts during a provider/model switch fails.

When docsfy does a full regeneration, it clears stale cached page files first so removed pages do not linger on disk or show up in the rendered site by accident.

There is also a smaller, page-level fallback inside an otherwise incremental or post-processing run.

During incremental page updates, if an individual page update cannot be safely applied, docsfy regenerates just that page in full. That happens when:

- the incremental page response is not valid JSON,
- an `old_text` block is missing from the existing page,
- an `old_text` block appears more than once and is not unique,
- patch blocks overlap,
- the AI call for that page fails.

During the later validation stage, docsfy can also regenerate a single page in full when it detects stale references in that page's content.

> **Note:** The post-generation pipeline is fail-soft. If validation or cross-linking fails, docsfy logs a warning and continues with the current page set instead of falling back to a whole-variant rebuild. Version detection is also best-effort, so rendering continues even when no version can be detected.

So there are really three levels of fallback:

- no work at all because the variant is already up to date,
- selective regeneration, including full rewrites of individual pages when needed,
- full regeneration of the whole variant when the reuse path is unsafe.

## What You Will See While It Runs

At any point in a variant's lifecycle, the main `status` value is one of:

- `generating`
- `ready`
- `error`
- `aborted`

During `generating`, docsfy also tracks a more specific `current_stage`. The front end tracks the active generation stages with this constant:

```ts
export const GENERATION_STAGES = [
  'cloning',
  'planning',
  'incremental_planning',
  'generating_pages',
  'validating',
  'cross_linking',
  'rendering',
] as const
```

A given run uses either `planning` or `incremental_planning`, not both. `validating` and `cross_linking` are post-generation stages that run after page writing succeeds and before the final render step, alongside best-effort version detection.

Those same stage names flow through WebSocket updates, so the dashboard activity log can show when docsfy has moved past page writing into post-processing.

`up_to_date` is different. It is not part of the active generation stage list. Instead, docsfy sets `status="ready"` with `current_stage="up_to_date"` when a non-force run discovers there is nothing to regenerate.

On a forced full regeneration, docsfy resets the page count to `0` before rebuilding. Once the plan is ready, the UI can show progress as generated pages count up toward the total pages in the plan.

If you use the CLI with `--watch`, or the web dashboard, those stage changes are how you can tell whether docsfy is taking the fast incremental path, running post-generation checks, or doing a full rebuild.

## Practical Examples

Generate a specific branch and model from the CLI:

```shell
docsfy generate https://github.com/myk-org/for-testing-only --branch dev --provider gemini --model gemini-2.5-flash --force --watch
```

Start a generation through the API and let branch default to `main`:

```shell
curl -s -X POST http://localhost:8800/api/generate \
  -H "Authorization: Bearer <TEST_USER_PASSWORD>" \
  -H "Content-Type: application/json" \
  -d '{"repo_url":"https://github.com/myk-org/for-testing-only","ai_provider":"gemini","ai_model":"gemini-2.5-flash"}'
```

Open or download an exact variant instead of "whatever is latest":

```text
/docs/for-testing-only/dev/gemini/gemini-2.5-flash/
/api/projects/for-testing-only/dev/gemini/gemini-2.5-flash/download
```

> **Tip:** When you bookmark docs, automate downloads, or share links with teammates, prefer the full variant URL. That keeps the result stable even after someone generates a newer branch or model for the same repository.


## Related Pages

- [Projects, Variants, and Ownership](projects-variants-and-ownership.html)
- [Generating Documentation](generating-documentation.html)
- [Tracking Progress and Status](tracking-progress-and-status.html)
- [Data Storage and Layout](data-storage-and-layout.html)
- [Projects API](projects-api.html)

---

Source: viewing-downloading-and-hosting-docs.md

# Viewing, Downloading, and Hosting Docs

A `ready` variant is one generated docs build for a specific project, branch, AI provider, and AI model. Once a variant is ready, you can browse it through docsfy, download it as a `.tar.gz`, or publish the generated static site somewhere else.

Project-level URLs are convenience URLs. They follow the latest ready variant you can access. Variant-specific URLs stay pinned to one exact build.

```mermaid
flowchart LR
  A["Ready variant"] --> B["/docs/{project}/..."]
  A --> C["/docs/{project}/{branch}/{provider}/{model}/..."]
  A --> D["/api/projects/{project}/download"]
  A --> E["/api/projects/{project}/{branch}/{provider}/{model}/download"]
  A --> F["site/ directory on disk"]
  D --> G[".tar.gz archive"]
  E --> G
  F --> H["Static host or web server"]
```

## Open Docs

Use a project-level route when you want the current docs for a repository:

- `/docs/<project>/`
- `/docs/<project>/index.html`

Use a variant-specific route when you need a stable link to one exact branch/provider/model build:

- `/docs/<project>/<branch>/<provider>/<model>/`
- `/docs/<project>/<branch>/<provider>/<model>/index.html`

The integration tests exercise both route shapes directly:

```127:149:tests/test_integration.py
# Check docs are served via variant-specific route
response = await client.get("/docs/test-repo/main/claude/opus/index.html")
assert response.status_code == 200
assert "test-repo" in response.text

response = await client.get("/docs/test-repo/main/claude/opus/introduction.html")
assert response.status_code == 200
assert "Welcome!" in response.text

# Check docs are served via latest-variant route
response = await client.get("/docs/test-repo/index.html")
assert response.status_code == 200
assert "test-repo" in response.text

# Download via variant-specific route
response = await client.get("/api/projects/test-repo/main/claude/opus/download")
assert response.status_code == 200
assert response.headers["content-type"] == "application/gzip"

# Download via latest-variant route
response = await client.get("/api/projects/test-repo/download")
assert response.status_code == 200
assert response.headers["content-type"] == "application/gzip"
```

The same docs base URL can serve other published files too: page HTML, `llms.txt`, `llms-full.txt`, and files under `assets/`. The end-to-end suite explicitly opens `llms.txt` and `llms-full.txt` at `/docs/for-testing-only/main/gemini/gemini-2.5-flash/llms.txt` and `/docs/for-testing-only/main/gemini/gemini-2.5-flash/llms-full.txt`.

> **Tip:** Use `/docs/<project>/` for a moving "current docs" link. Use the full variant URL for release docs, reviews, QA, or anything that must keep pointing to the same build.

> **Warning:** `/docs/<project>/` is not a shortcut for `main`. If a newer `ready` variant is generated for a different branch, provider, or model, that URL can start serving the newer build.

> **Note:** `/docs/...` requires authentication. Browser requests for HTML redirect to `/login` when you are signed out. API-style requests return `401`.

> **Note:** The branch name is part of the path. Branches like `main`, `dev`, and `release-1.x` work. Branch names with `/` do not.

## Download Tarballs

docsfy exposes two download endpoints:

- `/api/projects/<project>/download` for the latest ready variant you can access
- `/api/projects/<project>/<branch>/<provider>/<model>/download` for one exact variant

The server creates the archive directly from the rendered `site/` directory:

```374:409:src/docsfy/api/projects.py
async def _stream_tarball(site_dir: Path, archive_name: str) -> StreamingResponse:
    """Create a tar.gz archive and stream it as a response."""
    tmp = tempfile.NamedTemporaryFile(suffix=".tar.gz", delete=False)
    tar_path = Path(tmp.name)
    tmp.close()

    def _create_archive() -> None:
        with tarfile.open(tar_path, mode="w:gz") as tar:
            tar.add(str(site_dir), arcname=archive_name)

    # ... stream the file, then clean up the temporary archive ...

    return StreamingResponse(
        _stream_and_cleanup(),
        media_type="application/gzip",
        headers={
            "Content-Disposition": f'attachment; filename="{archive_name}-docs.tar.gz"'
        },
    )
```

That gives you predictable archive names:

- project-level download: `<project>-docs.tar.gz`
- variant-specific download: `<project>-<branch>-<provider>-<model>-docs.tar.gz`

When you extract the archive, it contains a single top-level folder named after the selected project or variant. A real e2e test downloads and extracts one of these archives like this:

```78:84:test-plans/e2e-08-cross-model-updates.md
curl -s -L -H "Authorization: Bearer $ADMIN_KEY" \
  "$SERVER/api/projects/for-testing-only/main/$BASELINE_PROVIDER/$BASELINE_MODEL/download" \
  -o "$CROSS_PROVIDER_ROOT/baseline.tar.gz"
mkdir -p "$CROSS_PROVIDER_ROOT/baseline"
tar -xzf "$CROSS_PROVIDER_ROOT/baseline.tar.gz" --strip-components=1 -C "$CROSS_PROVIDER_ROOT/baseline"
ls "$CROSS_PROVIDER_ROOT/baseline"
```

If you want the site files placed directly into a target directory, `tar --strip-components=1` is the simplest way to drop the archive's top-level folder during extraction.

> **Note:** Downloads only work for `ready` variants. An explicit variant download returns `400` until that variant is ready, and the project-level route returns `404` if no ready variant exists.

> **Note:** The archive contains the published `site/` bundle. It does not include internal generation files such as `plan.json` or `cache/pages/`.

## Download With The CLI

The CLI wraps the same download endpoints. The repository ships an example config file for `~/.config/docsfy/config.toml`:

```1:25:config.toml.example
# docsfy CLI configuration
# Copy to ~/.config/docsfy/config.toml or run: docsfy config init
#
# SECURITY: This file contains passwords. Keep it private:
#   chmod 600 ~/.config/docsfy/config.toml

# Default server to use when --server is not specified
[default]
server = "dev"

# Server profiles -- add as many as you need
[servers.dev]
url = "http://localhost:8000"
username = "admin"
password = "<your-dev-key>"

[servers.prod]
url = "https://docsfy.example.com"
username = "admin"
password = "<your-prod-key>"

[servers.staging]
url = "https://staging.docsfy.example.com"
username = "deployer"
password = "<your-staging-key>"
```

The download logic in the CLI shows the two main behaviors: variant selectors are all-or-nothing, and `--output` extracts instead of saving the `.tar.gz`:

```287:327:src/docsfy/cli/projects.py
# Require all variant selectors together, or none
variant_opts = [branch, provider, model]
if any(variant_opts) and not all(variant_opts):
    typer.echo(
        "Specify --branch, --provider, and --model together to download a specific variant, "
        "or omit all three to download the default variant.",
        err=True,
    )
    raise typer.Exit(code=1)

# ... choose latest or pinned download URL and archive name ...

if output:
    # Download to a temp file and extract
    with tempfile.NamedTemporaryFile(suffix=".tar.gz", delete=False) as tmp:
        tmp_path = Path(tmp.name)

    try:
        client.download(url_path, tmp_path)
        output_dir = Path(output)
        output_dir.mkdir(parents=True, exist_ok=True)
        with tarfile.open(tmp_path, "r:gz") as tar:
            tar.extractall(path=output_dir, filter="data")
        typer.echo(f"Extracted to {output_dir}")
    finally:
        tmp_path.unlink(missing_ok=True)
else:
    dest = Path.cwd() / archive_name
    client.download(url_path, dest)
    typer.echo(f"Downloaded to {dest}")
```

In practice:

- `docsfy download <project>` saves the latest ready archive in your current directory.
- Add `--branch`, `--provider`, and `--model` together to pin the download to one exact variant.
- Add `--output <dir>` to extract the archive instead of keeping the tarball.

> **Note:** Keep `~/.config/docsfy/config.toml` private. The shipped example explicitly warns that it contains passwords.

> **Note:** The CLI extracts the archive as-is. If the archive contains a top-level folder, that folder remains inside your chosen `--output` directory.

## Host The Static Site

If you are happy letting docsfy serve the docs, you do not need any extra publish step. `/docs/...` already serves the generated `site/` bundle. Only use the rest of this section when you want to publish that bundle somewhere else.

Every ready variant is written to disk at `$DATA_DIR/projects/<owner>/<project>/<branch>/<provider>/<model>/site`. `DATA_DIR` defaults to `/data`.

In the provided Compose setup, `/data` is persisted from the host filesystem:

```1:16:docker-compose.yaml
services:
  docsfy:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "8000:8000"
      # Uncomment for development (DEV_MODE=true)
      # - "5173:5173"
    volumes:
      - ./data:/data
      # Uncomment for development (hot reload)
      # - ./frontend:/app/frontend
    env_file:
      - .env
```

With that setup, the host-side path is typically `./data/projects/<owner>/<project>/<branch>/<provider>/<model>/site`.

The renderer recreates that directory and writes the published files directly into it:

```600:688:src/docsfy/renderer.py
if output_dir.exists():
    shutil.rmtree(output_dir)
output_dir.mkdir(parents=True, exist_ok=True)
assets_dir = output_dir / "assets"
assets_dir.mkdir(exist_ok=True)

# Prevent GitHub Pages from running Jekyll
(output_dir / ".nojekyll").touch()

if STATIC_DIR.exists():
    for static_file in STATIC_DIR.iterdir():
        if static_file.is_file():
            shutil.copy2(static_file, assets_dir / static_file.name)

(output_dir / "index.html").write_text(index_html, encoding="utf-8")
(output_dir / f"{slug}.html").write_text(page_html, encoding="utf-8")
(output_dir / f"{slug}.md").write_text(md_content, encoding="utf-8")

search_index = _build_search_index(valid_pages, plan)
(output_dir / "search-index.json").write_text(
    json.dumps(search_index), encoding="utf-8"
)

llms_txt = _build_llms_txt(plan, navigation=filtered_navigation)
(output_dir / "llms.txt").write_text(llms_txt, encoding="utf-8")

llms_full_txt = _build_llms_full_txt(
    plan, valid_pages, navigation=filtered_navigation
)
(output_dir / "llms-full.txt").write_text(llms_full_txt, encoding="utf-8")
```

When you publish the folder elsewhere, keep these pieces together:

- `index.html`
- every generated page `*.html`
- `assets/`
- `search-index.json`
- `.nojekyll`
- `llms.txt`
- `llms-full.txt`
- page `*.md` files if you want the Markdown copies too

This repository gives you the generated files and the runtime mounts, but not a dedicated static-site deployment pipeline. Publishing is simply a matter of copying or syncing the `site/` directory to your static host or web server and serving `index.html` as the entry point.
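
A one-off publish can be as small as a copy of that directory. This sketch assumes the default `/data` layout and a destination web root of your choosing; both paths are placeholders:

```python
import shutil
from pathlib import Path

# Documented layout: $DATA_DIR/projects/<owner>/<project>/<branch>/<provider>/<model>/site
data_dir = Path("/data")
site = (data_dir / "projects" / "admin" / "for-testing-only"
        / "dev" / "gemini" / "gemini-2.5-flash" / "site")
web_root = Path("/var/www/docs/for-testing-only")  # your static host's directory

# Copies index.html, page *.html/*.md, assets/, search-index.json,
# .nojekyll, llms.txt, and llms-full.txt together, as recommended above.
shutil.copytree(site, web_root, dirs_exist_ok=True)
```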

> **Tip:** GitHub Pages is a good fit for the exported site because docsfy writes `.nojekyll` automatically.

> **Warning:** Treat `site/` as build output. docsfy deletes and recreates it on every render, so manual edits there will be overwritten by the next generation.

> **Warning:** If you publish the `site/` directory outside docsfy, docsfy's authentication and access checks no longer protect it. Use your static host, CDN, or proxy for access control if the docs must stay private.

## Quick Reference

- Use `/docs/<project>/` for a moving browser link to the latest ready docs you can access.
- Use `/docs/<project>/<branch>/<provider>/<model>/` for a pinned browser link to one exact variant.
- Use `/api/projects/<project>/download` for a moving latest archive.
- Use `/api/projects/<project>/<branch>/<provider>/<model>/download` for a pinned archive.
- Use `$DATA_DIR/projects/<owner>/<project>/<branch>/<provider>/<model>/site` when you want to host the generated site outside docsfy.


## Related Pages

- [Generated Output](generated-output.html)
- [Projects API](projects-api.html)
- [CLI Workflows](cli-workflows.html)
- [Data Storage and Layout](data-storage-and-layout.html)
- [Variants, Branches, and Regeneration](variants-branches-and-regeneration.html)

---

Source: user-and-access-management.md

# User and Access Management

docsfy supports one built-in administrator account plus database-backed user accounts. In the web UI the secret is labeled **Password**; in the CLI and API the same value is used as an **API key**.

> **Note:** After a successful browser login, docsfy creates an opaque `docsfy_session` cookie instead of storing the raw password/API key in the cookie itself. Browser sessions last 8 hours by default.

## How access works

docsfy supports three database roles:

| Role | What they can do |
| --- | --- |
| `admin` | See all projects, open the **Users** and **Access** admin panels, create and delete users, rotate any user's password, and grant or revoke project access |
| `user` | Generate documentation, manage their own variants, open projects shared with them, and rotate their own password |
| `viewer` | Open projects they own or have been granted, and rotate their own password, but cannot generate, abort, delete, or regenerate |

There are also two ways to have admin access:

| Admin type | How it works | Can it rotate its own password in the UI/API? |
| --- | --- | --- |
| Built-in admin | Sign in as username `admin` using the `ADMIN_KEY` environment variable | No |
| Database admin user | A normal user account created with role `admin` | Yes |

> **Warning:** The username `admin` is reserved. You cannot create a database user named `admin`, `Admin`, or any other case variation.

## Configure the built-in admin

The built-in admin account comes from the server environment:

```env
# Required: Admin password (minimum 16 characters)
ADMIN_KEY=

# Cookie security (set to false for local HTTP development)
SECURE_COOKIES=true
```

If you use the CLI, the bundled example config stores the same credential as a named server profile:

```toml
[servers.dev]
url = "http://localhost:8000"
username = "admin"
password = "<your-dev-key>"
```

> **Warning:** `config.toml` contains real credentials. The example config explicitly recommends keeping it private with `chmod 600 ~/.config/docsfy/config.toml`.

## Create users

Admins can create users from the dashboard in **Admin -> Users** or from the CLI.

CLI examples:

```bash
docsfy admin users list
docsfy admin users create cli-test-user --role user
```

When you create a user, docsfy returns the generated password/API key once. Auto-generated keys start with `docsfy_`.

> **Warning:** Save the generated password immediately. The UI warns that it will not be shown again, and the API marks create/rotate responses as `Cache-Control: no-store`.

Usernames must:

- be 2 to 50 characters long
- start with a letter or number
- use only letters, numbers, `.`, `_`, or `-`

Choose the role at creation time:

- `admin`: full administrative access
- `user`: normal working account
- `viewer`: read-only access to documentation and shared projects

If you automate user creation directly against the admin API, the request body looks like this:

```json
{"username": "testuser", "role": "user"}
```

## Rotate passwords and API keys

Every database-backed user can rotate their own password from the sidebar's **Change Password** action. That includes `user`, `viewer`, and database-backed `admin` accounts.

You can:

- enter a replacement password/API key yourself, as long as it is at least 16 characters long
- leave the field empty and let docsfy generate one

After a self-service rotation, docsfy deletes the current session cookie and you must sign in again with the new password.

Admins can also rotate another user's password from **Admin -> Users** or from the CLI:

```bash
docsfy admin users rotate-key alice
```

If you want to provide a specific replacement key through automation, the API accepts `new_key`:

```json
{"new_key": "admin-chosen-password-long"}
```

Rotating a user's key invalidates their existing sessions and the old password stops working immediately.

> **Warning:** The built-in `admin` account cannot use self-service key rotation. To change that password, update `ADMIN_KEY` in the environment instead.

> **Warning:** docsfy uses `ADMIN_KEY` when hashing stored user keys. If you change `ADMIN_KEY`, existing user passwords/API keys stop working and all users need new keys.

## Grant and revoke project access

Project sharing is admin-only. Use **Admin -> Access** in the dashboard or the CLI commands below:

```bash
docsfy admin access grant my-repo --username alice --owner admin
docsfy admin access list my-repo --owner admin
docsfy admin access revoke my-repo --username alice --owner admin
```

Access grants are scoped by both:

- project name
- project owner

That owner field matters because two different users can generate documentation for projects with the same name.

A single grant covers all variants of that owner's project. In other words, if you grant access to `my-repo` owned by `admin`, the recipient can see that owner's `main`, `dev`, or other branch variants, plus different provider/model variants for the same project name.

The admin API expects a JSON body like this when granting access:

```json
{"username": "alice", "owner": "admin"}
```

A grant only succeeds when:

- the target user already exists
- the project already exists for the specified owner

> **Tip:** Connected dashboards resync after a grant or revoke, so shared projects appear or disappear without a manual refresh.

## How shared projects appear for non-admin users

For non-admin users, shared projects are still mixed into the normal project list. There is no separate **Shared Projects** section.

Instead, docsfy merges shared variants into the same project view for a given project name:

- a regular `user` sees owned projects plus any projects an admin has granted to them
- a `viewer` sees assigned projects, but not generation or destructive actions
- admins still see separate project groups by `owner/project-name`
- non-admins see one project entry per project name, even when some variants are owned and others are shared

The project-details API makes that merge explicit:

```python
variants = await list_variants(name, owner=request.state.username)
# Always merge shared variants so they appear alongside owned ones
seen: set[tuple[str, str, str, str]] = {
    (
        str(v.get("owner", "")),
        str(v.get("branch", DEFAULT_BRANCH)),
        str(v.get("ai_provider", "")),
        str(v.get("ai_model", "")),
    )
    for v in variants
}
accessible = await get_user_accessible_projects(request.state.username)
for proj_name, proj_owner in accessible:
    if proj_name == name and proj_owner:
        shared_variants = await list_variants(name, owner=proj_owner)
        for sv in shared_variants:
            key = (
                str(sv.get("owner", "")),
                str(sv.get("branch", DEFAULT_BRANCH)),
                str(sv.get("ai_provider", "")),
                str(sv.get("ai_model", "")),
            )
            if key not in seen:
                seen.add(key)
                variants.append(sv)
```

The sidebar keeps admins and non-admins intentionally different. The grouping rule in the tree component is a single line:

```ts
const groupKey = isAdmin ? `${p.owner}/${p.name}` : p.name
```

```mermaid
flowchart LR
  A[Owned variants] --> C[Visible variants for one project name]
  B[Granted shared variants] --> C
  C --> D[Non-admin sidebar groups by project name]
  C --> E[Admin sidebar groups by owner/project-name]
```

Shared access still lets non-admins open documentation and download ready variants they are allowed to see. Access checks are enforced on the server as well as in the UI.

Important behavior to know:

- If access is granted or revoked while the dashboard is open, the visible project list updates without a manual refresh.
- If access is revoked, direct docs and download URLs start returning `404`.
- Non-admin users cannot delete or abort another person's shared variant. Delete and abort actions stay limited to variants they own.
- New generations always run under the signed-in user, even if they start from a repository that was originally shared with them.
- Shortcut routes such as `/docs/{project}/` and `/api/projects/{name}/download` choose the latest ready variant the signed-in user can access, whether that variant is owned or shared.

> **Note:** docsfy intentionally returns `404` for projects you do not own and have not been granted. That hides whether the project exists at all.

## Delete users

Admins can delete users from **Admin -> Users** or from the CLI:

```bash
docsfy admin users delete alice --yes
```

Deleting a user is a full cleanup operation. docsfy:

- invalidates all of that user's sessions
- deletes projects owned by that user
- removes access grants given to that user
- removes access grants attached to projects owned by that user
- deletes the user's project directory from disk

Two protections are built in:

- you cannot delete your own currently signed-in admin account
- you cannot delete a user while they have an active generation in progress

> **Warning:** User deletion is destructive. If the account owns generated documentation you still need, reassign access or download the docs before deleting the user.

## Quick reference

Create a user:

```bash
docsfy admin users create cli-test-user --role user
```

Rotate a user's password:

```bash
docsfy admin users rotate-key alice
```

List who can access a project:

```bash
docsfy admin access list my-repo --owner admin
```

Share a project:

```bash
docsfy admin access grant my-repo --username alice --owner admin
```

Remove shared access:

```bash
docsfy admin access revoke my-repo --username alice --owner admin
```


## Related Pages

- [Authentication and Roles](authentication-and-roles.html)
- [Admin API](admin-api.html)
- [Authentication API](auth-api.html)
- [Projects, Variants, and Ownership](projects-variants-and-ownership.html)
- [Security Considerations](security-considerations.html)

---

Source: cli-workflows.md

# CLI Workflows

The `docsfy` CLI is the fastest way to work with a running docsfy server from a terminal. Use it to verify connectivity, start documentation generation, inspect variants, download finished sites, and manage users if you have admin access.

## Before You Start

You need:
- A running docsfy server
- A server URL
- An API key

If you use the built-in admin account, the username is `admin` and the API key is the server's `ADMIN_KEY`. If you use a normal account, an admin creates it and gives you the generated API key.

### Save a Reusable CLI Profile

Run the interactive setup once:

```shell
docsfy config init
```

You will be prompted for:

```text
Profile name [dev]:
Server URL: http://localhost:8000
Username: admin
Password:
```

When `docsfy` asks for `Password`, enter your API key.

The CLI stores profiles in `~/.config/docsfy/config.toml`. The structure looks like this:

```toml
[default]
server = "dev"

[servers.dev]
url = "http://localhost:8000"
username = "admin"
password = "<ADMIN_KEY>"
```

Useful follow-up commands:

```shell
docsfy config show
docsfy config set default.server prod
docsfy config set servers.prod.url https://prod.example.com
docsfy config set servers.prod.username admin
```

If you omit `--server`, docsfy uses the profile named in `[default].server`.
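
That resolution is easy to reproduce if you script around the CLI. A minimal sketch using Python 3.11's standard-library `tomllib`, mirroring the documented config layout:

```python
import tomllib
from pathlib import Path

# Read the same file the CLI uses and resolve the default profile.
cfg_path = Path.home() / ".config" / "docsfy" / "config.toml"
cfg = tomllib.loads(cfg_path.read_text())

profile_name = cfg["default"]["server"]      # e.g. "dev"
profile = cfg["servers"][profile_name]
print(profile["url"], profile["username"])   # never print the stored password
```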

> **Note:** `docsfy config show` masks stored passwords, and the CLI writes the config file with owner-only permissions.

> **Tip:** Every command also accepts global connection options: `--server`, `--host`, `--port`, `--username`, and `--password`. For most day-to-day work, a saved profile is more convenient.

### Server Defaults That Affect the CLI

If you run your own server, these values come from the server environment:

```dotenv
ADMIN_KEY=
AI_PROVIDER=cursor
AI_MODEL=gpt-5.4-xhigh-fast
AI_CLI_TIMEOUT=60
DATA_DIR=/data
SECURE_COOKIES=true
```

If you do not pass `--provider` or `--model` to `docsfy generate`, the server uses its configured `AI_PROVIDER` and `AI_MODEL`.

## Check Server Health

Use `health` first whenever you change config, switch environments, or suspect connectivity issues.

```shell
docsfy health
```

A healthy server responds with output like this:

```text
Server: http://localhost:8000
Status: ok
```

This command is useful for confirming that:
- your CLI is pointing at the server you expect
- the server is reachable
- the server is returning the expected health response

> **Tip:** If you see an error about no configured server, run `docsfy config init` or pass connection flags for a one-off check.

## Generate Documentation

Use `generate` to start a new documentation run for a Git repository.

The CLI takes a Git URL as its argument. HTTPS and SSH URLs are both supported, including forms like `https://github.com/org/repo.git` and `git@github.com:org/repo.git`.

> **Warning:** Repository URLs that point to `localhost` or other private-network addresses are rejected. Local `repo_path` generation exists in the API, but it is an admin-only workflow and is not a CLI argument format.
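
If your own tooling submits repository URLs on users' behalf, you may want to pre-filter the same way. This sketch shows the general kind of check the warning describes for HTTPS URLs; it is illustrative, not docsfy's actual implementation:

```python
import ipaddress
import socket
from urllib.parse import urlparse

def looks_private(repo_url: str) -> bool:
    """Illustrative guard: reject localhost and private-network hosts."""
    host = urlparse(repo_url).hostname or ""
    if host == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(socket.gethostbyname(host))
    except (socket.gaierror, ValueError):
        return False  # unresolvable; let the server decide
    return addr.is_private or addr.is_loopback or addr.is_link_local

assert looks_private("https://localhost/repo.git")
assert not looks_private("https://github.com/org/repo.git")
```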

### Start a Standard Generation

The simplest form relies on the server's default AI provider and model:

```shell
docsfy generate https://github.com/org/repo.git
```

If you want to pin the branch, provider, model, and force a full rebuild, use the full form:

```shell
docsfy generate https://github.com/myk-org/for-testing-only --branch dev --provider gemini --model gemini-2.5-flash --force
```

The CLI prints the project name, branch, and initial status. In practice, that means the server accepted the request and queued the work.

Key options:
- `--branch` chooses the Git branch. If you omit it, docsfy uses `main`.
- `--provider` chooses the AI provider. Valid values are `claude`, `gemini`, and `cursor`.
- `--model` chooses the AI model name.
- `--force` skips reuse of existing cached/generated artifacts and does a full regeneration.

> **Warning:** Branch names cannot contain slashes. Use `release-v2.0` or `v2.0`, not `release/v2.0`.

### Watch Live Progress

Add `--watch` to keep the command attached and stream progress as the server works:

```shell
docsfy generate https://github.com/myk-org/for-testing-only --branch dev --provider gemini --model gemini-2.5-flash --force --watch
```

While a generation is running, `docsfy generate --watch` can report stages such as:
- `cloning`
- `planning`
- `incremental_planning`
- `generating_pages`
- `validating`
- `cross_linking`
- `rendering`

A typical full or incremental run looks like this:

```mermaid
flowchart LR
  cloning --> planning
  cloning --> incremental_planning
  planning --> generating_pages
  incremental_planning --> generating_pages
  generating_pages --> validating
  validating --> cross_linking
  cross_linking --> rendering
  rendering --> ready
```

The post-generation stages are reported directly by the backend pipeline:

```python
await update_and_notify(
    gen_key,
    project_name,
    ai_provider,
    ai_model,
    status="generating",
    owner=owner,
    branch=branch,
    current_stage="validating",
    page_count=len(pages),
)

await update_and_notify(
    gen_key,
    project_name,
    ai_provider,
    ai_model,
    status="generating",
    owner=owner,
    branch=branch,
    current_stage="cross_linking",
    page_count=len(pages),
)
```

If docsfy determines that the selected commit is already covered by the current docs, the run still ends as `ready`, but `current_stage` can be `up_to_date` instead of moving through the long-running pipeline.

The watch session exits when the variant reaches a terminal state such as `ready`, `error`, or `aborted`.

> **Tip:** `--watch` is most predictable when you also pass `--provider` and `--model`. If you rely on the server's default provider and model, the CLI cannot know which exact variant to follow until the server resolves them.

### When to Use `--force`

By default, docsfy uses the newest ready variant on the same branch as its baseline, even if that baseline was produced with a different provider or model. A non-force run can therefore:
- finish immediately as `ready` with `current_stage = up_to_date` when the commit is unchanged
- keep the existing plan and regenerate only the pages affected by the Git diff
- reuse another provider/model variant as the baseline and replace that older baseline after the new variant succeeds

Use `--force` when you want a clean full pass, or when you want the older provider/model output to remain alongside the new one instead of being treated as a reusable baseline.

> **Tip:** If you are comparing two providers or models side by side, run the second generation with `--force`. A non-force provider/model switch may reuse and then replace the previous baseline variant.

If you submit the same `project/branch/provider/model` while it is already generating, the server returns a conflict instead of starting a duplicate job.

## List Available Projects

Use `list` when you want a quick overview of everything you can access.

```shell
docsfy list
```

The table includes:
- `NAME`
- `BRANCH`
- `PROVIDER`
- `MODEL`
- `STATUS`
- `OWNER`
- `PAGES`

Each row is a variant, not just a repository name. A single project can appear multiple times if it has multiple branches or model/provider combinations.

Filter by status:

```shell
docsfy list --status ready
```

Filter by provider:

```shell
docsfy list --provider cursor
```

Get machine-readable output:

```shell
docsfy list --json
```

Common status values are `generating`, `ready`, `error`, and `aborted`.

> **Note:** Admins see all projects. Other accounts see their own projects plus any projects that were explicitly shared with them.

## Inspect One Project or One Variant

Use `status` when you want more detail than the list view gives you.

### Show All Variants for a Project

```shell
docsfy status for-testing-only
```

This prints the project name, the number of variants, and then a detail block for each variant. Depending on what exists, you may see fields such as:
- status
- owner
- page count
- last updated time
- short commit SHA
- current stage
- last error

This is the quickest way to answer questions like:
- Did my last run finish?
- Which model produced this version?
- How many pages were generated?
- Is a run still in `planning` or `generating_pages`?

### Filter or Target a Specific Variant

You can pass one or two filters, such as `--branch dev`, to narrow the printed list.

If you know the exact branch, provider, and model, pass all three together:

```shell
docsfy status for-testing-only --branch dev --provider gemini --model gemini-2.5-flash
```

That returns the detail for a single variant instead of the whole project.

### Use `--owner` When Admin Disambiguation Matters

If you are an admin and the same project or variant exists under multiple owners, add `--owner` when you want one specific match:

```shell
docsfy status shared-name --branch main --provider claude --model opus --owner alice
```

For a plain `docsfy status PROJECT`, you usually do not need `--owner` because docsfy shows all matching variants and includes the owner in the output.

For scripts:

```shell
docsfy status for-testing-only --json
```

`list --json` returns an array of variants. `status --json` returns either a project wrapper or a single variant object, depending on whether you fully specify `--branch`, `--provider`, and `--model`.
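
For scripted waits, you can combine the fully specified form with `jq`. This sketch assumes the single variant object exposes a top-level `status` field with the values listed earlier:

```shell
# Poll one exact variant until it leaves the `generating` state.
while true; do
  state=$(docsfy status for-testing-only \
    --branch dev --provider gemini --model gemini-2.5-flash --json | jq -r '.status')
  [ "$state" != "generating" ] && break
  sleep 10
done
echo "Final status: $state"
```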

## Abort In-Progress Work

Use `abort` to stop a generation that is still running.

### Abort by Project Name

If there is only one active variant for that project, this is the simplest form:

```shell
docsfy abort my-repo
```

### Abort a Specific Variant

If more than one variant is active, target the exact one:

```shell
docsfy abort for-testing-only --branch main --provider gemini --model gemini-2.5-flash
```

For variant-specific aborts, pass `--branch`, `--provider`, and `--model` together. Admins can also add `--owner` when the same variant exists under more than one owner.

After aborting, confirm the result with:

```shell
docsfy status for-testing-only
```

> **Warning:** Plain `docsfy abort PROJECT` fails when multiple active variants match that name. In that case, rerun the command with `--branch`, `--provider`, and `--model`.

> **Note:** `abort` is a write action. `viewer` accounts can inspect and download docs, but they cannot generate, abort, or delete variants.

## Download Generated Artifacts

Use `download` to pull the generated static site to your machine.

### Download the Default Ready Variant

```shell
docsfy download test-repo
```

This downloads the most recently generated `ready` variant you can access and saves it in the current directory as:

```text
test-repo-docs.tar.gz
```

### Download One Exact Variant

If you want a specific branch/provider/model combination, specify it explicitly:

```shell
docsfy download test-repo --branch main --provider claude --model opus
```

That saves the archive using this pattern:

```text
<name>-<branch>-<provider>-<model>-docs.tar.gz
```

For exact-variant downloads, pass `--branch`, `--provider`, and `--model` together.

### Extract Directly into a Directory

Add `--output` to extract the archive immediately instead of keeping the `.tar.gz` file:

```shell
docsfy download test-repo --branch main --provider claude --model opus --output ./site
```

If `./site` does not exist yet, docsfy creates it for you.

The archive is extracted as-is, so you get a top-level project or variant directory inside `./site` rather than loose files at the root. For the exact-variant example above, that means `index.html`, `assets/`, and `.nojekyll` land under `./site/test-repo-main-claude-opus/`.

If you want the generated files directly at the root of your target directory, flatten the extracted folder:

```shell
# Move the visible files, then the dotfiles such as .nojekyll
# (errors for the `.` and `..` entries are suppressed).
mv <output_dir>/<project>-<branch>-<provider>-<model>/* <output_dir>/
mv <output_dir>/<project>-<branch>-<provider>-<model>/.* <output_dir>/ 2>/dev/null
rmdir <output_dir>/<project>-<branch>-<provider>-<model>
```

> **Tip:** This is especially useful when you want `index.html` to sit directly in a publishable `docs/` directory for GitHub Pages or another static host.

> **Note:** `docsfy download PROJECT` chooses the newest ready variant available to you. If you need one exact branch/provider/model combination, or if the default choice is ambiguous across owners, rerun with `--branch`, `--provider`, and `--model`. If you are an admin and still need one specific owner's variant, add `--owner` as well.

> **Warning:** Downloads only work for `ready` variants. If a run is still `generating`, or ended in `error` or `aborted`, check `docsfy status` first.

## Admin Workflows

The `admin` command group is for user and access management.

### List Users

```shell
docsfy admin users list
```

For scripts:

```shell
docsfy admin users list --json
```

The table includes each user's username, role, and creation time.

### Create a User

```shell
docsfy admin users create cli-test-user --role user
```

Valid roles are:
- `admin`
- `user`
- `viewer`

A successful create prints the new API key. Autogenerated keys use the `docsfy_...` format.

> **Warning:** Save the generated API key when you create or rotate a user. The CLI does not show it again later.

> **Note:** `viewer` is the read-only role. Viewers can inspect and download docs they can access, but they cannot start or stop generations.

Usernames have a few practical rules:
- `admin` is reserved
- names must be 2 to 50 characters long
- letters, numbers, dots, hyphens, and underscores are allowed

### Rotate a User's Key

Generate a new key automatically:

```shell
docsfy admin users rotate-key alice
```

For automation:

```shell
docsfy admin users rotate-key alice --json
```

If you want to set a custom key yourself, use `--new-key`. Custom keys must be at least 16 characters long.

Rotating a key invalidates that user's existing sessions, so they will need to authenticate again with the new key.
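
If you prefer to supply the key yourself, one approach is to generate a random value locally first. The `openssl` invocation here is just an illustration, not part of docsfy:

```shell
# Produce a 32-character hex key, then set it as alice's new key.
NEW_KEY=$(openssl rand -hex 16)
docsfy admin users rotate-key alice --new-key "$NEW_KEY"
```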

> **Note:** The built-in `admin` account is controlled by the server's `ADMIN_KEY` environment variable, not by `admin users rotate-key`.

### Delete a User

```shell
docsfy admin users delete cli-test-user --yes
```

Without `--yes`, docsfy prompts for confirmation.

You cannot delete your own account, and a delete is blocked while that user still has an active generation in progress.

### View Who Can Access a Project

```shell
docsfy admin access list my-repo --owner admin
```

This shows the usernames that currently have access to that project for that owner.

### Grant Access to a Project

```shell
docsfy admin access grant my-repo --username alice --owner admin
```

### Revoke Access

```shell
docsfy admin access revoke my-repo --username alice --owner admin
```

> **Note:** Access grants are owner-scoped and project-wide. In other words, you grant access to all variants of `my-repo` owned by `admin`, not just one branch or model.

## Common CLI Errors

- `No server configured`: run `docsfy config init`, or pass connection options such as `--server`.
- `Repository URL must not target localhost or private networks`: use a public Git remote that the docsfy server can reach.
- `Write access required`: the authenticated account is a `viewer`.
- `Variant not ready`: the target variant is not in `ready` state yet. Use `docsfy status PROJECT`.
- `Multiple active variants found; use the branch-specific abort endpoint.`: rerun `docsfy abort` with `--branch`, `--provider`, and `--model`.
- `Abort still in progress for '...'`: wait a moment and retry the abort.
- `Multiple owners found for this variant, please specify owner`: if you are an admin, rerun the exact-variant command with `--owner`.
- `Multiple owners have variants with the same timestamp, please specify owner`: rerun `docsfy download` with `--branch`, `--provider`, and `--model`; if you are an admin, add `--owner`.
- `No projects found.`: the account has no owned projects and no granted access yet.


## Related Pages

- [CLI Command Reference](cli-command-reference.html)
- [CLI Configuration](cli-configuration.html)
- [Generating Documentation](generating-documentation.html)
- [Tracking Progress and Status](tracking-progress-and-status.html)
- [Viewing, Downloading, and Hosting Docs](viewing-downloading-and-hosting-docs.html)

---

Source: cli-command-reference.md

# CLI Command Reference

`docsfy` is the command-line client for a running docsfy server. You use it to save connection profiles, start documentation generation, inspect existing outputs, download generated sites, and perform admin-only user and access management.

In docsfy, a _variant_ is one `project / branch / provider / model` combination. In admin or shared-access setups, the same project name can exist under different owners, which is why some commands also support `--owner`.

Permissions at a glance:
- `list`, `status`, and `download` are read-oriented commands for projects you own or have been granted access to.
- `generate`, `delete`, and `abort` require a `user` or `admin` account.
- `admin ...` commands require admin access.

Examples on this page are taken from the repository's CLI tests and sample config.

## Command Summary

| Command | What it does |
| --- | --- |
| `config` | Manage saved server profiles in `~/.config/docsfy/config.toml` |
| `generate` | Start documentation generation for a Git repository |
| `list` | Show accessible projects and variants |
| `status` | Show detailed status for one project or one exact variant |
| `delete` | Delete a single variant or all variants of a project |
| `abort` | Stop an active generation run |
| `download` | Download a generated docs site as a tarball, or extract it to a directory |
| `admin users ...` | List, create, delete, and rotate user API keys |
| `admin access ...` | Grant, revoke, and inspect project access for other users |

## Global Connection Options

All commands share the same connection options. The CLI resolves them in this order:

1. Explicit CLI flags such as `--host`, `--username`, and `--password`
2. A named profile selected with `--server`
3. The default profile from `[default].server` in `~/.config/docsfy/config.toml`
4. An error if nothing is configured

| Option | Meaning |
| --- | --- |
| `--server`, `-s` | Use a named server profile from the config file |
| `--host` | Override the host from the selected profile |
| `--port` | Override the port when `--host` is used |
| `--username`, `-u` | Override the configured username |
| `--password`, `-p` | Override the configured password/API key |

> **Note:** In CLI config and flags, the field is named `password`, but for docsfy this value is your API key.

> **Note:** Global options go before the subcommand. That matters because `-p` is reused: before the subcommand it means API key, but after commands like `status`, `delete`, `abort`, and `download` it means `--provider`.

> **Tip:** If you use `--host`, the CLI builds a full URL from host and port. The port defaults to `8000`, and the scheme comes from the selected profile when available; otherwise it defaults to `https`.

## `config`

`docsfy config` manages the CLI config file at `~/.config/docsfy/config.toml`.

A sample config from the repository:

```toml
[default]
server = "dev"

[servers.dev]
url = "http://localhost:8000"
username = "admin"
password = "<your-dev-key>"

[servers.prod]
url = "https://docsfy.example.com"
username = "admin"
password = "<your-prod-key>"

[servers.staging]
url = "https://staging.docsfy.example.com"
username = "deployer"
password = "<your-staging-key>"
```

> **Warning:** This file contains credentials. The CLI writes it with owner-only permissions, and you should keep it private.

### `config init`

`docsfy config init` is the interactive setup flow. It prompts for:

- Profile name, defaulting to `dev`
- Server URL
- Username
- Password/API key

If you are creating the first profile, it also becomes the default server. If you add another profile later, the existing default stays in place until you change it.

```shell
docsfy config init
```

### `config show`

`docsfy config show` prints the config file path, the current default profile, and each saved profile with its password/API key masked.

```shell
docsfy config show
```

### `config set`

`docsfy config set` writes nested TOML keys directly.

```shell
docsfy config set default.server prod
docsfy config set servers.dev.url https://new-server.com
```

Use it when you want to switch the default profile or update individual fields without re-running `config init`.

> **Note:** `config set` expects dotted keys such as `default.server`, `servers.dev.url`, and `servers.dev.username`. It does not use shorthand keys like `server` or `api-key`.
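
Because `config set` accepts arbitrary dotted keys, you can also build a complete new profile without re-running `config init`. This sketch mirrors the `[servers.<name>]` structure shown above; the staging values are placeholders:

```shell
# Create a new profile field by field, then make it the default.
docsfy config set servers.staging.url https://staging.docsfy.example.com
docsfy config set servers.staging.username deployer
docsfy config set servers.staging.password "<your-staging-key>"
docsfy config set default.server staging
```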

## `generate`

`docsfy generate` starts documentation generation for a remote Git repository.

The project name used by later commands is derived from the repository name. For example:

- `https://github.com/myk-org/for-testing-only` becomes `for-testing-only`
- `https://github.com/org/my-repo.git` becomes `my-repo`
- `git@github.com:org/repo.git` becomes `repo`

Common options:
- `--branch`, `-b`: Git branch to generate from. Defaults to `main`.
- `--provider`: AI provider. Valid values in this codebase are `claude`, `gemini`, and `cursor`.
- `--model`, `-m`: AI model name.
- `--force`, `-f`: Force a full regeneration instead of reusing cached artifacts.
- `--watch`, `-w`: Stream live generation progress.

Examples:

```shell
docsfy generate https://github.com/myk-org/for-testing-only --provider gemini --model gemini-2.5-flash --force
docsfy generate https://github.com/myk-org/for-testing-only --branch dev --provider gemini --model gemini-2.5-flash --force --watch
```

What you can expect:
- The CLI prints the derived project name, resolved branch, and initial status.
- With `--watch`, it listens for live progress updates such as `cloning`, `planning`, `incremental_planning`, `generating_pages`, `validating`, `cross_linking`, and `rendering`.
- Final statuses are `ready`, `error`, or `aborted`. A `ready` result can also mean docsfy determined the target variant was already up to date and skipped regeneration work.

> **Tip:** `generate` takes a repository URL, but `status`, `delete`, `abort`, and `download` use the derived project name. For `https://github.com/myk-org/for-testing-only`, that name is `for-testing-only`.

> **Note:** If you omit `--provider` or `--model`, the server default is used. In this codebase the current defaults are `cursor` and `gpt-5.4-xhigh-fast`, but deployments can override them.

> **Tip:** When using `--watch`, pass both `--provider` and `--model` explicitly so the CLI can subscribe to the exact variant immediately.

> **Warning:** Branch names cannot contain slashes. Use names like `release-1.x`, not `release/1.x`.

> **Warning:** The CLI `generate` command expects a Git repository URL, not a local filesystem path. Standard HTTPS remotes like `https://github.com/org/repo.git` and SSH remotes like `git@github.com:org/repo.git` are accepted, but the server rejects repository URLs that point to `localhost` or private-network addresses.

## `list`

`docsfy list` shows the accessible projects and variants in a table. The table includes these columns:

- `NAME`
- `BRANCH`
- `PROVIDER`
- `MODEL`
- `STATUS`
- `OWNER`
- `PAGES`

Examples:

```shell
docsfy list
docsfy list --status ready
docsfy list --provider cursor
docsfy list --json
```

Use `--status` and `--provider` to narrow the result set before printing. In practice, project statuses used by the codebase are `generating`, `ready`, `error`, and `aborted`.

For non-admin users, `list` includes projects you own plus projects that have been shared with you. For admins, it shows everything.

## `status`

`docsfy status` shows detailed information for one project.

With just the project name, it shows all matching variants you can access. If you provide `--branch`, `--provider`, and `--model` together, it fetches one exact variant instead.

Examples:

```shell
docsfy status for-testing-only
docsfy status my-repo -b main -p cursor -m gpt-5
```

Useful fields in the output include:
- Status
- Owner
- Page count
- Last generated time
- Short commit SHA
- Current stage
- Error message, when present

> **Note:** A `ready` variant can still show `Stage: up_to_date` when docsfy determines that nothing meaningful changed and no regeneration work was needed.

> **Note:** `--owner` is mainly useful for admins when you are querying one fully qualified variant and need to disambiguate between multiple owners.

> **Tip:** If you want one exact variant, provide all three selectors together: `--branch`, `--provider`, and `--model`.

## `delete`

`docsfy delete` removes either one exact variant or every variant for a project within one owner scope.

Common options:
- `--branch`, `-b`: Variant branch
- `--provider`, `-p`: Variant provider
- `--model`, `-m`: Variant model
- `--owner`: Project owner. Required on admin deletes because delete routes are owner-scoped, even when the rest of the variant is fully specified.
- `--all`: Delete all variants for the project within that owner scope
- `--yes`, `-y`: Skip the confirmation prompt

Examples:

```shell
docsfy delete for-testing-only --branch dev --provider gemini --model gemini-2.0-flash --yes
docsfy delete my-repo --all --yes
```

If you leave off `--yes`, the CLI asks for confirmation before deleting anything.

> **Warning:** Use either `--all` or the full variant selector (`--branch`, `--provider`, and `--model`). Do not combine them.

> **Warning:** The server refuses deletion while generation is in progress. Abort the running variant first, then retry the delete.

## `abort`

`docsfy abort` stops an active generation run.

You can use it in two ways:
- By project name alone, if there is only one active generation for that project
- By exact variant, using `--branch`, `--provider`, and `--model`

Examples:

```shell
docsfy abort my-repo
docsfy abort for-testing-only --branch main --provider gemini --model gemini-2.5-flash
```

When the abort succeeds, the variant ends up in `aborted` status.

> **Tip:** If more than one active variant exists for the same project name, the project-level form is ambiguous. In that case, retry with `--branch`, `--provider`, and `--model`. Admins may also need `--owner` for someone else's variant.

## `download`

`docsfy download` fetches generated documentation for a project.

You can use it in two modes:
- Without variant selectors: download the latest ready variant you can access
- With `--branch`, `--provider`, and `--model`: download one exact variant

Common options:
- `--branch`, `-b`: Variant branch
- `--provider`, `-p`: Variant provider
- `--model`, `-m`: Variant model
- `--owner`: Useful for admins when downloading a specific variant owned by someone else
- `--output`, `-o`: Extract into a directory instead of saving a tarball in the current directory

Example:

```shell
docsfy download my-repo -b main -p cursor -m gpt-5
```

Archive naming:
- Exact variant download: `<project>-<branch>-<provider>-<model>-docs.tar.gz`
- Project-level download: `<project>-docs.tar.gz`

When you pass `--output`, the CLI creates the directory if needed, downloads the archive to a temporary file, and extracts it there.

> **Tip:** If a project has multiple variants, or if you are an admin working across multiple owners, prefer the fully qualified form with `--branch`, `--provider`, and `--model` so you know exactly which archive you are getting.

> **Warning:** Only ready variants can be downloaded.

## `admin`

All `admin` subcommands require admin credentials.

### `admin users`

`admin users` manages docsfy accounts. Valid roles are:
- `admin`
- `user`
- `viewer`

Examples:

```shell
docsfy admin users list
docsfy admin users create cli-test-user --role user
docsfy admin users delete cli-test-user --yes
docsfy admin users rotate-key alice
```

What each subcommand does:
- `list`: Show all users in a table, or JSON with `--json`
- `create`: Create a user and print the generated API key; `--role` defaults to `user`
- `delete`: Delete a user; prompts unless you pass `--yes`
- `rotate-key`: Rotate a user's API key; use `--new-key` to provide your own key, or omit it to generate one automatically

Practical details:
- Usernames must be 2-50 characters, start with an alphanumeric character, and may include `.`, `_`, and `-`
- The username `admin` is reserved
- Deleting a user also removes their sessions, owned projects, and related access grants
- Rotating a key invalidates that user's existing sessions

> **Warning:** `create` and `rotate-key` show the API key only once. Save it immediately.

> **Warning:** You cannot delete your own admin account, and the server blocks deleting a user while they have a generation in progress.

### `admin access`

`admin access` manages project sharing. Access is project-level and owner-scoped, which means a grant applies to all variants of that project for that owner.

Examples:

```shell
docsfy admin access list my-repo --owner admin
docsfy admin access grant my-repo --username alice --owner admin
docsfy admin access revoke my-repo --username alice --owner admin
```

What each subcommand does:
- `list`: Show which users currently have access to a project; supports `--json`
- `grant`: Give a user access to a project owned by a specific owner
- `revoke`: Remove that access again

> **Note:** `--owner` is required on all `admin access` commands, because project sharing is scoped to a specific owner.

## JSON Output and Exit Behavior

Commands that support `--json`:
- `docsfy list`
- `docsfy status`
- `docsfy admin users list`
- `docsfy admin users create`
- `docsfy admin users rotate-key`
- `docsfy admin access list`

When scripting:
- `docsfy list --json` returns an array of project objects
- `docsfy status --json` returns either one variant object or a `{name, variants}` object, depending on whether you fully qualified the variant
- HTTP and API failures are printed as `Error (<status>): ...` and return a non-zero exit code
- If you decline a confirmation prompt, the CLI prints `Aborted.` and exits without making changes
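
Putting those behaviors together, a minimal scripted consumer might look like this. It assumes each object in the `list --json` array carries `name` and `status` fields, matching the frontend `Project` shape:

```shell
# Capture the JSON first so a non-zero exit code from the CLI is visible.
if ! projects=$(docsfy list --json); then
  echo "docsfy list failed" >&2
  exit 1
fi

# Print the name of every ready variant.
printf '%s' "$projects" | jq -r '.[] | select(.status == "ready") | .name'
```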


## Related Pages

- [CLI Workflows](cli-workflows.html)
- [CLI Configuration](cli-configuration.html)
- [Generating Documentation](generating-documentation.html)
- [Viewing, Downloading, and Hosting Docs](viewing-downloading-and-hosting-docs.html)
- [Projects API](projects-api.html)

---

Source: auth-api.md

# Authentication API

docsfy supports two ways to authenticate:

- Bearer-token authentication for CLI tools, scripts, and direct API clients
- Session-cookie authentication for the built-in browser UI after a successful login

The same secret shows up under different names depending on the client. In the login API it is sent as `api_key`, in the web UI it is entered in a password field, and in direct API calls it is sent as a Bearer token.

Actual examples from the codebase:

```ts
await api.post<AuthResponse>('/api/auth/login', {
  username,
  api_key: password,
})
navigate(intendedPath)
```

```python
self._client = httpx.Client(
    base_url=self.server_url,
    headers={"Authorization": f"Bearer {self.password}"},
    timeout=30.0,
    follow_redirects=False,
)
```

> **Note:** If you are automating docsfy, you usually do not need to call `POST /api/auth/login` first. Send `Authorization: Bearer <api-key>` on protected requests instead.
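
For example, a direct `curl` call needs only the header; the `API_KEY` variable is illustrative:

```shell
# One authenticated request, no login round-trip required.
curl -s -H "Authorization: Bearer $API_KEY" http://localhost:8000/api/status
```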

## How Authentication Works

```mermaid
sequenceDiagram
    actor Browser
    participant AuthAPI as Authentication API
    participant DB as SQLite storage
    actor CLI as CLI / script

    Browser->>AuthAPI: POST /api/auth/login {username, api_key}
    AuthAPI->>DB: check ADMIN_KEY or get_user_by_key()
    AuthAPI->>DB: create_session()
    AuthAPI-->>Browser: 200 + Set-Cookie: docsfy_session

    Browser->>AuthAPI: GET /api/auth/me
    Note over Browser,AuthAPI: Browser sends docsfy_session automatically
    AuthAPI->>DB: get_session()
    AuthAPI-->>Browser: {username, role, is_admin}

    CLI->>AuthAPI: GET /api/status + Authorization: Bearer <api-key>
    AuthAPI->>DB: get_user_by_key() if token is not ADMIN_KEY
    AuthAPI-->>CLI: protected response
```

For protected requests, docsfy checks authentication in this order:

1. `Authorization: Bearer <token>`
2. `docsfy_session` cookie

If the Bearer token does not authenticate a user, docsfy falls back to the session cookie.

```python
auth_header = request.headers.get("authorization", "")
if auth_header.startswith("Bearer "):
    token = auth_header[7:]
    if token == settings.admin_key:
        is_admin = True
        username = "admin"
    else:
        user = await get_user_by_key(token)

if not user and not is_admin:
    session_token = request.cookies.get("docsfy_session")
    if session_token:
        session = await get_session(session_token)
```

For database-backed users, cookie-based authentication re-reads the user record on each request. That means deleted users lose access immediately, and role changes take effect on the next request.

The frontend sends the session cookie automatically on same-origin requests:

```ts
const config: RequestInit = {
  ...options,
  credentials: 'same-origin',
  redirect: 'manual',
  headers,
}
```

A successful login creates an HttpOnly browser cookie with strict same-site behavior and an 8-hour lifetime:

```python
response.set_cookie(
    "docsfy_session",
    session_token,
    httponly=True,
    samesite="strict",
    secure=settings.secure_cookies,
    max_age=SESSION_TTL_SECONDS,
)
```

Because the cookie is `HttpOnly`, frontend JavaScript cannot read it directly. The cookie value is also an opaque session token, not the raw API key you used to sign in.

Unauthenticated requests behave differently depending on the path:

- `/api/*` returns `401` with `{"detail": "Unauthorized"}`
- HTML requests to `/docs/*` redirect to `/login`
- `/api/ws` follows the same model and accepts either `?token=<api-key>` or the `docsfy_session` cookie

> **Tip:** Use session cookies for browsers and Bearer tokens for automation. They are different transport mechanisms for the same underlying credentials.

## Endpoint Reference

### POST `/api/auth/login`

Creates a browser session from a username/key pair.

- `Auth required:` No
- `Request body:` JSON object with `username` and `api_key`
- `Success:` `200 OK` with `username`, `role`, and `is_admin`, plus a `docsfy_session` cookie
- `400:` Invalid JSON body, or a JSON body that is not an object
- `401:` Invalid username or password

Important rules:

- The built-in admin login only works when `username` is exactly `admin` and the submitted secret matches `ADMIN_KEY`.
- Database-backed users must send the username that owns the submitted key.
- The built-in `admin` account is special and does not come from the users table.
- The returned `role` can be `admin`, `user`, or `viewer`.
- `is_admin` is `true` for the built-in admin and for database-backed users whose role is `admin`.

This endpoint is intentionally public and returns JSON. Bad credentials return `401`; they do not trigger a redirect.

> **Note:** The web UI labels this value as a password, but the API field name is still `api_key`.
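
A cookie-based flow from the command line looks roughly like this, with `curl`'s cookie jar standing in for the browser:

```shell
# Log in and store the docsfy_session cookie.
curl -s -c cookies.txt -X POST http://localhost:8000/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{"username": "admin", "api_key": "<ADMIN_KEY>"}'

# Reuse the session cookie on a protected endpoint.
curl -s -b cookies.txt http://localhost:8000/api/auth/me
```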

### POST `/api/auth/logout`

Clears the current browser session.

- `Auth required:` No
- `Request body:` None
- `Success:` `200 OK` with `{ "ok": true }`
- `Side effect:` If the request includes `docsfy_session`, docsfy deletes the stored session and clears the cookie in the response

This endpoint does not revoke API keys or disable Bearer-token access. It only clears the session-cookie flow.

> **Note:** `logout` is safe to call even when the session has already expired or the client is no longer authenticated.

### GET `/api/auth/me`

This is the current-user endpoint. It returns the identity attached to the current request, whether that request was authenticated by a Bearer token or by a session cookie.

- `Auth required:` Yes
- `Success:` `200 OK` with `username`, `role`, and `is_admin`
- `401:` No valid Bearer token or session cookie

The response shape is simple:

```python
return JSONResponse(
    content={
        "username": request.state.username,
        "role": request.state.role,
        "is_admin": request.state.is_admin,
    }
)
```

Use `GET /api/auth/me` to:

- confirm that a token or session is still valid
- determine the active role
- decide whether to show admin-only behavior in a client

### POST `/api/auth/rotate-key`

Rotates the current user's own key. In the UI this behaves like a change-password action, but it rotates the same secret used for Bearer-token authentication.

- `Auth required:` Yes
- `Request body:` No body or `{}` generates a new random key
- `Request body:` `{ "new_key": "..." }` sets a custom key
- `Success:` `200 OK` with `username` and `new_api_key`
- `Response header:` `Cache-Control: no-store`
- `400:` Malformed JSON, a non-object JSON body, a short custom key, or the built-in `admin` account authenticated via `ADMIN_KEY`
- `401:` No valid authentication

Any authenticated database-backed user can rotate their own key, whether their role is `viewer`, `user`, or `admin`. The built-in `admin` login that uses `ADMIN_KEY` cannot, because it is not a database-backed user.
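
As a sketch, rotating your own key over HTTP looks like this. The `CURRENT_KEY` variable and the custom key value are placeholders:

```shell
# Let the server generate a fresh random key.
curl -s -X POST -H "Authorization: Bearer $CURRENT_KEY" \
  http://localhost:8000/api/auth/rotate-key

# Or set a specific key of your own (minimum 16 characters).
curl -s -X POST -H "Authorization: Bearer $CURRENT_KEY" \
  -H "Content-Type: application/json" \
  -d '{"new_key": "<at-least-16-characters>"}' \
  http://localhost:8000/api/auth/rotate-key
```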

Key rotation invalidates every existing session for that user:

```python
cursor = await db.execute(
    "UPDATE users SET api_key_hash = ? WHERE username = ?",
    (key_hash, username),
)
if cursor.rowcount == 0:
    msg = f"User '{username}' not found"
    raise ValueError(msg)
# Invalidate all existing sessions for this user
await db.execute("DELETE FROM sessions WHERE username = ?", (username,))
await db.commit()
```

The HTTP response also prevents caching and clears the current browser session:

```python
response = JSONResponse(
    content={"username": username, "new_api_key": new_key},
    headers={"Cache-Control": "no-store"},
)
response.delete_cookie(
    "docsfy_session",
    httponly=True,
    samesite="strict",
    secure=settings.secure_cookies,
)
```

After a successful rotation:

- the old key stops working immediately
- all existing sessions for that user are deleted
- the current `docsfy_session` cookie is cleared
- you must sign in again with the new key

> **Warning:** Custom keys must be at least 16 characters long.

> **Warning:** The built-in `admin` account cannot use this endpoint. To change that password, change `ADMIN_KEY` in the server configuration instead.

## Configuration

Authentication depends on two settings:

- `ADMIN_KEY` is required at startup and must be at least 16 characters long
- `secure_cookies` defaults to `True`; disable it only for local plain-HTTP development

The server settings are defined like this:

```python
class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        extra="ignore",
    )

    admin_key: str = ""  # Required — validated at startup
    ai_provider: str = "cursor"
    ai_model: str = "gpt-5.4-xhigh-fast"
    ai_cli_timeout: int = Field(default=60, gt=0)
    log_level: str = "INFO"
    data_dir: str = "/data"
    secure_cookies: bool = True  # Set to False for local HTTP dev
```

When running with Docker Compose, `ADMIN_KEY` is passed through from the environment:

```yaml
env_file:
  - .env
environment:
  - ADMIN_KEY=${ADMIN_KEY}
```

> **Warning:** If `secure_cookies` stays enabled while you develop over plain HTTP, the browser will not store or send the session cookie, so login will appear not to stick.

> **Warning:** `ADMIN_KEY` is more than the built-in admin password. It is also used when hashing stored user keys, so changing it invalidates existing database-backed user keys.

## Choosing The Right Flow

- Use Bearer tokens for CLI tools, scripts, and direct API clients.
- Use `POST /api/auth/login` when you want a browser session for the built-in UI.
- Use `GET /api/auth/me` to check who the current request is authenticated as.
- Use `POST /api/auth/rotate-key` when you want to invalidate a user key and issue a new one.
- Use `POST /api/auth/logout` only to clear browser sessions.


## Related Pages

- [Authentication and Roles](authentication-and-roles.html)
- [User and Access Management](user-and-access-management.html)
- [Admin API](admin-api.html)
- [WebSocket Protocol](websocket-protocol.html)
- [Security Considerations](security-considerations.html)

---

Source: projects-api.md

# Projects API

Docsfy treats every generated docs set as a **variant**, not a single flat "project" row. The Projects API is how you list accessible variants, start new generations, inspect one exact variant, abort work in progress, delete stored output, download archives, and serve the rendered docs site.

A variant is identified by five values: `name`, `branch`, `ai_provider`, `ai_model`, and `owner`.

```63:79:src/docsfy/storage.py
CREATE TABLE IF NOT EXISTS projects (
    name TEXT NOT NULL,
    branch TEXT NOT NULL DEFAULT 'main',
    ai_provider TEXT NOT NULL DEFAULT '',
    ai_model TEXT NOT NULL DEFAULT '',
    owner TEXT NOT NULL DEFAULT '',
    repo_url TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'generating',
    current_stage TEXT,
    last_commit_sha TEXT,
    last_generated TEXT,
    page_count INTEGER DEFAULT 0,
    error_message TEXT,
    plan_json TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (name, branch, ai_provider, ai_model, owner)
)
```

`owner` is part of the stored identity, but it is **not** part of the main URL path. That is why exact variant routes sometimes need `?owner=<username>` for admin disambiguation.

> **Note:** docsfy does not publish Swagger or an OpenAPI JSON document. The routes below are the API reference.

## Quick Route Map

| What you want to do | Method and path |
| --- | --- |
| List all accessible variants | `GET /api/status` |
| Same as above | `GET /api/projects` |
| List all accessible variants for one project name | `GET /api/projects/{name}` |
| Get one exact variant | `GET /api/projects/{name}/{branch}/{provider}/{model}` |
| Start a generation | `POST /api/generate` |
| Abort a project when only one active variant matches | `POST /api/projects/{name}/abort` |
| Abort one exact variant | `POST /api/projects/{name}/{branch}/{provider}/{model}/abort` |
| Delete one exact variant | `DELETE /api/projects/{name}/{branch}/{provider}/{model}` |
| Delete all variants for one owner-scoped project | `DELETE /api/projects/{name}` |
| Download one exact variant as `tar.gz` | `GET /api/projects/{name}/{branch}/{provider}/{model}/download` |
| Download the latest ready variant | `GET /api/projects/{name}/download` |
| Serve one exact rendered site file | `GET /docs/{project}/{branch}/{provider}/{model}/{path:path}` |
| Serve a file from the latest ready rendered site | `GET /docs/{project}/{path:path}` |

These two list routes are aliases of the same handler:

```1152:1159:src/docsfy/api/projects.py
@router.get("/status")
@router.get("/projects")
async def status(request: Request) -> dict[str, Any]:
    return await build_projects_payload(request.state.username, request.state.is_admin)

@router.get("/projects/{name}")
async def get_project_details(request: Request, name: str) -> dict[str, Any]:
```

## Authentication, Roles, and Owner Scoping

All `/api/*` and `/docs/*` routes require authentication. In practice, you usually use one of these:

- A Bearer token for scripts and CLI usage
- A `docsfy_session` cookie for browser usage

If you use the CLI, the checked-in example config looks like this:

```7:25:config.toml.example
[default]
server = "dev"

[servers.dev]
url = "http://localhost:8000"
username = "admin"
password = "<your-dev-key>"

[servers.prod]
url = "https://docsfy.example.com"
username = "admin"
password = "<your-prod-key>"

[servers.staging]
url = "https://staging.docsfy.example.com"
username = "deployer"
password = "<your-staging-key>"
```

With a profile like this, the CLI commands `docsfy generate`, `docsfy status`, `docsfy abort`, `docsfy delete`, and `docsfy download` call the same HTTP routes documented here.

Role behavior is important:

| Role | Can list / look up / view / download? | Can generate? | Can abort or delete? | Owner behavior |
| --- | --- | --- | --- | --- |
| `viewer` | Yes, for accessible variants | No | No | `?owner=` does not grant extra access |
| `user` | Yes, for owned variants and explicitly shared variants | Yes, for their own variants | Yes, for their own active runs and owned variants | `?owner=` is ignored on exact lookup routes |
| `admin` | Yes, across all owners | Yes | Yes | Use `?owner=<username>` on exact variant routes when needed |

Shared access is granted separately under the admin API. Once a project is shared, a non-admin user can list it, open its docs, and download it, but they still cannot delete another owner's data or abort another owner's active run. See [User and Access Management](user-and-access-management.md) for the sharing endpoints.

Access grants are owner-scoped, not global to a repository name:

```258:263:src/docsfy/storage.py
CREATE TABLE IF NOT EXISTS project_access (
    project_name TEXT NOT NULL,
    project_owner TEXT NOT NULL DEFAULT '',
    username TEXT NOT NULL,
    PRIMARY KEY (project_name, project_owner, username)
)
```

> **Note:** For browser requests to `/docs/*`, unauthenticated HTML requests are redirected to `/login`. API-style requests get `401 Unauthorized`.

> **Warning:** For exact variant routes, `?owner=` is an **admin disambiguation tool**, not a general-purpose access override. If you are not an admin, docsfy resolves the variant from your own account and your granted access only.

## Branches, Providers, and Defaults

Branch is part of the variant identity and part of the URL. `main` and `dev` are different variants even when everything else is identical.

Docsfy validates the generation request body like this:

```18:58:src/docsfy/models.py
class GenerateRequest(BaseModel):
    repo_url: str | None = Field(
        default=None, description="Git repository URL (HTTPS or SSH)"
    )
    repo_path: str | None = Field(default=None, description="Local git repository path")
    ai_provider: Literal["claude", "gemini", "cursor"] | None = None
    ai_model: str | None = None
    ai_cli_timeout: int | None = Field(default=None, gt=0)
    force: bool = Field(
        default=False, description="Force full regeneration, ignoring cache"
    )
    branch: str = Field(
        default=DEFAULT_BRANCH, description="Git branch to generate docs from"
    )

    @field_validator("branch")
    @classmethod
    def validate_branch(cls, v: str) -> str:
        if "/" in v:
            msg = (
                f"Invalid branch name: '{v}'. Branch names cannot contain slashes "
                "— use hyphens instead (e.g., release-1.x)."
            )
            raise ValueError(msg)
        if not re.match(r"^[a-zA-Z0-9][a-zA-Z0-9._-]*$", v):
            msg = f"Invalid branch name: '{v}'"
            raise ValueError(msg)
        if ".." in v:
            msg = f"Invalid branch name: '{v}'"
            raise ValueError(msg)
        return v

    @model_validator(mode="after")
    def validate_source(self) -> GenerateRequest:
        if not self.repo_url and not self.repo_path:
            msg = "Either 'repo_url' or 'repo_path' must be provided"
            raise ValueError(msg)
        if self.repo_url and self.repo_path:
            msg = "Provide either 'repo_url' or 'repo_path', not both"
            raise ValueError(msg)
        return self
```

In practice:

- Omit `branch` and docsfy uses `main`
- Good branch names include `main`, `dev`, `release-1.x`, and `v2.0.1`
- Slash-based branch names like `release/v2.0` are rejected
- `repo_url` and `repo_path` are mutually exclusive
- `repo_path` is for admin-only local generation workflows

The server also has real defaults for provider, model, timeout, and data directory:

```16:22:src/docsfy/config.py
    admin_key: str = ""  # Required — validated at startup
    ai_provider: str = "cursor"
    ai_model: str = "gpt-5.4-xhigh-fast"
    ai_cli_timeout: int = Field(default=60, gt=0)
    log_level: str = "INFO"
    data_dir: str = "/data"
    secure_cookies: bool = True  # Set to False for local HTTP dev
```

> **Warning:** The branch lives in the URL path, so it must be a single safe path segment. If your Git workflow uses slash-based branch names, use a hyphenated variant such as `release-1.x` when generating docs with docsfy.

## List Projects and Variants

Use `GET /api/status` or `GET /api/projects` when you want the dashboard-style view of everything you can access.

The response shape used by the frontend is:

```5:20:frontend/src/types/index.ts
export interface Project {
  name: string
  branch: string
  ai_provider: string
  ai_model: string
  owner: string
  repo_url: string
  status: ProjectStatus
  current_stage: string | null
  last_commit_sha: string | null
  last_generated: string | null
  page_count: number
  error_message: string | null
  plan_json: string | null
  created_at: string
  updated_at: string
}
```

```64:68:frontend/src/types/index.ts
export interface ProjectsResponse {
  projects: Project[]
  known_models: Record<string, string[]>
  known_branches: Record<string, string[]>
}
```

This endpoint is useful for polling because it returns:

- `projects`: a flat list of accessible variants
- `known_models`: model suggestions grouped by provider
- `known_branches`: branch suggestions grouped by project name

A few details matter:

- `projects` includes every accessible variant, not just `ready` ones
- `known_models` and `known_branches` are built from successful `ready` variants, so they behave like suggestions rather than a full history
- Non-admin users see owned variants plus explicitly shared variants
- Admins see all owners' variants

If you want all visible variants for one repository name, use `GET /api/projects/{name}`. That returns:

- `name`
- `variants`: every accessible variant for that project name

If you want one exact variant, use `GET /api/projects/{name}/{branch}/{provider}/{model}`.

> **Tip:** Use `/api/status` when you are building a dashboard, poller, or "watch until ready" script. Use `/api/projects/{name}/{branch}/{provider}/{model}` when you need one exact row and you already know the coordinates.
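
A minimal "watch until ready" poller over HTTP might look like this sketch, which assumes the `ProjectsResponse` shape above; `SERVER` and `API_KEY` are illustrative variables:

```shell
# Poll until the dev/gemini variant of for-testing-only leaves `generating`.
while true; do
  state=$(curl -s -H "Authorization: Bearer $API_KEY" "$SERVER/api/status" |
    jq -r '.projects[]
      | select(.name == "for-testing-only" and .branch == "dev"
               and .ai_provider == "gemini" and .ai_model == "gemini-2.5-flash")
      | .status')
  [ "$state" != "generating" ] && break
  sleep 15
done
echo "Variant finished with status: $state"
```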

### Status and Stage Fields

`status` is one of:

- `generating`
- `ready`
- `error`
- `aborted`

While generation is running, `current_stage` can move through these UI-visible stages:

```26:34:frontend/src/lib/constants.ts
export const GENERATION_STAGES = [
  'cloning',
  'planning',
  'incremental_planning',
  'generating_pages',
  'validating',
  'cross_linking',
  'rendering',
] as const
```

A `ready` variant may also keep `current_stage = up_to_date` when docsfy determines that the current output already matches the target commit and no regeneration is needed.

## Start a Generation

`POST /api/generate` queues a generation and returns immediately with `202 Accepted`. The work happens asynchronously.

The frontend sends this request body when a user clicks **Generate**:

```158:164:frontend/src/components/shared/GenerateForm.tsx
await api.post('/api/generate', {
  repo_url: submittedRepoUrl,
  branch: submittedBranch,
  ai_provider: submittedProvider,
  ai_model: submittedModel,
  force: submittedForce,
})
```

A real API test-plan example also shows that omitting `branch` falls back to `main`:

```215:218:test-plans/e2e-10-branch-support.md
curl -s -X POST http://localhost:8800/api/generate -H "Authorization: Bearer <TEST_USER_PASSWORD>" -H "Content-Type: application/json" -d '{"repo_url":"https://github.com/myk-org/for-testing-only","ai_provider":"gemini","ai_model":"gemini-2.5-flash"}'
```

The stored project name is derived automatically from the repository URL or local directory name:

```83:89:src/docsfy/models.py
@property
def project_name(self) -> str:
    if self.repo_url:
        return extract_repo_name(self.repo_url)
    if self.repo_path:
        return Path(self.repo_path).resolve().name
    return "unknown"
```

Important generation rules:

- Send either `repo_url` or `repo_path`, never both
- `repo_url` is the normal user-facing path and should be a standard Git HTTPS or SSH URL
- `repo_path` must be an absolute local path, must exist, must contain `.git`, and is admin-only
- If you omit `ai_provider` or `ai_model`, docsfy uses the server defaults
- Starting the exact same owner/branch/provider/model variant twice at the same time returns `409`

```mermaid
sequenceDiagram
    participant Client
    participant API as Projects API
    participant DB as SQLite storage
    participant Worker as Async generation task
    participant Site as Rendered site

    Client->>API: POST /api/generate
    API->>DB: Save variant row<br/>status=generating
    API-->>Client: 202 Accepted<br/>{project, status, branch}
    API->>Worker: create task

    loop track progress
        Worker->>DB: update current_stage,<br/>page_count, plan_json
        Client->>API: GET /api/status
        API->>DB: list accessible variants
        API-->>Client: projects + known_models + known_branches
    end

    Worker->>Site: render static HTML output
    Worker->>DB: status=ready,<br/>last_commit_sha, last_generated, page_count
    Client->>API: GET /docs/... or /api/projects/.../download
    API-->>Client: HTML file or tar.gz archive
```

> **Warning:** `repo_url` is validated and basic SSRF protections are enforced. Localhost, bare local paths, and private-network repository targets are rejected. If you need a local checkout, use admin `repo_path` instead.
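
As an admin-only sketch, a local-checkout generation sends `repo_path` instead of `repo_url`. The path below is a placeholder; on the server it must be absolute and contain `.git`:

```shell
curl -s -X POST "$SERVER/api/generate" \
  -H "Authorization: Bearer $ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"repo_path": "/srv/checkouts/my-repo", "branch": "main"}'
```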

## Abort Running Work

Docsfy exposes two abort routes:

- `POST /api/projects/{name}/abort`
- `POST /api/projects/{name}/{branch}/{provider}/{model}/abort`

Use the project-name route only as a convenience. It succeeds only when there is exactly one active generation for that project name. If multiple active variants match the name, docsfy returns `409` and tells you to use the branch-specific route.

The exact variant abort URL used by the UI is:

```473:475:frontend/src/components/shared/VariantDetail.tsx
await api.post(
  `/api/projects/${project.name}/${project.branch}/${project.ai_provider}/${project.ai_model}/abort?owner=${encodeURIComponent(project.owner)}`
)
```

Practical rules:

- `viewer` cannot abort anything
- `user` can abort only their own active generation
- `admin` can abort any active variant, and `?owner=` helps disambiguate when multiple owners have the same exact coordinates
- If cancellation is still in progress, the abort route can return `409` and ask you to retry shortly

> **Tip:** In automation, prefer the exact variant abort route. The project-name abort route is best reserved for human convenience when you know only one generation is active.
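
In script form, the exact variant abort is a single `POST`; `SERVER` and `API_KEY` are illustrative:

```shell
curl -s -X POST -H "Authorization: Bearer $API_KEY" \
  "$SERVER/api/projects/for-testing-only/dev/gemini/gemini-2.5-flash/abort"
```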

## Delete Stored Output

Docsfy also exposes two delete shapes:

- `DELETE /api/projects/{name}/{branch}/{provider}/{model}` deletes one exact variant
- `DELETE /api/projects/{name}` deletes all variants for one owner-scoped project group

The UI's exact variant delete call includes `?owner=`:

```107:109:frontend/src/components/shared/VariantDetail.tsx
await api.delete(
  `/api/projects/${project.name}/${project.branch}/${project.ai_provider}/${project.ai_model}?owner=${encodeURIComponent(project.owner)}`
)
```

The dashboard's "delete all variants" behavior also owner-scopes each delete call:

```350:359:frontend/src/pages/DashboardPage.tsx
// Collect distinct owners for this project name so each delete call
// includes the required ?owner= query parameter.
const owners = [...new Set(
  projects
    .filter((p) => p.name === name && (!ownerFilter || p.owner === ownerFilter))
    .map((p) => p.owner)
)]
for (const owner of owners) {
  await api.delete(`/api/projects/${name}?owner=${encodeURIComponent(owner)}`)
}
```

Deletion rules:

- `viewer` cannot delete
- `user` can delete only their own variants
- `admin` must provide `?owner=<username>` for delete routes so docsfy knows which owner-scoped project to remove
- If a matching variant is still generating, delete returns `409`; abort it first

> **Warning:** `DELETE /api/projects/{name}` is not "delete this repo globally." It deletes all variants for one owner-scoped project group. For admins, `?owner=` is required.
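
For example, an admin removing one owner-scoped project group would call:

```shell
# Delete every variant of my-repo owned by alice (?owner= is required for admins).
curl -s -X DELETE -H "Authorization: Bearer $ADMIN_KEY" \
  "$SERVER/api/projects/my-repo?owner=alice"
```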

## Download Archives

There are two download routes:

- `GET /api/projects/{name}/{branch}/{provider}/{model}/download`
- `GET /api/projects/{name}/download`

Use the exact variant route when you need a stable artifact. Use the short route only when "latest ready" is acceptable.

A real end-to-end example downloads an exact variant archive and extracts it:

```78:84:test-plans/e2e-08-cross-model-updates.md
curl -s -L -H "Authorization: Bearer $ADMIN_KEY" \
  "$SERVER/api/projects/for-testing-only/main/$BASELINE_PROVIDER/$BASELINE_MODEL/download" \
  -o "$CROSS_PROVIDER_ROOT/baseline.tar.gz"
mkdir -p "$CROSS_PROVIDER_ROOT/baseline"
tar -xzf "$CROSS_PROVIDER_ROOT/baseline.tar.gz" --strip-components=1 -C "$CROSS_PROVIDER_ROOT/baseline"
ls "$CROSS_PROVIDER_ROOT/baseline"
```

What to expect:

- Downloads work only for `ready` variants
- The response is `application/gzip`
- Exact variant downloads are named `<project>-<branch>-<provider>-<model>-docs.tar.gz`
- Latest-route downloads are named `<project>-docs.tar.gz`
- The archive contains a top-level directory, so `tar --strip-components=1` is useful when extracting into an existing folder

> **Tip:** Use the exact variant download route in release pipelines, QA jobs, and bug reports. It is pinned to one build and does not change when a newer variant becomes ready later.

## Serve Rendered Docs

Docsfy serves the generated static site directly under `/docs`.

The exact variant route is:

```200:208:src/docsfy/main.py
@app.get("/docs/{project}/{branch}/{provider}/{model}/{path:path}")
async def serve_variant_docs(
    request: Request,
    project: str,
    branch: str,
    provider: str,
    model: str,
    path: str = "index.html",
) -> FileResponse:
```

The short "latest ready" route is:

```235:241:src/docsfy/main.py
@app.get("/docs/{project}/{path:path}")
async def serve_docs(
    request: Request, project: str, path: str = "index.html"
) -> FileResponse:
    """Serve the most recently generated variant."""
    if not path or path == "/":
        path = "index.html"
```

The dashboard builds exact docs and download links like this:

```366:367:frontend/src/components/shared/VariantDetail.tsx
const docsUrl = `/docs/${project.name}/${project.branch}/${project.ai_provider}/${project.ai_model}/?owner=${encodeURIComponent(project.owner)}`
const downloadUrl = `/api/projects/${project.name}/${project.branch}/${project.ai_provider}/${project.ai_model}/download?owner=${encodeURIComponent(project.owner)}`
```

Use the exact docs route when you want one pinned build:

- `/docs/<project>/<branch>/<provider>/<model>/`
- `/docs/<project>/<branch>/<provider>/<model>/index.html`
- `/docs/<project>/<branch>/<provider>/<model>/introduction.html`

Use the short docs route when you want "whatever is newest and ready":

- `/docs/<project>/`
- `/docs/<project>/index.html`

A few behaviors are easy to miss:

- Empty docs paths resolve to `index.html`
- Any generated file under the rendered site can be served through the same prefix
- The short `/docs/<project>/...` route picks the newest accessible `ready` variant by `last_generated`, not by branch name or commit SHA
- For admins, "latest" is global across all owners for that project name
- For non-admin users, "latest" means the newest owned or explicitly shared `ready` variant they can access
- For non-admin users, the short route can also return `409` if two accessible owners tie for the newest ready timestamp
- If you need one specific owner, branch, provider, or model, use the exact route instead of the short route

> **Warning:** The short docs and download routes are convenience URLs. They can point to a different build after the next successful generation, and they do not let you choose an owner explicitly.

> **Note:** Exact variant docs routes can use `?owner=<username>` for admin disambiguation. The short "latest" docs route does not let you select an owner.

## Common Response Codes

| Code | When you will see it |
| --- | --- |
| `200 OK` | Successful list, lookup, delete, abort, download, or docs-file response |
| `202 Accepted` | `POST /api/generate` accepted the job and queued async work |
| `400 Bad Request` | Invalid delete usage, invalid repo path, invalid download state such as "variant not ready," or missing required owner on admin delete |
| `401 Unauthorized` | Missing or invalid authentication for API-style requests |
| `403 Forbidden` | Viewer tried a write route, non-admin tried `repo_path`, or a docs file path tried to escape the rendered site |
| `404 Not Found` | The project, variant, site file, or active generation does not exist or is not accessible to you |
| `409 Conflict` | Duplicate generation, ambiguous owner selection, multiple active variants for a name-based abort, delete while generating, or cancellation still finishing |
| `422 Unprocessable Entity` | Request-body validation failed, such as an invalid repo URL or invalid branch name |

## Practical Recommendations

- Use `GET /api/status` for dashboards and progress polling.
- Use exact variant routes in scripts, release tooling, and bookmarks you want to stay stable.
- As an admin, keep the `?owner=` parameter when the UI includes it in exact variant routes.
- Treat the short `/docs/<project>/...` and `/api/projects/<project>/download` routes as convenience shortcuts, not permanent identifiers.
- If you are sharing docs across users, remember that sharing affects what a user can read, not what they can delete or abort.


## Related Pages

- [Generating Documentation](generating-documentation.html)
- [Viewing, Downloading, and Hosting Docs](viewing-downloading-and-hosting-docs.html)
- [Projects, Variants, and Ownership](projects-variants-and-ownership.html)
- [Variants, Branches, and Regeneration](variants-branches-and-regeneration.html)
- [WebSocket Protocol](websocket-protocol.html)

---

Source: admin-api.md

## Authentication

All routes under `/api/admin` require admin privileges. docsfy recognizes two kinds of admin credentials:

- the built-in `admin` account, authenticated with the `ADMIN_KEY` environment variable
- a database-backed user created with role `admin`

For scripts and automation, send the key as `Authorization: Bearer <token>`. For browser-based admin flows, log in through `POST /api/auth/login`; successful logins set an `HttpOnly` `docsfy_session` cookie with `SameSite=strict`, an 8-hour lifetime, and a `secure` flag controlled by `SECURE_COOKIES`.

The server-side auth settings come directly from `.env.example`:

```1:17:.env.example
# Required: Admin password (minimum 16 characters)
ADMIN_KEY=

# AI provider and model defaults
# (pydantic_settings reads these case-insensitively)
AI_PROVIDER=cursor
AI_MODEL=gpt-5.4-xhigh-fast
AI_CLI_TIMEOUT=60

# Logging
LOG_LEVEL=INFO

# Data directory for database and generated docs
DATA_DIR=/data

# Cookie security (set to false for local HTTP development)
SECURE_COOKIES=true
```

> **Warning:** `ADMIN_KEY` is not just the built-in admin password. docsfy also uses it as the HMAC secret for stored user API keys, so changing `ADMIN_KEY` invalidates every existing database-backed user key.

> **Tip:** If you are testing over plain `http://` on localhost, set `SECURE_COOKIES=false` or browser login sessions will not persist.

The bundled `docsfy` CLI uses the same Bearer-token model under the hood. The admin commands live under `docsfy admin users ...` and `docsfy admin access ...`.

## Quick Reference

| Route | Purpose |
|---|---|
| `GET /api/admin/users` | List database-backed users |
| `POST /api/admin/users` | Create a user and return a one-time API key |
| `DELETE /api/admin/users/{username}` | Delete a user, their sessions, grants, and owned projects |
| `POST /api/admin/users/{username}/rotate-key` | Generate or set a new API key for a user |
| `POST /api/admin/projects/{name}/access` | Grant access to a project owned by a specific user |
| `GET /api/admin/projects/{name}/access?owner=<owner>` | List usernames with access to that project |
| `DELETE /api/admin/projects/{name}/access/{username}?owner=<owner>` | Revoke access from a user |

Common status codes:

- `200`: request succeeded
- `400`: malformed JSON, missing required fields, invalid role, invalid username, or too-short custom key
- `401`: no valid Bearer token or session cookie
- `403`: authenticated, but not an admin
- `404`: target user or project was not found
- `409`: delete was blocked because the user currently has a generation in progress

## User Management

### Create A User

Use `POST /api/admin/users` to create a database-backed account.

Request body:

- `username`: required
- `role`: optional, defaults to `user`

Valid roles are:

- `admin`: full admin privileges
- `user`: normal read/write project operations
- `viewer`: read-only access; can sign in and view granted docs, but cannot generate or delete projects

Username rules are strict:

- 2 to 50 characters
- must start with a letter or digit
- `.`, `_`, and `-` are allowed
- `admin` is reserved, case-insensitively
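
Outside the test suite, the same request can be made from a small admin script. A minimal sketch, assuming the third-party `requests` package; the server URL and admin key are placeholders:

```python
import requests

SERVER = "http://localhost:8000"   # placeholder deployment URL
ADMIN_KEY = "change-me-16-chars!"  # placeholder admin credential

resp = requests.post(
    f"{SERVER}/api/admin/users",
    json={"username": "testuser", "role": "user"},
    headers={"Authorization": f"Bearer {ADMIN_KEY}"},
)
resp.raise_for_status()
created = resp.json()
# The raw api_key appears only in this response, so store it immediately.
print(created["username"], created["role"], created["api_key"])
```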

The admin API tests exercise user creation like this:

```80:89:tests/test_api_admin.py
response = await admin_client.post(
    "/api/admin/users",
    json={"username": "testuser", "role": "user"},
)
assert response.status_code == 200
data = response.json()
assert data["username"] == "testuser"
assert data["role"] == "user"
assert data["api_key"].startswith("docsfy_")
assert response.headers.get("cache-control") == "no-store"
```

A successful response includes:

- `username`
- `role`
- `api_key`

The returned `api_key` is the raw secret. docsfy stores only a hash in the database.

> **Warning:** Treat the returned `api_key` as a one-time secret. `GET /api/admin/users` does not return raw keys later.

> **Note:** The built-in `admin` account is not a database user. You will not see it in `GET /api/admin/users`, and you cannot create another user named `admin`.

### List Users

Use `GET /api/admin/users` to retrieve the current database-backed user list.

Each entry includes:

- `id`
- `username`
- `role`
- `created_at`

This route intentionally does not return `api_key` or `api_key_hash`.

### Delete A User

Use `DELETE /api/admin/users/{username}` to permanently remove a user.

Important behavior:

- the request is rejected if you try to delete the account you are currently authenticated as
- the request returns `404` if the user does not exist
- the request returns `409` if that user currently has a generation in progress
- all sessions for that user are invalidated
- project-access grants for that user are removed
- access entries for projects owned by that user are removed
- projects owned by that user are deleted

> **Warning:** Deleting a user is destructive. In docsfy, this is more than removing a login: it also removes the user's owned projects and related access-control entries.

### Rotate A User Key

Use `POST /api/admin/users/{username}/rotate-key` to replace a user's API key.

Request body:

- empty JSON object to generate a new random key
- or `{ "new_key": "..." }` to set a custom key

Custom keys must be at least 16 characters long.

The auth tests cover an admin-set custom key like this:

```908:919:tests/test_auth.py
custom = "admin-chosen-password-long"
resp = await admin_client.post(
    "/api/admin/users/admin-custom-target/rotate-key",
    json={"new_key": custom},
)
assert resp.status_code == 200
assert resp.json()["new_api_key"] == custom

# Verify the custom key works
user = await get_user_by_key(custom)
assert user is not None
assert user["username"] == "admin-custom-target"
```

A successful response includes:

- `username`
- `new_api_key`

This response is sent with `Cache-Control: no-store`, and all existing sessions for that user are invalidated.

If the target user does not exist, the route returns `404`.

> **Note:** There is currently no separate admin endpoint for changing a user's role after creation. Roles are assigned when the user is created.

## Project Access Administration

Project sharing in docsfy is grant-based. An admin can give one user access to a project owned by another user without transferring ownership.

Access is scoped by:

- project name
- project owner

Access is not scoped by:

- branch
- AI provider
- AI model

That means one grant covers every variant of that project for that owner.

> **Note:** In these endpoints, `owner` means the project owner's username, not the admin making the request.

A granted user sees shared projects through the normal project listing and docs routes. Their role still matters: a `viewer` with a grant can open docs, but remains read-only.

### Grant Access

Use `POST /api/admin/projects/{name}/access`.

Request body:

- `username`: required
- `owner`: required

Before creating the grant, docsfy verifies that:

- the target user exists
- the project exists for the specified owner

The route implementation shows the exact validation and response shape:

```127:143:src/docsfy/api/admin.py
# Validate user exists
user = await get_user_by_username(username)
if not user:
    raise HTTPException(status_code=404, detail=f"User '{username}' not found")
# Validate project exists for the specified owner
variants = await list_variants(name, owner=project_owner)
if not variants:
    raise HTTPException(
        status_code=404,
        detail=f"Project '{name}' not found for owner '{project_owner}'",
    )
await grant_project_access(name, username, project_owner=project_owner)
await notify_access_change(username)
logger.info(
    f"[AUDIT] Admin '{request.state.username}' granted '{username}' access to '{name}' (owner: '{project_owner}')"
)
return {"granted": name, "username": username, "owner": project_owner}
```

Use this route when you want a user to see a project they do not own, including all of that project's variants for the chosen owner.

### Look Up Current Access

Use `GET /api/admin/projects/{name}/access?owner=<owner>` to see who currently has access to a project.

The response shape is:

- `project`
- `owner`
- `users`

`users` is a username list, ordered alphabetically.

This is the admin access-lookup endpoint exposed by the API today. It is project-centric: there is no separate admin route that lists every grant for a given user.
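
A minimal lookup sketch, assuming the third-party `requests` package with placeholder server, key, project, and owner values:

```python
import requests

SERVER = "http://localhost:8000"   # placeholder deployment URL
ADMIN_KEY = "change-me-16-chars!"  # placeholder admin credential

resp = requests.get(
    f"{SERVER}/api/admin/projects/myproject/access",
    params={"owner": "alice"},  # the project owner, not the requesting admin
    headers={"Authorization": f"Bearer {ADMIN_KEY}"},
)
resp.raise_for_status()
print(resp.json()["users"])  # alphabetical list of granted usernames
```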

### Revoke Access

Use `DELETE /api/admin/projects/{name}/access/{username}?owner=<owner>` to remove a grant.

The response includes:

- `revoked`
- `username`
- `owner`

You must still provide `owner`, because grants are stored per project-owner pair.

This route is effectively idempotent for automation. It always returns the same summary object and does not distinguish between "grant was present and removed" and "grant was already absent."

> **Tip:** Grant and revoke operations trigger a sync update for the affected user, so connected dashboards usually reflect the change without a manual refresh.

## Related Auth Endpoints

A few non-admin auth routes matter when you are operating the Admin API.

### Log In

Use `POST /api/auth/login` with JSON:

- `username`
- `api_key`

For the built-in admin account, `username` must be exactly `admin` and `api_key` must match `ADMIN_KEY`.

For database-backed users, the username must match the owner of the supplied API key.

A successful login returns:

- `username`
- `role`
- `is_admin`

It also sets the `docsfy_session` cookie for browser use.

### Check Current Identity

Use `GET /api/auth/me` to verify who the server thinks you are. It returns:

- `username`
- `role`
- `is_admin`

This is the quickest way to confirm whether a token is being treated as built-in admin, database admin, normal user, or viewer.
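
A minimal sketch of that check using a cookie-based session, assuming the third-party `requests` package with placeholder credentials:

```python
import requests

SERVER = "http://localhost:8000"  # placeholder deployment URL
session = requests.Session()

login = session.post(
    f"{SERVER}/api/auth/login",
    json={"username": "admin", "api_key": "change-me-16-chars!"},  # placeholders
)
login.raise_for_status()

# The docsfy_session cookie is stored on the session object,
# so the identity check authenticates automatically.
me = session.get(f"{SERVER}/api/auth/me")
print(me.json())  # {"username": ..., "role": ..., "is_admin": ...}
```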

### Log Out

Use `POST /api/auth/logout` to clear the current session cookie and delete the matching server-side session record.

### Self-Service Key Rotation

Use `POST /api/auth/rotate-key` when a database-backed user needs to rotate their own key.

This route accepts the same optional `new_key` field as the admin rotate route. On success, it:

- returns `username` and `new_api_key`
- sends `Cache-Control: no-store`
- deletes the current session cookie
- invalidates the user's existing sessions, so they must log in again with the new key

> **Warning:** The built-in `admin` account cannot use `POST /api/auth/rotate-key`. Manage that credential by changing `ADMIN_KEY` in your deployment instead.


## Related Pages

- [User and Access Management](user-and-access-management.html)
- [Authentication and Roles](authentication-and-roles.html)
- [Authentication API](auth-api.html)
- [Projects API](projects-api.html)
- [Security Considerations](security-considerations.html)

---

Source: websocket-protocol.md

# WebSocket Protocol

`/api/ws` is docsfy's real-time update stream. Connect once, authenticate, read the initial `sync` snapshot, then apply live `progress` and `status_change` messages as generation runs or project visibility changes.

For most clients, the mental model is simple:

- `sync` is the full snapshot and source of truth.
- `progress` reports in-flight work for one variant.
- `status_change` reports terminal states for one variant.
- `ping` / `pong` keeps the connection alive.
- `GET /api/projects` is the HTTP fallback path the built-in frontend uses when it cannot stay on WebSocket.

## Connection Model

The built-in frontend opens a same-origin socket and chooses `ws://` or `wss://` from the current page URL:

```ts
const protocol = window.location.protocol === 'https:' ? 'wss:' : 'ws:'
const url = `${protocol}//${window.location.host}/api/ws`
this.ws = new WebSocket(url)
```

That means:

- If docsfy is served over HTTPS, the frontend uses `wss://`.
- If docsfy is served over plain HTTP, the frontend uses `ws://`.
- The official UI connects to the same host that served the page.

There is no subscribe message or per-project room selection. One connection receives every update the authenticated user is allowed to see.

```mermaid
sequenceDiagram
    participant Client
    participant WS as /api/ws
    participant Jobs as Generation pipeline
    participant API as /api/projects

    Client->>WS: Connect with session cookie or ?token=
    WS-->>Client: sync
    loop Every 30s
        WS-->>Client: {"type":"ping"}
        Client-->>WS: {"type":"pong"}
    end
    Jobs-->>WS: progress / status_change / sync
    WS-->>Client: Live updates
    alt WebSocket closes abnormally
        Client->>WS: Retry with backoff
        Client->>API: Poll /api/projects every 10s after retries are exhausted
        API-->>Client: projects + known_models + known_branches
    end
```

## Authentication

docsfy supports two authentication paths for `/api/ws`.

### Session Cookie

This is the normal browser flow:

1. Log in with `POST /api/auth/login`.
2. docsfy sets a `docsfy_session` cookie.
3. Open `/api/ws` on the same origin.
4. The browser sends the cookie during the WebSocket handshake.

The login handler sets the cookie like this:

```python
response.set_cookie(
    "docsfy_session",
    session_token,
    httponly=True,
    samesite="strict",
    secure=settings.secure_cookies,
    max_age=SESSION_TTL_SECONDS,
)
```

In the current codebase, that means browser sessions are:

- `HttpOnly`
- `SameSite=Strict`
- controlled by `SECURE_COOKIES` for the `Secure` flag
- valid for 8 hours by default
- backed by an opaque session token, not the raw API key

> **Tip:** For browser integrations, use the session-cookie flow. It matches the built-in frontend and keeps raw API keys out of the WebSocket URL.

> **Warning:** If you are running docsfy on plain `http://localhost`, set `SECURE_COOKIES=false`. Otherwise the browser will not send the secure session cookie, and `/api/ws` will not authenticate.

### `?token=` Query Parameter

Direct clients can authenticate with the admin key or a user's API key by putting it in the URL query string.

The CLI watcher does exactly that:

```python
ws_url = (
    server_url.rstrip("/").replace("https://", "wss://").replace("http://", "ws://")
)
ws_url = f"{ws_url}/api/ws?{urlencode({'token': password})}"
```

> **Warning:** `?token=` puts the credential in the URL. URLs can end up in logs, browser history, proxies, and monitoring tools. Prefer the session-cookie flow in browser-based code.

> **Note:** REST requests can use `Authorization: Bearer <token>`, but `/api/ws` does not authenticate that way. The WebSocket handler checks the `docsfy_session` cookie and the `token` query parameter.

### Auth Failures

If the handshake is not authenticated, the server closes the socket with code `1008`.

Some browser tooling reports a generic connection error instead of surfacing the close code directly, so failed browser tests may show either a close code or a plain error.

## Message Types

| Message | When it is sent | Key fields |
| --- | --- | --- |
| `sync` | Immediately after connect, and again when the server wants a full resync | `projects`, `known_models`, `known_branches` |
| `progress` | While a generation is still running | `name`, `branch`, `provider`, `model`, `owner`, `status`, plus optional progress fields |
| `status_change` | When a variant reaches a terminal state | `name`, `branch`, `provider`, `model`, `owner`, `status`, plus optional final metadata |
| `ping` | Heartbeat from server | `{ "type": "ping" }` |
| `pong` | Heartbeat reply from client | `{ "type": "pong" }` |

Client messages are minimal: the server only expects `pong`. Malformed client messages are ignored.

## `sync`

`sync` is the full replacement payload. If you are writing a client, treat it as the latest authoritative snapshot for the current user.

The frontend types define it like this:

```ts
export interface SyncMessage {
  type: 'sync'
  projects: Project[]
  known_models: Record<string, string[]>
  known_branches: Record<string, string[]>
}
```

A few practical details:

- `projects` is the same live project list returned by `GET /api/projects`.
- `projects` can include variants in `generating`, `ready`, `error`, and `aborted`.
- `known_models` and `known_branches` are helper maps used by the UI for provider/model and branch pickers.
- Both helper maps are built from `ready` projects, not in-progress ones.
- The server sends `sync` immediately on connect.
- The server also uses `sync` after broader consistency changes such as terminal status updates, deletes, variant replacement, and access grants or revokes.

> **Tip:** If you already know how to handle `GET /api/projects`, you already know how to handle `sync`. The only extra field is `type: "sync"`.

## `progress`

`progress` is the in-flight message. It reports that a generation is still running and adds whatever progress fields are available so far.

The backend builds the payload like this:

```python
message: dict[str, Any] = {
    "type": "progress",
    "name": project_name,
    "branch": branch,
    "provider": provider,
    "model": model,
    "owner": owner,
    "status": status,
}
if current_stage is not None:
    message["current_stage"] = current_stage
if page_count is not None:
    message["page_count"] = page_count
if plan_json is not None:
    message["plan_json"] = plan_json
if error_message is not None:
    message["error_message"] = error_message
```

In the current backend, `progress.status` is used for the running state, so you should expect `status: "generating"` here.

The stage names currently used by docsfy are:

- `cloning`
- `planning`
- `incremental_planning`
- `generating_pages`
- `validating`
- `cross_linking`
- `rendering`

What the optional fields mean:

- `current_stage`: the backend's current phase, when one is known
- `page_count`: pages generated so far
- `plan_json`: the saved documentation plan as a JSON string
- `error_message`: an in-flight error detail when the backend has one to surface

`plan_json` is a serialized string, not a pre-parsed object.
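
A client that wants the structured plan has to decode the string itself. A minimal sketch; the handler name is hypothetical:

```python
import json

def handle_progress(message: dict) -> None:
    # plan_json arrives as a JSON string, so decode it before use.
    raw_plan = message.get("plan_json")
    if raw_plan is not None:
        plan = json.loads(raw_plan)
        print("received plan:", plan)
```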

## `status_change`

`status_change` is used for terminal states only. In the current backend, that means:

- `ready`
- `error`
- `aborted`

The message always identifies a specific variant with:

- `name`
- `branch`
- `provider`
- `model`
- `owner`
- `status`

It may also include:

- `page_count`
- `last_generated`
- `last_commit_sha`
- `error_message`

A few useful expectations:

- Successful runs end with `status: "ready"`.
- A run that turns out to need no work still ends as `ready`.
- Failed or user-cancelled runs end as `error` or `aborted`.
- After a terminal `status_change`, the backend also sends a full `sync`.

If docsfy finds that a variant is already current, the later `sync` can show `current_stage: "up_to_date"` in the stored project record even though the terminal WebSocket message is still just `status: "ready"`.

> **Warning:** `sync.projects` uses `ai_provider` and `ai_model` because it carries full `Project` records. `progress` and `status_change` use `provider` and `model`. If you merge incremental updates into `Project` objects, map those field names explicitly.

Because terminal updates are followed by `sync`, client handlers should be idempotent. Apply the newest state; do not assume messages are exactly-once events.

## Matching Updates To A Variant

Use the full variant identity when applying `progress` or `status_change`:

- `name`
- `branch`
- `provider`
- `model`
- `owner`

That is what lets you safely distinguish:

- different branches of the same repository
- different provider/model outputs for the same repository
- different owners who generated the same repository name

The built-in CLI watcher uses this exact approach when it filters incoming updates.
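
Combining that identity key with the field-name caveat above, a minimal merge sketch for a locally cached project map might look like this; the cached record shape is an assumption based on the fields named in this page:

```python
def apply_update(projects: dict[tuple, dict], msg: dict) -> None:
    """Fold a progress/status_change message into locally cached records."""
    key = (msg["name"], msg["branch"], msg["provider"], msg["model"], msg["owner"])
    record = projects.setdefault(key, {
        "name": msg["name"],
        "branch": msg["branch"],
        # Full Project records use ai_provider / ai_model, so map explicitly.
        "ai_provider": msg["provider"],
        "ai_model": msg["model"],
        "owner": msg["owner"],
    })
    record["status"] = msg["status"]
    # Apply the newest state idempotently; absent fields stay untouched.
    for field in ("current_stage", "page_count", "last_generated",
                  "last_commit_sha", "error_message"):
        if field in msg:
            record[field] = msg[field]
```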

## Heartbeat: `ping` And `pong`

The server keeps the socket alive with a heartbeat:

```python
_WS_HEARTBEAT_INTERVAL = 30
_WS_PONG_TIMEOUT = 10
_WS_MAX_MISSED_PONGS = 2
```

In practice, that means:

- the server sends `{"type": "ping"}` every 30 seconds
- the client should reply with `{"type": "pong"}`
- the server waits up to 10 seconds for the reply
- after 2 missed pongs, the server closes the connection with code `1001`

The built-in frontend replies like this:

```ts
if (isPingMessage(parsed)) {
  this.ws?.send(JSON.stringify({ type: 'pong' }))
  return
}
```

You do not need to send your own periodic pings unless your runtime or infrastructure requires it for some other reason.

## Who Receives Updates?

WebSocket delivery is based on access control, not subscriptions.

docsfy sends relevant project updates to:

- admins
- the project owner
- users who have been granted access to that project

That has a few consequences:

- Admins see all variants.
- Non-admin users see their own variants plus shared variants they have been granted access to.
- If an admin grants or revokes access, the affected user receives a fresh `sync` rather than a special `access_change` message.
- Multiple simultaneous connections for the same user are supported, so multiple browser tabs can all receive the same updates.

## Polling Fallback Expectations

The dashboard does not depend on WebSocket alone.

On first load, it fetches the current project list over HTTP with `GET /api/projects`. After that, it opens `/api/ws` and prefers live push updates. If the socket closes abnormally, the built-in frontend retries a few times and then shifts to polling.

The retry and fallback logic is:

```ts
this.ws.onclose = (event) => {
  console.debug('[WS] Disconnected, code:', event.code)
  if (event.code !== 1000) this.attemptReconnect()
}

private attemptReconnect(): void {
  if (this.reconnectAttempts >= this.maxReconnectAttempts) {
    console.debug('[WS] Falling back to polling')
    this.startPolling()
    return
  }
  const delay = this.getBackoffDelay()
  this.reconnectAttempts++
  console.debug('[WS] Reconnecting, attempt', this.reconnectAttempts)
  this.reconnectTimer = setTimeout(() => this.connect(true), delay)
}

private startPolling(): void {
  if (this.pollingTimer) return
  this.pollingTimer = setInterval(async () => {
    try {
      const data = await api.get<ProjectsResponse>('/api/projects')
      const syncMessage: WebSocketMessage = {
        type: 'sync' as const,
        projects: data.projects,
        known_models: data.known_models,
        known_branches: data.known_branches,
      }
      this.handlers.forEach(handler => handler(syncMessage))
    } catch {
      /* ignore polling errors */
    }
  }, WS_POLLING_FALLBACK_MS)
}
```

With the current frontend defaults, that means:

- close code `1000` is treated as a normal disconnect, so the manager does not retry
- other closes trigger up to 3 reconnect attempts
- the reconnect backoff is 1 second, then 2 seconds, then 4 seconds
- after retries are exhausted, the dashboard polls `GET /api/projects` every 10 seconds
- polling results are turned into synthetic `sync` messages before they reach the rest of the UI

The dashboard also has two extra best-effort refresh paths:

- When you start a generation, it waits 5 seconds and fetches `/api/projects` if the new variant still has not appeared in local state.
- If it receives a `progress` or `status_change` message for a variant it does not have locally yet, it fetches `/api/projects` to resync.

> **Note:** In fallback mode, the UI remains usable, but updates arrive on the polling interval instead of immediately.

## Configuration That Affects `/api/ws`

The settings that matter most are the admin key and cookie security:

```python
class Settings(BaseSettings):
    admin_key: str = ""  # Required — validated at startup
    secure_cookies: bool = True  # Set to False for local HTTP dev
```

In practice:

- `ADMIN_KEY` is required and must be at least 16 characters long.
- `SECURE_COOKIES` controls whether browser session cookies use the `Secure` flag.
- Browser-based WebSocket auth depends on that session cookie, so HTTPS/WSS is the right production setup.

> **Tip:** Keep `SECURE_COOKIES=true` in production. Only turn it off for plain-HTTP local development.

## Practical Client Checklist

- Open one `/api/ws` connection per active client session.
- Authenticate with the session cookie or `?token=<api-key>`.
- Treat the first `sync` as your initial state.
- Match `progress` and `status_change` by `name + branch + provider + model + owner`.
- Reply to every `ping` with `pong`.
- Reuse your `GET /api/projects` handler for fallback, since that is the same payload shape the official frontend turns into `sync`.
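
Putting the checklist together, here is a minimal token-authenticated client sketch. It assumes the third-party `websockets` package; the server URL and key are placeholders:

```python
import asyncio
import json

import websockets  # third-party package, assumed available

SERVER = "ws://localhost:8000"  # placeholder deployment URL
API_KEY = "docsfy_..."          # placeholder credential

async def watch() -> None:
    async with websockets.connect(f"{SERVER}/api/ws?token={API_KEY}") as ws:
        async for raw in ws:
            msg = json.loads(raw)
            if msg.get("type") == "ping":
                # Answer the heartbeat so the server keeps the socket open.
                await ws.send(json.dumps({"type": "pong"}))
            elif msg.get("type") == "sync":
                print("snapshot:", len(msg["projects"]), "projects")
            elif msg.get("type") in ("progress", "status_change"):
                key = (msg["name"], msg["branch"], msg["provider"],
                       msg["model"], msg["owner"])
                print(msg["type"], key, msg["status"])

asyncio.run(watch())
```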


## Related Pages

- [Tracking Progress and Status](tracking-progress-and-status.html)
- [Projects API](projects-api.html)
- [Authentication API](auth-api.html)
- [CLI Workflows](cli-workflows.html)
- [Local Development](local-development.html)

---

Source: deployment-and-runtime.md

# Deployment and Runtime

docsfy runs as a single FastAPI service that also serves the built React app and the generated documentation output. In the official container image, the process listens on `8000`, keeps durable state under `/data`, includes the AI and diagram toolchain it needs at runtime, and runs as a non-root user.

## Runtime Configuration

| Setting | What it controls | Default |
|---|---|---|
| `ADMIN_KEY` | Required startup credential; also used when hashing stored user API keys | required |
| `DATA_DIR` | Root directory for the SQLite database and stored artifacts | `/data` |
| `SECURE_COOKIES` | Whether the browser session cookie uses the `Secure` flag | `true` |
| `AI_PROVIDER` / `AI_MODEL` | Server defaults when a request omits provider/model | `cursor` / `gpt-5.4-xhigh-fast` |
| `AI_CLI_TIMEOUT` | Default timeout for AI CLI calls | `60` |
| `DEV_MODE` | Container-only switch that starts Vite alongside the backend | off |

If you install the Python package directly, the packaged server launcher reads `HOST`, `PORT`, and `DEBUG` from the environment:

```303:309:src/docsfy/main.py
def run() -> None:
    import uvicorn

    reload = os.getenv("DEBUG", "").lower() == "true"
    host = os.getenv("HOST", "127.0.0.1")
    port = int(os.getenv("PORT", "8000"))
    uvicorn.run("docsfy.main:app", host=host, port=port, reload=reload)
```

That launcher is separate from the `docsfy` CLI, which is a client for the HTTP API.

## Entrypoint Modes

In the official container, startup is controlled by `/app/entrypoint.sh`. By default it starts only the backend. If you explicitly set `DEV_MODE=true`, the same image becomes a dev container: it installs frontend dependencies, starts Vite on `5173`, and runs Uvicorn with reload enabled.

```1:21:entrypoint.sh
#!/bin/bash
set -e

if [ "$DEV_MODE" = "true" ]; then
    echo "DEV_MODE enabled - installing frontend dependencies..."
    cd /app/frontend || exit 1
    npm ci
    echo "Starting Vite dev server on port 5173..."
    npm run dev &
    VITE_PID=$!
    # Forward signals to the background Vite process for clean shutdown
    trap 'kill $VITE_PID 2>/dev/null; wait $VITE_PID 2>/dev/null' SIGTERM SIGINT
    cd /app
    echo "Starting FastAPI with hot reload on port 8000..."
    uv run --no-sync uvicorn docsfy.main:app \
        --host 0.0.0.0 --port 8000 \
        --reload --reload-dir /app/src
else
    exec uv run --no-sync uvicorn docsfy.main:app \
        --host 0.0.0.0 --port 8000
fi
```

For production, the important behavior is simple:

- The backend listens on `0.0.0.0:8000` inside the container.
- `5173` only matters when `DEV_MODE=true`.
- The official image already includes the built frontend bundle, so FastAPI can serve the dashboard directly.

> **Warning:** The official container hardcodes its internal listener to `8000`. If you want a different internal port, override the command or entrypoint. Otherwise, change only the published port at your container runtime, ingress, or load balancer.

> **Note:** FastAPI's Swagger/ReDoc endpoints are disabled in this app. In docsfy, the `/docs/...` path is reserved for generated documentation output.

## Startup Behavior

On process start, docsfy validates `ADMIN_KEY`, initializes or migrates the database under `DATA_DIR`, clears in-memory generation bookkeeping, and removes expired session rows.

```44:58:src/docsfy/main.py
@asynccontextmanager
async def lifespan(app: FastAPI) -> AsyncIterator[None]:
    settings = get_settings()
    if not settings.admin_key:
        logger.error("ADMIN_KEY environment variable is required")
        raise SystemExit(1)

    if len(settings.admin_key) < 16:
        logger.error("ADMIN_KEY must be at least 16 characters long")
        raise SystemExit(1)

    _generating.clear()
    await init_db(data_dir=settings.data_dir)
    await cleanup_expired_sessions()
    yield
```

A few production details are worth knowing:

- Database creation and schema migration happen automatically at startup.
- If the previous process died while a job was still active, the next startup rewrites that stored row from `generating` to `error` so it does not look permanently stuck.
- Expired sessions are removed during startup. The code comments note that long-lived deployments do not yet run periodic session cleanup in the background.
- Browser logins use the `docsfy_session` cookie with `HttpOnly`, `SameSite=Strict`, and a `Secure` flag controlled by `SECURE_COOKIES`. The default session lifetime is 8 hours.

For production browser access, keep `SECURE_COOKIES=true` and serve docsfy over HTTPS.

## Request Flow

docsfy keeps the source checkout ephemeral but keeps the results durable.

```mermaid
sequenceDiagram
    participant User as Browser / CLI
    participant API as docsfy FastAPI
    participant Git as Git source
    participant AI as AI CLI
    participant Data as /data
    participant WS as WebSocket clients

    User->>API: POST /api/generate
    API->>AI: Verify provider CLI is available

    alt Remote repository
        API->>Git: Shallow clone into temp directory
    else Local repository path
        API->>Git: Read existing local checkout in place
    end

    API->>AI: Create doc outline and page content
    API->>Data: Write cached page markdown
    API->>Data: Write final HTML site + plan.json
    API->>Data: Update SQLite project metadata
    API-->>WS: Push sync/progress updates

    User->>API: Open dashboard or docs URL
    API->>Data: Read stored site files
    API-->>User: Serve HTML or download archive
```

In practice, that means:

- A generation request accepts either a remote `repo_url` or a local `repo_path`.
- For remote sources, docsfy uses a temporary shallow clone for that run and discards it afterward.
- For local sources, docsfy reads the existing checkout in place instead of copying it.
- Before doing expensive work, docsfy checks that the selected AI CLI is available.
- If a ready variant already exists, docsfy can reuse cached artifacts, skip unnecessary work when the commit SHA is unchanged, or update only the parts that actually need to change.
- Once written to disk, the site is served from stored files at `/docs/<project>/<branch>/<provider>/<model>/...`, and the latest-ready shortcut `/docs/<project>/...` can resolve to the newest ready variant.
- Download endpoints create a `tar.gz` archive from the stored `site` directory; they do not regenerate content on demand.

> **Warning:** `repo_path` is an absolute filesystem path and is only allowed for admin-triggered requests. In containers, that path must exist inside the container filesystem, so you need an explicit bind mount or volume for the repository if you plan to use that mode.

The checked-in Compose file only persists `/data`; it does not mount source repositories into the container. If you want `repo_path` generation in that setup, add a separate mount.
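
A small generation request is the easiest way to exercise that flow end to end. A sketch, assuming the third-party `requests` package; the server URL, key, and repository are placeholders, and provider/model fall back to the server defaults when omitted:

```python
import requests

SERVER = "http://localhost:8000"  # placeholder deployment URL
API_KEY = "docsfy_..."            # placeholder credential

resp = requests.post(
    f"{SERVER}/api/generate",
    json={"repo_url": "https://github.com/org/repo.git"},  # placeholder repo
    headers={"Authorization": f"Bearer {API_KEY}"},
)
# 202 means the job was accepted and queued as async work.
assert resp.status_code == 202
print(resp.json())
```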

## Persistent Storage

`DATA_DIR` is the only location that must be persistent across restarts and redeployments. By default, that is `/data`.

```525:582:src/docsfy/storage.py
def get_project_dir(
    name: str,
    ai_provider: str = "",
    ai_model: str = "",
    owner: str = "",
    branch: str = DEFAULT_BRANCH,
) -> Path:
    if not branch:
        msg = "branch is required for project directory paths"
        raise ValueError(msg)
    if not ai_provider or not ai_model:
        msg = "ai_provider and ai_model are required for project directory paths"
        raise ValueError(msg)
    # Sanitize path segments to prevent traversal
    for segment_name, segment in [
        ("branch", branch),
        ("ai_provider", ai_provider),
        ("ai_model", ai_model),
    ]:
        if (
            "/" in segment
            or "\\" in segment
            or ".." in segment
            or segment.startswith(".")
        ):
            msg = f"Invalid {segment_name}: '{segment}'"
            raise ValueError(msg)
    safe_owner = _validate_owner(owner)
    return (
        PROJECTS_DIR
        / safe_owner
        / _validate_name(name)
        / branch
        / ai_provider
        / ai_model
    )

def get_project_site_dir(
    name: str,
    ai_provider: str = "",
    ai_model: str = "",
    owner: str = "",
    branch: str = DEFAULT_BRANCH,
) -> Path:
    return get_project_dir(name, ai_provider, ai_model, owner, branch) / "site"

def get_project_cache_dir(
    name: str,
    ai_provider: str = "",
    ai_model: str = "",
    owner: str = "",
    branch: str = DEFAULT_BRANCH,
) -> Path:
    return (
        get_project_dir(name, ai_provider, ai_model, owner, branch) / "cache" / "pages"
    )
```

A single deployment root contains both metadata and artifacts:

- `${DATA_DIR}/docsfy.db`: SQLite database for projects, users, access grants, and sessions
- `${DATA_DIR}/projects/<owner>/<project>/<branch>/<provider>/<model>/plan.json`: stored document outline for that variant
- `${DATA_DIR}/projects/<owner>/<project>/<branch>/<provider>/<model>/cache/pages/*.md`: cached page markdown
- `${DATA_DIR}/projects/<owner>/<project>/<branch>/<provider>/<model>/site/`: final static site served by docsfy

The site writer creates a self-contained output tree under `site/`, including `index.html`, per-page `*.html`, per-page `*.md`, `assets/`, `search-index.json`, `llms.txt`, `llms-full.txt`, and `.nojekyll`.

The checked-in Compose file mounts the persistent directory exactly where the app expects it:

```1:15:docker-compose.yaml
services:
  docsfy:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "8000:8000"
      # Uncomment for development (DEV_MODE=true)
      # - "5173:5173"
    volumes:
      - ./data:/data
      # Uncomment for development (hot reload)
      # - ./frontend:/app/frontend
    env_file:
      - .env
```

> **Tip:** If you back up `/data`, you back up both the SQLite metadata and every generated variant. You do not need to preserve remote Git working copies, because docsfy recreates those in temporary directories when needed.

## Health Checks And Live Connections

The built-in liveness endpoint is `GET /health`, and it returns `{"status":"ok"}`. The official image wires its container health probe to that endpoint, and the CLI's `docsfy health` command calls the same URL.

That endpoint is intentionally shallow. It confirms that the web process is up, but it does not prove that:

- your `DATA_DIR` volume is writable
- the selected AI CLI is installed and usable
- Chromium and Mermaid CLI are present if you rely on Mermaid-to-SVG conversion
- your reverse proxy is correctly forwarding WebSocket upgrades

The dashboard also relies on a WebSocket connection at `/api/ws` for live sync and status updates. The server sends an initial sync message immediately after connect, then uses a 30-second heartbeat. It waits 10 seconds for each pong and closes the socket after 2 missed pongs.

> **Tip:** Use `/health` as a liveness probe. If you want a stronger readiness check, add an external probe that also verifies `/data` write access and the AI CLI(s) you expect to use.
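
A deeper external probe along those lines might look like the following sketch; the paths and tool names are assumptions you should adjust to your own deployment:

```python
import shutil
import sys
import tempfile
import urllib.request

DATA_DIR = "/data"  # matches the default DATA_DIR

def ready() -> bool:
    # 1. Liveness: the web process answers /health.
    try:
        with urllib.request.urlopen("http://localhost:8000/health", timeout=5) as resp:
            if resp.status != 200:
                return False
    except OSError:
        return False
    # 2. The persistent volume is writable.
    try:
        with tempfile.TemporaryFile(dir=DATA_DIR):
            pass
    except OSError:
        return False
    # 3. Expected tooling is on PATH (adjust to the CLIs you rely on).
    return all(shutil.which(cmd) for cmd in ("git", "mmdc"))

sys.exit(0 if ready() else 1)
```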

If you deploy behind Nginx, Traefik, Caddy, HAProxy, a cloud load balancer, or Kubernetes ingress, make sure `/api/ws` supports WebSocket upgrade and that idle timeouts are comfortably longer than the heartbeat interval.

In `DEV_MODE=true`, Vite proxies `/api`, `/docs`, and `/health` to the backend on `8000`, so local frontend development still talks to the same server process.

## Non-Root Containers And Runtime Toolchain

The official image is designed to run without root privileges and to remain compatible with OpenShift-style arbitrary UIDs in group `0`. It also keeps its runtime dependencies in the final image because docsfy really uses them after startup.

```49:133:Dockerfile
# Create non-root user, data directory, and set permissions
# OpenShift runs containers as a random UID in the root group (GID 0)
RUN useradd --create-home --shell /bin/bash -g 0 appuser \
  && mkdir -p /data \
  && chown appuser:0 /data \
  && chmod -R g+w /data

# ...

# Puppeteer config for mermaid-cli (must be set before npm install)
ENV PUPPETEER_EXECUTABLE_PATH="/usr/bin/chromium"
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD="true"

# Puppeteer needs --no-sandbox in Docker (non-root user, no user namespaces)
RUN printf '{"args":["--no-sandbox","--disable-setuid-sandbox"]}\n' > /home/appuser/.puppeteerrc.json

# Install Claude Code CLI (installs to ~/.local/bin)
RUN /bin/bash -o pipefail -c "curl -fsSL https://claude.ai/install.sh | bash"

# Install Cursor Agent CLI (installs to ~/.local/bin)
RUN /bin/bash -o pipefail -c "curl -fsSL https://cursor.com/install | bash"

# Configure npm for non-root global installs and install Gemini CLI + mermaid-cli
RUN mkdir -p /home/appuser/.npm-global \
  && npm config set prefix '/home/appuser/.npm-global' \
  && npm install -g @google/gemini-cli @mermaid-js/mermaid-cli@11

# ...

# Make /app group-writable for OpenShift compatibility
RUN chmod -R g+w /app

# Make appuser home accessible by OpenShift arbitrary UID
RUN find /home/appuser -type d -exec chmod g=u {} + \
  && npm cache clean --force 2>/dev/null; \
  rm -rf /home/appuser/.npm/_cacache

# Switch back to non-root user for runtime
USER appuser

# Ensure CLIs are in PATH
ENV PATH="/home/appuser/.local/bin:/home/appuser/.npm-global/bin:${PATH}"
# Set HOME for OpenShift compatibility (random UID has no passwd entry)
ENV HOME="/home/appuser"

HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
  CMD curl -f http://localhost:8000/health || exit 1

ENTRYPOINT ["/app/entrypoint.sh"]
```

A few practical consequences follow from that image design:

- `git` is a runtime dependency because docsfy fetches repositories and computes diffs during generation.
- `nodejs`, `npm`, `chromium`, and `mmdc` stay in the runtime image because Mermaid diagrams can be converted to inline SVG while the site is being written.
- The AI provider CLIs are runtime dependencies, not optional build extras. If you build your own image, include the providers you plan to use.
- The container expects `/data` to be writable by the effective non-root user or by an arbitrary UID that is a member of group `0`.
- Browser-based deployments should keep `SECURE_COOKIES=true` and place docsfy behind HTTPS so session cookies are accepted by the browser.

> **Note:** Mermaid SVG conversion is conditional in code. If `mmdc` is missing, docsfy skips that conversion step. The official image installs `@mermaid-js/mermaid-cli` and Chromium so diagrams work out of the box.

> **Warning:** If your persistent volume is mounted read-only, or with ownership and permissions that block the effective UID/GID, docsfy will fail when it tries to create `docsfy.db`, write page cache files, or store the final site under `site/`.

> **Tip:** The official image is a good reference even if you do not use Docker directly. If you deploy with systemd, Podman, Kubernetes, or another platform, replicate the same assumptions: port `8000`, persistent `DATA_DIR`, non-root execution, Git available, and the provider and diagram toolchain installed in the runtime environment.


## Related Pages

- [Architecture and Runtime](architecture-and-runtime.html)
- [Docker and Compose Quickstart](docker-quickstart.html)
- [Environment Variables](environment-variables.html)
- [Data Storage and Layout](data-storage-and-layout.html)
- [Security Considerations](security-considerations.html)

---

Source: testing-and-quality-checks.md

# Testing and Quality Checks

docsfy uses several layers of testing instead of one giant quality gate. Fast automated checks catch backend and frontend regressions, `pre-commit` blocks common mistakes before they reach git, CodeRabbit adds pull-request review automation, and the manual plans in `test-plans/` verify real user workflows from login to documentation generation.

## Quick Start

If you want a strong default before you open or update a pull request, run:

```bash
tox

cd frontend
npm test
npm run build
npm run lint

cd ..
pre-commit run --all-files
```

If your change affects login, generation, real-time updates, docs output, or the CLI, also run the relevant manual plan in `test-plans/`.

> **Note:** The quality automation you can inspect in this repository lives in `tox.toml`, `.pre-commit-config.yaml`, `.coderabbit.yaml`, the frontend test and build configs, and the manual plans under `test-plans/`.

```mermaid
flowchart LR
  A[Change code or docs] --> B{What changed?}
  B -->|Python backend| C[pytest or tox]
  B -->|React frontend| D[Vitest, build, and lint]
  C --> E[pre-commit hooks]
  D --> E
  E --> F[Pull request]
  F --> G[CodeRabbit review]
  G --> H{User-visible workflow changed?}
  H -->|Yes| I[Run relevant manual test plans]
  H -->|No| J[Ready to merge]
  I --> J
```

## Pytest

`pytest` is the main automated runner for the Python side of docsfy. The checked-in configuration is deliberately small:

```toml
[project.optional-dependencies]
dev = ["pytest", "pytest-asyncio", "pytest-xdist"]

[tool.pytest.ini_options]
asyncio_mode = "auto"
testpaths = ["tests"]
pythonpath = ["src"]
```

That setup gives you three useful guarantees: async tests work without extra boilerplate, test discovery is constrained to `tests/`, and imports resolve from `src/` so the suite exercises the package code directly.

Most backend tests run the FastAPI app in-process instead of booting a separate server. That keeps runs fast and makes failures easier to debug. For example, `tests/test_api_projects.py` exercises the generate endpoint like this:

```python
async def test_generate_starts(client: AsyncClient) -> None:
    """POST /api/generate starts generation (mock create_task), returns 202."""
    with patch("docsfy.api.projects.asyncio.create_task") as mock_task:
        mock_task.side_effect = lambda coro: coro.close()
        response = await client.post(
            "/api/generate",
            json={"repo_url": "https://github.com/org/repo.git"},
        )
    assert response.status_code == 202
    body = response.json()
    assert body["project"] == "repo"
    assert body["status"] == "generating"
```

WebSocket behavior is covered too. `tests/test_websocket.py` verifies that authenticated clients receive the initial sync payload:

```python
with sync_client.websocket_connect(f"/api/ws?token={TEST_ADMIN_KEY}") as ws:
    data = ws.receive_json()
    assert data["type"] == "sync"
    assert "projects" in data
    assert "known_models" in data
    assert "known_branches" in data
```

The backend suite is broad. The checked-in test files cover auth, admin APIs, project APIs, storage, repository operations, renderer behavior, post-processing, CLI commands, dashboard routes, integration flows, and WebSocket updates. It even covers doc-quality features such as version detection, stale-reference cleanup, cross-linking, and Mermaid pre-rendering.

> **Note:** Backend route tests do not require a full production frontend build. `tests/conftest.py` creates a minimal `frontend/dist/index.html` for SPA catch-all tests.

> **Tip:** If you want the exact backend `pytest` command the repository defines, `tox` wraps `uv run --extra dev pytest -n auto tests`.

## Vitest

The frontend uses `Vitest` with Testing Library. The main scripts in `frontend/package.json` are:

```json
"scripts": {
  "dev": "vite",
  "build": "tsc -b && vite build",
  "lint": "eslint .",
  "preview": "vite preview",
  "test": "vitest run"
}
```

And the checked-in Vitest config in `frontend/vitest.config.ts` is straightforward:

```ts
export default defineConfig({
  test: {
    environment: 'jsdom',
    globals: true,
    setupFiles: './src/test/setup.ts',
  },
  resolve: {
    alias: {
      '@': fileURLToPath(new URL('./src', import.meta.url)),
    },
  },
})
```

That means frontend tests run in a browser-like `jsdom` environment, can use global test helpers, and load `frontend/src/test/setup.ts`, which imports `@testing-library/jest-dom/vitest`.

A representative example from `frontend/src/pages/LoginPage.test.tsx`:

```tsx
describe('LoginPage', () => {
  it('renders username and password inputs', () => {
    renderLogin()
    expect(screen.getByLabelText('Username')).toBeInTheDocument()
    expect(screen.getByLabelText('Password')).toBeInTheDocument()
  })

  it('renders the submit button', () => {
    renderLogin()
    expect(screen.getByRole('button', { name: 'Sign In' })).toBeInTheDocument()
  })
})
```

In practice, the frontend quality checks are split across three commands:
- `npm test` runs the Vitest suite
- `npm run build` runs `tsc -b` before Vite builds, so TypeScript errors fail the build
- `npm run lint` runs ESLint

`frontend/eslint.config.js` applies recommended JavaScript, TypeScript, React Hooks, and Vite refresh rules to `**/*.{ts,tsx}`, so linting and testing are separate checks on purpose.

> **Note:** The checked-in Vitest coverage is currently narrow. Right now it is centered on `frontend/src/pages/LoginPage.test.tsx`, so user-visible frontend changes still deserve careful manual testing.

## tox

`tox` is the repository’s repeatable backend test wrapper. The entire checked-in config fits on a few lines:

```toml
skipsdist = true

envlist = ["unittests"]

[env.unittests]
deps = ["uv"]
commands = [["uv", "run", "--extra", "dev", "pytest", "-n", "auto", "tests"]]
```

A plain `tox` run executes the `unittests` environment, installs `uv`, and runs the backend suite with `pytest-xdist` parallelism via `-n auto`.

A couple of practical takeaways:
- `tox` is focused on backend tests, not the frontend
- `skipsdist = true` means it is not trying to build a package artifact before testing
- the parallelism lives in the `tox` command, not in the base `pytest` config

Use `tox` when you want the repository’s exact backend command, not a hand-assembled local variation.

> **Tip:** Use `pytest` for fast iteration while you are working, then use `tox` before you ship the change.

## Pre-Commit Hooks

`pre-commit` is the broadest single local quality gate in this repository. It combines file hygiene checks, Python linting and formatting, type checking, and secret scanning.

Here is the most relevant part of `.pre-commit-config.yaml`:

```yaml
- repo: https://github.com/pre-commit/pre-commit-hooks
  rev: v6.0.0
  hooks:
    - id: check-added-large-files
    - id: check-docstring-first
    - id: check-executables-have-shebangs
    - id: check-merge-conflict
    - id: detect-private-key
    - id: mixed-line-ending
    - id: debug-statements
    - id: trailing-whitespace
      args: [--markdown-linebreak-ext=md]
    - id: end-of-file-fixer
    - id: check-ast
    - id: check-builtin-literals
    - id: check-toml

# flake8 retained for RedHatQE M511 plugin; ruff handles standard linting
- repo: https://github.com/PyCQA/flake8
  rev: 7.3.0
  hooks:
    - id: flake8
      args: [--config=.flake8]

- repo: https://github.com/Yelp/detect-secrets
  rev: v1.5.0
  hooks:
    - id: detect-secrets

- repo: https://github.com/astral-sh/ruff-pre-commit
  rev: v0.15.8
  hooks:
    - id: ruff
    - id: ruff-format

- repo: https://github.com/gitleaks/gitleaks
  rev: v8.30.0
  hooks:
    - id: gitleaks

- repo: https://github.com/pre-commit/mirrors-mypy
  rev: v1.19.1
  hooks:
    - id: mypy
      exclude: (tests/)
```

That hook set does a lot of work:
- `ruff` and `ruff-format` handle day-to-day Python linting and formatting
- `flake8` is kept specifically for the RedHatQE `M511` rule; `.flake8` narrows it to `select=M511`
- `mypy` is part of the hook set, but it excludes `tests/`
- `detect-private-key`, `detect-secrets`, and `gitleaks` give you multiple chances to catch secrets before commit
- the whitespace hook is configured for Markdown line breaks, which helps protect docs formatting

The config also includes a `ci:` section with `autofix_prs: false` and a custom autoupdate commit message. In other words, the repository is ready for automated hook runs, but it is not configured to silently rewrite pull requests.

> **Tip:** `pre-commit run --all-files` is the best one-command sweep before you push, especially after refactors, dependency changes, or wide docs edits.

## Type Checks

docsfy uses strict type checking on both the Python and TypeScript sides.

For Python, `mypy` is configured in `pyproject.toml` with a strict baseline:

```toml
[tool.mypy]
check_untyped_defs = true
disallow_any_generics = true
disallow_incomplete_defs = true
disallow_untyped_defs = true
no_implicit_optional = true
show_error_codes = true
warn_unused_ignores = true
strict_equality = true
extra_checks = true
warn_unused_configs = true
warn_redundant_casts = true
```

This is not a best-effort setup. The checked-in app code is expected to be annotated and type-consistent.

On the frontend, both `frontend/tsconfig.app.json` and `frontend/tsconfig.node.json` enable strict TypeScript checks:

```json
"noEmit": true,
"strict": true,
"noUnusedLocals": true,
"noUnusedParameters": true,
"noFallthroughCasesInSwitch": true,
"noUncheckedSideEffectImports": true
```

Because the build script is `tsc -b && vite build`, `npm run build` is the repository’s real frontend type-check command.

> **Note:** The checked-in `mypy` hook excludes `tests/`, so strict Python type enforcement is aimed at application code rather than the test suite.

## Secret Scanning

This repository scans for secrets in layers. `detect-private-key` catches obvious keys, `detect-secrets` flags secret-like values, and `gitleaks` adds another scanner with its own rules.

The gitleaks allowlist is intentionally narrow:

```toml
[extend]
useDefault = true

[allowlist]
paths = [
    '''tests/test_repository\.py''',
]
```

The test suite also marks clearly fake values inline when scanners would otherwise create noise. For example, `tests/test_api_auth.py` contains this intentionally invalid credential:

```python
response = await unauthed_client.post(
    "/api/auth/login",
    json={
        "username": "someone",
        "api_key": "totally-wrong",  # pragma: allowlist secret
    },
)
```

You will see the same `# pragma: allowlist secret` pattern in other test files where obviously fake keys, passwords, or SHAs would otherwise trigger scanners.

> **Warning:** Treat allowlists as exceptions, not convenience. Fake test values are acceptable only when they are clearly fake and explicitly marked. Real credentials should stay out of tracked files.

## Review Automation

Pull-request review automation is configured in `.coderabbit.yaml`. The repository tells CodeRabbit to auto-review pull requests into `main`, use an assertive review profile, and request changes when it finds critical issues.

```yaml
reviews:
  profile: assertive
  request_changes_workflow: true
  auto_review:
    auto_pause_after_reviewed_commits: 0
    enabled: true
    drafts: false
    base_branches:
      - main

  tools:
    ruff:
      enabled: true
    pylint:
      enabled: true
    eslint:
      enabled: true
    shellcheck:
      enabled: true
    yamllint:
      enabled: true
    gitleaks:
      enabled: true
    semgrep:
      enabled: true
    actionlint:
      enabled: true
    hadolint:
      enabled: true
```

This matters for two reasons. First, review-time automation goes beyond the local hook set by adding tools such as `pylint`, `semgrep`, `actionlint`, and `hadolint`. Second, security and style checks remain part of the pull-request process, not just local development.

CodeRabbit is useful, but it is not a substitute for local testing. Treat it as an extra reviewer, not as your primary test runner.

> **Tip:** The fastest review cycle is still: run the relevant local checks first, then let CodeRabbit focus on what humans tend to miss in diffs.

## Manual End-to-End Test Plans

The manual plans in `test-plans/` are the acceptance-test layer. Start with `test-plans/e2e-ui-test-plan.md`. It defines the shared rules, the live server URL (`http://localhost:8800`), the test repository, the execution order, and the master map of all sub-plans. It also requires each subsection to be executed and logged in order, which makes the plans closer to an audit trail than a loose checklist.

The current plans cover:
- authentication and roles in `test-plans/e2e-01-auth-and-roles.md`
- generation and dashboard behavior in `test-plans/e2e-02-generation-and-dashboard.md`
- docs quality and UI in `test-plans/e2e-03-docs-quality-and-ui.md`
- isolation, logout, and direct URL authorization in `test-plans/e2e-04-isolation-and-auth.md`
- incremental updates in `test-plans/e2e-05-incremental-updates.md`
- delete and owner scoping in `test-plans/e2e-06-delete-and-owner.md`
- UI component behavior in `test-plans/e2e-07-ui-components.md`
- cross-model updates in `test-plans/e2e-08-cross-model-updates.md`
- cleanup and teardown in `test-plans/e2e-09-cleanup.md`
- branch support in `test-plans/e2e-10-branch-support.md`
- WebSocket behavior in `test-plans/e2e-11-websocket.md`
- CLI workflows in `test-plans/e2e-12-cli.md`
- post-generation pipeline checks in `test-plans/e2e-13-post-generation-pipeline.md`

These plans are deliberately concrete. They combine browser automation, API calls, polling loops, and CLI commands instead of relying on vague manual instructions.

The WebSocket plan checks a real browser session opening a socket to the app:

```shell
agent-browser javascript "new Promise(resolve => { const ws = new WebSocket('ws://localhost:8800/api/ws'); ws.onopen = () => { resolve('connected'); ws.close(); }; ws.onerror = () => resolve('error'); setTimeout(() => resolve('timeout'), 5000); })"
```

That same plan also covers unauthenticated rejection, sync messages, progress events, reconnect behavior, and the SPA's polling fallback.

The CLI plan exercises the user-facing `docsfy` command against the live server:

```shell
docsfy generate https://github.com/myk-org/for-testing-only --provider gemini --model gemini-2.5-flash --force
```

Beyond generation, the CLI plan also covers `config init`, `--watch`, `list`, `status`, `delete`, admin user commands, and `abort`.

The post-generation pipeline plan is especially useful for docsfy itself. It checks version footer detection, Mermaid rendering, related pages, validation stages, cross-linking stages, and a performance baseline for generation.

> **Note:** `test-plans/e2e-09-cleanup.md` is designed to run last, after the other plans.

> **Note:** `test-plans/e2e-13-post-generation-pipeline.md` treats Mermaid rendering as environment-dependent. It explicitly blocks the diagram-rendering check if `mmdc` is not installed.

> **Tip:** The manual plans matter most for user-visible changes. The backend automated suite is broad, but the current frontend unit-test coverage is much smaller.

## Choosing the Right Check

| If you changed... | Start with... | Add this when needed |
|---|---|---|
| Python APIs, auth, storage, repository logic, rendering, post-processing, or CLI behavior | `tox` or backend `pytest` | `pre-commit run --all-files` |
| React components or frontend behavior | `npm test`, `npm run build`, `npm run lint` | relevant plan in `test-plans/` |
| Files that might affect formatting, linting, typing, or secret detection | `pre-commit run --all-files` | targeted backend or frontend tests |
| Login, generation, status pages, WebSocket behavior, docs output, branch handling, or CLI workflows | relevant plan in `test-plans/` | the matching automated checks above |

In practice, the safest path is small, fast checks first and workflow checks last: `pytest` and `Vitest` catch regressions quickly, `tox` makes backend runs repeatable, `pre-commit` keeps the repository clean, CodeRabbit scrutinizes pull requests, and the manual plans prove the product still works the way users experience it.


## Related Pages

- [Local Development](local-development.html)
- [Deployment and Runtime](deployment-and-runtime.html)
- [Security Considerations](security-considerations.html)
- [Troubleshooting](troubleshooting.html)
- [Generating Documentation](generating-documentation.html)

---

Source: security-considerations.md

# Security Considerations

docsfy is built for authenticated documentation hosting, not anonymous public uploads. The backend protects API and docs routes, stores hashes instead of raw secrets, rejects risky repository inputs, and sanitizes rendered HTML before it serves AI-generated pages.

## What To Lock Down First

- Keep `ADMIN_KEY` secret. The server will not start without it, and it must be at least 16 characters long.
- Serve docsfy over HTTPS and leave `SECURE_COOKIES=true` in production.
- Prefer named database-backed users over sharing the built-in `admin` credential.
- Treat local `repo_path` generation as a high-trust admin feature.
- Review generated docs before publishing them outside your team.

From `.env.example`:

```env
# Required: Admin password (minimum 16 characters)
ADMIN_KEY=

# Data directory for database and generated docs
DATA_DIR=/data

# Cookie security (set to false for local HTTP development)
SECURE_COOKIES=true
```

## Authentication And Access Control

docsfy uses one auth model across the web UI, REST API, CLI, and WebSocket updates:

- Browser users sign in once and receive a `docsfy_session` cookie.
- API and CLI clients can send the same secret as `Authorization: Bearer <API_KEY>`.
- Users have one of three roles: `viewer`, `user`, or `admin`.

Most protected functionality lives under `/api/*` and `/docs/*`. The login page and health check stay public, and the WebSocket endpoint authenticates during the handshake. The same access model gates project APIs, generated docs URLs, downloads, and live status updates.
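As an illustration of the API and CLI path, here is a minimal sketch of calling a protected route with the shared secret as a Bearer token. The server URL is a local-development placeholder, and the key value is whatever an admin issued for the user:

```python
# Sketch: call a protected endpoint with a Bearer API key instead of a cookie.
# The URL is a local-development placeholder; the key comes from an admin.
import httpx

API_KEY = "docsfy_..."  # the same secret a browser user would type as a password

with httpx.Client(
    base_url="http://localhost:8000",
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30.0,
) as client:
    resp = client.get("/api/status")  # any /api/* route is gated the same way
    resp.raise_for_status()
    print(resp.json())
```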

There are two kinds of admin access:

- Built-in `admin`: the username must literally be `admin`, and the submitted secret must match `ADMIN_KEY`.
- Database-backed `admin`: a normal user row with role `admin`.

From `src/docsfy/api/auth.py`:

```python
# Check admin -- username must be "admin" and key must match
if username == "admin" and hmac.compare_digest(api_key, settings.admin_key):
    is_admin = True
    authenticated = True
    role = "admin"
else:
    # Check user key -- verify username matches the key's owner
    user = await get_user_by_key(api_key)
    if user and user["username"] == username:
        authenticated = True
        role = str(user.get("role", "user"))
        if role == "admin":
            is_admin = True
```

Write access is enforced on the server, not just hidden in the UI. `viewer` is read-only.

From `src/docsfy/api/projects.py`:

```python
def _require_write_access(request: Request) -> None:
    """Raise 403 if user is a viewer (read-only)."""
    if request.state.role not in ("admin", "user"):
        raise HTTPException(
            status_code=403,
            detail="Write access required.",
        )
```

Shared access is owner-scoped. That matters because two different users can generate docs for repositories with the same name. If a user asks for a project they should not see, docsfy often returns `404 Not found` instead of `403`, which helps avoid leaking whether that project exists.

From `src/docsfy/api/projects.py`:

```python
async def _check_ownership(
    request: Request, project_name: str, project: dict[str, Any]
) -> None:
    """Raise 404 if the requesting user does not own the project (unless admin)."""
    if request.state.is_admin:
        return
    project_owner = str(project.get("owner", ""))
    if project_owner == request.state.username:
        return
    access = await get_project_access(project_name, project_owner=project_owner)
    if request.state.username in access:
        return
    raise HTTPException(status_code=404, detail="Not found")
```

```mermaid
sequenceDiagram
    participant Browser
    participant API as docsfy API
    participant DB as SQLite
    participant WS as /api/ws

    Browser->>API: POST /api/auth/login { username, api_key }
    API->>DB: Validate ADMIN_KEY or look up hashed user key
    API->>DB: Create session token record
    API-->>Browser: JSON user info + Set-Cookie docsfy_session
    Browser->>API: GET /api/status
    API->>DB: Load session and user access
    API-->>Browser: Projects payload
    Browser->>WS: Open /api/ws
    WS->>DB: Authenticate via cookie or token
    WS-->>Browser: sync / progress / status_change
```

The CLI uses the same credential model. From `src/docsfy/cli/client.py`:

```python
self._client = httpx.Client(
    base_url=self.server_url,
    headers={"Authorization": f"Bearer {self.password}"},
    timeout=30.0,
    follow_redirects=False,
)
```

> **Note:** In the UI this secret is labeled as a password. In the API and CLI it is treated as an API key. It is the same underlying credential.

> **Warning:** The username `admin` is reserved. You cannot create a database-backed user named `admin`, `Admin`, or `ADMIN`.

> **Tip:** Use the built-in `admin` account for bootstrap and recovery. For normal day-to-day work, create named users, including named admins.

The WebSocket endpoint also accepts a `?token=` query parameter. That is useful for direct clients, but it is a poorer fit for browsers because URLs are easier to leak than cookies.

> **Warning:** For browser use, prefer the normal session-cookie flow over `wss://.../api/ws?token=...`.

## API Keys, Session Tokens, And Cookie Settings

User API keys are not stored in plain text. The `users` table stores only `api_key_hash`, and that hash is an HMAC-SHA256 digest keyed by `ADMIN_KEY`.

From `src/docsfy/storage.py`:

```python
def hash_api_key(key: str, hmac_secret: str = "") -> str:
    """Hash an API key with HMAC-SHA256 for storage."""
    secret = hmac_secret or os.getenv("ADMIN_KEY", "")
    if not secret:
        msg = "ADMIN_KEY environment variable is required for key hashing"
        raise RuntimeError(msg)
    return hmac.new(secret.encode(), key.encode(), hashlib.sha256).hexdigest()

def generate_api_key() -> str:
    """Generate a random API key."""
    return f"docsfy_{secrets.token_urlsafe(32)}"
```

That means:

- The server cannot recover an old user API key from the database.
- Auto-generated keys use a random `docsfy_...` format.
- Custom replacement keys must be at least 16 characters long.
- The raw key is shown only when a user is created or rotated, so you should save it immediately.
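In practice, verifying a submitted key is a recompute-and-compare against the stored digest. A minimal sketch, reusing `hash_api_key()` from above and leaving the database lookup out:

```python
# Sketch: check a submitted key against a stored api_key_hash value.
# Reuses hash_api_key() shown above; the actual lookup code may differ.
import hmac


def key_matches(submitted_key: str, stored_hash: str, admin_key: str) -> bool:
    """Recompute the HMAC-SHA256 digest and compare in constant time."""
    recomputed = hash_api_key(submitted_key, hmac_secret=admin_key)
    return hmac.compare_digest(recomputed, stored_hash)
```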

Browser sessions use a different secret from the user API key. docsfy generates a random session token, stores only its SHA-256 hash in the `sessions` table, and sends the raw token back only in the cookie.

From `src/docsfy/storage.py`:

```python
def _hash_session_token(token: str) -> str:
    """Hash a session token for storage."""
    return hashlib.sha256(token.encode()).hexdigest()

async def create_session(
    username: str, is_admin: bool = False, ttl_hours: int = SESSION_TTL_HOURS
) -> str:
    """Create an opaque session token."""
    token = secrets.token_urlsafe(32)
    token_hash = _hash_session_token(token)
    expires_at = datetime.now(timezone.utc) + timedelta(hours=ttl_hours)
    expires_str = expires_at.strftime("%Y-%m-%d %H:%M:%S")
    async with aiosqlite.connect(DB_PATH) as db:
        await db.execute(
            "INSERT INTO sessions (token, username, is_admin, expires_at) VALUES (?, ?, ?, ?)",
            (token_hash, username, 1 if is_admin else 0, expires_str),
        )
        await db.commit()
    return token
```

The session cookie is hardened for browser use. From `src/docsfy/api/auth.py`:

```python
response.set_cookie(
    "docsfy_session",
    session_token,
    httponly=True,
    samesite="strict",
    secure=settings.secure_cookies,
    max_age=SESSION_TTL_SECONDS,
)
```

That gives you:

- `HttpOnly`: browser JavaScript cannot read the cookie.
- `SameSite=Strict`: the cookie is not meant for cross-site use.
- `Secure`: controlled by `SECURE_COOKIES`, and it should stay on in production.
- `max_age=SESSION_TTL_SECONDS`: sessions last 8 hours by default.

The frontend intentionally uses same-origin credential handling instead of reading the cookie itself. From `frontend/src/lib/api.ts`:

```ts
const config: RequestInit = {
  ...options,
  credentials: 'same-origin',
  redirect: 'manual',
  headers,
}
```

Live updates follow the same pattern. From `frontend/src/lib/websocket.ts`:

```ts
const protocol = window.location.protocol === 'https:' ? 'wss:' : 'ws:'
const url = `${protocol}//${window.location.host}/api/ws`
this.ws = new WebSocket(url)
```

Any endpoint that returns a raw key marks that response as non-cacheable. From `src/docsfy/api/auth.py`:

```python
response = JSONResponse(
    content={"username": username, "new_api_key": new_key},
    headers={"Cache-Control": "no-store"},
)
```

Key rotation also invalidates existing sessions for that user. From `src/docsfy/storage.py`:

```python
# Invalidate all existing sessions for this user
await db.execute("DELETE FROM sessions WHERE username = ?", (username,))
```

If you use the CLI, remember that its config file contains the raw secret. `docsfy config init` writes owner-only permissions.

From `src/docsfy/cli/config_cmd.py`:

```python
CONFIG_DIR.mkdir(parents=True, exist_ok=True)
os.chmod(CONFIG_DIR, stat.S_IRWXU)
with open(CONFIG_FILE, "wb") as f:
    tomli_w.dump(config, f)
os.chmod(CONFIG_FILE, stat.S_IRUSR | stat.S_IWUSR)
```

> **Warning:** Rotating `ADMIN_KEY` does more than change the built-in admin login. `ADMIN_KEY` is also the HMAC secret used to verify stored user API keys, so rotating it invalidates existing database-backed user keys until you reissue them.

> **Warning:** The built-in `admin` account cannot use self-service key rotation. To change that credential, update `ADMIN_KEY` in the environment and restart the server.

> **Tip:** Leave `SECURE_COOKIES=true` in production. Set it to `false` only for local HTTP development; if it stays `true` on plain HTTP, the browser will never send the session cookie back.

## Repository Input Safety

### Accepted Repo URLs And Branch Names

docsfy validates repository input before it gets anywhere near Git or the filesystem. The current request model accepts:

- `http://...` or `https://...` repository URLs in `host/org/repo(.git)` form
- scp-style SSH URLs in `git@host:org/repo(.git)` form
- absolute local paths for `repo_path`

From `src/docsfy/models.py`:

```python
@field_validator("repo_url")
@classmethod
def validate_repo_url(cls, v: str | None) -> str | None:
    if v is None:
        return v
    https_pattern = r"^https?://[\w.\-]+/[\w.\-]+/[\w.\-]+(\.git)?$"
    ssh_pattern = r"^git@[\w.\-]+:[\w.\-]+/[\w.\-]+(\.git)?$"
    if not re.match(https_pattern, v) and not re.match(ssh_pattern, v):
        msg = f"Invalid git repository URL: '{v}'"
        raise ValueError(msg)
    return v

@field_validator("repo_path")
@classmethod
def validate_repo_path(cls, v: str | None) -> str | None:
    if v is None:
        return v
    path = Path(v)
    if not path.is_absolute():
        msg = "repo_path must be an absolute path"
        raise ValueError(msg)
    return v
```

Branch names are validated too, because docsfy uses them both in URLs and on disk.

From `src/docsfy/models.py`:

```python
@field_validator("branch")
@classmethod
def validate_branch(cls, v: str) -> str:
    if "/" in v:
        msg = (
            f"Invalid branch name: '{v}'. Branch names cannot contain slashes "
            "— use hyphens instead (e.g., release-1.x)."
        )
        raise ValueError(msg)
    if not re.match(r"^[a-zA-Z0-9][a-zA-Z0-9._-]*$", v):
        msg = f"Invalid branch name: '{v}'"
        raise ValueError(msg)
    if ".." in v:
        msg = f"Invalid branch name: '{v}'"
        raise ValueError(msg)
    return v
```

> **Tip:** Use branch names like `release-1.x` or `v2.0.1`. Do not use `release/1.x`.

### SSRF Protections For Remote Repositories

Before docsfy clones a remote repository, it rejects URLs that point to localhost, private IPs, or DNS names that resolve to non-global addresses. That blocks many common server-side request forgery mistakes.

From `src/docsfy/api/projects.py`:

```python
# Check for localhost
if hostname in ("localhost", "127.0.0.1", "::1", "0.0.0.0"):
    raise HTTPException(
        status_code=400,
        detail="Repository URL must not target localhost or private networks",
    )

try:
    addr = ipaddress.ip_address(hostname)
    if not addr.is_global:
        raise HTTPException(
            status_code=400,
            detail="Repository URL must not target localhost or private networks",
        )
except ValueError:
    resolved = await loop.run_in_executor(
        None,
        socket.getaddrinfo,
        hostname,
        None,
        socket.AF_UNSPEC,
        socket.SOCK_STREAM,
    )
    for _family, _socktype, _proto, _canonname, sockaddr in resolved:
        ip_str = sockaddr[0]
        addr = ipaddress.ip_address(ip_str)
        if not addr.is_global:
            raise HTTPException(
                status_code=400,
                detail="Repository URL resolves to a private network address",
            )
```

When docsfy calls Git, it terminates options before the repository URL, which also helps prevent Git option injection.

From `src/docsfy/repository.py`:

```python
clone_cmd = ["git", "clone", "--depth", "1"]
if branch:
    clone_cmd += ["--branch", branch]
clone_cmd += ["--", repo_url, str(repo_path)]
```

Repository URLs are also cleaned up before logging if they include embedded credentials. The `_redact_url()` helper in `src/docsfy/api/projects.py` replaces any username or password portion with `***@host`.
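A minimal sketch of that kind of redaction (not the project's exact `_redact_url()`), shown here for HTTPS-style URLs:

```python
# Sketch of credential redaction before logging a repository URL.
# Illustrative only; the real _redact_url() helper may handle more cases.
from urllib.parse import urlsplit, urlunsplit


def redact_url(url: str) -> str:
    """Replace any user:password portion of an HTTP(S) URL with '***'."""
    parts = urlsplit(url)
    if "@" not in parts.netloc:
        return url
    host = parts.netloc.rsplit("@", 1)[-1]
    return urlunsplit(parts._replace(netloc=f"***@{host}"))


# redact_url("https://user:token@github.com/org/repo.git")
# -> "https://***@github.com/org/repo.git"
```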

> **Note:** The SSRF protection here is useful, but it is not a full network sandbox. The code explicitly recommends handling advanced cases such as DNS rebinding with network or firewall controls.

### Local Repository Paths Are Admin-Only

Remote repositories are the normal path for regular users. Local filesystem access is intentionally more restricted.

From `src/docsfy/api/projects.py`:

```python
if gen_request.repo_path and not request.state.is_admin:
    raise HTTPException(
        status_code=403,
        detail="Local repo path access requires admin privileges",
    )

if gen_request.repo_path:
    repo_p = Path(gen_request.repo_path)
    if not repo_p.exists():
        raise HTTPException(
            status_code=400,
            detail=f"Repository path does not exist: '{gen_request.repo_path}'",
        )
    if not (repo_p / ".git").exists():
        raise HTTPException(
            status_code=400,
            detail=f"Not a git repository (no .git directory): '{gen_request.repo_path}'",
        )
```

That gives you two useful guarantees:

- Non-admin users cannot point docsfy at arbitrary local directories.
- Even admins must provide an existing Git working tree, not just any path.

## Filesystem And Path Traversal Protection

docsfy stores generated content under `DATA_DIR`, using an owner-aware layout:

`/data/projects/<owner>/<project>/<branch>/<provider>/<model>/...`

Because those values become real path segments, the backend validates them aggressively.

From `src/docsfy/storage.py`:

```python
def get_project_dir(
    name: str,
    ai_provider: str = "",
    ai_model: str = "",
    owner: str = "",
    branch: str = DEFAULT_BRANCH,
) -> Path:
    if not branch:
        msg = "branch is required for project directory paths"
        raise ValueError(msg)
    if not ai_provider or not ai_model:
        msg = "ai_provider and ai_model are required for project directory paths"
        raise ValueError(msg)
    for segment_name, segment in [
        ("branch", branch),
        ("ai_provider", ai_provider),
        ("ai_model", ai_model),
    ]:
        if (
            "/" in segment
            or "\\" in segment
            or ".." in segment
            or segment.startswith(".")
        ):
            msg = f"Invalid {segment_name}: '{segment}'"
            raise ValueError(msg)
```

There are similar checks for project names and owners:

- Project names must match `^[a-zA-Z0-9][a-zA-Z0-9._-]*$`
- Owners cannot contain `/`, `\`, `..`, or a leading `.`
- Unsafe slugs are skipped during generation
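A minimal sketch of the first two rules, using the regex quoted above (the real validators live in the storage layer and may differ in detail):

```python
# Sketch of the project-name and owner rules listed above; illustrative only.
import re

_NAME_RE = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9._-]*$")


def is_safe_project_name(name: str) -> bool:
    return bool(_NAME_RE.match(name))


def is_safe_owner(owner: str) -> bool:
    return not ("/" in owner or "\\" in owner or ".." in owner or owner.startswith("."))
```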

Generated filenames go through an additional confinement check before docsfy writes them.

From `src/docsfy/postprocess.py`:

```python
def _confined_path(base: Path, relative_name: str) -> Path:
    """Ensure a relative filename resolves inside the base directory."""
    if any(ord(ch) < 32 for ch in relative_name):
        raise ValueError(f"Unsafe generated filename: {relative_name!r}")
    candidate = (base / relative_name).resolve()
    base_resolved = base.resolve()
    try:
        candidate.relative_to(base_resolved)
    except ValueError as exc:
        raise ValueError(f"Unsafe generated filename: {relative_name!r}") from exc
    candidate.parent.mkdir(parents=True, exist_ok=True)
    return candidate
```

Even when serving already-generated docs, docsfy checks that the requested file stays inside the resolved site directory.

From `src/docsfy/main.py`:

```python
file_path = site_dir / path
try:
    file_path.resolve().relative_to(site_dir.resolve())
except ValueError as exc:
    raise HTTPException(status_code=403, detail="Access denied") from exc
if not file_path.exists() or not file_path.is_file():
    raise HTTPException(status_code=404, detail="File not found")
```

In practical terms, these checks are there to block:

- `..`-style path traversal
- hidden or dot-prefixed path segments where they are not expected
- provider, model, and branch values that would escape the intended directory tree
- unsafe generated filenames in post-processing and cross-linking

## Sanitizing AI-Generated HTML

docsfy treats generated content as something that still needs guardrails. Markdown is rendered to HTML, then run through a sanitizer before the final page is served.

From `src/docsfy/renderer.py`:

```python
def _sanitize_html(html: str) -> str:
    html = re.sub(
        r"<script[^>]*>.*?</script>", "", html, flags=re.DOTALL | re.IGNORECASE
    )
    for tag in ["iframe", "object", "embed", "form"]:
        html = re.sub(
            rf"<{tag}[^>]*>.*?</{tag}>", "", html, flags=re.DOTALL | re.IGNORECASE
        )
        html = re.sub(rf"<{tag}[^>]*/>", "", html, flags=re.IGNORECASE)
    html = re.sub(r'\s+on\w+\s*=\s*["\'][^"\']*["\']', "", html, flags=re.IGNORECASE)
    html = re.sub(r"\s+on\w+\s*=\s*\S+", "", html, flags=re.IGNORECASE)

    def _sanitize_url_attr(match: re.Match) -> str:
        attr = match.group(1)
        quote = match.group(2)
        url = match.group(3)
        clean_url = url.strip()
        if clean_url.startswith(("http://", "https://", "#", "/", "mailto:")):
            return match.group(0)
        return f"{attr}={quote}#{quote}"
```

That sanitizer does a few important things:

- Removes `<script>` blocks entirely
- Removes `<iframe>`, `<object>`, `<embed>`, and `<form>`
- Strips inline event handlers such as `onclick=` and `onerror=`
- Rewrites unsafe `href=` and `src=` values to `#`
- Preserves the allowed URL patterns `http://`, `https://`, `#`, `/`, and `mailto:`

Template autoescaping stays on for HTML rendering. From `src/docsfy/renderer.py`:

```python
_jinja_env = Environment(
    loader=FileSystemLoader(str(TEMPLATES_DIR)),
    autoescape=select_autoescape(["html"]),
)
```

The static search UI is also careful about text rendering. It uses DOM `textContent` for titles and snippets instead of injecting HTML from the search index. From `src/docsfy/static/search.js`:

```javascript
title.textContent = m.title
preview.textContent = snippet
```

> **Warning:** Sanitization greatly reduces XSS risk, but it does not make AI output magically correct or safe to publish. Generated docs can still contain bad advice, stale instructions, or links you would not want to ship without review.

## Deployment Defaults Worth Knowing

A few security-sensitive defaults are worth keeping in mind when you deploy docsfy.

Startup refuses a missing or too-short `ADMIN_KEY`. From `src/docsfy/main.py`:

```python
if not settings.admin_key:
    logger.error("ADMIN_KEY environment variable is required")
    raise SystemExit(1)

if len(settings.admin_key) < 16:
    logger.error("ADMIN_KEY must be at least 16 characters long")
    raise SystemExit(1)
```

FastAPI's built-in interactive docs and OpenAPI schema are disabled by default. From `src/docsfy/main.py`:

```python
app = FastAPI(
    title="docsfy",
    description="AI-powered documentation generator",
    version="0.1.0",
    lifespan=lifespan,
    docs_url=None,
    redoc_url=None,
    openapi_url=None,
)
```

When you run `docsfy-server` directly, it defaults to `127.0.0.1`. From `src/docsfy/main.py`:

```python
host = os.getenv("HOST", "127.0.0.1")
port = int(os.getenv("PORT", "8000"))
```

The container image drops to a non-root user, and the Compose file reads `ADMIN_KEY` from the environment instead of hardcoding it.

From `Dockerfile`:

```dockerfile
RUN useradd --create-home --shell /bin/bash -g 0 appuser \
  && mkdir -p /data \
  && chown appuser:0 /data \
  && chmod -R g+w /data

USER appuser
```

From `docker-compose.yaml`:

```yaml
env_file:
  - .env
environment:
  - ADMIN_KEY=${ADMIN_KEY}
```

From `entrypoint.sh`:

```sh
exec uv run --no-sync uvicorn docsfy.main:app \
    --host 0.0.0.0 --port 8000
```

A good production baseline is simple:

- Keep `ADMIN_KEY` in the environment, not in git-tracked files.
- Put docsfy behind HTTPS.
- Leave `SECURE_COOKIES=true`.
- Use named users and roles instead of sharing one secret.
- Treat local `repo_path` generation as a controlled admin capability.
- Review generated docs before publishing them more widely.


## Related Pages

- [Authentication and Roles](authentication-and-roles.html)
- [User and Access Management](user-and-access-management.html)
- [Authentication API](auth-api.html)
- [Projects API](projects-api.html)
- [Deployment and Runtime](deployment-and-runtime.html)

---

Source: troubleshooting.md

# Troubleshooting

When something goes wrong, start with the exact message you see in the UI, CLI, or browser network tab. `docsfy` usually passes backend errors through directly, so messages like `Frontend not built`, `Invalid branch name`, `Unauthorized`, or `Variant '...' is already being generated` are usually the shortest path to the fix.

## Quick Triage

- If `/api/*` works but `/` or `/login` does not, this is usually a frontend build problem.
- If a generation fails almost immediately, before cloning starts, this is usually a provider CLI problem.
- If the error mentions `Invalid branch name`, `Remote branch ... not found`, or `does not match expected`, check the branch rules first.
- If you see `already being generated`, `Abort first`, or `Multiple active variants found`, you hit a generation lock or race.
- If you see `401`, repeated redirects to `/login`, or a missing project you expected to see, check authentication, roles, and project access.
- If the dashboard loads but does not update live, check WebSocket connectivity. The browser app can fall back to polling; the CLI `--watch` path cannot.
- The backend health endpoint should return:

```json
{"status":"ok"}
```
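A quick programmatic check of that endpoint, assuming a local deployment on port 8000 (adjust the host and port to your setup):

```python
# Quick health probe; host and port are assumptions for a local deployment.
import httpx

resp = httpx.get("http://localhost:8000/health", timeout=5.0)
resp.raise_for_status()
assert resp.json() == {"status": "ok"}
print("backend is up")
```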

## Common Settings To Check

Server-side environment comes from `.env`:

```dotenv
ADMIN_KEY=
AI_PROVIDER=cursor
AI_MODEL=gpt-5.4-xhigh-fast
AI_CLI_TIMEOUT=60
LOG_LEVEL=INFO
DATA_DIR=/data
SECURE_COOKIES=true
# DEV_MODE=true
```

CLI access comes from `~/.config/docsfy/config.toml`:

```toml
[default]
server = "dev"

[servers.dev]
url = "http://localhost:8000"
username = "admin"
password = "<your-dev-key>"
```

> **Note:** Browser login and CLI login are different. The browser uses the `docsfy_session` cookie. The CLI sends the configured password/API key as a Bearer token.

> **Warning:** `ADMIN_KEY` is required and must be at least 16 characters long. If it is empty or too short, admin access will not work and the server will not start cleanly.

## Missing Frontend Build

### What it looks like

- Visiting `/`, `/login`, or another SPA route returns a 404.
- The response detail says `Frontend not built. Run: cd frontend && npm run build`.
- The HTML page loads, but `/assets/*` files 404 and the dashboard appears unstyled or broken.
- The API works, but the browser UI does not.

### Why it happens

In production, `docsfy` serves the React app from `frontend/dist`. If `frontend/dist/index.html` is missing, the SPA cannot load at all.

```python
index = _frontend_dist / "index.html"
if index.exists():
    return FileResponse(str(index))
raise HTTPException(
    status_code=404,
    detail="Frontend not built. Run: cd frontend && npm run build",
)
```

The frontend build is defined in `frontend/package.json`:

```json
{
  "scripts": {
    "dev": "vite",
    "build": "tsc -b && vite build",
    "lint": "eslint .",
    "preview": "vite preview",
    "test": "vitest run"
  }
}
```

If you deploy with Docker, the image build is supposed to compile the frontend before the runtime image is created:

```dockerfile
WORKDIR /app/frontend
COPY frontend/package.json frontend/package-lock.json ./
RUN npm ci
COPY frontend/ ./
RUN npm run build
```

### How to fix it

1. Build the frontend:
   ```shell
   cd frontend
   npm ci
   npm run build
   ```
2. Restart the server after the build finishes.
3. If you are using Docker, rebuild the image so the new `frontend/dist` is copied into the runtime image.
4. If only `/assets/*` is failing, treat that the same way: rebuild the frontend and redeploy the full build output, not just `index.html`.

> **Tip:** If `/api/status` works but `/` does not, the backend is probably fine and only the frontend build is missing.

### Development mode

For local development, this repo can run Vite instead of serving a static build. That is controlled by `DEV_MODE=true`. In that mode the entrypoint runs `npm ci` and starts the Vite dev server automatically.

## Provider CLI Failures

### What it looks like

- A generation switches to `error` almost immediately.
- The error appears before cloning or page generation starts.
- One provider works, but another fails on the same machine.
- You omitted provider/model and the server picked defaults you were not expecting.

### What `docsfy` checks

Before it clones a repository, `docsfy` checks whether the selected provider CLI is available:

```python
cli_flags = ["--trust"] if ai_provider == "cursor" else None
available, msg = await check_ai_cli_available(
    ai_provider, ai_model, cli_flags=cli_flags
)
if not available:
    await update_and_notify(
        gen_key,
        project_name,
        ai_provider,
        ai_model,
        status="error",
        owner=owner,
        branch=branch,
        error_message=msg,
    )
    return
```

Supported providers are fixed in the backend:

```python
VALID_PROVIDERS = ("claude", "gemini", "cursor")
```

### What to check

- Make sure you are using one of the supported provider names: `claude`, `gemini`, or `cursor`.
- If you omitted `ai_provider` or `ai_model`, remember that the server uses its defaults. In this repo those defaults are `cursor` and `gpt-5.4-xhigh-fast`.
- Remember that `docsfy` login and provider login are different. Your `docsfy` API key or browser session authenticates you to `docsfy`, but Claude, Gemini, and Cursor still need to be authenticated separately on the server.
- If you are running outside the official container, make sure the provider CLI is installed and authenticated on `PATH` for the same user that runs `docsfy-server`.
- A model suggestion in the UI is historical, not a live entitlement check. If a remembered model now fails, verify that the provider CLI still has access to it.
- If the provider CLI is installed but slow to start, increase `AI_CLI_TIMEOUT`.

The Docker image installs these CLIs at build time:

```dockerfile
RUN /bin/bash -o pipefail -c "curl -fsSL https://claude.ai/install.sh | bash"
RUN /bin/bash -o pipefail -c "curl -fsSL https://cursor.com/install | bash"
RUN mkdir -p /home/appuser/.npm-global \
  && npm config set prefix '/home/appuser/.npm-global' \
  && npm install -g @google/gemini-cli

ENV PATH="/home/appuser/.local/bin:/home/appuser/.npm-global/bin:${PATH}"
```

In Docker, generation runs the CLIs as this runtime user:

```dockerfile
# Switch back to non-root user for runtime
USER appuser

# Ensure CLIs are in PATH
ENV PATH="/home/appuser/.local/bin:/home/appuser/.npm-global/bin:${PATH}"
# Set HOME for OpenShift compatibility (random UID has no passwd entry)
ENV HOME="/home/appuser"
```

> **Warning:** Provider failures happen before cloning. If a run fails instantly, check provider installation, provider authentication, and `AI_PROVIDER`/`AI_MODEL` first, not just the repo URL.

> **Note:** The `cursor` provider is handled slightly differently: `docsfy` automatically adds `--trust` when it checks and calls the CLI.

## Branch Validation And Branch Mismatch

### What it looks like

- The request is rejected immediately with `Invalid branch name`.
- A remote run fails with a clone error such as `Remote branch ... not found`.
- A local path run fails because the checked-out branch does not match what you asked for.

### What counts as a valid branch

`docsfy` defaults to `main` and validates branch names before generation starts:

```python
@field_validator("branch")
@classmethod
def validate_branch(cls, v: str) -> str:
    if "/" in v:
        msg = (
            f"Invalid branch name: '{v}'. Branch names cannot contain slashes "
            "— use hyphens instead (e.g., release-1.x)."
        )
        raise ValueError(msg)
    if not re.match(r"^[a-zA-Z0-9][a-zA-Z0-9._-]*$", v):
        msg = f"Invalid branch name: '{v}'"
        raise ValueError(msg)
    if ".." in v:
        msg = f"Invalid branch name: '{v}'"
        raise ValueError(msg)
    return v
```

That means:

- Valid: `main`, `dev`, `release-1.x`, `v2.0.1`
- Invalid: `release/v2.0`, `.hidden`, `../etc/passwd`

The slash rule exists because the branch is part of URLs like `/docs/<project>/<branch>/<provider>/<model>/...`.

### Remote repository branch problems

Remote repositories are cloned with `git clone --depth 1`, and `docsfy` passes the branch if you supplied one. If that branch does not exist upstream, you will get a clone failure such as:

- `Clone failed: fatal: Remote branch 'nonexistent' not found`

### Local repository branch problems

When you generate from `repo_path`, `docsfy` checks the current branch of that working tree. If it does not match the branch you requested, it fails with an error like:

- `Branch 'main' does not match expected 'v2.0'`

### How to fix it

- Use the exact branch name that exists in Git.
- Replace slashes with hyphens when you need a branch that can safely live inside a `docsfy` URL.
- If you are using `repo_path`, either check out the requested branch first or change the branch value you send.
- If you omit the branch entirely, `docsfy` uses `main`.

The actual CLI syntax for a branch-specific run looks like this:

```shell
docsfy generate https://github.com/myk-org/for-testing-only --branch dev --provider gemini --model gemini-2.5-flash --force
```

## Generation Conflicts And Stuck Runs

### What it looks like

- `Variant 'project/branch/provider/model' is already being generated`
- `Cannot delete 'project/provider/model' while generation is in progress. Abort first.`
- `Cannot delete 'project' while generation is in progress. Abort running variants first.`
- `Multiple active variants found; use the branch-specific abort endpoint.`
- `Abort still in progress ... Please retry shortly.`
- The page count reaches the planned total, but the variant still shows `generating` while the activity log moves through `validating`, `cross_linking`, or `rendering`.
- A run that was `generating` becomes `error` after a restart.

### Why it happens

`docsfy` prevents two jobs from generating the same owner/project/branch/provider/model at the same time:

```python
gen_key = f"{owner}/{project_name}/{branch}/{ai_provider}/{ai_model}"
async with _gen_lock:
    if gen_key in _generating:
        raise HTTPException(
            status_code=409,
            detail=f"Variant '{project_name}/{branch}/{ai_provider}/{ai_model}' is already being generated",
        )
```

The same lock also protects delete and abort operations from racing against active generation.

### How to fix it

- If you started the same variant twice, wait for the first run to finish or abort it.
- If you only know the project name but more than one variant is active, abort the exact variant instead of aborting by name.
- If abort says it is still in progress, wait a moment and retry.
- If abort says the job already finished, refresh status before trying again.
- If page writing has finished but the variant is still `generating`, check the activity log or `current_stage` before treating it as stuck. `validating`, `cross_linking`, and `rendering` are legitimate post-generation stages.
- If the server restarted during generation, start a new run. `docsfy` intentionally marks orphaned `generating` jobs as `error` with the message `Server restarted during generation`.

The dashboard activity log uses these stage names directly:

```ts
export const GENERATION_STAGES = [
  'cloning',
  'planning',
  'incremental_planning',
  'generating_pages',
  'validating',
  'cross_linking',
  'rendering',
] as const
```

Useful CLI commands:

```shell
docsfy status for-testing-only
docsfy abort for-testing-only --branch main --provider gemini --model gemini-2.5-flash
```

> **Tip:** `--force` clears cached generation artifacts, but it does not bypass the active-generation lock.

## Access, Login, And Permission Problems

### What the status codes usually mean

- `401 Unauthorized`: you are not authenticated, your API key is wrong, or your browser session expired.
- `403 Admin access required`: you reached an admin-only endpoint.
- `403 Write access required.`: your account is `viewer`, so you can read but not generate, abort, or delete.
- `404 Not found`: the project may exist, but you do not own it and have not been granted access. `docsfy` intentionally hides inaccessible projects behind a 404 instead of exposing them with a 403.

### Browser and CLI behave differently

- Browser requests to `/docs/*` redirect to `/login` when you are not signed in.
- API and CLI requests return JSON `401` responses instead.
- The login page shows `Invalid username or password` for bad credentials, and `Unable to connect to server` for connection failures.

Admin login is always username `admin` with the admin password.

### Session cookie problems

The browser session cookie is created like this:

```python
response.set_cookie(
    "docsfy_session",
    session_token,
    httponly=True,
    samesite="strict",
    secure=settings.secure_cookies,
    max_age=SESSION_TTL_SECONDS,
)
```

> **Warning:** If you run `docsfy` on plain `http://localhost` with `SECURE_COOKIES=true`, login can appear to work and then immediately bounce back to `/login` because the browser will not send a secure cookie over plain HTTP.

For local HTTP development, set `SECURE_COOKIES=false`.

### Role and access restrictions

- `viewer` users can view projects they own or were granted, but they cannot generate or delete.
- `repo_path` generation is admin-only. Non-admin users get `Local repo path access requires admin privileges`.
- If you are using a local repository path, you can also see:
  - `Repository path does not exist: '...'`
  - `Not a git repository (no .git directory): '...'`

### Shared project access

Project sharing is owner-specific. If the wrong owner was used when access was granted, the user still will not see the project.

The CLI commands for access management are:

```shell
docsfy admin access list <project> --owner <owner>
docsfy admin access grant <project> --username <user> --owner <owner>
docsfy admin access revoke <project> --username <user> --owner <owner>
```

Admin cleanup is owner-specific too. If an admin delete call fails with `Project owner is required for admin deletion. Use ?owner=username (or ?owner= for legacy projects)`, rerun the delete against the exact owner namespace.

If the same project name and variant exist under more than one owner, you can also see ambiguity errors such as:

- `Multiple owners found for this variant, please specify owner`
- `Multiple owners found for this variant (...). Contact an admin to resolve the ambiguity.`
- `Multiple owners have variants with the same timestamp, please specify owner`

In that case, use the correct owner explicitly or ask an admin to clean up the ambiguous access.

### CLI auth checks

If the CLI is failing, start with:

```shell
docsfy config show
```

If the CLI reports a redirect such as `Error: Server redirected to /login. Check the server URL.`, verify the `url`, `username`, and `password` in `~/.config/docsfy/config.toml`.

## WebSocket Connectivity And Live Updates

### What it looks like

- The dashboard loads, but live status updates never appear.
- The CLI `--watch` command prints `WebSocket connection failed: ...`.
- The CLI prints `WebSocket connection closed unexpectedly.` or `Timed out waiting for progress update.`.
- Browser dev tools show `/api/ws` closing with code `1008` or `1001`.
- The dashboard still refreshes, but only in roughly 10-second jumps after the socket drops.

### How it works

- The browser connects to `/api/ws`.
- The server accepts either a valid `docsfy_session` cookie or `?token=<api-key>`.
- On connect, the server sends an initial `sync` payload.
- The server sends `ping` every 30 seconds, expects `pong` within 10 seconds, and closes after 2 missed pongs with code `1001` (see the client sketch after this list).
- Unauthenticated WebSocket connections are closed with code `1008`.
- During a live generation, progress can continue past page writing into post-generation stages such as `validating`, `cross_linking`, and `rendering`.
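For a custom client, the important part of that contract is answering `ping` with a JSON `pong`. A minimal sketch, assuming the server runs locally, the key is passed as `?token=`, and ping/pong messages are JSON objects with a `type` field (adjust to the real payloads):

```python
# Sketch of a custom WebSocket client that honors the heartbeat contract.
# Assumptions: local server, ?token= auth, and JSON messages with a "type"
# field; the built-in browser app and CLI already handle all of this.
import asyncio
import json

import websockets


async def watch(api_key: str) -> None:
    url = f"ws://localhost:8000/api/ws?token={api_key}"
    async with websockets.connect(url) as ws:
        async for raw in ws:
            msg = json.loads(raw)
            if msg.get("type") == "ping":
                await ws.send(json.dumps({"type": "pong"}))  # keep the socket open
            else:
                print(msg)  # sync / progress / status_change payloads


# asyncio.run(watch("docsfy_..."))
```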

The frontend reconnect logic is:

```ts
private attemptReconnect(): void {
  if (this.reconnectAttempts >= this.maxReconnectAttempts) {
    console.debug('[WS] Falling back to polling')
    this.startPolling()
    return
  }
  const delay = this.getBackoffDelay()
  this.reconnectAttempts++
  console.debug('[WS] Reconnecting, attempt', this.reconnectAttempts)
  this.reconnectTimer = setTimeout(() => this.connect(true), delay)
}
```

In the browser, `docsfy` retries the WebSocket 3 times with backoff, then falls back to polling `/api/projects` every 10 seconds.

> **Tip:** If the dashboard stops updating instantly but still refreshes within about 10 seconds, the polling fallback is probably working exactly as designed.

### Local development and proxies

The Vite dev server is already configured to proxy WebSockets to the backend:

```ts
'/api': {
  target: API_TARGET,
  changeOrigin: true,
  ws: true,
}
```

If you are running behind another proxy or load balancer, make sure it preserves WebSocket upgrade requests for `/api/ws`.

### CLI `--watch` specifics

The CLI derives its WebSocket URL from your configured server URL. That means a bad `url` in `~/.config/docsfy/config.toml` can break `--watch` even if the generation itself still starts.

A real `--watch` example from this repo's test plan is:

```shell
docsfy generate https://github.com/myk-org/for-testing-only --branch dev --provider gemini --model gemini-2.5-flash --force --watch
```

If `--watch` fails, rerun without it and use `docsfy status <project>` while you fix the network or proxy issue. The browser app has polling fallback; the CLI watch path does not.

### What to check

- If the close code is `1008`, log in again or use a valid API key/token.
- If the close code is `1001`, the client is missing heartbeat replies or an intermediary is breaking the socket. Custom clients must answer server `ping` messages with a JSON `pong` reply. The built-in browser app and CLI already do this automatically.
- If the site is served over HTTPS, the browser will use `wss://` automatically.
- If you are using Vite dev mode, make sure the dev server is running and port `5173` is exposed.
- If you are using a reverse proxy, confirm it supports WebSocket upgrades for `/api/ws` and does not terminate idle connections too aggressively.
- If the dashboard works but updates are delayed, wait for the 10-second polling interval before assuming it is stuck.

If none of the sections above fits, capture the exact error text, the provider, model, branch, whether you were using the browser or CLI, and whether `/health` returned `{"status":"ok"}`. Those details usually identify the failing layer very quickly.


## Related Pages

- [AI Provider Setup](ai-provider-setup.html)
- [Local Development](local-development.html)
- [Authentication and Roles](authentication-and-roles.html)
- [Tracking Progress and Status](tracking-progress-and-status.html)
- [WebSocket Protocol](websocket-protocol.html)

---
