sharepoint-cli — design (v0.1)

Status: approved through brainstorming, ready for implementation planning
Date: 2026-04-30
Owner: rvben

Goal

A Rust CLI for SharePoint Online files and document libraries, in the same agent-friendly mold as jira-cli and confluence-cli. Binary name: sharepoint. Crate name: sharepoint-cli. Edition 2024, Rust 1.90, MIT.

The v0.1 scope is files and document libraries on SharePoint Online via Microsoft Graph. Lists, pages, and SharePoint Server / on-prem are explicitly out of scope.

What "agent-friendly" means here

Inherited from jira/confluence — non-negotiable v0.1 properties:

Out of scope (v0.1)

API and auth

API: Microsoft Graph v1.0 (https://graph.microsoft.com/v1.0). The endpoint is overridable via the MICROSOFT_GRAPH_ENDPOINT env var so wiremock-based tests can intercept it. The login.microsoftonline.com base is overridable via MICROSOFT_LOGIN_ENDPOINT for the same reason.
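
The override can be sketched as a small pure function. This is a minimal illustration, not the planned implementation: the function name graph_base_url, ignoring blank overrides, and trimming the trailing slash are all assumptions of the sketch. In the binary the argument would come from std::env::var("MICROSOFT_GRAPH_ENDPOINT").ok().

```rust
/// Default Graph base URL named in this section.
const DEFAULT_GRAPH_ENDPOINT: &str = "https://graph.microsoft.com/v1.0";

/// Resolve the Graph base URL from an optional override value.
/// Blank overrides are ignored (an assumption of this sketch), and a
/// trailing slash is trimmed so request paths can be appended uniformly.
fn graph_base_url(override_value: Option<&str>) -> String {
    let chosen = override_value
        .map(str::trim)
        .filter(|v| !v.is_empty())
        .unwrap_or(DEFAULT_GRAPH_ENDPOINT);
    chosen.trim_end_matches('/').to_string()
}
```

Taking the value as a parameter rather than reading the environment inside the function keeps unit tests free of process-global state.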

Auth (v0.1): Device-code flow only. Client-credentials is deferred to v0.2 (see Out of scope).

First-run admin consent: the AADSTS65001 ("admin consent required") error is detected and rewritten into a help message that prints the admin-consent URL the user can forward to IT, instead of bubbling raw OAuth output.
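
A rough sketch of the detection step follows. The names AuthHelp and classify_auth_error are hypothetical, and the consent URL shape (the /adminconsent endpoint with a client_id query parameter) is an assumption that should be checked against the Microsoft identity platform documentation before implementation.

```rust
/// Outcome of inspecting the error description returned by the token endpoint.
#[derive(Debug, PartialEq)]
enum AuthHelp {
    /// AADSTS65001 was found; carry a URL the user can forward to IT.
    AdminConsentRequired { consent_url: String },
    /// Any other error; the caller reports it through the normal path.
    Other,
}

/// Detect the "admin consent required" code and build a consent URL.
/// The URL format is an assumption of this sketch.
fn classify_auth_error(error_description: &str, tenant: &str, client_id: &str) -> AuthHelp {
    if error_description.contains("AADSTS65001") {
        AuthHelp::AdminConsentRequired {
            consent_url: format!(
                "https://login.microsoftonline.com/{tenant}/adminconsent?client_id={client_id}"
            ),
        }
    } else {
        AuthHelp::Other
    }
}
```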

Module layout

src/
├── main.rs              # entrypoint; thin (parse args, build runtime, dispatch)
├── lib.rs               # re-exports for integration tests
├── cli.rs               # clap derive structs + handle_* dispatch (jira-cli style)
├── output.rs            # OutputConfig, JSON/table/quiet, exit-code mapping, use_color()
├── config.rs            # ~/.config/sharepoint/config.toml, profiles, EnvOverride
├── error.rs             # CliError + ApiError; structured exit codes
├── reference.rs         # parse "url" | "Site:Library/path" | profile-default forms
├── auth/
│   ├── mod.rs           # AuthContext: get_access_token() with auto-refresh
│   ├── device_code.rs   # device-code flow (interactive `init` and re-auth)
│   └── token_cache.rs   # ~/.cache/sharepoint/tokens.json, mode 0600
├── graph/
│   ├── mod.rs           # GraphClient: HTTP + auth + retry/backoff + paging
│   ├── sites.rs         # site/drive lookup helpers (cache by name → id)
│   ├── drives.rs        # drive item operations (get/list/move/copy/delete)
│   ├── upload.rs        # small upload + chunked upload session
│   ├── download.rs      # streaming download w/ ETag support
│   ├── sharing.rs       # createLink, permission listing
│   └── search.rs        # drive-scoped search (Graph /drive/root/search)
└── commands/
    ├── mod.rs
    ├── init.rs          # sharepoint init (interactive setup + first login)
    ├── auth.rs          # sharepoint auth login | logout | status
    ├── config.rs        # sharepoint config show | path
    ├── sites.rs         # sharepoint sites list | use <name>
    ├── drives.rs        # sharepoint drives list <site>
    ├── files.rs         # ls/stat/download/upload/cp/mv/rm/mkdir/find/share
    ├── completions.rs   # sharepoint completions <shell> [--install]
    └── schema.rs        # sharepoint schema → agent-introspection JSON

Rationale:

Configuration

~/.config/sharepoint/config.toml (XDG via dirs).

The file uses one or more named profiles. The active profile is chosen by --profile flag → SHAREPOINT_PROFILE env → the special default profile. There is no separate [default] section: a profile literally named default plays that role, which keeps the TOML grammar clean ([profile.default] and [profile.default.sites] no longer collide with anything).

[profile.default]
tenant_id    = "contoso.onmicrosoft.com"   # tenant GUID or domain
default_site = "Marketing"                 # optional; enables short refs
read_only    = false

# Optional override — empty in the common case (shipped client_id used)
# client_id  = "00000000-..."

# Optional named site shortcuts for "Site:Library/path" syntax
[profile.default.sites]
Marketing = "https://contoso.sharepoint.com/sites/Marketing"
Eng       = "https://contoso.sharepoint.com/sites/Engineering"

[profile.work]
tenant_id    = "othercorp.onmicrosoft.com"
default_site = "Sales"
read_only    = true

Resolution order (CLI flags > env vars > active profile):

Variable                  Maps to
SHAREPOINT_TENANT_ID      tenant
SHAREPOINT_CLIENT_ID      app registration override
SHAREPOINT_DEFAULT_SITE   site name or URL
SHAREPOINT_PROFILE        active profile (default: default)
SHAREPOINT_READ_ONLY      1/true/yes/on blocks all write verbs
SHAREPOINT_ACCESS_TOKEN   bypass token cache (CI escape hatch)
SHAREPOINT_REFRESH_TOKEN  seed the cache from CI secrets
SHAREPOINT_DEBUG_HTTP     include raw Graph response body in error messages
MICROSOFT_GRAPH_ENDPOINT  override Graph base URL (testing)
MICROSOFT_LOGIN_ENDPOINT  override login.microsoftonline.com base URL (testing)
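
The precedence rule and the SHAREPOINT_READ_ONLY parsing can be sketched as two small helpers. The names resolve and is_truthy are hypothetical; case-insensitive matching and whitespace trimming are assumptions the design does not state.

```rust
/// Pick the first value present, in the documented precedence order:
/// CLI flag, then environment variable, then the active profile.
fn resolve<T>(flag: Option<T>, env: Option<T>, profile: Option<T>) -> Option<T> {
    flag.or(env).or(profile)
}

/// Interpret SHAREPOINT_READ_ONLY-style values: 1/true/yes/on enable the flag.
/// Anything else, including the empty string, leaves it disabled.
fn is_truthy(raw: &str) -> bool {
    matches!(
        raw.trim().to_ascii_lowercase().as_str(),
        "1" | "true" | "yes" | "on"
    )
}
```

For example, resolve(None, Some("Sales"), Some("Marketing")) yields Some("Sales"), because the environment value outranks the profile value when no flag is given.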

Tokens never live in config.toml. They live in ~/.cache/sharepoint/tokens.json keyed by <tenant_id>:<client_id>:<oid> (where oid is the Entra oid claim from the id_token). Multiple profiles/accounts coexist.

sharepoint init runs interactive setup: tenant, optional default site, then performs a device-code login and writes both files. (When v0.2 adds the client-credentials flow, init gains an --auth client-credentials mode that prompts for client_id + client_secret instead of doing device-code.)

Reference parsing & resolution

A "reference" is whatever the user types where a SharePoint location is expected. reference.rs turns it into a ParsedRef { site, drive, path } (no network), then GraphClient::resolve turns that into (site_id, drive_id, path) (with caching).

Accepted forms (tried in order):

  1. Full SharePoint web URL. Includes the Forms/AllItems.aspx?id=... form variant the browser actually copies — the real path is extracted from the id query param. OneDrive-as-SharePoint URLs (<tenant>-my.sharepoint.com/personal/...) are accepted.
  2. Site:Library/path short form. Site resolves via the active profile's [profile.<x>.sites] map, falling through to a Graph site search if absent.
  3. :Library/path default-site form. Uses the profile's default_site.
  4. Bare Library/path. Only when default_site is set and the path's first segment matches a known library on that site. Otherwise we error rather than guess.
  5. spo://<site>/<library>/<path> URI form. Agent-friendly, unambiguous, no quoting headaches with spaces. Documented but secondary.
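
As an illustration of the offline parsing step, the sketch below handles forms 2, 3, and 5 only. Full URLs (form 1) and bare Library/path (form 4, which needs the list of known libraries) return None here. The ParsedRef field types, the helper split_library, and the choice to keep path relative to the library are assumptions of the sketch.

```rust
#[derive(Debug, PartialEq)]
struct ParsedRef {
    /// None means "use the profile's default_site".
    site: Option<String>,
    drive: String,
    /// Path inside the library, always starting with '/'.
    path: String,
}

/// Split "Library/rest/of/path" into ("Library", "/rest/of/path").
fn split_library(s: &str) -> Option<(String, String)> {
    let s = s.trim_start_matches('/');
    let (lib, rest) = match s.split_once('/') {
        Some((lib, rest)) => (lib, format!("/{rest}")),
        None => (s, "/".to_string()),
    };
    if lib.is_empty() {
        None
    } else {
        Some((lib.to_string(), rest))
    }
}

/// Parse the short forms without touching the network.
fn parse_short_ref(input: &str) -> Option<ParsedRef> {
    // Form 5: spo://<site>/<library>/<path>
    if let Some(rest) = input.strip_prefix("spo://") {
        let (site, tail) = rest.split_once('/')?;
        if site.is_empty() {
            return None;
        }
        let (drive, path) = split_library(tail)?;
        return Some(ParsedRef { site: Some(site.to_string()), drive, path });
    }
    // Form 1 (full web URLs) is out of scope for this sketch.
    if input.starts_with("https://") || input.starts_with("http://") {
        return None;
    }
    // Forms 2 and 3: "Site:Library/path" and ":Library/path".
    let (site, tail) = input.split_once(':')?;
    let (drive, path) = split_library(tail)?;
    let site = if site.is_empty() { None } else { Some(site.to_string()) };
    Some(ParsedRef { site, drive, path })
}
```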

Library-name normalization. Match on display name, case-insensitive, falling back to the URL segment when no display match exists. The display name from the API wins for output rendering.
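
The matching rule might look like the following. The Library struct and find_library are hypothetical names; the sketch assumes the list of libraries for the site has already been fetched.

```rust
/// A document library as this sketch models it: the display name from the
/// API and the last segment of the library's URL.
struct Library {
    display_name: String,
    url_segment: String,
}

/// Match on display name first (case-insensitive); fall back to the URL
/// segment only when no display name matches.
fn find_library<'a>(libraries: &'a [Library], wanted: &str) -> Option<&'a Library> {
    let wanted = wanted.to_lowercase();
    libraries
        .iter()
        .find(|l| l.display_name.to_lowercase() == wanted)
        .or_else(|| libraries.iter().find(|l| l.url_segment.to_lowercase() == wanted))
}
```

Because the function returns the matched Library, callers can render display_name in output regardless of which spelling the user typed.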

Local file references in upload/download are plain filesystem paths. - means stdin/stdout. We never interpret a local path as a SharePoint reference.

Errors are explicit, never guessed. "Library 'Shared Documents' not found on site 'Marketing'. Available: …" rather than silently falling back. Site/library names are case-insensitive, but error messages surface the canonical casing.

A process-local cache on GraphClient memoizes site→id and (site_id, drive_name)→drive_id so a single command touching multiple paths in one library doesn't repeat lookups.
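
A minimal sketch of the site→id half of that cache is shown below. SiteCache and get_or_lookup are hypothetical names, the lookup closure stands in for the Graph request, and using String for both ids and errors is a simplification.

```rust
use std::collections::HashMap;

/// Process-local memo for site name -> site id lookups.
struct SiteCache {
    ids: HashMap<String, String>,
}

impl SiteCache {
    fn new() -> Self {
        Self { ids: HashMap::new() }
    }

    /// Return the cached id, or call `lookup` once and remember its result.
    /// Keys are lowercased because site names are case-insensitive.
    /// Failed lookups are not cached, so a later call retries.
    fn get_or_lookup<F>(&mut self, site: &str, lookup: F) -> Result<String, String>
    where
        F: FnOnce(&str) -> Result<String, String>,
    {
        let key = site.to_lowercase();
        if let Some(id) = self.ids.get(&key) {
            return Ok(id.clone());
        }
        let id = lookup(site)?;
        self.ids.insert(key, id.clone());
        Ok(id)
    }
}
```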

Command surface

sharepoint init                                # interactive setup + first login
sharepoint auth login                          # device-code login (re-auth)
sharepoint auth logout                         # delete cached tokens
sharepoint auth status                         # show cached account, expiry, scopes

sharepoint config show                         # resolved config, masked secrets
sharepoint config path                         # print config file path

sharepoint sites list [--query <text>] [--limit N] [--all] [--page <token>]
                                               # delegated: returns followed sites by default;
                                               # add --query for keyword site search
sharepoint sites use <name|url>                # write default_site to active profile

sharepoint drives list <site-ref> [--limit N] [--all] [--page <token>]

sharepoint files ls <ref> [-r] [--limit N] [--all] [--page <token>]
sharepoint files stat <ref>
sharepoint files download <ref> [--output PATH] [--overwrite]   # PATH or '-' for stdout
sharepoint files upload <local> <ref> [--overwrite]             # local '-' for stdin
sharepoint files mkdir <ref> [-p]
sharepoint files cp <ref-src> <ref-dst> [--overwrite] [--cross-drive] [--timeout SECS]
sharepoint files mv <ref-src> <ref-dst> [--overwrite] [--cross-drive] [--timeout SECS]
sharepoint files rm <ref> [-r] [--yes]
sharepoint files find <ref> [--name <glob>] [--query <text>] [--limit N] [--all] [--page <token>]
sharepoint files share <ref> --type view|edit
                          --scope anonymous|org
                          [--expires 2026-12-31]

sharepoint completions <shell> [--install]
sharepoint schema                              # agent-introspection JSON

(sharepoint files search for tenant-wide search and share --scope users [--users …] [--password …] for recipient-scoped sharing are deferred to v0.2; see Out of scope.)

Rules driven by the surface:

Output and JSON shapes

Canonical item shape (every list/show command uses this; chains like stat … | jq | xargs sharepoint files download … work because the shape is stable):

{
  "id": "01ABC...",
  "name": "Q4-plan.pptx",
  "path": "/Shared Documents/2025/Q4-plan.pptx",
  "site": { "id": "...", "name": "Marketing", "url": "https://..." },
  "drive": { "id": "...", "name": "Documents" },
  "kind": "file",
  "size": 4823104,
  "etag": "\"{...},3\"",
  "hash": { "quickXor": "...", "sha1": "..." },
  "created": "2025-10-12T08:14:00Z",
  "modified": "2025-11-04T16:22:01Z",
  "web_url": "https://contoso.sharepoint.com/..."
}

Hash fields reflect what Graph populates for SharePoint: quickXorHash always, sha1Hash for many file types. (sha256Hash is documented as not produced by Graph for SharePoint, so we omit it; missing fields are absent from the JSON, not null.)

stat extension. stat returns the canonical shape plus a download_url field — a short-lived pre-authenticated URL. It is not included in any other command's output to keep the URL out of accidental logging.

{
  "id": "01ABC...", "name": "...", "path": "...",
  /* ...rest of canonical shape... */
  "download_url": "https://...short-lived..."
}

List/search shape (ls, find, drives list, sites list):

{ "total": 42, "next": null, "items": [ /* canonical items */ ] }

next is an opaque continuation token (Graph's @odata.nextLink, base64-wrapped). When the user passes --all, the CLI follows continuation links until exhausted and next is always null in the final output. Without --all, the CLI returns one Graph page (typically up to 200 items, surfaced in --help) with next set when more pages exist; agents pass --page <token> (or next from a previous response) to fetch the next page. sites list adds a source: "followed" | "search" field per the rule above.
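
The --all behaviour amounts to a loop over continuation tokens. In the sketch below, Page and collect_all are hypothetical names, the fetch closure stands in for one Graph request, and the max_pages cap is a safety limit added by the sketch rather than something this design specifies.

```rust
/// One page of results plus an opaque continuation token.
struct Page<T> {
    items: Vec<T>,
    next: Option<String>,
}

/// Follow continuation tokens until none is returned.
/// Stops early, returning what was gathered so far, if `max_pages` is hit.
fn collect_all<T, E, F>(mut fetch: F, max_pages: usize) -> Result<Vec<T>, E>
where
    F: FnMut(Option<&str>) -> Result<Page<T>, E>,
{
    let mut items = Vec::new();
    let mut token: Option<String> = None;
    for _ in 0..max_pages {
        // The first request carries no token; later ones pass the previous `next`.
        let page = fetch(token.as_deref())?;
        items.extend(page.items);
        token = page.next;
        if token.is_none() {
            break;
        }
    }
    Ok(items)
}
```

Without --all, the same fetch call would run exactly once and its next value would be returned to the caller unchanged.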

Table output is narrow by default (NAME · KIND · SIZE · MODIFIED for ls), terminal-width aware via terminal_size. ISO-8601 dates compacted to 2025-11-04 16:22 for tables, full ISO in JSON. NO_COLOR and --no-color honored.
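
The date compaction for tables can be done with plain string slicing. This is a sketch under the assumption that timestamps arrive in the UTC form shown in the canonical item shape; no time-zone conversion is attempted, and compact_timestamp is a hypothetical name.

```rust
/// Compact "2025-11-04T16:22:01Z" to "2025-11-04 16:22" for table cells.
/// Returns None when the input does not have the expected layout.
fn compact_timestamp(iso: &str) -> Option<String> {
    let (date, time) = iso.split_once('T')?;
    // First five bytes of the time part: "HH:MM".
    let hhmm = time.get(..5)?;
    if date.len() != 10 || hhmm.as_bytes()[2] != b':' {
        return None;
    }
    Some(format!("{date} {hhmm}"))
}
```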

Errors in JSON mode go to stdout, not stderr (divergence from jira-cli that we accept here):

{ "error": { "code": "not_found", "message": "Library 'Sales' not found on site 'Marketing'.", "exit": 4 } }

Followed by a non-zero exit. Plain (non-JSON) error messages still go to stderr.

Exit codes:

Exit  Meaning
0     success
1     unexpected
2     bad input / config / read-only blocked
3     auth failed / token expired & refresh failed
4     site, library, or item not found
5     Graph API error
6     rate limited (429 / Retry-After honored on retry)
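
One way to keep the exit-code table in a single place is an enum with an exit_code method. The enum name ErrorKind and the from_http_status mapping (401, 404, 429, everything else) are assumptions of this sketch; the numeric codes are the ones listed above.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ErrorKind {
    Unexpected,
    BadInput,
    Auth,
    NotFound,
    GraphApi,
    RateLimited,
}

impl ErrorKind {
    /// Exit codes as listed in the table above (0 is reserved for success).
    fn exit_code(self) -> i32 {
        match self {
            ErrorKind::Unexpected => 1,
            ErrorKind::BadInput => 2,
            ErrorKind::Auth => 3,
            ErrorKind::NotFound => 4,
            ErrorKind::GraphApi => 5,
            ErrorKind::RateLimited => 6,
        }
    }

    /// Hypothetical mapping from a Graph HTTP status to an error kind.
    fn from_http_status(status: u16) -> ErrorKind {
        match status {
            401 => ErrorKind::Auth,
            404 => ErrorKind::NotFound,
            429 => ErrorKind::RateLimited,
            _ => ErrorKind::GraphApi,
        }
    }
}
```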

Auth flow & token handling

Device-code flow (the only v0.1 flow).

sharepoint init   # or `sharepoint auth login`

 POST https://login.microsoftonline.com/<tenant>/oauth2/v2.0/devicecode
   scope = openid profile offline_access User.Read Files.ReadWrite.All Sites.Read.All

 stderr:
   To sign in, open https://microsoft.com/devicelogin in a browser
   and enter code: ABCD-EFGH

 poll the token endpoint every `interval` seconds (default 5):
     - 200 with token                                               → success, exit polling
     - 400 authorization_pending                                    → keep polling at the same interval
     - 400 slow_down                                                → bump interval by +5s and keep polling
     - 400 authorization_declined / expired_token / access_denied   → terminal failure
     - 400 bad_verification_code (transient)                        → keep polling

 on success: extract `oid` and `tid` from the id_token, write tokens.json
              (mode 0600, written to a temp file in the same dir then renamed),
              print "Signed in as alice@contoso.com" to stderr

Total wait capped by expires_in from the device-code response (typically 15 min). Ctrl-C cancels cleanly and does not write a partial cache.
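
The polling rules above can be expressed as a pure function that is easy to unit-test without HTTP. PollStep and on_poll_error are hypothetical names, and treating unrecognised error codes as terminal is an assumption of this sketch.

```rust
#[derive(Debug, PartialEq)]
enum PollStep {
    /// Keep polling, waiting this many seconds before the next request.
    Continue { interval_secs: u64 },
    /// Stop polling; the login failed for the given reason.
    Fail(String),
}

/// Decide what to do after a 400 response from the token endpoint.
/// `error` is the OAuth `error` field; `interval_secs` is the current interval.
fn on_poll_error(error: &str, interval_secs: u64) -> PollStep {
    match error {
        "authorization_pending" | "bad_verification_code" => {
            PollStep::Continue { interval_secs }
        }
        "slow_down" => PollStep::Continue { interval_secs: interval_secs + 5 },
        "authorization_declined" | "expired_token" | "access_denied" => {
            PollStep::Fail(error.to_string())
        }
        // Unknown codes end the flow rather than polling forever (sketch assumption).
        other => PollStep::Fail(format!("unexpected error: {other}")),
    }
}
```

The caller would still enforce the overall expires_in deadline around this function, since the function only sees one response at a time.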

Token cache: ~/.cache/sharepoint/tokens.json, written 0600 via tempfile-and-rename:

{
  "version": 1,
  "entries": {
    "<tenant_id>:<client_id>:<oid>": {
      "account": {
        "username": "alice@contoso.com",   // preferred_username from id_token
        "name":     "Alice Example",       // name from id_token
        "tenant_id": "<tid>",              // tid from id_token
        "oid":       "<oid>"               // stable per-user-per-tenant identifier
      },
      "access_token":            "...",
      "access_token_expires_at": "2026-04-30T15:42:00Z",
      "refresh_token":           "...",    // rotates on every refresh
      "scopes": ["openid","profile","offline_access","User.Read","Files.ReadWrite.All","Sites.Read.All"]
    }
  }
}

There is intentionally no refresh_token_expires_at: Microsoft does not return a refresh-token expiry in the device-code response, refresh tokens rotate on every refresh, and old ones are not always immediately invalidated. We rely on the actual refresh request: if it fails with invalid_grant, we treat the cache entry as dead and tell the user to re-login. Every successful refresh atomically replaces the entry (tempfile + rename) so the rotated token never coexists on disk with the previous one.
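
A minimal sketch of the tempfile-and-rename write using only the standard library follows. The function name write_atomically and the ".tmp" sibling name are assumptions; a real implementation might prefer a randomised temp name to tolerate concurrent writers.

```rust
use std::fs::{self, OpenOptions};
use std::io::Write;
use std::path::Path;

/// Write `contents` to a sibling temp file, then rename it over `path`,
/// so readers never observe a half-written cache file.
fn write_atomically(path: &Path, contents: &[u8]) -> std::io::Result<()> {
    // Same directory as the target, so the rename stays on one filesystem.
    let tmp = path.with_extension("tmp");
    // Remove any stale temp file so create_new applies fresh permissions.
    let _ = fs::remove_file(&tmp);

    let mut opts = OpenOptions::new();
    opts.write(true).create_new(true);
    #[cfg(unix)]
    {
        use std::os::unix::fs::OpenOptionsExt;
        opts.mode(0o600); // owner read/write only, applied at creation time
    }

    let mut file = opts.open(&tmp)?;
    file.write_all(contents)?;
    file.sync_all()?; // flush to disk before the rename makes the file visible
    drop(file);
    fs::rename(&tmp, path)
}
```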

auth status lists every cached account, expiry, and scopes.

Refresh logic (every Graph call goes through AuthContext::get_access_token(), the method named in the module layout):

if env SHAREPOINT_ACCESS_TOKEN set                      → use it, no cache touch
if cached access token valid for at least 60s more      → use it
if cached refresh token present                         → POST /oauth2/v2.0/token with grant_type=refresh_token
                                                            on success: replace entry atomically, use new access token
                                                            on invalid_grant: treat as logged out
                                                            on transient (5xx / network): retry with backoff up to 3 times
otherwise                                               → exit code 3, "run `sharepoint auth login`"

The 60-second margin avoids a race where a long upload's pre-flight token check passes but the token expires mid-request.
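
The decision order, including the 60-second margin, can be sketched as a pure function driven by an injected clock value, which is what makes the "faked clock" unit tests straightforward. TokenPlan and plan_token are hypothetical names, and representing times as Unix seconds is a simplification.

```rust
#[derive(Debug, PartialEq)]
enum TokenPlan {
    UseEnvToken,
    UseCached,
    Refresh,
    LoginRequired,
}

/// Safety margin: a cached token must outlive "now" by at least this long.
const REFRESH_MARGIN_SECS: u64 = 60;

/// Choose how to obtain an access token. Times are Unix seconds.
fn plan_token(
    env_token: Option<&str>,
    cached_expires_at: Option<u64>,
    has_refresh_token: bool,
    now: u64,
) -> TokenPlan {
    if env_token.is_some() {
        return TokenPlan::UseEnvToken;
    }
    if let Some(expires_at) = cached_expires_at {
        if expires_at >= now + REFRESH_MARGIN_SECS {
            return TokenPlan::UseCached;
        }
    }
    if has_refresh_token {
        TokenPlan::Refresh
    } else {
        TokenPlan::LoginRequired
    }
}
```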

Security: the token cache and any persisted upload-session state are bearer secrets. Files are created 0600 and never appear in config show or any other diagnostic command. README and CONTRIBUTING.md warn against committing ~/.cache/sharepoint/. A keychain-backed alternative (macOS Keychain / Linux Secret Service / Windows Credential Manager) is tracked as a v0.2 candidate; v0.1 stays plaintext+0600, the storage that comparable CLIs (gh, gcloud, az) use or can fall back to, which avoids Linux-headless / WSL fallback complexity.

Testing strategy

Layer 1 — Unit tests (offline, fast, in CI). Reference parsing, config resolution, env-var precedence, glob matching, error mapping, token-cache I/O (with tempfile), refresh-margin logic with a faked clock. No HTTP.

Layer 2 — Wiremock integration tests (offline, fast, in CI). tests/mock_graph.rs stands up a wiremock server, the test sets MICROSOFT_GRAPH_ENDPOINT=http://127.0.0.1:port, and exercises every command end-to-end against canned responses. Fixtures are real Graph response bodies captured against the dev tenant once and stored under tests/fixtures/graph/*.json. Catches paging, ETag handling, error mapping, 429 backoff, upload-session chunking, JSON-error-on-stdout contract.

Layer 3 — Live e2e tests (opt-in, gated, not in default CI). make test-e2e runs against a real M365 tenant when these env vars are set:

SHAREPOINT_E2E_TENANT=contoso.onmicrosoft.com
SHAREPOINT_E2E_REFRESH_TOKEN=...                # captured from a one-time device-code login
SHAREPOINT_E2E_SITE=https://contoso.sharepoint.com/sites/CliTest
SHAREPOINT_E2E_LIBRARY="Shared Documents"

The e2e suite seeds the token cache from SHAREPOINT_E2E_REFRESH_TOKEN (the only secret CI stores) and lets the normal refresh path mint access tokens for each test run. We avoid client-credentials in v0.1 because that whole flow is deferred — and because seeding a refresh token gives the e2e harness exactly the same code path real users hit.

E2e-created items are prefixed sp-cli-e2e-<run-id>-; setup creates a fresh subfolder per run, teardown deletes it. Covers the round-trip surface: upload → stat → download → cp → mv → share → rm, plus ls/find/drives list/sites list.

CI: GitHub Actions runs Layers 1+2 on every push (Ubuntu + macOS, fmt → clippy → nextest). Layer 3 runs only on pushes to main from the repo (not forks), with secrets stored as GitHub repo secrets. Refresh-token rotation is a real concern here: the workflow re-writes its own secret with the rotated refresh token after each successful run via gh secret set so the seed doesn't go stale.

Prerequisites (must complete before implementation)

These are gating tasks the implementation plan should pick up first:

  1. Provision a Microsoft 365 Developer Program tenant (https://developer.microsoft.com/microsoft-365/dev-program). Capture tenant domain.
  2. Register an Entra (Azure AD) app in that tenant:
     - Public client (device-code) with delegated permissions: openid, profile, offline_access, User.Read, Files.ReadWrite.All, Sites.Read.All.
     - Multi-tenant so the same client_id can be shipped in the binary for users in other tenants.
     - Allow public-client/native flows (required for device-code).
     - Capture the client_id for embedding in auth/mod.rs as DEFAULT_CLIENT_ID.
  3. Create a CliTest site in the dev tenant with a Shared Documents library to use as the e2e target.
  4. Run sharepoint auth login once locally against the dev tenant to mint a refresh token, then store that refresh token as the SHAREPOINT_E2E_REFRESH_TOKEN GitHub Actions secret. (This is the only e2e secret needed; no client-secret in v0.1.)
  5. Document the Entra app setup in CONTRIBUTING.md so anyone can stand up their own dev tenant for full e2e.

Release & ops (matches jira/confluence)