Metadata-Version: 2.4
Name: htb-sdk
Version: 0.1.0
Summary: Unofficial Python SDK for the Hack The Box (labs.hackthebox.com) API, reverse-engineered from live traffic.
Project-URL: Homepage, https://github.com/
Project-URL: Source, https://github.com/
Author: Yihang Wang
License: MIT
Keywords: api,ctf,hackthebox,htb,sdk,security
Requires-Python: >=3.9
Requires-Dist: httpx>=0.27
Description-Content-Type: text/markdown

# htb-sdk

[![CI](https://github.com/WangYihang/htb-sdk/actions/workflows/ci.yml/badge.svg)](https://github.com/WangYihang/htb-sdk/actions/workflows/ci.yml)
[![PyPI](https://img.shields.io/pypi/v/htb-sdk.svg)](https://pypi.org/project/htb-sdk/)
[![Python](https://img.shields.io/pypi/pyversions/htb-sdk.svg)](https://pypi.org/project/htb-sdk/)

Unofficial Python SDK for the [Hack The Box](https://www.hackthebox.com) platform.

Hack The Box ships **no official SDK and no public API docs**. This library is
reverse-engineered from the **live traffic** the real web app
(`app.hackthebox.com`) sends to its backend (`labs.hackthebox.com`). The whole
capture → analyze → generate pipeline is committed, so when HTB changes their
API you can **re-run it and regenerate** an up-to-date map (see
[Refreshing the API map](#refreshing-the-api-map)).

> ⚠️ Unofficial. Use against your own account, respect HTB's Terms of Service,
> and don't hammer the API.

## Install

This project is managed with [uv](https://docs.astral.sh/uv/).

```bash
uv sync                  # runtime only (just the SDK + httpx)
uv sync --all-groups     # + dev tools (pytest, respx, ruff) and the capture pipeline (playwright)
```

Then run things with `uv run` (e.g. `uv run python ...`, `uv run pytest`).
Dependency groups: `dev` (pytest, respx, ruff) and `capture` (playwright, for
the traffic-capture tooling under `capture/`).

## Authentication

The SDK authenticates with an **App Token**, sent as `Authorization: Bearer …`
in the same way the web app sends its own JWT. Create one at
**Profile → Settings → App Tokens → Create App Token** on
`app.hackthebox.com`.

```python
from htb import HTBClient

client = HTBClient(token="YOUR_APP_TOKEN")
# or:  export HTB_TOKEN=...   then HTBClient()

me = client.user.me()
print(me)
```
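
The two ways of supplying a token, and the header that results, can be sketched with the standard library alone. This is a minimal illustration, not the SDK's actual code: `resolve_token` and `bearer_headers` are hypothetical helper names, and the only details taken from this README are the `HTB_TOKEN` variable and the `Authorization: Bearer …` header.

```python
import os
from typing import Mapping, Optional


def resolve_token(explicit: Optional[str] = None,
                  env: Mapping[str, str] = os.environ) -> str:
    """Prefer an explicitly passed token, else fall back to HTB_TOKEN."""
    token = explicit or env.get("HTB_TOKEN")
    if not token:
        raise ValueError("no token given and HTB_TOKEN is not set")
    return token


def bearer_headers(token: str) -> dict:
    """Build the Authorization header attached to each request."""
    return {"Authorization": f"Bearer {token}"}


if __name__ == "__main__":
    # Demo with a fake environment so nothing real is read or sent.
    fake_env = {"HTB_TOKEN": "example-token"}
    print(bearer_headers(resolve_token(env=fake_env)))
```

Passing the environment in as a parameter keeps the lookup easy to test without touching the real process environment.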

## Quick start

```python
from htb import HTBClient, Difficulty

client = HTBClient()  # reads HTB_TOKEN from the environment

# Machines
for m in client.machines.list(per_page=20):
    print(m["name"], m["difficultyText"])

box = client.machines.list(per_page=1)[0]
client.machines.spawn(box["id"])                      # start an instance
client.machines.submit_flag(box["id"], flag="HTB{...}", difficulty=Difficulty.MEDIUM)
client.machines.terminate(box["id"])

# Challenges
chal = client.challenges.get(806)
client.challenges.download(chal["id"], "challenge.zip")

# Anything not yet wrapped — raw access to any captured endpoint:
data = client.get("/api/v4/season/list")
```

## What's covered

The SDK covers everything that showed up in the captured traffic; see
[`docs/ENDPOINTS.md`](docs/ENDPOINTS.md) for the complete, auto-generated map.
The typed resource wrappers span **17 namespaces and ~104 methods**, with one
module per API area: **machines, challenges, sherlocks, fortresses, prolabs,
endgames, tracks, seasons, user, teams, universities, rankings, connection
(VPN), pwnbox, tokens** (App Tokens), **walkthroughs, platform**
(badges/tags/notices/...). For anything else, use the raw
`client.get/post/put/delete(...)` escape hatch.

### Types & enums

The package ships `py.typed`. Methods carry precise parameter and return
annotations; responses are typed with `TypedDict`s in [`htb.models`](src/htb/models),
and fixed-value fields have enums in [`htb.enums`](src/htb/enums.py)
(`Difficulty`, `MachineDifficulty`, `ChallengeDifficulty`, `OS`,
`MachineState`, `ProductType`, `SubscriptionType`). Enums subclass `str`/`int`
so they serialize transparently and compare equal to the raw API values.
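
The `str`/`int` subclassing behaviour can be seen in a small standalone sketch. The classes and member values below (`DemoOS`, `DemoRating`) are hypothetical stand-ins chosen for illustration, not the definitions in `htb.enums`.

```python
import json
from enum import Enum, IntEnum


class DemoOS(str, Enum):        # hypothetical string-valued enum
    LINUX = "Linux"
    WINDOWS = "Windows"


class DemoRating(IntEnum):      # hypothetical integer-valued enum
    EASY = 1
    HARD = 3


# Members compare equal to the raw values an API would return...
assert DemoOS.LINUX == "Linux"
assert DemoRating.HARD == 3

# ...a raw value can be turned back into a member...
assert DemoOS("Windows") is DemoOS.WINDOWS

# ...and json.dumps emits the underlying value, not the member name.
payload = json.dumps({"os": DemoOS.WINDOWS, "rating": DemoRating.EASY})
print(payload)  # {"os": "Windows", "rating": 1}
```

Because the members are real `str` and `int` instances, they can be passed straight into request parameters or JSON bodies without a manual `.value` lookup.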

## Project layout

```
src/htb/
  client.py            HTBClient — wires the resource namespaces
  _http.py             Transport: Bearer auth, retries, pagination
  errors.py            exception hierarchy
  enums.py             enums for fixed-value fields
  models/              TypedDict response models (one file per entity)
  resources/           one module per API area (machines, challenges, ...)
capture/               re-runnable capture → analyze → generate pipeline
docs/ENDPOINTS.md      generated API reference
tests/{unit,integration,e2e}/
```
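
The layout assigns pagination to the transport layer. As a rough idea of what that can involve, the sketch below walks pages through an injected fetch function. It is not the code in `_http.py`: the `fetch_page` callable and the response envelope (`data` plus `meta.last_page`) are assumptions made for illustration.

```python
from typing import Any, Callable, Dict, Iterator

Page = Dict[str, Any]


def iter_items(fetch_page: Callable[[int], Page]) -> Iterator[Any]:
    """Yield items from every page until the last page is reached.

    ASSUMPTION: each page looks like
    {"data": [...], "meta": {"current_page": n, "last_page": m}}.
    """
    page_number = 1
    while True:
        page = fetch_page(page_number)
        yield from page.get("data", [])
        meta = page.get("meta", {})
        # Stop when this is the last page, or when no page info is given.
        if page_number >= meta.get("last_page", page_number):
            break
        page_number += 1


if __name__ == "__main__":
    # Fake two-page backend standing in for real HTTP calls.
    pages = {
        1: {"data": ["a", "b"], "meta": {"current_page": 1, "last_page": 2}},
        2: {"data": ["c"], "meta": {"current_page": 2, "last_page": 2}},
    }
    print(list(iter_items(pages.__getitem__)))
```

Keeping the HTTP call behind a callable means the paging logic can be exercised offline with a plain dictionary.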

## Testing

```bash
uv run pytest                 # unit + integration (offline, respx-mocked)
uv run pytest --e2e           # + live end-to-end tests (needs HTB_TOKEN)
```

- **unit** — transport, errors, enums in isolation
- **integration** — every resource method against a mocked HTTP layer
- **e2e** — read-only calls against the real API (skipped without `--e2e`)
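
Gating a live tier behind a flag such as `--e2e` is commonly done with a `conftest.py` along these lines. This is a sketch built on pytest's standard hooks, not the repository's actual file; the `e2e` marker name, and the idea that live tests are tagged with `@pytest.mark.e2e`, are assumptions.

```python
# conftest.py (sketch)
import pytest


def pytest_addoption(parser):
    # Register the --e2e command-line flag; it is off by default.
    parser.addoption("--e2e", action="store_true", default=False,
                     help="run live end-to-end tests (needs HTB_TOKEN)")


def pytest_configure(config):
    # Declare the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line("markers", "e2e: live test against the real API")


def pytest_collection_modifyitems(config, items):
    if config.getoption("--e2e"):
        return  # flag given: leave every collected test enabled
    skip_live = pytest.mark.skip(reason="needs --e2e")
    for item in items:
        # ASSUMPTION: live tests carry the hypothetical "e2e" marker.
        if "e2e" in item.keywords:
            item.add_marker(skip_live)
```

With this in place, a plain `pytest` run reports the tagged tests as skipped instead of failing for lack of a token.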

## Refreshing the API map

When HTB changes their API, recapture and regenerate the map. The pipeline
records all read traffic plus authorized write actions from a real session:

```bash
uv run python capture/login_isolated.py     # isolated browser -> you log in once
uv run python capture/grab_token.py         # save the fresh session token
uv run python capture/extract_js_endpoints.py /tmp/htbjs/assets  # full map from frontend JS
DO_WRITES=1 uv run python capture/probe_full.py   # exercise every endpoint (GET + writes)
uv run python capture/analyze_har.py        # traffic.jsonl -> endpoints.json
uv run python capture/generate_reference.py # endpoints.json -> docs/ENDPOINTS.md
```

Capture runs in an **isolated throwaway browser profile** — it never touches
your real Microsoft Edge / Google / Microsoft sessions. See
[`capture/README.md`](capture/README.md) for the full pipeline.
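
To give a feel for a `traffic.jsonl -> endpoints.json` style step, here is a rough sketch of how captured requests might be collapsed into an endpoint map. It is not the logic of `capture/analyze_har.py`: the record shape (a `method` and a `url` key per JSON line), the rule that numeric path segments become `{id}`, and the `/api/v4/example/...` path are all assumptions made for illustration.

```python
import json
import re
from collections import defaultdict
from urllib.parse import urlsplit


def normalize_path(url: str) -> str:
    """Drop host and query, and replace numeric path segments with {id}."""
    path = urlsplit(url).path
    return re.sub(r"/\d+(?=/|$)", "/{id}", path)


def build_endpoint_map(jsonl_lines):
    """Group the HTTP methods seen for each normalized path."""
    endpoints = defaultdict(set)
    for line in jsonl_lines:
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        path = normalize_path(record["url"])
        endpoints[path].add(record["method"].upper())
    return {path: sorted(methods) for path, methods in sorted(endpoints.items())}


if __name__ == "__main__":
    # Made-up records; the "example" path is a placeholder, not a real endpoint.
    sample = [
        '{"method": "GET", "url": "https://labs.hackthebox.com/api/v4/example/806"}',
        '{"method": "get", "url": "https://labs.hackthebox.com/api/v4/example/12"}',
        '{"method": "GET", "url": "https://labs.hackthebox.com/api/v4/season/list"}',
    ]
    print(json.dumps(build_endpoint_map(sample), indent=2))
```

Collapsing IDs first is what lets many concrete URLs fold into a single documented endpoint.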

## License

MIT
