Metadata-Version: 2.4
Name: aws-bootstrap-g4dn
Version: 0.15.0
Summary: Bootstrap AWS EC2 GPU instances for hybrid local-remote development
Author: Adam Ever-Hadani
License-Expression: MIT
Project-URL: Homepage, https://github.com/promptromp/aws-bootstrap-g4dn
Project-URL: Issues, https://github.com/promptromp/aws-bootstrap-g4dn/issues
Keywords: aws,ec2,gpu,cuda,deep-learning,spot-instances,cli
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: boto3>=1.43
Requires-Dist: click>=8.4
Requires-Dist: pyyaml>=6.0.3
Requires-Dist: tabulate>=0.10
Dynamic: license-file

# aws-bootstrap-g4dn

--------------------------------------------------------------------------------

[![CI](https://github.com/promptromp/aws-bootstrap-g4dn/actions/workflows/ci.yml/badge.svg)](https://github.com/promptromp/aws-bootstrap-g4dn/actions/workflows/ci.yml)
[![GitHub License](https://img.shields.io/github/license/promptromp/aws-bootstrap-g4dn)](https://github.com/promptromp/aws-bootstrap-g4dn/blob/main/LICENSE)
[![PyPI - Version](https://img.shields.io/pypi/v/aws-bootstrap-g4dn)](https://pypi.org/project/aws-bootstrap-g4dn/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/aws-bootstrap-g4dn)](https://pypi.org/project/aws-bootstrap-g4dn/)

One command to go from zero to a **fully configured GPU dev box** on AWS — with CUDA-matched PyTorch, Jupyter, SSH aliases, and a GPU benchmark ready to run.

```bash
aws-bootstrap launch          # Spot g4dn.xlarge in ~3 minutes
ssh aws-gpu1                  # You're in, venv activated, PyTorch works
```

### ✨ Key Features

| | Feature | Details |
|---|---|---|
| 🚀 | **One-command launch** | Spot (default) or on-demand, with automatic fallback on capacity errors |
| 🔑 | **Auto SSH config** | Adds `aws-gpu1` alias to `~/.ssh/config` — no IP juggling. Cleaned up on terminate |
| 🐍 | **CUDA-aware PyTorch** | Detects the installed CUDA toolkit (`nvcc`) and installs PyTorch from the matching wheel index — no more `torch.version.cuda` mismatches |
| ✅ | **PyTorch smoke test** | Runs a quick `torch.cuda` matmul after setup to verify the GPU stack works end-to-end |
| 📊 | **GPU benchmark included** | CNN (MNIST) + Transformer benchmarks with FP16/FP32/BF16 precision and tqdm progress |
| 📓 | **Jupyter ready** | Lab server auto-starts as a systemd service on port 8888 — just SSH tunnel and open |
| 🖥️ | **`status --gpu`** | Shows CUDA toolkit version, driver max, GPU architecture, spot pricing, uptime, and estimated cost |
| 🌍 | **Multi-region status** | `status` with no `--region` finds instances across every enabled region and labels each with its region |
| 💾 | **EBS data volumes** | Attach persistent storage at `/data` that survives spot interruptions (and termination with `--keep-ebs`) and can be reattached to new instances |
| 🗑️ | **Clean terminate** | Terminates instances, removes SSH aliases, cleans up EBS volumes (or preserves them with `--keep-ebs`) |
| 🤖 | **[Agent Skill](https://agentskills.io/)** | Included Claude Code plugin lets LLM agents autonomously provision, manage, and tear down GPU instances |

### 🎯 Target Workflows

1. **Jupyter server-client** — Jupyter runs on the instance, connect from your local browser
2. **VSCode Remote SSH** — opens `~/workspace` with pre-configured CUDA debug/build tasks and an example `.cu` file
3. **NVIDIA Nsight remote debugging** — GPU debugging over SSH

---

## Requirements

1. AWS profile configured with relevant permissions (profile name can be passed via `--profile` or read from `AWS_PROFILE` env var)
2. AWS CLI v2 (see the [installation guide](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html))
3. Python 3.12+ and [uv](https://github.com/astral-sh/uv)
4. An SSH key pair (see below)

## Installation

### From PyPI

```bash
pip install aws-bootstrap-g4dn
```

### With uvx (no install needed)

[uvx](https://docs.astral.sh/uv/guides/tools/) runs the CLI directly in a temporary environment — no global install required:

```bash
uvx --from aws-bootstrap-g4dn aws-bootstrap launch
uvx --from aws-bootstrap-g4dn aws-bootstrap status
uvx --from aws-bootstrap-g4dn aws-bootstrap terminate
```

### From source (development)

```bash
git clone https://github.com/promptromp/aws-bootstrap-g4dn.git
cd aws-bootstrap-g4dn
uv venv
uv sync
```

All three methods make the `aws-bootstrap` CLI available (uvx runs it from a temporary environment rather than installing it).

#### Optional: auto-activate the venv with direnv

A sample [direnv](https://direnv.net/) config is provided at [`.envrc.example`](.envrc.example). It activates the project venv (and optionally sets `AWS_PROFILE`) automatically when you `cd` into the repo:

```bash
cp .envrc.example .envrc
# edit .envrc to uncomment/set AWS_PROFILE if desired
direnv allow
```

`.envrc` is git-ignored, so your local copy stays out of version control.

## SSH Key Setup

The CLI expects an Ed25519 SSH public key at `~/.ssh/id_ed25519.pub` by default. If you don't have one, generate it:

```bash
ssh-keygen -t ed25519
```

Accept the default path (`~/.ssh/id_ed25519`) and optionally set a passphrase. The public key is imported into AWS as an EC2 key pair automatically on first launch.

To use a different key, pass `--key-path`:

```bash
aws-bootstrap launch --key-path ~/.ssh/my_other_key.pub
```

**Robust key handling** (so you never end up with an instance you can't reach):

- **Missing local key** — if `--key-path` doesn't exist, `launch` auto-generates an Ed25519 key pair there (instead of aborting).
- **Name collision with a different key** — if an AWS key pair already exists with the target `--key-name` but its public key differs from your local key (e.g. created from another machine), the existing AWS key pair is **left untouched** and your local key is imported under a deterministic derived name `aws-bootstrap-key-<fp8>`, which the instance is launched with. You always hold the matching private key.
- **Unreachable instance** — if SSH still fails with an authentication/host-key error, `launch` **stops immediately** and prints the real `ssh` error (no more silent 5-minute "SSH not ready" loop masking a `Permission denied (publickey)`).
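
The derived name used in the collision case can be pictured with a short sketch. This is illustrative only: it assumes `<fp8>` is the first eight hex characters of a SHA-256 digest over the decoded public key blob, which the text above does not specify, and `derived_key_name` is a hypothetical helper rather than the CLI's actual function.

```python
import base64
import hashlib


def derived_key_name(public_key_line: str, prefix: str = "aws-bootstrap-key") -> str:
    """Return a deterministic key-pair name for an OpenSSH public key line.

    Assumption: <fp8> is the first 8 hex chars of SHA-256 over the key blob.
    """
    # An OpenSSH public key line looks like: "<type> <base64-blob> [comment]"
    blob_b64 = public_key_line.split()[1]
    blob = base64.b64decode(blob_b64)
    fp8 = hashlib.sha256(blob).hexdigest()[:8]
    return f"{prefix}-{fp8}"
```

Because the name depends only on the key material (not the comment or the machine), the same local key maps to the same AWS key pair name on every launch, while a different key gets a different name.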

## Usage

### 🚀 Launching an Instance

```bash
# Show available commands
aws-bootstrap --help

# Dry run — validates AMI lookup, key import, and security group without launching
aws-bootstrap launch --dry-run

# Launch a spot g4dn.xlarge (default)
aws-bootstrap launch

# Launch on-demand in a specific region with a custom instance type
aws-bootstrap launch --on-demand --instance-type g5.xlarge --region us-east-1

# Try multiple regions in order until one has spot capacity
aws-bootstrap launch --region us-west-2 --region us-east-1 --region eu-west-1

# Keep retrying (bounded exponential backoff) until spot capacity frees up
aws-bootstrap launch --wait --wait-timeout 30m
aws-bootstrap launch --region us-west-2 --region us-east-1 --wait --wait-timeout 1h

# Launch without running the remote setup script
aws-bootstrap launch --no-setup

# Use a specific Python version in the remote venv
aws-bootstrap launch --python-version 3.13

# Use a non-default SSH port
aws-bootstrap launch --ssh-port 2222

# Attach a persistent EBS data volume (96 GB gp3, mounted at /data)
aws-bootstrap launch --ebs-storage 96

# Reattach an existing EBS volume from a previous instance
aws-bootstrap launch --ebs-volume-id vol-0abc123def456

# Use a specific AWS profile
aws-bootstrap launch --profile my-aws-profile
```

After launch, the CLI:

1. **Creates/attaches EBS volume** (if `--ebs-storage` or `--ebs-volume-id` was specified)
2. **Adds an SSH alias** (e.g. `aws-gpu1`) to `~/.ssh/config`
3. **Runs remote setup** — installs utilities, creates a Python venv, installs CUDA-matched PyTorch, sets up Jupyter
4. **Mounts EBS volume** at `/data` (if applicable — formats new volumes, mounts existing ones as-is)
5. **Runs a CUDA smoke test** — verifies `torch.cuda.is_available()` and runs a quick GPU matmul
6. **Prints connection commands** — SSH, Jupyter tunnel, GPU benchmark, and terminate

```bash
ssh aws-gpu1                  # venv auto-activates on login
```

### 🌍 Finding Capacity (regions & `--wait`)

Spot `InsufficientInstanceCapacity` is scoped to a **region and availability zone** — a type that's unavailable in `us-west-2` right now may be plentiful in `us-east-1`, and capacity for a given AZ frees up continuously as other instances terminate. Two options help you get a GPU without babysitting the prompt:

- **Multiple regions** — pass `--region` more than once. Each launch attempt tries the regions **in the order given**, spot-first, and uses the first one with capacity:

  ```bash
  aws-bootstrap launch --region us-west-2 --region us-east-1 --region eu-west-1
  ```

- **`--wait`** — on insufficient spot capacity, keep retrying with **capped, jittered exponential backoff** until `--wait-timeout` (default `30m`; accepts `90s`, `30m`, `1h`, or bare seconds). On timeout it **hard-fails** (it does not silently fall back to on-demand):

  ```bash
  aws-bootstrap launch --region us-west-2 --region us-east-1 --wait --wait-timeout 1h
  ```
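
The `--wait-timeout` formats listed above can be parsed in a few lines. This is a minimal sketch, not the CLI's actual parser: `parse_duration` is a hypothetical name, and rejecting any suffix other than `s`, `m`, or `h` is an assumption.

```python
import re

_UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text: str) -> int:
    """Convert '90s', '30m', '1h', or bare seconds ('45') to whole seconds."""
    match = re.fullmatch(r"(\d+)([smh]?)", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    value, unit = match.groups()
    return int(value) * _UNITS.get(unit, 1)  # no suffix means seconds
```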

**How `--wait` + multiple `--region` combine:** a **region sweep is the inner loop, backoff is the outer loop**. Each cycle tries spot in every `--region` in order *with no delay between regions*; only when **all** regions miss does it sleep (backoff) and sweep again. So `--wait --region A --region B` means "try A then B instantly; if both dry, back off and retry A then B" — repeating until timeout — *not* "wait on A, then try B." Backoff escalates per sweep (not per region), region order wins every tie, and `--wait-timeout` is total wall-clock. See [docs/capacity-and-retry.md](docs/capacity-and-retry.md#how---wait-and-multiple---region-combine) for the full model.
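
The sweep-inside-backoff structure can be sketched as below. This is an illustration under stated assumptions, not the launcher's code: `try_spot` is a hypothetical callback standing in for one spot launch attempt, the base delay, cap, and jitter range are invented values, and handling of non-retryable errors is left out.

```python
import random
import time
from typing import Callable, Optional, Sequence


def launch_with_wait(
    regions: Sequence[str],
    try_spot: Callable[[str], bool],
    timeout_s: float,
    base_delay_s: float = 5.0,   # assumed value
    max_delay_s: float = 120.0,  # assumed cap
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Return the first region with capacity, or None when the timeout expires."""
    deadline = clock() + timeout_s
    sweep = 0
    while True:
        # Inner loop: one sweep over every region, in order, with no delay.
        for region in regions:
            if try_spot(region):
                return region
        # Outer loop: every region missed, so back off once per sweep.
        remaining = deadline - clock()
        if remaining <= 0:
            return None
        capped = min(max_delay_s, base_delay_s * 2**sweep)
        delay = random.uniform(capped / 2, capped)  # jitter within the cap
        sleep(min(delay, remaining))
        sweep += 1
```

The delay grows with the sweep counter rather than with the number of regions tried, which matches "backoff escalates per sweep (not per region)", and the deadline is checked against a single clock so the timeout is total wall-clock time.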

Quota errors (`VcpuLimitExceeded`, `MaxSpotInstanceCountExceeded`) and `SpotMaxPriceTooLow` are **never retried by `--wait`** (waiting can't fix them). In multi-region mode they are *not* fatal on their own: the launcher prints a `WARNING` for that region (with a region-pinned `aws-bootstrap quota …` hint), skips it, and tries the next `--region` — failing hard only once **every** region is blocked, with an aggregated message listing each region's reason and hint. Without `--wait`, a fully-exhausted spot pass still offers the interactive on-demand fallback (across all regions).

**Region default precedence** (a behavior change — previously hardcoded to `us-west-2`): explicit `--region` flags → `AWS_DEFAULT_REGION` / active profile region → `us-west-2`. This applies to every command, so a profile configured for `us-east-1` now operates in `us-east-1` by default. The active region is shown in command output.
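
The precedence chain reads naturally as a small function. This is a sketch only: `resolve_regions` is a hypothetical name, and checking `AWS_DEFAULT_REGION` before the profile region is an assumption (it mirrors how boto3 resolves regions, but the text above does not state the relative order).

```python
import os
from typing import Mapping, Optional, Sequence

FALLBACK_REGION = "us-west-2"


def resolve_regions(
    explicit: Sequence[str],
    profile_region: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> list[str]:
    """Apply the precedence: --region flags, then env/profile, then fallback."""
    if explicit:
        return list(explicit)  # explicit flags win and keep their order
    configured = env.get("AWS_DEFAULT_REGION") or profile_region
    return [configured or FALLBACK_REGION]
```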

See [docs/capacity-and-retry.md](docs/capacity-and-retry.md) for the backoff design and recommended region lists per instance family.

### 🔧 What Remote Setup Does

The setup script runs automatically on the instance after SSH becomes available:

| Step | What |
|------|------|
| **GPU verify** | Confirms `nvidia-smi` and `nvcc` are working |
| **Utilities** | Installs `htop`, `tmux`, `tree`, `jq`, `ffmpeg` |
| **Python venv** | Creates `~/venv` with `uv`, auto-activates in `~/.bashrc`. Use `--python-version` to pin a specific Python (e.g. `3.13`) |
| **CUDA-aware PyTorch** | Detects CUDA toolkit version → installs PyTorch from the matching `cu{TAG}` wheel index |
| **CUDA smoke test** | Runs `torch.cuda.is_available()` + GPU matmul to verify the stack |
| **GPU benchmark** | Copies `gpu_benchmark.py` to `~/gpu_benchmark.py` |
| **GPU smoke test notebook** | Copies `gpu_smoke_test.ipynb` to `~/gpu_smoke_test.ipynb` (open in JupyterLab) |
| **Jupyter** | Configures and starts JupyterLab as a systemd service on port 8888 |
| **SSH keepalive** | Configures server-side keepalive to prevent idle disconnects |
| **VSCode workspace** | Creates `~/workspace/.vscode/` with `launch.json` and `tasks.json` (auto-detected `cuda-gdb` path and GPU arch), plus an example `saxpy.cu` |

### 📊 GPU Benchmark

A GPU throughput benchmark is pre-installed at `~/gpu_benchmark.py` on every instance:

```bash
# Run both CNN and Transformer benchmarks (default)
ssh aws-gpu1 '~/venv/bin/python ~/gpu_benchmark.py'

# CNN only, quick run
ssh aws-gpu1 '~/venv/bin/python ~/gpu_benchmark.py --mode cnn --benchmark-batches 20'

# Transformer only with custom batch size
ssh aws-gpu1 '~/venv/bin/python ~/gpu_benchmark.py --mode transformer --transformer-batch-size 16'

# Run CUDA diagnostics first (tests FP16/FP32 matmul, autocast, etc.)
ssh aws-gpu1 '~/venv/bin/python ~/gpu_benchmark.py --diagnose'

# Force FP32 precision (if FP16 has issues on your GPU)
ssh aws-gpu1 '~/venv/bin/python ~/gpu_benchmark.py --precision fp32'
```

The benchmark reports iterations/sec, samples/sec, peak GPU memory, and average batch time for each model.

### 📓 Jupyter (via SSH Tunnel)

```bash
ssh -NL 8888:localhost:8888 aws-gpu1
# Then open: http://localhost:8888
```

Or with explicit key/IP:

```bash
ssh -i ~/.ssh/id_ed25519 -NL 8888:localhost:8888 ubuntu@<public-ip>
```

A **GPU smoke test notebook** (`~/gpu_smoke_test.ipynb`) is pre-installed on every instance. Open it in JupyterLab to interactively verify the CUDA stack, run FP32/FP16 matmuls, train a small CNN on MNIST, and visualise training loss and GPU memory usage.

### 🖥️ VSCode Remote SSH

The remote setup creates a `~/workspace` folder with pre-configured CUDA debug and build tasks:

```
~/workspace/
├── .vscode/
│   ├── launch.json   # CUDA debug configs (cuda-gdb path auto-detected)
│   └── tasks.json    # nvcc build tasks (GPU arch auto-detected, e.g. sm_75)
└── saxpy.cu          # Example CUDA source — open and press F5 to debug
```

Connect directly from your terminal:

```bash
code --folder-uri vscode-remote://ssh-remote+aws-gpu1/home/ubuntu/workspace
```

Then install the [Nsight VSCE extension](https://marketplace.visualstudio.com/items?itemName=NVIDIA.nsight-vscode-edition) on the remote when prompted. Open `saxpy.cu`, set a breakpoint, and press F5.

See the [Nsight remote profiling guide](docs/nsight-remote-profiling.md) for more details on CUDA debugging and profiling workflows.

### 📤 Structured Output

All commands support `--output` / `-o` for machine-readable output — useful for scripting, piping to `jq`, or LLM tool-use:

```bash
# JSON output (pipe to jq)
aws-bootstrap -o json status
aws-bootstrap -o json status | jq '.instances[0].instance_id'

# YAML output
aws-bootstrap -o yaml status

# Table output
aws-bootstrap -o table status

# Works with all commands
aws-bootstrap -o json list instance-types | jq '.[].instance_type'
aws-bootstrap -o json launch --dry-run
aws-bootstrap -o json terminate --yes
aws-bootstrap -o json cleanup --dry-run
```

Supported formats: `text` (default, human-readable with color), `json`, `yaml`, `table`. Commands that prompt for confirmation (`terminate`, `cleanup`) require `--yes` in structured output modes.
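
For scripting from Python rather than `jq`, the JSON output can be consumed directly. This is a minimal sketch: the only schema detail it relies on is the `instances[].instance_id` path shown in the `jq` example above, and both function names are hypothetical.

```python
import json
import subprocess


def instance_ids(status_json: str) -> list[str]:
    """Extract instance IDs from `aws-bootstrap -o json status` output."""
    payload = json.loads(status_json)
    # Only `instances[].instance_id` is taken from the jq example;
    # any other field would be an assumption about the schema.
    return [inst["instance_id"] for inst in payload.get("instances", [])]


def fetch_status_json() -> str:
    """Run the CLI and return its raw JSON output (requires AWS credentials)."""
    result = subprocess.run(
        ["aws-bootstrap", "-o", "json", "status"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `instance_ids(fetch_status_json())` would then yield the IDs of the instances the CLI reports.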

### 📋 Listing Resources

```bash
# List all g4dn instance types (default)
aws-bootstrap list instance-types

# List a different instance family
aws-bootstrap list instance-types --prefix p3

# List Deep Learning AMIs (default filter) — each AMI is labelled with its region
aws-bootstrap list amis

# List AMIs with a custom filter
aws-bootstrap list amis --filter "ubuntu/images/hvm-ssd-gp3/ubuntu-noble*"

# Use a specific region (the active region is shown in the output header)
aws-bootstrap list instance-types --region us-east-1
aws-bootstrap list amis --region us-east-1

# --region is repeatable (-r for short) to compare across regions
aws-bootstrap list amis -r us-east-1 -r us-west-2
aws-bootstrap list instance-types --prefix g5 -r us-east-1 -r eu-west-1
```

`list instance-types` shows a **Quota Family** column (`gvt`/`p`/`dl`): the AWS
vCPU quota family each type draws from. A family groups multiple prefixes (e.g. all
G/VT types, including `g5`, share `gvt`), so the suggested `--family` may not
look like your `--prefix`. The output ends with copy-paste **Next steps**
for that family (a `quota show` and a `quota request` command pinned to the
queried region), so you can go straight from "is this type available?" to
checking and raising your vCPU quota.
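
The prefix-to-family grouping can be approximated with a prefix check. This is a rough sketch, assuming the family follows from the leading letters of the instance type; `quota_family` is a hypothetical helper and the CLI may derive the column differently.

```python
def quota_family(instance_type: str) -> str:
    """Guess the vCPU quota family (gvt, p, or dl) for an instance type."""
    prefix = instance_type.split(".")[0].lower()
    if prefix.startswith("dl"):
        return "dl"
    if prefix.startswith(("g", "vt")):
        return "gvt"  # all G and VT types share one quota family
    if prefix.startswith("p"):
        return "p"
    raise ValueError(f"no GPU quota family known for {instance_type!r}")
```

For example, `quota_family("g5.xlarge")` and `quota_family("g4dn.xlarge")` both return `"gvt"`, which is why the suggested `--family` can differ from the `--prefix` you queried.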

### 🖥️ Managing Instances

```bash
# Show all aws-bootstrap instances across every enabled region (including shutting-down).
# Each instance is labelled with its region.
aws-bootstrap status

# Include GPU info (CUDA toolkit + driver version, GPU name, architecture) via SSH
aws-bootstrap status --gpu

# Hide connection commands (shown by default for each running instance)
aws-bootstrap status --no-instructions

# Restrict the query to one region
aws-bootstrap status --region us-east-1

# Restrict to several regions (--region is repeatable, -r for short)
aws-bootstrap status --region us-east-1 --region us-west-2
aws-bootstrap status -r us-east-1 -r eu-west-1

# Terminate all aws-bootstrap instances (with confirmation prompt)
aws-bootstrap terminate

# Terminate but preserve EBS data volumes for reuse
aws-bootstrap terminate --keep-ebs

# Terminate by SSH alias (resolved via ~/.ssh/config)
aws-bootstrap terminate aws-gpu1

# Terminate by instance ID
aws-bootstrap terminate i-abc123

# Mix aliases and instance IDs
aws-bootstrap terminate aws-gpu1 i-def456

# Skip confirmation prompt
aws-bootstrap terminate --yes

# Remove stale SSH config entries for terminated instances
aws-bootstrap cleanup

# Preview what would be removed without modifying config
aws-bootstrap cleanup --dry-run

# Also find and delete orphan EBS data volumes
aws-bootstrap cleanup --include-ebs

# Preview orphan volumes without deleting
aws-bootstrap cleanup --include-ebs --dry-run

# Skip confirmation prompt
aws-bootstrap cleanup --yes
```

`status --gpu` reports both the **installed CUDA toolkit** version (from `nvcc`) and the **maximum CUDA version supported by the driver** (from `nvidia-smi`), so you can see at a glance whether they match:

```
CUDA: 12.8 (driver supports up to 13.0)
```
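
If you want to check that line in a script, compare the two versions numerically rather than as strings. This is a small sketch that assumes the exact line format shown above; `toolkit_within_driver` is a hypothetical helper, and it applies the general rule that the toolkit version should not exceed the driver's maximum.

```python
import re


def toolkit_within_driver(status_line: str) -> bool:
    """True when the toolkit version does not exceed the driver's maximum."""
    match = re.fullmatch(
        r"CUDA: (\d+)\.(\d+) \(driver supports up to (\d+)\.(\d+)\)",
        status_line.strip(),
    )
    if match is None:
        raise ValueError(f"unexpected status line: {status_line!r}")
    t_major, t_minor, d_major, d_minor = map(int, match.groups())
    # Tuple comparison avoids the string pitfall where "12.10" sorts before "12.8".
    return (t_major, t_minor) <= (d_major, d_minor)
```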

SSH aliases are managed automatically — they're created on `launch`, shown in `status`, and cleaned up on `terminate`. Aliases use sequential numbering (`aws-gpu1`, `aws-gpu2`, etc.) and never reuse numbers from previous instances. You can use aliases anywhere you'd use an instance ID, e.g. `aws-bootstrap terminate aws-gpu1`.

## EBS Data Volumes

Attach persistent EBS storage to keep datasets and model checkpoints across instance lifecycles. Volumes are mounted at `/data` and persist independently of the instance.

```bash
# Create a new 96 GB gp3 volume, formatted and mounted at /data
aws-bootstrap launch --ebs-storage 96

# After terminating with --keep-ebs, reattach the same volume to a new instance
aws-bootstrap terminate --keep-ebs
# Output: Preserving EBS volume: vol-0abc123...
#         Reattach with: aws-bootstrap launch --ebs-volume-id vol-0abc123...

aws-bootstrap launch --ebs-volume-id vol-0abc123def456
```

Key behaviors:
- `--ebs-storage` and `--ebs-volume-id` are mutually exclusive
- New volumes are formatted as ext4; existing volumes are mounted as-is
- Volumes are tagged for automatic discovery by `status` and `terminate`
- `terminate` deletes data volumes by default; use `--keep-ebs` to preserve them
- **Orphan cleanup** — use `aws-bootstrap cleanup --include-ebs` to find and delete orphan volumes (e.g. from spot interruptions or forgotten `--keep-ebs` volumes). Use `--dry-run` to preview
- **Spot-safe** — data volumes survive spot interruptions. If AWS reclaims your instance, the volume detaches automatically and can be reattached to a new instance with `--ebs-volume-id`
- EBS volumes must be in the same availability zone as the instance
- Mount failures are non-fatal — the instance remains usable

## EC2 vCPU Quotas

AWS accounts have [service quotas](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-resource-limits.html) that limit how many vCPUs you can run per instance family. New or lightly used accounts often have a **default quota of 0 vCPUs** for GPU instance families (G and VT), which causes errors on launch:

- **Spot**: `MaxSpotInstanceCountExceeded`
- **On-Demand**: `VcpuLimitExceeded`

Check your current quotas (a g4dn.xlarge has 4 vCPUs, so the quota must be at least 4):

```bash
# Built-in: show all GPU family quotas
aws-bootstrap quota show

# Show only G/VT family quotas
aws-bootstrap quota show --family gvt

# Show P family quotas (P2 through P6)
aws-bootstrap quota show --family p

# The active region is shown in the output header. --region is repeatable
# (-r) to compare quotas across regions:
aws-bootstrap quota show --family gvt -r us-east-1 -r us-west-2

# Or use the AWS CLI directly:
aws service-quotas get-service-quota \
  --service-code ec2 \
  --quota-code L-3819A6DF \
  --region us-west-2
```

Request increases:

```bash
# `aws-bootstrap quota show` prints a ready-to-run `quota request` command
# with a --desired-value above your current quota and pinned to --region.
# The desired value must EXCEED the current quota (AWS rejects <= current),
# so pick a value accordingly (8 shown as an example):
aws-bootstrap quota show --family gvt --region us-west-2
aws-bootstrap quota request --type spot --desired-value 8 --region us-west-2

# Request a P family spot quota increase
aws-bootstrap quota request --family p --type spot --desired-value 192 --region us-west-2

# --region is repeatable: submit the same increase in several regions at once.
# All target regions are validated up front — if any region's current quota is
# already >= the desired value, nothing is submitted.
aws-bootstrap quota request --type spot --desired-value 8 -r us-east-1 -r us-west-2

# Check request status (also repeatable across regions)
aws-bootstrap quota history --region us-west-2
aws-bootstrap quota history -r us-east-1 -r us-west-2

# Or use the AWS CLI directly:
aws service-quotas request-service-quota-increase \
  --service-code ec2 \
  --quota-code L-3819A6DF \
  --desired-value 8 \
  --region us-west-2
```
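
The all-or-nothing validation for multi-region requests can be sketched as follows. This is illustrative only: the function names are hypothetical, and the `current` mapping of region to quota value stands in for whatever the CLI fetches from Service Quotas.

```python
def regions_blocking_request(current: dict[str, float], desired: float) -> list[str]:
    """Return regions whose current quota is already >= the desired value."""
    return [region for region, value in current.items() if value >= desired]


def plan_requests(current: dict[str, float], desired: float) -> list[str]:
    """All-or-nothing: return regions to submit to, or raise if any is blocked."""
    blocked = regions_blocking_request(current, desired)
    if blocked:
        # Validate every region up front so no partial submission happens.
        raise ValueError(
            f"quota already >= {desired} in: {', '.join(sorted(blocked))}"
        )
    return list(current)
```

Checking every region before submitting anything avoids a state where some regions have a pending request and others were rejected.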

Quota codes may vary by region or account type. To list the actual codes in your region:

```bash
# List all G/VT-related quotas
aws service-quotas list-service-quotas \
  --service-code ec2 \
  --region us-west-2 \
  --query "Quotas[?contains(QuotaName, 'G and VT')].[QuotaCode,QuotaName,Value]" \
  --output table
```

Common quota codes:

| Family | Type | Code | Description |
|--------|------|------|-------------|
| G/VT | Spot | `L-3819A6DF` | All G and VT Spot Instance Requests |
| G/VT | On-Demand | `L-DB2E81BA` | Running On-Demand G and VT instances |
| P | Spot | `L-7212CCBC` | All P Spot Instance Requests |
| P | On-Demand | `L-417A185B` | Running On-Demand P instances |
| DL | Spot | `L-85EED4F7` | All DL Spot Instance Requests |
| DL | On-Demand | `L-6E869C2A` | Running On-Demand DL instances |
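
For scripts that call the AWS CLI directly, the table above translates into a lookup. This is a sketch with hypothetical names; the `"on-demand"` key spelling is an assumption (only `spot` appears as a `--type` value in the examples above), and the codes are copied from the table.

```python
QUOTA_CODES = {
    ("gvt", "spot"): "L-3819A6DF",
    ("gvt", "on-demand"): "L-DB2E81BA",
    ("p", "spot"): "L-7212CCBC",
    ("p", "on-demand"): "L-417A185B",
    ("dl", "spot"): "L-85EED4F7",
    ("dl", "on-demand"): "L-6E869C2A",
}


def get_quota_argv(family: str, kind: str, region: str) -> list[str]:
    """Build the `aws service-quotas get-service-quota` command line."""
    code = QUOTA_CODES[(family.lower(), kind.lower())]
    return [
        "aws", "service-quotas", "get-service-quota",
        "--service-code", "ec2",
        "--quota-code", code,
        "--region", region,
    ]
```

The returned list can be passed to `subprocess.run` as-is, which sidesteps shell quoting.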

Small increases (4 to 8 vCPUs) are typically auto-approved within minutes. You can also request increases via the [Service Quotas console](https://console.aws.amazon.com/servicequotas/home). While waiting, you can test the full launch/poll/SSH flow with a non-GPU instance type:

```bash
aws-bootstrap launch --instance-type t3.medium --ami-filter "ubuntu/images/hvm-ssd-gp3/ubuntu-noble-24.04-amd64-server-*"
```

## Claude Code Plugin

A [Claude Code](https://docs.anthropic.com/en/docs/claude-code) plugin is included in the [`aws-bootstrap-skill/`](aws-bootstrap-skill/) directory, enabling LLM coding agents to autonomously provision and manage GPU instances.

### Install from GitHub

```bash
# Add the marketplace (registers this repo as a plugin source)
/plugin marketplace add promptromp/aws-bootstrap-g4dn

# Install the plugin
/plugin install aws-bootstrap-skill@promptromp-aws-bootstrap-g4dn
```

### Install locally (from repo checkout)

```bash
claude --plugin-dir ./aws-bootstrap-skill
```

See [`aws-bootstrap-skill/README.md`](aws-bootstrap-skill/README.md) for details.

## Additional Resources

| Topic | Link |
|-------|------|
| GPU instance pricing | [instances.vantage.sh](https://instances.vantage.sh/aws/ec2/g4dn.xlarge) |
| Spot instance quotas | [AWS docs](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-limits.html) |
| Deep Learning AMIs | [AWS docs](https://docs.aws.amazon.com/dlami/latest/devguide/what-is-dlami.html) |
| Nsight remote GPU profiling | [Guide](docs/nsight-remote-profiling.md) — Nsight Compute, Nsight Systems, and Nsight VSCE on EC2 |

Tutorials on setting up a CUDA environment on EC2 GPU instances:

- [Provision an EC2 GPU Host on AWS](https://www.dolthub.com/blog/2025-03-12-provision-an-ec2-gpu-host-on-aws/) (DoltHub, 2025)
- [AWS EC2 Setup for GPU/CUDA Programming](https://techfortalk.co.uk/2025/10/11/aws-ec2-setup-for-gpu-cuda-programming/) (TechForTalk, 2025)
