Metadata-Version: 2.4
Name: jetfit
Version: 0.1.1
Summary: LLM model advisor for NVIDIA Jetson and DGX Spark unified-memory devices
Author-email: mannsub <akstjq0511@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/mannsub/jetfit
Project-URL: Repository, https://github.com/mannsub/jetfit
Project-URL: Issues, https://github.com/mannsub/jetfit/issues
Keywords: jetson,llm,nvidia,tui,quantization,dgx
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: click>=8.1
Requires-Dist: rich>=13.0
Requires-Dist: textual>=0.60
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-cov>=5.0; extra == "dev"
Dynamic: license-file

# jetfit

**LLM model advisor for NVIDIA Jetson and DGX Spark unified-memory devices.**

Detects your Jetson hardware, scores LLM models across quality, speed, and memory fit, and tells you exactly which quantization level will run well on your device.

Ships with an interactive TUI (default) and a CLI mode. Supports hardware simulation, calibration, compare view, and plan mode.

---

## Install

```bash
pip install jetfit
```

or with [uv](https://github.com/astral-sh/uv):

```bash
uv tool install jetfit   # install globally
uvx jetfit               # run without installing
```

---

## Usage

### TUI (default)

```bash
jetfit
```

Launches the interactive terminal UI. The top bar shows your detected platform, available RAM, accelerator type, and minimum JetPack version. Models are listed in a scrollable table sorted by params, with composite score, estimated tok/s, best quantization, memory %, and fit grade per row.

#### Normal mode

| Key | Action |
|-----|--------|
| `j` / `k` | Navigate models |
| `g` | Jump to top / bottom (toggle) |
| `Enter` | Open detail view |
| `p` | Open plan mode |
| `m` | Mark / unmark model for compare |
| `c` | Open compare view (marked vs selected) |
| `x` | Clear all marks |
| `v` | Enter visual select mode |
| `/` | Focus search bar |
| `r` | Cycle provider (family) filter |
| `b` | Cycle size filter |
| `f` | Cycle fit filter |
| `s` | Cycle sort column |
| `-` | Flip sort direction |
| `F` | Open advanced filter popup |
| `S` | Open hardware simulation |
| `A` | Open advanced config (tune efficiency) |
| `t` | Cycle theme |
| `h` | Open help |
| `q` | Quit |

#### Visual mode (`v`)

Select a contiguous range of models for bulk comparison.

| Key | Action |
|-----|--------|
| `j` / `k` | Extend selection |
| `m` | Mark selected model |
| `c` | Open compare view for selection |
| `v` / `Esc` | Exit visual mode |

#### Detail view (`Enter`)

Shows the full quant ladder for the selected model: size, KV cache, total memory, memory %, estimated tok/s, and fit grade for every quantization level. Navigate rows with `j`/`k`; the left panel updates to show specs for the highlighted quant.

#### Plan mode (`p`)

Estimates the hardware needed to run a given model configuration. Edit the Context, Quant, and Target TPS fields; the view shows minimum and recommended RAM, feasibility per run path, and upgrade deltas.

| Key | Action |
|-----|--------|
| `Tab` / `j` / `k` | Move between fields |
| Type | Edit current field |
| `Backspace` | Remove characters |
| `Esc` / `q` | Exit plan mode |

#### Compare view (`c`)

Side-by-side comparison of marked models. Rows are attributes (Score, tok/s, Fit, Mem%, Params, Quant, Context); columns are models. Best values are highlighted.

#### Hardware simulation (`S`)

Override the active hardware profile to preview recommendations for any supported Jetson or DGX Spark device without leaving the TUI. The system bar shows `(sim)` when active.

#### Advanced config (`A`)

Tune the efficiency factor used for tok/s estimation. Changes apply immediately and all scores are recalculated.

#### Advanced filter (`F`)

Set numeric bounds on parameter count and memory utilization %.

---

### CLI

```bash
# Detect hardware
jetfit system

# Detect hardware (JSON)
jetfit system --json

# Recommend models for current hardware
jetfit recommend

# Filter by model name
jetfit recommend --model llama

# Fix a specific quant level
jetfit recommend --quant Q4_K_M

# Show all quant levels per model
jetfit recommend --all-quants

# Override available memory
jetfit recommend --available-gb 12.0

# Target a specific hardware profile
jetfit recommend --profile jetson_agx_orin_64gb

# Minimum tok/s threshold
jetfit recommend --min-tps 5.0

# JSON output
jetfit recommend --json
```
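
The `--json` output can also be consumed from a script. The following is a minimal sketch, not part of jetfit: the helper names `recommend_argv` and `run_recommend` are hypothetical, it assumes `--json` can be combined with the other `recommend` flags shown above, and it makes no assumption about the shape of the returned JSON.

```python
import json
import subprocess
from typing import Any, List, Optional


def recommend_argv(model: Optional[str] = None,
                   quant: Optional[str] = None,
                   profile: Optional[str] = None,
                   min_tps: Optional[float] = None) -> List[str]:
    """Build a `jetfit recommend --json` command line (hypothetical helper).

    Only flags listed in the CLI section are used; combining them with
    --json is an assumption.
    """
    argv = ["jetfit", "recommend", "--json"]
    if model is not None:
        argv += ["--model", model]
    if quant is not None:
        argv += ["--quant", quant]
    if profile is not None:
        argv += ["--profile", profile]
    if min_tps is not None:
        argv += ["--min-tps", str(min_tps)]
    return argv


def run_recommend(**filters: Any) -> Any:
    """Run jetfit and parse its stdout as JSON; raises if the command fails."""
    result = subprocess.run(
        recommend_argv(**filters),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

Calling `run_recommend(model="llama", min_tps=5.0)` requires `jetfit` to be on `PATH`; inspect the parsed value before relying on any particular key.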

---

## Supported Hardware

| Device | RAM | Bandwidth | Accelerator | JetPack |
|--------|-----|-----------|-------------|---------|
| Jetson Nano | 4 GB | 25.6 GB/s | CUDA | 4.x |
| Jetson TX2 NX | 4 GB | 51.2 GB/s | CUDA | 4.x |
| Jetson TX2 4GB | 4 GB | 51.2 GB/s | CUDA | 4.x |
| Jetson TX2 | 8 GB | 59.7 GB/s | CUDA | 4.x |
| Jetson TX2i | 8 GB | 51.2 GB/s | CUDA | 4.x |
| Jetson Xavier NX 8GB | 8 GB | 59.7 GB/s | DLA+CUDA | 5.x |
| Jetson Xavier NX 16GB | 16 GB | 59.7 GB/s | DLA+CUDA | 5.x |
| Jetson AGX Xavier 16GB | 16 GB | 136.5 GB/s | DLA+CUDA | 5.x |
| Jetson AGX Xavier 32GB | 32 GB | 136.5 GB/s | DLA+CUDA | 5.x |
| Jetson AGX Xavier 64GB | 64 GB | 136.5 GB/s | DLA+CUDA | 5.x |
| Jetson AGX Xavier Industrial | 64 GB | 136.5 GB/s | DLA+CUDA | 5.x |
| Jetson Orin Nano 4GB | 4 GB | 51.2 GB/s | CUDA | 6.x |
| Jetson Orin Nano 8GB | 8 GB | 102.4 GB/s | CUDA | 6.x |
| Jetson Orin NX 8GB | 8 GB | 102.4 GB/s | DLA+CUDA | 6.x |
| Jetson Orin NX 16GB | 16 GB | 102.4 GB/s | DLA+CUDA | 6.x |
| Jetson AGX Orin 32GB | 32 GB | 204.8 GB/s | DLA+CUDA | 6.x |
| Jetson AGX Orin 64GB | 64 GB | 204.8 GB/s | DLA+CUDA | 6.x |
| Jetson AGX Orin Industrial | 64 GB | 204.8 GB/s | DLA+CUDA | 6.x |
| Jetson AGX Thor T4000 | 64 GB | 273 GB/s | FP4+CUDA | 7.x |
| Jetson AGX Thor T5000 | 128 GB | 273 GB/s | FP4+CUDA | 7.x |
| DGX Spark (GB10) | 128 GB | 273 GB/s | FP4+CUDA | — |

On macOS or non-Jetson Linux machines, jetfit runs in simulation mode: press `S` and pick any profile to preview its recommendations.
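
Whether a machine is treated as a real device or falls back to simulation depends on what the operating system exposes. The sketch below shows one way to read the device-tree model string and available memory on Linux. It is an illustration under assumptions, not jetfit's actual detection code: the function names are hypothetical, only `/proc/device-tree/model` and `/proc/meminfo` are consulted, and memory is reported in GiB.

```python
from pathlib import Path
from typing import Optional

DEVICE_TREE_MODEL = Path("/proc/device-tree/model")


def parse_device_tree_model(raw: bytes) -> str:
    """Device-tree strings are NUL-terminated; keep the text before the NUL."""
    return raw.split(b"\x00", 1)[0].decode("utf-8", errors="replace").strip()


def parse_mem_available_gib(meminfo_text: str) -> Optional[float]:
    """Return MemAvailable from /proc/meminfo text in GiB, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kib = int(line.split()[1])  # the kernel reports this value in kB (KiB)
            return kib / (1024 * 1024)
    return None


def detect_model(path: Path = DEVICE_TREE_MODEL) -> Optional[str]:
    """Return the board model string, or None when no device tree is exposed.

    A None result (for example on macOS or an x86 Linux box) is where a tool
    would fall back to a simulated profile.
    """
    try:
        return parse_device_tree_model(path.read_bytes())
    except OSError:
        return None
```

Mapping the returned string to a specific profile is left out on purpose, since the exact model strings vary between boards and software releases.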

---

## How it works

1. **Hardware detection** — Reads device-tree model and compatible strings (`/proc/device-tree/`), tegra release (`/etc/nv_tegra_release`), and available RAM via `tegrastats`, `jtop`, or `/proc/meminfo` (priority order). On non-Jetson machines, falls back to simulation mode with a selectable profile.

2. **Model database**: 67 models embedded directly in `fit.py`. Each entry has a parameter count and the model's real context length, both sourced from Hugging Face. Memory requirements are computed across a 6-level quantization ladder (Q8_0 through Q2_K) using per-quant bytes-per-parameter values that account for the per-block scale metadata stored alongside the quantized weights.

3. **KV cache accounting**: Memory estimates include an fp16 KV cache (`0.000008 × params_b × 4096` GB, where `params_b` is the parameter count in billions) and 0.5 GB of runtime overhead, so "fits" means the model will actually load at a typical 4K inference context.

4. **FP4 halving** — On devices with FP4 support (Thor, DGX Spark), effective model size is halved before all memory and speed calculations.

5. **Fit levels** — Based on `(weights + KV cache + overhead) / available_memory`:

   | Level | Utilization |
   |-------|-------------|
   | Perfect | ≤ 70% |
   | Good | > 70% to 90% |
   | Marginal | > 90% to 100% |
   | TooTight | > 100% |

6. **Speed estimation** — Token generation is memory-bandwidth-bound. Estimated tok/s:

   `(bandwidth_GB_s / effective_size_GB) × efficiency × quant_speed_multiplier`

   Default efficiency ranges from 0.50 to 0.55 depending on the profile and is tunable in the advanced config (`A`). Quant multipliers range from 1.00× (Q8_0) to 1.80× (Q2_K).

7. **Composite score** — Each model gets a 0–100 score combining normalized speed (45%), fit level (35%), and quantization quality (20%). Used for sorting and the score column.

8. **Calibration** — A per-profile efficiency factor can be saved to `~/.config/jetfit/calibration.json`. When present, the system bar shows a `✓ cal` badge and all speed estimates use the measured value instead of the profile default.
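
The memory, fit, and speed steps above can be sketched in a few lines of Python. This is an illustrative reimplementation of the formulas as stated in this README, not the code in `fit.py`. The names `estimate`, `fit_level`, and `implied_efficiency` are hypothetical, the bytes-per-parameter value is supplied by the caller, and "effective size" is assumed to mean the weight size after FP4 halving.

```python
from dataclasses import dataclass

# Constants quoted in the steps above.
KV_GB_PER_BPARAM_PER_TOKEN = 0.000008
CONTEXT_TOKENS = 4096
RUNTIME_OVERHEAD_GB = 0.5


@dataclass(frozen=True)
class Estimate:
    weights_gb: float
    total_gb: float
    utilization: float
    fit: str
    tok_s: float


def fit_level(utilization: float) -> str:
    """Map memory utilization (1.0 == 100%) to the fit levels in step 5."""
    if utilization <= 0.70:
        return "Perfect"
    if utilization <= 0.90:
        return "Good"
    if utilization <= 1.00:
        return "Marginal"
    return "TooTight"


def estimate(params_b: float, bytes_per_param: float, available_gb: float,
             bandwidth_gb_s: float, efficiency: float = 0.50,
             quant_speed_multiplier: float = 1.00, fp4: bool = False) -> Estimate:
    # params_b is in billions, so billions of params x bytes/param gives GB.
    weights_gb = params_b * bytes_per_param
    if fp4:
        weights_gb /= 2.0  # step 4: halve effective size on FP4 devices
    kv_gb = KV_GB_PER_BPARAM_PER_TOKEN * params_b * CONTEXT_TOKENS
    total_gb = weights_gb + kv_gb + RUNTIME_OVERHEAD_GB
    utilization = total_gb / available_gb
    # Step 6: bandwidth-bound decode speed (assumes effective size == weights).
    tok_s = (bandwidth_gb_s / weights_gb) * efficiency * quant_speed_multiplier
    return Estimate(weights_gb, total_gb, utilization, fit_level(utilization), tok_s)


def implied_efficiency(measured_tok_s: float, weights_gb: float,
                       bandwidth_gb_s: float,
                       quant_speed_multiplier: float = 1.00) -> float:
    """Invert the speed formula to recover an efficiency from a measurement."""
    return measured_tok_s * weights_gb / (bandwidth_gb_s * quant_speed_multiplier)


if __name__ == "__main__":
    # Assumed inputs: an 8B model at 1.0625 bytes/param (8.5 bits per weight),
    # with all 64 GB available on a 204.8 GB/s device.
    e = estimate(params_b=8.0, bytes_per_param=1.0625,
                 available_gb=64.0, bandwidth_gb_s=204.8)
    print(f"{e.total_gb:.2f} GB, {e.utilization:.0%}, {e.fit}, {e.tok_s:.1f} tok/s")
```

With these assumed inputs the total comes to about 9.26 GB (8.5 GB weights, 0.26 GB KV cache, 0.5 GB overhead), which is roughly 14% utilization and about 12 tok/s at the default 0.50 efficiency. `implied_efficiency` illustrates the idea behind calibration: a measured tok/s figure can be turned back into an efficiency factor by rearranging the same formula.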

---

## Project structure

```
jetfit/
  __init__.py      -- version
  cli.py           -- Click CLI entry point, TUI launch
  hardware.py      -- Jetson/DGX hardware detection
  profiles.py      -- Hardware profile database (22 devices)
  fit.py           -- Scoring engine, quantization ladder, model catalog
  tui.py           -- Textual TUI (app state, rendering, keyboard events)
tests/
  test_hardware.py -- Hardware detection and TUI markup regression tests
  test_fit.py      -- Scoring engine unit tests
  test_calibration.py
  test_ros2.py
pyproject.toml
LICENSE
```

---

## Dependencies

| Package | Purpose |
|---------|---------|
| `click` | CLI argument parsing |
| `rich` | CLI table and colored output |
| `textual` | Terminal UI framework |

---

## License

MIT
