Metadata-Version: 2.4
Name: valkey-flash-sizer
Version: 0.1.0
Summary: Read-only analyzer that samples a Valkey keyspace and projects the RAM savings from tiering cold entries to NVMe via valkey-flash.
Project-URL: Homepage, https://github.com/mbocevski/valkey-flash-sizer
Project-URL: Repository, https://github.com/mbocevski/valkey-flash-sizer
Project-URL: Issues, https://github.com/mbocevski/valkey-flash-sizer/issues
Project-URL: Sibling project (module), https://github.com/mbocevski/valkey-flash
Author: valkey-flash contributors
License: BSD 3-Clause License
        
        Copyright (c) 2024-present, Valkey contributors
        All rights reserved.
        
        Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
        
            * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
            * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
            * Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
        
        THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
License-File: LICENSE
Keywords: analyzer,nvme,redis,sizing,tiering,valkey
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: POSIX
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Database
Classifier: Topic :: System :: Monitoring
Requires-Python: >=3.10
Requires-Dist: click<9,>=8.0
Requires-Dist: valkey<7,>=6.1
Description-Content-Type: text/markdown

# valkey-flash-sizer

Read-only analyzer that samples a Valkey keyspace and estimates the RAM savings you'd get from tiering cold entries to NVMe via [`valkey-flash`](https://github.com/mbocevski/valkey-flash).

```fish
uvx valkey-flash-sizer valkey://my-host:6379
```

The tool never mutates state — it only issues `SCAN`, `MEMORY USAGE`, `OBJECT IDLETIME`, `TYPE`, and `TTL`. Output is a Markdown report (or JSON with `--format json`) with honest confidence intervals on every projected number.

## Install

```fish
# Ephemeral (recommended — no install)
uvx valkey-flash-sizer valkey://host:6379

# Persistent
pipx install valkey-flash-sizer
flash-sizer valkey://host:6379

# Main branch (bleeding edge)
uvx --from git+https://github.com/mbocevski/valkey-flash-sizer flash-sizer valkey://host:6379
```

## Usage

```
flash-sizer [OPTIONS] URL

  URL                          Valkey URL (valkey://, valkeys://, redis://, rediss://, unix://)

  --sample N                   Keys to sample (default 100000)
  --cold-threshold DURATION    Idle cutoff for "cold" (default 30m; accepts 30s/15m/2h/1d)
  --hot-cache-ratio F          Assumed hot-cache fraction on the flash tier (default 0.05)
  --confidence LEVEL           Wilson CI level (0.80 / 0.90 / 0.95 / 0.99, default 0.95)
  --format markdown|json       Report format (default markdown)
  --output FILE                Write report to file instead of stdout
  --tls                        Use TLS (shortcut for valkeys:// scheme)
  --username U --password P    AUTH credentials
  --timeout SECONDS            Per-command socket timeout (default 10)
  --pipeline-size N            Probes per round-trip (default 200)
  -v, --verbose                Debug logging to stderr
```

Cluster mode is auto-detected — point the tool at any primary and the sizer will walk every shard via `SCAN`.

## What it does

1. Samples up to N keys from the target Valkey (default 100 000), walking every shard in cluster mode.
2. Probes each sampled key for byte size, idle time, TTL, and type.
3. Aggregates into an idle-time histogram, per-type size percentiles, and a top-N large-key list.
4. Projects the RAM saving you'd get by tiering everything idler than `--cold-threshold` (default 30 min) to `valkey-flash`, with a 95 % Wilson-score confidence interval derived from the sample size.
5. Surfaces the caveats honestly — sampling variance, `OBJECT IDLETIME` approximation under LRU, non-tierable byte subtraction.

See [`tests/golden/report.md`](tests/golden/report.md) for a full sample report rendered from a fixture.

## What it does NOT do

- **No writes.** No `SET`, `DEL`, `CONFIG SET`, `MONITOR`. No telemetry upload. No phone-home.
- **No auto-install** of the `valkey-flash` module. We recommend, we don't install — the sizer runs *before* the user has made a decision.
- **No synthetic cold-read benchmark.** A `--bench` path would have to write test keys and demote them to measure real cold-read p99 on the user's NVMe; that contradicts the read-only guarantee this tool leads with. If that matters, track it separately — likely a sibling `flash-bencher` tool rather than a flag on this one.
- **No recommendation engine.** We report what's cold; we don't tell you which specific keys to migrate first.

## Honesty about the numbers

Every projected number carries a 95 % Wilson-score confidence interval, and the report's "Known biases" section names the assumptions baked into each derivation:

- **SCAN sampling** visits every shard in cluster mode but does not guarantee strict stratification. On heavily-skewed keyspaces the sample may over-represent the largest shards.
- **`OBJECT IDLETIME`** is approximate under `allkeys-lru` / `volatile-lru` (updated lazily during eviction) and unavailable under LFU (the projection declines cleanly rather than fabricating zeros).
- **Size-neutrality assumption:** cold-bytes and RAM-saving bounds scale the fraction CI by observed tierable bytes, which assumes cold keys have the same average byte size as hot keys. Real workloads often skew — cold keys tend to be larger.
- **Workload drift:** a sample at 02:00 looks nothing like 14:00. Run it during peak and off-peak, compare the reports.
- **Non-tierable bytes** (sets, streams, pub-sub buffers, replication backlog) are counted in totals but excluded from the migration projection — `valkey-flash` has no tiered equivalent for them today.

## Development

```fish
git clone https://github.com/mbocevski/valkey-flash-sizer
cd valkey-flash-sizer
uv sync
uv run pytest                 # unit tests
uv run ruff check             # lint
uv run ruff format --check    # formatting
uv build                      # sdist + wheel in dist/
```

Integration tests (against a real Valkey) run when `VALKEY_URL` is set:

```fish
valkey-server --port 16390 --daemonize no --save "" &
VALKEY_URL=valkey://127.0.0.1:16390 uv run pytest -m integration
```

## License

BSD 3-Clause. See [`LICENSE`](LICENSE).

## Related

- [`valkey-flash`](https://github.com/mbocevski/valkey-flash) — the Valkey module this tool sizes for.
