Metadata-Version: 2.4
Name: quant_clone
Version: 0.1.0
Summary: Generate a llama-quantize command to copy the quantization parameters of any GGUF
Author: electroglyph
License-Expression: BSD-2-Clause
Project-URL: Homepage, https://github.com/electroglyph/quant_clone
Project-URL: Repository, https://github.com/electroglyph/quant_clone
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: gguf
Dynamic: license-file

## quant_clone

This is a simple little script to help you generate a llama-quantize (from llama.cpp) command which will allow you to quantize your own GGUF the same way your target GGUF has been quantized.

## Installation

`git clone https://github.com/electroglyph/quant_clone.git`

`cd quant_clone`

`pip install .`

## Usage

`quant_clone input.gguf output.txt`

input.gguf is the GGUF file whose quantization parameters you would like to copy

output.txt parameter is optional, if it's omitted the output will be saved to cmd.txt

## Example

if I take one of unsloth's dynamic 2.0 quants and run:

`quant_clone gemma-3-1b-it-UD-IQ1_S.gguf`

I get this output:

`llama-quantize --imatrix <imatrix_unsloth.dat> --tensor-type token_embd.weight=Q5_1 --tensor-type blk.0.attn_k.weight=IQ4_NL --tensor-type blk.0.attn_output.weight=IQ2_XXS --tensor-type blk.0.attn_q.weight=IQ4_NL --tensor-type blk.0.attn_v.weight=Q5_0 --tensor-type blk.0.ffn_down.weight=IQ3_S --tensor-type blk.0.ffn_gate.weight=IQ4_NL --tensor-type blk.0.ffn_up.weight=IQ4_NL --tensor-type blk.1.attn_k.weight=IQ4_NL --tensor-type blk.1.attn_output.weight=IQ2_XXS --tensor-type blk.1.attn_q.weight=IQ4_NL --tensor-type blk.1.attn_v.weight=Q5_0 --tensor-type blk.1.ffn_down.weight=Q2_K --tensor-type blk.1.ffn_gate.weight=IQ4_NL --tensor-type blk.1.ffn_up.weight=IQ4_NL --tensor-type blk.2.attn_k.weight=IQ4_NL --tensor-type blk.2.attn_output.weight=IQ2_XXS --tensor-type blk.2.attn_q.weight=IQ4_NL --tensor-type blk.2.attn_v.weight=Q5_0 --tensor-type blk.2.ffn_down.weight=IQ3_S --tensor-type blk.2.ffn_gate.weight=IQ4_NL --tensor-type blk.2.ffn_up.weight=IQ4_NL --tensor-type blk.3.attn_k.weight=IQ4_NL --tensor-type blk.3.attn_output.weight=IQ2_XXS --tensor-type blk.3.attn_q.weight=IQ4_NL --tensor-type blk.3.attn_v.weight=Q5_0 --tensor-type blk.3.ffn_down.weight=IQ3_S --tensor-type blk.3.ffn_gate.weight=IQ4_NL --tensor-type blk.3.ffn_up.weight=IQ4_NL --tensor-type blk.4.attn_k.weight=IQ4_NL --tensor-type blk.4.attn_output.weight=IQ2_XXS --tensor-type blk.4.attn_q.weight=IQ4_NL --tensor-type blk.4.attn_v.weight=Q5_0 --tensor-type blk.4.ffn_down.weight=IQ3_S --tensor-type blk.4.ffn_gate.weight=IQ4_NL --tensor-type blk.4.ffn_up.weight=IQ4_NL --tensor-type blk.5.attn_k.weight=IQ4_NL --tensor-type blk.5.attn_output.weight=IQ2_XXS --tensor-type blk.5.attn_q.weight=IQ4_NL --tensor-type blk.5.attn_v.weight=Q5_0 --tensor-type blk.5.ffn_down.weight=IQ1_S --tensor-type blk.5.ffn_gate.weight=IQ4_NL --tensor-type blk.5.ffn_up.weight=IQ4_NL --tensor-type blk.6.attn_k.weight=IQ4_NL --tensor-type blk.6.attn_output.weight=IQ2_XXS --tensor-type blk.6.attn_q.weight=IQ4_NL --tensor-type blk.6.attn_v.weight=Q5_0 --tensor-type blk.6.ffn_down.weight=IQ1_S --tensor-type blk.6.ffn_gate.weight=IQ4_NL --tensor-type blk.6.ffn_up.weight=IQ4_NL --tensor-type blk.7.attn_k.weight=IQ4_NL --tensor-type blk.7.attn_output.weight=IQ2_XXS --tensor-type blk.7.attn_q.weight=IQ4_NL --tensor-type blk.7.attn_v.weight=Q5_0 --tensor-type blk.7.ffn_down.weight=IQ1_S --tensor-type blk.7.ffn_gate.weight=IQ4_NL --tensor-type blk.7.ffn_up.weight=IQ4_NL --tensor-type blk.8.attn_k.weight=IQ4_NL --tensor-type blk.8.attn_output.weight=IQ2_XXS --tensor-type blk.8.attn_q.weight=IQ4_NL --tensor-type blk.8.attn_v.weight=Q5_0 --tensor-type blk.8.ffn_down.weight=IQ1_S --tensor-type blk.8.ffn_gate.weight=IQ4_NL --tensor-type blk.8.ffn_up.weight=IQ4_NL --tensor-type blk.9.attn_k.weight=IQ4_NL --tensor-type blk.9.attn_output.weight=IQ2_XXS --tensor-type blk.9.attn_q.weight=IQ4_NL --tensor-type blk.9.attn_v.weight=Q5_0 --tensor-type blk.9.ffn_down.weight=IQ1_S --tensor-type blk.9.ffn_gate.weight=IQ4_NL --tensor-type blk.9.ffn_up.weight=IQ4_NL --tensor-type blk.10.attn_k.weight=IQ4_NL --tensor-type blk.10.attn_output.weight=IQ2_XXS --tensor-type blk.10.attn_q.weight=IQ4_NL --tensor-type blk.10.attn_v.weight=Q5_0 --tensor-type blk.10.ffn_down.weight=IQ1_S --tensor-type blk.10.ffn_gate.weight=IQ4_NL --tensor-type blk.10.ffn_up.weight=IQ4_NL --tensor-type blk.11.attn_k.weight=IQ4_NL --tensor-type blk.11.attn_output.weight=IQ2_XXS --tensor-type blk.11.attn_q.weight=IQ4_NL --tensor-type blk.11.attn_v.weight=Q5_0 --tensor-type blk.11.ffn_down.weight=IQ2_S --tensor-type blk.11.ffn_gate.weight=IQ4_NL --tensor-type blk.11.ffn_up.weight=IQ4_NL --tensor-type blk.12.attn_k.weight=IQ4_NL --tensor-type blk.12.attn_output.weight=IQ2_XXS --tensor-type blk.12.attn_q.weight=IQ4_NL --tensor-type blk.12.attn_v.weight=Q5_0 --tensor-type blk.12.ffn_down.weight=IQ2_S --tensor-type blk.12.ffn_gate.weight=IQ4_NL --tensor-type blk.12.ffn_up.weight=IQ4_NL --tensor-type blk.13.attn_k.weight=IQ4_NL --tensor-type blk.13.attn_output.weight=IQ2_XXS --tensor-type blk.13.attn_q.weight=IQ4_NL --tensor-type blk.13.attn_v.weight=Q5_0 --tensor-type blk.13.ffn_down.weight=IQ2_S --tensor-type blk.13.ffn_gate.weight=IQ4_NL --tensor-type blk.13.ffn_up.weight=IQ4_NL --tensor-type blk.14.attn_k.weight=IQ4_NL --tensor-type blk.14.attn_output.weight=IQ2_XXS --tensor-type blk.14.attn_q.weight=IQ4_NL --tensor-type blk.14.attn_v.weight=Q5_0 --tensor-type blk.14.ffn_down.weight=IQ2_S --tensor-type blk.14.ffn_gate.weight=IQ4_NL --tensor-type blk.14.ffn_up.weight=IQ4_NL --tensor-type blk.15.attn_k.weight=IQ4_NL --tensor-type blk.15.attn_output.weight=IQ2_XXS --tensor-type blk.15.attn_q.weight=IQ4_NL --tensor-type blk.15.attn_v.weight=Q5_0 --tensor-type blk.15.ffn_down.weight=IQ2_S --tensor-type blk.15.ffn_gate.weight=IQ4_NL --tensor-type blk.15.ffn_up.weight=IQ4_NL --tensor-type blk.16.attn_k.weight=IQ4_NL --tensor-type blk.16.attn_output.weight=IQ2_XXS --tensor-type blk.16.attn_q.weight=IQ4_NL --tensor-type blk.16.attn_v.weight=Q5_0 --tensor-type blk.16.ffn_down.weight=IQ1_S --tensor-type blk.16.ffn_gate.weight=IQ4_NL --tensor-type blk.16.ffn_up.weight=IQ4_NL --tensor-type blk.17.attn_k.weight=IQ4_NL --tensor-type blk.17.attn_output.weight=IQ2_XXS --tensor-type blk.17.attn_q.weight=IQ4_NL --tensor-type blk.17.attn_v.weight=Q5_0 --tensor-type blk.17.ffn_down.weight=IQ1_S --tensor-type blk.17.ffn_gate.weight=IQ4_NL --tensor-type blk.17.ffn_up.weight=IQ4_NL --tensor-type blk.18.attn_k.weight=IQ4_NL --tensor-type blk.18.attn_output.weight=IQ2_XXS --tensor-type blk.18.attn_q.weight=IQ4_NL --tensor-type blk.18.attn_v.weight=Q5_0 --tensor-type blk.18.ffn_down.weight=IQ1_S --tensor-type blk.18.ffn_gate.weight=IQ4_NL --tensor-type blk.18.ffn_up.weight=IQ4_NL --tensor-type blk.19.attn_k.weight=IQ4_NL --tensor-type blk.19.attn_output.weight=IQ2_XXS --tensor-type blk.19.attn_q.weight=IQ4_NL --tensor-type blk.19.attn_v.weight=Q5_0 --tensor-type blk.19.ffn_down.weight=IQ1_S --tensor-type blk.19.ffn_gate.weight=IQ4_NL --tensor-type blk.19.ffn_up.weight=IQ4_NL --tensor-type blk.20.attn_k.weight=IQ4_NL --tensor-type blk.20.attn_output.weight=IQ2_XXS --tensor-type blk.20.attn_q.weight=IQ4_NL --tensor-type blk.20.attn_v.weight=Q5_0 --tensor-type blk.20.ffn_down.weight=IQ1_S --tensor-type blk.20.ffn_gate.weight=IQ4_NL --tensor-type blk.20.ffn_up.weight=IQ4_NL --tensor-type blk.21.attn_k.weight=IQ4_NL --tensor-type blk.21.attn_output.weight=IQ2_XXS --tensor-type blk.21.attn_q.weight=IQ4_NL --tensor-type blk.21.attn_v.weight=Q5_0 --tensor-type blk.21.ffn_down.weight=IQ1_S --tensor-type blk.21.ffn_gate.weight=IQ4_NL --tensor-type blk.21.ffn_up.weight=IQ4_NL --tensor-type blk.22.attn_k.weight=IQ4_NL --tensor-type blk.22.attn_output.weight=IQ2_XXS --tensor-type blk.22.attn_q.weight=IQ4_NL --tensor-type blk.22.attn_v.weight=Q5_0 --tensor-type blk.22.ffn_down.weight=IQ1_S --tensor-type blk.22.ffn_gate.weight=IQ4_NL --tensor-type blk.22.ffn_up.weight=IQ4_NL --tensor-type blk.23.attn_k.weight=IQ4_NL --tensor-type blk.23.attn_output.weight=IQ2_XXS --tensor-type blk.23.attn_q.weight=IQ4_NL --tensor-type blk.23.attn_v.weight=Q5_0 --tensor-type blk.23.ffn_down.weight=IQ1_S --tensor-type blk.23.ffn_gate.weight=IQ4_NL --tensor-type blk.23.ffn_up.weight=IQ4_NL --tensor-type blk.24.attn_k.weight=IQ4_NL --tensor-type blk.24.attn_output.weight=IQ2_XXS --tensor-type blk.24.attn_q.weight=IQ4_NL --tensor-type blk.24.attn_v.weight=Q5_0 --tensor-type blk.24.ffn_down.weight=IQ1_S --tensor-type blk.24.ffn_gate.weight=IQ4_NL --tensor-type blk.24.ffn_up.weight=IQ4_NL --tensor-type blk.25.attn_k.weight=IQ4_NL --tensor-type blk.25.attn_output.weight=IQ2_XXS --tensor-type blk.25.attn_q.weight=IQ4_NL --tensor-type blk.25.attn_v.weight=Q5_0 --tensor-type blk.25.ffn_down.weight=IQ3_S --tensor-type blk.25.ffn_gate.weight=IQ4_NL --tensor-type blk.25.ffn_up.weight=IQ4_NL <input.gguf> <output.gguf> Q8_0`

That's the command to run to replicate the quantization. Make sure to edit imatrix path, input gguf path, and output gguf path.
