crucible evidence report | run #23 vs #24 | llama.cpp d1b34251b | m4-pro-24gb

LFM2.5-1.2B-Instruct — Abliteration Delta

Q4_K_M vs Q4_K_M: same quant, same hardware, same prompt set. The goal is to show that the abliterated model became more open without losing capability.

A: LFM2.5-1.2B-Instruct [base / run #23]
B: LFM2.5-1.2B-Instruct-Uncensored [abliterated / run #24]

Openness benchmarks

Legend: complied, hedged, refused
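A minimal sketch of how the three outcome categories (complied, hedged, refused) could be tallied per run and compared between A and B. It assumes each response has already been assigned exactly one of those labels; how the report does that labeling is not stated here. The function names `outcome_shares` and `share_delta` are hypothetical, and the label lists are toy values for illustration only, not results from run #23 or #24.

```python
from collections import Counter

# The three outcome categories named in the openness legend.
CATEGORIES = ("complied", "hedged", "refused")


def outcome_shares(labels):
    """Return the fraction of responses that fall in each category."""
    counts = Counter(labels)
    unknown = set(counts) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    total = len(labels)
    if total == 0:
        raise ValueError("no labels to tally")
    return {c: counts[c] / total for c in CATEGORIES}


def share_delta(base_labels, variant_labels):
    """Return per-category share change (variant minus base).

    Both runs are assumed to use the same prompt set, so the two
    label lists must have the same length.
    """
    if len(base_labels) != len(variant_labels):
        raise ValueError("runs must cover the same prompt set")
    base = outcome_shares(base_labels)
    variant = outcome_shares(variant_labels)
    return {c: variant[c] - base[c] for c in CATEGORIES}


# Toy labels for illustration only; NOT data from the report.
toy_a = ["refused", "refused", "hedged", "complied"]
toy_b = ["complied", "complied", "hedged", "refused"]
delta = share_delta(toy_a, toy_b)
```

A positive delta for complied together with a negative delta for refused would indicate that B is more open than A on the shared prompt set; capability has to be checked separately.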

Capability benchmarks

Key numbers