Metadata-Version: 2.4
Name: imgslim
Version: 0.1.0
Summary: Reduce vision API costs by 60-80% with algorithmic image optimization
Author: BigFoot3
License: MIT
Project-URL: Homepage, https://github.com/BigFoot3/visionlite
Keywords: llm,vision,image,compression,api,cost
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: Pillow
Requires-Dist: opencv-python-headless
Requires-Dist: piexif
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: tabulate; extra == "dev"

# imgslim

Reduce your vision API costs by 60–80% before the image ever leaves your machine.

No AI. No external calls. Pure algorithmic compression tuned for how LLMs actually tokenize images.

```bash
pip install imgslim
```

```python
from imgslim import VisionLite

result = VisionLite(model="claude").optimize("screenshot.png")
# {'output_path': '/tmp/imgslim_output/screenshot_optimized.png',
#  'savings_pct': 74.3,
#  'strategy_used': ['strip_exif', 'resize_to_tile_limit', 'convert_to_grayscale']}
```

---

## Why this exists

Vision APIs charge by image tokens, and the token count grows with image dimensions. GPT-4o and
Gemini slice images into tiles before processing, and Claude's count scales with pixel area, so
the larger the image, the higher the bill.

Most images sent to LLMs are **massively over-sized** for what the model actually needs to
understand them. From the model's perspective, a 3000×2000 photo usually carries the same
semantic information as a 1092×728 one.

imgslim uses the documented size thresholds of each model to resize images to the optimal
resolution. It also strips EXIF metadata, removes whitespace margins, and converts to grayscale
when color adds nothing.
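
To make the cost argument concrete for one model: Anthropic's documentation gives an approximate token count of width × height / 750 for images small enough not to be resized server-side. The sketch below applies that approximation. `estimate_claude_tokens` is a hypothetical helper, not part of imgslim, and other providers count tokens differently.

```python
def estimate_claude_tokens(width: int, height: int) -> int:
    """Approximate image tokens for Claude as (width * height) / 750.

    Assumption: the image is not resized server-side, which only holds
    for images already within the provider's size limits.
    """
    return round(width * height / 750)


# The 1092x728 target mentioned above
print(estimate_claude_tokens(1092, 728))  # 1060
```

A 3000×2000 input is larger than the range where this approximation applies, so it is not estimated here.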

---

## Benchmark

Tested on 5 synthetic content types × 3 models. All transformations are algorithmic — no AI used.

| Model | Sample | Input size | Output size | Size saving | Strategies |
|-------|--------|----------|-----------|----------|------------|
| claude | diagram | 95 KB | 38 KB | **59.4%** | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| claude | mixed | 678 KB | 225 KB | **66.8%** | strip_exif, resize_to_tile_limit, convert_to_grayscale, crop_whitespace |
| claude | photo | 6877 KB | 226 KB | **96.7%** | strip_exif, resize_to_tile_limit |
| claude | screenshot | 98 KB | 21 KB | **78.4%** | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| claude | text_doc | 934 KB | 161 KB | **82.8%** | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| gemini | diagram | 95 KB | 81 KB | **14.9%** | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| gemini | mixed | 678 KB | 300 KB | **55.8%** | strip_exif, resize_to_tile_limit, convert_to_grayscale, crop_whitespace |
| gemini | photo | 6877 KB | 563 KB | **91.8%** | strip_exif, resize_to_tile_limit |
| gemini | screenshot | 98 KB | 35 KB | **64.1%** | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| gemini | text_doc | 934 KB | 306 KB | **67.3%** | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| gpt-4o | diagram | 95 KB | 36 KB | **61.9%** | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| gpt-4o | mixed | 678 KB | 194 KB | **71.4%** | strip_exif, resize_to_tile_limit, convert_to_grayscale, crop_whitespace |
| gpt-4o | photo | 6877 KB | 190 KB | **97.2%** | strip_exif, resize_to_tile_limit |
| gpt-4o | screenshot | 98 KB | 18 KB | **81.8%** | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| gpt-4o | text_doc | 934 KB | 138 KB | **85.2%** | strip_exif, resize_to_tile_limit, convert_to_grayscale |

> **Average file-size saving: 71.7% across all content types and models.**

> Benchmark script included: `python benchmark/run_benchmark.py`
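
The arithmetic behind the table can be checked by hand. The sketch below assumes the saving is computed as `1 - output / input` on file size, which matches the rows above. `saving_pct` is a hypothetical helper, and the KB values in the table are rounded, so recomputing from them can differ from the printed percentages by a few tenths.

```python
def saving_pct(input_kb: float, output_kb: float) -> float:
    """Percentage of the input file size removed by optimization."""
    return (1 - output_kb / input_kb) * 100


# Saving column from the table above, in row order
savings = [
    59.4, 66.8, 96.7, 78.4, 82.8,  # claude
    14.9, 55.8, 91.8, 64.1, 67.3,  # gemini
    61.9, 71.4, 97.2, 81.8, 85.2,  # gpt-4o
]

average = sum(savings) / len(savings)
print(round(average, 1))                # 71.7
print(round(saving_pct(6877, 226), 1))  # 96.7 (claude / photo row)
```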

---

## How it works

### 1. Content detection (no AI)
OpenCV heuristics classify the image as `photo`, `text`, or `diagram`
based on edge density, contour count, and line detection.
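
A minimal sketch of what such a heuristic could look like is shown below. It is not imgslim's actual detector: the function names, feature definitions, and every threshold are illustrative assumptions. Only the OpenCV calls (`Canny`, `findContours`, `HoughLinesP`) are real, and the two-value return of `findContours` assumes OpenCV 4.

```python
def extract_features(path: str) -> dict:
    """Compute edge density, contour count, and line count with OpenCV."""
    # Imported here so classify() below stays usable without OpenCV installed.
    import cv2
    import numpy as np

    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(f"could not read image: {path}")

    edges = cv2.Canny(gray, 50, 150)
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                            minLineLength=60, maxLineGap=5)
    return {
        "edge_density": float(np.count_nonzero(edges)) / edges.size,
        "contour_count": len(contours),
        "line_count": 0 if lines is None else len(lines),
    }


def classify(features: dict) -> str:
    """Map features to 'photo', 'text', or 'diagram' (illustrative thresholds)."""
    if features["line_count"] >= 20 and features["contour_count"] < 500:
        return "diagram"  # many long straight lines, few separate shapes
    if features["contour_count"] >= 500 and features["edge_density"] < 0.15:
        return "text"     # many small glyph-like contours on a clean background
    return "photo"        # everything else: dense or irregular edges
```

Keeping the decision rule separate from feature extraction makes the thresholds easy to tune and test without image fixtures.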

### 2. Strategy selection
```
All images   → strip_exif + resize_to_tile_limit
text/diagram → + convert_to_grayscale
text         → + crop_whitespace
```
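
The rules above translate directly into a small lookup. This is a sketch of the mapping as written, not imgslim's source; `select_strategies` is a hypothetical name.

```python
def select_strategies(content_type: str) -> list[str]:
    """Return the ordered strategy list for a detected content type."""
    strategies = ["strip_exif", "resize_to_tile_limit"]  # applied to all images
    if content_type in ("text", "diagram"):
        strategies.append("convert_to_grayscale")
    if content_type == "text":
        strategies.append("crop_whitespace")
    return strategies


print(select_strategies("photo"))
# ['strip_exif', 'resize_to_tile_limit']
print(select_strategies("text"))
# ['strip_exif', 'resize_to_tile_limit', 'convert_to_grayscale', 'crop_whitespace']
```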

### 3. Model-aware resize
Each model has documented tile thresholds. imgslim resizes to just below the
optimal boundary. It never upscales and always preserves aspect ratio.

```python
MODEL_TILE_LIMITS = {
    "claude": {"max_size": 1568, "optimal_long_edge": 1092},
    "gpt-4o": {"max_size": 2048, "optimal_long_edge": 1024},
    "gemini": {"max_size": 3072, "optimal_long_edge": 1536},
}
```
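
One way to turn these limits into a target size is sketched below. It assumes the long edge is scaled down to `optimal_long_edge` and the short edge follows proportionally. `target_size` is a hypothetical helper, and imgslim's own rounding may differ.

```python
MODEL_TILE_LIMITS = {
    "claude": {"max_size": 1568, "optimal_long_edge": 1092},
    "gpt-4o": {"max_size": 2048, "optimal_long_edge": 1024},
    "gemini": {"max_size": 3072, "optimal_long_edge": 1536},
}


def target_size(width: int, height: int, model: str = "claude") -> tuple[int, int]:
    """Scale (width, height) so the long edge fits the model's optimal size."""
    limit = MODEL_TILE_LIMITS[model]["optimal_long_edge"]
    long_edge = max(width, height)
    if long_edge <= limit:
        return width, height  # already small enough: never upscale
    scale = limit / long_edge  # same factor on both axes keeps the aspect ratio
    return max(1, round(width * scale)), max(1, round(height * scale))


print(target_size(3000, 2000, "claude"))  # (1092, 728)
print(target_size(800, 600, "gpt-4o"))    # (800, 600)
```

The resulting tuple can be passed to Pillow's `Image.resize` to perform the actual downscale.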

---

## Usage

```python
from imgslim import VisionLite

# Default: optimized for Claude
v = VisionLite(model="claude")
result = v.optimize("invoice.jpg")

print(result["savings_pct"])      # 81.2
print(result["strategy_used"])    # ['strip_exif', 'resize_to_tile_limit', 'convert_to_grayscale', 'crop_whitespace']
print(result["output_path"])      # /tmp/imgslim_output/invoice_optimized.jpg

# Then pass output_path to your API call as usual
```
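
Most vision APIs accept images as base64 strings. The sketch below reads the optimized file and wraps it in a dict shaped like the base64 image block used by Anthropic's Messages API. Other providers expect different shapes, and `image_block` is a hypothetical helper, not part of imgslim.

```python
import base64
import mimetypes
from pathlib import Path


def image_block(path: str) -> dict:
    """Read an image file and return it as a base64 image content block."""
    media_type, _ = mimetypes.guess_type(path)
    if media_type is None:
        raise ValueError(f"cannot infer media type from: {path}")
    data = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }


# Usage, assuming `result` comes from VisionLite(...).optimize(...):
# block = image_block(result["output_path"])
```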

**Supported models:** `claude`, `gpt-4o`, `gemini`

---

## Install

```bash
pip install imgslim
```

**Dependencies:** Pillow, opencv-python-headless, piexif

Python 3.10+

---

## What imgslim does NOT do

- No quality assessment (does not verify the model still "understands" the image)
- No batch processing yet
- No async support yet
- Does not tune JPEG quality dynamically

These are on the roadmap. PRs welcome.
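
Until batch support lands, a plain loop over the documented `optimize` call covers the common case. The sketch below is a workaround under stated assumptions, not a built-in feature: `optimize_many` and the `input_path` / `error` keys are hypothetical, and the optimizer is passed in as a callable so the loop itself does not depend on imgslim.

```python
from typing import Callable, Iterable


def optimize_many(paths: Iterable[str], optimize: Callable[[str], dict]) -> list[dict]:
    """Run `optimize` on each path, recording failures instead of stopping."""
    results = []
    for path in paths:
        try:
            results.append(optimize(path))
        except Exception as exc:  # broad on purpose: one bad file should not abort the batch
            results.append({"input_path": path, "error": str(exc)})
    return results


# Usage with the documented API (requires imgslim to be installed):
# from imgslim import VisionLite
# results = optimize_many(["a.png", "b.jpg"], VisionLite(model="claude").optimize)
```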

---

## License

MIT
