Metadata-Version: 2.4
Name: quill-sort
Version: 5.0.0
Summary: Adaptive ultra-sort: the fastest general-purpose sorting library for Python
Author: Isaiah Tucker
License: MIT
Project-URL: Homepage, https://github.com/invariant-games/quill
Project-URL: Repository, https://github.com/invariant-games/quill
Project-URL: Changelog, https://github.com/invariant-games/quill/blob/main/CHANGELOG.md
Keywords: sorting,sort,algorithm,radix sort,fast sort,high performance,numpy
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: fast
Requires-Dist: numpy>=1.21; extra == "fast"
Requires-Dist: psutil>=5.9; extra == "fast"
Provides-Extra: pandas
Requires-Dist: numpy>=1.21; extra == "pandas"
Requires-Dist: pandas>=1.3; extra == "pandas"
Provides-Extra: all
Requires-Dist: numpy>=1.21; extra == "all"
Requires-Dist: pandas>=1.3; extra == "all"
Requires-Dist: psutil>=5.9; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-timeout>=2.1; extra == "dev"
Requires-Dist: hypothesis>=6.0; extra == "dev"
Dynamic: license-file

# quill-sort

**Adaptive Ultra-Sort** — the fastest general-purpose sorting library for Python.

```python
from quill import quill_sort, quill_sorted

quill_sort([3, 1, 4, 1, 5, 9])               # → [1, 1, 3, 4, 5, 9]
quill_sort(records, key=lambda r: r['age'])   # sort objects
quill_sort(big_data, parallel=True)           # use all CPU cores
result = quill_sorted(iterable, reverse=True) # mirrors built-in sorted()
```

## Installation

```bash
pip install quill-sort              # core (no dependencies)
pip install quill-sort[fast]        # + numpy + psutil for max speed
pip install quill-sort[all]         # + numpy + pandas + psutil
```

## How it works

Quill profiles your data at intake and routes to the optimal algorithm automatically:

| Data type | Strategy | Complexity |
|-----------|----------|------------|
| Dense integer range | `np.bincount` counting sort | O(n + k) |
| uint8 / uint16 | Radix sort (`kind='stable'`, 1-2 passes) | O(n) |
| int32 | Stable radix (default) or introsort (`stable=False`) | O(n log n) |
| int64 / float64 | Stable radix (default) or heapsort (`stable=False`) | O(n log n) |
| Strings / bytes | Python Timsort | O(n log n) |
| Objects with key | Rank-encode → numpy argsort | O(n log n) |
| Pre-sorted | Early exit after O(sample) probe | O(sample) |
| Nearly sorted | Timsort bypass (asc_ratio > 0.90) | O(n) |
| Reverse-sorted | Single `.reverse()` call | O(n/2) |
| Parallel (4+ cores) | MSD radix, shared-memory scatter | O(n) |
| External (> RAM) | Radix-partition + parallel bucket sort | disk-bound |

## API

```python
quill_sort(
    data,                      # list, generator, range, ndarray, Series, DataFrame
    key=None,                  # sort key function (like sorted())
    reverse=False,             # descending order
    inplace=True,              # mutate data in-place (False = return new list)
    parallel=False,            # use all CPU cores
    stable=True,               # True = matches sorted() exactly; False = max speed
    high_performance_mode=False, # skip interactive prompts for external sort
    silent=False,              # suppress status output
    stats=False,               # return (sorted_list, stats_dict)
)
```

### `stable` parameter (new in v5)

By default, `stable=True` guarantees that equal elements maintain their original relative order — identical to Python's `sorted()`. Set `stable=False` for maximum speed on numeric data (uses benchmark-optimal sort algorithms that may reorder equal elements).

```python
# Stable (default) — matches sorted() exactly
quill_sort(data)                    # safe for all use cases

# Unstable — faster for large numeric datasets
quill_sort(data, stable=False)      # ~2-20x faster for int32/int64
```

### `analyze()` — inspect your data's profile

```python
from quill import analyze

info = analyze([3, 1, 4, 1, 5, 9])
# {'n': 6, 'dtype': 'int_pos', 'presorted': False, 'dense': True, ...}
```

### `stats` — timing information

```python
result, stats = quill_sort(data, stats=True)
# stats = {'time_ms': 12.3, 'n': 1000000}
```

## Stability guarantee

`quill_sort` with `stable=True` (the default) is a **stable sort**: equal elements preserve their original relative order. This matches the contract of Python's built-in `sorted()` and `list.sort()`.

## Special value handling

**None values** are sorted to the end of the list (or the beginning with `reverse=True`):

```python
quill_sort([3, None, 1, None, 2])
# → [1, 2, 3, None, None]
```

**NaN values** in float data are placed at the end (or beginning with `reverse=True`):

```python
quill_sort([3.0, float('nan'), 1.0, 2.0])
# → [1.0, 2.0, 3.0, nan]
```

**Mixed int/float** lists are automatically promoted to float64:

```python
quill_sort([1, 2.5, 3, 4.0])
# → [1, 2.5, 3, 4.0]  (fast float path, not slow object fallback)
```

## v5 — What changed

**Sort stability fix (breaking):** v4 used quicksort for int32 and heapsort for int64 — both unstable. v5 defaults to `stable=True` which uses `kind='stable'` for all dtypes, matching `sorted()` exactly. Use `stable=False` for the old fast-but-unstable behavior.

**New features:**
- `stable` parameter for explicit stability control
- `stats` parameter for timing info
- `analyze()` API for data profiling
- Float NaN handling (stripped, sorted, reinserted)
- Mixed int/float promotion to float64
- Nearly-sorted Timsort bypass (O(n) for ~sorted data)
- Parallel float sort via shared-memory numpy
- Cache-aware counting sort (L2-sized range → always counting sort)
- Adaptive parallel threshold (scales with core count)
- Thread-safe plugin registry and process pool
- 138-test suite with hypothesis property-based testing

## v4 — Previous improvements

**Benchmark-proven optimal sort algorithm per dtype:**

| dtype | v3 (wrong) | v4 (correct) | Sort-phase speedup |
|-------|-----------|--------------|-------------------|
| uint8 | `stable` | `stable` | — (already optimal) |
| uint16 | `stable` | `stable` | — (already optimal) |
| int32 | `stable` | `default` (introsort) | **17-20×** |
| int64 | `stable` | `heapsort` | **8-15×** |
| float64 | `stable` | `heapsort` | **8-10×** |

**Other v4 improvements:**
- Heavy-key detection in parallel MSD radix (handles Zipf/skewed distributions up to 2.3× faster)
- Parallel MSD radix sort via shared memory — zero-copy IPC, no pickling of large arrays
- All bucket sorts in external engine now use correct `kind` per dtype
- Batch sorts in streaming k-way merge corrected

## Supported types

- `int`, `float`, `str`, `bytes` — native fast paths
- Mixed `int` + `float` — automatically promoted to float64
- Negative integers — automatically shifted to non-negative before sorting
- `None` values — filtered out, sorted, reinserted at end
- `float('nan')` — filtered out, reinserted at end
- `pandas.Series` — sorted and returned as a new Series
- `pandas.DataFrame` — sorted by column(s) via `key='column_name'`
- `numpy.ndarray` — sorted in-place via numpy directly
- Any generator or iterator — materialised to list, sorted, returned

## Plugin system

```python
from quill import register_plugin
from quill._plugins import QuillPlugin

class MyPlugin(QuillPlugin):
    handles = (MyCustomClass,)
    name    = "my_custom_class"

    @staticmethod
    def prepare(data, key, reverse):
        items = [x.value for x in data]
        postprocess = lambda sorted_list: [MyCustomClass(v) for v in sorted_list]
        return items, key, postprocess

register_plugin(MyPlugin)
quill_sort(list_of_my_objects)
```

## CLI / Demo

```bash
python -m quill       # run the benchmark demo
quill                 # same, if installed via pip
```

## Performance

Benchmarked on Windows 11, 28-core CPU, numpy installed:

| n | Time | Notes |
|---|------|-------|
| 1,000 | ~0.0001s | Python sort / timsort |
| 100,000 | ~0.002s | numpy stable sort |
| 1,000,000 | ~0.010s | numpy stable sort |
| 10,000,000 | ~0.08s | numpy stable sort |
| 100,000,000 | ~0.8s | numpy stable sort |
| 1,000,000,000 | **~39s** | external: radix-partition + 28-core parallel bucket sort |

The 1B result uses Quill's external sort engine: radix-partitions into 256 buckets on disk, sorts all buckets in parallel across available CPU cores, then concatenates — no merge pass needed.

Note: list↔numpy conversion overhead dominates for medium sizes. For maximum throughput on large datasets, pass numpy arrays directly (the `NumpyArrayPlugin` handles this automatically).

## Performance tuning

- **Install numpy**: `pip install quill-sort[fast]` — 10-100× faster for numeric data.
- **`stable=False`**: 2-20× faster for int32/int64 when you don't need stability.
- **`parallel=True`**: Force parallel for datasets under the auto-threshold.
- **Install psutil**: Accurate RAM sensing prevents unnecessary external sort triggers.
- **`high_performance_mode=True`**: Skip interactive prompts in production.

## Troubleshooting

- **"quill_sort is slower than expected"** — Install numpy: `pip install quill-sort[fast]`.
- **"External sort triggered unexpectedly"** — Install psutil for accurate RAM sensing. Without it, Quill assumes only 2GB available.
- **"Parallel sort not engaging"** — Auto-parallel requires 4+ cores and large datasets. Use `parallel=True` to force it.
- **"Equal elements reordered"** — Ensure `stable=True` (the default). If using `stable=False`, equal elements may reorder.

## Development

```bash
git clone https://github.com/invariant-games/quill.git
cd quill
pip install -e ".[all,dev]"
pytest                          # run all tests
pytest -m slow                  # parallel/large data tests
python -m quill                 # run benchmark demo
```

## Requirements

- Python 3.8+
- `numpy` optional but strongly recommended — `pip install numpy`
- `psutil` optional, for accurate RAM sensing — `pip install psutil`
- `pandas` optional — `pip install pandas`

## License

MIT — by Isaiah Tucker
