Metadata-Version: 2.1
Name: numbduck
Version: 0.0.1
Author: NumbDuck GitHub Repository Contributors
License: MIT License (with Citation Clause)
        
        Copyright (c) 2026 NumbDuck GitHub Repository Contributors
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        - The above copyright notice and this permission notice shall be included in all
          copies or substantial portions of the Software.
        
        - If this software or any derivative work is used, modified, or distributed,
          you must provide proper credit to the original author. This includes:
          - Mentioning the original author in documentation, README files, or any
            publication describing work based on this software.
          - Including a link to the original repository, when applicable.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Keywords: duckdb,numba,numpy
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: duckdb<1.6,>=1.3.2
Requires-Dist: numbox>=0.5.13

# NumbDuck
Adapts the DuckDB database API to the Numba JIT context.

Leverages the [bindings](https://github.com/Goykhman/numbox/tree/main/numbox/core/bindings) toolkit from the [NumbOx](https://github.com/Goykhman/numbox) project.

Inspired by the [NumbSQL](https://github.com/cpcloud/numbsql) project.

## Examples

Runnable, narrative-style examples that compare numbduck against the closest
stock DuckDB Python+Arrow approaches live in [`examples/`](examples/). Each
script is self-contained, generates its own data, and prints the measured numbers.

Highlights:

- **Throughput** ([haversine.py](examples/haversine.py)): the JIT chunk callback
  is ~400× faster than a per-row Python scalar UDF at 10K rows and ~100× faster
  than a PyArrow expression UDF at 1M rows. The win comes from making no Python
  crossings per chunk and from LLVM-fused math that allocates no intermediate arrays.

- **Latency + GIL-free** ([online_scoring.py](examples/online_scoring.py)): ~2.2×
  lower per-event latency vs pure-Python `conn.execute`, and monotonic parallel
  scaling to ~2.4× on 8 threads while the Python loop plateaus under GIL
  contention.

- **Branchy logic** ([fraud_score.py](examples/fraud_score.py)): Arrow's
  `pc.if_else` chain beats the Python scalar UDF by ~60× (Arrow is the right
  stock-DuckDB tool here). The JIT chunk callback then beats Arrow by ~16× at 10K
  rows and ~1750× at 1M rows. The gap grows because each Arrow UDF chunk crosses
  the Python boundary and allocates intermediates, while the JIT computes in registers.
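
To make the throughput comparison more concrete, here is a minimal pure-Python sketch of the kind of math the haversine example measures. It is not taken from `examples/haversine.py` and does not use the numbduck API. The names `haversine_km` and `haversine_chunk`, the Earth radius constant, and the chunk-loop signature are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, for illustration only


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees.

    This is the shape of function a per-row Python scalar UDF would call
    once per row (hypothetical; not the project's actual code).
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2.0) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2.0 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def haversine_chunk(lat1, lon1, lat2, lon2, out):
    """Fill `out` in place for a whole chunk of rows.

    A single loop over the chunk with no intermediate arrays is the shape
    a JIT compiler could fuse into one pass (an assumption about how a
    chunk callback would be structured, not the numbduck signature).
    """
    for i in range(len(out)):
        out[i] = haversine_km(lat1[i], lon1[i], lat2[i], lon2[i])


if __name__ == "__main__":
    out = [0.0, 0.0]
    haversine_chunk([0.0, 48.8566], [0.0, 2.3522],
                    [0.0, 51.5074], [180.0, -0.1278], out)
    print(out)
```

In plain Python both functions run at interpreter speed. The point of the sketch is only the structure: one call per row versus one loop per chunk that writes into a preallocated output.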
