Metadata-Version: 2.4
Name: hygrep
Version: 0.0.6
Summary: Hybrid grep: fast scanning + neural reranking
Project-URL: Homepage, https://github.com/nijaru/hygrep
Project-URL: Repository, https://github.com/nijaru/hygrep
Author: nijaru
License-Expression: MIT
License-File: LICENSE
Keywords: cli,code,grep,search,semantic
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing :: Filters
Requires-Python: >=3.11
Requires-Dist: huggingface-hub>=0.20
Requires-Dist: numpy>=1.24
Requires-Dist: onnxruntime>=1.16
Requires-Dist: pathspec>=0.11
Requires-Dist: rich>=13.0
Requires-Dist: tokenizers>=0.15
Requires-Dist: tree-sitter-bash>=0.23
Requires-Dist: tree-sitter-c-sharp>=0.23
Requires-Dist: tree-sitter-c>=0.23
Requires-Dist: tree-sitter-cpp>=0.23
Requires-Dist: tree-sitter-elixir>=0.3
Requires-Dist: tree-sitter-go>=0.23
Requires-Dist: tree-sitter-java>=0.23
Requires-Dist: tree-sitter-javascript>=0.23
Requires-Dist: tree-sitter-json>=0.24
Requires-Dist: tree-sitter-kotlin>=1.0
Requires-Dist: tree-sitter-lua>=0.2
Requires-Dist: tree-sitter-php>=0.23
Requires-Dist: tree-sitter-python>=0.23
Requires-Dist: tree-sitter-ruby>=0.23
Requires-Dist: tree-sitter-rust>=0.23
Requires-Dist: tree-sitter-svelte>=1.0
Requires-Dist: tree-sitter-swift>=0.0.1
Requires-Dist: tree-sitter-toml>=0.7
Requires-Dist: tree-sitter-typescript>=0.23
Requires-Dist: tree-sitter-yaml>=0.7
Requires-Dist: tree-sitter-zig>=1.0
Requires-Dist: tree-sitter>=0.24
Requires-Dist: typer>=0.9
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff>=0.1; extra == 'dev'
Description-Content-Type: text/markdown

# hygrep (hhg)

**Hyper hybrid grep: fast scanning + neural reranking**

```bash
pip install hygrep
hhg "auth logic" ./src
```

## What it does

- **Semantic search:** "auth" finds "login", "session", "token"
- **Smart context:** Returns full functions/classes, not just lines
- **Fast:** Parallel regex recall (~20k files/sec), then neural reranking
- **Zero indexing:** Works instantly on any codebase

## Install

```bash
pip install hygrep
# or
uv tool install hygrep
# or
pipx install hygrep
```

The package installs a command named `hhg`.

The first search downloads the reranking model (~83MB), which is cached in `~/.cache/huggingface/`.

## Usage

```bash
hhg "query" [path]           # Search (default: current dir)
hhg "error handling" . -n 5  # Limit to 5 results
hhg "auth" . --fast          # Skip reranking (instant grep)
hhg "test" . -t py,js        # Filter by file type
hhg "config" . --json        # JSON output for agents/scripts
hhg info                     # Check installation status
hhg model                    # Show model info
hhg model install            # Pre-download model (for CI/offline)
hhg model clean              # Remove cached model
```

Run `hhg --help` for all options.

## Output

```
src/auth.py:42 [function] login (0.89)
src/session.py:15 [function] validate_token (0.76)
```

With `--json`:
```json
[{"file": "src/auth.py", "type": "function", "name": "login", "start_line": 42, "score": 0.89, "content": "def login(user): ..."}]
```
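For scripts that consume the `--json` output, a minimal Python sketch follows. It assumes only the field names shown in the sample above; real output may carry more fields. The helper `top_hits` and its `min_score` threshold are hypothetical names for illustration, not part of hygrep.

```python
import json

# Sample copied from the `--json` example above; real output may include more fields.
SAMPLE = (
    '[{"file": "src/auth.py", "type": "function", "name": "login", '
    '"start_line": 42, "score": 0.89, "content": "def login(user): ..."}]'
)


def top_hits(raw: str, min_score: float = 0.5) -> list[str]:
    """Return 'file:line name' strings for results scoring at least min_score."""
    results = json.loads(raw)
    return [
        f"{r['file']}:{r['start_line']} {r['name']}"
        for r in results
        if r["score"] >= min_score
    ]


print(top_hits(SAMPLE))
```

In a real script, `raw` would typically come from something like `subprocess.run(["hhg", "config", ".", "--json"], capture_output=True, text=True).stdout`, assuming only the JSON array is written to stdout.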

## Config

Optional `~/.config/hygrep/config.toml`:
```toml
n = 10
color = "always"
exclude = ["*.test.js", "tests/*"]
cache_dir = "~/.cache/hygrep"  # Custom model cache (default: shared HF cache)
```

## Supported Languages

Bash, C, C++, C#, Elixir, Go, Java, JavaScript, JSON, Kotlin, Lua, Mojo, PHP, Python, Ruby, Rust, Svelte, Swift, TOML, TypeScript, YAML, Zig

## How it works

```
Query → Parallel regex scan → Tree-sitter extraction → ONNX reranking → Results
```

| Stage | Implementation |
|-------|----------------|
| Recall | Mojo/Python parallel scanner (~20k files/sec) |
| Extract | Tree-sitter AST (functions, classes) |
| Rerank | ONNX cross-encoder (mxbai-rerank-xsmall-v1) |
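The three stages can be illustrated with a toy, standard-library-only sketch. It is not hygrep's code: Python's `ast` module stands in for Tree-sitter (so it only handles Python sources), a simple word-overlap score stands in for the ONNX cross-encoder, and the scan is sequential rather than parallel. All function names and the in-memory `FILES` data are hypothetical.

```python
import ast
import re

# Hypothetical in-memory "codebase" used only for this illustration.
FILES = {
    "auth.py": (
        "def login(user):\n"
        "    token = make_token(user)\n"
        "    return token\n"
        "\n"
        "def logout(user):\n"
        "    return None\n"
    ),
    "math_utils.py": "def add(a, b):\n    return a + b\n",
}


def recall(files, query):
    """Stage 1: keep files where any query word matches (regex scan)."""
    pattern = re.compile("|".join(re.escape(w) for w in query.split()), re.IGNORECASE)
    return {path: src for path, src in files.items() if pattern.search(src)}


def extract(path, src):
    """Stage 2: yield (path, name, start_line, source) for each function or class."""
    for node in ast.walk(ast.parse(src)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            yield path, node.name, node.lineno, ast.get_source_segment(src, node)


def score(query, text):
    """Stage 3 stand-in: fraction of query words that appear in the block."""
    words = query.lower().split()
    if not words:
        return 0.0
    return sum(w in text.lower() for w in words) / len(words)


def search(files, query, n=5):
    """Run recall, extraction, and ranking; return the top n blocks with scores."""
    blocks = [b for path, src in recall(files, query).items() for b in extract(path, src)]
    ranked = sorted(blocks, key=lambda b: score(query, b[3]), reverse=True)
    return [(p, name, line, round(score(query, text), 2)) for p, name, line, text in ranked[:n]]


for hit in search(FILES, "login token"):
    print(hit)
```

The cheap regex pass discards `math_utils.py` before any parsing happens, so only blocks from recalled files reach the more expensive ranking step.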

## Development

```bash
git clone https://github.com/nijaru/hygrep && cd hygrep
pixi install && pixi run build-ext && pixi run test
```

## License

MIT
