Metadata-Version: 2.4
Name: misakanet-core
Version: 2.7.0
Summary: The zero-dependency core protocol engine for MisakaNet Swarm Knowledge Network.
Author-email: Ikalus1988 <sheldonisspark@gmail.com>
Project-URL: Homepage, https://github.com/Ikalus1988/MisakaNet
Project-URL: Bug Tracker, https://github.com/Ikalus1988/MisakaNet/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown

# misakanet-core

**Zero-dependency BM25 search engine with RRF result merging**, extracted from [MisakaNet](https://github.com/Ikalus1988/MisakaNet).

- Pure Python, stdlib only
- BM25 ranking with configurable `k1` (term-frequency saturation) and `b` (document-length normalization)
- Metadata-weighted scoring
- RRF (Reciprocal Rank Fusion) for multi-query fusion
- CJK-aware tokenization
- Works in air-gapped environments
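
To illustrate what `k1` and `b` control, here is a minimal standalone sketch of textbook BM25 scoring using only the standard library. It is not the code shipped in `misakanet-core`: the function name `bm25_scores`, the defaults `k1=1.5` and `b=0.75`, and the smoothed IDF variant are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, corpus, k1=1.5, b=0.75):
    """Score each tokenized document in `corpus` against `query_tokens`.

    Hypothetical helper for illustration; not part of misakanet-core.
    """
    if not corpus:
        return []

    n_docs = len(corpus)
    avg_len = sum(len(doc) for doc in corpus) / n_docs
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for doc in corpus for term in set(doc))

    scores = []
    for doc in corpus:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_tokens:
            tf = term_freq[term]
            if tf == 0:
                continue
            # Smoothed IDF; the "1 +" keeps the value non-negative.
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # k1 limits how much repeated terms add; b scales the
            # penalty for documents longer than the corpus average.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "cats and dogs are friends".split(),
]
scores = bm25_scores(["cat", "dog"], corpus)
for doc_tokens, score in zip(corpus, scores):
    print(f"{score:.4f}  {' '.join(doc_tokens)}")
```

Setting `b=0` disables length normalization entirely, while a larger `k1` lets repeated occurrences of a term keep adding to the score for longer before saturating.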

## Installation

```bash
pip install misakanet-core
```

## Usage

```python
from misakanet_core import BM25, ScoredDocument, tokenize, rrf

# Prepare corpus
docs = [
    ScoredDocument("doc1", tokenize("the cat sat on the mat")),
    ScoredDocument("doc2", tokenize("the dog sat on the log")),
    ScoredDocument("doc3", tokenize("cats and dogs are friends")),
]

# Build index and search
engine = BM25(docs)
results = engine.search("cat dog", top_k=5)

for result in results:
    print(f"{result.doc_id}: {result.score:.4f}")

# Multi-query fusion with RRF: merge the rankings of two separate queries.
# `rrf` is already imported at the top of this example.
cat_results = engine.search("cat")
dog_results = engine.search("dog")
fused = rrf([cat_results, dog_results], top_k=3)
```
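
For readers unfamiliar with Reciprocal Rank Fusion, the idea behind the `rrf` call above can be sketched in a few lines. This is a generic illustration, not the library's implementation: the function name `rrf_fuse`, its signature, and the constant `k=60` (a commonly used default for RRF) are assumptions, and the sketch works on plain lists of document IDs rather than the library's result objects.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_k=None):
    """Fuse several ranked lists of document IDs into one ranking.

    Hypothetical helper for illustration; not part of misakanet-core.
    Each document earns 1 / (k + rank) from every list it appears in.
    """
    fused = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    ordered = sorted(fused.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:top_k]


rankings = [
    ["doc1", "doc3", "doc2"],  # ranking returned for the first query
    ["doc2", "doc1"],          # ranking returned for the second query
]
fused_ranking = rrf_fuse(rankings, top_k=3)
for doc_id, score in fused_ranking:
    print(f"{doc_id}: {score:.5f}")
```

Because only ranks are used, the fusion does not depend on the scores from different queries being on the same scale. In this example `doc1` comes first because it ranks near the top of both lists.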

## Why not use Elasticsearch, tantivy, or Whoosh?

| Feature | misakanet-core | Elasticsearch | tantivy | Whoosh |
|---|---|---|---|---|
| Dependencies | **Zero** | JVM | Rust toolchain | Pure Python |
| Install time (approx.) | 0.5 s | 5+ min | 2+ min | 2 s |
| Air-gapped | ✅ | ❌ | ❌ | ✅ |
| CJK support | ✅ | ✅ | ⚠️ | ⚠️ |

## License

MIT
