Metadata-Version: 2.4
Name: poolin
Version: 0.1.0
Summary: A local UI package for pooling existing embedding zips into grouped vectors
Author: Wenxi Wang
License-Expression: MIT
Project-URL: Homepage, https://example.com/poolin
Project-URL: Repository, https://example.com/poolin
Keywords: pooling,embedding,rag,vectors,streamlit,ui
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: streamlit>=1.32
Requires-Dist: numpy>=1.24
Dynamic: license-file

# poolin

A local Streamlit UI for pooling the chunk embeddings in an existing embedding zip into one higher-level vector per group.

## Important note

In sentence-transformer-style models, pooling usually happens **inside the embedding model**, where token embeddings are combined into a single sentence embedding. This package instead performs **post-embedding vector pooling**: it combines chunk embeddings that have already been created.
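
To make the distinction concrete, here is a minimal sketch of post-embedding pooling using plain NumPy. The vectors are made-up illustrative values, not output from any real model, and the code does not call into `poolin` itself.

```python
import numpy as np

# Three hypothetical chunk embeddings (4 dimensions each) that belong
# to the same source document. A real embedding step would produce these.
chunk_vectors = np.array([
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 1.0, 2.0, 0.0],
    [2.0, 2.0, 2.0, 0.0],
])

# Post-embedding mean pooling: average across the chunk axis so that
# several chunk vectors collapse into one document-level vector.
pooled = chunk_vectors.mean(axis=0)
print(pooled)  # [1. 1. 2. 0.]
```

No model is involved at this stage: the input is already a set of finished vectors, and pooling is simple arithmetic over them.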

## What it does

- launches with the `poolin` command
- reads an embedding zip such as `RAG_chunks_recursive_chunks_embeddings.zip`
- auto-groups related chunk embeddings by filename pattern, for example `RAG_chunk_001_rcs_001.md -> RAG_chunk_001`
- pools vectors with one of these methods:
  - `auto`
  - `mean`
  - `max`
  - `weighted_char_mean`
  - `weighted_word_mean`
  - `mean_sqrt_len`
- exports a zip with:
  - `pooling_summary.json`
  - `pooling_manifest.csv`
  - `*_pooled_embeddings.jsonl` (optional)
  - `*_pooled_embeddings.csv` (optional)
  - `*_pooled_embeddings.npz` (optional)
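
The grouping and pooling steps listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's actual implementation: the regular expression, the `group_key` and `pool` helpers, and the exact weighting used for the length-based methods (text length as the weight, and its square root for `mean_sqrt_len`) are all guesses based on the method names. The `auto` method is not covered because its selection rule is not described here.

```python
import re
from collections import defaultdict

import numpy as np

# Assumed pattern: a trailing "_rcs_<digits>" plus file extension marks a sub-chunk.
GROUP_PATTERN = re.compile(r"^(?P<group>.+)_rcs_\d+\.\w+$")


def group_key(filename):
    """Return the group name for a chunk filename, or its stem if no match."""
    match = GROUP_PATTERN.match(filename)
    if match:
        return match.group("group")
    return filename.rsplit(".", 1)[0]


def pool(vectors, method="mean", lengths=None):
    """Pool a (n_chunks, dim) array into one vector (hypothetical helper)."""
    vectors = np.asarray(vectors, dtype=float)
    if method == "mean":
        return vectors.mean(axis=0)
    if method == "max":
        return vectors.max(axis=0)
    if method in ("weighted_char_mean", "weighted_word_mean", "mean_sqrt_len"):
        if lengths is None:
            raise ValueError(f"{method} needs one length per chunk")
        # Assumption: lengths are character or word counts, used as weights.
        weights = np.asarray(lengths, dtype=float)
        if method == "mean_sqrt_len":
            weights = np.sqrt(weights)  # assumed: dampen the effect of long chunks
        return np.average(vectors, axis=0, weights=weights)
    raise ValueError(f"unknown method: {method}")


# Made-up records: (filename, embedding, character count of the chunk text).
records = [
    ("RAG_chunk_001_rcs_001.md", [1.0, 0.0], 10),
    ("RAG_chunk_001_rcs_002.md", [3.0, 4.0], 30),
    ("RAG_chunk_002_rcs_001.md", [5.0, 5.0], 20),
]

groups = defaultdict(lambda: {"vectors": [], "lengths": []})
for filename, vector, n_chars in records:
    key = group_key(filename)
    groups[key]["vectors"].append(vector)
    groups[key]["lengths"].append(n_chars)

pooled = {
    key: pool(data["vectors"], "weighted_char_mean", data["lengths"])
    for key, data in groups.items()
}
print(pooled["RAG_chunk_001"])  # [2.5 3. ]
```

With length weighting, the 30-character chunk pulls the pooled vector toward `[3.0, 4.0]`, whereas a plain `mean` would give `[2.0, 2.0]`.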

## Install

```bash
pip install poolin
```

## Run

```bash
poolin
```

## Suggested input

Use a zip produced by your embedding step, containing an embeddings `.npz` or `.jsonl` payload plus the summary file.
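
Before loading a zip into the UI, it can help to check what it actually contains. The sketch below lists every `.npz` member of a zip and the shapes of the arrays inside. The `list_npz_arrays` helper, the member names, and the `embeddings` array key are hypothetical; the layout your embedding step produces may differ.

```python
import io
import zipfile

import numpy as np


def list_npz_arrays(zip_path_or_file):
    """Map each .npz member of a zip to {array_name: shape}."""
    shapes = {}
    with zipfile.ZipFile(zip_path_or_file) as archive:
        for name in archive.namelist():
            if name.endswith(".npz"):
                # Read the member into memory so NumPy can seek within it.
                with np.load(io.BytesIO(archive.read(name))) as payload:
                    shapes[name] = {key: payload[key].shape for key in payload.files}
    return shapes


# Build a tiny in-memory zip so the example runs without any real input file.
npz_bytes = io.BytesIO()
np.savez(npz_bytes, embeddings=np.zeros((3, 4)))  # hypothetical array key

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("embeddings.npz", npz_bytes.getvalue())
    archive.writestr("summary.json", "{}")

buffer.seek(0)
shapes = list_npz_arrays(buffer)
print(shapes)  # {'embeddings.npz': {'embeddings': (3, 4)}}
```

For a real file, pass its path instead of the in-memory buffer, for example `list_npz_arrays("RAG_chunks_recursive_chunks_embeddings.zip")`.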

## Ownership note

The package metadata and copyright notice are set to Wenxi Wang. You should still verify PyPI package-name availability, trademark questions, and any legal or patent issues yourself before publishing.
