Metadata-Version: 2.4
Name: Grimmerie
Version: 0.1.1
Summary: Functions for Prototyping, QOL and Sanity checking
Author: Joe Petrecca
License-Expression: MIT
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: torch>=2.0
Requires-Dist: transformers>=4.38
Requires-Dist: adapters>=1.0
Requires-Dist: numpy>=1.23
Requires-Dist: sentencepiece

# specterize

A tiny helper for generating SPECTER2 embeddings, installed as part of the `grimmerie` package.

On first use, `specterize()` downloads the base model and its adapter from Hugging Face and caches them locally.

## Install

```bash
pip install grimmerie
```

## Usage

```python
from specterize import specterize

papers = [
    {'abstract': 'We introduce a new language representation model called BERT'},
    {
        'abstract': 'The dominant sequence transduction models are based on recurrent or convolutional neural networks'
    },
]

embeddings = specterize(papers, return_type='numpy')
print(embeddings.shape)  # (2, 768)
```
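
Once you have embeddings, a common next step is comparing two papers. The sketch below is not part of this package: `cosine_similarity` is a hypothetical helper, and it assumes the default `return_type='list'` yields one list of floats per input. Short stand-in vectors are used so the snippet runs without downloading the model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (sequences of floats)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Stand-in vectors (assumption): real embeddings would come from
# `specterize(papers)` and have 768 dimensions each.
embeddings = [
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]

similarity = cosine_similarity(embeddings[0], embeddings[1])
print(round(similarity, 3))  # 0.5
```

With `return_type='numpy'`, the same comparison can be done with array operations instead of a pure-Python loop.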

## API

```python
specterize(input_data, return_type='list', max_length=512)
```

- `input_data`: the input(s) to embed, given as a `str`, `dict`, `list`, or other iterable
- `return_type`: one of `"list"`, `"numpy"`, `"tensor"` (default `"list"`)
- `max_length`: maximum token length; longer inputs are truncated by the tokenizer (default `512`)
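
To illustrate how the accepted `input_data` shapes could be reduced to a flat list of texts before tokenization, here is a minimal sketch. It is not the package's actual implementation: `normalize_inputs` is a hypothetical name, and the `title` key and the `" [SEP] "` separator are assumptions (the usage example above only shows `abstract`).

```python
from collections.abc import Iterable, Mapping


def normalize_inputs(input_data, sep=" [SEP] "):
    """Hypothetical helper: coerce str / dict / iterable input into a list of strings."""

    def to_text(item):
        if isinstance(item, str):
            return item
        if isinstance(item, Mapping):
            # Assumption: 'title' and 'abstract' are the relevant keys;
            # missing or empty fields are skipped.
            parts = [item.get("title"), item.get("abstract")]
            return sep.join(p for p in parts if p)
        raise TypeError(f"unsupported item type: {type(item).__name__}")

    # A single string or dict is treated as a batch of one.
    if isinstance(input_data, (str, Mapping)):
        return [to_text(input_data)]
    if isinstance(input_data, Iterable):
        return [to_text(item) for item in input_data]
    raise TypeError(f"unsupported input type: {type(input_data).__name__}")


texts = normalize_inputs([
    {"title": "BERT", "abstract": "We introduce a new language representation model"},
    "A plain string is passed through unchanged",
])
print(texts)
```

Checking for `str` and `Mapping` before the generic `Iterable` branch matters, because strings and dicts are themselves iterable and would otherwise be split into characters or keys.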

## Notes

- The first call is slower because the model files are downloaded and loaded into memory.
- Subsequent calls reuse the loaded model within the same Python process.
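
The reuse behavior described above is a standard load-once pattern. The sketch below shows one way to get it with `functools.lru_cache`; it is an illustration only, not necessarily how this package does it. `get_model` is a hypothetical loader, and a plain dict stands in for the real model so the snippet runs offline.

```python
from functools import lru_cache

load_count = 0  # counts how many times the expensive load actually runs


@lru_cache(maxsize=1)
def get_model():
    """Hypothetical loader standing in for the download-and-load step."""
    global load_count
    load_count += 1
    return {"name": "stand-in model"}  # placeholder object, not a real model


first = get_model()   # triggers the load
second = get_model()  # served from the in-process cache

print(first is second, load_count)  # True 1
```

The cache lives in memory, so a new Python process pays the load cost again, although files already cached on disk do not need to be re-downloaded.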
