Metadata-Version: 2.4
Name: hyper-gliner
Version: 0.1.3
Summary: Torch-only inference for grounded infon extraction: paragraph in -> typed, grounded, polarity-aware infons + coreference, multilingual (EN/JA/KO), on a fine-tuned mDeBERTa backbone. No transformers runtime dependency.
License: Apache-2.0
Project-URL: Homepage, https://huggingface.co/cp500/infon-extract
Keywords: information-extraction,ner,relation-extraction,coreference,infon,multilingual,deberta
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: torch>=2.3
Requires-Dist: tokenizers>=0.19
Requires-Dist: safetensors>=0.4
Requires-Dist: numpy>=1.24
Provides-Extra: hub
Requires-Dist: huggingface_hub>=0.23; extra == "hub"

# hyper-gliner

Torch-only inference for **grounded infon extraction** + **learned-sparse recall**, multilingual
(EN/JA/KO), on fine-tuned mDeBERTa + a BERT SPLADE retriever. **No `transformers`
runtime dependency** — both encoders are reimplemented and parity-proven, so the package is small
enough to ship in an AWS Lambda container image.

```python
from hyper_gliner import InfonExtractor, SpladeRetriever

ex = InfonExtractor.from_pretrained("cp500/infon-extract")        # extraction + polarity + coref
infons = ex.extract("Samsung SDI supplies battery cells to BMW.")  # -> [Infon(...)]
#  Infon(Samsung SDI —supplies→ battery cells)  polarity=affirmed

# the {"anchors","infons","stats"} envelope the downstream pipeline consumes:
doc = ex.extract_doc("...", title="...")        # drop-in for extract_infons_neural

sp = SpladeRetriever.from_pretrained("cp500/infon-extract")        # loads the sparse/ subfolder
terms = sp.encode("electric SUV battery", mode="query")            # {term_id: weight}
doc = ex.extract_doc("...", sparse=sp)          # attaches per-infon splade_terms (recall tags)
```

## What's in the box

| Capability | Model | Role |
|---|---|---|
| **Extraction** | mDeBERTa-v3 (fine-tuned) + span_rep + classifier + count | explicit grounded infons (subject→predicate→object + polarity) |
| **Polarity** | 3-way head (affirmed / negated / uncertain) | constitutive negation |
| **Coreference** | Lee-2017 antecedent head | resolve anaphora (rewrite-first) |
| **Sparse recall** | `cp500/opensearch-neural-sparse-en-jp-ko` (BERT SPLADE) | implicit term-level recall (complements extraction) |

All weights ship as **fp16** in [`cp500/infon-extract`](https://huggingface.co/cp500/infon-extract)
(the sparse model under `sparse/`).

## Dependencies

`torch`, `tokenizers`, `safetensors`, `numpy`. Optional `huggingface_hub` (only to download from
the Hub; pass a local dir otherwise — e.g. weights baked into a Lambda image).

## Design notes

- **Torch-only, parity-proven.** The mDeBERTa encoder (disentangled attention) and the BERT SPLADE
  encoder are reimplemented in pure torch and gated bit-close against the reference implementation
  (`tests/test_parity.py` 9.5e-5, `tests/test_bert_parity.py` 1e-5). The full decode is gated
  against the reference implementation (`tests/test_decode_parity.py`, exact on EN/JA/KO).
- **CJK grounding.** A subword splitter + char-offset grounding make Japanese/Korean ground at
  100% (the base span-extraction architecture whitespace-splits and fails on spaceless CJK).
- **Precision vs recall, separated.** `extract()` applies an exact-span argmax filter (one
  canonical type per span, nesting preserved). The SPLADE head supplies the multi-label / implicit
  recall channel as `splade_terms` — feeding an inverted index, a SQL `splade_terms` side-table,
  and the term↔infon bipartite incidence a Sheaf GNN expands.
- **Triple-level grounding filter (on by default).** `extract()` / `extract_doc()` run a measured
  precision filter (`decode_filter.filter_infons`): it drops the predicates an adversarial
  grounding audit found ~95–100% non-grounded (`supplies`, `partner`, `competes_with` — the
  relation head fires them from the document theme, not the connecting clause), plus self-loops and
  non-entity (quantity/currency/spec) objects. On the audited set this lifts grounded precision
  29.5%→58% and cuts hallucination 16.7%→6% while retaining 93.5% of truly-grounded infons. Pass
  `filter_triples=False` for raw, unfiltered output.
- **fp16 only.** int8-dynamic was measured to *hurt* the extraction model (grounding 14→5 spans,
  and larger on disk because the embedding table stays fp32); fp16 is the shipped artifact.

## Consumer contract

`extract_doc()` returns the same `{"anchors","infons","stats"}` envelope, with each infon dict
carrying `subject / predicate / object / polarity(int) / mass{supports,refutes,theta} / span /
confidence / relation_type / tense / anchors`, so it is a drop-in for the downstream ingest
pipeline + Memgraph writer + MCTS/IKL readers. See `tests/test_consumer_contract.py`.

## License

Apache-2.0.
