Metadata-Version: 2.3
Name: sparv-sbx-ocr-correction-viklofg-sweocr
Version: 0.5.1
Summary: A sparv plugin for computing suggested OCR improvements.
Project-URL: Homepage, https://spraakbanken.gu.se
Project-URL: Repository, https://github.com/spraakbanken/sparv-sbx-ocr-correction
Project-URL: Bug Tracker, https://github.com/spraakbanken/sparv-sbx-ocr-correction/labels/project%3Aocr-correction-viklofg-sweocr
Author-email: Språkbanken Text <sb-info@svenska.gu.se>, Kristoffer Andersson <kristoffer.andersson@gu.se>
License: MIT
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX
Classifier: Operating System :: Unix
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Topic :: Utilities
Classifier: Typing :: Typed
Requires-Python: >=3.9
Requires-Dist: parallel-corpus>=0.2.0
Requires-Dist: sparv-pipeline>=5.2.0
Requires-Dist: transformers<4.45.0,>=4.40.0
Description-Content-Type: text/markdown

# sparv-sbx-ocr-correction

[![PyPI version](https://badge.fury.io/py/sparv-sbx-ocr-correction-viklofg-sweocr.svg)](https://pypi.org/project/sparv-sbx-ocr-correction-viklofg-sweocr)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/sparv-sbx-ocr-correction-viklofg-sweocr)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/sparv-sbx-ocr-correction-viklofg-sweocr)](https://pypi.org/project/sparv-sbx-ocr-correction-viklofg-sweocr/)

[![Maturity badge - level 2](https://img.shields.io/badge/Maturity-Level%202%20--%20First%20Release-yellowgreen.svg)](https://github.com/spraakbanken/getting-started/blob/main/scorecard.md)
[![Stage](https://img.shields.io/pypi/status/sparv-sbx-ocr-correction-viklofg-sweocr)](https://pypi.org/project/sparv-sbx-ocr-correction-viklofg-sweocr)

[![CI(release)](https://github.com/spraakbanken/sparv-sbx-ocr-correction/actions/workflows/release-viklofg-sweocr.yml/badge.svg)](https://github.com/spraakbanken/sparv-sbx-ocr-correction/actions/workflows/release-viklofg-sweocr.yml)

Sparv plugin that annotates OCR-processed documents with suggested corrections.

## Install

> [!NOTE]
> You might need to prepend `export CFLAGS="-Wno-error=incompatible-pointer-types" ; export CXXFLAGS="-Wno-error=incompatible-pointer-types" ;` to the `pip install` call.


In a virtual environment:

```bash
pip install sparv-sbx-ocr-correction-viklofg-sweocr
```

or if you have `sparv` installed with `pipx`:

```bash
pipx inject sparv-pipeline sparv-sbx-ocr-correction-viklofg-sweocr
```

## Metadata

### Model

Type | HuggingFace Model | Revision
--- | --- | ---
Model | [`viklofg/swedish-ocr-correction`](https://huggingface.co/viklofg/swedish-ocr-correction) | 84b138048992271be7617ccb11056bbcb9b72262
Tokenizer | [`google/byt5-small`](https://huggingface.co/google/byt5-small) | 68377bdc18a2ffec8a0533fef03b1c513a4dd49d
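
To try the model outside Sparv at exactly these revisions, something like the following may work. It is a minimal sketch, not necessarily how the plugin itself loads the model: the constant names and the `load_pinned_pipeline` helper are hypothetical, and it assumes a `transformers` version within the range this package requires.

```python
# Names and revisions copied from the table above.
MODEL_NAME = "viklofg/swedish-ocr-correction"
MODEL_REVISION = "84b138048992271be7617ccb11056bbcb9b72262"
TOKENIZER_NAME = "google/byt5-small"
TOKENIZER_REVISION = "68377bdc18a2ffec8a0533fef03b1c513a4dd49d"


def load_pinned_pipeline():
    """Build a text2text-generation pipeline pinned to the revisions above.

    Hypothetical helper for illustration; it is not part of this plugin's API.
    """
    # Imported lazily so the constants can be inspected without loading transformers.
    from transformers import AutoTokenizer, T5ForConditionalGeneration, pipeline

    # `revision` pins the download to a specific commit on the Hugging Face Hub.
    tokenizer = AutoTokenizer.from_pretrained(
        TOKENIZER_NAME, revision=TOKENIZER_REVISION
    )
    model = T5ForConditionalGeneration.from_pretrained(
        MODEL_NAME, revision=MODEL_REVISION
    )
    return pipeline("text2text-generation", model=model, tokenizer=tokenizer)
```

Calling `load_pinned_pipeline()` downloads both artifacts on first use. The returned pipeline takes a string and returns a list of dicts, each with a `generated_text` key.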

## Supported Python versions

This library strives to support each Python version until its end-of-life, and will
bump at least the minor version when support for a Python version is dropped.

The following versions of this library support these Python versions:

- v0.4: Python 3.9
- v0.3: Python 3.8

## Changelog

This project keeps a [changelog](./CHANGELOG.md).
