Metadata-Version: 2.4
Name: mathformer
Version: 1.0.2
Summary: A transformer-based math library
Author-email: JeremySu0818 <xinghong.su0818@gmail.com>
Project-URL: Homepage, https://github.com/JeremySu0818/MathFormer
Project-URL: Bug Tracker, https://github.com/JeremySu0818/MathFormer/issues
Project-URL: Repository, https://github.com/JeremySu0818/MathFormer
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch>=2.0.0
Requires-Dist: transformers>=4.30.0
Requires-Dist: safetensors>=0.3.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Dynamic: license-file

# MathFormer

MathFormer is a Python library that uses Transformer-based language models to perform arithmetic. Instead of computing results directly, it uses Llama-architecture models to "predict" the result of each operation token by token, demonstrating that small language models can learn arithmetic rules.

It supports basic arithmetic operations: **Addition**, **Subtraction**, **Multiplication**, and **Division**.

## Features

- **Transformer-Powered Arithmetic**: Uses specialized Llama-based models for each arithmetic operation.
- **Large Number Support**: Implements recursive logic to handle multi-digit arithmetic using digit-by-digit prediction (similar to manual calculation).
- **Unified API**: Easy-to-use functions for `add`, `sub`, `mul`, and `div`.
- **Resource Management**: Supports lazy loading of models to save memory, and manual unloading.
- **Custom Tokenizer**: Built-in minimalist tokenizer optimized for mathematical expressions.

## Installation

You can install MathFormer via pip:

```bash
pip install mathformer
```

## Quick Start

The simplest way to use MathFormer is through the top-level convenience functions. These functions automatically handle model loading when needed.

```python
import mathformer

# Addition
result = mathformer.add(123, 456)
print(f"123 + 456 = {result}")  # Output: 579

# Subtraction
result = mathformer.sub(1000, 250)
print(f"1000 - 250 = {result}") # Output: 750

# Multiplication
result = mathformer.mul(12, 12)
print(f"12 * 12 = {result}")    # Output: 144

# Division (returns Quotient and Remainder if applicable)
result = mathformer.div(100, 3)
print(f"100 / 3 = {result}")    # Output: Q33R1
```
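If you need the quotient and remainder as integers, the result can be split apart. The following is a minimal sketch that assumes the division result prints as `Q<quotient>R<remainder>` (as in the `Q33R1` example above) and that the `R` part is omitted when there is no remainder. `parse_div_result` is a hypothetical helper, not part of MathFormer, and the exact return format is not documented here.

```python
import re
from typing import Tuple

# Assumed format, inferred from the "Q33R1" example; not a documented contract.
_DIV_PATTERN = re.compile(r"^Q?(-?\d+)(?:R(-?\d+))?$")


def parse_div_result(result) -> Tuple[int, int]:
    """Split a 'Q<quotient>R<remainder>' value into (quotient, remainder)."""
    match = _DIV_PATTERN.match(str(result).strip())
    if match is None:
        raise ValueError(f"Unrecognized division result: {result!r}")
    quotient = int(match.group(1))
    # Treat a missing remainder part as zero.
    remainder = int(match.group(2)) if match.group(2) is not None else 0
    return quotient, remainder


print(parse_div_result("Q33R1"))  # (33, 1)
```

The helper converts its input with `str()` first, so it should also accept a plain integer result if the library returns one for exact divisions.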

You can also pass string expressions:

```python
print(mathformer.add("100 + 200"))
print(mathformer.calculate("mul", 50, 4))
```

## Advanced Usage

For more control over resource usage, you can use the `MathFormerAPI` class directly.

### Managing Resources (Load/Unload)

By default, models are lazy-loaded (loaded only when first requested). You can manually load all models at startup or unload them to free GPU/CPU memory.

```python
from mathformer import MathFormerAPI

# Initialize the API. lazy_load=True (shown here) defers loading until first use;
# pass lazy_load=False to load all models immediately.
api = MathFormerAPI(lazy_load=True)

# Perform operations
print(api.add(50, 50))

# Unload all models to free memory
api.unload_all()
```
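To illustrate the lazy-loading idea in general terms, here is a minimal sketch of the pattern. `LazyRegistry` and its loader callables are hypothetical names used only for illustration; this is not MathFormer's actual implementation, and the "models" are placeholder strings.

```python
from typing import Callable, Dict, List


class LazyRegistry:
    """Hypothetical registry that loads each resource on first access."""

    def __init__(self, loaders: Dict[str, Callable[[], object]], lazy_load: bool = True):
        self._loaders = loaders
        self._loaded: Dict[str, object] = {}
        if not lazy_load:
            # Eager mode: load everything up front.
            for name in loaders:
                self.get(name)

    def get(self, name: str) -> object:
        # Load on first request, then reuse the cached object.
        if name not in self._loaded:
            self._loaded[name] = self._loaders[name]()
        return self._loaded[name]

    def loaded_names(self) -> List[str]:
        return sorted(self._loaded)

    def unload_all(self) -> None:
        # Drop references so the objects can be garbage-collected.
        self._loaded.clear()


# Placeholder loaders standing in for real model loading.
registry = LazyRegistry({"add": lambda: "add-model", "mul": lambda: "mul-model"})
print(registry.loaded_names())  # []
registry.get("add")
print(registry.loaded_names())  # ['add']
registry.unload_all()
print(registry.loaded_names())  # []
```

The trade-off is the usual one: lazy loading keeps startup fast and memory low, at the cost of a delay on the first call to each operation.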

### Context Manager

You can use `MathFormerAPI` as a context manager to ensure models are cleaned up after use:

```python
from mathformer import MathFormerAPI

with MathFormerAPI() as api:
    print(api.mul(99, 9))
# Models are automatically unloaded here
```
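The main benefit of the `with` form is that cleanup runs even if the block raises. The sketch below shows this with a hypothetical stand-in class, `ManagedModels`; it only illustrates the context-manager protocol and makes no claim about how `MathFormerAPI` implements it.

```python
class ManagedModels:
    """Hypothetical stand-in used to show context-manager cleanup."""

    def __init__(self) -> None:
        self.loaded = False

    def __enter__(self) -> "ManagedModels":
        self.loaded = True
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Runs on normal exit and on exceptions alike.
        self.loaded = False
        return False  # do not swallow the exception


models = ManagedModels()
try:
    with models:
        raise RuntimeError("failure inside the block")
except RuntimeError:
    pass

print(models.loaded)  # False
```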

## How It Works

MathFormer does not simply call Python's `+` or `-` operators. Each result is predicted by a neural network.

1.  **Single-Step Prediction**: For single-digit operations (e.g., `5+7`), it queries a Transformer model specialized for that operation.
2.  **Multi-Digit Logic**: For larger numbers (e.g., `123+456`), the library implements the standard grade-school algorithms (carrying, borrowing, partial products) but delegates the fundamental single-digit arithmetic steps to the Transformer model.
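
The two steps above can be sketched for addition as follows. This is a minimal illustration, not MathFormer's code: `long_add` and `reference_digit_add` are hypothetical names, and `reference_digit_add` uses ordinary integer arithmetic as a stand-in for the model's single-digit prediction.

```python
from typing import Callable, Tuple

DigitAdder = Callable[[int, int, int], Tuple[int, int]]


def reference_digit_add(a: int, b: int, carry: int) -> Tuple[int, int]:
    """Stand-in for the model: returns (digit, carry_out) for a + b + carry."""
    total = a + b + carry
    return total % 10, total // 10


def long_add(x: int, y: int, digit_add: DigitAdder = reference_digit_add) -> int:
    """Add two non-negative integers column by column, right to left."""
    xs, ys = str(x)[::-1], str(y)[::-1]  # least significant digit first
    digits = []
    carry = 0
    for i in range(max(len(xs), len(ys))):
        a = int(xs[i]) if i < len(xs) else 0
        b = int(ys[i]) if i < len(ys) else 0
        # Only this single-digit step would be delegated to a model.
        digit, carry = digit_add(a, b, carry)
        digits.append(str(digit))
    if carry:
        digits.append(str(carry))
    return int("".join(reversed(digits)))


print(long_add(123, 456))  # 579
print(long_add(999, 1))    # 1000
```

Because the column loop and carry handling are deterministic, the model only ever needs to get a small, fixed set of single-digit cases right, regardless of how long the operands are.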

## Requirements

- Python >= 3.8
- PyTorch >= 2.0.0
- Transformers >= 4.30.0
- safetensors >= 0.3.0

## License

This project is licensed under the MIT License.
