Metadata-Version: 2.4
Name: ape-framework
Version: 2.0.0.dev1
Summary: Package for evaluating algebra problems using AI systems.
License: APE (Algebra Problems Evaluator) is a framework for building and running benchmarks that evaluate LLMs on their ability to solve algebra problems.
        Copyright (C) 2025-2026 Adrian Boguszewski, Kacper Grzybowski, Maciej Teterycz, Mateusz Kasprzak
        
        This program is free software: you can redistribute it and/or modify
        it under the terms of the GNU General Public License as published by
        the Free Software Foundation, either version 3 of the License, or any later version.
        
        This program is distributed in the hope that it will be useful,
        but WITHOUT ANY WARRANTY; without even the implied warranty of
        MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
        GNU General Public License for more details.
        
        You should have received a copy of the GNU General Public License
        along with this program. If not, see <https://www.gnu.org/licenses/>.
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: langchain
Requires-Dist: langgraph
Requires-Dist: rich
Requires-Dist: jsonpickle
Requires-Dist: docker
Requires-Dist: psycopg[binary,pool]
Requires-Dist: pglast
Requires-Dist: tiktoken
Requires-Dist: pydantic
Dynamic: license-file

# Algebra Problems Evaluator (APE)

## What is APE?
**APE is a framework that simplifies building and running mathematical benchmarks for AI systems.**  
  
It was initially conceived as a way to evaluate whether mathematical problems from the field of algebra are suitable targets for LLM reasoning research (hence the name).
However, it may be more useful to think about it the other way around: the subjects of the evaluation are LLMs, or more broadly AI systems,
and they are evaluated on their ability to solve specific algebra problems whose solutions are hard to generate but relatively easy to check automatically for correctness.  
  
APE was created as part of a Bachelor's project at the University of Warsaw.
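
The "hard to generate, easy to check" property mentioned above can be illustrated with a small sketch. This is not part of APE's API: `evaluate_polynomial`, `is_root`, and `CUBIC` are hypothetical names chosen for this example only. Finding the roots of a polynomial may require real effort, but checking a proposed root only takes one exact evaluation.

```python
from fractions import Fraction


def evaluate_polynomial(coefficients, x):
    """Evaluate a polynomial at x using Horner's method.

    Coefficients are ordered from the highest degree down to the constant term.
    """
    result = Fraction(0)
    for coefficient in coefficients:
        result = result * x + coefficient
    return result


def is_root(coefficients, candidate):
    """Return True if the candidate is an exact root of the polynomial."""
    # Fraction keeps the arithmetic exact, so no floating-point tolerance is needed.
    return evaluate_polynomial(coefficients, Fraction(candidate)) == 0


# Hypothetical example problem: x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
CUBIC = [1, -6, 11, -6]

print(is_root(CUBIC, 2))               # True
print(is_root(CUBIC, Fraction(5, 2)))  # False
```

A checker of this kind runs in time linear in the degree of the polynomial, regardless of how the candidate answer was produced.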

## User's Guide
[**TODO**]

## Development setup

Install development dependencies:

```bash
pip install -r requirements-dev.txt
```

Install git hooks:

```bash
pre-commit install
```

Run all hooks manually:

```bash
pre-commit run --all-files
```

<!-- 
    1. What does the user have to implement to have a complete benchmark?
    2. How to run a test using an APE benchmark?
    3. Where to find the documentation?
-->
