Metadata-Version: 2.4
Name: gomokubench
Version: 0.1.0
Summary: Benchmark frontier LLMs against a search-powered Gomoku engine.
Author: Homer Quan
License-Expression: MIT
Project-URL: Homepage, https://github.com/homerquan/GomokuBench
Project-URL: Repository, https://github.com/homerquan/GomokuBench
Keywords: gomoku,benchmark,llm,alpha-beta,minimax,cli,research
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Games/Entertainment :: Board Games
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy
Dynamic: license-file

# GomokuBench

GomokuBench is a search-vs-LLM benchmark for AI companies, model builders, and researchers who want a clean, reproducible way to test whether a general-purpose model can outplay a classic board-game engine.

As of April 26, 2026, no LLM in this benchmark has beaten the search-powered AI. If you find one that does, please share it with us.

Built for rapid model plugging, head-to-head benchmarking, and plain-language inspection of every move. Author: **Homer Quan**. GitHub: [homerquan/GomokuBench](https://github.com/homerquan/GomokuBench).

## Requirements

- Python 3.9 or newer
- `numpy`

## Install

```bash
pip install gomokubench
```

Or install from source:

```bash
pip install .
```

## Play

```bash
gomoku play
```

Optional flags:

- `--player black|white`
- `--ai-first`

The CLI always uses a `19x19` board and AI search depth `2`.

Moves use `x,y` with 1-based coordinates, for example `10,10`.
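
As a rough illustration of the move format, the sketch below parses a 1-based `x,y` string into zero-based indices and checks it against the 19x19 board. The `parse_move` helper is hypothetical and not part of the GomokuBench API; how the engine itself maps `x` and `y` onto rows and columns is not specified here.

```python
BOARD_SIZE = 19  # the CLI always uses a 19x19 board


def parse_move(text, size=BOARD_SIZE):
    """Parse a 1-based 'x,y' move into zero-based (x, y) indices.

    Hypothetical helper for illustration only.
    """
    parts = text.strip().split(",")
    if len(parts) != 2:
        raise ValueError(f"expected 'x,y', got {text!r}")
    # int() tolerates surrounding whitespace and raises ValueError on non-numeric input
    x, y = (int(part) for part in parts)
    if not (1 <= x <= size and 1 <= y <= size):
        raise ValueError(f"move {text!r} is outside the {size}x{size} board")
    return x - 1, y - 1


print(parse_move("10,10"))  # (9, 9), the center point of a 19x19 board
```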

## Benchmark

Run an LLM against the built-in alpha-beta AI:

```bash
gomoku benchmark --model nemotron-3-super -r 10
```

To watch the rounds play out in the console while benchmarking:

```bash
gomoku benchmark --model nemotron-3-super -r 10 -v
```

What this does:

- Loads the model config from `models/nemotron-3-super.json`
- Runs 10 rounds total
- Uses balanced starts: 5 rounds with the AI moving first and 5 rounds with the LLM moving first
- Always uses a `19x19` board
- Always uses AI search depth `2`
- `-v` prints each round, move, and board state in the console
- Saves the benchmark report to `benchmarks/nemotron-3-super.json`
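
The balanced-start rule can be sketched as follows. This is an assumption-laden illustration, not the benchmark's own scheduler: the `balanced_starts` name is hypothetical, and whether the real runner alternates starters or plays them in blocks is not stated above.

```python
def balanced_starts(rounds):
    """Return who moves first in each round, split as evenly as possible.

    Assumption: starters alternate, beginning with the AI. With an odd
    round count the extra round goes to the AI, which is an arbitrary choice.
    """
    return ["ai" if i % 2 == 0 else "llm" for i in range(rounds)]


schedule = balanced_starts(10)
print(schedule.count("ai"), schedule.count("llm"))  # 5 5
```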

The benchmark report is saved as JSON and includes the summary plus per-game move logs and final boards.
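
Because the report's field names are not documented here, one cautious way to explore it is to load the file and list its top-level structure before relying on any particular key. The `describe_report` helper below is hypothetical and assumes only that the file is valid JSON.

```python
import json
from pathlib import Path


def describe_report(path):
    """Map each top-level key of a JSON report to the type name of its value.

    Hypothetical helper: it makes no assumption about the report schema.
    """
    report = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(report, dict):
        return {key: type(value).__name__ for key, value in report.items()}
    return {"<root>": type(report).__name__}


report_path = Path("benchmarks/nemotron-3-super.json")
if report_path.exists():  # only present after a benchmark run
    for key, kind in describe_report(report_path).items():
        print(f"{key}: {kind}")
```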

## Adding Models

This repo includes a few example model configs in the `models/` folder.

You can add another model by creating a new JSON config for any endpoint that exposes an OpenAI-compatible chat completions API.

In general:

- Add a new config file under `models/`
- Point it at an OpenAI-compatible `baseURL`
- Set the remote `model` name
- Add any required API key env var to `.env`

Examples in this repo include Ollama-compatible, Hugging Face Router, and OpenRouter model configs.
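
As a minimal sketch of those steps, the snippet below builds a config containing only the two fields named above, `baseURL` and `model`, and can write it to `models/<name>.json`. The helper names, the placeholder URL, and the model name are assumptions for illustration. The real schema may require more fields, such as how the API key env var is referenced, so compare the result against the example configs shipped in `models/`.

```python
import json
from pathlib import Path


def build_model_config(base_url, model):
    """Return a config dict with the two fields named in this README.

    Hypothetical helper: the actual schema may include additional keys.
    """
    return {"baseURL": base_url, "model": model}


def write_model_config(name, config, models_dir="models"):
    """Write the config to <models_dir>/<name>.json and return the path."""
    path = Path(models_dir) / f"{name}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return path


# Placeholder values: a local OpenAI-compatible endpoint and a made-up model name.
config = build_model_config("http://localhost:11434/v1", "my-local-model")
print(json.dumps(config, indent=2))
# To save it: write_model_config("my-local-model", config)
```

A config saved as `models/my-local-model.json` would then be selected with `--model my-local-model`, following the same naming pattern as `models/nemotron-3-super.json`.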
