Metadata-Version: 2.4
Name: logicjitter
Version: 0.1.0
Summary: Algorithmically generated logic games for LLM fine-tuning and misinformation detection.
Author: Luca Herranz-Celotti, Marco Viviani
License: CC-BY-4.0
Project-URL: Paper, https://www.sciencedirect.com/science/article/pii/S0045790626002879
Project-URL: Repository, https://github.com/lucehe/logicjitter
Keywords: misinformation,llm,reasoning,dataset,logic-games
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: Other/Proprietary License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: datasets>=2.14
Requires-Dist: networkx>=3.2
Requires-Dist: nltk>=3.8
Requires-Dist: numpy>=1.24
Requires-Dist: pandas>=2.0
Requires-Dist: tqdm>=4.66
Requires-Dist: wikipedia>=1.4
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"

# LogicJitter

**LogicJitter: Let LLMs play and uncover misinformation**

[![Paper](https://img.shields.io/badge/Paper-ScienceDirect-blue)](https://www.sciencedirect.com/science/article/pii/S0045790626002879)
[![License: CC BY 4.0](https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg)](https://creativecommons.org/licenses/by/4.0/)

Python package for automatically generating logic-based training data to improve LLM reasoning and misinformation detection.

**Authors:** [Luca Herranz-Celotti](https://lucehe.github.io/) · [Marco Viviani](https://ikr3.disco.unimib.it/people/marco-viviani/)  
**Published in:** *Computers and Electrical Engineering* 136 (2026) 111215

> **Read the paper:** [LogicJitter on ScienceDirect](https://www.sciencedirect.com/science/article/pii/S0045790626002879)  
> DOI: [10.1016/j.compeleceng.2026.111215](https://doi.org/10.1016/j.compeleceng.2026.111215)

---

## Overview

LogicJitter generates structured, rule-based logic games in natural language. Each sample includes valid and fallacious reasoning patterns plus optional distractors (random characters, controlled errors, problem revisions) designed to counter cognitive biases and logical fallacies.

The dataset is **virtually unbounded**, **perfectly balanced** (50/50 true/false), and requires **no human or LLM labeling**.

### Logic games

| Game | Description |
|------|-------------|
| **Guided Maths** | Scratchpad-style arithmetic (sum, product, polynomial) |
| **Causal Clauses** | Causal graphs with connectivity and fork/collider questions |
| **Context-Free Grammars** | String membership and grammar comparison |
| **CLEVR** | Textual spatial-reasoning scenes from CLEVR-style templates |

### Augmentation presets

| Preset | Components |
|--------|------------|
| `full` | Games + Errors + Characters + Revisions (GECR) |
| `g` | Games only |
| `ge` | Games + Errors |
| `gec` | Games + Errors + Characters |
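The preset codes read as initials of the components they enable. The sketch below restates the table as a lookup to make that naming explicit. `COMPONENTS`, `PRESETS`, and `expand_preset` are hypothetical names used for illustration only; they are not part of the package's API.

```python
# Hypothetical helper: restates the preset table, not logicjitter's internals.
COMPONENTS = {"g": "Games", "e": "Errors", "c": "Characters", "r": "Revisions"}

# Each preset from the table, written as a string of component initials.
PRESETS = {"full": "gecr", "g": "g", "ge": "ge", "gec": "gec"}


def expand_preset(preset: str) -> list[str]:
    """Return the component names that a preset code stands for."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    return [COMPONENTS[letter] for letter in PRESETS[preset]]


print(expand_preset("gec"))  # ['Games', 'Errors', 'Characters']
```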

---

## Installation

```bash
pip install logicjitter
```

Optional: set a custom data directory for generated datasets and CLEVR downloads:

```bash
export LOGICJITTER_DATA_DIR=/path/to/data
```

---

## Quick start

### Generate a single logic game

```python
from logicjitter import LogicJitter

lj = LogicJitter(multimodal=True, n_characters=3, seed=42)
print(lj.get_logic_game())
print(lj.get_logic_game(game_type="causal"))
print(lj.get_logic_game(game_type="maths"))
```

### Build a HuggingFace dataset

```python
from logicjitter import create_logicjitter_dataset

ds = create_logicjitter_dataset(
    n_samples=1000,
    preset="full",
    seed=42,
    output_dir="./data",
)
print(len(ds), ds[0]["ntp"])
```

### Command line

After `pip install logicjitter`, use the `logicjitter-generate` command. While developing
from a local clone, run `pip install -e .` first so the CLI is registered.

```bash
# Print demo samples (python -m works even without the console script)
python -m logicjitter --demo --n-samples 5 --preset full --no-multimodal

# Same via the installed entry point
logicjitter-generate --demo --n-samples 5 --preset full --no-multimodal

# Generate and save a dataset
logicjitter-generate --n-samples 1000 --preset full --output-dir ./data
```

---

## Adding your own logic game

Each built-in game is a function with this signature:

```python
def get_my_problem(error: bool = False, game_pieces: dict | None = None) -> dict:
    return {
        "problem": "...",           # statement shown in the sample
        "right_answers": ["yes"],   # phrases a correct character may say
        "wrong_answers": ["no"],    # phrases an incorrect character may say
        "game_pieces": {...},       # optional state for problem revisions
    }
```

When `error=True`, bake a mistake into the statement. When `game_pieces` is passed back
on a revision round, update the underlying facts (e.g. change a number) while keeping
the same characters.

Register the game and generate samples:

```python
from logicjitter import LogicJitter

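# get_parity_problem is your own function with the signature shown above;
# define or import it before registering (see examples/custom_parity_game.py).
lj = LogicJitter(multimodal=False, seed=0)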
lj.register_game("parity", get_parity_problem)
print(lj.get_logic_game(game_type="parity"))
```

See [`examples/custom_parity_game.py`](examples/custom_parity_game.py) for a complete,
runnable toy example (even/odd sums).

---

## Package layout

```
logicjitter/
├── pyproject.toml
├── README.md
└── src/logicjitter/
    ├── __init__.py          # LogicJitter, create_logicjitter_dataset
    ├── cli.py               # logicjitter-generate entry point
    ├── dataset.py           # HuggingFace dataset pipeline
    ├── paths.py             # Package and cache paths
    └── creation/            # Logic game generators
        ├── logicjitter_generator.py
        ├── arithmetics_dataset.py
        ├── causal_dataset.py
        ├── cfg_dataset.py
        ├── clevr_dataset.py
        ├── clevr_tools/
        └── data/
```

---

## Citation

```bibtex
@article{herranzcelotti2026logicjitter,
  title   = {LogicJitter: Let LLMs play and uncover misinformation},
  author  = {Herranz-Celotti, Luca and Viviani, Marco},
  journal = {Computers and Electrical Engineering},
  volume  = {136},
  pages   = {111215},
  year    = {2026},
  doi     = {10.1016/j.compeleceng.2026.111215},
  url     = {https://www.sciencedirect.com/science/article/pii/S0045790626002879}
}
```

---

## License

Published under [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/).
