Metadata-Version: 2.4
Name: gymnasium_search_race
Version: 4.1.0
Summary: A reinforcement learning environment for the Search Race CG puzzle based on Gymnasium
Author-email: Quentin Deschamps <quentindeschamps18@gmail.com>
License: MIT
Project-URL: Repository, https://github.com/Quentin18/gymnasium-search-race
Keywords: Reinforcement Learning,game,RL,AI,gymnasium,pygame
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Games/Entertainment
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: gymnasium==1.2.0
Requires-Dist: pygame==2.6.1
Requires-Dist: stable-baselines3==2.7.0
Provides-Extra: training
Requires-Dist: moviepy==2.2.1; extra == "training"
Requires-Dist: rl_zoo3==2.7.0; extra == "training"
Requires-Dist: tensorboard==2.20.0; extra == "training"
Provides-Extra: testing
Requires-Dist: pytest; extra == "testing"
Provides-Extra: quality
Requires-Dist: black[d]; extra == "quality"
Requires-Dist: isort; extra == "quality"
Requires-Dist: pylint; extra == "quality"
Dynamic: license-file

# Gymnasium Search Race

[![Build Python Package](https://github.com/Quentin18/gymnasium-search-race/actions/workflows/build.yml/badge.svg)](https://github.com/Quentin18/gymnasium-search-race/actions/workflows/build.yml)
[![Python](https://img.shields.io/pypi/pyversions/gymnasium-search-race.svg)](https://badge.fury.io/py/gymnasium-search-race)
[![PyPI](https://badge.fury.io/py/gymnasium-search-race.svg)](https://badge.fury.io/py/gymnasium-search-race)
[![PyPI Downloads](https://static.pepy.tech/badge/gymnasium-search-race)](https://pepy.tech/projects/gymnasium-search-race)
[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white)](https://pre-commit.com/)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Imports: isort](https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat&labelColor=ef8336)](https://pycqa.github.io/isort/)

Gymnasium environments for
the [Search Race CodinGame optimization puzzle](https://www.codingame.com/multiplayer/optimization/search-race)
and [Mad Pod Racing CodinGame bot programming game](https://www.codingame.com/multiplayer/bot-programming/mad-pod-racing).

https://github.com/user-attachments/assets/766b4c79-1be7-48bd-a25b-2ff99de972f7

<table>
    <tbody>
        <tr>
            <td>Action Space</td>
            <td><code>Box([-1, 0], [1, 1], float64)</code></td>
        </tr>
        <tr>
            <td>Observation Space</td>
            <td><code>Box(-1, 1, shape=(10,), float64)</code></td>
        </tr>
        <tr>
            <td>import</td>
            <td><code>gymnasium.make("gymnasium_search_race:gymnasium_search_race/SearchRace-v3")</code></td>
        </tr>
    </tbody>
</table>

## Installation

To install `gymnasium-search-race` with pip, execute:

```bash
pip install gymnasium_search_race
```

From source:

```bash
git clone https://github.com/Quentin18/gymnasium-search-race
cd gymnasium-search-race/
pip install -e .
```

## Environment

### Action Space

The action is an `ndarray` with 2 continuous variables:

- The rotation angle between -18 and 18 degrees, normalized between -1 and 1.
- The thrust between 0 and 200, normalized between 0 and 1.
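
As a minimal sketch, assuming the normalization is a plain linear scaling, a normalized action can be mapped back to game units as shown below. The helper name `denormalize_action` is hypothetical and not part of the package, and the environment's own conversion may differ in rounding or clipping.

```python
def denormalize_action(action, max_rotation=18.0, max_thrust=200.0):
    """Map a normalized (rotation, thrust) pair to degrees and thrust units.

    Assumption: linear scaling, with out-of-range inputs clipped to the
    bounds of the action space.
    """
    rotation, thrust = action
    rotation = max(-1.0, min(1.0, float(rotation)))  # clip to [-1, 1]
    thrust = max(0.0, min(1.0, float(thrust)))  # clip to [0, 1]
    return rotation * max_rotation, thrust * max_thrust


print(denormalize_action((0.5, 0.5)))  # (9.0, 100.0)
print(denormalize_action((-1.0, 1.0)))  # (-18.0, 200.0)
```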

### Observation Space

The observation is an `ndarray` of 10 continuous variables:

- The relative x and y coordinates of the next two checkpoints in the car's frame.
- The sine and cosine of the relative angle to the next two checkpoints in the car's frame.
- The longitudinal and lateral speed in the car's frame.

The values are normalized between -1 and 1.
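
Expressing a checkpoint "in the car's frame" amounts to translating by the car position and rotating by the negative of the car heading. The sketch below illustrates that geometry only. The function names `to_car_frame` and `relative_angle_features`, the angle convention (radians, with the car's x-axis pointing forward), and the omission of any normalization constant are assumptions, not the package's implementation.

```python
import math


def to_car_frame(car_x, car_y, car_angle, target_x, target_y):
    """Return the target position relative to the car, in the car's frame."""
    dx = target_x - car_x
    dy = target_y - car_y
    cos_a = math.cos(car_angle)
    sin_a = math.sin(car_angle)
    # Rotate the offset by -car_angle so that the car's heading becomes +x.
    local_x = dx * cos_a + dy * sin_a
    local_y = -dx * sin_a + dy * cos_a
    return local_x, local_y


def relative_angle_features(local_x, local_y):
    """Return (sine, cosine) of the angle to a target given in the car's frame."""
    angle = math.atan2(local_y, local_x)
    return math.sin(angle), math.cos(angle)


# A car at the origin heading along +y sees a target at (0, 10) straight ahead.
x, y = to_car_frame(0.0, 0.0, math.pi / 2, 0.0, 10.0)
print(round(x, 6), round(y, 6))  # 10.0 0.0
```

Encoding the angle as a sine and cosine pair avoids the discontinuity that a raw angle has when it wraps around.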

### Reward

- +1 when a checkpoint is visited.
- 0 otherwise.

### Starting State

The starting state is generated by choosing a random CodinGame test case.

### Episode End

The episode ends if either of the following happens:

1. Termination: The car visits all checkpoints before the time runs out.
2. Truncation: Episode length is greater than 600.
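
The two end conditions map onto the `terminated` and `truncated` flags returned by the standard Gymnasium `step` method. The following sketch runs one episode and reports which condition ended it. The helper `run_episode` is hypothetical and only relies on the generic Gymnasium `reset`/`step` interface. Because the reward is +1 per checkpoint, the returned total equals the number of checkpoints visited.

```python
def run_episode(env, policy, seed=None):
    """Run one episode and report how it ended.

    `env` is any object following the Gymnasium API and `policy` maps an
    observation to an action.
    """
    observation, _info = env.reset(seed=seed)
    total_reward = 0.0
    steps = 0
    terminated = truncated = False
    while not (terminated or truncated):
        action = policy(observation)
        observation, reward, terminated, truncated, _info = env.step(action)
        total_reward += float(reward)
        steps += 1
    return {
        "return": total_reward,
        "steps": steps,
        "terminated": terminated,
        "truncated": truncated,
    }


# Usage sketch (requires gymnasium and this package to be installed):
# import gymnasium as gym
# env = gym.make("gymnasium_search_race:gymnasium_search_race/SearchRace-v3", test_id=1)
# print(run_episode(env, lambda obs: env.action_space.sample()))
# env.close()
```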

### Arguments

- `laps`: number of laps. The default value is `3`.
- `car_max_thrust`: maximum thrust. The default value is `200`.
- `test_id`: test case id to generate the checkpoints (see
  choices [here](https://github.com/Quentin18/gymnasium-search-race/tree/main/src/gymnasium_search_race/envs/maps)). The
  default value is `None` which selects a test case randomly when the `reset` method is called.
- `sequential_maps`: if `True`, the maps are generated sequentially. The default value is `False`.

```python
import gymnasium as gym

gym.make(
    "gymnasium_search_race:gymnasium_search_race/SearchRace-v3",
    laps=3,
    car_max_thrust=200,
    test_id=1,
    sequential_maps=False,
)
```

### Version History

- v3: Update observation with relative positions and angles in car's frame
- v2: Update observation with relative positions and angles
- v1: Add boolean to indicate if the next checkpoint is the last checkpoint in observation
- v0: Initial version

## Discrete environment

The `SearchRaceDiscrete` environment is similar to the `SearchRace` environment, except that the action space is discrete.

```python
import gymnasium as gym

gym.make(
    "gymnasium_search_race:gymnasium_search_race/SearchRaceDiscrete-v3",
    laps=3,
    car_max_thrust=200,
    test_id=1,
    sequential_maps=False,
)
```

### Action Space

There are 74 discrete actions, one for each combination of an angle from -18 to 18 degrees and a thrust of either 0 or 200.
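
The count of 74 is consistent with 37 integer angles combined with 2 thrust values. The enumeration below is a sketch under that assumption. The integer angle step and the ordering of the list are guesses, so the index of a given action in the real environment may differ.

```python
ANGLES = range(-18, 19)  # assumed: 37 integer angles, in degrees
THRUSTS = (0, 200)  # the two thrust values

# One (angle, thrust) pair per discrete action.
ACTIONS = [(angle, thrust) for angle in ANGLES for thrust in THRUSTS]

print(len(ACTIONS))  # 74
print(ACTIONS[0], ACTIONS[-1])  # (-18, 0) (18, 200)
```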

### Version History

- v3: Update observation with relative positions and angles in car's frame
- v2: Update observation with relative positions and angles
- v1: Add all angles in action space
- v0: Initial version

## Mad Pod Racing

### Runner

The `MadPodRacing` and `MadPodRacingDiscrete` environments can be used to train a runner for
the [Mad Pod Racing CodinGame bot programming game](https://www.codingame.com/multiplayer/bot-programming/mad-pod-racing).
They are similar to the `SearchRace` and `SearchRaceDiscrete` environments, except for the following differences:

- The maps are generated the same way CodinGame generates them.
- The car position is rounded and not truncated.

```python
import gymnasium as gym

gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacing-v2")
gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacingDiscrete-v2")
```

https://github.com/user-attachments/assets/2e2a748d-5bd8-459a-8ac2-a8420bae33b9

### Blocker

The `MadPodRacingBlocker` and `MadPodRacingBlockerDiscrete` environments can be used to train a blocker for
the [Mad Pod Racing CodinGame bot programming game](https://www.codingame.com/multiplayer/bot-programming/mad-pod-racing).

```python
import gymnasium as gym

gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacingBlocker-v2")
gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacingBlockerDiscrete-v2")
```

https://github.com/user-attachments/assets/3c71a487-9ec1-49cd-9b8b-70f7984a809a

### Arguments

- `opponent_path`: path to the opponent PPO model. The default value is `None` which means there is no opponent.
- `boost_on_first_move`: if `True`, the car is boosted on the first move. The default value is `False`.
- `boost_opponent_on_first_move`: if `True`, the opponent is boosted on the first move. The default value is `False`.

### Version History

- v2: Update observation with relative positions and angles in car's frame and add boost options
- v1: Update observation with relative positions and angles and update maximum thrust
- v0: Initial version

## Usage

You can use [RL Baselines3 Zoo](https://github.com/DLR-RM/rl-baselines3-zoo) to train and evaluate agents:

```bash
pip install rl_zoo3
```

### Train an Agent

The hyperparameters are defined in `hyperparams/ppo.yml`.

To train a PPO agent for the Search Race game, execute:

```bash
python -m rl_zoo3.train \
  --algo ppo \
  --env gymnasium_search_race/SearchRaceDiscrete-v3 \
  --tensorboard-log logs \
  --eval-freq 20000 \
  --eval-episodes 50 \
  --gym-packages gymnasium_search_race \
  --env-kwargs "laps:1000" "sequential_maps:True" \
  --conf-file hyperparams/ppo.yml \
  --progress
```

> [!IMPORTANT]
> The agent is evaluated once per test case with `--eval-episodes 50` and `--env-kwargs "sequential_maps:True"` (there
> are 50 different test cases).
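
Each `--env-kwargs` entry is a `key:value` pair whose value appears to be interpreted as a Python expression, which is why a string such as a file path needs its own inner quotes. The sketch below mimics that behavior with `ast.literal_eval` for illustration only. `parse_env_kwargs` is a hypothetical helper, and the actual parser in RL Baselines3 Zoo may differ in detail.

```python
import ast


def parse_env_kwargs(pairs):
    """Turn ["key:value", ...] strings into a keyword-argument dictionary."""
    kwargs = {}
    for pair in pairs:
        # Split on the first colon only, so values may contain colons.
        key, _, raw_value = pair.partition(":")
        kwargs[key] = ast.literal_eval(raw_value)
    return kwargs


print(parse_env_kwargs(["laps:1000", "sequential_maps:True"]))
# {'laps': 1000, 'sequential_maps': True}
```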

For the Mad Pod Racing game, you can add an opponent with the `opponent_path` argument:

```bash
python -m rl_zoo3.train \
  --algo ppo \
  --env gymnasium_search_race/MadPodRacingBlockerDiscrete-v2 \
  --tensorboard-log logs \
  --eval-freq 20000 \
  --eval-episodes 52 \
  --gym-packages gymnasium_search_race \
  --env-kwargs \
  "opponent_path:'rl-trained-agents/ppo/gymnasium_search_race-MadPodRacingDiscrete-v2_1/best_model.zip'" \
  "laps:1000" \
  "sequential_maps:True" \
  "boost_opponent_on_first_move:True" \
  --conf-file hyperparams/ppo.yml \
  --progress
```

> [!IMPORTANT]
> The agent is evaluated four times per test case with `--eval-episodes 52` and
> `--env-kwargs "sequential_maps:True"` (there are 13 different test cases).
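
The relation between `--eval-episodes` and the number of test cases is simple arithmetic: with sequential maps, every test case is covered equally often only when the episode count is a multiple of the test case count. The hypothetical helper below, which is not part of the package, checks this for the two commands above.

```python
def evaluations_per_test_case(eval_episodes, n_test_cases):
    """Return how many times each test case is evaluated.

    Raises ValueError when the episodes cannot be spread evenly.
    """
    passes, remainder = divmod(eval_episodes, n_test_cases)
    if remainder:
        raise ValueError(
            f"{eval_episodes} episodes do not cover {n_test_cases} test cases evenly"
        )
    return passes


print(evaluations_per_test_case(50, 50))  # 1 (Search Race)
print(evaluations_per_test_case(52, 13))  # 4 (Mad Pod Racing)
```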

### Enjoy a Trained Agent

To see a trained agent in action on random test cases, execute:

```bash
python -m rl_zoo3.enjoy \
  --algo ppo \
  --env gymnasium_search_race/SearchRaceDiscrete-v3 \
  --n-timesteps 1000 \
  --deterministic \
  --gym-packages gymnasium_search_race \
  --load-best \
  --progress
```

### Run Test Cases

To run test cases with a trained agent, execute:

```bash
python -m scripts.run_test_cases \
  --path rl-trained-agents/ppo/gymnasium_search_race-SearchRaceDiscrete-v3_1/best_model.zip \
  --env gymnasium_search_race:gymnasium_search_race/SearchRaceDiscrete-v3 \
  --record-video \
  --record-metrics
```

### Record a Video of a Trained Agent

To record a video of a trained agent on Mad Pod Racing, execute:

```bash
python -m scripts.record_video \
  --path rl-trained-agents/ppo/gymnasium_search_race-MadPodRacingDiscrete-v2_1/best_model.zip \
  --env gymnasium_search_race:gymnasium_search_race/MadPodRacingDiscrete-v2
```

For Mad Pod Racing Blocker, execute:

```bash
python -m scripts.record_video \
  --path rl-trained-agents/ppo/gymnasium_search_race-MadPodRacingBlockerDiscrete-v2_1/best_model.zip \
  --opponent-path rl-trained-agents/ppo/gymnasium_search_race-MadPodRacingDiscrete-v2_1/best_model.zip \
  --env gymnasium_search_race:gymnasium_search_race/MadPodRacingBlockerDiscrete-v2
```

## Tests

To run tests, execute:

```bash
pytest
```

## Citing

To cite the repository in publications:

```bibtex
@misc{gymnasium-search-race,
  author = {Quentin Deschamps},
  title = {Gymnasium Search Race},
  year = {2024},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/Quentin18/gymnasium-search-race}},
}
```

## References

- [Gymnasium](https://github.com/Farama-Foundation/Gymnasium)
- [RL Baselines3 Zoo](https://github.com/DLR-RM/rl-baselines3-zoo)
- [Stable Baselines3](https://github.com/DLR-RM/stable-baselines3)
- [CGSearchRace](https://github.com/Illedan/CGSearchRace)
- [CSB-Runner-Arena](https://github.com/Agade09/CSB-Runner-Arena)
- [Coders Strike Back by Magus](http://files.magusgeek.com/csb/csb_en.html)

### Assets

- https://www.flaticon.com/free-icon/space-ship_751036
- https://www.flaticon.com/free-icon/space-ship_784925

## Author

[Quentin Deschamps](mailto:quentindeschamps18@gmail.com)
