Metadata-Version: 2.4
Name: or-gymnasium
Version: 0.1.0
Summary: Gymnasium environments for operations research reinforcement learning problems from or-gym.
Project-URL: Homepage, https://github.com/JGIoA/or-gymnasium
Author: Ji Gao
License: MIT License
        
        Copyright (c) 2020 Christian
        Copyright (c) 2026 Ji Gao
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: gymnasium,operations research,optimization,reinforcement learning
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Mathematics
Requires-Python: >=3.10
Requires-Dist: gymnasium>=1.2.0
Requires-Dist: matplotlib>=3.1
Requires-Dist: networkx>=2.3
Requires-Dist: numpy>=1.21
Requires-Dist: pandas>=1.2
Requires-Dist: scipy>=1.7
Provides-Extra: dev
Requires-Dist: build; extra == 'dev'
Requires-Dist: pytest; extra == 'dev'
Requires-Dist: twine; extra == 'dev'
Description-Content-Type: text/markdown

# or-gymnasium

[`or-gymnasium`](https://github.com/JGIoA/or-gymnasium) packages a collection of [`Gymnasium`](https://github.com/farama-foundation/gymnasium) environments for operations research reinforcement learning problems, adapted from the original [`or-gym`](https://github.com/hubbs5/or-gym) project. The main changes update the environment interfaces so they work with the current Gymnasium API (this package requires `gymnasium>=1.2.0`).

## Installation

Install the package with `pip` to use the bundled environment registrations:

```bash
pip install or-gymnasium
```

You can also copy the environment files you need into your own project, register them with Gymnasium, and use them without installing this package.

## Quickstart

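Importing the `or_gymnasium` module (the import name of [`or-gymnasium`](https://github.com/JGIoA/or-gymnasium)) registers the environments with Gymnasium, so the import is needed even though the name is not used afterwards.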

```python
import gymnasium as gym
import or_gymnasium

env = gym.make("Newsvendor-v0")
observation, info = env.reset(seed=123)
observation, reward, terminated, truncated, info = env.step(env.action_space.sample())
```
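
The single `step` call above extends naturally to a full episode. The following is a minimal sketch of a random-policy rollout; `run_random_episode` and its `max_steps` safety cap are hypothetical names introduced here for illustration and are not part of `or-gymnasium` or Gymnasium. It relies only on the `reset`/`step`/`action_space.sample()` calls already shown in the quickstart.

```python
def run_random_episode(env, seed=None, max_steps=1_000):
    """Roll out one episode with uniformly sampled actions.

    `env` is any object following the Gymnasium reset/step interface.
    Returns the undiscounted return and the number of steps taken.
    """
    observation, info = env.reset(seed=seed)
    total_reward = 0.0
    steps = 0
    while steps < max_steps:
        action = env.action_space.sample()
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        steps += 1
        # An episode ends when the task finishes (terminated)
        # or when a time limit cuts it short (truncated).
        if terminated or truncated:
            break
    return total_reward, steps
```

With the package installed, calling `run_random_episode(gym.make("Newsvendor-v0"), seed=123)` after the imports above should give a rough random-policy baseline to compare a trained agent against.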




## Using Copied Environment Files

After copying an environment file, register the environment with Gymnasium using the copied module path and class name.

```python
import gymnasium as gym

gym.register(
    id="Newsvendor-v0",
    entry_point="my_project.envs.newsvendor:NewsvendorEnv",
)

env = gym.make("Newsvendor-v0")
observation, info = env.reset(seed=123)
observation, reward, terminated, truncated, info = env.step(env.action_space.sample())
```

Replace `my_project.envs.newsvendor:NewsvendorEnv` with the module path and class name for the environment file in your project.
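
A wrong module path or class name in `entry_point` typically only surfaces when `gym.make` is called. As a debugging aid, the string can be checked beforehand with the standard library alone. This is a sketch under the assumption that the entry point uses the `module.path:ClassName` form shown above; `resolve_entry_point` is a hypothetical helper, not a Gymnasium function.

```python
import importlib


def resolve_entry_point(entry_point: str):
    """Import the object named by a 'module.path:AttributeName' string."""
    module_path, separator, attribute = entry_point.partition(":")
    if not separator or not module_path or not attribute:
        raise ValueError(
            f"expected 'module.path:ClassName', got {entry_point!r}"
        )
    # Raises ModuleNotFoundError if the module path is wrong.
    module = importlib.import_module(module_path)
    # Raises AttributeError if the class name is wrong.
    return getattr(module, attribute)
```

For example, `resolve_entry_point("my_project.envs.newsvendor:NewsvendorEnv")` should return the environment class if the copied file is importable from your project, and raise otherwise.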

## Examples
The repository includes PPO examples for continuous and discrete action environments, based on [`CleanRL`](https://github.com/vwxyzjn/cleanrl) scripts slightly adapted for recent versions of Gymnasium. Install the [`CleanRL`](https://github.com/vwxyzjn/cleanrl) dependencies separately before running them.

```bash
python ./examples/cleanrl_ppo_continous.py
```

```bash
python ./examples/cleanrl_ppo_discrete.py
```

Use [`TensorBoard`](https://github.com/tensorflow/tensorboard) to view the results.
```bash
tensorboard --logdir ./runs
```

## Environments


See the source files and the original [`or-gym`](https://github.com/hubbs5/or-gym) repository for detailed descriptions of the environments and their operations research background.

- `Newsvendor-v0`: Multi-period newsvendor problem with stochastic demand, lead times, holding costs, and lost-sales penalties.
- `TSP-v0`: Sparse, bidirectional traveling-salesperson graph with uniform movement costs and optional action masking.
- `TSP-v1`: Fully connected traveling-salesperson graph with Euclidean distance costs and penalties for revisiting nodes.
- `Knapsack-v0`: Unbounded knapsack problem where items can be selected repeatedly until capacity is reached or exceeded.
- `Knapsack-v1`: Binary knapsack problem where each item can be selected at most once.
- `Knapsack-v2`: Bounded knapsack problem where each item has a limited quantity available.
- `Knapsack-v3`: Online knapsack problem where randomly presented items must be accepted or rejected one at a time.
- `BinPacking-v0`: Small online bin packing instance with bounded waste, stochastic item arrivals, and capacity-based placement actions.
- `BinPacking-v1`: Large bounded-waste bin packing instance with higher bin capacity, more item sizes, and a longer horizon.
- `BinPacking-v2`: Small perfectly packable bin packing instance with linear waste rewards.
- `BinPacking-v3`: Large perfectly packable bin packing instance with linear waste rewards.
- `BinPacking-v4`: Small perfectly packable bin packing instance with bounded waste rewards.
- `BinPacking-v5`: Large perfectly packable bin packing instance with bounded waste rewards.
- `VMPacking-v0`: Online virtual-machine packing problem that assigns CPU and memory demands to physical machines without overloading them.
- `VMPacking-v1`: Temporary virtual-machine packing problem where assigned processes expire and release physical-machine capacity.
- `InvManagement-v0`: Multi-period, multi-echelon inventory management system with production capacities, lead times, and backlogged unmet demand.
- `InvManagement-v1`: Multi-period, multi-echelon inventory management system where unmet demand is treated as lost sales.
- `NetworkManagement-v0`: Multi-period supply-network inventory management over a directed graph with production, distribution, raw-material, and market nodes.
- `NetworkManagement-v1`: Supply-network inventory management variant where unmet market demand and replenishment orders are lost instead of backlogged.
- `PortfolioOpt-v0`: Multi-period portfolio optimization problem for buying and selling three risky assets with transaction costs.
- `VehicleRouting-v0`: Dynamic food-delivery vehicle routing problem with stochastic orders, pickup and delivery actions, capacity limits, and time penalties.
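
To confirm that every ID listed above can be created in a given installation, a small smoke check can construct each environment, reset it, and take one sampled step. The sketch below is illustrative only: `ENV_IDS` is copied from the list above, and `smoke_check` is a hypothetical helper that takes the factory as an argument (for example `gym.make`), so it does not import Gymnasium itself. A sampled action is not guaranteed to be valid for every environment, so a reported failure may reflect the action rather than the installation.

```python
ENV_IDS = [
    "Newsvendor-v0",
    "TSP-v0", "TSP-v1",
    "Knapsack-v0", "Knapsack-v1", "Knapsack-v2", "Knapsack-v3",
    "BinPacking-v0", "BinPacking-v1", "BinPacking-v2",
    "BinPacking-v3", "BinPacking-v4", "BinPacking-v5",
    "VMPacking-v0", "VMPacking-v1",
    "InvManagement-v0", "InvManagement-v1",
    "NetworkManagement-v0", "NetworkManagement-v1",
    "PortfolioOpt-v0",
    "VehicleRouting-v0",
]


def smoke_check(make, env_ids=ENV_IDS, seed=0):
    """Create, reset, and step each environment once.

    `make` is a callable such as `gym.make`. Returns a dict mapping
    each failing environment ID to a description of the exception.
    """
    failures = {}
    for env_id in env_ids:
        try:
            env = make(env_id)
            env.reset(seed=seed)
            env.step(env.action_space.sample())
            env.close()
        except Exception as exc:  # report every failure, do not stop early
            failures[env_id] = repr(exc)
    return failures
```

After `import gymnasium as gym` and `import or_gymnasium`, calling `smoke_check(gym.make)` should return an empty dict when all environments construct and step without error.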


## Reference
```bibtex
@misc{HubbsOR-Gym,
    author={Christian D. Hubbs and Hector D. Perez and Owais Sarwar and Nikolaos V. Sahinidis and Ignacio E. Grossmann and John M. Wassick},
    title={OR-Gym: A Reinforcement Learning Library for Operations Research Problems},
    year={2020},
    Eprint={arXiv:2008.06319}
}
```
```bibtex
@misc{towers2024gymnasium,
  author={Towers, Mark and Kwiatkowski, Ariel and Terry, Jordan and Balis, John U and De Cola, Gianluca and Deleu, Tristan and Goul{\~a}o, Manuel and Kallinteris, Andreas and Krimmel, Markus and KG, Arjun and others},
  title={Gymnasium: A Standard Interface for Reinforcement Learning Environments},
  year={2024},
  Eprint={arXiv:2407.17032}
}
```

```bibtex
@article{huang2022cleanrl,
  author  = {Shengyi Huang and Rousslan Fernand Julien Dossa and Chang Ye and Jeff Braga and Dipam Chakraborty and Kinal Mehta and João G.M. Araújo},
  title   = {CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms},
  journal = {Journal of Machine Learning Research},
  year    = {2022},
  volume  = {23},
  number  = {274},
  pages   = {1--18},
  url     = {http://jmlr.org/papers/v23/21-1342.html}
}
```

## To cite this repository
```bibtex
@misc{gao2026actorpda,
  author={Gao, Ji and Ju, Caleb and Lan, Guanghui and Tong, Zhaohui},
  title={Actor-Accelerated Policy Dual Averaging for Reinforcement Learning in Continuous Action Spaces},
  year={2026},
  Eprint={arXiv:2603.10199}
}
```

