Metadata-Version: 2.4
Name: joint-sim
Version: 0.1.2
Summary: MAPF+JSSP Joint Simulation Environment for Reinforcement Learning
Author-email: Skyrim Forestsea <hitskyrim@qq.com>
Maintainer-email: Skyrim Forestsea <hitskyrim@qq.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/skyrimforest/joint-sim
Project-URL: Documentation, https://github.com/skyrimforest/joint-sim#readme
Project-URL: Repository, https://github.com/skyrimforest/joint-sim
Project-URL: Issues, https://github.com/skyrimforest/joint-sim/issues
Keywords: reinforcement-learning,gymnasium,job-shop-scheduling,multi-agent-path-finding,MAPF,JSSP,scheduling,simulation,AGV,factory
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: <3.12,>=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.21
Requires-Dist: gymnasium>=0.28.1
Requires-Dist: scipy>=1.7.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: black>=23.0; extra == "dev"
Requires-Dist: isort>=5.0; extra == "dev"
Requires-Dist: flake8>=6.0; extra == "dev"
Requires-Dist: mypy>=1.0; extra == "dev"
Provides-Extra: visualization
Requires-Dist: matplotlib>=3.5; extra == "visualization"
Provides-Extra: optimal-solver
Requires-Dist: ortools>=9.8; extra == "optimal-solver"
Provides-Extra: all
Requires-Dist: joint-sim[dev,optimal-solver,visualization]; extra == "all"
Dynamic: license-file

# Joint-Sim: MAPF+JSSP Reinforcement Learning Environment

A Gymnasium-compatible simulation environment for the joint optimization of **Multi-Agent Path Finding (MAPF)** and **Job Shop Scheduling Problem (JSSP)**.

[![PyPI version](https://badge.fury.io/py/joint-sim.svg)](https://badge.fury.io/py/joint-sim)
[![Python 3.8–3.11](https://img.shields.io/badge/python-3.8--3.11-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

## Features

- **Gymnasium-compatible interface**: Standard `reset()`, `step()`, `observation_space`, `action_space`
- **Joint optimization**: Simultaneously handles AGV routing (MAPF) and job scheduling (JSSP)
- **Cost-based assignment**: Hungarian algorithm for optimal task-AGV matching
- **Collision-free routing**: Reservation table for conflict prevention
- **Flexible configuration**: Customizable machines, AGVs, jobs, and grid layouts
- **SVG visualization**: Real-time factory state rendering
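
The "cost-based assignment" feature pairs AGVs with transport tasks by minimizing total travel cost. The sketch below shows the general idea using SciPy's `linear_sum_assignment` (a Hungarian-style solver) on a Manhattan-distance cost matrix; the positions are made up for illustration, and joint-sim's internal assigner may build its cost matrix differently.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical AGV positions and pending pickup positions on the grid.
agv_positions = np.array([[0, 0], [5, 5]])
pickup_positions = np.array([[1, 1], [4, 4], [9, 9]])

# Cost matrix: Manhattan distance from each AGV to each pickup point.
cost = np.abs(agv_positions[:, None, :] - pickup_positions[None, :, :]).sum(axis=2)

# Optimal one-to-one matching that minimizes total travel cost.
agv_idx, task_idx = linear_sum_assignment(cost)
for a, t in zip(agv_idx, task_idx):
    print(f"AGV {a} -> task {t} (cost {cost[a, t]})")
```

Here AGV 0 takes the pickup at (1, 1) and AGV 1 the one at (4, 4); the far pickup at (9, 9) waits for a free AGV on a later step.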

## Installation

```bash
pip install joint-sim
```

For additional features:

```bash
pip install "joint-sim[visualization]"  # Matplotlib for charts
pip install "joint-sim[optimal-solver]"  # OR-Tools for optimal JSSP
pip install "joint-sim[all]"             # All optional dependencies
```

## Quick Start

### Basic Usage

```python
from joint_sim import JointSimGymEnv, FactoryConfig

# Create environment
config = FactoryConfig(
    n_machines=4,
    n_agvs=2,
    n_jobs=5,
    grid_size=(10, 10),
)

env = JointSimGymEnv(config, assigner_type='cost')

# Standard Gymnasium interface
obs, info = env.reset(seed=42)

for _ in range(1000):
    # Random action (use your RL policy here)
    action = env.action_space.sample()

    obs, reward, terminated, truncated, info = env.step(action)

    if terminated or truncated:
        break

print(f"Completed jobs: {info['completed_jobs']}/{info['total_jobs']}")
```

### With Custom Policy

```python
import numpy as np
from joint_sim import JointSimGymEnv, FactoryConfig

env = JointSimGymEnv(
    config=FactoryConfig(n_machines=4, n_agvs=2, n_jobs=5),
    assigner_type='cost'
)

obs, info = env.reset(seed=42)

def heuristic_policy(env, obs):
    """Assign nearest task to each AGV"""
    requests = env.get_transport_requests()
    n_agvs = env.config.n_agvs

    if not requests:
        return {'agv_assignments': np.full(n_agvs, -1, dtype=np.int32)}

    assignments = np.full(n_agvs, -1, dtype=np.int32)

    for i in range(min(n_agvs, len(requests))):
        assignments[i] = requests[i].job_id

    return {'agv_assignments': assignments}

total_reward = 0.0
while True:
    action = heuristic_policy(env, obs)
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward

    if terminated or truncated:
        break

print(f"Total reward: {total_reward}, Completed: {info['completed_jobs']}")
```

## Environment Specification

### Observation Space

```python
observation_space = Dict({
    'grid': Box(0, 3, shape=(height, width), dtype=int32),      # Factory grid
    'agv_positions': Box(0, max_size, shape=(n_agvs, 2)),       # AGV (x, y) positions
    'agv_status': Box(0, 3, shape=(n_agvs,), dtype=int32),      # 0=idle, 1=moving, 2=loading, 3=unloading
    'agv_carrying': Box(-1, n_jobs, shape=(n_agvs,), dtype=int32),  # Job ID being carried (-1=none)
    'machine_status': Box(0, 1, shape=(n_machines,), dtype=int32),  # 0=idle, 1=working
    'machine_queue_length': Box(0, n_jobs, shape=(n_machines,)),   # Queue length per machine
    'time': Box(0, max_time, shape=(), dtype=int32),            # Current timestep
})
```

### Action Space

```python
action_space = Dict({
    'agv_assignments': Box(-1, n_jobs-1, shape=(n_agvs,), dtype=int32),
})
# -1: No assignment (AGV stays idle)
# 0 to n_jobs-1: Assign job to AGV
```
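
A valid action is a dict holding one int32 entry per AGV. The snippet below builds one for a hypothetical 3-AGV setup, assigning job 3 to AGV 0, leaving AGV 1 idle, and assigning job 0 to AGV 2:

```python
import numpy as np

n_agvs = 3

# Start with every AGV idle (-1), then assign jobs by index.
assignments = np.full(n_agvs, -1, dtype=np.int32)
assignments[0] = 3  # AGV 0 handles job 3
assignments[2] = 0  # AGV 2 handles job 0

action = {'agv_assignments': assignments}
print(action)
```

Starting from an all `-1` array and overwriting only the AGVs you want to dispatch keeps the action valid even when most AGVs stay idle.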

### Reward Function

- **Job completion bonus**: +100 per completed job
- **Terminal bonus**: +100 when all jobs complete
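
To make the shaping concrete, here is a minimal sketch of a reward with these two terms, including the `reward_scale` factor exposed by the environment constructor. The function name and arguments are illustrative, not joint-sim internals.

```python
JOB_COMPLETION_BONUS = 100.0
TERMINAL_BONUS = 100.0

def compute_reward(newly_completed: int, all_done: bool, reward_scale: float = 1.0) -> float:
    """Illustrative reward: +100 per job completed this step, +100 once all jobs finish."""
    reward = JOB_COMPLETION_BONUS * newly_completed
    if all_done:
        reward += TERMINAL_BONUS
    return reward * reward_scale

print(compute_reward(2, False))  # two jobs finished this step -> 200.0
print(compute_reward(1, True))   # last job finished: completion + terminal bonus -> 200.0
```

Because the signal is sparse (zero on most steps), RL agents trained on it may benefit from a larger `reward_scale` or additional shaping terms.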

## API Reference

### JointSimGymEnv

```python
env = JointSimGymEnv(
    config: FactoryConfig = None,       # Factory configuration
    assigner_type: str = 'cost',        # 'cost' or 'greedy'
    assigner_config: dict = None,       # Assigner parameters
    max_episode_steps: int = 10000,     # Maximum steps per episode
    reward_scale: float = 1.0           # Reward scaling factor
)
```

### FactoryConfig

```python
config = FactoryConfig(
    n_machines: int = 6,                # Number of machines
    n_agvs: int = 3,                    # Number of AGVs
    n_jobs: int = 10,                   # Number of jobs
    grid_size: Tuple[int, int] = (20, 20),  # Grid dimensions
    n_ops_per_job: Tuple[int, int] = (3, 6),  # Operations per job range
    op_duration_range: Tuple[int, int] = (3, 10),  # Operation duration range
    seed: int = 42,                     # Random seed
    max_time: int = 5000,               # Maximum simulation time
)
```

### Key Methods

| Method | Description |
|--------|-------------|
| `reset(seed=None)` | Reset environment, returns `(obs, info)` |
| `step(action)` | Execute action, returns `(obs, reward, terminated, truncated, info)` |
| `get_transport_requests()` | Get list of pending transport requests |
| `get_state()` | Get current `FactoryState` |
| `get_metrics()` | Get simulation metrics |
| `render(filepath=None)` | Render SVG visualization |

## License

MIT License - see [LICENSE](LICENSE) file for details.

## Citation

If you use this environment in your research, please cite:

```bibtex
@software{joint_sim,
  title = {Joint-Sim: MAPF+JSSP Reinforcement Learning Environment},
  author = {Skyrim Forestsea},
  year = {2026},
  url = {https://github.com/skyrimforest/joint-sim}
}
```

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
