Metadata-Version: 2.4
Name: demandify
Version: 0.0.5
Summary: Calibrate SUMO traffic simulations against real-world congestion data using genetic algorithms
Author-email: Ahmet Onur Akman <ahmetonurakman@gmail.com>
License: MIT
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: GIS
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastapi>=0.104.0
Requires-Dist: uvicorn[standard]>=0.24.0
Requires-Dist: jinja2>=3.1.0
Requires-Dist: httpx>=0.25.0
Requires-Dist: python-multipart>=0.0.6
Requires-Dist: mapbox-vector-tile>=2.1.0
Requires-Dist: shapely>=2.0.0
Requires-Dist: rtree>=1.1.0
Requires-Dist: protobuf>=4.24.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: pandas>=2.0.0
Requires-Dist: deap>=1.4.0
Requires-Dist: matplotlib>=3.7.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: pydantic-settings>=2.0.0
Requires-Dist: pyproj>=3.4.0
Requires-Dist: tqdm>=4.65.0
Provides-Extra: dev
Requires-Dist: pytest>=7.4.0; extra == "dev"
Requires-Dist: pytest-cov>=4.1.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: black>=23.7.0; extra == "dev"
Requires-Dist: ruff>=0.0.286; extra == "dev"
Requires-Dist: mypy>=1.5.0; extra == "dev"
Dynamic: license-file

![demandify](https://github.com/aonurakman/demandify/blob/main/static/banner.png?raw=true)

[![PyPI version](https://badge.fury.io/py/demandify.svg)](https://pypi.org/project/demandify/)
[![DOI](https://zenodo.org/badge/1144266508.svg)](https://doi.org/10.5281/zenodo.18698977)
[![Reproducibility Check](https://github.com/aonurakman/demandify/actions/workflows/reproducibility.yml/badge.svg)](https://github.com/aonurakman/demandify/actions/workflows/reproducibility.yml)


# Welcome to demandify!

**Turn real-world traffic data into agent-based SUMO traffic scenarios.**

Do you want to recreate real-world city traffic but don't have access to precious driver trip data? **demandify** solves that.

Pick a spot on the map and demandify will:
1. Fetch real-time congestion data from TomTom 🗺️
2. Build a clean SUMO network 🛣️
3. Use a genetic algorithm to find the demand pattern that matches that traffic 🧬
4. Produce a ready-to-run SUMO scenario with agent-level precision, so you can test your urban routing policies, even for your CAVs! ([wink](https://github.com/COeXISTENCE-PROJECT/URB) [wink](https://github.com/COeXISTENCE-PROJECT/RouteRL)).

![Workflow](https://github.com/aonurakman/demandify/blob/main/static/schema.png?raw=true)

## Features

- 🌍 **Real-world calibration**: Uses TomTom Traffic Flow API for live congestion data
- 📦 **Offline calibration import**: Run from bundled/offline traffic+network snapshots
- 🎯 **Seeded & reproducible**: Same seed = identical results for same congestion and bbox
- 🚗 **Car-only SUMO networks**: Automatic OSM → SUMO conversion with car filtering, clean networks
- 🧬 **Genetic algorithm**: Optimizes demand to match observed speeds, with advanced dynamics (elite-slice parent selection, immigrants, assortative mating, adaptive mutation boost)
- 💾 **Smart caching**: Content-addressed caching for fast re-runs (traffic snapshots bucketed to 5-minute windows)
- 📊 **Beautiful reports**: HTML reports with visualizations and statistics
- ⌨️ **CLI native**: Live in the terminal? No problem.
- 🖥️ **Clean web UI**: Leaflet map, real-time progress stepper, log console
- ✅ **Data quality labeling**: The feasibility check reports a quality score and label before calibration starts

![GUI Screenshot](https://github.com/aonurakman/demandify/blob/main/static/gui.png?raw=true)

## Quickstart

### 1. Install demandify

```bash
# Install from PyPI (Recommended)
pip install demandify
```

If you want to contribute or install from source:
```bash
git clone https://github.com/aonurakman/demandify.git
cd demandify
pip install -e .
```

### 2. Install SUMO 🚦

**demandify** requires SUMO (Simulation of Urban MObility) to power its simulations.

> [!IMPORTANT] 
> demandify is developed and tested with SUMO version 1.26.0. Ensure that your SUMO version is up to date.

👉 **[Download SUMO from the official website](https://eclipse.dev/sumo/)**

Once installed, verify it's working:
```bash
demandify doctor
```

### 3. Get a TomTom API Key

1. Sign up at [https://developer.tomtom.com/](https://developer.tomtom.com/)
2. Create a new app and copy the API key
3. The free tier includes 2,500 requests/day

### 4. Run demandify

```bash
demandify
```

This starts the web server at [http://127.0.0.1:8000](http://127.0.0.1:8000)

### 5. Calibrate a scenario

1. **Choose mode** at the top:
   - `Create`: live TomTom + OSM fetch
   - `Import`: select existing offline dataset (bbox auto-loaded and locked)
2. **Draw a bounding box** on the map (Create mode only)
3. **Configure parameters** (defaults work well):
   - Time window: 15, 30, or 60 minutes
   - Seed: any integer for reproducibility
   - Warmup: a few minutes to populate the network
   - GA population/generations: controls quality vs speed
4. **Paste your API key** (Create mode only; one-time, stored locally)
5. **Click "Start Calibration"**
6. **Watch the progress** through 8 stages
7. **Download your scenario** with `demand.csv`, SUMO network, and report

Before calibration starts, demandify runs a preparation feasibility check and reports:
- fetched traffic segments
- matched observed edges
- total network edges
- data quality label + score + risk flags
   
### 6. Run Headless (Optional) 🤖

You can run the full calibration pipeline directly from the command line, ideal for automation or remote servers.

```bash
# Basic usage (defaults: window=15, pop=50, gen=20)
demandify run "2.2961,48.8469,2.3071,48.8532" --name Paris_Test_01

# Advanced usage with custom parameters
demandify run "2.2961,48.8469,2.3071,48.8532" \
  --name paris_v1 \
  --window 30 \
  --seed 123 \
  --pop 100 \
  --gen 50 \
  --mutation 0.5 \
  --elitism 2

# With advanced GA dynamics
demandify run "2.2961,48.8469,2.3071,48.8532" \
  --name paris_v2 \
  --pop 100 \
  --gen 100 \
  --immigrant-rate 0.05 \
  --stagnation-patience 15

# Fully non-interactive (automation/CI)
demandify run "2.2961,48.8469,2.3071,48.8532" \
  --name paris_v3 \
  --non-interactive

# Import existing offline dataset (no live TomTom/OSM fetch)
demandify run --import krakow_v1 --name krakow_remote
```
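For batch jobs, the same headless call can be launched from a Python script. The sketch below is only an illustration: `build_run_command` and `run_calibration` are hypothetical helper names, and the command uses only flags documented in this README.

```python
import shlex
import subprocess


def build_run_command(bbox, name, seed=42, pop=50, gen=20, non_interactive=True):
    """Assemble a `demandify run` command from flags documented in this README."""
    cmd = [
        "demandify", "run", bbox,
        "--name", name,
        "--seed", str(seed),
        "--pop", str(pop),
        "--gen", str(gen),
    ]
    if non_interactive:
        # Skip the confirmation prompts so the process exits on completion.
        cmd.append("--non-interactive")
    return cmd


def run_calibration(cmd):
    """Launch the CLI; subprocess raises CalledProcessError on a non-zero exit."""
    return subprocess.run(cmd, check=True)


# Print the command for inspection instead of launching it right away.
command = build_run_command("2.2961,48.8469,2.3071,48.8532", "paris_batch_01")
print(shlex.join(command))
```

Calling `run_calibration(command)` starts the actual pipeline, which requires SUMO and a TomTom API key to be set up as described in the steps above.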

> **Note:** By default, the CLI pauses after fetching/matching data and asks for confirmation, then asks whether to run another calibration. Pass `--non-interactive` to auto-approve and exit immediately after pipeline completion.

### 7. Build Offline Dataset (Optional) 💾

If you want a reusable prep bundle (for future no-key workflows), open:

- [http://127.0.0.1:8000/dataset-builder](http://127.0.0.1:8000/dataset-builder)

This dedicated page is separate from calibration runs. It executes preparation only (traffic snapshot + OSM + SUMO network + map matching) and stores files under:

- `demandify_datasets/<dataset_name>/`

Each dataset includes `data/traffic_data_raw.csv`, `data/observed_edges.csv`, `data/map.osm`, `sumo/network.net.xml`, and `dataset_meta.json`.

`dataset_meta.json` includes a computed data quality block (`score`, `label`, `recommendation`, and metrics) to help you decide whether a dataset is strong enough for offline calibration.
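Before sharing or importing a dataset folder, it can help to confirm that the files listed above are all present. The following is a minimal sketch, not part of demandify itself; `missing_dataset_files` is a hypothetical helper, and the relative paths are copied from the list above.

```python
from pathlib import Path

# Relative paths listed in this README for each offline dataset.
EXPECTED_FILES = (
    "data/traffic_data_raw.csv",
    "data/observed_edges.csv",
    "data/map.osm",
    "sumo/network.net.xml",
    "dataset_meta.json",
)


def missing_dataset_files(dataset_dir):
    """Return the expected files that are absent from `dataset_dir`."""
    root = Path(dataset_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

For example, `missing_dataset_files("demandify_datasets/my_dataset")` returns an empty list when the folder is complete (`my_dataset` is a placeholder name).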

Bundled snapshot previews:

| Den Haag (`den_haag_v1`) | Krakow (`krakow_v1`) | Eskisehir (`eskisehir_v1`) |
|---|---|---|
| ![Den Haag offline network](https://github.com/aonurakman/demandify/blob/main/demandify/offline_datasets/den_haag_v1/plots/network.png?raw=true) | ![Krakow offline network](https://github.com/aonurakman/demandify/blob/main/demandify/offline_datasets/krakow_v1/plots/network.png?raw=true) | ![Eskisehir offline network](https://github.com/aonurakman/demandify/blob/main/demandify/offline_datasets/eskisehir_v1/plots/network.png?raw=true) |

#### `demandify run` Parameters

| Argument | Type | Default | Description |
|----------|------|---------|-------------|
| `bbox` | String | Req* | Bounding box (`west,south,east,north`) |
| `--import` | String | None | Use an offline dataset by name (or `source:name`) |
| `--name` | String | Auto | Custom Run ID/Name |
| `--non-interactive` | Flag | off | Disable prompts (auto-approve and exit when pipeline completes) |
| `--window` | Int | 15 | Simulation duration (min) |
| `--warmup` | Int | 5 | Warmup duration before scoring (min) |
| `--seed` | Int | 42 | Random seed |
| `--step-length`| Float | 1.0 | SUMO step length (seconds) |
| `--workers` | Int | Auto (CPU count) | Parallel GA workers |
| `--tile-zoom` | Int | 12 | TomTom vector flow tile zoom |
| `--pop` | Int | 50 | GA Population size |
| `--gen` | Int | 20 | GA Generations |
| `--mutation`| Float | 0.5 | Mutation rate (per individual) |
| `--crossover`| Float| 0.7 | Crossover rate |
| `--elitism` | Int | 2 | Top individuals to keep |
| `--sigma` | Int | 20 | Mutation magnitude (step size) |
| `--indpb` | Float | 0.3 | Mutation probability (per gene) |
| `--max-ods` | Int | 50 | Max OD pairs to generate |
| `--bin-size` | Float | 5 | Time bin size in minutes |
| `--initial-population` | Int | 1000 | Target initial number of vehicles (controls sparse initialization) |

\* `bbox` is required in create mode. In import mode, use `--import` and do not pass `bbox`.

`Import` mode constraints:
- positional `bbox` is rejected
- `--tile-zoom` is rejected
- all calibration controls (seed, GA params, warmup/window, etc.) remain available
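The create/import rules above can be mirrored in a small pre-flight check when you generate CLI calls from your own scripts. This is a sketch under stated assumptions: both function names are hypothetical, and the `west < east` / `south < north` ordering check is an illustrative sanity test rather than demandify's documented validation.

```python
def parse_bbox(text):
    """Parse 'west,south,east,north' into four floats and sanity-check the order."""
    parts = [float(p) for p in text.split(",")]
    if len(parts) != 4:
        raise ValueError("bbox needs exactly four comma-separated numbers")
    west, south, east, north = parts
    # Assumption: boxes do not cross the antimeridian.
    if not (west < east and south < north):
        raise ValueError("expected west < east and south < north")
    return west, south, east, north


def check_mode_arguments(bbox=None, import_name=None, tile_zoom=None):
    """Apply the create/import rules described above; return the mode name."""
    if import_name is not None:
        if bbox is not None:
            raise ValueError("import mode rejects a positional bbox")
        if tile_zoom is not None:
            raise ValueError("import mode rejects --tile-zoom")
        return "import"
    if bbox is None:
        raise ValueError("create mode requires a bbox")
    parse_bbox(bbox)
    return "create"
```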

#### Advanced GA Dynamics

These parameters control diversity mechanisms and adaptive behavior in the genetic algorithm, addressing local optima stagnation and trip count explosion.

| Argument | Type | Default | Description |
|----------|------|---------|-------------|
| `--immigrant-rate` | Float | 0.03 | Fraction of random individuals injected per generation (0–1) |
| `--elite-top-pct` | Float | 0.1 | Defines the size of the top-by-`E` elite slice per generation: `n=max(1, elite_top_pct * population)` |
| `--stagnation-patience` | Int | 20 | Generations without improvement before mutation boost activates |
| `--stagnation-boost` | Float | 1.5 | Multiplier for mutation sigma and rate during stagnation |
| `--checkpoint-interval` | Int | 10 | Save best-individual checkpoint artifacts every N generations |
| `--assortative-mating` | Flag | off | Explicitly enable assortative mating |
| `--no-assortative-mating` | Flag | off | Disable assortative mating (dissimilar parent pairing, on by default) |
| `--deterministic-crowding` | Flag | off | Explicitly enable deterministic crowding |
| `--no-deterministic-crowding` | Flag | off | Disable deterministic crowding (diversity-preserving replacement, on by default) |

All advanced dynamics are **enabled by default** with conservative values. For most use cases, the defaults work well. You can disable features via the corresponding `--no-*` flags or explicitly force-enable them with `--assortative-mating` / `--deterministic-crowding`.
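To make the interplay of `--stagnation-patience` and `--stagnation-boost` concrete, here is a rough sketch of a stagnation-triggered mutation schedule. It is not demandify's implementation: the class name, the "boost once the stale counter reaches the patience value" rule, and the cap of the rate at 1.0 are all assumptions made for illustration. The default values come from the tables above.

```python
class MutationSchedule:
    """Track stagnation and scale mutation sigma/rate while it lasts."""

    def __init__(self, sigma=20.0, rate=0.5, patience=20, boost=1.5):
        self.base_sigma, self.base_rate = sigma, rate
        self.patience, self.boost = patience, boost
        self.best = float("inf")  # lower fitness (error) is better
        self.stale = 0            # generations since the last improvement

    def update(self, best_fitness):
        """Feed the generation's best fitness; return (sigma, rate, boosted)."""
        if best_fitness < self.best:
            # Improvement: remember it and drop back to the base values.
            self.best, self.stale = best_fitness, 0
        else:
            self.stale += 1
        boosted = self.stale >= self.patience
        factor = self.boost if boosted else 1.0
        # Assumption: the rate is capped at 1.0 because it is a probability.
        return self.base_sigma * factor, min(1.0, self.base_rate * factor), boosted
```

With the defaults, a boosted generation would mutate with sigma 30 and rate 0.75 until the best fitness improves again.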

## How It Works

demandify follows a multi-stage pipeline:

1. **Validate inputs** - Check mode/parameters and feasibility
2. **Preparation**:
   - `Create`: fetch traffic + OSM, build network, match edges
   - `Import`: load/copy network + observed traffic files from offline dataset
3. **Initialize demand** - Select routable OD pairs (lane-permission aware) and time bins
4. **Calibrate demand** - Run GA to optimize vehicle counts
5. **Export scenario** - Generate `demand.csv`, `trips.xml`, config, and report

### Advanced GA Dynamics

The genetic algorithm includes several mechanisms to avoid common pitfalls like local optima stagnation and trip count explosion:

- **MAE-elite lexicographic parent selection**: Individuals are first ordered by `mae`, and the top slice (`n=max(1, elite_top_pct * population)`) becomes the parent pool. Inside that slice, demandify prefers lower `failure_rate`, then lower genome magnitude, and uses exact `mae` only as the final deterministic tie-break. This keeps MAE as the sole global optimization target while still preferring more reliable and smaller-demand candidates within the MAE frontier.
- **Random immigrants**: A small fraction of completely random individuals is injected each generation to maintain genetic diversity and escape local optima.
- **Assortative mating**: Parents are paired by dissimilarity (by genome magnitude) for crossover, promoting exploration of the search space.
- **Deterministic crowding**: Offspring compete with similar parents for population slots, preserving niche diversity.
- **Adaptive mutation boost**: If the best fitness stagnates for `--stagnation-patience` generations, mutation sigma and rate are temporarily multiplied by `--stagnation-boost`. They reset automatically when improvement resumes.

The final return policy uses that same **MAE-elite lexicographic order** across generations, so the returned individual always comes from the strongest MAE frontier while still preferring lower failure rate and smaller total demand within that frontier.
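The MAE-elite lexicographic order described above can be sketched in a few lines. This is an illustration only: the `Candidate` class and `elite_slice` function are hypothetical, and truncating `elite_top_pct * population` to an integer is an assumption, since the formula in the text does not state how the product is rounded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    mae: float           # mean absolute error of the individual
    failure_rate: float  # failure rate reported for the individual
    magnitude: float     # genome magnitude


def elite_slice(population, elite_top_pct=0.1):
    """Return the parent pool: top slice by MAE, re-ranked inside the slice."""
    # Assumption: the slice size is truncated to an integer.
    n = max(1, int(elite_top_pct * len(population)))
    by_mae = sorted(population, key=lambda c: c.mae)[:n]
    # Inside the slice: failure rate, then magnitude, then MAE as tie-break.
    return sorted(by_mae, key=lambda c: (c.failure_rate, c.magnitude, c.mae))
```

The first element of the returned list is the candidate this ordering prefers; MAE alone decides who enters the slice, while the secondary criteria only reorder candidates that are already inside it.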

The calibration report includes plots for **genotypic diversity** (mean pairwise L2 distance) and **phenotypic diversity** (σ of fitness values) across generations, along with markers indicating when mutation boost was active.
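For readers who want to compute comparable numbers on their own populations, the two diversity measures named above can be sketched with the standard library. The function names are hypothetical, and using the population (rather than sample) standard deviation is an assumption.

```python
import itertools
import math
import statistics


def genotypic_diversity(genomes):
    """Mean pairwise Euclidean (L2) distance between genomes."""
    pairs = list(itertools.combinations(genomes, 2))
    if not pairs:
        return 0.0  # fewer than two genomes: no pairs to compare
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def phenotypic_diversity(fitness_values):
    """Standard deviation of the population's fitness values."""
    # Assumption: population (not sample) standard deviation.
    return statistics.pstdev(fitness_values)
```

A genotypic diversity near zero means the genomes have nearly collapsed onto one point, which is the situation the immigrant and mutation-boost mechanisms are meant to counteract.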

### Variability & Consistency
      
demandify seeds all internal stochastic operations (OD selection, GA evolution), but **perfect reproducibility is not guaranteed**: traffic microsimulation in SUMO is inherently chaotic, and real-time data inputs change between runs.

Seeding ensures *consistency* (runs look similar), but small timing differences in OS scheduling or dynamic routing decisions can still lead to divergent outcomes. Traffic snapshots are cached in 5-minute buckets; using the same seed, bbox, and time bucket will reproduce `demand.csv` and SUMO randomness.
      
### Caching
      
demandify caches:
- OSM extracts (by bbox)
- SUMO networks (by bbox + conversion params)
- Traffic snapshots (by bbox + provider + style + tile zoom + 5-minute timestamp bucket)
- Map matching results (by bbox + network key + provider + timestamp bucket)
      
Cache location: `~/.demandify/cache/`

Clear cache: `demandify cache clear`
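As an illustration of how a content-addressed key with a 5-minute time bucket can work, consider the sketch below. It is not demandify's actual key derivation: the function names, the JSON payload layout, the rounding of coordinates, and the choice of SHA-256 are all assumptions.

```python
import hashlib
import json
from datetime import datetime, timezone

BUCKET_SECONDS = 5 * 60  # traffic snapshots are bucketed to 5-minute windows


def time_bucket(moment):
    """Floor a timezone-aware datetime to the start of its bucket (epoch seconds)."""
    epoch = int(moment.timestamp())
    return epoch - epoch % BUCKET_SECONDS


def traffic_cache_key(bbox, provider, style, tile_zoom, moment):
    """Hash the key fields into a stable hex digest (hypothetical layout)."""
    payload = {
        "bbox": [round(v, 6) for v in bbox],
        "provider": provider,
        "style": style,
        "tile_zoom": tile_zoom,
        "bucket": time_bucket(moment),
    }
    # sort_keys makes the serialization, and therefore the hash, deterministic.
    blob = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


now = datetime.now(timezone.utc)
print(time_bucket(now))
```

Two requests for the same area that fall into the same 5-minute bucket map to the same key, so the second one can be served from the cache.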

## CLI Commands

```bash
# Start web server (default)
demandify

# Run headless calibration
demandify run "west,south,east,north"

# Run headless from bundled offline dataset
demandify run --import krakow_v1

# Check system requirements
demandify doctor

# Set TomTom API key (CLI)
demandify set-key YOUR_KEY_HERE

# Clear cache
demandify cache clear

# Show version
demandify --version
```

## Output Files

Each run creates a folder with:

- **`demand.csv`** - Travel demand with exact schema:
  - `ID`, `origin link id`, `destination link id`, `departure timestep`
- **`trips.xml`** - SUMO trips file
- **`network.net.xml`** - SUMO network
- **`scenario.sumocfg`** - SUMO configuration (ready to run; ignores route errors by default)
- **`observed_edges.csv`** - Observed traffic speeds
- **`run_meta.json`** - Complete run metadata
- **`report.html`** - Calibration report with visualizations
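The `demand.csv` schema above is simple enough to inspect with the standard library. The following sketch assumes a comma-delimited file with a header row and numeric departure timesteps; `departures_per_timestep` is a hypothetical helper, not part of demandify.

```python
import csv
from collections import Counter

# Column names as listed in the schema above.
COLUMNS = ["ID", "origin link id", "destination link id", "departure timestep"]


def departures_per_timestep(csv_path):
    """Count trips per departure timestep, checking the header first."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        missing = [c for c in COLUMNS if c not in (reader.fieldnames or [])]
        if missing:
            raise ValueError(f"demand.csv is missing columns: {missing}")
        # int(float(...)) accepts both "12" and "12.0" style values.
        return Counter(int(float(row["departure timestep"])) for row in reader)
```

Summing the returned counts gives the total number of trips in the calibrated scenario.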

Run the scenario:
```bash
cd demandify_runs/run_<timestamp>
sumo-gui -c scenario.sumocfg
```

## Configuration

### API Keys

Four ways to provide your TomTom API key:

1. **Web UI**: Paste in the form (saved to `~/.demandify/config.json`)
2. **Environment variable**: `export TOMTOM_API_KEY=your_key`
3. **`.env` file**: Copy `.env.example` to `.env` and add your key
4. **CLI**: `demandify set-key YOUR_KEY` stores it in `~/.demandify/config.json`

## Development

```bash
# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Format code
black demandify/

# Lint
ruff check demandify/
```

## License

MIT

## Acknowledgments

- **SUMO**: [Eclipse SUMO](https://eclipse.dev/sumo/)
- **TomTom**: [Traffic Flow API](https://developer.tomtom.com/traffic-api)
- **OpenStreetMap**: [© OpenStreetMap contributors](https://www.openstreetmap.org/copyright)

## Citation

If you use this software for your research, please consider using the citation below.
The canonical metadata for GitHub's "Cite this repository" is in `CITATION.cff`.

```bibtex
@software{demandify_2026,
  author       = {{Ahmet Onur Akman}},
  title        = {{demandify: Calibrate SUMO traffic scenarios against real-world congestion using genetic algorithms}},
  year         = {2026},
  version      = {0.0.5},
  publisher    = {PyPI},
  url          = {https://pypi.org/project/demandify/},
  repository   = {https://github.com/aonurakman/demandify},
  doi          = {10.5281/zenodo.18877067}
}
```
