Metadata-Version: 2.4
Name: sensor-routing
Version: 0.2.4
Summary: Optimal routing for CRNS mobile sensor data collection
Home-page: https://codebase.helmholtz.cloud/ufz/tb5-smm/met/wg7/sensor-routing
Author: Can Topaclioglu
Author-email: Can Topaclioglu <can.topaclioglu@ufz.de>
Maintainer-email: Can Topaclioglu <can.topaclioglu@ufz.de>
License: EUPL-1.2
Project-URL: Homepage, https://codebase.helmholtz.cloud/ufz/tb5-smm/met/wg7/sensor-routing
Project-URL: Repository, https://codebase.helmholtz.cloud/ufz/tb5-smm/met/wg7/sensor-routing
Project-URL: Issues, https://codebase.helmholtz.cloud/ufz/tb5-smm/met/wg7/sensor-routing/-/issues
Keywords: sensor-routing,CRNS,cosmic-ray-neutron-sensing,geospatial,routing-optimization,network-analysis,path-finding
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: European Union Public Licence 1.2 (EUPL 1.2)
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: GIS
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=2.2.0
Requires-Dist: pandas>=2.2.3
Requires-Dist: geopandas>=1.0.1
Requires-Dist: osmnx>=2.0.0
Requires-Dist: shapely>=2.0.6
Requires-Dist: pyproj>=3.7.0
Requires-Dist: pyogrio>=0.10.0
Requires-Dist: networkx>=3.4.2
Requires-Dist: scipy>=1.11.0
Requires-Dist: scikit-learn>=1.3.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: annotated-types>=0.6.0
Requires-Dist: tqdm>=4.66.0
Requires-Dist: requests>=2.32.3
Requires-Dist: h5py>=3.8.0
Provides-Extra: dev
Requires-Dist: pytest>=7.4.0; extra == "dev"
Requires-Dist: pytest-cov>=4.1.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: flake8>=6.0.0; extra == "dev"
Requires-Dist: mypy>=1.5.0; extra == "dev"
Dynamic: author
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-python

# Sensor Routing

[![Python Version](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/license-EUPL--1.2-green.svg)](https://joinup.ec.europa.eu/collection/eupl/eupl-text-eupl-12)

Optimal routing for mobile Cosmic Ray Neutron Sensing (CRNS) data collection. The package computes routes over real-world road networks that maximize information value while minimizing travel distance and time.

## Features

- 🗺️ **Geospatial Route Optimization**: Calculate optimal routes using real-world road networks from OpenStreetMap
- 📊 **Information Value Maximization**: Balance between spatial coverage and information gain
- 🔄 **Multiple Routing Strategies**: Support for both standard and economic routing approaches
- 🎯 **Point Mapping**: Map sensor locations to road networks with advanced filtering
- 📈 **Benefit Calculation**: Evaluate information value of different route segments
- 🛣️ **Path Finding**: Dijkstra-based algorithms with custom cost functions
- 🔍 **Hull Point Extraction**: Optimize sensor placement using convex hull analysis
- ✅ **Input Validation** (v0.2.3+): Automatic validation of CSV files with delimiter/header detection
- 🔧 **Flexible Format Support**: Handle comma, tab, and whitespace-separated files seamlessly
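
As a rough illustration of Dijkstra-based path finding with a custom cost function, the sketch below uses NetworkX on a toy graph. The `benefit` edge attribute, the `make_cost` helper, and the cost formula are assumptions made for this example; they are not the cost function used inside `sensor_routing`.

```python
import networkx as nx

# Toy road graph. "length" is in metres; "benefit" is a hypothetical
# information score in [0, 1] invented for this example.
G = nx.Graph()
G.add_edge("A", "B", length=100.0, benefit=0.0)
G.add_edge("B", "D", length=100.0, benefit=0.0)
G.add_edge("A", "C", length=120.0, benefit=0.9)
G.add_edge("C", "D", length=120.0, benefit=0.9)


def make_cost(information_weight):
    """Return an edge-cost callable that discounts informative edges."""
    def cost(u, v, data):
        # Costs stay non-negative because weight and benefit are both in [0, 1].
        return data["length"] * (1.0 - information_weight * data["benefit"])
    return cost


# With weight 0 the cost is plain distance; a higher weight favours
# the longer but more informative detour through C.
shortest = nx.dijkstra_path(G, "A", "D", weight=make_cost(0.0))
informative = nx.dijkstra_path(G, "A", "D", weight=make_cost(0.5))
print(shortest, informative)
```

NetworkX accepts a callable for `weight`, so the trade-off between distance and information can be changed without rebuilding the graph.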

## Installation

### From PyPI (recommended)

```bash
pip install sensor-routing
```

### From source

```bash
git clone https://codebase.helmholtz.cloud/ufz/tb5-smm/met/wg7/sensor-routing.git
cd sensor-routing
pip install -e .
```

### Development installation

```bash
pip install -e ".[dev]"
```

## Quick Start

### Command Line Interface

The package provides a command-line interface for the full pipeline:

```bash
sensor-routing --wd /path/to/work_directory
```

### Python API

#### Simplified API (v0.2.3+)

```python
from sensor_routing import sensor_routing_pipeline

# Run the complete pipeline with automatic validation
sensor_routing_pipeline(work_dir="/path/to/work_directory")
```

#### Modular API

```python
from sensor_routing import point_mapping, benefit_calculation, path_finding, route_finding

# Map points to road network
pm_output = point_mapping.point_mapping(
    points_path="input/points.csv",
    osm_path="input/osm_data_transformed.geojson",
    output_path="output"
)

# Calculate benefits
bc_output = benefit_calculation.benefit_calculation(
    pm_output=pm_output,
    output_path="output"
)

# Find optimal path
pf_output = path_finding.path_finding(
    bc_output=bc_output,
    output_path="output"
)

# Generate final route
route = route_finding.route_finding(
    pf_output=pf_output,
    output_path="output"
)
```

## Requirements

- Python 3.12 or higher
- See `requirements.txt` for full dependency list

### Key Dependencies

- **NumPy** & **Pandas**: Numerical and data processing
- **GeoPandas**: Geospatial data handling
- **OSMnx**: OpenStreetMap network analysis
- **NetworkX**: Graph-based routing algorithms
- **Shapely**: Geometric operations
- **SciPy** & **scikit-learn**: Scientific computing and machine learning
- **h5py**: MATLAB v7.3 HDF5 file support
- **Pydantic**: Data validation

## Project Structure

```
sensor_routing/
├── point_mapping.py          # Map sensor points to road network
├── benefit_calculation.py    # Calculate information value
├── path_finding.py           # Find optimal paths
├── route_finding.py          # Generate final routes
├── hull_points_extraction.py # Extract convex hull points
├── econ_mapping.py           # Economic point mapping variant
├── econ_benefit.py           # Economic benefit calculation variant
├── econ_paths.py             # Economic path finding variant
├── econ_route.py             # Economic route finding variant
└── full_pipeline_cli.py      # Command-line interface
```

## Usage

### Working Directory Structure

The pipeline expects a working directory with the following structure:

```
work_dir/
├── input/
│   ├── osm_data_transformed.geojson  # OpenStreetMap road network
│   ├── predictors.csv                # Environmental predictors (required)
│   └── memberships.csv               # Fuzzy cluster memberships (required)
├── transient/                        # Intermediate pipeline outputs
└── debug/                            # Debug outputs (optional, written when debug mode is enabled)
```

### Input Data Format

#### Road Network
**osm_data_transformed.geojson**: GeoJSON file containing road network from OpenStreetMap

#### Environmental Predictors (Required)
**predictors.csv**: CSV file with environmental variables and coordinates

**Format Requirements:**
- **Delimiters**: Automatically detected (comma, tab, or whitespace)
- **Headers**: Optional (auto-detected based on content)
- **First row validation**: The first data row must contain numeric values; a text first row (e.g., "Longitude", "Latitude") is treated as a header
- **Column order**: `X, Y, Mask, Predictor1, Predictor2, ...`
- **Coordinates**: Must be in the same CRS as OSM data (e.g., EPSG:25832)
- **NaN values**: Allowed in predictor columns, excluded from validation

**Example (comma-separated with header):**
```csv
X,Y,Mask,BulkDensity,Clay,DEM,SOC,SandFraction,Slope
619500.0,5786500.0,0.0,132.95830,222.12509,145.67,2.34,0.456,1.23
619500.0,5786250.0,0.0,131.80805,215.62871,143.21,2.11,0.432,1.45
```

**Example (space-separated, no header):**
```
6.1950000e+05   5.7865000e+06   0.0000000e+00   1.3295830e+02   2.2212509e+02   ...
6.1950000e+05   5.7862500e+06   0.0000000e+00   1.3180805e+02   2.1562871e+02   ...
```

**Column Definitions:**
- **Column 1 (X)**: Easting coordinate
- **Column 2 (Y)**: Northing coordinate  
- **Column 3 (Mask)**: Urban mask (0=rural, 1=urban)
- **Columns 4+**: Environmental predictor values (e.g., soil moisture, temperature, elevation)
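
The delimiter and header rules above can be approximated in a few lines of pandas. This is a minimal sketch, not the package's own parser: `detect_delimiter`, `has_header`, and `read_table` are hypothetical helpers, and the detection heuristics (first line only, comma before tab before whitespace) are assumptions.

```python
import io

import pandas as pd


def detect_delimiter(first_line):
    """Pick a pandas separator: comma, tab, or a whitespace regex."""
    if "," in first_line:
        return ","
    if "\t" in first_line:
        return "\t"
    return r"\s+"


def has_header(first_line, sep):
    """Treat the first row as a header if any field is not a number."""
    fields = first_line.strip().split(None if sep == r"\s+" else sep)
    for field in fields:
        try:
            float(field)  # "nan" also parses as a float
        except ValueError:
            return True
    return False


def read_table(text):
    """Parse CSV text with an auto-detected delimiter and header."""
    first_line = text.splitlines()[0]
    sep = detect_delimiter(first_line)
    header = 0 if has_header(first_line, sep) else None
    return pd.read_csv(io.StringIO(text), sep=sep, header=header)


with_header = read_table("X,Y,Mask,Clay\n619500.0,5786500.0,0.0,222.1\n")
no_header = read_table("6.195e+05   5.7865e+06   0.0e+00   2.221e+02\n")
print(list(with_header.columns), no_header.shape)
```

Files without a header get integer column labels from pandas, so downstream code in such a sketch would address columns by position rather than by name.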

#### Cluster Memberships (Required)
**memberships.csv**: CSV file with fuzzy cluster membership probabilities

**Format Requirements:**
- **Delimiters**: Automatically detected (comma, tab, or whitespace)
- **Headers**: Optional (auto-detected)
- **Column order**: `X, Y, Cluster1, Cluster2, ...`
- **Coordinates**: Must match coordinates in `predictors.csv`
- **Membership values**: Probabilities between 0 and 1 (should sum to 1.0 per row)
- **NaN values**: Not allowed (will raise validation error)

**Example:**
```csv
X,Y,Cluster1,Cluster2,Cluster3
619500.0,5786500.0,0.75,0.15,0.10
619500.0,5786250.0,0.20,0.65,0.15
```
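
To check a membership table against the rules above before running the pipeline, something like the following could be used. It is a sketch only: `check_memberships` is a hypothetical helper, and the tolerance of `1e-6` is an assumed value, since the tolerance used by the package is not stated here.

```python
import numpy as np
import pandas as pd


def check_memberships(df, tol=1e-6):
    """Raise ValueError on NaN, out-of-range values, or rows not summing to 1."""
    values = df.iloc[:, 2:].to_numpy(dtype=float)  # columns after X, Y
    if np.isnan(values).any():
        raise ValueError("memberships must not contain NaN values")
    if (values < 0).any() or (values > 1).any():
        raise ValueError("membership values must lie in [0, 1]")
    bad_rows = np.flatnonzero(np.abs(values.sum(axis=1) - 1.0) > tol)
    if bad_rows.size:
        raise ValueError(f"rows {bad_rows.tolist()} do not sum to 1")
    return True


memberships = pd.DataFrame(
    {
        "X": [619500.0, 619500.0],
        "Y": [5786500.0, 5786250.0],
        "Cluster1": [0.75, 0.20],
        "Cluster2": [0.15, 0.65],
        "Cluster3": [0.10, 0.15],
    }
)
ok = check_memberships(memberships)
```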

#### Input Validation (v0.2.3+)

The pipeline automatically validates input files:
- ✅ **Delimiter detection**: Comma, tab, or whitespace
- ✅ **Header detection**: Distinguishes numeric data from text headers
- ✅ **Coordinate validation**: Ensures membership coordinates exist in predictors
- ✅ **Membership validation**: Checks probabilities sum to 1.0 (within tolerance)
- ✅ **NaN handling**: Validates NaN counts and locations
- ✅ **Flexible matching**: Allows predictors to have more rows than memberships

**Validation Output Example:**
```
✓ Parsed 16928 rows from predictors.csv (6866 contain NaN values)
✓ Parsed 10062 rows from memberships.csv
✓ Coordinate validation: All 10062 membership coordinates found in predictors
```
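
The coordinate check can be reproduced outside the pipeline with a pandas left merge. The helper name `missing_coordinates` is hypothetical, and the sketch assumes exact floating-point equality of `X` and `Y`, which may be stricter or looser than the package's own comparison.

```python
import pandas as pd


def missing_coordinates(predictors, memberships):
    """Return membership rows whose (X, Y) pair is absent from predictors."""
    merged = memberships.merge(
        predictors[["X", "Y"]].drop_duplicates(),
        on=["X", "Y"],
        how="left",
        indicator=True,  # adds a "_merge" column: "both" or "left_only"
    )
    return merged.loc[merged["_merge"] == "left_only", ["X", "Y"]]


predictors = pd.DataFrame(
    {
        "X": [619500.0, 619500.0, 619750.0],
        "Y": [5786500.0, 5786250.0, 5786500.0],
    }
)
# Predictors may have more rows than memberships.
memberships = predictors.iloc[:2].assign(Cluster1=[0.75, 0.20], Cluster2=[0.25, 0.80])

missing = missing_coordinates(predictors, memberships)
print(len(missing))
```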

**Requirements:**
- Every coordinate in `memberships.csv` must also appear in `predictors.csv`; `predictors.csv` may contain additional rows
- NaN values in any predictor automatically mark that point as urban (mask=1)
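
The NaN rule above can be sketched with NumPy as follows. `apply_nan_mask` is a hypothetical helper written for illustration, assuming the `X, Y, Mask, Predictor1, ...` column order described earlier; it is not the package's implementation.

```python
import numpy as np


def apply_nan_mask(table):
    """Set Mask (column 3) to 1 wherever any predictor (columns 4+) is NaN."""
    table = np.array(table, dtype=float)  # work on a copy
    has_nan = np.isnan(table[:, 3:]).any(axis=1)
    table[has_nan, 2] = 1.0
    return table


rows = [
    [619500.0, 5786500.0, 0.0, 132.9583, 222.12509],
    [619500.0, 5786250.0, 0.0, np.nan, 215.62871],
]
masked = apply_nan_mask(rows)
print(masked[:, 2])
```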

**Migration**: Convert your `.mat` files to `predictors.csv` using standard tools like MATLAB's `writetable()` or Python's pandas.

### Pipeline Parameters

The pipeline can be configured via `full_pipeline_parameters.json`:

```json
{
    "CRS": "EPSG:25832",
    "EPSG": 25832,
    "information_weight": 0.5,
    "start_node": null,
    "end_node": null,
    "max_iterations": 100,
    "enable_module_debug": false
}
```
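
A parameters file like this can also be generated from Python. The sketch below uses only the standard library and the keys shown above; `write_parameters` is a hypothetical helper, and the directory where the pipeline expects the file is not specified here, so the target directory is left to the caller.

```python
import json
import tempfile
from pathlib import Path

# Defaults copied from the example parameters file.
DEFAULTS = {
    "CRS": "EPSG:25832",
    "EPSG": 25832,
    "information_weight": 0.5,
    "start_node": None,
    "end_node": None,
    "max_iterations": 100,
    "enable_module_debug": False,
}


def write_parameters(directory, **overrides):
    """Write full_pipeline_parameters.json with defaults plus overrides."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    params = {**DEFAULTS, **overrides}
    path = Path(directory) / "full_pipeline_parameters.json"
    path.write_text(json.dumps(params, indent=4))
    return path


# A temporary directory stands in for the real location of the file.
with tempfile.TemporaryDirectory() as tmp:
    path = write_parameters(tmp, information_weight=0.7)
    loaded = json.loads(path.read_text())
```

Rejecting unknown keys catches misspelled parameter names early instead of letting them be silently ignored.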

### Debug Mode

Enable debug output by setting `ENABLE_MODULE_DEBUG = True` in `full_pipeline_cli.py` or by setting `"enable_module_debug": true` in the parameters file. Debug mode will:
- Print detailed progress information
- Save intermediate results to `debug/` directory
- Show progress bars for long-running operations

## Development

### Running Tests

```bash
pytest test/
```

### Code Formatting

```bash
black sensor_routing/
flake8 sensor_routing/
```

### Type Checking

```bash
mypy sensor_routing/
```

## Contributing

Contributions are welcome! Please:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Merge Request

## Documentation

For detailed documentation on specific modules:

- **Point Mapping**: See `HOW_TO_USE_FOR_ROUTING.md`
- **Benefit Calculation**: See `IMPROVED_INFORMATION_VALUE_EXPLANATION.md`
- **Debug Control**: See `DEBUG_CONTROL_GUIDE.md`
- **Information Weights**: See `INFORMATION_WEIGHT_RANGES.md`

## Citation

If you use this software in your research, please cite:

```bibtex
@software{sensor_routing,
  author = {Topaclioglu, Can},
  title = {Sensor Routing: Optimal routing for CRNS mobile sensor data collection},
  year = {2024},
  url = {https://codebase.helmholtz.cloud/ufz/tb5-smm/met/wg7/sensor-routing}
}
```

## License

This project is licensed under the European Union Public Licence 1.2 (EUPL-1.2). See the [LICENSE](LICENSE) file for details.

## Authors

- **Can Topaclioglu** - *Initial work* - [UFZ](https://www.ufz.de/)

## Acknowledgments

- Helmholtz Centre for Environmental Research (UFZ)
- Department of Monitoring and Exploration Technologies

## Support

For questions, issues, or feature requests:
- Open an issue on [GitLab](https://codebase.helmholtz.cloud/ufz/tb5-smm/met/wg7/sensor-routing/-/issues)
- Contact: can.topaclioglu@ufz.de

## Changelog

### Version 0.2.4 (Current)
- ✨ **NEW**: Exported input file constants (`PREDICTOR_FILENAME`, `MEMBERSHIP_FILENAME`, `OSM_FILENAME`, `PARAMETERS_FILENAME`)
- ✨ **NEW**: Added `OSM_FILENAME` constant for standardized road network file naming
- ✨ **NEW**: Added `DESCRIPTION_OSM` with format requirements
- ✨ **NEW**: OSM file validation in `sensor_routing_pipeline()`
- 📝 Updated documentation to use correct OSM filename (`osm_data_transformed.geojson`)
- 📝 All filename constants now accessible via public API

### Version 0.2.3
- ✨ **NEW**: Simplified API with `sensor_routing_pipeline(work_dir)` function
- ✨ **NEW**: Comprehensive input validation with automatic delimiter detection
- ✨ **NEW**: CSV support with auto-detection for comma, tab, and whitespace delimiters
- ✨ **NEW**: Automatic header detection (numeric vs text)
- ✨ **NEW**: Coordinate validation between predictor and membership files
- ✨ **NEW**: Flexible validation allowing predictors to have more rows than memberships
- 📝 Standardized input filenames: `predictors.csv`, `memberships.csv`
- 🔧 Updated `hull_points_extraction.py` to use pandas for CSV parsing
- 📦 Updated test data to use CSV format
- 📝 Enhanced documentation with detailed file format requirements

### Version 0.2.2
- ✨ Automatic urban mask generation from NaN values in predictors
- 📦 Updated dependencies for PyPI distribution
- 🐛 Fixed `summary_kwargs` bug in `hull_points_extraction`

### Version 0.2.1
- ✨ Added comprehensive debug control system
- ✨ Migrated to Pydantic V2
- ✨ Added economic routing variants
- 🐛 Fixed multiple debug output issues
- 📦 Prepared for PyPI distribution
- 📝 Improved documentation

### Version 0.1.15
- Initial release with basic routing functionality
