Metadata-Version: 2.4
Name: fomo-edge-ai
Version: 0.0.2
Summary: FOMO - Lightweight Point Localization models.
Author-email: Bence Danko <bencejdanko@gmail.com>
License: Apache-2.0
Project-URL: Homepage, https://github.com/fomo-edge-ai/fomo
Project-URL: Repository, https://github.com/fomo-edge-ai/fomo
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: NOTICE
Requires-Dist: numpy>=1.19.0
Requires-Dist: Pillow>=8.0.0
Requires-Dist: torch>=1.13.0
Requires-Dist: torchvision>=0.11.0
Requires-Dist: PyYAML>=6.0
Requires-Dist: requests>=2.25.0
Requires-Dist: opencv-python>=4.11.0.86
Requires-Dist: scipy>=1.7.0
Requires-Dist: safetensors>=0.4.0
Dynamic: license-file

# FOMO: Fast Object Localization

FOMO is a lightweight, MobileNetV2-based point localization model for edge AI applications. Instead of regressing bounding boxes, FOMO maps the input image to a coarse grid (for example, a 192x192 input to a 24x24 grid, an 8x reduction per side) and predicts class probabilities and coordinates for each grid cell. This keeps the model small and fast, which suits counting tasks such as headcount, queue monitoring, and item counting.
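
To make the grid idea concrete, here is a minimal sketch of how per-cell scores could be turned back into pixel locations. It is not the package's own decoding code: the function name `grid_to_pixels`, the `threshold` value, and the choice to report each active cell's center are assumptions made for illustration only.

```python
import numpy as np


def grid_to_pixels(heatmap, input_size=192, threshold=0.5):
    """Return (x, y) pixel centers of grid cells scoring above `threshold`.

    Hypothetical helper: assumes a square input and a single-class score map.
    """
    grid_h, grid_w = heatmap.shape
    stride_y = input_size / grid_h  # 192 / 24 = 8 pixels per cell
    stride_x = input_size / grid_w
    rows, cols = np.nonzero(heatmap > threshold)
    xs = (cols + 0.5) * stride_x  # +0.5 selects the center of the cell
    ys = (rows + 0.5) * stride_y
    return np.stack([xs, ys], axis=1)


# One confident cell at row 3, column 10 of a 24x24 grid
heatmap = np.zeros((24, 24), dtype=np.float32)
heatmap[3, 10] = 0.9
points = grid_to_pixels(heatmap)
print(points)  # one (x, y) pair: x=84.0, y=28.0
```

With a stride of 8 pixels, each cell covers an 8x8 patch of the input, so cell centers are the coarsest location estimate the grid alone can give.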

## Installation

Install the package via PyPI:

```bash
pip install fomo-edge-ai
```

Or install from source in editable mode:

```bash
git clone https://github.com/fomo-edge-ai/fomo.git
cd fomo
pip install -e .
```

## Example: Training and Inference

The following example demonstrates how to train a model on the SJSU Headcount dataset and perform inference on validation images.

### 1. Download the Dataset

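The dataset should follow the standard YOLO directory layout. You can download it from Hugging Face. The snippet below uses `huggingface_hub`, which is not among this package's declared dependencies, so install it separately (`pip install huggingface_hub`):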

```python
from pathlib import Path
from huggingface_hub import snapshot_download

DATASET_ROOT = Path("sjsu-headcount-scene-1")
snapshot_download(
    repo_id="bdanko/sjsu-headcount-scene-1",
    repo_type="dataset",
    local_dir=str(DATASET_ROOT),
    local_dir_use_symlinks=False
)
```
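
For orientation, a YOLO label file holds one object per line as `class cx cy w h`, with coordinates normalized to the range 0 to 1. The sketch below shows how such a line could be reduced to a pixel-space center point. How this package consumes labels internally is not documented here, so the helper `yolo_line_to_point` and the sample values are purely illustrative.

```python
def yolo_line_to_point(line, img_w, img_h):
    """Parse a 'class cx cy w h' label line and return (class_id, x_px, y_px).

    Hypothetical helper: the box width and height are ignored because only
    the center point is needed for point localization.
    """
    parts = line.split()
    class_id = int(parts[0])
    cx, cy = float(parts[1]), float(parts[2])
    return class_id, cx * img_w, cy * img_h


# Made-up label line for a 640x480 image
label = "0 0.5 0.25 0.1 0.2"
point = yolo_line_to_point(label, img_w=640, img_h=480)
print(point)  # (0, 320.0, 120.0)
```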

### 2. Training the Model

Initialize a new model and run training:

```python
import torch
from fomo.models.fomo.model import FOMO

device = "cuda" if torch.cuda.is_available() else "cpu"

# Initialize model
model = FOMO(model_path=None, size="m", nb_classes=1, device=device)

# Start training (allow_experimental=True enables the experimental training flag)
results = model.train(
    allow_experimental=True,
    data="sjsu-headcount-scene-1/data.yaml",
    epochs=10,
    batch=16,
    lr0=3e-4,
    device=device,
    project="runs/fomo",
    name="sjsu_headcount_m",
)
```

### 3. Inference and Visualization

Load the best checkpoint and run predictions:

```python
import torch
from PIL import Image
from fomo import FOMO

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load trained checkpoint
model = FOMO("runs/fomo/sjsu_headcount_m/weights/best.pt", device=device)

# Load image and predict
pil_img = Image.open("sjsu-headcount-scene-1/valid/images/sample.jpg").convert("RGB")
result = model.predict(pil_img, conf=0.80)

# Retrieve point coordinates (mapped back to original image resolution)
if result.points is not None and len(result.points) > 0:
    xy = result.points.xy.cpu().numpy()       # (N, 2) pixel coordinates
    confs = result.points.conf.cpu().numpy()  # (N,) confidence scores

    for (px, py), conf in zip(xy, confs):
        print(f"Detected object at: x={px:.1f}, y={py:.1f} with confidence {conf:.2f}")
```
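
For counting tasks, the number of returned points is the count. If you want to try a stricter confidence cutoff without re-running the model, a small sketch like the following may help. It works on a plain array of confidence scores such as the `confs` array above; the helper name `count_detections` and the sample scores are assumptions, not part of the package.

```python
import numpy as np


def count_detections(confs, min_conf=0.8):
    """Count how many confidence scores are at or above `min_conf`."""
    confs = np.asarray(confs, dtype=float)
    return int((confs >= min_conf).sum())


# Stand-in scores; in practice use result.points.conf.cpu().numpy()
sample_confs = [0.95, 0.81, 0.42, 0.88]
count = count_detections(sample_confs)
print(count)  # 3
```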

## License

Code is licensed under the Apache License 2.0. Pre-trained weights are hosted externally and may carry separate licensing terms; check the individual weight repositories for details.
