Metadata-Version: 2.4
Name: ir-yolo-tracker
Version: 0.1.2
Summary: YOLO-based detection of bright infrared marker balls in uint16 frames.
Author: GGN_2015
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=2.0
Requires-Dist: opencv-python>=4.0
Requires-Dist: tqdm>=4.0
Requires-Dist: ultralytics>=8.0
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Dynamic: license-file

# IRYoloTracker

IRYoloTracker detects bright circular infrared marker balls in single-channel
two-dimensional `numpy.uint16` frames using a single-class YOLO model.

The public API accepts grayscale infrared intensity images only. If you use
standard YOLO weights, the normalized gray image is copied into three identical
channels internally because standard Ultralytics YOLO models expect 3-channel
input. This does not add color information. If you train a true one-channel YOLO
model, pass `model_input_channels=1`.
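
The channel replication described above can be sketched as follows. This is only an illustration: the helper name `to_three_channel_uint8` is hypothetical, and the min-max normalization to 8 bits is an assumption, since the package's actual normalization is not described here.

```python
import numpy as np


def to_three_channel_uint8(frame: np.ndarray) -> np.ndarray:
    """Min-max normalize a uint16 frame to uint8 and repeat it across 3 channels."""
    lo, hi = int(frame.min()), int(frame.max())
    span = max(hi - lo, 1)  # avoid division by zero on a constant frame
    # Integer arithmetic in uint32 keeps the scaling exact (65535 * 255 fits).
    gray = ((frame.astype(np.uint32) - lo) * 255 // span).astype(np.uint8)
    # The three channels are identical copies; no color information is created.
    return np.repeat(gray[:, :, None], 3, axis=2)


frame = np.array([[0, 1000], [30000, 65535]], dtype=np.uint16)
image = to_three_channel_uint8(frame)
print(image.shape, image.dtype)
```

Because all three channels hold the same values, a standard 3-channel model sees exactly the intensity information that was in the original frame.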

## Install

Activate the virtual environment in this repository, then install the package
in editable mode with the development extras:

```bash
python -m pip install -e ".[dev]"
```

For GPU inference on Windows, install the CUDA PyTorch wheels in the same venv:

```bash
python -m pip install --force-reinstall torch==2.12.0+cu130 torchvision==0.27.0+cu130 --index-url https://download.pytorch.org/whl/cu130
```

## Python API

The package includes a bootstrap YOLO model trained from high-confidence
bright-circle pseudo-labels in `infrared_data`. You can use it directly:

```python
import numpy as np

from ir_yolo_tracker import IRMarkerTracker

frame: np.ndarray = np.load("frame.npy")

tracker = IRMarkerTracker(confidence_threshold=0.35, device="cpu")

detections = tracker.detect_dicts(frame)
print(detections)
```

For one-off detection, use the convenience function:

```python
import numpy as np

from ir_yolo_tracker import detect_marker_dicts

frame: np.ndarray = np.load("frame.npy")
detections = detect_marker_dicts(frame, confidence_threshold=0.35, device="cpu")
```

For repeated frames, create one tracker and reuse it:

```python
from ir_yolo_tracker import IRMarkerTracker, preload_pickle_frames

tracker = IRMarkerTracker(device="cpu")
frames = preload_pickle_frames("infrared_data", progress=True)

for path, frame in frames:
    detections = tracker.detect(frame)
    print(path.name, detections)
```

For a no-model, low-latency path, use the pure threshold detector. It returns
the same `MarkerDetection` objects and dictionary shape as the YOLO API:

```python
from ir_yolo_tracker import ThresholdMarkerDetector

threshold_detector = ThresholdMarkerDetector()
detections = threshold_detector.detect_dicts(frame)
```

The one-off threshold helper mirrors `detect_marker_dicts`:

```python
from ir_yolo_tracker import detect_marker_dicts_threshold

detections = detect_marker_dicts_threshold(frame)
```

Batch-style helpers are also available:

```python
from ir_yolo_tracker import detect_marker_batch, iter_pickle_detections

all_results = detect_marker_batch(frames, device="cpu")

for path, detections in iter_pickle_detections("infrared_data", device="cpu"):
    ...
```

To draw results on a frame:

```python
import cv2

from ir_yolo_tracker import draw_detections

image = draw_detections(frame, detections, status="IR marker detections")
cv2.imshow("detections", image)
cv2.waitKey(0)
```

Pass `device="cuda"` or `device=0` to request GPU inference. If `device` is not
set, Ultralytics chooses the device.

The public API includes:

- `IRMarkerTracker`: reusable YOLO detector class.
- `ThresholdMarkerDetector`: reusable pure threshold detector.
- `detect_markers`, `detect_marker_dicts`: one-frame convenience functions.
- `detect_markers_threshold`, `detect_marker_dicts_threshold`: one-frame threshold helpers.
- `detect_marker_batch`, `iter_marker_detections`: multi-frame helpers.
- `detect_marker_batch_threshold`, `iter_marker_detections_threshold`: threshold multi-frame helpers.
- `load_pickle_frame`, `iter_pickle_frames`, `preload_pickle_frames`, `iter_pickle_detections`: pickle sequence helpers.
- `draw_detections`: OpenCV visualization helper.
- `BrightCircleDetector`: non-YOLO bright-circle detector for pseudo-labeling/debugging.

For production accuracy, train with human-verified labels and pass your own
`best.pt`:

```python
tracker = IRMarkerTracker("runs/detect/ir_marker_ball/weights/best.pt", device="cpu")
```

`ThresholdMarkerDetector` is the lowest-latency path and works best when the
marker balls are the brightest compact circular objects in the frame. Its
defaults use a high raw-intensity percentile (`threshold_percentile=99.8`) plus
area, aspect-ratio, and circularity filters. If your camera exposure is fixed,
setting `absolute_threshold=...` with `threshold_percentile=None` removes the
per-frame percentile calculation.
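
To make the area, aspect-ratio, and circularity filters concrete, here is a minimal sketch of how such checks are commonly written. The function names and every threshold value below are hypothetical; the detector's real defaults are not listed in this document. Circularity uses the standard definition `4 * pi * area / perimeter**2`, which is 1.0 for an ideal circle.

```python
import math


def circularity(area: float, perimeter: float) -> float:
    """Return 4*pi*A/P^2: 1.0 for an ideal circle, lower for elongated blobs."""
    if perimeter <= 0:
        return 0.0
    return 4.0 * math.pi * area / (perimeter * perimeter)


def passes_shape_filters(
    area: float,
    perimeter: float,
    width: float,
    height: float,
    min_area: float = 4.0,          # hypothetical threshold
    max_area: float = 2000.0,       # hypothetical threshold
    max_aspect_ratio: float = 1.5,  # hypothetical threshold
    min_circularity: float = 0.7,   # hypothetical threshold
) -> bool:
    """Accept a blob only if it is compact, roughly square, and round."""
    if width <= 0 or height <= 0:
        return False
    aspect_ratio = max(width, height) / min(width, height)
    return (
        min_area <= area <= max_area
        and aspect_ratio <= max_aspect_ratio
        and circularity(area, perimeter) >= min_circularity
    )


radius = 5.0
disc_ok = passes_shape_filters(
    math.pi * radius**2, 2 * math.pi * radius, 2 * radius, 2 * radius
)
streak_ok = passes_shape_filters(80.0, 84.0, 40.0, 2.0)  # thin 40x2 reflection
print(disc_ok, streak_ok)
```

A round blob passes all three checks, while a long thin reflection is rejected by the aspect-ratio test alone.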

Each detection contains only the marker-ball class:

```python
[
    {
        "bbox_xyxy": [123.4, 205.1, 145.8, 228.0],
        "confidence": 0.91,
        "class_id": 0,
        "class_name": "ir_marker_ball",
        "center_xy": [134.6, 216.55],
    }
]
```
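
The sample values above are consistent with `center_xy` being the midpoint of `bbox_xyxy`. The sketch below shows that relationship; the helper `center_from_bbox` is hypothetical and is not part of the package.

```python
def center_from_bbox(bbox_xyxy: list[float]) -> list[float]:
    """Return the midpoint [x, y] of an [x1, y1, x2, y2] box."""
    x1, y1, x2, y2 = bbox_xyxy
    return [(x1 + x2) / 2.0, (y1 + y2) / 2.0]


detection = {
    "bbox_xyxy": [123.4, 205.1, 145.8, 228.0],
    "confidence": 0.91,
    "class_id": 0,
    "class_name": "ir_marker_ball",
    "center_xy": [134.6, 216.55],
}
center = center_from_bbox(detection["bbox_xyxy"])
print(center)
```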

The input frame must be:

- shape: `(height, width)` with any positive dimensions
- dtype: `numpy.uint16`
- channels: one intensity channel, not RGB/BGR
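
If you want to check frames before passing them in, a small pre-flight helper along these lines may help. It is a sketch of the three requirements listed above; `frame_problems` is a hypothetical name, and the package may report invalid input differently.

```python
import numpy as np


def frame_problems(frame) -> list[str]:
    """Return the reasons a frame violates the documented input requirements."""
    if not isinstance(frame, np.ndarray):
        return ["not a numpy.ndarray"]
    problems = []
    # A valid frame is two-dimensional (height, width) with no empty axis.
    if frame.ndim != 2 or 0 in frame.shape:
        problems.append(f"expected non-empty (height, width), got shape {frame.shape}")
    if frame.dtype != np.uint16:
        problems.append(f"expected dtype uint16, got {frame.dtype}")
    return problems


good = np.zeros((480, 640), dtype=np.uint16)
bad = np.zeros((480, 640, 3), dtype=np.uint8)  # 8-bit, three-channel image
print(frame_problems(good))
print(frame_problems(bad))
```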

The `image_size`/`--imgsz` value sets the size to which YOLO resizes images
internally for inference and training. Your input frames do not need to be
512x512.
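
As a rough illustration of why any frame size works, the sketch below computes a letterbox-style geometry: an aspect-preserving resize so the longer side fits the target size, with the remainder padded. The helper `letterbox_geometry` is hypothetical, and Ultralytics' exact preprocessing may differ in its padding details.

```python
def letterbox_geometry(height: int, width: int, image_size: int = 512):
    """Return (scale, (new_height, new_width), (pad_y, pad_x)) for a square target."""
    # One scale factor for both axes keeps the aspect ratio unchanged.
    scale = min(image_size / height, image_size / width)
    new_height = round(height * scale)
    new_width = round(width * scale)
    # Padding is split evenly between the two sides of each axis.
    pad_y = (image_size - new_height) / 2
    pad_x = (image_size - new_width) / 2
    return scale, (new_height, new_width), (pad_y, pad_x)


scale, resized, padding = letterbox_geometry(480, 640, image_size=512)
print(scale, resized, padding)
```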

## Command-Line Inference

```bash
python scripts/detect_npy.py frame.npy
```

## Infrared Video Example

The root-level example `example_infrared_video.py` plays `.pickle` frames from
`infrared_data` at 24 FPS and overlays marker-ball boxes and confidences. This
file is a repository example, not an installed package entry point. To use it,
clone the GitHub project and run it from the project root.

```bash
git clone https://github.com/GGN-2015/IRYoloTracker.git
cd IRYoloTracker
python -m pip install -e ".[dev]"
```

Run with the bundled bootstrap YOLO model:

```bash
python example_infrared_video.py
```

Run with custom trained weights:

```bash
python example_infrared_video.py --weights runs/detect/ir_marker_ball/weights/best.pt
```

Common options:

```bash
python example_infrared_video.py --device cuda
python example_infrared_video.py --device cpu
python example_infrared_video.py --fps 24 --scale 1.5
python example_infrared_video.py --lazy-load
python example_infrared_video.py --loop
python example_infrared_video.py --no-rescue-bright-circles
```

At startup the example preloads all pickle frames into memory and shows a
`tqdm` progress bar. This removes disk IO and pickle decoding from the playback
loop. Use `--lazy-load` to read frames during playback instead.
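
The trade-off between preloading and lazy loading can be sketched with the standard library alone. The helper names `iter_frames` and `preload_frames` are hypothetical and are not the package's own loaders; the sketch also assumes each `.pickle` file holds a single frame object.

```python
import pickle
import tempfile
from pathlib import Path


def iter_frames(directory):
    """Lazily yield (path, frame) pairs, decoding each pickle only when requested."""
    for path in sorted(Path(directory).glob("*.pickle")):
        with path.open("rb") as handle:
            # Only unpickle files you trust: pickle can execute arbitrary code.
            yield path, pickle.load(handle)


def preload_frames(directory):
    """Decode every frame up front so playback never waits on disk or pickle."""
    return list(iter_frames(directory))


# Demo with three tiny stand-in frames written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for index in range(3):
        with (Path(tmp) / f"frame_{index:04d}.pickle").open("wb") as handle:
            pickle.dump([[index]], handle)
    preloaded = preload_frames(tmp)

names = [path.name for path, _ in preloaded]
print(names)
```

Preloading trades memory for a steady playback loop, while the lazy generator keeps memory use flat at the cost of per-frame IO.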

The example defaults to `--conf 0.25` and enables a conservative
`YOLO+circle` rescue pass. YOLO still supplies the main predictions, but the
example also adds high-confidence bright circular blobs that do not overlap an
existing YOLO box. This helps recover marker balls that the bootstrap YOLO model
sees only at very low confidence. Use `--no-rescue-bright-circles` to view raw
YOLO output.
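
The rescue pass described above can be sketched roughly as follows. This is an assumption-laden illustration, not the example's actual code: `boxes_overlap`, `rescue_merge`, and the `min_circle_confidence` value are all hypothetical.

```python
def boxes_overlap(a, b) -> bool:
    """Return True if two [x1, y1, x2, y2] boxes share any area."""
    return min(a[2], b[2]) > max(a[0], b[0]) and min(a[3], b[3]) > max(a[1], b[1])


def rescue_merge(yolo, circles, min_circle_confidence=0.8):
    """Keep all YOLO detections and add confident, non-overlapping bright circles."""
    merged = list(yolo)
    for candidate in circles:
        if candidate["confidence"] < min_circle_confidence:
            continue  # not confident enough to rescue
        if any(boxes_overlap(candidate["bbox_xyxy"], det["bbox_xyxy"]) for det in yolo):
            continue  # YOLO already covers this location
        merged.append(candidate)
    return merged


yolo = [{"bbox_xyxy": [10, 10, 20, 20], "confidence": 0.6}]
circles = [
    {"bbox_xyxy": [12, 12, 22, 22], "confidence": 0.95},  # overlaps the YOLO box
    {"bbox_xyxy": [50, 50, 60, 60], "confidence": 0.90},  # rescued
    {"bbox_xyxy": [80, 80, 90, 90], "confidence": 0.40},  # too weak
]
merged = rescue_merge(yolo, circles)
print(merged)
```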

There is no default marker-count cap. Use `--max-detections N` only when your
camera setup has a known physical upper bound and you explicitly want to keep
the top-scoring detections.
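
Keeping the top-scoring detections amounts to a sort and a slice. The helper `keep_top_detections` below is a hypothetical sketch of that behavior and assumes detections are dictionaries with a `confidence` key, as in the output format shown earlier.

```python
def keep_top_detections(detections, max_detections=None):
    """Return the highest-confidence detections, or all of them if no cap is set."""
    if max_detections is None:
        return list(detections)
    ranked = sorted(detections, key=lambda det: det["confidence"], reverse=True)
    return ranked[:max_detections]


detections = [
    {"confidence": 0.42},
    {"confidence": 0.91},
    {"confidence": 0.67},
]
top_two = keep_top_detections(detections, max_detections=2)
uncapped = keep_top_detections(detections)
print([det["confidence"] for det in top_two])
```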

If no YOLO weights are found, the example falls back to a simple bright-circle
preview detector so you can still inspect the infrared frame sequence. In a
normal package checkout this should not happen because a bootstrap model is
bundled. Pass `--require-yolo` to make the example fail instead of falling back
when no YOLO weights are found.

Press Space to pause/resume, or press `Q`/Esc to quit.

## Bootstrap Model

The bundled model was trained from pseudo-labels generated by the preview
bright-circle detector with confidence `>= 0.70`:

```bash
python scripts/bootstrap_pseudo_dataset.py --confidence 0.70
python scripts/train.py --data configs/pseudo_ir_marker_dataset.yaml --model yolo11n.pt --epochs 12 --imgsz 512 --batch 8 --device cpu --project runs/detect --name pseudo_ir_marker_bootstrap
```

This is useful as a project bootstrap. Replace it with a model trained on
human-reviewed marker-ball labels before relying on it for critical measurement.

The video example can also rebuild a lower-threshold YOLO pseudo dataset from
`infrared_data`. It uses the current inference model plus the same conservative
bright-circle rescue pass. Low-confidence YOLO boxes below
`--pseudo-yolo-accept-conf` are kept only when bright-circle evidence confirms
the same location:

```bash
python example_infrared_video.py --rebuild-yolo-pseudo-dataset --device cuda --pseudo-conf 0.05
```

By default this writes images and labels to
`datasets/yolo_low_conf_ir_marker_ball` and writes
`configs/yolo_low_conf_ir_marker_dataset.yaml`.
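
The acceptance rule for low-confidence boxes can be sketched as below. Everything here is an assumption made for illustration: the function names, the default thresholds, and the use of IoU to decide that a bright circle "confirms the same location" are not taken from the example's source.

```python
def iou(a, b) -> float:
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def select_pseudo_labels(yolo, circles, accept_conf=0.25, min_iou=0.3):
    """Keep confident YOLO boxes; keep weaker ones only if a bright circle agrees."""
    kept = []
    for det in yolo:
        if det["confidence"] >= accept_conf:
            kept.append(det)
        elif any(iou(det["bbox_xyxy"], c["bbox_xyxy"]) >= min_iou for c in circles):
            kept.append(det)
    return kept


yolo = [
    {"bbox_xyxy": [10, 10, 20, 20], "confidence": 0.60},  # accepted outright
    {"bbox_xyxy": [40, 40, 50, 50], "confidence": 0.08},  # confirmed by a circle
    {"bbox_xyxy": [70, 70, 80, 80], "confidence": 0.06},  # no supporting circle
]
circles = [{"bbox_xyxy": [41, 41, 51, 51], "confidence": 0.9}]
kept = select_pseudo_labels(yolo, circles)
print([det["confidence"] for det in kept])
```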

## Training Data

Prepare a YOLO detection dataset with one class:

```text
datasets/ir_marker_ball/
  images/train/*.png
  images/val/*.png
  labels/train/*.txt
  labels/val/*.txt
```

Each label row must use normalized YOLO format:

```text
0 x_center y_center width height
```

The class id must always be `0`, named `ir_marker_ball`. Do not include labels
for non-marker objects.
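
To illustrate the normalized format, the sketch below converts one pixel-coordinate box into a label row. It assumes the box is given as `[x1, y1, x2, y2]` and uses a hypothetical 640x512 frame; `yolo_label_row` is an illustrative helper, not a function from this repository.

```python
def yolo_label_row(box_xyxy, image_width, image_height, class_id=0):
    """Format a pixel [x1, y1, x2, y2] box as a normalized YOLO label row."""
    x1, y1, x2, y2 = box_xyxy
    # YOLO stores the box center and size as fractions of the image dimensions.
    x_center = (x1 + x2) / 2.0 / image_width
    y_center = (y1 + y2) / 2.0 / image_height
    width = (x2 - x1) / image_width
    height = (y2 - y1) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


row = yolo_label_row([120, 210, 145, 235], image_width=640, image_height=512)
print(row)
```

All four values fall in the range 0 to 1 as long as the box lies inside the image.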

If your annotations are stored with pixel-coordinate boxes, you can build the
dataset from `.npy` frames:

```bash
python scripts/build_dataset.py --annotations annotations.json --frames-root frames --output datasets/ir_marker_ball
```

`annotations.json` can be either a list or an object containing `frames`:

```json
[
  {
    "image": "frame_0001.npy",
    "boxes": [[120, 210, 145, 235]]
  }
]
```

## Training

The default dataset config is `configs/ir_marker_dataset.yaml`.

```bash
python scripts/train.py --data configs/ir_marker_dataset.yaml --model yolo11n.pt --epochs 100 --imgsz 512
```

At inference time, the detector passes `classes=[0]` to YOLO and filters for
class `0` again afterward, so the returned output contains marker balls only.

## True One-Channel Model

For most workflows, standard YOLO training with grayscale data copied into
three equal channels is the simplest path. To use a custom YOLO model whose
first layer accepts one channel, create/train that model separately and run:

```python
tracker = IRMarkerTracker(
    "path/to/one_channel_best.pt",
    model_input_channels=1,
)
```

## Tests

```bash
python -m pytest
```
