Metadata-Version: 2.3
Name: mbari-aidata
Version: 1.73.0
Summary: Command line tool to do extract, transform, load and download operations on AI data for a number of projects at MBARI that require detection, clustering or classification workflows.
License: Apache
Author: Danelle Cline
Author-email: dcline@mbari.org
Requires-Python: >=3.10,<3.13
Classifier: License :: Other/Proprietary License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Dist: albucore (>=0.0.16)
Requires-Dist: albumentations (>=1.4.15)
Requires-Dist: click (>=8.1.7)
Requires-Dist: ephem (>=4.1.6)
Requires-Dist: moviepy (>=2.1.1)
Requires-Dist: numpy (>=1.24,<2)
Requires-Dist: opencv-contrib-python (==4.11.0.86)
Requires-Dist: pandas (>=2.0,<3.0)
Requires-Dist: pascal-voc-writer (>=0.1.4)
Requires-Dist: piexif (>=1.1.3)
Requires-Dist: pyarrow (>=15.0,<16.0)
Requires-Dist: pytz (>=2024.1)
Requires-Dist: redis (==5.0.7)
Requires-Dist: requests (>=2.32.3)
Requires-Dist: tator (==1.2.3)
Requires-Dist: torch (==2.2.0)
Requires-Dist: torchvision (==0.17.0)
Requires-Dist: tqdm (==4.67.1)
Requires-Dist: transformers (>=4.42.4,<5.0.0)
Description-Content-Type: text/markdown

[![MBARI](https://www.mbari.org/wp-content/uploads/2014/11/logo-mbari-3b.png)](http://www.mbari.org)
[![semantic-release](https://img.shields.io/badge/%20%20%F0%9F%93%A6%F0%9F%9A%80-semantic--release-e10079.svg)](https://github.com/semantic-release/semantic-release)
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![downloads](https://img.shields.io/pypi/dm/mbari-aidata)](https://pypistats.org/packages/mbari-aidata)

__mbari-aidata__
---
A command line tool to do extract, transform, load (ETL) and download operations
on AI data for a number of projects at MBARI that require detection, clustering or classification
workflows.  This tool is designed to work with [Tator](https://www.tator.io/), a web based
platform for video and image annotation and data management and [Redis](https://redis.io/) 
queues for ingesting data from real-time workflows.

More documentation and examples are available at [https://docs.mbari.org/internal/ai/data](https://docs.mbari.org/internal/ai/data/).
 
## 🚀 Features
* 🧠 Object Detection/Clustering Integration: Loads detection/classification/clustering output from SDCAT formatted results.
* Flexible Data Export: Downloads from Tator into machine learning formats like COCO, CIFAR, or PASCAL VOC.
* Crop localization into optimized datasets for training classification models.
* Real-Time Uploads: Pushes localizations to [Tator](https://www.tator.io/) via [Redis](https://redis.io/glossary/redis-queue/) queues for real-time workflows.
* Metadata Extraction: Parses images metadata such as GPS/time/date through a plugin-based system (extractors).
* Duplicate Detection & flexible media references: Supports duplicate media load checks with the --check-duplicates flag. 
* Images or video can be loaded through a web server without needing to upload or move them from your internal NFS project mounts (e.g. Thalassa)
* Or, video can be uploaded without needing to figure out how to do the video transcoding required for web viewing.
* Video tracks can be uploaded into Tator for training and evaluation, including time-decay weighted tracks.
* Multiple data versions can be downloaded into a single dataset for training or evaluation using the --version flag with comma separated values, otherwise localization data is combined through Non-Maximum Suppression (NMS) to remove duplicate boxes.
* Augmentation Support: Augment VOC datasets with [Albumentations](https://albumentations.ai/) to boost your object detection model performance.

## Requirements
- Python 3.10 or higher
- A Tator API token and (optional) Redis password for the .env file. Contact the MBARI AI team for access.
- 🐳Docker for development and testing only, but it can also be used instead of a local Python installation.
- For video loads, you will need to install the required Python packages listed in the `requirements.txt` file, [ffmpeg](https://ffmpeg.org/), and the mp4dump tool from [https://www.bento4.com/](https://www.bento4.com/downloads/)

## 📦 Installation 
Install as a Python package:

```shell
pip install mbari-aidata
```
 
Create the .env file with the following contents in the root directory of the project:

```text
TATOR_TOKEN=your_api_token
REDIS_PASSWORD=your_redis_password
ENVIRONMENT=testing or production
```

Create a configuration file in the root directory of the project:
```bash
touch config_cfe.yaml
```
Or, use the project specific configuration from our docs server at
https://docs.mbari.org/internal/ai/projects/


This file will be used to configure the project data, such as mounts, plugins, and database connections.
```bash
aidata download --version Baseline --labels "Diatoms, Copepods" --config https://docs.mbari.org/internal/ai/projects/uav-901902/config_uav.yml
```

⚙️Example configuration file:
```yaml
# config_cfe.yml
# Config file for CFE project production
mounts:
  - name: "image"
    path: "/mnt/CFElab"
    host: "https://mantis.shore.mbari.org"
    nginx_root: "/CFElab"

  - name: "video"
    path: "/mnt/CFElab"
    host: "https://mantis.shore.mbari.org"
    nginx_root: "/CFElab"


plugins:
  - name: "extractor"
    module: "mbari_aidata.plugins.extractors.tap_cfe_media"
    function: "extract_media"

redis:
  host: "doris.shore.mbari.org"
  port: 6382

vss:
  project: "902111-CFE"
  model: "google/vit-base-patch16-224"

tator:
  project: "902111-CFE"
  host: "https://mantis.shore.mbari.org"
  image:
    attributes:
      iso_datetime: #<-------Required for images
        type: datetime
      depth:
        type: float
  video:
    attributes:
      iso_start_datetime:  #<-------Required for videos
        type: datetime
  box:
    attributes:
      Label:
        type: string
      score:
        type: float
      cluster:
        type: string
      saliency:
        type: float
      area:
        type: int
      exemplar:
        type: bool
  tdwa_box: #<-------Optional for videos track loads
    attributes:
      Label:
        type: string
      score:
        type: float
      verified:
        type: bool
      similarity_score:
        type: float
  track_state:  #<-------Optional for videos track loads
    attributes:
      Label:
        type: string
      max_score:
        type: float
      num_frames:
        type: int
      verified:
        type: bool
    
```

## Track Format

Tracks are video frames with a label and score for each detected object in the frame, along with a tracker_id to link
detections across frames into tracks.

Track data is stored in a compressed .tar.gz file with the -tracks.tar.gz, e.g.

```shell
aidata load tracks --input video-tracks/tracks.tar.gz --dry-run --config config_cfe.yml
```

video-tracks/tracks.tar.gz. This compressed file contains a structure like:

The detections.csv file contains the detections for each frame, e.g.

| frame | tracker_id | label    | score           | x                   | y                   | xx                  | xy                  |
|-------|------------|----------|-----------------------|----------------------|----------------------|----------------------|----------------------|
| 3     | 2          | Copepod  | 0.6826763153076172    | 0.7003568708896637   | 0.4995344939055266   | 0.7221783697605133   | 0.5368460761176215   |
| 3     | 1          | Copepod  | 0.7094097137451172    | 0.2693319320678711   | 0.6148265485410337   | 0.29686012864112854  | 0.6434915330674913   |
| 3     | 3          | Detritus | 0.2776843011379242    | 0.2693319320678711   | 0.6148265485410337   | 0.29686012864112854  | 0.6434915330674913   |
| 4     | 1          | Copepod  | 0.49819645285606384   | 0.2683655321598053   | 0.6125818323206018   | 0.2965434789657593   | 0.6455737643771702   |

Metadata about video is in the metadata.json file, e.g.
```json
{ "video_name": "video.mp4", 
  "video_path": "/data/input/video.mp4", 
  "processed_at": "2025-11-15T13:37:35.997007Z", 
  "total_frames": 12000, 
  "video_width": 1920, 
  "video_height": 1080, 
  "video_fps": 10, 
  "total_detections": 3000, 
  "unique_tracks": 148, 
  "detection_threshold": 0.15, 
  "min_track_frames": 5, 
  "slice_size": 800, 
  "rfdetr_model": "/mnt/models/best/checkpoint_best_total.pth" }
```

The tracks.csv file contains the tracks for each frame, e.g.

| tracker_id | label         | first_frame | last_frame | frame_count | avg_score |
|------------|---------------|-------------|------------|-------------|----------------------|
| 2          | Detritus | 3           | 37         | 35          | 0.3780171153800829   |
| 1          | Copepod  | 3           | 36         | 31          | 0.5898609180604258   |
| 3          | Copepod  | 3           | 37         | 34          | 0.5619616565458914   |

## 🐳 Docker usage
A docker version is also available at `mbari/aidata:latest` or `mbari/aidata:latest:cuda-124`.
For example, to download data from version Baseline using the docker image:

```shell
docker run -it --rm -v $(pwd):/mnt mbari/aidata:latest aidata download --version Baseline --labels "Diatoms, Copepods" --config config_cfe.yml
```

to download multiple versions
```shell
docker run -it --rm -v $(pwd):/mnt mbari/aidata:latest aidata download --version Baseline,ver0 --labels "Diatoms, Copepods" --config config_cfe.yml`
```
 
## Commands

* `aidata download --help` -  Download data, such as images, boxes, into various formats for machine learning e.g. COCO, CIFAR, or PASCAL VOC format. Augmentation supported for VOC exported data using Albumentations.
* `aidata load --help` -  Load data, such as images, boxes, or clusters into either a Postgres or REDIS database
* `aidata db --help` -  Commands related to database management
* `aidata transform --help` - Commands related to transforming downloaded data
* `aidata  -h` - Print help message and exit.
 
Source code is available at [github.com/mbari-org/aidata](https://github.com/mbari-org/aidata/). 

## Development
See the [Development Guide](https://github.com/mbari-org/aidata/blob/main/DEVELOPMENT.md) for more information on how to set up the development environment or the [justfile](justfile)  
 
🗓️ Last updated: 2026-01-03

