Metadata-Version: 2.4
Name: athena-tracking
Version: 0.9
Summary: ATHENA: Automatically Tracking Hands Expertly with No Annotations
Author-email: Jonathan Michaels <jmichae@yorku.ca>, Daanish Mulla <daanish.mulla@gmail.com>
Project-URL: GitHub, https://github.com/neural-control-and-computation-lab/athena
Keywords: athena,hand tracking,pose estimation,machine learning
Classifier: Development Status :: 4 - Beta
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: ==3.12.*
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: av
Requires-Dist: mediapipe
Requires-Dist: numpy
Requires-Dist: matplotlib
Requires-Dist: tqdm
Requires-Dist: scipy
Requires-Dist: pymovie
Requires-Dist: toml
Requires-Dist: moviepy
Dynamic: license-file

# ATHENA Toolbox
ATHENA (Automatically Tracking Hands Expertly with No Annotations) is a Python-based toolbox designed to process multi-camera video recordings, extract 2D and 3D body and hand landmarks using MediaPipe, and perform triangulation and refinement of these landmarks. The toolbox provides a user-friendly GUI for selecting videos and configuring processing options.

<table>
  <tr>
    <td><img src="athena/logo.png" alt="logo" width="240"/></td>
    <td><img src="athena/gui.png" alt="GUI screenshot" width="400"/></td>
  </tr>
</table>

## Features
- Multi-Camera Processing: Handles multiple camera inputs for comprehensive analysis.
- MediaPipe Integration: Utilizes Google’s MediaPipe for extracting 2D landmarks of the body and hands.
- 3D Triangulation: Triangulates 2D landmarks from multiple cameras to reconstruct 3D landmarks.
- Smoothing and Refinement: Applies smoothing algorithms to refine the 3D landmarks.
- Parallel Processing: Supports multiprocessing to speed up the processing of large datasets.
- GUI for Easy Configuration: Provides a graphical user interface to select folders, recordings, and set processing options.
- Visualization: Offers options to save images and videos of the processed landmarks for visualization.

## Installation
### Prerequisites
- Operating System: Windows, macOS, or Linux.
- Python Version: Python 3.12
- Hardware Requirements:
- CPU: Multi-core processor recommended for parallel processing.
- GPU (Optional): NVIDIA GPU for accelerated processing (if GPU processing is enabled).

### Installation Steps

Use your package manager of choice to create an environment with Python 3.12. For example, using conda:
```console
conda create -n athena python=3.12
```
Activate the environment:
```console
conda activate athena
```
Then install the package:
```console
pip install athena-tracking
```

Or to get the latest version directly from GitHub:
```console
pip install git+https://github.com/neural-control-and-computation-lab/athena.git
```

## Usage
### 1.	Organize Your Videos
Place your synchronized video recordings in a main folder, structured as follows:
```console
main_folder/
├── videos/
│   ├── recording1/
│   │   ├── cam0.avi
│   │   ├── cam1.avi
│   │   └── ...
│   ├── recording2/
│   │   ├── cam0.avi
│   │   ├── cam1.avi
│   │   └── ...
└── calibration/
    ├── cam0.yaml
    ├── cam1.yaml
    └── ...
```

- Each recording folder should contain video files from multiple cameras (e.g., cam0.avi, cam1.avi).
- The calibration folder should contain calibration files (.yaml) for each camera.

For recording and calibrating your cameras, we highly recommend the [JARVIS Toolbox](https://jarvis-mocap.github.io/jarvis-docs/).

### 2.	Ensure Calibration Files are Correct
- Calibration files should be labelled with camera names that match recorded videos.
- Calibration files should contain intrinsic and extrinsic parameters for each camera.
- The calibration files are essential for accurate triangulation of 3D landmarks.

### 3. Running the Toolbox
Launch the GUI:
```console
athena
```

1. Select Main Folder and Recordings 
   - Click on the “Select Folder” button to choose your main folder containing the videos and calibration directories.
   - A new window will appear, allowing you to select specific recordings to process. Select the desired recordings and click “Select”.
2. Configure Processing Options
   - General Settings:
     - Fraction of Frames to Process: Slide to select the fraction of frames you want to process (from 0 to 1). Default is 1.0 (process all frames).
     - Number of Parallel Processes: Choose the number of processes for parallel processing. Default is the number of available CPU cores.
     - GPU Processing: Check this option to enable GPU acceleration (requires compatible NVIDIA GPU).
   - MediaPipe Processing:
     - Run Mediapipe: Check this option to run MediaPipe for extracting 2D landmarks.
     - Save Images: Save images with landmarks drawn (can consume significant storage).
     - Save Video: Save videos with landmarks overlaid (requires “Save Images” to be enabled).
     - Minimum Hand Detection & Tracking Confidence: Adjust the confidence threshold for hand detection (default is 0.9).
     - Minimum Pose Detection & Tracking Confidence: Adjust the confidence threshold for pose detection (default is 0.9).
   - Run Triangulation and Refinement:
     - Run Triangulation and Refinement: Check this option to perform 3D triangulation of landmarks and filtering.
     - All points: low-freq cutoff (Hz): Set the low-frequency cutoff for smoothing of all points (default is 10 Hz).
     - Save 3D Images: Save images of the 3D landmarks.
     - Save 3D Video: Save videos of the 3D landmarks.
     - Save Refined 2D Images: Save images of the refined 2D landmarks after triangulation.
     - Save Refined 2D Video: Save videos of the refined 2D landmarks
3. Start Processing
   - Click the “GO” button to start processing.
   - A progress window will appear, showing the processing progress and average FPS.

### 4. Output Folder Structure
After processing, the toolbox will create additional directories within your main folder:
```console
main_folder/
├── images/                # Contains images with landmarks drawn
├── imagesrefined/         # Contains refined images after triangulation
├── landmarks/             # Contains 2D and 3D landmark data
├── videos_processed/      # Contains processed videos with landmarks overlaid
└── ...                    # Original videos and calibration files
```

To create a single video that contains all videos (2D and 3D), you can use the 'montage' script and select the recording folder using the popup window:
```console
python -m athena.montage
```

### Troubleshooting
- MediaPipe Errors: Ensure that MediaPipe is correctly installed and compatible with your Python version. MediaPipe may have specific requirements, especially for GPU support.
- Calibration Mismatch: Verify that the number of calibration files matches the number of camera videos.
- High Memory Usage: Processing large videos or saving images and videos can consume significant memory and storage.
- Permission Issues: Ensure that you have read/write permissions for the directories where data is stored and processed.
- Conda Environment Activation: Always make sure that the Conda environment is activated before running the scripts.

### Contributing
Contributions are welcome! If you encounter issues or have suggestions for improvements, please create an issue or submit a pull request on GitHub.

### Frequently Asked Questions (FAQ)

Q: Do I need a GPU to run the ATHENA Toolbox?\
A: No, a GPU is not required and does not generally speed up processing, which is already highly parallelized. If you have a compatible NVIDIA GPU, you can enable GPU processing in the options. Ensure that your GPU drivers and CUDA toolkit are correctly installed.

Q: Can I process videos from only one camera?\
A: Yes, you can process videos from a single camera to extract 2D landmarks. However, triangulation to 3D landmarks requires synchronized videos from at least two cameras and their corresponding calibration files.

Q: How do I obtain the calibration files?\
A: Calibration files are generated using camera calibration techniques, often involving capturing images of a known pattern (like a chessboard) from different angles. You can use OpenCV’s calibration tools or other software to create these .yaml files.
ATHENA accepts calibration files created by JARVIS or Anipose without any modification.

Q: The processing is slow. How can I speed it up?\
A: You can increase the number of parallel processes if your CPU has more cores. Enabling GPU processing can also significantly speed up the landmark detection step. Additionally, processing a smaller fraction of frames can reduce computation time.

Q: Where can I find the output data after processing?\
A: Processed data, images, and videos are saved in the images/, imagesrefined/, landmarks/, and videos_processed/ directories within your main folder.
