Metadata-Version: 2.4
Name: media-analyzer
Version: 0.1.0
Summary: Analyze videos and images with machine learning methods, EXIF data, and other file-based information.
Project-URL: Homepage, https://github.com/RuurdBijlsma/media-analyzer
Project-URL: Repository, https://github.com/RuurdBijlsma/media-analyzer
Project-URL: Documentation, https://ruurdbijlsma.github.io/media-analyzer/media_analyzer.html#MediaAnalyzer
Author-email: Ruurd Bijlsma <ruurd@bijlsma.dev>
License-Expression: MIT
Requires-Python: <3.13,>=3.10
Requires-Dist: accelerate>=1.3.0
Requires-Dist: bitsandbytes>=0.45.0
Requires-Dist: insightface>=0.7.3
Requires-Dist: meteostat>=1.6.8
Requires-Dist: networkx>=3.4.2
Requires-Dist: onnxruntime>=1.20.1
Requires-Dist: openai>=1.59.8
Requires-Dist: opencv-python>=4.11.0.86
Requires-Dist: pillow-avif-plugin>=1.4.6
Requires-Dist: pillow>=11.1.0
Requires-Dist: pyexiftool>=0.5.6
Requires-Dist: pytesseract>=0.3.13
Requires-Dist: reverse-geocode>=1.6.5
Requires-Dist: scikit-learn>=1.6.1
Requires-Dist: scipy>=1.15.1
Requires-Dist: timezonefinder>=6.5.7
Requires-Dist: torch>=2.5.1
Requires-Dist: torchvision>=0.20.1
Requires-Dist: transformers>=4.48.0
Requires-Dist: types-pytz>=2024.2.0.20241221
Requires-Dist: types-tqdm>=4.67.0.20241221
Description-Content-Type: text/markdown

# Media Analyzer

Media Analyzer is a Python library designed to analyze media files, providing insights into their
content and metadata. It supports various functionalities, including image classification,
captioning, optical character recognition (OCR), and facial recognition.

## Features

- **Image Classification**: Identify objects, activities, animals, and events present in images.
- **Image Captioning**: Generate descriptive captions for images using models like BLIP and
  LLM-based captioners.
- **Optical Character Recognition (OCR)**: Extract text from images to identify documents, receipts,
  menus, and more.
- **Facial Recognition**: Detect faces in images and provide details such as age, sex, and facial
  landmarks.

## Installation

To install Media Analyzer, use pip:

```bash
pip install media-analyzer
```

### Requirements

The following external tools must be installed and available on your `PATH`:

* ExifTool: https://exiftool.org/
* Tesseract OCR: https://tesseract-ocr.github.io/tessdoc/Installation.html
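
To confirm both tools are reachable before running an analysis, a small check along these lines may help. It is a minimal sketch that assumes the executables are named `exiftool` and `tesseract` (their usual names); `find_missing_tools` is a hypothetical helper, not part of Media Analyzer.

```python
import shutil

# Assumed executable names; adjust if your installation uses different ones.
REQUIRED_TOOLS = ("exiftool", "tesseract")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of the given tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = find_missing_tools()
if missing:
    print("Missing from PATH: " + ", ".join(missing))
else:
    print("All required tools were found on PATH.")
```

`shutil.which` resolves a command the same way a shell would, so an empty result suggests that subprocess calls to these tools should also be able to find them.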

## Usage

Here's a basic example of how to use Media Analyzer:

```python
from media_analyzer import MediaAnalyzer
from pathlib import Path

analyzer = MediaAnalyzer()
media_file = Path("image.jpg")
result = analyzer.photo(media_file)

# Access analysis results
print(result.image_data)
print(result.frame_data)
```

## Configuration

The `AnalyzerSettings` class allows you to customize various aspects of the analysis:

- `media_languages`: List of languages for OCR to consider.
- `captions_provider`: The provider for image captioning (e.g., 'BLIP', 'LLM').
- `enable_text_summary`: Enable or disable text summarization.
- `enable_document_summary`: Enable or disable document summarization.
- `document_detection_threshold`: Confidence threshold for document detection.
- `face_detection_threshold`: Confidence threshold for face detection.
- `enabled_file_modules`: List of file modules to enable (e.g., exif data, gps, weather
  detection).
- `enabled_visual_modules`: List of visual modules to enable (e.g., 'classification',
  'captioning', 'ocr', 'facial_recognition').
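
As a rough sketch of how these options might be combined, the snippet below collects a few overrides in a plain dictionary and passes them to `AnalyzerSettings` as keyword arguments. Several things here are assumptions made for illustration: that the class accepts these names as keyword arguments, that it can be imported from the top-level `media_analyzer` package, and the specific values shown (language codes and thresholds). Check the full documentation for the actual signature.

```python
# Illustrative overrides keyed by the setting names listed above.
# The values are placeholders, not recommended defaults.
settings_overrides = {
    "media_languages": ["en", "nl"],       # assumed language-code format
    "enable_text_summary": False,
    "enable_document_summary": False,
    "document_detection_threshold": 0.5,   # assumed to be a 0..1 confidence
    "face_detection_threshold": 0.7,       # assumed to be a 0..1 confidence
}

try:
    # Assumption: AnalyzerSettings is exported at package level and takes
    # the documented setting names as keyword arguments.
    from media_analyzer import AnalyzerSettings
except ImportError:
    settings = None  # library not installed; nothing to construct
else:
    settings = AnalyzerSettings(**settings_overrides)

print(settings)
```

Keeping the overrides in a dictionary makes it easy to load them from a file or environment before constructing the settings object.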

Full docs can be found
at https://ruurdbijlsma.github.io/media-analyzer/media_analyzer.html#MediaAnalyzer.