Metadata-Version: 2.4
Name: martino-summarize
Version: 0.2.0
Summary: Local-first multi-source video summarization with any OpenAI-compatible LLM
Author: Martino Piaggi
License: MIT
Project-URL: Homepage, https://github.com/martinopiaggi/summarize
Project-URL: Documentation, https://summarize.martino.im
Project-URL: Repository, https://github.com/martinopiaggi/summarize
Project-URL: Issues, https://github.com/martinopiaggi/summarize/issues
Project-URL: Changelog, https://github.com/martinopiaggi/summarize/blob/main/CHANGELOG.md
Keywords: video,summarization,youtube,transcription,llm,notebooklm
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Multimedia :: Video
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: youtube-transcript-api>=1.2.3
Requires-Dist: pytubefix>=1.6.3
Requires-Dist: groq>=0.4.2
Requires-Dist: openai>=1.3.7
Requires-Dist: yt-dlp>=2025.4.30
Requires-Dist: aiohttp>=3.9.0
Requires-Dist: requests>=2.31.0
Requires-Dist: wget>=3.2
Requires-Dist: google-api-python-client>=2.0.0
Requires-Dist: google-auth-httplib2>=0.1.0
Requires-Dist: google-auth-oauthlib>=0.4.1
Requires-Dist: dropbox>=11.36.2
Requires-Dist: ffmpeg-python>=0.2.0
Requires-Dist: pyyaml>=6.0
Provides-Extra: whisper
Requires-Dist: openai-whisper; extra == "whisper"
Provides-Extra: litellm
Requires-Dist: litellm<1.87,>=1.80.0; extra == "litellm"
Provides-Extra: server
Requires-Dist: fastapi>=0.100.0; extra == "server"
Requires-Dist: uvicorn[standard]>=0.23.0; extra == "server"
Requires-Dist: python-multipart>=0.0.6; extra == "server"
Provides-Extra: all
Requires-Dist: openai-whisper; extra == "all"
Requires-Dist: litellm<1.87,>=1.80.0; extra == "all"
Requires-Dist: fastapi>=0.100.0; extra == "all"
Requires-Dist: uvicorn[standard]>=0.23.0; extra == "all"
Requires-Dist: python-multipart>=0.0.6; extra == "all"
Dynamic: license-file

# Video Summarizer

<p align="center">
    <img alt="sample" src="./summarize_sample.gif">
</p>

Transcribe and summarize videos from YouTube, Instagram, TikTok, Twitter, Reddit, Facebook, Google Drive, Dropbox, and local files.

Works with any OpenAI-compatible LLM provider, including locally hosted endpoints.


## Interfaces

Pick your poison:

| Interface | Command |
|-----------|---------|
| **CLI** | `python -m summarizer --source <source>` |
| **HTTP API** | `python -m summarizer serve` → `http://localhost:8000/docs` |
| **Streamlit GUI** | `python -m streamlit run app.py` |
| **Docker** | `docker compose up -d` → `http://localhost:8501` |
| **Agent Skill** | [`.agent/skills/summarize/SKILL.md`](./.agent/skills/summarize/SKILL.md) |


## Quick Start

```bash
git clone https://github.com/martinopiaggi/summarize.git
cd summarize
pip install -e .
python -m summarizer --source "https://youtube.com/watch?v=VIDEO_ID"
```

The summary is saved to `summaries/watch_YYYYMMDD_HHMMSS.md`.

## How It Works

```text
               +--------------------+
               |  Video URL/Path    |
               +---------+----------+
                         |
                         v
               +---------+----------+
               |    Source Type?    |
               +---------+----------+
                         |
        +----------------+--------------------+
        | visual flag                         |
        v                                     v
+-------+--------+                  +---------+----------+
|  Visual Mode   |                  | Transcript Cache   |-------------> HIT ---+
|  base64 / url  |                  +---------+----------+                      |  
+-------+--------+                            | MISS                            |  
        |                                     |                                 |  
        |                                     v                                 |  
        |         +-------+ +-------+ +-------+ +-------+                       |  
        |         |YouTube| |yt-dlp | | Local | |Dropbox|                       |  
        |         |       | |X.com  | | File  | |G.Drive|                       |  
        |         |       | |TikTok | |       | |       |                       |  
        |         |       | |etc.   | |       | |       |                       |  
        |         +---+---+ +---+---+ +---+---+ +---+---+                       |  
        |             |         |         |         |                           |  
        |             v         v         |         |                           |  
        |         +----+---+  +--+---+    |         |                           |  
        |         |Captions|  |Cobalt|    |         |                           |  
        |         | Exist? |  +--+---+    |         |                           |  
        |         +---+----+         |    |         |                           |  
        |          Yes  No           |    |         |                           |  
        |          +----+            |    +--------+--------------+             |  
        |            |               |                            |             |  
        |            +-------------->|                            v             |  
        |                            |                   +--------+--------+    |  
        |                            |                   |     Whisper     |    |  
        |                            |                   |    endpoint?    |    |  
        |                            |                   +--------+--------+    |  
        |                            |                            |             |  
        |                            |                +-----------+-----------+ |  
        |                            |                |                       | |  
        |                            |                |  Cloud Whisper Local  | |  
        |                            |                |                       | |  
        |                            |                +----------+------------+ |  
        |                            |                           |              |  
        |                            +------------------------|--+              |  
        |                                                     v                 |  
        |                                               store in cache          |  
        |                                                     |                 |  
        |                                                     +-----------------+  
        |                                                     |                    
        |                                                     |         Transcript 
        |                                                     |                    
        |                                                     v                        
        |                                summarizer.yaml -> +------------+----------+
        |                                 prompts.json  ->  |    Prompt + LLM       |
        |                                                   |    Merge              |
        |                                                   +------------+----------+
        |                                                            |
        |                                                            v
        v                                                   +------------+----------+
+-------+--------+                                          |                       |
| Vision-capable |                                          |          Output       |
|     model      |----------------------------------------->+                       |
+-------+--------+                                          +-----------------------+
        ^ 
        | 
        +----- prompts.json
```

- **Transcript path** (default): downloads audio/video, transcribes with Whisper or captions, caches the transcript, then summarizes with an LLM.
- **Visual path** (`--visual`): sends the video directly to a vision-capable model, skipping transcription. Uses the same prompts, provider config, and `.env` keys as the transcript path. Supports `base64` chunks (default) and `url` passthrough for YouTube.

## Documentation

Full docs live at [summarize.martino.im](https://summarize.martino.im).

## License

[MIT](LICENSE)
