Metadata-Version: 2.4
Name: consumo
Version: 0.10.0
Summary: Content consumption analyzer CLI
Author: Gabriel Santos de Souza
Author-email: Gabriel Santos de Souza <gabriel.santosdesouza@dcomp.ufs.br>
License-Expression: GPL-3.0-or-later
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Framework :: Pydantic
Classifier: Framework :: Pytest
Classifier: Intended Audience :: End Users/Desktop
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Internet
Classifier: Topic :: Multimedia
Classifier: Topic :: Terminals
Classifier: Topic :: Text Processing
Classifier: Topic :: Utilities
Classifier: Typing :: Typed
Requires-Dist: av>=16.1.0
Requires-Dist: brotli>=1.2.0
Requires-Dist: bs4>=0.0.2
Requires-Dist: courlan>=1.3.2
Requires-Dist: lxml>=6.0.2
Requires-Dist: pydantic>=2.12.5
Requires-Dist: pymupdf>=1.27.1
Requires-Dist: python-magic>=0.4.27 ; sys_platform != 'win32'
Requires-Dist: python-magic-bin>=0.4.14 ; sys_platform == 'win32'
Requires-Dist: regex>=2026.2.19
Requires-Dist: trafilatura>=2.0.0
Requires-Dist: typer>=0.24.0
Requires-Dist: yt-dlp>=2026.2.21
Requires-Dist: zstandard>=0.25.0
Requires-Python: >=3.12
Description-Content-Type: text/markdown

# [consumo: Content Consumption Analyzer](https://gbr-ufs.github.io/consumo/)

[![PyPI Package](https://img.shields.io/pypi/v/consumo.svg)](https://pypi.python.org/pypi/consumo)
[![Codecov](https://codecov.io/gh/gbr-ufs/consumo/graph/badge.svg?token=IIRDADQH1Q)](https://codecov.io/gh/gbr-ufs/consumo)
[![Downloads](https://static.pepy.tech/badge/consumo/month)](https://pepy.tech/project/consumo)
[![Pydantic v2](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/pydantic/pydantic/main/docs/badge/v2.json)](https://pydantic.dev)
[![License](https://img.shields.io/badge/license_-GPLv3+-822422?logo=GNU&logoColor=black&labelColor=white)](LICENSE)

![GIF showcasing the program being used, by revealing it would take 21 minutes and 18 seconds to read the entire license at the standard 265 words per minute.](https://vhs.charm.sh/vhs-1mimEmoE9cISfgnplgT7xA.gif)

<p align="center">
  <a href="https://vhs.charm.sh">
    <img alt="VHS" src="https://stuff.charm.sh/vhs/badge.svg">
  </a>
</p>

## Introduction

consumo is a command-line interface (CLI) built with [Typer](https://typer.tiangolo.com/) that **calculates the time needed to consume online or offline media**. Use it to sort media by duration for later consumption, or to decide whether something can be viewed today or should wait for a later date.

It's designed with **broad support** in mind. For online media, it supports:

- video platforms, by directly getting the duration of the linked video;
- files hosted online, by extracting the duration from their metadata;
- articles and text in general, by using the **Medium formula**, which calculates the total consumption time from the text at a (customizable) words per minute (WPM) count, the image count, and the duration of the videos on the page.

For further details, see: [How Medium Calculates Read Time](https://mediumcourse.com/how-is-medium-article-read-time-calculated/).
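As a rough illustration of the formula described above, the following Python sketch adds up the three components. It assumes the commonly cited Medium image rule (12 seconds for the first image, one second less for each following image, with a floor of 3 seconds) and a 265 WPM default. The function name and parameters are hypothetical and are not consumo's actual API.

```python
def estimate_seconds(
    word_count: int,
    image_count: int = 0,
    video_seconds: float = 0.0,
    wpm: int = 265,
) -> float:
    """Estimate the consumption time of a page, in seconds.

    Hypothetical helper for illustration only; not part of consumo.
    """
    if wpm <= 0:
        raise ValueError("wpm must be positive")

    # Text: words divided by reading speed, converted from minutes to seconds.
    reading = word_count / wpm * 60

    # Images (assumed rule): 12 s for the first, 11 s for the second, and so
    # on, never dropping below 3 s per image.
    images = sum(max(12 - index, 3) for index in range(image_count))

    # Videos: their total duration is added as-is.
    return reading + images + video_seconds
```

For example, `estimate_seconds(530, image_count=2)` adds two minutes of reading to 12 + 11 seconds of image time.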

For offline media, multiple backends are used to calculate the reading time. However, **by design**, local HTML files have **full feature parity** with online pages.

## Installation

### Python

```text
$ python3 -m pip install consumo # python -m pip install consumo
```

### pip

```text
$ pip3 install consumo # pip install consumo
```

## Context

I'm pretty unorganized. No matter how much I try to tidy things up, I always manage to make a mess somewhere else. In this case, I host a [FreshRSS](https://github.com/FreshRSS/FreshRSS) container on my own machine, which should **ideally** be my only source of online content and the place where things get saved. However, after hoarding 30+ tabs on my phone with random links from the web, I decided to make a file like this on my computer:

```text
https://en.wikipedia.org/wiki/Python_(programming_language)
https://en.wikipedia.org/wiki/High-level_programming_language
https://en.wikipedia.org/wiki/General-purpose_programming_language
https://en.wikipedia.org/wiki/Code_readability
https://en.wikipedia.org/wiki/Significant_indentation
https://en.wikipedia.org/wiki/Type_system#DYNAMIC
https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)
https://en.wikipedia.org/wiki/Programming_paradigm
https://en.wikipedia.org/wiki/Structured_programming
https://en.wikipedia.org/wiki/Procedural_programming
https://en.wikipedia.org/wiki/Object-oriented_programming
https://en.wikipedia.org/wiki/Functional_programming
...
```

Repeat until you get over **a hundred** links (and multiple websites other than Wikipedia). Needless to say, I felt overwhelmed and thought: "LLMs can view webpages. Maybe I can give this list of links to one so it can sort them by duration for a better experience?"

I tried multiple models, but none were able to do that. Maybe something like this already exists, but I forgot to search for it. Thankfully, that sparked a great idea for a project: consumo!

## Philosophies

- Dependency Injection.
- Parse, don't validate[^1].
- Test-Driven Development[^2].
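To make the principles above concrete, here is a minimal sketch of how they can fit together. Every name in it (`WordsPerMinute`, `reading_seconds`, `fake_fetch`) is hypothetical and chosen for illustration; it does not show consumo's real code.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class WordsPerMinute:
    """A reading speed that is known to be a positive integer."""

    value: int

    @classmethod
    def parse(cls, raw: str) -> "WordsPerMinute":
        # Parse, don't validate: turn raw input into a typed value once, so
        # later code never has to re-check it.
        number = int(raw)  # raises ValueError for non-numeric input
        if number <= 0:
            raise ValueError("words per minute must be positive")
        return cls(number)


def reading_seconds(
    source: str,
    wpm: WordsPerMinute,
    fetch_text: Callable[[str], str],
) -> float:
    # Dependency injection: the text fetcher is passed in, so tests can
    # replace network or disk access with a stub.
    words = len(fetch_text(source).split())
    return words / wpm.value * 60


def fake_fetch(source: str) -> str:
    # Test double that stands in for a real downloader.
    return "word " * 530
```

With the stub injected, a test-first workflow can pin down the expected behaviour without touching the network, for instance by asserting on `reading_seconds("any-source", WordsPerMinute.parse("265"), fake_fetch)`.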

[^1]: King, A. (2019) Parse, don’t validate. Alexis King’s Blog. Available at: https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/ (Accessed: September 29, 2025).

[^2]: Beck, K. (2003) Test-driven development: By example. Boston: Addison-Wesley (The Addison-Wesley signature series).
