Metadata-Version: 2.4
Name: invenio-feeds
Version: 0.1.3
Summary: Celery-based feed processing extension for InvenioRDM.
Project-URL: Homepage, https://rogue-scholar.org
Project-URL: Repository, https://codeberg.org/front-matter/invenio-feeds
Project-URL: Documentation, https://docs.rogue-scholar.org
Author-email: Martin Fenner <martin@front-matter.de>
License-Expression: MIT
License-File: LICENSE
Keywords: blogging,celery,inveniordm,science
Requires-Python: ~=3.14.0
Requires-Dist: babel<3,>=2.14.0
Requires-Dist: beautifulsoup4<5,>=4.12.2
Requires-Dist: celery<6,>=5.3
Requires-Dist: commonmeta-py<1,>=0.223
Requires-Dist: dateutils<0.7,>=0.6.12
Requires-Dist: furl<3,>=2.1.3
Requires-Dist: iso8601<3,>=2.1.0
Requires-Dist: langdetect<2,>=1.0.9
Requires-Dist: levenshtein<0.28,>=0.26.0
Requires-Dist: lxml<7,>=5.1.0
Requires-Dist: nh3<0.3,>=0.2.14
Requires-Dist: orjson<4,>=3.9.14
Requires-Dist: pydash<7.0,>=6.0
Requires-Dist: pypandoc>=1.17
Requires-Dist: python-frontmatter~=1.1
Requires-Dist: requests<3,>=2.32
Requires-Dist: weasyprint<69,>=67
Requires-Dist: xmltodict<0.13,>=0.12.0
Description-Content-Type: text/markdown

[![PyPI version](https://img.shields.io/pypi/v/invenio-feeds.svg)](https://pypi.org/project/invenio-feeds/)
[![docs](https://img.shields.io/badge/docs-passing-blue)](https://docs.rogue-scholar.org)
![License](https://img.shields.io/badge/license-MIT-blue?logo=codeberg)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.8433679.svg)](https://doi.org/10.5281/zenodo.8433679)

# invenio-feeds

Celery-based feed processing extension for [InvenioRDM](https://inveniosoftware.org/products/rdm/), powering the [Rogue Scholar](https://rogue-scholar.org) science blog archive.

`invenio-feeds` replaces the former Flask/REST API with Celery workers that periodically fetch blog feeds, normalise posts, and upsert them as InvenioRDM records and communities. It follows the standard `invenio-base` extension pattern and integrates into any InvenioRDM instance.

## Installation

Requires Python 3.14. Dependencies are managed with [uv](https://github.com/astral-sh/uv); install them from a checkout of the repository:

```bash
uv sync
```

### Environment variables

```bash
# InvenioRDM instance to read/write
FLASK_INVENIORDM_API=https://rogue-scholar.org   # default
FLASK_INVENIORDM_TOKEN=<bearer-token>
```
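
As a rough sketch of how these two variables might be consumed, the snippet below reads them from the environment and falls back to the documented default for the API URL. The `Settings` class and the `load_settings` helper are hypothetical names used for illustration only. They are not part of `invenio-feeds`, which may resolve these values through Flask's configuration instead.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Default documented in the environment variable listing above.
DEFAULT_API = "https://rogue-scholar.org"


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the two documented variables."""

    api: str
    token: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Hypothetical helper: read the documented variables from `env`."""
    # Strip a trailing slash so the URL can be joined with paths consistently.
    api = env.get("FLASK_INVENIORDM_API", DEFAULT_API).rstrip("/")
    # Treat an empty string the same as an unset token.
    token = env.get("FLASK_INVENIORDM_TOKEN") or None
    return Settings(api=api, token=token)
```

Passing the mapping as a parameter keeps the helper easy to exercise in tests without modifying the real process environment.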

## Running

### Celery worker

```bash
celery -A invenio_feeds.tasks worker --loglevel=info
```

### Celery beat scheduler (periodic tasks)

```bash
celery -A invenio_feeds.tasks beat --loglevel=info
```

### Trigger a feed import manually

```bash
celery -A invenio_feeds.tasks call invenio_feeds.process_blog_feed --args='["<blog-slug>"]'
```
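
The `--args` value in the command above is a JSON list of positional arguments. If the command is assembled from a script, a minimal sketch like the following builds the argument vector and lets `json.dumps` handle the quoting of the slug. The `build_call_command` helper is a hypothetical name for illustration and is not provided by `invenio-feeds`.

```python
import json
import shlex


def build_call_command(slug: str) -> list[str]:
    """Hypothetical helper: argv for the manual trigger shown above."""
    return [
        "celery",
        "-A",
        "invenio_feeds.tasks",
        "call",
        "invenio_feeds.process_blog_feed",
        # Positional task arguments are passed as a JSON-encoded list.
        "--args=" + json.dumps([slug]),
    ]


if __name__ == "__main__":
    # Print a shell-safe version of the command for a placeholder slug.
    print(shlex.join(build_call_command("my-blog")))
```

The returned list can be handed to `subprocess.run` directly, which avoids shell quoting issues altogether.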

## Scheduled tasks

| Task | Schedule | Description |
|---|---|---|
| `invenio_feeds.process_all_feeds` | every 30 min | Fan-out: one `process_blog_feed` per active blog |
| `invenio_feeds.process_blog_feed` | on-demand | Fetch and upsert all new posts for one blog |
| `invenio_feeds.classify_all_blogs` | daily 03:00 UTC | Fan-out: classify untagged posts per blog |
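
The fan-out rows in the table describe a pattern in which a scheduled parent task enqueues one per-blog task for each active blog. A minimal sketch of that pattern in plain Python follows, with the queueing step passed in as a callable so that no broker is required. The `fan_out` helper, the `status` field, and the `"active"` value are assumptions for illustration and are not taken from the `invenio-feeds` source. In a Celery setup, `enqueue` would typically be a task's `.delay` method.

```python
from typing import Callable, Iterable, Mapping


def fan_out(
    blogs: Iterable[Mapping[str, str]],
    enqueue: Callable[[str], object],
) -> int:
    """Enqueue one per-blog job for every active blog.

    Returns the number of jobs enqueued. The `status`/`slug` keys are
    assumed field names, not the package's actual schema.
    """
    count = 0
    for blog in blogs:
        if blog.get("status") == "active":
            enqueue(blog["slug"])
            count += 1
    return count


if __name__ == "__main__":
    queued: list[str] = []
    blogs = [
        {"slug": "blog-a", "status": "active"},
        {"slug": "blog-b", "status": "archived"},
        {"slug": "blog-c", "status": "active"},
    ]
    # Here the "queue" is just a list; only the active blogs are appended.
    print(fan_out(blogs, queued.append), queued)
```

Keeping the parent task this thin means a slow or failing feed only affects its own per-blog job rather than the whole run.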

## Development

We use pytest for testing:

```bash
uv run pytest
```

Follow along via [Codeberg Issues](https://codeberg.org/front-matter/rogue-scholar-api/issues). Please open an issue if you encounter a bug or have a feature request.

### Note on Patches/Pull Requests

- Fork the project
- Write tests for your new feature or a test that reproduces a bug
- Implement your feature or make a bug fix
- Do not change the version or history
- Commit, push and make a pull request. Bonus points for topical branches.

## Documentation

Documentation for Rogue Scholar (work in progress) is available on the [Rogue Scholar Documentation](https://docs.rogue-scholar.org/) website.

## Meta

Please note that this project is released with a [Contributor Code of Conduct](https://codeberg.org/front-matter/rogue-scholar-api/src/branch/main/CODE_OF_CONDUCT.md). By participating in this project you agree to abide by its terms.

License: [MIT](https://codeberg.org/front-matter/rogue-scholar-api/src/branch/main/LICENSE)
