Metadata-Version: 2.4
Name: edi_energy_scraper
Version: 2.2.3
Summary: a scraper to mirror edi-energy.de
Project-URL: Changelog, https://github.com/Hochfrequenz/edi_energy_scraper/releases
Project-URL: Homepage, https://github.com/Hochfrequenz/edi_energy_scraper
Author-email: Hochfrequenz Unternehmensberatung GmbH <info+github@hochfrequenz.de>
License: MIT
License-File: LICENSE
Keywords: ahb,automation,bdew,bdew-mako.de,edi@energy
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Python: >=3.10
Requires-Dist: aiohttp>=3.8.4
Requires-Dist: efoli>=2.1.0
Requires-Dist: more-itertools
Requires-Dist: pydantic>=2
Requires-Dist: pypdf>=3.4.1
Requires-Dist: pytz>=2022.7.1
Requires-Dist: pytz>=2024.2
Provides-Extra: coverage
Requires-Dist: coverage==7.10.7; extra == 'coverage'
Provides-Extra: formatting
Requires-Dist: black==25.9.0; extra == 'formatting'
Requires-Dist: isort==7.0.0; extra == 'formatting'
Provides-Extra: linting
Requires-Dist: pylint==3.3.7; extra == 'linting'
Provides-Extra: spellcheck
Requires-Dist: codespell==2.4.1; extra == 'spellcheck'
Provides-Extra: test-packaging
Requires-Dist: build==1.3.0; extra == 'test-packaging'
Requires-Dist: twine==6.2.0; extra == 'test-packaging'
Provides-Extra: tests
Requires-Dist: aioresponses==0.7.8; extra == 'tests'
Requires-Dist: freezegun==1.5.5; extra == 'tests'
Requires-Dist: pytest-asyncio==1.2.0; extra == 'tests'
Requires-Dist: pytest-mock==3.14.1; extra == 'tests'
Requires-Dist: pytest==8.4.2; extra == 'tests'
Requires-Dist: syrupy==4.9.1; extra == 'tests'
Provides-Extra: type-check
Requires-Dist: mypy==1.17.1; extra == 'type-check'
Requires-Dist: types-pytz==2025.2.0.20250516; extra == 'type-check'
Description-Content-Type: text/markdown

# edi-energy.de scraper

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
![Unittests status badge](https://github.com/Hochfrequenz/edi_energy_scraper/workflows/Unittests/badge.svg)
![Coverage status badge](https://github.com/Hochfrequenz/edi_energy_scraper/workflows/Coverage/badge.svg)
![Linting status badge](https://github.com/Hochfrequenz/edi_energy_scraper/workflows/Linting/badge.svg)
![Black status badge](https://github.com/Hochfrequenz/edi_energy_scraper/workflows/Black/badge.svg)
![PyPi Status Badge](https://img.shields.io/pypi/v/edi_energy_scraper)
![Python Versions (officially) supported](https://img.shields.io/pypi/pyversions/edi_energy_scraper.svg)

The Python package `edi_energy_scraper` provides easy-to-use methods to mirror the free documents on bdew-mako.de.

### Rationale / Why?

If you'd like to be informed about new regulations or data formats published on bdew-mako.de, you can either

- visit the site every day and hope that you spot the changes (if this is your favourite hobby),
- or automate the task.

This repository helps you with the latter. It allows you to create an up-to-date copy of bdew-mako.de on your local
computer. Unlike mirroring the files with `wget` or `curl`, this gives you a clean and intuitive directory
structure.

From there you can, for example, commit the files to a version-controlled repository (like our [edi_energy_mirror](https://github.com/Hochfrequenz/edi_energy_mirror)) or scrape the PDF/Word files for later use.
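
If you do not want to set up a VCS, a rough way to notice new or changed documents is to compare content hashes between two runs. The following is a minimal sketch using only the standard library; the helper names `snapshot` and `diff_snapshots` are hypothetical and not part of `edi_energy_scraper`.

```python
import hashlib
from pathlib import Path


def snapshot(mirror_root: Path) -> dict[str, str]:
    """Map each file path (relative to mirror_root) to the SHA-256 hex digest of its content."""
    return {
        path.relative_to(mirror_root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(mirror_root.rglob("*"))
        if path.is_file()
    }


def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Report which files were added, removed or changed between two snapshots."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(name for name in old.keys() & new.keys() if old[name] != new[name]),
    }
```

Take one snapshot before mirroring and one afterwards (or persist the previous one as JSON), then pass both to `diff_snapshots`.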

We're all hoping for the day of true digitization on which this repository will become obsolete.

### See also

There is a similar project in C# by Fabian Wetzel: [fabsenet/edi-energy-extractor](https://github.com/fabsenet/edi-energy-extractor/).
Unlike this project, it stores the downloaded data in a database instead of a file system.
It also works with `bdew-mako.de`.

## How to use the Package (as a user)

Install via pip:

```bash
pip install edi_energy_scraper
```

Create a directory in which you'd like to save the mirrored data:

```bash
mkdir edi_energy_de
```

Then import it and start the download:

```python
import asyncio
from edi_energy_scraper import EdiEnergyScraper


# add the following lines to enable debug logging to stdout (CLI)
# import logging
# import sys
# logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)

async def mirror():
    scraper = EdiEnergyScraper(path_to_mirror_directory="edi_energy_de")
    await scraper.mirror()


if __name__ == "__main__":
    # asyncio.run creates and closes its own event loop
    asyncio.run(mirror())
```

This creates a directory structure:

```
-|-your_script_cwd.py
 |-edi_energy_de
    |- FV2310 (contains files valid since 2023-10-01)
        |- ahb.pdf
        |- ahb.docx
        |- ...
    |- FV2404 (contains files valid since 2024-04-03)
        |- mig.pdf
        |- mig.docx
        |- ...
    |- FV2504 (contains files valid since 2025-06-06)
        |- allgemeine_festlegungen.pdf
        |- schema.xsd
        |- ...
```
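
To get a quick overview of what has been mirrored, you can walk this directory structure yourself. The snippet below is a minimal sketch that assumes the format version directories are named with an `FV` prefix, as in the tree above; `files_by_format_version` is a hypothetical helper and not part of the package.

```python
from pathlib import Path


def files_by_format_version(mirror_root: Path) -> dict[str, list[str]]:
    """Group the mirrored file names by their format version directory (e.g. 'FV2310')."""
    result: dict[str, list[str]] = {}
    for directory in sorted(mirror_root.iterdir()):
        # assumption: format version directories start with 'FV'; everything else is skipped
        if directory.is_dir() and directory.name.startswith("FV"):
            result[directory.name] = sorted(p.name for p in directory.iterdir() if p.is_file())
    return result


if __name__ == "__main__":
    root = Path("edi_energy_de")  # the mirror directory created above
    if root.is_dir():
        for format_version, file_names in files_by_format_version(root).items():
            print(format_version, len(file_names), "files")
```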

> [!TIP]
> You can extract the information encoded into the filenames:
> ```python
> from edi_energy_scraper import DocumentMetadata
> structured_information = DocumentMetadata.from_filename("AHB_COMDIS_1.0f_99991231_20250605_20250605_8872.pdf")
> # The result is a DocumentMetadata instance. Illustrative (truncated) repr, here of a REQOTE MIG document:
> # DocumentMetadata(kind='MIG', edifact_format=<EdifactFormat.REQOTE: 'REQOTE'>, valid_from=datetime.date(2023, 9, 30), valid_unt...traordinary_publication=True, is_error_correction=False, is_informational_reading_version=True, additional_text=None, id=10071)
> ```

## How to use this Repository on Your Machine (for development)

Please follow the instructions in
our [Python Template Repository](https://github.com/Hochfrequenz/python_template_repository#how-to-use-this-repository-on-your-machine).
For further information, see the [Tox Repository](https://github.com/tox-dev/tox).

## Contribute

You are very welcome to contribute to this repository by opening a pull request against the main branch.
