jeevesagent.loader.html
HTML loader → markdown.
Uses beautifulsoup4 (lazy import) to walk the DOM and emit
markdown that preserves heading + paragraph + list structure.
Strips <script> / <style> content. Drops most attributes;
the goal is to keep the textual structure, not pixel-perfect
rendering.
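The conversion described above can be sketched roughly as follows. This is a minimal illustration, not the module's actual implementation: it swaps in the standard library's `html.parser` for beautifulsoup4 so that it runs without extra dependencies, and the names `MarkdownSketch` and `html_to_markdown` are hypothetical. It handles only headings, paragraphs, and list items.

```python
from html.parser import HTMLParser


class MarkdownSketch(HTMLParser):
    """Hypothetical converter: headings, paragraphs, and list items only."""

    SKIP = {"script", "style"}  # content of these tags is dropped entirely
    HEADINGS = {f"h{n}": "#" * n + " " for n in range(1, 7)}

    def __init__(self):
        super().__init__()
        self.blocks = []     # finished markdown blocks
        self.buf = []        # text fragments of the block being built
        self.prefix = ""     # markdown prefix for the current block
        self.skip_depth = 0  # > 0 while inside <script> / <style>

    def _flush(self):
        # Collapse whitespace and emit the current block if it has any text.
        text = " ".join("".join(self.buf).split())
        if text:
            self.blocks.append(self.prefix + text)
        self.buf, self.prefix = [], ""

    def handle_starttag(self, tag, attrs):  # attributes are ignored on purpose
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.HEADINGS:
            self._flush()
            self.prefix = self.HEADINGS[tag]
        elif tag == "li":
            self._flush()
            self.prefix = "- "
        elif tag == "p":
            self._flush()

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in self.HEADINGS or tag in ("p", "li"):
            self._flush()

    def handle_data(self, data):
        if not self.skip_depth:
            self.buf.append(data)


def html_to_markdown(html: str) -> str:
    """Convert an HTML string to markdown, one block per blank-line-separated unit."""
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    parser._flush()  # emit any trailing text that was never closed by a tag
    return "\n\n".join(parser.blocks)
```

Inline tags such as `<b>` are passed through as plain text here, which matches the stated goal of keeping textual structure rather than rendering detail.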
Functions
| load_html(path) | Load an HTML file → markdown. |
Module Contents
- jeevesagent.loader.html.load_html(path: str | pathlib.Path) → jeevesagent.loader.base.Document
Load an HTML file → markdown.
Requires beautifulsoup4: pip install 'jeevesagent[loader-html]'.
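A lazy import of an optional dependency, as the module description mentions, might look like the sketch below. The helper name `require` and its error message are assumptions made for illustration; the source does not show how `load_html` actually performs the import. The only facts relied on are that beautifulsoup4 is imported as `bs4` and that the extra is named `loader-html`.

```python
import importlib


def require(module_name: str, extra: str):
    """Import module_name on first use, or raise with an install hint.

    Hypothetical helper: defers the import until the loader is called,
    so the package itself imports cleanly without the optional extra.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is required here; "
            f"install it with: pip install 'jeevesagent[{extra}]'"
        ) from exc
```

A loader written this way would call `bs4 = require("bs4", "loader-html")` inside the function body rather than at module import time, so users who never load HTML do not need beautifulsoup4 installed.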