newspaper3k
html2text
requests
beautifulsoup4
tqdm
markitdown
lxml_html_clean
