scitex_scholar.pdf_download
- class scitex_scholar.pdf_download.ScholarPDFDownloader(context, config=None)
Bases: object
Download PDFs from URLs with multiple fallback strategies.
Strategies tried in order:
  - Chrome PDF Viewer
  - Direct Download (ERR_ABORTED)
  - Response Body Extraction
  - Manual Download Fallback
URL resolution (DOI -> URL) should be handled by the caller.
- async download_from_urls(pdf_urls, output_dir=None, max_concurrent=3)
Download multiple PDFs concurrently. max_concurrent sets the upper limit on simultaneous downloads (default 3).
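A common way to cap the number of simultaneous downloads is an `asyncio.Semaphore`. The sketch below shows that general pattern only; it is an assumption about how a limit like `max_concurrent` might be enforced, not a description of this method's internals. `run_bounded` and `fake_download` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence, Tuple


async def run_bounded(
    items: Sequence[str],
    worker: Callable[[str], Awaitable[str]],
    max_concurrent: int = 3,
) -> List[object]:
    """Run worker(item) for every item, with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(item: str) -> str:
        # Wait for a free slot before starting the work.
        async with semaphore:
            return await worker(item)

    # return_exceptions=True keeps one failure from cancelling the rest;
    # results come back in the same order as items.
    return await asyncio.gather(*(guarded(i) for i in items), return_exceptions=True)


async def demo() -> Tuple[List[object], int]:
    in_flight = 0
    peak = 0

    async def fake_download(url: str) -> str:
        # Hypothetical stand-in for a real download; records peak concurrency.
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)
        in_flight -= 1
        return f"{url}.pdf"

    urls = [f"https://example.org/paper{i}" for i in range(8)]
    results = await run_bounded(urls, fake_download, max_concurrent=3)
    return results, peak


if __name__ == "__main__":
    results, peak = asyncio.run(demo())
    print(len(results), "results; peak concurrency:", peak)
```

A modest limit such as 3 keeps the load on a publisher's server low while still overlapping network waits.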
- async download_open_access(oa_url, output_path, metadata=None)
Download PDF from an Open Access URL.