scitex_scholar.pdf_download

class scitex_scholar.pdf_download.ScholarPDFDownloader(context, config=None)

Bases: object

Download PDFs from URLs with multiple fallback strategies.

Strategies tried in order:

- Chrome PDF Viewer
- Direct Download (ERR_ABORTED)
- Response Body Extraction
- Manual Download Fallback

URL resolution (DOI -> URL) should be handled by the caller.
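The ordered fallback described above can be pictured with a minimal sketch. It is not the library's implementation: the `try_in_order` helper, the `Strategy` type, and the three stand-in strategies are hypothetical names introduced only for illustration, and they perform no browser or network work.

```python
import asyncio
from pathlib import Path
from typing import Awaitable, Callable, List, Optional, Tuple

# Hypothetical strategy signature: (url, output_path) -> saved path, or None on failure.
Strategy = Callable[[str, Path], Awaitable[Optional[Path]]]


async def try_in_order(
    url: str, output_path: Path, strategies: List[Tuple[str, Strategy]]
) -> Tuple[Optional[str], Optional[Path]]:
    """Return (strategy_name, path) for the first strategy that succeeds."""
    for name, strategy in strategies:
        try:
            saved = await strategy(url, output_path)
        except Exception:
            # A failing strategy is not fatal; fall through to the next one.
            continue
        if saved is not None:
            return name, saved
    return None, None


# Stand-in strategies for illustration only; they do no network I/O.
async def failing_viewer(url: str, output_path: Path) -> Optional[Path]:
    raise RuntimeError("viewer unavailable")


async def empty_direct(url: str, output_path: Path) -> Optional[Path]:
    return None


async def body_extraction(url: str, output_path: Path) -> Optional[Path]:
    return output_path


def demo() -> Tuple[Optional[str], Optional[Path]]:
    strategies = [
        ("chrome_pdf_viewer", failing_viewer),
        ("direct_download", empty_direct),
        ("response_body", body_extraction),
    ]
    return asyncio.run(
        try_in_order("https://example.org/paper.pdf", Path("paper.pdf"), strategies)
    )
```

In this sketch the first strategy raises and the second returns `None`, so the third one supplies the result. Returning the strategy name alongside the path is a design choice of the sketch that makes it easy to log which fallback succeeded.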

__init__(context, config=None)

async download_from_urls(pdf_urls, output_dir=None, max_concurrent=3)

Download multiple PDFs with parallel processing.

Return type: List[Path]
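One common way to bound parallel downloads is an `asyncio.Semaphore` combined with `asyncio.gather`. The sketch below assumes that `max_concurrent` caps the number of downloads in flight and that failed downloads are dropped from the returned list. Neither is confirmed by the documentation above, and `download_all`, `FakeFetcher`, and the `paper_NNN.pdf` naming are hypothetical.

```python
import asyncio
from pathlib import Path
from typing import Awaitable, Callable, List, Optional

# Hypothetical per-URL downloader: (url, output_path) -> saved path, or None on failure.
FetchOne = Callable[[str, Path], Awaitable[Optional[Path]]]


async def download_all(
    urls: List[str], output_dir: Path, fetch_one: FetchOne, max_concurrent: int = 3
) -> List[Path]:
    """Run fetch_one for every URL with at most max_concurrent calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(index: int, url: str) -> Optional[Path]:
        async with semaphore:
            # Illustrative file naming; the real downloader may name files differently.
            return await fetch_one(url, output_dir / f"paper_{index:03d}.pdf")

    results = await asyncio.gather(*(bounded(i, u) for i, u in enumerate(urls)))
    # Keep only the downloads that produced a file.
    return [path for path in results if path is not None]


class FakeFetcher:
    """Stand-in downloader that records peak concurrency instead of doing I/O."""

    def __init__(self) -> None:
        self.active = 0
        self.peak = 0

    async def __call__(self, url: str, output_path: Path) -> Optional[Path]:
        self.active += 1
        self.peak = max(self.peak, self.active)
        await asyncio.sleep(0.01)  # simulate download latency
        self.active -= 1
        return None if "missing" in url else output_path
```

`asyncio.gather` preserves input order, so the surviving paths come back in the same order as their URLs.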

async download_open_access(oa_url, output_path, metadata=None)

Download PDF from an Open Access URL.

Return type:

Optional[Path]

async download_smart(paper, output_path)

Smart download choosing best strategy based on paper metadata.

Return type:

Optional[Path]

async download_from_url(pdf_url, output_path, doi=None)

Main download method with manual override support.

Return type:

Optional[Path]