scitex_scholar.auth
Authentication module for Scholar.
- class scitex_scholar.auth.ScholarAuthManager(email_openathens=None, email_ezproxy=None, email_shibboleth=None, config=None)[source]
Bases: object
Manages multiple authentication providers.
This class coordinates between different authentication methods (OpenAthens, Lean Library, etc.) and provides a unified interface.
- __init__(email_openathens=None, email_ezproxy=None, email_shibboleth=None, config=None)[source]
Initialize the authentication manager.
- Parameters:
email_openathens (Optional[str]) – User’s institutional email for OpenAthens authentication
email_ezproxy (Optional[str]) – User’s institutional email for EZProxy authentication
email_shibboleth (Optional[str]) – User’s institutional email for Shibboleth authentication
config (Optional[ScholarConfig]) – ScholarConfig instance (a new one is created if None)
- async ensure_authenticate_async(provider_name=None, verify_live=True, **kwargs)[source]
- async is_authenticate_async(verify_live=True)[source]
Check whether any provider is currently authenticated.
- async authenticate_async(provider_name=None, **kwargs)[source]
Authenticate with specified or active provider.
- async get_auth_cookies_async(essential_only=True)[source]
Get authentication cookies from active provider.
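The call pattern these methods suggest can be sketched as follows. This is only a sketch: `StubAuthManager` is a hypothetical stand-in that mirrors the documented method names and signatures so the example runs without the library. The return values (a bool, a list of cookie dicts) and the "authenticate only when no session exists" behavior are assumptions, not documented facts.

```python
import asyncio


class StubAuthManager:
    """Hypothetical stand-in mirroring the documented method names."""

    def __init__(self, email_openathens=None, email_ezproxy=None,
                 email_shibboleth=None, config=None):
        self.email_openathens = email_openathens
        self._authenticated = False

    async def is_authenticate_async(self, verify_live=True):
        # Assumption: reports the session state as a bool.
        return self._authenticated

    async def authenticate_async(self, provider_name=None, **kwargs):
        # Assumption: a successful login flips the session state.
        self._authenticated = True
        return {"provider": provider_name or "openathens"}

    async def ensure_authenticate_async(self, provider_name=None,
                                        verify_live=True, **kwargs):
        # Assumption: log in only when no session exists yet.
        if not await self.is_authenticate_async(verify_live=verify_live):
            await self.authenticate_async(provider_name=provider_name, **kwargs)
        return await self.is_authenticate_async(verify_live=verify_live)

    async def get_auth_cookies_async(self, essential_only=True):
        # Assumption: cookies come back as a list of dicts.
        if not self._authenticated:
            return []
        return [{"name": "session", "value": "placeholder"}]


async def fetch_cookies(manager):
    """Ensure a session exists, then read its cookies."""
    if await manager.ensure_authenticate_async():
        return await manager.get_auth_cookies_async(essential_only=True)
    return []


manager = StubAuthManager(email_openathens="user@example.edu")
cookies = asyncio.run(fetch_cookies(manager))
```

With the real class, the same two awaits would presumably replace the stub calls; check the actual return types before relying on them.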
- _register_provider(name, provider)[source]
Register an authentication provider with email context.
- class scitex_scholar.auth.AuthenticationGateway(auth_manager, browser_manager, config=None)[source]
Bases: object
Transparent authentication layer for Scholar operations.
Responsibilities:
- Determine if URL requires authentication (config-based, no hardcoding)
- Prepare authenticated browser context
- Visit authentication gateways (OpenURL) to establish publisher sessions
- Cache authentication state for performance
This gateway sits between Scholar and URL/Download operations, preparing authentication transparently before content access.
- property name
- __init__(auth_manager, browser_manager, config=None)[source]
Initialize authentication gateway.
- Parameters:
auth_manager – ScholarAuthManager instance
browser_manager – ScholarBrowserManager instance
config (ScholarConfig) – ScholarConfig instance
- async prepare_context_async(doi, context, title=None)[source]
Prepare URL context with authentication if needed.
This is the main entry point - called BEFORE URL finding.
Flow:
1. Build OpenURL (authentication gateway)
2. Check if DOI needs authentication (based on known publishers)
3. If auth needed: Visit OpenURL to establish publisher cookies
4. Resolve to final publisher URL
5. Return prepared context with authenticated session
- Returns:
URLContext with authentication prepared and ready
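The five-step flow can be sketched with plain asyncio and stand-in helpers. Everything named here is hypothetical: `RESOLVER_BASE`, `PAYWALLED_PREFIXES`, `Ctx`, and `visit_gateway` are illustrations, and the real method works on a browser context rather than building URLs by hand.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlencode

# Hypothetical values for illustration only.
RESOLVER_BASE = "https://resolver.example.edu/openurl"
PAYWALLED_PREFIXES = ("10.1109", "10.1007")


@dataclass
class Ctx:
    """Simplified stand-in for the documented URLContext."""
    doi: str
    url: Optional[str] = None
    requires_auth: Optional[bool] = None
    auth_gateway_url: Optional[str] = None


async def visit_gateway(ctx: Ctx) -> str:
    """Stand-in for the browser visit that establishes publisher cookies."""
    await asyncio.sleep(0)
    return f"https://doi.org/{ctx.doi}"


async def prepare(doi: str) -> Ctx:
    ctx = Ctx(doi=doi)
    # 1. Build the OpenURL (authentication gateway).
    ctx.auth_gateway_url = f"{RESOLVER_BASE}?{urlencode({'doi': doi})}"
    # 2. Decide from the DOI whether authentication is needed.
    ctx.requires_auth = doi.startswith(PAYWALLED_PREFIXES)
    # 3. If needed, visit the gateway to establish publisher cookies.
    if ctx.requires_auth:
        ctx.url = await visit_gateway(ctx)
    # 4. Resolve to a publisher URL if the gateway visit did not set one.
    if ctx.url is None:
        ctx.url = f"https://doi.org/{doi}"
    # 5. Return the prepared context.
    return ctx


ctx = asyncio.run(prepare("10.1109/5.771073"))
open_ctx = asyncio.run(prepare("10.1371/journal.pone.0000001"))
```

The point of the ordering is that the gateway visit happens before any URL finding, so later steps inherit the session.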
- async _resolve_publisher_url_async(url_context, context)[source]
Resolve DOI to publisher landing page URL.
Uses OpenURLResolver which already exists and works. The OpenURL is the authentication gateway for paywalled content.
- Parameters:
url_context (URLContext) – URLContext with DOI
context (BrowserContext) – Browser context
- Returns:
URLContext with url and auth_gateway_url populated
- _check_auth_requirements_from_doi(url_context)[source]
Determine if DOI requires authentication based on DOI prefix patterns.
This allows early detection before resolving URL. IEEE DOIs start with 10.1109, Springer with 10.1007, etc.
- Parameters:
url_context (URLContext) – URLContext with doi populated
- Returns:
URLContext with requires_auth and auth_provider populated
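A prefix check of this kind might look like the sketch below. The two prefixes are the ones named in the description; the table name, the publisher labels, and the `check_doi` helper are hypothetical, and how a publisher maps to `auth_provider` is not shown.

```python
from typing import Optional, Tuple

# Prefixes named in the description; labels are illustrative.
DOI_PREFIX_TO_PUBLISHER = {
    "10.1109": "ieee",
    "10.1007": "springer",
}


def check_doi(doi: str) -> Tuple[bool, Optional[str]]:
    """Return (requires_auth, publisher) from the DOI registrant prefix."""
    # The registrant prefix is everything before the first slash.
    prefix = doi.strip().lower().split("/", 1)[0]
    publisher = DOI_PREFIX_TO_PUBLISHER.get(prefix)
    return publisher is not None, publisher


ieee_result = check_doi("10.1109/5.771073")
open_result = check_doi("10.1371/journal.pone.0000001")
```

Because only the DOI is inspected, this check needs no network access and can run before the URL is resolved.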
- _check_auth_requirements(url_context)[source]
Determine if URL requires authentication based on config.
This is config-based (no hardcoded domain lists). Checks URL against paywalled_publishers in config.
- Parameters:
url_context (URLContext) – URLContext with url populated
- Returns:
URLContext with requires_auth and auth_provider populated
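A config-driven check could be sketched as below. The shape of `paywalled_publishers` (a list of entries with `name` and `domains`) is an assumption made for illustration; the actual ScholarConfig schema is not documented here.

```python
from urllib.parse import urlparse

# Hypothetical config shape; the real schema may differ.
config = {
    "paywalled_publishers": [
        {"name": "ieee", "domains": ["ieee.org"]},
        {"name": "springer", "domains": ["springer.com"]},
    ]
}


def match_publisher(url, config):
    """Return the matching publisher name, or None if none matches."""
    host = (urlparse(url).hostname or "").lower()
    for publisher in config["paywalled_publishers"]:
        for domain in publisher["domains"]:
            # Match the domain itself or any subdomain of it.
            if host == domain or host.endswith("." + domain):
                return publisher["name"]
    return None


paywalled = match_publisher("https://ieeexplore.ieee.org/document/1", config)
open_access = match_publisher("https://arxiv.org/abs/1", config)
```

Keeping the domain list in config means a new publisher can be added without touching the matching code.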
- async _establish_authentication_async(url_context, context)[source]
Establish authentication by visiting gateway URL and clicking through to publisher.
This is the KEY OPERATION that solves the IEEE issue:
1. Visit OpenURL (library resolver)
2. Find publisher link on resolver page
3. Click link → redirects through OpenAthens → lands at publisher
4. Publisher session cookies established in browser context
Without this step:
- OpenAthens cookies exist at openathens.net
- NO cookies exist at ieee.org
- Chrome PDF viewer opens but download fails
With this step:
- Visit OpenURL
- Click IEEE link → redirect through OpenAthens
- Land at ieee.org → IEEE session cookies established
- Now ieee.org has cookies, Chrome PDF viewer works
- Parameters:
url_context (URLContext) – URLContext with auth_gateway_url and doi
context (BrowserContext) – Browser context (will receive publisher cookies)
- Returns:
Publisher URL if successful, None otherwise
- class scitex_scholar.auth.URLContext(doi, title=None, url=None, pdf_urls=<factory>, requires_auth=None, auth_provider=None, auth_gateway_url=None)[source]
Bases: object
Context for URL operations with authentication information.
This dataclass carries all information needed for URL resolution and PDF download, including authentication state.
- __init__(doi, title=None, url=None, pdf_urls=<factory>, requires_auth=None, auth_provider=None, auth_gateway_url=None)
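For orientation, a dataclass with this signature could be declared roughly as follows. The field names and defaults come from the signature above; the type annotations and the reading of `<factory>` as `default_factory=list` are assumptions, and `URLContextSketch` is a hypothetical name, not the library class.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class URLContextSketch:
    """Illustrative stand-in; field types are assumed, not documented."""
    doi: str
    title: Optional[str] = None
    url: Optional[str] = None
    # <factory> in the signature indicates a per-instance default.
    pdf_urls: List[str] = field(default_factory=list)
    requires_auth: Optional[bool] = None
    auth_provider: Optional[str] = None
    auth_gateway_url: Optional[str] = None


a = URLContextSketch(doi="10.1109/5.771073")
b = URLContextSketch(doi="10.1007/example")
a.pdf_urls.append("https://example.org/paper.pdf")
```

A factory default gives each instance its own list, so appending to `a.pdf_urls` leaves `b.pdf_urls` empty.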