scitex_browser.core

Core browser management components.

class scitex_browser.core.BrowserMixin(mode)

Bases: object

Mixin for local browser-based strategies with common functionality.

Browser Modes:

  • interactive: For human interaction (authentication, debugging); 1280x720 viewport.

  • stealth: For automated operations (scraping, downloading); 1x1 viewport.

Note: The browser always runs in visible system mode (never truly headless); viewport sizing alone determines whether it behaves as interactive or stealth.
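As a rough illustration of the two modes, the mapping from mode name to viewport size could be sketched as follows. The names VIEWPORTS and viewport_for_mode are hypothetical and not part of scitex_browser; only the mode names and viewport sizes come from the description above.

```python
# Hypothetical sketch: mode name -> viewport size, using the documented sizes.
VIEWPORTS = {
    "interactive": {"width": 1280, "height": 720},  # human-friendly window
    "stealth": {"width": 1, "height": 1},           # minimal, still not headless
}


def viewport_for_mode(mode: str) -> dict:
    """Return a copy of the viewport for a mode, rejecting unknown names."""
    if mode not in VIEWPORTS:
        raise ValueError(f"mode must be one of {sorted(VIEWPORTS)}, got {mode!r}")
    return dict(VIEWPORTS[mode])
```

Returning a copy keeps callers from mutating the shared table by accident.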

__init__(mode)

Initialize browser mixin.

Parameters:

mode – Browser mode: ‘interactive’ or ‘stealth’.

async classmethod get_shared_browser_async()

Get or create shared browser instance (deprecated - use get_browser_async).

Return type:

Browser

async classmethod cleanup_shared_browser_async()

Clean up shared browser instance (call on app shutdown).

async get_browser_async()

Get or create a local browser instance with the current mode setting.

Return type:

Browser

async new_page(url=None)

Create new page/tab and optionally navigate to URL.

async close_page(page_index)

Close specific page/tab by index.

async close_all_pages()

Close all pages/tabs.
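The page methods above might be combined as in the following sketch, which opens one tab per URL and always closes every tab afterwards. It assumes only the documented new_page(url=None) and close_all_pages() coroutines; visit_and_close and RecordingBrowser are hypothetical names, and the recorder stands in for a real BrowserMixin subclass so the example runs without a browser.

```python
import asyncio


async def visit_and_close(browser, urls):
    """Open one tab per URL, then close every tab, even if navigation fails."""
    try:
        for url in urls:
            await browser.new_page(url)
    finally:
        await browser.close_all_pages()


class RecordingBrowser:
    """Hypothetical stand-in that records calls instead of driving a browser."""

    def __init__(self):
        self.calls = []

    async def new_page(self, url=None):
        self.calls.append(("new_page", url))

    async def close_all_pages(self):
        self.calls.append(("close_all_pages",))


recorder = RecordingBrowser()
asyncio.run(visit_and_close(recorder, ["https://example.org", "https://example.com"]))
```

The try/finally block is the point of the sketch: tabs are released even when one of the navigations raises.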

async create_browser_context_async(playwright_instance, **context_options)

Create browser context with cookie auto-acceptance.

async get_session_async(timeout=30)

Get or create basic aiohttp session.

Return type:

ClientSession

async close_session()

Close the aiohttp session.
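Since get_session_async and close_session form an open/close pair, one possible way to use them is through a small async context manager, sketched below. managed_session and FakeOwner are hypothetical helpers, not part of scitex_browser; the fake returns a plain string where a real object would return an aiohttp ClientSession.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def managed_session(owner, timeout=30):
    """Yield the owner's session and always close it afterwards."""
    session = await owner.get_session_async(timeout=timeout)
    try:
        yield session
    finally:
        await owner.close_session()


class FakeOwner:
    """Hypothetical stand-in exposing the two documented coroutines."""

    def __init__(self):
        self.events = []

    async def get_session_async(self, timeout=30):
        self.events.append(("open", timeout))
        return "session"

    async def close_session(self):
        self.events.append(("close",))


async def _demo():
    owner = FakeOwner()
    async with managed_session(owner, timeout=10) as session:
        owner.events.append(("use", session))
    return owner.events


events = asyncio.run(_demo())
```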

async accept_cookies_async(page_index=0, wait_seconds=2)

Manually accept cookies on specific page.

interactive()

Set browser to interactive mode (human-friendly viewport).

stealth()

Set browser to stealth mode (minimal viewport for bot detection avoidance).

async show_async()

Switch browser to interactive mode and recreate all existing pages at current URLs.

async hide_async()

Switch browser to stealth mode and recreate all existing pages at current URLs.
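A typical use of show_async and hide_async might be to surface the browser only for a step that needs a human, such as logging in, and then drop back to stealth. The sketch below assumes only those two documented coroutines; run_visibly, FakeBrowser, and _login are hypothetical names used so the example runs without a browser.

```python
import asyncio


async def run_visibly(browser, step):
    """Show the browser for one awaited step, then switch back to stealth."""
    await browser.show_async()
    try:
        return await step()
    finally:
        await browser.hide_async()


class FakeBrowser:
    """Hypothetical stand-in that logs the mode each call would select."""

    def __init__(self):
        self.mode_log = []

    async def show_async(self):
        self.mode_log.append("interactive")

    async def hide_async(self):
        self.mode_log.append("stealth")


async def _login():
    # Placeholder for a step that needs human interaction.
    return "logged-in"


fake = FakeBrowser()
outcome = asyncio.run(run_visibly(fake, _login))
```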

class scitex_browser.core.ChromeProfileManager(profile_name, chrome_cache_dir=None, config=None)

Bases: object

Manages a Chrome profile, in particular its extensions, for automated literature search.

EXTENSIONS = {
    '2captcha_solver': {'id': 'ifibfemgeogfhoebkmokieepdoobkbpo', 'name': '2Captcha Solver'},
    'accept_cookies': {'id': 'ofpnikijgfhlmmjlpkfaifhhdonchhoi', 'name': 'Accept all cookies'},
    'captcha_solver': {'id': 'hlifkpholllijblknnmbfagnkjneagid', 'name': 'CAPTCHA Solver'},
    'lean_library': {'id': 'hghakoefmnkhamdhenpbogkeopjlkpoa', 'name': 'Lean Library'},
    'popup_blocker': {'id': 'bkkbcggnhapdmkeljlodobbkopceiche', 'name': 'Pop-up Blocker'},
    'zotero_connector': {'id': 'ekhagklcjbdpajgpjgmbionohlpdbjgc', 'name': 'Zotero Connector'},
}

AVAILABLE_PROFILE_NAMES = ['system', 'extension', 'auth', 'stealth']

__init__(profile_name, chrome_cache_dir=None, config=None)

Manage a Chrome profile for browser automation.

Parameters:
  • profile_name (str) – Subdirectory under chrome_cache_dir to use as the profile.

  • chrome_cache_dir (Optional[Path]) – Base directory that holds profile subdirectories. Defaults to $SCITEX_BROWSER_CHROME_CACHE_DIR or $SCITEX_DIR/browser/runtime/chrome (~/.scitex/browser/runtime/chrome by default).

  • config (Optional[object]) – Deprecated. Back-compat shim: any object exposing get_cache_chrome_dir(profile_name) -> Path is accepted so callers passing ScholarConfig still work. Prefer chrome_cache_dir.
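The documented fallback order for chrome_cache_dir could be sketched roughly as follows. default_chrome_cache_dir and profile_dir are hypothetical helper names, and the real implementation may differ in details such as path expansion; only the environment variable names and the default location come from the parameter description above.

```python
import os
from pathlib import Path


def default_chrome_cache_dir(env=None) -> Path:
    """Resolve the base directory following the documented fallback order."""
    env = os.environ if env is None else env
    explicit = env.get("SCITEX_BROWSER_CHROME_CACHE_DIR")
    if explicit:
        return Path(explicit).expanduser()
    # Fall back to $SCITEX_DIR, or ~/.scitex when it is unset.
    scitex_dir = Path(env.get("SCITEX_DIR") or "~/.scitex").expanduser()
    return scitex_dir / "browser" / "runtime" / "chrome"


def profile_dir(profile_name: str, chrome_cache_dir=None, env=None) -> Path:
    """Return the profile subdirectory under the chosen base directory."""
    if chrome_cache_dir is not None:
        base = Path(chrome_cache_dir)
    else:
        base = default_chrome_cache_dir(env)
    return base / profile_name
```

Passing env as a plain mapping makes the resolution easy to exercise without touching the real environment.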

_get_extension_statuses(profile_dir)

Get detailed status of each extension.

Return type:

Dict[str, bool]

check_extensions_installed(profile_dir=None, verbose=True)

Check installation status of all extensions from profile directory.

Return type:

bool
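To make the two return types concrete, here is a minimal sketch of a per-extension status check and an overall installed check. It assumes Chrome's usual on-disk layout of <profile_dir>/Default/Extensions/<id>/<version>/, which may not match what ChromeProfileManager actually inspects; extension_statuses and all_installed are hypothetical names, and the mapping holds two entries copied from EXTENSIONS.

```python
import tempfile
from pathlib import Path

# Two entries copied from the EXTENSIONS mapping of ChromeProfileManager.
EXTENSIONS = {
    "zotero_connector": {"id": "ekhagklcjbdpajgpjgmbionohlpdbjgc", "name": "Zotero Connector"},
    "lean_library": {"id": "hghakoefmnkhamdhenpbogkeopjlkpoa", "name": "Lean Library"},
}


def extension_statuses(profile_dir, extensions=EXTENSIONS) -> dict:
    """Map each key to True if its ID directory contains a version folder.

    Assumed layout: <profile_dir>/Default/Extensions/<id>/<version>/.
    """
    root = Path(profile_dir) / "Default" / "Extensions"
    statuses = {}
    for key, info in extensions.items():
        ext_dir = root / info["id"]
        statuses[key] = ext_dir.is_dir() and any(p.is_dir() for p in ext_dir.iterdir())
    return statuses


def all_installed(profile_dir, extensions=EXTENSIONS) -> bool:
    """Return True only when every listed extension is present."""
    return all(extension_statuses(profile_dir, extensions).values())


# Demonstration against a throwaway directory with one extension present.
with tempfile.TemporaryDirectory() as tmp:
    zotero_id = EXTENSIONS["zotero_connector"]["id"]
    (Path(tmp) / "Default" / "Extensions" / zotero_id / "1.0_0").mkdir(parents=True)
    demo_statuses = extension_statuses(tmp)
    demo_all = all_installed(tmp)
```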

_get_installed_extension_paths(profile_dir)

Get paths to installed extensions for the --load-extension argument.

Return type:

list[str]

get_extension_args()

Get extension args using appropriate profile directory.
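For orientation, a list of extension paths is typically turned into Chrome launch flags along the lines of the sketch below. build_extension_args is a hypothetical helper; whether get_extension_args also emits --disable-extensions-except, or returns the flags in this order, is an assumption. Both flags are standard Chromium switches that take a comma-separated list of unpacked extension directories.

```python
def build_extension_args(extension_paths):
    """Build Chrome flags that load unpacked extensions from the given paths."""
    if not extension_paths:
        # Nothing installed: launch without extension flags.
        return []
    joined = ",".join(str(path) for path in extension_paths)
    return [
        f"--disable-extensions-except={joined}",
        f"--load-extension={joined}",
    ]
```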

async install_extensions_manually_if_not_installed_async(verbose=False)

Open Chrome for manual extension installation.

async handle_runtime_extension_dialogs_async(page)

Handle extension consent dialogs that appear at runtime.

sync_from_profile(source_profile_name='system')

Sync extensions and cookies from source profile to this profile using rsync.

Parameters:

source_profile_name (str) – Name of source profile (default: “system”)

Return type:

bool

Returns:

True if sync succeeded, False otherwise
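An rsync-based sync that reports success as a boolean could look roughly like the sketch below. The rsync flags, the choice to copy the whole profile directory, and the names build_rsync_command and sync_profile are all assumptions for illustration; the actual method may restrict the copy to the extension and cookie files.

```python
import shutil
import subprocess
from pathlib import Path


def build_rsync_command(source_dir, target_dir) -> list:
    """Build an archive-mode rsync command from source to target."""
    # A trailing slash on the source copies its contents, not the directory itself.
    return ["rsync", "-a", f"{source_dir}/", f"{target_dir}/"]


def sync_profile(source_dir, target_dir) -> bool:
    """Return True if rsync exits with status 0, False otherwise."""
    if shutil.which("rsync") is None or not Path(source_dir).is_dir():
        return False
    Path(target_dir).mkdir(parents=True, exist_ok=True)
    result = subprocess.run(
        build_rsync_command(source_dir, target_dir),
        capture_output=True,
    )
    return result.returncode == 0
```

Checking for the rsync binary and the source directory up front lets the function return False instead of raising, which matches the documented bool return.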