Metadata-Version: 2.4
Name: gemini-webui
Version: 0.1.0
Summary: Programmatic access to Google Gemini via web UI automation
Author: Gemini WebUI Contributors
License: MIT
License-File: LICENSE
Keywords: ai,automation,fastapi,gemini,google,llm,playwright
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Requires-Dist: playwright>=1.40.0
Provides-Extra: api
Requires-Dist: fastapi>=0.110.0; extra == 'api'
Requires-Dist: uvicorn[standard]>=0.29.0; extra == 'api'
Provides-Extra: cookies
Requires-Dist: browser-cookie3>=0.19; extra == 'cookies'
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.21; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff>=0.1; extra == 'dev'
Description-Content-Type: text/markdown

<div align="center">

# 🤖 Gemini Web UI Automation

**Programmatic access to Google Gemini — for free. No API key required.**

[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
[![Playwright](https://img.shields.io/badge/powered%20by-Playwright-orange.svg)](https://playwright.dev/python/)

Send prompts · Get responses · Model selection · Multi-turn conversations · Session persistence · Headless mode

</div>

---

## ✨ Features

- **Free access** — Uses the Gemini web UI, no API key or billing required
- **Model selection** — Switch between Fast, Thinking, and Pro models
- **Async & Sync APIs** — Use `GeminiClient` (async) or `GeminiClientSync` (sync)
- **Easy auth** — Log in via your regular browser cookies, no manual sign-in needed
- **Session persistence** — Log in once, reuse the session across runs
- **Headless mode** — Run without a display after initial setup
- **Anti-detection** — System Chrome + stealth measures to bypass automation detection
- **Multi-turn conversations** — Continue chats or start new ones
- **Response streaming detection** — Waits for the full response automatically

## 📦 Installation

```bash
# Clone the repo
git clone <your-repo-url>
cd gemini-api-again

# Install the package
pip install -e .

# Install Playwright browsers (first time only)
playwright install chromium

# Optional: Install cookie extraction support
pip install -e ".[cookies]"
```

> **Note:** Google Chrome must also be installed on your system. The library drives your system Chrome to avoid Google's "This browser or app may not be secure" block.
>
> ```bash
> # Ubuntu/Debian
> sudo apt install google-chrome-stable
> ```

## 🚀 Quick Start

### Option A — Login from your browser (easiest, no manual sign-in)

Extract cookies from your regular browser — no Playwright sign-in window needed:

```bash
# Install cookie extraction support
pip install -e ".[cookies]"
```

```python
import asyncio
from gemini_webui import GeminiClient

async def main():
    async with GeminiClient(headless=True, session_path="sessions/default.json") as client:
        # Extract cookies from Chrome. You must be logged into
        # gemini.google.com in Chrome, and Chrome must be closed
        # (cookie files are locked while the browser is running)
        await client.login_from_browser(browser_name="chrome")

        response = await client.send_prompt("Explain quantum computing in 3 sentences")
        print(response.text)

asyncio.run(main())
```

Supported browsers: `"chrome"`, `"chromium"`, `"firefox"`, `"opera"`, `"edge"`

### Option B — Manual login via Playwright browser window

Run this **once** to log in and save your session. A Chrome window opens — sign in with your Google account:

```python
import asyncio
from gemini_webui import GeminiClient

async def setup():
    async with GeminiClient(headless=False, session_path="sessions/default.json") as client:
        await client.login()  # Browser opens — log in manually
        # Session saved automatically on exit

asyncio.run(setup())
```

After that, use headless mode with the saved session:

```python
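# Run inside an async function, e.g. via asyncio.run() as in setup() above
async with GeminiClient(headless=True, session_path="sessions/default.json") as client:
    await client.login()  # Reuses the saved session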
    response = await client.send_prompt("Explain quantum computing in 3 sentences")
    print(response.text)
```

**Output:**
```
Quantum computing is a revolutionary technology that uses the principles of quantum
mechanics to process information in ways traditional computers cannot. Instead of
standard bits that represent either a 0 or a 1, quantum computers use qubits, which
can exist as both 0 and 1 simultaneously thanks to a property called superposition.
This unique ability allows them to perform complex calculations at unprecedented speeds.
```

### Sync API (simpler, no async/await)

```python
from gemini_webui import GeminiClientSync

with GeminiClientSync(headless=True, session_path="sessions/default.json") as client:
    # Option A: From browser cookies
    client.login_from_browser(browser_name="chrome")

    # Option B: From saved session
    # client.login()

    response = client.send_prompt("What is the meaning of life?")
    print(response.text)
```

## 🔄 Model Selection

Switch between Gemini models — **Fast** (Flash), **Thinking**, and **Pro**:

```python
from gemini_webui import GeminiClientSync

with GeminiClientSync(headless=True, session_path="sessions/default.json") as client:
    client.login()

    # List available models
    models = client.list_models()
    print(models)  # ['Fast', 'Thinking', 'Pro']

    # Check current model
    print(client.get_current_model())  # 'Fast'

    # Switch to Pro for advanced tasks
    client.set_model("pro")
    response = client.send_prompt("Solve: integral of x^2 * e^x dx")
    print(response.text)

    # Switch to Thinking for complex reasoning
    client.set_model("thinking")
    response = client.send_prompt("What is 15! / (3! * 5! * 7!)?")
    print(response.text)

    # Switch back to Fast for quick answers
    client.set_model("fast")
```

| Model | Accepted aliases | Description |
|-------|-------|-------------|
| Fast | `"fast"`, `"flash"` | Quick answers (Gemini Flash) |
| Thinking | `"thinking"` | Solves complex problems with reasoning |
| Pro | `"pro"` | Advanced math and code (Gemini Pro) |
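
The alias table above can be read as a simple lookup. The sketch below is illustrative only: `MODEL_ALIASES` and `resolve_model` are hypothetical names, not part of the library's documented API, and case-insensitive matching is an assumption.

```python
# Hypothetical alias table mirroring the documentation above.
MODEL_ALIASES = {
    "fast": "Fast",
    "flash": "Fast",
    "thinking": "Thinking",
    "pro": "Pro",
}


def resolve_model(alias: str) -> str:
    """Map a user-supplied alias to a model display name, or raise ValueError."""
    key = alias.strip().lower()  # Assumption: aliases are case-insensitive.
    try:
        return MODEL_ALIASES[key]
    except KeyError:
        valid = ", ".join(sorted(MODEL_ALIASES))
        raise ValueError(f"Unknown model {alias!r}; expected one of: {valid}") from None
```

Validating the alias before calling `set_model()` gives a clear error message up front rather than a failed UI interaction.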

## 💬 Multi-Turn Conversations

```python
from gemini_webui import GeminiClientSync

with GeminiClientSync(headless=True, session_path="sessions/default.json") as client:
    client.login()
    client.new_chat()  # Start a fresh conversation

    # First message
    r1 = client.send_prompt("Tell me about Python in 2 sentences")
    print(r1.text)

    # Follow-up — continues in the same conversation
    r2 = client.send_prompt("What are its main advantages?")
    print(r2.text)

    # Start a completely new conversation
    r3 = client.send_prompt("What is Rust?", new_chat=True)
    print(r3.text)
```

## 🎛️ Configuration

```python
from gemini_webui import GeminiClient

client = GeminiClient(
    headless=True,              # Run browser in headless mode
    timeout=120.0,              # Response wait timeout (seconds)
    session_path="sessions/default.json",  # Session persistence path
    browser_type="chromium",    # Browser engine (ignored if use_system_chrome=True)
    slow_mo=0,                  # Slow down actions by N ms (debugging)
    request_delay=1.0,          # Min seconds between prompts (avoid rate limits)
    use_system_chrome=True,     # Use system Chrome (avoids detection)
    user_data_dir=None,         # Persistent browser profile directory
)
```
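
To illustrate what `request_delay` is for, here is a minimal sketch of a pacing helper that enforces a minimum gap between calls. `RequestPacer` is a hypothetical name and this is not necessarily how the library implements the option; the clock and sleep functions are injectable only so the behaviour can be checked without real waiting.

```python
import time
from typing import Callable, Optional


class RequestPacer:
    """Enforce a minimum number of seconds between successive calls."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # Time of the previous call, if any.

    def wait(self) -> float:
        """Block until the interval has elapsed; return the seconds slept."""
        slept = 0.0
        if self._last is not None:
            remaining = self.min_interval - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
                slept = remaining
        self._last = self._clock()
        return slept
```

Calling `pacer.wait()` before each prompt would have the same effect as a `request_delay` of `min_interval` seconds: the first call passes immediately, later calls sleep only for the time still owed.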

## 📖 API Reference

### `GeminiClient` (async)

| Method | Description |
|--------|-------------|
| `GeminiClient(**kwargs)` | Create a new client. See [Configuration](#-configuration) for all options. |
| `async start()` | Start the browser |
| `async close()` | Close the browser and save session |
| `async login(session_path=None, login_timeout=120)` | Authenticate with a saved session, or by manual sign-in in the Playwright browser window |
| `async login_from_browser(browser_name="chrome", session_path=None)` | Authenticate by extracting cookies from your browser |
| `async send_prompt(prompt, *, new_chat=False) → GeminiResponse` | Send a prompt and get a response |
| `async new_chat()` | Start a new conversation |
| `async set_model(model)` | Switch model (`"fast"` or `"flash"`, `"thinking"`, `"pro"`) |
| `async get_current_model() → str` | Get the currently selected model name |
| `async list_models() → list[str]` | List available model names |
| `is_authenticated` | `bool` — Check if logged in |

Supports `async with` context manager.

### `GeminiClientSync` (sync)

Same API as `GeminiClient` but without `async`/`await`. Supports `with` context manager.

### `GeminiResponse`

| Field | Type | Description |
|-------|------|-------------|
| `text` | `str` | The full response text |
| `prompt` | `str` | The original prompt sent |
| `conversation_id` | `str \| None` | Conversation identifier from the URL |
| `model` | `str \| None` | Model name (if detectable) |
| `timestamp` | `datetime` | When the response was received |

```python
response = client.send_prompt("Hello")
print(response.text)             # "Hi! How can I help you today?"
print(response.prompt)           # "Hello"
print(response.conversation_id)  # "6ea876188b87661c"
print(response.timestamp)        # 2026-05-09 08:43:01
str(response)                    # "Hi! How can I help you today?"
```
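
For readers who want a mental model of the fields above, the following is a rough stand-in, not the library's actual class. `ResponseSketch` and `conversation_id_from_url` are hypothetical names, and the URL shape `.../app/<id>` is an assumption inferred from the "identifier from the URL" description.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class ResponseSketch:
    """Illustrative container with the fields listed in the table above."""

    text: str
    prompt: str
    conversation_id: Optional[str] = None
    model: Optional[str] = None
    timestamp: datetime = field(default_factory=datetime.now)

    def __str__(self) -> str:
        # str(response) yields the response text, as in the example above.
        return self.text


def conversation_id_from_url(url: str) -> Optional[str]:
    """Extract the ID from a URL of the assumed form .../app/<id>."""
    match = re.search(r"/app/([0-9A-Za-z_-]+)", url)
    return match.group(1) if match else None
```

A URL with no identifier after `/app` yields `None`, which matches the `str | None` type documented for `conversation_id`.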

### Exceptions

| Exception | Description |
|-----------|-------------|
| `AuthenticationError` | Not logged in or session expired |
| `CaptchaError` | CAPTCHA challenge detected |
| `RateLimitError` | Google rate limiting detected |
| `ResponseTimeoutError` | Response not received within timeout |
| `SelectorNotFoundError` | UI element not found (UI may have changed) |
| `BrowserError` | Browser launch or runtime error |
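
Transient failures such as rate limits and timeouts are often worth retrying. The helper below is a generic sketch of retry with exponential backoff; `retry_with_backoff` is a hypothetical name and not part of this library. It takes the exception types as a parameter because this README does not show the import path for the exception classes.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(); on a listed exception, wait and retry with doubling delays."""
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error.
            sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

With the sync client this might look like `retry_with_backoff(lambda: client.send_prompt("Hello"), retry_on=(RateLimitError, ResponseTimeoutError))`, assuming those classes can be imported from the package. Errors such as `CaptchaError` are deliberately left out of `retry_on`, since retrying will not solve a CAPTCHA.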

## 🔐 Authentication Methods

### Method 1: `login_from_browser()` — Extract cookies from your browser

The easiest way to authenticate. Extracts Google cookies from your regular browser and injects them into the Playwright context.

**Requirements:**
```bash
pip install -e ".[cookies]"
```

**Prerequisites:**
- You must be logged into `gemini.google.com` in your chosen browser
- Your browser must be closed (cookie files are locked while the browser is running)

**Usage:**
```python
async with GeminiClient(headless=True, session_path="sessions/default.json") as client:
    await client.login_from_browser(browser_name="chrome")
    # Session is saved — next time you can just use login()
```

**Supported browsers:** `"chrome"`, `"chromium"`, `"firefox"`, `"opera"`, `"edge"`

### Method 2: `login()` — Manual sign-in via Playwright window

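Opens a Chrome window where you sign in manually. The initial sign-in requires headed mode (`headless=False`); once the session is saved, `login()` reuses it and also works in headless mode.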

**First time:**
```python
async with GeminiClient(headless=False, session_path="sessions/default.json") as client:
    await client.login()  # Sign in manually in the browser window
```

**After that:**
```python
async with GeminiClient(headless=True, session_path="sessions/default.json") as client:
    await client.login()  # Uses saved session — no manual sign-in needed
```
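
The two methods can be combined: try the saved session first and fall back to browser cookies only when that fails. The sketch below assumes a `GeminiClientSync`-style client and assumes that `login()` raises an authentication-type exception when no usable session exists in headless mode. `ensure_logged_in` is a hypothetical helper, and the exception types are passed in because their import path is not shown here.

```python
from typing import Any, Tuple, Type


def ensure_logged_in(
    client: Any,
    auth_errors: Tuple[Type[BaseException], ...],
    browser_name: str = "chrome",
) -> str:
    """Try the saved session first, then fall back to browser cookies.

    Returns which path succeeded: "session" or "cookies".
    """
    try:
        client.login()  # Reuses the saved session if one exists.
        return "session"
    except auth_errors:
        # No usable session: pull cookies from the regular browser instead.
        client.login_from_browser(browser_name=browser_name)
        return "cookies"
```

Passing `auth_errors=(AuthenticationError,)` would be the natural choice if that class behaves as described in the Exceptions table.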

## 🛡️ Anti-Detection

Google blocks sign-in from browsers it detects as automated. This library uses several strategies to bypass that:

| Strategy | How |
|----------|-----|
| **System Chrome** | Uses your installed Chrome instead of Playwright's bundled Chromium |
| **Persistent profile** | Real user data directory — looks like a normal browser |
| **Stealth args** | `--disable-blink-features=AutomationControlled` and others |
| **JS patching** | Removes `navigator.webdriver` flag |
| **Realistic UA** | Standard Chrome user agent string |
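
Several of the strategies in the table map onto standard Playwright options. The following is a minimal sketch using Playwright's sync API, not the library's actual launch code: `build_launch_options`, `open_gemini`, and `HIDE_WEBDRIVER_JS` are hypothetical names, and only one stealth argument is shown.

```python
from typing import Any, Dict

# Init script that hides navigator.webdriver before any page script runs.
HIDE_WEBDRIVER_JS = (
    "Object.defineProperty(navigator, 'webdriver', { get: () => undefined });"
)


def build_launch_options(user_data_dir: str, headless: bool = False) -> Dict[str, Any]:
    """Keyword arguments for Playwright's launch_persistent_context()."""
    return {
        "user_data_dir": user_data_dir,  # Persistent profile directory.
        "channel": "chrome",  # Use the installed Google Chrome.
        "headless": headless,
        "args": ["--disable-blink-features=AutomationControlled"],
    }


def open_gemini(user_data_dir: str) -> None:
    """Launch Chrome with the options above and open the Gemini app page."""
    from playwright.sync_api import sync_playwright  # Imported lazily.

    with sync_playwright() as p:
        context = p.chromium.launch_persistent_context(
            **build_launch_options(user_data_dir)
        )
        context.add_init_script(HIDE_WEBDRIVER_JS)
        page = context.pages[0] if context.pages else context.new_page()
        page.goto("https://gemini.google.com/app")
        context.close()
```

Keeping the option building separate from the launch makes the configuration easy to inspect without starting a browser.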

### If you still get "This browser or app may not be secure"

1. **Use `login_from_browser()`** — This bypasses the sign-in entirely by using your existing browser cookies
2. **Install Google Chrome** (not Chromium) — Google is more lenient with its own browser
3. **Use a persistent user data directory** — Makes the browser look like a regular user's:
   ```python
   client = GeminiClient(
       headless=False,
       session_path="sessions/default.json",
       user_data_dir="/home/you/.config/gemini-automation-chrome",
   )
   ```

## 🔧 Troubleshooting

| Problem | Solution |
|---------|----------|
| "This browser or app may not be secure" | Use `login_from_browser()` instead, or see [Anti-Detection](#-anti-detection) |
| "No Google cookies found in chrome" | Log into `gemini.google.com` in your browser first, then close the browser |
| "Not logged in and running in headless mode" | Run with `headless=False` first, or use `login_from_browser()` |
| "CAPTCHA challenge detected" | Run with `headless=False` to solve it manually |
| "Rate limit detected" | Increase `request_delay` (e.g., `3.0`) and wait before retrying |
| "Response did not complete within timeout" | Increase `timeout` (e.g., `300.0`) |
| "No element found for any selector" | The Gemini UI changed — update selectors in `src/gemini_webui/selectors.py` |

## 🏗️ How It Works

```
┌─────────────┐     ┌──────────────┐     ┌─────────────────┐
│  Your Code   │────▶│ GeminiClient │────▶│  Chrome Browser  │
│              │     │              │     │  (System Chrome)  │
│ send_prompt()│     │  ┌────────┐  │     │                  │
│ new_chat()   │     │  │ Auth   │  │     │  gemini.google   │
│ login()      │     │  │Manager │  │     │     .com/app     │
│              │     │  └────────┘  │     │                  │
│              │◀────│  ┌────────┐  │◀────│  Response Text   │
│              │     │  │Response│  │     │                  │
│              │     │  │Parser  │  │     │                  │
│              │     │  └────────┘  │     │                  │
└─────────────┘     └──────────────┘     └─────────────────┘
                           │
                    ┌──────┴──────┐
                    │   Session    │
                    │  Persistence │
                    │ (cookies +   │
                    │  localStorage)│
                    └─────────────┘
```

1. **Browser launch** — Starts system Chrome with a persistent profile and stealth arguments
2. **Authentication** — Either extract cookies from your browser (`login_from_browser()`) or sign in manually (`login()`)
3. **Prompt sending** — Types into the contenteditable prompt area and clicks send
4. **Response detection** — Monitors the "Stop generating" button + content stability to detect when streaming is complete
5. **Text extraction** — Pulls the response text from the model's response container
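
Step 4 can be sketched as a polling loop that returns once generation has stopped and the text has stayed the same for a few polls. This is an illustration under assumptions, not the library's code: `wait_for_complete_response`, its parameters, and the two callables (which would wrap the real page queries) are all hypothetical.

```python
import time
from typing import Callable


def wait_for_complete_response(
    read_text: Callable[[], str],
    is_generating: Callable[[], bool],
    timeout: float = 120.0,
    poll_interval: float = 0.5,
    stable_polls: int = 3,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return the text once generation has stopped and the text is stable."""
    deadline = clock() + timeout
    last_text = read_text()
    unchanged = 0
    while clock() < deadline:
        sleep(poll_interval)
        text = read_text()
        if text and text == last_text and not is_generating():
            unchanged += 1
            if unchanged >= stable_polls:
                return text
        else:
            unchanged = 0  # Still streaming, or the text changed: start over.
        last_text = text
    raise TimeoutError("Response did not complete within timeout")
```

Requiring both signals guards against two failure modes: a stop button that disappears briefly mid-stream, and text that pauses while the model is still generating.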

## ⚠️ Limitations

- **Text prompts only** — No file/image uploads (yet)
- **UI changes** — Google may update the Gemini UI, breaking selectors. The library uses fallback selector chains for resilience.
- **Rate limits** — Google may throttle excessive usage. Use `request_delay` to pace requests.
- **Terms of Service** — Automating the web UI may violate Google's ToS. Use responsibly.
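
The fallback selector chains mentioned above can be pictured as "try each selector in priority order and take the first hit". The sketch below is a generic illustration; `first_match` is a hypothetical helper, not the library's function, and it raises a plain `LookupError` rather than the library's own exception.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def first_match(
    query: Callable[[str], Optional[T]],
    selectors: Sequence[str],
) -> T:
    """Return the first element found, trying selectors in priority order."""
    for selector in selectors:
        element = query(selector)
        if element is not None:
            return element
    raise LookupError(f"No element found for any selector: {list(selectors)}")
```

With Playwright, `query` could be `page.query_selector`, which returns `None` when nothing matches. Listing the newest selector first and older ones after it lets the same code survive a partial UI rollout.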

## 🛠️ Development

```bash
# Install in development mode
pip install -e ".[dev]"

# Run linter
ruff check src/

# Format code
ruff format src/
```

## 📄 License

MIT
