Metadata-Version: 2.4
Name: chromecap
Version: 0.1.9
Summary: Screenshot capture and Visual Analysis for UI Testing & Debugging via Terminal
Home-page: https://github.com/civaitechnologies/chrome-cap
Author: Nifemi Alpine
Author-email: hello@civai.co
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Utilities
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: click>=8.0.0
Requires-Dist: fastapi>=0.95.0
Requires-Dist: uvicorn>=0.15.0
Requires-Dist: requests>=2.25.0
Requires-Dist: python-socketio>=5.1.0
Requires-Dist: python-engineio>=4.0.0
Requires-Dist: tabulate>=0.8.9
Requires-Dist: python-dotenv>=0.19.0
Requires-Dist: psutil>=5.8.0
Requires-Dist: pydantic>=1.10.0
Requires-Dist: cursor-agent-tools>=0.1.25
Requires-Dist: beautifulsoup4>=4.12.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: flake8>=6.0.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Requires-Dist: isort>=5.10.0; extra == "dev"
Requires-Dist: pre-commit>=3.0.0; extra == "dev"
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license-file
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# ChromeCap

ChromeCap is a powerful Python package for capturing high-quality screenshots of Chrome tabs programmatically. It combines a FastAPI server, a web client, and Chrome extension integration to enable seamless screenshot capture for automation, testing, and analysis.

[![PyPI version](https://badge.fury.io/py/chromecap.svg)](https://badge.fury.io/py/chromecap)
[![Tests](https://github.com/civai-technologies/chrome-cap/actions/workflows/test.yml/badge.svg)](https://github.com/civai-technologies/chrome-cap/actions/workflows/test.yml)
[![License](https://img.shields.io/github/license/civai-technologies/chrome-cap)](https://github.com/civai-technologies/chrome-cap/blob/main/LICENSE)

## Key Features

- **Versatile Capture:** Screenshot any URL, including localhost development servers
- **Chrome Extension Integration:** Works with the BrowserGPT extension for advanced capture capabilities
- **Programmable API:** Both HTTP REST API and Python interface
- **Socket.IO Support:** Real-time communication between server and extension
- **Image Analysis:** Optional integration with cursor-agent-tools for AI-powered image analysis
- **Flexible Output:** Save to custom locations or get raw binary/base64 data
- **Post-Capture Actions:** Redirect to URLs or desktop apps after capture

## Table of Contents

- [Installation](#installation)
- [Quick Start](#quick-start)
- [Usage Examples](#usage-examples)
  - [Command Line Interface](#command-line-interface)
  - [Python API](#python-api)
  - [HTTP REST API](#http-rest-api)
- [Architecture](#architecture)
- [Extension Integration](#extension-integration)
- [Configuration](#configuration)
- [Image Analysis with AI](#image-analysis-with-ai)
- [Development](#development)
  - [Contributing](#contributing)
  - [Testing](#testing)
  - [Building](#building)
- [Security Considerations](#security-considerations)
- [License](#license)

## Installation

### From PyPI

```bash
pip install chromecap
```

### Prerequisites

- **Python 3.9+**
- **Chrome Browser**
- **BrowserGPT Chrome Extension** (for best experience) [Download Here](https://chromewebstore.google.com/detail/browsergpt-operator/hipciehccffmaaoghpleiffkcgbefjhf)

### For Image Analysis

```bash
pip install chromecap cursor-agent-tools
```

### Development Installation

```bash
git clone https://github.com/civai-technologies/chrome-cap.git
cd chrome-cap
pip install -e ".[dev]"
```

## Quick Start

1. **Start the server**:
   ```bash
   chromecap start
   ```

2. **Capture a screenshot**:
   ```bash
   chromecap capture https://example.com --output screenshot.png
   ```

3. **Analyze the screenshot** (requires cursor-agent-tools):
   ```bash
   chromecap capture https://example.com --query "What UI elements are present?"
   ```

## Usage Examples

### Command Line Interface

ChromeCap provides an intuitive CLI for most common tasks:

```bash
# Basic capture
chromecap capture https://example.com --output screenshot.png

# With redirect after capture
chromecap capture https://example.com --redirect "https://next-site.com"

# Debug mode with detailed logs
chromecap capture https://example.com --debug

# List available screenshots
chromecap list

# Get a specific screenshot by ID
chromecap get abc123 --output my_screenshot.png

# Analyze existing image
chromecap analyze path/to/image.png "Describe the UI layout in detail"
```

### Python API

Import and use ChromeCap directly in your Python code:

```python
import os
import requests
from pathlib import Path

# API endpoint
API_URL = "http://localhost:8000"

# Ensure server is running
def ensure_server_running():
    try:
        response = requests.get(f"{API_URL}/status", timeout=2)
        return response.status_code == 200
    except:
        # Start server if not running
        import subprocess
        subprocess.Popen(["chromecap", "start"])
        # Wait for server to start...

# Capture screenshot
def capture_screenshot(url, output_path):
    # Create request with unique ID
    import uuid
    request_id = str(uuid.uuid4())
    
    # Initiate capture
    response = requests.get(
        f"{API_URL}/api/capture",
        params={
            'url': url,
            'request_id': request_id,
        }
    )
    
    if response.status_code != 200:
        return None
    
    # Poll for the screenshot
    import time
    start_time = time.time()
    timeout = 30
    
    while time.time() - start_time < timeout:
        # Check for screenshot
        response = requests.get(f"{API_URL}/api/screenshots")
        screenshots = response.json().get("screenshots", [])
        
        # Find matching screenshot
        matching = [s for s in screenshots if s.get("request_id") == request_id]
        
        if matching:
            screenshot_id = matching[0].get("id")
            
            # Get the raw image
            img_response = requests.get(
                f"{API_URL}/api/raw-screenshot/{screenshot_id}"
            )
            
            if img_response.status_code == 200:
                # Save to file
                with open(output_path, "wb") as f:
                    f.write(img_response.content)
                return output_path
        
        time.sleep(0.5)
    
    return None

# Example usage
output_file = "example.png"
result = capture_screenshot("https://example.com", output_file)
if result:
    print(f"Screenshot saved to {result}")
```

For more complete Python examples, check the [examples](https://github.com/civai/chrome-cap/tree/main/examples/) directory:
- [Basic Capture](https://github.com/civai/chrome-cap/blob/main/examples/basic_capture.py)
- [Image Analysis](https://github.com/civai/chrome-cap/blob/main/examples/image_analysis.py)
- [API Server Integration](https://github.com/civai/chrome-cap/blob/main/examples/api_server.py)

### HTTP REST API

ChromeCap exposes a comprehensive REST API for integration with any language:

#### Capture a Screenshot
```
GET /api/capture?url=https://example.com&request_id=my-request-id
```

#### List Screenshots
```
GET /api/screenshots
```

#### Get Screenshot by ID
```
GET /api/screenshot/abc123
```

#### Get Raw Binary Screenshot
```
GET /api/raw-screenshot/abc123
```

#### Delete Screenshot
```
DELETE /api/screenshots/abc123
```

#### Check Server Status
```
GET /api/status
```

## Architecture

ChromeCap consists of three main components:

1. **FastAPI Backend Server**: Manages screenshot storage, REST API, and server-side functions
2. **Web Client**: Communicates with the Chrome extension
3. **Chrome Extension**: Performs the actual screenshot capture

The components communicate in this flow:

```
CLI/API → FastAPI Server → Web Client → Chrome Extension → Web Client → FastAPI Server → CLI/API
```

## Extension Integration

ChromeCap works best with the BrowserGPT Chrome extension, but also supports standard extension integrations:

### Extension Types

- **BrowserGPT (Default)**: Advanced screenshot capabilities with AI integration
- **STANDARD**: Basic DOM capture

To specify extension type:
```bash
chromecap capture https://example.com --extension-type STANDARD
```

### Socket.IO vs HTTP Fallback

ChromeCap uses Socket.IO for real-time communication between the server and extension:

1. **Socket.IO Mode**: Default when extension is already connected
2. **HTTP Fallback**: Used when no Socket.IO clients are available

## Configuration

ChromeCap can be configured through environment variables:

```bash
# Server configuration
export CHROMECAP_HOST=localhost
export CHROMECAP_PORT=8000

# Screenshot directory
export CHROMECAP_SCREENSHOTS_DIR=/path/to/screenshots

# Extension type
export CHROMECAP_EXTENSION_TYPE=BGPT
```

## Image Analysis with AI

When combined with cursor-agent-tools, ChromeCap can analyze screenshots using AI:

```bash
# Capture and analyze in one step
chromecap capture https://example.com --query "What usability issues do you see?"

# Analyze existing image
chromecap analyze screenshot.png "Identify UI inconsistencies"
```

Analysis output includes:
- Detailed text response from AI
- Identification of UI issues
- Design recommendations

To install cursor-agent-tools:
```bash
pip install cursor-agent-tools
```

## Development

### Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

### Testing

```bash
# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Test with coverage
pytest --cov=chromecap
```

### Building

```bash
# Build package
python -m build

# Install locally for testing
pip install -e .
```

## Security Considerations

- **URL Validation**: Always validate input URLs
- **File Access**: Restrict to screenshots directory
- **CORS Settings**: Configure based on your needs
- **Protocol Safety**: Validate protocol handlers

## License

Distributed under the MIT License. See [LICENSE](https://github.com/civai/chrome-cap/blob/main/LICENSE) for more information. 
