Metadata-Version: 2.4
Name: snapshot-query
Version: 0.1.2
Summary: Efficiently query and analyze snapshot log files generated by cursor-ide-browser
Project-URL: Homepage, https://github.com/cursor-ide/snapshot-query
Project-URL: Documentation, https://github.com/cursor-ide/snapshot-query#readme
Project-URL: Repository, https://github.com/cursor-ide/snapshot-query
Author-email: Cursor IDE Team <support@cursor.com>
License: MIT
License-File: LICENSE
Keywords: accessibility,browser,cursor,mcp,pydantic,query,snapshot
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.8
Requires-Dist: mcp>=1.0.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: pyyaml>=6.0
Provides-Extra: dev
Requires-Dist: pytest-cov>=4.0.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Description-Content-Type: text/markdown

# snapshot-query

Efficiently query and analyze snapshot log files generated by cursor-ide-browser.

## Features

- ✅ **Multiple Query Methods**:
  - Find by name (fuzzy/exact matching)
  - Find by role
  - Find by reference identifier
  - Regular expression queries (grep syntax)
  - CSS/jQuery selector queries
  - BM25 relevance-ranked queries

- ✅ **Data Validation**:
  - Pydantic-based data validation
  - Type-safe data models

- ✅ **MCP Server Interface**:
  - Model Context Protocol support
  - Query snapshot files via MCP tools

- ✅ **Command Line Tool**:
  - Rich CLI commands
  - Multiple output formats

## Installation

### Using uvx (Recommended, no installation needed)

```bash
# Run directly without installation
uvx snapshot-query <file_path> <command> [args]
```

### Using pip

```bash
pip install snapshot-query
```

After installation, you can use:
```bash
snapshot-query <file_path> <command> [args]
```

### Using Python Module

```bash
python -m snapshot_query <file_path> <command> [args]
```

## Quick Start

### Basic Queries

```bash
# Find elements containing "search"
uvx snapshot-query snapshot.log find-name "search"

# Find all buttons
uvx snapshot-query snapshot.log find-role button

# List all interactive elements
uvx snapshot-query snapshot.log interactive

# Count elements
uvx snapshot-query snapshot.log count
```

## Command Reference

### find-name
Find elements by name (fuzzy matching)

```bash
uvx snapshot-query snapshot.log find-name "search"
```

### find-name-exact
Find elements by name (exact matching)

```bash
uvx snapshot-query snapshot.log find-name-exact "search"
```
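
The difference between the two name commands can be pictured with a small stand-alone sketch. It assumes that "fuzzy" means a case-insensitive substring match (the Quick Start describes `find-name` as finding elements *containing* the query). The element dicts and the helper names `match_fuzzy` and `match_exact` are hypothetical and are not part of the package's API.

```python
# Hypothetical flat list of elements; real snapshots are parsed from the log file.
elements = [
    {"role": "button", "ref": "ref-1", "name": "search"},
    {"role": "textbox", "ref": "ref-2", "name": "Search the web"},
    {"role": "link", "ref": "ref-3", "name": "About"},
]


def match_fuzzy(items, query):
    """Assumed fuzzy rule: case-insensitive substring match on the name."""
    q = query.lower()
    return [e for e in items if q in e.get("name", "").lower()]


def match_exact(items, query):
    """Exact rule: the name must equal the query character for character."""
    return [e for e in items if e.get("name") == query]


fuzzy_refs = [e["ref"] for e in match_fuzzy(elements, "search")]
exact_refs = [e["ref"] for e in match_exact(elements, "search")]
print(fuzzy_refs)  # ['ref-1', 'ref-2']
print(exact_refs)  # ['ref-1']
```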

### find-name-bm25
Find elements using BM25 algorithm (ranked by relevance)

```bash
# Return all relevant results
uvx snapshot-query snapshot.log find-name-bm25 "search"

# Return top 10 most relevant results
uvx snapshot-query snapshot.log find-name-bm25 "search" 10
```
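
For readers unfamiliar with BM25, the following is a minimal stand-alone sketch of Okapi BM25 scoring over element names. It is not the package's implementation: the whitespace tokenization, the `k1` and `b` defaults, and the function name `bm25_rank` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_rank(names, query, k1=1.5, b=0.75, top_k=None):
    """Rank names against query with Okapi BM25; return (score, name) pairs."""
    docs = [name.lower().split() for name in names]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs if n_docs else 0.0
    # Document frequency: in how many names each term appears.
    doc_freq = Counter(term for d in docs for term in set(d))

    scored = []
    for name, doc in zip(names, docs):
        term_freq = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue
            df = doc_freq[term]
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
            # Length normalization: long names are penalized relative to the average.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        if score > 0:
            scored.append((score, name))

    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k] if top_k is not None else scored


names = ["search", "search the whole site", "login", "advanced search options"]
ranked = [name for _, name in bm25_rank(names, "search", top_k=2)]
print(ranked)  # ['search', 'advanced search options']
```

Because every matching name contains the query term once, the ranking here is driven by length normalization: shorter names score higher.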

### find-role
Find elements by role

```bash
uvx snapshot-query snapshot.log find-role button
```

### find-ref
Find element by reference identifier

```bash
uvx snapshot-query snapshot.log find-ref ref-b9k8zlttiah
```

### find-text
Find elements containing specified text

```bash
uvx snapshot-query snapshot.log find-text "login"
```

### find-grep
Find elements using regular expressions (grep syntax)

```bash
# Search name field (default)
uvx snapshot-query snapshot.log find-grep "^search$" name

# Search role field
uvx snapshot-query snapshot.log find-grep "button|link" role

# Search ref field
uvx snapshot-query snapshot.log find-grep "ref-.*abc" ref
```
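
The three patterns above can be tried out with a short sketch built on Python's `re` module. It assumes the tool's patterns behave like Python regular expressions and that elements are plain dicts; the helper `grep_elements` and the sample refs are hypothetical.

```python
import re

# Hypothetical elements with the three searchable fields.
elements = [
    {"role": "button", "ref": "ref-1abc", "name": "search"},
    {"role": "link", "ref": "ref-2xyz", "name": "search results"},
    {"role": "textbox", "ref": "ref-3abc", "name": "query"},
]


def grep_elements(items, pattern, field="name"):
    """Return elements whose `field` matches the regular expression `pattern`."""
    regex = re.compile(pattern)
    return [e for e in items if regex.search(str(e.get(field, "")))]


# "^search$" is anchored, so "search results" does not match.
anchored = [e["ref"] for e in grep_elements(elements, r"^search$")]
by_role = [e["ref"] for e in grep_elements(elements, r"button|link", "role")]
by_ref = [e["ref"] for e in grep_elements(elements, r"ref-.*abc", "ref")]
print(anchored, by_role, by_ref)
```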

### find-selector
Find elements using CSS/jQuery selector syntax

```bash
# Tag selector (role)
uvx snapshot-query snapshot.log find-selector "button"

# ID selector (ref)
uvx snapshot-query snapshot.log find-selector "#ref-rv7cgg62t9g"

# Attribute selector
uvx snapshot-query snapshot.log find-selector "[name='search']"

# Combined selector
uvx snapshot-query snapshot.log find-selector "button[name='search']"

# Contains match
uvx snapshot-query snapshot.log find-selector "[name*='Google']"
```
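
The mapping from selector parts to element fields (tag to role, `#id` to ref, `[attr='value']` to a field comparison) can be sketched as follows. This is a deliberately small parser that covers only the selector forms listed above. It is an illustration under that assumption, not the package's selector engine, and `SELECTOR` and `select` are hypothetical names.

```python
import re

SELECTOR = re.compile(
    r"^(?P<role>[\w-]+)?"        # optional tag part, mapped to role
    r"(?:#(?P<ref>[\w-]+))?"     # optional #id part, mapped to ref
    r"(?:\[(?P<attr>\w+)(?P<op>\*?=)'(?P<value>[^']*)'\])?$"  # [attr='v'] or [attr*='v']
)


def select(items, selector):
    """Return the elements matching a simplified CSS-like selector."""
    m = SELECTOR.match(selector.strip())
    if m is None:
        raise ValueError(f"unsupported selector: {selector!r}")
    parts = m.groupdict()

    def matches(e):
        if parts["role"] and e.get("role") != parts["role"]:
            return False
        if parts["ref"] and e.get("ref") != parts["ref"]:
            return False
        if parts["attr"]:
            actual = str(e.get(parts["attr"], ""))
            if parts["op"] == "=":
                return actual == parts["value"]
            return parts["value"] in actual  # "*=" means substring match
        return True

    return [e for e in items if matches(e)]


# Hypothetical sample elements.
elements = [
    {"role": "button", "ref": "ref-a1", "name": "search"},
    {"role": "button", "ref": "ref-b2", "name": "Google Search"},
    {"role": "link", "ref": "ref-c3", "name": "Google news"},
]

for sel in ["button", "#ref-c3", "button[name='search']", "[name*='Google']"]:
    print(sel, "->", [e["ref"] for e in select(elements, sel)])
```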

### interactive
List all interactive elements

```bash
uvx snapshot-query snapshot.log interactive
```

### count
Count elements by type

```bash
uvx snapshot-query snapshot.log count
```
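
Counting by type amounts to walking the accessibility tree and tallying roles. A minimal sketch, assuming the snapshot is available as nested dicts with a `children` list (a hypothetical in-memory shape, not necessarily how the package stores it):

```python
from collections import Counter

# Hypothetical tree: each node has a role and an optional list of children.
tree = {
    "role": "generic",
    "children": [
        {"role": "button", "name": "search"},
        {"role": "button", "name": "login"},
        {"role": "link", "name": "news", "children": [{"role": "generic"}]},
    ],
}


def iter_elements(node):
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.get("children", []):
        yield from iter_elements(child)


def count_by_role(root):
    """Tally how many elements carry each role."""
    return Counter(e.get("role", "unknown") for e in iter_elements(root))


counts = count_by_role(tree)
for role, total in counts.most_common():
    print(f"{role}: {total}")
```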

### path
Show element path in tree

```bash
uvx snapshot-query snapshot.log path ref-b9k8zlttiah
```
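
Conceptually, the path is the chain of ancestors from the root down to the element with the given ref. A depth-first search sketch, assuming the same nested-dict shape with a `children` list (hypothetical; `find_path` and `ref-other` are illustration names, not part of the package):

```python
# Hypothetical tree; the refs along the main branch follow the README examples.
tree = {
    "role": "generic", "ref": "ref-zketxgetcys", "children": [
        {"role": "generic", "ref": "ref-p37ecs217hp", "children": [
            {"role": "generic", "ref": "ref-zhx4wavxy6q", "children": [
                {"role": "button", "ref": "ref-b9k8zlttiah", "name": "search"},
            ]},
            {"role": "link", "ref": "ref-other", "name": "help"},
        ]},
    ],
}


def find_path(node, target_ref, trail=()):
    """Return the nodes from the root to the element with target_ref, or []."""
    trail = trail + (node,)
    if node.get("ref") == target_ref:
        return list(trail)
    for child in node.get("children", []):
        found = find_path(child, target_ref, trail)
        if found:
            return found
    return []


path = find_path(tree, "ref-b9k8zlttiah")
for level, node in enumerate(path):
    print(f"Level {level}: role={node['role']} ref={node['ref']}")
```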

### all-refs
List all reference identifiers

```bash
uvx snapshot-query snapshot.log all-refs
```

### convert-to-markdown
Convert snapshot log file to Markdown format

```bash
# Convert and output to console
uvx snapshot-query snapshot.log convert-to-markdown

# Convert and save to file
uvx snapshot-query snapshot.log convert-to-markdown output.md

# Convert without ref identifiers
uvx snapshot-query snapshot.log convert-to-markdown output.md --no-ref

# Convert with maximum depth limit
uvx snapshot-query snapshot.log convert-to-markdown output.md --max-depth 3
```

**Options:**
- `output_file` (optional): Output file path. If not provided, outputs to console
- `--no-ref`: Exclude ref identifiers from output
- `--max-depth <number>`: Limit rendering depth (useful for large snapshots)
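
To make the effect of these two options concrete, here is a simplified sketch of a tree renderer that honors a depth limit and can omit refs. It renders nested bullets rather than the headings and tables of the real output, treats the root as depth 0 with an inclusive limit, and uses a hypothetical nested-dict tree; all of these are assumptions, not the tool's actual behavior.

```python
def render_markdown(node, include_ref=True, max_depth=None, depth=0):
    """Render a node and its descendants as nested Markdown bullets."""
    if max_depth is not None and depth > max_depth:
        return []  # everything below the limit is skipped
    name = node.get("name")
    ref = node.get("ref")
    label = node.get("role", "unknown")
    if name:
        label += f' "{name}"'
    if include_ref and ref:
        label += f" (`{ref}`)"
    lines = ["  " * depth + "- " + label]
    for child in node.get("children", []):
        lines.extend(render_markdown(child, include_ref, max_depth, depth + 1))
    return lines


# Hypothetical three-level tree.
tree = {
    "role": "generic", "ref": "ref-root", "children": [
        {"role": "button", "ref": "ref-b1", "name": "search", "children": [
            {"role": "text", "ref": "ref-t1", "name": "icon"},
        ]},
    ],
}

full = render_markdown(tree)
trimmed = render_markdown(tree, include_ref=False, max_depth=1)
print("\n".join(full))
print("\n".join(trimmed))
```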

**Output Format:**
The generated Markdown document includes:

- **Document Header**: Source file name and generation timestamp
- **Overview Section**: Brief introduction and total element count
- **Statistics Section**: 
  - Element count by role in a table format with percentages
  - Interactive elements summary
- **Accessibility Tree Structure**: 
  - Hierarchical tree view with proper headings
  - Each element shows: role, name (if available), and reference identifier
  - Child elements count for each parent
- **Interactive Elements Reference**: 
  - Tables listing all interactive elements (links, buttons, textboxes, etc.)
  - Each table shows name and reference identifier for quick lookup
- **Notes Section**: Additional information about the document and usage

The document is structured as a coherent, readable Markdown file suitable for documentation, sharing, or further processing.

## Common Query Scenarios

### Scenario 1: Find button ref

**Problem**: I want to click a "search" button and need to find its ref

**Solution**:
```bash
uvx snapshot-query snapshot.log find-name "search"
```

**Output Example**:
```
Found 1 matching element:
role: button
ref: ref-b9k8zlttiah
name: search
```

**Usage**:
```python
browser_click(
  element="Search button",
  ref="ref-b9k8zlttiah"
)
```

### Scenario 2: Find all clickable elements

**Problem**: I want to see what interactive elements are on the page

**Solution**:
```bash
uvx snapshot-query snapshot.log interactive
```

### Scenario 3: Find input boxes

**Problem**: I need to find the ref of a username input box

**Solution**:
```bash
# Method 1: Find all textboxes
uvx snapshot-query snapshot.log find-role textbox

# Method 2: Find by name
uvx snapshot-query snapshot.log find-name "username"
```

### Scenario 4: Understand page structure

**Problem**: I want to know where an element is located in the page

**Solution**:
```bash
uvx snapshot-query snapshot.log path ref-b9k8zlttiah
```

**Output Example**:
```
Element path:

Level 0:
  role: generic
  ref: ref-zketxgetcys

Level 1:
  role: generic
  ref: ref-p37ecs217hp

Level 2:
  role: generic
  ref: ref-zhx4wavxy6q

Level 3:
  role: button
  ref: ref-b9k8zlttiah
  name: search
```

### Scenario 5: Batch find links

**Problem**: I want to find all links containing "news"

**Solution**:
```bash
# Use find-text to find elements containing "news"
uvx snapshot-query snapshot.log find-text "news"

# Or use find-selector for combined query
uvx snapshot-query snapshot.log find-selector "link[name*='news']"
```

## Advanced Usage

### Using Python API

```python
from snapshot_query import SnapshotQuery

# Load snapshot file
query = SnapshotQuery("snapshot.log")

# Find all buttons
buttons = query.find_by_role("button")
print(f"Found {len(buttons)} buttons")

# Find elements containing "login"
login_elements = query.find_by_name("login")
for elem in login_elements:
    print(f"Name: {elem.name}, ref: {elem.ref}")

# Use BM25 search
results = query.find_by_name_bm25("search", top_k=5)
for elem in results:
    print(f"Name: {elem.name}, ref: {elem.ref}")

# Use selector
results = query.find_by_selector("button[name='search']")
for elem in results:
    print(f"Name: {elem.name}, ref: {elem.ref}")

# Get element path
path = query.get_element_path("ref-b9k8zlttiah")
print(f"Element path depth: {len(path)}")
```

### Using grep for quick search (suitable for large files)

```bash
# Windows PowerShell
Select-String -Path "snapshot.log" -Pattern "name: search" -Context 0,2

# Linux/Mac
grep -A 2 "name: search" snapshot.log
```

### Batch processing multiple snapshot files

```python
from pathlib import Path
from snapshot_query import SnapshotQuery

# Snapshot log directory under the user's home (e.g. C:\Users\<username>\.cursor\browser-logs on Windows)
log_dir = Path.home() / ".cursor" / "browser-logs"

for log_file in log_dir.glob("snapshot-*.log"):
    print(f"\nProcessing file: {log_file.name}")
    query = SnapshotQuery(log_file)
    # Execute queries...
```

## MCP Server Interface

snapshot-query provides an MCP (Model Context Protocol) server interface that allows AI assistants to query snapshot files through a standardized protocol.

### Installation

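The MCP SDK (`mcp>=1.0.0`) is a declared dependency of snapshot-query, so installing the package normally pulls it in. If it is missing, install it manually: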

```bash
pip install mcp
```

### Starting MCP Server

**Method 1: Using command entry point (requires package installation)**
```bash
pip install snapshot-query
snapshot-query-mcp
```

**Method 2: Using Python module**
```bash
python -m snapshot_query.mcp_server
```

### Client Configuration

Configure the server in Cursor IDE or another MCP client by editing the MCP configuration file (usually `~/.cursor/mcp.json` or a similar location):

**Using installed package (recommended)**:
```json
{
  "mcpServers": {
    "snapshot-query": {
      "command": "snapshot-query-mcp",
      "args": []
    }
  }
}
```

**Using Python module**:
```json
{
  "mcpServers": {
    "snapshot-query": {
      "command": "python",
      "args": ["-m", "snapshot_query.mcp_server"]
    }
  }
}
```

**Using uvx (no local installation needed)**:
```json
{
  "mcpServers": {
    "snapshot-query": {
      "command": "uvx",
      "args": ["--from", "snapshot-query", "python", "-m", "snapshot_query.mcp_server"]
    }
  }
}
```

Note: When using uvx, the MCP SDK may need to be installed separately. The installed-package method is recommended.

### Available Tools

The MCP server provides the following tools:

1. **find_by_name** - Find elements by name (supports fuzzy and exact matching)
2. **find_by_name_bm25** - Find elements using BM25 algorithm (ranked by relevance)
3. **find_by_role** - Find elements by role type
4. **find_by_ref** - Find element by reference identifier
5. **find_by_text** - Find elements containing specified text
6. **find_by_regex** - Find elements using regular expressions
7. **find_by_selector** - Find elements using CSS/jQuery selectors
8. **find_interactive_elements** - Find all interactive elements
9. **count_elements** - Count elements by type
10. **get_element_path** - Get element path in tree
11. **extract_all_refs** - Extract all reference identifiers from snapshot file

### Usage Scenarios

**In AI Assistants**:
After configuring the MCP server, AI assistants can directly call these tools to query snapshot files.

**Working with cursor-ide-browser**:
1. Use `browser_snapshot()` to get a page snapshot
2. The snapshot is saved as a log file
3. Query the snapshot file via the MCP interface to find the element's ref
4. Use the ref for browser interaction operations

## Real-world Workflow Example

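```bash
# 1. Get a page snapshot (cursor-ide-browser tool call, not a shell command):
#    browser_snapshot()

# 2. Find the target element
uvx snapshot-query snapshot.log find-name "login"

# 3. Note the ref in the output, e.g. ref-xxxxx

# 4. Use the ref for the browser operation (again a tool call, not a shell command):
#    browser_click(element="Login button", ref="ref-xxxxx")
```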

## Tips

1. **Combine queries**: Use `find-name` to narrow down candidates, then `find-ref` to get the details of a specific element
2. **Use path**: The `path` command shows where an element sits in the page structure
3. **Batch processing**: Write scripts to process multiple snapshot files
4. **Cache results**: Save frequently queried refs to configuration files, keeping in mind that refs are only valid for the snapshot they came from
5. **BM25 ranking**: Use `find-name-bm25` for relevance-ranked results

## FAQ

**Q: Why can't I find an element?**  
A: Check that the element name is correct. Try fuzzy search with `find-name` instead of exact search, or use `find-name-bm25` for relevance-ranked matching.

**Q: Do ref values change?**  
A: Yes, ref values change with each page refresh or new snapshot. You need to get a new snapshot to get new ref values.

**Q: How do I batch process multiple snapshot files?**  
A: Use a script to loop through the files:
```python
from pathlib import Path
from snapshot_query import SnapshotQuery

for log_file in Path(".cursor/browser-logs").glob("snapshot-*.log"):
    query = SnapshotQuery(log_file)
    # Execute queries...
```

**Q: How do I fix an MCP SDK import error?**  
A: Ensure the MCP SDK is installed: `pip install mcp`

## Dependencies

- Python >= 3.8
- PyYAML >= 6.0
- Pydantic >= 2.0.0
- MCP >= 1.0.0

## More Information

For detailed documentation, see:
- [cursor-ide-browser.md](./cursor-ide-browser.md) - cursor-ide-browser tool documentation
- [AGENTS.md](./AGENTS.md) - Project development guide
