Metadata-Version: 2.4
Name: ge-eval
Version: 0.2.1
Summary: CLI toolkit for Gemini Enterprise Connector evaluation — init, check, run, and evaluate search quality against golden datasets.
Project-URL: Homepage, https://github.com/cloud-ai-fde/weiyih-gemini-enterprise-connector-eval
Project-URL: Repository, https://github.com/cloud-ai-fde/weiyih-gemini-enterprise-connector-eval
Author: Google LLC
License: Apache-2.0
License-File: LICENSE
Keywords: cli,evaluation,gemini,llm,rag
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Testing
Requires-Python: >=3.12
Requires-Dist: click>=8.1.0
Requires-Dist: google-auth>=2.49.1
Requires-Dist: google-cloud-discoveryengine
Requires-Dist: google-genai>=1.68.0
Requires-Dist: python-dotenv>=1.2.2
Requires-Dist: pyyaml>=6.0
Requires-Dist: requests>=2.32.5
Description-Content-Type: text/markdown

# Gemini Enterprise Connector — Evaluation Toolkit

[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://github.com/cloud-ai-fde/weiyih-gemini-enterprise-connector-eval/blob/main/LICENSE)
[![PyPI version](https://img.shields.io/pypi/v/ge-eval.svg)](https://pypi.org/project/ge-eval/)
[![Python](https://img.shields.io/pypi/pyversions/ge-eval.svg)](https://pypi.org/project/ge-eval/)

CLI toolkit for evaluating Gemini Enterprise Connector search quality. It
compares actual responses against a golden dataset and produces pass/fail
grades, a root cause analysis (RCA), and an interactive HTML dashboard.

## Installation

### Using `uv` (recommended)

```bash
uv pip install ge-eval
```

### Using `pip`

```bash
pip install ge-eval
```

### From source

```bash
git clone https://github.com/cloud-ai-fde/weiyih-gemini-enterprise-connector-eval.git
cd weiyih-gemini-enterprise-connector-eval
uv sync
```

## Quick Start

```bash
# Step 1: Scaffold a working directory with sample inputs
ge-eval init

# Step 2: Edit .env with your Google Cloud project settings

# Step 3: Validate configuration
ge-eval check

# Step 4: Query the API (sends questions, gets responses)
ge-eval run

# Step 5: Run LLM-judge evaluation
ge-eval eval

# Step 6: View results in the browser
ge-eval serve
# Then open http://localhost:8080
```

> **Note:** Run `ge-eval init` to generate an `INSTRUCTION.md` file with a
> comprehensive setup guide covering authentication, input structure, and
> command details.
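
For unattended runs, the non-interactive steps above can be chained from a script. The following is a minimal sketch, not part of the toolkit. It assumes `ge-eval` is on `PATH`, that `.env` is already configured, and that each subcommand exits with a non-zero status on failure (this README does not confirm that). The `run_pipeline` helper and its `runner` parameter are hypothetical names introduced for illustration.

```python
import subprocess

# Subcommands from the Quick Start, in order. `init` and `serve` are left out:
# `init` is a one-time scaffold and `serve` starts a local server.
STEPS = ("check", "run", "eval")


def run_pipeline(runner=subprocess.run):
    """Run each step in order, stopping at the first failing command."""
    for step in STEPS:
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit status (assumes ge-eval reports failure that way).
        runner(["ge-eval", step], check=True)
```

Calling `run_pipeline()` from the scaffolded working directory runs `check`, `run`, and `eval` back to back. The `runner` parameter exists only so the sequence can be exercised without invoking the real CLI.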

## What the Pipeline Does

`ge-eval eval` orchestrates 6 steps in sequence:

| Step | Action | Description |
|------|--------|-------------|
| 1 | **Validate Inputs** | Checks that the golden dataset, CSV, and HTML folder exist; validates that their questions align |
| 2 | **Generate Summaries** | Extracts agent trajectories from dolphin debug HTML files → `outputs/summaries/` |
| 3 | **Run LLM Judge** | Calls Gemini to evaluate all questions |
| 4 | **Enrich Golden Source** | Adds expected citations from golden dataset to the CSV |
| 5 | **Normalize Columns** | Reorders CSV to final 15-column schema |
| 6 | **Generate Report** | Creates `outputs/G_REPORT.md` with stats and detailed RCA |
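
The question-alignment check in step 1 can be pictured as a row-by-row comparison of the questions in the two inputs. The sketch below is illustrative only and is not the toolkit's implementation. It assumes both inputs are CSV text with a `question` column (a hypothetical column name) and treats "aligned" as the same questions in the same order. `read_questions` and `find_misaligned` are hypothetical helpers.

```python
import csv
import io


def read_questions(csv_text, column="question"):
    """Return the stripped values of `column` from CSV text, in row order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[column].strip() for row in reader]


def find_misaligned(golden_text, results_text):
    """Return (row_number, golden, result) for each row whose questions differ.

    Row numbers are 1-based and count data rows only. A row that is missing
    from one input is reported with None in that position.
    """
    golden = read_questions(golden_text)
    results = read_questions(results_text)
    mismatches = []
    for i in range(max(len(golden), len(results))):
        g = golden[i] if i < len(golden) else None
        r = results[i] if i < len(results) else None
        if g != r:
            mismatches.append((i + 1, g, r))
    return mismatches
```

An empty list means the two inputs line up. Any entry identifies the first place to look when `ge-eval check` or `ge-eval eval` reports an alignment problem.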

## CLI Commands

| Command | Description |
|---------|-------------|
| `ge-eval init` | Scaffold a working directory with sample inputs, `.env`, and `INSTRUCTION.md` |
| `ge-eval check` | Validate config, env vars, and input file alignment |
| `ge-eval run` | Batch-query the GE streamAssist API |
| `ge-eval eval` | Run the full LLM-judge evaluation pipeline |
| `ge-eval serve` | Start a local HTTP server for the evaluation viewer |
| `ge-eval clean` | Remove all files from `inputs/` and `outputs/`, and delete `INSTRUCTION.md` |

## Documentation

Full documentation is available at:
**[https://cloud-ai-fde.github.io/weiyih-gemini-enterprise-connector-eval/](https://cloud-ai-fde.github.io/weiyih-gemini-enterprise-connector-eval/)**

After running `ge-eval init`, see the generated `INSTRUCTION.md` for a
detailed setup guide including authentication, input structure, and output
column schema.

## Contributing

Contributions are welcome! See [`CONTRIBUTING.md`](CONTRIBUTING.md) for
guidelines on how to get started.

### Development Setup

```bash
# Clone and install dev dependencies
git clone https://github.com/cloud-ai-fde/weiyih-gemini-enterprise-connector-eval.git
cd weiyih-gemini-enterprise-connector-eval
uv sync

# Run tests
uv run pytest tests/ -v

# Preview documentation locally (live-reloading dev server)
uv run mkdocs serve
```

### Running Tests

```bash
# Run all tests
uv run pytest tests/ -v

# Run specific test module
uv run pytest tests/test_ge_eval_cli.py -v

# Run with coverage
uv run pytest tests/ --cov=ge_eval -v
```

## License

This project is licensed under the Apache License 2.0 — see the
[LICENSE](LICENSE) file for details.
