Metadata-Version: 2.4
Name: graphon-client
Version: 0.4.0
Summary: A Python client library for the Graphon API - Build knowledge graphs from your files
Author-email: Arbaaz Khan <arbaaz@graphon.ai>
Project-URL: Homepage, https://github.com/arbaazkhan2/graphon-client
Project-URL: Bug Reports, https://github.com/arbaazkhan2/graphon-client/issues
Project-URL: Source, https://github.com/arbaazkhan2/graphon-client
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: httpx>=0.28.1

# Graphon Client

A Python client library for the Graphon API - Build unified knowledge graphs from your files.

## Features

- 🚀 **Simple API**: Upload files, create knowledge bases, and query them in just a few lines
- 📁 **Multi-format Support**: Process videos (MP4, MOV), documents (PDF, DOCX), and images (JPG, PNG)
- 🔄 **Async/Await**: Built on httpx for high-performance async operations
- 🎯 **Auto Type Detection**: Automatically detects file types from extensions
- 📊 **Status Polling**: Built-in polling for long-running operations
- 🔒 **Secure**: API key authentication with role-based access control

## Installation

```bash
pip install graphon-client
```

## Quick Start

```python
import asyncio
from graphon_client import GraphonClient

async def main():
    # Initialize client with your API key
    client = GraphonClient(api_key="your_api_key_here")
    
    # Upload and process files (auto-detects file types)
    file_objects = await client.upload_and_process_files(
        ["/path/to/video.mp4", "/path/to/document.pdf"],
        poll_until_complete=True  # Wait for processing to complete
    )
    
    # Create a knowledge graph from processed files
    file_ids = [f.file_id for f in file_objects]
    group_id = await client.create_group(
        file_ids,
        group_name="My Knowledge Base",
        wait_for_ready=True  # Wait for graph building to complete
    )
    
    # Query your knowledge graph
    response = await client.query_group(
        group_id,
        "What are the main topics discussed?"
    )
    print(response.answer)

asyncio.run(main())
```

## One-Shot Convenience Method

For the simplest workflow, use the all-in-one method:

```python
async def quick_start():
    client = GraphonClient(api_key="your_api_key_here")
    
    # Upload, process, and create group in one call
    group_id = await client.upload_process_and_create_group(
        file_paths=["/path/to/file1.pdf", "/path/to/file2.mp4"],
        group_name="My Knowledge Base"
    )
    
    # Query immediately
    response = await client.query_group(group_id, "Summarize the content")
    print(response.answer)
```

## Get Your API Key

1. Sign up at [https://graphon.ai](https://graphon.ai)
2. Navigate to Settings → API Keys
3. Create a new API key
4. Save it securely (it's only shown once!)

## Core Concepts

### Files
Individual files that are uploaded and processed. Each file gets a unique `file_id` and goes through processing to extract content and build a knowledge graph.

**Processing Statuses:**
- `UNPROCESSED`: File uploaded but not yet processed
- `PROCESSING`: File is being processed
- `SUCCESS`: Processing completed successfully
- `FAILURE`: Processing failed

### Groups
Collections of files with a unified knowledge graph. Query multiple files together as a single knowledge base.

**Graph Statuses:**
- `pending`: Group created but no files added yet
- `building`: Unified graph is being built
- `ready`: Graph is ready for querying
- `failed`: Graph building failed

## API Reference

### Client Initialization

```python
client = GraphonClient(
    api_key="your_api_key_here",
    base_url="https://api-frontend-485250924682.us-central1.run.app"  # Optional
)
```

### File Operations

#### Upload and Process Files

```python
file_objects = await client.upload_and_process_files(
    file_paths=["/path/to/file1.pdf", "/path/to/file2.mp4"],
    poll_until_complete=True,  # Wait for processing (default: True)
    timeout=1800,  # Max wait time in seconds (default: 30 min)
    poll_interval=3,  # Check status every N seconds (default: 3)
    on_progress=lambda step, current, total: print(f"{step}: {current}/{total}")
)

# Returns list of FileObject with file_id, file_name, processing_status
```

#### Get File Status

```python
file_detail = await client.get_file_status(file_id)
print(f"Status: {file_detail.processing_status}")
```

#### List Files

```python
files = await client.list_files(
    status_filter="SUCCESS",  # Optional: filter by status
    file_type="video"  # Optional: filter by type
)
```

#### Poll File Until Complete (Manual)

```python
file_detail = await client.poll_file_until_complete(
    file_id,
    timeout=1800,
    poll_interval=3,
    on_progress=lambda status: print(f"Status: {status}")
)
```

### Group Operations

#### Create Group

```python
group_id = await client.create_group(
    file_ids=["file-id-1", "file-id-2"],
    group_name="My Knowledge Base",
    wait_for_ready=True,  # Wait for graph building (default: False)
    timeout=3600,  # Max wait time in seconds (default: 1 hour)
    poll_interval=5,  # Check status every N seconds (default: 5)
    on_progress=lambda status: print(f"Graph status: {status}")
)
```

#### Query Group

```python
response = await client.query_group(
    group_id="group-id-here",
    query="What are the key insights?"
)

print(response.answer)
for source in response.sources:
    print(f"Source: {source}")
```

#### Query with Attention Nodes (v0.4.0+)

The `attention_nodes` field shows all nodes that were fed into the LLM context window:

```python
response = await client.query_group(
    group_id="group-id-here",
    query="What are the key insights?"
)

# Access the answer and cited sources
print(f"Answer: {response.answer}")
print(f"\nCited sources: {len(response.sources)}")

# NEW in v0.4.0: Access all context nodes with similarity scores
print(f"All context nodes: {len(response.attention_nodes)}")

# Each attention node has 'source' (metadata) and 'score' (similarity)
for node in sorted(response.attention_nodes, key=lambda x: x['score'], reverse=True):
    source = node['source']
    score = node['score']
    node_type = source['node_type']
    
    if node_type == 'video':
        print(f"  [{score:.3f}] Video: {source['video_name']} [{source['start_time']}-{source['end_time']}s]")
    elif node_type == 'document':
        print(f"  [{score:.3f}] Document: {source['pdf_name']} (page {source['page_num']})")
    elif node_type == 'image':
        print(f"  [{score:.3f}] Image: {source['image_name']}")
```

**What are attention nodes?**
- `source`: Complete metadata for the node (same structure as `sources` array items)
- `score`: Similarity score showing relevance to your query (0.0 to 1.0, higher is more relevant)

This is useful for:
- Understanding what information the LLM considered when generating the answer
- Analyzing retrieval quality by examining similarity scores
- Debugging why certain information was or wasn't included in the answer

#### Get Group Status

```python
group_detail = await client.get_group_status(group_id)
print(f"Graph status: {group_detail.graph_status}")
print(f"Files in group: {len(group_detail.file_ids)}")
```

#### List Groups

```python
groups = await client.list_groups()
for group in groups:
    print(f"{group.group_name} - {group.graph_status} - {group.file_count} files")
```

Note: `list_groups()` returns `GroupListItem` objects (summary view with file_count), not full `GroupDetail` objects. Use `get_group_status(group_id)` for complete details.

#### Poll Group Until Ready (Manual)

```python
group_detail = await client.poll_group_until_ready(
    group_id,
    timeout=3600,
    poll_interval=5,
    on_progress=lambda status: print(f"Status: {status}")
)
```

## Supported File Types

### Videos
- `.mp4`, `.mov`, `.avi`, `.mkv`, `.webm`
- Automatic transcription and scene analysis
- Maximum size: Check API limits

### Documents
- `.pdf`, `.doc`, `.docx`, `.txt`
- Text extraction and semantic analysis
- Maximum size: Check API limits

### Images
- `.jpg`, `.jpeg`, `.png`, `.gif`, `.webp`
- OCR and visual analysis
- Maximum size: Check API limits

## Error Handling

```python
try:
    file_objects = await client.upload_and_process_files(file_paths)
except Exception as e:
    print(f"Upload failed: {e}")

try:
    response = await client.query_group(group_id, query)
except Exception as e:
    if "not ready" in str(e):
        print("Graph is still building, please wait")
    else:
        print(f"Query failed: {e}")
```

## Advanced Usage

### Progress Tracking

```python
def track_progress(step: str, current: int, total: int):
    percent = (current / total) * 100
    print(f"[{step}] {percent:.1f}% ({current}/{total})")

file_objects = await client.upload_and_process_files(
    file_paths,
    on_progress=track_progress
)
```

### Without Waiting (Non-blocking)

```python
# Start uploads and processing without waiting
file_objects = await client.upload_and_process_files(
    file_paths,
    poll_until_complete=False  # Returns immediately
)

# Poll manually later
for file_obj in file_objects:
    file_detail = await client.poll_file_until_complete(file_obj.file_id)
    print(f"{file_detail.file_name}: {file_detail.processing_status}")
```

### Custom Timeouts

```python
# For large files that take longer to process
file_objects = await client.upload_and_process_files(
    file_paths,
    poll_until_complete=True,
    timeout=3600,  # 1 hour timeout
    poll_interval=10  # Check every 10 seconds
)

# For large groups that take longer to build
group_id = await client.create_group(
    file_ids,
    group_name="Large Knowledge Base",
    wait_for_ready=True,
    timeout=7200,  # 2 hour timeout
    poll_interval=15  # Check every 15 seconds
)
```

## Changelog

### v0.4.0 (2024-11-25)

**Added:**
- `attention_nodes` field in `QueryResponse` - Access all context nodes fed to the LLM with their similarity scores
- Cleaner nested structure: each attention node has `source` (metadata) and `score` (similarity)
- Each attention node includes:
  - `source`: Complete source metadata (video/document/image details)
  - `score`: Similarity score for the query (0.0 to 1.0, higher = more relevant)

**Example:**
```python
response = await client.query_group(group_id, query)
print(f"Cited: {len(response.sources)}, Total context: {len(response.attention_nodes)}")

# Access scores and sources
for node in response.attention_nodes:
    print(f"Score: {node['score']:.3f}, Type: {node['source']['node_type']}")
```

## Migration from v0.1.x

Version 0.2.0 introduces breaking changes aligned with the new API architecture.

### Changed
- **Authentication**: Now uses API keys instead of bearer tokens
  ```python
  # Old (v0.1.x)
  client = GraphonClient(token="xDhMfTDCpfwewocP93d5")
  
  # New (v0.2.0)
  client = GraphonClient(api_key="sk_live_...")
  ```

- **Upload workflow**: Simplified to a single method
  ```python
  # Old (v0.1.x)
  upload_infos = await client.generate_upload_urls(filenames)
  # ... manual upload logic ...
  await client.upload_files(file_paths)
  
  # New (v0.2.0)
  file_objects = await client.upload_and_process_files(file_paths)
  ```

- **Group creation**: Now uses file_ids instead of uuid_directories
  ```python
  # Old (v0.1.x)
  group_uuid = await client.create_index(uuid_directories)
  
  # New (v0.2.0)
  file_ids = [f.file_id for f in file_objects]
  group_id = await client.create_group(file_ids, group_name)
  ```

- **Querying**: Updated method signature
  ```python
  # Old (v0.1.x)
  answer = await client.query(group_uuid, query_text)  # Returns string
  
  # New (v0.2.0)
  response = await client.query_group(group_id, query)  # Returns QueryResponse
  print(response.answer)
  print(response.sources)
  ```

### Removed
- `generate_upload_urls()` - Replaced by `upload_and_process_files()`
- `upload_file_to_signed_url()` - Handled internally
- `upload_files()` - Replaced by `upload_and_process_files()`
- `create_index()` - Replaced by `create_group()`

## Support

- Documentation: [https://docs.graphon.ai](https://docs.graphon.ai)
- Email: support@graphon.ai
- Issues: [GitHub Issues](https://github.com/arbaazkhan2/graphon-client/issues)

## License

MIT License - see LICENSE file for details
