Metadata-Version: 2.4
Name: inframe
Version: 0.1.0
Summary: Screen Context Recording and Querying SDK
Author-email: Inframe <bendgeist99@gmail.com>
License: MIT
Project-URL: Source, https://github.com/BenDGeist/inframe.git
Project-URL: Documentation, https://github.com/BenDGeist/inframe.git
Keywords: screen recording,context,AI,transcription,video processing
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Operating System :: MacOS :: MacOS X
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: opencv-python<4.9.0,>=4.5.0
Requires-Dist: mss>=9.0.0
Requires-Dist: numpy<2.0.0,>=1.21.0
Requires-Dist: openai>=1.0.0
Requires-Dist: pillow>=10.0.0
Requires-Dist: faster-whisper>=0.7.0
Requires-Dist: transformers>=4.20.0
Requires-Dist: jinja2>=3.0.0
Requires-Dist: torch>=1.11.0
Requires-Dist: fire>=0.4.0
Requires-Dist: pyobjc-core>=9.0
Requires-Dist: pyobjc-framework-Cocoa>=9.0
Requires-Dist: pyobjc-framework-AVFoundation>=9.0
Requires-Dist: pyobjc-framework-Quartz>=9.0
Requires-Dist: pyobjc-framework-ScreenCaptureKit>=9.0
Requires-Dist: mcp
Requires-Dist: fastmcp
Provides-Extra: dev
Requires-Dist: pytest>=6.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: black>=22.0; extra == "dev"
Requires-Dist: flake8>=4.0; extra == "dev"

# Inframe - Screen Context Recording and Querying SDK

A Python SDK for intelligent screen recording, context analysis, and real-time querying. Inframe captures screen activity, processes audio and visual content, and provides an AI-powered interface for understanding your digital workspace.

## Features

- **Real-time Screen Recording**: Native macOS recording with ScreenCaptureKit
- **Context-Aware Analysis**: Combines audio transcription with visual content analysis
- **Intelligent Querying**: Ask questions about your screen activity and get AI-powered answers
- **Rolling Buffer**: Maintains recent context for continuous analysis
- **Modular Architecture**: Separate recorders for different applications and contexts
- **Async Processing**: Non-blocking pipeline for smooth operation
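
The rolling buffer idea can be illustrated with a bounded deque. This is only a minimal sketch of the general technique, not Inframe's actual implementation; `RollingBuffer` and its methods are hypothetical names chosen for the example.

```python
from collections import deque


class RollingBuffer:
    """Keeps only the most recent `max_items` entries (hypothetical helper)."""

    def __init__(self, max_items):
        # A deque with maxlen silently drops the oldest entry once it is full
        self._items = deque(maxlen=max_items)

    def add(self, timestamp, text):
        self._items.append((timestamp, text))

    def recent(self):
        """Return buffered entries, oldest first."""
        return list(self._items)


buffer = RollingBuffer(max_items=3)
for second in range(5):
    buffer.add(second, f"frame summary {second}")

print(buffer.recent())  # only the three newest entries remain
```

Because old entries are discarded as new ones arrive, memory use stays bounded no matter how long recording runs.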

## Quick Start

### 1. Install the Package
```bash
# Clone the repository
git clone https://github.com/BenDGeist/inframe.git
cd inframe

# Create conda environment
conda env create -f environment.yml
conda activate inframe

# Install in development mode
pip install -e .
```

### 2. Set Up Environment Variables
```bash
export OPENAI_API_KEY="your-openai-api-key-here"
```

### 3. Basic Usage
```python
import asyncio
import os

from inframe import ContextRecorder, ContextQuery


# Callback invoked with each query result
async def on_code_change_requested(result):
    if "code change" in result.answer.lower():
        print("Boss requested a code change!")
        # Handle the request...


async def main():
    api_key = os.getenv("OPENAI_API_KEY")

    # Initialize recorder and query system
    recorder = ContextRecorder(openai_api_key=api_key)
    query = ContextQuery(openai_api_key=api_key, model="gpt-4o")

    # Set up screen recording
    screen_recorder = recorder.add_recorder(
        include_apps=["VS Code", "PyCharm", "Cursor"],
        recording_mode="full_screen",
        visual_task="Describe visible code, terminal output, and development activity."
    )

    # Set up Slack monitoring
    slack_recorder = recorder.add_recorder(
        include_apps=["Slack"],
        recording_mode="full_screen",
        visual_task="Summarize recent DMs and workspace activity."
    )

    # Define a query to monitor for specific events
    query.add_query(
        prompt="Has my boss asked for a code change?",
        recorder=slack_recorder,
        callback=on_code_change_requested,
        interval_seconds=5
    )

    # Start recording and monitoring
    await recorder.start()
    await query.start()


# `await` is only valid inside a coroutine, so run the setup through asyncio.run()
asyncio.run(main())
```

## Core Components

### ContextRecorder
Manages multiple screen recorders and coordinates the recording pipeline:

```python
recorder = ContextRecorder(openai_api_key="your-key")

# Add different types of recorders
code_recorder = recorder.add_recorder(
    include_apps=["VS Code", "PyCharm"],
    recording_mode="full_screen",
    visual_task="Describe code changes and development activity"
)

meeting_recorder = recorder.add_recorder(
    include_apps=["Zoom", "Teams", "Google Meet"],
    recording_mode="full_screen",
    visual_task="Summarize meeting content and participants"
)
```

### ContextQuery
Provides intelligent querying capabilities over recorded content:

```python
query = ContextQuery(openai_api_key="your-key", model="gpt-4o")

# Continuous monitoring (slack_recorder is a recorder returned by recorder.add_recorder)
query.add_query(
    prompt="Is there an urgent message from my manager?",
    recorder=slack_recorder,
    interval_seconds=10
)

# One-time analysis; `await` must run inside an async function
result = await query.ask(
    prompt="What was I working on in the last 30 minutes?",
    recorder=code_recorder
)
print(result.answer)
```

### TranscriptionPipeline
Advanced audio and visual processing pipeline:

```python
from src.transcription_pipeline import create_transcription_pipeline

pipeline = create_transcription_pipeline(
    use_openai=True,
    whisper_model="small.en",
    language="en"
)

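# Process video clips with full analysis.
# `video_stream` is assumed to be an existing video stream object;
# the `await` must run inside an async function.
await pipeline.start_pipeline(video_stream, visual_task="Describe screen activity")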
```

## Project Structure

```
inframe/
├── inframe/                    # Main package
│   ├── __init__.py            # Package exports
│   ├── recorder.py            # ContextRecorder class
│   └── query.py               # ContextQuery class
├── src/                       # Core implementation
│   ├── video_stream.py        # Video recording and streaming
│   ├── transcription_pipeline.py  # Audio/visual processing
│   ├── context_integrator.py  # Context management
│   ├── context_querier.py     # Query processing
│   └── tldw_utils.py          # Transcription utilities
├── tests/                     # Test suite
├── agents/                    # Example agent implementations
├── environment.yml            # Conda environment
├── requirements.txt           # Python dependencies
├── setup.py                   # Package setup
└── pyproject.toml            # Modern Python packaging
```

## Installation

### Prerequisites
- macOS 12.3 or later (ScreenCaptureKit, used for native screen recording, requires it)
- Python 3.8+
- Conda or pip

### Dependencies
Core dependencies include:
- `opencv-python>=4.5.0,<4.9.0` - Video processing
- `numpy>=1.21.0,<2.0.0` - Numerical computing
- `openai>=1.0.0` - AI analysis
- `faster-whisper>=0.7.0` - Speech recognition
- `pyobjc-framework-*` - macOS integration
- `mcp` and `fastmcp` - Model Context Protocol
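
To see which of these packages are present in the active environment, the standard library's `importlib.metadata` can report installed versions. A minimal sketch; `installed_versions` is a hypothetical helper, and it only reports versions rather than checking them against the pins above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


core = ["opencv-python", "numpy", "openai", "faster-whisper", "mcp", "fastmcp"]
for name, version in installed_versions(core).items():
    print(f"{name}: {version or 'not installed'}")
```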

### Environment Setup
```bash
# Create environment from yml file
conda env create -f environment.yml
conda activate inframe

# Or install dependencies manually
pip install -r requirements.txt
```

## Advanced Usage

### Custom Visual Tasks
Define specific analysis tasks for different applications:

```python
# Code review assistant
code_recorder = recorder.add_recorder(
    include_apps=["VS Code", "GitHub"],
    visual_task="Identify code changes, review comments, and pull request status"
)

# Meeting summarizer  
meeting_recorder = recorder.add_recorder(
    include_apps=["Zoom", "Teams"],
    visual_task="Summarize meeting topics, participants, and action items"
)
```

### Callback System
Respond to events in real-time:

```python
async def on_urgent_message(result):
    if "urgent" in result.answer.lower():
        # Send notification; send_notification is a placeholder for your own
        # async function, it is not provided by Inframe
        await send_notification("Urgent message detected!")

query.add_query(
    prompt="Is there an urgent or important message?",
    recorder=slack_recorder,
    callback=on_urgent_message,
    interval_seconds=5
)
```
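
The interval-plus-callback pattern can be tried without any recording set up. The sketch below shows one way such a polling loop could be structured with `asyncio`. It is an illustration, not Inframe's internals: `poll`, `Result`, and `fake_ask` are hypothetical stand-ins, and only the `answer` attribute is taken from the examples in this README.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Result:
    """Stand-in for a query result; only `answer` mirrors the README examples."""
    answer: str


async def poll(ask, callback, interval_seconds, max_runs):
    """Call `ask()` every `interval_seconds` and pass each result to `callback`."""
    for _ in range(max_runs):
        result = await ask()
        await callback(result)
        await asyncio.sleep(interval_seconds)


async def demo():
    seen = []

    async def fake_ask():
        # Stub that always returns the same answer; a real query would go here
        return Result(answer="Urgent: deploy is failing")

    async def on_result(result):
        if "urgent" in result.answer.lower():
            seen.append(result.answer)

    await poll(fake_ask, on_result, interval_seconds=0.01, max_runs=3)
    return seen


print(asyncio.run(demo()))
```

Keeping the callback `async` lets it await slow work, such as sending a notification, without blocking other tasks on the event loop.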

### Context Integration
Combine multiple sources for comprehensive analysis:

```python
# Integrate screen activity with calendar
calendar_context = "I have a meeting in 30 minutes about project X"

result = await query.ask(
    prompt="Am I prepared for my upcoming meeting?",
    recorder=code_recorder,
    context=calendar_context
)
```
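
When several external sources are involved, it can help to assemble them into one labeled string before passing it as `context`. A small sketch; `build_context` is a hypothetical helper, and the labels and texts are made up for illustration.

```python
def build_context(sources):
    """Join labeled context snippets into one string, skipping empty ones."""
    lines = [
        f"{label}: {text.strip()}"
        for label, text in sources.items()
        if text and text.strip()
    ]
    return "\n".join(lines)


context = build_context({
    "Calendar": "Meeting in 30 minutes about project X",
    "Notes": "",
    "Tickets": "Two open bugs assigned to me",
})
print(context)
```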

## Configuration

### Environment Variables
```bash
export OPENAI_API_KEY="your-api-key"
export KMP_DUPLICATE_LIB_OK="TRUE"  # Works around duplicate OpenMP runtime errors on macOS
```
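
The same variables can also be set from Python, as long as this happens before the libraries that read them are imported. A minimal sketch using only the standard library; `configure_environment` is a hypothetical helper, not part of Inframe.

```python
import os


def configure_environment():
    """Apply defaults and fail early if the API key is missing."""
    # setdefault leaves any value already exported in the shell untouched
    os.environ.setdefault("KMP_DUPLICATE_LIB_OK", "TRUE")

    if not os.environ.get("OPENAI_API_KEY"):
        raise RuntimeError("OPENAI_API_KEY is not set")


# Call this at the top of your script, before importing inframe:
# configure_environment()
```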

### Recording Settings
- `recording_mode`: "full_screen" or "window"
- `include_apps`: List of applications to monitor
- `visual_task`: Specific analysis instructions
- `interval_seconds`: Query frequency
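
One way to catch typos in these settings early is to validate them before handing them to the SDK. The sketch below is an assumption-laden illustration: `RecorderSettings` is a hypothetical class, and the default values and the rule that `include_apps` must be non-empty are guesses rather than documented Inframe behavior.

```python
from dataclasses import dataclass
from typing import List

VALID_MODES = {"full_screen", "window"}


@dataclass
class RecorderSettings:
    """Hypothetical container for the settings listed above."""
    include_apps: List[str]
    visual_task: str
    recording_mode: str = "full_screen"  # assumed default
    interval_seconds: float = 5.0        # assumed default

    def __post_init__(self):
        if self.recording_mode not in VALID_MODES:
            raise ValueError(f"recording_mode must be one of {sorted(VALID_MODES)}")
        if not self.include_apps:
            raise ValueError("include_apps must list at least one application")
        if self.interval_seconds <= 0:
            raise ValueError("interval_seconds must be positive")


settings = RecorderSettings(include_apps=["Slack"], visual_task="Summarize recent DMs")
print(settings)
```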

## Testing

Run the test suite to verify installation:

```bash
pip install -e ".[dev]"  # installs pytest and the other dev tools
pytest tests/ -v
```

## Performance Considerations

- **Memory Usage**: Rolling buffers keep only recent context, so memory use stays bounded
- **API Costs**: Longer query intervals mean fewer OpenAI API calls
- **Processing**: The async pipeline keeps recording and querying from blocking each other
- **Storage**: Temporary files are automatically cleaned up
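
The effect of `interval_seconds` on API usage is simple arithmetic. The sketch below assumes each query run makes exactly one API call, which may not match how Inframe batches requests; `calls_per_hour` is a hypothetical helper.

```python
def calls_per_hour(interval_seconds, num_queries=1):
    """Query invocations per hour at a fixed polling interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return num_queries * 3600 / interval_seconds


print(calls_per_hour(5))                  # 720.0
print(calls_per_hour(10))                 # 360.0
print(calls_per_hour(60, num_queries=3))  # 180.0
```

Doubling the interval halves the number of calls, so the 5-second interval used in the examples is best reserved for queries that really need fast reactions.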

## Troubleshooting

### Common Issues
1. **Screen Recording Permissions**: Grant access in System Settings > Privacy & Security > Screen Recording (on macOS 12: System Preferences > Security & Privacy > Privacy)
2. **OpenAI API**: Ensure API key is set and valid
3. **Dependencies**: Use the provided environment.yml for consistent setup
4. **Audio Issues**: Check system audio permissions
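
Some of these problems can be ruled out with a quick check before starting a recording. A minimal sketch using only the standard library; `preflight_problems` is a hypothetical helper, and it cannot detect whether screen recording or audio permissions have been granted.

```python
import platform
import sys


def preflight_problems(system=None, python_version=None):
    """Return a list of human-readable problems; empty means the basics look fine."""
    system = system or platform.system()
    python_version = python_version or sys.version_info[:2]

    problems = []
    if system != "Darwin":  # platform.system() reports "Darwin" on macOS
        problems.append(f"macOS is required, found {system}")
    if tuple(python_version) < (3, 8):
        problems.append("Python 3.8 or newer is required")
    return problems


for problem in preflight_problems():
    print("Problem:", problem)
```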

### Debug Mode
Enable verbose logging for troubleshooting:

```python
import logging
logging.basicConfig(level=logging.DEBUG)
```

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests for new functionality
5. Submit a pull request

## License

MIT License - see LICENSE file for details.

## Acknowledgments

- Built on techniques from the tldw project for video summarization
- Uses faster-whisper for efficient speech recognition
- Integrates OpenAI's GPT models for intelligent analysis 
