Metadata-Version: 2.2
Name: donew
Version: 0.1.6
Summary: A Python package for web processing and vision tasks with browser automation capabilities
Author-email: Kenan Deniz <kenan@unrealists.com>
License: MIT
Keywords: web automation,vision,browser,playwright,image processing,knowledge graph
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Internet :: WWW/HTTP :: Browsers
Classifier: Topic :: Scientific/Engineering :: Image Processing
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: aiohttp>=3.9.1
Requires-Dist: playwright>=1.40.0
Requires-Dist: typing-extensions>=4.12.2
Requires-Dist: tabulate>=0.9.0
Requires-Dist: icecream>=2.1.4
Requires-Dist: smolagents[litellm,mcp]>=1.4.1
Requires-Dist: mcp[cli]>=1.2.0
Requires-Dist: greenlet>=3.1.1
Requires-Dist: tqdm>=4.67.1
Requires-Dist: arize-phoenix>=7.12.0
Provides-Extra: dev
Requires-Dist: pytest>=8.3.4; extra == "dev"
Requires-Dist: pytest-asyncio>=0.25.0; extra == "dev"
Requires-Dist: pytest-cov>=4.1.0; extra == "dev"
Requires-Dist: black>=23.10.0; extra == "dev"
Requires-Dist: isort>=5.3.2; extra == "dev"
Requires-Dist: mypy>=1.14.0; extra == "dev"
Requires-Dist: ruff>=0.8.4; extra == "dev"
Requires-Dist: arize-phoenix>=7.12.0; extra == "dev"
Requires-Dist: opentelemetry-sdk>=1.27.0; extra == "dev"
Requires-Dist: opentelemetry-exporter-otlp>=1.27.0; extra == "dev"
Requires-Dist: openinference-instrumentation-smolagents>=0.1.0; extra == "dev"
Provides-Extra: kg
Requires-Dist: glirel; extra == "kg"
Requires-Dist: spacy; extra == "kg"
Requires-Dist: gliner-spacy; extra == "kg"
Requires-Dist: kuzu; extra == "kg"
Requires-Dist: torch; extra == "kg"
Requires-Dist: loguru; extra == "kg"
Requires-Dist: sentencepiece; extra == "kg"
Requires-Dist: protobuf<=3.20.3; extra == "kg"

# DoNew

[![PyPI version](https://badge.fury.io/py/donew.svg)](https://badge.fury.io/py/donew)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/donew)](https://pypi.org/project/donew/)
[![PyPI - License](https://img.shields.io/pypi/l/donew)](https://pypi.org/project/donew/)

A powerful Python package designed for AI agents to perform web processing, document navigation, and autonomous task execution. DoNew provides a high-level, agentic interface that makes it easy for AI systems to interact with web content and documents.

## Quick Install

```bash
pip install donew
donew-install-browsers  # Install required browsers
```

## Why DoNew?

DoNew is built with AI agents in mind, providing intuitive interfaces for:
- Autonomous web navigation and interaction
- Document understanding and processing
- Task execution and decision making
- State management and context awareness

## Features

- Browser automation using Playwright
- Web page processing and interaction
- Vision-related tasks and image processing
- Easy-to-use API for web automation
- Async support for better performance
- AI-friendly interfaces for autonomous operation

## Roadmap

### Current Features
- **DO.Browse**: Agentic web navigation
  - Autonomous webpage interaction
  - Element detection and manipulation
  - State awareness and context management
  - Cookie and storage handling
  - Visual debugging tools

### Coming Soon
- **DO.Read**: Agentic document navigation
  - PDF processing and understanding
  - Document structure analysis
  - Content extraction and processing
  - Cross-document reference handling

- **DO(...).New**: Agentic behavior execution
  - Task planning and execution
  - Decision making based on content
  - Multi-step operation handling
  - Context-aware actions

## Quick Start

```python
import asyncio
from donew import DO

async def main():
    # Configure browser settings (optional)
    DO.Config(headless=True)  # Run in headless mode
    
    # Start agentic web navigation
    browser = await DO.Browse("https://example.com")
    
    try:
        # Analyze page content
        content = await browser.text()
        print("Page content:", content)
        
        # Get all interactive elements with their context
        elements = browser.elements()
        
        # Smart element detection (finds relevant input fields by context)
        input_fields = {
            elem.element_label or elem.attributes.get("name", ""): id
            for id, elem in elements.items()
            if elem.element_type == "input"
            and elem.attributes.get("type") in ["text", "email"]
        }
        
        # Autonomous form interaction
        for label, element_id in input_fields.items():
            await browser.type(element_id, f"test_{label}")
        
        # State management
        cookies = await browser.cookies()
        print("Current browser state (cookies):", cookies)
        
        # Context persistence
        await browser.storage({
            "localStorage": {"agent_context": "form_filling"},
            "sessionStorage": {"task_state": "in_progress"}
        })
        
        # Visual debugging (helps AI understand page state)
        await browser.toggle_annotation(True)
        
        # Get current state for decision making
        state = await browser._get_state_dict()
        print("Current agent state:", state)
        
    finally:
        await browser.close()

if __name__ == "__main__":
    asyncio.run(main())
```

### Example: AI Agent Task Execution

```python
from donew import DO

async def search_and_extract(query: str):
    browser = await DO.Browse("https://example.com/search")
    try:
        # Find and interact with search form
        elements = browser.elements()
        search_input = next(
            (id for id, elem in elements.items()
             if elem.element_type == "input"
             and elem.element_label
             and "search" in elem.element_label.lower()),
            None
        )
        
        if search_input:
            # Execute search
            await browser.type(search_input, query)
            await browser.press("Enter")
            
            # Wait for and analyze results
            content = await browser.text()
            
            # Extract structured data
            return {
                "query": query,
                "results": content,
                "page_state": await browser._get_state_dict()
            }
    finally:
        await browser.close()
```
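The element-selection pattern above can be exercised without a live browser. Below is a minimal sketch: the `Element` dataclass and sample records are hypothetical stand-ins for DoNew's element objects, used here only to illustrate the lookup logic:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    # Simplified, hypothetical stand-in for a DoNew element record.
    element_type: str
    element_label: Optional[str] = None


def find_search_input(elements: dict[int, Element]) -> Optional[int]:
    """Return the id of the first input whose label mentions 'search'."""
    return next(
        (id for id, elem in elements.items()
         if elem.element_type == "input"
         and elem.element_label
         and "search" in elem.element_label.lower()),
        None,
    )


elements = {
    1: Element("button", "Submit"),
    2: Element("input", "Search the site"),
    3: Element("input", None),
}
print(find_search_input(elements))  # → 2
```

The same guard-then-match shape generalizes to other element kinds (links, buttons) by swapping the `element_type` filter and the label keyword.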

## Development Setup

### Requirements
- Python 3.11 (required for Knowledge Graph functionality)
- uv package manager (recommended over pip)

### Installation Steps

1. Clone the repository
```bash
git clone https://github.com/DONEWio/donew.git
cd donew
```

2. Install uv if you haven't already:
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

3. Create and activate virtual environment:
```bash
uv venv -p python3.11
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
```

4. Install dependencies:
   - For basic usage:
   ```bash
   uv pip install pip
   uv pip install -e ".[dev]"
   ```

   - For Knowledge Graph functionality:
   ```bash
   uv pip install pip
   uv pip install -e "."
   uv pip install -e ".[kg,dev]"
   uv run -- spacy download en_core_web_md
   #uv run -- spacy download en_core_web_lg # Large web model
   #uv run -- spacy download en_core_web_sm # Small web model
   ```

5. Install Playwright browsers:
```bash
playwright install chromium   # Chromium only
playwright install            # or run this instead to install all browsers
```


## Testing

Run the test suite:
```bash
pytest tests/ --httpbin-url=https://httpbin.org
```

For more detailed testing options, including using local or remote httpbin, see the [Testing Documentation](docs/testing.md). (#TODO)

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

# Knowledge Graph Component

The Knowledge Graph component (`donew.see.graph`) provides entity and relationship extraction from text, with persistent storage in KuzuDB. This implementation is inspired by and adapted from the [GraphGeeks.org](https://live.zoho.com/PBOB6fvr6c) talk and [strwythura](https://raw.githubusercontent.com/DerwenAI/strwythura/refs/heads/main/demo.py).

## Features

- Named Entity Recognition using GLiNER
- Relationship Extraction using GLiREL 
- Graph storage and querying with KuzuDB
- Text processing and chunking with spaCy

## Graph Construction

The graph is built in layers:

1. **Base Layer**: Textual analysis using spaCy parse trees
2. **Entity Layer**: Named entities and noun chunks from GLiNER
3. **Relationship Layer**: Semantic relationships from GLiREL
4. **Storage Layer**: Persistent graph storage in KuzuDB
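The layering above can be pictured with a plain-Python sketch that accumulates nodes and edges into a dictionary. The entities and relations are hand-written here purely for illustration; in the real pipeline they come from GLiNER and GLiREL model output, and the final structure is persisted to KuzuDB rather than queried in memory:

```python
# Illustrative only: hand-written entities/relations stand in for
# GLiNER/GLiREL model output extracted from text.
graph = {"nodes": {}, "edges": []}

# Entity layer: named entities with their labels.
for text, label in [("Sam Altman", "Person"), ("OpenAI", "Company"),
                    ("Microsoft", "Company"), ("San Francisco", "Location")]:
    graph["nodes"][text] = {"label": label}

# Relationship layer: semantic relations between entities.
for subj, rel, obj in [("Sam Altman", "FOUNDER", "OpenAI"),
                       ("OpenAI", "PARTNER", "Microsoft")]:
    graph["edges"].append({"from": subj, "type": rel, "to": obj})

# The storage layer would persist this structure to KuzuDB;
# here we just filter it in memory.
founders = [(e["from"], e["to"])
            for e in graph["edges"] if e["type"] == "FOUNDER"]
print(founders)  # → [('Sam Altman', 'OpenAI')]
```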

## Usage

```python
from donew.see.graph import KnowledgeGraph

# Initialize KG (in-memory or with persistent storage)
kg = KnowledgeGraph(db_path="path/to/db")  # or None for in-memory

# Analyze text
result = kg.analyze("""
OpenAI CEO Sam Altman has partnered with Microsoft.
The collaboration was announced in San Francisco.
""")

# Query the graph
founder_relations = kg.query("""
MATCH (p:Entity)-[r:Relation]->(o:Entity)
WHERE p.label = 'Person' AND o.label = 'Company'
AND r.type = 'FOUNDER'
RETURN p.text as Founder, o.text as Company
ORDER BY Founder;
""") 
