Metadata-Version: 2.4
Name: autogen-builtsimple
Version: 0.1.0
Summary: AutoGen tools for Built-Simple research APIs (PubMed & ArXiv)
Project-URL: Homepage, https://built-simple.ai
Project-URL: Documentation, https://github.com/built-simple/autogen-builtsimple
Project-URL: Repository, https://github.com/built-simple/autogen-builtsimple
Author-email: Built-Simple <support@built-simple.ai>
License-Expression: MIT
License-File: LICENSE
Keywords: agents,ai,arxiv,autogen,pubmed,research,tools
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Requires-Dist: httpx>=0.24.0
Requires-Dist: pydantic>=2.0.0
Provides-Extra: autogen
Requires-Dist: pyautogen>=0.2.0; extra == 'autogen'
Provides-Extra: dev
Requires-Dist: pyautogen>=0.2.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.21.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Description-Content-Type: text/markdown

# AutoGen Built-Simple Research Tools

[![PyPI version](https://badge.fury.io/py/autogen-builtsimple.svg)](https://badge.fury.io/py/autogen-builtsimple)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

**GPU-accelerated research tools for Microsoft AutoGen agents.**

Search millions of academic papers on PubMed (4.48M+ articles) and ArXiv (2.77M+ papers) with semantic understanding. Built on [Built-Simple's](https://built-simple.ai) high-performance research APIs.

## 🚀 Features

- **🔬 PubMed Search**: Semantic + keyword hybrid search across 4.48M biomedical articles
- **📚 ArXiv Search**: Fast search across 2.77M preprints (physics, math, CS, AI/ML)
- **⚡ GPU-Accelerated**: Typical search latency of roughly 70-150 ms via Built-Simple APIs
- **📄 Full Text**: Retrieve complete article text from PubMed Central
- **🤖 AutoGen Native**: One-line registration with AutoGen agents

## 📦 Installation

```bash
pip install autogen-builtsimple
```

Or with AutoGen included:
```bash
# Quotes keep shells such as zsh from treating the brackets as a glob pattern
pip install "autogen-builtsimple[autogen]"
```

## 🏃 Quick Start

```python
import os
from autogen import ConversableAgent
from autogen_builtsimple import register_research_tools

# Create agents
assistant = ConversableAgent(
    name="ResearchAssistant",
    system_message="""You are a research assistant that helps find and analyze 
    scientific papers. Use the search tools to find relevant literature.""",
    llm_config={"config_list": [{"model": "gpt-4", "api_key": os.environ["OPENAI_API_KEY"]}]},
)

user_proxy = ConversableAgent(
    name="User",
    llm_config=False,
    # Tool-call messages can carry content=None, so guard before the substring test
    is_termination_msg=lambda msg: "TERMINATE" in (msg.get("content") or ""),
    human_input_mode="NEVER",
)

# Register all research tools with one line!
register_research_tools(assistant, user_proxy)

# Start researching
chat_result = user_proxy.initiate_chat(
    assistant,
    message="Find 5 recent papers on CRISPR gene therapy for cancer treatment"
)
```

## 🔧 Available Tools

### PubMed Tools

| Tool | Description |
|------|-------------|
| `search_pubmed` | Hybrid semantic + keyword search across PubMed |
| `get_pubmed_full_text` | Retrieve full article text (PMC articles) |
| `get_pubmed_metadata` | Batch fetch metadata for multiple PMIDs |

### ArXiv Tools

| Tool | Description |
|------|-------------|
| `search_arxiv` | Hybrid/vector/text search across ArXiv |
| `get_arxiv_paper_url` | Get PDF and abstract URLs for a paper |

## 📖 Usage Examples

### Search PubMed

```python
from autogen_builtsimple import search_pubmed

# Semantic search for medical literature
results = search_pubmed(
    query="machine learning early cancer detection biomarkers",
    limit=10,
    min_year=2020
)
print(results)
```

### Search ArXiv

```python
from autogen_builtsimple import search_arxiv

# Search for AI papers
results = search_arxiv(
    query="large language models reasoning capabilities",
    limit=10,
    search_type="hybrid"  # or "vector" or "text"
)
print(results)
```

### Get Full Text

```python
from autogen_builtsimple import get_pubmed_full_text

# Get complete article text
full_text = get_pubmed_full_text(pmid="35421456")
print(full_text)
```
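
When PMIDs come from user input or an LLM response, they often arrive with a prefix or stray whitespace. The sketch below normalizes such input before it is passed on. `normalize_pmid` is a hypothetical helper, not part of this package, and it assumes the tools expect a bare numeric string as in the example above.

```python
import re

# Hypothetical helper, not part of autogen-builtsimple.
# Assumption: the tools expect a digit-only PMID string such as "35421456".
_PMID_RE = re.compile(r"(?:PMID\s*:?\s*)?(\d+)", re.IGNORECASE)


def normalize_pmid(raw) -> str:
    """Return the PMID as a digit-only string, or raise ValueError."""
    match = _PMID_RE.fullmatch(str(raw).strip())
    if match is None:
        raise ValueError(f"not a valid PMID: {raw!r}")
    return match.group(1)


# Accepts prefixed strings and plain integers alike.
print(normalize_pmid(" PMID: 35421456 "))
print(normalize_pmid(35421456))
```

The cleaned value can then be passed as `get_pubmed_full_text(pmid=normalize_pmid(raw_id))`.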

### Selective Registration

```python
from autogen_builtsimple import register_pubmed_tools, register_arxiv_tools

# Reuses the `assistant` and `user_proxy` agents created in Quick Start.
# Register only PubMed tools
register_pubmed_tools(assistant, user_proxy)

# Or only ArXiv tools
register_arxiv_tools(assistant, user_proxy)
```

## 🔑 API Keys (Optional)

The tools work without an API key; unauthenticated requests are rate-limited per IP address. For higher limits, get a free API key:

- **PubMed**: https://pubmed.built-simple.ai
- **ArXiv**: https://arxiv.built-simple.ai

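```python
from autogen_builtsimple import search_pubmed

# Pass the key per call; omit api_key to use the IP-based limits
results = search_pubmed(
    query="diabetes treatment",
    limit=20,
    api_key="your-api-key"
)
```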
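
To keep the key out of source code, one option is to read it from an environment variable and add `api_key` only when it is set. This is a minimal sketch: the variable name `BUILT_SIMPLE_API_KEY` and the helper `search_kwargs` are assumptions for illustration, not something the package defines. The key is left out entirely when unset, since it is not documented here whether the tools accept `api_key=None`.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable name; the package does not define one.
ENV_VAR = "BUILT_SIMPLE_API_KEY"


def search_kwargs(query: str, limit: int = 10,
                  env: Optional[Mapping[str, str]] = None) -> dict:
    """Build keyword arguments for a search call, adding api_key only if set."""
    env = os.environ if env is None else env
    kwargs = {"query": query, "limit": limit}
    api_key = (env.get(ENV_VAR) or "").strip()
    if api_key:
        kwargs["api_key"] = api_key
    return kwargs


# With the package installed, the result can be unpacked into a tool call:
#   results = search_pubmed(**search_kwargs("diabetes treatment", limit=20))
print(search_kwargs("diabetes treatment", limit=20, env={}))
```

Passing `env` explicitly makes the helper easy to test without touching the real environment.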

## 🏗️ Architecture

```
┌─────────────────┐    ┌─────────────────────┐   ┌─────────────────┐
│  AutoGen Agent  │───▶│ autogen-builtsimple │──▶│  Built-Simple   │
│   (Assistant)   │    │        Tools        │   │     APIs        │
└─────────────────┘    └─────────────────────┘   └─────────────────┘
                                                        │
                                ┌───────────────────────┴───────────────────────┐
                                │                                               │
                        ┌───────▼───────┐                               ┌───────▼───────┐
                        │    PubMed     │                               │    ArXiv      │
                        │  GPU Search   │                               │  GPU Search   │
                        │  4.48M docs   │                               │  2.77M docs   │
                        └───────────────┘                               └───────────────┘
```
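
The flow in the diagram follows the usual LLM tool-calling pattern: the assistant proposes a tool name with JSON arguments, and the executing agent looks the tool up and runs it. The sketch below is a generic illustration of that pattern, not AutoGen's or this package's actual implementation. The `fake_*` functions are stand-ins so the example runs offline, and their return strings are made up.

```python
import json
from typing import Any, Callable, Dict


# Stand-ins for the real tools so the sketch runs without network access.
def fake_search_pubmed(query: str, limit: int = 10) -> str:
    return f"pubmed results for {query!r} (limit={limit})"


def fake_search_arxiv(query: str, limit: int = 10) -> str:
    return f"arxiv results for {query!r} (limit={limit})"


# Registry mapping the tool names the model sees to Python callables.
TOOLS: Dict[str, Callable[..., Any]] = {
    "search_pubmed": fake_search_pubmed,
    "search_arxiv": fake_search_arxiv,
}


def execute_tool_call(name: str, arguments_json: str) -> str:
    """Run the tool the assistant asked for and return its result as text."""
    tool = TOOLS.get(name)
    if tool is None:
        return f"Error: unknown tool {name!r}"
    try:
        kwargs = json.loads(arguments_json)
        return str(tool(**kwargs))
    except (json.JSONDecodeError, TypeError) as exc:
        # Report bad arguments back to the model instead of crashing the chat.
        return f"Error: {exc}"


print(execute_tool_call("search_arxiv", '{"query": "LLM reasoning", "limit": 5}'))
```

Returning errors as text lets the model see what went wrong and retry with corrected arguments.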

## 📊 Performance

| Endpoint | Latency | Index Size |
|----------|---------|------------|
| PubMed Hybrid Search | ~100-150ms | 4.48M articles |
| ArXiv Hybrid Search | ~70-90ms | 2.77M papers |
| PubMed Full Text | ~50ms | PMC subset |
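
Latency seen from your own machine may differ from the figures above, since client-side timing also includes the network round trip. A minimal way to check is to time a few calls with `time.perf_counter`. The helper `measure_latency_ms` is hypothetical, and the sketch uses `time.sleep` as a stand-in workload so it runs offline.

```python
import statistics
import time
from typing import Callable, List


def measure_latency_ms(call: Callable[[], object], runs: int = 5) -> List[float]:
    """Time `call` `runs` times and return each duration in milliseconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        durations.append((time.perf_counter() - start) * 1000.0)
    return durations


# Stand-in workload. To time a real request, pass for example:
#   lambda: search_arxiv(query="graph neural networks", limit=10)
samples = measure_latency_ms(lambda: time.sleep(0.01), runs=5)
print(f"median: {statistics.median(samples):.1f} ms, max: {max(samples):.1f} ms")
```

The median is less sensitive than the mean to a single slow first request, such as one that includes connection setup.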

## 🤝 Contributing

Contributions welcome! Please feel free to submit issues and pull requests.

## 📄 License

MIT License - see [LICENSE](LICENSE) for details.

## 🔗 Links

- **Built-Simple**: https://built-simple.ai
- **PubMed API Docs**: https://pubmed.built-simple.ai/docs
- **ArXiv API Docs**: https://arxiv.built-simple.ai/docs
- **Microsoft AutoGen**: https://microsoft.github.io/autogen/
