Metadata-Version: 2.4
Name: narrative-ai-framework
Version: 0.1.9
Summary: AI-powered voice diary framework: STT, TTS, LLM, RAG, and voice-agent engines
Author: Narrative AI Team
License: MIT
Keywords: ai,voice,diary,stt,tts,llm,rag,arabic
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: PyYAML>=6.0.1
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: pydantic>=2.5.0
Requires-Dist: aiohttp>=3.9.0
Requires-Dist: aiofiles>=23.2.1
Requires-Dist: requests>=2.28.0
Requires-Dist: shortuuid>=1.0.11
Requires-Dist: pyngrok>=7.0.0
Requires-Dist: nest-asyncio>=1.5.8
Requires-Dist: sympy>=1.12
Provides-Extra: stt
Requires-Dist: soundfile>=0.12.1; extra == "stt"
Requires-Dist: scipy>=1.11.0; extra == "stt"
Requires-Dist: webrtcvad>=2.0.10; extra == "stt"
Requires-Dist: numpy>=1.24.0; extra == "stt"
Requires-Dist: elevenlabs>=0.2.0; extra == "stt"
Requires-Dist: yt-dlp>=2023.11.0; extra == "stt"
Requires-Dist: pydub>=0.25.1; extra == "stt"
Requires-Dist: transformers>=4.36.0; extra == "stt"
Requires-Dist: accelerate>=0.25.0; extra == "stt"
Requires-Dist: torch>=2.1.0; extra == "stt"
Requires-Dist: ctranslate2>=4.0.0; extra == "stt"
Requires-Dist: faster-whisper>=1.0.0; extra == "stt"
Provides-Extra: tts
Requires-Dist: aiohttp>=3.9.0; extra == "tts"
Requires-Dist: numpy>=1.24.0; extra == "tts"
Provides-Extra: ocr
Requires-Dist: opencv-python>=4.8.0; extra == "ocr"
Requires-Dist: scikit-image>=0.21.0; extra == "ocr"
Requires-Dist: pdf2image>=1.16.3; extra == "ocr"
Requires-Dist: python-docx>=1.1.0; extra == "ocr"
Requires-Dist: einops>=0.6.1; extra == "ocr"
Requires-Dist: torch>=2.0.1; extra == "ocr"
Requires-Dist: torchvision>=0.15.2; extra == "ocr"
Requires-Dist: transformers>=4.45.0; extra == "ocr"
Requires-Dist: accelerate>=0.26.0; extra == "ocr"
Requires-Dist: qwen-vl-utils>=0.0.4; extra == "ocr"
Requires-Dist: timm>=0.9.2; extra == "ocr"
Requires-Dist: basicsr>=1.4.2; extra == "ocr"
Requires-Dist: realesrgan>=0.3.0; extra == "ocr"
Provides-Extra: llm
Requires-Dist: google-generativeai>=0.3.0; extra == "llm"
Requires-Dist: google-genai>=0.3.0; extra == "llm"
Requires-Dist: openai>=1.3.0; extra == "llm"
Requires-Dist: anthropic>=0.18.0; extra == "llm"
Requires-Dist: tiktoken>=0.5.0; extra == "llm"
Provides-Extra: voice
Requires-Dist: livekit>=0.11.0; extra == "voice"
Requires-Dist: livekit-api>=0.4.0; extra == "voice"
Requires-Dist: livekit-agents>=0.7.0; extra == "voice"
Requires-Dist: livekit-plugins-silero>=0.6.0; extra == "voice"
Requires-Dist: livekit-plugins-elevenlabs>=1.3.0; extra == "voice"
Requires-Dist: livekit-plugins-turn-detector>=1.3.0; extra == "voice"
Requires-Dist: livekit-plugins-noise-cancellation>=0.2.0; extra == "voice"
Requires-Dist: sounddevice>=0.5.0; extra == "voice"
Provides-Extra: db
Requires-Dist: SQLAlchemy>=2.0.0; extra == "db"
Requires-Dist: asyncpg>=0.29.0; extra == "db"
Requires-Dist: psycopg2-binary>=2.9.0; extra == "db"
Requires-Dist: alembic>=1.13.0; extra == "db"
Requires-Dist: redis>=5.0.0; extra == "db"
Provides-Extra: security
Requires-Dist: redis>=5.0.0; extra == "security"
Requires-Dist: SQLAlchemy>=2.0.0; extra == "security"
Requires-Dist: cryptography>=41.0.0; extra == "security"
Requires-Dist: PyJWT>=2.8.0; extra == "security"
Requires-Dist: bcrypt>=4.0.0; extra == "security"
Provides-Extra: api
Requires-Dist: fastapi>=0.109.0; extra == "api"
Requires-Dist: uvicorn[standard]>=0.27.0; extra == "api"
Requires-Dist: python-multipart>=0.0.6; extra == "api"
Requires-Dist: email-validator>=2.1.0; extra == "api"
Provides-Extra: rag
Requires-Dist: sentence-transformers>=2.2.2; extra == "rag"
Requires-Dist: FlagEmbedding>=1.3.5; extra == "rag"
Requires-Dist: pillow>=10.0.0; extra == "rag"
Requires-Dist: psutil>=5.9.0; extra == "rag"
Requires-Dist: unstructured[all-docs]>=0.10.0; extra == "rag"
Requires-Dist: python-magic>=0.4.27; extra == "rag"
Requires-Dist: pytesseract>=0.3.10; extra == "rag"
Requires-Dist: pgvector>=0.2.5; extra == "rag"
Requires-Dist: qdrant-client>=1.7.0; extra == "rag"
Provides-Extra: web
Requires-Dist: ddgs>=9.0.0; extra == "web"
Provides-Extra: vlm
Requires-Dist: pillow>=10.0.0; extra == "vlm"
Requires-Dist: numpy>=1.24.0; extra == "vlm"
Requires-Dist: ollama>=0.1.0; extra == "vlm"
Provides-Extra: all
Requires-Dist: soundfile>=0.12.1; extra == "all"
Requires-Dist: scipy>=1.11.0; extra == "all"
Requires-Dist: webrtcvad>=2.0.10; extra == "all"
Requires-Dist: numpy>=1.24.0; extra == "all"
Requires-Dist: elevenlabs>=0.2.0; extra == "all"
Requires-Dist: yt-dlp>=2023.11.0; extra == "all"
Requires-Dist: pydub>=0.25.1; extra == "all"
Requires-Dist: transformers>=4.36.0; extra == "all"
Requires-Dist: accelerate>=0.25.0; extra == "all"
Requires-Dist: torch>=2.1.0; extra == "all"
Requires-Dist: ctranslate2>=4.0.0; extra == "all"
Requires-Dist: faster-whisper>=1.0.0; extra == "all"
Requires-Dist: aiohttp>=3.9.0; extra == "all"
Requires-Dist: google-generativeai>=0.3.0; extra == "all"
Requires-Dist: google-genai>=0.3.0; extra == "all"
Requires-Dist: openai>=1.3.0; extra == "all"
Requires-Dist: anthropic>=0.18.0; extra == "all"
Requires-Dist: tiktoken>=0.5.0; extra == "all"
Requires-Dist: livekit>=0.11.0; extra == "all"
Requires-Dist: livekit-api>=0.4.0; extra == "all"
Requires-Dist: livekit-agents>=0.7.0; extra == "all"
Requires-Dist: livekit-plugins-silero>=0.6.0; extra == "all"
Requires-Dist: livekit-plugins-elevenlabs>=1.3.0; extra == "all"
Requires-Dist: livekit-plugins-turn-detector>=1.3.0; extra == "all"
Requires-Dist: livekit-plugins-noise-cancellation>=0.2.0; extra == "all"
Requires-Dist: sounddevice>=0.5.0; extra == "all"
Requires-Dist: SQLAlchemy>=2.0.0; extra == "all"
Requires-Dist: asyncpg>=0.29.0; extra == "all"
Requires-Dist: psycopg2-binary>=2.9.0; extra == "all"
Requires-Dist: alembic>=1.13.0; extra == "all"
Requires-Dist: redis>=5.0.0; extra == "all"
Requires-Dist: cryptography>=41.0.0; extra == "all"
Requires-Dist: PyJWT>=2.8.0; extra == "all"
Requires-Dist: bcrypt>=4.0.0; extra == "all"
Requires-Dist: fastapi>=0.109.0; extra == "all"
Requires-Dist: uvicorn[standard]>=0.27.0; extra == "all"
Requires-Dist: python-multipart>=0.0.6; extra == "all"
Requires-Dist: email-validator>=2.1.0; extra == "all"
Requires-Dist: ddgs>=9.0.0; extra == "all"
Requires-Dist: opencv-python>=4.8.0; extra == "all"
Requires-Dist: scikit-image>=0.21.0; extra == "all"
Requires-Dist: pdf2image>=1.16.3; extra == "all"
Requires-Dist: python-docx>=1.1.0; extra == "all"
Requires-Dist: einops>=0.6.1; extra == "all"
Requires-Dist: qwen-vl-utils>=0.0.4; extra == "all"
Requires-Dist: pgvector>=0.2.5; extra == "all"
Requires-Dist: qdrant-client>=1.7.0; extra == "all"
Requires-Dist: ollama>=0.1.0; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=7.4.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23.0; extra == "dev"
Requires-Dist: pytest-cov>=4.1.0; extra == "dev"
Requires-Dist: httpx>=0.25.0; extra == "dev"
Requires-Dist: mypy>=1.7.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Dynamic: license-file

# Narrative AI SDK

An AI-powered voice diary framework: STT, TTS, LLM, RAG, and voice-agent engines behind a single `narrative_ai` package.

---

## 🔑 LLM Engine (`nai.llm`)

**Requirements:**
- API Key from OpenAI, Google (Gemini), or Anthropic.
- Install extra dependencies: `pip install "narrative-ai-framework[llm]"`

### `generate()`
| Detailed Description | Inputs | Returns |
| :--- | :--- | :--- |
| The primary function for text generation. It abstracts the complexity of different providers, allowing you to generate text from a simple prompt. It handles model routing and response parsing automatically. | `prompt` (str), `model` (str, optional), `max_tokens` (int, optional) | `LLMResponse` (Object with .text property) |

```python
import narrative_ai as nai
import asyncio

async def main():
    # 1. Setup API Key
    nai.llm.set_api_key("your-api-key", provider="openai")
    
    # 2. Call generate
    response = await nai.llm.generate("Explain black holes in simple terms.")
    print(f"Result: {response.text}")

if __name__ == "__main__":
    asyncio.run(main())
```
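
The optional inputs from the table compose in the same call; a minimal sketch, assuming `model` and `max_tokens` are plain keyword arguments as listed in the Inputs column (the model string is illustrative):

```python
import narrative_ai as nai
import asyncio

async def main():
    nai.llm.set_api_key("your-api-key", provider="openai")
    
    # Pin a specific model and cap the output length. Keyword names
    # follow the Inputs column above; the model string is a
    # placeholder for whatever your provider serves.
    response = await nai.llm.generate(
        "Summarize relativity in two sentences.",
        model="gpt-4o-mini",
        max_tokens=120,
    )
    print(f"Result: {response.text}")

if __name__ == "__main__":
    asyncio.run(main())
```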

### `generate_stream()`
| Detailed Description | Inputs | Returns |
| :--- | :--- | :--- |
| Generates the response chunk by chunk. This is essential for building real-time chat applications where you want to show the model's output as it is generated. | `prompt` (str), `model` (str, optional) | `AsyncIterator[str]` |

```python
import narrative_ai as nai
import asyncio

async def main():
    nai.llm.set_api_key("your-api-key", provider="openai")
    
    print("Streaming: ", end="")
    async for chunk in nai.llm.generate_stream("Write a short poem."):
        print(chunk, end="", flush=True)
    print()  # newline after the stream ends

if __name__ == "__main__":
    asyncio.run(main())
```
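
Since the iterator yields plain `str` chunks, assembling the full response (e.g., for chat history) is simple concatenation; a sketch building on the example above:

```python
import narrative_ai as nai
import asyncio

async def main():
    nai.llm.set_api_key("your-api-key", provider="openai")
    
    chunks = []
    async for chunk in nai.llm.generate_stream("Write a short poem."):
        print(chunk, end="", flush=True)  # render incrementally
        chunks.append(chunk)              # keep for history
    print()
    
    full_text = "".join(chunks)  # the complete response
    print(f"({len(full_text)} characters streamed)")

if __name__ == "__main__":
    asyncio.run(main())
```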

---

## 🎙️ STT Engine (`nai.stt`)

**Requirements:**
- API Key for cloud providers (ElevenLabs), or local hardware for Whisper-based models (faster-whisper).
- Install extra dependencies: `pip install "narrative-ai-framework[stt]"`

### `transcribe()`
| Detailed Description | Inputs | Returns |
| :--- | :--- | :--- |
| Converts an entire audio file (MP3, WAV, etc.) into structured text. It handles audio normalization and sends it to the configured transcription engine. | `audio_path` (str), `language` (str, optional) | `STTResult` (Object with .text property) |

```python
import narrative_ai as nai
import asyncio

async def main():
    # 1. Configure provider
    nai.stt.set_api_key("your-elevenlabs-key", provider="elevenlabs")
    
    # 2. Transcribe file
    result = await nai.stt.transcribe("meeting_recording.mp3")
    print(f"Transcript: {result.text}")

if __name__ == "__main__":
    asyncio.run(main())
```
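
For non-English audio (the framework targets Arabic among its use cases), the optional `language` input can hint the engine; a sketch, assuming it accepts an ISO 639-1 code:

```python
import narrative_ai as nai
import asyncio

async def main():
    nai.stt.set_api_key("your-elevenlabs-key", provider="elevenlabs")
    
    # "ar" is assumed to be an ISO 639-1 hint for Arabic; the exact
    # accepted values depend on the configured provider.
    result = await nai.stt.transcribe("diary_entry.mp3", language="ar")
    print(f"Transcript: {result.text}")

if __name__ == "__main__":
    asyncio.run(main())
```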

---

## 🔊 TTS Engine (`nai.tts`)

**Requirements:**
- API Key for providers (OpenAI, ElevenLabs).
- Install extra dependencies: `pip install "narrative-ai-framework[tts]"`

### `synthesize()`
| Detailed Description | Inputs | Returns |
| :--- | :--- | :--- |
| Converts text into realistic speech. It returns the local file path to the generated audio file. You can specify different voices depending on the provider. | `text` (str), `voice` (str, optional) | `str` (Absolute path to the audio file) |

```python
import narrative_ai as nai
import asyncio

async def main():
    # 1. Setup TTS
    nai.tts.set_api_key("your-openai-key", provider="openai")
    
    # 2. Generate Audio
    audio_path = await nai.tts.synthesize("Hello, welcome to Narrative AI.", voice="alloy")
    print(f"Audio saved at: {audio_path}")

if __name__ == "__main__":
    asyncio.run(main())
```
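
A common pattern in a voice diary is voicing an LLM reply directly; a minimal sketch chaining the two engines documented above:

```python
import narrative_ai as nai
import asyncio

async def main():
    nai.llm.set_api_key("your-openai-key", provider="openai")
    nai.tts.set_api_key("your-openai-key", provider="openai")
    
    # Generate a reply, then speak it with the APIs shown above.
    response = await nai.llm.generate("Give me a one-line morning greeting.")
    audio_path = await nai.tts.synthesize(response.text, voice="alloy")
    print(f"Spoken greeting saved at: {audio_path}")

if __name__ == "__main__":
    asyncio.run(main())
```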

---

## 📚 RAG & Memory (`nai.rag`)

**Requirements:**
- Vector database access (Qdrant, or PostgreSQL with pgvector) or local storage.
- API Key for embeddings (OpenAI, Cohere).
- Install extra dependencies: `pip install "narrative-ai-framework[rag]"`

### `remember()`
| Detailed Description | Inputs | Returns |
| :--- | :--- | :--- |
| Indexes a StructuredDocument into your vector store. It automatically generates embeddings and stores the content so the AI can 'recall' it later. | `document` (StructuredDocument), `doc_id` (str, optional) | `bool` (True if indexing succeeded) |

```python
import narrative_ai as nai
import asyncio

async def main():
    # 1. Setup Embeddings
    nai.rag.set_api_key("your-cohere-key", provider="cohere")
    
    # 2. Process a file first
    doc = await nai.input_processor.process("knowledge_base.pdf")
    
    # 3. Store in memory
    success = await nai.rag.remember(doc, doc_id="kb_v1")
    print(f"Document indexed: {success}")

if __name__ == "__main__":
    asyncio.run(main())
```

### `recall()`
| Detailed Description | Inputs | Returns |
| :--- | :--- | :--- |
| Searches your long-term memory for relevant information based on a query. It returns the most similar text blocks to be used as context for the LLM. | `query` (str), `top_k` (int, default=5) | `RichContext` (Object containing relevant snippets) |

```python
import narrative_ai as nai
import asyncio

async def main():
    nai.rag.set_api_key("your-key", provider="cohere")
    
    # Search memory
    context = await nai.rag.recall("What is the company's leave policy?")
    print(f"Retrieved Context: {context.formatted_text}")

if __name__ == "__main__":
    asyncio.run(main())
```
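
`recall()` pairs naturally with `nai.llm.generate()`: retrieve context, then ground the answer in it. A sketch, assuming `context.formatted_text` is the plain-text rendering shown above:

```python
import narrative_ai as nai
import asyncio

async def main():
    nai.rag.set_api_key("your-cohere-key", provider="cohere")
    nai.llm.set_api_key("your-openai-key", provider="openai")
    
    # 1. Retrieve the most relevant snippets from memory.
    question = "What is the company's leave policy?"
    context = await nai.rag.recall(question, top_k=3)
    
    # 2. Ground the LLM answer in the retrieved context.
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context.formatted_text}\n\n"
        f"Question: {question}"
    )
    response = await nai.llm.generate(prompt)
    print(f"Answer: {response.text}")

if __name__ == "__main__":
    asyncio.run(main())
```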

---

## 🛠️ Input Processor (`nai.input_processor`)

**Requirements:**
- No special keys needed, but it delegates to other engines (OCR, STT) for specific file types; install the matching extras as needed (e.g., `pip install "narrative-ai-framework[ocr,stt]"`).

### `process()`
| Detailed Description | Inputs | Returns |
| :--- | :--- | :--- |
| The multimodal 'brain'. It analyzes the file extension or metadata of any source and automatically routes it to OCR (for images/PDFs) or STT (for audio). | `source` (Path, URL, or Bytes) | `StructuredDocument` |

```python
import narrative_ai as nai
import asyncio

async def main():
    # One function for any file type!
    doc = await nai.input_processor.process("complex_data.zip")
    print(f"Extracted {len(doc.content_blocks)} blocks of information.")

if __name__ == "__main__":
    asyncio.run(main())
```
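
Because `process()` returns the same `StructuredDocument` for every input type, it slots directly in front of `remember()`; a sketch indexing a mixed batch of files (filenames are illustrative):

```python
import narrative_ai as nai
import asyncio

async def main():
    nai.rag.set_api_key("your-cohere-key", provider="cohere")
    
    # The processor routes each file by type (OCR for the PDF,
    # STT for the audio note, per the description above).
    for path in ["scanned_invoice.pdf", "voice_note.mp3"]:
        doc = await nai.input_processor.process(path)
        indexed = await nai.rag.remember(doc, doc_id=path)
        print(f"{path}: {len(doc.content_blocks)} blocks, indexed={indexed}")

if __name__ == "__main__":
    asyncio.run(main())
```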

---

## 🤖 Voice Mode (`nai.voice_mode`)

**Requirements:**
- LiveKit Server URL and API Credentials.
- Install extra dependencies: `pip install "narrative-ai-framework[voice]"`

### `start_agent()`
| Detailed Description | Inputs | Returns |
| :--- | :--- | :--- |
| Launches the real-time agent worker. This connects your LLM, STT, and TTS engines into a seamless low-latency voice conversation loop. | `None` | `None` (Runs indefinitely) |

```python
import narrative_ai as nai

# 1. Configure LiveKit
nai.voice_mode.set_livekit_config(
    url="wss://your-project.livekit.cloud",
    api_key="your-api-key",
    api_secret="your-api-secret"
)

# 2. Set Agent Identity
nai.voice_mode.set_agent_name("Narrative Assistant")

# 3. Start the loop (blocks and runs until interrupted)
nai.voice_mode.start_agent()
```
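
Since the agent wires together the LLM, STT, and TTS engines, their keys should be set before launch; a sketch of a complete startup script, assuming the engines share the global configuration used throughout this README:

```python
import narrative_ai as nai

# Engines the agent routes audio and text through.
nai.llm.set_api_key("your-openai-key", provider="openai")
nai.stt.set_api_key("your-elevenlabs-key", provider="elevenlabs")
nai.tts.set_api_key("your-elevenlabs-key", provider="elevenlabs")

# LiveKit transport and agent identity, as above.
nai.voice_mode.set_livekit_config(
    url="wss://your-project.livekit.cloud",
    api_key="your-api-key",
    api_secret="your-api-secret",
)
nai.voice_mode.set_agent_name("Narrative Assistant")

nai.voice_mode.start_agent()  # blocks; runs until interrupted
```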

---

## 🎨 VLM Engine (`nai.vlm`)

**Requirements:**
- Vision-capable model key (Gemini Pro Vision, GPT-4V), or a local model served via Ollama.
- Install extra dependencies: `pip install "narrative-ai-framework[vlm]"`

### `analyze_image()`
| Detailed Description | Inputs | Returns |
| :--- | :--- | :--- |
| Allows the AI to 'see'. You pass an image and a prompt, and the engine performs visual reasoning to answer your question. | `image` (Path or Bytes), `prompt` (str) | `VLMResponse` |

```python
import narrative_ai as nai
import asyncio

async def main():
    nai.vlm.set_api_key("your-key")
    response = await nai.vlm.analyze_image("chart.png", "What are the sales trends in this graph?")
    print(response.text)

if __name__ == "__main__":
    asyncio.run(main())
```
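
Per the Inputs column, `analyze_image()` also accepts raw bytes; a sketch, assuming bytes are passed in the same positional slot as a path:

```python
import narrative_ai as nai
import asyncio

async def main():
    nai.vlm.set_api_key("your-key")
    
    # Load the image ourselves and hand over the raw bytes
    # (the Inputs column above allows Path or Bytes).
    with open("chart.png", "rb") as f:
        image_bytes = f.read()
    
    response = await nai.vlm.analyze_image(image_bytes, "Describe this chart.")
    print(response.text)

if __name__ == "__main__":
    asyncio.run(main())
```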

---

## License
MIT License.
