Metadata-Version: 2.4
Name: fennec_community
Version: 0.1.0
Summary: Production-grade RAG framework with Arabic NLP support — chunking, embeddings, LLM interfaces, vector databases, routing, and more.
Author-email: Yousef Khalil <yousefkhalil435@gmail.com>
License: MIT License
        
        Copyright (c) 2026 Fennec Community
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://fennec-community.vercel.app/
Project-URL: Repository, https://github.com/fenneccommunity/fennec-community
Keywords: rag,llm,arabic,nlp,embeddings,chunking,vector-database,retrieval-augmented-generation,langchain,openai,gemini,ollama
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing :: Linguistic
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pydantic>=2.0
Requires-Dist: numpy>=1.24
Requires-Dist: tiktoken>=0.5
Provides-Extra: openai
Requires-Dist: openai>=1.0; extra == "openai"
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.20; extra == "anthropic"
Provides-Extra: gemini
Requires-Dist: google-generativeai>=0.5; extra == "gemini"
Provides-Extra: mistral
Requires-Dist: mistralai>=0.4; extra == "mistral"
Provides-Extra: groq
Requires-Dist: groq>=0.4; extra == "groq"
Provides-Extra: ollama
Requires-Dist: ollama>=0.1; extra == "ollama"
Provides-Extra: huggingface
Requires-Dist: transformers>=4.35; extra == "huggingface"
Requires-Dist: torch>=2.0; extra == "huggingface"
Requires-Dist: sentence-transformers>=2.2; extra == "huggingface"
Provides-Extra: faiss
Requires-Dist: faiss-cpu>=1.7; extra == "faiss"
Provides-Extra: chroma
Requires-Dist: chromadb>=0.4; extra == "chroma"
Provides-Extra: pinecone
Requires-Dist: pinecone-client>=3.0; extra == "pinecone"
Provides-Extra: pdf
Requires-Dist: pypdf>=3.0; extra == "pdf"
Requires-Dist: pdfplumber>=0.10; extra == "pdf"
Provides-Extra: docx
Requires-Dist: python-docx>=1.0; extra == "docx"
Provides-Extra: web
Requires-Dist: requests>=2.28; extra == "web"
Requires-Dist: beautifulsoup4>=4.12; extra == "web"
Provides-Extra: arabic
Requires-Dist: camel-tools>=1.5; extra == "arabic"
Requires-Dist: stanza>=1.7; extra == "arabic"
Provides-Extra: all
Requires-Dist: fennec-community[anthropic,arabic,chroma,docx,faiss,gemini,groq,huggingface,mistral,ollama,openai,pdf,pinecone,web]; extra == "all"
Dynamic: license-file

# fennec-community

**Production-grade RAG framework with Arabic NLP support**

`fennec-community` is a modular Python library for building Retrieval-Augmented Generation (RAG) pipelines. It includes dedicated support for Arabic text and integrates with the LLM providers, embedding models, and vector databases listed under Features.

---

## Features

- **Smart Chunking** — Semantic, adaptive, structure-aware, context-aware, and Arabic-specific chunkers  
- **Embeddings** — OpenAI, Gemini, Mistral, Ollama, HuggingFace, and a dedicated Arabic embedder  
- **LLM Interfaces** — Unified API over OpenAI, Anthropic, Gemini, Mistral, Groq, and Ollama  
- **Vector Databases** — FAISS, ChromaDB, and Pinecone  
- **RAG Variants** — Standard, Agentic, Conversational, Graph, Hybrid, Multi-Doc, Multi-Hop, Streaming, Domain-Specific, Self-Improving, Federated  
- **Prompt Engine** — Dynamic context-aware prompt building with strategy support  
- **Router** — Hierarchical semantic routing with caching, feedback, and observability  
- **Plugin System** — Extensible plugin architecture with security controls  
- **Output Parsers** — JSON, YAML, CSV, Pydantic, structured output with auto-fix and retry  
- **Document Loaders** — PDF, DOCX, HTML, CSV, JSON, plain text, web, and directory loaders  

---

## Installation

```bash
pip install fennec-community
```

Install optional extras for the providers, vector stores, and loaders you need:

```bash
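# Quotes keep shells such as zsh from expanding the square brackets
pip install "fennec-community[openai]"       # OpenAI LLM + embeddings
pip install "fennec-community[anthropic]"    # Anthropic Claude
pip install "fennec-community[gemini]"       # Google Gemini
pip install "fennec-community[mistral]"      # Mistral
pip install "fennec-community[groq]"         # Groq
pip install "fennec-community[ollama]"       # Local Ollama models
pip install "fennec-community[huggingface]"  # transformers, torch, sentence-transformers
pip install "fennec-community[faiss]"        # FAISS vector store
pip install "fennec-community[chroma]"       # ChromaDB vector store
pip install "fennec-community[pinecone]"     # Pinecone vector store
pip install "fennec-community[pdf]"          # PDF loading (pypdf, pdfplumber)
pip install "fennec-community[docx]"         # DOCX loading (python-docx)
pip install "fennec-community[web]"          # Web loading (requests, beautifulsoup4)
pip install "fennec-community[arabic]"       # Arabic NLP tools
pip install "fennec-community[all]"          # Everything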
```

---

## Quick Start

### Basic RAG Pipeline

```python
from fennec_community.rag.core import RAGSystem, RAGConfig
from fennec_community.embeddings import OpenAIEmbedder
from fennec_community.vector_database import FAISSVectorDatabase

config = RAGConfig(top_k=5)
embedder = OpenAIEmbedder(api_key="sk-...")
db = FAISSVectorDatabase(embedder=embedder)

rag = RAGSystem(config=config, vector_db=db)
rag.load_documents(["path/to/docs/"])

answer = rag.query("What is the capital of France?")
print(answer)
```

### Arabic Chunking

```python
from fennec_community.chunks import ArabicTextChunker, ChunkConfig

chunker = ArabicTextChunker(config=ChunkConfig(chunk_size=512))
chunks = chunker.chunk("النص العربي هنا...")

for chunk in chunks:
    print(chunk.text)
```
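
To give a feel for what Arabic-aware chunking involves, here is a minimal standalone sketch. It is not the implementation behind `ArabicTextChunker`; it only assumes that a chunker splits on sentence boundaries (including the Arabic question mark `؟`) and then packs whole sentences up to a size limit. The helper names `split_sentences` and `pack_sentences` are hypothetical, and the limit is measured in characters rather than tokens to keep the example dependency-free.

```python
import re

# Sentence-ending punctuation: Latin ".", "!", "?" plus the Arabic question mark.
_SENTENCE_END = re.compile(r"(?<=[.!?؟])\s+")


def split_sentences(text: str) -> list[str]:
    """Split text on terminal punctuation that is followed by whitespace."""
    return [s.strip() for s in _SENTENCE_END.split(text) if s.strip()]


def pack_sentences(sentences: list[str], max_chars: int) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole in its own chunk.
    """
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    text = "ما هو RAG؟ هو أسلوب استرجاع. يعمل جيدا!"
    for chunk in pack_sentences(split_sentences(text), max_chars=30):
        print(chunk)
```

Packing whole sentences avoids cutting a sentence in half, at the cost of chunks that vary in length.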

### Conversational RAG

```python
from fennec_community.rag.types.conversational_rag import ConversationalRAG

# `rag` is the RAGSystem instance built in the Basic RAG Pipeline example above.
crag = ConversationalRAG(rag_system=rag)
response = crag.chat("Tell me about the report.")
response2 = crag.chat("Can you summarize the key points?")
```

### Semantic Router

```python
from fennec_community.router import HierarchicalRouter, Route, RouteGroup

router = HierarchicalRouter()
router.add_group(
    RouteGroup(
        name="support",
        routes=[
            Route(name="billing", keywords=["invoice", "payment", "refund"]),
            Route(name="technical", keywords=["error", "bug", "crash"]),
        ]
    )
)

result = router.route("I need help with my invoice")
print(result.matched_route)  # "billing"
```
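
The example above routes on keywords. As a rough illustration of the idea only, and not of how `HierarchicalRouter` works internally (it may also use semantic similarity, hierarchy, and caching), keyword routing can be sketched in plain Python. The function `match_route` and the `routes` dictionary are hypothetical names used for this sketch.

```python
import re
from typing import Optional


def match_route(query: str, routes: dict[str, list[str]]) -> Optional[str]:
    """Return the route with the most keyword hits in the query, or None."""
    # Tokenize on word characters so trailing punctuation does not block a match.
    tokens = set(re.findall(r"\w+", query.lower()))
    best_name: Optional[str] = None
    best_hits = 0
    for name, keywords in routes.items():
        hits = sum(1 for keyword in keywords if keyword.lower() in tokens)
        if hits > best_hits:
            best_name, best_hits = name, hits
    return best_name


if __name__ == "__main__":
    routes = {
        "billing": ["invoice", "payment", "refund"],
        "technical": ["error", "bug", "crash"],
    }
    print(match_route("I need help with my invoice", routes))  # billing
```

When no keyword matches, the sketch returns `None`, so the caller has to decide on a fallback route.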

### LLM Interface

```python
from fennec_community.llm import OpenAIInterface

llm = OpenAIInterface(api_key="sk-...")
response = llm.chat("Explain RAG in one sentence.")
print(response)
```

### Output Parsing

```python
from fennec_community.output_parser import OutputParser

parser = OutputParser()
result = parser.parse('{"name": "Alice", "age": 30}', format="json")
print(result)  # {'name': 'Alice', 'age': 30}
```

---

## Modules

| Module | Description |
|---|---|
| `fennec_community.chunks` | Text chunking strategies including Arabic support |
| `fennec_community.embeddings` | Embedding models for multiple providers |
| `fennec_community.llm` | LLM provider interfaces |
| `fennec_community.vector_database` | Vector store backends (FAISS, Chroma, Pinecone) |
| `fennec_community.rag` | Full RAG system with multiple retrieval strategies |
| `fennec_community.prompt` | Prompt engineering and context management |
| `fennec_community.router` | Semantic routing with hierarchical matching |
| `fennec_community.context` | Context window management and retrieval |
| `fennec_community.output_parser` | Structured output parsing and validation |
| `fennec_community.document_loaders` | Document ingestion from various sources |
| `fennec_community.plugins` | Plugin system for extending functionality |
| `fennec_community.chain` | Sequential, parallel, and conditional chains |

---

## RAG Variants

| Variant | Class | Use Case |
|---|---|---|
| Standard RAG | `RAGSystem` | General question answering |
| Agentic RAG | `AgenticRAG` | Multi-step reasoning with tool use |
| Conversational RAG | `ConversationalRAG` | Multi-turn chat with memory |
| Graph RAG | `GraphRAG` | Knowledge graph-based retrieval |
| Hybrid Search RAG | `HybridSearchRAG` | Dense + sparse retrieval |
| Multi-Doc RAG | `MultiDocRAG` | Cross-document reasoning |
| Multi-Hop RAG | `MultiHopRAG` | Complex multi-step queries |
| Streaming RAG | `StreamingRAG` | Token-by-token streaming responses |
| Domain RAG | `DomainSpecificRAG` | Domain-constrained retrieval |
| Self-Improving RAG | `SelfImprovingRAG` | Iterative refinement with HyDE |
| Federated RAG | `FederatedRAG` | Multi-source distributed retrieval |

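The "Dense + sparse retrieval" entry refers to combining results from a vector search and a keyword search. One common way to merge two ranked lists is reciprocal rank fusion (RRF). The sketch below is a generic illustration under that assumption and does not claim to be what `HybridSearchRAG` does; the function name and the document IDs are hypothetical, and `k=60` is a conventional default for RRF.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranked list.

    Each document scores 1 / (k + rank) per list it appears in (rank starts at 1),
    and the scores are summed across lists.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    dense_hits = ["a", "b", "c"]   # e.g. from a vector index
    sparse_hits = ["b", "d", "a"]  # e.g. from a keyword index
    print(reciprocal_rank_fusion([dense_hits, sparse_hits]))  # ['b', 'a', 'd', 'c']
```
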
---

## Requirements

- Python >= 3.9
- pydantic >= 2.0
- numpy >= 1.24
- tiktoken >= 0.5

All other dependencies are optional and installed via extras.

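Because most integrations are optional, it can help to check which ones are importable in the current environment. The sketch below uses only the standard library. The mapping is a partial, illustrative list of extras and the top-level import name of one package from each, and `available_extras` is a hypothetical helper, not part of `fennec-community`.

```python
from importlib.util import find_spec

# Extra name -> top-level import name of one package that the extra installs.
EXTRA_IMPORTS = {
    "openai": "openai",
    "anthropic": "anthropic",
    "faiss": "faiss",      # installed by faiss-cpu
    "chroma": "chromadb",
    "pdf": "pypdf",
    "docx": "docx",        # installed by python-docx
    "web": "bs4",          # installed by beautifulsoup4
    "arabic": "stanza",
}


def available_extras(mapping: dict[str, str] = EXTRA_IMPORTS) -> dict[str, bool]:
    """Report, per extra, whether its marker package can be imported."""
    return {extra: find_spec(module) is not None for extra, module in mapping.items()}


if __name__ == "__main__":
    for extra, installed in available_extras().items():
        print(f"{extra:10s} {'installed' if installed else 'missing'}")
```
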
---

## License

MIT License — see [LICENSE](LICENSE) for details.

---

## Contributing

Contributions are welcome! Please open an issue or pull request on [GitHub](https://github.com/fenneccommunity/fennec-community).
