class documentation

RAG (Retrieval-Augmented Generation) provider using Qdrant and Ollama

Method chat Send messages using RAG with vector search and LLM (normalized to OpenAI format)
Method _get_embedding_model Get cached embedding model (singleton pattern to avoid repeated HuggingFace API calls)

Inherited from LLMProvider:

Method __init__ Undocumented
Instance Variable context Undocumented
Method _prepare_messages Convert string message to proper messages array
def chat(self, messages, **kwargs): (source)

Send messages using RAG with vector search and LLM (normalized to OpenAI format)

def _get_embedding_model(self): (source)

Get cached embedding model (singleton pattern to avoid repeated HuggingFace API calls)