Metadata-Version: 2.1
Name: ragwrapper
Version: 0.0.1
Summary: A package for LLMOps related tasks
Home-page: https://github.com/keshavkmr48/LLMOps
Author: keshav Kumar
Author-email: keshavkmr076@gmail.com
License: Apache Software License
Project-URL: Bug Tracker, https://github.com/keshavkmr48/LLMOps/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: setuptools==75.2.0
Requires-Dist: wheel==0.44.0
Requires-Dist: autocommand==2.2.2
Requires-Dist: backports.tarfile==1.2.0
Requires-Dist: importlib-metadata==8.0.0
Requires-Dist: importlib-resources==6.4.0
Requires-Dist: inflect==7.3.1
Requires-Dist: jaraco.collections==5.1.0
Requires-Dist: jaraco.context==5.3.0
Requires-Dist: jaraco.functools==4.0.1
Requires-Dist: jaraco.text==3.12.1
Requires-Dist: more-itertools==10.3.0
Requires-Dist: packaging==24.1
Requires-Dist: platformdirs==4.2.2
Requires-Dist: tomli==2.0.1
Requires-Dist: typeguard==4.3.0
Requires-Dist: typing-extensions==4.12.2
Requires-Dist: zipp==3.19.2
Provides-Extra: testing
Requires-Dist: pytest>=7.1.3; extra == "testing"
Requires-Dist: mypy>=0.971; extra == "testing"
Requires-Dist: flake8>=5.0.4; extra == "testing"
Requires-Dist: tox>=3.25.1; extra == "testing"
Requires-Dist: black>=22.8.0; extra == "testing"


# RAG Automation Wrapper

## Overview
This project provides a Python wrapper around LangChain to automate Retrieval-Augmented Generation (RAG). The package abstracts the RAG workflow into three modular components: Data Ingestion, Retrieval, and Generation. The wrapper is designed to integrate with various data sources, retrieval methods, and large language models (LLMs), making it easier to prototype and deploy RAG-based systems.

## Key Features
- Data Ingestion: Handle various data formats and load them into a retrievable format.
- Retrieval: Efficiently search and retrieve relevant data using a combination of query transformation techniques and vector databases.
- Generation: Use state-of-the-art LLMs to generate contextually relevant responses based on the retrieved information.

## Components

### 1. Data Ingestion
The ingestion component loads documents or datasets from sources such as plain text, PDFs, CSVs, or databases and converts them into an indexed format for retrieval, so that the data is structured and searchable.

Supported data formats:
- Text, PDFs, CSVs, JSON
- Databases (SQL, NoSQL)
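
The ingestion step described above can be sketched with the standard library alone. This is a minimal illustration, not this package's API: `Document`, `load_records`, and `chunk` are hypothetical names, only text, CSV, and JSON are covered, and PDFs and databases would need their own loaders.

```python
import csv
import io
import json
from dataclasses import dataclass, field


@dataclass
class Document:
    """A piece of text plus where it came from (hypothetical type)."""
    text: str
    metadata: dict = field(default_factory=dict)


def load_records(raw: str, fmt: str, source: str) -> list[Document]:
    """Turn raw file contents into one Document per record."""
    if fmt == "text":
        return [Document(raw, {"source": source})]
    if fmt == "csv":
        # Each CSV row becomes a "column: value" line of text.
        rows = csv.DictReader(io.StringIO(raw))
        return [
            Document(", ".join(f"{k}: {v}" for k, v in row.items()),
                     {"source": source, "row": i})
            for i, row in enumerate(rows)
        ]
    if fmt == "json":
        # A top-level list yields one Document per item; anything else yields one.
        data = json.loads(raw)
        items = data if isinstance(data, list) else [data]
        return [
            Document(json.dumps(item, sort_keys=True),
                     {"source": source, "item": i})
            for i, item in enumerate(items)
        ]
    raise ValueError(f"unsupported format: {fmt}")


def chunk(doc: Document, size: int = 200, overlap: int = 50) -> list[Document]:
    """Split a document into overlapping character windows for indexing."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    chunks = []
    start = 0
    while True:
        chunks.append(
            Document(doc.text[start:start + size], {**doc.metadata, "start": start})
        )
        if start + size >= len(doc.text):
            break
        start += step
    return chunks
```

Each chunk keeps the source metadata plus its character offset, so a retrieved passage can be traced back to the record it came from.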

### 2. Retrieval
The retrieval component fetches relevant data from the indexed sources using vector-based search, keyword-based search, or a hybrid of both.

- Support for multiple databases (e.g., FAISS, Elasticsearch)
- Query transformation for enhanced search accuracy
- Embedding-based retrieval (using sentence-transformers, OpenAI embeddings, etc.)
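
To make the hybrid idea concrete, the sketch below blends a vector similarity score with a keyword overlap score. It is an illustration under stated assumptions rather than this package's implementation: all function names are hypothetical, `embed` is a toy bag-of-words stand-in for a real embedding model, and the in-memory list stands in for a store such as FAISS or Elasticsearch.

```python
import math
import re
from collections import Counter

# Small illustrative stopword list; a real query transformer would do more.
STOPWORDS = {"a", "an", "the", "is", "what", "for", "of"}


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def transform_query(query: str) -> str:
    """Toy query transformation: lowercase and drop stopwords."""
    return " ".join(t for t in tokenize(query) if t not in STOPWORDS)


def embed(text: str) -> Counter:
    """Stand-in for an embedding model: a term-count vector."""
    return Counter(tokenize(text))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def keyword_score(query: str, doc: str) -> float:
    """Fraction of distinct query terms that appear in the document."""
    q_terms = set(tokenize(query))
    if not q_terms:
        return 0.0
    return len(q_terms & set(tokenize(doc))) / len(q_terms)


def hybrid_search(query: str, docs: list[str], k: int = 2,
                  alpha: float = 0.5) -> list[tuple[float, str]]:
    """Rank docs by alpha * vector score + (1 - alpha) * keyword score."""
    q = transform_query(query)
    q_vec = embed(q)
    scored = [
        (alpha * cosine(q_vec, embed(d)) + (1 - alpha) * keyword_score(q, d), d)
        for d in docs
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

Setting `alpha=1.0` gives pure vector ranking and `alpha=0.0` gives pure keyword ranking. With a real embedding model the two scores would capture different signals (semantic versus lexical), which is the reason to blend them.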

### 3. Generation
The generation component uses the retrieved data to generate responses with a large language model (LLM). It can be configured with different models, such as OpenAI's GPT, Hugging Face Transformers, or a local LLM.

- LLM-based contextual generation
- Support for different temperature and decoding strategies for controlled output
- Integration with OpenAI, Hugging Face, or custom LLMs
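
The generation step can be sketched as prompt assembly plus a pluggable model call. Everything here is an assumption made for illustration: the `LLM` callable signature, the prompt wording, and the `echo_llm` stub are hypothetical, and a real setup would wrap an OpenAI, Hugging Face, or local model behind the same callable.

```python
from typing import Callable

# Hypothetical model interface: (prompt, temperature) -> completion text.
LLM = Callable[[str, float], str]


def build_prompt(question: str, contexts: list[str]) -> str:
    """Number the retrieved passages and place them ahead of the question."""
    numbered = "\n".join(f"[{i}] {c}" for i, c in enumerate(contexts, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )


def generate(question: str, contexts: list[str], llm: LLM,
             temperature: float = 0.0) -> str:
    """Build the prompt and delegate to whichever model was plugged in."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    return llm(build_prompt(question, contexts), temperature)


def echo_llm(prompt: str, temperature: float) -> str:
    """Stub model for demonstration: returns the first context passage."""
    for line in prompt.splitlines():
        if line.startswith("[1] "):
            return line[4:]
    return ""
```

Keeping the model behind a plain callable means the prompt-building logic can be tested with a stub like `echo_llm`, and swapping providers does not touch the rest of the pipeline.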


