Metadata-Version: 2.4
Name: multi_rag
Version: 0.1.1
Summary: A module to facilitate local testing of RAG pipeline for multiple datatypes
Project-URL: Homepage, https://github.com
Author-email: Phani Srimadhav Mula <phanisrimadhav.mula@gmail.com>
License-File: LICENSE
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.8
Requires-Dist: chromadb>=0.5.0
Requires-Dist: langchain-core>=0.2.0
Requires-Dist: langchain-google-genai
Requires-Dist: langchain-text-splitters>=0.2.0
Requires-Dist: langchain>=0.2.0
Requires-Dist: openpyxl>=3.1.0
Requires-Dist: pandas>=2.0.0
Requires-Dist: pillow>=10.0.0
Requires-Dist: pymupdf>=1.24.0
Requires-Dist: python-docx>=1.1.0
Requires-Dist: sentence-transformers>=2.5.0
Description-Content-Type: text/markdown

# multi_rag
### This module facilitates testing RAG pipelines on local machines, using chromadb as the vector store, text embeddings from 'bge-base-en-v1.5', and image embeddings from 'clip-vit-b-32'.

### The module currently supports the pdf, docx, xlsx, png, jpg, jpeg, and txt file formats.
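
As a rough illustration of how the supported formats might be routed, the sketch below sorts a path into a document or image group by its extension. The helper name `classify_file` and the grouping itself are assumptions made for this example; they are not taken from multi_rag's code.

```python
from pathlib import Path

# Extensions listed above, grouped by the embedding path they would
# plausibly take. The grouping is an assumption, not multi_rag's own logic.
DOCUMENT_EXTENSIONS = {".pdf", ".docx", ".xlsx", ".txt"}
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def classify_file(path: str) -> str:
    """Return "document" or "image" for a supported file, else raise ValueError."""
    suffix = Path(path).suffix.lower()  # match extensions case-insensitively
    if suffix in DOCUMENT_EXTENSIONS:
        return "document"
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    raise ValueError(f"Unsupported file format: {suffix or '(none)'}")
```

Checking the extension up front like this lets a caller reject an unsupported file before any embedding model is loaded.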

#### Pass a file path to the embed function. It creates the chroma_db/ folder for the embeddings and a temp/ folder that stands in for a real database and stores the data chunks.

#### The retrieve function takes a query as input and returns a dictionary with the keys 'text', 'tables', and 'images'.
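
To make the shape of that dictionary concrete, here is a minimal sketch that counts the retrieved items per category. It assumes each key maps to a list, which the README does not state; `count_hits` and the sample data are hypothetical and hand-written, not real output from the module.

```python
from typing import Any, Dict, List

# Keys named in the README for the retrieval result.
EXPECTED_KEYS = ("text", "tables", "images")


def count_hits(result: Dict[str, List[Any]]) -> Dict[str, int]:
    """Count retrieved items per category, treating a missing key as empty."""
    return {key: len(result.get(key, [])) for key in EXPECTED_KEYS}


# Hand-written stand-in for a retrieval result (assumed list values).
sample = {
    "text": ["chunk one", "chunk two"],
    "tables": [],
    "images": ["temp/figure_1.png"],
}
print(count_hits(sample))  # {'text': 2, 'tables': 0, 'images': 1}
```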

#### The query function takes a query as input and returns the answer together with the retrieved data.

#### Querying requires a Gemini API key; embedding and retrieval run entirely locally.
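
Because only the query step needs a key, a script can check for one first and fall back to retrieval alone when it is missing. The sketch below assumes the key is supplied through the `GOOGLE_API_KEY` environment variable, which is the default that langchain-google-genai reads; whether multi_rag picks the key up this way is not stated here. The helper `gemini_key_available` is hypothetical.

```python
import os
from typing import Mapping, Optional


def gemini_key_available(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if a non-empty GOOGLE_API_KEY is set (assumed variable name)."""
    env = os.environ if env is None else env
    return bool(env.get("GOOGLE_API_KEY", "").strip())


if __name__ == "__main__":
    if gemini_key_available():
        print("Gemini key found: answering queries is possible.")
    else:
        print("No Gemini key: only local embedding and retrieval are available.")
```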