INFO:rag.fetch_2:Indexing 10 files from /tmp/rag_bench_10_3lxsnjd0
/home/masih/Desktop/projects/A-Modular-Kingdom/tests/../src/rag/core_2.py:39: LangChainDeprecationWarning: The class `HuggingFaceEmbeddings` was deprecated in LangChain 0.2.2 and will be removed in 1.0. An updated version of the class exists in the `langchain-huggingface` package and should be used instead. To use it run `pip install -U langchain-huggingface` and import as `from langchain_huggingface import HuggingFaceEmbeddings`.
  embeddings_model = SentenceTransformerEmbeddings(
INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: all-MiniLM-L6-v2
WARNING:huggingface_hub.utils._http:'(ReadTimeoutError("HTTPSConnectionPool(host='huggingface.co', port=443): Read timed out. (read timeout=10)"), '(Request ID: f1d91b71-f50c-448d-847a-cd6f2f6b3ff9)')' thrown while requesting HEAD https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/./modules.json
WARNING:huggingface_hub.utils._http:Retrying in 1s [Retry 1/5].
INFO:httpx:HTTP Request: GET https://5c99b123-9ead-4adb-b715-d10743893daf.us-west-2-0.aws.cloud.qdrant.io:6333 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://5c99b123-9ead-4adb-b715-d10743893daf.us-west-2-0.aws.cloud.qdrant.io:6333/collections "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: PUT https://5c99b123-9ead-4adb-b715-d10743893daf.us-west-2-0.aws.cloud.qdrant.io:6333/collections/rag_v2_scope_dfb18828 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://5c99b123-9ead-4adb-b715-d10743893daf.us-west-2-0.aws.cloud.qdrant.io:6333/collections/rag_v2_scope_dfb18828 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: PUT https://5c99b123-9ead-4adb-b715-d10743893daf.us-west-2-0.aws.cloud.qdrant.io:6333/collections/rag_v2_scope_dfb18828/points?wait=true "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://5c99b123-9ead-4adb-b715-d10743893daf.us-west-2-0.aws.cloud.qdrant.io:6333/collections/rag_v2_scope_dfb18828/points/search "HTTP/1.1 200 OK"


######################################################################
# Testing with 10 documents
######################################################################

Creating 10 test documents...

============================================================
Benchmarking RAG V2 with 10 documents
============================================================
[RAG] Scanning: /tmp/rag_bench_10_3lxsnjd0
[RAG] Exclude patterns: ['test_*.py', '*_test.py', '*__pycache__*', '*.pyc']
[RAG] Max files: 100
[RAG] Found 10 indexable files
[RAG V2] Creating new instance for dfb18828...
[RAG V2] Using device: cuda
[RAG V2] Using SentenceTransformer embeddings with model: all-MiniLM-L6-v2
[Qdrant] Connecting to cloud: https://5c99b123-9ead-4adb-b715-d10743893daf.us-west-2-0.aws.cloud.qdrant.io:6333
[Qdrant] Created collection: rag_v2_scope_dfb18828
[RAG V2] Creating new Qdrant database at /home/masih/Desktop/projects/A-Modular-Kingdom/src/rag/rag_db_v2/scope_dfb18828...
[Qdrant] Batch indexing 10 documents...
[Qdrant] Indexed 10/10 documents...
[RAG V2] Indexed 10 chunks from 10 files
[RAG V2] Initialization complete
[RAG V2] Searching for: 'neural networks...'
[RAG V2] Re-ranking 8 initial results...

Batches: 100%|██████████| 1/1 [00:00<00:00, 142.36it/s]
INFO:rag.fetch_2:Indexing 10 files from /tmp/rag_bench_10_3lxsnjd0
INFO:httpx:HTTP Request: POST https://5c99b123-9ead-4adb-b715-d10743893daf.us-west-2-0.aws.cloud.qdrant.io:6333/collections/rag_v2_scope_dfb18828/points/search "HTTP/1.1 200 OK"
[RAG V2] Returning 5 unique results
[RAG] Scanning: /tmp/rag_bench_10_3lxsnjd0
[RAG] Exclude patterns: ['test_*.py', '*_test.py', '*__pycache__*', '*.pyc']
[RAG] Max files: 100
[RAG] Found 10 indexable files
[RAG V2] Using cached instance for dfb18828
[RAG V2] Searching for: 'TensorFlow...'
[RAG V2] Re-ranking 7 initial results...

Batches: 100%|██████████| 1/1 [00:00<00:00, 168.74it/s]
INFO:rag.fetch_2:Indexing 10 files from /tmp/rag_bench_10_3lxsnjd0
INFO:httpx:HTTP Request: POST https://5c99b123-9ead-4adb-b715-d10743893daf.us-west-2-0.aws.cloud.qdrant.io:6333/collections/rag_v2_scope_dfb18828/points/search "HTTP/1.1 200 OK"
[RAG V2] Returning 5 unique results
[RAG] Scanning: /tmp/rag_bench_10_3lxsnjd0
[RAG] Exclude patterns: ['test_*.py', '*_test.py', '*__pycache__*', '*.pyc']
[RAG] Max files: 100
[RAG] Found 10 indexable files
[RAG V2] Using cached instance for dfb18828
[RAG V2] Searching for: 'classification...'
[RAG V2] Re-ranking 8 initial results...

Batches: 100%|██████████| 1/1 [00:00<00:00, 168.30it/s]
[RAG V3] Creating new instance for /tmp/rag_bench_10_3lxsnjd0...
[RAG] About to create RAGPipelineV3...
INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: all-MiniLM-L6-v2
[RAG V2] Returning 5 unique results
Cold start (with indexing): 30.71s
Warm query 1: 0.32s
Warm query 2: 0.32s
Average warm query: 0.32s
Result length: 3119 chars

============================================================
Benchmarking RAG V3 with 10 documents
============================================================
[RAG] Scanning: /tmp/rag_bench_10_3lxsnjd0
[RAG] Exclude patterns: ['test_*.py', '*_test.py', '*__pycache__*', '*.pyc']
[RAG] Max files: 100
[RAG] Found 10 indexable files
[RAG V3] Indexing 10 files from /tmp/rag_bench_10_3lxsnjd0
[RAG V3] Initializing with config: v3
[RAG V3] Using device: cuda
[RAG V3] Using SentenceTransformer embeddings with model: all-MiniLM-L6-v2
[Qdrant] Using local storage: /home/masih/Desktop/projects/A-Modular-Kingdom/src/rag/rag_db_v3/rag_bench_10_3lxsnjd0_e26f168f/qdrant_storage
[Qdrant] Created collection: rag_v3_vectors
[RAG V3] Using CrossEncoder reranking with model: cross-encoder/ms-marco-MiniLM-L-6-v2
[RAG V3] Loading existing database from /home/masih/Desktop/projects/A-Modular-Kingdom/src/rag/rag_db_v3/rag_bench_10_3lxsnjd0_e26f168f...
[RAG V3] Database empty. Triggering indexing...
[RAG V3] Processing doc_7.md...
[RAG V3] Processing doc_3.md...
[RAG V3] Processing doc_9.md...
[RAG V3] Processing doc_6.md...
[RAG V3] Processing doc_4.md...
[RAG V3] Processing doc_1.md...
[RAG V3] Processing doc_8.md...
[RAG V3] Processing doc_2.md...
[RAG V3] Processing doc_0.md...
[RAG V3] Processing doc_5.md...
[RAG V3] Indexing 10 chunks...
[Qdrant] Batch indexing 10 documents...

Batches: 100%|██████████| 1/1 [00:00<00:00, 85.14it/s]
[RAG] V3 initialization complete
[Qdrant] Indexed 10/10 documents...
[RAG V3] Building BM25 index...
[RAG V3] Database created with 10 chunks
[RAG V3] Searching with query: 'neural networks...'
[RAG V3] Step 3: Vector search...

Batches: 100%|██████████| 1/1 [00:00<00:00, 203.85it/s]
[RAG V3] Step 4: BM25 search...
[RAG V3] Step 5: RRF fusion...
[RAG V3] Step 6: CrossEncoder reranking...

Batches: 100%|██████████| 1/1 [00:00<00:00, 155.07it/s]
[RAG V3] Using cached instance for /tmp/rag_bench_10_3lxsnjd0
[RAG V3] Search completed, returning 5 unique results
[RAG] Scanning: /tmp/rag_bench_10_3lxsnjd0
[RAG] Exclude patterns: ['test_*.py', '*_test.py', '*__pycache__*', '*.pyc']
[RAG] Max files: 100
[RAG] Found 10 indexable files
[RAG V3] Indexing 10 files from /tmp/rag_bench_10_3lxsnjd0
[RAG V3] Searching with query: 'TensorFlow...'
[RAG V3] Step 3: Vector search...

Batches: 100%|██████████| 1/1 [00:00<00:00, 229.38it/s]
[RAG V3] Step 4: BM25 search...
[RAG V3] Step 5: RRF fusion...
[RAG V3] Step 6: CrossEncoder reranking...

Batches: 100%|██████████| 1/1 [00:00<00:00, 183.92it/s]
[RAG V3] Using cached instance for /tmp/rag_bench_10_3lxsnjd0
[RAG V3] Search completed, returning 5 unique results
[RAG] Scanning: /tmp/rag_bench_10_3lxsnjd0
[RAG] Exclude patterns: ['test_*.py', '*_test.py', '*__pycache__*', '*.pyc']
[RAG] Max files: 100
[RAG] Found 10 indexable files
[RAG V3] Indexing 10 files from /tmp/rag_bench_10_3lxsnjd0
[RAG V3] Searching with query: 'classification...'
[RAG V3] Step 3: Vector search...

Batches: 100%|██████████| 1/1 [00:00<00:00, 280.24it/s]
[RAG V3] Step 4: BM25 search...
[RAG V3] Step 5: RRF fusion...
[RAG V3] Step 6: CrossEncoder reranking...

Batches: 100%|██████████| 1/1 [00:00<00:00, 173.71it/s]
INFO:rag.fetch_2:Indexing 100 files from /tmp/rag_bench_100_ag0yj6w4
INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: all-MiniLM-L6-v2
INFO:httpx:HTTP Request: GET https://5c99b123-9ead-4adb-b715-d10743893daf.us-west-2-0.aws.cloud.qdrant.io:6333 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://5c99b123-9ead-4adb-b715-d10743893daf.us-west-2-0.aws.cloud.qdrant.io:6333/collections "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: PUT https://5c99b123-9ead-4adb-b715-d10743893daf.us-west-2-0.aws.cloud.qdrant.io:6333/collections/rag_v2_scope_fe948fd6 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://5c99b123-9ead-4adb-b715-d10743893daf.us-west-2-0.aws.cloud.qdrant.io:6333/collections/rag_v2_scope_fe948fd6 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: PUT https://5c99b123-9ead-4adb-b715-d10743893daf.us-west-2-0.aws.cloud.qdrant.io:6333/collections/rag_v2_scope_fe948fd6/points?wait=true "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://5c99b123-9ead-4adb-b715-d10743893daf.us-west-2-0.aws.cloud.qdrant.io:6333/collections/rag_v2_scope_fe948fd6/points/search "HTTP/1.1 200 OK"
[RAG V3] Search completed, returning 5 unique results
Cold start (with indexing): 13.74s
Warm query 1: 0.01s
Warm query 2: 0.01s
Average warm query: 0.01s
Result length: 3124 chars


######################################################################
# Testing with 100 documents
######################################################################

Creating 100 test documents...

============================================================
Benchmarking RAG V2 with 100 documents
============================================================
[RAG] Scanning: /tmp/rag_bench_100_ag0yj6w4
[RAG] Exclude patterns: ['test_*.py', '*_test.py', '*__pycache__*', '*.pyc']
[RAG] Max files: 100
[RAG] Found 100 indexable files
[RAG V2] Creating new instance for fe948fd6...
[RAG V2] Using device: cuda
[RAG V2] Using SentenceTransformer embeddings with model: all-MiniLM-L6-v2
[Qdrant] Connecting to cloud: https://5c99b123-9ead-4adb-b715-d10743893daf.us-west-2-0.aws.cloud.qdrant.io:6333
[Qdrant] Created collection: rag_v2_scope_fe948fd6
[RAG V2] Creating new Qdrant database at /home/masih/Desktop/projects/A-Modular-Kingdom/src/rag/rag_db_v2/scope_fe948fd6...
[Qdrant] Batch indexing 100 documents...
[Qdrant] Indexed 100/100 documents...
[RAG V2] Indexed 100 chunks from 100 files
[RAG V2] Initialization complete
[RAG V2] Searching for: 'neural networks...'
[RAG V2] Re-ranking 10 initial results...

Batches: 100%|██████████| 1/1 [00:00<00:00, 147.40it/s]
INFO:rag.fetch_2:Indexing 100 files from /tmp/rag_bench_100_ag0yj6w4
INFO:httpx:HTTP Request: POST https://5c99b123-9ead-4adb-b715-d10743893daf.us-west-2-0.aws.cloud.qdrant.io:6333/collections/rag_v2_scope_fe948fd6/points/search "HTTP/1.1 200 OK"
[RAG V2] Returning 5 unique results
[RAG] Scanning: /tmp/rag_bench_100_ag0yj6w4
[RAG] Exclude patterns: ['test_*.py', '*_test.py', '*__pycache__*', '*.pyc']
[RAG] Max files: 100
[RAG] Found 100 indexable files
[RAG V2] Using cached instance for fe948fd6
[RAG V2] Searching for: 'TensorFlow...'
[RAG V2] Re-ranking 9 initial results...

Batches: 100%|██████████| 1/1 [00:00<00:00, 143.11it/s]
INFO:rag.fetch_2:Indexing 100 files from /tmp/rag_bench_100_ag0yj6w4
INFO:httpx:HTTP Request: POST https://5c99b123-9ead-4adb-b715-d10743893daf.us-west-2-0.aws.cloud.qdrant.io:6333/collections/rag_v2_scope_fe948fd6/points/search "HTTP/1.1 200 OK"
[RAG V2] Returning 5 unique results
[RAG] Scanning: /tmp/rag_bench_100_ag0yj6w4
[RAG] Exclude patterns: ['test_*.py', '*_test.py', '*__pycache__*', '*.pyc']
[RAG] Max files: 100
[RAG] Found 100 indexable files
[RAG V2] Using cached instance for fe948fd6
[RAG V2] Searching for: 'classification...'
[RAG V2] Re-ranking 10 initial results...

Batches: 100%|██████████| 1/1 [00:00<00:00, 152.79it/s]
[RAG V3] Creating new instance for /tmp/rag_bench_100_ag0yj6w4...
[RAG] About to create RAGPipelineV3...
INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: all-MiniLM-L6-v2
[RAG V2] Returning 5 unique results
Cold start (with indexing): 26.75s
Warm query 1: 0.31s
Warm query 2: 0.31s
Average warm query: 0.31s
Result length: 3127 chars

============================================================
Benchmarking RAG V3 with 100 documents
============================================================
[RAG] Scanning: /tmp/rag_bench_100_ag0yj6w4
[RAG] Exclude patterns: ['test_*.py', '*_test.py', '*__pycache__*', '*.pyc']
[RAG] Max files: 100
[RAG] Found 100 indexable files
[RAG V3] Indexing 100 files from /tmp/rag_bench_100_ag0yj6w4
[RAG V3] Initializing with config: v3
[RAG V3] Using device: cuda
[RAG V3] Using SentenceTransformer embeddings with model: all-MiniLM-L6-v2
[Qdrant] Using local storage: /home/masih/Desktop/projects/A-Modular-Kingdom/src/rag/rag_db_v3/rag_bench_100_ag0yj6w4_0c59d4ed/qdrant_storage
[Qdrant] Created collection: rag_v3_vectors
[RAG V3] Using CrossEncoder reranking with model: cross-encoder/ms-marco-MiniLM-L-6-v2
[RAG V3] Loading existing database from /home/masih/Desktop/projects/A-Modular-Kingdom/src/rag/rag_db_v3/rag_bench_100_ag0yj6w4_0c59d4ed...
[RAG V3] Database empty. Triggering indexing...
[RAG V3] Processing doc_85.md...
[RAG V3] Processing doc_53.md...
[RAG V3] Processing doc_28.md...
[RAG V3] Processing doc_71.md...
[RAG V3] Processing doc_80.md...
[RAG V3] Processing doc_11.md...
[RAG V3] Processing doc_38.md...
[RAG V3] Processing doc_10.md...
[RAG V3] Processing doc_94.md...
[RAG V3] Processing doc_89.md...
[RAG V3] Processing doc_99.md...
[RAG V3] Processing doc_90.md...
[RAG V3] Processing doc_91.md...
[RAG V3] Processing doc_44.md...
[RAG V3] Processing doc_50.md...
[RAG V3] Processing doc_29.md...
[RAG V3] Processing doc_97.md...
[RAG V3] Processing doc_15.md...
[RAG V3] Processing doc_67.md...
[RAG V3] Processing doc_37.md...
[RAG V3] Processing doc_79.md...
[RAG V3] Processing doc_36.md...
[RAG V3] Processing doc_68.md...
[RAG V3] Processing doc_27.md...
[RAG V3] Processing doc_56.md...
[RAG V3] Processing doc_25.md...
[RAG V3] Processing doc_75.md...
[RAG V3] Processing doc_59.md...
[RAG V3] Processing doc_93.md...
[RAG V3] Processing doc_40.md...
[RAG V3] Processing doc_62.md...
[RAG V3] Processing doc_7.md...
[RAG V3] Processing doc_16.md...
[RAG V3] Processing doc_98.md...
[RAG V3] Processing doc_3.md...
[RAG V3] Processing doc_48.md...
[RAG V3] Processing doc_34.md...
[RAG V3] Processing doc_9.md...
[RAG V3] Processing doc_13.md...
[RAG V3] Processing doc_88.md...
[RAG V3] Processing doc_18.md...
[RAG V3] Processing doc_83.md...
[RAG V3] Processing doc_6.md...
[RAG V3] Processing doc_41.md...
[RAG V3] Processing doc_65.md...
[RAG V3] Processing doc_33.md...
[RAG V3] Processing doc_66.md...
[RAG V3] Processing doc_58.md...
[RAG V3] Processing doc_51.md...
[RAG V3] Processing doc_20.md...
[RAG V3] Processing doc_4.md...
[RAG V3] Processing doc_60.md...
[RAG V3] Processing doc_47.md...
[RAG V3] Processing doc_92.md...
[RAG V3] Processing doc_1.md...
[RAG V3] Processing doc_52.md...
[RAG V3] Processing doc_74.md...
[RAG V3] Processing doc_43.md...
[RAG V3] Processing doc_57.md...
[RAG V3] Processing doc_82.md...
[RAG V3] Processing doc_21.md...
[RAG V3] Processing doc_39.md...
[RAG V3] Processing doc_24.md...
[RAG V3] Processing doc_64.md...
[RAG V3] Processing doc_63.md...
[RAG V3] Processing doc_17.md...
[RAG V3] Processing doc_70.md...
[RAG V3] Processing doc_32.md...
[RAG V3] Processing doc_30.md...
[RAG V3] Processing doc_84.md...
[RAG V3] Processing doc_86.md...
[RAG V3] Processing doc_14.md...
[RAG V3] Processing doc_8.md...
[RAG V3] Processing doc_35.md...
[RAG V3] Processing doc_23.md...
[RAG V3] Processing doc_95.md...
[RAG V3] Processing doc_72.md...
[RAG V3] Processing doc_96.md...
[RAG V3] Processing doc_2.md...
[RAG V3] Processing doc_31.md...
[RAG V3] Processing doc_12.md...
[RAG V3] Processing doc_54.md...
[RAG V3] Processing doc_49.md...
[RAG V3] Processing doc_26.md...
[RAG V3] Processing doc_73.md...
[RAG V3] Processing doc_76.md...
[RAG V3] Processing doc_81.md...
[RAG V3] Processing doc_45.md...
[RAG V3] Processing doc_19.md...
[RAG V3] Processing doc_77.md...
[RAG V3] Processing doc_0.md...
[RAG V3] Processing doc_42.md...
[RAG V3] Processing doc_5.md...
[RAG V3] Processing doc_61.md...
[RAG V3] Processing doc_78.md...
[RAG V3] Processing doc_69.md...
[RAG V3] Processing doc_87.md...
[RAG V3] Processing doc_22.md...
[RAG V3] Processing doc_55.md...
[RAG V3] Processing doc_46.md...
[RAG V3] Indexing 100 chunks...
[Qdrant] Batch indexing 100 documents...

Batches: 100%|██████████| 4/4 [00:00<00:00, 48.49it/s]
[RAG] V3 initialization complete
[Qdrant] Indexed 100/100 documents...
[RAG V3] Building BM25 index...
[RAG V3] Database created with 100 chunks
[RAG V3] Searching with query: 'neural networks...'
[RAG V3] Step 3: Vector search...

Batches: 100%|██████████| 1/1 [00:00<00:00, 221.13it/s]
[RAG V3] Step 4: BM25 search...
[RAG V3] Step 5: RRF fusion...
[RAG V3] Step 6: CrossEncoder reranking...

Batches: 100%|██████████| 1/1 [00:00<00:00, 112.96it/s]
[RAG V3] Using cached instance for /tmp/rag_bench_100_ag0yj6w4
[RAG V3] Search completed, returning 5 unique results
[RAG] Scanning: /tmp/rag_bench_100_ag0yj6w4
[RAG] Exclude patterns: ['test_*.py', '*_test.py', '*__pycache__*', '*.pyc']
[RAG] Max files: 100
[RAG] Found 100 indexable files
[RAG V3] Indexing 100 files from /tmp/rag_bench_100_ag0yj6w4
[RAG V3] Searching with query: 'TensorFlow...'
[RAG V3] Step 3: Vector search...

Batches: 100%|██████████| 1/1 [00:00<00:00, 276.27it/s]
[RAG V3] Step 4: BM25 search...
[RAG V3] Step 5: RRF fusion...
[RAG V3] Step 6: CrossEncoder reranking...

Batches: 100%|██████████| 1/1 [00:00<00:00, 182.62it/s]
[RAG V3] Using cached instance for /tmp/rag_bench_100_ag0yj6w4
[RAG V3] Search completed, returning 5 unique results
[RAG] Scanning: /tmp/rag_bench_100_ag0yj6w4
[RAG] Exclude patterns: ['test_*.py', '*_test.py', '*__pycache__*', '*.pyc']
[RAG] Max files: 100
[RAG] Found 100 indexable files
[RAG V3] Indexing 100 files from /tmp/rag_bench_100_ag0yj6w4
[RAG V3] Searching with query: 'classification...'
[RAG V3] Step 3: Vector search...

Batches: 100%|██████████| 1/1 [00:00<00:00, 277.68it/s]
[RAG V3] Step 4: BM25 search...
[RAG V3] Step 5: RRF fusion...
[RAG V3] Step 6: CrossEncoder reranking...

Batches: 100%|██████████| 1/1 [00:00<00:00, 179.84it/s]
[RAG V3] Search completed, returning 5 unique results
Cold start (with indexing): 13.85s
Warm query 1: 0.02s
Warm query 2: 0.02s
Average warm query: 0.02s
Result length: 3134 chars


======================================================================
BENCHMARK SUMMARY
======================================================================

Version         Docs       Cold (s)     Warm (s)    
----------------------------------------------------------------------
RAG V2          10         30.71        0.32        
RAG V3          10         13.74        0.01        
RAG V2          100        26.75        0.31        
RAG V3          100        13.85        0.02        
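
The ratios implied by the table above can be worked out with a short Python sketch. The timings are copied from the summary. The `results` layout and the `speedup` helper are illustrative names and are not taken from the benchmark script. Warm timings are reported to two decimals, so the warm ratios are coarse.

```python
# Timings copied from the summary table: (cold seconds, warm seconds).
results = {
    ("V2", 10): (30.71, 0.32),
    ("V3", 10): (13.74, 0.01),
    ("V2", 100): (26.75, 0.31),
    ("V3", 100): (13.85, 0.02),
}


def speedup(docs: int) -> tuple[float, float]:
    """Return (cold, warm) ratios of V2 time to V3 time for a corpus size."""
    v2_cold, v2_warm = results[("V2", docs)]
    v3_cold, v3_warm = results[("V3", docs)]
    return v2_cold / v3_cold, v2_warm / v3_warm


for docs in (10, 100):
    cold, warm = speedup(docs)
    print(f"{docs} docs: V2/V3 cold = {cold:.2f}x, warm = {warm:.1f}x")
```

A ratio above 1 means V2 took longer than V3 in that row.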

======================================================================
RECOMMENDATIONS
======================================================================

Based on benchmarks:
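- V3: Faster in these runs for both cold start (13.74-13.85s vs 26.75-30.71s) and warm queries (0.01-0.02s vs 0.31-0.32s) at 10 and 100 docs, even with CrossEncoder reranking
- V2: Warm queries stay under 1s (0.31-0.32s), but each search is an HTTP request to Qdrant cloud, while V3 uses local Qdrant storage
- V2's 10-doc cold start included a HuggingFace read timeout (10s) and a retry, so 30.71s likely overstates its typical cold start
- These runs measure latency only; retrieval accuracy was not measured
- Both use GPU acceleration (CUDA)
- Indexing is a one-time cost, amortized over many queries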
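
The benchmark script itself is not shown in this log. As a rough sketch of how cold and warm timings like those above might be collected, the Python below times the first query as the cold start and the remaining queries as warm. The `benchmark` function and the `fake_search` stand-in are hypothetical and only simulate a pipeline that pays a setup cost on its first call.

```python
import time
from statistics import mean
from typing import Callable, Sequence


def benchmark(search: Callable[[str], str], queries: Sequence[str]) -> dict:
    """Time the first query as the cold start and the rest as warm queries."""
    timings = []
    for query in queries:
        start = time.perf_counter()
        search(query)
        timings.append(time.perf_counter() - start)
    cold, warm = timings[0], timings[1:]
    return {"cold": cold, "warm": warm, "avg_warm": mean(warm) if warm else None}


# Stand-in for a real pipeline: the first call pays a simulated setup cost.
_state = {"indexed": False}


def fake_search(query: str) -> str:
    if not _state["indexed"]:
        time.sleep(0.05)  # pretend to load models and build the index
        _state["indexed"] = True
    return f"results for {query}"


report = benchmark(fake_search, ["neural networks", "TensorFlow", "classification"])
print(f"Cold start (with indexing): {report['cold']:.2f}s")
print(f"Average warm query: {report['avg_warm']:.2f}s")
```

With three queries this yields one cold measurement and two warm ones, matching the shape of the runs logged above.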
