{% extends "base.html" %}
{% from "components/help_macros.html" import tooltip, help_panel, help_step, help_tip, glossary_term %}
{% block title %}Embedding Settings{% endblock %}
{% block extra_head %}{% endblock %}
{% block content %}

Default Embedding Settings

Configure default embedding models for new collections and semantic search

{% call help_panel('embedding-how', 'Understanding Embeddings', icon='brain', collapsed=true, dismissible=true) %}
{{ help_step(1, "What are Embeddings?", "Embeddings convert text into lists of numbers (vectors) that capture meaning. Texts with similar meanings have similar vectors.") }}
{{ help_step(2, "Chunk Size", "Larger chunks give more context per search result but less precise matches. Smaller chunks are more granular but may lose surrounding context.") }}
{{ help_step(3, "Text Splitter", "Recursive: splits on natural breaks. Semantic: AI-powered grouping (slower). Token-based: splits by token count to respect LLM limits.") }}
{{ help_step(4, "Why Lock Settings?", "Vectors produced by different embedding models are not comparable, so mixing models causes inconsistent search results. Each collection uses one model.") }}
{{ help_tip("Recommended: Start with defaults. Only change if you understand the trade-offs.") }} {% endcall %}

Important: Existing Collections

Changing these settings will only affect NEW collections. Existing collections keep their original embedding settings to ensure data integrity and search consistency.

Once a collection is indexed, its embedding settings are locked. To change settings for an existing collection, you must re-index the entire collection, which will delete the current index.

Test Configuration

Test the currently selected embedding model to ensure it's working correctly.

Configuration

Saved Default Settings

Loading...

These are your saved defaults for new collections. Change them using the form below.

Default Embedding Configuration

Characters per chunk
Overlap between chunks
How documents are split into chunks. Semantic uses AI to find natural breakpoints (slower, but chunks follow the meaning of the text more closely).
Cosine similarity recommended for text embeddings
HNSW recommended for collections with more than 10K documents (roughly 50-60× faster search)
Recommended: Ensures fair comparison regardless of document length
JSON array of separators. The default works for about 80% of use cases. For code, add language-specific separators such as "\nclass" and "\ndef".
These settings will be used as defaults for new collections. Existing collections will continue using their current settings unless re-indexed with new defaults.

Provider Information

Loading provider information...
{% endblock %}