jeevesagent.vectorstore.base
============================

.. py:module:: jeevesagent.vectorstore.base

.. autoapi-nested-parse::

   VectorStore protocol + shared types and helpers.

   Every concrete vector store implements the :class:`VectorStore`
   protocol, a small async surface of six methods (``add``, ``delete``,
   ``search``, ``search_by_vector``, ``count``, ``get_by_ids``).
   Backends differ in storage and ANN algorithm, but the interface is
   identical, so swapping ``InMemoryVectorStore`` for
   ``ChromaVectorStore`` or another backend is a one-line change.

   .. rubric:: Filtering

   The ``filter`` argument to :meth:`search` is a Mongo-style query
   expression — see :mod:`jeevesagent.vectorstore._filter` for the
   operator reference. Common shapes::

       {"source": "report.pdf"}                 # equality shorthand
       {"page": {"$gte": 5}}                    # range
       {"tag": {"$in": ["draft", "final"]}}     # membership
       {"$and": [{"a": 1}, {"b": 2}]}           # composition
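
   As a rough illustration of how these shapes might be evaluated against a
   chunk's metadata, here is a minimal sketch. ``toy_matches`` and its
   three-operator table are hypothetical and written only for this example;
   the operators actually supported are defined in
   :mod:`jeevesagent.vectorstore._filter` and may behave differently, for
   instance on missing keys or empty filters.

   .. code-block:: python

      from __future__ import annotations

      from collections.abc import Mapping
      from typing import Any

      # Hypothetical operator table covering only the shapes listed above.
      _OPS = {
          "$eq": lambda value, arg: value == arg,
          "$gte": lambda value, arg: value is not None and value >= arg,
          "$in": lambda value, arg: value in arg,
      }


      def toy_matches(metadata: Mapping[str, Any], flt: Mapping[str, Any] | None) -> bool:
          """Illustrative evaluator; not the library's implementation."""
          if not flt:
              return True  # assumption: a missing or empty filter matches everything
          for key, cond in flt.items():
              if key == "$and":
                  # Composition: every sub-filter must match.
                  if not all(toy_matches(metadata, sub) for sub in cond):
                      return False
              elif isinstance(cond, Mapping):
                  # Operator form, e.g. {"page": {"$gte": 5}}.
                  value = metadata.get(key)
                  if not all(_OPS[op](value, arg) for op, arg in cond.items()):
                      return False
              elif metadata.get(key) != cond:
                  # Equality shorthand, e.g. {"source": "report.pdf"}.
                  return False
          return True


      meta = {"source": "report.pdf", "page": 7, "tag": "draft"}
      print(toy_matches(meta, {"source": "report.pdf"}))                     # True
      print(toy_matches(meta, {"page": {"$gte": 5}}))                        # True
      print(toy_matches(meta, {"tag": {"$in": ["draft", "final"]}}))         # True
      print(toy_matches(meta, {"$and": [{"page": 7}, {"tag": "final"}]}))    # False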

   .. rubric:: Diversity (MMR)

   :meth:`search` accepts ``diversity: float | None`` in [0, 1] for
   Maximal Marginal Relevance (MMR) reranking. ``None`` (the default)
   gives plain top-k by similarity, and ``0.0`` behaves identically.
   ``1.0`` is maximum diversity. Values between ``0.3`` and ``0.5``
   are a reasonable starting point when diversity is wanted.

   We picked the 0..1 diversity scale (rather than LangChain's
   inverted ``lambda_mult``) because "more diverse → bigger number"
   is intuitive and "fully relevant" is the natural zero state.
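
   To make the scale concrete, the sketch below shows one common greedy
   formulation of MMR, assuming that ``diversity`` plays the role of
   ``1 - lambda`` in the usual formula. ``cosine`` and ``mmr_rerank`` are
   hypothetical helpers for this example only; individual backends may
   implement reranking differently.

   .. code-block:: python

      from __future__ import annotations

      import math


      def cosine(a: list[float], b: list[float]) -> float:
          """Cosine similarity; returns 0.0 if either vector has zero norm."""
          dot = sum(x * y for x, y in zip(a, b))
          norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
          return dot / norm if norm else 0.0


      def mmr_rerank(
          query: list[float],
          candidates: list[list[float]],
          k: int,
          diversity: float | None,
      ) -> list[int]:
          """Greedy MMR; returns indices into ``candidates`` in pick order."""
          d = 0.0 if diversity is None else diversity  # None behaves like 0.0
          relevance = [cosine(query, c) for c in candidates]
          selected: list[int] = []
          remaining = list(range(len(candidates)))
          while remaining and len(selected) < k:

              def score(i: int) -> float:
                  # Penalise similarity to anything already picked.
                  redundancy = max(
                      (cosine(candidates[i], candidates[j]) for j in selected),
                      default=0.0,
                  )
                  return (1.0 - d) * relevance[i] - d * redundancy

              best = max(remaining, key=score)
              selected.append(best)
              remaining.remove(best)
          return selected


      query = [1.0, 0.0]
      candidates = [[1.0, 0.1], [1.0, 0.12], [0.8, -0.6]]  # 0 and 1 are near-duplicates
      print(mmr_rerank(query, candidates, k=2, diversity=None))  # [0, 1]
      print(mmr_rerank(query, candidates, k=2, diversity=0.5))   # [0, 2]

   With ``diversity=None`` the two near-duplicate vectors win on raw
   similarity. At ``0.5`` the second pick goes to the less similar but
   less redundant candidate.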



Classes
-------

.. autoapisummary::

   jeevesagent.vectorstore.base.SearchResult
   jeevesagent.vectorstore.base.VectorStore


Functions
---------

.. autoapisummary::

   jeevesagent.vectorstore.base.matches_filter


Module Contents
---------------

.. py:class:: SearchResult

   One hit from :meth:`VectorStore.search`.

   * ``chunk``: the matched chunk (with its full metadata).
   * ``score``: similarity in [-1, 1] for cosine; backend-specific
     for other distance metrics. Higher = more similar.
   * ``id``: the store-assigned id (so callers can ``delete()``
     or ``get_by_ids()`` later).


   .. py:attribute:: chunk
      :type:  jeevesagent.loader.base.Chunk


   .. py:attribute:: id
      :type:  str


   .. py:attribute:: score
      :type:  float


.. py:class:: VectorStore

   Bases: :py:obj:`Protocol`


   Async protocol for vector stores.

   Six methods cover the lifecycle: ``add`` (embed + store), ``delete``,
   ``search`` (by query string), ``search_by_vector`` (by precomputed
   vector), ``count``, and ``get_by_ids``.

   Backends that aren't natively async (FAISS, Chroma) wrap their
   sync calls in :func:`anyio.to_thread.run_sync` so they don't
   block the event loop.
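
   The wrapping pattern can be sketched as follows. To keep the example
   dependency-free it uses the standard library's ``asyncio.to_thread`` as
   a stand-in for :func:`anyio.to_thread.run_sync`; ``SyncBackend`` and
   ``AsyncAdapter`` are hypothetical names, not classes from this package.

   .. code-block:: python

      import asyncio
      import time


      class SyncBackend:
          """Hypothetical blocking client standing in for a sync library."""

          def __init__(self) -> None:
              self._rows: dict[str, str] = {}

          def put(self, key: str, text: str) -> None:
              time.sleep(0.01)  # simulate blocking I/O
              self._rows[key] = text

          def size(self) -> int:
              return len(self._rows)


      class AsyncAdapter:
          """Exposes the blocking client through async methods."""

          def __init__(self, backend: SyncBackend) -> None:
              self._backend = backend

          async def put(self, key: str, text: str) -> None:
              # With anyio this would be:
              #   await anyio.to_thread.run_sync(self._backend.put, key, text)
              await asyncio.to_thread(self._backend.put, key, text)

          async def count(self) -> int:
              return await asyncio.to_thread(self._backend.size)


      async def main() -> int:
          store = AsyncAdapter(SyncBackend())
          # Both blocking calls run in worker threads; the loop stays free.
          await asyncio.gather(store.put("a", "alpha"), store.put("b", "beta"))
          return await store.count()


      result = asyncio.run(main())
      print(result)  # 2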


   .. py:method:: add(chunks: list[jeevesagent.loader.base.Chunk], ids: list[str] | None = None) -> list[str]
      :async:


      Embed and store ``chunks``. Returns the assigned ids: the
      caller-provided ``ids`` if given, otherwise generated ones.
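
      One way a backend might resolve the returned ids is sketched
      below. ``resolve_ids`` is a hypothetical helper; the UUID scheme
      and the length check are assumptions of this example, not
      documented behaviour of any particular store.

      .. code-block:: python

         from __future__ import annotations

         import uuid


         def resolve_ids(n_chunks: int, ids: list[str] | None = None) -> list[str]:
             """Return caller-provided ids, or generate one id per chunk."""
             if ids is None:
                 # Assumption: generated ids are random UUID hex strings.
                 return [uuid.uuid4().hex for _ in range(n_chunks)]
             if len(ids) != n_chunks:
                 # Assumption: a mismatch is treated as a caller error.
                 raise ValueError("ids must contain exactly one entry per chunk")
             return list(ids)


         print(resolve_ids(2, ["doc-1", "doc-2"]))  # ['doc-1', 'doc-2']
         print(len(resolve_ids(3)))                 # 3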



   .. py:method:: count() -> int
      :async:


      Number of chunks currently in the store.



   .. py:method:: delete(ids: list[str]) -> None
      :async:


      Remove the named chunks. Unknown ids are silently
      skipped (idempotent).



   .. py:method:: get_by_ids(ids: list[str]) -> list[jeevesagent.loader.base.Chunk]
      :async:


      Fetch chunks by id, in the same order as ``ids``.
      Unknown ids are skipped (the result may be shorter than
      the input).
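
      The ordering and skip semantics can be illustrated with a plain
      dictionary. ``get_present`` is a hypothetical stand-in for a
      store's lookup, used here only to show the shape of the result.

      .. code-block:: python

         from __future__ import annotations


         def get_present(rows: dict[str, str], ids: list[str]) -> list[str]:
             """Return stored values in request order, skipping unknown ids."""
             return [rows[i] for i in ids if i in rows]


         rows = {"a": "alpha", "b": "beta", "c": "gamma"}
         found = get_present(rows, ["c", "missing", "a"])
         print(found)  # ['gamma', 'alpha']

      Because unknown ids are dropped rather than replaced by a
      placeholder, pairing the result with the input by position (for
      example with ``zip``) is only safe when every requested id exists.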



   .. py:method:: search(query: str, *, k: int = 4, filter: collections.abc.Mapping[str, Any] | None = None, diversity: float | None = None) -> list[SearchResult]
      :async:


      Embed ``query`` and return the top-``k`` chunks ranked
      by similarity. ``filter`` (optional) restricts candidates
      by metadata. ``diversity`` (optional, 0..1) enables MMR
      reranking for varied results.



   .. py:method:: search_by_vector(vector: list[float], *, k: int = 4, filter: collections.abc.Mapping[str, Any] | None = None, diversity: float | None = None) -> list[SearchResult]
      :async:


      Same as :meth:`search` but with a precomputed query
      vector.
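
      A brute-force version of this lookup might look like the sketch
      below. ``top_k`` and the row layout are hypothetical, and
      filtering before ranking is an assumption of this example; real
      backends typically use an ANN index and may apply filters at a
      different stage.

      .. code-block:: python

         from __future__ import annotations

         import math
         from collections.abc import Callable, Mapping
         from typing import Any


         def cosine(a: list[float], b: list[float]) -> float:
             """Cosine similarity; returns 0.0 if either vector has zero norm."""
             dot = sum(x * y for x, y in zip(a, b))
             norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
             return dot / norm if norm else 0.0


         def top_k(
             vector: list[float],
             rows: list[dict[str, Any]],
             k: int = 4,
             predicate: Callable[[Mapping[str, Any]], bool] | None = None,
         ) -> list[tuple[str, float]]:
             """Return up to ``k`` (id, score) pairs, best score first."""
             # Restrict candidates by metadata first, then rank by similarity.
             kept = [r for r in rows if predicate is None or predicate(r["metadata"])]
             scored = [(r["id"], cosine(vector, r["vector"])) for r in kept]
             scored.sort(key=lambda pair: pair[1], reverse=True)
             return scored[:k]


         rows = [
             {"id": "a", "vector": [1.0, 0.0], "metadata": {"page": 1}},
             {"id": "b", "vector": [0.8, 0.6], "metadata": {"page": 7}},
             {"id": "c", "vector": [0.0, 1.0], "metadata": {"page": 9}},
             {"id": "d", "vector": [0.6, 0.8], "metadata": {"page": 6}},
         ]
         hits = top_k([1.0, 0.0], rows, k=2, predicate=lambda m: m["page"] >= 5)
         print([hit_id for hit_id, _ in hits])  # ['b', 'd']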



.. py:function:: matches_filter(metadata: collections.abc.Mapping[str, Any], filter: collections.abc.Mapping[str, Any] | None) -> bool

   Return True if ``metadata`` satisfies ``filter``.

   Thin wrapper around :func:`evaluate_filter` with the argument
   order our existing tests expect.


