[How to reduce retrieval latency](https://js.langchain.com/docs/how_to/reduce_retrieval_latency/): LLM should read this page when trying to reduce retrieval latency, implementing adaptive retrieval, or using the MatryoshkaRetriever class. Discusses adaptive retrieval, in which sub-vectors drive a fast initial retrieval and the full embeddings then re-rank the results, and provides code examples for setting up the MatryoshkaRetriever.

