[How to evaluate an application's intermediate steps | 🦜️🛠️ LangSmith](https://docs.smith.langchain.com/evaluation/how_to_guides/evaluate_on_intermediate_steps): LLM should read this page when evaluating intermediate steps of AI applications, working with RAG pipelines, or creating custom evaluators for complex systems. This guide explains how to evaluate intermediate steps in LLM applications. It uses a Wikipedia-based RAG pipeline as the running example, shows how to create custom evaluators for retrieval relevance and hallucination detection, and runs the evaluation through LangSmith, with code samples in both Python and TypeScript.

