[Evaluation concepts | 🦜️🛠️ LangSmith](https://docs.smith.langchain.com/evaluation/concepts): LLM should read this page when seeking to understand LangSmith evaluation concepts, implementing evaluation strategies for LLM applications, or choosing appropriate metrics for different AI application types. This page covers LangSmith's evaluation framework, including datasets, evaluators (human, heuristic, LLM-as-judge, pairwise), experiments, annotation queues, offline and online evaluation approaches, testing methodologies, and application-specific evaluation techniques for agents, RAG, summarization, and classification.

