[How to: handle long text](https://python.langchain.com/docs/how_to/extraction_long_text/): LLM should read this page when extracting structured information from large documents or PDFs that exceed the model's context window. This page covers two strategies for information extraction over long text: a brute-force approach that chunks the text and extracts from every chunk, and a retrieval-augmented generation (RAG) approach that indexes the chunks and extracts only from the relevant ones. It also discusses common issues with both approaches.
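
The brute-force strategy named above can be sketched in plain Python. This is a minimal illustration, not the linked page's implementation: `split_into_chunks`, `extract_from_long_text`, the character-based windowing, and the `extract` callable (a stand-in for an LLM extraction call) are all hypothetical names and choices made for this example.

```python
from typing import Callable, List


def split_into_chunks(text: str, chunk_size: int, overlap: int) -> List[str]:
    """Split text into fixed-size character windows that overlap by `overlap`."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def extract_from_long_text(
    text: str,
    extract: Callable[[str], List[str]],  # hypothetical stand-in for an LLM call
    chunk_size: int = 2000,
    overlap: int = 200,
) -> List[str]:
    """Run `extract` on every chunk and merge results, dropping exact duplicates."""
    seen = set()
    merged: List[str] = []
    for chunk in split_into_chunks(text, chunk_size, overlap):
        for item in extract(chunk):
            if item not in seen:
                seen.add(item)
                merged.append(item)
    return merged
```

The overlap is there because a fact that straddles a chunk boundary may otherwise be missed, and the deduplication step is there because overlapping chunks can yield the same item twice. A RAG variant would, under the same assumptions, rank the chunks against a query and pass only the top-scoring ones to `extract`.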

