The context window is the upper bound on how much text an LLM can consider in one request. It includes the system prompt, the user prompt, any attached files, and the model's own response. Exceeding the window forces truncation or an error.
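Because input and output share the same budget, callers typically count tokens before sending a request and reserve headroom for the response. A minimal sketch in Python, assuming the tiktoken tokenizer; the window and reserve figures are illustrative placeholders, not any provider's published limits:

```python
import tiktoken

# Illustrative budget figures; real limits vary by model and tier.
CONTEXT_WINDOW = 128_000   # total tokens the model can consider
RESPONSE_RESERVE = 4_000   # headroom set aside for the model's reply

# cl100k_base is the encoding used by many recent OpenAI models.
enc = tiktoken.get_encoding("cl100k_base")

def fits_in_window(system_prompt: str, user_prompt: str, attachments: list[str]) -> bool:
    """Return True if system prompt + user prompt + attachments still
    leave RESPONSE_RESERVE tokens free for the model's response."""
    used = sum(len(enc.encode(part)) for part in [system_prompt, user_prompt, *attachments])
    return used + RESPONSE_RESERVE <= CONTEXT_WINDOW

print(fits_in_window("You are a concise assistant.",
                     "Summarize the attached report.",
                     ["...report text..."]))  # True while well under budget
```

If the check fails, the usual remedy is to truncate or summarize the attachments yourself rather than let the provider cut the prompt arbitrarily.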
In 2026, most frontier models offer 128K to 200K tokens by default, with 1M tokens on enterprise tiers. This is large enough to fit an entire book or codebase into a single prompt. Quality at the long end varies: independent benchmarks show that models often fail to use content placed in the middle of very long contexts, a phenomenon known as "lost in the middle."
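One mitigation that follows directly from this finding is to place the most important material at the edges of the prompt, where recall is strongest, and let weaker material fill the middle. A hedged sketch, assuming retrieved documents already carry relevance scores (the names and scores below are illustrative):

```python
def order_for_long_context(scored_docs: list[tuple[float, str]]) -> list[str]:
    """Arrange scored documents so the strongest sit at the start and end
    of the prompt and the weakest land in the middle."""
    ranked = [text for _, text in sorted(scored_docs, key=lambda d: d[0], reverse=True)]
    front, back = [], []
    # Alternate outward-in: best first, second-best last, and so on.
    for i, text in enumerate(ranked):
        (front if i % 2 == 0 else back).append(text)
    return front + back[::-1]

docs = [(0.9, "key finding"), (0.2, "background"), (0.7, "supporting data")]
print(order_for_long_context(docs))
# ['key finding', 'background', 'supporting data'] -- best at the start,
# second-best at the end, weakest in the middle
```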