Enterprise AI term

Context window

Also known as: context length, token window

A context window is the maximum number of tokens a large language model can attend to at once, comprising the prompt, any retrieved content, prior conversation turns and the model's own generated output.

In practice

Context window size sets a hard limit on how much instruction, source material and conversational history a model can consider when producing a response. It is measured in tokens, not words, and is consumed by both input and output. Common pitfalls include assuming that a larger window guarantees reliable reasoning over content near the window's limit, and conflating the context window with long-term memory, which requires retrieval or fine-tuning rather than a single prompt.
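Because input and output draw on the same window, a pre-flight check can be sketched as below. This is a minimal illustration, not a definitive implementation: `estimate_tokens`, `fits_in_window` and the four-characters-per-token ratio are assumptions made for the example, and the window and output sizes are hypothetical. Real counts should come from the tokenizer of the model in use.

```python
import math

# Assumption: a rough rule of thumb for English text, not a real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Crude token estimate; replace with the model's own tokenizer for real counts."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_window(input_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Input tokens and the reserved output tokens share one window."""
    return input_tokens + max_output_tokens <= context_window


prompt = "Summarise the attached credit policy in three bullet points."
input_tokens = estimate_tokens(prompt)

# Hypothetical limits: an 8,192-token window with 512 tokens reserved for the reply.
print(fits_in_window(input_tokens, max_output_tokens=512, context_window=8192))
```

Reserving the output allowance up front, rather than checking the input alone, avoids responses that are cut off when the window fills mid-generation.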

Worked example

A bank building a retrieval-augmented assistant for credit analysts caps each query at a fixed token budget for the retrieved policy excerpts, the chat history and the response, ensuring the prompt fits inside the model's context window.
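The budgeting in this example could look roughly like the following sketch. All names here (`count_tokens`, `take_within`, `build_prompt`) and the budget values are hypothetical, and the whitespace-based counter is a stand-in for a real tokenizer. It assumes excerpts arrive ranked best-first and that the most recent chat turns matter most.

```python
def count_tokens(text: str) -> int:
    """Placeholder counter (whitespace-separated words); not a real tokenizer."""
    return len(text.split())


def take_within(items: list[str], budget: int) -> list[str]:
    """Keep items in order, stopping before the first one that would exceed the budget."""
    kept, used = [], 0
    for item in items:
        cost = count_tokens(item)
        if used + cost > budget:
            break
        kept.append(item)
        used += cost
    return kept


def build_prompt(excerpts: list[str], history: list[str], question: str, *,
                 context_window: int, excerpt_budget: int,
                 history_budget: int, response_budget: int) -> str:
    """Assemble a prompt whose parts each stay inside a fixed token budget."""
    if excerpt_budget + history_budget + response_budget > context_window:
        raise ValueError("budgets exceed the context window")

    # Excerpts are assumed to be ranked best-first, so keep from the top.
    kept_excerpts = take_within(excerpts, excerpt_budget)

    # Walk history newest-first so recent turns survive, then restore order.
    kept_history = take_within(list(reversed(history)), history_budget)
    kept_history.reverse()

    prompt = "\n\n".join(kept_excerpts + kept_history + [question])

    # Final check: the question is not budgeted separately, so verify the total.
    if count_tokens(prompt) + response_budget > context_window:
        raise ValueError("prompt plus reserved response exceeds the context window")
    return prompt
```

Dropping whole excerpts and whole turns, rather than truncating mid-text, keeps each retained piece intact at the cost of leaving some budget unused.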

This definition is maintained by Moweb partners and used in live client engagements. For how Context window applies to your estate, or to challenge a working definition, speak to a partner.