<!-- description: A context window is the maximum amount of text an LLM can process in a single inference call, defining the limit of what an agent can "see" and reason about at once. -->

# Context Window

A **context window** is the maximum amount of text — measured in [tokens](token) — that a large language model (LLM) can process in a single inference call. Everything the model uses to generate a response — the system prompt, conversation history, tool results, and any injected data — must fit within this window.

## Why Context Windows Matter for Agents

[Agents](agent) operate in long-running loops where they accumulate large amounts of context: prior tool results, conversation history, file contents, and reasoning steps. As this accumulates, it can approach or exceed the model's context window limit.

When the limit is hit, agents must:

- **Summarize** earlier parts of the conversation
- **Evict** old content that is no longer relevant
- **Compress** tool results to their essential information

Poor context management leads to agents "forgetting" important details or failing mid-task.

## Context Window Sizes

| Model | Context Window |
|-------|----------------|
| Claude 3.5 Haiku | 200K tokens |
| Claude Sonnet 4.5 | 200K tokens |
| Claude Opus 4.5 | 200K tokens |

200K tokens is roughly 150,000 words — large enough for most coding tasks, but still finite.

## Context and AgentRQ

AgentRQ helps manage context by externalizing communication. Instead of embedding long human conversations inside the model's context window, Claude Code agents can send messages to AgentRQ and retrieve replies via [MCP](mcp). This keeps the agent's context clean and focused on the task.

## Related Terms

- [Token](token)
- [Agent](agent)
- [Claude Code](claude-code)
- [Tool Use](tool-use)