Context Window
A context window is the maximum amount of text, measured in tokens, that a large language model (LLM) can process in a single inference call. Everything the model uses to generate a response (the system prompt, conversation history, tool results, and any injected data) must fit within this window, and for most models the generated response counts against it as well.
Why Context Windows Matter for Agents
Agents operate in long-running loops where they accumulate large amounts of context: prior tool results, conversation history, file contents, and reasoning steps. As this accumulates, it can approach or exceed the model's context window limit.
As the limit approaches, an agent typically has to do one or more of the following:
- → Summarize earlier parts of the conversation
- → Evict old content that is no longer relevant
- → Compress tool results to their essential information
Poor context management leads to agents "forgetting" important details or failing mid-task.
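The eviction and compression strategies listed above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular agent framework does it: the `Message` type, `fit_to_budget`, and the 4-characters-per-token estimate are all assumptions made for the example, and a real agent would use the model's own tokenizer and would usually summarize evicted turns rather than drop them outright.

```python
from dataclasses import dataclass

CHARS_PER_TOKEN = 4  # rough heuristic (assumption), not a real tokenizer


@dataclass
class Message:
    role: str      # "system", "user", "assistant", or "tool"
    content: str


def estimate_tokens(text: str) -> int:
    """Approximate a token count from character length."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def compress_tool_result(msg: Message, max_tokens: int) -> Message:
    """Truncate an oversized tool result, keeping only its beginning."""
    if msg.role != "tool" or estimate_tokens(msg.content) <= max_tokens:
        return msg
    kept = msg.content[: max_tokens * CHARS_PER_TOKEN]
    return Message(msg.role, kept + " [truncated]")


def fit_to_budget(messages: list[Message], budget: int, tool_cap: int = 50) -> list[Message]:
    """Compress tool results, then evict the oldest non-system messages
    until the estimated total fits within the token budget."""
    kept = [compress_tool_result(m, tool_cap) for m in messages]
    total = sum(estimate_tokens(m.content) for m in kept)
    i = 0
    while total > budget and i < len(kept):
        if kept[i].role == "system":
            i += 1  # never evict the system prompt
            continue
        total -= estimate_tokens(kept[i].content)
        del kept[i]  # oldest remaining message goes first
    return kept


history = [
    Message("system", "You are a coding agent."),
    Message("user", "a" * 400),
    Message("tool", "b" * 4000),
    Message("assistant", "c" * 40),
]
trimmed = fit_to_budget(history, budget=100)
```

Here the oversized tool result is truncated first, and the oldest user message is then evicted so that the rest fits. Evicting oldest-first is the simplest policy; an agent that needs early instructions to survive would pin those messages the same way the system prompt is pinned here.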
Context Window Sizes
| Model | Context Window |
|---|---|
| Claude 3.5 Haiku | 200K tokens |
| Claude Sonnet 4.5 | 200K tokens |
| Claude Opus 4.5 | 200K tokens |
200K tokens is roughly 150,000 words of English text (at about 0.75 words per token), which is large enough for most coding tasks but still finite.
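The word estimate above follows from a common rule of thumb of about 0.75 English words per token. The short sketch below shows the arithmetic; the ratio is an approximation that varies with language, tokenizer, and content (source code usually tokenizes less efficiently than prose), and the function names are illustrative only.

```python
WORDS_PER_TOKEN = 0.75  # rule of thumb for English prose (assumption)


def tokens_to_words(tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Estimate how many English words fit in a given number of tokens."""
    return round(tokens * words_per_token)


def words_to_tokens(words: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Estimate how many tokens a given number of English words will use."""
    return round(words / words_per_token)


window_words = tokens_to_words(200_000)   # 150000
essay_tokens = words_to_tokens(3_000)     # 4000
```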
Context and AgentRQ
AgentRQ helps manage context by externalizing communication. Instead of embedding long human conversations inside the model's context window, Claude Code agents can send messages to AgentRQ and retrieve replies via MCP (Model Context Protocol). This keeps the agent's context clean and focused on the task.
Related Terms
- → Token
- → Agent
- → Claude Code
- → Tool Use