Context Compaction for Long-Running Agents

Connected Patterns: Preserving Truth When Context Is Finite
“Long-running work fails when yesterday’s decisions disappear.”

Every agent that runs longer than a few minutes meets the same wall: context is finite, but work is not.

At first, everything feels smooth. The agent can see the request, the constraints, the previous tool outputs, and the plan. Then the run grows. A few documents are read. A few tool calls return results. The user makes a correction. The agent tries a different branch. After enough turns, the early constraints slide out of view.

That is when the agent starts to drift.

It repeats work it already did.
It re-litigates decisions it already settled.
It forgets what was disallowed and proposes risky actions again.
It invents new assumptions because it cannot see the old ones.

Context compaction is the discipline of turning a growing conversation into a stable, inspectable state snapshot that preserves the decisions that matter.

It is not “summarize the chat.” It is “preserve the working truth.”

Why Compaction Is Harder Than Summarization

A normal summary tries to be short and readable. A production compaction tries to be short and correct.

Correctness is harder because long-running agent work has multiple kinds of information mixed together:

• Requirements that must not be lost
• Decisions that must not be reversed accidentally
• Evidence that must be tied to its source
• Open questions that must remain open
• Tentative ideas that must not masquerade as facts
• Tool outputs that must be preserved without distortion

If a compaction blurs these categories, the agent becomes confident in the wrong things. The system feels “smart” right up until it makes a costly mistake.

A good compaction has to do what a good lab notebook does: separate observation from interpretation, record what happened, and make it possible to pick up the work later without re-inventing the story.

The Pattern Inside the Story of Reliable Work

Every mature production process learns to separate “the narrative” from “the state.”

The narrative is how you tell the story to a human. The state is what you need to keep the work correct.

Agents need the same separation.

A practical compaction produces two artifacts:

• A state snapshot that the harness and agent use to continue work
• A run narrative that a human can read to understand what happened

The state snapshot is where you store constraints, decisions, and verified facts. The narrative is where you store context, explanation, and helpful detail.

If you only store narrative, the agent will misread it later. If you only store state, humans will not trust it. You need both, but you must not confuse them.

What Must Survive Compaction

Think of compaction as a filter. The goal is not to keep everything. The goal is to keep the right things, in the right form.

Here is a practical way to structure the compacted state:

| State bucket | What belongs here | Common failure if missing |
| --- | --- | --- |
| Goal and success criteria | The exact outcome the run must deliver | The agent "finishes" with the wrong deliverable |
| Constraints and policies | Allowed tools, disallowed actions, required approvals | Safety rules get forgotten and violated |
| Decisions and rationales | What was decided and why | The agent reopens settled debates endlessly |
| Verified facts | Statements supported by evidence and tool outputs | Opinions become "facts" and drift multiplies |
| Evidence index | Links to sources, tool outputs, file hashes, citations | You cannot audit or reproduce the work |
| Open questions | Unresolved issues and what is needed to resolve them | The agent pretends uncertainty is resolved |
| Pending actions | Next steps with dependencies and stop rules | The agent improvises and gets lost |
| Budget and risk signals | Spend counters, confidence flags, contradictions | Runaway loops and false certainty |
Notice what is not listed: every conversational flourish, every brainstorm, every half-formed idea. Those can live in narrative logs. The state should be sharp.
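The buckets above can be sketched as a single data structure. This is a minimal illustration, not a standard: the field names and entry shapes are assumptions you would adapt to your own harness.

```python
from dataclasses import dataclass, field

# Minimal sketch of a compacted state snapshot. One field per bucket;
# field names and entry shapes are illustrative, not a standard.
@dataclass
class CompactedState:
    goal: str                                   # exact outcome the run must deliver
    constraints: list = field(default_factory=list)      # binding policies
    decisions: list = field(default_factory=list)        # {"what": ..., "why": ...}
    verified_facts: list = field(default_factory=list)   # {"claim": ..., "evidence_id": ...}
    evidence_index: dict = field(default_factory=dict)   # id -> source metadata
    open_questions: list = field(default_factory=list)
    pending_actions: list = field(default_factory=list)  # {"step": ..., "depends_on": ...}
    budget: dict = field(default_factory=dict)           # spend counters, risk flags

state = CompactedState(
    goal="Produce a migration plan for the billing service",
    constraints=["Do not modify production", "All external claims require citations"],
)
# A decision is never stored without its rationale.
state.decisions.append({"what": "Use blue-green deploy", "why": "Rollback must be instant"})
```

The point of the shape is that each bucket has exactly one job; when everything matters, the buckets stay small.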

The Compaction Method That Works in Practice

A reliable compaction approach is less like writing and more like bookkeeping.

Compact at commit boundaries

Compaction should happen at predictable moments, not randomly. The best moment is after a commit: after the agent produces an artifact, executes a safe action, or reaches a verified milestone.

This gives you a natural boundary:

• Before the commit: tentative work and drafts
• After the commit: verified outcome and updated state

When compaction is tied to commits, you can replay the run like a chain of checkpoints.
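A commit-triggered loop can be sketched as follows. The event shapes and the `compact` logic here are assumptions for illustration; a real harness would fold far more into state than a one-line summary.

```python
# Sketch: trigger compaction on commits, not on token counts or timers.

def is_commit(event: dict) -> bool:
    """A commit is a verified milestone: artifact produced, safe action executed."""
    return event.get("type") in {"artifact", "verified_action", "milestone"}

def compact(state: dict, transcript: list) -> dict:
    """Fold committed outcomes into state; tentative drafts are dropped."""
    committed = [e for e in transcript if is_commit(e)]
    new_state = dict(state)
    new_state["checkpoints"] = state.get("checkpoints", []) + [e["summary"] for e in committed]
    return new_state

def run_turn(transcript: list, state: dict, event: dict) -> dict:
    transcript.append(event)
    if is_commit(event):
        state = compact(state, transcript)
        transcript.clear()   # truncate back to the checkpoint boundary
    return state

state = {"goal": "demo"}
transcript = []
state = run_turn(transcript, state, {"type": "draft", "summary": "tentative idea"})
state = run_turn(transcript, state, {"type": "artifact", "summary": "report v1 written"})
```

After the second turn the transcript is empty and the state carries one checkpoint: the draft lived and died before the commit, exactly as the before/after boundary prescribes.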

A strict fact-policy boundary

Your compaction must not mix policy with interpretation.

Policy includes: “Do not call tool X,” “Do not modify production,” “All external claims require citations,” “Budget cap is Y.”

Facts include: tool outputs, observed results, confirmed constraints.

Interpretations include: the agent’s explanations, guesses, and plans.

Keep these separate. If you do not, the agent will treat interpretations as policies or treat policies as optional suggestions.
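One way to enforce the separation mechanically is to give each category its own store and refuse entries that blur the line. The stores and the evidence rule below are illustrative assumptions, not a fixed API.

```python
# Sketch: policy, fact, and interpretation live in separate stores so the
# agent cannot promote one into another. Entry shapes are assumptions.
state = {
    "policy": [],           # binding rules; harness-enforced
    "facts": [],            # observations backed by evidence
    "interpretations": [],  # the agent's own explanations and plans
}

_BUCKET = {"policy": "policy", "fact": "facts", "interpretation": "interpretations"}

def record(state: dict, kind: str, entry: dict) -> None:
    # A "fact" with no evidence pointer is really an interpretation.
    if kind == "fact" and "evidence_id" not in entry:
        raise ValueError("a fact without evidence is an interpretation")
    state[_BUCKET[kind]].append(entry)

record(state, "policy", {"rule": "Do not modify production"})
record(state, "fact", {"claim": "Tests pass on main", "evidence_id": "tool-142"})
record(state, "interpretation", {"note": "Failure is likely a flaky network"})
```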

Preserve contradictions explicitly

Long-running work often encounters conflicting signals: two sources disagree, two tool calls return different numbers, a dataset changes between runs.

A compaction that resolves contradictions by picking a winner is dangerous. The right move is to record the contradiction and record the verification plan.

Example contradiction entry:

• Conflict: Source A says X, Source B says Y
• Impact: affects decision Z
• Next verification: run check Q, request human review, or fetch authoritative data

This allows the agent to continue without pretending certainty.
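The entry above maps directly onto a small record that the harness can query before letting a dependent decision proceed. Field names and the substring check are illustrative assumptions.

```python
# Sketch of a contradiction entry that preserves the conflict instead of
# silently picking a winner. Field names are illustrative.
contradiction = {
    "conflict": {"source_a": "report-2024 says churn is 4%",
                 "source_b": "dashboard says churn is 6%"},
    "impact": "affects the retention-forecast decision",
    "status": "unresolved",
    "verification_plan": ["re-run the churn query against the warehouse",
                          "request human review if the numbers still differ"],
}

def blocks_decision(contradictions: list, decision: str) -> bool:
    """True if any unresolved contradiction touches this decision."""
    return any(c["status"] == "unresolved" and decision in c["impact"]
               for c in contradictions)
```

Decisions touched by an unresolved conflict stay blocked; everything else proceeds, so uncertainty is contained rather than contagious.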

Use structured formats, not paragraphs

Free-form prose is the enemy of long-running reliability. It is too easy to misread later.

Use a structured representation that the harness can validate. JSON with a schema works. A stable markdown template can work if it is strictly formatted. The key is predictability.

The compaction should be machine-friendly first, human-friendly second.
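Validation can be as simple as requiring that the compaction parses and that every bucket is present. A real harness might use a JSON Schema library; this stdlib sketch (with assumed bucket names) shows the idea.

```python
import json

# Sketch: reject any compaction that does not parse or is missing a bucket.
# The required keys are illustrative; match them to your own state schema.
REQUIRED_KEYS = {"goal", "constraints", "decisions", "verified_facts",
                 "open_questions", "pending_actions"}

def validate_compaction(raw: str) -> dict:
    state = json.loads(raw)          # must parse: no free-form prose allowed
    missing = REQUIRED_KEYS - state.keys()
    if missing:
        raise ValueError(f"compaction missing buckets: {sorted(missing)}")
    if not isinstance(state["constraints"], list):
        raise ValueError("constraints must be a list, not a paragraph")
    return state

snapshot = validate_compaction(json.dumps({
    "goal": "ship the report", "constraints": [], "decisions": [],
    "verified_facts": [], "open_questions": [], "pending_actions": [],
}))
```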

Keep raw evidence out of the compaction

It is tempting to paste tool outputs into the compacted state. That grows fast and creates new context pressure.

Instead, store an evidence index:

• Tool call ID
• Timestamp
• Input parameters
• Output hash or file path
• Short, verified extraction (only what you need)

This keeps the state small while preserving auditability.
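An evidence-index entry along those lines might look like this. The field names are illustrative; the essential move is hashing the raw output instead of storing it.

```python
import hashlib
import time

# Sketch: index evidence by hash instead of pasting raw tool output into
# the compacted state. Field names are illustrative.
def index_evidence(tool_call_id: str, params: dict, raw_output: str,
                   extraction: str) -> dict:
    return {
        "tool_call_id": tool_call_id,
        "timestamp": time.time(),
        "params": params,
        # The hash lets an auditor confirm the stored raw log is unmodified.
        "output_sha256": hashlib.sha256(raw_output.encode()).hexdigest(),
        "extraction": extraction,   # only the short, verified slice you need
    }

entry = index_evidence("tool-142",
                       {"query": "SELECT count(*) FROM users"},
                       raw_output="rows: 1 | count: 48210",
                       extraction="user count = 48210")
```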

The Compaction in the Life of the Agent

Context compaction changes how an agent behaves over hours and days.

Without compaction, the agent’s “memory” becomes a fog. It must guess what matters. It becomes susceptible to whatever was said most recently.

With compaction, the agent gets a stable foundation. It can act like an operator following a clear runbook:

• The goals remain visible.
• Constraints remain enforceable.
• Decisions remain anchored.
• Evidence remains traceable.
• Uncertainty remains honest.

This is also where you can make drift expensive. If the agent proposes an action that violates the compacted constraints, the harness can block it automatically. If it claims a “fact” not listed as verified, the harness can require evidence before allowing a commit.

In other words, compaction is not just storage. It is enforcement.
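Both gates can be sketched as a pair of checks the harness runs before any commit. The state shape and error choices here are assumptions, not a prescribed interface.

```python
# Sketch: the harness checks proposed actions and claimed facts against the
# compacted state before allowing a commit. Shapes are assumptions.

def gate_action(state: dict, tool_name: str) -> None:
    """Refuse any tool call the compacted policy disallows."""
    if tool_name in state["disallowed_tools"]:
        raise PermissionError(f"{tool_name} is disallowed by compacted policy")

def gate_claim(state: dict, claim: str) -> None:
    """Refuse to commit a 'fact' that the state has not verified."""
    verified = {f["claim"] for f in state["verified_facts"]}
    if claim not in verified:
        raise ValueError(f"unverified claim needs evidence first: {claim!r}")

state = {
    "disallowed_tools": {"prod_db_write"},
    "verified_facts": [{"claim": "tests pass on main", "evidence_id": "tool-7"}],
}
gate_claim(state, "tests pass on main")   # passes silently: it is verified
```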

Common Compaction Mistakes That Create Drift

Even careful teams tend to stumble in a few predictable ways.

• Compaction that sounds confident when it is not: phrases like “the data shows” without preserving what data, what query, and what version.
• Compaction that hides the reason: a decision is recorded, but the rationale is lost, so the agent reopens the debate later.
• Compaction that collapses options into one path: alternatives vanish, so the agent cannot recover when the chosen path fails.
• Compaction that treats tool output as gospel: raw outputs are copied into state without validation, and downstream steps inherit the error.
• Compaction that grows without pruning: state becomes a second transcript, and the same context pressure returns.

A good harness treats compaction as a budgeted operation. It has a target size, a validation step, and a rule that old, superseded items are marked as superseded rather than quietly overwritten. That is how you preserve history without carrying dead weight.
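Superseding and budgeting can be sketched together. The `superseded_by` marker and the byte threshold below are illustrative assumptions.

```python
import json

# Sketch: supersede old entries instead of overwriting them, and measure the
# budget against only the active (non-superseded) items.

def supersede(state: dict, bucket: str, old_id: str, new_entry: dict) -> None:
    for item in state[bucket]:
        if item["id"] == old_id:
            item["superseded_by"] = new_entry["id"]   # history kept, not deleted
    state[bucket].append(new_entry)

def within_budget(state: dict, max_bytes: int = 32_000) -> bool:
    active = {bucket: [i for i in items if "superseded_by" not in i]
              for bucket, items in state.items()}
    return len(json.dumps(active)) <= max_bytes

state = {"decisions": [{"id": "d1", "what": "use sqlite"}]}
supersede(state, "decisions", "d1", {"id": "d2", "what": "use postgres"})
```

Only active items count against the budget, so the state stays lean while the superseded chain remains available for audit.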

A Simple Compaction Checklist

If you want one practical standard, use this:

• Anything that changes the future must be written into state.
• Anything that is uncertain must be labeled uncertain.
• Anything that is risky must require a gate.
• Anything that must be audited must have an evidence pointer.
• Anything that is obsolete must be marked obsolete, not deleted quietly.

The goal is a state that can be handed to a different model, a different machine, or a different engineer, and still remain true.

Preserving Truth Over Time

Long-running agents do not fail because they forget a sentence. They fail because they forget what was binding.

Context compaction is how you keep binding things binding: constraints, decisions, and verified facts.

When you treat compaction as part of the harness, long tasks stop feeling like fragile conversations and start feeling like steady operations. The agent can still be creative and flexible, but it is anchored. It does not have to reinvent itself every thousand tokens.

That is what makes “long-running” possible.

Keep Exploring Reliable Long-Running Work

• Agent Memory: What to Store and What to Recompute
https://ai-rng.com/agent-memory-what-to-store-and-what-to-recompute/

• Preventing Task Drift in Agents
https://ai-rng.com/preventing-task-drift-in-agents/

• Agent Checkpoints and Resumability
https://ai-rng.com/agent-checkpoints-and-resumability/

• Multi-Step Planning Without Infinite Loops
https://ai-rng.com/multi-step-planning-without-infinite-loops/

• Agent Logging That Makes Failures Reproducible
https://ai-rng.com/agent-logging-that-makes-failures-reproducible/

• The Lab Notebook of the Future
https://ai-rng.com/the-lab-notebook-of-the-future/

Books by Drew Higgins