Optimize token usage by automatically removing obsolete tool outputs and conversation content from context while preserving important information through distillation and compression.
This skill implements multiple strategies to keep conversation context lean and efficient:
1. **Distillation** - Extracts and preserves key findings before removing raw tool outputs
2. **Compression** - Collapses large conversation sections into concise summaries
3. **Deduplication** - Removes repeated tool calls, keeping only the most recent
4. **Write Supersession** - Removes write tool outputs when files are subsequently read
5. **Error Purging** - Cleans up failed tool inputs after a configurable delay
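The five strategies above can be given explicit labels so pruning decisions are auditable. This is a minimal sketch; the enum name and comments are illustrative, not part of any real API:

```python
from enum import Enum, auto

class PruneStrategy(Enum):
    """Hypothetical labels for the five pruning strategies described above."""
    DISTILLATION = auto()        # summarize key findings, then drop raw output
    COMPRESSION = auto()         # collapse a message range into one summary
    DEDUPLICATION = auto()       # keep only the latest identical tool call
    WRITE_SUPERSESSION = auto()  # drop write output once the file is re-read
    ERROR_PURGING = auto()       # drop failed tool inputs after a delay
```

Tagging each pruned item with its strategy makes the post-pruning summary (see below) straightforward to assemble.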
Before pruning, analyze the current conversation to identify:
1. **Duplicate tool calls** - Same tool with same arguments called multiple times
2. **Superseded writes** - Write/Edit operations followed by Read of the same file
3. **Stale errors** - Tool calls that failed more than 4 turns ago
4. **Completed tasks** - Tool outputs no longer relevant to current work
5. **Redundant content** - Information that has been incorporated into subsequent work
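The first check, duplicate detection, can be sketched as a single pass over the history. The `history` shape (a list of dicts with `tool` and `args` keys) is an assumption for illustration; a real conversation transcript will differ:

```python
def find_duplicate_calls(history):
    """Return indices of earlier calls superseded by an identical later call.

    Each entry is assumed to be {"tool": str, "args": dict}; only the most
    recent instance of each (tool, args) pair is kept.
    """
    def key(call):
        return (call["tool"], tuple(sorted(call["args"].items())))

    latest = {}  # (tool, frozen args) -> last index at which it appears
    for i, call in enumerate(history):
        latest[key(call)] = i
    return [i for i, call in enumerate(history) if latest[key(call)] != i]

history = [
    {"tool": "Read", "args": {"path": "a.py"}},   # superseded by index 2
    {"tool": "Read", "args": {"path": "b.py"}},
    {"tool": "Read", "args": {"path": "a.py"}},
]
```

Superseded-write detection follows the same pattern, matching a Write/Edit on a path against any later Read of that path.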
When you encounter tool outputs containing valuable information that may be needed later:
1. Identify key findings, patterns, or insights from the tool output
2. Create a concise summary (2-4 sentences) capturing:
- Critical facts or data points
- Important patterns or relationships discovered
- Relevant context for future decisions
3. Mark the original tool output as prunable
4. Preserve the distilled summary in conversation context
**Example distillation:**
```
Original: [Large file read with 500+ lines of code]
Distilled: "Authentication uses JWT tokens with 24h expiry. Token refresh logic is in src/auth/refresh.ts:45. Current implementation lacks rate limiting on refresh endpoint."
```
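The distillation steps above reduce to: summarize, record, and mark the original prunable. A rough sketch, where `summarize` stands in for whatever summarization step (model call or heuristic) the session actually uses:

```python
def distill(tool_output, summarize):
    """Replace a large tool output with a short preserved summary.

    `summarize` is a hypothetical callable supplied by the caller; the
    returned dict shape is illustrative, not a real message format.
    """
    return {
        "role": "distilled",
        "summary": summarize(tool_output),
        "original_tokens": len(tool_output.split()),  # crude size estimate
    }

note = distill(
    "def refresh():\n    ...\n" * 50,
    lambda text: "Token refresh logic lives in this file; no rate limiting.",
)
```

The distilled note stays in context; the raw output it replaces becomes eligible for pruning.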
For large conversation sections that can be consolidated:
1. Identify a range of messages/tools forming a complete subtask
2. Verify the subtask is complete and no longer actively being worked on
3. Create a summary including:
- What was accomplished
- Key decisions made
- Important outcomes or artifacts created
- Any relevant context for future reference
4. Replace the entire range with the summary
**Example compression:**
```
Original: [15 messages debugging database connection issue]
Compressed: "Fixed database connection timeout by increasing pool size to 20 and adding retry logic with exponential backoff in src/db/config.ts. Issue was caused by connection exhaustion under load."
```
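Mechanically, compression is a splice: a completed range of messages is replaced by one summary message. A minimal sketch, assuming messages are simple role/content dicts:

```python
def compress_range(messages, start, end, summary):
    """Replace messages[start:end] (a completed subtask) with one summary.

    Assumes the caller has already verified the subtask is finished; the
    "summary" role is an illustrative convention, not a real message type.
    """
    return messages[:start] + [{"role": "summary", "content": summary}] + messages[end:]

msgs = [{"role": "user", "content": f"debug step {i}"} for i in range(15)]
msgs = compress_range(
    msgs, 0, 15,
    "Fixed DB timeout: pool size 20 plus retry with exponential backoff.",
)
```

Fifteen debugging messages collapse to a single summary, mirroring the example above.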
Apply this decision tree for each tool output:
1. **Is it protected?** (Task, TodoWrite, batch, plan tools) → Never prune
2. **Is it recent?** (Last 4 turns) → Skip for now
3. **Is it duplicated?** (Same tool + args called later) → Prune older instances
4. **Is it a superseded write?** (File written then read) → Prune write output
5. **Is it an old error?** (Failed tool >4 turns ago) → Prune input, keep error message
6. **Does it contain valuable context?** → Distill first, then prune
7. **Is it completed work?** → Compress or prune
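The decision tree above can be written as a straight chain of checks. The `entry` dict and its flag names are hypothetical, chosen only to make each branch explicit:

```python
PROTECTED = {"Task", "TodoWrite", "batch", "plan"}
RECENT_TURNS = 4

def decide(entry, current_turn):
    """Return an action for one tool output, following the decision tree.

    `entry` is an illustrative dict; field names are assumptions, not a
    real conversation schema.
    """
    if entry["tool"] in PROTECTED:
        return "keep"                        # 1. protected -> never prune
    if current_turn - entry["turn"] <= RECENT_TURNS:
        return "keep"                        # 2. recent -> skip for now
    if entry.get("duplicated_later"):
        return "prune"                       # 3. older duplicate instance
    if entry.get("superseded_write"):
        return "prune"                       # 4. file written, then read
    if entry.get("failed"):
        return "prune_input_keep_error"      # 5. stale error
    if entry.get("valuable"):
        return "distill_then_prune"          # 6. extract key context first
    if entry.get("completed"):
        return "compress_or_prune"           # 7. finished work
    return "keep"
```

Ordering matters: the protection and recency checks run first so later rules can never touch protected or recent content.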
Proactively prune context when:
- Token usage is growing large or approaching context limits
- A subtask has just completed and its intermediate outputs are no longer needed
- Duplicate or superseded tool outputs have accumulated
Never prune:
- Protected tools (Task, TodoWrite, batch, plan)
- Anything from the last 4 turns
- Work that is still in progress or may be revisited
After pruning operations, provide a brief summary of what was removed and the approximate tokens saved:
**Example:** "Pruned 8 tool outputs (3 duplicates, 2 superseded writes, 3 completed tasks) → ~12,000 tokens saved"
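Producing that summary line is simple string assembly over per-category counts. A sketch, with hypothetical category labels matching the example:

```python
def prune_report(counts, tokens_saved):
    """Format a one-line pruning summary from per-category counts.

    `counts` maps an illustrative category label to how many items were
    pruned under it; `tokens_saved` is an estimate, hence the tilde.
    """
    parts = ", ".join(f"{n} {label}" for label, n in counts.items() if n)
    total = sum(counts.values())
    return f"Pruned {total} tool outputs ({parts}) -> ~{tokens_saved:,} tokens saved"

report = prune_report(
    {"duplicates": 3, "superseded writes": 2, "completed tasks": 3},
    12_000,
)
```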
1. **Distill before pruning** - When in doubt, extract key information first
2. **Batch pruning operations** - More efficient than pruning individual items
3. **Preserve recent context** - Don't prune anything from the last 4 turns
4. **Verify completion** - Only prune/compress work that's truly finished
5. **Monitor context size** - Check token usage periodically and prune proactively
6. **Respect protection rules** - Never prune protected tools or recent content