Zerg Rush Agent Swarm

Description

This skill implements a **Zerg Rush-style agent swarm** optimized for speed, parallelism, and disposability. Agents are short-lived, low-context, and timeboxed. No long-running agents, no deep deliberation, no architecture drift.

**Philosophy: Spawn → Bite → Die → Repeat.**

When to Use This Skill

Use this skill when you need to:

- Break large codebases or refactors into parallelizable microtasks

- Coordinate multiple Claude Code agents working simultaneously

- Enforce strict time and scope constraints on AI work

- Maintain lane-based separation of concerns (kernel, ML, quant, DEX, integration)

- Prevent file conflicts through reservation protocols

Architecture

Roles

**OVERLORD (Opus Model)**

- Decomposes large goals into microtasks fitting the 4-min/100-line rule

- Assigns tasks in waves of 4-5 agents

- Maintains global state in `/SWARM/STATE.json`

- Merges results and creates follow-up tasks

- Rejects oversized tasks and splits them further

**ZERGLING (Sonnet Model)**

- Completes exactly one task per session

- Obeys hard limits (4 min, 100 lines, 1-2 files)

- Writes results to INBOX and stops immediately

- Never makes architectural decisions or adds dependencies
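The hard limits above can be written down as a small check that an OVERLORD-side script might run before accepting a task. This is a minimal sketch, not part of the skill: the `TaskSpec` record, its field names, and the heuristic for recognizing a test/docs file are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

MAX_MINUTES = 4      # timebox from the hard limits
MAX_NEW_LINES = 100  # new-line budget from the hard limits
MAX_FILES = 2        # 1 file, or 2 if the second is test/docs


@dataclass
class TaskSpec:
    """Hypothetical task record; field names are illustrative only."""
    task_id: str
    est_minutes: float
    est_new_lines: int
    files: list = field(default_factory=list)


def is_test_or_doc(path: str) -> bool:
    # Assumed heuristic: "test" in the path, or a docs extension.
    name = path.lower()
    return "test" in name or name.endswith((".md", ".rst"))


def limit_violations(task: TaskSpec) -> list:
    """Return the reasons a task must be split (empty list if it fits)."""
    problems = []
    if task.est_minutes > MAX_MINUTES:
        problems.append(f"{task.task_id}: {task.est_minutes} min exceeds {MAX_MINUTES}")
    if task.est_new_lines > MAX_NEW_LINES:
        problems.append(f"{task.task_id}: {task.est_new_lines} lines exceeds {MAX_NEW_LINES}")
    if len(task.files) > MAX_FILES:
        problems.append(f"{task.task_id}: touches {len(task.files)} files")
    elif len(task.files) == 2 and not any(is_test_or_doc(p) for p in task.files):
        problems.append(f"{task.task_id}: second file must be test/docs")
    return problems
```

A task with a non-empty result would be rejected and split further rather than assigned to a zergling.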

Directory Structure

```
project-root/
└── SWARM/
    ├── STATE.json          # Global swarm state
    ├── SWARM_RULES.md      # Lane rules and constraints
    ├── RUNBOOK.md          # Operational playbook
    ├── TASKS/
    │   ├── KERNEL/         # CUDA, Triton, CUTLASS (K001, K002, ...)
    │   ├── ML/             # Models, training, data (M001, M002, ...)
    │   ├── QUANT/          # Strategy, backtests (Q001, Q002, ...)
    │   ├── DEX/            # Solana, Jupiter (D001, D002, ...)
    │   └── INTEGRATION/    # Glue, CLI, CI (INT-001, INT-002, ...)
    ├── OUTBOX/             # Task assignments
    ├── INBOX/              # Task results
    ├── TEMPLATES/          # Task card templates
    ├── SCRIPTS/
    │   └── swarm.py        # CLI coordination tool
    └── LOCKS/              # File reservation tracking
```

Instructions

Phase 1: Initialize Swarm Infrastructure

1. **Create directory structure:**

```bash

mkdir -p SWARM/{TASKS/{KERNEL,ML,QUANT,DEX,INTEGRATION},OUTBOX,INBOX,TEMPLATES,SCRIPTS,LOCKS}

```

2. **Initialize STATE.json:**

```json
{
  "wave": 0,
  "active_zerglings": [],
  "completed_tasks": [],
  "pending_tasks": [],
  "last_updated": "2025-01-01T00:00:00Z"
}
```

3. **Set up Agent Mail MCP (file locking):**

- Project key: `/path/to/your/project`

- Every agent session MUST use: `mcp__mcp-agent-mail__register_agent`

- Reserve files before editing: `mcp__mcp-agent-mail__file_reservation_paths`

- Release on completion: `mcp__mcp-agent-mail__release_file_reservations`

4. **Set up RAG Brain MCP (shared memory):**

- On spawn: `mcp__rag-brain__recall` - Retrieve prior context

- On complete: `mcp__rag-brain__remember` - Store decisions

- Always: `mcp__rag-brain__feedback` - Rate memory usefulness
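Steps 1 and 2 of this phase can also be done from Python instead of the shell. The sketch below is one possible way to do it, assuming the directory layout shown earlier; the `init_swarm` function name and its `now` parameter are hypothetical and not part of the skill.

```python
import json
from pathlib import Path

LANES = ["KERNEL", "ML", "QUANT", "DEX", "INTEGRATION"]
TOP_LEVEL = ["OUTBOX", "INBOX", "TEMPLATES", "SCRIPTS", "LOCKS"]


def init_swarm(project_root, now="2025-01-01T00:00:00Z"):
    """Create the SWARM/ directories and an initial STATE.json.

    Returns the path to the SWARM directory. Safe to call twice:
    an existing STATE.json is left untouched.
    """
    swarm = Path(project_root) / "SWARM"
    for lane in LANES:
        (swarm / "TASKS" / lane).mkdir(parents=True, exist_ok=True)
    for name in TOP_LEVEL:
        (swarm / name).mkdir(parents=True, exist_ok=True)

    state_file = swarm / "STATE.json"
    if not state_file.exists():  # never clobber a running swarm's state
        state = {
            "wave": 0,
            "active_zerglings": [],
            "completed_tasks": [],
            "pending_tasks": [],
            "last_updated": now,
        }
        state_file.write_text(json.dumps(state, indent=2) + "\n")
    return swarm
```

Calling `init_swarm(".")` from the project root would produce the same layout as the `mkdir -p` command in step 1.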

Phase 2: Task Decomposition (OVERLORD)

1. **Analyze the user's goal** and determine which lane(s) it belongs to:

- `KERNEL/` → CUDA, Triton, CUTLASS, performance

- `ML/` → Model code, training loops, evaluation

- `QUANT/` → Math, strategy research, backtests

- `DEX/` → Solana, Jupiter, Jito transactions

- `INTEGRATION/` → Glue code, CLI, CI only

2. **Break goal into microtasks** that fit these constraints:

- **Timebox:** 4 minutes (hard stop)

- **Max new lines:** 100

- **Files touched:** 1 (max 2 if second is test/docs)

- **No new dependencies**

- **No architectural decisions**

- **No refactors outside task scope**

3. **Assign task types** (each type is scoped to fit the 4-min/100-line limits):

- `ADD_STUB` → Skeleton + TODOs

- `ADD_PURE_FN` → One function + docstring

- `ADD_TEST` → 1-3 test cases

- `FIX_ONE_BUG` → Single bug fix

- `ADD_ASSERTS` → Runtime checks

- `ADD_METRIC` → One metric + logging

- `ADD_BENCH` → Benchmark snippet

- `DOC_SNIPPET` → Documentation section

- `REFACTOR_TINY` → Rename/move only

4. **Create task cards** in appropriate lane directory:

````markdown
# Task: K001

## Metadata

- Lane: KERNEL
- Type: ADD_PURE_FN
- Wave: 1
- Status: PENDING

## Context Pack

Files to read:

- src/kernel/matmul.cu (lines 45-80)

Signature:

```cuda
__global__ void matmul_naive(float* A, float* B, float* C, int M, int N, int K)
```

Expected behavior:

- Multiply M×K matrix A by K×N matrix B
- Write result to M×N matrix C
- Use shared memory for coalescing

Check command:

```bash
nvcc -o test_matmul src/kernel/matmul.cu && ./test_matmul
```

## Objective

Implement naive CUDA matrix multiplication kernel with shared memory.

## Deliverables

- [ ] matmul_naive kernel implementation
- [ ] Inline documentation
- [ ] Passes check command

## Constraints

- Max 100 lines
- Single file edit: src/kernel/matmul.cu
- No external dependencies
````

5. **Compose a balanced wave** (5 tasks):

- 2× Implementation (ADD_STUB / ADD_PURE_FN)

- 2× Validation (ADD_TEST / ADD_ASSERTS)

- 1× Quality (ADD_BENCH / DOC_SNIPPET)

Wave rules:

- Single-lane wave: All 5 in one lane (fastest)

- Mixed wave: Max 3 + 2 across two lanes

- **Never** more than 2 lanes per wave

6. **Verify before spawning:**

- [ ] 2+ validation tasks included

- [ ] Max 2 lanes in wave

- [ ] All tasks have Context Packs

- [ ] No file conflicts detected
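The pre-spawn checklist and wave rules above lend themselves to an automated check. The following is a sketch under stated assumptions: tasks are plain dicts with hypothetical keys (`id`, `lane`, `type`, `files`, `has_context_pack`), and a wave of 4 or 5 tasks is accepted because the OVERLORD role assigns waves of 4-5 agents.

```python
from collections import Counter

VALIDATION = {"ADD_TEST", "ADD_ASSERTS"}  # validation task types from step 5


def wave_problems(tasks):
    """Return a list of rule violations for a proposed wave (empty if OK).

    Each task is a dict with hypothetical keys:
    id, lane, type, files (list of paths), has_context_pack (bool).
    """
    problems = []
    if not 4 <= len(tasks) <= 5:
        problems.append(f"wave has {len(tasks)} tasks, expected 4-5")

    n_validation = sum(1 for t in tasks if t["type"] in VALIDATION)
    if n_validation < 2:
        problems.append(f"only {n_validation} validation task(s), need 2+")

    lanes = Counter(t["lane"] for t in tasks)
    if len(lanes) > 2:
        problems.append(f"{len(lanes)} lanes in wave, max is 2")
    elif len(lanes) == 2 and max(lanes.values()) > 3:
        problems.append("mixed wave exceeds the 3 + 2 split")

    for t in tasks:
        if not t.get("has_context_pack"):
            problems.append(f"{t['id']} has no Context Pack")

    # Two tasks in one wave must never touch the same file.
    owners = {}
    for t in tasks:
        for path in t["files"]:
            if path in owners:
                problems.append(f"{path} claimed by {owners[path]} and {t['id']}")
            else:
                owners[path] = t["id"]
    return problems
```

An empty list means the wave passes every checklist item; otherwise each string names one rule to fix before spawning.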

Phase 3: Spawn Zerglings (Parallel Execution)

1. **Copy task cards to OUTBOX:**

```bash

cp SWARM/TASKS/KERNEL/K001.md SWARM/OUTBOX/

cp SWARM/TASKS/KERNEL/K002.md SWARM/OUTBOX/

# ... for all wave tasks

```

2. **Update STATE.json:**

```json
{
  "wave": 1,
  "active_zerglings": [],
  "completed_tasks": [],
  "pending_tasks": ["K001", "K002", "K003", "M001", "DOC-001"],
  "last_updated": "2025-01-01T10:00:00Z"
}
```

3. **Spawn 4-5 parallel Claude Code agents** (use the Task tool):

```

Launch 5 agents in parallel:

- Agent 1: Execute K001 following Zergling protocol

- Agent 2: Execute K002 following Zergling protocol

- Agent 3: Execute K003 following Zergling protocol

- Agent 4: Execute M001 following Zergling protocol

- Agent 5: Execute DOC-001 following Zergling protocol

```
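Steps 1 and 2 of this phase (copying cards to OUTBOX and updating STATE.json) could be combined into one helper so the two never drift apart. This is a sketch only: `dispatch_wave` is a hypothetical function, not a command of `swarm.py`, and it assumes every card is stored as `TASKS/<LANE>/<ID>.md`.

```python
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path


def dispatch_wave(swarm_root, wave, task_ids):
    """Copy task cards into OUTBOX and record them as pending in STATE.json."""
    swarm = Path(swarm_root)

    # Resolve every card first so a missing card aborts before anything is copied.
    cards = []
    for task_id in task_ids:
        matches = list((swarm / "TASKS").glob(f"*/{task_id}.md"))
        if len(matches) != 1:
            raise FileNotFoundError(
                f"expected exactly one card for {task_id}, found {len(matches)}"
            )
        cards.append(matches[0])

    outbox = swarm / "OUTBOX"
    outbox.mkdir(exist_ok=True)
    for card in cards:
        shutil.copy(card, outbox / card.name)

    state_file = swarm / "STATE.json"
    state = json.loads(state_file.read_text())
    state["wave"] = wave
    state["pending_tasks"] = list(task_ids)
    state["last_updated"] = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    state_file.write_text(json.dumps(state, indent=2) + "\n")
    return state
```

Resolving all cards before copying means a typo in one task ID leaves both OUTBOX and STATE.json unchanged.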

Phase 4: Zergling Execution Protocol

**Each Zergling agent must follow this exact sequence:**

1. **Register with Agent Mail:**

```
mcp__mcp-agent-mail__register_agent(
    agent_id="zergling-K001",
    project_key="/path/to/project"
)
```

2. **Recall context from RAG Brain:**

```

mcp__rag-brain__recall(query="zerg-swarm kernel optimization decisions")

```

3. **Read task card from OUTBOX:**

```

Read: SWARM/OUTBOX/K001.md

Parse: Context Pack, Objective, Deliverables, Constraints

```

4. **Reserve files before editing:**

```
mcp__mcp-agent-mail__file_reservation_paths(
    agent_id="zergling-K001",
    paths=["src/kernel/matmul.cu"]
)
```

**NEVER edit files reserved by another agent.**

5. **Execute task within constraints:**

- Start 4-minute timer

- Read ONLY files listed in Context Pack

- Implement deliverables

- Run check command

- If a limit is exceeded, stop and report status `PARTIAL`

6. **Write result to INBOX:**

````markdown
# Result: K001

## Status

DONE | PARTIAL | BLOCKED | FAILED

## Changes Made

- Implemented matmul_naive in src/kernel/matmul.cu (95 lines)
- Added shared memory optimization
- Check command passes

## Files Modified

- src/kernel/matmul.cu (lines 45-140)

## Gate Check

```bash
nvcc -o test_matmul src/kernel/matmul.cu && ./test_matmul
# Output: PASS
```

## Notes for OVERLORD

Consider adding register tiling in next wave for 2× speedup.
````

Save to: `SWARM/INBOX/K001_RESULT.md`

7. **Remember decision in RAG Brain:**

```

mcp__rag-brain__remember(
    content="Used shared memory pattern from CUTLASS docs for matmul coalescing",
    tags=["kernel", "optimization", "K001"]
)

```

8. **Release file locks:**

```
mcp__mcp-agent-mail__release_file_reservations(
    agent_id="zergling-K001"
)
```

9. **Die (stop immediately).** Do not wait for other agents.
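The timebox and line budget in step 5 and the result file in step 6 can be sketched in a few lines. Everything here is illustrative: the `Timebox` class, `final_status`, and `render_result` are hypothetical names, and mapping a failed gate to `FAILED` is an assumption (a real zergling might report `BLOCKED` instead, depending on the cause).

```python
import time


class Timebox:
    """Tracks the 4-minute and 100-line budgets for a single task."""

    def __init__(self, max_seconds=240, max_lines=100, clock=time.monotonic):
        self._clock = clock          # injectable so the budget can be tested
        self._start = clock()
        self.max_seconds = max_seconds
        self.max_lines = max_lines
        self.lines_added = 0

    def add_lines(self, n):
        self.lines_added += n

    def exceeded(self):
        elapsed = self._clock() - self._start
        return elapsed > self.max_seconds or self.lines_added > self.max_lines


def final_status(box, gate_passed):
    """PARTIAL if a limit was hit; otherwise DONE or FAILED by gate result.

    Assumption: a failed gate maps to FAILED. BLOCKED is not modeled here.
    """
    if box.exceeded():
        return "PARTIAL"
    return "DONE" if gate_passed else "FAILED"


def render_result(task_id, status, changes):
    """Build the start of a result file in the INBOX format shown above."""
    lines = [f"# Result: {task_id}", "", "## Status", "", status, "",
             "## Changes Made", ""]
    lines += [f"- {change}" for change in changes]
    return "\n".join(lines) + "\n"
```

The rendered text would be saved as `SWARM/INBOX/<ID>_RESULT.md` before the zergling releases its locks and stops.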

Phase 5: Wave Collection (OVERLORD)

1. **Wait for all zerglings to write INBOX results.**

2. **Run collection:**

```bash

python3 SWARM/SCRIPTS/swarm.py collect

```

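This moves each task that has a result file in INBOX from `pending_tasks` to `completed_tasks` in STATE.json. It does not read the reported status, so PARTIAL, BLOCKED, and FAILED results still need the review in the next step.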

3. **Review results by status:**

**DONE tasks:**

- Merge changes into main codebase

- Move task card to appropriate archive

- Update STATE.json completed_tasks

**PARTIAL tasks:**

- Analyze what was completed

- Create smaller follow-up tasks

- Add to next wave

**BLOCKED tasks:**

- Resolve dependency/blocker

- Re-queue or split further

**FAILED tasks:**

- Analyze error

- Fix Context Pack if incomplete

- Create recovery task

4. **Increment wave counter:**

```bash

python3 SWARM/SCRIPTS/swarm.py wave

```

5. **Compose the next wave** and repeat Phases 2-5 until the goal is complete.
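The review-by-status step can be partly automated by reading the `## Status` section of each result file. The sketch below assumes result files follow the INBOX format from Phase 4; `read_status` and `triage` are hypothetical helpers, and the `UNKNOWN` bucket is an added convention for files whose status line is missing or was left as the unfilled template.

```python
import re
from pathlib import Path

STATUSES = ("DONE", "PARTIAL", "BLOCKED", "FAILED")


def read_status(result_text):
    """Return the status written under '## Status', or None if unrecognized."""
    match = re.search(r"^## Status[ \t]*\n\s*(.+)$", result_text, flags=re.MULTILINE)
    value = match.group(1).strip() if match else None
    return value if value in STATUSES else None


def triage(inbox_dir):
    """Group task IDs from *_RESULT.md files by their reported status."""
    buckets = {status: [] for status in STATUSES}
    buckets["UNKNOWN"] = []  # missing or unfilled status line
    for path in sorted(Path(inbox_dir).glob("*_RESULT.md")):
        task_id = path.stem[: -len("_RESULT")]
        status = read_status(path.read_text()) or "UNKNOWN"
        buckets[status].append(task_id)
    return buckets
```

The OVERLORD could then merge everything in `buckets["DONE"]` and create follow-up or recovery tasks for the other buckets.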

Status Codes

| Code | Meaning | Next Action |

|------|---------|-------------|

| `DONE` | Task completed successfully | Merge and archive |

| `PARTIAL` | Limits hit, work incomplete | Create follow-up task |

| `BLOCKED` | Cannot proceed, needs input | Resolve dependency |

| `FAILED` | Error, task abandoned | Fix Context Pack, retry |

Lane-Specific Gates

Each lane has acceptance criteria. A task isn't DONE until its gate passes:

| Lane | Gate Checks |

|------|-------------|

| `KERNEL` | Correctness (CPU ref match), Benchmark (1 shape) |

| `ML` | Unit tests OR smoke-run, No import breaks |

| `QUANT` | Deterministic output, No NaNs/lookahead |

| `DEX` | Dry-run TX builds, Safety checks pass |

| `INTEGRATION` | Wire test, CLI --help works |

**Gate must run in <30 seconds.**
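One way to enforce the 30-second limit is to run each gate through `subprocess.run` with a timeout. This is a minimal sketch; `run_gate` is a hypothetical helper, and it assumes the gate is given as an argument list rather than a shell string.

```python
import subprocess
import sys


def run_gate(command, timeout_s=30):
    """Run a gate command (argument list) and return (passed, detail)."""
    try:
        proc = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return False, f"gate exceeded {timeout_s}s"
    except FileNotFoundError:
        return False, f"command not found: {command[0]}"
    if proc.returncode != 0:
        return False, proc.stderr.strip() or f"exit code {proc.returncode}"
    return True, proc.stdout.strip()


if __name__ == "__main__":
    # Example gate: a trivial Python one-liner standing in for a real check.
    print(run_gate([sys.executable, "-c", "print('PASS')"]))
```

A gate that times out is treated as a failure here, which keeps slow checks from stalling a wave.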

CLI Tool Reference

Create `SWARM/SCRIPTS/swarm.py`:

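```python
#!/usr/bin/env python3
"""Minimal CLI for coordinating swarm state."""
import json
import sys
from datetime import datetime, timezone
from pathlib import Path

SWARM_ROOT = Path(__file__).resolve().parent.parent
STATE_FILE = SWARM_ROOT / "STATE.json"


def load_state():
    with open(STATE_FILE) as f:
        return json.load(f)


def save_state(state):
    # UTC timestamp in the same format as the initial STATE.json.
    state["last_updated"] = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    with open(STATE_FILE, "w") as f:
        json.dump(state, f, indent=2)


def status():
    """Print a summary of the current wave."""
    state = load_state()
    print(f"Wave: {state['wave']}")
    print(f"Active: {len(state['active_zerglings'])}")
    print(f"Pending: {len(state['pending_tasks'])}")
    print(f"Completed: {len(state['completed_tasks'])}")


def wave():
    """Advance the wave counter by one."""
    state = load_state()
    state["wave"] += 1
    save_state(state)
    print(f"Advanced to wave {state['wave']}")


def tasks():
    """List task cards currently in OUTBOX."""
    for task in sorted((SWARM_ROOT / "OUTBOX").glob("*.md")):
        print(task.stem)


def results():
    """List result files currently in INBOX."""
    for result in sorted((SWARM_ROOT / "INBOX").glob("*_RESULT.md")):
        print(result.stem)


def collect():
    """Move every pending task that has an INBOX result to completed_tasks.

    The Status field of the result is not read here; review it separately.
    """
    state = load_state()
    inbox = SWARM_ROOT / "INBOX"
    for result_file in sorted(inbox.glob("*_RESULT.md")):
        task_id = result_file.stem.replace("_RESULT", "")
        if task_id in state["pending_tasks"]:
            state["pending_tasks"].remove(task_id)
            state["completed_tasks"].append(task_id)
            print(f"Collected: {task_id}")
    save_state(state)


# Explicit dispatch table: only these names can be invoked from the CLI.
COMMANDS = {
    "status": status,
    "wave": wave,
    "tasks": tasks,
    "results": results,
    "collect": collect,
}

if __name__ == "__main__":
    cmd = sys.argv[1] if len(sys.argv) > 1 else "status"
    if cmd not in COMMANDS:
        sys.exit(f"Unknown command {cmd!r}. Available: {', '.join(COMMANDS)}")
    COMMANDS[cmd]()
```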

Constraints and Warnings

**Never violate the 4-min/100-line rule.** If a task looks too big, split it immediately.

**No file edits without reservation.** Check `mcp-agent-mail` locks before touching any file.

**One task per zergling.** Do not chain tasks or continue after writing to INBOX.

**No architectural decisions in zerglings.** Escalate to OVERLORD if approach is unclear.

**Cross-lane dependencies must be explicit.** Never assume another lane's work is complete.
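The reservation rule depends on exactly one agent winning each file. The skill delegates this to `mcp-agent-mail`; the sketch below is not that server's API. It is a hypothetical local fallback that uses the `LOCKS/` directory and atomic file creation to show the idea, with an assumed naming scheme for lock files.

```python
import os
from pathlib import Path


def _lock_path(locks_dir, target):
    # One lock file per reserved path; slashes are flattened (assumed scheme).
    return Path(locks_dir) / (target.replace("/", "__") + ".lock")


def reserve(locks_dir, agent_id, target):
    """Try to reserve `target`; return True on success, False if already held."""
    Path(locks_dir).mkdir(parents=True, exist_ok=True)
    try:
        # O_CREAT | O_EXCL fails if the file exists, so creation is atomic:
        # when two agents race, only one of them gets the lock.
        fd = os.open(_lock_path(locks_dir, target),
                     os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(agent_id)  # record the owner for release checks
    return True


def release(locks_dir, agent_id, target):
    """Release the reservation only if this agent holds it."""
    lock = _lock_path(locks_dir, target)
    if lock.exists() and lock.read_text() == agent_id:
        lock.unlink()
        return True
    return False
```

A zergling that gets `False` from `reserve` would report `BLOCKED` rather than edit the file.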

Example: Full Wave Execution

**Goal:** Implement CUDA matrix multiplication kernel with tests

**OVERLORD decomposes:**

1. K001: ADD_STUB - Create matmul.cu skeleton

2. K002: ADD_PURE_FN - Implement naive kernel

3. K003: ADD_TEST - Unit test for correctness

4. K004: ADD_BENCH - Benchmark against cuBLAS

5. DOC-001: DOC_SNIPPET - Document kernel API

**Wave composition:** Single-lane wave (KERNEL × 4, plus one DOC_SNIPPET task documenting the kernel)

**Spawn 5 zerglings in parallel** → Each registers, reserves, executes, writes INBOX, releases, dies

**OVERLORD collects:** All 5 return DONE → Merge changes → Advance to Wave 2

Notes

This skill works best with MCP servers `mcp-agent-mail` (file locking) and `rag-brain` (shared memory).

Adjust lane structure to match your project's architecture.

Gate checks should be fast (<30s) and deterministic.

For very large goals, run multiple waves sequentially rather than spawning hundreds of agents at once.
