SpecHO Development Guide

SpecHO is an AI watermark detection system that identifies "Echo Rule" patterns in AI-generated text. This skill guides development through a strict three-tier progression (MVP → Production → Research), with 32 tasks across five pipeline components.

Overview

**Project**: SpecHO - Echo Rule watermark detector

**Architecture**: Five-component sequential pipeline

**Language**: Python 3.11+

**Total Tasks**: 32 (12-week Tier 1 timeline)

**Approach**: Tier-based implementation with mandatory validation gates

Documentation Protocol (MANDATORY)

Session Start Protocol

1. **Verify working directory**: Must be in project root

2. **Read current state**: Check `docs/STATUS.md` first

3. **Create session file**: `working/session-YYYY-MM-DD.md`

4. **Review active task**: Reference `docs/TASKS.md` and `docs/SPECS.md`

Active Documentation Structure

| Document | Purpose | Update Trigger |
|----------|---------|----------------|
| `docs/TASKS.md` | Task specifications (32 tasks) | Tasks added/changed |
| `docs/SPECS.md` | Tier 1/2/3 specifications | Specs refined |
| `docs/IMPLEMENTATION.md` | Learnings and gotchas | After each session |
| `docs/DEPLOYMENT.md` | Operations and deployment | Infrastructure changes |
| `docs/STATUS.md` | Current state and next steps | After each session |
| `architecture.md` | Original Echo Rule design | Reference only |

Session End Protocol

1. **Extract insights** from session file → append to `docs/IMPLEMENTATION.md`

2. **Update current state** → `docs/STATUS.md`

3. **Archive session file** → move to `docs/archive/sessions/`

4. **DO NOT** leave files in `working/` directory

Anti-Patterns (NEVER DO)

Create new top-level markdown files without approval

Create `CONTEXT_*`, `HANDOFF_*`, or `summary*` files

Leave working files after session ends

Use file-based references (use `[DOC.md#section]` anchors instead)

Tier System

Tier 1: MVP (Weeks 1-12, Tasks 1-32)

**Constraints**:

Implement ONLY Tier 1 specifications

Use simple algorithms (no premature optimization)

Use "simple" config profile

Follow strict task sequence

**Deliverable**: Working CLI-based detector

Tier 2: Production (Weeks 13-17)

**Trigger Requirements**:

All 32 Tier 1 tasks complete

Unit tests >80% coverage

5+ integration tests passing

Baseline corpus validated

CLI functional on real documents

2-3 real limitations identified

False positive/negative rate measured (50+ documents)

**Deliverable**: Production-ready system with "robust" config profile
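The Tier 2 trigger requirements above read naturally as a checklist. The sketch below encodes them as a gate function. The `Tier1Status` dataclass and its field names are assumptions made for illustration; only the thresholds come from the list above.

```python
from dataclasses import dataclass


@dataclass
class Tier1Status:
    """Hypothetical snapshot of Tier 1 progress (field names are illustrative)."""
    tasks_complete: int
    unit_coverage: float            # fraction of lines covered, e.g. 0.83
    integration_tests_passing: int
    baseline_validated: bool
    cli_works_on_real_docs: bool
    limitations_identified: int
    documents_measured: int         # documents used to measure FP/FN rates


def tier2_blockers(status: Tier1Status) -> list[str]:
    """Return the trigger requirements that are not yet met (empty list = gate open)."""
    checks = [
        (status.tasks_complete >= 32, "all 32 Tier 1 tasks complete"),
        (status.unit_coverage > 0.80, "unit test coverage > 80%"),
        (status.integration_tests_passing >= 5, "5+ integration tests passing"),
        (status.baseline_validated, "baseline corpus validated"),
        (status.cli_works_on_real_docs, "CLI functional on real documents"),
        (status.limitations_identified >= 2, "2-3 real limitations identified"),
        (status.documents_measured >= 50, "FP/FN rate measured on 50+ documents"),
    ]
    return [label for ok, label in checks if not ok]
```

Returning the unmet requirements, rather than a single boolean, makes it easy to record in `docs/STATUS.md` exactly what still blocks the move to Tier 2.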

Tier 3: Research (Week 18+)

**Trigger Requirements**:

Tier 2 deployed for 2+ weeks

Performance bottlenecks documented with profiling

False positive/negative rate < 5%

ROI-justified features identified

Production data validates need

**Deliverable**: Optimized research-grade system

Task Sequence and Components

Foundation (Start Here)

**Task 1.1**: `SpecHO/models.py` - Dataclasses (Token, Clause, ClausePair, EchoScore, DocumentAnalysis)

**Task 1.2**: `SpecHO/config.py` - Three-tier config system

**Task 7.3**: `SpecHO/utils.py` - File I/O and logging
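As a rough picture of what Task 1.1 might produce, the sketch below defines the five named dataclasses. Only the class names come from this guide. Every field is an assumption inferred from the pipeline stages; the authoritative definitions belong in `docs/TASKS.md`.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    text: str
    pos_tag: str = ""        # assumed: filled in by the POS tagger
    phonetic: str = ""       # assumed: filled in by the phonetic transcriber


@dataclass
class Clause:
    tokens: list[Token] = field(default_factory=list)


@dataclass
class ClausePair:
    clause_a: Clause
    clause_b: Clause


@dataclass
class EchoScore:
    # Assumed: one similarity value per Echo Engine analyzer.
    phonetic: float = 0.0
    structural: float = 0.0
    semantic: float = 0.0


@dataclass
class DocumentAnalysis:
    text: str
    echo_scores: list[EchoScore] = field(default_factory=list)
    document_score: float = 0.0
    z_score: float = 0.0
    confidence: float = 0.0
```

Using `field(default_factory=list)` avoids sharing one mutable list between instances, which is the usual pitfall with list defaults in dataclasses.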

Component 1: Preprocessor (Tasks 2.1-2.5, 8.1)

Tokenizer (spaCy)

POS Tagger (spaCy)

Dependency Parser (spaCy)

Phonetic Transcriber (pronouncing or g2p-en)

Pipeline integration

Tests

Component 2: Clause Identifier (Tasks 3.1-3.4, 8.2)

Boundary Detector

Pair Rules Engine

Zone Extractor

Pipeline integration

Tests

Component 3: Echo Engine (Tasks 4.1-4.4, 8.3)

Phonetic Analyzer (Levenshtein)

Structural Analyzer

Semantic Analyzer (gensim)

Pipeline integration

Tests
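To make the Levenshtein-based Phonetic Analyzer concrete, here is a dependency-free sketch of edit distance and a normalized similarity. It assumes phonetic transcriptions are compared as plain strings, and the name `phonetic_similarity` is hypothetical. The project itself lists `python-Levenshtein` as a dependency, so the real analyzer would presumably call that library instead of this hand-rolled version.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def phonetic_similarity(a: str, b: str) -> float:
    """Map edit distance to [0, 1]; 1.0 means identical transcriptions."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty transcriptions are treated as identical
    return 1.0 - levenshtein(a, b) / longest
```

Normalizing by the longer string keeps scores comparable across words of different lengths, which matters once the scores feed a weighted combination.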

Component 4: Scoring (Tasks 5.1-5.3, 8.4)

Weighted Scorer (numpy)

Document Aggregator

Pipeline integration

Tests
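The scoring stage can be illustrated with a weighted average per clause pair followed by a simple mean over the document. This is a sketch under assumptions: the function names, the dictionary keys, and the use of a plain mean for aggregation are all illustrative, and the actual weights and aggregation rule are defined in `docs/SPECS.md`. The guide names numpy for this component; the standard library is used here only to keep the example self-contained.

```python
from statistics import fmean


def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-analyzer scores; weights need not sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weight for name, weight in weights.items()) / total


def aggregate(pair_scores: list[float]) -> float:
    """Document-level score as the mean of pair scores (0.0 for an empty document)."""
    return fmean(pair_scores) if pair_scores else 0.0
```

Dividing by the weight total means a profile can express relative importance (for example 2:1:1) without having to normalize its weights by hand.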

Component 5: Validator (Tasks 6.1-6.4, 8.5)

Baseline Corpus Processor

Z-Score Calculator (scipy.stats)

Confidence Converter

Pipeline integration

Tests
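The validator's two calculations can be sketched as follows. The z-score formula is the standard one. Converting it to a confidence through the standard normal CDF is an assumption about what the Confidence Converter does, and the function names are hypothetical. The guide names `scipy.stats` for this component, where `scipy.stats.norm.cdf` computes the same quantity as the `math.erf` expression below.

```python
import math


def z_score(document_score: float, baseline_mean: float, baseline_std: float) -> float:
    """Number of baseline standard deviations the document score lies above the mean."""
    if baseline_std <= 0:
        raise ValueError("baseline_std must be positive")
    return (document_score - baseline_mean) / baseline_std


def confidence_from_z(z: float) -> float:
    """Standard normal CDF, computed with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

Under this assumption, a document scoring exactly at the baseline mean maps to a confidence of 0.5, and scores further above the baseline approach 1.0.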

Integration (Tasks 7.1-7.4, 8.6)

Main detector class

CLI interface (argparse + rich)

Baseline builder script

Integration tests
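For the CLI, a minimal `argparse` skeleton might look like the sketch below. The program name, the positional `path` argument, and the `--profile` flag are assumptions for illustration; the guide only states that the CLI uses argparse and rich, and that "simple" and "robust" are config profiles. Output formatting with rich is omitted.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical command-line parser for the detector."""
    parser = argparse.ArgumentParser(
        prog="specho",
        description="Detect Echo Rule patterns in a text file.",
    )
    parser.add_argument("path", help="path to the text file to analyze")
    parser.add_argument(
        "--profile",
        choices=["simple", "robust"],
        default="simple",  # Tier 1 uses the "simple" profile
        help="config profile to load",
    )
    return parser
```

Keeping parser construction in its own function lets tests call `build_parser().parse_args([...])` without running the detector.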

Data Flow

```
Input: str (raw text)
↓
Preprocessor → List[Token] + spacy.Doc
↓
Clause Identifier → List[ClausePair]
↓
Echo Engine → List[EchoScore]
↓
Scoring → float (document_score)
↓
Validator → (z_score, confidence)
↓
Output: DocumentAnalysis dataclass
```
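The flow above is a plain sequential composition, which the following sketch makes explicit by passing each stage in as a callable. The function name `run_pipeline` and its parameter names are hypothetical. Two simplifications are made for brevity: pair scores are plain floats rather than `EchoScore` objects, and the result is a dictionary standing in for the `DocumentAnalysis` dataclass.

```python
from typing import Any, Callable


def run_pipeline(
    text: str,
    preprocess: Callable[[str], list[Any]],
    identify_pairs: Callable[[list[Any]], list[Any]],
    score_pairs: Callable[[list[Any]], list[float]],
    aggregate: Callable[[list[float]], float],
    validate: Callable[[float], tuple[float, float]],
) -> dict[str, float]:
    """Run the five stages in order, feeding each output to the next stage."""
    tokens = preprocess(text)                   # Preprocessor
    pairs = identify_pairs(tokens)              # Clause Identifier
    echo_scores = score_pairs(pairs)            # Echo Engine
    document_score = aggregate(echo_scores)     # Scoring
    z, confidence = validate(document_score)    # Validator
    return {
        "document_score": document_score,
        "z_score": z,
        "confidence": confidence,
    }
```

Injecting the stages as arguments means each component can be replaced by a stub in integration tests before the real implementation exists.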

Implementation Rules

When User Requests Task Implementation

1. **Read specifications**: Check `docs/TASKS.md` for task API

2. **Check tier details**: Reference `docs/SPECS.md` for Tier 1 implementation

3. **Implement Tier 1 only**: No features from Tier 2/3

4. **Create tests**: Write corresponding test file in `tests/`

5. **Validate**: Run tests before marking complete

When User Requests Clarification

1. Check `docs/SPECS.md` for detailed specification

2. Check `architecture.md` for algorithm context

3. Provide clear explanation with document reference

4. Offer code example if helpful

Mandatory Practices

**DO**:

Implement exactly what Tier 1 specifies

Write tests before moving to next task

Document assumptions in `docs/IMPLEMENTATION.md`

Use "simple" config profile

Follow task sequence strictly

**DO NOT**:

Add Tier 2 features (even if easy)

Optimize prematurely

Skip tests

Jump ahead in task sequence

Implement unspecified features

Key Dependencies

Required (Tier 1)

`spacy>=3.7.0` + `en-core-web-sm`

`pronouncing>=0.2.0` OR `g2p-en>=2.1.0`

`python-Levenshtein>=0.21.0`

`jellyfish>=1.0.0`

`gensim>=4.3.0`

`numpy>=1.24.0`

`scipy>=1.11.0`

`pydantic>=2.0.0`

`pytest>=7.4.0` + `pytest-cov>=4.1.0`

`rich>=13.0.0`

`tqdm>=4.66.0`

Optional (Tier 2+)

`sentence-transformers>=2.2.0`

Directory Structure

```
SpecHO/
├── SpecHO/                  # Implementation
│   ├── models.py            # START: Task 1.1
│   ├── config.py
│   ├── utils.py
│   ├── detector.py
│   ├── preprocessor/
│   ├── clause_identifier/
│   ├── echo_engine/
│   ├── scoring/
│   └── validator/
├── tests/                   # Mirror structure
├── scripts/
│   ├── cli.py
│   └── build_baseline.py
├── data/
│   ├── baseline/
│   ├── models/
│   └── corpus/
├── docs/
│   ├── TASKS.md             # READ FIRST for task specs
│   ├── SPECS.md             # READ SECOND for tier details
│   ├── IMPLEMENTATION.md
│   ├── DEPLOYMENT.md
│   ├── STATUS.md
│   └── archive/
└── working/                 # Temporary session files
```

Response Patterns

Task Implementation Request

1. Acknowledge task number (e.g., "Task 2.1: Tokenizer")

2. Reference relevant documentation section

3. State tier level being implemented

4. Implement complete file with Tier 1 specs

5. Suggest test creation or next task

Clarification Request

1. Reference specific documentation section

2. Provide concrete code example

3. Explain in context of overall pipeline

Deviation Detection

1. Acknowledge user request

2. Note if it conflicts with tier system

3. Suggest tier-appropriate alternative

4. Implement only if reasonable for current tier

Examples

Starting First Session

```
User: "Read STATUS.md and help me continue"

A:
1. Reads docs/STATUS.md
2. Identifies current task
3. References docs/TASKS.md for specification
4. Proposes next action based on current tier
```

Implementing a Task

```
User: "Implement Task 2.1"

A:
"Task 2.1: Tokenizer (Tier 1)
Per TASKS.md, this creates SpecHO/preprocessor/tokenizer.py
Tier 1 spec (SPECS.md): Use spaCy's default tokenizer, no customization.

[implements tokenizer.py with Tier 1 specifications]

Next: Create tests/test_preprocessor.py or proceed to Task 2.2?"
```

Preventing Premature Optimization

```
User: "Let's add caching to the phonetic analyzer"

A:
"Caching is a Tier 2 feature (per SPECS.md). Current tier: 1.
We're implementing simple algorithms first. I'll note this
in IMPLEMENTATION.md as a potential Tier 2 enhancement.

Continue with current Tier 1 task?"
```

Constraints

Python 3.11+ required

Must follow strict task sequence (no jumping ahead)

Tests required for every component before proceeding

Documentation updates mandatory at session end

No Tier 2/3 features during Tier 1 implementation

Use the `[DOC.md#section]` reference format, not bare filenames
