IIS Project Agent Instructions
This skill provides comprehensive guidelines for AI coding agents (GitHub Copilot, Claude, etc.) working on the IIS Project repository. It covers testing requirements, documentation updates, code quality standards, and PR workflows.
What This Skill Does
When activated, this skill ensures agents follow the IIS Project's development standards:
- **Test-Driven Development**: All code changes must include tests (unit, integration, or regression)
- **Documentation Discipline**: Keep ROADMAP.md and IMPLEMENTATION_COMPLETE.md up-to-date
- **Code Hygiene**: Enforce single sources of truth, prevent duplication
- **Quality Gates**: All tests must pass before PR submission
- **Minimal Changes**: Make the smallest possible changes to achieve goals

Instructions for AI Agents

1. Testing Requirements (CRITICAL)
**For any new feature or code change:**
- MUST include new tests or update existing tests
- Tests must cover main functionality and edge cases
- Tests must be deterministic (no flaky tests)
- Tests must pass before finalizing any PR

**For bug fixes:**
- MUST include a regression test when feasible
- Regression test should fail on old code, pass on fixed code
- Document the bug scenario in the test docstring

**Running tests:**
```bash
# Run all tests
pytest

# Run specific test categories
pytest -m unit         # Unit tests only
pytest -m integration  # Integration tests only
pytest -m regression   # Regression tests only

# Run tests with coverage
pytest --cov=app_mockup --cov-report=term

# Run demo tests (human verification)
pytest -k demo_pipeline -s
```
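
The `-m` selections above only pick up tests that carry the matching marker. The following is a minimal sketch of how a unit test, an edge-case test, and a regression test might be marked. `split_sentences` is a hypothetical stand-in defined inline, not a function from this repository, and the bug described in the docstring is invented for illustration.

```python
import pytest


def split_sentences(text: str) -> list[str]:
    """Hypothetical helper: split text on periods and drop empty fragments."""
    parts = [part.strip() for part in text.split(".")]
    return [part for part in parts if part]


@pytest.mark.unit
def test_split_sentences_basic():
    """Main path: two sentences produce two items."""
    assert split_sentences("A claim. A premise.") == ["A claim", "A premise"]


@pytest.mark.negative
def test_split_sentences_empty_input():
    """Edge case: empty input yields an empty list instead of raising."""
    assert split_sentences("") == []


@pytest.mark.regression
def test_split_sentences_trailing_period_regression():
    """Bug scenario (illustrative): a trailing period used to produce an
    empty final sentence. This test fails on the old behavior and passes
    once empty fragments are filtered out."""
    assert split_sentences("Only one.") == ["Only one"]
```

Saved in a test file, `pytest -m regression` would select only the last test, and `pytest -m unit` only the first.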
**Test organization:**
- `test_preprocessing.py` - Preprocessing pipeline (57 tests)
- `test_llm_extractor.py` - Extraction pipeline
- `test_qa_module.py` - Q&A module (26 tests)
- `test_synthetic_claims.py` - Synthetic claim generation
- `test_conclusion_inference.py` - Conclusion detection
- `test_llm_integration.py` - LLM client integration
- `tests/live/` - Live API tests (opt-in only, cost real money)

**Test markers:**
- `@pytest.mark.unit` - Unit tests for individual functions
- `@pytest.mark.integration` - Integration tests for workflows
- `@pytest.mark.regression` - Regression tests with golden outputs
- `@pytest.mark.negative` - Edge cases and error conditions
- `@pytest.mark.demo` - Human-readable demonstration tests
- `@pytest.mark.live_api` - Live API tests (require RUN_LIVE_API_TESTS=1)

**Test quality standards:**
- Deterministic (same input → same output)
- Isolated (no side effects between tests)
- Fast (unit tests in milliseconds)
- Clear (descriptive names and docstrings)
- Maintainable (easy to understand and modify)

**Avoid:**
- Network calls (mock external dependencies)
- Large model downloads (skip or mock if unavailable)
- Random behavior (use fixed seeds if needed)
- File system pollution (use temp directories, clean up)

**Live API tests:**
- Located in `tests/live/`
- Skipped by default (require explicit opt-in)
- Run with: `RUN_LIVE_API_TESTS=1 pytest -m live_api -s`
- Never run in CI (consume real credits)

2. ROADMAP.md Update Requirements

**MUST update ROADMAP.md when:**
- Completing a roadmap task or subtask
- Making significant progress on roadmap item
- Adding new planned work relevant to milestones
- Changing priorities or timelines

**For completed tasks:**
- Change `- [ ]` to `- [x]` for completed checkboxes
- Add brief progress notes (1-2 sentences)
- Update status indicators (✅ Complete, 🔄 In Progress, etc.)

**For new work:**
- Add tasks in appropriate section
- Use clear, actionable task descriptions
- Include context about why task is needed

**For plan changes:**
- Document what changed and why
- Update priority markers if needed

3. IMPLEMENTATION_COMPLETE.md Requirements

**MUST update IMPLEMENTATION_COMPLETE.md when:**
- Completing any issue or PR
- Making significant implementation progress
- Finishing a feature or bug fix
- Completing work that changes system behavior

**Standard format (ALWAYS rewrite entire file):**
````markdown
[Issue Title] - COMPLETE ✅

**Issue:** [Issue number and title]
**Date:** [YYYY-MM-DD]
**Status:** ✅ Complete / 🔄 In Progress / ⚠️ Blocked

Summary

[2-3 sentence overview of what was implemented]

Changes Made

- [Bullet point of change 1]
- [Bullet point of change 2]
- [...]

Key Files Changed

- `path/to/file1.py` - [Brief description]
- `path/to/file2.py` - [Brief description]
- [...]

Testing

**How to test:**
```bash
# Commands to run tests
```

**Test results:**
- [Number] tests added/updated
- All tests passing ✓

Verification

**To verify this implementation:**
1. [Step 1 to verify]
2. [Step 2 to verify]
3. [...]

Follow-ups / Known Limitations

- [Any follow-up work needed]
- [Any known limitations or edge cases]
- [Future improvements planned]

Documentation Updated

- [X] README.md
- [X] ROADMAP.md
- [X] Code comments/docstrings
- [X] Other: [specify]
````
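
One way to catch an incomplete file before committing is a small check that every section title from the template above appears in the draft. This is a minimal sketch: `REQUIRED_SECTIONS` and `missing_sections` are hypothetical names, not part of the repository, and the draft text is invented for illustration.

```python
# Section titles taken from the template above.
REQUIRED_SECTIONS = [
    "Summary",
    "Changes Made",
    "Key Files Changed",
    "Testing",
    "Verification",
    "Follow-ups / Known Limitations",
    "Documentation Updated",
]


def missing_sections(text: str) -> list[str]:
    """Return template section titles that never appear on their own line."""
    # Strip any leading '#' so the check works with or without heading markers.
    titles = {line.strip().lstrip("#").strip() for line in text.splitlines()}
    return [name for name in REQUIRED_SECTIONS if name not in titles]


# Illustrative draft that stops after the Testing section.
draft = """\
Example issue - COMPLETE

Summary
Short overview of the change.

Changes Made
- First change

Testing
- 2 tests added/updated
"""

print(missing_sections(draft))
```

An empty list means every section is present; the check says nothing about whether the sections are filled in well.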
**Must include:**
- Issue title and number
- Completion date
- Clear summary of changes (bullets)
- List of key files modified/created
- Testing commands and results
- How to verify/validate changes
- Any follow-up work or limitations

**Keep it:**
- Concise (1-2 pages max)
- Scannable (bullets and sections)
- Actionable (include commands to verify)
- Complete (don't omit important details)

4. Code Change Guidelines

**Minimal changes:**
- Make smallest possible changes to achieve goal
- Don't refactor unrelated code
- Don't fix unrelated bugs unless they block your work
- Keep commits focused and atomic

**Code quality:**
- Follow existing code style and patterns
- Add docstrings to new functions and classes
- Comment complex logic only when necessary
- Keep functions small and focused

**Dependencies:**
- Avoid adding new dependencies unless absolutely necessary
- If adding dependency, document why it's needed
- Check for security vulnerabilities before adding
- When adding dependency:
  - MUST update `requirements.txt` (pip users)
  - MUST update `environment.yml` (conda users)
  - Ensure version pins target same ranges (note syntax differences)
- When changing Python version:
  - MUST update Python version in `environment.yml`
  - MUST update Python version in README.md
  - MUST test with new Python version
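
The "same ranges, different syntax" rule above can be checked mechanically. The sketch below compares pins from the two files, assuming `environment.yml` lists dependencies as plain `- name=version` lines. `parse_pins` and `mismatched_pins` are hypothetical helpers, and the version numbers are illustrative only. It treats conda's single `=` as equivalent to pip's `==`, which is a simplification: conda's `=` also matches more specific versions with the same prefix.

```python
import re

# Package name followed by an optional version spec, e.g. "pytest>=8.0".
_ENTRY = re.compile(r"^([A-Za-z0-9_.-]+)\s*(.*)$")


def parse_pins(lines, conda=False):
    """Map lowercased package name -> version spec.

    With conda=True, only '- name=1.2' style lines are read and nested
    sections such as '- pip:' are skipped.
    """
    pins = {}
    for raw in lines:
        entry = raw.split("#", 1)[0].strip()  # drop trailing comments
        if conda:
            if not entry.startswith("- "):
                continue
            entry = entry[2:].strip()
        elif entry.startswith("-"):
            continue  # pip options such as '-r other.txt'
        if not entry or entry.endswith(":"):
            continue
        match = _ENTRY.match(entry)
        if match is None:
            continue
        name = match.group(1).lower()
        spec = match.group(2).replace(" ", "")
        # conda writes a pin as 'name=1.2', pip as 'name==1.2'; normalize to '=='.
        if conda and spec.startswith("=") and not spec.startswith("=="):
            spec = "=" + spec
        pins[name] = spec
    return pins


def mismatched_pins(pip_pins, conda_pins):
    """Return {name: (pip_spec, conda_spec)} for shared packages that differ."""
    shared = pip_pins.keys() & conda_pins.keys()
    return {
        name: (pip_pins[name], conda_pins[name])
        for name in sorted(shared)
        if pip_pins[name] != conda_pins[name]
    }


# Illustrative contents; the versions are made up for the example.
pip_lines = ["openai==1.30.0", "pytest>=8.0  # test runner", "streamlit==1.35.0"]
conda_lines = ["- python=3.11", "- openai=1.30.0", "- pytest>=8.0", "- streamlit=1.34.0"]

print(mismatched_pins(parse_pins(pip_lines), parse_pins(conda_lines, conda=True)))
# {'streamlit': ('==1.35.0', '==1.34.0')}
```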
**Documentation:**
- Update README.md for user-facing changes
- Update relevant docs in `/docs/` for behavior changes
- Keep inline documentation clear and concise

5. Code Architecture & Hygiene

**Single Sources of Truth:**
1. **Extraction Pipeline**: `app_mockup/llm_extractor.py`
   - Do NOT create alternative extraction pipelines
   - All extraction logic goes through llm_extractor.py
   - Uses OpenAI structured outputs API directly
2. **Graph Data Structures**: `app_mockup/backend/graph_construction.py`
   - Provides GraphNode and GraphEdge classes only
   - Do NOT duplicate these classes elsewhere
3. **Q&A Module**: `app_mockup/backend/qa_module.py`
   - Single Q&A implementation
   - Do NOT create alternative Q&A systems
4. **LLM Client**: `app_mockup/backend/llm_client.py`
   - Single LLM client with caching and budget tracking
   - All LLM calls go through this client
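
The "do not duplicate these classes" rule can be checked with the standard `ast` module. The sketch below flags any file other than `graph_construction.py` that defines `GraphNode` or `GraphEdge`. `find_duplicate_definitions` is a hypothetical helper, and `app_mockup/ui/helpers.py` is an invented path used only to show a violation.

```python
import ast

# Classes named in the guidelines and the one module allowed to define them.
CANONICAL = {
    "GraphNode": "app_mockup/backend/graph_construction.py",
    "GraphEdge": "app_mockup/backend/graph_construction.py",
}


def find_duplicate_definitions(sources):
    """Return (path, class_name) pairs where a canonical class is redefined.

    `sources` maps a file path to that file's source text.
    """
    duplicates = []
    for path, code in sorted(sources.items()):
        for node in ast.walk(ast.parse(code)):
            if isinstance(node, ast.ClassDef) and node.name in CANONICAL:
                if path != CANONICAL[node.name]:
                    duplicates.append((path, node.name))
    return duplicates


sources = {
    "app_mockup/backend/graph_construction.py": "class GraphNode: ...\nclass GraphEdge: ...\n",
    # Hypothetical offending file that redefines GraphNode.
    "app_mockup/ui/helpers.py": "class GraphNode:\n    pass\n",
}

print(find_duplicate_definitions(sources))
# [('app_mockup/ui/helpers.py', 'GraphNode')]
```

In practice the `sources` mapping could be built by reading every `.py` file under the repository root.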
**Before adding new modules:**
- Check if functionality already exists
- If exists, extend existing module instead of creating new one
- If must create new module, document why in PR description
- Remove deprecated modules when adding replacements

**Preventing code duplication:**
- Do NOT create multiple implementations of same functionality
- Do NOT create "v2" modules alongside old modules - replace old one
- Do NOT keep unused "example" or "prototype" files in main codebase
- Move deprecated code to git history, not `archive/` folders

6. Pull Request Guidelines

**Before creating PR (MUST complete):**
1. ✅ Run `pytest` and ensure all tests pass
2. ✅ Add/update tests for your changes
3. ✅ Update ROADMAP.md if completing planned tasks
4. ✅ Update IMPLEMENTATION_COMPLETE.md with issue summary
5. ✅ Update relevant documentation
6. ✅ Review your changes for code quality
7. ✅ Ensure no unintended files are committed
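
Part of the checklist above can be approximated from the list of changed files (for example, the output of `git diff --name-only`). This is a minimal sketch: `pre_pr_warnings` is a hypothetical helper, and it assumes tests live under a `tests/` directory or in files named `test_*.py`. It does not replace running `pytest` or reviewing the diff.

```python
from pathlib import PurePosixPath


def _is_test(path: str) -> bool:
    """Assumption: tests sit under tests/ or are named test_*.py."""
    p = PurePosixPath(path)
    return "tests" in p.parts or p.name.startswith("test_")


def pre_pr_warnings(changed_files):
    """Return reminders based on which paths a change touches."""
    changed = list(changed_files)
    names = {PurePosixPath(p).name for p in changed}
    code_changed = any(p.endswith(".py") and not _is_test(p) for p in changed)
    tests_changed = any(_is_test(p) for p in changed)

    warnings = []
    if code_changed and not tests_changed:
        warnings.append("code changed but no tests were added or updated")
    if code_changed and "IMPLEMENTATION_COMPLETE.md" not in names:
        warnings.append("IMPLEMENTATION_COMPLETE.md was not updated")
    if code_changed and "ROADMAP.md" not in names:
        warnings.append("ROADMAP.md was not updated (fine if no roadmap task is affected)")
    return warnings


for message in pre_pr_warnings(["app_mockup/llm_extractor.py"]):
    print("WARNING:", message)
```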
**PR description:**
- Clear description of changes
- Related issues (use "Fixes #123" for automatic linking)
- List of main changes
- Testing steps and results
- Documentation updates
- Checklist completion

**Review checklist:**
- Code follows project style
- Self-reviewed for quality
- Complex areas are commented
- No new warnings or errors
- Tests added/updated and passing
- Documentation updated
- ROADMAP.md updated (if applicable)
- IMPLEMENTATION_COMPLETE.md updated with issue summary

7. Standard Development Flow

**1. Understand the task:**
- Read issue description carefully
- Check related code and tests
- Review ROADMAP.md for context

**2. Plan the changes:**
- Identify minimal changes needed
- Plan test additions/updates
- Check if ROADMAP.md needs updating

**3. Implement changes:**
- Write code changes
- Add/update tests
- Run tests frequently during development

**4. Validate changes:**
- Run full test suite: `pytest`
- Run demo tests: `pytest -k demo_pipeline -s`
- Manual verification if needed

**5. Update documentation:**
- Update ROADMAP.md if applicable
- Update IMPLEMENTATION_COMPLETE.md with issue summary
- Update other docs as needed
- Review PR template requirements

**6. Finalize PR:**
- Ensure all tests pass
- Complete PR description
- Self-review checklist
- Request review

8. Error Handling

**If tests fail:**
- Read failure message carefully
- Run specific failing test: `pytest tests/test_file.py::test_name -v`
- Fix issue or update test if behavior changed intentionally
- Never skip or remove failing tests without understanding why

**If unsure about changes:**
- Ask for clarification in PR comments
- Document assumptions in code comments
- Add TODO comments for follow-up work

Project Context

**Current phase:**
- Post-Milestone 2 (Initial Mockup complete)
- Repository cleanup complete (2026-01-27): removed ~5,000 lines of unused code
- Working toward Milestone 3 (Demo Video) and Milestone 4 (Final Prototype)

**Key components:**
1. **Preprocessing** (✅ Complete, ✅ Well-tested)
   - Sentence segmentation, discourse markers, candidate flagging
2. **Extraction** (✅ Implemented in llm_extractor.py)
   - LLM component classification, relation extraction, synthetic claims, conclusion inference
3. **Q&A** (✅ Implemented)
   - Graph-grounded question answering, chat memory
4. **UI/Frontend** (✅ Basic implementation)
   - Streamlit app, graph visualization, node details panel
**Testing philosophy:**
- Comprehensive (cover main paths and edge cases)
- Maintainable (easy to understand and update)
- Fast (quick feedback during development)
- Reliable (deterministic, no flaky tests)
- Human-readable (demo tests for sanity checking)

Constraints

- All tests MUST pass before PR submission (no exceptions)
- ROADMAP.md and IMPLEMENTATION_COMPLETE.md updates are REQUIRED, not optional
- Live API tests are opt-in only (never run in CI)
- Single sources of truth MUST be respected (no duplicate implementations)
- Minimal changes principle is enforced (no scope creep)

Questions or Issues?

If these instructions are unclear or don't cover your use case:
1. Check existing code for patterns
2. Look at recent PRs for examples
3. Add comment to your PR asking for guidance
4. Don't guess - better to ask than make wrong assumptions