A skill for building and maintaining a RAG (Retrieval-Augmented Generation) system prototype that queries FDA data. It guides AI agents to follow the project's conventions for code style and design (clean organization, minimal implementation) and its domain-specific rules for handling FDA labels and text chunks, including proper chunk deduplication.
1. **Simplicity First**
   - Keep code simple, small, and concise
   - Don't include behavior beyond what is being discussed
   - Always produce minimal code
   - Demonstrate good code organization and design
2. **Formatting Standards**
   - Use 4 spaces indentation for TypeScript files
   - Keep configuration files in their generated format (preserve original indentation and semicolons)
   - Do not use semicolons in TypeScript code (but preserve them in config files)
3. **Comment Guidelines**
   - Keep existing comments as they are unless code changes require updates
   - Do not add comments describing what changed or referring to deleted code
   - Use comments judiciously
   - Prefer refactoring to smaller functions over adding explanatory comments
**FDA Labels and Chunks:**
- Do NOT deduplicate chunks by label: chunks from the same label are likely different, and both may be relevant
- DO deduplicate, or rank lower, chunks with similar text that come from different manufacturers
- Challenge: increase the diversity of returned chunks while avoiding cross-manufacturer duplicates
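
The "rank lower" option above can be sketched as follows. This is a minimal illustration, not the project's implementation: the `Chunk` shape, the function names, the Jaccard token-overlap measure, and the `0.8` threshold are all assumptions chosen for the example.

```typescript
// Hypothetical chunk shape; field names are assumptions, not the project's types
interface Chunk {
    labelId: string
    manufacturer: string
    text: string
    score: number
}

function tokenize(text: string): Set<string> {
    return new Set(text.toLowerCase().split(/\W+/).filter(Boolean))
}

// Jaccard similarity of the two token sets, in [0, 1]
function similarity(a: string, b: string): number {
    const tokensA = tokenize(a)
    const tokensB = tokenize(b)
    const shared = Array.from(tokensA).filter(token => tokensB.has(token)).length
    const total = tokensA.size + tokensB.size - shared
    return total === 0 ? 0 : shared / total
}

// Chunks from the same label or the same manufacturer are never treated as duplicates
function isManufacturerDuplicate(chunk: Chunk, kept: Chunk, threshold: number): boolean {
    return chunk.labelId !== kept.labelId
        && chunk.manufacturer !== kept.manufacturer
        && similarity(chunk.text, kept.text) >= threshold
}

// Sorts by score, then moves cross-manufacturer near-duplicates to the end
function rankWithDuplicatesLast(chunks: Chunk[], threshold = 0.8): Chunk[] {
    const sorted = [...chunks].sort((a, b) => b.score - a.score)
    const unique: Chunk[] = []
    const duplicates: Chunk[] = []
    for (const chunk of sorted) {
        const duplicate = unique.some(kept => isManufacturerDuplicate(chunk, kept, threshold))
        if (duplicate) {
            duplicates.push(chunk)
        } else {
            unique.push(chunk)
        }
    }
    return [...unique, ...duplicates]
}
```

Moving near-duplicates to the end rather than dropping them keeps every chunk available while letting a top-k cut favor diverse text. Token overlap is only one possible similarity measure; embedding distance would slot into `similarity` the same way.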
When working on this codebase:
1. Read existing code structure before making changes
2. Apply the minimal code principle - only add what's necessary
3. Maintain 4-space indentation in TypeScript files
4. Preserve formatting in configuration files
5. Factor complex logic into small, well-named functions rather than adding comments
6. When implementing chunk deduplication:
   - Focus on cross-manufacturer similarity detection
   - Preserve multiple chunks from the same label
   - Prioritize chunk diversity in results
**User Request**: "Add a function to deduplicate FDA label chunks"
**Agent Response**:
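
The response itself is not reproduced in this copy. One possible minimal response, sketched under assumptions, might look like the following. The `LabelChunk` shape and both function names are hypothetical, and "similar text" is narrowed here to text that is identical after lowercasing and whitespace normalization, which is the smallest rule that satisfies the request.

```typescript
// Hypothetical chunk shape; the project's real type may differ
interface LabelChunk {
    labelId: string
    manufacturer: string
    text: string
}

function normalize(text: string): string {
    return text.toLowerCase().replace(/\s+/g, " ").trim()
}

// Keeps the first chunk for each text; drops later copies from other manufacturers
function dedupeAcrossManufacturers(chunks: LabelChunk[]): LabelChunk[] {
    const firstManufacturerByText = new Map<string, string>()
    return chunks.filter(chunk => {
        const key = normalize(chunk.text)
        const firstManufacturer = firstManufacturerByText.get(key)
        if (firstManufacturer === undefined) {
            firstManufacturerByText.set(key, chunk.manufacturer)
            return true
        }
        return firstManufacturer === chunk.manufacturer
    })
}
```

The function never compares label IDs, so multiple chunks from the same label are preserved. It adds no scoring, configuration, or fuzzy matching, in keeping with the minimal code principle; those would be separate requests.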