Expert assistant for building and maintaining the ChatFDA RAG system prototype for querying FDA data. Follows strict code organization, minimal design principles, and domain-specific deduplication strategies.
An expert assistant for developing and maintaining the ChatFDA RAG (Retrieval-Augmented Generation) system prototype designed to query FDA pharmaceutical label data.
This skill guides AI agents working on the ChatFDA codebase, covering code organization, formatting, comment policy, and domain-specific deduplication:
1. **Maintain Good Code Organization**
- Demonstrate clean separation of concerns
- Keep code simple, small, and concise
- Only include behavior directly relevant to the current discussion
- Always produce minimal code — no speculative features or abstractions
2. **Avoid Over-Engineering**
- Don't add functionality beyond what is being discussed
- Prefer simple solutions over complex ones
- Factor complex logic into smaller functions rather than adding explanatory comments
1. **TypeScript Code**
- Use 4 spaces for indentation (not 2)
- Do NOT use semicolons in TypeScript code
2. **Configuration Files**
- Keep formatting as originally generated
- Preserve semicolons if they exist in config files
- Preserve original indentation in config files
1. **Preserve Existing Comments**
- Keep existing comments unless they need updates to reflect code changes
- Update comments when the code they describe changes
2. **Never Add These Types of Comments**
- Comments describing what changed in a diff
- References to deleted code
- Change logs or version history
3. **When to Add Comments**
- Use judiciously — prefer refactoring to smaller, self-documenting functions
- Only add when the domain logic is genuinely non-obvious
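As a minimal sketch of "prefer refactoring to smaller, self-documenting functions": rather than a compound filter condition with a comment explaining it, each condition gets a named function. The `ScoredChunk` type, the `score` field, and the `MIN_SCORE` value are hypothetical and used only for illustration; they are not taken from the ChatFDA codebase.

```typescript
// Hypothetical shape and threshold, for illustration only
type ScoredChunk = {
    text: string
    score: number
}

const MIN_SCORE = 0.5

function isRelevant(chunk: ScoredChunk): boolean {
    return chunk.score >= MIN_SCORE
}

function hasText(chunk: ScoredChunk): boolean {
    return chunk.text.trim().length > 0
}

function selectUsableChunks(chunks: ScoredChunk[]): ScoredChunk[] {
    return chunks.filter(chunk => isRelevant(chunk) && hasText(chunk))
}
```

The function names carry the explanation that would otherwise sit in a comment above a single long `filter` expression.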
1. **Understanding Chunks**
- Chunks are segments of text extracted from a single FDA label
- Two chunks from the same label are typically different pieces of information
- Both chunks from a label may be relevant to a query — don't deduplicate by label
2. **Deduplication Strategy**
- Deduplication means removing, or ranking lower, chunks whose text is similar to a chunk from a DIFFERENT manufacturer
- The challenge is achieving diversity in retrieved chunks
- Focus on cross-manufacturer deduplication, not intra-label deduplication
3. **Query Results**
- Multiple chunks from the same label appearing in the results are expected and valid
- Prioritize semantic diversity across manufacturers over uniqueness by source label
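The deduplication rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the ChatFDA implementation: the `Chunk` fields, the word-overlap (Jaccard) similarity, and the `0.8` threshold are all hypothetical stand-ins for whatever the project actually uses. The sketch demotes a chunk only when a similar chunk from a different manufacturer was already kept, so several chunks from the same label stay where they are.

```typescript
// Hypothetical chunk shape; the real ChatFDA type may differ
type Chunk = {
    labelId: string
    manufacturer: string
    text: string
}

function wordSet(text: string): Set<string> {
    return new Set(text.toLowerCase().split(/\s+/).filter(Boolean))
}

// Word-level Jaccard similarity, assumed here in place of the project's real measure
function jaccard(a: string, b: string): number {
    const wordsA = wordSet(a)
    const wordsB = wordSet(b)
    if (wordsA.size === 0 && wordsB.size === 0) return 1
    let shared = 0
    wordsA.forEach(word => {
        if (wordsB.has(word)) shared++
    })
    return shared / (wordsA.size + wordsB.size - shared)
}

function isCrossManufacturerDuplicate(chunk: Chunk, kept: Chunk[], threshold: number): boolean {
    return kept.some(other =>
        other.manufacturer !== chunk.manufacturer &&
        jaccard(other.text, chunk.text) >= threshold
    )
}

// Stable reorder: cross-manufacturer near-duplicates move to the end, nothing is dropped
function demoteCrossManufacturerDuplicates(chunks: Chunk[], threshold = 0.8): Chunk[] {
    const kept: Chunk[] = []
    const demoted: Chunk[] = []
    for (const chunk of chunks) {
        if (isCrossManufacturerDuplicate(chunk, kept, threshold)) {
            demoted.push(chunk)
        } else {
            kept.push(chunk)
        }
    }
    return [...kept, ...demoted]
}
```

Demoting rather than removing keeps every chunk available to the caller. Comparing manufacturers instead of label IDs is what prevents two chunks from one label from being treated as duplicates of each other.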
**Good:**
```typescript
// Query: "Add a function to rank chunks by manufacturer diversity"
// AI produces a small, focused function with 4-space indentation, no semicolons
function rankByManufacturerDiversity(chunks: Chunk[]): Chunk[] {
    const seen = new Set<string>()
    return chunks.filter(chunk => {
        if (seen.has(chunk.manufacturer)) return false
        seen.add(chunk.manufacturer)
        return true
    })
}
```
**Bad:**
```typescript
// Query: "Add a function to rank chunks"
// AI adds unnecessary features, uses 2-space indentation, adds semicolons
function rankChunks(chunks: Chunk[], options?: RankingOptions): Chunk[] {
  // Added: 2024-01-15 - Initial implementation
  const seen = new Set<string>();
  // Support multiple ranking strategies (not requested)
  const strategy = options?.strategy || 'default';
  // ... 50+ lines of unnecessary abstraction
}
```