Expert guidance for building a prototype RAG (Retrieval-Augmented Generation) system to query FDA pharmaceutical data. Emphasizes minimal, well-organized code with proper chunking strategies and deduplication handling for FDA labels.
Understanding the data structure is critical:
**Chunk Nature:**
**Deduplication Strategy:**
**Query Results:**
1. **Start Minimal**: Implement only what's being discussed in the current context
2. **Organize Well**: Structure code to reflect clear separation of concerns
3. **Refactor Early**: Break into smaller functions at first sign of complexity
4. **Preserve Simplicity**: Resist adding features or behavior beyond immediate needs
5. **Domain-Aware**: Apply FDA-specific deduplication and ranking strategies appropriately
When building FDA query features:
1. Parse FDA label data into meaningful chunks (by section/subsection)
2. Index chunks with metadata (manufacturer, drug name, label section)
3. For queries, retrieve relevant chunks using semantic similarity
4. Apply cross-manufacturer deduplication to similar chunks
5. Preserve multiple chunks from same label if independently relevant
6. Return diverse results balancing relevance and manufacturer variety
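
The workflow above can be sketched in a few small functions. This is a minimal illustration under stated assumptions, not a reference implementation: the `Chunk` fields, the input shape expected by `chunk_label`, the `dup_threshold` value, and all function names are hypothetical. A bag-of-words cosine score stands in for real embedding-based semantic similarity, and `difflib.SequenceMatcher` stands in for whatever near-duplicate test the real system would use.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Chunk:
    # Hypothetical metadata fields, mirroring step 2 of the workflow.
    manufacturer: str
    drug_name: str
    section: str
    text: str


def chunk_label(label: dict) -> list[Chunk]:
    """Steps 1-2: one chunk per label section, tagged with metadata.

    Assumes a hypothetical input shape:
    {"manufacturer": str, "drug_name": str, "sections": {name: text}}.
    """
    return [
        Chunk(label["manufacturer"], label["drug_name"], name, text)
        for name, text in label["sections"].items()
    ]


def _vector(text: str) -> Counter:
    # Lowercased word counts; a stand-in for an embedding vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a: str, b: str) -> float:
    """Step 3 stand-in: cosine similarity over word counts."""
    va, vb = _vector(a), _vector(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[Chunk], top_k: int = 5,
             dup_threshold: float = 0.9) -> list[Chunk]:
    """Steps 3-5: rank by relevance, then drop cross-manufacturer near-duplicates."""
    scored = [(similarity(query, c.text), c) for c in chunks]
    ranked = sorted((p for p in scored if p[0] > 0),
                    key=lambda p: p[0], reverse=True)
    kept: list[Chunk] = []
    for _, cand in ranked:
        # Only chunks from a *different* manufacturer count as duplicates,
        # so several relevant chunks from the same label all survive.
        is_cross_dup = any(
            cand.manufacturer != k.manufacturer
            and SequenceMatcher(None, cand.text, k.text).ratio() >= dup_threshold
            for k in kept
        )
        if not is_cross_dup:
            kept.append(cand)
        if len(kept) == top_k:
            break
    return kept


# Usage with made-up label data (not real FDA content).
labels = [
    {"manufacturer": "Acme", "drug_name": "ibuprofen", "sections": {
        "warnings": "May cause stomach bleeding in patients with ulcers.",
        "dosage": "Take one tablet every six hours. Stomach upset may occur."}},
    {"manufacturer": "Generico", "drug_name": "ibuprofen", "sections": {
        "warnings": "May cause stomach bleeding in patients with ulcers."}},
]
chunks = [c for label in labels for c in chunk_label(label)]
for hit in retrieve("stomach bleeding", chunks):
    print(hit.manufacturer, hit.section)
```

Removing cross-manufacturer duplicates already frees result slots for other content, but the sketch does not explicitly re-rank for manufacturer variety (step 6); that would be a separate pass over `kept`, added only when the need arises.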