A mixture-of-experts (MOE) system that routes questions to specialized AI models (programming, biology, mathematics) for domain-specific responses. It uses keyword matching and director-LLM classification to select the best expert.
This skill implements a MOE question-answering system that routes user questions to specialized AI models based on domain expertise. A director model classifies each question, and expert models for programming, biology, and mathematics are loaded on demand.
The MOE system enhances response quality and efficiency by:
1. **Intelligent Routing**: Uses keyword matching and director LLM classification to identify the question domain
2. **Dynamic Expert Loading**: Loads specialized models on-demand, optimizing memory usage
3. **Domain Specialization**: Maintains separate expert models for programming, biology, and mathematics
4. **Resource Efficiency**: Releases previous models from memory before loading new ones
5. **Chat Interface**: Provides a conversational interface for continuous interaction
**Core Components:** the model configuration, keyword routing tables, the `MOELLM` class, and a chat loop.
Install the required Python packages:
```bash
pip install torch transformers accelerate
```
Ensure CUDA is available for GPU acceleration (optional but recommended).
Define the MODEL_CONFIG dictionary with expert model specifications:
```python
MODEL_CONFIG = {
"director": {
"name": "Agnuxo/Qwen2-1.5B-Instruct_MOE_Director_16bit",
"task": "text-generation",
},
"programming": {
"name": "Qwen/Qwen2-1.5B-Instruct",
"task": "text-generation",
},
"biology": {
"name": "Agnuxo/Qwen2-1.5B-Instruct_MOE_BIOLOGY_assistant_16bit",
"task": "text-generation",
},
"mathematics": {
"name": "Qwen/Qwen2-Math-1.5B-Instruct",
"task": "text-generation",
}
}
```
Create keyword mappings for each expert domain to enable fast keyword-based routing:
```python
KEYWORDS = {
"biology": ["cell", "DNA", "protein", "evolution", "genetics", "ecosystem"],
"mathematics": ["equation", "integral", "derivative", "function", "geometry"],
"programming": ["python", "java", "code", "API", "algorithm", "database"]
}
```
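The keyword lookup can be sketched as a small helper that checks the question text against each domain's keyword list and defers to the director model when nothing matches. The function name `route_by_keywords` is illustrative, not part of the skill:

```python
from typing import Optional

# Keyword tables from the configuration above (lowercased for matching)
KEYWORDS = {
    "biology": ["cell", "dna", "protein", "evolution", "genetics", "ecosystem"],
    "mathematics": ["equation", "integral", "derivative", "function", "geometry"],
    "programming": ["python", "java", "code", "api", "algorithm", "database"],
}

def route_by_keywords(question: str) -> Optional[str]:
    """Return the matching expert, or None to defer to the director LLM."""
    text = question.lower()
    for expert, words in KEYWORDS.items():
        if any(word in text for word in words):
            return expert
    return None  # no keyword hit: ask the director model to classify
```

This fast path avoids running the director model for clearly classifiable questions; only ambiguous inputs pay the cost of an LLM call.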
Include both English and other language variants as needed.
Create the main `MOELLM` class. Its key methods cover determining the expert for a question (keywords first, director model as fallback), loading an expert model on demand, generating a response, and running the chat interface.
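A minimal skeleton of the class might look like the following. Method names other than `chat_interface` are assumptions, and the director call and model loading are stubbed with comments so the sketch stays lightweight:

```python
from typing import Optional

# Keyword tables as defined earlier (lowercased for matching)
KEYWORDS = {
    "biology": ["cell", "dna", "protein", "evolution", "genetics", "ecosystem"],
    "mathematics": ["equation", "integral", "derivative", "function", "geometry"],
    "programming": ["python", "java", "code", "api", "algorithm", "database"],
}

class MOELLM:
    """Routes questions to a domain expert and generates responses (sketch)."""

    def __init__(self):
        self.current_expert: Optional[str] = None
        self.current_model = None
        self.current_tokenizer = None

    def determine_expert(self, question: str) -> str:
        """Keyword routing first; fall back to the director model."""
        text = question.lower()
        for expert, words in KEYWORDS.items():
            if any(word in text for word in words):
                return expert
        return self.ask_director(question)

    def ask_director(self, question: str) -> str:
        """Classify with the director LLM. Stubbed here; a real
        implementation would run MODEL_CONFIG['director']."""
        return "programming"  # placeholder default

    def load_expert(self, expert: str) -> None:
        """Load the expert's model and tokenizer on demand (sketch)."""
        if expert == self.current_expert:
            return  # already loaded
        # Real code would release the current model, then load
        # MODEL_CONFIG[expert]["name"] via transformers here.
        self.current_expert = expert
```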
Ensure proper memory management when switching between expert models:
```python
# Release the previous expert before loading a new one
if self.current_model:
    del self.current_model
    del self.current_tokenizer
    torch.cuda.empty_cache()
```
This prevents GPU/RAM overflow when loading multiple large models.
Implement a simple loop that:
1. Accepts user input
2. Determines the appropriate expert (keyword or director)
3. Loads the expert model dynamically
4. Generates and displays the response
5. Continues until user types 'exit' or 'quit'
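The loop above can be sketched as follows. To keep the sketch testable, input reading and response generation are injected as callables (parameter names are illustrative); the full system would call the expert-selection and model-loading methods where the comment indicates:

```python
def chat_loop(read_input, respond):
    """Run the conversational loop.

    read_input: callable returning the next user line.
    respond: callable taking a question and returning an answer string.
    Returns the transcript as a list of (question, answer) pairs.
    """
    transcript = []
    while True:
        question = read_input().strip()
        if question.lower() in ("exit", "quit"):
            break  # user ended the session
        # In the full system: determine the expert, load it dynamically,
        # then generate the answer with the expert model.
        answer = respond(question)
        transcript.append((question, answer))
        print(answer)
    return transcript
```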
Wrap model operations in try-except blocks to gracefully handle failures such as out-of-memory errors or model download problems. Launch the system with:
```python
moe_llm = MOELLM()
moe_llm.chat_interface()
```
License: apache-2.0