A quantized 2.6B parameter Bulgarian language model fine-tuned for function calling, tool use, and MCP integration. Based on llm-bg/Tucan-2.6B-v1.0 and distributed in multiple GGUF quantization formats for efficient local inference.
This skill provides access to the LLMBG-ToolUse-2.6B model, a compact yet capable language model designed for:
- Function calling and structured tool use
- MCP (Model Context Protocol) integration
- Bulgarian language tasks, with English also supported
The model is available in multiple quantization formats (IQ1_S to Q6_K) allowing you to balance between model size, speed, and quality based on your hardware constraints.
When a user wants to use this model:
1. **Identify User Requirements**
- Determine the user's available hardware (RAM, VRAM)
- Understand their quality vs speed preferences
- Check if they need Bulgarian language support or English is sufficient
2. **Recommend Appropriate Quantization**
- For limited resources (2-4GB RAM): Recommend IQ3_S or IQ3_M
- For balanced performance: Recommend Q4_K_M (fast, good quality)
- For optimal quality: Recommend Q5_K_M or Q6_K
- For maximum speed with acceptable quality: Recommend Q4_K_S
3. **Provide Download Instructions**
- Direct user to the specific GGUF file on HuggingFace: `https://huggingface.co/mradermacher/LLMBG-ToolUse-2.6B-v1.0-i1-GGUF`
- Show the download command for their chosen quantization
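The download step can be sketched in Python. This helper only constructs the direct-download URL; the filename pattern is inferred from the repository's file listing and is an assumption for quantizations other than the ones shown in this document.

```python
# Sketch: build the direct-download URL for a chosen quantization.
# The filename pattern is assumed from the repository's file listing.
REPO = "mradermacher/LLMBG-ToolUse-2.6B-v1.0-i1-GGUF"

def gguf_url(quant: str) -> str:
    filename = f"LLMBG-ToolUse-2.6B-v1.0.i1-{quant}.gguf"
    return f"https://huggingface.co/{REPO}/resolve/main/{filename}"

print(gguf_url("Q4_K_M"))
```

The resulting URL can be passed to `wget` or `curl -LO` as in the llama.cpp example below.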
4. **Setup with Inference Engine**
**For llama.cpp:**
```bash
# Download the model (example with Q4_K_M)
wget https://huggingface.co/mradermacher/LLMBG-ToolUse-2.6B-v1.0-i1-GGUF/resolve/main/LLMBG-ToolUse-2.6B-v1.0.i1-Q4_K_M.gguf
# Run inference
./llama-cli -m LLMBG-ToolUse-2.6B-v1.0.i1-Q4_K_M.gguf -p "Your prompt here" --temp 0.7
```
**For Ollama:**
```bash
# Create Modelfile
cat > Modelfile << EOF
FROM ./LLMBG-ToolUse-2.6B-v1.0.i1-Q4_K_M.gguf
EOF
# Create model
ollama create llmbg-tooluse -f Modelfile
# Run model
ollama run llmbg-tooluse
```
5. **Configure for Function Calling**
- Explain that this model is optimized for function calling
- Provide example function calling format compatible with the model
- Show how to integrate with MCP (Model Context Protocol) if applicable
6. **Test the Setup**
- Provide a simple test prompt to verify the model is working
- For function calling, provide a test function definition and call
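A verification step for function calling can look like the following sketch. The JSON output format shown here (`{"name": ..., "arguments": ...}`) is an assumption for illustration; check the model card for the exact schema the model was trained to emit.

```python
import json

# Hypothetical model output for a weather test call; the actual
# format depends on the model's fine-tuning.
model_output = '{"name": "get_weather", "arguments": {"location": "Sofia", "units": "celsius"}}'

call = json.loads(model_output)
assert call["name"] == "get_weather"
assert call["arguments"]["location"] == "Sofia"
print("function call parsed OK")
```

If the model emits valid JSON matching the provided function definition, the setup is working.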
| Quantization | Size | Use Case |
|--------------|------|----------|
| IQ1_S/IQ1_M | 0.9-1.0GB | Extremely limited hardware only |
| IQ2_XXS-IQ2_M | 1.0-1.2GB | Very constrained environments |
| Q2_K | 1.3GB | Minimum viable quality |
| IQ3_S | 1.5GB | Good balance for limited hardware |
| Q4_K_S | 1.7GB | **Optimal size/speed/quality** |
| Q4_K_M | 1.8GB | **Fast, recommended for most users** |
| Q5_K_M | 2.0GB | Higher quality, still fast |
| Q6_K | 2.3GB | Near-original quality |
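The table above can be turned into a simple selection helper. This is a sketch: the sizes are the file sizes listed above, and the fixed headroom for KV cache and runtime overhead is an assumption, not a measured figure.

```python
# Sketch: pick a quantization from the table given available RAM in GB.
# Sizes are GGUF file sizes; the ~1 GB headroom for KV cache and
# runtime overhead is an assumption.
QUANTS = [  # (name, file size in GB), smallest to largest
    ("IQ3_S", 1.5),
    ("Q4_K_S", 1.7),
    ("Q4_K_M", 1.8),
    ("Q5_K_M", 2.0),
    ("Q6_K", 2.3),
]

def pick_quant(ram_gb: float, headroom_gb: float = 1.0) -> str:
    budget = ram_gb - headroom_gb
    best = "IQ3_S"  # fall back to the smallest listed quant
    for name, size in QUANTS:
        if size <= budget:
            best = name
    return best

print(pick_quant(4.0))
```

For very constrained hardware (under ~2.5 GB RAM), fall back to the IQ1/IQ2 variants listed above.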
When helping users implement function calling:
```python
import json

# Example function definition in JSON Schema style
function_def = {
    "name": "get_weather",
    "description": "Get current weather for a location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
            "units": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],
    },
}

# Serialize with json.dumps so the model sees valid JSON,
# not a Python dict repr
prompt = f"""Available functions: {json.dumps(function_def)}
User: What's the weather in Sofia?
Assistant:"""
```
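On the response side, the model's function-call output can be routed to a local implementation. This is a minimal sketch: the output schema and the `get_weather` stub are illustrative assumptions, not part of the model's documented interface.

```python
import json

# Stub implementation for illustration only
def get_weather(location: str, units: str = "celsius") -> str:
    return f"Weather in {location}: 21 {units}"

# Registry mapping function names to callables
TOOLS = {"get_weather": get_weather}

def dispatch(model_output: str) -> str:
    """Parse the model's function-call JSON and invoke the matching tool.

    Assumes output of the form {"name": ..., "arguments": {...}}.
    """
    call = json.loads(model_output)
    return TOOLS[call["name"]](**call.get("arguments", {}))

print(dispatch('{"name": "get_weather", "arguments": {"location": "Sofia"}}'))
```

In a real integration, the dispatched result would be fed back to the model as a tool response for the next turn.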