A 9B-parameter Bulgarian language model optimized for function calling, tool use, and Model Context Protocol (MCP) integration. This is a quantized GGUF release of Tucan-9B-v1.0 using weighted imatrix quantization, with variants suited to different hardware configurations.

This model specializes in Bulgarian-language function calling, structured tool use, and MCP-based agent workflows.
When using this model for tool-enabled tasks:
1. **Model Selection**
- Choose an appropriate quantization level based on available hardware
- Recommended: Q4_K_M (5.9GB) for balanced speed and quality
- For resource-constrained environments: Q4_K_S (5.6GB) for optimal size/speed/quality
- For maximum quality: Q6_K (7.7GB) for near-original model performance
2. **Loading the Model**
- Download the desired GGUF quantization from HuggingFace
- Load using llama.cpp, Ollama, LM Studio, or other GGUF-compatible inference engines
- Configure context window and token limits based on your use case
3. **Function Calling Setup**
- Define your tools/functions with clear JSON schemas
- Use structured prompts that specify available tools
- Format function definitions with parameter types and descriptions
- Include examples of expected function call format
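A tool definition with typed parameters, descriptions, and an example of the expected call format might look like the following sketch (the `get_weather` tool and all of its fields are illustrative, not part of the model):

```python
import json

# Illustrative tool definition: each parameter carries a type and a description
# so the model can fill arguments correctly (get_weather is a hypothetical tool).
get_weather_tool = {
    "name": "get_weather",
    "description": "Get current weather for a location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name, e.g. София"},
            "units": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "Temperature units",
            },
        },
        "required": ["location"],
    },
}

# Example of the structured function-call format the model is asked to emit
example_call = {"name": "get_weather", "arguments": {"location": "София", "units": "celsius"}}

print(json.dumps(get_weather_tool, ensure_ascii=False, indent=2))
```

Including `example_call` (or a similar worked example) in the prompt tends to make the model's output format more consistent.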
4. **Prompting for Tool Use**
- Be explicit about available tools at the start of conversations
- Use Bulgarian or multilingual prompts as appropriate
- Request function calls in structured format (JSON preferred)
- Provide clear context about when tools should be invoked
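The prompting advice above can be sketched as a small prompt builder; the Bulgarian system text and the `System/User/Assistant` layout are assumptions for illustration, not a prescribed template for this model:

```python
import json

# Hypothetical tool list, serialized as JSON so the model sees exact schemas
tools = [{
    "name": "get_weather",
    "description": "Get current weather for a location",
    "parameters": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
}]

# Bulgarian system prompt (illustrative wording):
# "You are an assistant with access to tools. When needed, reply only
#  with a JSON function call. Tools: ..."
system_prompt = (
    "Ти си асистент с достъп до инструменти. "
    "Когато е нужно, отговори само с JSON извикване на функция.\n"
    f"Инструменти: {json.dumps(tools, ensure_ascii=False)}"
)

user_turn = "Какво е времето в София?"  # "What's the weather in Sofia?"
prompt = f"{system_prompt}\nUser: {user_turn}\nAssistant:"
print(prompt)
```

Declaring the tools once in the system section keeps every later turn short while still making tool availability explicit from the start of the conversation.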
5. **MCP Integration**
- Configure MCP protocol endpoints if using agent frameworks
- Map model function calls to MCP tool invocations
- Handle tool execution results and feed back to model
- Implement error handling for failed tool executions
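One way to map model-emitted function calls to tool executions with error handling is a small dispatcher like the sketch below. This is a generic pattern, not the MCP SDK: in a real MCP setup each registered handler would forward to an MCP tool invocation, and the handler shown here is a stand-in:

```python
import json

def dispatch_tool_call(raw_json, tool_registry):
    """Parse a model-emitted function call and route it to a registered handler.

    Returns a result dict to feed back to the model; failures are captured as
    structured errors rather than raised, so the conversation can continue.
    """
    try:
        call = json.loads(raw_json)
        handler = tool_registry[call["name"]]
        result = handler(**call.get("arguments", {}))
        return {"status": "ok", "result": result}
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        return {"status": "error", "error": str(exc)}

# Stub handler standing in for a real MCP tool invocation
registry = {"get_weather": lambda location: {"location": location, "temp_c": 21}}

ok = dispatch_tool_call('{"name": "get_weather", "arguments": {"location": "София"}}', registry)
bad = dispatch_tool_call("not json", registry)
print(ok)
print(bad)
```

Feeding the returned dict (including error cases) back into the prompt gives the model a chance to retry or explain the failure instead of silently stalling.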
6. **Best Practices**
- Start with a higher-quality (larger) quantization and step down only if you need more speed or lower memory use
- Monitor token usage and context window utilization
- Validate function call outputs before execution
- Provide clear feedback loops between tool results and model responses
- Test Bulgarian language capabilities with native prompts for optimal performance
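"Validate function call outputs before execution" can be as simple as checking model-emitted arguments against the tool's parameter schema before running anything. A minimal stdlib-only sketch (a real setup might use the `jsonschema` package instead):

```python
def validate_arguments(arguments, schema):
    """Check required keys and basic string typing against a JSON-Schema-like
    parameter spec; returns a list of error messages (empty means valid)."""
    props = schema.get("properties", {})
    errors = []
    for key in schema.get("required", []):
        if key not in arguments:
            errors.append(f"missing required parameter: {key}")
    for key, value in arguments.items():
        if key not in props:
            errors.append(f"unexpected parameter: {key}")
        elif props[key].get("type") == "string" and not isinstance(value, str):
            errors.append(f"parameter {key} must be a string")
    return errors

# Hypothetical schema matching the get_weather example used elsewhere
schema = {
    "type": "object",
    "properties": {"location": {"type": "string"}},
    "required": ["location"],
}

print(validate_arguments({"location": "София"}, schema))  # → []
print(validate_arguments({}, schema))  # → ['missing required parameter: location']
```

Rejecting malformed calls here, and feeding the error list back to the model, closes the feedback loop mentioned above.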
| Quantization | Size | Use Case |
|--------------|------|----------|
| IQ3_XXS - Q3_K | 3.9-5.2GB | Lower quality, very constrained resources |
| Q4_K_S | 5.6GB | Optimal size/speed/quality balance |
| Q4_K_M | 5.9GB | Fast, recommended for most use cases |
| Q5_K_M | 6.7GB | Higher quality, moderate resource usage |
| Q6_K | 7.7GB | Near-original quality, higher resource usage |
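As a rough guide, the table above can be turned into a selection helper. Sizes are taken from the table; the 1 GB headroom figure is an assumption to leave room for the KV cache and runtime overhead, and should be tuned for your context length:

```python
# File sizes in GB, from the quantization table above (largest first)
QUANTS = [("Q6_K", 7.7), ("Q5_K_M", 6.7), ("Q4_K_M", 5.9), ("Q4_K_S", 5.6), ("Q3_K", 5.2)]

def pick_quant(available_gb, headroom_gb=1.0):
    """Return the highest-quality quantization that fits the memory budget,
    or None if even the smallest listed quant does not fit."""
    for name, size_gb in QUANTS:
        if size_gb + headroom_gb <= available_gb:
            return name
    return None

print(pick_quant(12.0))  # → "Q6_K"
print(pick_quant(8.0))   # → "Q5_K_M"
print(pick_quant(4.0))   # → None
```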
```python
from llama_cpp import Llama

# Load the Q4_K_M quantization (adjust model_path to the file you downloaded)
llm = Llama(
    model_path="LLMBG-ToolUse-9B-v1.0.i1-Q4_K_M.gguf",
    n_ctx=4096,
    n_threads=8,
)

# Describe the available tools with a JSON schema
tools = [{
    "name": "get_weather",
    "description": "Get current weather for a location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string"}
        }
    }
}]

prompt = f"""Available tools: {tools}
User: Какво е времето в София? (What's the weather in Sofia?)
Assistant:"""

# Ask the model to emit a structured function call
output = llm(prompt, max_tokens=256, stop=["User:"])
print(output["choices"][0]["text"])
```