A conversational AI model optimized for exploration and agent tasks, available in multiple GGUF quantization formats for efficient local inference with llama.cpp or compatible runtimes.
This skill provides access to the AgentCPM-Explore model in GGUF format, which is optimized for use with llama.cpp and compatible runtimes. The model is based on openbmb/AgentCPM-Explore and has been quantized by mradermacher to enable efficient local inference across various hardware configurations.
When a user requests to use this model, follow these steps:
1. **Determine user requirements:**
- Ask about available hardware (RAM, VRAM, CPU/GPU)
- Determine quality vs. speed preference
- Identify the intended use case (conversational AI, agent tasks, exploration)
2. **Recommend appropriate quantization:**
- For best quality with sufficient resources (8GB+ RAM): Q8_0 (4.8 GB)
- For balanced quality and speed (6GB+ RAM): Q6_K (3.7 GB) or Q5_K_M (3.3 GB)
- For faster inference (4GB+ RAM): Q4_K_M (2.8 GB) or Q4_K_S (2.7 GB)
- For resource-constrained systems (3GB+ RAM): IQ4_XS (2.6 GB) or Q3_K_L (2.5 GB)
- For minimal systems (2GB+ RAM): Q3_K_M (2.3 GB), Q3_K_S (2.2 GB), or Q2_K (1.9 GB)
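The RAM thresholds above can be sketched as a small helper. This is an illustrative function, not part of the model card; the `suggest_quant` name and the exact cutoffs are assumptions taken from the list above.

```shell
# Hypothetical helper: map available RAM (GB) to a suggested quant,
# mirroring the thresholds in the recommendation list above.
suggest_quant() {
  ram_gb=$1
  if   [ "$ram_gb" -ge 8 ]; then echo "Q8_0"      # best quality
  elif [ "$ram_gb" -ge 6 ]; then echo "Q6_K"      # balanced
  elif [ "$ram_gb" -ge 4 ]; then echo "Q4_K_M"    # faster inference
  elif [ "$ram_gb" -ge 3 ]; then echo "IQ4_XS"    # resource-constrained
  else                           echo "Q2_K"      # minimal systems
  fi
}

suggest_quant 4   # → Q4_K_M
```

Treat the output as a starting point; actual headroom also depends on context length and what else is running on the machine.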
3. **Provide download instructions:**
- Share the appropriate download link from: https://huggingface.co/mradermacher/AgentCPM-Explore-GGUF
- Format: `AgentCPM-Explore.[QUANT_TYPE].gguf`
- Alternative: Weighted/imatrix quants available at https://huggingface.co/mradermacher/AgentCPM-Explore-i1-GGUF
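A download along the lines of step 3 might look like the following sketch. The `resolve/main` URL pattern is the standard Hugging Face direct-download form; the chosen quant here (`Q4_K_M`) is just an example.

```shell
# Build the direct-download URL for a chosen quant of this repo.
REPO="mradermacher/AgentCPM-Explore-GGUF"
QUANT="Q4_K_M"   # substitute the quant selected in step 2
URL="https://huggingface.co/${REPO}/resolve/main/AgentCPM-Explore.${QUANT}.gguf"

# Either fetch directly:
#   wget "$URL"
# or use the Hugging Face CLI:
#   huggingface-cli download "$REPO" "AgentCPM-Explore.${QUANT}.gguf" --local-dir .
echo "$URL"
```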
4. **Guide model setup:**
- For llama.cpp: `./llama-cli -m /path/to/AgentCPM-Explore.[QUANT].gguf -p "Your prompt here"` (older builds use `./main` instead of `llama-cli`)
- For Ollama: Create a Modelfile, import with `ollama create agentcpm-explore -f Modelfile`
- For LM Studio or Jan: Import the GGUF file through the UI
- If the user is unsure how to work with GGUF files, refer them to one of TheBloke's READMEs, which cover general GGUF usage (including concatenating multi-part files), e.g.: https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF
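The Ollama path in step 4 can be sketched as follows. The Modelfile contents and the `Q4_K_M` file name are illustrative assumptions; point `FROM` at whichever quant was actually downloaded.

```shell
# Write a minimal Ollama Modelfile pointing at the downloaded GGUF.
cat > Modelfile <<'EOF'
FROM ./AgentCPM-Explore.Q4_K_M.gguf
PARAMETER temperature 0.7
EOF

# Then import and run (requires a working Ollama install):
#   ollama create agentcpm-explore -f Modelfile
#   ollama run agentcpm-explore
cat Modelfile
```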
5. **Configure inference parameters:**
- Suggest an appropriate context length (start with the model's default; reduce it to lower memory use)
- Recommend temperature settings based on use case (0.7-0.9 for conversational, 0.3-0.5 for focused tasks)
- Advise on batch size and thread count based on hardware
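Putting the step 5 parameters together, an invocation might look like this sketch. The flags (`-m`, `-c`, `-t`, `--temp`, `-p`) are standard `llama-cli` options; the model path and the specific values are illustrative, and the command is only echoed here rather than executed.

```shell
# Assemble an example llama.cpp command with explicit inference parameters.
MODEL=/path/to/AgentCPM-Explore.Q4_K_M.gguf          # placeholder path
THREADS=$(nproc 2>/dev/null || echo 4)               # match thread count to CPU cores

# -c sets context length; --temp 0.7 suits conversational use per step 5.
CMD="./llama-cli -m $MODEL -c 4096 -t $THREADS --temp 0.7 -p \"Your prompt here\""
echo "$CMD"
```

For focused agent tasks, lowering `--temp` into the 0.3–0.5 range (as noted above) generally gives more deterministic output.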
6. **Usage recommendations:**
- Model is designed for conversational AI and agent-based exploration tasks
- Licensed under Apache 2.0 (permissive for commercial use)
- Best suited for English language tasks
- Can handle multi-turn conversations and agentic workflows
Lower quality quants (Q2_K, Q3_K_S) trade accuracy for speed and smaller file size. Higher quality quants (Q6_K, Q8_0) preserve model capabilities but require more resources.
Recommended starting points: Q4_K_M for a good balance of size, speed, and quality; Q8_0 when quality matters most and resources allow.