A conversational AI model optimized for exploration and agent tasks, available in multiple GGUF quantization formats for efficient local inference with llama.cpp or compatible runtimes.
This skill provides access to the AgentCPM-Explore model in GGUF format, which is optimized for use with llama.cpp and compatible runtimes. The model is based on openbmb/AgentCPM-Explore and has been quantized by mradermacher to enable efficient local inference across various hardware configurations.
When a user requests to use this model, follow these steps:
1. **Determine user requirements:**
- Ask about available hardware (RAM, VRAM, CPU/GPU)
- Determine quality vs. speed preference
- Identify the intended use case (conversational AI, agent tasks, exploration)
2. **Recommend appropriate quantization:**
- For best quality with sufficient resources (8GB+ RAM): Q8_0 (4.8 GB)
- For balanced quality and speed (6GB+ RAM): Q6_K (3.7 GB) or Q5_K_M (3.3 GB)
- For faster inference (4GB+ RAM): Q4_K_M (2.8 GB) or Q4_K_S (2.7 GB)
- For resource-constrained systems (3GB+ RAM): IQ4_XS (2.6 GB) or Q3_K_L (2.5 GB)
- For minimal systems (2GB+ RAM): Q3_K_M (2.3 GB), Q3_K_S (2.2 GB), or Q2_K (1.9 GB)
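The RAM thresholds above can be sketched as a small helper. This is an illustrative function, not part of the model card; the `suggest_quant` name and the exact cutoffs are assumptions taken from the list above.

```shell
# Hypothetical helper: map available RAM (GB) to a suggested quant,
# mirroring the thresholds in the recommendation list above.
suggest_quant() {
  ram_gb=$1
  if   [ "$ram_gb" -ge 8 ]; then echo "Q8_0"      # best quality
  elif [ "$ram_gb" -ge 6 ]; then echo "Q6_K"      # balanced
  elif [ "$ram_gb" -ge 4 ]; then echo "Q4_K_M"    # faster inference
  elif [ "$ram_gb" -ge 3 ]; then echo "IQ4_XS"    # resource-constrained
  else                           echo "Q2_K"      # minimal systems
  fi
}

suggest_quant 4   # → Q4_K_M
```

Treat the output as a starting point; actual headroom also depends on context length and what else is running on the machine.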
3. **Provide download instructions:**
- Share the appropriate download link from: https://huggingface.co/mradermacher/AgentCPM-Explore-GGUF
- Format: `AgentCPM-Explore.[QUANT_TYPE].gguf`
- Alternative: Weighted/imatrix quants available at https://huggingface.co/mradermacher/AgentCPM-Explore-i1-GGUF
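A download along the lines of step 3 might look like the following sketch. The `resolve/main` URL pattern is the standard Hugging Face direct-download form; the chosen quant here (`Q4_K_M`) is just an example.

```shell
# Build the direct-download URL for a chosen quant of this repo.
REPO="mradermacher/AgentCPM-Explore-GGUF"
QUANT="Q4_K_M"   # substitute the quant selected in step 2
URL="https://huggingface.co/${REPO}/resolve/main/AgentCPM-Explore.${QUANT}.gguf"

# Either fetch directly:
#   wget "$URL"
# or use the Hugging Face CLI:
#   huggingface-cli download "$REPO" "AgentCPM-Explore.${QUANT}.gguf" --local-dir .
echo "$URL"
```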
4. **Guide model setup:**
- For llama.cpp: `./llama-cli -m /path/to/AgentCPM-Explore.[QUANT].gguf -p "Your prompt here"` (older builds use `./main` instead of `llama-cli`)
- For Ollama: Create a Modelfile, import with `ollama create agentcpm-explore -f Modelfile`
- For LM Studio or Jan: Import the GGUF file through the UI
- If the user is unsure how to work with GGUF files, refer them to one of TheBloke's READMEs, which cover general GGUF usage (including concatenating multi-part files), e.g.: https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF
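The Ollama path in step 4 can be sketched as follows. The Modelfile contents and the `Q4_K_M` file name are illustrative assumptions; point `FROM` at whichever quant was actually downloaded.

```shell
# Write a minimal Ollama Modelfile pointing at the downloaded GGUF.
cat > Modelfile <<'EOF'
FROM ./AgentCPM-Explore.Q4_K_M.gguf
PARAMETER temperature 0.7
EOF

# Then import and run (requires a working Ollama install):
#   ollama create agentcpm-explore -f Modelfile
#   ollama run agentcpm-explore
cat Modelfile
```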
5. **Configure inference parameters:**
- Suggest an appropriate context length (start with the model's default; reduce it to lower memory use)
- Recommend temperature settings based on use case (0.7-0.9 for conversational, 0.3-0.5 for focused tasks)
- Advise on batch size and thread count based on hardware
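Putting the step 5 parameters together, an invocation might look like this sketch. The flags (`-m`, `-c`, `-t`, `--temp`, `-p`) are standard `llama-cli` options; the model path and the specific values are illustrative, and the command is only echoed here rather than executed.

```shell
# Assemble an example llama.cpp command with explicit inference parameters.
MODEL=/path/to/AgentCPM-Explore.Q4_K_M.gguf          # placeholder path
THREADS=$(nproc 2>/dev/null || echo 4)               # match thread count to CPU cores

# -c sets context length; --temp 0.7 suits conversational use per step 5.
CMD="./llama-cli -m $MODEL -c 4096 -t $THREADS --temp 0.7 -p \"Your prompt here\""
echo "$CMD"
```

For focused agent tasks, lowering `--temp` into the 0.3–0.5 range (as noted above) generally gives more deterministic output.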
6. **Usage recommendations:**
- Model is designed for conversational AI and agent-based exploration tasks
- Licensed under Apache 2.0 (permissive for commercial use)
- Best suited for English language tasks
- Can handle multi-turn conversations and agentic workflows
Lower quality quants (Q2_K, Q3_K_S) trade accuracy for speed and smaller file size. Higher quality quants (Q6_K, Q8_0) preserve model capabilities but require more resources.
Recommended starting points: Q4_K_M for a good balance of size, speed, and quality; Q8_0 when quality matters most and resources allow.