Quantized GGUF versions of DeepSeek Coder 6.7B fine-tuned for conversational interactions and function calling. The model uses OpenAI-compatible syntax and can intelligently return structured function calls when appropriate. Multiple quantization levels are available, ranging from 2.36GB to 6.67GB.
This model was created by fine-tuning deepseek-coder-6.7b on the Open Assistant dataset, followed by additional training on function calling data. It provides code generation, chat capabilities, and structured function calling in a compact 6.7B parameter model.
**Original Model:** [AIGym/deepseek-coder-6.7b-chat-and-function-calling](https://huggingface.co/AIGym/deepseek-coder-6.7b-chat-and-function-calling)
**Quantization by:** [RichardErkhov](https://github.com/RichardErkhov)
Choose the quantization level based on your hardware constraints and quality requirements:
| Quantization | Size | Use Case |
|-------------|------|----------|
| Q2_K | 2.36GB | Minimal memory, lowest quality |
| IQ3_XS | 2.61GB | Very constrained devices |
| Q3_K_M | 3.07GB | Balance for low-memory systems |
| Q4_K_M | 3.80GB | Recommended for most users |
| Q5_K_M | 4.46GB | Higher quality, moderate size |
| Q6_K | 5.15GB | Near-original quality |
| Q8_0 | 6.67GB | Highest quality quantized |
Download your preferred quantization from the [model repository](https://huggingface.co/RichardErkhov/AIGym_-_deepseek-coder-6.7b-chat-and-function-calling-gguf).
**Recommendation:** Start with Q4_K_M for the best balance of quality and size.
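If you have the Hugging Face CLI installed, one way to fetch a single quantization (here Q4_K_M; check the repository file listing for the exact filename) is:

```bash
huggingface-cli download \
  RichardErkhov/AIGym_-_deepseek-coder-6.7b-chat-and-function-calling-gguf \
  deepseek-coder-6.7b-chat-and-function-calling.Q4_K_M.gguf \
  --local-dir .
```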
**llama.cpp:**
```bash
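# Note: recent llama.cpp builds name this binary llama-cli instead of main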
./main -m deepseek-coder-6.7b-chat-and-function-calling.Q4_K_M.gguf -p "Your prompt here" -n 512
```
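Because the model targets OpenAI-compatible syntax, you can also serve it over a local OpenAI-style HTTP API with llama.cpp's bundled server (a minimal sketch; flag names assume a reasonably recent llama.cpp build):

```bash
./llama-server -m deepseek-coder-6.7b-chat-and-function-calling.Q4_K_M.gguf -c 4096 --port 8080
# OpenAI-compatible clients can then point at http://localhost:8080/v1
```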
**Ollama:**
```bash
echo 'FROM ./deepseek-coder-6.7b-chat-and-function-calling.Q4_K_M.gguf' > Modelfile
ollama create deepseek-coder-function:6.7b -f Modelfile
ollama run deepseek-coder-function:6.7b
```
**LM Studio / Jan / GPT4All:**
Import the GGUF file through the application's model import interface.
This model is trained to work with OpenAI-style function calling syntax. Structure your prompts to include function definitions when you want the model to return structured function calls.
**Example Prompt Structure:**
```
You are a helpful assistant with access to the following functions:
[Function definitions in JSON format]
User: [User query]
```
The model will return function calls in the appropriate format when it determines a function should be invoked.
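As a concrete illustration (the `get_current_weather` function below is purely hypothetical, not something shipped with the model):

```
You are a helpful assistant with access to the following functions:

{"name": "get_current_weather", "description": "Get the current weather for a city", "parameters": {"type": "object", "properties": {"location": {"type": "string"}}, "required": ["location"]}}

User: What's the weather like in Paris right now?
```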
When the model returns a function call, handle it as follows (a minimal sketch appears after the list):
1. Parse the function name and arguments from the model's response
2. Execute the function in your application
3. Return the result to the model as a function response
4. Continue the conversation
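The sketch below walks through steps 1–4 on the application side, assuming the model's reply contains a JSON object of the form `{"name": ..., "arguments": ...}`; the exact output format can vary, so parse defensively. `get_current_weather` and its stub implementation are illustrative only.

```bash
# 1. Parse the function name and arguments from the model's response (requires jq)
MODEL_REPLY='{"name": "get_current_weather", "arguments": {"location": "Paris"}}'
FUNC=$(echo "$MODEL_REPLY" | jq -r '.name')
ARGS=$(echo "$MODEL_REPLY" | jq -c '.arguments')

# 2. Execute the matching function in your application (stub for illustration)
get_current_weather() { echo '{"location": "Paris", "temperature_c": 21}'; }
case "$FUNC" in
  get_current_weather) RESULT=$(get_current_weather "$ARGS") ;;
  *)                   RESULT='{"error": "unknown function"}' ;;
esac

# 3./4. Append $RESULT to the conversation as a function response and send the
#       updated prompt back to the model to continue the conversation.
echo "Function result to feed back: $RESULT"
```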
Benchmark scores reported for the original (unquantized) model:

| Benchmark | Score |
|-----------|-------|
| AI2 Reasoning Challenge (25-shot) | 36.09 |
| HellaSwag (10-shot) | 53.80 |
| MMLU (5-shot) | 38.29 |
| TruthfulQA (0-shot) | 42.83 |
| Winogrande (5-shot) | 57.22 |
| GSM8k (5-shot) | 17.21 |
| **Average** | **40.91** |
**License:** Apache 2.0