IBM Granite 20B model optimized for function calling and tool use, quantized in GGUF format with imatrix weighting for efficient local deployment
This skill provides access to IBM's Granite 20B Function Calling model in GGUF format, packaged for local deployment with llama.cpp and compatible inference engines.
IBM Granite 20B Function Calling is a 20-billion-parameter model trained for reliable function/tool calling. This quantized GGUF release lets it run on consumer hardware while retaining strong performance.
**Key Features:**

- 20 billion parameters, trained by IBM for reliable function/tool calling
- GGUF format with imatrix-weighted quantization for efficient local inference
- Multiple quantization levels to fit different VRAM/RAM budgets
- Runs locally with llama.cpp and compatible inference engines
The model is available in multiple quantization levels that trade off file size, speed, and output quality. Choose one based on your available VRAM/RAM and download it from Hugging Face; the commands below use the Q4_K_M variant as an example:
```bash
# Q4_K_M is a common middle ground between file size and output quality.
wget https://huggingface.co/mradermacher/granite-20b-functioncalling-i1-GGUF/resolve/main/granite-20b-functioncalling.i1-Q4_K_M.gguf
```
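If you prefer the Hugging Face CLI to a raw wget, a sketch like the following should fetch the same file; it assumes the huggingface_hub package (which provides the huggingface-cli tool) is installed.

```bash
# Assumes: pip install -U huggingface_hub
huggingface-cli download mradermacher/granite-20b-functioncalling-i1-GGUF \
  granite-20b-functioncalling.i1-Q4_K_M.gguf \
  --local-dir .
```

Once the file is downloaded, you can run it interactively with llama-cli: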
```bash
# -m: path to the downloaded GGUF file
# --ctx-size: context window in tokens
# --n-gpu-layers: layers to offload to the GPU (set to 0 for CPU-only inference)
./llama-cli -m granite-20b-functioncalling.i1-Q4_K_M.gguf \
  --ctx-size 4096 \
  --n-gpu-layers 35 \
  --prompt "You are a helpful AI assistant with function calling capabilities."
```
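For programmatic tool-calling workflows it is often more convenient to expose the model over HTTP than to use the interactive CLI. A minimal sketch, assuming a llama.cpp build that includes the llama-server binary and the same Q4_K_M file:

```bash
# Serves an OpenAI-compatible API on localhost:8080; adjust --n-gpu-layers
# to match your available VRAM.
./llama-server -m granite-20b-functioncalling.i1-Q4_K_M.gguf \
  --ctx-size 4096 \
  --n-gpu-layers 35 \
  --host 127.0.0.1 \
  --port 8080
```

Clients can then send chat requests to the server's /v1/chat/completions endpoint.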
Structure your prompts to leverage the model's function calling abilities:
```
Available functions:
[insert your function definitions here]

User: What's the weather in San Francisco?
```
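As a more concrete illustration, the sketch below writes a full prompt to a file and feeds it to llama-cli with the -f flag. The get_weather schema is a hypothetical placeholder, not the model's canonical template; consult IBM's model card for the exact function-definition syntax the model was trained on.

```bash
# Hypothetical example: get_weather and its JSON schema are illustrative only.
cat > prompt.txt <<'EOF'
You are a helpful AI assistant with function calling capabilities.

Available functions:
{"name": "get_weather", "description": "Get the current weather for a city",
 "parameters": {"type": "object", "properties": {"city": {"type": "string"}},
  "required": ["city"]}}

User: What's the weather in San Francisco?
EOF

# -f reads the prompt from a file instead of the command line.
./llama-cli -m granite-20b-functioncalling.i1-Q4_K_M.gguf --ctx-size 4096 -f prompt.txt
```

Given a prompt shaped like this, the model is expected to respond with a structured function call (for example, a call to get_weather with city set to "San Francisco") rather than free-form prose.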
The full SKILL.md for this skill can be downloaded from killerskills.ai/api/skills/granite-20b-function-calling-gguf/raw.