Run Microsoft's Phi-4-mini-instruct model locally with full function calling and tool use support through llama-server (llama-cpp-python). This provides a ChatGPT-compatible API for integrating local LLMs with external tools and APIs.
This skill sets up and runs a modified version of Phi-4-mini-instruct that supports OpenAI-style tool/function calling through a local, ChatGPT-compatible API.
Before running this skill, ensure you have a working Python installation with `pip`, plus enough disk space and RAM for the quantized model (the Q4_K_M GGUF used below is a few gigabytes).
First, install the required Python package with server capabilities:
```bash
pip install 'llama-cpp-python[server]'
```
**Important:** The `[server]` extra is required for the `llama-server` command.
Download the Phi-4-mini-instruct GGUF model file, for example from a trusted Hugging Face repository (see the security warning below before choosing a source).
Place the model file in a `models/` directory:
```bash
mkdir -p models
```
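Once downloaded, you can quickly verify the file is where the server expects it. A minimal check, using the same filename as the launch command below:

```shell
MODEL="models/Phi-4-mini-instruct-Q4_K_M-function_calling.gguf"
if [ -f "$MODEL" ]; then
  echo "found: $MODEL"
else
  echo "missing: $MODEL - download it before starting the server"
fi
```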
Launch the server with the following command:
```bash
llama-server \
--model models/Phi-4-mini-instruct-Q4_K_M-function_calling.gguf \
--port 8080 \
--jinja
```
**Flags explained:**
- `--model`: path to the GGUF model file to load.
- `--port`: TCP port the HTTP API listens on (8080 here).
- `--jinja`: enables Jinja chat-template processing, which this model's embedded template requires for tool/function calling.
The server will start and display initialization messages. Once the log reports that it is listening on port 8080, it's ready.
#### Example 1: Function Calling with Python Tool
Test tool/function calling capabilities:
```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
"model": "phi-4-mini-instruct-with-tools",
"tools": [
{
"type": "function",
"function": {
"name": "python",
"description": "Runs code in an ipython interpreter and returns the result of the execution after 60 seconds.",
"parameters": {
"type": "object",
"properties": {
"code": {
"type": "string",
"description": "The code to run in the ipython interpreter."
}
},
"required": ["code"]
}
}
}
],
"messages": [
{
"role": "user",
"content": "Print a hello world message with python."
}
]
}'
```
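The same request can be built and sent from Python instead of curl. A minimal sketch using only the standard library (it assumes the server from the previous step is running on `localhost:8080`, so the actual send is left commented out):

```python
import json

# Same tool schema as the curl example above.
tools = [{
    "type": "function",
    "function": {
        "name": "python",
        "description": "Runs code in an ipython interpreter and returns "
                       "the result of the execution after 60 seconds.",
        "parameters": {
            "type": "object",
            "properties": {
                "code": {
                    "type": "string",
                    "description": "The code to run in the ipython interpreter.",
                }
            },
            "required": ["code"],
        },
    },
}]

payload = {
    "model": "phi-4-mini-instruct-with-tools",
    "tools": tools,
    "messages": [
        {"role": "user", "content": "Print a hello world message with python."}
    ],
}
body = json.dumps(payload)

# To actually send it (requires the server to be running):
#   import urllib.request
#   req = urllib.request.Request(
#       "http://localhost:8080/v1/chat/completions",
#       data=body.encode(), headers={"Content-Type": "application/json"})
#   print(urllib.request.urlopen(req).read().decode())
```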
#### Example 2: Simple Chat Completion
```bash
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "phi-4-mini-instruct-with-tools",
"messages": [
{"role": "system", "content": "You are a helpful assistant"},
{"role": "user", "content": "Explain what function calling means in LLMs"}
]
}'
```
#### Example 3: Code Generation
```bash
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "phi-4-mini-instruct-with-tools",
"messages": [
{"role": "system", "content": "You are a helpful coding assistant"},
{"role": "user", "content": "give me an html hello world document"}
]
}'
```
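When a request like Example 1 triggers a tool call, the assistant message in the response carries a `tool_calls` array instead of plain `content` (this follows the OpenAI chat-completions response format). A sketch of the client-side round trip, with a mocked response and a hypothetical `execute_code` runner standing in for a real sandboxed interpreter:

```python
import json

def handle_response(message, execute_code):
    """Run any requested tool calls and build the follow-up messages."""
    if not message.get("tool_calls"):
        return message.get("content")  # ordinary text reply, nothing to run
    follow_up = [message]  # echo the assistant turn back first
    for call in message["tool_calls"]:
        args = json.loads(call["function"]["arguments"])
        result = execute_code(args["code"])  # hypothetical sandboxed runner
        follow_up.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": str(result),
        })
    # These messages get appended and sent back in the next
    # /v1/chat/completions request so the model can use the result.
    return follow_up

# Mocked assistant message, shaped like a real tool-calling response:
mock = {
    "role": "assistant",
    "tool_calls": [{
        "id": "call_0",
        "type": "function",
        "function": {
            "name": "python",
            "arguments": json.dumps({"code": "print('hello world')"}),
        },
    }],
}
messages = handle_response(mock, execute_code=lambda code: "hello world")
```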
The server provides ChatGPT-compatible endpoints:
- `POST /v1/chat/completions`: chat completions (used in the examples above)
- `POST /v1/completions`: plain text completions
- `GET /v1/models`: lists the loaded model
You can configure AI coding assistants to use this local endpoint instead of OpenAI's API. Most tools support custom OpenAI-compatible endpoints.
Example configuration:
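The exact setting names vary by tool, but many tools and the official OpenAI client libraries read these environment variables, so a shell configuration like the following is a common starting point (the key value is a placeholder, since llama-server does not validate API keys unless started with `--api-key`):

```shell
# Point OpenAI-compatible tools at the local server instead of api.openai.com.
export OPENAI_BASE_URL="http://localhost:8080/v1"
# Placeholder value; the local server accepts any key by default.
export OPENAI_API_KEY="sk-local-placeholder"
```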
**Security Warning:** This model demonstrates a proof-of-concept for supply chain attacks leveraging poisoned chat templates. Only use models from trusted sources. See the full context at: https://www.pillar.security/blog/llm-backdoors-at-the-inference-level-the-threat-of-poisoned-templates
**Resource Requirements:** Phi-4-mini-instruct has roughly 3.8B parameters; the Q4_K_M quantization is a file of roughly 2-3 GB. Expect to need around 4 GB of free RAM for CPU-only inference; a GPU is optional but speeds up generation considerably.
**Troubleshooting:**
- `llama-server: command not found`: reinstall with the `[server]` extra and check that your Python scripts directory is on `PATH`.
- Port already in use: pass a different value to `--port`.
- Model fails to load: verify the path and filename passed to `--model`.
- Tool calls come back as plain text: make sure the server was started with `--jinja`.
**Customization:**
- `--host`: bind address (defaults to localhost; change to expose the server on your network).
- `--ctx-size`: context window size in tokens.
- `--n-gpu-layers`: number of layers to offload to the GPU, if one is available.
This setup uses Microsoft's Phi-4-mini-instruct model. Ensure compliance with the model's license terms and with any terms attached to the specific GGUF distribution you download.