Fine-tuned 270M parameter Gemma model specialized for simple function/tool calling tasks with conversational capabilities
A fine-tuned FunctionGemma variant of Google's 270M-parameter Gemma model, optimized for simple tool-calling and function-invocation tasks. This compact model delivers efficient function calling while retaining conversational ability.
This skill lets AI agents use a lightweight, specialized model for tool calling and function invocation. The FunctionGemma 270M model is useful when you need function calling without the overhead of larger models, making it well suited to resource-constrained environments and high-throughput scenarios.
When implementing tool calling with this model, follow these steps:
1. **Install Required Dependencies**
- Ensure `transformers` library is installed (version 4.57.6 or later recommended)
- Ensure PyTorch is available (2.9.0+ with CUDA support for GPU acceleration)
- Install supporting libraries: `datasets`, `tokenizers`
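Before loading the model, it can be handy to verify the dependencies programmatically. The sketch below uses only the standard library; the version comparison is a naive numeric one, so for strict PEP 440 semantics use `packaging.version` instead.

```python
from importlib import metadata


def check_dependency(name: str, minimum: str) -> bool:
    """Return True if `name` is installed at version >= `minimum`.

    Naive tuple comparison on the first three numeric components;
    good enough for a startup sanity check.
    """
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return False

    def parts(version: str) -> tuple:
        return tuple(int(p) for p in version.split(".")[:3] if p.isdigit())

    return parts(installed) >= parts(minimum)


# e.g. check_dependency("transformers", "4.57.6")
```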
2. **Load the Model**
- Use the HuggingFace `pipeline` interface for simplest integration
- Model identifier: `AliEsaote/functiongemma-270m-it-simple-tool-calling`
- Specify `device="cuda"` for GPU or `device="cpu"` for CPU execution
- Task type: `text-generation`
3. **Format Input Messages**
- Use chat format with role-based messages
- Structure: `[{"role": "user", "content": "<your prompt>"}]`
- The model expects conversational input format
- Include tool/function definitions in the user message when needed
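One way to embed tool definitions in the user message is to serialize them as JSON ahead of the question. The `get_weather` schema below is hypothetical and the JSON-schema layout is an assumption; check the model card for the exact format the fine-tune was trained on.

```python
import json

# Hypothetical tool definition in JSON-schema style (an assumption, not the
# model's documented format).
GET_WEATHER = {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def build_messages(question: str, tools: list) -> list:
    """Prepend serialized tool definitions to the user message."""
    tool_block = json.dumps(tools, indent=2)
    content = f"Available tools:\n{tool_block}\n\n{question}"
    return [{"role": "user", "content": content}]


messages = build_messages("What's the weather in San Francisco?", [GET_WEATHER])
```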
4. **Configure Generation Parameters**
- Set `max_new_tokens` based on expected response length (128-512 recommended)
- Use `return_full_text=False` to get only the generated response
- Consider temperature and top_p for controlling randomness if needed
- For function calls, lower temperature (0.1-0.3) may improve accuracy
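A reasonable starting configuration, collecting the parameters above into one kwargs dict (the specific values are suggestions, not tuned settings). Note that `temperature` and `top_p` only take effect when sampling is enabled via `do_sample=True`.

```python
# Suggested generation settings for function-calling prompts; the pipeline
# forwards these to `model.generate`.
FUNCTION_CALL_KWARGS = {
    "max_new_tokens": 256,        # within the 128-512 range above
    "return_full_text": False,    # return only the newly generated text
    "do_sample": True,            # required for temperature/top_p to apply
    "temperature": 0.2,           # low temperature favours exact call syntax
    "top_p": 0.9,
}

# Usage: output = generator(messages, **FUNCTION_CALL_KWARGS)[0]
```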
5. **Parse Function Calls**
- Extract function/tool invocations from the generated text
- The model outputs structured function calls in the format it was trained on; inspect a few generations to confirm the exact syntax
- Parse the output to identify function names and parameters
- Execute the requested functions and provide results back to the model if needed
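A minimal parser sketch, assuming the model emits a JSON object of the form `{"name": ..., "arguments": {...}}`. That output format is an assumption; the actual fine-tune may use a different syntax, so adapt the extraction after inspecting real generations.

```python
import json
import re


def parse_function_call(text: str):
    """Extract a function call from generated text.

    Assumes a JSON object like {"name": ..., "arguments": {...}} appears in
    the output (an assumption about this fine-tune). Returns a
    (name, arguments) tuple, or None if no parseable call is found.
    """
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if not match:
        return None
    try:
        call = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    if isinstance(call, dict) and "name" in call:
        return call["name"], call.get("arguments", {})
    return None
```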
6. **Handle Multi-Turn Conversations**
- Maintain conversation history by appending assistant responses
- Format: `[{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}, {"role": "user", "content": "..."}]`
- Include function execution results as part of the conversation flow
7. **Optimize for Production**
- Consider model quantization for faster inference if needed
- Batch multiple requests when possible
- Cache the loaded model to avoid repeated initialization
- Monitor memory usage (270M parameters ≈ 1GB RAM in float32, roughly half that in float16)
```python
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="AliEsaote/functiongemma-270m-it-simple-tool-calling",
    device="cuda",  # or "cpu"
)

question = "What's the weather in San Francisco? Use the get_weather function."
output = generator(
    [{"role": "user", "content": question}],
    max_new_tokens=128,
    return_full_text=False,
)[0]
print(output["generated_text"])
```
```python
# Multi-turn flow: append each turn to the message list.
conversation = [
    {"role": "user", "content": "I need to schedule a meeting for tomorrow at 2pm. Use the calendar API."}
]
response1 = generator(conversation, max_new_tokens=128, return_full_text=False)[0]
conversation.append({"role": "assistant", "content": response1["generated_text"]})

# Feed the result of executing the requested function back as a user turn.
conversation.append({"role": "user", "content": "Function executed successfully. Meeting ID: 12345"})
response2 = generator(conversation, max_new_tokens=128, return_full_text=False)[0]
print(response2["generated_text"])
```