Fine-tuned 270M parameter Gemma model specialized for simple function/tool calling tasks with conversational capabilities
A fine-tuned FunctionGemma variant of Google's 270M-parameter Gemma model, optimized for simple tool-calling and function-invocation tasks. This compact model delivers efficient function calling while retaining conversational ability.
This skill lets AI agents use a lightweight, specialized model for tool calling and function invocation. The FunctionGemma 270M model is useful when you need function calling without the overhead of larger models, making it well suited to resource-constrained environments and high-throughput scenarios.
When implementing tool calling with this model, follow these steps:
1. **Install Required Dependencies**
- Ensure `transformers` library is installed (version 4.57.6 or later recommended)
- Ensure PyTorch is available (2.9.0+ with CUDA support for GPU acceleration)
- Install supporting libraries: `datasets`, `tokenizers`
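Before loading the model, it can be handy to verify the dependencies programmatically. The sketch below uses only the standard library; the version comparison is a naive numeric one, so for strict PEP 440 semantics use `packaging.version` instead.

```python
from importlib import metadata


def check_dependency(name: str, minimum: str) -> bool:
    """Return True if `name` is installed at version >= `minimum`.

    Naive tuple comparison on the first three numeric components;
    good enough for a startup sanity check.
    """
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return False

    def parts(version: str) -> tuple:
        return tuple(int(p) for p in version.split(".")[:3] if p.isdigit())

    return parts(installed) >= parts(minimum)


# e.g. check_dependency("transformers", "4.57.6")
```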
2. **Load the Model**
- Use the HuggingFace `pipeline` interface for simplest integration
- Model identifier: `AliEsaote/functiongemma-270m-it-simple-tool-calling`
- Specify `device="cuda"` for GPU or `device="cpu"` for CPU execution
- Task type: `text-generation`
3. **Format Input Messages**
- Use chat format with role-based messages
- Structure: `[{"role": "user", "content": "<your prompt>"}]`
- The model expects conversational input format
- Include tool/function definitions in the user message when needed
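One way to embed tool definitions in the user message is to serialize them as JSON ahead of the question. The `get_weather` schema below is hypothetical and the JSON-schema layout is an assumption; check the model card for the exact format the fine-tune was trained on.

```python
import json

# Hypothetical tool definition in JSON-schema style (an assumption, not the
# model's documented format).
GET_WEATHER = {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def build_messages(question: str, tools: list) -> list:
    """Prepend serialized tool definitions to the user message."""
    tool_block = json.dumps(tools, indent=2)
    content = f"Available tools:\n{tool_block}\n\n{question}"
    return [{"role": "user", "content": content}]


messages = build_messages("What's the weather in San Francisco?", [GET_WEATHER])
```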
4. **Configure Generation Parameters**
- Set `max_new_tokens` based on expected response length (128-512 recommended)
- Use `return_full_text=False` to get only the generated response
- Consider temperature and top_p for controlling randomness if needed
- For function calls, lower temperature (0.1-0.3) may improve accuracy
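A reasonable starting configuration, collecting the parameters above into one kwargs dict (the specific values are suggestions, not tuned settings). Note that `temperature` and `top_p` only take effect when sampling is enabled via `do_sample=True`.

```python
# Suggested generation settings for function-calling prompts; the pipeline
# forwards these to `model.generate`.
FUNCTION_CALL_KWARGS = {
    "max_new_tokens": 256,        # within the 128-512 range above
    "return_full_text": False,    # return only the newly generated text
    "do_sample": True,            # required for temperature/top_p to apply
    "temperature": 0.2,           # low temperature favours exact call syntax
    "top_p": 0.9,
}

# Usage: output = generator(messages, **FUNCTION_CALL_KWARGS)[0]
```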
5. **Parse Function Calls**
- Extract function/tool invocations from the generated text
- The model outputs structured function calls in the format it was trained on; inspect a few generations to confirm the exact syntax
- Parse the output to identify function names and parameters
- Execute the requested functions and provide results back to the model if needed
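A minimal parser sketch, assuming the model emits a JSON object of the form `{"name": ..., "arguments": {...}}`. That output format is an assumption; the actual fine-tune may use a different syntax, so adapt the extraction after inspecting real generations.

```python
import json
import re


def parse_function_call(text: str):
    """Extract a function call from generated text.

    Assumes a JSON object like {"name": ..., "arguments": {...}} appears in
    the output (an assumption about this fine-tune). Returns a
    (name, arguments) tuple, or None if no parseable call is found.
    """
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if not match:
        return None
    try:
        call = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    if isinstance(call, dict) and "name" in call:
        return call["name"], call.get("arguments", {})
    return None
```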
6. **Handle Multi-Turn Conversations**
- Maintain conversation history by appending assistant responses
- Format: `[{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}, {"role": "user", "content": "..."}]`
- Include function execution results as part of the conversation flow
7. **Optimize for Production**
- Consider model quantization for faster inference if needed
- Batch multiple requests when possible
- Cache the loaded model to avoid repeated initialization
- Monitor memory usage (270M parameters ≈ 1GB RAM in float32, roughly half that in float16)
```python
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="AliEsaote/functiongemma-270m-it-simple-tool-calling",
    device="cuda",  # or "cpu"
)

question = "What's the weather in San Francisco? Use the get_weather function."
output = generator(
    [{"role": "user", "content": question}],
    max_new_tokens=128,
    return_full_text=False,
)[0]
print(output["generated_text"])
```
```python
# Multi-turn flow: append each turn to the message list.
conversation = [
    {"role": "user", "content": "I need to schedule a meeting for tomorrow at 2pm. Use the calendar API."}
]
response1 = generator(conversation, max_new_tokens=128, return_full_text=False)[0]
conversation.append({"role": "assistant", "content": response1["generated_text"]})

# Feed the result of executing the requested function back as a user turn.
conversation.append({"role": "user", "content": "Function executed successfully. Meeting ID: 12345"})
response2 = generator(conversation, max_new_tokens=128, return_full_text=False)[0]
print(response2["generated_text"])
```