Official Python library for Groq's REST API with sync/async support, type definitions, and automatic retries for fast LLM inference.
Use the official Groq Python library to access Groq's ultra-fast LLM inference API from any Python 3.9+ application. This skill provides type-safe synchronous and asynchronous clients with automatic retries, error handling, and streaming support.
This skill enables you to:
1. Make synchronous and asynchronous chat completion requests to Groq's API
2. Stream responses for real-time token delivery
3. Handle errors gracefully with typed exceptions
4. Transcribe audio files using Whisper models
5. Configure retries, timeouts, and custom HTTP clients
6. Access raw response headers and metadata
7. Use type-safe request parameters and Pydantic response models
```bash
pip install groq
```
For improved async performance with the aiohttp backend:
```bash
pip install 'groq[aiohttp]'
```
Sign up at [console.groq.com](https://console.groq.com) and generate an API key.
Create a `.env` file (never commit this to version control):
```bash
GROQ_API_KEY=your_api_key_here
```
Install python-dotenv to load environment variables:
```bash
pip install python-dotenv
```
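A minimal sketch of loading the key with python-dotenv before constructing the client:
```python
import os

from dotenv import load_dotenv
from groq import Groq

load_dotenv()  # reads .env in the current directory into os.environ
client = Groq(api_key=os.environ["GROQ_API_KEY"])
```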
Create a client and request a chat completion (the client also reads `GROQ_API_KEY` from the environment if `api_key` is omitted):
```python
import os

from groq import Groq

client = Groq(api_key=os.environ.get("GROQ_API_KEY"))

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant.",
        },
        {
            "role": "user",
            "content": "Explain the importance of low latency LLMs",
        },
    ],
    model="meta-llama/llama-4-scout-17b-16e-instruct",
)
print(chat_completion.choices[0].message.content)
```
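Responses also carry token counts in the OpenAI-compatible `usage` field, which is handy for latency and cost tracking; a quick sketch using the completion from above:
```python
usage = chat_completion.usage
if usage:
    print(f"prompt={usage.prompt_tokens} "
          f"completion={usage.completion_tokens} "
          f"total={usage.total_tokens}")
```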
The async client mirrors the sync API; just `await` each call:
```python
import os
import asyncio

from groq import AsyncGroq

client = AsyncGroq(api_key=os.environ.get("GROQ_API_KEY"))


async def main() -> None:
    chat_completion = await client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "Explain quantum computing in simple terms",
            }
        ],
        model="meta-llama/llama-4-scout-17b-16e-instruct",
    )
    print(chat_completion.choices[0].message.content)


asyncio.run(main())
```
With the `aiohttp` extra installed, pass `DefaultAioHttpClient` to use aiohttp as the HTTP backend:
```python
import os
import asyncio

from groq import AsyncGroq, DefaultAioHttpClient


async def main() -> None:
    async with AsyncGroq(
        api_key=os.environ.get("GROQ_API_KEY"),
        http_client=DefaultAioHttpClient(),
    ) as client:
        chat_completion = await client.chat.completions.create(
            messages=[{"role": "user", "content": "Hello!"}],
            model="meta-llama/llama-4-scout-17b-16e-instruct",
        )
        print(chat_completion.choices[0].message.content)


asyncio.run(main())
```
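The async client pays off when fanning out multiple requests; a sketch using `asyncio.gather` (the prompts are illustrative):
```python
import os
import asyncio

from groq import AsyncGroq


async def main() -> None:
    async with AsyncGroq(api_key=os.environ.get("GROQ_API_KEY")) as client:
        prompts = ["What is Groq?", "What is an LPU?", "Why does latency matter?"]
        tasks = [
            client.chat.completions.create(
                messages=[{"role": "user", "content": p}],
                model="meta-llama/llama-4-scout-17b-16e-instruct",
            )
            for p in prompts
        ]
        # Requests run concurrently over the shared connection pool
        results = await asyncio.gather(*tasks)
        for prompt, completion in zip(prompts, results):
            print(prompt, "->", completion.choices[0].message.content[:80])


asyncio.run(main())
```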
To stream a response and work with the raw server-sent event lines, combine `stream=True` with `.with_streaming_response` (reusing the `client` created earlier):
```python
with client.chat.completions.with_streaming_response.create(
    messages=[{"role": "user", "content": "Tell me a story"}],
    model="meta-llama/llama-4-scout-17b-16e-instruct",
    stream=True,
) as response:
    # Each line is a raw `data: {...}` SSE payload
    for line in response.iter_lines():
        print(line)
```
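For parsed token deltas rather than raw lines, pass `stream=True` directly to `create()`; the call then returns an iterator of chunk objects whose `delta.content` carries each new token:
```python
stream = client.chat.completions.create(
    messages=[{"role": "user", "content": "Tell me a story"}],
    model="meta-llama/llama-4-scout-17b-16e-instruct",
    stream=True,
)
for chunk in stream:
    # delta.content can be None (e.g. role-only or final chunks)
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```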
Transcribe an audio file with a Whisper model:
```python
from pathlib import Path

transcription = client.audio.transcriptions.create(
    model="whisper-large-v3-turbo",
    file=Path("/path/to/audio.mp3"),
)
print(transcription.text)
```
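The endpoint also accepts an open file handle, and Groq's speech-to-text API documents optional parameters such as `response_format` and `language`; a sketch (verify the exact parameter set against the current API reference):
```python
with open("/path/to/audio.mp3", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="whisper-large-v3-turbo",
        file=audio_file,
        response_format="verbose_json",  # includes segment timestamps
        language="en",                   # optional language hint
    )
print(transcription.text)
```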
Catch the SDK's typed exceptions, from most to least specific:
```python
import groq
from groq import Groq

client = Groq()  # reads GROQ_API_KEY from the environment

try:
    chat_completion = client.chat.completions.create(
        messages=[{"role": "user", "content": "Hello"}],
        model="meta-llama/llama-4-scout-17b-16e-instruct",
    )
except groq.APIConnectionError as e:
    print("Network error:", e.__cause__)
except groq.RateLimitError as e:
    # Raised only after the SDK's automatic retries are exhausted
    print("Rate limit hit, back off")
except groq.AuthenticationError as e:
    print("Invalid API key")
except groq.APIStatusError as e:
    # Catch-all for any other non-2xx response
    print(f"API error: {e.status_code} - {e.response}")
```
Configure retries and timeouts on the client, or override them per request with `with_options()`:
```python
import httpx
from groq import Groq

client = Groq(
    max_retries=5,  # default is 2
    timeout=httpx.Timeout(60.0, read=10.0, write=10.0, connect=5.0),
)

# Per-request overrides return a new client view; the original is unchanged
client.with_options(timeout=30.0, max_retries=3).chat.completions.create(
    messages=[{"role": "user", "content": "Quick question"}],
    model="meta-llama/llama-4-scout-17b-16e-instruct",
)
```
Use `.with_raw_response` to read headers before parsing the body:
```python
response = client.chat.completions.with_raw_response.create(
    messages=[{"role": "user", "content": "Hello"}],
    model="meta-llama/llama-4-scout-17b-16e-instruct",
)
print(response.headers.get("X-Request-ID"))

completion = response.parse()  # returns the typed ChatCompletion model
print(completion.id)
```
Build multi-turn conversations by appending each turn to the message list:
```python
messages = [
    {"role": "system", "content": "You are a Python expert."},
]

messages.append({"role": "user", "content": "What are list comprehensions?"})
response = client.chat.completions.create(
    messages=messages,
    model="meta-llama/llama-4-scout-17b-16e-instruct",
)

# Keep the assistant's reply in history so follow-ups have context
messages.append({"role": "assistant", "content": response.choices[0].message.content})
messages.append({"role": "user", "content": "Show me an example"})
response = client.chat.completions.create(
    messages=messages,
    model="meta-llama/llama-4-scout-17b-16e-instruct",
)
print(response.choices[0].message.content)
```
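The same pattern extends to an interactive loop; a minimal sketch:
```python
messages = [{"role": "system", "content": "You are a helpful assistant."}]

while True:
    user_input = input("> ")
    if user_input.lower() in {"quit", "exit"}:
        break
    messages.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(
        messages=messages,
        model="meta-llama/llama-4-scout-17b-16e-instruct",
    )
    reply = response.choices[0].message.content
    # Store the assistant turn so the model retains context next round
    messages.append({"role": "assistant", "content": reply})
    print(reply)
```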
All responses are Pydantic models with helper methods:
```python
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello"}],
    model="meta-llama/llama-4-scout-17b-16e-instruct",
)

json_str = response.to_json()  # serialized JSON string
data = response.to_dict()      # plain dict

# model_fields_set tracks which fields the API actually returned
if "id" in response.model_fields_set:
    print(f"Response ID: {response.id}")
```
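The exception raised depends on the HTTP status code: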
| Status Code | Error Type |
|------------|-----------|
| 400 | `BadRequestError` |
| 401 | `AuthenticationError` |
| 403 | `PermissionDeniedError` |
| 404 | `NotFoundError` |
| 422 | `UnprocessableEntityError` |
| 429 | `RateLimitError` |
| ≥500 | `InternalServerError` |
| N/A | `APIConnectionError` |
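All of these inherit from `groq.APIError`, so a single `except groq.APIError` can serve as a catch-all.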
Enable debug logging:
```bash
export GROQ_LOG=debug
```
Or in code:
```python
import logging

# DEBUG on the root logger also surfaces httpx request/response logs
logging.basicConfig(level=logging.DEBUG)
```