Notebook-centric workspace for generating and managing metadata with LLMs (OpenAI, Google Gemini) and MongoDB. It serves as a testing environment for personal and work use. All main logic is implemented in Jupyter notebooks, with reusable functions for data loading, sampling, LLM prompt construction, invocation, and MongoDB upload.
**Always follow these patterns for data handling:**
```python
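import pandas as pd

# Load the CSV and keep at most the first 20 rows as the LLM sample
df = pd.read_csv('data.csv')
sample = df.head(20)
# Serialize the rows as a JSON array of records, keeping non-ASCII text as-is
json_data = sample.to_json(orient='records', force_ascii=False)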
```
**Support both OpenAI and Google Gemini:**
- `model="gpt-4o"`
- `temperature=0.2`
- `max_tokens=2000`
```python
import re

# Greedy match from the first '{' to the last '}' in the raw LLM response
match = re.search(r'\{.*\}', llm_response, re.DOTALL)
if match:
    json_str = match.group(0)
```
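The regex match above only isolates a candidate string; it does not confirm that the string parses. A minimal sketch of extraction plus validation follows, assuming the response contains at most one JSON object. The helper name `extract_json` is hypothetical and not part of the original notebooks.

```python
import json
import re
from typing import Optional


def extract_json(llm_response: str) -> Optional[dict]:
    """Parse the span from the first '{' to the last '}' as JSON, or return None."""
    match = re.search(r'\{.*\}', llm_response, re.DOTALL)
    if not match:
        return None
    try:
        # json.loads raises JSONDecodeError when the span is not valid JSON
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
```

Returning `None` on both "no braces found" and "invalid JSON" lets the calling cell handle the two failure cases with a single check before uploading.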
**Handle failures gracefully:**
```python
try:
    ...  # LLM or MongoDB operation goes here
except Exception as e:
    print(f"Error: {e}")
```
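If the same try/except shape repeats across cells, it could be factored into a small wrapper. This is only a sketch: `run_safely` and its `label` parameter are hypothetical names, and returning `None` on failure is an assumed convention.

```python
from typing import Any, Callable, Optional


def run_safely(operation: Callable[[], Any], label: str) -> Optional[Any]:
    """Call operation(); on any exception, print the error and return None."""
    try:
        return operation()
    except Exception as e:
        print(f"Error during {label}: {e}")
        return None
```

For example, `run_safely(lambda: collection.insert_one(doc), "MongoDB upload")` would print the error and return `None` instead of stopping the notebook.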
**Never hardcode secrets:**
- `OPENAI_API_KEY`
- `GEMINI_API_KEY`
- `MONGODB_URI`
```python
from dotenv import load_dotenv
import os
load_dotenv()
openai_key = os.getenv('OPENAI_API_KEY')
gemini_key = os.getenv('GEMINI_API_KEY')
mongodb_uri = os.getenv('MONGODB_URI')
```
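`os.getenv` returns `None` for an unset variable, so a missing key would otherwise surface only when a client is first used. A minimal sketch of an early check, assuming all three variables listed above are required; the helper `missing_env_vars` is hypothetical.

```python
import os
from typing import Iterable, List, Mapping

REQUIRED_VARS = ("OPENAI_API_KEY", "GEMINI_API_KEY", "MONGODB_URI")


def missing_env_vars(required: Iterable[str] = REQUIRED_VARS,
                     env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]
```

Calling `missing_env_vars()` right after `load_dotenv()` and stopping when the list is non-empty keeps configuration errors in the first cell.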
**Database storage pattern:**
```python
import os
from pymongo import MongoClient

client = MongoClient(os.getenv('MONGODB_URI'))
db = client['airspace']
collection = db['metadata_full']
# metadata_json must be a dict (parsed JSON), not a raw JSON string
result = collection.insert_one(metadata_json)
print(f"Inserted document ID: {result.inserted_id}")
```
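The storage pattern can be combined with the error-handling pattern in one reusable function. The sketch below assumes `collection` is a pymongo collection, or any object with a compatible `insert_one` method; `upload_metadata` is a hypothetical name.

```python
from typing import Any, Optional


def upload_metadata(collection: Any, metadata: dict) -> Optional[Any]:
    """Insert one metadata document and return its ID, or None on failure."""
    if not isinstance(metadata, dict):
        # insert_one expects a dict-like document, so reject raw JSON strings early
        print("Error: metadata must be a dict; parse JSON strings first")
        return None
    try:
        result = collection.insert_one(metadata)
    except Exception as e:
        print(f"Error: {e}")
        return None
    print(f"Inserted document ID: {result.inserted_id}")
    return result.inserted_id
```

Passing the collection in as an argument, rather than creating the client inside the function, also makes the upload step testable with a stand-in object.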
**Organize code in notebook cells:**
- Data loading
- Metadata generation
- Database upload
**Incorporate extra context when provided:**
**Follow these project conventions:**
1. Load environment variables from `.env`
2. Load CSV data with pandas
3. Sample up to 20 rows
4. Convert sample to JSON format
5. Construct LLM prompt with system + human messages
6. Invoke LLM (OpenAI or Gemini)
7. Extract and validate JSON from response
8. Upload metadata to MongoDB
9. Print confirmation with document ID
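Step 5 has no snippet of its own, so here is one way the system and human messages might be assembled. The prompt wording, the `build_messages` name, and the `(role, content)` tuple layout are all assumptions for illustration; the original notebooks may structure the prompt differently.

```python
def build_messages(sample_json: str, extra_context: str = "") -> list:
    """Return [(role, content), ...] with one system and one human message."""
    # Assumed instruction text; replace with the project's actual system prompt
    system = (
        "You generate metadata for tabular datasets. "
        "Respond with a single JSON object and nothing else."
    )
    human = f"Sample rows (JSON records):\n{sample_json}"
    if extra_context:
        # Append caller-supplied context only when it was provided
        human += f"\n\nAdditional context:\n{extra_context}"
    return [("system", system), ("human", human)]
```

The resulting list is plain data, so it can be inspected in a notebook cell before being converted into whatever message objects the chosen OpenAI or Gemini client expects.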