Extract and process Google Cloud Platform pricing data for VMs and persistent disks using Python and Anaconda
A tool for fetching and processing Google Cloud Platform (GCP) pricing data for virtual machines (VMs) and persistent disks.
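This skill helps you work with the GCP Pricing Tool, which extracts pricing information from the GCP API. It handles fetching raw pricing and SKU data, identifying resource types (CPU vs. RAM) in VM pricing, and processing persistent disk pricing.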
Before using the GCP pricing scripts, activate the Anaconda environment (adjust the path to match your Anaconda installation):
```bash
source /home/hsy/s/anaconda/etc/profile.d/conda.sh && conda activate base
```
To create a new conda environment if needed:
```bash
conda create --name <environment_name>
```
To list existing conda environments:
```bash
conda env list
```
The primary script is `get_gcp_vm_pricing.py`.
To download fresh pricing data from the GCP API:
```bash
./get_gcp_vm_pricing.py
```
This generates output files whose names are prefixed with the current date in `YYYYMMDD` format.
To reprocess existing cached raw data files without making new API calls:
```bash
./get_gcp_vm_pricing.py --process
```
The tool works with several data file types:
| File Pattern | Description |
|--------------|-------------|
| `YYYYMMDD-raw-pricing-data.json` | Raw pricing data from GCP API |
| `YYYYMMDD-raw-sku-data.json` | Raw SKU metadata from GCP API |
| `YYYYMMDD-compute-pricing.json` | Processed VM pricing with resource types |
| `YYYYMMDD-pd-pricing.json` | Processed persistent disk pricing |
Where `YYYYMMDD` is the date the data was fetched (e.g., `20260203`).
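
Because the `YYYYMMDD` prefix sorts chronologically, the newest file of each kind can be picked out by name alone. The following is a minimal sketch, assuming all data files sit in a single directory; `latest_file` is a hypothetical helper for illustration and is not part of `get_gcp_vm_pricing.py`.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional

# Matches names such as "20260203-compute-pricing.json".
DATED_NAME = re.compile(r"^(\d{8})-(.+)\.json$")


def latest_file(directory: Path, suffix: str) -> Optional[Path]:
    """Return the newest `YYYYMMDD-<suffix>.json` in `directory`, or None.

    Hypothetical helper: `suffix` is the part after the date prefix,
    for example "compute-pricing" or "raw-sku-data".
    """
    candidates = []
    for path in directory.glob(f"*-{suffix}.json"):
        match = DATED_NAME.match(path.name)
        if not match or match.group(2) != suffix:
            continue  # e.g. "raw-pricing-data" when asked for "pricing-data"
        try:
            fetched = datetime.strptime(match.group(1), "%Y%m%d")
        except ValueError:
            continue  # eight digits that do not form a real date
        candidates.append((fetched, path))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]
```

For example, `latest_file(Path("."), "pd-pricing")` would return the most recent persistent disk pricing file in the current directory, or `None` if no such file exists.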
When working with this codebase:
1. **Always activate Anaconda first** before running Python scripts:
   ```bash
   source /home/hsy/s/anaconda/etc/profile.d/conda.sh && conda activate base
   ```
2. **To fetch new pricing data**, run the main script without flags:
   ```bash
   ./get_gcp_vm_pricing.py
   ```
3. **To reprocess existing data**, use the `--process` flag:
   ```bash
   ./get_gcp_vm_pricing.py --process
   ```
4. **Understand the data flow**:
   - Raw data comes from GCP API → `*-raw-pricing-data.json` and `*-raw-sku-data.json`
   - Processing adds resource type identification → `*-compute-pricing.json`
   - Disk-specific processing → `*-pd-pricing.json`
5. **When analyzing pricing**:
   - Look for the `resourceType` field to distinguish CPU vs RAM pricing
   - Check the date prefix on files to ensure you're using current data
   - Be aware of different disk types when comparing persistent disk pricing
6. **If you need to modify the extraction logic**:
   - Pattern matching for disk types is in the main script
   - Resource type identification logic distinguishes between CPU and RAM
   - Ensure any changes preserve the timestamped output file naming convention
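
As an illustration of the `resourceType` guidance above, processed records could be split by that field before comparing prices. This is only a sketch under stated assumptions: the source does not document the file schema, so the code assumes `*-compute-pricing.json` holds a JSON list of objects that each carry a `resourceType` key. The helper names and the sample values (`"CPU"`, `"RAM"`, `"price"`) are hypothetical.

```python
import json
from collections import defaultdict
from pathlib import Path
from typing import Any


def load_records(path: Path) -> list[dict[str, Any]]:
    """Load a processed pricing file, ASSUMING it is a JSON list of objects."""
    with path.open(encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, list):
        # The real file may be shaped differently; fail loudly rather than guess.
        raise ValueError(f"{path.name}: expected a JSON list of records")
    return data


def group_by_resource_type(
    records: list[dict[str, Any]],
) -> dict[str, list[dict[str, Any]]]:
    """Bucket records by their `resourceType` field.

    Records without the field are collected under "unknown" instead of
    being dropped, so gaps in the identification logic stay visible.
    """
    grouped: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for record in records:
        grouped[record.get("resourceType", "unknown")].append(record)
    return dict(grouped)


if __name__ == "__main__":
    # Hypothetical sample data; the real field values may differ.
    sample = [
        {"resourceType": "CPU", "price": 0.03},
        {"resourceType": "RAM", "price": 0.004},
        {"price": 0.1},
    ]
    for resource_type, items in group_by_resource_type(sample).items():
        print(resource_type, len(items))
```

Keeping an explicit "unknown" bucket is a design choice: if the resource type identification in the main script is modified, a growing "unknown" group is a quick signal that some SKUs are no longer being classified.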