Analyze and troubleshoot GitOps pipelines managed by Flux Operator on Kubernetes clusters using MCP tools
Specialized skill for analyzing and troubleshooting GitOps pipelines managed by Flux Operator on Kubernetes clusters. Uses the `flux-operator-mcp` tools to connect to clusters and fetch Kubernetes and Flux resources.
Flux consists of the following Kubernetes controllers and custom resource definitions (CRDs):
- **Flux Operator**
- **Source Controller**
- **Kustomize Controller**
- **Helm Controller**
- **Notification Controller**
- **Image Automation Controllers**
For a deep understanding of the Flux CRDs, call the `search_flux_docs` tool for each resource kind.
1. **Check Installation Status**: When asked about the Flux installation status, call the `get_flux_instance` tool
2. **Resource Queries**: When asked about Kubernetes or Flux resources, call the `get_kubernetes_resources` tool
3. **API Version Validation**: Don't make assumptions about the `apiVersion` of a Kubernetes or Flux resource; call the `get_kubernetes_api_versions` tool to find the correct one
4. **Cluster Context Management**:
   - When asked to use a specific cluster, call the `get_kubernetes_contexts` tool to find the cluster context
   - Switch to it with the `set_kubernetes_context` tool
   - After switching context to a new cluster, call the `get_flux_instance` tool to determine the Flux Operator status and settings
5. **Flux-Managed Resources**: To determine if a Kubernetes resource is Flux-managed, search the metadata field for `fluxcd` labels
6. **Resource Creation/Updates**: When asked to create or update resources, generate a Kubernetes YAML manifest and call the `apply_kubernetes_resource` tool to apply it
7. **Avoid Unintended Changes**: Avoid applying changes to Flux-managed resources unless explicitly requested
8. **CRD Documentation**: When asked about Flux CRDs, call the `search_flux_docs` tool to get the latest API docs
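The Flux-managed check in rule 5 can be sketched as a small helper that works on a resource already fetched as a dictionary. This is a minimal illustration, not part of the MCP tools: the function names are hypothetical, and the example label keys assume the kustomize-controller naming convention.

```python
def flux_labels(resource: dict) -> dict:
    """Return the labels whose key contains 'fluxcd' (empty dict if none)."""
    labels = resource.get("metadata", {}).get("labels") or {}
    return {key: value for key, value in labels.items() if "fluxcd" in key}


def is_flux_managed(resource: dict) -> bool:
    """Treat a resource as Flux-managed when it carries any fluxcd label."""
    return bool(flux_labels(resource))


# Illustrative manifest; the label keys are an assumption based on the
# kustomize-controller convention.
deployment = {
    "kind": "Deployment",
    "metadata": {
        "name": "podinfo",
        "labels": {
            "app": "podinfo",
            "kustomize.toolkit.fluxcd.io/name": "apps",
            "kustomize.toolkit.fluxcd.io/namespace": "flux-system",
        },
    },
}

print(is_flux_managed(deployment))
print(flux_labels(deployment))
```

When the check returns true, rule 7 applies: leave the resource alone unless a change was explicitly requested.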
When analyzing logs, follow this procedure:
1. **Identify the Pod**: Get the Kubernetes deployment that manages the pods using the `get_kubernetes_resources` tool
2. **Extract Labels**: Look for the `matchLabels` and the container name in the deployment spec
3. **List Pods**: List the pods with the `get_kubernetes_resources` tool using the found `matchLabels` from the deployment spec
4. **Fetch Logs**: Get the logs by calling the `get_kubernetes_logs` tool using the pod name and container name
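Steps 2 and 3 of the log procedure amount to turning the deployment's `matchLabels` into a label selector and collecting the container names. The sketch below assumes the deployment is available as a dictionary and that the selector is passed in the standard `key=value,key=value` form; the helper names are hypothetical, and the exact selector argument expected by `get_kubernetes_resources` is not confirmed here.

```python
def pod_selector(deployment: dict) -> str:
    """Build a comma-separated label selector from spec.selector.matchLabels."""
    match_labels = deployment["spec"]["selector"]["matchLabels"]
    return ",".join(f"{key}={value}" for key, value in sorted(match_labels.items()))


def container_names(deployment: dict) -> list:
    """List the container names declared in the pod template."""
    containers = deployment["spec"]["template"]["spec"]["containers"]
    return [container["name"] for container in containers]


# Illustrative deployment for a Flux controller (field values are examples).
helm_controller = {
    "spec": {
        "selector": {"matchLabels": {"app": "helm-controller"}},
        "template": {"spec": {"containers": [{"name": "manager"}]}},
    }
}

print(pod_selector(helm_controller))
print(container_names(helm_controller))
```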
When troubleshooting a HelmRelease, follow these steps:
1. Use the `get_flux_instance` tool to check the helm-controller deployment status and the apiVersion of the HelmRelease kind
2. Use the `get_kubernetes_resources` tool to get the HelmRelease, then analyze its spec, status, inventory, and events
3. Determine which Flux object is managing the HelmRelease by looking at the annotations; it can be a Kustomization or a ResourceSet
4. If `valuesFrom` is present, get all the referenced ConfigMap and Secret resources
5. Identify the HelmRelease source by looking at the `spec.chartRef` field or, if it is absent, the `spec.chart.spec.sourceRef` field
6. Use the `get_kubernetes_resources` tool to get the HelmRelease source, then analyze the source status and events
7. If the HelmRelease is in a failed state or in progress, it may be due to failures in one of the managed resources found in the inventory
8. Use the `get_kubernetes_resources` tool to get the managed resources and analyze their status
9. If the managed resources are in a failed state, analyze their logs using the `get_kubernetes_logs` tool
10. If any issues were found, create a root cause analysis report for the user
11. If no issues were found, create a report with the current status of the HelmRelease and its managed resources and container images
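Steps 4 and 5 above can be sketched as two lookups on a HelmRelease that has already been fetched as a dictionary. This is an illustrative helper under stated assumptions: the function names are hypothetical, the source reference is read from `spec.chartRef` first and `spec.chart.spec.sourceRef` second, and a reference without a namespace is assumed to live in the HelmRelease's own namespace.

```python
from typing import Optional


def helm_release_source(release: dict) -> Optional[dict]:
    """Return kind, name and namespace of the chart source, or None if unset."""
    spec = release.get("spec", {})
    namespace = release.get("metadata", {}).get("namespace", "")
    ref = spec.get("chartRef")
    if not ref:
        ref = ((spec.get("chart") or {}).get("spec") or {}).get("sourceRef")
    if not ref:
        return None
    return {
        "kind": ref["kind"],
        "name": ref["name"],
        # Assumption: a missing namespace means the HelmRelease's namespace.
        "namespace": ref.get("namespace", namespace),
    }


def values_from_refs(release: dict) -> list:
    """Return (kind, name) pairs for every valuesFrom entry."""
    entries = release.get("spec", {}).get("valuesFrom") or []
    return [(entry["kind"], entry["name"]) for entry in entries]


# Illustrative HelmRelease; names and values are examples only.
release = {
    "metadata": {"name": "podinfo", "namespace": "apps"},
    "spec": {
        "chart": {"spec": {"chart": "podinfo",
                           "sourceRef": {"kind": "HelmRepository", "name": "podinfo"}}},
        "valuesFrom": [{"kind": "ConfigMap", "name": "podinfo-values"},
                       {"kind": "Secret", "name": "podinfo-secrets"}],
    },
}

print(helm_release_source(release))
print(values_from_refs(release))
```

The returned kind, name, and namespace are what the follow-up `get_kubernetes_resources` calls in steps 4 and 6 need.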
When troubleshooting a Kustomization, follow these steps:
1. Use the `get_flux_instance` tool to check the kustomize-controller deployment status and the apiVersion of the Kustomization kind
2. Use the `get_kubernetes_resources` tool to get the Kustomization, then analyze its spec, status, inventory, and events
3. Determine which Flux object is managing the Kustomization by looking at the annotations; it can be another Kustomization or a ResourceSet
4. If `substituteFrom` is present, get all the referenced ConfigMap and Secret resources
5. Identify the Kustomization source by looking at the `sourceRef` field
6. Use the `get_kubernetes_resources` tool to get the Kustomization source, then analyze the source status and events
7. If the Kustomization is in a failed state or in progress, it may be due to failures in one of the managed resources found in the inventory
8. Use the `get_kubernetes_resources` tool to get the managed resources and analyze their status
9. If the managed resources are in a failed state, analyze their logs using the `get_kubernetes_logs` tool
10. If any issues were found, create a root cause analysis report for the user
11. If no issues were found, create a report with the current status of the Kustomization and its managed resources
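Deciding between "failed", "in progress", and healthy in steps 7 to 11 usually comes down to reading the `Ready` condition in the object's status. The sketch below is one possible reading, assuming the common Kubernetes convention that `Ready=True` means healthy, `Ready=False` means failed, and anything else (including a missing condition) means reconciliation is still in progress. The function name is hypothetical.

```python
def ready_state(obj: dict) -> tuple:
    """Classify an object as 'ready', 'failed' or 'in progress' from its Ready condition."""
    conditions = obj.get("status", {}).get("conditions") or []
    for condition in conditions:
        if condition.get("type") == "Ready":
            mapping = {"True": "ready", "False": "failed"}
            state = mapping.get(condition.get("status"), "in progress")
            return state, condition.get("message", "")
    # Assumption: no Ready condition yet means the controller has not reported.
    return "in progress", "no Ready condition reported yet"


# Illustrative Kustomization status; the message text is an example.
kustomization = {
    "kind": "Kustomization",
    "status": {
        "conditions": [
            {"type": "Ready", "status": "False",
             "reason": "HealthCheckFailed",
             "message": "health check failed for Deployment/apps/podinfo"},
        ]
    },
}

state, message = ready_state(kustomization)
print(state, "-", message)
```

A "failed" or "in progress" result is the cue to walk the inventory and inspect the managed resources, as steps 7 to 9 describe.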
When comparing a Flux resource between clusters, follow these steps:
1. Use the `get_kubernetes_contexts` tool to get the cluster contexts
2. Use the `set_kubernetes_context` tool to switch to a specific cluster
3. Use the `get_flux_instance` tool to check the Flux Operator status and settings
4. Use the `get_kubernetes_resources` tool to get the resource you want to compare
5. If the Flux resource contains `valuesFrom` or `substituteFrom`, get all the referenced ConfigMap and Secret resources
6. Repeat steps 2 to 5 for each cluster
When comparing resources, look for differences in the `spec`, `status` and `events`, including the referenced ConfigMaps and Secrets. The Flux resource `spec` represents the desired state and should be the main focus of the comparison, while the status and events represent the current state in the cluster.
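A spec comparison of this kind can be sketched as a recursive walk over two dictionaries that reports the field paths where they differ. This is a minimal illustration with a hypothetical function name; it compares lists as whole values rather than element by element, which is a simplification.

```python
def diff_paths(first, second, prefix=""):
    """Return a list of human-readable differences between two nested values."""
    if isinstance(first, dict) and isinstance(second, dict):
        differences = []
        for key in sorted(set(first) | set(second)):
            path = f"{prefix}.{key}" if prefix else key
            if key not in first:
                differences.append(f"{path}: only in second")
            elif key not in second:
                differences.append(f"{path}: only in first")
            else:
                differences.extend(diff_paths(first[key], second[key], path))
        return differences
    if first != second:
        return [f"{prefix}: {first!r} != {second!r}"]
    return []


# Illustrative specs of the same HelmRelease fetched from two clusters.
staging = {"interval": "5m", "chart": {"spec": {"version": "6.x"}}}
production = {"interval": "10m", "chart": {"spec": {"version": "6.x"}}, "suspend": True}

for line in diff_paths(staging, production, "spec"):
    print(line)
```

Running the same function over the referenced ConfigMaps and Secrets covers the `valuesFrom` and `substituteFrom` inputs as well.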