# Troubleshooting

The first-install failures InferCost users have actually hit, in order of frequency. If none of these match what you're seeing, open an issue with your `kubectl describe costprofile` output attached.
## The Grafana dashboard shows "No data"
Prometheus isn't scraping InferCost. The chart ships a PodMonitor template
(default-enabled) that should discover the controller automatically on
kube-prometheus-stack. Check in order:
- `kubectl get podmonitor -n infercost-system` — if empty, either `prometheus.podMonitor.enabled=false` or the `monitoring.coreos.com/v1` CRD isn't installed (no prometheus-operator).
- Target status in Prometheus: open the Prometheus UI → Status → Targets and search for `infercost`. If the target is listed but "down", Prometheus can see the PodMonitor but can't reach the pod — usually a NetworkPolicy or a port mismatch.
- If running a ServiceMonitor instead (custom prometheus-operator config), set `prometheus.serviceMonitor.enabled=true` and `prometheus.podMonitor.enabled=false` to avoid duplicate scraping.
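In Helm-values form, the two toggles referenced above would look roughly like this (a sketch: the key paths come from the settings quoted above, but the surrounding chart layout is an assumption):

```yaml
# values.yaml fragment (structure assumed from the value paths above)
prometheus:
  podMonitor:
    enabled: true        # default; requires the monitoring.coreos.com/v1 CRDs
  serviceMonitor:
    enabled: false       # enable one or the other, never both
```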
## CostProfile shows $0/hr
The amortization denominator is the life of the hardware in hours, so if any input is zero the output will be zero. Check:
```shell
kubectl describe costprofile <name> | grep -A10 Status
```

And verify the spec has all three:

- `spec.hardware.purchasePriceUSD` (non-zero)
- `spec.hardware.amortizationYears` (non-zero)
- `spec.hardware.gpuCount` (non-zero)
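To see why a zero input zeroes the output, here is a back-of-the-envelope sketch of the amortization math (the exact formula InferCost uses is not shown here; this assumes straight-line amortization over `amortizationYears * 8760` hours, with example numbers):

```shell
# Hypothetical inputs mirroring the spec fields above
price_usd=24000          # spec.hardware.purchasePriceUSD
years=3                  # spec.hardware.amortizationYears

hours=$((years * 8760))                       # lifetime in hours: 26280
cents_per_hour=$((price_usd * 100 / hours))   # integer cents to avoid floats

echo "~${cents_per_hour} cents/hour over ${hours} hours"
# price_usd=0 yields 0 here, matching the $0/hr symptom.
```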
If those look right, check `status.currentPowerDrawWatts`. If it's zero, the DCGM path didn't return readings — see DCGM setup for the four diagnostic states.
## UsageReport has no tokens
The controller can't find inference pods to scrape. It expects pods with the LLMKube model label `inference.llmkube.dev/model`. Without that label the pod is invisible.
```shell
# Find what InferCost is looking for
kubectl get pods --all-namespaces -l inference.llmkube.dev/model

# If you don't use LLMKube, add the label manually
kubectl label pod my-vllm-pod -n engineering inference.llmkube.dev/model=qwen3-coder-30b
```

Also verify the pod is Running with a PodIP assigned. Pending pods are skipped.
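The same label can also be set declaratively in the pod manifest, so it survives pod restarts (a fragment; the label key is from the docs above, the model name is an example):

```yaml
# Pod manifest fragment — equivalent to the kubectl label command above
metadata:
  labels:
    inference.llmkube.dev/model: qwen3-coder-30b
```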
## Wrong backend parsed: llama.cpp metrics named as vLLM (or vice versa)
InferCost defaults to the llama.cpp metrics format. For vLLM pods, set the backend annotation:
```yaml
metadata:
  annotations:
    infercost.ai/backend: vllm
```

Without this, scraping a vLLM pod returns zeros — the metric names don't match. Zero is the safe fail state (no silent misattribution), but you will see flat token counts in the dashboard.
Custom metrics port? Add `infercost.ai/metrics-port: "9000"` as an annotation or label — the default is 8080 for llama.cpp and 8000 for vLLM.
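Putting both annotations together, a vLLM pod serving metrics on a non-default port would carry (fragment only; the port value is an example):

```yaml
metadata:
  annotations:
    infercost.ai/backend: vllm          # parse vLLM metric names, not llama.cpp
    infercost.ai/metrics-port: "9000"   # override the vLLM default of 8000
```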
## Controller pod crashlooping at startup
Most common cause: a malformed `--pricing-file` override. The controller fails loud rather than silently falling back to list prices — if you configured negotiated rates, we owe you a loud error when we can't parse them.
```shell
kubectl logs -n infercost-system deploy/infercost-controller-manager
```

Look for messages starting with `Failed to load pricing file`. Fix the YAML (pricing docs) and redeploy.
## UsageReport reports cost but `byModel` is empty
This is a rollup failure: at least one pod was scrapeable, but the controller couldn't attribute its tokens to a model. It happens when a pod exposes metrics but does not carry the `inference.llmkube.dev/model` label. Check pod labels as in the "no tokens" case above.
## FOCUS export fails with "no UsageReports matched"
The CLI filters by `--namespace` and `--period`. Drop both to see what the cluster actually has:

```shell
kubectl get usagereports --all-namespaces
```

If the list is empty, no UsageReport CR has been created yet — create one (see the CRD reference). The controller populates `status` on the first reconcile, usually within 30 seconds.