Troubleshooting

The first-install failures InferCost users have actually hit, in order of frequency. If none of these match what you're seeing, open an issue with your kubectl describe costprofile output attached.

The Grafana dashboard shows "No data"

Prometheus isn't scraping InferCost. The chart ships a PodMonitor template (default-enabled) that should discover the controller automatically on kube-prometheus-stack. Check in order:

  1. kubectl get podmonitor -n infercost-system — if empty, either prometheus.podMonitor.enabled=false or the monitoring.coreos.com/v1 CRD isn't installed (no prometheus-operator).
  2. Target status in Prometheus: open the Prometheus UI → Status → Targets, search for infercost. If the target is listed but "down", Prometheus can see the PodMonitor but can't reach the pod — usually a NetworkPolicy or a port mismatch.
  3. If you're running a ServiceMonitor instead (custom prometheus-operator config), set prometheus.serviceMonitor.enabled=true and prometheus.podMonitor.enabled=false so the controller isn't scraped twice.
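Assuming the chart's values keys are exactly as named above, the ServiceMonitor switch looks like this as a Helm values override (a sketch; check your chart version's values.yaml for the authoritative keys):

```yaml
# values override: scrape via ServiceMonitor instead of PodMonitor
prometheus:
  podMonitor:
    enabled: false     # turn off the default to avoid duplicate scraping
  serviceMonitor:
    enabled: true
```

Apply it with helm upgrade -f on the existing release.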

CostProfile shows $0/hr

The amortization denominator is the life of the hardware in hours, so if any of the inputs below is zero the reported rate is $0/hr. Check:

kubectl describe costprofile <name> | grep -A10 Status

And verify the spec has all three:

  • spec.hardware.purchasePriceUSD (non-zero)
  • spec.hardware.amortizationYears (non-zero)
  • spec.hardware.gpuCount (non-zero)
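Those three paths together look like this in the spec (values are illustrative, not recommendations):

```yaml
# Sketch of the required CostProfile hardware fields
spec:
  hardware:
    purchasePriceUSD: 240000   # must be non-zero
    amortizationYears: 3       # must be non-zero
    gpuCount: 8                # must be non-zero
```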

If those look right, check status.currentPowerDrawWatts. If it's zero, the DCGM path didn't return readings — see DCGM setup for the four diagnostic states.
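If you want to sanity-check the amortized rate by hand, a quick sketch, assuming straight-line amortization (purchase price spread over amortizationYears × 8760 hours; the exact formula InferCost applies may differ, e.g. per-GPU splits):

```shell
# Hand-check the amortized hourly rate (illustrative numbers, assumed formula)
awk 'BEGIN {
  price = 240000        # spec.hardware.purchasePriceUSD
  years = 3             # spec.hardware.amortizationYears
  hours = years * 8760  # hardware life in hours
  printf "%.2f USD/hr\n", price / hours
}'
# prints 9.13 USD/hr
```

If any of the three inputs is zero here, the arithmetic collapses the same way the controller's does.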

UsageReport has no tokens

The controller can't find any inference pods to scrape. It discovers pods by the LLMKube model label, inference.llmkube.dev/model; without that label a pod is invisible to the scraper.

# Find what InferCost is looking for
kubectl get pods --all-namespaces -l inference.llmkube.dev/model

# If you don't use LLMKube, add the label manually
kubectl label pod my-vllm-pod -n engineering inference.llmkube.dev/model=qwen3-coder-30b

Also verify the pod is Running with a PodIP assigned. Pending pods are skipped.
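A quick way to confirm both conditions at once (pod name and namespace are illustrative):

```shell
# Both fields must be non-empty for the pod to be scraped
kubectl get pod my-vllm-pod -n engineering \
  -o jsonpath='{.status.phase} {.status.podIP}{"\n"}'
```

Expect "Running" followed by an IP; a blank second field means no PodIP has been assigned yet.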

Wrong backend parsed: llama.cpp metrics named as vLLM (or vice versa)

InferCost defaults to the llama.cpp metrics format. For vLLM pods, set the backend annotation:

metadata:
  annotations:
    infercost.ai/backend: vllm

Without this, scraping a vLLM pod returns zeros — the metric names don't match. Zero is the safe fail state (no silent misattribution), but you will see flat token counts in the dashboard.

Custom metrics port? Add infercost.ai/metrics-port: "9000" as an annotation or label — the default is 8080 for llama.cpp and 8000 for vLLM.
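In annotation form, the backend and port overrides sit together in pod metadata (the port value here is illustrative):

```yaml
metadata:
  annotations:
    infercost.ai/backend: vllm         # parse vLLM metric names
    infercost.ai/metrics-port: "9000"  # override the per-backend default port
```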

Controller pod crashlooping at startup

Most common cause: a malformed --pricing-file override. The controller fails loud rather than silently falling back to list prices — if you configured negotiated rates, we owe you a loud error when we can't parse them.

kubectl logs -n infercost-system deploy/infercost-controller-manager

Look for messages starting with Failed to load pricing file. Fix the YAML (pricing docs) and redeploy.

UsageReport reports cost but byModel is empty

This is a rollup gap: at least one pod was scrapeable, but the controller couldn't attribute its tokens to a model. It happens when a pod exposes metrics but doesn't carry the inference.llmkube.dev/model label. Check pod labels as in the "no tokens" case above.

FOCUS export fails with "no UsageReports matched"

The CLI filters by --namespace and --period. Drop both to see what the cluster actually has:

kubectl get usagereports --all-namespaces

If the list is empty, no UsageReport CR has been created yet — create one (see the CRD reference). The controller populates status on the first reconcile, usually within 30 seconds.