Observability

Enhance Your Kubernetes Monitoring with Grafana

Grafana installation and initial dashboard setup, along with fundamentals and best practices for Kubernetes monitoring.

January 1, 2020 Platform Engineering 5 min read

Grafana is a strong fit for Kubernetes observability because it gives teams a flexible way to visualise metrics, logs, alerts, and platform signals in one place. When paired with Prometheus and the rest of a well-shaped monitoring stack, it becomes a reliable way to understand what the cluster is doing and how workloads are behaving.

Why Use Grafana for Kubernetes Monitoring?

Grafana supports a wide range of data sources, including Prometheus, Graphite, Loki, AWS CloudWatch, and many more. It lets teams build dashboards, share views across teams, and turn raw metrics into something operationally useful.

For Kubernetes, that means you can track cluster health, workload behaviour, alert conditions, and service performance from a single interface rather than relying on ad hoc queries every time something starts to drift.

Installing Grafana on Kubernetes

Helm is usually the quickest way to get Grafana running:

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm install grafana grafana/grafana

For a quick evaluation that is enough, but note that the default install is not persistent: if the pod is rescheduled without persistence enabled, dashboards and configuration stored locally are lost (see the Grafana docs page "Deploy Grafana using Helm charts").

For a more realistic setup, enable persistence in the chart and set the storage size that fits the cluster:

helm install grafana grafana/grafana \
  --set persistence.enabled=true \
  --set persistence.size=10Gi

If you need a specific storage class, set persistence.storageClassName as well.
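The same settings can also live in a values file instead of repeated --set flags, which is easier to review and keep in version control. A minimal sketch (the storageClassName value is an example; use whatever class your cluster actually provides):

```yaml
# values.yaml -- persistence settings for the grafana/grafana chart
persistence:
  enabled: true
  size: 10Gi
  storageClassName: standard   # example only; replace with a class available in your cluster
```

Then install with: helm install grafana grafana/grafana -f values.yaml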

Accessing Grafana

Grafana can be exposed through your preferred ingress setup, but for a quick validation path, port-forwarding is often enough.

First, retrieve the default admin password:

kubectl get secret --namespace default grafana -o jsonpath="{.data.admin-password}" | base64 --decode

Then forward the service:

export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=grafana,app.kubernetes.io/instance=grafana" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace default port-forward $POD_NAME 3000

Once forwarded, open http://localhost:3000 and sign in with the admin user and the retrieved password.
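For longer-lived access, the chart can expose Grafana through an Ingress instead of port-forwarding. A sketch of the relevant chart values, where the hostname and ingress class are placeholders for your environment:

```yaml
# values.yaml -- expose Grafana via Ingress (class and hostname are examples)
ingress:
  enabled: true
  ingressClassName: nginx          # example; use your cluster's ingress class
  hosts:
    - grafana.example.internal     # placeholder hostname
```

Apply it with: helm upgrade grafana grafana/grafana -f values.yaml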

Connecting Data Sources

After login, add the data sources that matter for your cluster. In most Kubernetes environments that starts with Prometheus, but it often grows to include Loki, CloudWatch, or other telemetry backends depending on the platform.

Once a data source is connected, you can start building dashboards around pod health, node pressure, API latency, workload scaling, or application-specific metrics you already expose.
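Data sources can also be provisioned declaratively through the chart rather than clicked together in the UI, which keeps the setup reproducible. A minimal sketch, assuming Prometheus runs in the same cluster behind a service named prometheus-server (adjust the URL to your setup):

```yaml
# values.yaml -- provision a Prometheus data source (the URL is an assumption)
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        url: http://prometheus-server.default.svc.cluster.local
        access: proxy
        isDefault: true
```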

Creating Dashboards

Dashboards are built from panels. A simple first panel is a count of running pods:

sum(kube_pod_status_phase{namespace=~".*", phase="Running"})

That query works well in a Stat panel and gives you a quick cluster-level signal; the namespace=~".*" matcher matches every namespace and can be narrowed to focus on a specific one. From there, build out more practical panels around:

  • pod restarts
  • node memory pressure
  • CPU saturation
  • request rates
  • error rates

The most useful dashboards are usually the ones that help someone answer a real operating question quickly rather than the ones that simply look busy.
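As a sketch, the bullet points above map to PromQL along these lines. The metric names assume kube-state-metrics and node-exporter are being scraped, and the windows are illustrative; request and error rates depend on the metrics your own applications expose:

```promql
# Pod restarts per namespace over the last hour (kube-state-metrics)
sum by (namespace) (increase(kube_pod_container_status_restarts_total[1h]))

# Node memory utilisation as a ratio (node-exporter)
1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)

# Approximate per-node CPU saturation: busy time divided by CPU count
sum by (instance) (rate(node_cpu_seconds_total{mode!="idle"}[5m]))
  / count by (instance) (node_cpu_seconds_total{mode="idle"})
```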

Useful Visualisation Types

Grafana offers several panel types that are especially useful in Kubernetes environments:

  • Stat for single-value signals
  • Gauge for thresholds and capacity views
  • Time series for trends over time
  • Table for breakdowns and inventories
  • Logs when Grafana is connected to Loki or another log source

Pick the panel type that matches the question you are trying to answer. A dashboard becomes harder to use when every metric is forced into the same visual format.

Variables and Alerts

Variables let you filter a dashboard by cluster, namespace, workload, or service, which makes the same dashboard reusable without cloning it repeatedly.
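For example, a namespace variable backed by Prometheus is typically defined with a query like the one below (kube_pod_info comes from kube-state-metrics); panels then reference it with a matcher such as namespace=~"$namespace":

```promql
# Dashboard variable query: the list of namespaces known to Prometheus
label_values(kube_pod_info, namespace)
```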

Grafana alerting can help you respond earlier to real problems, but only if the rules are deliberate. Start with signals that actually indicate service or platform degradation rather than alerting on every short spike.
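A deliberate starting rule might fire when a workload is crash-looping rather than on any single restart. As a sketch, a Grafana alert rule could evaluate a condition like this, where the threshold and window are illustrative and should be tuned to your workloads:

```promql
# Fire when any container restarts more than 3 times in 15 minutes
sum by (namespace, pod) (increase(kube_pod_container_status_restarts_total[15m])) > 3
```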

Best Practices for Grafana Dashboards

  • Keep dashboards readable and easy to scan.
  • Focus on signals that support action.
  • Avoid refreshing more frequently than necessary.
  • Prefer a few well-shaped dashboards over a large collection of noisy ones.
  • Group related signals together so incidents are easier to understand.

Conclusion

Grafana remains one of the best ways to visualise Kubernetes telemetry. When it is paired with a sensible metrics and alerting strategy, it gives platform and application teams a much clearer view of what is happening in the cluster and what needs attention next.