Kubernetes Monitoring with Heapster, InfluxDB and Grafana

Arun GuptaDecember 5th, 2016Last Updated: December 5th, 2016

0 50 3 minutes read

Kubernetes provides detailed insights about resource usage in the cluster. This is enabled by using Heapster, cAdvisor, InfluxDB and Grafana.

Heapster is installed as a cluster-wide pod. It gathers monitoring and events data for all pods on each node by talking to the Kubelet. Kubelet itself fetches this data from cAdvisor. This data is persisted in InfluxDB and then visualized using Grafana.

Resource Usage Monitoring provide more details about monitoring resources in Kubernetes.

Heapster, InfluxDB and Grafana are Kubernetes addons. They are enabled by default if you are running the cluster on Amazon Web Services or Google Cloud. But need to be explicitly enabled if the cluster is started using minikube or kops addons.

Start a Kubernetes cluster on Amazon Web Services as:

KUBERNETES_PROVIDER=aws; kube-up.sh

More details about starting a Kubernetes cluster are available at Getting Started with Kubernetes 1.4.

By default, it creates a 4-node Kubernetes cluster in us-west-2a region. More details about the cluster can be seen using the command kubectl cluster-info and it shows the results as:

Kubernetes master is running at https://35.165.6.91
Elasticsearch is running at https://35.165.6.91/api/v1/proxy/namespaces/kube-system/services/elasticsearch-logging
Heapster is running at https://35.165.6.91/api/v1/proxy/namespaces/kube-system/services/heapster
Kibana is running at https://35.165.6.91/api/v1/proxy/namespaces/kube-system/services/kibana-logging
KubeDNS is running at https://35.165.6.91/api/v1/proxy/namespaces/kube-system/services/kube-dns
kubernetes-dashboard is running at https://35.165.6.91/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard
Grafana is running at https://35.165.6.91/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana
InfluxDB is running at https://35.165.6.91/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb
 
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

Note the URL for the Grafana service. Open this URL in a browser window. You’ll be prompted for an invalid certificate warning but this can be safely ignored at this time. In production system, appropriate certificates should be installed.

Then you’ll be prompted for credentials. These can be obtained using kubectl config view command. It will show the output as:

- name: aws_kubernetes-basic-auth
  user:
    password: ZeH4JpQzAtGDEBdb
    username: admin

Use the value from username and password fields.

This shows the default dashboard:

It consists of two dashboards – one for cluster and another for pods.

For this blog, a 4-node Couchbase cluster was created following the steps outlined in Create a Couchbase Cluster using Kubernetes.

A cluster-wide dashboard shows CPU, Memory, Filesystem and Network usage across all the hosts and looks like:

CPU, memory, filesystem and network usage for all nodes may be seen:

Details for each node may be seen by selecting the node:

CPU, memory, filesystem and network usage for each node is displayed:

Pods dashboard shows CPU, memory, filesystem and network usage for each pod:

A different pod may be chosen:

A complete list of all services running in the Kubernetes can be seen using kubectl get services --all-namespaces command. It shows the output as:

kubectl.sh get svc --all-namespaces
NAMESPACE     NAME                       CLUSTER-IP     EXTERNAL-IP        PORT(S)             AGE
default       couchbase-master-service   10.0.70.206    aef06961eb8f3...   8091/TCP            1h
default       kubernetes                 10.0.0.1                    443/TCP             1h
kube-system   elasticsearch-logging      10.0.54.112                 9200/TCP            1h
kube-system   heapster                   10.0.146.18                 80/TCP              1h
kube-system   kibana-logging             10.0.123.37                 5601/TCP            1h
kube-system   kube-dns                   10.0.0.10                   53/UDP,53/TCP       1h
kube-system   kubernetes-dashboard       10.0.146.179                80/TCP              1h
kube-system   monitoring-grafana         10.0.33.81                  80/TCP              1h
kube-system   monitoring-influxdb        10.0.26.251                 8083/TCP,8086/TCP   1h

A complete list of all the pods running in the Kubernetes cluster can be seen using kubectl get pods --all-namespaces. It shows the output as:

kubectl.sh get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default couchbase-master-rc-q9awd 1/1 Running 17 56m
default couchbase-worker-rc-b1qkc 1/1 Running 15 54m
default couchbase-worker-rc-j1c5z 1/1 Running 17 52m
default couchbase-worker-rc-ju7z3 1/1 Running 15 52m
kube-system elasticsearch-logging-v1-18ylh 1/1 Running 0 1h
kube-system elasticsearch-logging-v1-fupap 1/1 Running 0 1h
kube-system fluentd-elasticsearch-ip-172-20-0-94.us-west-2.compute.internal 1/1 Running 0 1h
kube-system fluentd-elasticsearch-ip-172-20-0-95.us-west-2.compute.internal 1/1 Running 0 1h
kube-system fluentd-elasticsearch-ip-172-20-0-96.us-west-2.compute.internal 1/1 Running 15 1h
kube-system fluentd-elasticsearch-ip-172-20-0-97.us-west-2.compute.internal 1/1 Running 17 1h
kube-system heapster-v1.2.0-1374379659-jms8e 4/4 Running 0 1h
kube-system kibana-logging-v1-fcg4b 1/1 Running 3 1h
kube-system kube-dns-v20-wpip4 3/3 Running 0 1h
kube-system kube-proxy-ip-172-20-0-94.us-west-2.compute.internal 1/1 Running 0 1h
kube-system kube-proxy-ip-172-20-0-95.us-west-2.compute.internal 1/1 Running 0 1h
kube-system kube-proxy-ip-172-20-0-96.us-west-2.compute.internal 1/1 Running 15 1h
kube-system kube-proxy-ip-172-20-0-97.us-west-2.compute.internal 1/1 Running 17 1h
kube-system kubernetes-dashboard-v1.4.0-yxxgx 1/1 Running 0 1h
kube-system monitoring-influxdb-grafana-v4-7asy4 2/2 Running 0 1h

Some references:

Reference:

Kubernetes Monitoring with Heapster, InfluxDB and Grafana from our JCG partner Arun Gupta at the Miles to go 3.0 … blog.