Monitoring and Logging

Metrics give you a view of how a system is performing. Through metrics, you can discover if a system is overutilized or underutilized. You can see if a node goes offline or stops reporting metrics. Metrics can help in planning resource upgrades. Kubernetes has a method for displaying and publishing the resource metrics for pods and nodes.

Metrics

  • Raw measurements of resource usage

  • Low-level usage summaries from the operating system

  • Higher-level data tied to Kubernetes

  • Can report total capacity or load

Metrics are raw measurements of resource usage. Metrics can be low-level usage summaries from the operating system or higher-level data from services or applications. Metrics can report either total capacity or current load.

CPU is represented in CPU units, where 1 CPU is equivalent to 1 physical core or 1 virtual CPU. Often, you will see data like 100m, which is “one hundred millicpu” or “one hundred millicores.” The CPU unit is an absolute quantity, not a relative one: 100m is the same amount on a single-core, quad-core, or 64-core machine.
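As a sketch of how these units are used in practice, CPU requests and limits can be set with kubectl set resources; the deployment name my-app below is a hypothetical placeholder.

# Request 100m (0.1 CPU) and limit the container to 500m (half a core)
$ kubectl set resources deployment my-app --requests=cpu=100m --limits=cpu=500m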

Memory is represented using bytes in the power of 2 prefix, which is also known as a binary prefix. Power of 10 is the other common standard. MB stands for megabyte and is in the power of 10 format. MiB stands for mebibyte and is the power of 2 format. 1 KiB (kibibyte) is 1024 bytes, and 1 KB (kilobyte) is 1000 bytes.
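To see how much the two prefixes differ, the same number works out to different byte counts; the shell arithmetic below is just a quick check.

# 256Mi (power of 2) versus 256M (power of 10), in bytes
$ echo $((256 * 1024 * 1024))
268435456
$ echo $((256 * 1000 * 1000))
256000000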

Resource Metrics Pipeline

  • CPU, memory, and other usage metrics are available through the Metrics API.

  • The kubectl top command can display these metrics.

$ kubectl top node

NAME   CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
k8s1   224m         11%    3308Mi          41%
k8s2   93m          9%     2113Mi          60%
k8s3   151m         10%    2845Mi          49%

In Kubernetes, CPU, memory, and other usage metrics are available through the Metrics API. The Metrics API is available to users and to the system. Horizontal Pod Autoscaling uses the Metrics API to determine the scale of pods.
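For example, a Horizontal Pod Autoscaler that scales on CPU utilization reported through the Metrics API can be created with kubectl autoscale; the deployment name my-app is a hypothetical placeholder.

# Keep average CPU around 70%, scaling my-app between 2 and 10 replicas
$ kubectl autoscale deployment my-app --min=2 --max=10 --cpu-percent=70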

The kubectl top command can be run to display the metrics. In the example, kubectl top node is run to see the CPU and memory metrics for all the nodes in a cluster.
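The same command works at the pod level; flag support may vary slightly by kubectl version.

# Per-pod metrics in a namespace
$ kubectl top pod -n kube-system

# Across all namespaces, sorted by CPU usage
$ kubectl top pod --all-namespaces --sort-by=cpu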

Metrics API

  • The Kubernetes Metrics API gives the current resource usage.

  • The metrics are not stored by Kubernetes, so there is no history.

  • The API requires the metrics server to be deployed.

The Kubernetes Metrics API is located at the /apis/metrics.k8s.io path. The Metrics API gives the current resource usage; the data is held in memory and overwritten at each collection cycle. There is no storage in Kubernetes for metrics history, so this data must be stored outside of Kubernetes to create historical metrics. The Kubernetes API server forwards traffic to the metrics server using the kube-aggregator. The metrics server must be installed for the Metrics API to function.
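The API can also be queried directly through the API server; the v1beta1 version shown below is what current metrics server releases serve and may change in later versions.

# Node metrics from the Metrics API
$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes

# Pod metrics for the default namespace
$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/default/pods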

Metrics Server

  • Aggregator of resource usage for the cluster.

  • Collects metrics from the Kubelet Summary API on each node.

The metrics server is an aggregator of resource usage metrics for the cluster. It replaces Heapster and, unlike Heapster, follows the Kubernetes API design, which allows RBAC controls to be placed on the metrics.

The metrics server collects metrics from the Summary API endpoint on the kubelet. It collects this data from every node in the cluster. The metrics server is an add-on package for Kubernetes and is provided through the kube-up.sh script. If the kube-up.sh script was not used in building the cluster, the metrics server can be manually installed.
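A common way to install it manually is to apply the manifest published by the kubernetes-sigs/metrics-server project; the URL below points at the latest upstream release and may differ if you pin a specific version.

# Install the metrics server from the upstream manifest
$ kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# Confirm it is running
$ kubectl get deployment metrics-server -n kube-system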

Full Metrics Pipeline

CPU, memory, and storage metrics from pods and nodes may not be enough metrics data for your use case. If you need more metrics data, you can implement the full metrics pipeline and custom logging.

  • custom.metrics.k8s.io: Implement custom metrics for Kubernetes objects.

  • external.metrics.k8s.io: Implement custom metrics for external resources.

The full metrics pipeline is an extendable option in Kubernetes. The API locations custom.metrics.k8s.io and external.metrics.k8s.io are available for anyone to implement metrics. Kubernetes does not include custom or external metrics.

You can implement custom metrics for Kubernetes objects, such as pods, namespaces, and volumes, at custom.metrics.k8s.io.

You can implement custom metrics for external resources at external.metrics.k8s.io. These could be metrics for an application running on Kubernetes or for an external load balancer.
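Once an adapter that serves these APIs (for example, the Prometheus Adapter) is installed and registered, the two API groups can be listed through the API server; without an adapter, these requests return an error.

# List resources served by the custom and external metrics APIs
$ kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1
$ kubectl get --raw /apis/external.metrics.k8s.io/v1beta1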

Examples of Full Metrics

  • Kubernetes does not include a full metrics pipeline, but you can use a third-party pipeline.

  • Stackdriver offers metrics, logs, monitoring, and alerts.

  • Prometheus offers metrics, monitoring, and alerts.

Kubernetes defines the API for custom and external metrics, but not the data itself. Implementation is left to third-party systems, or you can create your own. Two examples of full metrics pipelines are Stackdriver and Prometheus.

Google Stackdriver is a unified monitoring and logging system that provides alerts, debugging, error reporting, tracing, logging, dashboards, and more. More information can be found at https://cloud.google.com/stackdriver/.

Prometheus is an open-source monitoring solution that is simple to use yet has a powerful query language.

Both solutions support the custom.metrics.k8s.io and external.metrics.k8s.io locations.

Logging Architecture

  • By default, containers write to stdout and stderr.

  • The Docker logging driver converts the log messages into JSON format.

  • The logs are stored in a directory per node.

  • Logs can be viewed using kubectl logs.

Like metrics, logging has some built-in support in Kubernetes. Containers write to stdout and stderr, unless a sidecar (helper container) captures the logs. Kubernetes uses the Docker logging driver to capture the output on stdout and stderr and convert the messages into JSON format. These logs are stored in the /var/log/containers directory on the node running the pod. Logs can be viewed by using the command kubectl logs.
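Some common forms of the command are shown below; the pod and container names are hypothetical placeholders.

# Logs from a pod
$ kubectl logs my-pod

# A specific container in a multi-container pod
$ kubectl logs my-pod -c my-container

# Follow the stream, starting with the last 50 lines
$ kubectl logs -f my-pod --tail=50

# Logs from the previous, crashed container instance
$ kubectl logs my-pod --previous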

Node Logging

  • Nodes log system components to journald if systemd is enabled; otherwise, they are logged to /var/log.

  • Logs should be set up to rotate using logrotate, or disk space will be an issue.

  • There is no clusterwide logging, but logs can be collected in a central location using third-party providers.

Nodes also log data for system components such as the kube-proxy or kubelet. If the operating system uses systemd, the logs are written to journald. Otherwise, logs are written to /var/log. Depending on the installation method, log rotation may not be enabled for the Kubernetes component logs. Kubernetes recommends using the Linux tool logrotate to configure log rotation so that the logs do not fill up the disk space. There is no clusterwide logging, but logs can be collected from the nodes and sent to a central logging system using a third-party provider.
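As a quick check on a node, component logs can be read from journald on systemd hosts, or from the files under /var/log otherwise; exact file names vary by installation.

# On a systemd node: kubelet logs from the last hour
$ journalctl -u kubelet --since "1 hour ago"

# On a non-systemd node: component log files
$ ls /var/log/kube*.log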

Third-Party Logging

  • Logs contain valuable information to help diagnose issues, and to catch issues when they happen or even before they occur.

  • Monitoring and alerts should be set up to alert engineers of issues.

  • Kubernetes does not have built-in monitoring, but relies on tools like the Elastic Stack (ELK stack) from Elastic.co or Google Stackdriver.

Logs contain valuable information that helps diagnose issues and catch them when, or even before, they occur. For this reason, logs should be sent to a location where they can be monitored and acted on. Monitoring and alerts are valuable tools for notifying engineers of issues or potential issues. One example is a disk space alert that notifies a systems engineer when disk usage exceeds 85 percent. This alert would hopefully prevent the disk usage from hitting 100 percent and causing production issues.

Kubernetes does not have built-in monitoring, but relies on tools such as the Elastic Stack (also known as the ELK stack) from Elastic.co, or Google Stackdriver.
