Home United States USA — software Kubernetes Autoscaling with Custom Metrics (Updated)

Kubernetes Autoscaling with Custom Metrics (Updated)

235
0
SHARE

In this post, we’ll do a walkthrough of how Kubernetes autoscaling can be implemented for custom metrics generated by the application.
Join the DZone community and get the full member experience. In the previous blog post about Kubernetes autoscaling, we looked at different concepts and terminologies related to autoscaling such as HPA, cluster auto-scaler, etc. In this post, we’ll do a walkthrough of how Kubernetes autoscaling can be implemented for custom metrics generated by the application. The CPU or RAM consumption of an application may not indicate the right metric for scaling always. For example, if you have a message queue consumer that can handle 500 messages per second without crashing. Once a single instance of this consumer is handling close to 500 messages per second, you may want to scale the application to two instances so that load is distributed across two instances. Measuring CPU or RAM is a fundamentally flawed approach for scaling such an application and you would have to look at a metric that relates more closely to the application’s nature. The number of messages that an instance is processing at a given point in time is a better indicator of the actual load on that application. Similarly, there might be applications where other metrics make more sense and these can be defined using custom metrics in Kubernetes. Originally the metrics were exposed to users through Heapster which queried the metrics from each of Kubelet. The Kubelet, in turn, talked to cAdvisor on localhost and retrieved the node level and pod level metrics. The metrics-server was introduced to replace Heapster and use the Kubernetes API to expose the metrics so that the metrics are available in the same manner in which Kubernetes API is available. The metrics server aims to provides only the core metrics such as memory and CPU of pods and nodes and for all other metrics, you need to build the full metrics pipeline. The mechanisms for building the pipeline and Kubernetes autoscaling remain the same, as we will see in detail in the next few sections. One of the key pieces which enable exposing the metrics via the Kubernetes API layer is the aggregation layer. The aggregation layer allows installing additional APIs which are Kubernetes style into the cluster. This makes the API available like any Kubernetes resource API but the actual serving of the API can be done an external service which could be a pod deployed into the cluster itself (You need to enable the aggregation layer if not done already at cluster level as documented here).

Continue reading...