Koordinator Descheduler: LowNodeLoad Plugin Enhancing Node Balance and Application Stability

The article A Deep Dive into HighNodeUtilization and LowNodeUtilization Plugins with Descheduler discusses the two node utilization plugins of the Kubernetes community descheduler. Both plugins calculate node resource usage from pod requests, so they cannot address node overheating (where a small number of nodes have much higher actual resource usage than most other nodes). This article introduces the LowNodeLoad plugin of the koordinator descheduler, which solves this problem by classifying nodes as high-watermark, normal, or low-watermark based on their actual resource usage.

The koordinator descheduler is compatible with the community descheduler while adding two plugins, MigrationController and LowNodeLoad (added in Koordinator v1.1).

The LowNodeLoad plugin of koordinator is similar to the LowNodeUtilization plugin in that it evicts pods from high-watermark nodes so that they can be rescheduled onto low-watermark nodes. However, unlike LowNodeUtilization, it classifies nodes based on their actual load, which effectively addresses node resource overheating.
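The idea of classifying nodes by actual load can be sketched roughly as follows. This is a minimal illustration in Python, not Koordinator's implementation: the threshold values, the `classify_node` helper, and the sample usage figures are all assumptions made up for this example.

```python
from typing import Dict

# Hypothetical watermark thresholds, as percentages of node capacity.
# These numbers are illustrative only, not Koordinator defaults.
LOW_THRESHOLDS = {"cpu": 30.0, "memory": 40.0}
HIGH_THRESHOLDS = {"cpu": 70.0, "memory": 80.0}


def classify_node(usage_percent: Dict[str, float]) -> str:
    """Classify a node from its measured usage percentages."""
    # Assumed rule: any resource above its high threshold makes the node "high".
    if any(usage_percent[r] > HIGH_THRESHOLDS[r] for r in HIGH_THRESHOLDS):
        return "high"
    # Assumed rule: the node is "low" only if every resource is below its low threshold.
    if all(usage_percent[r] < LOW_THRESHOLDS[r] for r in LOW_THRESHOLDS):
        return "low"
    return "normal"


# Made-up measured usage for three nodes.
nodes = {
    "node-a": {"cpu": 85.0, "memory": 60.0},
    "node-b": {"cpu": 20.0, "memory": 35.0},
    "node-c": {"cpu": 50.0, "memory": 50.0},
}
classes = {name: classify_node(usage) for name, usage in nodes.items()}
print(classes)
```

In this sketch, pods would be candidates for eviction only from nodes classified as "high", and only while nodes classified as "low" exist to receive them.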

The MigrationController plugin provides resource reservation and arbitration mechanisms (interception mechanisms) to ensure application stability when pods are evicted by descheduler.

This article only introduces the LowNodeLoad plugin, while the MigrationController plugin will be discussed in subsequent articles. This article is based on Koordinator v1.4 and Descheduler v0.28.1.

Best Practices for Cost Optimization in Kubernetes

FinOps fits naturally with the goal of reducing costs and increasing efficiency. FinOps is a best practice methodology that combines financial, technical, and business aspects, aiming to optimize the cost, performance, and value of cloud computing resources. The goal of FinOps is to enable organizations to better understand, control, and optimize cloud computing costs through prudent resource management and financial decision-making.

The stages of FinOps are divided into Cost Observation (Inform), Cost Analysis (Recommend), and Cost Optimization (Operate).

Typically, an enterprise’s internal cost platform includes cost observation and cost analysis, analyzing IT costs (cloud provider bills) based on service types and business departments.

Cost Optimization (Operate) is divided into three stages from easy to difficult:

  1. Handling idle machines and services, making informed choices on services and resources, including appropriate instance types, pricing models, reserved capacity, and package discounts.
  2. Applying service downsizing, reducing redundant resources (changing from triple-active to dual-active, dual-active to cold standby, dual-active to single-active, personnel optimization).
  3. Technical optimization (improving utilization).

This article focuses on the technical optimization stage of cost reduction, specifically in the context of cloud-native cost reduction strategies under Kubernetes.

Summary 2023

Overall, 2023 felt like a continuous struggle through adversity, with a glimmer of hope gradually coming into view. It was like climbing uphill with faltering steps and looking up to see the hilltop. I feel that I have accumulated some expertise in the cloud-native field, which lets me gradually share my knowledge and thoughts while growing my technical influence.

A Deep Dive into HighNodeUtilization and LowNodeUtilization Plugins with Descheduler

Recently, I have been researching descheduler, primarily to address CPU hotspots on some nodes in Kubernetes clusters. This issue arises when CPU usage differs significantly among nodes even though pod requests are evenly distributed across them. As we know, kube-scheduler is responsible for scheduling pods to nodes, while descheduler evicts pods so that the workload controller recreates them. This in turn triggers pod scheduling again, which achieves pod rescheduling and node balancing.
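A small worked example may make the hotspot problem concrete. The sketch below uses made-up numbers (a hypothetical 4-core node capacity and invented per-pod request and usage values) to show two nodes that look identical by requests but very different by actual usage.

```python
# Hypothetical sample data: CPU in millicores.
NODE_CAPACITY_MILLICORES = 4000

pods_by_node = {
    "node-a": [
        {"request": 1000, "usage": 1800},
        {"request": 1000, "usage": 1500},
    ],
    "node-b": [
        {"request": 1000, "usage": 200},
        {"request": 1000, "usage": 300},
    ],
}


def percent(total: int, capacity: int) -> float:
    """Return total as a percentage of capacity."""
    return 100.0 * total / capacity


def node_view(pods: list) -> dict:
    """Compute request-based and usage-based utilization for one node."""
    requested = sum(p["request"] for p in pods)
    used = sum(p["usage"] for p in pods)
    return {
        "request_pct": percent(requested, NODE_CAPACITY_MILLICORES),
        "usage_pct": percent(used, NODE_CAPACITY_MILLICORES),
    }


views = {name: node_view(pods) for name, pods in pods_by_node.items()}
for name, view in views.items():
    print(name, view)
```

Both nodes report 50% utilization by requests, so a request-based balancer sees nothing to fix, while the measured usage is 82.5% on one node and 12.5% on the other.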

The descheduler project in the community aims to address the following scenarios:

  1. Some nodes have high utilization, so node utilization needs to be balanced.
  2. After a pod is scheduled, the node’s labels or taints no longer satisfy the pod’s pod/node affinity, so the pod must be relocated to a compliant node.
  3. New nodes join the cluster, necessitating the balancing of node utilization.
  4. Pods are in a failed state but have not been cleaned up.
  5. Pods of the same workload are concentrated on the same node.

Descheduler uses a plugin mechanism to extend its capabilities, with plugins categorized into Balance (node balancing) and Deschedule (pod rescheduling) types.

Analyzing the Static Pod Removal Process in kubelet

The previous article discussed the interesting removal process of the mirror pod. This article will explore the removal process of static pods.

Static pods can originate from files or HTTP services, and they are visible only inside the kubelet. The mirror pod is a mirror of the static pod that lets external components observe the static pod’s state.

The previous article explained that removing the mirror pod does not delete the static pod. To delete a static pod, you need to either delete the files under the --pod-manifest-path directory or remove the pod by making the HTTP server specified in --manifest-url return a response body that excludes this pod.
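Conceptually, removal through either source comes down to the pod no longer appearing in what the source reports. The sketch below illustrates that idea with a plain set difference. It is only a simplified model, not the kubelet's actual logic, and the `diff_static_pods` helper and the pod names are hypothetical.

```python
from typing import Dict, Set


def diff_static_pods(previous: Set[str], current: Set[str]) -> Dict[str, Set[str]]:
    """Compare two snapshots of pod names reported by a static pod source."""
    return {
        "added": current - previous,    # pods that newly appeared in the source
        "removed": previous - current,  # pods that disappeared from the source
    }


# Hypothetical snapshots: the second one no longer lists "debug-agent",
# as if its manifest file was deleted or the HTTP response excluded it.
previous_manifest = {"etcd", "kube-apiserver", "debug-agent"}
current_manifest = {"etcd", "kube-apiserver"}

changes = diff_static_pods(previous_manifest, current_manifest)
print(sorted(changes["removed"]))
```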

Exploring Mirror Pod Deletion in Kubernetes: Understanding its Impact on Static Pod

This is another article researching the pod removal process, this time focusing on mirror pods. The term “mirror pod” may sound unfamiliar; it is a type of pod within Kubernetes.

Let’s first introduce the classification of pods. Pods come from file, http, and apiserver sources. Pods from the apiserver are called ordinary pods, while pods from other sources are called static pods (the control plane installed using kubeadm runs with static pods). To manage pods conveniently, kubelet generates corresponding pods for static pods on the apiserver. These types of pods are called mirror pods, essentially mirroring the static pod (almost identical, except for a different UID and the addition of “kubernetes.io/config.mirror” in annotations).
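As a rough illustration of the annotation mentioned above, a mirror pod can be recognized by checking for that key in its metadata. The sketch below works on plain dictionaries shaped like a pod's JSON; the `is_mirror_pod` helper and the sample pods (including the annotation value) are assumptions for this example.

```python
# Annotation key named in the text above.
MIRROR_ANNOTATION = "kubernetes.io/config.mirror"


def is_mirror_pod(pod: dict) -> bool:
    """Return True if the pod dictionary carries the mirror annotation."""
    annotations = pod.get("metadata", {}).get("annotations") or {}
    return MIRROR_ANNOTATION in annotations


# Hypothetical pod objects, reduced to the fields this check needs.
mirror_pod = {
    "metadata": {
        "name": "kube-apiserver-node1",
        "annotations": {MIRROR_ANNOTATION: "example-hash"},
    }
}
ordinary_pod = {"metadata": {"name": "web-0", "annotations": None}}

print(is_mirror_pod(mirror_pod), is_mirror_pod(ordinary_pod))
```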

So, what happens when you delete a mirror pod? Will it remove static pods on the node?