Why Evicted Pods Are Not Deleted and How to Clean Them Up
Recently, I ran into something unexpected: after scaling a deployment down to 0 replicas, its evicted pods were not deleted. While a deployment still has replicas, evicted pods are typically left in place, which is what I expected, but I did not expect them to survive scaling to 0. This article looks at when and how evicted pods are removed.
The Kubernetes version used in this article is 1.23.
1 Phenomenon
Here, I will reproduce the scenario where the replicas of a deployment are set to 0, but the evicted pods are not deleted.
- Define a deployment with an ephemeral-storage limit of 200M.
- Write a 300M file inside the pod’s container and wait for the pod to be evicted.
- Check the pod.
- Scale the deployment down to 0.
- Check the pod and deployment: the deployment has 0 replicas, but the evicted pod is still listed.
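The deployment from the first step can be sketched as follows. This is a minimal illustration, not the manifest used in the original reproduction: the name evict-demo, the busybox image, and the sleep command are all assumptions. The script only builds the manifest and prints it as JSON.

```python
import json


def build_deployment(name="evict-demo", image="busybox", storage_limit="200M"):
    """Return a Deployment manifest with an ephemeral-storage limit.

    The name, image, and command are hypothetical placeholders.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "main",
                            "image": image,
                            # Keep the container alive so a file can be written into it.
                            "command": ["sleep", "3600"],
                            "resources": {
                                "limits": {"ephemeral-storage": storage_limit}
                            },
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_deployment(), indent=2))
```

Since kubectl accepts JSON manifests, the printed output could be piped into kubectl apply -f - to create the deployment.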
2 What is an Evicted Pod?
A pod with status.phase set to Failed and status.reason set to Evicted is referred to as an evicted pod. Its IP has been released, but it still appears in status.podIP, so more than one pod can report the same IP. Such pods are evicted by kubelet, not through the API server’s Eviction API that kubectl drain uses.
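The definition above translates into a small check on the pod object. The sketch below assumes the pod is available as a dictionary shaped like the JSON that kubectl get pod -o json returns; the sample pods are made up for illustration.

```python
def is_evicted(pod):
    """Return True if the pod matches the definition of an evicted pod."""
    status = pod.get("status", {})
    return status.get("phase") == "Failed" and status.get("reason") == "Evicted"


# Hypothetical sample pods.
evicted_pod = {
    "metadata": {"name": "web-1"},
    "status": {"phase": "Failed", "reason": "Evicted", "podIP": "10.0.0.5"},
}
running_pod = {
    "metadata": {"name": "web-2"},
    "status": {"phase": "Running", "podIP": "10.0.0.5"},
}

if __name__ == "__main__":
    for pod in (evicted_pod, running_pod):
        print(pod["metadata"]["name"], is_evicted(pod))
```

Note that both sample pods carry the same podIP, which mirrors the stale-IP situation described above.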
3 Reasons for Generating Evicted Pods
There are two situations in which pods are evicted by kubelet:
- The pod exceeds the specified resource limits (e.g., the container’s disk usage surpasses the ephemeral-storage limit).
- If the remaining resources on the node fall below the values set by --eviction-hard or --eviction-soft, kubelet will evict pods on that node.
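The two triggers boil down to two comparisons. The sketch below is a simplified model, not kubelet’s implementation: the function names are hypothetical, all quantities are plain byte counts, and the grace period that applies to --eviction-soft is ignored.

```python
def exceeds_limit(usage_bytes, limit_bytes):
    """Trigger 1: a container uses more of a resource than its limit allows."""
    return limit_bytes is not None and usage_bytes > limit_bytes


def below_threshold(available_bytes, threshold_bytes):
    """Trigger 2: what is left on the node drops below an eviction threshold."""
    return available_bytes < threshold_bytes


MB = 1000 * 1000  # the "M" suffix in Kubernetes quantities is decimal

if __name__ == "__main__":
    # The reproduction above: a 300M file against a 200M ephemeral-storage limit.
    print(exceeds_limit(300 * MB, 200 * MB))
    # A node with 50M of memory left against a 100M threshold (made-up numbers).
    print(below_threshold(50 * MB, 100 * MB))
```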
4 How to Delete Evicted Pods
The direct method is to remove them with kubectl delete.
To delete all evicted pods in the cluster, list the pods in every namespace, keep those whose status.reason is Evicted, and delete each one.
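One way to approach the cluster-wide cleanup is to filter the pod list and build one kubectl delete command per evicted pod. The sketch below is not the command from the original article. It assumes the input has the shape of kubectl get pods --all-namespaces -o json, and it only builds the commands rather than running them. The sample data is invented.

```python
def evicted_delete_commands(pod_list):
    """Build a `kubectl delete pod` command for every evicted pod in the list."""
    commands = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") == "Failed" and status.get("reason") == "Evicted":
            meta = pod["metadata"]
            commands.append(
                ["kubectl", "delete", "pod", meta["name"], "-n", meta["namespace"]]
            )
    return commands


# Hypothetical sample input.
sample = {
    "items": [
        {"metadata": {"name": "web-1", "namespace": "default"},
         "status": {"phase": "Failed", "reason": "Evicted"}},
        {"metadata": {"name": "web-2", "namespace": "default"},
         "status": {"phase": "Running"}},
        {"metadata": {"name": "job-1", "namespace": "batch"},
         "status": {"phase": "Succeeded"}},
    ]
}

if __name__ == "__main__":
    for cmd in evicted_delete_commands(sample):
        print(" ".join(cmd))
```

Printing the commands first makes it easy to review what would be deleted before anything is executed.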
Are there any other ways to delete evicted pods? Does executing a rollout update on a deployment remove them? Does deleting the replicaset corresponding to the evicted pods remove them?
With these questions in mind, let’s find answers through practical experiments.
4.1 Does Deleting the Replicaset Corresponding to Evicted Pods Remove Evicted Pods?
Yes. The --cascade option of kubectl delete defaults to background, so the replicaset is deleted first. The generic garbage collector in kube-controller-manager then removes all pods whose ownerReference points to the deleted replicaset, which includes the evicted pods.
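The garbage collector’s selection can be pictured as a lookup over owner references. The sketch below is a simplified model with made-up UIDs and pod names. It assumes pods are dictionaries whose metadata.ownerReferences entries carry the owner’s uid, as in the Kubernetes API.

```python
def dependents_of(owner_uid, pods):
    """Return the names of pods whose ownerReferences point at owner_uid."""
    names = []
    for pod in pods:
        refs = pod["metadata"].get("ownerReferences", [])
        if any(ref.get("uid") == owner_uid for ref in refs):
            names.append(pod["metadata"]["name"])
    return names


# Hypothetical pods: two owned by replicaset "rs-uid-1", one by "rs-uid-2".
pods = [
    {"metadata": {"name": "web-old-evicted",
                  "ownerReferences": [{"kind": "ReplicaSet", "uid": "rs-uid-1"}]},
     "status": {"phase": "Failed", "reason": "Evicted"}},
    {"metadata": {"name": "web-old-running",
                  "ownerReferences": [{"kind": "ReplicaSet", "uid": "rs-uid-1"}]},
     "status": {"phase": "Running"}},
    {"metadata": {"name": "web-new",
                  "ownerReferences": [{"kind": "ReplicaSet", "uid": "rs-uid-2"}]},
     "status": {"phase": "Running"}},
]

if __name__ == "__main__":
    # Pods that would be collected once replicaset rs-uid-1 is gone.
    print(dependents_of("rs-uid-1", pods))
```

The pod’s phase plays no role in this selection, which is why evicted pods are removed together with the running ones.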
4.2 Does Executing a Rollout Update on a Deployment Remove Evicted Pods?
Yes, but you may need to execute rollout update anywhere from 1 to about spec.revisionHistoryLimit times, until the replicaset corresponding to the evicted pod is deleted. spec.revisionHistoryLimit determines how many old replicasets are retained. When the number of replicasets under a deployment, excluding the current version, exceeds this limit, the oldest replicaset is deleted. Therefore, once the replicaset corresponding to the evicted pod is the oldest and the limit is exceeded, the evicted pod is deleted along with the replicaset.
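To get a feel for how many updates are needed, the retention rule can be simulated. This is a simplified model based only on the rule described above, not the deployment controller’s code. The replicaset names are invented, and the counts it produces may differ by one from what a real cluster does.

```python
def rollouts_until_deleted(history, current, target, limit):
    """Count rollouts until the `target` replicaset drops out of the history.

    history: old replicasets, oldest first
    current: the replicaset of the current version
    target:  the replicaset that owns the evicted pod
    limit:   spec.revisionHistoryLimit
    """
    history = list(history)
    rollouts = 0
    while target in history or target == current:
        rollouts += 1
        # The current replicaset becomes an old one; a new one takes its place.
        history.append(current)
        current = f"rs-new-{rollouts}"
        # Old replicasets beyond the limit are deleted, oldest first.
        excess = len(history) - limit
        if excess > 0:
            history = history[excess:]
    return rollouts


if __name__ == "__main__":
    # Evicted pod on the oldest of two retained replicasets, limit 2.
    print(rollouts_until_deleted(["rs-a", "rs-b"], "rs-c", "rs-a", 2))
    # Evicted pod on the current replicaset, empty history, limit 2.
    print(rollouts_until_deleted([], "rs-a", "rs-a", 2))
```

In this model the answer depends on how far the target replicaset is from being the oldest, which is why a single update is sometimes enough and sometimes not.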
5 Why Aren’t Evicted Pods Deleted?
Evicted pods are generally not deleted immediately. They persist until the number of terminated pods in the cluster, meaning pods with a phase of Failed or Succeeded, exceeds --terminated-pod-gc-threshold (default 12500). Only then does the pod-garbage-collector controller in kube-controller-manager delete them.
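The threshold behavior can be sketched as follows. This is a simplified model rather than the controller’s code. It assumes that only the pods in excess of the threshold are removed and that the oldest ones go first. The pod records and the created field are invented for the example.

```python
def pods_to_gc(pods, threshold):
    """Return names of terminated pods to delete once the threshold is exceeded."""
    terminated = [p for p in pods if p["phase"] in ("Failed", "Succeeded")]
    excess = len(terminated) - threshold
    if excess <= 0:
        return []  # at or below the threshold: nothing is deleted
    # Assumption: the oldest terminated pods are removed first.
    terminated.sort(key=lambda p: p["created"])
    return [p["name"] for p in terminated[:excess]]


# Hypothetical pods; `created` is an ordering key standing in for a timestamp.
pods = [
    {"name": "evicted-1", "phase": "Failed", "created": 1},
    {"name": "job-1", "phase": "Succeeded", "created": 2},
    {"name": "evicted-2", "phase": "Failed", "created": 3},
    {"name": "web-1", "phase": "Running", "created": 0},
]

if __name__ == "__main__":
    print(pods_to_gc(pods, threshold=2))
    print(pods_to_gc(pods, threshold=12500))
```

With the default of 12500, a small cluster will practically never reach the threshold, so evicted pods stay around indefinitely.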
6 Why Doesn’t Scaling a Deployment to 0 Delete Evicted Pods?
Setting the replicas of a deployment to 0 merely adjusts the replicas of the current version of the replicaset to 0 without deleting the replicaset. Therefore, evicted pods are not deleted in this scenario.
6.1 Why Doesn’t Setting Replicas to 0 on a ReplicaSet Delete Evicted Pods?
The status.availableReplicas
of a ReplicaSet does not include deleted pods or pods with a phase of Failed or Succeeded. Since the phase of evicted pods is Failed, they are ignored. In other words, a ReplicaSet counts only the active pods it controls, excluding deleted pods and those with a phase of Failed or Succeeded.
Here, filteredPods
represents the list of pods controlled by the ReplicaSet. The controller.FilterActivePods
function filters out all inactive pods (deleted or with a phase of Failed or Succeeded).
The relevant code is in pkg/controller/replicaset/replica_set.go, where filteredPods is computed, and in pkg/controller/controller_utils.go, where FilterActivePods is defined.
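The effect of this filtering on a scale-down can be modeled in a few lines. This is a Python paraphrase of the idea, not the Go source. The field names are simplified, and the real controller ranks the pods it deletes, whereas this sketch just takes them in list order.

```python
def is_pod_active(pod):
    """A pod is active unless it has finished or is already being deleted."""
    finished = pod["phase"] in ("Succeeded", "Failed")
    return not finished and pod.get("deletion_timestamp") is None


def filter_active_pods(pods):
    """Model of the filtering step: keep only active pods."""
    return [p for p in pods if is_pod_active(p)]


def pods_removed_on_scale(pods, desired_replicas):
    """Return names of pods a scale-down would delete in this simplified model."""
    active = filter_active_pods(pods)
    surplus = max(len(active) - desired_replicas, 0)
    return [p["name"] for p in active[:surplus]]


# Hypothetical pods of one replicaset: one running, one evicted.
pods = [
    {"name": "web-running", "phase": "Running"},
    {"name": "web-evicted", "phase": "Failed", "reason": "Evicted"},
]

if __name__ == "__main__":
    # Scaling to 0 removes only the running pod; the evicted pod is never seen.
    print(pods_removed_on_scale(pods, desired_replicas=0))
```

Because the evicted pod is dropped before the controller compares the pod count with the desired replicas, it can never appear among the pods to delete.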
7 Summary
Methods to delete evicted pods:
- Directly delete the evicted pod.
- For a deployment, you can delete the replicaset corresponding to the pod or directly delete the deployment (not recommended unless replicas are set to 0).
- For a deployment, trigger multiple rollout update operations to allow the deployment controller to delete the replicaset corresponding to the evicted pod.
- Set --terminated-pod-gc-threshold in kube-controller-manager to a smaller value to more easily trigger the pod-garbage-collector controller to delete pods with a phase of Failed or Succeeded.