A Deep Dive into How the kubelet Removes a Static Pod

The previous article covered the interesting removal flow of a mirror pod; this one examines the removal flow of a static pod.

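A static pod can come from a file or from an HTTP service, and it is visible only inside the kubelet. A mirror pod is the image of a static pod that lets external components observe the static pod's state.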

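As the previous article showed, deleting the mirror pod does not delete the static pod. To delete a static pod, either remove its file from the --pod-manifest-path directory, or have the HTTP server behind --manifest-url return a response body that no longer contains the pod.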
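
As a minimal sketch of the file-based path, the snippet below deletes one manifest from a static pod directory. The function name, the directory, and the file name are hypothetical; on a real node the directory is whatever --pod-manifest-path points to. The code only removes the file and assumes the kubelet notices the removal on its own.

```python
import tempfile
from pathlib import Path


def remove_static_pod_manifest(manifest_dir: str, manifest_name: str) -> bool:
    """Delete one manifest file; return True if a file was actually removed."""
    manifest = Path(manifest_dir) / manifest_name
    if not manifest.is_file():
        return False
    manifest.unlink()
    return True


# Demo: a temporary directory stands in for the --pod-manifest-path directory.
with tempfile.TemporaryDirectory() as demo_dir:
    (Path(demo_dir) / "nginx-static-pod.yaml").write_text("kind: Pod\n")
    removed = remove_static_pod_manifest(demo_dir, "nginx-static-pod.yaml")
    removed_again = remove_static_pod_manifest(demo_dir, "nginx-static-pod.yaml")
    remaining = sorted(p.name for p in Path(demo_dir).iterdir())
```

Returning a boolean instead of raising makes the helper safe to call twice, which matches the idea that removing an already absent manifest is a no-op.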


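The rest of this article examines the removal flow of a static pod that comes from a file. The Kubernetes version is 1.23 and the kubelet log level is 4.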

The removal flow is reconstructed by reading the kubelet log alongside the corresponding code.

The complete log file is in "kubelet log and watch pod output".
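
When reading a log like this, it helps to split each line into its timestamp, source location, message, and podUID. The parser below is a rough sketch based only on the shape of the log lines quoted in this article; the regular expression and the `LogEntry` type are illustrative and are not part of any kubelet tooling.

```python
import re
from typing import NamedTuple, Optional

# Header shape seen in this article: I1123 14:18:35.558191  315900 kubelet.go:1969] ...
KLOG_LINE = re.compile(
    r'^(?P<severity>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) '
    r'(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<pid>\d+) '
    r'(?P<file>[^:\s]+):(?P<line>\d+)\] (?P<rest>.*)$'
)


class LogEntry(NamedTuple):
    time: str
    source: str
    message: str
    pod_uid: Optional[str]


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one kubelet log line; return None if it does not match the header shape."""
    match = KLOG_LINE.match(line.strip())
    if match is None:
        return None
    rest = match["rest"]
    quoted = re.match(r'"([^"]*)"', rest)  # the message is the first quoted string
    message = quoted.group(1) if quoted else rest
    uid = re.search(r'podUID=(\w+)', rest)  # absent or empty podUID yields None
    source = f'{match["file"]}:{match["line"]}'
    return LogEntry(match["time"], source, message, uid.group(1) if uid else None)


sample = (
    'I1123 14:18:35.558191  315900 kubelet.go:1969] '
    '"Pod has been deleted and must be killed" '
    'pod="default/nginx-static-pod-10.11.251.2" '
    'podUID=a8712c005851ee6b29cff91b9ab4b9c6'
)
entry = parse_line(sample)
```

Filtering parsed entries on `pod_uid` is a quick way to isolate the lines that belong to one pod.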

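The kubelet detects that the static pod's manifest file has been removed. Here "SyncLoop REMOVE" means the pod has disappeared, and an event of type SyncPodKill is sent to notify podWorker.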

I1123 14:18:35.558172  315900 kubelet.go:2124] "SyncLoop REMOVE" source="file" pods=[default/nginx-static-pod-10.11.251.2]
I1123 14:18:35.558191  315900 kubelet.go:1969] "Pod has been deleted and must be killed" pod="default/nginx-static-pod-10.11.251.2" podUID=a8712c005851ee6b29cff91b9ab4b9c6
I1123 14:18:35.558206  315900 pod_workers.go:638] "Pod is being removed by the kubelet, begin teardown" pod="default/nginx-static-pod-10.11.251.2" podUID=a8712c005851ee6b29cff91b9ab4b9c6

This triggers podWorker to run syncTerminatingPod.

I1123 14:18:35.558234  315900 pod_workers.go:888] "Processing pod event" pod="default/nginx-static-pod-10.11.251.2" podUID=a8712c005851ee6b29cff91b9ab4b9c6 updateType=1
I1123 14:18:35.558244  315900 pod_workers.go:1005] "Pod worker has observed request to terminate" pod="default/nginx-static-pod-10.11.251.2" podUID=a8712c005851ee6b29cff91b9ab4b9c6
I1123 14:18:35.558259  315900 kubelet.go:1795] "syncTerminatingPod enter" pod="default/nginx-static-pod-10.11.251.2" podUID=a8712c005851ee6b29cff91b9ab4b9c6

syncTerminatingPod stops the container and the sandbox.

I1123 14:18:35.558456  315900 kubelet.go:1825] "Pod terminating with grace period" pod="default/nginx-static-pod-10.11.251.2" podUID=a8712c005851ee6b29cff91b9ab4b9c6 gracePeriod=30
I1123 14:18:35.558519  315900 kuberuntime_container.go:719] "Killing container with a grace period override" pod="default/nginx-static-pod-10.11.251.2" podUID=a8712c005851ee6b29cff91b9ab4b9c6 containerName="nginx-container" containerID="docker://b6ca55d329230c8f5776eb1160fe161d6fefa01f2b31e55dbb820add90aadccc" gracePeriod=30
I1123 14:18:35.558528  315900 kuberuntime_container.go:723] "Killing container with a grace period" pod="default/nginx-static-pod-10.11.251.2" podUID=a8712c005851ee6b29cff91b9ab4b9c6 containerName="nginx-container" 

syncTerminatingPod finishes, and podWorker finishes processing this event.

I1123 14:18:35.775202  315900 kubelet.go:1873] "Pod termination stopped all running containers" pod="default/nginx-static-pod-10.11.251.2" podUID=a8712c005851ee6b29cff91b9ab4b9c6
I1123 14:18:35.775212  315900 kubelet.go:1875] "syncTerminatingPod exit" pod="default/nginx-static-pod-10.11.251.2" podUID=a8712c005851ee6b29cff91b9ab4b9c6
I1123 14:18:35.775220  315900 pod_workers.go:1050] "Pod terminated all containers successfully" pod="default/nginx-static-pod-10.11.251.2" podUID=a8712c005851ee6b29cff91b9ab4b9c6
I1123 14:18:35.775232  315900 pod_workers.go:988] "Processing pod event done" pod="default/nginx-static-pod-10.11.251.2" podUID=a8712c005851ee6b29cff91b9ab4b9c6 updateType=1
I1123 14:18:35.775237  315900 pod_workers.go:888] "Processing pod event" 

podWorker starts running syncTerminatedPod.

I1123 14:18:35.775237  315900 pod_workers.go:888] "Processing pod event" pod="default/nginx-static-pod-10.11.251.2" podUID=a8712c005851ee6b29cff91b9ab4b9c6 updateType=2
I1123 14:18:35.938222  315900 kubelet.go:1883] "syncTerminatedPod enter" pod="default/nginx-static-pod-10.11.251.2" podUID=a8712c005851ee6b29cff91b9ab4b9c6
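
In the logs quoted so far, updateType=1 accompanies syncTerminatingPod and updateType=2 accompanies syncTerminatedPod. The small helper below labels a "Processing pod event" line accordingly. The mapping is inferred from these log excerpts only, and the names `PHASE_BY_UPDATE_TYPE` and `phase_of` are made up for this sketch.

```python
import re
from typing import Optional

# Inferred from the log excerpts in this article, not copied from kubelet source.
PHASE_BY_UPDATE_TYPE = {
    1: "syncTerminatingPod",
    2: "syncTerminatedPod",
}


def phase_of(line: str) -> Optional[str]:
    """Return the phase a log line belongs to, or None if it has no updateType."""
    match = re.search(r'updateType=(\d+)', line)
    if match is None:
        return None
    return PHASE_BY_UPDATE_TYPE.get(int(match.group(1)), "unknown")


terminating = phase_of('pod_workers.go:888] "Processing pod event" updateType=1')
terminated = phase_of('pod_workers.go:888] "Processing pod event" updateType=2')
no_type = phase_of('kubelet.go:2202] "SyncLoop (housekeeping)"')
```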

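PLEG observes that the container and the sandbox have stopped. Because the event here is of type SyncPodKill, podWorker does not set the delete field to true, so the removeAll argument passed to kl.containerDeletor.deleteContainersInPod is false (that is, the cleanup policy keeps the one most recently exited container). As a result, the PLEG event for the sandbox triggers cleanUpContainersInPod, which logs "Container not found in pod's containers".

In addition, because there is only one exited container, no container cleanup is triggered here.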

I1123 14:18:35.938230  315900 kubelet.go:2156] "SyncLoop (PLEG): pod does not exist, ignore irrelevant event" event=&{ID:a8712c005851ee6b29cff91b9ab4b9c6 Type:ContainerDied Data:b6ca55d329230c8f5776eb1160fe161d6fefa01f2b31e55dbb820add90aadccc}
I1123 14:18:35.938232  315900 kubelet_pods.go:1441] "Generating pod status" pod="default/nginx-static-pod-10.11.251.2"
I1123 14:18:35.938244  315900 kubelet.go:2156] "SyncLoop (PLEG): pod does not exist, ignore irrelevant event" event=&{ID:a8712c005851ee6b29cff91b9ab4b9c6 Type:ContainerDied Data:398445f28f116ed45394c18d7697a64dceeef739379d5ac920bbf3fd6cc1bb78}
I1123 14:18:35.938251  315900 pod_container_deletor.go:79] "Container not found in pod's containers" containerID="398445f28f116ed45394c18d7697a64dceeef739379d5ac920bbf3fd6cc1bb78"
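
The keep-one policy described above can be modelled in a few lines. This is a simplified sketch written for this article, not the kubelet's Go code: the `Container` type, the `containers_to_delete` function, and the shortened IDs are all illustrative, and treating removeAll=true as "keep nothing, no ID filter" is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Container:
    id: str
    name: str
    exited: bool
    created_at: float  # larger means created more recently


def containers_to_delete(filter_id: str, containers: List[Container], keep: int) -> List[Container]:
    """Return the exited containers that a keep-N policy would delete."""
    matched = next((c for c in containers if c.id == filter_id), None)
    if filter_id and matched is None:
        # The ID is not one of the pod's containers (for example a sandbox ID):
        # this corresponds to "Container not found in pod's containers".
        return []
    candidates = [
        c for c in containers
        if c.exited and (matched is None or c.name == matched.name)
    ]
    candidates.sort(key=lambda c: c.created_at, reverse=True)  # newest first
    return candidates[keep:]


nginx = Container("b6ca55d3", "nginx-container", exited=True, created_at=1.0)

# removeAll=false is modelled as keep=1: the only exited container is retained.
after_container_event = containers_to_delete("b6ca55d3", [nginx], keep=1)
# The sandbox ID matches no container, so nothing is selected either.
after_sandbox_event = containers_to_delete("398445f2", [nginx], keep=1)
# Assumed removeAll=true behaviour: keep nothing and apply no ID filter.
with_remove_all = containers_to_delete("", [nginx], keep=0)
```

With a single exited container and keep=1, both events select nothing for deletion, which is consistent with the absence of any container removal in this part of the log.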

syncTerminatedPod finishes, and podWorker finishes.

I1123 14:18:35.941374  315900 kubelet.go:1924] "syncTerminatedPod exit" pod="default/nginx-static-pod-10.11.251.2" podUID=a8712c005851ee6b29cff91b9ab4b9c6
I1123 14:18:35.941383  315900 pod_workers.go:1105] "Pod is complete and the worker can now stop" pod="default/nginx-static-pod-10.11.251.2" podUID=a8712c005851ee6b29cff91b9ab4b9c6
I1123 14:18:35.941395  315900 pod_workers.go:959] "Processing pod event done" pod="default/nginx-static-pod-10.11.251.2" podUID=a8712c005851ee6b29cff91b9ab4b9c6 updateType=2

The kubelet observes the status update of the mirror pod.

I1123 14:18:35.949126  315900 kubelet.go:2127] "SyncLoop RECONCILE" source="api" pods=[default/nginx-static-pod-10.11.251.2]

Housekeeping is triggered and deletes the mirror pod (with GracePeriodSeconds set to 0).

I1123 14:18:37.445960  315900 kubelet.go:2202] "SyncLoop (housekeeping)"
I1123 14:18:37.448122  315900 kubelet_pods.go:1082] "Clean up pod workers for terminated pods"
I1123 14:18:37.448136  315900 pod_workers.go:1258] "Pod has been terminated and is no longer known to the kubelet, remove all history" podUID=a8712c005851ee6b29cff91b9ab4b9c6
I1123 14:18:37.448143  315900 kubelet_pods.go:1111] "Clean up probes for terminated pods"
I1123 14:18:37.451130  315900 kubelet_pods.go:1148] "Clean up orphaned pod statuses"
I1123 14:18:37.453700  315900 kubelet_pods.go:1167] "Clean up orphaned pod directories"
I1123 14:18:37.453841  315900 kubelet_volumes.go:160] "Cleaned up orphaned pod volumes dir" podUID=a8712c005851ee6b29cff91b9ab4b9c6 path="/data/kubernetes/kubelet/pods/a8712c005851ee6b29cff91b9ab4b9c6/volumes"
I1123 14:18:37.453954  315900 kubelet_volumes.go:236] "Orphaned pod found, removing" podUID=a8712c005851ee6b29cff91b9ab4b9c6
I1123 14:18:37.453970  315900 kubelet_pods.go:1178] "Clean up orphaned mirror pods"
I1123 14:18:37.453983  315900 mirror_client.go:130] "Deleting a mirror pod" pod="default/nginx-static-pod-10.11.251.2" podUID=
I1123 14:18:37.463808  315900 config.go:278] "Setting pods for source" source="api"
I1123 14:18:37.466092  315900 config.go:278] "Setting pods for source" source="api"
I1123 14:18:37.466537  315900 kubelet_pods.go:1040] "Deleted pod" podName="nginx-static-pod-10.11.251.2_default"
I1123 14:18:37.466546  315900 kubelet_pods.go:1185] "Clean up orphaned pod cgroups"
I1123 14:18:37.466560  315900 kubelet.go:2210] "SyncLoop (housekeeping) end"
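
The "Clean up orphaned mirror pods" step can be pictured as a set difference: any mirror pod whose static pod is no longer known is deleted with a grace period of 0. The sketch below is a toy model under that assumption; the function names and the request dictionary are hypothetical and do not mirror the kubelet's data structures.

```python
from typing import Dict, List, Set


def orphaned_mirror_pods(static_pods: Set[str], mirror_pods: Set[str]) -> List[str]:
    """Mirror pods that no longer have a matching static pod, in stable order."""
    return sorted(mirror_pods - static_pods)


def build_delete_request(pod_full_name: str) -> Dict[str, object]:
    """Illustrative delete request; the grace period of 0 follows the text above."""
    return {"pod": pod_full_name, "grace_period_seconds": 0}


# After the manifest is removed, no static pod is left but the mirror pod remains.
static_pods: Set[str] = set()
mirror_pods = {"nginx-static-pod-10.11.251.2_default"}
requests = [
    build_delete_request(name)
    for name in orphaned_mirror_pods(static_pods, mirror_pods)
]
```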

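The kubelet observes the deletion of the mirror pod and its removal from the apiserver. Because the pod has already been removed from podManager, podWorker is not triggered.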

I1123 14:18:37.466578  315900 kubelet.go:2130] "SyncLoop DELETE" source="api" pods=[default/nginx-static-pod-10.11.251.2]
I1123 14:18:37.466589  315900 kubelet.go:2124] "SyncLoop REMOVE" source="api" pods=[default/nginx-static-pod-10.11.251.2]

The garbageCollector removes the container and the sandbox, and removes the pod's log directory.

I1123 14:19:21.542254  315900 kuberuntime_container.go:947] "Removing container" containerID="b6ca55d329230c8f5776eb1160fe161d6fefa01f2b31e55dbb820add90aadccc"
I1123 14:19:21.542265  315900 scope.go:110] "RemoveContainer" containerID="b6ca55d329230c8f5776eb1160fe161d6fefa01f2b31e55dbb820add90aadccc"
I1123 14:19:21.554998  315900 kuberuntime_gc.go:171] "Removing sandbox" sandboxID="398445f28f116ed45394c18d7697a64dceeef739379d5ac920bbf3fd6cc1bb78"
I1123 14:19:21.563426  315900 kuberuntime_gc.go:343] "Removing pod logs" podUID=a8712c005851ee6b29cff91b9ab4b9c6
I1123 14:19:21.566388  315900 kubelet.go:1333] "Container garbage collection succeeded"
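
To summarize, the static pod removal flow is:

  1. podConfig detects the removal of the static pod file and triggers SyncLoop REMOVE
  2. podWorker runs syncTerminatingPod (stops the container and the sandbox)
  3. PLEG observes the exit of the sandbox and the container
  4. podWorker runs syncTerminatedPod (removes the cgroup, updates the mirror pod's status, and waits for the pod's volumes to finish unmounting)
  5. The kubelet observes the mirror pod's status update
  6. Housekeeping is triggered and performs cleanup (removing the podWorker, deleting the mirror pod, removing the pod volume directory)
  7. The kubelet observes the mirror pod's deletion and its removal from the apiserver
  8. The garbageCollector cleans up the sandbox and the container, and removes the pod's log directory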
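
To get a feel for how long each stage takes, the timestamps quoted in the log excerpts can be turned into offsets from the initial REMOVE event. The snippet below is a small sketch that uses five timestamps copied from this article; the choice of milestones and the helper name are mine.

```python
from datetime import datetime
from typing import List, Tuple

# Timestamps copied from the log excerpts in this article.
MILESTONES = [
    ("SyncLoop REMOVE", "14:18:35.558172"),
    ("syncTerminatingPod exit", "14:18:35.775212"),
    ("syncTerminatedPod exit", "14:18:35.941374"),
    ("SyncLoop (housekeeping) end", "14:18:37.466560"),
    ("Container garbage collection succeeded", "14:19:21.566388"),
]


def offsets_from_start(milestones: List[Tuple[str, str]]) -> List[Tuple[str, float]]:
    """Return (label, seconds elapsed since the first milestone) pairs."""
    parsed = [(label, datetime.strptime(ts, "%H:%M:%S.%f")) for label, ts in milestones]
    start = parsed[0][1]
    return [(label, (moment - start).total_seconds()) for label, moment in parsed]


timeline = offsets_from_start(MILESTONES)
for label, seconds in timeline:
    print(f"{seconds:10.6f}s  {label}")
```

In this run the pod-level work completes in well under a second, the mirror pod is gone about two seconds after the REMOVE event, and the stopped container and sandbox are only garbage-collected about 46 seconds after it.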

[Figure: kubelet-static-pod-delete]

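Removing an ordinary pod takes two DELETE operations before the pod is gone from the apiserver. A static pod, by contrast, is removed by deleting its manifest file or by dropping it from the HTTP server's response body. The mirror pod that corresponds to the static pod is deleted when housekeeping runs, and the exited container and sandbox are cleaned up by the garbageCollector.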
