Pods are always scheduled to the same node

Problem: Job pods generated by Spark are consistently scheduled to the same node; pods from different jobs all land on that one node. The result is an uneven pod distribution, even though the nodes have no taints and their available resources are similar. The jobs do not set any nodeSelector, nodeAffinity, nodeName, or PodTopologySpread constraints.

Environment: Kubernetes 1.23, running the stock default-scheduler with its default configuration. The pods are created by batch Jobs.

Observation: Pods consistently get scheduled to node 10.10.33.57.

Scheduler Logs:


I0314 18:10:20.837459  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-35003-2023031417-1-fgd78" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:10:20.868467  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-26818-2023031417-1-79l87" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:10:20.907352  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-35002-2023031417-1-6nzzb" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:10:21.296938  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-34857-2023031417-1-hgcnq" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:10:21.645678  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-25771-2023031417-1-jsq9b" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:10:22.029478  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-34856-2023031417-1-rxlld" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:10:33.312340  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-33264-2023031417-1-bxv5r" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:10:34.344112  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-34808-2023031418-1-s4ttx" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:10:34.736190  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-28084-2023031417-1-mp4qh" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:10:36.118176  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-35664-2023031417-1-kkkdn" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:10:42.858574  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-38872-2023031417-1-f7w7t" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:10:53.170750  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-24503-2023031417-1-xvj5z" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:10:53.222601  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-26243-2023031417-1-cm2ft" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:11:03.978335  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-25770-2023031417-1-g2w5w" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:11:05.319211  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-bi/ham-33521-2023031417-1-pkgnx" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:11:30.115938  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-44574-2023031417-1-4lfj7" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:11:48.207950  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-38967-2023031417-1-lr7n9" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:11:58.279178  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-38820-2023031417-1-q2k2n" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:12:09.313942  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-27890-2023031417-1-4f6w4" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:12:24.762822  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-29749-2023031417-1-ns8gl" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:12:49.411474  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-25769-2023031417-1-fzt5h" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:13:13.496645  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-32066-2023031417-1-wmspl" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:13:28.571884  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-25767-2023031417-1-qkwv7" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:13:31.570898  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-29671-2023031417-1-w4xl7" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8
I0314 18:13:31.633205  308219 scheduler.go:621] "Successfully bound pod to node" pod="bigdata-hanoi-default/ham-31807-202303141800-2-9d7qx" node="10.10.33.57" evaluatedNodes=14 feasibleNodes=8

Job resource:


apiVersion: v1
items:
- apiVersion: batch/v1
  kind: Job
  metadata:
    creationTimestamp: "2023-03-14T10:53:13Z"
    generation: 1
    labels:
      hanoi2_task_id: ham-29183-202303141852-1
    name: ham-29183-202303141852-1
    namespace: bigdata-hanoi-default
    resourceVersion: "259164839"
    uid: bd6a6e70-00c3-4433-9637-fba91b579f20
  spec:
    backoffLimit: 0
    completionMode: NonIndexed
    completions: 1
    parallelism: 1
    selector:
      matchLabels:
        controller-uid: bd6a6e70-00c3-4433-9637-fba91b579f20
    suspend: false
    template:
      metadata:
        creationTimestamp: null
        labels:
          controller-uid: bd6a6e70-00c3-4433-9637-fba91b579f20
          hanoi2_task_id: ham-29183-202303141852-1
          job-name: ham-29183-202303141852-1
      spec:
        containers:
        - args:
          - 'curl http://xxx.com/script > /data/workspace/script && python2 /data/workspace/script'
          command:
          - /bin/bash
          - -c
          env:
          - name: POD_IP
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: status.podIP
          - name: SPARK_LOCAL_HOSTNAME
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: status.podIP
          image: hive:vtest3
          imagePullPolicy: IfNotPresent
          name: ham-29183-202303141852-1
          resources:
            limits:
              cpu: 200m
              memory: 400Mi
            requests:
              cpu: 10m
              memory: 100Mi
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /data/nfs_txyun
            name: nfs
        dnsPolicy: ClusterFirst

Principle: When the scheduler decides which node a pod goes to, it runs several phases. After infeasible nodes are filtered out, the scoring phases (PreScore, Score, and NormalizeScore) rank the remaining nodes, and this ranking decides the placement. The pod is bound to the node with the highest total score; if several nodes tie for the highest score, one of them is chosen at random.
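
The selection rule described above can be sketched in a few lines of Python. This is only an illustration of "highest score wins, ties broken at random", not the scheduler's actual Go implementation; the function name `select_host` and the reduced node list are chosen here for the example, and the scores are the final scores reported in the logs later in this post.

```python
import random


def select_host(total_scores, rng=random):
    """Return a node with the highest total score, breaking ties at random."""
    if not total_scores:
        raise ValueError("no feasible nodes")
    best = max(total_scores.values())
    # Collect every node that reaches the best score, then pick one of them.
    candidates = sorted(node for node, score in total_scores.items() if score == best)
    return rng.choice(candidates)


# Final scores for four of the feasible nodes, taken from the scheduler logs.
final_scores = {
    "10.10.33.55": 763,
    "10.10.33.56": 737,
    "10.10.33.57": 790,
    "10.10.33.58": 690,
}
print(select_host(final_scores))  # 10.10.33.57
```

Because 10.10.33.57 is the only node with the top score, the random tie-break never comes into play, which is consistent with every pod landing on the same node.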

In the default configuration, the scoring phase uses the following plugins: NodeAffinity, NodeResourcesFit, VolumeBinding, PodTopologySpread, InterPodAffinity, NodeResourcesBalancedAllocation, ImageLocality, and TaintToleration. Each plugin's normalized score (0 to 100) is multiplied by the plugin's weight before the scores are summed, which is why the logs below show TaintToleration at 300 and PodTopologySpread at 200. For details, see the official documentation on extension points and scheduling plugins.

[Figure: scheduler framework extension points]

By setting the scheduler's log level to 10 (--v=10), you can trace the scoring process. Below are the scheduling logs captured for one particular pod:


I0314 22:33:02.424871  278829 scheduling_queue.go:933] "About to try and schedule pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc"
I0314 22:33:02.424880  278829 scheduler.go:443] "Attempting to schedule pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc"

# Per-node resource scoring details (LeastAllocated and NodeResourcesBalancedAllocation)
I0314 22:33:02.425309  278829 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.58" resourceAllocationScorer="LeastAllocated" allocatableResource=map[cpu:15600 memory:131739963392] requestedResource=map[cpu:1610 memory:3017801728] resourceScore=93
I0314 22:33:02.425332  278829 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.58" resourceAllocationScorer="NodeResourcesBalancedAllocation" allocatableResource=map[cpu:15600 memory:131739963392] requestedResource=map[cpu:1110 memory:1969225728] resourceScore=97
I0314 22:33:02.425344  278829 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.53" resourceAllocationScorer="LeastAllocated" allocatableResource=map[cpu:7600 memory:30323785728] requestedResource=map[cpu:2860 memory:5632950272] resourceScore=71
I0314 22:33:02.425345  278829 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.54" resourceAllocationScorer="LeastAllocated" allocatableResource=map[cpu:7600 memory:30323785728] requestedResource=map[cpu:4815 memory:10583277568] resourceScore=50
I0314 22:33:02.425354  278829 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.59" resourceAllocationScorer="LeastAllocated" allocatableResource=map[cpu:15600 memory:131739955200] requestedResource=map[cpu:1560 memory:2912944128] resourceScore=93
I0314 22:33:02.425354  278829 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.55" resourceAllocationScorer="LeastAllocated" allocatableResource=map[cpu:15600 memory:131739963392] requestedResource=map[cpu:6010 memory:10527703040] resourceScore=76
I0314 22:33:02.425364  278829 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.53" resourceAllocationScorer="NodeResourcesBalancedAllocation" allocatableResource=map[cpu:7600 memory:30323785728] requestedResource=map[cpu:1860 memory:3535798272] resourceScore=93
I0314 22:33:02.425365  278829 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.54" resourceAllocationScorer="NodeResourcesBalancedAllocation" allocatableResource=map[cpu:7600 memory:30323785728] requestedResource=map[cpu:3815 memory:8486125568] resourceScore=88
I0314 22:33:02.425372  278829 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.59" resourceAllocationScorer="NodeResourcesBalancedAllocation" allocatableResource=map[cpu:15600 memory:131739955200] requestedResource=map[cpu:1060 memory:1864368128] resourceScore=97
I0314 22:33:02.425381  278829 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.56" resourceAllocationScorer="LeastAllocated" allocatableResource=map[cpu:15600 memory:131739963392] requestedResource=map[cpu:7335 memory:13727956992] resourceScore=70
I0314 22:33:02.425383  278829 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.57" resourceAllocationScorer="LeastAllocated" allocatableResource=map[cpu:15600 memory:131739963392] requestedResource=map[cpu:1520 memory:4343201792] resourceScore=93
I0314 22:33:02.425389  278829 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.60" resourceAllocationScorer="LeastAllocated" allocatableResource=map[cpu:15600 memory:131739963392] requestedResource=map[cpu:1620 memory:3122659328] resourceScore=93
I0314 22:33:02.425393  278829 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.55" resourceAllocationScorer="NodeResourcesBalancedAllocation" allocatableResource=map[cpu:15600 memory:131739963392] requestedResource=map[cpu:5010 memory:8640266240] resourceScore=87
I0314 22:33:02.425400  278829 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.56" resourceAllocationScorer="NodeResourcesBalancedAllocation" allocatableResource=map[cpu:15600 memory:131739963392] requestedResource=map[cpu:6335 memory:11630804992] resourceScore=84
I0314 22:33:02.425401  278829 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.57" resourceAllocationScorer="NodeResourcesBalancedAllocation" allocatableResource=map[cpu:15600 memory:131739963392] requestedResource=map[cpu:1020 memory:3294625792] resourceScore=97
I0314 22:33:02.425405  278829 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.60" resourceAllocationScorer="NodeResourcesBalancedAllocation" allocatableResource=map[cpu:15600 memory:131739963392] requestedResource=map[cpu:1120 memory:2074083328] resourceScore=97

# NodeAffinity score for each node; the pod has no nodeAffinity, so every node scores 0
I0314 22:33:02.425477  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeAffinity" node="10.10.33.58" score=0
I0314 22:33:02.425487  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeAffinity" node="10.10.33.53" score=0
I0314 22:33:02.425495  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeAffinity" node="10.10.33.54" score=0
I0314 22:33:02.425502  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeAffinity" node="10.10.33.55" score=0
I0314 22:33:02.425513  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeAffinity" node="10.10.33.59" score=0
I0314 22:33:02.425518  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeAffinity" node="10.10.33.56" score=0
I0314 22:33:02.425524  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeAffinity" node="10.10.33.57" score=0
I0314 22:33:02.425529  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeAffinity" node="10.10.33.60" score=0

# NodeResourcesFit score for each node
I0314 22:33:02.425534  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeResourcesFit" node="10.10.33.58" score=93
I0314 22:33:02.425539  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeResourcesFit" node="10.10.33.53" score=71
I0314 22:33:02.425544  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeResourcesFit" node="10.10.33.54" score=50
I0314 22:33:02.425549  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeResourcesFit" node="10.10.33.55" score=76
I0314 22:33:02.425554  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeResourcesFit" node="10.10.33.59" score=93
I0314 22:33:02.425560  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeResourcesFit" node="10.10.33.56" score=70
I0314 22:33:02.425567  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeResourcesFit" node="10.10.33.57" score=93
I0314 22:33:02.425574  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeResourcesFit" node="10.10.33.60" score=93

# VolumeBinding score for each node
I0314 22:33:02.425581  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="VolumeBinding" node="10.10.33.58" score=0
I0314 22:33:02.425588  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="VolumeBinding" node="10.10.33.53" score=0
I0314 22:33:02.425594  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="VolumeBinding" node="10.10.33.54" score=0
I0314 22:33:02.425599  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="VolumeBinding" node="10.10.33.55" score=0
I0314 22:33:02.425604  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="VolumeBinding" node="10.10.33.59" score=0
I0314 22:33:02.425609  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="VolumeBinding" node="10.10.33.56" score=0
I0314 22:33:02.425614  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="VolumeBinding" node="10.10.33.57" score=0
I0314 22:33:02.425619  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="VolumeBinding" node="10.10.33.60" score=0

# PodTopologySpread score for each node; the pod has no topology spread constraints, so every node gets 200
I0314 22:33:02.425625  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="PodTopologySpread" node="10.10.33.58" score=200
I0314 22:33:02.425630  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="PodTopologySpread" node="10.10.33.53" score=200
I0314 22:33:02.425635  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="PodTopologySpread" node="10.10.33.54" score=200
I0314 22:33:02.425640  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="PodTopologySpread" node="10.10.33.55" score=200
I0314 22:33:02.425645  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="PodTopologySpread" node="10.10.33.59" score=200
I0314 22:33:02.425651  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="PodTopologySpread" node="10.10.33.56" score=200
I0314 22:33:02.425658  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="PodTopologySpread" node="10.10.33.57" score=200
I0314 22:33:02.425666  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="PodTopologySpread" node="10.10.33.60" score=200

# InterPodAffinity score for each node
I0314 22:33:02.425672  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="InterPodAffinity" node="10.10.33.58" score=0
I0314 22:33:02.425677  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="InterPodAffinity" node="10.10.33.53" score=0
I0314 22:33:02.425682  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="InterPodAffinity" node="10.10.33.54" score=0
I0314 22:33:02.425687  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="InterPodAffinity" node="10.10.33.55" score=0
I0314 22:33:02.425692  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="InterPodAffinity" node="10.10.33.59" score=0
I0314 22:33:02.425697  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="InterPodAffinity" node="10.10.33.56" score=0
I0314 22:33:02.425703  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="InterPodAffinity" node="10.10.33.57" score=0
I0314 22:33:02.425708  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="InterPodAffinity" node="10.10.33.60" score=0

# NodeResourcesBalancedAllocation score for each node
I0314 22:33:02.425713  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeResourcesBalancedAllocation" node="10.10.33.58" score=97
I0314 22:33:02.425719  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeResourcesBalancedAllocation" node="10.10.33.53" score=93
I0314 22:33:02.425724  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeResourcesBalancedAllocation" node="10.10.33.54" score=88
I0314 22:33:02.425729  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeResourcesBalancedAllocation" node="10.10.33.55" score=87
I0314 22:33:02.425735  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeResourcesBalancedAllocation" node="10.10.33.59" score=97
I0314 22:33:02.425742  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeResourcesBalancedAllocation" node="10.10.33.56" score=84
I0314 22:33:02.425750  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeResourcesBalancedAllocation" node="10.10.33.57" score=97
I0314 22:33:02.425756  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="NodeResourcesBalancedAllocation" node="10.10.33.60" score=97

# ImageLocality score for each node
I0314 22:33:02.425762  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="ImageLocality" node="10.10.33.58" score=0
I0314 22:33:02.425767  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="ImageLocality" node="10.10.33.53" score=26
I0314 22:33:02.425772  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="ImageLocality" node="10.10.33.54" score=54
I0314 22:33:02.425777  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="ImageLocality" node="10.10.33.55" score=100
I0314 22:33:02.425782  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="ImageLocality" node="10.10.33.59" score=0
I0314 22:33:02.425787  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="ImageLocality" node="10.10.33.56" score=83
I0314 22:33:02.425793  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="ImageLocality" node="10.10.33.57" score=100
I0314 22:33:02.425799  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="ImageLocality" node="10.10.33.60" score=0

# TaintToleration score for each node
I0314 22:33:02.425805  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="TaintToleration" node="10.10.33.58" score=300
I0314 22:33:02.425810  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="TaintToleration" node="10.10.33.53" score=300
I0314 22:33:02.425815  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="TaintToleration" node="10.10.33.54" score=300
I0314 22:33:02.425821  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="TaintToleration" node="10.10.33.55" score=300
I0314 22:33:02.425828  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="TaintToleration" node="10.10.33.59" score=300
I0314 22:33:02.425836  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="TaintToleration" node="10.10.33.56" score=300
I0314 22:33:02.425856  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="TaintToleration" node="10.10.33.57" score=300
I0314 22:33:02.425861  278829 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" plugin="TaintToleration" node="10.10.33.60" score=300

# Final total score for each node
I0314 22:33:02.425870  278829 generic_scheduler.go:491] "Calculated node's final score for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.58" score=690
I0314 22:33:02.425876  278829 generic_scheduler.go:491] "Calculated node's final score for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.53" score=690
I0314 22:33:02.425881  278829 generic_scheduler.go:491] "Calculated node's final score for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.54" score=692
I0314 22:33:02.425886  278829 generic_scheduler.go:491] "Calculated node's final score for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.55" score=763
I0314 22:33:02.425890  278829 generic_scheduler.go:491] "Calculated node's final score for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.59" score=690
I0314 22:33:02.425895  278829 generic_scheduler.go:491] "Calculated node's final score for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.56" score=737
I0314 22:33:02.425900  278829 generic_scheduler.go:491] "Calculated node's final score for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.57" score=790
I0314 22:33:02.425907  278829 generic_scheduler.go:491] "Calculated node's final score for pod" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.60" score=690
I0314 22:33:02.426010  278829 default_binder.go:52] "Attempting to bind pod to node" pod="bigdata-hanoi-default/ham-38943-2023031421-1-jhmzc" node="10.10.33.57"

Organizing the Scoring Data in the Logs

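| Plugin | 10.10.33.53 | 10.10.33.54 | 10.10.33.55 | 10.10.33.56 | 10.10.33.57 | 10.10.33.58 | 10.10.33.59 | 10.10.33.60 |
|---|---|---|---|---|---|---|---|---|
| NodeAffinity | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| NodeResourcesFit | 71 | 50 | 76 | 70 | 93 | 93 | 93 | 93 |
| VolumeBinding | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| PodTopologySpread | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 |
| InterPodAffinity | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| NodeResourcesBalancedAllocation | 93 | 88 | 87 | 84 | 97 | 97 | 97 | 97 |
| ImageLocality | 26 | 54 | 100 | 83 | 100 | 0 | 0 | 0 |
| TaintToleration | 300 | 300 | 300 | 300 | 300 | 300 | 300 | 300 |
| Total Score | 690 | 692 | 763 | 737 | 790 | 690 | 690 | 690 |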
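
As a cross-check, the per-plugin scores from the logs can be summed per node and compared with the final scores the scheduler reported. The short Python sketch below does only that; the variable and function names are chosen here for illustration, and the numbers are copied from the logs above.

```python
NODES = [
    "10.10.33.53", "10.10.33.54", "10.10.33.55", "10.10.33.56",
    "10.10.33.57", "10.10.33.58", "10.10.33.59", "10.10.33.60",
]

# Per-plugin scores in the same node order as NODES, copied from the logs.
PLUGIN_SCORES = {
    "NodeAffinity": [0, 0, 0, 0, 0, 0, 0, 0],
    "NodeResourcesFit": [71, 50, 76, 70, 93, 93, 93, 93],
    "VolumeBinding": [0, 0, 0, 0, 0, 0, 0, 0],
    "PodTopologySpread": [200, 200, 200, 200, 200, 200, 200, 200],
    "InterPodAffinity": [0, 0, 0, 0, 0, 0, 0, 0],
    "NodeResourcesBalancedAllocation": [93, 88, 87, 84, 97, 97, 97, 97],
    "ImageLocality": [26, 54, 100, 83, 100, 0, 0, 0],
    "TaintToleration": [300, 300, 300, 300, 300, 300, 300, 300],
}


def total_scores(nodes, plugin_scores):
    """Sum the plugin scores column by column to get one total per node."""
    totals = [sum(column) for column in zip(*plugin_scores.values())]
    return dict(zip(nodes, totals))


totals = total_scores(NODES, PLUGIN_SCORES)
winner = max(totals, key=totals.get)
print(totals)
print(winner)  # 10.10.33.57
```

The sums match the "Calculated node's final score for pod" log lines, with 10.10.33.57 on top at 790.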

Data Analysis:

  1. The gap between the top-ranked node (10.10.33.57, total 790) and the second-ranked node (10.10.33.55, total 763) comes from the resource-related scores: NodeResourcesFit (93 vs. 76) and NodeResourcesBalancedAllocation (97 vs. 87). Both nodes score 100 for ImageLocality.
  2. The gap between the second-ranked node (10.10.33.55, total 763) and the third-ranked node (10.10.33.56, total 737) comes mainly from the ImageLocality score, 100 vs. 83 (this might be due to a scheduler startup bug where data hasn’t fully synchronized, as mentioned in this issue).
  3. ImageLocality shows the largest spread of any plugin (0 to 100). It leaves nodes 10.10.33.53, 10.10.33.58, 10.10.33.59, and 10.10.33.60 with the lowest total scores (690 each), which makes them unlikely candidates for scheduling.

New Discovery

It has been observed that the pods scheduled to the same node all use the same image, and that this image is not present on nodes 10.10.33.58, 10.10.33.59, and 10.10.33.60 (their ImageLocality score in the logs is 0).

  1. Jobs from different tasks all use the same image, and this image is only available on certain nodes, specifically 10.10.33.53, 10.10.33.54, 10.10.33.55, 10.10.33.56, and 10.10.33.57.
  2. ImageLocality scores a node by whether the image already exists on it, which leads to a significant scoring difference between nodes with the image and nodes without it.
  3. Node 10.10.33.57 has more available resources than the other nodes that have the image.
  4. Because nodes without the image receive an ImageLocality score of 0, nodes 10.10.33.58, 10.10.33.59, and 10.10.33.60 end up with the lowest total scores and are unlikely to be selected. Node 10.10.33.53, whose ImageLocality score is only 26, ends up equally low.

ImageLocality scoring algorithm:

  1. Scoring for Each Container Image in a Pod:
    • If the image exists on the node, the score is calculated as follows: Score = Image Size * (Number of Nodes with the Image / Total Number of Nodes).
    • If the image doesn’t exist on the node, the score is 0.
  2. Total Score for All Container Images in the Pod: The total score is the sum of the scores for each image.
  3. Clamping of the Total Score: The total score is clamped to the range [24117248, Number of Containers in the Pod * 1048576000], i.e., 23 MiB to 1000 MiB per container.
    • If the total score is less than 24117248, it’s set to 24117248.
    • If the total score is greater than (Number of Containers in the Pod * 1048576000), it’s set to (Number of Containers in the Pod * 1048576000).
    • If the total score falls within [24117248, Number of Containers in the Pod * 1048576000], it remains unchanged.
  4. Final Score Calculation: The final score is calculated as follows: Score = 100 * (Total Score for All Container Images - 24117248) / (Number of Containers in the Pod * 1048576000 - 24117248).

In summary, the maximum ImageLocality score is 100. If an image is larger than about 1 GB and is present on a significant portion of the nodes, the score is likely to be close to 100. The purpose of this plugin is to prefer nodes that already hold the pod's images, especially large ones, to reduce pod startup time.
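
The steps above can be sketched in a few lines of Python. This is a minimal illustration of the described formula, not the scheduler's own code: the function names are made up for this example, and the node counts in the usage lines (4 of 8 nodes holding the image) are assumptions chosen to resemble this cluster.

```python
MB = 1024 * 1024
MIN_THRESHOLD = 23 * MB                # 24117248
MAX_CONTAINER_THRESHOLD = 1000 * MB    # 1048576000
MAX_NODE_SCORE = 100


def scaled_image_score(image_size, nodes_with_image, total_nodes):
    """Score of one image that is present on the node: size scaled by its spread."""
    return int(image_size * (nodes_with_image / total_nodes))


def image_locality_score(images, total_nodes):
    """images holds one entry per container: (size_bytes, nodes_with_image),
    or None when that container's image is absent from the node."""
    total = 0
    for image in images:
        if image is not None:
            size, nodes_with_image = image
            total += scaled_image_score(size, nodes_with_image, total_nodes)
    # Clamp the sum, then map it linearly onto 0..100.
    max_threshold = MAX_CONTAINER_THRESHOLD * len(images)
    total = max(MIN_THRESHOLD, min(total, max_threshold))
    return MAX_NODE_SCORE * (total - MIN_THRESHOLD) // (max_threshold - MIN_THRESHOLD)


# Assumed example: a single-container pod with a 4 GiB image held by 4 of 8 nodes.
print(image_locality_score([(4 * 1024 * MB, 4)], 8))  # node has the image: 100
print(image_locality_score([None], 8))                # node lacks the image: 0
```

Under these assumptions a node that already holds the image gains a full 100 points over a node that does not, which is the gap seen in the scores above.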

NodeResourcesBalancedAllocation Scoring Algorithm

  1. Obtain the allocatable values for each resource on the node and the total resource requests after adding the pod's resource requests (resources here refer to CPU, memory, etc.).
  2. Calculate the ratio of request to allocatable for each resource, i.e., Ratio = request[resource] / allocatable[resource].
  3. Calculate the standard deviation of the ratios of request to allocatable for all resources. For example, if there are only CPU and memory resources, calculate the standard deviation of {CPU ratio, memory ratio}.
  4. Final Score = 100 * (1 - standard deviation).

Summary: The closer the ratios of request resources to allocatable resources are for each resource type, the higher the score for the node. The design intent of this plugin appears to be to encourage nodes to have a more even distribution of resource allocation percentages, avoiding situations where one resource is fully allocated while another resource remains underutilized.
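
As a rough check, the four steps can be written as a short Python sketch. The function name is invented for this example, and capping each ratio at 1 is an assumption of the sketch; the input values are the allocatable and requested figures that the scheduler logs later in this article for the NodeResourcesBalancedAllocation scorer.

```python
import statistics

MAX_NODE_SCORE = 100


def balanced_allocation_score(requested, allocatable):
    """Score = 100 * (1 - population standard deviation of the request ratios)."""
    # Ratio of request to allocatable per resource, capped at 1 (assumption).
    fractions = [min(requested[name] / allocatable[name], 1.0) for name in requested]
    std = statistics.pstdev(fractions) if len(fractions) > 1 else 0.0
    return int((1 - std) * MAX_NODE_SCORE)


# Values for node 10.10.33.61 taken from the scheduler log in this article.
score = balanced_allocation_score(
    requested={"cpu": 5325, "memory": 10111418368},
    allocatable={"cpu": 15600, "memory": 131739963392},
)
print(score)  # 86, the resourceScore the scheduler logged for this node
```

With only CPU and memory, the standard deviation reduces to half the absolute difference between the two ratios, so a node whose CPU and memory are allocated to the same degree scores 100.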

NodeResourcesFit (LeastAllocated) Scoring Algorithm

  1. Obtain the allocatable values for each resource on the node and the total resource requests after adding the pod's resource requests (resources here refer to CPU, memory, etc.).
  2. Calculate the remaining resource percentage after allocating pod resource requests for each resource type, i.e., Remaining Resource Percentage = ((allocatable - request) / allocatable) * 100.
  3. Calculate the score for each resource type = Remaining Resource Percentage * resource weight. By default, CPU and memory both have a weight of 1.
  4. Calculate the score for all resources = Sum of scores for each resource, and calculate the weight for all resources = Sum of weights for each resource.
  5. Final Score = Total score for all resources / Total weight for all resources.

Summary: Nodes with more remaining resources after allocating the pod's requests receive higher scores, so nodes with more available resources are preferred. The goal appears to be to spread load across nodes rather than pack it onto a few of them.
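
The same steps as a minimal Python sketch, using integer arithmetic throughout. The function names are illustrative only; the inputs are the LeastAllocated figures from the scheduler log shown later in this article, and the default weights of 1 for CPU and memory follow step 3.

```python
MAX_NODE_SCORE = 100


def least_requested_score(requested, capacity):
    """Remaining percentage of one resource (0 when the node is over-committed)."""
    if capacity == 0 or requested > capacity:
        return 0
    return (capacity - requested) * MAX_NODE_SCORE // capacity


def least_allocated_score(requested, allocatable, weights=None):
    """Weighted average of the remaining percentages of all resources."""
    weights = weights or {"cpu": 1, "memory": 1}
    node_score = sum(
        least_requested_score(requested[name], allocatable[name]) * weight
        for name, weight in weights.items()
    )
    return node_score // sum(weights.values())


# Values for node 10.10.33.61 taken from the scheduler log in this article.
score = least_allocated_score(
    requested={"cpu": 6225, "memory": 11998855168},
    allocatable={"cpu": 15600, "memory": 131739963392},
)
print(score)  # 75, the resourceScore the scheduler logged for this node
```

Here CPU contributes 60 and memory 90, and their average of 75 matches the logged value.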

InterPodAffinity Scoring Algorithm

This algorithm is driven by the podAffinity and podAntiAffinity configurations of pods. The overall idea is to maintain a two-dimensional map keyed by (topologyKey name, value of that label on a node), in which the weights from the pods' podAffinity and podAntiAffinity rules are accumulated. Each node then collects its score by looking up its own labels in the two-dimensional map, and the node scores are normalized to produce the final scores.

  1. If the pod being scheduled has pod preferred affinity or pod preferred anti-affinity rules, it will iterate through all pods on the node; otherwise, it will iterate through pods with affinities on the node.
  2. Calculate scores related to the pod’s preferred affinity. If a pod on the node matches the preferred affinity rules of the pod being scheduled (matching namespace and labelSelector), the value in the two-dimensional map (topologyKey name, node labels[topologyKey name]) is updated to the original value plus the weight multiplied by the number of matching pods.
  3. Calculate scores related to the pod’s preferred anti-affinity. If a pod on the node matches the preferred anti-affinity rules of the pod being scheduled (matching namespace and labelSelector), the value in the two-dimensional map (topologyKey name, node labels[topologyKey name]) is updated to the original value minus the weight multiplied by the number of matching pods.
  4. Calculate scores related to podAffinity for the pod being scheduled. If the pod being scheduled has labels and the InterPodAffinity.hardPodAffinityWeight in the scheduler configuration file is greater than 0, and if the pod matches podAffinity rules of the pod being iterated (matching namespace), the value in the two-dimensional map (topologyKey name, node labels[topologyKey name]) is updated to the original value plus InterPodAffinity.hardPodAffinityWeight.
  5. Calculate scores related to pod preferred affinity for the pod being iterated. If the pod being scheduled matches the pod preferred affinity of the pod being iterated, the value in the two-dimensional map (topologyKey name, node labels[topologyKey name]) is updated to the original value plus the weight of the matching preferred affinity term of the pod being iterated.
  6. Calculate scores related to pod preferred anti-affinity for the pod being iterated. If the pod being scheduled matches the pod preferred anti-affinity of the pod being iterated, the value in the two-dimensional map (topologyKey name, node labels[topologyKey name]) is updated to the original value minus the weight of the matching preferred anti-affinity term of the pod being iterated.
  7. Node Scoring: Iterate through all labels on the node, sum the values found in the two-dimensional map (topologyKey name, node labels[topologyKey name]). This value is the node’s score.
  8. Apply final score normalization to all nodes:
    • Iterate through all nodes’ scores to find the maximum and minimum values.
    • If all node scores are the same, the final scores for all nodes are set to 0.
    • Otherwise, the final score for each node is calculated as follows: Final Score = 100 * (Node Score - Minimum Score) / (Maximum Score - Minimum Score).

In summary, a node with the highest score will have a final score of 100, a node with the lowest score will have a final score of 0, and other nodes will have final scores between 0 and 100.
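
A simplified Python sketch of the two-dimensional map and the final normalization follows. It covers only the preferred anti-affinity of the pod being scheduled (step 3) plus steps 7 and 8; the node names, the pod counts, and all function names are hypothetical and chosen for illustration.

```python
from collections import defaultdict

MAX_NODE_SCORE = 100


def add_term(topo_score, topology_key, weight, host_node_labels, multiplier):
    """Record one matching pod: +weight for affinity (multiplier=1), -weight for
    anti-affinity (multiplier=-1), keyed by the topology value of its node."""
    value = host_node_labels.get(topology_key)
    if value is not None:
        topo_score[topology_key][value] += weight * multiplier


def raw_node_score(topo_score, node_labels):
    """Sum the map entries that correspond to the node's own labels."""
    return sum(topo_score.get(k, {}).get(v, 0) for k, v in node_labels.items())


def normalize(raw_scores):
    """Min-max normalization to 0..100; identical scores all become 0."""
    lo, hi = min(raw_scores.values()), max(raw_scores.values())
    if hi == lo:
        return {node: 0 for node in raw_scores}
    return {node: int(MAX_NODE_SCORE * (s - lo) / (hi - lo)) for node, s in raw_scores.items()}


# Hypothetical cluster: three nodes and the number of matching pods on each.
KEY = "kubernetes.io/hostname"
nodes = {name: {KEY: name} for name in ("node-a", "node-b", "node-c")}
matching_pods = {"node-a": 2, "node-b": 1, "node-c": 0}

topo_score = defaultdict(lambda: defaultdict(int))
for name, count in matching_pods.items():
    for _ in range(count):
        add_term(topo_score, KEY, 100, nodes[name], -1)  # anti-affinity, weight 100

raw = {name: raw_node_score(topo_score, labels) for name, labels in nodes.items()}
print(raw)             # {'node-a': -200, 'node-b': -100, 'node-c': 0}
print(normalize(raw))  # {'node-a': 0, 'node-b': 50, 'node-c': 100}
```

The node with the most matching pods ends up at 0 and the node with none at 100, so a preferred anti-affinity rule pushes new pods toward the emptiest node.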

Solution Approach: To reduce the score gap between nodes, you can disable scoring for the plugin that causes the gap, make another scoring plugin take effect to offset it, or adjust the plugins' weights.

1. Disable ImageLocality

In the kube-scheduler’s startup configuration file, configure it as follows:

text

apiVersion: kubescheduler.config.k8s.io/v1beta3
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: default-scheduler
    plugins:
      score:
        disabled:
          - name: ImageLocality

2. Set podAntiAffinity

Label the pods that use the same image and set a preferred podAntiAffinity on that label:

text

      spec:
        affinity:
          podAntiAffinity:
            preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                  - key: job-type
                    operator: In
                    values:
                    - spark
                topologyKey: kubernetes.io/hostname

After podAntiAffinity had been set and the jobs had run for some time, the scheduling log of one pod looked as follows (the set of nodes here is somewhat different from the one above).

text

I0318 12:33:44.125838  241226 eventhandlers.go:118] "Add event for unscheduled pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4"
I0318 12:33:44.125881  241226 scheduling_queue.go:933] "About to try and schedule pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4"
I0318 12:33:44.125889  241226 scheduler.go:443] "Attempting to schedule pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4"
I0318 12:33:44.126192  241226 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" node="10.10.33.61" resourceAllocationScorer="LeastAllocated" allocatableResource=map[cpu:15600 memory:131739963392] requestedResource=map[cpu:6225 memory:11998855168] resourceScore=75
I0318 12:33:44.126198  241226 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" node="10.10.33.62" resourceAllocationScorer="LeastAllocated" allocatableResource=map[cpu:15600 memory:131739963392] requestedResource=map[cpu:4950 memory:13366198272] resourceScore=78
I0318 12:33:44.126203  241226 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" node="10.10.33.58" resourceAllocationScorer="LeastAllocated" allocatableResource=map[cpu:15600 memory:131739963392] requestedResource=map[cpu:4130 memory:11453595648] resourceScore=82
I0318 12:33:44.126210  241226 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" node="10.10.33.59" resourceAllocationScorer="LeastAllocated" allocatableResource=map[cpu:15600 memory:131739955200] requestedResource=map[cpu:4155 memory:6479151104] resourceScore=84
I0318 12:33:44.126212  241226 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" node="10.10.33.61" resourceAllocationScorer="NodeResourcesBalancedAllocation" allocatableResource=map[cpu:15600 memory:131739963392] requestedResource=map[cpu:5325 memory:10111418368] resourceScore=86
I0318 12:33:44.126220  241226 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" node="10.10.33.62" resourceAllocationScorer="NodeResourcesBalancedAllocation" allocatableResource=map[cpu:15600 memory:131739963392] requestedResource=map[cpu:3850 memory:11269046272] resourceScore=91
I0318 12:33:44.126220  241226 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" node="10.10.33.58" resourceAllocationScorer="NodeResourcesBalancedAllocation" allocatableResource=map[cpu:15600 memory:131739963392] requestedResource=map[cpu:2930 memory:9146728448] resourceScore=94
I0318 12:33:44.126227  241226 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" node="10.10.33.59" resourceAllocationScorer="NodeResourcesBalancedAllocation" allocatableResource=map[cpu:15600 memory:131739955200] requestedResource=map[cpu:3555 memory:5011144704] resourceScore=90
I0318 12:33:44.126232  241226 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" node="10.10.33.60" resourceAllocationScorer="LeastAllocated" allocatableResource=map[cpu:15600 memory:131739963392] requestedResource=map[cpu:4800 memory:6340739072] resourceScore=82
I0318 12:33:44.126265  241226 resource_allocation.go:73] "Listing internal info for allocatable resources, requested resources and score" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" node="10.10.33.60" resourceAllocationScorer="NodeResourcesBalancedAllocation" allocatableResource=map[cpu:15600 memory:131739963392] requestedResource=map[cpu:4100 memory:4872732672] resourceScore=88
I0318 12:33:44.126351  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="VolumeBinding" node="10.10.33.61" score=0
I0318 12:33:44.126359  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="VolumeBinding" node="10.10.33.62" score=0
I0318 12:33:44.126365  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="VolumeBinding" node="10.10.33.58" score=0
I0318 12:33:44.126370  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="VolumeBinding" node="10.10.33.59" score=0
I0318 12:33:44.126375  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="VolumeBinding" node="10.10.33.60" score=0
I0318 12:33:44.126382  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="PodTopologySpread" node="10.10.33.61" score=200
I0318 12:33:44.126390  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="PodTopologySpread" node="10.10.33.62" score=200
I0318 12:33:44.126397  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="PodTopologySpread" node="10.10.33.58" score=200
I0318 12:33:44.126405  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="PodTopologySpread" node="10.10.33.59" score=200
I0318 12:33:44.126412  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="PodTopologySpread" node="10.10.33.60" score=200
I0318 12:33:44.126418  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="InterPodAffinity" node="10.10.33.61" score=200
I0318 12:33:44.126423  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="InterPodAffinity" node="10.10.33.62" score=0
I0318 12:33:44.126428  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="InterPodAffinity" node="10.10.33.58" score=0
I0318 12:33:44.126434  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="InterPodAffinity" node="10.10.33.59" score=0
I0318 12:33:44.126439  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="InterPodAffinity" node="10.10.33.60" score=0
I0318 12:33:44.126445  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="NodeResourcesBalancedAllocation" node="10.10.33.61" score=86
I0318 12:33:44.126451  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="NodeResourcesBalancedAllocation" node="10.10.33.62" score=91
I0318 12:33:44.126456  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="NodeResourcesBalancedAllocation" node="10.10.33.58" score=94
I0318 12:33:44.126462  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="NodeResourcesBalancedAllocation" node="10.10.33.59" score=90
I0318 12:33:44.126468  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="NodeResourcesBalancedAllocation" node="10.10.33.60" score=88
I0318 12:33:44.126473  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="ImageLocality" node="10.10.33.61" score=100
I0318 12:33:44.126479  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="ImageLocality" node="10.10.33.62" score=100
I0318 12:33:44.126484  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="ImageLocality" node="10.10.33.58" score=100
I0318 12:33:44.126491  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="ImageLocality" node="10.10.33.59" score=100
I0318 12:33:44.126497  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="ImageLocality" node="10.10.33.60" score=100
I0318 12:33:44.126502  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="TaintToleration" node="10.10.33.61" score=300
I0318 12:33:44.126508  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="TaintToleration" node="10.10.33.62" score=300
I0318 12:33:44.126513  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="TaintToleration" node="10.10.33.58" score=300
I0318 12:33:44.126518  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="TaintToleration" node="10.10.33.59" score=300
I0318 12:33:44.126524  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="TaintToleration" node="10.10.33.60" score=300
I0318 12:33:44.126529  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="NodeAffinity" node="10.10.33.61" score=0
I0318 12:33:44.126534  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="NodeAffinity" node="10.10.33.62" score=0
I0318 12:33:44.126539  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="NodeAffinity" node="10.10.33.58" score=0
I0318 12:33:44.126545  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="NodeAffinity" node="10.10.33.59" score=0
I0318 12:33:44.126550  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="NodeAffinity" node="10.10.33.60" score=0
I0318 12:33:44.126555  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="NodeResourcesFit" node="10.10.33.61" score=75
I0318 12:33:44.126560  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="NodeResourcesFit" node="10.10.33.62" score=78
I0318 12:33:44.126567  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="NodeResourcesFit" node="10.10.33.58" score=82
I0318 12:33:44.126572  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="NodeResourcesFit" node="10.10.33.59" score=84
I0318 12:33:44.126578  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" plugin="NodeResourcesFit" node="10.10.33.60" score=82
I0318 12:33:44.126588  241226 generic_scheduler.go:491] "Calculated node's final score for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" node="10.10.33.61" score=961
I0318 12:33:44.126594  241226 generic_scheduler.go:491] "Calculated node's final score for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" node="10.10.33.62" score=769
I0318 12:33:44.126599  241226 generic_scheduler.go:491] "Calculated node's final score for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" node="10.10.33.58" score=776
I0318 12:33:44.126605  241226 generic_scheduler.go:491] "Calculated node's final score for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" node="10.10.33.59" score=774
I0318 12:33:44.126612  241226 generic_scheduler.go:491] "Calculated node's final score for pod" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" node="10.10.33.60" score=770
I0318 12:33:44.126698  241226 default_binder.go:52] "Attempting to bind pod to node" pod="bigdata-hanoi-default/ham-29183-202303181232-1-w8wd4" node="10.10.33.61"

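Data Collation:

| Plugin | 10.10.33.58 | 10.10.33.59 | 10.10.33.60 | 10.10.33.61 | 10.10.33.62 |
| --- | --- | --- | --- | --- | --- |
| NodeAffinity | 0 | 0 | 0 | 0 | 0 |
| NodeResourcesFit | 82 | 84 | 82 | 75 | 78 |
| VolumeBinding | 0 | 0 | 0 | 0 | 0 |
| PodTopologySpread | 200 | 200 | 200 | 200 | 200 |
| InterPodAffinity | 0 | 0 | 0 | 200 | 0 |
| NodeResourcesBalancedAllocation | 94 | 90 | 88 | 86 | 91 |
| ImageLocality | 100 | 100 | 100 | 100 | 100 |
| TaintToleration | 300 | 300 | 300 | 300 | 300 |
| Total Score | 776 | 774 | 770 | 961 | 769 |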
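
As a quick sanity check, the total for each node is simply the sum of the (already weighted) per-plugin scores from the log. The short Python sketch below recomputes the totals from the collated values; the variable and function names are illustrative only.

```python
NODES = ["10.10.33.58", "10.10.33.59", "10.10.33.60", "10.10.33.61", "10.10.33.62"]

# Per-plugin scores as logged by the scheduler, in the node order above.
PLUGIN_SCORES = {
    "NodeAffinity": [0, 0, 0, 0, 0],
    "NodeResourcesFit": [82, 84, 82, 75, 78],
    "VolumeBinding": [0, 0, 0, 0, 0],
    "PodTopologySpread": [200, 200, 200, 200, 200],
    "InterPodAffinity": [0, 0, 0, 200, 0],
    "NodeResourcesBalancedAllocation": [94, 90, 88, 86, 91],
    "ImageLocality": [100, 100, 100, 100, 100],
    "TaintToleration": [300, 300, 300, 300, 300],
}


def total_scores(plugin_scores, nodes):
    """Sum the per-plugin scores column by column to get each node's final score."""
    return {
        node: sum(scores[i] for scores in plugin_scores.values())
        for i, node in enumerate(nodes)
    }


totals = total_scores(PLUGIN_SCORES, NODES)
best = max(totals, key=totals.get)
print(totals)
print(best)  # 10.10.33.61, the node the pod was bound to
```

The only column that separates 10.10.33.61 from the rest by a wide margin is InterPodAffinity; without it, the remaining scores differ by only a few points.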

Data Analysis:

  1. The ImageLocality scores are all the same, indicating that every node now has the image.
  2. On nodes 10.10.33.58, 10.10.33.59, 10.10.33.60, and 10.10.33.62, pods of the same type are already running, so their InterPodAffinity score is 0. Only 10.10.33.61 has no pods of this type, so its score is 200 (the logged score is already multiplied by the plugin's weight, and the default weight of InterPodAffinity is 2).

The following log excerpts show various InterPodAffinity scoring scenarios. The preferred node changes from pod to pod, indicating that pods are now generally scheduled evenly across nodes.

text

I0318 12:35:54.480311  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-24503-2023031811-1-hzvjd" plugin="InterPodAffinity" node="10.10.33.58" score=150
I0318 12:35:54.480316  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-24503-2023031811-1-hzvjd" plugin="InterPodAffinity" node="10.10.33.61" score=100
I0318 12:35:54.480321  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-24503-2023031811-1-hzvjd" plugin="InterPodAffinity" node="10.10.33.62" score=0
I0318 12:35:54.480327  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-24503-2023031811-1-hzvjd" plugin="InterPodAffinity" node="10.10.33.60" score=200
I0318 12:35:54.480332  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-24503-2023031811-1-hzvjd" plugin="InterPodAffinity" node="10.10.33.59" score=50

I0318 12:34:14.726775  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-35680-2023031811-1-rx6sz" plugin="InterPodAffinity" node="10.10.33.62" score=0
I0318 12:34:14.726784  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-35680-2023031811-1-rx6sz" plugin="InterPodAffinity" node="10.10.33.58" score=0
I0318 12:34:14.726790  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-35680-2023031811-1-rx6sz" plugin="InterPodAffinity" node="10.10.33.60" score=200
I0318 12:34:14.726796  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-35680-2023031811-1-rx6sz" plugin="InterPodAffinity" node="10.10.33.59" score=0
I0318 12:34:14.726802  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-35680-2023031811-1-rx6sz" plugin="InterPodAffinity" node="10.10.33.61" score=0

I0318 12:36:18.573939  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-44003-202303181230-1-bb9kn" plugin="InterPodAffinity" node="10.10.33.58" score=200
I0318 12:36:18.573946  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-44003-202303181230-1-bb9kn" plugin="InterPodAffinity" node="10.10.33.61" score=132
I0318 12:36:18.573953  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-44003-202303181230-1-bb9kn" plugin="InterPodAffinity" node="10.10.33.62" score=0
I0318 12:36:18.573959  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-44003-202303181230-1-bb9kn" plugin="InterPodAffinity" node="10.10.33.59" score=132
I0318 12:36:18.573965  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-44003-202303181230-1-bb9kn" plugin="InterPodAffinity" node="10.10.33.60" score=200

I0318 12:36:19.612833  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-37288-202303181215-1-8rbrl" plugin="InterPodAffinity" node="10.10.33.58" score=0
I0318 12:36:19.612839  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-37288-202303181215-1-8rbrl" plugin="InterPodAffinity" node="10.10.33.61" score=200
I0318 12:36:19.612844  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-37288-202303181215-1-8rbrl" plugin="InterPodAffinity" node="10.10.33.62" score=0
I0318 12:36:19.612849  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-37288-202303181215-1-8rbrl" plugin="InterPodAffinity" node="10.10.33.60" score=200
I0318 12:36:19.612854  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-37288-202303181215-1-8rbrl" plugin="InterPodAffinity" node="10.10.33.59" score=200

I0320 12:43:32.206742  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-39074-2023032011-1-fv6h6" plugin="InterPodAffinity" node="10.10.33.58" score=0
I0320 12:43:32.206747  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-39074-2023032011-1-fv6h6" plugin="InterPodAffinity" node="10.10.33.62" score=0
I0320 12:43:32.206754  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-39074-2023032011-1-fv6h6" plugin="InterPodAffinity" node="10.10.33.60" score=100
I0320 12:43:32.206759  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-39074-2023032011-1-fv6h6" plugin="InterPodAffinity" node="10.10.33.61" score=200
I0320 12:43:32.206764  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-39074-2023032011-1-fv6h6" plugin="InterPodAffinity" node="10.10.33.59" score=0

I0320 13:13:09.948985  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-25767-2023032012-1-nnj7k" plugin="InterPodAffinity" node="10.10.33.61" score=200
I0320 13:13:09.948990  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-25767-2023032012-1-nnj7k" plugin="InterPodAffinity" node="10.10.33.62" score=200
I0320 13:13:09.948995  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-25767-2023032012-1-nnj7k" plugin="InterPodAffinity" node="10.10.33.58" score=0
I0320 13:13:09.949001  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-25767-2023032012-1-nnj7k" plugin="InterPodAffinity" node="10.10.33.60" score=200
I0320 13:13:09.949005  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-25767-2023032012-1-nnj7k" plugin="InterPodAffinity" node="10.10.33.59" score=0

I0320 13:12:09.243515  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-43308-2023032012-1-7ncz7" plugin="InterPodAffinity" node="10.10.33.61" score=200
I0320 13:12:09.243522  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-43308-2023032012-1-7ncz7" plugin="InterPodAffinity" node="10.10.33.58" score=0
I0320 13:12:09.243529  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-43308-2023032012-1-7ncz7" plugin="InterPodAffinity" node="10.10.33.60" score=0
I0320 13:12:09.243535  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-43308-2023032012-1-7ncz7" plugin="InterPodAffinity" node="10.10.33.62" score=0
I0320 13:12:09.243541  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-43308-2023032012-1-7ncz7" plugin="InterPodAffinity" node="10.10.33.59" score=0

I0320 13:10:51.044472  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29621-2023032012-1-gm5v7" plugin="InterPodAffinity" node="10.10.33.58" score=0
I0320 13:10:51.044478  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29621-2023032012-1-gm5v7" plugin="InterPodAffinity" node="10.10.33.61" score=0
I0320 13:10:51.044484  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29621-2023032012-1-gm5v7" plugin="InterPodAffinity" node="10.10.33.62" score=0
I0320 13:10:51.044489  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29621-2023032012-1-gm5v7" plugin="InterPodAffinity" node="10.10.33.60" score=0
I0320 13:10:51.044494  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-29621-2023032012-1-gm5v7" plugin="InterPodAffinity" node="10.10.33.59" score=0
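
Collecting these scores by hand is tedious, so a small script can pull them out of the scheduler log. The sketch below assumes the log line format shown in the excerpts above; the regular expression and the function name are written for this example, and the sample lines are copied from the first excerpt.

```python
import re
from collections import defaultdict

# Matches the "Plugin scored node for pod" lines shown in the excerpts above.
SCORE_LINE = re.compile(
    r'"Plugin scored node for pod" pod="(?P<pod>[^"]+)" '
    r'plugin="(?P<plugin>[^"]+)" node="(?P<node>[^"]+)" score=(?P<score>-?\d+)'
)


def collate(lines, plugin):
    """Return {pod: {node: score}} for the given plugin."""
    table = defaultdict(dict)
    for line in lines:
        match = SCORE_LINE.search(line)
        if match and match["plugin"] == plugin:
            table[match["pod"]][match["node"]] = int(match["score"])
    return dict(table)


SAMPLE = '''
I0318 12:35:54.480311  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-24503-2023031811-1-hzvjd" plugin="InterPodAffinity" node="10.10.33.58" score=150
I0318 12:35:54.480316  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-24503-2023031811-1-hzvjd" plugin="InterPodAffinity" node="10.10.33.61" score=100
I0318 12:35:54.480321  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-24503-2023031811-1-hzvjd" plugin="InterPodAffinity" node="10.10.33.62" score=0
I0318 12:35:54.480327  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-24503-2023031811-1-hzvjd" plugin="InterPodAffinity" node="10.10.33.60" score=200
I0318 12:35:54.480332  241226 generic_scheduler.go:434] "Plugin scored node for pod" pod="bigdata-hanoi-default/ham-24503-2023031811-1-hzvjd" plugin="InterPodAffinity" node="10.10.33.59" score=50
'''.strip().splitlines()

scores = collate(SAMPLE, "InterPodAffinity")
for pod, per_node in scores.items():
    preferred = max(per_node, key=per_node.get)
    print(pod, "->", preferred, per_node[preferred])
```

Run over a full log, the same function gives one row per pod, which makes it easy to see whether the preferred node keeps changing.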

Pods are scheduled to the same node when the remaining resources of the nodes are similar, the image is larger than 1 GB and present on only some of the nodes, and all pods use that same image. Job-created pods have short execution times, so the resource availability of the nodes easily stays similar. The Spark tasks use Python environment images of around 4 GB, which easily pushes the ImageLocality score close to 100.

https://github.com/kubernetes/kubernetes/issues/91204

https://github.com/kubernetes/kubernetes/issues/42281

https://stackoverflow.com/questions/73278919/why-would-the-kubernetes-scheduler-always-place-my-pod-replicas-on-the-same-node

https://github.com/kubernetes/kubernetes/issues/110149
