Exploring how metadata.generation gets incremented

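I was using kubebuilder to develop a VPA-related operator that watches the creation, deletion, and update of every VPA in the cluster. controller-runtime provides predicates for filtering out unwanted events, so I used predicate.GenerationChangedPredicate to filter out updates that only touch the VPA status. However, VPA status updates (the recommendations written by vpa-recommender) still triggered Reconcile.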

go

func (r *RuleReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&recommendationv1alpha1.Rule{}).
		Owns(&autoscalingv1.VerticalPodAutoscaler{},
			builder.WithPredicates(predicate.GenerationChangedPredicate{})).
		Complete(r)
}

Environment: Kubernetes 1.23, VPA 0.13.0.

Why does this happen?

The doc comment on the predicate explains how it works:

GenerationChangedPredicate implements a default update predicate function on Generation change.

This predicate will skip update events that have no change in the object’s metadata.generation field. The metadata.generation field of an object is incremented by the API server when writes are made to the spec field of an object. This allows a controller to ignore update events where the spec is unchanged, and only the metadata and/or status fields are changed.

For CustomResource objects the Generation is only incremented when the status subresource is enabled.

Caveats:

* The assumption that the Generation is incremented only on writing to the spec does not hold for all APIs. E.g For Deployment objects the Generation is also incremented on writes to the metadata.annotations field. For object types other than CustomResources be sure to verify which fields will trigger a Generation increment when they are written to.

* With this predicate, any update events with writes only to the status field will not be reconciled. So in the event that the status block is overwritten or wiped by someone else the controller will not self-correct to restore the correct status.

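In other words, an update event in which metadata.generation did not change is filtered out. The comment also says that writes touching only the metadata or status fields do not increment metadata.generation.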
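The check described in that comment can be sketched without any Kubernetes dependency. The `object` and `updateEvent` types and the `generationChanged` function below are hypothetical stand-ins I made up for illustration; the real predicate works on controller-runtime's update event and reads the generation through the object's metadata accessors.

```go
package main

import "fmt"

// object is a hypothetical stand-in for a Kubernetes object. Only the
// fields needed to illustrate the predicate are modeled.
type object struct {
	Generation int64
	Status     string
}

// updateEvent carries the object before and after a write (assumed shape).
type updateEvent struct {
	Old, New *object
}

// generationChanged returns true when the event should reach Reconcile,
// that is, when the generation differs between the old and new object.
func generationChanged(e updateEvent) bool {
	if e.Old == nil || e.New == nil {
		return false
	}
	return e.Old.Generation != e.New.Generation
}

func main() {
	// A status-only write on a resource whose generation stays the same.
	statusOnly := updateEvent{
		Old: &object{Generation: 3, Status: "a"},
		New: &object{Generation: 3, Status: "b"},
	}
	// A write that made the API server bump the generation.
	bumped := updateEvent{
		Old: &object{Generation: 3},
		New: &object{Generation: 4},
	}
	fmt.Println(generationChanged(statusOnly)) // false
	fmt.Println(generationChanged(bumped))     // true
}
```

The predicate itself never looks at spec or status. It trusts the API server to bump the generation only for the writes that matter, which is exactly the assumption that turned out not to hold for VPA.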

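However, the sentence "For CustomResource objects the Generation is only incremented when the status subresource is enabled." is ambiguous. Read literally, it says that a custom resource's metadata.generation is incremented only when the status subresource is enabled, so the real answer has to come from the source code.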

Relevant source code

https://github.com/kubernetes/kubernetes/blob/1635c380b26a1d8cc25d36e9feace9797f4bae3c/staging/src/k8s.io/apiextensions-apiserver/pkg/registry/customresource/strategy.go#L120-L149

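The logic is:

  1. If spec.versions[*].subresources.status is set in the CRD, then on update, changes to metadata and status do not increment metadata.generation.
  2. If spec.versions[*].subresources.status is not set in the CRD, then on update, only changes to metadata are ignored (they do not increment metadata.generation); any other change, including a change to status, increments it.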

go

// PrepareForUpdate clears fields that are not allowed to be set by end users on update.
func (a customResourceStrategy) PrepareForUpdate(ctx context.Context, obj, old runtime.Object) {
	newCustomResourceObject := obj.(*unstructured.Unstructured)
	oldCustomResourceObject := old.(*unstructured.Unstructured)

	newCustomResource := newCustomResourceObject.UnstructuredContent()
	oldCustomResource := oldCustomResourceObject.UnstructuredContent()

	// If the /status subresource endpoint is installed, update is not allowed to set status.
	// a.status corresponds to the spec.versions[*].subresources.status field of the CRD.
	if a.status != nil {
		_, ok1 := newCustomResource["status"]
		_, ok2 := oldCustomResource["status"]
		switch {
		case ok2:
			newCustomResource["status"] = oldCustomResource["status"]
		case ok1:
			delete(newCustomResource, "status")
		}
	}

	// except for the changes to `metadata`, any other changes
	// cause the generation to increment.
	newCopyContent := copyNonMetadata(newCustomResource)
	oldCopyContent := copyNonMetadata(oldCustomResource)
	if !apiequality.Semantic.DeepEqual(newCopyContent, oldCopyContent) {
		oldAccessor, _ := meta.Accessor(oldCustomResourceObject)
		newAccessor, _ := meta.Accessor(newCustomResourceObject)
		newAccessor.SetGeneration(oldAccessor.GetGeneration() + 1)
	}
}

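Back to the question from the beginning of the article. The answer is:

The VPA CRD manifest does not define status under subresources, so every time the VPA status is updated, metadata.generation is incremented as well. I filed an issue for this: https://github.com/kubernetes/autoscaler/issues/5675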
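Until the CRD enables the status subresource, one possible workaround is to stop relying on the generation and compare the spec directly. The sketch below only shows the comparison: `vpaSpec`, `vpaObject`, and `specChanged` are hypothetical, trimmed-down stand-ins rather than the real VPA types. In an actual controller, I would expect to put this kind of check into the `UpdateFunc` of a controller-runtime `predicate.Funcs`, but treat that wiring as an assumption to verify against your controller-runtime version.

```go
package main

import (
	"fmt"
	"reflect"
)

// vpaSpec is a hypothetical, reduced version of a VPA spec.
type vpaSpec struct {
	TargetName string
	UpdateMode string
}

// vpaObject is a hypothetical, reduced VPA object.
type vpaObject struct {
	Generation     int64
	Spec           vpaSpec
	Recommendation string // stands in for the status written by vpa-recommender
}

// specChanged reports whether an update touched the spec. It ignores both
// the generation and the status, so a status-only write is filtered out
// even when the API server bumped the generation for it.
func specChanged(oldObj, newObj *vpaObject) bool {
	if oldObj == nil || newObj == nil {
		return false
	}
	return !reflect.DeepEqual(oldObj.Spec, newObj.Spec)
}

func main() {
	before := &vpaObject{Generation: 7, Spec: vpaSpec{"web", "Auto"}, Recommendation: "100m"}

	// What the recommender does: new status, and (without the status
	// subresource) a bumped generation.
	statusWrite := &vpaObject{Generation: 8, Spec: vpaSpec{"web", "Auto"}, Recommendation: "250m"}
	// What a user does: a real spec change.
	specWrite := &vpaObject{Generation: 8, Spec: vpaSpec{"web", "Off"}, Recommendation: "100m"}

	fmt.Println(specChanged(before, statusWrite)) // false
	fmt.Println(specChanged(before, specWrite))   // true
}
```

The trade-off is that the predicate now has to know the resource's type, whereas a generation check works for any object.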

Do other resources work the same way? The rest of this article answers that. There is also a related upstream issue: https://github.com/kubernetes/kubernetes/issues/67428

For a Deployment, metadata.generation is incremented when spec or metadata.annotations changes.

https://github.com/kubernetes/kubernetes/blob/1635c380b26a1d8cc25d36e9feace9797f4bae3c/pkg/registry/apps/deployment/strategy.go#L106-L120

go

// PrepareForUpdate clears fields that are not allowed to be set by end users on update.
func (deploymentStrategy) PrepareForUpdate(ctx context.Context, obj, old runtime.Object) {
	newDeployment := obj.(*apps.Deployment)
	oldDeployment := old.(*apps.Deployment)
	newDeployment.Status = oldDeployment.Status

	pod.DropDisabledTemplateFields(&newDeployment.Spec.Template, &oldDeployment.Spec.Template)

	// Spec updates bump the generation so that we can distinguish between
	// scaling events and template changes, annotation updates bump the generation
	// because annotations are copied from deployments to their replica sets.
	if !apiequality.Semantic.DeepEqual(newDeployment.Spec, oldDeployment.Spec) ||
		!apiequality.Semantic.DeepEqual(newDeployment.Annotations, oldDeployment.Annotations) {
		newDeployment.Generation = oldDeployment.Generation + 1
	}
}

For a DaemonSet, metadata.generation is incremented only when spec changes.

https://github.com/kubernetes/kubernetes/blob/release-1.23/pkg/registry/apps/daemonset/strategy.go#L89-L122

go

// PrepareForUpdate clears fields that are not allowed to be set by end users on update.
func (daemonSetStrategy) PrepareForUpdate(ctx context.Context, obj, old runtime.Object) {
	newDaemonSet := obj.(*apps.DaemonSet)
	oldDaemonSet := old.(*apps.DaemonSet)

	dropDaemonSetDisabledFields(newDaemonSet, oldDaemonSet)
	pod.DropDisabledTemplateFields(&newDaemonSet.Spec.Template, &oldDaemonSet.Spec.Template)

	// update is not allowed to set status
	newDaemonSet.Status = oldDaemonSet.Status

	// update is not allowed to set TemplateGeneration
	newDaemonSet.Spec.TemplateGeneration = oldDaemonSet.Spec.TemplateGeneration

	// Any changes to the spec increment the generation number, any changes to the
	// status should reflect the generation number of the corresponding object. We push
	// the burden of managing the status onto the clients because we can't (in general)
	// know here what version of spec the writer of the status has seen. It may seem like
	// we can at first -- since obj contains spec -- but in the future we will probably make
	// status its own object, and even if we don't, writes may be the result of a
	// read-update-write loop, so the contents of spec may not actually be the spec that
	// the manager has *seen*.
	//
	// TODO: Any changes to a part of the object that represents desired state (labels,
	// annotations etc) should also increment the generation.
	if !apiequality.Semantic.DeepEqual(oldDaemonSet.Spec.Template, newDaemonSet.Spec.Template) {
		newDaemonSet.Spec.TemplateGeneration = oldDaemonSet.Spec.TemplateGeneration + 1
		newDaemonSet.Generation = oldDaemonSet.Generation + 1
		return
	}
	if !apiequality.Semantic.DeepEqual(oldDaemonSet.Spec, newDaemonSet.Spec) {
		newDaemonSet.Generation = oldDaemonSet.Generation + 1
	}
}

For a Pod, the update strategy at this commit does not set metadata.generation at all.

https://github.com/kubernetes/kubernetes/blob/1635c380b26a1d8cc25d36e9feace9797f4bae3c/pkg/registry/core/pod/strategy.go#L94-L101

go

// PrepareForUpdate clears fields that are not allowed to be set by end users on update.
func (podStrategy) PrepareForUpdate(ctx context.Context, obj, old runtime.Object) {
	newPod := obj.(*api.Pod)
	oldPod := old.(*api.Pod)
	newPod.Status = oldPod.Status

	podutil.DropDisabledPodFields(newPod, oldPod)
}

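Not every resource has its metadata.generation incremented, and each resource type has its own strategy for doing so. This article covers the increment logic for only a few resource types; for any other resource, the answer is in its update strategy in the source code.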

https://github.com/kubernetes/design-proposals-archive/blob/main/api-machinery/customresources-subresources.md#status-behavior
