Researching how the metadata.generation value is incremented

While developing a VPA-related operator with kubebuilder, the operator needs to watch every VPA creation, deletion and update in the cluster. controller-runtime provides predicates to filter out unnecessary events, so predicate.GenerationChangedPredicate was used to filter out VPA status updates. However, it turned out that status updates of a VPA (the recommendation written by vpa-recommender) still triggered Reconcile.

func (r *RuleReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&recommendationv1alpha1.Rule{}).
		Owns(&autoscalingv1.VerticalPodAutoscaler{},
			builder.WithPredicates(predicate.GenerationChangedPredicate{})).
		Complete(r)
}

Environment: Kubernetes 1.23, VPA 0.13.0.

What is the reason?

The doc comment on GenerationChangedPredicate explains the principle:

GenerationChangedPredicate implements a default update predicate function on Generation change.

This predicate will skip update events that have no change in the object’s metadata.generation field. The metadata.generation field of an object is incremented by the API server when writes are made to the spec field of an object. This allows a controller to ignore update events where the spec is unchanged, and only the metadata and/or status fields are changed.

For CustomResource objects the Generation is only incremented when the status subresource is enabled.

Caveats:

* The assumption that the Generation is incremented only on writing to the spec does not hold for all APIs. E.g For Deployment objects the Generation is also incremented on writes to the metadata.annotations field. For object types other than CustomResources be sure to verify which fields will trigger a Generation increment when they are written to.

* With this predicate, any update events with writes only to the status field will not be reconciled. So in the event that the status block is overwritten or wiped by someone else the controller will not self-correct to restore the correct status.

That is, if metadata.generation has not changed in an update event, the event is filtered out. The comment also implies that modifying only the metadata or status fields does not increment metadata.generation.
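The filtering rule described above can be sketched in a few lines of Go. This is a simplified model, not the controller-runtime code: the `object` type and the `generationChanged` function are hypothetical names introduced here, and only the generation comparison is modeled.

```go
package main

import "fmt"

// object is a hypothetical stand-in for a Kubernetes object;
// only the field the predicate compares is modeled.
type object struct {
	Name       string
	Generation int64
}

// generationChanged models the update rule of GenerationChangedPredicate:
// an update event passes the filter only when metadata.generation differs
// between the old and the new object.
func generationChanged(oldObj, newObj *object) bool {
	if oldObj == nil || newObj == nil {
		return false // incomplete event, nothing to compare
	}
	return oldObj.Generation != newObj.Generation
}

func main() {
	old := &object{Name: "demo-vpa", Generation: 3}
	statusOnly := &object{Name: "demo-vpa", Generation: 3} // generation untouched
	specChange := &object{Name: "demo-vpa", Generation: 4} // generation bumped

	fmt.Println(generationChanged(old, statusOnly)) // false: event is dropped
	fmt.Println(generationChanged(old, specChange)) // true: event reaches Reconcile
}
```

The predicate therefore only works as a status filter when the API server leaves metadata.generation untouched on status writes, which is exactly what the rest of the article investigates.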

But the sentence "For CustomResource objects the Generation is only incremented when the status subresource is enabled." is ambiguous. Read literally, it means that the metadata.generation of a custom resource increases only when the status subresource is enabled, so the answer has to be found in the source code.

Related source code

https://github.com/kubernetes/kubernetes/blob/1635c380b26a1d8cc25d36e9feace9797f4bae3c/staging/src/k8s.io/apiextensions-apiserver/pkg/registry/customresource/strategy.go#L120-L149

The logic is:

  1. If spec.versions[*].subresources.status is set in the CRD, changes to metadata or status on update do not increment metadata.generation (a status written through the main endpoint is reset to the old value before the comparison).
  2. If spec.versions[*].subresources.status is not set in the CRD, only changes to metadata are ignored on update; a change to any other field, including status, increments metadata.generation.

// PrepareForUpdate clears fields that are not allowed to be set by end users on update.
func (a customResourceStrategy) PrepareForUpdate(ctx context.Context, obj, old runtime.Object) {
	newCustomResourceObject := obj.(*unstructured.Unstructured)
	oldCustomResourceObject := old.(*unstructured.Unstructured)

	newCustomResource := newCustomResourceObject.UnstructuredContent()
	oldCustomResource := oldCustomResourceObject.UnstructuredContent()

	// If the /status subresource endpoint is installed, update is not allowed to set status.
	// a.status corresponds to the spec.versions[*].subresources.status field of the CRD.
	if a.status != nil {
		_, ok1 := newCustomResource["status"]
		_, ok2 := oldCustomResource["status"]
		switch {
		case ok2:
			newCustomResource["status"] = oldCustomResource["status"]
		case ok1:
			delete(newCustomResource, "status")
		}
	}

	// except for the changes to `metadata`, any other changes
	// cause the generation to increment.
	newCopyContent := copyNonMetadata(newCustomResource)
	oldCopyContent := copyNonMetadata(oldCustomResource)
	if !apiequality.Semantic.DeepEqual(newCopyContent, oldCopyContent) {
		oldAccessor, _ := meta.Accessor(oldCustomResourceObject)
		newAccessor, _ := meta.Accessor(newCustomResourceObject)
		newAccessor.SetGeneration(oldAccessor.GetGeneration() + 1)
	}
}
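The effect of the function above can be reproduced with a small standalone model. This is a sketch under stated assumptions, not the apiserver implementation: `resource`, `demo` and `nextGeneration` are hypothetical names, plain maps replace unstructured.Unstructured, and reflect.DeepEqual stands in for apiequality.Semantic.DeepEqual.

```go
package main

import (
	"fmt"
	"reflect"
)

// resource is a simplified stand-in for an unstructured custom resource.
type resource map[string]interface{}

// demo builds a sample object; label lives in metadata, mode in spec,
// recommendation in status. All values are made up for illustration.
func demo(label, mode, recommendation string) resource {
	return resource{
		"metadata": map[string]interface{}{"label": label},
		"spec":     map[string]interface{}{"updateMode": mode},
		"status":   map[string]interface{}{"recommendation": recommendation},
	}
}

// copyNonMetadata returns a shallow copy of in without the "metadata" key.
func copyNonMetadata(in resource) resource {
	out := resource{}
	for k, v := range in {
		if k != "metadata" {
			out[k] = v
		}
	}
	return out
}

// nextGeneration returns the generation the updated object would carry.
// statusSubresource models whether the CRD enables subresources.status.
// Note that newObj is modified in place, as in the original function.
func nextGeneration(oldObj, newObj resource, oldGen int64, statusSubresource bool) int64 {
	if statusSubresource {
		// An update through the main endpoint may not change status.
		if s, ok := oldObj["status"]; ok {
			newObj["status"] = s
		} else {
			delete(newObj, "status")
		}
	}
	// Any difference outside metadata increments the generation.
	if !reflect.DeepEqual(copyNonMetadata(oldObj), copyNonMetadata(newObj)) {
		return oldGen + 1
	}
	return oldGen
}

func main() {
	old := demo("a", "Auto", "100m")
	// Status-only update, status subresource disabled (the VPA case).
	fmt.Println(nextGeneration(old, demo("a", "Auto", "200m"), 1, false)) // 2
	// The same update with the status subresource enabled.
	fmt.Println(nextGeneration(old, demo("a", "Auto", "200m"), 1, true)) // 1
}
```

With the subresource disabled, the status write is just another non-metadata change, so the generation moves from 1 to 2 and GenerationChangedPredicate lets the event through.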

Back to the question mentioned at the beginning of the article, the answer is:

The VPA CRD manifest does not enable the status subresource, so every status update of a VPA increments metadata.generation. An issue was raised for this: https://github.com/kubernetes/autoscaler/issues/5675

So do other resources use the same mechanism? Read on for the answer. There is also a related issue in the community: https://github.com/kubernetes/kubernetes/issues/67428

Deployment: changes to spec or metadata.annotations increment metadata.generation.

https://github.com/kubernetes/kubernetes/blob/1635c380b26a1d8cc25d36e9feace9797f4bae3c/pkg/registry/apps/deployment/strategy.go#L106-L120

// PrepareForUpdate clears fields that are not allowed to be set by end users on update.
func (deploymentStrategy) PrepareForUpdate(ctx context.Context, obj, old runtime.Object) {
	newDeployment := obj.(*apps.Deployment)
	oldDeployment := old.(*apps.Deployment)
	newDeployment.Status = oldDeployment.Status

	pod.DropDisabledTemplateFields(&newDeployment.Spec.Template, &oldDeployment.Spec.Template)

	// Spec updates bump the generation so that we can distinguish between
	// scaling events and template changes, annotation updates bump the generation
	// because annotations are copied from deployments to their replica sets.
	if !apiequality.Semantic.DeepEqual(newDeployment.Spec, oldDeployment.Spec) ||
		!apiequality.Semantic.DeepEqual(newDeployment.Annotations, oldDeployment.Annotations) {
		newDeployment.Generation = oldDeployment.Generation + 1
	}
}

DaemonSet: metadata.generation is incremented only when spec changes.

https://github.com/kubernetes/kubernetes/blob/release-1.23/pkg/registry/apps/daemonset/strategy.go#L89-L122

// PrepareForUpdate clears fields that are not allowed to be set by end users on update.
func (daemonSetStrategy) PrepareForUpdate(ctx context.Context, obj, old runtime.Object) {
	newDaemonSet := obj.(*apps.DaemonSet)
	oldDaemonSet := old.(*apps.DaemonSet)

	dropDaemonSetDisabledFields(newDaemonSet, oldDaemonSet)
	pod.DropDisabledTemplateFields(&newDaemonSet.Spec.Template, &oldDaemonSet.Spec.Template)

	// update is not allowed to set status
	newDaemonSet.Status = oldDaemonSet.Status

	// update is not allowed to set TemplateGeneration
	newDaemonSet.Spec.TemplateGeneration = oldDaemonSet.Spec.TemplateGeneration

	// Any changes to the spec increment the generation number, any changes to the
	// status should reflect the generation number of the corresponding object. We push
	// the burden of managing the status onto the clients because we can't (in general)
	// know here what version of spec the writer of the status has seen. It may seem like
	// we can at first -- since obj contains spec -- but in the future we will probably make
	// status its own object, and even if we don't, writes may be the result of a
	// read-update-write loop, so the contents of spec may not actually be the spec that
	// the manager has *seen*.
	//
	// TODO: Any changes to a part of the object that represents desired state (labels,
	// annotations etc) should also increment the generation.
	if !apiequality.Semantic.DeepEqual(oldDaemonSet.Spec.Template, newDaemonSet.Spec.Template) {
		newDaemonSet.Spec.TemplateGeneration = oldDaemonSet.Spec.TemplateGeneration + 1
		newDaemonSet.Generation = oldDaemonSet.Generation + 1
		return
	}
	if !apiequality.Semantic.DeepEqual(oldDaemonSet.Spec, newDaemonSet.Spec) {
		newDaemonSet.Generation = oldDaemonSet.Generation + 1
	}
}

Pod: the metadata.generation field is not set at all.

https://github.com/kubernetes/kubernetes/blob/1635c380b26a1d8cc25d36e9feace9797f4bae3c/pkg/registry/core/pod/strategy.go#L94-L101

// PrepareForUpdate clears fields that are not allowed to be set by end users on update.
func (podStrategy) PrepareForUpdate(ctx context.Context, obj, old runtime.Object) {
	newPod := obj.(*api.Pod)
	oldPod := old.(*api.Pod)
	newPod.Status = oldPod.Status

	podutil.DropDisabledPodFields(newPod, oldPod)
}

Not every resource increments metadata.generation, and different resources follow different rules for incrementing it. Only the rules of a few resources are listed here; for any other resource, the answer has to be found in its code.
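As a compact recap, the rules read from the three PrepareForUpdate functions quoted above can be written down as one lookup function. This is only a summary sketch of the code at the linked commits: the `change` type and `bumpsGeneration` are hypothetical names, and the rules may differ in other Kubernetes versions.

```go
package main

import "fmt"

// change records which parts of an object an update request modified.
// It is a hypothetical helper type for this sketch only.
type change struct {
	Spec        bool
	Annotations bool
	Status      bool // ignored below: these strategies reset status on update
}

// bumpsGeneration reports whether an update with the given change would
// increment metadata.generation for the built-in kinds discussed above.
func bumpsGeneration(kind string, c change) (bool, error) {
	switch kind {
	case "Deployment":
		return c.Spec || c.Annotations, nil
	case "DaemonSet":
		return c.Spec, nil
	case "Pod":
		return false, nil // generation is never set for Pods in the quoted code
	default:
		return false, fmt.Errorf("no rule recorded for kind %q; check its strategy.go", kind)
	}
}

func main() {
	annotated := change{Annotations: true}
	for _, kind := range []string{"Deployment", "DaemonSet", "Pod"} {
		bumped, _ := bumpsGeneration(kind, annotated)
		fmt.Printf("%s: annotation change bumps generation: %v\n", kind, bumped)
	}
}
```

Returning an error for unknown kinds is a deliberate choice: guessing a default would hide the fact that each resource has to be checked individually.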

https://github.com/kubernetes/design-proposals-archive/blob/main/api-machinery/customresources-subresources.md#status-behavior
